Using Jobs to Perform Schema Changes Against MySQL Databases on K8s

Performing a schema change is always challenging when dealing with K8s.

When running on-prem or on a DBaaS like RDS or Cloud SQL, it is relatively straightforward to apply a change. You can perform a DIRECT ALTER, use a tool such as pt-osc, or even, in certain cases where async replication is in use, perform the changes on the replicas and then fail over.

In this blog post, I’ll provide some thoughts on how schema changes can be performed when running MySQL on Kubernetes.

I won’t focus on DIRECT ALTERs, as applying them is pretty straightforward: you can just connect to the MySQL pod and perform the ALTER.

But how can we apply changes in more complex scenarios, where we may want to benefit from pt-osc, gain better control over the operation, or take advantage of K8s features?

One convenient way that I’ve found works well is the use of an external pod running the percona-toolkit image. If you need other tools or more flexibility, you can, of course, use your own image.

Assuming that there is already a Secrets object holding all system user passwords, the concept is really simple: we are going to create a Job.

The use of Jobs provides a convenient way for both Devs and DBAs to apply changes, taking advantage of the following benefits:

  • Full tracking of the event lifecycle, including execution logs
  • Auditing
  • ConfigMap options, if any, can be reused
  • You don’t have to explicitly pass credentials or endpoints to the execution, as these are all available through the Secrets objects and environment variables
  • Execution can be scheduled in a predefined maintenance window (you can easily convert a Job to a CronJob)
  • A task can be easily and consistently executed multiple times across multiple environments
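
As a sketch of the Job-to-CronJob conversion mentioned above, here is a hypothetical spec that would run a change at 03:00 every day (the name, schedule, and container details below are illustrative; everything under jobTemplate is just a regular Job spec):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pt-osc-maintenance-example   # hypothetical name
spec:
  schedule: "0 3 * * *"              # 03:00 daily; adjust to your maintenance window
  concurrencyPolicy: Forbid          # never overlap schema-change runs
  jobTemplate:
    spec:
      backoffLimit: 1
      template:
        spec:
          containers:
          - name: pt-osc
            image: perconalab/percona-toolkit:latest
            # command, args, and env as in the Job spec shown later in this post
          restartPolicy: Never
```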

Let’s proceed to the proof of concept now.

I deployed a three-node Percona XtraDB Cluster (PXC) using the Percona Operator for MySQL, with HAProxy acting as a load balancer in front of the cluster.

ubuntu@ip-172-31-25-177:~$ kubectl get pods

NAME                                               READY   STATUS    RESTARTS      AGE
minimal-cluster-haproxy-0                          2/2     Running   0             27m
minimal-cluster-haproxy-1                          2/2     Running   0             26m
minimal-cluster-haproxy-2                          2/2     Running   0             26m
minimal-cluster-pxc-0                              3/3     Running   0             27m
minimal-cluster-pxc-1                              3/3     Running   0             26m
minimal-cluster-pxc-2                              3/3     Running   0             26m
percona-xtradb-cluster-operator-6f4f56f5d5-pvx92   1/1     Running   2 (18h ago)   3d20h

These are my Secrets:

ubuntu@ip-172-31-25-177:~$ kubectl get secret

NAME                       TYPE     DATA   AGE
internal-minimal-cluster   Opaque   7      3d20h
minimal-cluster-secrets    Opaque   7      3d20h
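
Secret values are stored base64-encoded. If you want to verify what a Job will actually receive, you can decode a key; the key name root below matches what the Job spec later in this post references, while the encoded string is a made-up example:

```shell
# Read the root password from the Secret (requires a running cluster):
#   kubectl get secret minimal-cluster-secrets -o jsonpath='{.data.root}' | base64 -d
# The decode step itself is plain base64, e.g. with a made-up value:
echo 'c2VjcmV0X3Jvb3RfcGFzcw==' | base64 -d   # -> secret_root_pass
```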

The following simple table was created in the PXC cluster:

ubuntu@ip-172-31-25-177:~$ kubectl run -it --rm percona-client --image=percona:8.0 --restart=Never bash
If you don't see a command prompt, try pressing enter.

[mysql@percona-client /]$ mysql -h minimal-cluster-pxc -u root -p
Enter password:

Welcome to the MySQL monitor.  Commands end with ; or \g.

Your MySQL connection id is 256
Server version: 8.0.32-24.2 Percona XtraDB Cluster (GPL), Release rel24, Revision 2119e75, WSREP version 26.1.4.3

Copyright (c) 2009-2023 Percona LLC and/or its affiliates

Copyright (c) 2000, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database if not exists atsaloux;
Query OK, 1 row affected (0.02 sec)

mysql> use atsaloux;
Database changed

mysql> CREATE TABLE `authors` (
         `id` INT(11) NOT NULL AUTO_INCREMENT,
         `first_name` VARCHAR(50) NOT NULL,
         `last_name` VARCHAR(50) NOT NULL,
         `email` VARCHAR(100) NOT NULL,
         `birthdate` DATE NOT NULL,
         `added` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
         PRIMARY KEY (`id`),
         UNIQUE INDEX `email` (`email`)
       ) ENGINE=InnoDB;
Query OK, 0 rows affected, 1 warning (0.04 sec)

mysql> show create table authors\G
*************************** 1. row ***************************
Table: authors
Create Table: CREATE TABLE `authors` (
`id` int NOT NULL AUTO_INCREMENT,
`first_name` varchar(50) NOT NULL,
`last_name` varchar(50) NOT NULL,
`email` varchar(100) NOT NULL,
`birthdate` date NOT NULL,
`added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user1','lastname1','userlast@percona.com','1980/01/01');
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user2','lastname2','userlast2@percona.com','1980/01/01');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user3','lastname3','userlast3@percona.com','1980/01/01');
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user4','lastname4','userlast4@percona.com','1980/01/01');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user5','lastname5','userlast5@percona.com','1980/01/01');
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> select * from authors;
+----+------------+-----------+-----------------------+------------+---------------------+
| id | first_name | last_name | email                 | birthdate  | added               |
+----+------------+-----------+-----------------------+------------+---------------------+
|  1 | user1      | lastname1 | userlast@percona.com  | 1980-01-01 | 2023-07-26 12:18:40 |
|  4 | user2      | lastname2 | userlast2@percona.com | 1980-01-01 | 2023-07-26 12:18:49 |
|  7 | user3      | lastname3 | userlast3@percona.com | 1980-01-01 | 2023-07-26 12:18:58 |
| 10 | user4      | lastname4 | userlast4@percona.com | 1980-01-01 | 2023-07-26 12:19:08 |
| 13 | user5      | lastname5 | userlast5@percona.com | 1980-01-01 | 2023-07-26 12:19:16 |
+----+------------+-----------+-----------------------+------------+---------------------+
5 rows in set (0.00 sec)

What I wanted to do was apply the following simple schema change using pt-osc:

ALTER TABLE atsaloux.authors ADD INDEX email_idx(email)

To do so, I created a K8s Job YAML file as below:

ubuntu@ip-172-31-25-177:~$ cat pt-osc-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pt-osc-atsaloux-request-123456
spec:
  template:
    spec:
      containers:
      - name: pt-osc-request-123456
        image: perconalab/percona-toolkit:latest
        command: ["pt-online-schema-change"]
        args:
          - --user
          - root
          - --print
          - --progress
          - time,10
          - --alter=$(PT_OSC_ALTER)
          - --tries
          - create_triggers:10000:0.25,drop_triggers:10000:0.25,copy_rows:10000:0.25,swap_tables:10000:0.25,update_foreign_keys:10000:0.25,analyze_table:10000:0.25
          - --set-vars
          - lock_wait_timeout=1
          - h=minimal-cluster-pxc,D=$(PT_OSC_DATABASE),t=$(PT_OSC_TABLE)
          - --recursion-method=none
          - --max-flow-ctl=0
          - --critical-load
          - threads_running=99999
          - --max-load
          - Threads_running=50
          - --chunk-time=1
          - --nocheck-unique-key-change
          - --execute
        env:
          - name: DBI_PASS
            valueFrom:
              secretKeyRef:
                name: minimal-cluster-secrets
                key: root
          - name: PT_OSC_DATABASE
            value: atsaloux
          - name: PT_OSC_TABLE
            value: authors
          - name: PT_OSC_ALTER
            value: ADD INDEX email_idx(email)
      restartPolicy: Never
  backoffLimit: 1

I’m not going to explain the whole YAML definition above, as I believe it is pretty straightforward to understand and adjust.
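
One detail worth calling out, though: Kubernetes expands $(VAR) references in args using the container’s env entries, so the DSN and the --alter statement are assembled when the pod starts. A rough shell analogue of that substitution, using the values from the Job spec:

```shell
# Values taken from the Job's env section:
PT_OSC_DATABASE=atsaloux
PT_OSC_TABLE=authors
# After Kubernetes substitutes $(PT_OSC_DATABASE) and $(PT_OSC_TABLE),
# pt-online-schema-change receives this DSN argument:
echo "h=minimal-cluster-pxc,D=${PT_OSC_DATABASE},t=${PT_OSC_TABLE}"
# -> h=minimal-cluster-pxc,D=atsaloux,t=authors
```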

I then applied the YAML file, and the Job was created:

ubuntu@ip-172-31-25-177:~$ kubectl apply -f pt-osc-job.yaml
job.batch/pt-osc-atsaloux-request-123456 created


ubuntu@ip-172-31-25-177:~$ kubectl get pods
NAME                                               READY   STATUS              RESTARTS      AGE
minimal-cluster-haproxy-0                          2/2     Running             0             68m
minimal-cluster-haproxy-1                          2/2     Running             0             67m
minimal-cluster-haproxy-2                          2/2     Running             0             67m
minimal-cluster-pxc-0                              3/3     Running             0             68m
minimal-cluster-pxc-1                              3/3     Running             0             68m
minimal-cluster-pxc-2                              3/3     Running             0             67m
percona-xtradb-cluster-operator-6f4f56f5d5-pvx92   1/1     Running             4 (68m ago)   5d
pt-osc-atsaloux-request-123456-h6t9p               0/1     ContainerCreating   0             2s


ubuntu@ip-172-31-25-177:~$ kubectl get pods
NAME                                               READY   STATUS      RESTARTS      AGE
minimal-cluster-haproxy-0                          2/2     Running     0             68m
minimal-cluster-haproxy-1                          2/2     Running     0             67m
minimal-cluster-haproxy-2                          2/2     Running     0             67m
minimal-cluster-pxc-0                              3/3     Running     0             68m
minimal-cluster-pxc-1                              3/3     Running     0             68m
minimal-cluster-pxc-2                              3/3     Running     0             67m
percona-xtradb-cluster-operator-6f4f56f5d5-pvx92   1/1     Running     4 (68m ago)   5d
pt-osc-atsaloux-request-123456-h6t9p               0/1     Completed   0             3s

As you can see above, after a few seconds, the pt-osc-atsaloux-request-123456-h6t9p pod transitioned to the Completed status, which indicates that the execution was successful.

I’m now able to review the logs using the following:

ubuntu@ip-172-31-25-177:~$ kubectl logs pt-osc-atsaloux-request-123456-h6t9p
No slaves found.  See --recursion-method if host minimal-cluster-pxc-0 has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
  analyze_table, 10000, 0.25
  copy_rows, 10000, 0.25
  create_triggers, 10000, 0.25
  drop_triggers, 10000, 0.25
  swap_tables, 10000, 0.25
  update_foreign_keys, 10000, 0.25
Altering `atsaloux`.`authors`...
Creating new table...
CREATE TABLE `atsaloux`.`_authors_new` (
  `id` int NOT NULL AUTO_INCREMENT,
  `first_name` varchar(50) NOT NULL,
  `last_name` varchar(50) NOT NULL,
  `email` varchar(100) NOT NULL,
  `birthdate` date NOT NULL,
  `added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Created new table atsaloux._authors_new OK.
Altering new table...
ALTER TABLE `atsaloux`.`_authors_new` ADD INDEX email_idx(email)
Altered `atsaloux`.`_authors_new` OK.
2023-07-26T12:22:42 Creating triggers...
-----------------------------------------------------------
Event : DELETE
Name  : pt_osc_atsaloux_authors_del
SQL   : CREATE TRIGGER `pt_osc_atsaloux_authors_del` AFTER DELETE ON `atsaloux`.`authors` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `atsaloux`.`_authors_new` WHERE `atsaloux`.`_authors_new`.`id` <=> OLD.`id`; END
Suffix: del
Time  : AFTER
-----------------------------------------------------------
-----------------------------------------------------------
Event : UPDATE
Name  : pt_osc_atsaloux_authors_upd
SQL   : CREATE TRIGGER `pt_osc_atsaloux_authors_upd` AFTER UPDATE ON `atsaloux`.`authors` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `atsaloux`.`_authors_new` WHERE !(OLD.`id` <=> NEW.`id`) AND `atsaloux`.`_authors_new`.`id` <=> OLD.`id`; REPLACE INTO `atsaloux`.`_authors_new` (`id`, `first_name`, `last_name`, `email`, `birthdate`, `added`) VALUES (NEW.`id`, NEW.`first_name`, NEW.`last_name`, NEW.`email`, NEW.`birthdate`, NEW.`added`); END
Suffix: upd
Time  : AFTER
-----------------------------------------------------------
-----------------------------------------------------------
Event : INSERT
Name  : pt_osc_atsaloux_authors_ins
SQL   : CREATE TRIGGER `pt_osc_atsaloux_authors_ins` AFTER INSERT ON `atsaloux`.`authors` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; REPLACE INTO `atsaloux`.`_authors_new` (`id`, `first_name`, `last_name`, `email`, `birthdate`, `added`) VALUES (NEW.`id`, NEW.`first_name`, NEW.`last_name`, NEW.`email`, NEW.`birthdate`, NEW.`added`);END
Suffix: ins
Time  : AFTER
-----------------------------------------------------------
2023-07-26T12:22:43 Created triggers OK.
2023-07-26T12:22:43 Copying approximately 4 rows...
INSERT LOW_PRIORITY IGNORE INTO `atsaloux`.`_authors_new` (`id`, `first_name`, `last_name`, `email`, `birthdate`, `added`) SELECT `id`, `first_name`, `last_name`, `email`, `birthdate`, `added` FROM `atsaloux`.`authors` LOCK IN SHARE MODE /*pt-online-schema-change 1 copy table*/
2023-07-26T12:22:43 Copied rows OK.
2023-07-26T12:22:43 Analyzing new table...
2023-07-26T12:22:43 Swapping tables...
RENAME TABLE `atsaloux`.`authors` TO `atsaloux`.`_authors_old`, `atsaloux`.`_authors_new` TO `atsaloux`.`authors`
2023-07-26T12:22:43 Swapped original and new tables OK.
2023-07-26T12:22:43 Dropping old table...
DROP TABLE IF EXISTS `atsaloux`.`_authors_old`
2023-07-26T12:22:43 Dropped old table `atsaloux`.`_authors_old` OK.
2023-07-26T12:22:43 Dropping triggers...
DROP TRIGGER IF EXISTS `atsaloux`.`pt_osc_atsaloux_authors_del`
DROP TRIGGER IF EXISTS `atsaloux`.`pt_osc_atsaloux_authors_upd`
DROP TRIGGER IF EXISTS `atsaloux`.`pt_osc_atsaloux_authors_ins`
2023-07-26T12:22:43 Dropped triggers OK.
Successfully altered `atsaloux`.`authors`.

And finally confirmed that the schema changes had been successfully applied:

bash-5.1$ mysql -h minimal-cluster-haproxy -u root -p -e 'show create table atsaloux.authors\G'
Enter password:
*************************** 1. row ***************************
       Table: authors
Create Table: CREATE TABLE `authors` (
  `id` int NOT NULL AUTO_INCREMENT,
  `first_name` varchar(50) NOT NULL,
  `last_name` varchar(50) NOT NULL,
  `email` varchar(100) NOT NULL,
  `birthdate` date NOT NULL,
  `added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `email` (`email`),
  KEY `email_idx` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

For the purposes of this blog post, I repeated the operation with a variation of the Job above. The difference is that the ALTER statement is intentionally incorrect this time.

ubuntu@ip-172-31-25-177:~$ cat pt-osc-job2.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pt-osc-atsaloux-request-78910
spec:
  template:
    spec:
      containers:
      - name: pt-osc-request-123456
        image: perconalab/percona-toolkit:latest
        command: ["pt-online-schema-change"]
        args:
          - --user
          - root
          - --print
          - --progress
          - time,10
          - --alter=$(PT_OSC_ALTER)
          - --tries
          - create_triggers:10000:0.25,drop_triggers:10000:0.25,copy_rows:10000:0.25,swap_tables:10000:0.25,update_foreign_keys:10000:0.25,analyze_table:10000:0.25
          - --set-vars
          - lock_wait_timeout=1
          - h=minimal-cluster-pxc,D=$(PT_OSC_DATABASE),t=$(PT_OSC_TABLE)
          - --recursion-method=none
          - --max-flow-ctl=0
          - --critical-load
          - threads_running=99999
          - --max-load
          - Threads_running=50
          - --chunk-time=1
          - --nocheck-unique-key-change
          - --execute
        env:
          - name: DBI_PASS
            valueFrom:
              secretKeyRef:
                name: minimal-cluster-secrets
                key: root
          - name: PT_OSC_DATABASE
            value: atsaloux
          - name: PT_OSC_TABLE
            value: authors
          - name: PT_OSC_ALTER
            value: ADD INDEX email_idx_new(email_not_exists)
      restartPolicy: Never
  backoffLimit: 1

I then applied this file to create the new Job and noticed that it errored out:

ubuntu@ip-172-31-25-177:~$ kubectl apply -f pt-osc-job2.yaml
job.batch/pt-osc-atsaloux-request-78910 created

ubuntu@ip-172-31-25-177:~$ kubectl get pods
NAME                                               READY   STATUS      RESTARTS       AGE
minimal-cluster-haproxy-0                          2/2     Running     0              120m
minimal-cluster-haproxy-1                          2/2     Running     0              119m
minimal-cluster-haproxy-2                          2/2     Running     0              119m
minimal-cluster-pxc-0                              3/3     Running     0              120m
minimal-cluster-pxc-1                              3/3     Running     0              120m
minimal-cluster-pxc-2                              3/3     Running     0              119m
percona-xtradb-cluster-operator-6f4f56f5d5-pvx92   1/1     Running     4 (120m ago)   5d1h
pt-osc-atsaloux-request-123456-h6t9p               0/1     Completed   0              52m
pt-osc-atsaloux-request-78910-q2lk4                0/1     Error       0              2s

ubuntu@ip-172-31-25-177:~$ kubectl get jobs
NAME                             COMPLETIONS   DURATION   AGE
pt-osc-atsaloux-request-123456   1/1           4s         52m
pt-osc-atsaloux-request-78910    0/1           6s         6s

But why? The logs make the cause clear: "Key column 'email_not_exists' doesn't exist in table".

ubuntu@ip-172-31-25-177:~$ kubectl logs pt-osc-atsaloux-request-78910-q2lk4
No slaves found.  See --recursion-method if host minimal-cluster-pxc-0 has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
  analyze_table, 10000, 0.25
  copy_rows, 10000, 0.25
  create_triggers, 10000, 0.25
  drop_triggers, 10000, 0.25
  swap_tables, 10000, 0.25
  update_foreign_keys, 10000, 0.25
Altering `atsaloux`.`authors`...
Creating new table...
CREATE TABLE `atsaloux`.`_authors_new` (
  `id` int NOT NULL AUTO_INCREMENT,
  `first_name` varchar(50) NOT NULL,
  `last_name` varchar(50) NOT NULL,
  `email` varchar(100) NOT NULL,
  `birthdate` date NOT NULL,
  `added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `email` (`email`),
  KEY `email_idx` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Created new table atsaloux._authors_new OK.
Altering new table...
ALTER TABLE `atsaloux`.`_authors_new` ADD INDEX email_idx_new(email_not_exists)
2023-07-26T13:14:44 Dropping new table...
  (in cleanup) Error altering new table `atsaloux`.`_authors_new`: DBD::mysql::db do failed: Key column 'email_not_exists' doesn't exist in table [for Statement "ALTER TABLE `atsaloux`.`_authors_new` ADD INDEX email_idx_new(email_not_exists)"] at /usr/bin/pt-online-schema-change line 9537.

Error altering new table `atsaloux`.`_authors_new`: DBD::mysql::db do failed: Key column 'email_not_exists' doesn't exist in table [for Statement "ALTER TABLE `atsaloux`.`_authors_new` ADD INDEX email_idx_new(email_not_exists)"] at /usr/bin/pt-online-schema-change line 9537.

DROP TABLE IF EXISTS `atsaloux`.`_authors_new`;
2023-07-26T13:14:44 Dropped new table OK.
`atsaloux`.`authors` was not altered.

Now I can delete the Job if needed, make adjustments, and re-run. The Jobs documentation covers failure handling, retries, deadlines, etc., in some detail, and you may find it useful.
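
For example, a few Job spec fields control retries and cleanup; the values below are illustrative, not recommendations:

```yaml
spec:
  backoffLimit: 1                  # retries before the Job is marked Failed
  activeDeadlineSeconds: 3600      # hard cap on the Job's total runtime, in seconds
  ttlSecondsAfterFinished: 86400   # auto-delete the finished Job after one day
```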

ubuntu@ip-172-31-25-177:~$ kubectl get jobs
NAME                             COMPLETIONS   DURATION   AGE
pt-osc-atsaloux-request-123456   1/1           4s         53m
pt-osc-atsaloux-request-78910    0/1           83s        83s

ubuntu@ip-172-31-25-177:~$ kubectl delete job pt-osc-atsaloux-request-78910
job.batch "pt-osc-atsaloux-request-78910" deleted

ubuntu@ip-172-31-25-177:~$ kubectl get jobs
NAME                             COMPLETIONS   DURATION   AGE
pt-osc-atsaloux-request-123456   1/1           4s         53m

Again, this is a high-level idea, and operations other than schema changes can be performed in a similar way. You can, of course, further improve on it or follow other approaches. In K8s, there are several ways to achieve the same result, so feel free to comment!

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Try Percona Distribution for MySQL today!