Category: FromDual

Shinguz: Dropped Tables with FromDual Backup Manager

Some applications have the bad behaviour to CREATE or DROP tables while our FromDual Backup Manager (bman) backup is running.

This leads to the following bman error message:

/opt/mysql/product/5.7.26/bin/mysqldump --user=dba --host=migzm96i --port=3306 --all-databases --quick --single-transaction --flush-logs --triggers --routines --hex-blob --events | tee >(md5sum --binary >/tmp/checksum.23357.md5) | gzip -1
to Destination: /var/mysql/dumps/mysql96i/daily/bck_mysql96i_full_2019-05-22_06-50-01.sql.gz
ERROR: /opt/mysql/product/5.7.26/bin/mysqldump command failed (rc=253).
mysqldump: [Warning] Using a password on the command line interface can be insecure.
Error: Couldn't read status information for table m_report_builder_cache_157_20190521035354 ()
mysqldump: Couldn't execute 'show create table `m_report_builder_cache_157_20190521035354`': Table 'totara.m_report_builder_cache_157_20190521035354' doesn't exist (1146)

There are various strategies to work around this problem:

  • If the table is only temporary create it with the CREATE command as a TEMPORARY TABLE instead of a normal table. This workaround would not work in this case because the table is a caching table which must be available for other connections as well.
  • Try to schedule your application job or your bman job in the way they do not collide. With bman that is quite easy but sometimes not with the application.
  • Try to create the table in its own schema (e.g. cache) which is excluded from bman backup. So you can easily do a bman backup without the cache schema. For example like this:
    $ bman --target=brman@127.0.0.1:3306 --type=schema --schema=-cache --policy=daily

  • If this strategy also does not work (because you cannot change the application behaviour) try to ignore the table. The underlying command mysqldump knows the option --ignore-table:
    mysqldump --help
    ...
      --ignore-table=name Do not dump the specified table. To specify more than one
                          table to ignore, use the directive multiple times, once
                          for each table.  Each table must be specified with both
                          database and table names, e.g.,
                          --ignore-table=database.table.
    

    This option can be used in bman as well. Options to the underlying application are passed through FromDual Backup Manager as follows:

    $ bman --target=brman@127.0.0.1:3306 --type=full --policy=daily --pass-through='--ignore-table=totara.m_report_builder_cache_157_20190521035354'

  • The problem here is, that this table contains a timestamp in its table name (20190521035354). So the table name is changing all the time. To pass through wildcards with --ignore-table is not possible with mysqldump. The tool mysqldump does not support (yet) this feature. The only solution we have in this case is, to ignore the error message with the risk that possible other error messages are also ignored. This is achieved again with the --pass-through option:
    $ bman --target=brman@127.0.0.1:3306 --type=full --policy=daily --pass-through='--force'

I hope with this few tricks we can help you to make your FromDual Backup Manager (bman) backups hassle-free.

Taxonomy upgrade extras: 

Shinguz: Do not underestimate performance impacts of swapping on NUMA database systems

If your MariaDB or MySQL database system is swapping it can have a significant impact on your database query performance! Further it can also slow down your database shutdown and thus influence the whole reboot of your machine. This is especially painful if you have only short maintenance windows or if you do not want to spend the whole night with operation tasks.

When we do reviews of our customer MariaDB or MySQL database systems one of the items to check is Swap Space and swapping. With the free command you can find if your system has Swap Space enabled at all and how much of your Swap Space is used:

# free
              total        used        free      shared  buff/cache   available
Mem:       16106252     3300424      697284      264232    12108544    12011972
Swap:      31250428     1701792    29548636

With the command:

# swapon --show
NAME      TYPE       SIZE USED PRIO
/dev/sdb2 partition 29.8G 1.6G   -1

you can show on which disk drive your Swap Space is physically located. And with the following 3 commands you can find if your system is currently swapping or not:

# vmstat 1
procs ------------memory------------ ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd    free   buff    cache   si   so    bi    bo   in   cs us sy id wa st
 1  0 1701784 692580 355716 11757864    2   12   685   601  237  146  9  3 86  2  0
 0  0 1701784 692472 355720 11757840    0    0     0   196  679 2350  2  1 97  1  0
 0  0 1701784 692720 355728 11757332    0    0     0   104  606 2170  0  1 98  1  0

# sar -W 1
15:44:30     pswpin/s pswpout/s
15:44:31         0.00      0.00
15:44:32         0.00      0.00
15:44:33         0.00      0.00

# sar -S 1
15:43:02    kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
15:43:03     29548648   1701780      5.45     41552      2.44
15:43:04     29548648   1701780      5.45     41552      2.44
15:43:05     29548648   1701780      5.45     41552      2.44

Side note: Recent Linux distributions tend to use Swap Files instead of Swap Partitions. The performance impact seems to be negligible compared to the operational advantages of Swap Files… [ 1 ] [ 2 ] [ 3 ] [ 4 ]

What is Swap Space on a Linux system

Modern Operating Systems like Linux manage Virtual Memory (VM) which consists of RAM (fast) and Disk (HDD very slow and SSD slow). If the Operating System is short in fast RAM it tries to write some “old” pages to slow disk to get more free fast RAM for “new” pages and/or for the file system cache. This technique enables the Operating System to keep more and/or bigger processes running than physical RAM is available (overcommitment of RAM).
If one of those “old” pages is needed again it has to be swapped in which technically is a physical random disk read (which is slow, this is also called a major page fault).
If this block is a MariaDB or MySQL database block this disk read to RAM will slow down your SELECT queries but also INSERT, UPDATE and DELETE when you do write queries. This can severely slow down for example your clean-up jobs which have to remove “old” data (located on disk possibly in Swap Space).

Sizing of Swap Space for database systems

A rule of thumb for Swap Space is: Have always Swap Space but never use it (disk is cheap nowadays)!

A reasonable Swap Space sizing for database systems is the following:

Amount of RAM Swap Space
4 GiB of RAM or less a minimum of 4 GiB of Swap Space, is this really a Database server?
8 GiB to 16 GiB of RAM a minimum of once the amount of RAM of Swap Space
24 GiB to 64 GiB of RAM a minimum of half the amount of RAM of Swap Space
more than 64 GiB of RAM a minimum of 32 GiB of Swap Space

If you have a close look at your Swap usage and if you monitor your Swap Space precisely and if you know exactly what you are doing you can lower these values…

It is NOT recommended to disable Swap Space

Some people tend to disable Swap Space. We see this mainly in virtualized environments (virtual machines) and cloud servers. From the VM/Cloud administrator point of view I can even understand why they disable Swap. But from the MariaDB / MySQL DBA point of view this is a bad idea.

If you do proper MariaDB / MySQL configuration (innodb_buffer_pool_size = 75% of RAM) the server should not swap a lot. But if you exaggerate with memory configuration the system starts swapping heavily. Till to the end the OOM-Killer will be activated by your Linux killing the troublemaker (typically the database process). If you have sufficient Swap Space enabled you get some time to detect a bad database configuration and act accordingly. If you have Swap Space disabled completely you do not get this safety buffer and OOM killer will act immediately and kill your database process when you run out of RAM. This really cannot be in the interest of the DBA.

Some literature to read further about Swap: In defence of swap: common misconceptions

Influence swapping – Swappiness

The Linux kernel documentation tells us the following about swappiness:

swappiness

This control is used to define how aggressive the kernel will swap memory pages. Higher values will increase aggressiveness, lower values decrease the amount of swap. A value of 0 instructs the kernel not to initiate swap until the amount of free and file-backed pages is less than the high water mark in a zone.

The default value is 60.

Source: Documentation for /proc/sys/vm/*

A informative article on StackExchange: Why is swappiness set to 60 by default?

To change your swappiness the following commands will help:

# sysctl vm.swappiness 
vm.swappiness = 60

# sysctl vm.swappiness=1

# sysctl vm.swappiness 
vm.swappiness = 1

To make these changes persistent you have to write it to some kind of configuration file dependent on your Operating System:

#
# /etc/sysctl.d/99-sysctl.conf
#
vm.swappiness=1

Who is using the Swap Space?

For further analysing your Swap Space and to find who is using your Swap Space please see our article MariaDB and MySQL swap analysis.

What if your system is still swapping? – NUMA!

If you did everything correctly until here and your system is still swapping you possibly missed one point: NUMA systems behave a bit tricky related to Databases and swapping. The first person who wrote extensively about this problem in the MySQL ecosystem was Jeremy Cole in 2010 in his two well written articles which you can find here:

What NUMA is you can find here: Non-uniform memory access.

If you have spent your money for an expensive NUMA system you can find with the following command:

# lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                56
On-line CPU(s) list:   0-55
Thread(s) per core:    2
Core(s) per socket:    14
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
Stepping:              1
CPU MHz:               2600.000
CPU max MHz:           2600.0000
CPU min MHz:           1200.0000
BogoMIPS:              5201.37
Virtualization:        VT-x
Hypervisor vendor:     vertical
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              35840K
NUMA node0 CPU(s):     0-13,28-41
NUMA node1 CPU(s):     14-27,42-55

If you are now in the unfortunate situation of having such a huge box with several sockets you can do different things:

  • Configuring your MariaDB / MySQL database to allocate memory evenly on both sockets with the parameter innodb_numa_interleave. This works since MySQL 5.6.27, MySQL 5.7.9 and MariaDB 10.2.4 but there were various bugs in this area in Debian and CentOS packages (e.g. #80288, #78953, #79354 and MDEV-18660).
  • Disable NUMA support in your BIOS (Node Interleaving = enabled). Then there is no NUMA presentation to the Operating System any more.
  • Start your MariaDB / MySQL database with numactl --interleave all as described here: MySQL and NUMA.
  • Set innodb_buffer_pool_size to 75% of half of your RAM. Sad for having too much of RAM.
  • Playing around with the following Linux settings could help to decrease swapping: vm.zone_reclaim_mode=0 and kernel.numa_balancing=0.

Literature

Some further information about Swap Space you can find here:

Taxonomy upgrade extras: 

Shinguz: FromDual Ops Center for MariaDB and MySQL 0.9.1 has been released

FromDual has the pleasure to announce the release of the new version 0.9.1 of its popular FromDual Ops Center for MariaDB and MySQL focmm.

The FromDual Ops Center for MariaDB and MySQL (focmm) helps DBA’s and System Administrators to manage MariaDB and MySQL database farms. Ops Center makes DBA and Admins life easier!

The main task of Ops Center is to support you in your daily MySQL and MariaDB operation tasks. More information about FromDual Ops Center you can find here.

Download

The new FromDual Ops Center for MariaDB and MySQL (focmm) can be downloaded from here. How to install and use focmm is documented in the Ops Center User Guide.

In the inconceivable case that you find a bug in the FromDual Ops Center for MariaDB and MySQL please report it to the FromDual bug tracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Installation of Ops Center 0.9.1

A complete guide on how to install FromDual Ops Center you can find in the Ops Center User Guide.

Upgrade from 0.3 or 0.9.0 to 0.9.1

Upgrade from 0.3 or 0.9.0 to 0.9.1 should happen automatically. Please do a backup of your Ops Center Instance before you upgrade! Please also check Upgrading.

Changes in Ops Center 0.9.1

Upgrade

  • Sever upgrade bug fixed which prohibited installation of v0.9.0.

Build and Packaging

  • RPM package for RHEL/CentOS 7 is available now.
  • DEB package for Ubuntu 18.04 LTS is available now.
  • SElinux Policy Package file added.
  • COMMIT tag was not replaced correctly during build. This is fixed now.
Taxonomy upgrade extras: 

Shinguz: FromDual Ops Center for MariaDB and MySQL 0.9 has been released

FromDual has the pleasure to announce the release of the new version 0.9 of its popular FromDual Ops Center for MariaDB and MySQL focmm.

The FromDual Ops Center for MariaDB and MySQL (focmm) helps DBA’s and System Administrators to manage MariaDB and MySQL database farms. Ops Center makes DBA and Admins life easier!

The main task of Ops Center is to support you in your daily MySQL and MariaDB operation tasks. More information about FromDual Ops Center you can find here.

Download

The new FromDual Ops Center for MariaDB and MySQL (focmm) can be downloaded from here. How to install and use focmm is documented in the Ops Center User Guide.

In the inconceivable case that you find a bug in the FromDual Ops Center for MariaDB and MySQL please report it to the FromDual bug tracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Installation of Ops Center 0.9

A complete guide on how to install FromDual Ops Center you can find in the Ops Center User Guide.

Upgrade from 0.3 to 0.9

Upgrade from 0.3 to 0.9 should happen automatically. Please do a backup of you Ops Center Instance befor you upgrade! Please also check Upgrading.

Changes in Ops Center 0.9

Everything has changed!

Taxonomy upgrade extras: 

Shinguz: MariaDB Prepared Statements, Transactions and Multi-Row Inserts

Last week at the MariaDB/MySQL Developer Training we had one participant asking some tricky questions I did not know the answer by heart.

Also MariaDB documentation was not too verbose (here and here).

So time to do some experiments:

Prepared Statements and Multi-Row Inserts

SQL> PREPARE stmt1 FROM 'INSERT INTO `test`.`test` (`data`) VALUES (?), (?), (?)';
Statement prepared
SQL> SET @d1 = 'Bli';
SQL> SET @d2 = 'Bla';
SQL> SET @d3 = 'Blub';
SQL> EXECUTE stmt1 USING @d1, @d2, @d3;
Query OK, 3 rows affected (0.010 sec)
Records: 3  Duplicates: 0  Warnings: 0
SQL> DEALLOCATE PREPARE stmt1;
SQL> SELECT * FROM test;
+----+------+---------------------+
| id | data | ts                  |
+----+------+---------------------+
|  1 | Bli  | 2019-04-15 17:26:22 |
|  2 | Bla  | 2019-04-15 17:26:22 |
|  3 | Blub | 2019-04-15 17:26:22 |
+----+------+---------------------+

Prepared Statements and Transactions

SQL> SET SESSION autocommit=Off;
SQL> START TRANSACTION;
SQL> PREPARE stmt2 FROM 'INSERT INTO `test`.`test` (`data`) VALUES (?)';
Statement prepared

SQL> SET @d1 = 'BliTrx';
SQL> EXECUTE stmt2 USING @d1;
Query OK, 1 row affected (0.000 sec)

SQL> SET @d1 = 'BlaTrx';
SQL> EXECUTE stmt2 USING @d1;
Query OK, 1 row affected (0.000 sec)
SQL> COMMIT;

-- Theoretically we should do a START TRANSACTION; here again...
SQL> SET @d1 = 'BlubTrx';
SQL> EXECUTE stmt2 USING @d1;
Query OK, 1 row affected (0.000 sec)
SQL> ROLLBACK;

SQL> DEALLOCATE PREPARE stmt2;
SQL> SELECT * FROM test;
+----+---------+---------------------+
| id | data    | ts                  |
+----+---------+---------------------+
| 10 | BliTrx  | 2019-04-15 17:33:30 |
| 11 | BlaTrx  | 2019-04-15 17:33:39 |
+----+---------+---------------------+

Prepared Statements and Transactions and Multi-Row Inserts

SQL> SET SESSION autocommit=Off;
SQL> START TRANSACTION;
SQL> PREPARE stmt3 FROM 'INSERT INTO `test`.`test` (`data`) VALUES (?), (?), (?)';
Statement prepared

SQL> SET @d1 = 'Bli1Trx';
SQL> SET @d2 = 'Bla1Trx';
SQL> SET @d3 = 'Blub1Trx';
SQL> EXECUTE stmt3 USING @d1, @d2, @d3;
Query OK, 3 rows affected (0.000 sec)
SQL> COMMIT;

-- Theoretically we should do a START TRANSACTION; here again...
SQL> SET @d1 = 'Bli2Trx';
SQL> SET @d2 = 'Bla2Trx';
SQL> SET @d3 = 'Blub2Trx';
SQL> EXECUTE stmt3 USING @d1, @d2, @d3;
Query OK, 3 rows affected (0.000 sec)
SQL> ROLLBACK;

-- Theoretically we should do a START TRANSACTION; here again...
SQL> SET @d1 = 'Bli3Trx';
SQL> SET @d2 = 'Bla3Trx';
SQL> SET @d3 = 'Blub3Trx';
SQL> EXECUTE stmt3 USING @d1, @d2, @d3;
Query OK, 3 rows affected (0.001 sec)
SQL> COMMIT;

SQL> DEALLOCATE PREPARE stmt3;
SQL> SELECT * FROM test;
+----+----------+---------------------+
| id | data     | ts                  |
+----+----------+---------------------+
|  1 | Bli1Trx  | 2019-04-15 17:37:50 |
|  2 | Bla1Trx  | 2019-04-15 17:37:50 |
|  3 | Blub1Trx | 2019-04-15 17:37:50 |
|  7 | Bli3Trx  | 2019-04-15 17:38:38 |
|  8 | Bla3Trx  | 2019-04-15 17:38:38 |
|  9 | Blub3Trx | 2019-04-15 17:38:38 |
+----+----------+---------------------+

Seems all to work as expected. Now we know it for sure!

Shinguz: Uptime of a MariaDB Galera Cluster

A while ago somebody on Google Groups asked for the Uptime of a Galera Cluster. The answer is easy… Wait, no! Not so easy… The uptime of a Galera Node is easy (or not?). But Uptime of the whole Galera Cluster?

My answer then was: “Grep the error log.” My answer now is still: “Grep the error log.” But slightly different:

$ grep 'view(view_id' *
2019-03-07 16:10:26 [Note] WSREP: view(view_id(PRIM,0e0a2851,1) memb {
2019-03-07 16:14:37 [Note] WSREP: view(view_id(PRIM,0e0a2851,2) memb {
2019-03-07 16:16:23 [Note] WSREP: view(view_id(PRIM,0e0a2851,3) memb {
2019-03-07 16:55:56 [Note] WSREP: view(view_id(NON_PRIM,0e0a2851,3) memb {
2019-03-07 16:56:04 [Note] WSREP: view(view_id(PRIM,6d80bb1a,5) memb {
2019-03-07 17:00:28 [Note] WSREP: view(view_id(NON_PRIM,6d80bb1a,5) memb {
2019-03-07 17:01:11 [Note] WSREP: view(view_id(PRIM,24f67954,7) memb {
2019-03-07 17:18:58 [Note] WSREP: view(view_id(NON_PRIM,24f67954,7) memb {
2019-03-07 17:19:31 [Note] WSREP: view(view_id(PRIM,a380c8cb,9) memb {
2019-03-07 17:20:27 [Note] WSREP: view(view_id(PRIM,a380c8cb,11) memb {
2019-03-08  7:58:38 [Note] WSREP: view(view_id(PRIM,753a350f,15) memb {
2019-03-08 11:31:38 [Note] WSREP: view(view_id(NON_PRIM,753a350f,15) memb {
2019-03-08 11:31:43 [Note] WSREP: view(view_id(PRIM,489e3c67,17) memb {
2019-03-08 11:31:58 [Note] WSREP: view(view_id(PRIM,489e3c67,18) memb {
...
2019-03-22  7:05:53 [Note] WSREP: view(view_id(NON_PRIM,49dc20da,49) memb {
2019-03-22  7:05:53 [Note] WSREP: view(view_id(PRIM,49dc20da,50) memb {
2019-03-26 12:14:05 [Note] WSREP: view(view_id(NON_PRIM,49dc20da,50) memb {
2019-03-27  7:33:25 [Note] WSREP: view(view_id(NON_PRIM,22ae25aa,1) memb {

So this Cluster had an Uptime of about 18 days and 20 hours. Why can I seed this? Simple: In the brackets there is a number at the very right. This number seems to be the same as wsrep_cluster_conf_id which is reset by a full Galera Cluster shutdown.

So far so good. But, wait, what is the definition of Uptime? Hmmm, not so helpful, how should I interpret this for a 3-Node Galera Cluster?

I would say a good definition for Uptime of a Galera Cluster would be: “At least one Galera Node must be available for the application for reading and writing.” That means PRIM in the output above. And we still cannot say from the output above if there was at least on Galera Node available (reading and writing) at any time. For this we have to compare ALL 3 MariaDB Error Logs… So it does not help, we need a good Monitoring solution to answer this question…

PS: Who has found the little fake in this blog?

Taxonomy upgrade extras: 

Shinguz: Linux system calls of MySQL process

We had the problem today that a MySQL Galera Cluster node with the multi-tenancy pattern caused a lot of system time (sy 75%, load average about 30 (you really must read this article by Brendan Gregg, it is worth it!)) so we wanted to find what system calls are being used to see what could cause this issue (to verify if it is a TOC or a TDC problem:

$ sudo strace -c -p $(pidof -s mysqld) -f -e trace=all
Process 5171 attached with 41 threads
Process 16697 attached
^C
Process 5171 detached
...
Process 5333 detached
Process 16697 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 66.85    1.349700         746      1810           io_getevents
 25.91    0.523055        1298       403       197 futex
  4.45    0.089773        1069        84        22 read
  2.58    0.052000       13000         4         3 restart_syscall
  0.19    0.003802        1901         2           select
  0.01    0.000235           3        69         1 setsockopt
  0.01    0.000210          18        12           getdents
  0.00    0.000078           2        32           write
  0.00    0.000056           1        49           fcntl
  0.00    0.000026           4         6           openat
  0.00    0.000012           2         6           close
  0.00    0.000000           0         2         2 open
  0.00    0.000000           0        22           stat
  0.00    0.000000           0         2           mmap
  0.00    0.000000           0         7           mprotect
  0.00    0.000000           0        16           pread
  0.00    0.000000           0         1           access
  0.00    0.000000           0         1           sched_yield
  0.00    0.000000           0         5           madvise
  0.00    0.000000           0         1           accept
  0.00    0.000000           0         1           getsockname
  0.00    0.000000           0         1           clone
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    2.018947                  2537       225 total

$ man io_getevents
...

See also: Configuration of MySQL for Shared Hosting.

Shinguz: MariaDB and MySQL Database Consolidation

Shinguz: MariaDB and MySQL Database Consolidation

We see at various customers the request for consolidating their MariaDB and MySQL infrastructure. The advantage of such a measure is clear in the first step: Saving costs! And this requests comes typically from managers. But what we unfortunately see rarely is to question this request from the IT engineering perspective. Because it comes, as anything in life, with some “costs”. So, saving costs with consolidation on one side comes with “costs” for operation complexity on the other side.

To give you some arguments for arguing with managers we collected some topics to consider before consolidating:

  • Bigger Database Instances are more demanding in handling than smaller ones:
    • Backup and Restore time takes longer. Copying files around takes longer, etc.
    • Possibly your logical backup with mysqldump does not restore any longer in a reasonable amount of time (Mean Time to Repair/Recover (MTTR) is not met any more). You have to think about some physical backup methods including MariaDB or MySQL Enterprise Backup solutions.
    • Consolidated database instances typically contain many different schemas of various different applications. In case of problems you typically want to restore and possibly recover only one single schema and not all schemas. And this becomes much more complicated (depending on you backup strategy). MariaDB/MySQL tooling is not yet (fully) prepared for this situation (#17365). Possibly your old backup strategy is not adequate any more?
    • When you restore a schema you do not want the application interfering with your restore. How can you properly exclude the one application from your database instance while you are restoring? Locking accounts (possible only with MariaDB 10.4 and MySQL 5.7 and newer). Tricks like --skip-networking, adding Firewall rules, --read-only, database port change (--port=3307), do not work any more (as easy)!
    • In short the costs are: Restore/Recovery Operations becomes more demanding!
  • Do NOT mix schemas of different criticalities into the same database instance! The worst cases we have seen were some development schemas which were on the same high-availability Cluster like highly critical transactional systems. The developers did some nasty things on their development systems (which IMHO is OK for them on a development system). What nobody considered in this case was that the troubles from the development schema brought down the whole production schema which was located on the same machine… Cost: Risk of failure of your important services caused by some non-important services AND planing becomes more expensive and you need to know more about all instances and other instances.
  • This phenomena is also called Noisy Neighbor effect. Noisy Neighbors become a bigger issues with consolidated systems. You have to know much more in detail what you and everybody else is doing on the system! Do you…? Costs are: More know-how is required, better education and training of people, more clever people, better planning, better monitoring, etc.
  • When you consolidate different applications into one system it becomes more critical than the previous ones on their own. So you have to think about High-Availability solutions. Costs are: 1 to 4 new instances (for HA), more complexity, more know-how, more technologies… Do you plan to buy an Enterprise Support subscription?
  • Do NOT mix different maintenances windows (Asia vs. Europe vs. America) or daily online-business and nightly job processing. You get shorter maintenance windows. Costs are: Better planning is needed, costly night and weekend maintenance time, etc…
    Shinguz: MariaDB and MySQL Database Consolidation 1
    Europe 12:00
    China 19:00 (7 hours ahead of us)
    US east 07:00 (5 hours behind us)
    US west 04:00 (8 hours behind us)
  • Resource Fencing becomes more tricky. Within the same instance resource fencing becomes more tricky and is not really doable atm. MySQL 8.0 shows some firsts steps with the Resource Groups but this is pretty complicated and is by far not complete and usable yet. A better way would be to install several instances on the same machine an fence them with some O/S means like Control Groups. This comes at the costs of know-how, complexity and more complicated set-ups.
  • Naming conflicts can happen: Application a) is called `wiki` and application b) is called `wiki` as well and for some reasons you cannot rename them (any more).
  • Monitoring becomes much more demanding and needs to be done more fine grained. You want to know exactly what is going on your system because it can easily have some side effects on many different schemas/applications. Example of today: We were running out of kernel file descriptors (file-max) and we did not recognize it in the beginning.
  • Consolidated things are a much a higher Bulk Risk (this is true also for SAN or Virtualisation Clusters). When you have an outage not only one application is down but the whole company is down. We have seen this already for SAN and Virtualisation Clusters and we expect to see that soon also on highly consolidated Database Clusters. Costs: Damage on the company is bigger for one incident.
  • Different applications have different configuration requirements which possibly conflict with other requirements from other applications (Jira from Atlassian is a good example for this).
    Server variables cannot be adjusted any more according to somebody’s individual wishes…

    • sql_mode: Some old legacy applications still require ONLY_FULL_GROUP_BY) 🙁
    • The requirements are conflicting: Performance/fast vs. Safe/durability: innodb_flush_log_at_trx_commit, sync_binlog, crash-safe binary logging, etc.
    • Transaction isolation: transaction_isolation = READ-COMMITTED (old: tx_isolation, Jira again as an example) vs. REPEATABLE-READ (default). Other applications which do not assume, that transaction isolation behaviour changes. And cannot cope with it. Have you ever asked your developers if their application can cope with a different transaction isolation levels? 🙂 Do they know what you are talking about?
    • Character set (utf8_bin for Jira as example again), which can be changed globally or on a schema level, but it has to be done correctly for all participants.
  • Some applications require MariaDB some application require MySQL. They are not the same databases any more nowadays (8.0 vs. 10.3/10.4). So you cannot consolidate them (easily).
  • You possibly get a mixture of persistent connections (typically Java with connection pooling) and non-persistent connections (typically PHP and other languages). Which causes different database behaviour, which has an impact on how you configure the database instance. Which is more demanding and needs more knowledge of the database AND the application or you solve it with more RAM.
  • You need to know much more about you application to understand what it does and how could it interfere with others…
  • When you consolidate more and more schemas into your consolidated database server you have to adjust your database setting as well from time to time (innodb_buffer_pool_size, table_open_cache, table_definition_cache, O/S File descriptors, etc). And possibly add more RAM, CPU and stronger I/O. When is your network saturated? Have you thought about this already?

This leads us to the result that consolidation let us save some costs on infrastructure but adds additional costs on complexity, skills etc. Theses costs will grow exponentially and thus at some point it is not worth the effort any more. This will end up in not only one big consolidated instance but possibly in a hand full of them.

Shinguz: MariaDB and MySQL Database Consolidation 2

Where this point is for you you have to find yourself…

Alternatives to consolidating everything into one instance

  • 1 Machine can contain 1 to many Database Instances can contain 1 to many Schemas. Instead of putting all schemas into one machine, think about installing several instances on one machine. This comes at the cost of more complexity. MyEnv will help you to manage this additional complexity.
  • 1 Machine can contain 1 to many Virtual Machines (VMs, kvm, XEN, VMWare, etc.) can contain 1 to many Instance(s) can contain 1 to many Schemas. This comes at the cost of even more complexity and pretty complex technology (Virtualization).

Taxonomy upgrade extras: 

Shinguz: FromDual Performance Monitor for MariaDB and MySQL 1.0.2 has been released

FromDual has the pleasure to announce the release of the new version 1.0.2 of its popular Database Performance Monitor for MariaDB, MySQL, Galera Cluster and Percona Server fpmmm.

The new FromDual Performance Monitor for MariaDB and MySQL (fpmmm) can be downloaded from here. How to install and use fpmmm is documented in the fpmmm Installation Guide.

In the inconceivable case that you find a bug in the FromDual Performance Manager for MariaDB and MySQL please report it the FromDual Bugtracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Monitoring as a Service (MaaS)

You do not want to set-up your Database monitoring yourself? No problem: Choose our MariaDB and MySQL Monitoring as a Service (Maas) program to safe costs!

Upgrade from 1.0.x to 1.0.2

shell> cd /opt
shell> tar xf /download/fpmmm-1.0.2.tar.gz
shell> rm -f fpmmm
shell> ln -s fpmmm-1.0.2 fpmmm

Changes in FromDual Performance Monitor for MariaDB and MySQL 1.0.2

This release contains various bug fixes.

You can verify your current FromDual Performance Monitor for MariaDB and MySQL version with the following command:

shell> fpmmm --version

fpmmm agent

  • Server entropy probe added.
  • Processlist empty state is covered.
  • Processlist statements made more robust.
  • Error caught properly after query.
  • Branch for Ubuntu is different, fixed.
  • PHP Variable variables_order is included into program.
  • Fixed the documentation URL in file INSTALL.
  • Connection was not set to utf8. This is fixed now.
  • fprint error fixed.
  • Library myEnv.inc updated from MyEnv project.

fpmmm Templates

  • Backup template added.
  • SQL thread and IO thread error more verbose and running again triggers implemented. Typo in slave template fixed.
  • Forks graph fixed, y axis starts from 0.

fpmmm agent installer

  • Error messages made more flexible.

For subscriptions of commercial use of fpmmm please get in contact with us.

Taxonomy upgrade extras: 
Shinguz: MariaDB and MySQL consulting by plane

Shinguz: MariaDB and MySQL consulting by plane

Since January 2019 FromDual tries to contribute actively a little bit against global warming too.

The best for the climate would be to NOT travel to the customer at all! For this cases we have our FromDual remote-DBA services for MariaDB and MySQL.

But sometimes customer wants or needs us on-site for our FromDual in-house trainings or our FromDual on-site consulting engagements. In these cases we try to travel by train. Travelling by train is after walking or travelling by bicycle the most climate friendly way to travel:

But some customers are located more than 7 to 8 hours far away by train. For these customers we have to take the plan which is not good for the climate at all. But at least we will compensate for our CO2 emission via MyClimate.org:

my_climate.png