Planet MySQL

Monitoring MySQL backups with Datadog and TwinDB Backup Tool

Monitoring MySQL backups is a vital part of any reliable backup solution. By monitoring the most common disaster recovery metrics, the Recovery Time Objective and the Recovery Point Objective, you can find out if a backup job was successful and produced a usable backup copy. The TwinDB Backup Tool along with Datadog allows monitoring both […]

The post Monitoring MySQL backups with Datadog and TwinDB Backup Tool appeared first on TwinDB.

Updating/Deleting Rows with ClickHouse (Part 1)

In this post, we’ll look at updating and deleting rows with ClickHouse. It’s the first of two parts.

Update: Part 2 of this post is here.

ClickHouse is fast – blazing fast! It’s quite easy to pick up, and with ProxySQL integrating with existing applications already using MySQL, it’s way less complex than using other analytics options. However, ClickHouse does not support UPDATE/DELETE (yet). That entry barrier can easily dissuade potential users despite the good things I mentioned.

If there is a will, there is a way! We have so far taken advantage of the new feature that supports a more granular partitioning strategy (by week, by day or something else). With more granular partitions, and knowing which rows have changed, we can drop partitions containing modified/deleted data and re-import them from MySQL. It is worth noting that this feature is still marked as experimental; it is nevertheless a very important one.

To better demonstrate it, let’s say I have the hits table below and want to import it into ClickHouse for analysis:

CREATE TABLE `hits` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `type` varchar(100) DEFAULT NULL,
  `user_id` int(11) NOT NULL,
  `location_id` int(11) NOT NULL,
  `created_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `created_at` (`created_at`)
) ENGINE=InnoDB;

Below is the corresponding ClickHouse table structure.

CREATE TABLE hits (
  id Int32,
  created_day Date,
  type String,
  user_id Int32,
  location_id Int32,
  created_at Int32
) ENGINE = MergeTree
PARTITION BY toMonday(created_day)
ORDER BY (created_at, id)
SETTINGS index_granularity = 8192;
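Since the ClickHouse table is partitioned by toMonday(created_day), every MySQL row maps to the Monday of the week its created_at falls in. As a rough illustration (plain Python, not from the original post), the partition value can be derived like this:

```python
from datetime import date, datetime, timedelta

def to_monday(ts: datetime) -> date:
    """Equivalent of ClickHouse toMonday(): the Monday of the week containing ts."""
    d = ts.date()
    return d - timedelta(days=d.weekday())  # weekday(): Monday == 0

# A row created on Wednesday 2018-01-10 lands in the partition for Monday 2018-01-08.
print(to_monday(datetime(2018, 1, 10, 15, 30)))  # -> 2018-01-08
```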

Here’s a workflow diagram of how this works.

  1. The first step is to capture changes from the table(s). We cannot tell from the table itself that a DELETE occurred unless deletes are modeled as INSERTs with some indication that the row should be invisible, and supporting that would mean converting a lot of existing tables to an INSERT-only model. Therefore we use a separate application to capture changes from the binary logs instead.
  2. Once we have established a tap into the source MySQL server's binary logs, we filter the changes we need to record: in this case, UPDATE and DELETE. For each matching modified or deleted row, we create a record that indicates which partition it affects in the corresponding ClickHouse table. For the example table above, we simply convert the “created_at” column into a valid partition value based on the corresponding ClickHouse table. Examples here.
  3. Before we can consume the changelog, we’d have to import our table in full. The common use case is a simple import from MySQL to ClickHouse with one-to-one column mapping (except maybe for the partitioning key). In other cases, we’d also have to do transformations on the columns if needed.
  4. Once we have the table fully imported and our changelog captured continuously, we can then rebuild this table continuously:
    • This process consults our changelog table to determine which partitions need to be updated. It would then dump the subset of that data from MySQL, drop the partition on ClickHouse and import the new data.
    • Based on the PRIMARY KEY value from the source MySQL table, we can also determine what new rows we need to dump from the source table and INSERT to ClickHouse. If the partition where new INSERTs go has already been updated, we skip this part.
    • Continue repeating the incremental process as frequently as needed.
  5. Your application should now be ready to use ProxySQL to connect to the ClickHouse backend and start running analytics queries.
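To make steps 3 and 4 concrete, here is a minimal sketch of one iteration of the incremental rebuild. This is hypothetical Python (the function name and stubbed data structures are mine, not from the post): the changelog is reduced to a list of touched partition values, and new rows to a list of (id, partition) pairs.

```python
from datetime import date

def plan_refresh(changelog_partitions, new_rows, max_imported_id):
    """Decide what one incremental-refresh iteration should do.

    changelog_partitions: partition values touched by UPDATE/DELETE (from the binlog tap)
    new_rows:             (id, partition) pairs found on MySQL since the last import
    max_imported_id:      highest PRIMARY KEY value already present in ClickHouse
    """
    # Partitions with modified/deleted rows: drop on ClickHouse, re-dump from MySQL.
    rebuild = set(changelog_partitions)
    # New rows to INSERT; skip any whose partition is being rebuilt anyway.
    inserts = [(rid, part) for rid, part in new_rows
               if rid > max_imported_id and part not in rebuild]
    return rebuild, inserts

monday = date(2018, 1, 8)
rebuild, inserts = plan_refresh(
    changelog_partitions=[monday],
    new_rows=[(101, monday), (102, date(2018, 1, 15))],
    max_imported_id=100,
)
# Row 101 is skipped: its partition gets re-imported in full anyway.
print(rebuild)   # {datetime.date(2018, 1, 8)}
print(inserts)   # [(102, datetime.date(2018, 1, 15))]
```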

Some footnotes:

  • There is a simpler way to do this if you have a table with an INSERT only workload, and a qualified partitioning key already exists. See this blog post.
  • Because this is not real time, queries against the target table in ClickHouse may see inconsistencies between iterations of the incremental refresh, especially while a partition has been dropped but not yet re-imported. Some coordination between the incremental refresh and the querying application helps avoid this.
  • The partitioning key here is weekly. This is fine in our example, given the compromise between how much data we need to keep (ALL vs. one year), the average amount of data per week (which affects the regular partition DROP and import), and the total number of partitions that would result.
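The trade-off in that last footnote can be quantified with back-of-the-envelope arithmetic (all numbers below are hypothetical, for illustration only):

```python
def partition_stats(rows_per_day, retention_weeks):
    """Rough sizing for a weekly partitioning key: how many partitions we keep,
    and how many rows each DROP/re-import cycle moves on average."""
    rows_per_partition = rows_per_day * 7
    return retention_weeks, rows_per_partition

# Keeping one year of data at a hypothetical 1M rows/day:
parts, rows = partition_stats(rows_per_day=1_000_000, retention_weeks=52)
print(parts, rows)  # -> 52 7000000
```

Switching to daily partitions would divide the re-import cost per cycle by seven but multiply the total partition count by the same factor.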

In the second part of this series, we’ll show the actual process and example codes!

Webinar Wednesday, January 10, 2018: Why We’re Excited About MySQL 8.0

Join Percona’s CEO and Co-Founder, Peter Zaitsev, as he presents Why We’re Excited About MySQL 8.0 on Wednesday, January 10, 2018, at 11:00 am PST / 2:00 pm EST (UTC-8).

Experience: Basic

Tags: Developer, DBAs, Operations, Executive, MySQL

There are many great new features in MySQL 8.0, but how exactly can they help your applications? This session takes a practical look at MySQL 8.0 features and improvements. It looks at the bugs, issues and limitations of previous MySQL versions and how MySQL 8.0 addresses them. It will also cover what you can do with MySQL 8.0 that you couldn’t before.

Register for the webinar now.

Peter Zaitsev, CEO

Peter Zaitsev co-founded Percona and assumed the role of CEO in 2006. As one of the foremost experts on MySQL strategy and optimization, Peter leveraged both his technical vision and entrepreneurial skills to grow Percona from a two-person shop to one of the most respected open source companies in the business. With over 140 professionals in 30 plus countries, Percona now serves over 3000 customers. This includes the “who’s who” of internet giants, large enterprises and many exciting startups. Inc. 5000 recognized Percona in 2013, 2014, 2015 and 2016. Peter was an early employee at MySQL AB, eventually leading the company’s High-Performance Group.

A serial entrepreneur, Peter co-founded his first startup while attending Moscow State University, where he majored in Computer Science. Peter is a co-author of High-Performance MySQL: Optimization, Backups, and Replication, one of the most popular books on MySQL performance. Peter frequently speaks as an expert lecturer at MySQL and related conferences, and regularly posts on the Percona Database Performance Blog. Fortune and DZone tapped Peter as a contributor multiple times, and his recent ebook Practical MySQL Performance Optimization is one of the most popular downloads.

MariaDB MaxScale 2.1 and SSL Certificates

By Wagner Bianchi | Tue, 01/09/2018 - 13:44

MariaDB MaxScale has become increasingly popular as a database proxy, adopted by users who would like to take advantage of a good strategy for scaling out databases and data infrastructure. As with any database environment, it is important to make the environment safe and to adopt the top industry best practices. Most MariaDB MaxScale users have, or will have, MaxScale handling traffic to database instances/backends in a WAN, where servers can be added to MariaDB’s intelligent database proxy and, based on the configuration, traffic is routed to those servers. In some cases, man-in-the-middle and other attack strategies are used to intercept information while data is being replicated and while connections are routed to the backend databases.

This blog post will explore the setup of an environment using self-signed OpenSSL certificates to make it safe enough to replicate data between the multiple backend database servers, and we’ll also show you how to set up secure communication between MariaDB MaxScale and the backends.

The following are the servers and versions we’re using in this blog:

  • 4 VMs vagrant-wise created:
    • 1 MariaDB MaxScale;
    • 1 MariaDB Server as Master;
    • 2 MariaDB Server as Replica;
  • CentOS 7.3 as the operating system;
  • MariaDB Server 10.2.10;
  • MariaDB MaxScale 2.1.10;

For this blog, I assume you already have the servers configured and replicating (one master and two replicas).

By the end of this blog, MariaDB MaxScale will look like this:

[root@maxscale ~]# maxadmin list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
prod_mariadb01     |                 |  3306 |           0 | Master, Running
prod_mariadb02     |                 |  3306 |           0 | Slave, Running
prod_mariadb03     |                 |  3306 |           0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------

If you're following this tutorial, make sure you set up your servers via the MariaDB official repository to have access to the software we will need for the setup.

#: setup MariaDB Official repository
[root@box01 ~]# curl -sS | sudo bash
[info] Repository file successfully written to /etc/yum.repos.d/mariadb.repo.
[info] Adding trusted package signing keys...
[info] Successfully added trusted package signing keys.

Generating the Self-Signed Certificates

The place to start is generating your self-signed OpenSSL certificates; if you would rather acquire a certificate signed by one of the existing certificate authorities, that’s fine as well. Here, I’m going to create the certificates with OpenSSL, present on most Linux distributions by default, and then use them. Below you can find the commands I used to generate the certificates at /etc/my.cnf.d/certs/. One detail: this directory won’t exist on the MariaDB MaxScale host, so you will need to create it and move the certs there.

[root@maxscale ~]# mkdir -pv /etc/my.cnf.d/certs/
mkdir: created directory ‘/etc/my.cnf.d/certs/’
[root@box01 ~]# mkdir -pv /etc/my.cnf.d/certs/
mkdir: created directory ‘/etc/my.cnf.d/certs/’
[root@box02 ~]# mkdir -pv /etc/my.cnf.d/certs/
mkdir: created directory ‘/etc/my.cnf.d/certs/’
[root@box03 ~]# mkdir -pv /etc/my.cnf.d/certs/
mkdir: created directory ‘/etc/my.cnf.d/certs/’

I created the directory on the MariaDB MaxScale server host, changed into /etc/my.cnf.d/certs/, and then created the certificates using the commands below.

#: generate the ca-key
$ openssl genrsa 2048 > ca-key.pem

#: server certs
$ openssl req -new -x509 -nodes -days 9999 -key ca-key.pem > ca-cert.pem
$ openssl req -newkey rsa:2048 -days 1000 -nodes -keyout server-key.pem > server-req.pem
$ openssl x509 -req -in server-req.pem -days 3600 -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 -out server-cert.pem

#: client certs
$ openssl req -newkey rsa:2048 -days 3600 -nodes -keyout client-key.pem -out client-req.pem
$ openssl rsa -in client-key.pem -out client-key.pem
$ openssl x509 -req -in client-req.pem -days 3600 -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 -out client-cert.pem

#: verify generated certificates
$ openssl verify -CAfile ca-cert.pem server-cert.pem client-cert.pem
server-cert.pem: OK
client-cert.pem: OK

One thing to be aware of: if the verification at the end does not give you an OK for both certificates, you likely need different names for the CN (Common Name) on the CA, server and client certificates. The error, when it appears, looks like the one below:

#: executing the SSL certificates verification
$ openssl verify -CAfile ca-cert.pem server-cert.pem client-cert.pem
server-cert.pem: C = BR, ST = MG, L = BH, O = WBC, OU = WB, CN = WB, emailAddress =
error 18 at 0 depth lookup:self signed certificate
OK
client-cert.pem: C = BR, ST = MG, L = BH, O = WBC, OU = WB, CN = WB, emailAddress =
error 18 at 0 depth lookup:self signed certificate
OK

After the certificates have been created successfully and have passed the verification shown above, you will have the following files at /etc/my.cnf.d/certs/:

#: listing certificates on the MariaDB MaxScale server host
[root@maxscale ~]# ls -lh /etc/my.cnf.d/certs/
total 32K
-rw-r--r-- 1 root root 1.4K Nov  5 11:08 ca-cert.pem
-rw-r--r-- 1 root root 1.7K Nov  5 11:07 ca-key.pem
-rw-r--r-- 1 root root 1.3K Nov  5 11:11 client-cert.pem
-rw-r--r-- 1 root root 1.7K Nov  5 11:11 client-key.pem
-rw-r--r-- 1 root root 1.1K Nov  5 11:10 client-req.pem
-rw-r--r-- 1 root root 1.3K Nov  5 11:09 server-cert.pem
-rw-r--r-- 1 root root 1.7K Nov  5 11:09 server-key.pem
-rw-r--r-- 1 root root 1.1K Nov  5 11:09 server-req.pem

Now that you have the client and server certificates, you can go ahead with the setup.

Setting Up GTID Replication SSL Based

If you are working with new servers, configuring SSL-based replication is straightforward: you need one replication user for each of the slaves/replicas you plan to have under your master server, or a single dedicated user allowed to connect to your master from an IP wildcard such as 192.168.100.%. I encourage you to have one user per slave/replica, as it strengthens the security of your environment and avoids issues such as someone else on the same network trying to gain access to the master database. It's true that the replication user only has the REPLICATION SLAVE and REPLICATION CLIENT privileges, but you never know what is going to be attempted. With that in mind, the steps are the following:

  1. Move the client certificates to all 3 database servers, placing them at /etc/my.cnf.d/certs/ (you need to create this directory on all four servers);
  2. Add a file named ssl.cnf under /etc/my.cnf.d, as MariaDB will read it when it starts mysqld;
  3. Create the users, one for each of the slaves, on the master with the REQUIRE SSL directive;
  4. Configure replication on the slaves/replicas with that user, using the required MASTER_SSL directives in the CHANGE MASTER TO command.

To move files around, I like to have key-based authentication configured, which makes things easier, as you don't need to type passwords anymore once the keys are in place on all the servers. You can generate a key on each of the servers, copy them all to the ~/.ssh/authorized_keys file on the central server, which in my case is the MariaDB MaxScale host, and then send the files to all the servers. One additional thing to pay attention to here: the authorized_keys file must have its permissions set to 0600 for this to work. That is one way to go; using your user's password will work as well. You can streamline the process as below (generating a key without a passphrase is bad practice, so consider adding one to make your keys safer):

#: generate a simple key; create a strong one
#: if you are doing this in production
[root@maxscale ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/
[...snip...]

Let's get the key published on the database servers:

#: adding the public key on the other hosts
[root@maxscale ~]# for i in {11..13}; do ssh-copy-id -i ~/.ssh/ 192.168.50.$i; done
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@'s password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh ''"
and check to make sure that only the key(s) you wanted were added.

/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@'s password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh ''"
and check to make sure that only the key(s) you wanted were added.

/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@'s password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh ''"
and check to make sure that only the key(s) you wanted were added.

#: testing if key based SSH is all set
[root@maxscale ~]# for i in {11..13}; do ssh 192.168.50.$i hostname; done
box01
box02
box03

Once the SSH keys are in place, we can move the certificates from the central host to the others. I use rsync for this task; as a hint, you will need to have it installed on all servers:

#: moving certificates to the database hosts
[root@maxscale ~]# for i in {11..13}; do rsync -avrP -e ssh /etc/my.cnf.d/certs/* 192.168.50.$i:/etc/my.cnf.d/certs/; done
sending incremental file list
ca-cert.pem
        1261 100%    0.00kB/s    0:00:00 (xfer#1, to-check=7/8)
ca-key.pem
        1675 100%    1.60MB/s    0:00:00 (xfer#2, to-check=6/8)
client-cert.pem
        1135 100%    1.08MB/s    0:00:00 (xfer#3, to-check=5/8)
client-key.pem
        1675 100%    1.60MB/s    0:00:00 (xfer#4, to-check=4/8)
client-req.pem
         976 100%  953.12kB/s    0:00:00 (xfer#5, to-check=3/8)
server-cert.pem
        1135 100%    1.08MB/s    0:00:00 (xfer#6, to-check=2/8)
server-key.pem
        1704 100%    1.63MB/s    0:00:00 (xfer#7, to-check=1/8)
server-req.pem
         976 100%  953.12kB/s    0:00:00 (xfer#8, to-check=0/8)

sent 11046 bytes  received 164 bytes  22420.00 bytes/sec
total size is 10537  speedup is 0.94
[...snip...]

Once the certificates are in place on all servers, the next step is to add the ssl.cnf file at /etc/my.cnf.d, as below:

#: add the below as the content of the file /etc/my.cnf.d/ssl.cnf
[root@box01 ~]# cat /etc/my.cnf.d/ssl.cnf
[client]
ssl
ssl-ca=/etc/my.cnf.d/certs/ca-cert.pem
ssl-cert=/etc/my.cnf.d/certs/client-cert.pem
ssl-key=/etc/my.cnf.d/certs/client-key.pem

[mysqld]
ssl
ssl-ca=/etc/my.cnf.d/certs/ca-cert.pem
ssl-cert=/etc/my.cnf.d/certs/server-cert.pem
ssl-key=/etc/my.cnf.d/certs/server-key.pem

You should restart MariaDB Server after adding the certificate configuration; if you don't, it won't be possible to connect to the database server with the users we will create below. If something goes wrong with the certificates and you need to generate new ones by repeating the process above, you'll need to restart the database servers again, because client certificates are loaded into memory, and a certificate mismatch produces an error like the one below:

[root@box01 certs]# mysql
ERROR 2026 (HY000): SSL connection error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

Let’s now create a dedicated replication user for each of the servers currently in our replication topology:

box01 [(none)]> CREATE USER repl_ssl@'' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.00 sec)

box01 [(none)]> GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO repl_ssl@'' REQUIRE SSL;
Query OK, 0 rows affected (0.00 sec)

box01 [(none)]> CREATE USER repl_ssl@'' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.00 sec)

box01 [(none)]> GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO repl_ssl@'' REQUIRE SSL;
Query OK, 0 rows affected (0.00 sec)

box01 [(none)]> CREATE USER repl_ssl@'' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.00 sec)

box01 [(none)]> GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO repl_ssl@'' REQUIRE SSL;
Query OK, 0 rows affected (0.00 sec)

Above, we created one user per server, anticipating that we may eventually need to switch the current master over to one of the slaves; that way, the replication user is not a concern when dealing with an emergency or even a planned failover. The next step depends on your situation; I'm going to simplify the case here and assume we're working with a new environment that is not yet in production. To change a production environment over to SSL certificates, you need to spend more time on this and plan it well to avoid service disruptions. So, I'm going to grab the replication coordinates on the master from SHOW MASTER STATUS and then issue the CHANGE MASTER TO command on the slaves to get replication going. Here, I assume you moved all the certs to all database servers and that they live at /etc/my.cnf.d/certs/.

#: getting the current master status
box01 [(none)]> show master status\G
*************************** 1. row ***************************
            File: box01-bin.000024
        Position: 877
    Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)

#: the CHANGE MASTER TO command should be something like the below
box02 [(none)]> CHANGE MASTER TO MASTER_HOST='',
    -> MASTER_USER='repl_ssl',
    -> MASTER_PASSWORD='123456',
    -> MASTER_LOG_FILE='box01-bin.000024',
    -> MASTER_LOG_POS=877,
    -> MASTER_SSL=1,
    -> MASTER_SSL_CA='/etc/my.cnf.d/certs/ca-cert.pem',
    -> MASTER_SSL_CERT='/etc/my.cnf.d/certs/client-cert.pem',
    -> MASTER_SSL_KEY='/etc/my.cnf.d/certs/client-key.pem';
Query OK, 0 rows affected (0.05 sec)

box02 [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)

box02 [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host:
                  Master_User: repl_ssl
                  Master_Port: 3306
                Connect_Retry: 3
              Master_Log_File: box01-bin.000028
          Read_Master_Log_Pos: 794
               Relay_Log_File: box02-relay-bin.000006
                Relay_Log_Pos: 1082
        Relay_Master_Log_File: box01-bin.000028
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...snip...]
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File: /etc/my.cnf.d/certs/ca-cert.pem
           Master_SSL_CA_Path:
              Master_SSL_Cert: /etc/my.cnf.d/certs/client-cert.pem
            Master_SSL_Cipher:
               Master_SSL_Key: /etc/my.cnf.d/certs/client-key.pem
        Seconds_Behind_Master: 0
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl: /etc/my.cnf.d/certs/ca-cert.pem
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:
      Replicate_Do_Domain_Ids:
  Replicate_Ignore_Domain_Ids:
                Parallel_Mode: conservative
1 row in set (0.00 sec)

You can use GTIDs as well, in which case your CHANGE MASTER TO command will be something like:

box03 [(none)]> CHANGE MASTER TO MASTER_HOST='',
    -> MASTER_USER='repl_ssl',
    -> MASTER_PASSWORD='123456',
    -> MASTER_USE_GTID=SLAVE_POS,
    -> MASTER_SSL=1,
    -> MASTER_SSL_CA='/etc/my.cnf.d/certs/ca-cert.pem',
    -> MASTER_SSL_CERT='/etc/my.cnf.d/certs/client-cert.pem',
    -> MASTER_SSL_KEY='/etc/my.cnf.d/certs/client-key.pem';
Query OK, 0 rows affected (0.05 sec)

box03 [(none)]> start slave;
Query OK, 0 rows affected (0.04 sec)

box03 [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host:
                  Master_User: repl_ssl
                  Master_Port: 3306
                Connect_Retry: 3
              Master_Log_File: box01-bin.000028
          Read_Master_Log_Pos: 794
               Relay_Log_File: box03-relay-bin.000002
                Relay_Log_Pos: 654
        Relay_Master_Log_File: box01-bin.000028
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...snip...]
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File: /etc/my.cnf.d/certs/ca-cert.pem
           Master_SSL_CA_Path:
              Master_SSL_Cert: /etc/my.cnf.d/certs/client-cert.pem
            Master_SSL_Cipher:
               Master_SSL_Key: /etc/my.cnf.d/certs/client-key.pem
        Seconds_Behind_Master: 0
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl: /etc/my.cnf.d/certs/ca-cert.pem
           Master_SSL_Crlpath:
                   Using_Gtid: Slave_Pos
                  Gtid_IO_Pos: 0-1-911075
      Replicate_Do_Domain_Ids:
  Replicate_Ignore_Domain_Ids:
                Parallel_Mode: conservative
1 row in set (0.00 sec)

At the end, to make sure replication is all set, you can check whether the error log gives a clean view of everything that was set up. If you also set the report_host variable, you can see the result of SHOW SLAVE HOSTS on the master, like below:

box01 [(none)]> show slave hosts\G
*************************** 1. row ***************************
Server_id: 3
     Host:
     Port: 3306
Master_id: 1
*************************** 2. row ***************************
Server_id: 2
     Host:
     Port: 3306
Master_id: 1
2 rows in set (0.00 sec)

Unfortunately, @@report_host is not a dynamic system variable; you need to add it to the MariaDB Server configuration file and restart mysqld for it to take on the new value. It is passed to the master when the slave/replica's IO_THREAD establishes the connection with the master (the handshake process).

Setting Up MariaDB MaxScale and the ReadWriteSplit SSL Based

Until here, we went through the details of each configuration, from certificate generation to replication configuration and setup. Now we need to go over the MaxScale installation: the commands required to dynamically create a monitor, a service and a listener, and to add the servers, so that at the end the ReadWriteSplit router handles reads and writes for the master and slaves. The steps here will be:

  1. Set up MariaDB MaxScale;
  2. Put together a basic configuration for MariaDB MaxScale and start it;
  3. Create a user for MaxScale's service and another one for the monitor on the backends, with REQUIRE SSL;
  4. Add the SSL certificates to the server and listener definitions;
  5. Run commands that will create a monitor, a listener and a service; we will then create the servers and add them to the monitor;
  6. Create a user for the application on the backends.

To set up MariaDB MaxScale (at the time of writing, at version 2.1.10), run the below, knowing that the MariaDB official repository was set up at the very beginning of this exercise:

#: setting up MariaDB MaxScale
[root@maxscale ~]# yum install maxscale -y
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base:
 * epel:
 * extras:
 * updates:
Resolving Dependencies
--> Running transaction check
---> Package maxscale.x86_64 0:2.1.10-1 will be updated
---> Package maxscale.x86_64 0:2.1.11-1 will be an update
--> Finished Dependency Resolution
[...snip...]

[root@maxscale maxscale.cnf.d]# maxscale --version
MaxScale 2.1.11

You will notice that the password for the maxuser_ssl and maxmon_ssl users is an encrypted hash. It was generated with maxkeys and maxpasswd to avoid clear-text passwords, as you can see below. You will need to generate your own instead of reusing the one shown here.

#: create the secrets file, by default at /var/lib/maxscale
[root@maxscale ~]# maxkeys
Generating .secrets file in /var/lib/maxscale.

#: the password configured on the database servers, encrypted for maxscale configs
[root@maxscale ~]# maxpasswd /var/lib/maxscale/ 123456
AF76BE841B5B4692D820A49298C00272

#: change the ownership of the file /var/lib/maxscale/.secrets
[root@maxscale ~]# chown maxscale:maxscale /var/lib/maxscale/.secrets

Let's now put together a basic configuration to start MariaDB MaxScale. Add the below to MariaDB MaxScale's configuration file:

[root@maxscale ~]# cat /etc/maxscale.cnf
[maxscale]
threads=auto
log_info=true

[rwsplit-service]
type=service
router=readwritesplit
user=maxuser_ssl
passwd=AF76BE841B5B4692D820A49298C00272

[CLI]
type=service
router=cli

[CLI Listener]
type=listener
service=CLI
protocol=maxscaled
socket=default

Before starting MariaDB MaxScale, and before adding a listener to the pre-defined service, a monitor, and the servers on which we set up replication previously, we need to create users: one for the service, one for the monitor, and one for the application that will connect to the backend servers through MaxScale. The below users should be created on the master and then replicated to the replicas:

#: maxscale's mysqlmon user
sudo mysql -e "grant all on *.* to maxmon_ssl@'' identified by '123456' require ssl" -vvv

#: maxscale's service user
sudo mysql -e "grant all on *.* to maxuser_ssl@'' identified by '123456' require ssl" -vvv

#: application user
sudo mysql -e "grant select,insert,delete,update on *.* to appuser_ssl@'192.168.%' identified by '123456' require ssl;" -vvv

Now we can start MariaDB MaxScale using the basic configuration file we just created (I keep mine at /root/maxscale_configs), and check the log file at /var/log/maxscale/maxscale.log:

[root@maxscale certs]# systemctl start maxscale
[root@maxscale certs]# systemctl status maxscale
● maxscale.service - MariaDB MaxScale Database Proxy
   Loaded: loaded (/usr/lib/systemd/system/maxscale.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2017-12-15 13:21:57 GMT; 5s ago
  Process: 13138 ExecStart=/usr/bin/maxscale (code=exited, status=0/SUCCESS)
  Process: 13135 ExecStartPre=/usr/bin/install -d /var/run/maxscale -o maxscale -g maxscale (code=exited, status=0/SUCCESS)
 Main PID: 13140 (maxscale)
   CGroup: /system.slice/maxscale.service
           └─13140 /usr/bin/maxscale

Dec 15 13:21:57 maxscale maxscale[13140]: Loaded module readwritesplit: V1.1.0 from /usr/lib64/maxscale/
Dec 15 13:21:57 maxscale maxscale[13140]: Loaded module maxscaled: V2.0.0 from /usr/lib64/maxscale/
Dec 15 13:21:57 maxscale maxscale[13140]: Loaded module MaxAdminAuth: V2.1.0 from /usr/lib64/maxscale/
Dec 15 13:21:57 maxscale maxscale[13140]: No query classifier specified, using default 'qc_sqlite'.
Dec 15 13:21:57 maxscale maxscale[13140]: Loaded module qc_sqlite: V1.0.0 from /usr/lib64/maxscale/
Dec 15 13:21:57 maxscale maxscale[13140]: Service 'rwsplit-service' has no listeners defined.
Dec 15 13:21:57 maxscale maxscale[13140]: Listening for connections at [/tmp/maxadmin.sock]:0 with protocol MaxScale Admin
Dec 15 13:21:57 maxscale maxscale[13140]: MaxScale started with 1 server threads.
Dec 15 13:21:57 maxscale maxscale[13140]: Started MaxScale log flusher.
Dec 15 13:21:57 maxscale systemd[1]: Started MariaDB MaxScale Database Proxy.

At this point, MaxScale has nothing to report on other than the service we configured in the basic configuration file, which is mandatory to start MaxScale and make it happy on the first initialization. The log events above show that MaxScale was started with a service, but with no listener, no monitor, and no servers. So this is what we're going to create now, running the below commands while watching MaxScale's log file (these were extracted from this blog by Markus Mäkelä and adjusted/fixed on this JIRA):

#: let's create a monitor based on the "mysqlmon" module
[root@maxscale maxscale_configs]# maxadmin create monitor cluster-monitor mysqlmon
Created monitor 'cluster-monitor'

#: the log file will tell the below
2017-10-10 15:34:31   notice : (3) [mysqlmon] Initialise the MySQL Monitor module.

#: let's alter the monitor to add some options
[root@maxscale maxscale_configs]# maxadmin alter monitor cluster-monitor user=maxuser_ssl password=AF76BE841B5B4692D820A49298C00272 monitor_interval=10000

#: the log file will tell you about the last changes
2017-10-10 15:34:31   notice : (3) Loaded module mysqlmon: V1.5.0 from /usr/lib64/maxscale/
2017-10-10 15:34:31   notice : (3) Created monitor 'cluster-monitor'
2017-10-10 15:35:03   notice : (4) Updated monitor 'cluster-monitor': type=monitor
2017-10-10 15:35:03   notice : (4) Updated monitor 'cluster-monitor': user=maxuser_ssl
2017-10-10 15:35:03   notice : (4) Updated monitor 'cluster-monitor': password=AF76BE841B5B4692D820A49298C00272
2017-10-10 15:35:03   notice : (4) Updated monitor 'cluster-monitor': monitor_interval=1000

#: let's restart the monitor to put the changes into effect
[root@maxscale maxscale_configs]# maxadmin restart monitor cluster-monitor
2017-10-10 18:40:50   error  : [mysqlmon] No Master can be determined

#: let's list existing monitors
[root@maxscale maxscale_configs]# maxadmin list monitors
---------------------+---------------------
Monitor              | Status
---------------------+---------------------
cluster-monitor      | Running
---------------------+---------------------

#: let's create the listener, adding the client certificates for the connections
[root@maxscale maxscale.cnf.d]# maxadmin create listener rwsplit-service rwsplit-listener 4006 default default default /etc/my.cnf.d/certs/client-key.pem /etc/my.cnf.d/certs/client-cert.pem /etc/my.cnf.d/certs/ca-cert.pem
Listener 'rwsplit-listener' created

#: this is what the log events tell us
2017-11-22 23:26:18 notice : (5) Using encrypted passwords. Encryption key: '/var/lib/maxscale/.secrets'.
2017-11-22 23:26:18 notice : (5) [MySQLAuth] [rwsplit-service] No users were loaded but 'inject_service_user' is enabled. Enabling service credentials for authentication until database users have been successfully loaded.
2017-11-22 23:26:18 notice : (5) Listening for connections at []:4006 with protocol MySQL
2017-11-22 23:26:18 notice : (5) Created TLS encrypted listener 'rwsplit-listener' at for service 'rwsplit-service'

#: listing the existing listeners
[root@maxscale maxscale.cnf.d]# maxadmin list listeners
Listeners.
-----------------+---------------------+--------------------+-----------------+-------+--------
Name             | Service Name        | Protocol Module    | Address         | Port  | State
-----------------+---------------------+--------------------+-----------------+-------+--------
rwsplit-listener | rwsplit-service     | MySQLClient        |                 | 4006  | Running
CLI Listener     | CLI                 | maxscale           | default         | 0     | Running
-----------------+---------------------+--------------------+-----------------+-------+--------

This is the point at which you need to create the servers and then alter each server's configuration to add the SSL certificates. Let's see:

#: creating the server prod_mariadb01 and altering its configuration to add SSL
[root@maxscale ~]# maxadmin create server prod_mariadb01 3306
Created server 'prod_mariadb01'
[root@maxscale ~]# maxadmin alter server prod_mariadb01 ssl=required ssl_key=/etc/my.cnf.d/certs/client-key.pem ssl_cert=/etc/my.cnf.d/certs/client-cert.pem ssl_ca_cert=/etc/my.cnf.d/certs/ca-cert.pem

#: creating the server prod_mariadb02 and altering its configuration to add SSL
[root@maxscale ~]# maxadmin create server prod_mariadb02 3306
Created server 'prod_mariadb02'
[root@maxscale ~]# maxadmin alter server prod_mariadb02 ssl=required ssl_key=/etc/my.cnf.d/certs/client-key.pem ssl_cert=/etc/my.cnf.d/certs/client-cert.pem ssl_ca_cert=/etc/my.cnf.d/certs/ca-cert.pem

#: creating the server prod_mariadb03 and altering its configuration to add SSL
[root@maxscale ~]# maxadmin create server prod_mariadb03 3306
Created server 'prod_mariadb03'
[root@maxscale ~]# maxadmin alter server prod_mariadb03 ssl=required ssl_key=/etc/my.cnf.d/certs/client-key.pem ssl_cert=/etc/my.cnf.d/certs/client-cert.pem ssl_ca_cert=/etc/my.cnf.d/certs/ca-cert.pem

#: maxscale logs should look like
2017-12-02 18:56:28   notice : (19) Loaded module MySQLBackend: V2.0.0 from /usr/lib64/maxscale/
2017-12-02 18:56:28   notice : (19) Loaded module MySQLBackendAuth: V1.0.0 from /usr/lib64/maxscale/
2017-12-02 18:56:28   notice : (19) Created server 'prod_mariadb01' at
2017-12-02 18:57:57   notice : (20) Enabled SSL for server 'prod_mariadb01'
2017-12-02 19:00:42   notice : (22) Created server 'prod_mariadb02' at
2017-12-02 19:00:49   notice : (23) Enabled SSL for server 'prod_mariadb02'
2017-12-02 19:00:58   notice : (24) Created server 'prod_mariadb03' at
2017-12-02 19:01:04   notice : (25) Enabled SSL for server 'prod_mariadb03'

It’s good to say that MySQLBackend and MySQLBackedAuth are default values for the server’s protocol and the authenticator module respectively and those values are assumed by default when it’s omitted when creating servers. At this point we can show servers to see the servers configured with the SSL certificates:

[root@maxscale ~]# maxadmin show servers | grep -i ssl
    SSL initialized:                     yes
    SSL method type:                     MAX
    SSL certificate verification depth:  9
    SSL certificate:                     /etc/my.cnf.d/certs/client-cert.pem
    SSL key:                             /etc/my.cnf.d/certs/client-key.pem
    SSL CA certificate:                  /etc/my.cnf.d/certs/ca-cert.pem
    SSL initialized:                     yes
    SSL method type:                     MAX
    SSL certificate verification depth:  9
    SSL certificate:                     /etc/my.cnf.d/certs/client-cert.pem
    SSL key:                             /etc/my.cnf.d/certs/client-key.pem
    SSL CA certificate:                  /etc/my.cnf.d/certs/ca-cert.pem
    SSL initialized:                     yes
    SSL method type:                     MAX
    SSL certificate verification depth:  9
    SSL certificate:                     /etc/my.cnf.d/certs/client-cert.pem
    SSL key:                             /etc/my.cnf.d/certs/client-key.pem
    SSL CA certificate:                  /etc/my.cnf.d/certs/ca-cert.pem

We can then list the servers; you will see that none of them is yet recognized by MaxScale as either master or slave:

[root@maxscale maxscale_configs]# maxadmin list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
prod_mysql03       |                 |  3306 |           0 | Running
prod_mysql02       |                 |  3306 |           0 | Running
prod_mysql01       |                 |  3306 |           0 | Running
-------------------+-----------------+-------+-------------+--------------------

The next step is to add the newly created servers to the monitor and to the service, both created previously:

[root@maxscale maxscale_configs]# maxadmin add server prod_mariadb01 cluster-monitor rwsplit-service
Added server 'prod_mysql01' to 'cluster-monitor'
Added server 'prod_mysql01' to 'rwsplit-service'
[root@maxscale maxscale_configs]# maxadmin add server prod_mariadb02 cluster-monitor rwsplit-service
Added server 'prod_mysql02' to 'cluster-monitor'
Added server 'prod_mysql02' to 'rwsplit-service'
[root@maxscale maxscale_configs]# maxadmin add server prod_mariadb03 cluster-monitor rwsplit-service
Added server 'prod_mysql03' to 'cluster-monitor'
Added server 'prod_mysql03' to 'rwsplit-service'

#: logs
2017-10-10 18:45:45   notice : (16) Added server 'prod_mysql01' to monitor 'cluster-monitor'
2017-10-10 18:45:45   notice : (16) Added server 'prod_mysql01' to service 'rwsplit-service'
2017-10-10 18:45:45   notice : Server changed state: prod_mysql01[]: new_master. [Running] -> [Master, Running]
2017-10-10 18:45:45   notice : [mysqlmon] A Master Server is now available:
2017-10-10 18:45:52   notice : (17) Added server 'prod_mysql02' to monitor 'cluster-monitor'
2017-10-10 18:45:52   notice : (17) Added server 'prod_mysql02' to service 'rwsplit-service'
2017-10-10 18:45:53   notice : Server changed state: prod_mysql01[]: lost_master. [Master, Running] -> [Running]
2017-10-10 18:45:53   error  : [mysqlmon] No Master can be determined
2017-10-10 18:45:56   notice : (18) Added server 'prod_mysql03' to monitor 'cluster-monitor'
2017-10-10 18:45:56   notice : (18) Added server 'prod_mysql03' to service 'rwsplit-service'
2017-10-10 18:45:56   notice : Server changed state: prod_mysql01[]: new_master. [Running] -> [Master, Running]
2017-10-10 18:45:56   notice : Server changed state: prod_mysql03[]: new_slave. [Running] -> [Slave, Running]
2017-10-10 18:45:56   notice : [mysqlmon] A Master Server is now available:

You can see that, when adding the servers to the monitor and to the ReadWriteSplit service, the servers' current states and roles pop up in the log.

[root@maxscale ~]# maxadmin list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
prod_mariadb01     |                 |  3306 |           0 | Master, Running
prod_mariadb02     |                 |  3306 |           0 | Slave, Running
prod_mariadb03     |                 |  3306 |           0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------

All these configurations are modular, and the files created by the dynamic commands we have run so far can be found at /var/lib/maxscale/maxscale.cnf.d:

[root@maxscale maxscale.cnf.d]# ls -lh
total 24K
-rw-r--r-- 1 root root 251 Dec  2 19:10 cluster-monitor.cnf
-rw-r--r-- 1 root root 299 Dec  2 18:57 prod_mariadb01.cnf
-rw-r--r-- 1 root root 299 Dec  2 19:00 prod_mariadb02.cnf
-rw-r--r-- 1 root root 299 Dec  2 19:01 prod_mariadb03.cnf
-rw-r--r-- 1 root root 313 Nov 22 23:26 rwsplit-listener.cnf
-rw-r--r-- 1 root root  71 Dec  2 19:10 rwsplit-service.cnf

The SSL configuration is stored in rwsplit-listener.cnf and in the server files:

[root@maxscale maxscale.cnf.d]# cat rwsplit-listener.cnf
[rwsplit-listener]
type=listener
protocol=MySQLClient
service=rwsplit-service
address=
port=4006
authenticator=MySQLAuth
ssl=required
ssl_cert=/etc/my.cnf.d/certs/client-cert.pem
ssl_key=/etc/my.cnf.d/certs/client-key.pem
ssl_ca_cert=/etc/my.cnf.d/certs/ca-cert.pem
ssl_cert_verify_depth=9
ssl_version=MAX

[root@maxscale maxscale.cnf.d]# cat prod_mariadb0*
[prod_mariadb01]
type=server
protocol=MySQLBackend
address=
port=3306
authenticator=MySQLBackendAuth
ssl=required
ssl_cert=/etc/my.cnf.d/certs/client-cert.pem
ssl_key=/etc/my.cnf.d/certs/client-key.pem
ssl_ca_cert=/etc/my.cnf.d/certs/ca-cert.pem
ssl_cert_verify_depth=9
ssl_version=MAX

[prod_mariadb02]
type=server
protocol=MySQLBackend
address=
port=3306
authenticator=MySQLBackendAuth
ssl=required
ssl_cert=/etc/my.cnf.d/certs/client-cert.pem
ssl_key=/etc/my.cnf.d/certs/client-key.pem
ssl_ca_cert=/etc/my.cnf.d/certs/ca-cert.pem
ssl_cert_verify_depth=9
ssl_version=MAX

[prod_mariadb03]
type=server
protocol=MySQLBackend
address=
port=3306
authenticator=MySQLBackendAuth
ssl=required
ssl_cert=/etc/my.cnf.d/certs/client-cert.pem
ssl_key=/etc/my.cnf.d/certs/client-key.pem
ssl_ca_cert=/etc/my.cnf.d/certs/ca-cert.pem
ssl_cert_verify_depth=9
ssl_version=MAX

At this point, with everything set up, you can test access to your databases through MaxScale using the appuser_ssl user (if you haven't created that user yet, create it now on the master and check authentication). You will notice the event below added to MaxScale's log whenever you create new users, as MaxScale updates its internal information about the users on the backends:

2017-12-03 00:15:17   notice : [MySQLAuth] [rwsplit-service] Loaded 15 MySQL users for listener rwsplit-listener.

With the user created, as we did before, place the contents below in your user's home directory and test access with the appuser_ssl user.

#: check if the mysql client is present on the MaxScale server
[root@maxscale ~]# which mysql
/bin/mysql

#: add the .my.cnf to your user's home directory
[root@maxscale ~]# cat .my.cnf
[client]
ssl
ssl-ca=/etc/my.cnf.d/certs/ca-cert.pem
ssl-cert=/etc/my.cnf.d/certs/client-cert.pem
ssl-key=/etc/my.cnf.d/certs/client-key.pem

[mysql]
ssl
ssl-ca=/etc/my.cnf.d/certs/ca-cert.pem
ssl-cert=/etc/my.cnf.d/certs/client-cert.pem
ssl-key=/etc/my.cnf.d/certs/client-key.pem

To execute the test below you will need the MariaDB-client package installed on the MaxScale host.

[root@maxscale ~]# mysql -u appuser_ssl -p123456 -h -P 4006 -e "select @@server_id\G"
*************************** 1. row ***************************
@@server_id: 2
[root@maxscale ~]# mysql -u appuser_ssl -p123456 -h -P 4006 -e "select @@server_id\G"
*************************** 1. row ***************************
@@server_id: 3


It's a dense read, full of hands-on practice and bells and whistles, but it should serve as a reference when implementing MariaDB MaxScale with traffic going over SSL. This is not only about MaxScale, but also about having MariaDB Servers replicate data over SSL certificates. Remember that, just as MaxScale Dynamic Commands made it possible to configure ReadWriteSplit with mysqlmon, they let you do the same with galeramon. The product is becoming more and more versatile, and the main point to highlight is that it makes placing a load balancer, or an intelligent database proxy, between clients and backends an easy task.

This blog post will explore the setup of an environment using self-signed OpenSSL certificates to make it safe enough to replicate data between the multiple backend database servers and also, we’ll show you how to setup the communication between MariaDB MaxScale and the backend.


How To Achieve PCI Compliance for MySQL & MariaDB with ClusterControl - The Webinar

Join Laurent Blume, Unix Systems Engineer & PCI Specialist and Vinay Joosery, CEO at Severalnines, as they discuss all there is to know about how to achieve PCI compliance for MySQL & MariaDB with ClusterControl in this new webinar on January 30th.

The Payment Card Industry Data Security Standard (PCI-DSS) is a set of technical and operational requirements defined by the PCI Security Standards Council (PCI SSC) to protect cardholder data. These standards apply to all entities that store, process or transmit cardholder data – with requirements for software developers and manufacturers of applications and devices used in those transactions.

Download whitepaper  PCI Compliance for MySQL & MariaDB with ClusterControl

PCI data that resides in a MySQL or MariaDB database must of course also adhere to these requirements, and database administrators must follow best practices to ensure the data is secured and compliant. The PCI standards are stringent and can easily require a spiraling amount of time spent on meeting their requirements. Database administrators can end up overwhelmed when using software that was not designed for compliance, often because it long predates PCI itself, as is the case for most database systems in use today.

That is why, as often as possible, reliable tools must be chosen to help with that compliance, easing out the crucial parts. Each time the compliance for one requirement can be shown to be implemented, working, and logged accordingly, time will be saved. If well-designed, it will only require regular software upgrades, a yearly review and a moderate amount of tweaking to follow the standard's evolution over time.

This new webinar focuses on the PCI-DSS requirements for a MySQL or MariaDB database back-end managed by ClusterControl, in order to help meet these requirements. It will provide a MySQL- and MariaDB-user-focused overview of what the PCI standards mean, how they impact database management, and valuable tips and tricks on how to achieve PCI compliance for MySQL & MariaDB with ClusterControl.

Sign up here!

Date, Time & Registration

Europe/MEA/APAC

Tuesday, January 30th at 09:00 GMT / 10:00 CET (Germany, France, Sweden)

Register Now

North America/LatAm

Tuesday, January 30th at 09:00 PT (US) / 12:00 ET (US)

Register Now

  • Introduction to the PCI-DSS standards
  • The impact of PCI on database management
  • Step by step review of the PCI requirements
  • How to meet the requirements for MySQL & MariaDB with ClusterControl
  • Conclusion
  • Q&A
Speakers

Laurent Blume, Unix Systems Engineer, PCI Specialist

Laurent's career in IT started in 2000; his work has since evolved from POS terminals for a jewelry store chain to infrastructure servers in a government aerospace R&D organization, even touching supercomputers. One constant throughout was the increasing need for security.

For the past 6 years, he has been in charge of first implementing, then keeping up with the PCI-DSS compliance of critical transnational payment authorization systems. Its implementation for databases has been an essential part of the task. For the last few years, it has expanded to the design and productization of MariaDB cluster backends for mobile contactless payments.

Vinay Joosery, CEO & Co-Founder, Severalnines

Vinay is a passionate advocate and builder of concepts and business around distributed database systems.

Prior to co-founding Severalnines, Vinay held the post of Vice-President EMEA at Pentaho Corporation - the Open Source BI leader. He has also held senior management roles at MySQL / Sun Microsystems / Oracle, where he headed the Global MySQL Telecoms Unit, and built the business around MySQL's High Availability and Clustering product lines. Prior to that, Vinay served as Director of Sales & Marketing at Ericsson Alzato, an Ericsson-owned venture focused on large scale real-time databases.

Tags:  clustercontrol MySQL MariaDB cloud pci pci-dss payment card industry data security webinar

Write Set in MySQL 5.7: Group Replication

In my previous post, I wrote that Write Set is not only in MySQL 8.0 but also in MySQL 5.7, though a little hidden.  In this post, I describe Write Set in 5.7, and this will bring us into the inner workings of Group Replication.  I am also using this opportunity to explain and show why members of a group can replicate faster than a standard slave.  We will also see the impacts, on Group Replication,

Using ioping to Evaluate Storage Performance for MySQL Workloads

In this blog post, we’ll look at how ioping can be used with other tools to understand and troubleshoot storage performance, specifically as it relates to MySQL workloads.

I recently ran into ioping, a nice little utility by Konstantin Khlebnikov that checks storage latency.  

For me, the main beauty of ioping is its simplicity and familiarity. It takes after the ubiquitous ping tool, but “pings” the storage instead of the network device.

First, let's talk about what this tool isn't: it isn't a benchmark tool to stress-load your storage as heavily as possible. For that, you can use iozone or sysbench (among many others). This also isn't a tool for looking at storage latency history. Use PMM's Disk Performance Dashboard for that instead (among many others).

However, it is a great tool to evaluate the current storage latency. It allows you to see if storage is performing as expected, or if there are some performance issues that can express themselves as general slowness and as latency spikes for some requests. These latency issues are not always easily visible in historical graphs that are plotting averages.

What Storage Latencies Matter Most for MySQL?

Before we look at using ioping to measure them, what IO latencies matter most for MySQL?      

The first is sequential synchronous writes to the InnoDB log file. Any stalls in these will stall write transaction commits, and all following transaction commits as well. Even though MySQL supports group commit, only one such group commit operation can be processed at any moment in time.

The second is random reads, which are submitted through asynchronous IO, typically using a DirectIO operation. This is critical for serving your general IO-intensive queries: SELECTs, UPDATEs, DELETEs and most likely INSERTs rely on fetching such data from storage. These fetches are latency sensitive: since they must complete during query execution, they can't be delayed.

You may ask, "What about random writes?" Random (non-sequential) writes happen in the background as InnoDB flushes dirty pages from its buffer pool. While this is important, it only requires that storage has enough throughput to keep up with the workload. It is not latency sensitive, since it is not in any query execution critical path.

One more access pattern important for MySQL performance is writing binary logs (especially with sync_binlog=1). This is different from writing to the InnoDB log file, because writes go to the end of file and cause the file to grow. As such, it requires constant updates to the file system metadata. Unfortunately, it doesn’t look like ioping supports this IO pattern yet.
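Since ioping doesn't support that append-and-grow pattern, one rough way to approximate it yourself is a short script that appends a block and fsyncs in a loop. This is a hypothetical sketch, not from the original post; the helper name, file path and counts are arbitrary, and it illustrates the access pattern rather than being a proper benchmark:

```python
import os
import tempfile
import time


def append_fsync_latencies(path, writes=100, size=4096):
    """Append `size` bytes and fsync after each write, mimicking binary
    log writes with sync_binlog=1: the file grows on every write, so the
    file system metadata must be updated as well."""
    latencies = []
    payload = b"\0" * size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        lat = append_fsync_latencies(os.path.join(d, "binlog-test"), writes=50)
        print("avg %.1f us, max %.1f us"
              % (sum(lat) / len(lat) * 1e6, max(lat) * 1e6))
```

As with ioping, run it against the file system you actually care about; a temporary directory on tmpfs would show unrealistically low numbers.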

Simulating MySQL IO Patterns with ioping

Now let’s see how we can simulate such IO patterns with ioping.

To simulate writing to the log file, we will use a medium-sized file (64M) and sequential 4K size writes to assess the latency:

[root@ip-172-31-16-160 ~]# ioping -S64M -L -s4k -W -c 10 .
4 KiB >>> . (ext4 /dev/xvda1): request=1 time=522.5 us (warmup)
4 KiB >>> . (ext4 /dev/xvda1): request=2 time=679.1 us
4 KiB >>> . (ext4 /dev/xvda1): request=3 time=705.3 us
4 KiB >>> . (ext4 /dev/xvda1): request=4 time=699.8 us
4 KiB >>> . (ext4 /dev/xvda1): request=5 time=619.1 us
4 KiB >>> . (ext4 /dev/xvda1): request=6 time=785.7 us
4 KiB >>> . (ext4 /dev/xvda1): request=7 time=763.3 us
4 KiB >>> . (ext4 /dev/xvda1): request=8 time=692.2 us
4 KiB >>> . (ext4 /dev/xvda1): request=9 time=649.1 us
4 KiB >>> . (ext4 /dev/xvda1): request=10 time=613.8 us (fast)

--- . (ext4 /dev/xvda1) ioping statistics ---
9 requests completed in 6.21 ms, 36 KiB written, 1.45 k iops, 5.66 MiB/s
generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
min/avg/max/mdev = 613.8 us / 689.7 us / 785.7 us / 55.3 us

For read IO testing, we are better off using 16K IOs (the InnoDB default page size), submitted through asynchronous IO in O_DIRECT mode:

[root@ip-172-31-16-160 ~]# ioping -A -D -s16k -c 10 .
16 KiB <<< . (ext4 /dev/xvda1): request=1 time=311.0 us (warmup)
16 KiB <<< . (ext4 /dev/xvda1): request=2 time=428.9 us
16 KiB <<< . (ext4 /dev/xvda1): request=3 time=337.4 us
16 KiB <<< . (ext4 /dev/xvda1): request=4 time=478.8 us
16 KiB <<< . (ext4 /dev/xvda1): request=5 time=442.7 us
16 KiB <<< . (ext4 /dev/xvda1): request=6 time=367.3 us
16 KiB <<< . (ext4 /dev/xvda1): request=7 time=419.1 us
16 KiB <<< . (ext4 /dev/xvda1): request=8 time=445.9 us
16 KiB <<< . (ext4 /dev/xvda1): request=9 time=454.4 us
16 KiB <<< . (ext4 /dev/xvda1): request=10 time=518.3 us (slow)

--- . (ext4 /dev/xvda1) ioping statistics ---
9 requests completed in 3.89 ms, 144 KiB read, 2.31 k iops, 36.1 MiB/s
generated 10 requests in 9.00 s, 160 KiB, 1 iops, 17.8 KiB/s
min/avg/max/mdev = 337.4 us / 432.5 us / 518.3 us / 51.4 us

The tests above were performed on an idle instance, so they show what you would expect: the best possible latency. If you run a similar test on a system with a real workload, you will likely see significantly more variance. Let's go ahead and simulate some extra load using the sysbench tool:

sysbench fileio --threads=1 --time=600 --file-num=1 --file-test-mode=rndrw prepare
sysbench fileio --threads=1 --time=600 --file-num=1 --file-test-mode=rndrw run

This simulates a relatively light load on the system. For heavier loads, you can create a larger file set and/or increase the number of threads from one to a higher value.

Running this workload, we would get:

Simulated “Log Write”

[root@ip-172-31-16-160 ~]# ioping -S64M -L -s4k -W -c 10 .
4 KiB >>> . (ext4 /dev/xvda1): request=1 time=517.5 us (warmup)
4 KiB >>> . (ext4 /dev/xvda1): request=2 time=686.0 us
4 KiB >>> . (ext4 /dev/xvda1): request=3 time=515.5 us
4 KiB >>> . (ext4 /dev/xvda1): request=4 time=527.6 us
4 KiB >>> . (ext4 /dev/xvda1): request=5 time=1.31 ms
4 KiB >>> . (ext4 /dev/xvda1): request=6 time=508.4 us
4 KiB >>> . (ext4 /dev/xvda1): request=7 time=10.3 ms (slow)
4 KiB >>> . (ext4 /dev/xvda1): request=8 time=489.4 us (fast)
4 KiB >>> . (ext4 /dev/xvda1): request=9 time=5.37 ms
4 KiB >>> . (ext4 /dev/xvda1): request=10 time=471.7 us (fast)

--- . (ext4 /dev/xvda1) ioping statistics ---
9 requests completed in 3.46 ms, 144 KiB read, 2.60 k iops, 40.7 MiB/s
generated 10 requests in 9.00 s, 160 KiB, 1 iops, 17.8 KiB/s

Simulated Query Reads

[root@ip-172-31-16-160 ~]# ioping -A -D -s16k -c 10 .
16 KiB <<< . (ext4 /dev/xvda1): request=1 time=331.9 us (warmup)
16 KiB <<< . (ext4 /dev/xvda1): request=2 time=291.6 us
16 KiB <<< . (ext4 /dev/xvda1): request=3 time=292.0 us
16 KiB <<< . (ext4 /dev/xvda1): request=4 time=368.6 us
16 KiB <<< . (ext4 /dev/xvda1): request=5 time=3.53 ms
16 KiB <<< . (ext4 /dev/xvda1): request=6 time=4.56 ms
16 KiB <<< . (ext4 /dev/xvda1): request=7 time=279.5 us (fast)
16 KiB <<< . (ext4 /dev/xvda1): request=8 time=284.5 us (fast)
16 KiB <<< . (ext4 /dev/xvda1): request=9 time=285.4 us (fast)
16 KiB <<< . (ext4 /dev/xvda1): request=10 time=346.4 us (fast)

--- . (ext4 /dev/xvda1) ioping statistics ---
9 requests completed in 10.2 ms, 144 KiB read, 879 iops, 13.7 MiB/s
generated 10 requests in 9.00 s, 160 KiB, 1 iops, 17.8 KiB/s
min/avg/max/mdev = 279.5 us / 1.14 ms / 4.56 ms / 1.57 ms

Note that even with such trivial background loads you can see 10x or so outliers that can affect query and transaction latency.

Using ioping for Monitoring

What if you not only want to run ioping periodically to check if your storage is performing well now, but also want to use it for monitoring, so that you get an alert if storage latency spikes for any reason? For this, ioping supports the -B option. It suppresses all the fancy human-readable output and only prints raw statistics. It also allows you to supply a ping interval in fractions of a second, which is handy if you want to run enough pings for the data to be statistically significant (without the check taking too long):

[root@ip-172-31-16-160 ~]# ioping -k -B -S64M -L -s4k -W -c 100 -i 0.1 .
99 1061503831 93 382009 476581 10722261 24505246 6074640 100 9901654105

The meaning of all numbers is described on the man page (run man ioping). For monitoring you might want to look at offsets 6,7,8 — which specify avg, max and stdev statistics for IO requests measured in nanoseconds (the manual for my version says it is microseconds, but it is incorrect).
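A minimal sketch of how a monitoring check might parse that raw line, assuming (as described above) that the 1-based fields 6, 7 and 8 are the avg, max and mdev request times in nanoseconds — confirm the field layout against the man page of your ioping version. The function name and threshold are my own illustration, not part of ioping:

```python
def parse_ioping_raw(line, threshold_us=5000):
    """Parse one `ioping -B` raw statistics line and flag latency spikes.

    Fields 6-8 (1-based) are taken as avg/max/mdev request time in ns.
    """
    fields = [int(f) for f in line.split()]
    avg_ns, max_ns, mdev_ns = fields[5], fields[6], fields[7]
    return {
        "avg_us": avg_ns / 1000,
        "max_us": max_ns / 1000,
        "mdev_us": mdev_ns / 1000,
        # alert if the worst request exceeded the threshold (e.g. 5 ms)
        "alert": max_ns / 1000 > threshold_us,
    }


# The raw line from the ioping -B run shown above:
raw = "99 1061503831 93 382009 476581 10722261 24505246 6074640 100 9901654105"
print(parse_ioping_raw(raw))
```

A cron job could run ioping with -B, feed the line through a parser like this, and page only when the max (or mdev) crosses a threshold, which catches the outliers that averages hide.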

Hope you find this tool helpful!

Common Table Expressions (CTEs) Part 1

Occasionally at a conference or a Meetup, someone will approach me and ask for help with a MySQL problem.  Eight out of ten times their difficulty involves a sub query. "I get an error message about a corrugated or conflabugated sub query or some such," they say, desperate for help.  Usually, with a bit of fumbling, we get their problem solved.  The problem is not a lack of knowledge on either side, but that sub queries are often hard to write.

MySQL 8 will be the first version of the most popular database on the web with Common Table Expressions or CTEs.  CTEs are a way to create temporary tables and then use that temporary table for queries. Think of them as easy to write sub queries!

WITH is The Magic Word

The new CTE magic is indicated with the WITH clause.

mysql> WITH myfirstCTE AS (SELECT * FROM city LIMIT 5)
    -> SELECT Name, District
    -> FROM myfirstCTE;
+-------------+------------+
| Name        | District   |
+-------------+------------+
| Abilene     | Texas      |
| Akron       | Ohio       |
| Albany      | New York   |
| Albuquerque | New Mexico |
| Alexandria  | Virginia   |
+-------------+------------+
5 rows in set (0.01 sec)


So in the above example we create a temporary table named 'myfirstCTE' and then query from that table. The CTE itself is a very simple query, SELECT * FROM city, and then two columns are plucked out of 'myfirstCTE'. The CTE can even have modifiers on the query for ORDER BY, LIMIT or any other SQL qualifier.

A Little More Complex Example

Let's try a slightly more complex query.

mysql> WITH secondCTE AS
    -> (SELECT city.Name AS A, country.Name AS B, city.District AS C
    -> FROM city
    -> JOIN country ON (city.CountryCode=country.Code))
    -> SELECT A, B, C FROM secondCTE LIMIT 5;
+----------------+-------------+----------+
| A              | B           | C        |
+----------------+-------------+----------+
| Oranjestad     | Aruba       | –        |
| Kabul          | Afghanistan | Kabol    |
| Qandahar       | Afghanistan | Qandahar |
| Herat          | Afghanistan | Herat    |
| Mazar-e-Sharif | Afghanistan | Balkh    |
+----------------+-------------+----------+
5 rows in set (0.04 sec)


Here we select columns A, B, C from temporary table 'secondCTE'. In this case it is easier to see that we are getting data from the CTE and not the two tables city or country.  At least directly.

The reason I use this example is that oftentimes you have to join two or more tables in which columns with the same name (here, the Name of a country and the Name of a city) come from different sources.  Yes, you could alias them in a sub query, but this way the legibility of the query increases.  And as query complexity builds, things like the old N+1 problems raise their ugly heads.

More Than One Way to Skin a Cat

So let's play again with the information from the second example, but this time we will write two separate CTEs and then join them together.  First we are going to get the Name column from the city table and the Name column from the country table. We do have to drag along the columns needed for the join in the CTEs.  I think that, for many with limited SQL experience, this version is a little easier to comprehend than the second example.

mysql> WITH threeA AS (SELECT Name AS N1, CountryCode AS x FROM city),
    -> threeB AS (SELECT Name AS N2, Code AS y FROM country)
    -> SELECT N1, N2 FROM threeA JOIN threeB WHERE threeA.x = threeB.y LIMIT 5;
+----------------+-------------+
| N1             | N2          |
+----------------+-------------+
| Oranjestad     | Aruba       |
| Kabul          | Afghanistan |
| Qandahar       | Afghanistan |
| Herat          | Afghanistan |
| Mazar-e-Sharif | Afghanistan |
+----------------+-------------+
5 rows in set (0.01 sec)

Hopefully the use of CTEs will produce easier-to-read SQL statements.
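If you want to experiment with the CTE syntax before MySQL 8 is available to you, any engine supporting WITH will do. As a hypothetical sketch (not from the original post), here is the first example's shape reproduced with Python's bundled sqlite3 module, using a tiny stand-in for the city table:

```python
import sqlite3

# In-memory stand-in for the `city` table used in the examples above.
# sqlite3 is used only because it ships with Python and supports WITH;
# the SQL shape matches the MySQL 8 CTE examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (Name TEXT, District TEXT, CountryCode TEXT)")
conn.executemany(
    "INSERT INTO city VALUES (?, ?, ?)",
    [("Abilene", "Texas", "USA"),
     ("Akron", "Ohio", "USA"),
     ("Kabul", "Kabol", "AFG")],
)

# Define the CTE, then pluck two columns out of it:
rows = conn.execute(
    """
    WITH myfirstCTE AS (SELECT * FROM city)
    SELECT Name, District FROM myfirstCTE ORDER BY Name
    """
).fetchall()
print(rows)
```

The same WITH ... AS (...) SELECT ... pattern carries over to MySQL 8 unchanged; only the surrounding table setup differs.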

Percona Server for MySQL 5.7.20-19 Is Now Available

Percona announces the GA release of Percona Server for MySQL 5.7.20-19 on January 3, 2018. Download the latest version from the Percona web site or the Percona Software Repositories. You can also run Docker containers from the images in the Docker Hub repository.

Based on MySQL 5.7.20, including all the bug fixes in it, Percona Server for MySQL 5.7.20-19 is the current GA release in the Percona Server for MySQL 5.7 series. Percona provides completely open-source and free software. Find release details in the 5.7.20-19 milestone at Launchpad.

New Features:
  • The MyRocks storage engine now has General Availability status.
  • Binary log encryption has been implemented and can now be switched on using the encrypt_binlog variable.
  • The innodb_print_lock_wait_timeout_info variable, introduced in the previous release but undocumented, allows logging information about all lock wait timeout errors.
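As a rough illustration of what switching on binary log encryption might look like — this fragment is not from the release notes, and the option names and paths should be verified against the Percona Server 5.7 documentation for your exact version — the feature relies on a keyring plugin to manage the encryption key:

```ini
# Hypothetical my.cnf sketch; verify against the Percona Server docs.
[mysqld]
# Binary log encryption needs a keyring plugin to store the key:
early-plugin-load=keyring_file.so
keyring_file_data=/var/lib/mysql-keyring/keyring
# Switch on encryption of binary log files:
encrypt_binlog=ON
# The documentation recommends enabling these alongside encrypt_binlog:
master_verify_checksum=ON
binlog_checksum=CRC32
```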
Bugs Fixed:
  • An intermediary slave with the Blackhole storage engine couldn't record updates from the master in its own binary log when the master had the binlog_rows_query_log_events option enabled. Bug fixed #1722789 (upstream #88057).
  • The help command in the mysql command line client provided a link to an older version of the Percona Server manual. Bug fixed #1708073.
  • A regression in the mysqld_safe script forced it to print an extra error when stopping the MySQL service. Bug fixed #1738742.
  • The Blackhole storage engine was incompatible with the newer length limit on InnoDB index key prefixes. Bug fixed #1733049 (upstream #53588).
  • Heartbeats received by a slave triggered a sync of the mysql.slave_master_info table on each heartbeat, even with sync_master_info set to zero, causing a huge increase in write load. Bug fixed #1708033 (upstream #85158).
MyRocks Changes:

The release notes for Percona Server for MySQL 5.7.20-19 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

This Week in Data with Colin Charles 22: CPU vulnerabilities and looking forward to 2018

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Happy New Year. Here’s to 2018 being a great year in the open source database world. What is in store for us? Probably: MySQL 8.0 and MariaDB Server 10.3 as generally available. What will we see in the rest of the space? Clouds? All I know is that we move fast, and it’s going to be fun to see what unfolds.

The biggest news this week may not necessarily be database related; it focused on CPU security vulnerabilities and the potential slowdowns of your servers once the updates are applied. Please do read Meltdown and Spectre: CPU Security Vulnerabilities. Peter Zaitsev himself got quoted in Bloomberg:

Peter Zaitsev, the co-founder and chief executive officer of Percona, a Raleigh, North Carolina-based company that helps businesses set and manage large computer databases, said that firms running such databases might see a 10 to 20 percent slowdown in performance from the patches being issued. He said this was not enough to cause major disruptions for most applications. He also said that subsequent versions of the patch would likely further reduce any performance impacts.

He also said that in cases where a company has a server completely dedicated to a single application there was likely no need to implement the patch as these machines are not susceptible to the attacks researchers have discovered.

Now that we’re all looking at 2018, I also recommend reading A Look at Ten New Database Systems Released in 2017 – featuring TimescaleDB, Azure Cosmos DB, Spanner, Neptune, YugaByte, Peloton, JanusGraph, Aurora Serverless, TileDB, and Memgraph. Also, don’t forget to read What lies ahead for data in 2018 – interesting thoughts on making graph/time series data easier, data partnerships, machine learning, and a lot more.

Link List
  • Percona Toolkit 3.0.6 – now with better support for pt-table-sync and MyRocks, pt-stalk checks the RocksDB status, pt-mysql-summary now expanded to include RocksDB information, and more!
  • MariaDB Server 10.2.12 – improvements in InnoDB, like shutdowns not being blocked by large transaction rollback, and more.
Upcoming appearances
  • FOSDEM 2018 – Brussels, Belgium – February 3-4 2018
  • SCALE16x – Pasadena, California, USA – March 8-11 2018

I look forward to feedback/tips via e-mail at or on Twitter @bytebot.

Finding out the MySQL performance regression due to kernel mitigation for Meltdown CPU vulnerability

Update: I included the results for when PCID is disabled, for comparison, as a worst-case scenario.

After learning about Meltdown and Spectre, I waited patiently to get a fix from my OS vendor. However, there were several reports of performance impact due to the kernel mitigation. For example, on the PostgreSQL developers mailing list there were reports of up to 23% throughput loss; Red Hat engineers reported a regression range of 1-20%, flagging OLTP systems as the worst-affected type of workload. As the impact is highly dependent on hardware and workload, I decided to run some tests myself for the use cases I need.

My setup

It is similar to that of my previous tests:

Hardware - desktop grade, no Xeon or proper RAID:

  • Intel(R) Core(TM) i7-4790K CPU @ 4.0GHz (x86_64 Quad-core with hyperthreading) with PCID support (disabling pcid with “nopcid” kernel command line will also be tested)
  • 32 GB of RAM
  • Single, desktop-grade, Samsung SSD 850 PRO 512GB

OS and configuration:

  • Debian GNU/Linux 9.3 “Stretch”, comparing kernels:
    • 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3+deb9u1 (no mitigation)
    • 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (latest kernel with security updates backported, including pti enabled according to security-announces)
  • datadir formatted as xfs, mounted with noatime option, all on top of LVM
  • MariaDB Server 10.1.30 compiled from source, queried locally through unix socket

The tests performed:

  • The single-thread write with LOAD DATA
  • A read-only sysbench with 8 and 64 threads
The results

LOAD DATA (single thread)

We have been measuring LOAD DATA performance of a single OpenStreetMap table (CSV file) in several previous tests, as we detected a regression on some MySQL versions with single-thread write load. I believe it could be an interesting place to start. I tested both the default configuration and another more similar to the WMF production one:

                                                 Load time    rows/s
Unpatched Kernel, default configuration          229.4±1s     203754
Patched Kernel, default configuration            227.8±2.5s   205185
Patched Kernel, nopcid, default configuration    227.9±1.6s   205099
Unpatched Kernel, WMF configuration              163.5±1s     285878
Patched Kernel, WMF configuration                163.3±1s     286229
Patched Kernel, nopcid, WMF configuration        165.1±1.3s   283108

No meaningful regressions are observed in this case between the patched and unpatched default kernels: the variability is within the measured error. The nopcid option could be showing some overhead, but at around 1% it is barely above the measuring error. nopcid is interesting not because of the hardware support, but because of the kernel support: backporting it may not be an option for older distro versions, as Moritz notes in the comments.

It is interesting to notice, although off-topic, that while the results with the WMF "optimized" configuration have improved compared to previous years' results (due, most likely, to improved CPU and memory resources), the defaults have become worse - a reminder that defaults are not a good metric for comparison.

This is not a surprising result: a single thread is not a real OLTP workload, and more time is spent on I/O waits than on the extra syscalls.


Let’s try a different workload - let’s use a proper benchmarking tool, create a table and perform point selects on it, with two different levels of concurrency: 8 threads and 64 threads:

sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=test --mysql-user=test prepare

sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=test --mysql-user=test --max-time=120 --oltp-read-only=on --max-requests=0 --num-threads={8, 64} run

                                     TPS        SELECTs/s     95 percentile of latency (ms)
Unpatched Kernel, 8 threads          7333±30    100953±1000   1.15±0.05
Patched Kernel, 8 threads            6867±150   96140±2000    1.20±0.01
Patched Kernel, nopcid, 8 threads    6637±20    92915±200     1.27±0.05
Unpatched kernel, 64 threads         7298±50    102176±1000   43.21±0.15
Patched Kernel, 64 threads           6768±40    94747±1000    43.66±0.15
Patched Kernel, nopcid, 64 threads   6648±10    93073±100     43.96±0.10

In this case we can observe around a 4-7% regression in throughput with pcid enabled. With pcid disabled, the regression increases to 9-10%: bad, but not as bad as the "up to 20%" some warned about. If you are in my situation, an upgrade to Stretch would be worthwhile to get the pcid support.
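As a cross-check, the regression percentages quoted here follow directly from the TPS column of the table above; a throwaway shell/awk snippet (numbers copied from the results, nothing else assumed) reproduces them:

```shell
#!/bin/sh
# Regression in % = (unpatched_tps - patched_tps) / unpatched_tps * 100
regression() {
    awk -v base="$1" -v patched="$2" \
        'BEGIN { printf "%.1f\n", (base - patched) / base * 100 }'
}

regression 7333 6867   # 8 threads,  pcid enabled
regression 7298 6768   # 64 threads, pcid enabled
regression 7333 6637   # 8 threads,  nopcid
regression 7298 6648   # 64 threads, nopcid
```

That gives 6.4% and 7.3% with pcid, and 9.5% and 8.9% with nopcid, consistent with the 4-7% and 9-10% ranges above.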

Further testing would be required to check at what level of concurrency, or for which kinds of workloads, the extra work on context switches hurts more or less. It would be interesting to measure it with production traffic too, as some of the above could be nullified once network latencies are added to the mix. Further patches may also change the way the mitigation works, and PCID support probably helps transparently on all modern hardware.

Have you detected a larger regression? Are you going to patch all your databases right away? Tell me at @jynus.

Meltdown and Spectre: CPU Security Vulnerabilities

In this blog post, we examine the recent revelations about CPU security vulnerabilities.

The beginning of the new year also brings to light fresh and new CPU security vulnerabilities. Today’s big offenders originate on the hardware side – more specifically, the CPU. The reported hardware kernel bugs allow for direct access to data held in the computer/server’s memory, which in turn might leak sensitive data. Some of the most popular CPU vendors affected by these bugs are Intel, AMD and ARM.

The most important thing to know is that this vulnerability is not exploitable remotely, and requires that someone execute the malicious code locally. However, take extra precaution when running in virtualized environments (see below for more information).

A full overview (including a technical, in-depth explanation) can be found here:

These three CVEs refer to the issues:

Although the problems originate in hardware, you can mitigate the security issues by using updated operating system kernel versions. Patches specific to database servers such as Percona Server for MySQL, Percona Server for MongoDB, Percona XtraDB Cluster and others are unlikely.

Fixes in Various Operating Systems

Fixes and patches are available for Windows and MacOS. Not all major Linux distributions at the time of this post have released patches (though this is expected to evolve rapidly):

Security Impact

As mentioned above, this vulnerability is not exploitable remotely. It requires malicious code to be executed locally. An attacker must have either obtained unprivileged shell access or be able to load malicious code through other applications to be able to access memory from other processes (including MySQL’s memory).

To potentially exploit the vulnerability through MySQL, an attacker theoretically needs to gain access to a MySQL user account that has SUPER privileges. The attacker could then load UDF functions that contain the malicious code in order to access memory from the MySQL Server and other processes.
In MongoDB, a similar attack would need to use eval().

Cloud Providers, Virtualization and Containers

Some hypervisors are affected by this as they might access memory from other virtual machines. Containers are affected as well, as they can share the same kernel space.

More information (source):

Performance Impact

As a general rule, Percona always recommends installing the latest security patches. In this case, however, the decision to immediately apply the patch is complicated by the reported performance impact after doing so. These patches might affect database performance!

Red Hat mentions that the expected impact is measurable, and saw a noticeable hit for both PostgreSQL and Redis.

At this time, Percona does not have conclusive results on how much performance impact you might expect on your databases. We’re working on getting some benchmarks results published shortly. Check back soon!

2017 Year in Review at VividCortex

It’s easy to observe (pun intended) in the rear-view mirror. Hindsight bias aside, 2017 was a big year for VividCortex and our customers! We shipped lots of features and made tons of progress on the business. Here’s a brief overview of some of our proudest moments from 2017.

Enhanced PostgreSQL support

We love PostgreSQL, and we released lots of enhancements for it! Here’s some of the stuff we did for our PostgreSQL customers:

  • Supported the great work the Postgres community did on version 10 and all its new features
  • Made VividCortex awesome for our customers who use CitusDB
  • Added support to monitor PgBouncer, which nearly everyone running Postgres in a mission-critical environment uses by default
  • Added SQL query analysis to provide that extra level of database-specific insight, so you get help really understanding what your SQL does
  • Added support for collecting and analyzing lots more data. You can now rank, sort, slice-and-dice so many things: Queries, Databases and Verbs by Shared Blocks Hit, Shared Blocks Read, Shared Blocks Dirtied, Shared Blocks Written, Local Blocks Hit, Local Blocks Read, Local Blocks Dirtied, Local Blocks Written, Temp Blocks Read, Temp Blocks Written, Block Read Time and Block Write Time. And that's not all, you can rank and profile Verbs and Users too!
Released Many MongoDB Improvements

We spent a ton of time expanding our support for our MongoDB customers. A few of the many things we improved:

  • Index analysis, a key pain point for MongoDB, which relies heavily on indexes for performance
  • Automatic discovery of the full MongoDB scale-out configuration, including support for discovering and monitoring mongod, mongos, and config servers
  • Auto-discovery of replication sets and clusters
  • A Node Group view to visualize the cluster hierarchy
  • Deep profiling capabilities for missing indexes, locking performance, data size, index size, and looking for blocking and most frequent queries in db.currentOp() -- plus a bunch more!
  • Cursor operations, EXPLAIN, new expert dashboards, and more
  • In-app help documentation tooltips to help you understand all those MongoDB metrics and what they mean


Released Extended Support for Amazon RDS

We added support for enhanced metrics in Amazon RDS, so now you get CloudWatch, host metrics, and tons more detail. This is super helpful when trying to debug black-box behaviors on RDS!

Released Expert Insights

In October we released Expert Insights for PostgreSQL, MongoDB, and MySQL. These features automatically and continually monitor your databases with dozens of rules and checks. They essentially continually test your database’s state and behavior to provide best practice recommendations about configuration, security, and query construction. Now you don’t have to manually review everything, because VividCortex never sleeps!

Improved Charts and Dashboards

We released a bunch of updates to the Charts section in 2017. KPI dashboards for MongoDB and PostgreSQL are new chart categories with key performance indicator (KPI) metrics preselected by our brainiacs. They show you the most important performance and health information at a glance. Built by experts, so you don’t have to do all that work yourself:


The new "System Resources by Hosts" category of charts plots several hosts on each chart so you can easily spot outliers. In the following screenshot, you can immediately see one of the hosts has much higher CPU utilization than the others:

There’s more. We redesigned Charts and Dashboards for a better experience with hundreds or thousands of hosts: new summarized charts scale to infinity and beyond, and you can make your own custom dashboards. Customize away!

New Funding, SOC-2 Compliance, And More!

In August we closed our $8.5M Series A-1 led by Osage Venture Partners with the participation of Bull City Venture Partners, New Enterprise Associates (NEA), and Battery Ventures. We’re thrilled to have recruited such a great group of investors to help us on our way! We’re using the funding to hire and we’re doubling down on our investments in product improvements.

In December we completed SOC 2 Type I certification. This was a massive effort involving almost everyone in the company. We value our customers and we work hard to keep everyone safe!

The most rewarding and important things we did in 2017 were for our customers. There’s too much to list in detail, but we invite you to read their stories. Here are a few: Shopify, SendGrid, and DraftKings. If you just want the highlights, here’s a quick video that covers a lot of what our customers say about us.

2017 was a productive year for us. In 2018 we’re looking forward to more of the same: shipping more features for our users, keeping up with all of the database news, and seeing everyone at conferences! Here’s to a great 2018 from all of us at VividCortex. See you there -- and if you haven't experienced VividCortex yet, give it a try!

MariaDB Server 10.2.12 now available

dbart, Thu, 01/04/2018 - 12:56

The MariaDB project is pleased to announce the immediate availability of MariaDB Server 10.2.12. See the release notes and changelog for details and visit to download.

Download MariaDB Server 10.2.12

Release Notes | Changelog | What is MariaDB 10.2?



Percona Toolkit 3.0.6 Is Now Available

Percona announces the release of Percona Toolkit 3.0.6 on January 4, 2018.

Percona Toolkit is a collection of advanced command-line tools that perform a variety of MySQL and MongoDB server and system tasks too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

You can download packages from the website or install them from the official repositories.

This release includes the following changes:

New Features:
  • PT-221: Improve pt-table-sync support for MyRocks
  • PT-218: pt-stalk now checks the RocksDB status
  • PT-214: pt-mysql-summary contains the RocksDB section
  • PT-205: pt-osc shows a message if trying to set the engine to rocksdb and binlog_format != row
  • PT-204: pt-table-checksum skips RocksDB tables.
Bug fixes:
  • PT-234: The general log parser cannot handle time stamps which include time zones
  • PT-229: pt-online-schema-change does not retry on a deadlock error when using Percona Server 5.7
  • PT-225: pt-table-checksum ignores generated columns

A DBA Analyses 'The Phoenix Project'

Last year, I read 'The Phoenix Project'. I liked it, and having been an IT manager in the past, I did experience high blood pressure during the SEV1 scenarios in the book.

I also liked the way DevOps methodology helped solve issues with IT as well as help the company succeed overall.
As a DBA, however, some things about this story didn't make sense to me. Bear in mind that the two major incidents in the book were database related. So in this post, I would like to jot down some things I noticed and how they could have been solved when looked at through a different lens.
Caution, Spoiler Alert

Incident No.1 - Tokenisation
In the first incident, a 3rd party supplier ran a script against the database to tokenise some personal data. This was related to an issue that information security highlighted, but it ended up affecting HR and accounting.
In the book, there is a complaint that there was no test environment to see if this script would have any negative effects on other parts of the organisation. 
Now to me, this does make sense and at the same time, makes no sense at all.
If you mean that back in the day it was hard to get full environments set up to test changes on your application servers, then you would probably be right. Today, perhaps thanks to the methodology this book introduces, you probably do have those environments set up: either virtualised or in a container.

Testing Database
What doesn't make sense to me is not having a test database. Reading through the book, there are mentions of an Oracle database and some MS SQL databases. As a mainly MySQL DBA, I have not always worked on those databases, but I have worked next to people who have. My observation is: if you have an Oracle database, you almost certainly have other dev/test/UAT/staging/pre-prod database servers as well. Why do I think this? If you can afford to pay for an Oracle database, you would probably get more testing databases under the same license - the license being the most expensive part of using Oracle. So a testing database, to test things that may affect the precious and expensive database server, is almost a certainty.

DBA as a Gatekeeper
Now it seems shocking to me that the DBA had not been involved in the process of validating this 3rd party script. Old school Oracle DBAs are involved in everything that happens on their servers. Need a user on the database? Go to the DBA. Need a database server for a new app? Please fill in these forms in triplicate, detailing the projected usage for the next 5 years. In most companies, an Oracle DBA may even set up integration between other products like Oracle HR and finance. So how you could run something that significant against the database without their knowledge is beyond me.
Assuming that a database field had in fact been messed up, Oracle DBAs have a TON of really enviable backup and restore features. They can query a table to view all the backups available to restore from and choose the point in time closest to what they need. A DBA could simply restore the database, fetch the specific table that had its column changed and apply it to the production database. It's more than one table? Restore the database, replay the changes in the logs up to a point in time and skip the parts the conversion script applied.

It seems to me that the authors wrote the book based on their own experiences, but those experiences occurred in companies that had no DBAs. Not having a DBA is a product of start ups, not old school 1500-person car-parts manufacturers.

Incident No.2 - Conversion
There was a crippling database issue to do with a database conversion that was needed alongside a new code rollout. The issue caused a 2-day outage - break out the handheld receipt machines - for the system.

Works on My Laptop
During the initial investigation, a developer said something along the lines of 'it worked fine on my laptop' when describing the performance of the database conversion scripts. The problem was that on production it was x1000 slower. Now, I have written before about how not to be the one who deploys that slow query to production, and this really illustrates that situation. Apparently, they still didn't have a database testing environment to test it against.
However, on the topic above of 'DBA as a gatekeeper': why didn't the DBA review the conversion scripts or get involved in the code review process for SQL statements? It could be that there wasn't one in the company.
Another point was that they couldn't cancel the conversion after they started it and noticed how slow it was. If it was within a transaction or a single ALTER TABLE statement, why not? If too many things had changed, could they not restore the database to a point in time before the changes were made? Was the conversion x1000 slower, instead of maybe x10, because of a foreign key check that could have been turned off? A DBA would have given you those options.
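For reference, the point-in-time option hinted at here is standard practice in the MySQL world: restore the last full backup, then replay the binary logs up to just before the conversion started. This is a rough sketch only - the datetime, paths and binlog file names are hypothetical:

```shell
# 1) Restore the most recent full backup (tool-dependent: mysqldump, XtraBackup, ...)
# 2) Replay binary logs from the backup position, stopping just before the conversion:
mysqlbinlog --stop-datetime="2012-09-13 20:00:00" \
    /var/lib/mysql/binlog.000041 /var/lib/mysql/binlog.000042 \
    | mysql -u root -p
```

With that in place, "we can't undo it" becomes "we lose at most the writes made after the cutoff".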

Project Unicorn
After the hero turns things around and things begin to pick up, they decide to start a separate project to add predictive features to the main project. In it, they decide to bypass seeking permission for database changes and create a new database into which they copy production data from several locations. I very much like this approach, and it falls in line with the reactive microservices pattern.
This would make the book ahead of its time. Instead of managing one main database for the website (although they did mention in the book that they had a couple of dozen database servers), they can break it up into several database servers, based on functionality. What is required is to use tools - and I believe in 2012 they meant ETL tools - to migrate the needed data into these new database servers.
This would still need a DBA though or at the very least, a data engineer with an ops background, as you now need to:
  • Data model new environments based on data from old ones
  • Create and maintain data pipelines
  • Monitor for errors and fix data that didn't make it 
  • Fix data drift and re-sync data across servers

In addition, you now need to backup, monitor the availability and performance of these additional database servers.
So while it adds complexity to the backend, and you are now moving from simple database maintenance to more of a data architecture role, it is the way forward. It is certainly the only way to have proper microservices with their own single-purpose and loosely coupled data stores.

it might have been better if they just hired a DBA to solve thier DB issues.— GuybrushThreepwoodⓋ (@jonathan_ukc) January 6, 2017

;)— Kevin Behr (@kevinbehr) January 8, 2017

Announcing ClusterControl 1.5.1 - Featuring Backup Encryption for MySQL, MongoDB & PostgreSQL

What better way to start a new year than with a new product release?

Today we are excited to announce the 1.5.1 release of ClusterControl - the all-inclusive database management system that lets you easily deploy, monitor, manage and scale highly available open source databases - and load balancers - in any environment: on-premise or in the cloud.

ClusterControl 1.5.1 features encryption of backups for MySQL, MongoDB and PostgreSQL, a new topology viewer, support for MongoDB 3.4, several user experience improvements and more!

Feature Highlights

Full Backup and Restore Encryption for these supported backup methods
  • mysqldump, xtrabackup (MySQL)
  • pg_dump, pg_basebackup (PostgreSQL)
  • mongodump (MongoDB)

New Topology View (BETA) shows your replication topology (including load balancers) for your entire cluster to help you visualize your setup.

  • MySQL Replication Topology
  • MySQL Galera Topology
Improved MongoDB Support
  • Support for MongoDB v3.4
  • Fix to add back restore from backup
  • Multiple NICs support. Management/public IPs for monitoring connections and data/private IPs for replication traffic

Improved user experience featuring a new left-side navigation that includes:

  • Global settings breakout to make it easier to find settings related to a specific feature
  • Quick node actions that allow you to quickly perform actions on your node
Improving Database Security: Backup & Restore Encryption

ClusterControl 1.5 introduces another step to ensuring your databases are kept secure and protected.

Backup & restore encryption means that backups are encrypted at rest using the AES-256 CBC algorithm. An auto-generated key is stored in the cluster's configuration file under /etc/cmon.d. The backup files are also transferred in encrypted format. Users can now secure their backups for offsite or cloud storage with the flip of a checkbox. This feature is available for select backup methods for MySQL, MongoDB & PostgreSQL.
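Mechanically, at-rest encryption of a backup with AES-256 CBC looks like the following openssl round trip. This is a generic sketch, not ClusterControl's exact invocation - the file names and the key-derivation flag are assumptions, and the key file stands in for the auto-generated key ClusterControl keeps under /etc/cmon.d:

```shell
#!/bin/sh
set -e
# Stand-ins for a logical backup file and an auto-generated key
printf 'CREATE TABLE t (id INT PRIMARY KEY);\n' > backup.sql
openssl rand -hex 32 > backup.key

# Encrypt at rest with AES-256-CBC (-pbkdf2 requires OpenSSL 1.1.1+)
openssl enc -aes-256-cbc -pbkdf2 -pass file:backup.key \
    -in backup.sql -out backup.sql.enc

# Restore path: decrypt with the same key before replaying the backup
openssl enc -d -aes-256-cbc -pbkdf2 -pass file:backup.key \
    -in backup.sql.enc -out restored.sql
```

Losing the key makes the encrypted copy unrecoverable, which is why the key file itself must be backed up and access-controlled alongside the backups.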

New Topology View (beta)

This exciting new feature provides an “overhead” topology view of your entire cluster, including load balancers. While in beta, this feature currently supports MySQL Replication and Galera topologies. With this new feature, you can drag and drop to perform node actions. For example, you can drag a replication slave on top of a master node - which will prompt you to either rebuild the slave or change the replication master.

Improved User Experience

The new Left Side Navigation and the new quick actions and settings that accompany it mark the first major redesign to the ClusterControl interface in some time. ClusterControl offers a vast array of functionality, so much so that it can sometimes be overwhelming to the novice. This addition of the new navigation allows the user quick access to what they need on a regular basis and the new node quick actions lets users quickly run common commands and requests right from the navigation.

Download the new ClusterControl or request a demo.

Tags:  clustercontrol backup security encryption MySQL MongoDB PostgreSQL database management

Case in Point: A Year of Customer Experience with Percona

In 2017 we have welcomed many new customers into the Percona fold. It’s always interesting to find out what challenges the Percona team helped them to address and how they chose their relationship with Percona. As unbiased champions of open source database software, our consultancy, support and managed services staff apply their expertise across a wide range of technologies. Here are just a few stories from the past year.

Scaling applications on Amazon RDS the right way

Specializing in on-demand transportation services, Grab needed a highly available, high-performing database engine to serve their rapidly growing application. Grab operates in over 30 densely populated cities in six countries of Southeast Asia. Although they had implemented a split database with Redis caching, they felt there was still room – and a necessity – to improve performance.

Maximum RDS hardware size had been reached, so Percona consultants worked with Grab to rearchitect the database design and reduce its overall footprint. The database management team were left with a solution that was manageable and with the capacity to expand.

“[Percona] proposed solutions that met our scaling and performance needs. The suggested changes were implemented fast, with no database downtime, and we saw an immediate improvement in our performance.” – Edwin Law, Data Engineering Lead, Grab

Managed database services

A B2C business with upward of 25 million subscribers, deal-brokering site has over 500 million page views a month. Data security is an issue at the forefront of their business. When their DBA left to take on a new challenge, the company was left with a dilemma common to many companies that have grown exponentially from a small, committed core of staff. With business continuity relying on a single resource, they knew they had to mitigate exposure to this risk, and Percona’s Managed Services offered a perfect solution.

Contracting out their database management provides 24-hour coverage for their databases, 365 days a year. Furthermore, rather than having to rely on one person’s knowledge and availability, in-depth expertise and a wide pool of knowledge are available if problems arise.

“Our subscribers expect deals that get posted on to not just be valid but scrutinized and rated by the community. Guaranteeing database performance is key to making sure our web applications are responsive and up-to-date. Percona Care Ultimate helps us to achieve these objectives.” – Pavel Genov, Head of Software Development at

Maximising database uptime

For Open Sky the challenge was to ensure maximum uptime. Their business model demands high availability with revenue directly tied to uptime. Using both MongoDB and MySQL in their architecture, hiring Percona to help with their DBA work made absolute sense, since Percona maintains distributions of these products.

Many smaller companies find it hard to justify the expense of a full-time DBA, never mind a whole team. Depending too much on a single person is an obvious candidate for a single point of failure. With high availability at the core of their business, not having DBA cover is a risky strategy.

Achieving outstanding database and query performance

LMAX Exchange also needed high availability, but was looking for very high performance too. Clients create ad hoc reports using vast amounts of data, and with a logo incorporating the phrase “Speed Price Transparency”, their reporting system had something to live up to. As with all of our engagements, we carried out an initial audit of their existing systems. Through that, we were able to make some recommendations off the bat for improvements to their database configuration. The key to this exercise was to deliver important performance gains without the need for wholesale application upheaval. Percona support services help keep them on track.

Preparing for database growth

Alkemics’ challenge was that, as a B2B SaaS provider operating in a competitive industry, not only their own success but also that of their customers relies on delivering an optimal service. The company was gearing up for growth and needed to make sure that their database systems could support their ambitions. With that in mind, they hired Percona to carry out an audit of their database management and replication process. The database tuning recommendations, once implemented, led to a 45% decrease in memory usage. Along with overall improvements in query and database performance, Alkemics gained confidence in their ability to sustain planned growth.

Cloud Database Environments

Another factor influencing the need for optimization is cloud delivery, and the work with Alkemics illustrates what our team can achieve. During 2017, the number of applications using cloud infrastructure – like AWS, Microsoft Azure, and Google Cloud – increased rapidly, and we expect this trend to continue in 2018. With many cloud services charging solely on the basis of resource usage – disk space, memory, CPU, and IO all being factors – the DBA’s role has come into sharp focus in the context of cost. Percona Monitoring and Management underwent a substantial development, and in 1.5.3 incorporated support for Amazon RDS Aurora.

Optimizing database resource usage and cloud footprint

Lookout uses the Amazon Web Services (AWS) platform with Aurora as their main database software. Percona’s expertise with databases in the cloud helped Lookout to reduce their cloud footprint and, therefore, their operational costs. As a mobile security company, maximum availability in a secure environment is a core business function. As well as using Percona’s database audit and optimization services, Lookout leverages Percona Monitoring and Management in developing a resilient, pro-active approach to cloud management.

Check out our webinar with Lookout as well!

Summing up our customers’ stories

A common theme is that – whatever the size of their business – our customers are pushing the boundaries of technology with their projects and ambitions. This makes it an exciting place to work for Percona’s consultants, support, and technical account managers. Every new client brings a new challenge, an opportunity to look at the technology from a fresh angle and the chance to extend their knowledge.

If you are the kind of person that would enjoy this kind of challenge, then don’t forget we’re hiring!

Benchmarking ProxySQL 1.4.4

Comparing ProxySQL 1.4.3 vs. 1.4.4

ProxySQL 1.4.4 was recently released (GA on Dec 20, 2017), which naturally leads to the question: “What performance gains can be expected when moving to the new version?” In this article we compare performance between 1.4.3 and 1.4.4 in a CPU-bound workload. The instances are configured with default settings for the initial benchmark, and then again after tuning one specific variable, mysql-max_stmts_per_connection, which can lead to substantial performance gains.

Let’s first discuss what the ProxySQL variable mysql-max_stmts_per_connection affects and how it is evaluated. ProxySQL maintains a counter for each backend connection, which increments each time a statement is prepared on that connection. Just before the connection is returned to the pool, this counter is evaluated against mysql-max_stmts_per_connection: if the threshold is exceeded, the connection is closed (behaviour up to version 1.4.3) or reset (behaviour starting with version 1.4.4, introduced to improve the efficiency of connection handling, as pointed out in Percona’s blog post on XtraDB Cluster best practices for AWS).
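The decision logic described above can be sketched roughly as follows. This is an illustrative simplification, not actual ProxySQL source code; the function and parameter names are hypothetical:

```python
# Simplified sketch of the per-connection prepared-statement counter check
# described above. Illustrative only -- not actual ProxySQL code.

def on_connection_return(prepared_stmt_count, max_stmts_per_connection, proxysql_version):
    """Decide what happens to a backend connection when it is returned
    to the pool, based on how many statements were prepared on it."""
    if prepared_stmt_count <= max_stmts_per_connection:
        return "return to pool"   # under the threshold: reuse the connection as-is
    if proxysql_version < (1, 4, 4):
        return "close"            # up to 1.4.3: connection is closed (expensive)
    return "reset"                # 1.4.4 onward: connection is reset (cheaper)
```

For example, with the default threshold of 20, a connection that has prepared 25 statements would be closed under 1.4.3 but merely reset under 1.4.4, which is where the efficiency gain comes from.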

Note: The mysql-max_stmts_per_connection variable defaults to 20 and can be tuned upward; however, keep in mind that when increasing mysql-max_stmts_per_connection you may also need to increase the value of the MySQL variable max_prepared_stmt_count, which has a maximum limit of 1048576.
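As a rough sketch, the tuning could be applied like this, first against the ProxySQL admin interface and then on each backend MySQL server (the values shown are illustrative, not recommendations):

```sql
-- On the ProxySQL admin interface (port 6032 by default):
UPDATE global_variables
   SET variable_value = '100'
 WHERE variable_name = 'mysql-max_stmts_per_connection';
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;

-- On each backend MySQL server, make sure the global prepared-statement
-- cap is high enough for the new per-connection threshold:
SET GLOBAL max_prepared_stmt_count = 1048576;
```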

In the graphs below, performance is compared between ProxySQL versions 1.4.3 and 1.4.4, with two benchmarks for each (one with mysql-max_stmts_per_connection at its default of 20, and another with the variable set to 100). Each Sysbench benchmark ran for 300 seconds with 64 threads performing a mixture of point and range selects on 10 tables of 40,000,000 rows each, against a 3-node Percona XtraDB Cluster whose nodes each have 40 cores and 1 Gbps NICs.

The key averages for each benchmark are as follows:

Metric                         1.4.3 (default)  1.4.4 (default)  1.4.3 (mspc 100)  1.4.4 (mspc 100)
QPS (average)                        126179.53        141689.68         155537.27         163209.93
Latency, ms (average)                    14.67            13.10             11.93             11.37
Latency, ms (95th percentile)            18.28            16.41             15.55             14.73
  • It is interesting to note that ProxySQL 1.4.4 is 13% faster out of the box, with default settings, than ProxySQL 1.4.3.
  • It is also quite interesting that, with mysql-max_stmts_per_connection tuned to 100 and for this specific workload, ProxySQL 1.4.3 on its own performs 24% faster than its defaults.
  • With ProxySQL 1.4.4, tuning mysql-max_stmts_per_connection to 100 yields a 15% gain over its own defaults, and the result is still 5% faster than similarly tuned ProxySQL 1.4.3, as the code is more efficient in the newer release.
  • A similar trend can be seen in latency across the ProxySQL versions and mysql-max_stmts_per_connection settings.
  • Once again, ProxySQL 1.4.4 exhibits the lowest latency, especially when mysql-max_stmts_per_connection is tuned above the default value.
  • Naturally, the effects of the mysql-max_stmts_per_connection variable depend largely on your workload; a synthetic read-only sysbench workload serves mainly for comparative purposes.

Based on the above benchmarks, it is fair to say that ProxySQL 1.4.4 delivers more consistent and efficient performance than 1.4.3, with a 13% improvement in average QPS and roughly an 11% reduction in average latency out of the box.

For reference, the command used to benchmark was:

sysbench --threads=64 /usr/share/sysbench/oltp_read_only.lua --tables=10 --table-size=40000000 --report-interval=5 --rand-type=pareto --forced-shutdown=1 --time=300 --events=0 --percentile=95 --mysql-user=sbtest --mysql-password=$pw --mysql-db=sbtest10t40M --mysql-storage-engine=INNODB --mysql-host= --mysql-port=6033 --point-selects=25 --range_size=5 --skip_trx=on run

Authored by: Nick Vyzas