Planet MySQL

Webinar Thurs 8/23: MySQL vs MongoDB – Choosing the Right Technology for Your Application

Please join Percona’s CEO, Peter Zaitsev as he presents MySQL vs MongoDB – Choosing the Right Technology for Your Application on Thursday, August 23, 2018, at 10:30 AM PDT (UTC-7) / 1:30 PM EDT (UTC-4).

Register Now

Are you considering to adopt the most popular open source relational database or the most popular open source NoSQL database? Which one is right for your particular application?

In this presentation, we will look into advantages and disadvantages of both and examine the applications where MySQL or MongoDB are the most appropriate choice.

Register Now

The post Webinar Thurs 8/23: MySQL vs MongoDB – Choosing the Right Technology for Your Application appeared first on Percona Database Performance Blog.

Codership is hiring Galera Cluster product specialist/community manager

Galera Cluster is the leading high availability solution for MySQL and MariaDB. We are always looking for bright and driven people who have a passion for technology, especially for database replication and clustering. If you are interested in challenging work, being part of a innovative team and world’s leading MySQL clustering company-then Codership is right place for you!  This is a remote job.


Job Description and Responsibilities

  • Write Galera Cluster technical marketing material (blogs, white papers, benchmarks)
  • Prepare presentations and present at webinars and conferences
  • Be active in social media channels and forums
  • Consult customers for Galera Cluster best practices
  • Pre-sales support
  • Interface with the development team to provide user feedback
  • Assist in writing Galera documentation


Galera product specialist/community manager qualifications

An ideal candidate must possess the following skills:

  • Deep understanding of Galera Cluster, replication technologies, databases generally
  • Experienced speaker at webinars and conferences
  • Ability to write technical blogs and white papers
  • Technical customer consulting/presales experience
  • Excellent verbal and written communication skills
  • Strong networking and social media skills
  • Ability to engage with a wide range of audiences from customers up to senior management
  • Experience working across different cultures and business environments
  • Organized, detail-oriented, and self-driven
  • Willingness to travel to conferences and customer meetings


Send your applications to or

MySQL NDB Cluster: Never Install a Management Node on the Same Host as a Data Node

In MySQL NDB Cluster, the management node (ndb_mgmd) is a lightweight process that among other things handles the configuration of the cluster. Since it is lightweight. It can be tempting to install it with one of the other nodes. However, if you want a high-availability setup, you should never install it on the same host as a data node (ndbd or ndbmtd). If you do that, it can cause a total cluster outage where the cluster could otherwise have survived.

The first sign of trouble occurs when you start the management nodes. The following warning is printed to standard output:

2018-08-22 18:04:14 [MgmtSrvr] WARNING -- at line 46: Cluster configuration warning: arbitrator with id 49 and db node with id 1 on same host arbitrator with id 50 and db node with id 2 on same host Running arbitrator on the same host as a database node may cause complete cluster shutdown in case of host failure.

To understand why this setup can cause a complete cluster shutdown, it is necessary first to review how a node failure is handled in MySQL NDB Cluster.

Node Failure Handling

When a data node fails, the cluster automatically evaluates whether it is safe to continue. A node failure can in this respect be either that a data node crashes or there is a network outage meaning one or more nodes cannot be seen by the other nodes.

A clean node shutdown (such as when using the recommended STOP command in the management client) is not subject to the evaluation as the other nodes will be notified of the shutdown.

So, how does MySQL NDB Cluster decide whether, the cluster can continue after a node failure or not? And if there are two halves, which one gets to continue?
Assuming two replicas (NoOfReplicas = 2), the rules are quite simple. To quote Pro MySQL NDB Cluster:

In short, data nodes are allowed to continue if the following conditions are true:
  • The group of data nodes holds all the data.
  • Either more than half the data nodes are in the group or the group has won the arbitration process.

For the group of data nodes to hold all the data, there must be one data node from each node group available. A node group is a group of NoOfReplicas nodes that share data. The arbitration process refers to the process of contacting the arbitrator (typically a management node) – the first half to make contact will remain online.

This is all a bit abstract, so let’s take a look at a couple of examples.


Consider a cluster with two data nodes and two management nodes. Most of the examples will have a management node installed on each of the hosts with the data nodes. The last example will as contrast have the management nodes on separate hosts.

The starting point is thus a cluster using two hosts each with one data node and one management node as shown in this figure:

A healthy cluster with two data nodes and the management nodes installed on the same hosts as the data nodes.

The green colour represents that the data node is online. The blue colour for the management node with Node Id 49 is the arbitrator, and the yellow management node is on standby.

This is where the problem with the setup starts to show up. The arbitrator is the node that is involved when there is exactly half the data nodes available after a node failure. In that case, the data node(s) will have to contact the arbitrator to confirm whether it is OK to continue. This avoids a split-brain scenario where there are two halves with all the data; in that case it is imperative that one half is shut down or the data can start to diverge. The half that is first to contact the arbitrator survives, the other is killed (STONITH – shoot the other node in the head).

So, let’s look at a potential split-brain scenario.

Split-Brain Scenario

A split-brain scenario can occur when the network between the two halves of the cluster is lost as shown in the next figure:

NDB Cluster with network failure but the cluster survives.

In this case the network connection between the two hosts is lost. Since both nodes have all data, it is necessary with arbitration to decide who can continue. The data node with Id 1 can still contact the arbitrator as it is on the same host, but Node Id 2 cannot (it would need to use the network that is down). So, Node Id 1 gets to continue whereas Node Id 2 is shut down.

So far so good. This is what is expected. A single point of failure does not lead to a cluster outage. However, what happens if we instead of a network failure considers a complete host failure?

Host Failure

Consider now a case where there is a hardware failure on Host A or someone by accident pulls the power. This causes the whole host to shut down taking both the data and management node with it. What happens in this case?

The first thought is that it will not be an issue. Node Id 2 has all the data, so surely it will continue, right? No, that is not so. The result is a total cluster outage as shown in the following figure:

The failure of the host with the arbitrator causes complete cluster outage.

Why does this happen? When Host A crashes, so does the arbitrator management node. Since Node Id 2 does not on its own constitute a majority of the data nodes, it must contact the arbitrator to be allowed to remain online.

You may think it can use the management node with Node Id 50 as the arbitrator, but that will not happen: while handling a node failure, under no circumstances can the arbitrator be changed. The nodes on Host B cannot know whether it cannot see the nodes on Host A because of a network failure (as in the previous example) or because the nodes are dead. So, they have to assume the other nodes are still alive or there would sooner or later be a split-brain cluster with both halves online.

Important: The arbitrator will never change while the cluster handles a node failure.

So, the data node with Id 2 has no other option than to shut itself down, and there is a total cluster outage. A single point of failure has caused a total failure. That is not the idea of a high availability cluster.

What could have been done to prevent the cluster outage? Let’s reconsider the case where the arbitrator is on a third independent host.

Arbitrator on Independent Host

The picture changes completely, if the management nodes are installed on Hosts C and D instead of Hosts A and B. For simplicity the management node with Node Id 50 is left out as it is anyway just a spectator while handling the node failure. In this case the scenario is:

The failure of the host with the arbitrator on a third host ensures the cluster remains online.

Here Node Id 2 can still contact the arbitrator. Node Id 1 is dead, so it will not compete to win the arbitration, and the end result becomes that Node Id 2 remains online. The situation is back where a single point of failure does not bring down the whole cluster.


If you want your cluster to have the best chance of survival if there is a problem with one of the hosts, never install the management nodes on the same hosts as where there are data nodes. One of the management nodes will also act as the arbitrator. Since the arbitrator cannot change while the cluster is handling a node failure, if the host with the arbitrator crashes, it will cause a complete cluster shutdown if arbitration is required.

When you consider what is a host, you should look at physical hosts. Installing the management node in a different virtual machine on the same physical host offers only little extra protection compared to the case where they are installed in the same virtual host or on the same host using bare metal.

So, to conclude: make sure your management nodes are on a completely different physical host compared to your data nodes.

Want to Know More?

The book Pro MySQL NDB Cluster (published by Apress) by Mikiya Okuno and me includes lots of information about designing your cluster and how MySQL NDB Cluster works.

Disclaimer: I am one of the authors of Pro MySQL NDB Cluster.

What’s new in MariaDB 10.3

What’s New in MariaDB 10.3

Let me start with an announcement: From now on, MariaDB is being treated as a distinct database on

The reason for the inclusion in my club of major SQL databases0 is simple: Although MariaDB was originally described as a “branch of MySQL that is, on the user level, compatible with the main version”1, both versions have diverged considerably in the past few years. At first, the differences were mostly limited to operative aspects (including some legal aspects). Over the last two years, even the SQL dialects started to diverge notably. Treating MariaDB as a distinct product is the unavoidable consequence for me.

Furthermore, MariaDB’s popularity is still growing2 and it seems that the MariaDB team is finally embracing the SQL standard. I must actually say that “they now embrace modern SQL standards”—not the SQL-92 standard that’s been overhauled six times.3

The release of MariaDB 10.3 demonstrates this in an impressive way. Read on to see what I mean.

System-Versioned Tables

Have you ever had the requirement to keep the old data when using update or delete? Pretty much every business application needs that—e.g. for customer master data. System-versioned tables is one of the standard SQL features that get’s it done.4

The SQL standard uses closed-open intervals5—stored in “from” and “to” timestamp columns—to denote the period in which each row is valid. The “system” in system-versioned tables means that those time stamps are automatically maintained by the database whenever you write to such a table. System versioning is meant to be transparent for the application.

The syntax to create system-versioned tables looks like this:


As you can see, it explicitly adds two columns to hold the validity time stamps. These are basically regular columns, and are visible to the user.6 That is the only required change that is not transparent to select and data-modifying statements. Next, the time stamps are logically combined into a period called system_time. Finally, the with system versioning clause enables the magic.

Of course you can also alter existing tables to add system versioning. If you already have the validity stored as an closed-open interval, you can use it right away.

Once enabled, system versioning maintains the columns in the system_time period automatically. That means that delete doesn’t actually remove anything. It just sets the end of the validity of the affected rows to the transaction time.7Update does the same, but it also adds new rows with the modified data and the transaction time as the start of validity. Once again, update is basically delete and insert.


The SQL standard does not specify how system-versioned tables store the data physically, nor does it define data retention mechanisms.

MariaDB keeps the old data in the same table by default. However, partitioning capabilities have been enhanced so you can physically separate the current data from the old one.

When you select from a system-versioned table you’ll just get the current data. To access old data, the syntax of the from clause was extended:

FROM <table> FOR SYSTEM_TIME [ AS OF <ts> | BETWEEN <ts> AND <ts> | FROM <ts> TO <ts> ] [ [AS] <new name> ]

The new for system_time clause immediately follows the table name in the from clause—i.e. a new table (AS <new name>) follows at the very end.

Of course the as of clause delivers the data as of the specified time. Between and from take two time stamps. The difference between them is that between included the upper bound while from excludes it. The lower bound is exclusive for both of them.8

Further reading:

The Values Clause

The values clause is probably the most basic SQL feature at all. Most people know about it from its use in the insert statement. However, the values clause is actually a concept in its own right that’s also useful outside of insert.

The values clause is the standard SQL approach to select without from. In addition to that, the values clause has the big advantage that it can easily create multiple rows in one go:

VALUES (<columns of row 1>) , (<columns of row 2>) , …

There is no need to union all multiple select statements.

In comparison to select without from, the values clause has a drawback too: it cannot assign names to its columns. You need to use from clause column renaming or the with clause for that.

SELECT * FROM ( VALUES ('a', 'b') ) t(a,b)

Unfortunately, MariaDB 10.3 does not support the from clause renaming shown above. It remains the with clause, which is somewhat bulky for this case.

Another problem with the values clause is its lack of support in many databases. Even though MariaDB 10.3 now supports the values clause, select without from still has better support among the tested databases.

Further reading:

Sequence Generators

“A sequence generator is a mechanism for generating successive exact numeric values, one at a time.”9 They are similar to identity columns, but they are not tied to an insert operation on a specific table so that they can be used for arbitrary purposes. The next value for <sequence name> expression is used to obtain a new value.

Naturally, sequence generator can be used in a stand-alone values statement to fetch a new value.

VALUES (NEXT VALUE FOR <sequence name>)

Further reading:

Percentile_disc and Percentile_cont

The percentile_disc and percentile_cont functions are used to get a percentile value—e.g. the median—from an ordered set of values.

In standard SQL these functions require the within group clause and optionally accept an over clause. However, in MariaDB 10.3 the over clause is also required.


The difference between these two functions is how they cope with a case in which the specified percentile falls in between two rows. For example, when the middle row—holding the median value—out of four rows is needed.

Percentile_disc always returns a discrete value from the input data—even if the specified percentile falls between two rows. In that case, it returns the first one with respect to the specified order by clause. Percentile_cont, on the other hand, performs a weighted linear interpolation between the two values of the adjacent rows.

MariaDB 10.3 also added the proprietary median(x) function, which is a different syntax for percentile_disc(0.5) within group (order by x).

Further reading:

Intersect and Except

Intersect and except are table operators similar to union. Instead of concatenating two tables, they produce the intersection and the difference, respectively.

Further reading:

Two-Phase Processing of Update’s Set Clause

This is best explained by example. Have a close look at this query.

UPDATE … SET c1 = c2 , c2 = c1

Note that both columns appear on both sides of the assignment operator (=)—i.e. both columns are read and changed in the same statement.

In this case, the SQL standard requires that all read operations are effectively completed before the first writing. You can think of it as a read-only phase in which all right-hand side expressions are evaluated, followed by a write-only phase, which actually stores the results. If this rule is obeyed, the statement above exchanges the values in the c1 and c2 columns.

Nonetheless the default behavior in MariaDB (including 10.3) as well as in MySQL is different. They execute the set clauses in the order they appear in the statement. Thus the result of the above query is that both columns contain the value previously stored in c2.

The new SIMULTANEOUS_ASSIGNMENT mode introduced with MariaDB 10.3 activates standard-conforming handling of this case.

SET sql_mode = (SELECT CONCAT( @@sql_mode , ',SIMULTANEOUS_ASSIGNMENT' ) )

Further reading:

Self-Referencing Update and Delete

This is very similar to the previous topic. The main difference is that previous issue deals with cyclic references between columns of the same row, whereas this issue is about cyclic references between different rows of the same table.

The problem can appear when a data-modifying statement has a query—e.g. inside exists or in10—that fetches data from the very same table that the statement is writing to. Again, there could be a cyclic reference, this time between rows.

Consider the simplest possible example:


The question is basically whether the select can already see the rows that are just inserted. Well, the obvious answer is “no”—otherwise it would end up as infinite loop.

The problem can also appear with the other data-modifying statements: update, delete and merge. However, the problem can only occur if these statements contain a query that accesses the target table again—otherwise it cannot access different rows from the same table—cyclic references cannot emerge.

The SQL standard solves the problem in the same way as for the set clause discussed above. It requires the execution to be effectively split into read-only and write-only phases. This time, the standard doesn’t strictly require this processing because it is an optional feature (F781, “Self-referencing operations”). Databases not supporting this feature should yield an error when you try to execute a self-referencing statement.

Indeed, MySQL and MariaDB do produce an error.

You can't specify target table 't1' for update in FROM clause

Note that this message always says “update” even when the statement was a delete. With MariaDB 10.0, the message was rephrased:

Table 't1' is specified twice, both as a target for 'UPDATE' and as a separate source for data

MariaDB 10.3 can execute self-referencing update and delete statements.11

Off Topic: Limit in Group_Concat

Group_concat is not a standard SQL feature. The standard function for that is listagg. However, I though I’d include this enhancement here nevertheless because listagg offers a different solution for the same problem—how to prevent exceeding the length limitation of the results’s data type?

The SQL standard’s listagg has the on overflow clause that can stop adding further values before exceeding the limit. MariaDB’s group_concat can now take an optional limit clause to put an upper bound on the number of concatenated values.

Further reading:

Event Note: Modern SQL Training in Vienna

If you have read this far, you are probably interested in learning more about all the modern SQL goodness. If so, have a look at my upcoming training in Vienna (Sep 17-21).

It’s a full week long and covers all the recurring issues that I have observed in my training and consulting assignments over the years. That includes indexing in detail, basics you might be afraid to ask about, and modern SQL features such as window functions and recursions in detail.

Check it out now ;)

What’s New in MariaDB 10.3” by Markus Winand was originally published at modern SQL.

ProxySQL Series : Percona Cluster/MariaDB Cluster (Galera) Read-write Split

ProxySQL is the most preferred and is widely used for load-balancing MySQL workload, thanks to Rene Cannon & Team for the great tool, and kudos on the recent release of ProxySQL 1.4.10, with a lot of bug fixes. ProxySQL is simple in design, lightweight, highly efficient and feature rich, We have been working with ProxySQL in production for our client quite a sometime, we have also shared some of our encounters/experience and use cases in the below blogs.

In this blog, we will see how to setup ProxySQL for Percona XtraDB cluster with the READ-WRITE split with Failover handling for various node states with Galera. PXC / MariaDB Clusters really works better with writes on single ode than multi node writes. That makes the read/write split up ideal for PXC. Application benefits a lot with PXC /MariaDB Cluster as they avail synchronous reads and High availability with these clusters.

For the purpose of this blog I have set up a three node cluster as below

Operating System: Centos 7.4 Cluster version : Percona XtraDB cluster 5.7.21 ProxySQL version : proxysql-1.4.8-1.1.el7.x86_64 ProxySQLNode : Node1 : Node2 : Node 3:

Setting up the cluster is beyond the scope of this, I will just directly move to proxySQL setup for a cluster in Single-writer mode, Which is the most recommended for Cluster to avoid of conflicts of writes and split-Brain scenarios.

ArchitectureInstalling ProxySQL:

Easy and robust way to have install proxySQL is by using the percona repo, because it comes with tool/scripts such as proxysql-admin and proxysql_galera_checker, which helps in the easy configuration, monitoring of cluster and also fail-over

#yum install #yum install proxysql-1.4.8-1.1.el7.x86_64 -y #chkconfig proxysql on (enabling service at startup) # service proxysql start

Now proxysql is up and running on the node and listening on ports 6032 for proxysql admin interface and 6033 for MySQL interface by default, They can be changed if needed

Adding Cluster Nodes to proxySQL

Now connect to proxySQL as below

$ mysql -u admin -padmin -h -P6032

In this setup we will be using two hostgroups

Hostgroup 10 (writer group)
Hostgroup 11 (Reader group)

mysql> INSERT INTO mysql_servers(hostgroup_id, hostname, port,weight,comment) VALUES (10,'',3306,1000000,'WRITE'); Query OK, 1 row affected (0.00 sec) mysql> INSERT INTO mysql_servers(hostgroup_id, hostname, port,weight,comment) VALUES (11,'',3306,1000,'READ'); Query OK, 1 row affected (0.00 sec) mysql> INSERT INTO mysql_servers(hostgroup_id, hostname, port,weight,comment) VALUES (11,'',3306,1000,'READ'); Query OK, 1 row affected (0.00 sec) mysql> load mysql servers to runtime; Query OK, 0 rows affected (0.00 sec) mysql> save mysql servers to disk; Query OK, 0 rows affected (0.03 sec)

Which looks as below

mysql> SELECT * FROM mysql_servers; +--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment | +--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | 10 | | 3306 | ONLINE | 1000000 | 0 | 1000 | 0 | 0 | 0|WRITE | | 11 | | 3306 | ONLINE | 1000 | 0 | 1000 | 0 | 0 | 0 |READ | | 11 | | 3306 | ONLINE | 1000 | 0 | 1000 | 0 | 0 | 0 |READ | +--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ Query Rules:

Query rules will be important for query routing with proxySQL, incoming query pattern will be matched based on which it will be routed to the hostgroup accordingly, here I will be using the default and basic query rules for Read-Write splitting as below,

INSERT INTO mysql_query_rules (rule_id,active,match_pattern,destination_hostgroup,apply) VALUES (1,1,'^SELECT.*FOR UPDATE$',10,1), (2,1,'^SELECT',11,1); Query OK, 2 rows affected (0.00 sec) mysql> load mysql query rules to runtime; Query OK, 0 rows affected (0.00 sec) mysql> save mysql query rules to disk; Query OK, 0 rows affected (0.01 sec)

Now the query rules are added.

Integrating Application user:

Now it’s time to add embed application user into proxySQL through which application connects via proxysql, here I have assigned the default host group for the user as 10 writer group, when an incoming query without a matching pattern in a query rule, it routes automatically to the default hostgroup ie., 10

INSERT INTO mysql_users (username,password,default_hostgroup) VALUES ('app_user','application_user',10); Query OK, 0 rows affected (0.00 sec) mysql> load mysql users to runtime; Query OK, 0 rows affected (0.00 sec) mysql> save mysql users to disk; Query OK, 0 rows affected (0.01 sec)

Application user now is loaded and kept ready.

ProxySQL Monitoring:

Proxysql Monitors the node, added under it by making checks at regular interval, you can enable the monitoring as below,

mysql> UPDATE global_variables SET variable_value="monitor" WHERE variable_name="mysql-monitor_username"; Query OK, 1 row affected (0.00 sec) mysql> UPDATE global_variables SET variable_value="monitor" WHERE variable_name="mysql-monitor_password"; Query OK, 1 row affected (0.00 sec) mysql> LOAD MYSQL VARIABLES TO RUNTIME; Query OK, 0 rows affected (0.00 sec) mysql> SAVE MYSQL VARIABLES TO DISK; Query OK, 94 rows affected (0.00 sec)

Monitor check:

mysql> SELECT * FROM monitor.mysql_server_connect_log ORDER BY time_start_us DESC LIMIT 5; +---------------+------+------------------+-------------------------+---------------+ | hostname | port | time_start_us | connect_success_time_us | connect_error | +---------------+------+------------------+-------------------------+---------------+ | | 3306 | 1534068625859885 | 1452 | NULL | | | 3306 | 1534068625849598 | 1392 | NULL | | | 3306 | 1534068625838742 | 1505 | NULL | | | 3306 | 1534068565850089 | 1389 | NULL | | | 3306 | 1534068565839890 | 1713 | NULL | +---------------+------+------------------+-------------------------+---------------+

It shows that proxySQL is able to connect to all the nodes.


Here we come to the important aspect of the configuration, With cluster, we have different node states (1-4) , in any state comment beyond SYNCED , these node state switches are tracked by proxySQL health checks ( Monitoring Galera ).  And it routes the traffic accordingly with the help of  scheduler scripts.

Scheduler with help of the proxysql_galera_checker script checks the node states in a periodic configurable interval (ms), When a node state change is detected, It makes changes to the Hostgroup and takes the nodes out of traffic accordingly. An advanced scheduler was written by Marco Tusa is on his github ( proxy_sql_tools ).

For instance, if the writer node goes down, the scheduler script promotes a node from reader group to accept the writes, and when the node comes back online and add it back to hostgroup, The change is very quick that application doesn’t notice any interruption

INSERT INTO scheduler(id,active,interval_ms,filename,arg1,arg2,arg3,arg4,arg5) VALUES (1,'1','500','/usr/bin/proxysql_galera_checker','10','11','0','1', '/var/lib/proxysql/proxysql_galera_checker.log'); Query OK, 1 row affected (0.00 sec) mysql> LOAD SCHEDULER TO RUNTIME;SAVE SCHEDULER TO DISK; Query OK, 0 rows affected (0.00 sec) mysql> select * from scheduler\G *************************** 1. row *************************** id: 1 active: 1 interval_ms: 3000                     #check interval in Ms filename: /bin/proxysql_galera_checker #Check script arg1: 10                               # Writer group arg2: 11                               # Reader group arg3: 0                               # No of writers arg4: 1                                # Writers are readers Arg5:/var/lib/proxysql/mycluster_proxysql_galera_check.log  #log file comment: mycluster 1 row in set (0.00 sec)

Scheduler start checking immediately and writes the status to log as below


[2018-07-15 08:24:11] log file /var/lib/proxysql/mycluster_proxysql_galera_check.log [2018-07-15 08:24:11] ###### HANDLE WRITER NODES ###### [2018-07-15 08:24:12] --> Checking WRITE server 10:, current status ONLINE, wsrep_local_state 4 [2018-07-15 08:24:12] ###### HANDLE READER NODES ###### [2018-07-15 08:24:12] --> Checking READ server 11:, current status ONLINE, wsrep_local_state 4 [2018-07-15 08:24:12] server 11: is already ONLINE [2018-07-15 08:24:12] --> Checking READ server 11:, current status ONLINE, wsrep_local_state 4 [2018-07-15 08:24:12] server 11: is already ONLINE [2018-07-15 08:24:12] ###### SUMMARY ###### [2018-07-15 08:24:12] --> Number of writers that are 'ONLINE': 1 : hostgroup: 10 [2018-07-15 08:24:12] --> Number of readers that are 'ONLINE': 2 : hostgroup: 11

Now the setup is complete, with Read/Write split and failover, Now we will proceed to test Read-write Split

I will be using the below command snippet to watch the connection and failover scenarios

# watch -n 1 'mysql -h -P 6032 -uadmin -padmin -t -e "select * from stats_mysql_connection_pool where hostgroup in (10,11) order by hostgroup,srv_host ;" -e " select hostgroup_id,hostname,status,weight,comment from mysql_servers where hostgroup_id in (10,11) order by hostgroup_id,hostname ;"' +-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+--------- -------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_us | +-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 10 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 604 | | 11 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 509 | | 11 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 601 | +-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ +--------------+---------------+--------+---------+---------+ | hostgroup_id | hostname | status | weight | comment | +--------------+---------------+--------+---------+---------+ | 10 | | ONLINE | 1000000 | WRITE | | 11 | | ONLINE | 1000 | READ | | 11 | | ONLINE | 1000 | READ | +--------------+---------------+--------+---------+---------+

Read/Write testing:

Only write test with sysbench.

sysbench --test='/usr/share/sysbench/oltp_update_index.lua' --table-size=1000000 --mysql-host= --mysql-db='sbtest' --mysql-user=app_user --mysql-password='application_user' --mysql-port=6033 --time=60 --threads=4 --db-driver=mysql run

ProxySQL stats:

+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_us | +-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 10 | | 3306 | ONLINE | 0 | 4 | 4 | 0 | 29326 | 703876 | 0 | 0 | | 11 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | 11 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 || +-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

As you can see the queries stats getting changed for the writer group and the readers do not receive any writes.

Read ( Select ) test:

Simulating a readonly primary key based ( select ) on cluster with sysbench.

# sysbench --test='/usr/share/sysbench/oltp_point_select.lua' --table-size=1000000 --mysql-host= --mysql-db='sbtest' --mysql-user=app_user --mysql-password='application_user' --mysql-port=6033 --time=60 --threads=4 --db-driver=mysql run

ProxySQL stats:

+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_us | +-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 10 | | 3306 | ONLINE | 0 | 4 | 4 | 0 | 29326 | 703876 | 0 | 0 | | 11 | | 3306 | ONLINE | 0 | 3 | 3 | 0 | 95848 | 2300380 | 5731992 | 0 | | 11 | | 3306 | ONLINE | 0 | 3 | 3 | 0 | 96929 | 2326324 | 5818176 | 0 | +-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

Now you can see the change of query count in our reader hostgroup comparing the previous result.

Image Courtesy : Photo by Tim Gouw on Unsplash

Database Security Monitoring for MySQL and MariaDB

Data protection is one of the most significant aspects of administering a database. Depending on the organizational structure, whether you are a developer, sysadmin or DBA, if you are managing the production database, you must monitor data for unauthorized access and usage. The purpose of security monitoring is twofold. One, to identify unauthorised activity on the database. And two, to check if databases ´and their configurations on a company-wide basis are compliant with security policies and standards.

In this article, we will divide monitoring for security in two categories. One will be related to auditing of MySQL and MariaDB databases activities. The second category will be about monitoring your instances for potential security gaps.

Query and connection policy-based monitoring

Continuous auditing is an imperative task for monitoring your database environment. By auditing your database, you can achieve accountability for actions taken or content accessed. Moreover, the audit may include some critical system components, such as the ones associated with financial data to support a precise set of regulations like SOX, or the EU GDPR regulation. Usually, it is achieved by logging information about DB operations on the database to an external log file.

By default, auditing in MySQL or MariaDB is disabled. You and achieve it by installing additional plugins or by capturing all queries with the query_log parameter. The general query log file is a general record of what MySQL is performing. The server records some information to this log when clients connect or disconnect, and it logs each SQL statement received from clients. Due to performance issues and lack of configuration options, the general_log is not a good solution for security audit purposes.

If you use MySQL Enterprise, you can use the MySQL Enterprise Audit plugin which is an extension to the proprietary MySQL version. MySQL Enterprise Audit Plugin plugin is only available with MySQL Enterprise, which is a commercial offering from Oracle. Percona and MariaDB have created their own open source versions of the audit plugin. Lastly, McAfee plugin for MySQL can also be used with various versions of MySQL. In this article, we will focus on the open source plugins, although the Enterprise version from Oracle seems to be the most robust and stable.

Characteristics of MySQL open source audit plugins

While the open source audit plugins do the same job as the Enterprise plugin from Oracle - they produce output with database query and connections - there are some major architectural differences.

MariaDB Audit Plugin – The MariaDB Audit Plugin works with MariaDB, MySQL (as of version 5.5.34 and 10.0.7) and Percona Server. MariaDB started including the Audit Plugin by default from versions 10.0.10 and 5.5.37, and it can be installed in any version from MariaDB 5.5.20. It is the only plugin that supports Oracle MySQL, Percona Server, and MariaDB. It is available on Windows and Linux platform. Versions starting from 1.2 are most stable, and it may be risky to use versions below that in your production environment.

McAfee MySQL Audit Plugin – This plugin does not use MySQL audit API. It was recently updated to support MySQL 5.7. Some tests show that API based plugins may provide better performance but you need to check it with your environment.

Percona Audit Log Plugin – Percona provides an open source auditing solution that installs with Percona Server 5.5.37+ and 5.6.17+ as part of the installation process. Comparing to other open source plugins, this plugin has more reach output features as it outputs XML, JSON and to syslog.

As it has some internal hooks to the server to be feature-compatible with Oracle’s plugin, it is not available as a standalone plugin for other versions of MySQL.

Plugin installation based on MariaDB audit extension

The installation of open source MySQL plugins is quite similar for MariaDB, Percona, and McAfee versions.
Percona and MariaDB add their plugins as part of the default server binaries, so there is no need to download plugins separately. The Percona version only officially supports it’s own fork of MySQL so there is no direct download from the vendor's website ( if you want to use this plugin with MySQL, you will have to obtain the plugin from a Percona server package). If you would like to use the MariaDB plugin with other forks of MySQL, then you can find it from The McAfee plugin is available at

Before you start the plugin installation, you can check if the plugin is present in the system. The dynamic plugin (doesn’t require instance restart) location can be checked with:

SHOW GLOBAL VARIABLES LIKE 'plugin_dir'; +---------------+--------------------------+ | Variable_name | Value | +---------------+--------------------------+ | plugin_dir | /usr/lib64/mysql/plugin/ | +---------------+--------------------------+

Check the directory returned at the filesystem level to make sure you have a copy of the plugin library. If you do not have or server_audit.dll inside of /usr/lib64/mysql/plugin/, then more likely your MariaDB version is not supported and should upgrade it to latest version..

The syntax to install the MariaDB plugin is:

INSTALL SONAME 'server_audit';

To check installed plugins you need to run:


If you need additional information, check the PLUGINS table in the information_schema database which contains more detailed information.

Another way to install the plugin is to enable the plugin in my.cnf and restart the instance. An example of a basic audit plugin configuration from MariaDB could be :

server_audit_events=CONNECT server_audit_file_path=/var/log/mysql/audit.log server_audit_file_rotate_size=1073741824 server_audit_file_rotations=8 server_audit_logging=ON server_audit_incl_users= server_audit_excl_users= server_audit_output_type=FILE server_audit_query_log_limit=1024

Above setting should be placed in my.cnf. Audit plugin will create file /var/log/mysql/audit.log which will rotate on size 1GB and there will be eight rotations until the file is overwritten. The file will contain only information about connections.

Currently, there are sixteen settings which you can use to adjust the MariaDB audit plugin.

server_audit_events server_audit_excl_users server_audit_file_path server_audit_file_rotate_now server_audit_file_rotate_size server_audit_file_rotations server_audit_incl_users server_audit_loc_info server_audit_logging server_audit_mode server_audit_output_type Server_audit_query_log_limit server_audit_syslog_facility server_audit_syslog_ident server_audit_syslog_info server_audit_syslog_priority

Among them, you can find options to include or exclude users, set different logging events (CONNECT or QUERY) and switch between file and syslog.

To make sure the plugin will be enabled upon server startup, you have to set in your my.cnf settings. Such configuration can be additionally protected by server_audit=FORCE_PLUS_PERMANENT which will disable the plugin uninstall option.

UNINSTALL PLUGIN server_audit; ERROR 1702 (HY000): Plugin 'server_audit' is force_plus_permanent and can not be unloaded

Here is some sample entries produced by MariaDB audit plugin:

20180817 20:00:01,slave,cmon,cmon,31,0,DISCONNECT,information_schema,,0 20180817 20:47:01,slave,cmon,cmon,17,0,DISCONNECT,information_schema,,0 20180817 20:47:02,slave,cmon,cmon,19,0,DISCONNECT,information_schema,,0 20180817 20:47:02,slave,cmon,cmon,18,0,DISCONNECT,information_schema,,0 20180819 17:19:19,slave,cmon,cmon,12,0,CONNECT,information_schema,,0 20180819 17:19:19,slave,root,localhost,13,0,FAILED_CONNECT,,,1045 20180819 17:19:19,slave,root,localhost,13,0,DISCONNECT,,,0 20180819 17:19:20,slave,cmon,cmon,14,0,CONNECT,mysql,,0 20180819 17:19:20,slave,cmon,cmon,14,0,DISCONNECT,mysql,,0 20180819 17:19:21,slave,cmon,cmon,15,0,CONNECT,information_schema,,0 20180819 17:19:21,slave,cmon,cmon,16,0,CONNECT,information_schema,,0 20180819 19:00:01,slave,cmon,cmon,17,0,CONNECT,information_schema,,0 20180819 19:00:01,slave,cmon,cmon,17,0,DISCONNECT,information_schema,,0 Schema changes report

If you need to track only DDL changes, you can use the ClusterControl Operational Report on Schema Change. The Schema Change Detection Report shows any DDL changes on your database. This functionality requires an additional parameter in ClusterControl configuration file. If this is not set you will see the following information: schema_change_detection_address is not set in /etc/cmon.d/cmon_1.cnf. Once that is in place an example output may be like below:

It can be set up with a schedule, and the reports emailed to recipients.

ClusterControl: Schedule Operational Report MySQL Database Security Assessment Package upgrade check

First, we will start with security checks. Being up-to-date with MySQL patches will help reduce risks associated with known vulnerabilities present in the MySQL server. You can keep your environment up-to-date by using the vendors’ package repository. Based on this information you can build your own reports, or use tools like ClusterControl to verify your environment and alert you on possible updates.

ClusterControl Upgrade Report gathers information from the operating system and compares them to packages available in the repository. The report is divided into four sections; upgrade summary, database packages, security packages, and other packages. You can quickly compare what you have installed on your system and find a recommended upgrade or patch.

ClusterControl: Upgrade Report ClusterControl: Upgrade Report details

To compare them manually you can run

SHOW VARIABLES WHERE variable_name LIKE "version";

With security bulletins like:

Or vendor repositories:

On Debian

sudo apt list mysql-server

On RHEL/Centos

yum list | grep -i mariadb-server Accounts without password

Blank passwords allow a user to login without using a password. MySQL used to come with a set of pre-created users, some of which can connect to the database without password or, even worse, anonymous users. Fortunately, this has changed in MySQL 5.7. Finally, it comes only with a root account that uses the password you choose at installation time.

For each row returned from the audit procedure, set a password:

SELECT User,host FROM mysql.user WHERE authentication_string='';

Additionally, you can install a password validation plugin and implement a more secure policy:

INSTALL PLUGIN validate_password SONAME ''; SHOW VARIABLES LIKE 'default_password_lifetime'; SHOW VARIABLES LIKE 'validate_password%';

An good start can be: validate-password=FORCE_PLUS_PERMANENT validate_password_length=14 validate_password_mixed_case_count=1 validate_password_number_count=1 validate_password_special_char_count=1 validate_password_policy=MEDIUM

Of course, these settings will depend on your business needs.

Remote access monitoring

Avoiding the use of wildcards within hostnames helps control the specific locations from which a given user may connect to and interact with the database.

You should make sure that every user can connect to MySQL only from specific hosts. You can always define several entries for the same user, this should help to reduce a need for wildcards.

Execute the following SQL statement to assess this recommendation (make sure no rows are returned):

SELECT user, host FROM mysql.user WHERE host = '%'; Test database

The default MySQL installation comes with an unused database called test and the test database is available to every user, especially to the anonymous users. Such users can create tables and write to them. This can potentially become a problem on its own - and writes would add some overhead and reduce database performance. It is recommended that the test database is dropped. To determine if the test database is present, run:


If you notice that the test database is present, this could be that mysql_secure_installation script which drops the test database (as well as other security-related activities) was not executed.


If both server and client has the ability to run LOAD DATA LOCAL INFILE, a client will be able to load data from a local file to a remote MySQL server. The local_infile parameter dictates whether files located on the MySQL client's computer can be loaded or selected via LOAD DATA INFILE or SELECT local_file.

This, potentially, can help to read files the client has access to - for example, on an application server, one could access any data that the HTTP server has access to. To avoid it, you need to set local-infile=0 in my.cnf.

Execute the following SQL statement and ensure the Value field is set to OFF:

SHOW VARIABLES WHERE Variable_name = 'local_infile'; Monitor for non-encrypted tablespaces

Starting from MySQL 5.7.11, InnoDB supports data encryption for tables stored in file-per-table tablespaces. This feature provides at-rest encryption for physical tablespace data files. To examine if your tables have been encrypted run:

mysql> SELECT TABLE_SCHEMA, TABLE_NAME, CREATE_OPTIONS FROM INFORMATION_SCHEMA.TABLES WHERE CREATE_OPTIONS LIKE '%ENCRYPTION="Y"%'; +--------------+------------+----------------+ | TABLE_SCHEMA | TABLE_NAME | CREATE_OPTIONS | +--------------+------------+----------------+ | test | t1 | ENCRYPTION="Y" | +--------------+------------+----------------+

As a part of the encryption, you should also consider encryption of the binary log. The MySQL server writes plenty of information to binary logs.

Encryption connection validation

In some setups, the database should not be accessible through the network if every connection is managed locally, through the Unix socket. In such cases, you can add the ‘skip-networking’ variable in my.cnf. Skip-networking prevents MySQL from using any TCP/IP connection, and only Unix socket would be possible on Linux.

However this is rather rare situation as it is common to access MySQL over the network. You then need to monitor that your connections are encrypted. MySQL supports SSL as a means to encrypting traffic both between MySQL servers (replication) and between MySQL servers and clients. If you use Galera cluster, similar features are available - both intra-cluster communication and connections with clients can be encrypted using SSL. To check if you use SSL encryption run the following queries:

SHOW variables WHERE variable_name = 'have_ssl'; select ssl_verify_server_cert from mysql.slave_master_info;

That’s it for now. This is not a complete list, do let us know if there are any other checks that you are doing today on your production databases.

Related resources   An Overview of Database Operational Reporting in ClusterControl Read the blog   How to Achieve GDPR Compliance: Documenting Our Experience (I) Read the blog   How to Achieve GDPR Compliance: Documenting Our Experience (II) Read the blog   What DBAs Should Know About HIPAA and Other Data Protection Regulations Read the blog   How to Secure Your Open Source Databases with ClusterControl Read the blog Tags:  monitoring security MySQL MariaDB compliance

How to set up a Replication User


A replication user is necessary to set up the relationship Primary/Replica. This is a short step but it needs a bit more of attention.

From the MySQL 5.7 documentation (highlights are my own):

Although you do not have to create an account specifically for replication, you should be aware that the replication user name and password are stored in plain text in the master info repository file or table (see Section, “Slave Status Logs”). Therefore, you may want to create a separate account that has privileges only for the replication process, to minimize the possibility of compromise to other accounts.

The following command specifically will allow replication from all databases and tables connecting from all hosts. For security reasons you may want to limit access to replication only to the IP address of the server doing the replication.

Log into the MySQL console using a user with GRANT privileges in the primary server and execute the following:

CREATE USER 'replication'@'%' IDENTIFIED BY 'mysupersecretpassword' GRANT REPLICATION SLAVE ON *.* TO 'replication'@'%';

My advice is instead of using the % wildcard, set up the IP address of your primary.

This user will be added to the primary ’s MASTER_USER option, and in theory could be any user as long it also has REPLICATION SLAVE privileges. After that, the replica will connect to the primary and perform some kind of handshake with those credentials and if they match, theprimary will allow replication to occur.

See something wrong in this tutorial? Please don’t hesitate to message me through the comments or the contact page.

Using AWS EC2 instance store vs EBS for MySQL: how to increase performance and decrease cost

If you are using large EBS GP2 volumes for MySQL (i.e. 10TB+) on AWS EC2, you can increase performance and save a significant amount of money by moving to local SSD (NVMe) instance storage. Interested? Then read on for a more detailed examination of how to achieve cost-benefits and increase performance from this implementation.

EBS vs Local instance store

We have heard from customers that large EBS GP2 volumes can be affected by short term outages—IO “stalls” where no IO is going in or out for a couple of minutes. This can happen, especially, in the largest AWS region us-east-1. Statistically, with so many disks in disk arrays (which back EBS volumes) we can expect frequent disk failures. If we allocate a very large EBS GP2 volume, i.e. 10Tb+, hitting such failure events can be common.

In the case of MySQL/InnoDB, such an IO “stall” will be obvious, particularly with the highly loaded system where MySQL needs to do physical IO. During the stall, you will see all write queries are waiting, or “hang”.  Some of the writes may error out with “Error 1030” (MySQL error code 1030 (ER_GET_ERRNO): Got error %d from storage engine). There is nothing MySQL can do here – if the IO subsystem is not available, it will need to wait for it.

The good news is: many of the newer EC2 instances (i.e. i3, m5d, etc) have local SSD disks attached (NVMe). Those disks are local to the physical server and should not suffer from the EBS issues described above. Using local disks can be a very good solution:

  1. They are faster, as they are local to the server, and do not suffer from the EBS issues
  2. They are much cheaper compared to large EBS volumes.

Please note, however, that local storage does not guarantee persistence. More about this below.

Another potential option will be to use IO1 volumes with provisional IOPS. However, it will be significantly more expensive for the large volumes and high traffic.

A look at costs

To estimate the costs, I’ve used the AWS simple monthly calculator. Estimated costs are based on 1 year reserved instances. Let’s imagine we will need to use 14TB volume (to store ~10Tb of MySQL data including binary logs). The pricing estimates will look like this:

r4.4xlarge, 122GB RAM, 16 vCPUs + EBS, 14TB volume (this is what we are presumably using now)

Amazon EC2 Service (US East (N. Virginia)) $ 1890.56 / month Compute: $ 490.56 EBS Volumes: $1400.00

Local storage price estimate:
i3.4xlarge, 122GB RAM, 16 vCPUs, 3800 GiB disk (2 x 1900 NVMe SSD)

Amazon EC2 Service (US East (N. Virginia)) $ 627.21 / month Compute: $ 625.61

i3.8xlarge, 244GB RAM, 32 vCPUs, 7600 GiB disk (4 x 1900 NVMe SSD)

Amazon EC2 Service (US East (N. Virginia)) $1252.82 / month Compute: $ 1251.22

As we can see, even if we switch to i3.8xlarge and get 2x more RAM and 2x more virtual CPUs, faster storage, 10 gigabit network we can still pay 1.5x less per box what we are presumably paying now. Include replication, then that’s paying 1.5x less per each of the replication servers.

But wait … there is a catch.

How to migrate to local storage from EBS

Well, we have some challenges here to migrate from EBS to local instance NVMe storage.

  1. Wait, we are storing ~10Tb and i3.8xlarge have 7600 GiB disk. The answer is simple: compression (see below)
  2. Wait, but the local storage is ephemeral, if we loose the box we will loose our data – that is unacceptable.  The answer is also simple: replication (see below)
  3. Wait, but we use EBS snapshots for backups. That answer is simple too: we can still use EBS (and use snapshots) on 1 of the replication slave (see below)

To fit i3.8xlarge we only need 2x compression. This can be done with InnoDB row compression (row_format=compressed) or InnoDB page compression, which requires sparse file and hole punching support. However, InnoDB compression may be slower and will only compress ibd files—it does not compress binary logs, frm files, etc.


Another option: use the ZFS filesystem. ZFS will compress all files, including binary logs and frm. That can be very helpful if we use a “schema per customer” or “table per customer” approach and need to store 100K – 200K tables in a single MySQL instance. If the data is compressible, or new tables were provisioned without much data in those, ZFS can give a significant disk savings.

I’ve used ZFS (followed Yves blog post, Hands-On Look at ZFS with MySQL). Here are the results of data compression with ZFS (this is real data, not a generated data):

# du -sh --apparent-size /mysqldata/mysql/data 8.6T /mysqldata/mysql/data # du -sh /mysqldata/mysql/data 3.2T /mysqldata/mysql/data

Compression ratio:

# zfs get all | grep -i compress ... mysqldata/mysql/data compressratio 2.42x - mysqldata/mysql/data compression gzip inherited from mysqldata/mysql mysqldata/mysql/data refcompressratio 2.42x - mysqldata/mysql/log compressratio 3.75x - mysqldata/mysql/log compression gzip inherited from mysqldata/mysql mysqldata/mysql/log refcompressratio 3.75x -

As we can see, the original 8.6Tb of data was compressed to 3.2Tb, the compression ratio for MySQL tables is 2.42x, for binary logs 3.75x. That will definitely fit i3.8xlarge.

(For another test, I’ve generated 40 million tables spread across multiple schemas (databases). I’ve added some data only to one schema, leaving others blank. For that test I achieved ~10x compression ratio.)

Conclusion: ZFS can provide you with very good compression ratio, will allow you to use different EC2 instances on AWS, and save you a substantial amount of money. Although compression is not free performance-wise, and ZFS can be slower for some workloads, using local NVMe storage can compensate.

You can find some performance testing for ZFS on linux in this blog post: About ZFS Performance. Some benchmarks comparing EBS and local NVMe SSD storage (i3 instances) can be found in this blog post: Percona XtraDB Cluster on Amazon GP2 Volumes


Another option for compression would be using the MyRocks storage engine in Percona Server for MySQL, which provides compression.

Replication and using local volumes

As the local instance storage is ephemeral we need redundancy: we can use MySQL replication or Percona XtraDB cluster (PXC). In addition, we can use one replication slave—or we can attach a replication slave to PXC—and have it use EBS storage.

Local storage is not durable. If you stop the instance and then start it again, the local storage will probably disappear. (Though reboot is an exception, you can reboot the instance and the local storage will be fine.) In addition if the local storage disappears we will have to recreate MySQL local storage partition (for ZFS, i.e. zpool create or for EXT4/XFS, i.e. mkfs)

For example, using MySQL replication:

master - local storage (AZ 1, i.e. 1a) -> slave1 - local storage (AZ 2, i.e. 1b) -> slave2 - ebs storage (AZ 3, i.e. 1c) (other replication slaves if needed with local storage - optional)

Then we can use slave2 for ebs snapshots (if needed). This slave will be more expensive (as it is using EBS) but it can also be used to either serve production traffic (i.e. we can place smaller amount of traffic) or for other purposes (for example analytical queries, etc).

For Percona XtraDB cluster (PXC) we can just use 3 nodes, 1 in each AZ. PXC uses auto-provisioning with SST if the new node comes back blank. For MySQL replication we need some additional things:

  1. Failover from master to a slave if the master will go down. This can be done with MHA or Orchestrator
  2. Ability to clone slave. This can be done with Xtrabackup or ZFS snapshots (if using ZFS)
  3. Ability to setup a new MySQL local storage partition (for ZFS, i.e. zpool create or for EXT4/XFS, i.e. mkfs)
Other options

Here are some totally different options we could consider:

  1. Use IO1 volumes (as discussed). That can be way more expensive.
  2. Use local storage and MyRocks storage engine. However, switching to another storage engine is another bigger project and requires lots of testing
  3. Switch to AWS Aurora. That can be even more expensive for this particular case; and switching to aurora can be another big project by itself.
  1. Using EC2 i3 instances with local NVMe storage can increase performance and save money. There are some limitations: local storage is ephemeral and will disappear if the node has stopped. Reboot is fine.
  2. ZFS filesystem with compression enabled can decrease the storage requirements so that a MySQL instance will fit into local storage. Another option for compression could be to use InnoDB compression (row_format=compressed).

That may not work for everyone as it requires additional changes to the existing server provisioning: failover from master to a slave, ability to clone replication slaves (or use PXC), ability to setup a new MySQL local storage partition, using compression.

The post Using AWS EC2 instance store vs EBS for MySQL: how to increase performance and decrease cost appeared first on Percona Database Performance Blog.

Use cases for MySQL Cluster

Third chapter of "MySQL Cluster 7.5 inside and out".

NDB was designed for a number of networking applications. In particular the original
design focused a lot on telecom applications. The telecom applications have extreme
requirements on availability, many networking applications as well.

This chapter goes through a number of application types where NDB have been used
and also uses a number of public examples of use cases.

These application types include DNS servers, DHCP servers, Intelligent Network (IN)
applications, 3G/4G/5G applications, RADIUS servers, number portability, DIAMETER
server, Video-on-demand applications, Payment Gateways, Software Defined
Networking (SDN), Network Function Virtualization (NFV), Voice over IP.

In addition also many internet applications such as Gaming servers, financial
applications such as stock quote servers, whitelist/blacklist handling. eCommerce
applications, payment services, web applications, fraud detection, online banking,
session database.

In addition a number of frameworks have used NDB and presented it publically such
as a GE presentation, HopsFS (Hadoop distribution), HopsYARN, Oracle OpenStack
and OpenLDAP.

With the possibility to handle more data in MySQL Cluster 7.6 it is likely that this
list of application types will grow even more.

Webinar Tues 8/21: MariaDB 10.3 vs. MySQL 8.0

Please join Percona’s Chief Evangelist, Colin Charles on Tuesday, August 21st, 2018, as he presents MariaDB 10.3 vs. MySQL 8.0 at 7:00 AM PDT (UTC-7) / 10:00 PM EDT (UTC-4).

Register Now

Are they syntactically similar? Where do these two languages differ? Why would I use one over the other?

MariaDB 10.3 is on the path of gradually diverging from MySQL 8.0. One obvious example is the internal data dictionary currently under development for MySQL 8.0. This is a major change to the way metadata is stored and used within the server: MariaDB doesn’t have an equivalent feature. Implementing this feature could mark the end of datafile-level compatibility between MySQL and MariaDB.

There are also non-technical differences between MySQL 8.0 and MariaDB 10.4, including:

  • Licensing: MySQL offers their code as open-source under the GPL, and provides the option of non-GPL commercial distribution in the form of MySQL Enterprise. MariaDB can only use the GPL because their work is derived from the MySQL source code under the terms of that license.
  • Support services: Oracle provides technical support, training, certification and consulting for MySQL, while MariaDB has their own support services. Some people prefer working with smaller companies, as traditionally it affords them more leverage as a customer.
  • Community contributions: MariaDB touts the fact that they accept more community contributions than Oracle. Part of the reason for this disparity is that developers like to contribute features, bug fixes and other code without a lot of paperwork overhead (and they complain about the Oracle Contributor Agreement). However, MariaDB has its own MariaDB Contributor Agreement — which more or less serves the same purpose.

Colin will take a look at some of the differences between MariaDB 10.3 and MySQL 8.0 and help answer some of the common questions our Database Performance Experts get about the two databases.

Register Now

The post Webinar Tues 8/21: MariaDB 10.3 vs. MySQL 8.0 appeared first on Percona Database Performance Blog.

On Fine MySQL Manual

Today I am going to provide some details on the last item in my list of problems with Oracle's way of MySQL server development, maintenance of MySQL Manual. I stated that:
"MySQL Manual still have many details missing and is not fixed fast enough.
Moreover, it is not open source...
"Let me explain the above:
  1. MySQL Reference Manual is not open source. It used to be built from DocBook XML sources. Probably that's still the case. But you can not find the source code in open repositories (please, correct me if I am wrong, I tried to search...) That's because it is NOT open source. It says this clearly in Preface:
    "This documentation is NOT distributed under a GPL license. Use of this documentation is subject to the following terms:
    Any other use, such as any dissemination of printed copies or use of this documentation, in whole or in part, in another publication, requires the prior written consent from an authorized representative of Oracle. Oracle and/or its affiliates reserve any and all rights to this documentation not expressly granted above.
    "It was NOT Oracle who closed the source (as far as I remember, the manual was not GPL even in 2004, when I started to care about MySQL in general). But Oracle had a chance to change the license and set up some better contribution process for MySQL Community users, with many good writers among them. They decided not do this, so creators of forks and derived software have to duplicate efforts and rewrite everything themselves, and the only real way to help make the manual better is to report bugs.
  2. Quality of new documentation is not improved much. We, MySQL Community users, have to report bugs in the manual, as it still has some details missing or documented wrongly. Let me illustrate this with recent 10 or so documentation requests users made (I skipped reports for features I do not care about for now, like group replication):
    • Bug #91955 - "8.0 broke plugin API for C". According to the comment from Oracle developer in this bug reported by Laurynas Biveinis, writing plugins in C is not supported any more... But this change is NOT properly documented.
    • Bug #91781 - "Add version matrix in "Chapter 2 Keywords and Reserved Words". Good idea in the manual is not consistently implemented.
    • Bug #91743 - "performance_schema.data_locks not working correctly". Nice feature was added to Performance Schema (now we can check and study InnoDB locks with SQL!), but it is still not completely and properly documented.
    • Bug #91654 - "Handling an Unexpected Halt of a Replication Slave" Documentation is uncertain". This is also an example of improper bug handling - when I worked in Oracle newer bugs were usually declared duplicates of older ones. Here the opposite decision was made, even though both reports were from the same user, Uday Varagani, who explicitly asked to change the decision. Obviously, documentation requests do not get any extra care comparing to other kinds of bugs, quite to the contrary...
    • Bug #91648 - "Numeric literals must be a number between 0 and 9999". Surely ports with numbers larger than 9999 can be used.
    • Bug #91642 - "Document how protocol version 4.0 and 4.1 differ on how strings are terminated". As noted by Rene' Cannao', comments in the code are still sometimes more useful than documentation.
    • Bug #91549 - "MDL lock queue order seems to depend on table names". Make sure to read last comment in this nice request by Shane Bester. Dmitry Lenev provides some details on priorities of MDL locks in it. There are still cases when bugs and documentation requests document some details better than fine MySQL Manual!
    • Bug #90997 - "Handling an Unexpected Halt of a Replication Slave" manual page is wrong". Sveta Smirnova highlighted a totally misleading statement in the manual.
    • Bug #90935 - "Modify relay_log_recovery Documentation". Simple documentation request stays "Open" for 3 months. Definitely processing documentation requests is not a highest priority for Oracle engineers.
    • Bug #90680 - "dragnet logging: document how to use / find error_symbol codes". Even if request comes from a customer or otherwise well known bug reporter, like Simon Mudd, and it's related to some totally new cool feature of MySQL 8, it can wait for the fix for months...
    You can make your own conclusions from the above. But I do not see any good trends in the way new documentation is created or documentation requests are processed recently. Same problems as 4 years ago (see more on that in a side note below).
  3. Older documentation requests get even less attention than recent ones sometimes, even though they may highlight problems with software itself, not the MySQL Manual. Let me illustrate this with a few bugs I reported:
    • Bug #83640 - "Locks set by DELETE statement on already deleted record". I explicitly asked to
      "...document in the manual what locks this kind of DELETE sets when it encountered a record already marked as deleted, and why"This still had NOT happened. Moreover, eventually both MySQL and MariaDB engineers decided that current implementation of locking for this case is wrong and can be improved, so this report ended up as InnoDB bug. Check related  MDEV-11215 for more details.
    • Bug #73413 - "Manual does not explain MTS implementation in details". This is one of my favorite documentation requests. I've got a suggestion to explain what I want to see documented, in details. I tried my best, but if you want to get more details, read this blog.
    • Bug #72368 - "Empty/zero statistics for imported tablespace until explicit ANALYZE TABLE". This is a bug in the way persistent statistics (the feature I truly hate) in InnoDB is re-estimated "automatically". But until the bug is fixed, I asked to document current implementation (and problems with it). So, where current implementation is properly documented? If only in the bug reports...
    • Bug #71294 - "Manual page for P_S.FILE_INSTANCES table does not explain '~' in FILE_NAME".  They pretend they do not get the point:
      "Per Marc Alff, the Performance Schema is simply reporting what it gets from runtime. If runtime is reporting nonexistent filenames, that's a server issue.

      Recategorizing this as a server bug.
    • Bug #71293 - "Manual page for P_S.FILE_INSTANCES table does not explain EVENT_NAME values". Now they are waiting for the new DOCUMENTATION column in the setup_instruments table to be filled in with something by server developers... The code is the documentation? OK, bus as we know from the experience (see slides 44-46 here) chances to get a bug in Performance Schema fixed fast are even less than to see it properly and completely documented...
There are more problems with MySQL documentation (not only reference manual), but at the moment I consider 3 highlighted and illustrated above the most serious.

Regent's Canal is nice. If I only knew how to operate the locks there... MySQL Manual also misses some information about locks. As a side note, it's not the first time I write about MySQL Manual. You can find some useful details in the following posts:
  • "What's wrong with MySQL Manual". In 2014, after spending few weeks reporting up to 5 documentation bugs per day, I thought that, essentially, there is nothing much wrong with it - it's fine, well indexed by Google and has meaningful human-readable URLs. Few problems listed were lack of careful readers (I tried my best to fix that), limited attention to old documentation requests, some pages with not so much useful content and lack of "How to" documents. The later I also tried to fix to some small extent in this blog, see howto tagged posts. The real fix came mostly from Percona's blog, though.
  • I have a special "missing manual" tag for blog posts that mention at least one bug in the manual.
  • I tried to use "missing_manual" tag consistently for my own documentation requests. Last year I shared a detailed enough review of the bugs with that tag that were still active.
As a yet another side note, I tried once to create a kind of community driven "missing manual" project, but failed. Writing manual from scratch is a (hard) full time job, while my full time job is providing support to users of MySQL, MariaDB and Percona software...

That said, I also wanted to join MySQL's documentation team in the past, but this was not possible at least because I am not a native speaker. If anything changed in this regard, I am still open to a job offer of this kind. My conditions for an acceptable offer from Oracle are known to all interested parties and they include (but are not limited to) at least 4 times the salary I had before I quit (it was almost 6 years ago) and working for Oracle's office in Paris (because in Oracle office you ere employed by formally matters a lot for your happiness and success as employee).

In case of no offer in upcoming week, I'll just continue to write my annoying but hopefully somewhat useful blog posts (until MySQL Reference Manual becomes true open source project ;)

Percona Server for MySQL 5.6.41-84.1 Is Now Available

Percona announces the release of Percona Server for MySQL 5.6.41-84.1 on August 17, 2018 (downloads are available here and from the Percona Software Repositories). This release merges changes of MySQL 5.6.41, including all the bug fixes in it. Percona Server for MySQL 5.6.41-84.1 is now the current GA release in the 5.6 series. All of Percona’s software is open-source and free.

Bugs Fixed
  • A simple SELECT query on a table with CHARSET=euckr COLLATE=euckr_bin could return different results each time it was executed. Bug fixed #4513 (upstream 91091).
  • Percona Server 5.6.39-83.1 could crash when altering an InnoDB table that has a full-text search index defined. Bug fixed #3851 (upstream 68987).
Other Bugs Fixed
  • #3325 “online upgrade GTID cause binlog damage in high write QPS situation”
  • #3976 “Errors in MTR tests main.variables-big, main.information_schema-big, innodb.innodb_bug14676111”
  • #4506 “Backport fixes from 8.0 for InnoDB memcached Plugin”

Find the release notes for Percona Server for MySQL 5.6.41-84.1 in our online documentation. Report bugs in the Jira bug tracker.

The post Percona Server for MySQL 5.6.41-84.1 Is Now Available appeared first on Percona Database Performance Blog.

Percona Server for MySQL 5.5.61-38.13 Is Now Available

Percona announces the release of Percona Server for MySQL 5.5.61-38.13 on August 17, 2018 (downloads are available here and from the Percona Software Repositories). This release merges changes of MySQL 5.5.61, including all the bug fixes in it. Percona Server for MySQL 5.5.61-38.13 is now the current GA release in the 5.5 series. All of Percona’s software is open-source and free.

Bugs Fixed
  • The --innodb-optimize-keys option of the mysqldump utility fails when a column name is used as a prefix of a column which has the AUTO_INCREMENT attribute. Bug fixed #4524.
Other Bugs Fixed
  • #4566 “stack-use-after-scope in reinit_io_cache()” (upstream 91603)
  • #4581 “stack-use-after-scope in _db_enter_() / mysql_select_db()” (upstream 91604)
  • #4600 “stack-use-after-scope in _db_enter_() / get_upgrade_info_file_name()” (upstream 91617)
  • #3976 “Errors in MTR tests main.variables-big, main.information_schema-big, innodb.innodb_bug14676111”

Find the release notes for Percona Server for MySQL 5.5.61-38.13 in our online documentation. Report bugs in the Jira bug tracker.

The post Percona Server for MySQL 5.5.61-38.13 Is Now Available appeared first on Percona Database Performance Blog.

MySQL Connector/J 5.1.47 GA has been released

Dear MySQL Users,

MySQL Connector/J 5.1.47, a maintenance release of the production 5.1
branch has been released. Connector/J is the Type-IV pure-Java JDBC
driver for MySQL.

MySQL Connector/J is available in source and binary form from the
Connector/J download pages at
and mirror sites as well as Maven-2 repositories.

MySQL Connector/J (Commercial) is already available for download on
the My Oracle Support (MOS) website. This release will shortly be
available on eDelivery (OSDC).

As always, we recommend that you check the “CHANGES” file in the
download archive to be aware of changes in behavior that might affect
your application.

MySQL Connector/J 5.1.47 includes the following general bug fixes and
improvements, also available in more detail on

Changes in MySQL Connector/J 5.1.47 (2018-08-17, General Availability) Version 5.1.47 is a maintenance release of the production 5.1 branch. It is suitable for use with MySQL Server versions 5.5, 5.6, 5.7, and 8.0. It supports the Java Database Connectivity (JDBC) 4.2 API. * Functionality Added or Changed * Bugs Fixed Functionality Added or Changed * The value UTF-8 for the connection property characterEncoding now maps to the utf8mb4 character set on the server and, for MySQL Server 5.5.2 and later, characterEncoding=UTF-8 can now be used to set the connection character set to utf8mb4 even if character_set_server has been set to something else on the server. (Before this change, the server must have character_set_server=utf8mb4 for Connector/J to use that character set.) Also, if the connection property connectionCollation is also set and is incompatible with the value of characterEncoding, characterEncoding will be overridden with the encoding corresponding to connectionCollation. See Using Character Sets and Unicode ( j/5.1/en/connector-j-reference-charsets.html ) for details, including how to use the utf8mb3 character set now for connection. (Bug#23227334, Bug #81196) Bugs Fixed * Setting rewriteBatchedStatements=true and useLocalTransactionState=true caused transactions to be uncommitted for batched UPDATE and DELETE statements. It was due to the intermediate queries for enabling multiquery support on the server resetting the local transaction state as a side effect. With this fix, the local transaction state is preserved when the intermediate queries are executed. (Bug #27658489, Bug #89948) * Rewriting prepared INSERT statements in a multiquery batch failed with a BatchUpdateException when the statements did not contain place holders. This was due a faulty mechanism for query rewriting, which has been corrected by this fix. (Bug #25501750, Bug #84813) * When using batched prepared statements with multiple queries per statement, queries rewriting was incorrect, resulting in the wrong queries being sent to the server. (Bug #23098159, Bug #81063) * ResultSet.updateRow() failed when the character set used by a column in the ResultSet did not match that of the connection's encoding. With this fix, values for the affected columns are first converted to String before the update, so that the character set difference is properly handled. (Bug #22847443, Bug #80532) * Record updates failed for a scrollable and updatable PreparedStatement when the WHERE clause for the updater or refresher contained fractional timestamp values and the connection property sendFractionalSeconds was set to false. It was because in the situation, Connector/J did not perform the proper adjustments of the fractional parts in the WHERE clause values according to the length of the field's fractional part as defined in the database. This fix makes Connector/J perform the proper adjustment to the fractional part, so that the WHERE clause value can be properly compared to the value fetched from the database. Moreover, useJDBCCompliantTimezoneShift(), useGmtMillisForDatetimes(), and useSSPSCompatibleTimezoneShift() were applied to the WHERE clause values while they should not be, and this fix removes their applications. (Bug #22305979) * When a Java Date value was bound to a PreparedStatement parameter, attempts to format the value by a proleptic GregorianCalendar failed to make the dates proleptic, so that dates before the Julian-Gregorian cutover (October 15, 1582) were stored wrongly. With this fix, a proleptic calendar is properly used if supplied to the setDate() method. Note that when trying to set or retrieve dates before the Julian-Gregorian cutover with PreparedSatement methods, a proleptic GregorianCalendar should always be explicitly supplied to the setDate() and getDate() method.For details, see Known Issues and Limitations ( ). (Bug#18749544, Bug #72609)

On Behalf of MySQL Release Engineering Team,
Surabhi Bhat

Replication from Percona Server for MySQL to PostgreSQL using pg_chameleon

Replication is one of the well-known features that allows us to build an identical copy of a database. It is supported in almost every RDBMS. The advantages of replication may be huge, especially HA (High Availability) and load balancing. But what if we need to build replication between 2 heterogeneous databases like MySQL and PostgreSQL? Can we continuously replicate changes from a MySQL database to a PostgreSQL database? The answer to this question is pg_chameleon.

For replicating continuous changes, pg_chameleon uses the mysql-replication library to pull the row images from MySQL, which are transformed into a jsonb object. A pl/pgsql function in postgres decodes the jsonb and replays the changes into the postgres database. In order to setup this type of replication, your mysql binlog_format must be “ROW”.

A few points you should know before setting up this tool :

  1. Tables that need to be replicated must have a primary key.
  2. Works for PostgreSQL versions > 9.5 and MySQL > 5.5
  3. binlog_format must be ROW in order to setup this replication.
  4. Python version must be > 3.3

When you initialize the replication, pg_chameleon pulls the data from MySQL using the CSV format in slices, to prevent memory overload. This data is flushed to postgres using the COPY command. If COPY fails, it tries INSERT, which may be slow. If INSERT fails, then the row is discarded.

To replicate changes from mysql, pg_chameleon mimics the behavior of a mysql slave. It creates the schema in postgres, performs the initial data load, connects to MySQL replication protocol, stores the row images into a table in postgres. Now, the respective functions in postgres decode those rows and apply the changes. This is similar to storing relay logs in postgres tables and applying them to a postgres schema. You do not have to create a postgres schema using any DDLs. This tool automatically does that for the tables configured for replication. If you need to specifically convert any types, you can specify this in the configuration file.

The following is just an exercise that you can experiment with and implement if it completely satisfies your requirement. We performed these tests on CentOS Linux release 7.4.

Prepare the environment Set up Percona Server for MySQL

InstallMySQL 5.7 and add appropriate parameters for replication.

In this exercise, I have installed Percona Server for MySQL 5.7 using YUM repo.

yum install yum install Percona-Server-server-57 echo "mysql ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers usermod -s /bin/bash mysql sudo su - mysql

pg_chameleon requires the following the parameters to be set in your my.cnf file (parameter file of your MySQL server). You may add the following parameters to /etc/my.cnf

binlog_format= ROW binlog_row_image=FULL log-bin = mysql-bin server-id = 1

Now start your MySQL server after adding the above parameters to your my.cnf file.

$ service mysql start

Fetch the temporary root password from mysqld.log, and reset the root password using mysqladmin

$ grep "temporary" /var/log/mysqld.log $ mysqladmin -u root -p password 'Secret123!'

Now, connect to your MySQL instance and create sample schema/tables. I have also created an emp table for validation.

$ wget $ tar -xzf sakila-db.tar.gz $ mysql -uroot -pSecret123! < sakila-db/sakila-schema.sql $ mysql -uroot -pSecret123! < sakila-db/sakila-data.sql $ mysql -uroot -pSecret123! sakila -e "create table emp (id int PRIMARY KEY, first_name varchar(20), last_name varchar(20))"

Create a user for configuring replication using pg_chameleon and give appropriate privileges to the user using the following steps.

$ mysql -uroot -p create user 'usr_replica'@'%' identified by 'Secret123!'; GRANT ALL ON sakila.* TO 'usr_replica'@'%'; GRANT RELOAD, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'usr_replica'@'%'; FLUSH PRIVILEGES;

While creating the user in your mysql server (‘usr_replica’@’%’), you may wish to replace % with the appropriate IP or hostname of the server on which pg_chameleon is running.

Set up PostgreSQL

Install PostgreSQL and start the database instance.

You may use the following steps to install PostgreSQL 10.x

yum install yum install postgresql10* su - postgres $/usr/pgsql-10/bin/initdb $ /usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data start

As seen in the following logs, create a user in PostgreSQL using which pg_chameleon can write changed data to PostgreSQL. Also create the target database.

postgres=# CREATE USER usr_replica WITH ENCRYPTED PASSWORD 'secret'; CREATE ROLE postgres=# CREATE DATABASE db_replica WITH OWNER usr_replica; CREATE DATABASE

Steps to install and setup replication using pg_chameleon

Step 1: In this exercise, I installed Python 3.6 and pg_chameleon 2.0.8 using the following steps. You may skip the python install steps if you already have the desired python release. We can create a virtual environment if the OS does not include Python 3.x by default.

yum install gcc openssl-devel bzip2-devel wget cd /usr/src wget tar xzf Python-3.6.6.tgz cd Python-3.6.6 ./configure --enable-optimizations make altinstall python3.6 -m venv venv source venv/bin/activate pip install pip --upgrade pip install pg_chameleon

Step 2: This tool requires a configuration file to store the source/target server details, and a directory to store the logs. Use the following command to let pg_chameleon create the configuration file template and the respective directories for you.

$ chameleon set_configuration_files

The above command would produce the following output, which shows that it created some directories and a file in the location where you ran the command.

creating directory /var/lib/pgsql/.pg_chameleon creating directory /var/lib/pgsql/.pg_chameleon/configuration/ creating directory /var/lib/pgsql/.pg_chameleon/logs/ creating directory /var/lib/pgsql/.pg_chameleon/pid/ copying configuration example in /var/lib/pgsql/.pg_chameleon/configuration//config-example.yml

Copy the sample configuration file to another file, lets say, default.yml

$ cd .pg_chameleon/configuration/ $ cp config-example.yml default.yml

Here is how my default.yml file looks after adding all the required parameters. In this file, we can optionally specify the data type conversions, tables to skipped from replication and the DML events those need to skipped for selected list of tables.

--- #global settings pid_dir: '~/.pg_chameleon/pid/' log_dir: '~/.pg_chameleon/logs/' log_dest: file log_level: info log_days_keep: 10 rollbar_key: '' rollbar_env: '' # type_override allows the user to override the default type conversion into a different one. type_override: "tinyint(1)": override_to: boolean override_tables: - "*" #postgres destination connection pg_conn: host: "localhost" port: "5432" user: "usr_replica" password: "secret" database: "db_replica" charset: "utf8" sources: mysql: db_conn: host: "localhost" port: "3306" user: "usr_replica" password: "Secret123!" charset: 'utf8' connect_timeout: 10 schema_mappings: sakila: sch_sakila limit_tables: # - skip_tables: # - grant_select_to: - usr_readonly lock_timeout: "120s" my_server_id: 100 replica_batch_size: 10000 replay_max_rows: 10000 batch_retention: '1 day' copy_max_memory: "300M" copy_mode: 'file' out_dir: /tmp sleep_loop: 1 on_error_replay: continue on_error_read: continue auto_maintenance: "disabled" gtid_enable: No type: mysql skip_events: insert: # - #skips inserts on the table delete: # - delphis_mediterranea #skips deletes on schema delphis_mediterranea update:

Step 3: Initialize the replica using this command:

$ chameleon create_replica_schema --debug

The above command creates a schema and nine tables in the PostgreSQL database that you specified in the .pg_chameleon/configuration/default.yml file. These tables are needed to manage replication from source to destination. The same can be observed in the following log.

db_replica=# \dn List of schemas Name | Owner ---------------+------------- public | postgres sch_chameleon | target_user (2 rows) db_replica=# \dt sch_chameleon.t_* List of relations Schema | Name | Type | Owner ---------------+------------------+-------+------------- sch_chameleon | t_batch_events | table | target_user sch_chameleon | t_discarded_rows | table | target_user sch_chameleon | t_error_log | table | target_user sch_chameleon | t_last_received | table | target_user sch_chameleon | t_last_replayed | table | target_user sch_chameleon | t_log_replica | table | target_user sch_chameleon | t_replica_batch | table | target_user sch_chameleon | t_replica_tables | table | target_user sch_chameleon | t_sources | table | target_user (9 rows)

Step 4: Add the source details to pg_chameleon using the following command. Provide the name of the source as specified in the configuration file. In this example, the source name is mysql and the target is postgres database defined under pg_conn.

$ chameleon add_source --config default --source mysql --debug

Once you run the above command, you should see that the source details are added to the t_sources table.

db_replica=# select * from sch_chameleon.t_sources; -[ RECORD 1 ]-------+---------------------------------------------- i_id_source | 1 t_source | mysql jsb_schema_mappings | {"sakila": "sch_sakila"} enm_status | ready t_binlog_name | i_binlog_position | b_consistent | t b_paused | f b_maintenance | f ts_last_maintenance | enm_source_type | mysql v_log_table | {t_log_replica_mysql_1,t_log_replica_mysql_2} $ chameleon show_status --config default Source id Source name Type Status Consistent Read lag Last read Replay lag Last replay ----------- ------------- ------ -------- ------------ ---------- ----------- ------------ ------------- 1 mysql mysql ready Yes N/A N/A

Step 5: Initialize the replica/slave using the following command. Specify the source from which you are replicating the changes to the PostgreSQL database.

$ chameleon init_replica --config default --source mysql --debug

Initialization involves the following tasks on the MySQL server (source).

1. Flush the tables with read lock
2. Get the master’s coordinates
3. Copy the data
4. Release the locks

The above command creates the target schema in your postgres database automatically.
In the default.yml file, we mentioned the following schema_mappings.

schema_mappings: sakila: sch_sakila

So, now it created the new schema scott in the target database db_replica.

db_replica=# \dn List of schemas Name | Owner ---------------+------------- public | postgres sch_chameleon | usr_replica sch_sakila | usr_replica (3 rows)

Step 6: Now, start replication using the following command.

$ chameleon start_replica --config default --source mysql

Step 7: Check replication status and any errors using the following commands.

$ chameleon show_status --config default $ chameleon show_errors

This is how the status looks:

$ chameleon show_status --source mysql Source id Source name Type Status Consistent Read lag Last read Replay lag Last replay ----------- ------------- ------ -------- ------------ ---------- ----------- ------------ ------------- 1 mysql mysql running No N/A N/A == Schema mappings == Origin schema Destination schema --------------- -------------------- sakila sch_sakila == Replica status == --------------------- --- Tables not replicated 0 Tables replicated 17 All tables 17 Last maintenance N/A Next maintenance N/A Replayed rows Replayed DDL Skipped rows

Now, you should see that the changes are continuously getting replicated from MySQL to PostgreSQL.

Step 8:  To validate, you may insert a record into the table in MySQL that we created for the purpose of validation and check that it is replicated to postgres.

$ mysql -u root -pSecret123! -e "INSERT INTO sakila.emp VALUES (1,'avinash','vallarapu')" mysql: [Warning] Using a password on the command line interface can be insecure. $ psql -d db_replica -c "select * from sch_sakila.emp" id | first_name | last_name ----+------------+----------- 1 | avinash | vallarapu (1 row)

In the above log, we see that the record that was inserted to the MySQL table was replicated to the PostgreSQL table.

You may also add multiple sources for replication to PostgreSQL (target).

Reference :

Please refer to the above documentation to find out about the many more options that are available with pg_chameleon

The post Replication from Percona Server for MySQL to PostgreSQL using pg_chameleon appeared first on Percona Database Performance Blog.

Rationale for MySQL Cluster

Second chapter of "MySQL Cluster 7.5 inside and out".

When I started developing the ideas for NDB Cluster in my Ph.D studies
about 25 years ago, the first thing I did was to perform a very thorough
study of the requirements.

At that time I participated in a European Research for UMTS. UMTS was
later marketed as 3G. My part of this big study group of more than 100
researchers was to simulate the network behaviour. We used the
protocols developed by other groups to see what the load would be on
the various nodes in the telecom network.

I was mostly interested in the load it contributed to the network databases.
Through these studies and also by studying the AXE system developed
in Ericsson I got a very good picture of the requirements on a DBMS
to be used for future telecom services.

In parallel with this I also studied a number of other areas such as
multimedia email servers, news-on-demand servers, genealogy
servers that would be other popular services in the telecom network.

In the above chapter I go through how those requirements was turned
into requirements on predictable response times, availability requirements,
throughput requirements and so forth.

In particular how the requirement led to a model that divided the Data
Server functionality (NDB data nodes) and the Query Server
functionality (MySQL Server nodes).

Also how predictable response times are achieved by building on ideas
from the AXE architecture developed in Ericsson.

How to Monitor your Database Servers using ClusterControl CLI

How would you like to merge "top" process for all your 5 database nodes and sort by CPU usage with just a one-liner command? Yeah, you read it right! How about interactive graphs display in the terminal interface? We introduced the CLI client for ClusterControl called s9s about a year ago, and it’s been a great complement to the web interface. It’s also open source..

In this blog post, we’ll show you how you can monitor your databases using your terminal and s9s CLI.

Introduction to s9s, The ClusterControl CLI

ClusterControl CLI (or s9s or s9s CLI), is an open source project and optional package introduced with ClusterControl version 1.4.1. It is a command line tool to interact, control and manage your database infrastructure using ClusterControl. The s9s command line project is open source and can be found on GitHub.

Starting from version 1.4.1, the installer script will automatically install the package (s9s-tools) on the ClusterControl node.

Some prerequisites. In order for you to run s9s-tools CLI, the following must be true:

  • A running ClusterControl Controller (cmon).
  • s9s client, install as a separate package.
  • Port 9501 must be reachable by the s9s client.

Installing the s9s CLI is straightforward if you install it on the ClusterControl Controller host itself:$ rm

$ rm -Rf ~/.s9s $ wget $ ./

You can install s9s-tools outside of the ClusterControl server (your workstation laptop or bastion host), as long as the ClusterControl Controller RPC (TLS) interface is exposed to the public network (default to You can find more details on how to configure this in the documentation page.

To verify if you can connect to ClusterControl RPC interface correctly, you should get the OK response when running the following command:

$ s9s cluster --ping PING OK 2.000 ms

As a side note, also look at the limitations when using this tool.

Example Deployment

Our example deployment consists of 8 nodes across 3 clusters:

  • PostgreSQL Streaming Replication - 1 master, 2 slaves
  • MySQL Replication - 1 master, 1 slave
  • MongoDB Replica Set - 1 primary, 2 secondary nodes

All database clusters were deployed by ClusterControl by using "Deploy Database Cluster" deployment wizard and from the UI point-of-view, this is what we would see in the cluster dashboard:

Cluster Monitoring

We will start by listing out the clusters:

$ s9s cluster --list --long ID STATE TYPE OWNER GROUP NAME COMMENT 23 STARTED postgresql_single system admins PostgreSQL 10 All nodes are operational. 24 STARTED replication system admins Oracle 5.7 Replication All nodes are operational. 25 STARTED mongodb system admins MongoDB 3.6 All nodes are operational.

We see the same clusters as the UI. We can get more details on the particular cluster by using the --stat flag. Multiple clusters and nodes can also be monitored this way, the command line options can even use wildcards in the node and cluster names:

$ s9s cluster --stat *Replication Oracle 5.7 Replication Name: Oracle 5.7 Replication Owner: system/admins ID: 24 State: STARTED Type: REPLICATION Vendor: oracle 5.7 Status: All nodes are operational. Alarms: 0 crit 1 warn Jobs: 0 abort 0 defnd 0 dequd 0 faild 7 finsd 0 runng Config: '/etc/cmon.d/cmon_24.cnf' LogFile: '/var/log/cmon_24.log' HOSTNAME CPU MEMORY SWAP DISK NICs 1 6% 992M 120M 0B 0B 19G 13G 10K/s 54K/s 1 6% 992M 116M 0B 0B 19G 13G 11K/s 66K/s 2 39% 3.6G 2.4G 0B 0B 19G 3.3G 338K/s 79K/s

The output above gives a summary of our MySQL replication together with the cluster status, state, vendor, configuration file and so on. Down the line, you can see the list of nodes that fall under this cluster ID with a summarized view of system resources for each host like number of CPUs, total memory, memory usage, swap disk and network interfaces. All information shown are retrieved from the CMON database, not directly from the actual nodes.

You can also get a summarized view of all databases on all clusters:

$ s9s cluster --list-databases --long SIZE #TBL #ROWS OWNER GROUP CLUSTER DATABASE 7,340,032 0 0 system admins PostgreSQL 10 postgres 7,340,032 0 0 system admins PostgreSQL 10 template1 7,340,032 0 0 system admins PostgreSQL 10 template0 765,460,480 24 2,399,611 system admins PostgreSQL 10 sbtest 0 101 - system admins Oracle 5.7 Replication sys Total: 5 databases, 789,577,728, 125 tables.

The last line summarizes that we have total of 5 databases with 125 tables, 4 of them are on our PostgreSQL cluster.

For a complete example of usage on s9s cluster command line options, check out s9s cluster documentation.

Node Monitoring

For nodes monitoring, s9s CLI has similar features with the cluster option. To get a summarized view of all nodes, you can simply do:

$ s9s node --list --long STAT VERSION CID CLUSTER HOST PORT COMMENT coC- 23 PostgreSQL 10 9500 Up and running poM- 10.4 23 PostgreSQL 10 5432 Up and running poS- 10.4 23 PostgreSQL 10 5432 Up and running poS- 10.4 23 PostgreSQL 10 5432 Up and running soS- 5.7.23-log 24 Oracle 5.7 Replication 3306 Up and running. coC- 24 Oracle 5.7 Replication 9500 Up and running soM- 5.7.23-log 24 Oracle 5.7 Replication 3306 Up and running. mo-- 3.2.20 25 MongoDB 3.6 27017 Up and Running mo-- 3.2.20 25 MongoDB 3.6 27017 Up and Running coC- 25 MongoDB 3.6 9500 Up and running mo-- 3.2.20 25 MongoDB 3.6 27017 Up and Running Total: 11

The most left-hand column specifies the type of the node. For this deployment, "c" represents ClusterControl Controller, 'p" for PostgreSQL, "m" for MongoDB, "e" for Memcached and s for generic MySQL nodes. The next one is the host status - "o" for online, "l" for off-line, "f" for failed nodes and so on. The next one is the role of the node in the cluster. It can be M for master, S for slave, C for controller and - for everything else. The remaining columns are pretty self-explanatory.

You can get all the list by looking at the man page of this component:

$ man s9s-node

From there, we can jump into a more detailed stats for all nodes with --stats flag:

$ s9s node --stat --cluster-id=24 Name: Cluster: Oracle 5.7 Replication (24) IP: Port: 3306 Alias: - Owner: system/admins Class: CmonMySqlHost Type: mysql Status: CmonHostOnline Role: slave OS: centos 7.0.1406 core Access: read-only VM ID: - Version: 5.7.23-log Message: Up and running. LastSeen: Just now SSH: 0 fail(s) Connect: y Maintenance: n Managed: n Recovery: n Skip DNS: y SuperReadOnly: n Pid: 16592 Uptime: 01:44:38 Config: '/etc/my.cnf' LogFile: '/var/log/mysql/mysqld.log' PidFile: '/var/lib/mysql/' DataDir: '/var/lib/mysql/' Name: Cluster: Oracle 5.7 Replication (24) IP: Port: 3306 Alias: - Owner: system/admins Class: CmonMySqlHost Type: mysql Status: CmonHostOnline Role: master OS: centos 7.0.1406 core Access: read-write VM ID: - Version: 5.7.23-log Message: Up and running. Slaves: LastSeen: Just now SSH: 0 fail(s) Connect: n Maintenance: n Managed: n Recovery: n Skip DNS: y SuperReadOnly: n Pid: 975 Uptime: 01:52:53 Config: '/etc/my.cnf' LogFile: '/var/log/mysql/mysqld.log' PidFile: '/var/lib/mysql/' DataDir: '/var/lib/mysql/' Name: Cluster: Oracle 5.7 Replication (24) IP: Port: 9500 Alias: - Owner: system/admins Class: CmonHost Type: controller Status: CmonHostOnline Role: controller OS: centos 7.0.1406 core Access: read-write VM ID: - Version: Message: Up and running LastSeen: 28 seconds ago SSH: 0 fail(s) Connect: n Maintenance: n Managed: n Recovery: n Skip DNS: n SuperReadOnly: n Pid: 12746 Uptime: 01:10:05 Config: '' LogFile: '/var/log/cmon_24.log' PidFile: '' DataDir: ''

Printing graphs with the s9s client can also be very informative. This presents the data the controller collected in various graphs. There are almost 30 graphs supported by this tool as listed here and s9s-node enumerates them all. The following shows server load histogram of all nodes for cluster ID 1 as collected by CMON, right from your terminal:

It is possible to set the start and end date and time. One can view short periods (like the last hour) or longer periods (like a week or a month). The following is an example of viewing the disk utilization for the last hour:

Using the --density option, a different view can be printed for every graph. This density graph shows not the time series, but how frequently the given values were seen (X-axis represents the density value):

If the terminal does not support Unicode characters, the --only-ascii option can switch them off:

The graphs have colors, where dangerously high values for example are shown in red. The list of nodes can be filtered with --nodes option, where you can specify the node names or use wildcards if convenient.

Process Monitoring

Another cool thing about s9s CLI is it provides a processlist of the entire cluster - a “top” for all nodes, all processes merged into one. The following command runs the "top" command on all database nodes for cluster ID 24, sorted by the most CPU consumption, and updated continuously:

$ s9s process --top --cluster-id=24 Oracle 5.7 Replication - 04:39:17 All nodes are operational. 3 hosts, 4 cores, 10.6 us, 4.2 sy, 84.6 id, 0.1 wa, 0.3 st, GiB Mem : 5.5 total, 1.7 free, 2.6 used, 0.1 buffers, 1.1 cached GiB Swap: 0 total, 0 used, 0 free, PID USER HOST PR VIRT RES S %CPU %MEM COMMAND 12746 root 20 1359348 58976 S 25.25 1.56 cmon 1587 apache 20 462572 21632 S 1.38 0.57 httpd 390 root 20 4356 584 S 1.32 0.02 rngd 975 mysql 20 1144260 71936 S 1.11 7.08 mysqld 16592 mysql 20 1144808 75976 S 1.11 7.48 mysqld 22983 root 20 127368 5308 S 0.92 0.52 sshd 22548 root 20 127368 5304 S 0.83 0.52 sshd 1632 mysql 20 3578232 1803336 S 0.50 47.65 mysqld 470 proxysql 20 167956 35300 S 0.44 0.93 proxysql 338 root 20 4304 600 S 0.37 0.06 rngd 351 root 20 4304 600 R 0.28 0.06 rngd 24 root 20 0 0 S 0.19 0.00 rcu_sched 785 root 20 454112 11092 S 0.13 0.29 httpd 26 root 20 0 0 S 0.13 0.00 rcuos/1 25 root 20 0 0 S 0.13 0.00 rcuos/0 22498 root 20 127368 5200 S 0.09 0.51 sshd 14538 root 20 0 0 S 0.09 0.00 kworker/0:1 22933 root 20 127368 5200 S 0.09 0.51 sshd 28295 root 20 127452 5016 S 0.06 0.13 sshd 2238 root 20 197520 10444 S 0.06 0.28 vc-agent-007 419 root 20 34764 1660 S 0.06 0.04 systemd-logind 1 root 20 47628 3560 S 0.06 0.09 systemd 27992 proxysql 20 11688 872 S 0.00 0.02 proxysql_galera 28036 proxysql 20 11688 876 S 0.00 0.02 proxysql_galera

There is also a --list flag which returns a similar result without continuous update (similar to "ps" command):

$ s9s process --list --cluster-id=25 Job Monitoring

Jobs are tasks performed by the controller in the background, so that the client application does not need to wait until the entire job is finished. ClusterControl executes management tasks by assigning an ID for every task and lets the internal scheduler decide whether two or more jobs can be run in parallel. For example, more than one cluster deployment can be executed simultaneously, as well as other long running operations like backup and automatic upload of backups to cloud storage.

In any management operation, it's would be helpful if we could monitor the progress and status of a specific job, like e.g., scale out a new slave for our MySQL replication. The following command add a new slave, to scale out our MySQL replication:

$ s9s cluster --add-node --nodes="" --cluster-id=24 Job with ID 66992 registered.

We can then monitor the jobID 66992 using the job option:

$ s9s job --log --job-id=66992 addNode: Verifying job parameters. Adding host to cluster. Testing SSH to host. Installing node. Setup new node (installSoftware = true). Setting SELinux in permissive mode. Disabling firewall. Setting vm.swappiness = 1 Installing software. Setting up repositories. Installing helper packages. Upgrading nss. Upgrading ca-certificates. Installing socat. ... Installing pigz. Installing bzip2. Installing iproute2. Installing tar. Installing openssl. Upgrading openssl openssl-libs. Finished with helper packages. Verifying helper packages (checking if socat is installed successfully). Uninstalling existing MySQL packages. Installing replication software, vendor oracle, version 5.7. Installing software. ...

Or we can use the --wait flag and get a spinner with progress bar:

$ s9s job --wait --job-id=66992 Add Node to Cluster - Job 66992 RUNNING [ █] ---% Add New Node to Cluster

That's it for today's monitoring supplement. We hope that you’ll give the CLI a try and get value out of it. Happy clustering!

Related resources   How to Automate Galera Cluster Using the ClusterControl CLI Read the blog   How to use s9s -The Command Line Interface to ClusterControl Read the blog   Command Line Aficionados: Introducing s9s for ClusterControl Read the blog Tags:  s9s CLI MySQL galera monitoring MongoDB PostgreSQL

This Week in Data with Colin Charles 49: MongoDB Conference Opportunities and Serverless Aurora MySQL

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

Beyond the MongoDB content that will be at Percona Live Europe 2018, there is also a bit of an agenda for MongoDB Europe 2018, happening on November 8 in London—a day after Percona Live in Frankfurt. I expect you’ll see a diverse set of MongoDB content at Percona Live.

The Percona Live Europe Call for Papers closes TODAY! (Friday August 17, 2018)

From Amazon, there have been some good MySQL changes. You now have access to time delayed replication as a strategy for your High Availability and disaster recovery. This works with versions 5.7.22, 5.6.40 and later. It is worth noting that this isn’t documented as working for MariaDB (yet?). It arrived in MariaDB Server in 10.2.3.

Another MySQL change from Amazon? Aurora Serverless MySQL is now generally available. You can build and run applications without thinking about instances: previously, the database function was not all that focused on serverless. This on-demand auto-scaling serverless Aurora should be fun to use. Only Aurora MySQL 5.6 is supported at the moment and also, be aware that this is not available in all regions yet (e.g. Singapore).

  • pgmetrics is described as an open-source, zero-dependency, single-binary tool that can collect a lot of information and statistics from a running PostgreSQL server and display it in easy-to-read text format or export it as JSON for scripting.
  • PostgreSQL 10.5, 9.6.10, 9.5.14, 9.4.19, 9.3.24, And 11 Beta 3 has two fixed security vulnerabilities may inspire an upgrade.
Link List Industry Updates
  • Martin Arrieta (LinkedIn) is now a Site Reliability Engineer at Fastly. Formerly of Pythian and Percona.
  • Ivan Zoratti (LinkedIn) is now Director of Product Management at Neo4j. He was previously on founding teams, was the CTO of MariaDB Corporation (then SkySQL), and is a long time MySQL veteran.
Upcoming Appearances Feedback

I look forward to feedback/tips via e-mail at or on Twitter @bytebot.


The post This Week in Data with Colin Charles 49: MongoDB Conference Opportunities and Serverless Aurora MySQL appeared first on Percona Database Performance Blog.