Planet MySQL

MariaDB Talks at Percona Live Data Performance Conference 2016

The Percona Live Data Performance Conference 2016 isn’t until 18-21 April 2016, but community voting is now open. So if you’re going, and want to see lots of MariaDB talks, please click on the buttons below and vote!

You have to be logged in to the Percona site to vote. Once logged in, scroll to the bottom of the talk description page, assign the number of stars that represents your interest in the talk (5 is the best) and click on the “Submit Vote” button. That’s all there is to it.

Here are all of the proposed talks I know of that are either specifically MariaDB-related or are by employees of either the MariaDB Foundation or MariaDB Corporation. If there are others, please let me know!

  • Using and Managing MariaDB by Colin Charles (View & Vote!)
  • MariaDB 10.1 – What’s New? by Colin Charles (View & Vote!)
  • MariaDB/MySQL security essentials by Colin Charles (View & Vote!)
  • MariaDB Connectors: Fast and Smart with the new protocol optimizations by Sergei Golubchik (View & Vote!)
  • MariaDB 10.1 Security: Validation, Authentication, Encryption by Sergei Golubchik (View & Vote!)
  • Window functions in MariaDB by Sergei Petrunia (View & Vote!)
  • Exploring MariaDB Query Optimizer with ANALYZE for statements by Sergei Petrunia (View & Vote!)
  • Databases in the Hosted Cloud by Colin Charles (View & Vote!)
  • Best practices for MySQL High Availability by Colin Charles (View & Vote!)
  • Choosing a MySQL HA solution today by Colin Charles (View & Vote!)
  • The MySQL Server Ecosystem in 2016 by Colin Charles (View & Vote!)
  • Low Impact Monitoring by Gerardo Narvaja (View & Vote!)
  • Characters sets and Collations – All you wanted to know but was afraid to ask by Anders Karlsson (View & Vote!)
  • IBM Power8 – The new powerful open database platform by Anders Karlsson (View & Vote!)
  • Migrating data – Working with datatypes, it might not be what you expect! by Anders Karlsson (View & Vote!)
  • Real-time data streaming to Hadoop using MaxScale by Dipti Joshi (View & Vote!)

My proposals for Percona Live: Window Functions and ANALYZE for statements

I’ve made two session proposals for Percona Live conference:

If you feel these talks are worth it, please vote!


The Power and Ease of Adaptive Fault Detection

Adaptive Fault Detection is a prime example of how efficiently VividCortex can help you understand and optimize your system. We define a “fault” as a special kind of system stall: a period during which applications are asking the server to perform a great deal of work, but that work is getting bottlenecked and therefore not completing.

This understanding of faults is based on Queueing Theory (if you need a refresher on QT, be sure to check out our recent and highly accessible ebook, Everything You Need to Know About Queueing Theory); we detect faults by using advanced statistics and machine learning. VividCortex’s Fault Detection is completely adaptive and self-tuning – it doesn’t require any configuration. The program can detect faults as short as one second in duration. Even the most attentive user would likely fail to notice system stalls so small, but with our Adaptive Fault Detection, they’re easily diagnosed and solved.

But why is it important for users to lock onto such small problems? Well, system performance problems almost always start small and, over time, snowball into much more serious issues. Catching them early is the best way to prevent major performance problems and outages.

When you’re using Adaptive Fault Detection, VividCortex displays faults in an easily understood timeline, running from left to right.

In the example here, you can see three stalls – notated as vertical bars along the timeline – that occurred in one of our production database servers. The widgets beneath the timeline (shown in the image below) illustrate what was happening in the server at the moment of each fault. As you select a fault, you can see a red line appear on the myriad charts, to indicate the precise instant when that fault occurred.

Right away, you’ll notice a few telltale signs of a fault. For instance, take a look at the chart tracking MySQL concurrency: when you hover over the fault in the timeline, you can see that at the moment of the fault, the concurrency spiked up to 141 queries, all trying to run simultaneously – much more than this machine can handle.

Now, let’s look at the fault on the far right of the timeline; we see that at the time it occurred, the newdownsample program had just started running, and we can see that there was a notably high amount of Disk Throughput and CPU activity in the server as a result.

And, again, when we hover over the pixel representing the exact instant of the fault, we see that MySQL concurrency spiked dramatically – in this case, to a clogged 154 queries running at a single moment.

Looking further down the collection of summary widgets, we find Top MySQL Queries. Here, we see the third query listed is quite abusive, arising from the newdownsample program as well. That abusive query whisks resources away from other processes.

Clearly, it would be beneficial to examine this query more closely. Doing so is simple: all you need to do is click to drill down into the widget, to see more top running queries during the relevant time range. You can then select that query and continue drilling down into it with the various tools at your disposal.

Also note that we’ve just introduced our new Profiler tool, which takes our older Top Queries and Top Processes tools to a whole new level. You’ll be able to drill down with more precision and customization than ever before. The APIs are faster too! To read more about the Profiler tool, check out our recent announcement here.

And finally, to watch VividCortex’s Adaptive Fault Detection in action, check out Baron Schwartz’s demo in the video below.


Voting for talks at the Percona Live Data Performance Conference 2016

So this year the Percona Live conference has a new name — it is the “Data Performance Conference” (presumably for a much broader appeal, and because Percona is now in the MongoDB world as well). And the next new thing to note? You have to go through a process of “community voting”, i.e. the speaker has to promote their talks beforehand via their own channels to see how many votes they can get (we tried this before at the MySQL & Friends Devroom at FOSDEM; in this case, please remember you also need to create a new account and actually vote while logged in).

I hope you vote for Sergei’s, Monty’s and my proposals!

  1. Using and Managing MariaDB – a tutorial which has been referred to as The Complete MariaDB Server tutorial; I thought I would change the name up a little, in addition to the content. The most recent version of this tutorial was given at the Percona Live Conference in Santa Clara in 2015 (slides). Since then we’ve released MariaDB Server 10.1, and there are many more new things to talk about!
  2. MariaDB 10.1 – What’s New? – a talk that Michael “Monty” Widenius (creator of MySQL and MariaDB) and I would give together. I’ve described this as a dance, and the last time we did this was at Percona Live Amsterdam. The content will of course be new, and I am creating the slide deck this time around.
  3. Databases in the Hosted Cloud – this is a pet talk. It costs some money to make, and if accepted I plan to also showcase who has better performing hosted databases. I did this at Percona Live Amsterdam 2015 (slides), but since then we’ve seen Amazon offering MariaDB Server as part of RDS, HPCloud being sunset, and also Rackspace upping their offering with High Availability Databases. More research to be done from now till then!
  4. Best Practices for MySQL High Availability – this would be another tutorial, and at Percona Live Amsterdam 2015 it had the highest registered attendance (Kortney told me the day before, and I removed all practicals, since 100+ people with practicals is impossible for one person to manage – slides). I think with the changes in NDBCLUSTER (recently announced at OpenWorld) and the addition of tools in the MHA world (mha-helper), this should have a lot of new information (and, more importantly, a lot of new things to play with).
  5. Choosing a MySQL HA solution today – a talk based on the above tutorial, cut short, to ensure people who are not there on tutorial day will have solutions to think about and take home for implementation in the future.
  6. MariaDB/MySQL security essentials – a talk which focuses on improvements in MariaDB Server 10.1, and MySQL 5.6/5.7, including encryption at rest, easier SSL setup for replication topologies, and even external authentication plugins (eg. Kerberos is almost ready – see MDEV-4691).
  7. The MySQL Server Ecosystem in 2016 – a talk about MySQL and the forks around it, including the private trees that exist (some like the Twitter tree haven’t been updated in a while, but clearly have made inroads in giving us new features). Learn what to use, and what is the best one for your use case. 
  8. MariaDB Connectors: Fast and Smart with the new protocol optimizations – a talk from Sergei Golubchik about the new protocol optimisations in MariaDB Server, and how the connectors take advantage of them.
  9. MariaDB 10.1 Security: Validation, Authentication, Encryption – a talk from Sergei Golubchik focusing on MariaDB 10.1 security improvements; he’s got some amazing slides on encryption that I saw at Percona Live Amsterdam, and you can see a five-minute lightning version from the meetup.

Here’s to happy voting and I hope to give at least some of these talks (if not all!).


Log Buffer #451: A Carnival of the Vanities for DBAs


The show goes on. This Log Buffer Edition picks some blogs which are discussing new and old features of Oracle, SQL Server and MySQL.


Oracle:

  • Directory Usage Parameters (ldap.ora) list the host names and port number of the primary and alternate LDAP directory servers.
  • Data Visualization Cloud Service (DVCS) is a new Oracle Cloud Service. It is a subset offering of the currently supported Business Intelligence Cloud Service (BICS).
  • ORA-24247: network access denied by access control list (ACL).
  • Latches are low-level serialization mechanisms which protect memory areas inside the SGA. They are lightweight and less sophisticated than enqueues, and can be acquired and released very quickly.
  • Handling disks for ASM – when DB, Linux and Storage admins work together.

SQL Server:

  • How to use the Performance Counter to measure performance and activity in Microsoft Data Mining.
  • Phil Factor demonstrates a PowerShell-based technique taking the tedium out of testing SQL DML.
  • Sandeep Mittal provides an introduction to the COALESCE function and shows us how to use it.
  • Hadoop’s many flavors of SQL.
  • Installing and Getting Started With Semantic Search.


MySQL:

  • Support for storing and querying JSON within SQL is progressing for the ANSI/ISO SQL Standard, and for MySQL 5.7.
  • Loss-less failover using MySQL semi-synchronous replication and MySQL Fabric!
  • Memory consumption: the binary format of the JSON data type should consume more memory.
  • This post compares a B-Tree and LSM for read, write and space amplification. The comparison is done in theory and practice, so expect some handwaving mixed with data from iostat and vmstat collected while running the Linkbench workload.
  • If you do not have reliable network access (i.e. in some remote places) or need something really small to store your data, you can now use Intel Edison.


Learn more about Pythian’s expertise in Oracle, SQL Server & MySQL.


MySQL Performance: Improved Connect/sec Rate in MySQL 5.7

This article continues the MySQL 5.7 Performance story, started with the 1.6M QPS on MySQL 5.7 details post, then the 1M QPS on mixed OLTP_RO with MySQL 5.7 article, and the detailed story about why Point-Selects performance is so critical (and why the 1M result published by MariaDB is not fair)..

The current story is about Connect/sec (connect/disconnect) performance improvement in MySQL 5.7 - this kind of metric is very critical for any application which cannot use persistent connections all the time (and many web apps are in such a case). Well, I'd say MySQL from the beginning was extremely good for its lightweight connections, which in the past was the base of success for many web solutions.. However, time goes on, and we no longer have 4 cores as "commodity hardware" (that is rather a smart-watch today ;-)) - so, there was a need to speed up this Connect rate to match higher workloads. This was already greatly done in MySQL 5.6, and finally yet more improved in MySQL 5.7 - you may read all the details directly from our developers - I'll just present here a short summary of where we are today..

So, first of all, how do you test the Connect/sec performance of your MySQL server instance? The simplest way is just to use a standard Sysbench kit, load 10M rows into the sysbench database (1 table or several tables, it doesn't matter -- the main show-stopper here is the Connect code itself), and then run the following:

#!/bin/bash
# ----------------------------------------------------------------
# Connect/sec test
# ----------------------------------------------------------------
for Users in 8 16 32 64 128 256 512 1024
do
  LD_PRELOAD=/usr/lib64/ sysbench --num-threads=$Users \
    --test=oltp --oltp-table-size=10000000 \
    --db-ps-mode=disable --oltp-dist-type=uniform --oltp-table-name=sbtest_10M \
    --max-requests=0 --max-time=300 --mysql-socket=/tmp/mysql.sock \
    --mysql-user=dim --mysql-password=dim --mysql-db=sysbench \
    --mysql-table-engine=INNODB --db-driver=mysql \
    --oltp-point-selects=1 --oltp-simple-ranges=0 --oltp-sum-ranges=0 \
    --oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-skip-trx=on \
    --oltp-read-only=on --oltp-reconnect-mode=query --oltp-connect-delay=0 run
  sleep 30
done
# ----------------------------------------------------------------

The --oltp-reconnect-mode=query option means that on every point-select query your client session will re-connect, and the final QPS result will give you the max Connect/sec rate your MySQL instance is able to reach for a given number of concurrent users.

And here are the results obtained on Intel-based Linux servers, from older to newer generations:

12cores-HT @2.9GHz:

32cores-HT @2.3GHz:

40cores-HT @2.3GHz:

72cores-HT @2.5GHz:

Instead of a SUMMARY:

  • Connect/sec performance depends mainly on the MySQL "connect/disconnect" code itself + CPU chip speed
  • it's hard to speak about "scalability" here, as the max possible Connect/sec rate limit is reached pretty quickly and depends on IP stack performance as well..
  • tuning "thread_cache_size" to something bigger than zero helps here, but not too much (see the snippet just after this list)..
  • MySQL 5.7 shows the best performance here regardless of the HW platform, reaching over 100K Connect/sec on the latest Intel CPU chip
  • there is a clear better-and-better tendency in the MySQL 5.5 => 5.6 => 5.7 results
  • and we can still do yet better with MySQL 5.8! (a question of time and resources - so, please, send us your feedback/ votes/ wishes if you want to see it ;-))
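As a quick illustration of that thread_cache_size point (my own snippet, not from the original article), you can raise the cache at runtime and then watch how often new threads still have to be created during the connect/disconnect test:

SET GLOBAL thread_cache_size = 256;

-- if Threads_created keeps climbing relative to Connections,
-- disconnected sessions are not leaving reusable threads behind
SHOW GLOBAL STATUS LIKE 'Threads_created';
SHOW GLOBAL STATUS LIKE 'Connections';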

MySQL 5.7 rocks! ;-))



Webinar replay & slides for MySQL DBAs: performing live database upgrades in replication & Galera setups

Thanks to everyone who joined us for our recent live webinar on performing live database upgrades for MySQL Replication & Galera, led by Krzysztof Książek. The replay and slides to the webinar are now available to watch and read online via the links below.

During this live webinar, Krzysztof covered one of the most basic, but essential tasks of the DBA: minor and major database upgrades in production environments.

Watch the replay

Become a MySQL DBA: performing live database upgrades - webinar replay from Severalnines AB  

Read the slides

Become a MySQL DBA: performing live database upgrades - webinar slides from Severalnines AB



  • What types of upgrades are there?
  • How do I best prepare for the upgrades?
  • Best practices for:
    • Minor version upgrades - MySQL Replication & Galera
    • Major version upgrades - MySQL Replication & Galera


Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. 

This webinar builds upon recent blog posts and related webinar series by Krzysztof on how to become a MySQL DBA.


Introducing the Profiler Tool

For a long time we’ve been dissatisfied with the fact that we had two separate tools to do the same thing: Top Processes and Top Queries rank processes and queries respectively by a metric of your choice. For example you may rank queries by frequency, or rank databases by total execution time. There’s a lot of shared functionality and at the same time there’s so much more we could do with the metrics we’re capturing from your systems.

Today we are releasing a new tool called Profiler to replace Top Queries and Top Processes, so now we have one tool to do the job of two. Simplifying is always a good thing.

If you are wondering what the Profiler can rank, you’ll be glad to see that not only do we still rank Users, Databases, Hosts, Queries, Verbs, Tags and Processes, but we have also added Tables, Network Sockets, MySQL Commands and MySQL Processlist Queries. And many more are coming soon, of course; this tool is not limited to that set, so if you want to rank things that we don’t have now, don’t hesitate to drop us a line.

Additionally, the set of metrics you can rank by has grown: previously you could, for instance, rank databases by total time and frequency, but now you can choose between 13 combinations, including Warnings, Errors, Affected Rows, Missing Indexes and Slow Queries. There are many more additions, so please go ahead and play with the tool to find out!

But wait, there’s more! The Profiler uses much faster APIs, so if you are one of our customers with a gazillion different queries, who used to wait several seconds while loading the top 10 queries, you’ll be happy to see that the Profiler loads the data much faster.

We are going to add many more features to this tool, to ease finding what you and your team are looking for. Stay tuned for more, and sign up for a free trial if you are interested in a monitoring solution that goes deeper to provide meaningful insight.


Standard SQL/JSON and MySQL 5.7 JSON

Support for storing and querying JSON within SQL is progressing for the ANSI/ISO SQL Standard, and for MySQL 5.7. I'll look at what's new, and do some comparisons.

The big picture

The standard document says

The SQL/JSON path language is a query language used by certain SQL operators (JSON_VALUE, JSON_QUERY, JSON_TABLE, and JSON_EXISTS, collectively known as the SQL/JSON query operators) to query JSON text. The SQL/JSON path language is not, strictly speaking, SQL, though it is embedded in these operators within SQL. Lexically and syntactically, the SQL/JSON path language adopts many features of ECMAScript, though it is neither a subset nor a superset of ECMAScript. The semantics of the SQL/JSON path language are primarily SQL semantics.

Here is a chart that shows the JSON-related data types and functions in the standard, and whether a particular DBMS has something with the same name and a similar functionality.

Standard                 Oracle   SQL Server   MySQL
--------                 ------   ----------   -----
Conventional data type   YES      YES          NO
JSON_VALUE function      YES      YES          NO
JSON_EXISTS function     YES      YES          NO
JSON_QUERY function      YES      YES          NO
JSON_TABLE function      YES      NO           NO

My source re the standard is a draft copy of ISO/IEC 9075-2 SQL/Foundation. For Oracle 12c read Oracle's whitepaper. For SQL Server 2016 read MSDN's blog. My source re MySQL 5.7 is the MySQL manual and the latest source-code download of version 5.7.9.

Now, what is the significance of the lines in the chart?

Conventional data type

By "conventional", I mean that in standard SQL JSON strings should be stored in one of the old familiar data types: VARCHAR, CLOB, etc. It didn't have to be this way, and any DBMS that supports user-defined types can let users be more specific, but that's what Oracle and Microsoft accept.

MySQL 5.7, on the other hand, has decided that JSON shall be a new data type. It's closely related to LONGTEXT: if you say


then you get LONGTEXT. But if you use the C API to ask the data type, you get MYSQL_TYPE_JSON=245 (aside: this is not documented). And it differs because, if you try to put in non-JSON data, you get an error message.
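A minimal sketch of that behaviour (the table and column names are mine, not from the original post):

CREATE TABLE j1 (doc JSON);
INSERT INTO j1 VALUES ('{"x": 1}');   -- accepted
INSERT INTO j1 VALUES ('not json');   -- rejected with "Invalid JSON text"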

At least, that's the theory. It didn't take me long to find a way to put non-JSON data in:


... That unfortunately works, and now if I say "SELECT * FROM j2b;" I get an error message "The JSON binary value contains invalid data". Probably bugs like this will disappear soon, though.

By making a new data type, MySQL has thrown away some of the advantages that come with VARCHAR or TEXT. One cannot specify a maximum size -- everything is like LONGTEXT. One cannot specify a preferred character set and collation -- everything is utf8mb4 and utf8mb4_bin. One cannot take advantage of all the string functions -- BIN() gives meaningless results, for example. And the advantage of automatic validity checking could have been delivered with efficient constraints or triggers instead. So why have a new data type?

Well, PostgreSQL has a JSON data type. As I've noticed before, PostgreSQL can be a poor model if one wants to follow the standard. And it will not surprise me if the MariaDB folks also decide to make a JSON data type, because I know that they are following similar logic for an "IP address" data type.

By the way, the validity checking is fairly strict. For example, '{x:3}' is considered invalid because quote marks are missing, and '{"x":.2}' is considered invalid because the value has no leading digit.
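You can check this strictness yourself with JSON_VALID (my own illustration, not from the original post):

SELECT JSON_VALID('{x:3}');      -- 0: keys must be quoted
SELECT JSON_VALID('{"x":.2}');   -- 0: numbers need a leading digit
SELECT JSON_VALID('{"x":0.2}');  -- 1: valid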

JSON_VALUE function

For an illustration and example it's enough to describe the standard's JSON_VALUE and MySQL's JSON_EXTRACT.

The standard idea is: pass a JSON string and a JavaScript-like expression, get back an SQL value, which will generally be a scalar value. For example,

SELECT JSON_VALUE(@json, @path_string) FROM t;
SELECT JSON_VALUE(json_column_name, 'lax $.c') AS c FROM t;

There are optional clauses for deciding what to do if the JSON string is invalid, or contains missing and null components. Again, the standard's JSON_VALUE is what Oracle and Microsoft accept. There's some similarity to what has gone before with SQL/XML.

MySQL, on the other hand, accomplishes some similar things with JSON_EXTRACT. For example,

SELECT JSON_EXTRACT(@json, @path_string);
SELECT JSON_EXTRACT(json_column_name, '$.c') AS c FROM t;

And the result is not an ordinary MySQL scalar, it has type = JSON. In the words of physicist I.I.Rabi when confronted with a new particle, "Who ordered that?"

Well, JSON_EXTRACT and some of the other MySQL functions have fairly close analogues, in both name and functionality, with Google's BigQuery and with SQLite. In other words, instead of the SQL standard, MySQL has ended up with something like the NoSQL No-standard.

I should stress here that MySQL is not "violating" the SQL standard with JSON_EXTRACT. It is always okay to use non-standard syntax. What's not okay is to use standard syntax for a non-standard purpose. And here's where I bring in the slightly dubious case of the "->" operator. In standard SQL "->", which is called the "right arrow" operator, has only one purpose: dereferencing. In MySQL "->" has a different purpose: a shorthand for JSON_EXTRACT. Since MySQL will never support dereferencing, there will never be a conflict in practice. Nevertheless, technically, it's a violation.
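For completeness, here is what that shorthand looks like in MySQL 5.7 (my own example, reusing the names from the snippets above):

SELECT json_column_name->'$.c' AS c FROM t;               -- shorthand ...
SELECT JSON_EXTRACT(json_column_name, '$.c') AS c FROM t; -- ... for this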

Observed Behaviour

When I tried out the JSON data type with MySQL 5.7.9, I ran into no exciting bugs, but a few features.

Consistency doesn't apply for INSERT IGNORE and UPDATE IGNORE. For example:

CREATE TABLE t1 (date DATE, json JSON);
INSERT IGNORE INTO t1 (date) VALUES ('invalid date');
INSERT IGNORE INTO t1 (json) VALUES ('{invalid json}');

The INSERT IGNORE into the date column inserts NULL with a warning; the INSERT IGNORE into the json column returns an error.

Some error messages might still need adjustment. For example:


Result: an error message = "A BLOB field is not allowed in partition function".

Comparisons of JSON_EXTRACT results don't work. For example:

SET @json = '{"a":"A","b":"B"}';
SELECT GREATEST(
  JSON_EXTRACT(@json,'$.a'),
  JSON_EXTRACT(@json,'$.b'));

The result is a warning "This version of MySQL doesn't yet support 'comparison of JSON in the LEAST and GREATEST operators'", which is a symptom of the true problem, that JSON_EXTRACT returns a JSON value instead of a string value. The workaround is:

SET @json = '{"a":"A","b":"B"}';
SELECT GREATEST(
  CAST(JSON_EXTRACT(@json,'$.a') AS CHAR),
  CAST(JSON_EXTRACT(@json,'$.b') AS CHAR));

... which returns "B" -- a three-character string, including the quote marks.
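If you do not want the surrounding quote marks either, JSON_UNQUOTE can strip them (my own addition to the workaround, not part of the original post):

SET @json = '{"a":"A","b":"B"}';
SELECT GREATEST(
  JSON_UNQUOTE(JSON_EXTRACT(@json,'$.a')),
  JSON_UNQUOTE(JSON_EXTRACT(@json,'$.b')));  -- returns B without the quotes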

Not The End

The standard might change, and MySQL certainly will change anything that's deemed wrong. Speaking of wrong, I might have erred too. And I certainly didn't give justice to all the other details of MySQL 5.7 JSON.


The Ocelot GUI client for MySQL and MariaDB is still version 0.8 alpha, but since the last report there have been bug fixes and improvements to the Help option. Have a look at the new manual by going to the project page and scrolling down till you see the screenshots and the words "User Manual".


Loss-less failover using MySQL semi-synchronous replication and MySQL Fabric!

In my last posts playing with MySQL Fabric, here and here, I used MySQL Fabric version 1.4.1 and MySQL 5.6. I must say lots of things have happened over the last 20 months! MySQL has released several new versions of MySQL Fabric and also a new version of the server, MySQL 5.7.

The current GA version of MySQL Fabric is 1.5.6 and there is also a MySQL Fabric 1.6.2 beta; both versions can be downloaded from here.
As I said earlier, lots of new things have happened; it looks like the focus has been on making it easier to work with MySQL Fabric, but also on making the output from Fabric commands much easier to understand. With the latest version 1.6.2 (not yet GA) Fabric also has support for running multiple Fabric instances; more news can be found here.

Semi-synchronous replication

Semi-synchronous replication is an enhancement to normal replication making it possible to fail over to a slave server without losing any data. In MySQL 5.7 we have introduced some changes making this feature even better; more details can be found here. So, how can we use semi-synchronous replication together with MySQL Fabric? Well, it's not that hard to configure, just a few commands as explained in our manual, but hey, let's try to upgrade the old MySQL Fabric environment and make it use semi-synchronous replication!

Upgrading the environment to MySQL 5.7

Some things in the installation/configuration had to be modified when moving from MySQL 5.6 to 5.7:
  • the installation process has changed a lot (more secure by default); if you want the old behavior, use --initialize-insecure to get some of it back.
  • server-id is now mandatory if you enable binary logging (a quick check is sketched right after this list), more information here and here.
For more information on what's new in MySQL 5.7 and some removed or deprecated functionality, read more here and here.
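A simple way to verify those two settings after the upgrade (my own suggestion, not from the original post):

mysql> SHOW GLOBAL VARIABLES LIKE 'log_bin';
mysql> SHOW GLOBAL VARIABLES LIKE 'server_id';   -- must be non-zero whenever the binary log is enabled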

MySQL Fabric

The next step is to install MySQL Fabric. I will not cover the steps, as this was covered in my old posts and the procedure for the latest version of MySQL Fabric is explained in detail in our manual.

The scenario is: we have MySQL Fabric up and running and 3 standalone MySQL 5.7 servers waiting to be used by MySQL Fabric.

The first step is to create our group and populate it with our 3 MySQL servers. We also run the promote command to elect a primary server; the activate command enables automatic failover if the primary server dies.

mysqlfabric group create group1
mysqlfabric group add group1
mysqlfabric group add group1
mysqlfabric group add group1
mysqlfabric group promote group1
mysqlfabric group activate group1
Let's look at the current status of our group of servers:

ted@ted-PORTEGE-Z30-A:/opt/MySQL-fabricII$ mysqlfabric group lookup_servers group1
Fabric UUID:  5ca1ab1e-a007-feed-f00d-cab3fe13249e
Time-To-Live: 1
                         server_uuid         address    status       mode weight
------------------------------------ --------------- --------- ---------- ------
8066541b-9423-11e5-ab31-7c7a910307fa   PRIMARY READ_WRITE    1.0
839f04ac-9423-11e5-ab4d-7c7a910307fa SECONDARY  READ_ONLY    1.0
8681ea65-9423-11e5-ad0f-7c7a910307fa SECONDARY  READ_ONLY    1.0
All looks fine, our servers are up and running and group is ready!

Enabling semi-synchronous replication

This whole procedure can be done without restarting our secondary server. For safety, we set the secondary server to SPARE mode so it will not be elected as primary in case of a failover whilst we are working with the server.

On the primary server, load the semi-sync plugin and enable semi-sync replication:

mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME '';
mysql> SET GLOBAL rpl_semi_sync_master_enabled = 1;
On one or both of the secondary servers: load the semi-sync plugin, enable semi-sync replication, set the server to SPARE mode and restart the replication channel.

mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME '';
mysql> SET GLOBAL rpl_semi_sync_slave_enabled = 1;
mysqlfabric server set_status <server-uuid> SPARE
mysql> STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;
mysqlfabric server set_status <server-uuid> SECONDARY
Verify semi-synchronous replication is up and running on slaves:

mysql> show status like  'Rpl_semi_sync_slave_status';
| Variable_name              | Value |
| Rpl_semi_sync_slave_status | ON    |
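You can also confirm things from the primary's side (my own addition, not part of the original walkthrough):

mysql> SHOW STATUS LIKE 'Rpl_semi_sync_master_status';   -- ON when semi-sync is active
mysql> SHOW STATUS LIKE 'Rpl_semi_sync_master_clients';  -- number of semi-sync slaves connected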
We are done, you now have semi-sync replication up and running in your group!
My MySQL Fabric wish list for the future is:
  • Feature request Semisynchronous replication support in MySQL Fabric.
  • Fabric GUI section in workbench for managing my groups of servers. I would also like to extend this functionality to handle basic monitoring and help to architect/deploy groups or failed servers.
  • Support for MySQL Group Replication in fabric, so we have three different ways of replicating data between servers in our HA-groups.


JSON memory consumption

I got some more questions on the new JSON data type and functions during our TechTours. And I'd like to summarize the answers in this blogpost.

Memory consumption

The binary format of the JSON data type should consume more memory. But how much? I did a little test by comparing a freshly loaded 25,000 row dataset stored as JSON and stored as TEXT. Seven top-level attributes per JSON document. Average JSON_DEPTH is 5.9. Let's see:
mysql> DESC data_as_text;
| Field | Type | Null | Key | Default | Extra |
| id | int(11) | NO | PRI | NULL | |
| doc | text | YES | | NULL | |
2 rows in set (0.00 sec)

mysql> SELECT COUNT(*),AVG(JSON_LENGTH(doc)) FROM data_as_text;
| COUNT(*) | AVG(JSON_LENGTH(doc)) |
| 25359 | 7.0000 |
1 row in set (0.81 sec)

mysql> DESC data_as_json;
| Field | Type | Null | Key | Default | Extra |
| id | int(11) | NO | PRI | NULL | auto_increment |
| doc | json | NO | | NULL | |
2 rows in set (0.00 sec)

mysql> SELECT COUNT(*),AVG(JSON_LENGTH(doc)) FROM data_as_json;
| COUNT(*) | AVG(JSON_LENGTH(doc)) |
| 25359 | 7.0000 |
1 row in set (0.08 sec)

mysql> select name,allocated_size/1024/1024 AS "size in MB" from information_schema.innodb_sys_tablespaces where name like "%temp%";
| name | size in MB |
| temp/data_as_json | 23.00390625 |
| temp/data_as_text | 22.00390625 |
2 rows in set (0.00 sec)
The increased memory consumption is 1/22 in this case, which is roughly 4.5%. At the same time you see the benefit: the full table scan with some JSON operation has a 90% reduction in runtime when using the JSON datatype.
Don't take this number as universal. Of course it depends on the number of JSON attributes, character set and other factors. It is just a rough indication. If you want all the details, look at the JSON architecture in WL#8132.

A Gratitude List

A gratitude list is one of my favorite ways to influence my mood. Here’s a gratitude list on Thanksgiving Day 2015 (United States) to express some of the reasons we are so fortunate at VividCortex to have great customers, employees, investors, advisors, friends, vendors, community, and more. This list will omit lots of things and people equally worthy of inclusion, but that’s the point–a gratitude list only makes you aware of how much larger than 26 the list really is.

Amazon Web Services, without whom VividCortex would have been much harder to get off the ground.

Bugsnag, which helps us diagnose many confusing problems our customers experience but we can’t reproduce.

Customers. All of them.

Dyn. A special customer who took a chance on us before we were a “safe” service to risk trying.

Employees, past, present, and future (we’re hiring).

Free Software. As in beer, and as in speech. Several of us are FSF members.

Go, which has probably reduced our technical overhead by 50% through compounding effects in various ways.

For being an inspiration and for cultivating interest in the disciplined pursuit of better performance.

Intercom for connecting us and our customers more directly and frictionlessly. Customers consistently praise “our” support chat.

Julien Schmidt and Arne Hormann for writing the MySQL driver for Go that we use.

Kate Matsudaira for advice, inspiration, and encouragement from the first early ideas onward.

LinkedIn for connecting the world of business in new ways.

MySQL for so many reasons.

Nespresso several times a day.

Open Source. Is this a duplicate of Free Software? Doesn’t matter.

Percona to whom I personally owe so much.

Quorum Algorithms and Raft for making possible many distributed systems such as Consul and Zookeeper, both of which we use. Also for being cool.

Static Site Generators which is why we haven’t been hacked and turned into an unwilling assistant of some nefarious corner of the Internet.

Time Off to reflect like today.

Uber for ferrying us around between venture capitalists and customers and conferences and billiards.

VictorOps for being our event timeline and communications backbone when things get a little dicey.

WCGW gifs to keep it light-hearted on the chat channel (and give us more reasons to be thankful).

X might be an eXercise for the reader, or it could be for the oXford Comma.

Y is also coming up blank for me right now.

Zlib for helping reduce our data ingest and Kafka disk space usage.

Thank you.


Smallest MySQL Server in the World (under $60) which can even make you toast while fixing MySQL bug #2

In my last blog post, Internet of Things, Messaging and MySQL, I showed how to start your own Internet of Things with the Particle Photon board. That implementation is great, but requires constant internet (wi-fi) access as the Particle Photon board does not have any local storage. If you do not have reliable network access (i.e. in some remote places) or need something really small to store your data you can now use Intel Edison. I’ve even installed MySQL on Edison, which makes it the smallest (in size) MySQL server in the world! Other options include:

MySQL Configuration

Intel Edison is a tiny computer based on a 22 nm Silvermont dual-core Intel Atom CPU at 500MHz, running Linux (a Yocto-based distribution). To program the Edison we will need a breakout board. Options include an Arduino-compatible breakout board (which includes an SD card slot) and a small Intel breakout board.

The installation and configuration is straightforward. I’ve used the Get Started with Yocto Project on the Intel Edison Board guide to setup and configure the board. First we need to connect to Edison via serial connection and configure sshd and Wi-Fi; when it is done we can connect to Edison using SSH.

The MySQL installation is relatively easy as Linux generic binaries are compatible with Yocto Linux (so you do not have to compile anything). There are 2 challenges though:

  • By default the Yocto linux (as well as the official repository) does not include libraries needed for MySQL: libaio1, libcrypto, libssl
  • The internal storage is tiny and MySQL 5.7 binaries did not even fit into any partition. I had to remove some “tests” and other stuff I do not need. For the real installation one can use SD card (SD slot is available on some boards).

To install the libraries I’ve used the un-official Edison repositories following the excellent guide: Edison package repo configuration. Setup is simple:

To configure your Edison to fetch packages from this repo, replace anything you have in /etc/opkg/base-feeds.conf with the following (other opkg config files don’t need any changes):

src/gz all
src/gz edison
src/gz core2-32

Then we need to install the libraries:

# opkg install libaio1_0.3 libcrypto1.0.0 libssl1.0.0

Finally we can download Percona Server 5.6 and place it somewhere (use basedir in my.cnf to point to the installation path):

# wget

Please note that the latest Percona Server 5.6 depends on the Numa library and there is no such library for Yocto (does not make sense for Edison). So 5.6.25 is the latest Percona Server you can install here.
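Going back to the basedir note above, here is a minimal my.cnf sketch for pointing the server at the unpacked tarball and at SD-card storage (the paths are hypothetical, adjust them to your layout):

[mysqld]
basedir = /usr/local/mysql
datadir = /media/sdcard/mysql-data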

The simple (and rather useless) benchmark on Intel Edison:

root@edison:/usr/local/mysql# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 74
model name      : Genuine Intel(R) CPU 4000 @ 500MHz
...

mysql> SELECT BENCHMARK(10000000,ENCODE('hello','goodbye'));
+-----------------------------------------------+
| BENCHMARK(10000000,ENCODE('hello','goodbye')) |
+-----------------------------------------------+
|                                             0 |
+-----------------------------------------------+
1 row in set (18.77 sec)

Can MySQL make you toast?

The famous MySQL Bug #2, submitted 12 Sep 2002, states that “MySQL Connector/J doesn’t make toast”. With Intel Edison and an Arduino-compatible breakout board it is now trivial to fix this bug: not only MySQL Connector/J but also the MySQL server itself can make you toast! This can be done via a UDF or, in MySQL 5.7, with Query Rewrite Plugins, so you can execute the MySQL query:

mysql> make toast;

For the actual implementation you can either “hack” an existing toaster to interface with the breakout board pins or use an Arduino-compatible robotic arm. Ok, MySQL, make me toast!

The post Smallest MySQL Server in the World (under $60) which can even make you toast while fixing MySQL bug #2 appeared first on MySQL Performance Blog.


Press Release: Severalnines aims for the stars with EGO boost

Swedish database technology ensures intergalactic research data protected “to infinity and beyond” for scientists

Stockholm, Sweden and anywhere else in the world - Severalnines, the provider of database infrastructure management software, today announced its latest customer, the European Gravitational Observatory (EGO). Severalnines’ ClusterControl platform is used to manage and monitor the operations of some of EGO’s open source databases and help scientists to track, analyse and share data gathered from research on gravitational waves.  

EGO is one of the world’s leading research structures devoted to the study of gravitational waves. EGO hosts the Virgo experiment, a project involving about 300 scientists spread across 19 laboratories in France, Italy, Poland, the Netherlands and Hungary. The observatory also collaborates with a wider network of scientists based in the United States and Asia. Gravitational waves were predicted one century ago by Einstein's theory of General Relativity but they have never been observed directly, although there is indirect evidence of their existence. The direct detection of gravitational waves, the main goal of  experiments such as Virgo, will solve one of the long-standing puzzles of Einstein's theory and provide a new powerful tool to observe the Universe.

Research on gravitational waves creates at least 6 Terabytes of new data every day. The data gathered is used for scientific research into gravitational waves and to help the development of the cutting-edge technologies used by Virgo, which span various fields, from mechanics to optics and electronics.  

The scale of data created on a daily basis provides challenges for EGO’s IT team which has to ensure research data is not lost either when at rest or in transfer. The IT team needed a robust technology which could work with various open and proprietary IT environments, and around-the-clock access to database performance and management expertise.

After a review process, Severalnines was selected by EGO in January 2015 after the team read a series of strong online reviews on how Severalnines’ ClusterControl platform manages MySQL database clusters. EGO tested ClusterControl on its Scientific Linux platform, based on the popular Red Hat system, during the trial period. The ease of installation during testing was another reason why EGO’s IT team decided to work with Severalnines beyond the trial period.

With ClusterControl, EGO can manage and monitor all database clusters on a single dashboard and reduce time spent on backing-up research data. Severalnines helped EGO to focus on new business initiatives such as improving the sharing and presentation of research data on various graphical formats.

Giuseppe Di Biase, Systems Administrator at EGO, said: “Data is so important to the work EGO carries out on a daily basis. Without it, important scientific discoveries based on gravitational waves may be missed. The Severalnines team offered excellent, personalised support and gave us practical advice on how to enhance our systems. I still learn something new every time I talk to Severalnines; they have made my job easier.”

Vinay Joosery, Severalnines CEO said: “ We are very proud at Severalnines to support the leading edge of science with the most advanced database management software. The scale of EGO’s operation is tremendous and it relies on stable and robust technology, we look forward to taking this relationship to another world.”

About Severalnines

Severalnines provides automation and management software for database clusters. We help companies deploy their databases in any environment, and manage all operational aspects to achieve high-scale availability.

Severalnines' products are used by developers and administrators of all skills levels to provide the full 'deploy, manage, monitor, scale' database cycle, thus freeing them from the complexity and learning curves that are typically associated with highly available database clusters. The company has enabled over 7,000 deployments to date via its popular online database configurator. Currently counting BT, Orange, Cisco, CNRS, Technicolor, AVG, Ping Identity and Paytrail as customers.  Severalnines is a private company headquartered in Stockholm, Sweden with offices in Singapore and Tokyo, Japan. To see who is using Severalnines today visit,

About the European Gravitational Observatory

The European Gravitational Observatory (EGO) is a French-Italian private Consortium founded by a joint agreement between the French Centre National de la Recherche Scientifique (CNRS) and the Italian Istituto Nazionale di Fisica Nucleare (INFN). Both these research institutions equally finance the running of the Consortium since its foundation in December 2000.

In 2009 EGO was pleased to welcome the Dutch Institute for Subatomic Physics (Nikhef) as Observer within its Council. Today about 60 people work at the headquarters of EGO in Cascina, a small town about 10 kilometers from Pisa, in the Italian region of Tuscany. The main purpose of EGO is to ensure the construction of Advanced Virgo and to guarantee its operations, maintenance, and upgrade, as well as the exploitation of scientific data. EGO is integrated to the Virgo Collaboration by providing a large and crucial support to the experiment and is fully responsible for the site of the Virgo interferometer. Another key goal of EGO is to promote the studies on gravitational physics in Europe and to be on the frontline of the gravitational wave research.

For more information, contact Jean-Jérôme Schmidt:


long, strange trip

Thanks to Google’s insatiable appetite for scanning and offering documents, you can take a trip back to a 1994 issue of Computer World magazine, read about Windows NT, cc:Mail, Pentium processors, and see an ad for Oracle CDE, featuring yours truly looking over his shoulder. (Remember Oracle CDE?) This was my first Oracle DBA gig, and the team received a Smithsonian Award for a global client/server customer information system that we built on Oracle 7 (and DB2, via Oracle Transparent Gateway).


Read, write & space amplification - B-Tree vs LSM

This post compares a B-Tree and LSM for read, write and space amplification. The comparison is done in theory and practice so expect some handwaving mixed with data from iostat and vmstat collected while running the Linkbench workload. For the LSM I consider leveled compaction rather than size-tiered compaction. For the B-Tree I consider a clustered index like InnoDB.

The comparison in practice provides values for read, write and space amplification on real workloads. The comparison in theory attempts to explain those values.

B-Tree vs LSM in theory
Read Amplification
Most comparisons should be done for a specific context including the hardware and workload. For now I am only specific about the cache hit rate. For the B-Tree I assume that all non-leaf levels are in cache. For the LSM I assume that everything but the data blocks of the largest LSM level are in cache. While an LSM with leveled compaction has more things to keep in the cache (bloom filters) it also benefits from a better compression rate and the cache requirements are similar to a clustered B-Tree.
Worst-case disk read-amp for point queries is 1 for the B-Tree and the LSM as one block is read from the B-Tree leaf level and largest LSM level. Disk read-amp for range queries is 1 or 2 for a short range scan assuming that 1 or 2 blocks from the B-Tree leaf level and LSM max level are read. Note the impact of my assumption for cached data. While many files might be accessed for a short range query with an LSM everything but the max level data blocks are in cache.
The number of key comparisons can be used as the in-memory read-amp. For a B-Tree with 1M keys there are about 20 key comparisons on a point query. For a range query with a B-Tree there is one additional comparison for each row fetched.
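As a sanity check on that 20-comparison figure (my own arithmetic, not from the post): a point lookup costs roughly the binary-search depth of the key set, log2(N) = log2(1,000,000) ≈ 19.9, which rounds to the 20 quoted above.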
It is harder to reason about the number of comparisons for an LSM. Bloom filters can be used for a point query to avoid comparisons but when there are too many files in level 0 then there will be too many bloom filter checks. Bloom filters don't work for range queries, ignoring prefix bloom filters. When query processing is IO-bound I don't expect key comparison overhead to make a difference between an LSM and B-Tree. So I will ignore this for now.
If you want to maximize the ratio of the database to cache sizes while doing at most one disk read per point query then an LSM with leveled compaction or a clustered B-Tree are the best choices. For a clustered B-Tree the things that must be in memory are one key per leaf block and all non-leaf levels of the index. An LSM with leveled compaction has similar requirements, although it also needs some of the bloom filters to be in memory.
The cache requirement is much larger for an LSM with size-tiered compaction. First, the max level has ~50% of the data compared to ~90% with leveled compaction, and it is less likely that all data except the max file are in cache. Second, there are more old versions of key-value pairs, space-amp is larger, so there is more data that needs to be in the cache.
An unclustered B-Tree index also requires more memory to keep the important bits in cache. The important bits are all keys, which is much more memory than one key per leaf block for a clustered B-Tree.
Write Amplification
For now I assume that flash storage is used so I can focus on bytes written and ignore disk seeks when explaining write-amp. For a B-Tree a change is first recorded in the redo log and the page is eventually written back. The worst case occurs when the buffer pool is full with dirty pages and reading the to-be-modified page into the buffer pool forces a dirty page to be evicted and written back. In this case there is a redo log write and a page write back per row change. If the row is 128 bytes and the page is 4096 bytes then 4096+128 bytes are written to storage per 128 byte row change. The write-amp is 33 -- (4096 + 128) / 128. The write-amp is reduced when there is more than one changed row on a page or when one row is changed many times before write back.
For the LSM the redo log is written immediately on a row change. When the memtable is full and flushed to level 0 then the row is written again. When level N is full and compaction is done from level N to level N+1 then one SST file is read from level N, ~10 SST files are read from level N+1 and ~10 files are written back to level N+1. The write-amp to move rows from level N to N+1 is ~10 given my handwaving but in practice it is ~7 and I am waiting for a paper to be published to explain that. The total write-amp is computed from the writes required to move a row change from the memtable to the max level. The write-amp is 1 for the redo log, 1 for the memtable flush and usually ~1 for compacting to level 1. Assuming the LSM has levels 0 to 4 and the per-level write-amp is 7 for levels 2 to 4 then the total write-amp is 24 -- 1 + 1 + 1 + 7 + 7 + 7.

From the examples above the LSM has less write-amp than the B-Tree but those examples were not meant to be compared. An LSM tends to have less write-amp than a B-Tree. When using flash storage this means the device will last longer. When using disk storage this is likely to save more IO capacity for reads leading to higher QPS.
The IO pattern for a busy LSM is concurrent streams of IO. Each stream writes files sequentially, but the writes from different streams can end up in the same logical erase block (logical means it is striped across many NAND chips). The level of the leveled compaction LSM predicts the lifetime of the write. Writes to level 0 have a short lifetime. Writes to level 4 have a much longer lifetime. The write rates per level are similar -- there might be 10 MB/second of writes to levels 0 and 1 and then 20 MB/second of writes to levels 2 through 4. This means that logical erase blocks will end up with a mix of long and short lived data and the long-lived data will get copied out during flash garbage collection. Does your flash device use logical erase blocks? If it does then there will be write-amp from flash GC even with an LSM. Flash devices that support multi-stream will help a lot.
Space Amplification
A B-Tree gets space-amp from fragmentation, per-row metadata and fixed page sizes on disk. The leaf pages in a B-Tree are between 50% and 70% full when subject to random updates. When they are 2/3 full then space-amp is 1.5 and when they are 1/2 full then space-amp is 2. An update-in-place B-Tree like InnoDB uses ~20 bytes/row for metadata to support consistent read and transactions. The metadata overhead is much smaller for an LSM like MyRocks. Finally, when compression is done for InnoDB there will be wasted space because page sizes are fixed on disk. When a 16kb in-memory page compresses to 5kb for a table that uses 8kb pages on disk, 3kb of the 8kb page on disk is wasted.

An LSM gets space-amp from old versions of key-value pairs. Leveled and size-tiered compaction differ significantly in this regard. With leveled compaction you are likely to get space-amp of 1.1 or 1.2 and with size-tiered compaction a more common result is space-amp of 2. Size-tiered compaction can suffer even more from additional but temporary space-amp when the max file is compacted and disk space is required for the old and new version of that file.

Compression reduces space-amp and for this reason I claim that space-amp of less than 1 is possible.
B-Tree vs LSM in practice
This post is longer than I expected, so I will write less here. This is a good spot for a joke about space-amp and write-amp. I have begun reporting on read, write and space amplification by normalizing the server's IO and CPU rates by QPS during benchmarks. I use iostat to get data for on-disk read-amp and write-amp by measuring reads/second, MB read/second and MB written/second. I frequently ignore writes/second because that mixes fast and slow writes (redo log writes are fast, page writes are slow). I use vmstat to measure the CPU utilization and that is a proxy for the in-memory read-amp and write-amp. Finally I look at the size of the database on disk to compare space-amp. The data is usually measured over 1 hour intervals to make it easy to detect when metrics get worse as a database ages. I try to run workloads for at least 12 hours to give things time to go bad.
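To make that normalization concrete (made-up numbers, mine rather than the post's): if the workload sustains 1,000 row changes per second of ~100-byte rows, that is about 0.1 MB/s of logical change; if iostat shows 10 MB/s written to the device over the same interval, the measured write-amp is roughly 10 / 0.1 = 100. Disk read-amp is computed the same way from MB read per second relative to the logical read rate.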

Percona has begun doing this for some benchmark reports. I hope this becomes a common practice.
This is an example from running Linkbench for MongoDB with RocksDB and WiredTiger. I will soon have more results for MyRocks. I am thrilled that we have a copy-on-write B-Tree (WiredTiger) and an LSM (RocksDB) available as storage engines in MongoDB. We are also bringing RocksDB to MySQL via the MyRocks effort. The big deal for MyRocks compared to InnoDB is half the space-amp and half the write-amp. This has been measured on Linkbench and on the real workload. This is a big deal.

Tarski and Codd

Wikipedia says that “Relational calculus is essentially equivalent to first-order logic, and indeed, Codd’s Theorem had been known to logicians since the late 1940s.”  I couldn’t find the cited sources online, but did find these interesting papers:

Applications of Alfred Tarski’s Ideas in Database Theory


Tarski’s influence on computer science.” (see the section starting “The final thing I want to tell something about is the connection of Tarski’s ideas and work with database theory.”)

If you’ve studied mathematical logic (or math at all, eg, topology), you are probably familiar with Tarski’s name.  The historical development of mathematical logic and relational database theory is an interesting topic I’d like to understand better.    Codd’s 1970 paper is fairly readable, and this 1972 paper is related to the correspondences between the different approaches taken.


Percona Live Data Performance Conference 2016 - CFP and Community voting

The call for participation for Percona Live Data Performance Conference 2016 is still open. Deadline for submission is November 29th.

There are two immediately noticeable novelties in this edition:

  • The name change. Not "MySQL & Expo" but "Data Performance Conference." It makes the conference open to a broader set of topics.
  • The community voting. Proposals can get evaluated by the community before the review committee takes decisions.

I think it's a good choice. Other conferences adopt the same method. The attendees choose what they want to see and hear. In this case, it's a mixed method, where the community voting is used as an indication for the review committee, which, by my understanding, has the final say.

Vote for my proposals!

Below are my proposals. Clicking on the links below will take you to the conference site, where you can say if you want to see these talks or not. You will need to register (to the site, not yet to the conference) in order to cast votes.

  • Here is a talk that is in continuous evolution. It discusses the latest advances in replication, and gives an honest evaluation of the features from a new user standpoint. This talk, if accepted, will be updated with the latest novelties in MariaDB and MySQL, if they come out in time for the conference. You can see in my blog six articles covering related matters.
  • Another tutorial, this one aimed at users of containers who want to get started with this exciting technology. Also for this topic I have written a few articles.
  • This is a short talk that wants to explain the differences between deployment methods. Standalone physical servers, sandboxes, virtual machines, and containers are choices that require some information to get started. This talk, for which I also wrote an article, wants to show the good and bad of each choice.
  • This is a lightning talk, which is not about data performance, but it's a geeky topic and I was asked to submit it. So here it is!


MariaDB CONNECT Storage Engine JSON Autodiscovery

The MariaDB CONNECT storage engine offers access to JSON files and allows you to see an external JSON file as a MariaDB table. A nice feature of the CONNECT storage engine is its capability to auto-discover a table structure when the table corresponds to external data. In our case the CONNECT storage engine will automatically [...]
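For readers new to the feature, here is a minimal sketch of what such an auto-discovered table can look like (the file name is hypothetical and this is my illustration, not the post's own example): omit the column list and CONNECT derives it from the JSON documents.

CREATE TABLE jsample
ENGINE=CONNECT TABLE_TYPE=JSON FILE_NAME='biblio.json';

DESCRIBE jsample;   -- columns discovered from the documents in biblio.json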