Planet MySQL

All things RocksDB at Percona Live Europe 2017

There is plenty of content about MyRocks, MongoRocks and RocksDB at Percona Live Europe. Go here for registration and use the code SeeMeSpeakPLE17 for a discount:

JOIN Elimination: An Essential Optimiser Feature for Advanced SQL Usage

The SQL language has one great advantage over procedural, object oriented, and “ordinary” functional programming languages. The fact that it is truly declarative (i.e. a 4GL / fourth generation programming language) means that a sophisticated optimiser can easily transform one SQL expression into another, equivalent SQL expression, which might be faster to execute.

How does this work?

Expression Tree Equivalence

Here’s an introduction to equivalent expression trees from my SQL Masterclass: Let’s assume we want to evaluate the following expression:

A + (A - B)

Now in maths, it can be proven trivially that by the laws of associativity, the above expression is really the same as this one:

(A + A) - B

We didn’t really win anything yet, but we can equally trivially turn the above addition into a multiplication that can be proven to be exactly equivalent:

(2 * A) - B

Now, imagine that A is an extremely “expensive” value, e.g. a value that we have to fetch from the disk, or worse, from the network. The cost of accessing A is thus very high and if we can avoid one access to A by the above transformation, then the resulting expression is much faster to evaluate than the original one, even if mathematically speaking, it does not matter at all.

That’s what optimisers do all the time. They transform expressions into equivalent expressions which are faster to execute. And they constantly adapt, so if the DBA chooses to move A from a remote server to the local server, thus reducing the cost of access to A, perhaps, suddenly, the original plan will be better again, because of the cost of multiplication (just an example).

A SQL Example

An equally trivial SQL example from my SQL Masterclass shows that it really doesn’t matter mathematically if we run this query:

SELECT first_name, last_name FROM customer WHERE first_name = 'JAMIE'

Or this one:

SELECT * FROM ( SELECT first_name, last_name FROM customer ) WHERE first_name = 'JAMIE'

With the SQL language, it may be a bit harder to see that these are exactly equivalent SQL statements, but if we translate the above queries to relational algebra, it may become more visible:

[Figure: selection applied before projection, i.e. WHERE before SELECT]

[Figure: projection applied before selection, i.e. SELECT before WHERE]

Don’t be fooled by relational algebra‘s term “selection”. It does not correspond to the SELECT clause, but to the WHERE clause!
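In standard relational algebra notation (a sketch of what the two figures above express, with selection written as σ and projection as π), the equivalence reads:

\[
\sigma_{\text{first\_name} = \text{'JAMIE'}}\bigl(\pi_{\text{first\_name},\,\text{last\_name}}(\text{customer})\bigr)
\;=\;
\pi_{\text{first\_name},\,\text{last\_name}}\bigl(\sigma_{\text{first\_name} = \text{'JAMIE'}}(\text{customer})\bigr)
\]

The projection keeps FIRST_NAME, so the selection can be pushed below it or pulled above it without changing the result.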

We can prove (let’s leave the exercise to the reader), that both expressions are exactly equivalent, so optimisers can pick whichever they have a more efficient matching algorithm for:

  • Ordinary row stores will probably apply the selection first, before projecting, as the expensive operation is accessing rows; if we can filter rows early (e.g. through indexes), the whole query becomes less expensive.
  • A column store might (just a hypothesis) apply the projection first, as the expensive operation is accessing the columns. We then might have to traverse the entire column anyway to filter out the rows.

Let’s Talk About JOINs (and Their Elimination)

JOIN elimination is one of the most basic and yet most powerful SQL transformations, which is implemented by most modern databases in one way or another. It is so powerful, because we’re writing (potentially useless) JOINs all the time, when writing a bit more complex queries. See also our article about JOINs for a general overview.

Now, consider the following simplified schema, taken from the Sakila database:

CREATE TABLE address (
  address_id INT NOT NULL,
  address VARCHAR(50) NOT NULL,

  CONSTRAINT pk_address PRIMARY KEY (address_id)
);

CREATE TABLE customer (
  customer_id INT NOT NULL,
  first_name VARCHAR(45) NOT NULL,
  last_name VARCHAR(45) NOT NULL,
  address_id INT NOT NULL,

  CONSTRAINT pk_customer PRIMARY KEY (customer_id),
  CONSTRAINT fk_customer_address
    FOREIGN KEY (address_id) REFERENCES address(address_id)
);

Let’s ignore indexing and other useful features for this example.

INNER JOIN Elimination

The following query shows a common JOIN use-case, joining a parent table (ADDRESS) to a child table (CUSTOMER) in a to-one relationship:

SELECT c.* FROM customer c JOIN address a ON c.address_id = a.address_id

We fetch all customers and their addresses. But observe: We project only columns from the CUSTOMER table and we don’t have any predicates at all, specifically not predicates using the ADDRESS table. So, we’re completely ignoring any contributions from the ADDRESS table. We never really needed the JOIN in the first place!

And in fact, the optimiser can prove this too, because of the FOREIGN KEY constraint on C.ADDRESS_ID, which guarantees that every CUSTOMER record has exactly one corresponding ADDRESS record. The JOIN does not duplicate, nor remove any CUSTOMER rows, so it is unneeded and can thus be eliminated (by some databases, not all; I list which ones at the end of the article).

So, the database can rewrite the SQL statement to the following, equivalent SQL statement in the presence of said FOREIGN KEY:

SELECT * FROM customer c

Now, quite obviously, this query will be faster than the previous one, if the entire JOIN can be avoided, and thus the entire access to the ADDRESS table. Neat, huh? Who would have thought that FOREIGN KEYs can be so useful in terms of performance.
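If you want to check whether your own database applies this rewrite, a simple approach (sketched below; the exact EXPLAIN syntax and output format differ per database) is to compare the execution plans of the two statements. With JOIN elimination in effect, the ADDRESS table should not show up in the plan of the joined query at all:

-- A sketch only: if elimination kicks in, ADDRESS is absent from this plan
EXPLAIN
SELECT c.*
FROM customer c
JOIN address a ON c.address_id = a.address_id;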

OUTER JOIN Elimination

A LEFT [ OUTER ] JOIN will JOIN the right table to the left table but keep rows from the left table if there is no match (again, an explanation of joins can be seen here). When we apply LEFT JOIN to the previous query…

SELECT c.* FROM customer c LEFT JOIN address a ON c.address_id = a.address_id

… then we’ll fetch all rows from CUSTOMER regardless if that customer has any ADDRESS. This is useful if the FOREIGN KEY is optional (nullable), or completely absent, e.g. through:

ALTER TABLE customer DROP CONSTRAINT fk_customer_address

OUTER JOIN is even easier to eliminate, as it doesn’t require any FOREIGN KEY constraint for the database to prove that it is unneeded. A UNIQUE constraint on the parent table (here: ADDRESS.ADDRESS_ID) is sufficient to show that for every CUSTOMER there can be at most one ADDRESS, so the LEFT JOIN won’t duplicate any CUSTOMER rows (unlike INNER JOIN, OUTER JOIN never removes rows).

Hence, the above query can again be rewritten to the more optimal:

SELECT * FROM customer c

OUTER JOIN Elimination with DISTINCT

Another interesting case of OUTER JOIN elimination is the following one, which unfortunately didn’t work on Oracle for one of our customers recently, in a complex query that ran rogue. Let’s look at some other tables of the Sakila database, which expose a to-many relationship:

CREATE TABLE actor (
  actor_id numeric NOT NULL,
  first_name VARCHAR(45) NOT NULL,
  last_name VARCHAR(45) NOT NULL,

  CONSTRAINT pk_actor PRIMARY KEY (actor_id)
);

CREATE TABLE film (
  film_id int NOT NULL,
  title VARCHAR(255) NOT NULL,

  CONSTRAINT pk_film PRIMARY KEY (film_id)
);

CREATE TABLE film_actor (
  actor_id INT NOT NULL,
  film_id INT NOT NULL,

  CONSTRAINT pk_film_actor PRIMARY KEY (actor_id, film_id),
  CONSTRAINT fk_film_actor_actor
    FOREIGN KEY (actor_id) REFERENCES actor (actor_id),
  CONSTRAINT fk_film_actor_film
    FOREIGN KEY (film_id) REFERENCES film (film_id)
);

Now, consider this query:

SELECT DISTINCT first_name, last_name FROM actor a LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id

We’re looking for all actors and their films, but then we project only distinct actors. Again, the JOIN to FILM_ACTOR doesn’t contribute anything to the result, but because we’re joining a to-many relationship here (from parent table ACTOR to child table FILM_ACTOR), the JOIN is producing duplicate rows. Without DISTINCT, we’d get something like:

FIRST_NAME   LAST_NAME
----------------------
...
PENELOPE     GUINESS
PENELOPE     GUINESS
PENELOPE     GUINESS
NICK         WAHLBERG
NICK         WAHLBERG
...

But thanks to the DISTINCT keyword, the result is (provably) no different from the result of this much simpler query:

SELECT DISTINCT first_name, last_name FROM actor a

(Note, DISTINCT cannot be eliminated, unless we already have a UNIQUE constraint on (FIRST_NAME, LAST_NAME)).
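Purely for illustration (this is hypothetical; in the real Sakila data two actors can share a name, so you would not actually add this), such a constraint might look like:

-- Hypothetical: with this in place, (first_name, last_name) is already unique,
-- so the DISTINCT above becomes redundant as well.
ALTER TABLE actor
  ADD CONSTRAINT uq_actor_name UNIQUE (first_name, last_name);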

Why not Just Refactor the SQL Manually?

Of course, all of this shouldn’t be needed if our SQL were perfect. In the above trivial examples, the SQL can (and should) be re-written manually to improve quality. But note that:

  • Developers make mistakes, and those mistakes may be very subtle when queries get more complex. I’ll show an example below.
  • The presence of this feature actually encourages writing more complex SQL, especially when using reusable views. I’ll show another example below.
  • Finally, I’ve previously advocated avoiding needless, mandatory work, like SELECT *. Such work is mandatory because the optimiser cannot prove its needlessness. In the case of these JOINs, the optimiser can prove the needlessness, so the work is no longer mandatory. It can be eliminated.

Here are some complex examples as promised, where this optimiser feature really shines:

Subtle Mistakes

Let’s consider the following query (in PostgreSQL syntax):

SELECT c.name, count(*)
FROM actor a
JOIN film_actor fa USING (actor_id)
JOIN film f USING (film_id)
JOIN film_category fc USING (film_id)
JOIN category c USING (category_id)
WHERE actor_id = 1
GROUP BY c.name
ORDER BY count(*) DESC

What does it do? For ACTOR_ID = 1 (Penelope Guiness), we’re looking for all the different film categories she played in, and the number of films per category. This is easier to understand when we look at the result:

NAME         COUNT
Horror           3
Family           2
New              2
Classics         2
Games            2
Music            1
Sci-Fi           1
Animation        1
Sports           1
Children         1
Comedy           1
Documentary      1
Foreign          1

Now, can you spot the unneeded JOINs? In fact, we never needed ACTOR, nor did we need FILM:

SELECT c.name, count(*)
FROM film_actor fa
JOIN film_category fc USING (film_id)
JOIN category c USING (category_id)
WHERE actor_id = 1
GROUP BY c.name
ORDER BY count(*) DESC

Cool, eh? The JOINs can be eliminated (again, in some databases, see below) and our “mistake” is no longer relevant to the query. The mistake could have also snuck (or sneaked?) in from a previous query version, which may have looked like this, projecting also the actor information and the list of films per category, in which case the additional JOINs are needed:

SELECT
  c.name, count(*),
  a.first_name, a.last_name,
  array_agg(f.title ORDER BY f.title)
FROM actor a
JOIN film_actor fa USING (actor_id)
JOIN film f USING (film_id)
JOIN film_category fc USING (film_id)
JOIN category c USING (category_id)
WHERE actor_id = 1
GROUP BY c.name, a.first_name, a.last_name
ORDER BY count(*) DESC

The result being:

NAME         COUNT  FIRST_NAME  LAST_NAME  FILMS
Horror           3  PENELOPE    GUINESS    {"ELEPHANT TROJAN","LADY STAGE","RULES HUMAN"}
Family           2  PENELOPE    GUINESS    {"KING EVOLUTION","SPLASH GUMP"}
New              2  PENELOPE    GUINESS    {"ANGELS LIFE","OKLAHOMA JUMANJI"}
Classics         2  PENELOPE    GUINESS    {"COLOR PHILADELPHIA","WESTWARD SEABISCUIT"}
Games            2  PENELOPE    GUINESS    {"BULWORTH COMMANDMENTS","HUMAN GRAFFITI"}
Music            1  PENELOPE    GUINESS    {"WIZARD COLDBLOODED"}
Sci-Fi           1  PENELOPE    GUINESS    {"CHEAPER CLYDE"}
Animation        1  PENELOPE    GUINESS    {"ANACONDA CONFESSIONS"}
Sports           1  PENELOPE    GUINESS    {"GLEAMING JAWBREAKER"}
Children         1  PENELOPE    GUINESS    {"LANGUAGE COWBOY"}
Comedy           1  PENELOPE    GUINESS    {"VERTIGO NORTHWEST"}
Documentary      1  PENELOPE    GUINESS    {"ACADEMY DINOSAUR"}
Foreign          1  PENELOPE    GUINESS    {"MULHOLLAND BEAST"}

As you can see, this optimisation can be very useful on your legacy SQL, because if we maintain a complex query, we might not always be able to see all the JOINs that are really needed.

Reusable Views

Sometimes, we simply add additional JOINs for convenience, when building complex queries from simpler ones, e.g. by using views (which is a completely underrated RDBMS feature! You should all write more views).

Consider this view:

CREATE VIEW v_customer AS
SELECT
  c.first_name, c.last_name,
  a.address, ci.city, co.country
FROM customer c
JOIN address a USING (address_id)
JOIN city ci USING (city_id)
JOIN country co USING (country_id)

It’s not unlikely that we will write a view like this, simply because we’re tired of having to join all these tables over and over again. Every time we do something with customers and addresses, we need the CITY and COUNTRY tables as well.

From now on (with this view), we can simply select from the view and “forget” about how it came to be. Now, let’s assume we completely forget about the underlying tables, because the view was so useful all the time. We could then think about doing this:

SELECT first_name, last_name FROM v_customer

What do you think will happen? Exactly. JOIN elimination. A view isn’t really anything special, just a “macro” of some stored SQL (beware of some databases where this isn’t always the case, e.g. MySQL, as Bill Karwin was kind enough to point out to me). So the above statement will be transformed into:

SELECT first_name, last_name
FROM (
  SELECT
    c.first_name, c.last_name,
    a.address, ci.city, co.country
  FROM customer c
  JOIN address a USING (address_id)
  JOIN city ci USING (city_id)
  JOIN country co USING (country_id)
) v_customer

… which can be transformed into this (we don’t need all columns in the nested select):

SELECT first_name, last_name
FROM (
  SELECT
    c.first_name, c.last_name
  FROM customer c
  JOIN address a USING (address_id)
  JOIN city ci USING (city_id)
  JOIN country co USING (country_id)
) v_customer

… which can be transformed into this (JOINs can be eliminated):

SELECT first_name, last_name
FROM (
  SELECT
    c.first_name, c.last_name
  FROM customer c
) v_customer

… and finally (the subquery is not useful):

SELECT first_name, last_name FROM customer

Thanks to JOIN elimination, the view remains very useful even for this particular query!

Note, the SQL transformations shown above are simply educational. Actual optimisers may perform the transformations entirely differently, or in a different order. This is just to show what’s possible, and what kind of stuff is being done.

Cool, So Can My Database Do It?

Perhaps! Let’s look at the three different types of JOIN elimination in the context of these databases:

  • DB2 LUW 10.5
  • MySQL 8.0.2
  • Oracle 12.2.0.1
  • PostgreSQL 9.6
  • SQL Server 2014

INNER JOIN Elimination

Remember, this depends on the presence (and usefulness) of a FOREIGN KEY constraint. The SQL statement we’re using here is:

SELECT first_name, last_name FROM customer c JOIN address a ON c.address_id = a.address_id

We’re hoping to get:

SELECT first_name, last_name FROM customer c

DB2 LUW

The following execution plan (created with Markus Winand’s cool utility) shows that this works in DB2, there’s no access to the ADDRESS table:

Explain Plan
-------------------------------------------------------------------------
ID | Operation                           |                 Rows | Cost
 1 | RETURN                              |                      |   61
 2 |  FETCH CUSTOMER                     | 599 of 599 (100.00%) |   61
 3 |   IXSCAN IDX_CUSTOMER_FK_ADDRESS_ID | 599 of 599 (100.00%) |   20

MySQL

MySQL 8, apart from finally introducing CTE and window functions (yay), has a lot of new optimiser features, read Morgan Tocker’s useful optimiser guide for details. Unfortunately, INNER JOIN elimination is not implemented:

ID  TABLE  TYPE    REF                  ROWS  EXTRA
1   c      ALL                          599
1   a      eq_ref  sakila.c.address_id  1     Using index

Not only is the JOIN executed, but it is executed using a nested loop with 599 index lookups, as MySQL still only supports NESTED LOOP JOINs, not HASH JOINs.

Bummer.

Oracle

No problem at all for Oracle:

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |   599 | 28752 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CUSTOMER |   599 | 28752 |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------

… the JOIN is eliminated as expected.

PostgreSQL

Unfortunately, PostgreSQL cannot eliminate INNER JOIN:

Hash Join  (cost=19.57..42.79 rows=599 width=13)
  Hash Cond: (c.address_id = a.address_id)
  ->  Seq Scan on customer c  (cost=0.00..14.99 rows=599 width=15)
  ->  Hash  (cost=12.03..12.03 rows=603 width=4)
        ->  Seq Scan on address a  (cost=0.00..12.03 rows=603 width=4)

Not as bad as in MySQL, though, as PostgreSQL chose to use a HASH JOIN to combine the two tables.

SQL Server

No problemo for SQL Server, the ADDRESS table access is gone!

|--Table Scan(OBJECT:([sakila].[dbo].[customer] AS [c]))

Excellent. What about…

OUTER JOIN Elimination

This one is a bit easier to prove for the database, remember? We don’t rely on any FOREIGN KEY anymore. A UNIQUE key in the parent table is sufficient to eliminate an OUTER JOIN. We can safely expect that if the INNER JOIN could be eliminated (DB2, Oracle, SQL Server), then an OUTER JOIN can be eliminated, too.

Here’s the query:

SELECT first_name, last_name FROM customer c LEFT JOIN address a ON c.address_id = a.address_id

And the outcome:

DB2 LUW

Good

Explain Plan
-------------------------------------------------------------------------
ID | Operation                           |                 Rows | Cost
 1 | RETURN                              |                      |   61
 2 |  FETCH CUSTOMER                     | 599 of 599 (100.00%) |   61
 3 |   IXSCAN IDX_CUSTOMER_FK_ADDRESS_ID | 599 of 599 (100.00%) |   20

MySQL

Still nope:

ID  TABLE  TYPE    REF                  ROWS  EXTRA
1   c      ALL                          599
1   a      eq_ref  sakila.c.address_id  1     Using index

Oracle

Perfect:

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |   599 | 28752 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CUSTOMER |   599 | 28752 |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------

PostgreSQL

Unlike INNER JOIN elimination, this works. Great!

Seq Scan on customer c (cost=0.00..14.99 rows=599 width=13)

SQL Server

As expected, good:

|--Table Scan(OBJECT:([sakila].[dbo].[customer] AS [c]))

Finally…

OUTER JOIN Elimination with DISTINCT

Remember, the query here navigates a to-many relationship, producing duplicate records of the parent table, but then removes all those duplicates again by

  • Ignoring contributions from the child table
  • Removing duplicates with DISTINCT

It’s easy to prove that this:

SELECT DISTINCT first_name, last_name FROM actor a LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id

Is equivalent to this:

SELECT DISTINCT first_name, last_name FROM actor a

Remember also, this only works with OUTER JOIN, not with INNER JOIN, as the latter might remove rows, so we have to execute it to see if it does.
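To see why, consider a hypothetical actor that has no FILM_ACTOR rows at all: the INNER JOIN variant silently drops that actor, while the LEFT JOIN keeps it, so only the LEFT JOIN can be replaced by a plain scan of ACTOR. A small sketch (the actor_id and name values are made up):

-- Hypothetical row: an actor who appears in no films
INSERT INTO actor (actor_id, first_name, last_name)
VALUES (999, 'NO', 'FILMS');

-- This INNER JOIN query does not return ('NO', 'FILMS'),
-- whereas the LEFT JOIN version above still does.
SELECT DISTINCT first_name, last_name
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id;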

DB2 LUW

Cool, this actually works!

Explain Plan
----------------------------------------------------------------------
ID | Operation        |                 Rows | Cost
 1 | RETURN           |                      |   20
 2 |  TBSCAN          | 200 of 200 (100.00%) |   20
 3 |   SORT (UNIQUE)  | 200 of 200 (100.00%) |   20
 4 |    TBSCAN ACTOR  | 200 of 200 (100.00%) |   20

There’s no access to the FILM_ACTOR table, nor to its indexes. Very nice.

MySQL

As this is a more sophisticated transformation than the previous ones, we don’t have high hopes here.

ID  TABLE  TYPE  REF                ROWS  EXTRA
1   a      ALL                      200   Using temporary
1   fa     ref   sakila.a.actor_id  27    Using index; Distinct

This has become a rather expensive query, again because of the lack of HASH JOIN support in MySQL!

Oracle

I’m very surprised to see that Oracle doesn’t support this optimisation, we’re executing the full query:

---------------------------------------------------------------
| Id  | Operation           | Name                    | Rows  |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |                         |  5462 |
|   1 |  HASH UNIQUE        |                         |  5462 |
|   2 |   NESTED LOOPS OUTER|                         |  5462 |
|   3 |    TABLE ACCESS FULL| ACTOR                   |   200 |
|*  4 |    INDEX RANGE SCAN | IDX_FK_FILM_ACTOR_ACTOR |    27 |
---------------------------------------------------------------

Curiously, Oracle also chose a NESTED LOOP JOIN in this case, even if we could have loaded the entire index on the FILM_ACTOR table into memory and then HASH JOINed it to ACTOR. Note that the cardinality estimate of the resulting query is quite off, too, despite the DISTINCT operation. This can lead to significant effects in an upstream query, which selects from this query (e.g. if stored as a view) – which is what happened to our customer.

PostgreSQL

PostgreSQL also doesn’t support this elimination, but at least its cardinality estimates are much more accurate, and it chooses a HASH JOIN operation:

HashAggregate  (cost=193.53..194.81 rows=128 width=13)
  Group Key: a.first_name, a.last_name
  ->  Hash Right Join  (cost=6.50..166.22 rows=5462 width=13)
        Hash Cond: (fa.actor_id = a.actor_id)
        ->  Seq Scan on film_actor fa  (cost=0.00..84.62 rows=5462 width=2)
        ->  Hash  (cost=4.00..4.00 rows=200 width=17)
              ->  Seq Scan on actor a  (cost=0.00..4.00 rows=200 width=17)

SQL Server

The pleasant surprise came from SQL Server, which does support this optimisation too:

|--Sort(DISTINCT ORDER BY:([a].[first_name] ASC, [a].[last_name] ASC))
     |--Table Scan(OBJECT:([sakila].[dbo].[actor] AS [a]))

As you can see, no access to any FILM_ACTOR related objects!

Summary

Here’s a summary of what databases can eliminate:

Database          INNER JOIN: to-one   OUTER JOIN: to-one   OUTER JOIN DISTINCT: to-many
DB2 LUW 10.5      Yep                  Yep                  Yep
MySQL 8.0.2       Nope                 Nope                 Nope
Oracle 12.2.0.1   Yep                  Yep                  Nope
PostgreSQL 9.6    Nope                 Yep                  Nope
SQL Server 2014   Yep                  Yep                  Yep

Conclusion

JOIN elimination is a feature that is very simple to understand, yet incredibly powerful. Modern databases support it to help developers build and maintain complex SQL queries without worrying too much about performance side effects.

It is possible because SQL is a 4GL (fourth-generation programming language), i.e. a declarative language whose parsed expression trees can “easily” be transformed into equivalent, simpler, and faster to execute expression trees. This work is constantly done by the optimiser behind the scenes, often without you even noticing (see example where unnecessary ACTOR and FILM tables were removed).

Currently, DB2 and SQL Server are the leaders here, with Oracle being a close runner-up, at least in my investigation. There’s hope for PostgreSQL, and a bit less hope for MySQL right now. There are tons of other interesting SQL transformations that can make the difference in other kinds of complex queries, and I’ll blog about them in future posts.

If you were intrigued by this sort of functionality, do have a look at my most recent SQL talk, which helps developers understand the real value of SQL in the context of the optimiser:



How MariaDB ColumnStore’s filenames work

Unlike most storage engines, MariaDB ColumnStore does not store its data files in the datadir. Instead, they are stored on the Performance Modules using what appears to be a strange numbering system. In this post I will walk you through deciphering that numbering system.

If you are still using InfiniDB with MySQL, the system is exactly the same as outlined in this post, but the default path that the data is stored in will be a little different.

The default path for the data to be stored is /usr/local/mariadb/columnstore/data[dbRoot] where “dbRoot” is the DB root number selected when the ColumnStore system was configured.

From here onwards we are looking at directories named with three digits and ending in “.dir”. Every filename will be nested in directories similar to 000.dir/000.dir/003.dir/233.dir/000.dir/FILE000.cdf.

Now, to understand this you first need to understand how ColumnStore’s storage works. As the name implies, every column of a table is stored separately. These columns are broken up into “extents” of 2^23 (roughly 8M) entries; either 1 or 2 extents (depending on how much data you have) will make up a segment file. Each segment file is given a segment ID, and a collection of four segments is given a partition ID. In addition to all this, every column is given an “Object ID”.

You can find the object ID for every column using the information_schema.columnstore_columns table and details about every extent, including the partition and segment IDs using the information_schema.columnstore_extents table. This will be useful when working out the file names.
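For example (a sketch only; the table name ‘t1’ is hypothetical, and the exact column names of these views can vary between versions, which is why SELECT * is used here):

-- Find the object IDs for the columns of a hypothetical table 't1'
SELECT * FROM information_schema.columnstore_columns
WHERE table_name = 't1';

-- List every extent, including its partition and segment IDs
SELECT * FROM information_schema.columnstore_extents;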

The following is how to work out a filename from an object ID. It should be noted that object IDs are 32bit and the output of each of these parts is converted to decimal:

Part 1: The top byte from the object ID (object ID >> 24)
Part 2: The next byte from the object ID ((object ID & 0x00ff0000) >> 16)
Part 3: The next byte from the object ID ((object ID & 0x0000ff00) >> 8)
Part 4: The last byte from the object ID (object ID & 0x000000ff)
Part 5: The partition ID
Part 6 (the filename): The segment ID

Each part here apart from the final part is a directory appended with “.dir”. The filename is prepended with FILE and appended with “.cdf”. There is of course a much easier way of finding out this information. The information_schema.columnstore_files table will give you the filename for each object/partition/segment combination currently in use as well as the file size information.
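To make the byte arithmetic concrete, here is a small sketch using the example path from above (object ID 1001 is simply the value that reproduces that path, not one taken from a real installation; directory names are zero-padded to three digits):

-- The four directory parts for object ID 1001
SELECT
  (1001 >> 24)        AS part1,  -- top byte  -> 0
  (1001 >> 16) & 0xFF AS part2,  -- next byte -> 0
  (1001 >>  8) & 0xFF AS part3,  -- next byte -> 3
   1001        & 0xFF AS part4;  -- last byte -> 233
-- With partition ID 0 and segment ID 0 this gives:
-- 000.dir/000.dir/003.dir/233.dir/000.dir/FILE000.cdf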

Image credit: Marcin Wichary, used under a Creative Commons license


Automatic Partition Maintenance in MySQL and MariaDB: Part 3

In part 1 and part 2 of this blog series, I showed how a DBA could configure MySQL or MariaDB to automatically drop old partitions. Some readers mentioned that they would also like for new partitions to be automatically created. In this blog post, I will show a stored procedure that can automatically create new partitions, and I will tie ... Read More

Percona Live Europe Featured Talks: Orchestrating ProxySQL with Orchestrator and Consul with Avraham Apelbaum

Welcome to another post our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Avraham Apelbaum, DBA and DevOps at Wix.com. His talk is titled Orchestrating ProxySQL with Orchestrator and Consul. The combination of ProxySQL and Orchestrator solves many problems, but it still requires some manual labor when the configuration changes, for example when there is a network split (among other scenarios). In our conversation, we discussed using Consul to solve some of these issues:

Percona: How did you get into database technology? What do you love about it?

Avraham: On my first day as a soldier in a technology unit of the IDF, I received a HUGE Oracle 8 book and a very low-level design of a DB-based system. “You have one month,” they told me. I finished it all within ten days. Before that, I didn’t even know what a DB was. Today, I’m at Wix managing hundreds of databases that support 100M users!

Percona: You’re presenting a session called “Orchestrating ProxySQL with Orchestrator and Consul”. How do these technologies work together to help provide a high availability solution?

Avraham: ProxySQL is supposed to help you out with high availability (HA) and disaster recovery (DR) for MySQL servers, but it still requires some manual labor when the configuration changes – as a result of a network split, for example. Somehow all ProxySQL servers need to get the new MySQL cluster topology. So to automate all that, I added two more parts: a Consul KV store and a Consul template, which are responsible for updating ProxySQL on every architecture change in the MySQL cluster.

Percona: What is special about this combination of products that works better than other solutions? Is it right all the time, or does it depend on the workload?

Avraham: As DevOps I prefer not to do anything manually. What’s more, no one wants to wake up in the middle of the night because any one of our DB servers can fail. Most everyone, I guess, will have more than one ProxySQL server in their system at some point, so this solution can help them use ProxySQL and Orchestrator.

Percona: What do you want attendees to take away from your session? Why should they attend?

Avraham: I am hoping to help people automate their HA and DR solutions. If as a result of my talk someone will earn even one minute off downtime, I’ll be happy.

Percona: What are you most looking forward to at Percona Live Europe 2017?

Avraham: In the DevOps and open source world, it’s all about sharing ideas. It was actually when I attended the talks by ProxySQL and Orchestrator’s creators that I thought of assembling it all up to solve our own problem. So I am looking forward to sharing my idea with others, and getting input from the audience so that everyone can benefit.

Want to find out more about Avraham and orchestrating ProxySQL? Register for Percona Live Europe 2017, and see his talk Orchestrating ProxySQL with Orchestrator and Consul. Register now to get the best price! Use discount code SeeMeSpeakPLE17 to get 10% off your registration.

Percona Live Open Source Database Conference Europe 2017 in Dublin is the premier European open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, MariaDB, MongoDB, time series database, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Open Source Database Conference Europe will be September 25-27, 2017 at the Radisson Blu Royal Hotel, Dublin.

Percona Server for MySQL 5.7.19-17 Is Now Available

Percona announces the release of Percona Server for MySQL 5.7.19-17 on August 31, 2017. Download the latest version from the Percona web site or the Percona Software Repositories. You can also run Docker containers from the images in the Docker Hub repository.

Based on MySQL 5.7.19, and including all the bug fixes in it, Percona Server for MySQL 5.7.19-17 is now the current GA release in the Percona Server for MySQL 5.7 series. Percona Server for MySQL is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release are available in the 5.7.19-17 milestone on Launchpad.

NOTE: Percona software no longer supports Red Hat Enterprise Linux 5 (including CentOS 5 and other derivatives), Ubuntu 12.04 and older versions. These platforms have reached end of life, won’t be updated and are not recommended for use in production.

 

New Features

  • Included the Percona MyRocks storage engine

    NOTE: MyRocks for Percona Server is currently experimental and not recommended for production deployments until further notice. You are encouraged to try it in a testing environment and provide feedback or report bugs.

  • #1708087: Added the mysql-helpers script to handle checking for missing datadir during startup. Also fixes #1635364.

Platform Support

  • Stopped providing packages for Ubuntu 12.04 due to its end of life.

Bugs Fixed

  • #1669414: Fixed handling of failure to set O_DIRECT on parallel doublewrite buffer file.
  • #1705729: Fixed the postinst script to correctly locate the datadir. Also fixes #1698019.
  • #1709811: Fixed yum upgrade to not enable the mysqld service if it was disabled before the upgrade.
  • #1709834: Fixed the mysqld_safe script to correctly locate the basedir.
  • Other fixes: #1698996, #1706055, #1706262, #1706981

TokuDB Changes

  • TDB-70: Removed redundant fsync of the TokuDB redo log during the binlog group commit flush stage. This fixes an issue that prevented TokuDB from running in reduced durability mode when the binlog was enabled.
  • TDB-72: Fixed an issue with renaming a table that has non-alphanumeric characters in its name.

Release notes for Percona Server for MySQL 5.7.19-17 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

In-memory sysbench, a larger server and contention - part 1

Yesterday I shared results for in-memory sysbench on a large server. Today I have more results from a similar test but with more contention. The tests yesterday used 8 tables with 1M rows/table. The test here uses 1 table with 8M rows. In some ways the results are similar to yesterday. But there are some interesting differences that I will explain in part 2 (another post).

tl;dr
  • MyRocks does worse than InnoDB 5.6.35 at low and mid concurrency for all tests except update-index. It suffers more on the read-heavy tests.
  • MyRocks and TokuDB are more competitive at mid and high concurrency than at low. MyRocks is faster than InnoDB 5.6.35 for many of the high concurrency tests.
  • InnoDB QPS at low concurrency tends to decrease after 5.6 for tests heavy on point queries but something in MySQL 5.7 made InnoDB range scans much faster.
  • InnoDB QPS at mid and high concurrency tends to increase after 5.6

Configuration

Everything is the same as explained in the previous post except I used 1 table with 8M rows here and 8 tables with 1M rows/table there. Using 1 table instead of 8 means there can be more contention in the database engine on things like the InnoDB per-index mutex. A command line is:
bash all.sh 1 8000000 180 300 180 innodb 1 0 /orig5717/bin/mysql none /sysbench.new

Results without charts

All of the data is here. Below I share the QPS ratios to compare the QPS for one engine with the QPS from InnoDB from MySQL 5.6.35. The engine is slower than InnoDB 5.6.35 when the ratio is less than 1.0.

QPS ratio:
* rocks = myrocks / inno5635
* inno = inno5717 / inno5635
* toku = toku5717 / inno5635

1 connection
rocks   inno    toku
0.460   0.946   0.195   update-inlist
0.862   0.906   0.278   update-one
1.496   1.239   0.569   update-index
0.674   0.906   0.248   update-nonindex
0.756   0.921   0.240   update-nonindex-special
0.793   0.866   0.319   delete-only
0.701   1.027   0.612   read-write.range100
0.812   1.657   1.033   read-write.range10000
0.701   1.089   0.737   read-only.range100
0.804   1.676   1.281   read-only.range10000
0.675   0.904   0.731   point-query
0.508   0.923   0.732   random-points
0.554   0.904   0.633   hot-points
0.760   0.857   0.257   insert-only
-----   -----   -----
0.754   1.059   0.562   average

8 connections
rocks   inno    toku
0.968   1.587   0.293   update-inlist
1.014   0.843   0.190   update-one
1.837   2.183   0.608   update-index
0.879   1.090   0.307   update-nonindex
0.928   1.094   0.312   update-nonindex-special
0.968   1.068   0.340   delete-only
0.722   1.045   0.560   read-write.range100
0.814   1.626   1.108   read-write.range10000
0.714   1.126   0.825   read-only.range100
0.811   1.639   1.255   read-only.range10000
0.690   0.914   0.727   point-query
0.718   1.156   0.840   random-points
0.966   1.354   0.832   hot-points
0.859   1.104   0.310   insert-only
-----   -----   -----
0.921   1.274   0.608   average

48 connections
rocks   inno    toku
1.679   3.087   0.788   update-inlist
0.982   0.979   0.231   update-one
1.222   1.986   0.606   update-index
1.379   1.947   0.886   update-nonindex
1.387   1.936   0.854   update-nonindex-special
1.189   1.876   0.578   delete-only
0.826   1.148   0.514   read-write.range100
0.840   1.316   0.953   read-write.range10000
0.743   1.112   0.740   read-only.range100
0.850   1.342   1.034   read-only.range10000
0.941   1.368   1.066   point-query
2.042   1.445   0.686   random-points
0.793   1.507   0.711   hot-points
1.820   1.605   0.692   insert-only
-----   -----   -----
1.192   1.618   0.739   average

Results with charts

Charts using the data from the previous section. For some of them I truncate the x-axis to make it easier to see differences between engines.





MySQL Cluster Manager 1.4.3 released!


You might already have seen the announcement earlier this summer – we made MCM 1.4.3 available for download from My Oracle Support.


What’s new?
The most notable changes in MCM 1.4.3 include:

  • Significant reduction in CPU consumption when idle.
  • Added support for running your mysqlds with skip-networking enabled.

How to Submit MySQL Marinate Homework

We will be submitting the homework via GitHub. The reason for this is to have a centralized place for homework, as well as to teach how to use a revision control system and how to interact with GitHub specifically.

Reading:
Learn about git at http://git-scm.com/book/en/Getting-Started-Git-Basics (You only need to read this one webpage, no need to go to the next page)

To do:
– Create a github account at www.github.com if you do not already have one.
– Install and configure git on your local machine as per https://help.github.com/articles/set-up-git
– Fork the MySQL Marinate repo as per https://help.github.com/articles/fork-a-repo
The URL for the MySQL Marinate repo is https://github.com/Sheeri/mysql-marinate

– To demonstrate that you can submit homework, open and modify the 01Intro/homework.txt file
– Then commit your work locally:
git add 01Intro/homework.txt
git commit
– When you’re done, push your work up to the github server:
git push origin master

When doing homework, remember to commit locally often. I would recommend pushing up to the github server whenever you finish a question or a part of a question. Pretend at any time your local machine may die; how much work are you willing to lose? (5 minutes’ worth? 3 days’ worth?) I do not look at individual commits, only the final pull request when you’re done.

When you’re done with your homework, submit a Pull Request as per https://help.github.com/articles/using-pull-requests

If you have trouble and need help, ask away in the comments!

Reference/Quick Links for MySQL Marinate

Lesson 01: Introduction and Submitting Homework

This is for chapter 1 of Learning MySQL – it’s only pages 3-8, so it is an easy one!

Homework for this week: How to Submit Homework

Notes/errata/updates for Chapter 1:
The book mentions MySQL AB, the company behind MySQL. Since the book was written, MySQL was acquired by Sun, and then Sun was acquired by Oracle. (see http://en.wikipedia.org/wiki/MySQL#History for the timeline).

Topics covered:
MySQL History
Why MySQL is popular
Submitting Homework

Reference/Quick Links for MySQL Marinate

MySQL Marinate – So you want to learn MySQL! – START HERE

Want to learn or refresh yourself on MySQL? MySQL Marinate, the FREE virtual self-study group, is for you!

This is for beginners – If you have no experience with MySQL, or if you are a developer that wants to learn how to administer MySQL, or an administrator that wants to learn how to query MySQL, this course is what you want. If you are not a beginner, you will likely still learn some nuances, and it will be easy and fast to do. If you have absolutely zero experience with MySQL, this is perfect for you. The first few chapters walk you through getting and installing MySQL, so all you need is a computer and the book.

The format of a virtual self-study group is as follows:
Each participant acquires the same textbook (Learning MySQL, the “butterfly O’Reilly book”, published 2007). You can acquire the textbook however you want (e.g. from the library or from a friend, hard copy or online). Yes, the book is old, but SQL dates back to at least the 1970s and the basics haven’t changed! There are notes and errata for each chapter so you will have updated information. The book looks like this:

O’Reilly Butterfly book picture

Each participant commits to reading each chapter (we suggest one chapter per week as a good deadline), complete the exercises and post a link to the completed work.

Each participant obtains assistance by posting questions to the comments on a particular chapter.

Note: There is no classroom instruction.

How do I get started?

– Watch sheeri.com each week for the chapters to be posted.

– Get Learning MySQL
Acquire a book (the only item that may cost money). Simply acquire Learning MySQL – see if your local library has it, if someone is selling their copy, or buy it new.

– Start!
When your book arrives, start your virtual learning by reading one chapter per week. Complete the exercises; if you have any questions or comments, or want to learn more in-depth, that’s what the comments are for!

FAQs:
Q: Does this cover the Percona patch set or MariaDB forks?

A: This covers the basics of MySQL, which are applicable to Percona’s patched MySQL or MariaDB builds, as well as newer versions of MySQL.

Q: What do I need in order to complete the course?

A: All you need is the book and access to a computer, preferably one that you have control over. Windows, Mac OS X or Unix/Linux will work. A Chromebook or tablet is not recommended for this course.

Q: Where can I put completed assignments?

A: Completed assignments get uploaded to github. See How to Submit Homework

Q: The book was published in 2007. Isn’t that a bit old?

A: Yes! The basics are still accurate, and we will let you know what in the book is outdated. I have contacted O’Reilly, offering to produce a new edition, and they are not interested in updating the book. We will also have optional supplemental material (blog posts, videos, slides) for those who want to learn more right away. We are confident that this self-study course will make you ready to dive into other, more advanced material.

Soak it in!

Reference/Quick Links for MySQL Marinate

Complete Megalist: 25 Helpful Tools For Back-End Developers

 

The website or mobile app is the storefront for participating in the modern digital era. It’s your portal for inviting users to come and survey your products and services. Much attention focuses on front-end development; this is where the HTML5, CSS, and JavaScript are coded to develop the landing page that everyone sees when they visit your site.

 

But the real magic happens on the backend. This is the ecosystem that really powers your website. One writer has articulated this point very nicely as follows:

 

The technology and programming that “power” a site—what your end user doesn’t see but what makes the site run—is called the back end. Consisting of the server, the database, and the server-side applications, it’s the behind-the-scenes functionality—the brain of a site. This is the ecosystem of the database manager and the back-end developer.

 

The challenge of modern web development is that the tools and technologies are constantly changing. What’s in vogue today may become obsolete in less than a year. What’s more is that web applications have become much richer and more complex. New uses of video, virtual reality, and artificial intelligence on websites will continue to define the developer of the future.

 

And, of course, today’s savvy web developer must continue to learn and grow, which is equally challenging because the technologies are constantly changing. The internet of 3 years ago is very different from the internet of today, and so it will be in another 3 years.

 

In the following, we’ve put together a long list of useful tools that every back-end developer should have in their toolkit. Read on and learn about some of the best resources on the market today for the backend developer.

 

 

Languages & Frameworks

 

1. PHP

 

In the 23 years since its introduction, PHP has become the world’s most popular server-side scripting language. PHP is pre-installed on most hosting sites, is known to be easy to use, and has tons of support. What’s more, there are many very well established PHP frameworks on the market that make developing applications more intuitive and agile. Some of the most popular ones are Symfony, Laravel, Phalcon, and Yii Framework.

 

 

2. Python

 

Python has become an extremely popular language for building applications in recent years. With a reputation for being fast, easy to learn, and carrying wide support, Python has become the first choice for many new programmers. It’s also the language of choice for data scientists and engineers. A couple of the most popular Python frameworks out there are Django and Pyramid.

 

 

3. Ruby on Rails

 

Ruby on Rails is the popular open source web application framework that uses the Ruby programming language. Ever since Rails burst on the scene a decade ago it has continued to scale up as an elegant way to build dynamic websites quickly and efficiently. Rails has garnered a strong following, especially among tech startups. In fact, some of the best-known firms out there are using this framework to build their sites, including Airbnb, Bloomberg, and Groupon.

 

 

4. Node.js

 

Node.js is an open-source, cross-platform JavaScript runtime environment for developing a diverse variety of server tools and applications. One notable feature of Node.js is that it contains a built-in library to allow applications to act as a Web server without software such as Apache HTTP Server or IIS.

 

 

5. Laravel

 

Laravel is probably one of the more popular PHP frameworks on the market right now. Known for its elegance and simplicity, Laravel meets a diverse range of programming needs and projects, from beginner to advanced, and is well suited for projects of all types and sizes – from basic scripting to huge enterprise applications. It’s built on top of several Symfony components that ensure a solid framework for producing well-tested and reliable code.

 

 

Web Server Technologies

 

6. Apache

 

Apache is an open source web server, created in 1995 and currently used by 50.1% of all websites. The numbers indicate the reliance that folks place on Apache to run their web services. It’s a proven, reliable tool that has stood the test of time, and really played a key role in the early growth of the internet.

 

7. NGINX

NGINX is open source software for web serving, reverse proxying, caching, load balancing, media streaming, and more. In addition to its HTTP server capabilities, NGINX can also function as a proxy server for email (IMAP, POP3, and SMTP) and a reverse proxy and load balancer for HTTP, TCP, and UDP servers.

 

 

Databases

 

8. MySQL

 

MySQL is the world’s most popular open source relational database. It’s easily accessible and is often known to be part of the LAMP web development stack, standing for the ‘M’ in the acronym of popular tools, along with Linux, Apache, and PHP/Perl/Python. The fact that MySQL is free, easy to setup, and scales fast are some of the main reasons why it’s the best match for many backend developers.

 

9. PostgreSQL

 

PostgreSQL is a powerful, open source object-relational database system with more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness.

 

10. MongoDB

 

MongoDB is a free and open-source NoSQL database system. MongoDB saves data in binary JSON format, which makes it easier to pass data between client and server. The platform is closely associated with MEAN (an acronym for MongoDB, ExpressJS, AngularJS, and NodeJS), a JavaScript-based set of technologies used for building web applications.

 

 

Git Clients & Services

 

11. SourceTree

SourceTree simplifies the process of how developers interact with their Git and Mercurial repositories. The easy to use interface makes managing repositories more intuitive so you can focus on coding.

 

12. Github Client

Github Client was created to minimize the cost of building apps on multiple platforms. By using the Electron software framework, developers can now easily write cross-platform desktop applications using JavaScript, HTML, and CSS. Essentially, Github Client + Electron provides developers with the benefits of a unified cross-platform experience that’s completely open source and ready to customize.

 

 

Microservice Platforms

 

13. Docker

 

Docker is the open source platform that put microservices on the map by providing developers and testers with a fast and easy-to-use packaging, distribution, and deployment mechanism for building containerized applications. Docker also offers a formidable ecosystem of tools, such as native clustering (Docker Swarm) registry service (Docker Hub) and cloud service (Docker Cloud).

 

14. Kubernetes

 

Kubernetes is an open source container cluster manager that aims to provide a “platform for automating deployment, scaling, and operations of application containers across clusters of hosts.” Originally designed by Google, Kubernetes has the advantage of 15 years of Google research. Because of that, Kubernetes also draws upon a robust user community and is one of the top projects on Github.

 

 

Local Development Environments

 

15. XAMPP

 

There are lots of advantages to setting up your own local test environment. XAMPP is an open source, cross-platform web server solution stack, which comes with the Apache distribution containing MariaDB, PHP, and Perl all integrated into one downloadable file. The XAMPP open source package has been set up to be incredibly easy to install and to use.

 

16. WampServer

WampServer is an open source Windows development environment. It allows users to create web applications with Apache, PHP, and a MySQL database. The platform also comes with phpMyAdmin and SQLite to help easily manage your databases. WampServer is known to be particularly easy and intuitive to use.

 

17. Laragon

 

Laragon provides a fast and easy way to spin up an isolated Windows development environment (like a virtual machine, it doesn’t touch your OS). Users can install it as software, start it up, do their programming, and just exit when finished. The platform comes pre-installed with many popular applications like Node.js, PHP, Apache, Composer, and MariaDB/MySQL.

 

 

Collaboration Services

 

18. Slack

 

Slack is a popular cloud-based messaging & collaboration app that has revolutionized enterprise communications in recent years. Known for its fun and intuitive user interface, “slackbots” (AI personal assistant), and a robust and rapidly expanding user community, Slack is on a mission to make working life simpler, more pleasant, and more productive.

 

 

19. Asana

 

Asana is a SaaS platform designed to help teams work more effectively by improving team collaboration. It focuses on allowing users to manage projects and tasks online without the use of email.

 

20. Jira

 

Jira is a web-based project management platform that uses the kanban and scrum paradigms for managing projects, originally popularized by Toyota in the 1980s for supply chain management. As its own website declares, “Trello’s boards, lists, and cards allow users to organize and prioritize projects in a fun, flexible, and rewarding way.”

 

 

Website Speed Test Tools

 

21. Google PageSpeed Insights

 

PageSpeed Insights is a helpful web performance tool that analyzes the content and speed of a web page. The output offers reports for both desktop and mobile versions of your site, along with a set of recommendations and metrics to make the page faster.

 

22. Full Page Load Tester

 

WebPage Test provides a free website speed test from multiple locations around the globe using real browsers (IE and Chrome) and at real consumer connection speeds. The output offers a rich set of diagnostics including resource loading waterfall charts, Page Speed optimization checks, and suggestions for improvements.

 

 

Web Development Communities

 

23. Stackoverflow

 

StackOverflow is easily the world’s largest developer website with approximately 50 million visits a month by developers who come to learn and share their knowledge. Users are encouraged by Stack Overflow’s badge system, which enables them to earn reputation points for receiving an “up” vote on an answer given to a question.

 

24. Refind

Refind is a content reposting site that allows members (consisting of founders, hackers, and designers) to collect and share the best links on the web. In its own words, “Join thousands of professionals to discover, save, and read what’s most relevant to you.”

 

25. Hashnode

Hashnode is basically a social network for software developers where they can share and grow their programming knowledge by posting articles and content. This allows them to build their portfolio and gain followers and votes in the process.

Deploy WordPress on top of FAMP in FreeBSD 11.1

This guide will show how to install and configure the most popular open source CMS used on the internet, WordPress, on top of FAMP in the latest FreeBSD 11.x release. FAMP is an acronym which describes the following software bundle: the FreeBSD 11.1 Unix-like operating system, the Apache HTTP server (one of the most popular open-source web servers on the internet), the MariaDB RDBMS (relational database management system), which is a fork of the MySQL database engine, and the PHP server-side programming language.

How to configure remote and secure connections for MySQL on Ubuntu 16.04 VPS

In this article, we will show you how to configure remote and secure connections for MySQL on an Ubuntu 16.04 VPS. If you are using an unencrypted connection to connect to your remote MariaDB/MySQL server, then everyone with access to the network could monitor and analyze the data being sent or received between the client and the server. This guide should work on other Linux VPS systems as well, but was tested and written for an Ubuntu 16.04 VPS. Log in to your VPS via SSH: ssh my_sudo_user@my_server. The steps in this tutorial assume that you have installed MySQL 5.7 with the […]
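As a rough illustration of the “secure connections” part (a sketch only; the user, host, and database names are hypothetical, and the server must already have SSL certificates configured), MySQL 5.7 lets you refuse unencrypted connections for a given account:

-- Hypothetical remote application account
CREATE USER 'app_user'@'%' IDENTIFIED BY 'choose_a_strong_password';
GRANT ALL PRIVILEGES ON mydb.* TO 'app_user'@'%';

-- Reject any connection from this account that is not SSL-encrypted
ALTER USER 'app_user'@'%' REQUIRE SSL;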

Automatically Dropping Old Partitions in MySQL and MariaDB: Part 2

In a previous blog post, I showed how a DBA could configure MySQL or MariaDB to automatically drop old partitions. Some readers provided some feedback on some issues that they’ve run into while doing similar operations. Specifically: It can sometimes help to maintain an empty first partition when partitioning by dates, since partition pruning cannot always eliminate the first partition. ... Read More

Nested Data Structures in ClickHouse

In this blog post, we’ll look at nested data structures in ClickHouse and how this can be used with PMM to look at queries.

Nested structures are not common in Relational Database Management Systems. Usually, it’s just flat tables. Sometimes it would be convenient to store unstructured information in structured databases.

We are working to adapt ClickHouse as a long term storage for Percona Monitoring and Management (PMM), and particularly to store detailed information about queries. One of the problems we are trying to solve is to count the different errors that cause a particular query to fail.

For example, for date 2017-08-17 the query:

"SELECT foo FROM bar WHERE id=?"

was executed 1000 times. 25 times it failed with error code “1212”, and eight times it failed with error code “1250”. Of course, the traditional way to store this in relational data would be to have a table "Date, QueryID, ErrorCode, ErrorCnt" and then perform a JOIN to this table. Unfortunately, columnar databases don’t perform well with multiple joins, and often the recommendation is to have de-normalized tables.
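For contrast, a minimal sketch of that traditional, normalized layout (the table name and engine parameters below are illustrative only) could be:

-- One row per (date, query, error code); queries would then need a JOIN
-- against this table to pick up their error counts.
CREATE TABLE query_errors
(
    Period    Date,
    QueryID   UInt32,
    ErrorCode String,
    ErrorCnt  UInt32
) Engine = MergeTree(Period, (QueryID, ErrorCode), 8192);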

We can create a column for each possible ErrorCode, but this is not an optimal solution. There could be thousands of them, and most of the time they would be empty.

In this case, ClickHouse proposes Nested data structures. For our case, these can be defined as:

CREATE TABLE queries
(
    Period      Date,
    QueryID     UInt32,
    Fingerprint String,
    Errors Nested
    (
        ErrorCode String,
        ErrorCnt  UInt32
    )
) Engine = MergeTree(Period, QueryID, 8192);

This solution has obvious questions: How do we insert data into this table? How do we extract it?

Let’s start with INSERT. Insert can look like:

INSERT INTO queries VALUES ('2017-08-17',5,'SELECT foo FROM bar WHERE id=?',['1220','1230','1212'],[5,6,2])

which means that the inserted query during 2017-08-17 gave error 1220 five times, error 1230 six times and error 1212 two times.

Now, during a different date, it might produce different errors:

INSERT INTO queries VALUES ('2017-08-18',5,'SELECT foo FROM bar WHERE id=?',['1220','1240','1258'],[3,2,1])

Let’s take a look at ways to SELECT data. A very basic SELECT:

SELECT *
FROM queries

┌─────Period─┬─QueryID─┬─Fingerprint─┬─Errors.ErrorCode───────┬─Errors.ErrorCnt─┐
│ 2017-08-17 │       5 │ SELECT foo  │ ['1220','1230','1212'] │ [5,6,2]         │
│ 2017-08-18 │       5 │ SELECT foo  │ ['1220','1240','1260'] │ [3,16,12]       │
└────────────┴─────────┴─────────────┴────────────────────────┴─────────────────┘

If we want to use a more familiar tabular output, we can use the ARRAY JOIN extension:

SELECT * FROM queries ARRAY JOIN Errors

┌─────Period─┬─QueryID─┬─Fingerprint─┬─Errors.ErrorCode─┬─Errors.ErrorCnt─┐
│ 2017-08-17 │       5 │ SELECT foo  │ 1220             │               5 │
│ 2017-08-17 │       5 │ SELECT foo  │ 1230             │               6 │
│ 2017-08-17 │       5 │ SELECT foo  │ 1212             │               2 │
│ 2017-08-18 │       5 │ SELECT foo  │ 1220             │               3 │
│ 2017-08-18 │       5 │ SELECT foo  │ 1240             │              16 │
│ 2017-08-18 │       5 │ SELECT foo  │ 1260             │              12 │
└────────────┴─────────┴─────────────┴──────────────────┴─────────────────┘

However, usually we want to see the aggregation over multiple periods, which can be done with traditional aggregation functions:

SELECT QueryID, Errors.ErrorCode, SUM(Errors.ErrorCnt)
FROM queries
ARRAY JOIN Errors
GROUP BY QueryID, Errors.ErrorCode

┌─QueryID─┬─Errors.ErrorCode─┬─SUM(Errors.ErrorCnt)─┐
│       5 │ 1212             │                    2 │
│       5 │ 1230             │                    6 │
│       5 │ 1260             │                   12 │
│       5 │ 1240             │                   16 │
│       5 │ 1220             │                    8 │
└─────────┴──────────────────┴──────────────────────┘

If we want to get really creative and return only one row per QueryID, we can do that as well:

SELECT QueryID, groupArray((ecode, cnt))
FROM
(
    SELECT QueryID, ecode, sum(ecnt) AS cnt
    FROM queries
    ARRAY JOIN Errors.ErrorCode AS ecode, Errors.ErrorCnt AS ecnt
    GROUP BY QueryID, ecode
)
GROUP BY QueryID

┌─QueryID─┬─groupArray(tuple(ecode, cnt))──────────────────────────────┐
│       5 │ [('1230',6),('1212',2),('1260',12),('1220',8),('1240',16)] │
└─────────┴────────────────────────────────────────────────────────────┘

Conclusion

Despite being a columnar database, ClickHouse provides flexible ways to store data in a less structured manner and a variety of functions to extract and aggregate it.

Happy data warehousing!

In-memory sysbench and a large server

Today I share worst-case performance results for MyRocks -- in-memory sysbench and a small database. I like MyRocks because it reduces space and write amplification, but I don't show results for that here. Besides, there isn't much demand for better compression from such a small database. This is part 1 with results from a large server.

tl;dr
  • MyRocks does worse than InnoDB 5.6.35 at low and mid concurrency for all tests except update-index. It suffers more on the read-heavy tests.
  • MyRocks and TokuDB are more competitive at mid and high concurrency than at low
  • InnoDB QPS at low concurrency tends to decrease after 5.6
  • InnoDB QPS at mid and high concurrency tends to increase after 5.6

Configuration

I use my sysbench fork and helper scripts, release-specific my.cnf files, and a server with 48 HW threads, a fast SSD and 256GB of RAM. The binlog was enabled, and sync-on-commit was disabled for both the binlog and the database log. I remembered to disable SSL.
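For readers who don't want to dig through the my.cnf files, "sync-on-commit disabled" corresponds roughly to settings like these for InnoDB; this is an illustrative sketch, not the author's exact configuration:

SET GLOBAL sync_binlog = 0;                     -- do not fsync the binlog on every commit
SET GLOBAL innodb_flush_log_at_trx_commit = 2;  -- do not fsync the InnoDB redo log on every commit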

I tested MyRocks, TokuDB and InnoDB: buffered IO with a 180GB database cache for MyRocks/TokuDB, and O_DIRECT with a 180GB buffer pool for InnoDB. The server is shared by the sysbench client and mysqld. For MyRocks I used a build from August 15 with git hash 0d76ae. For TokuDB I used Percona Server 5.7.17-12. For InnoDB I used upstream 5.6.35, 5.7.17 and 8.0.2. For InnoDB 8.0.2 I used the latin1 charset and latin1_swedish_ci collation. Compression was not used for any engine. More details are in the release-specific my.cnf files, and I used the same my.cnf for InnoDB 8.0.1 and 8.0.2. All mysqld processes use jemalloc.

The test used 8 tables with 1M rows/table. My use of sysbench is explained here. Tests are run in an interesting pattern -- load, write-heavy, read-only, insert-only. On the large server each test is run for 1, 2, 4, 8, 16, 24, 32, 40, 48 and 64 concurrent connections for either 3 or 5 minutes per concurrency level. So each test runs for either 30 or 50 minutes total and I hope that is long enough to get the database into a steady state. An example command line to run the test with my helper scripts is:
bash all.sh 8 1000000 180 300 180 innodb 1 0 /orig5717/bin/mysql none /sysbench.new

Results without charts

All of the data is here. Below I share the QPS ratios to compare the QPS for one engine with the QPS from InnoDB from MySQL 5.6.35. The engine is slower than InnoDB 5.6.35 when the ratio is less than 1.0. Things that I see include:
  • MyRocks does worse than InnoDB 5.6.35 at low and mid concurrency for all tests except update-index. It suffers more on the read-heavy tests.
  • MyRocks and TokuDB are more competitive at mid and high concurrency than at low
  • InnoDB QPS at low concurrency tends to decrease after 5.6
  • InnoDB QPS at mid and high concurrency tends to increase after 5.6

QPS ratio:
* rocks = myrocks / inno5635
* inno = inno5717 / inno5635
* toku = toku5717 / inno5635

1 connection
rocks   inno    toku
0.476   0.959   0.201   update-inlist
0.828   0.923   0.278   update-one
1.146   1.090   0.458   update-index
0.648   0.901   0.243   update-nonindex
0.740   0.898   0.236   update-nonindex-special
0.836   0.917   0.347   delete-only
0.711   1.041   0.586   read-write.range100
0.809   1.671   1.038   read-write.range10000
0.664   1.096   0.753   read-only.range100
0.801   1.657   1.263   read-only.range10000
0.641   0.905   0.735   point-query
0.442   0.923   0.753   random-points
0.480   0.900   0.694   hot-points
0.742   0.855   0.259   insert-only
-----   -----   -----
0.711   1.052   0.560   average of the above

8 connections
rocks   inno    toku
0.966   1.611   0.308   update-inlist
0.871   1.029   0.276   update-one
1.201   1.467   0.417   update-index
0.858   1.093   0.315   update-nonindex
0.898   1.090   0.314   update-nonindex-special
0.949   1.058   0.338   delete-only
0.710   1.039   0.534   read-write.range100
0.811   1.621   1.128   read-write.range10000
0.675   1.098   0.851   read-only.range100
0.810   1.639   1.263   read-only.range10000
0.648   0.910   0.746   point-query
0.541   1.097   0.931   random-points
0.754   1.317   1.037   hot-points
0.776   1.028   0.286   insert-only
-----   -----   -----
0.819   1.221   0.625   average of the above

48 connections
rocks   inno    toku
1.649   3.127   0.922   update-inlist
0.760   1.193   0.372   update-one
1.316   2.236   0.700   update-index
1.360   1.982   0.937   update-nonindex
1.374   1.965   0.995   update-nonindex-special
1.126   1.845   0.566   delete-only
0.804   1.129   0.507   read-write.range100
0.838   1.310   0.956   read-write.range10000
0.711   1.098   0.866   read-only.range100
0.823   1.305   1.034   read-only.range10000
0.932   1.347   1.084   point-query
1.417   2.920   2.248   random-points
1.840   3.226   2.350   hot-points
1.096   1.927   0.567   insert-only
-----   -----   -----
1.146   1.901   1.007   average of the above

Results with charts

The charts below show the same data as the previous section: the QPS for each engine relative to the QPS for InnoDB from MySQL 5.6.35.

Log Buffer #520: A Carnival of the Vanities for DBAs

This Log Buffer covers Oracle, SQL Server and MySQL.

Oracle:

Create a Custom Authentication and Authorization Scheme in Oracle APEX

Diagnosing Listener Issues in RAC

What Are NULL pname entries in v$process?

SecureFile LOB – the empty table

Apache Spark for High Energy Physics Analysis, an Example

SQL Server:

Imagine that your business is providing a service to individuals, and you charge by the amount of usage.

Incremental Package Deployment – A SSIS 2016 Feature

Implementing Azure SQL Data Sync in the Azure Portal

TDS Remoting: A Better Way to Create Linked Servers for ODBC Sources

Database Fundamentals #8: All About Data Types

MySQL:

Cost/Benefit Analysis of a MySQL Index

IO-bound sysbench on a smaller server

Looking at Disk Utilization and Saturation

MySQL 5.7 InnoDB Tablespace

Using Active Record migrations beyond SQLite

MySQL Connector/J 5.1.44 has been released

Dear MySQL Users,

MySQL Connector/J 5.1.44, a maintenance release of the production 5.1
branch, has been released. Connector/J is the Type-IV pure-Java JDBC
driver for MySQL.

MySQL Connector Java is available in source and binary form from the
Connector/J download pages at
http://dev.mysql.com/downloads/connector/j/5.1.html
and mirror sites as well as Maven-2 repositories.

MySQL Connector Java (Commercial) is already available for download on the
My Oracle Support (MOS) website. This release will be available on eDelivery
(OSDC) in next month’s upload cycle.

As always, we recommend that you check the “CHANGES” file in the
download archive to be aware of changes in behavior that might affect
your application.

MySQL Connector/J 5.1.44 includes the following general bug fixes and
improvements, also available in more detail on
http://dev.mysql.com/doc/relnotes/connector-j/en/news-5-1-44.html

Changes in MySQL Connector/J 5.1.44 (2017-08-30)

Version 5.1.44 is a maintenance release of the production 5.1
branch. It is suitable for use with MySQL Server versions
5.5, 5.6, and 5.7. It supports the Java Database Connectivity
(JDBC) 4.2 API.

Functionality Added or Changed

* A new connection property, enabledTLSProtocols, can now
be used to override the default restrictions on the TLS
versions to be used for connections, which are determined
by the version of the MySQL Server that is being
connected to. By providing a comma-separated list of
values to this option (for example,
“TLSv1,TLSv1.1,TLSv1.2”) users can, for example, prevent
connections from using older TLS versions, or allow
connections to use TLS versions only supported by a
user-compiled MySQL Server. See the entry for the new
property in Driver/Datasource Class Names, URL Syntax and
Configuration Properties for Connector/J
(http://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html)
for details. Thanks to Todd Farmer for contributing the
code. (Bug #26646676)

Bugs Fixed

* Important Change: Following the changes in MySQL Server
8.0.3, the system variables tx_isolation and tx_read_only
have been replaced with transaction_isolation and
transaction_read_only in the code of Connector/J. Users
should update Connector/J to this latest release in order
to connect to MySQL 8.0.3. They should also make the same
adjustments to their own applications if they use the old
variables in their code (see the sketch after this list).
(Bug #26440544)

* When using cached server-side prepared statements, a
memory leak occurred as references to opened statements
were being kept while the statements were being decached;
it happened either when the close() method had been
called twice on a statement, or when there were
conflicting cache entries for a statement and the older
entry had not been closed and removed from the opened
statement list. This fix makes sure the statements are
properly closed in both cases. Thanks to Eduard Gurskiy
for contributing to the fix. (Bug #26633984, Bug #87429)

* The regression test for Bug#63800 failed because the
default value of the MySQL Server system variable
explicit_defaults_for_timestamp changed in release 8.0.2.
The test has been adjusted to take the change into
consideration. (Bug #26501245)

On Behalf of the MySQL/ORACLE RE Team,
-Sreedhar S
