Planet MySQL

Webinar Thursday July 27, 2017: Database Backup and Recovery Best Practices (with a Focus on MySQL)

Join Percona Architect Manjot Singh as he presents Database Backup and Recovery Best Practices (with a Focus on MySQL) on Thursday, July 27, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

Register Now

In the case of a failure, do you know how long it will take to restore your database? Do you know how old the backup will be? In this presentation, we will cover the basics of best practices for backup, restoration and business continuity. Don’t put your company on the line due to bad data retention and backup policies.

Register for the webinar here.

Manjot Singh, Architect

Manjot Singh is an Architect with Percona in California. He loves to learn about new technologies and apply them to real-world problems. Manjot is a veteran of startup and Fortune 500 enterprise companies alike, with a few years spent in government, education and hospital IT. Now he consults for Percona with companies around the world on many interesting problems.

PHP and MySQL Without the SQL

Embedding Structured Query Language (SQL) within PHP, or other programming languages, has long been problematic for some. Mixing two programming languages together is simply not aesthetically pleasing, especially when you combine a declarative language (SQL) with a procedural, object-oriented one. But now, with the MySQL XDevAPI PECL extension, PHP developers can stop mixing the two languages together.

MySQL Document Store

The MySQL Document Store eliminates the heavy burden of SQL skills. It is designed to be a high-speed, schema-less data store and is based on the MySQL JSON data type. This gives you roughly a gigabyte of storage in a document format to do with as needed. So you do not need to architect your data beforehand when you have no idea how it will evolve, and there is no need to normalize your data. Behind the scenes is a powerful MySQL database server, but you are no longer writing SQL to use it!

But Is The Code Ugly?

If you looked at previous editions of this blog then you have seen examples of using the MySQL XDevAPI PECL extension. There is another example below of how to search for the information under various keys in a JSON document. The great news is that the code is all very modern-looking PHP with no messy SQL statements thumb-tacked onto the code. This should ease ongoing support by those with little or no SQL skills.

Previously you would have had to embed SELECT JSON_EXTRACT(doc,'$.Name') AS 'Country', JSON_EXTRACT(doc,'$.geography') AS 'Geo', JSON_EXTRACT(doc,'$.geography.Region') FROM world_x WHERE _id = "USA" as a string in the PHP code. If you prefer the -> operator as shorthand for JSON_EXTRACT, the query can be trimmed down to SELECT doc->"$.Name" AS 'Country', doc->"$.geography" AS 'Geo', doc->"$.geography.Region" FROM world_x WHERE _id = "USA".

But the XDevAPI simplifies these queries into $result = $collection->find('_id = "USA"')->fields(['Name as Country','geography as Geo','geography.Region'])->execute();. This is much easier for most to understand than the previous two queries. This example also shows how to chain down the document path, as it specifies all of the geography key's values as well as just the data under geography.Region, and how to alias columns from the document store to a label of the developer's choice.


#!/usr/bin/php
<?PHP
// Connection parameters
$user = 'root';
$passwd = 'hidave';
$host = 'localhost';
$port = '33060';
$connection_uri = 'mysqlx://'.$user.':'.$passwd.'@'.$host.':'.$port;
echo $connection_uri . "\n";

// Connect as a Node Session
$nodeSession = mysql_xdevapi\getNodeSession($connection_uri);
// "USE world_x"
$schema = $nodeSession->getSchema("world_x");
// Specify collection to use
$collection = $schema->getCollection("countryinfo");

// Query the Document Store
$result = $collection->find('_id = "USA"')->fields(['Name as Country','geography as Geo','geography.Region'])->execute();

// Fetch/Display data
$data = $result->fetchAll();
var_dump($data);
?>
And The Output
mysqlx://root:hidave@localhost:33060
array(1) {
[0]=>
array(3) {
["Geo"]=>
array(3) {
["Region"]=>
string(13) "North America"
["Continent"]=>
string(13) "North America"
["SurfaceArea"]=>
int(9363520)
}
["Country"]=>
string(13) "United States"
["geography.Region"]=>
string(13) "North America"
}
}
User Guide

The MySQL Shell User Guide is a great place to start learning how to interactively start using the Document Store.

Tips & Tricks - DevOps Database Glossary for the MySQL Novice

When you need to work with a database that you are not 100% familiar with, you can be overwhelmed by the hundreds of metrics available. Which ones are the most important? What should I monitor, and why? What patterns in metrics should ring some alarm bells? In this blog post we will try to introduce you to some of the most important metrics to keep an eye on while running MySQL or MariaDB in production.

Com_* status counters

We will start with Com_* counters - those define the number and types of queries that MySQL executes. We are talking here about query types like SELECT, INSERT, UPDATE and many more. It is quite important to keep an eye on those as sudden spikes or unexpected drops may suggest something went wrong in the system.
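If you just want a quick look at these counters without any monitoring tool, a minimal sketch is to sample them directly; they are cumulative since server start, so diff two readings taken a known interval apart to get rates:

-- cumulative counters of executed statements, by type
SHOW GLOBAL STATUS WHERE Variable_name IN
  ('Com_select', 'Com_insert', 'Com_update', 'Com_delete');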

Our all-inclusive database management system ClusterControl shows you this data related to the most common query types in the “Overview” section.

Handler_* status counters

Another category of metrics you should keep an eye on are the Handler_* counters in MySQL. Com_* counters tell you what kind of queries your MySQL instance is executing, but one SELECT can be totally different from another - a SELECT could be a primary key lookup, or it could be a table scan if an index cannot be used. Handler counters tell you how MySQL accesses stored data - this is very useful for investigating performance issues and assessing whether there is a possible gain from query review and additional indexing.

As you can see from the graph above there are many metrics to track (and ClusterControl graphs the most important ones) - we won’t cover all of them here (you can find descriptions in MySQL documentation) but we’d like to highlight the most important ones.


Handler_read_rnd_next - whenever MySQL accesses a row without an index lookup, in sequential order, this counter will be increased. If in your workload handler_read_rnd_next is responsible for a high percentage of the whole traffic, it means that your tables, most likely, could use some additional indexes because MySQL does plenty of table scans.

Handler_read_next and handler_read_prev - those two counters are updated whenever MySQL does an index scan - forward or backward. Handler_read_first and handler_read_last may shed some more light onto what kind of index scans those are - if we are talking about full index scan (forward or backward), those two counters will be updated.

Handler_read_key - this counter, on the other hand, if its value is high, tells you that your tables are well indexed as many of the rows were accessed through an index lookup.
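To check the balance between index lookups and sequential reads on your own server, you can pull the handler counters in one go (a sketch; the variable names are the same on MySQL and MariaDB, and the values are cumulative since startup):

-- compare index-based access (Handler_read_key, Handler_read_next)
-- with sequential access (Handler_read_rnd_next) to spot scan-heavy workloads
SHOW GLOBAL STATUS WHERE Variable_name IN
  ('Handler_read_key', 'Handler_read_first', 'Handler_read_last',
   'Handler_read_next', 'Handler_read_prev', 'Handler_read_rnd_next');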

Replication lag

If you are working with MySQL replication, replication lag is a metric you definitely want to monitor. Replication lag is inevitable and you will have to deal with it, but to deal with it you need to understand why it happens. For that, the first step is to know when it showed up.

Whenever you see a spike in replication lag, you'll want to check other graphs to get more clues - why did it happen? What might have caused it? Reasons vary - long, heavy DMLs, a significant increase in the number of DMLs executed in a short period of time, or CPU or I/O limitations.
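On a classic asynchronous replication slave, the usual first stop is SHOW SLAVE STATUS (a sketch; keep in mind that Seconds_Behind_Master is only an approximation of the real lag):

-- on the slave: watch Seconds_Behind_Master, Slave_IO_Running and Slave_SQL_Running
SHOW SLAVE STATUS\G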

InnoDB I/O

There are a number of important metrics to monitor that are related to I/O.

In the graph above, you can see a couple of metrics which tell you what kind of I/O InnoDB does - data writes and reads, redo log writes, fsyncs. Those metrics will help you to decide, for example, whether replication lag was caused by a spike of I/O or by some other reason. It's also important to keep track of those metrics and compare them with your hardware limitations - if you are getting close to the hardware limits of your disks, maybe it's time to look into this before it has more serious effects on your database performance.
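If you want to sample the same counters by hand, a minimal sketch using the standard InnoDB status variables (diff two readings to get per-second rates you can compare against your disks' capabilities):

-- InnoDB data file reads/writes, fsyncs and redo log writes
SHOW GLOBAL STATUS WHERE Variable_name IN
  ('Innodb_data_reads', 'Innodb_data_writes', 'Innodb_data_fsyncs',
   'Innodb_log_writes', 'Innodb_os_log_written');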

Galera metrics - flow control and queues

If you happen to use Galera Cluster (no matter which flavor you use), there are a couple more metrics you'd want to monitor closely; these are somewhat tied together. The first of them are metrics related to flow control.

Flow control, in Galera, is a means to keep the cluster in sync. Whenever a node stalls and cannot keep up with the rest of the cluster, it starts to send flow control messages asking the remaining cluster nodes to slow down. This allows it to catch up. This reduces the performance of the cluster, so it is important to be able to tell which node and when it started to send flow control messages. This can explain some of the slowdowns experienced by users or limit the time window and host to use for further investigation.
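Flow control activity is exposed through wsrep_* status variables, so you can check it per node (a sketch; the variable names are common to all Galera flavors):

-- run on each node: 'sent' shows which node asks the cluster to slow down,
-- 'recv' how often it is being asked, 'paused' roughly what fraction of time
-- replication has been paused
SHOW GLOBAL STATUS WHERE Variable_name IN
  ('wsrep_flow_control_sent', 'wsrep_flow_control_recv',
   'wsrep_flow_control_paused');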

The second set of metrics to monitor are the ones related to the send and receive queues in Galera.

Galera nodes can cache writesets (transactions) if they cannot apply all of them immediately. If needed, they can also cache writesets which are about to be sent to other nodes (if a given node receives writes from the application). Both cases are symptoms of a slow down which, most likely, will result in flow control messages being sent, and require some investigation - why it happened, on which node, at what time?
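The queue lengths are plain status variables as well, so finding the node that is backing up is straightforward (a sketch; the *_avg variants give a smoothed value):

-- a persistently non-zero recv queue means the node cannot apply write-sets
-- fast enough; a growing send queue points at network or remote-node problems
SHOW GLOBAL STATUS WHERE Variable_name IN
  ('wsrep_local_recv_queue', 'wsrep_local_recv_queue_avg',
   'wsrep_local_send_queue', 'wsrep_local_send_queue_avg');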

This is, of course, just the tip of the iceberg when we consider all of the metrics MySQL makes available - still, you can’t go wrong if you start watching those we covered here, in addition to regular OS/hardware metrics like CPU, memory, disk utilization and state of the services.

Tags:  MySQL devops MariaDB monitoring

Percona XtraBackup 2.4.8 is Now Available

Percona announces the GA release of Percona XtraBackup 2.4.8 on July 24, 2017. You can download it from our download site and apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

Bugs Fixed:
  • xtrabackup would hang with a Waiting for master thread to be suspended message when the backup was being prepared. Bug fixed #1671437.
  • xtrabackup would fail to prepare the backup with a 6th page is not initialized message in case the server didn’t properly initialize the page. Bug fixed #1671722.
  • xbstream could run out of file descriptors while extracting a backup which contains many tables. Bug fixed #1690823.
  • When a table was created with the DATA DIRECTORY option, xtrabackup would back up the .frm and .isl files, but not the .ibd file. Due to the missing .ibd files, the backup could not be restored. Bug fixed #1701736.
  • Percona XtraBackup incorrectly determined the use of master_auto_position on a slave, and thus generated an invalid xtrabackup_slave_info file. Bug fixed #1705193.
  • Percona XtraBackup will now print a warning if it encounters an unsupported storage engine. Bug fixed #1394493.
  • Percona XtraBackup would crash while backing up MariaDB 10.2.x with --ftwrl-* options. Bug fixed #1704636.
  • xtrabackup --slave-info didn’t write the correct information into the xtrabackup_slave_info file when multi-source replication was used. Bug fixed #1551634.
  • Along with the xtrabackup_checkpoints file, xtrabackup now copies the xtrabackup_info file into the directory specified by the --extra-lsndir option. Bug fixed #1600656.
  • The GTID position was not recorded when the --binlog-info option was set to AUTO. Bug fixed #1651505.

Release notes with all the bugfixes for Percona XtraBackup 2.4.8 are available in our online documentation. Please report any bugs to the launchpad bug tracker.

MySQL Connector/C 6.1.11 GA has been released

Dear MySQL Users,

A new GA (general availability) version of MySQL Connector/C has been
made available: MySQL Connector/C 6.1.11 GA. The MySQL Connector/C 6.1
implements the MySQL C API for connecting client applications to the
MySQL Server 5.5 or newer.

You can download the production release at:
http://dev.mysql.com/downloads/connector/c/6.1.html

MySQL Connector C (Commercial) will be available for download on the My
Oracle Support (MOS) website. This release will be available on
eDelivery (OSDC) in next month’s upload cycle.

Please see the documentation and the README file in the source distribution
for a detailed description of bugs that have been fixed.

Enjoy!

=======================================================================

Changes in MySQL Connector/C 6.1.11 (2017-07-24)

Security Notes

* The linked OpenSSL library for MySQL Connector/C 6.1
Commercial has been updated to version 1.0.2l. For a
description of issues fixed in this version, see
http://www.openssl.org/news/vulnerabilities.html.
This change does not affect the Oracle-produced MySQL
Community build of Connector/C, which uses the yaSSL
library instead. (Bug #26321042)

Bugs Fixed

* Microsoft Windows: Program Database (PDB) files (with
file name extension .pdb) for debugging now are available
as -debug Zip packages. They are intended for use with
the main Connector/C Zip packages and should be unpacked
on top of an unpacked main package. (Bug #23283189)

* If the mysql_stmt_close() C API function was called, it
freed memory that later could be accessed if
mysql_stmt_error(), mysql_stmt_errno(), or
mysql_stmt_sqlstate() was called. To obtain error
information after a call to mysql_stmt_close(), call
mysql_error(), mysql_errno(), or mysql_sqlstate()
instead. (Bug #25988703)

On behalf of the Oracle MySQL Release Engineering Team,
-Sreedhar S

MySQL Connector/ODBC 5.3.9 has been released

MySQL Connector/ODBC 5.3.9, a new version of the ODBC driver for
the MySQL database management system, has been released.

The available downloads include both a Unicode driver and an ANSI
driver based on the same modern codebase. Please select the driver type
you need based on the type of your application – Unicode or ANSI.
Server-side prepared statements are enabled by default. It is suitable
for use with any MySQL version from 5.5.

This is the fourth release of the MySQL ODBC driver conforming to the
ODBC 3.8 specification. It contains implementations of key 3.8 features,
including self-identification as an ODBC 3.8 driver, streaming of output
parameters (supported for binary types only), and support of the
SQL_ATTR_RESET_CONNECTION connection attribute (for the Unicode driver
only).

Also, Connector/ODBC 5.3 introduces a GTK+-based setup library
providing a GUI DSN setup dialog on some Unix-based systems, currently
included in the Debian 7/8/9, EL6/OL6, EL7/OL7 (64-bit only), Fedora 24/25/26,
FreeBSD 10/11, SLES 12, Ubuntu 14/16/17 packages. Other new
features in the 5.3 driver are FileDSN and Bookmarks support.

The release is now available in source and binary form for a number of
platforms from our download pages at

http://dev.mysql.com/downloads/connector/odbc/5.3.html

For information on installing, please see the documentation at

http://dev.mysql.com/doc/connector-odbc/en/connector-odbc-installation.html

Changes in MySQL Connector/ODBC 5.3.9 (2017-07-24, General
Availability)

Functionality Added or Changed

  • When compiling Connector/ODBC on Windows platforms, it can now be linked statically (equivalent to the /MT compiler option in Visual Studio) or dynamically (equivalent to the /MD compiler option in Visual Studio) to the Visual C++ runtime. The default option is to link dynamically; if you want to link statically, set the new CMake option, STATIC_MSVCRT, to true. Another new CMake option, WITH_NODEFAULTLIB, has been introduced for specifying the Visual C++ runtime you want to link Connector/ODBC to when using mixed link types (that is, when the link type to the Visual C++ runtime differs for Connector/ODBC and the MySQL client library it links to). See Building Connector/ODBC from a Source Distribution on Windows (http://dev.mysql.com/doc/connector-odbc/en/connector-odbc-installation-source-windows.html) for details.

Bugs Fixed

  • The SQLForeignKeys() function returned an empty result set when the information schema was being used. (Bug #26388694)
  • Calling SQLExecute() after calling SQLFreeStmt() with the SQL_RESET_PARAMS option resulted in an assertion failure. With this fix, a proper error is now thrown in the situation. (Bug #19148246)
  • A segmentation fault occurred in SQLFetch() when SQL_ATTR_CURSOR_TYPE was set to SQL_CURSOR_DYNAMIC. (Bug #18805392)
  • Connector/ODBC quit unexpectedly in SQLForeignKeys() when SQL_MODE was set to ANSI_QUOTES. (Bug #18641824)

On behalf of the MySQL/Oracle Release Engineering Team
Piotr Obrzut

jOOQ 3.10 Supports Exciting MySQL 8.0 Features

In recent months, there has been some really exciting news from the MySQL team: MySQL 8.0 brings support for window functions and (recursive) common table expressions.

These two SQL standard language features are among the most powerful SQL features that are available from most other databases. I frequently include them in conference talks about SQL (see my article about 10 SQL Tricks That You Didn’t Think Were Possible), and as well in the Data Geekery SQL Masterclass. With MySQL 8.0 now supporting these exciting features, the masterclass will be including MySQL as well (along with Oracle, SQL Server, PostgreSQL, and DB2). And, of course, these features are now supported in the upcoming jOOQ 3.10 as well.

Want to try it out yourself? Just run:

docker pull mysql:8.0.2
docker run --name MYSQL802 --net=host -p 3306:3306 -e MYSQL_ROOT_PASSWORD=test -d mysql:8.0.2

Then, connect to this instance and run this nice little query in it:

WITH RECURSIVE t(a, b) AS (
  SELECT 1, CAST('a' AS CHAR(15))
  UNION ALL
  SELECT t.a + 1, CONCAT(t.b, 'a')
  FROM t
  WHERE t.a < 10
)
SELECT a, SUM(a) OVER (ORDER BY a) AS ∑, b
FROM t

And get this result:

a    ∑    b
--------------------------
1    1    a
2    3    aa
3    6    aaa
4    10   aaaa
5    15   aaaaa
6    21   aaaaaa
7    28   aaaaaaa
8    36   aaaaaaaa
9    45   aaaaaaaaa
10   55   aaaaaaaaaa

Would you believe this is MySQL?

Bonus

A nice “hidden” feature is the support of new pessimistic locking clauses, in particular FOR UPDATE SKIP LOCKED. This has been available in Oracle for ages, more recently in PostgreSQL as well, and now in MySQL. A very useful feature when implementing simple message queues or reservation systems. More details in this article here:

http://mysqlserverteam.com/mysql-8-0-1-using-skip-locked-and-nowait-to-handle-hot-rows/

Of course, SKIP LOCKED (and NOWAIT) will be supported in jOOQ 3.10 as well.
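For example, a minimal sketch of the message-queue pattern (the jobs table and its columns are hypothetical, not from the article): each worker claims the next unclaimed row without blocking on rows other workers have already locked.

-- worker transaction: claim one pending job, skipping rows locked by other workers
START TRANSACTION;
SELECT id, payload
FROM jobs
WHERE status = 'pending'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- ... process the claimed row, then mark it done using the id returned above ...
UPDATE jobs SET status = 'done' WHERE id = 42;  -- 42 = id from the SELECT
COMMIT;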


Filed under: jooq-development, sql Tagged: Common Table Expressions, Hierarchical SQL, jooq, jOOQ 3.10, mysql, MySQL 8.0, Recursive Common Table Expressions, Recursive SQL, Window Functions

How to Install Pimcore with Nginx on CentOS 7

Pimcore is an open source enterprise content management system (CMS) written in PHP. It uses MySQL as its database backend. In this tutorial, we will install Pimcore CMS on a CentOS 7 server.

Community contributions to MariaDB

One of the goals of the MariaDB Foundation is to help new contributors understand the source code and to lower the barrier for new participants. One way to measure this is to look at the number of pull requests received and accepted, as these mostly reflect community contributions. The figures below are for the main […]

The post Community contributions to MariaDB appeared first on MariaDB.org.

How far can you go with MySQL or MariaDB?

MySQL's theoretical limits are known and can be found in the manual. They include: MyISAM permits data and index files to grow up to 256 TiB by default, but this limit can be changed up to the maximum permissible size of 65,536 TiB (256^7 − 1 bytes). The maximum InnoDB tablespace size depends on the InnoDB page size: 64 TiB for 16 KiB pages. (The maximum tablespace size is also the maximum size for a table.)

Why Thread May Hang in "Waiting for table level lock" State - Part I

The last time I had to use gdb to check what was going on in a MySQL server, and found something useful from it to share in a blog post, was April 2017, and I already miss that kind of experience. So, today I decided to try it again in the hope of getting some insights in cases when other tools may not apply, or may not be as helpful as one could expect. Here is a long enough story, based on a recent customer issue I worked on this week.
* * *

Have you seen anything like this output of the SHOW PROCESSLIST statement:

Id  User        Host               db         Command  Time  State                         Info                                                 Progress
...
28  someuser01  xx.xx.xx.xx:39644  somedb001  Sleep    247                                 NULL                                                 0.000
29  someuser01  xx.xx.xx.yy:44100  somedb001  Query    276   Waiting for table level lock  DELETE FROM t1 WHERE (some_id = 'NNNNN') AND ...     0.000
...
33  someuser01  xx.xx.zz.tt:35886  somedb001  Query    275   Waiting for table level lock  DELETE FROM t2 WHERE (some_id = 'XXXXX')             0.000
...
38  someuser01  xx.xx.xx.xx:57055  somedb001  Query    246   Waiting for table level lock  DELETE FROM t3 WHERE (some_id in (123456)) AND ...   0.000
...

recently? That is, many threads accessing InnoDB(!) tables and hanging in the "Waiting for table level lock" state for a long time, without any obvious reason visible in the SHOW PROCESSLIST or SHOW ENGINE INNODB STATUS output?

I've seen it this week, along with a statement from the customer that the server is basically unusable. Unlike in many other cases, I did not find any obvious problem in the outputs mentioned above (other than high concurrency and normal InnoDB row lock waits etc. for other sessions). There were no unexpected table level locks reported by InnoDB, and the longest running transactions were those in that waiting state.

Unfortunately, the fine MySQL manual is not very helpful when describing this thread state:
"Waiting for lock_type lock
The server is waiting to acquire a THR_LOCK lock or a lock from the metadata locking subsystem, where lock_type indicates the type of lock.
This state indicates a wait for a THR_LOCK:
  • Waiting for table level lock
These states indicate a wait for a metadata lock:
  • Waiting for event metadata lock
  • Waiting for global read lock
    ..."
What that "THR_LOCK" should tell the reader is beyond me (and probably most of MySQL users). Fortunately, I know from experience that this state usually means that thread is trying to access some MyISAM table. The tables mentioned in currently running statements were all InnoDB, though. Server had performance_schema enabled, but only with default instrumentation and consumers defined, so I had no chance to get anything related to metadata locks or much hope to see all previous statement executed by each thread anyway. I surely insisted on checking the source code etc, but this rarely gives immediate results.

So, to check my assumption and try to prove MyISAM tables are involved, and to get some hint about which sessions they were locked by, I suggested running mysqladmin debug, a command that is rarely used these days, with output that Oracle explicitly refused to document properly (see my Bug #71346)! I do know what to expect there (from the days when MyISAM was widely used), and when in doubt I can always check the source code.

The mysqladmin debug command outputs a lot of information into the error log. Among other things I found the following section:

Thread database.table_name Locked/Waiting Lock_type

...
928 somedb001.somewhat_seq Waiting - write High priority write lock
...
12940 somedb001.somewhat_seq Locked - write High priority write lock

where somedb001.somewhat_seq surely was the MyISAM table. There were actually many tables mentioned as locked, with waits on them, and names ending with the _seq suffix. One can now check what the MySQL thread with id=12940 is doing at all levels, where it comes from, whether it should be killed, etc.

Now it became clearer what was going on. Probably the application developers tried to implement sequences with MyISAM tables and triggers to use them when no data is provided for the columns! The idea is quite popular and described in many blog posts. Check this old blog post by Peter Zaitsev, for example. This assumption was confirmed, and essentially they did something like I do in the following deliberately primitive example:
-- this is our sequence table
mysql> show create table misam\G
*************************** 1. row ***************************
       Table: misam
Create Table: CREATE TABLE `misam` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

mysql> create table inno like misam;
Query OK, 0 rows affected (0.30 sec)

mysql> alter table inno engine=InnoDB;
Query OK, 0 rows affected (2.73 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table inno\G
*************************** 1. row ***************************
       Table: inno
Create Table: CREATE TABLE `inno` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.03 sec)

mysql> select * from misam;
Empty set (0.00 sec)

-- let's set up the first value for the sequence
mysql> insert into misam values(1);
Query OK, 1 row affected (0.05 sec)

-- now let's create a trigger to insert the value from the sequence
mysql> delimiter //
mysql> create trigger tbi before insert on inno
    -> for each row
    -> begin
    -> if ifnull(new.id,0) = 0 then
    ->   update misam set id=last_insert_id(id+1);
    ->   set new.id = last_insert_id();
    -> end if;
    -> end
    -> //

Query OK, 0 rows affected (0.25 sec)

mysql> delimiter ;
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into inno values(0);
Query OK, 1 row affected (0.20 sec)

mysql> insert into inno values(0);
Query OK, 1 row affected (0.10 sec)

mysql> select * from inno;
+----+
| id |
+----+
|  2 |
|  3 |
+----+
2 rows in set (0.00 sec)

mysql> select * from misam;
+----+
| id |
+----+
|  3 |
+----+
1 row in set (0.00 sec)

So, this is how it is supposed to work, and, you know, it works from two concurrent sessions without any locking noticed, etc.

This is the end of the real story, for now. We have a lot of things to discuss there with the customer, including how to find out the previous commands executed in each session, what useful details we can get from performance_schema, etc. The root cause was clear, but we still have to find out why the sessions are spending so much time holding the blocking locks on MyISAM tables and what we can do about that, or with an architecture based on such an implementation of sequences (that I kindly ask you to avoid whenever possible! Use auto_increment values, please, if you care about concurrency). I expect more blog posts eventually inspired by that real story, but now it's time to move to more generic technical details.
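(Just to illustrate that advice with a sketch, not something from the customer's system: the MyISAM table plus trigger construct above can usually be replaced by a plain InnoDB AUTO_INCREMENT column, which needs no table-level locks. The inno_auto table below is hypothetical:)

CREATE TABLE inno_auto (
  id INT NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

INSERT INTO inno_auto VALUES (NULL);  -- id is generated automatically
SELECT LAST_INSERT_ID();              -- last id generated in this session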
* * *

I have two problems with this story if I try to approach it in a more generic way.

First, do we have any other way to see MyISAM table level locks, besides the mysqladmin debug command? (Oracle kindly plans to replace it with something; let me quote Shane Bester's comment in the bug that originates from his internal feature request: "We need to replace it with appropriate performance_schema or information_schema tables.")

Second, how exactly do we reproduce the situation the customer reported? My initial attempts to get such a status with the trigger in place failed; I was not able to capture threads spending any notable time in the "Waiting for table level lock" state even with concurrent single row inserts and the trigger above in place. I was thinking about an explicit LOCK TABLES misam WRITE etc., but I doubt the real code uses that. Fortunately, the following INSERT running from one session:
mysql> insert into inno values(sleep(100));
Query OK, 1 row affected (1 min 40.20 sec)

allowed me to run an INSERT in the other session that got blocked:
mysql> insert into inno values(0);
Query OK, 1 row affected (1 min 36.36 sec)

and while it was blocked I saw the thread state I wanted in the output of SHOW PROCESSLIST:
mysql> show processlist;
+----+------+-----------+------+---------+------+------------------------------+-------------------------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State                        | Info                                | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+------------------------------+-------------------------------------+-----------+---------------+
| 16 | root | localhost | test | Query   |   17 | User sleep                   | insert into inno values(sleep(100)) |         0 |             0 |
| 17 | root | localhost | test | Query   |   13 | Waiting for table level lock | insert into inno values(0)          |         0 |             0 |
| 22 | root | localhost | test | Query   |    0 | starting                     | show processlist                    |         0 |             0 |
+----+------+-----------+------+---------+------+------------------------------+-------------------------------------+-----------+---------------+
3 rows in set (0.00 sec)

So, I know exactly how to reproduce the problem and what it can be caused by. Any slow-running single INSERT statement (caused by something complex executed in a trigger, some slow function used in the list of values inserted, or maybe just an INSERT ... SELECT ... from a big table) will give us the desired thread state.

Coming back to the first generic problem mentioned above, is there any way besides running mysqladmin debug and checking the output vs processlist etc, to identify the thread that holds MyISAM table level locks? One should expect performance_schema to help with this, at least as of 5.7 (I've used Percona Server 5.7.18-15 on my Ubuntu 14.04 netbook for testing today, while the original problem was on MariaDB 10.2.x. Percona is for fun, while MariaDB is for real work...). Indeed, we have table_handles there, and until recently I ignored its existence (it's new in 5.7, and I am not even sure if MariaDB 10.2.x has it already).

So, I reproduced the problem again and got the following there:
mysql> select * from performance_schema.table_handles;
+-------------+---------------+-------------+-----------------------+-----------------+----------------+---------------+----------------+
| OBJECT_TYPE | OBJECT_SCHEMA | OBJECT_NAME | OBJECT_INSTANCE_BEGIN | OWNER_THREAD_ID | OWNER_EVENT_ID | INTERNAL_LOCK | EXTERNAL_LOCK  |
+-------------+---------------+-------------+-----------------------+-----------------+----------------+---------------+----------------+
| TABLE       | test          | t           |       140392772351024 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | t1          |       140392772352560 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | ttime       |       140392772355632 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | misam       |       140392772358704 |              55 |             10 | WRITE         | WRITE EXTERNAL |
| TABLE       | test          | misam       |       140392896981552 |              54 |             10 | WRITE         | WRITE EXTERNAL |
| TABLE       | test          | inno        |       140392772361776 |              55 |             10 | NULL          | WRITE EXTERNAL |
| TABLE       | test          | inno        |       140392896983088 |              54 |             10 | NULL          | WRITE EXTERNAL |
| TABLE       | test          | inno        |       140392826836016 |               0 |              0 | NULL          | NULL           |
| TABLE       | test          | misam       |       140392826837552 |               0 |              0 | NULL          | NULL           |
+-------------+---------------+-------------+-----------------------+-----------------+----------------+---------------+----------------+
9 rows in set (0.00 sec)

What am I supposed to do with the above to find out the MySQL thread id of the blocking session? I am supposed to join to the performance_schema.threads table, looking for a thread_id value that is the same as the owner_thread_id above, 54 and 55. I get the following; note that processlist_id is what you are looking for in the processlist:
mysql> select * from performance_schema.threads where thread_id in (54, 55)\G
*************************** 1. row ***************************
          THREAD_ID: 54
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
     PROCESSLIST_ID: 27
   PROCESSLIST_USER: root
   PROCESSLIST_HOST: localhost
     PROCESSLIST_DB: test
PROCESSLIST_COMMAND: Sleep
   PROCESSLIST_TIME: 90
  PROCESSLIST_STATE: NULL
   PROCESSLIST_INFO: NULL
   PARENT_THREAD_ID: NULL
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: Socket
       THREAD_OS_ID: 32252
*************************** 2. row ***************************
          THREAD_ID: 55
               NAME: thread/sql/one_connection
               TYPE: FOREGROUND
     PROCESSLIST_ID: 28
   PROCESSLIST_USER: root
   PROCESSLIST_HOST: localhost
     PROCESSLIST_DB: test
PROCESSLIST_COMMAND: Sleep
   PROCESSLIST_TIME: 101
  PROCESSLIST_STATE: NULL
   PROCESSLIST_INFO: NULL
   PARENT_THREAD_ID: NULL
               ROLE: NULL
       INSTRUMENTED: YES
            HISTORY: YES
    CONNECTION_TYPE: Socket
       THREAD_OS_ID: 27844
2 rows in set (0.10 sec)

I do not see a way to distinguish a lock wait from actually holding the lock. All I know at the moment is that there is an engine-level lock on the misam table (and the inno table, for that matter). If the MySQL thread id is NOT for the thread that is "Waiting for table level lock", then it must be the thread that holds the lock! Some smart join with a proper WHERE clause would let me find out what I need directly. Maybe in one of the next parts I'll even present it, but writing it off the top of my head in a critical situation is above my current skills related to performance_schema.
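(For the record, a first attempt at such a join might look like the sketch below. This is my guess rather than a tested production query; it simply maps every session that currently holds an engine-level lock on the table of interest to its processlist id:)

SELECT th.OBJECT_SCHEMA, th.OBJECT_NAME, th.INTERNAL_LOCK, th.EXTERNAL_LOCK,
       t.PROCESSLIST_ID, t.PROCESSLIST_USER, t.PROCESSLIST_TIME
FROM performance_schema.table_handles th
JOIN performance_schema.threads t ON t.THREAD_ID = th.OWNER_THREAD_ID
WHERE th.INTERNAL_LOCK IS NOT NULL
  AND th.OBJECT_SCHEMA = 'test' AND th.OBJECT_NAME = 'misam';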
* * *

Now, what should those of us do who do not have performance_schema enabled, or who have to use a version before 5.7? Or those with access to gdb and 5 minutes to spare?

Surely we should attach gdb to the mysqld process and, if in doubt, read some parts of the source code to know where to start. I started with the following fragments of code that were easy to find (as long as you know that the COM_DEBUG command is what mysqladmin debug actually sends to the server):
openxs@ao756:~/git/percona-server$ grep -rni com_debug *
include/mysql/plugin_audit.h.pp:160:  COM_DEBUG,
include/mysql.h.pp:47:  COM_DEBUG,
include/my_command.h:39:  COM_DEBUG,
libmysql/libmysql.c:899:  DBUG_RETURN(simple_command(mysql,COM_DEBUG,0,0,0));
rapid/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task_debug.h:62:  "[XCOM_DEBUG] ",
sql/sql_parse.cc:1884:  case COM_DEBUG:
^C
...
# from the line highlighted above...
  case COM_DEBUG:
    thd->status_var.com_other++;
    if (check_global_access(thd, SUPER_ACL))
      break;                                    /* purecov: inspected */
    mysql_print_status();
    query_logger.general_log_print(thd, command, NullS);
    my_eof(thd);
    break;

openxs@ao756:~/git/percona-server$ grep -rni mysql_print_status *
sql/sql_test.h:40:void mysql_print_status();
sql/sql_parse.cc:59:#include "sql_test.h"         // mysql_print_status
sql/sql_parse.cc:1888:    mysql_print_status();
sql/sql_test.cc:456:void mysql_print_status()
sql/mysqld.cc:69:#include "sql_test.h"     // mysql_print_status
sql/mysqld.cc:2387:        mysql_print_status();   // Print some debug info
openxs@ao756:~/git/percona-server$

# from the line highlighted above...

void mysql_print_status()
{
  char current_dir[FN_REFLEN];
  STATUS_VAR current_global_status_var;

  printf("\nStatus information:\n\n");
  (void) my_getwd(current_dir, sizeof(current_dir),MYF(0));
  printf("Current dir: %s\n", current_dir);
  printf("Running threads: %u  Stack size: %ld\n",
         Global_THD_manager::get_instance()->get_thd_count(),
         (long) my_thread_stack_size);
  thr_print_locks();                            // Write some debug info
...
  printf("\nTable status:\n\
...
  display_table_locks();
...

# finally, let's find the code of display_table_locks...
openxs@ao756:~/git/percona-server$ grep -rni display_table_locks *
sql/sql_test.cc:371:static void display_table_locks(void)
sql/sql_test.cc:501:  display_table_locks();
sql/mysqld.cc:9537:  { &key_memory_locked_thread_list, "display_table_locks", PSI_FLAG_THREAD},
openxs@ao756:~/git/percona-server$

static void display_table_locks(void)
{
  LIST *list;
  Saved_locks_array saved_table_locks(key_memory_locked_thread_list);
  saved_table_locks.reserve(table_cache_manager.cached_tables() + 20);

  mysql_mutex_lock(&THR_LOCK_lock);
  for (list= thr_lock_thread_list; list; list= list_rest(list))
  {
    THR_LOCK *lock=(THR_LOCK*) list->data;
    mysql_mutex_lock(&lock->mutex);
    push_locks_into_array(&saved_table_locks, lock->write.data, FALSE,
                          "Locked - write");
    push_locks_into_array(&saved_table_locks, lock->write_wait.data, TRUE,
                          "Waiting - write");
    push_locks_into_array(&saved_table_locks, lock->read.data, FALSE,
                          "Locked - read");
    push_locks_into_array(&saved_table_locks, lock->read_wait.data, TRUE,
                          "Waiting - read");
    mysql_mutex_unlock(&lock->mutex);
  }
...

That's enough to start. We'll finally have to study what THR_LOCK is. There is a global (doubly) linked list of them all, thr_lock_thread_list, and this is where we can get the details from, in a way similar (let's hope) to what the server does while processing the COM_DEBUG command for us. So, let's attach gdb and let the fun begin:
openxs@ao756:~/git/percona-server$ sudo gdb -p 23841
[sudo] password for openxs:
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
...
Loaded symbols for /lib/x86_64-linux-gnu/libnss_files.so.2
0x00007fafd93d9c5d in poll () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) set $list=(LIST *)thr_lock_thread_list
(gdb) set $lock=(THR_LOCK*) $list->data
(gdb) p *($lock)
$1 = {list = {prev = 0x0, next = 0x7fafd2fa4780, data = 0x7fafbd545c80},
  mutex = {m_mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
        __nusers = 0, __kind = 3, __spins = 0, __elision = 0, __list = {
          __prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\003", '\000' <repeats 22 times>,
      __align = 0}, m_psi = 0x7fafd1f97c80}, read_wait = {data = 0x0,
    last = 0x7fafbd545cc8}, read = {data = 0x0, last = 0x7fafbd545cd8},
  write_wait = {data = 0x0, last = 0x7fafbd545ce8}, write = {
    data = 0x7fafbd69f188, last = 0x7fafbd69f190}
, write_lock_count = 0,
  read_no_write_count = 0, get_status = 0xf84910 <mi_get_status>,
  copy_status = 0xf84a70 <mi_copy_status>,
  update_status = 0xf84970 <mi_update_status>,
  restore_status = 0xf84a50 <mi_restore_status>,
  check_status = 0xf84a80 <mi_check_status>}

In the above we started from the first item in the list. If needed we can move on to the next one with set $list=(LIST *)($list->next) etc. We then interpret $list->data as a (THR_LOCK *). There we can see the read, read_wait, write and write_wait structures. One of them will have a data item that is not 0x0. This way we can identify whether it is a lock or a lock wait, and what kind of lock it is. In my case I was looking for a write lock, and here it is, the very first one. So, I continue:
(gdb) p *($lock)->write.data
$2 = {owner = 0x7fafbd468d38, next = 0x0, prev = 0x7fafbd545cf8,
  lock = 0x7fafbd545c80, cond = 0x0, type = TL_WRITE,
  status_param = 0x7fafbd69ee20, debug_print_param = 0x7fafbd515020,
  m_psi = 0x7fafa4c06cc0}

We can check the owner field:
(gdb) p *($lock)->write.data.owner
$3 = {thread_id = 16, suspend = 0x7fafbd469370}

and thread_id there is what we were looking for, the MySQL thread id of the blocking thread. If we want to get the details about the table locks we can do this as follows, by studying debug_print_param in data:
(gdb) set $table=(TABLE *) &(*($lock)->write.data->debug_print_param)
(gdb) p $table
$4 = (TABLE *) 0x7fafbd515020
(gdb) p *($table)
$5 = {s = 0x7fafbd4d7030, file = 0x7fafbd535230, next = 0x0, prev = 0x0,
  cache_next = 0x0, cache_prev = 0x7fafc4df05a0, in_use = 0x7fafbd468000,
  field = 0x7fafbd4d7440, hidden_field_count = 0, record = {
    0x7fafbd4d7430 "\377", 0x7fafbd4d7438 "\n"}, write_row_record = 0x0,
  insert_values = 0x0, covering_keys = {map = 0}, quick_keys = {map = 0},
  merge_keys = {map = 0}, possible_quick_keys = {map = 0},
  keys_in_use_for_query = {map = 0}, keys_in_use_for_group_by = {map = 0},
  keys_in_use_for_order_by = {map = 0}, key_info = 0x7fafbd4d7508,
  next_number_field = 0x0, found_next_number_field = 0x0, vfield = 0x0,
  hash_field = 0x0, fts_doc_id_field = 0x0, triggers = 0x0,
  pos_in_table_list = 0x7fafbd47a9b8, pos_in_locked_tables = 0x7fafbd4e5030,
  group = 0x0, alias = 0x7fafbbbad120 "misam",
  null_flags = 0x7fafbd4d7430 "\377", bitmap_init_value = 0x0, def_read_set = {
    bitmap = 0x7fafbd4d75a8, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75a8, mutex = 0x0}, def_write_set = {
    bitmap = 0x7fafbd4d75ac, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75ac, mutex = 0x0}, tmp_set = {
    bitmap = 0x7fafbd4d75b0, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75b0, mutex = 0x0}, cond_set = {
    bitmap = 0x7fafbd4d75b4, n_bits = 1, last_word_mask = 4294967294,
    last_word_ptr = 0x7fafbd4d75b4, mutex = 0x0},
  def_fields_set_during_insert = {bitmap = 0x7fafbd4d75b8, n_bits = 1,
    last_word_mask = 4294967294, last_word_ptr = 0x7fafbd4d75b8, mutex = 0x0},
---Type <return> to continue, or q <return> to quit---
  read_set = 0x7fafbd515128, write_set = 0x7fafbd515148,
  fields_set_during_insert = 0x7fafbd5151a8, query_id = 0, quick_rows = {
    0 <repeats 64 times>}, const_key_parts = {0 <repeats 64 times>},
  quick_key_parts = {0 <repeats 64 times>}, quick_n_ranges = {
    0 <repeats 64 times>}, quick_condition_rows = 0, lock_position = 0,
  lock_data_start = 0, lock_count = 1, temp_pool_slot = 0, db_stat = 39,
  current_lock = 1, nullable = 0 '\000', null_row = 0 '\000',
  status = 3 '\003', copy_blobs = 0 '\000', force_index = 0 '\000',
  force_index_order = 0 '\000', force_index_group = 0 '\000',
  distinct = 0 '\000', const_table = 0 '\000', no_rows = 0 '\000',
  key_read = 0 '\000', no_keyread = 0 '\000', locked_by_logger = 0 '\000',
  no_replicate = 0 '\000', locked_by_name = 0 '\000',
  fulltext_searched = 0 '\000', no_cache = 0 '\000',
  open_by_handler = 0 '\000', auto_increment_field_not_null = 0 '\000',
  insert_or_update = 0 '\000', alias_name_used = 0 '\000',
  get_fields_in_item_tree = 0 '\000', m_needs_reopen = 0 '\000',
  created = true, max_keys = 0, reginfo = {join_tab = 0x0, qep_tab = 0x0,
    lock_type = TL_WRITE, not_exists_optimize = false,
    impossible_range = false}, mem_root = {free = 0x7fafbd4d7420,
    used = 0x7fafbd535220, pre_alloc = 0x0, min_malloc = 32, block_size = 992,
    block_num = 6, first_block_usage = 0, max_capacity = 0,
    allocated_size = 2248, error_for_capacity_exceeded = 0 '\000',
    error_handler = 0xc4b240 <sql_alloc_error_handler()>, m_psi_key = 106},
---Type <return> to continue, or q <return> to quit---
  blob_storage = 0x0, grant = {grant_table = 0x0, version = 0,
    privilege = 536870911, m_internal = {m_schema_lookup_done = true,
      m_schema_access = 0x0, m_table_lookup_done = true,
      m_table_access = 0x0}}, sort = {filesort_buffer = {m_next_rec_ptr = 0x0,
      m_rawmem = 0x0, m_record_pointers = 0x0, m_sort_keys = 0x0,
      m_num_records = 0, m_record_length = 0, m_sort_length = 0,
      m_size_in_bytes = 0, m_idx = 0}, io_cache = 0x0, merge_chunks = {
      m_array = 0x0, m_size = 0}, addon_fields = 0x0,
    sorted_result_in_fsbuf = false, sorted_result = 0x0,
    sorted_result_end = 0x0, found_records = 0}, part_info = 0x0,
  all_partitions_pruned_away = false, mdl_ticket = 0x7fafbd4c4f10,
  m_cost_model = {m_cost_model_server = 0x0, m_se_cost_constants = 0x0,
    m_table = 0x0}, should_binlog_drop_if_temp_flag = false}

We already see the table alias, but more details are in the s field (of TABLE_SHARE * type):

(gdb) p $table->s
$26 = (TABLE_SHARE *) 0x7fafbd4d7030
...
(gdb) p $table->s->table_cache_key
$27 = {str = 0x7fafbd4d73b8 "test", length = 11}
...
(gdb) p $table->s->db
$30 = {str = 0x7fafbd4d73b8 "test", length = 4}
(gdb) p $table->s->path
$31 = {str = 0x7fafbd4d73c8 "./test/misam", length = 12}
(gdb) p $table->s->table_name
$32 = {str = 0x7fafbd4d73bd "misam", length = 5}
# to remind you, this is how to get the thread id directly
(gdb) p (*($lock)->write.data.owner).thread_id
$37 = 16

I skipped some of my tests and intermediate results. To summarize, for me it took only a bit more time to search the code and try commands in gdb than to find out that metadata_locks does not help and one has to use table_handles in performance_schema. Next time in gdb I'll do it instantly, and with far fewer keystrokes :)

Surely, for production environments I'll have to write, test and note somewhere proper queries to the performance_schema. For next attempts with gdb and, maybe, FOSDEM talks I'll have to read the code to remember key details about the data structures used above... More posts to come, stay tuned!

MySQL Connector/J 5.1.43 has been released

Dear MySQL Users,

MySQL Connector/J 5.1.43, a maintenance release of the production 5.1 branch has been released. Connector/J is the Type-IV pure-Java JDBC driver for MySQL.

MySQL Connector Java is available in source and binary form from the Connector/J download pages at http://dev.mysql.com/downloads/connector/j/5.1.html and mirror sites, as well as Maven repositories.

MySQL Connector Java (Commercial) is already available for download on the My Oracle Support (MOS) website. This release will be available on eDelivery
(OSDC) in next month’s upload cycle.

As always, we recommend that you check the “CHANGES” file in the download archive to be aware of changes in behavior that might affect your application.

MySQL Connector/J 5.1.43 includes the following general bug fixes and improvements, also available in more detail on http://dev.mysql.com/doc/relnotes/connector-j/en/.

Changes in MySQL Connector/J 5.1.43 (2017-07-21)

Version 5.1.43 is a maintenance release of the production 5.1 branch. It is suitable for use with MySQL Server versions 5.5, 5.6, and 5.7. It supports the Java Database Connectivity (JDBC) 4.2 API.

Functionality Added or Changed

* Update the time zone mappings using the latest IANA and CLDR time zone databases. (Bug #25946965)

* A new option for the loadBalancingStrategy connection property called serverAffinity has been added. The servers listed in the new connection property serverAffinityOrder (which should be a subset of the servers in the host list of the connection URL) are contacted in the order they are listed until a server is available or until the list of servers is exhausted, at which point a random load-balancing strategy is used with the hosts not listed by serverAffinityOrder. See the descriptions for loadBalancingStrategy and serverAffinityOrder in Driver/Datasource Class Names, URL Syntax and Configuration Properties for Connector/J
(http://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html)
for details. (Bug #20182108)

Bugs Fixed

* Secure JDBC connections did not fall back to the default truststore when a custom one was not provided. (Bug
#26243128)

* Connector/J failed a number of regression tests in the testsuite related to geographic information system (GIS)
functions because of changes to GIS support by the MySQL server. The fix corrects the tests. (Bug #26239946, Bug
#26140577)

* Attempts to connect to a server started with collation utf8mb4_de_pb_0900_ai_ci resulted in null pointer exceptions. (Bug #26090721)

* In com/mysql/jdbc/ServerPreparedStatement.java, the arguments resultSetType and resultSetConcurrency for a call of Connection.prepareStatement() were swapped. (Bug #25874048, Bug #85885)

* A NullPointerException was returned when getDate(),
getTime(), or getTimestamp() was called with a null Calendar. This fix makes Connector/J throw an SQLException in the case. (Bug #25650305)

* Some JDBC proxied objects were missing the proper handling of the equals() method, thus even a comparison of one of these proxied objects to its own self would return false. This patch introduces proper handling for the equals() method in all the relevant proxies. (Bug #21931572, Bug #78313)

* A server-side prepared statement was not closed when the same statement was being prepared again while the original statement was being cached. This was caused by the replacement of the cache entry of the old statement by the new one. When this happened repeatedly, it eventually caused the complaint that max_prepared_stmt_count was exceeded. The fix makes sure that when a cache entry for a statement replaces an older one, the older statement is immediately closed. (Bug #20066806, Bug #74932)

On Behalf of the MySQL/Oracle Release Engineering Team,
Hery Ramilison

Log Buffer #517: A Carnival of the Vanities for DBAs

This Log Buffer Edition covers Oracle, SQL Server and MySQL.

Oracle:

Protecting Financial Data with Oracle WebCenter and Adobe LiveCycle

Oracle Forms 12c oracle.security.jps.JpsException Error after Database change

The Future of Content Management: Oracle Content & Experience Cloud

Today Oracle released a very large „monster“ upgrade. This July 2017 Update includes for the first time the new RU „Release Update“ and RUR „Release Update Revision“ patches.

Cloud Ward: Who Will Win the Battle for IT’s Future?

SQL Server:

SQL Server Management Studio add-ins

Resolve Network Binding Order Warning in failover cluster

Queries to inventory your SQL Server Agent Jobs

SQL Server 2016 ColumnStore Index String Predicate Pushdown

The Fast Route from Raw Data to Analysis and Display

MySQL:

Group-Replication, sweet & sour

You QA Most of Your System — What About Your Database?

Multi-Threaded Slave Statistics

Protecting your data! Fail-safe enhancements to Group Replication.

InnoDB Basics – Compaction: when and when not

Faster Node Rejoins with Improved IST performance

In this blog, we’ll look at how improvements to Percona XtraDB Cluster improved IST performance.

Introduction

Starting in version 5.7.17-29.20, Percona XtraDB Cluster significantly improved performance. Depending on the workload, the increase in throughput is in the range of 3-10x. (More details here.) These optimization fixes also helped improve IST (Incremental State Transfer) performance. This blog is aimed at studying the IST impact.

IST

IST stands for Incremental State Transfer. When a node of the cluster leaves the cluster for a short period of time and then rejoins, it needs to catch up with the cluster state. As part of this sync process, an existing node of the cluster (aka DONOR) donates the missing write-sets to the rejoining node (aka JOINER). In short, the flow involves applying the missing write-sets on the JOINER just as during active workload replication.
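(As a side note, and only as a minimal sketch rather than part of the benchmark below: while IST is running you can watch the JOINER catch up by polling a few wsrep status variables on it:)

-- on the joining node: the state should move through Joined back to Synced,
-- and wsrep_last_committed shows how far the node has applied
SHOW GLOBAL STATUS WHERE Variable_name IN
  ('wsrep_local_state_comment', 'wsrep_last_committed', 'wsrep_local_recv_queue');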

Percona XtraDB Cluster / Galera already can apply write-sets in parallel using multiple applier threads. Unfortunately, due to commit contention, the commit action was serialized. This was fixed in the above Percona XtraDB Cluster release, allowing commits to proceed in parallel.

IST uses the same path for applying write-sets, except that it is more like a batch operation.

IST Performance

Let’s look at IST performance before and now.

Setup

  1. A two-node cluster (node-1 and node-2), with gcache configured large enough to avoid purging, as we need IST
  2. Start workload against node-1 for 30 seconds
  3. Shutdown node-2
  4. Start workload that performs 4M requests against node-1. Workload produces ~3.5M write-sets that are cached in gcache and used later for IST
  5. Start node-2 with N-applier threads
  6. Wait until IST is done
  7. Repeat steps 3-6 with different values of N.

Observations:

  • IST is 4x faster with PXC-5.7.17 (compared to previous releases)
  • Improved performance means a quicker node rejoin, and an overall increase in cluster productivity as joiner node is available to process the workload more quickly
Conclusion

Percona XtraDB Cluster 5.7.17 significantly improved IST performance. A faster re-join of the node effectively means better cluster productivity and flexibility in planning maintenance windows. So what are you waiting for? Upgrade to Percona XtraDB Cluster 5.7.17 or the latest Percona XtraDB Cluster 5.7 release and experience the power!

Fun with Bugs #54 - On Some Bugs Fixed in MySQL 5.7.19

More than 3 months after 5.7.18 we've got MySQL 5.7.19 released recently. This is my quick review of the release notes, with interesting fixed bugs (reported in public) highlighted in the areas I am usually interested in.

Let's start with InnoDB. The following bug fixes attracted my attention:
  • Bug #85043 is still private. You know how much I hate those. At least we can see it was about the fact that "The server allocated memory unnecessarily for an operation that rebuilt the table." Let's hope this is no longer the case.
  • Bug #84038 - "Errors when restarting MySQL after FLUSH TABLES FOR EXPORT, RENAME and DROP.", was reported by Jean-François Gagné and verified by Umesh Shastry. Basically, InnoDB failed to update INNODB_SYS_DATAFILES internal data dictionary table properly while renaming tables.
  • Bug #80788 - "Reduce the time of looking for MLOG_CHECKPOINT during crash recovery". It was reported by  Zhai Weixiang (who had provided a patch) and quickly verified by Umesh Shastry.
  • Bug #83470 - "Reverse scan on a partitioned table does ICP check incorrectly, causing slowdown". This report that is related to both InnoDB, partitioning and optimizer, was created by my colleague from MariaDB, Sergey Petrunya, who had also suggested a patch under BSD license. Bug #84107 was considered a duplicate of this one it seems.
There were some public bugs fixed in group replication. While I do not really care, yet, about Group Replication, it's nice to see bugs in this area fixed so fast:
  • Bug #85667 - "GR node is in RECOVERING state if binlog_checksum configured on running server". This bug was reported by Ramana Yeruva.
  • Bug #85047 - "Secondaries allows transactions through ASYNC setup into the group". Yet another group replication bug recently fixed. It was reported by Narendra Chauhan who probably works for Oracle.
  • Bug #84728 - "Not Possible To Avoid MySQL From Starting When GR Cannot Start", was reported by Kenny Gryp from Percona.
  • Bug #84329 - "Some Class B private address is not permitted automatically". This funny bug was reported by Satoshi Mitani and verified by Umesh Shastry
  • Bug #84153 - "ASSERT "!contains_gtid(gtid)" rpl_gtid_owned.cc:94 Owned_gtids::add_gtid_owner". It was reported by Narendra Chauhan.
Usual async replication was also improved in 5.7.19, with the following related bugs fixed:
  • Bug #83184 - "Can't set slave_skip_errors > 3000". This bug was reported by Tsubasa Tanaka (who had provided a patch) and verified by Umesh Shastry.
  • Bug #82283 - "main.mysqlbinlog_debug fails with a LeakSanitizer error". Laurynas Biveinis from Percona found it and provided patches. The bug was verified by Umesh Shastry.
  • Bug #81232 - "Changing master_delay after stop slave results in loss of events". This regression bug (5.6.x was not affected) was reported by Parveez Baig. Bug #84249 (reported by Sergio Roysen) was declared a duplicate of this one.
  • Bug #77406 - "MTS must be able handle large packets with small slave_pending_jobs_size_max", was reported by my colleague Andrii Nikitin while he worked in Oracle.
  • Bug #84471 is still private. Why should it be so, when the bug is supposed to be about "...the master could write to the binary log a last_committed value which was smaller than it should have been. This could cause the slave to execute in parallel transactions which should not have been, leading to inconsistencies or other errors."? Let's hide the details about all bugs that may lead to data corruption, why not?
  • Bug #83186 - "Request for slave_skip_errors = ER_INCONSISTENT_ERROR". Thanks to  Tsubasa Tanaka, who had provided a patch, we can do this now.
  • Bug #82848 - "Restarting a slave after seeding it with a mysqldump loses it's position". This serious problem with GTID-based replication was noted and resolved (with a patch) by Daniël van Eeden.
  • Bug #82209 is also private... Now, I am tired of them, just go read the release notes and hope they describe the problem and solution right. Great for any open source software, isn't it?
  • Bug #81119 - "multi source replication slave sql running error 1778". This serious bug (where GTIDs also play role) was reported by Mohamed Atef. Good to see it fixed finally.
I know that in the long term the query cache is doomed and Oracle is not going to work on it any more, but in the meantime, please, check this important bug fixed: Bug #86047 - "FROM sub query with 'group by' is cached by mistake", reported by Meiji Kimura. If you still use the query cache and have complex queries with derived tables, please, consider an upgrade. You might have been getting wrong results all this time...

If you use xtrabackup or Oracle's MySQL Enterprise Backup in the environment with XA transactions, please, check Bug #84442 - "XA PREPARE inconsistent with XTRABACKUP", by David Zhao, who had also provided a patch. Your backups may be inconsistent until you upgrade...

Now, on some optimizer bugs fixed:
  • Bug #81031 - "count(*) on innodb sometimes returns 0". This happened when index merge was used by the optimizer. The bug was reported by Ashraf Amayreh and verified by Shane Bester.
  • Bug #84320 - "DISTINCT clause does not work in GROUP_CONCAT". It was reported by Varun Gupta and verified by Bogdan Kecman. See Bug #68145 also (that is still "Verified").
  • Bug #79675 - "index_merge_intersection optimization causes wrong query results". This bug was reported by Saverio M and verified by Miguel Solorzano.
There were a lot more bugs fixed in 5.7.19. I'd say that if one uses replication (especially GTID-based or group replication), an upgrade is a must, but I have not tried this version myself yet.

Group Replication – Extending Group Replication performance_schema tables

In MySQL 8.0.2, users will see additional columns in the existing Group Replication Performance Schema tables, providing extended information about Group Replication. Users can now view the role and MySQL version of each member of the group, which previously required a complex set of queries…
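For example, something along these lines should now return the role and version of each member directly (a sketch, assuming the column names documented for 8.0.2):

-- MEMBER_ROLE and MEMBER_VERSION are the columns exposed in 8.0.2
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE, MEMBER_VERSION
  FROM performance_schema.replication_group_members;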

The Rig

Back in May I wrote about a web service I’m working on that uses local storage and replication with two-phase commit. I pulled out the core of it and created a package that I’m calling the Rig.

It’s up on GitHub already: https://github.com/Preetam/rig

The goal of the Rig is to take some web service and add a log and replication on top.

A service is simply something that accepts operations.

type Service interface {
	Validate(Operation) error
	Apply(uint64, Operation) error
	LockResources(Operation) bool
	UnlockResources(Operation)
}

Each operation is associated with an entry in a log.

type LogPayload struct {
	Version uint64    `json:"version"`
	Op      Operation `json:"op"`
}

And an operation is a method with some data.

type Operation struct {
	Method string          `json:"method"`
	Data   json.RawMessage `json:"data"`
}

The service I created has methods like user_create, user_update, user_delete, and so on.

The Rig is still very rough. It kinda works… meaning it works great during “happy” times. There’s a lot of work to do to handle errors. Synchronous replication is one of those things that makes a lot of sense in the abstract, but can be all over the place in implementation.

For example, consider synchronous replication with two nodes. Writes succeed as long as both the primary and the replica acknowledge the write. If the primary fails, usually the replica takes over without any loss of data. But then you’re left with only one node running, and no more replication. What if instead of the primary failing, only the replica fails? Should the primary keep going? If it does, didn’t we just ignore the synchronous part of the replication?

But that’s the expected behavior for most systems. You want the system to be available even if a node fails. And when a node fails, you don’t want to lose any data. It’s kind of complicated to decide whether to just wait for a failing node or consider it failed and move on.

I think it’s interesting to see how MySQL/MariaDB does it with semi-sync replication. The master will wait up to some configurable number of milliseconds for a replica to respond during semi-sync mode, and when the timeout is exceeded the master will leave the replica behind and continue in async mode. That way you’re not stuck with a failed replica, but are synchronous during “happy” times.
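For reference, the knobs behind that behaviour look roughly like this on the MySQL side (a sketch, assuming the stock semi-sync plugins are available; 10000 ms is simply the default timeout):

-- on the master: wait up to 10 seconds for a replica ACK, then fall back to async
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 10000;

-- on the replica
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;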

The Rig just crashes the program right now when it can’t make progress with the replica =P. I’m working on it.

Other stuff I was looking at:

Group-Replication, sweet & sour

A story around replication lag and Flow-Control.

Overview

In the last few months we have had 2 main actors in the MySQL ecosystem: ProxySQL and Group Replication (with its evolution into InnoDB Cluster).

While I have covered the first extensively, my last serious work on GR goes back to some lab version from past years.
Given Oracle's decision to declare it GA, and Percona's decision to provide some level of support for GR, I decided it was time for me to take a look at it again.
A lot of reviews have already been done covering different topics. I saw articles about GR and performance, GR and basic functionality (or the lack of it, like automatic node provisioning), GR and ProxySQL, and so on.

But one question kept coming up in my mind. If GR and InnoDB Cluster are to work as an alternative to other (virtually) synchronous replication mechanisms, what changes or shifts must our customers consider if they want to move from one to the other?
In solutions using Galera, like Percona XtraDB Cluster (PXC), there is a main concept all of us must refer to: the cluster is data-centric. In the end, what matters is the data and its state, which must be exactly the same on each node at a given time (commit/apply). To guarantee this, PXC and others use a set of data validation and Flow Control mechanisms that make it possible for the cluster dataset to be consistent on each node, respecting the main principle (be data-centric).
An immediate consequence of this principle is that an application can query ANY node in the PXC cluster and be sure to get the same data, or write and know that the data will be visible on all nodes (virtually) at the same time.
Last but not least, if a node is not consistent with the others, it will be excluded and must be rebuilt, or its inconsistency fixed, before it can join back.

Not a minor thing if you think about it carefully, and a very useful thing to have, because it allows you to transparently split write/read operations, or fail over from one node to another with less trouble, and more.

When I thought about GR (or InnoDB Cluster), I put myself in the customer's shoes and asked my other self: "Aside from all the other things we know (see above), what is the real impact of moving from PXC to GR/InnoDB Cluster for my application? Because when you tell me that GR still (basically) uses replication with binlogs and relay logs, even with a Flow-Control mechanism on top, an alarm bell starts to ring in my mind."

My answer was: "Let us do a POC and see what is really going on in those terms."


 

The POC

Given that, I set up a simple set of servers using GR, with a very basic application performing writes on a single Writer node and (eventually) reads on the other nodes. The schema definition can be found here; mainly I used the 4 tables from my windmills test suite, nothing special, weird, or specifically designed for GR. Actually, I had used this test a lot for PXC in the past, so it was a perfect fit.

Test definition

The application does very simple work, and I planned to test 4 main cases:

  • One thread performing one insert at each transaction.
  • One thread performing 50 batched inserts at each transaction.
  • 8 threads performing one insert at each transaction.
  • 8 threads performing 50 batched inserts at each transaction.

As you can see, nothing crazy, quite a simple operation.
Then I decided to test it using the following 4 conditions on the servers (a rough sketch of the matching server settings follows the list):

  • Single slave worker FC as default
  • Single slave worker FC set to 25
  • 8 slave workers FC as default
  • 8 slave workers FC set to 25
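
A rough sketch of how these conditions map to server settings (variable names from MySQL 5.7 Group Replication; treat the exact commands as an assumption about the POC setup rather than a copy of it):

-- applier parallelism: single worker (the default) vs. 8 workers
SET GLOBAL slave_parallel_workers = 8;

-- Flow Control thresholds: the default is 25000 queued entries, "FC set to 25" lowers both
SET GLOBAL group_replication_flow_control_applier_threshold = 25;
SET GLOBAL group_replication_flow_control_certifier_threshold = 25;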

Again, nothing weird or strange from my point of view. I used 4 nodes:

  • Gr1 Writer
  • Gr2 Reader
  • Gr3 Reader minimal latency (~10ms)
  • Gr4 Reader minimal latency (~10ms)

Finally, I had to be sure I measured the lag in a way that allowed me to reference it consistently on all nodes.
I think we can safely say that the incoming GTID (the last entry in Received_transaction_set from replication_connection_status) is the latest transaction applied on the master that a slave node can know about.
This is because network delay can prevent the very last one from actually being "received".
The other point of reference is GTID_EXECUTED, which refers to the latest GTID processed on the node itself.

The closest Query that can track the distance will be:

select @last_exec:=SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(@@global.GTID_EXECUTED,':',-2),':',1),'-',-1) last_executed;
select @last_rec:=SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(Received_transaction_set,':',-2),':',1),'-',-1) last_received
  FROM performance_schema.replication_connection_status
  WHERE Channel_name = 'group_replication_applier';
select (@last_rec - @last_exec) as real_lag;

 

Or in case of a single worker

select @last_exec:=SUBSTRING_INDEX(SUBSTRING_INDEX(@@global.GTID_EXECUTED,':',-1),'-',-1) last_executed;
select @last_rec:=SUBSTRING_INDEX(SUBSTRING_INDEX(Received_transaction_set,':',-1),'-',-1) last_received
  FROM performance_schema.replication_connection_status
  WHERE Channel_name = 'group_replication_applier';
select (@last_rec - @last_exec) as real_lag;

 

The result will be something like this:

 

+---------------+
| last_executed |
+---------------+
| 23607         |
+---------------+

+---------------+
| last_received |
+---------------+
| 23607         |
+---------------+

+----------+
| real_lag |
+----------+
|        0 |
+----------+

 

The whole set of tests can be found here, with all the commands you may need to run the application (you can find it here) and to replicate the tests.
I will focus on the results, or this blog post would be far too long, but I invite you to look at the details.


The results

Efficiency on the Writer by execution time & rows/sec

Here, using the raw data from the tests (Excel spreadsheet available here), I was interested in identifying if and how the Writer is affected by the use of GR and FC.


Reviewing the graph, we can see that the Writer shows a linear increase in execution time (when using the default FC) as the load increases. Nothing really concerning, and all in all expected, even if the load is light; we will see later that the volume of rows in the end justifies the execution time.

It is a different scenario if we use FC: the execution time increases significantly in both cases (single worker/multiple workers). In the worst case (8 threads, 50-insert batch) it becomes 4 times higher than the same load without FC.
What happens to the inserted rows? In the application I trace the rows inserted/sec, so it is easy to see what is going on there as well.

We can see that the Writer with FC activated is able to insert less than 1/3 of the rows it can process without FC. We can definitely say that FC has a significant impact on Writer performance.
To clarify, let's look at this graph:


Without FC the Writer is able to process a high volume of rows in a limited amount of time (results from the test with 8 workers, 8 threads, 50-insert batch).
With FC the situation changes drastically: the Writer takes a long time and processes a significantly smaller number of rows/sec. In short, performance drops significantly.


But hey, I can be OK with that if it means having a consistent dataset across all the nodes.
After all, PXC and similar solutions also pay a significant price in performance to respect the data-centric principle.

OK, let's see what happens on the other nodes.

Entries Lag

Well, the scenario is not so good.

 

When NOT using FC, the nodes lag behind the writer significantly. Remember that by default the FC threshold in GR is set to 25000 entries. I mean 25K entries!!!
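To put "entries" in context, the certification queue each member reports can be inspected with something like this (a sketch; COUNT_TRANSACTIONS_IN_QUEUE is the pending-certification queue the FC certifier threshold is compared against):

SELECT MEMBER_ID, COUNT_TRANSACTIONS_IN_QUEUE, COUNT_TRANSACTIONS_CHECKED
  FROM performance_schema.replication_group_member_stats;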

So what happens here is that as soon as I put some salt (see load) on the Writer, the slave nodes start to lag.
When using the default single worker the lag becomes significant, while when using multiple workers we see that the lag mainly affects the node(s) with the minimal (~10ms) network latency.
The sad thing is that it does not really go down compared to the single-thread worker, indicating that the minimal latency of 10ms was enough to affect the replication.

Well, time to activate the FC and have no lag, right?

Unfortunately, this is not the case. As we can see, the lag in the single-worker case remains high even on Gr2 (154 entries).
When using multiple workers, the Gr3/Gr4 nodes perform much better and the lag is significantly lower, but still high: ~1K entries.

It is important to remember that at this point the Writer is processing 1/3 or less of the rows it is normally able to. It is also important to note that I had set the FC entry limit to 25, and nevertheless the Gr3 (and Gr4) nodes are lagging more than 1K entries behind.

To clarify, let's check the two graphs below:

 

Using the Writer (Master) as the baseline at entry #N, without FC the nodes (slaves) using GR start to lag behind the writer in a significant way, even with a light load. The distance in this POC ranged from a very minimal 58 entries up to 3849 entries in the case of higher load.


Using FC, the Writer (Master) diverges less, as expected; but even though it suffers a significant drop in performance (to 1/3 or less), the nodes still lag, in the worst case up to 1363 entries. I need to underline that there is no further way (that I am aware of) to tune the lag and prevent it from happening.
This means an application cannot transparently split writes/reads and expect consistency. The gap is too high.


A graph that tells us a story

I was using PMM to keep an eye on the nodes while doing the tests, and one of the graphs was really telling me that GR still has some "limits" if we want to consider it as the replication mechanism for a cluster.


This graph shows the MySQL queries executed on all 4 nodes, in the 8 threads/50-insert batch test with FC. As you can see, Gr1 (the writer) is the first one to take off, followed by Gr2; the nodes Gr3 and Gr4 require a bit more time, given the binlog transmission (and 10ms delay). Once the data is there, they match (inconsistently) the Gr2 node; this is an effect of the FC asking the master to slow down. But as seen previously, the nodes are never able to match the Writer. At the end, when the load test is over, the nodes continue to process the queue for an additional ~130 sec. Considering that the whole load takes 420 sec on the Writer, this means that an extra 1/3 of the Writer's total time is spent afterwards by the slaves to sync.


The above graph shows the same test without FC. It is interesting to see how the Writer was going above 300 queries/sec while Gr2 stayed around 200 and Gr3/Gr4 far below. The Writer was able to process the whole load in ~120 seconds instead of 420, while Gr3/Gr4 continued to process the load for an additional ~360 seconds. This means that without FC the nodes lag around 360 seconds behind the Master, while with FC set to 25 they lag 130 seconds.


A significant gap.

 

Conclusions

Going back to the origin, and the reason why I was looking at this POC:
I, as my own customer, think that my application(s) would not be a good fit for GR, given that I use PXC to scale out reads and to be able to efficiently move my writer to another node when I need to.
GR, while based on a very interesting concept, is still built on asynchronous replication (as my colleague Kenny said). I am sure it can make sense in many other cases, but it cannot be compared to solutions based on virtually synchronous replication, and it still requires a lot of refinement.

 

On the other hand, all the applications that can afford a significant gap between writer and readers will probably be fine with that. But I raise another question… wasn't standard replication already covering that? Reviewing the Oracle documentation (https://dev.mysql.com/doc/refman/5.7/en/group-replication-background.html) I can see why GR, as part of InnoDB Cluster, can help me improve HA compared to standard replication. But I also think it is important to understand that GR (and derived solutions like InnoDB Cluster) is not comparable to, or a replacement for, data-centric solutions such as PXC. At least not up to now.

 

Good MySQL to everyone.

 

References

https://dev.mysql.com/doc/refman/5.7/en/group-replication.html

https://dev.mysql.com/doc/refman/5.7/en/mysql-innodb-cluster-userguide.html

http://lefred.be/content/mysql-group-replication-understanding-flow-control/

https://dev.mysql.com/worklog/task/?id=9838

Blog Poll: What Operating System Do You Run Your Development Database On?

In this post, we’ll use a blog poll to find out what operating system you use to run your development database servers.

In our last blog poll, we looked at what OS you use for your production database. Now we would like to see what you use for your development database.

As databases grow to meet more challenges and expanding application demands, they must try and get the maximum amount of performance out of available resources. How they work with an operating system can affect many variables, and help or hinder performance. The operating system you use for your database can impact consumable choices (such as hardware and memory). The operating system you use can also impact your choice of database engine as well (or vice versa).

When starting new projects, new applications or services, or testing new architecture solutions, it makes sense to create a development environment in order to test and run scenarios before they hit production. Do you use the same OS in your development environment as you do in your production environment?

Please let us know what operating system you use to run your development database. For this blog poll, we’re asking which operating system you use to actually run your development database server (not the base operating system).

If you’re running virtualized Linux on Windows, please select Linux as the OS used for development. Pick up to three that apply. Add any thoughts or other options in the comments section:

What operating system do you use to run your development database?

Thanks in advance for your responses – they will help the open source community determine how database environments are being deployed.
