Planet MySQL

VividCortex Adds Query Analysis Features

VividCortex is all about optimizing your queries. Many of our users have asked us to analyze their queries, searching for common errors and mistakes.

It's true that there are some excellent tools out there to perform the same tasks, such as pt-query-advisor from the great Percona Toolkit. But having this information available right in our web application is something we always wanted to do, too. And today we released that!

Today we added Query Analysis to our Profiler tool, so from now on you will have less trouble finding your bad queries. Query Analysis inspects queries with heuristics. It can find bad application patterns, SQL bugs, and all kinds of other obvious and subtle issues.

The notifications count in the Profiler now includes queries that present some problem. This is the column with the little bell icon and the yellow and red warning counts:

 When you navigate to the Query Details page, an additional Failed Rules metric shows you how often those notifications are triggered:

To see the heuristics that an individual query triggered, you can click on a sample (the circle in the Query Details interface) to see the full description of the query analysis notification. Notice the last line in this screenshot:

Query Analysis is an intelligent advisor for your queries, designed to help you catch issues and solve them. It's also a foundation for what we're going to add later, since there is a lot more intelligence we can add for other specific cases. Please let us know how it works for you and send us your suggestions with the in-app support chat!

 



RocksDB vs the world for loading Linkbench tables

I like RocksDB, MyRocks and MongoRocks because they are IO efficient thanks to excellent compression and low write-amplification. It is a bonus when I get better throughput with the RocksDB family. Here I show that RocksDB has excellent load performance; to avoid benchmarketing, the context is loading Linkbench tables when all data fits in cache. This workload should be CPU bound and can be mutex bound. I compare engines for MySQL (RocksDB, TokuDB, InnoDB) and MongoDB (RocksDB, WiredTiger) using the same hardware.

Configuration
I used Linkbench for MySQL and LinkbenchX for MongoDB. The load test was run with generate_nodes=false to limit the load to the count and link tables/collections. The load was repeated with 1 and 8 users (loaders=1 & maxid1=1M, loaders=8 & maxid1=4M). The test server has 144G RAM, 2 sockets, 12 CPU cores (24 HW-threads) and a 400G Intel s3700 SSD.

The oplog, binlog and sync-on-commit were disabled. I did that to remove bottlenecks that might hide stalls in the database engine.
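For reference, the load phase can be driven roughly like this (a sketch; the property file name and the -D override syntax depend on your LinkBench/LinkbenchX build):

# single-threaded load, smaller data set
./bin/linkbench -c config/LinkConfigMysql.properties -l \
    -D loaders=1 -D maxid1=1000000 -D generate_nodes=false

# 8 loaders, larger data set
./bin/linkbench -c config/LinkConfigMysql.properties -l \
    -D loaders=8 -D maxid1=4000000 -D generate_nodes=false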

I tested the following binaries:

Single-threaded
This shows the average insert rate during the single-threaded load. The performance summary is:
  • MySQL gets more throughput than MongoDB for the same workload because MongoDB uses more CPU. MongoDB also suffers from using more indexes on the Linkbench tables (the extra indexes are internal) as explained previously. I hope they fix this.
  • For MongoDB, data is loaded faster with WiredTiger than RocksDB because RocksDB uses more CPU.
  • MySQL+RocksDB and MySQL+TokuDB were by far the fastest for single-threaded loads if you ignore uncompressed InnoDB, and I ignore it because I must have compression. Compressed InnoDB suffers from the overhead of doing some (de)compression operations in the foreground.
  • Uncompressed InnoDB is about 10% slower in MySQL 5.7 compared to 5.6 for the single-threaded load. Low-concurrency performance regressions in MySQL are a known problem.

This shows the value of: (CPU utilization * 1000) / insert_rate. That includes CPU consumed in the foreground by the thread processing the user's requests and in the background. The value is inversely correlated with the insert rate. The value is larger for MongoDB than for MySQL, which explains why the insert rate is larger for MySQL.
Multi-threaded
This shows the average insert rate during the load with 8 user threads. The performance summary is:
  • MySQL+RocksDB is a lot faster than everything else except uncompressed InnoDB, and I ignore that because I require compression.
  • Compressed InnoDB suffers from mutex contention on the pessimistic code path. See this.
  • TokuDB suffers from mutex contention and is slower here than at 1 thread. See this.
  • Mongo+RocksDB suffers from mutex contention and the difference between it and WiredTiger is larger here than for the single-threaded load.


This is the rate of context switches per insert. The context switch rate was measured by vmstat. Several of the engines suffer from too much mutex contention.
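For reference, the measurement can be reproduced with nothing fancier than vmstat running alongside the load (my sketch):

vmstat 10
# the "cs" column is context switches per second;
# context switches per insert = cs / the insert rate reported by the Linkbench load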



Measuring Docker IO overhead

This will be another post on using Percona Server via a Docker image. I want to follow up on my previous post regarding CPU/network overhead in Docker, “Measuring Percona Server Docker CPU/network overhead”, by measuring whether there is any Docker IO overhead on operations.

After running several tests, it appears (spoiler alert) that there is no Docker IO overhead. I still think it is useful to understand the different ways Docker can be used with data volumes, however. Docker’s philosophy is to provide ephemeral containers, but ephemeral does not work well for data – we do not want our data to disappear.

So, the first pattern is to create data inside a docker container. This is the default mode:

docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=1 --name ps13 -p 3306:3306 -v /data/flash/my.cnf:/etc/my.cnf --net=host percona/percona-server:5.6.28

(I am using --net=host to avoid network overhead; check the previous post for more information.)

The second pattern is to use an external data volume; for that we need to map the data directory with -v /data/flash/d1/:/var/lib/mysql. The full command is:

docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=1 --name ps13 -p 3306:3306 -v /data/flash/d1/:/var/lib/mysql -v /data/flash/my.cnf:/etc/my.cnf --net=host percona/percona-server:5.6.28

Finally, there is third pattern: using data volume containers. For this example, I created a dummy container:

docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=1 --name ps13-data-volume -v /var/lib/mysql -v /data/flash/my.cnf:/etc/my.cnf --net=host percona/percona-server:5.6.28

After stopping the ps13-data-volume container, we can start a real one using the data volume from ps13-data-volume  as:

docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=1 --name ps14 --volumes-from ps13-data-volume -v /data/flash/my.cnf:/etc/my.cnf --net=host percona/percona-server:5.6.28
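To double-check which host directory actually backs a container's data volume, something like docker inspect can be used (a sketch; the template field names can differ slightly between Docker versions):

docker inspect -f '{{ json .Mounts }}' ps13-data-volume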

I compared all these modes with Percona Server running on a bare metal box with the storage mounted directly, for both read-intensive and write-intensive IO workloads. For reference, the sysbench command is:

./sysbench --test=tests/db/oltp.lua --oltp_tables_count=16 --oltp_table_size=10000000 --num-threads=16 --mysql-host=127.0.0.1 --mysql-user=root --oltp-read-only=off --max-time=1800 --max-requests=0 --report-interval=10 run
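(The test tables themselves were created beforehand with a sysbench prepare step; a sketch, assuming the same table options as above:)

./sysbench --test=tests/db/oltp.lua --oltp_tables_count=16 --oltp_table_size=10000000 --mysql-host=127.0.0.1 --mysql-user=root prepare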

I’m not going to show the final numbers or charts, as the results are identical for all docker modes and for the bare metal case. So I can confidently say there is NO IO overhead for any docker data volume pattern described above.

As a next experiment, I want to measure the Docker container overhead in a multi-host network environment.



Log Buffer #460: A Carnival of the Vanities for DBAs

This Log Buffer Edition covers blog posts from Oracle, SQL Server and MySQL for this week.

Oracle:

APEX shuttle item with one direction

Wondering about which tasks and work products are essential for your project?

Using Spark(Scala) and Oracle Big Data Lite VM for Barcode & QR Detection

Cloning 10.2.0.3 Oracle Home on fully patched 11.31 HP-UX hangs

An UNDO in a PDB in Oracle 12c?

SQL Server:

SQL Azure Performance Benchmarking

Monitoring In-Memory OLTP: What’s Important?

Find and Remove Duplicate Records SQL Server

A Database to Diagram For

Getting started with R scripts and R visuals in Power BI Desktop

MySQL:

MySQL Support People – Percona Support

How to Install Redmine 3 with Nginx on Ubuntu 15.10

The magical abandoned .ibd files after crash recovery with MySQL 5.7.10

How To Speed Up MySQL Restart (hint: just like before, but this time for real)

OmniSQL – Massively Parallel Query Execution



MariaDB 5.5.48 and Connector/J 1.3.5 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.48 and MariaDB Connector/J 1.3.5. See the release notes and changelogs for details on these releases. Download MariaDB 5.5.48 Release Notes Changelog What is MariaDB 5.5? MariaDB APT and YUM Repository Configuration Generator Download MariaDB Connector/J 1.3.5 Release Notes Changelog About MariaDB Connector/J […]

The post MariaDB 5.5.48 and Connector/J 1.3.5 now available appeared first on MariaDB.org.



New in MySQL 5.7: Performance Schema scalable memory allocation

Performance Schema is a mechanism to collect and report run-time statistics for a running MySQL server. These statistics are stored in and fetched from internal memory buffers. In MySQL 5.6 GA, memory for these buffers is allocated during MySQL server startup with either user-specified configuration values or with default values that are autosized.…
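(Not part of the original excerpt, just a pointer: in 5.7 the Performance Schema's own memory consumption can be inspected with something like the following, assuming memory instrumentation is enabled.)

SHOW ENGINE PERFORMANCE_SCHEMA STATUS;

SELECT EVENT_NAME, CURRENT_NUMBER_OF_BYTES_USED
  FROM performance_schema.memory_summary_global_by_event_name
 WHERE EVENT_NAME LIKE 'memory/performance_schema/%'
 ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC
 LIMIT 10;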



Foundation report for 2015

This is a repost of Otto Kekäläinen's blog of the MariaDB foundations work in 2015.

The mariadb.org website had over one million page views in 2015, a growth of about 9% since 2014. Good growth has been visible all over the MariaDB ecosystem and we can conclude that 2015 was a successful year for MariaDB.
Increased adoption

MariaDB was included for the first time in an official Debian release (version 8.0 "Jessie") and there has been strong adoption of MariaDB 10.0 in Linux distributions that already shipped 5.5. MariaDB is now available from all major Linux distributions including SUSE, RedHat, Debian and Ubuntu. Adoption of MariaDB on other platforms also increased, and MariaDB is now available as a database option on, among others, Amazon RDS, 1&1, Azure and Juju Charm Store (Ubuntu).
Active maintenance and active development

In 2015 there were 6 releases of the 5.5 series, 8 releases of the 10.0 series and 8 releases of the 10.1 series. The 10.1 series was announced for general availability in October 2015 with the release of 10.1.8. In addition, there were also multiple releases of MariaDB Galera Cluster, and the C, Java and ODBC connectors, as well as many other MariaDB tools. The announcements for each release can be read on the Mariadb.org blog archives with further details in the Knowledge Base. Some of the notable new features in 10.1 include:
We are also proud that the release remains backwards compatible and it is easy to upgrade to 10.1 from any previous MariaDB or MySQL release. 10.1 was also a success in terms of collaboration and included major contributions from multiple companies and developers.
MariaDB events and talks

The main event organized by the MariaDB Foundation in the year was the MariaDB Developer Meetup in Amsterdam in October, at the Booking.com offices. It was a success with over 60 attendees. In addition, there were about a dozen events in 2015 at which MariaDB Foundation staff spoke.

We are planning a new MariaDB developer event in early April 2016 in Berlin. We will make a proper announcement of this as soon as we have the date and place fixed.
Staff, board and members

In 2015 the staff included:
  • Otto Kekäläinen, CEO
  • Michael "Monty" Widenius, Founder and core developer
  • Andrea Spåre-Strachan, personal assistant to Mr Widenius
  • Sergey Vojtovich, core developer
  • Alexander Barkov, core developer
  • Vicențiu Ciorbaru, developer
  • Ian Gilfillan, documentation writer and webmaster
Our staffing will slightly increase as Vicențiu will start working full time for the Foundation in 2016. Our developers worked a lot on performance and scalability issues, ported the best features from new MySQL releases, improved MariaDB portability for platforms like ARM, AIX, IBM s390 and Power8, and fixed security issues and other bugs. A lot of time was also invested in cleaning up the code base, as the current 2.2 million lines of code include quite a lot of legacy code. Version control and issue tracker statistics show that the foundation staff made 528 commits, reported 373 bugs or issues and closed 424 bugs or other issues. In total there were 2400 commits made by 91 contributors in 2015.

The Board of Directors in 2015 consisted of:
  • Chairman Rasmus Johansson, VP Engineering at MariaDB Corporation
  • Michael "Monty" Widenius, Founder and CTO of MariaDB Corporation
  • Jeremy Zawodny, Software Engineer at Craigslist
  • Sergei Golubchik, Chief Architect at MariaDB Corporation
  • Espen Håkonsen, CIO of Visma and Managing Director of Visma IT & Communications
  • Eric Herman, Principal Developer at Booking.com
MariaDB Foundation CEO Otto Kekäläinen served as the secretary of the board. In 2015 we welcomed Booking.com, Visma and Verkkokauppa.com as new major sponsors. Acronis has just joined as a member for 2016. Please check out the full list of supporters. If you want to help the MariaDB Foundation in its mission to guarantee continuity and open collaboration, please support us with an individual or corporate sponsorship.
What will 2016 bring?

We expect steady growth in the adoption of MariaDB in 2016. There are many migrations from legacy database solutions underway, and as the world becomes increasingly digital, there are a ton of new software projects starting that use MariaDB for their SQL and NoSQL data needs. In 2016 many will upgrade to 10.1, and the quickest ones will start using MariaDB 10.2, which is scheduled to be released some time during 2016. MariaDB also has a lot of plugins and storage engines that are getting more and more attention, and we expect more buzz around them as software developers figure out new ways to manage data in fast, secure and scalable ways.

Comment on Life at Pythian as a MySQL DBA by Natalia

Hi Derek, thanks for sharing with us this post, from my perspective it was really helpful.

I have worked as a Windows admin for several projects but the roles were very specific and sometimes you did not have the chance to do new things.
At the moment I work freelance but I am very interested in a job position for Pythian.

Do you know if there's any job position for a Windows Admin?

Thanks!



We need your feedback: please participate in our open source database management survey

As members of the wider open source database users community, we’d like you to participate in our open source database deployment and management survey.

Your input will help us make our resources and tools for deploying, monitoring, managing and scaling databases of even more use to the community. It will give us valuable insight into the challenges you face when operating databases.

Please take the survey today by providing your input below; this will take approx. 5 minutes of your time.

We’ll share the results of the survey once we have compiled your responses.

Thank you!


EXPLAIN FORMAT=JSON: buffer_result is not hidden!

Time for another entry in the EXPLAIN FORMAT=JSON is cool! series. Today we’re going to look at how you can view the buffer result using JSON (instead of the regular EXPLAIN command).

Regular EXPLAIN does not identify if SQL_BUFFER_RESULT was used at all. To demonstrate, let’s run this query:

mysql> explain select * from salaries\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2557022
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.01 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no`,`employees`.`salaries`.`salary` AS `salary`,`employees`.`salaries`.`from_date` AS `from_date`,`employees`.`salaries`.`to_date` AS `to_date` from `employees`.`salaries`

Now, let’s compare it to this query:

mysql> explain select sql_buffer_result * from salaries\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2557022
     filtered: 100.00
        Extra: Using temporary
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select sql_buffer_result `employees`.`salaries`.`emp_no` AS `emp_no`,`employees`.`salaries`.`salary` AS `salary`,`employees`.`salaries`.`from_date` AS `from_date`,`employees`.`salaries`.`to_date` AS `to_date` from `employees`.`salaries`

Notice there is no difference, except the expected "Using temporary" value in the "Extra" row of the second query. The field "Using temporary" is expected here, because SQL_BUFFER_RESULT directly instructs the MySQL server to put a result set into a temporary table to free locks. But what if the query uses a temporary table by itself? For example, for a grouping operation? In this case, the EXPLAIN result for the original query and the query that contains the SQL_BUFFER_RESULT clause will be 100% identical.

Compare:

mysql> explain select emp_no, salary/avg(salary) from salaries group by emp_no, salary\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2557022
     filtered: 100.00
        Extra: Using temporary; Using filesort
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `employees`.`salaries`.`emp_no` AS `emp_no`,(`employees`.`salaries`.`salary` / avg(`employees`.`salaries`.`salary`)) AS `salary/avg(salary)` from `employees`.`salaries` group by `employees`.`salaries`.`emp_no`,`employees`.`salaries`.`salary`

With:

mysql> explain select sql_buffer_result emp_no, salary/avg(salary) from salaries group by emp_no, salary\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: salaries
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2557022
     filtered: 100.00
        Extra: Using temporary; Using filesort
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select sql_buffer_result `employees`.`salaries`.`emp_no` AS `emp_no`,(`employees`.`salaries`.`salary` / avg(`employees`.`salaries`.`salary`)) AS `salary/avg(salary)` from `employees`.`salaries` group by `employees`.`salaries`.`emp_no`,`employees`.`salaries`.`salary`

There is no difference! We are not able to tell whether a temporary table was used to resolve the query, or simply to put the result set into the buffer. The EXPLAIN FORMAT=JSON command can help in this case as well. Its output is clear, and shows all the details of the query optimization:

mysql> explain format=json select sql_buffer_result emp_no, salary/avg(salary) from salaries group by emp_no, salary\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "3073970.40"
    },
    "grouping_operation": {
      "using_temporary_table": true,
      "using_filesort": true,
      "cost_info": {
        "sort_cost": "2557022.00"
      },
      "buffer_result": {
        "using_temporary_table": true,
        "table": {
          "table_name": "salaries",
          "access_type": "ALL",
          "rows_examined_per_scan": 2557022,
          "rows_produced_per_join": 2557022,
          "filtered": "100.00",
          "cost_info": {
            "read_cost": "5544.00",
            "eval_cost": "511404.40",
            "prefix_cost": "516948.40",
            "data_read_per_join": "39M"
          },
          "used_columns": [
            "emp_no",
            "salary",
            "from_date"
          ]
        }
      }
    }
  }
}
1 row in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select sql_buffer_result `employees`.`salaries`.`emp_no` AS `emp_no`,(`employees`.`salaries`.`salary` / avg(`employees`.`salaries`.`salary`)) AS `salary/avg(salary)` from `employees`.`salaries` group by `employees`.`salaries`.`emp_no`,`employees`.`salaries`.`salary`

Firstly, we can see how the grouping_operation was optimized:

"grouping_operation": { "using_temporary_table": true, "using_filesort": true,

And it does indeed use the temporary table.

Now we can follow the details for SQL_BUFFER_RESULT:

"buffer_result": { "using_temporary_table": true,

With this output, we can be absolutely certain that the temporary table was created for both the SQL_BUFFER_RESULT and the grouping operation. This is especially helpful for support engineers who need the EXPLAIN output to help their customers tune queries, but are reluctant to ask for the same query twice — once with the SQL_BUFFER_RESULT clause and once without.

Conclusion: EXPLAIN FORMAT=JSON does not hide important details of query optimization.



Compaction priority in RocksDB

Compaction priority is an option in RocksDB that determines the next file to select for compaction. Here I show the impact of this option on the amount of storage writes while running Linkbench. The right value can reduce the amount of data written to storage which improves storage efficiency and can make an SSD device last longer.

The Linkbench schema has 3 tables with 4 indexes. Each of the indexes uses a separate column family with MyRocks. There is one column family for each of the primary key indexes on the link, count and node tables (link_pk, count_pk, node_pk) and another column family for the secondary index on the link table (link_id1_type). The schema for MyRocks is here.

Configuration
I ran Linkbench with maxid1=1B and requesters=20 for 3 days on a server with 24 CPU cores, 256 GB of RAM and an MLC SSD. I then looked at the compaction IO statistics via SHOW ENGINE ROCKSDB STATUS to determine the amount of storage writes per column family. The graphs below use the integer value for the compaction priority:
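For reference (my addition, based on RocksDB's options header at the time), the integer values map to:

compaction_pri=0 -- kByCompensatedSize (the default)
compaction_pri=1 -- kOldestLargestSeqFirst
compaction_pri=2 -- kOldestSmallestSeqFirst
compaction_pri=3 -- kMinOverlappingRatio (present in more recent RocksDB versions)

In MyRocks this is typically set through the column family options string, e.g. something like rocksdb_default_cf_options='compaction_pri=2' (the exact option name may vary by build).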

This graph displays the total amount written per column family for each of the compaction_pri options. The least data is written with kOldestSmallestSeqFirst and the most data is written with kOldestLargestSeqFirst. The difference is about 1.34X. I have been using the default value, kByCompensatedSize, for all of my tests prior to this result.
The next graph shows the amount of data written into RocksDB by the application (MyRocks) by column family. I call this the ingest. The storage writes done by RocksDB include the ingest and the writes done for compaction in the background. As expected, the ingest is similar for each of the compaction_pri values.
The final result is the write-amplification, which is the ratio of total-writes / ingest. A smaller write-amplification is usually better. Because the total-writes (the first graph) are largest for compaction_pri=1, it has the largest write-amplification.


My First Steps in Exploring RocksDB

RocksDB and the storage engine for MySQL based on it (the so-called "MyRocks") have been widely discussed in my circles since at least August 2015, so I decided to spend some time checking it out. The easy way to get it running is to use Facebook's MySQL 5.6, so I just cloned it and built it from source with minor customization, based on the instructions (which just work in the case of Fedora Core 23):
33       mkdir git
34       cd git
35       git clone https://github.com/facebook/mysql-5.6.git
36       cd mysql-5.6/
37       git submodule init
38       git submodule update
39       cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SSL=system -DWITH_ZLIB=bundled -DMYSQL_MAINTAINER_MODE=0 -DENABLED_LOCAL_INFILE=1 -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/fb56
...
41       time make
42       make install && make clean
...
44       cd
45       vi fb56.cnf
46       cd dbs/fb56/
...
48       scripts/mysql_install_db --defaults-file=/home/openxs/fb56.cnf
49       ls data/mysql/
50       bin/mysqld_safe --defaults-file=/home/openxs/fb56.cnf &
51       bin/mysql -uroot test

On Ubuntu 14.04.3 last week I was affected by a problem similar to Issue #147, but with today's code (commit e9d85381d22a2c3a2f8cea614baa70f7e0cef7b7) there was no problem to build there as well.

The fb56.cnf file is quite simple:
[openxs@fc23 ~]$ cat fb56.cnf
[mysqld]
rocksdb
default-storage-engine=rocksdb
skip-innodb
default-tmp-storage-engine=MyISAM

log-bin
binlog-format=ROW

The reasons for the settings there are explained on their wiki (here and there).

As a result, I've got my first RocksDB table(s) created in a matter of minutes (spent mostly on building from source), and my first bug reports filed almost immediately:
  • Issue #159 - "Indexes on RocksDB table are listed as BTREE ones in SHOW INDEXES". It was closed very soon.
  • Issue #160 - "ANALYZE TABLE does not seem to update statistics for the RocksDB table". It was closed less than a day ago. Unfortunately, it seems that ANALYZE TABLE now updates some statistics but does not set the data size properly; see my Issue #164 reported today.
  • Issue #163 - "Strange EXPLAIN output for UPDATE ("Using temporary")". I've reported this today and it seems to be something in the optimizer (maybe just a feature that I am not aware of) that produces the result I consider strange. In the process I've also noted that ICP (index condition pushdown) is NOT used for the PRIMARY key of RocksDB tables (unlike for MyISAM ones), but the same limitation is known and documented for InnoDB tables.
Besides some testing, I've surely executed the command I expected to be there, SHOW ENGINE ROCKSDB STATUS:
mysql> show engine rocksdb status\G
*************************** 1. row ***************************
  Type: DBSTATS
  Name: rocksdb
Status:
** DB Stats **
Uptime(secs): 106.0 total, 106.0 interval
Cumulative writes: 6 writes, 524K keys, 6 batches, 0.9 writes per batch, ingest: 0.01 GB, 0.08 MB/s
Cumulative WAL: 6 writes, 4 syncs, 1.20 writes per sync, written: 0.01 GB, 0.08 MB/s
Cumulative compaction: 0.01 GB write, 0.08 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.6 seconds
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 6 writes, 524K keys, 6 batches, 0.9 writes per batch, ingest: 8.00 MB, 0.08 MB/s
Interval WAL: 6 writes, 4 syncs, 1.20 writes per sync, written: 0.01 MB, 0.08 MB/s
Interval compaction: 0.01 GB write, 0.08 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.6 seconds
Interval stall: 00:00:0.000 H:M:S, 0.0 percent
** Level 0 read latency histogram (micros):
Count: 5  Average: 10.2000  StdDev: 8.70
Min: 1.0000  Median: 6.5000  Max: 22.0000
Percentiles: P50: 6.50 P75: 19.50 P99: 22.00 P99.9: 22.00 P99.99: 22.00
------------------------------------------------------
[       0,       1 )        1  20.000%  20.000% ####
[       1,       2 )        1  20.000%  40.000% ####
[       6,       7 )        1  20.000%  60.000% ####
[      18,      20 )        1  20.000%  80.000% ####
[      20,      25 )        1  20.000% 100.000% ####
...

and checked the content of the datadir related to RocksDB:
[openxs@fc23 ~]$ ls -la dbs/fb56/data/
total 1228
drwxrwxr-x.  6 openxs openxs    4096 Feb  9 11:27 .
drwxrwxr-x. 13 openxs openxs    4096 Feb  4 12:36 ..
-rw-rw----.  1 openxs openxs      56 Feb  4 12:37 auto.cnf
-rw-rw----.  1 openxs openxs   24872 Feb  4 12:36 fc23-bin.000001
-rw-rw----.  1 openxs openxs 1148080 Feb  4 12:36 fc23-bin.000002
-rw-rw----.  1 openxs openxs     488 Feb  4 13:26 fc23-bin.000003
-rw-rw----.  1 openxs openxs     139 Feb  4 13:26 fc23-bin.000004
-rw-rw----.  1 openxs openxs    1990 Feb  9 10:55 fc23-bin.000005
-rw-rw----.  1 openxs openxs     684 Feb  9 11:30 fc23-bin.000006
-rw-rw----.  1 openxs openxs     108 Feb  9 11:27 fc23-bin.index
-rw-r-----.  1 openxs openxs   24184 Feb  9 11:30 fc23.err
-rw-rw----.  1 openxs openxs       6 Feb  9 11:27 fc23.pid
drwx------.  2 openxs openxs    4096 Feb  4 12:36 mysql
drwx------.  2 openxs openxs    4096 Feb  4 12:36 performance_schema
drwxr-x--x.  2 openxs openxs    4096 Feb  9 11:30 .rocksdb
drwxrwxr-x.  2 openxs openxs    4096 Feb  9 11:30 test
[openxs@fc23 ~]$ ls -la dbs/fb56/data/.rocksdb/
total 176
drwxr-x--x. 2 openxs openxs  4096 Feb  9 11:30 .
drwxrwxr-x. 6 openxs openxs  4096 Feb  9 11:27 ..
-rw-r-----. 1 openxs openxs   702 Feb  9 10:35 000040.sst
-rw-r-----. 1 openxs openxs   219 Feb  9 11:30 000047.log
-rw-r-----. 1 openxs openxs  1622 Feb  9 11:27 000054.sst
-rw-r-----. 1 openxs openxs    16 Feb  9 11:27 CURRENT
-rw-r-----. 1 openxs openxs    37 Feb  4 12:36 IDENTITY
-rw-r-----. 1 openxs openxs     0 Feb  4 12:36 LOCK
-rw-rw----. 1 openxs openxs 28361 Feb  9 11:30 LOG
-rw-rw----. 1 openxs openxs 19212 Feb  4 12:36 LOG.old.1454582184760374
-rw-rw----. 1 openxs openxs 19438 Feb  4 12:36 LOG.old.1454582234660587
-rw-rw----. 1 openxs openxs 20023 Feb  4 13:26 LOG.old.1455004201986761
-rw-rw----. 1 openxs openxs 38739 Feb  9 10:55 LOG.old.1455010078294084
-rw-r-----. 1 openxs openxs   556 Feb  9 11:30 MANIFEST-000045
-rw-r-----. 1 openxs openxs  5414 Feb  9 11:27 OPTIONS-000051
-rw-r-----. 1 openxs openxs  5415 Feb  9 11:27 OPTIONS-000053

I've also checked some messages in the error log related to RocksDB:
2016-02-09 11:27:58 14773 [Warning] The option innodb (skip-innodb) is deprecated and will be removed in a future release
2016-02-09 11:27:58 14773 [Note] Plugin 'InnoDB' is disabled.
2016-02-09 11:27:58 14773 [Note] Plugin 'FEDERATED' is disabled.
2016-02-09 11:27:58 14773 [Note] RocksDB: 2 column families found
2016-02-09 11:27:58 14773 [Note] RocksDB: Column Families at start:
2016-02-09 11:27:58 14773 [Note]   cf=default
2016-02-09 11:27:58 14773 [Note]     write_buffer_size=4194304
2016-02-09 11:27:58 14773 [Note]     target_file_size_base=2097152
2016-02-09 11:27:58 14773 [Note]   cf=__system__
2016-02-09 11:27:58 14773 [Note]     write_buffer_size=4194304
2016-02-09 11:27:58 14773 [Note]     target_file_size_base=2097152
2016-02-09 11:27:58 14773 [Note] RocksDB: Table_store: loaded DDL data for 6 tables
2016-02-09 11:27:58 14773 [Note] RocksDB instance opened
2016-02-09 11:27:58 14773 [Note] Starting crash recovery...
RocksDB: Last binlog file position 1772, file name fc23-bin.000005
2016-02-09 11:27:58 14773 [Note] Crash recovery finished.

All these details will be discussed in the next blog posts eventually.

So, these were my very first steps with RocksDB as a storage engine for MySQL. I am really impressed by the speed of bug fixing and adding missing features.

The real fun started when I attached gdb to mysqld and set some proper breakpoints to find out how RocksDB locks the data it accesses. Stay tuned - it's a topic for the next (maybe several) posts about RocksDB.

Jörg Brühe: On Files, the Space They Need, and the Space They Take

or

xfs Users, Take Care!

Recently, we had a customer ask: Why do many files holding my data take up vastly more space than their size is? That question may sound weird to you, but it is for real, and the customer's observation was correct. For a start, let's make sure we are using the same terms.

  • The size of a file is the number of bytes it will deliver if it is read sequentially from start to end.
  • The space it takes up is the sum of all disk pages which are used to hold the file's data, or to locate those data pages ("indirect" blocks in Unix/Linux terminology).

Every Unix/Linux admin knows (or at least should know) that a file may take up less disk space than its size is. This happens when not all bytes of the file were really written, but the write pointer was advanced via "seek()", leaving a gap. Disk pages which are completely contained in such a gap will not be written, and reading these positions will produce bytes containing zero. This is called a "sparse file". You will find some remarks about them in our blog at https://fromdual.com/mysql-cluster-sparse-files, or search the net for that term.
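A quick way to see a sparse file for yourself (a minimal sketch, any recent Linux with coreutils):

truncate -s 1G sparse.dat        # set the size to 1 GB without writing any data
ls -l --block-size=K sparse.dat  # reports a size of about 1048576K
du -k sparse.dat                 # reports only the few KB actually allocated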

The Customer's Message

Now that we have brought those basics into active memory again, let's return to the original question: Can there be files which take up vastly more space than their size is? We will not consider potential administrative overhead (pointers to pages), because to the customer a file of slightly more than 4 GB was reported to take up 8.1 GB disk space - see this quote from his mail (file name changed):

# ls -l some_table#P#p01.ibd
-rw-rw---- 1 mysql mysql 4307550208 Jan 4 01:06 some_table#P#p01.ibd
# du -hs some_table#P#p01.ibd
8,1G some_table#P#p01.ibd

Luckily, the customer's mail mentioned the file system: It was not one of the "ext" family (ext2, ext3, or ext4), but rather they are using xfs. This gave me a hint to search for information, and Google provided several pointers, IMO the most helpful ones were these:

http://xfs.org/index.php/XFS_FAQ
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=055388a3188f56676c21e92962fc366ac8b5cb72

Both these texts describe that the Linux kernel includes a tuning function for xfs file systems, which is to pre-allocate pages at the end of a growing file. Originally, the amount was small (64 kB), but then the size was made a function of the file size - the larger the file, the more pages were pre-allocated. Hence, this is now called "dynamic speculative EOF preallocation". It is based on the assumption that the file will continue to grow, and these pre-allocated pages are adjacent, so the performance of later file use (especially reads) will be improved. To not waste disk space permanently, such pre-allocated pages will be cut from the file when it is closed.

Measuring File Size and Space Taken

To see this behavior in practice, I wrote a little shell script that lets a file grow in increments of 160 kB (= ten InnoDB pages of default size) without closing it. (You can find it attached.) In parallel, I checked the size ("ls -l --block-size=K") and the space allocated ("du -k"). With this script, I could easily observe the effect:

Test './try-xfs-prealloc' is running on TTY 'pts/10'.

Linux trift-6core 3.13.0-74-generic #118-Ubuntu SMP Thu Dec 17 22:52:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
/dev/mapper/vg1000-XFS_try on /XFS_try type xfs (rw)
Dateisystem 1K-Blöcke Benutzt Verfügbar Verw% Eingehängt auf
/dev/mapper/vg1000-XFS_try 52403200 33504 52369696 1% /XFS_try
...
-rw-rw-r-- 1 joerg joerg 480K Feb 4 16:56 /XFS_try/somedir/bigfile
960 /XFS_try/somedir/bigfile
...
-rw-rw-r-- 1 joerg joerg 960K Feb 4 16:57 /XFS_try/somedir/bigfile
960 /XFS_try/somedir/bigfile
...
-rw-rw-r-- 1 joerg joerg 1440K Feb 4 16:57 /XFS_try/somedir/bigfile
1984 /XFS_try/somedir/bigfile
...
-rw-rw-r-- 1 joerg joerg 1920K Feb 4 16:57 /XFS_try/somedir/bigfile
1984 /XFS_try/somedir/bigfile
..
-rw-rw-r-- 1 joerg joerg 2240K Feb 4 16:57 /XFS_try/somedir/bigfile
4032 /XFS_try/somedir/bigfile

(( several lines not quoted ))

-rw-rw-r-- 1 joerg joerg 11200K Feb 4 17:02 /XFS_try/somedir/bigfile
16320 /XFS_try/somedir/bigfile
...
-rw-rw-r-- 1 joerg joerg 11680K Feb 4 17:02 /XFS_try/somedir/bigfile
16320 /XFS_try/somedir/bigfile
..
=====
No further writes, status:
-rw-rw-r-- 1 joerg joerg 12288000 Feb 4 17:02 /XFS_try/somedir/bigfile
16320 /XFS_try/somedir/bigfile

Writer process killed, status:
-rw-rw-r-- 1 joerg joerg 12288000 Feb 4 17:02 /XFS_try/somedir/bigfile
12000 /XFS_try/somedir/bigfile

While dynamic preallocation is a good idea for most files, it fails badly on MySQL data files: The MySQL server will not close them, in general, even when they won't grow any further (like a table partitioned by date). So this is what the customer detected: A table partition which had grown somewhat beyond 4 GB had got pages for another 4 GB pre-allocated, they were not released, and this happened for many files. Those of you who have ample disk space may say "who cares?", but there are others who have to care. For them, this feature has risky consequences, so they should try to prevent them.

Avoiding The Unlimited Growth

Basically, the only way out is to use the "allocsize" mount option, as described in the FAQ. InnoDB reads 64 pages of 16 kB at most, so "allocsize=1M" might be best.
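For example (device and mount point are the ones from my test mount shown above; adjust to your setup), either in /etc/fstab or for a fresh mount:

/dev/mapper/vg1000-XFS_try  /XFS_try  xfs  allocsize=1M  0 0

mount -t xfs -o allocsize=1M /dev/mapper/vg1000-XFS_try /XFS_try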

Like the customer, many DBAs or SysAdmins may not be aware of that behaviour and might detect it only on the running system. Of course, the first question will be: "Can I fix that without downtime?" Immediately, a "mount -o remount" comes to mind, so I tried that: While my test script was running, I issued
sudo mount -o remount,allocsize=1M /XFS_try

Sadly, I must tell you it had no effect: The size of the pre-allocated space continued to grow, like in the original run. Even worse, this command also did not have any effect on a run I started after issuing it.

This proves that the value of "allocsize" cannot be changed for a mounted XFS file system, rather its value at mount time remains effective until the unmount. Only when I unmounted it and then mounted it anew, giving "allocsize=1M", did I see the fixed size as pre-allocation amount. From the DBA point of view, it means that a shutdown of the MySQL instance cannot be avoided for this change. (Of course, if we talk about a Galera cluster, the system remains available, because the nodes can be handled one at a time.)

Can You Get It Without Shutdown?

Now what if you really need to avoid a shutdown, but also need to get back the pre-allocated space urgently? As written above, this will happen only when the file is closed. So the question is: How can the DBA let the MySQL server close a table data file without interrupting the service? There seems to be a chance: the "flush tables" statement. The manual says:

FLUSH TABLES
Closes all open tables, forces all tables in use to be closed, and flushes the query cache. ...

FLUSH TABLES tbl_name [, tbl_name] ...
With a list of one or more comma-separated table names, this statement is like FLUSH TABLES with no names except that the server flushes only the named tables. ...
http://dev.mysql.com/doc/refman/5.6/en/flush.html

The text is identical for versions from 5.1 to 5.7.

But then, see the user comment by Simon Mudd on that page: No effect for InnoDB. To check this, I wrote a script that inserts rows into an InnoDB table, then let it run: The effect of preallocation is clearly visible. However, sometimes the space used may suddenly go down to the file size, then go up again. My impression is that XFS reacts differently to a plain file and an InnoDB table, because a plain file will grow sequentially at the end only, while an InnoDB table also gets writes to other blocks during its growth. At the end of the insert run, "ls -l" and "du" might show a big preallocation, but not in all runs. To really be sure, I repeated the experiment with the daemon "mysqld" running under "strace" control: No, I did not get a "close()" logged from the "flush table" command.
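For reference, that check can be reproduced along these lines (a sketch, not necessarily the exact commands used; the table name is a placeholder):

sudo strace -f -e trace=close -o strace.log -p $(pidof mysqld) &
mysql -e 'FLUSH TABLES test.some_table;'
grep close strace.log   # look for a close() on the table's .ibd file descriptor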

I had the opportunity to discuss it with a MySQL developer: Yes, that is correct, and it is intentional. InnoDB relies heavily on background threads, and they do not want to add the complexity of syncing these tasks with a "flush table" command. So there is no command that would guarantee the release of preallocated space.

Conclusion

While xfs is a good file system for databases, the "dynamic speculative EOF preallocation" is a feature to be aware of, and you may want to limit its amount so that you don't have too much wasted space on your disk(s). Use the "allocsize=" mount option, and remember that it needs to be set before the mount.

Take care!

Attachment: Shell script to show XFS preallocation (1.92 KB)

February 23rd: how CloudStats.me moved from MySQL to clustered MariaDB for high availability

On Tuesday, February 23, please join us and the WooServers team for a webinar on the scalable, open source database infrastructure behind CloudStats.me.

CloudStats.me is a fast growing cloud-based server and website monitoring service. The rapid growth of the CloudStats user base and the number of services being monitored created a significant load on its MySQL infrastructure. The system ingests large amounts of incoming metrics/event data collected by thousands of agents. The backend systems also perform analytics on large portions of that data, and alerts are triggered as soon as certain conditions are met.

Andrey Vasilyev, CTO of Aqua Networks Limited - a London-based company which owns brands, such as WooServers.com, CloudStats.me and CloudLayar.com, and Art van Scheppingen, Senior Support Engineer at Severalnines, will discuss the challenges encountered by CloudStats.me in achieving database high availability and performance, as well as the solutions that were implemented to overcome these challenges.

Registration, Date & Time Europe/MEA/APAC

Tuesday, February 23rd at 09:00 GMT / 10:00 CET (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, February 23rd at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Agenda
  • CloudStats.me infrastructure overview
  • Database challenges
  • Limitations in cloud-based infrastructure
  • Scaling MySQL - many options
    • MySQL Cluster, Master-Slave Replication, Sharding, ...
  • Availability and failover
  • Application sharding vs auto-sharding
  • Migration to MariaDB / Galera Cluster with ClusterControl & NoSQL
  • Load Balancing with HAProxy & MaxScale
  • Infrastructure set up provided to CloudStats.me
    • Private Network, Cluster Nodes, H/W SSD Raid + BBU
  • What we learnt - “Know your data!”

 

 

Speakers

Andrey Vasilyev is the CTO of Aqua Networks Limited - a London-based company which owns brands, such as WooServers.com, CloudStats.me and CloudLayar.com. Andrey has been leading the company’s new product development initiatives for 5 years and worked closely with the development and sales teams helping turn customer feedback into mass-market products. Having previously worked at Bloomberg L.P. and UniCredit Bank, Andrey’s main focus has always been on building stable and reliable platforms capable of serving hundreds of thousands of users.

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and Database expert with over 15 years experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision upon the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

We look forward to “seeing” you there and to some good discussions!

For more discussions on database clustering and high availability strategies, do visit our Webinars Replay page.


MySQL Performance: Scalability on OLTP_RW Benchmark with MySQL 5.7

Next article from the MySQL 5.7 Performance stories, now about OLTP_RW scalability (if you missed any previous ones, see 1.6M SQL Query/sec (QPS) with MySQL 5.7, 1M SQL Query/sec on mixed OLTP_RO / true Point-Selects performance / over 100K Connect/sec Rate / Re:Visiting InnoDB vs MyISAM Performance -- all with MySQL 5.7)..

Before we start looking at OLTP_RW results, let me explain first why we paid so much attention to MySQL 5.7 Performance in RO (read-only) workloads (and all my previous posts were mostly about RO as well).. -- the reason is very simple: there is no great RW performance if RO is lagging.. And also because we were pretty bad on RO before 5.7 ;-))

Let's take a look at the following graphs :

  • the graphs are representing the test results obtained more than 2 years ago..
  • they are all obtained from the same 32cores-HT server (4CPU sockets, each with 8cores-HT)
  • and we were looking for the best possible MySQL server performance on this host by limiting MySQL instance to be running on 1/2/4CPUs (8/16/32cores) and using/not-using CPU HyperThreading (HT) (16cores-HT vs 16cores, etc.)..

So, what did we observe over 2 years ago, when MySQL 5.7 development had just started ?..

Here are the results obtained on the OLTP_RO workload on MySQL 5.5 / 5.6 / and 5.7 at that time :



Observations :
  • on MySQL 5.5 :
    • the results on 16cores-HT are x2 times better than on 32cores..
  • on MySQL 5.6 :
    • the results on 32cores are just slightly better than on 16cores-HT
    • as well the difference between 32cores vs 32cores-HT results is pretty small..
  • on MySQL 5.7 :
    • same as on 5.6, the results on 32cores are just slightly better than on 16cores-HT
    • but near no difference at all in 32cores vs 32cores-HT results..
    • and, the most painful, is that an overall result is worse than on MySQL 5.6 (!)..
  • this was the first painful point from where MySQL 5.7 was started over 2 years ago ;-))
  • (and probably you're better understanding now why we're so happy to see MySQL 5.7 scaling really well today and easily reaching now over 1M QPS on the same OLTP_RO workload ;-))

But well, let's go back 2 years again, and see also how the OLTP_RW workload looked at that time :

The following are the similar test results on MySQL 5.5/ 5.6/ 5.7 , but about OLTP_RW :



Observations :
  • I think you may observe the same tendency by yourself :
    • MySQL 5.5 is scaling up to only 16cores-HT
    • on MySQL 5.6 and 5.7 the results on 32cores are better than on 16cores
    • the benefit from CPU HyperThreading is better seen on 32cores-HT now (but not that big as on 16cores-HT)
    • however, MySQL 5.7 is better "resisting" to a higher concurrent users load
    • while the Max peak TPS is still reached by MySQL 5.6, and not 5.7 ;-))
  • but the most killing here is not this..
  • in fact the presented OLTP_RW results are intentionally presented in QPS (Query/sec) and not in TPS (Transactions/sec)
  • this is making OLTP_RW results "comparable" with OLTP_RO ;-))
  • from where you may discover the painful point #2 :
    • over 2 years ago our OLTP_RW performance was better than OLTP_RO (!!!)
    • and this was true for all presented MySQL versions on that time..
    • NOTE : OLTP_RW workload is including OLTP_RO ;-))
    • NOTE (again) : to be exact, OLTP_RW is extending OLTP_RO by adding write operations (INSERT, DELETE, UPDATE), so we're writing to the disk, we're logging every transaction, we're hitting transaction/REDO locking, and we're still reaching a higher QPS level than a pure OLTP_RO running fully in-memory... -- and this is all because our transactions management in InnoDB on that time was very heavy on locks and did not scale at all..
  • Hope you can better understand now our frustration level 2 years ago, and challenges we faced on that time ;-))

That's why so many efforts were spent to improve InnoDB performance in MySQL 5.7 on RO workloads.. -- was this challenge fully completed?.. -- not yet (some specific cases (block lock, AHI, etc.) are still remaining; then many new functionality features were added in MySQL 5.7 over time, and adding more code makes the overall code path longer as well, so on low-load RO workloads you may observe some slight regressions with MySQL 5.7 today.. -- however, as soon as your load is growing, you'll see a real benefit from improved MySQL 5.7 scalability ;-)) Let's say that with MySQL 5.7 we got rid of the "main scalability show-stopper" for RO workloads! - and, of course, we don't stop here, work is in progress, and yet more improvements are in our TODO list ;-))

Now, what about MySQL 5.7 Performance on RW workloads ?..
  • the main InnoDB RW scalability show-stopper (generally and particularly in MySQL 5.7) is REDO log locking (log_sys mutex)
  • well, to be exact, log_sys contention is the "final" show-stopper ;-))
  • while before hitting log_sys, you may hit and be blocked by :
    • index lock contention (big stopper for RW workloads, was finally fixed since MySQL 5.7 only.. -- before the only possible "workaround" was to use partitioning (this will split your hot table in several tables (partitions), means split your index as well, means split your contention by the number of partitions, etc)..
    • transaction lock (trx_sys mutex) -- greatly improved in MySQL 5.7 too
    • lock_sys overhead -- lowered in MySQL 5.7, but need yet to be more improved..
    • AHI (Adaptive Hash Index) contention (btr_search_latch RW-lock) -- there is a huge story behind it, but to make it short - you're better to disable it on RW workloads, as every data modification is involving AHI update (e.g. write lock), and you're quickly hitting a serialization here.. (work in progress to improve it)..
  • but well, as soon as you're using MySQL 5.7, your main RW "scalability limit" will be mostly log_sys contention ;-))
  • and, unfortunately, we were not able on MySQL 5.7 timeframe to improve this part of code as much as we made it for RO issues..
  • a true fix is requiring a complete REDO log management re-design, and our timing was not favorable here..
  • however, a probe prototype of the potential new solution showed us a great improvement (you can see its impact in the past LinkBench test results on MySQL 5.7)..
  • the amazing part of this probe patch was that we were able to reach the same or better performance while using innodb_flush_log_at_trx_commit=1 (and flushing REDO log on every transaction) vs innodb_flush_log_at_trx_commit=2 (flushing REDO log only once per second).. -- this clearly proved that the main issue here is not the IO related fsync() of REDO log file, but the REDO log management itself..
  • but well, we're not yet there ;-))
  • so, while our MySQL 5.7 scalability on RW workloads got better with innodb_flush_log_at_trx_commit=2, we're not really better with innodb_flush_log_at_trx_commit=1 yet (and on low loads / small HW configs you may see no difference vs MySQL 5.6) -- in fact, with other contentions lowered, the log_sys contention became hotter, and there is nothing to do about it except to get it fixed, so work is in progress here too ;-)) -- while with MySQL 5.6 you may instead still hit many other problems which were fixed only in MySQL 5.7, so the best answer here will be only your own test validation..

Well, this was about internal contentions which may limit RW scalability. While there are still few more factors :
  • trx_commit (trx) -- already mentioned before (innodb_flush_log_at_trx_commit=0/2/1) and, of course, flushing REDO log data to disk on every transaction commit (innodb_flush_log_at_trx_commit=1) for sure will bring more penalty than if you're flushing REDO only once per second (innodb_flush_log_at_trx_commit=2) -- while the risk here is to lose the last second's transaction(s) only (and maybe even nothing if your OS & storage did not crash, or if you're using semi-sync replication, or even less than the last 1 sec (because in reality the REDO log with innodb_flush_log_at_trx_commit=2 is still flushed more often than once per second), and even many "serious companies" are doing so, etc.etc.) -- but well, you're always better off evaluating what is valid for your own production ;-))

  • flush_method -- as you're writing to disk, you have to choose the way your page writes will be flushed to the disk.. -- InnoDB has several options here (and you may find many discussions around, with different people defending different option preferences, etc.) -- I'd say from all the past experience and fighting various issues with FS cache, my preferred option here will be to use O_DIRECT (or O_DIRECT_NOFSYNC when available) combined with AIO (innodb_flush_method=O_DIRECT_NOFSYNC and innodb_use_native_aio=1). And, curiously, I still prefer EXT4 (while many are claiming XFS is better) -- will post my observations about it later ;-))

  • double_write (dblwr) -- the only solution InnoDB has to protect your data from partially written pages on a system crash (so, InnoDB will write each page twice: first to the dblwr buffer disk space (sys tablespace), and once that write is confirmed, the page is written to its own place (and if the system crashes during that second write, the valid page copy will be recovered from the dblwr)) -- while I often hear that on "modern HW" there is no need to care about this, the risk is still here ;-)) and it's still up to you to decide whether to turn this protection ON or OFF (innodb_doublewrite=1/0). However, several alternatives are possible:
    • you may buy Fusion-io flash card and use their NVMFS filesystem which is supporting "atomic IO writes" (so each page write is confirmed to be fully written) -- MySQL 5.7 is supporting this card automatically (combined with O_DIRECT)
    • you may use "secured" by-design FS (like ZFS for ex. or ZFS Appliance) -- such a storage solution by definition will garantee you'll never loose any bit of your data ;-)) (on the same time don't be surprised your writes are going slower -- each write (and read!) is hardly verified) -- while this may still be faster than the current dblwr..
    • or use FS with data journal (like EXT4, but you have to use O_DSYNC with it, so some FS cache related surprises are potentially possible ;-))
    • etc..
    • I'd say the HW-based "atomic IO writes" solution is looking the strongest.. -- but we're working here as well to bring yet more possible options, so stay tuned ;-))

  • purge -- a kind of "garbage collector" in InnoDB, running in the background; it can be configured with several "purge threads", however you may still see it lagging in your RW workload (this can be observed as a growing or remaining high "History List" via "show engine innodb status" or via the InnoDB METRICS table) -- the problem with a constantly lagging purge is that your data space can finally be completely filled up with "trash", and your whole database processing will be stopped due to no more free disk space being available.. The good news with MySQL 5.7 is that even if purge is lagging during a high load, it'll still be able to catch up once the load becomes low and "auto-magically" free the disk space used by UNDO images (this is available only since MySQL 5.7; in all previous versions the only solution to get all this disk space back was to drop the whole InnoDB instance and restore it from a backup or import it from a dump).. -- so, it's important to configure several purge threads to make such a space recovery faster (innodb_purge_threads=4)

  • adaptive flushing -- I'll not go too much into details here as the topic is extremely interesting and worth a dedicated article, so I will just mention that since MySQL 5.7 you can have several "flushing threads" (cleaners) working in parallel -- the initial analysis of what was going on was made more than 3 years ago with MySQL 5.6 (see: http://dimitrik.free.fr/blog/archives/2012/10/mysql-performance-innodb-buffer-pool-instances-in-56.html for details) -- however this was only the first step in this adventure, and a more advanced design was required ;-)) -- well, we're not yet "perfect" here, yet more to come; I will just mention that using 4 threads is usually ok (innodb_page_cleaners=4), then the IO capacity setting should be adapted to your workload and your storage (ex. innodb_io_capacity=2000 innodb_io_capacity_max=10000), and there is no more danger in using bigger REDO log files (recovery processing is going much faster now than before, as well only a "really needed" REDO space is used, and a previously existing "read-on-write" issue on REDO logs was fixed since MySQL 5.7, so using 8GB REDO, or bigger, is no more a problem (innodb_log_file_size=1024M innodb_log_files_in_group=8) -- well, sorry to skip the details here, will provide them all later..

  • checksums -- as soon as you're using crc32 option, you're fine ;-)) however, keep in mind that this is not impacting your scalability limits, this is a pure "overhead" (your performance levels will still scale with the same tendency, just that the response times will be higher)..

  • there are some other points/tuning/etc. coming into the game as well, but let's keep the list short with just the most important ones ;-)) (a consolidated configuration sketch follows this list)
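To tie the settings mentioned above together, here is a rough my.cnf sketch (my addition; the values are simply the ones quoted in this list, not universal recommendations):

[mysqld]
innodb_flush_log_at_trx_commit = 1     # or 2, if losing up to ~1 second of transactions is acceptable
innodb_flush_method            = O_DIRECT_NOFSYNC   # or O_DIRECT if not available
innodb_use_native_aio          = 1
innodb_doublewrite             = 1     # 0 only if your storage guarantees atomic page writes
innodb_purge_threads           = 4
innodb_page_cleaners           = 4
innodb_io_capacity             = 2000
innodb_io_capacity_max         = 10000
innodb_log_file_size           = 1024M
innodb_log_files_in_group      = 8
innodb_checksum_algorithm      = crc32

Purge lag can be watched via the History List length, e.g. (metric name as exposed in 5.7's INNODB_METRICS):

SELECT NAME, COUNT FROM information_schema.INNODB_METRICS WHERE NAME = 'trx_rseg_history_len';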

After all this "preface", let's focus now on the OLTP_RW benchmark testing (hope it was not too boring until now ;-))

So far, my main goal on the following testing is to mainly analyze the scalability of MySQL 5.7 on OLTP_RW workload :
  • means, I don't need a too big database (I'm not testing the storage here ;-))
  • so, the dataset should be :
    • not too small to run fully on CPU caches level ;-))
    • and not too big either to not involve IO reads (otherwise, again, we're testing the storage performance ;-))

My HW platform :
  • for my tests I'll use the 72cores-HT server running OracleLinux-7.2 and having flash storage
  • why 72cores ?..
  • in fact this is a 4CPU sockets server (18cores-HT per CPU socket)
  • so, I can easily test scalability on 1CPU (18cores-HT), 2CPU (36cores-HT) and 4CPU (72cores-HT) by binding my MySQL server to run exclusively on these CPU cores..
  • then, these CPUs are the latest CPU chips from Intel, and they are really way more powerful compared to what I have on my older machines..
  • and this is where the whole HW tendency is going -- you'll see these CPUs on all "big" and "commodity" HW, and even 18cores-HT per CPU is not a limit either, so really fun times are coming (and if you're still thinking that "commodity" HW is a host with 4cores -- it's a good time to wake up ;-))

While my main interest here is about MySQL 5.7, I'm also curious to see what the limits are for the other MySQL Engines as well, and I have the following on my list :

MySQL Engines :
  • MySQL 5.7
  • MySQL 5.6
  • MySQL 5.5
  • Percona Server 5.6
  • MariaDB 10.1

Test Scenario :
  • from the previous OLTP_RO tests I've already observed that all engines are worse than MySQL 5.7 when only a single table is used in the OLTP test.. -- so, no need to waste time pointing to the same problem again..
  • let's focus then on the x8-tables OLTP_RW Sysbench test workload, each table with 1M rows
  • before each test the database is completely restored from its backup (clean dataset for each test)
  • the load is progressively growing from 8, 16, 32, .. up to 1024 concurrent users
  • each load level is kept for at least 5min (enough to get an understanding of the scalability limits; I'd have preferred longer steps, but there was no way to run longer iterations here, as covering all planned test conditions already took over 2 weeks of non-stop running ;-))
  • each MySQL Engine is tested within the following configurations :
    • trx2 -- innodb_flush_log_at_trx_commit=2 && innodb_doublewrite=0 (default)
    • trx1 -- innodb_flush_log_at_trx_commit=1 && innodb_doublewrite=0
    • trx1-dblwr1 -- innodb_flush_log_at_trx_commit=1 && innodb_doublewrite=1
  • each configuration is also tested with the following tuning combinations :
    • ccr0-sd6 -- innodb_thread_concurrency=0 (default) && innodb_spin_wait_delay=6 (default)
    • ccr64-sd6 -- innodb_thread_concurrency=64 && innodb_spin_wait_delay=6
    • ccr0-sd96 -- innodb_thread_concurrency=0 && innodb_spin_wait_delay=96
    • ccr64-sd96 -- innodb_thread_concurrency=64 && innodb_spin_wait_delay=96
  • and, finally, all configurations + all tuning combinations are tested on 1, then 2, then 4 CPU sockets (18cores-HT, 36cores-HT, 72cores-HT)..
  • the best results obtained for each Engine from any of the tested combinations are then used to compare performance in the different configurations (best-to-best comparison)..

I think I need to explain here in a bit more detail the impact of the mentioned tuning options :
  • thread_concurrency : a well known InnoDB tuning to limit the number of concurrently running threads (usually no longer required since MySQL 5.7 for RO workloads, but still helping for RW -- as we're writing, we will for sure involve IO operations + manage various row/data locking (via mutexes/RW-locks, etc.), so there is still a significant benefit possible with an "optimal" thread concurrency limitation). Which setting could you consider optimal?.. -- I'd say you need to analyze which peak performance level you're reaching on your workload without any concurrency limit (innodb_thread_concurrency=0) and see how many concurrent user sessions are running during this period -- this will then be your main concurrency target (by not allowing more than N concurrent threads you'll be able to keep your performance stable even under higher load (well, at least not see it quickly going down ;-)) -- in my cases the most optimal setting so far was 64 (innodb_thread_concurrency=64), while in your case it may well be something different (this tuning is fully dynamic, so you may do live experiments on any running workload at any time you want, as sketched just below ;-))

  • spin_delay : this tuning is directly related to how internal lock primitives (mutexes/RW-locks) are "spinning" on a lock wait (threads waiting on a lock will "sleep" for a given delay between spins before re-trying to acquire the lock again) -- the important point here is that a waiting thread in InnoDB will not really "sleep" during this delay, but rather execute a "pause" instruction on the CPU, so the CPU will switch to execute other thread(s), and the waiting thread will come back as soon as its "pause" is finished (for this reason the "show mutex" output about mutex/RW-lock spins/waits better reflects, as of today, InnoDB internal wait stats, as the time spent on such a wait is not really wasted). The question is then which value will be the most optimal here?.. -- again, you can only get it by testing it yourself ;-)) (this tuning is also dynamic) -- 6 is the default value, and I'm usually using 96 (innodb_spin_wait_delay=96) for big enough systems. Again, for RO workloads since MySQL 5.7 it's no longer required, while for RW workloads we'll hit log_sys mutex contention for sure, and such a tuning usually may help.. The only problem here is that this setting is applied to all lock primitives together, so you really need to experiment yourself to see what is better for you (see the sketch just below). However, by getting rid of hot contentions with every new improvement in InnoDB, we're progressively making the need for such a tuning obsolete.. (work in progress, stay tuned ;-))
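    Again, just a sketch of the kind of live experiment I mean (96 is only the value that worked best on my big boxes):

      SET GLOBAL innodb_spin_wait_delay = 96;   -- default is 6, fully dynamic
      SHOW ENGINE INNODB MUTEX;                 -- then watch how mutex/RW-lock spins & waits evolve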

Now, let me show the impact of this tuning by example :
  • the following graph is representing MySQL 5.7 results on OLTP_RW test
  • there are 4 results for the same MySQL 5.7, just with different concurrency/spin_delay tuning settings: ccr=0 / 64, sd=6 / 96


Observations :
  • as you can see, tuning spin_delay gives the most important impact for this Engine on this workload..
  • with spin_delay=6 (sd6) we're getting better performance up to 64 concurrent users
  • however with spin_delay=96 (sd96) we go further, up to 128 users, and are then able to keep nearly the same level of performance on a higher load as well..
  • interestingly, in this case tuning thread concurrency helps only with the sd6 setting, and has no impact with sd96
  • (but from past experience I know it helps a lot on IO-bound workloads, so no reason not to test it ;-))

The same tuning was applied to all other Engines, and the best results obtained (Max(QPS) or Max(TPS)) were then collected for each test case.

Now, if you're curious, let me show you a few more details :
  • the following graphs represent the "live" stats data corresponding to the results obtained above
  • from the left to the right you can see 4 tests with the same MySQL 5.7, but configured with :
    • #1) sd6, ccr0
    • #2) sd6, ccr64
    • #3) sd96, ccr0
    • #4) sd96, ccr64
  • the first graph is showing reached Commit/sec rate (TPS)
  • the second one is the amount of concurrent user sessions
  • and the third graph is showing corresponding mutex/RW-locks spin waits reported by InnoDB :


Observations :
  • as you can see, the default case #1) hits the highest lock contention and reaches the lowest TPS..
  • tuning concurrency=64 in case #2) helps to lower waits on the other locks, except log_sys, and also helps to avoid a TPS drop on a higher load..
  • tuning spin_delay=96 in case #3) finally lowers log_sys waits and gives us the highest TPS result here
  • adding concurrency=64 in case #4) lowers spin waits yet more, but then the processing becomes "too relaxed", and the TPS result is not better, though it becomes more stable ;-))

And if we go into yet more detail about the observed spin waits, we may discover the following (adding CPU Usage% and Perf Profiler stats) :

Observations :
  • well, just to show you that only once spin_delay is set to 96 do we start to use the CPU time fully..
  • however, we're spending 15%, 20%, or even over 25% of it in the "sleeping" code (ut_delay())
  • and this is where our future potential gain is ;-))

I very much hope the next MySQL/InnoDB version will get rid of all these lock contentions and use the HW way more efficiently.. -- let's see ;-))

Now, let's go back to the beginning of all this story (over 2 years ago), and have a look at where we finally are today!

So far, just as a reminder, here are the results on OLTP_RO workload obtained on the all mentioned MySQL Engines on the same 72cores-HT server (and published before) :

Sysbench OLTP_RO 1M x 8-tables @72cores-HT (QPS) :

As you can see, MySQL 5.7 is reaching here 1M QPS, while MySQL 5.6 (and other "5.6 based") Engines are blocked around 400K QPS...

Now, what about OLTP_RW ?..

Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=2 dblwr=0 (QPS) :

Observations :
  • NOTE : these are OLTP_RW results with Engines configured with trx_commit=2 and dblwr=0, so the max possible OLTP_RW performance is reached here..
  • NOTE : the results are presented in QPS (and not TPS) to be "compatible" with OLTP_RO
  • so far MySQL 5.7 is reaching 800K QPS here, the best result of all Engines, and its OLTP_RW result is lower than its 1M QPS on OLTP_RO (as should naturally be "expected")
  • Percona Server 5.6 is in second position with not far from 600K QPS, way higher than the 400K QPS it obtained on OLTP_RO..
  • the 3rd is MySQL 5.6 with 500K QPS on OLTP_RW, vs 400K QPS on OLTP_RO
  • 4th is MariaDB 10.1 with 450K QPS on OLTP_RW, also a higher result than on OLTP_RO..

So far, the OLTP_RO vs OLTP_RW target was finally reached by MySQL 5.7 only. Time for other challenges, work in progress ;-))

Let's now go back to OLTP_RW results more in details and see the impact of all tested configurations.

Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=2 dblwr=0 (TPS) :


MySQL 5.7 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=2 dblwr=0 (TPS) :


MySQL 5.6 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=2 dblwr=0 (TPS) :


MySQL 5.5 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=2 dblwr=0 (TPS) :


Percona Server 5.6 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=2 dblwr=0 (TPS) :


MariaDB 10.1 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=2 dblwr=0 (TPS) :


Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=0 (TPS) :


MySQL 5.7 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=0 (TPS) :


MySQL 5.6 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=0 (TPS) :


MySQL 5.5 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=0 (TPS) :


Percona Server 5.6 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=0 (TPS) :


MariaDB 10.1 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=0 (TPS) :



Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=1 (TPS) :


MySQL 5.7 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=1 (TPS) :


MySQL 5.6 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=1 (TPS) :


MySQL 5.5 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=1 (TPS) :


Percona Server 5.6 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=1 (TPS) :


MariaDB 10.1 Scalability @Sysbench OLTP_RW 1M x 8-tables @72cores-HT, config: trx_commit=1 dblwr=1 (TPS) :



From the OLTP_RW results presented above you can see that :
  • switching to trx_commit=1 gives the biggest impact on performance..
  • adding dblwr=1 here doesn't change much, because the flushing involved in the tested workload is relatively light (not too hard for the flash storage used)
  • with trx_commit=2 the absolute winner is MySQL 5.7
  • with trx_commit=1 the winner on 72cores is MySQL 5.7, while on 36cores it's Percona Server 5.6 (a small, but visible gain), and on 18cores rather MySQL 5.5 (surprise! ;-)), except that it has a drop at 1024 users
  • with trx_commit=1 + dblwr=1 the result is not much different from just trx_commit=1 :


INSTEAD OF SUMMARY :
  • there is a really huge scalability gain in MySQL 5.7 !..
  • reaching over 40K TPS on OLTP_RW is the highest result I've ever seen on MySQL until now ;-)
  • however, there is yet more to do in MySQL for efficiency and further scalability improvements in RW workloads..
  • the main challenges are around REDO log management
  • while on heavily IO-bound RW workloads the doublewrite becomes a huge problem as well (more about this later, in the next articles)..
  • well, work in progress, stay tuned ;-))


As usual, any comments are welcome! And thank you for using MySQL! ;-))

Rgds,
-Dimitri

APPENDIX
The Sysbench command used to run OLTP_RW test via IP port (starting 8 processes in parallel):

$ LD_PRELOAD=/usr/lib64/libjemalloc.so /BMK/sysbench --num-threads=$1 \
    --test=oltp --oltp-table-size=1000000 \
    --oltp-dist-type=uniform --oltp-table-name=sbtest_1M_$n \
    --max-requests=0 --max-time=$2 --mysql-host=127.0.0.1 --mysql-port=5700 \
    --mysql-user=dim --mysql-password=dim --mysql-db=sysbench \
    --mysql-table-engine=INNODB --db-driver=mysql \
    run > /tmp/test_$n.log &

the my.cnf I've used during the tests :

[mysqld]
# general
table_open_cache = 8000
table_open_cache_instances=16
back_log=1500
query_cache_type=0
max_connections=4000
# files
innodb_file_per_table
innodb_log_file_size=1024M
innodb_log_files_in_group=3
innodb_open_files=4000
# buffers
innodb_buffer_pool_size= 32000M
innodb_buffer_pool_instances=32
innodb_log_buffer_size=64M
# tune
innodb_checksums=0
innodb_doublewrite= 0 / 1
innodb_support_xa=0
innodb_thread_concurrency=0 / 64
innodb_flush_log_at_trx_commit=2 / 1
innodb_flush_method=O_DIRECT_NOFSYNC
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
innodb_lru_scan_depth=4000
innodb_page_cleaners=4
join_buffer_size=32K
sort_buffer_size=32K
innodb_use_native_aio=1
innodb_stats_persistent = 1
innodb_spin_wait_delay=6 / 96
# perf special
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_io_capacity=2000
innodb_io_capacity_max=4000
innodb_purge_threads=4
innodb_max_purge_lag_delay=30000000
innodb_max_purge_lag=0
innodb_adaptive_hash_index=0
# Monitoring
innodb_monitor_enable = '%'
performance_schema=OFF

PlanetMySQL Voting: Vote UP / Vote DOWN

Log Buffer #459: A Carnival of the Vanities for DBAs

This Log Buffer Edition arranges few tips and tricks from the blogs of Oracle, SQL Server and MySQL.

Oracle:

Oracle® Solaris innovation is due in part to the UNIX® standard (1), the test suites (2) and the certification (3). By conforming to the standard, using the test suites and driving to certification, Oracle® Solaris software engineers can rely on stable interfaces and an assurance that any regressions will be found quickly, given more than 50,000 test cases.

Building on the program established last year to provide evaluation copies of popular FOSS components to Solaris users, the Solaris team has announced the immediate availability of additional and newer software, ahead of official Solaris releases.

Tracing in Oracle Reports 12c.

Issues with Oracle Direct NFS.

An interesting observation came up on the Oracle-L list server a few days ago that demonstrated how clever the Oracle software is at minimising run-time work, and how easy it is to think you know what an execution plan means when you haven’t actually thought through the details – and the details might make a difference to performance.

SQL Server:

Manipulating Filetable Files Programmatically

Auto-suggesting foreign keys and data model archaeology

Create/write to an Excel 2007/2010 spreadsheet from an SSIS package.

Tabular vs Multidimensional models for SQL Server Analysis Services.

The PoSh DBA – Towards the Re-usable PowerShell Script.

MySQL:

MyRocks vs InnoDB with Linkbench over 7 days.

MySQL has been able to harness the potential of more powerful (CPU) and larger (RAM, disk space) servers.

Setup a MongoDB replica/sharding set in seconds.

MySQL 5.7 makes secure connections easier with streamlined key generation for both MySQL Community and MySQL Enterprise, improves security by expanding support for TLSv1.1 and TLSv1.2, and helps administrators assess whether clients are connecting securely or not.

While EXPLAIN shows the selected query plan for a query, optimizer trace will show you WHY that particular plan was selected. From the trace you will be able to see which alternative plans were considered, the estimated costs of the different plans, and which decisions were made during query optimization.
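A minimal sketch of how to pull such a trace for your own query (the sbtest table here is just a placeholder for whatever query you want to analyze):

  SET optimizer_trace = 'enabled=on';
  SELECT * FROM sbtest_1M_1 WHERE id BETWEEN 1 AND 100;   -- the query to analyze
  SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;   -- plans considered, costs, decisions
  SET optimizer_trace = 'enabled=off';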


PlanetMySQL Voting: Vote UP / Vote DOWN

Percona Server 5.7.10-2 second RC available

Percona is glad to announce the second release candidate of Percona Server 5.7.10-2 on February 8, 2016. Download the latest version from the Percona web site or from the Percona Software Repositories.

New Features:

  • Complete list of changes between Percona Server 5.6 and 5.7 can be seen in Changed in Percona Server 5.7.
  • 5.7 binlog group commit algorithm is now supported in TokuDB as well.
  • New TokuDB index statistics reporting has been implemented to be compatible with the changes implemented in upstream 5.7. Following the InnoDB example, the default value for tokudb_cardinality_scale_percent has been changed from 50% to 100%. Implementing this also addresses a server crash deep in the optimizer code.

Known Issues:

  • In Percona Server 5.7 the super_read_only feature has been replaced with the upstream implementation. There are currently two known issues compared to the Percona Server 5.6 implementation:
    • Bug #78963, super_read_only aborts STOP SLAVE if variable relay_log_info_repository is set to TABLE which could lead to a server crash in Debug builds.
    • Bug #79328, super_read_only set as a server option has no effect.
  • InnoDB crash recovery might fail if innodb_flush_method is set to ALL_O_DIRECT. The workaround is to set this variable to a different value before starting up the crashed instance (bug #1529885).

Bugs Fixed:

  • Clustering secondary index could not be created on a partitioned TokuDB table. Bug fixed #1527730 (#720).
  • Percona TokuBackup was failing to compile with Percona Server 5.7. Bug fixed #123.
  • Granting privileges to a user authenticating with PAM Authentication Plugin could lead to a server crash. Bug fixed #1521474.
  • TokuDB status variables were missing from Percona Server 5.7.10-1. Bug fixed #1527364 (#923).
  • Attempting to rotate the audit log file would result in audit log file name foo.log.%u (literally) instead of a numeric suffix. Bug fixed #1528603.
  • Adding an index to an InnoDB temporary table while expand_fast_index_creation was enabled could lead to server assertion. Bug fixed #1529555.
  • TokuDB would not be upgraded on Debian/Ubuntu distributions while performing an upgrade from Percona Server 5.6 to Percona Server 5.7 even if explicitly requested. Bug fixed #1533580.
  • Server would assert when both TokuDB and InnoDB tables were used within one transaction on a replication slave which has binary log enabled and slave updates logging disabled. Bug fixed #1534249 (upstream bug #80053).
  • MeCab Full-Text Parser Plugin has not been included in the previous release. Bug fixed #1534617.
  • Fixed server assertion caused by Performance Schema memory key mix-up in SET STATEMENT ... FOR ... statements. Bug fixed #1534874.
  • Setting the innodb_sched_priority_purge (available only in debug builds) while purge threads were stopped would cause a server crash. Bug fixed #1368552.
  • Enabling TokuDB with the ps_tokudb_admin script inside a Docker container would cause an error due to insufficient privileges even when running as root. In order for this script to be usable inside Docker containers, this error has been changed to a warning stating that the check is impossible. Bug fixed #1520890.
  • Write-heavy workload with a small buffer pool could lead to a deadlock when free buffers are exhausted. Bug fixed #1521905.
  • InnoDB status would start printing negative values for spin rounds per wait if the wait count, although accounted as a signed 64-bit integer, did not fit into a signed 32-bit integer. Bug fixed #1527160 (upstream #79703).
  • Percona Server 5.7 couldn’t be restarted after TokuDB has been installed with ps_tokudb_admin script. Bug fixed #1527535.
  • Fixed memory leak when utility_user is enabled. Bug fixed #1530918.
  • Page cleaner worker threads were not instrumented for Performance Schema. Bug fixed #1532747 (upstream bug #79894).
  • Busy server was preferring LRU flushing over flush list flushing too strongly which could lead to performance degradation. Bug fixed #1534114.
  • libjemalloc.so.1 was missing from a binary tarball. Bug fixed #1537129.
  • When cmake/make/make_binary_distribution workflow was used to produce binary tarballs it would produce tarballs with mysql-... naming instead of percona-server-.... Bug fixed #1540385.
  • Added proper memory cleanup in case a table for some reason cannot be opened from a dead closed state. This prevents an assertion the next time an attempt is made to open the table. Bug fixed #917.
  • Variable tokudb_support_xa has been modified to prevent setting it to anything but ON/ENABLED and to print a SQL warning anytime an attempt is made to change it, just like innodb_support_xa. Bug fixed #928.

Other bugs fixed: #1179451, #1534246, #1524763, #1525109 (upstream #79569), #1530102, #897, #898, #899, #900, #901, #902, #903, #905, #906, #907, #908, #909, #910, #911, #912, #913, #915, #919, and #904.

Release notes for Percona Server 5.7.10-2 are available in the online documentation. Please report any bugs on the launchpad bug tracker .


PlanetMySQL Voting: Vote UP / Vote DOWN

Percona XtraBackup 2.4.0-rc1 is now available

Percona is glad to announce the first release candidate of Percona XtraBackup 2.4.0-rc1 on February 8th 2016. Downloads are available from our download site and from apt and yum repositories.

This is a Release Candidate quality release and it is not intended for production. If you want a high quality, Generally Available release, the current Stable version should be used (currently 2.3.3 in the 2.3 series at the time of writing).

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

This release contains all of the features and bug fixes in Percona XtraBackup 2.3.3, plus the following:

New Features:

  • Percona XtraBackup has implemented basic support for MySQL 5.7 and Percona Server 5.7.

Known Issues:

  • Backed-up table data could not be recovered if backup was taken while running OPTIMIZE TABLE (bug #1541763) or ALTER TABLE ... TABLESPACE (bug #1532878) on that table.
  • Compact Backups currently don’t work due to bug #1192834.

Release notes with all the bugfixes for Percona XtraBackup 2.4.0-rc1 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.


PlanetMySQL Voting: Vote UP / Vote DOWN
