Planet MySQL

Leveraging ProxySQL with AWS Aurora to Improve Performance

Or How ProxySQL Out-performs Native Aurora Cluster Endpoints

In this blog post, I'll look at how you can use ProxySQL with AWS Aurora to further leverage database performance. My previous article described how easy it is to replace the native Aurora connector with ProxySQL. In this article, you will see WHY you should do that. It is important to understand that aside from the basic optimization in connectivity and connection management, ProxySQL also provides you with a set of features that are currently not available in Aurora (a configuration sketch follows the list below). Just think:

  • Better caching
  • Query filtering
  • Sharding
  • Query substitution
  • Firewalling
  • ... and more
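
To make the first item concrete: ProxySQL is configured through an admin interface that speaks SQL, and query caching is just a rule. The sketch below is illustrative only (the rule_id, digest pattern and TTL are my own example values, not the configuration used in these tests):

-- connect to the ProxySQL admin interface (port 6032 by default)
-- and cache the result set of matching SELECTs for 5 seconds
INSERT INTO mysql_query_rules (rule_id, active, match_digest, cache_ttl, apply)
VALUES (10, 1, '^SELECT .* FROM sbtest1', 5000, 1);   -- cache_ttl is in milliseconds
LOAD MYSQL QUERY RULES TO RUNTIME;   -- apply without a restart
SAVE MYSQL QUERY RULES TO DISK;      -- persist across restarts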

We will cover areas like scalability, security and performance. In short, I think it is more than worth it to spend some time and give ProxySQL with AWS Aurora a try.

The tests

I will show you the results from two different kinds of tests. One is sysbench-oriented; the other simulates a more complex application using Java, data objects and a Hikari connection pool in the middle. For the EC2 and Aurora platform I used:

  • Application/ProxySQL T2.xlarge eu-central-1a
  • 2 Aurora MySQL 5.7.12 db.t2.medium eu-central-1a
  • 1 Aurora MySQL 5.7.12 db.t2.medium eu-central-1b for AZ redundancy

The code for the application is available here, and for the sysbench tests here. All the data and configurations for the application are available here. I ran three tests using both bench apps, obviously with Aurora as it comes out of the box and with ProxySQL. For the ProxySQL configuration, see my previous article. The tests were read_only / write_only / read_write. For Aurora, I only increased the number of connections and otherwise kept the out-of-the-box configuration. Note that each test was run at least three times, at different moments of the day and on different days. The data reported as final is the BEST performing result for each one.

The Results

For the impatient among us, here is a summary table of the tests.

Sysbench:

Java App:

Now if this is enough for you, you can go to the conclusion and start to use ProxySQL with AWS Aurora. If you would like to know a bit more, continue reading. Aside from any discussion on the benchmark tool and settings, I really focused on identifying the differences between the two “connectors”. Given the layer below was exactly the same, any difference is due to the simple substitution of the endpoint.

Sysbench Read Only

The first image reports the number of events achieved at the time of the test. It is quite clear that when using ProxySQL, sysbench ran more events.
In this graph, higher is better:

In this graph, lower is better:


As we can see, the latency when using an Aurora cluster entry point is higher. True, we are talking about milliseconds, but it is not just the value that matters, but also the distribution:

Aurora cluster endpoint vs. ProxySQL

An image is worth a thousand words! As we can see, the behavior stays consistent when analyzing the READs executed, with ProxySQL performing better.

In this graph, higher is better:

In this graph, higher is better:

Closing with the total number of queries performed, ProxySQL surpassed the cluster endpoint by ~4K queries.

Write Only

For writes, things go a bit differently. We see that all lines intersect, and the values are very close to one another.

I will let the images speak for themselves:

In this graph, higher is better:

In this graph, lower is better:

Latency spiked in each ProxySQL test, and it may require additional investigation and tuning.

 In this graph, higher is better:

While the rates of writes/sec intersect with each other frequently, in the end ProxySQL resulted in more writes than the native endpoint.

In this graph, higher is better:

In the end, a difference exists and is consistent across the different test iterations, but it is minimal. We are talking about a range of 25-50 entries in total.

This result is not surprising, and it will be clear why later in the article.


Read and Write

As expected in the read and write test, we see a different situation.

ProxySQL is still performing better than the default entry point, but not by such a big margin as in read-only tests.

In this graph, higher is better:

In this graph, lower is better

Latency and events follow the expected trend, where read operations are executed more efficiently with ProxySQL, and writes are close, but NOT the same as in the write-only test.

As a result, the number of queries in ProxySQL is approximately 13% better than the default entry point.

Java Application Tests

What about the Java application? First of all, we need to remember that the application used a connection pool mechanism (Hikari), and the connection pool was present in all cases (for both the Aurora cluster endpoint and ProxySQL). Given that a small delay in establishing the first connection was expected, you can easily see this in the MAX value of the connection latency. In this graph, lower is better.

The connection latency reported here is expressed in nanoseconds and measures the time taken by the connection provider to return an active connection to the application from the moment the application requested it. In other words, how long HikariCP takes to choose/check/return an open connection. As you can see, the MAX value is drastically higher, and this was expected since it is the connection initialization. While not really interesting in terms of performance, this value is interesting because it gives us the dimension of the cost in the CP to open a new connection, which in the worst case is 25 milliseconds. As the graphs show, ProxySQL manages both cases (first call, reassignment) more efficiently.
In this graph, higher is better.

In the CRUD summary table, we can see the number of SQL commands executed per second for each CRUD action and for each test. Once more we can see that when using ProxySQL, the application performed much better and executed significantly more operations (especially in the R/W test).

  In this graph, higher is better.

This graph represents the total number of events run at the time of the test. An event is a full application cycle, which sees the application generate the data needed to fill the SQL (no matter if it is for read/write), create the SQL, request the connection, push the SQL, get and read the resultset returned and give back the connection. Once more, ProxySQL shows better performance.
In this graph, lower is better.

The execution time reported in this graph is the time taken by the application to run a whole event. That is, execution time is the time of a full cycle. The faster the cycle is executed, the better the application is performing. The time is expressed in milliseconds, and it goes from a very fast read, which probably accesses the cache in Aurora, to almost two seconds taken to insert a batch of rows. Needless to say, the tests using ProxySQL performed better.

But Why?

Why do the tests using ProxySQL perform better? After all, it is just an additional step in the middle, which also has a cost in intercepting the queries and managing the connections.

So why the better performance? The answer is simple and can be found in the Aurora manual: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.html#Aurora.Overview.Endpoints.
The Cluster endpoint is an endpoint for an Aurora DB cluster that connects to the current primary instance for that DB cluster. Each Aurora DB cluster has a cluster endpoint and one primary instance.

That endpoint receives read and write requests and sends them to the same instance. Its main purpose is to support failover if needed. At the same time, the Reader endpoint is an endpoint for an Aurora DB cluster that connects to one of the available Aurora Replicas for that DB cluster.

Each Aurora DB cluster has a reader endpoint. If there is more than one Aurora Replica, the reader endpoint directs each connection request to one of the Aurora Replicas. The reader endpoint only load balances connections to available Aurora Replicas in an Aurora DB cluster. It does not load balance specific queries.

If you want to load balance queries to distribute the read workload for a DB cluster, you need to manage that in your application and use instance endpoints to connect directly to Aurora Replicas to balance the load.
This means that to perform a Read/Write split, your application must manage two entry points and you will NOT have much control over how the queries are handled or to which replica instance they are directed.

This could lead to unexpected results and delays. Needless to say, ProxySQL does all that by default (as described in my previous article; a sketch of the routing rules is below). Now that we've clarified how Aurora entry points behave, let's see about the performance difference.
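
To show what "by default" looks like in practice, the read/write split in ProxySQL is driven by a couple of query rules that route traffic between a writer and a reader hostgroup. A minimal sketch (the hostgroup numbers 70/71 are illustrative; see the previous article for the actual setup used here):

-- ProxySQL admin interface
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (20, 1, '^SELECT.*FOR UPDATE', 70, 1);  -- locking reads stay on the writer
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (21, 1, '^SELECT', 71, 1);              -- plain reads go to the reader replicas
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;

Anything that matches no rule (writes, DDL, and so on) is sent to the user's default hostgroup, i.e. the writer.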

How do we read this graph? From left to right:

  • read_only test with an Aurora cluster endpoint
  • read_only test with ProxySQL
  • write_only with an Aurora cluster endpoint
  • write_only with ProxySQL
  • read and write with an Aurora cluster endpoint
  • read and write with ProxySQL

Here we go! As we can see, the tests with ProxySQL used the two configured instances, splitting R/W without the need to do anything on the application side. I purposely avoided the AZ replica because I previously identified it as having higher latency, so I can exclude it and use it ONLY in the case of an emergency.

The effects are clear in the next graph.

When using the cluster endpoint, given all the load was on a single instance, the CPU utilization is higher and that became a bottleneck. When using ProxySQL, the load is spread across the different instances, allowing real read scalability. This has immediate benefits in read and read/write operations, allowing better load distribution that results in better performance.

Conclusions

Aurora is a very interesting technology and can be a very good solution for read scaling.
But at the moment, the way AWS offers data connectivity with the Cluster endpoints and Reader endpoints can negatively affect performance.

The lack of configuration options and the limitation of having to use different endpoints lead to confusion and less optimized utilization.

The introduction of ProxySQL, which now supports Aurora (from version 2), allows an architect, SA or DBA to properly configure the environment. You can choose very granularly how to use each instance, without needing to modify how the application works. This helps keep the data layer solution separate from the application layer. Even better, this additional flexibility does not come at a cost.

On the contrary, it improves resource utilization and brings higher performance using less powerful instances. Given the cost of Aurora, this is not a secondary benefit.

  I suggest you try installing ProxySQL v2 (or higher) in front of your Aurora cluster. If you don't feel confident and prefer to have us help you, contact us and we will be more than happy to support you!

MySQL 8.0: Statements Latency Histograms

In the end, we all agree that what really matters is the Query Response Time, isn't it?

MySQL 8.0 engineers have added more visibility in the summary tables, as explained in this previous post. To add even more detail, the Performance_Schema now also collects statement latency and provides its distribution thanks to the collected histogram data.

Two tables have been added to Performance_Schema:

  • events_statements_histogram_by_digest: details about latency related to schema and query digest
  • events_statements_histogram_global: global latency summary across all schemas and queries

Let’s have a look at what we can see:

mysql> SELECT t1.SCHEMA_NAME,
    ->        CONCAT(ROUND(BUCKET_QUANTILE*100,2),"% under ",
    ->               BUCKET_TIMER_HIGH/1000000000," milliseconds") fact,
    ->        LEFT(QUERY_SAMPLE_TEXT,64) as QUERY_SAMPLE,
    ->        t1.DIGEST,
    ->        COUNT(t1.DIGEST) OVER(PARTITION BY t1.DIGEST) as TOT
    ->   FROM events_statements_histogram_by_digest t1
    ->   JOIN events_statements_summary_by_digest t2 ON t2.DIGEST = t1.DIGEST
    ->  WHERE COUNT_BUCKET > 1
    ->  ORDER BY t1.DIGEST, BUCKET_TIMER_HIGH DESC
    ->  LIMIT 10\G
*************************** 1. row ***************************
 SCHEMA_NAME: sbtest
        fact: 86.84% under 100.0000 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
*************************** 2. row ***************************
 SCHEMA_NAME: sbtest
        fact: 81.58% under 95.4993 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
*************************** 3. row ***************************
 SCHEMA_NAME: sbtest
        fact: 73.68% under 87.0964 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
*************************** 4. row ***************************
 SCHEMA_NAME: sbtest
        fact: 68.42% under 83.1764 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
*************************** 5. row ***************************
 SCHEMA_NAME: sbtest
        fact: 63.16% under 75.8578 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
*************************** 6. row ***************************
 SCHEMA_NAME: sbtest
        fact: 57.89% under 63.0957 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
*************************** 7. row ***************************
 SCHEMA_NAME: sbtest
        fact: 47.37% under 50.1187 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
*************************** 8. row ***************************
 SCHEMA_NAME: sbtest
        fact: 39.47% under 45.7088 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
*************************** 9. row ***************************
 SCHEMA_NAME: sbtest
        fact: 26.32% under 39.8107 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
*************************** 10. row ***************************
 SCHEMA_NAME: sbtest
        fact: 18.42% under 36.3078 milliseconds
QUERY_SAMPLE: INSERT INTO sbtest1(k, c, pad) VALUES(49929, '83868641912-287739
      DIGEST: 0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb
         TOT: 10
10 rows in set (0.02 sec)

The LIMIT should match the value of TOT.

We can see (row 7) that only 47.37% of the executions of this query complete in 50ms or less, and only 18.42% complete in less than 36ms.
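
If you only need a single percentile instead of the whole distribution, you can fetch the first bucket whose cumulative quantile crosses the threshold. A sketch, reusing the digest from the output above (BUCKET_TIMER_HIGH is in picoseconds, hence the division):

SELECT BUCKET_TIMER_HIGH/1000000000 AS p95_upper_bound_ms
  FROM performance_schema.events_statements_histogram_by_digest
 WHERE DIGEST = '0b515c3fcc7c2095d2b2dccf0040f79254a6fabcf96c7bb9bd597398abba69fb'
   AND BUCKET_QUANTILE >= 0.95
 ORDER BY BUCKET_NUMBER
 LIMIT 1;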

It’s also possible to illustrate those results like this:

mysql> SELECT CONCAT('<',ROUND(BUCKET_TIMER_HIGH/1000000000,2),'ms') as 'QRT',
    ->        CONCAT(RPAD('',ROUND(BUCKET_QUANTILE*100),'*'),
    ->               ROUND(BUCKET_QUANTILE*100,2),"%") bar,
    ->        COUNT(t1.DIGEST) OVER(PARTITION BY t1.DIGEST) as TOT
    ->   FROM events_statements_histogram_by_digest t1
    ->   JOIN events_statements_summary_by_digest t2 ON t2.DIGEST = t1.DIGEST
    ->  WHERE COUNT_BUCKET > 1
    ->  ORDER BY t1.DIGEST, BUCKET_TIMER_HIGH DESC
    ->  LIMIT 10;
+-----------+-----------------------------------------------------------------------------------------------+-----+
| QRT       | bar                                                                                           | TOT |
+-----------+-----------------------------------------------------------------------------------------------+-----+
| <100.00ms | ***************************************************************************************86.84% | 10  |
| <95.50ms  | **********************************************************************************81.58%      | 10  |
| <87.10ms  | **************************************************************************73.68%              | 10  |
| <83.18ms  | ********************************************************************68.42%                    | 10  |
| <75.86ms  | ***************************************************************63.16%                         | 10  |
| <63.10ms  | **********************************************************57.89%                              | 10  |
| <50.12ms  | ***********************************************47.37%                                         | 10  |
| <45.71ms  | ***************************************39.47%                                                 | 10  |
| <39.81ms  | **************************26.32%                                                              | 10  |
| <36.31ms  | ******************18.42%                                                                      | 10  |
+-----------+-----------------------------------------------------------------------------------------------+-----+
10 rows in set (0.02 sec)

In addition to these tables, the sys schema provides a procedure that illustrates the latency histogram too: ps_statement_avg_latency_histogram().

mysql> CALL sys.ps_statement_avg_latency_histogram()\G
*************************** 1. row ***************************
Performance Schema Statement Digest Average Latency Histogram:

  . = 1 unit
  * = 2 units
  # = 3 units

(0 - 348ms)     64 | ********************************
(348 - 696ms)    7 | .......
(696 - 1044ms)   2 | ..
(1044 - 1393ms)  2 | ..
(1393 - 1741ms)  1 | .
(1741 - 2089ms)  0 |
(2089 - 2437ms)  0 |
(2437 - 2785ms)  0 |
(2785 - 3133ms)  0 |
(3133 - 3481ms)  1 | .
(3481 - 3829ms)  0 |
(3829 - 4178ms)  0 |
(4178 - 4526ms)  0 |
(4526 - 4874ms)  0 |
(4874 - 5222ms)  0 |
(5222 - 5570ms)  0 |

Total Statements: 78; Buckets: 16; Bucket Size: 348 ms;

1 row in set (0.02 sec)
Query OK, 0 rows affected (0.02 sec)

As you can see, MySQL 8.0 adds more visibility in many domains, helping you analyze and better understand your workload.


Tungsten Clustering 6.0 and Tungsten Replicator 6.0 are now available!

Continuent is very pleased and excited to announce that the new Tungsten Clustering 6.0 and Tungsten Replicator 6.0 are now available for download by our customers. The 6.0 release is the culmination of over a year's work within our clustering product to improve the functionality and manageability of what we now call our 'Multimaster Clustering' solution. This is the replacement for what we called the multi-site, multi-master (MSMM) clustering functionality in earlier releases. The multimaster clustering allows multiple clusters, in multiple locations, to be linked together into a single composite cluster. Because it's a composite cluster, you gain all of the functionality that's already available in a cluster, such as:
  • High availability
  • Failover
  • Automated recovery
  • Read-write split
  • Maintenance without downtime
But it's now applied to the composite environment, so this functionality works across sites and multiple clusters. The key component to making this happen was for us to combine the technology used in the previous MSMM iteration into a single, managed project. In 5.3 and earlier, our MSMM deployment relied on using the core clustering product with a separate Tungsten Replicator product deployment to provide cross-site replication. We've now embedded the cross-site replication into the cluster, and then expanded the management layer provided by the Tungsten Manager so that we can monitor and control the cross-site replicators and the entire composite cluster from a single location. For the first time with Tungsten Clustering, a multimaster cluster spread across multiple sites now has:
  • Failover and recovery between sites
  • Single unified view of the composite cluster
  • Single unified composite cluster
  • Single unified replication status across the cluster
  • Tungsten Connector is aware of the composite structure
The result is a much simpler and more capable multimaster cluster solution, all manageable with our cctrl command-line client. As part of the process, we've made some other minor changes:
  • The package has been renamed to tungsten-clustering to match the product name we've been using.
  • Support for Java 7 has been dropped. You must use at least Java 8 for 6.0 and higher.
  • We have upgrade support for moving directly from both the 4.x and 5.x versions of Tungsten Clustering, and we even have a script that will help with the process.

Tungsten Replicator 6.0

We've also made some improvements to Tungsten Replicator 6.0. Some of these are under the hood, again in preparation for new products and functionality, but some of them build on the technology and capability we already have. In particular, we've added:
  • Re-introduction of the support for writing data into PostgreSQL.
  • Significant expansion of the Kafka applier to provide more detailed information in the messages, including whole transaction information so that you can verify message states.
  • Support for replicating data from MySQL and Oracle databases that don’t have explicit database primary keys to analytics targets.
  • Support for larger server IDs for MySQL sources to allow the replicator to work better within some cloud deployments.
As always, in both products, the releases contain a few bug fixes and tweaks. You can read the full release notes here:

https://docs.continuent.com/tungsten-clustering-6.0/release-notes-6-0-0.html
https://docs.continuent.com/tungsten-replicator-6.0/release-notes-6-0-0.html

To say that Continuent as a team is really excited about this new release is an understatement! If you have any questions or would like to discuss upgrades with the support team, please use the support portal to submit a request.

Performance Schema for MySQL Troubleshooting Webinar: Q & A

In this blog, I will provide answers to the Q & A for the Performance Schema for MySQL Troubleshooting webinar.

First, I want to thank everybody for attending my March 1, 2018, webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I was unable to answer fully during the webinar.

Q: Is Workbench able to take advantage of the enhancements to Perf schema?

A: MySQL Workbench is a graphical tool for database architects, administrators and developers. It uses Performance Schema for its Performance Schema Reports and Query Statistics dashboards for MySQL Servers of version 5.6 or greater. So the answer is: yes, it is able to take advantage of the enhancements to Performance Schema.

Q: Can we check the history data?

A: Yes. To do this, you need to enable the history consumers. You will find instructions here. For all kinds of consumers, history table names follow the same pattern:

  • *_history contains the last N events per thread. The number N is defined by the performance_schema_*_history_size variables. The default is -1 (autosized) in version 5.7 and 10 in version 5.6.
  • *_history_long contains the most recent M events. The value of M is defined by the performance_schema_*_history_long_size variables. The default is -1 (autosized) in version 5.7 and 10000 in version 5.6.

For example, if you want to have historical data for statements, you need to enable the consumers events_statements_history and events_statements_history_long. If you want to limit the number of queries stored, set the variables performance_schema_events_statements_history_size and performance_schema_events_statements_history_long_size.
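
As a concrete example, enabling both statement history consumers at runtime is a single update (to make it permanent, add the matching performance-schema-consumer-* options to my.cnf):

UPDATE performance_schema.setup_consumers
   SET ENABLED = 'YES'
 WHERE NAME IN ('events_statements_history', 'events_statements_history_long');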

Q: Are there any guidelines regarding how much memory we should set aside for every X counters/statistics being enabled?

A: No, there is no such guideline I am aware of. But you can use definitions of tables in Performance Schema to calculate the approximate value of how much memory one row could occupy, and make predictions from it. You can also use memory instrumentation in Performance Schema and watch for changes of memory usage under load, adjusting as needed.
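
As a starting point, Performance Schema's own memory consumption is always instrumented in 5.7, so you can watch it directly and see how your settings affect it:

SELECT EVENT_NAME,
       CURRENT_NUMBER_OF_BYTES_USED/1024/1024 AS current_alloc_mb
  FROM performance_schema.memory_summary_global_by_event_name
 WHERE EVENT_NAME LIKE 'memory/performance_schema/%'
 ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC
 LIMIT 10;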

Q: How has the performance cost of performance schema changed from 5.6 to 5.7?

A: The worst situation for the performance cost of Performance Schema was in version 5.5. It was discussed in numerous places, but I recommend you read this 2010 post from Dimitri Kravtchuk, a MySQL Performance Architect at Oracle. The result of his post was a huge performance improvement in Performance Schema, reported in this post from 2011. There were more improvements, discussed in 2012. Since then, Performance Schema does not add significant performance overhead unless you enable waits instrumentation.

Version 5.7 added new instrumentation, and as my tests showed this instrumentation did not add any noticeable performance impact.

To summarize: version 5.6 made huge improvements to the performance of Performance Schema, and the new features in version 5.7 did not add any performance regressions.

Q: Will performance schema eat up my disk space? How long will it store all these logs and cause any issues?

A: Performance Schema does not store anything on disk, but uses memory. It stores data until it reaches the size of the consumer tables. When it reaches the maximum size, it removes the oldest data and replaces it with the newest statistics. Read the Performance Schema startup configuration guide on how to limit the maximum size of consumer tables.
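
To see what limits the autosizing actually picked on a running server (and what you would override in my.cnf to cap memory), check the corresponding variables, for example for the statement history tables:

-- -1 in the configuration means autosized; the effective value is computed at startup
SHOW GLOBAL VARIABLES LIKE 'performance_schema_events_statements_history%';
-- to cap the long history, set e.g. in my.cnf:
--   [mysqld]
--   performance_schema_events_statements_history_long_size = 1000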

Thanks for attending this webinar on Performance Schema for MySQL Troubleshooting. You can find the slides and a recording here.

The post Performance Schema for MySQL Troubleshooting Webinar: Q & A appeared first on Percona Database Performance Blog.

Percona Monitoring and Management 1.9.0 Is Now Available

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL® and MongoDB® performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL® and MongoDB® servers to ensure that your data works as efficiently as possible.

There are a number of significant updates in Percona Monitoring and Management 1.9.0 that we hope you will like, some of the key highlights include:

  • Faster loading of the index page: We have enabled performance optimizations using gzip and HTTP2.
  • AWS improvements: We have added metrics from CloudWatch RDS to 6 dashboards, as well as changed our AWS add instance workflow, and made some changes to credentials handling.
  • Percona Snapshot Server: If you are a Percona customer you can now securely share your dashboards with Percona Engineers.
  • Exporting Percona Monitoring and Management Server logs: Retrieve logs from PMM Server for troubleshooting with a single button click, avoiding the need to log in manually to the docker container.
  • Low RAM support: We have reduced the memory requirement so PMM Server will run on systems with 512MB of RAM.
  • Dashboard improvements: We have changed MongoDB instance identification for MongoDB graphs, and set a maximum graph Y-axis on the Prometheus Exporter Status dashboard.

AWS Improvements

CloudWatch RDS metrics

Since we are already consuming Amazon Cloudwatch metrics and persisting them in Prometheus, we have improved six node-specific dashboards to now display Amazon RDS node-level metrics:

  • Cross_Server (Network Traffic)
  • Disk Performance (Disk Latency)
  • Home Dashboard (Network IO)
  • MySQL Overview (Disk Latency, Network traffic)
  • Summary Dashboard (Network Traffic)
  • System Overview (Network Traffic)
AWS Add Instance changes

We have changed our AWS add instance interface and workflow to be clearer about the information needed to add an Amazon Aurora MySQL or Amazon RDS MySQL instance. We have also provided some clarity on how to locate your AWS credentials.

AWS Settings

We have improved our documentation to highlight connectivity best practices, and authentication options – IAM Role or IAM User Access Key.

Enabling Enhanced Monitoring

Credentials Screen

Low RAM Support

You can now run Percona Monitoring and Management Server on instances with memory as low as 512MB RAM, which means you can deploy to the free tier of many cloud providers if you want to experiment with PMM. Our memory calculation is now:

METRICS_MEMORY_MULTIPLIED=$(( (${MEMORY_AVAIABLE} - 256*1024*1024) / 100 * 40 ))
if [[ $METRICS_MEMORY_MULTIPLIED < $((128*1024*1024)) ]]; then
    METRICS_MEMORY_MULTIPLIED=$((128*1024*1024))
fi

Percona Snapshot Server

Snapshots are a way of sharing PMM dashboards via a link with individuals who do not normally have access to your PMM Server. If you are a Percona customer you can now securely share your dashboards with Percona Engineers. We have replaced the button that shared to the publicly hosted Grafana platform with one administered by Percona. Your dashboard will be written to Percona snapshots, and only Percona Engineers will be able to retrieve the data. We will expire old snapshots automatically at 90 days, but when sharing you will have the option to configure a shorter retention period.

Export of PMM Server Logs

In this release, the logs from PMM Server can be exported with a single button click, avoiding the need to log in manually to the docker container. This simplifies the troubleshooting process of a PMM Server, and especially for Percona customers, this feature provides a more consistent data-gathering task that you can perform on behalf of requests from Percona Engineers.

Faster Loading of the Index Page

In Percona Monitoring and Management version 1.8.0, the index page was redesigned to reveal more useful information about the performance of your hosts as well as immediate access to essential components of PMM; however, the index page had to load much of its data dynamically, resulting in a noticeably longer load time. In this release we enabled gzip and HTTP2 to improve the load time of the index page. The following screenshots demonstrate the results of our tests on webpagetest.org, where we reduced page load time by half. We will continue to look for opportunities to improve the performance of the index page, and expect that when we upgrade to Prometheus 2 we will see another improvement.

The load time of the index page of PMM version 1.8.0

The load time of the index page of PMM version 1.9.0

Issues in this release

New Features
  • PMM-781: Plot new PXC 5.7.17, 5.7.18 status variables on new graphs for PXC Galera, PXC Overview dashboards
  • PMM-1274: Export PMM Server logs as zip file to the browser
  • PMM-2058: Percona Snapshot Server
Improvements
  • PMM-1587: Use mongodb_up variable for the MongoDB Overview dashboard to identify if a host is MongoDB.
  • PMM-1788: AWS Credentials form changes
  • PMM-1823: AWS Install wizard improvements
  • PMM-2010: System dashboards update to be compatible with RDS nodes
  • PMM-2118: Update grafana config for metric series that will not go above 1.0
  • PMM-2215: PMM Web speed improvements
  • PMM-2216: PMM can now be started on systems without memory limit capabilities in the kernel
  • PMM-2217: PMM Server can now run in Docker with 512 Mb memory
  • PMM-2252: Better handling of variables in the navigation menu
Bug fixes
  • PMM-605: pt-mysql-summary requires additional configuration
  • PMM-941: ParseSocketFromNetstat finds an incorrect socket
  • PMM-948: Wrong load reported by QAN due to mis-alignment of time intervals
  • PMM-1486: MySQL passwords containing the dollar sign ($) were not processed properly.
  • PMM-1905: In QAN, the Explain command could fail in some cases.
  • PMM-2090: Minor formatting issues in QAN
  • PMM-2214: Setting Send real query examples for Query Analytic OFF still shows the real query in example.
  • PMM-2221: no Rate of Scrapes for MySQL & MySQL Errors
  • PMM-2224: Exporter CPU Usage glitches
  • PMM-2227: Auto Refresh for dashboards
  • PMM-2243: Long host names in Grafana dashboards are not displayed correctly
  • PMM-2257: PXC/galera cluster overview Flow control paused time has a percentage glitch
  • PMM-2282: No data is displayed on dashboards for OVA images
  • PMM-2296: The mysql:metrics service will not start on Ubuntu LTS 16.04

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system.

The post Percona Monitoring and Management 1.9.0 Is Now Available appeared first on Percona Database Performance Blog.

Array Ranges in MySQL JSON

Pretend you have a JSON array of data that looks roughly like the following.
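
(To follow along, all you need is a table with a single JSON column matching the names used below; this definition is a minimal stand-in, not necessarily the one behind the original examples.)

mysql> create table x (y JSON);
Query OK, 0 rows affected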

mysql> insert into x(y) values('["a","b","c","d"]');
Query OK, 1 row affected (0.10 sec)


You could get all the values from that array using $[*]

mysql> select y->"$[*]" from x;
+----------------------+
| y->"$[*]" |
+----------------------+
| ["a", "b", "c", "d"] |
+----------------------+
1 row in set (0.00 sec)
Or fetch an individual member of the array with an index that starts at zero:

mysql> select y->"$[0]" from x;
+-----------+
| y->"$[0]" |
+-----------+
| "a" |
+-----------+
1 row in set (0.00 sec)

But what about the times you want the last item in the array and really do not want to loop through all the items? How about using 'last'?

mysql> select y->"$[last]" as 'last' from x;
+------+
| last |
+------+
| "d" |
+------+
1 row in set (0.00 sec)

Holey Moley! What is this? Well, Roland Bouman, who was my predecessor on the MySQL Certification Team many years ago, still contributes to MySQL on a regular basis. He put in a feature request for a JSON range operator.
So please thank Roland when you see him!!
Besides 'last' there is 'to' too!
mysql> select y->'$[last - 3 to last - 1]' as 'last three' from x;
+-----------------+
| last three |
+-----------------+
| ["a", "b", "c"] |
+-----------------+
1 row in set (0.01 sec)
You can also use the ordinal number, here $[1], to strip off the first value of the array from the rest.

mysql> select y->'$[1 to last]' as 'strip first' from x;
+-----------------+
| strip first |
+-----------------+
| ["b", "c", "d"] |
+-----------------+
1 row in set (0.00 sec)

This makes juggling array data in JSON columns much easier.

List of Conferences & Events w/ MySQL, April - June 2018!

We are happy to announce a list of events which we - MySQL &/or the MySQL Community Team - are attending and speaking at during the period April to June 2018. Please be aware that the list is not final; over time more events could be added or some of them removed.

April 2018:

  • PHP Yorkshire, Yorkshire, UK, April 13-Workshop, April 14-Conference, 2018
    • Same as last year we are sponsoring and attending this conference as Silver sponsor. You can find us at MySQL booth in expo area.
  • Darkmira PHP Tour, Brasilia/DF, April 14-15, 2018
    • Do not miss the keynote given by Sheeri K. Cabral, a well-known leader in the MySQL Community. Sheeri will be talking about "On being a Senior Engineer" and her talk is scheduled for April 14 @2:00-2:45 pm. You can then find Sheeri at our MySQL booth in the expo area to discuss MySQL related topics.
  • Longhorn PHP, Austin, TX, US, April 19-21, 2018
    • We are going to attend and speak at this pretty new show replacing Lonestar PHP and organized by Austin PHP Meetup group. Please do not miss the talk given by David Stokes, the MySQL Community Manager on "MySQL Without The SQL -- Oh My!", scheduled for Friday, April 20 @2:00-2:50pm. You can also find Dave at MySQL booth in the expo area.
  • PyCon NOVE IT, Florence, Italy, April 19-22, 2018
    • As a tradition we are going to attend the PyCon IT. This year Marco Carlessi, the MySQL Principal Sales Consultant is going to have a talk on MySQL 8 ("MySQL 8: un database SQL/NoSQL semplice da usare"), scheduled for Friday, April 20 @12:30pm.
  • LOADays, Antwerp, Belgium, April 21-22, 2018
    • This show is still in process, since we are waiting for a confirmed speaker. Please stay tuned or watch the website for more details.
  • COLLABORATE18, Las Vegas, US, April 22-26, 2018
    • Our participation at this conference will be confirmed soon.
  • Percona Live, Santa Clara, CA, US, April 23-25, 2018
    • MySQL is again attending and participating in the Percona Live conference. This time there are multiple Oracle MySQL talks, as well as talks given by MySQL community members, partners & customers. See the full list of MySQL sessions here.
  • Great Indian Developer Summit (GIDS), Bangalore, India, April 24-27, 2018
    • MySQL, same as other groups, is part of the central Oracle booth at GIDS. Oracle is a Platinum sponsor this year, and you will be able to find our staff at the booth or after the several Oracle talks. For MySQL, the speaker is Sanjay Manwani, the MySQL Senior Director responsible for all MySQL development in India. Sanjay will be speaking on "MySQL 8.0: The best gets even better" as part of the GIDS.DATA day, scheduled for April 26, 2018.
  • Ruby Conference, Taipei, Taiwan, April 27-28, 2018
    • We are a Silver sponsor of this conference. This time unfortunately without a booth but with very active staff on site. Our colleague Ivan Tu, the Principal Sales Consultant for APAC will be representing MySQL.
  • LinuxFest NorthWest, Bellingham, US, April 28-29, 2018
    • MySQL is a Gold sponsor of this conference. You can find our colleague David Stokes, the MySQL Community Manager at our MySQL booth in expo area as well as attend any of his approved MySQL talks as follows:
      • "Database Programming Basics or Why Database Queries Stink!!!" scheduled for April 28 @3:30-4:15pm.
      • "Presentation Skills for Open Source Folks" scheduled for April 29 @12:30-1:15pm.

May 2018:

  • RootConf, Bangalore, India, May 10-11, 2018
    • We are going to be a Bronze sponsor of RootConf this year. You can find our MySQL staff at the booth in the expo area, as well as listen to the MySQL talk given by Bhupinder Singh on "Database monitoring for performance and availability". Please watch the schedule for further updates.
  • OpenSource Conference, Nagoya, Japan, May 19, 2018
    • As a tradition you can find us as Gold sponsor at the OSC event, this time in Nagoya. Find our MySQL talk in the schedule on the website and our staff in the exhibition area at the MySQL booth.
  • OpenSource Day, Warszawa, Poland, May 23, 2018
    • This year our colleagues from the Linux team joined MySQL, and together we will be presenting our products & technologies at OS Day in Poland. Please come to our booth to talk about Linux and/or MySQL topics, and do not miss the MySQL & Linux talks, which have not yet been announced. Please keep watching the organizers' website for further updates!
  • OpenSUSE, Prague, Czech republic, May 25-27, 2018
    • MySQL is going to be Silver sponsor of this SUSE annual conference. We are going to have a small MySQL booth and hopefully also a talk accepted. Please watch organizers website for further updates on the submitted MySQL talk.
  • PHPtek18, Atlanta, US, May 31-June 1, 2018
    • The MySQL Community team is a Workshop sponsor this year. Do not miss the two accepted MySQL talks given by David Stokes, the MySQL Community Manager. They are:
      • "MySQL Without the SQL -- Oh My!", scheduled for Jun 1st @3:45-4:45pm
      • "MySQL 8 - A New Beginning", scheduled for May 31 @2:30-3:30pm.

June 2018:

  • PyCon Taiwan, Taipei, Taiwan, Jun 1-2, 2018
    • MySQL is a Bronze sponsor. We are hoping to get a MySQL talk approved by the committee. Please watch the organizers' website for further updates.
  • OpenSource Conference Hong Kong (HK OS), Hong Kong, Jun 2-3, 2018
    • We are going to be Silver sponsor of this OS conference. Please find us at MySQL booth in the expo area as well as attend our MySQL session - topic and timing will be announced later on, please watch the organizers' website for further updates on it.
  • International PHP Conference, Berlin, Germany, Jun 4-8, 2018
    • MySQL will be supporting this event at a special level, including a talk opportunity (not yet announced) and a booth. So please come talk to our staff, and watch the organizers' website for the talk announcement.
  • OpenExpo Europe, Madrid, Spain, Jun 6-7, 2018
    • The MySQL team is going to be a Bronze sponsor of this show. Please be sure to visit our booth and talk to our local staff about MySQL topics.
  • Bulgarian Oracle User Group Conference (BGOUG), Broadest, Bulgaria, Jun 8-10, 2018
    • As a tradition we are supporting and attending this User Group event organized by the BGOUG in Bulgaria. Please do not miss the MySQL talk given by Georgi Kodinov, the MySQL Senior SW Development Manager on "The new data dictionary in MySQL 8.0: what does it mean to you?".
  • Southeast Linux Fest (SELF), Charlotte, US, Jun 8-10, 2018
    • We are planning to become a Diamond sponsor of this show. You will be able to find us at the MySQL booth in the expo hall... Watch the organizers' website for further updates on the schedule in order to find the timing for MySQL talk.
  • Developer World @CEBIT, Hannover, Germany, Jun 11-15, 2018
    • MySQL is only going to attend this conference.

Pattern Matching Queries vs. Full-Text Indexes

In this blog post, we’ll compare the performance of pattern matching queries vs. full-text indexes.

In my previous blog post, I looked for a solution for how we can search only a part of an email address and how we can make queries faster where the condition is email LIKE '%n.pierre%'. I showed two possible ways that could work. Of course, they had some pros and cons as well, but they were more efficient and faster than a LIKE '%n.pierre%'.

But you could also ask why I would bother with this? Let’s add a FULLTEXT index, and everybody is happy! Are you sure about that? I’m not. Let’s investigate and test a bit. (We have some nice blog posts that explain how FULLTEXT indexes work: Post 1, Post 2, Post 3.)

Let’s see if it works in our case where we were looking for email addresses. Here is the table:

CREATE TABLE `email` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `email` varchar(120) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `idx_email` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=318465 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Add the default full-text index:

ALTER TABLE email ADD FULLTEXT KEY (email);

It took only five seconds for 320K email addresses.

Let’s run a search:

SELECT id, email FROM email where MATCH(email) AGAINST ('n.pierre' IN NATURAL LANGUAGE MODE);
+--------+--------------------------------+
| id     | email                          |
+--------+--------------------------------+
|   2940 | pierre.west@example.org        |
|  10775 | pierre.beier@example.org       |
|  24267 | schroeder.pierre@example.org   |
|  26285 | bode.pierre@example.org        |
|  27104 | pierre.franecki@example.org    |
|  31792 | pierre.jaskolski@example.com   |
|  39369 | kuphal.pierre@example.org      |
|  58625 | olson.pierre@example.org       |
|  59526 | larkin.pierre@example.net      |
|  64718 | boyle.pierre@example.com       |
|  72033 | pierre.wolf@example.net        |
|  90587 | anderson.pierre@example.org    |
| 108806 | fadel.pierre@example.org       |
| 113897 | jacobs.pierre@example.com      |
| 118579 | hudson.pierre@example.com      |
| 118798 | pierre.wuckert@example.org     |
| 118937 | green.pierre@example.net       |
| 125451 | hauck.pierre@example.net       |
| 133352 | friesen.pierre@example.net     |
| 134594 | windler.pierre@example.com     |
| 135406 | dietrich.pierre@example.org    |
| 190451 | daugherty.pierre@example.org   |
...

Immediately, we have issues with the results. It returns 43 rows, but there are only 11 rows containing the string n.pierre. Why? It is because of the . (period) character. The manual says:

The built-in FULLTEXT parser determines where words start and end by looking for certain delimiter characters; for example,   (space), , (comma), and . (period).

The parser believes that a . starts a new word, so it is going to search for pierre instead of n.pierre. That's not good news, as many email addresses contain a . character. What can we do? The manual says:

It is possible to write a plugin that replaces the built-in full-text parser. For details, see Section 28.2, “The MySQL Plugin API”. For example parser plugin source code, see the plugin/fulltext directory of a MySQL source distribution.

If you are willing to write your own plugin in C/C++, you can try that route. Until then, it is going to give us back a lot of irrelevant matches.

We can order the results by relevancy:

SELECT id, email,
       MATCH(email) AGAINST ('n.pierre' IN NATURAL LANGUAGE MODE) AS score
  FROM email
 WHERE MATCH(email) AGAINST ('n.pierre' IN NATURAL LANGUAGE MODE)
 ORDER BY 3 desc LIMIT 10;
+-------+------------------------------+-------------------+
| id    | email                        | score             |
+-------+------------------------------+-------------------+
|  2940 | pierre.west@example.org      | 14.96491813659668 |
| 10775 | pierre.beier@example.org     | 14.96491813659668 |
| 24267 | schroeder.pierre@example.org | 14.96491813659668 |
| 26285 | bode.pierre@example.org      | 14.96491813659668 |
| 27104 | pierre.franecki@example.org  | 14.96491813659668 |
| 31792 | pierre.jaskolski@example.com | 14.96491813659668 |
| 39369 | kuphal.pierre@example.org    | 14.96491813659668 |
| 58625 | olson.pierre@example.org     | 14.96491813659668 |
| 59526 | larkin.pierre@example.net    | 14.96491813659668 |
| 64718 | boyle.pierre@example.com     | 14.96491813659668 |
+-------+------------------------------+-------------------+

This does not guarantee we get back the rows that we are looking for, however. I tried changing innodb_ft_min_token_size as well, but it did not affect the results.
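
If you want to repeat that experiment, keep in mind that innodb_ft_min_token_size is read-only at runtime: it has to be set in the configuration file, and the index has to be rebuilt before the change takes effect. Roughly (the index name `email` is the default MySQL assigns here; adjust if yours differs):

-- check the current value (the InnoDB default is 3)
SHOW GLOBAL VARIABLES LIKE 'innodb_ft_min_token_size';
-- set it in my.cnf and restart:
--   [mysqld]
--   innodb_ft_min_token_size = 2
-- then rebuild the FULLTEXT index so the table is re-tokenized:
ALTER TABLE email DROP KEY `email`, ADD FULLTEXT KEY (email);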

Let’s see what happens when I search for williamson pierre. Two separate words. I know there is only one email address with these names.

SELECT id, email,
       MATCH(email) AGAINST ('williamson.pierre' IN NATURAL LANGUAGE MODE) AS score
  FROM email
 WHERE MATCH(email) AGAINST ('williamson.pierre' IN NATURAL LANGUAGE MODE)
 ORDER BY 3 desc LIMIT 50;
+--------+---------------------------------+-------------------+
| id     | email                           | score             |
+--------+---------------------------------+-------------------+
| 238396 | williamson.pierre@example.net   | 24.08820343017578 |
|   2940 | pierre.west@example.org         | 14.96491813659668 |
|  10775 | pierre.beier@example.org        | 14.96491813659668 |
|  24267 | schroeder.pierre@example.org    | 14.96491813659668 |
|  26285 | bode.pierre@example.org         | 14.96491813659668 |
|  27104 | pierre.franecki@example.org     | 14.96491813659668 |
|  31792 | pierre.jaskolski@example.com    | 14.96491813659668 |
|  39369 | kuphal.pierre@example.org       | 14.96491813659668 |
|  58625 | olson.pierre@example.org        | 14.96491813659668 |
...

The first result is the one we are looking for, but we still got back another 49 addresses. How can the application decide which email address is relevant and which is not? I am still not happy.

Are there any other options without writing our own plugin?

Can I somehow tell the parser to use n.pierre as one word? The manual says:

A phrase that is enclosed within double quote (") characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Nonword characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase".

I can use double quotes, but the parser will still split at the . character, and the results are the same. I did not find a solution short of writing your own plugin. If someone knows one, please write a comment.

With Parser Ngram

The built-in MySQL full-text parser uses delimiters between words, but we can create an Ngram-based full-text index.

mysql> alter table email ADD FULLTEXT KEY (email) WITH PARSER ngram;
Query OK, 0 rows affected (20.10 sec)
Records: 0  Duplicates: 0  Warnings: 0

Before that, I changed the ngram_token_size to 3.
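
For completeness, ngram_token_size is also a startup-only variable (its default is 2), so changing it to 3 means editing the configuration and restarting before creating the index:

-- verify the running value
SHOW GLOBAL VARIABLES LIKE 'ngram_token_size';
-- in my.cnf, before restarting:
--   [mysqld]
--   ngram_token_size = 3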

mysql> SELECT id, email,
    ->        MATCH(email) AGAINST ('n.pierre' IN NATURAL LANGUAGE MODE) AS score
    ->   FROM email
    ->  WHERE MATCH(email) AGAINST ('n.pierre' IN NATURAL LANGUAGE MODE)
    ->  ORDER BY 3 desc;
+--------+----------------------------------+--------------------+
| id     | email                            | score              |
+--------+----------------------------------+--------------------+
|  58625 | olson.pierre@example.org         | 16.56794548034668  |
|  59526 | larkin.pierre@example.net        | 16.56794548034668  |
|  90587 | anderson.pierre@example.org      | 16.56794548034668  |
| 118579 | hudson.pierre@example.com        | 16.56794548034668  |
| 118937 | green.pierre@example.net         | 16.56794548034668  |
| 133352 | friesen.pierre@example.net       | 16.56794548034668  |
| 200608 | wilkinson.pierre@example.org     | 16.56794548034668  |
| 237928 | johnson.pierre@example.org       | 16.56794548034668  |
| 238396 | williamson.pierre@example.net    | 16.56794548034668  |
| 278384 | monahan.pierre@example.net       | 16.56794548034668  |
| 306718 | rohan.pierre@example.com         | 16.56794548034668  |
| 226737 | warren.pfeffer@example.net       | 12.156486511230469 |
|  74278 | stiedemann.perry@example.net     | 11.52701187133789  |
|  75234 | bogan.perry@example.org          | 11.52701187133789  |
...
4697 rows in set (0.03 sec)

Finally, we are getting somewhere. But it gives back 4697 rows. How can the application decide which results are relevant? Should we just use the score?

Subselect?

I dropped the ngram FULLTEXT index and created a normal one, because that gives me back only 43 results instead of 4697. I thought a full-text search might be good to narrow down the results from a million to a few thousand, and then we could run a select based on that. Example:

mysql> Select e2.id, e2.email
    ->   from (SELECT id, email FROM email
    ->          where MATCH(email) AGAINST ('n.pierre' IN NATURAL LANGUAGE MODE)) as e2
    ->  where e2.email like '%n.pierre%';
+--------+-------------------------------+
| id     | email                         |
+--------+-------------------------------+
|  58625 | olson.pierre@example.org      |
|  59526 | larkin.pierre@example.net     |
|  90587 | anderson.pierre@example.org   |
| 118579 | hudson.pierre@example.com     |
| 118937 | green.pierre@example.net      |
| 133352 | friesen.pierre@example.net    |
| 200608 | wilkinson.pierre@example.org  |
| 237928 | johnson.pierre@example.org    |
| 238396 | williamson.pierre@example.net |
| 278384 | monahan.pierre@example.net    |
| 306718 | rohan.pierre@example.com      |
+--------+-------------------------------+
11 rows in set (0.00 sec)

Wow, this can work and it looks quite fast as well. BUT (there is always a but), if I run the following query (searching for ierre):

mysql> Select e2.id, e2.email
    ->   from (SELECT id, email FROM email
    ->          where MATCH(email) AGAINST ('ierre' IN NATURAL LANGUAGE MODE)) as e2
    ->  where e2.email like '%ierre%';
Empty set (0.00 sec)

It gives back nothing because the default full-text parser uses only full words! In our case, that is not very helpful. Let’s switch back to Ngram and re-run the query:

mysql> Select e2.id, e2.email
    ->   from (SELECT id, email FROM email
    ->          where MATCH(email) AGAINST ('ierre' IN NATURAL LANGUAGE MODE)) as e2
    ->  where e2.email like '%ierre%';
+--------+--------------------------------+
| id     | email                          |
+--------+--------------------------------+
|   2940 | pierre.west@example.org        |
|  10775 | pierre.beier@example.org       |
|  16958 | pierre68@example.com           |
|  24267 | schroeder.pierre@example.org   |
...
65 rows in set (0.05 sec)

+-------------------------+----------+
| Status                  | Duration |
+-------------------------+----------+
| starting                | 0.000072 |
| checking permissions    | 0.000006 |
| Opening tables          | 0.000014 |
| init                    | 0.000027 |
| System lock             | 0.000007 |
| optimizing              | 0.000006 |
| statistics              | 0.000013 |
| preparing               | 0.000006 |
| FULLTEXT initialization | 0.006384 |
| executing               | 0.000012 |
| Sending data            | 0.020735 |
| end                     | 0.000014 |
| query end               | 0.000014 |
| closing tables          | 0.000013 |
| freeing items           | 0.001383 |
| cleaning up             | 0.000024 |
+-------------------------+----------+

It gives us back 65 rows, and it takes between 0.02-0.05s because the subquery results in many rows.

With my “shorting method”:

select e.email
  from email as e
 right join email_tib as t on t.email_id = e.id
 where t.email_parts like "ierre%";
+--------------------------------+
| email                          |
+--------------------------------+
| anderson.pierre@example.org    |
| bode.pierre@example.org        |
| bode.pierre@example.org        |
| boyle.pierre@example.com       |
| bradtke.pierre@example.org     |
| bradtke.pierre@example.org     |
...
65 rows in set (0.00 sec)

mysql> show profile;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000069 |
| checking permissions | 0.000011 |
| checking permissions | 0.000003 |
| Opening tables       | 0.000020 |
| init                 | 0.000021 |
| System lock          | 0.000008 |
| optimizing           | 0.000009 |
| statistics           | 0.000070 |
| preparing            | 0.000011 |
| executing            | 0.000001 |
| Sending data         | 0.000330 |
| end                  | 0.000002 |
| query end            | 0.000007 |
| closing tables       | 0.000005 |
| freeing items        | 0.000014 |
| cleaning up          | 0.000010 |
+----------------------+----------+

It reads and gives back exactly 65 rows and it takes 0.000s.
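
For readers who skipped the previous article: email_tib is a helper table that stores the tails (suffixes) of every address keyed back to email.id, which turns the '%ierre%' search into an indexable prefix search. A rough sketch of the idea (the real schema and the code that populates it are in the previous post):

CREATE TABLE email_tib (
  email_id    int(10) unsigned NOT NULL,
  email_parts varchar(120) COLLATE utf8_unicode_ci NOT NULL,
  KEY idx_email_parts (email_parts),
  KEY idx_email_id (email_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;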

Conclusion

When it comes to pattern matching queries vs. full-text indexes, it looks like a full-text index can be helpful, and it is built in. Unfortunately, we do not have many metrics regarding full-text indexes. We cannot see how many rows were read, etc. I don't want to draw any conclusions on which one is faster. I still have to run some tests with our favorite benchmark tool, sysbench, on a much bigger dataset.

I should mention that full-text indexes and my previous solutions won’t solve all the problems. In this and my other blog I was trying to find an answer to a specific problem, but there are cases where my solutions would not work that well.

The post Pattern Matching Queries vs. Full-Text Indexes appeared first on Percona Database Performance Blog.

MySQL Security – MySQL Enterprise Audit

In order to spot database misuse and/or to prove compliance with popular regulations including GDPR, PCI DSS, HIPAA, ... database administrators can be required to record and audit database activities. In this fifth episode of the MySQL 5.7 Security series, we will see what MySQL Enterprise Audit provides to help organizations implement stronger security controls and satisfy regulatory compliance.

Revisiting memory allocators and MySQL performance

Over the last few years a lot of research has been done on choosing the most efficient memory allocation library for MySQL and its impact on MySQL performance (InnoDB kernel_mutex Contention and Memory Allocators, Impact of memory allocators on MySQL performance, TCMalloc and MySQL, MySQL performance: Impact of memory allocators (Part 2), Concurrent large allocations: glibc malloc, jemalloc and tcmalloc). The common wisdom has always been that the glibc implementation of malloc() doesn't scale, and either jemalloc or tcmalloc should be used instead. Percona provides jemalloc in their repositories and recommends enabling it for efficiency reasons, and Oracle even bundled tcmalloc in MySQL distributions until 5.7.13.

I've always considered it unfortunate. Sure, for advanced MySQL users or fellow performance engineers, installing and using an alternative allocation library is not a big deal. But for most users it is a little cumbersome to go through all those extra steps to install the library and configure MySQL to use it (and the way to do so has changed with systemd, so old instructions no longer apply to modern distributions). On top of that, neither jemalloc nor tcmalloc is bug-free, and their tuning is sometimes more an art than a science. The default allocator in glibc should be good enough for the vast majority of users, leaving alternative allocators to researchers and those users willing to squeeze the last drop of performance out of their hardware.

Still, the view that an alternative allocator is a requirement even for moderately heavy workloads had been my opinion as well, until a few days ago when I had to run some benchmarks on an Ubuntu Artful machine. I usually use tcmalloc, and from my previous experiments on older distributions there was little practical difference between jemalloc and tcmalloc, but a big difference compared to glibc.

This time around I noticed that MySQL scalability suffers due to a severe contention on a spinlock inside libtcmalloc_minimal.so.4. Searching the web suggested there have been some recent fixes of similar issues in tcmalloc. But I decided to revisit my allocator benchmarks, and to my surprise glibc came out a winner.

Performance Improvements in glibc 2.26

While looking for an explanation, I noticed that Ubuntu Artful was probably the first mainline distribution that included glibc 2.26 and a colleague pointed me to this excellent blog post describing malloc() improvements in glibc 2.26.

That was on an ARM64 machine, which is of limited interest to the general audience, so I was wondering if I could repeat that on an x86_64 machine. And yes, even though the picture is a little different on x86_64, I could repeat both tcmalloc lock contention and glibc as the fastest allocator on Ubuntu Artful running on x86_64.

Benchmarks with glibc 2.26

For my experiment, I decided to run the same benchmarks that Mark Callaghan ran in his most recent evaluation of allocator libraries. I'm not going to repeat the full benchmark configuration; the only differences from Mark's setup were:

  • InnoDB instead of MyRocks
  • MySQL 5.7.21
  • 10 sysbench tables with 1M rows each instead of 8
  • Ubuntu Artful with glibc 2.26, jemalloc 3.6 and tcmalloc 2.5 running on a 2-socket Xeon Gold 6138 machine.

Results

The results are below:

Comparing to Mark's results:

  • with 2M per-connection blocks (i.e. with sort_buffer_size=2M) glibc 2.20 was slightly slower than jemalloc and tcmalloc, and glibc 2.23 was about the same. In my results glibc 2.26 is considerably faster than both tcmalloc and jemalloc;
  • with 32M per-connection blocks glibc performance has a sharp drop at higher concurrency. This is the same in both Mark's results and mine;
  • tcmalloc 2.5 shows poor performance with 2M and especially 32M blocks in my benchmarks. More on it later.

That is, glibc 2.26 has certainly improved its scalability with small block allocations, but bigger blocks (>= 32 MB) are still problematic. In my comment to bug #88071, I explained the reasons for that and recommended that the bug reporter play with malloc() parameters to see if they have any impact on scalability.

So it was time for me to follow my own advice and play with the malloc tunable parameters. For experimental purposes, I simply ran export MALLOC_MMAP_MAX_=0 before starting MySQL to disable mmap() usage completely.
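In shell terms the experiment is just this (note that a systemd-managed mysqld will not inherit shell exports, so it would need the same Environment= override approach shown earlier):

# disable mmap() in glibc malloc entirely, then start mysqld from this shell:
export MALLOC_MMAP_MAX_=0
mysqld --user=mysql &
# a softer alternative is to raise the threshold above the tested 32 MB buffers:
# export MALLOC_MMAP_THRESHOLD_=$((64 * 1024 * 1024))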

Below are updated results with glibc and disabled mmap() ("glibc_nommap"):

The summary is that with this simple tuning glibc 2.26 leads the pack. It is faster than both jemalloc and tcmalloc with both small and large blocks.

Anticipating questions about other jemalloc/tcmalloc versions and their tuning: I know that jemalloc and tcmalloc performance can vary considerably depending on version and tuning parameters, but that wasn't my goal. I'm trying to look at it from a regular user's perspective and just use whatever is provided by the distribution. My goal was to see whether glibc 2.26, with its recent scalability improvements, is good enough as an allocation library for MySQL. In terms of performance, and based on the benchmark numbers I got, the answer is "yes, it is good enough, but some tuning may be required for buffers >= 32 MB".

What about fragmentation?

One frequent comment I hear when discussing memory allocators is that glibc has higher fragmentation than the alternative libraries, which manifests itself as a higher process RSS. That may very well be true, but not in this particular benchmark. I captured mysqld RSS as reported by pidstat(1) at the end of each run, and here are the results:

So RSS with glibc was about the same as with jemalloc, with the worst results again shown by tcmalloc.

What's wrong with tcmalloc?

There's obviously something wrong with the tcmalloc shipped with Ubuntu Artful. I have some tricks up my sleeve for tuning tcmalloc (and I will be talking about them in my Percona Live talk), but none of them worked in this case. A typical PMP stack trace looks as follows:

26 base::internal::SpinLockDelay(libtcmalloc_minimal.so.4), SpinLock::SlowLock(libtcmalloc_minimal.so.4), tc_malloc(libtcmalloc_minimal.so.4), my_raw_malloc(my_malloc.c:191), my_malloc(my_malloc.c:191), Filesort_buffer::alloc_sort_buffer(filesort_utils.cc:124), Filesort_info::alloc_sort_buffer(sql_sort.h:509), filesort(sql_sort.h:509), create_sort_index(sql_executor.cc:3664), QEP_TAB::sort_table(sql_executor.cc:3664), join_init_read_record(sql_executor.cc:2465), sub_select(sql_executor.cc:1271), do_select(sql_executor.cc:944), JOIN::exec(sql_executor.cc:944), handle_query(sql_select.cc:184), execute_sqlcom_select(sql_parse.cc:5156), mysql_execute_command(sql_parse.cc:2792), Prepared_statement::execute(sql_prepare.cc:3952), Prepared_statement::execute_loop(sql_prepare.cc:3560), mysqld_stmt_execute(sql_prepare.cc:2551), dispatch_command(sql_parse.cc:1392), do_command(sql_parse.cc:999), handle_connection(connection_handler_per_thread.cc:300), pfs_spawn_thread(pfs.cc:2190), start_thread, clone

I could probably do some further research and fix it, either by tuning or by using a different version. But again, that's not something most users would do, so let's keep these results as a warning to Ubuntu Artful users who follow the many recommendations on the Internet to use tcmalloc with MySQL: don't use the default tcmalloc in Artful, as it can actually lead to worse MySQL scalability than glibc or jemalloc.

Conclusions

It is great to see some progress with malloc() performance in glibc 2.26. It already looks good enough for most installations, and for systems with large (>= 32 MB) per-connection buffers one may want to play with MALLOC_MMAP_THRESHOLD_ and MALLOC_MMAP_MAX_.

I also hear there are further improvements coming in glibc 2.27. Hopefully, some day, when these newer versions reach other mainline distributions and LTS releases, most users will no longer need to bother with alternative allocator libraries.

Update: per requests in the comments, I ran benchmarks with sort_buffer_size=32K. Updated charts:

I also ran a benchmark with DISTINCT range queries instead of ORDER BY ones:

As seen, using DISTINCT queries instead of ORDER BY does not have much impact on glibc and jemalloc, but it magically restores tcmalloc performance to its original glory. An interesting topic for further research...

MySQL Compression Olympics

And the results are in:


  • InnoDB (no compression/original): 660 GB
  • RocksDB: 209 GB
  • TokuDB (snappy): 144 GB
  • TokuDB (LZMA): 67 GB


Benchmark performance with mysqlslap on production sample queries (8-9 very quick SELECTs + 1-2 medium SELECTs); the results are summarized in the table below.
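The exact mysqlslap options are not listed in the post; an invocation along these lines (all option values are assumptions) matches the shape of the test:

mysqlslap --defaults-file=~/.my.cnf \
  --create-schema=test \
  --query=/tmp/production_sample_queries.sql \
  --concurrency=4 --iterations=50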

Engine             |   Avg   |   Min   |   Max   |  Total
-------------------+---------+---------+---------+--------
InnoDB (original)  | 0.100 s | 0.091 s | 0.182 s | 5.101 s
TokuDB (snappy)    | 0.100 s | 0.089 s | 0.183 s | 5.106 s
RocksDB            | 0.113 s | 0.104 s | 0.164 s | 5.730 s
TokuDB (LZMA)      | 0.099 s | 0.090 s | 0.155 s | 5.037 s


Testing Platform:
    Platform | Linux
     Release | CentOS release 6.8 (Final)
      Kernel | 2.6.32-642.11.1.el6.x86_64
Architecture | CPU = 64-bit, OS = 64-bit
   Threading | NPTL 2.12
    Compiler | GNU CC version 4.4.7 20120313 (Red Hat 4.4.7-17)
     SELinux | Disabled
 Virtualized | VMWare
  Processors | physical = 2, cores = 4, virtual = 4, hyperthreading = no
      Speeds | 4x2299.998
      Models | 4xIntel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
      Caches | 4x46080 KB
      Memory | 15.6G


Observations - load during conversion:

  • TokuDB (snappy) - load 1.07, I/O throughput around 30 MB/s
  • RocksDB - load 1.09, I/O throughput around 50-70 MB/s (there seems to be a data-load round followed by a second round of compression afterwards)
  • TokuDB (LZMA) - load 3-4, I/O throughput around 7 MB/s
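For reference, the conversions themselves are plain ALTER TABLE statements; a sketch with an illustrative table name:

ALTER TABLE big_table ENGINE=TokuDB ROW_FORMAT=TOKUDB_SNAPPY;  -- TokuDB + snappy
ALTER TABLE big_table ENGINE=TokuDB ROW_FORMAT=TOKUDB_LZMA;    -- TokuDB + LZMA
ALTER TABLE big_table ENGINE=ROCKSDB;                          -- RocksDB (compression configured via rocksdb_default_cf_options)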


Leveraging ProxySQL with AWS Aurora to Improve Performance, Or How ProxySQL Out-performs Native Aurora Cluster Endpoints

In this blog post, I’ll look at how you can use ProxySQL with AWS Aurora to further leverage database performance.

My previous article described how easy it is to replace the native Aurora connector with ProxySQL. In this article, you will see WHY you should do that.

It is important to understand that aside from the basic optimization in the connectivity and connection management, ProxySQL also provides you with a new set of features that currently are not available in Aurora.

Just think:

  • Better caching
  • Query filtering
  • Sharding
  • Query substitution
  • Firewalling
  • … and more

We will cover areas like scalability, security and performance. In short, I think it is more than worth it to spend some time and give ProxySQL with AWS Aurora a try.

The tests

I will show you the results from two different kinds of tests. One is sysbench-oriented, the other simulates a more complex application using Java, data object utilization and a Hikari connection pool in the middle as well. 

For the EC2 and Aurora platform I used:

  • Application/ProxySQL T2.xlarge eu-central-1a
  • 2 Aurora MySQL 5.7.12 db.t2.medium eu-central-1a
  • 1 Aurora MySQL 5.7.12 db.t2.medium eu-central-1b for AZ redundancy

The code for the application is available here, and for sysbench tests here. All the data and configurations for the application are available here.

I ran three tests using both bench apps, obviously with Aurora as it comes and with ProxySQL. For the ProxySQL configuration see my previous article.
The tests were read_only / write_only / read_write.

For Aurora, I only increased the number of connections and otherwise kept the out-of-the-box configuration. Note that each test was run at least three times, at different moments of the day and on different days. The data reported as final is the BEST performing result for each one.

The Results

For the impatient among us, here is a summary table of the tests:

Sysbench:

Java App:


Now if this is enough for you, you can go to the conclusion and start to use ProxySQL with AWS Aurora. If you would like to know a bit more, continue reading.

Aside from any discussion on the benchmark tool and settings, I really focused on identifying the differences between the two “connectors”. Given the layer below was exactly the same, any difference is due to the simple substitution of the endpoint.

Sysbench Read Only

The first image reports the number of events achieved at the time of the test. It is quite clear that when using ProxySQL, sysbench ran more events.

In this graph, higher is better:

In this graph, lower is better:

As we can see, the latency when using an Aurora cluster entry point is higher. True, we are talking about milliseconds, but it is not just the value that matters, but also the distribution:

Latency distribution, Aurora cluster endpoint (left) vs. ProxySQL (right):

An image is worth a thousand words!

We can see that the behavior stays consistent when analyzing the READs executed, with ProxySQL performing better.

In this graph, higher is better:


In this graph, higher is better:

Closing with the number of total queries performed, in which ProxySQL surpassed the Cluster endpoint by ~ 4K queries.

Write Only

For writing, things go a bit different. We see that all lines intersect, and the values are very close one to the other. I will let the images speak for themselves:

In this graph, higher is better:

In this graph, lower is better:

Latency spiked in each ProxySQL test, and it may require additional investigation and tuning.  

In this graph, higher is better:

While the rates of writes/sec intersect with each other frequently, in the end ProxySQL resulted in more writes than the native endpoint.

In this graph, higher is better:

In the end, a difference exists and is consistent across the different test iterations, but it is minimal. We are talking about a range of 25-50 entries in total. This result is not surprising, and it will become clear why later in the article.

Read and Write

As expected in the read and write test, we see a different situation. ProxySQL is still performing better than the default entry point, but not by such a big margin as in read-only tests.

In this graph, higher is better:

In this graph, lower is better:

Latency and events follow the expected trend, where read operations are executed more efficiently with ProxySQL, and writes are close to, but NOT the same as, the write-only test.

As a result, the number of queries in ProxySQL is approximately 13% better than the default entry point.

Java Application Tests

What about the Java application? First of all, we need to remember that the application used a connection pool mechanism (HikariCP), and the connection pool was present in all cases (for both the Aurora cluster endpoint and ProxySQL). A small delay in establishing the first connection was expected, and you can easily see it in the MAX value of the connection latency.

In this graph, lower is better.

The connection latency reported here is expressed in nanoseconds and measures the time taken by the connection provider to return an active connection to the application from the moment the application requested it. In other words, it is how long HikariCP takes to choose/check/return an open connection. As you can see, the MAX value is drastically higher, and this was expected since it reflects connection initialization. While not really interesting in terms of performance, this value is interesting because it gives us the dimension of the cost in the connection pool of opening a new connection, which in the worst case is 25 milliseconds.

As the graphs show, ProxySQL manages both cases (first call, reassignment) more efficiently.

In this graph, higher is better.

In the CRUD summary table, we can see the number of SQL commands executed per second for each CRUD action and for each test. Once more, we can see that when using ProxySQL, the application performed much better and executed significantly more operations (especially in the R/W test).

In this graph, higher is better.

This graph represents the total number of events run at the time of the test. An event is a full application cycle: the application generates the data needed to fill the SQL statement (no matter whether it is for a read or a write), creates the SQL, requests a connection, pushes the SQL, gets and reads the result set, and gives back the connection.

Once more, ProxySQL shows better performance.

In this graph, lower is better.

The execution time reported in this graph is the time taken by the application to run a whole event, i.e., a full cycle. The faster the cycle is executed, the better the application is performing. The time is expressed in milliseconds, and it ranges from a very fast read, which probably hits the cache in Aurora, to almost two seconds for inserting a batch of rows.

Needless to say, the tests using ProxySQL performed better.

But Why?

Why do the tests using ProxySQL perform better? After all, it is just an additional step in the middle, which also has a cost in intercepting the queries and managing the connections. So why the better performance?

The answer is simple and can be found in the Aurora manual: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.html#Aurora.Overview.Endpoints.

The Cluster endpoint is an endpoint for an Aurora DB cluster that connects to the current primary instance for that DB cluster. Each Aurora DB cluster has a cluster endpoint and one primary instance.

That endpoint receives both read and write requests and sends them to the same instance. Its main purpose is to support failover if needed.

At the same time, the Reader endpoint is an endpoint for an Aurora DB cluster that connects to one of the available Aurora Replicas for that DB cluster. Each Aurora DB cluster has a reader endpoint. If there is more than one Aurora Replica, the reader endpoint directs each connection request to one of the Aurora Replicas.

The reader endpoint only load balances connections to available Aurora Replicas in an Aurora DB cluster. It does not load balance specific queries. If you want to load balance queries to distribute the read workload for a DB cluster, you need to manage that in your application and use instance endpoints to connect directly to Aurora Replicas to balance the load.

This means that to perform a read/write split, your application must manage two entry points, and you will NOT have much control over how the queries are handled or which replica instance they are directed to. This can lead to unexpected results and delays.

Needless to say, ProxySQL does all that by default (as described in my previous article).

Now that we’ve clarified how Aurora entry points behave, let’s see about the performance difference.

How do we read this graph? From left to right:

  • read_only test with an Aurora cluster endpoint
  • read_only test with ProxySQL
  • write_only with an Aurora cluster endpoint
  • write_only with ProxySQL
  • read and write with an Aurora cluster endpoint
  • read and write with ProxySQL

Here we go! As we can see, the tests with ProxySQL used the two configured instances, splitting R/W without the need to do anything on the application side. I purposely avoided the AZ replica because I previously identified it as having higher latency, so I can exclude it and use it ONLY in the case of an emergency.

The effects are clear in the next graph.

When using the cluster endpoint, given all the load was on a single instance, the CPU utilization is higher and that became a bottleneck. When using ProxySQL, the load is spread across the different instances, allowing real read scalability. This has immediate benefits in read and read/write operations, allowing better load distribution that results in better performance.

Conclusions

Aurora is a very interesting technology and can be a very good solution for read scaling. But at the moment, the way AWS offers data connectivity with the Cluster endpoints and Reader endpoints can negatively affect performance.

The lack of configuration options and the limitation of having to use different endpoints lead to confusion and less-than-optimal utilization.

The introduction of ProxySQL, which now supports (from version 2) Aurora, allows an architect, SA or DBA to properly configure the environment. You can very granularly choose how to use each instance, without the need to have the application modify how it works. This helps keep the data layer solution separate from the application layer.

Even better, this additional flexibility does not come at a cost. On the contrary, it improves resource utilization and brings higher performance using less powerful instances. Given the cost of Aurora, this is not a secondary benefit.

I suggest you try installing ProxySQL v2 (or higher) in front of your Aurora cluster. If you don’t feel confident and prefer to have us help you, contact us and we will be more than happy to support you!

The post Leveraging ProxySQL with AWS Aurora to Improve Performance, Or How ProxySQL Out-performs Native Aurora Cluster Endpoints appeared first on Percona Database Performance Blog.

How to Implement ProxySQL with AWS Aurora

In this post, we’ll look at how to implement ProxySQL with AWS Aurora.

Recently, there have been a few discussions and customer requests that focused on AWS Aurora and how to make the various architectures and solutions more flexible.

Flexible how, you may ask? Well, there are the usual expectations:

  • How do you improve resource utilization?
  • How can I filter (or block) things?
  • Can I shard with Aurora?
  • What is the best way to implement query caching?
  • … and more.

The inclusion of ProxySQL solves many of the points above. We in Consulting design solutions for our customers by applying the different functionalities to better match the customers' needs. Whenever we dealt with Aurora, however, we had to exclude ProxySQL because of some limitations in the software.

Now, however, ProxySQL 2.0 supports Aurora, and it does it amazingly well.

This article shows you how to implement ProxySQL with AWS Aurora. The next article, Leveraging ProxySQL with AWS Aurora to Improve Performance, will show you WHY.

The Problem

ProxySQL has two different ways to deal with backend servers. One is using replication mechanisms, like standard Async replication and Group Replication. The other is to use the scheduler, as in the case of Percona XtraDB Cluster, MariaDB Cluster, etc.

While we could use the scheduler as a solution for Aurora, it is not as immediate and well-integrated as the embedded support for replication, which is why we normally opted not to use it in this specific case (Aurora).

But what WAS the problem with Aurora? An Aurora cluster bases its definition of Writer vs. Readers on the innodb_read_only variable. So, where is the problem? Well, actually, no problem at all, except that ProxySQL up to version 2 only supported the generic read_only variable for replication. As such, it was not able to correctly identify the Writer/Readers set.
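You can observe the distinction directly by connecting to the instance endpoints:

-- On Aurora, the writer reports 0 and every replica reports 1:
SELECT @@global.innodb_read_only;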

The Solution

In October 2017, this issue was opened (https://github.com/sysown/proxysql/issues/1195) and the result was, as usual, a quite simple and flexible solution.

Brainstorming, a possible solution could be to add another column in mysql_replication_hostgroups to specify what needs to be checked, either read_only or innodb_read_only, or even super_read_only

This led to the ProxySQL team delivering (commit fe2f16d6df15252f0107a6a224dad7b1efdb13f6):

Added support for innodb_read_only and super_read_only  

MYHGM_MYSQL_REPLICATION_HOSTGROUPS "CREATE TABLE mysql_replication_hostgroups (
    writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY,
    reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND reader_hostgroup>=0),
    check_type VARCHAR CHECK (LOWER(check_type) IN ('read_only','innodb_read_only','super_read_only')) NOT NULL DEFAULT 'read_only',
    comment VARCHAR NOT NULL DEFAULT '',
    UNIQUE (reader_hostgroup))"

In short, they added a new column to the mysql_replication_hostgroups table. ProxySQL continues to behave exactly the same and manages the servers and the replication groups as usual. No need for scripts or other crazy stuff.

Implementation

Here we are, the HOW TO part. The first thing to keep in mind is that when you implement a new Aurora cluster, you should always consider having at least two instances in the same AZ and another instance in a remote AZ.

To implement ProxySQL, you should refer directly to the instances, NOT to the cluster entry point. To be clear, you must obtain the endpoint of each instance:

The information is available in the web-admin interface, under the instance or using the command:

aws rds describe-db-instances

And filter the result for:

"Endpoint": { "Port": 3306, "Address": "proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com" },

To run ProxySQL with RDS in general, you need to install it on an intermediate server or on the application box.

Once you decide which one fits your setup better, you must download or git clone ProxySQL v2.0+.

DO NOT use v1.4.x, as it does not contain these new features and will not work as expected.

Once you have all the Aurora instances up, it is time to configure ProxySQL. Below is an example of all the commands used during the installation:

GRANT USAGE, REPLICATION CLIENT ON *.* TO monitor@'%' IDENTIFIED BY 'monitor';

DELETE FROM mysql_servers WHERE hostgroup_id IN (70,71);
DELETE FROM mysql_replication_hostgroups WHERE writer_hostgroup=70;
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',70,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb2.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('proxysqltestdb-eu-central-1b.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com',71,3306,1,2000);

INSERT INTO mysql_replication_hostgroups (writer_hostgroup,reader_hostgroup,comment,check_type) VALUES (70,71,'aws-aurora','innodb_read_only');
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;

DELETE FROM mysql_query_rules WHERE rule_id IN (50,51,52);
INSERT INTO mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) VALUES (50,6033,'m8_test',70,0,3,'.',1);
INSERT INTO mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) VALUES (51,6033,'m8_test',70,1,3,'^SELECT.*FOR UPDATE',1);
INSERT INTO mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) VALUES (52,6033,'m8_test',71,1,3,'^SELECT.*$',1);
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;

DELETE FROM mysql_users WHERE username='m8_test';
INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) VALUES ('m8_test','test',1,70,'mysql',1);
LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;

UPDATE global_variables SET variable_value="67108864" WHERE variable_name='mysql-max_allowed_packet';
UPDATE global_variables SET variable_value=0 WHERE variable_name='mysql-hostgroup_manager_verbose';
LOAD MYSQL VARIABLES TO RUNTIME; SAVE MYSQL VARIABLES TO DISK;

The above will give you a ready-to-go ProxySQL setup that supports Aurora cluster, performing all the usual operations ProxySQL does, including proper W/R split and more for a user named ‘m8_test’.

The key is in passing the value ‘innodb_read_only’ for the column check_type in the table mysql_replication_hostgroups.  

To check the status of your ProxySQL, you can use this command (which gives you a snapshot of what is going to happen):

watch -n 1 'mysql --defaults-file=~/.my.cnf -h 127.0.0.1 -P 6032 -t -e "select b.weight, c.* from stats_mysql_connection_pool c left JOIN runtime_mysql_servers b ON c.hostgroup=b.hostgroup_id and c.srv_host=b.hostname and c.srv_port = b.port where hostgroup in (50,52,70,71) order by hostgroup,srv_host desc;" -e "select srv_host,command,avg(time_ms), count(ThreadID) from stats_mysql_processlist group by srv_host,command;" -e "select * from stats_mysql_users;"; mysql --defaults-file=~/.my.cnf -h 127.0.0.1 -P 6032 -t -e "select * from stats_mysql_global" | egrep -i "(mirror|memory|stmt|processor)"'

+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host                                                                 | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 1000   | 70        | proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com               | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 5491       |
| 1000   | 71        | proxysqltestdb2.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com              | 3306     | ONLINE | 0        | 5        | 5      | 0       | 5           | 73      | 0                 | 5483            | 28442           | 881        |
| 1000   | 71        | proxysqltestdb.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com               | 3306     | ONLINE | 0        | 5        | 5      | 0       | 5           | 82      | 0                 | 6203            | 32217           | 5491       |
| 1      | 71        | proxysqltestdb-eu-central-1b.c7wzm8xxmrze.eu-central-1.rds.amazonaws.com | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 1593       |
+--------+-----------+--------------------------------------------------------------------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
+----------+----------------------+--------------------------+
| username | frontend_connections | frontend_max_connections |
+----------+----------------------+--------------------------+
| m8_test  | 0                    | 10000                    |
+----------+----------------------+--------------------------+
| Query_Processor_time_nsec  | 0       |
| Com_backend_stmt_prepare   | 0       |
| Com_backend_stmt_execute   | 0       |
| Com_backend_stmt_close     | 0       |
| Com_frontend_stmt_prepare  | 0       |
| Com_frontend_stmt_execute  | 0       |
| Com_frontend_stmt_close    | 0       |
| Mirror_concurrency         | 0       |
| Mirror_queue_length        | 0       |
| SQLite3_memory_bytes       | 2652288 |
| ConnPool_memory_bytes      | 712720  |
| Stmt_Client_Active_Total   | 0       |
| Stmt_Client_Active_Unique  | 0       |
| Stmt_Server_Active_Total   | 0       |
| Stmt_Server_Active_Unique  | 0       |
| Stmt_Max_Stmt_id           | 1       |
| Stmt_Cached                | 0       |
| Query_Cache_Memory_bytes   | 0       |

At this point, you can connect your application and see how ProxySQL allows you to perform much better than the native cluster entry point.

This will be expanded in the next article, Leveraging ProxySQL with AWS Aurora to Improve Performance.

Conclusions

I had my first issue with the native Aurora connector a long time ago, but I had nothing to replace it. ProxySQL is a very good alternative to standard cluster access, with more options/controls and it also allows us to perform close-to-application caching, which is much more efficient than the remote MySQL one (http://www.proxysql.com/blog/scaling-with-proxysql-query-cache).

In the next article I will illustrate how, in a simple setup, ProxySQL can help in achieving better results than using the default Aurora cluster endpoint.

The post How to Implement ProxySQL with AWS Aurora appeared first on Percona Database Performance Blog.

Partial update of JSON values

MySQL 8.0 introduces partial update of JSON values, which is a nice performance improvement for applications that frequently update small portions of large JSON documents. Before, in MySQL 5.7, whenever you made a change to a stored JSON document, the full new JSON document would be written to the database, even if the update just changed a few bytes in the document.…
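To give an idea of what qualifies: updates expressed through JSON_SET(), JSON_REPLACE() or JSON_REMOVE() applied to the column itself are the candidates for partial updates. A sketch with made-up table and column names:

UPDATE products
   SET attrs = JSON_SET(attrs, '$.price', 12.99)
 WHERE id = 42;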

MySQL InnoDB Cluster & Group Replication: how to upgrade safely your cluster

Recently on the MySQL Forums, somebody was looking for documentation or a procedure to upgrade a MySQL InnoDB Cluster (or Group Replication cluster) to a newer version.

In this post I am illustrating the best practices to achieve this task safely.

To illustrate the procedure, I will use an InnoDB Cluster of 3 members: mysql1, mysql2 and mysql3. The cluster is setup in Single-Primary mode (mysql1) and runs MySQL 5.7.21.

Let’s have a look at the cluster status:

MySQL / mysql1:3306 / JS / cluster.status()
{
    "clusterName": "MyCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysql1:3306",
        "ssl": "DISABLED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "mysql1:3306": {
                "address": "mysql1:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "mysql2:3306": {
                "address": "mysql2:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "mysql3:3306": {
                "address": "mysql3:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    },
    "groupInformationSourceMember": "mysql://clusteradmin@mysql1:3306"
}

Before upgrading a MySQL InnoDB Cluster, there is one important rule to remember: always write on the lowest version.

The procedure for such an upgrade is the following (a condensed command sketch follows the list):

  1. be sure you have saved the cluster (group replication) configuration to disk (in 8.0 there is no need to use ConfigureLocalInstance(), as you can also configure remote instances)
  2. run the utility to verify if the upgrade is possible (check this post)
  3. stop one of the secondary nodes safely (mysql2 or mysql3 in our example). This means innodb_fast_shutdown must be set to 0 and mysqld stopped properly
  4. upgrade the binaries
  5. edit the configuration to disable group replication at boot and to remove any configuration settings that no longer exist in the new version
  6. start mysqld
  7. run mysql_upgrade
  8. enable group replication at boot again (or, if you prefer, after the mandatory restart due to mysql_upgrade you can use cluster.rejoinInstance() from MySQL Shell)
  9. start mysqld
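Condensed into commands, steps 3 to 7 on one secondary might look as follows (assuming an RPM-based host with a systemd service named mysqld; adapt paths and package names to your platform):

mysql -e "SET GLOBAL innodb_fast_shutdown = 0;"
systemctl stop mysqld
yum update mysql-community-server        # upgrade the binaries
# in my.cnf, temporarily set: group_replication_start_on_boot = OFF
systemctl start mysqld
mysql_upgrade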

That’s it !

Now you need to repeat all these steps for each node, finishing with the Primary-Master.

I’ve illustrated all this in the video below:

I hope this will help you if you are planning to upgrade your MySQL InnoDB Cluster to a newer version.

Enjoy MySQL HA made easy !

 

New Webinar: How to Measure Database Availability

Join us on April 24th for Part 2 of our database high availability webinar special!

In this session we will focus on how to measure database availability. It is notoriously hard to measure and report on, although it is an important KPI in any SLA between you and your customer. With that in mind, we will discuss the different factors that affect database availability and see how you can measure your database availability in a realistic way.

It is common enough to define availability in terms of 9s (e.g. 99.9% or 99.999%) - especially here at Severalnines - although there are often different opinions as to what these numbers actually mean, or how they are measured.

Is the database available if an instance is up and running, but it is unable to serve any requests? Or if response times are excessively long, so that users consider the service unusable? Is the impact of one longer outage the same as multiple shorter outages? How do partial outages affect database availability, where some users are unable to use the service while others are completely unaffected?

Not agreeing on precise definitions with your customers might lead to dissatisfaction. The database team might be reporting that they have met their availability goals, while the customer is dissatisfied with the service.


Register for the webinar

Date, Time & Registration Europe/MEA/APAC

Tuesday, April 24th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)

Register Now

North America/LatAm

Tuesday, April 24th at 09:00 PDT (US) / 12:00 EDT (US)

Register Now

Agenda
  • Defining availability targets
    • Critical business functions
    • Customer needs
    • Duration and frequency of downtime
    • Planned vs unplanned downtime
    • SLA
  • Measuring the database availability
    • Failover/Switchover time
    • Recovery time
    • Upgrade time
    • Queries latency
    • Restoration time from backup
    • Service outage time
  • Instrumentation and tools to measure database availability:
    • Free & open-source tools
    • CC's Operational Report
    • Paid tools

Register for the webinar

Speaker

Bartlomiej Oles is a MySQL and Oracle DBA, with over 15 years experience in managing highly available production systems at IBM, Nordea Bank, Acxiom, Lufthansa, and other Fortune 500 companies. In the past five years, his focus has been on building and applying automation tools to manage multi-datacenter database environments.


Migrate to Amazon RDS Using Percona Xtrabackup

In this blog post, we’ll look at how to migrate to Amazon RDS using Percona XtraBackup.

Until recently, there was only one way to migrate your data from an existing MySQL instance into a new RDS MySQL instance: take and restore a logical backup with mysqldump or mydumper. This can be slow and error-prone. When Amazon introduced Amazon Aurora MySQL, you could use Percona XtraBackup to take an online physical backup of your database and restore that into a new Aurora instance. This feature is now available for RDS MySQL as well. Using Percona XtraBackup instead of a logical backup can save a lot of time, especially with a large dataset.

There are many caveats and limitations listed in Amazon’s documentation, but the most important ones are:
  • Source and destination databases must be MySQL 5.6. Earlier and later major versions are not supported at this time.
  • You can’t restore into an existing RDS instance using this method.
  • The total data size is limited to 6 TB.
  • User accounts, functions, and stored procedures are not imported automatically.
  • You can’t choose which databases and tables to migrate this way — migrate the whole instance. (You can’t use Percona Xtrabackup’s partial backup feature when migrating to RDS.)

If those limitations don’t apply to your use case, read on to learn how to migrate to Amazon RDS using Percona XtraBackup and restoring it into RDS.

Demonstration

For this demonstration, I created a Percona Server for MySQL 5.6 instance on EC2 with the sakila sample database and an extra InnoDB table. I filled the table with junk data to make the total data size about 13.5 GB. Then I installed the latest percona-xtrabackup-24  (2.3 would also have worked) and the AWS CLI tools. I took a backup from the EC2 instance with this command, using gzip to create a compressed backup:

sudo xtrabackup --backup --stream=tar | gzip -c > /data/backups/xtrabackup.tar.gz

Note that Amazon prepares the backup, so there's no need to run xtrabackup --prepare yourself.

For comparison, I took a mysqldump backup as well:

mysqldump --all-databases --triggers --events --routines --master-data=1 --single-transaction | gzip -c > /data/backups/mysqldump.sql.gz
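As an aside, here is what a multi-threaded logical backup with mydumper (mentioned just below) might look like; the options, credentials and paths shown are illustrative, not what the test used:

mydumper --host=127.0.0.1 --user=backup --password=secret \
  --threads=8 --compress --outputdir=/data/backups/mydumper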

I could have used mydumper to make this process multi-threaded, but to reduce complexity I did not. I then uploaded the backup to an S3 bucket (setting up credentials beforehand):

sudo aws s3 cp /data/backups/xtrabackup.tar.gz s3://dankow/

After that, I navigated to Relational Database Service in the AWS Console, and instead of clicking Launch DB Instance, I clicked Restore from S3. From there, the process is almost identical to creating a normal RDS MySQL or Amazon Aurora MySQL instance, with the addition of this box on Step 2:

I chose a db.m4.xlarge instance with 1000 Provisioned IOPS for this test. After I configured all the other options, I clicked “Launch DB Instance” and waited for my backup to decompress, prepare and restore into a new RDS instance.

For time comparison, I imported the backup I took with mysqldump, ignoring all the expected errors about privileges because they don’t affect the tables that we’re really interested in:

gunzip -c /data/backups/mysqldump.sql.gz | mysql --defaults-file=rds.cnf --force

Replication

If you’re planning on migrating a non-RDS instance to RDS, you might want to make your new RDS instance an async replica of the source instance. If there is a network path between the two instances, this is simple. Use the binary log coordinates from the xtrabackup_binlog_info (RDS does not support master_auto_position with GTID replication), and use them as arguments to the RDS external replication stored procedures, like this:CALL mysql.rds_set_external_master ( "<host_name>", <host_port>, "<replication_user_name", "replication_password", "<mysql_binary_log_file_name>", mysql_binary_log_file_position, 0 ); CALL mysql.rds_start_replication;
Currently, there is no way to make this connection use SSL. If the source instance is not in the same VPC as the RDS instance, set up a VPN connection between the two networks to protect the replication traffic.

Time Comparison

The time to back up was close: 8 minutes for Percona XtraBackup, and 7.5 minutes for mysqldump. Add the time to copy the backup to S3 (37 seconds), and the two methods are almost identical. The difference comes with restore time: the mysqldump backup took 22.5 minutes to restore, while Amazon took 10 minutes and 50 seconds to create the RDS instance from the physical backup. Some part of that is the normal overhead of creating an RDS instance, which always takes a few minutes.

Although my test dataset was small (13.5 GB) compared to most production databases, it was large enough to show a significant difference between physical (Percona XtraBackup) and logical (mysqldump) backups. The XtraBackup method was about 60% faster than mysqldump. If your dataset is larger, you will see even more of a difference.

Conclusion

When you migrate to Amazon RDS using a physical backup, it can be much faster than using a logical backup — but it’s not the right option for every use case. If your InnoDB tablespaces have significant fragmentation, or if you’re not currently using innodb_file_per_table, you may want to perform a logical migration to fix those issues. If you normally create RDS instances programmatically, the AWS CLI does not currently support creating an RDS instance from a physical backup. Any corruption in the InnoDB files transfers over to the RDS instance if you use a physical backup, but a logical backup will fail and allow you to fix the corruption before it gets to RDS.

For many use cases, however, building an RDS instance from Percona XtraBackup is a convenient way to get your data into RDS MySQL or Aurora relatively quickly. In this one small-scale test, migrating using XtraBackup was 60% faster than using mysqldump.

The post Migrate to Amazon RDS Using Percona Xtrabackup appeared first on Percona Database Performance Blog.

dbdeployer GA and semantic versioning

dbdeployer went into release candidate status a few weeks ago. Since then, I added no new features, but a lot of tests. The test suite now runs 3,000+ tests on MacOS and a bit more on Linux, for a grand total of 6,000+ tests that need to run at least twice: once with concurrency enabled and once without. I know that testing can't prove the absence of bugs, but I am satisfied with the results, since all this grinding has allowed me to find several bugs and fix them.

With all this in place, I felt that dbdeployer could exit candidate status and go to version 1.0. This happened on March 26th. An immediate side effect of this change is that from this point on, dbdeployer must adhere to the semantic versioning principles:

A version number is made of Major, Minor, and Revision. When changes are applied, the following happens:

  • Backward-compatible bug fixes increment the Revision number (e.g. 1.0.0 to 1.0.1)
  • Backward-compatible new features increment the Minor number (1.0.1 to 1.1.0)
  • Backward incompatible changes (either features or bug fixes that break compatibility with the API) increment the Major number (1.15.9 to 2.0.0)

The starting API is defined in API-1.0.md, which was generated manually.
The file API-1.1.md contains the same API definition, but was generated automatically and can be used to better compare the initial API with later versions.

So the app went from 1.0 to 1.1 in less than one week. In obedience to semantic versioning principles, if a new backward-compatible feature is added, the minor number of the version increases. What does backward-compatible mean? It means that commands, procedures, and workflows that worked with the previous version will also work with the current one; it's just that the new release has more capabilities. In this case, the added feature is the ability to have the environment variables HOME and PWD recognized and properly expanded in the configuration file. It's nothing very exciting, but changing the minor number gives the user a hint of what to expect from the new release.

Let's give a few examples:

  • Version goes from 1.0.0 to 1.0.1: It means that there are only bug fixes, and you should expect to use it without modifications.
  • Version goes from 1.0.1 to 1.1.0: You should be able to use dbdeployer just as before, but you should check the release notes to see what's new, because there are new functionalities that might be useful to you.
  • Version goes from 1.3.15 to 2.0.0: Danger! A major number bumped up means that something has changed in the API, which is now partially or totally incompatible with the previous release. Your workflow may break, and you must check the release notes and the documentation to learn how to use the new version.

This is different from other applications. For example, the MySQL server uses version numbers with hard to predict meaning:

  • MySQL 5.1, 5.5, 5.6, and 5.7 should be, in fact, major version number changes, not minor ones. Each one of them introduces incompatible changes that require careful review of the novelties.
  • Within the same version (such as MySQL 5.7) there are a lot of compatible and incompatible changes, although the minor number stays the same.

The plan with dbdeployer is to use the version number as a manifest, to give users an immediate feeling of what to expect. Rather than changing minor or major number only when the developers think there is some juicy new thing of which they can be proud, the version number will tell whether users should worry about compatibility or not.

In my general development plan, you are more likely to see versions like "1.25.16" than version "2.0," meaning that I will try to keep the current API valid as much as possible. A major version change will signify that a new feature could not fit in the current infrastructure and a new one would be needed.

You can draw your own conclusions here. A semantic versioning paradigm is unlikely to be adopted by most software vendors, because version numbers are often marketing gimmicks, and they can charge you more convincingly for a version 6.0 than for version 1.34.
Free software, OTOH, can do this. My goal with dbdeployer is to help the MySQL community, and I will achieve that goal more easily if my releases can be adopted without fear of incompatibility.


Fun with Bugs #64 - On MySQL Bug Reports I am Subscribed to, Part IV

I've subscribed to more than 15 new MySQL bug reports since the previous post in this series, so it's time for a new one. I try to follow important, funny, or hard-to-process bug reports every day. Here is the list of the most interesting recent ones, starting from the latest (several are still not processed properly):
  • Bug #90211 - "Various warnings and errors when compiling MySQL 8 with Clang".  Roel Van de Paar and Percona in general continue their QA efforts in a hope to make MySQL 8 better. Current opinion of Oracle engineers on this bug is the following:
    "First of all, these issues are in protobuf, not MySQL per se. There are some warnings with Clang 6, but since they're in third-party code, we have simply disabled them when compiling protobuf (will be part of 8.0.11). Optionally, -DUSE_SYSTEM_LIBS=1 will use system protobuf and thus not compile the files in question.
    As for the crash, we don't support prerelease compilers (more generally, we support platforms, not compilers). Given the stack trace, it is highly likely that the issue either is in the prerelease Clang, or in protobuf."
    Let's see how it may end up. Roel rarely gives up easily...
  • Bug #90209 - "Performance regression with > 15K tables in MySQL 8.0 (with general tablespaces)". Nice regression bug report from Alexander Rubin. It is still "Open".
  • Bug #90190 - "Sig=6 assertion in MYSQL_BIN_LOG::new_file_impl | binlog.cc:6862". Yet another bug report from Percona employee Ramesh Sivaraman.
  • Bug #89994 - "INDEX DIRECTORY shown as valid option for InnoDB table creation". Everybody knows how much I like the fine MySQL manual. I like it even more when missing or wrong details are found there, as in this case reported by my colleague from MariaDB, Claudio Nanni.
  • Bug #89963 - "Slowdown in creating new SSL connection". Maybe it's comparing apples to oranges, as stated in one of the comments, but I am surprised that this (performance regression) bug report by Rene' Cannao' is still "Open". It requires more attention, IMHO. The speed of connections matters a lot for MySQL.
  • Bug #89904 - "Can't change innodb_max_dirty_pages_pct to 0 to flush all pages". Good intentions to set a better default value (applied a bit later than needed) led to the problem. As Simon Mudd put it:
    "innodb_max_dirty_pages_pct_lwm setting has existed since 5.6. This issue only comes up as by changing the default value to 10 those of us who have ignored it until now never noticed it existed. That is a shame as setting this value to a value other than 0 (e.g. 10 which is the new default) should be better and trigger some background flushing of dirty pages avoiding us hitting innodb_max_dirty_pages_pct which would trigger much more aggressive behaviour which is not really desirable."
  • Bug #89876 - "mysqladmin flush-hosts is not safe in GTID mode". Yet another bug report from Simon Mudd. See also Bug #88720, which highlights even more problems with various FLUSH statements and GTIDs.
  • Bug #89870 - "Group by optimization not used with partitioned tables". For some reason this report from Arnaud Adant is still "Open". As my colleague Richard Stracke stated:
    "The only solution would be, that the optimizer is able to check, if the condition in the where clause include the whole table (or partition) and in this case use group by optimization."
  • Bug #89860 - "XA may lost prepared transaction and cause different between master and slave." As this bug report from Michael Yang (and others, like Bug #88534) shows, there is still a long way to go before it is safe to use XA transactions with MySQL.
  • Bug #89834 - "Replication will not connect on IPv6 - does not function in an IPv6 only environ". This bug report from Tim St. Pierre is still "Open".
  • Bug #89822 - "InnoDB retries open on EINTR error only if innodb_use_native_aio is enabled". We have a patch contributed by Laurynas Biveinis from Percona.
  • Bug #89758 - "Conversion from ENUM to VARCHAR fails because mysql adds prefix index". This funny bug was found and reported by Monty Solomon.
  • Bug #89741 - "Events log Note level messages even for log_warnings=0". Nikolai Ikhalainen found that this problem happens only in versions 5.5.x and 5.6.x, so the chances of seeing it fixed are low. But I still want to know if this ever happens.
  • Bug #89696 - "Cyclic dependencies are not resolved properly with cascade removal". Make sure to check the nice discussion that my dear friend Sinisa Milivojevic had with the bug reporter, Andrei Anishchenko, before marking the bug as "Verified". This regression was most likely caused by a change in MySQL 5.7.21:
    "InnoDB: An iterative approach to processing foreign cascade operations resulted in excessive memory use. (Bug #26191879, Bug #86573)"
  • Bug #89625 - "please package the debug symbols *.pdb files!". Shane Bester always cared about having a way to debug on Windows. Recently I also started to care about this...
--- It's April Fools' Day today, so why not make a fool of myself by assuming that anyone cares about this series of blog posts.
