Planet MySQL

MySQL Meetup in Dubai

We are happy to announce that there will be a MySQL Day/MySQL Community Meetup held in the Oracle Building in Dubai on October 17th. Please find more details below:

  • Date: Wednesday, October 17, 2018
  • Time: 6pm 
  • Address: Oracle Office Building #6, Dubai Internet City 
  • Meeting room: will be confirmed soon.
  • Agenda: 
    • "Oracle MySQL 8 - The next big thing!" by Carsten Thalheimer, Master Principal Sales Consultant
    • Discussion & pizza
  • More information & Registration

MySQL : InnoDB Transparent Tablespace Encryption

From MySQL 5.7.11, encryption is supported for InnoDB (file-per-table) tablespaces. This is called Transparent Tablespace Encryption, sometimes also referred to as Encryption at Rest. This blog post aims to give the internal details of InnoDB tablespace encryption.
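
Before digging into the internals, here is a minimal sketch of what enabling the feature looks like in practice (assuming MySQL 5.7.11+ with a keyring plugin such as keyring_file loaded; the table names are illustrative and not taken from the original post):

-- The keyring plugin must be loaded early, e.g. in my.cnf:
--   early-plugin-load = keyring_file.so

-- Create a new encrypted file-per-table tablespace, or encrypt an existing table:
CREATE TABLE secrets (id INT PRIMARY KEY, payload VARBINARY(255)) ENCRYPTION='Y';
ALTER TABLE existing_table ENCRYPTION='Y';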

Keyring Plugin : Why What How ?

Percona Live Europe 2018 Session Programme Published

Offering over 110 conference sessions across Tuesday 6 and Wednesday 7 November, and a full tutorial day on Monday 5 November, we hope you'll find this fantastic line-up of talks for Percona Live Europe 2018 to be one of our best yet! Innovation in technology continues to arrive at an accelerated rate, and you'll find plenty to help you connect with the latest developments in open source database technologies at this acclaimed annual event.

Representatives from companies at the leading edge of our industry use the platform offered by Percona Live to showcase their latest developments and share plans for the future. If your career is dependent upon the use of open source database technologies you should not miss this conference!

Conference Session Schedule

Conference sessions will take place on Tuesday and Wednesday, November 6-7 and will feature more than 110 in-depth talks by industry experts. Conference session examples include:

  • Deep Dive on MySQL Databases on Amazon RDS – Chayan Biswas, AWS
  • MySQL 8.0 Performance: Scalability & Benchmarks – Dimitri Kravtchuk, Oracle
  • MySQL 8 New Features: Temptable engine – Pep Pla, Pythian
  • Artificial Intelligence Database Performance Tuning – Roel Van de Paar, Percona
  • pg_chameleon – MySQL to PostgreSQL replica made easy – Federico Campoli, Loxodata
  • Highway to Hell or Stairway to Cloud? – Alexander Kukushkin, Zalando
  • Zero to Serverless in 60 Seconds – Sandor Maurice, AWS
  • A Year in Google Cloud – Carmen Mason, Alan Mason, Vital Source Technologies
  • Advanced MySQL Data at Rest Encryption in Percona Server for MySQL – Iwo Panowicz, Percona, and Bartłomiej Oleś, Severalnines
  • Monitoring Kubernetes with Prometheus – Henri Dubois-Ferriere, Sysdig
  • How We Use and Improve Percona XtraBackup at Alibaba Cloud – Bo Wang, Alibaba Cloud
  • Shard 101 – Adamo Tonete, Percona
  • Top 10 Mistakes When Migrating From Oracle to PostgreSQL – Jim Mlodgenski, AWS
  • Explaining the Postgres Query Optimizer – Bruce Momjian, EnterpriseDB
  • MariaDB 10.3 Optimizer and Beyond – Vicentiu Ciorbaru, MariaDB Foundation
  • HA and Clustering Solution: ProxySQL as an Intelligent Router for Galera and Group Replication – René Cannaò, ProxySQL
  • MongoDB WiredTiger WriteConflicts – Paul Agombin, ObjectRocket
  • PostgreSQL Enterprise Features – Michael Banck, credativ GmbH
  • What’s New in MySQL 8.0 Security – Georgi Kodinov, Oracle
  • The MariaDB Foundation and Security – Finding and Fixing Vulnerabilities the Open Source Way – Otto Kekäläinen, MariaDB Foundation
  • ClickHouse 2018: How to Stop Waiting for Your Queries to Complete and Start Having Fun – Alexander Zaitsev, Altinity
  • Open Source Databases and Non-Volatile Memory – Frank Ober, Intel Memory Group
  • MyRocks Production Case Studies at Facebook – Yoshinori Matsunobu, Facebook
  • Need for Speed: Boosting Apache Cassandra’s Performance Using Netty – Dinesh Joshi, Apache Cassandra
  • Demystifying MySQL Replication Crash Safety – Jean-François Gagné, Messagebird
See the full list of sessions.

Tutorial Schedule

Tutorials will take place throughout the day on Monday, November 5, 2018. Tutorial session examples include:

  • Query Optimization with MySQL 8.0 and MariaDB 10.3: The Basics – Jaime Crespo, Wikimedia Foundation
  • ElasticSearch 101 – Antonios Giannopoulos, ObjectRocket
  • MySQL InnoDB Cluster in a Nutshell: The Saga Continues with 8.0 – Frédéric Descamps, Oracle
  • High Availability PostgreSQL and Kubernetes with Google Cloud – Alexis Guajardo, Google
  • Best Practices for High Availability – Alex Rubin and Alex Poritskiy, Percona
See the full list of tutorials.

Sponsors

We are grateful for the support of our sponsors:

  • Platinum – AWS
  • Silver – Altinity, PingCap
  • Start Up – Severalnines
  • Branding – Intel, Idera
  • Expo – Postgres EU

If you would like to join them, sponsorship opportunities for Percona Live Open Source Database Conference Europe 2018 are available. They offer the opportunity to interact with the DBAs, sysadmins, developers, CTOs, CEOs, business managers, technology evangelists, solution vendors, and entrepreneurs who typically attend the event. Contact live@percona.com for sponsorship details.

Ready to register? What are you waiting for? Costs will only get higher!
Register now!


Finding Table Differences on Nullable Columns Using MySQL Generated Columns

Some time ago, a customer had a performance issue with an internal process. He was comparing, finding, and reporting the rows that were different between two tables. This is simple if you use a LEFT JOIN and an IS NULL comparison over the second table in the WHERE clause, but what if the column itself could be NULL? That is why he used UNION, GROUP BY and HAVING clauses, which resulted in poor performance.

The challenge was to be able to compare each row using a LEFT JOIN over NULL values.

The challenge in more detail

I’m not going to use the customer’s real table. Instead, I will be comparing two sysbench tables with the same structure:

CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned DEFAULT NULL,
  `c` char(120) DEFAULT NULL,
  `pad` char(60) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `k` (`k`,`c`,`pad`)
) ENGINE=InnoDB

It is slightly different from the original sysbench schema, as this version can hold NULL values. Both tables have the same number of rows. We are going to set one row to NULL in each table:

update sbtest1 set k=null where id = 4462 limit 1;
update sbtest2 set k=null where id = 4462 limit 1;

If we execute the comparison query, we get this result:

mysql> select "sbtest1",a.* from
    -> sbtest1 a left join
    -> sbtest2 b using (k,c,pad)
    -> where b.id is null union
    -> select "sbtest2",a.* from
    -> sbtest2 a left join
    -> sbtest1 b using (k,c,pad)
    -> where b.id is null;
+---------+------+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| sbtest1 | id   | k    | c                                                                                                                       | pad                                                         |
+---------+------+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| sbtest1 | 4462 | NULL | 64568100364-99474573987-46567807085-85185678273-10829479379-85901445105-43623848418-63872374080-59257878609-82802454375 | 07052127207-33716235481-22978181904-76695680520-07986095803 |
| sbtest2 | 4462 | NULL | 64568100364-99474573987-46567807085-85185678273-10829479379-85901445105-43623848418-63872374080-59257878609-82802454375 | 07052127207-33716235481-22978181904-76695680520-07986095803 |
+---------+------+------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
2 rows in set (3.00 sec)

As you can see, column k is NULL in both rows, yet the comparison failed and reported those rows as different. This is not new in MySQL, but it would be nice to have a way to sort this issue out.
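
The root cause is SQL's three-valued logic: a comparison involving NULL evaluates to NULL rather than true, so the join condition can never match two NULL values. A quick illustration (not part of the original post):

mysql> select null = null, 1 = null;
+-------------+----------+
| null = null | 1 = null |
+-------------+----------+
|        NULL |     NULL |
+-------------+----------+
1 row in set (0.00 sec)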

Solution

The solution is based on GENERATED COLUMNS with a hash function (md5) and stored in a binary(16) column:

ALTER TABLE sbtest1
  ADD COLUMN `real_id` binary(16) GENERATED ALWAYS AS
    (unhex(md5(concat(ifnull(`k`,'NULL'),ifnull(`c`,'NULL'),ifnull(`pad`,'NULL'))))) VIRTUAL,
  ADD INDEX (real_id);

ALTER TABLE sbtest2
  ADD COLUMN `real_id` binary(16) GENERATED ALWAYS AS
    (unhex(md5(concat(ifnull(`k`,'NULL'),ifnull(`c`,'NULL'),ifnull(`pad`,'NULL'))))) VIRTUAL,
  ADD INDEX (real_id);

Adding the index is also part of the solution. Now, let’s execute the query using the new column to join the tables:

mysql> select "sbtest1",a.k,a.c,a.pad from
    -> sbtest1 a left join
    -> sbtest2 b using (real_id)
    -> where b.id is null union
    -> select "sbtest2",a.k,a.c,a.pad from
    -> sbtest2 a left join
    -> sbtest1 b using (real_id)
    -> where b.id is null;
Empty set (2.31 sec)

We can see an improvement in the query performance—it now takes 2.31 sec whereas before it was 3.00 sec—and that the result is as expected. We could say that that's all, and that no further improvement can be made. However, that is not true. Even though the query is running faster, it is possible to optimize it further in this way:

mysql> select "sbtest1",a.k,a.c,a.pad
    -> from sbtest1 a
    -> where a.id in (select a.id
    -> from sbtest1 a left join
    -> sbtest2 b using (real_id)
    -> where b.id is null) union
    -> select "sbtest2",a.k,a.c,a.pad
    -> from sbtest2 a
    -> where a.id in (select a.id
    -> from sbtest2 a left join
    -> sbtest1 b using (real_id)
    -> where b.id is null);
Empty set (1.60 sec)

Why is this faster? The first query is performing two subqueries. Each subquery is very similar. Let’s check the explain plan:

mysql> explain select "sbtest1",a.k,a.c,a.pad from
    -> sbtest1 a left join
    -> sbtest2 b using (real_id)
    -> where b.id is null;
+----+-------------+-------+------------+------+---------------+---------+---------+------------------+--------+----------+--------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref              | rows   | filtered | Extra                                |
+----+-------------+-------+------------+------+---------------+---------+---------+------------------+--------+----------+--------------------------------------+
| 1  | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL    | NULL    | NULL             | 315369 | 100.00   | NULL                                 |
| 1  | SIMPLE      | b     | NULL       | ref  | real_id       | real_id | 17      | sbtest.a.real_id | 27     | 10.00    | Using where; Not exists; Using index |
+----+-------------+-------+------------+------+---------------+---------+---------+------------------+--------+----------+--------------------------------------+

As you can see, it is performing a full table scan over the first table and using real_id to join the second table. The real_id is a generated column, so it needs to execute the function to get the value to join the second table. That means that it’s going to take time.

If we analyze the subquery of the second query:

mysql> explain select "sbtest1",a.k,a.c,a.pad
    -> from sbtest1 a
    -> where a.id in (select a.id
    -> from sbtest1 a left join
    -> sbtest2 b using (real_id)
    -> where b.id is null);
+----+--------------+-------------+------------+--------+---------------+------------+---------+------------------+--------+----------+--------------------------------------+
| id | select_type  | table       | partitions | type   | possible_keys | key        | key_len | ref              | rows   | filtered | Extra                                |
+----+--------------+-------------+------------+--------+---------------+------------+---------+------------------+--------+----------+--------------------------------------+
| 1  | SIMPLE       | a           | NULL       | index  | PRIMARY       | k          | 187     | NULL             | 315369 | 100.00   | Using where; Using index             |
| 1  | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_key>    | <auto_key> | 4       | sbtest.a.id      | 1      | 100.00   | NULL                                 |
| 2  | MATERIALIZED | a           | NULL       | index  | PRIMARY       | real_id    | 17      | NULL             | 315369 | 100.00   | Using index                          |
| 2  | MATERIALIZED | b           | NULL       | ref    | real_id       | real_id    | 17      | sbtest.a.real_id | 27     | 10.00    | Using where; Not exists; Using index |
+----+--------------+-------------+------------+--------+---------------+------------+---------+------------------+--------+----------+--------------------------------------+

We can see that it performs a full index scan over the first table, and that the generated column expression never has to be evaluated. That is how we go from an inconsistent result in 3.00 seconds, to a consistent result in 2.31 seconds, and finally to a performant query that returns the consistent result in 1.60 seconds.

Conclusions

This is not the first blog post that I've done about generated columns. I think that it is a useful feature for several scenarios where you need to improve performance. In this particular case, it also provides a workaround for the expected behaviour of LEFT JOINs over NULL values. It is also important to mention that this improved a process in a real-world scenario.

New White Paper on State-of-the-Art Database Management: ClusterControl - The Guide

Today we’re happy to announce the availability of our first white paper on ClusterControl, the only management system you’ll ever need to automate and manage your open source database infrastructure!

Download ClusterControl - The Guide!

Most organizations have databases to manage, and experience the headaches that come with that: managing performance, monitoring uptime, automatically recovering from failures, scaling, backups, security and disaster recovery. Organizations build and buy numerous tools and utilities for that purpose.

ClusterControl differs from the usual approach of trying to bolt together performance monitoring, automatic failover and backup management tools by combining – in one product – everything you need to deploy and operate mission-critical databases in production. It automates the entire database environment, and ultimately delivers an agile, modern and highly available data platform based on open source.

All-in-one management software - the ClusterControl features set:

Since the inception of Severalnines, we have made it our mission to provide market-leading solutions to help organisations achieve optimal efficiency and availability of their open source database infrastructures.

With ClusterControl, as it stands today, we are proud to say: mission accomplished!

Our flagship product is an integrated deployment, monitoring, and management automation system for open source databases, which provides holistic, real-time control of your database operations in an easy and intuitive experience, incorporating the best practices learned from thousands of customer deployments in a comprehensive system that helps you manage your databases safely and reliably.

Whether you’re a MySQL, MariaDB, PostgreSQL or MongoDB user (or a combination of these), ClusterControl has you covered.

Deploying, monitoring and managing highly available open source database clusters is no small feat. It requires either highly specialised database administration (DBA) skills, or professional tools and systems that non-DBA users can wield to build and maintain such systems, though these typically come with an equally steep learning curve.

The idea and concept for ClusterControl was born out of that conundrum that most organisations face when it comes to running highly available database environments.

It is the only solution on the market today that provides that intuitive, easy to use system with the full set of tools required to manage such complex database environments end-to-end, whether one is a DBA or not.

The aim of this guide is to make the case for comprehensive open source database management and the need for cluster management software, and to explain, in an equally comprehensive fashion, why ClusterControl is the only management system you will ever need to run highly available open source database infrastructures.

Download ClusterControl - The Guide!

Tags:  clustercontrol database management MySQL MariaDB MongoDB PostgreSQL white paper

MySQL User Camp, Bangalore

We are happy to announce that another "MySQL User Camp" is going to be held in Bangalore, India. Please find details below:

  • Date: Thursday, October 11, 2018
  • Time: 3:00–5:30pm
  • Venue: OC001, Block1, B wing, Kalyani Magnum Infotech Park, J.P. Nagar, 7th Phase Bangalore – 76
  • Agenda: 
    • Listen to Tomas Ulin (VP Engineering, MySQL) talk about how MySQL 8.0 combines the best of both worlds: SQL and NoSQL.
    • An engaging session by a MySQL developer demonstrating “InnoDB Cluster in action”.
    • A presentation by MySQL Developers on MySQL Document Store.
  • Please send an email to tinku.ajit@oracle.com for registration. Registration is free (FCFS)

Explore the opportunity to get to know more about MySQL by networking with other MySQLers!!!

Shinguz: MariaDB/MySQL Environment MyEnv 2.0.1 has been released

FromDual has the pleasure to announce the release of the new version 2.0.1 of its popular MariaDB, Galera Cluster and MySQL multi-instance environment MyEnv.

The new MyEnv can be downloaded here. How to install MyEnv is described in the MyEnv Installation Guide.

In the inconceivable case that you find a bug in the MyEnv please report it to the FromDual bug tracker.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Upgrade from 1.1.x to 2.0

Please look at the MyEnv 2.0.0 Release Notes.

Upgrade from 2.0.0 to 2.0.1

shell> cd ${HOME}/product
shell> tar xf /download/myenv-2.0.1.tar.gz
shell> rm -f myenv
shell> ln -s myenv-2.0.1 myenv
Plug-ins

If you are using plug-ins for showMyEnvStatus create all the links in the new directory structure:

shell> cd ${HOME}/product/myenv
shell> ln -s ../../utl/oem_agent.php plg/showMyEnvStatus/
Upgrade of the instance directory structure

From MyEnv 1.0 to 2.0 the directory structure of instances has fundamentally changed. Nevertheless MyEnv 2.0 works fine with MyEnv 1.0 directory structures.

Changes in MyEnv 2.0.1

MyEnv
  • CloudLinux was added as supported distribution.
  • Introduced different brackets for up () and down [] MariaDB/MySQL Instances in up output.
  • Script setMyEnv.sh added to set environment for instance (e.g. via ssh).
  • MyEnv should not complain any more when default my.cnf with include/includedir directives is used.
  • MyEnv now complains if the instancedir configuration variable is missing in myenv.conf. This could be a leftover from a 1.x to 2.y migration.
  • OpenSuSE Leap 42.3 support added.

MyEnv Installer
  • Instance name with dot '.' is not allowed any more.
  • basedir without bin/mysqld is stripped out from installer overview.

MyEnv Utilities
  • Utilities cluster_conflict.php, cluster_conflict.sh, galera_monitor.sh, haproxy_maint.sh, group_replication_monitor.sh for Galera and Group Replication Cluster added.

For subscriptions of commercial use of MyEnv please get in contact with us.

Taxonomy upgrade extras:  MyEnv multi-instance virtualization consolidation SaaS Operations release upgrade mysqld_multi

System Performance Theory

I was invited to MongoDB’s New York headquarters to kick off a series of internal tech talks on performance, with a presentation on system performance theory. The talk began by building intuition about Little’s Law, segued into queueing theory, and rounded out with the Universal Scalability Law.

Below is the abstract that was circulated internally at MongoDB to promote the talk:

In college Henrik Ingo got to do an exercise on queueing theory: how to optimize the number of cash registers and bathroom stalls in a McDonalds restaurant. After graduating he never worked in a McDonalds and forgot all about this. It was only through some writings of Baron Schwartz that he realized that queueing theory also applies directly to database performance (and systems performance). So to start a series of internal talks at MongoDB about performance, there was no better way than to have Baron give an introduction to Queueing theory, Amdahl’s law and the Universal Scalability Law.

Baron Schwartz became known as a database performance expert when working at Percona, where he was one of their first employees. In addition to numerous conference talks and blogs, he is also co-author of the 2nd to 3rd editions of the book High Performance MySQL. Over time his methodology developed into a more data and statistics driven approach. In one publication he reviewed and categorized the root causes for database outages in all Percona support incidents over one year. Many of his methods and ideas were added to and popularized in the Percona Toolkit, which is the mtools of the MySQL world. The culmination of this path was the founding of his own company VividCortex, which provides a database monitoring service for open source databases (including MongoDB). In addition to standard performance monitoring and query analyzer dashboards, VividCortex includes fault detection algorithms based on—wait for it—queueing theory.

CRITICAL UPDATE for Percona XtraDB Cluster users: 5.7.23-31.31.2 Is Now Available

To resolve a critical regression, Percona announces the release of Percona XtraDB Cluster 5.7.23-31.31.2 on October 2, 2018. Binaries are available from the downloads section or from our software repositories.

This release resolves a critical regression in Galera’s wsrep library and supersedes 5.7.23-31.31.

Percona XtraDB Cluster 5.7.23-31.31.2 is now the current release, based on the following:

All Percona software is open-source and free.

Fixed Bugs
  • #2254: A cluster conflict could cause a crash in Percona XtraDB Cluster 5.7.23 if autocommit=off.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Introducing Agent-Based Database Monitoring with ClusterControl 1.7

We are excited to announce the 1.7 release of ClusterControl - the only management system you’ll ever need to take control of your open source database infrastructure!

ClusterControl 1.7 introduces new exciting agent-based monitoring features for MySQL, Galera Cluster, PostgreSQL & ProxySQL, security and cloud scaling features ... and more!

Release Highlights

Related resources: ClusterControl ChangeLog | Download ClusterControl | Learn More About ClusterControl

Monitoring & Alerting
  • Agent-based monitoring with Prometheus
  • New performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL
Security & Compliance
  • Enable/disable Audit Logging on your MariaDB databases
  • Enable policy-based monitoring and logging of connection and query activity
Deployment & Scaling
  • Automatically launch cloud instances and add nodes to your cloud deployments
Additional Highlights
  • Support for MariaDB v10.3

View the ClusterControl ChangeLog for all the details!

Release Details

Monitoring & Alerting: Agent-based monitoring with Prometheus

ClusterControl was originally designed to address modern, highly distributed database setups based on replication or clustering. It provides a systems view of all the components of a distributed cluster, including load balancers, and maintains a logical topology view of the cluster.

So far we’d gone the agentless monitoring route with ClusterControl, and although we love the simplicity of not having to install or manage agents on the monitored database hosts, an agent-based approach can provide higher resolution of monitoring data and has certain advantages in terms of security.

With that in mind, we’re happy to introduce agent-based monitoring as a new feature added in ClusterControl 1.7!

It makes use of Prometheus, a full monitoring and trending system that includes built-in and active scraping and storing of metrics based on time series data. One Prometheus server can be used to monitor multiple clusters. ClusterControl takes care of installing and maintaining Prometheus as well as exporters on the monitored hosts.

Users can now enable their database clusters to use Prometheus exporters to collect metrics on their nodes and hosts, thus avoiding excessive SSH activity for monitoring and metrics collections and use SSH connectivity only for management operations.

Monitoring & Alerting: New performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL

ClusterControl users now have access to a set of new dashboards that have Prometheus as the data source with its flexible query language and multi-dimensional data model, where time series data is identified by metric name and key/value pairs. This allows for greater accuracy and customization options while monitoring your database clusters.

The new dashboards include:

  • Cross Server Graphs
  • System Overview
  • MySQL Overview, Replication, Performance Schema & InnoDB Metrics
  • Galera Cluster Overview & Graphs
  • PostgreSQL Overview
  • ProxySQL Overview
Security & Compliance: Audit Log for MariaDB

Continuous auditing is an imperative task for monitoring your database environment. By auditing your database, you can achieve accountability for actions taken or content accessed. Moreover, the audit may include some critical system components, such as the ones associated with financial data to support a precise set of regulations like SOX, or the EU GDPR regulation. Usually, it is achieved by logging information about DB operations on the database to an external log file.

With ClusterControl 1.7 users can now enable a plugin that will log all of their MariaDB database connections or queries to a file for further review; it also introduces support for version 10.3 of MariaDB.
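
This kind of auditing is typically implemented with the MariaDB Audit Plugin. For reference, a hedged sketch of what the equivalent manual setup looks like (ClusterControl automates these steps; the log file path here is only an example):

INSTALL SONAME 'server_audit';
SET GLOBAL server_audit_events = 'CONNECT,QUERY';
SET GLOBAL server_audit_file_path = '/var/log/mysql/server_audit.log';
SET GLOBAL server_audit_output_type = 'file';
SET GLOBAL server_audit_logging = ON;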

Additional New Functionalities

View the ClusterControl ChangeLog for all the details!

Download ClusterControl today!

Happy Clustering!

Tags:  clustercontrol PostgreSQL MySQL MongoDB MariaDB database monitoring agent-based

Shutdown and Restart Statements

There are various ways to shut down MySQL. The traditional cross-platform method is to use the shutdown command in the mysqladmin client. One drawback is that it requires shell access; another is that it cannot start MySQL again automatically. There are platform-specific options that can perform a restart, such as using systemctl on Linux or installing MySQL as a service on Microsoft Windows. What I will look at here, though, is the built-in support for stopping and restarting MySQL using SQL statements.
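
For comparison, the traditional approaches look roughly like this (service names depend on the platform and packaging, so treat these as illustrative):

shell> mysqladmin -u root -p shutdown        # cross platform, needs shell access, no automatic restart
shell> sudo systemctl restart mysqld         # Linux with systemd
C:\> net stop MySQL80 && net start MySQL80   # Windows service; the service name varies by installation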

Photo by Michael Mroczek on Unsplash

MySQL 5.7 added the SHUTDOWN statement, which allows you to shut down MySQL using the MySQL command-line client or MySQL Shell. The command is straightforward to use:

The SHUTDOWN command available in MySQL 5.7 and later.
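
The original post shows this as a screenshot; in plain text the statement and its (unremarkable) output look something like this:

mysql> SHUTDOWN;
Query OK, 0 rows affected (0.00 sec)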

You will need the SHUTDOWN privilege to use the statement – this is the same privilege required to use mysqladmin to shut down MySQL. There is one gotcha to be aware of with the SHUTDOWN statement: it only works with the old (traditional) MySQL protocol. If you attempt to use it when connected to MySQL using the new X Protocol, you get the error ERROR: 3130: Command not supported by pluggable protocols, as shown in the next example:

Executing SHUTDOWN when connected through the X Protocol causes error 3130.

The RESTART statement, on the other hand, works through both protocols and also requires the SHUTDOWN privilege:

The RESTART command available in MySQL 8.0.

For the restart to work, it is necessary that MySQL has been started in the presence of a “monitoring service”. This is the default on Microsoft Windows (to disable the monitoring service, start MySQL with --no-monitor). On Linux the monitoring service can, for example, be systemd or mysqld_safe.

An example of where the RESTART statement comes in handy is MySQL Shell’s AdminAPI for administering a MySQL InnoDB Cluster. When connected to MySQL Server 8.0, MySQL Shell can use the new SET PERSIST syntax to make the required configuration changes and then use the RESTART statement to restart the instance so that non-dynamic configuration changes take effect.
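
As a rough sketch of that workflow (the variable is chosen here only as an example of a setting that cannot be changed at runtime):

mysql> SET PERSIST_ONLY innodb_log_file_size = 1073741824;
mysql> RESTART;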

The SHUTDOWN and RESTART statements may not be the most important changes in MySQL 5.7 and 8.0, but they can be handy to know of in some cases.

Track Host, Distributed Data

Leveraging distributed data and state is today’s key competitive advantage. This track explored the technical challenges and lessons learned in managing distributed state in large-scale applications that reliably process millions of events per second. Attendees learned proven strategies and gained new insights from leading practitioners into how to handle real-time data in streams and events.

I was a program committee member, and the track host for the Distributed Data track on Tuesday. I did not present at this conference, but I composed remarks to help weave the presentations in this track into a cohesive narrative throughout the day. What follows is a lightly edited version of my introductory remarks for the day’s sessions, as well as introductions to each session. If you’re interested in viewing slides or video of the sessions, you can purchase the video collection directly through O’Reilly Safari. Speakers may also post their slides, which are often linked from the session pages.

Tuesday featured a great set of speakers and topics, lessons from practitioners on how to think about data at scale. The speakers are at the leading edge of their field, sharing stories about the challenges and solutions of pushing the limits of distributed data, and seeing what works and what doesn’t. But although they’re operating at the extremes today, these stories will be widespread and normal in a few years. This isn’t just tech at Google scale that only Google needs, this is all of us.

I was particularly excited to host this track because I’ve been doing a lot of interviews with my own customers to hear about their data engineering practices, their 1- and 3-year data strategy, what trends they’re seeing and how they’re responding, and what they think the future will look like. I’m seeing that the trends in how our customers build distributed data behind and within their distributed systems are all related to macro changes industrywide.

As companies become increasingly data-intensive, and turn to next-generation architectures and data engineering culture and tooling to address it, we’re seeing several consistent themes emerge strongly. These include:

  • Polyglot persistence—the trend to compose data tiers of different technologies.
  • The need to decouple systems for manageability and scalability.
  • The core principles of SRE and how to apply them to stateful systems.
  • Streaming data.
  • Observability, particularly in the data tier where it’s not custom code you can instrument.
  • Observability and traceability of the data flow itself, not just the work the systems do.
  • Different approaches to declarative rather than imperative composition of systems, for cleanliness and abstraction of underlying implementations.

We heard these themes echoed throughout the talks on Tuesday.

Smooth scaling: Slack’s journey toward a new database

The first speaker was Ameet Kotian from Slack. Ameet has a background not only in building distributed systems that manage distributed data, but in building databases themselves. You probably know that Slack is one of the fastest-growing applications in history, which has placed extraordinary demands on their data infrastructure. You might not know that one of Slack’s founders is Cal Henderson, who wrote one of the first books about how to operate MySQL at scale. It was the O’Reilly “fish book,” and introduced a lot of people to topics like sharding and active-passive replication pairs on cheap commodity hardware.

That was more than ten years ago. Now we have the cloud, so why isn’t database scaling just a solved problem? There are several reasons why companies like Slack still have to build scalable database infrastructure themselves and can’t just use a turnkey, off-the-shelf solution. As I listened to Ameet’s story, I was struck by the fact that although technology like Vitess is helping reduce the need to custom-build everything, most data-intensive applications both today and tomorrow will still need to grapple with some of the same challenges, so there’s a lot we can learn from how Slack is approaching this.

Managing multiple sources of truth in distributed applications

The next speaker was Adam Wolfe Gordon, from DigitalOcean. Adam works on the block storage team at DigitalOcean, among other things. One of the challenges of operating distributed systems at scale is that they tend to have a variety of different datastores that have to work together in some way. This is the trend that I call polyglot persistence, and it’s something that all of us deal with in one form or another. It’s often invisible until it causes problems, either operationally or in trying to mutate an app or architecture to grow.

A typical app has as many as ten different data storage technologies these days. If you do a quick count in your head, see what you come up with for your own app. Did you remember zookeeper? What about DNS? But more obviously, you often have file storage, database storage, and some key-value store for example.

These get introduced because you can’t use a single database like Postgres for everything. You always end up with some particular data that’s orders of magnitude better off in a specialized data store: “right tool, right job.” The problem is you now have multiple sources of truth, and you need to coordinate them. Adam’s talk dove into the specifics of coordinating these sources of truth at DigitalOcean, but the lessons apply broadly to every data-intensive app.

Trade-offs in resiliency: Managing the burden of data recoverability

Kristina Bennett is a software engineer who has spent years working on data integrity across Google and is now on the customer reliability team.

Distributed data is hard. It’s hard to build, it’s hard to make it performant, it’s hard to make it correct, it’s hard to evolve over time. Why is it so hard? It’s not just that databases are complicated systems, it’s not just the technology itself. There are two things I’ve seen over and over as I’ve worked with data-intensive systems with lots of distributed data.

The first is that data has a kind of mass, and mass has inertia. We talk about systems as being stateful or stateless, and we know that stateless systems are a lot easier to manage. A big part of that is because stateless systems can be created and destroyed, but state follows the laws of conservation of mass and energy. It takes time and effort to create, move, and destroy data. It doesn’t just happen instantaneously.

The second is time dependency. Many of our systems live in the here and now as constructed from the stream of events in time, storing an up-to-the-moment snapshot of the world, which constantly changes. But if anything goes wrong, it can be really hard to recover systems to a point in time from the past, and then merge that state into the current state, which has continued to change while the recovery efforts were ongoing. It’s like getting to the end of a homework problem, realizing that you dropped a negative, but you don’t know where because you didn’t show all of your work. Solving this problem is challenging in part because of the cost, performance, and other constraints that the size and scale of our systems impose upon us.

Kafka Streams in practice: What works and what doesn’t (yet)

Bart De Vylders is a data scientist at CoScale, a modern container monitoring platform that observes and ingests data from containers, apps, and the full stack at high velocity and applies advanced analysis to monitor them. He spoke about how CoScale converted part of their system to use the Kafka Streams API.

This talk was the first of two that deal with how data flows through your systems. If you’re not yet using Kafka, my personal belief is that Kafka and the event-stream philosophy behind it is one of the most important developments in handling data at large scale in the last decade. In a nutshell, it helps solve two critical problems with data:

  1. The time dependency created by systems like relational databases, that maintain a snapshot of data at a point in time (as I mentioned in my remarks to Kristina Bennett’s talk).
  2. The data flow dependencies introduced by systems that connect to data where it lives, and thus implicitly become dependent on that data being in that location, as well as that data’s snapshotty-ness being up to date.
Sell cron, buy Airflow: Modern data pipelines in finance

Our final talk was by James Meickle, who’s a site reliability engineer at Quantopian, and previously held roles at AppNeta and Harvard among other places.

My introductory remarks to the previous talk about Kafka shouldn’t be taken to mean that everything must be a streaming data problem. Batch processing is still the right answer for a very large set of problems, and unless something else comes along to revolutionize how we process data, it probably always will be.

The common thread between these two talks is managing data dependencies. When I talk to folks at big companies who are using (for example) the Hadoop stack at scale, I often hear that their most acute pains aren’t around things I’m familiar with myself, such as performance, reliability, or cost efficiency. No—their biggest pain and cost is often understanding, managing, and mutating the data flow through their organization. This flow is usually manifested in a very large and complex set of ETL-like tasks, that are often invisibly dependent on each other. James spoke about how Apache Airflow can help structure and manage some of these moving pieces.

Replicating data from MySQL to Oracle

In our work, we often get requirements to replicate data from one data source to another. Previously I wrote about replication from MySQL to Redshift.

In this blog I am going to explain about replicating the data from MySQL to Oracle using Tungsten replicator.

1.0. Tungsten Replicator :

It is an open source replication engine that supports extracting data from MySQL, MySQL variants such as RDS, Percona Server and MariaDB, and Oracle, and allows the extracted data to be applied to other data sources such as Vertica, Cassandra, Redshift, etc.

Tungsten Replicator includes support for parallel replication, and advanced topologies such as fan-in and multi-master, and can be used efficiently in cross-site deployments.

1.1.0. Architecture :

There are three major components in tungsten replicator.

1. Extractor / Master Service
2. Transaction History Log (THL)
3. Applier / Slave Service

1.1.1. Extractor / Master Service :

The extractor component reads data from MySQL’s binary log and writes that information into the Transaction History Log (THL).

1.1.2. Transaction History Log (THL) :

The Transaction History Log (THL) acts as a translator between two different data sources. It stores transactional data from different data servers in a universal format, using the replicator service acting as a master, which can then be processed by the Applier / Slave service.

1.1.3. Applier / Slave Service :

All the raw row-data recorded on the THL logs is re-assembled or constructed into another format such as JSON or BSON, or external CSV formats that enable the data to be loaded in bulk batches into a variety of different targets.

Statement-based information is not supported for heterogeneous deployments, so it is mandatory that the binary log format on MySQL is ROW (with full row image).

1.2.0. Schema Creation :

Heterogeneous replication does not replicate SQL statements, including the DDL statements that would normally define and generate the table structures, so a different method must be used.

Tungsten Replicator includes a tool called ddlscan which can read the schema definition from MySQL or Oracle and translate that into the schema definition required on the target database.

1.3.0. Pre Requisites:

1.3.1. Server Packages:
  • JDK 7 or higher
  • Ant 1.8 or higher
  • Ruby 2.4
  • Net-SSH
  • Net-SCP
1.3.2. MySQL:
  • All the tables to be replicated must have a primary key.

The following configuration should be enabled on MySQL:

binlog-format = row
binlog-row-image = full
character-set-server=utf8
collation-server=utf8_general_ci
default-time-zone='+00:00'

2.0. Requirements :

User creation in Oracle :

CREATE USER accounts_places IDENTIFIED BY accounts_1 DEFAULT TABLESPACE ACCOUNTS_PUB QUOTA UNLIMITED ON ACCOUNTS_PUB;
GRANT CONNECT TO accounts_places;
GRANT ALL PRIVILEGES TO accounts_places;

User creation in MySQL :

root@localhost:(none)> create user 'tungsten'@'%' identified by 'secret';
Query OK, 0 rows affected (0.01 sec)

root@localhost:(none)> GRANT ALL PRIVILEGES ON *.* TO 'tungsten'@'%';
Query OK, 0 rows affected (0.00 sec)

We need to replicate the NOTES_TESTING table from the accounts schema on MySQL to Oracle. The structure of the table is given below.

CREATE TABLE NOTES_TESTING (
  ID INT(11) NOT NULL AUTO_INCREMENT,
  NOTE TEXT,
  CREATED_AT DATETIME DEFAULT NULL,
  UPDATED_AT DATETIME DEFAULT NULL,
  PERSON_ID INT(11) DEFAULT NULL,
  ADDED_BY INT(11) DEFAULT NULL,
  PRIMARY KEY (ID));

Note :

The above table was created in MySQL. Moving on to the MySQL to Oracle replication, note that a few data types are not supported in Oracle (Link).

3.0. Implementation:

The implementation consists of following steps.

  1. Installation / Building tungsten from source
  2. Preparing equivalent schema for Oracle
  3. Configuring Master service
  4. Configuring Slave service
  5. Generating worker tables (temp tables used by tungsten) for replication to be created on MySQL
  6. Start the replication
3.1. Installation / Building From Source:
  • Download the source package from the GIT.
# git clone https://github.com/continuent/tungsten-replicator.git
  • Compile this package it will generate the tungsten-replicator.tar file.
# sh tungsten-replicator/builder/build.sh
# mkdir -p tungsten
  • Once the tar file is generated extract the file under the folder tungsten 
# tar --strip-components 1 -zxvf tungsten-replicator/builder/build/tungsten-replicator-5.2.1.tar.gz -C tungsten/

3.2. Preparing equivalent table for Oracle :

Tungsten Replicator's DDL extractor (ddlscan) reads the table definitions from MySQL and creates the appropriate Oracle table definitions to use during replication.

./bin/ddlscan -user root -url 'jdbc:mysql:thin://mysql-stg:3306/accounts' -pass root2345 -template ddl-mysql-oracle.vm -db ACCOUNTS > access_log.ddl

ddlscan extracts the MySQL table definitions and stores them in the access_log.ddl file.

The table structure will be like this.

DROP TABLE ACCOUNTS.notes_testing;

CREATE TABLE ACCOUNTS.notes_testing (
  ID NUMBER(10, 0) NOT NULL,
  NOTE CLOB /* TEXT */,
  CREATED_AT DATE,
  UPDATED_AT DATE,
  PERSON_ID NUMBER(38,0),
  ADDED_BY NUMBER(38,0),
  PRIMARY KEY (ID)
);

Then we need to load this DDL into Oracle like this:

# cat access_log.ddl | sqlplus sys/oracle as sysdba

Check that the tables were created inside the correct accounts schema:

SQL> desc notes_testing;
 Name        Null?    Type
 ----------  -------- -------------
 ID          NOT NULL NUMBER(10, 0)
 NOTE                 CLOB
 CREATED_AT           DATE
 UPDATED_AT           DATE
 PERSON_ID            NUMBER(38,0)
 ADDED_BY             NUMBER(38,0)

3.3. Configuring Master Service:
  • Switch to tungsten directory and Reset the defaults configuration file.
#cd ~/tungsten
#./tools/tpm configure defaults --reset
  • Configure the Master service on the directory of your choice, We have used /opt/master
  • Following commands will prepare the configuration file for Master service.
./tools/tpm configure master \
  --install-directory=/opt/master \
  --enable-heterogeneous-service=true \
  --enable-heterogeneous-master=true \
  --members=mysql-stg \
  --master=mysql-stg

./tools/tpm configure master --hosts=mysql-stg \
  --replication-user=tungsten \
  --replication-password=password \
  --skip-validation-check=MySQLUnsupportedDataTypesCheck
  • Once the configuration is prepared, Then we can install it using tpm.
./tools/tpm install

Configuration is now complete. For further information, please consult Tungsten documentation, which is available at docs.continuent.com.
NOTE >> Command successfully completed
  • Now Master service will be configured under /opt/master/
  • Start the tungsten Master service.
[root@mysql-stg tungsten]# /opt/master/tungsten/cluster-home/bin/startall
Starting Tungsten Replicator Service...
Waiting for Tungsten Replicator Service.
running: PID:15141
  • Verify it’s working by checking the master status.
[root@mysql-stg tungsten]# /opt/master/tungsten/tungsten-replicator/bin/trepctl services Processing services command... NAME              VALUE ----              ----- appliedLastSeqno: 0 appliedLatency  : 1.412 role            : master serviceName     : master serviceType     : local started         : true state           : ONLINE Finished services command... [root@mysql-stg tungsten]# /opt/master/tungsten/tungsten-replicator/bin/trepctl status Processing status command... NAME                     VALUE ----                     ----- appliedLastEventId     : mysql-bin.000134:0000000000000652;-1 appliedLastSeqno       : 0 appliedLatency         : 1.412 autoRecoveryEnabled    : false autoRecoveryTotal      : 0 channels               : 1 clusterName            : master currentEventId         : mysql-bin.000134:0000000000000652 currentTimeMillis      : 1536839268029 dataServerHost         : mysql-stg extensions             :  host                   : mysql-stg latestEpochNumber      : 0 masterConnectUri       : thl://localhost:/ masterListenUri        : thl://mysql-stg:2112/ maximumStoredSeqNo     : 0 minimumStoredSeqNo     : 0 offlineRequests        : NONE pendingError           : NONE pendingErrorCode       : NONE pendingErrorEventId    : NONE pendingErrorSeqno      : -1 pendingExceptionMessage: NONE pipelineSource         : jdbc:mysql:thin://mysql-stg:3306/tungsten_master?noPrepStmtCache=true relativeLatency        : 13.029 resourcePrecedence     : 99 rmiPort                : 10000 role                   : master seqnoType              : java.lang.Long serviceName            : master serviceType            : local simpleServiceName      : master siteName               : default sourceId               : mysql-stg state                  : ONLINE timeInStateSeconds     : 12.854 timezone               : GMT transitioningTo        :  uptimeSeconds          : 13.816 useSSLConnection       : false version                : Tungsten Replicator 5.2.1 Finished status command...
  • If the master did not start properly refer to this (/opt/master/service_logs/trepsvc.log) error log.
3.4. Configuring Slave service:
  • Switch to tungsten directory and Reset the defaults configuration file.
#cd ~/tungsten
#./tools/tpm configure defaults --reset
  • Configure the Slave service on the directory of your choice, We have used /opt/slave
  • Following commands will prepare the configuration file for Slave service.
./tools/tpm configure slave \
  --install-directory=/opt/slave \
  --enable-heterogeneous-service=true \
  --members=mysql-stg

./tools/tpm configure slave --hosts=mysql-stg \
  --datasource-type=oracle \
  --datasource-host=172.17.4.106 \
  --datasource-port=1526 \
  --datasource-oracle-sid=PLACES \
  --datasource-user=accounts_places \
  --datasource-password=accounts_1 \
  --svc-applier-filters=dropstatementdata,replicate \
  --property=replicator.filter.replicate.do=accounts --property=replicator.applier.dbms.getColumnMetadataFromDB=false \
  --skip-validation-check=InstallerMasterSlaveCheck \
  --rmi-port=10002 \
  --thl-port=2113 \
  --master-thl-port=2112 \
  --master-thl-host=mysql-stg
  • Once the configuration is prepared, Then we can install it using tpm.
./tools/tpm install

Configuration is now complete. For further information, please consult Tungsten documentation, which is available at docs.continuent.com.
NOTE >> Command successfully completed

3.5. Starting Replication:
  • Once the slave is configured then start the slave
[root@mysql-stg tungsten]# /opt/slave/tungsten/cluster-home/bin/startall
Starting Tungsten Replicator Service...
Waiting for Tungsten Replicator Service.
running: PID:17039
  • Verify it’s working by checking the slave status.
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl services Processing services command... NAME              VALUE ----              ----- appliedLastSeqno: -1 appliedLatency  : -1.0 role            : slave serviceName     : slave serviceType     : unknown started         : true state           : OFFLINE:ERROR Finished services command... [root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl status Processing status command... NAME                     VALUE ----                     ----- appliedLastEventId     : NONE appliedLastSeqno       : -1 appliedLatency         : -1.0 autoRecoveryEnabled    : false autoRecoveryTotal      : 0 channels               : 1 clusterName            : slave currentEventId         : NONE currentTimeMillis      : 1536839732221 dataServerHost         : 172.17.4.106 extensions             :  host                   : 172.17.4.106 latestEpochNumber      : -1 masterConnectUri       : thl://mysql-stg:2112/ masterListenUri        : null maximumStoredSeqNo     : -1 minimumStoredSeqNo     : -1 offlineRequests        : NONE pendingError           : NONE pendingErrorCode       : NONE pendingErrorEventId    : NONE pendingErrorSeqno      : -1 pendingExceptionMessage: NONE pipelineSource         : thl://mysql-stg:2112/ relativeLatency        : -1.0 resourcePrecedence     : 99 rmiPort                : 10002 role                   : slave seqnoType              : java.lang.Long serviceName            : slave serviceType            : local simpleServiceName      : slave siteName               : default sourceId               : 172.17.4.106 state                  : ONLINE timeInStateSeconds     : 4.476 timezone               : GMT transitioningTo        :  uptimeSeconds          : 77.996 useSSLConnection       : false version                : Tungsten Replicator 5.2.1 Finished status command... 4.0. Testing:
  • Now both master & slave are in sync. Now I am going to insert a few record on MySQL server in notes_testing table.
insert into notes_testing values(1,'Mydbops ',NULL,NULL,13,45);
insert into notes_testing values(2,'MySQL DBA',NULL,NULL,1,2);
  • The records above were inserted on the master server. At the same time, I checked on Oracle whether these records had been replicated.
SQL> select * from notes_testing;

ID  NOTE       CREATED_AT  UPDATED_AT  PERSON_ID  ADDED_BY
--  ---------  ----------  ----------  ---------  --------
 1  Mydbops                                   13        45
 2  MySQL DBA                                  1         2


5.0. Troubleshooting:

While reconfiguring the Tungsten replicator, I got the error below and replication was not in sync. Here is a sample of the error:

NAME                     VALUE ----                     ----- appliedLastEventId     : NONE appliedLastSeqno       : -1 appliedLatency         : -1.0 autoRecoveryEnabled    : false autoRecoveryTotal      : 0 channels               : -1 clusterName            : slave currentEventId         : NONE currentTimeMillis      : 1536839670076 dataServerHost         : 172.17.4.106 extensions             :  host                   : 172.17.4.106 latestEpochNumber      : -1 masterConnectUri       : thl://mysql-stg:2112/ masterListenUri        : null maximumStoredSeqNo     : -1 minimumStoredSeqNo     : -1 offlineRequests        : NONE pendingError           : Event extraction failed pendingErrorCode       : NONE pendingErrorEventId    : NONE pendingErrorSeqno      : -1 pendingExceptionMessage: Client handshake failure: Client response validation failed: Master log does not contain requested transaction: master source ID=mysql-stg client source ID=172.17.4.106 requested seqno=3 client epoch number=0 master min seqno=0 master max seqno=0 pipelineSource         : UNKNOWN relativeLatency        : -1.0 resourcePrecedence     : 99 rmiPort                : 10002 role                   : slave

The reason is that some transactions on the slave from a previous installation were not cleared properly.

Solution:

To clear old transactions from the slave:

[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl -service slave reset -all Do you really want to delete replication service slave completely? [yes/NO] yes [root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl online [root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl status Processing status command... NAME VALUE ---- ----- appliedLastEventId : NONE appliedLastSeqno : -1 appliedLatency : -1.0 autoRecoveryEnabled: false autoRecoveryTotal: 0 channels : 1 clusterName: slave currentEventId : NONE currentTimeMillis: 1536839732221 dataServerHost : 172.17.4.106 extensions : host : 172.17.4.106 latestEpochNumber: -1 masterConnectUri : thl://mysql-stg:2112/ masterListenUri: null maximumStoredSeqNo : -1 minimumStoredSeqNo : -1 offlineRequests: NONE pendingError : NONE pendingErrorCode : NONE pendingErrorEventId: NONE pendingErrorSeqno: -1 pendingExceptionMessage: NONE pipelineSource : thl://mysql-stg:2112/ relativeLatency: -1.0 resourcePrecedence : 99 rmiPort: 10002 role : slave seqnoType: java.lang.Long serviceName: slave serviceType: local simpleServiceName: slave siteName : default sourceId : 172.17.4.106 state: ONLINE timeInStateSeconds : 4.476 timezone : GMT transitioningTo: uptimeSeconds: 77.996 useSSLConnection : false version: Tungsten Replicator 5.2.1 Finished status command... 6.0. Conclusion:

Tungsten Replicator is a great tool when it comes to replicating data between heterogeneous data sources. Once you understand how it works, it is easy to configure and it performs efficiently.

 

Image Courtesy : Photo by Karl JK Hedin on Unsplash

Transaction Processing in NewSQL

This is a list of references for transaction processing in NewSQL systems. The work is exciting. I don't have much to add and wrote this to avoid losing interesting links. My focus is on OLTP, but some of these systems support more than that.

By NewSQL I mean the following. I am not trying to define "NewSQL" for the world:
  1. Support for multiple nodes because the storage/compute on one node isn't sufficient.
  2. Support for SQL with ACID transactions. If there are shards then cross-shard operations can be consistent and isolated.
  3. Replication does not prevent the properties listed above when you are willing to pay the price in commit overhead. Alas, synchronous geo-replication is slow, and a too-slow commit is another form of downtime. I hope NewSQL systems make this less of a problem (async geo-replication for some or all commits, commutative operations). Contention and conflict are common in OLTP, so it is important to understand the minimum time between commits to a single row, or the maximum number of commits/second to a single row (see the rough arithmetic below the list).
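To make that last point concrete with some back-of-the-envelope arithmetic (the numbers are illustrative, not measurements): if synchronous geo-replication adds 100 ms to every commit, then updates to any single hot row serialize behind one another, and that row can sustain at most about 1 / 0.1 s = 10 commits per second, no matter how many nodes are in the cluster. Even at 5 ms of commit overhead the ceiling is only around 200 commits per second per row.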
NewSQL Systems
  • MySQL Cluster - this was NewSQL before NewSQL was a thing. There is a nice book that explains the internals. There is a company that uses it to make HDFS better. Cluster seems to be more popular for uses other than web-scale workloads.
  • VoltDB - another early NewSQL system that is still getting better. It was after MySQL Cluster but years before Spanner and came out of the H-Store research effort.
  • Spanner - XA across-shards, Paxos across replicas, special hardware to reduce clock drift between nodes. Sounds amazing, but this is Google so it just works. See the papers that explain the system and support for SQL. This got the NewSQL movement going.
  • CockroachDB - the answer to implementing Spanner without GPS and atomic clocks. From that URL they explain it as "while Spanner always waits after writes, CockroachDB sometimes waits before reads". It uses RocksDB and they help make it better.
  • FaunaDB - FaunaDB is inspired by Calvin and Daniel Abadi explains the difference between it and Spanner -- here and here. Abadi is great at explaining distributed systems, see his work on PACELC (and the pdf). A key part of Calvin is that "Calvin uses preprocessing to order transactions. All transactions are inserted into a distributed, replicated log before being processed." This approach might limit the peak TPS on a large cluster, but I assume that doesn't matter for a large fraction of the market.
  • YugaByte - another user of RocksDB. There is much discussion about it in the recent Abadi post. Their docs are amazing -- slides, transaction IO path, single-shard write IO path, distributed ACID and single-row ACID.
  • TiDB - I don't know much about it but they are growing fast and are part of the MySQL community. It uses RocksDB (I shouldn't have forgotten that).
Other relevant systems

Prepared Statements for MySQL: PDO, MySQLi, and X DevAPI

Recently I ran across a prominent PHP developer who incorrectly claimed that only PDO allows binding values to variables for prepared statements. A lot of developers use prepared statements to reduce the potential for SQL injection, and that is a good first step. But there are some features you may not know about.

What is a Prepared Statement?
The MySQL Manual states: "The MySQL database supports prepared statements. A prepared statement or a parameterized statement is used to execute the same statement repeatedly with high efficiency."

So far, so good. There is also a performance aspect to consider. From the same source: "The prepared statement execution consists of two stages: prepare and execute. At the prepare stage a statement template is sent to the database server. The server performs a syntax check and initializes server internal resources for later use."

So it is a two-step process: set up the query as a template and then plug in the values. If you need to reuse the query, just plug new values into the template.
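The same two-step flow also exists at the SQL level, independent of any PHP extension. Here is a minimal sketch using MySQL's server-side PREPARE/EXECUTE syntax, borrowing the fruit table from the PDO example below:

PREPARE fruit_stmt FROM 'SELECT name, colour, calories FROM fruit WHERE calories < ? AND colour = ?';
-- Plug values into the template via user variables
SET @cal = 150, @col = 'red';
EXECUTE fruit_stmt USING @cal, @col;
-- Reuse the same template with new values
SET @cal = 175, @col = 'yellow';
EXECUTE fruit_stmt USING @cal, @col;
-- Release the statement when done
DEALLOCATE PREPARE fruit_stmt;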

So let's look at how it is done.

PDO
On PHP.NET, there are a lot of really great examples. Question marks (?) are used as placeholders that will be filled in at execution time. 

$sth = $dbh->prepare('SELECT name, colour, calories
    FROM fruit
    WHERE calories < ? AND colour = ?');
$sth->execute(array(150, 'red'));
$red = $sth->fetchAll();
$sth->execute(array(175, 'yellow'));
$yellow = $sth->fetchAll();

So that is the basics. But what do they look like with the other two extensions?


MySQLi
So what does the MySQLi version look like? Once again question marks are used as placeholders.

$stmt = $mysqli->prepare("INSERT INTO CountryLanguage VALUES (?, ?, ?, ?)");

$stmt->bind_param('sssd', $code, $language, $official, $percent);
$code = 'DEU';
$language = 'Bavarian';
$official = "F";
$percent = 11.2;
/* execute prepared statement */
$stmt->execute();
printf("%d Row inserted.\n", $stmt->affected_rows);

But what is that sssd stuff? That is where you declare the type of each variable you want to bind. Use 's' for a string, 'i' for an integer, 'd' for a double, and 'b' for a blob (binary large object). So you get the advantage of type checking.

X DevAPI

The much newer X DevAPI is for the new X Protocol and the MySQL Document Store.  Unlike the other two examples it is not Structured Query Language (SQL) based.


$res = $coll->modify('name like :name')
            ->arrayInsert('job[0]', 'Calciatore')
            ->bind(['name' => 'ENTITY'])
            ->execute();

$res = $table->delete()
             ->orderby('age desc')
             ->where('age < 20 and age > 12 and name != :name')
             ->bind(['name' => 'Tierney'])
             ->limit(2)
             ->execute();

Note that this is not an object-relational mapper: it is the protocol itself, not a layer mapping objects to SQL.

Wrap Up

So now you know how to use prepared statements with all three PHP MySQL extensions.

ClickHouse: Two Years!

Following my post from a year ago https://www.percona.com/blog/2017/07/06/clickhouse-one-year/, I wanted to review what happened in ClickHouse during this year.
There is indeed some interesting news to share.

1. ClickHouse in DB-Engines Ranking. It did not quite get into the top 100, but the climb from position 174 to 106 is still impressive. Its DB-Engines Ranking score tripled from 0.54 last September to 1.57 this September.

And indeed, in my conversations with customers and partners, the narrative has changed from “ClickHouse, what is it?” to “We are using or considering ClickHouse for our analytics needs”.

2. ClickHouse changed its versioning schema. Unfortunately it changed from the unconventional …; 1.1.54390; 1.1.54394 naming structure to the still unconventional 18.6.0; 18.10.3; 18.12.13 naming structure, where “18” is the year of the release.

Now to the more interesting technical improvements.

3. Support for the more traditional JOIN syntax. Now if you join two tables you can use SELECT ... FROM tab1 ANY LEFT JOIN tab2 ON tab1_col=tab2_col.

So now, if we take a query from the workload described in https://www.percona.com/blog/2017/06/22/clickhouse-general-analytical-workload-based-star-schema-benchmark/, we can write this:

SELECT C_REGION, sum(LO_EXTENDEDPRICE * LO_DISCOUNT) FROM lineorder ANY INNER JOIN customer ON LO_CUSTKEY=C_CUSTKEY WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25) GROUP BY C_REGION;

instead of the monstrous:

SELECT C_REGION, sum(LO_EXTENDEDPRICE * LO_DISCOUNT) FROM lineorder ANY INNER JOIN ( SELECT C_REGION, C_CUSTKEY AS LO_CUSTKEY FROM customer ) USING (LO_CUSTKEY) WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25) GROUP BY C_REGION;

4. Support for DELETE and UPDATE operations. This has probably been the most requested feature since the first ClickHouse release.
ClickHouse uses an LSM-tree-like structure—MergeTree—which is not friendly to single-row operations. To emphasize this limitation, ClickHouse uses the ALTER TABLE UPDATE / ALTER TABLE DELETE syntax to make clear that these are executed as bulk operations, so please treat them as such. Updating or deleting rows in ClickHouse should be an exceptional operation, rather than a part of your day-to-day workload.

We can update a column like this: ALTER TABLE lineorder UPDATE LO_DISCOUNT = 5 WHERE LO_CUSTKEY = 199568
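A delete uses the same mutation syntax. As a sketch, reusing the hypothetical customer key from the update above:

ALTER TABLE lineorder DELETE WHERE LO_CUSTKEY = 199568;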

5. ClickHouse added a feature which I call Dictionary Compression, but ClickHouse uses the name “LowCardinality”. It is still experimental, but I hope it will soon be production ready. Basically, it internally replaces long strings with a short list of enumerated values.
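Because the feature was still experimental at that time, a session typically had to opt in before LowCardinality columns could be created. A minimal sketch, assuming the experimental setting name used in the 18.x releases:

SET allow_experimental_low_cardinality_type = 1;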

For example, consider the lineorder table from our example, which contains 600,037,902 rows but has only five distinct values in the column LO_ORDERPRIORITY:

SELECT DISTINCT LO_ORDERPRIORITY FROM lineorder

┌─LO_ORDERPRIORITY─┐
│ 1-URGENT         │
│ 5-LOW            │
│ 4-NOT SPECIFIED  │
│ 2-HIGH           │
│ 3-MEDIUM         │
└──────────────────┘

So we can define our table as:

CREATE TABLE lineorder_dict (
    LO_ORDERKEY UInt32,
    LO_LINENUMBER UInt8,
    LO_CUSTKEY UInt32,
    LO_PARTKEY UInt32,
    LO_SUPPKEY UInt32,
    LO_ORDERDATE Date,
    LO_ORDERPRIORITY LowCardinality(String),
    LO_SHIPPRIORITY UInt8,
    LO_QUANTITY UInt8,
    LO_EXTENDEDPRICE UInt32,
    LO_ORDTOTALPRICE UInt32,
    LO_DISCOUNT UInt8,
    LO_REVENUE UInt32,
    LO_SUPPLYCOST UInt32,
    LO_TAX UInt8,
    LO_COMMITDATE Date,
    LO_SHIPMODE LowCardinality(String)
) Engine=MergeTree(LO_ORDERDATE, (LO_ORDERKEY, LO_LINENUMBER), 8192);

How does this help? Firstly, it offers space savings. The table will take less space in storage, as it will use integer values instead of strings. And secondly, performance. The filtering operation will be executed faster.

For example: here’s a query against the table with LO_ORDERPRIORITY stored as String:

SELECT count(*) FROM lineorder WHERE LO_ORDERPRIORITY = '2-HIGH'

┌───count()─┐
│ 119995822 │
└───────────┘

1 rows in set. Elapsed: 0.859 sec. Processed 600.04 million rows, 10.44 GB (698.62 million rows/s., 12.16 GB/s.)

And now the same query against table with LO_ORDERPRIORITY as LowCardinality(String):

SELECT count(*) FROM lineorder_dict WHERE LO_ORDERPRIORITY = '2-HIGH'

┌───count()─┐
│ 119995822 │
└───────────┘

1 rows in set. Elapsed: 0.350 sec. Processed 600.04 million rows, 600.95 MB (1.71 billion rows/s., 1.72 GB/s.)

This is 0.859 sec vs 0.350 sec (for the LowCardinality case).

Unfortunately this feature is not optimized for all use cases, and in aggregation it actually performs more slowly.

An aggregation query against table with LO_ORDERPRIORITY as String:

SELECT DISTINCT LO_ORDERPRIORITY FROM lineorder

┌─LO_ORDERPRIORITY─┐
│ 4-NOT SPECIFIED  │
│ 1-URGENT         │
│ 2-HIGH           │
│ 3-MEDIUM         │
│ 5-LOW            │
└──────────────────┘

5 rows in set. Elapsed: 1.200 sec. Processed 600.04 million rows, 10.44 GB (500.22 million rows/s., 8.70 GB/s.)

Versus an aggregation query against table with LO_ORDERPRIORITY as LowCardinality(String):

SELECT DISTINCT LO_ORDERPRIORITY FROM lineorder_dict

┌─LO_ORDERPRIORITY─┐
│ 4-NOT SPECIFIED  │
│ 1-URGENT         │
│ 2-HIGH           │
│ 3-MEDIUM         │
│ 5-LOW            │
└──────────────────┘

5 rows in set. Elapsed: 2.334 sec. Processed 600.04 million rows, 600.95 MB (257.05 million rows/s., 257.45 MB/s.)

This is 1.200 sec vs 2.334 sec (for the LowCardinality case).

6. And the last feature I want to mention is better support for Tableau Software, which required ODBC drivers. It may not seem significant, but Tableau is the number one tool for data analysts, and by supporting it, ClickHouse will reach a much wider audience.

Summing up: ClickHouse has definitely become much more user friendly over the past year!

The post ClickHouse: Two Years! appeared first on Percona Database Performance Blog.

MySQL Explain Example – Explaining MySQL EXPLAIN using StackOverflow data

I personally believe that the best way to deliver a complicated message to an audience is by using a simple example. So in this post, I chose to demonstrate how to obtain insights from MySQL’s EXPLAIN output by using a simple SQL query which fetches data from StackOverflow’s publicly available dataset.

The EXPLAIN command provides information about how MySQL executes queries. EXPLAIN can work with SELECT, DELETE, INSERT, REPLACE, and UPDATE statements.

We’ll first analyze the original query, then attempt to optimize the query and look into the optimized query’s execution plan to see what changed and why.

This is the first article in a series of posts. Each post will walk you through a more advanced SQL query than the previous post, while demonstrating more insights which can be obtained from MySQL’s execution plans.

The query and database structure

The following is the structure of the two tables used by this example query (posts and votes):

CREATE TABLE `posts` (
  `Id` int(11) NOT NULL,
  `AcceptedAnswerId` int(11) DEFAULT NULL,
  `AnswerCount` int(11) DEFAULT NULL,
  `Body` longtext CHARACTER SET utf8 NOT NULL,
  ...
  `OwnerUserId` int(11) DEFAULT NULL,
  ...
  `Title` varchar(250) CHARACTER SET utf8 DEFAULT NULL,
  `ViewCount` int(11) NOT NULL,
  PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `votes` (
  `Id` int(11) NOT NULL,
  `PostId` int(11) NOT NULL,
  `UserId` int(11) DEFAULT NULL,
  `BountyAmount` int(11) DEFAULT NULL,
  `VoteTypeId` int(11) NOT NULL,
  `CreationDate` datetime NOT NULL,
  PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

The following SQL query will find the details of users who added my StackOverflow questions to their favorites list. For the sake of this example, my user id is 12345678.

SELECT v.UserId, COUNT(*) AS FavoriteCount
FROM Votes v
JOIN Posts p ON p.id = v.PostId
WHERE p.OwnerUserId = 12345678
  AND v.VoteTypeId = 5 -- (Favorites vote)
GROUP BY v.UserId
ORDER BY FavoriteCount DESC
LIMIT 100;

The original query’s execution duration is very long. I stopped waiting and cancelled the execution after more than a minute had passed.

Explaining the original EXPLAIN

This is the original EXPLAIN plan for this query:
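It is obtained by simply prefixing the query above with EXPLAIN:

EXPLAIN
SELECT v.UserId, COUNT(*) AS FavoriteCount
FROM Votes v
JOIN Posts p ON p.id = v.PostId
WHERE p.OwnerUserId = 12345678
  AND v.VoteTypeId = 5 -- (Favorites vote)
GROUP BY v.UserId
ORDER BY FavoriteCount DESC
LIMIT 100;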

Before rushing to optimize the query, let’s take a closer look at the output of the EXPLAIN command, to make sure we fully understand all aspects of it. The first thing we notice is that it can include more than one row. The query we’re analyzing involves two tables, which are joined using an inner join, and each of these tables is represented by a different row in the execution plan above. As an analogy to the coding world, you can think of an inner join as very similar to a nested loop: MySQL chooses the table it thinks will be the best one to start the journey with (the outer “loop”) and then accesses the next table using the values from the outer “loop”.

Each of the rows in the EXPLAIN contains the following fields:

  • id – In most cases, the id field presents the sequential number of the SELECT query this row belongs to. The query above contains no subqueries or unions, so the id for both rows is 1, as there is actually only one query.
  • select_type – The type of SELECT query. In our case, it’s a simple query as it contains no subqueries or unions. In more complex cases, it will contain other types such as SUBQUERY (for subqueries), UNION (second or later statements in a union), DERIVED (a derived table) and others. More information about select types can be found in MySQL’s docs.
  • table – the table name, or alias, this row refers to. In the screenshot above, you can see ‘v’ and ‘p’ mentioned, as those are the aliases defined for the tables votes and posts.
  • type – defines how the tables are accessed / joined. The most common access types you’ll see, sorted from worst to best, are: ALL, index, range, ref, eq_ref, const, system. As you can see in the EXPLAIN, the votes table is accessed first, using the ALL access type, which means MySQL will scan the entire table without using any index, going through over 145 million records. The posts table is then accessed using the eq_ref access type. Other than the system and const types, eq_ref is the best possible join type: the database will access one row from this table for each combination of rows from the previous tables.
  • possible_keys – The candidate indexes MySQL can choose from to look up rows in the table. Some of the indexes in this list may actually be irrelevant, as a result of the execution order MySQL chose. In general, MySQL can use indexes to join tables. That said, it won’t use an index on the first table’s join column, as it will go through all of that table’s rows anyway (except rows filtered by the WHERE clause).
  • key – This column indicates the actual index MySQL decided to use. It doesn’t necessarily mean it will use the entire index, as it can choose to use only part of the index, from the left-most side of it.
  • key_len – This is one of the important columns in the EXPLAIN output. It indicates the length of the key that MySQL decided to use, in bytes. In the EXPLAIN output above, MySQL uses the entire PRIMARY index (4 bytes). We know that because the only column in the PRIMARY index is Id, which is defined as an INT => 4 bytes. Unfortunately, there is no easier way to figure out which part of the index is used by MySQL, other than adding up the lengths of all columns in the index and comparing the total to the key_len value (see the short worked example after this list).
  • rows – Indicates the number of rows MySQL believes it must examine from this table to execute the query. This is only an estimate. Usually, high row counts mean there is room for query optimization.
  • filtered – The estimated percentage of rows that remain after applying the conditions in the WHERE clause; these rows will be joined with the table in the next row of the EXPLAIN plan. As mentioned previously, this is a guesstimate as well, so MySQL can be wrong with this estimation.
  • extra – Contains more information about the query processing. Let’s look into the extras for our query:
    • using where – The WHERE clause is used to restrict which rows are fetched from the current table (votes) and matched with the next table (posts).
    • using temporary – As part of the query processing, MySQL has to create a temporary table, which in many cases can result in a performance penalty. In most cases, it will indicate that one of the ORDER BY or GROUP BY clauses is executed without using an index. It can also happen if the GROUP BY and ORDER BY clauses include different columns (or in different order).
    • using filesort – MySQL is forced to perform another pass on the results of the query to sort them. In many cases, this can also result in a performance penalty.
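As a short worked example of the key_len arithmetic described above, consider the composite index on the votes table that is added later in this post (both columns are NOT NULL INTs):

-- votes_idx_votetypeid_postid (`VoteTypeId`, `PostId`)
--   VoteTypeId INT NOT NULL -> 4 bytes
--   PostId     INT NOT NULL -> 4 bytes
-- key_len = 8 means both columns of the index are used for the lookup;
-- key_len = 4 would mean only the leading VoteTypeId column is used.
-- A nullable column would add 1 extra byte, and a VARCHAR column counts
-- its maximum byte length plus 2 length bytes.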
Optimizing a slow query using MySQL’s EXPLAIN

What can we learn from this specific EXPLAIN plan?

  1. MySQL chooses to start with the votes table. The EXPLAIN output shows that it has to go through 145,045,878 rows, which is essentially all the rows in the table. That’s nasty.
  2. The Extra column indicates that MySQL tries to reduce the number of rows it inspects using the WHERE clause, but it estimates that this will only reduce the rows to 10%, which is still bad: 10% of 145 million is roughly 14 million rows. A possible conclusion is that the condition v.VoteTypeId = 5 isn’t selective enough, so millions of rows will be joined to the next table.
  3. Looking at the WHERE clause, we can see there is another condition, p.OwnerUserId = 12345678, which looks very selective and should drastically reduce the number of rows to inspect. The posts table contains ~40 million records, while only 57 rows are returned when applying the condition p.OwnerUserId = 12345678, so it is indeed very selective. In this case, it would be best if MySQL had started the execution with the posts table, to take advantage of this selective condition. Later in this post, we’ll see which change gets MySQL to choose that order.
  4. Looking at the possible_keys values for both tables, we can see that MySQL doesn’t have a lot of optional indexes to choose from. More precisely, there is no index in the possible_keys that contains the columns mentioned in the WHERE clause.

Therefore, we’ll add the following two indexes. Each index starts with the column mentioned in the WHERE clause. The index for the votes table also includes the joined column.

ALTER TABLE `Posts` ADD INDEX `posts_idx_owneruserid` (`OwnerUserId`);
ALTER TABLE `Votes` ADD INDEX `votes_idx_votetypeid_postid` (`VoteTypeId`, `PostId`);

The optimized query’s EXPLAIN output

This is the new EXPLAIN output after adding the indexes:

What changed?

  1. The first change we see is that MySQL chose to start with the posts table this time (hurray!). It uses the new index to filter the rows and estimates that all but 57 records will be filtered out; those 57 are then joined to the second table, votes.
  2. The second change we see, by looking at the key column, is that indexes are used for lookups and filtering in both tables.
  3. By looking at the key_len column, we can see that the composite index for the votes table is used in full – 8 bytes, which covers both the VoteTypeId and PostId columns.
  4. The last important change we see is the number of rows MySQL estimates it needs to inspect in order to evaluate the query. It estimates it needs to inspect 57 * 2 = 114 rows, which is great compared to the millions of records in the original execution path.

So looking at the execution duration now, we can see a drastic improvement, from a very slow query which never actually returned, to only 0.063 seconds:

Conclusions

Summarizing the most important aspects of reading an EXPLAIN and optimizing this simple query:

  1. Run the EXPLAIN command to inspect the planned execution path for your SQL query.
  2. Look at the table order MySQL chose for the execution. Does it make sense? If not, ask yourself why MySQL got it wrong and what’s missing.
  3. Find the conditions in the WHERE clause which are the most selective ones and make sure you create the optimal indexes to include them. You can read more information on how to create the optimal indexes here.
  4. Look for the places MySQL doesn’t use an index for look-ups and filtering, as those may be the weak spots.
  5. Look for the rows where MySQL shows a very high estimation of rows it needs to inspect to evaluate the query.

In our next post for this series, we’ll analyze a more complex query, looking into more insights the EXPLAIN can provide for other query structures such as sub-queries.

How To Fix MySQL Replication After an Incompatible DDL Command

MySQL supports replicating to a slave that is one release higher. This allows us to easily upgrade our MySQL setup to a new version by promoting the slave and pointing the application to it. However, though unsupported, there are times when the deployed slave runs a version one release lower than the master. In this scenario, if your application performed much better on the older version of MySQL, you would like a convenient option to downgrade: you can simply promote the slave to get the old performance back.

The MySQL manual says that ROW-based replication can be used to replicate to a lower version, provided that no replicated DDLs are incompatible with the slave. One such incompatible command is ALTER USER, which is a new feature in MySQL 5.7 and not available on 5.6:

ALTER USER 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Executing that command would break replication. Here is an example of a broken slave in non-GTID replication:

*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: repl Master_Port: 5723 Connect_Retry: 60 Master_Log_File: mysql-bin.000002 Read_Master_Log_Pos: 36915649 Relay_Log_File: mysql_sandbox5641-relay-bin.000006 Relay_Log_Pos: 36174552 Relay_Master_Log_File: mysql-bin.000002 Slave_IO_Running: Yes Slave_SQL_Running: No *** redacted *** Last_Errno: 1064 Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'' Skip_Counter: 0 Exec_Master_Log_Pos: 36174373 Relay_Log_Space: 36916179 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No *** redacted *** Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1064 Last_SQL_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'' Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_UUID: 00005723-0000-0000-0000-000000005723 *** redacted *** Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: 180918 22:03:40 *** redacted *** Auto_Position: 0 1 row in set (0.00 sec)

Skipping the statement does not resume replication:

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.02 sec)

mysql> SET GLOBAL sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.00 sec)

mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)

mysql> SHOW SLAVE STATUS\G

Fixing non-GTID replication

When you check slave status, replication still isn’t fixed. To fix it, you must manually skip to the next binary log position. The current binary log (Relay_Master_Log_File) and position (Exec_Master_Log_Pos) executed are mysql-bin.000002 and 36174373 respectively. We can use mysqlbinlog on the master to determine the next position:

mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=36174373 /ssd/sandboxes/msb_5_7_23/data/mysql-bin.000002 | head -n 30 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 36174373 #180918 22:03:40 server id 1 end_log_pos 36174438 CRC32 0xc7e1e553 Anonymous_GTID last_committed=19273 sequence_number=19277 rbr_only=no SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/; # at 36174438 #180918 22:03:40 server id 1 end_log_pos 36174621 CRC32 0x2e5bb235 Query thread_id=563 exec_time=0 error_code=0 SET TIMESTAMP=1537279420/*!*/; SET @@session.pseudo_thread_id=563/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1436549152/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C latin1 *//*!*/; SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8' /*!*/; # at 36174621 #180918 22:03:40 server id 1 end_log_pos 36174686 CRC32 0x86756b3f Anonymous_GTID last_committed=19275 sequence_number=19278 rbr_only=yes /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/; SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/; # at 36174686 #180918 22:03:40 server id 1 end_log_pos 36174760 CRC32 0x30e663f9 Query thread_id=529 exec_time=0 error_code=0 SET TIMESTAMP=1537279420/*!*/; BEGIN /*!*/; # at 36174760 #180918 22:03:40 server id 1 end_log_pos 36174819 CRC32 0x48054daf Table_map: `sbtest`.`sbtest1` mapped to number 226

Based on the output above, the next binary log position is 36174621. To fix the slave, run:

STOP SLAVE; CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=36174621; START SLAVE;

Verify if the slave threads are now running by executing SHOW SLAVE STATUS\G

Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: repl Master_Port: 5723 Connect_Retry: 60 Master_Log_File: mysql-bin.000002 Read_Master_Log_Pos: 306841423 Relay_Log_File: mysql_sandbox5641-relay-bin.000002 Relay_Log_Pos: 190785290 Relay_Master_Log_File: mysql-bin.000002 Slave_IO_Running: Yes Slave_SQL_Running: Yes *** redacted *** Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 226959625 Relay_Log_Space: 270667273 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No *** redacted *** Seconds_Behind_Master: 383 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_UUID: 00005723-0000-0000-0000-000000005723 Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Opening tables Master_Retry_Count: 86400 *** redacted *** Auto_Position: 0

To make the slave consistent with the master, execute the compatible query on the slave.

SET SESSION sql_log_bin = 0; GRANT USAGE ON *.* TO 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Done.

GTID replication

For GTID replication, in addition to injecting an empty transaction for the offending statement, you’ll need to skip past it using the non-GTID solution provided above. Once replication is running again, flip it back to GTID.

Here’s an example of a broken GTID slave:

mysql> SHOW SLAVE STATUS\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: repl Master_Port: 5723 Connect_Retry: 60 Master_Log_File: mysql-bin.000003 Read_Master_Log_Pos: 14364967 Relay_Log_File: mysql_sandbox5641-relay-bin.000002 Relay_Log_Pos: 8630318 Relay_Master_Log_File: mysql-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: No *** redacted *** Last_Errno: 1064 Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'' Skip_Counter: 0 Exec_Master_Log_Pos: 12468343 Relay_Log_Space: 10527158 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No *** redacted *** Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1064 Last_SQL_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'' Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_UUID: 00005723-0000-0000-0000-000000005723 Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: 180918 22:32:28 Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: 00005723-0000-0000-0000-000000005723:2280-8530 Executed_Gtid_Set: 00005723-0000-0000-0000-000000005723:1-7403 Auto_Position: 1 1 row in set (0.00 sec) mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed'; +---------------+---------------------------------------------+ | Variable_name | Value | +---------------+---------------------------------------------+ | gtid_executed | 00005723-0000-0000-0000-000000005723:1-7403 | +---------------+---------------------------------------------+ 1 row in set (0.00 sec)

Since the last sequence executed is 7403, you’ll need to create an empty transaction for the offending sequence 7404.

STOP SLAVE; SET GTID_NEXT='00005723-0000-0000-0000-000000005723:7404'; BEGIN; COMMIT; SET GTID_NEXT=AUTOMATIC; START SLAVE;

Note: If you have MTS enabled, you can also get the offending GTID coordinates from Last_SQL_Error of SHOW SLAVE STATUS\G

The next step is to find the next binary log position. The current binary log (Relay_Master_Log_File) and position (Exec_Master_Log_Pos) executed are mysql-bin.000003 and 12468343 respectively. We can again use mysqlbinlog on the master to determine the next position:

mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=12468343 /ssd/sandboxes/msb_5_7_23/data/mysql-bin.000003 | head -n 30 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 12468343 #180918 22:32:19 server id 1 end_log_pos 12468408 CRC32 0x259ee085 GTID last_committed=7400 sequence_number=7404 rbr_only=no SET @@SESSION.GTID_NEXT= '00005723-0000-0000-0000-000000005723:7404'/*!*/; # at 12468408 #180918 22:32:19 server id 1 end_log_pos 12468591 CRC32 0xb349ad80 Query thread_id=142 exec_time=0 error_code=0 SET TIMESTAMP=1537281139/*!*/; SET @@session.pseudo_thread_id=142/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1436549152/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C latin1 *//*!*/; SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8' /*!*/; # at 12468591 #180918 22:32:19 server id 1 end_log_pos 12468656 CRC32 0xb2019f3f GTID last_committed=7400 sequence_number=7405 rbr_only=yes /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/; SET @@SESSION.GTID_NEXT= '00005723-0000-0000-0000-000000005723:7405'/*!*/; # at 12468656 #180918 22:32:19 server id 1 end_log_pos 12468730 CRC32 0x76b5ea6c Query thread_id=97 exec_time=0 error_code=0 SET TIMESTAMP=1537281139/*!*/; BEGIN /*!*/; # at 12468730 #180918 22:32:19 server id 1 end_log_pos 12468789 CRC32 0x48f0ba6d Table_map: `sbtest`.`sbtest8` mapped to number 115

Based on the output above, the next binary log position is 12468591. To fix the slave, run:

STOP SLAVE; CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=12468591, MASTER_AUTO_POSITION=0; START SLAVE;

Notice that I added MASTER_AUTO_POSITION=0 above to disable GTID replication for now. You can run SHOW SLAVE STATUS\G to verify that replication is now running fine:

mysql> SHOW SLAVE STATUS\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: repl Master_Port: 5723 Connect_Retry: 60 Master_Log_File: mysql-bin.000003 Read_Master_Log_Pos: 446194575 Relay_Log_File: mysql_sandbox5641-relay-bin.000002 Relay_Log_Pos: 12704248 Relay_Master_Log_File: mysql-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes *** redacted *** Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 25172522 Relay_Log_Space: 433726939 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No *** redacted *** Seconds_Behind_Master: 2018 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_UUID: 00005723-0000-0000-0000-000000005723 Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Reading event from the relay log Master_Retry_Count: 86400 *** redacted *** Retrieved_Gtid_Set: 00005723-0000-0000-0000-000000005723:7405-264930 Executed_Gtid_Set: 00005723-0000-0000-0000-000000005723:1-14947 Auto_Position: 0

Since it’s running fine, you can now revert to GTID replication:

STOP SLAVE; CHANGE MASTER TO MASTER_AUTO_POSITION=1; START SLAVE;

Finally, to make the slave consistent with the master, execute the compatible query on the slave.

SET SESSION sql_log_bin = 0; GRANT USAGE ON *.* TO 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Summary

In this article, I’ve shared how to fix replication when it breaks due to an incompatible command being replicated to the slave. In fact, I’ve only identified ALTER USER as an incompatible command for 5.6. If there are other incompatible commands, please share them in the comment section. Thanks in advance.

Update: 
I filed a bug at https://bugs.mysql.com/bug.php?id=92629 to verify whether the errors I’ve encountered here are a bug or undocumented behavior.

The post How To Fix MySQL Replication After an Incompatible DDL Command appeared first on Percona Database Performance Blog.

This Week in Data with Colin Charles #54: Percona Server for MySQL is Alpha

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

I consider this to be the biggest news for the week: Alpha Build of Percona Server for MySQL 8.0. Experiment with it in a Docker container. It is missing column compression with dictionary support, native partitioning for TokuDB and MyRocks (excited to see that this is coming!), and encryption key rotation and scrubbing. All in, this should be a fun release to try, test, and also to file bugs for!

Database paradigms are changing, and it is interesting to see Cloudflare introducing Workers KV, a key-value store that is eventually consistent and highly distributed (across their global network of 152+ data centers). You can have up to 1 billion keys per namespace, keys up to 2kB in size, values up to 64kB, and eventual global consistency within 10 seconds. Read more about the cost and other technical details too.

For some quick glossing, from a MySQL Federal Account Manager, comes Why MySQL is Harder to Sell Than Oracle (from someone who has done both). Valid concerns, and always interesting to hear the barriers MySQL faces even after 23 years in existence! For analytics, maybe this is where the likes of MariaDB ColumnStore or ClickHouse might come into play.

Lastly, for all of you asking me about when Percona Live Europe Frankfurt 2018 speaker acceptances and agendas are to be released, I am told by a good source that it will be announced early next week. So register already!

Releases

Link List

Upcoming Appearances

Feedback

I look forward to feedback/tips via Twitter @bytebot.

The post This Week in Data with Colin Charles #54: Percona Server for MySQL is Alpha appeared first on Percona Database Performance Blog.

Scaling Percona Monitoring and Management (PMM)

Starting with PMM 1.13, PMM uses Prometheus 2 for metrics storage, which tends to be the heaviest consumer of CPU and RAM. Thanks to Prometheus 2 performance improvements, PMM can scale to more than 1000 monitored nodes per instance in its default configuration. In this blog post we will look into PMM scaling and capacity planning—how to estimate the resources required, and what drives resource consumption.

We have now tested PMM with up to 1000 nodes, using a virtualized system with 128GB of memory, 24 virtual cores, and SSD storage. We found PMM scales pretty linearly with the available memory and CPU cores, and we believe that a higher number of nodes could be supported with more powerful hardware.

What drives resource usage in PMM ?

Depending on your system configuration and workload, a single node can generate very different loads on the PMM server. The main factors that impact the performance of PMM are:

  1. Number of samples (data points) injected into PMM per second
  2. Number of distinct time series they belong to (cardinality)
  3. Number of distinct query patterns your application uses
  4. Number of queries you run against PMM, through the user interface or the API, and their complexity

These specifically can be impacted by:

  • Software version – modern database software versions expose more metrics
  • Software configuration – some metrics are only exposed in certain configurations
  • Workload – a large number of database objects and high concurrency will increase both the number of samples ingested and their cardinality.
  • Exporter configuration – disabling collectors can reduce the amount of data collected
  • Scrape frequency –  controlled by METRICS_RESOLUTION

All these factors together may impact resource requirements by a factor of ten or more, so do your own testing to be sure. However, the numbers in this article should serve as good general guidance and a starting point for your research.

On the system supporting 1000 instances we observed the following performance:

As you can see, we have more than 2,000 scrapes/sec performed, providing almost two million samples/sec and more than eight million active time series. These are the main numbers that define the load placed on Prometheus.

Capacity planning to scale PMM

Both CPU and memory are very important resources for PMM capacity planning. Memory is the more important of the two, as Prometheus 2 does not have good options for limiting memory consumption. If you do not have enough memory to handle your workload, Prometheus will run out of memory and crash.

We recommend at least 2GB of memory for a production PMM Installation. A test installation with 1GB of memory is possible. However, it may not be able to monitor more than one or two nodes without running out of memory. With 2GB of memory you should be able to monitor at least five nodes without problem.

With powerful systems (8GB or more) you can have approximately eight systems per 1GB of memory, or about 15,000 samples ingested/sec per 1GB of memory.

To calculate the CPU usage resources required, allow for about 50 monitored systems per core (or 100K metrics/sec per CPU core).
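As a rough, hypothetical sizing exercise using these rules of thumb: a deployment of 400 monitored nodes would call for approximately 400 / 50 = 8 CPU cores and 400 / 8 = 50GB of memory, before allowing extra headroom for high-cardinality workloads or a faster scrape resolution.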

One problem you’re likely to encounter if you’re running PMM with 100+ instances is the “Home Dashboard”. This becomes way too heavy with such a large number of servers. We plan to fix this issue in future releases of PMM, but for now you can work around it in two simple ways:

You can select a single host, for example “pmm-server”, in your home dashboard and save it before adding a large number of hosts to the system.

Or you can make some other dashboard of your choice and set it as the home dashboard.

Summary
  • More than 1,000 monitored systems is possible per single PMM server
  • Your specific workload and configuration may significantly change the resources required
  • If deploying with 8GB or more, plan 50 systems per core, and eight systems per 1GB of RAM

The post Scaling Percona Monitoring and Management (PMM) appeared first on Percona Database Performance Blog.
