Planet MySQL

InnoDB: running out of AUTO_INCREMENT values

Most InnoDB primary keys are built on integer columns with the AUTO_INCREMENT option (which is a very good practice, for reasons that are outside the scope of this article). But we have to monitor that we are not going to run out of AUTO_INCREMENT values. If this happens, we will get errors like this:

ERROR 167 (22003): Out of range value for column 'a' at row 1

Obviously, when creating tables, we should use a type that is sufficiently big, and make it UNSIGNED to avoid wasting half of its range. But there are also some details about AUTO_INCREMENT that we should remember.

First, the value to monitor is not MAX(id), but the AUTO_INCREMENT column in information_schema.TABLES. This column shows the next AUTO_INCREMENT value that will be assigned.
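
A simple way to keep an eye on this is to query information_schema directly. A rough sketch (the 80% threshold and the UNSIGNED INT limit are only examples; adapt them to your column types):

SELECT TABLE_SCHEMA, TABLE_NAME, AUTO_INCREMENT
    FROM information_schema.TABLES
    WHERE AUTO_INCREMENT IS NOT NULL
      AND AUTO_INCREMENT > 0.8 * 4294967295   -- 80% of the UNSIGNED INT range
    ORDER BY AUTO_INCREMENT DESC;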

A used value is lost. It doesn’t matter if you delete a row: its AUTO_INCREMENT value will not be reused. This is why we should monitor the next value from information_schema: it is possible that we have used up all AUTO_INCREMENT values even though MAX(id) is much lower.

Values are lost even for transactions that fail or are rolled back. This is the reason why the next AUTO_INCREMENT value is generally higher, sometimes much higher, than MAX(id). This can happen especially if we have periodic bulk inserts that add many rows in one transaction: if such an operation fails, it wastes AUTO_INCREMENT values, and if it fails often, we could run out of values.
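
A quick way to see this in action, assuming a small InnoDB test table t with an AUTO_INCREMENT primary key (the table and column names are only illustrative):

START TRANSACTION;
INSERT INTO t (name) VALUES ('will be rolled back');
ROLLBACK;

-- The counter moved forward even though no row was stored:
SELECT AUTO_INCREMENT
    FROM information_schema.TABLES
    WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 't';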

Easy trick

An easy trick is to use a command like this:

ALTER TABLE table_name AUTO_INCREMENT = 1000;

But this only works if all values in the table are lower than the specified AUTO_INCREMENT. It can be useful if a bulk insert failed and no new rows have been added since, but typically it only postpones the problem.

Alternatively, we could renumber all ids to fill the holes, and then reset AUTO_INCREMENT. This is possible if, for example, the server is not used at night or during the weekend. But in other cases, it is not a viable solution: modifying a primary key is not a cheap operation for InnoDB, and most likely we will have to do this for many rows. Also, we would need to lock the table to prevent new rows (with high values) from being inserted in the process.

So, what can we do?

If we have a write-heavy workload, lowering the AUTO_INCREMENT values is not trivial, because new rows are being inserted all the time, and changing the id of many rows can be a long operation. If we do it in one transaction to make things faster, it locks the table. But we can do something similar to what pt-online-schema-change does to alter a table without locks. Let’s see the procedure step by step.

For our example, we will assume the following trivial table:

CREATE TABLE employee (
    id INT UNSIGNED AUTO_INCREMENT,
    name VARCHAR(100),
    PRIMARY KEY (id)
);

1)

First, let’s create a table with the same structure. The ids in the new table will start from 1. And let’s add a column that references the original table’s id.

CREATE TABLE tmp_employee LIKE employee;
ALTER TABLE tmp_employee ADD COLUMN old_id INT UNSIGNED NULL;
ALTER TABLE tmp_employee ADD UNIQUE INDEX idx_old_id (old_id);

2)

Now, we need to add a trigger to the original table, to add the new rows in the new table:

DELIMITER ||
CREATE TRIGGER employee_ai AFTER INSERT ON employee
FOR EACH ROW
BEGIN
    INSERT INTO tmp_employee (old_id, name) VALUES (NEW.id, NEW.name);
END ||
DELIMITER ;

Note that the original table’s id and the new table’s old_id columns are kept in sync.
This is just an example. In the real world, I would also create triggers for DELETE and UPDATE, as sketched below. These triggers need a way to find the matching row in the new table; that’s why old_id exists and is indexed.
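
For illustration, a minimal sketch of what the DELETE trigger could look like (the UPDATE trigger follows the same pattern; the trigger name is arbitrary):

DELIMITER ||
CREATE TRIGGER employee_ad AFTER DELETE ON employee
FOR EACH ROW
BEGIN
    -- Remove the copied row that points back to the deleted original row
    DELETE FROM tmp_employee WHERE old_id = OLD.id;
END ||
DELIMITER ;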

What if the table already has triggers? Fortunately MySQL 5.7 and MariaDB 10.2 support multiple triggers per timing/event. With older versions, just modify them and add the queries you need.

3)

Copy the rows from the original table to the new table. If the new table already has some rows (added by our INSERT trigger), we won’t copy them again.

INSERT INTO tmp_employee (old_id, name)
    SELECT id, name
        FROM employee
        WHERE (SELECT COUNT(*) FROM tmp_employee) = 0
           OR id < (SELECT MIN(old_id) FROM tmp_employee);

We still populate the old_id column so that the DELETE and UPDATE triggers can find the corresponding rows.

We didn’t mention the new table’s primary key in the INSERT, so its values are generated automatically, starting from 1.

4)

Up to this point, we haven’t caused any damage. But the next step is more dangerous. Before proceeding, verify that we didn’t do anything wrong and that none of our transactions failed.

5)

If everything’s ok, let’s switch the tables:

RENAME TABLE employee TO old_employee, tmp_employee TO employee;

This operation is atomic, so no query is expected to fail.

6)

Drop the old table and the old_id column.
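
Based on the object names used in this example, the cleanup could look like this (the single-column index on old_id is removed automatically when the column is dropped):

DROP TABLE old_employee;
ALTER TABLE employee DROP COLUMN old_id;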

Enjoy!


How to create mysql login-path

This is just a note to myself. I don’t do this often enough to remember the command, but whenever I’m searching for this, it takes half a minute to find it in the MySQL manual, so hopefully this gets indexed better (in my memory as well as in Google).

Here’s the simple command to create a login path:

mysql_config_editor set --login-path=mysql1 --host=localhost \
    --port=3306 --socket=/path/to/socket --user=root --password

Obviously you can remove just about anything and only leave the essentials.

Once that’s done, accessing different MySQL instances is as simple as mysql --login-path=mysql1, which is especially useful if you’re accessing different servers from one machine, or if you’re running several MySQL instances on the same machine.

More information on login paths here.


MySQL on Docker: ClusterControl and Galera Cluster on Docker Swarm

Our journey in adopting MySQL and MariaDB in containerized environments continues, with ClusterControl coming into the picture to facilitate deployment and management. We already have our ClusterControl image hosted on Docker Hub, where it can deploy different replication/cluster topologies on multiple containers. With the introduction of Docker Swarm, a native orchestration tool embedded inside Docker Engine, scaling and provisioning containers has become much easier. It also covers high availability by running services on multiple Docker hosts.

In this blog post, we’ll be experimenting with automatic provisioning of Galera Cluster on Docker Swarm with ClusterControl. ClusterControl would usually deploy database clusters on bare-metal, virtual machines and cloud instances. It relies on SSH (through libssh) as its core communication module to connect to the managed hosts, so no agents are required on those hosts. The same rule applies to containers, and that’s what we are going to show in this blog post.

ClusterControl as Docker Swarm Service

We have built a Docker image with extended logic to handle deployment in container environments in a semi-automatic way. The image is now available on Docker Hub and the code is hosted in our Github repository. Please note that only this image is capable of deploying on containers, and is not available in the standard ClusterControl installation packages.

The extended logic is inside deploy-container.sh, a script that monitors a custom table inside the CMON database called “cmon.containers”. Each newly created database container reports and registers itself in this table, and the script looks for new entries and performs the necessary actions using the ClusterControl CLI. The deployment is automatic, and you can monitor the progress directly from the ClusterControl UI or with the “docker logs” command.

Before we go further, take note of some prerequisites for running ClusterControl and Galera Cluster on Docker Swarm:

  • Docker Engine version 1.12 and later.
  • Docker Swarm Mode is initialized.
  • ClusterControl must be connected to the same overlay network as the database containers.

To run ClusterControl as a service using “docker stack”, the following definition should be enough:

clustercontrol:
  deploy:
    replicas: 1
  image: severalnines/clustercontrol
  ports:
    - 5000:80
  networks:
    - galera_cc

Or, you can use the “docker service” command as per below:

$ docker service create --name cc_clustercontrol -p 5000:80 --replicas 1 severalnines/clustercontrol

Or, you can combine the ClusterControl service together with the database container service and form a “stack” in a compose file as shown in the next section.

Base Containers as Docker Swarm Service

The base container’s image, called “centos-ssh”, is based on the CentOS 6 image. It comes with a few basic packages like an SSH server and client, curl, and the mysql client. The entrypoint script will download ClusterControl’s public key for passwordless SSH during startup. It will also register itself with ClusterControl’s CMON database for automatic deployment.

Running this container requires a couple of environment variables to be set:

  • CC_HOST - Mandatory. By default it will try to connect to “cc_clustercontrol” service name. Otherwise, define its value in IP address, hostname or service name format. This container will download the SSH public key from ClusterControl node automatically for passwordless SSH.
  • CLUSTER_TYPE - Mandatory. Defaults to “galera”.
  • CLUSTER_NAME - Mandatory. This name distinguishes the cluster from others from ClusterControl’s perspective. No spaces are allowed and it must be unique.
  • VENDOR - Default is “percona”. Other supported values are “mariadb”, “codership”.
  • DB_ROOT_PASSWORD - Mandatory. The database root password for the database server. In this case, it should be MySQL root password.
  • PROVIDER_VERSION - Default is 5.6. The database version by the chosen vendor.
  • INITIAL_CLUSTER_SIZE - Default is 3. This indicates how ClusterControl should treat newly registered containers, whether they are for new deployments or for scaling out. For example, if the value is 3, ClusterControl will wait for 3 containers to be running and registered into the CMON database before starting the cluster deployment job. Otherwise, it waits 30 seconds for the next cycle and retries. The next containers (4th, 5th and Nth) will fall under the “Add Node” job instead.

To run the container, simply use the following stack definition in a compose file:

galera:
  deploy:
    replicas: 3
  image: severalnines/centos-ssh
  ports:
    - 3306:3306
  environment:
    CLUSTER_TYPE: "galera"
    CLUSTER_NAME: "PXC_Docker"
    INITIAL_CLUSTER_SIZE: 3
    DB_ROOT_PASSWORD: "mypassword123"
  networks:
    - galera_cc

By combining them both (ClusterControl and database base containers), we can just deploy them under a single stack as per below:

version: '3'

services:
  galera:
    deploy:
      replicas: 3
      restart_policy:
        condition: on-failure
        delay: 10s
    image: severalnines/centos-ssh
    ports:
      - 3306:3306
    environment:
      CLUSTER_TYPE: "galera"
      CLUSTER_NAME: "Galera_Docker"
      INITIAL_CLUSTER_SIZE: 3
      DB_ROOT_PASSWORD: "mypassword123"
    networks:
      - galera_cc

  clustercontrol:
    deploy:
      replicas: 1
    image: severalnines/clustercontrol
    ports:
      - 5000:80
    networks:
      - galera_cc

networks:
  galera_cc:
    driver: overlay

Save the above lines into a file, for example docker-compose.yml in the current directory. Then, start the deployment:

$ docker stack deploy --compose-file=docker-compose.yml cc
Creating network cc_galera_cc
Creating service cc_clustercontrol
Creating service cc_galera

Docker Swarm will deploy one container for ClusterControl (replicas:1) and another 3 containers for the database cluster containers (replicas:3). The database container will then register itself into the CMON database for deployment.

Wait for a Galera Cluster to be ready

The deployment will be automatically picked up by the ClusterControl CLI. So you basically don’t have to do anything but wait. The deployment usually takes around 10 to 20 minutes depending on the network connection.

Open the ClusterControl UI at http://{any_Docker_host}:5000/clustercontrol, fill in the default administrator user details and log in. Monitor the deployment progress under Activity -> Jobs, as shown in the following screenshot:

Or, you can look at the progress directly from the docker logs command of the ClusterControl container:

$ docker logs -f $(docker ps | grep clustercontrol | awk {'print $1'})
>> Found the following cluster(s) is yet to deploy: Galera_Docker
>> Number of containers for Galera_Docker is lower than its initial size (3).
>> Nothing to do. Will check again on the next loop.
>> Found the following cluster(s) is yet to deploy: Galera_Docker
>> Found a new set of containers awaiting for deployment. Sending deployment command to CMON.
>> Cluster name         : Galera_Docker
>> Cluster type         : galera
>> Vendor               : percona
>> Provider version     : 5.7
>> Nodes discovered     : 10.0.0.6 10.0.0.7 10.0.0.5
>> Initial cluster size : 3
>> Nodes to deploy      : 10.0.0.6;10.0.0.7;10.0.0.5
>> Deploying Galera_Docker.. It's gonna take some time..
>> You shall see a progress bar in a moment. You can also monitor
>> the progress under Activity (top menu) on ClusterControl UI.
Create Galera Cluster - Job 1 RUNNING [██▊       ] 26% Installing MySQL on 10.0.0.6

That’s it. Wait until the deployment completes and you will then be all set with a three-node Galera Cluster running on Docker Swarm, as shown in the following screenshot:

In ClusterControl, it has the same look and feel as what you have seen with Galera running in a standard (non-container) host environment.

Management

Managing database containers is a bit different with Docker Swarm. This section provides an overview of how the database containers should be managed through ClusterControl.

Connecting to the Cluster

To verify the status of the replicas and service name, run the following command:

$ docker service ls
ID            NAME               MODE        REPLICAS  IMAGE
eb1izph5stt5  cc_clustercontrol  replicated  1/1       severalnines/clustercontrol:latest
ref1gbgne6my  cc_galera          replicated  3/3       severalnines/centos-ssh:latest

If the application/client is running on the same Swarm network space, you can connect to it directly via the service name endpoint. If not, use the routing mesh by connecting to the published port (3306) on any of the Docker Swarm nodes. The connection to these endpoints will be load balanced automatically by Docker Swarm in a round-robin fashion.

Scale up/down

Typically, when adding a new database, we need to prepare a new host with base operating system together with passwordless SSH. In Docker Swarm, you just need to scale out the service using the following command to the number of replicas that you desire:

$ docker service scale cc_galera=5
cc_galera scaled to 5

ClusterControl will then pick up the new containers registered inside cmon.containers table and trigger add node jobs, for one container at a time. You can look at the progress under Activity -> Jobs:

Scaling down is similar, by using the “service scale” command. However, ClusterControl doesn’t know whether the containers that have been removed by Docker Swarm were part of the auto-scheduling or just a scale down (which indicates that we deliberately wanted the containers to be removed). Thus, to scale down from 5 nodes to 3 nodes, one would:

$ docker service scale cc_galera=3
cc_galera scaled to 3

Then, remove the stopped hosts from the ClusterControl UI by going to Nodes -> hover over the removed container -> click on the ‘X’ icon on the top right -> Confirm & Remove Node:

ClusterControl will then execute a remove node job and bring back the cluster to the expected size.

Failover

In case of container failure, Docker Swarm automatic rescheduling will kick in and there will be a new replacement container with the same IP address as the old one (with different container ID). ClusterControl will then start to provision this node from scratch, by performing the installation process, configuration and getting it to rejoin the cluster. The old container will be removed automatically from ClusterControl before the deployment starts.

Go ahead and try to kill one of the database containers:

$ docker kill [container ID]

You’ll see that the replacement containers created by Swarm are provisioned automatically by ClusterControl.

Creating a new cluster

To create a new cluster, just create another service or stack with a different CLUSTER_NAME and service name. The following example creates another Galera Cluster running MariaDB 10.1 (some extra environment variables are required for MariaDB 10.1):

version: '3'

services:
  galera2:
    deploy:
      replicas: 3
    image: severalnines/centos-ssh
    ports:
      - 3306
    environment:
      CLUSTER_TYPE: "galera"
      CLUSTER_NAME: "MariaDB_Galera"
      VENDOR: "mariadb"
      PROVIDER_VERSION: "10.1"
      INITIAL_CLUSTER_SIZE: 3
      DB_ROOT_PASSWORD: "mypassword123"
    networks:
      - cc_galera_cc

networks:
  cc_galera_cc:
    external: true

Then, create the service:

$ docker stack deploy --compose-file=docker-compose.yml db2

Go back to ClusterControl UI -> Activity -> Jobs and you should see that a new deployment has started. After a couple of minutes, the new cluster will be listed in the ClusterControl dashboard:

Destroying everything

To remove everything (including the ClusterControl container), you just need to remove the stack created by Docker Swarm:

$ docker stack rm cc
Removing service cc_clustercontrol
Removing service cc_galera
Removing network cc_galera_cc

That’s it, the whole stack has been removed. Pretty neat huh? You can start all over again by running the “docker stack deploy” command and everything will be ready after a couple of minutes.

Summary

The flexibility you get by running a single command to deploy or destroy a whole environment can be useful in different types of use cases, such as backup verification, DDL procedure testing, query performance tweaking, experimenting for proofs-of-concept and also staging temporary data. These use cases are closer to a development environment. With this approach, you can now treat a stateful service “statelessly”.

Would you like to see ClusterControl manage the whole database container stack through the UI via point and click? Let us know your thoughts in the comments section below. In the next blog post, we are going to look at how to perform automatic backup verification on Galera Cluster using containers.

Tags:  MySQL docker galera swarm cluster

Setting up SQL for beginners is hard

Edvard Munch, Building The Winter Studio. Ekely

Although there has been a huge backlash against traditional programming paradigms, particularly relational databases, over the past several years, SQL is still one of the most popular programming languages.

First, it’s proven (working in production settings for over forty years). Second, its human-language-like syntax and declarative nature make it a perfect stepping stone for people who have zero programming knowledge and get overwhelmed by having to start learning by understanding data types and loops. When I started working in analytics, it was the first language I learned. I love teaching it to people new to or outside of technology and seeing them understand the power of being able to manipulate data programmatically without (shudder) Excel spreadsheets.

With imperative programming languages, you can get started pretty quickly. You might need to install the JDK, download an IDE, and update a couple of dependencies, but as soon as you have your environment set up, you’re writing code. (For beginners, it’s not always as easy as that, but usually half an hour or so of poking around should get you going.)

And, if you don’t want to, you don’t even have to deal with any of that. You can just go to Code Academy, CodingBat, or, more recently, sites like Glitch, where the entire environment is already rendered in the browser for you, and start practicing.

The point is, because most modern programming languages are set up to move data rather than analyze it, they have a much lower overhead than SQL. The problem with SQL is that you can’t just get started. You need data; that data ideally needs to be in several tables that are related to one another, loaded into a database, and made available in a way that lets you access that database from your laptop. None of this is something beginners can easily do.

Interestingly enough, even though SQL is so well-established and has so many users, there simply aren’t a lot of environments already set up for practicing it. I encountered this interesting quirk of the programming community when I first learned it, and unfortunately, not much has changed.

Several years ago, I was asked by Girl Develop It to teach SQL to others, which I thought was a fantastic idea because of what I’ve already written about accessibility for beginners. The first time I taught it in 2013, there was nothing already set up out of the box for me to use.

My criteria included:

  1. Has to have an extremely easy-to-use UI for people who have never written code (so, no command line)
  2. Has to be cross-system compatible since students have a range of machines
  3. Accessible on any browser
  4. Involves no pre-installation before starting the class (so, no Docker containers)
  5. Is mostly open-source

My solution:

Running PHPMyAdmin on an AWS instance.

People really love to hate on PHP and the LAMP stack in general (I know I do), but when something works, it just works.

Here’s what I did to set up a MySQL PHPMyAdmin database for my students to practice on:

  1. Start up an AWS medium Ubuntu instance.
  2. Install the LAMP stack
  3. Install PHP 7
  4. Install Python so I could run create-users scripts and a couple of other clean-up scripts for the environment:

    A. Install easy_install

    B. Use easy_install to install pip

    C. sudo apt-get install python-setuptools python-dev build-essential

    D. Install a MySQL client for Python

  5. Start the database: sudo service mysql start

  6. Find a sample dataset to load into the DB and follow the instructions on its GitHub page to get it up and running.

And, phew. You’re ready to go.
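
For the create-users step above, here is a minimal sketch of the kind of SQL such a script might generate; the user name, password, and database name are placeholders, not the ones from the actual class:

CREATE USER 'student01'@'%' IDENTIFIED BY 'ChangeMe123!';
GRANT SELECT, INSERT, UPDATE, DELETE ON practice_db.* TO 'student01'@'%';
FLUSH PRIVILEGES;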

I just taught the SQL class again several weekends ago, and the setup worked really well for 20-ish students, except for a moment when a Cartesian join query made a couple of the instances hang; I plan to refactor that in the next go-around.

After I presented, I wanted to see if anything new had come onto the market. The only promising new product was Mode Analytics (h/t Sara), which requires a sign-up and is also oriented towards Jupyter notebooks as opposed to pure SQL programming.

Here’s the class I ended up teaching. Hope this helps in case you’re thinking about teaching SQL, as well.

How We Made Percona XtraDB Cluster Scale

In this blog post, we’ll look at the actions and efforts Percona experts took to scale Percona XtraDB Cluster.

Introduction

When we first started analyzing Percona XtraDB Cluster performance, it was pretty bad. We would see contention even with 16 threads. Performance was even worse with sync_binlog=1, although the same pattern was observed even with the binary log disabled. The effect was not limited to OLTP workloads; other workloads (like update-key/non-key) were also affected, even more broadly than OLTP.

That’s when we started analyzing the contention issues and found multiple problems. We will discuss all these problems and the solutions we adopted. But before that, let’s look at the current performance level.

Check this blog post for more details.

The good news is Percona XtraDB Cluster is now optimized to scale well for all scenarios, and the gain is in the range of 3x-10x.

Understanding How MySQL Commits a Transaction

Percona XtraDB Cluster contention is associated mainly with Commit Monitor contention, which comes into the picture during commit time. It is important to understand the context around it.

When a commit is invoked, it proceeds in two phases:

  • Prepare phase: mark the transaction as PREPARE, updating the undo segment to capture the updated state.
    • If bin-log is enabled, redo changes are not persisted immediately. Instead, a batch flush is done during Group Commit Flush stage.
    • If bin-log is disabled, then redo changes are persisted immediately.
  • Commit phase: Mark the transaction commit in memory.
    • If bin-log is enabled, Group Commit optimization kicks in, causing a flush of the redo logs (which persists changes done to db-objects + the PREPARE state of the transaction), followed by a flush of the binary logs. Since the binary logs are flushed, redo log capturing of the transaction commit doesn’t need to flush immediately (saving an fsync).
    • If bin-log is disabled, redo logs are flushed on completion of the transaction to persist the updated commit state of the transaction.

What is a Monitor in Percona XtraDB Cluster World?

Monitors help maintain transaction ordering. For example, the Commit Monitor ensures that no transaction with a global-seqno greater than the current commit-processing transaction’s global seqno is allowed to proceed.

How Percona XtraDB Cluster Commits a Transaction

Percona XtraDB Cluster follows the existing MySQL semantics of course, but has its own step to commit the transaction in the replication world. There are two important themes:

  1. Apply/Execution of transaction can proceed in parallel
  2. Commit is serialized based on cluster-wide global seqno.

Let’s understand the commit flow with Percona XtraDB Cluster involved (Percona XtraDB Cluster registers wsrep as an additional storage engine for replication).

  • Prepare phase:
    • wsrep prepare: executes two main actions:
      • Replicate the transaction (adding the write-set to group-channel)
      • Entering the Commit Monitor, thereby enforcing the ordering of transactions.
    • binlog prepare: nothing significant (for this flow).
    • innobase prepare: mark the transaction in PREPARE state.
      • As discussed above, the persistence of the REDO log depends on if the binlog is enabled/disabled.
  • Commit phase
    • If bin-log is enabled
      • MySQL Group Commit Logic kicks in. The semantics ensure that the order of transaction commit is the same as the order of them getting added to the flush-queue of the group-commit.
    • If bin-log is disabled
      • Normal commit action for all registered storage engines is called with immediate persistence of redo log.
    • Percona XtraDB Cluster then invokes the post_commit hook, thereby releasing the Commit Monitor so that the next transaction can make progress.

With that understanding, let’s look at the problems and solutions:

PROBLEM-1:

The Commit Monitor is exercised such that the complete commit operation is serialized. This limits the parallelism associated with the prepare stage. With log-bin enabled, this is still OK since redo logs are flushed at the group-commit flush stage (starting with 5.7). But if log-bin is disabled, then each commit causes an independent redo-log flush (and, in turn, a probable fsync).

OPTIMIZATION-1:

Split the replication pre-commit hook into two explicit actions: replicate (add write-set to group-channel) + pre-commit (enter commit-monitor).

The replicate action is performed just like before (as part of the storage engine prepare). That helps complete the InnoDB prepare action in parallel (exploiting the much-needed parallelism in REDO flush with log-bin disabled).

On completion of replication, the pre-commit hook is called. That leads to entering the Commit Monitor for enforcing the commit ordering of the transactions. (Note: Replication action assigns the global seqno. So even if a transaction with a higher global seqno finishes the replication action earlier (due to CPU scheduling) than the transaction with a lower global seqno, it will wait in the pre-commit hook.)

Improved parallelism in the innodb-prepare stage helps accelerate log-bin enabled flow, and the same improved parallelism significantly helps in the log-bin disabled case by reducing redo-flush contention, thereby reducing fsyncs.

PROBLEM-2:

MySQL Group Commit already has a concept of ordering transactions based on the order of their addition to the GROUP COMMIT queue (FLUSH STAGE queue to be specific). Commit Monitor enforces the same, making the action redundant but limiting parallelism in MySQL Group Commit Logic (including redo-log flush that is now delayed to the flush stage).

With the existing flow (due to the involvement of Commit Monitor), only one transaction can enter the GROUP COMMIT Queue, thereby limiting optimal use of Group Commit Logic.

OPTIMIZATION-2:

Release the Commit Monitor once the transaction is successfully added to flush-stage of group-commit. MySQL will take it from there to maintain the commit ordering. (We call this interim-commit.)

Releasing the Commit Monitor early helps other transactions to make progress and real MySQL Group Commit Leader-Follower Optimization (batch flushing/sync/commit) comes into play.

This also helps ensure batch REDO log flushing.

PROBLEM-3:

This problem is specific to when the log-bin is disabled. Percona XtraDB Cluster still generates the log-bin, as it needs it for forming a replication write-set (it just doesn’t persist this log-bin information). If disk space is not a constraint, then I would suggest operating Percona XtraDB Cluster with log-bin enabled.

With log-bin disabled, OPTIMIZATION-1 is still relevant, but OPTIMIZATION-2 isn’t, as there is no group-commit protocol involved. Instead, MySQL ensures that the redo-log (capturing state change of transaction) is persisted before reporting COMMIT as a success. As per the original flow, the Commit Monitor is not released till the commit action is complete.

OPTIMIZATION-3:

The transaction is already committed to memory and the state change is captured. This is about persisting the REDO log only (the REDO log modification is already captured by mtr_commit). This means we can release the Commit Monitor just before the REDO flush stage kicks in. Correctness is still ensured as the REDO log flush always persists the data sequentially. So even if trx-1 loses its slot before the flush kicks in, and trx-2 is allowed to make progress, trx-2’s REDO log flush ensures that trx-1’s REDO log is also flushed.

Conclusion

With these three main optimizations, and some small tweaks, we have tuned Percona XtraDB Cluster to scale better and made it fast enough for the growing demands of your applications. All of this is available with the recently released Percona XtraDB Cluster 5.7.17-29.20. Give it a try and watch your application scale in a multi-master environment, making Percona XtraDB Cluster the best fit for your HA workloads.

Percona XtraDB Cluster 5.6.35-26.20-3 is now available

Percona announces the release of Percona XtraDB Cluster 5.6.35-26.20-3 on April 13, 2017. Binaries are available from the downloads section or our software repositories.

Percona XtraDB Cluster 5.6.35-26.20-3 is now the current release, based on the following:

All Percona software is open-source and free. Details of this release can be found in the 5.6.35-26.20-3 milestone on Launchpad.

NOTE: Due to end of life, Percona will stop producing packages for the following distributions after July 31, 2017:

  • Red Hat Enterprise Linux 5 (Tikanga)
  • Ubuntu 12.04 LTS (Precise Pangolin)

You are strongly advised to upgrade to the latest stable versions if you want to continue using Percona software.

Fixed Bugs

  • Updated semantics for gcache page cleanup to trigger when either gcache.keep_pages_size or gcache.keep_pages_count exceeds the limit, instead of both at the same time.
  • Added support for passing the XtraBackup buffer pool size with the use-memory option under [xtrabackup] and the innodb_buffer_pool_size option under [mysqld] when the --use-memory option is not passed with the inno-apply-opts option under [sst].
  • Fixed gcache page cleanup not triggering when limits are exceeded.
  • PXC-782: Updated xtrabackup-v2 script to use the tmpdir option (if it is set under [sst], [xtrabackup] or [mysqld], in that order).
  • PXC-784: Fixed the pc.recovery procedure to abort if the gvwstate.dat file is empty or invalid, and fall back to normal joining process. For more information, see 1669333.
  • PXC-794: Updated the sockopt option to include a comma at the beginning if it is not set by the user.
  • PXC-797: Blocked wsrep_desync toggling while the node is paused to avoid halting the cluster when running FLUSH TABLES WITH READ LOCK. For more information, see 1370532.
  • Fixed several packaging and dependency issues.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

MySQL 8.0: Improved performance with CTE

Last week I published a blog post in the MySQL Server Blog where I showed how the execution time of a query was reduced by 50% by using a Common Table Expression (CTE) instead of a view.

In the coming weeks, there will be several opportunities to attend a presentation on CTE and another SQL feature that will arrive in MySQL 8.0, Window Functions.  This presentation is part of the Oracle MySQL Innovation Day that will be held both in the Bay Area (April 28) and in the Boston Area (May 2).

I will also give the presentation at the Boston MySQL Meetup on May 1st.

I hope to see some of you there!


 

More Metadata Is Written Into Binary Log

In row-based replication, the row data generated by DML is logged into the binary log together with some metadata, for example the column type and type length. In the new release, MySQL 8.0.1, more table metadata is written into the binary log. Together, this metadata brings users and us the benefits below:

  • Allows us to build more robust replication and convert row data between types smoothly.

Performance improvements in Percona XtraDB Cluster 5.7.17-29.20

In our latest release of Percona XtraDB Cluster, we’ve introduced major performance improvements to the MySQL write-set replication layer. In this post, we want to show what these improvements look like.

For the test, we used the sysbench OLTP_RW, UPDATE_KEY and UPDATE_NOKEY workloads with 100 tables of 4 million rows each, which gives about 100GB of data. In all the tests we used a three-node setup, connected via a 10Gb network, with the sysbench load directed to one primary node.

In the first chart, we show improvements comparing to the previous version (5.7.16):

The main improvements come from concurrent workloads, under multiple threads.

The previous chart is for the case where binary logs are enabled, but in some situations we will have deployments without binary logs enabled (Percona XtraDB Cluster does not require them). The latest release significantly improves performance for this case as well.

Here is a chart showing throughput without binary logs:

How does Percona XtraDB Cluster compare with similar technologies? To find out, we’ll compare this release with MySQL 5.7.17 Group Replication and with the recently released MariaDB 10.2.5 RC.

For MySQL 5.7.17 Group Replication, I’ve reviewed two cases: “durable” with sync_binlog=1, and “relaxed durability” with sync_binlog=0.

Also for MySQL 5.7.17 Group Replication, we want to review two cases with different flow_control settings. The first setting is flow_control=25000 (the default). It provides better performance, but with the drawback that non-primary nodes will fall behind significantly, and MySQL Group Replication does not provide a way to protect against reading stale data. So with the default flow_control=25000, we risk reading very outdated data. We also tested MySQL Group Replication with flow_control=1000 to minimize stale data on non-primary nodes.

A note on the Flow Control topic: it is worth mentioning that we also changed the flow_control default for Percona XtraDB Cluster. The default value is 100 instead of 16 (as in version 5.7.16).

Comparison chart with sync_binlog=1 (for MySQL Group Replication):

Comparison chart with sync_binlog=0 (for MySQL Group Replication):

So there are a couple of conclusions we can draw from these charts.

  1. The new version of Percona XtraDB Cluster performs on par with MySQL Group Replication.
  2. flow_control for MySQL Group Replication really makes a difference for performance, and the default flow_control=25000 is better (with the risk of a lot of outdated data on non-primary nodes).

For reference, our benchmark files and config files are here.

Percona XtraDB Cluster 5.7.17-29.20 is now available

Percona announces the release of Percona XtraDB Cluster 5.7.17-29.20 on April 19, 2017. Binaries are available from the downloads section or our software repositories.

NOTE: You can also run Docker containers from the images in the Docker Hub repository.

Percona XtraDB Cluster 5.7.17-29.20 is now the current release, based on the following:

All Percona software is open-source and free.

Performance Improvements

This release is focused on performance and scalability with increasing workload threads. Tests show up to a 10x increase in performance.

Fixed Bugs

  • Updated semantics for gcache page cleanup to trigger when either gcache.keep_pages_size or gcache.keep_pages_count exceeds the limit, instead of both at the same time.
  • Added support for passing the XtraBackup buffer pool size with the use-memory option under [xtrabackup] and the innodb_buffer_pool_size option under [mysqld] when the --use-memory option is not passed with the inno-apply-opts option under [sst].
  • Fixed gcache page cleanup not triggering when limits are exceeded.
  • Improved SST and IST log messages for better readability and unification.
  • Excluded the garbd node from flow control calculations.
  • Added extra checks to verify that SSL files (certificate, certificate authority, and key) are compatible before opening a connection.
  • Improved parallelism for better scaling with multiple threads.
  • Added validations for DISCARD TABLESPACE and IMPORT TABLESPACE in PXC Strict Mode to prevent data inconsistency.
  • Added the wsrep_flow_control_status variable to indicate if node is in flow control (paused).
  • PXC-766: Added the wsrep_ist_receive_status variable to show progress during an IST.
  • Allowed CREATE TABLE ... AS SELECT (CTAS) statements with temporary tables (CREATE TEMPORARY TABLE ... AS SELECT) in PXC Strict Mode. For more information, see 1666899.
  • PXC-782: Updated xtrabackup-v2 script to use the tmpdir option (if it is set under [sst], [xtrabackup] or [mysqld], in that order).
  • PXC-783: Improved the wsrep stage framework.
  • PXC-784: Fixed the pc.recovery procedure to abort if the gvwstate.dat file is empty or invalid, and fall back to normal joining process. For more information, see 1669333.
  • PXC-794: Updated the sockopt option to include a comma at the beginning if it is not set by the user.
  • PXC-795: Set --parallel=4 as default option for wsrep_sst_xtrabackup-v2 to run four threads with XtraBackup.
  • PXC-797: Blocked wsrep_desync toggling while the node is paused to avoid halting the cluster when running FLUSH TABLES WITH READ LOCK. For more information, see 1370532.
  • PXC-805: Inherited upstream fix to avoid using deprecated variables, such as INFORMATION_SCHEMA.SESSION_VARIABLE. For more information, see 1676401.
  • PXC-811: Changed default values for the following variables:
    • fc_limit from 16 to 100
    • send_window from 4 to 10
    • user_send_window from 2 to 4
  • Moved wsrep settings into a separate configuration file (/etc/my.cnf.d/wsrep.cnf).
  • Fixed mysqladmin shutdown to correctly stop the server on systems using systemd.
  • Fixed several packaging and dependency issues.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

How to Deploy Asynchronous Replication Slave to MariaDB Galera Cluster 10.x using ClusterControl

Combining Galera and asynchronous replication in the same MariaDB setup, aka Hybrid Replication, can be useful - e.g. as a live backup node in a remote datacenter or reporting/analytics server. We already blogged about this setup for Codership/Galera or Percona XtraDB Cluster users, but a master failover as described in that post does not work for MariaDB because of its different GTID approach. In this post, we will show you how to deploy an asynchronous replication slave to MariaDB Galera Cluster 10.x (with master failover!), using GTID with ClusterControl.

Preparing the Master

First and foremost, you must ensure that the master and slave nodes are running MariaDB Galera 10.0.2 or later. A MariaDB replication slave requires at least one master with GTID among the Galera nodes. However, we would recommend configuring all the MariaDB Galera nodes as masters. GTID, which is automatically enabled in MariaDB, will be used for master failover. The following must be true for the masters:

  • At least one master among the Galera nodes
  • All masters must be configured with the same domain ID
  • log_slave_updates must be enabled
  • All masters’ MariaDB port is accessible by ClusterControl and slaves
  • Must be running MariaDB version 10.0.2 or later

From ClusterControl this is easily done by selecting Enable Binary Logging in the drop-down for each node.

Enabling binary logging through ClusterControl

And then enable GTID in the dialogue:

Once Proceed has been clicked, a job will automatically configure the Galera node according to the settings described earlier.

If you wish to perform this action by hand, you can configure a Galera node as master, by changing the MariaDB configuration file for that node as per below:

gtid_domain_id=<must be same across all mariadb servers participating in replication>
server_id=<must be unique>
binlog_format=ROW
log_slave_updates=1
log_bin=binlog

After making these changes, restart the nodes one by one or use a rolling restart (ClusterControl > Manage > Upgrades > Rolling Restart).

Preparing the Slave

For the slave, you would need a separate host or VM, with or without MariaDB installed. If you do not have MariaDB installed, you need to perform the following tasks: configure the root password (based on monitored_mysql_root_password), create a slave user (based on repl_user, repl_password), configure MariaDB, start the server and, finally, start replication.
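
If you were to set up that final replication step by hand instead, a minimal sketch could look like the following; the host and credentials are placeholders, and MASTER_USE_GTID = slave_pos tells MariaDB to track the GTID position so that failover to another master remains possible:

CHANGE MASTER TO
    MASTER_HOST = '192.168.10.101',    -- any of the Galera master nodes
    MASTER_PORT = 3306,
    MASTER_USER = 'repl_user',
    MASTER_PASSWORD = 'repl_password',
    MASTER_USE_GTID = slave_pos;
START SLAVE;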

When adding the slave using ClusterControl, all these steps are automated in the Add Replication Slave job, as described below.

Add replication slave to MariaDB Cluster

After adding our slave node, our deployment will look like this:

MariaDB Galera asynchronous slave topology

Master Failover and Recovery

Since we are using MariaDB with GTID enabled, master failover is supported via ClusterControl when Cluster and Node Auto Recovery have been enabled. Whether the master fails due to network connectivity or any other reason, ClusterControl will automatically fail over to the most suitable master node in the cluster.

Automatic slave failover to another master in Galera cluster

This way ClusterControl will add a robust asynchronous slave capability to your MariaDB Cluster!

Tags:  galera cluster MySQL MariaDB clustercontrol asynchronous replication

MySQL Connector/Python 2.1.6 GA has been released

Dear MySQL users,

MySQL Connector/Python 2.1.6 GA is the fourth GA version of the 2.1 release
series of the pure Python database driver for MySQL. It can be used in
production environments.

MySQL Connector/Python version 2.1.6 GA is compatible with MySQL Server
versions 5.5 and greater. Python 2.6 and greater as well as Python 3.3
and greater are supported. Python 2.4, 2.5 and 3.1, 3.2 are not
supported.

MySQL Connector/Python 2.1.6 is available for download from:

http://dev.mysql.com/downloads/connector/python/#downloads

The ChangeLog file included in the distribution contains a brief summary
of changes in MySQL Connector/Python 2.1.6. For a more complete list of
changes, see below or online at:

http://dev.mysql.com/doc/relnotes/connector-python/en/

Enjoy!

Changes in MySQL Connector/Python 2.1.6 (2017-04-18, General
Availability)

Functionality Added or Changed

* An ssl-cipher option is now supported for specifying the
encryption cipher for secure connections. (Bug #22545879,
Bug #78186)

Bugs Fixed

* Compatibility issues with Django 1.9 were corrected. (Bug
#25726671)

* The fix for Bug #22529828 caused Python 2.7 to be unable
to insert binary data. (Bug #25589496, Bug #85100)
References: This issue is a regression of: Bug #22529828.

* Some SQL statements that worked using pure Python failed
with the Connector/Python C Extension enabled. (Bug
#25558885)

* Connector/Python produced no error or warning if the
server certificate was expired. (Bug #25397650)

* If an exception reset the underlying session, connections
in a connection pooled could become unavailable to the
pool. (Bug #25383644, Bug #84476)

* Methods for filtering time and datetime fields were
changed in Django 1.9 from value_to_db_datetime to
adapt_datetimefield_value and from value_to_db_time to
adapt_timefield_value. Proxy methods with the previous
names were added to Connector/Python to ensure
compatibility. Thanks to Brian Tyndall for the patch.
(Bug #25349918, Bug #84410)

* Extra encapsulation was removed from the get_constraints
method for the foreign_key parameter. Thanks to Brian
Tyndall for the patch. (Bug #25349912, Bug #84409)

* Connector/Python added support for a database backend API
change introduced in Django 1.9 for the bulk_insert_sql
method. Thanks to Brian Tyndall for the patch. (Bug
#25349897, Bug #84408)

* Loading the world sample database worked using pure
Python but failed with the Connector/Python C Extension
enabled. (Bug #22476689, Bug #79780)

* If the output from the mysql_config --include command
included more than one directory, the C Extension failed
to compile. (Bug #20736339, Bug #76350)

On Behalf of the MySQL/ORACLE RE Team,
-Sreedhar S

Practical Orchestrator, BoF, GitHub and other talks at Percona Live 2017

Next week I will be presenting Practical Orchestrator at Percona Live, Santa Clara.

As opposed to previous orchestrator talks I gave, and which were either high level or algorithmic talks, Practical Orchestrator will be, well... practical.

The objective for this talk is that attendees leave the classroom with a good grasp of orchestrator's powers, and know how to set up orchestrator in their environment.

We will walk through discovery, refactoring, recovery, HA. I will walk through the most important configuration settings, share advice on what makes a good deployment, and tell you how we and others run orchestrator. We'll present a few scripting/automation examples. We will literally set up orchestrator on my computer.

It's a 50 minute talk and it will be fast paced!

ProxySQL & Orchestrator BoF

ProxySQL is all the rage, and throughout the past 18 months René Cannaò and myself discussed a few times the potential for integration between ProxySQL and Orchestrator. We've also received several requests from the community.

We will run a BoF, a very informal session where we openly discuss our thoughts on possible integration, what makes sense and what doesn't, and above all else would love to hear the attendees' thoughts. We might come out of this session with some plan to pick low hanging fruit, who knows?

The current link to the BoF sessions is this. It seems terribly broken, and hopefully I'll replace it later on.

GitHub talks

GitHub engineers will further present these talks:

  • gh-ost: triggerless, painless, trusted online schema migrations
    Jonah Berquist, 25 April - 2:20 PM - 3:10 PM @ Ballroom D
    This is the "classic" introduction to gh-ost, our very own open source schema migration tool. gh-ost enjoys good traction and adoption. For us, it made a significant impact on our development cycle and on our availability and reliability. We hear similar stories from users. This talk will explain how gh-ost works and why it works for us and others so much better. Also: superpowers.
  • Automating Schema Changes using gh-ost
    Tom Krouper, 27 April - 12:50 PM - 1:40 PM @ Ballroom D
    What's the greater picture? gh-ost runs the migration, but what's the migration cycle? How do our engineers design the change, get this to run, verify it doesn't break our environment? What automation do we have in place? Expect very interesting and definitely not trivial issues.
  • Practical JSON in MySQL 5.7 and beyond
    Ike Walker, 27 April - 3:00 PM - 3:50 PM @ Ballroom A
    Ike will review the JSON data type along with most recent changes, virtual columns, operational concerns and more. I'm looking forward to a great review!
Related
  • Experiences using gh-ost in a multi-tier topology
    Ivan Groenewold, Valerie Parham-Thompson, Pythian, 26 April - 5:00 PM - 5:25 PM @ Ballroom C
    Don't take it from us. Pythian are running gh-ost in production, and on completely different environments than us. I'm absolutely curious to hear about their experience and findings.
  • High Availability in GCE
    Carmen Mason, Allan Mason, 26 April - 3:30 PM - 4:20 PM @ Ballroom D
    This is a story on the road to MySQL HA on Google Cloud Engine. It ends up with MHA picked as the HA solution while orchestrator is not. I'm curious to hear!

See you in Santa Clara!

 

MariaDB 10.3-alpha released

While most of the MariaDB developers have been working hard on getting MariaDB 10.2 out as GA, a small team, including me, has been working on the next release, MariaDB 10.3.

The theme of MariaDB 10.2 is complex operations, like window functions, common table expressions and JSON functions; the theme of MariaDB 10.3 is compatibility.

Compatibility refers to functionality that exists in other databases but has been missing in MariaDB.

In MariaDB 10.2, ORACLE mode was limited to removing MariaDB-specific options in SHOW CREATE TABLE and SHOW CREATE VIEW, and setting SQL_MODE to "PIPES_AS_CONCAT, ANSI_QUOTES, IGNORE_SPACE, ORACLE, NO_KEY_OPTIONS, NO_TABLE_OPTIONS, NO_FIELD_OPTIONS, NO_AUTO_CREATE_USER".

In MariaDB 10.3, SQL_MODE=ORACLE mode allows MariaDB to understand a large subset of Oracle's PL/SQL language. The documentation for what is supported is still lacking, but the interested can find what is supported in the test suite in the "mysql-test/suite/compat/oracle" directory.

If things go as planned, the features we will add to 10.3 prior to beta are:
Most of the above features are already close to being ready (to be added in future alphas), so I expect that it will not take many months before we can make a first MariaDB 10.3 beta!

This is in line with what was discussed at the MariaDB developer conference in New York one week ago, where most attendees wanted to see new MariaDB releases more often.

MariaDB 10.3 can be downloaded here

Happy testing!

New MySQL JSON Functions (more)

MySQL 8 is going to have a new batch of JSON functions, and last time JSON_PRETTY() was covered in some detail. The recent release of 8.0.1 provides an opportunity to try these new functions, plus a few that might have been missed with 8.0.0.

Unquoting

The -> shortcut for JSON_EXTRACT() was introduced with MySQL 5.7. And now there is the unquoting extraction operator, ->>, to simplify things again! Think of it as JSON_UNQUOTE wrapped around JSON_EXTRACT. The following three queries produce the same output.

mysql> SELECT JSON_UNQUOTE(JSON_EXTRACT(doc,"$.GNP")) FROM countryinfo WHERE _id = "USA";
+-----------------------------------------+
| JSON_UNQUOTE(JSON_EXTRACT(doc,"$.GNP")) |
+-----------------------------------------+
| 8510700                                 |
+-----------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT JSON_UNQUOTE(doc->"$.GNP") FROM countryinfo WHERE _id = "USA";
+----------------------------+
| JSON_UNQUOTE(doc->"$.GNP") |
+----------------------------+
| 8510700                    |
+----------------------------+
1 row in set (0.00 sec)

mysql> SELECT doc->>"$.GNP" FROM countryinfo WHERE _id = "USA";
+---------------+
| doc->>"$.GNP" |
+---------------+
| 8510700       |
+---------------+
1 row in set (0.00 sec)

Aggregation

The new JSON_ARRAYAGG() and JSON_OBJECTAGG() functions take a column or expression argument and create an array or object.

Clear as mud?

Well, examine this example:

mysql> SELECT * FROM foo;
+----------------------------------------+
| mycolumn                               |
+----------------------------------------+
| {"key1": "value-A", "key2": "value-B"} |
| {"key2": "value-X", "key3": "value-C"} |
+----------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT JSON_ARRAYAGG(mycolumn) FROM foo;
+----------------------------------------------------------------------------------+
| JSON_ARRAYAGG(mycolumn)                                                          |
+----------------------------------------------------------------------------------+
| [{"key1": "value-A", "key2": "value-B"}, {"key2": "value-X", "key3": "value-C"}] |
+----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

The two rows from table foo are combined to make a two-element array.

mysql> SELECT JSON_OBJECT(_id, doc->"$.GNP","Indep",doc->"$.IndepYear") FROM countryinfo LIMIT 2;
+------------------------------------------------------------+
| json_object(_id, doc->"$.GNP","Indep",doc->"$.IndepYear")  |
+------------------------------------------------------------+
| {"ABW": 828, "Indep": null}                                |
| {"AFG": 5976, "Indep": 1919}                               |
+------------------------------------------------------------+
2 rows in set (0.00 sec)

The JSON_OBJECT() function takes pairs of columns, assumes they are a key/value pair, and combines them. Note that non-JSON columns and data from JSON columns can be combined, as well as literal strings.
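
For completeness, here is an untested sketch of JSON_OBJECTAGG() against the same countryinfo collection; it treats its first argument as the key and its second as the value for every aggregated row:

SELECT JSON_OBJECTAGG(_id, doc->>"$.GNP")
    FROM countryinfo
    WHERE _id IN ("ABW", "AFG", "USA");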

sysbench 1.0 feature highlights

It's been a while since I announced a restart of sysbench development. Even though I failed to report my progress on sysbench 1.0 and even announce the release here in the blog (yes, I'm a pretty lousy blogger), there have been a lot of things going on behind the scenes.

First of all, sysbench 1.0 was released a couple of months ago with impressive performance and scalability improvements, some interesting new features, and recently added fully automated packaging, with automation powered by Travis CI and packpack, and packages being hosted by packagecloud.

On the performance and scalability side of things, most improvements were made possible by replacing plain, interpreted Lua with LuaJIT, removing and optimizing locks on critical code paths with help from the ConcurrencyKit library and refactoring Lua API provided by sysbench itself.

New feature highlights in sysbench 1.0 include:

  • simplified command line syntax
  • latency histograms
  • extended SQL API
  • error and report hooks
  • custom and parallel commands in Lua scripts

But there are even more improvements and refactoring under the hood which do not (yet) manifest themselves as user-visible changes. My goal for sysbench 1.0 was to create a base for a universal benchmark framework, paving the way for new features like custom workloads, NoSQL support, benchmark automation, OS tuning and results aggregation/visualization/publishing implemented as dynamically installed modules on top of sysbench.

If that sounds interesting and you are going to Percona Live Santa Clara 2017 next week, I will be talking about all those points in a bit more detail in my talk titled “sysbench 1.0: teaching an old dog new tricks”. I'd like to hear about your experience with sysbench and will be happy to answer questions!

Percona XtraBackup 2.4.7 is Now Available

Percona announces the GA release of Percona XtraBackup 2.4.7 on April 18, 2017. You can download it from our download site and apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New features:
  • Percona XtraBackup now uses hardware accelerated implementation of crc32 where it is supported.
  • Percona XtraBackup has implemented new options: --tables-exclude and --databases-exclude that work similar to --tables and --databases options, but exclude given names/paths from backup.
  • The xbstream binary now supports parallel extraction with the --parallel option.
  • The xbstream binary now supports the following new options: --decrypt, --encrypt-threads, --encrypt-key, and --encrypt-key-file. When the --decrypt option is specified, xbstream automatically decrypts encrypted files while extracting the input stream. Either --encrypt-key or --encrypt-key-file (but not both) must be specified to provide the encryption key. The --encrypt-threads option specifies the number of worker threads doing the encryption; the default is 1.
Bugs Fixed:
  • Backups were missing *.isl files for general tablespaces. Bug fixed #1658692.
  • In 5.7, MySQL changed the default checksum algorithm to crc32, while xtrabackup was still using innodb. This caused xtrabackup to perform extra checksum calculations that were not needed. Bug fixed #1664405.
  • For system tablespaces consisting of multiple files, xtrabackup updated the LSN only in the first file. This caused MySQL versions lower than 5.7 to fail on startup. Bug fixed #1669592.
  • xtrabackup --export can now export tables that have more than 31 indexes. Bug fixed #1089681.
  • An “Unrecognized character \x01” warning could appear if backups were taken with the version check enabled. Bug fixed #1651978.

Release notes with all the bugfixes for Percona XtraBackup 2.4.7 are available in our online documentation. Please report any bugs to the launchpad bug tracker.

Percona XtraBackup 2.3.8 is Now Available

Percona announces the release of Percona XtraBackup 2.3.8 on April 18, 2017. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

This release is the current GA (Generally Available) stable release in the 2.3 series.

New Features
  • Percona XtraBackup now uses a hardware-accelerated implementation of crc32 where it is supported.
  • Percona XtraBackup implements two new options, --tables-exclude and --databases-exclude, which work similarly to the --tables and --databases options but exclude the given names/paths from the backup.
  • The xbstream binary now supports parallel extraction with the --parallel option.
  • The xbstream binary now supports the following new options: --decrypt, --encrypt-threads, --encrypt-key, and --encrypt-key-file. When the --decrypt option is specified, xbstream automatically decrypts encrypted files while extracting the input stream. Either --encrypt-key or --encrypt-key-file (but not both) must be specified to provide the encryption key. The --encrypt-threads option specifies the number of worker threads doing the encryption; the default is 1.
Bugs Fixed:
  • xtrabackup would not create fresh InnoDB redo logs when preparing an incremental backup. Bug fixed #1669592.
  • xtrabackup --export can now export tables that have more than 31 indexes. Bug fixed #1089681.
  • An “Unrecognized character \x01” warning could appear if backups were taken with the version check enabled. Bug fixed #1651978.

Release notes with all the bugfixes for Percona XtraBackup 2.3.8 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

M17 Conference Observations on the Future of MariaDB

In this blog post, I’ll discuss some of my thoughts about the future of MariaDB after attending the M17 Conference.

Let me start with full disclosure: I’m the CEO of Percona, and we compete with the MariaDB Corporation in providing Support for MariaDB and other services. I probably have some biases!

Last week I attended the MariaDB Developers UnConference and the M17 Conference, which provided great insights into MariaDB’s positioning as a project and as a business. Below are some of my thoughts as I attended various sessions at the conference:

Breaking away from their MySQL past. Michael Howard’s (MariaDB CEO) keynote focused on breaking away from the past and embracing the future. In this case, the “past” means proprietary databases. But I think MariaDB is also trying to break away from its own past as a MySQL variant and focus on becoming a completely independent technology. If I didn’t know their history, I wouldn’t recognize how much of its codebase MariaDB still shares with MySQL – or how much of MariaDB’s innovation is still driven by Oracle.

MySQL compatibility is no longer the primary goal. In its first version, MariaDB 5.1 was truly compatible with MySQL (it had relatively few differences). By contrast, MariaDB 10.2 has different replication, different JSON support and a very different optimizer. For MariaDB 10.3, more changes to the InnoDB on-disk format are planned, and there are no plans to remove .frm files and adopt the MySQL 8 Data Dictionary. With these features, another level of MySQL compatibility is removed. The MariaDB knowledgebase states: “For all practical purposes, MariaDB is a binary drop in replacement for the same MySQL version.” The argument can still be made that this is true for MySQL 5.7 (as long as your application does not use some of the new features), but this does not seem to be the case for MySQL 8.

The idea seems to be that since MariaDB has replaced MySQL in many (most?) Linux distributions, and many people use MariaDB when they think they are using MySQL, compatibility is not that big of a deal anymore.

Embracing contributions and keeping the development process open is a big focus of MariaDB. Facebook, Google, Alibaba and Tencent have contributed to MariaDB, along with many independent smaller companies and independent developers (Percona among them). This is different from the MySQL team at Oracle, who have provided some contributions, but not nearly to the extent that MariaDB has. An open source model is a double-edged sword – while it gives you more features, it also makes it harder to maintain a consistent user experience and consistent quality of code and documentation. It will be interesting to see how MariaDB deals with these challenges.

Oracle compatibility. MariaDB strives to be the open source database that is the most compatible with Oracle, and therefore the easiest to migrate to. I have heard people compare MariaDB’s desire for Oracle compatibility to EDB Postgres – only with the advantage of being open source as opposed to proprietary software. For MariaDB 10.3 (alpha), they are developing support for Oracle PL/SQL syntax for stored procedures, so that applications can be migrated with few, if any, changes. They are also developing support for SEQUENCE and other Oracle features, including a special sql_mode=ORACLE to maximize compatibility.
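To make that concrete, here is a minimal sketch of what the sequence side of this looks like. These were alpha-quality MariaDB 10.3 features at the time, so treat the exact syntax shown here as tentative:

-- Assumes MariaDB 10.3 (alpha) with the Oracle compatibility mode enabled
SET sql_mode = 'ORACLE';

-- An Oracle-style sequence object instead of AUTO_INCREMENT
CREATE SEQUENCE order_seq START WITH 1 INCREMENT BY 1;

SELECT order_seq.nextval;   -- Oracle-style access to the next value
SELECT NEXTVAL(order_seq);  -- MariaDB's native syntax for the same thing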

BSL as a key for success. When it comes to business source licensing (BSL), I couldn’t quite resolve the conflict I found in MariaDB’s messaging. On the one hand, MariaDB promotes open source as a great way to escape vendor lock-in (which we at Percona completely agree with). But on the other hand, Michael Howard stated that BSL software (“Eventual Open Source”) is absolutely critical for MariaDB’s commercial success. Is the assumption here that if vendor lock-in is not forever, it is not really lock-in? Currently, only MariaDB MaxScale is BSL, but it sounds like we should expect more of their software to follow this model.

Note. As MariaDB Server and MariaDB Columnstore use a lot of Oracle’s GPL code, these will most likely remain GPL.

I enjoyed attending both conferences. I had a chance to meet with many old friends and past colleagues, as well as get a much better feel for where MariaDB is going and when it is appropriate to advise its usage.

New monitoring replication features and more!

The new release of MySQL is packed with exciting features that help you detect and analyze replication lag. In this post, you will learn all about the new replication timestamps, the useful new information that is now reported by the performance schema tables, and how delayed replication was improved.…
