Planet MySQL

Platform End of Life (EOL) Announcement for RHEL 5 and Ubuntu 12.04 LTS

Upstream platform vendors have announced the general end of life (EOL) for Red Hat Enterprise Linux 5 (RHEL 5) and its derivatives, as well as Ubuntu 12.04 LTS. These announcements have implications for support of Percona software running on these operating systems.

RHEL 5 reached end of life on March 31st, 2017, and Ubuntu 12.04 LTS on April 28th, 2017. Pursuant to our end of life policies, we are announcing that these EOLs will go into effect for Percona software on August 1st, 2017. As of this date, we will no longer produce new packages, binary builds, hotfixes, or bug fixes for Percona software on these platforms.

We generally align our platform end of life dates with those of the upstream platform vendor. The platform end of life dates are published in advance on our website under the page Supported Linux Platforms and Versions.

Per our policies, Percona will continue to provide operational support for your databases on EOLed platforms. However, we will be unable to provide any bug fixes, builds or OS-level assistance if you encounter an issue outside the database itself.

Each platform vendor has a supported migration or upgrade path to their next major release.  Please reach out to us if you need assistance in migrating your database to your vendor’s supported platform – Percona will be happy to assist you.

Webinar Wednesday August 2, 2017: MySQL Disk Encryption with LUKS

Join Percona Senior Architect Matthew Boehm as he presents MySQL Disk Encryption with LUKS on Wednesday, August 2, 2017, at 1:00 pm PDT / 4:00 pm EDT (UTC-7).

Register Now

Clients require strong security measures to comply with standards such as PCI and HIPAA, or to protect PHI. Satisfying the data requirements of these standards means encrypting MySQL "at rest." InnoDB’s built-in encryption features work, but there are some caveats to that solution.

In this talk, you’ll see how to encrypt your entire disk to protect everything: data files, redo logs and binary logs.

Register for the webinar here.

Matthew Boehm, Architect

Matthew joined Percona in the fall of 2012 as a MySQL consultant. His areas of knowledge include the traditional Linux/Apache/MySQL/PHP stack, memcached, MySQL Cluster, massive sharding topologies, PHP development and a bit of MySQL-C-API development. Previously, Matthew DBAed for the 5th largest MySQL installation at eBay/PayPal, and also hails from managed hosting environments. During his off-hours, Matthew is a nationally-ranked competitive West Coast Swing dancer, and travels to competitions around the US. He enjoys working out, camping, biking and playing MMOs with his son.

How We Encrypt Data In MySQL With Go

A SaaS product needs to use security measures you might not ordinarily use in an on-premises solution. In particular, it’s important that all sensitive data be secured. Encryption plays an important role in information security. At VividCortex, we encrypt data in-flight and at-rest, so your sensitive data is never exposed.

We use Go and MySQL extensively at VividCortex and thought other Go programmers might be interested to see how we’ve integrated encryption into our services layer (APIs). (And if you'd like to learn more about programming with Go in general, please take a look at our free ebook The Ultimate Guide to Building Database-Driven Apps with Go.)


Encryption Techniques

At a high level, you can think of two kinds of data encryption inside of MySQL or any similar data store. I’ll oversimplify for purposes of illustration. You can:

  1. Store the data in MySQL as normal, but encrypt the container that holds MySQL. Usually this means storing MySQL’s data on an encrypted disk volume. The protection? Broadly speaking, if someone gains access to a backup disk, they can’t see your data.
  2. Encrypt the data before sending it to MySQL. In this case the security boundary is pushed out further: even if someone gets access to the server, and can run SQL commands, they can’t see your data.

Each of these has advantages and disadvantages. These include ease of use, programmer overhead, ability to inspect (e.g. recovering from backups), searchability and indexability, and so on. There are a lot of things to consider here. Just a few:

  • Will data be exposed if backups are unencrypted? (Our backups are encrypted, by the way.)
  • Are sensitive values possibly in cleartext in query logs?
  • Will sensitive values be visible in status commands like SHOW FULL PROCESSLIST?

At VividCortex we err on the side of safety and security, rather than favoring convenience. There’s a fairly simple question that does a pretty good job of illustrating our goal: if someone succeeds in a SQL injection attack against our databases, will they see any sensitive data in cleartext? The answer needs to be “no.” This is a higher standard than on-disk encryption. It means that someone has to get access to the keys for the particular data they’re trying to decrypt, in order to see anything other than garbage. And those keys are not present or accessible on the database servers in any form, not even in-memory.

Making It Convenient

Convenience is important. If it’s too hard to do encryption, there’s an increased risk that it won’t be done. Fortunately, Go’s elegant interfaces for the database/sql package make the burden transparent to the programmer!

We learned how to do this from Jason Moiron’s excellent blog post on the Valuer and Scanner interfaces. Please read that if you haven’t yet.

To implement transparent encryption and decryption, we created a custom data type that implements the Valuer and Scanner interfaces. The implementation is straightforward and quite similar to Jason’s example of compressing and decompressing, except that we used encryption libraries instead.
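
To show the shape of such a type, here is a minimal, hypothetical sketch. The package and type names (securedb, EncryptedValue), the in-memory key variable and the choice of AES-GCM are assumptions made for illustration only, not VividCortex's actual code (which also prepends the version byte and key ID described later in this post):

package securedb

import (
    "crypto/aes"
    "crypto/cipher"
    "crypto/rand"
    "database/sql/driver"
    "errors"
    "io"
)

// key is a stand-in for real key management; in practice the key would be
// fetched per-tenant from a key service and never live on the DB servers.
var key = make([]byte, 32)

// EncryptedValue is transparently encrypted when written and decrypted when read.
type EncryptedValue string

// Value implements driver.Valuer, so database/sql sends only ciphertext to MySQL.
func (e EncryptedValue) Value() (driver.Value, error) {
    block, err := aes.NewCipher(key)
    if err != nil {
        return nil, err
    }
    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return nil, err
    }
    nonce := make([]byte, gcm.NonceSize())
    if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
        return nil, err
    }
    // Stored form: nonce || ciphertext.
    return gcm.Seal(nonce, nonce, []byte(e), nil), nil
}

// Scan implements sql.Scanner, so rows.Scan hands back plaintext.
func (e *EncryptedValue) Scan(src interface{}) error {
    raw, ok := src.([]byte)
    if !ok {
        return errors.New("EncryptedValue: expected []byte from the driver")
    }
    block, err := aes.NewCipher(key)
    if err != nil {
        return err
    }
    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return err
    }
    if len(raw) < gcm.NonceSize() {
        return errors.New("EncryptedValue: value too short")
    }
    plain, err := gcm.Open(nil, raw[:gcm.NonceSize()], raw[gcm.NonceSize():], nil)
    if err != nil {
        return err
    }
    *e = EncryptedValue(plain)
    return nil
}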

Now our code is incredibly simple to use with encrypted values. All we do is define a variable of our custom type. For example, instead of

var password string
err = rows.Scan(&password)

We simply use

var password EncryptedValue
err = rows.Scan(&password)

It’s similarly simple to insert values encrypted into the database. Magic! This is why I often say that Go’s design, although it seems minimalistic at first, is actually very advanced and powerful.
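
As a sketch of what the write path then looks like (continuing the hypothetical securedb package above; the credentials table and column names are invented for the example):

package securedb

import "database/sql"

// SavePassword is a hypothetical helper: because EncryptedValue implements
// driver.Valuer, db.Exec only ever sends ciphertext over the wire.
func SavePassword(db *sql.DB, userID int64, plaintext string) error {
    _, err := db.Exec(
        "INSERT INTO credentials (user_id, password) VALUES (?, ?)",
        userID, EncryptedValue(plaintext),
    )
    return err
}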

"Go feels under-engineered because it only solves real problems." Exactly. http://t.co/18LhLT0ALB #golang

— VividCortex (@VividCortex) September 18, 2014

Nuts And Bolts

The code is small. The exact details of all the code are not all that important for this blog post; much of it is about things that are out of scope here. The gist of it, though, is that we store values as byte arrays:

  • The first byte is an indicator of the version of our encryption algorithm used, so there’s a clear migration path for changes.
  • The next four bytes indicate which key we used to encrypt this value, so we have 4 billion possible keys.
  • The rest is the encrypted payload.

We can even change this in the future. For example, we can switch on the first byte’s value, if we want, to determine whether the key ID is in the next 4 bytes, or if it’s something more, such as the next 8 bytes. So we can easily expand the number of keys we can indicate. We can also, if we ever hit version 255, use that to indicate that the version number continues in the next byte. This is a standard trick used, among other places, by the MySQL wire protocol.

The result is that we have a simple and future-proof way to encrypt values.
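
To make that layout concrete, here is a small hypothetical sketch of packing and unpacking such a value. The envelope type, the helper names and the big-endian key-ID encoding are illustrative assumptions, not the actual VividCortex code:

package securedb

import (
    "encoding/binary"
    "errors"
)

// envelope mirrors the layout described above: one version byte, four bytes
// of key ID, then the encrypted payload.
type envelope struct {
    Version byte
    KeyID   uint32
    Payload []byte
}

// marshal packs an envelope into the byte slice that gets stored in MySQL.
func (v envelope) marshal() []byte {
    out := make([]byte, 5+len(v.Payload))
    out[0] = v.Version
    binary.BigEndian.PutUint32(out[1:5], v.KeyID)
    copy(out[5:], v.Payload)
    return out
}

// unmarshal splits a stored value back into version, key ID and payload, so
// the right key and algorithm version can be chosen for decryption.
func unmarshal(raw []byte) (envelope, error) {
    if len(raw) < 5 {
        return envelope{}, errors.New("encrypted value too short")
    }
    return envelope{
        Version: raw[0],
        KeyID:   binary.BigEndian.Uint32(raw[1:5]),
        Payload: raw[5:],
    }, nil
}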

Alternative Approaches

In addition to the approaches we’ve mentioned, there are several others. There are commercial products designed to ease the encryption and decryption you might otherwise wrap around MySQL yourself, and perhaps fumble in some ways. There are encryption functions inside MySQL itself—but educate yourself about them before using them. There are others, too, but a search should turn up all you need.

Conclusions

By using Go’s built-in interfaces, we created a solution for transparently encrypting values in our database so that it’s never in the database in cleartext, either on-disk or in-memory. The code is easy for programmers to use, which improves our security posture automatically. All sensitive data gets encrypted in-flight and at-rest, and an attacker would have to have extensive access to our systems (an SQL injection wouldn’t suffice) to be able to decrypt the data.

We highly recommend that you use the standard Go interfaces for the power they give you. And please, ask your SaaS providers, including us, hard questions about security and how it’s implemented. Every service needs to be secure to make the Internet a safer place.

If you liked this and want to learn more about Go, you might also like our webinar about developing database-driven apps with Go and MySQL. Click below to watch a recording.

Post Updated 7/31/2017

MySQL 8.0.2 More Flexible Undo Tablespace Management

In MySQL 8.0.2 DMR we will introduce features which make managing undo tablespaces easier in InnoDB.

The main improvement is that you can now create and drop undo tablespaces at any time.  You can change the config file setting before any startup, whether recovery is needed or not. …

What’s New in MariaDB Connector/C 3.0

RalfGebhardt | Mon, 07/31/2017 - 05:41

We are pleased to announce the general availability (GA) of MariaDB Connector/C 3.0. MariaDB Connector/C 3.0.2 is the newest version of MariaDB Connector/C. This release is compatible with MariaDB Connector/C 2.3 - no code changes necessary to upgrade.

MariaDB Connector/C 3.0 includes new security enhancements, plugins and API functions.

Security

In addition to OpenSSL, MariaDB Connector/C 3.0 now supports:

  • GnuTLS
  • Windows SChannel: removes dependencies on external libraries
  • Windows SChannel: becomes the default for SSL on Windows
  • TLSv1.1 and TLSv1.2 support
  • Passphrase protected private keys

Plugins

  • All plugins can either be linked statically or built as shared objects (or dynamic link libraries on Windows)
  • Pluggable Virtual IO (PVIO) for communication via socket, named pipe and shared memory
  • Connection plugins, e.g. for Aurora failover or replication (master write, slave read)
  • Remote IO plugin, which allows access to remote files (via http, https, ftp, ldap, ...)
  • Trace plugin (for analyzing and dumping network traffic)
  • New GSSAPI authentication plugin

New API Functions

MariaDB Connector/C 3.0 is introducing the following new API functions:

  • Bulk operations (array binding) for prepared statements (insert, update, delete).
  • support for extended client/server capabilities (requires MariaDB 10.2 or newer)
  • mariadb_get_charset_by_name and mariadb_get_charset_by_nr, which return charset information for a given internal number or name of character set. These functions have been previously used internally by MariaDB Connector/ODBC and are now exported, so they can be used also within plugins.
  • mariadb_stmt_execute_direct prepares and executes in one step (mainly used by MariaDB ODBC driver)
  • mariadb_cancel aborts a connection immediately by making all subsequent read/write operations fail
  • mysql_get_option and mysql_get_optionv (variable argument list) for obtaining option values for a given connection.
  • mysql_reconnect which was used internally before (if the option MYSQL_OPT_RECONNECT was set) is now part of the API and can be used by applications and plugins to re-establish a failing connection
  • mysql_reset_connection resets the current connection and clears session state
  • mysql_stmt_warning_count returns warnings per statement
  • Functions for obtaining session state changes:
    • mysql_session_track_get_first
    • mysql_session_track_get_next
  • Added tls_version support for SChannel. tls_version has to be specified via mysql_options(mysql, MARIADB_OPT_TLS_VERSION, ...)

Download the MariaDB Connector now and learn about the newest evolution of MariaDB Connector/C 3.0.

Cassandra Query Language

After installing Cassandra and reading Cassandra The Definitive Guide, it struck me that I should learn a bit more about the Cassandra Query Language (CQL). So, after I set up a single-node environment, I created a .bashcassandra environment file to connect as a student user to the Cassandra instance:

# Add the Java and JRE paths to the $PATH environments.
export set PATH=$PATH:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.45.x86_64:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.45.x86_64/jre

# Add the $JAVA_HOME and $JRE_HOME environment variables.
export set JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.45.x86_64/
export set JRE_HOME=/usr

Having started Cassandra as the cassandra user, I connected to the Cassandra Query Language Shell (cqlsh) to learn how to write CQL. You can find the basic structure of the Cassandra Query Language (CQL) on the Apache Cassandra website. I also discovered that CQL by itself can’t join tables; you need Apache SparkSQL for that. Apache SparkSQL, which became available in 2015, adds the ability to perform joins in Cassandra.

I also learned you can’t use a CREATE OR REPLACE command when you change certain aspects of User-Defined Functions (UDFs). You actually need to drop any UDF before you change a RETURNS NULL ON NULL INPUT clause to a CALLED ON NULL INPUT clause or vice versa. You also can’t embed Java that connects to the database without using the cassandra-java-driver-2.0.2 driver.

You connect to the cqlsh like this:

cqlsh

Here’s my script that creates a Cassandra keyspace, which is more or less a database. You use the USE command to connect to the keyspace or database, like you would in MySQL. You do not have sequences in Cassandra because they’re not a good fit for a distributed architecture. Cassandra does not support a native procedural extension like relational databases do. You must create User-Defined Functions (UDFs) by embedding the logic in Java.

This script does the following:

  • Creates a keyspace
  • Uses the keyspace
  • Conditionally drops tables and functions
  • Creates two tables
  • Inserts data into the two tables
  • Queries data from the tables

I also included a call to a UDF inside a query in two of the examples. One of the queries demonstrates how to return a JSON structure from a query. To simplify things and clarify the script’s behavior, the details are outlined below.

  • The first segment of the script creates the keyspace, changes the scope to use the keyspace, conditionally drops tables, creates tables, and inserts values into the tables:

    /* Create a keyspace in Cassandra, which is like a database
       in MySQL or a schema in Oracle. */
    CREATE KEYSPACE IF NOT EXISTS student
      WITH REPLICATION = {
         'class':'SimpleStrategy'
        ,'replication_factor': 1 }
      AND DURABLE_WRITES = true;

    /* Use the keyspace or connect to the database. */
    USE student;

    /* Drop the member table from the student keyspace. */
    DROP TABLE IF EXISTS member;

    /* Create a member table in the student keyspace. */
    CREATE TABLE member
    ( member_number       VARCHAR
    , member_type         VARCHAR
    , credit_card_number  VARCHAR
    , credit_card_type    VARCHAR
    , PRIMARY KEY ( member_number ));

    /* Conditionally drop the contact table from the student keyspace. */
    DROP TABLE IF EXISTS contact;

    /* Create a contact table in the student keyspace. */
    CREATE TABLE contact
    ( contact_number      VARCHAR
    , contact_type        VARCHAR
    , first_name          VARCHAR
    , middle_name         VARCHAR
    , last_name           VARCHAR
    , member_number       VARCHAR
    , PRIMARY KEY ( contact_number ));

    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12345','GROUP','2222-4444-5555-6666','VISA');

    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00001','FAMILY','Barry', NULL,'Allen','SFO-12345');

    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00002','FAMILY','Iris', NULL,'West-Allen','SFO-12345');

    /* Insert a row into the member table. */
    INSERT INTO member
    ( member_number, member_type, credit_card_number, credit_card_type )
    VALUES
    ('SFO-12346','GROUP','3333-8888-9999-2222','VISA');

    /* Insert a row into the contact table. */
    INSERT INTO contact
    ( contact_number, contact_type, first_name, middle_name, last_name, member_number )
    VALUES
    ('CUS_00003','FAMILY','Caitlin','Marie','Snow','SFO-12346');
  • The following queries the member table:

    /* Select all columns from the member table. */
    SELECT * FROM member;

    It returns the following:

     member_number | credit_card_number  | credit_card_type | member_type
    ---------------+---------------------+------------------+-------------
         SFO-12345 | 2222-4444-5555-6666 |             VISA |       GROUP
         SFO-12346 | 3333-8888-9999-2222 |             VISA |       GROUP
  • Create a concatenate User-defined function (UDF) for Cassandra. The first step requires you to edit the cassandra.yaml file, which you find in the /etc/cassandra/default.conf directory. There is a single parameter that you need to edit, and it is the enable_user_defined_functions parameter. By default the parameter is set to false, and you need to enable it to create UDFs.

    After you make the edit, the cassandra.yaml file should look like this:

    # If unset, all GC Pauses greater than gc_log_threshold_in_ms will log at
    # INFO level
    # UDFs (user defined functions) are disabled by default.
    # As of Cassandra 3.0 there is a sandbox in place that should prevent execution of evil code.
    enable_user_defined_functions: true

    After you make the change, you can create your own UDF. The following UDF formats the first, middle, and last name so there’s only one whitespace between the first and last name when the middle name value is null.

    This type of function must use a CALLED ON NULL INPUT clause in lieu of a RETURNS NULL ON NULL INPUT clause. The latter would force the function to return a null value if any one of the parameters were null.

    /* Drop the concatenate function because a replace disallows changing a
       RETURNS NULL ON NULL INPUT with a CALLED ON NULL INPUT without raising
       an "89: InvalidRequest" exception. */
    DROP FUNCTION concatenate;

    /* Create a user-defined function to concatenate names. */
    CREATE OR REPLACE FUNCTION concatenate
    ( first_name VARCHAR, middle_name VARCHAR, last_name VARCHAR )
    CALLED ON NULL INPUT
    RETURNS VARCHAR
    LANGUAGE java AS $$
      /* Concatenate first and last names when middle name is null,
         and first, middle, and last names when middle name is not null. */
      String name;

      /* Check for null middle name. */
      if (middle_name == null) {
        name = first_name + " " + last_name;
      } else {
        name = first_name + " " + middle_name + " " + last_name;
      }
      return name;
    $$;
  • Query the values from the contact table with the UDF function in the SELECT-list:

    /* Query the contact information. */
    SELECT member_number
    ,      contact_number
    ,      contact_type
    ,      concatenate(first_name, middle_name, last_name) AS full_name
    FROM   contact;

    It returns the following:

     member_number | contact_number | contact_type | full_name
    ---------------+----------------+--------------+--------------------
         SFO-12345 |      CUS_00001 |       FAMILY |        Barry Allen
         SFO-12345 |      CUS_00002 |       FAMILY |    Iris West-Allen
         SFO-12346 |      CUS_00003 |       FAMILY | Caitlin Marie Snow
  • Query the values from the contact table with a JSON format:

    /* Query the contact information and return in a JSON format. */
    SELECT JSON contact_number
    ,           contact_type
    ,           concatenate(first_name, middle_name, last_name) AS full_name
    FROM        contact;

    It returns the following:

     [json]
    -------------------------------------------------------------------------------------------------
     {"contact_number": "CUS_00001", "contact_type": "FAMILY", "full_name": "Barry Allen"}
     {"contact_number": "CUS_00002", "contact_type": "FAMILY", "full_name": "Iris West-Allen"}
     {"contact_number": "CUS_00003", "contact_type": "FAMILY", "full_name": "Caitlin Marie Snow"}

You can call the script from a relative directory inside cqlsh, like this:

source 'cstudent.cql'

At the end of the day, the concept of adding and removing nodes is attractive. However, the lack of normal relational mechanics and the narrowly supported set of CQL semantics leave me with open questions. For example, is clustering without a coordinator really valuable enough to settle for eventual, or tunable, consistency with such a narrowly scoped query language?

As always, I hope this helps those looking for a quick how-to on Cassandra.

How to Find Processlist Thread id in gdb

I was involved in a discussion of a complex MySQL-related problem where we had to study the backtraces of all threads in gdb (produced by the thread apply all bt command, if you ever forget it) in the hope of finding out why MySQL hangs. In the process, the question came up of how to find the thread id for each thread so it could be matched against previously collected outputs of SHOW PROCESSLIST and SHOW ENGINE INNODB STATUS.

I assumed I knew the answer, as I had to find this out recently enough for this blog post (and before that for a real customer case). The idea is simple: find a frame where the function has a parameter of type THD * (usually named thd), like this:
#10 0x0000000000cb47fe in do_command (thd=0x7f32512b7000)
    at /usr/src/debug/percona-server-5.7.18-15/percona-server-5.7.18-15/sql/sql_parse.cc:960

and then check the thread_id item of this structure.

In that blog post it looked as simple as just referring to the thd of do_command's frame without even checking much:
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f7f5ce02b00 (LWP 9232))]
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
238     ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S: No such file or directory.
(gdb) p do_command::thd->thread_id
$9 = 14

I prefer to double-check my suggestions before making them, so I immediately tried this on my CentOS 6.9 VM, which has been running a recent Percona Server 5.7.x by default since my time working at Percona:
[root@centos ~]# gdb -p `pidof mysqld`
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-92.el6)
...
Loaded symbols for /usr/lib64/mysql/plugin/tokudb_backup.so
0x00007f550ad35383 in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 jemalloc-3.6.0-1.el6.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64 libaio-0.3.107-10.el6.x86_64 libcom_err-1.41.12-23.el6.x86_64 libgcc-4.4.7-18.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 libstdc++-4.4.7-18.el6.x86_64 nss-softokn-freebl-3.14.3-23.3.el6_8.x86_64 numactl-2.0.9-2.el6.x86_64 openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f550d2b2820 (LWP 1978))]
#0  0x00007f550ad35383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->thread_id
No frame is currently executing in block do_command(THD*).
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f54d837b700 (LWP 2183))]
#0  0x00007f550ad35383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->thread_id
Cannot take address of method thread_id.
(gdb) call do_command::thd->thread_id()
Cannot evaluate function -- may be inlined

As you can see, I started to check threads one by one and apply that good old trick. Thread 1 had no frame executing do_command(), but I did not give up and proceeded to the next thread, as I knew I had at least one active connection (I had checked the output of SHOW PROCESSLIST). There I got a surprise: there was no way to get the thread_id of thd. I used tab completion, so I know that thread_id (a variable or method) exists, but the attempt to call it also failed, as you can see.

This is the problem with using gdb-based "tricks" on code that evolves and changes over time. The last time I used p do_command::thd->thread_id it was probably for MariaDB 10.1.x, and the item was there. But in MySQL 5.7 (and all forks based on it) there were many code changes, so we should be ready for changes in unexpected places.

I did not add more comments on finding the thread id to that discussion; I made a note to myself and then, later, decided to check the source code of MySQL 5.7 (I did not have the Percona Server 5.7 source at hand, but they hardly differ in such basic details) to find out what had changed in the THD structure so that thread_id is no longer just a variable. From past experience I expected to see the structure defined in sql/sql_class.h, but grep would help to find it even if that's no longer the case:
[root@centos mysql-server]# grep -n "class THD" sql/*.h
sql/debug_sync.h:27:class THD;
sql/derror.h:24:class THD;
sql/event_data_objects.h:40:class THD;
...
sql/sql_class.h:1412:class THD :public MDL_context_owner,
sql/sql_class.h:4175:    raise_error() or raise_warning() methods provided by class THD.
sql/sql_cmd.h:25:class THD;
...

I found the following there:
class THD :public MDL_context_owner,
           public Query_arena,
           public Open_tables_state
{
...
private:
  my_thread_id  m_thread_id;

public:
...
  /**
    Assign a value to m_thread_id by calling
    Global_THD_manager::get_new_thread_id().
  */
  void set_new_thread_id();
  my_thread_id thread_id() const { return m_thread_id; }
...

So, in MySQL 5.7 thread_id() is, indeed, a method that was inlined, and essentially it returns the private m_thread_id item. Benefits of C++... I also highlighted the Global_THD_manager singleton, because during my next gdb sessions I found out that the simple global list of threads is gone as well, and in 5.7 everything is done via that Global_THD_manager. This is a topic for some other post, though.

At least now I know what to do in gdb:
...
(gdb) thread 7
[Switching to thread 7 (Thread 0x7f54d8236700 (LWP 2275))]
#0  0x00007f550ad35383 in poll () from /lib64/libc.so.6
(gdb) p do_command::thd->m_thread_id
$1 = 86
(gdb) p do_command::thd->m_main_security_ctx
$3 = {m_user = {m_ptr = 0x7f5500fdaf90 "myuser", m_length = 6,
    m_charset = 0x1ded640, m_alloced_length = 8, m_is_alloced = true},
  m_host = {m_ptr = 0x7f54d98ab090 "localhost", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 16, m_is_alloced = true},
  m_ip = {m_ptr = 0x7f54f0eb0210 "127.0.0.1", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 16, m_is_alloced = true},
  m_host_or_ip = {m_ptr = 0x7f54d98ab090 "localhost", m_length = 9,
    m_charset = 0x1ded640, m_alloced_length = 0, m_is_alloced = false},
  m_external_user = {m_ptr = 0x15167ab "", m_length = 0,
    m_charset = 0x1ded640, m_alloced_length = 0, m_is_alloced = false},
  m_priv_user = "myuser", '\000' <repeats 89 times>, m_priv_user_length = 6,
  m_proxy_user = '\000' <repeats 161 times>, m_proxy_user_length = 0,
  m_priv_host = "localhost", '\000' <repeats 51 times>,
  m_priv_host_length = 9, m_master_access = 1589248, m_db_access = 0,
  m_password_expired = false}
...

So, I know that specific thread 7 was for a session with Id 86 in the output of SHOW PROCESSLIST, and (from m_main_security_ctx, also a new name for old things in 5.7) I know it was a session of myuser connecting locally.

To summarize, there were notable changes in MySQL 5.7 in the THD structure and in thread-management-related code in general, so make sure to re-check your "old gdb tricks" when you start working with 5.7. Reading the code helps.

Unfortunately (for gdb beginners like me) a lot of C++ approaches were introduced, including singletons and template-based iterators instead of simple doubly linked lists, so one has to work hard to adapt to these. I hope to discuss some of my further findings and new "C++ specific" and "MySQL 5.7 specific" approaches to studying MySQL in gdb in my upcoming posts.

Percona Server for MySQL 5.7.18-16 Is Now Available

Percona is glad to announce the GA release of Percona Server for MySQL 5.7.18-16 on July 28, 2017 (Downloads are available here and from the Percona Software Repositories).

Based on MySQL 5.7.18, including all the bug fixes in it, Percona Server for MySQL 5.7.18-16 is the current GA release in the Percona Server for MySQL 5.7 series. All of Percona’s software is open-source and free, and you can find all the release details in the 5.7.18-16 milestone at Launchpad.

Please note that RHEL 5, CentOS 5 and Ubuntu versions 12.04 and older will not be supported in future releases of Percona Server, and no further packages will be added for these distributions.

New Features:

  • Percona Server for MySQL is now available on Debian 9 (stretch). The support only covers the amd64 architecture.
  • Percona Server for MySQL can now be built with the support of OpenSSL 1.1.
  • The MyRocks storage engine has been merged into Percona Server.
  • TokuDB now enables killing a query that is awaiting an FT locktree lock.
  • TokuDB now enables using the MySQL DEBUG_SYNC facility within Percona FT.

Bugs Fixed:

  • Row counts in TokuDB could be lost intermittently after restarts. Bug fixed #2.
  • In TokuDB, two races in the fractal tree lock manager could significantly affect transactional throughput for some applications that used a small number of concurrent transactions. These races manifested as transactions unnecessarily waiting for an available lock. Bug fixed #3.
  • Percona FT could assert when opening a dictionary, without writing useful information to the error log. Bug fixed #23.
  • Percona FT could assert for various reasons when deserializing nodes, without producing useful error output. Bug fixed #24.
  • It was not possible to build Percona Server on Debian 9 (stretch) due to issues with OpenSSL 1.1. Bug fixed #1702903 (upstream #83814).
  • Packaging was using the dpkg --verify command which is not available on wheezy/precise. Bug fixed #1694907.
  • Enabling and disabling the slow query log rotation spuriously added the version suffix to the next slow query log file name. Bug fixed #1704056.
  • With two client connections to a server (debug server build), the server could crash after one of the clients set the global option userstat and flushed the client statistics (FLUSH CLIENT_STATISTICS) and then both clients were closed. Bug fixed #1661488.
  • Percona FT did not pass cmake flags on to snappy cmake. Bug fixed #41.
  • The progress status for partitioned TokuDB table ALTERs was misleading. Bug fixed #42.
  • When a client application is connecting to the Aurora cluster end point using SSL (--ssl-verify-server-cert or --ssl-mode=VERIFY_IDENTITY option), wildcard and SAN enabled SSL certificates were ignored. Note that the --ssl-verify-server-cert option is deprecated in Percona Server 5.7. Bug fixed #1673656 (upstream #68052).
  • Killing a stored procedure execution could result in an assert failure on a debug server build. Bug fixed #1689736 (upstream #86260).
  • The SET STATEMENT .. FOR statement changed the global instead of the session value of a variable if the statement occurred immediately after the SET GLOBAL or SHOW GLOBAL STATUS command. Bug fixed #1385352.
  • When running SHOW ENGINE INNODB STATUS, the Buffer pool size, bytes entry contained 0. Bug fixed #1586262.
  • The synchronization between the LRU manager and page cleaner threads was not done at shutdown. Bug fixed #1689552.
  • Spurious lock_wait_timeout_thread wakeup in lock_wait_suspend_thread() could occur. Bug fixed #1704267 (upstream #72123).

Other bugs fixed: #1686603, #6, #44, #65, #1160986, #1686934, #1688319, #1689989, #1690012, #1691682, #1697700, #1699788, #1121072, and #1684601 (upstream #86016).

The release notes for Percona Server for MySQL 5.7.18-16 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

Note

Due to a new package dependency, Ubuntu/Debian users should use apt-get dist-upgrade or apt-get install percona-server-server-5.7 to upgrade.

Node.js and MySQL on the Oracle Cloud

Let's explore how to deploy a node.js app with MySQL backend on the Oracle Cloud. I want to cover several aspects:
  • How to deploy and initialize a MySQL instance
  • How to adapt the source code
  • How to deploy the application
  • How to scale the application
There are different ways to configure this. I tested the easiest deployment with MySQL Cloud Service and the Application Container Cloud for the node.js part. All configurations are done via the cloud web GUI. There is also a REST interface available. But let's keep that for later.
If you don't have access to the Oracle Cloud you can get a trial access here.
How to deploy a MySQL instance

Once you have logged into the Oracle Cloud you can create new instances from the dashboard. The following screenshots describe the process.


On the next screen we upload the public key for admin access to the instance. Either upload your own public key or generate a new key pair. (If you generate a new key pair you need to download the private key to your local machine.)

I skipped the backup and monitoring configurations for this demo. Let's focus on the application instead. After creating the instance (approx. 10 min) you can navigate via the dashboard to this instance and get the IP address. This is needed for the next step.
To initialize the database I ran this little script, which uses ssh to reach the instance (using the private key), switches user to "oracle" and then calls the MySQL CLI to run a few SQL statements.

How to adapt the source code
    The Application Container Cloud sets a few environment variables that should be used inside the application to adapt to the environment. In my case these are the following variables:
    • PORT is the port number that the application should listen on
    • MYSQLCS_USER_NAME is the MySQL user name for the database backend
    • MYSQLCS_USER_PASSWORD is the corresponding password
    • MYSQLCS_CONNECT_STRING is the hostname and port of the database backend
    I could have hardcoded the database connection parameters but that is inflexible if the architecture changes. So let's use these variables. The Node.js code looks like this:

    How to deploy the application

    There are two simple steps needed: creating an application and defining service bindings. In my case the only service binding is the MySQL backend. But one step after the other. To create the application, you first need a manifest.json file that describes it. Here is mine:
    Ideally you create a zip archive with the source code, the manifest.json file and all other resources that your application needs. If you want to use my zip archive, feel free. You find it on GitHub. From the Oracle Cloud Dashboard click on "create instance -> application container" and then select "Create Instance" and "Node.js". (Java SE, Java EE, Python, Ruby and PHP are available as well.)

    On the pop-up you define the application artifacts, number of application instances and the memory per instance. After you click "create" the application is deployed automatically within a few minutes.
    The last step is to connect the application service with the database backend. To achieve that, click on the application in the application overview page. Here you find the URL under which your application will be available. And on the left hand side you see three tabs:

    Overview, Deployments and Administration. Click on "Deployments". Here you can add the service binding as described in the following screenshot:

    After modifying the service bindings you have to click "Apply changes". This will restart the application instances, which is needed because only then are the environment variables for the database backend set correctly.
    That's it. We have an application. The URL to access the new app is listed in the application overview tab. Because this URL is not so nice for offering a short URL service, I registered a new domain and forwarded it to the anota application. Maybe it is still running? Check here.

    How to scale the application

    This is really easy. On the application overview tab you can just increase the number of instances and the memory per instance. After applying the changes, the Application Container Cloud platform will deploy new instances, stop spare instances or reconfigure the existing instances. If you use my ANOTA application, go to the report page. The last line prints the hostname of the application server. Requests are automatically load balanced between the available application instances.

    Summary

    Only minor changes are needed for the application to run on the Oracle Cloud Platform: reading the port and the database connection parameters from the provided environment variables, and that's it. Deployment is really easy via the GUI. And scalability is simple now that the full Oracle Cloud Platform is available and can be provisioned within minutes.

    What is MySQL Partitioning?

    In this blog, we’ll quickly look at MySQL partitioning.

    Partitioning is a way in which a database (MySQL in this case) splits its actual data into separate tables that are still treated as a single table by the SQL layer.

    When partitioning, it’s a good idea to find a natural partition key. You want to ensure that table lookups go to the correct partition or group of partitions. This means that all SELECT, UPDATE and DELETE statements should include that column in the WHERE clause. Otherwise, the storage engine does a scatter-gather and queries ALL partitions in a UNION that is not concurrent.

    Generally, you must add the partition key into the primary key along with the auto increment, i.e., PRIMARY KEY (part_id,id). If you don’t have well-designed and small columns for this composite primary key, it could enlarge all of your secondary indexes.

    You can partition by range or hash. Range is great because you have groups of known IDs in each table, and it helps when querying across partition IDs. This still can create hotspots in the newest partition, as all new inserts go there. Partitioning by hash “load balances” the table, and allows you to write to partitions more concurrently. This makes range queries on the partition key a bad idea.

    In MySQL 5.7, partitioning became native to the storage engine, and the old method where MySQL itself had to handle the partitions was deprecated. This means InnoDB partitions (and a larger number of partitions) are a better choice than in the past.

    As with all features and recommendations, this only makes sense if it helps your data and workload!

    MySQL Group Replication: who is the primary master – updated!

    Some time ago I wrote this article explaining how to find which node was the Primary Master in a MySQL Group Replication cluster running in Single-Primary Mode.

    In the latest release of Group Replication, MySQL 8.0.2 dmr, Jaideep improved the visibility of Group Replication extending the performance_schema tables (see his article).

    Thanks to these improvements, it’s now very easy to find which host is acting as the Primary-Master. This is the query you can use:

    mysql-sql> SELECT MEMBER_HOST as `PRIMARY`
                 FROM performance_schema.replication_group_members
                WHERE MEMBER_ROLE='PRIMARY';
    +---------+
    | PRIMARY |
    +---------+
    | mysql3  |
    +---------+
    1 row in set (0.00 sec)

    Much easier, isn’t it?

    I'll let you discover the other improvements and features in MySQL 8.0.2, and don’t forget that your feedback is very important!

    High Availability with Multi-Source Replication in MariaDB Server

    gerrynarvaja | Wed, 07/26/2017 - 19:29

    Multi-source replication is a handy way to simplify your high availability database setup – especially as compared to the regular MySQL replication approach. Let's look at how it works.

    (Note: This post is a revision of one from 2014, with minor updates made.)

    The Basics

    For our example scenario, we'll use a three-server topology, which is commonly employed for simple failover. (I used the same setup in my post on how to enable GTIDs in MariaDB Server.) A is the active master, B is a standby master set up to replicate from A, and C is a multi-purpose slave replicating from A. I also set up A to replicate from B. This way, if A fails, the system will start writing to B, and once A comes back up it will recover all the transactions from B through replication. Typically this setup is used in combination with a tool like Master High Availability Manager (MHA) to migrate the slave from one master to the next.

    For this particular case, assuming that the application(s) write to the active master only, we're going to set the domain ID for all servers to 1.

    SET GLOBAL gtid_domain_id = 1;

    We'll do the same in the my.cnf files:

    gtid-domain-id=1

    Set Up the Slave to Replicate from Two Masters

    With the basics taken care of, we now will set up server C to replicate from both server A (active master) and server B (standby master). Keep in mind that now each transaction that reaches server C will come from two different sources. For example, if you issue an INSERT on the active master A, it will replicate to C and B. Since C is also B's slave, the same transaction will replicate from B to C. By default, C will apply both transactions, which is redundant and can lead to data inconsistencies and errors. To avoid this, it is necessary to set the gtid_ignore_duplicates variable to ON.

    SET GLOBAL gtid_ignore_duplicates=ON;

    And do the same in the my.cnf files:

    gtid-ignore-duplicates=ON

    This way, when C receives the same transaction through two different slave connections, it will verify the GTID and only apply it once.

    The next step is to set the second slave connection. In my previous article, I set up the gtid_slave_pos global variable. Since server C is already replicating successfully from server A, this variable already has the proper value and there is no need to set it up manually. We only need to define the new connection using CHANGE MASTER TO:

    CHANGE MASTER 's-102' TO  MASTER_HOST='192.168.56.102', MASTER_USER='repl', MASTER_PASSWORD='repl', master_use_gtid=slave_pos;

    Notice that we add the label 's-102' after the MASTER reserved word. This string identifies the second replication connection. This identifier needs to be used for every replication command. So, to start replicating from server B, you would use START SLAVE like this:

    START SLAVE 's-102';

    In the same way, to check the status of this replication connection you would use:

    SHOW SLAVE 's-102' STATUS\G

    You can also issue the commands to all replication connections by using the keyword ALL; for example:

    SHOW ALL SLAVES STATUS\G
    STOP ALL SLAVES;
    START ALL SLAVES;

    Notice that when using the keyword ALL, you also have to use SLAVES (plural form). The STOP and START commands will show as many warnings as the number of replication connections:

    MariaDB [(none)]> START ALL SLAVES;
    Query OK, 0 rows affected, 2 warnings (0.02 sec)

    MariaDB [(none)]> SHOW WARNINGS;
    +-------+------+-----------------------+
    | Level | Code | Message               |
    +-------+------+-----------------------+
    | Note  | 1937 | SLAVE 's-102' started |
    | Note  | 1937 | SLAVE '' started      |
    +-------+------+-----------------------+
    2 rows in set (0.00 sec)

    After completing the setup, the resulting topology will look similar to this:

    +-----+         +-----+
    |  A  |         |  B  |
    +-----+         +-----+
       |               |
       \               /
        \             /
         V           V
           +-----+
           |  C  |
           +-----+

    Failures

    If server A goes away or the connection between server A and server C fails, server C will still receive all the transactions through the alternate replication connection to server B (labeled 's-102'). Once server A comes back online again, and/or server C can reconnect to it, server C will compare all the GTIDs coming from server A and ignore those which have already been applied.

    The advantage of using this approach over traditional single-source replication is that there is no need to re-point the slaves at a new master. When the active master fails and the applications start writing to the alternative master, the slaves will continue to replicate from it, simplifying the management of failover scenarios.


    Enabling GTIDs for Server Replication in MariaDB Server 10.2

    gerrynarvaja | Wed, 07/26/2017 - 18:56

    I originally wrote this post in 2014, after the release of MariaDB Server 10.0. Most of what was in that original post still applies, but I've made some tweaks and updates since replication and high availability (HA) remain among the most popular MariaDB/MySQL features.

    Replication first appeared on the MySQL scene more than a decade ago, and as replication implementations became more complex over time, some limitations of MySQL’s original replication mechanisms started to surface. To address those limitations, MySQL v5.6 introduced the concept of global transaction identifiers (GTIDs), which enable some advanced replication features. MySQL DBAs were happy with this, but complained that in order to implement GTIDs you needed to stop all the servers in the replication group and restart them with the feature enabled. There are workarounds – for instance, Booking.com documented a procedure to enable GTIDs with little or no downtime, but it involves more complexity than most organizations are willing to allow. (Check out this blog post for more on how Booking.com handles replication and high availability.)

    MariaDB implements GTIDs differently from MySQL, making it possible to enable and disable them with no downtime. Here’s how.

    A Simple HA Implementation

    Let’s start with a common HA implementation, with three servers running MariaDB 10.0 or higher: One active master (server A), one passive master (server B), and a slave replicating from the active master (server C). The active and passive masters are set up to replicate master-master.

    I’m not showing it, but between servers A and B and the application you would want an additional layer to switch database traffic to server B in case A fails. Some organizations might deploy another slave replicating from B, or a mechanism to move server C to replicate from B, such as Master High Availability Manager (MHA), but let’s keep things simple here.

    Step 1 – Setting Up the Configuration Files

    GTIDs in MariaDB 10 have three parts: server ID, transaction ID, and domain ID. The server ID and transaction ID are similar in concept to those found in MySQL 5.6. In MariaDB, the server ID is a number and not a UUID, and it is taken from the server_id global variable. The domain ID is an important new concept for multisource replication, which you can read more about in the domain ID article in the MariaDB knowledgebase. In our case, this is the only variable we need to set up; the server ID should already be set up if replication is functional. Let’s use 1 for server A's domain ID, 2 for server B's and 3 for server C's by executing commands like the following on each of the servers:

    SET GLOBAL gtid_domain_id = 1;

    Keep in mind that each session can have its own value for gtid_domain_id, so you'll have to reset all existing connections to properly reset the gtid_domain_id. Finally, persist the values in the corresponding my.cnf files:

    # Domain = 1 for active master: server A
    gtid-domain-id=1

    Step 2 – Changing Replication on the Slave

    Running SHOW MASTER STATUS on server A, the active master, shows the current coordinates for its binary log file:

    MariaDB [(none)]> SHOW MASTER STATUS\G
    *************************** 1. row ***************************
                File: mariadb-bin.000001
            Position: 510
        Binlog_Do_DB:
    Binlog_Ignore_DB:
    1 row in set (0.00 sec)

    We can use these coordinates to find the information we need in order to use GTIDs from the active master by running this command:

    SELECT BINLOG_GTID_POS('mariadb-bin.000001', 510);
    +--------------------------------------------+
    | BINLOG_GTID_POS('mariadb-bin.000001', 510) |
    +--------------------------------------------+
    | 1-101-1                                    |
    +--------------------------------------------+
    1 row in set (0.00 sec)

    Note that the GTID can be an empty string; for clarity these examples work with non-empty GTID values. The result from the function call is the current GTID, which corresponds to the binary file position on the master. With this value, we can now modify the slave configuration on servers B and C, executing the following statements on each of them:

    STOP SLAVE;
    SET GLOBAL gtid_slave_pos = '1-101-1';
    CHANGE MASTER TO master_use_gtid=slave_pos;
    START SLAVE;

    Check the slave status to see that the change has taken effect:

    MariaDB [mysql]> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: 192.168.56.101
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mariadb-bin.000001
              Read_Master_Log_Pos: 510
                   Relay_Log_File: mysqld-relay-bin.000002
                    Relay_Log_Pos: 642
            Relay_Master_Log_File: mariadb-bin.000001
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
    ...
                       Using_Gtid: Slave_Pos
                      Gtid_IO_Pos: 1-101-1
    1 row in set (0.00 sec)

    The last two lines of SHOW SLAVE STATUS indicate that the slave is now using GTIDs to track replication.

    Conclusion

    As you can see, the procedure to enable GTIDs is straightforward and doesn’t require restarting servers or planning for downtime. If you want to revert back to regular replication using binary log positions, you can do so by using RESET SLAVE on the slave and resetting the proper binary log coordinates the traditional way. In fact, once you update your servers to MariaDB Server and review the binary log files with the mysqlbinlog utility, you'll notice that every transaction in the MariaDB binary logs already has the GTID included. For the binary log in the examples used in this article, here's what you see:

    # at 314
    #140807 14:16:01 server id 101  end_log_pos 352   GTID 1-101-1
    /*!100001 SET @@session.gtid_domain_id=1*//*!*/;
    /*!100001 SET @@session.server_id=101*//*!*/;
    /*!100001 SET @@session.gtid_seq_no=1*//*!*/;

    I hope that the ease of implementing GTIDs in MariaDB Server piques your curiosity and encourages you to explore the variety of replication features. For more on replication and other high availability/disaster recovery strategies, check out our white paper, High Availability with MariaDB TX: The Definitive Guide.

    Replication has been one of the most popular MySQL features since it made its way into the product more than a decade ago. Global Transaction IDs (GTIDs) were introduced to make handling complex replication setups easier. This blog post explains how MariaDB makes handling GTIDs simpler.


    MariaDB Galera Cluster 5.5.57 and Connector/C 3.0.2 now available

    The MariaDB project is pleased to announce the availability of MariaDB Galera Cluster 5.5.57 as well as MariaDB Connector/C 3.0.2. See the release notes and changelogs for details.

    • Download MariaDB Galera Cluster 5.5.57
    • Release Notes
    • Changelog
    • What is MariaDB Galera Cluster?
    • MariaDB APT and YUM Repository Configuration Generator
    • Download MariaDB Connector/C 3.0.2
    • Release Notes
    • Changelog

    […]

    The post MariaDB Galera Cluster 5.5.57 and Connector/C 3.0.2 now available appeared first on MariaDB.org.

    What is innodb_autoinc_lock_mode and why should I care?

    In this blog post, we’ll look at what innodb_autoinc_lock_mode is and how it works.

    I was recently discussing innodb_autoinc_lock_mode with some colleagues to address issues at a company I was working with.

    This variable defines the lock mode to use for generating auto-increment values. The permissible values are 0, 1 or 2 (for “traditional”, “consecutive” or “interleaved” lock mode, respectively). In most cases, this variable is set to the default of 1.

    We recommend setting it to 2 when binlog_format=ROW. With the interleaved mode, INSERT statements don’t take the table-level AUTO-INC lock, and multiple statements can execute at the same time. Setting it to 0 or 1 can cause a huge hit in concurrency for certain workloads.
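    Since innodb_autoinc_lock_mode is read-only at runtime, it has to be set at server startup. As a quick sketch, here is one way to check the current mode, followed by an illustrative my.cnf snippet for switching to interleaved mode:

    -- Check the current auto-increment lock mode (0, 1 or 2)
    SHOW GLOBAL VARIABLES LIKE 'innodb_autoinc_lock_mode';

    -- The variable is not dynamic, so it must be set at startup,
    -- for example in my.cnf (illustrative snippet):
    --   [mysqld]
    --   binlog_format            = ROW
    --   innodb_autoinc_lock_mode = 2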

    Interleaved (or 2) is the fastest and most scalable lock mode, but it is not safe if you use STATEMENT-based replication or recovery scenarios where SQL statements are replayed from the binary log. Another consideration – which you shouldn’t rely on anyway – is that IDs might not be consecutive with a lock mode of 2. That means you could do three inserts and expect IDs 100, 101 and 102, but end up with 100, 102 and 104. For most people, this isn’t a huge deal.

    If you are only doing simple inserts, this might not help you. I did a sysbench test on MySQL 5.7 in Amazon RDS with 100 threads and found no difference in performance or throughput between lock modes 1 and 2. It helps the most when the number of rows to be inserted can’t be determined in advance, such as with INSERT INTO … SELECT statements.
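    For illustration only (the table and column names are made up), this is the kind of multi-row insert where the row count isn’t known up front, so lock mode 2 avoids holding the table-level AUTO-INC lock for the duration of the statement:

    INSERT INTO orders_archive (customer_id, total, created_at)
    SELECT customer_id, total, created_at
    FROM   orders
    WHERE  created_at < '2016-01-01';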

    You can find a longer form article in the manual, but I highly recommend setting this value to 2 if you are not using STATEMENT-based replication.

    Replicating from a higher to lower MySQL version

    As we know, replication is only supported officially between consecutive major MySQL versions, and only from a lower version master to a higher version slave.

    This means for example, the following scenario is supported:
    5.6 master –> 5.7 slave

    while these two scenarios are not supported:
    5.5 master –> 5.7 slave
    5.7 master –> 5.6 slave

    That being said, in some contexts (e.g. a MySQL upgrade) it can be valuable to be able to replicate from a master that is using a newer version of MySQL to an older version slave.

    This could be used as part of a rollback strategy, and/or be needed in the case of upgrading a master-master replication topology.

    The idea of this article is to provide instructions on how to make replication work in scenarios where the master runs a higher version than the slaves.

    Replicating from a 5.6 master to a 5.5 slave

    The following parameters need to be present on a 5.6 master’s my.cnf to be able to replicate to a lower version slave:

    slave_sql_verify_checksum = OFF
    binlog_checksum = NONE
    log_bin_use_v1_row_events=1
    binlog_row_image=FULL
    gtid_mode=OFF

    NOTE: remember to bounce the server after any needed parameter changes

    The temporal datatypes (TIME/DATETIME/TIMESTAMP) support fractional values starting from MySQL 5.6.4. Hence the storage requirement and encoding differ in comparison to pre-MySQL 5.6.4 temporal datatypes.

    For this reason, writing to tables with columns using any of (TIME/DATETIME/TIMESTAMP) created AFTER the upgrade to 5.6 will break replication to 5.5, unless the binlog format is set to STATEMENT.

    Replicating temporal columns that use the new format back to 5.5 will not work with ROW-based replication.

    If using MIXED binlog format, unsafe statements are logged using ROW format, so it may or may not work depending on the presence of said unsafe statements.
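    To make the caveat concrete, here is a hedged sketch (the table name is made up). A table created on the 5.6 master after the upgrade stores its temporal columns in the new format, so row-based events for it cannot be applied on a 5.5 slave, while statement-based logging still replicates the SQL text:

    -- Created on the 5.6 master after the upgrade; even a plain DATETIME
    -- column uses the new storage format, so ROW events for this table
    -- will break replication to a 5.5 slave.
    CREATE TABLE audit_log (
      id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      changed_at DATETIME NOT NULL
    );

    -- Workaround mentioned above: log statements instead of rows
    -- (only sessions opened after this pick up the new global value).
    SET GLOBAL binlog_format = 'STATEMENT';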

    Replicating from a 5.7 master to a 5.6 slave

    There is no need for any special my.cnf parameters in this case. However, there is a bug with ignorable events in older MySQL 5.6 versions that prevents replication from a 5.7 master.

    This is fixed in 5.6.24, so if you want to replicate from a 5.7 master to 5.6, make sure the slave is running MySQL >= 5.6.24.

    Finally, avoid using SET PASSWORD statements (those are deprecated anyway), as they will lock you out of your slave in a mixed 5.7/5.6 topology.

    You can use ALTER USER user IDENTIFIED BY ‘auth_string’; instead. There is a bug reported about this as well (closed as Won’t fix).
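    For example, with a placeholder account name and password:

    -- Avoid on a 5.7 master that replicates to a 5.6 slave:
    --   SET PASSWORD FOR 'app'@'%' = PASSWORD('new_secret');
    -- Use this instead:
    ALTER USER 'app'@'%' IDENTIFIED BY 'new_secret';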

    Conclusion

    While not a supported configuration, replicating from a higher to lower MySQL version is still possible. Keep in mind that there may be some edge cases where you still encounter problems.

    I look forward to hearing about your experiences in the comments section below.

    Tungsten Replicator and Clustering 5.2.0 Released

    Continuent are pleased to announce the release of Tungsten Replicator and Tungsten Clustering 5.2.0

    This release is one of our most exciting new releases for a while, as it contains some significant new features and lays the groundwork for some additional new functionality in the upcoming 5.3.0 and 6.0 releases due later this year.

    In particular, this release includes the following new features:

    • New replicator filtering environment to make filtering quicker and easier to use, and more flexible
      • New filter configuration standard for new filters
      • New filter to make replication out of a cluster easier
      • New filters for filtering events and data
    • New applier for sending Apache Kafka messages directly from an incoming data stream
    • New applier for adding incoming records directly to Elasticsearch for indexing
    • New applier for writing data into Apache Cassandra
    • New commands for the trepctl tool:
      • qs (quick status) command to show simplified output for the current replicator status
      • perf (performance) command to show detailed performance statistics in real-time for the replication process
      • Improvements to the status output to show the time spent in long-running transactions
    • Improvements to the thl command:
      • You can now determine the size of events within the THL to identify large transactions (rows or SQL)
      • Used in combination with trepctl you can identify long-running transactions more easily

    The release also contains a number of fixes and other minor improvements.

     

    DevOps Considerations for Production-ready Database Deployments

    MySQL is easy to install and use, and it has always been popular with developers and system administrators. On the other hand, deploying a production-ready MySQL environment for a business-critical enterprise workload is a different story. It can be a bit of a challenge, and requires in-depth knowledge of the database. In this blog post, we’ll discuss some of the steps that have to be taken before we can consider a MySQL deployment production-ready.

    High Availability

    If you belong to the lucky ones who can accept hours of downtime, you can stop reading here and skip to the next paragraph. For 99.999% of business-critical systems, that would not be acceptable. Therefore a production-ready deployment has to include high availability measures: automated failover of the database instances, as well as a proxy layer which detects changes in topology and state of MySQL and routes traffic accordingly. There are numerous tools which can be used to build such environments, for instance MHA, MRM or ClusterControl.

    Proxy layer

    Master failure detection, automated failover and recovery - these are crucial when building a production-ready infrastructure. But on their own, they are not enough. There is still an application which has to adapt to the topology change triggered by the failover. Of course, it is possible to code the application so it is aware of instance failures, but this is a cumbersome and inflexible way of handling topology changes. Here comes the database proxy - a middle layer between the application and the database. A proxy can hide the complexity of your database layer from the application - all the application does is connect to the proxy, and the proxy takes care of the rest. The proxy routes queries to a database instance, handles topology changes and re-routes traffic as necessary. A proxy can also be used to implement read-write split, relieving the application from one more complex case to cover. This creates another challenge - which proxy to use? How to configure it? How to monitor it? How to make it highly available, so it does not become a SPOF?

    ClusterControl can assist here. It can be used to deploy different proxies to form a proxy layer: ProxySQL, HAProxy and MaxScale. It preconfigures proxies to make sure they will handle traffic correctly. It also makes it easy to implement any configuration changes if you need to customize proxy setup for your application. Read-write split can be configured using any of the proxies ClusterControl supports. ClusterControl also monitors the proxies, and will recover them in case of failures. The proxy layer can become a single point of failure, as automated recovery might not be enough - to address that, ClusterControl can deploy Keepalived and configure Virtual IP to automate failover.

    Backups

    Even if you do not need to implement high availability, you probably still have to care about your data. A backup is a must for almost every production database. Nothing other than a backup can save you from an accidental DROP TABLE or DROP SCHEMA (well, maybe a delayed replication slave, but only for some period of time). MySQL offers multiple methods of taking backups - mysqldump, xtrabackup, different types of snapshot (some available only with particular hardware or a cloud provider). It’s not easy to design the correct backup strategy, decide which tools to use and then script the whole process so it executes correctly. It’s not rocket science either, but it requires careful planning and testing. Once a backup is taken, you are not done. Are you sure the backup can be restored, and the data is not garbage? Verifying your backups is time consuming, and perhaps not the most exciting thing on your todo list. But it is still important, and needs to be done regularly.

    ClusterControl has extensive backup and restore functionality. It supports mysqldump for logical backup and Percona Xtrabackup for physical backup - those tools can be used in almost every environment, either cloud or on-premises. It is possible to build a backup strategy with a mixture of logical and physical backups, incremental or full, in an online fashion.

    Apart from recovery, it also has options to verify a backup - for instance, restoring it on a separate host in order to verify whether the backup process works correctly.

    If you’d like to regularly keep an eye on the backups (and you would probably want to do this), ClusterControl has the ability to generate operational reports. The backup report helps you track executed backups, and informs if there were any issues while taking them.

    Monitoring and trending

    No deployment is production-ready without proper monitoring of the services. You want to make sure you will be alerted if some services become unavailable, so you can take action, investigate or start recovery procedures. Of course, you also want a trending solution. It can’t be stressed enough how important it is to have monitoring data for assessing the state of the infrastructure, whether for a post-mortem investigation or for real-time monitoring of the state of services. Metrics are not equal in importance - if you are not very familiar with a particular database product, you most likely won’t know which are the most important metrics to collect and watch. Sure, you might be able to collect everything, but when it comes to reviewing the data, it’s hardly possible to go through hundreds of metrics per host - you need to know which of them to focus on.

    The open source world is full of tools designed to monitor and collect metrics from different databases - most of them would require you to integrate them with your overall monitoring infrastructure, chatops platform or oncall support tools (like PagerDuty). It might also be required to install and integrate multiple components - storage (some sort of time-series database), presentation layer and data collection tools.

    ClusterControl takes a bit of a different approach, as it is one single product with real-time monitoring, trending, and dashboards that show the most important details. Database advisors - which can be anything from simple configuration advice, warnings on thresholds, or more complex rules for predictions - generally produce comprehensive recommendations.

    Ability to scale-up

    Databases tend to grow in size, and it is not unlikely that they will also grow in terms of transaction volume or number of users. The ability to scale out or up can be critical for production. Even if you do a great job of estimating your hardware requirements at the start of the product lifecycle, you will probably have to handle a growth phase - as long as your product is successful, that is (but that’s what we all plan for, right?). You have to have the means to easily scale up your infrastructure to cope with incoming load. For stateless services like web servers, this is fairly easy - you just provision more instances using the latest production image or code from your version control tool. For stateful services like databases, it’s trickier. You have to provision new instances using your current production data, and set up replication or some form of clustering between the current and the new instances. This can be a complex process, and to get it right you need in-depth knowledge of the clustering or replication model chosen.


    ClusterControl, as the name suggests, provides extensive support for building out clustered or replicated database setups. The methods used are battle-tested through thousands of deployments. It comes with a command line interface (CLI) so it can be easily integrated with configuration management systems. Please keep in mind, though, that you might not want to make changes to your pool of databases too often - provisioning a new instance takes time and adds some overhead on the existing databases. Therefore you may want to stay a little on the “over-provisioned” side, so you have time to spin up a new instance before your cluster gets overloaded.

    All in all, there are several steps you still have to take after initial deployment, to make sure your environment is ready for production. With the right tools, it is much easier to get there.

