Planet MySQL

UNDERSTANDING MySQL CLIENT / SERVER PROTOCOL USING PYTHON AND WIRESHARK – PART 2

In the previous article we researched MySQL Client / Server Protocol using WireShark. Now lets start to write our codes in python to simulate MySQL native client. Ready codes are here: Github repo

First of all we have to create MYSQL_PACKAGE class. MYSQL_PACKAGE class is the parent of all other package classes (HANDSHAKE_PACKAGE, LOGIN_PACKAGE, OK_PACKAGE and etc.)

class MYSQL_PACKAGE: “””Data between client and server is exchanged in packages of max 16MByte size.”””    def __init__(self, resp= b”):       self.resp = resp       self.start =0       self.end =0 It accepts resp parameter on initialization. Resp is the binary response received from the server in bytesarray type. One of the important and interesting method of this class is next method. def next(self, length = None, type=int, byteorder=’little’, signed=False, freeze=False):    if not freeze:       if length:          self.end += length          portion = self.resp[self.start:self.end]          self.start =self.end       else:          portion = self.resp[self.start:]          self.start =self.end =0    else:       if length:          portion = self.resp[self.start:self.start + length]       else:          portion = self.resp[self.start:]    if type is int:       return int.from_bytes(portion, byteorder=byteorder, signed=signed)    elif type is str:       return portion.decode(‘utf-8’)    elif type is hex:       return portion.hex()    else:       return portion Method next reads portion of bytes from the binary response. When we call this method it reads some portion of bytes and put pointer to the last position where reading was ended (changes a value of self.start and self.end properties). When we call this method again, it starts to read bytes where stayed last time. Method next accepts five parameters: length, type, byteorder, signed and freeze. If freeze is True it reads some portion of bytes from binary response but does not change pointer position. Otherwise it reads portion of bytes with given length and change the position of pointer. If length is None then method reads bytes until the end of response bytesarray. Parameter type can be int, str and hex datatypes. Method next converts portion of bytes to appropriate datatype according to the value of type parameter. Parameter byteorder determines conversion of bytes to integer type. It ups to your architecture of computer. If your machine is big-endian then it stores bytes in memory from big address to the little. Or if your machine is little-endian then it stores bytes in memory from little address to the big. Thats why we have to know exact type of our architecture to be able to convert bytes to integer correctly. In my case it is little-endian, thats why i’ve set the default value of byteorder parameter to “little”. Parameter signed is also used in conversion of bytes to integer. We tell the function to consider each intereg as unsigned or signed. Second interesting method of this class is encrypt_password. This method encrypts password with given algorithm. from hashlib import sha1 def encrypt_password(self, salt, password):    bytes1 = sha1(password.encode(“utf-8”)).digest()    concat1 = salt.encode(‘utf-8’)    concat2 = sha1(sha1(password.encode(“utf-8”)).digest()).digest()    bytes2 = bytearray()    bytes2.extend(concat1)    bytes2.extend(concat2)    bytes2 = sha1(bytes2).digest()    hash=bytearray(x ^ y for x, y in zip(bytes1, bytes2))    return hash This method accepts two parameters: salt and password. Parameter salt is the concatenation of two salt1 and salt2 strings from Greeting Packet received from server. And parameter password is the password string of mysql user. In the official documentation password encryption algorithm is: Here “20-bytes random data from server” is concatenation of salt1 and salt2 from Greeting Packet received from server. To remember what is greeting packet look at the previous article Now I want to explain encrypt_password method line by line. bytes1 = sha1(password.encode(“utf-8”)).digest() We are converting password string to bytes, then encrypting it with sha1 function and assigning to bytes1 variable. It is equal to this part of algorithm: Then we are converting salt string into bytes and assigning to the concat1 variable. concat1 = salt.encode(‘utf-8’) Third line of the method is: concat2 = sha1(sha1(password.encode(“utf-8”)).digest()).digest() Here we are double-encrypting password string with sha1 function and assign it to the concat2 string. Now we have two concat1 and concat2 variables. We have to concatenate them into one byte array: bytes2 = bytearray() bytes2.extend(concat1) bytes2.extend(concat2) Then we have to encrypt concatenated bytes with sha1 function and assign to the bytes2 variable. bytes2 = sha1(bytes2).digest() So we have two variables with encrypted bytes: bytes1 and bytes2. Now we have to do bitwise XOR operation between this variables and return the obtained hash. hash=bytearray(x ^ y for x, y in zip(bytes1, bytes2)) return hash CLASSES FOR DATATYPES

In the previous article we learned about Int and String data types of MySQL Client / Server protocol. Now we need some classes to be able to read fields from received packets.

INT CLASS class Int: “””see documentation: https://dev.mysql.com/doc/internals/en/integer.html””&#8221; def __init__(self, package, length=-1, type=’fix’):    self.package = package    self.length = length    self.type =type def next(self):    # int<n>    if self.type == ‘fix’ and self.length > 0:       return self.package.next(self.length)    # int<lenenc>    if self.type == ‘lenenc’:       byte = self.package.next(1)    if byte < 0xfb:       return self.package.next(1)    elif byte == 0xfc:       return self.package.next(2)    elif byte == 0xfd:       return self.package.next(3)    elif byte == 0xfe:       return self.package.next(8) Int class implements INT data type of MySQL Client / Server protocol. It accepts package parameter on initialization. Parameter package should be the instance of any package class inherited from MYSQL_PACKAGE class. Method next detects the type of integer (int<fix> or int<lenenc>. See previous article) and calls the next method of package object to read the byte portion of received response. STR CLASS class Str: “””see documentation: https://dev.mysql.com/doc/internals/en/string.html””&#8221;    def __init__(self, package, length=-1, type=”fix”):       self.package = package       self.length = length       self.type =type    def next(self):       # string<fix>       if self.type == ‘fix’ and self.length > 0:          return self.package.next(self.length, str)       # string<lenenc>       elif self.type == ‘lenenc’:          length = self.package.next(1)          if length == 0x00:             return “”          elif length == 0xfb:             return “NULL”          elif length == 0xff:             return “undefined”          return self.package.next(length, str)          # string<var>       elif self.type == ‘var’:          length = Int(self.package, type=’lenenc’).next()          return self.package.next(length, str)       # string<eof>       elif self.type == ‘eof’:          return self.package.next(type=str)       # string<null> – null terminated strings       elif self.type == ‘null’:          strbytes = bytearray()       byte = self.package.next(1)       while True:          if byte == 0x00:             break          else:             strbytes.append(byte)             byte = self.package.next(1)       return strbytes.decode(‘utf-8’) Str class implements STRING data type of MySQL Client / Server protocol. It accepts package parameter on initialization. Parameter package should be the instance of any package class inherited from MYSQL_PACKAGE class. Method next detects the type of String (String<fix>, String<Var>, String<NULL>, String<EOF> or String<lenenc>. See previous article) and calls the next method of package object to read the byte portion of received response. HANDSHAKE_PACKAGE CLASS

HANDSHAKE_PACKAGE class is used for parse Greeting Packet received from server. It is inherited from MYSQL_PACKAGE class and accepts resp parameter on initialization. Parameter resp is the Greeting Packet response in bytes type recieved from the server.

class HANDSHAKE_PACKAGE(MYSQL_PACKAGE):    def __init__(self, resp):       super().__init__(resp) def parse(self):    return {       “package_name”: “HANDSHAKE_PACKAGE”,       “package_length”: Int(self, 3).next(), #self.next(3),       “package_number”: Int(self, 1).next(), #self.next(1),       “protocol”: Int(self, 1).next(), #self.next(1),       “server_version”: Str(self, type=’null’).next(),       “connection_id”: Int(self, 4).next(), #self.next(4),       “salt1”: Str(self, type=’null’).next(),       “server_capabilities”: self.get_server_capabilities(Int(self, 2).next()),       “server_language”: self.get_character_set(Int(self, 1).next()),       “server_status”: self.get_server_status(Int(self, 2).next()),       “server_extended_capabilities”: self.get_server_extended_capabilities(Int(self, 2).next()),       “authentication_plugin_length”: Int(self, 1).next(),       “unused”: Int(self, 10).next(), #self.next(10, hex),       “salt2”: Str(self, type=’null’).next(),       “authentication_plugin”: Str(self, type=’eof’).next() } Method parse reading fields from the response using Int and Str classes and puts them into a dictionary and returns. LOGIN_PACKAGE CLASS

This class is used for create Login Request packet.

class LOGIN_PACKAGE(MYSQL_PACKAGE):    def __init__(self, handshake):       super().__init__()       self.handshake_info = handshake.parse()    def create_package(self, user, password, package_number):       package = bytearray()       # client capabilities       package.extend(self.capabilities_2_bytes(self.client_capabilities))       # extended client capabilities       package.extend(self.capabilities_2_bytes(self.extended_client_capabilities))       # max package -> 16777216       max_package = (16777216).to_bytes(4, byteorder=’little’)       package.extend(max_package)       # charset -> 33 (utf8_general_ci)       package.append(33)       # 23 bytes are reserved       reserved = (0).to_bytes(23, byteorder=’little’)       package.extend(reserved)       # username (null byte end)       package.extend(user.encode(‘utf-8’))       package.append(0)       # password       salt = self.handshake_info[‘salt1’] + self.handshake_info[‘salt2’]       encrypted_password = self.encrypt_password(salt.strip(), password)       length = len(encrypted_password)       package.append(length)       package.extend(encrypted_password)       # authentication plugin       plugin = self.handshake_info[‘authentication_plugin’].encode(‘utf-8’)       package.extend(plugin)       finpack = bytearray()       package_length = len(package)       finpack.append(package_length)       finpack.extend((0).to_bytes(2, byteorder=’little’))       finpack.append(package_number)       finpack.extend(package)       return finpack This class accepts handshake parameter on initialization. Parameter handshake should be the instance of HANDSHAKE_PACKAGE class. In the __init__ method we call the parse method of handshake object and get all fields of Greeting Packet received from the server. Method create_package prepares the login request package to be able to send to the server for authentication. Accepts user, password and packet_number parameters. OK_PACKAGE & ERR_PACKAGE CLASSES

OK package and ERR package are the response package of server after authentication or after sending query to server on command phase.

class OK_PACKAGE(MYSQL_PACKAGE):    def __init__(self, resp):       super().__init__(resp)    def parse(self):       return {          “package_name”: “OK_PACKAGE”,          “package_length”: Int(self, 3).next(), #self.next(3),          “package_number”: Int(self, 1).next(), #self.next(1),          “header”: hex(Int(self, 1).next()),          “affected_rows”: Int(self, 1).next(), #self.next(1),          “last_insert_id”: Int(self, 1).next(), #self.next(1),          “server_status”: self.get_server_status(Int(self, 2).next()),          “warnings”: Int(self, 2).next()       } class ERR_PACKAGE(MYSQL_PACKAGE):    def__init__(self, resp):       super().__init__(resp)    def parse(self):       return {          “package_name”: “ERR_PACKAGE”,          “package_length”: Int(self, 3).next(), #self.next(3),          “package_number”: Int(self, 1).next(), #self.next(1),          “header”: hex(Int(self, 1).next()), #self.next(1, hex),          “error_code”: Int(self, 2).next(), #self.next(2),          “sql_state”: Str(self, 6).next(),          “error_message”: Str(self, type=’eof’).next()       } MYSQL CLASS

MYSQL class is the wrapper class which creates TCP connection with server, sends and receives packages from server using above classes.

from socket import AF_INET, SOCK_STREAM, socket, gethostbyname from modules.packages import * class MySQL:    def __init__(self, host=””, port=””, user=””, password=””):       self.host = host       self.port = port       self.user = user       self.password = password    def connect(self):       resp = self.client.recv(65536)       return HANDSHAKE_PACKAGE(resp)    def login(self, handshake_package, package_number):    “””Sending Authentication package”””       login_package = LOGIN_PACKAGE(handshake_package)       package = login_package.create_package(user=self.user, password=self.password, package_number=package_number)       self.client.sendall(package)       resp = self.client.recv(65536)       package = self.detect_package(resp)       return package.parse()[‘package_number’]    def __enter__(self):       self.client = socket(AF_INET, SOCK_STREAM)       ip = gethostbyname(self.host)       address=(ip,int(self.port))       self.client.connect(address)       return self    def __exit__(self, exc_type, exc_value, traceback):       print(“Good Bye!”)       self.close()    def close(self):       self.client.close() I think nothing is not unclear in this class. I’ve defined __enter__ and __exit__ to be able to use this class with “with” statement to automatically close TCP connection with server. In __enter__ method i’m creating TCP connection over socket. And in __exit__ method i’m closing created connection. This class accepts host, port, user and password parameters on initialization. In the connect method we receive greeting packet from server: resp = self.client.recv(65536) return HANDSHAKE_PACKAGE(resp) In the login method we create Login request package using LOGIN_PACKAGE and HANDSHAKE_PACKAGE classes and sends to the server and gets OK or ERR packages. That’s all. We’ve implemented connection phase. To avoid be this article too long I will not explain command phase. Because command phase is more easer that connection phase and you can research it yourself with accumulated knowledge from this and previous articles. Also you can get already implemented command phase codes from this Github repo and research: https://github.com/elshadaghazade/techacademy_mysql_native_client_in_python Demo Video:  Thank you!

Interesting happenstance when installing Ansible dependencies in a MySQL Docker container

I’ve been posting quite a bit about Docker as I’ve been working with it a lot as of late. I thought I would share something interesting I discovered a couple weeks ago while working on setting up a Docker container-based lab environment for training purposes here at Pythian, specifically when attempting to install the MySQL-python package inside the container.

I know what you’re thinking: why is he trying to install a package in a Docker container? Doesn’t that go against the “Docker run and forget about it” philosophy? Sure, but in this case, I’m looking to add orchestration via ansible, which I don’t think is completely out of the question in order to coordinate your Docker containers for something like replication. This requires using the ansible mysql_replication module, which clearly states that MySQL-python is required. I digress.

Back to the story. I run the mysql/mysql-server:5.7 docker image from docker hub, which is from the Oracle MySQL team and based on Oracle Linux. Then I attempted to install MySQL-python. Here was my output:

[vagrant@control nopkrbr]$ docker exec -e COLUMNS="`tput cols`" -e LINES="`tput lines`" -ti mysql1 /bin/bash bash-4.2# yum install MySQL-python Loaded plugins: ovl ol7_UEKR4 | 1.2 kB 00:00 ol7_latest | 1.4 kB 00:00 (1/5): ol7_UEKR4/x86_64/updateinfo | 194 kB 00:00 (2/5): ol7_latest/x86_64/group | 659 kB 00:00 (3/5): ol7_latest/x86_64/updateinfo | 1.8 MB 00:00 (4/5): ol7_latest/x86_64/primary | 18 MB 00:03 (5/5): ol7_UEKR4/x86_64/primary | 38 MB 00:06 ol7_UEKR4 704/704 ol7_latest 26654/26654 Resolving Dependencies --> Running transaction check ---> Package MySQL-python.x86_64 0:1.2.5-1.el7 will be installed --> Processing Dependency: libmysqlclient.so.18(libmysqlclient_18)(64bit) for package: MySQL-python-1.2.5-1.el7.x86_64 --> Processing Dependency: libmysqlclient.so.18()(64bit) for package: MySQL-python-1.2.5-1.el7.x86_64 --> Running transaction check ---> Package mariadb-libs.x86_64 1:5.5.56-2.el7 will be installed --> Finished Dependency Resolution Dependencies Resolved ================================================================================ Package Arch Version Repository Size ================================================================================ Installing: MySQL-python x86_64 1.2.5-1.el7 ol7_latest 89 k Installing for dependencies: mariadb-libs x86_64 1:5.5.56-2.el7 ol7_latest 757 k Transaction Summary ================================================================================ Install 1 Package (+1 Dependent package) Total download size: 846 k Installed size: 4.7 M Is this ok [y/d/N]: y Downloading packages: (1/2): MySQL-python-1.2.5-1.el7.x86_64.rpm | 89 kB 00:00 (2/2): mariadb-libs-5.5.56-2.el7.x86_64.rpm | 757 kB 00:00 -------------------------------------------------------------------------------- Total 1.4 MB/s | 846 kB 00:00 Running transaction check Running transaction test Transaction check error: file /etc/my.cnf from install of mariadb-libs-1:5.5.56-2.el7.x86_64 conflicts with file from package mysql-community-server-minimal-5.7.22-1.el7.x86_64 file /usr/share/mysql/charsets/Index.xml from install of mariadb-libs-1:5.5.56-2.el7.x86_64 conflicts with file from package mysql-community-server-minimal-5.7.22-1.el7.x86_64 file /usr/share/mysql/charsets/armscii8.xml from install of mariadb-libs-1:5.5.56-2.el7.x86_64 conflicts with file from package mysql-community-server-minimal-5.7.22-1.el7.x86_64 file /usr/share/mysql/charsets/ascii.xml from install of mariadb-libs-1:5.5.56-2.el7.x86_64 conflicts with file from package mysql-community-server-minimal-5.7.22-1.el7.x86_64 file /usr/share/mysql/charsets/cp1250.xml from install of mariadb-libs-1:5.5.56-2.el7.x86_64 conflicts with file from package mysql-community-server-minimal-5.7.22-1.el7.x86_64 file /usr/share/mysql/charsets/cp1256.xml from install of mariadb-libs-1:5.5.56-2.el7.x86_64 conflicts with file from package mysql-community-server-minimal-5.7.22-1.el7.x86_64

And so on.

When I saw this I couldn’t help but chuckle at the fact that the dependency to have lib (or libs-compat) on the system in order to install MySQL-python was resolved by using the default repo… in the MySQL community-issued Docker image… using Oracle Linux… which uses MariaDB.

This really wasn’t a big deal in this case. I was able to get around the issue by adding a few extra tasks to my playbook, as listed below.

--- - name: "Install mysql yum repository" yum: name: https://dev.mysql.com/get/mysql80-community-release-el7-1.noarch.rpm state: present - name: "Enable all versions in MySQL yum repository" replace: path: /etc/yum.repos.d/mysql-community.repo regexp: '^enabled=0$' replace: 'enabled=1' - name: "Install mysql compatible libraries. Required to install Python MySQL package" yum: disablerepo: "*" enablerepo: "mysql{{ mysql_version_no_periods }}-community" name: mysql-community-libs-compat state: present - name: "Install MySQL Python Module" yum: name: MySQL-python state: present

What’s happening here is that I’m installing the mysql yum repository in the first task so I can get packages from the Oracle MySQL project instead of from the default repo which uses MariaDB. This creates the /etc/yum.repos.d/mysql-community.repo which allows you to enable and disable MySQL repositories based on major version.

The second task marks all repositories in the /etc/yum.repos.d/mysql-community.repo file as enabled, so any of them can be used.

The third task installs the latest minor version of mysql-community-libs-compat based on the major version of MySQL that I have running in my container. There is a reference here to a variable ‘mysql_version_no_periods’ which is populated in an earlier playbook with the major version of MySQL that is running in the container, but with the decimals removed. So, in this case, that value of the variable is ‘57’. This task is using this variable to disable all repositories in yum with the exception of the repository for the specific major version that I’m using, thus ensuring that when this task runs, I’ll always get the latest minor version of the mysql-community-libs-compat package for the major version of MySQL that’s running in my container.

Finally, now that my dependency is installed, the fourth and final task installs MySQL-python so I can use ansible to work with the MySQL instance running in my Docker container.

CONCLUSION

Recently, MariaDB has become a default package when attempting to use package managers for MySQL installation on various Linux distributions, so it’s easy to see how something like this could have slipped through the cracks. When you take everything into consideration about the Docker image that has been made available by the Oracle MySQL team, this is really only a very small setback for an otherwise great resource. You could even argue that this really isn’t an issue considering the Docker philosophy. However, I do believe that process orchestration via automation tools like Chef, Puppet, and Ansible aren’t going to be out of the question for Docker deployments. So I think it would be worth it to ensure that dependencies like this can be more easily resolved by making sure the repositories in use are compatible with the MySQL fork that’s installed on the image.

Running Percona XtraDB Cluster in Kubernetes/OpenShift

Kubernetes, and its most popular distribution OpenShift, receives a lot of interest as a container orchestration platform. However, databases remain a foreign entity, primarily because of their stateful nature, since container orchestration systems prefer stateless applications. That said, there has been good progress in support for StatefulSet applications and persistent storage, to the extent that it might be already comfortable to have a production database instance running in Kubernetes. With this in mind, we’ve been looking at running Percona XtraDB Cluster in Kubernetes/OpenShift.

While there are already many examples on the Internet of how to start a single MySQL instance in Kubernetes, for serious usage we need to provide:

  • High Availability: how can we guarantee availability when an instance (or Pod in Kubernetes terminology) crashes or becomes unresponsive?
  • Persistent storage: we do not want to lose our data in case of instance failure
  • Backup and recovery
  • Traffic routing: in the case of multiple instances, how do we direct an application to the correct one
  • Monitoring
Percona XtraDB Cluster in Kubernetes/OpenShift

Schematically it looks like this:


The picture highlights the components we are going to use

Running this in Kubernetes assumes a high degree of automation and minimal manual intervention.

We provide our proof of concept in this project: https://github.com/Percona-Lab/percona-openshift. Please treat it like a source for ideas and as an alpha-quality project, in no way it is production ready.

Details

In our implementation we rely on Helm, the package manager for Kubernetes.  Unfortunately OpenShift does not officially support Helm out of the box, but there is a guide from RedHat on how to make it work.

In the clustering setup, it is quite typical to use a service discovery software like Zookeeper, etcd or Consul. It may become necessary for our Percona XtraDB Cluster deployment, but for now, to simplify deployment, we are going to use the DNS service discovery mechanism provided by Kubernetes. It should be enough for our needs.

We also expect the Kubernetes deployment to provide Dynamic Storage Provisioning. The major cloud providers (like Google Cloud, Microsoft Azure or Amazon Cloud) should have it. Also, it might not be easy to have Dynamic Storage Provisioning for on-premise deployments. You may need to setup GlusterFS or Ceph to provide Dynamic Storage Provisioning.

The challenge with a distributed file system is how many copies of data you will end up having. Percona XtraDB Cluster by itself has three copies, and GlusterFS will also require at least two copies of the data, so in the end we will have six copies of the data. This can’t be good for write intensive applications, but it’s also not good from the capacity standpoint.

One possible approach is to have local data copies for Percona XtraDB Cluster deployments. This will provide better performance and less impact on the network, but in the case of a big dataset (100GB+ ) the node failure will require SST with a big impact on the cluster and network. So the individual solution should be tailored for your workload and your requirements.

Now, as we have a basic setup working, it would be good to understand the performance impact of running Percona XtraDB Cluster in Kubernetes.  Is the network and storage overhead acceptable or it is too big? We plan to look into this in the future.

Once again, our project is located at https://github.com/Percona-Lab/percona-openshift, we are looking for your feedback and for your experience of running databases in Kubernetes/OpenShift.

Before you leave …

Percona XtraDB Cluster

If this article has interested you and you would like to know more about Percona XtraDB Cluster, you might enjoy our recent series of webinar tutorials that introduce this software and how to use it.

The post Running Percona XtraDB Cluster in Kubernetes/OpenShift appeared first on Percona Database Performance Blog.

Upgrading to MySQL 8.0? Here is what you need to know…

In my previous blog post, I had described the steps to perform an in-place upgrade for upgrading from MySQL 5.7 to MySQL 8.0. In this blog post, I will discuss about the considerations that needs to be taken into account while upgrading to MySQL 8.0

The general in-place upgrade requirement is that MySQL 8.0 will understand the disk image produced by MySQL 5.7. …

Webinar Tues 6/26: MariaDB Server 10.3

Please join Percona’s Chief Evangelist, Colin Charles on Tuesday, June 26th, 2018, as he presents MariaDB Server 10.3 at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4).

Register Now

 

MariaDB Server 10.3 is out. It has some interesting features around system versioned tables, Oracle compatibility, column compression, an integrated SPIDER engine, as well as MyRocks. Learn about what’s new, how you can use it, and how it is different from MySQL.

Register Now

Colin Charles Chief Evangelist

Colin Charles is the Chief Evangelist at Percona. He was previously on the founding team of MariaDB Server in 2009, and had worked at MySQL since 2005, and been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He’s well known within open source communities in APAC, and has spoken at many conferences. Experienced technologist, well known in the open source world for work that spans nearly two decades within the community. Pays attention to emerging technologies from an integration standpoint. Prolific speaker at many industry-wide conferences delivering talks and tutorials with ease. Interests: application development, systems administration, database development, migration, Web-based technologies. Considered expert in Linux and Mac OS X usage/administration/roll-out’s. Specialties: MariaDB, MySQL, Linux, Open Source, Community, speaking & writing to technical audiences as well as business stakeholders.

The post Webinar Tues 6/26: MariaDB Server 10.3 appeared first on Percona Database Performance Blog.

Percona XtraBackup 2.4.12 Is Now Available

Percona announces the GA release of Percona XtraBackup 2.4.12 on June 22, 2018. You can download it from our download site and apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, it drives down backup costs while providing unique features for MySQL backups.

New features and improvements:
  • Percona XtraBackup now prints used arguments to standard output. Bug fixed PXB-1494.
Bugs fixed
  • xtrabackup --copy-back didn’t read which encryption plugin to use from plugin-load setting of the my.cnf configuration file. Bug fixed PXB-1544.
  • xbstream was exiting with zero return code when it failed to create one or more target files instead of returning error code 1. Bug fixed PXB-1542.
  • Meeting a zero sized keyring file, Percona XtraBackup was removing and immediately recreating it, which could affect external software noticing this file had undergo manipulations. Bug fixed PXB-1540.
  • xtrabackup_checkpoints files were encrypted during a backup, which caused additional difficulties to take incremental backups. Bug fixed PXB-202.

Other bugs fixed: PXB-1526 “Test kill_long_selects.sh failing with MySQL 5.7.21”.

Release notes with all the improvements for version 2.4.12 are available in our online documentation. Please report any bugs to the issue tracker.

The post Percona XtraBackup 2.4.12 Is Now Available appeared first on Percona Database Performance Blog.

UNDERSTANDING MySQL CLIENT / SERVER PROTOCOL USING PYTHON AND WIRESHARK – PART 1

MySQL Client / Server protocol is used in many areas. For example:

  • MySQL Connectors like ConnectorC, ConnectorJ and etc.
  • MySQL proxy
  • Between master and slave

What is MySQL Client / Server protocol?

MySQL Client / Server protocol is accepted conventions (rules). Through these rules client and server “talks” and understand each other. Client connects to server through TCP connection with special socket, sends to server special packets and accepts them from server. There are two phases of this connection:

  • Connection phase
  • Command phase

Next illustration describes phases:

STRUCTURE OF PACKETS

Each packet consists of valuable data types. Maximum length of each packet can be 16MB. If the length of packet is more than 16MB, then it is separated into several chunks (16MB). First of all let’s see the protocol data types. MySQL Client / Server protocol has two data types:

  • Integer types
  • String types

(See the official documentation: https://dev.mysql.com/doc/internals/en/basic-types.html)

INTEGER TYPES

Integer types also separates into two section:

  • Fixed length integer types
  • Length-encoded integer types

Fixed length integer type consumes 1, 2, 3, 4, 6 or 8 bytes. For example if we want to describe number 2 in int data type then we can write it like this in hex format: 03 00 00. Or if we want to write describe number 2 in int then we can write it like this in hex format: 03 00

Length-encoded integer types consumes 1, 3, 4 or 9 bytes. Before length-encoded integer types comes 1 byte. To detect the length of integer we have to check that first byte.

  • If the first byte is less than 0xfb ( < 251 ) then next one byte is valuable (it is stored as a 1-byte integer)
  • If the first byte is equal to 0xfc ( == 252 ) then it is stored as a 2-byte integer
  • If the first byte is equal to 0xfd ( == 253 ) then it is stored as a 3-byte integer
  • If the first byte is equal to 0xfe ( == 254 ) then it is stored as a 8-byte integer

But if the first byte is equal to 0xfb there is no need to read next bytes, it is equal to the NULL value of MySQL, and if equal to 0xff it means that it is undefined.

For example to convert fd 03 00 00 … into normal integer we have to read first byte and it is 0xfd. According to the above rules we have to read next 3 bytes and convert it into normal integer, and its value is 2 in decimal number system. So value of length-encoded integer data type is 2.

STRING TYPES

String types also separates into several sections.

  • String<Fix> – Fixed-length string types. They have a known, hardcoded length
  • String<NULL> – Null terminated string types. These strings end with 0x00 byte
  • String<Var> – Variable length string types. Before such strings comes fixed-length integer type. According to that integer we can calculate actual length of string
  • String<Lenenc> – Length-encoded string types. Before such strings comes length-encoded integer type. According to that integer we can calculate actual length of string
  • String<EOF> – If a string is the last component of a packet, its length can be calculated from the overall packet length minus the current position
SNIFF WITH WIRESHARK

Let’s start wireshark to sniff the network, filter MySQL packets by ip (in my case server ip is 54.235.111.67). Then let’s try to connect to MySQL server by MySQL native client on our local machine.

>> mysql -u[username] -p[password] -h[host ip] -P3306

As you can see after TCP connection to the server we several MySQL packets from the server. First of them is greeting packet.

Let’s dig into this packet and describe each field.

First 3 bytes are packet length:

Next 1 byte is packet number:

Rest of bytes are payload of Greeting packet of MySQL Client / Server protocol

Let’s describe each field of greeting packet.

  • Protocol number – Int<1>
  • Server version – String<NULL>
  • Thread id – Int<4>
  • Salt1 – String<NULL>
  • Server capabilities – Int<2>
  • Server language – Int<1>
  • Server Status – Int<2>
  • Extended Server Capabilities – Int<2>
  • Authentication plugin length – Int<1>
  • Reserved bytes – 10 bytes
  • Salt2 – String<NULL>
  • Authentication plugin string – String<EOF>

Server language is integer, next table will help us to pick appropriate language by integer value:

In my case server language is 0x08 (in decimal number system it is 8 also). From above table we can see that equivalent of 8 is latin1_swedish_ci. Now we know that default language of server is latin1_swedish_ci.

Server capabilities and server status are also integers. But reading each BIT of these integers we can know about server capabilities and status. Next illustration describes server capability and status bits:

Using greeting packet client prepares Login Request Packet to send to the server for authentication. Now let’s research login request packet.

  • First 3 bytes describes payload length
  • Next 1 byte is packet number
  • Client capabilities – Int<2> / Same as Server capabilities
  • Extended client capabilities – Int<2> / Same as Server extended capabilities
  • Max packet – Int<4> / describes the maximum length of packet
  • Charset – Int<1> / in my case it is 0x21 (in decimal number system is 33), from the table we can see that it is utf8_general_ci. We set server’s default charset from latin1_swedish_ci to utf8_general_ci
  • Username – String<NULL>
  • Password – String<Var>
  • Client Auth Plugin string – String<NULL>

As you can see password is encrypted. To encrypt a password we will use sha1, md5 algorithms, also salt1 and salt2 strings from previous Greeting Packet sent from server.

Then we get OK packet from the server if we are authenticated successfully. Otherwise we would get ERR packet.

  • 3 bytes are packet length
  • 1 byte is packet number
  • Affected rows – Int<1>
  • Server status – Int<2>
  • Warnings – Int<2>

That’s all. We have finished researching Connection Phase. Now let’s start to write our python codes.

Second part of this article is here: UNDERSTANDING MySQL CLIENT / SERVER PROTOCOL USING PYTHON AND WIRESHARK – PART 2

On InnoDB Data Compression in MySQL

Another story that I've prepared back in April for my meeting with one of customers in London was a "compression story". We spent a lot of time on it in several support issues in the past, with only limited success.

In case of InnoDB tables, there are actually two ways to compress data (besides relying on filesystem compression or compressing individual columns at server or application side). Historically the first one was introduced by the Barracuda InnoDB file format and ROW_FORMAT=COMPRESSED it supported. Notable number of related bugs were reported with time, and it may be not that easy to identify them all (you can find current list of bugs tagged with "compression" here). I've picked up the following bugs for my "story":
  • Bug #88220 - "compressing and uncompressing InnoDB tables seems to be inconsistent". Over years Simon Mudd, Monty Solomon (see related Bug #70534 - "Removing table compression leaves compressed keys") and other community members reported several bugs related to inconsistencies and surprises with key_block_size option. It is used for both MyISAM and InnoDB storage engines (for compressed tables) and it seems nobody is going to fix the remaining problems until they are gone with MyISAM engine.
  • Bug #69588 - "MyISAM to InnoDB compressed slower than MyISAM to InnoDB, Then InnoDB to Compressed". Just a detail to take into account, noted 5 years ago by Joffrey MICHAIE, verified almost 4 years ago and then getting zero public attention from Oracle engineers.
  • Bug #62431 - "What is needed to make innodb compression work for 32KB pages?". Nothing can be done according to the manual:
    "In particular, ROW_FORMAT=COMPRESSED in the Barracuda file format assumes that the page size is at most 16KB and uses 14-bit pointers."
  • Bug #78827 - "Speedup replication of compressed tables". Come on, Daniël van Eeden, nobody cares that
    "Replication and InnoDB compressed tables are not efficiently working together."The bug is still "Open".
  • Bug #75110 - "Massive, to-be-compressed not committed InnoDB table is total database downtime". This problem was reported by Jouni Järvinen back in 2014. Surely this is not a bug, but it seems nobody even tried to speed up compression in any way on multiple cores.
  • Bug #84439 - "Table of row size of ~800 bytes does not compress with KEY_BLOCK_SIZE=1". It was reported by Jean-François Gagné, who asked for a reasonable error message at least. Nothing happens after verification.
  • Bug #77089 - "Misleading innochecksum error for compressed tables with key_block_size=16". This problem was reported by Laurynas Biveinis more than three years ago, immediately verified and then got zero attention.
The boats above do not use the space for mooring efficiently. They need better compression. Transparent Page Compression for InnoDB tables was added later and looked promising. If you are lucky to use filesystem with sparse file and hole punching support and proper OS or kernel version, then you could expect notable saving of disk space with very few additional keystrokes (like COMPRESSION="zlib") when defining the table. Different compression libraries were supported. Moreover (see here), only uncompressed pages are stored in memory in this case, and this improved the efficiency of buffer pool usage. Sounded promising originally, but there are still bugs to consider:
  • Bug #78277 - "InnoDB deadlock, thread stuck on kernel calls from transparent page compression". This bug alone (reported by Mark Callaghan back in 2015) may be a reason to NOT use the feature in production, as soon as you hit it (chances are high). there are many interesting comments that there are environments where the feature works as fast as expected, but I think this summary is good enough for most users:"[19 Oct 2015 15:56] Mark Callaghan
    ...
    Slow on XFS, slow on ext4, btrfs core team tells me it will be slow there. But we can celebrate that it isn't slow on NVMFS - closed source, not GA, can't even find out where to buy it, not aware of anyone running it."The bug is still "Open".
  • Bug #81145 - "Sparse file and punch hole compression not working on Windows". Not that I care about Windows that much, but still. The bug is "Verified" for 2 years.
  • Bug #87723 - "mysqlbackup cannot work with mysql5.7 using innodb page-level compression" Now this is awesome! Oracle's own MySQL Enterprise Backup does NOT support the feature. Clearly they cared about making it useful...
    As a side note, same problem affects Percona's xtrabackup (see PXB-1394). MariaDB resolved the problem (and several related ones like MDEV-13023) with mariabackup tool.
  • Bug #87603 - "compression/tablespace ignored in create/alter table when not using InnoDB". COMPRESSION='.../' option is supported for MyISAM tables as well, and this again leads to problems when switching to another storage engine, as Tomislav Plavcic noted.
  • Bug #78672 - "assert fails in fil_io during linkbench with transparent innodb compression". This crash (assertion failure) was noted by Mark Callaghan back in 2015. May not crash anymore since 5.7.10 according to the last comment, but nobody cares to close the bug or comment anything useful. The bug is still "Verified".
That's almost all I prepared for my "compression story". It had to be sad one.

What about the moral of the story? For me it's the following:
  1. Classical InnoDB compression (page_format=compressed) has limited efficiency and does not get any attention from developers recently. If you hit some problem with this feature you have to live with it.
  2. Transparent page compression for InnoDB seems to be originally more like a proof of concept in MySQL that may not work well in production on commodity hardware, and software and was not integrated with backup tools. MariaDB improved it, added support for backing up page compressed tables efficiently with the same familiar xtrabackup-based approach, but there are still open problems to resolve (see MDEV-15527 and MDEV-15528 that I also picked up for my "story").
  3. It seems (based on public sources review at least) that both compression options do not get much attention from Oracle developers recently. If you check new features of MySQL 8.0 GA here,  you may notice that zlib version is updated, compressed temporary InnoDB tables are no longer supported and... that's all about compression for InnoDB!
This story could probably be shortened to just one link to the summary post by Mark Callaghan from Facebook (who studied the efficiency of data compression by various engines a lot, among other performance metrics), or by simple statement that if you want data to be compressed efficiently at server side do NOT use current InnoDB implementations and better use RocksDB engine (with MariaDB or Percona Server if you need other modern features also). But I can not write any story about MySQL without referring to some bugs, and this is how I've ended up with the above.

What if you just switched to MySQL 8.0 GA and need some new features from it badly? Then just wait for a miracle to happen (and hope Percona will make it one day :)

MySQL 8.0 InnoDB Cluster – the quick hands-on manual

I’m just back from a trip in Barcelona where I presented MySQL 8.0 InnoDB Cluster (at dataops and Barcelona MySQL Meetup) and the majority of feedback was great, but I also had some comments on the demos I showed. The first one was:

This is a joke of course (maybe it’s true for some), people found it very easy and they liked it.

But then, the second one was that all I showed wasn’t easy to find, some people who already played with the solution didn’t succeeded in creating a cluster so easily… not because they had errors or encountered bugs, but more because they just didn’t know how to do it.

The goal of this blog post is to illustrate how to create a MySQL 8.0 InnoDB Cluster very quickly using the new MySQL Shell only !

Initial environment

We have 3 MySQL 8.0.11 instances running: mysql1, mysql2 and mysql3

They can all communicate to each others and they have a user with all privileges created on each of them : clusteradmin.

Nothing else has been changed, no configuration file has been modified. The only important thing is that if you have created the admin user on them individually, just run RESET MASTER on all the instances.

Quick Cluster Creation

Let’s connect to one instance (doesn’t matter which one), and let’s create the cluster. I will list all commands needed to create the cluster very quickly and on a next post I will explain them in more details:

Connection using the Shell MySQL JS> \c clusteradmin@mysql1 Creating a session to 'clusteradmin@mysql1' Enter password: **** Fetching schema names for autocompletion... Press ^C to stop. Your MySQL connection id is 10 (X protocol) Server version: 8.0.11 MySQL Community Server - GPL No default schema selected; type \use to set one. MySQL [mysql1+ ssl] JS> MySQL Configuration MySQL [mysql1+ ssl] JS> dba.configureInstance('clusteradmin@mysql1',{'restart': true})

You will prompted to validate the changes and if a restart of MySQL is required, it will happen (if you use a default MySQL installation, 3 settings will be changed and a restart will be required).
And you need to configure all the other nodes that you want to be part of the cluster:

MySQL [mysql1+ ssl] JS> dba.configureInstance('clusteradmin@mysql2',{'restart': true}) MySQL [mysql1+ ssl] JS> dba.configureInstance('clusteradmin@mysql3',{'restart': true}) Cluster Creation

When the servers are restarted, just connect again to one of them using the Shell and create the cluster:

MySQL [mysql1+ ssl] JS> \c clusteradmin@mysql1 MySQL [mysql1+ ssl] JS> cluster=dba.createCluster('MyCluster')

Now you already have a cluster but with only one membere, you can verify this using the status() method of the cluster object we created:

MySQL [mysql1+ ssl] JS> cluster.status() { "clusterName": "MyCluster", "defaultReplicaSet": { "name": "default", "primary": "mysql1:3306", "ssl": "REQUIRED", "status": "OK_NO_TOLERANCE", "statusText": "Cluster is NOT tolerant to any failures.", "topology": { "mysql1:3306": { "address": "mysql1:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } }, "groupInformationSourceMember": "mysql://clusteradmin@mysql1:3306" } Add the other members

Now it’s time to add the 2nd and 3rd member to the new cluster:

MySQL [mysql1+ ssl] JS> cluster.addInstance('clusteradmin@mysql2:3306') MySQL [mysql1+ ssl] JS> cluster.addInstance('clusteradmin@mysql3:3306')

Please, pay attention that here the port for MySQL standard protocol (3306) is required.

Check the cluster

We can verify our cluster using again the status() method:

MySQL [mysql1+ ssl] JS> cluster.status() { "clusterName": "MyCluster", "defaultReplicaSet": { "name": "default", "primary": "mysql1:3306", "ssl": "REQUIRED", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "mysql1:3306": { "address": "mysql1:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "mysql2:3306": { "address": "mysql2:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "mysql3:3306": { "address": "mysql3:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } }, "groupInformationSourceMember": "mysql://clusteradmin@mysql1:3306" }

And voilà ! We have a 3 node MySQL 8.0 InnoDB Cluster !

Minimal Commands Summary
  • connect to mysql1: \c clusteradmin@mysql1
  • configure mysql1: dba.configureInstance('clusteradmin@mysql1',{'restart': true})
  • configure mysql2: dba.configureInstance('clusteradmin@mysql2',{'restart': true})
  • configure mysql3: dba.configureInstance('clusteradmin@mysql3',{'restart': true})
  • connect to mysql1 (after restart): \c clusteradmin@mysql1
  • create cluster: dba.createCluster('MyCluster')
  • add mysql2 to the cluster: cluster.addInstance('clusteradmin@mysql2:3306')
  • add mysql3 to the cluster: cluster.addInstance('clusteradmin@mysql3:3306')

On Partitioning in MySQL

Back in April I was preparing for vacations that my wife and I planned to spend in UK. Among other things planned I wanted to visit a customer's office in London and discuss few MySQL and MariaDB related topics, let's call them "stories". I tried to prepare myself for the discussion and collected a list of known active bugs (what else could I do as MySQL entomologist) for each of them. Surely live discussion was not suitable to share lists of bugs (and for some "stories" they were long), so I promised to share them later, in my blog. Time to do what I promised had finally come!

One of the stories we briefly discussed was "partitioning story". Right now I can immediately identify at least 47 active MySQL bugs in the related category.  While preparing I checked the same list and picked up 15 or so bug reports that had to illustrate my points. Let me share them here in no specific order, and add few more.
In April the latest still active bug in partitioning reported by MySQL community was  Bug #88916 - "Assertion `table->s->db_create_options == part_table->s->db_create_options'", from my colleague Elena Stepanova. Note a very simple test case that leads to assertion in debug builds, immediately verified.

Recently two more bugs were reported. Reporter of Bug #91190 - "DROP PARTITION and REORGANIZE PARTITION are slow" suspects a performance regression in MySQL 8.0.11. I've subscribed to this bug and is following the progress carefully. Same with Bug #91203 - "For partitions table, deal with NULL with is mismatch with reference guide". I think what happens with NULL value and range partitioning perfectly matches the manual, but the fact that INFORMATION_SCHEMA.PARTITIONS table may return wrong information after dropping partition with NULL value is somewhat unexpected.

Now back to the original lists for the "story" I prepared in April:
  • Bug #60023 - "No Loose Index Scan for GROUP BY / DISTINCT on InnoDB partitioned table". It was reported by Rene' Cannao' and since 2013 I strongly suspect that it's fixed in MySQL 5.6+ or, as noted in another comment, may depend on statistics properly collected for the table. Still the status remains "Verified".
  • Bug #78164 - "alter table command affect partitioned table data directory". Your custom DATA DIRECTORY settings may get lost when ALTER is applied to the whole table. Quick test shows that at least in MariaDB 10.3.7 this is no longer the case. The bug is still "Verified".
  • Bug #85126 - "Delete by range in presence of partitioning and no PK always picks wrong index". It was reported by Riccardo Pizzi 16 months ago, immediately verified (without explicit list of versions affected, by the way). One more case when ordering of indexes in CREATE TABLE may matter...
  • Bug #81712 - "lower_case_table_names=2 ignored on ADD PARTITION on Windows". Who cares about Windows these days?
  • Bug #84356 - "General tablespace table encryption". It seems partitioning allows to overcome documented limitation. If this is intended, then the manual is wrong, otherwise I suspect the lack of careful testing of partitioning integration with other features.
  • Bug #88673 - "Regression CREATE TBL from 5.7.17 to 20 (part #1: innodb_file_per_table = ON)." I've probably mentioned this bug reported by Jean-François Gagné in more than one blog post already. Take care and do not use long partition names.
  • Bug #85413 - "Failing to rename a column involved in partition". As simple as it sounds, and it still happens.
  • Bug #83435 - "ALTER TABLE is very slow when using PARTITIONED table". It was reported by Roel Van de Paar back in 2016 and still remains "Verified".
  • Bug #73084 - "Exchanging partitions defined with DATA DIRECTORY and INDEX DIRECTORY options". The bug still remains "Open" (see Bug #77772 also).
  • Bug #73648 - "innodb table replication is very slow with some of the partitioned table". It seems to be fixed last year as internal Bug #25687813 (see release notes for 5.6.38), but nobody cares to find this older duplicate and change its status or re-verify it.
  • Bug #83750 - "Import via TTS of a partitioned table only uses 1 cpu core". This feature requested by Daniël van Eeden makes a lot of sense. I truly hope to see parallel operations implemented for partitioned tables in GA MySQL versions (as I saw some parallel processing for partitions done for some upcoming "6.1" or so version back in 2008 in Riga during the MySQL's last company meeting I've attended).
  • Bug #64498 - "Running out of file handles when ALTERing partitioned MyISAM table". Too many file handles are needed. This is a documented limitation that DBAs should still take into account.
I also prepared a separate small list of partition pruning bugs:
  • Bug #83248 - "Partition pruning is not working with LEFT JOIN". I've reported it back in 2016 and it is still not fixed. There are reasons to think it is not so easy.
  • Bug #75085 - "Partition pruning on key partitioning with ENUM". It was reported by  Daniël van Eeden back in 2014!
  • Bug #77318 - "Selects waiting on MDL when altering partitioned table". One of the worst expectations DBA may have is that partitioned tables help to workaround "global" MDL locks because of partition pruning! This is not the case.
Does this story have any moral? I think so, and for me it's the following:
  1. Partitioning bugs do not get proper attention from Oracle engineers. We see bugs with wrong status and even a bug with a clear test case and a duplicate that is "Open" for 4 years. Some typical use cases are affected badly, and still no fixes (even though since 5.7 we have native partitioning in InnoDB and changing implementation gave good chance to review and either fix or re-check these bugs).
  2. MySQL DBAs should expect all kinds of surprises when running usual DDL statements (ALTER TABLE to add column even) with partitioned tables. In the best case DDL is just unexpectedly slow for them.
  3. Partition pruning may not work they way one expects.
  4. We miss parallel processing for partitioned tables. They should allow to speed up queries and DDL, not to slow them down instead...
  5. One can suspect that there is no careful internal testing performed on integration of partitioning with other features, or even basic partition maintenance operations.

This Week in Data with Colin Charles 43: Polyglots, Security and DataOps.Barcelona

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

This is a short working week for me due to a family emergency. It caused me to skip speaking at DataOps.Barcelona and miss hanging out with the awesome of speakers and attendees. This is the first time I’ve missed a scheduled talk, and I received many messages about my absence. I am sure we will all meet again soon.

One of the talks I was planning to give at DataOps.Barcelona will be available as a Percona webinar next week: Securing Your Database Servers from External Attacks on Thursday, June 28, 2018, at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4). I am also giving a MariaDB 10.3 overview on Tuesday, June 26, 2018, at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4). I will “virtually” see you there.

If you haven’t already read Werner Vogel’s post A one size fits all database doesn’t fit anyone, I highly recommend it. It is true there is no “one size fits all” solution when it comes to databases. This is why Percona has made “the polyglot world” a theme. It’s why Amazon offers different database flavors: relational (Aurora for MySQL/PostgreSQL, RDS for MySQL/PostgreSQL/MariaDB Server), key-value (DynamoDB), document (DynamoDB), graph (Neptune), in-memory (ElastiCache for Redis & Memcached), search (Elasticsearch service). The article has a plethora of use cases, from AirBnB using Aurora, to Snapchat Stories and Tinder using DynamoDB, to Thomson Reuters using Neptune, down to McDonald’s using ElastiCache and Expedia using Elasticsearch. This kind of detail, and customer use case, is great.

There are plenty more stories and anecdotes in the post, and it validates why Percona is focused not just on MySQL, but also MariaDB, MongoDB, PostgreSQL and polyglot solutions. From a MySQL lens, it’s also worth noting that not one storage engine fits every use case. Facebook famously migrated a lot of their workload from InnoDB to MyRocks, and it is exciting to see Mark Callaghan stating that there are already three big workloads on MyRocks in production, with another two coming soon.

Releases
  • MariaDB 10.1.34 – including fixes for InnoDB defragmentation and full text search (MDEV-15824). This was from the WebScaleSQL tree, ported by KakaoTalk to MariaDB Server.
  • Percona XtraDB Cluster 5.6.40-26.25 – now with Percona Server for MySQL 5.6.40, including a new variable to configure rolling schema upgrade (RSU) wait for active commit connection timeouts.
  • Are you using the MariaDB Connector/C, Connector/J or Connector/ODBC? A slew of updates abound.
Link List Industry Updates Upcoming appearances
  • OSCON – Portland, Oregon, USA – July 16-19 2018
Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

The post This Week in Data with Colin Charles 43: Polyglots, Security and DataOps.Barcelona appeared first on Percona Database Performance Blog.

Not a Fan of Redhat RPMs Today or Why No Follow Up for RH/Centos/Fedora from Last Blog


I received a lot of good feedback Building the PHP MySQL XDevAPI PECL Extension on MySQL 8.0.11 and PHP 7.2 for the MySQL Document Store including a few folks asking if I could document that it takes to get the MySQL X DevAPI working with an RPM based Linux distro.

Well I'd really like to.  But I can't.

Redhat Linux 4 I still remember getting my copy of Redhat Linux 4.0 (not RHEL -- no enterprise thoughts in those days)  It was January 1997 and I installed it the week before Rehaht 4.1 came out.  I thought that RPMs were much better than the old 'unzip the tar file;./configure; make install' circus. I thought Redhat was pretty cool. Heck I even became a RHCE.  
Then I found the Debian variants easier to work with and more up to date.  My not so humble opinion is that Ubuntu is the best all around Linux distro around. But there are a lot of RPM based systems out there and I need to be able to show how to get the MySQL X Devapi working on them.  But it ain't easy.
Step 1 Install CentOS 7
I had not installed CentOS in some time and it installed fairly nicely or as nice as Ubuntu.   So no problem there,
Step 2 What is that Deb package Called in RPM speak?
Here is where the wagon goes off the rails. TO get the developer tools you must sudo yum -y groupinstall  'Development Tools'.  Then you get to download the OpenSSL tarball 'cause there is not one RPM for CentOS 7 (considering its popularity I was gobsmacked to discover this lacking). Next was loading the protobuf-devel package.  
But when I try to run the PECL install mysql_xdevapi I run into a compiler error.  Well, it is Friday afternoon and my frustration/inexperience with recent RPM software is telling me to call it a week.
I will try again later.  If you are more up to speed on RPMs and want to provide guidance for me please do so.  If not have a good weekend!

Getting started with Orchestrator

Orchestrator is a MySQL high availability and replication management tool. In this blog post, we will cover the first steps for getting started with it on an existing topology.

The code examples assume you are running Centos 7, but the general steps should be similar if you are running other operating system versions/flavors.

Prep work

1. Create a MySQL user on each of your database servers.
Orchestrator will connect with this user to discover the topology and to perform any changes you tell it to make.

CREATE USER 'orchestrator'@'%' IDENTIFIED BY '****'; GRANT SUPER, PROCESS, REPLICATION SLAVE, RELOAD ON *.* TO 'orchestrator'@'%'; GRANT SELECT ON mysql.slave_master_info TO 'orchestrator'@'%'; GRANT SELECT ON meta.* TO 'orchestrator'@'%';

Note: Orchestrator reads replication credentials stored in mysql.slave_master_info table, which implies you need to set up your servers with master_info_repository = ‘TABLE’ option if you want Orchestrator to be able to make topology changes on your behalf.

2. (Optional) Create a table to store the cluster name on each of your database servers.
The idea is that if you set this up on each of your database servers, Orchestrator will pick up the cluster name automatically. While you could skip this step and just rename the clusters via the GUI later, it is a nice to have.

CREATE DATABASE meta; CREATE TABLE meta.cluster ( anchor TINYINT NOT NULL, cluster_name VARCHAR(128) CHARSET ascii NOT NULL DEFAULT '', cluster_domain VARCHAR(128) CHARSET ascii NOT NULL DEFAULT '', PRIMARY KEY (anchor) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT INTO meta.cluster (anchor, cluster_name, cluster_domain) VALUES (1, 'testcluster', 'example.com'); Installation steps

1. Get the prerequisites.
Orchestrator client requires jq package, which is available from epel repo, so let’s install that first:

sudo yum install epel-release sudo yum install jq

2. Install Orchestrator.
Orchestrator is readily available as .deb or .rpm package so let’s use that:

sudo yum install https://github.com/github/orchestrator/releases/download/v3.0.11/orchestrator-3.0.11-1.x86_64.rpm

3. Prepare the configuration file.
Let’s start by copying the sample config file:

cp /usr/local/orchestrator/orchestrator-sample.conf.json /etc/orchestrator.conf.json

There are a lot of settings we can tweak, but to get started, we need to config the following values:

  • “MySQLTopologyUser”
  • “MySQLTopologyPassword”

These are the ones we defined above.

  • “DetectClusterAliasQuery”

We have to define a query that will let Orchestrator figure out the cluster name based on the table we created in the previous step.

  • “BackendDB”
  • “SQLite3DataFile”

You can use different backends for Orchestrator metadata. For simplicity, I suggest using SQLite which requires only the two lines above.

Here’s a complete configuration file:

{ "Debug": true, "EnableSyslog": false, "ListenAddress": ":3000", "MySQLTopologyUser": "orchestrator", "MySQLTopologyPassword": "****", "MySQLTopologyCredentialsConfigFile": "", "MySQLTopologySSLPrivateKeyFile": "", "MySQLTopologySSLCertFile": "", "MySQLTopologySSLCAFile": "", "MySQLTopologySSLSkipVerify": true, "MySQLTopologyUseMutualTLS": false, "MySQLOrchestratorHost": "127.0.0.1", "MySQLOrchestratorPort": 3306, "MySQLOrchestratorDatabase": "orchestrator", "MySQLOrchestratorUser": "orc_server_user", "MySQLOrchestratorPassword": "orc_server_password", "MySQLOrchestratorCredentialsConfigFile": "", "MySQLOrchestratorSSLPrivateKeyFile": "", "MySQLOrchestratorSSLCertFile": "", "MySQLOrchestratorSSLCAFile": "", "MySQLOrchestratorSSLSkipVerify": true, "MySQLOrchestratorUseMutualTLS": false, "MySQLConnectTimeoutSeconds": 1, "DefaultInstancePort": 3306, "DiscoverByShowSlaveHosts": true, "InstancePollSeconds": 5, "UnseenInstanceForgetHours": 240, "SnapshotTopologiesIntervalHours": 0, "InstanceBulkOperationsWaitTimeoutSeconds": 10, "HostnameResolveMethod": "default", "MySQLHostnameResolveMethod": "@@hostname", "SkipBinlogServerUnresolveCheck": true, "ExpiryHostnameResolvesMinutes": 60, "RejectHostnameResolvePattern": "", "ReasonableReplicationLagSeconds": 10, "ProblemIgnoreHostnameFilters": [], "VerifyReplicationFilters": false, "ReasonableMaintenanceReplicationLagSeconds": 20, "CandidateInstanceExpireMinutes": 60, "AuditLogFile": "", "AuditToSyslog": false, "RemoveTextFromHostnameDisplay": ".mydomain.com:3306", "ReadOnly": false, "AuthenticationMethod": "", "HTTPAuthUser": "", "HTTPAuthPassword": "", "AuthUserHeader": "", "PowerAuthUsers": [ "*" ], "ClusterNameToAlias": { "127.0.0.1": "test suite" }, "SlaveLagQuery": "", "DetectClusterAliasQuery": "SELECT ifnull(max(cluster_name), '''') as cluster_alias from meta.cluster where anchor=1;", "DetectClusterDomainQuery": "", "DetectInstanceAliasQuery": "", "DetectPromotionRuleQuery": "", "DataCenterPattern": "[.]([^.]+)[.][^.]+[.]mydomain[.]com", "PhysicalEnvironmentPattern": "[.]([^.]+[.][^.]+)[.]mydomain[.]com", "PromotionIgnoreHostnameFilters": [], "DetectSemiSyncEnforcedQuery": "", "ServeAgentsHttp": false, "AgentsServerPort": ":3001", "AgentsUseSSL": false, "AgentsUseMutualTLS": false, "AgentSSLSkipVerify": false, "AgentSSLPrivateKeyFile": "", "AgentSSLCertFile": "", "AgentSSLCAFile": "", "AgentSSLValidOUs": [], "UseSSL": false, "UseMutualTLS": false, "SSLSkipVerify": false, "SSLPrivateKeyFile": "", "SSLCertFile": "", "SSLCAFile": "", "SSLValidOUs": [], "URLPrefix": "", "StatusEndpoint": "/api/status", "StatusSimpleHealth": true, "StatusOUVerify": false, "AgentPollMinutes": 60, "UnseenAgentForgetHours": 6, "StaleSeedFailMinutes": 60, "SeedAcceptableBytesDiff": 8192, "PseudoGTIDPattern": "", "PseudoGTIDPatternIsFixedSubstring": false, "PseudoGTIDMonotonicHint": "asc:", "DetectPseudoGTIDQuery": "", "BinlogEventsChunkSize": 10000, "SkipBinlogEventsContaining": [], "ReduceReplicationAnalysisCount": true, "FailureDetectionPeriodBlockMinutes": 60, "RecoveryPollSeconds": 10, "RecoveryPeriodBlockSeconds": 3600, "RecoveryIgnoreHostnameFilters": [], "RecoverMasterClusterFilters": [ ".*" ], "RecoverIntermediateMasterClusterFilters": [ "_intermediate_master_pattern_" ], "OnFailureDetectionProcesses": [ "echo 'Detected {failureType} on {failureCluster}. Affected replicas: {countSlaves}' >> /tmp/recovery.log" ], "PreFailoverProcesses": [ "echo 'Will recover from {failureType} on {failureCluster}' >> /tmp/recovery.log" ], "PostFailoverProcesses": [ "echo '(for all types) Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log" ], "PostUnsuccessfulFailoverProcesses": [], "PostMasterFailoverProcesses": [ "echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Promoted: {successorHost}:{successorPort}' >> /tmp/recovery.log" ], "PostIntermediateMasterFailoverProcesses": [ "echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log" ], "CoMasterRecoveryMustPromoteOtherCoMaster": true, "DetachLostSlavesAfterMasterFailover": true, "ApplyMySQLPromotionAfterMasterFailover": true, "MasterFailoverDetachSlaveMasterHost": false, "MasterFailoverLostInstancesDowntimeMinutes": 0, "PostponeSlaveRecoveryOnLagMinutes": 0, "OSCIgnoreHostnameFilters": [], "GraphiteAddr": "", "GraphitePath": "", "GraphiteConvertHostnameDotsToUnderscores": true, "BackendDB": "sqlite", "SQLite3DataFile": "/usr/local/orchestrator/orchestrator.db" }

4. Prepare systemd script.
At the time of this writing, a systemd script to manage start/stop of the service is not included. I set that up as follows:

vi /etc/systemd/system/orchestrator.service [Unit] Description="orchestrator: MySQL replication management and visualization" After=syslog.target network.target [Service] Type=simple ExecStart=/usr/local/orchestrator/orchestrator --verbose http PIDFile=/var/run/orchestrator.pid WorkingDirectory=/usr/local/orchestrator [Install] WantedBy=multi-user.target

5. Reload the service so it reads the updated configuration file /etc/orchestrator.conf.json.

service orchestrator reload

That’s it for the installation steps.

Using the GUI

The first step is pointing Orchestrator to a starting host. It will then automatically discover any other hosts that are members of the same cluster by crawling up and down the replication chain.

1. Using a browser, open the GUI (by default port 3000) on the host where you installed Orchestrator:
http://192.168.56.100:3000/

2. Click ‘Discover‘ on the top bar, enter the ip address and port of any host you want, and hit the Submit button.

3. Click on Clusters -> Dashboard. After a few seconds, you should see the cluster being shown (refresh the page if needed).

4. Now click on the cluster name and you should see the details about all current members of the replication topology.

The cluster should have automatically been named according to what is specified in meta.cluster table.
You can also use the GUI to drag & drop to perform topology changes, provided you are using GTID or pseudo-GTID (which is a topic for a different post).

Closing thoughts

Orchestrator is rapidly becoming the go-to topology management tool in the MySQL world. I encourage you to start playing with it as soon as possible.

If you are interested in learning more, I suggest you also check out the following posts in the Pythian blog.

Happy Orchestrating!

MySQL High availability with HAProxy, Consul and Orchestrator

Graceful master switchover with ProxySQL and Orchestrator

 

Back to basics: Isolation Levels In MySQL

In this blog, we will see the very basic thing “I” of “ACID” and an important property of Transaction ie., “ISOLATION”

The isolation defines the way in which the MySQL server (InnoDB) separates each transaction from other concurrent running transaction in the server and also ensures that the transactions are processed in a reliable way. If transactions are not isolated then one transaction could modify the data that another transaction is reading hence creating data inconsistency. Isolation levels determine how isolated the transactions are from each other.

MySQL supports all four the isolation levels that SQL-Standard defines.The four isolation levels are

  • READ UNCOMMITTED
  • READ COMMITTED
  • REPEATABLE READ
  • SERIALIZABLE

The Isolation level’s can be set globally or session based on our requirements.

 

 

Choosing the best isolation level based, have a great impact on the database, Each level of isolation comes with a trade-off, let’s discuss on each of them,

READ UNCOMMITTED:

In READ-UNCOMMITTED isolation level, there isn’t much isolation present between the transactions at all, ie ., No locks. A transaction can see changes to data made by other transactions that are not committed yet. This is the lowest level in isolation and highly performant since there is no overhead of maintaining locks, With this isolation level, there is always for getting a “Dirty-Read

That means transactions could be reading data that may not even exist eventually because the other transaction that was updating the data rolled-back the changes and didn’t commit. lest see the below image for better understanding

Suppose a transaction T1 modifies a row if a transaction T2 reads the row and sees the modification even though T1 has not committed it, that is a dirty read, the problem here is if T1 rolls back, T2 doesn’t know that and will be in a state of “totally perplexed”

READ COMMITTED:

IN READ-COMMITTED isolation level, the phenomenon of dirty read is avoided, because any uncommitted changes are not visible to any other transaction until the change is committed. This is the default isolation level with most of popular RDBMS software, but not with MySQL.

Within this isolation level, each SELECT uses its own snapshot of the committed data that was committed before the execution of the SELECT. Now because each SELECT has its own snapshot, here is the trade-off now, so the same SELECT, when running multiple times during the same transaction, could return different result sets. This phenomenon is called non-repeatable read.

A non-repeatable occurs when a transaction performs the same transaction twice but gets a different result set each time. Suppose T2 reads some of the rows and T1 then change a row and commit the change, now T2 reads the same row set and gets a different result ie.., the initial read is non-repeatable.

Read-committed is the recommended isolation level for Galera ( PXC, MariaDB Cluster ) and InnoDB clusters.

REPEATABLE READ:

In REPEATABLE-READ isolation level, the phenomenon of non-repeatable read is avoided. It is the default isolation in MySQL.This isolation level returns the same result set throughout the transaction execution for the same SELECT run any number of times during the progression of a transaction.

This is how it works, a snapshot of the SELECT is taken the first time the SELECT is run during the transaction and the same snapshot is used throughout the transaction when the same SELECT is executed. A transaction running in this isolation level does not take into account any changes to data made by other transactions, regardless of whether the changes have been committed or not. This ensures that reads are always consistent(repeatable). Maintaining a snapshot can cause extra overhead and impact some performance

Although this isolation level solves the problem of non-repeatable read, another possible problem that occurs is phantom reads.

A Phantom is a row that appears where it is not visible before. InnoDB and XtraDB solve the phantom read problem with multi-version concurrency control.

REPEATABLE READ is MySQL’s default transaction isolation level.

SERIALIZABLE

SERIALIZABLE completely isolates the effect of one transaction from others. It is similar to REPEATABLE READ with the additional restriction that row selected by one transaction cannot be changed by another until the first transaction finishes. The phenomenon of phantom reads is avoided. This isolation level is the strongest possible isolation level. AWS Aurora do not support this isolation level.

 

Photo by Alberto Triano on Unsplash

How to Install Pimcore CMS on Ubuntu 18.04 LTS

Pimcore is a free and open-source enterprise content management system written in PHP language and uses MySQL/MariaDB for databases. It is fast, flexible, and designer and developer-friendly that allow us to manage and share all of our digital assets across any channel.

Lock Down: Enforcing SELinux with Percona XtraDB Cluster

Why do I spend time blogging about security frameworks? Because, although there are some resources available on the Web, none apply to Percona XtraDB Cluster (PXC) directly. Actually, I rarely encounter a MySQL setup where SELinux is enforced and never when Percona XtraDB Cluster (PXC) or another Galera replication implementation is used. As we’ll see, there are good reasons for that. I originally thought this post would be a simple “how to” but it ended up with a push request to modify the SST script and a few other surprises.

Some context

These days, with all the major security breaches of the last few years, the importance of security in IT cannot be highlighted enough. For that reason, security in MySQL has been progressively tightened from version to version and the default parameters are much more restrictive than they used to be. That’s all good but it is only at the MySQL level if there is still a breach allowing access to MySQL, someone could in theory do everything the mysql user is allowed to do. To prevent such a situation, the operations that mysqld can do should be limited to only what it really needs to do. SELinux’ purpose is exactly that. You’ll find SELinux on RedHat/Centos and their derived distributions. Debian, Ubuntu and OpenSuse uses another framework, AppArmor, which is functionally similar to SELinux. I’ll talk about AppArmor in a future post, let’s focus for now on SELinux.

The default behavior of many DBAs and Sysadmins appears to be: “if it doesn’t work, disable SELinux”. Sure enough, it often solves the issue but it also removes an important security layer. I believe disabling SELinux is the wrong cure so let’s walk through the steps of configuring a PXC cluster with SELinux enforced.

Starting point

As a starting point, I’ll assume you have a running PXC cluster operating with SELinux in permissive mode. That likely means the file “/etc/sysconfig/selinux” looks like this:

# This file controls the state of SELinux on the system. # SELINUX= can take one of these three values: # enforcing - SELinux security policy is enforced. # permissive - SELinux prints warnings instead of enforcing. # disabled - No SELinux policy is loaded. SELINUX=permissive # SELINUXTYPE= can take one of three two values: # targeted - Targeted processes are protected, # minimum - Modification of targeted policy. Only selected processes are protected. # mls - Multi Level Security protection. SELINUXTYPE=targeted

For the purpose of writing this article, I created a 3 nodes PXC cluster with the hosts: BlogSELinux1, BlogSELinux2 and BlogSELinux3. On BlogSELinux1, I set SELinux in permissive mode, I truncated the audit.log. SELinux violations are logged in the audit.log file.

[root@BlogSELinux1 ~]# getenforce Permissive [root@BlogSELinux1 ~]# echo '' > /var/log/audit/audit.log

Let’s begin by covering the regular PXC operation items like start, stop, SST Donor, SST Joiner, IST Donor and IST Joiner. As we execute the steps in the list, the audit.log file will record SELinux related elements.

Stop and start

Those are easy:

[root@BlogSELinux1 ~]# systemctl stop mysql [root@BlogSELinux1 ~]# systemctl start mysql

SST Donor

On BlogSELinux3:

[root@BlogSELinux3 ~]# systemctl stop mysql

then on BlogSELinux2:

[root@BlogSELinux2 ~]# systemctl stop mysql [root@BlogSELinux2 ~]# rm -f /var/lib/mysql/grastate.dat [root@BlogSELinux2 ~]# systemctl start mysql

SST Joiner

We have BlogSELinux1 and BlogSELinux2 up and running, we just do:

[root@BlogSELinux1 ~]# systemctl stop mysql [root@BlogSELinux1 ~]# rm -f /var/lib/mysql/grastate.dat [root@BlogSELinux1 ~]# systemctl start mysql

IST Donor

We have BlogSELinux1 and BlogSELinux2 up and running, we just do:

[root@BlogSELinux2 ~]# systemctl stop mysql

Then on the first node:

[root@BlogSELinux1 ~]# mysql -e 'create database test;'; [root@BlogSELinux1 ~]# mysql -e 'create table test.testtable (id int not null, primary key (id)) engine=innodb;' [root@BlogSELinux1 ~]# mysql -e 'insert into test.testtable (id) values (1);'

Those statements put some data in the gcache, now we just restart the second node:

[root@BlogSELinux2 ~]# systemctl start mysql

IST Joiner

We have BlogSELinux1 and BlogSELinux2 up and running, we just do:

[root@BlogSELinux1 ~]# systemctl stop mysql

Then on the second node:

[root@BlogSELinux2 ~]# mysql -e 'insert into test.testtable (id) values (2);'

to insert some data in the gcache and we restart the first node:

[root@BlogSELinux1 ~]# systemctl start mysql

First run

Now that we performed the basic operations of a cluster while recording the security violations in permissive mode, we can look at the audit.log file and start building the SELinux policy. Let’s begin by installing the tools needed to manipulate the SELinux audit log and policy files with:

[root@BlogSELinux1 ~]# yum install policycoreutils-python.x86_64

Then, we’ll use the audit2allow tool to analyze the audit.log file:

[root@BlogSELinux1 ~]# grep -i denied /var/log/audit/audit.log | grep mysqld_t | audit2allow -M PXC ******************** IMPORTANT *********************** To make this policy package active, execute: semodule -i PXC.pp

We end up with 2 files, PXC.te and PXC.pp. The pp file is a compiled version of the human readable te file. If we examine the content of the PXC.te file, at the beginning, we have the require section listing all the involved SELinux types and classes:

module PXC 1.0; require { type unconfined_t; type init_t; type auditd_t; type mysqld_t; type syslogd_t; type NetworkManager_t; type unconfined_service_t; type system_dbusd_t; type tuned_t; type tmp_t; type dhcpc_t; type sysctl_net_t; type kerberos_port_t; type kernel_t; type unreserved_port_t; type firewalld_t; type systemd_logind_t; type chronyd_t; type policykit_t; type udev_t; type mysqld_safe_t; type postfix_pickup_t; type sshd_t; type crond_t; type getty_t; type lvm_t; type postfix_qmgr_t; type postfix_master_t; class process { getattr setpgid }; class unix_stream_socket connectto; class system module_request; class netlink_tcpdiag_socket { bind create getattr nlmsg_read setopt }; class tcp_socket { name_bind name_connect }; class file { getattr open read write }; class dir search; }

Then, using these types and classes, the policy file adds a series of generic allow rules matching the denied found in the audit.log file. Here’s what I got:

#============= mysqld_t ============== allow mysqld_t NetworkManager_t:process getattr; allow mysqld_t auditd_t:process getattr; allow mysqld_t chronyd_t:process getattr; allow mysqld_t crond_t:process getattr; allow mysqld_t dhcpc_t:process getattr; allow mysqld_t firewalld_t:process getattr; allow mysqld_t getty_t:process getattr; allow mysqld_t init_t:process getattr; #!!!! This avc can be allowed using the boolean 'nis_enabled' allow mysqld_t kerberos_port_t:tcp_socket name_bind; allow mysqld_t kernel_t:process getattr; #!!!! This avc can be allowed using the boolean 'domain_kernel_load_modules' allow mysqld_t kernel_t:system module_request; allow mysqld_t lvm_t:process getattr; allow mysqld_t mysqld_safe_t:process getattr; allow mysqld_t policykit_t:process getattr; allow mysqld_t postfix_master_t:process getattr; allow mysqld_t postfix_pickup_t:process getattr; allow mysqld_t postfix_qmgr_t:process getattr; allow mysqld_t sysctl_net_t:file { getattr open read }; allow mysqld_t syslogd_t:process getattr; allow mysqld_t system_dbusd_t:process getattr; allow mysqld_t systemd_logind_t:process getattr; allow mysqld_t tuned_t:process getattr; allow mysqld_t udev_t:process getattr; allow mysqld_t unconfined_service_t:process getattr; allow mysqld_t unconfined_t:process getattr; allow mysqld_t tuned_t:process getattr; allow mysqld_t udev_t:process getattr; allow mysqld_t sshd_t:process getattr; allow mysqld_t self:netlink_tcpdiag_socket { bind create getattr nlmsg_read setopt }; allow mysqld_t self:process { getattr setpgid }; #!!!! The file '/var/lib/mysql/mysql.sock' is mislabeled on your system. #!!!! Fix with $ restorecon -R -v /var/lib/mysql/mysql.sock #!!!! This avc can be allowed using the boolean 'daemons_enable_cluster_mode' allow mysqld_t self:unix_stream_socket connectto; allow mysqld_t sshd_t:process getattr; allow mysqld_t sysctl_net_t:dir search; allow mysqld_t sysctl_net_t:file { getattr open read }; allow mysqld_t syslogd_t:process getattr; allow mysqld_t system_dbusd_t:process getattr; allow mysqld_t systemd_logind_t:process getattr; #!!!! WARNING 'mysqld_t' is not allowed to write or create to tmp_t. Change the label to mysqld_tmp_t. allow mysqld_t tmp_t:file write; allow mysqld_t tuned_t:process getattr; allow mysqld_t udev_t:process getattr; allow mysqld_t unconfined_service_t:process getattr; allow mysqld_t unconfined_t:process getattr; #!!!! This avc can be allowed using one of the these booleans: # nis_enabled, mysql_connect_any allow mysqld_t unreserved_port_t:tcp_socket { name_bind name_connect };

I can understand some of these rules. For example, one of the TCP ports used by Kerberos is 4444 and it is also used by PXC for the SST transfer. Similarly, MySQL needs to write to /tmp. But what about all the other rules?

Troubleshooting

We could load the PXC.pp module we got in the previous section and consider our job done. It will likely allow the PXC node to start and operate normally but what exactly is happening? Why did MySQL or one of its subprocesses asked for the process attributes getattr of all the running processes like sshd, syslogd and cron. Looking directly in the audit.log file, I found many entries like these:

type=AVC msg=audit(1527792830.989:136): avc: denied { getattr } for pid=3683 comm="ss" scontext=system_u:system_r:mysqld_t:s0 tcontext=system_u:system_r:init_t:s0 tclass=process type=AVC msg=audit(1527792830.990:137): avc: denied { getattr } for pid=3683 comm="ss" scontext=system_u:system_r:mysqld_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=process type=AVC msg=audit(1527792830.991:138): avc: denied { getattr } for pid=3683 comm="ss" scontext=system_u:system_r:mysqld_t:s0 tcontext=system_u:system_r:syslogd_t:s0 tclass=process

So, ss, a network utility tool, scans all the processes. That rang a bell… I knew where to look for, the sst script. Here’s the source of the problem in the wsrep_sst_xtrabackup-v2 file:

wait_for_listen() { local HOST=$1 local PORT=$2 local MODULE=$3 for i in {1..300} do ss -p state listening "( sport = :$PORT )" | grep -qE 'socat|nc' && break sleep 0.2 done echo "ready ${HOST}:${PORT}/${MODULE}//$sst_ver" }

This bash function is used when the node is a joiner and it checks using ss if the TCP port used by socat or nc is opened. The check is needed in order to avoid replying too early with the “ready” message. The code is functionally correct but wrong, security wise. Instead of looking if there is a socat or nc command running in the list of processes owned by the mysql user, it checks if any of the processes has opened the SST port and only then does it checks if the name of the command is socat or nc. Since we don’t know which processes will be running on the server, we can’t write a good security profile. For example, in the future, one could add the ntpd daemon, causing PXC to fail to start yet again. To avoid that, the function needs to be modified like this:

wait_for_listen() { local HOST=$1 local PORT=$2 local MODULE=$3 for i in {1..300} do sleep 0.2 # List only our (mysql user) processes to avoid triggering SELinux for cmd in $(ps -u $(id -u) -o pid,comm | sed 's/^\s*//g' | tr ' ' '|' | grep -E 'socat|nc') do pid=$(echo $cmd | cut -d'|' -f1) # List the sockets of the pid sockets=$(ls -l /proc/$pid/fd | grep socket | cut -d'[' -f2 | cut -d ']' -f1 | tr '\n' '|') if [[ -n $sockets ]]; then # Is one of these sockets listening on the SST port? # If so, we need to break from 2 loops grep -E "${sockets:0:-1}" /proc/$pid/net/tcp | \ grep "00000000:$(printf '%X' $PORT)" > /dev/null \ && break 2 fi done done echo "ready ${HOST}:${PORT}/${MODULE}//$sst_ver" }

The modified function removes many of the denied messages in the audit log file and simplifies a lot the content of PXC.te. I tested the above modification and made a pull request to PXC. Among the remaining items, we have:

allow mysqld_t self:process { getattr setpgid };

setpgid is called often used after a fork to set the process group, usually through the setsid call. MySQL uses fork when it starts with the daemonize option but our installation of Percona XtraDB cluster uses mysqld_safe and does not directly run as a daemon. Another fork call is part of the wsrep source files and is used to launch processes like the SST script and is done when mysqld is already running with reduced privileges. This later invocation is certainly our culprit.

TCP ports

What about TPC ports? PXC uses quite a few. Of course there is the 3306/tcp port used to access MySQL. Galera also uses the ports 4567/tcp for replication, 4568/tcp for IST and 4444/tcp for SST. Let’s have a look which ports SELinux allows PXC to use:

[root@BlogSELinux1 audit]# semanage port -l | grep mysql mysqld_port_t tcp 1186, 3306, 63132-63164

No surprise, port 3306/tcp is authorized but if you are new to MySQL, you may wonder what uses the 1186/tcp. It is the port used by NDB cluster for inter-node communication (NDB API). Now, if we try to add the missing ports:

[root@BlogSELinux1 audit]# semanage port -a -t mysqld_port_t -p tcp 4567 ValueError: Port tcp/4567 already defined [root@BlogSELinux1 audit]# semanage port -a -t mysqld_port_t -p tcp 4568 [root@BlogSELinux1 audit]# semanage port -a -t mysqld_port_t -p tcp 4444 ValueError: Port tcp/4444 already defined

4568/tcp was successfully added but, 4444/tcp and 4567/tcp failed because they are already assigned to another security context. For example, 4444/tcp belongs to the kerberos security context:

[root@BlogSELinux1 audit]# semanage port -l | grep kerberos_port kerberos_port_t tcp 88, 750, 4444 kerberos_port_t udp 88, 750, 4444

A TCP port is not allowed by SELinux to belong to more than one security context. We have no other choice than to move the two missing ports to the mysqld_t security context:

[root@BlogSELinux1 audit]# semanage port -m -t mysqld_port_t -p tcp 4444 [root@BlogSELinux1 audit]# semanage port -m -t mysqld_port_t -p tcp 4567 [root@BlogSELinux1 audit]# semanage port -l | grep mysqld mysqld_port_t tcp 4567, 4444, 4568, 1186, 3306, 63132-63164

If you happen to be planning to deploy a Kerberos server on the same servers you may have to run PXC using a different port for Galera replication. In that case, and in the case where you want to run MySQL on a port other than 3306/tcp, you’ll need to add the port to the mysqld_port_t context like we just did above. Do not worry too much for the port 4567/tcp, it is reserved for tram which, from what I found, is a remote access protocol for routers.

Non-default paths

It is very frequent to run MySQL with non-standard paths/directories. With SELinux, you don’t list the authorized path in the security context, you add the security context labels to the paths. Adding a context label is a two steps process, basically change and apply. For example, if you are using /data as the MySQL datadir, you need to do:

semanage fcontext -a -t mysqld_db_t "/data(/.*)?" restorecon -R -v /data

On a RedHat/Centos 7 server, the MySQL file contexts and their associated paths are:

[root@BlogSELinux1 ~]# bzcat /etc/selinux/targeted/active/modules/100/mysql/cil | grep filecon (filecon "HOME_DIR/\.my\.cnf" file (system_u object_r mysqld_home_t ((s0) (s0)))) (filecon "/root/\.my\.cnf" file (system_u object_r mysqld_home_t ((s0) (s0)))) (filecon "/usr/lib/systemd/system/mysqld.*" file (system_u object_r mysqld_unit_file_t ((s0) (s0)))) (filecon "/usr/lib/systemd/system/mariadb.*" file (system_u object_r mysqld_unit_file_t ((s0) (s0)))) (filecon "/etc/my\.cnf" file (system_u object_r mysqld_etc_t ((s0) (s0)))) (filecon "/etc/mysql(/.*)?" any (system_u object_r mysqld_etc_t ((s0) (s0)))) (filecon "/etc/my\.cnf\.d(/.*)?" any (system_u object_r mysqld_etc_t ((s0) (s0)))) (filecon "/etc/rc\.d/init\.d/mysqld" file (system_u object_r mysqld_initrc_exec_t ((s0) (s0)))) (filecon "/etc/rc\.d/init\.d/mysqlmanager" file (system_u object_r mysqlmanagerd_initrc_exec_t ((s0) (s0)))) (filecon "/usr/bin/mysqld_safe" file (system_u object_r mysqld_safe_exec_t ((s0) (s0)))) (filecon "/usr/bin/mysql_upgrade" file (system_u object_r mysqld_exec_t ((s0) (s0)))) (filecon "/usr/libexec/mysqld" file (system_u object_r mysqld_exec_t ((s0) (s0)))) (filecon "/usr/libexec/mysqld_safe-scl-helper" file (system_u object_r mysqld_safe_exec_t ((s0) (s0)))) (filecon "/usr/sbin/mysqld(-max)?" file (system_u object_r mysqld_exec_t ((s0) (s0)))) (filecon "/usr/sbin/mysqlmanager" file (system_u object_r mysqlmanagerd_exec_t ((s0) (s0)))) (filecon "/usr/sbin/ndbd" file (system_u object_r mysqld_exec_t ((s0) (s0)))) (filecon "/var/lib/mysql(-files|-keyring)?(/.*)?" any (system_u object_r mysqld_db_t ((s0) (s0)))) (filecon "/var/lib/mysql/mysql\.sock" socket (system_u object_r mysqld_var_run_t ((s0) (s0)))) (filecon "/var/log/mariadb(/.*)?" any (system_u object_r mysqld_log_t ((s0) (s0)))) (filecon "/var/log/mysql.*" file (system_u object_r mysqld_log_t ((s0) (s0)))) (filecon "/var/run/mariadb(/.*)?" any (system_u object_r mysqld_var_run_t ((s0) (s0)))) (filecon "/var/run/mysqld(/.*)?" any (system_u object_r mysqld_var_run_t ((s0) (s0)))) (filecon "/var/run/mysqld/mysqlmanager.*" file (system_u object_r mysqlmanagerd_var_run_t ((s0) (s0))))

If you want to avoid security issues with SELinux, you should stay within those paths. A good example of an offending path is the PXC configuration file and directory which are now located in their own directory. These are not labeled correctly for SELinux:

[root@BlogSELinux1 ~]# ls -Z /etc/per* -rw-r--r--. root root system_u:object_r:etc_t:s0 /etc/percona-xtradb-cluster.cnf /etc/percona-xtradb-cluster.conf.d: -rw-r--r--. root root system_u:object_r:etc_t:s0 mysqld.cnf -rw-r--r--. root root system_u:object_r:etc_t:s0 mysqld_safe.cnf -rw-r--r--. root root system_u:object_r:etc_t:s0 wsrep.cnf

I must admit that even if the security context labels on those files were not set, I got no audit messages and everything worked normally. Nevetheless, adding the labels is straightforward:

[root@BlogSELinux1 ~]# semanage fcontext -a -t mysqld_etc_t "/etc/percona-xtradb-cluster\.cnf" [root@BlogSELinux1 ~]# semanage fcontext -a -t mysqld_etc_t "/etc/percona-xtradb-cluster\.conf\.d(/.*)?" [root@BlogSELinux1 ~]# restorecon -v /etc/percona-xtradb-cluster.cnf restorecon reset /etc/percona-xtradb-cluster.cnf context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0 [root@BlogSELinux1 ~]# restorecon -R -v /etc/percona-xtradb-cluster.conf.d/ restorecon reset /etc/percona-xtradb-cluster.conf.d context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0 restorecon reset /etc/percona-xtradb-cluster.conf.d/wsrep.cnf context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0 restorecon reset /etc/percona-xtradb-cluster.conf.d/mysqld.cnf context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0 restorecon reset /etc/percona-xtradb-cluster.conf.d/mysqld_safe.cnf context system_u:object_r:etc_t:s0->system_u:object_r:mysqld_etc_t:s0

Variables check list

Here is a list of all the variables you should check for paths used by MySQL

  • datadir, default is /var/lib/mysql, where MySQL stores its data
  • basedir, default is /usr, where binaries and librairies can be found
  • character_sets_dir, default is basedir/share/mysql/charsets, charsets used by MySQL
  • general_log_file, default is the datadir, where the general log is written
  • init_file, no default, sql file read and executed when the server starts
  • innodb_undo_directory, default is datadir, where InnoDB stores the undo files
  • innodb_tmpdir, default is tmpdir, where InnoDB creates temporary files
  • innodb_temp_data_file_path, default is in the datadir, where InnoDB creates the temporary tablespace
  • innodb_parallel_doublewrite_path, default is in the datadir, where InnoDB created the parallel doublewrite buffer
  • innodb_log_group_home_dir, default is the datadir, where InnoDB writes its transational log files
  • innodb_data_home_dir, default is the datadir, used a default value for the InnoDB files
  • innodb_data_file_path, default is in the datadir, path of the system tablespace
  • innodb_buffer_pool_filename, default is in the datadir, where InnoDB writes the buffer pool dump information
  • lc_messages_dir, basedir/share/mysql
  • log_bin_basename, default is the datadir, where the binlogs are stored
  • log_bin_index, default is the datadir, where the binlog index file is stored
  • log_error, no default value, where the MySQL error log is stored
  • pid-file, no default value, where the MySQL pid file is stored
  • plugin_dir, default is basedir/lib/mysql/plugin, where the MySQL plugins are stored
  • relay_log_basename, default is the datadir, where the relay logs are stored
  • relay_log_info_file, default is the datadir, may include a path
  • slave_load_tmpdir, default is tmpdir, where the slave stores files coming from LOAD DATA INTO statements.
  • slow_query_log, default is in the datadir, where the slow queries are logged
  • socket, no defaults, where the Unix socket file is created
  • ssl_*, SSL/TLS related files
  • tmpdir, default is /tmp, where temporary files are stored
  • wsrep_data_home_dir, default is the datadir, where galera stores its files
  • wsrep_provider->base_dir, default is wsrep_data_home_dir
  • wsrep_provider->gcache_dir, default is wsrep_data_home_dir, where the gcache file is stored
  • wsrep_provider->socket.ssl_*, no defaults, where the SSL/TLS related files for the Galera protocol are stored

That’s quite a long list and I may have missed some. If for any of these variables you use a non-standard path, you’ll need to adjust the context labels as we just did above.

All together

I would understand if you feel a bit lost, I am not a SELinux guru and it took me some time to understand decently how it works. Let’s recap how we can enable SELinux for PXC from what we learned in the previous sections.

1. Install the SELinux utilities

yum install policycoreutils-python.x86_64

2. Allow the TCP ports used by PXC

semanage port -a -t mysqld_port_t -p tcp 4568 semanage port -m -t mysqld_port_t -p tcp 4444 semanage port -m -t mysqld_port_t -p tcp 4567

3. Modify the SST script

Replace the wait_for_listen function in the /usr/bin/wsrep_sst_xtrabackup-v2 file by the version above. Hopefully, the next PXC release will include a SELinux friendly wait_for_listen function.

4. Set the security context labels for the configuration files

These steps seems optional but for completeness:

semanage fcontext -a -t mysqld_etc_t "/etc/percona-xtradb-cluster\.cnf" semanage fcontext -a -t mysqld_etc_t "/etc/percona-xtradb-cluster\.conf\.d(/.*)?" restorecon -v /etc/percona-xtradb-cluster.cnf restorecon -R -v /etc/percona-xtradb-cluster.conf.d/

5. Create the policy file PXC.te

Create the file PXC.te with this content:

module PXC 1.0; require { type unconfined_t; type mysqld_t; type unconfined_service_t; type tmp_t; type sysctl_net_t; type kernel_t; type mysqld_safe_t; class process { getattr setpgid }; class unix_stream_socket connectto; class system module_request; class file { getattr open read write }; class dir search; } #============= mysqld_t ============== allow mysqld_t kernel_t:system module_request; allow mysqld_t self:process { getattr setpgid }; allow mysqld_t self:unix_stream_socket connectto; allow mysqld_t sysctl_net_t:dir search; allow mysqld_t sysctl_net_t:file { getattr open read }; allow mysqld_t tmp_t:file write;

6. Compile and load the policy module

checkmodule -M -m -o PXC.mod PXC.te semodule_package -o PXC.pp -m PXC.mod semodule -i PXC.pp

7. Run for a while in Permissive mode

Set SELinux into permissive mode in /etc/sysconfig/selinux and reboot. Validate everything works fine in Permissive mode, check the audit.log for any denied messages. If there are denied messages, address them.

8. Enforce SELINUX

Last step, enforce SELinux:

setenforce 1 perl -pi -e 's/SELINUX=permissive/SELINUX=enforcing/g' /etc/sysconfig/selinux

Conclusion

As we can see, enabling SELinux with PXC is not straightforward but, once the process is understood, it is not that hard either. In an IT world where security is more than ever a major concern, enabling SELinux with PXC is a nice step forward. In an upcoming post, we’ll look at the other security framework, Apparmor.

The post Lock Down: Enforcing SELinux with Percona XtraDB Cluster appeared first on Percona Database Performance Blog.

Eight Supported by Eighth!

We would like to take this moment to thank the Oracle company for continuous development of the popular MySQL database! As you all might know, Oracle has recently released  MySQL Server, v8.0. In its turn, Devart decided to get a move on and release the new dbForge Studio for MySQL, 8.0! Support for MySQL Server […]

MySQL on Docker: Running a MariaDB Galera Cluster without Orchestration Tools - DB Container Management - Part 2

As we saw in the first part of this blog, a strongly consistent database cluster like Galera does not play well with container orchestration tools like Kubernetes or Swarm. We showed you how to deploy Galera and configure process management for Docker, so you retain full control of the behaviour.  This blog post is the continuation of that, we are going to look into operation and maintenance of the cluster.

To recap some of the main points from the part 1 of this blog, we deployed a three-node Galera cluster, with ProxySQL and Keepalived on three different Docker hosts, where all MariaDB instances run as Docker containers. The following diagram illustrates the final deployment:

Graceful Shutdown

To perform a graceful MySQL shutdown, the best way is to send SIGTERM (signal 15) to the container:

$ docker kill -s 15 {db_container_name}

If you would like to shutdown the cluster, repeat the above command on all database containers, one node at a time. The above is similar to performing "systemctl stop mysql" in systemd service for MariaDB. Using "docker stop" command is pretty risky for database service because it waits for 10 seconds timeout and Docker will force SIGKILL if this duration is exceeded (unless you use a proper --timeout value).

The last node that shuts down gracefully will have the seqno not equal to -1 and safe_to_bootstrap flag is set to 1 in the /{datadir volume}/grastate.dat of the Docker host, for example on host2:

$ cat /containers/mariadb2/datadir/grastate.dat # GALERA saved state version: 2.1 uuid: e70b7437-645f-11e8-9f44-5b204e58220b seqno: 7099 safe_to_bootstrap: 1 Detecting the Most Advanced Node

If the cluster didn't shut down gracefully, or the node that you were trying to bootstrap wasn't the last node to leave the cluster, you probably wouldn't be able to bootstrap one of the Galera node and might encounter the following error:

2016-11-07 01:49:19 5572 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .

Galera honours the node that has safe_to_bootstrap flag set to 1 as the first reference node. This is the safest way to avoid data loss and ensure the correct node always gets bootstrapped.

If you got the error, we have to find out the most advanced node first before picking up the node as the first to be bootstrapped. Create a transient container (with --rm flag), map it to the same datadir and configuration directory of the actual database container with two MySQL command flags, --wsrep_recover and --wsrep_cluster_address. For example, if we want to know mariadb1 last committed number, we need to run:

$ docker run --rm --name mariadb-recover \ --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \ --volume /containers/mariadb1/datadir:/var/lib/mysql \ --volume /containers/mariadb1/conf.d:/etc/mysql/conf.d \ mariadb:10.2.15 \ --wsrep_recover \ --wsrep_cluster_address=gcomm:// 2018-06-12 4:46:35 139993094592384 [Note] mysqld (mysqld 10.2.15-MariaDB-10.2.15+maria~jessie) starting as process 1 ... 2018-06-12 4:46:35 139993094592384 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins ... 2018-06-12 4:46:35 139993094592384 [Note] Plugin 'FEEDBACK' is disabled. 2018-06-12 4:46:35 139993094592384 [Note] Server socket created on IP: '::'. 2018-06-12 4:46:35 139993094592384 [Note] WSREP: Recovered position: e70b7437-645f-11e8-9f44-5b204e58220b:7099

The last line is what we are looking for. MariaDB prints out the cluster UUID and the sequence number of the most recently committed transaction. The node which holds the highest number is deemed as the most advanced node. Since we specified --rm, the container will be removed automatically once it exits. Repeat the above step on every Docker host by replacing the --volume path to the respective database container volumes.

Once you have compared the value reported by all database containers and decided which container is the most up-to-date node, change the safe_to_bootstrap flag to 1 inside /{datadir volume}/grastate.dat manually. Let's say all nodes are reporting the same exact sequence number, we can just pick mariadb3 to be bootstrapped by changing the safe_to_bootstrap value to 1:

$ vim /containers/mariadb3/datadir/grasate.dat ... safe_to_bootstrap: 1

Save the file and start bootstrapping the cluster from that node, as described in the next chapter.

Bootstrapping the Cluster

Bootstrapping the cluster is similar to the first docker run command we used when starting up the cluster for the first time. If mariadb1 is the chosen bootstrap node, we can simply re-run the created bootstrap container:

$ docker start mariadb0 # on host1

Otherwise, if the bootstrap container does not exist on the chosen node, let's say on host2, run the bootstrap container command and map the existing mariadb2's volumes. We are using mariadb0 as the container name on host2 to indicate it is a bootstrap container:

$ docker run -d \ --name mariadb0 \ --hostname mariadb0.weave.local \ --net weave \ --publish "3306" \ --publish "4444" \ --publish "4567" \ --publish "4568" \ $(weave dns-args) \ --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \ --volume /containers/mariadb2/datadir:/var/lib/mysql \ --volume /containers/mariadb2/conf.d:/etc/mysql/mariadb.conf.d \ mariadb:10.2.15 \ --wsrep_cluster_address=gcomm:// \ --wsrep_sst_auth="root:PM7%cB43$sd@^1" \ --wsrep_node_address=mariadb0.weave.local

You may notice that this command is slightly shorter as compared to the previous bootstrap command described in this guide. Since we already have the proxysql user created in our first bootstrap command, we may skip these two environment variables:

  • --env MYSQL_USER=proxysql
  • --env MYSQL_PASSWORD=proxysqlpassword

Then, start the remaining MariaDB containers, remove the bootstrap container and start the existing MariaDB container on the bootstrapped host. Basically the order of commands would be:

$ docker start mariadb1 # on host1 $ docker start mariadb3 # on host3 $ docker stop mariadb0 # on host2 $ docker start mariadb2 # on host2

At this point, the cluster is started and is running at full capacity.

Resource Control

Memory is a very important resource in MySQL. This is where the buffers and caches are stored, and it's critical for MySQL to reduce the impact of hitting the disk too often. On the other hand, swapping is bad for MySQL performance. By default, there will be no resource constraints on the running containers. Containers use as much of a given resource as the host’s kernel will allow. Another important thing is file descriptor limit. You can increase the limit of open file descriptor, or "nofile" to something higher to cater for the number of files MySQL server can open simultaneously. Setting this to a high value won't hurt.

To cap memory allocation and increase the file descriptor limit to our database container, one would append --memory, --memory-swap and --ulimit parameters into the "docker run" command:

$ docker kill -s 15 mariadb1 $ docker rm -f mariadb1 $ docker run -d \ --name mariadb1 \ --hostname mariadb1.weave.local \ --net weave \ --publish "3306:3306" \ --publish "4444" \ --publish "4567" \ --publish "4568" \ $(weave dns-args) \ --memory 16g \ --memory-swap 16g \ --ulimit nofile:16000:16000 \ --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \ --volume /containers/mariadb1/datadir:/var/lib/mysql \ --volume /containers/mariadb1/conf.d:/etc/mysql/mariadb.conf.d \ mariadb:10.2.15 \ --wsrep_cluster_address=gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local \ --wsrep_sst_auth="root:PM7%cB43$sd@^1" \ --wsrep_node_address=mariadb1.weave.local

Take note that if --memory-swap is set to the same value as --memory, and --memory is set to a positive integer, the container will not have access to swap. If --memory-swap is not set, container swap will default to --memory multiply by 2. If --memory and --memory-swap are set to the same value, this will prevent containers from using any swap. This is because --memory-swap is the amount of combined memory and swap that can be used, while --memory is only the amount of physical memory that can be used.

Some of the container resources like memory and CPU can be controlled dynamically through "docker update" command, as shown in the following example to upgrade the memory of container mariadb1 to 32G on-the-fly:

$ docker update \ --memory 32g \ --memory-swap 32g \ mariadb1

Do not forget to tune the my.cnf accordingly to suit the new specs. Configuration management is explained in the next section.

Configuration Management

Most of the MySQL/MariaDB configuration parameters can be changed during runtime, which means you don't need to restart to apply the changes. Check out the MariaDB documentation page for details. The parameter listed with "Dynamic: Yes" means the variable is loaded immediately upon changing without the necessity to restart MariaDB server. Otherwise, set the parameters inside the custom configuration file in the Docker host. For example, on mariadb3, make the changes to the following file:

$ vim /containers/mariadb3/conf.d/my.cnf

And then restart the database container to apply the change:

$ docker restart mariadb3

Verify the container starts up the process by looking at the docker logs. Perform this operation on one node at a time if you would like to make cluster-wide changes.

Backup

Taking a logical backup is pretty straightforward because the MariaDB image also comes with mysqldump binary. You simply use the "docker exec" command to run the mysqldump and send the output to a file relative to the host path. The following command performs mysqldump backup on mariadb2 and saves it to /backups/mariadb2 inside host2:

$ docker exec -it mariadb2 mysqldump -uroot -p --single-transaction > /backups/mariadb2/dump.sql

Binary backup like Percona Xtrabackup or MariaDB Backup requires the process to access the MariaDB data directory directly. You have to either install this tool inside the container, or through the machine host or use a dedicated image for this purpose like "perconalab/percona-xtrabackup" image to create the backup and stored it inside /tmp/backup on the Docker host:

$ docker run --rm -it \ -v /containers/mariadb2/datadir:/var/lib/mysql \ -v /tmp/backup:/xtrabackup_backupfiles \ perconalab/percona-xtrabackup \ --backup --host=mariadb2 --user=root --password=mypassword

You can also stop the container with innodb_fast_shutdown set to 0 and copy over the datadir volume to another location in the physical host:

$ docker exec -it mariadb2 mysql -uroot -p -e 'SET GLOBAL innodb_fast_shutdown = 0' $ docker kill -s 15 mariadb2 $ cp -Rf /containers/mariadb2/datadir /backups/mariadb2/datadir_copied $ docker start mariadb2 Restore

Restoring is pretty straightforward for mysqldump. You can simply redirect the stdin into the container from the physical host:

$ docker exec -it mariadb2 mysql -uroot -p < /backups/mariadb2/dump.sql

You can also use the standard mysql client command line remotely with proper hostname and port value instead of using this "docker exec" command:

$ mysql -uroot -p -h127.0.0.1 -P3306 < /backups/mariadb2/dump.sql

For Percona Xtrabackup and MariaDB Backup, we have to prepare the backup beforehand. This will roll forward the backup to the time when the backup was finished. Let's say our Xtrabackup files are located under /tmp/backup of the Docker host, to prepare it, simply:

$ docker run --rm -it \ -v mysql-datadir:/var/lib/mysql \ -v /tmp/backup:/xtrabackup_backupfiles \ perconalab/percona-xtrabackup \ --prepare --target-dir /xtrabackup_backupfiles

The prepared backup under /tmp/backup of the Docker host then can be used as the MariaDB datadir for a new container or cluster. Let's say we just want to verify restoration on a standalone MariaDB container, we would run:

$ docker run -d \ --name mariadb-restored \ --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \ -v /tmp/backup:/var/lib/mysql \ mariadb:10.2.15

If you performed a backup using stop and copy approach, you can simply duplicate the datadir and use the duplicated directory as a volume maps to MariaDB datadir to run on another container. Let's say the backup was copied over under /backups/mariadb2/datadir_copied, we can run a new container by running:

$ mkdir -p /containers/mariadb-restored/datadir $ cp -Rf /backups/mariadb2/datadir_copied /containers/mariadb-restored/datadir $ docker run -d \ --name mariadb-restored \ --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \ -v /containers/mariadb-restored/datadir:/var/lib/mysql \ mariadb:10.2.15

The MYSQL_ROOT_PASSWORD must match the actual root password for that particular backup.

Severalnines   MySQL on Docker: How to Containerize Your Database Discover all you need to understand when considering to run a MySQL service on top of Docker container virtualization Download the White[]aper Database Version Upgrade

There are two types of upgrade - in-place upgrade or logical upgrade.

In-place upgrade involves shutting down the MariaDB server, replacing the old binaries with the new binaries and then starting the server on the old data directory. Once started, you have to run mysql_upgrade script to check and upgrade all system tables and also to check the user tables.

The logical upgrade involves exporting SQL from the current version using a logical backup utility such as mysqldump, running the new container with the upgraded version binaries, and then applying the SQL to the new MySQL/MariaDB version. It is similar to backup and restore approach described in the previous section.

Nevertheless, it's a good approach to always backup your database before performing any destructive operations. The following steps are required when upgrading from the current image, MariaDB 10.1.33 to another major version, MariaDB 10.2.15 on mariadb3 resides on host3:

  1. Backup the database. It doesn't matter physical or logical backup but the latter using mysqldump is recommended.

  2. Download the latest image that we would like to upgrade to:

    $ docker pull mariadb:10.2.15
  3. Set innodb_fast_shutdown to 0 for our database container:

    $ docker exec -it mariadb3 mysql -uroot -p -e 'SET GLOBAL innodb_fast_shutdown = 0'
  4. Graceful shut down the database container:

    $ docker kill --signal=TERM mariadb3
  5. Create a new container with the new image for our database container. Keep the rest of the parameters intact except using the new container name (otherwise it would conflict):

    $ docker run -d \ --name mariadb3-new \ --hostname mariadb3.weave.local \ --net weave \ --publish "3306:3306" \ --publish "4444" \ --publish "4567" \ --publish "4568" \ $(weave dns-args) \ --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \ --volume /containers/mariadb3/datadir:/var/lib/mysql \ --volume /containers/mariadb3/conf.d:/etc/mysql/mariadb.conf.d \ mariadb:10.2.15 \ --wsrep_cluster_address=gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local \ --wsrep_sst_auth="root:PM7%cB43$sd@^1" \ --wsrep_node_address=mariadb3.weave.local
  6. Run mysql_upgrade script:

    $ docker exec -it mariadb3-new mysql_upgrade -uroot -p
  7. If no errors occurred, remove the old container, mariadb3 (the new one is mariadb3-new):

    $ docker rm -f mariadb3
  8. Otherwise, if the upgrade process fails in between, we can fall back to the previous container:

    $ docker stop mariadb3-new $ docker start mariadb3

Major version upgrade can be performed similarly to the minor version upgrade, except you have to keep in mind that MySQL/MariaDB only supports major upgrade from the previous version. If you are on MariaDB 10.0 and would like to upgrade to 10.2, you have to upgrade to MariaDB 10.1 first, followed by another upgrade step to MariaDB 10.2.

Take note on the configuration changes being introduced and deprecated between major versions.

Failover

In Galera, all nodes are masters and hold the same role. With ProxySQL in the picture, connections that pass through this gateway will be failed over automatically as long as there is a primary component running for Galera Cluster (that is, a majority of nodes are up). The application won't notice any difference if one database node goes down because ProxySQL will simply redirect the connections to the other available nodes.

If the application connects directly to the MariaDB bypassing ProxySQL, failover has to be performed on the application-side by pointing to the next available node, provided the database node meets the following conditions:

  • Status wsrep_local_state_comment is Synced (The state "Desynced/Donor" is also possible, only if wsrep_sst_method is xtrabackup, xtrabackup-v2 or mariabackup).
  • Status wsrep_cluster_status is Primary.

In Galera, an available node doesn't mean it's healthy until the above status are verified.

Scaling Out

To scale out, we can create a new container in the same network and use the same custom configuration file for the existing container on that particular host. For example, let's say we want to add the fourth MariaDB container on host3, we can use the same configuration file mounted for mariadb3, as illustrated in the following diagram:

Run the following command on host3 to scale out:

$ docker run -d \ --name mariadb4 \ --hostname mariadb4.weave.local \ --net weave \ --publish "3306:3307" \ --publish "4444" \ --publish "4567" \ --publish "4568" \ $(weave dns-args) \ --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \ --volume /containers/mariadb4/datadir:/var/lib/mysql \ --volume /containers/mariadb3/conf.d:/etc/mysql/mariadb.conf.d \ mariadb:10.2.15 \ --wsrep_cluster_address=gcomm://mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local,mariadb4.weave.local \ --wsrep_sst_auth="root:PM7%cB43$sd@^1" \ --wsrep_node_address=mariadb4.weave.local

Once the container is created, it will join the cluster and perform SST. It can be accessed on port 3307 externally or outside of the Weave network, or port 3306 within the host or within the Weave network. It's not necessary to include mariadb0.weave.local into the cluster address anymore. Once the cluster is scaled out, we need to add the new MariaDB container into the ProxySQL load balancing set via admin console:

$ docker exec -it proxysql1 mysql -uadmin -padmin -P6032 mysql> INSERT INTO mysql_servers(hostgroup_id,hostname,port) VALUES (10,'mariadb4.weave.local',3306); mysql> INSERT INTO mysql_servers(hostgroup_id,hostname,port) VALUES (20,'mariadb4.weave.local',3306); mysql> LOAD MYSQL SERVERS TO RUNTIME; mysql> SAVE MYSQL SERVERS TO DISK;

Repeat the above commands on the second ProxySQL instance.

Finally for the the last step, (you may skip this part if you already ran "SAVE .. TO DISK" statement in ProxySQL), add the following line into proxysql.cnf to make it persistent across container restart on host1 and host2:

$ vim /containers/proxysql1/proxysql.cnf # host1 $ vim /containers/proxysql2/proxysql.cnf # host2

And append mariadb4 related lines under mysql_server directive:

mysql_servers = ( { address="mariadb1.weave.local" , port=3306 , hostgroup=10, max_connections=100 }, { address="mariadb2.weave.local" , port=3306 , hostgroup=10, max_connections=100 }, { address="mariadb3.weave.local" , port=3306 , hostgroup=10, max_connections=100 }, { address="mariadb4.weave.local" , port=3306 , hostgroup=10, max_connections=100 }, { address="mariadb1.weave.local" , port=3306 , hostgroup=20, max_connections=100 }, { address="mariadb2.weave.local" , port=3306 , hostgroup=20, max_connections=100 }, { address="mariadb3.weave.local" , port=3306 , hostgroup=20, max_connections=100 }, { address="mariadb4.weave.local" , port=3306 , hostgroup=20, max_connections=100 } )

Save the file and we should be good on the next container restart.

Scaling Down

To scale down, simply shuts down the container gracefully. The best command would be:

$ docker kill -s 15 mariadb4 $ docker rm -f mariadb4

Remember, if the database node left the cluster ungracefully, it was not part of scaling down and would affect the quorum calculation.

To remove the container from ProxySQL, run the following commands on both ProxySQL containers. For example, on proxysql1:

$ docker exec -it proxysql1 mysql -uadmin -padmin -P6032 mysql> DELETE FROM mysql_servers WHERE hostname="mariadb4.weave.local"; mysql> LOAD MYSQL SERVERS TO RUNTIME; mysql> SAVE MYSQL SERVERS TO DISK; Related resources  MySQL on Docker: Running a MariaDB Galera Cluster without Container Orchestration Tools - Part 1  MySQL on Docker - How to Containerize Your Database  MySQL on Docker: Multi-Host Networking for MySQL Containers (Part 2 - Calico)

You can then either remove the corresponding entry inside proxysql.cnf or just leave it like that. It will be detected as OFFLINE from ProxySQL point-of-view anyway.

Summary

With Docker, things get a bit different from the conventional way on handling MySQL or MariaDB servers. Handling stateful services like Galera Cluster is not as easy as stateless applications, and requires proper testing and planning.

In our next blog on this topic, we will evaluate the pros and cons of running Galera Cluster on Docker without any orchestration tools.

Tags:  MySQL MariaDB galera docker container proxysql management

Dynamic Replication Channel Specific Filtering for uses like InnoDB Cluster

There are numerous ways to filter your replication stream in MySQL.  So many, that the documentation team has created guided “walk-through” pages for considering the various complex DB Filtering and subsequent Table Filtering paths folks have used for the “global” filtering options.  But now there are “channel specific” options that can be implemented at startup with enhanced… Read More »

MySQL Connector/NET 8.0.11 is available as official MySQL NuGet packages

Dear MySQL Connector/NET users,

We are proud to announce that the open source version of Connector/NET 8.0.11 is also available as official MySQL NuGet packages. Connector/NET 8.0.11 is a GA release that can be used in production environments. Additionally, users with an older version can safely upgrade if preferred.

 

What this means for Connector/NET users?

Installing and upgrading packages couldn’t be easier, plus you need only to install the packages that fit your needs. Also, it’s even easier and more straight forward to develop for MySQL from Visual Studio.

NuGet is the package manager for the Microsoft development platform including .NET. The NuGet client tools provide the ability to produce and consume packages. The NuGet Gallery is the central software package repository populated by various package authors and tied directly to developers for ease of installation and development. For more information on NuGet see: https://github.com/nuget/home

Connector/NET NuGet Packages

Currently there are 5 Connector/NET NuGet packages:

  • MySQL Connector/NET Data
    • Provides the core functionality of Connector/NET including using MySQL as a Document Store through the X DevAPI. It implements the required ADO.NET interfaces and integrates into ADO.NET-aware tools. Provides access to multiple versions of MySQL Server and encapsulates database-specific protocols.
    • Supports .NET Framework 4.5.2, .NET Standard 1.6 & 2.0.
    • Refer to the official documentation for additional details.
    • Refer to the X DevAPI User Guide for details on using MySQL as a Document Store with Connector/NET.
  • MySQL Connector/NET Web providers
    • Includes support for the ASP.NET 2.0 provider model. This model enables application developers to focus on the business logic of their application instead of having to recreate boilerplate items such as membership and roles support.
    • Supported providers:
      • Membership
      • Role
      • Profile
      • Session State
    • Supports .NET Framework 4.5.2.
    • Refer to the ASP.NET Provider Model section of the official documentation for additional details.
  • MySQL Connector/NET for Entity Framework 6
    • Provides ORM capabilities enabling developers to work with MySQL databases using domain-specific objects, thus eliminating the need for most of the data access code.
    • Supports .NET Framework 4.5.2.
    • Refer to the Entity Framework 6 Support section of the official documentation for additional details.
  • MySQL Connector/NET for Entity Framework Core
    • Provides multi-platform support for Entity Framework tasks.
    • Supports .NET Framework 4.5.2, .NET Standard 1.6 & 2.0.
    • Refer to the Entity Framework Core Support section of the official documentation for additional details.
  • MySQL Connector/NET for Entity Framework Core Design
    • Provides shared design-time components for Entity Framework Core tools allowing the scaffolding and migration of databases.
    • Supports .NET Framework 4.5.2, .NET Standard 1.6.
    • Refer to the Entity Framework Core Support section of the official documentation for additional details.

In order to install the NuGet Package you can use the Package Manager Console as follows:

PM> Install-Package MySql.Data.EntityFrameworkCore

PM> Install-Package MySql.Data.Web

Or via Visual Studio’s NuGet Package Manager UI: http://docs.nuget.org/docs/start-here/Managing-NuGet-Packages-Using-The-Dialog

Note that packages for Web Providers and Entity Framework have a dependency with the core package MySql.Data, meaning their installation will also result in the installation of the MySql.Data package. Installation through the standalone MSI is recommended whenever you want to make use of all the functionality provided by Connector/NET E.g. availability in the GAC.

For more information or to go directly to our NuGet Packages review the following links:

For those of you interested in how to use Connector/NET with .NET Core you can refer to the MySQL Connector/NET for .NET Core 1.0 blog post.

If you happen to have any questions about this or any other Connector/NET related topic, you can always post them in the product’s main forum. Additionally, bugs can be reported in MySQL Bugs.

Connector/NET Team

Pages