Planet MySQL

Benchmarking the Read Backup feature in the NDB storage engine

Read Backup was a new feature in MySQL Cluster 7.5. When MySQL
Cluster 7.5 was released I was already busily engaged in working
on the partial LCP feature that we have now released in 7.6, so I
did not have much time to produce benchmarks showing the impact of
the Read Backup feature.

Read Backup means that committed reads in NDB can use the backup
replicas as well. In NDB tables, reads are otherwise directed towards
the primary replica. The reason is that MySQL Cluster wants to
ensure that applications can trust that a reader sees its own
updates. Many modern NoSQL DBMSs lack this guarantee since they use
eventually consistent replication and very flexible scheduling of which
replicas to read. NDB provides a stronger consistency guarantee
in that all applications can see their own updates and replication
is synchronous.

The reason that a read from a backup replica can fail to see its own
changes in NDB is that we release the locks on the primary replica first,
next we deliver the committed message to the application, and last
we release the locks on the backup replicas. This means that a
committed read (which reads the latest committed row without taking
locks) can only be guaranteed to see its own updates if it reads the
primary replica.

With tables that have the Read Backup feature we will delay the
sending of the committed message to the application until all rows
have been unlocked. This means that we can safely read any replica
for those tables and still see our own updates.

The Read Backup feature can be set for a table through special syntax
in the COMMENT clause of the CREATE TABLE statement. A much easier and
likely more useful approach is to set the ndb_read_backup configuration
variable in the MySQL Server to 1. This means that all tables created
through this MySQL Server will have the Read Backup feature. There is a
similar configuration variable, ndb_fully_replicated, for ensuring that
all tables are created with the fully replicated feature. In MySQL
Cluster 7.6 neither of these configuration variables is enabled by
default. For SQL applications it is a good idea to always enable the
Read Backup feature, and for applications that focus on read scalability
with a fairly limited data size, the fully replicated feature can also
be enabled.

Fully replicated tables have a replica in each data node and any replica
can be read for committed reads.
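
As a minimal sketch of the two approaches (assuming the NDB_TABLE comment syntax available in MySQL Cluster 7.5 and the ndb_read_backup / ndb_fully_replicated server variables mentioned above; the table name and file paths are just placeholders):

# Per-table: request Read Backup through the NDB_TABLE comment syntax.
mysql -e "CREATE TABLE test.t1 (id INT PRIMARY KEY, v VARCHAR(32)) ENGINE=NDBCLUSTER COMMENT='NDB_TABLE=READ_BACKUP=1';"

# Per-server: make all tables created by this MySQL Server use Read Backup
# (and optionally full replication), then restart mysqld.
cat >> /etc/my.cnf << 'EOF'
[mysqld]
ndb_read_backup=1
# ndb_fully_replicated=1   # only for read-mostly data of fairly limited size
EOF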

The benchmark we present here was performed in a setup with the optimal
network architecture: two machines where the MySQL Server and the
NDB data node are colocated and the nodes are connected by an
InfiniBand network. This means that in almost every real case the
impact of using Read Backup in a colocated scenario is even bigger.

In this scenario the extra delay of going over the network is fairly small,
so the impact at low concurrency is fairly small, but the extra overhead
of going over the network a lot shows its impact at higher concurrency and
grows all the way up to 30%.

MySQL Cluster 7.6 and the thread pool

Looking at the graphs in the previous blog post one can see that
MySQL Cluster 7.6 using the shared memory transporter can improve
performance at very high thread counts by more than 100%. Still,
the performance drops fairly significantly when moving from
512 to 1536 threads. The MySQL Server using the NDB transporter
scales very well on all sorts of architectures and with very many
cores, but I have noted that when the number of connections goes
beyond some limit (in my benchmarks usually around 512 threads),
the performance starts to drop.

The commercial version of MySQL Cluster offers help to resolve this
problem. The thread pool was developed by me and a team of performance
experts to ensure that MySQL using InnoDB delivers the same performance
even with massive numbers of threads hitting the MySQL Server, and it
still works for this purpose. I have never mentioned the use of the
thread pool with MySQL Cluster before, but the fact of the matter
is that it works perfectly fine to use the thread pool in combination
with MySQL Cluster.

There is one limitation in the current thread pool implementation: the maximum
number of thread groups is 64. This limit was set since MySQL with InnoDB
didn't scale beyond this number in those days. NDB is a distributed engine,
so it works a bit differently from InnoDB. It would be possible to make the
integration of the thread pool and NDB a bit tighter, but even with the
current implementation NDB can be used perfectly fine with the thread pool.

The limit of 64 means that it won't really be so useful to combine the thread
pool and NDB on MySQL servers that use more than 16 CPUs.
The graph above shows how the thread pool compares to the performance of
MySQL Cluster 7.6 on a small MySQL Server. It loses 1-2% at low thread
counts, but it continues to deliver good results even when passing the 512 thread limit.

The graph below shows how MySQL Cluster 7.6.6 using the thread pool compares to
MySQL Cluster 7.5.9 without the thread pool. We see the usual linear curve at
high concurrency for the thread pool; in this case, however, it is limited by
the 64 thread groups since the setup has access to 30 CPUs using
one cluster connection. I did some experiments where I moved the limit of 64
up a bit (a minor change). Most experiments perform well with the thread pool
size set to 128, and in this case the performance actually increases a small
bit as the number of threads increases.
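
As a sketch of how the thread pool might be enabled in such a setup (this refers to the commercial MySQL Enterprise Thread Pool plugin; the library name thread_pool.so is an assumption that may differ by platform and version, while thread_pool_size is the thread group setting discussed above):

# Append a thread pool section to my.cnf and restart mysqld.
cat >> /etc/my.cnf << 'EOF'
[mysqld]
plugin-load-add=thread_pool.so
thread_pool_size=64
EOF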


MySQL on Docker: Running a MariaDB Galera Cluster without Container Orchestration Tools - Part 1

Container orchestration tools simplify the running of a distributed system, by deploying and redeploying containers and handling any failures that occur. One might need to move applications around, e.g., to handle updates, scaling, or underlying host failures. While this sounds great, it does not always work well with a strongly consistent database cluster like Galera. You can’t just move database nodes around, they are not stateless applications. Also, the order in which you perform operations on a cluster has high significance. For instance, restarting a Galera cluster has to start from the most advanced node, or else you will lose data. Therefore, we’ll show you how to run Galera Cluster on Docker without a container orchestration tool, so you have total control.


In this blog post, we are going to look into how to run a MariaDB Galera Cluster on Docker containers using the standard Docker image on multiple Docker hosts, without the help of orchestration tools like Swarm or Kubernetes. This approach is similar to running a Galera Cluster on standard hosts, but the process management is configured through Docker.

Before we jump further into details, we assume you have installed Docker, disabled SElinux/AppArmor and cleared up the rules inside iptables, firewalld or ufw (whichever you are using). The following are three dedicated Docker hosts for our database cluster:

  • host1.local - 192.168.55.161
  • host2.local - 192.168.55.162
  • host3.local - 192.168.55.163
Multi-host Networking

First of all, the default Docker networking is bound to the local host. Docker Swarm introduces another networking layer called overlay network, which extends the container internetworking to multiple Docker hosts in a cluster called Swarm. Long before this integration came into place, there were many network plugins developed to support this - Flannel, Calico, Weave are some of them.

Here, we are going to use Weave as the Docker network plugin for multi-host networking. This is mainly because it is simple to install and run, and because it supports DNS resolution (containers running on this network can resolve each other's hostnames). There are two ways to get Weave running - via systemd or through Docker. We are going to install it as a systemd unit, so it's independent of the Docker daemon (otherwise, we would have to start Docker before Weave gets activated).

  1. Download and install Weave:

    $ curl -L git.io/weave -o /usr/local/bin/weave
    $ chmod a+x /usr/local/bin/weave
  2. Create a systemd unit file for Weave:

    $ cat > /etc/systemd/system/weave.service << EOF
    [Unit]
    Description=Weave Network
    Documentation=http://docs.weave.works/weave/latest_release/
    Requires=docker.service
    After=docker.service
    [Service]
    EnvironmentFile=-/etc/sysconfig/weave
    ExecStartPre=/usr/local/bin/weave launch --no-restart $PEERS
    ExecStart=/usr/bin/docker attach weave
    ExecStop=/usr/local/bin/weave stop
    [Install]
    WantedBy=multi-user.target
    EOF
  3. Define IP addresses or hostname of the peers inside /etc/sysconfig/weave:

    $ echo 'PEERS="192.168.55.161 192.168.55.162 192.168.55.163"' > /etc/sysconfig/weave
  4. Start and enable Weave on boot:

    $ systemctl start weave
    $ systemctl enable weave

Repeat the above 4 steps on all Docker hosts. Verify with the following command once done:

$ weave status

The number of peers is what we are looking for. It should be 3:

...
Peers: 3 (with 6 established connections)
...

Running a Galera Cluster

Now that the network is ready, it's time to fire up our database containers and form a cluster. The basic rules are:

  • Container must be created under --net=weave to have multi-host connectivity.
  • Container ports that need to be published are 3306, 4444, 4567, 4568.
  • The Docker image must support Galera. If you'd like to use Oracle MySQL, then get the Codership version. If you'd like Percona's, use this image instead. In this blog post, we are using MariaDB's.

The reasons we chose MariaDB as the Galera cluster vendor are:

  • Galera is embedded into MariaDB, starting from MariaDB 10.1.
  • The MariaDB image is maintained by the Docker and MariaDB teams.
  • One of the most popular Docker images out there.

Bootstrapping a Galera Cluster has to be performed in sequence. Firstly, the most up-to-date node must be started with "wsrep_cluster_address=gcomm://". Then, start the remaining nodes with a full address consisting of all nodes in the cluster, e.g., "wsrep_cluster_address=gcomm://node1,node2,node3". To accomplish these steps using containers, we have to take some extra steps to ensure all containers are running homogeneously. So the plan is:

  1. We would need to start with 4 containers in this order - mariadb0 (bootstrap), mariadb2, mariadb3, mariadb1.
  2. Container mariadb0 will be using the same datadir and configdir as mariadb1.
  3. Use mariadb0 on host1 for the first bootstrap, then start mariadb2 on host2, mariadb3 on host3.
  4. Remove mariadb0 on host1 to give way for mariadb1.
  5. Lastly, start mariadb1 on host1.

At the end of the day, you would have a three-node Galera Cluster (mariadb1, mariadb2, mariadb3). The first container (mariadb0) is a transient container for bootstrapping purposes only, using cluster address "gcomm://". It shares the same datadir and configdir with mariadb1 and will be removed once the cluster is formed (mariadb2 and mariadb3 are up) and nodes are synced.

By default, Galera is turned off in MariaDB and needs to be enabled with a flag called wsrep_on (set to ON) and wsrep_provider (set to the Galera library path) plus a number of Galera-related parameters. Thus, we need to define a custom configuration file for the container to configure Galera correctly.

Let's start with the first container, mariadb0. Create a file under /containers/mariadb0/conf.d/my.cnf and add the following lines:

$ mkdir -p /containers/mariadb0/conf.d
$ cat /containers/mariadb0/conf.d/my.cnf
[mysqld]
default_storage_engine = InnoDB
binlog_format = ROW
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1
innodb_autoinc_lock_mode = 2
innodb_lock_schedule_algorithm = FCFS # MariaDB >10.1.19 and >10.2.3 only
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_sst_method = xtrabackup-v2

Since the image doesn't come with MariaDB Backup (which is the preferred SST method for MariaDB 10.1 and MariaDB 10.2), we are going to stick with xtrabackup-v2 for the time being.

To perform the first bootstrap for the cluster, run the bootstrap container (mariadb0) on host1:

$ docker run -d \
        --name mariadb0 \
        --hostname mariadb0.weave.local \
        --net weave \
        --publish "3306" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --env MYSQL_USER=proxysql \
        --env MYSQL_PASSWORD=proxysqlpassword \
        --volume /containers/mariadb1/datadir:/var/lib/mysql \
        --volume /containers/mariadb1/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm:// \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=mariadb0.weave.local

The parameters used in the above command are:

  • --name, creates the container named "mariadb0",
  • --hostname, assigns the container a hostname "mariadb0.weave.local",
  • --net, places the container in the weave network for multi-host networking support,
  • --publish, exposes ports 3306, 4444, 4567, 4568 on the container to the host,
  • $(weave dns-args), configures DNS resolver for this container. This command can be translated into Docker run as "--dns=172.17.0.1 --dns-search=weave.local.",
  • --env MYSQL_ROOT_PASSWORD, the MySQL root password,
  • --env MYSQL_USER, creates "proxysql" user to be used later with ProxySQL for database routing,
  • --env MYSQL_PASSWORD, the "proxysql" user password,
  • --volume /containers/mariadb1/datadir:/var/lib/mysql, creates /containers/mariadb1/datadir if it does not exist and maps it to /var/lib/mysql (the MySQL datadir) of the container (for the bootstrap node, this could be skipped),
  • --volume /containers/mariadb1/conf.d:/etc/mysql/mariadb.conf.d, mounts the files under directory /containers/mariadb1/conf.d of the Docker host, into the container at /etc/mysql/mariadb.conf.d.
  • mariadb:10.2.15, uses the MariaDB 10.2.15 image from Docker Hub,
  • --wsrep_cluster_address, Galera connection string for the cluster. "gcomm://" means bootstrap. For the rest of the containers, we are going to use a full address instead.
  • --wsrep_sst_auth, authentication string for SST user. Use the same user as root,
  • --wsrep_node_address, the node hostname, in this case we are going to use the FQDN provided by Weave.

The bootstrap container contains several key things:

  • The name, hostname and wsrep_node_address are mariadb0, but it uses the volumes of mariadb1.
  • The cluster address is "gcomm://"
  • There are two additional --env parameters - MYSQL_USER and MYSQL_PASSWORD. These parameters will create an additional user for our ProxySQL monitoring purposes.

Verify with the following command:

$ docker ps
$ docker logs -f mariadb0

Once you see the following line, it indicates the bootstrap process is completed and Galera is active:

2018-05-30 23:19:30 139816524539648 [Note] WSREP: Synchronized with group, ready for connections

Create the directory to hold our custom configuration file on the remaining hosts:

$ mkdir -p /containers/mariadb2/conf.d # on host2
$ mkdir -p /containers/mariadb3/conf.d # on host3

Then, copy the my.cnf that we've created for mariadb0 and mariadb1 to mariadb2 and mariadb3 respectively:

$ scp /containers/mariadb1/conf.d/my.cnf /containers/mariadb2/conf.d/ # on host1
$ scp /containers/mariadb1/conf.d/my.cnf /containers/mariadb3/conf.d/ # on host1

Next, create another 2 database containers (mariadb2 and mariadb3) on host2 and host3 respectively:

$ docker run -d \
        --name ${NAME} \
        --hostname ${NAME}.weave.local \
        --net weave \
        --publish "3306:3306" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/${NAME}/datadir:/var/lib/mysql \
        --volume /containers/${NAME}/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=${NAME}.weave.local

** Replace ${NAME} with mariadb2 or mariadb3 respectively.

However, there is a catch. The entrypoint script checks the mysqld service in the background after database initialization by using the MySQL root user without a password. Since Galera automatically performs synchronization through SST or IST when starting up, the MySQL root user password will change, mirroring the bootstrapped node. Thus, you would see the following error during the first start up:

2018-05-30 23:27:13 140003794790144 [Warning] Access denied for user 'root'@'localhost' (using password: NO)
MySQL init process in progress…
MySQL init process failed.

The trick is to restart the failed containers once more, because this time, the MySQL datadir would have been created (in the first run attempt) and it would skip the database initialization part:

$ docker start mariadb2 # on host2
$ docker start mariadb3 # on host3

Once started, verify by looking at the following line:

$ docker logs -f mariadb2
…
2018-05-30 23:28:39 139808069601024 [Note] WSREP: Synchronized with group, ready for connections

At this point, there are 3 containers running: mariadb0, mariadb2 and mariadb3. Take note that mariadb0 is started using the bootstrap command (gcomm://), which means that if the container is automatically restarted by Docker in the future, it could potentially become disjointed from the primary component. Thus, we need to remove this container and replace it with mariadb1, using the same Galera connection string as the rest and the same datadir and configdir as mariadb0.

First, stop mariadb0 by sending SIGTERM (to ensure the node is shut down gracefully):

$ docker kill -s 15 mariadb0

Then, start mariadb1 on host1 using a similar command to mariadb2 or mariadb3:

$ docker run -d \
        --name mariadb1 \
        --hostname mariadb1.weave.local \
        --net weave \
        --publish "3306:3306" \
        --publish "4444" \
        --publish "4567" \
        --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/mariadb1/datadir:/var/lib/mysql \
        --volume /containers/mariadb1/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=mariadb1.weave.local

This time, you don't need to do the restart trick because the MySQL datadir already exists (created by mariadb0). Once the container is started, verify that the cluster size is 3, the cluster status is Primary and the local state is Synced:

$ docker exec -it mariadb3 mysql -uroot "-pPM7%cB43$sd@^1" -e 'select variable_name, variable_value from information_schema.global_status where variable_name in ("wsrep_cluster_size", "wsrep_local_state_comment", "wsrep_cluster_status", "wsrep_incoming_addresses")'
+---------------------------+-------------------------------------------------------------------------------+
| variable_name             | variable_value                                                                 |
+---------------------------+-------------------------------------------------------------------------------+
| WSREP_CLUSTER_SIZE        | 3                                                                              |
| WSREP_CLUSTER_STATUS      | Primary                                                                        |
| WSREP_INCOMING_ADDRESSES  | mariadb1.weave.local:3306,mariadb3.weave.local:3306,mariadb2.weave.local:3306  |
| WSREP_LOCAL_STATE_COMMENT | Synced                                                                         |
+---------------------------+-------------------------------------------------------------------------------+

At this point, our architecture is looking something like this:

Although the run command is pretty long, it describes the container's characteristics well. It's probably a good idea to wrap the command in a script to simplify the execution steps, or use a compose file instead, as sketched below.
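
For example, a minimal wrapper script could look like this (a sketch built from the docker run command above; the script name and variable layout are hypothetical, and the password handling simply follows the commands used in this post):

#!/bin/bash
# run_galera_node.sh - start one Galera container on the current host.
NAME=${1:?usage: $0 <container-name, e.g. mariadb2>}
CLUSTER="gcomm://mariadb0.weave.local,mariadb1.weave.local,mariadb2.weave.local,mariadb3.weave.local"

docker run -d \
        --name ${NAME} \
        --hostname ${NAME}.weave.local \
        --net weave \
        --publish "3306:3306" --publish "4444" --publish "4567" --publish "4568" \
        $(weave dns-args) \
        --env MYSQL_ROOT_PASSWORD="PM7%cB43$sd@^1" \
        --volume /containers/${NAME}/datadir:/var/lib/mysql \
        --volume /containers/${NAME}/conf.d:/etc/mysql/mariadb.conf.d \
        mariadb:10.2.15 \
        --wsrep_cluster_address=${CLUSTER} \
        --wsrep_sst_auth="root:PM7%cB43$sd@^1" \
        --wsrep_node_address=${NAME}.weave.local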

Database Routing with ProxySQL

Now we have three database containers running. The only way to access the cluster now is via the individual Docker host's published MySQL port, 3306 (mapped to port 3306 of the container). So what happens if one of the database containers fails? You have to manually fail over the client's connection to the next available node. Depending on the application connector, you could also specify a list of nodes and let the connector do the failover and query routing for you (Connector/J, PHP mysqlnd). Otherwise, it would be a good idea to unify the database resources into a single resource that can be called a service.

This is where ProxySQL comes into the picture. ProxySQL can act as the query router, load balancing the database connections similar to what "Service" in Swarm or Kubernetes world can do. We have built a ProxySQL Docker image for this purpose and will maintain the image for every new version with our best effort.

Before we run the ProxySQL container, we have to prepare the configuration file. The following is what we have configured for proxysql1. We create a custom configuration file under /containers/proxysql1/proxysql.cnf on host1:

$ cat /containers/proxysql1/proxysql.cnf
datadir="/var/lib/proxysql"
admin_variables=
{
    admin_credentials="admin:admin"
    mysql_ifaces="0.0.0.0:6032"
    refresh_interval=2000
}
mysql_variables=
{
    threads=4
    max_connections=2048
    default_query_delay=0
    default_query_timeout=36000000
    have_compress=true
    poll_timeout=2000
    interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
    default_schema="information_schema"
    stacksize=1048576
    server_version="5.1.30"
    connect_timeout_server=10000
    monitor_history=60000
    monitor_connect_interval=200000
    monitor_ping_interval=200000
    ping_interval_server=10000
    ping_timeout_server=200
    commands_stats=true
    sessions_sort=true
    monitor_username="proxysql"
    monitor_password="proxysqlpassword"
}
mysql_servers =
(
    { address="mariadb1.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="mariadb2.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="mariadb3.weave.local" , port=3306 , hostgroup=10, max_connections=100 },
    { address="mariadb1.weave.local" , port=3306 , hostgroup=20, max_connections=100 },
    { address="mariadb2.weave.local" , port=3306 , hostgroup=20, max_connections=100 },
    { address="mariadb3.weave.local" , port=3306 , hostgroup=20, max_connections=100 }
)
mysql_users =
(
    { username = "sbtest" , password = "password" , default_hostgroup = 10 , active = 1 }
)
mysql_query_rules =
(
    {
        rule_id=100
        active=1
        match_pattern="^SELECT .* FOR UPDATE"
        destination_hostgroup=10
        apply=1
    },
    {
        rule_id=200
        active=1
        match_pattern="^SELECT .*"
        destination_hostgroup=20
        apply=1
    },
    {
        rule_id=300
        active=1
        match_pattern=".*"
        destination_hostgroup=10
        apply=1
    }
)
scheduler =
(
    {
        id = 1
        filename = "/usr/share/proxysql/tools/proxysql_galera_checker.sh"
        active = 1
        interval_ms = 2000
        arg1 = "10"
        arg2 = "20"
        arg3 = "1"
        arg4 = "1"
        arg5 = "/var/lib/proxysql/proxysql_galera_checker.log"
    }
)

The above configuration will:

  • configure two host groups, the single-writer and multi-writer group, as defined under "mysql_servers" section,
  • send reads to all Galera nodes (hostgroup 20) while write operations will go to a single Galera server (hostgroup 10),
  • schedule the proxysql_galera_checker.sh,
  • use monitor_username and monitor_password as the monitoring credentials created when we first bootstrapped the cluster (mariadb0).

Copy the configuration file to host2, for ProxySQL redundancy:

$ mkdir -p /containers/proxysql2/ # on host2
$ scp /containers/proxysql1/proxysql.cnf /containers/proxysql2/ # on host1

Then, run the ProxySQL containers on host1 and host2 respectively:

$ docker run -d \
        --name=${NAME} \
        --publish 6033 \
        --publish 6032 \
        --restart always \
        --net=weave \
        $(weave dns-args) \
        --hostname ${NAME}.weave.local \
        -v /containers/${NAME}/proxysql.cnf:/etc/proxysql.cnf \
        -v /containers/${NAME}/data:/var/lib/proxysql \
        severalnines/proxysql

** Replace ${NAME} with proxysql1 or proxysql2 respectively.

We specified --restart=always to make the container always available regardless of exit status, and to start it automatically when the Docker daemon starts. This makes sure the ProxySQL containers act like a daemon.

Verify the MySQL servers status monitored by both ProxySQL instances (OFFLINE_SOFT is expected for the single-writer host group):

$ docker exec -it proxysql1 mysql -uadmin -padmin -h127.0.0.1 -P6032 -e 'select hostgroup_id,hostname,status from mysql_servers'
+--------------+----------------------+--------------+
| hostgroup_id | hostname             | status       |
+--------------+----------------------+--------------+
| 10           | mariadb1.weave.local | ONLINE       |
| 10           | mariadb2.weave.local | OFFLINE_SOFT |
| 10           | mariadb3.weave.local | OFFLINE_SOFT |
| 20           | mariadb1.weave.local | ONLINE       |
| 20           | mariadb2.weave.local | ONLINE       |
| 20           | mariadb3.weave.local | ONLINE       |
+--------------+----------------------+--------------+

At this point, our architecture is looking something like this:

All connections coming in on port 6033 (either from host1, host2 or the container network) will be load balanced to the backend database containers by ProxySQL. If you would like to access an individual database server, use port 3306 of the physical host instead. There is no virtual IP address configured yet as a single endpoint for the ProxySQL service, but we could have that by using Keepalived, which is explained in the next section.
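
As a quick sanity check (this assumes an application account matching the sbtest entry in mysql_users above has been created on the Galera cluster; adjust the user and grants to your needs), any container on the weave network can route a query through ProxySQL:

# Create the application user on the cluster (it replicates to all nodes).
$ docker exec -it mariadb1 mysql -uroot "-pPM7%cB43$sd@^1" -e "CREATE USER 'sbtest'@'%' IDENTIFIED BY 'password'; GRANT ALL PRIVILEGES ON sbtest.* TO 'sbtest'@'%';"

# Connect through ProxySQL (port 6033) and see which backend answered.
$ docker run --rm -it --net weave $(weave dns-args) mariadb:10.2.15 mysql -usbtest -ppassword -hproxysql1.weave.local -P6033 -e "SELECT @@hostname;"

Since the query rules above send plain SELECTs to hostgroup 20, repeated runs of the SELECT should be answered by different Galera nodes.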

Virtual IP Address with Keepalived

Since we configured ProxySQL containers to be running on host1 and host2, we are going to use Keepalived containers to tie these hosts together and provide virtual IP address via the host network. This allows a single endpoint for applications or clients to connect to the load balancing layer backed by ProxySQL.

As usual, create a custom configuration file for our Keepalived service. Here is the content of /containers/keepalived1/keepalived.conf:

vrrp_instance VI_DOCKER {
   interface ens33               # interface to monitor
   state MASTER
   virtual_router_id 52          # Assign one ID for this route
   priority 101
   unicast_src_ip 192.168.55.161
   unicast_peer {
      192.168.55.162
   }
   virtual_ipaddress {
      192.168.55.160             # the virtual IP
   }
}

Copy the configuration file to host2 for the second instance:

$ mkdir -p /containers/keepalived2/ # on host2
$ scp /containers/keepalived1/keepalived.conf /containers/keepalived2/ # on host1

Change the priority from 101 to 100 inside the copied configuration file on host2:

$ sed -i 's/101/100/g' /containers/keepalived2/keepalived.conf

** The higher priority instance will hold the virtual IP address (in this case host1), until the VRRP communication is interrupted (in case host1 goes down).

Then, run the following command on host1 and host2 respectively:

$ docker run -d \
        --name=${NAME} \
        --cap-add=NET_ADMIN \
        --net=host \
        --restart=always \
        --volume /containers/${NAME}/keepalived.conf:/usr/local/etc/keepalived/keepalived.conf \
        osixia/keepalived:1.4.4

** Replace ${NAME} with keepalived1 and keepalived2.

The run command tells Docker to:

  • --name, creates a container with the given name,
  • --cap-add=NET_ADMIN, adds Linux capabilities for the network admin scope,
  • --net=host, attaches the container to the host network. This will provide the virtual IP address on the host interface, ens33,
  • --restart=always, always keeps the container running,
  • --volume=/containers/${NAME}/keepalived.conf:/usr/local/etc/keepalived/keepalived.conf, maps the custom configuration file for the container's usage.

After both containers are started, verify the virtual IP address existence by looking at the physical network interface of the MASTER node:

$ ip a | grep ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.55.161/24 brd 192.168.55.255 scope global ens33
    inet 192.168.55.160/32 scope global ens33

The clients and applications may now use the virtual IP address, 192.168.55.160, to access the database service. This virtual IP address exists on host1 at this moment. If host1 goes down, keepalived2 will take over the IP address and bring it up on host2. Take note that this Keepalived configuration does not monitor the ProxySQL containers; it only monitors the VRRP advertisement of the Keepalived peers.
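
For instance (assuming the same sbtest application account as in the earlier check, and that ProxySQL's port 6033 is reachable on the hosts as described above), an external client would now point at the virtual IP instead of an individual host:

# Applications connect to the VIP; Keepalived moves it to host2 if host1 fails.
$ mysql -usbtest -ppassword -h192.168.55.160 -P6033 -e "SELECT @@hostname;"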

At this point, our architecture is looking something like this:

Summary

So, now we have a MariaDB Galera Cluster fronted by a highly available ProxySQL service, all running on Docker containers.

In part two, we are going to look into how to manage this setup. We’ll look at how to perform operations like graceful shutdown, bootstrapping, detecting the most advanced node, failover, recovery, scaling up/down, upgrades, backup and so on. We will also discuss the pros and cons of having this setup for our clustered database service.

Happy containerizing!


Where you can find MySQL in June

As a follow-up to the previous message posted on June 4, 2018, please find below the shows where you can find MySQL during June this year:

  • DataOps, June 21-22, 2018 in Barcelona, Spain
    • MySQL is a Community sponsor of this event, where our colleague Fred Descamps, the MySQL Community Manager, is giving the following talk:
      • "MySQL 8.0 Document Store: NoSQL with all the benefits of a RDBMS", scheduled for Jun 21, 2018 @12:20pm
    • There are also other talks on MySQL which I would really recommend attending:
      • "A quick tour of MySQL 8.0 roles" by Giuseppe Maxia, scheduled for June 21 @11:00
      • "MySQL High Availability at GitHub" by Shlomi Noach, scheduled for June 21 @14:50
      • "MySQL Group Replication - Ready For Production?" by Kenny Gryp, scheduled for June 21 @16:50
      • "dbdeployer: the future of MySQL sandboxes" by Giuseppe Maxia, scheduled for June 22 @11:00
      • ...and much more, see the full schedule here.
  • Hong Kong Open Source Conference, June 16-17, 2018, Hong Kong
  • OpenSource Developer Conference Okinawa, Japan, June 16, 2018, Okinawa, Japan
    • The MySQL team is a Gold sponsor of this open source conference, with the following MySQL talk given by Yoshiaki Yamasaki, the MySQL Senior Sales Consultant, Asia Pacific & Japan:
      • What's new in MySQL 8.0
  • PyCon Thailand, Bangkok, June 16-17, 2018, Bangkok, Thailand
    • We are currently a Silver sponsor of this PyCon with an approved talk by Ronen Baram, MySQL Principal Sales Consultant, as follows:
      • "Build dynamically scaled highly available python application with MySQL", scheduled for Jun 16, 2018 @11:55am.
    • Please find Ronen there for any kind of MySQL discussion. Ronen is always happy to help!

 

TwinDB Backup Tool Release 2.15.7

Some time ago, our team released a new version of TwinDB Backup Tool that contains minor fixes and some major improvements. In this article I want to highlight the major features in this release. Status refactoring The tool stores backup metadata in a JSON file that I will refer to from now on as the status. The status is […]

The post TwinDB Backup Tool Release 2.15.7 appeared first on TwinDB.

How to drop a column in a MySQL table

In this 101 article, I will show how to drop/remove a column from a table in MySQL.

In this article I will use an example table:

CREATE TABLE tb(
  c1 INT PRIMARY KEY,
  c2 char(1),
  c3 varchar(2)
) ENGINE=InnoDB;

To remove a column we will make use of the ALTER TABLE command:

ALTER TABLE tb DROP COLUMN c2;

The command allows you to remove multiple columns at once:

ALTER TABLE tb DROP COLUMN c2, DROP COLUMN c3;

If you are running MySQL 5.6 onwards, you can make this operation online, allowing other sessions to read and write to your table while the operation is being performed:

ALTER TABLE tb DROP COLUMN c2, ALGORITHM=INPLACE, LOCK=NONE;

Reference:

https://dev.mysql.com/doc/refman/5.7/en/alter-table.html
https://dev.mysql.com/doc/refman/5.7/en/innodb-create-index-overview.html

Shinguz: Special MySQL and MariaDB trainings 2018 in English

Due to strong customer demand, FromDual offers two extra MySQL/MariaDB trainings in 2018 with its training partner The Linuxhotel in Essen (Germany). These trainings are held in English.

  • MariaDB Performance Tuning on 5 and 6 September 2018 (2 days).
  • Advanced MySQL/MariaDB training on 26 to 30 November 2018 (5 days).

More information about the contents of the trainings can be found at Advanced MySQL and MariaDB training.

For conditions and booking: MariaDB Performance Tuning and Advanced MySQL Training.

For specific MariaDB or MySQL on-site Consulting or in-house Training please get in contact with us.


MySQL Streaming Xtrabackup to Slave Recovery

Overview

There are times when I need to restore a slave from a backup of a master or another slave, but too many times I have taken the typical approach of taking the backup on the source server and then copying it to the target server.  This takes a great deal of time, especially as your database grows in size.

These steps detail how to use Netcat (nc) and Percona XtraBackup (innobackupex) to stream the backup from the source server to the target server, saving a great deal of time by copying the data only once to the desired location.  While the data is streaming to the target server, it's being compressed and then uncompressed on the fly, reducing the amount of traffic going across the network by around 85% to 90% (typical backup compression ratios for MySQL InnoDB that I have witnessed are in this range).  I will also provide a simple script to complete these steps to give you a start at creating your own customized solution.

Requirements

In order to accomplish this task, you need to keep the following items in mind:

  • Netcat (nc) – Application needed on both servers, used for data streaming.
  • Percona Xtrabackup – Application needed on both servers, used to perform the backup.
  • MySQL Access – MySQL access is required in order to do the backup on the master, and the slave configuration on the slave.
  • Pigz (optional) – This is only needed if you want to compress and uncompress it on the fly.  If you are going to use this, it’s needed on both servers.
  • Debian – All code and scripts were tested using Debian.  Commands may slightly change with different OS’s.
Steps

Here are the steps that are required to accomplish this task.  The source server is the server where the backup is coming from.  The target server is where the backup is going to.

Step 1. Stop MySQL on Target Server and Clear MySQL Data

On the server that needs to be restored, we will make sure that MySQL is stopped.  Then we will clear out the old data as this will all be replaced with the backup coming from the source server.  The example assumes your MySQL data directory is /var/lib/mysql.

service mysql stop
rm -rf /var/lib/mysql/*

Step 2. Start Listener on Target Server

Now that the target server has its MySQL data directory cleared of its old data, it is now ready to receive the new backup directly from the source server. The port 2112 can be changed, but the port needs to match on both source and target commands. Nc command and options may vary by OS.

nc -l -p 2112 | unpigz -c | xbstream -x -C /var/lib/mysql

With no compression it would be:

nc -l -p 2112 | xbstream -x -C /var/lib/mysql

Step 3. Start Backup on Source Server

The listener is now up on the target server, ready to accept the connection. We will now start the backup on the source server to stream to the target server. Update the parallel option to match the number of cpu cores on your server. Use the lower core count between the source and target server. The port 2112 can be changed, but the port needs to match on both source and target commands. Nc command and options may vary by OS.

innobackupex --stream=xbstream --parallel=4 /tmp | pigz -c --fast | nc -w 2 TARGET_SERVER 2112

With no compression it would be:

innobackupex --stream=xbstream --parallel=4 /tmp | nc -w 2 TARGET_SERVER 2112

Step 4. Prepare Backup on Target Server

Percona XtraBackup requires that you prepare the backup after it has been completed to apply any outstanding logs and get the database ready to be started. Use as much memory on your target server as you can without causing it to go OOM (Out of Memory). As an example, 75% of your total memory would be a good starting point if there is nothing else running on your server. On a server with 4G of RAM you could safely set --use-memory to 3G.

innobackupex --use-memory=3G --apply-log /var/lib/mysql

Step 5. Update Ownership and Start MySQL

Now that the apply logs step has completed on your backup, you should be able to update the ownership of the files to mysql:mysql, and then start the MySQL service.

chown -R mysql:mysql /var/lib/mysql
service mysql start

Step 6. Configure Replication

If the source server is a slave, you should be able to just start the new slave, as the positioning will already be configured. If your source server is the master, then you will have to figure out whether you are using GTID or legacy replication. If you are using GTIDs, you should be able to start replication with gtid_purged being set and the master auto position parameter. If you are using legacy replication, you can find the master log and position in the xtrabackup_binlog_info file in your backup directory. In this scenario, the backup directory is the MySQL data directory (/var/lib/mysql/) on the target server.

GTID Replication

mysql
CHANGE MASTER TO
  MASTER_HOST='<MASTER_SERVER>',
  MASTER_USER='<REPL_USER>',
  MASTER_PASSWORD='<REPL_PASS>',
  MASTER_AUTO_POSITION = 1;
START SLAVE;

Legacy Replication

cat /var/lib/mysql/xtrabackup_binlog_info
mysql
CHANGE MASTER TO
  MASTER_HOST='<MASTER_SERVER>',
  MASTER_USER='<REPL_USER>',
  MASTER_PASSWORD='<REPL_PASS>',
  MASTER_LOG_FILE='<LOG_FROM_xtrabackup_binlog_info>',
  MASTER_LOG_POS=<POSITION_FROM_xtrabackup_binlog_info>;
START SLAVE;

Script

Below you will find the simple script that I came up with to get you started on automating this task to quickly rebuild a slave in your own environment.  The script was created with a lot of assumptions that you may not have in your environment.  Please make sure to update accordingly.

  • The script is running from a separate server that has access to both MySQL servers. May work by running on one of the MySQL servers but not tested to do so.
  • The account running the commands on the remote servers has SUDO access to run commands.
  • SSH key of the remote server is set up to allow ssh access to both MySQL servers allowing for ssh with no password prompt.
  • .my.cnf is configured in the home directory of the account being used to SSH on the MySQL servers allowing the script to run MySQL and Xtrabackup commands with no password prompt.
  • The following software is installed on the MySQL servers : netcat (nc), pigz, xtrabackup.
  • Firewall rules are open for the port being used by NETCAT streaming.
  • All my testing was on Debian servers. I found that with other OSes and Netcat versions there is a -d flag for running nc in the background. In Debian you have to use -p for netcat when telling it to listen on a port.

Script

#!/bin/bash
SOURCE_SERVER=$1
TARGET_SERVER=$2
# This is if you have /var/lib/mysql on the / (Root) drive. You can change this to '/' to capture the size of the drive to get an idea of how long is left on your backup
MOUNT_CONTAINING_DATADIR='/var'
#Should match the number of CPU's on your server
BACKUP_THREADS=2
# Seconds to wait in the loop to check that the backup has completed
TIMER=5
# Amount of memory to use to apply the logs to the newly backed up server
MEMORY_GB_USED_APPLY_LOGS='1G'
# Change this to a 1 if you want to configure the target a slave of the source server
CONFIGURE_TARGET_AS_SLAVE_OF_SOURCE=1
GTID_STATUS=''
SSH='ssh -q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'
NETCAT_PORT=2112
REPL_USER='repl'
REPL_PASS='repl_PASSWORD'
MYSQL_DATADIR='/var/lib/mysql'

echo ""
echo "Starting MySQL Streaming Slave Rebuild Script"
echo ""

if [ ! $1 ];then
  echo "SOURCE Server not set. Please run like :"
  echo ""
  echo "$0 <source_server> <target_server>"
  echo ""
  exit 1
fi

# VALIDATE EXECUTION
echo "Are you sure you wish to perform this rebuild. THIS IS DESTRUCTIVE!!!"
echo " SOURCE SERVER (Performing Backup) : $SOURCE_SERVER"
echo " TARGET SERVER (Receiving Backup) : $TARGET_SERVER"
echo " All files in $MYSQL_DATADIR on $TARGET_SERVER will be DELETED!!!"
echo -n "START NOW ? (y/n) : "
read CONFIRM
if [ $CONFIRM == "y" -o $CONFIRM == "Y" ]; then
  echo ""
  echo "STARTING REBUILD!!!"
else
  echo "Y or y was not chosen. Exiting."
  exit 1
fi

# PREPARE TARGET
echo " Stopping MySQL"
$SSH $TARGET_SERVER "service mysql stop"
echo " Clearing $MYSQL_DATADIR"
$SSH $TARGET_SERVER "rm -rf $MYSQL_DATADIR/*"

# PERFORM STREAMING BACKUP
echo " Start Listener on Target server $TARGET_SERVER:$NETCAT_PORT to accept the backup and place it in $MYSQL_DATADIR"
$SSH $TARGET_SERVER "nc -l -p $NETCAT_PORT | unpigz -c | xbstream -x -C $MYSQL_DATADIR" > /dev/null 2>&1 &
echo " Starting backup on source server $SOURCE_SERVER:$NETCAT_PORT to stream backup"
$SSH $SOURCE_SERVER "innobackupex --stream=xbstream --parallel=$BACKUP_THREADS /tmp | pigz -c --fast | nc -w 2 $TARGET_SERVER $NETCAT_PORT" > /tmp/backup_log 2>&1 &
sleep 4
echo " Watching backup every $TIMER seconds to validate when the backup is complete"
LOOP=1
while [ 1 -eq $LOOP ];do
  BACKUP_PROCESSES=`$SSH $SOURCE_SERVER "ps aux | grep -v grep | grep -w innobackupex | wc -l"`
  if [ $BACKUP_PROCESSES -eq 0 ]; then
    echo " Backup has COMPLETED!!"
    LOOP=2
  else
    echo " Backup is Running!"
    sleep $TIMER
  fi
done

# PREPARE AND COMPLETE BACKUP ON TARGET
echo " Applying logs to the Xtrabackup"
$SSH $TARGET_SERVER "innobackupex --use-memory=$MEMORY_GB_USED_APPLY_LOGS --apply-log $MYSQL_DATADIR" > /tmp/backup_log 2>&1 &
sleep 3
LOOP=1
while [ 1 -eq $LOOP ];do
  APPLY_PROCESSES=`$SSH $TARGET_SERVER "ps aux | grep -v grep | grep innobackupex | wc -l"`
  if [ $APPLY_PROCESSES -eq 0 ]; then
    echo " Apply logs has COMPLETED!!"
    LOOP=2
  else
    echo " Apply Logs Running!"
    sleep $TIMER
  fi
done
sleep 1
echo " Updating ownership on the backup files so that MySQL owns them"
$SSH $TARGET_SERVER "chown -R mysql:mysql $MYSQL_DATADIR"
echo " Starting MySQL"
$SSH $TARGET_SERVER "service mysql start"

if [ $CONFIGURE_TARGET_AS_SLAVE_OF_SOURCE -eq 1 ]; then
  echo " Configuring Replication"
  GTID_STATUS=`$SSH $SOURCE_SERVER "mysql -BN -e \"SHOW VARIABLES LIKE 'gtid_mode'\"" | grep -w ON | wc -l`
  if [ $GTID_STATUS -gt 0 ]; then
    echo "Found GTID ON. Using Master Auto Position. SLAVE STARTED"
    GTID_POS=`$SSH $TARGET_SERVER "cat $MYSQL_DATADIR/xtrabackup_binlog_info" | awk '{print $3}' | head -n 1 | sed 's/,//'`
    $SSH $TARGET_SERVER "mysql -e \"SET GLOBAL gtid_purged='$GTID_POS';\""
    $SSH $TARGET_SERVER "mysql -e \"CHANGE MASTER TO MASTER_HOST='$SOURCE_SERVER', MASTER_USER='$REPL_USER', MASTER_PASSWORD='$REPL_PASS', MASTER_AUTO_POSITION = 1; START SLAVE; \""
  else
    echo "Found GTID not ON. Grabbing positioning from the backup file and using that to configure replication"
    MASTER_LOG=`$SSH $TARGET_SERVER "cat $MYSQL_DATADIR/xtrabackup_binlog_info" | awk '{print $1}'`
    MASTER_POS=`$SSH $TARGET_SERVER "cat $MYSQL_DATADIR/xtrabackup_binlog_info" | awk '{print $2}'`
    echo "Setting the slave to $MASTER_LOG and $MASTER_POS. SLAVE STARTED"
    $SSH $TARGET_SERVER "mysql -e \"CHANGE MASTER TO MASTER_HOST='$SOURCE_SERVER', MASTER_USER='$REPL_USER', MASTER_PASSWORD='$REPL_PASS', MASTER_LOG_FILE='$MASTER_LOG', MASTER_LOG_POS=$MASTER_POS; START SLAVE;\""
  fi
fi

Script Output

[root@bastion DB]# ./mysql_file_streaming_slave_rebuild.sh 10.10.10.198 10.10.10.197

Starting MySQL Streaming Slave Rebuild Script

Are you sure you wish to perform this rebuild. THIS IS DESTRUCTIVE!!!
 SOURCE SERVER (Performing Backup) : 10.10.10.198
 TARGET SERVER (Receiving Backup) : 10.10.10.197
 All files in /var/lib/mysql on 10.10.10.197 will be DELETED!!!
START NOW ? (y/n) : y

STARTING REBUILD!!!
 Stopping MySQL
 Clearing /var/lib/mysql
 Start Listener on Target server 10.10.10.197:2112 to accept the backup and place it in /var/lib/mysql
 Starting backup on source server 10.10.10.198:2112 to stream backup
 Watching backup every 5 seconds to validate when the backup is complete
 Backup is Running!
 Backup is Running!
 Backup is Running!
 Backup has COMPLETED!!
 Applying logs to the Xtrabackup
 Apply Logs Running!
 Apply Logs Running!
 Apply logs has COMPLETED!!
 Updating ownership on the backup files so that MySQL owns them
 Starting MySQL
 Configuring Replication
Found GTID ON. Using Master Auto Position. SLAVE STARTED
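
As a final sanity check (not part of the script above, just a suggested follow-up on the target server), confirm that the new slave is actually replicating:

# Both threads should report Yes and Seconds_Behind_Master should trend towards 0.
mysql -e "SHOW SLAVE STATUS\G" | grep -E "Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master"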

 

Conclusion

I have found that this has greatly reduced the time it takes me to recover a failed slave. By transferring the data only once from the source server to the target server, and with the data being compressed during the transfer, I feel this is one of the most efficient methods of recovering a failed slave, or building a new one.

A friendly comparison of InnoDB and MyRocks Performance

In this blog post, we run multiple OLTP performance benchmarking scenarios using sysbench 1.0.14 on InnoDB and MyRocks. InnoDB and MyRocks (RocksDB with MySQL) are definitely not meant to supplant each other; they actually complement each other well, each with its own advantages. Let me quickly explain how InnoDB and MyRocks can benefit you when used wisely. Again, this blog post is not about showing whether InnoDB or MyRocks is better. We regularly benchmark both of these storage engines before recommending to our customers what is best suited for their database infrastructure operations, so we would like to share our thoughts in this post.

How are InnoDB and MyRocks different?
  • MyRocks supports only the READ-COMMITTED isolation level; there is no REPEATABLE-READ isolation level like in InnoDB and hence no gap locking like in InnoDB. We have written a detailed blog on InnoDB transaction isolation levels here.
  • To get verbose information about a MyRocks instance, the log is located in "/var/lib/mysql/#rocksdb". A much more detailed RocksDB diagnostics report can be generated with the command SHOW ENGINE ROCKSDB STATUS (see the example after this list); it takes a good amount of time to understand and interpret MyRocks operational metrics.
  • In MyRocks, you have the rocksdb_block_cache_size system variable, which is somewhat similar to innodb_buffer_pool_size but is mainly beneficial for reads. By default MyRocks uses buffered reads: the OS cache contains cached compressed data and the RocksDB block cache contains uncompressed data. You can have two levels of cache, or disable buffering by forcing the block cache to use direct reads with rocksdb_use_direct_reads=ON.
  • LSM data structure - MyRocks is not an alternative or advanced version of InnoDB. The LSM data structure is great for write-intensive database operations, while reads are slower and full table scans are too expensive, so InnoDB and RocksDB together make a great combination!
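
For example (a quick sketch using the mysql client; the statements only use the variables named above), you can inspect a running MyRocks instance like this:

# Full RocksDB diagnostics report (takes some effort to interpret, as noted above).
mysql -e "SHOW ENGINE ROCKSDB STATUS\G"

# Block cache size and whether direct reads are enabled.
mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN ('rocksdb_block_cache_size','rocksdb_use_direct_reads');"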
Benchmarking InnoDB and MyRocks performance with sysbench 1.0.14 for OLTP operations 

Linux – CentOS Linux release 7.3.1611 (Core) 

Database infrastructure – MariaDB 10.3.7

Building database infrastructure for benchmarking 

We have used the "oltp_common.lua" script to create the database infrastructure for benchmarking InnoDB and MyRocks. We have not tuned either InnoDB or MyRocks variables for performance. The script below creates the database for benchmarking:

[root@localhost sysbench]# sysbench bulk_insert.lua --threads=1 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Creating table 'sbtest1'...
[root@localhost sysbench]#

Benchmarking bulk INSERT performance on InnoDB and MyRocks

Benchmarking OLTP insert on InnoDB using “oltp_insert.lua” 

Script to create data (5M records) for benchmarking OLTP INSERT performance:

[root@localhost sysbench]# sysbench oltp_insert.lua --threads=100 --time=180 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
[root@localhost sysbench]#

MariaDB [test]> show table status like 'sbtest1'\G
*************************** 1. row ***************************
            Name: sbtest1
          Engine: InnoDB
         Version: 10
      Row_format: Dynamic
            Rows: 5404891
  Avg_row_length: 265
     Data_length: 1433403392
 Max_data_length: 0
    Index_length: 157024256
       Data_free: 4194304
  Auto_increment: 5696281
     Create_time: 2018-06-03 12:48:12
     Update_time: 2018-06-03 12:52:03
      Check_time: NULL
       Collation: latin1_swedish_ci
        Checksum: NULL
  Create_options:
         Comment:
Max_index_length: 0
       Temporary: N
1 row in set (0.000 sec)

Script for benchmarking InnoDB OLTP INSERT performance:

[root@localhost sysbench]# sysbench oltp_insert.lua --threads=100 --time=180 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:            0
        write:           696280
        other:           0
        total:           696280
    transactions:        696280 (3866.32 per sec.)
    queries:             696280 (3866.32 per sec.)
    ignored errors:      0      (0.00 per sec.)
    reconnects:          0      (0.00 per sec.)

General statistics:
    total time:                  180.0872s
    total number of events:      696280

Latency (ms):
         min:                0.62
         avg:                25.85
         max:                358.63
         95th percentile:    81.48
         sum:                17998504.11

Threads fairness:
    events (avg/stddev):           6962.8000/57.61
    execution time (avg/stddev):   179.9850/0.04

What we look at most seriously in this benchmarking is QPS (queries per second). In the test above it is 3866 QPS.

Benchmarking MyRocks INSERT performance using Sysbench 1.0.14: 

The steps are the same, except that we explicitly specify the RocksDB storage engine in the sysbench commands: "--mysql-storage-engine=rocksdb".

Script for benchmarking OLTP insert on MyRocks using “oltp_insert.lua”  : 

[root@localhost sysbench]# sysbench oltp_insert.lua --threads=100 --time=180 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
[root@localhost sysbench]#

MariaDB [test]> show table status like 'sbtest1%'\G
*************************** 1. row ***************************
            Name: sbtest1
          Engine: ROCKSDB
         Version: 10
      Row_format: Fixed
            Rows: 5000000
  Avg_row_length: 198
     Data_length: 992949774
 Max_data_length: 0
    Index_length: 38739880
       Data_free: 0
  Auto_increment: 5000001
     Create_time: NULL
     Update_time: NULL
      Check_time: NULL
       Collation: latin1_swedish_ci
        Checksum: NULL
  Create_options:
         Comment:
Max_index_length: 0
       Temporary: N
1 row in set (0.007 sec)

Script for benchmarking RocksDB OLTP INSERT performance:

[root@localhost sysbench]# sysbench oltp_insert.lua --threads=100 --time=180 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:            0
        write:           123049
        other:           0
        total:           123049
    transactions:        123049 (683.37 per sec.)
    queries:             123049 (683.37 per sec.)
    ignored errors:      0      (0.00 per sec.)
    reconnects:          0      (0.00 per sec.)

General statistics:
    total time:                  180.0618s
    total number of events:      123049

Latency (ms):
         min:                1.03
         avg:                146.30
         max:                1675.09
         95th percentile:    308.84
         sum:                18001689.44

Threads fairness:
    events (avg/stddev):           1230.4900/26.26
    execution time (avg/stddev):   180.0169/0.02
[root@localhost sysbench]#

Result: OLTP INSERT performance for RocksDB is 683 QPS (queries per second)

Below is a graphical representation of more performance benchmarking results across multiple bulk INSERT transaction scenarios:

Conclusion 

OLTP INSERT performance benchmarking clearly shows that InnoDB is almost 6X faster than MyRocks here, so MyRocks is not recommended for bulk INSERT transactions!

OLTP WRITE only transactions performance benchmarking for InnoDB and MyRocks 

OLTP WRITE only performance benchmarking for InnoDB using “oltp_write_only.lua”

The steps for the "oltp_write_only.lua" script remain the same as for "oltp_insert.lua" above, so we copy the benchmarking results directly without explaining the details again:

[root@localhost sysbench]# sysbench oltp_write_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_write_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:            0
        write:           14529
        other:           7265
        total:           21794
    transactions:        3632  (355.03 per sec.)
    queries:             21794 (2130.37 per sec.)
    ignored errors:      1     (0.10 per sec.)
    reconnects:          0     (0.00 per sec.)

General statistics:
    total time:                  10.2285s
    total number of events:      3632

Latency (ms):
         min:                1.88
         avg:                277.61
         max:                2701.56
         95th percentile:    977.74
         sum:                1008267.12

Threads fairness:
    events (avg/stddev):           36.3200/4.36
    execution time (avg/stddev):   10.0827/0.09

Result : 2130 QPS (queries per second)

OLTP WRITE only performance benchmarking for RocksDB using “oltp_write_only.lua”

[root@localhost sysbench]# sysbench oltp_write_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_write_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:            0
        write:           25191
        other:           12596
        total:           37787
    transactions:        6296  (625.73 per sec.)
    queries:             37787 (3755.49 per sec.)
    ignored errors:      4     (0.40 per sec.)
    reconnects:          0     (0.00 per sec.)

General statistics:
    total time:                  10.0603s
    total number of events:      6296

Latency (ms):
         min:                1.39
         avg:                159.29
         max:                3620.58
         95th percentile:    846.57
         sum:                1002895.84

Threads fairness:
    events (avg/stddev):           62.9600/25.26
    execution time (avg/stddev):   10.0290/0.02

Result : 3755 QPS (queries per second)

Below is a graphical representation of more performance benchmarking results across multiple WRITE-only transaction scenarios:

Conclusion 

MyRocks OLTP write-only performance is almost 2X that of InnoDB, so MyRocks is definitely a preferred option for high-performance and scalable writes, thanks to the LSM data structure!

OLTP READ-WRITE performance benchmarking using Sysbench lua script “oltp_read_write.lua”

The lua scripts below create data for OLTP READ-WRITE performance benchmarking:

Benchmarking OLTP READ-WRITE performance for InnoDB:

[root@localhost sysbench]# sysbench oltp_read_write.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_read_write.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:             15652
        write:            4472
        other:            2236
        total:            22360
    transactions:         1118   (105.96 per sec.)
    queries:              22360  (2119.20 per sec.)
    ignored errors:       0      (0.00 per sec.)
    reconnects:           0      (0.00 per sec.)

General statistics:
    total time:               10.5301s
    total number of events:   1118

Latency (ms):
    min:                  15.71
    avg:                  922.19
    max:                  4973.09
    95th percentile:      2009.23
    sum:                  1031006.57

Threads fairness:
    events (avg/stddev):           11.1800/1.68
    execution time (avg/stddev):   10.3101/0.13

Result: 2119 QPS (queries per second) 

Benchmarking OLTP READ-WRITE performance for RocksDB:

[root@localhost sysbench]# sysbench oltp_read_write.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_read_write.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:             26964
        write:            7628
        other:            3827
        total:            38419
    transactions:         1901   (182.46 per sec.)
    queries:              38419  (3687.46 per sec.)
    ignored errors:       25     (2.40 per sec.)
    reconnects:           0      (0.00 per sec.)

General statistics:
    total time:               10.4153s
    total number of events:   1901

Latency (ms):
    min:                  11.23
    avg:                  540.87
    max:                  3480.91
    95th percentile:      1352.03
    sum:                  1028196.02

Threads fairness:
    events (avg/stddev):           19.0100/2.25
    execution time (avg/stddev):   10.2820/0.10

Result: 3687 QPS (queries per second) 

The chart below is a graphical representation of further benchmark results for several READ-WRITE transaction scenarios:

Conclusion  

The OLTP READ-WRITE benchmarking results confirm MyRocks as the clear choice here, although the results may vary if we invest more in tuning InnoDB and MyRocks for performance.

Benchmarking OLTP READ ONLY operations using Sysbench oltp_read_only.lua script 

OLTP READ ONLY transactions performance benchmarking for InnoDB:

[root@localhost sysbench]# sysbench oltp_read_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_read_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:             51072
        write:            0
        other:            7296
        total:            58368
    transactions:         3648   (352.59 per sec.)
    queries:              58368  (5641.45 per sec.)
    ignored errors:       0      (0.00 per sec.)
    reconnects:           0      (0.00 per sec.)

General statistics:
    total time:               10.3436s
    total number of events:   3648

Latency (ms):
    min:                  1.10
    avg:                  274.41
    max:                  2863.46
    95th percentile:      733.00
    sum:                  1001047.53

Threads fairness:
    events (avg/stddev):           36.4800/6.57
    execution time (avg/stddev):   10.0105/0.05

Result: 5641 QPS (queries per second)

OLTP READ ONLY transactions performance benchmarking for RocksDB:

[root@localhost sysbench]# sysbench oltp_read_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_read_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:             26362
        write:            0
        other:            3766
        total:            30128
    transactions:         1883   (182.77 per sec.)
    queries:              30128  (2924.32 per sec.)
    ignored errors:       0      (0.00 per sec.)
    reconnects:           0      (0.00 per sec.)

General statistics:
    total time:               10.2983s
    total number of events:   1883

Latency (ms):
    min:                  6.20
    avg:                  540.16
    max:                  5258.04
    95th percentile:      2045.74
    sum:                  1017118.10

Threads fairness:
    events (avg/stddev):           18.8300/2.29
    execution time (avg/stddev):   10.1712/0.08

Result: 2924 QPS (queries per second)

The chart below is a graphical representation of further benchmark results for several READ-only transaction scenarios:

Conclusion

InnoDB works great for OLTP READ-only transactions, so we can continue recommending InnoDB to customers for read-intensive database operations.

Benchmarking OLTP DELETE ONLY operations using Sysbench oltp_delete.lua script 

OLTP DELETE ONLY transactions performance benchmarking for InnoDB:

[root@localhost sysbench]# sysbench oltp_delete.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_delete.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:             0
        write:            21659
        other:            4464
        total:            26123
    transactions:         26123  (2521.93 per sec.)
    queries:              26123  (2521.93 per sec.)
    ignored errors:       0      (0.00 per sec.)
    reconnects:           0      (0.00 per sec.)

General statistics:
    total time:               10.3568s
    total number of events:   26123

Latency (ms):
    min:                  0.04
    avg:                  38.08
    max:                  2679.06
    95th percentile:      116.80
    sum:                  994654.43

Threads fairness:
    events (avg/stddev):           261.2300/46.27
    execution time (avg/stddev):   9.9465/0.18

Result: 2521 QPS (queries per second)

OLTP DELETE ONLY transactions performance benchmarking for RocksDB:

[root@localhost sysbench]# sysbench oltp_delete.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_delete.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:             0
        write:            7094
        other:            421
        total:            7515
    transactions:         7515   (746.81 per sec.)
    queries:              7515   (746.81 per sec.)
    ignored errors:       0      (0.00 per sec.)
    reconnects:           0      (0.00 per sec.)

General statistics:
    total time:               10.0613s
    total number of events:   7515

Latency (ms):
    min:                  0.43
    avg:                  133.42
    max:                  1666.25
    95th percentile:      502.20
    sum:                  1002663.49

Threads fairness:
    events (avg/stddev):           75.1500/15.50
    execution time (avg/stddev):   10.0266/0.02

Result: 746 QPS (queries per second)

The chart below is a graphical representation of further benchmark results for several DELETE-only transaction scenarios:

Conclusion 

InnoDB is almost 3X faster than MyRocks in OLTP DELETE operations.

InnoDB and MyRocks performance comparison:
  • InnoDB is the definite choice if the workload is bulk INSERT, READ and DELETE intensive.
  • MyRocks performance is much better than InnoDB for WRITE and READ-WRITE intensive workloads.

The post A friendly comparison of InnoDB and MyRocks Performance appeared first on MySQL Consulting, Support and Remote DBA Services By MinervaDB.

Using –login-path with MySQL Cluster Manager

Protective convenience
Recent versions of MySQL have an added security feature, the –login-path option. This feature puts your username and password in an encrypted file, and you point clients at this file instead of typing plain-text passwords on the console or putting them in scripts.
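As a minimal sketch (the login-path name is a placeholder, and the port and user shown assume the MySQL Cluster Manager agent defaults, so adjust them for your installation), the credentials are stored with mysql_config_editor and then referenced by the client:

# store the credentials in the encrypted ~/.mylogin.cnf (you are prompted for the password)
mysql_config_editor set --login-path=mcm-client --host=127.0.0.1 --port=1862 --user=mcmd --password

# connect without typing a plain-text password on the console
mysql --login-path=mcm-client --prompt='mcm> '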

MySQL participating to HKOSCon 2018

From June 16th to 17th, the MySQL Team will be attending and speaking at the Hong Kong Open Source Conference 2018.

Unfortunately I won’t be present but some of my great colleagues will be speaking about MySQL.

If you are planning to attend HKOSCon2018, don’t miss their talks:

During the event, there will also be very nice talks from the MySQL Community. I really recommend that you also attend the following sessions:

This conference seems to be a very nice opportunity to learn new stuff related to MySQL 8.0.

MariaDB 10.3 supports database sequences

Introduction: Traditionally, both MySQL and MariaDB relied on AUTO_INCREMENT columns to generate an IDENTITY Primary Key. Although IDENTITY columns are very efficient in generating the Primary Key value, when it comes to using JPA and Hibernate, the IDENTITY generator prevents us from using JDBC batch inserts. To automatically enroll multiple INSERT, UPDATE or DELETE statements, … Continue reading MariaDB 10.3 supports database sequences →
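As a quick sketch of the MariaDB 10.3 syntax the post refers to (the sequence, table and column names below are purely illustrative):

-- create a sequence and pull values from it
CREATE SEQUENCE post_seq START WITH 1 INCREMENT BY 1 CACHE 50;
SELECT NEXT VALUE FOR post_seq;

-- use the sequence to populate a primary key instead of AUTO_INCREMENT
CREATE TABLE post (
    id BIGINT NOT NULL DEFAULT NEXT VALUE FOR post_seq PRIMARY KEY,
    title VARCHAR(100)
);
INSERT INTO post (title) VALUES ('MariaDB 10.3 sequences');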

The post MariaDB 10.3 supports database sequences appeared first on Vlad Mihalcea.

Call for Questions: Webinar with MySQL Benchmarking Experts

If you attended my latest Troubleshooting MySQL Concurrency Issues with Load Testing Tools webinar, you learned how I exploit benchmarking tools to replicate locking issues and crashes. But I told you nothing about the proper use of these tools for reliable benchmarks. I did not tell you that for a reason… I am just a Support Engineer and not a benchmarking expert.

And I have a dream. I want to invite world-famous MySQL benchmarking experts to a Percona webinar and ask them for their secrets. I have a pretty long list of my own questions, but in order to make a one-hour webinar productive, we need to concentrate on just a few of the hottest ones. Therefore I need your help.

Please add your questions into the comments field of this blog post. Then we will choose the most important and/or most popular of these to discuss at the webinar.

Some benchmarking questions

My list of questions includes:

  1. What do you benchmark? Are they real applications or artificial tests?
  2. How do you create tests for a real application workload?
  3. Do you test hardware?
  4. Which tools do you use? How? Any tips?
  5. How do you work with hardware limitations when the test machine is smaller than the production one?
  6. What are the most common mistakes when benchmarking?
  7. We all heard about “Lies, Damned Lies and Benchmarks”. Why do you still benchmark? How are you sure that yours have meaningful results?
  8. What are the most common mistakes made by those who publish benchmark results for public use?
  9. What are the best practices for benchmarks?

Please tell me what you think about these questions and add yours!

The post Call for Questions: Webinar with MySQL Benchmarking Experts appeared first on Percona Database Performance Blog.

MySQL Performance : more in depth with latin1 and utf8mb4 in 8.0 GA

Looking at my previously obtained results on Read-Only (RO) tests for latin1 and UTF8 charsets, one question kept turning in my mind :
- if MariaDB 10.3 is hitting such a deep drop on the "distinct-ranges" workload :
- why is this not impacting the "mixed" OLTP_RO workload results (which contain the "distinct-ranges" query too) :
The answer was in the test title :
  • I missed one zero in my scripts while preparing the initial tests.. ;-))
  • so the "distinct-ranges" test was using range size=10 (instead of the default 100)
  • while "mixed" OLTP_RO remained with default settings and used range size=100 for all range tests..
  • was it the use of a smaller range size that impacted MariaDB that much ?..
  • (generally people expect wider range queries to have more impact than smaller ones, right ?)..

To clarify all this, I decided to replay the RO tests with a more detailed analysis..
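For reference, with the bundled sysbench 1.0 OLTP Lua scripts (not necessarily the exact scripts used for the tests in this post) the length of the range queries, including the distinct-ranges SELECT, is controlled by the range-size option, which defaults to 100. A hedged sketch of forcing the smaller range size discussed above, with placeholder connection options:

sysbench oltp_read_only.lua --threads=64 --table-size=10000000 --range-size=10 --db-driver=mysql --mysql-host=127.0.0.1 --mysql-db=sbtest --mysql-user=sbtest --mysql-password=sbtest run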

Read more... (13 min remaining to read)

Colocated MySQL Server and NDB data nodes


Historically the advice for MySQL Cluster has been to not colocate
the MySQL Server and the NDB data node for scalable applications.

There are still reasons to apply this principle in scalable setups
where the application isn't close to the data.

But with MySQL Cluster 7.6 we have added a number of reasons why it
makes sense to colocate the MySQL Server and the NDB data node.

Internally in the NDB development we have had a debate about whether
to integrate the NDB data node inside the MySQL Server. The reason
for this is that the MySQL Server would be closer to the data. The
argument against is that the NDB data node and the MySQL Server are
designed with different recovery models. NDB data nodes are failfast:
as soon as we find a fault that is not supposed to happen, we crash
the data node. The MySQL Server, on the other hand, tries to stay up
as much as possible, since a crash of the MySQL Server brings down
data accessibility. In NDB we always expect another data
node to have a synchronous replica, thus data is accessible even in
the case of a crash.

With MySQL Cluster 7.6 we have gotten the best of both worlds. We
can now communicate from the MySQL Server to a NDB data node using
a shared memory transporter. This means that communication goes
entirely through the memory of the machine: the communication between
a thread in the MySQL Server and a thread in the NDB data node goes
through memory, and when one thread needs to wake up another, a
mutex is used with a condition variable exactly as in the MySQL
Server. Still, the NDB data node and the MySQL Server are separate
programs that can reside on machines independent of each other,
and they can crash independently of each other.

So with the release of MySQL Cluster 7.6 it is possible to have
clusters with locality of reads. Already in MySQL Cluster 7.5 we
introduced the possibility to declare tables as being able to
read from all replicas (Read Backup feature). In addition we
introduced tables that can be fully replicated in MySQL Cluster 7.5.
In these fully replicated tables access to a table is always local
to the data node we are in.

In MySQL Cluster 7.6 we are introducing a shared memory transporter
for efficient communication between a colocated MySQL Server and
an NDB data node. In addition we are introducing the possibility
to declare location domain ids for all nodes in the cluster. This
means that we can ensure that reads always stay local to the
Availability Domain in an Oracle Cloud (Availability Zone in Amazon
and Google clouds). Thus it is possible to design applications
without having to consider networking constraints as much as before
with NDB.
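A hedged sketch of what a location domain setup can look like in the cluster configuration file; the node ids, hostnames and number of nodes are illustrative, and the exact parameter placement should be checked against the 7.6 reference manual:

# config.ini (excerpt): one data node and one MySQL Server per availability domain
[ndbd]
NodeId=1
HostName=ndb-ad1.example.com
LocationDomainId=1

[ndbd]
NodeId=2
HostName=ndb-ad2.example.com
LocationDomainId=2

[ndbd]
NodeId=3
HostName=ndb-ad3.example.com
LocationDomainId=3

[mysqld]
NodeId=51
HostName=ndb-ad1.example.com
LocationDomainId=1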

This means that we expect NDB to work very well in SQL applications.
We are also constantly working on improving the SQL performance of
NDB by supporting more and more push down of joins to the NDB data nodes.
We are working on improving the basic range scan mechanism in NDB,
we are working on improving the interface between the NDB storage
engine and the MySQL Server. Finally we are also working hard to
integrate all the changes in MySQL 8.0 into MySQL Cluster as well.

I will describe a number of different scenarios for how to build
applications in the cloud using a setup where we have 3 data nodes,
one in each availability domain of the Oracle Cloud.

But in this blog and a few more blogs I will start by looking
specifically at how the shared memory transporter improves performance
of standard sysbench benchmarks.

In the previous blog I showed how we have improved performance of
Sysbench OLTP RW even for the standard TCP transporter. This was
due to the use of a new wakeup thread and the use of locking the NDB API
receive thread to a CPU where it can work undisturbed. The receive
thread handles receive of all messages from the NDB data nodes and
must be prioritised over the other MySQL Server threads, the best way
to achieve this is to use CPU locking. In the benchmarks we present in
this blog we will always use this CPU locking.

In the figure above we show how the performance of a normal setup using
7.5.9 compares to the 7.6.6 with receive thread locked to a CPU using
the TCP transporter. Next we have a curve that shows performance when
simply replacing the TCP transporter with a shared memory transporter.
Next we show a curve of what happens if we configure the shared memory
transporter to use spinning for a while before it goes to sleep.

The final curve shows the performance when also spinning in the TC
threads and the LDM threads in the NDB data node. Spinning in those
threads is not likely to be beneficial if those threads are not locked
to their own CPU core, thus in this one should not use hyperthreading
for those threads.

The takeaways from the graph above are the following:

1) The shared memory transporter has similar performance at low
concurrency to the TCP transporter. As concurrency increases the
shared memory transporter has better performance, the improvement
is 10% at top performance and more than 40% at very high concurrency.

2) Using spinning in the configuration of the shared memory transporter
improves performance at low concurrency significantly, by more than
20%. Top performance is similar to not using spinning, but it is
easier to get to this top performance.

3) Using spinning in the TC threads and LDM threads improves performance
even more at low concurrency. Performance increases by more than 30% at
low concurrency compared to no spinning and by 10% compared to spinning
only in transporter. Performance at high concurrency is similar for all
variants using shared memory transporter. So spinning helps to make the
MySQL Server need less concurrency to reach high performance levels.

We have added a graph below where we highlight the performance at 1 and 2
threads since it is difficult to see those differences in the first
figure.

Configuring NDB to use a shared memory transporter is easy: the easiest
way is to simply set the new configuration variable UseShm to 1 on the
NDB data nodes. With this setting we will create a shared memory transporter
between all API nodes and this node when the API node and the data node
share the same hostname. It is also possible to create a separate shared
memory section to describe the transporter setup between two specific
nodes in the cluster.

Spintime for the shared memory transporter is easiest to set up using the default
shared memory transporter section. Spintime for the TC and LDM threads in the NDB
data nodes is configured using the ThreadConfig variable on the NDB data nodes.
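A minimal sketch of how this could look in config.ini; all values are illustrative, and the exact spelling of the shared memory spin time parameter (shown here as ShmSpintime) and the ThreadConfig CPU bindings should be verified against the 7.6 documentation for your version:

# config.ini (excerpt): enable SHM transporters and spinning (illustrative values)
[ndbd default]
UseShm=1
# bind LDM and TC threads to their own CPU cores and let them spin before sleeping
ThreadConfig=ldm={count=8,cpubind=1-8,spintime=200},tc={count=2,cpubind=9-10,spintime=200}

[shm default]
# assumed parameter name for transporter spinning; verify in the reference manual
ShmSpintime=200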

Since we are using mutexes and condition variables in shared memory, we
currently support the shared memory transporter only on Linux.

The conclusion is that using the shared memory transporter we can improve
performance at low concurrency by more than 30%, we can improve throughput
by 20% and at very high concurrency (1536 threads) we get about 100%
improvement, all compared to the results with 7.5.9.

In the graph below we show only the 7.5.9 curve and compare it to the curve
achieved with all improvements in 7.6.


Webinar: MySQL & MariaDB Performance Tuning for Dummies

You’re running MySQL or MariaDB as your backend database; how do you tune it to make the best use of the hardware? How do you optimize the Operating System? How do you best configure MySQL or MariaDB for a specific database workload?

Do these questions sound familiar to you? Maybe you’re having to deal with that type of situation yourself?

A database server needs CPU, memory, disk and network in order to function. Understanding these resources is important for anybody managing a production database. Any resource that is weak or overloaded can become a limiting factor and cause the database server to perform poorly.

In this webinar, we’ll discuss some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL or MariaDB database. We will also cover some of the variables which are frequently modified even though they should not.

Performance tuning is not easy, especially if you’re not an experienced DBA, but you can go a surprisingly long way with a few basic guidelines.

Date, Time & Registration Europe/MEA/APAC

Tuesday, June 26th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)

Register Now

North America/LatAm

Tuesday, June 26th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)

Register Now

Agenda
  • What to tune and why?
  • Tuning process
  • Operating system tuning
    • Memory
    • I/O performance
  • MySQL configuration tuning
    • Memory
    • I/O performance
  • Useful tools
  • Do’s and do not’s of MySQL tuning
  • Changes in MySQL 8.0
Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.

This webinar builds upon blog posts by Krzysztof from the ‘Become a MySQL DBA’ series.

We look forward to “seeing” you there!

Tags:  MySQL MariaDB performance tuning webinar

GitHub acquired by Microsoft

Microsoft has just acquired GitHub for $7.5bn.  Good or bad?

Injected VC capital was $350m, so ROI for the VCs = 21.4x = very happy VCs.

Microsoft has done excellent work on OSS software in recent years, including on the Linux kernel, PHP, and many others.  Just like Oracle continues to put very good effort into MySQL after the Sun Microsystems acquisition many years ago.

But Microsoft is not an Open Source software company. The open source development model is not something they have built into their business “DNA” – processes (actually many companies that only do OSS haven’t got that either). So why GitHub? Combine it with LinkedIn (acquired by Microsoft earlier), and you have developers’ resumes. That’s valuable. It’s a strategically smart move, for Microsoft.

Will GitHub users benefit, and if so, how?

Well, I expect there’ll be more hoovering of “useful” (meta)data by a corporation, which some LinkedIn users will find handy, but I think it’s mainly beneficial to Microsoft rather than users, and this type of gathering and combining data is fundamentally incompatible with basic privacy.  It will bite, at some point down the line.  It always does.

Fundamentally, GitHub and its use is self-contradictory.  Git explicitly enables distributed source code control and truly distributed development, whereas GitHub is very much centralised.  Don’t just walk away to something else now, that won’t address the actual problem.  Solving it properly will include having bug tracking as part of a repository, and by design not relying on a single central location, or company.  The developer community (and companies) must resolve this external dependency.

Improvements from NDB wakeup threads


In MySQL Cluster 7.6 we introduced a new thread type in the NDB API.
Traditionally each cluster connection has one send thread to assist in sending messages to NDB and a receive thread that can
assist in receiving messages from NDB.
There is also a connection thread that listens for new connections and connects to the NDB data nodes.

There is also a set of user threads that are created by the application
that uses the NDB cluster connection.

Most of the sending is done by the user threads; the NDB API send
thread is used only when we are sending faster than the network
is able to handle.

We can process receive messages in the user threads or in the NDB
API receive thread. The default behaviour is to use the user threads
until the concurrency is higher than 8 threads working at the same
time. So at the highest concurrency it is the receive thread that
handles the signals and at low concurrency it is handled directly
by the NDB API user threads. The 8 is configurable through the
MySQL server variable ndb_recv_thread_activation_threshold.
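As a sketch, the threshold can be set in my.cnf; the value shown below is just the default, and lowering it makes the receive thread take over at lower concurrency:

# my.cnf (excerpt)
[mysqld]
# number of concurrently active user threads before the NDB API receive thread takes over
ndb_recv_thread_activation_threshold=8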

Receiving in the NDB API is slightly faster when done from the user threads
and only one thread is active. It gives 3-4% better response time in
this particular case. However, as more and more threads are sending
data to the NDB data nodes the efficiency of using the NDB API
receive thread increases.

One problem in using the NDB API receive thread is that it is responsible
both for receiving the messages from the NDB data nodes and for waking up the
NDB API user threads. At low load this is not an issue. But when the
load on the NDB API receive thread reaches 90% and beyond, this becomes
an issue.

To avoid this problem we added a new thread in the NDB API in MySQL Cluster 7.6.
This is the wakeup thread. This thread has only one duty: to wake up
other threads. We experimented with a number of different variants to see which
ensured that user threads are woken up as quickly as possible.

Our conclusion was that at low load the optimal is that the receive thread
handles the wakeup, but at very high load it requires assistance from one
wakeup thread. As load increases the receive thread will handle less and less
wakeups. At 99-100% load the receive thread will more or less offload all
wakeup calls to the wakeup thread.

In the figure above we compare a normal sysbench OLTP RW experiment
comparing 7.5.9 with 7.6.6. As can be seen there is no difference until
we reach 32 connections. As we start to offload a subset of the wakeups
to the wakeup thread we improve performance of the application.

The throughput increases by 5% due to this new feature, and with even more
threads the performance drops more slowly, such that we gain 15-20% more
performance at 512 connections.

The best performance is normally achieved by using the NDB API
receive thread with this thread locked to a specific CPU.
When starting the MySQL server one specifies these CPUs in the
configuration parameter ndb_recv_thread_cpu_mask. If the MySQL
Server uses several NDB cluster connections, the parameter
should specify one CPU per cluster connection.

When locking the NDB API receive thread to a CPU, it is important to
also lock the MySQL server process to other CPUs and if other processes
are running on the same machine, these also need to be locked to
CPUs not interfering with the NDB API receive thread.
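A hedged sketch of such a setup; the CPU numbers are illustrative, and ndb_recv_thread_cpu_mask takes a hexadecimal CPU bitmask with one CPU per cluster connection:

# my.cnf (excerpt)
[mysqld]
# lock the NDB API receive thread to CPU 0
ndb_recv_thread_cpu_mask=0x1

# keep the rest of the MySQL Server off that CPU, for example by starting mysqld under
# taskset -c 1-30 (or an equivalent numactl/cgroup binding) so only the receive thread uses CPU 0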

The figures above show the improvements when using one of the CPU
cores locked to handle the NDB API receive thread. Locking the receive
thread to a CPU adds another 5% to the total throughput and up to
20% more at high thread counts.

So what we have achieved with MySQL Cluster 7.6 is that we can increase
the throughput by at least 10% and performance at high thread counts
can increase by as much as 40%. All these numbers are still using the
TCP transporter. In a coming blog we will show how these numbers increase
even more when using the shared memory transporter. In addition we will
show how using the thread pool with NDB can even further increase stability
of high throughputs at high thread counts.

The above experiment was always done with one data node using 8 LDM
threads, the data node is locked to CPUs within one CPU socket. The
MySQL Server is locked to using 30 CPUs (15 CPU cores). In all cases
the bottleneck is that we only use one cluster connection. In 7.5.9 this
cluster connection scales to about 18 CPUs and with 7.6.6 it scales to
more than 20 CPUs. So using one cluster connection per 8 CPU cores is
usually appropriate.

I will be presenting at OpenExpoEurope in Madrid on Wednesday

This Wednesday, 6th June 2018, I will be presenting at OpenExpo Europe in Madrid: Experiences with testing dev MySQL versions and why it’s good for you.  It’s always good to test new releases of MySQL and this presentation talks about why.  If you are at the conference then please come along and say hello.  

Troubleshooting MySQL Concurrency Issues with Load Testing Tools Webinar: Q & A

In this blog, I will provide answers to the Q & A for the Troubleshooting MySQL Concurrency Issues with Load Testing Tools webinar.

First, I want to thank everybody for attending my May 23, 2018, webinar on troubleshooting tools for MySQL. The recording and slides for the webinar are available here. Below is the list of your questions that I was unable to answer fully during the webinar.

Q: What do you recommend for benchmarking NDB cluster? Which should be used and how?

A: The issue with benchmarking NDB cluster is not the tool choice, but the limitations that this product has compared to “regular” InnoDB MySQL installations. You can use any tool you like (Sysbench!), just make sure you are not running a test case that can be affected by NDB limitations (full list, performance). Or, at least, adjust your expected results for such a test. You may check these slides by former MySQL Product Manager Mat Keep. He covers benchmarks too. Just note that NDB cluster is improving over the years and you need to consult the current documentation to find out which limitations have already been relaxed.

Q: Can SysBench coordinate load testing in a multi-master scenario, or master and slave, at the same time? Or does it have a single connection?

A: SysBench version > 1.0 can open more than one connection per test (and even per thread). But you will need to write your own custom tests to coordinate load testing in a multiple server scenario.

Q: What do you think about the function BENCHMARK()? Compared with mysqlslap?

A: Function BENCHMARK just evaluates the scalar expression whilst mysqlslap runs the SQL query.

BENCHMARK() can only be used to test how fast a function or a subquery returning a single value executes. This limits its usage to only a small number of real-life use cases. It may help to test, say, whether a function is efficient enough for calculating expressions. It runs everything in the same thread, therefore you cannot use it for testing concurrency issues unless you start it in multiple clients.

mysqlslap is a very basic tool, but it can run SQL in multiple threads, which makes its tests closer to real-life scenarios.
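To illustrate the difference, a minimal sketch (the expression, query, schema and connection options are placeholders): BENCHMARK() times an expression many times inside a single connection, while mysqlslap replays a query from many concurrent clients:

-- single-threaded: evaluate the expression 1,000,000 times in the current connection
SELECT BENCHMARK(1000000, MD5('some string'));

# multi-threaded: run the same query from 50 concurrent clients, 10 iterations
mysqlslap --user=root --password --create-schema=test \
  --concurrency=50 --iterations=10 --query="SELECT MD5('some string')"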

The post Troubleshooting MySQL Concurrency Issues with Load Testing Tools Webinar: Q & A appeared first on Percona Database Performance Blog.
