Planet MySQL

MySQL 8.0: changing configuration easily and cloud friendly!

Changing configuration settings in MySQL wasn’t always easy. Of course it’s possible to change things (hopefully), but keeping track of everything is not always obvious. This is where configuration management systems like Puppet, Chef, Ansible, … excel at making our lives easier.

/etc/my.cnf

With MySQL 8.0, we worked in making your life easier (and the life of configuration management systems easier too).

Let’s first illustrate the problem very naively:
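A minimal sketch of such a session, using max_connections as an example (values are illustrative):

mysql> SET GLOBAL max_connections = 500;
Query OK, 0 rows affected

mysql> SELECT @@global.max_connections;
-- 500

-- ... restart mysqld from the operating system ...

mysql> SELECT @@global.max_connections;
-- back to the value from /etc/my.cnf (or the compiled-in default)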

As you can see, we can modify this configuration variable, but after a restart of MySQL the change is lost. This behavior, of course, is well known to all MySQL DBAs. The usual solution is then to also modify the configuration file (/etc/my.cnf in most cases).

SET PERSIST

We have added a new feature that allows you to modify a configuration variable and make the change persistent:
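A minimal sketch, again with max_connections as the example variable:

mysql> SET PERSIST max_connections = 500;
Query OK, 0 rows affected

mysql> RESTART;
Query OK, 0 rows affected

-- reconnect after the restart
mysql> SELECT @@global.max_connections;
-- still 500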

Above, we can see that the change persists even after a restart. We have added SET PERSIST, which changes the variable and stores the new value so it will survive a crash or a restart. I also hope you noticed another new feature: RESTART (since MySQL 8.0.4). Previously, when you had to restart MySQL, you had to do it with access to the operating system or through a GUI made available to you, which is typically the case when you use the cloud. The RESTART command is far more convenient.

SET PERSIST & RESTART

Below you can see a full example of how convenient these two new features in MySQL 8.0 are for the Cloud:
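A sketch of what such a session can look like, using innodb_log_file_size (a read-only variable) as an example; the value is illustrative:

mysql> SET PERSIST_ONLY innodb_log_file_size = 4294967296;
Query OK, 0 rows affected

mysql> RESTART;
Query OK, 0 rows affected

-- reconnect after the restart
mysql> SELECT @@global.innodb_log_file_size;
-- 4294967296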

So you can see that it’s possible to change a read-only variable (using SET PERSIST_ONLY) so that it takes effect after a restart, and to perform all of this without leaving the MySQL client.

To change such a variable, extra privileges are needed:

  • SYSTEM_VARIABLES_ADMIN
  • PERSIST_RO_VARIABLES_ADMIN

If you are curious to know where the changes are stored: in the MySQL datadir there is a new file, mysqld-auto.cnf, that stores all the changes in JSON format:
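For illustration only (the exact JSON structure differs between 8.0 versions), the file could look something like this:

shell> cat /var/lib/mysql/mysqld-auto.cnf
{ "mysql_server": { "max_connections": "500", "innodb_log_file_size": "4294967296" } }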

What’s next ?

In the next version, we will also keep track of who made each change, and where and when it was made:

And, as shown below, it will be possible to use a performance_schema table to query this information from the MySQL client (after a restart):

mysql> select * from performance_schema.variables_info where variable_source like 'PERSISTED'\G
*************************** 1. row ***************************
  VARIABLE_NAME: innodb_buffer_pool_size
VARIABLE_SOURCE: PERSISTED
  VARIABLE_PATH: /var/lib/mysql/mysqld-auto.cnf
      MIN_VALUE: 5242880
      MAX_VALUE: 9223372036854775807
       SET_TIME: 2018-01-16 18:59:25
       SET_USER: root
       SET_HOST: localhost
*************************** 2. row ***************************
  VARIABLE_NAME: innodb_log_file_size
VARIABLE_SOURCE: PERSISTED
  VARIABLE_PATH: /var/lib/mysql/mysqld-auto.cnf
      MIN_VALUE: 4194304
      MAX_VALUE: 18446744073709551615
       SET_TIME: 2018-01-16 18:58:47
       SET_USER: root
       SET_HOST: localhost

I think these two new features when combined will really improve MySQL usability in the Cloud.

And of course, the question we could now ask ourselves is: what is the best way to configure your MySQL server? There is no best practice yet, and, as almost always when talking about MySQL, it depends…

My own suggestion would be to have the initial configuration (after install and first tuning) in /etc/my.cnf and then set everything from the client (even with configuration management systems). But this is my own opinion.

Meltdown Vulnerability Impact On MariaDB Server

By axel, Wed, 01/24/2018 - 05:44

The recent attacks dubbed Meltdown and Spectre have had a significant impact on various processors and on data security. In response, operating systems require modifications; those modifications can be expensive and lead to potential slowdowns in applications and services. The following explores the performance impact of the Meltdown workaround in the Linux kernel, specifically for MariaDB Server.

Test Environment

All tests were run on our benchmark machine: 2x 8-core Intel Xeon E5 v3 (Haswell) running Ubuntu Server LTS. We compared kernels 4.4.0-104 and 4.4.0-109. 4.4.0-104 is the last Ubuntu kernel in the 4.4.0 line without the KPTI patch. KPTI stands for Kernel Page Table Isolation and is meant as a remedy for the Meltdown attack.

The benchmark is sysbench 1.0 OLTP in read-only and read/write mode. The MariaDB Server version used for this benchmark is MariaDB Server 10.2.11 (GA). MariaDB Server was running mostly with defaults. The InnoDB buffer pool was configured large enough that all reads come from memory. Durability was set as for production (innodb_flush_log_at_trx_commit=1). Storage in this machine is SSD.
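For reference, a sysbench 1.0 OLTP read-only run of this kind might be invoked roughly like this (connection settings, table counts and duration are assumptions, not the exact ones used here):

sysbench oltp_read_only --mysql-socket=/var/run/mysqld/mysqld.sock \
  --mysql-user=sbtest --mysql-password=sbtest \
  --tables=16 --table-size=1000000 --threads=32 --time=300 run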

Since performance often depends on the malloc implementation (the system function that allocates memory for the program), I ran two series of tests: one with the default glibc malloc and a second with tcmalloc (preloaded with the --malloc-lib option of mysqld_safe).
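Preloading tcmalloc via mysqld_safe can be done in my.cnf along these lines (the library path is an assumption and depends on your distribution and tcmalloc package):

[mysqld_safe]
malloc-lib = /usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4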

Results

The numbers shown below are queries per second. Note: higher is better.

OLTP read-only

                   glibc malloc                          tcmalloc
threads    4.4.0      4.4.0 KPTI   change       4.4.0      4.4.0 KPTI   change
1          17111      14894        -12.96%      16091      16039        -0.32%
2          31966      28777        -9.98%       29426      29396        -0.10%
4          56142      53817        -4.14%       54330      56160        +3.37%
8          109148     107049       -1.92%       105223     103910       -1.25%
16         206156     202927       -1.57%       206005     204688       -0.64%
32         279125     259109       -7.17%       290233     278157       -4.16%
64         270800     240219       -11.29%      290560     281810       -3.01%
128        259024     226367       -12.61%      282176     276145       -2.14%
256        262552     232061       -11.61%      284642     277417       -2.54%

The impact of KPTI seems to be much higher at high concurrency. tcmalloc not only delivers better results, it also lessens the impact of KPTI somewhat.

OLTP read/write

                   glibc malloc                          tcmalloc
threads    4.4.0      4.4.0 KPTI   change       4.4.0      4.4.0 KPTI   change
1          4706       4584         -2.59%       4772       4805         +0.69%
2          9407       9283         -1.32%       9620       9515         -1.09%
4          15912      15860        -0.33%       16436      15975        -2.80%
8          30531      29200        -4.36%       30786      32298        +4.91%
16         61954      61102        -1.38%       63526      63651        +0.20%
32         126154     121657       -3.56%       129561     128284       -0.99%
64         192459     180334       -6.30%       198795     195860       -1.48%
128        220700     200473       -9.16%       234564     229417       -2.19%
256        227077     164281       -27.65%      241731     235218       -2.69%

For the read/write workload the KPTI impact is even higher at high thread counts. Again, tcmalloc seems to deliver better results.

Conclusion

The Meltdown patch in the Linux kernel reduces the performance of MariaDB Server by 5% and, in some cases, more than 10%, depending on workload and concurrency. The performance impact is higher with high concurrency. In that case, it may help to try running MariaDB Server with a preloaded high-concurrency memory allocator like tcmalloc, jemalloc or other allocator of choice.

Additional Resources

There are some reports from third parties about KPTI regression for MySQL or general performance impact from Redhat that are worth reviewing. Their results are in line with what we found.



Shinguz: MySQL 8.0.4-rc is out

Yesterday MySQL 8.0.4-rc came out. The Release Notes are quite long.
But caution: Do a BACKUP before upgrading...

I experienced some nice surprises. First I have to admit that I did not read the Release Notes or anything else. Reading manuals is for Girlies! Possibly something is written in there which is of importance. But I expect that it just works as usual...

I downloaded MySQL 8.0.4-rc and just wanted to upgrade my MySQL 8.0.3-rc testing system, where we did the 1M tables test.

First I got:

[MY-011096] No data dictionary version number found.
[MY-010020] Data Dictionary initialization failed.
[MY-010119] Aborting

Hmmm... Maybe something was not clean with the old system. So downgrade again:

[ERROR] [000000] InnoDB: Unsupported redo log format. The redo log was created with MySQL 8.0.4. Please follow the instructions at http://dev.mysql.com/doc/refman/8.0/en/upgrading-downgrading.html
[ERROR] [000000] InnoDB: Plugin initialization aborted with error Generic error
[ERROR] [003957] Failed to initialize DD Storage Engine
[ERROR] [003634] Data Dictionary initialization failed.
[ERROR] [003742] Aborting

OK. Understandable. I should have done a backup before. But backups are for Girlies as well! Anyway, this test system is not important. So I created a new instance from scratch, which finally worked... Possibly just removing the redo log files as indicated would have helped as well.

MySQL 8.0.4, OpenSSL, and MySQL Community Edition

Starting with MySQL Community Edition 8.0.4-RC, we are unifying on OpenSSL as the default TLS/SSL library for both MySQL Enterprise Edition and MySQL Community Edition. Previously, MySQL Community Edition used YaSSL.

Why make this change?
  • Community Requests – Supporting OpenSSL in the MySQL Community Edition has been one of the most frequently requested features.

Making Maxscale/ProxySQL Highly Available ( 2 > 1 )

At Mydbops we implement load balancers using MaxScale or ProxySQL (our presentation) for many of our clients, but these load balancers can become a SPOF (single point of failure). We explored services like HAProxy, Nginx, Keepalived, etc. Except for Keepalived, all of these services need to run on a standalone instance and did not satisfy our exact need.

Keepalived does not require any standalone instance; it can be deployed and configured with minimal effort and provides an HA solution for the DB infrastructure. This approach not only fits our DB setup, we can use the same approach to manage application or frontend server HA as well. It is a very simple and elegant way.

Below we will see how to set up and configure Keepalived for our requirements.

The question arises: how do you load balance the MaxScale or ProxySQL servers themselves without any standalone instance? One way is to use keepalived. In order to use this solution, you need at least two MaxScale or ProxySQL servers. Install keepalived on both of them as explained below. Both servers will share a floating IP. The application servers will attempt to connect to 192.168.33.30 (the floating IP). Depending on which MaxScale or ProxySQL server is the master, the IP will be owned by that server. If that server fails, the backup server will start to issue gratuitous ARP responses for the same IP, 192.168.33.30, and requests from the application servers will then go through the backup MaxScale or ProxySQL server, which has now become the primary.

Note :

  1. ProxySQL has built-in clustering since version 1.4 to synchronise the configuration. ProxySQL offers more fine-grained configuration compared with other SQL load balancers.
  2. MaxScale has a config-file-based setup. The configuration has to be synced manually across nodes (but configuration changes are minimal on a production setup).

Install Keepalived on both Master and Slave LB’s

yum install keepalived -y

Configure Keepalived on Master

vim /etc/keepalived/keepalived.conf

global_defs {
   notification_email {
     keepalive@mydbops.com
   }
   notification_email_from keepalived@mydbops.com
   smtp_server 192.168.1.100
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}

vrrp_script check_maxscale {
   script "/bin/systemctl status maxscale.service"
   interval 2
   fall 2
   rise 2
}

vrrp_instance VI_1 {
   state MASTER
   interface enp0s8
   virtual_router_id 51
   priority 101
   advert_int 1
   authentication {
       auth_type PASS
       auth_pass 1111
   }
   virtual_ipaddress {
       192.168.33.30
   }
   track_script {
       check_maxscale
   }
}

Configure Keepalived on Slave

vim /etc/keepalived/keepalived.conf

global_defs {
   notification_email {
     admin@example.com
   }
   notification_email_from keepalived@example.com
   smtp_server 10.1.1.100
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}

vrrp_script check_maxscale {
   script "/bin/systemctl status maxscale.service"
   interval 2
   fall 2
   rise 2
}

vrrp_instance VI_1 {
   state BACKUP
   interface enp0s8
   virtual_router_id 51
   priority 100
   advert_int 1
   authentication {
       auth_type PASS
       auth_pass 1111
   }
   virtual_ipaddress {
       192.168.33.30
   }
   track_script {
       check_maxscale
   }
}

Start Keepalived on both Master and Slave

systemctl start keepalived

Verifying Keepalived is Running On Master

systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-01-23 13:21:57 UTC; 2h 22min ago
  Process: 3615 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 3616 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─3616 /usr/sbin/keepalived -D
           ├─3617 /usr/sbin/keepalived -D
           └─3618 /usr/sbin/keepalived -D

Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30

Verifying Keepalived Running on Slave

systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-01-23 13:21:30 UTC; 2h 25min ago
  Process: 8682 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 8683 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─8683 /usr/sbin/keepalived -D
           ├─8684 /usr/sbin/keepalived -D
           └─8685 /usr/sbin/keepalived -D

Jan 23 13:25:29 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:54 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) Received advert with higher priority 101, ours 100
Jan 23 13:25:54 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) Entering BACKUP STATE
Jan 23 13:25:54 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) removing protocol VIPs.

As we know, 192.168.33.13 is the MASTER; the floating IP is assigned to enp0s8 while the master is up.

ip add show enp0s8
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 08:00:27:c3:8e:7e brd ff:ff:ff:ff:ff:ff
    inet 192.168.33.13/24 brd 192.168.33.255 scope global enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.33.30/32 scope global enp0s8
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fec3:8e7e/64 scope link
       valid_lft forever preferred_lft forever

Notice in the above configuration the vrrp_script check_maxscale section. This script runs systemctl status maxscale.service on the MASTER. If the return code is ‘0’ then the service is considered to be up. If the return code is anything other than ‘0’ then the service is considered to be down and the BACKUP host will take over the floating IP.

Testing the Failover and How Keepalived Helps Minimize Downtime

For this test, I will ping the floating IP 192.168.33.30 from a local machine and manually stop the MaxScale service, so that Keepalived recognizes the service failure and the slave server takes over the floating IP.

Initiate Ping to Floating IP

ping 192.168.33.30

Stop MaxScale on Master

systemctl stop maxscale
systemctl status maxscale
maxscale.service - MariaDB MaxScale Database Proxy
   Loaded: loaded (/usr/lib/systemd/system/maxscale.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Jan 23 13:25:49 mydbopslabs13 maxscale[3883]: [mysqlmon] No Master can be determined
Jan 23 16:15:12 mydbopslabs13 systemd[1]: Stopping MariaDB MaxScale Database Proxy...
Jan 23 16:15:12 mydbopslabs13 maxscale[3883]: Waiting for housekeeper to shut down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: Finished MaxScale log flusher.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: Housekeeper shutting down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: Housekeeper has shut down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: MaxScale received signal SIGTERM. Exiting.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: MaxScale is shutting down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: MaxScale shutdown completed.
Jan 23 16:15:13 mydbopslabs13 systemd[1]: Stopped MariaDB MaxScale Database Proxy.

Ping Output

ping 192.168.33.30
PING 192.168.33.30 (192.168.33.30) 56(84) bytes of data.
64 bytes from 192.168.33.30: icmp_seq=1 ttl=64 time=0.288 ms
64 bytes from 192.168.33.30: icmp_seq=2 ttl=64 time=0.285 ms
64 bytes from 192.168.33.30: icmp_seq=3 ttl=64 time=0.220 ms
64 bytes from 192.168.33.30: icmp_seq=4 ttl=64 time=0.208 ms
64 bytes from 192.168.33.30: icmp_seq=5 ttl=64 time=0.187 ms
64 bytes from 192.168.33.30: icmp_seq=6 ttl=64 time=0.236 ms
64 bytes from 192.168.33.30: icmp_seq=7 ttl=64 time=0.293 ms
64 bytes from 192.168.33.30: icmp_seq=8 ttl=64 time=0.198 ms
64 bytes from 192.168.33.30: icmp_seq=9 ttl=64 time=0.255 ms
64 bytes from 192.168.33.30: icmp_seq=10 ttl=64 time=0.440 ms
64 bytes from 192.168.33.30: icmp_seq=11 ttl=64 time=0.288 ms
64 bytes from 192.168.33.30: icmp_seq=12 ttl=64 time=0.366 ms
64 bytes from 192.168.33.30: icmp_seq=13 ttl=64 time=0.507 ms
64 bytes from 192.168.33.30: icmp_seq=14 ttl=64 time=0.401 ms
64 bytes from 192.168.33.30: icmp_seq=15 ttl=64 time=0.284 ms
64 bytes from 192.168.33.30: icmp_seq=16 ttl=64 time=0.366 ms
64 bytes from 192.168.33.30: icmp_seq=18 ttl=64 time=0.510 ms
64 bytes from 192.168.33.30: icmp_seq=19 ttl=64 time=0.242 ms
64 bytes from 192.168.33.30: icmp_seq=20 ttl=64 time=0.254 ms
64 bytes from 192.168.33.30: icmp_seq=21 ttl=64 time=0.175 ms
64 bytes from 192.168.33.30: icmp_seq=22 ttl=64 time=0.205 ms
64 bytes from 192.168.33.30: icmp_seq=23 ttl=64 time=0.246 ms
64 bytes from 192.168.33.30: icmp_seq=24 ttl=64 time=0.231 ms
64 bytes from 192.168.33.30: icmp_seq=25 ttl=64 time=0.249 ms
64 bytes from 192.168.33.30: icmp_seq=26 ttl=64 time=0.243 ms

As we can see in the above output, only one packet was dropped (icmp_seq 17 is missing). Within a few seconds the floating IP was assigned to the slave and it started to respond.

Check the Floating IP Assigned to the Slave

ip add show enp0s8
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 08:00:27:a8:1d:d7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.33.12/24 brd 192.168.33.255 scope global enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.33.30/32 scope global enp0s8
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fea8:1dd7/64 scope link
       valid_lft forever preferred_lft forever

Once the master’s MaxScale service is back online, the floating IP is automatically re-assigned to the master server. The same approach can be applied to a ProxySQL cluster too.
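For a ProxySQL pair, a sketch of the equivalent check would simply track the proxysql service instead of maxscale (the service name proxysql.service is an assumption and may differ per installation):

vrrp_script check_proxysql {
   script "/bin/systemctl status proxysql.service"
   interval 2
   fall 2
   rise 2
}

   track_script {
       check_proxysql
   }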

We believe this blog was worth your time. If you have any comments, please share them below.

20-30% Performance Hit from the Spectre Bug Fix on Ubuntu

In this blog post, we’ll look at the performance hit from the Spectre bug fix on Ubuntu.

Recently we measured the performance penalty from the Meltdown fix on Ubuntu servers. It turned out to be negligible.

Today, Ubuntu made a Spectre fix available, shipped in kernel 4.4.0-112. As with the Meltdown fix, we measured the effect of this update. Unfortunately, we observed a major performance penalty on MySQL workloads with this new kernel.

Our benchmark used the following:

System:

  • CPU:
    • 2 x Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (Codename Haswell)
    • /proc/cpuinfo has 48 entries
  • Server
    • Supermicro Motherboard X10DRI
  • Storage
    • NVMe Intel DC 36000
  • Ubuntu 16.04
    • no-fix – kernel 4.4.0-104-generic
    • fix (now GA) – kernel 4.4.0-112-generic
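To reproduce the comparison, one would check the running kernel and install the patched one roughly like this (package name assumed to follow Ubuntu’s usual linux-image naming; reboot into the kernel you want to test):

uname -r                    # e.g. 4.4.0-104-generic
sudo apt-get update
sudo apt-get install linux-image-4.4.0-112-generic
sudo reboot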

With the fix, the Spectre and Meltdown mitigation detection tool v0.27 reports the following:

Checking for vulnerabilities against live running kernel Linux 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: YES
> STATUS: NOT VULNERABLE (115 opcodes found, which is >= 70, heuristic to be improved when official patches become available)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
* Hardware (CPU microcode) support for mitigation: YES
* Kernel support for IBRS: YES
* IBRS enabled for Kernel space: YES
* IBRS enabled for User space: NO
* Mitigation 2
* Kernel compiled with retpoline option: NO
* Kernel compiled with a retpoline-aware compiler: NO
> STATUS: NOT VULNERABLE (IBRS mitigates the vulnerability)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
> STATUS: NOT VULNERABLE (PTI mitigates the vulnerability)

Workload:

  • sysbench 1.0.11 oltp_read_only and oltp_point_select
  • 32 tables, 10 million rows each
  • datasize ~73GB

Server:

Sysbench script:

The results

The results are in transactions per sec (more is better).

In memory (buffer pool 100G)

Connection via TCP

bp   workload            threads   tps no-fix    tps fix      ratio no-fix/fix
100  oltp_read_only      1         810.76        655.34       1.24
100  oltp_read_only      2         1589.66       1277.02      1.24
100  oltp_read_only      8         6233.69       5018.11      1.24
100  oltp_read_only      16        11253.28      9477.18      1.19
100  oltp_read_only      64        22702.29      18564.30     1.22
100  oltp_read_only      128       22281.22      18357.06     1.21
100  oltp_point_select   1         16095.32      12380.20     1.30
100  oltp_point_select   2         32665.97      24907.84     1.31
100  oltp_point_select   8         132480.34     101787.44    1.30
100  oltp_point_select   16        236832.94     189087.10    1.25
100  oltp_point_select   64        498322.41     415631.91    1.20
100  oltp_point_select   128       496661.65     414495.64    1.20

Buffer pool 50G

Connection via TCP

bp   workload            threads   tps no-fix    tps fix      ratio no-fix/fix
50   oltp_read_only      1         683.09        595.63       1.15
50   oltp_read_only      2         1390.70       1143.30      1.22
50   oltp_read_only      8         5262.02       4493.87      1.17
50   oltp_read_only      16        9842.04       8242.02      1.19
50   oltp_read_only      64        21021.20      17644.76     1.19
50   oltp_read_only      128       21526.21      17932.34     1.20
50   oltp_point_select   1         14535.73      11758.57     1.24
50   oltp_point_select   2         28721.43      23277.60     1.23
50   oltp_point_select   8         108422.96     90189.94     1.20
50   oltp_point_select   16        203876.31     167382.92    1.22
50   oltp_point_select   64        447757.48     376506.97    1.19
50   oltp_point_select   128       473894.73     384301.33    1.23

Buffer pool 25G

Connection via TCP

bp   workload            threads   tps no-fix    tps fix      ratio no-fix/fix
25   oltp_read_only      1         542.09        470.88       1.15
25   oltp_read_only      2         1074.54       931.02       1.15
25   oltp_read_only      8         4169.10       3621.79      1.15
25   oltp_read_only      16        7626.30       6716.29      1.14
25   oltp_read_only      64        18206.18      15702.90     1.16
25   oltp_read_only      128       20224.22      16966.13     1.19
25   oltp_point_select   1         11107.73      9294.73      1.20
25   oltp_point_select   2         22486.65      18526.84     1.21
25   oltp_point_select   8         86385.70      73226.44     1.18
25   oltp_point_select   16        161409.65     135689.48    1.19
25   oltp_point_select   64        370809.49     320848.79    1.16
25   oltp_point_select   128       433324.54     358947.61    1.21

 

We can see that in CPU-bound workloads the overhead is 20-25%, reaching up to 30% in point select queries. In IO-bound (25G buffer pool) workloads, the observed overhead is 15-20%.

This is a major performance hit, and you should consider it carefully before upgrading to the new kernel.

One hope is that retpoline kernels will have a much smaller impact.

MariaDB Galera Cluster 5.5.59 now available

By dbart, Tue, 01/23/2018 - 12:24

The MariaDB project is pleased to announce the immediate availability of MariaDB Galera Cluster 5.5.59. See the release notes and changelog for details.

Download MariaDB Galera Cluster 5.5.59

  • Release Notes
  • Changelog
  • What is MariaDB Galera Cluster?



Webinar Wednesday, January 24, 2018: Differences between MariaDB and MySQL

Join Percona’s Chief Evangelist, Colin Charles as he presents Differences Between MariaDB and MySQL on Wednesday, January 24, 2018, at 7:00 am PDT / 10:00 am EDT (UTC-7).

Register Now

Tags: MariaDB, MySQL, Percona Server for MySQL, DBA, SysAdmin, DevOps
Experience Level: Novice

MariaDB and MySQL. Are they syntactically similar? Where do these two query languages differ? Why would I use one over the other?

MariaDB is on the path of gradually diverging from MySQL. One obvious example is the internal data dictionary currently under development for MySQL 8.0. This is a major change to the way metadata is stored and used within the server. MariaDB doesn’t have an equivalent feature. Implementing this feature could mark the end of datafile-level compatibility between MySQL and MariaDB.

There are also non-technical differences between MySQL and MariaDB, including:

  • Licensing: MySQL offers their code as open-source under the GPL, and provides the option of non-GPL commercial distribution in the form of MySQL Enterprise. MariaDB can only use the GPL because they derive their work from the MySQL source code under the terms of that license.
  • Support services: Oracle provides technical support, training, certification and consulting for MySQL, while MariaDB has their own support services. Some people will prefer working with smaller companies, as traditionally it affords them more leverage as a customer.
  • Community contributions: MariaDB touts the fact that they accept more community contributions than Oracle. Part of the reason for this disparity is that developers like to contribute features, bug fixes and other code without a lot of paperwork overhead (and they complain about the Oracle Contributor Agreement). However, MariaDB has its own MariaDB Contributor Agreement — which more or less serves the same purpose.

Colin will take a look at some of the differences between MariaDB and MySQL and help answer some of the common questions our Database Performance Experts get about the two databases.

Register for the webinar now.

Colin Charles, Chief Evangelist

Colin Charles is the Chief Evangelist at Percona. He was previously on the founding team for MariaDB Server in 2009, has worked on MySQL since 2005, and has been a MySQL user since 2000. Before joining MySQL, Colin worked actively on the Fedora and OpenOffice.org projects. He’s well known within many open source communities and speaks on the conference circuit.

CPU Utilization is Not a Useful Metric

Once upon a time, CPU utilization was quite a useful metric. Below is the output of several tools that provide CPU utilization metrics:

top

top reports a load of 1.66.

Is this correct? No. The correct load number is probably closer to 2.4.

# top -b -n 1| head -20
top - 11:27:45 up 151 days, 1:55, 7 users, load average: 1.66, 1.84, 1.88
Tasks: 389 total, 3 running, 386 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.7%us, 20.6%sy, 1.2%ni, 77.3%id, 0.1%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 32639636k total, 32206476k used, 433160k free, 235732k buffers
Swap: 16359420k total, 10285664k used, 6073756k free, 2354840k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
16702 root  20   0 8274m 5.0g 5.0g S 85.1 16.1 59164:55 VirtualBox
 4657 root  20   0  9.8g 5.2g 5.1g S 45.5 16.6 26518:13 VirtualBox
 6239 root  20   0  9.8g 5.1g 5.1g S 39.6 16.5 31200:52 VirtualBox
27070 root  20   0 7954m 5.4g 5.4g S 17.8 17.5 17049:30 VirtualBox
27693 root  20   0 2233m 441m  20m S  5.9  1.4  3407:34 firefox
 7648 root  20   0 6758m 4.1g 4.1g S  4.0 13.2 17069:52 VirtualBox
 6633 root  20   0  368m  63m  31m R  2.0  0.2  1338:58 Xorg
14727 root  20   0 15216 1344  828 R  2.0  0.0  0:00.01 top
    1 root  20   0 19416  932  720 S  0.0  0.0  0:00.90 init
    2 root  20   0     0    0    0 S  0.0  0.0  0:03.53 kthreadd
    3 root  20   0     0    0    0 S  0.0  0.0  2:08.23 ksoftirqd/0
    5 root   0 -20     0    0    0 S  0.0  0.0  0:00.00 kworker/0:0H
    7 root   0 -20     0    0    0 S  0.0  0.0  0:00.00 kworker/u:0H

sar

sar does not show the load average, but does report what it thinks is CPU utilization.

Is it correct? Again, no. Actual idle should be closer 45-50%.

# sar 1 1
Linux 3.8.13-16.2.1.el6uek.x86_64 (myserver.jks.com)  01/22/2018  _x86_64_  (8 CPU)

11:29:32 AM  CPU  %user  %nice  %system  %iowait  %steal  %idle
11:29:33 AM  all   0.88   1.00    17.27     0.00    0.00  80.85
Average:     all   0.88   1.00    17.27     0.00    0.00  80.85

mpstat

mpstat reports per CPU.

Again, these values are not quite correct.

# mpstat -P ALL
Linux 3.8.13-16.2.1.el6uek.x86_64 (myserver.jks.com)  01/22/2018  _x86_64_  (8 CPU)

11:35:49 AM  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
11:35:49 AM  all  0.74   1.19  20.58     0.11  0.00   0.06    0.00    0.00  77.32
11:35:49 AM    0  1.11   1.18  20.24     0.58  0.00   0.48    0.00    0.00  76.42
11:35:49 AM    1  0.88   1.32  22.45     0.08  0.00   0.02    0.00    0.00  75.25
11:35:49 AM    2  0.84   1.34  22.78     0.06  0.00   0.01    0.00    0.00  74.98
11:35:49 AM    3  0.81   1.31  21.69     0.05  0.00   0.00    0.00    0.00  76.15
11:35:49 AM    4  0.64   1.00  16.76     0.05  0.00   0.00    0.00    0.00  81.54
11:35:49 AM    5  0.57   1.11  19.28     0.02  0.00   0.00    0.00    0.00  79.02
11:35:49 AM    6  0.57   1.10  19.46     0.02  0.00   0.00    0.00    0.00  78.85

Finally the venerable uptime command:

uptime

# uptime
11:29:48 up 151 days, 1:57, 7 users, load average: 1.70, 1.81, 1.87

Notice that mpstat and sar both report 8 CPUs, and that is the crux of the problem.

Why is that a problem? It is a problem because this machine does not have 8 CPUs; it has only 4.

The CPU is an Intel i7-4790S with hyperthreading enabled. When hyperthreading is enabled, Linux utilities believe that the number of CPUs is actually twice the number actually present.

In this case it appears to top, sar, mpstat and uptime that there are 8 CPUs, when in reality there are only 4.

What is Hyperthreading?

“But wait; doesn’t hyperthreading double the processing power of my CPU?” you may ask.

Well, no, it doesn’t.

Hyperthreading is a clever bit of technology from Intel that allows the operating system to better take advantage of a CPU during what would otherwise be idle time. Please refer to the references list if you would like more detail.

There are many sources that estimate the performance advantage of enabling hyperthreading versus not enabling it.

A good summary of the rules of thumb of expected performance benefits when hyperthreads are enabled:

Socket Count   Max Benefit %
1              30%
2              15%
3+             testing required

So when the previously noted utilities report that there are 8 CPUs, that is not quite correct, as enabling hyperthreading does not double the number of CPUs.

Given the example i7 processor, the best we can hope for is that this single socket 4 core CPU will provide the equivalent work of approximately 5.6 cores.

8 * ( ( 100 – 30 ) / 100) = 5.6

estimated CPU / reported CPUs = metric adjustment %

In this case:

5.6 / 8 = 0.7

When CPU utilization is reported as 80% idle, the real idle value is closer to 56%:

80 * 0.70 = 56

Load averages can be treated the same way:

The load of 1.66 is actually ~2.4

1.66 / .7 = 2.37

Is hyperthreading enabled?

So by now you probably would like to know how to determine if hyperthreading is enabled.

There are a couple things you need to know to investigate this.

First find out the info about the CPU in question. The following instructions are for Linux.

Start by determining the CPU model. Here is one easy method to find it:

# grep CPU /proc/cpuinfo
model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz
model name : Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz

The next step is to point your browser at https://ark.intel.com/, and then search for the exact CPU model.

Searching for i7-4790S shows there are 4 cores and 8 threads available, so this CPU is capable of hyperthreading.

The next step is to determine if hyperthreads are enabled. Doing so is less straightforward than previous steps.

The following process can be used to determine the actual number of physical cores, and then compare that to the number of cores presented to the OS.

number of physical cores

There are 4 cores in this case

# grep 'core id' /proc/cpuinfo | sort -u
core id : 0
core id : 1
core id : 2
core id : 3

number of processors

8 are shown

# grep 'processor' /proc/cpuinfo | sort -u
processor : 0
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5
processor : 6
processor : 7

The number of reported processors is double the number of physical cores, indicating that hyperthreading is enabled.
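A quicker cross-check on most Linux systems is lscpu, which reports threads per core directly (the output below is what the example i7-4790S machine would be expected to show):

$ lscpu | grep -E 'Thread|Core|Socket|^CPU\(s\)'
CPU(s):                8
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1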

This was tested on another server as well, one with 4 sockets of 10 cores each and hyperthreading known to be enabled.

As there are only 40 physical cores but 80 reported processors, it is clear that hyperthreading is enabled.

$ grep 'core id' /proc/cpuinfo | sort -u| wc -l
10

$ grep 'processor' /proc/cpuinfo | sort -u| wc -l
80

So, what now?

The time for using CPU utilization as a metric to drive for performance improvements is now long past.

CPU technology has advanced so much in the past several years that this metric now has limited usefulness.

Load Averages and CPU utilization may still be useful as barometers on systems where it is known that exceeding a certain threshold indicates there may be some issues to look at.

Other than that, though, these metrics have outlived their usefulness if the goal is to drive performance improvement through monitoring and mitigation of key metrics.

For much more detailed information, please refer to the References section at the end of this blog.

References

Will Hyper-Threading Improve Processing Performance?
CPU Utilization is Wrong
Utilization is Virtually Useless as a Metric!
Linux Load Averages: Solving the Mystery

Understanding – Group Replication and InnoDB Cluster

In this blog post, I’d like to show some relationships of “Group Replication” and “InnoDB Cluster“.  My hope is that I will be able to draw meaningful correlations and attempt to simplify our way of thinking about these setups. The MySQL development team released the first GA release of Group Replication in MySQL 5.7.17, which turns… Read More »

About USE/FORCE/IGNORE INDEX

USE/FORCE/IGNORE INDEX syntax, or index hints, are nice shortcuts to make sure that MySQL will (or will not) use a certain index. But they come with some drawbacks:

USE/FORCE INDEX will not allow MySQL to use an index not mentioned in the list

This could be by design, though in the case of USE INDEX it sounds weird to me. Why? Because if none of the indexes mentioned in the list is usable, a full table scan will happen.
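A minimal sketch with hypothetical table and index names: the hint list only names an index that does not help the WHERE clause, so the likely plan is a full table scan even though a useful index exists on the table:

CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT,
  created_at DATETIME,
  KEY idx_customer (customer_id),
  KEY idx_created (created_at)
);

-- USE INDEX limits the optimizer to idx_created; the filter is on customer_id,
-- so idx_customer cannot be chosen and a full scan is the probable outcome
EXPLAIN SELECT * FROM orders USE INDEX (idx_created) WHERE customer_id = 42;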

Why is this a problem? Because in the real world queries are generated dynamically and evolve over time. Today’s optimisations could be tomorrow’s wrong hints. I had a case where a wrong USE INDEX prevented the use of the primary key.

Produces an error if the index doesn’t exist

Again, this could be by design, but in the case of IGNORE INDEX this does not seem ideal to me. A warning would be much better. Even better, I’d like to have this behaviour governed by a variable.

Why is this a problem? Because indexes can be deleted. Maybe it’s because queries change, maybe it’s because they were wrong from the start (possibly not even created by a DBA). Either way, dropping an index can generate errors for existing applications.
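Continuing the hypothetical example above, referencing an index that has been dropped fails outright rather than producing a warning, with an error along these lines:

mysql> SELECT * FROM orders IGNORE INDEX (idx_dropped) WHERE customer_id = 42;
ERROR 1176 (42000): Key 'idx_dropped' doesn't exist in table 'orders'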

Unfortunately, IMHO, documenting the usage of such hints is too difficult.

Federico

Replication Performance Enhancements in MySQL 8

Although it feels like it was only yesterday that Oracle released version 5.7 of their acclaimed MySQL Community Server, version 8 is already available as a development milestone release (DMR). No, you didn’t sleep through a bunch of releases; MySQL is jumping several versions in its numbering due to 6.0 being dropped and 7.0 being reserved for the clustering version of MySQL. This new version boasts numerous changes (and bug fixes), one of the most exciting of which are replication enhancements. This blog will provide an overview of the new replication enhancements, including new replication timestamps, additional information reported by performance schema tables, and how replication delay has been reduced by updating the relationship between replication threads to make them more efficient.

New Replication Timestamps

The most common task when managing a replication process is to ensure that replication is, in fact, taking place and that there were no errors between the slave and the master. The primary statement for this is SHOW SLAVE STATUS, which provides status information on essential parameters of the slave threads. Hence, you must execute it on each slave. Here’s some sample output:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: localhost
                  Master_User: root
                  Master_Port: 13000
                Connect_Retry: 60
              Master_Log_File: master-bin.000002
          Read_Master_Log_Pos: 1307
               Relay_Log_File: slave-relay-bin.000003
                Relay_Log_Pos: 1508
        Relay_Master_Log_File: master-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                          ...
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
ETC...

One of the many output fields is the Seconds_Behind_Master. While perfectly suitable for a simple master-slave setup, this metric is insufficient for more complex replication scenarios. The Seconds_Behind_Master metric has four main drawbacks:

  1. It only reports the delay between the slave and the top-most master. For instance, in a chained replication setup, the Seconds_Behind_Master reports the delay relative to the original master and does not provide any information regarding the lag between the slave and its nearest – i.e. immediate – master.
  2. It is relative to the original master’s time zone. As a result, replication between servers in different time zones causes the measured delay to be offset by the time zone difference between the two servers.
  3. Lag is measured on a per-event basis, based on the statement’s execution start time. A more insightful measure would be per-transaction, from the time the transaction was actually committed on the master.
  4. The timestamp used to measure the replication lag offers a precision only up to the nearest second.

MySQL 8 introduces two new timestamps that complement the Seconds_Behind_Master metric and circumvent the above issues. These are associated with the global transaction identifier (GTID) of each transaction (as opposed to each event) written to the binary log. A GTID is a unique identifier created and associated with each transaction committed on the server of origin (the master). This identifier is unique not only to the server on which it originated but across all servers in a given replication setup. Being associated with a transaction, there is a 1-to-1 mapping between transactions and GTIDs.

The two new timestamps are:

  • original commit timestamp (OCT): the number of microseconds since epoch (i.e. POSIX time/ UNIX time/January 1, 1970/1970-01-01T00:00:00Z) when the transaction was written to the binary log of the original master
  • immediate commit timestamp (ICT): the number of microseconds since epoch when the transaction was written to the binary log of the immediate master

The output of mysqlbinlog displays the new timestamps in two formats:

  1. microseconds from epoch, and
  2. TIMESTAMP format in the user time zone (for better readability)

This snippet from a slave’s binary log shows both timestamps:

#170404 10:48:05 server id 1 end_log_pos 233 CRC32 0x016ce647 GTID last_committed=0 sequence_number=1 original_committed_timestamp=1491299285661130 immediate_commit_timestamp=1491299285843771
# original_commit_timestamp=1491299285661130 (2018-01-04 10:48:05.661130 WEST)
# immediate_commit_timestamp=1491299285843771 (2018-01-04 10:48:05.843771 WEST)
/*!80001 SET @@session.original_commit_timestamp=1491299285661130*//*!*/;
SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1'/*!*/;
# at 288

New Information Reported by Performance Schema Tables

MySQL 8.0 added a few changes to the Performance Schema resulting in better performance and more metrics:

  1. It can instrument server errors
  2. It now supports indexes
  3. It adds new fields to the existing performance schema replication status tables.

Let’s explore each of these in more detail.

Instrumentation of Server Errors

MySQL 8 saw the introduction of five new summary tables to assist in the instrumentation of server errors. These include:

  1. events_errors_summary_by_account_by_error
  2. events_errors_summary_by_host_by_error
  3. events_errors_summary_by_thread_by_error
  4. events_errors_summary_by_user_by_error
  5. events_errors_summary_global_by_error

The error statistics are aggregated by error in all of the above tables. Moreover, each table, with the exception of events_errors_summary_global_by_error, stores errors related to a particular user, host, account, or thread; events_errors_summary_global_by_error contains errors for the entire server.
Table Structure

Each table contains the following fields:

+-------------------+---------------------+------+-----+---------------------+
| Field             | Type                | Null | Key | Default             |
+-------------------+---------------------+------+-----+---------------------+
| ERROR_NUMBER      | int(11)             | YES  |     | NULL                |
| ERROR_NAME        | varchar(64)         | YES  |     | NULL                |
| SQL_STATE         | varchar(5)          | YES  |     | NULL                |
| SUM_ERROR_RAISED  | bigint(20) unsigned | NO   |     | NULL                |
| SUM_ERROR_HANDLED | bigint(20) unsigned | NO   |     | NULL                |
| FIRST_SEEN        | timestamp           | YES  |     | 0000-00-00 00:00:00 |
| LAST_SEEN         | timestamp           | YES  |     | 0000-00-00 00:00:00 |
+-------------------+---------------------+------+-----+---------------------+

Note that:

    • The FIRST_SEEN/LAST_SEEN columns indicate the first and last time a particular error was seen.
    • The SUM_ERROR_RAISED column lists the number of times a particular error is raised.
    • The SUM_ERROR_HANDLED column lists the number of times a particular error is handled.

All errors which were handled in stored programs are counted/aggregated under SUM_ERROR_HANDLED. Meanwhile, SUM_ERROR_RAISED is the number of all other remaining errors which were raised but not handled. Hence, to see the number of times a particular error is encountered on the server, we could do the following:

-- select from an unknown table:
select * from mydb.unknown_table;
ERROR 1146 (42S02): Table 'mydb.unknown_table' doesn't exist

-- look up the error
SELECT * from performance_schema.events_errors_summary_global_by_error
 where ERROR_NAME = 'ER_NO_SUCH_TABLE';
+--------------+------------------+-----------+------------------+-------------------+---------------------+---------------------+
| ERROR_NUMBER | ERROR_NAME       | SQL_STATE | SUM_ERROR_RAISED | SUM_ERROR_HANDLED | FIRST_SEEN          | LAST_SEEN           |
+--------------+------------------+-----------+------------------+-------------------+---------------------+---------------------+
|         1146 | ER_NO_SUCH_TABLE | 42S02     |                1 |                 0 | 2018-01-15 15:15:21 | 2018-01-15 15:15:21 |
+--------------+------------------+-----------+------------------+-------------------+---------------------+---------------------+

Although this example queries the global table, these error statistics can also be retrieved aggregated by user/host/account/thread from their respective tables for more fine-grained statistics.
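For example, a per-user breakdown could be queried like this (a sketch; the by_user table carries the same error columns plus a USER column):

SELECT USER, ERROR_NAME, SUM_ERROR_RAISED
  FROM performance_schema.events_errors_summary_by_user_by_error
 WHERE SUM_ERROR_RAISED > 0
 ORDER BY SUM_ERROR_RAISED DESC;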

Index Support

Since its inclusion in MySQL 5.5, the Performance Schema has grown to 93 tables, some of which store a large amount of data. The added index support greatly increases the efficiency of the performance_schema, resulting in a dramatic speed boost for many monitoring queries.

The performance improvements from indexes can be seen very easily in many of the sys schema queries. For instance, with 1000 idle threads, the query “SELECT * FROM sys.session” drops from 34.70 seconds down to 1.01 seconds – a 30x improvement!

A total of 115 indexes have been added in the performance schema. Unlike the INFORMATION_SCHEMA, the performance schema exposes the data as a storage engine, rather than temporary tables. Whereas the latter are not able to expose indexes that may be utilized by the optimizer, storage engines can. Data access to the performance schema also uses the same (SQL) interface as regular tables, so that it is able to benefit from future improvements to the query optimizer.
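One way to see this in action is to check the indexes a Performance Schema table now exposes and confirm the optimizer uses them (the table is chosen only as an example; exact index names can vary between 8.0 builds):

SHOW INDEX FROM performance_schema.threads;

-- on an indexed build this lookup should be resolved via the THREAD_ID index rather than a full scan
EXPLAIN SELECT * FROM performance_schema.threads WHERE THREAD_ID = 10;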

New Fields Added to Existing Performance Schema Replication Status Tables

Beyond the new server error tables, existing performance_schema tables are also getting some extra fields to help detect and diagnose lag at several points. Each potential lag point in the replication stream maps to its own table:

  • The connection thread that retrieves transactions from the master and queues them in the relay log → replication_connection_status: current status of the connection to the master
  • The coordinator thread that reads the transactions from the relay log and schedules them to a worker queue (when multi-threaded slave mode [MTS] is enabled) → replication_applier_status_by_coordinator: current status of the coordinator thread; only displays information when MTS is enabled
  • The worker threads applying the transactions → replication_applier_status_by_worker: current status of the thread(s) applying transactions received from the master

Nine additional fields were added to each table; they store information about the last transaction the corresponding thread processed and about the transaction that thread is currently processing. This information includes:

    1. the transaction’s GTID
    2. its OCT and ICT (retrieved from the slave’s relay log)
    3. the time the thread started processing it and
    4. in case of the last processed transaction, the time the thread finished processing it.
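
Once populated, these fields make per-transaction lag easy to compute; for example, a rough apply delay per worker could be obtained with something like the following (field names are from the tables described below; the calculation is a sketch):

SELECT WORKER_ID,
       LAST_APPLIED_TRANSACTION,
       TIMEDIFF(LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP,
                LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP) AS apply_delay
  FROM performance_schema.replication_applier_status_by_worker;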
New replication_connection_status Fields

The new replication_connection_status fields report information on the last and current transaction the connection thread queued into the relay log. This includes when it:

  • started queuing the transaction, i.e., when it read the first event from the master and queued it in the relay log
  • for LAST_QUEUED_TRANSACTION, when it finished queuing the last event of the transaction in the relay log

Here are the fields, along with sample data:

Field                                                 Sample Data
LAST_QUEUED_TRANSACTION                               aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1
LAST_QUEUED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP     2018-01-04 12:48:05.661130
LAST_QUEUED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP    2018-01-04 12:48:05.661130
LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP         2018-01-04 12:48:05.674003
LAST_QUEUED_TRANSACTION_END_QUEUE_TIMESTAMP           2018-01-04 12:48:05.697760
QUEUEING_TRANSACTION
QUEUEING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP        0000-00-00 00:00:00.000000
QUEUEING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP       0000-00-00 00:00:00.000000
QUEUEING_TRANSACTION_START_QUEUE_TIMESTAMP            0000-00-00 00:00:00.000000

New replication_applier_status_by_coordinator Fields

When MTS is enabled, the replication_applier_status_by_coordinator table also reports which was the last transaction buffered by the coordinator thread to a worker’s queue, as well as the transaction it is currently buffering. The start timestamp refers to when this thread read the first event of the transaction from the relay log to buffer it to a worker’s queue, while the end timestamp refers to when the last event finished buffering to the worker’s queue.

Here are its fields, along with sample data:

Field                                                    Sample Data
LAST_PROCESSED_TRANSACTION                               aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1
LAST_PROCESSED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP     2018-01-04 12:48:05.661130
LAST_PROCESSED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP    2018-01-04 12:48:05.661130
LAST_PROCESSED_TRANSACTION_START_BUFFER_TIMESTAMP        2018-01-04 12:48:05.674139
LAST_PROCESSED_TRANSACTION_END_BUFFER_TIMESTAMP          2018-01-04 12:48:05.819167
PROCESSING_TRANSACTION
PROCESSING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP         0000-00-00 00:00:00.000000
PROCESSING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP        0000-00-00 00:00:00.000000
PROCESSING_TRANSACTION_START_BUFFER_TIMESTAMP            0000-00-00 00:00:00.000000

New replication_applier_status_by_worker Fields

The table replication_applier_status_by_worker now contains details about the transactions applied by the applier thread, or, if MTS is enabled, by each worker.  The start timestamp refers to when the worker started applying the first event, whereas the end timestamp refers to when the last event of the transaction was applied.

Here are its nine new fields, along with sample data:

Field                                                  Sample Data
LAST_APPLIED_TRANSACTION                               aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1
LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP     2018-01-04 12:48:05.661130
LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP    2018-01-04 12:48:05.661130
LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP         2018-01-04 12:48:05.822463
LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP           2018-01-04 12:48:05.948926
APPLYING_TRANSACTION
APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP         0000-00-00 00:00:00.000000
APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP        0000-00-00 00:00:00.000000
APPLYING_TRANSACTION_START_APPLY_TIMESTAMP             0000-00-00 00:00:00.000000

More Efficient Relationship between Replication Threads

As of MySQL 8.0.1, updates to the replica server will make it more efficient than previous MySQL versions thanks to improvements in the relationship between the replication threads.  Although it’s still early to say with any precision just how the increased efficiency will translate into faster performance, preliminary testing showed a benefit of up to 65 percent.

The core of MySQL replication on the replica side is composed of two threads (sometimes more):

  1. the connection: handles the connection with the master, retrieving the events and queuing them on the relay log.
  2. the applier: reads the queued events from the relay log and applies them to the replica
The Old Arbitration Thread Model

The relationship between connection and applier was problematic when both threads were dealing with the same relay log file, due to “arbitration”.  The relay log file could only be accessed by one thread at a time, resulting in mutual exclusion of the replication threads.  Hence, when the connection was writing to the relay log file, the applier was unable to read content to be applied and had to wait. Likewise, when the applier was reading from the relay log file, the connection was unable to write new content to it and went into an idle state.  The arbitration was necessary to prevent the applier from sending events that were only partially written to the relay log to workers.

While sometimes beneficial on slaves with limited resources, this arbitration was also limiting the scalability of the multi-threaded slave (MTS) applier.

The New Thread Model

As of MySQL 8.0.1, the applier should almost never block the connection anymore, the exception being when the relay log has exceeded its size limit.   Likewise, the connection will not block the applier for already fully queued transaction parts.  To make this solution work, the connection thread keeps updated information about the position in the relay log file of the last fully queued event. The applier now reads from the log up to this position and waits for notification from the connection thread when it’s done writing to the relay log.

Conclusion

This blog provided an overview of MySQL 8.0’s exciting new replication enhancements, including:

  • new replication timestamps: MySQL 8 introduces two new timestamps that complement the Seconds_Behind_Master metric.  These are associated with the global transaction identifier (GTID) of each transaction written to the binary log.  
  • new information reported by performance schema tables: the Performance Schema in version 8.0 can now instrument server errors, supports indexes and adds new fields to the existing performance schema replication status tables.
  • more efficient relationship between replication threads: as of MySQL 8.0.1, updates to the replica server will make it more efficient than previous MySQL versions thanks to improvements in the relationship between the replication threads. Preliminary testing showed a benefit of up to 65 percent!

All of these changes will make for much-improved performance monitoring. Once MySQL 8 goes into production, we’ll get a chance to gauge just how much.


MariaDB 5.5.59 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.59. This is a stable (GA) release. See the release notes and changelog for details.

Download MariaDB 5.5.59

  • Release Notes
  • Changelog
  • What is MariaDB 5.5?
  • MariaDB APT and YUM Repository Configuration Generator

Thanks, and enjoy MariaDB!


MariaDB Server 5.5.59 now available

By dbart, Fri, 01/19/2018 - 11:07

The MariaDB project is pleased to announce the immediate availability of MariaDB Server 5.5.59. See the release notes and changelog for details and visit mariadb.com/downloads to download.

Download MariaDB Server 5.5.59

  • Release Notes
  • Changelog
  • What is MariaDB 5.5?



Shinguz: Short term notice: Percona XtraDB Cluster training in English 7/8 February 2018 in Germany

FromDual offers, on short notice, a Percona XtraDB Cluster and MySQL Galera Cluster training (2 days) in English.

The training will take place in the Linuxhotel in Essen/Germany on February 7/8 2018.

There are already enough registrations so it is certain the training will take place. But there are still free places for some additional participants.

You can book online at the Linuxhotel.

Important: the Linuxhotel is nearly fully booked, so accommodation will be in nearby locations. The Linuxhotel will recommend some to you.

The training is in English.

You can find the contents of this 2-day Percona XtraDB Cluster training here.

If you have any question please do not hesitate to contact us.


Enhanced replication monitoring in ProxySQL 1.4.4

ProxySQL and pt-heartbeat

The Percona toolkit provides a very useful script for more accurate replication lag monitoring called pt-heartbeat. The pt-heartbeat script provides replication lag metrics based on actual data that has been replicated and is more accurate than the Seconds_Behind_Master metrics gathered from the output of SHOW SLAVE STATUS. Seconds_Behind_Master relies purely on system time (i.e. current time - time the event executed on the master server). By default ProxySQL relies on the output of the SHOW SLAVE STATUS command in order to identify replicas that should be excluded from the backend connection pool when max_replication_lag has been configured for a server in the mysql_servers table.

Starting from ProxySQL version 1.4.4, a new feature leverages the data provided by pt-heartbeat for much more precise replication lag monitoring. All that is required is to configure mysql-monitor_replication_lag_use_percona_heartbeat with the database and table name that pt-heartbeat is writing to.

This tutorial is a simple how-to for setting up pt-heartbeat and using the data it provides for ProxySQL's replication lag monitoring. Before proceeding, make sure the Percona Toolkit is installed on your system; instructions can be found here. After the Percona Toolkit has been installed (either on a monitoring server or on your master server), you can proceed with the following steps:

  • Create a percona database (on the master) to store the heartbeat data:

    $ mysql
    mysql> CREATE DATABASE IF NOT EXISTS percona;
    Query OK, 1 row affected, 1 warning (0.54 sec)
    mysql> exit;
    Bye
  • Now we're ready to kick off the pt-heartbeat process:

    $ pt-heartbeat -D percona --update -h localhost --create-table --daemonize
  • Let's go ahead and verify that the pt-heartbeat data is being written and is replicating to the slave:

    $ mysql
    mysql> select * from percona.heartbeat;
    +----------------------------+-----------+------------------+----------+-----------------------+---------------------+
    | ts                         | server_id | file             | position | relay_master_log_file | exec_master_log_pos |
    +----------------------------+-----------+------------------+----------+-----------------------+---------------------+
    | 2018-01-18T17:13:42.000790 | 192168111 | mysql-bin.000020 | 51025284 | mysql-bin.000012      | 40774236            |
    +----------------------------+-----------+------------------+----------+-----------------------+---------------------+
    1 row in set (0.00 sec)

    mysql> select * from percona.heartbeat;
    +----------------------------+-----------+------------------+----------+-----------------------+---------------------+
    | ts                         | server_id | file             | position | relay_master_log_file | exec_master_log_pos |
    +----------------------------+-----------+------------------+----------+-----------------------+---------------------+
    | 2018-01-18T17:13:44.001590 | 192168111 | mysql-bin.000020 | 51026250 | mysql-bin.000012      | 51026250            |
    +----------------------------+-----------+------------------+----------+-----------------------+---------------------+
    1 row in set (0.00 sec)

    We can see from the above output that the pt-heartbeat data is being written on the master and is replicating to the slave as well so now we're ready to configure the ProxySQL host.

  • Connect to the ProxySQL Admin interface:

    $ mysql -uadmin -padmin -h127.0.0.1 -P6032 --prompt 'Admin> '
    mysql: [Warning] Using a password on the command line interface can be insecure.
    Your MySQL connection id is 4
    Server version: 5.5.30 (ProxySQL Admin Module)
    Admin>
  • The next step is to enable the replication lag check on all or some of the servers, as desired:

    Admin> SELECT * FROM mysql_servers;
    +--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
    | hostgroup_id | hostname      | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
    +--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
    | 10           | ubuntu-vm-nv1 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
    | 20           | ubuntu-vm-nv2 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
    +--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
    2 rows in set (0.03 sec)

    Admin> update mysql_servers set max_replication_lag = 30;
    Query OK, 2 rows affected (0.00 sec)

    Admin> save mysql servers to disk; load mysql servers to runtime;
    Query OK, 0 rows affected (0.40 sec)
    Query OK, 0 rows affected (0.00 sec)

    Here the threshold has been set to 30 seconds on all servers; feel free to adjust it to whatever granularity your checks require. Remember that the frequency of the check is controlled by the mysql-monitor_replication_lag_interval variable, and make sure that max_replication_lag is not smaller than mysql-monitor_replication_lag_interval.

  • Finally configure ProxySQL to use pt-heartbeat data:

    Admin> select @@mysql-monitor_replication_lag_use_percona_heartbeat;
    +-------------------------------------------------------+
    | @@mysql-monitor_replication_lag_use_percona_heartbeat |
    +-------------------------------------------------------+
    |                                                       |
    +-------------------------------------------------------+
    1 row in set (0.00 sec)

    Admin> set mysql-monitor_replication_lag_use_percona_heartbeat = 'percona.heartbeat';
    Query OK, 1 row affected (0.00 sec)

    Admin> save mysql variables to disk; load mysql variables to runtime;
    Query OK, 93 rows affected (0.30 sec)
    Query OK, 0 rows affected (0.00 sec)

ProxySQL is now monitoring replication lag using Percona's pt-heartbeat!
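If you want to confirm that the monitor is actually using the heartbeat data, a quick sketch follows. The check-interval variable and the monitor log table are standard ProxySQL admin objects, but the exact column names may differ slightly between versions, so treat this as a guide rather than gospel:

    Admin> select @@mysql-monitor_replication_lag_interval;   -- check frequency, in milliseconds
    Admin> SELECT hostname, port, repl_lag, error
           FROM monitor.mysql_server_replication_lag_log
           ORDER BY time_start_us DESC
           LIMIT 5;

A populated repl_lag column with no error means the heartbeat-based check is supplying the values that ProxySQL compares against max_replication_lag.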

Authored by: Nick Vyzas

Percona Monitoring and Management (PMM) 1.6.0 Is Now Available

Percona announces the release of Percona Monitoring and Management (PMM) 1.6.0. In this release, Percona Monitoring and Management Grafana metrics are available in the Advanced Data Exploration dashboard. We’ve improved the integration with MyRocks, and its data is now collected from SHOW GLOBAL STATUS.

The MongoDB Exporter now features two new metrics: mongodb_up, to report whether the MongoDB Server is running, and mongodb_scrape_errors_total, reporting the total number of errors when scraping MongoDB.

In this release, we’ve greatly improved the performance of the mongodb:metrics monitoring service.

Percona Monitoring and Management (PMM) 1.6.0 also includes Grafana 4.6.3, which fixes bugs in the alert list and the alerting rules. More information is available in Grafana's change log.

New Features
  • PMM-1773: PMM Grafana specific metrics have been added to the Advanced Data Exploration dashboard.
Improvements
  • PMM-1485: Updated MyRocks integration: MyRocks data is now collected entirely from SHOW GLOBAL STATUS, and we have eliminated SHOW ENGINE ROCKSDB STATUS as a data source in mysqld_exporter.
  • PMM-1895: Update Grafana to version 4.6.3:
    • Alert list: Now shows alert state changes even after adding manual annotations on dashboard #9951
    • Alerting: Fixes bug where rules evaluated as firing when all conditions were false and using OR operator. #9318
  • PMM-1586: The mongodb_exporter exposes two new metrics: mongodb_up, indicating whether the MongoDB Server is running, and mongodb_scrape_errors_total, reporting the total number of times an error occurred when scraping MongoDB.
  • PMM-1764: Various small mongodb_exporter improvements.
  • PMM-1942: Improved the consistency of using labels in all Prometheus-related dashboards.
  • PMM-1936: Updated the Prometheus dashboard in Metrics Monitor.
  • PMM-1937: Added the CPU Utilization Details (Cores) dashboard to Metrics Monitor.
Bug fixes
  • PMM-1549: Broken default auth db for mongodb:queries
  • PMM-1631: In some cases, percentage values were displayed incorrectly for MongoDB hosts.
  • PMM-1640: RDS exporter: simplify configuration
  • PMM-1760: After the mongodb:metrics monitoring service was added, the usage of CPU considerably increased in QAN versions 1.4.1 through 1.5.3.

    1.5.0 – CPU usage 95%
    1.5.3 – CPU usage 85%
    1.6.0 – CPU usage 1%

  • PMM-1815: QAN could show data for a MySQL host when a MongoDB host was selected.
  • PMM-1888: In QAN, query metrics were not loaded when the QAN page was refreshed.
  • PMM-1898: In QAN, the Per Query Stats graph displayed incorrect values for MongoDB
  • PMM-1796: In Metrics Monitor, the Top Process States Hourly graph from the MySQL Overview dashboard showed incorrect data.
  • PMM-1777: In QAN, the Load column could display incorrect data.
  • PMM-1744: The Please provide AWS access credentials error appeared even though the provided credentials could be processed successfully.
  • PMM-1676: In preparation for migration to Prometheus 2.0 we have updated the System Overview dashboard for compatibility.
  • PMM-1920: Some standard MySQL metrics were missing from the mysqld_exporter  Prometheus exporter.
  • PMM-1932: The Response Length metric was not displayed for MongoDB hosts in QAN.

This Week in Data with Colin Charles 24: more Meltdown, FOSDEM, Slack and reminiscing

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

There is still much going on when it comes to Meltdown/Spectre in our world. Percona’s Vadim Tkachenko and Alexey Stroganov recently published Does the Meltdown Fix Affect Performance for MySQL on Bare Metal?. You’ll also want to read Mark Callaghan’s excellent work on this: Meltdown vs MySQL part 1: in-memory sysbench and a core i3 NUC, XFS, nobarrier and the 4.13 Linux kernel, Meltdown vs MySQL part 2: in-memory sysbench and a core i5 NUC, and Meltdown vs storage. If you’re looking at this from a Cassandra standpoint, do read Meltdown’s Impact on Cassandra Latency. SolarWinds (formerly a sponsor of Percona Live) has also released a statement on Meltdown/Spectre: Meltdown/Spectre fixes made AWS CPUs cry, says SolarWinds.

From a FOSDEM standpoint (it’s just a few weeks away; I hope to meet you there), don’t forget that the community dinner tickets are now on sale, and the dinner happens on Friday 2 February 2018. Remember that the FOSDEM room for MySQL and friends is on Sunday 4 February 2018. And you won’t want to miss Peter Zaitsev’s talk on Saturday; do read the Interview with Peter Zaitsev – MySQL: Scaling & High Availability Production experience for the last decade.

Slack is becoming popular for database related discussions. You can join the MongoDB Community, and it’s a lot more active than the IRC channel on freenode. There is also a MySQL Community on Slack! Currently, the MongoDB community has 927 people in their #general channel, compared to the MySQL channel with 85 people. Will we see MariaDB Server have a Slack channel? Percona?

This past week has been an interesting one for the MySQL world – the former CEO posted a little photo and message to Facebook. It’s a public post, hence I am linking to it. It reads, “10 years ago! What memories. A fantastic team. Such a great welcoming by Sun. MySQL did well as part of Sun, and has continued to do so as part of Oracle. Thank you, all you who did it!”. I was in Orlando, Florida when this happened. It was an amazing few days. A USD$1 billion exit may seem small today, but in January 2008 it was all the rage (keep in mind we were preparing for an IPO). We may not have created a MySQL mafia the way PayPal managed to (too many characters, egos, and concentrated wealth?), but look at how far the ecosystem has come since: forks, branches, usage. All I can say is that it was an extreme privilege to be part of the journey and the ride.

Releases

Link List

Upcoming appearances
  • FOSDEM 2018 – Brussels, Belgium – February 3-4 2018
  • SCALE16x – Pasadena, California, USA – March 8-11 2018
Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

Does the Meltdown Fix Affect Performance for MySQL on Bare Metal?

In this blog post, we’ll look at whether the Meltdown fix affects performance for MySQL on bare metal servers.

Since the news about the Meltdown bug broke, there have been a lot of reports on the performance hit from the proposed fixes. We have looked at how the fix affects MySQL (Percona Server for MySQL) under a sysbench workload.

In this case, we used bare metal boxes with the following specifications:

  • Two-socket Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz (in total 56 entries in /proc/cpuinfo)
  • Ubuntu 16.04
  • Memory: 256GB
  • Storage: Samsung SM863 1.9TB SATA SSD
  • Percona Server for MySQL 5.7.20
  • Kernel (vulnerable) 4.13.0-21
  • Kernel (with Meltdown fix) 4.13.0-25

Please note, the current kernel for Ubuntu 16.04 contains only a Meltdown fix, and not one for Spectre.

We validated the kernels with the https://github.com/speed47/spectre-meltdown-checker tool. The database size is 100GB, in a sysbench workload with 100 tables of 4 million rows each and a Pareto distribution.
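For reference, the checker is a standalone shell script; running it on each kernel is enough to confirm the Meltdown mitigation status. The invocation below is a sketch of the tool's documented usage, not output captured for this benchmark:

    $ wget https://raw.githubusercontent.com/speed47/spectre-meltdown-checker/master/spectre-meltdown-checker.sh
    $ sudo sh spectre-meltdown-checker.sh
    # look at the CVE-2017-5754 (Meltdown) line: it should read NOT VULNERABLE on the
    # 4.13.0-25 kernel with PTI enabled, and VULNERABLE on 4.13.0-21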

We used both a Unix socket connection and a TCP host connection to measure possible overhead from the TCP network connection, and we ran both read-write and read-only benchmarks.
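For completeness, a data set of 100 tables with 4 million rows each can be prepared with a sysbench step along the following lines; the socket path, user and database name are the ones reused in the reference commands at the end of this post, and the thread count is just an example:

    $ sysbench oltp_read_write.lua --mysql-socket=/tmp/mysql.sock --mysql-user=root \
        --mysql-db=sbtest100t4M --rand-type=pareto --tables=100 --table-size=4000000 \
        --threads=56 prepare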

The results are below for various numbers of threads:

Where:

  • nokpti: kernel without the KPTI patch (4.13.0-21)
  • pti: kernel with the KPTI patch (4.13.0-25), with PTI enabled
  • nopti: kernel with the KPTI patch (4.13.0-25), with PTI disabled
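A quick way to double-check which mode a given boot is actually in is sketched below. This is a generic sketch rather than something from the original benchmark notes, and the exact dmesg wording can vary by kernel build:

    # on a KPTI-enabled kernel the boot log mentions page table isolation
    $ dmesg | grep -i 'page tables isolation'
    [    0.000000] Kernel/User page tables isolation: enabled
    # PTI can be disabled on a patched kernel by adding nopti (or pti=off) to the kernel command line
    $ cat /proc/cmdline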

 

 #  testname  bp         socket       threads       pti     nopti    nokpti  nopti_pct  pti_pct
 1  OLTP_RO   in-memory  tcp_socket         1    709.93    718.47    699.50      -2.64    -1.47
 4  OLTP_RO   in-memory  tcp_socket         8   5473.05   5500.08   5483.40      -0.30     0.19
 3  OLTP_RO   in-memory  tcp_socket        64  21716.18  22036.98  21548.46      -2.22    -0.77
 2  OLTP_RO   in-memory  tcp_socket       128  21606.02  22010.36  21548.62      -2.10    -0.27
 5  OLTP_RO   in-memory  unix_socket        1    750.41    759.33    776.88       2.31     3.53
 8  OLTP_RO   in-memory  unix_socket        8   5851.80   5896.86   5986.89       1.53     2.31
 7  OLTP_RO   in-memory  unix_socket       64  23052.10  23552.26  23191.48      -1.53     0.60
 6  OLTP_RO   in-memory  unix_socket      128  23215.38  23602.64  23146.42      -1.93    -0.30
 9  OLTP_RO   io-bound   tcp_socket         1    364.03    369.68    370.51       0.22     1.78
12  OLTP_RO   io-bound   tcp_socket         8   3205.05   3225.21   3210.63      -0.45     0.17
11  OLTP_RO   io-bound   tcp_socket        64  15324.66  15456.44  15364.25      -0.60     0.26
10  OLTP_RO   io-bound   tcp_socket       128  17705.29  18007.45  17748.70      -1.44     0.25
13  OLTP_RO   io-bound   unix_socket        1    421.74    430.10    432.88       0.65     2.64
16  OLTP_RO   io-bound   unix_socket        8   3322.19   3367.46   3367.34      -0.00     1.36
15  OLTP_RO   io-bound   unix_socket       64  15977.28  16186.59  16248.42       0.38     1.70
14  OLTP_RO   io-bound   unix_socket      128  18729.10  19111.55  18962.02      -0.78     1.24
17  OLTP_RW   in-memory  tcp_socket         1    490.76    495.21    489.49      -1.16    -0.26
20  OLTP_RW   in-memory  tcp_socket         8   3445.66   3459.16   3414.36      -1.30    -0.91
19  OLTP_RW   in-memory  tcp_socket        64  11165.77  11167.44  10861.44      -2.74    -2.73
18  OLTP_RW   in-memory  tcp_socket       128  12176.96  12226.17  12204.85      -0.17     0.23
21  OLTP_RW   in-memory  unix_socket        1    530.08    534.98    540.27       0.99     1.92
24  OLTP_RW   in-memory  unix_socket        8   3734.93   3757.98   3772.17       0.38     1.00
23  OLTP_RW   in-memory  unix_socket       64  12042.27  12160.86  12138.01      -0.19     0.80
22  OLTP_RW   in-memory  unix_socket      128  12930.34  12939.02  12844.78      -0.73    -0.66
25  OLTP_RW   io-bound   tcp_socket         1    268.08    270.51    270.71       0.07     0.98
28  OLTP_RW   io-bound   tcp_socket         8   1585.39   1589.30   1557.58      -2.00    -1.75
27  OLTP_RW   io-bound   tcp_socket        64   4828.30   4782.42   4620.57      -3.38    -4.30
26  OLTP_RW   io-bound   tcp_socket       128   5158.66   5172.82   5321.03       2.87     3.15
29  OLTP_RW   io-bound   unix_socket        1    280.54    282.06    282.35       0.10     0.65
32  OLTP_RW   io-bound   unix_socket        8   1582.69   1584.52   1601.26       1.06     1.17
31  OLTP_RW   io-bound   unix_socket       64   4519.45   4485.72   4515.28       0.66    -0.09
30  OLTP_RW   io-bound   unix_socket      128   5524.28   5460.03   5275.53      -3.38    -4.50

 

As you can see, there is very little difference between runs (within a 3-4% range), which falls within run-to-run variance for this test.

Similar experiments were done on different servers and workloads.

There, too, we see a negligible difference that falls within measurement variance.

Overhead analysis

To understand why we do not see much effect in MySQL (InnoDB workloads), let’s take a look at where we expect the overhead from the proposed fix to show up.

The main overhead is expected on system calls, so let’s test syscall execution on the kernel before and after the fix (thanks to Alexey Kopytov for the idea of how to test this with sysbench).

We will use the following script syscall.lua:

-- syscall.lua: measure the cost of raw system calls via the LuaJIT FFI
local ffi = require("ffi")  -- added for robustness; sysbench may already provide ffi

ffi.cdef[[long syscall(long, long, long, long);]]

-- each sysbench event issues 10000 system calls
function event()
  for i = 1, 10000 do
    ffi.C.syscall(0, 0, 0, 0)
  end
end

Basically, we measure the time for executing 10000 system calls (this will be one event).

To run the benchmark: sysbench syscall.lua --time=60 --report-interval=1 run

And the results are as follows:

  • On the kernel without the fix (4.13.0-21): 455 events/sec
  • On the kernel with the fix (4.13.0-26): 250 events/sec

This means that the time to execute 10000 system calls increased from 2.197ms to 4ms.

While this increase looks significant, it does not have much effect on MySQL (with the InnoDB engine). In MySQL, most system calls are made for IO or network communication.

We can assume that executing 10000 IO events on fast storage takes about 1000ms, so adding an extra 2ms for the system calls corresponds to roughly 0.2% overhead (which is practically invisible in MySQL workloads).

I expect the effect to be much more visible with MyISAM tables cached in OS memory, since in that case data access goes through system calls even when the data is already in memory, and the extra syscall cost would matter far more.

Conclusion:

From our results, we do not see a measurable effect from the KPTI patches (which mitigate the Meltdown vulnerability) on bare metal servers running Ubuntu 16.04 with the 4.13 kernel series.

Reference commands and configs:

sysbench oltp_read_only.lua   {--mysql-socket=/tmp/mysql.sock|--mysql-host=127.0.0.1} --mysql-user=root --mysql-db=sbtest100t4M --rand-type=pareto  --tables=100  --table-size=4000000 --num-threads=$threads --report-interval=1 --max-time=180 --max-requests=0  run

RW:

sysbench oltp_read_write.lua   {--mysql-socket=/tmp/mysql.sock|--mysql-host=127.0.0.1} --mysql-user=root --mysql-db=sbtest100t4M --rand-type=pareto  --tables=100  --table-size=4000000 --num-threads=$threads --report-interval=1 --max-time=180 --max-requests=0  run

mysqld:
Percona Server 5.7.20-19

numactl --physcpubind=all --interleave=all   /usr/bin/env LD_PRELOAD=/data/opt/alexey.s/bin64_5720.ps/lib/mysql/libjemalloc.so.1 ./bin/mysqld --defaults-file=/data/opt/alexey.s/my-perf57.cnf --basedir=. --datadir=/data/sam/sbtest100t4M   --user=root  --innodb_flush_log_at_trx_commit=1 --innodb-buffer-pool-size=150GB --innodb-log-file-size=10G --innodb-buffer-pool-instances=8  --innodb-io-capacity-max=20000 --innodb-io-capacity=10000 --loose-innodb-page-cleaners=8 --ssl=0

My.cnf file:

[mysqld]
user=root
port=3306
innodb_status_file=0
innodb_data_file_path=ibdata1:100M:autoextend
innodb_flush_log_at_trx_commit = 1
innodb_file_per_table = true
innodb_log_buffer_size = 128M
innodb_log_file_size = 10G
innodb_log_files_in_group = 2
innodb_write_io_threads=8
innodb_read_io_threads=8
innodb_io_capacity=15000
innodb_io_capacity_max=25000
innodb_lru_scan_depth=8192
#innodb_buffer_pool_size=${BP}G
innodb_doublewrite=1
innodb_thread_concurrency=0
innodb-checksum-algorithm=strict_crc32
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_purge_threads=8
loose-innodb-page-cleaners=8
innodb_buffer_pool_instances=8
innodb_change_buffering=none
innodb_adaptive_hash_index=OFF
sync_binlog=0
max_connections=5000
table_open_cache=5000
query_cache_type=OFF
thread_cache_size=16
back_log=2000
connect_timeout=15
skip-grant-tables
sort_buffer_size=262144
key_buffer_size=8388608
join_buffer_size=262144
server-id=1
max_connections=50000
skip_name_resolve=ON
max_prepared_stmt_count=1048560
performance_schema=OFF
performance-schema-instrument='wait/synch/%=ON'
innodb_monitor_enable=all
innodb_flush_neighbors=0
metadata_locks_hash_instances=256
table_open_cache_instances=64
