Planet MySQL

How to drop a column in a MySQL table

In this 101 article, I will show how to drop/remove a column from a table in MySQL.

In this article I will use an example table:

CREATE TABLE tb( c1 INT PRIMARY KEY, c2 char(1), c3 varchar(2) ) ENGINE=InnoDB;

To remove a column we will make use of the ALTER TABLE command:

ALTER TABLE tb DROP COLUMN c2;

The command allows you to remove multiple columns at once:

ALTER TABLE tb DROP COLUMN c2, DROP COLUMN c3;

If you are running MySQL 5.6 onwards, you can make this operation online, allowing other sessions to read and write to your table while the operation is being performed:

ALTER TABLE tb DROP COLUMN c2, ALGORITHM=INPLACE, LOCK=NONE;
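To verify the result, you can inspect the table definition afterwards; the dropped column should no longer appear in the output:

SHOW CREATE TABLE tb;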

Reference:

https://dev.mysql.com/doc/refman/5.7/en/alter-table.html
https://dev.mysql.com/doc/refman/5.7/en/innodb-create-index-overview.html

Shinguz: Special MySQL and MariaDB trainings 2018 in English

Due to strong customer demand, FromDual is offering two extra MySQL/MariaDB trainings in 2018 with its training partner The Linuxhotel in Essen (Germany). These trainings are held in English.

  • MariaDB Performance Tuning on 5 and 6 September 2018 (2 days).
  • Advanced MySQL/MariaDB training from 26 to 30 November 2018 (5 days).

More information about the contents of the trainings can be found at Advanced MySQL and MariaDB training.

For conditions and booking: MariaDB Performance Tuning and Advanced MySQL Training.

For specific MariaDB or MySQL on-site Consulting or in-house Training please get in contact with us.


MySQL Streaming Xtrabackup to Slave Recovery

Overview

There are times when I need to restore a slave from a backup of a master or another slave, but too many times I have taken the typical approach of taking the backup on the source server and then copying it to the target server.  This takes a great deal of time, especially as your database grows in size.

These steps detail how to use Netcat (nc) and Percona Xtrabackup (innobackupex) to stream the backup from the source server to the target server, saving a great deal of time by copying the data only once to the desired location.  While the data is streaming to the target server, it is compressed and then uncompressed on the fly, reducing the amount of traffic going across the network by around 85% to 90% (typical backup compression ratios for MySQL InnoDB that I have witnessed are in this range).  I will also provide a simple script that completes these steps, to give you a start at creating your own customized solution.

Requirements

In order to accomplish this task, you need to keep the following items in mind:

  • Netcat (nc) – Application needed on both servers, used for data streaming.
  • Percona Xtrabackup – Application needed on both servers, used to perform the backup.
  • MySQL Access – MySQL access is required in order to take the backup on the source server and to configure replication on the slave.
  • Pigz (optional) – This is only needed if you want to compress and uncompress it on the fly.  If you are going to use this, it’s needed on both servers.
  • Debian – All code and scripts were tested using Debian.  Commands may slightly change with different OS’s.
Steps

Here are the steps that are required to accomplish this task.  The source server is the server where the backup is coming from.  The target server is where the backup is going to.

Step 1. Stop MySQL on Target Server and Clear MySQL Data

On the server that needs to be restored, we will make sure that MySQL is stopped.  Then we will clear out the old data as this will all be replaced with the backup coming from the source server.  The example assumes your MySQL data directory is /var/lib/mysql.

service mysql stop
rm -rf /var/lib/mysql/*

Step 2. Start Listener on Target Server

Now that the target server has its MySQL data directory cleared of its old data, it is ready to receive the new backup directly from the source server. The port 2112 can be changed, but it needs to match in both the source and target commands. The nc command and its options may vary by OS.

nc -l -p 2112 | unpigz -c | xbstream -x -C /var/lib/mysql

Without compression:

nc -l -p 2112 | xbstream -x -C /var/lib/mysql

Step 3. Start Backup on Source Server

The listener is now up on the target server, ready to accept the connection, so we start the backup on the source server to stream to the target server. Update the parallel option to match the number of CPU cores on your server, using the lower core count of the source and target servers. The port 2112 can be changed, but it needs to match in both the source and target commands. The nc command and its options may vary by OS.

innobackupex --stream=xbstream --parallel=4 /tmp | pigz -c --fast | nc -w 2 TARGET_SERVER 2112

Without compression:

innobackupex --stream=xbstream --parallel=4 /tmp | nc -w 2 TARGET_SERVER 2112

Step 4. Prepare Backup on Target Server

Percona XtraBackup requires that you prepare the backup after it has completed, to apply any outstanding logs and get the database ready to be started. Use as much memory on your target server as you can without causing it to go OOM (Out of Memory); as an example, 75% of your total memory is a good starting point if there is nothing else running on the server. On a server with 4G of RAM you could safely set --use-memory to 3G.

innobackupex --use-memory=3G --apply-log /var/lib/mysql

Step 5. Update Ownership and Start MySQL

Now that the apply logs step has completed on your backup, you should be able to update the ownership of the files to mysql:mysql, and then start the MySQL service.

chown -R mysql:mysql /var/lib/mysql
service mysql start

Step 6. Configure Replication

If the source server is a slave, you should be able to just start the new slave, as the positioning will already be configured. If your source server is the master, then you have to determine whether you are using GTID or legacy replication. If you are using GTIDs, you should be able to start replication with gtid_purged set and the master auto position parameter. If you are using legacy replication, you can find the master log and position in the xtrabackup_binlog_info file in your backup directory. In this scenario, the backup directory is the MySQL data directory (/var/lib/mysql/) on the target server.
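For reference, xtrabackup_binlog_info is a small tab-separated text file; its contents look roughly like this (the values below are illustrative, not from a real backup):

mysql-bin.000007	1538	7c342f62-6b33-11e8-a50c-080027c77e18:1-880

The first column is the binary log file, the second the log position, and the third (present when GTIDs are enabled) the GTID set that gets fed into gtid_purged.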

GTID Replication

mysql -e "CHANGE MASTER TO MASTER_HOST='<MASTER_SERVER>', MASTER_USER='<REPL_USER>', MASTER_PASSWORD='<REPL_PASS>', MASTER_AUTO_POSITION = 1; START SLAVE;"

Legacy Replication

cat /var/lib/mysql/xtrabackup_binlog_info

mysql -e "CHANGE MASTER TO MASTER_HOST='<MASTER_SERVER>', MASTER_USER='<REPL_USER>', MASTER_PASSWORD='<REPL_PASS>', MASTER_LOG_FILE='<LOG_FROM_xtrabackup_binlog_info>', MASTER_LOG_POS=<POSITION_FROM_xtrabackup_binlog_info>; START SLAVE;"

Script

Below you will find the simple script that I came up with to get you started on automating this task to quickly rebuild a slave in your own environment.  The script was created with a lot of assumptions that you may not have in your environment.  Please make sure to update accordingly.

  • The script is run from a separate server that has access to both MySQL servers. It may work when run on one of the MySQL servers, but this has not been tested.
  • The account running the commands on the remote servers has sudo access to run commands.
  • An SSH key is set up to allow SSH access to both MySQL servers with no password prompt.
  • .my.cnf is configured in the home directory of the account being used to SSH to the MySQL servers, allowing the script to run MySQL and Xtrabackup commands with no password prompt.
  • The following software is installed on the MySQL servers: netcat (nc), pigz, xtrabackup.
  • Firewall rules are open for the port used by the netcat stream.
  • All my testing was on Debian servers. With other OSes and Netcat versions there is a -d flag for running nc in the background; on Debian you have to use -p to tell netcat which port to listen on.

Script

#!/bin/bash

SOURCE_SERVER=$1
TARGET_SERVER=$2

# This is if you have /var/lib/mysql on the / (root) drive. You can change this to '/'
# to capture the size of the drive to get an idea of how long is left on your backup.
MOUNT_CONTAINING_DATADIR='/var'

# Should match the number of CPUs on your server
BACKUP_THREADS=2

# Seconds to wait in the loop that checks whether the backup has completed
TIMER=5

# Amount of memory to use to apply the logs to the newly backed up server
MEMORY_GB_USED_APPLY_LOGS='1G'

# Change this to 1 if you want to configure the target as a slave of the source server
CONFIGURE_TARGET_AS_SLAVE_OF_SOURCE=1

GTID_STATUS=''
SSH='ssh -q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'
NETCAT_PORT=2112
REPL_USER='repl'
REPL_PASS='repl_PASSWORD'
MYSQL_DATADIR='/var/lib/mysql'

echo ""
echo "Starting MySQL Streaming Slave Rebuild Script"
echo ""

if [ ! $1 ]; then
  echo "SOURCE Server not set. Please run like :"
  echo ""
  echo "$0 <source_server> <target_server>"
  echo ""
  exit 1
fi

# VALIDATE EXECUTION
echo "Are you sure you wish to perform this rebuild. THIS IS DESTRUCTIVE!!!"
echo "  SOURCE SERVER (Performing Backup) : $SOURCE_SERVER"
echo "  TARGET SERVER (Receiving Backup) : $TARGET_SERVER"
echo "  All files in $MYSQL_DATADIR on $TARGET_SERVER will be DELETED!!!"
echo -n "START NOW ? (y/n) : "
read CONFIRM
if [ $CONFIRM == "y" -o $CONFIRM == "Y" ]; then
  echo ""
  echo "STARTING REBUILD!!!"
else
  echo "Y or y was not chosen. Exiting."
  exit 1
fi

# PREPARE TARGET
echo "  Stopping MySQL"
$SSH $TARGET_SERVER "service mysql stop"
echo "  Clearing $MYSQL_DATADIR"
$SSH $TARGET_SERVER "rm -rf $MYSQL_DATADIR/*"

# PERFORM STREAMING BACKUP
echo "  Start Listener on Target server $TARGET_SERVER:$NETCAT_PORT to accept the backup and place it in $MYSQL_DATADIR"
$SSH $TARGET_SERVER "nc -l -p $NETCAT_PORT | unpigz -c | xbstream -x -C $MYSQL_DATADIR" > /dev/null 2>&1 &
echo "  Starting backup on source server $SOURCE_SERVER:$NETCAT_PORT to stream backup"
$SSH $SOURCE_SERVER "innobackupex --stream=xbstream --parallel=$BACKUP_THREADS /tmp | pigz -c --fast | nc -w 2 $TARGET_SERVER $NETCAT_PORT" > /tmp/backup_log 2>&1 &
sleep 4

echo "  Watching backup every $TIMER seconds to validate when the backup is complete"
LOOP=1
while [ 1 -eq $LOOP ]; do
  BACKUP_PROCESSES=`$SSH $SOURCE_SERVER "ps aux | grep -v grep | grep -w innobackupex | wc -l"`
  if [ $BACKUP_PROCESSES -eq 0 ]; then
    echo "  Backup has COMPLETED!!"
    LOOP=2
  else
    echo "  Backup is Running!"
    sleep $TIMER
  fi
done

# PREPARE AND COMPLETE BACKUP ON TARGET
echo "  Applying logs to the Xtrabackup"
$SSH $TARGET_SERVER "innobackupex --use-memory=$MEMORY_GB_USED_APPLY_LOGS --apply-log $MYSQL_DATADIR" > /tmp/backup_log 2>&1 &
sleep 3
LOOP=1
while [ 1 -eq $LOOP ]; do
  APPLY_PROCESSES=`$SSH $TARGET_SERVER "ps aux | grep -v grep | grep innobackupex | wc -l"`
  if [ $APPLY_PROCESSES -eq 0 ]; then
    echo "  Apply logs has COMPLETED!!"
    LOOP=2
  else
    echo "  Apply Logs Running!"
    sleep $TIMER
  fi
done
sleep 1

echo "  Updating ownership on the backup files so that MySQL owns them"
$SSH $TARGET_SERVER "chown -R mysql:mysql $MYSQL_DATADIR"
echo "  Starting MySQL"
$SSH $TARGET_SERVER "service mysql start"

if [ $CONFIGURE_TARGET_AS_SLAVE_OF_SOURCE -eq 1 ]; then
  echo "  Configuring Replication"
  GTID_STATUS=`$SSH $SOURCE_SERVER "mysql -BN -e \"SHOW VARIABLES LIKE 'gtid_mode'\"" | grep -w ON | wc -l`
  if [ $GTID_STATUS -gt 0 ]; then
    echo "Found GTID ON. Using Master Auto Position. SLAVE STARTED"
    GTID_POS=`$SSH $TARGET_SERVER "cat $MYSQL_DATADIR/xtrabackup_binlog_info" | awk '{print $3}' | head -n 1 | sed 's/,//'`
    $SSH $TARGET_SERVER "mysql -e \"SET GLOBAL gtid_purged='$GTID_POS';\""
    $SSH $TARGET_SERVER "mysql -e \"CHANGE MASTER TO MASTER_HOST='$SOURCE_SERVER', MASTER_USER='$REPL_USER', MASTER_PASSWORD='$REPL_PASS', MASTER_AUTO_POSITION = 1; START SLAVE;\""
  else
    echo "Found GTID not ON. Grabbing positioning from the backup file and using that to configure replication"
    MASTER_LOG=`$SSH $TARGET_SERVER "cat $MYSQL_DATADIR/xtrabackup_binlog_info" | awk '{print $1}'`
    MASTER_POS=`$SSH $TARGET_SERVER "cat $MYSQL_DATADIR/xtrabackup_binlog_info" | awk '{print $2}'`
    echo "Setting the slave to $MASTER_LOG and $MASTER_POS. SLAVE STARTED"
    $SSH $TARGET_SERVER "mysql -e \"CHANGE MASTER TO MASTER_HOST='$SOURCE_SERVER', MASTER_USER='$REPL_USER', MASTER_PASSWORD='$REPL_PASS', MASTER_LOG_FILE='$MASTER_LOG', MASTER_LOG_POS=$MASTER_POS; START SLAVE;\""
  fi
fi

Script Output

[root@bastion DB]# ./mysql_file_streaming_slave_rebuild.sh 10.10.10.198 10.10.10.197

Starting MySQL Streaming Slave Rebuild Script

Are you sure you wish to perform this rebuild. THIS IS DESTRUCTIVE!!!
  SOURCE SERVER (Performing Backup) : 10.10.10.198
  TARGET SERVER (Receiving Backup) : 10.10.10.197
  All files in /var/lib/mysql on 10.10.10.197 will be DELETED!!!
START NOW ? (y/n) : y

STARTING REBUILD!!!
  Stopping MySQL
  Clearing /var/lib/mysql
  Start Listener on Target server 10.10.10.197:2112 to accept the backup and place it in /var/lib/mysql
  Starting backup on source server 10.10.10.198:2112 to stream backup
  Watching backup every 5 seconds to validate when the backup is complete
  Backup is Running!
  Backup is Running!
  Backup is Running!
  Backup has COMPLETED!!
  Applying logs to the Xtrabackup
  Apply Logs Running!
  Apply Logs Running!
  Apply logs has COMPLETED!!
  Updating ownership on the backup files so that MySQL owns them
  Starting MySQL
  Configuring Replication
Found GTID ON. Using Master Auto Position. SLAVE STARTED

 

Conclusion

I have found that this has greatly reduced the time it takes me to recover a failed slave. By transferring the data only once from the source server to the target server, with the data compressed during the transfer, I feel this is one of the most efficient methods of recovering a failed slave, or building a new one.

A friendly comparison of InnoDB and MyRocks Performance

In this blog post, we run multiple OLTP performance benchmarking scenarios using sysbench 1.0.14 on InnoDB and MyRocks. InnoDB and MyRocks (RocksDB with MySQL) are definitely not meant to supplant each other; they actually complement each other well, each with its respective advantages. Let me quickly explain how InnoDB and MyRocks can benefit you when used wisely. This post is not about showing whether InnoDB or MyRocks is better: we regularly benchmark both of these storage engines before recommending to our customers what is best suited for their database infrastructure operations, and we would like to share our thoughts in this post.

How are InnoDB and MyRocks different?
  • MyRocks supports only the READ-COMMITTED isolation level; there is no REPEATABLE-READ isolation level as in InnoDB, and hence no gap locking. We have written a detailed blog on InnoDB transaction isolation levels here.
  • To get verbose information about a MyRocks instance, the log is located in /var/lib/mysql/#rocksdb. A much more detailed RocksDB diagnostics report can be generated with the command SHOW ENGINE ROCKSDB STATUS. It takes a good amount of time to understand and interpret MyRocks operational metrics.
  • In MyRocks, the rocksdb_block_cache_size system variable is somewhat similar to innodb_buffer_pool_size, but it mainly benefits reads. By default MyRocks uses buffered reads: the OS cache holds cached compressed data while the RocksDB block cache holds uncompressed data. You can keep these two levels of cache, or disable buffering by forcing the block cache to use direct reads with rocksdb_use_direct_reads=ON (see the configuration sketch after this list).
  • LSM data structure - MyRocks is not an alternative or advanced version of InnoDB. The LSM data structure is great for write-intensive database operations, but reads can be slow and full table scans are expensive, so InnoDB and RocksDB together make a great combination!
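As a rough illustration of the block cache settings mentioned in the list above, here is a my.cnf sketch (the values are illustrative placeholders, not tuning advice):

[mysqld]
# Uncompressed block cache, loosely analogous to innodb_buffer_pool_size
rocksdb_block_cache_size = 4G
# Bypass the OS cache so only the block cache holds data
rocksdb_use_direct_reads = ON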
Benchmarking InnoDB and MyRocks performance with sysbench 1.0.14 for OLTP operations 

Linux – CentOS Linux release 7.3.1611 (Core) 

Database infrastructure – MariaDB 10.3.7

Building database infrastructure for benchmarking 

We used the “oltp_common.lua” script to create the database infrastructure for benchmarking InnoDB and MyRocks; we have not tuned InnoDB or MyRocks variables for performance. The command below creates the database for benchmarking:

[root@localhost sysbench]# sysbench bulk_insert.lua --threads=1 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Creating table 'sbtest1'...
[root@localhost sysbench]#

Benchmarking bulk INSERT performance on InnoDB and MyRocks

Benchmarking OLTP insert on InnoDB using “oltp_insert.lua” 

Script to create data (5M records) for benchmarking OLTP INSERT performance:

[root@localhost sysbench]# sysbench oltp_insert.lua --threads=100 --time=180 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
[root@localhost sysbench]#

MariaDB [test]> show table status like 'sbtest1'\G
*************************** 1. row ***************************
            Name: sbtest1
          Engine: InnoDB
         Version: 10
      Row_format: Dynamic
            Rows: 5404891
  Avg_row_length: 265
     Data_length: 1433403392
 Max_data_length: 0
    Index_length: 157024256
       Data_free: 4194304
  Auto_increment: 5696281
     Create_time: 2018-06-03 12:48:12
     Update_time: 2018-06-03 12:52:03
      Check_time: NULL
       Collation: latin1_swedish_ci
        Checksum: NULL
  Create_options:
         Comment:
Max_index_length: 0
       Temporary: N
1 row in set (0.000 sec)

Script for benchmarking InnoDB OLTP INSERT performance:

[root@localhost sysbench]# sysbench oltp_insert.lua --threads=100 --time=180 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    0
        write:   696280
        other:   0
        total:   696280
    transactions: 696280 (3866.32 per sec.)
    queries:      696280 (3866.32 per sec.)
    ignored errors: 0 (0.00 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             180.0872s
    total number of events: 696280

Latency (ms):
    min:             0.62
    avg:             25.85
    max:             358.63
    95th percentile: 81.48
    sum:             17998504.11

Threads fairness:
    events (avg/stddev):         6962.8000/57.61
    execution time (avg/stddev): 179.9850/0.04

The metric we look at most seriously in this benchmark is QPS (queries per second); in the test above it is 3866 QPS.

Benchmarking MyRocks INSERT performance using Sysbench 1.0.14: 

The steps are the same, except that the RocksDB storage engine is explicitly specified in the sysbench commands: “--mysql-storage-engine=rocksdb”.

Script for benchmarking OLTP insert on MyRocks using “oltp_insert.lua”  : 

[root@localhost sysbench]# sysbench oltp_insert.lua --threads=100 --time=180 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...
[root@localhost sysbench]#

MariaDB [test]> show table status like 'sbtest1%'\G
*************************** 1. row ***************************
            Name: sbtest1
          Engine: ROCKSDB
         Version: 10
      Row_format: Fixed
            Rows: 5000000
  Avg_row_length: 198
     Data_length: 992949774
 Max_data_length: 0
    Index_length: 38739880
       Data_free: 0
  Auto_increment: 5000001
     Create_time: NULL
     Update_time: NULL
      Check_time: NULL
       Collation: latin1_swedish_ci
        Checksum: NULL
  Create_options:
         Comment:
Max_index_length: 0
       Temporary: N
1 row in set (0.007 sec)

Script for benchmarking RocksDB OLTP INSERT performance:

[root@localhost sysbench]# sysbench oltp_insert.lua --threads=100 --time=180 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    0
        write:   123049
        other:   0
        total:   123049
    transactions: 123049 (683.37 per sec.)
    queries:      123049 (683.37 per sec.)
    ignored errors: 0 (0.00 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             180.0618s
    total number of events: 123049

Latency (ms):
    min:             1.03
    avg:             146.30
    max:             1675.09
    95th percentile: 308.84
    sum:             18001689.44

Threads fairness:
    events (avg/stddev):         1230.4900/26.26
    execution time (avg/stddev): 180.0169/0.02
[root@localhost sysbench]#

Result: OLTP INSERT performance for RocksDB is 683 QPS (queries per second)

The graphical representation of more interesting performance benchmarking results on multiple bulk INSERT transactions scenarios:

Conclusion 

OLTP INSERT performance benchmarking clearly shows InnoDB is almost 6X faster than MyRocks, so MyRocks is not recommended for bulk INSERT transactions!

OLTP WRITE only transactions performance benchmarking for InnoDB and MyRocks 

OLTP WRITE only performance benchmarking for InnoDB using “oltp_write_only.lua”

The steps remain the same for the “oltp_write_only.lua” script as for “oltp_insert.lua” above, so we show the benchmarking results directly without repeating the details:

[root@localhost sysbench]# sysbench oltp_write_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_write_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    0
        write:   14529
        other:   7265
        total:   21794
    transactions: 3632 (355.03 per sec.)
    queries:      21794 (2130.37 per sec.)
    ignored errors: 1 (0.10 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             10.2285s
    total number of events: 3632

Latency (ms):
    min:             1.88
    avg:             277.61
    max:             2701.56
    95th percentile: 977.74
    sum:             1008267.12

Threads fairness:
    events (avg/stddev):         36.3200/4.36
    execution time (avg/stddev): 10.0827/0.09

Result : 2130 QPS (queries per second)

OLTP WRITE only performance benchmarking for RocksDB using “oltp_write_only.lua”

[root@localhost sysbench]# sysbench oltp_write_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_write_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    0
        write:   25191
        other:   12596
        total:   37787
    transactions: 6296 (625.73 per sec.)
    queries:      37787 (3755.49 per sec.)
    ignored errors: 4 (0.40 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             10.0603s
    total number of events: 6296

Latency (ms):
    min:             1.39
    avg:             159.29
    max:             3620.58
    95th percentile: 846.57
    sum:             1002895.84

Threads fairness:
    events (avg/stddev):         62.9600/25.26
    execution time (avg/stddev): 10.0290/0.02

Result : 3755 QPS (queries per second)

The graphical representation of more interesting performance benchmarking results on multiple WRITE only transactions scenarios:

Conclusion 

MyRocks OLTP write-only performance is almost 2X that of InnoDB, so MyRocks is definitely a preferred option for high-performance, scalable writes, thanks to the LSM data structure!

OLTP READ-WRITE performance benchmarking using Sysbench lua script “oltp_read_write.lua”

The lua scripts below create data for OLTP READ-WRITE performance benchmarking:

Benchmarking OLTP READ-WRITE performance for InnoDB:

[root@localhost sysbench]# sysbench oltp_read_write.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_read_write.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    15652
        write:   4472
        other:   2236
        total:   22360
    transactions: 1118 (105.96 per sec.)
    queries:      22360 (2119.20 per sec.)
    ignored errors: 0 (0.00 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             10.5301s
    total number of events: 1118

Latency (ms):
    min:             15.71
    avg:             922.19
    max:             4973.09
    95th percentile: 2009.23
    sum:             1031006.57

Threads fairness:
    events (avg/stddev):         11.1800/1.68
    execution time (avg/stddev): 10.3101/0.13
[root@localhost sysbench]#

Result: 2119 QPS (queries per second) 

Benchmarking OLTP READ-WRITE performance for RocksDB:

[root@localhost sysbench]# sysbench oltp_read_write.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_read_write.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    26964
        write:   7628
        other:   3827
        total:   38419
    transactions: 1901 (182.46 per sec.)
    queries:      38419 (3687.46 per sec.)
    ignored errors: 25 (2.40 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             10.4153s
    total number of events: 1901

Latency (ms):
    min:             11.23
    avg:             540.87
    max:             3480.91
    95th percentile: 1352.03
    sum:             1028196.02

Threads fairness:
    events (avg/stddev):         19.0100/2.25
    execution time (avg/stddev): 10.2820/0.10
[root@localhost sysbench]#

Result: 3687 QPS (queries per second) 

The graphical representation of more interesting performance benchmarking results on multiple READ-WRITE transactions scenarios:

Conclusion  

OLTP READ-WRITE I/O operations benchmarking results confirm that MyRocks is the clear choice here. These results may vary if we invest in tuning InnoDB and MyRocks for performance.

Benchmarking OLTP READ ONLY operations using Sysbench oltp_read_only.lua script 

OLTP READ ONLY transactions performance benchmarking for InnoDB:

[root@localhost sysbench]# sysbench oltp_read_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_read_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    51072
        write:   0
        other:   7296
        total:   58368
    transactions: 3648 (352.59 per sec.)
    queries:      58368 (5641.45 per sec.)
    ignored errors: 0 (0.00 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             10.3436s
    total number of events: 3648

Latency (ms):
    min:             1.10
    avg:             274.41
    max:             2863.46
    95th percentile: 733.00
    sum:             1001047.53

Threads fairness:
    events (avg/stddev):         36.4800/6.57
    execution time (avg/stddev): 10.0105/0.05

Result: 5641 QPS (queries per second)

OLTP READ ONLY transactions performance benchmarking for RocksDB:

[root@localhost sysbench]# sysbench oltp_read_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_read_only.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    26362
        write:   0
        other:   3766
        total:   30128
    transactions: 1883 (182.77 per sec.)
    queries:      30128 (2924.32 per sec.)
    ignored errors: 0 (0.00 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             10.2983s
    total number of events: 1883

Latency (ms):
    min:             6.20
    avg:             540.16
    max:             5258.04
    95th percentile: 2045.74
    sum:             1017118.10

Threads fairness:
    events (avg/stddev):         18.8300/2.29
    execution time (avg/stddev): 10.1712/0.08

Result: 2924 QPS (queries per second)

The graphical representation of more interesting performance benchmarking results on multiple READ only transactions scenarios:

Conclusion

InnoDB works great for OLTP READ-only transactions, so we can continue recommending InnoDB to customers for read-intensive database operations.

Benchmarking OLTP DELETE ONLY operations using Sysbench oltp_delete.lua script 

OLTP DELETE ONLY transactions performance benchmarking for InnoDB:

[root@localhost sysbench]# sysbench oltp_delete.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_delete.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    0
        write:   21659
        other:   4464
        total:   26123
    transactions: 26123 (2521.93 per sec.)
    queries:      26123 (2521.93 per sec.)
    ignored errors: 0 (0.00 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             10.3568s
    total number of events: 26123

Latency (ms):
    min:             0.04
    avg:             38.08
    max:             2679.06
    95th percentile: 116.80
    sum:             994654.43

Threads fairness:
    events (avg/stddev):         261.2300/46.27
    execution time (avg/stddev): 9.9465/0.18
[root@localhost sysbench]#

Result: 2521 QPS (queries per second)

OLTP DELETE ONLY transactions performance benchmarking for RocksDB:

[root@localhost sysbench]# sysbench oltp_delete.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb prepare
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Initializing worker threads...

Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating a secondary index on 'sbtest1'...

[root@localhost sysbench]# sysbench oltp_delete.lua --threads=100 --table-size=5000000 --db-driver=mysql --mysql-db=test --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=MyPassword2018 --mysql-storage-engine=rocksdb run
sysbench 1.0.14 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 100
Initializing random number generator from current time

Initializing worker threads...

Threads started!

SQL statistics:
    queries performed:
        read:    0
        write:   7094
        other:   421
        total:   7515
    transactions: 7515 (746.81 per sec.)
    queries:      7515 (746.81 per sec.)
    ignored errors: 0 (0.00 per sec.)
    reconnects:     0 (0.00 per sec.)

General statistics:
    total time:             10.0613s
    total number of events: 7515

Latency (ms):
    min:             0.43
    avg:             133.42
    max:             1666.25
    95th percentile: 502.20
    sum:             1002663.49

Threads fairness:
    events (avg/stddev):         75.1500/15.50
    execution time (avg/stddev): 10.0266/0.02

Result: 746 QPS (queries per second)

The graphical representation of more interesting performance benchmarking results on multiple DELETE only transactions scenarios:

Conclusion 

InnoDB is almost 3X faster than MyRocks in OLTP DELETE operations.

InnoDB and MyRocks performance comparison:
  • InnoDB is the clear choice when the workload is bulk INSERT, READ and DELETE intensive database operations.
  • MyRocks performance is much better than InnoDB in WRITE and READ-WRITE intensive database operations.


Using --login-path with MySQL Cluster Manager

Protective convenience
Recent versions of MySQL have an added security feature, the --login-path option. This feature puts your username and password in an encrypted file, and you point clients to this file instead of typing plain-text passwords on the console or putting them in scripts.
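A minimal sketch of the workflow (the login path name and connection details are placeholders):

# Store the credentials once; the password is prompted for interactively
# and written encrypted to ~/.mylogin.cnf
mysql_config_editor set --login-path=local --host=localhost --user=root --password

# From then on, clients connect without a plain-text password
mysql --login-path=local

You can list the stored entries (with passwords obfuscated) using mysql_config_editor print --all.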

MySQL participating to HKOSCon 2018

From June 16th to 17th, the MySQL Team will be attending and speaking at the Hong Kong Open Source Conference 2018.

Unfortunately I won’t be present but some of my great colleagues will be speaking about MySQL.

If you are planning to attend HKOSCon2018, don’t miss their talks:

During the event, there will also be very nice talks from the MySQL Community. I really recommend also attending the following sessions:

This conference seems to be a very nice opportunity to learn new stuff related to MySQL 8.0.

MariaDB 10.3 supports database sequences

Introduction Traditionally, both MySQL and MariaDB relied on AUTO_INCREMENT columns to generate an IDENTITY Primary Key. Although IDENTITY columns are very efficient in generating the Primary Key value, when it comes to using JPA and Hibernate, the IDENTITY generator prevents us from using JDBC batch inserts. To automatically enroll multiple INSERT, UPDATE or DELETE statements, … Continue reading MariaDB 10.3 supports database sequences →
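A minimal sketch of the feature itself (the sequence name is illustrative):

CREATE SEQUENCE article_seq START WITH 1 INCREMENT BY 1;

-- Fetch the next value, e.g. to use as a primary key
SELECT NEXT VALUE FOR article_seq;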


Call for Questions: Webinar with MySQL Benchmarking Experts

If you attended my latest Troubleshooting MySQL Concurrency Issues with Load Testing Tools webinar, you learned how I exploit benchmarking tools to replicate locking issues and crashes. But I told you nothing about the proper use of these tools for reliable benchmarks. I did not tell you that for a reason: I am just a Support Engineer and not a benchmarking expert.

And I have a dream. I want to invite world-famous MySQL benchmarking experts to a Percona webinar and ask them for their secrets. I have a pretty long list of my own questions, but in order to make a 1-hour webinar productive, we need to concentrate on just a few of the hottest ones. Therefore I need your help.

Please add your questions into the comments field of this blog post. Then we will choose the most important and/or most popular of these to discuss at the webinar.

Some benchmarking questions

My list of questions includes:

  1. What do you benchmark? Are they real applications or artificial tests?
  2. How do you create tests for a real application workload?
  3. Do you test hardware?
  4. Which tools do you use? How? Any tips?
  5. How do you work with hardware limitations when the test machine is smaller than the production one?
  6. Which are the most common mistakes for benchmarks?
  7. We all heard about “Lies, Damned Lies and Benchmarks”. Why do you still benchmark? How are you sure that yours have meaningful results?
  8. What are the most common mistakes made by those who publish benchmark results for public use?
  9. What are the best practices for benchmarks?

Please tell me what you think about these questions and add yours!


MySQL Performance : more in depth with latin1 and utf8mb4 in 8.0 GA

Looking at my previously obtained results on Read-Only (RO) tests for latin1 and UTF8 charsets, one question kept turning in my mind:
- if MariaDB 10.3 is hitting such a deep drop on the "distinct-ranges" workload:
- why is this not impacting the "mixed" OLTP_RO workload results (which contain the "distinct-ranges" query too):
The answer was within the test title:
  • I've missed one zero in my scripts while preparing the initial tests.. ;-))
  • so the "distinct-ranges" test was using range size=10 (instead of the default 100)
  • while "mixed" OLTP_RO remained with default settings, and used range size=100 for all range tests..
  • was it the use of a smaller range size that impacted MariaDB so much ?..
  • (generally people expect wider range queries to be more impacting than smaller ones, right ?)..

To clarify this all -- I decided to replay the RO tests with a more detailed analysis..


Colocated MySQL Server and NDB data nodes


Historically the advice for MySQL Cluster has been to not colocate
the MySQL Server and the NDB data node for scalable applications.

There are still reasons to apply this principle in scalable setups
where the application isn't close to the data.

But with MySQL Cluster 7.6 we have added a number of reasons why it
makes sense to colocate the MySQL Server and the NDB data node.

Internally in the NDB development team we have had a debate about whether to integrate the NDB data node inside the MySQL Server. The reason for doing so is that the MySQL Server would be closer to the data. The argument against is that the NDB data node and the MySQL Server are designed with different recovery models. NDB data nodes are failfast: as soon as we find a fault that is not supposed to happen, we crash the data node. The MySQL Server, on the other hand, tries to stay up as much as possible, since a crash of the MySQL Server brings down data accessibility. In NDB we always expect another data node to have a synchronous replica, thus data is accessible even in the case of a crash.

With MySQL Cluster 7.6 we have gotten the best of both worlds. We can now communicate from the MySQL Server to an NDB data node using a shared memory transporter. This means that communication goes entirely through the memory of the machine: the communication between a thread in the MySQL Server and a thread in the NDB data node goes through memory, and when one thread needs to wake up another, a mutex is used with a condition variable exactly as in the MySQL Server. Still, the NDB data node and the MySQL Server are separate programs that can reside on machines independent of each other, and they can crash independently of each other.

So with the release of MySQL Cluster 7.6 it is possible to have clusters with locality of reads. Already in MySQL Cluster 7.5 we introduced the possibility to declare tables as readable from all replicas (the Read Backup feature). In addition we introduced fully replicated tables in MySQL Cluster 7.5. For these fully replicated tables, access to a table is always local to the data node we are in.

In MySQL Cluster 7.6 we are introducing a shared memory transporter for efficient communication between a colocated MySQL Server and an NDB data node. In addition we are introducing the possibility to declare location domain ids for all nodes in the cluster. This means that we can ensure that reads always stay local to the Availability Domain in the Oracle Cloud (Availability Zone in Amazon and Google clouds). Thus it is possible to design applications without having to consider networking constraints as much as before with NDB.

This means that we expect NDB to work very well in SQL applications.
We are also constantly working on improving the SQL performance of
NDB by supporting more and more push down of joins to the NDB data nodes.
We are working on improving the basic range scan mechanism in NDB,
we are working on improving the interface between the NDB storage
engine and the MySQL Server. Finally we are also working hard to
integrate all the changes in MySQL 8.0 into MySQL Cluster as well.

I will describe a number of different scenarios for how to build
applications in the cloud using a setup where we have 3 data nodes,
one in each availability domain of the Oracle Cloud.

But in this blog and a few more blogs I will start by looking
specifically at how the shared memory transporter improves performance
of standard sysbench benchmarks.

In the previous blog I showed how we have improved performance of Sysbench OLTP RW even for the standard TCP transporter. This was due to the use of a new wakeup thread and to locking the NDB API receive thread to a CPU where it can work undisturbed. The receive thread handles receipt of all messages from the NDB data nodes and must be prioritised over the other MySQL Server threads; the best way to achieve this is to use CPU locking. In the benchmarks we present in this blog we always use this CPU locking.

In the figure above we show how the performance of a normal setup using
7.5.9 compares to the 7.6.6 with receive thread locked to a CPU using
the TCP transporter. Next we have a curve that shows performance when
simply replacing the TCP transporter with a shared memory transporter.
Next we show a curve of what happens if we configure the shared memory
transporter to use spinning for a while before it goes to sleep.

The final curve shows the performance when also spinning in the TC
threads and the LDM threads in the NDB data node. Spinning in those
threads is not likely to be beneficial if those threads are not locked
to their own CPU core, thus in this one should not use hyperthreading
for those threads.

The takeaways from the graph above are the following:

1) The shared memory transporter has similar performance at low concurrency to the TCP transporter. As concurrency increases the shared memory transporter has better performance; the improvement is 10% at top performance and more than 40% at very high concurrency.

2) Using spinning in the configuration of the shared memory transporter
improves performance at low concurrency significantly, by more than
20%. Top performance is similar to not using spinning, but it is
easier to get to this top performance.

3) Using spinning in the TC threads and LDM threads improves performance
even more at low concurrency. Performance increases by more than 30% at
low concurrency compared to no spinning and by 10% compared to spinning
only in transporter. Performance at high concurrency is similar for all
variants using shared memory transporter. So spinning helps to make the
MySQL Server need less concurrency to reach high performance levels.

We have added a graph below where we highlight the performance at 1 and 2 threads, since it is difficult to see those differences in the first figure.

Configuring NDB to use a shared memory transporter is easy; the easiest way is to simply set the new configuration variable UseShm to 1 on the NDB data nodes. With this setting we will create a shared memory transporter between this node and every API node that shares the same hostname as the data node. It is also possible to create a separate shared memory section to describe the transporter setup between two specific nodes in the cluster.

Spintime for the shared memory transporter is easiest to set up in the default shared memory transporter section. Spintime for the TC and LDM threads in the NDB data nodes is configured using the ThreadConfig variable on the data nodes, as sketched below.
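A config.ini sketch of the settings discussed above, under the assumption that these parameter names match your 7.6 release (the thread counts, CPU ranges and spintime values are illustrative placeholders, not recommendations):

[ndbd default]
# Create shared memory transporters to API nodes sharing the data node's hostname
UseShm=1
# Spin in the LDM and TC threads before going to sleep
ThreadConfig=ldm={count=8,cpubind=1-8,spintime=200},tc={count=4,cpubind=9-12,spintime=100}

[shm default]
# Spin in the shared memory transporter before going to sleep
Spintime=200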

Since we are using mutexes and condition variables in shared memory, we currently support shared memory transporters only on Linux.

The conclusion is that using the shared memory transporter we can improve performance at low concurrency by more than 30%, we can improve throughput by 20%, and at very high concurrency (1536 threads) we get about 100% improvement, all compared to the results using 7.5.9.

In the graph below we show only the 7.5.9 curve and compare it to the curve
achieved with all improvements in 7.6.


Webinar: MySQL & MariaDB Performance Tuning for Dummies

You’re running MySQL or MariaDB as your backend database: how do you tune it to make the best use of the hardware? How do you optimize the operating system? How do you best configure MySQL or MariaDB for a specific database workload?

Do these questions sound familiar to you? Maybe you’re having to deal with that type of situation yourself?

A database server needs CPU, memory, disk and network in order to function. Understanding these resources is important for anybody managing a production database. Any resource that is weak or overloaded can become a limiting factor and cause the database server to perform poorly.

In this webinar, we’ll discuss some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL or MariaDB database. We will also cover some of the variables which are frequently modified even though they should not.

Performance tuning is not easy, especially if you’re not an experienced DBA, but you can go a surprisingly long way with a few basic guidelines.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, June 26th at 09:00 BST / 10:00 CEST (Germany, France, Sweden)

Register Now

North America/LatAm

Tuesday, June 26th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)

Register Now

Agenda
  • What to tune and why?
  • Tuning process
  • Operating system tuning
    • Memory
    • I/O performance
  • MySQL configuration tuning
    • Memory
    • I/O performance
  • Useful tools
  • Do’s and do not’s of MySQL tuning
  • Changes in MySQL 8.0
Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.

This webinar builds upon blog posts by Krzysztof from the ‘Become a MySQL DBA’ series.

We look forward to “seeing” you there!


GitHub acquired by Microsoft

Microsoft has just acquired GitHub for $7.5bn.  Good or bad?

Injected VC capital was $350m, so ROI for the VCs = 21.4x = very happy VCs.

Microsoft has done excellent work on OSS software in recent years, including on the Linux kernel, PHP, and many others.  Just like Oracle continues to put very good effort into MySQL after the Sun Microsystems acquisition many years ago.

But Microsoft is not an Open Source software company. The open source development model is not something they have built into their business “DNA” – processes (actually many companies that only do OSS haven’t got that either). So why GitHub? Combine it with LinkedIn (acquired by Microsoft earlier), and you have developers’ resumes. That’s valuable. It’s a strategically smart move, for Microsoft.

Will GitHub users benefit, and if so, how?

Well, I expect there’ll be more hoovering of “useful” (meta)data by a corporation, which some LinkedIn users will find handy, but I think it’s mainly beneficial to Microsoft rather than users, and this type of gathering and combining data is fundamentally incompatible with basic privacy.  It will bite, at some point down the line.  It always does.

Fundamentally, GitHub and its use are self-contradictory.  Git explicitly enables distributed source code control and truly distributed development, whereas GitHub is very much centralised.  Don’t just walk away to something else now; that won’t address the actual problem.  Solving it properly will include having bug tracking as part of a repository, and by design not relying on a single central location, or company.  The developer community (and companies) must resolve this external dependency.

Improvements from NDB wakeup threads


In MySQL Cluster 7.6 we introduced a new thread type in the NDB API.
Traditionally each cluster connection has one send thread to assist in sending messages to NDB, a receive thread that can assist in receiving messages from NDB, and a connection thread that listens for new connections and connects to the NDB data nodes.

There is also a set of user threads that are created by the application that uses the NDB cluster connection.

Most of the sending is done by the user threads; the NDB API send thread is used only when we are sending faster than the network is able to handle.

We can process received messages in the user threads or in the NDB API receive thread. The default behaviour is to use the user threads until the concurrency is higher than 8 threads working at the same time. So at the highest concurrency it is the receive thread that handles the signals, and at low concurrency they are handled directly by the NDB API user threads. The threshold of 8 is configurable through the MySQL server variable ndb_recv_thread_activation_threshold.

Receiving in the user threads is slightly faster when only one thread is active; it gives 3-4% better response time in this particular case. However, as more and more threads send data to the NDB data nodes, the efficiency of using the NDB API receive thread increases.

One problem in using the NDB API receive thread is that it is responsible both for receiving the messages from the NDB data nodes and for waking up the NDB API user threads. At low load this is not an issue, but when the load on the NDB API receive thread reaches 90% and beyond, it becomes one.

To avoid this problem we added a new thread to the NDB API in MySQL Cluster 7.6: the wakeup thread. This thread has only one duty, which is to wake up other threads. We experimented with a number of different variants to see which one ensured that user threads are woken up as quickly as possible.

Our conclusion was that at low load it is optimal for the receive thread to handle the wakeups, but at very high load it needs assistance from one wakeup thread. As load increases the receive thread handles fewer and fewer wakeups; at 99-100% load it offloads more or less all wakeup calls to the wakeup thread.

In the figure above we compare a normal sysbench OLTP RW experiment
comparing 7.5.9 with 7.6.6. As can be seen there is no difference until
we reach 32 connections. As we start to offload a subset of the wakeups
to the wakeup thread we improve performance of the application.

The throughput increases 5% due to this new feature, and with even more threads the performance drops more slowly, such that we gain 15-20% more performance at 512 connections.

The best performance is normally achieved by using the NDB API receive thread with the thread locked to a specific CPU. When starting the MySQL server, one specifies these CPUs in the configuration parameter ndb_recv_thread_cpu_mask. If the MySQL Server uses several NDB cluster connections, the parameter should specify one CPU per cluster connection.

If you lock the NDB API receive thread to a CPU, it is important to also lock the MySQL server process to other CPUs; and if other processes are running on the same machine, these also need to be locked to CPUs that do not interfere with the NDB API receive thread.
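As a my.cnf sketch of the activation threshold mentioned earlier and the CPU mask above (the values are illustrative):

[mysqld]
# Hand receive processing to the receive thread once more than 8 user threads are active
ndb_recv_thread_activation_threshold=8
# Lock the receive thread to one CPU (hexadecimal CPU mask; one CPU per cluster connection)
ndb_recv_thread_cpu_mask=0x1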

The figures above show the improvements when one CPU core is locked to handle the NDB API receive thread. Locking the receive thread to a CPU adds another 5% to the total throughput, and up to 20% more at high thread counts.

So what we have achieved with MySQL Cluster 7.6 is that we can increase
the throughput by at least 10% and performance at high thread counts
can increase by as much as 40%. All these numbers are still using the
TCP transporter. In a coming blog we will show how these numbers increase
even more when using the shared memory transporter. In addition we will
show how using the thread pool with NDB can even further increase stability
of high throughputs at high thread counts.

The above experiment was always done with one data node using 8 LDM
threads, the data node is locked to CPUs within one CPU socket. The
MySQL Server is locked to using 30 CPUs (15 CPU cores). In all cases
the bottleneck is that we only use one cluster connection. In 7.5.9 this
cluster connection scales to about 18 CPUs and with 7.6.6 it scales to
more than 20 CPUs. So using one cluster connection per 8 CPU cores is
usually appropriate.

I will be presenting at OpenExpoEurope in Madrid on Wednesday

This Wednesday, 6th June 2018, I will be presenting at OpenExpo Europe in Madrid: Experiences with testing dev MySQL versions and why it’s good for you.  It’s always good to test new releases of MySQL and this presentation talks about why.  If you are at the conference then please come along and say hello.  

Troubleshooting MySQL Concurrency Issues with Load Testing Tools Webinar: Q & A

In this blog, I will provide answers to the Q & A for the Troubleshooting MySQL Concurrency Issues with Load Testing Tools webinar.

First, I want to thank everybody for attending my May 23, 2018, webinar on troubleshooting tools for MySQL. The recording and slides for the webinar are available here. Below is the list of your questions that I was unable to answer fully during the webinar.

Q: What do you recommend for benchmarking NDB cluster? Which should be used and how?

A: The issue with benchmarking NDB cluster is not the tool choice, but the limitations that this product has compared to “regular” InnoDB MySQL installations. You can use any tool you like (Sysbench!), just make sure you are not running a test case that can be affected by NDB limitations (full list, performance). Or, at least, adjust your expected results for such a test. You may check these slides by former MySQL Product Manager Mat Keep. He covers benchmarks too. Just note that NDB cluster is improving over the years and you need to consult the current documentation to find out which limitations have already been relaxed.

Q: Can SysBench coordinate load testing in a multi-master scenario, or master and slave, at the same time? Or does it have a single connection?

A: SysBench version > 1.0 can open more than one connection per test (and even per thread), but you will need to write your own custom tests to coordinate load testing in a multiple-server scenario.
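
With the bundled OLTP scripts you can at least spread connections over several servers without custom Lua. A sketch, assuming sysbench 1.0, hypothetical host names master1/master2, already-prepared sbtest tables, and the documented behaviour that connections are distributed across the hosts listed in --mysql-host:

shell$ sysbench oltp_read_write \
          --mysql-host=master1,master2 \
          --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest \
          --tables=8 --table-size=100000 \
          --threads=64 --time=300 run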

Q: What do you think about the function BENCHMARK()? Compared with mysqlslap?

A: The BENCHMARK() function just evaluates a scalar expression, whilst mysqlslap runs actual SQL queries.

BENCHMARK() can only be used to test how fast a function or a subquery returning a single value executes. This limits its usage to a small number of real-life use cases. It may help to test, say, whether a function is efficient enough for calculating expressions. It runs everything in the same thread, therefore you cannot use it for testing concurrency issues unless you start it in multiple clients.

mysqlslap is a very basic tool, but it can run SQL in multiple threads, which makes its tests closer to real-life scenarios.
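
To make the difference concrete, here is a minimal sketch of both approaches; the expression, counts and credentials are arbitrary:

# BENCHMARK(): one thread, one expression, no real query workload
mysql> SELECT BENCHMARK(1000000, MD5('some text'));

# mysqlslap: real queries from many concurrent threads
shell$ mysqlslap --user=root --password \
          --concurrency=50 --iterations=10 \
          --query="SELECT MD5('some text')"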

The post Troubleshooting MySQL Concurrency Issues with Load Testing Tools Webinar: Q & A appeared first on Percona Database Performance Blog.

Using MariaDB Backup and MariaDB MaxScale to Scale Online


This blog post is rather practical. What it aims to show is how we can use a script that in turn uses MariaDB Backup to back up a MariaDB Server master, how we can create a MariaDB Server slave from this backup, and then how we can script an online update to MariaDB MaxScale 2.2 to include the new MariaDB Server slave. Let's start by describing our environment.

Our infrastructure


We currently have one MariaDB Server master, 2 MariaDB Server slaves and one instance of MariaDB MaxScale. They are all running CentOS 7.2 and the IP addresses are 192.168.142.110 (MariaDB MaxScale), 192.168.142.111 (MariaDB Server master), 192.168.142.112 and 192.168.142.113 (MariaDB Server slaves).

Setup of the MariaDB servers

The setup of the master server is nothing really complicated in this case, but there are a few things we have to configure to use it as a master. We need to enable the binlog and set a server id, which means editing the /etc/my.cnf.d/server.cnf file (if you are not on CentOS or RedHat, the location might differ) and adding the following to the [mysqld] section:

server_id=111
log-bin=hostonly111

We also need to adjust the [mysqld] section in the same configuration file on the slaves, for example:

server_id=112
log_bin=hostonly112
log_error=error.log
datadir=/var/lib/mysql
report-host=192.168.142.112

This of course has to be adjusted to fit your setup, and note that not all of these settings are strictly necessary. I will not show more of the master and slave configuration here, as that is not the goal of this blog.
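
One thing worth showing, since the slave-creation script later in this post assumes it, is the replication user on the master. A minimal sketch, using the same user name and password as the script's defaults (repl/repl) and the subnet from the example setup; the REPLICATION SLAVE privilege also allows the script to run SHOW SLAVE HOSTS on the master:

mysql> CREATE USER 'repl'@'192.168.142.%' IDENTIFIED BY 'repl';
mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.142.%';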

Setup of MariaDB MaxScale

There is a basic setup of MariaDB MaxScale assumed here, but note that MariaDB MaxScale, from version 2.2, stores a binary version of its configuration separately. This is useful when you use online dynamic reconfiguration, but it makes things a bit more complicated. What I am showing here is the basic MariaDB MaxScale configuration used to support the cluster set up as above, and this is stored in the file /etc/maxscale.cnf:

# Global parameters
#
[maxscale]
threads=auto

# Server definitions
#
[server1]
type=server
address=192.168.142.111
port=3306
protocol=MariaDBBackend

[server2]
type=server
address=192.168.142.112
port=3306
protocol=MariaDBBackend

[server3]
type=server
address=192.168.142.113
port=3306
protocol=MariaDBBackend

# Monitor for the servers
#
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1,server2,server3
user=myuser
passwd=mypwd
monitor_interval=1000

# Service definitions
#
[Read-Write-Service]
type=service
router=readwritesplit
servers=server1,server2,server3
user=myuser
passwd=mypwd

# This service enables the use of the MaxAdmin interface
#
[MaxAdmin-Service]
type=service
router=cli

# Listener definitions for the services
#
[Read-Write-Listener]
type=listener
service=Read-Write-Service
protocol=MariaDBClient
port=4006

[MaxAdmin-Listener]
type=listener
service=MaxAdmin-Service
protocol=maxscaled
socket=default

Note that I am not going to cover all the aspects of configuring MariaDB MaxScale here.

Backing up MariaDB Server using MariaDB Backup

Mariabackup is fully documented in the MariaDB Knowledge Base, so I will not go into details; rather, what I aim to show is a basic bash script that runs a backup. How you run it is not really important, but it has to be run, and the compressed and archived backup has to be placed in the agreed location. Also note that the script runs a prepare step, which means that it does any recovery necessary on the backup to create a consistent copy of the running database.

The script is far from complete, but it performs the basics. It keeps old backups and does a few other things. It is intended to be run in the background, which is why it is configured using environment variables at the top of the script rather than command line arguments.

#!/bin/bash -eu
#
set -o pipefail

MDB_USER=root
MDB_PWD=
MDB_BACKUPBASE=/home/anders
MDB_BACKUPNAME=backup
MDB_ARCHIVENAME=backuparchive
MDB_BACKUPDIR=$MDB_BACKUPBASE/$MDB_BACKUPNAME
MDB_BACKUPARCHIVEPFX=$MDB_BACKUPBASE/$MDB_ARCHIVENAME
MDB_BACKUPARCHIVEDIR=""
MDB_BACKUPARCHIVETGZ=""
MDB_BACKUPLOG=/tmp/backup.log
MDB_BACKUPCMD=/usr/bin/mariabackup

if [ "$EUID" != "0" ]; then
   echo "$0 must be run as root" >&2
   exit 1
fi

# Check if the backup directory exists.
if [ -e "$MDB_BACKUPDIR" -o -e "$MDB_BACKUPDIR.tgz" ]; then
   # Find a backup archive directory.
   for I in {1..10000}; do
      if [ ! -e "$MDB_BACKUPARCHIVEPFX$I" -a ! -e "$MDB_BACKUPARCHIVEPFX$I.tgz" ]; then
         MDB_BACKUPARCHIVEDIR="$MDB_BACKUPARCHIVEPFX$I"
         MDB_BACKUPARCHIVETGZ="$MDB_BACKUPARCHIVEPFX$I.tgz"
         break
      fi
   done

   # Check that a directory was found.
   if [ "x$MDB_BACKUPARCHIVEDIR" = "x" ]; then
      echo "Can't find a suitable backup archive directory" >&2
      exit 1
   fi

   if [ -e "$MDB_BACKUPDIR" ]; then
      mv $MDB_BACKUPDIR $MDB_BACKUPARCHIVEDIR
   fi
   if [ -e "$MDB_BACKUPDIR.tgz" ]; then
      mv $MDB_BACKUPDIR.tgz $MDB_BACKUPARCHIVETGZ
   fi
fi

echo >> $MDB_BACKUPLOG
echo "Starting backup on `date +"%Y-%m-%d %H:%M:%S"`" >> $MDB_BACKUPLOG

# Do the backup.
echo "Backing up to $MDB_BACKUPDIR" >> $MDB_BACKUPLOG
$MDB_BACKUPCMD --backup -u $MDB_USER ${MDB_PWD:+"-p$MDB_PWD"} --target-dir=$MDB_BACKUPDIR >> $MDB_BACKUPLOG 2>&1

# Prepare and make the backup consistent.
$MDB_BACKUPCMD --prepare -u $MDB_USER ${MDB_PWD:+"-p$MDB_PWD"} --target-dir=$MDB_BACKUPDIR >> $MDB_BACKUPLOG 2>&1

# Compress and archive the backup.
cd $MDB_BACKUPBASE
tar cvfz $MDB_BACKUPNAME.tgz $MDB_BACKUPNAME >> $MDB_BACKUPLOG 2>&1

As you can see, nothing too complicated, and the resulting backup will be placed in the archive file /home/anders/backup.tgz.
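
Since the script takes no arguments and logs to /tmp/backup.log, it is easy to schedule. A hypothetical crontab entry (the script path and schedule are assumptions):

# Run the backup script nightly at 02:00, as root
0 2 * * * /root/bin/mariadb_backup.sh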

Creating a slave from a master backup

The second script to present is the one that creates a slave from a master, using the content of a backup as described above. It is assumed that this slave already has MariaDB Server and MariaDB Backup installed, and that ssh is configured so that files can be copied from the master, including appropriate keys. Then it is time to have a look at the script.

The script does quite a few things. First it figures out a suitable server_id for this slave, and then it copies a backup archive from the master server and unpacks it. For a replication slave to be set up appropriately, we need to know the GTID at the point when the backup was taken, so this is recovered from a file that MariaDB Backup generates for us, which is included in the backup archive. At this point the MariaDB Server is shut down, if it is running, and the current datadir is saved.

It is then time to recover data from the backup and set up the correct ownership of the MariaDB Server directories. At this point we can start the MariaDB Server. All we need to do next is to configure the MariaDB Server as a slave and we are all set.

The script, which runs on the server of the MariaDB Server slave, takes two arguments: the IP address of the MariaDB Server master of this slave, and the host IP of this MariaDB Server slave. The latter isn't as obvious as one might think; many servers, possibly most, have several host IPs (excluding localhost, there is often one connected internally and one to a firewalled DMZ). An optional third argument sets the slave's server id explicitly; otherwise one is derived from the master.

#!/bin/bash
#
set -o pipefail

MDB_MASTERHOST=$1
MDB_SLAVEHOST=$2
MDB_USER=root
MDB_PWD=
MDB_CNFFILE=/etc/my.cnf.d/server.cnf
MDB_REPLUSER=repl
MDB_REPLPWD=repl
MDB_DEFAULTDATADIR=/var/lib/mysql
MDB_BACKUPNAME=backup
MDB_RESTORELOG=/tmp/createslave.log
MDB_BACKUPLOC=$MDB_MASTERHOST:$MDB_BACKUPNAME.tgz
MDB_BACKUPCMD=/usr/bin/mariabackup
MDB_MASTEROPTS="-h $MDB_MASTERHOST -u $MDB_REPLUSER ${MDB_REPLPWD:+"-p$MDB_REPLPWD"} --batch --column-names=0"

if [ "$EUID" != "0" ]; then
   echo "$0 must be run as root" >&2
   exit 1
fi

if [ "$#" -lt 2 ]; then
   echo "Usage: $0 <master host> <slave host> [<server id>]" >&2
   exit 1
fi

# Handle server id.
if [ "$#" -gt 2 ]; then
   MDB_SLAVEID=$3
   if [ "x`echo $MDB_SLAVEID | sed "s/^[0-9]*$//"`" != "x" ]; then
      echo "Slave server id invalid. It must be numeric" >&2
      exit 1
   fi
else
   # Get a server id from the master if not specified.
   MDB_SLAVEID="`mysql $MDB_MASTEROPTS -e "SHOW SLAVE HOSTS" | awk '{print $1}' | sort -n | tail -1`"
   if [ "x$MDB_SLAVEID" == "x" ]; then
      MDB_SLAVEID="`mysql $MDB_MASTEROPTS -e "SELECT @@SERVER_ID"`"
   fi
   MDB_SLAVEID=$(($MDB_SLAVEID + 1))
fi

# Check if we have mariabackup
if [ ! -e "$MDB_BACKUPCMD" ]; then
   echo "Cannot find $MDB_BACKUPCMD command. Please install it" >&2
   exit 1
fi

# Check if datadir is set, else set the default.
MDB_DATADIR="`(grep "^ *datadir *=" $MDB_CNFFILE || true) | awk -F= '{print $2}'`"
if [ "x$MDB_DATADIR" == "x" ]; then
   MDB_DATADIR=$MDB_DEFAULTDATADIR
fi

# Print to log.
echo >> $MDB_RESTORELOG
echo "Starting restore on `date +"%Y-%m-%d %H:%M:%S"`" >> $MDB_RESTORELOG

# Copy backup from master.
scp $MDB_BACKUPLOC . >> $MDB_RESTORELOG 2>&1

# Remove old backup, if one exists.
if [ -e "$MDB_BACKUPNAME" ]; then
   rm -rf $MDB_BACKUPNAME
fi

# Unpack backup.
tar xvfz $MDB_BACKUPNAME.tgz >> $MDB_RESTORELOG 2>&1

# Get the GTID from the backup
GTID_POS=`cat $MDB_BACKUPNAME/xtrabackup_binlog_info | awk '{print $3}'`
echo >> $MDB_RESTORELOG 2>&1
echo "Restoring GTID: $GTID_POS" >> $MDB_RESTORELOG 2>&1

# Get MariaDB server status
STATUS=`systemctl is-active mariadb || true`
echo "MariaDB status: $STATUS" >> $MDB_RESTORELOG 2>&1

# Stop MariaDB if it is running.
if [ "$STATUS" = "active" ]; then
   echo "Stopping MariaDB" >> $MDB_RESTORELOG 2>&1
   systemctl stop mariadb >> $MDB_RESTORELOG 2>&1
   STATUS=`systemctl is-active mariadb || true`
   if [ "$STATUS" = "active" ]; then
      echo "Error stopping MariaDB" >> $MDB_RESTORELOG 2>&1
      exit 1
   fi
fi

# Save current datadir if that exists.
if [ -e "$MDB_DATADIR" ]; then
   MDB_DATADIR_SAVE="$MDB_DATADIR`date +\"%Y%m%d_%H%M%S\"`"
   if [ -e "$MDB_DATADIR_SAVE" ]; then
      for I in {1..100000}; do
         MDB_DATADIR_SAVE="$MDB_DATADIR`date +\"%Y%m%d_%H%M%S\"`_$I"
         if [ ! -e "$MDB_DATADIR_SAVE" ]; then
            break
         fi
      done
      if [ -e "$MDB_DATADIR_SAVE" ]; then
         echo "Can't find location for saved datadir" >> $MDB_RESTORELOG 2>&1
         exit 1
      fi
   fi
   # Move datadir to saved location.
   mv $MDB_DATADIR $MDB_DATADIR_SAVE
fi

# Find mysqld group in config file.
GRPLINE=`grep -n "\[mysqld\]" $MDB_CNFFILE | tail -1 | awk -F: '{print $1}'`

# If a group wasn't found, then add one.
if [ "x$GRPLINE" == "x" ]; then
   echo "[mysqld]" >> $MDB_CNFFILE
   GRPLINE=`grep -n "\[mysqld\]" $MDB_CNFFILE | awk -F: '{print $1}'`
fi

# Set up section of variables to set.
NEWCNF=""
if [ "x`grep \"^ *server[-_]id *=\" $MDB_CNFFILE`" == "x" ]; then
   NEWCNF="server_id=$MDB_SLAVEID"
fi
if [ "x`grep \"^ *datadir *=\" $MDB_CNFFILE`" == "x" ]; then
   NEWCNF="${NEWCNF}${NEWCNF:+\n}datadir=/var/lib/mysql"
fi
if [ "x`grep \"^ *report[-_]host *=\" $MDB_CNFFILE`" == "x" ]; then
   NEWCNF="${NEWCNF}${NEWCNF:+\n}report_host=$MDB_SLAVEHOST"
fi

# Set up required variables in cnf if necessary.
if [ "x$NEWCNF" != "x" ]; then
   sed -i "${GRPLINE}a$NEWCNF" $MDB_CNFFILE
fi

# Restore from backup.
$MDB_BACKUPCMD --move-back --target-dir=$PWD/$MDB_BACKUPNAME >> $MDB_RESTORELOG 2>&1

# Set correct ownership.
chown -R mysql:mysql $MDB_DATADIR
chmod 755 $MDB_DATADIR

# Start MariaDB again.
systemctl start mariadb >> $MDB_RESTORELOG 2>&1

# Get MariaDB server status
STATUS=`systemctl is-active mariadb || true`

# Stop if MariaDB is not running.
if [ "$STATUS" != "active" ]; then
   echo "Error starting MariaDB" >> $MDB_RESTORELOG 2>&1
   exit 1
fi

# Set up node as slave.
mysql -u $MDB_USER ${MDB_PWD:+"-p$MDB_PWD"} -e "SET GLOBAL gtid_slave_pos = '$GTID_POS'" >> $MDB_RESTORELOG 2>&1
mysql -u $MDB_USER ${MDB_PWD:+"-p$MDB_PWD"} -e "CHANGE MASTER TO MASTER_HOST='$MDB_MASTERHOST',\
   MASTER_USER='$MDB_REPLUSER', MASTER_PASSWORD='$MDB_REPLPWD', MASTER_USE_GTID=current_pos" >> \
   $MDB_RESTORELOG 2>&1
mysql -u $MDB_USER ${MDB_PWD:+"-p$MDB_PWD"} -e "START SLAVE" >> $MDB_RESTORELOG 2>&1

After running the script above, we have yet another slave attached to the master, but one thing remains: telling MariaDB MaxScale to route reads also to the newly attached server.

Automated and online reconfiguration of MariaDB MaxScale

The last step is to add our new slave to the MariaDB MaxScale configuration. The way this script works is that it attaches to the master and uses that to determine which slaves exist. This is compared to the slaves that MariaDB MaxScale knows about, and if one is missing, it is added. Among the programs used by this script are curl and, above all, jq, which is used to parse JSON. These tools have to be installed to use this script.

#!/bin/bash
#
MAX_HOST=localhost
MAX_PORT=8989
MAX_USER=admin
MAX_PWD=mariadb
MDB_PORT=3306
MDB_USER=myuser
MDB_PWD=mypwd
SERVERID_PREFIX=server
CURL_OPTS="--user $MAX_USER:$MAX_PWD --silent"

if [ "x`which curl 2> /dev/null`" == "x" ]; then
   echo "Cannot find curl program" 1>&2
   exit 1
fi
if [ "x`which mysql 2> /dev/null`" == "x" ]; then
   echo "Cannot find mysql program" 1>&2
   exit 1
fi
if [ "x`which jq 2> /dev/null`" == "x" ]; then
   echo "Cannot find jq program" 1>&2
   exit 1
fi

#
# Function to add a server.
# Arguments:
#    Server address
#    Server node name
#
function addserver {
   curl $CURL_OPTS -H "Content-Type:application/json" http://$MAX_HOST:$MAX_PORT/v1/servers -d '{
      "data": {
         "id": "'$2'",
         "type": "servers",
         "attributes": {
            "parameters": {
               "address": "'$1'",
               "port": 3306,
               "protocol": "MariaDBBackend"
            }
         },
         "relationships": {
            "services": { "data": ['$SERVICELIST'] },
            "monitors": { "data": ['$MONITORLIST'] }
         }
      }
   }'
}

#
# Function to generate a suitable server_id
#
function get_serverid {
   for I in {1..10000}; do
      found=0
      for S in $SERVERIDS; do
         if [ "$SERVERID_PREFIX$I" == "$S" ]; then
            found=1
            break
         fi
      done
      if [ $found -eq 0 ]; then
         echo "$SERVERID_PREFIX$I"
         break
      fi
   done
   return 0
}

MASTER=`curl $CURL_OPTS http://$MAX_HOST:$MAX_PORT/v1/servers | jq --raw-output '.data[].attributes | select(.state == "Master, Running") | .parameters.address'`
MASTERID=`curl $CURL_OPTS http://$MAX_HOST:$MAX_PORT/v1/servers | jq --raw-output '.data[] | select(.attributes.state == "Master, Running") | .id'`
if [ "x$MASTER" == "x" ]; then
   echo "Cannot find a master node" 1>&2
   exit 1
fi
MASTER_SERVICES=`curl $CURL_OPTS http://$MAX_HOST:$MAX_PORT/v1/servers/$MASTERID | jq --raw-output '.data.relationships.services.data[].id'`
MASTER_MONITORS=`curl $CURL_OPTS http://$MAX_HOST:$MAX_PORT/v1/servers/$MASTERID | jq --raw-output '.data.relationships.monitors.data[].id'`
SERVERS=`curl $CURL_OPTS --silent http://$MAX_HOST:$MAX_PORT/v1/servers | jq --raw-output '.data[].attributes.parameters.address' | sort`
SERVERIDS=`curl $CURL_OPTS --silent http://$MAX_HOST:$MAX_PORT/v1/servers | jq --raw-output '.data[].id' | sort`
SLAVES=`mysql -h $MASTER -P $MDB_PORT -u $MDB_USER -p$MDB_PWD -e "show processlist" --batch | grep "Binlog Dump" | awk '{sub(/:[0-9]*/, "", $3); print $3;}'`

# Create JSON list of services.
SERVICELIST=""
for S in $MASTER_SERVICES; do
   SERVICELIST="${SERVICELIST:+$SERVICELIST,}{\"id\":\"$S\",\"type\":\"services\"}"
done

# Create JSON list of monitors.
MONITORLIST=""
for S in $MASTER_MONITORS; do
   MONITORLIST="${MONITORLIST:+$MONITORLIST,}{\"id\":\"$S\",\"type\":\"monitors\"}"
done

# Loop for all slaves and see if they are defined in maxscale.
for S in $SLAVES; do
   found=0
   for SE in $SERVERS; do
      if [ "$S" == "$SE" ]; then
         found=1
         break
      fi
   done
   # If server is not found in maxscale, then add it.
   if [ $found -eq 0 ]; then
      echo "Server $S not found in MaxScale. Adding"
      SRVID=$(get_serverid)
      echo $SRVID
      addserver $S $SRVID
      SERVERIDS="$SERVERIDS $SRVID"
   fi
done
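
After the script has run, the result can be verified over the same REST interface. A quick sketch, reusing the host, port and credentials from the script's defaults:

shell$ curl --silent --user admin:mariadb http://localhost:8989/v1/servers | jq --raw-output '.data[].id'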

Conclusion

MariaDB MaxScale provides a powerful, flexible and convenient means to build a scalable MariaDB Server cluster, be it Galera or a replicated cluster. MariaDB Backup, on the other hand, is a powerful and flexible online backup solution for MariaDB Server. Combining these technologies means that a powerful environment can easily be built, one that is scalable and flexible without downtime.

Happy SQL'ing
/Karlsson

MariaDB Backup is an online backup tool included with MariaDB TX that supports many advanced features. MariaDB MaxScale is a database proxy that, among other things, allows read/write split scalability and online configuration, and it is also included with MariaDB TX. By combining these tools, this blog shows how a replication cluster can be scaled online without downtime.


Busy week for MySQL - where you can find us!

This week is going to be a bit busy for the MySQL team. We are going to be present at four conferences in Europe and two in the US. Please find them below:

  • International PHP Conference, Berlin Germany, June 4-8, 2018
    • This time we are going to have a booth at the Bronze level in the expo area. The exhibition runs only on June 5-6, 2018. You can find our colleague Mario Beck, the MySQL Sales Consulting Manager for EMEA, at the stand. Come to our MySQL booth!
  • OpenExpo Europe, Madrid, Spain, June 6-7, 2018
    • Do not miss the biggest open source show in Madrid! The MySQL and Linux teams will be there at the shared MySQL & Linux booth in the expo area. You can also find our talks in the schedule as follows:
      • "The Revolutionary MySQL 8.0" by Abel Florez, the MySQL Technical Account Manager. His talk is scheduled for June 6, 2018 @16:05-16:35.
      • "Operating on your Choice of Hardware, in your Data Center or in the Cloud" by Simon Coter, the Director of Product Management, Oracle VM & VirtualBox.
      • "Oracle Cloud Infrastructure for Cloud Native Applications" by Vittorio Garbuglio, the Cloud Solutions Architect.
    • Come talk to our experts; we are looking forward to meeting you there!
  • Bulgarian Oracle User Group Conference (BGOUG), Borovets, Bulgaria, June 8-10, 2018
    • As a tradition, we are supporting and actively attending this event organized by the local Oracle User Group. Please do not miss the MySQL talk given by Georgi Kodinov, the MySQL Senior SW Development Manager:
      • "The New Data Dictionary in MySQL 8.0: what does it mean to you?" June 9, 2018 @14:30-15:15 in the Hall Rila 1. 
  • DevTalks Bucharest, Romania, June 8, 2018
    • We got into this show at the last minute! You can find our MySQL staff at the MySQL booth in the exhibition area, as well as the MySQL talk given by Vittorio Cioe, the Sr. Sales Consultant, Oracle, MySQL:
      • "Better DevOps with MySQL 8.0", scheduled for June 8, 2018 @14:05-14:50 in the DevOps track
    • We are looking forward to meeting & talking to you there!
  • SyntaxCon, Charleston, SC, US, June 6-8, 2018
    • We are attending and speaking at the SyntaxCon conference this year. You can find the MySQL talk in the schedule as follows:
      • "MySQL 8 -- A New Beginning" given by Dave Stokes, the MySQL Community Manager. The talk is scheduled for June 7, 2018 @11:15-12:15pm.
    • Come listen to the MySQL 8.0 news and discuss your questions with Dave @SyntaxCon!
  • Southeast Linux Fest (SELF), Charlotte, US, June 8-10, 2018
    • As is tradition, we are going to attend SELF! This year we became a Diamond sponsor, and you will be able to find us at the MySQL booth in the expo hall, as well as the following MySQL talks in the two-day Zero to DBA Hero track:
      • "20 Years of MySQL, 20 Years of PHP, and 10 Years of SELF - What the heck has been going on?" given by Dave Stokes, the MySQL Community Manager. The talk is scheduled for June 8, 2018 @ 9:00-10:00am.
      • "MySQL Without the SQL - Oh My!" by Dave Stokes, talk is scheduled for June 8 @1:30-2:30 pm.

The next shows we are attending in June are: DataOps, Barcelona, Spain, June 21-22, 2018; Hong Kong Open Source Conference, June 16-17, 2018; OpenSource Conference Okinawa, Japan, June 16, 2018; and PyCon Thailand, Bangkok, June 16-17, 2018.

MySQL NDB Cluster 7.6: Fast Import with ndb_import

A common problem when you need to set up a new database instance is to import the initial data. This can particularly be an issue if you are converting an existing database, so there is a large amount of data to import. In MySQL NDB Cluster 7.6, which was just announced as generally available (GA) this week, there is a new tool to help with such imports: ndb_import.

The ndb_import utility takes a CSV file and imports the data into a table. (While the C stands for comma, the tool is not limited to comma-separated files; in fact, tab is the default delimiter.) But hey, does that not sound familiar? It sure does. MySQL Server has "forever" included the mysqlimport utility and the LOAD DATA statement. So why do we need another utility, and what does ndb_import do that we cannot already do with the existing tools?
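
For comparison, loading such a file through an SQL node would conventionally look something like this sketch; the schema, table and delimiters match the example introduced below, while the file path is hypothetical (and subject to the usual FILE privilege and secure_file_priv restrictions):

mysql> LOAD DATA INFILE '/path/to/t1.txt' INTO TABLE db1.t1
           FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
           IGNORE 1 LINES;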

The big advantage of ndb_import is that it is a native NDB tool. It does not connect to any of the SQL nodes; instead it connects directly to the data nodes as an NDB API program. This alone allows you to bypass the overhead of connecting to the SQL node and parsing SQL statements. However, it is even better than that. The ndb_import tool can also take advantage of the parallelism offered by MySQL NDB Cluster. This means that in the end, the overall performance is expected to be quite a bit better than loading the data using mysqlimport or LOAD DATA.

Note: The speed-up will depend on your setup. A cluster with all nodes in a VM on a laptop may not experience any significant speed-up compared to other methods of importing data.

So, how does this work? Let us look at a simple example. We will import the following data from the t1.txt file:

# ID    First   Surname Birthday
1       John    Doe     1945-10-15
2       Jane    Doe     1947-08-11
3       Mary    Smith   1974-12-04
4       Mike    Jones   1980-11-12

The data includes an ID plus the first name, surname, and birthday for four persons, separated by \t, and it will be imported into the db1.t1 table:

CREATE SCHEMA IF NOT EXISTS db1;

CREATE TABLE db1.t1 (
   id int unsigned NOT NULL auto_increment,
   FirstName varchar(20) NOT NULL,
   Surname varchar(20) NOT NULL,
   Birthday date NOT NULL,
   PRIMARY KEY (id)
) ENGINE=NDBCluster CHARACTER SET=utf8mb4;

Note: You still need to create the table through an SQL node.

With the data and table, the data can be imported using the following ndb_import command:

shell$ ndb_import --ndb-connectstring=localhost:1186 db1 t1.txt \
          --connections=2 --db-workers=2 \
          --fields-terminated-by="\t" --ignore-lines=1 \
          --lines-terminated-by="\n"

The arguments in the first line of the command tell ndb_import how to connect to the management node, the schema name, and the name of the file to import data from. As with mysqlimport, the table name is derived from the file name by using the basename, t1 in this case.

The --connections and --db-workers options define the parallelism of the job. The --connections option is equivalent to the ndb-cluster-connection-pool option that can be set for SQL nodes and defines how many connections the ndb_import process creates to the data nodes. There must be at least as many free API node slots available as the number of connections requested; otherwise the command will hang while waiting for enough connections to become available and eventually fail. The --db-workers option specifies the number of threads ndb_import will use per data node.
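
One way to check how many API node slots are free before starting an import is to ask a management node. A sketch, assuming the management node from the example runs on localhost:1186:

shell$ ndb_mgm --ndb-connectstring=localhost:1186 -e SHOW
# The [mysqld(API)] section of the output lists each API slot and
# whether it is connected; unconnected slots are available to ndb_import.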

The remaining options are equivalent to the options available for mysqlimport, LOAD DATA, and SELECT ... INTO OUTFILE. They specify the format of the data inside the source file and how many rows to skip, one in this case due to the header.

If you need to load data into several tables in the same schema, you can specify a file name for each table, and they will all be processed.
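
For instance, a multi-table import might look like the following sketch, extending the earlier command; t2.txt and t3.txt are hypothetical files for tables db1.t2 and db1.t3:

shell$ ndb_import --ndb-connectstring=localhost:1186 db1 t1.txt t2.txt t3.txt \
          --connections=2 --db-workers=2 \
          --fields-terminated-by="\t" --ignore-lines=1 \
          --lines-terminated-by="\n"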

There are many more options available. I recommend you check out the reference manual page for ndb_import for more information.

The output of the command is:

job-1 import db1.t1 from t1.txt
job-1 [running] import db1.t1 from t1.txt
job-1 [success] import db1.t1 from t1.txt
job-1 imported 4 rows in 0h0m0s at 9 rows/s
jobs summary: defined: 1 run: 1 with success: 1 with failure: 0

This includes some metadata about the job, such as the table and file names, as well as the number of rows imported and how long it took.

What happens if something goes wrong? If we had thought the data was comma-separated and used --fields-terminated-by=",", an error would have occurred:

shell$ ndb_import --ndb-connectstring=localhost:1186 db1 t1.txt \
          --connections=2 --db-workers=2 \
          --fields-terminated-by="," \
          --ignore-lines=1 --lines-terminated-by="\n"
job-1 import db1.t1 from t1.txt
job-1 [running] import db1.t1 from t1.txt
job-1 [error] import db1.t1 from t1.txt
job-1 imported 0 rows in 0h0m0s at 0 rows/s
job-1 rejected 1 rows (limit 0), see t1.rej
job-1 error[gen-0]: error in teams: 1-csv-input (source:740)
job-1 1-csv-input error[data-0]: reject limit 0 exceeded (source:534)
jobs summary: defined: 1 run: 1 with success: 0 with failure: 1

Notice how the fifth line of the output says "job-1 rejected 1 rows (limit 0), see t1.rej". Each input file is considered a job, so "job-1" refers to the first (and in this case only) input file. It rejected 1 row, and the limit refers to the --rejects option, which specifies how many permanent errors must be encountered before failing the whole job. The default is to fail on the first error. Finally, "see t1.rej" tells us to look at the t1.rej file (created by ndb_import) for more information about the failed job:

shell$ cat t1.rej
runno rowid linenr startpos endpos bytes errortype errorcode sourceline errortext reject
0 0 2 32 54 22 data 0 1130 line 2: too few fields (1 < 4) 1\tJohn\tDoe\t1945-10-15\n

The t1.rej file is a tab-delimited file with information for each failed row. In this case, the error is that there are too few values: one value was found (because there are no commas in the data and we specified the wrong delimiter) but four values were expected. If --rejects had been set to 3, all four rows would have been included in t1.rej.
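
To tolerate a few bad rows instead of failing on the first one, the limit can be raised. A sketch based on the earlier command, collecting up to 10 rejected rows in t1.rej before the job is failed:

shell$ ndb_import --ndb-connectstring=localhost:1186 db1 t1.txt \
          --connections=2 --db-workers=2 \
          --fields-terminated-by="\t" --ignore-lines=1 \
          --lines-terminated-by="\n" --rejects=10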

The ndb_import binary is included with the NDB Cluster installation and can be found in the bin directory. The next step is for you to try it out. Good luck importing data into MySQL NDB Cluster 7.6.

Zero to DBA Hero Track at Southeast Linuxfest

This year the Zero to DBA Hero track at the Southeast Linuxfest expands to a second day. The event is free to attend, but it helps if you pre-register. Here is the agenda of what you will see in Charlotte June 8th, 9th, and 10th.


MySQL Talks at SELF – Zero to DBA Hero Track (Linode Ballroom)

Friday
  • 9:00am: 20 Years of MySQL, 20 Years of PHP, and 10 Years of SELF -- What the heck has been going on? (Dave Stokes)
  • 10:15am: Introducing the MySQL Document Store (Charles Bell, PhD)
  • 11:30am: Performance Analysis and Troubleshooting Methodologies for Databases (Peter Zaitsev)
  • 1:30pm: MySQL Without the SQL -- Oh My! (Dave Stokes)
  • 4:00pm: Introducing MySQL InnoDB Cluster (Charles Bell, PhD)

Saturday
  • 9:00am: MySQL/Percona Server/MariaDB Server Security Features Overview (Colin Charles)
  • 10:15am: ZFS and MySQL on Linux, The Sweet Spots (Jervin Real)
  • 11:30am: The MySQL Ecosystem - Understanding It, Not Running Away From It!! (Colin Charles)
  • 1:30pm: MySQL Troubleshooting and Performance Optimization with PMM (Peter Zaitsev)
  • 2:45pm: MySQL Windowing Functions (Dave Stokes)
  • 4:00pm: Understanding the Licensing of Your Database Stack (Colin Charles)

Sunday
  • 9:00am: Copyright, Open Source, the Free Market, and the History of MySQL (Alexander Rubin)
MySQL is again proud to be a platinum sponsor. Please drop by the MySQL booth to talk about MySQL 8, Group Replication, and the MySQL Document Store.
