Planet MySQL

A/UX, NeXTSTEP, and Mac OS X

One thing that gets tedious in the IT and Oracle communities is the penchant for Windows-only solutions. While Microsoft does an excellent job in certain domains, I remain a loyal Apple customer. By the way, you can install the Oracle Client software on Mac OS X and run SQL Developer against any Oracle Database server. You can even run MySQL Workbench and MySQL Server natively on the Mac OS X platform, which creates a robust development platform and gives you more testing options with the MySQL monitor (the client software).

Notwithstanding, some Windows users appear to malign Apple and Mac OS X on compatibility, but they don’t understand that it’s a derivative of Research Unix, by way of BSD (Berkeley Software Distribution). This Unix lineage chart illustrates it well:

I’m probably loyal to Apple because in the early 1990s I worked on Mac OS 6, Mac OS 7, A/UX, NeXTSTEP, and AIX/6000 (Version 3) while working at APL (American President Lines) in Oakland, California. Back then, my desktop was a pricey Macintosh Quadra 950, and today I work on a pricey Mac Pro desktop. The Mac Pro lets me use VMware to virtualize development environments for Oracle Linux, Red Hat Enterprise Linux, Fedora, and, as you might guess, Windows 7/8. My question to those dyed-in-the-wool Microsoft users is simple: why would you choose a single-user OS like Windows over a multi-user OS like Mac OS X?



Biebermarks

Yet another microbenchmark result. This one is based on behavior that has caused problems in the past for a variety of reasons, which led to a few interesting discoveries. The first was that using a short lock-wait timeout was better than relying on the InnoDB deadlock detection code. The second was that stored procedures could overcome network latency.

The workload is a large database where all updates are done to a small number of rows. I think it is important to use a large database to include the overhead from searching multiple levels of a b-tree. The inspiration for this is maintaining counts for popular entities like Justin Bieber and One Direction. This comes from serving the social graph. For more on that read about TAO and LinkBench.

The most popular benchmark for MySQL is sysbench and it is usually run with a uniform distribution so that all rows are equally likely to be queried or modified. But real workloads have skew which can cause stress in unexpected places and I describe one such place within InnoDB from this microbenchmark. YCSB and LinkBench are benchmarks that have skew and can be run for MySQL. I hope that more of the MySQL benchmark results in the future include skew.
Configuration
See a previous post for more details. Eight collections/tables with 400M documents/rows per collection/table were created. All collections/tables are in one database, so MongoDB suffers from the per-database RW-lock. But MySQL and TokuMX also suffer from a similar issue when all clients are trying to update the same row. Tests were run for 1, 2, 4 and 8 tables where one row per table was updated. So when the test used 4 tables there were 4 rows getting updates. For each number of tables, tests were run for up to 64 concurrent clients/threads. The result tables listed in the next section should make that clear.
The workload is updating the non-indexed column of one document/row by PK per transaction. There are no secondary indexes on the table. In this case the document/row with ID=1 is chosen for every update. For MySQL and TokuMX an auto-commit transaction is used. The journal (redo log) is used but the update does not wait for the journal/log to be forced to disk. The updates should not require disk reads as all relevant index and data blocks remain in cache. TokuMX might do reads in the background to maintain fractal trees but I don't understand their algorithm to be certain.
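
To make the workload concrete, here is a minimal sketch of the kind of table and statement being exercised (the table and column names are hypothetical and the use of an increment is only an example; the real schema is described in the earlier post):

CREATE TABLE pbench1 (
  id  BIGINT NOT NULL PRIMARY KEY,  -- the PK used for the point update
  val BIGINT NOT NULL               -- non-indexed column that gets modified
) ENGINE=InnoDB;

-- each client runs this repeatedly as an auto-commit statement, always for id=1
UPDATE pbench1 SET val = val + 1 WHERE id = 1;
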
The database was loaded in PK order and about 8 hours of concurrent & random updates were done to warmup the database prior to this test. The warmup was the same workload as described in a previous post.
The MySQL test client limits clients to one table. So when there are 64 clients and 8 tables then there are 8 clients updating the 1 row per table. The MongoDB/TokuMX client does not do that. It lets all clients update all tables so in this case there are at most 64 clients updating the row per table and on average there would be 8.
The test server has 40 CPU cores with HT enabled, fast flash storage and 144G of RAM. The benchmark client and database servers shared the host. Tests were run for several configurations:
  • mongo26 - MongoDB 2.6.0rc2, powerOf2Sizes=1, journalCommitInterval=300, w:1,j:0
  • mongo24 - MongoDB 2.4.9, powerOf2Sizes=0, journalCommitInterval=300, w:1,j:0
  • mysql - MySQL 5.6.12, InnoDB, no compression, flush_log_at_trx_commit=2, buffer_pool_size=120G, flush_method=O_DIRECT, page_size=8k, doublewrite=0, io_capacity=16000, lru_scan_depth=2000, buffer_pool_instances=8, write_io_threads=32, flush_neighbors=0
  • toku-32 - TokuMX 1.4.1, readPageSize=32k, quicklz compression, logFlushPeriod=300, w:1,j:0. I don't have results for toku-32 yet.
  • toku-64 - TokuMX 1.4.1, readPageSize=64k, quicklz compression, logFlushPeriod=300, w:1,j:0
Results per DBMS
I first list the results by DBMS to show the impact from spreading the workload over more rows/tables. The numbers below are the updates-per-second rates. I use "DOP=X" to indicate the number of concurrent clients; "DOP" stands for Degree Of Parallelism (it is an Oracle thing). A few conclusions from the results below:

  • MySQL/InnoDB does much better with more tables for two reasons. The first is that it allows for more concurrency. The second is that it avoids some of the overhead in the code that maintains row locks and threads waiting for row locks. I describe that in more detail at the end of this post.
  • MongoDB 2.4.9 is slightly faster than 2.6.0rc2. I think the problem is that mongod requires more CPU per update in 2.6 versus 2.4 and this looks like a performance regression in 2.6 (at least in 2.6.0rc2). I am still profiling to figure out where. More details on this are at the end of the post. I filed JIRA 13663 for this.
  • MongoDB doesn't benefit from spreading the load over more collections when all collections are in the same database. This is expected given the per-database RW-lock.

Updates per second
config  #tables  DOP=1  DOP=2  DOP=4  DOP=8  DOP=16  DOP=32  DOP=64
mysql         1   8360  15992  30182  24932   23924   23191   21048
mysql         2      X  16527  30824  49999   41045   40506   38357
mysql         4      X      X  32351  51791   67423   62116   59137
mysql         8      X      X      X  54826   80409   73782   68128

config  #tables  DOP=1  DOP=2  DOP=4  DOP=8  DOP=16  DOP=32  DOP=64
mongo24       1  10212  17844  30204  34003   33895   33564   33451
mongo24       2      X  10256  17698  30547   34125   33717   33573
mongo24       4      X      X  10670  17690   30903   34027   33586
mongo24       8      X      X      X  10379   17702   30920   33758
config  #tables  DOP=1  DOP=2  DOP=4  DOP=8  DOP=16  DOP=32  DOP=64
mongo26       1   9187  16131  27648  28506   27784   27437   27021
mongo26       2      X   9367  16035  27490   28326   27746   27354
mongo26       4      X      X   9179  16028   27666   28330   27647
mongo26       8      X      X      X   9125   16038   27275   27858
config  #tables  DOP=1  DOP=2  DOP=4  DOP=8  DOP=16  DOP=32  DOP=64
toku-64       1   7327  12804  16179  12154   11021    9990    8344
toku-64       2      X   7173  12690  20483   23064   22354   20349
toku-64       4      X      X   7191  12943   21399   33485   40124
toku-64       8      X      X      X   7121   12727   22096   38207

Results per number of tables
This reorders the results from above to show them for all configurations at the same number of tables. You are welcome to draw conclusions about which is faster.

Updates per second
config  #tables  DOP=1  DOP=2  DOP=4  DOP=8  DOP=16  DOP=32  DOP=64
mysql         1   8360  15992  30182  24932   23924   23191   21048
mongo24       1  10212  17844  30204  34003   33895   33564   33451
mongo26       1   9187  16131  27648  28506   27784   27437   27021
toku-64       1   7327  12804  16179  12154   11021    9990    8344
config  #tables  DOP=1  DOP=2  DOP=4  DOP=8  DOP=16  DOP=32  DOP=64
mysql         2      X  16527  30824  49999   41045   40506   38357
mongo24       2      X  10256  17698  30547   34125   33717   33573
mongo26       2      X   9367  16035  27490   28326   27746   27354
toku-64       2      X   7173  12690  20483   23064   22354   20349

config  #tables  DOP=1  DOP=2  DOP=4  DOP=8  DOP=16  DOP=32  DOP=64
mysql         4      X      X  32351  51791   67423   62116   59137
mongo24       4      X      X  10670  17690   30903   34027   33586
mongo26       4      X      X   9179  16028   27666   28330   27647
toku-64       4      X      X   7191  12943   21399   33485   40124

config  #tables  DOP=1  DOP=2  DOP=4  DOP=8  DOP=16  DOP=32  DOP=64
mysql         8      X      X      X  54826   80409   73782   68128
mongo24       8      X      X      X  10379   17702   30920   33758
mongo26       8      X      X      X   9125   16038   27275   27858
toku-64       8      X      X      X   7121   12727   22096   38207
Row locks for InnoDB
I used PMP to understand MySQL/InnoDB on this workload. I frequently saw all user threads blocked on a condition variable with this stack trace. It seems odd that all threads are sleeping. I think the problem is that one thread can run but has yet to be scheduled by Linux. My memory of the row lock code is that it wakes threads in FIFO order and when N threads wait for a lock on the same row then each thread waits on a separate condition variable. I am not sure if this code has been improved in MySQL 5.7. A quick reading of some of the 5.6.12 row lock code showed many mutex operations. Problems in this code have escaped scrutiny in the past because much of our public benchmark activity has used workloads with uniform distributions.

pthread_cond_wait@@GLIBC_2.3.2,os_cond_wait,os_event_wait_low2,lock_wait_suspend_thread,row_mysql_handle_errors,row_search_for_mysql,ha_innobase::index_read,handler::read_range_first,handler::multi_range_read_next,QUICK_RANGE_SELECT::get_next,rr_quick,mysql_update,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection

This was a less frequent stack trace from the test ...

lock_get_mode,lock_table_other_has_incompatible,lock_table,row_search_for_mysql,ha_innobase::index_read,handler::read_range_first,handler::multi_range_read_next,QUICK_RANGE_SELECT::get_next,rr_quick,mysql_update,mysql_execute_command,mysql_parse,dispatch_command,do_command,do_handle_one_connection,handle_one_connection

Row locks for TokuMX
TokuMX has a similar point at which all threads wait. It isn't a big surprise given that both provide fine-grained concurrency control but there is no granularity finer than a row lock.

pthread_cond_timedwait@@GLIBC_2.3.2,toku_cond_timedwait,toku::lock_request::wait,toku_db_wait_range_lock,toku_c_getf_set(__toku_dbc*,,db_getf_set,autotxn_db_getf_set(__toku_db*,,mongo::CollectionBase::findByPK(mongo::BSONObj,mongo::queryByPKHack(mongo::Collection*,,mongo::updateObjects(char,mongo::lockedReceivedUpdate(char,mongo::receivedUpdate(mongo::Message&,,mongo::assembleResponse(mongo::Message&,,mongo::MyMessageHandler::process(mongo::Message&,,mongo::PortMessageServer::handleIncomingMsg(void*)

MongoDB 2.4 versus 2.6
I get about 1.2X more updates/second with MongoDB 2.4.9 compared to 2.6.0rc2. I think the problem is that 2.6 uses more CPU per update. I filed JIRA 13663 for this but am still trying to profile the code. So far I know the following, all of which indicates that the 2.4.9 test is running 1.2X faster than 2.6.0rc2 with 32 client threads and 1 table:
  • I get ~1.2X more updates/second with 2.4.9
  • the Java sysbench client uses ~1.2X more CPU per "top" with 2.4.9
  • the context switch rate is ~1.2X higher with 2.4.9
The interesting point is that mongod for 2.4.9 only uses ~1.03X more CPU than 2.6.0rc2 per "top" during this test even though it is doing 1.2X more updates/second. So 2.6.0rc2 uses more CPU per update. I will look at "perf" output. I can repeat this with the GA version of 2.6.

MySQL Partitioning – A Quick Look at Partitioning – Separate Your Data for Faster Searches

In MySQL, partitioning is a way to separate the data in one table into smaller “sub-tables” for better query performance and data management.

For example, let’s say that you have a database containing numerous accounting transactions. You could just store all of these transactions in one table, but you only need to keep seven years’ worth of data for tax purposes. Instead of placing all of the data in one table, and then deleting the old data from that table, you could split the table into partitions, with each partition representing one year’s worth of data.

Then, after seven years, you could delete/drop the old partition. Partitions are flexible, as you can add, drop, redefine, merge, or split existing partitions (there are other options on what you could do with this data as well). Also, if you have a table that is going to contain a lot of rows, partitioning your data would allow your searches to be much faster, as the search can then be limited to a single partition. As of MySQL 5.6, you can split a table into as many as 8192 partitions.
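
For instance, dropping a partition that is no longer needed is a single statement. A hypothetical sketch (the table and partition names below are only examples):

ALTER TABLE transactions DROP PARTITION p2007;  -- removes the partition definition and all rows stored in it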

Here is the MySQL website’s explanation about partitions:

The SQL standard does not provide much in the way of guidance regarding the physical aspects of data storage. The SQL language itself is intended to work independently of any data structures or media underlying the schemas, tables, rows, or columns with which it works. Nonetheless, most advanced database management systems have evolved some means of determining the physical location to be used for storing specific pieces of data in terms of the file system, hardware or even both. In MySQL, the InnoDB storage engine has long supported the notion of a tablespace, and the MySQL Server, even prior to the introduction of partitioning, could be configured to employ different physical directories for storing different databases (see Section 8.11.3.1, “Using Symbolic Links“, for an explanation of how this is done).

Partitioning takes this notion a step further, by enabling you to distribute portions of individual tables across a file system according to rules which you can set largely as needed. In effect, different portions of a table are stored as separate tables in different locations. The user-selected rule by which the division of data is accomplished is known as a partitioning function, which in MySQL can be the modulus, simple matching against a set of ranges or value lists, an internal hashing function, or a linear hashing function. The function is selected according to the partitioning type specified by the user, and takes as its parameter the value of a user-supplied expression. This expression can be a column value, a function acting on one or more column values, or a set of one or more column values, depending on the type of partitioning that is used.

(From: https://dev.mysql.com/doc/refman/5.6/en/partitioning-overview.html)

There are four types of partition options for your data:

RANGE – This type of partitioning assigns rows to partitions based on column values falling within a given range.

LIST – Similar to partitioning by RANGE, except that the partition is selected based on columns matching one of a set of discrete values.

HASH – With this type of partitioning, a partition is selected based on the value returned by a user-defined expression that operates on column values in rows to be inserted into the table. The function may consist of any expression valid in MySQL that yields a nonnegative integer value. An extension to this type, LINEAR HASH, is also available.

KEY – This type of partitioning is similar to partitioning by HASH, except that only one or more columns to be evaluated are supplied, and the MySQL server provides its own hashing function. These columns can contain other than integer values, since the hashing function supplied by MySQL guarantees an integer result regardless of the column data type. An extension to this type, LINEAR KEY, is also available.

(From: https://dev.mysql.com/doc/refman/5.6/en/partitioning-types.html)

This post will just give you one example of how to partition your data, and then how to verify that your query is searching only the correct partition. It doesn’t do you any good if you partition your data but then write queries that perform a table scan to get your results. In this example, I am going to be separating the table data by the year.

We are going to create a simple membership table, and partition by RANGE. We will separate the partition by the year that the person joined and we will add one member to each year. Our members table will be very simple, with an ID, the date the person joined, and their first and last name. We will create the partition by using just the YEAR that they joined, while we keep the full date they joined in the joined column. We are also assigning the columns id and joined to be primary keys. Here is the CREATE TABLE statement:

CREATE TABLE `members` (
  `id` int(5) NOT NULL AUTO_INCREMENT,
  `joined` date NOT NULL,
  `lastname` varchar(25) NOT NULL,
  `firstname` varchar(25) NOT NULL,
  PRIMARY KEY (`id`,`joined`)
) ENGINE=InnoDB AUTO_INCREMENT=10000 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE ( YEAR(joined))
(PARTITION p0 VALUES LESS THAN (2011) ENGINE = InnoDB,
 PARTITION p1 VALUES LESS THAN (2012) ENGINE = InnoDB,
 PARTITION p2 VALUES LESS THAN (2013) ENGINE = InnoDB,
 PARTITION p3 VALUES LESS THAN (2014) ENGINE = InnoDB,
 PARTITION p4 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;

Our partitions will contain rows whose joined dates are earlier than the dates shown in the PARTITION BY statement. In other words, partition p0 will contain dates earlier than 01/01/2011 (i.e. dates in 2010 or earlier). Partition p1 will contain dates earlier than 01/01/2012 but later than 12/31/2010 (i.e. dates in 2011), partition p2 will contain dates in 2012, partition p3 will contain dates in 2013, and p4 will contain dates in 2014 and later. Before the year 2015 arrives, you will need to add an additional partition for 2015. Of course, you could go ahead and add partitions for the next several years.
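
Because p4 is the MAXVALUE partition, the 2015 partition has to be split out of it rather than simply appended. One way to do that, sketched against the members table above:

ALTER TABLE members REORGANIZE PARTITION p4 INTO
 (PARTITION p4 VALUES LESS THAN (2015) ENGINE = InnoDB,
  PARTITION p5 VALUES LESS THAN MAXVALUE ENGINE = InnoDB);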

If you want partition p0 to contain only the dates in 2010 (instead of all dates LESS THAN 2011), you can change the VALUES LESS THAN (2011) clause to VALUES IN (2010), which uses LIST partitioning instead of RANGE. But then any rows with a joined year not covered by one of the VALUES IN lists could not be inserted into the table.
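
For illustration only, a LIST-partitioned version of the same table might look like the following (members_list is a hypothetical name); any row whose YEAR(joined) is not in one of the lists would be rejected:

CREATE TABLE `members_list` (
  `id` int(5) NOT NULL AUTO_INCREMENT,
  `joined` date NOT NULL,
  `lastname` varchar(25) NOT NULL,
  `firstname` varchar(25) NOT NULL,
  PRIMARY KEY (`id`,`joined`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY LIST ( YEAR(joined))
(PARTITION p0 VALUES IN (2010),
 PARTITION p1 VALUES IN (2011),
 PARTITION p2 VALUES IN (2012),
 PARTITION p3 VALUES IN (2013),
 PARTITION p4 VALUES IN (2014));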

Now, let’s insert some data. We will insert one row into each partition, and then do a:

select id, joined, lastname, firstname from members;

to see what our data looks like:

mysql> insert into members (firstname, lastname, joined) values ("Mary", "Davis", "2010-01-14");
Query OK, 1 row affected (0.64 sec)

mysql> insert into members (firstname, lastname, joined) values ("John", "Hill", "2011-02-12");
Query OK, 1 row affected (0.01 sec)

mysql> insert into members (firstname, lastname, joined) values ("Steve", "Johnson", "2012-03-18");
Query OK, 1 row affected (0.01 sec)

mysql> insert into members (firstname, lastname, joined) values ("Beth", "Daniels", "2013-04-22");
Query OK, 1 row affected (0.03 sec)

mysql> insert into members (firstname, lastname, joined) values ("Bob", "Smith", "2014-05-29");
Query OK, 1 row affected (0.01 sec)

mysql> select id, joined, lastname, firstname from members;
+-------+------------+----------+-----------+
| id    | joined     | lastname | firstname |
+-------+------------+----------+-----------+
| 10000 | 2010-01-14 | Davis    | Mary      |
| 10001 | 2011-02-12 | Hill     | John      |
| 10002 | 2012-03-18 | Johnson  | Steve     |
| 10003 | 2013-04-22 | Daniels  | Beth      |
| 10004 | 2014-05-29 | Smith    | Bob       |
+-------+------------+----------+-----------+
5 rows in set (0.00 sec)

When you start building your queries, you want to make sure that each query actually uses the partitions. You can check this by including the EXPLAIN PARTITIONS statement before your select statement. Visit this link if you want to learn more about Obtaining Information About Partitions.

Since we made the id column a primary key, let’s look at what happens when we do a search by primary key. We will use the EXPLAIN PARTITIONS statement to see what partitions are being used in the search. Let’s look for Mary’s information. She has the ID of 10000.

mysql> EXPLAIN PARTITIONS select id, firstname, lastname, joined from members where id = '10000';
+----+-------------+---------+----------------+------+---------------+---------+---------+-------+------+-------+
| id | select_type | table   | partitions     | type | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+---------+----------------+------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | members | p0,p1,p2,p3,p4 | ref  | PRIMARY       | PRIMARY | 4       | const |    5 | NULL  |
+----+-------------+---------+----------------+------+---------------+---------+---------+-------+------+-------+
1 row in set (0.05 sec)

As you can see under the partitions column, all five partitions (p0,p1,p2,p3,p4) were searched for this information because the partitions were separated by the year, and not the id. So this query would not take advantage of our partitions.

Look at what happens when we also include Mary’s joined date along with the id column:

mysql> EXPLAIN PARTITIONS select id, firstname, lastname, joined from members where id = '10000' and joined = '2010-01-14';
+----+-------------+---------+------------+-------+---------------+---------+---------+-------------+------+-------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref         | rows | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------------+------+-------+
|  1 | SIMPLE      | members | p0         | const | PRIMARY       | PRIMARY | 7       | const,const |    1 | NULL  |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------------+------+-------+
1 row in set (0.00 sec)

As you can see, MySQL only had to search in partition p0. Since the joined column was included in the query, MySQL can go to that partition and use the PRIMARY key of id and quickly find the record it needs.

Let’s see what we would need to do if you wanted to find all of the members who joined in the year 2010 (like Mary). You would think that you could just use the YEAR function on the joined column. But you can’t use a function to convert the joined date to a year: MySQL would have to convert all of the values in the joined column first, so it won’t be able to limit the search to a single partition:

mysql> EXPLAIN PARTITIONS select id, firstname, lastname, joined from members where YEAR(joined) = '2010';
+----+-------------+---------+----------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table   | partitions     | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+---------+----------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | members | p0,p1,p2,p3,p4 | ALL  | NULL          | NULL | NULL    | NULL |    5 | Using where |
+----+-------------+---------+----------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.03 sec)

In this case, you still have to go through all of the partitions because of the YEAR function. It would be better to use a range in the WHERE clause to find the members from 2010:

mysql> EXPLAIN PARTITIONS select id, firstname, lastname, joined from members where joined < '2011-01-01' and joined > '2009-12-31';
+----+-------------+---------+------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | members | p0         | ALL  | NULL          | NULL | NULL    | NULL |    2 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

But what happens when you need to change the partitioned value of the joined date? What if Mary’s date was incorrect, and she really joined in 2011? What happens to the data? When you change the value of the partitioned column, MySQL will move that data to the appropriate partition. Let’s look at Mary’s information again, and also look at the EXPLAIN PARTITIONS statement for the same query.

mysql> select id, firstname, lastname, joined from members where id = '10000' and joined = '2010-01-14';
+-------+-----------+----------+------------+
| id    | firstname | lastname | joined     |
+-------+-----------+----------+------------+
| 10000 | Mary      | Davis    | 2010-01-14 |
+-------+-----------+----------+------------+
1 row in set (0.00 sec)

mysql> EXPLAIN PARTITIONS select id, firstname, lastname, joined from members where id = '10000' and joined = '2010-01-14';
+----+-------------+---------+------------+-------+---------------+---------+---------+-------------+------+-------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref         | rows | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------------+------+-------+
|  1 | SIMPLE      | members | p0         | const | PRIMARY       | PRIMARY | 7       | const,const |    1 | NULL  |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------------+------+-------+
1 row in set (0.00 sec)

We can see that Mary’s data is in partition p0. Now let’s change Mary’s joined date from 2010-01-14 to 2011-05-30, and then run both of the above statements again (but in the query we need to change Mary’s joined date to reflect the new date):

mysql> update members set joined = '2011-05-30' where id = '10000';
Query OK, 1 row affected (0.06 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select id, firstname, lastname, joined from members where id = '10000' and joined = '2011-05-30';
+-------+-----------+----------+------------+
| id    | firstname | lastname | joined     |
+-------+-----------+----------+------------+
| 10000 | Mary      | Davis    | 2011-05-30 |
+-------+-----------+----------+------------+
1 row in set (0.00 sec)

mysql> EXPLAIN PARTITIONS select id, firstname, lastname, joined from members where id = '10000' and joined = '2011-05-30';
+----+-------------+---------+------------+-------+---------------+---------+---------+-------------+------+-------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref         | rows | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------------+------+-------+
|  1 | SIMPLE      | members | p1         | const | PRIMARY       | PRIMARY | 7       | const,const |    1 | NULL  |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------------+------+-------+
1 row in set (0.00 sec)

We can see that Mary’s data is now in partition p1.

Partitioning your data can really improve query performance, but only if you know how to write queries that take advantage of the partitioning. Using the EXPLAIN PARTITIONS statement can really help you figure out whether your queries are doing that. You can also store separate partitions on separate storage devices (by using innodb_file_per_table), and in MySQL 5.7.4 (or greater), you can even move partitioned tables to another server.
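
As a rough sketch of the "separate storage devices" idea (it assumes innodb_file_per_table is enabled, and the directory paths are made up for the example), the DATA DIRECTORY option can be given per partition:

CREATE TABLE members_by_disk (
  `id` int(5) NOT NULL AUTO_INCREMENT,
  `joined` date NOT NULL,
  `lastname` varchar(25) NOT NULL,
  `firstname` varchar(25) NOT NULL,
  PRIMARY KEY (`id`,`joined`)
) ENGINE=InnoDB
PARTITION BY RANGE ( YEAR(joined))
(PARTITION p0 VALUES LESS THAN (2011)   DATA DIRECTORY = '/disk1/mysql-data',
 PARTITION p1 VALUES LESS THAN MAXVALUE DATA DIRECTORY = '/disk2/mysql-data');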

 
Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.


Congratulations Ubuntu, for the wide choice!

Inspired by Yngve Svendsen’s post, I too think it makes absolute sense to congratulate Ubuntu on the 14.04 LTS release (some server notes - MySQL has a section dedicated to it). Ubuntu users have a lot of server choice today (that’s from all major MySQL ecosystem vendors):

  • MySQL 5.5.35 ships in main. It is the default MySQL. Oracle has committed to providing updates to 5.5 throughout the LTS release cycle of Ubuntu (which is longer than the planned EOL for 5.5). This is why it was granted a Micro Release Exception (MRE).
  • MySQL 5.6.16 ships in universe
  • MariaDB 5.5.36 ships in universe.
  • Percona XtraDB Cluster 5.5.34 ships in universe

Ubuntu’s pitch is to be the cloud platform of choice, with OpenStack support. This explains why Percona XtraDB Cluster (the only Galera Cluster variant shipping — there is no upstream Codership release, and no MariaDB Galera Cluster) is critical infrastructure, as it’s used widely in OpenStack deployments. 451 Research estimates that the OpenStack distributions market is worth $82 million in 2014 and $119 million in 2015.

The press release had a choice quote from Percona CEO Peter Zaitsev:

“We are very pleased that Percona XtraDB Cluster is included in Ubuntu 14.04 LTS. Many organisations that use MySQL need high availability solutions to ensure that their applications meet the expectations of their users. Percona XtraDB Cluster is an easy to use, open source solution for MySQL clustering which addresses these high availability needs. We continue to see growth in Ubuntu usage by our customers and our open source software users so we are confident that the inclusion of Percona XtraDB Cluster in Ubuntu 14.04 will help spread the adoption of cost-effective, high availability MySQL.” Peter Zaitsev, Co-Founder and CEO at Percona

 

Related posts:

  1. Ubuntu 10.04 LTS released, MariaDB 5.1.44/5.2-BETA VM’s available
  2. OpenSUSE users have a choice of database now!
  3. Communications, Ubuntu 6.06 LTS & MySQL downloads



Percona Software in Ubuntu 14.04 LTS (Trusty Tahr) release

I’d like to congratulate Canonical on the new Ubuntu 14.04 LTS (Trusty Tahr) release. It really looks like a great release, and I say that having my own agenda: it looks even better because it comes with a full line of Percona software.
If you install Ubuntu 14.04 and run aptitude search you will find:

Percona Toolkit and Percona XtraBackup are at their latest versions, but Percona Server and Percona XtraDB Cluster come as 5.5 versions, which is in line with the default MySQL version, which again is 5.5.

I expect this release will make it much easier for users to get familiar with our software, so you can go and try this today!

The post Percona Software in Ubuntu 14.04 LTS (Trusty Tahr) release appeared first on MySQL Performance Blog.



Thoughts on Small Datum – Part 1

A little background…

When I ventured into sales and marketing (I’m an engineer by education) I learned I would often have to interpret and simply summarize the business value that is sometimes hidden in benchmarks. Simply put, the people who approve the purchase of products like TokuDB® and TokuMX™ appreciate the executive summary.

Therefore, I plan to publish a multipart series here on TokuView where I will share my simple summaries and thoughts on business value for the benchmarks Mark Callaghan (@markcallaghan), a former Google and now Facebook database guru, is publishing on his blog, Small Datum.

I’m going to start with his first benchmark post and work my way forward to the newest. And unless I get feedback which suggests this isn’t useful, I will add new posts here as Mark adds new benchmarks there.

In the interest of full disclosure, Mark is the brother of Tokutek VP of engineering, Tim Callaghan. Unfortunately for Tokutek, I know this means some of you may discount what he has to say. I hope you will look past the happy coincidence long enough to evaluate his methodology and findings. If you do I am confident you’ll find his work unbiased and informative.

The meat & potatoes…

With that introduction out of the way, here is how this marketer simply summarizes Mark’s first Small Datum benchmark post. It was published in February of this year and it’s titled Write Amplification: Write-optimized versus Update-in-place.

To understand the business value associated with this benchmark you first have to know a little about Write Amplification (WA): WA is an unfortunate characteristic of database applications that utilize SSD media to improve performance.  It can be measured and expressed as the actual number of bytes written in order to create a single byte of stored data.  There’s a good article on the subject over at Wikipedia.
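
Restating that definition as a ratio (nothing new here, just the paragraph above in formula form):

\text{Write Amplification} = \frac{\text{bytes physically written to the SSD}}{\text{bytes of data logically written by the application}}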

Notably, WA degrades performance thereby reducing the gains that come from using SSD. Moreover, because SSD failure is tied to the total number of bytes written over time, WA also reduces the life expectancy of your SSD media.

Why should you care about this?  Well, performance problems usually exhibit themselves in the form of dissatisfied users. If your Big Data application is customer-facing that could mean a revenue hit or other undesirable business impacts. Reduced SSD life expectancy obviously leads to increased costs.

In his work, Mark uses Facebook’s LinkBench to compare the WA characteristics of MySQL with the InnoDB versus TokuDB storage engines. He uses a number of tuning techniques to minimize WA in InnoDB and compares that to untuned TokuDB which was purpose-built to, among other things, minimize WA by replacing 40-year-old B-Tree Indexes with Tokutek patented Fractal Tree Indexes.

Bottom line: Mark’s WA benchmarks clearly show that the WA profile (expressed as total gigabytes written) of TokuDB applications is roughly half that of tuned InnoDB applications, or better. This suggests TokuDB applications will perform better while extending the life of your SSD hardware. To wit, the expected business benefits will include better application performance and user satisfaction, plus reduced hardware costs.

You can try it for yourself in your own environment by downloading the free community versions of TokuDB (or TokuMX) here and running your own benchmarks using Mark’s methodology as your guide. If you do, I’d love to hear from you.

As always, your thoughts and comments are welcome.

In Thoughts on Small Datum – Part II: This marketer’s simple summary of Mark’s insertion benchmarks comparing MySQL with InnoDB versus TokuDB and stock MongoDB versus TokuMX (our high-performance distribution of MongoDB).  If you want to get a head start, check out his post Insert Benchmark for InnoDB, MongoDB and TokuMX and Flash Storage.



"Anemomaster": DML visibility. Your must-do for tomorrow

Here's our take of master DML query monitoring at Outbrain (presented April 2014). It took a half-day to code, implement, automate and deploy, and within the first hour of work we managed to catch multiple ill-doing services and scripts. You might want to try this out for yourself.

What's this about?

What queries do you monitor on your MySQL servers? Many don't monitor queries at all, and only look up slow queries on occasion, using pt-query-digest. Some monitor slow queries, where Anemometer (relying on pt-query-digest) is a very good tool. At the extreme, some monitor TCP traffic on all MySQL servers -- good for you! In between, there's a particular type of query that is of special interest: DML (INSERT/UPDATE/DELETE) queries issued against the master.

They are of particular interest because they are only issued once against the master, yet propagate through replication topology to execute on all slaves. These queries have a direct impact on your slave lag and on your overall replication capacity. I suggest you should be familiar with your DMLs just as you are with your slow queries.

In particular, we had multiple occasions in the past where all or most slaves started lagging. Frantically we would go to our metrics; yes! We would see a spike in com_insert. Someone (some service) was obviously generating more INSERTs than usual, at a rate so high that the slaves could not keep up. But which INSERT was that? Blindly, we would look at the binary logs. Well, erm, what are we looking for, exactly?
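
(For reference, com_insert and friends are ordinary server status counters; sampling them by hand looks like the following, though in practice a metrics system graphs the deltas.)

SHOW GLOBAL STATUS LIKE 'Com_insert';
SHOW GLOBAL STATUS LIKE 'Com_update';
SHOW GLOBAL STATUS LIKE 'Com_delete';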

Two such occasions convinced us that there should be a solution, but it took some time till it hit us. We were already using Anemometer for monitoring our slow logs. We can do the same for monitoring our binary logs. Thus was born "Anemomaster".

Quick recap on how Anemometer works: you issue pt-query-digest on your slow logs on all MySQL hosts (we actually first ship the slow logs to a central place where we analyse them; same thing). This is done periodically, and slow logs are then rotated. You throw the output of pt-query-digest to a central database (this is built in with pt-query-digest; it doesn't necessarily produce human readable reports). Anemometer would read this central database and visualize the slow queries.

Analysing DMLs

But then, pt-query-digest doesn't only parse slow logs. It can parse binary logs. Instead of asking for total query time, we ask for query count, and on we go to establish the same mechanism, using the same pt-query-digest and the same Anemometer to store and visualize the DMLs issued on our masters.

When analysing DMLs we're interested in parsing the binary logs -- and it makes no sense to do that on all slaves. All slaves have the same copy of the binlog entries the master produces. It only takes one server to get an accurate picture of the DMLs on your replication topology.

One server could be the master, and this can indeed be done: just FLUSH MASTER LOGS, parse the binary logs with pt-query-digest, and you're done. But like others, we tend to look at our masters as tender babies. We care for them, and do not wish to overload them unnecessarily. We chose to get the binlog entries from our slaves, instead. We also chose to get the entries from the relay logs, since these are unaffected by slave performance and as long as network is good, we can expect the relay logs to be very up to date. At any given time we have two slaves that take this role (this is automated and verified). On a 10 minute period we would flush the relay logs on these servers, and analyse whatever relay logs we have not analysed as yet.

The script below is a slightly modified version of our own, and should work for the standard installation. Modify to fit your own data (in particular, it assumes relay logs are named mysqld-relay-bin; datadir is specified in /etc/my.cnf, and please don't ask me how to do this on Windows):

#!/bin/bash
#
# Digest latest relay logs file, write results to "anemomaster"
#
# This script can run from any machine; it only needs to execute on a single machine per mysql cluster, but for
# analysis availability it should execute on at least two hosts per cluster.
#
DATADIR=`grep datadir /etc/my.cnf|awk -F= '{print $2}'`
TMPDIR=/tmp
DIGESTED_RELAY_LOGS_FILE=${DATADIR}/digested_relay_logs.txt

touch $DIGESTED_RELAY_LOGS_FILE
chown mysql:mysql $DIGESTED_RELAY_LOGS_FILE

hostname=$(hostname)

echo "deleting old relay logs from ${TMPDIR}"
rm ${TMPDIR}/mysqld-relay-bin.[0-9]*

echo "Getting current relay log files"
existing_relay_log_files=$(ls -tr ${DATADIR}/mysqld-relay-bin.[0-9]* | head -n -1)
for existing_relay_log_file in $existing_relay_log_files
do
  cp -u $existing_relay_log_file $TMPDIR
done

echo "flushing relay logs"
/usr/bin/mysql -umyself -psecret -e 'flush relay logs\G;' 2>/dev/null
# sleep because the log file takes some time to disappear
sleep 1

echo "Getting current relay log files"
existing_relay_log_files=$(ls -tr ${DATADIR}/mysqld-relay-bin.[0-9]* | head -n -1)
for existing_relay_log_file in $existing_relay_log_files
do
  cp -u $existing_relay_log_file $TMPDIR
done

cd $TMPDIR
for relay_log_file in mysqld-relay-bin.[0-9]*
do
  # Filter this relay log file, since it's already been digested
  grep $relay_log_file $DIGESTED_RELAY_LOGS_FILE && rm $relay_log_file
done

for relay_log_file in mysqld-relay-bin.[0-9]*
do
  echo "digesting $relay_log_file"
  mysqlbinlog $relay_log_file | /usr/bin/pt-query-digest \
    --type binlog --order-by Query_time:cnt --group-by fingerprint --limit 100 \
    --review  P=3306,u=anemomaster,p=secret,h=anemomaster_host,D=anemomaster,t=global_query_review \
    --history P=3306,u=anemomaster,p=secret,h=anemomaster_host,D=anemomaster,t=global_query_review_history \
    --filter=" \$event->{Bytes} = length(\$event->{arg}) and \$event->{hostname}=\"$(hostname)\" " \
    --no-report
  echo "$relay_log_file" >> $DIGESTED_RELAY_LOGS_FILE
  rm $relay_log_file
done

# make sure the file does not bloat. 20 entries is more than enough.
tail -n 20 $DIGESTED_RELAY_LOGS_FILE > ${TMPDIR}/DIGESTED_RELAY_LOGS_FILE
cat ${TMPDIR}/DIGESTED_RELAY_LOGS_FILE > $DIGESTED_RELAY_LOGS_FILE

echo "done"

As for Anemometer, we patched it to support multiple environments ("clusters") -- but that is irrelevant to the DML change. If you just want to make it visualize DMLs, here are the major configuration changes to config.inc.php (the modified values are marked with comments):

$conf['history_defaults'] = array(
    'output'        => 'table',
    'fact-group'    => 'date',
    'fact-order'    => 'date DESC',
    'fact-limit'    => '90',
    'dimension-ts_min_start' => date("Y-m-d H:i:s", strtotime( '-90 day')),
    'dimension-ts_min_end'   => date("Y-m-d H:i:s"),
    'table_fields' => array('date', 'query_time_avg','ts_cnt','Query_time_sum')   // changed
);

$conf['report_defaults'] = array(
    'fact-group'    => 'checksum',
    'fact-order'    => 'ts_cnt DESC',   // changed
    'fact-limit'    => '20',
    'dimension-ts_min_start' => date("Y-m-d H:i:s", strtotime( '-1 day')),
    'dimension-ts_min_end'   => date("Y-m-d H:i:s"),
    'table_fields' => array('checksum','snippet', 'query_time_avg','ts_cnt','Query_time_sum'),   // changed
    'dimension-pivot-hostname_max' => null,
    'dimension-pivot-clustername_max' => null
);

$conf['graph_defaults'] = array(
    'fact-group'    => 'minute_ts',
    'fact-order'    => 'minute_ts',
    'fact-limit'    => '',
    'dimension-ts_min_start' => date("Y-m-d H:i:s", strtotime( '-7 day')),
    'dimension-ts_min_end'   => date("Y-m-d H:i:s"),
    'table_fields' => array('minute_ts'),
    // hack ... fix is to make query builder select the group and order fields,
    // then table fields only has to contain the plot_field
    'plot_field' => 'ts_cnt',   // changed
);

With a 10 minute rotation & digestion, we are able to analyze in near real-time what's been done on our masters. If we see a spike in com_insert/com_update/com_delete, or just see slave lag, we turn to Anemomaster and within a couple of minutes know exactly which service is guilty of abusing our database. We are also working to protect our database against abuse, but that's for another discussion.



Ubuntu 14.04 LTS first impressions

The Trusty Tahr is out with support for 5 years. Here are the first thoughts about it…


  • Unity feels more responsive than ever, especially the Alt+Tab.
  • New desktop environments these days (read: Cinnamon, GNOME Shell) take up a lot of RAM. I wanted to test 14.04's Unity; after a few hours of usage it too had scooped up 320 MB worth of RAM… I expected better.
  • One thing that definitely annoyed me a lot – every time I pressed "Alt + D" (Firefox's shortcut for address bar focus), the HUD came up after the address bar took focus. Hence after each Alt+D, I had an "Escape" key press following it. I hope Canonical listens to this. They might say "F6" is an alternate shortcut that avoids the problem. The problem doesn't occur if I'm slow with the key combination.
  • I would've loved it if Ubuntu packaged the Ubuntu Tweak Tool in the default installation.
  • The Web Apps for Firefox seem to have gone! I feel bad for their fans.
  • When I try to reach the "Appearance" app using the Dash, a stupid browser comes up as the first result.
  • This browser loads the Ubuntu home page. There are no UI elements to navigate anywhere else; the only way you can navigate is through the HUD! I have no idea why it's included.

The acid test is still pending: I've still got to use it during office hours, with Apache, MySQL, IntelliJ, NetBeans, and Sublime Text all running at once. Only then will Ubuntu as a desktop truly be gauged. The test machine is fairly strong:

  • Intel® Core™ i5 CPU M 540 @ 2.53GHz × 4
  • 3.7 GiB of RAM

All in all, Ubuntu continues to roll out another strong release.

Final Rating: 4/5



undo and redo

Here’s something that used to make my head spin a bit… A quick quiz: does undo generate redo? does redo generate undo?

When I heard that undo generated redo, it sent me for a loop. Undo is stored in the system tablespace (in undo segments), as regular data, and therefore it generates redo. Then I thought, OK, would redo generate undo? If so, we’re obviously in a vicious cycle. So, no. Why?

We need redo for undo so that if there is a crash, and some data has (have) been written to disk but not committed, and the undo wasn’t yet written to disk, we can recreate it from the redo, then use it to undo the uncommitted but written data. (Corrections if I’m inaccurate, please.  This is all from my head only.)

We don’t need undo for redo – if there is a crash, and a transaction was not fully completed, we need the undo to roll it back. And if we don’t need the redo, it’s not something we need to undo. The redo logs are special files outside of the database proper.

So, this may all be clear as mud from my writing, but I’ve always wanted to think it out loud.  Please correct me in the comments if I’m wrong on anything here.




WebScaleSQL on Windows? I wish, but not quite yet, it seems …

For fun, I tried building WebScaleSQL on Windows, even though it’s not [yet?] a supported platform.

Using the following (as I would to build MySQL/MariaDB):

cd c:\mysql\webscalesql-5.6.16
mkdir bld
cd bld
cmake ..
cmake --build . --config relwithdebinfo --target package

I end up with:

...
    238 Warning(s)
    110 Error(s)

Time Elapsed 00:05:08.53

Looking through the output, the main error is this:

C:\mysql\webscalesql-5.6.16\include\atomic_stat.h(33): fatal error C1083: Cannot open include file: 'atomic': No such file or directory [C:\mysql\webscalesql-5.6.16\bld\storage\innobase\innobase.vcxproj]

Of course the directory does exist, and permissions are correct.

C:\mysql\webscalesql-5.6.16\include\atomic_stat.h contains the following line:

#include <atomic>

And this exists:

C:\mysql\webscalesql-5.6.16\include\atomic

So there must be some reason Windows doesn’t like that include. As far as I can tell, most other includes specify an exact file, and not a directory.

I tried replacing the single include by specifying each .h file included in atomic\, but that resulted in many more errors (776).

If you have any ideas, please feel free to share. And if this is futile, feel free to mention that as well.

 



Resolving Error 1918, System Error Code 126, When Installing MySQL ODBC Driver

If you are installing MySQL ODBC Driver and encounter the following error:

Error 1918. Error installing ODBC driver MySQL ODBC 5.1 Driver, ODBC error 13: The setup routines for the MySQL ODBC 5.1 Driver could not be loaded due to system error code 126: The specified module could not be found. ...\myodbc5S.dll).. Verify...

Then you will need to install the Microsoft Visual C++ 2010 Redistributable Package (select the appropriate one for your OS architecture below):

64-bit version:

http://www.microsoft.com/en-us/download/confirmation.aspx?id=14632

32-bit version:

http://www.microsoft.com/en-gb/download/details.aspx?id=5555

After installing that, re-attempt installing the MySQL ODBC connector, and things should work smoothly.

Hope this helps.

 




Resolving MySQL ODBC “architecture mismatch” Error

If you attempt to use ODBC to run a MySQL application and run into the following error:

[Microsoft][ODBC Driver Manager] The specified DSN contains an architecture mismatch between the Driver and Application

This means there is a 64-bit versus 32-bit mismatch.

Most likely, you’re running 64-bit Windows, as well as 64-bit MySQL ODBC connector, but the application is 32-bit.

If this is the case, you will also need to install the 32-bit MySQL ODBC connector, and then create the connection from the 32-bit ODBC.

odbcad32.exe is the file to create the connections. Both 64-bit and 32-bit files have the same name, just differing locations.

This is the default location for the 64-bit ODBC:

C:\Windows\System32\odbcad32.exe

This is the default location for the 32-bit ODBC:

C:\Windows\SysWOW64\odbcad32.exe

And should you need to install MySQL ODBC, there are good instructions here (both for MSI and Manual installs).

Hope this helps.

 



SSL and MariaDB/MySQL

With the recent Heartbleed bug, people are clearly more interested in their MariaDB/MySQL running with SSL and if they have problems. First up, you should read the advisory notes: MariaDB, Percona Server (blog), and MySQL (blog).

Next, when you install MariaDB (or a variant) you are usually dynamically linked to the OpenSSL library that the system provides. Typically, on startup of MariaDB 10.0.10 on CentOS 6.5 (packages from the MariaDB repository), you can check what the status of SSL is.

MariaDB [(none)]> show variables like 'have_ssl';
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| have_ssl      | DISABLED |
+---------------+----------+
1 row in set (0.00 sec)

This means that SSL options are compiled, but mysqld didn’t start with it. You can verify SSL is linked dynamically:

ldd `which mysqld` | grep ssl
        libssl.so.10 => /usr/lib64/libssl.so.10 (0x00007ff82d1b1000)

If you are running with SSL enabled (some documentation at MySQL) you will have different options naturally. You can do this via: /etc/init.d/mysql start --ssl. Output now changes:

MariaDB [(none)]> show variables like 'have_ssl';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| have_ssl      | YES   |
+---------------+-------+
1 row in set (0.00 sec)

The value NO will be displayed if the server is not compiled with SSL support. See SSL Server System Variables for more.
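
A quick client-side sanity check, not from the post itself but a common companion step, is to confirm whether the connection you are currently on is actually encrypted:

SHOW STATUS LIKE 'Ssl_cipher';
-- an empty Value means the current connection is not using SSL;
-- a cipher name (for example DHE-RSA-AES256-SHA) means it is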

Related posts:

  1. MySQL 5.6 system variables in the MariaDB 10 server
  2. Using MariaDB on CentOS 6
  3. MariaDB 10.0.5 storage engines – check the Linux packages



Congratulations, Ubuntu!

Today, we congratulate our friends at Ubuntu on a great new release, Ubuntu 14.04 LTS. As you can see in Mark Shuttleworth’s posting on Google+ from a few weeks back, MySQL has been cooperating closely with the Debian and Ubuntu communities to make sure that MySQL works very well on these platforms, and Ubuntu 14.04 […]

MySQL Enterprise Monitor 2.3.16 has been released

We are pleased to announce that MySQL Enterprise Monitor 2.3.16 is now available for download on the My Oracle Support (MOS) web site.

The Service Manager, Agent, and bundled MySQL Server binaries included in 2.3.16 are all updated to use OpenSSL 1.0.1g. Please see http://www.oracle.com/technetwork/topics/security/opensslheartbleedcve-2014-0160-2188454.html for further information. You can also find additional details about Enterprise Monitor 2.3.16 in the change log.

You will find binaries for the new release on My Oracle Support. Choose the "Patches & Updates" tab, and then choose the "Product or Family (Advanced Search)" side tab in the "Patch Search" portlet.

Thanks and Happy Monitoring!

- The MySQL Enterprise Tools Development Team



MariaDB 5.5.37 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB 5.5.37. This is a Stable (GA) release.

See the Release Notes and Changelog for detailed information on this release and the What is MariaDB 5.5? page in the MariaDB Knowledge Base for general information about the MariaDB 5.5 series.

Download MariaDB 5.5.37

Release Notes Changelog What is MariaDB 5.5?

MariaDB APT and YUM Repository Configuration Generator

Thanks, and enjoy MariaDB!



InfiniDB's Response to the "Heartbleed" Bug

At InfiniDB, we take matters of communication security and privacy very seriously, and want to assure our customers and users of our software that no version of InfiniDB is affected by "Heartbleed". None of the InfiniDB database components supports SSL and none of them link with any SSL libraries.

Proposal to deprecate mysqlhotcopy

In the MySQL team, we are considering deprecating the mysqlhotcopy utility. To provide some background, here is an excerpt from the MySQL manual:

mysqlhotcopy is a Perl script that was originally written and contributed by Tim Bunce. It uses FLUSH TABLES, LOCK TABLES, and cp or scp to make a database backup. It is a fast way to make a backup of the database or single tables, but it can be run only on the same machine where the database directories are located. mysqlhotcopy works only for backing up MyISAM and ARCHIVE tables. It runs on Unix.
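
In SQL terms, the approach the manual describes boils down to roughly this sequence (a simplified sketch; the database/table names are hypothetical, and the actual file copy happens at the filesystem level, outside the server):

LOCK TABLES mydb.mytable READ;     -- block writes to the table
FLUSH TABLES mydb.mytable;         -- make sure pending changes reach the .MYD/.MYI files
-- ... copy mytable.frm, mytable.MYD, mytable.MYI with cp or scp ...
UNLOCK TABLES;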

And now let me explain the motivations leading to our proposal:

  • The name implies that this utility will work for ‘MySQL’ and is hot, but actually neither is true:
    1. Only MyISAM and ARCHIVE engines are supported.
    2. Tables are read-locked before the copy phase.

    By contrast, backing up InnoDB tables with mysqldump --single-transaction or MySQL Enterprise Backup uses the more traditional definition of ‘hot’, and allows both READ and WRITE operations to continue during the backup.

  • There is no warning when backing up InnoDB tables that this operation is not supported. The .ibd files are also copied as part of creating the backup, which may mislead a user to believe they have successfully taken a backup.
  • The manual page for mysqlhotcopy notes that the utility ‘runs on UNIX’, since it relies on some filesystem locking semantics which can not be guaranteed to work cross-platform. We would like to ensure that Windows users always have a first-class experience.
  • mysqlhotcopy does not guarantee a consistent backup in the case that there are VIEWs present. I have a test-case available here to demonstrate.

Our current plan is to deprecate mysqlhotcopy in MySQL 5.6, with removal as early as MySQL 5.7. Do you use mysqlhotcopy? Will you be affected by the removal of this program?

We are seeking feedback from users who will be affected by this plan. Please leave a comment or get in touch!


