Planet MySQL

Implementing asynchronous cascade delete in MySQL

A while back one of my foreign keys started causing trouble. The problem was that some parent rows had tens of thousand of child rows, and the foreign key was defined with CASCADE DELETE enabled. When we deleted one of those parent rows on a master database, it took several seconds to execute the delete because of the cascade. This led to latency for the end user, and also led to replication delays.

The immediate solution was make the application tolerant of orphaned rows in the child table and to drop the explicit foreign key constraint.

I didn't really want to leave those orphaned rows hanging around in the child table, so I decided to implement an asynchronous process to delete the orphaned rows on a scheduled basis. Read on for a description of that process.

Using the sakila database as an example, imagine I drop the foreign key between file_category and category, like so:

alter table sakila.film_category drop foreign key fk_film_category_category;

Without the foreign key in place, deletes on the category table lead to orphaned rows in film_category. For example, I will delete the "New" category:

``` mysql> delete from sakila.category

-> where name = 'New';

Query OK, 1 row affected (0.01 sec) ```

There are several ways to count the orphaned rows. Here are two different naive implementations using OUTER JOIN or NOT EXISTS:

``` mysql> select count(*)

-> from sakila.film_category c -> left outer join sakila.category p on p.category_id = c.category_id -> where p.category_id is null;

+----------+ | count(*) | +----------+ | 63 | +----------+ 1 row in set (0.00 sec)

mysql> select count(*)

-> from sakila.film_category c -> where not exists -> ( -> select NULL from sakila.category p where p.category_id = c.category_id -> );

+----------+ | count(*) | +----------+ | 63 | +----------+ 1 row in set (0.01 sec) ```

I can also delete the orphaned rows using the same query approaches. I'll roll the first delete back so I can demonstrate the second query in the same session:

``` mysql> set autocommit = 0; Query OK, 0 rows affected (0.00 sec)

mysql> delete c.*

-> from sakila.film_category c -> left outer join sakila.category p on p.category_id = c.category_id -> where p.category_id is null;

Query OK, 63 rows affected (0.02 sec)

mysql> rollback; Query OK, 0 rows affected (0.00 sec)

mysql> delete c.*

-> from sakila.film_category c -> where not exists (select NULL from sakila.category p where p.category_id = c.category_id);

Query OK, 63 rows affected (0.01 sec)

mysql> commit; Query OK, 0 rows affected (0.00 sec) ```

This approach will be very slow for tables containing millions of rows, so in my real world case I didn't use this approach. Instead I decided it would be a lot easier and faster to delete the orphaned rows if I knew who their parent was. To this end I created a new table to track the deleted rows, and populated it using a trigger. Continuing the example in the sakila database:

``` CREATE TABLE category_deleted ( category_id tinyint(3) unsigned NOT NULL, name varchar(25) NOT NULL, last_update timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, delete_time timestamp NOT NULL, PRIMARY KEY (category_id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DELIMITER $$

DROP TRIGGER IF EXISTS sakila.TR_A_DEL_CATEGORY $$

CREATE TRIGGER sakila.TR_A_DEL_CATEGORY AFTER DELETE ON sakila.category FOR EACH ROW BEGIN

INSERT IGNORE INTO sakila.category_deleted (category_id, name, last_update, delete_time) VALUES (old.category_id, old.name, old.last_update,now());

END $$

DELIMITER ; ```

Now I can delete another category to test the trigger:

``` mysql> -- delete a single category mysql> delete from sakila.category

-> where name = 'Classics';

Query OK, 1 row affected (0.01 sec)

mysql> -- verify the trigger worked mysql> select * from sakila.category_deleted; +-------------+----------+---------------------+---------------------+ | category_id | name | last_update | delete_time | +-------------+----------+---------------------+---------------------+ | 4 | Classics | 2006-02-15 04:46:27 | 2013-05-21 18:21:53 | +-------------+----------+---------------------+---------------------+ 1 row in set (0.00 sec)

mysql> -- count the orphaned rows using the _deleted table mysql> select count(*)

-> from sakila.film_category c -> inner join sakila.category_deleted p on p.category_id = c.category_id;

+----------+ | count(*) | +----------+ | 57 | +----------+ 1 row in set (0.00 sec) ```

I also wanted to execute the deletes on the child table in chunks, so I implemented a stored procedure to delete the orphaned rows by iterating through the rows in the _deleted table, deleting the child rows, and then deleting from the _deleted table. If I had to implement it again, I would probably use common_schema to chunk the deletes so I wouldn't need the stored procedure.

Here's an implementation using common_schema:

``` -- first delete the orphaned rows from the child table set @script := " split(sakila.film_category: delete sakila.film_category.*

from sakila.film_category inner join sakila.category_deleted on sakila.category_deleted.category_id = sakila.film_category.category_id

) SELECT $split_total_rowcount AS 'rows deleted so far'; "; call common_schema.run(@script);

-- then delete the rows from the _deleted table (assuming they have no children) delete p.* from sakila.category_deleted p left outer join sakila.film_category c on c.category_id = p.category_id where c.category_id is null; ```


PlanetMySQL Voting: Vote UP / Vote DOWN

Shard-Query 2.0 Beta 1 released

It is finally here.  After three years of development, the new version of Shard-Query is finally available for broad testing.

This new version of Shard-Query is vastly improved over previous versions in many ways.  This is in large part due to the fact that the previous version of Shard-Query (version 1.1) entered into production at a large company.  Their feedback during implementation was invaluable in building the new Shard-Query features.   The great thing is that this means that many of the new 2.0 features have already been tested in at least one production environment.

This post is intended to highlight the new features in Shard-Query 2.0.  I will be making posts about individual features as well as posting benchmark results.

Configuration Repository
Shard-Query 1.x stored the configuration in a .ini file.  In an installation with many nodes, keeping the .ini file in sync was difficult.  There was no way to guarantee that all of Shard-Query saw the state of the cluster (in terms of node composition) as an entire unit.  Shard-Query 2.0 solves this problem and makes configuration and bootstrapping of new nodes simpler.  Shard-Query 2.0 stores the configuration inside of a database schema, rather than in a flat file.  Each node is made aware of the configuration repository through the node setup process.  The cluster configuration is then read from the repository each time Shard-Query initializes.

REST interface/GUI
Shard-Query 2.0 features a user interface built using the DooPHP MVC framework.  This interface was graciously contributed by Alex Hurd as part of the implementation I spoke of earlier.  The web interface provides for both query execution and post-installation Shard-Query configuration.

MySQL proxy 0.8.3 interface
Shard-Query now includes a Lua script for MySQL proxy.  This allows applications designed to interact with MySQL to work with Shard-Query transparently.  This is very useful for tools such as Mondrian, Tableau or other tools that are designed for working with and visualizing big data.

DDL and SHOW support
Shard-Query now supports creating and dropping tables and indexes.  The MySQL Proxy interface also supports SHOW commands such as SHOW TABLES and SHOW DATABASES.  Please note that cross-shard operations such as creating and dropping tables requires that all shards be online for the operation.  Failures may result in inconsistency (tables or indexes on some nodes, but not others) so please use with care.  Unfortunately MySQL does not have transactional DDL.

Extended SQL support
Previous versions of Shard-Query only supported limited support for SQL expressions.  Expressions such as ROUND(SUM(expr), 2) did not work.  Nor did SUM(expr)/COUNT(expr).  These issues have been resolved.  Shard-Query now supports nearly all MySQL SELECT syntax.  This includes complex expressions, ORDER BY, LIMIT, HAVING and WITH ROLLUP.  Support for subqueries in the FROM clause has been extended.  UNION and UNION ALL queries are now executed slightly differently as well.  SQL syntax checking is now performed as well, and malformed queries return familiar error messages.

Partitioning based parallelism
Tables partitioned at the MySQL level are now automatically detected and used for intra-query parallelism.  Currently RANGE, LIST, RANGE COLUMNS and LIST COLUMNS partitioning methods are supported for partition parallelism when a single column is used as the partition key. Shard-Query will automatically construct WHERE clauses to examine individual partitions and examine partitions in parallel. The next version of Shard-Query (2.5) will support all types of partitions for any number of partition key columns for MySQL 5.6.10 or greater, through use of the new PARTITION hint.

Improved parallel loading mechanism
The Shard-Query loader was formerly multi-process, but this did not work on Windows.  Now the loader has been split into two parts.  The first is a quick scan operation with identifies individual “chunks” of the table to load.  The second phase involves Gearman workers actually loading those chunks.  Currently, the flat file input must be available on all nodes running loader workers via the same path.  This likely means that the files must be placed on shared storage or duplicated on all nodes on which the loader workers run.  This limitation will probably be lifted in the future.

Asynchronous query execution
Shard-Query now supports executing queries asynchronously.  There is a job table which can be used to determine if a background job has completed yet.  See the bin/update_job_table script which actually populates the job table. If you don’r run (or cron) the script, you won’t see any jobs completing.  See another example in the GUI (it invokes the job table update code too).  This feature can be used to execute long running queries over large amounts of data.

Multi-shard (range) shard lookups
Queries which access more than one shard key via IN, BETWEEN or OR clauses will now be sent to only the shards which contain the keys in question.  In previous versions range lookups scanned all shards. A bug was also fixed in WHERE clause processing.  WHERE clause expressions such as “WHERE (the_shard_key IN(1,2,3,4,5))” were not handled properly due to the enclosing parenthesis, but this has been resolved.

Star Schema Optimization
Star schemata present unique challenges to traditional database optimizers.  By turning on star schema optimization Shard-Query can offer vastly improved query performance for queries which utilize a star schema.  Often an improvement of an order of magnitude or more can be seen when using this option, but it only works forstar schema.

Custom aggregate functions
The parser now supports the addition of custom aggregate functions to the SQL dialect.  A reference function called PERCENTILE(expr, N) is included.  Custom non-aggregate functions are not yet supported.



PlanetMySQL Voting: Vote UP / Vote DOWN

PHPTek

The PHPTek Conference bills itself as the premier professional PHP conference with a community flair. Having been to dozens of PHP conferences, I thought that was a pretty bold claim.

MySQL Boogiebot

But as they say in Texas, it ain’t bragging if you can do it.

MySQL plush dolphins and BoogieBots. The wind up dancing robots were popular in Chicago

And the organizers picked perfect weather for this Chicago based show.

The quality of the presenters and presentations was amazing high. Most of these sessions covered intricacies with the PHP language. The talks covered the range from beginning Symfony 2 to advanced security. The presenters were indeed the cream of the crop and the material covered very well developed.

Oracle’s own Ligaya Trumelle started the second day with a MySQL 5.6 new features session specifically tailored to what this elite audience wanted to know about 5.6. Later she teamed with Davey Shafik on Engine Yard on MySQL HA, Recovery, and Load Balancing. The second talks was a comprehensive look at the most popular options for all three areas1.

Oracle was a sponsor of this show and I also had an opportunity to present Ten Things To Do To Make Your MySQL Databases Stronger and Healthier when not at the booth handing out three hundred Boogie Bots, MySQL Stickers, and a handful of plush Sakila Dolphins. The attendee questions at the booth were also high level (isolation levels for the best insert concurrency, how to use GTIDs, and partitioning) as opposed to more mundane questions at other shows.

So this truly is the premier PHP show and if you are a serious PHP coder, you need to plan to attend this show next year.

  1. You should download their slides on this as they worked hard to provide good benchmark numbers. Sadly some of the software (third party) did not prove to be robust enough for a good comparison. Lig and Davey put a lot of hard work into this sessions and the slides were loaded with vital info and their configurations.


PlanetMySQL Voting: Vote UP / Vote DOWN

Replication in MySQL 5.6: GTIDs benefits and limitations – Part 1

Global Transactions Identifiers are one of the new features regarding replication in MySQL 5.6. They open up a lot of opportunities to make the life of DBAs much easier when having to maintain servers under a specific replication topology. However you should keep in mind some limitations of the current implementation. This post is the first one of a series of articles focused on the implications of enabling GTIDs on a production setup.

The manual describes very nicely how to switch to GTID-based replication, I won’t repeat it.

Basically the steps are:

  • Make the master read-only so that the slaves can execute all events and be in sync with the master
  • Change configuration for all servers and restart them
  • Use CHANGE MASTER TO to instruct all servers to use GTIDs
  • Disable read-only mode

This procedure will switch all your servers from regular replication to GTID replication. But if you are running a production system, you will probably want to gradually enable GTID replication for an easier rollback in the event of a problem. And some items in the documentation are not so clear.

For instance:

  • Do we really need to restart all the servers at the same time? Downtime is something we like to avoid!
  • Is it necessary to make the master read-only?
  • Can we use regular replication for some slaves and GTID replication for other slaves at the same time?

To find an answer to these questions, let’s create a simple replication configuration with one master and two slaves, all running MySQL 5.6 with GTIDs disabled.

First try: configure only one of the servers with GTIDs

Let’s stop slave #2, change configuration and restart it:

mysql> show slave status\G [...] Slave_IO_Running: No Slave_SQL_Running: Yes [...]

The error log tells us why the IO thread has not started:

2013-05-17 13:21:26 3130 [ERROR] Slave I/O: The slave IO thread stops because the master has GTID_MODE OFF and this server has GTID_MODE ON, Error_code: 1593

So unfortunately if you want replication to work correctly, gtid_mode must be ON on all servers or OFF on all servers, but not something in the middle.

What if we try to reconfigure the master? This time, replication on slave #1 will stop:

2013-05-17 13:32:08 2563 [ERROR] Slave I/O: The slave IO thread stops because the master has GTID_MODE ON and this server has GTID_MODE OFF, Error_code: 1593

These simple tests answer the first two questions: replication works only if all servers have the same value for gtid_mode, so you should restart them at the same time, which is best done by making the master read-only. However, “at the same time” means “at the same binlog position”, so you can perfectly restart the servers one by one.

Second try: GTIDs enabled, mixing regular replication and GTID replication

This time, we will enable GTID replication on slave #1, but not on slave #2:

# slave #1 mysql> change master to master_auto_position = 1; mysql> start slave;

and let’s create a new table on the master:

mysql> create table test.t (id int not null auto_increment primary key);

Executing SHOW TABLES FROM test on both slaves shows that the table has been created everywhere. So once GTIDs are enabled on all servers, you can have some slaves using file-based positioning and some other slaves using GTID-based positioning.

This answers the second question: we can have different replication modes on different servers, but only if all servers have gtid_mode set to ON. Could it be interesting to run file-based replication when gtid_mode is ON? I can’t think of any use case, so in practice, you’ll probably use either file-based replication only (gtid_mode=off for all servers) or GTID-based replication only (gtid_mode=on for all servers).

Additional question: how can you know if a slave is using GTID-based replication by inspecting the output of SHOW SLAVE STATUS? Look at the last field, Auto_Position:

# Slave #1 mysql> show slave status\G [...] Auto_Position: 1 -> GTID-based positioning # Slave #2 mysql> show slave status\G [...] Auto_Position: 0 -> File-based positioning

Conclusion

Enabling GTID-based replication can be tricky if your application does not easily tolerate downtime or read-only mode, especially if you have a lot of servers to reconfigure. It would be really nice to be able to mix servers where gtid_mode is ON with servers where gtid_mode is OFF. This would greatly simplify the transition to GTID-based replication and allow easier rollbacks if something goes wrong.

The post Replication in MySQL 5.6: GTIDs benefits and limitations – Part 1 appeared first on MySQL Performance Blog.


PlanetMySQL Voting: Vote UP / Vote DOWN

MyISAM's "table lock" problem, and how InnoDB solves it

Most serious users of MySQL have moved their tables to InnoDB years ago.  For those who haven't, let's discuss why InnoDB is a more scalable solution than MyISAM. MyISAM was designed to be very fast for read queries.  It does not handle higher loads of writes very well.  It also suffers a more serious flaw: it isn't crash-safe.  In other words, you better have frequent backups. MyISAM tables
PlanetMySQL Voting: Vote UP / Vote DOWN

Why I do what I do?

I was sincerely affected by this last MySQL post and this other very long post from Jeremy Cole.
Yes, these two guys are MySQL rock stars and they are really impressives, their involvement in the MySQL community is utter!

I don’t want to write a long long speech about my simple life…
I just want to clarify why I do what I do.

Many people have asked or wondered without asking why I do what I do (Jeremy Cole - 2013)

A few years ago Ashley Unitt asked me what I was most proud of, and now, I can make a complete answer. I’m very proud to take part of a community, MySQL has transformed my job into a passion and an incredible desire to share this passion, as honestly as possible.

What does it mean in real life?

If I can help in decision making or take part of a discussion, without any bad motivations, independently, I do it!

If the opportunity arises to have a good time over a glass of rosé and discussing open source with Stéphane Varoqui, I do it!

If I have suddenly an idea and a friend with me for several nights and weekends to achieve it, oh yes, I do it!

I do it because it’s terribly exciting and fun, even if I’ve just did it for myself… on a whim.

Yes, I’m very proud when we made MYXPLAIN.net with Max, I’m proud to work for an amazing company, I’m proud to advise someone who challenge me from the other side of the world, I’m proud to read an email from Rick James in my inbox…

I’m proud to be honest!


PlanetMySQL Voting: Vote UP / Vote DOWN

Get the Best from Web, Cloud, and Embedded Applications as a MySQL DBA

After taking this MySQL for Database Administrators course, you will be equipped to use all the features of MySQL to get the best out of your Web, Cloud, and embedded applications, whether you work with the command line or graphical tools such as MySQL Workbench and MySQL Enterprise Monitor, whether your application uses complex queries or the NoSQL API, and whether your preferred challenge is replicated servers or highly-tuned transactional systems.

You can take this 5-day live instructor-led course as a:

  • Live-Virtual Event: Take this course from your own desk, no travel required. You can choose from a wide selection of events on the schedule to suit different timezones.
  • In-Class Event: Travel to an education center to attend this class. Below is a selection of events already on the schedule.

 Location

 Date

 Delivery Language

 Brussels, Belgium

 3 June 2013

 English

 London, England

 9 September 2013

 English

 Aix-en-Provence, France

 2 December 2013

 French

 Bordeaux, France

 2 December 2013

 French

 Nice, France

 4 November 2013

 French

 Puteaux, France

 16 September 2013

 French

 Strasbourg, France

 1 July 2013

 French

 Dresden, Germany

 3 June 2013

 German

 Dusseldorf, Germany

 24 June 2013

 German

 Gummersbach, Germany

 1 July 2013

 German

 Hamburg, Germany

 24 June 2013

 German

 Munchen, Germany

 19 August 2013

 German

 Munster, Germany

 9 September 2013

 German

 Stuttgart, Germany

 8 July 2013

 German

 Budapest, Hungary

 4 November 2013

 Hungarian

 Belfast, Ireland

 24 June 2013

 English

 Milan, Italy

 7 October 2013

 Italian

 Rome, Italy

 17 June 2013

 Italian

 Utrecht, Netherlands

 24 June 2013

 Dutch

 Warsaw, Poland

 5 August 2013

 Polish

 Lisbon, Portugal

 16 September 2013

 European Portugese

 Cairo, Egypt

 30 June 2013

 English

 Nairobi, Kenya

 22 July 2013

 English

 Jakarta, Indonesia

 16 September 2013

 English

 Petaling Jaya, Malaysia

 1 July 2013

 English

 Makati City, Philippines

 3 June 2013

 English

 Melbourne, Australia

 3 June 2013

 English

 Curitiba, Brazil

 27 May 2013

 Brazilian Portugese

 Sao Paolo, Brazil

 10 June 2013

 Brazilian Portugese

 Mexico City, Mexico

 24 June 2013

 Spanish

To register for this course or to learn more about the courses on the authentic MySQL curriculum, go to http://oracle.com/education/MySQL.


PlanetMySQL Voting: Vote UP / Vote DOWN

More details on "MySQL 5.6 Experiences" coming soon...

I've already shared my presentation few hours before I made it during PLMCE 2013, back on April 24. Let's just have it here for the reference: http://www.slideshare.net/ValeriyKravchuk/mysql-56experiencesbugssolutions50mins.

During the upcoming weeks I plan to explain every slide in more details (as 50 minutes were not enough for this) here and check status of all the active bugs mentioned in it. I'll also check new bugs for each major feature mentioned (if any). So, stay tuned...
PlanetMySQL Voting: Vote UP / Vote DOWN

Webinar: SQL Query Patterns, Optimized

This Friday, May 31 at 10 a.m. Pacific, I’ll present Percona’s next webinar, “SQL Query Patterns, Optimized.”

Based on my experiences solving tough SQL problems for Percona training and consulting, I’ll classify several common types of queries with which developers struggle. I’ll test several SQL solutions for each type of query objective, and show how you can use MySQL 5.6 built-in methods to analyze them for optimal query efficiency.  The discussion will cover optimizer reports, query profiling, and session status to measure performance.

The query patterns will include:

  • Exclusion Join
  • Random Selection
  • Greatest-Per-Group
  • Dynamic Pivot
  • Relational Division

Please register for this webinar and join me next Friday!

The post Webinar: SQL Query Patterns, Optimized appeared first on MySQL Performance Blog.


PlanetMySQL Voting: Vote UP / Vote DOWN

Easier Overview of Current Performance Schema Setting

While I prepared for my Hands-On Lab about the Performance Schema at MySQL Connect last year, one of the things that occurred to me was how difficult it was quickly getting an overview of which consumers, instruments, actors, etc. are actually enabled. For the consumers things are made more complicated as the effective setting also depends on parents in the hierarchy. So my thought was: “How difficult can it be to write a stored procedure that outputs a tree of the hierarchies.” Well, simple enough in principle, but trying to be general ended up making it into a lengthy project and as it was a hobby project, it often ended up being put aside for more urgent tasks.

However here around eight months later, it is starting to shape up. While there definitely still is work to be done, e.g. creating the full tree and outputting it in text mode (more on modes later) takes around one minute on my test system – granted I am using a standard laptop and MySQL is running in a VM, so it is nothing sophisticated.

The current routines can be found in ps_tools.sql.gz – it may later be merged into Mark Leith’s ps_helper to try to keep the Performance Schema tools collected in one place.

Note: This far the routines have only been tested in Linux on MySQL 5.6.11. Particularly line endings may give problems on Windows and Mac.

Description of the ps_tools Routines

The current status are two views, four stored procedure, and four functions – not including several functions and procedures that does all the hard work:

  • Views:
    • setup_consumers – Displays whether each consumer is enabled and whether the consumer actually will be collected based on the hierarchy rules described in Pre-Filtering by Consumer in the Reference Manual.
    • accounts_enabled – Displays whether each account defined in the mysql.user table has instrumentation enabled based on the rows in performance_schema.setup_actors.
  • Procedures:
    • setup_tree_consumers(format, color) – Create a tree based on setup_consumers displaying whether each consumer is effectively enabled. The arguments are:
      • format is the output format and can be either (see also below).:
        • Text: Left-Right
        • Text: Top-Bottom
        • Dot: Left-Right
        • Dot: Top-Bottom
      • color is whether to add bash color escapes sequences around the consumer names when used a Text format (ignored for Dot outputs).
    • setup_tree_instruments(format, color, only_enabled, regex_filter) – Create a tree based on setup_instruments displaying whether each instrument is enabled. The tree is creating by splitting the instrument names at each /. The arguments are:
      • format is the output format and can be either:
        • Text: Left-Right
        • Dot: Left-Right
        • Dot: Top-Bottom
      • color is whether to add bash color escapes sequences around the instrument names when used a Text format (ignored for Dot outputs).
      • type – whether to base the tree on the ENABLED or TIMED column of setup_instruments.
      • only_enabled – if TRUE only the enabled instruments are included.
      • regex_filter – if set to a non-empty string only instruments that match the regex will be included.
    • setup_tree_actors_by_host(format, color) – Create a tree of each account defined in mysql.user and whether they are enabled; grouped by host. The arguments are:
      • format is the output format and can be either:
        • Text: Left-Right
        • Dot: Left-Right
        • Dot: Top-Bottom
      • color is whether to add bash color escape sequences around the names when used a Text format (ignored for Dot outputs).
    • setup_tree_actors_by_user – Create a tree of each account defined in mysql.user and whether they are enabled; grouped by username. The arguments are:
      • format is the output format and can be either:
        • Text: Left-Right
        • Dot: Left-Right
        • Dot: Top-Bottom
      • color is whether to add bash color escape sequences around the names when used a Text format (ignored for Dot outputs).
  • Functions:
    • is_consumer_enabled(consumer_name) – Returns whether a given consumer is effectively enabled.
    • is_account_enabled(host, user) – Returns whether a given account (host, user) is enabled according to setup_actors.
    • substr_count(haystack, needle, offset, length) – The number of times a given substring occurs in a string. A port of the PHP function of the same name.
    • substr_by_delim(set, delimiter, pos) – Returns the Nth element from a delimiter string.

The two functions substr_count() and substr_by_delim() was also described in an earlier blog.

The formats for the four stored procedures consists of two parts: whether it is Text or Dot and the direction. Text is a tree that can be viewed directly either in the mysql command line client (coloured output not supported) or the shell (colours supported for bash). Dot will output a DOT graph file in the same way as dump_thread_stack() in ps_helper. The direction is as defined in the DOT language, so e.g. Left-Right will have the first level furthest to the left, then add each new level to the right of the parent level.

Examples

As the source code – including comments – is more than 1600 lines, I will not discuss it here, but rather go through some examples.

setup_tree_consumers

Using the coloured output:

or the same using a non-coloured output:

setup_tree_instruments

Here a small part of the tree is selected using a regex.

setup_tree_actors_%

With only root@localhost and root@127.0.0.1 enabled, the outputs of setup_tree_actors_by_host and setup_tree_actors_by_user gives respectively:

DOT Graph of setup_instruments

The full tree of setup_instruments can be created using the following sequence of steps (I am using graphviz to get support for dot files):

MySQL 5.6.11$ echo -e "$(mysql -NBe "CALL ps_tools.setup_tree_instruments('Dot: Left-Right', FALSE, 'Enabled', FALSE, '')")" > /tmp/setup_instruments.dot MySQL 5.6.11$ dot -Tpng /tmp/setup_instruments.dot -o /tmp/setup_instruments.png

The full output is rather large (6.7M). If you want to see if you can get to it at http://mysql.wisborg.dk/wp-content/uploads/2013/05/setup_tree_instruments_dot_lr.png.

Views

mysql> SELECT * FROM ps_tools.setup_consumers; +--------------------------------+---------+----------+ | NAME | ENABLED | COLLECTS | +--------------------------------+---------+----------+ | events_stages_current | NO | NO | | events_stages_history | NO | NO | | events_stages_history_long | NO | NO | | events_statements_current | YES | YES | | events_statements_history | NO | NO | | events_statements_history_long | NO | NO | | events_waits_current | NO | NO | | events_waits_history | NO | NO | | events_waits_history_long | NO | NO | | global_instrumentation | YES | YES | | thread_instrumentation | YES | YES | | statements_digest | YES | YES | +--------------------------------+---------+----------+ 12 rows in set (0.00 sec) mysql> SELECT * FROM ps_tools.accounts_enabled; +-------------+-----------+---------+ | User | Host | Enabled | +-------------+-----------+---------+ | replication | 127.0.0.1 | NO | | root | 127.0.0.1 | YES | | root | ::1 | NO | | meb | localhost | NO | | memagent | localhost | NO | | root | localhost | YES | +-------------+-----------+---------+ 6 rows in set (0.00 sec)

Conclusion

There is definitely more work to do on making the Performance Schema easier to access. ps_helper and ps_tools are a great start to what I am sure will be an extensive library of useful diagnostic queries and tools.


PlanetMySQL Voting: Vote UP / Vote DOWN

Experimenting with MySQL 5.7

I was playing around with MySQL 5.7 this weekend and before having read the changelog, I managed to spot these two little gems.

Duplicate Indexes

“The server now issues a warning if an index is created that duplicates an existing index, or an error in strict SQL mode.” Bug #37520

Example Testcase:

mysql> SHOW CREATE TABLE city\G *************************** 1. row *************************** Table: city Create Table: CREATE TABLE `city` ( `ID` int(11) NOT NULL AUTO_INCREMENT, `Name` char(35) NOT NULL DEFAULT '', `CountryCode` char(3) NOT NULL DEFAULT '', `District` char(20) NOT NULL DEFAULT '', `Population` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`ID`), KEY `CountryCode` (`CountryCode`), CONSTRAINT `city_ibfk_1` FOREIGN KEY (`CountryCode`) REFERENCES `Country` (`Code`) ) ENGINE=InnoDB AUTO_INCREMENT=4080 DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> ALTER TABLE city add index (countrycode); ERROR 1831 (HY000): Duplicate index 'CountryCode_2' defined on the table 'world.city'. This is deprecated and will be disallowed in a future release.

Pretty cool - I know this previously caught a lot of people.

Control-C support in the client

“Previously, Control+C in mysql interrupted the current statement if there was one, or exited mysql if not. Now Control+C interrupts the current statement if there was one, or cancels any partial input line otherwise, but does not exit.” Bug #66583

Example Testcase:

mysql> this is a test -> test -> test -> ^C

So if I want to quit, I can now control-C then type “quit”. This is much more intuitive.


PlanetMySQL Voting: Vote UP / Vote DOWN

What's the deal with NoSQL?

Everybody seems to be looking at and debating NoSQL these days, and so am I and I thought I'd say a few words about it. Which is not to say I haven't said stuff before, bit them I was mainly targeting specific attributes of many NoSQL solutions (like "eventual consistency" or, as you might call it, "instant inconsistency", What I was opposing is that "eventual consistency" has anything to do with just that, consistency. Rather, what this means is that at any point in time the system is inconsistent, and even if it might be consistent, you cannot rely on it being so. Which is fine, but don't call it consistency, call it inconsistency. Allowing a database to be somewhat inconsistent doesn't necessarily mean that it's something wrong with it).

All this said, what is going on here, why are we MySQL and MariaDB users seeing so many MongoDB, Cassandra and LevelDB applications pop up? Come on, these are typically less functional implementations of a database than even the most basic MySQL setup? No transactions, no joins, no standards etc. etc. And the answer, if you want to hear what I have to say, is ease of use. So let's explore that a bit.

Following the Object Orientation frenzy of the 1990s, when any application project ended up consisting of endless sessions modeling objects, usually involving expensive consultants, dresses in expensive, blue suits. And when that was done (which took years!) you had a way cool object model, but no money left to do the actual implementation, i.e. do the real programming (shiver), and you went to some other project and the nicely dressed object design consultant left to see another OO sucker.

Now, objects are much more standard, even non-OO languages have a big chunk of OO features, and these are used enhance programmer productivity and better code and design. Which is fine (except that if you were one of those OO consultants, which means you are now out of a job, as such mundane tasks of writing is not something you would ever do, such dirty stuff is better left to "programmers". Oh no, I forgot that you are now an ITIL consultant, that just slipped my mind) but how does this relate to MySQL and MariaDB. The answer is that MySQL, which was once considered real easy to use, no longer is as easy as it used to be. The Relational data model is still brilliant when you look at data as data, and that is how many of us look at it, so we go through the process of mapping data to objects, if that is what it takes. SQL and Java, PHP or whatever merges, and the application now contains a layer mapping objects to real data. Or we use hibernate, which does this automatically for us.

But a new cadre of developers are emerging, and they look at OO as natural and they look at objects as data (it's not. Data, in my mind, should be independent from the application using it, objects on the other hand, are closely tied to the application at hand). With which I do not mean that there is something wrong with building applications using objects, quite the opposite. But if all you know is objects, then using relational technology turns difficult, and SQL, for all the good things with it, seems old-fashioned and arcane, which it is (but it is so widely used you cannot avoid it). So you go with something that looks at objects as all you need, and present that in some object format. Like JSON.

And again, there is nothing wrong with that. But if we who are on the SQL and Relational track just discards these NoSQL technologies, we are not making any friends. We have to accept that MySQL and MariaDB really aren't that easy to use anymore, at least not for newcomers.

And then there is another thing: Some data, like Big Data, has attributes that really doesn't fit well in a relational model. Data where the attribute of a value can't easily be determined once and for all, and you need to reprocess that data (large test objects, images and maps are some examples). In this case, you really need to extend the relational model, somehow.

But SQL-based relational isn't going away. The Relational model is still one of the best ways to look at data, it's just that we also need some other ways of looking at data. And it needs to be easier to access. And we shouldn't really have to push SQL down the throat of every single developer, trying to develop an application using some OO technology. The answer is we need both. And these technologies needs to interoperate. I want to use SQL for my data. But I also want JSON and REST for my data. And there shouldn't be much of a performance overhead. All in all, we SQL folks need to wake up and data easier to use again. We know data better than the Cassandra and MongoDB folks. We know transactions better than them too. But they know how to work with developers who doesn't know who The Beatles were and make Relational easy to use for them, without them having to learn JSON (and now having to listen to a tirade about todays youngsters not knowing what real music is and that it died with John Lennon! What? You don't know who John Lennon was! That's exactly what I mean, you have no taste at all!).

Just my 2 cents...

/Karlsson
PlanetMySQL Voting: Vote UP / Vote DOWN

HeidiSQL 8.0 released

343 revisions after the 7.0 release follows the new 8.0 release of HeidiSQL now.

Get it from the download page.

Here are the most noticable changes:


- Available in 23 languages now. Thanks to all translators and Transifex hereby!
- Database tree: Introduce optional folders for tables, views, routines etc.
- Introduce session folders in session manager.
- Make routine editor work on MS SQL servers.
- Support search and replace in data and query results.
- Add support for microseconds in temporal datatypes of MariaDB 5.3+ and MySQL 5.6.
- Introduce a query history, available in the right side helpers box. Can be turned off.
- Implement grid export as PHP array.
- Host > Variables: Add "Global" column, and highlight values different to their session pendant
- Add menu item for launching mysql.exe command line with current parameters.


... and more:
- Table editor: Fix handling of BIT default values, and support BIT columns in MS SQL.
- Table editor: Improve selection of ENUM and SET default values
- User manager: Support dots in database and table privileges
- Data grids: Support copying/pasting NULL values
- Fix stripped backslashes in VIEW body editor
- Apply hotkeys to dialog buttons
- Grid export: Remove zero padding to avoid octal => integer conversion in PHP
- Data grid: Propose column names from selected table in filter panel
- Database and new table filter above database tree
- Table editor: Display number of selected columns in status bar
- Database tree: Indicate previously selected tables with a non-ghosted icon in the tree, while leaving never selected ones ghosted
- Display timestamp in very right status bar panel when executing a query
- Table editor: Add missing DATE and TIME datatypes for MS SQL
- Table editor: Support old style "TYPE BTREE" in table index code
- Routine editor: Finally fix ramshackle detection of routine body
- Data grid: Make foreign values drop down optionally
- Dialogs: Introduce "KeepAskingSetting" checkbox
- Session manager: Move startup script and local time zone options together with SSL settings to a new "Advanced" tab
- SQL export: Support filename and dirname patterns in export target combobox
- Database tree: Display overlay icons for some special table engines like federated, csv, aria and performance_schema
- Implement an automatic keep-alive ping, to prevent SSH tunnels from disconnecting
- Add support for renaming tables in MS SQL
- Fix crash on exit when connected to pre-4.1 servers
- Table editor: Enhance MS SQL compatibility in table editor
- Fix and enhance handling of multiple statements and multiple results
- Grid export: Add "Include query" and "Include auto increment column" checkbox options
- Processlist: Add link label "EXPLAIN Analyzer on MariaDB.org"
- Internal: Refactor logic for reading and writing application and session settings
- Session manager: Introduce new columns "Last connect" and "Counter"
- Extend the variable editor to explicitly modify strings, numbers, booleans or enumerations
- Detect client timezone and send SET time_zone to the server, so that NOW() and friends return UTC-fixed values
- Session manager: Add server specific icons for TokuDB, InfiniDB and Infobright
- Session handling: Use home brown file format for exporting and importing registry settings, as used for the portable version
- Implement usage of mysql_warning_count(). Ask for running SHOW WARNINGS in a new query tab.
- Fix command line for Wine users
- Introduce new preference option "Prefill empty date/time fields".
- Restore previous selection after refreshing process list (and neighbor tabs)

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL 5.6, InnoDB and fast storage

I used a simple workload with sysbench to determine the rate at which InnoDB can read blocks from disk. The workload is read-only and each query fetches 1 row by PK. The workload was IO-bound with a 2G InnoDB buffer pool and 32G database. Storage was fast courtesy of buffered IO and enough RAM to cache the database in the OS filesystem cache.

Using MySQL 5.6.11 and InnoDB with a few hacks the peak throughput was about 240,000 QPS and 210,000 block reads/second. The test server has 32 cores (16 physical cores, 32 logical cores with HT enabled). This is a great result that can probably be even better. Contention on fil_system->mutex was the bottleneck and I think that can be improved (see feature request #69276). I wonder if 400,000 block reads/second is possible?

A few years back, in 2009 or 2010, I ran similar tests using a server with 8 physical cores. I think HT was disabled. Using MySQL 5.1 with the Facebook patch I was able to get about 40,000 QPS and 35,000 block reads/second. It is good to see software advance to keep up with hardware.
PlanetMySQL Voting: Vote UP / Vote DOWN

New Feature Qualification

Early this year Oracle released  MySQL 5.6 - Best MySQL Release Ever. This release delivered not only quality, but also quantity in terms of number of features. See a comprehensive list here . The blogs below also refer to the massive changes introduced in 5.6
http://www.mysqlperformanceblog.com/2013/01/27/mysql-5-6-improvements-in-the-nutshell/
http://www.flamingspork.com/blog/2013/03/05/mysql-code-size/

It is no mean task to deliver so many features with high quality that too for a feature rich product like MySQL. This was made possible by the increased focus given to testing during 5.6 development . Testing was integrated far better than in the past with MySQL processes and test team was grown significantly. After the first few milestone releases developers and other stake holders started seeing benefits of the new processes and that resulted in better cooperation between development and test teams which in turn resulted in a bunch of happy testers :). In this post I will take you through the process that was established for new feature qualification.

Features are delivered through milestones which Tomas Ulin has explained very well at http://insidemysql.com/the-milestone-release-model-revisited/feature. MySQL Server QA has the following goals for new features

  • Complete functional and non functional test coverage of changed and new functionality
  • No regressions
  • More than 80% Code Coverage
QA involvement starts as soon as the requirements and specifications of the feature are finalized by the development team. QA reviews available documents and provides feedback on the design, usability, testability etc. A discussion follows with the developer and changes are made as needed to ensure that the feature can be tested.

Once the specifications and requirements are acceptable QA starts working on the test plan.  First QA documents all scenarios that needs to be tested. This includes stand alone tests, integration tests, non functional tests etc. Next step is to identify tools for automating the scenarios. Most commonly used functional test tools are

MTR based tests are added by default for any new feature. RQG use started from 5.5 and it gained wide acceptance in 5.6. At least 90% of the features in 5.6 have gone through RQG testing and this helped weed out many bugs which would have been nearly impossible to find using MTR.

While the developers are working on the product code , QA engineers start working on the automated tests, test infrastructure improvements etc.Final round of testing starts after the feature has passed code reviews. This phase can last anywhere between a few days to months depending on the complexity of the feature, stability of the code etc.

Features gets signed off only when the following conditions are met:
  • No open bugs in the new feature - This is very strictly enforced and even minor bugs are not allowed. We believe that bugs are easiest to fix when the code is new and hence this can help us deliver features that are of high quality.
  • No regressions - A feature gets developed on a tree which gets tested regularly through a continuous integration testing tool. Any regressions are detected and fixed before sign off.
  • Acceptable Code coverage numbers - Code coverage report is generated for the changed lines of code and the minimum expected coverage is 80%. Most features have coverage of more than 90%. Any uncovered lines of code are analyzed and wherever possible new tests are added to increase code coverage.
  • All new tests are added to the automated regression suite.
  • All planned testing has completed with satisfactory results.
Every feature in 5.6 release was qualified using this process and it is our constant endeavour to improve this for an even better 5.7 !!

        PlanetMySQL Voting: Vote UP / Vote DOWN

        MySQL binlogs - Don't forget to do your homework!

        Now that I'm back doing just database stuff, I've come to realize I've gotten a little sloppy about doing my homework.  Homework's never been my favorite thing in the world, but it often reduces stress when your under the gun during an outage or upgrade... We had a MySQL database server that's been slow on DML changes, and based on the slowest statements being 'COMMIT', we had a good mind
        PlanetMySQL Voting: Vote UP / Vote DOWN

        Percona XtraBackup 2.1.2 for MySQL available for download

        Percona is glad to announce the release of Percona XtraBackup 2.1.2 for MySQL on May 17, 2013. Downloads are available from our download site here and Percona Software Repositories.

        This release fixes number of high-priority bugs since version 2.1 became GA. It’s advised to upgrade your latest 2.1 version to 2.1.2. This release is the latest stable release in the 2.1 series.

        Bugs Fixed:

        • Using Perl’s DBD::MySQL package for server communication instead of spawning the MySQL command line client introduced a regression which caused innobackupex –galera-info option to fail. Bug fixed #1180672.
        • The format of xtrabackup_galera_info was missing the ‘:’ separator between the values of wsrep_local_state_uuid and wsrep_last_committed. Bug fixed #1181222.
        • innobackupex automatic version detection did not work correctly for latest Percona Server and MySQL 5.1 releases which could cause innobackupex to fail. Bugs fixed #1181092, #1181099 and #1180905.
        • When backing up a server that is not a replication slave with the innobackupex –slave-info option, innobackupex failed with a fatal error. Replaced the fatal error with a diagnostic message about innobackupex –slave-info being ignored in such a case. Bug fixed #1180662.
        • Low values for wait_timeout on the server could cause server to close the connection while backup is being taken. Fixed by setting the bigger value for wait_timeout option on the server to prevent server from closing connections if the global wait_timeout value is set too low. Bug fixed #1180922.

        Other bug fixes: bug fixed #1177182.

        Release notes with all the bugfixes for Percona XtraBackup 2.1.2 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

        The post Percona XtraBackup 2.1.2 for MySQL available for download appeared first on MySQL Performance Blog.


        PlanetMySQL Voting: Vote UP / Vote DOWN

        Log Buffer #320, A Carnival of the Vanities for DBAs

        The red carpet has been laid down at this Log Buffer Edition, and you can witness and cheer the cat-walking blog posts from Oracle, SQL Server and MySQL. Every one of them is chic, elegant, sensual in its own right. Enjoy.
        Oracle:

        Create colored heat maps in SQL*Plus with Kyle Hailey.

        Here’s a quick and dirty script to create a procedure (in the SYS schema – so be careful) to check the Hakan Factor for an object.

        Connor has a good post about default null for collection parameter.

        This is yet another blogpost on Oracle’s direct path read feature which was introduced for non-parallel query processes in Oracle version 11.

        Owen Allen has seen some questions about provisioning Oracle Solaris 11. They boil down to this.

        SQL Server:

        Shashank Srivastava tells us as how to Change the SQL Server Instance Name after Renaming the Windows Host.

        Daniel Calbimonte shares as how to synchronize two SSAS Servers.

        Data Architecture underpins just about everything we do in IT.  Without a clear understanding of how data is structured, there is no reliable way to derive meaning from it.

        Orlando Colamatteo is login-less in Seattle.

        Lets get started testing database with tSQLt with Robert Sheldon.

        MySQL:

        After a lot of fuzz, Anders Karlsson is now releasing MyQuery version 3.5.1.

        Nothing like reestablishing a tradition and Dave Stokes is doing just that for MySQL.

        Mare Alff is spreading the word about the performance schema.

        Slava Akhmechet talks about secondary indexes, batched inserts performance improvements, soft durability mode.

        It is a central part of the MySQL philosophy to try and help you as much as you can. There are many occasions when it could tell you that what you are asking for is utterly stupid or give you a bad execution plan because “you asked for it”.


        PlanetMySQL Voting: Vote UP / Vote DOWN

        Pages