Planet MySQL

Using the new MySQL Shell Reporting Framework to monitor InnoDB Cluster

With MySQL Shell 8.0.16, a new very interesting feature was released: the Reporting Framework.

Jesper already blogged about it and I recommend you to read his articles if you are interested in writing your own report:

  • https://mysql.wisborg.dk/2019/04/26/mysql-shell-8-0-16-built-in-reports/
  • https://mysql.wisborg.dk/2019/04/27/mysql-shell-8-0-16-user-defined-reports/

In this post, I will show you a user-defined report that can be used to monitor your MySQL InnoDB Cluster / Group Replication.

Preparation

Before being able to use the report, you need to download two files. The first one is the addition in sys that I often use to monitor MySQL InnoDB Cluster:

addition_to_sys_GR (download)

And the second one is the report:

gr_info (download)

Once downloaded, you can unzip them and install them:

On your Primary-Master run:

mysqlsh --sql clusteradmin@mysql1 < addition_to_sys_GR.sql

Now install the report on your MySQL Shell client’s machine:

$ mkdir -p ~/.mysqlsh/init.d
$ mv gr_info.py ~/.mysqlsh/init.d

Usage

Once installed, you just need to relaunch the Shell and you are ready to call the new report using the \show command:

Now let’s see the report in action when I block all writes on mysql2 with a FTWRL and call the report with \watch:
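For reference, here is a minimal way to reproduce that scenario in SQL (the report itself is then invoked from the Shell prompt, for example with \watch gr_info, assuming the report was installed as gr_info as above):

-- On mysql2: block all writes with a global read lock (FTWRL)
FLUSH TABLES WITH READ LOCK;
-- ...observe the report from another Shell session while the lock is held...
-- then release the lock
UNLOCK TABLES;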

Conclusion

Yet another nice addition to MySQL Shell. With this report you can see which member still has quorum, how many transactions each node still has to apply, and more.

Don’t hesitate to share your reports too!

MySQL Memory Management, Memory Allocators and Operating System

When users experience memory usage issues with any software, including MySQL®, their first response is to think that it’s a symptom of a memory leak. As this story will show, this is not always the case.

This story is about a bug.

All Percona Support customers are eligible for bug fixes, but their options vary. For example, Advanced+ customers are offered a HotFix build prior to the public release of software with the patch. Premium customers do not even have to use Percona software: we may port our patches to upstream for them. But for Percona products all Support levels have the right to have a fix.

Even so, this does not mean we will fix every unexpected behavior, even if we accept that behavior to be a valid bug. One of the reasons for such a decision might be that while the behavior is clearly wrong for Percona products, this is still a feature request.

A bug as a case study

A good recent example of such a case is PS-5312 – the bug is repeatable with upstream and reported at bugs.mysql.com/95065

This reports a situation whereby access to InnoDB fulltext indexes leads to growth in memory usage. It starts when someone queries a fulltext index, grows until a maximum, and is not freed for quite a long time.

Yura Sorokin from the Percona Engineering Team investigated if this is a memory leak and found that it is not.

When InnoDB resolves a fulltext query, it creates a memory heap in the function fts_query_phrase_search. This heap may grow up to 80MB. Additionally, it has a big number of blocks (mem_block_t) which are not always used continuously, and this, in turn, leads to memory fragmentation.

On function exit, the memory heap is freed. InnoDB does this for each of the allocated blocks. At the end of the function, it calls free(), which belongs to one of the memory allocator libraries, such as malloc or jemalloc. From the mysqld point of view, everything is done correctly: there is no memory leak.

However, while free() should release memory when called, it is not required to return it back to the operating system. If the memory allocator decides that the same memory blocks will be required soon, it may still keep them for the mysqld process. This explains why you might see that mysqld still uses a lot of memory after the job is finished and all de-allocations are done.

This in practice is not a big issue and should not cause any harm. But if you need the memory to be returned to the operating system quicker, you could try alternative memory allocators, such as jemalloc. The latter was proven to solve the issue with PS-5312.

Another factor which improves memory management is the number of CPU cores: the more we used for the test, the faster the memory was returned to the operating system. This, probably, can be explained by the fact that if you have multiple CPUs, then the memory allocator can dedicate one of them just for releasing memory to the operating system.

As our engineer Yura Sorokin found, the very first implementation of InnoDB full-text indexes introduced this flaw.

Options to fix

We have a few options to fix this:

  1. Change implementation of InnoDB fulltext index
  2. Use custom memory library like jemalloc

Both have their advantages and disadvantages.

Option 1 means we are introducing an incompatibility with upstream, which may lead to strange bugs in future versions. This also means a full rewrite of the InnoDB fulltext code, which is always risky in GA versions used by our customers.

Option 2 means we may hit flaws in the jemalloc library which is designed for performance and not for the safest memory allocation.

So we have to choose between these two not ideal solutions.

Since option 1 may lead to a situation where Percona Server would be incompatible with upstream, we prefer option 2 and look forward to the upstream fix of this bug.

Conclusion

If you are seeing high memory usage by the mysqld process, it is not always a symptom of a memory leak. You can use the memory instrumentation in Performance Schema to find out how allocated memory is used. Try alternative memory libraries for better handling of allocations and freeing of memory, and search the user manual for LD_PRELOAD to find out how to set them up.
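As a starting point, here is one sketch of such a Performance Schema query (assuming the relevant memory/% instruments are enabled, which is the default in MySQL 8.0):

-- Top memory consumers as seen by Performance Schema instrumentation
SELECT event_name,
       current_number_of_bytes_used / 1024 / 1024 AS current_alloc_mb
  FROM performance_schema.memory_summary_global_by_event_name
 ORDER BY current_number_of_bytes_used DESC
 LIMIT 10;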

Partial Revokes from Database Objects

Have you ever encountered situations where you want to grant a user access to all databases except a few?  If yes, then this article will interest you.  Starting from MySQL 8.0.16, you can partially revoke database privileges from users, even if they are granted global privileges.…
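As a quick illustration (the account and schema names are hypothetical; partial_revokes must be enabled, which requires MySQL 8.0.16+):

SET PERSIST partial_revokes = ON;
CREATE USER 'reporting'@'%' IDENTIFIED BY '<strong password>';
GRANT SELECT ON *.* TO 'reporting'@'%';
-- Partially revoke access to one schema while keeping the global grant
REVOKE SELECT ON payroll.* FROM 'reporting'@'%';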


The SYSTEM_USER Dynamic Privilege

To modify users, you must have the CREATE USER privilege or the UPDATE privilege on the mysql schema. We have added the SYSTEM_USER dynamic privilege to make it possible to protect users against the first case: modification by other users who have the CREATE USER privilege.…
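A minimal sketch of the idea, with hypothetical account names:

-- 'dba' is protected: only accounts that also hold SYSTEM_USER may modify it
GRANT SYSTEM_USER ON *.* TO 'dba'@'localhost';
-- 'junior' can manage ordinary accounts, but cannot ALTER or DROP 'dba'
GRANT CREATE USER ON *.* TO 'junior'@'localhost';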


How to create multiple accounts for an app?

You can now grant CREATE USER so that your web apps can use multiple accounts without the risk of an end user hijacking the database by changing your root credentials, for example.  Wouldn’t it be nice if you could grant a user privileges to create or modify users, except for a few users? …
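For example, a sketch of such an app-facing account (the names and host pattern are hypothetical):

-- This account can create and manage ordinary application users...
CREATE USER 'app_admin'@'10.0.%' IDENTIFIED BY '<strong password>';
GRANT CREATE USER ON *.* TO 'app_admin'@'10.0.%';
-- ...but since it does not hold SYSTEM_USER, it cannot touch accounts
-- that do (such as your root-like administrative accounts)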


Percona XtraBackup 2.4.14 Is Now Available

Percona is glad to announce the release of Percona XtraBackup 2.4.14 on May 1, 2019. You can download it from our download site and apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, it drives down backup costs while providing unique features for MySQL backups.

Percona XtraBackup 2.4.14 enables saving backups to Amazon S3, MinIO, and Google Cloud Storage (using interoperability mode) when using xbcloud. The following example demonstrates how to use Amazon S3 storage to make a full backup:

$ xtrabackup --backup --stream=xbstream --extra-lsndir=/tmp --target-dir=/tmp | \
  xbcloud put --storage=s3 \
  --s3-endpoint='s3.amazonaws.com' \
  --s3-access-key='YOUR-ACCESSKEYID' \
  --s3-secret-key='YOUR-SECRETACCESSKEY' \
  --s3-bucket='mysql_backups' --parallel=10 \
  $(date -I)-full_backup

All Percona software is open-source and free.

New Features
  • Amazon S3 is now supported in xbcloud. More information in PXB-1813.
Bugs Fixed
  • When the row format was changed during the backup, xtrabackup could crash
    during the incremental prepare stage. Bug fixed PXB-1824.
  • If compressed InnoDB undo tablespaces were not removed beforehand, the
    incremental backup could crash at the prepare stage. Bug fixed PXB-1552.

Other bugs fixed:  PXB-1771, PXB-1809, PXB-1837.

Release notes with all the improvements for version 2.4.14 are available in our online documentation. Please report any bugs to the issue tracker.

Benchmark ClickHouse Database and clickhousedb_fdw

In this research,  I wanted to see what kind of performance improvements could be gained by using a ClickHouse data source rather than PostgreSQL. Assuming that I would see performance advantages with using ClickHouse, would those advantages be retained if I access ClickHouse from within postgres using a foreign data wrapper (FDW)? The FDW in question is clickhousedb_fdw – an open source project from Percona!

The database environments under scrutiny are PostgreSQL v11, clickhousedb_fdw and a ClickHouse database. Ultimately, from within PostgreSQL v11, we are going to issue various SQL queries routed through our clickhousedb_fdw to the ClickHouse database. Then we’ll see how the FDW performance compares with those same queries executed in native PostgreSQL and native ClickHouse.

Clickhouse Database

ClickHouse is an open source column-based database management system which can achieve performance between 100 and 1000 times faster than traditional database approaches, and is capable of processing more than a billion rows in less than a second.

Clickhousedb_fdw

clickhousedb_fdw, a ClickHouse database foreign data wrapper, or FDW, is an open source project from Percona. Here’s a link for the GitHub project repository:

https://github.com/Percona-Lab/clickhousedb_fdw

I wrote a blog in March which tells you more about our FDW: https://www.percona.com/blog/2019/03/29/postgresql-access-clickhouse-one-of-the-fastest-column-dbmss-with-clickhousedb_fdw/

As you’ll see, this provides for an FDW for ClickHouse that allows you to SELECT from, and INSERT INTO, a ClickHouse database from within a PostgreSQL v11 server.

The FDW supports advanced features such as aggregate pushdown and joins pushdown. These significantly improve performance by utilizing the remote server’s resources for these resource intensive operations.
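For orientation, a setup sketch follows; the exact option names (dbname, table_name, and so on) are assumptions here, so check the clickhousedb_fdw README for the authoritative syntax:

CREATE EXTENSION clickhousedb_fdw;
CREATE SERVER clickhouse_svr
  FOREIGN DATA WRAPPER clickhousedb_fdw
  OPTIONS (dbname 'default');
CREATE USER MAPPING FOR CURRENT_USER SERVER clickhouse_svr;
-- Foreign table over the ClickHouse "ontime" table; only two of the
-- 109 columns are shown here for brevity
CREATE FOREIGN TABLE fontime ("Year" integer, "DepDelay" integer)
  SERVER clickhouse_svr OPTIONS (table_name 'ontime');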

Benchmark environment
  • Supermicro server:
    • Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
    • 2 sockets / 28 cores / 56 threads
    • Memory: 256GB of RAM
    • Storage: Samsung  SM863 1.9TB Enterprise SSD
    • Filesystem: ext4/xfs
  • OS: Linux smblade01 4.15.0-42-generic #45~16.04.1-Ubuntu
  • PostgreSQL: version 11
Benchmark tests

Rather than using some machine generated dataset for this benchmark, we used the “On Time Reporting Carrier On-Time Performance” data from 1987 to 2018. You can access the data using our shell script available here:

https://github.com/Percona-Lab/ontime-airline-performance/blob/master/download.sh

The size of the database is 85GB, providing a single table of 109 columns.

Benchmark Queries

Here are the queries I used to benchmark ClickHouse, clickhousedb_fdw, and PostgreSQL.

Queries containing aggregates and GROUP BY:

  • Q1: SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
  • Q2: SELECT DayOfWeek, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
  • Q3: SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10;
  • Q4: SELECT Carrier, count(*) FROM ontime WHERE DepDelay>10 AND Year = 2007 GROUP BY Carrier ORDER BY count(*) DESC;
  • Q5: SELECT a.Carrier, c, c2, c*1000/c2 AS c3 FROM (SELECT Carrier, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year=2007 GROUP BY Carrier) a INNER JOIN (SELECT Carrier, count(*) AS c2 FROM ontime WHERE Year=2007 GROUP BY Carrier) b ON a.Carrier=b.Carrier ORDER BY c3 DESC;
  • Q6: SELECT a.Carrier, c, c2, c*1000/c2 AS c3 FROM (SELECT Carrier, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Carrier) a INNER JOIN (SELECT Carrier, count(*) AS c2 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier) b ON a.Carrier=b.Carrier ORDER BY c3 DESC;
  • Q7: SELECT Carrier, avg(DepDelay) * 1000 AS c3 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier;
  • Q8: SELECT Year, avg(DepDelay) FROM ontime GROUP BY Year;
  • Q9: SELECT Year, count(*) AS c1 FROM ontime GROUP BY Year;
  • Q10: SELECT avg(cnt) FROM (SELECT Year, Month, count(*) AS cnt FROM ontime WHERE DepDel15=1 GROUP BY Year, Month) a;
  • Q11: SELECT avg(c1) FROM (SELECT Year, Month, count(*) AS c1 FROM ontime GROUP BY Year, Month) a;
  • Q12: SELECT OriginCityName, DestCityName, count(*) AS c FROM ontime GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10;
  • Q13: SELECT OriginCityName, count(*) AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10;

Queries containing joins:

  • Q14: SELECT a.Year, c1/c2 FROM (SELECT Year, count(*)*1000 AS c1 FROM ontime WHERE DepDelay>10 GROUP BY Year) a INNER JOIN (SELECT Year, count(*) AS c2 FROM ontime GROUP BY Year) b ON a.Year=b.Year ORDER BY a.Year;
  • Q15: SELECT a."Year", c1/c2 FROM (SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay">10 GROUP BY "Year") a INNER JOIN (SELECT "Year", count(*) AS c2 FROM fontime GROUP BY "Year") b ON a."Year"=b."Year";

Table-1: Queries used in benchmark

Query executions

Here are the results from each of the queries when run in the different database setups: PostgreSQL with and without indexes, native ClickHouse, and clickhousedb_fdw. The time is shown in milliseconds.

Q#    PostgreSQL   PostgreSQL (Indexed)   ClickHouse   clickhousedb_fdw
Q1    27920        19634                  23           57
Q2    35124        17301                  50           80
Q3    34046        15618                  67           115
Q4    31632        7667                   25           37
Q5    47220        8976                   27           60
Q6    58233        24368                  55           153
Q7    30566        13256                  52           91
Q8    38309        60511                  112          179
Q9    20674        37979                  31           81
Q10   34990        20102                  56           148
Q11   30489        51658                  37           155
Q12   39357        33742                  186          1333
Q13   29912        30709                  101          384
Q14   54126        39913                  124          1364212
Q15   97258        30211                  245          259

Table-2: Time taken to execute the queries used in the benchmark (in milliseconds)

Reviewing the results

The X-axis of the graph shows the query number from the tables above, while the Y-axis shows the execution time in milliseconds. The results for ClickHouse and for the data accessed from PostgreSQL using clickhousedb_fdw are shown. From the table, you can see there is a huge difference between PostgreSQL and ClickHouse, but there is minimal difference between ClickHouse and clickhousedb_fdw.

 

ClickHouse vs clickhousedb_fdw (shows the overhead of clickhousedb_fdw)

This graph shows the difference between ClickHouse and clickhousedb_fdw. In most of the queries, the FDW overhead is not that great, and barely significant, apart from Q12. This query involves joins and an ORDER BY clause. Because of the ORDER BY clause, the GROUP BY and ORDER BY are not pushed down to ClickHouse.

In Table-2 we can see the spike in time for queries Q12 and Q13. To reiterate, this is caused by the ORDER BY clause. To confirm this, I ran queries Q14 and Q15 with and without the ORDER BY clause. Without the ORDER BY clause the completion time is 259 ms, and with the ORDER BY clause it is 1364212 ms. To debug this, I ran EXPLAIN on both queries; here are the results.

bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2 FROM (SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a INNER JOIN(SELECT "Year", count(*) AS c2 FROM fontime GROUP BY "Year") b ON a."Year"=b."Year";

 

                                                    QUERY PLAN
 Hash Join  (cost=2250.00..128516.06 rows=50000000 width=12)
   Output: fontime."Year", (((count(*) * 1000)) / b.c2)
   Inner Unique: true
   Hash Cond: (fontime."Year" = b."Year")
   ->  Foreign Scan  (cost=1.00..-1.00 rows=100000 width=12)
         Output: fontime."Year", ((count(*) * 1000))
         Relations: Aggregate on (fontime)
         Remote SQL: SELECT "Year", (count(*) * 1000) FROM "default".ontime WHERE (("DepDelay" > 10)) GROUP BY "Year"
   ->  Hash  (cost=999.00..999.00 rows=100000 width=12)
         Output: b.c2, b."Year"
         ->  Subquery Scan on b  (cost=1.00..999.00 rows=100000 width=12)
               Output: b.c2, b."Year"
               ->  Foreign Scan  (cost=1.00..-1.00 rows=100000 width=12)
                     Output: fontime_1."Year", (count(*))
                     Relations: Aggregate on (fontime)
                     Remote SQL: SELECT "Year", count(*) FROM "default".ontime GROUP BY "Year"
(16 rows)

 

bm=# EXPLAIN VERBOSE SELECT a."Year", c1/c2 FROM(SELECT "Year", count(*)*1000 AS c1 FROM fontime WHERE "DepDelay" > 10 GROUP BY "Year") a INNER JOIN(SELECT "Year", count(*) as c2 FROM fontime GROUP BY "Year") b ON a."Year"= b."Year" ORDER BY a."Year";

 

                                                          QUERY PLAN
 Merge Join  (cost=2.00..628498.02 rows=50000000 width=12)
   Output: fontime."Year", (((count(*) * 1000)) / (count(*)))
   Inner Unique: true
   Merge Cond: (fontime."Year" = fontime_1."Year")
   ->  GroupAggregate  (cost=1.00..499.01 rows=1 width=12)
         Output: fontime."Year", (count(*) * 1000)
         Group Key: fontime."Year"
         ->  Foreign Scan on public.fontime  (cost=1.00..-1.00 rows=100000 width=4)
               Remote SQL: SELECT "Year" FROM "default".ontime WHERE (("DepDelay" > 10)) ORDER BY "Year" ASC
   ->  GroupAggregate  (cost=1.00..499.01 rows=1 width=12)
         Output: fontime_1."Year", count(*)
         Group Key: fontime_1."Year"
         ->  Foreign Scan on public.fontime fontime_1  (cost=1.00..-1.00 rows=100000 width=4)
               Remote SQL: SELECT "Year" FROM "default".ontime ORDER BY "Year" ASC
(16 rows)

Conclusion

The results from these experiments show that ClickHouse offers really good performance, and clickhousedb_fdw offers the benefits of ClickHouse performance from within PostgreSQL. While there is some overhead when using clickhousedb_fdw, it is negligible and is comparable to the performance achieved when running natively within the ClickHouse database. This also confirms that the PostgreSQL foreign data wrapper push-down feature provides wonderful results.

Care with using the max_connections beta database flag on CloudSQL...

War story of the day: do not use — or be very careful when using — the max_connections beta database flag on CloudSQL... because it has many bugs.

Fun with Bugs #85 - On MySQL Bug Reports I am Subscribed to, Part XX

We have a public holiday here today and it's raining outside for a third day in a row already, so I hardly have anything better to do than writing yet another review of public MySQL bug reports that I've subscribed to recently.

Not sure if these reviews are really considered useful by anyone but a few of my readers, but I am still going to try, in the hope of ending up with some useful conclusions. Last time I stopped at Bug #94903, so let me continue with the next bug on my list:
  • Bug #94912 - "O_DIRECT_NO_FSYNC possible write hole". In this bug report Janet Campbell shared some concerns related to the way O_DIRECT_NO_FSYNC (and O_DIRECT) settings for innodb_flush_method work. Check comments, including those by Sunny Bains, where he agrees that "...this will cause problems where the redo and data are on separate devices.". Useful reading for anyone interested in InnoDB internals or using  innodb_dedicated_server setting in MySQL 8.0.14+.
  • Bug #94971 - "Incorrect key file error during log apply table stage in online DDL". Monty Solomon reported yet another case when "online" ALTER for an InnoDB table fails in a weird way. The bug is still "Open" and there is no clear test case to just copy/paste, but both the problem and potential solutions (make sure you have a "big enough" innodb_online_alter_log_max_size, or better, use the pt-online-schema-change or gh-ost tools) were already discussed here.
  • Bug #94973 - "Wrong result with subquery in where clause and order by". Yet another wrong results bug with a subquery on MySQL 5.7.25 was reported by Andreas Kohlbecker. We can only guess if MySQL 8 is also affected (MariaDB 10.3.7 is not, based on my test results shared below), as the Oracle engineer who verified the bug did NOT care to check, or to share the results of such a check. What can be easier than running this (slightly modified) test case on every MySQL major version and copy-pasting the results:
    MariaDB [test]> CREATE TABLE `ReferenceB` (
        ->   `id` int(11) NOT NULL,
        ->   `bitField` bit(1) NOT NULL,
        ->   `refType` varchar(255) NOT NULL,
        ->   `externalLink` longtext,
        ->   PRIMARY KEY (`id`)
        -> ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
    Query OK, 0 rows affected (0.170 sec)

    MariaDB [test]> INSERT INTO ReferenceB (id, bitField, refType, externalLink) VALUES(1, 0, 'JOU', NULL);
    Query OK, 1 row affected (0.027 sec)

    MariaDB [test]> INSERT INTO ReferenceB (id, bitField, refType, externalLink) VALUES(2, 0, 'JOU', NULL);
    Query OK, 1 row affected (0.002 sec)

    MariaDB [test]> SELECT hex(bitField) from ReferenceB  where id in (select id as
    y0_ from ReferenceB  where refType='JOU') order by externalLink asc;
    +---------------+
    | hex(bitField) |
    +---------------+
    | 0             |
    | 0             |
    +---------------+
    2 rows in set (0.028 sec)
    But we do not see anything like that in the bug report... This is sad.
  • Bug #94994 - "Memory leak detect on temptable storage engine". Yet another memory leak (found with ASan) reported by Zhao Jianwei, who had also suggested a patch.
  • Bug #95008 - "applying binary log doesn't work with blackhole engine tables". This bug was reported by Thomas Benkert. It seems there is a problem to apply row-based events to BLACKHOLE table and this prevents some nice recovery tricks from working.
  • Bug #95020 - "select no rows return but check profile process Creating sort index". Interesting finding from cui jacky. I can reproduce this with MariaDB as well. It seems we either have to define some new stage or define "Creating sort index" better than in the current manual. This:
    "The thread is processing a SELECT that is resolved using an internal temporary table." is plain wrong in the case shown in the bug report, IMHO.
  • Bug #95040 - "Duplicately remove locks from lock_sys->prdt_page_hash in btr_compress". One of those rare cases when Zhai Weixiang does not provide the patch, just suggests the fix based on code review :)
  • Bug #95045 - "Data Truncation error occurred on a write of column 0Data was 0 bytes long and". This really weird regression bug in MySQL 8.0.14+ was reported by Adarshdeep Cheema. MariaDB 10.3 is surely not affected.
  • Bug #95049 - "Modified rows are not locked after rolling back to savepoint". Bug reporter, John Lin, found that fine MySQL manual does not describe the real current implementation. Surprise!
  • Bug #95058 - "Index not used for column with IS TRUE or IS FALSE operators". Take extra care when using BOOLEAN columns in MySQL. As noted by Monty Solomon, the proper index is NOT used when you check BOOLEAN values the way the manual suggests, using IS TRUE or IS FALSE conditions (see the sketch after this list). Roy Lyseng explained how such queries are treated internally, but surely there is a better way. MariaDB 10.3.7 is also affected, unfortunately.
  • Bug #95064 - "slave server may has gaps in Executed_Gtid_Set when a special case happen ". Nice bug report from yoga yoga, who had also contributed a patch. Parallel slave can easily get out of sync with master in case of lock wait timeout and failed retries. Again, we do NOT see any check if MySQL 8 is affected, unfortunately.
  • Bug #95065 - "Strange memory management when using full-text indexes". We all know that InnoDB FULLTEXT indexes implementation is far from perfect. Now, thanks to Yura Sorokin, we know also about a verified memory leak bug there that may lead to OOM killing of MySQL server.
  • Bug #95070 - "INSERT .. VALUES ( .., (SELECT ..), ..) takes shared lock with READ-COMMITTED". Seunguck Lee found yet another case of InnoDB locking behavior that MySQL manual does not explain. The bug is still "Open" for some reason.
  • Bug #95115 - "mysqld deadlock of all client threads originating from 3-way deadlock". It took some effort from the bug reporter, Sandeep Dube, and other community users (mostly Jacek Cencek) to attract proper attention to this bug from the right Oracle developer, Dmitry Lenev, before it ended up "Verified" based on code review. We still cannot be sure if MySQL 8 is also affected.
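Here is the sketch referenced in the Bug #95058 item above; the table is hypothetical and merely illustrates the pattern from the report:

CREATE TABLE t (id INT PRIMARY KEY, flag BOOLEAN, KEY idx_flag (flag));
EXPLAIN SELECT id FROM t WHERE flag = TRUE;   -- can use idx_flag
EXPLAIN SELECT id FROM t WHERE flag IS TRUE;  -- per the report, idx_flag is not used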
That's all for now. I have a few more new bug reports that I monitor, but I do not plan to continue with this kind of review on this blog in the upcoming few months. I hope I'll soon get a reason to write a different kind of post, with more in-depth studies of various topics...

In any case you may follow me on Twitter for anything related to recent interesting or wrongly handled MySQL bug reports.

This view of Chelsea from our apartment at Chelsea Cloisters reminds me that last year I spent spring holiday season properly - no time was devoted to MySQL bugs :) To summarize:
  1. Do not use O_DIRECT_NO_FSYNC value for innodb_flush_method if your redo logs are located on different device than your data files. Just don't.
  2. Some Oracle engineers who process bugs still do not care to check if all supported major versions are affected and/or share the results of such checks in public.
  3. There are still many details of InnoDB locking to study, document properly and maybe fix.
  4. I am really concerned with the state of MySQL optimizer. We see all kinds of weird bugs (including regressions) and very few fixes in each maintenance release.

MySQL Community Contributor Awards 2019

Last week, Oracle reported on the list of MySQL Community Contributors for 2019 and we’re proud to say that four serving Perconians made it to the list of 19 people recognized as such. Entries are nominated by the MySQL community and their nominations are assessed against six criteria by a volunteer committee of community mavens.

  1. The most active code contributor
  2. Bug reporter
  3. The most active MySQL blogger
  4. People who play a very active role in the translation/documentation of MySQL articles
  5. People who provide feedback on DMR releases, Labs releases, or change proposals
  6. Anyone in the community who worked their butt off or did really useful work that you feel should be thanked publicly

The members of Percona who will be proudly wearing their community contributor t-shirts at Percona Live are:

  • Laurynas Biveinis, Technical Director in Engineering, for a lot of code contributions & bug submissions.
  • Sveta Smirnova, Principal Bug Escalation Specialist, for popularization of MySQL Performance Schema, general MySQL troubleshooting and promoting new MySQL features.
  • Yura Sorokin, Senior Software Engineer, for a lot of code contributions & bug submissions.
  • Peter Zaitsev, CEO, for his active role in MySQL world as well as for his contributions to MySQL Community.

Well done to them, as well as to the other fifteen contributors the committee recognized this year. You can read the full list on Oracle’s blog post.

More award announcements to come

Of course, that’s not the end of proceedings for 2019, since as has become a tradition, the major awards in three categories will be announced at Percona Live. Emily Slocombe, Co-Secretary of the Committee, will be making the announcements and presentations during the conference keynote addresses. These major awards are in one of three categories:

MySQL Community Awards: Community Contributor of the year 2019

This is a personal award; a winner would be a person who has made a contribution to the MySQL ecosystem. This could be via development, advocating, blogging, speaking, supporting, etc.

MySQL Community Awards: Application of the year 2019

An application, project, product etc. which supports the MySQL ecosystem by either contributing code, complementing its behavior, supporting its use, etc. This could range from a one person open source project to a large scale social service.

MySQL Community Awards: Corporate Contributor of the year 2019

A company who made a contribution to the MySQL ecosystem. This might be a corporation which released major open source code; one that advocates for MySQL; one that helps out community members by… anything.

Taking part

If you missed submitting your nominations this year, it’s worth putting a knot in your handkerchief to remind yourself to visit the website next time around. Any year that has a zero on the end is always, somehow, extra special, so 2020 should be well contested. The website also has a hall of fame recording past winners and their achievements.

Of course, you could also come and cheer along this year's winners at the Percona Live Open Source Database Conference in Austin, TX, from May 28-30, 2019, where we will be celebrating not just MySQL but all open source databases and related topics. Find out what it's all about, and enjoy being a part of this big, open-hearted, open source database community.

This year, we’re also providing stand space free of charge to non-profit organizations related to open source databases, software, and development in the widest sense. If you’d be interested in occupying one of these spaces, please get in touch.

See you there?


Image by Free-Photos from Pixabay

MySQL Server 8.0.16: Thanks for the Contributions

MySQL 8.0.16 was released last week and includes many great features, including support for CHECK constraints and upgrades without the need for mysql_upgrade. As usual, there are also several contributions from the community. These are the ones I would like to highlight in this blog to say “thank you for the contributions”.

Facebook has contributed another two patches, and so has Daniel Black. Additionally, there are patches from Yuhui Wang, Wei Zhao, Yan Huang, and Dirkjan Bussink. The contributions are:

  • mysql-test-run.pl now supports the MTR_UNIQUE_IDS_DIR environment variable, which may be set to specify a unique-IDs directory to be used as the common location for all chroot environments by multiple simultaneous mysql-test-run.pl instances. This enables those instances to avoid conflicts when reserving port numbers. Thanks to Facebook for the contribution. (Bug #29221085, Bug #93950)
  • Changes to the all_persisted_variables test. Thanks to the Facebook team for the contribution. (Bug #29013375, Bug #93478)
  • InnoDB: Write-ahead did not work as expected due to an incorrectly initialized variable. Thanks to Yuhui Wang for the contribution. (Bug #29028838, Bug #93442)
  • InnoDB: A Linux AIO handler function failed to check if completed I/O events succeeded. Thanks to Wei Zhao for the contribution. (Bug #27850600, Bug #90402)
  • InnoDB: A function called by a CREATE TABLE thread attempted to access a table object after it was freed by a background thread. Thanks to Yan Huang for the patch. (Bug #27373959, Bug #89126)
  • Replication: If the WAIT_FOR_EXECUTED_GTID_SET() function was used with a timeout value including a fractional part (for example, 1.5), an error in the casting logic meant that the timeout was rounded down to the nearest whole second, and to zero for values less than 1 second (for example, 0.1). The casting logic has now been corrected so that the timeout value is applied as originally specified with no rounding. Thanks to Dirkjan Bussink for the contribution. (Bug #29324564, Bug #94247)
  • Determination of the number of online CPUs available to the mysqld process is now more accurate. Thanks to Daniel Black for the contribution. (Bug #28907677, Bug #93144)
  • Made a comparison in the internal method Item_result::item_cmp_type() more efficient. Our thanks to Daniel Black for the contribution. (Bug #92784, Bug #28796107)

It sometimes happens that we receive a contribution that we really like but – for one reason or another – choose not to use the patch. One such case is the new feature to allow reconfiguration of the SSL settings at runtime. Thank you, Facebook, for submitting a contribution for a similar feature even though we ended up not using the patch.

Thank you for your contributions. Feel free to keep submitting ideas to the MySQL bugs database on how to improve MySQL.

Jquery Duplicate Fields Form Submit with PHP

This tutorial explains how to submit jQuery duplicate/clone field values to a form with PHP. We are going to provide a very simple example using an existing jQuery plugin, relCopy.js, to duplicate the existing field.


Let's see the example below, where we duplicate the name field of an HTML form.

index.html
<html>
<head>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript" src="relCopy.js"></script>
<script type="text/javascript">
$(function(){
var removeLink = ' <a class="remove" href="#" onclick="$(this).parent().slideUp(function(){ $(this).remove() }); return false">remove</a>';

$('a.add').relCopy({ append: removeLink});
});
</script>
</head>

<body>
<form method="post" action="displayData.php">
<p class="clone"> <input type="text" name="name[]" class='input'/></p>
<p><a href="#" class="add" rel=".clone">Add More</a></p>
<input type="submit" value=" Submit " />
</form>
</body>

</html>

displayData.php
<?php

if (isset($_POST['name'])) {
    $array = $_POST['name'];
    foreach ($array as $name) {
        if (strlen($name) > 0) {
            // escape the submitted value before printing it back
            echo '<h2>' . htmlspecialchars($name) . '</h2>';
            // implement your MySQL query to store the data, e.g.
            // $sql = mysql_query("INSERT INTO Employee ..............");
        }
    }
}
?>
This is all about jQuery duplicate fields form submit with PHP. Thank you for reading this article; if you have any problems, or a better and more useful solution for this topic, please write a message in the comment section.

Upcoming Webinar Wed 5/1: Horizontally scale MySQL with TiDB while avoiding sharding

Join Percona CEO Peter Zaitsev and PingCAP Senior Product and Community Manager Morgan Tocker as they present How to horizontally scale MySQL with TiDB while avoiding sharding issues on Wednesday, May 1st, 2019, at 11:00 AM PDT (UTC-7) / 2:00 PM EDT (UTC-4).

Register Now

In this joint webinar, PingCAP will provide an introduction and overview of TiDB, tailored for those with a strong background in MySQL. PingCAP will use MySQL as an example to explain various implementation details of TiDB and translate terminology to MySQL/InnoDB terms.

Percona will then discuss sharding issues within MySQL and how TiDB resolves many of those issues through horizontal scalability.

In this webinar, the following topics will be covered:

– TiDB and the problems it resolves
– Issues with MySQL sharding, including cautionary use cases
– Benchmark results from different versions of TiDB
– MySQL compatibility with TiDB
– Summary, and question and answer

Register for this webinar on how to scale MySQL with TiDB.

MySQL is Ready for Fedora 30

Fedora 30 is out today. We congratulate the Fedora community on another rev of their favorite Linux distro. As usual, we support the latest Fedora from day one, and we have added the following MySQL products to our official MySQL yum repos: MySQL Server (8.0.16 and 5.7.26) Connector C++ 8.0.16 Connector Python 8.0.16 Connector ODBC […]

Exposing MyRocks Internals Via System Variables: Part 2, Initial Data Flushing

In this blog post, we continue on our series of exploring MyRocks mechanics by looking at configurable server variables and column family options. In our last post, I explained at a high level how data first entered memory space and in this post, we’re going to pick up where we left off and talk about how the flush from immutable memtable to disk occurs. We’re also going to talk about how newly created secondary indexes on existing tables get written to disk.

We already know from our previous post in the series that a flush can be prompted by one of several events, the most common of which would be when an active memtable is filled to its maximum capacity and is rotated into immutable status.

When your immutable memtable(s) is ready to flush, MyRocks will call a background thread to collect the data from memory and write it to disk. Depending on how often your records are updated, it’s possible that multiple versions of the same record may exist in the immutable memtable(s), so the flush thread will have to check and remove any record duplication so only the latest record state gets added to the data file.

Once deduplication is complete, the contents of the immutable memtable are written to disk in a series of data blocks (4k by default) that make up the data file that you can find in your MyRocks data directory with the extension ‘.sst’. The size of the data file is going to be the same size as the immutable memtable(s). Remember, the size of your memtable is designated by the column family option write_buffer_size.

There is also metadata added to the file including checksums to ensure that there have been no issues with storage between the time the data file was written and the next time it’s read. Top level index information is also written in order to speed up the time it takes to locate the record you’re seeking within the file itself. Other forms of metadata within the data file will be addressed later in the series when we cover bloom filters.

Variables and CF_OPTIONS

Now that we know a little bit about how data is flushed from memory to disk, let’s take a more detailed look at the mechanics associated with the flushing process as well as the variables and options that are associated with them.

Rocksdb_max_background_jobs

In my last post, we mentioned when flushing would occur based on memtable capacity, the size of the write buffer, etc. Once the system determines that a flush is required it will use a background thread to take the data from an immutable memtable and write it to disk.

The variable rocksdb_max_background_jobs allows you to specify how many threads will sit in the background to support flushing. Keep in mind that this pool of background threads is used to support both flushing and compaction (we’ll discuss compaction in the next post in the series). In previous versions, the number of threads to support memtable to disk flushes was defined by the variable rocksdb_max_background_flushes; however, this is no longer the case as rocksdb_max_background_jobs replaced this variable and similar variables used to define the number of threads that would support compaction. Now all of these have been grouped together and the number of threads that will be used for memtable to disk flushes versus compaction will be automatically decided by MyRocks.

Default: 2

The value of 2 isn’t all that surprising considering that in previous versions the value of rocksdb_max_background_compactions and rocksdb_max_background_flushes were both 1, meaning there was one thread for flushing and one for compaction, two threads total. We still have two threads, but now MyRocks will decide which process those threads are allocated to.

Rocksdb_datadir

When MyRocks flushes data from immutable memtable to disk with the extension ‘.sst’, it will add data files to the MyRocks data directory. You can specify the location of this directory using the system variable rocksdb_datadir.

Default: ./.rocksdb

The default indicates that a directory called ‘.rocksdb’ will be created in the MySQL data directory. The MySQL data directory is defined in system variable datadir, which has a default value of /var/lib/mysql. Assuming both datadir and rocksdb_datadir are their default values, the location of the MyRocks data directory would be /var/lib/mysql/.rocksdb.

I mentioned in my first post in this series that it’s not uncommon for users to want to separate sequential and random I/O, which is why you may want to put your write-ahead logs on one set of disks and your data directory on another. You can take this a step further by isolating your logs, data, and operating system. This is becoming less common as SSDs and similar technologies become the database storage of choice, but if you want to go the route of isolating your OS, data, and logs, you can leverage the rocksdb_datadir system variable for that purpose.

Rocksdb_block_size

Data from immutable memtables are flushed into data files in their entirety after deduplication, but are further broken down into data blocks. These blocks serve as a way to partition the data within the data file in order to speed up lookups when you are looking for a specific record by its key. In each data file there is also what’s called a “top-level index” that provides a key name and offset for each block based on the last key that’s found in each block. If you’re looking for a specific key, it will let you know what data block it’s in, leveraging the top-level index and a block handle.

You can specify how large each data block will be using the system variable rocksdb_block_size.

Default: 4096 (4 Kb)

Rocksdb_rate_limiter_bytes_per_sec

When the background thread is called upon to do a flush from memtable to disk, it will limit its disk activity based on the value of this variable. Keep in mind that this is the total amount of disk throughput allowed for all background threads, and as we stated earlier, background threads include memtable flushing and compaction.

Default: 0

The default of 0, in this case, means that there is no limit imposed on disk activity. It may be worth considering setting this to a non-zero value if you want to make sure disk activity from flushing and compaction doesn’t consume all of your I/O capacity, or if you want to save capacity for other processes such as reads. Keep in mind that if you slow down the process of writing memtable data to disk, under significant load you could theoretically stop write activity if you hit the maximum number of immutable memtables.

For those of you that are familiar with InnoDB you may want to think of this as acting like innodb_io_capacity_max.

One other important thing to note is that according to the documentation, this variable should be able to be changed dynamically; however, my testing has shown that changing this variable to/from a zero value requires a restart. I have created a bug with Percona to bring it to their attention.
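If you do decide to cap background I/O, a sketch of the change could look like this (the 100 MB/s figure is just an example, and as noted above, changing to/from zero may require a restart instead):

SET GLOBAL rocksdb_rate_limiter_bytes_per_sec = 100 * 1024 * 1024;  -- ~100 MB/s
-- verify the new value
SHOW GLOBAL VARIABLES LIKE 'rocksdb_rate_limiter_bytes_per_sec';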

Rocksdb_force_flush_memtable_now

In the case that you would like to flush all the data in memtables to disk immediately, you can toggle this variable. A flush will occur, and then the variable will go back to its original value of ‘OFF’.

One important thing to note is that during this flush process, all writes will be blocked until the flush has completed.

Another important point to raise is that during my testing, I found that you could set this variable to any value, including false and ‘OFF’, and it would still flush data to disk. You will need to exercise caution when working with this variable, as setting it to its default value will still force a flush. In short, don’t set this variable to anything unless you want a flush to occur. I have opened a bug with Percona to bring this to their attention.

Default: OFF
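If you do need to trigger a flush manually, the toggle is simply:

-- Forces an immediate flush of all memtables; writes are blocked while it runs
SET GLOBAL rocksdb_force_flush_memtable_now = ON;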

Rocksdb_checksums_pct

When data is written to disk, a checksum covering a percentage of the file’s contents is written to the SST file. Much like the InnoDB variable innodb_checksum_algorithm, for those of you who are familiar with it, the purpose is to ensure that the checksum can be verified when a data file is read back from disk, so that storage issues such as bit rot did not corrupt the data between the time it was written and the time it was later retrieved for a read operation.

Default: 100

You may be able to increase overall performance by reducing the amount of data that is read to support the checksum, but I would recommend against it as you want to have that assurance that data being read is the same as when it was written.

Rocksdb_paranoid_checks

In addition to checksumming, there is another fault tolerance measure you can take to ensure the accuracy of written data, and this is called “paranoid checking”. With paranoid checking enabled, data files will be read immediately after they are written in order to ensure the accuracy of the data.

Default: ON

I would be inclined to leave this enabled as I prefer to do all possible checking in order to make sure data is written with the highest degree of accuracy.

Rocksdb_use_fsync

When data files are written to disk, they are typically done using fdatasync() which utilizes caching in Linux, but doesn’t offer assurances that data is actually on disk at the end of the call. In order to get that assurance, you can use the variable rocksdb_use_fsync to specify that you would rather have MyRocks call fsync() which will assure a disk sync at the time that the request to write data is complete.

Default: OFF

The most likely reason that this is disabled by default is to allow for the performance gains achieved by the asynchronous writing nature of fdatasync(). The potential loss of data that is sitting in the OS cache but not yet on disk during a full system crash may or may not be acceptable for your workload, so you may want to consider adjusting this variable.

Rocksdb_use_direct_io_for_flush_and_compaction

If you would rather avoid the disk caching elements of fdatasync() or fsync() for writes to data files via memtable flushes or compaction, you have the option to do so by enabling the variable rocksdb_use_direct_io_for_flush_and_compaction. When it comes to flushing and compaction, this will override the value of rocksdb_use_fsync and instead will make MyRocks use O_DIRECT when writing data to disk.

Default: OFF

In the wiki for RocksDB, you will see that performance gains from using O_DIRECT are dependent on your use case and are not assured. This is true of all storage engines, and testing should always be performed before attempting to adjust a variable like this.

Keep in mind that I have recommended O_DIRECT in the past for InnoDB, but that doesn’t apply here as MyRocks is a very different engine and there really isn’t enough data out there to say what is the best write method for most use cases so far. Exercise caution when changing your write method.

Rocksdb_bytes_per_sync

Another thing that is important to understand about syncing to disk is knowing how often it occurs, and that’s where rocksdb_bytes_per_sync comes into play. This variable controls how often a call is made to sync data during the process while data is being written to disk, specifically after how many bytes have been written. Keep in mind that write-ahead logs have their own variable, rocksdb_wal_bytes_per_sync, so rocksdb_bytes_per_sync is just for data files. Also, be aware that depending on what syncing function is called (see above for rocksdb_use_fsync and rocksdb_use_direct_io_for_flush_and_compaction) this may be an asynchronous request for a disk sync.

Default: 0

With the default value of 0, MyRocks will disable the feature of requesting syncs after the designated number of bytes and instead will rely on the OS to determine when syncing should occur.

It is recommended that users of MyRocks not use this feature as a way of ensuring a persistency guarantee.

Rocksdb_merge_buf_size

Typically, when new data gets flushed into persisted space, it ends up in the highest compaction layer, L0. This will be explained in more detail in the next blog post. There is one exception to this rule, and that’s when a new secondary index is added to an existing table: that data skips this process and gets written to the bottom-most compaction level available, which in MyRocks is L6 by default. Think of this as a way for secondary index data to get to its final destination faster. It does this by doing a merge sort of existing data to support the secondary index.

In order to better understand merge sort processes, I would recommend reading this blog post on hackernoon.

There is a memory cache that is used to support the merge sort process specifically for secondary index creation and it’s called the ‘merge buffer’. The rocksdb_merge_buf_size determines how large this buffer will be.

Default: 64Mb

Rocksdb_merge_combine_read_size

If you checked out the blog post on hackernoon that I mentioned, you’ll know that sorting eventually requires combining the smaller broken down sub-arrays back into the full sorted list. In the case of MyRocks, this uses a completely separate buffer called the “merge combine buffer”. The variable rocksdb_merge_combine_read_size determines how large the merge combine buffer will be.

Default: 1 Gb

You’ll see in the next variable we cover (rocksdb_merge_tmp_file_removal_delay_ms) that MyRocks will create merge files on disk to help support the process of creating new secondary indexes so I/O can occur, but with larger memory buffers you will see less IO.

My take on this would be to not change the global value of this variable, but instead to change it only within the session that I’m using to create the secondary index. Keep in mind that the tradeoff here is that you’re using more memory to speed up index creation; however, if you set the global value of this variable to a large size and forget about it, that large amount of memory may be allocated when you didn’t expect it, which may consume more memory resources than you anticipated, which could lead to issues like OOM, etc.

Rocksdb_merge_tmp_file_removal_delay_ms

In addition to the in-memory resources used to work with the merge sort process of creating new secondary indexes, you may also get merge files created on disk. These are temporary files that you will find in the MyRocks data directory with the .tmp extension. Once the secondary index creation process is complete, these files are immediately deleted. For storage solutions like flash, removing large amounts of data can cause trim stalls. This variable allows you to apply a rate-limiting delay to the removal in order to help prevent this issue.

Default: 0 (no delay)

I wouldn’t change the value of this variable unless you have flash storage. If you do use flash storage, you can test by creating and removing indexes to determine what value would be best for this variable. Given that there are no other implications to setting this variable, I would recommend setting the variable globally, including an addition to the my.cnf.

Associated Metrics

Here are some of the metrics you should be paying attention to when it comes to initial data flushing.

You can find the following information using system status variables; a query sketch for pulling them follows the list.

  • Rocksdb_flush_write_bytes: Shows the amount of data that has been written to disk as part of a flush, in bytes, since the last MySQL restart.
  • Rocksdb_number_sst_entry_delete: The number of record delete markers written by MyRocks to a data file since the last MySQL restart.
  • Rocksdb_number_sst_entry_singledelete: The number of record single delete markers written by MyRocks to a data file since the last MySQL restart. This will make a bit more sense after we cover SingleDelete() markers in the next post in the series.
  • Rocksdb_number_deletes_filtered: Shows the number of times a deleted record was not persisted to disk because it referenced a key that did not exist, since the last MySQL restart.
  • Rocksdb_stall_memtable_limit_slowdowns: The number of slowdowns that have occurred due to MyRocks getting close to the maximum number of allowed memtables since the last MySQL restart.
  • Rocksdb_stall_memtable_limit_stops: The number of write stalls that have occurred due to MyRocks hitting the maximum number of allowed memtables since the last MySQL restart.
  • Rocksdb_stall_total_slowdowns: The total number of slowdowns that have occurred in the MyRocks engine since the last MySQL restart.
  • Rocksdb_stall_total_stops: The total number of write stalls that have occurred in the MyRocks engine since the last MySQL restart.
  • Rocksdb_stall_micros: How long the data writer had to wait for a flush to finish since the last restart of MySQL.
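Here is the query sketch referenced above for pulling these counters:

SHOW GLOBAL STATUS LIKE 'rocksdb_flush_write_bytes';
SHOW GLOBAL STATUS LIKE 'rocksdb_stall%';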

In the information_schema.ROCKSDB_CFSTATS table, you can find the following information about each column family.

  • MEM_TABLE_FLUSH_PENDING: Shows you if there is a pending operation to flush an immutable memtable to disk.

In the performance_schema, you may find the following setup instrument to be helpful.

  • wait/synch/mutex/rocksdb/sst commit: Shows the mutex wait time during the sst (data file) commit process.
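For reference, a quick way to check the column family flush status and the instrument above (column names as found in current MyRocks builds; adjust if your version differs):

SELECT CF_NAME, STAT_TYPE, VALUE
  FROM information_schema.ROCKSDB_CFSTATS
 WHERE STAT_TYPE = 'MEM_TABLE_FLUSH_PENDING';

SELECT NAME, ENABLED, TIMED
  FROM performance_schema.setup_instruments
 WHERE NAME LIKE 'wait/synch/mutex/rocksdb/%';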
Conclusion

In this post, we talked about the mechanics that are involved in flushing data from immutable memtables to disk. We also mentioned a few things about compaction layers, but just enough to help illustrate what’s going on with that initial flush from immutable memtable to disk. Stay tuned for my next post where we’ll do a deeper dive into the mechanics surrounding compaction.

Flashback Recovery in MariaDB/MySQL/Percona

In this blog, we will see how to do flashback recovery, i.e. rolling back data, in MariaDB, MySQL, and Percona Server.

As the saying goes, "all humans make mistakes". In a database environment, data can be deleted or updated either intentionally or accidentally.

To recover the lost data we have multiple ways.
  • The data can be recovered from the latest full or incremental backup, but when the data size is huge it could take hours to restore.
  • From a backup of the binlogs.
  • Data can also be recovered from delayed slaves; this is helpful when the mistake is found immediately, within the period of the delay.

Any of the above can help to recover the lost data, but what really matters is: how long does it take to roll back or recover the data, and how much downtime is needed to get back to the initial state?

To overcome such disasters, mysqlbinlog has a very useful option, --flashback, which ships with the MariaDB server binary. Although it comes with MariaDB server, it works well with Oracle MySQL servers and the Percona flavour of MySQL.

What is Flashback?

Restoring data back to a previous snapshot of a MySQL database or table is called flashback.

The flashback option helps us undo executed row changes (DML events).

For instance, it can change DELETE events to INSERTs and vice versa, and it will swap the WHERE and SET parts of UPDATE events.

Prerequisites for using flashback (a quick check is sketched after this list):
  • binlog_format = ROW
  • binlog_row_image = FULL
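A quick check of these prerequisites, as referenced above (they can also be set in my.cnf so they survive a restart):

SELECT @@binlog_format, @@binlog_row_image;
SET GLOBAL binlog_format = 'ROW';
SET GLOBAL binlog_row_image = 'FULL';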

Let us simulate a few test cases where flashback comes as a boon for recovering data.

To simulate the test cases, I am using the employees table and MariaDB version 10.2.

MariaDB [employees]> select @@version;
+---------------------+
| @@version           |
+---------------------+
| 10.2.23-MariaDB-log |
+---------------------+
1 row in set (0.02 sec)

Table structure :

MariaDB [employees]> show create table employees\G
*************************** 1. row ***************************
       Table: employees
Create Table: CREATE TABLE `employees` (
  `emp_no` int(11) NOT NULL,
  `birth_date` date NOT NULL,
  `first_name` varchar(14) NOT NULL,
  `last_name` varchar(16) NOT NULL,
  `gender` enum('M','F') NOT NULL,
  `hire_date` date NOT NULL,
  PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Case 1: Rolling back deleted data

Consider that data was deleted from the employees table where first_name = 'Chirstian'.

MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian';
+----------+
| COUNT(*) |
+----------+
|      226 |
+----------+
1 row in set (0.07 sec)

MariaDB [employees]> delete from employees where first_name ='Chirstian';
Query OK, 226 rows affected (0.15 sec)

To revert the data to the initial state, we need to decode the binlog and get the start and stop positions of the delete event that happened on the employees table.

It is necessary to take proper start and stop positions. The start position should be taken exactly after the BEGIN and the stop position before the final COMMIT.

[root@vm3 vagrant]# mysqlbinlog -v --base64-output=DECODE-ROWS /var/lib/mysql/mysql-bin.000007 > mysql-bin.000007.txt

BEGIN
/*!*/;
# at 427
# at 501
#190417 17:49:49 server id 1  end_log_pos 501 CRC32 0xc7f1c84b  Annotate_rows:
#Q> delete from employees where first_name ='Chirstian'
#190417 17:49:49 server id 1  end_log_pos 569 CRC32 0x6b1b5c98  Table_map: `employees`.`employees` mapped to number 29
# at 569
#190417 17:49:49 server id 1  end_log_pos 7401 CRC32 0x6795a972  Delete_rows: table id 29 flags: STMT_END_F
### DELETE FROM `employees`.`employees`
### WHERE
###   @1=10004
###   @2='1954:05:01'
###   @3='Chirstian'
###   @4='Koblick'
###   @5=1
###   @6='1986:12:01'
# at 23733
#190417 17:49:49 server id 1  end_log_pos 23764 CRC32 0xf9ed5c3e  Xid = 455
### DELETE FROM `employees`.`employees`
### WHERE
###   @1=498513
###   @2='1964:10:01'
###   @3='Chirstian'
###   @4='Mahmud'
###   @5=1
###   @6='1992:06:03'
# at 7401
COMMIT
/*!*/;
# at 23764
#190417 17:49:49 server id 1  end_log_pos 23811 CRC32 0x60dfac86  Rotate to mysql-bin.000008  pos: 4
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;

Once the row count is verified from the taken positions, we can prepare the .sql file using flashback as below:

[root@vm3 vagrant]# mysqlbinlog  -v --flashback --start-position=427 --stop-position=7401 /var/lib/mysql/mysql-bin.000007  > insert.sql

Below is a comparison of the conversion from DELETE to INSERT for a single record:

### DELETE FROM `employees`.`employees`
### WHERE
### @1=498513
### @2='1964:10:01'
### @3='Chirstian'
### @4='Mahmud'
### @5=1
### @6='1992:06:03'

### INSERT INTO `employees`.`employees`
### SET
### @1=498513
### @2='1964:10:01'
### @3='Chirstian'
### @4='Mahmud'
### @5=1
### @6='1992:06:03'

MariaDB [employees]> source insert.sql
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)

And the count is verified after the data load.

MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian';
+----------+
| COUNT(*) |
+----------+
|      226 |
+----------+
1 row in set (0.06 sec)

Case 2: Rolling back updated data

The data was updated based on the conditions below:

MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian' and gender='M';
+----------+
| COUNT(*) |
+----------+
|      129 |
+----------+
1 row in set (0.14 sec)

MariaDB [employees]> update employees set gender='F' where first_name ='Chirstian' and gender='M';
Query OK, 129 rows affected (0.16 sec)
Rows matched: 129  Changed: 129  Warnings: 0

MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian' and gender='M';
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (0.07 sec)

To revert the updated data, the same steps are followed as in Case 1.

[root@vm3 vagrant]# mysqlbinlog -v --flashback --start-position=427 --stop-position=8380 /var/lib/mysql/mysql-bin.000008 > update.sql

MariaDB [employees]> source update.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)

MariaDB [employees]> select COUNT(*) from employees where first_name ='Chirstian' and gender='M';
+----------+
| COUNT(*) |
+----------+
|      129 |
+----------+
1 row in set (0.06 sec)

In the above two cases, using the --flashback option, we were able to convert DELETE events into INSERT events and reverse UPDATE events by swapping their SET and WHERE parts.

There are a few limitations of Flashback (a quick check for the last two is shown after this list):
  • It doesn't support DDL (DROP, TRUNCATE or other DDL statements)
  • It doesn't support encrypted binlogs
  • It doesn't support compressed binlogs
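If you are unsure whether the last two limitations apply to your server, the relevant settings can be checked like this (these variable names are MariaDB-specific and do not exist in Oracle MySQL):

-- Check whether binary logs are encrypted or compressed (MariaDB-specific variables)
SHOW GLOBAL VARIABLES WHERE Variable_name IN ('encrypt_binlog', 'log_bin_compress');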

Key Takeaways:

  • It reverses mishandled operations using the binary logs.
  • There is no need to stop the server to carry out this operation.
  • When the amount of data to revert is small, the flashback process is much faster than recovering the data from a full backup.
  • Point-in-time recovery (PITR) becomes easy.


What configuration settings did I change on my MySQL Server?

This post is just a reminder on how to find which settings have been set on MySQL Server.

If you have modified some settings from a configuration file or at runtime (persisted or not), these two queries will show you what the values are and how they were set. Even if the value is the same as the default (COMPILED) in MySQL, if you have set it somewhere you will be able to see where you did it.

Global Variables

First, let’s list all the GLOBAL variables that we have configured in our server:

SELECT t1.VARIABLE_NAME, VARIABLE_VALUE, VARIABLE_SOURCE
FROM performance_schema.variables_info t1
JOIN performance_schema.global_variables t2
ON t2.VARIABLE_NAME=t1.VARIABLE_NAME
WHERE t1.VARIABLE_SOURCE != 'COMPILED';

This is an example of the output:
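The variables_info table also records when and by whom a dynamic variable was last changed, so the query above can be extended if you need that detail (SET_TIME, SET_USER and SET_HOST are NULL for variables that were not set at runtime):

-- Include who changed each variable and when (NULL if not set at runtime)
SELECT t1.VARIABLE_NAME, t2.VARIABLE_VALUE, t1.VARIABLE_SOURCE,
       t1.SET_TIME, t1.SET_USER, t1.SET_HOST
FROM performance_schema.variables_info t1
JOIN performance_schema.global_variables t2
  ON t2.VARIABLE_NAME = t1.VARIABLE_NAME
WHERE t1.VARIABLE_SOURCE != 'COMPILED';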

Session Variables

And now the same query for the session variables:

SELECT t1.VARIABLE_NAME, VARIABLE_VALUE, VARIABLE_SOURCE
FROM performance_schema.variables_info t1
JOIN performance_schema.session_variables t2
ON t2.VARIABLE_NAME=t1.VARIABLE_NAME
WHERE t1.VARIABLE_SOURCE = 'DYNAMIC';

And an example:
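As a minimal illustration (a hypothetical session, not the original output), a variable changed with SET SESSION shows up with source DYNAMIC:

-- Change a session variable, then list everything set dynamically in this session
SET SESSION sort_buffer_size = 4194304;

SELECT t1.VARIABLE_NAME, t2.VARIABLE_VALUE, t1.VARIABLE_SOURCE
FROM performance_schema.variables_info t1
JOIN performance_schema.session_variables t2
  ON t2.VARIABLE_NAME = t1.VARIABLE_NAME
WHERE t1.VARIABLE_SOURCE = 'DYNAMIC';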

You can also find some more info in this previous post. If you are interested in the default values of different MySQL versions, I also invite you to visit Tomita Mashiro's online tool: https://tmtm.github.io/mysql-params/

If you submit bugs to MySQL, I invite you to also include the output of these two queries.

Shinguz: FromDual Ops Center for MariaDB and MySQL 0.9 has been released

FromDual has the pleasure to announce the release of the new version 0.9 of its popular FromDual Ops Center for MariaDB and MySQL focmm.

The FromDual Ops Center for MariaDB and MySQL (focmm) helps DBAs and System Administrators to manage MariaDB and MySQL database farms. Ops Center makes the life of DBAs and Admins easier!

The main task of Ops Center is to support you in your daily MySQL and MariaDB operation tasks. You can find more information about FromDual Ops Center here.

Download

The new FromDual Ops Center for MariaDB and MySQL (focmm) can be downloaded from here. How to install and use focmm is documented in the Ops Center User Guide.

In the inconceivable case that you find a bug in the FromDual Ops Center for MariaDB and MySQL please report it to the FromDual bug tracker or just send us an email.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Installation of Ops Center 0.9

You can find a complete guide on how to install FromDual Ops Center in the Ops Center User Guide.

Upgrade from 0.3 to 0.9

An upgrade from 0.3 to 0.9 should happen automatically. Please make a backup of your Ops Center instance before you upgrade! Please also check Upgrading.

Changes in Ops Center 0.9

Everything has changed!


MySQL 8.0.16: mysql_upgrade is going away

As of 8.0.16, the mysql_upgrade binary is deprecated, but its functionality is moved into the server. Let’s call this functionality the “server upgrade”. This is added alongside the Data Dictionary upgrade (DD Upgrade), which is a process to update the data dictionary table definitions.…

