Planet MySQL

MySQL Connector/C++ 1.1.11 has been released

Dear MySQL Users,

A new GA (general availability) version of MySQL Connector/C++ has been made available: MySQL Connector/C++ 1.1.11 GA. The MySQL Connector/C++ provides a C++ API for connecting client applications to the MySQL Server 5.5 or newer.

You can download the production release at:

http://dev.mysql.com/downloads/connector/cpp/1.1.html

MySQL Connector/C++ (Commercial) will be available for download on the My Oracle Support (MOS) website. This release will be available on eDelivery (OSDC) in next month’s upload cycle.

The MySQL driver for C++ offers an easy-to-use API derived from JDBC 4.0. MySQL Workbench has used it successfully for years.

We have improved the driver since the last GA release. Please see the documentation and the CHANGES file in the source distribution for a detailed description of bugs that have been fixed. Bug descriptions are also listed below.

Enjoy!

Changes in MySQL Connector/C++ 1.1.11 (2018-04-30, General Availability)

Functionality Added or Changed

  • MySQL Connector/C++ 1.1 now works with both MySQL 5.7 GA and MySQL 8.0 GA servers.
      • Applications can connect to MySQL 8.0 servers using accounts that authenticate using the caching_sha2_password authentication plugin.
      • Applications can connect to MySQL 8.0 servers using unencrypted connections by using the OPT_GET_SERVER_PUBLIC_KEY connection option with a value of true.
      • Connector/C++ 1.1 can be built from source against either MySQL 5.7 or MySQL 8.0 server installations.
      • A new BUNDLE_DEPENDENCIES CMake option is available. If enabled, the external libraries on which Connector/C++ depends at runtime (such as OpenSSL) are packaged together with the connector.
  • For connections to the server made using the legacy JDBC API (that is, not made using X DevAPI or XAPI), Connector/C++ now supports an OPT_GET_SERVER_PUBLIC_KEY connection option that enables requesting the RSA public key from the server. For accounts that use the caching_sha2_password or sha256_password authentication plugin, this key can be used during the connection process for RSA key-pair based password exchange with TLS disabled. This capability requires a MySQL 8.0 GA server, and is supported only for Connector/C++ built using OpenSSL.

Bugs Fixed

    • MySQL Connector/C++ packages now include a BUILDINFO.txt file that contains information about the build environment used to produce the distribution. (Bug #23556661)

On Behalf of the MySQL/ORACLE RE Team
Piotr Obrzut

MySQL Connector/NET 6.10.7 GA has been released

Dear MySQL users,

MySQL Connector/Net 6.10.7 is the fourth GA release with .NET Core
now supporting various connection-string options and MySQL 8.0 server
features.

To download MySQL Connector/Net 6.10.7 GA, see the “Generally Available
(GA) Releases” tab at http://dev.mysql.com/downloads/connector/net/

Changes in MySQL Connector/Net 6.10.7 (2018-04-30, General Availability)

Functionality Added or Changed

  • Connections made to MySQL 8.0 (up to and including version 8.0.3) and compatibility with the new data dictionary are now supported. For information about the data dictionary, see MySQL Data Dictionary (http://dev.mysql.com/doc/refman/8.0/en/data-dictionary.html).
  • Support for the caching_sha2_password authentication plugin through the classic MySQL protocol was added. In addition, the sha256_password plugin was extended to support authentication when RSA keys are available through non-secure connections. Caching SHA-2 pluggable authentication offers faster authentication than basic SHA-256 authentication.
  • Support was added for the new caching_sha2_password padding mechanism introduced in the MySQL 8.0 release series. The new padding mechanism is enabled when all of the following conditions apply:
      • The user account is set with the caching_sha2_password authentication plugin.
      • SSL is disabled explicitly (SslMode=none).
      • The AllowPublicKeyRetrieval connection option is enabled (AllowPublicKeyRetrieval=true).
    When enabled, the new padding mechanism is used to encode the password during RSA key encryption, which applies the correct padding to match the server.

Bugs Fixed

  • Attempting to open the MySQL Web Configuration Tool, with Connector/Net and MySQL for Visual Studio prerequisites installed properly, displayed an error message instead of opening the tool. (Bug #27457398, Bug #88544)
  • The ADO.NET Entity Data Model wizard within Visual Studio closed unexpectedly without producing the data model. Thanks to Laurents Meyer for the patch. (Bug #27420311, Bug #89338)
  • An exception prevented MySQL.Data.Entity for Entity Framework 6 from operating as expected. Thanks to Cédric Luthi for the patch. (Bug #27360520, Bug #89134)
  • MySQL Installer could not be installed with NuGet packages from Microsoft Visual Studio 2015. (Bug #27251839, Bug #88838)
  • With valid references to the DLLs provided, using DbConfiguration.SetConfiguration(new MySql.Data.Entity.MySqlEFConfiguration()) to set up the DbContext threw an exception. (Bug #25185319)
  • Attempting to generate an Entity Framework model from a MySQL 5.7 database using either EF5 or EF6 produced an exception that prevented the operation from generating the expected model. (Bug #22173048, Bug #79163)
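For illustration only (the server, database and credentials below are placeholders, not part of the release notes), the three padding-mechanism conditions above would typically come together in a classic-protocol connection string along these lines:

server=db.example.com;database=test;uid=appuser;pwd=secret;SslMode=none;AllowPublicKeyRetrieval=true

SslMode and AllowPublicKeyRetrieval are the options named in the change log; the rest is just a sketch of a typical connection string.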

Nuget packages are available at:

https://www.nuget.org/packages/MySql.Data/6.10.7
https://www.nuget.org/packages/MySql.Web/6.10.7
https://www.nuget.org/packages/MySql.Data.Entity/6.10.7
https://www.nuget.org/packages/MySql.Data.EntityFrameworkCore/6.10.7
https://www.nuget.org/packages/MySql.Data.EntityFrameworkCore.Design/6.10.7

Enjoy and thanks for the support!

On behalf of the MySQL Release Team,
Nawaz Nazeer Ahamed

Cloud Database Features Comparison - Amazon RDS vs Google Cloud SQL

As more companies run their workloads in the cloud, cloud database services are increasingly being used to manage data. One of the advantages of using a cloud database service instead of maintaining your database yourself is that it reduces the management overhead. Database services from the leading cloud vendors share many similarities, but they have individual characteristics that may make them well- or ill-suited to your workload. Developers are always looking for convenient ways of running their databases, whether it is to obtain more profound insight into database performance, to perform a migration efficiently, to simplify backup and restore processes, or to do many other "day to day" tasks. Among the many available cloud services, it may not be easy to figure out which is the best one for your use case. In this article, we’ll compare two of the most popular cloud database services on the market - Google Cloud SQL and Amazon RDS.

Amazon RDS provides a web interface through which you can deploy MySQL. The RDS service manages the provisioning and configuration of the instance. It also provides a console to monitor and perform basic database administration tasks. Google Cloud SQL similarly provides a predefined MySQL setup that is automatically managed. Predefined services can be a comfortable way to manage your databases; at the same time, however, they can limit functionality. Let's take a closer look then at these management features.

Database logs and metrics monitoring

Amazon RDS and Google Cloud SQL don't provide access to the shell. Your primary concern here may be access to essential log files. Amazon CloudWatch is a monitoring service for cloud resources which you can use to solve this problem. It collects metrics, collects and monitors log files, and automatically reacts to changes in your AWS resources. Using CloudWatch, you can gather and process the error log, audit log and other logs from RDS into metrics presented in the web console. These statistics are retained for 15 months so you can maintain a history. CloudWatch can also take actions such as sending a notification to a notification recipient or, if needed, triggering autoscaling policies, which in turn may automatically handle an increase in load by adding more resources.

Amazon CloudWatch

Google Cloud also provides log processing functionality. You can view the Google Cloud SQL logs in the operations panel or through the Google console. The operations panel logs every operation performed on the instance with pretty basic information. It can be extended with manually added metrics based on data from a file source. Unfortunately, the operations log does not include activities performed using external management tools, such as the mysql client. To extend the basic functionality, Google has another service - Stackdriver. The Stackdriver service can be used to create alerts for metrics defined in the operations panel. Stackdriver embraces not only Google Cloud Platform (GCP) but also AWS and local services, so you can use it for cross-cloud platform monitoring; accessing non-cloud metrics, however, requires the installation of an open source, collectd-based agent.

Google Cloud SQL logging

There are various ways in which you could monitor MySQL instance metrics: by querying the server for the metric values all the time, or with predefined services. You can get more in-depth, real-time visibility into the health of your Amazon RDS instances with Enhanced Monitoring for Amazon RDS. It provides metrics so that you can monitor the health of your DB instances and DB clusters, covering both DB instance metrics and operating system (OS) metrics.

It provides a set of over 50 database instance metrics and aggregated process information for your instances, at a granularity of 1 second. You can visualize the metrics on the RDS console.

Both CloudWatch and Stackdriver provide functionality to create alarms based on metrics. Amazon does it through Amazon Simple Notification Service (SNS) for notifications; in Stackdriver it's done directly in the service.

Google Stackdriver monitoring dashboard

Data Migration into Cloud

At this moment, backup-based migration to Google Cloud SQL is quite limited. You can only use a logical dump, which may be a problem for bigger databases. The SQL dump file must not include any triggers, views, or stored procedures. If your database needs these elements, you should recreate them after shipping the data. If you have already created a dump file that holds these components, you need to manually edit the file. The database you are importing into must exist up front. There is no option to migrate to Google Cloud SQL from another RDBMS. All of this makes the process quite limited, not to mention that there is no option for cross-platform migration in real time, as there is with AWS RDS.

AWS Database Migration Service

Amazon Database Migration Service (DMS) supports homogeneous migrations such as MySQL to MySQL, as well as heterogeneous migrations between different database platforms. AWS DMS can help you plan and carry out the migration of on-premises relational data stored in Oracle, SQL Server, MySQL, MariaDB, or PostgreSQL databases. DMS works by setting up and then managing a replication instance on AWS. This instance dumps data from the source database and loads it into the target database.

Achieving High Availability

Google uses semisynchronous replicas to make your database highly available. Cloud SQL provides the ability to replicate a master instance to one or more read replicas. If the zone where the master is located experiences an outage and a failover replica is configured, Cloud SQL fails over to the failover replica.

Google Cloud SQL create read replica

The setup is straightforward, and with a couple of clicks, you can achieve a working slave node. Nevertheless, configuration options are limited and may not fit your system requirements. You can choose from the following replica scenarios:

  • read replica - a read replica is a one-to-one copy of the master. This is the base model, where you create a replica to offload read requests or analytics traffic from the master,
  • external read replica - this option configures an instance that replicates to one or more replicas external to Cloud SQL,
  • external master - set up replication to migrate to Google Cloud SQL.

Amazon RDS provides read replica services. Cross-region read replicas give you the ability to scale, as AWS has its services in many regions of the world. RDS asynchronous replication is highly scalable. All read replicas are accessible and can be used for reads, with a maximum of five replicas per source instance. These nodes are independent and can be used in your upgrade path, or can be promoted to standalone databases.

In addition to that, Amazon offers Multi-AZ deployments based on DRBD, synchronous disk replication. How is it different from Read Replicas? The main difference is that only the database engine on the primary instance is active, which leads to other architectural variations.

Automated backups are taken from the standby. That significantly reduces the possibility of performance degradation during a backup.

As opposed to read replicas, database engine version upgrades happen on the primary. Another difference is that AWS RDS will fail over automatically, while read replicas require manual operations from you.

Multi-AZ failover on RDS uses a DNS change to point to the standby instance; according to Amazon, this should result in 60-120 seconds of unavailability during the failover. Because the standby uses the same storage data as the primary, there will probably be transaction/log recovery. Bigger databases may spend a significant amount of time on InnoDB recovery, so please consider that in your DR plan.

Encryption

Security compliance is one of the critical concerns for enterprises whose data is in the cloud. When dealing with production databases that hold sensitive and vital data, it is highly recommended to implement encryption to protect the data from unauthorized access.

In Google Cloud SQL, customer data is encrypted when stored in database tables, temporary files, and backups. Outside connections can be encrypted by SSL certificates (especially for intra-zone connections to Cloud SQL), or by using the Cloud SQL Proxy. Google encrypts and authenticates all data in transit and data at rest with AES-256.

With RDS encryption enabled, the data stored on the instance's underlying storage, the automated backups, read replicas, and snapshots all become encrypted. The RDS encryption keys implement the AES-256 algorithm. Keys are managed and protected by the AWS key management infrastructure through AWS Key Management Service (AWS KMS). You do not need to make any modifications to your code or operating model to benefit from this critical data protection feature. AWS CloudHSM is a service that helps meet stringent compliance requirements for cryptographic operations and storage of encryption keys by using single-tenant Hardware Security Module (HSM) appliances within the AWS cloud.

Pricing

Instance pricing for Google Cloud SQL is charged for every minute that the instance is running. The cost depends on the machine type you choose for the instance and the region where it's placed. Read replicas and failover replicas are charged at the same rate as stand-alone instances. The pricing starts from $0.0126 per hour for a micro instance and goes up to around $8k for a db-n1-highmem-64 with 64 vCPUs, 416 GB RAM, a 10,230 GB disk and a limit of 4,000 connections.

Like other AWS products, with RDS users pay for what they use. But this pay-as-you-go model has a specific billing construct that can, if left unchecked, yield questions or surprise billing elements if no one’s aware of what’s actually in the bill. Your database options may be billed at anything from $0.175 per hour up to thousands of dollars paid upfront. Both platforms are quite flexible, but you will see more configuration options in AWS.

Infrastructure

As mentioned in the pricing section, Google Cloud SQL can be scaled up to 64 processor cores and more than 400 GB of RAM. The maximum size of the disk is 10 TB per instance, and you can configure your instance settings to increase it automatically. That should be plenty for many projects' requirements. Nevertheless, if we take a look at what Amazon offers, Google still has a long way to go. RDS not only offers powerful instances but also a long list of other services around them.

RDS supports storage volume snapshots, which you can use for point-in-time recovery or share with other AWS accounts. You can also take advantage of its provisioned IOPS feature to increase I/O. RDS can also be launched in Amazon VPC; Cloud SQL doesn’t yet support a virtual private network.

Backup

RDS generates automated backups of your DB instance. RDS establishes a storage volume snapshot of your DB instance, backing up the entire DB instance and not individual databases. Automated backups occur daily during the preferred backup window. If the backup requires more time than allotted to the backup window, the backup continues after the window ends, until it finishes. Read replicas don't have backups enabled by default.

When you want to do a restore, the only option is to create a new instance. It can be restored to the last backup or to a point in time. Binary logs are applied automatically; there is no way to get access to them. The RDS PITR option is quite limited, as it does not allow you to choose an exact time or transaction: you are limited to a 5-minute interval. In most scenarios these settings may be sufficient, but if you need to recover your database to a single transaction or an exact time, you need to be ready for manual actions.

Google Cloud SQL backup data is stored in separate regions for redundancy. With the automatic backup function enabled, a database copy is created every 4 hours. If needed, you can create on-demand backups (for any Second Generation instance), whether the instance has automatic backups enabled or not. Google's and Amazon's approaches to backups are quite similar; however, with Cloud SQL it is possible to perform point-in-time recovery to a specific binary log and position.

Tags:  Amazon RDS MySQL google cloud sql cloud database

Spatial Reference Systems in MySQL 8.0

MySQL 8.0 brings a lot of new spatial features. The main feature is the support for multiple spatial reference systems (SRSs).

Spatial reference systems are not a new concept in MySQL, but 8.0 is the first version where they affect computations, and it’s the first version where users actually have to think about them.…

db4free.net goes MySQL 8.0

MySQL 8.0 has been released as stable (GA) earlier this month. For db4free.net this means it’s time to make MySQL 8.0 the default version and to deprecate the MySQL 5.7 server instance.

The new MySQL 8.0 server is running on the default port 3306. All new registrations will have the database created on this server. It is fresh and empty and will start from scratch.

The previous MySQL 5.7 server will remain available on port 3308. All users who have data there which they want to keep should migrate it to the new MySQL 8.0 server. This will require you to sign up again.

The previous MySQL 8.0 server will remain on port 3307. Both the old MySQL 5.7 and the old MySQL 8.0 server on port 3307 will be available until June 15, 2018. Data which isn’t migrated to the new server instance by then will be lost.

The new MySQL 8.0 server instance will come with the new utf8mb4 character set and the new utf8mb4_0900_ai_ci collation, which are the new defaults in MySQL 8.0. Since db4free.net already used utf8mb4 on the previous MySQL 5.7 server instance, this should not affect many people, if any at all.
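As a small sketch (the database name below is only an example), you can verify the server defaults and pin the character set and collation explicitly when creating a schema:

mysql> SHOW VARIABLES LIKE 'character_set_server';
mysql> SHOW VARIABLES LIKE 'collation_server';
mysql> CREATE DATABASE example_db CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;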

Another long overdue change is that the default timezone (on db4free.net, this is not a change in MySQL 8.0 itself) will be UTC. Previously the servers were set to Central European Time since that’s the home time zone where db4free.net is hosted. But with a large international audience it makes sense to use UTC going forward. The timezone can be changed per connection as described in the MySQL Reference Manual.
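For example, a session that prefers Central European Time can switch its own timezone without touching the server default (the fixed offset below is just an illustration and ignores daylight saving time):

mysql> SET time_zone = '+02:00';
mysql> SELECT @@time_zone, NOW();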

The MySQL 8.0 Reference Manual is the place to go for all general MySQL questions and to find out what’s new in MySQL 8.0 (which are a lot of things).

As always: please backup data which you can’t afford to lose. db4free.net is a testing service and there is always a risk that something goes wrong, like the server doesn’t start up anymore. This has happened before and may happen again, especially with a brand new version. This service comes with no warranties at all.

If you keep that in mind you should have much fun exploring the new (and old) goodies of MySQL 8.0. Consider following db4free.net on Twitter as this is where you get updates and status information the quickest.

Enjoy!

Hidden caches catch your data

This article is different from my usual posts. It explains things that may be obvious to many database professionals – not all of them though.

The idea came indirectly from my friend Francesco Allertsen. He has a weekly mailing list he uses to share links to interesting articles he reads on the web. One of them was The hidden components of Web caching. Its purpose is to list all caches that play some role when we interact with a web site. An interesting idea, even if I find it incomplete. So I thought it was a good idea to talk about caches that we hit whenever we interact with a database.

Why should we care?

But first, a note on why we should care:

  • Caches increase the speed of IO by orders of magnitude. Latency numbers that everyone should know gives you an idea of that. The first comment suggests using a solar system image to visualise the scale; curiously, I had the same idea and used this image for some slides I made in the past.
  • When reliability is important, caches can get in the way dangerously. Caches are volatile, so they won’t survive a crash or other types of hardware/software failure (aka bugs and limitations). So when we write data, for certain use cases (financial transactions, etc), at no time should data be cached but not written to disk. Or it can happen temporarily, before the database says “Ok Mr. Application, I got your data, you can resume your job”.

What guarantees this requirement? Well, caches can be write-through, write-around or write-back. Write-through and write-around caches are reliable in this respect, because data is always written to disk before control is returned to the writer. Write-back caches are not reliable, because data is made persistent asynchronously, after control has been returned. But of course they are much faster, because they allow grouping more IO operations together (higher throughput), because latency is very low, and because there is no stall if the IO capacity is currently saturated.

It’s all about tradeoffs

Depending on the use case, we will have to choose the proper tradeoff between reliability and performance. For example, InnoDB allows you to (see the sketch after this list):

  • Flush changes to disks at every commit: even if mysqld crashes, no data loss is possible if you don’t hit any software/hardware bug;
  • Flush changes to the filesystem cache at every commit: a filesystem crash is the only event that can cause data loss, and it is not likely;
  • Flush data to disk once a second, or even longer intervals.
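In InnoDB these three behaviours map to the innodb_flush_log_at_trx_commit setting; a minimal sketch, assuming an account with the required privileges, looks like this:

-- 1 (default): write and flush the redo log to disk at every commit
-- 2: write to the filesystem cache at commit, flush to disk roughly once per second
-- 0: write and flush roughly once per second
mysql> SET GLOBAL innodb_flush_log_at_trx_commit = 2;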

Also, when we make the choice, we should take into account data redundancy. For example, if we run a Galera Cluster, we have at least 3 copies of the data on different servers. Or we could store data on a RAID array, which also guarantees that we have multiple copies of the data. Failover guarantees that our services don’t break if one copy gets damaged, and we can restore it from another copy. In such cases, even if we deal with critical data, we don’t necessarily need InnoDB to store data in the most reliable way – which is the slowest.

Types of caches

Enough blah blah. Here is the list of caches that could get in the way when we try to persist our data, starting from the lowest levels:

Disk caches – Both spinning disks and SSDs can have caches. In your laptop, this cache is most probably write-back. This can usually be changed with something like hdparm.

Drive controller and RAID caches – These can be write-through or write-back, and usually they are configurable. Notice that they could also be battery-backed, which means that they will survive a crash (unless the device has no power for a long time). Battery-backed caches can safely be write-back, but for other caches a write-through strategy could be necessary. Battery-backed RAIDs need a periodic learning cycle. A learning cycle slows down all operations noticeably, but it is necessary to be sure that the battery is fully charged. This operation should be scheduled carefully.

Filesystem cache – You can use it in a write-through or write-back fashion. This topic is amazingly complex (just like the previous ones), so I’ll just give you a link to a wonderful article: Files are Hard.

Virtual machines – Virtual machines have a disk interface cache. Its write strategy depends on the cache mode. There are several cache modes, but here we’ll only mention the reliable ones: none, which means that the VM doesn’t cache data (but the host system can), and writethrough, whose meaning should now be clear. Virtual machines also have a filesystem cache, of course. Note that having reliable settings on the VM guarantees that data changes will survive if the VM or anything running in it crashes; but if the host doesn’t have reliable settings and it crashes, the most recent changes could be lost. Still, in production, a hypervisor typically runs many VMs. If many VMs bypass the cache, the hypervisor’s IO capacity can easily be saturated. It could be better to build a database cluster with VMs running on separate physical hosts, to reduce the risk of data loss in case of crashes – in other words, often it’s better to rely on redundancy and failover, rather than on the reliability of individual systems.

Transaction log buffer – Terminology differs from DBMS to DBMS (WAL, transaction logs…), but the idea is that changes are persistent when they hit these logs. They will also need to be written to data files, but if they are in the logs they are safe. These logs have a buffer, which contains data not yet flushed. This idea can be confusing for some, so I’ll make it clear: this speeds things up, but doesn’t cause any danger. If your flush strategy is a write-through one, the buffer will contain not-yet-committed changes, which are flushed on commit – and only after the flush does the DBMS report success. Regardless of your flush strategy, some changes are flushed if the buffer gets full.

Binary log buffer – There is not necessarily a binary log separate from the transaction logs. MySQL has one because its architecture requires it – the binary log contains all changes to data and is handled by the server, while the transaction logs contain the information necessary to replay or roll back transactions and are handled by InnoDB (actually even non-transactional storage engines can have logs, but I’m not going to discuss this here). Considerations about the transaction logs apply to the binary log as well, keeping in mind that its purpose is different (incremental backups and replication, not crash recovery). In Postgres you have WAL files, which are used both for incremental backups/replication and for crash recovery.

Buffer pool – Most databases (Postgres is a famous exception) have a buffer pool to cache frequently accessed data and indexes. It can even contain dirty pages: changes that are not yet written to data files. This makes things much faster. And again: changes are persistent when they are written to transaction logs. Even after a crash, data files can be repaired using transaction logs.
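To get a feeling for how many dirty pages are sitting in the buffer pool at a given moment, MySQL exposes status counters; a quick check could look like this:

mysql> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
mysql> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';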

Session buffers, work mem – These buffers speed up parts of query execution, like joins and sorting. However they have nothing to do with writes.

Query cache – Older MySQL versions, MariaDB, and maybe other DBMSs (not sure, sorry) have a query cache. This can speed up reads when the very same query is run often. “Very same” means that hashes of the queries are compared, so any difference is relevant, including whitespace. Every time a table is written, all queries mentioning the table are invalidated in the cache. This, together with its well-known scalability problems, makes it usually a bad idea, at least in MariaDB/MySQL (there are exceptions – for example, if you have low concurrency, a reasonable number of very slow queries and not many writes).
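On MySQL/MariaDB versions that still ship the query cache, a couple of quick queries show whether it is enabled and how it is behaving (hit and invalidation counters):

mysql> SHOW VARIABLES LIKE 'query_cache%';
mysql> SHOW GLOBAL STATUS LIKE 'Qcache%';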

Proxy caches – Proxies, like ProxySQL, can also have a query cache. It can have problems and it is not necessarily useful, but at least it is supposed to be built with scalability in mind (because proxies are about scalability).

Redis & friends – This should be obvious: retrieving data from a cache system (like Redis or Memcached) is much faster than retrieving it from MySQL. Usually those data have a TTL (time to live), which determines when they will expire, and they can also be invalidated manually. Keep in mind that this makes response times unpredictable: if data is cached response time is X, if it is expired time is Y – where X and Y could be very different. It is even more unpredictable if this cache is not enough to contain all your hot data. So you should be careful about what to cache, unless your dataset is small. Note that these caches could also use the disks: for example, older Redis versions had Virtual Memory (currently deprecated). But we will not dig into this, as our focus is the persistent database. The point is: these caches can avoid database queries, but not always.

Application – No matter how fast a proxy’s query cache, Redis and Memcached are: retrieving data from local RAM is much faster. No network round trip, no other server’s response time involved. Of course you shouldn’t cache a large amount of data locally, or your memory will not be enough and your application could suffer. And cache invalidation can be a very complex problem to consider. But still, for the hottest small data, local memory is the fastest option. To avoid making response times unpredictable, it’s better to keep application-level caches updated, instead of running queries when an entry expires. Writes to the database are still necessary, and they can be synchronous or asynchronous, depending on how critical the data is.

Trust no one

A famous fairy said that some lies have short legs and others have a long nose. If hard disks, controllers and even filesystems had noses, some of them would have a long nose.

I will not dig into this complex topic myself, but the takeaway of this paragraph is: don’t trust them. They sometimes lie about consistency, so benchmarks are more impressive and marketing people are happy. Instead, try diskchecker.pl. It will tell you if something in your system is lying. It will not tell you if it is the hard disk, the controller, or something in the OS. But it will tell you if the data it writes is actually persisted immediately.

If your data is in the cloud, you cannot use this tool – because it involves shutting down the physical server suddenly while a file is being written. I am not aware of any tool or procedure to check if your cloud provider is lying about persistence. If you know one, please write a comment to this post. That would be much appreciated.

Databases don’t lie – at least, I am not aware of any DBMS or object store lying about persistence. But they have bugs, just like any piece of software, so you should check them periodically. Here is a PostgreSQL example.

Federico

MySQL 8.0 GA: Quality or Not?

What does Anton Ego – a fictional restaurant critic from the Pixar movie Ratatouille – have to do with MySQL 8.0 GA?

When it comes to being a software critic, a lot.

In many ways, the work of a software critic is easy. We risk very little and thrive on negative criticism, which is fun to read and write.

But what about those who give their many hours of code development, and those who have tested such code before release? How about the many people behind the scenes who brought together packaging, documentation, multiple hours of design, marketing, online resources and more?

And all of that, I might add, is open source! Free for the world to take, copy, adapt and even incorporate in full or in part into their own open development.

It is in exactly that area that the team at MySQL shines once again – from their humble beginnings they have built up a colossally powerful database software that handles much of the world’s data, fast.

Used in every area of life – aerospace, defense, education, finances, government, healthcare, pharma, manufacturing, media, retail, telecoms, hospitality, and finally the web – it truly is a community effort.

My little contribution to this effort is first and foremost to say: well done! Well done for such an all-in-all huge endeavor. When I tested MySQL 8.0, I experienced something new: an extraordinarily clean bug report screen when I unleashed our bug hunting rats, ahem, I mean tools. This was somewhat unexpected. Usually, new releases are a fun playground even for seasoned QA engineers who look for the latest toy to break.

I have a suspicion that the team at Oracle either uses newly-improved bug-finding tools or perhaps they included some of our methods and tools in their setup. In either case, it is, was and will be welcome.

When the unexpected occurs, a fight or flight syndrome happens. I tend to be a fighter, so I upped the battle and managed to find about 30 bugs, with 21 bugs logged already. Quite a few of them are Sig 11’s in release builds. Signal 11 exceptions are unexpected crashes, and release builds are the exact same build you would download at dev.mysql.com.

The debug build also had a number of issues, but fewer than expected, leading me to the conclusions drawn above. Since Oracle engineers marked many of the issues logged as security bugs, I didn’t list them here. I’ll give Oracle some time to fix them, but I might add them later.

In summary, my personal recommendation is this: unless you are a funky new web company thriving on the latest technology, give Oracle the opportunity to make a few small point bugfix releases before adopting MySQL 8.0 GA. After that, provided the upgrade prerequisites are met and your software application is compatible, go for it and upgrade.

Before that, this is a great time to start checking out the latest and greatest that MySQL 8.0 GA has to offer!

All in all, I like what I saw, and I expect MySQL 8.0 GA to have a bright future.

Signed, a seasoned software critic.

The post MySQL 8.0 GA: Quality or Not? appeared first on Percona Database Performance Blog.

Performance Improvements in MySQL 8.0 Replication

MySQL 8.0 became Generally Available (GA) on April 19th, a great moment for us working on MySQL at Oracle. It is now a “fully grown adult” packed with new features, and improvements to existing features, as described here.

This blog post focuses on the impact of replication performance improvements that went into MySQL 8.0.…

MySQL Enterprise Monitor 4.0.4 has been released

We are pleased to announce that MySQL Enterprise Monitor 4.0.4 is now available for download on the My Oracle Support (MOS) web site. This is a maintenance release that includes a few new features and fixes a number of bugs. You can find more information on the contents of this release in the change log.

You will find binaries for the new release on My Oracle Support. Choose the "Patches & Updates" tab, and then choose the "Product or Family (Advanced Search)" side tab in the "Patch Search" portlet.

Important: MySQL Enterprise Monitor (MEM) 8.0 offers many significant improvements over MEM 3.3, 3.4, and 4.0 and we highly recommend that you consider upgrading. More information on MEM 8.0 is available here:

Please open a bug or a ticket on My Oracle Support to report problems, request features, or give us general feedback about how this release meets your needs.

If you are not a MySQL Enterprise customer and want to try the Monitor and Query Analyzer using our 30-day free customer trial, go to http://www.mysql.com/trials, or contact Sales at http://www.mysql.com/about/contact.

Thanks and Happy Monitoring!

- The MySQL Enterprise Tools Development Team


MySQL Enterprise Monitor 3.4.7 has been released

We are pleased to announce that MySQL Enterprise Monitor 3.4.7 is now available for download on the My Oracle Support (MOS) web site. This is a maintenance release that includes a few new features and fixes a number of bugs. You can find more information on the contents of this release in the change log.

You will find binaries for the new release on My Oracle Support. Choose the "Patches & Updates" tab, and then choose the "Product or Family (Advanced Search)" side tab in the "Patch Search" portlet.

Important: MySQL Enterprise Monitor (MEM) 8.0 offers many significant improvements over MEM 3.3, 3.4, and 4.0 and we highly recommend that you consider upgrading. More information on MEM 8.0 is available here:

Please open a bug or a ticket on My Oracle Support to report problems, request features, or give us general feedback about how this release meets your needs.

If you are not a MySQL Enterprise customer and want to try the Monitor and Query Analyzer using our 30-day free customer trial, go to http://www.mysql.com/trials, or contact Sales at http://www.mysql.com/about/contact.

Thanks and Happy Monitoring!

- The MySQL Enterprise Tools Development Team


MySQL Enterprise Monitor 3.3.9 has been released

We are pleased to announce that MySQL Enterprise Monitor 3.3.9 is now available for download on the My Oracle Support (MOS) web site. This is a maintenance release that includes a few new features and fixes a number of bugs. You can find more information on the contents of this release in the change log.

You will find binaries for the new release on My Oracle Support. Choose the "Patches & Updates" tab, and then choose the "Product or Family (Advanced Search)" side tab in the "Patch Search" portlet.

Important: MySQL Enterprise Monitor (MEM) 8.0 offers many significant improvements over MEM 3.3, 3.4, and 4.0 and we highly recommend that you consider upgrading. More information on MEM 8.0 is available here:

Please open a bug or a ticket on My Oracle Support to report problems, request features, or give us general feedback about how this release meets your needs.

If you are not a MySQL Enterprise customer and want to try the Monitor and Query Analyzer using our 30-day free customer trial, go to http://www.mysql.com/trials, or contact Sales at http://www.mysql.com/about/contact.

Thanks and Happy Monitoring!

- The MySQL Enterprise Tools Development Team


The Evolution of the DBA in an “As-A-Service” World

The requirements for managing and running a database in a modern enterprise have evolved over the past ten years. Those in charge of running enterprise databases have seen their focus shift from ensuring access and availability to architecture, design and scalability responsibilities. Web-first companies pioneered the change by charging site reliability engineers (SREs) or multi-faceted DBAs with the task of ensuring that the company’s main revenue engine not only stayed up, but could scale to wherever the business needed to go. This is a far cry from the classic enterprise DBA’s top responsibilities: keep it up, keep it backed up, and react to issues as they present themselves.

Today, enterprises look for new revenue models to keep up with a shifting technology paradigm driven by the cloud. The requirements and needs for managing their database environments are changing along with this shift. In the SaaS world, application outages mean lost revenue. Worse, it leads to customer churn and gives your competitors an opening. To keep revenue flowing, every part of a SaaS company’s critical infrastructure needs to be planned out: redundancy should be built-in, and a future-proof architecture should be built to accommodate scale.

The more issues you can design out before launch, the less chance of a catastrophic outage later on. This means as a SaaS provider you want your DBAs and database engineers architecting a database that avoids problems at scale, and you want them working with your developers to write better, more efficient database calls. The database infrastructure is designed and automated to work at scale, while taking into account efficient use of resources for meeting today’s requirements.

When companies move to the cloud, the cloud provider takes care of much of the operational automation and many of the mundane day-to-day tasks (for example, using database as a service (DBaaS) options such as Amazon RDS and Aurora). But this does not eliminate the need for database expertise: it moves the function closer to the design and development side of the application. Someone needs not only to design and tune the database to support the application, but also to understand how to build the modular pieces available in the cloud into a cohesive, scalable unit that meets the needs of the application and the company. This means there are much higher impacts and clearer ROIs realized from efficient database expertise.

 

Over the years at Percona, we have seen this shift as well. Currently, more than 50% of the support tickets our customers open are related to application design issues, query performance or database infrastructure design. This is a far cry from five years ago when these represented less than 20% of our overall caseload. This makes sense, however, when you think about the maturity of our database products and the technological advances that impact the database. A more stable MySQL and MongoDB, coupled with advances in either homegrown automation or cloud-based infrastructure, reduce the likelihood of common crashing bugs and “Core Database Software” related bugs. Instead, outages and issues are increasingly caused by design decisions, bad code or unplanned-for edge cases. In order to keep up, DBAs need to evolve to move upstream to have the greatest impact.

At Percona, we recognize the changing requirements of modern database deployments. In fact, we have been providing database expertise since the beginning of the SaaS and cloud era. We recognize the needs of clients that choose to run on a DBaaS platform are slightly different than those managing their own full-stack database deployments.

That’s why we created a brand new tier of support focused on DBaaS platforms. These services allow you to rely on your cloud provider for operational break-fix support, while augmenting that with proven world-class expertise focused on the design, development, and tuning of the database itself (which cloud providers typically don’t address).

We also launched a DBaaS-focused version of our Percona DBA service. The Percona DBA service focuses on designing, setting up, and proactively improving your DBaaS cloud environment to ensure you get the most out of your investment. 

Contact us for more details on our new support and managed service options that can help optimize your cloud database environments, and make them run as efficiently as possible with the applications that drive your business.

The post The Evolution of the DBA in an “As-A-Service” World appeared first on Percona Database Performance Blog.

New JSON functions in MySQL 5.7.22

A number of new JSON functions have been added to MySQL 8.0. Since we appreciate that not everyone will be ready to upgrade to MySQL 8.0 the minute it is released, we have backported many of the new functions to MySQL 5.7 so that they are available starting with version 5.7.22.…

The High Cost and Low Benefit of Unused Index Advice

You have to be careful what you wish for. The signal:noise ratio of finding out which indexes are “unused” is low. This isn’t obvious at first glance, but upon deeper inspection, there’s often not much to be gained, and it can even harm you. Let’s look at why.

 

How to Find Unused Indexes

The typical way to figure out which indexes aren’t used is to examine index-usage counters over a period of time. The database itself can provide these statistics: PostgreSQL has built-in views that can expose them, and in MySQL there are similar views as well as community patches that originally came from Google. There’s a way to get the stats in MongoDB too.
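As a sketch of what those built-in statistics look like - assuming PostgreSQL's statistics collector and, on MySQL 5.7 or later, the sys schema are available:

-- PostgreSQL: indexes never scanned since the statistics were last reset
SELECT schemaname, relname, indexrelname, idx_scan FROM pg_stat_user_indexes WHERE idx_scan = 0;
-- MySQL 5.7+: sys schema view built on performance_schema index usage counters
SELECT * FROM sys.schema_unused_indexes;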

The Advice

What advice do people want about unused indexes? It typically boils down to:

  • Tell me indexes that don’t seem to have any activity according to the statistics.
  • Prioritize these indexes by the size or busyness of the data they reference.

The theory is that indexes cause bloat and wasted space if they’re not needed, and that they slow down modifications because they require duplication of writes when the indexed table is modified. Removing the biggest indexes from the heaviest-used tables could potentially eliminate a lot of wasted space and write amplification.

I’ve seen software that will auto-generate statements or migration code to drop the indexes. It couldn’t be more convenient: you just view the report and copy-paste the code, and you’ve got an “optimization easy button.”

Or is it?

Risk: Are They Really Unused?

The problem with finding “unused” indexes is that absence of evidence isn’t evidence of absence. There’s a nontrivial risk that an index really is used: just not while you gather statistics about its usage. And these indexes can be absolutely vital for performance during tasks you’re not thinking about. What about the end-of-month closing of the books, or the end-of-week generation of reports to email out to clients?

If you drop an index that’s actually needed, something performance-sensitive could turn into a full scan next time it runs. It could turn from a single-row access into a billion-row scan. I’ve seen a dropped index result in an outage at a later date, when it caused an emergency due to high load or missed deadlines.

Risk: Are the Indexes or Tables Large or Busy?

If you’ve ever operated data platforms at scale under high load, you know that schema changes need to be analyzed carefully for the potential impact. In particular, if you’re modifying a table or index that’s large or busy, you have the potential to cause a complete outage, which could be very long.

Depending on the type of modification, and the database technology, you could:

  • Cause high load for a long period of time, starving the system of the resources and impacting performance across the board.
  • Lock or limit access to a table or set of tables, potentially resulting in hung statements and giant queues for things like trying to get a connection from a pool, which can cause backpressure and cascading outages.
  • End up with failed, aborted, or killed alterations, which can result in outcomes like crashes and rollbacks, which can take much longer to recover from and even carry the risk of catastrophes like data corruption.

I have personally experienced these problems many times. My scars run deep.

The bigger and busier these tables and indexes are, the worse this problem gets. Schema changes on serious production systems aren’t to be treated casually.

Benefit: Speed Up Inserts and Updates!

Indexes slow down inserts and updates, so removing unused ones can speed up those operations.

But, how much? And how much does it matter?

Databases are smart. They don’t do wasteful updates for data that doesn’t change (if you update column A, they don’t update indexes that reference column B). And they do smart caching of data in memory, and smart buffering of writes, and write-ahead-logging, and a ton of other optimizations. The net result is often that index updates are far less costly than you might think.

But that’s kind of an unqualified assertion, isn’t it! How would you know if you don’t measure? Exactly. As far back as 2014 I was blogging about using data to prove/disprove such a hypothesis. When you have the type of data that VividCortex provides, you can make quantitative decisions and know whether you were right.

So the simple theory—indexes make things slower—isn’t always straightforward to prove. The database often optimizes things to the point that indexing overhead is negligible. You don’t know unless you measure. My experience is that when you measure you’ll often be surprised.

Benefit: Save Disk Space!

You can measure how much storage space an index uses, so you can figure out how much space you’ll save by dropping it.

But it turns out that the benefit isn’t clear-cut. Either an index is small and you’ll gain little by dropping it, or it’s large and there’s a higher chance of causing more harm than good.

All Risk, No Benefit?

So when you return to the “automatic waste-reduction adviser,” you’ll see that what it spits out is advice to the effect of this:

“Indexes bloat your disk space and slow down your database. Drop these unused* indexes, which are on seldom-updated tables and constitute a total of 20MB of data, to free up disk space and speed up your database, by copy-pasting the following commands: ALTER TABLE foo…”

* “unused” isn’t provably correct, remember.

What if the tool were pessimistic instead of optimistic? The advice might read like this:

“We don’t see that these indexes are used, but we can’t be sure. If you drop them you might free space and make writes faster, but you could also cause an outage: either now due to the cost of removing the index, or in the future due to the cost of missing the index in the event it’s actually needed.”

You can’t outsource caring about your database’s performance to a tool.

But what about the other case, when the index is on a hot column on a huge table and really is slowing things down? In that case the optimistic tool might say:

“Drop this unused index on a super-hot 1TB table and you’ll save 800GB of disk space and the overhead of maintaining 20k modifications per second. ALTER TABLE giant_table...”

But again, just pasting that ALTER into the command line without scrutinizing it could take down production systems. And if the tool was wrong and the index really is used? Now you have to rebuild an index on a 1TB table. How long will that take?

Final note: how often does a situation like this occur? A few times a year? Does it need to be built into a tool, or should it just be a snippet that’s posted on a gist or a wiki page, or written into one of those DBA cookbooks?

In Conclusion

“Easy-button” unused-index advice is highly dangerous in my personal experience. It can trap inexperienced operators into a potentially doomsday scenario. The cost of bad advice is astronomical. When it’s not risky, there’s no benefit; and when there’s a lot of benefit, by definition there’s also a lot of risk. This is a terrible combination.

How to do Point-in-Time Recovery of MySQL & MariaDB Data using ClusterControl

Backups are crucial when it comes to the safety of data. They are the ultimate disaster recovery solution - you may have no database nodes reachable and your datacenter could literally have gone up in smoke, but as long as you have a backup of your data, you can still recover from such a situation.

Typically, you will use backups to recover from different types of cases:

  • accidental DROP TABLE or DELETE without a WHERE clause, or with a WHERE clause that was not specific enough.
  • a database upgrade that fails and corrupts the data
  • storage media failure/corruption

Is restoring from backup not enough? Why does it have to be point-in-time? We have to keep in mind that a backup is a snapshot of data taken at a given point in time. If you take a backup at 1:00 am and a table was removed accidentally at 11:00 am, you can restore your data up to 1:00 am, but what about the changes which happened between 1:00 am and 11:00 am? Those changes would be lost unless you can replay the modifications that happened in between. Luckily, MySQL has such a mechanism for storing changes - binary logs. You may know those logs are used for replication - MySQL uses them to store all of the changes which happened on the master, and a slave uses them to replay those changes and apply them to its dataset. As the binlogs store all of the changes, you can also use them to replay traffic. In this blog post, we will take a look at how ClusterControl can help you perform Point-In-Time Recovery (PITR).

Creating backup compatible with Point-In-Time Recovery

First of all, let’s talk about prerequisites. The host where you take backups from has to have binary logs enabled. Without them, PITR is not possible. The second requirement: the host where you take backups from should have all the binary logs required in order to restore to a given point in time. If you use too aggressive a binary log rotation, this could become a problem.
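A quick way to check both prerequisites on the backup host is a couple of statements like these (run as a user with the required privileges):

mysql> SHOW VARIABLES LIKE 'log_bin';   -- must be ON for PITR
mysql> SHOW BINARY LOGS;                -- lists the binlog files still available on the server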

So, let us see how to use this feature in ClusterControl. First of all, you have to take a backup which is compatible with PITR. Such a backup has to be full, complete and consistent. For xtrabackup, as long as it contains the full dataset (you didn’t include just a subset of schemas), it will be PITR-compatible.

For mysqldump, there is an option to make it PITR-compatible. When you enable this option, all necessary options will be configured (for example, you won’t be able to pick separate schemas to include in the dump) and backup will be marked as available for point-in-time recovery.

Point-In-Time Recovery from a backup

First, you have to pick a backup to restore.

If the backup is compatible with PITR, an option will be presented to perform a Point-In-Time Recovery. You will have two options for that - “Time Based” and “Position Based”. Let’s discuss the difference between those two options.

“Time Based” PITR

With this option you can pass a date and time up to which the backup should be restored. It can be defined with one-second resolution. It does not guarantee that all of the data will be restored because, even if you are very precise in defining the time, multiple events could be recorded in the binary log during that one second. Let’s say that you know that the data loss happened on the 18th of April, at 10:00:01. You pass the following date and time to the form: ‘2018-04-18 10:00:00’. Please keep in mind that you should be using a time based on the timezone settings of the database server on which the backup was created.

It still may happen that the data loss event wasn’t the first one which happened at 10:00:01, so some of the events will be lost in the process. Let’s look at what that means.

During one second, multiple events may be logged in the binlogs. Let's consider such a case:
10:00:00 - events A,B,C,D,E,F
10:00:01 - events V,W,X,Y,Z
where X is the data loss event. With a granularity of a second, you can either restore up to everything which happened at 10:00:00 (so up to F) or up to 10:00:01 (up to Z). The latter case is of no use, as X would be re-executed. In the former case, we miss V and W.

That's why position based restore is more precise. You can tell "I want to restore up to W".

Time-based restore is the most precise you can get without having to go to the binary logs and define the exact position up to which you want to restore. This leads us to the second method of doing PITR.

“Position Based” PITR

Here, some experience with command-line tools for MySQL, namely the mysqlbinlog utility, is required. On the other hand, you will have the best control over how the recovery is made.

Let’s go through a simple example. As you can see in the screenshot above, you will have to pass a binary log name and binary log position up to which point the backup should be restored. Most of the time, this should be the last position before the data loss event.

Someone executed a SQL command which resulted in serious data loss:

mysql> DROP TABLE sbtest1;
Query OK, 0 rows affected (0.02 sec)

Our application immediately started to complain:

sysbench 1.1.0-ecf1191 (using bundled LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 2
Report intermediate results every 1 second(s)
Initializing random number generator from current time
Initializing worker threads...
Threads started!
FATAL: mysql_drv_query() returned error 1146 (Table 'sbtest.sbtest1' doesn't exist) for query 'DELETE FROM sbtest1 WHERE id=5038'
FATAL: `thread_run' function failed: /usr/local/share/sysbench/oltp_common.lua:490: SQL error, errno = 1146, state = '42S02': Table 'sbtest.sbtest1' doesn't exist

We have a backup, but we want to restore all of the data up to that fatal moment. First of all, we assume that the application does not work, so we can discard all of the writes which happened after the DROP TABLE as unimportant. If your application works to some extent, you would have to merge the remaining changes later on. Ok, let’s examine the binary logs to find the position of the DROP TABLE statement. As we want to avoid parsing all of the binary logs, let’s find the position our latest backup covered. You can check that by examining the logs for the latest backup set, looking for the line which reports the binary log file and position the backup covers.
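
If the backup was taken with xtrabackup, the same coordinates are normally also recorded in the xtrabackup_binlog_info file inside the backup directory, so you can cross-check them there (path hypothetical; output shown to match the coordinates discussed below):

$ cat /path/to/backup/xtrabackup_binlog_info
binlog.000008	16184120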

So, we are talking about filename 'binlog.000008' and position '16184120'. Let’s use this as our starting point. Let’s check what binary log files we have:

root@vagrant:~# ls -alh /var/lib/mysql/binlog.*
-rw-r----- 1 mysql mysql  58M Apr 17 08:31 /var/lib/mysql/binlog.000001
-rw-r----- 1 mysql mysql 116M Apr 17 08:59 /var/lib/mysql/binlog.000002
-rw-r----- 1 mysql mysql 379M Apr 17 09:30 /var/lib/mysql/binlog.000003
-rw-r----- 1 mysql mysql 344M Apr 17 10:54 /var/lib/mysql/binlog.000004
-rw-r----- 1 mysql mysql 892K Apr 17 10:56 /var/lib/mysql/binlog.000005
-rw-r----- 1 mysql mysql  74M Apr 17 11:03 /var/lib/mysql/binlog.000006
-rw-r----- 1 mysql mysql 5.2M Apr 17 11:06 /var/lib/mysql/binlog.000007
-rw-r----- 1 mysql mysql  21M Apr 18 11:35 /var/lib/mysql/binlog.000008
-rw-r----- 1 mysql mysql  59K Apr 18 11:35 /var/lib/mysql/binlog.000009
-rw-r----- 1 mysql mysql  144 Apr 18 11:35 /var/lib/mysql/binlog.index

So, in addition to 'binlog.000008' we also have 'binlog.000009' to examine. Let’s run the command which will convert binary logs into SQL format starting from the position we found in the backup log:

root@vagrant:~# mysqlbinlog --start-position='16184120' --verbose /var/lib/mysql/binlog.000008 /var/lib/mysql/binlog.000009 > binlog.out

Please note that ‘--verbose’ is required to decode row-based events. It is not strictly needed for the DROP TABLE we are looking for, but it may be required for other types of events.

Let’s search our output for the DROP TABLE query:

root@vagrant:~# grep -B 7 -A 1 "DROP TABLE" binlog.out
# at 20885489
#180418 11:24:32 server id 1  end_log_pos 20885554 CRC32 0xb89f2e66  GTID  last_committed=38168  sequence_number=38170  rbr_only=no
SET @@SESSION.GTID_NEXT= '7fe29cb7-422f-11e8-b48d-0800274b240e:38170'/*!*/;
# at 20885554
#180418 11:24:32 server id 1  end_log_pos 20885678 CRC32 0xb38a427b  Query  thread_id=54  exec_time=0  error_code=0
use `sbtest`/*!*/;
SET TIMESTAMP=1524050672/*!*/;
DROP TABLE `sbtest1` /* generated by server */
/*!*/;

In this sample we can see two events. The first, at position 20885489, sets the GTID_NEXT variable.

# at 20885489
#180418 11:24:32 server id 1  end_log_pos 20885554 CRC32 0xb89f2e66  GTID  last_committed=38168  sequence_number=38170  rbr_only=no
SET @@SESSION.GTID_NEXT= '7fe29cb7-422f-11e8-b48d-0800274b240e:38170'/*!*/;

The second, at position 20885554, is our DROP TABLE event. This leads to the conclusion that we should perform the PITR up to position 20885489. The only question to answer is which binary log file we are talking about. We can check that by searching for binlog rotation entries:

root@vagrant:~# grep "Rotate to binlog" binlog.out
#180418 11:35:46 server id 1  end_log_pos 21013114 CRC32 0x2772cc18  Rotate to binlog.000009  pos: 4

As can clearly be seen by comparing the dates, the rotation to binlog.000009 happened later, therefore we want to pass binlog.000008 as the binlog file in the form.
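
For reference, if you were replaying the events by hand instead of letting ClusterControl do it, the equivalent step would look roughly like the sketch below (assuming the restored server is reachable locally):

# Apply everything recorded after the backup, stopping just before the
# GTID event at position 20885489 that precedes the DROP TABLE
mysqlbinlog --start-position='16184120' --stop-position='20885489' \
    /var/lib/mysql/binlog.000008 | mysql -u root -p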

Next, we have to decide whether we are going to restore the backup on the cluster or whether we want to use an external server to restore it. The second option could be useful if you want to restore just a subset of the data. You can restore the full physical backup on a separate host and then use mysqldump to dump the missing data and load it into the production server.

Keep in mind that when you restore the backup on your cluster, you will have to rebuild the nodes other than the one you recovered. In a master-slave scenario you will typically want to restore the backup on the master and then rebuild the slaves from it.

As a last step, you will see a summary of actions ClusterControl will take.


Finally, after the backup has been restored, we will test whether the missing table is back:

mysql> show tables from sbtest like 'sbtest1'\G
*************************** 1. row ***************************
Tables_in_sbtest (sbtest1): sbtest1
1 row in set (0.00 sec)

Everything seems OK; we managed to restore the missing data.
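
Beyond checking that the table exists, a quick sanity check on its contents does not hurt; the expected count obviously depends on your dataset:

mysql> SELECT COUNT(*) FROM sbtest.sbtest1;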

The last step we have to take is to rebuild our slave. Please note that there is an option to use a PITR backup. In the example here, this is not possible as the slave would then replicate the DROP TABLE event and end up inconsistent with the master.


MySQL Community Awards Winners 2018

One of the highlights of Percona Live is that the winners of the annual MySQL Community Awards are announced. A 100% community effort, the awards were created to recognize community contribution. This year saw six very deserving winners in three categories:

MySQL Community Awards: Community Contributor of the year 2018

Two individuals received these awards:

  • Jean-François Gagné
    Jean-François was nominated for his many blog posts, bug reports, and experiment results that make MySQL much better. His blog: https://jfg-mysql.blogspot.com/
  • Sveta Smirnova
    Sveta spreads knowledge and good practice on all things MySQL as a frequent speaker and blogger. Her years of experience in testing, support, and consulting are shared in webinars, technical posts, conferences around the world and in her book “MySQL Troubleshooting”. While we’re proud to say that Sveta works for Percona, this reward is for her outstanding individual contribution irrespective of that. Kudos and respect, Sveta!
MySQL Community Awards: Application of the year 2018

Three applications were honoured:

  • MyRocks
    MyRocks is now in MariaDB, Percona Server and PolarDB (Alibaba). Intel, MariaDB and Rockset are optimizing it for cloud native storage.
  • ProxySQL
    ProxySQL solves serious, real-world problems experienced by DBAs in an elegant way.
  • Vitess
    Vitess is a database clustering system for horizontal scaling of MySQL. Originally developed at YouTube/Google and now under CNCF, Vitess is free and open source and aims to solve scaling problems for MySQL installations.
MySQL Community Awards: Corporate Contributor of the year 2018

The awards were presented by Agustín Gallego and Emily Slocombe.

In the spirit of open source, much of the content of this post has been sourced from the MySQL Community Awards website and the full information can be read there. Please do take the time to read the full details and you can also read about past winners and initiatives on that site.

Congratulations to all!


MySQL Performance : 8.0 and UTF8 impact

The world is moving to UTF8, and MySQL 8.0 now has the utf8mb4 charset as default, but, to be honest, I was pretty surprised how sensitive the "charset" related topic could be.. -- in fact you may easily hit a huge performance overhead just by using "odd" config settings around your client/server charset and collation. To avoid any potential charset mismatch between client and server, MySQL has had for a long time an excellent option: "skip-character-set-client-handshake", which forces any client connection to be "aligned" with the server settings! (for more details see the ref. manual : https://dev.mysql.com/doc/refman/8.0/en/server-options.html#option_mysqld_character-set-client-handshake) -- this option is NOT set by default (to leave you the freedom to choose the charsets used on the client and server sides). However, in my sense, it's still better to align clients with the server settings to avoid any potential client misconfig..

As well, if you wish to use UTF8, please use "utf8mb4": first of all it's the most complete for any kind of characters (and probably the only one which makes sense as of today), and second, the related code was further improved in MySQL 8.0 for better efficiency. How much more efficient? -- let's see from the following test results.

But first of all, the related config setup :

[mysqld]
...
character_set_server=utf8mb4
collation_server=utf8mb4_0900_ai_ci
skip-character-set-client-handshake
sort_buffer_size=512K
NOTE: mind to use a bigger sort buffer for UTF8
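
To double-check that a client session really picks up the intended charset and collation (especially with skip-character-set-client-handshake in place), you can inspect the variables after connecting:

mysql> SHOW VARIABLES LIKE 'character_set%';
mysql> SHOW VARIABLES LIKE 'collation%';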

The results were obtained on the same 2S Skylake server as in the previously published RO tests with MySQL 8.0 and latin1, and with the same test workloads (just note that for latin1 you need to change character_set_server=latin1 and collation_server=latin1_swedish_ci).

So far, here we are :

Sysbench OLTP_RO 10Mx8-tables UTF8mb4 on 2S 48cores-HT Skylake

Comments :
  • MySQL 8.0 is doing up to 40% better than 5.7
  • MariaDB 10.3.5 is trying to follow, but not yet there..


Sysbench RO Point-Selects 10Mx8-tables UTF8mb4 on 2S 48cores-HT Skylake

Comments :
  • the point-selects workload is much less sensitive to UTF8
  • 8.0 and 5.7 are getting the highest QPS thanks to the RO fixes in 5.7
  • MariaDB 10.3.5 is going higher than before since the adoption of the InnoDB 5.7 code
  • 5.6 is slower than others because it's 5.6, and has no 5.7 improvements ;-))


Sysbench RO Distinct-Ranges 10Mx8-tables UTF8mb4 on 2S 48cores-HT Skylake

Comments :
  • MySQL 8.0 is doing 30% better than 5.7
  • MariaDB is doing so badly here simply because it was already doing something badly in the previous latin1 tests..

Instead of Summary :

  • a gentle reminder to PeterZ that MySQL is not "InnoDB only" ;-))
  • if you're doing "benchmarKeting" it's very easy to be "the best" by comparing everyone with UTF8, and hiding all other regressions and bottlenecks.. ;-))
  • so, hope it's obvious why all my following benchmark results will be published with "latin1" only..

Thank You for using MySQL !

Rgds,
-Dimitri

Percona Live 2018: Securing Access to Facebook’s Databases

We’re moving along at Percona Live 2018, and there are still packed and energetic talks after lunch.

My next session was with Andrew Regner, Production Engineer at Facebook. His talk was on securing access to Facebook’s databases.

Since the beginning, Facebook has used a conventional username/password to secure access to production MySQL instances. Over the last few years, they’ve been working on moving to x509 TLS client certificate authenticated connections. Given the many types of languages and systems at Facebook that use MySQL in some way, this required a massive amount of changes for a lot of teams.

This talk is both a technical overview of how their new solution works and hard-learned tricks for getting an entire company to change their underlying MySQL client libraries.

After his talk, I had a chance to quickly talk with Andrew about his efforts to move the security process for Facebook’s databases. Check it out below.


Percona Live 2018: Migrating to Vitess at (Slack) Scale

Percona Live 2018 is moving along, and the first person I got a chance to talk with is Michael Demmer, Senior Staff Engineer at Slack. His talk was on Migrating to Vitess at (Slack) Scale.

MySQL is the backbone of Slack’s data storage infrastructure. It handles billions of queries per day across thousands of sharded database hosts. Slack is migrating this system to use Vitess’ flexible sharding and topology management instead of simple application-based shard routing and manual administration. This effort aims to provide an architecture that scales to meet the growing demands of Slack’s largest customers and features while under pressure to maintain a stable and performant service.

This talk presented the core motivations behind Slack’s decision, why Vitess won out as the best option, and how Slack laid the groundwork for the migration within its development teams. Michael then presented some challenges and surprises (both good and bad) found during the transition, and the contributions to the Vitess project that mitigated them. Finally, he discussed future plans for the migration, and suggested improvements to the Vitess ecosystem to aid other adoption efforts.

I spoke briefly with Michael after his talk, check it out below:


Reading the log positions of a snapshot consistently

MySQL 8.0.11 introduced a new performance schema table named log_status, which provides consistent information about MySQL server instance log positions from replication and transactional storage engines.

Used in conjunction with another MySQL 8.0 feature (the backup lock), this new table will allow backup tools to take backups with minimal impact on DML throughput, while ensuring a consistent snapshot with respect to GTIDs, replication, binary logs and transactional storage engine logs.…
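
A minimal sketch of how a backup tool might combine the backup lock with the new table (assuming an account with the BACKUP_ADMIN privilege; error handling omitted):

-- Block operations that could make the copy inconsistent, while DML keeps running
LOCK INSTANCE FOR BACKUP;

-- Read consistent binary log, GTID and storage engine log positions in one go
SELECT SERVER_UUID, LOCAL, REPLICATION, STORAGE_ENGINES
  FROM performance_schema.log_status;

-- ... copy the data files while the lock is held ...

UNLOCK INSTANCE;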
