Planet MySQL

Rotating your ProxySQL log files

Recently I received several questions about rotating log files for ProxySQL, so I decided to write it up as a blog post. Let me walk through it with an example.

In my testing lab, I have set up a fairly default ProxySQL (version 1.4.13) service. The default location for the proxysql.log is in /var/lib/proxysql.

[root@proxysql ~]# ls -hal /var/lib/proxysql/proxysql.log*
-rw-------. 1 root root 4.9K Jan 30 18:47 /var/lib/proxysql/proxysql.log

I created a pretty basic default logrotate configuration to ensure my logfile rotates on a daily basis and five rotations are kept before expiring.

[root@proxysql ~]# cat /etc/logrotate.d/proxysql
/var/lib/proxysql/proxysql.log {
    missingok
    daily
    notifempty
    compress
    create 0600 root root
    rotate 5
}

First attempt

Let’s check whether this is actually the file in use that we will be rotating. As it turns out, it is!

[root@proxysql ~]# lsof | grep proxysql.log
proxysql  2361        root  1w  REG  8,1  5010  1705843  /var/lib/proxysql/proxysql.log
proxysql  2361        root  2w  REG  8,1  5010  1705843  /var/lib/proxysql/proxysql.log
proxysql  2361  2362  root  1w  REG  8,1  5010  1705843  /var/lib/proxysql/proxysql.log
proxysql  2361  2362  root  2w  REG  8,1  5010  1705843  /var/lib/proxysql/proxysql.log
...

Now the time has come to test our logrotate configuration. We use the force option (-f) to ensure it actually does something and we use the verbose option (-v) so it will tell us what it’s doing.

[root@proxysql ~]# logrotate -fv /etc/logrotate.d/proxysql
reading config file /etc/logrotate.d/proxysql
Allocating hash table for state file, size 15360 B
Handling 1 logs

rotating pattern: /var/lib/proxysql/proxysql.log  forced from command line (5 rotations)
empty log files are not rotated, old logs are removed
considering log /var/lib/proxysql/proxysql.log
  log needs rotating
rotating log /var/lib/proxysql/proxysql.log, log->rotateCount is 5
dateext suffix '-20190130'
glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
renaming /var/lib/proxysql/proxysql.log.5.gz to /var/lib/proxysql/proxysql.log.6.gz (rotatecount 5, logstart 1, i 5),
old log /var/lib/proxysql/proxysql.log.5.gz does not exist
...
renaming /var/lib/proxysql/proxysql.log.0.gz to /var/lib/proxysql/proxysql.log.1.gz (rotatecount 5, logstart 1, i 0),
old log /var/lib/proxysql/proxysql.log.0.gz does not exist
log /var/lib/proxysql/proxysql.log.6.gz doesn't exist -- won't try to dispose of it
fscreate context set to system_u:object_r:var_lib_t:s0
renaming /var/lib/proxysql/proxysql.log to /var/lib/proxysql/proxysql.log.1
creating new /var/lib/proxysql/proxysql.log mode = 0600 uid = 0 gid = 0
compressing log with: /bin/gzip
set default create context to system_u:object_r:var_lib_t:s0
set default create context

As the output above shows, our configuration looks fine. It rotated all log files to the next number (5 -> 6, 4 -> 5, etc). It tried to dispose of the 6th iteration but since that did not exist in our case, all seems fine. Let us check the ProxySQL log location to verify.

[root@proxysql ~]# ls -hal /var/lib/proxysql/proxysql.log*
-rw-------. 1 root root    0 Jan 30 18:48 /var/lib/proxysql/proxysql.log
-rw-------. 1 root root  995 Jan 30 18:47 /var/lib/proxysql/proxysql.log.1.gz

Yup! All seems good! Or is it? Why would this blog post be necessary if the default logrotate configuration just worked?

The problem

Let’s check the active (open) log file in the lsof output.

[root@proxysql ~]# lsof | grep proxysql.log
proxysql  2361        root  1w  REG  8,1  5010  1705843  /var/lib/proxysql/proxysql.log.1 (deleted)
proxysql  2361        root  2w  REG  8,1  5010  1705843  /var/lib/proxysql/proxysql.log.1 (deleted)
proxysql  2361  2362  root  1w  REG  8,1  5010  1705843  /var/lib/proxysql/proxysql.log.1 (deleted)
proxysql  2361  2362  root  2w  REG  8,1  5010  1705843  /var/lib/proxysql/proxysql.log.1 (deleted)
...

We now see that the proxysql process still has the old log file open, which is marked as deleted. From the ls output earlier, we confirmed that the file it is referencing does not exist anymore.

We can see the open file descriptor in the proc-filesystem and we can even read the contents of that deleted file, but this is not our expected behavior.

[root@proxysql ~]# ls -hal /proc/2361/fd | grep log
l-wx------. 1 root root 64 Jan 30 17:34 1 -> /var/lib/proxysql/proxysql.log.1 (deleted)
l-wx------. 1 root root 64 Jan 30 17:34 2 -> /var/lib/proxysql/proxysql.log.1 (deleted)
[root@proxysql ~]# tail -5 /proc/2361/fd/1
2019-01-30 18:58:54 [INFO] ProxySQL version 1.4.13-15-g69d4207
2019-01-30 18:58:54 [INFO] Detected OS: Linux proxysql 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64
[root@proxysql ~]# tail -5 /proc/2361/fd/2
2019-01-30 18:58:54 [INFO] ProxySQL version 1.4.13-15-g69d4207
2019-01-30 18:58:54 [INFO] Detected OS: Linux proxysql 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64

The solution

As with most log files, we need to make the software flush the log file, which basically means closing and re-opening the file internally. This resets the file descriptor and allows the old file to be removed.

To do so, we need to connect to the ProxySQL admin interface with the MySQL client and issue the command “PROXYSQL FLUSH LOGS”.

[root@proxysql ~]# mysql -h 127.0.0.1 -P 6032 -uadmin -padmin
...
mysql> PROXYSQL FLUSH LOGS;
Query OK, 0 rows affected (0.00 sec)

As you can see in the output below, my proxysql.log file, which was 0 bytes after the rotation, is in use by the process and has grown to 187 bytes, which means some content was added. If we now check the contents of the log file, we see that it is proxysql that actually started writing to it. Yay!

[root@proxysql ~]# lsof | grep proxysql.log
proxysql  2361        root  1w  REG  8,1  876  1705843  /var/lib/proxysql/proxysql.log
proxysql  2361        root  2w  REG  8,1  876  1705843  /var/lib/proxysql/proxysql.log
proxysql  2361  2362  root  1w  REG  8,1  876  1705843  /var/lib/proxysql/proxysql.log
proxysql  2361  2362  root  2w  REG  8,1  876  1705843  /var/lib/proxysql/proxysql.log
...
[root@proxysql ~]# ls -hal /var/lib/proxysql/proxysql.log*
-rw-------. 1 root root  187 Jan 30 18:50 /var/lib/proxysql/proxysql.log
-rw-------. 1 root root  995 Jan 30 18:47 /var/lib/proxysql/proxysql.log.1.gz
[root@proxysql ~]# cat /var/lib/proxysql/proxysql.log
2019-01-30 18:50:57 [INFO] ProxySQL version 1.4.13-15-g69d4207
2019-01-30 18:50:57 [INFO] Detected OS: Linux proxysql 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64

Let’s make logrotate do the flushing

We now have all the components to put this together. For an automated approach, we can use a my.cnf file or we can use the mysql_config_editor, which is provided with Oracle’s MySQL community edition client, to create a login path. Either one will work fine to make a scripted connection to the proxysql admin interface.

[root@proxysql ~]# mysql --defaults-file=/etc/proxysql.my.cnf --execute="PROXYSQL FLUSH LOGS"
[root@proxysql ~]# cat /etc/proxysql.my.cnf
[client]
host = 127.0.0.1
port = 6032
user = admin
password = admin
[root@proxysql ~]# mysql_config_editor set --login-path=proxysql-admin --host=127.0.0.1 --port=6032 --user=admin -p
Enter password:
[root@proxysql ~]# mysql_config_editor print --login-path=proxysql-admin
[proxysql-admin]
user = admin
password = *****
host = 127.0.0.1
port = 6032
[root@proxysql ~]# mysql --login-path=proxysql-admin --execute="PROXYSQL FLUSH LOGS;"

We can put this command in the postrotate section of the logrotate configuration to execute after rotation but before the compression phase.

[root@proxysql ~]# cat /etc/logrotate.d/proxysql
/var/lib/proxysql/proxysql.log {
    missingok
    daily
    notifempty
    compress
    postrotate
        /usr/bin/mysql --defaults-file=/etc/proxysql.my.cnf --execute="PROXYSQL FLUSH LOGS"
    endscript
    create 0600 root root
    rotate 5
}

Let’s test to see if it works.

[root@proxysql ~]# ls -hal /var/lib/proxysql/proxysql.log*
-rw-------. 1 root root  876 Jan 30 18:58 /var/lib/proxysql/proxysql.log
-rw-------. 1 root root  995 Jan 30 18:47 /var/lib/proxysql/proxysql.log.1.gz
[root@proxysql ~]# logrotate -fv /etc/logrotate.d/proxysql
reading config file /etc/logrotate.d/proxysql
Allocating hash table for state file, size 15360 B
Handling 1 logs

rotating pattern: /var/lib/proxysql/proxysql.log  forced from command line (5 rotations)
...
renaming /var/lib/proxysql/proxysql.log to /var/lib/proxysql/proxysql.log.1
creating new /var/lib/proxysql/proxysql.log mode = 0600 uid = 0 gid = 0
running postrotate script
compressing log with: /bin/gzip
set default create context to system_u:object_r:var_lib_t:s0
set default create context
[root@proxysql ~]# ls -hal /var/lib/proxysql/proxysql.log*
-rw-------. 1 root root  187 Jan 30 18:58 /var/lib/proxysql/proxysql.log
-rw-------. 1 root root  241 Jan 30 18:58 /var/lib/proxysql/proxysql.log.1.gz
-rw-------. 1 root root  995 Jan 30 18:47 /var/lib/proxysql/proxysql.log.2.gz

Conclusion

As with most software that uses log files, it is necessary to take action after the rotation to make the daemon “flush” its handle to the files. This way the old files are no longer in use and the new files will be. ProxySQL provides us with the tools to do this, but you have to make sure to actually use them.

How to use console in node.js

Introduction

In this article, we'll learn how to use most of the methods available in the Node.js console class more effectively.

To demonstrate, I'll use Chrome browser version 70.0.3538.77 and Node.js version 8.11.3.

Okay then. Let's begin.

console.log, console.info and console.debug

console.log(string, substitution)

While the famed console.log method really needs no introduction, you'll be glad to know that the console.info and console.debug methods are identical to console.log in their operation.

You can use console.debug in the Firefox browser console by default but to use it in Chrome, you'll have to set the log level to verbose like this.

The console.log method prints to standard out, whether that is the terminal or the browser console. It outputs strings by default but can be used in conjunction with substitution strings to modify what it returns.

Here's how it works:

The arguments in the template string are passed to util.format which then processes the arguments by replacing each substitution token with the respective converted value. The supported substitution tokens are:

%s

const msg = `Using the console class`;
console.log('%s', msg); // Using the console class
console.log(msg); // Using the console class

%s is the default substitution pattern.

%d, %f, %i, %o

const circle = (radius = 1) => {
  const profile = {};
  const pi = 22 / 7;
  profile.diameter = 2 * pi * radius;
  profile.circumference = pi * radius * 2;
  profile.area = pi * radius ** 2;
  profile.volume = (4 / 3) * pi * radius ** 3;
  console.log('This circle has a radius of: %d cm', radius);
  console.log('This circle has a circumference of: %f cm', profile.diameter);
  console.log('This circle has an area of: %i cm^2', profile.area);
  console.log('The profile of this circle is: %o', profile);
  console.log('Diameter %d, Area: %f, Circumference %i', profile.diameter, profile.area, profile.circumference);
}

circle();

This is what happens:

%d will be substituted by a digit (integer or float). %f will be replaced by a float value. %i will be replaced by an integer. %o will be replaced by an Object.

%o is especially handy because we don't have to use JSON.stringify to expand our object; it shows all the object's properties by default.

Note that you can use as many substitution tokens as you like. They'll just be replaced in the same order as the arguments you pass.

%c

This substitution token applies CSS styles to the substituted text.

console.log('LOG LEVEL: %c OK', 'color: green; font-weight: normal');
console.log('LOG LEVEL: %c PRIORITY', 'color: blue; font-weight: medium');
console.log('LOG LEVEL: %c WARN', 'color: red; font-weight: bold');
console.log('ERROR HERE');

Here it is in action.

Above, we note that the text we pass to console.log after the %c substitution token is affected by the styles, but the text before is left as is without styling.

console.table

The first argument passed to it is the data to be returned in the form of a table. The second is an array of selected columns to be displayed.

console.table(tabularData, [properties])

This method will print the input passed to it formatted as a table then log the input object after the table representation.

Arrays

If an array is passed to it as data, each element in the array will be a row in the table.

const books = ['The Silmarillion', 'The Hobbit', 'Unfinished Tales'];
console.table(books);

With a simple array with a depth of 1, the first column in the table has the heading index. Under the index header in the first column are the array indexes and the items in the array are listed in the second column under the value header.

This is what happens for a nested array.

const authorsAndBooks = [
  ['Tolkien', 'Lord of The Rings'],
  ['Rutger', 'Utopia For Realists'],
  ['Sinek', 'Leaders Eat Last'],
  ['Eyal', 'Habit']
];
console.table(authorsAndBooks);

Objects

For objects with a depth of 1, the object keys will be listed under the index header and the values in the object under the second column header.

const inventory = { apples: 200, mangoes: 50, avocados: 300, kiwis: 50 };
console.table(inventory);

For nested objects,

const forexConverter = {
  asia: { rupee: 1.39, renminbi: 14.59, ringgit: 24.26 },
  africa: { rand: 6.49, nakfa: 6.7, kwanza: 0.33 },
  europe: { swissfranc: 101.60, gbp: 130, euro: 115.73 }
};
console.table(forexConverter);

Some more nested objects,

const workoutLog = {
  Monday: { push: 'Incline Bench Press', pull: 'Deadlift' },
  Wednesday: { push: 'Weighted Dips', pull: 'Barbell Rows' }
};
console.table(workoutLog);

Here, we specify that we only want to see data under the column push.

console.table(workoutLog, ['push']);

To sort the data under a column, just click the column header.

Pretty handy, eh?

Try passing console.table an object with some values as arrays!

console.dir

The first argument passed to this function is the object to be logged, while the second is an object containing options that define how the resulting output is formatted or which properties of the object will be shown.

What's returned is an object formatted by node's util.inspect function.

Nested or child objects within the input object are expandable under disclosure triangles.

console.dir(object, options); // where options = { showHidden: true ... }

Let's see this in action.

const user = {
  details: {
    name: { firstName: 'Immanuel', lastName: 'Kant' },
    height: `1.83m"`,
    weight: '90kg',
    age: '80',
    occupation: 'Philosopher',
    nationality: 'German',
    books: [
      { name: 'Critique of Pure Reason', pub: '1781' },
      { name: 'Critique of Judgement', pub: '1790' },
      { name: 'Critique of Practical Reason', pub: '1788' },
      { name: 'Perpetual Peace', pub: '1795' },
    ],
    death: '1804'
  }
}

console.dir(user);

Here it is in the Chrome console.
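In Node, the options object is passed on to util.inspect, so you can control how deep the output goes. As a small sketch using the user object defined above (depth and showHidden are standard util.inspect options; the browser console ignores them):

// Expand every level of nesting instead of stopping at the default depth of 2
console.dir(user, { depth: null });

// Also include non-enumerable ("hidden") properties
console.dir(user, { showHidden: true, depth: null });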

console.dirxml

This function renders an interactive tree of the XML/HTML it is passed. It falls back to displaying a JavaScript object if it's not possible to render the node tree.

console.dirxml(object|nodeList);

Much like console.dir, the rendered tree can be expanded through clicking disclosure triangles within which you can see child nodes.

Its output is similar to what we find under the Elements tab in the browser.

This is how it looks when we pass in some HTML from a Wikipedia page.

Let's pass in some HTML from a page on this website.

This is how it looks when we pass in an object.

Try calling console.dir on some HTML and see what happens!

console.assert

The first argument passed to the function is a value to test as truthy. All other arguments passed are considered messages to be printed out if the value passed is not evaluated as truthy.

The Node REPL will throw an error halting execution of subsequent code.

console.assert(value, [...messages])

Here's a basic example:

console.assert(false, 'Assertion failed'); // Assertion failed

Now, let's have some fun. We'll build a mini testing framework using console.assert

const sum = (a = 0, b = 0) => Number(a) + Number(b);

function test(functionName, actualFunctionResult, expected) {
  const actual = actualFunctionResult;
  const pass = actual === expected;
  console.assert(pass, `Assertion failed for ${functionName}`);
  return `Test passed ${actual} === ${expected}`;
}

console.log(test('sum', sum(1,1), 2)); // Test passed 2 === 2
console.log(test('sum', sum(), 0)); // Test passed 0 === 0
console.log(test('sum', sum, 2)); // Assertion failed for sum
console.log(test('sum', sum(3,3), 4)); // Assertion failed for sum

Run the above in your node REPL or browser console to see what happens.

console.error & console.warn

These two are essentially identical. They will both print whatever string is passed to them.

However, console.warn prints out a triangle warn symbol before the message passed while console.error prints out a danger symbol before the message passed.

console.error(string, substitution);
console.warn(string, substitution);

Note that string substitution can be applied in the same way as with console.log.
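For instance, a quick sketch (the route and numbers here are made up purely for illustration):

const route = '/api/users';
console.warn('Retrying request to %s, attempt %d of %d', route, 2, 3);
console.error('Request to %s failed with status %i', route, 500);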

Here's a mini logging function using console.error.

const sum = (a = 0, b = 0) => Number(a) + Number(b);

function otherTest(actualFunctionResult, expected) {
  if (actualFunctionResult !== expected) {
    console.error(new Error(`Test failed ${actualFunctionResult} !== ${expected}`));
  } else {
    // pass
  }
}

otherTest(sum(1,1), 3);

console.trace(label)

This console method prints the string Trace:, followed by the label passed to the function, and then the stack trace to the point where the function was called.

function getCapital(country) {
  const capitalMap = {
    belarus: 'minsk',
    australia: 'canberra',
    egypt: 'cairo',
    georgia: 'tblisi',
    latvia: 'riga',
    samoa: 'apia'
  };
  console.trace('Start trace here');
  return Object.keys(capitalMap).find(item => item === country)
    ? capitalMap[country]
    : undefined;
}

console.log(getCapital('belarus'));
console.log(getCapital('accra'));

console.count(label)

console.count starts and increments a counter with the name label.

Let's build a word counter to see how it works.

const getOccurences = (word = 'foolish') => {
  const phrase = `Oh me! Oh life! of the questions of these recurring, Of the endless trains of the faithless, of cities fill’d with the foolish, Of myself forever reproaching myself, for who more foolish than I, and who more faithless?`;
  let count = 0;
  const wordsFromPhraseArray = phrase.replace(/[,.!?]/igm, '').split(' ');
  wordsFromPhraseArray.forEach((element, idx) => {
    if (element === word) {
      count++;
      console.count(word);
    }
  });
  return count;
}

getOccurences();

Here, we see that the word foolish was logged twice, once for each appearance of the word in the phrase.

We could use this as a handy method to see how many times a function was called or how many times a line in our code executed.
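As a small sketch of that idea (trackedFetch is just a made-up name for illustration), wrapping a function body with console.count shows how many times it has been invoked:

function trackedFetch(url) {
  console.count('trackedFetch calls'); // increments and logs "trackedFetch calls: N" on every call
  // ... the real work would happen here
}

trackedFetch('/users'); // trackedFetch calls: 1
trackedFetch('/posts'); // trackedFetch calls: 2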

console.countReset(label)

As the name suggests, this resets a counter having a label set by the console.count method.

const getOccurences = (word = 'foolish') => {
  const phrase = `Oh me! Oh life! of the questions of these recurring, Of the endless trains of the faithless, of cities fill’d with the foolish, Of myself forever reproaching myself, for who more foolish than I, and who more faithless?`;
  let count = 0;
  const wordsFromPhraseArray = phrase.replace(/[,.!?]/igm, '').split(' ');
  wordsFromPhraseArray.forEach((element, idx) => {
    if (element === word) {
      count++;
      console.count(word);
      console.countReset(word);
    }
  });
  return count;
}

getOccurences();

We can see that our getOccurences function returns 2 because there are indeed two occurrences of the word foolish in the phrase, but since our counter is reset at every match, it logs foolish: 1 twice.

console.time(label) and console.timeEnd(label)

The console.time function starts a timer with the label supplied as an argument to the function, while the console.timeEnd function stops a timer with the label supplied as an argument to the function.

console.time('<timer-label>');
console.timeEnd('<timer-label>');

We can use it to figure out how much time it took to run an operation by passing in the same label name to both functions.

const users = ['Vivaldi', 'Beethoven', 'Ludovico'];

const loop = (array) => {
  array.forEach((element, idx) => {
    console.log(element);
  });
}

const timer = () => {
  console.time('timerLabel');
  loop(users);
  console.timeEnd('timerLabel');
}

timer();

We can see the timer label displayed against the elapsed time after the timer is stopped.

It took the loop function 0.6909ms to finish looping through the array.

Conclusion

At last, we've come to the end of this tutorial. I hope you've enjoyed it.

I've left out the non-standard uses of the console class like console.profile, console.profileEnd and console.timeLog, but feel free to experiment with them and let me know how it goes.

Feedback

As always, if you have any feedback on the article for me or if you'd like to chat, find me at @mabishi on the Scotch Slack or drop me a line in the comments below.

Reduce MySQL Core Dump Size by Excluding the InnoDB Buffer Pool

When things go horribly wrong and a process crashes, one of the most powerful tools for investigating the cause of the crash is a core dump. As the amount of memory allocated to processes such as MySQL has increased – in some cases approaching 1TiB of memory – enabling core dumps can cause problems of their own. MySQL Server 8.0.14 and later supports an option to reduce the size of the core dump, which will be discussed in this blog.

Typically the largest single user of memory for MySQL is the InnoDB buffer pool. This is used to cache the data and indexes for tables using the InnoDB storage engine (the default). It is rarely important to know what is stored in the InnoDB buffer pool when investigating a crash, so a good way to reduce the core dump size is to exclude the buffer pool. In MySQL 8.0.14, the innodb_buffer_pool_in_core_file configuration option was added for this purpose. It defaults to ON, which means the buffer pool is included in core dumps (the same behaviour as in older releases).

Contribution

Thanks to Facebook for the contribution that implemented the innodb_buffer_pool_in_core_file option.

Let’s look a bit more into core dumps, how to enable them, why you may want to exclude the InnoDB buffer pool, and an example.

Information

The innodb_buffer_pool_in_core_file option only applies on Linux with kernel 3.4 and later. For this reason the rest of the blog assumes Linux is used.

Enabling Core Dumps

Core dumps can be enabled by including the core-file option in the MySQL configuration. It is disabled by default. However, on Linux/Unix it is in general not enough to enable core-file to get a core dump in case of a crash as the system will also limit the size of a core dump. Typically this limit is 0 by default.

The details of the steps required to enable core dumps depend on the Linux distribution used. The following will discuss the core-file option and the limit on the core size. However, the exact steps required may differ for your system and possibly include additional steps.

Enabling Core Dumps in MySQL

As mentioned, the option in MySQL to enable core dumps is core-file. This can only be set using the MySQL configuration file (my.cnf or my.ini). For example:

[mysqld]
core-file

After you have updated the configuration file, the change will take effect the next time MySQL is restarted.

Core Size Limit

Linux includes various limitations on the resources a process can use. This helps improve the stability of the system as a whole, but the default limits may be too restrictive for something like a database host where one process uses most of the resources. One of the limitations is the size of the core dump and typically the default limit is 0.

You can verify the current limits for your process using the proc file system, for example (assuming only a single mysqld process on the system):

shell$ sudo cat /proc/$(pidof mysqld)/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             10240                12288                processes
Max open files            10240                65536                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       15611                15611                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

There are various ways to change the limits for MySQL and which one to use depends on how you start MySQL. If you start MySQL from the command-line, you can simply use the ulimit command first:

shell$ ulimit -c unlimited
# Start MySQL
shell$ sudo cat /proc/$(pidof mysqld)/limits | grep 'core file size'
Max core file size        unlimited            unlimited            bytes

This has set the limit for the core file size to unlimited.

Warning

Do not blindly set the size to unlimited. If you configure the limit too high, you may run out of disk space preventing MySQL from restarting.

You can also configure the limit in the /etc/security/limits.conf file or in a new file in /etc/security/limits.d/. This is a better way to persist a setting and allows you for example to configure the limits for a given user. For example to set the core dump file size to unlimited for the mysql user:

mysql    soft    core    unlimited
mysql    hard    core    unlimited

However, on distributions using systemd (including Oracle Linux 7, Red Hat Enterprise Linux (RHEL) 7, and CentOS 7), systemd completely ignores /etc/security/limits.conf. Instead you need to use the service file for the process. For MySQL this is the /usr/lib/systemd/system/mysqld.service file, or if you have multiple instances on one host, /usr/lib/systemd/system/mysqld@.service. If you use Debian, replace mysqld with mysql. The option to set in the service file is LimitCore, for example:

[Service]
LimitCore = infinity

If you use mysqld_safe to start MySQL, you can use the core-file-size option (in the [mysqld_safe] group in the MySQL configuration file or on the command-line) to change the limit. This requires that the hard limit is high enough for the requested limit.
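As a minimal sketch of that last case, the option could go into the MySQL configuration file like this (the value "unlimited" is only an illustration; pick a size that fits your available disk space):

[mysqld_safe]
core-file-size = unlimited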

Problems with Core Dumps

Over time the amount of memory allocated to MySQL has grown. This particularly applies to the InnoDB buffer pool – as databases handle more data, there is also a greater need to cache the most used data and indexes in memory. Today, some MySQL instances have total memory allocations approaching 1TiB.

When a core dump happens, the memory allocated to the process is written out to disk. For processes using a large amount of memory, this may take some time. Furthermore, there is a possibility that the system may run out of disk space, particularly if repeated crashes occur. The disk space usage can particularly be an issue when MySQL is running as a service, as it is likely the core file in that case is written to the data directory. From a support perspective, it is also less than trivial to share a core dump that is several hundred gigabytes large.

Warning

Remember that the core dump is a snapshot of the process memory. This means that any data loaded into memory will be present in the core dump. This is particularly something to consider if you store sensitive data in your database. Even if you have transparent data encryption (TDE) enabled, the data in memory will be decrypted!

Excluding the InnoDB buffer pool from the core dump does reduce the amount of data in the core dump, but some data will still be included, for example because it is stored in memory for sorting or as part of a query result.

The InnoDB buffer pool is in most cases the largest contributor – it can easily contribute 75% or more of the total core dump size. So, excluding the buffer pool can significantly reduce the size of the core dump. I will give an example at the end of the blog.

Example

To finish off this blog, let’s look at an example. In this case, MySQL has been configured with innodb_buffer_pool_size = 2G and data has been loaded, so most of the buffer is in use:

mysql> SELECT SUM(FREE_BUFFERS) FROM information_schema.INNODB_BUFFER_POOL_STATS;
+-------------------+
| SUM(FREE_BUFFERS) |
+-------------------+
|                 0 |
+-------------------+
1 row in set (0.02 sec)

And from the output of top:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
11142 mysql     20   0 4121468 2.261g      0 S   4.0 58.6   2:38.50 mysqld

When MySQL is killed with signal 11 (segmentation fault), a core dump is created. Without innodb_buffer_pool_in_core_file, the core dump is almost 3 gigabytes (see output at the end of the blog).

The new feature can be enabled by adding innodb_buffer_pool_in_core_file = OFF to the configuration file or by persisting the setting:

mysql> SET PERSIST innodb_buffer_pool_in_core_file = OFF;
Query OK, 0 rows affected (0.02 sec)

The change takes effect immediately. The core dump will now be much smaller – around the size of the buffer pool smaller:

shell$ ls -lh /var/lib/mysql/core.*
-rw-------. 1 mysql mysql 2.8G Feb 11 20:30 /var/lib/mysql/core.11142
-rw-------. 1 mysql mysql 759M Feb 11 20:47 /var/lib/mysql/core.14456

Here core.11142 is with innodb_buffer_pool_in_core_file enabled and core.14456 with the option disabled.

Tip

Want to know more? There are more details in the reference manual at Excluding Buffer Pool Pages from Core Files and in the blog from the server team: MySQL 8.0: Excluding the Buffer Pool from a Core File.

Working with JSON arrays in MySQL/MariaDB

Rubik cubes: arrays of arrays!

When you write stored procedures in MySQL or MariaDB, one of the features you may miss is arrays. At least, at a superficial look – because an array data type is actually not implemented.

But actually JSON support was added in MySQL 8.0 and in MariaDB 10.2, and JSON can contain arrays. We already used this characteristic in a previous article, using an array to implement a function that accepts any number of parameters.

Here we will see in more detail, with some examples, how to work with JSON arrays: how to create arrays, how to loop over them, and how to perform the most common operations. There will also be a general purpose function as an example.

Build a new array

You could compose a JSON array as a string, but I suggest using the JSON_ARRAY() function instead, because it is less error prone and less verbose:

MariaDB [(none)]> SELECT JSON_ARRAY(1, 2, 3);
+---------------------+
| JSON_ARRAY(1, 2, 3) |
+---------------------+
| [1, 2, 3]           |
+---------------------+

Get the first and the last item

To access a single item from an array, you can use JSON_EXTRACT(). It accepts two parameters:

  • Any valid JSON document;
  • A path to a single element, written as a string.

As you can see, this function was not written specifically for arrays. However it works, as long as the array is valid JSON. To access the first item of an array:

SET @arr := JSON_ARRAY(10, 20, 30);
SELECT JSON_EXTRACT(@arr, '$[0]');

In other words, you simply use $[N], where N is the index of the element, starting from 0.

To get the last item, if we don’t know the array length, we can use JSON_LENGTH():

SET @arr := JSON_ARRAY(10, 20, 30);
SET @last := CONCAT('$[', JSON_LENGTH(@arr) - 1, ']');
SELECT JSON_EXTRACT(@arr, @last);

Note that it is possible to create empty arrays:

MariaDB [(none)]> SELECT JSON_ARRAY();
+--------------+
| JSON_ARRAY() |
+--------------+
| []           |
+--------------+

Add elements to an array

The JSON_ARRAY_INSERT() function adds an element at the specified position. If this position is already taken by another element, it will shift by one. To specify the position, we can use the same syntax used by JSON_EXTRACT().

Add an element at the beginning:

MariaDB [(none)]> SELECT @arr := JSON_ARRAY(100, 200, 300) AS my_array;
+-----------------+
| my_array        |
+-----------------+
| [100, 200, 300] |
+-----------------+
1 row in set (0.001 sec)

MariaDB [(none)]> SELECT @arr := JSON_ARRAY_INSERT(@arr, '$[0]', 'X') AS my_array;
+----------------------+
| my_array             |
+----------------------+
| ["X", 100, 200, 300] |
+----------------------+
1 row in set (0.001 sec)

To add an element at the end when we don’t know the number of elements in advance:

MariaDB [(none)]> SELECT @arr := JSON_ARRAY(100, 200, 300) AS my_array;
+-----------------+
| my_array        |
+-----------------+
| [100, 200, 300] |
+-----------------+
1 row in set (0.000 sec)

MariaDB [(none)]> SET @last := CONCAT('$[', JSON_LENGTH(@arr), ']');
Query OK, 0 rows affected (0.000 sec)

MariaDB [(none)]> SELECT @arr := JSON_ARRAY_INSERT(@arr, @last, 'X') AS my_array;
+----------------------+
| my_array             |
+----------------------+
| [100, 200, 300, "X"] |
+----------------------+
1 row in set (0.000 sec)

Actually, you could also specify a high number instead and the result will be the same. For example, if you are sure that the array never has more than 5 items, you can use:

SELECT @arr := JSON_ARRAY(100, 200, 300) AS my_array;
SELECT @arr := JSON_ARRAY_INSERT(@arr, '$[100]', 'X') AS my_array;

Other writes

Other array write functions are more or less what you would expect if you know at least one programming language. This post is not going to cover every possible array operation, but this is a list of the functions you will most likely need if you are going to work with arrays:

  • JSON_UNQUOTE() – Needed to read a JSON string, unless we want it wrapped in double quotes.
    • Or you can use the ->> operator, but this is not supported by MariaDB.
  • JSON_REPLACE() – Replace a value with another; if the original value does not exist, does nothing.
  • JSON_REMOVE() – Delete an item; the following items will shift by one position.
  • JSON_MERGE() – Appends an array’s items to another array.

Functions that don’t contain the word 'ARRAY' are also useful when working with objects. However, here we are only considering arrays.
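As a small sketch of how a few of these behave (output shown as comments, under the assumption that @arr starts as [100, 200, 300]):

SET @arr := JSON_ARRAY(100, 200, 300);

-- Replace the second element; the path must already exist
SELECT JSON_REPLACE(@arr, '$[1]', 'X');    -- [100, "X", 300]

-- Remove the first element; the remaining items shift left
SELECT JSON_REMOVE(@arr, '$[0]');          -- [200, 300]

-- Append the items of another array
SELECT JSON_MERGE(@arr, JSON_ARRAY(400));  -- [100, 200, 300, 400]

-- Read a string element without the surrounding double quotes
SELECT JSON_UNQUOTE(JSON_EXTRACT(JSON_ARRAY('a', 'b'), '$[0]'));  -- a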

Iterate over a JSON array

There is no built-in syntax to iterate over an array. However, we can do it by using a normal loop and the JSON_EXTRACT() function. Here is an example:

CREATE /*M! OR REPLACE */ PROCEDURE foreach_array_item(
        in_array BLOB,
        in_callback VARCHAR(64)
    )
    NOT DETERMINISTIC
    MODIFIES SQL DATA
    COMMENT 'Iterate an array and for each item invoke a callback procedure'
BEGIN
    DECLARE i INT UNSIGNED DEFAULT 0;
    DECLARE v_count INT UNSIGNED DEFAULT JSON_LENGTH(in_array);
    DECLARE v_current_item BLOB DEFAULT NULL;

    -- loop from 0 to the last item
    WHILE i < v_count DO
        -- extract the current item and strip the JSON double quotes
        SET v_current_item := JSON_UNQUOTE(JSON_EXTRACT(in_array, CONCAT('$[', i, ']')));
        -- invoke the callback procedure, passing the current item
        SET @sql := CONCAT('CALL `', in_callback, '`(', QUOTE(v_current_item), ');');
        PREPARE stmt FROM @sql;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
        SET i := i + 1;
    END WHILE;
END;

This procedure is written for MariaDB. To run it in MySQL, the in_array parameter should be of type JSON instead of BLOB.

To test it, let's build a dummy procedure and invoke it properly:

CREATE /*M! OR REPLACE */ PROCEDURE do_something(p_something BLOB)
    DETERMINISTIC
    NO SQL
BEGIN
    SELECT p_something AS something;
END;

-- let's invoke it with different data types
CALL foreach_array_item(JSON_ARRAY(
    100, 200.2, 'X', DATE '1994-01-01'
), 'do_something');

+-----------+
| something |
+-----------+
| 100       |
+-----------+
1 row in set (0.01 sec)

+-----------+
| something |
+-----------+
| 200.2     |
+-----------+
1 row in set (0.01 sec)

+-----------+
| something |
+-----------+
| X         |
+-----------+
1 row in set (0.01 sec)

+------------+
| something  |
+------------+
| 1994-01-01 |
+------------+
1 row in set (0.01 sec)

And more?

Again: this page cannot cover the whole topic. But if you have code, ideas or tricks to share, please feel absolutely free to do it in a comment. Or maybe ask me to dig into some details you are interested in, and I'll see what I can do. After all, this post was written after a request that I received privately.

Toodle pip,
Federico

Photo credit: Gerwin Sturm


The post Working with JSON arrays in MySQL/MariaDB appeared first on Federico Razzoli.

On my Favorite FOSDEM 2019 MySQL, MariaDB and Friends Devroom Talks

This year I not only spoke about MySQL bug reporting at FOSDEM, but also spent almost the entire day listening in the MySQL, MariaDB and Friends Devroom. I missed only one talk, on ProxySQL (I stepped out to get some water, drink a bottle of famous Belgian beer and chat with my former colleague from the MySQL support team, Geert, whom I had not seen for a decade). So, for the first time in my 4 FOSDEM visits, I got a first-hand impression of the entire set of talks in the devroom, which I want to share today, while I still remember my feelings.

Most of the talks have both slides and videos already uploaded on the site, so you can check them and draw your own conclusions, but my top 5 favorite talks (those with both videos and slides already available to the community) were the following:

  • "Un-split brain (aka Move Back in Time) MySQL", by Shlomi Noach. You can find slides at SlideShare.

    This was a replacement talk that was really interesting and had the proper style for FOSDEM. It was mostly a nice background story of the creation of the gh-mysql-rewind tool, a shell script that uses MariaDB's mysqlbinlog --flashback option and MySQL GTIDs and allows you to "rewind" the row-based binary log to roll back transactions to some previous point in time. The tool should become available to the community soon, maybe as a part of orchestrator. I was impressed how one can successfully use 49 slides for a 20-minute talk. That's far beyond my current presentation skills...
  • "Test complex database systems in a laptop with dbdeployer", by Giuseppe Maxia. You can find slides at SlideShare.

    I've already built and used dbdeployer, as described in my blog post, so I was really interested in the talk. Giuseppe was able not only to show 45 slides over 20 minutes and explain all the reasons behind re-implementing MySQL-Sandbox in Go, but also to run a live demo where dozens of sandbox instances were created and used. Very impressive!
  • "MySQL and the CAP theorem: relevance & misconceptions", second great talk and show by Shlomi Noach. You can find slides at SlideShare.

    The "CAP theorem" is the concept that a distributed database system (like any kind of MySQL replication setup) can only have 2 of the 3 features: (atomic) Consistency, (high) Availability and Partition Tolerance. This can be proved mathematically, but Shlomi not only defined the terms and conditions needed to present the formal proof, he also explained that they are far from the real production objectives of any engineer or DBA (like 99.95% availability). He showed typical MySQL setups (from simple async master-slave replication to Galera, group replication and even Vitess) and proved that formally they are all neither consistent nor available from the formal CAP theorem point of view, while, as we all know, they are practically useful and work (and with some effort, proxies on top, etc., they can be made both highly available and highly consistent for practical purposes). So, the CAP theorem neither represents real production systems nor meets their real requirements. We also got some kind of explanation of why async master-master or circular replication are still popular... All that in 48 slides, with links, and presented in 20 minutes! The greatest short MySQL-related talk I've ever attended.
  • "TiDB: Distributed, horizontally scalable, MySQL compatible", by Morgan Tocker. You can find slides at SlideShare.

    It was probably the first time I listened to Morgan, even though we worked together for a long time. I liked his way of explaining the architecture of this yet another database system speaking the MySQL protocol, and the reasons for creating it. If you are interested in the performance of this system, check this blog post.
  • "MySQL 8.0 Document Store: How to Mix NoSQL & SQL in MySQL 8.0", by Frédéric Descamps. You can find slides (70!) at SlideShare.

    LeFred managed to get me more interested in MySQL Shell and the new JSON functions in MySQL than ever before. It's even more surprising that his talk was the last one, and we had already spent 8+ hours listening before he started. The simple step-by-step explanation of how one may get the best of both the SQL/ACID and NoSQL (JSON, "MongoDB") worlds, if needed, in a single database management system, was impressive. This talk also probably caused the longest discussion and the largest number of questions from the remaining attendees.

    He was also one of the two "hosts" and "managers" of the devroom, so I am really thankful to him for his efforts, year after year, to make the MySQL devroom at FOSDEM great!
There were more good talks, but I had to pick just a few that already have slides shared and that are of the kind I personally prefer to listen to at FOSDEM. This year I also missed a few people whom I like to see and talk to at FOSDEM, namely Mark Callaghan and Jean-François Gagné.

The only photo I took with my Nokia dumb phone this year in Brussels, on my way to FOSDEM on February 2. We got snow and rain that morning, nice for anyone who had to walk 5 kilometers to the ULB campus. Overall, based on my experience this year, it still makes a lot of sense to visit FOSDEM for anyone interested in MySQL. You can hardly find so many good, different MySQL-related talks in one single day at any other conference.

Inspecting Index Usage In MySQL, PostgreSQL, and MongoDB

In my recent post about the Left-Prefix Index Rule, I explained how queries can use all or part of a compound (multi-column) index. Knowing what makes an index fully usable by a query is important, but it's also important to know how to figure out how much of an index a query is able to use. In this article I'll show you how to do this by reading the query's explain plan. This article covers MySQL, Postgres, and MongoDB.

 

 

As a quick review, a query can use an index if it has filtering values that constrain a contiguous leading portion of the index, up to and including the first inequality condition in index-column order. Now let's see where the database server exposes how much of the index is used.

MySQL

In MySQL's EXPLAIN output, the key_len column shows how much of the index was considered usable. It's a number of bytes, not a number (or list) of columns. You have to look at the columns' data types to figure out how many columns this maps to. It can get tricky, but at a high level, imagine that MySQL stores indexes as a concatenation of columns' maximum possible lengths, including an extra byte if the column is nullable.

For example:

CREATE TABLE `actor` (
`actor_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`first_name` varchar(45) NOT NULL,
`last_name` varchar(45) NOT NULL,
`last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`actor_id`),
KEY `idx_actor_last_name` (`actor_id`, `last_name`)
) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=utf8

The idx_actor_last_name index is going to be up to 2 + 45 bytes long (a smallint is 2 bytes).

Now if you see key_len of 2 in EXPLAIN, that means the query is only able to constrain the actor_id column, and there's no value to match against the last_name column. The query is only using a 1-column prefix of the index.

If you're using the JSON format for EXPLAIN, look for the used_key_parts property, which will have a list of columns, like "used_key_parts": ["actor_id"].
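As a quick sketch against the actor table above (the last_name value is made up, and the optimizer might otherwise prefer the primary key here, so the secondary index is forced purely to illustrate key_len; exact lengths also depend on the character set):

-- Only actor_id is constrained, so key_len covers just the first column (2 bytes)
EXPLAIN SELECT * FROM actor FORCE INDEX (idx_actor_last_name)
WHERE actor_id = 1;

-- Both columns are constrained, so key_len grows to include last_name as well
EXPLAIN SELECT * FROM actor FORCE INDEX (idx_actor_last_name)
WHERE actor_id = 1 AND last_name = 'KANT';

-- The JSON format shows the same thing as a column list in used_key_parts
EXPLAIN FORMAT=JSON SELECT * FROM actor FORCE INDEX (idx_actor_last_name)
WHERE actor_id = 1 AND last_name = 'KANT';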

PostgreSQL

PostgreSQL shows which values were used to search an index with the "Index Cond" in the EXPLAIN output. Values that are used to filter rows from the table are shown as filters. This example query from the EXPLAIN documentation shows both at once:

EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000 LIMIT 2;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..14.71 rows=2 width=244) (actual time=0.177..0.249 rows=2 loops=1)
-> Index Scan using tenk1_unique2 on tenk1 (cost=0.29..72.42 rows=10 width=244) (actual time=0.174..0.244 rows=2 loops=1)
Index Cond: (unique2 > 9000)
Filter: (unique1 < 100)
Rows Removed by Filter: 287

The "Index Cond" on unique2 near the bottom shows a value being used to search for rows in the index, whereas the filter on unique1 is applied to the rows in the table.

Postgres doesn't make it explicitly visible whether a column is "useful" in searching for values in the index. You might see complex Index Conds like "Index Cond: ((i2 = 898732) AND (i5 = 123451))", but you'll need to know whether these two columns comprise a contiguous leading prefix of the index. The left-prefix rule still applies, but it's not as explicitly visible. However, as the multi-column index docs say (emphasis mine),

A multicolumn B-tree index can be used with query conditions that involve any subset of the index's columns, but the index is most efficient when there are constraints on the leading (leftmost) columns... Constraints on columns to the right of these columns are checked in the index, so they save visits to the table proper, but they do not reduce the portion of the index that has to be scanned.

MongoDB

MongoDB has the same rules about using a prefix of a compound index. It exposes which columns were matched in the keyPattern property of the explain() output. Each column that was used has a "colname":1 indicator within this. Here's an example, which the docs dissect in more detail:

{
"queryPlanner" : {
...
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"quantity" : 1,
"type" : 1
},
...

Conclusions

When you combine this blog post together with the Left-Prefix Index Rule you have a complete way of examining whether queries use the full width of an index effectively. Are they using values to search to as small a range of the index as they could be? Or are they scanning a lot of an index and filtering a lot of it out? The former is a lot more efficient, so the database can make your queries run a lot faster that way.

Photo by Denny Müller

The Story of our Sea Lion

Why a sea lion? That’s a question we get every now and then, most recently at FOSDEM. Here is the story: Our Founder Monty likes animals in the sea. For MySQL, he picked a dolphin, after swimming with them in the Florida Keys. For the MariaDB sea lion, there was a similar encounter. It happened […]

The post The Story of our Sea Lion appeared first on MariaDB.org.

ProxySQL 1.4.14 and Updated proxysql-admin Tool

ProxySQL 1.4.14, released by ProxySQL, is now available for download in the Percona Repository along with an updated version of Percona’s proxysql-admin tool.

ProxySQL is a high-performance proxy, currently for MySQL and database servers in the MySQL ecosystem (like Percona Server for MySQL and MariaDB). It acts as an intermediary for client requests seeking resources from the database. René Cannaò created ProxySQL for DBAs as a means of solving complex replication topology issues.

The ProxySQL 1.4.14 source and binary packages available from the Percona download page for ProxySQL include ProxySQL Admin – a tool developed by Percona to configure Percona XtraDB Cluster nodes into ProxySQL. Docker images for release 1.4.14 are available as well. You can download the original ProxySQL from GitHub. GitHub hosts the documentation in the wiki format.

This release introduces an improvement in how proxysql-admin works with the --max-connections option. In previous releases, this option was always equal to 1000. Now, proxysql_galera_checker uses the value of the --max-connections option set by the user, either on the command line (proxysql-admin --max-connections) or in the configuration file.

If the user doesn’t set this option, it defaults to 1000.

Improvements
  • PSQLADM-130: Every time a node is removed and then added back, proxysql_galera_checker script restores the custom value of the --max-connections option set using proxysql-admin --max-connections.
  • The --syncusers option of proxysql-admin starts to support MariaDB. Thanks to Jonas Kint (@jonaskint) for this contribution.

ProxySQL is available under Open Source license GPLv3.

pre-FOSDEM MySQL Day 2019 – slides

This event was just awesome. We got 110 participants! Thank you everybody, and also a big thank you to the speakers.

Here are the slides of all the sessions:

I will add the 2 missing slide-decks as soon as I receive them.

As this day was a real success for the MySQL Community, we of course plan to organize it again next year, and maybe add a second conference room to increase the number of participants, as we had to turn people away.

Django Authentication With Facebook, Instagram and LinkedIn

Introduction

For users of internet services, there are only a few things worse than having to manually sign in (and sign up) with different websites. The ordeal of the manual sign-in process comes largely from the difficulty of having to remember multiple strong passwords.

The inability to remember a password could lead to the creation of multiple accounts on the same platform or the outright switch to browsing as an unauthenticated user (where it doesn’t prevent access to the sought-after information or service).

Modern web applications solve this problem using social authentication, which is primarily a way to allow users to sign in (and sign-up) with the application using login information from a social network provider that they already have an account with.

In this tutorial, we will build a simple Django application that allows users to sign in via their Facebook, Instagram and LinkedIn accounts. To achieve this, we will use the social-auth-app-django library. We will also learn how to extract additional information such as profile picture and username from the respective social accounts.

At the end of this tutorial, we will have the final application that works like this:

The source code for this project is available here on GitHub.

Prerequisites

You need the following items installed on your machine to follow along with this tutorial:

  1. Python3
  2. Pipenv

Pipenv is a production-ready tool that aims to bring the best of all packaging worlds to the Python world. It harnesses Pipfile, pip, and virtualenv into one single command.

This tutorial assumes that the reader has basic working knowledge with Django. You also need to have an account with Facebook, Instagram and LinkedIn.

Let’s dive right in!

Setting up the Django app

In this section, we will set up a new Django project and install its dependencies. Let's start by creating a new folder and making it the present working directory:

$ mkdir django_social_app
$ cd django_social_app

We will create and activate a new virtual environment using Pipenv; this has the effect of creating an isolated Python environment and preventing us from polluting the global package directory when we install Django dependencies. We will also install django and social-auth-app-django:

$ pipenv shell
$ pipenv install django social-auth-app-django

social-auth-app-django simplifies the implementation of social authentication with Django.

Let’s create (and navigate into) a new Django project, we will call it social_app:

(django_social_app) $ django-admin startproject social_app
(django_social_app) $ cd social_app

Note: It is important that we run the commands from the terminal that is sourced into the virtual environment i.e displays (django_social_app) at the beginning of each command line.

Next, we will create a Django application called core, this application will contain all our views and templates:

(django_social_app) $ python manage.py startapp core

Note: You need to be in the parent social_app directory to run ‘python manage.py *’ commands.

Let’s find the settings.py file in the social_app project and add both core and social-auth-app-django as INSTALLED_APPS:

#social_app/settings.py

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'social_django',  # add this
    'core'  # add this
]

Finally, let’s migrate the database:

(django_social_app) $ python manage.py migrate

Configuring the authentication classes

Behind the scenes, Django maintains a list of “authentication backends” that it checks during user authentication. If the first authentication method fails, Django tries the second one, and so on, until all backends have been attempted.

The AUTHENTICATION_BACKENDS array contains a list of authentication backend classes (as strings) and is by default set to:

['django.contrib.auth.backends.ModelBackend']

We can update it and add new authentication classes in order to allow authentication with the social platforms we are considering in this tutorial.

To update it, simply add the following code in the settings.py file:

#social_app/settings.py

#add this
AUTHENTICATION_BACKENDS = [
    'social_core.backends.linkedin.LinkedinOAuth2',
    'social_core.backends.instagram.InstagramOAuth2',
    'social_core.backends.facebook.FacebookOAuth2',
    'django.contrib.auth.backends.ModelBackend',
]

We just added the authentication backend classes for Linkedin, Instagram and Facebook.

You can find a list of the authentication backends classes supported by social-auth-app-django here.

Adding templates and static files

We’ve only worked on setting up and configuring the application, let’s move on to something visual now. In this section, we will build the foundation of the templates that will display the application.

Let’s create a new folder in the core directory; we will call this folder templates:

(django_social_app) $ cd core/
(django_social_app) $ mkdir templates/

Create three files within the templates directory and call them:

  1. base.html
  2. login.html
  3. home.html

Now, open the base.html file and paste in the following snippet:

<!-- templates/base.html -->
{% load static %}
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="X-UA-Compatible" content="ie=edge" />
    <link
      rel="stylesheet"
      href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.3/css/bootstrap.min.css"
      integrity="sha384-MCw98/SFnGE8fJT3GXwEOngsV7Zt27NXFoaoApmYm81iuXoPkFOJwJ8ERdknLPMO"
      crossorigin="anonymous"
    />
    <link rel="stylesheet" href="{% static 'css/index.css' %}" />
    <title>Social Auth with Django</title>
  </head>
  <body>
    <div class="container-fluid">
      <div>
        <h1 class="text-white text-center">{% block title %}{% endblock %}</h1>
        <div class="card p-5">
          {% block content %}
          {% endblock %}
        </div>
      </div>
    </div>
  </body>
</html>

Paste in the following snippet in the login.html file:

<!-- templates/login.html -->
{% extends 'base.html' %}

{% block title %} Sign in {% endblock %}

{% block content %}
<div class="row">
  <div class="col-md-8 mx-auto social-container my-2 order-md-1">
    <button class="btn btn-danger mb-2">
      <a href="#">Login with Instagram</a>
    </button>
    <button class="btn btn-primary mb-2">
      <a href="#">Login with Facebook </a>
    </button>
    <button class="btn btn-info mb-2">
      <a href="#">Login with LinkedIn</a>
    </button>
  </div>
</div>
{% endblock %}

Lastly, update the home.html file with the code below:

<!-- templates/home.html -->
{% extends 'base.html' %}

{% block title %} Home {% endblock %}

{% block content %}
<div class="row">
  <div class="col-sm-12 mb-3">
    <h4 class="text-center">
      Welcome {{ user.username }}
    </h4>
  </div>
</div>
{% endblock %}

We need some styles to help our code look nice when rendered, so let’s create a folder called static in the root of the core folder and we will store our styles there.

Create a folder called css within the static directory and, finally, create an index.css file within the css folder.

Now open the index.css file and update it with the following code:

/* index.css */
img {
  border: 3px solid #282c34;
}

.container-fluid {
  height: 100vh;
  background-color: #282c34;
  display: flex;
  justify-content: center;
  align-items: center;
}

.container-fluid > div {
  width: 85%;
  min-width: 300px;
  max-width: 500px;
}

.card {
  width: 100%;
}

.social-container {
  display: flex;
  flex-direction: column;
  justify-content: center;
}

.btn a,
.btn a:hover {
  color: white;
  text-decoration: none;
}

Setting up the Views and URLs

In this section, we will define the Views and register the URLs that the application needs to work, so open the core/views.py file and replace its content with the snippet below:

# core/views.py
from django.shortcuts import render
from django.contrib.auth.decorators import login_required

# Create your views here.
def login(request):
    return render(request, 'login.html')

@login_required
def home(request):
    return render(request, 'home.html')

Next, we will register the routes for the application and attach their matching view functions. Replace the content of the social_app/urls.py file with the code below:

# social_app/urls.py
from django.contrib import admin
from django.urls import path, include
from django.contrib.auth import views as auth_views

from core import views

urlpatterns = [
    path('admin/', admin.site.urls),
    path("login/", views.login, name="login"),
    path("logout/", auth_views.LogoutView.as_view(), name="logout"),
    path('social-auth/', include('social_django.urls', namespace="social")),
    path("", views.home, name="home"),
]

In the settings.py file, we need to set four new values — LOGIN_URL, LOGOUT_URL, LOGIN_REDIRECT_URL and LOGOUT_REDIRECT_URL — because they will be used in redirecting the user when authentication is complete:

# social_app/settings.py
# [...]
LOGIN_URL = 'login'
LOGIN_REDIRECT_URL = 'home'
LOGOUT_URL = 'logout'
LOGOUT_REDIRECT_URL = 'login'
# [...]

Fantastic! We can now run the application to see what we’ve built so far. Let’s start the server with this command:

(django_social_app) $ python manage.py runserver

Note: You need to be in the parent social_app directory to run ‘python manage.py *’ commands.

We can view the application on http://localhost:8000, though we will be redirected to /login since we aren't authenticated:

Looking good! In the next sections, we will register our application with the social network providers so that users can be authenticated via social platforms.

Facebook Authentication

In this section, we will do the heavy lifting and set up authentication via Facebook.

Get Facebook Credentials

Head over to the Facebook Developers’ page, after signing in, click on Add a New App and enter the details for the app on the modal window that appears:

Once the app has been created, you will be redirected to the application’s dashboard. On the left side of the screen, click on Settings, then click on the Basic option that appears directly underneath it.

When the new screen loads, under the App Domains section, add localhost like this:

Now scroll down until you see an option that says Add Platform. Click on it and select the Website option. This will create a website section where you will see Site URL; add http://localhost:8000/ in the input and click on the Save Changes button:

Now, copy the App ID and App secret from the applications dashboard and add them to the settings.py file:

# social_app/settings.py
# [...]
SOCIAL_AUTH_FACEBOOK_KEY = YOUR_APP_KEY        # App ID
SOCIAL_AUTH_FACEBOOK_SECRET = YOUR_APP_SECRET  # App Secret
# [...]

Replace the YOUR_APP_* keys with the values from your Facebook application dashboard.

Let’s update the URL of the Login with Facebook button in login.html file with this one:

<!-- templates/login.html -->
<button class="btn btn-primary mb-2">
  <a href="{% url 'social:begin' 'facebook' %}">Login with Facebook</a>
</button>

Start up the web server and visit localhost:8000/login to test that we can now log into the application via Facebook:

(django_social_app) $ python manage.py runserver

Note: You need to be in the parent social_app directory to run ‘python manage.py *’ commands.

When we click on the Login with Facebook button, we should be redirected to a page similar to the one below:

Clicking on Continue as USERNAME will redirect to the home page and your username will be displayed. We have successfully been authenticated via Facebook:

Want to know what just happened behind the scenes? Let’s find out; we will create a superuser account to access the admin panel:

(django_social_app) $ python manage.py createsuperuser

You will get a prompt to enter a username, email and password for the superuser. Be sure to enter details that you can remember because you will need them to log in to the admin dashboard shortly.

After creating the superuser account, we will run the server and visit the admin panel on this address — http://localhost:8000/admin/:

We can see that a new User (besides the superuser) has been created in the Users category:

We will also see that there is a new account under the User social auths category:

The explanation for the existence of these new accounts is: When a user logs into the application using Facebook (or any social network provider), a new User and User Social Auth instance are created. The User Social Auth is linked to the User account the first time, then subsequently, the User instance is simply queried from the database.

The User instance is created using the details received from the social network provider. In this case, Facebook sent back the first_name and last_name, which populate the corresponding fields of the created User instance, and the username is a concatenation of the first_name and last_name.

Some social providers (like Instagram) will return the user’s username on their platform and this is used (instead of the concatenation) for the created User instance on the application.
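If you want to verify this for yourself, a quick peek from the Django shell (python manage.py shell) works too. This is a hedged sketch, not part of the tutorial code; the social_auth related name comes from the social_django models:

from django.contrib.auth.models import User

# list every user and which provider (if any) each social login came from
for user in User.objects.all():
    providers = list(user.social_auth.values_list('provider', flat=True))
    print(user.username, providers or 'no social login')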

We don’t want our application to display just a username; we want additional user data, such as a profile picture, so let’s request extra data from Facebook.

Note: You will need to log out from the admin panel to continue.

Open the settings.py file and update it accordingly:

# settings.py
TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
                'social_django.context_processors.backends',        # add this
                'social_django.context_processors.login_redirect',  # add this
            ],
        },
    },
]

SOCIAL_AUTH_FACEBOOK_KEY = YOUR_APP_KEY        # App ID
SOCIAL_AUTH_FACEBOOK_SECRET = YOUR_APP_SECRET  # App Secret

SOCIAL_AUTH_FACEBOOK_SCOPE = ['email', 'user_link']  # add this

SOCIAL_AUTH_FACEBOOK_PROFILE_EXTRA_PARAMS = {  # add this
    'fields': 'id, name, email, picture.type(large), link'
}

SOCIAL_AUTH_FACEBOOK_EXTRA_DATA = [  # add this
    ('name', 'name'),
    ('email', 'email'),
    ('picture', 'picture'),
    ('link', 'profile_url'),
]

Replace the YOUR_APP_* keys with the values from your Facebook application dashboard.

The social_django context processors help in adding backend and association data to the template context. This makes it easy for us to access data about the authenticated user using template tags. You can read more about it here.

When a user logs into the application via Facebook, we can access a subset of the user’s data using permissions; permissions are how we ask if we can access the data on Facebook. In the code above, SOCIAL_AUTH_FACEBOOK_SCOPE contains a list of permissions to access the data properties our application requires. The email and user_link permissions request access to the user’s Facebook email and profile link respectively.

Let’s start the server now, visit http://localhost:8000/login/, and attempt logging in via Facebook:

Clicking on Continue as USERNAME will grant the application access to the user’s private data on Facebook. For this reason, to request some permissions you will need to submit your application for review by the Facebook team to ensure that the returned data isn’t misused. You can find the list of Facebook permissions here.

SOCIAL_AUTH_FACEBOOK_PROFILE_EXTRA_PARAMS has a key — fields — whose value lists the attributes that should be returned by Facebook when the user has successfully logged in. The values are dependent on SOCIAL_AUTH_FACEBOOK_SCOPE.

When a user logs in using Facebook, the Facebook API returns the data requested in SOCIAL_AUTH_FACEBOOK_PROFILE_EXTRA_PARAMS. To store the data in the database, we need to specify them in SOCIAL_AUTH_FACEBOOK_EXTRA_DATA.

The extra data will be stored in the extra_data field of the User Social Auth instance:
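You can inspect that field directly from the Django shell as well. A hedged sketch; the keys shown are the ones we mapped in SOCIAL_AUTH_FACEBOOK_EXTRA_DATA:

from social_django.models import UserSocialAuth

facebook_account = UserSocialAuth.objects.filter(provider='facebook').first()
if facebook_account:
    # extra_data is a dict-like field holding the values we asked Facebook for
    print(facebook_account.extra_data.get('name'))
    print(facebook_account.extra_data.get('email'))
    print(facebook_account.extra_data.get('profile_url'))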

Now, let’s update the frontend to display the extra data we received from Facebook. Open the home.html file and replace its content with this one:

<!-- home.html -->
{% extends 'base.html' %}
{% block title %} Home {% endblock %}
{% block content %}
<div class="row">
  <div class="col-sm-12 mb-3">
    <h4 class="text-center">Welcome {{ user.username }}</h4>
  </div>
  <!-- Add from here -->
  {% for ass in backends.associated %}
    {% if ass.provider == 'facebook' %}
      <div class="col-md-6 text-center">
        <img src="{{ ass.extra_data.picture.data.url }}" alt="" width="150" height="150" style="border-radius: 50%">
      </div>
      <div class="col-md-6 social-container my-2">
        <p>Signed in via: {{ ass.provider }}</p>
        <p>Name: {{ ass.extra_data.name }}</p>
        <p>Provider url: <a href="{{ ass.extra_data.profile_url }}">link</a></p>
      </div>
    {% endif %}
  {% endfor %}
  <div class="col-sm-12 mt-2 text-center">
    <button class="btn btn-danger">
      <a href="{% url 'logout' %}">Logout</a>
    </button>
  </div>
  <!-- End here -->
</div>
{% endblock %}

Now we can visit the application’s home address — http://localhost:8000/ — on a web browser and see the extra data:

Note: We now have access to the data that gets stored in the `extra_data` field because we added the context processors to `TEMPLATES`; this enables us to access `backends` data in the frontend.

Congratulations! We have successfully authenticated via Facebook and can now move on to setting up the next social network provider.

Instagram Authentication

Seeing that we’ve gone into detail on how things work behind the scenes, we can set up the Instagram authentication in just three steps:

  • Get API credentials

  • Setup the backend

  • Setup the frontend

Let’s go!

Get API credentials

Let’s visit the Instagram developers page and click on Register Your Application:

On the screen that comes up next, click on Register a New Client and fill out the details for the application:

Note: Instagram (and most social network providers) require a redirect URL which will be the address the user’s browser will be directed to after the authentication. For this tutorial, we will use this address http://localhost:8000/social-auth/complete/instagram/. You can learn more about this here.

Successful registration of the application will return a screen like this:

Now, we will click on the Manage option to get the application’s Client ID and Client Secret:

Note: We need the ID and the Secret to configure the backend.

Setup the Backend

Let’s add this code to the settings.py file:

# settings.py
# [...]
# add this code
SOCIAL_AUTH_INSTAGRAM_KEY = YOUR_CLIENT_ID         # Client ID
SOCIAL_AUTH_INSTAGRAM_SECRET = YOUR_CLIENT_SECRET  # Client Secret
SOCIAL_AUTH_INSTAGRAM_EXTRA_DATA = [
    ('user', 'user'),
]
# [...]

Replace the YOUR_CLIENT_* keys with the values from your Instagram application.

With the code above, once a user is authenticated via Instagram, a user object will be sent back to the application. We add this object to SOCIAL_AUTH_INSTAGRAM_EXTRA_DATA because we want it to be stored in the database for easy reference.

The user object is returned in this format:

{ "id": ..., "username": ..., "profile_picture": ..., "full_name": ..., "bio": ..., "website": ..., "is_business": ..., } Setup the Frontend

We want to display a nice UI for when a user is authenticated via Instagram so let’s replace the content of the home.html file with the code below:

<!-- home.html -->
{% extends 'base.html' %}
{% block title %} Home {% endblock %}
{% block content %}
<div class="row">
  <div class="col-sm-12 mb-3">
    <h4 class="text-center">Welcome {{ user.username }}</h4>
  </div>
  {% for ass in backends.associated %}
    {% if ass.provider == 'facebook' %}
      <div class="col-md-6 text-center">
        <img src="{{ ass.extra_data.picture.data.url }}" alt="" width="150" height="150" style="border-radius: 50%">
      </div>
      <div class="col-md-6 social-container my-2">
        <p>Signed in via: {{ ass.provider }}</p>
        <p>Name: {{ ass.extra_data.name }}</p>
        <p>Provider url: <a href="{{ ass.extra_data.profile_url }}">link</a></p>
      </div>
    {% endif %}
    <!-- Add from here -->
    {% if ass.provider == 'instagram' %}
      <div class="col-md-6 text-center">
        <img src="{{ ass.extra_data.user.profile_picture }}" alt="" width="150" height="150" style="border-radius: 50%">
      </div>
      <div class="col-md-6 social-container my-2">
        <p>Signed in via: {{ ass.provider }}</p>
        <p>Name: {{ ass.extra_data.user.full_name }}</p>
        <p>Provider url: <a href="https://instagram.com/{{ ass.username }}">link</a></p>
      </div>
    {% endif %}
    <!-- End here -->
  {% endfor %}
  <div class="col-sm-12 mt-2 text-center">
    <button class="btn btn-danger">
      <a href="{% url 'logout' %}">Logout</a>
    </button>
  </div>
</div>
{% endblock %}

Let’s update the URL of the Login with Instagram button in login.html file:

<!-- templates/login.html -->
<button class="btn btn-danger mb-2">
  <a href="{% url 'social:begin' 'instagram' %}">Login with Instagram</a>
</button>

We can now start the server, visit http://localhost:8000/login, and try to log in with Instagram:

Note: The Instagram application is in sandbox (test) mode. To make it live, you will have to submit it for a review. You can learn more about this here.

Once the application is authorized, the user will be logged in and redirected to the home page:

LinkedIn Authentication

We will set up LinkedIn authentication in three steps:

  • Get API credentials

  • Setup the Backend

  • Setup the Frontend

Get API credentials

Let’s visit the Linkedin developers page and click on Create app:

Fill out the application details:

Note: Most of these fields are required and you will have to fill them out with valid structured data.

Once the app has successfully been created, you will be redirected to the application’s dashboard. Here, add http://localhost:8000/social-auth/complete/linkedin-oauth2/ as the redirect URL and update the application:

Take note of the Client ID and Secret; we will need them when setting up the backend.

Setup the Backend

Let’s add this code to the settings.py file:

# settings.py
# [...]
# add this code
SOCIAL_AUTH_LINKEDIN_OAUTH2_KEY = YOUR_CLIENT_ID         # Client ID
SOCIAL_AUTH_LINKEDIN_OAUTH2_SECRET = YOUR_CLIENT_SECRET  # Client Secret
SOCIAL_AUTH_LINKEDIN_OAUTH2_SCOPE = ['r_basicprofile', 'r_emailaddress']
SOCIAL_AUTH_LINKEDIN_OAUTH2_FIELD_SELECTORS = ['email-address', 'formatted-name', 'public-profile-url', 'picture-url']
SOCIAL_AUTH_LINKEDIN_OAUTH2_EXTRA_DATA = [
    ('id', 'id'),
    ('formattedName', 'name'),
    ('emailAddress', 'email_address'),
    ('pictureUrl', 'picture_url'),
    ('publicProfileUrl', 'profile_url'),
]
# [...]

Replace the YOUR_CLIENT_* keys with the values from your LinkedIn application.

The SOCIAL_AUTH_LINKEDIN_OAUTH2_SCOPE array contains the permissions needed to access the user’s data, similar to what we saw when we set up authentication via Facebook.

The SOCIAL_AUTH_LINKEDIN_OAUTH2_FIELD_SELECTORS array contains the list of data that should be returned when the user is successfully authenticated via LinkedIn. It is similar to the SOCIAL_AUTH_FACEBOOK_PROFILE_EXTRA_PARAMS setting for Facebook. You can find the full list of data items that can be requested here.

The SOCIAL_AUTH_LINKEDIN_OAUTH2_EXTRA_DATA array contains the data that we want to store in the database for later reference.

Setup the Frontend

Let’s replace the content of the home.html file with the code below:

<!-- home.html -->
{% extends 'base.html' %}
{% block title %} Home {% endblock %}
{% block content %}
<div class="row">
  <div class="col-sm-12 mb-3">
    <h4 class="text-center">Welcome {{ user.username }}</h4>
  </div>
  {% for ass in backends.associated %}
    {% if ass.provider == 'facebook' %}
      <div class="col-md-6 text-center">
        <img src="{{ ass.extra_data.picture.data.url }}" alt="" width="150" height="150" style="border-radius: 50%">
      </div>
      <div class="col-md-6 social-container my-2">
        <p>Signed in via: {{ ass.provider }}</p>
        <p>Name: {{ ass.extra_data.name }}</p>
        <p>Provider url: <a href="{{ ass.extra_data.profile_url }}">link</a></p>
      </div>
    {% endif %}
    {% if ass.provider == 'instagram' %}
      <div class="col-md-6 text-center">
        <img src="{{ ass.extra_data.user.profile_picture }}" alt="" width="150" height="150" style="border-radius: 50%">
      </div>
      <div class="col-md-6 social-container my-2">
        <p>Signed in via: {{ ass.provider }}</p>
        <p>Name: {{ ass.extra_data.user.full_name }}</p>
        <p>Provider url: <a href="https://instagram.com/{{ ass.username }}">link</a></p>
      </div>
    {% endif %}
    <!-- Add from here -->
    {% if ass.provider == 'linkedin-oauth2' %}
      <div class="col-md-6 text-center">
        <img src="{{ ass.extra_data.picture_url }}" alt="" width="150" height="150" style="border-radius: 50%">
      </div>
      <div class="col-md-6 social-container my-2">
        <p>Signed in via: LinkedIn</p>
        <p>Name: {{ ass.extra_data.name }}</p>
        <p>Provider url: <a href="{{ ass.extra_data.profile_url }}">link</a></p>
      </div>
    {% endif %}
    <!-- End here -->
  {% endfor %}
  <div class="col-sm-12 mt-2 text-center">
    <button class="btn btn-danger">
      <a href="{% url 'logout' %}">Logout</a>
    </button>
  </div>
</div>
{% endblock %}

We will also update the URL of the Login with LinkedIn button in the login.html file:

<!-- templates/login.html -->
<button class="btn btn-info mb-2">
  <a href="{% url 'social:begin' 'linkedin-oauth2' %}">Login with LinkedIn</a>
</button>

We can now start the server, visit http://localhost:8000/login, and try to log in with LinkedIn:

Once we authorize the application by clicking on Allow, we will be directed to the homepage:

Conclusion

We have come to the end of this tutorial and have learnt how to set up social authentication in Django using the social-auth-app-django library with minimal configuration. We have also learnt how to request extra user data once the user has been authenticated via a social network provider.

As we already discussed at the beginning of this article, the importance of social authentication in modern web applications cannot be overemphasized.

The code for this article is available here on GitHub.

Column Families in MyRocks

In my webinar How To Rock with MyRocks I briefly mentioned the column families feature in MyRocks, which allows fine tuning for indexes and primary keys.

Let’s review it in more detail.

To recap, MyRocks is based on the RocksDB library, which stores all data in [key => value] pairs. When this translates to MySQL, all primary keys (data) and secondary keys (indexes) are stored in [key => value] pairs, which by default are assigned to the “default” column family.

Each column family has an individual set of:

  • SST files, and their parameters
  • Memtable and its parameters
  • Bloom filters, and their parameters
  • Compression settings

There is an N:1 relation between tables/indexes and column families, so schematically it looks like this:

How do you assign tables and indexes to a column family?

It is defined in the COMMENT section for a key or primary key:

CREATE TABLE tab1 (
  a INT,
  b INT,
  PRIMARY KEY (a) COMMENT 'cfname=cf1',
  KEY key_b (b) COMMENT 'cfname=cf2'
);

Now, if you want to define individual parameters for column families, you should use

rocksdb_override_cf_options 

For example:

rocksdb_override_cf_options='cf1={compression=kNoCompression};cf2={compression=kLZ4Compression,bottommost_compression=kZSTD}'

Be careful of defining too many column families: as I mentioned, each column family will use an individual memtable, which takes 64MB of memory by default.

There is also an individual set of SST files. You can see how they perform with:

SHOW ENGINE ROCKSDB STATUS;

  Type: CF_COMPACTION
  Name: cf1
Status:
** Compaction Stats [cf1] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0     0.00 KB   0.0      0.0     0.0      0.0       1.5      1.5       0.0   1.0      0.0     99.1        15       37    0.410       0      0
  L5      3/0   197.05 MB   0.8      0.4     0.4      0.0       0.4      0.4       0.0   1.0     75.6     75.6         6        1    5.923   8862K      0
  L6      7/0   341.24 MB   0.0      1.7     1.3      0.5       0.8      0.3       0.0   0.6     42.8     19.5        42        7    5.933     61M      0
 Sum     10/0   538.29 MB   0.0      2.2     1.7      0.5       2.7      2.2       0.0   1.8     35.5     44.1        63       45    1.392     70M      0
 Int      0/0     0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0        0    0.000       0      0

  Type: CF_COMPACTION
  Name: cf2
Status:
** Compaction Stats [cf2] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0     0.00 KB   0.0      0.0     0.0      0.0       0.3      0.3       0.0   1.0      0.0     13.5        22       22    1.023       0      0
  L6      4/0   178.61 MB   0.0      0.6     0.3      0.3       0.4      0.2       0.0   1.5      9.3      7.3        61        5   12.243     72M      0
 Sum      4/0   178.61 MB   0.0      0.6     0.3      0.3       0.7      0.5       0.0   2.5      6.8      9.0        84       27    3.100     72M      0
 Int      0/0     0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0        0    0.000       0      0

To check the current column families and their settings you can use:

SELECT * FROM INFORMATION_SCHEMA.ROCKSDB_CF_OPTIONS;

| cf1 | COMPARATOR                                                      | RocksDB_SE_v3.10 |
| cf1 | MERGE_OPERATOR                                                  | NULL |
| cf1 | COMPACTION_FILTER                                               | NULL |
| cf1 | COMPACTION_FILTER_FACTORY                                       | Rdb_compact_filter_factory |
| cf1 | WRITE_BUFFER_SIZE                                               | 67108864 |
| cf1 | MAX_WRITE_BUFFER_NUMBER                                         | 2 |
| cf1 | MIN_WRITE_BUFFER_NUMBER_TO_MERGE                                | 1 |
| cf1 | NUM_LEVELS                                                      | 7 |
| cf1 | LEVEL0_FILE_NUM_COMPACTION_TRIGGER                              | 4 |
| cf1 | LEVEL0_SLOWDOWN_WRITES_TRIGGER                                  | 20 |
| cf1 | LEVEL0_STOP_WRITES_TRIGGER                                      | 36 |
| cf1 | MAX_MEM_COMPACTION_LEVEL                                        | 0 |
| cf1 | TARGET_FILE_SIZE_BASE                                           | 67108864 |
| cf1 | TARGET_FILE_SIZE_MULTIPLIER                                     | 1 |
| cf1 | MAX_BYTES_FOR_LEVEL_BASE                                        | 268435456 |
| cf1 | LEVEL_COMPACTION_DYNAMIC_LEVEL_BYTES                            | ON |
| cf1 | MAX_BYTES_FOR_LEVEL_MULTIPLIER                                  | 10.000000 |
| cf1 | SOFT_RATE_LIMIT                                                 | 0.000000 |
| cf1 | HARD_RATE_LIMIT                                                 | 0.000000 |
| cf1 | RATE_LIMIT_DELAY_MAX_MILLISECONDS                               | 100 |
| cf1 | ARENA_BLOCK_SIZE                                                | 0 |
| cf1 | DISABLE_AUTO_COMPACTIONS                                        | OFF |
| cf1 | PURGE_REDUNDANT_KVS_WHILE_FLUSH                                 | ON |
| cf1 | MAX_SEQUENTIAL_SKIP_IN_ITERATIONS                               | 8 |
| cf1 | MEMTABLE_FACTORY                                                | SkipListFactory |
| cf1 | INPLACE_UPDATE_SUPPORT                                          | OFF |
| cf1 | INPLACE_UPDATE_NUM_LOCKS                                        | ON |
| cf1 | MEMTABLE_PREFIX_BLOOM_BITS_RATIO                                | 0.000000 |
| cf1 | MEMTABLE_PREFIX_BLOOM_HUGE_PAGE_TLB_SIZE                        | 0 |
| cf1 | BLOOM_LOCALITY                                                  | 0 |
| cf1 | MAX_SUCCESSIVE_MERGES                                           | 0 |
| cf1 | OPTIMIZE_FILTERS_FOR_HITS                                       | ON |
| cf1 | MAX_BYTES_FOR_LEVEL_MULTIPLIER_ADDITIONAL                       | 1:1:1:1:1:1:1 |
| cf1 | COMPRESSION_TYPE                                                | kNoCompression |
| cf1 | COMPRESSION_PER_LEVEL                                           | NULL |
| cf1 | COMPRESSION_OPTS                                                | -14:-1:0 |
| cf1 | BOTTOMMOST_COMPRESSION                                          | kLZ4Compression |
| cf1 | PREFIX_EXTRACTOR                                                | NULL |
| cf1 | COMPACTION_STYLE                                                | kCompactionStyleLevel |
| cf1 | COMPACTION_OPTIONS_UNIVERSAL                                    | {SIZE_RATIO=1; MIN_MERGE_WIDTH=2; MAX_MERGE_WIDTH=4294967295; MAX_SIZE_AMPLIFICATION_PERCENT=200; COMPRESSION_SIZE_PERCENT=-1; STOP_STYLE=kCompactionStopStyleTotalSize} |
| cf1 | COMPACTION_OPTION_FIFO::MAX_TABLE_FILES_SIZE                    | 1073741824 |
| cf1 | TABLE_FACTORY::FLUSH_BLOCK_POLICY_FACTORY                       | FlushBlockBySizePolicyFactory(0x4715df0) |
| cf1 | TABLE_FACTORY::CACHE_INDEX_AND_FILTER_BLOCKS                    | 1 |
| cf1 | TABLE_FACTORY::CACHE_INDEX_AND_FILTER_BLOCKS_WITH_HIGH_PRIORITY | 0 |
| cf1 | TABLE_FACTORY::PIN_L0_FILTER_AND_INDEX_BLOCKS_IN_CACHE          | 1 |
| cf1 | TABLE_FACTORY::INDEX_TYPE                                       | 0 |
| cf1 | TABLE_FACTORY::HASH_INDEX_ALLOW_COLLISION                       | 1 |
| cf1 | TABLE_FACTORY::CHECKSUM                                         | 1 |
| cf1 | TABLE_FACTORY::NO_BLOCK_CACHE                                   | 0 |
| cf1 | TABLE_FACTORY::BLOCK_CACHE                                      | 0x470c880 |
| cf1 | TABLE_FACTORY::BLOCK_CACHE_NAME                                 | LRUCache |
| cf1 | TABLE_FACTORY::BLOCK_CACHE_OPTIONS                              |  |
| cf1 | TABLE_FACTORY::CAPACITY                                         | 536870912 |
| cf1 | TABLE_FACTORY::NUM_SHARD_BITS                                   | 6 |
| cf1 | TABLE_FACTORY::STRICT_CAPACITY_LIMIT                            | 0 |
| cf1 | TABLE_FACTORY::HIGH_PRI_POOL_RATIO                              | 0.000 |
| cf1 | TABLE_FACTORY::BLOCK_CACHE_COMPRESSED                           | (nil) |
| cf1 | TABLE_FACTORY::PERSISTENT_CACHE                                 | (nil) |
| cf1 | TABLE_FACTORY::BLOCK_SIZE                                       | 16384 |
| cf1 | TABLE_FACTORY::BLOCK_SIZE_DEVIATION                             | 10 |
| cf1 | TABLE_FACTORY::BLOCK_RESTART_INTERVAL                           | 16 |
| cf1 | TABLE_FACTORY::INDEX_BLOCK_RESTART_INTERVAL                     | 1 |
| cf1 | TABLE_FACTORY::METADATA_BLOCK_SIZE                              | 4096 |
| cf1 | TABLE_FACTORY::PARTITION_FILTERS                                | 0 |
| cf1 | TABLE_FACTORY::USE_DELTA_ENCODING                               | 1 |
| cf1 | TABLE_FACTORY::FILTER_POLICY                                    | rocksdb.BuiltinBloomFilter |
| cf1 | TABLE_FACTORY::WHOLE_KEY_FILTERING                              | 1 |
| cf1 | TABLE_FACTORY::VERIFY_COMPRESSION                               | 0 |
| cf1 | TABLE_FACTORY::READ_AMP_BYTES_PER_BIT                           | 0 |
| cf1 | TABLE_FACTORY::FORMAT_VERSION                                   | 2 |
| cf1 | TABLE_FACTORY::ENABLE_INDEX_COMPRESSION                         | 1 |
| cf1 | TABLE_FACTORY::BLOCK_ALIGN                                      | 0 |

As a reminder, MyRocks is available in Percona Server 5.7 and Percona Server 8.0; you can try it and share your experience!



MySQL Shell : the best DBA tool?

Last week I presented the following session at the pre-FOSDEM MySQL Day:

MySQL Shell : the best DBA tool ? from Frederic Descamps

The audience seemed very interested in how the MySQL Shell can be extended.

During the presentation I showed how I extended the MySQL Shell with two new modules in Python:

Both projects are on github and are waiting for ideas, feature requests, pull requests, …

Here is the video of the Innotop module as shown during the presentation:

I hope you will enjoy the MySQL Shell even more and that you will start contributing to these modules.

Miami and Santo Domingo with MySQL Document Store

The next two weeks are a big change for me compared to the last two weeks.  The last two weeks I was in Belgium and now I am in Miami before heading to the Dominican Republic.  The one thing in common is that I am speaking on the MySQL Document Store.

SunshinePHP has become a favorite conference for many PHP developers for many reasons and this morning I will be running a three hour tutorial on using the PECL MySQL X DevAPI and the MySQL Document Store with PHP and then have a one hour talk later in the show. Plus I am joined by the amazing Kathy Forte who will speak on Driving Innovation With MySQL 8.0 and Docker.

Pycaribbean is a new show hoping to foster a growing developer community in the area, and I will be showing how to use the MySQL Document Store with Python.

MySQL Connector/Python 8.0.15: allow_local_infile Disabled by Default

MySQL Server has a feature where you can insert data into a table from a file with the data delimited by commas, tabs, or another delimiter. This is particularly useful when you need to bulk import data, for example when restoring a backup or migrating data from one system to another, including from a database product other than MySQL. The mysqldump backup utility is an example of a program that supports exporting the data to delimited text files.

The statement to load the data is LOAD DATA INFILE. By default the file must be server-side and MySQL Server will load it without involving the connection (other than for submitting the query and returning the result). However, there is also an optional keyword LOCAL that can be used to tell MySQL that the file is located client-side: LOAD DATA LOCAL INFILE. It is this local variant that is the topic of this blog.

By default, MySQL Server 8 does not allow loading client-side files using the LOAD DATA LOCAL INFILE statement. (Similarly, the mysql command-line client has the feature disabled by default.) Why? The reason is that there are security implications to enabling the feature. For that reason, it is recommended to disable the feature unless you know you need it, and then only allow it for as short a time as possible.

Advice

The server-side variant is safer as long as the server-side file system is secured. Additionally, MySQL will only read server-side files from the directory (including subdirectories) specified with the secure_file_priv option. Make sure the option is not set to an empty value as that will allow MySQL to read any file.
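If you are unsure what the server currently allows, you can check both variables from MySQL Connector/Python. This is a small, hedged sketch; it assumes connection options live in a my.ini file, mirroring the example further down:

import mysql.connector

db = mysql.connector.connect(option_files="my.ini")
cursor = db.cursor()
# local_infile controls the LOCAL variant, secure_file_priv the server-side one
cursor.execute(
    "SHOW GLOBAL VARIABLES WHERE Variable_name IN ('local_infile', 'secure_file_priv')"
)
for name, value in cursor:
    print(name, "=", value)
cursor.close()
db.close()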

In MySQL Connector/Python, the mysql.connector module includes the option allow_local_infile which determines whether MySQL Connector/Python will allow the server to request a client-side file. In version 8.0.15 the default value for allow_local_infile has been changed from True to False to align with the default value for MySQL Server and the mysql command-line client.

This means that if you attempt to execute LOAD DATA LOCAL INFILE in MySQL Connector/Python 8.0.15 without explicitly enabling it, a ProgrammingError exception is raised:

mysql.connector.errors.ProgrammingError: 1148 (42000): The used command is not allowed with this MySQL version

(Admittedly, the error message is not particularly good at telling you what you are doing wrong.) What do you do if you need to load client-side data files? These are the steps:

  1. Enable the local_infile option in MySQL Server. This only exists for the global scope. You can temporarily enable it with SET GLOBAL local_infile = ON;, then disable it again when you are done by setting the variable to OFF.
  2. Enable allow_local_infile in your MySQL Connector/Python program. I will show an example of how to do this below. Again, it is recommended to only have it enabled when needed.

An example of creating a connection with allow_local_infile enabled is:

import mysql.connector

# Create connection to MySQL
db = mysql.connector.connect(
    option_files="my.ini",
    allow_local_infile=True
)

This example uses an INI formatted configuration file to specify the connection options, and the allow_local_infile option is added explicitly.
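With the connection above in place, executing the statement itself could look like the following hedged sketch; the table name data_import and the file path /tmp/data.csv are placeholders for this illustration:

cursor = db.cursor()
# the file is read on the client and streamed to the server
cursor.execute(
    "LOAD DATA LOCAL INFILE '/tmp/data.csv' "
    "INTO TABLE data_import "
    "FIELDS TERMINATED BY ',' "
    "LINES TERMINATED BY '\\n'"
)
db.commit()
cursor.close()
db.close()

Remember to set local_infile back to OFF on the server once the import is done.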

If you are interested in learning more about using LOAD DATA [LOCAL] INFILE with MySQL Connector/Python, then my book MySQL Connector/Python Revealed (Apress) has a section describing how to load data from files including an example. The book is available from Apress (print and DRM free ePub+PDF), Amazon (print and Kindle), Barnes & Noble (print), and others.

Variable number of parameters and optional parameters in MySQL/MariaDB procedures

Hope your procedures don’t look like this

In most programming languages, a function can have optional parameters that can be passed or not. Or it can accept any number of parameters. I will globally call these features dynamic parameters. In this article we’ll see if, and how, we can use dynamic parameters in MySQL and MariaDB stored procedures.

What dynamic parameters are

But first, let me write a paragraph for those who don’t know what I’m talking about. Let’s see a couple of examples in the PHP language – I’ve chosen it because it’s very widely used and it has these features.

function makeCoffee($coffeeType = NULL) {
    ...
}

makeCoffee();
makeCoffee('italian');

Or they can do something like this:

function sumSomeNumbers() {
    $numArgs = func_num_args();
    echo "Summing $numArgs numbers \n";
    $argList = func_get_args();
    $r = 0;
    for ($i = 0; $i < $numArgs; $i++) {
        $r += $argList[$i];
    }
    return $r;
}

In the first example, there is a single parameter, and it’s optional. If it’s not specified, its value will be NULL. In the second example, zero or more parameters are accepted. In this case parameters don’t have a name, but they can be read with a function called func_get_args(), while their number can be obtained with func_num_args(). If we look at C language, we have an array argv and an integer argc, but the basic idea is the same.

Dynamic parameters in MySQL and MariaDB

Bad news: these nice features don’t exist in MySQL and MariaDB.

But we can use them anyway, with recent versions. After all, some features allow us to pass a procedure a single structured value, that can nest any number of values. These features are:

  • MySQL 5.7 JSON: MySQL has a real JSON type, and several functions that allow us to work with it. In MySQL 8 we can even turn the JSON value into a table, with the JSON_TABLE() function.
  • MariaDB 10.2 JSON: MariaDB doesn’t have a JSON type, it uses BLOB instead. But it has a set of JSON functions, mostly compatible with MySQL. It doesn’t support JSON_TABLE(). Function differences between MariaDB and MySQL are documented in several pages, and they have a JSON section.
  • MariaDB dynamic columns: This set of functions operates on a BLOB value. This is actually treated as a semi-structured value, containing any number of columns, without a fixed structure.

So basically we have two options: JSON or dynamic columns. I think that JSON is usually the best option, so I will use it in this post, testing the examples on MariaDB for better compatibility.

A JSON-based solution

Let’s see how we can have optional parameters and a variable number of parameters using JSON documents.

Optional parameters

Optional parameters are just named parameters that could be specified or not. We can achieve this with a JSON object:

CREATE FUNCTION table_exists(params BLOB)
    RETURNS BOOL
    NOT DETERMINISTIC
    READS SQL DATA
    COMMENT '
        Return whether a table exists.
        Parameters must be passed in a JSON document:
          * schema (optional): Schema that could contain the table.
            By default, the schema containing this procedure.
          * table: Name of the table to check.
    '
BEGIN
    DECLARE v_table  VARCHAR(64) DEFAULT JSON_UNQUOTE(JSON_EXTRACT(params, '$.table'));
    DECLARE v_schema VARCHAR(64) DEFAULT JSON_UNQUOTE(JSON_EXTRACT(params, '$.schema'));

    IF v_schema IS NULL THEN
        RETURN EXISTS (
            SELECT TABLE_NAME
                FROM information_schema.TABLES
                WHERE TABLE_SCHEMA = SCHEMA()
                    AND TABLE_NAME = v_table
        );
    ELSE
        RETURN EXISTS (
            SELECT TABLE_NAME
                FROM information_schema.TABLES
                WHERE TABLE_SCHEMA = v_schema
                    AND TABLE_NAME = v_table
        );
    END IF;
END;

And now, let’s test it:

MariaDB [test]> SET @params := JSON_OBJECT(
    ->     'table', 'TABLES'
    -> );
Query OK, 0 rows affected (0.000 sec)

MariaDB [test]> SELECT table_exists(@params);
+-----------------------+
| table_exists(@params) |
+-----------------------+
|                     0 |
+-----------------------+
1 row in set (0.001 sec)

MariaDB [test]> SET @params := JSON_OBJECT(
    ->     'schema', 'information_schema',
    ->     'table', 'TABLES'
    -> );
Query OK, 0 rows affected (0.001 sec)

MariaDB [test]> SELECT table_exists(@params);
+-----------------------+
| table_exists(@params) |
+-----------------------+
|                     1 |
+-----------------------+
1 row in set (0.001 sec)

 Jolly good, it works!

What we did here is simple. We’ve created a JSON object with JSON_OBJECT(), and we’ve read its properties with JSON_EXTRACT(). The JSON_EXTRACT() function takes a JSON document followed by a path parameter. To access an object’s members, the path syntax is '$.property_name'.
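Building the JSON document client-side works just as well. Here is a hedged sketch using MySQL Connector/Python; it assumes the table_exists() function above has been created and that the connection options live in a my.ini file:

import json
import mysql.connector

conn = mysql.connector.connect(option_files="my.ini")
cursor = conn.cursor()

# the whole "parameter list" travels as one JSON document
params = json.dumps({"schema": "information_schema", "table": "TABLES"})
cursor.execute("SELECT table_exists(%s)", (params,))
print(cursor.fetchone()[0])   # 1, because the table exists

cursor.close()
conn.close()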

Notes on MySQL syntax

In MySQL we must use JSON for the params parameter, instead of BLOB. Otherwise, we will get an error:

ERROR 3144 (22032): Cannot create a JSON value from a string with CHARACTER SET 'binary'

Also, we can use some syntax sugar. Instead of:

JSON_UNQUOTE(JSON_EXTRACT(params, '$.table'))

we can write:

params->>'$.table'

Syntax sugar tends to hide some meaningful complexity, and this particular one is only supported in MySQL. But still.

Variable number of parameters

As mentioned before, a variable number of parameters can just be implemented with an array of parameters. Here’s a JSON example:

CREATE FUNCTION number_stats(in_numbers BLOB)
    RETURNS INTEGER
    NOT DETERMINISTIC
    CONTAINS SQL
    COMMENT 'Accept an array of integers and return their median'
BEGIN
    DECLARE v_count INT UNSIGNED DEFAULT JSON_LENGTH(in_numbers);
    RETURN JSON_EXTRACT(
        in_numbers,
        CONCAT('$[', FLOOR(v_count / 2), ']')
    );
END;

What we did here is not much different from the former example. We used the JSON_ARRAY() function instead of JSON_OBJECT(). Then we used JSON_LENGTH() to find the number of items in the array. It is worth noting that this function also works with objects. Also, if we accept nested arrays/objects as parameters, it still serves the purpose because it is not recursive. Finally, we used JSON_EXTRACT() with a different syntax, to extract an element of the array: '$[10]'.
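The same trick works from a client program. Another hedged sketch, with the same assumptions as the earlier Python example (number_stats() created, connection options in my.ini):

import json
import mysql.connector

conn = mysql.connector.connect(option_files="my.ini")
cursor = conn.cursor()

# any number of "parameters", packed into one JSON array
numbers = json.dumps([3, 1, 4, 1, 5, 9, 2])
cursor.execute("SELECT number_stats(%s)", (numbers,))
print(cursor.fetchone()[0])   # the element at the middle index of the array

cursor.close()
conn.close()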

Validating parameters

You may have noticed that one of the problems with this solution is that there is no automated validation. A normal parameter must be of a certain type. If the type is not a string type, and the procedure was created with a strict SQL_MODE, trying to pass a wrong type will result in an error. A JSON document however does not have a schema, so its properties can be of any type.

However, there are still some ways to do some basic validation:

  • MySQL JSON type automatically validates documents. In MariaDB we can use the JSON_VALID() function.
  • We can assume that the output of JSON_OBJECT() and JSON_ARRAY() is always valid (or NULL).
  • For nested objects/arrays, we can also use JSON_TYPE(), which returns the JSON type of a property/element.
  • Sometimes, JSON_LENGTH() will be sufficient.
Conclusions

Optional parameters and variable number of parameters are not really implemented for MySQL/MariaDB stored procedures.

Why? MySQL initially implemented the minimal procedural language set that was necessary for the SAP world. Unfortunately, they never went much beyond that. Very basic features are missing, like external languages, or a real debugger. Even performance is quite poor (though I’d argue that stored procedures are by nature the most performant solution in some cases).

Those flaws are also in MariaDB. Yet… there is some hope for the future. MariaDB implemented a parser for Oracle procedures – even though I think it is not yet properly documented. Probably this feature was only developed to make migrations from Oracle to MariaDB easier. But they added other minor features, like the EXECUTE IMMEDIATE statement and anonymous blocks.

MySQL tends to implement interesting MariaDB features – and to be clear, the opposite is also true. So, at some point, their competition may extend to this area.

Optional parameters and variable number of parameters are good examples of missing features that can easily be emulated by the users. They make stored procedures more flexible, allowing us to develop procedures that are easier to use and adapt to more use cases. This is very important when developing procedures designed to be reused in different contexts, like an open source library.

If you have more ideas about this topic, please share your thoughts – or your code – in a comment.

Toodle pip,
Federico

 


The post Variable number of parameters and optional parameters in MySQL/MariaDB procedures appeared first on Federico Razzoli.

DevOps for the Database

Why is it hard to apply DevOps principles and practices to databases, and how can we get better at it? This talk explores real-life stories that answer these two questions, through the perspectives of teams that have changed the entrenched culture, processes, and tooling—and those who’ve tried.

Along the way, we’ll cover topics including:

  • What the research shows about DevOps, databases, and company performance
  • Current and emerging trends in how we build and manage data tiers, and implications
  • The traditional dedicated DBA role, and what has happened as a result
  • What it takes to change from a DBA-centric culture, to one where database-related competencies and responsibilities are more distributed
  • Why some teams succeed in this transformation, while others fail
  • How to make these approaches work for MySQL, PostgreSQL, SQL Server, and others

We can apply DevOps principles to the database, and our work will be better for it. This talk will show you how.

Percona Responds to MySQL LOCAL INFILE Security Issues

In this post, we’ll cover Percona’s thoughts about the current MySQL community discussion happening around MySQL LOCAL INFILE security issues.

Some of the detail within this blog post is marked <REDACTED>. I hope to address this shortly (by the end of Feb 2019) and provide complete detail and exploit proof-of-concept code. However, this post is released given the already public discussion of this particular issue, with the exploitation code currently redacted to ensure forks of MySQL client libraries have sufficient time to implement their response strategies.

Check back at the end of the month to see updates to this post!

Background

MySQL’s LOCAL INFILE  feature is fully documented by Oracle MySQL, and there is a legitimate use for the LOCAL INFILE feature to upload data to a MySQL server in a single statement from a file on the client system.

However, some MySQL clients can be coerced into sending file contents local to the machine they are running upon, without having issued a LOCAL INFILE directive. This appears to be linked to how the Adminer PHP web interface was attacked: it was pointed at a maliciously crafted MySQL service in order to extract file data from the host on which Adminer was deployed. This malicious “server” has, it would appear, existed since early 2013.

The attack requires the use of a malicious/crafted MySQL “server”, to send a request for the file in place of the expected response to the SQL query in the normal query response flow.

If, however, the client checks for the expected response, there is no file exfiltration without further effort. This was noted with Java and ProxySQL testing, as a specific response was expected, and not sending the expected response would cause the client to retry.

I use the term “server” loosely here, as often this is simply a service emulating the MySQL v10 protocol, and does not actually provide complete MySQL interaction capability—though this is theoretically possible, given enough effort or the adaptation of a proxy to carry out this attack whilst backing onto a real MySQL server for the interaction capability.

For example, the “server” always responds OK to any auth attempt, regardless of credentials used, and doesn’t interpret any SQL sent. Consequently, you can send any string as a query, and the “server” responds with the request for a file on the client, which the client dutifully provides if local_infile is enabled.

There is potential, no doubt, for a far more sophisticated “server”. However, in my testing I did not go to this length, and instead produced the bare minimum required to test this theory—which proved to be true where local_infile was enabled.

The attack flow is as follows:
  1. The client connects to the MySQL “server” and performs MySQL protocol handshaking to agree on capabilities.
  2. Authentication handshake (the “server” often accepts any credentials passed to it).
  3. The client issues a query, e.g. SET NAMES, or other SQL (the “server” ignores this and immediately responds with the file-request packet described in step 4).
  4. The “server” responds with a packet that is normally reserved for when it has been issued a LOAD DATA LOCAL INFILE ... SQL statement (first payload byte 0xFB); see the sketch after this list.
  5. IF vulnerable, the client responds with the full contents of the requested file path, if it is present on the local file system and permissions allow it to be read.
    1. Client handling here varies: the client may drop the connection with a malformed-packet error, or continue.
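To make steps 4 and 5 concrete, below is a minimal, purely defensive Python sketch (not the redacted PoC) of how a client-side check could parse a MySQL response packet and refuse an unsolicited file request. It assumes the standard MySQL wire format (3-byte little-endian payload length, 1-byte sequence id, then the payload) and that 0xFB as the first payload byte marks a LOCAL INFILE request; the name expecting_local_infile is hypothetical client state, and tracking it is essentially the approach MariaDB describes for its client-side fix below.

# Purely illustrative: refuse an unsolicited MySQL LOCAL INFILE request.
# Assumes the standard MySQL packet layout; not tied to any particular client library.
LOCAL_INFILE_MARKER = 0xFB  # first payload byte of a "send me this file" request

def parse_packet(raw: bytes):
    """Split a raw MySQL packet into (sequence_id, payload)."""
    if len(raw) < 4:
        raise ValueError("short packet")
    length = int.from_bytes(raw[0:3], "little")  # 3-byte little-endian payload length
    return raw[3], raw[4:4 + length]

def handle_response(raw: bytes, expecting_local_infile: bool) -> None:
    """Only honour a file request if the client itself sent LOAD DATA LOCAL INFILE.

    expecting_local_infile is hypothetical client state: True only after the
    application issued a LOAD DATA LOCAL INFILE statement on this connection.
    """
    seq, payload = parse_packet(raw)
    if payload and payload[0] == LOCAL_INFILE_MARKER:
        requested_path = payload[1:].decode("utf-8", errors="replace")
        if not expecting_local_infile:
            # A malicious "server" can send this in response to *any* query.
            raise RuntimeError(f"unsolicited LOCAL INFILE request for {requested_path!r}; refusing")
        # ... a legitimate client would stream the file contents back here ...

# Example: the kind of unsolicited request a malicious "server" might send.
payload = b"\xfb/etc/passwd"
fake = len(payload).to_bytes(3, "little") + b"\x01" + payload
handle_response(fake, expecting_local_infile=False)  # raises RuntimeError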
Exploitation testing

The following MySQL clients were tested via their respective Docker containers with default configurations; the bash script which orchestrated this is as follows: <REDACTED>

This tests the various forks of the MySQL client; along with some manual testing, the results were:

  • Percona Server for MySQL 5.7.24-26 (Not vulnerable)
    • PS 5.7.x aborts after server greeting
  • Percona Server for MySQL 5.6.42-64.2 (Not vulnerable)
    • PS 5.6 accepts the server greeting, proceeds to log in, and aborts without handling the malicious payload.
  • MariaDB 5.5
    • Susceptible to LOCAL INFILE abuse in testing
      • MariaDB has stated they will release a fix that tracks in the client whether the SQL for LOAD DATA LOCAL INFILE was requested, and otherwise drops the server request without handling it.
  • MariaDB 10.0
    • Susceptible to LOCAL INFILE abuse in testing
      • MariaDB has stated they will release a fix that tracks in the client whether the SQL for LOAD DATA LOCAL INFILE was requested, and otherwise drops the server request without handling it.
  • MariaDB 10.1.37
    • Susceptible to LOCAL INFILE abuse in testing
      • MariaDB has stated they will release a fix that tracks in the client whether the SQL for LOAD DATA LOCAL INFILE was requested, and otherwise drops the server request without handling it.
  • MariaDB 10.4.1
    • Susceptible to LOCAL INFILE abuse in testing
      • MariaDB has stated they will release a fix that tracks in the client whether the SQL for LOAD DATA LOCAL INFILE was requested, and otherwise drops the server request without handling it.
  • MySQL 5.7 (Not vulnerable by default)
    • Not susceptible to LOCAL INFILE abuse by default; enabling local_infile, however, makes it susceptible.
  • MySQL 5.6 (Not vulnerable by default)
    • Not susceptible to LOCAL INFILE abuse by default; enabling local_infile, however, makes it susceptible.
  • MySQL 8.0.14 (Not vulnerable by default)
    • Not susceptible to LOCAL INFILE abuse by default; enabling local_infile, however, makes it susceptible.
  • PHP 7 mysqli
    • Depends on libmysqlclient in use (As PHP’s mysqli is a C wrapper of the underlying library).
  • Ruby
    • Depends on libmysqlclient in use
    • Note: I couldn’t get this to build on my laptop due to a reported syntax error in mysql.c. However, given this wraps libmysqlclient, I would suggest the result would likely mirror PHP’s test.
  • ProxySQL
    • Underlying library is known susceptible to LOCAL INFILE abuse.
    • ProxySQL issues SQL to the backend MySQL server, as well as protocol commands such as PING, and expects a specific result for the queries it issues. This makes it difficult for the malicious server to remain generic; a malicious service that specifically targets ProxySQL is likely possible, however this has not been explored at this time.
  • Java
    • com.mysql.jdbc.Driver
      • As with ProxySQL, in testing this driver issues “background” SQL and expects a specific response. While it is theoretically possible to have a malicious service target this driver, this has not been explored at this time.
  • Connector/J

There are many more clients out there, ranging from protocol-compatible implementations to wrappers of the underlying C library.

Do your own research to ensure you are taking appropriate measures should you choose, or need, to mitigate this risk in your controls.

Can/Should this be fixed?

This is a particularly tricky issue to correct in code, as the MySQL client needs to be aware that a LOAD DATA LOCAL INFILE SQL statement was actually sent; MariaDB’s proposed fix implements this. Even then, if a stored procedure issues a file request via LOAD DATA LOCAL INFILE ..., the client has no awareness of this being needed until the packet with the request is received, and local_infile can be abused. However, the intent of the feature is to allow data to be loaded, and as such DBAs/admins should seek to employ compensating controls to reduce the risk to their organization:

Mitigation
  • DO NOT implement any stored procedures which trigger a LOAD DATA LOCAL INFILE.
  • Close/remove/secure access to ANY web admin interfaces.
    • Remember, security through obscurity is no security at all. This only delays time to access, it does not prevent access.
  • Deploy mandatory access controls
    • SELinux, AppArmor, GRSecurity, etc. can all help to ensure your client is not reading anything unexpected, lowering your risk of exposure through proper configuration.
  • Deploy Egress controls on your application nodes to ensure your application server can only reach your MySQL service(s) and does not attempt to connect elsewhere (As the exploit requires a malicious MySQL service).
    • Iptables/firewalld/ufw/pfsense/other firewall/etc.
    • This ensures that your vulnerable clients are not connecting to anything you do not know about.
    • This does not protect against a skilled adversary. Your application needs to communicate out to the internet to serve pages; running a malicious MySQL service on a suitably high random port can help to “hide” this network traffic.
  • Be aware of Domain Name Service (DNS) rebinding attacks if you are using a Fully Qualified Domain Name (FQDN) to connect between application and database server. Use an IP address or socket in configurations if possible to negate this attack.
  • Deploy MySQL Transport Layer Security (TLS) configuration to ensure the server you expect requires the use of TLS during connection. Set your client (if possible) to VERIFY_IDENTITY so that TLS “fails closed” if the client fails to negotiate TLS, and so that basic identity checking of the server being connected to is performed.
    • This will NOT dissuade a determined adversary who has a presence in your network long enough to perform certificate spoofing (in theory), and nothing but time to carry this out.
    • sslstrip-style attacks can also lead to issues if your configuration “fails open”; as such it is imperative you have:
      • In my.cnf: ssl_mode=VERIFY_IDENTITY
      • On the CLI: --ssl-mode=VERIFY_IDENTITY
      • Be aware: this performs verification of the CA (Certificate Authority) and the certificate hostname; this can lead to issues if you are using self-signed certificates and the CA is not trusted.
    • This is ONLY an issue if an adversary has the capability to man-in-the-middle your Application <-> MySQL servers;
      • If they have this capability, this feature abuse is only a single avenue of data exfiltration they can perform.
  • Deploy a Network Intrusion Detection System
    • There are many open source software (OSS) options, for example Snort or Suricata (see the rule example below).
    • Set alerts on the logs, curate a response process to handle these alerts.
  • Client option mitigation may be possible; however, this varies from client to client and from one underlying library to another (a hedged example for a Python client follows this list).
    • MariaDB client binary.
      • Add to my.cnf: local_infile = 0
      • Or set --local-infile=0 on the command line
    • PHP / Ruby / Anything that relies on libmysqlclient
      • Replace libmysqlclient with a version that does not enable local_infile by default
        • This can be difficult, so ensure you test your process before running anything on production!
      • Switch to PDO MySQL over MySQLi (the PDO implementation implicitly sets local_infile to 0, at the time of writing, in PHP’s C code).
        • Author’s note: mysqli_options($conn, MYSQLI_OPT_LOCAL_INFILE, false); failed to mitigate this in testing, YMMV (Your Mileage May Vary).
        • Attempting to set a custom handler to return nothing also failed to mitigate this. Again, YMMV.
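Because the client-option story differs per connector, here is a minimal client-side sketch in Python showing the same ideas (disable local_infile, and make TLS fail closed). It assumes the mysql-connector-python package; the parameter names allow_local_infile and ssl_verify_identity belong to that connector, the host, credentials, and CA path are hypothetical, and other drivers expose similar but differently named options, so treat this as an illustration rather than a drop-in recipe.

# Illustrative client-side mitigation using mysql-connector-python.
import mysql.connector

conn = mysql.connector.connect(
    host="db.example.internal",               # hypothetical host
    user="app",
    password="secret",
    database="app",
    allow_local_infile=False,                  # never answer a server-initiated LOCAL INFILE request
    ssl_verify_identity=True,                  # "fail closed": verify server certificate and hostname
    ssl_ca="/etc/ssl/certs/company-ca.pem",    # hypothetical CA bundle
)

cur = conn.cursor()
cur.execute("SELECT 1")                        # ordinary traffic is unaffected
print(cur.fetchall())
conn.close()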
IDS Rule example

Here I provide an example “FAST” format rule for your IDS/IPS system.

Note, however, YMMV: this works with Snort and Suricata, and _may_ work with Zeek (formerly Bro), OSSEC, etc. Please test and adapt as needed.

alert tcp any any <> any any (msg:"MySQL LOCAL INFILE request packet detected"; content:"|00 00 01 FB|"; rawbytes;)

Note this is only an example; it does not detect packets flowing over TLS connections.

If you are running an Intrusion Prevention System (IPS), you should change the rule action from alert to drop.

Here the rule is set to any any, as an adversary may choose not to use port 3306 in an attempt to avoid detection; you can of course change this as desired to suit your needs.

You must also assess whether your applications use local_infile legitimately, and conduct your own threat modeling and impact analysis, prior to implementing such a rule.

Note that increasing the “noise” threshold for your team will likely only result in the team becoming desensitized to the “noise” and potentially missing an important alert as a result.

For example, you could modify the left and right side any any so that the rule matches traffic between your internal network range and anything outside it:

alert tcp 192.168.1.0/24 any <> !192.168.1.0/24 any (msg:"MySQL LOCAL INFILE request packet detected"; content:"|00 00 01 FB|"; rawbytes;)

Adapting to your environment is key for this IDS rule to be effective.
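If you want to prototype detection before committing a rule to your IDS, a rough equivalent of the rules above can be sketched in Python with scapy, matching the same |00 00 01 FB| byte pattern (two zero high bytes of the length field, sequence id 1, then the 0xFB marker). This is an illustration only: it assumes scapy is installed, it needs privileges to sniff, like the rules it sees nothing inside TLS connections, and it is not a substitute for a tuned IDS deployment.

# Rough prototype of the IDS rule using scapy; illustrative only.
from scapy.all import sniff, Raw, TCP, IP

LOCAL_INFILE_PATTERN = b"\x00\x00\x01\xfb"  # same bytes the rule above matches on

def inspect(pkt):
    # Flag any TCP payload containing the LOCAL INFILE request pattern.
    if pkt.haslayer(IP) and pkt.haslayer(TCP) and pkt.haslayer(Raw):
        if LOCAL_INFILE_PATTERN in bytes(pkt[Raw].load):
            print(f"possible MySQL LOCAL INFILE request: "
                  f"{pkt[IP].src}:{pkt[TCP].sport} -> {pkt[IP].dst}:{pkt[TCP].dport}")

# Any port, as discussed above; tighten the BPF filter to suit your environment.
sniff(filter="tcp", prn=inspect, store=0)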

Further reading

As noted, this issue is already being publicly discussed; as such, I add links here to sources relevant to this discussion and exploitation.

Exploitation Network flow

<REDACTED>

Thanks

This assessment was not a single-person effort; here I would like to link to, and give thanks where appropriate to, the following individuals who have helped with this investigation:

Willem de Groot – for sharing insights into the Adminer exploitation and for graciously responding to an inquiry from me (this helped me get the PoC working, thank you).

<REDACTED> – original author of <REDACTED> (in 2013!), from which I was able to adapt to function for this investigation.

Ceri Williams – for helping me with ProxySQL testing.

Marcelo Altman – for discussing MySQL protocol in depth.

Sergei Golubchik – for responding to my email notice for MariaDB and implementing a workaround mitigation so quickly, as well as providing me with a notice of the Connector/J announcement URL.

Peter Zaitsev – for linking me to the original reddit discussion and for feedback.
