Quick Start with MySQL Cluster

MySQL Cluster (NDB Storage Engine) is a shared-nothing distributed in-memory database with support for disk-data tables as well. It provides a DB environment with high availability, near-linear scalability, realtime performance, automatic and user defined data partitioning, load balancing and transparent failover.

The data is distributed using a partitioning key (hash calculated on primary key as default) so the performance is especially great for primary key lookups and parallel data access.

In a few steps we will install, configure and run a simple 2 Data Node MySQL Cluster setup.

First an introduction to the different node types that makes a MySQL Cluster.

MySQL Cluster Node Types

A node in MySQL Cluster terminology is a software process. There are primarily 4 node types.

  • SQL Node (mysqld)
    is the SQL interface provided by the MySQL Server.
  • API Node
    is an application that uses the native NDB API. The SQL Node is "basically" an API Node.
  • Data Node (ndbd)
    is responsible for storing data and handling transactions.
  • Management Node (ndb_mgmd)
    is the management server which provides the cluster configuration and the cluster log

 Below is picture of a very simple typical 2 Data Node deployment.

2 Data Node deployment

MySQL Cluster Installation

For now I will only cover installation of MySQL Cluster. Personally I prefer installing from tar files and not using rpm or other packaging tools because it gives you better control over the installation, so that's what we are going to use.

  1. Download the .tar.gz (non RPM version) from http://dev.mysql.com/downloads/cluster
    I'm on OS X so I'll use 'mysql-cluster-gpl-6.3.20-osx10.5-x86_64.tar.gz'
  2. Create separate folders for the installation files and for the data storage
        $ mkdir ~/db
        $ mkdir ~/db/etc # this is where we'll put our config files later
        $ mkdir -p ~/db/data/mysql
        $ mkdir ~/db/data/mysql-cluster
  3. Extract the tar file into the installation directory
        $ cd ~/db
        $ tar zxvf mysql-cluster-gpl-6.3.20-osx10.5-x86_64.tar.gz
  4. Create a symbolic link 'mysql' (easy to upgrade or revert versions)
        $ ln -s mysql-cluster-gpl-6.3.20-osx10.5-x86_64 mysql
  5. Initialize the MySQL data directory and system tables
        $ cd mysql
        $ scripts/mysql_install_db --basedir=<homedir>/db/mysql --datadir=<homedir>/db/data/mysql
  6. Define a MYSQL_HOME and set paths
        $ export MYSQL_HOME=~/db/mysql
        $ export PATH=$MYSQL_HOME/bin:$PATH
        $ export PATH=$MYSQL_HOME/libexec:$PATH  # if installing using own source builds!

MySQL Cluster 7.0 is just around the corner. One new feature is better support for multi-core/threaded environments such as Sun's CMT servers. It will initially support up to 8 cores. There will be an additional binary called ndbmtd, which is the multithreaded version that is intended for use on multi-core systems.
The only change to the cluster configuration file is to add 'MaxNoOfExecutionThreads' to specify the maximum number of threads to use. If omitted ndbmtd will act as "single" threaded.

Configuration files

Next we will create two configuration files, one for the SQL Node, i.e., the MySQL Server and one for MySQL Cluster a.k.a NDB storage engine. The cluster configuration consist of 2 Data Nodes, 1 Management Node, 1 SQL Node and 7 API Nodes. The total amount of memory allocated for the data is 2GB, i.e,. 1GB per Data Node however change the configuration to match you environment. All running on localhost.

There are loads of cluster parameters to tune so check out the online manual for those.

  1. Minimal MySQL server configuration file, my.cnf
    # The MySQL server
    [mysqld]
    port = 3306
    socket = /tmp/mysql.sock basedir=/Users/alyu/Projects/db/mysql
    datadir = <homedir>/data/mysql
    # Enable NDB storage engine
    ndbcluster
    # Location of the management server
    ndb-connectstring=localhost
  2. Minimal MySQL Cluster configuration file, 2-node-config.ini (edit DataDir and FileSystemPath)
    [NDBD DEFAULT]
    # Number of copies of the data (2 is the default and recommended setting)
    NoOfReplicas: 2
    # trace files, logs, pid files
    DataDir: <homedir>/db/data/mysql-cluster
    # Metadata, redo/undo(for disk) logs and data files
    # must be created before the data nodes are started
    FileSystemPath: <homedir>/db/data/mysql-cluster
    # Data Memory for data contents and ordered indexes
    DataMemory: 1024M
    # Index memory, usually 1/6 or 1/8 of DataMemory is enough but depends on
    # number of unique hash indexes
    IndexMemory: 128M
    DataDir: /data/mysql-cluster
    # Size of each REDO log fragment,
    # there are 4 log fragments per log file
    # Bigger is better for high write load and should be change from default
    # when DataMemory > 512MB
    FragmentLogFileSize=256M
    # Set to 6xDataMemory/4*FragmentLogFileSize (in MB)
    # Good heuristics recommendation
    # NoOfFragmentLogFiles = 6*DataMemory/4*FragmentLogFileSize
    NoOfFragmentLogFiles=6
    # Redo log buffer is used to log activities
    # If you receive REDO log buffers overloaded try increase this value
    Redouffer = 32MB
    # Amount of memory used for logs buffers, disk operations, undo files etc
    # Change this from default 20M to a larger value when using Disk Data Tables
    #SharedGlobalMemory = 384M
    # Amount of memory used for caching pages on disk, default 64M
    # Set this as high as possible when using lots of disk data tables
    # DiskPageBufferMemory =
    # 0.8 x (total memory – (OS memory + [buffer memory + DataMemory + IndexMemory])
    #DiskPageBufferMemory = 500M
    # Management Node/Server
    [NDB_MGMD]
    Id: 1
    HostName: localhost
    # Default arbitrator
    ArbitrationRank: 1
    # Data Nodes
    [NDBD]
    Id: 2
    HostName: localhost
    [NDBD]
    Id: 3
    HostName: localhost
    # API nodes like the SQL Node, C++/Java applications using NDB API
    # Defining 8 API nodes
    # [MYSQLD] and [API] are interchangeable
    [MYSQLD]
    Id: 4
    HostName: localhost
    [API]
    Id: 5
    HostName: localhost
    [API]
    Id: 6
    # HostName defaults to localhost if omitted
    [API]
    Id: 7
    [API]
    Id: 8
    [API]
    # Id is automatically assigned if omitted
    [API]
    [API]
  3. Put the configuration files 'my.cnf' and '2-node-config.ini' in the 'db/etc' folder

If you are planning to use MySQL Cluster 7.0's multithreaded version 'ndbmtd' then you need to add 'MaxNoOfExecutionThreads' to the [NDBD DEFAULT] section in the cluster configuration.

The recommended values are:

Cores MaxNoOfExecutionThreads
2 2
4 4
8 or more 8
Starting MySQL Cluster

The first node that needs to be started is the Management Node because it provides the cluster configuration which all the other nodes need and ask for when they start up.

  1. Start the Management Node
        $ cd ~/db
        $ ndb_mgmd -f etc/2-node-config.ini
  2. Start the Data Nodes, as many as there are defined in the cluster configuration
        $ ndbd # or ndbmtd for MySQL Cluster 7.0 multithreaded version
        $ ndbd # or ndbmtd for MySQL Cluster 7.0 multithreaded version
  3. Start the SQL Node/MySQL server
        $ mysqld --defaults-file=etc/my.cnf & # or use the mysqld_safe script
  4. Use the Management client and check the Data Nodes startup progress, wait until all Data Nodes are in started state
        $ ndb_mgm -e "all status"
        Connected to Management Server at: 127.0.0.1:1186
        Node 2: started (mysql-5.1.30 ndb-6.3.20)
        Node 3: started (mysql-5.1.30 ndb-6.3.20)
  5. Create a cluster table (NDB storage engine)
        $ mysql -uroot
        mysql> create table t1 (id int not null auto_increment, city varchar(30)
             > primary key(id)) engine=ndb;
        mysql> insert into t1 values (null, 'Stockholm');
        mysql> insert into t1 values (null, 'Tokyo');
        mysql> select * from t1;
        +----+-----------+
        | id | city      |
        +----+-----------+
        |  1 | Stockholm | 
        |  2 | Tokyo     | 
        +----+-----------+
        2 rows in set (0.00 sec)
    

And that's all. Now you have a 2 data node cluster installed, running and ready to play with!

Command Line Tools

Finally some ndb tools that can be useful. They are all located in $MYSQL_HOME/bin directory.

  • ndb_mgm
    Cluster management client which you use to monitor the cluster, set cluster log filters, start backups etc.
  • ndb_show_tables
    Lists all the tables in the cluster
  • ndb_desc <table name>
    Shows cluster table descriptions and partition info
  • ndb_restore
    Restore data from backups
  • ndb_select_all
    select * from <table name>
  • ndb_select_count
    select count(*) from <table name>
MySQL Cluster Links

Getting Started with MySQL Cluster
JOHAN ANDERSSON'S CLUSTER AND HA BLOB
severalnines.com
Big DBA Head!
Mikael Ronström's blog
Jonas Oreland's Cluster blog

Tags: 

Add new comment

Plain text

  • No HTML tags allowed.
  • Quick Tips:
    • Two or more spaces at a line's end = Line break
    • Double returns = Paragraph
    • *Single asterisks* or _single underscores_ = Emphasis
    • **Double** or __double__ = Strong
    • This is [a link](http://the.link.example.com "The optional title text")
    For complete details on the Markdown syntax, see the Markdown documentation and Markdown Extra documentation for tables, footnotes, and more.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
By submitting this form, you accept the Mollom privacy policy.