Galera Cluster for MySQL with Amazon Virtual Private Cloud

Next in this series: Galera Configurator and deploying Galera Cluster · Setup Load Balancers and Web Servers · Test your AWS VPC Galera Cluster

In the next few posts we’ll deploy a multi-master, synchronous MySQL Galera Cluster using Amazon’s VPC service. We’re going to create a public-facing subnet for the app/web servers and a private subnet for our database cluster.

The deployment will look similar to the diagram below.

 

 

Amazon’s VPC provides a secure environment where you can choose to isolate parts of your infrastructure, with complete control over how to deploy your virtual networking, much like in your own datacenter.

 

The steps that we’ll go through are as follows:

 

  1. Create a VPC with Public and Private subnets
  2. Define Security Groups (add rules later)
  3. Launch one instance for ClusterControl
  4. Launch three EBS optimized instances for the Galera/database nodes
  5. Format and mount an EBS volume (or RAID set) for each Galera node
  6. Create a Galera Configuration with Severalnines Galera Configurator 
  7. Deploy and bootstrap Galera Cluster
  8. Add an internal load balancer
  9. Add a MySQL user for the internal load balancer
  10. Add web server instances
  11. Add an external load balancer
  12. Test the VPC database cluster setup

At the end we’ll have the following instances available on the public subnet (note that your IP addresses will be different):

 

  • 1 Elastic External Load Balancer, Elastic IP 54.249.29.89
  • 2 Web servers: IP 10.0.0.28, Elastic IP 54.249.30.195 and IP 10.0.0.38, Elastic IP 54.249.30.136
  • 1 ClusterControl server: IP 10.0.0.53,  Elastic IP 54.249.30.164

and on the private subnet:

 

  • 1 Elastic Internal Load Balancer, IP 10.0.1.17
  • Galera Node 1, IP 10.0.1.13
  • Galera Node 2, IP 10.0.1.16
  • Galera Node 3, IP 10.0.1.26

Going forward in this example we only deploy one private subnet. If you require a more fault-tolerant setup, you can for example create two private subnets for the database cluster, one in each Availability Zone (AZ), which protects you from single-location failures within an Amazon region.

 

There are a number of issues that need to be handled properly with a Galera cluster spanning two regions and/or AZs (which in practice means two data centers). This will be addressed in a future post.

 

Create a VPC with Public and Private subnets

 

We’ll use Amazon’s VPC wizard to create a VPC that has a public and a private subnet. Go to the Amazon VPC console dashboard, verify the region that you want your VPC to be created in and click on the ‘Get started creating a VPC’ button.

 

You will be prompted to select from a list of VPC templates; for this exercise we’ll choose ‘VPC with Public and Private Subnets’.

 

 

The final confirmation dialog shows you the VPC configuration that will be deployed. Using the defaults, your VPC will allow up to 65,531 IPs, with the public and private subnets allocating up to 251 addresses each. You can create the VPC with the defaults or configure the ranges and subnets to your preference.

 

 

A NAT instance will also be created, which allows your EC2 instances on the private subnet to access the internet by routing traffic through that instance. Instances created on the public subnet route traffic through the internet gateway.

 

Click on ‘Create VPC’ and shortly after you should have 1 VPC, 2 subnets, 1 network ACL, 1 internet gateway and route tables all set up.

 

 

Create Security Groups

 

Before continuing, let’s define some security groups to be used for our different EC2 instances.

 

  • GaleraCluster

This group is for the Galera database cluster, which resides on the private subnet.

 

TCP: 4567 (Group comm), 4568 (IST), 4444 (rsync), 3306 (MySQL)

TCP: 9200 (HTTP health check port), served by an xinetd-invoked shell script (see the sketch below)

TCP: 22 (ssh) for ClusterControl’s passwordless SSH

ICMP: Echo Request/Reply. Being able to ping the host is a requirement for the deployment/bootstrap scripts
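
The health check on port 9200 is typically just a small shell script invoked by xinetd. As a rough illustration only (the service name mysqlchk and the script path /usr/local/bin/mysqlchk are assumptions, not something this setup asks you to create by hand), such a service definition could look like this:

$ sudo tee /etc/xinetd.d/mysqlchk <<'EOF'
# Hypothetical xinetd service: runs a shell script that reports
# MySQL/Galera health over HTTP on port 9200
service mysqlchk
{
        type            = UNLISTED
        port            = 9200
        socket_type     = stream
        protocol        = tcp
        wait            = no
        user            = nobody
        server          = /usr/local/bin/mysqlchk
        disable         = no
}
EOF
$ sudo service xinetd restart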

 

  • ClusterControl

The ClusterControl instance is our monitoring and administration access point to the database cluster and resides on the public subnet. It also serves as our staging server, from which we deploy and bootstrap the MySQL Galera Cluster.

 

TCP port 22 (ssh) SSH access to our VPC
TCP port 80 (HTTP) ClusterControl web application
TCP port 3306 (MySQL) ClusterControl’s MySQL database; below we’ve also restricted the source to instances in the GaleraCluster security group.
This is important since the ClusterControl agent that is installed on the Galera nodes needs access to this port.

 

 

  • Web

These are public-facing instances; in our example we'll create a couple of web servers.

TCP port 22 (ssh) The source only allows SSH connections from the ClusterControl instance.
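
If you prefer the command line over the console, the same groups and rules can be created with the AWS CLI. A minimal sketch for the GaleraCluster group (the VPC ID and the sg-11111111/sg-22222222 group IDs are placeholders for your GaleraCluster and ClusterControl groups; repeat the same pattern for the ClusterControl and Web groups):

$ # Create the group inside the VPC
$ aws ec2 create-security-group --group-name GaleraCluster --description "Galera database nodes" --vpc-id vpc-xxxxxxxx
$ # MySQL and Galera ports, allowed only from members of the same group
$ aws ec2 authorize-security-group-ingress --group-id sg-11111111 --protocol tcp --port 3306 --source-group sg-11111111
$ aws ec2 authorize-security-group-ingress --group-id sg-11111111 --protocol tcp --port 4567-4568 --source-group sg-11111111
$ aws ec2 authorize-security-group-ingress --group-id sg-11111111 --protocol tcp --port 4444 --source-group sg-11111111
$ # SSH, health check and ICMP echo request (ping) from the ClusterControl group
$ aws ec2 authorize-security-group-ingress --group-id sg-11111111 --protocol tcp --port 22 --source-group sg-22222222
$ aws ec2 authorize-security-group-ingress --group-id sg-11111111 --protocol tcp --port 9200 --source-group sg-22222222
$ aws ec2 authorize-security-group-ingress --group-id sg-11111111 --protocol icmp --port 8 --source-group sg-22222222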

 

Create EC2 instances

 

We are later going to use Severalnines’ Galera Configurator to quickly deploy our MySQL Galera Cluster. The number of instances needed is 1+3: one instance dedicated to the ClusterControl package and the rest for the Galera database nodes.

 

Use Amazon’s Quick Launch wizard and select one of the supported OSs for Galera (http://support.severalnines.com/entries/21589522-verified-and-supported-operating-systems).

 

I’m going to use an Ubuntu Server 12.04 LTS image and create 1 small instance for ClusterControl and 3 large EBS-optimized instances for the Galera nodes.

 

 

In order to launch the instance in a VPC subnet you need to edit the details and make sure to enable ‘Launch into a VPC’. Select the private subnet for the Galera instances and the public subnet for the ClusterControl instance.

 

ClusterControl Instance

 

Select the public subnet.

Select the ‘ClusterControl’ security group.

 

 

Save the changed details.

 

 

Naming your instances makes it easier to identify them later on.

 

 

Next allocate an elastic IP. Instances in the public subnet that do not have an elastic IP are not able to access the internet.

 

Allocate an elastic IP and associate it to the ClusterControl instance

 

Go to the ‘Elastic IPs’ in the EC2 dashboard and click on the ‘Allocate New Address’.

Make sure that the selected ‘EIP used in’ says VPC and not EC2.

 

 

Associate the IP with the ClusterControl instance.
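
If you prefer the command line, the same allocation and association can be done with the AWS CLI (assuming it is installed and configured; the allocation and instance IDs below are placeholders):

$ # allocate a VPC Elastic IP and associate it with the ClusterControl instance
$ aws ec2 allocate-address --domain vpc
$ aws ec2 associate-address --allocation-id eipalloc-xxxxxxxx --instance-id i-xxxxxxxx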

 

 

You should now be able to log on to your ClusterControl instance using ssh.

 

$ ssh -i <your aws pem> ubuntu@54.249.30.164

 

Galera Instances

 

Next, launch 3 large instances for the Galera nodes. Make sure to launch them as "EBS optimized instances" (500 Mbps bandwidth for large instance types) and create an EBS volume to store the database files so that they survive instance reboots. A separate EBS volume is also great for taking backups/snapshots etc.

 

Using an EBS optimized instance should give you increased throughput and a more consistent level of IOPS and latency between the EC2 instance and the EBS volume.

AWS quote: “Provisioned IOPS volumes are designed to deliver within 10% of your provisioned IOPS 99.9% of the time. So for a volume provisioned with 500 IOPS, the volume should deliver at least 450 IOPS 99.9% of the time.”

 

 

Select the GaleraCluster security group.

 

 

Then add a provisioned IOPS EBS volume with the number of IOPS that you want.

 

Only a maximum ratio of 10:1 between IOPS and volume size (in GB) is allowed, so for example a 10GB volume allows at most 100 IOPS and a 20GB volume at most 200 IOPS.
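
For reference, a provisioned IOPS volume can also be created and attached from the command line with the AWS CLI; a minimal sketch (the availability zone, volume and instance IDs are placeholders from this example, not fixed values):

$ # 20GB io1 volume with 200 provisioned IOPS (stays within the 10:1 ratio)
$ aws ec2 create-volume --size 20 --volume-type io1 --iops 200 --availability-zone ap-northeast-1a
$ # attach it to a Galera instance; it shows up as /dev/xvdf inside the guest
$ aws ec2 attach-volume --volume-id vol-xxxxxxxx --instance-id i-xxxxxxxx --device /dev/sdf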

 

If your database workload is very write intensive, and/or your “hot” data does not fit entirely into the InnoDB buffer pool, then you can opt to create a RAID array from a number of EBS volumes to increase throughput for disk-bound workloads.

 

Since AWS charges per GB used and per provisioned IOPS, not per volume, you could easily create for example 6 EBS volumes and set up a RAID 1+0 stripe.

 

Save and launch the instance. Repeat for the next two Galera instances.

 

“Pre-flight check” ClusterControl and Galera instances

 

Before deploying and bootstrapping the Galera Cluster there are a few prerequisites that we need to take care of.

 

  • Copy your AWS key to the ClusterControl instance
$ scp <your aws pem file> ubuntu@54.249.30.164:~/.ssh/id_rsa
$ ssh -i <your aws pem file> ubuntu@54.249.30.164 chmod 400 ~/.ssh/id_rsa
$ ssh -i <your aws pem file> ubuntu@54.249.30.164

 

The location of the aws key will be needed later in the Galera Configurator section.

  • Verify that you can use ssh to connect to your Galera instances

 

$ ssh 10.0.1.13 ls /etc
$ ssh 10.0.1.16 ls /etc
$ ssh 10.0.1.26 ls /etc

 

If you don’t feel comfortable using the AWS key, you can easily generate your own passwordless SSH key to be used for the database cluster instead.

 

$ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
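
If you go that route, the new public key also has to end up in ~/.ssh/authorized_keys on each Galera instance. One way to push it out, assuming the AWS .pem file is still at hand for the initial connection and using the node IPs from this example:

$ for h in 10.0.1.13 10.0.1.16 10.0.1.26; do cat ~/.ssh/id_rsa.pub | ssh -i <your aws pem file> ubuntu@$h 'cat >> ~/.ssh/authorized_keys'; done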

 

  • Ping Galera hosts from the ClusterControl instance

This should work if you have set up the GaleraCluster security group properly, allowing ICMP echo request/reply. The deployment script requires that you can ping the Galera nodes.

 

From your ClusterControl instance ping your Galera instances.

 

$ ping 10.0.1.13
$ ping 10.0.1.16
$ ping 10.0.1.26

 

  • 1) hostname -i needs to resolve properly on all instances/hosts

 

If you get “hostname: Name or service not known” then the most painless way to fix this issue is to add the hostname to the /etc/hosts file (another way is to edit the Galera deployment scripts).

 

Currently the deployment script does not use ‘hostname --all-ip-addresses’ by default.

 

  • 2) Doing ‘ssh 10.0.1.26 sudo ls’ should not give you “sudo: unable to resolve host ip-10-0-1-26”

Once again the most painless way to resolve this is to add the hostname to the /etc/hosts file on each instance.

On 10.0.1.26

$ echo "10.0.1.26 ip-10-0-1-26" | sudo tee -a /etc/hosts

 

Make sure that 1) and 2) pass on all Galera instances.
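
A quick way to verify both checks from the ClusterControl instance in one go (using the Galera node IPs from this example) is a small loop like the one below; each node should print its private IP followed by “sudo OK” with no resolver warnings:

$ for h in 10.0.1.13 10.0.1.16 10.0.1.26; do echo "== $h =="; ssh $h 'hostname -i && sudo ls /etc > /dev/null && echo sudo OK'; done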

 

Format and mount EBS volume(s) on the Galera instances

 

On each Galera instance, format the new volume with ext4 (or use xfs); this is where we are going to keep our database files. If you created volumes to be used for a RAID setup, follow the instructions further down instead.

 

Look for your EBS volume

 

$ sudo fdisk -l  (or cat /proc/partitions)
...
Disk /dev/xvdf: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders, total 20971520 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
...

 

Format the volume with an ext4 filesystem 

$ sudo mkfs -t ext4 /dev/xvdf

 

Create the mount point; this is where we’ll store the MySQL data files

$ sudo mkdir /data

 

Add the volume to /etc/fstab so that the mount survives instance reboots.

$ echo "/dev/xvdf /data auto defaults,nobootwait,noatime,data=writeback,barrier=0,nobh 0 0" | sudo tee -a /etc/fstab

 

Mount the new volume

$ sudo mount -a
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      8.0G  867M  6.8G  12% /
udev            3.7G   12K  3.7G   1% /dev
tmpfs           1.5G  168K  1.5G   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            3.7G     0  3.7G   0% /run/shm
/dev/xvdb       414G  199M  393G   1% /mnt
/dev/xvdf        10G  280M  9.2G   3% /data

 

Recommended optimized mount options for ext4: 

http://blog.smartlogicsolutions.com/2009/06/04/mount-options-to-improve-ext4-file-system-performance/

 

RAID Setup

 

If you, for example, created several EBS volumes, you can build a RAID 1+0 setup in a few steps.

 

6 volumes striped as RAID 1+0

$ sudo apt-get install mdadm
$ sudo mdadm --create md0 --level=10 --chunk=64 --raid-devices=6 /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi /dev/xvdk /dev/xvdj
$ sudo mkfs -t ext4 /dev/md/md0

 

Verify the new array

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md127 : active raid10 xvdj[5] xvdk[4] xvdi[3] xvdh[2] xvdg[1] xvdf[0]
      31432320 blocks super 1.2 64K chunks 2 near-copies [6/6] [UUUUUU]
      [>....................]  resync =  1.0% (326636/31432320) finish=76.1min speed=6806K/sec
unused devices: <none>

 

Add the array to /etc/mdadm/mdadm.conf so that it is assembled at boot

$ sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf

(ARRAY /dev/md/ip-10-0-1-27:md0 metadata=1.2 name=ip-10-0-1-27:md0 UUID=6e339907:9d9ff219:044ae233:47b4a362)

 

$ echo "/dev/md/ip-10-0-1-27:md0 /data auto defaults,nobootwait,noatime,data=writeback,barrier=0,nobh 0 0" | sudo tee -a /etc/fstab

 

Mount the new volume

$ sudo mount -a
$ df -hT
Filesystem     Type      Size  Used Avail Use% Mounted on
/dev/xvda1     ext4      8.0G  906M  6.7G  12% /
udev           devtmpfs  3.7G  8.0K  3.7G   1% /dev
tmpfs          tmpfs     1.5G  204K  1.5G   1% /run
none           tmpfs     5.0M     0  5.0M   0% /run/lock
none           tmpfs     3.7G     0  3.7G   0% /run/shm
/dev/xvdb      ext3      414G  199M  393G   1% /mnt
/dev/md127     ext4       30G  582M   28G   2% /data

 

Quick Disk IO Performance Test

Let’s do a simple test with dd: write 8GB to /data and perform a sync once before exiting.

Standard EBS volume

$ time sudo dd bs=16K count=524288 if=/dev/zero of=test conv=fdatasync
524288+0 records in
524288+0 records out
8589934592 bytes (8.6 GB) copied, 1157.38 s, 7.4 MB/s

real    19m17.440s
user    0m0.504s
sys     0m12.641s

 

RAID 1+0 on 6 provisioned IOPS volumes, 6 x 10GB
$ time sudo dd bs=16K count=524288 if=/dev/zero of=test conv=fdatasync
524288+0 records in
524288+0 records out
8589934592 bytes (8.6 GB) copied, 628.86 s, 13.7 MB/s

real    10m28.880s
user    0m0.432s
sys     0m11.697s

 

Read 8GB test file

 

Standard EBS volume

$ time sudo dd if=test of=/dev/null bs=16K
524288+0 records in
524288+0 records out
8589934592 bytes (8.6 GB) copied, 906.412 s, 9.5 MB/s

real    15m6.439s
user    0m0.428s
sys     0m6.256s

 

RAID 1+0 on 6 provisioned IOPS volumes, 6 x 10GB

$ time sudo dd if=test of=/dev/null bs=16K
524288+0 records in
524288+0 records out
8589934592 bytes (8.6 GB) copied, 133.016 s, 64.6 MB/s

real    2m13.188s
user    0m0.180s
sys     0m5.080s

 

Next: Galera Configurator and deploying Galera Cluster

 
