Replicating Web servers using Rsync
Dawn Rossi, 02-04-2009

What is Web server replication?
Replicating a web server machine (Apache, Nginx, etc.) is the process of synchronizing two or more machines so that they each hold the same files at any given point in time.
Benefits of Web server replication
There are two primary benefits to web server replication:
1. High availability
If one machine goes down, your site will continue to operate normally, provided you have a load balancer in place.
2. Performance during high load times
Having more than a single web server means the load can be equally distributed across several locations. You can use shortest-distance, weighted-average or round-robin to determine the designated machine, but in general, the more front-end servers you add, the better your site will handle increased traffic.
Rsync
Rsync is a Unix command-line utility that supports efficient copying of data between two servers.
Unlike a "normal copy", Rsync only transfers differential data (the parts that just changed, instead of an entire 500MB file). Another benefit of Rsync is that it can copy over SSH, so data is encrypted across the channel.
Rsync is one of the most popular methods to replicate web servers. It's reliable, easy to set up and fast.
As part of this guide, I will walk you through the process of using Rsync to replicate two web server machines.
Architecture
Whether you have 2 web servers or 10, it is always helpful to designate a single machine as the "master web server" and all other machines as "slave web servers".
While all web servers will contain the same data, designating a single machine as the "master web server" will simplify things for you and your team.
Instruct your team members to always connect to the "master web server" whenever making changes.
Your replication flow is going to be structured like this:
Master_Web_Server =>
RSync to Slave_Web_Server1 =>
RSync to Slave_Web_Server2 =>
RSync to Slave_Web_Server3 etc.
Step 1 - Setup
Before we can set up RSync to replicate the data, we need to authorize SSH across multiple machines without requiring a password.
Here's the recipe -
Get on machine #1 ("Master web server") and type
ssh-keygen -t rsa
Follow the prompts and use the defaults for the filenames it gives you. Don't enter in a passphrase, otherwise you will still be prompted for a password when trying to connect.
You should then have two new files in ~/.ssh, id_rsa and id_rsa.pub.
Open ~/.ssh/id_rsa.pub and copy the line in it to the ~/.ssh/authorized_keys file on machine #2.
If you have more than two machines, repeat the process starting with machine #2, authorizing it to connect without a password to machine #3 and so on.
Make sure you repeat this process as many times as necessary so that -every- single one of your machines can connect to -every- other one, without requiring a password.
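The key-copying step above can be scripted. What follows is a minimal sketch, not part of the original recipe: it assumes OpenSSH's ssh-copy-id helper is installed, and the hostnames, the HOSTS and DRY_RUN variables and the run and authorize_hosts functions are all hypothetical names chosen for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical hostnames; replace with the machines this host must reach.
HOSTS="${HOSTS:-web2.example.com web3.example.com}"
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

authorize_hosts() {
    for host in $HOSTS; do
        # Append this machine's public key to ~/.ssh/authorized_keys on $host.
        run ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "$host"
        # BatchMode makes ssh fail instead of prompting, so a leftover
        # password prompt shows up as an error rather than a hang.
        run ssh -o BatchMode=yes "$host" true
    done
}

authorize_hosts
```

Run it once on each machine, with HOSTS listing every other machine, and check the printed commands before setting DRY_RUN=0.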
Step 2 - Install RSync
Install RSync on all machines. If you're using FreeBSD, this is as easy as doing
cd /usr/ports/net/rsync
make all
make install
Step 3 - Select folder to replicate
This is important.
As part of this step you have to select one (or more) folders to be replicated across all machines.
For best results, follow these pointers:
(a) Ensure all machines are dedicated web servers and don't have any other processes running on them
(b) Do NOT replicate temporary files or cache data.
-
Here at SoftwareProjects, we replicate the /usr/home/ folder, across all web server machines. Any sub directory under /usr/home/ gets replicated.
Engineers are encouraged to avoid storing any temporary files under /usr/home/*/temp. Instead, use tmpnam (or your programming language's equivalent) to create temporary files under the system /usr/tmp directory.
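For shell scripts, the equivalent of the advice above might look like the sketch below. It uses the mktemp utility rather than tmpnam, and it assumes the system temp directory is $TMPDIR or /tmp (the article's machines use /usr/tmp); the function name make_scratch_file and the "scratch" prefix are illustrative only.

```shell
#!/bin/sh
# Create a scratch file in the system temp directory (TMPDIR if set, else /tmp),
# so it never lands inside the replicated /usr/home tree.
make_scratch_file() {
    mktemp "${TMPDIR:-/tmp}/scratch.XXXXXX"
}

scratch=$(make_scratch_file) || exit 1
# Remove the scratch file when the script exits.
trap 'rm -f "$scratch"' EXIT

echo "working data" > "$scratch"
```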
Step 4 - Create Rsync server config file
Rsync has two modes of operation - client and server (daemon).
Before you can start using the Rsync client to replicate data, you need to install the Rsync server daemon on all machines.
First we have to create the Rsync config file.
A typical Rsync config file (the one we use here), looks like this:
/etc/rsyncd.conf
path = /usr/home/
comment = SPI home node
max connections = 4
pid file = /var/run/rsyncd.pid
Create such a config file on all machines and save it as /etc/rsyncd.conf
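One thing to check in your own copy: the Rsync daemon expects "path" and "comment" to sit inside a named module section. The sketch below writes a complete sample file with such a section. The module name "home" and the function name write_rsyncd_conf are assumptions made for illustration, not taken from the config above, and "read only = yes" is added here only to make rsync's default explicit.

```shell
#!/bin/sh
# Write a minimal rsyncd.conf to the path given as $1.
# The module name "home" is a hypothetical choice, not taken from the article.
write_rsyncd_conf() {
    cat > "$1" <<'EOF'
# Global settings
pid file = /var/run/rsyncd.pid
max connections = 4

# Module: clients address it as rsync://HOST/home/ or HOST::home/
[home]
path = /usr/home/
comment = SPI home node
read only = yes
EOF
}

# Write to a scratch location for review before copying to /etc/rsyncd.conf.
write_rsyncd_conf "${TMPDIR:-/tmp}/rsyncd.conf.sample"
```

Note that a client only talks to the daemon when it uses the double-colon (HOST::home/) or rsync:// form. A single-colon source such as HOST:/usr/home/ goes over the remote shell (SSH) instead.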
Step 5 - Installing & Running Rsync server
Save this file under /usr/local/etc/rc.d/rsync.sh:
#!/bin/sh
/usr/local/bin/rsync --daemon --config=/etc/rsyncd.conf
Run the command from the shell to start Rsync now.
Step 6 - Fetching changes with the Rsync client
To fetch changes, all you have to do is run this command on the machine you want to pull the changes into:
/usr/local/bin/rsync --progress --stats --archive --compress REMOTEMACHINE:/usr/home/ /usr/home/
Replace REMOTEMACHINE with the machine hostname
Replace /usr/home/ with the folder you are looking to replicate.
Sometimes people get confused between source / destination machines. Think of the Rsync client as a Pull Engine. It pulls new files (from remote) into the local machine. You should run the Rsync client "sync command" listed above on the machines you want to pull data into.
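Before running a real pull, it can help to preview what would change. The sketch below wraps rsync's --dry-run and --itemize-changes options in a hypothetical helper named preview_pull. The --exclude 'temp/' filter is an assumption that follows the Step 3 advice about temporary files; drop it if it does not match your layout.

```shell
#!/bin/sh
# Preview what a pull would change, without modifying the destination.
# $1 = source (e.g. REMOTEMACHINE:/usr/home/), $2 = local destination.
preview_pull() {
    rsync --archive --compress --dry-run --itemize-changes \
        --exclude 'temp/' "$1" "$2"
}

# Example (not run here): preview_pull REMOTEMACHINE:/usr/home/ /usr/home/
```

Each output line names one file or directory that a real run would create or update. Nothing is written to the destination.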
Step 7 - Setting Rsync client as a cronjob
Add this line to your crontab (crontab -e) to run the Rsync client every 5 minutes, check for changes and update files:
*/5 * * * * /usr/local/bin/rsync --progress --stats --archive --compress REMOTEMACHINE:/usr/home/ /usr/home/
-=-
Notes about recovery
This quick guide does not cover proper handling of a dead machine. Namely, if you are replicating three machines (1 => 2 => 3) and the one in the middle goes down, machine #3 will no longer receive any updates. The loop is broken.
To fix this, you will have to set up a script that monitors web server uptime and, when the machine you pull from dies, updates the local Rsync configuration to use the next machine in line.
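One possible shape for such a script is sketched below. It is an illustration rather than a tested recovery procedure: the hostnames, the UPSTREAMS variable and the three function names are hypothetical, and "alive" is assumed to mean "accepts a non-interactive SSH login within 5 seconds".

```shell
#!/bin/sh
# Candidate upstream machines in order of preference (hypothetical hostnames).
UPSTREAMS="${UPSTREAMS:-web1.example.com web2.example.com}"

# Succeed if the host accepts a non-interactive SSH login within 5 seconds.
is_alive() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Print the first reachable upstream, or fail if none responds.
pick_upstream() {
    for host in $UPSTREAMS; do
        if is_alive "$host"; then
            echo "$host"
            return 0
        fi
    done
    return 1
}

# Pull from the first live upstream; report an error if all are down.
sync_from_live_upstream() {
    source_host=$(pick_upstream) || { echo "no upstream reachable" >&2; return 1; }
    rsync --archive --compress "$source_host:/usr/home/" /usr/home/
}
```

On machine #3, listing machine #2 first and machine #1 second would let the cron job call sync_from_live_upstream and keep receiving updates while machine #2 is down.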
Christoph C. Cemper, 02-04-2009
Interesting article...
thought about this some time ago ... I'll discuss this with my tech guys if / how we can apply this to our WHM/CPANEL servers... obviously cpanel wants an extra license paid still
Mike Peters, 03-30-2010
In the example above, remember not to reference any machine IP addresses, or have web server config files (like nginx.conf) under the /home directory, or rsync will overwrite them with the wrong IP addresses.
Snake, 07-14-2009
I have one question.
My SSH server is running on a port other than the default port 22. How can I change the port for rsync?
Because I get a Connection refused error
Thanks
Travis, 07-19-2009
To Snake on 7-14-2009
I found this on google, hope this helps
######
You're in luck. the argument to -e can be a quoted string.
rsync --progress -vrae 'ssh -p 8022' /home
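Applied to the pull command from Step 6, the -e option could be used roughly as in the sketch below. The helper name pull_via_port is hypothetical, and port 8022 in the usage line is just the example value from the reply above.

```shell
#!/bin/sh
# Pull a directory over SSH on a non-default port.
# $1 = SSH port, $2 = remote host, $3 = directory to replicate.
pull_via_port() {
    rsync --archive --compress -e "ssh -p $1" "$2:$3" "$3"
}

# Example (not run here): pull_via_port 8022 REMOTEMACHINE /usr/home/
```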
Mike Peters, 03-30-2010
When running rsync via cron, it's important to ensure another instance of rsync is not already running.
Rsync can consume a lot of resources and if you have an rsync script that takes 3 minutes to run, yet it is set up to execute every minute on cron... the consequences can be fatal.
Add this at the top of your cronjob script, to ensure no more than a single instance is ever running:
# If another rsync is already running, do nothing
if [ -f "/var/run/rsync_running.pid" ]
then
    exit
fi
# Mark that we are running
echo "Running" > /var/run/rsync_running.pid
Then at the end of your cronjob script, add this:
# Delete file indicating we are running
rm "/var/run/rsync_running.pid"
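A variation on the same idea, offered as a sketch rather than a drop-in replacement: checking for a file and then creating it leaves a small window in which two runs can both pass the check. Using mkdir as the lock closes that window, because creating a directory either succeeds or fails in a single step. The lock path and the function names below are assumptions.

```shell
#!/bin/sh
# Lock directory path is an assumption; any location writable by the cron user works.
LOCKDIR="${LOCKDIR:-/var/run/rsync_running.lock}"

# mkdir succeeds for exactly one caller, so it works as an atomic test-and-set.
acquire_lock() {
    mkdir "$LOCKDIR" 2>/dev/null
}

release_lock() {
    rmdir "$LOCKDIR" 2>/dev/null
}

# Run the given command only if no other run holds the lock.
# The lock is released when the command finishes, whether it succeeded or failed.
# A hard kill or power loss can still leave a stale lock behind.
run_locked() {
    acquire_lock || return 0    # another run is in progress; do nothing
    "$@"
    status=$?
    release_lock
    return $status
}

# Example (not run here):
# run_locked /usr/local/bin/rsync --archive --compress REMOTEMACHINE:/usr/home/ /usr/home/
```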