6 Months with GlusterFS: a Distributed File System
Mike Peters, 08-09-2012
Gluster is an open-source software-only distributed file system designed to run on commodity hardware, scaling to support petabytes of storage.
Gluster supports file system mirroring & replication, striping, load balancing, volume failover, storage quotas and disk caching.
Although the lack of glowing reviews made us hesitant, we were attracted by Gluster's feature set and simple architecture.
Over the last six months, we battle-tested Gluster in production, relying on it to deliver high availability and geo-replication for large-scale Internet marketing product launches.
Architecture
The Gluster architecture aggregates compute, storage, and I/O resources into a global namespace. Each server plus attached commodity storage is considered to be a node. Capacity is scaled by adding additional nodes or adding additional storage to each node. Performance is increased by deploying storage among more nodes. High availability is achieved by replicating data n-way between nodes.
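As a rough sketch of what "adding nodes" looks like in practice, the helper below prints the two commands involved in growing a volume: `gluster volume add-brick` followed by a rebalance. The function name `expand_volume_cmds` and the host names `backup3east` and `backup4west` are made up for this example, and the commands are printed rather than executed so they can be reviewed first.

```shell
# Hypothetical helper: print the commands to grow a volume by adding bricks.
# Usage: expand_volume_cmds VOLUME REPLICA_COUNT BRICK...
# Use a REPLICA_COUNT of 1 for a plain distributed volume.
expand_volume_cmds() {
    volume="$1"; replica="$2"; shift 2
    # On a replicated volume, bricks have to be added in multiples of the replica count.
    if [ "$#" -eq 0 ] || [ $(( $# % replica )) -ne 0 ]; then
        echo "need a multiple of $replica new bricks, got $#" >&2
        return 1
    fi
    echo "gluster volume add-brick $volume $*"
    # Rebalancing spreads existing files onto the new bricks.
    echo "gluster volume rebalance $volume start"
}

# Host names below are assumptions for illustration only.
expand_volume_cmds backup 2 backup3east:/gfs/bricks/vol0 backup4west:/gfs/bricks/vol0
```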
Unlike other distributed file systems, Gluster runs on top of your existing file system, with client code doing all the work. The clients are stateless and introduce no centralized single point of failure.
Gluster integrates with the local file system using FUSE, delivering wide compatibility across any system that supports extended file attributes - the "local database" where Gluster keeps track of all changes to a file.
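To get a feel for that "local database", the extended attributes can be inspected directly on a brick with `getfattr` (from the `attr` package). This is only a sketch: the helper name `show_brick_xattrs` and the example path are assumptions, and the attribute names you see will depend on the Gluster version and volume type.

```shell
# Hypothetical helper: dump all extended attributes of a file stored on a brick.
# Run it against the brick directory (not the FUSE mount), usually as root,
# because attributes in the trusted.* namespace are only visible to root.
show_brick_xattrs() {
    target="$1"
    if ! command -v getfattr >/dev/null 2>&1; then
        echo "getfattr not found (install the 'attr' package)" >&2
        return 1
    fi
    if [ ! -e "$target" ]; then
        echo "no such file: $target" >&2
        return 1
    fi
    # -d dumps values, -m . matches every attribute name, -e hex prints values as hex
    getfattr -d -m . -e hex "$target"
}

# Example (the path is an assumption):
# show_brick_xattrs /gfs/bricks/vol0/somefile
```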
The system supports several storage volume configurations:
* None: Files are transparently distributed across servers, with each node adding to the total storage capacity.
* Replica: Files are replicated between two LAN drives (synchronous replication)
* Geo replica: Files are replicated between two remote drives (asynchronous replication, using rsync in the background)
* Stripe: Each file is spread across 4 servers to distribute load.
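The first, second and fourth layouts above map onto keywords of `gluster volume create`. The sketch below prints (rather than runs) the matching command; the helper name `volume_create_cmd` is hypothetical, and it assumes the replica or stripe count equals the number of bricks given. Geo-replication is configured with a separate command and is not covered here.

```shell
# Hypothetical helper: print the "gluster volume create" command for a layout.
# Usage: volume_create_cmd KIND NAME BRICK...
volume_create_cmd() {
    kind="$1"; name="$2"; shift 2
    case "$kind" in
        distribute) layout="" ;;             # default: files spread across bricks
        replica)    layout="replica $#" ;;   # every brick holds a full copy
        stripe)     layout="stripe $#" ;;    # each file is split across all bricks
        *) echo "unknown kind: $kind" >&2; return 1 ;;
    esac
    # ${layout:+...} adds the layout keyword only when one is set
    echo "gluster volume create $name ${layout:+$layout }transport tcp $*"
}

volume_create_cmd replica backup backup1east:/gfs/bricks/vol0 backup2west:/gfs/bricks/vol0
```

Called as shown, this prints the same `replica 2` command used in Step 3 of the installation.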
As of October 2011, development of Gluster is funded by Red Hat.
Installing Gluster
This is one of the areas where Gluster really shines. You can be up and running in minutes.
Step 1
Installing the FUSE client, which serves as the "glue" between Gluster and your local file system.
wget -O fuse-2.9.1.tar.gz "http://sourceforge.net/projects/fuse/files/fuse-2.X/2.9.1/fuse-2.9.1.tar.gz/download"
tar xvfz fuse-2.9.1.tar.gz
cd fuse-2.9.1
./configure
make all
make install
Step 2
Building Gluster from source
wget "http://download.gluster.org/pub/gluster/glusterfs/LATEST/glusterfs-3.3.0.tar.gz"
tar xvfz glusterfs-3.3.0.tar.gz
cd glusterfs-3.3.0
./configure
make all
make install
Starting Gluster and setting it to auto-start on next reboot
/etc/init.d/glusterd start
/usr/sbin/update-rc.d glusterd defaults
Step 3
Configuring your first two nodes as a Replica setup (mirroring)
On node 1 (backup1east):
mkdir /gfs
mkdir /gfs/bricks
mkdir /gfs/backup
gluster peer probe backup2west
gluster volume create backup replica 2 transport tcp backup1east:/gfs/bricks/vol0 backup2west:/gfs/bricks/vol0
gluster volume start backup
mount -t glusterfs backup1east:/backup /gfs/backup
On node 2 (backup2west):
mkdir /gfs
mkdir /gfs/bricks
mkdir /gfs/backup
gluster peer probe backup1east
mount -t glusterfs backup2west:/backup /gfs/backup
Important: Make sure the name of your Gluster volume ('backup' in the example above) is different from the name of the share ('gfs' in the example above), or things will not work properly.
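Once both nodes are mounted, a quick smoke test helps confirm that replication is actually happening. The sketch below writes a file through the mount and compares it with the copy in the local brick directory. The helper name `check_replica_copy` is made up for this example, and it assumes the mount and brick paths used in the steps above.

```shell
# Hypothetical smoke test: write a file through the Gluster mount and confirm
# an identical copy shows up in the local brick directory.
# Usage: check_replica_copy MOUNT_DIR BRICK_DIR
check_replica_copy() {
    mount_dir="$1"; brick_dir="$2"
    probe="gluster-probe-$$"              # unique-ish name based on the shell PID
    date > "$mount_dir/$probe" || return 1
    if cmp -s "$mount_dir/$probe" "$brick_dir/$probe"; then
        echo "OK: $probe present on brick"
        status=0
    else
        echo "FAIL: $probe missing or different on brick" >&2
        status=1
    fi
    rm -f "$mount_dir/$probe"             # clean up via the mount, never the brick
    return "$status"
}

# Example, using the paths from the steps above (run on each node):
# check_replica_copy /gfs/backup /gfs/bricks/vol0
```

Running it on both nodes checks each side of the mirror; a failure on either one suggests the volume is not started or a peer is disconnected.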
Our Experience
Going into this experiment, we had very high hopes for Gluster. Once proven, the goal was to replace our entire private cloud storage cluster with Gluster.
Unfortunately, we have been very disappointed with Gluster...
Despite getting a lot of help from the Gluster community and testing different platforms and configurations, the results were consistently poor.
As other users have reported, we struggled with poor performance, bugs, race conditions when dealing with lots of small files, difficulties in monitoring node health and, worst of all, two instances of unexplained data loss.
We ended up completely abandoning Gluster and switching back to our home-grown rsync-based solution.
As always, run your own tests to determine if this is a good fit for your needs.
Proceed with caution.
More Resources
* SlideShare Introduction to GlusterFS
* Gluster Documentation
* Gluster IRC Channel
* Gluster Blog
Sebastien, 03-24-2013
Hi,
I'm really interested in your feedback about Gluster.
Can you tell me more about your disappointment and the poor performance?
Thanks in advance