Ever try to delete a lot of files in a folder, only to have the operation fail with "Argument list too long"?

This happens because the shell expands a wildcard like *.pdf into a single argument list that exceeds the kernel's limit. Piping file names from find into xargs feeds them to rm in smaller batches. Here's how to get it done:

FreeBSD

find . -name "*.pdf" -print0 | xargs -0 rm

Note that this is a recursive search and will find (and delete) files in subdirectories as well.

Linux

find . -maxdepth 1 -name "*.pdf" -print0 | xargs -0 rm
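
On systems whose find supports the -delete primary (GNU find and modern BSD find both do), you can skip xargs entirely; a quick sketch:

find . -maxdepth 1 -name "*.pdf" -delete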


Gluster is an open-source software-only distributed file system designed to run on commodity hardware, scaling to support petabytes of storage.

Gluster supports file system mirroring & replication, striping, load balancing, volume failover, storage quotas and disk caching.

Though hesitant about the lack of glowing reviews for Gluster, we were attracted by its feature set and simple architecture.

Over the last six months, we battle-tested Gluster in production, relying on the system to deliver high availability and geo-replication to power large-scale Internet Marketing product launches.

Architecture

The Gluster architecture aggregates compute, storage, and I/O resources into a global namespace. Each server plus attached commodity storage is considered to be a node. Capacity is scaled by adding additional nodes or adding additional storage to each node. Performance is increased by deploying storage among more nodes. High availability is achieved by replicating data n-way between nodes.

Unlike other distributed file systems, Gluster runs on top of your existing file-system, with client-code doing all the work. The clients are stateless and introduce no centralized single point of failure.

Gluster integrates with the local file system using FUSE, delivering wide compatibility across any system that supports extended file attributes - the "local database" where Gluster keeps track of all changes to a file.
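
You can peek at this "local database" yourself by dumping the trusted.* attributes Gluster sets on files sitting on a brick. A sketch, assuming the getfattr tool from the attr package and a brick path like the /gfs/bricks/vol0 used later in this post:

# must run as root - Gluster writes to the trusted.* xattr namespace
getfattr -d -m trusted -e hex /gfs/bricks/vol0/somefile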

The system supports several storage volume configurations:

* None: Files are transparently distributed across servers, with each node adding to the total storage capacity.
* Replica: Files are replicated between two LAN drives (synchronous replication).
* Geo replica: Files are replicated between two remote drives (asynchronous replication, using rsync in the background).
* Stripe: Each file is spread across 4 servers to distribute load (creation commands for each type are sketched below).
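
For illustration, here is roughly how each volume type is created with the gluster CLI. This is a sketch, not our exact setup: server1 through server4, remotehost and the brick paths are hypothetical.

# Distribute (the default): capacity adds up across bricks
gluster volume create dist transport tcp server1:/gfs/bricks/vol0 server2:/gfs/bricks/vol0

# Replica: every file written synchronously to both bricks
gluster volume create mirror replica 2 transport tcp server1:/gfs/bricks/vol0 server2:/gfs/bricks/vol0

# Stripe: each file chunked across four bricks
gluster volume create striped stripe 4 transport tcp server1:/gfs/bricks/vol0 server2:/gfs/bricks/vol0 server3:/gfs/bricks/vol0 server4:/gfs/bricks/vol0

# Geo replica: asynchronous replication of an existing volume to a remote host
gluster volume geo-replication mirror remotehost:/gfs/backup start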

As of October 2011, development of Gluster is funded by Red Hat.

Installing Gluster

This is one of the areas where Gluster really shines. You can be up and running in minutes.

Step 1

Installing the FUSE client, which serves as the "glue" between Gluster and your local file system.


wget -O fuse-2.9.1.tar.gz "http://sourceforge.net/projects/fuse/files/fuse-2.X/2.9.1/fuse-2.9.1.tar.gz/download"
tar xvfz fuse-2.9.1.tar.gz
cd fuse-2.9.1
./configure
make all
make install

Step 2

Building Gluster from source


wget "http://download.gluster.org/pub/gluster/glusterfs/LATEST/glusterfs-3.3.0.tar.gz"
tar xvfz glusterfs-3.3.0.tar.gz
cd glusterfs-3.3.0
./configure
make all
make install

Starting Gluster and setting it to auto-start on next reboot


/etc/init.d/glusterd start
/usr/sbin/update-rc.d glusterd defaults

Step 3

Configuring your first two nodes as a Replica setup (mirroring)

On node 1 (backup1east):


mkdir /gfs
mkdir /gfs/bricks
mkdir /gfs/backup

gluster peer probe backup2west

gluster volume create backup replica 2 transport tcp backup1east:/gfs/bricks/vol0 backup2west:/gfs/bricks/vol0

gluster volume start backup

mount -t glusterfs backup1east:/backup /gfs/backup

On node 2 (backup2west):


mkdir /gfs
mkdir /gfs/bricks
mkdir /gfs/backup

gluster peer probe backup1east

mount -t glusterfs backup2west:/backup /gfs/backup
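
Before trusting the mirror with data, confirm both nodes agree on cluster state:

gluster peer status
gluster volume info backup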

Important: Make sure the name of your Gluster volume ('backup' in the example above) is different from the name of the share ('gfs' in the example above) or things will not work properly.

Our Experience

Going into this experiment, we had very high hopes for Gluster. Once proven, the goal was to replace our entire private cloud storage cluster with Gluster.

Unfortunately, we have been very disappointed with Gluster...

Despite getting a lot of help from the Gluster community and testing different platforms and configurations, the results were consistent.

As other users have reported, we struggled with poor performance, bugs, race conditions when dealing with lots of small files, difficulties in monitoring node health and, worst of all, two instances of unexplained data loss.

We ended up completely abandoning Gluster and switching back to our home-grown rsync-based solution.

As always, run your own tests to determine if this is a good fit for your needs.

Proceed with caution.

More Resources

* SlideShare Introduction to GlusterFS
* Gluster Documentation
* Gluster IRC Channel
* Gluster Blog

Enjoy our step-by-step guide to configuring PHP 5 with FPM, NGinx Web server, Memcached and MySQL 5.1, on FreeBSD 8:

1. Install FreeBSD 7 compatibility and standard packages


cd /usr/ports/misc/compat7x
make all
make install

pkg_add -r libevent
pkg_add -r libtool
pkg_add -r m4
pkg_add -r pcre
pkg_add -r pdftk
pkg_add -r rsync
pkg_add -r vim
pkg_add -r wget

2. Install ProFTPD


cd /usr/ports/ftp/proftpd
make all
make install

3. Install NGinx

Make sure you click to enable 'HTTP_GZIP_STATIC_MODULE', 'HTTP_SSL_MODULE' and 'HTTP_ZIP_MODULE'


cd /usr/ports/www/nginx
make all
make install

echo "nginx_enable=YES" >> /etc/rc.conf
echo "<?php phpinfo(); ?>" >> /usr/local/www/nginx/phpinfo.php

You can always run 'make config' to redo the configuration options.

4. Install CURL+LibXML


cd /usr/ports/ftp/curl
make all
make install

cd /usr/ports/textproc/libxml2
make all
make install

5. Install MySQL client and server


cd /usr/ports/databases/mysql51-server
make all
make install

cd /usr/ports/databases/mysql51-client
make all
make install

cd /usr/tmp
fetch "http://api6.softwareprojects.com/files/auto/my.cnf"
mv my.cnf /etc/my.cnf

mkdir /usr/local/mysql
mkdir /usr/local/mysql/data
chmod 777 /usr/local/mysql
chown -R mysql:mysql /usr/local/mysql

/usr/local/bin/mysql_install_db
chmod -R 777 /usr/local/mysql
chown -R mysql:mysql /usr/local/mysql

6. Install GD


cd /usr/ports/graphics/gd
make all
make install

7. Install PHP 5


cd /usr/ports/security/libmcrypt
make all
make install

cd /usr/ports/devel/php5-pcntl
make all
make install

cd /usr/tmp
fetch "http://api6.softwareprojects.com/files/auto/php-5.2.8.tar.gz"
tar xvfz php-5.2.8.tar.gz

fetch "http://api6.softwareprojects.com/files/auto/php-5.2.8-fpm-0.5.10.diff.gz"
gzip -cd php-5.2.8-fpm-0.5.10.diff.gz | patch -d php-5.2.8 -p1

fetch "http://api6.softwareprojects.com/files/auto/suhosin-patch-5.2.8-0.9.6.3.patch.gz"
gzip -cd suhosin-patch-5.2.8-0.9.6.3.patch.gz | patch -d php-5.2.8 -p1

cd php-5.2.8
./configure --with-config-file-path=/usr/local/lib/ \
  --enable-pcntl --enable-fastcgi --enable-fpm \
  --enable-calendar --enable-ftp --enable-mbstring \
  --with-mysql=/usr/local/mysql --with-curl --with-mcrypt --with-gd \
  --with-iconv --with-jpeg-dir=/usr/lib \
  --enable-memcache --with-openssl --enable-soap --enable-sockets \
  --with-zlib --enable-zip --enable-bcmath \
  --with-ttf --enable-gd-native-ttf --with-freetype-dir=/usr/local/lib/ \
  --enable-pdo --with-pdo-mysql --enable-suhosin
make all install

fetch "http://api6.softwareprojects.com/files/auto/php-fpm.conf"
mv php-fpm.conf /usr/local/etc/php-fpm.conf

fetch "http://api6.softwareprojects.com/files/auto/nginx.conf"
mv nginx.conf /usr/local/etc/nginx/nginx.conf

8. Install Memcached


cd /usr/ports/databases/memcached
make all
make install
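
The port installs the daemon but doesn't start it. A typical invocation, as a sketch (64MB of cache, listening only on localhost on the default port, running as nobody; adjust to taste):

/usr/local/bin/memcached -d -m 64 -l 127.0.0.1 -p 11211 -u nobody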

9. Install HAProxy


cd /usr/ports/net/haproxy
make all
make install

10. Start MySQL and NGinx


/usr/local/bin/mysqld_safe &
/usr/local/sbin/php-fpm start
/usr/local/etc/rc.d/nginx start

--

Verify MySQL is working properly:

Attempt connecting to MySQL:

/usr/local/bin/mysql -uroot

Verify NGinx is working properly:

Point your browser to http://1.2.3.4/ (replacing 1.2.3.4 with the PUBLIC ip address of the server)

Verify PHP is working properly:

Point your browser to http://1.2.3.4/phpinfo.php (replacing 1.2.3.4 with the PUBLIC ip address of the server).

If you see the PHP info screen, all is well.

NTP for Accurate Global Time Synchronization

Mike Peters, February 11, 2011
Running a multi-server architecture?

Keeping your server clocks in-sync is very important, especially when using NoSQL databases like Cassandra.

Cassandra attaches a timestamp to every insert operation. If your server clocks fall out of sync, some updates will be silently dropped, because the write carrying the newest timestamp takes precedence over the others.

Even if your servers are all showing the same time right now, it's important to understand that without continually applying corrections, the different clocks will eventually fall out of sync.

How does Global Time Synchronization work?

Public time servers update their clocks using hardware that keeps time off the oscillation frequency of atoms (aka atomic clocks).

Your local machines ping the time server repeatedly, applying corrections so that all clocks are in sync.

NTP

NTP (Network Time Protocol) is an Internet protocol used to synchronize the clocks of computers to a global time reference.

FreeBSD and Linux servers come with an NTPD service that automatically adjusts the local clock based on the selected global time server.

To start NTPD on Linux:
ntpdate pool.ntp.org
service ntpd restart

To start NTPD on FreeBSD:
ntpdate pool.ntp.org
/etc/rc.d/ntpd start

Controlling which time server to use is done by updating /etc/ntp.conf. Example:
server pool.ntp.org prefer
driftfile /var/db/ntpd.drift
logfile /var/log/ntpd.log

To configure NTPD to start on boot automatically on Linux:
chkconfig --level 2345 ntpd on

To configure NTPD to start on boot automatically on FreeBSD:
Add these lines to your /etc/rc.conf file:
ntpd_enable="YES"
ntpdate_enable="YES"
ntpdate_flags="pool.ntp.org"
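
Once ntpd has been running for a few minutes, you can verify it is actually tracking a time source:

ntpq -p

An asterisk in front of a peer marks the server your clock is currently being disciplined by.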



Cassandra for PHP Sessions

Hojda Vasile Dan, January 26, 2011
Building on Dawn's Memcached for PHP sessions post, we've now converted our php-sessions handling from Memcached to Cassandra.

Cassandra supports built-in caching, sharding & replication and scales horizontally, overcoming the shortcomings of the memcached-for-sessions approach.

Click here to download the new dbsession.php and here to download common_cassandra.php

redis: a persistent key-value store

Mike Peters, November 5, 2010
redis is a key-value store, similar to memcached but with data persistence.

redis supports multiple value types, counters with atomic increment/decrement and built-in key expiration.

To achieve persistence without sacrificing speed, redis, like Cassandra, performs updates in memory while also appending them to an append-only file, which is synced to disk from time to time.

redis is fast (110,000 writes per second, 81,000 reads per second), supports sharding and master-slave replication (no master-master yet).

Why redis?

Those of you keeping track know we've always been big fans of MySQL, but at the same time we keep writing about migrating different parts of our application to Memcached, Cassandra, Lucene and ElasticSearch.

Why do we keep jumping from one storage engine to another? Can't we make up our minds already and settle with the "best" storage engine that meets our needs?

In short, No.

A common misconception is the belief that all storage engines are created equal, all designed to simply "store stuff" and provide fast access to your data. Unless your application performs one clearly defined simple task, it is a dire mistake to expect a single storage engine will effectively fulfill all of your data warehousing and processing needs.

* MySQL is great when you need ad-hoc queries and you're dealing with a relatively small data set.

* Memcached comes into play when you have a read-heavy environment and need a quick volatile cache to avoid querying MySQL a dozen times per page.

* Lucene and ElasticSearch are your friends when you need fulltext search, or when your MySQL data set grows to a point where running the filters and joins in MySQL becomes slow like a snail.

* Cassandra is amazing when you have a write-heavy environment, need to be capable of scaling writes linearly and supporting a huge data set.

* redis works particularly well as a state machine, when you need counters with atomic increment/decrement. Typical uses: "how many users are on my website?" ala ChartBeat, "how many jobs are waiting to be processed?" etc. (a quick sketch follows below).
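
As a taste of the counter use case, here's a minimal sketch using redis-cli (the key name is made up for illustration):

# a visitor arrives / leaves
redis-cli INCR visitors:online
redis-cli DECR visitors:online

# how many users are on the site right now?
redis-cli GET visitors:online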

Architecture

redis is written in ANSI C and runs as a single light-weight daemon on your machine. All updates are done in memory first and saved to disk later, asynchronously.

Supported languages: C, C#, Erlang, Java, JavaScript, Lua, Perl, PHP, Python, Ruby, Scala, Go, and Tcl.

As of 15 March 2010, development of redis is funded by VMware.

Installing redis

Step 1

Download the redis tarball and extract it


wget "http://redis.googlecode.com/files/redis-2.0.3.tar.gz"
tar xvzf redis-2.0.3.tar.gz

Step 2

Compile redis and install it


cd redis-2.0.3
gmake
gmake install

Step 3

Run redis


/usr/local/bin/redis-server

Once running you can use redis-benchmark to run some benchmarks.
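
For example, a quick run against the local instance (flags: -n total requests, -c parallel clients):

redis-benchmark -n 100000 -c 50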

redis doesn't ship with a config file; if you start it without one, it runs on default settings. But you're going to want to study the config options and set them up.

Sample redis config file here

Configuration options

A few important redis.conf options you're going to want to set:

* First, if you will only be connecting to a local Redis instance, uncomment the bind configuration in the sample file:


bind 127.0.0.1

That tells Redis not to listen for external connections.

* redis supports multiple databases, but for most use cases, you're only going to need one. Change the default 16 to 1:


databases 1

* You can set the maximum number of bytes Redis can allocate, after which it will start purging volatile keys. If it cannot reclaim any more memory it will start refusing write commands. Here's a sample setting for a 100MB limit:


maxmemory 104857600

* The server will periodically fork and asynchronously dump the current contents of the database to disk. The dump is actually made to a temporary file and then moved to replace any older dump, so the operation is atomic and won't leave you with a partially dumped database. If Redis is eventually shut down and reloaded, it will restore from this dump file.

How often it dumps the keys is configurable by the amount of time that passes and the number of changes that have been made to the data. For example, the following settings tell Redis to dump the database after 60 seconds if 100 changes have been made, or after five minutes if there has been at least 1 change:


save 300 1
save 60 100

* By default redis runs in the foreground. To run it as a daemon instead, change the daemonize option in your redis.conf file to "yes":


daemonize yes
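
Putting the pieces together, a minimal redis.conf for a single local, daemonized instance might look like this (just the directives covered above; the file path is up to you):

# minimal redis.conf sketch
bind 127.0.0.1
databases 1
maxmemory 104857600
save 300 1
save 60 100
daemonize yes

Then point the server at it on startup: /usr/local/bin/redis-server /path/to/redis.conf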

To Redis or not to Redis?

If you have a large data set that cannot comfortably fit into RAM, Redis is not the key value store for you, but if you have smaller sets and can live with the asynchronous write behavior, then, for me, the answer is definitely "to Redis."

As an alternative, Tokyo Cabinet is very fast for a synchronous key value store, and it does support some features that Redis does not, such as tables. Redis permits a master/slave setup, which can alleviate fears of data loss from failure, but it's not as certain as something like Tokyo Cabinet, which will write the data as soon as it gets it. On the other hand, Redis is blazingly fast, incredibly easy to use, and will support just about anything you can think of doing with your data.


More resources:

* Try redis in your browser
* Download redis
* Retwis - a PHP twitter clone using redis


How to hide .php extension in your urls with Nginx

Adrian Singer, November 4, 2010
Looking for clean urls (/hello instead of /hello.php)?

Here's how to set it up:

Step 1

Create a notfound.php script and place it in your web server's root folder


<?php
// These two come from register_globals in the original; set them explicitly
// so the script also works with register_globals off
$REQUEST_URI   = $_SERVER['REQUEST_URI'];
$DOCUMENT_ROOT = $_SERVER['DOCUMENT_ROOT'];

// Set this for easier access
$url = substr($REQUEST_URI, 1);

// Strip parameters
$url_parameters = '';
if (($pos = strpos($url, "?")) > 0)
{
   $url_parameters = substr($url, $pos+1);
   $url = substr($url, 0, $pos);
}
$url = trim(strtolower($url));

// Strip prefix and suffix '/'
if ($url[0] == '/') $url = substr($url, 1);
if (strlen($url) > 1)
   if ($url[strlen($url)-1] == '/') $url = substr($url, 0, strlen($url)-1);

// If url starts with .. it's a hack attempt
if (strcasecmp(substr($url, 0, 2), "..") == 0)
{
   $url = str_replace("..", "", $url);
}

// If we have a php script with this name
if (file_exists($url.".php"))
{
   // Set PHP_SELF and REQUEST_URI to point to the real script
   $_SERVER['PHP_SELF'] = $PHP_SELF = $_SERVER['REQUEST_URI'] = $REQUEST_URI = "/".$url;
   if (!empty($url_parameters)) $_SERVER['REQUEST_URI'] = $REQUEST_URI .= "?".$url_parameters;

   // Load real php script
   require($DOCUMENT_ROOT."/$url.php");
   return;
}

Step 2

Update your Nginx nginx.conf file so that any URL that doesn't map to an existing file is rewritten to notfound.php


    location /
    {
      if (-d $request_filename)
      {
        break;
      }
      if (!-f $request_filename)
      {
        rewrite ^(.*)$ /notfound.php?$1 last;
        break;
      }
    }

Note: This is different from using an error_page 404 redirect. With a 404 redirect, HTTP POST data is not preserved.
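
A quick way to confirm POST data survives the rewrite (example.com and the form field are placeholders; assumes a hello.php in the web root that reads $_POST):

curl -d "name=test" http://example.com/hello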
