
Optimizing NGINX and PHP-fpm for high traffic sites

Adrian Singer, April 20    --    Posted under Programming
After 7 years of using NGINX with PHP, we've learned a few things about how best to optimize NGINX and PHP-fpm for high-traffic sites.

Below is a collection of tips and recommendations:

1. Switch from TCP to UNIX domain sockets

UNIX domain sockets offer better performance than TCP sockets over the loopback interface (less copying of data, fewer context switches).

One thing to keep in mind: UNIX sockets are only reachable by programs running on the same server (there's no network support, obviously).


upstream backend
{
    # UNIX domain socket
    server unix:/var/run/fastcgi.sock;

    # TCP socket alternative
    # server 127.0.0.1:8080;
}
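For the socket to work, php-fpm has to be listening on the same path and nginx must have permission to connect. A matching pool sketch on the php-fpm side might look like this (the path, user and mode are illustrative, not from the original post; newer php-fpm releases use this ini-style syntax, while the XML style shown later in this post is the older format):

```ini
; php-fpm pool sketch (values are examples; align owner/group with the nginx user)
listen = /var/run/fastcgi.sock
listen.owner = www
listen.group = www
listen.mode = 0660
```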

2. Adjust Worker Processes

Modern hardware is multiprocessor and NGINX can leverage multiple physical or virtual processors.

In most cases your web server machine will not be configured to handle multiple workloads (such as acting as a web server and a print server at the same time), so you will want to configure NGINX to use all available processors, since NGINX worker processes are not multi-threaded.

You can determine how many processors your machine has by running:

On Linux -


cat /proc/cpuinfo | grep processor

On FreeBSD -


sysctl dev.cpu | grep location

Set the worker_processes in your nginx.conf file to the number of cores your machine has.

While you're at it, increase the number of worker_connections (how many connections each core should handle) and set "multi_accept" to ON, as well as "epoll" if you're on Linux:


# We have 16 cores
worker_processes 16;

# connections per worker
events
{
    worker_connections 4096;
    multi_accept on;
    use epoll; # Linux only; nginx picks the best method automatically if omitted
}
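If you'd rather not hard-code the core count, it can be derived at deploy time; getconf works on both Linux and FreeBSD (a sketch, not from the original post):

```shell
# Count online processors and emit a matching worker_processes directive
cores=$(getconf _NPROCESSORS_ONLN)
echo "worker_processes $cores;"
```

You can redirect the output into a config snippet that nginx.conf includes.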

3. Setup upstream load balancing

In our experience, multiple upstream backends on the same machine produce higher throughput than a single one.

For example, if you're looking to support 1,000 max children, divide that number across two backends, letting each handle 500 children:


upstream backend
{
    server unix:/var/run/php5-fpm.sock1 weight=100 max_fails=5 fail_timeout=5;
    server unix:/var/run/php5-fpm.sock2 weight=100 max_fails=5 fail_timeout=5;
}

Here are the two pools from php-fpm.conf:


 
<section name="pool">

    <value name="name">www1</value>
    <value name="listen_address">/var/run/php5-fpm.sock1</value>

    <value name="listen_options">
        <value name="backlog">-1</value>
        <value name="owner"></value>
        <value name="group"></value>
        <value name="mode">0666</value>
    </value>

    <value name="user">www</value>
    <value name="group">www</value>

    <value name="pm">
        <value name="style">static</value>
        <value name="max_children">500</value>
    </value>

    <value name="rlimit_files">50000</value>
    <value name="rlimit_core">0</value>
    <value name="request_slowlog_timeout">20s</value>
    <value name="slowlog">/var/log/php-slow.log</value>
    <value name="chroot"></value>
    <value name="chdir"></value>
    <value name="catch_workers_output">no</value>
    <value name="max_requests">5000</value>
    <value name="allowed_clients">127.0.0.1</value>

    <value name="environment">
        <value name="HOSTNAME">$HOSTNAME</value>
        <value name="PATH">/usr/local/bin:/usr/bin:/bin</value>
        <value name="TMP">/usr/tmp</value>
        <value name="TMPDIR">/usr/tmp</value>
        <value name="TEMP">/usr/tmp</value>
        <value name="OSTYPE">$OSTYPE</value>
        <value name="MACHTYPE">$MACHTYPE</value>
        <value name="MALLOC_CHECK_">2</value>
    </value>

</section>

<section name="pool">

    <value name="name">www2</value>
    <value name="listen_address">/var/run/php5-fpm.sock2</value>

    <value name="listen_options">
        <value name="backlog">-1</value>
        <value name="owner"></value>
        <value name="group"></value>
        <value name="mode">0666</value>
    </value>

    <value name="user">www</value>
    <value name="group">www</value>

    <value name="pm">
        <value name="style">static</value>
        <value name="max_children">500</value>
    </value>

    <value name="rlimit_files">50000</value>
    <value name="rlimit_core">0</value>
    <value name="request_slowlog_timeout">20s</value>
    <value name="slowlog">/var/log/php-slow.log</value>
    <value name="chroot"></value>
    <value name="chdir"></value>
    <value name="catch_workers_output">no</value>
    <value name="max_requests">5000</value>
    <value name="allowed_clients">127.0.0.1</value>

    <value name="environment">
        <value name="HOSTNAME">$HOSTNAME</value>
        <value name="PATH">/usr/local/bin:/usr/bin:/bin</value>
        <value name="TMP">/usr/tmp</value>
        <value name="TMPDIR">/usr/tmp</value>
        <value name="TEMP">/usr/tmp</value>
        <value name="OSTYPE">$OSTYPE</value>
        <value name="MACHTYPE">$MACHTYPE</value>
        <value name="MALLOC_CHECK_">2</value>
    </value>

</section>

4. Disable access log files

This can make a big impact, because access logging on a high-traffic site involves a lot of I/O that has to be synchronized across all worker processes.


access_log off;
log_not_found off;
error_log /var/log/nginx-error.log warn;

If you can't afford to turn off access log files, at least buffer them:


access_log /var/log/nginx/access.log main buffer=16k;

5. Enable GZip


gzip on;
gzip_disable "msie6";
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_min_length 1100;
gzip_buffers 16 8k;
gzip_http_version 1.1;
gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
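To get a feel for what the compression level buys you, you can compare gzip levels on a sample payload from the command line (a rough illustration; real HTML compresses differently):

```shell
# Compare gzip -1 and -6 output sizes for a repetitive sample payload
payload=$(printf 'The quick brown fox jumps over the lazy dog. %.0s' $(seq 1 2000))
size1=$(printf '%s' "$payload" | gzip -1 | wc -c | tr -d ' ')
size6=$(printf '%s' "$payload" | gzip -6 | wc -c | tr -d ' ')
echo "level 1: ${size1} bytes, level 6: ${size6} bytes"
```

Level 6 is a good balance; higher levels cost noticeably more CPU for marginal gains.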

6. Cache information about frequently accessed files


open_file_cache max=200000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;

7. Adjust client timeouts


client_max_body_size 500M;
client_body_buffer_size 1m;
client_body_timeout 15;
client_header_timeout 15;
keepalive_timeout 2 2;
send_timeout 15;
sendfile on;
tcp_nopush on;
tcp_nodelay on;

8. Adjust output buffers


fastcgi_buffers 256 16k;
fastcgi_buffer_size 128k;
fastcgi_connect_timeout 3s;
fastcgi_send_timeout 120s;
fastcgi_read_timeout 120s;
reset_timedout_connection on;
server_names_hash_bucket_size 100;

9. /etc/sysctl.conf tuning

Note that the sysctl names below are FreeBSD-specific; Linux uses a different namespace (e.g. net.ipv4.*).


# Recycle Zombie connections
net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.maxtcptw=200000

# Increase number of files
kern.maxfiles=65535
kern.maxfilesperproc=16384

# Increase page share factor per process
vm.pmap.pv_entry_max=54272521
vm.pmap.shpgperproc=20000

# Increase number of connections
vfs.vmiodirenable=1
kern.ipc.somaxconn=3240000
net.inet.tcp.rfc1323=1
net.inet.tcp.delayed_ack=0
net.inet.tcp.restrict_rst=1
kern.ipc.maxsockbuf=2097152
kern.ipc.shmmax=268435456

# Host cache
net.inet.tcp.hostcache.hashsize=4096
net.inet.tcp.hostcache.cachelimit=131072
net.inet.tcp.hostcache.bucketlimit=120

# Increase number of ports
net.inet.ip.portrange.first=2000
net.inet.ip.portrange.last=100000
net.inet.ip.portrange.hifirst=2000
net.inet.ip.portrange.hilast=100000
kern.ipc.semvmx=131068

# Mitigate Ping-flood attacks
net.inet.tcp.msl=2000
net.inet.icmp.bmcastecho=1
net.inet.icmp.icmplim=1
net.inet.tcp.blackhole=2
net.inet.udp.blackhole=1

10. Monitor

Continually monitor the number of open connections, free memory and number of waiting threads.

Set alerts to notify you when thresholds are exceeded. You can build these alerts yourself, or use something like ServerDensity.

Be sure to install the NGINX stub_status module. You'll need to recompile NGINX -


./configure --with-http_ssl_module --with-http_stub_status_module --without-mail_pop3_module --without-mail_imap_module --without-mail_smtp_module
make install BATCH=yes
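Once compiled in, stub_status still has to be exposed in your config. A minimal, locally-restricted sketch (the listen port and location name here are examples, not from the original post):

```nginx
server {
    listen 127.0.0.1:8080;

    location /nginx_status {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        deny all;
    }
}
```

A `curl http://127.0.0.1:8080/nginx_status` then reports active connections, accepts and the reading/writing/waiting counters, which is exactly what a monitoring agent needs to poll.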

How to install htop

Adrian Singer, April 20    --    Posted under Programming
htop is an interactive process viewer for Linux, replacing the traditional top.

Why htop?

htop provides a more interactive process-viewing experience. You can scroll through running processes, both horizontally and vertically, to reveal information that would otherwise be clipped, such as full command lines.

You can see which files a process has open (press "l"); you can even trace a process with strace (press "s"). There's also a handy tree view for understanding process ancestry.

Installing on Linux

CentOS:


yum install htop

Debian:


sudo apt-get install htop

Installing on FreeBSD

Step 1:

Add the following line to /etc/fstab:


linproc /compat/linux/proc linprocfs rw 0 0

Step 2:

Create a symbolic link for /usr/compat


mkdir -p /usr/compat/linux/proc ; ln -s /usr/compat /compat ; mount linproc

Step 3:

Compile and install


cd /usr/ports/sysutils/htop
make install

Passing parameters from DoubleClick tag to landing page

Adrian Singer, February 23    --    Posted under Traffic
This took a while to figure out, so hopefully it helps others -

The goal is generating a DoubleClick tag, that will take additional parameters and pass them to the destination landing page.

Step 1

Login to your DoubleClick account, click on Campaigns, then select the Ad



Step 2

Scroll down and open the Landing page URL suffix section



Enter the URL key you'd like to populate into the URL suffix box, in this format:

key=%pkey=!;

This will tell DoubleClick you want to pass "key" to the landing page, populating it with the value of key sent to the tag.

If you want to populate more than one key, use this format:

key=%pkey=!&key2=%pkey2=!&

Step 3

Click on the 'Tags' button at the top, then export tags. The tags you want will be under the "Iframes/Javascript Tag" column.

The Art of managing Freelancers - Part 2 - Qualifying the bid

Mike Peters, January 24    --    Posted under Basics
Congratulations!

You created a rock solid specification document and posted it on ODesk.com, Elance.com, Freelancer.com or any of the other freelancer sites.

Within a matter of hours, you have dozens of bids from software engineers around the world.

All of them carefully reviewed your specification document and made sure it's a perfect fit for their skills before submitting their bid, right?

Wrong!

Canned Bids

Winning projects as a freelancer is a numbers game. The more projects you bid on, the more likely you are to win a few.

Based on this logic, you should try to bid on as many projects as possible.

Unfortunately, a lot of freelancers abuse the system, creating scripts that automatically submit canned bids to all new projects in their category.

Take a closer look at the bids you received. Do they mention anything specific about -your- project? Or could the same bid very well apply to any other project in this category?

Eliminating canned bids

In most cases, you don't want to work with a freelancer who didn't even take the time to read your specification document and instead submitted a canned bid, without stopping to think whether the project is a good fit for their skill set.

One trick I use to easily detect and eliminate canned bids is adding a request in the body of the specification document that goes something like this:


When you bid on this project, start your bid with "loglr.com is great", so that I know you actually took the time to read this spec. This helps me eliminate canned bids.

If the freelancer's bid doesn't include this text, I immediately know it's a canned bid.

Sometimes, even though they didn't include the text, a freelancer might catch my attention with a stellar feedback history. At that point, I give them a simple test that usually goes like this:



Shortlisting

After you get rid of the canned bids, you're left with freelancers who took the time to carefully review the specification document.

Open a message board and ask them the following questions -

1. Do you have experience creating similar apps/websites?
2. Can you share with me links to your previous work?
3. Do you have experience working with the programming language and platform I described in the document?
4. Who will be doing the actual work?
5. What milestones do you suggest?
6. How much time do you expect it will take to complete this project?
7. How will we communicate during this time?

If the freelancer's answers to all of the above questions are satisfactory, shortlist them.

Pay close attention to their communication skills: how long it takes between each reply, how detailed the answers are and how well you understand them.

Communication is key.

Pick the best two or three

Narrow down the best matches to a list of two or three freelancers.

You are going to award the project to all of them at the same time.

The freelancer who produces the best result will be the one you continue with on the following projects.

Yes - I am suggesting you actually award and pay two or three freelancers.

Think of it as a "cost of doing business". It's an essential price to pay for finding the perfect fit for your needs.

You'll thank me later.

The Art of managing Freelancers - Part 1 - The specifications document

Mike Peters, January 24    --    Posted under Basics
If you were building a new home, would you jot down your ideas for the house in a short Word document, then send them over to construction workers and expect everything to work out well?

Of course not.

And yet, when it comes to software development, most people do exactly this.

They expect a simple, unabridged outline of their thoughts to be all an accomplished software engineer needs to get the job done.

Architects and Builders

Software engineers need a detailed specification document to work by.

In the real world, when you build a house, you would start with an architect. Your architect's job is to translate your vision into a diagram, with detailed measurements and notes about which materials to use.

A builder then takes the diagram prepared by the architect, orders the required materials, creates a work plan, rounds up the construction workers to do the work and manages them on a day-to-day basis, until the work is completed to your satisfaction.

Product Managers and Project Managers

In the software development world, these two roles are played by a Product Manager and a Project Manager.

The Product Manager's role is to translate your vision into a language software engineers can understand, using very detailed descriptions, screenshots and visuals, leaving no room for interpretation.

This document is called a Specification Document. Getting it right is the first step in ensuring your project will be a success.

The Project Manager's role is to manage the software engineers until they successfully build a working program that works, looks and feels exactly as described in the specification document.

The Specification document

Ideally, have a professional help you prepare the specification document.

Try to reach out to a Product Manager who has experience preparing such documents.

If you don't have access to a Product Manager, your next best option is to reach out to an experienced software engineer and ask for his/her help.

A good specification document should at minimum include these parts:

1. Overview

Explain what this project is about. Who are the users, what function does it fill and what is the value proposition.

2. Goals

Detailed list of goals. Be as specific as possible about what you want to accomplish with this project.

Include screenshots and examples of similar websites/services as necessary.

3. Technical specification

Define the programming language you want the software engineers to use, the target platform, and the tools they should use.

4. Process flow

Step-by-step description of all the screens a typical end user will go through.

Include screenshots, or mockups - very important!

5. Technical architecture & coding convention

Explain the architecture of the product, which modules you want to have created and how they should interact with each other.

Provide a link to your preferred coding convention, so that it's easier to later manage the code.

6. Budget and timeframe

Your estimated budget and target timeframe for the project.
Ask the software engineers to confirm they can meet these, or work with you on reaching a mutually agreed-upon budget and timeframe, before the project starts.

7. Communication

Include contact information for your project manager (probably you) and be clear about expecting daily updates from the engineer, outlining the progress of the work.

Make yourself available via Skype, email, or phone.

Be sure you are responsive and insist on getting frequent progress reports, along with links to review the project at every milestone.

How to tell if you got it right

Similar to an architectural diagram, a good specification document leaves no room for interpretation.

Two different qualified software development teams should be able to produce the same result if the specification document is sufficiently detailed.

How to delete files when argument list too long

Mike Peters, January 23    --    Posted under Programming
Ever try to delete a lot of files in a folder, only to have the operation fail with "Argument list too long"?

Here's how to get it done:

FreeBSD

find . -name "*.pdf" -print0 | xargs -0 rm

Note that this is a recursive search and will find (and delete) files in subdirectories as well.

Linux

find . -maxdepth 1 -name "*.pdf" -print0 | xargs -0 rm
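A self-contained way to see the pattern in action (the directory and file names here are throwaway examples):

```shell
# Create a pile of files, then remove them with find | xargs,
# which never builds one huge argument list
dir=$(mktemp -d)
for i in $(seq 1 500); do : > "$dir/file$i.pdf"; done
find "$dir" -maxdepth 1 -name "*.pdf" -print0 | xargs -0 rm
remaining=$(find "$dir" -maxdepth 1 -name "*.pdf" | wc -l | tr -d ' ')
echo "files left: $remaining"
rmdir "$dir"
```

The -print0/-0 pair keeps file names with spaces or newlines intact, and xargs batches the arguments so ARG_MAX is never exceeded.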



Tracking Retention in Google Analytics

Mike Peters, January 29, 2013    --    Posted under Analytics
Tracking retention rates, or cohort analysis, is the best way to visualize your site's addict-ability.

User retention tends to be the area that gets the least attention, but I think it is one of the most important to monitor. I would argue that the single most telling metric for a great product is how many of its users become dedicated, repeat users.

If you fail to retain users over time, traffic will never generate a "snowball" effect. Your glass ceiling becomes limited to the arbitrage difference between the cost of traffic and ad revenues.

Focus on continually improving your retention rates and you'll be well on your way to building a mega successful site.

The first step is to monitor your retention numbers. As they say - "What doesn't get measured doesn't get done".

The goal is to have the data you need to generate a cohort report like this one:



In this post, I'll describe how to use Google Analytics, for cohort analysis.

Don't be fooled by Google's Returning visitors numbers

Google Analytics appears to provide information about visitor retention through the New vs. Returning visitors report.

That report shows you the proportion of returning visitors. You could set your date range for January, note the percentage of returning visitors, and then set the date range for February, hoping that the percentage of returning visitors has increased.

But what happens if you retain all your January visitors, but drive a ton of new visitors to the site in February? Your proportion of returning-to-new visitors will go down even though you're retaining visitors!

Additionally, if your funnel involves users temporarily leaving the site (to "Facebook-Connect", for example), Google Analytics could confuse that with a user abandoning the site.

Tagging visitors

For proper cohort analysis, we need a way to "tag" users, segmenting them into groups based on the date of first visit.

We'll then be able to look at the group of new users generated on a given month and see how long they stuck around:



Google Analytics custom variables and events let us put it all together.

Step 1 - Install Google Analytics

Sign up for Google Analytics (it's free) and create an account for your site.

Add the Google Analytics code to all pages on your site.

Generally this means adding the code to your footer include file.

Step 2 - Pulse script

Save this script under the root folder of your site:

pulse.php

<?php
putenv('TZ=America/New_York');
header('Content-type: application/json');

die("if(typeof pulseCallback=='function')".
    ' pulseCallback({"pulse":"'.date('Y-m-d').'"});');

Note that I'm assuming your server supports PHP. If you're using a different server-side scripting language, find a geek who can help or contact us.
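For reference, the same JSONP response can be sketched in shell, CGI-style (purely illustrative, not from the original post):

```shell
# Emit the same JSONP payload pulse.php produces
today=$(TZ=America/New_York date +%Y-%m-%d)
body="if(typeof pulseCallback=='function') pulseCallback({\"pulse\":\"$today\"});"
printf 'Content-type: application/json\r\n\r\n%s\n' "$body"
```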

Step 3 - Store custom variable in Google Analytics

Add this javascript code to the footer of all pages:
function jsCreateCookie(name,value,days){if(days){var date=new Date();date.setTime(date.getTime()+(days*24*60*60*1000));var expires="; expires="+date.toGMTString()}else var expires="";document.cookie=name+"="+value+expires+"; path=/"}
function jsReadCookie(name){var nameEQ=name+"=";var ca=document.cookie.split(';');for(var i=0;i!=ca.length;i++){var c=ca[i];while(c.charAt(0)==' ')c=c.substring(1,c.length);if(c.indexOf(nameEQ)==0)return c.substring(nameEQ.length,c.length)}return null}
function jsGetCookie(cvar, cval) { if (cval == undefined) cval = ''; return (jsReadCookie(cvar) != null && jsReadCookie(cvar) != '') ? jsReadCookie(cvar) : cval; }

try {
  var checkpulse = jsGetCookie('checkpulse');
  if (!checkpulse) {
    // New user: fetch today's date from the server
    var s = document.createElement('script'); s.type = 'text/javascript';
    s.src = 'http://YOURDOMAIN.com/pulse.php'; s.async = true;
    (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(s);
  } else {
    var channel = ""; // Update when using channels
    $(document).ready(function () { _gaq.push(['_trackEvent', 'Pulse', checkpulse, 'Channel #' + channel]); });
  }
} catch (err) {}

function pulseCallback(o) {
  jsCreateCookie('checkpulse', o.pulse, 365);
}

What we're doing here: on every page load, we check whether the user has already been tagged with a "create date".

If the user isn't tagged yet, it's a new user: we call the server's pulse.php script and fetch today's date, which is then stored in a local cookie named "checkpulse".

Finally, we pass the event to Google Analytics, incrementing a count for the user's "create date".

Crunching the numbers

Congratulations! Now you can finally track your real retention rates and build a cohort chart.

Log in to Google Analytics, open "Traffic Sources", then "Events" - "Top Events".

Select the date range starting with the first month you're tracking (January in the example above) through today. Change the period to "Month" and select "Unique Events":



Hover over the chart and record the numbers for each of the months.



Download the sample Cohort report and populate it with your data. Pay attention to your users' average lifetime and to variations in month-to-month retention rates.
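To make the arithmetic concrete (the numbers below are made up): a month's retention is that month's unique events from the cohort divided by the cohort's size in its first month.

```shell
# Hypothetical January cohort of 1000 new users
cohort_size=1000
feb_uniques=420   # unique "Pulse" events from this cohort in February
mar_uniques=310   # ...and in March
feb_retention=$(( 100 * feb_uniques / cohort_size ))
mar_retention=$(( 100 * mar_uniques / cohort_size ))
echo "Feb: ${feb_retention}%  Mar: ${mar_retention}%"
```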

Know your numbers.

Use your cohort report as a compass for whether or not you've created a great product that users love.