16th May
written by simplelight

Spent some time tuning my MySQL database for a small website (~2K users per day). MySQL Tuner was recommending that we increase the size of the query cache above 16M but we were dubious. The relevant metrics according to this article are:

  • Hit rate    = Qcache_hits / (Qcache_hits + Com_select)
  • Insert rate = Qcache_inserts / (Qcache_hits + Com_select)
  • Prune rate  = Qcache_lowmem_prunes / Qcache_inserts

In our case we had gathered the following stats over a 48 hour period:

| Com_select               |  1163740 |
| Qcache_hits              |   531650 |
| Qcache_inserts           |  1021165 |
| Qcache_lowmem_prunes     |    82507 |
| Qcache_not_cached        |   142575 |
| Qcache_queries_in_cache  |     2145 |
| Qcache_total_blocks      |     5643 |
| Qcache_free_blocks       |     1175 |
| Qcache_free_memory       | 11042672 |

So for our database:

  • Hit rate    = 31%
  • Insert rate = 60%
  • Prune rate  =  8%

We’re not too sure what to make of this. A hit rate of 31% doesn’t seem too bad, but our insert rate is also quite high. For now, we’re leaving the query cache as is, especially since the comments in the post mentioned above suggest that making it larger than 20M is futile.
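For reference, the rates above are just the formulas at the top of the post applied to the raw counters, which is easy to sanity-check with awk:

```shell
# Plug the 48-hour counters into the query cache formulas
awk 'BEGIN {
  hits    = 531650; com_select = 1163740
  inserts = 1021165; prunes    = 82507
  printf "Hit rate:    %.0f%%\n", 100 * hits / (hits + com_select)
  printf "Insert rate: %.0f%%\n", 100 * inserts / (hits + com_select)
  printf "Prune rate:  %.0f%%\n", 100 * prunes / inserts
}'
# → Hit rate: 31%, Insert rate: 60%, Prune rate: 8%
```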

8th November
written by simplelight

It is a great time to be a web software developer. Over the last decade, the components of web development that offer little strategic advantage to a startup have gradually been eliminated and outsourced, to such an extent that today the gap between writing code and deploying a new application is often bridged with a single click.

Whereas ten years ago deploying a new application required provisioning a new server, installing Linux, setting up MySQL, configuring Apache, and finally uploading the code, the process today has dramatically less friction. On Heroku, a single command is now all that stands between a team of developers and a live application:

> git push heroku master

Let’s take a closer look at what is happening. The code residing in the repository is uploaded directly to, in this example, Heroku’s cloud platform. From that point onward, the long list of tasks involved in maintaining and fine-tuning a modern web stack is outsourced. The platform provider handles hard drive failures, exploding power supplies, denial-of-service attacks, router replacement, server OS upgrades, security patches, web server configuration … and everything in between.

The implications of this trend are bound to be far-reaching. As common infrastructure is outsourced to vendors such as Amazon, Rackspace, and Google, the base of customers for hardware and stack software will become increasingly concentrated. As the platform vendors function both as curators and distributors of middleware for associated services such as application monitoring and error logging, new monetization opportunities will arise for the companies, such as New Relic, providing these tools.

Just as the arrival of open-source blogging platforms eliminated the intervening steps between writers and audiences, so the new breed of platforms has reduced the friction between developers and their customers.

Most importantly, though, the barriers for new private companies to compete have been permanently lowered. Today, $100 per month can buy you a billion-dollar data center.

19th July
written by simplelight

You can obtain a list of the ten oldest processes like this:

ps -elf | sort -r -k12 | head -n 10

To sort processes by memory usage, press Shift+M while top is running.

Press ‘c’ to toggle display of the full command path.

The top man page covers other useful configurations.
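As an aside, on Linux ps can sort by elapsed time directly, which avoids relying on the textual ordering of the STIME column (etimes and --sort are GNU procps features, so this may not work on other Unixes):

```shell
# etimes = seconds since the process started; sort descending so the
# oldest processes come first (head -n 11 keeps the header plus ten rows)
ps -eo etimes,pid,user,comm --sort=-etimes | head -n 11
```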

15th February
written by simplelight

Stacy Smith, Intel’s CFO, has some interesting data on the tipping point for PC market penetration. As the cost of a PC in a region falls from multiple years of income toward 8 weeks of income, penetration climbs from zero to about 15%. Once the cost drops below 8 weeks of income, penetration rises very rapidly to 50%.

According to Smith, the cost of a PC in both India and China is now below 8 weeks of income in those countries.

27th January
written by simplelight

Facebook isn’t often cited as a cloud computing company since the ‘Social’ moniker has proven to be stickier. It does, however, meet the common definition of ‘Cloud’ i.e. the management of the hardware is highly abstracted from its users, the infrastructure is highly elastic, a variety of services (billing, authentication etc.) are bundled, and the underlying hardware is geographically dispersed.

What is fascinating is that Facebook, more than other cloud companies, gives us a glimpse into a future where computing and storage are virtually free and ubiquitous. With $2 billion in revenue for 2010 and about 500M users, Facebook has revenue of roughly $4 per user. With some back-of-the-envelope math, it seems likely that the variable cost for each additional user is about $1 per year. Think of the services that Facebook is providing its users for $1. Unlimited photo storage and sharing. A contact database. Email. Instant messaging. A gaming platform.
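The back-of-the-envelope revenue figure works out like this:

```shell
# $2B of 2010 revenue spread over ~500M users
awk 'BEGIN {
  revenue = 2.0e9
  users   = 500e6
  printf "revenue per user: $%.0f/year\n", revenue / users
}'
# → revenue per user: $4/year
```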

The economics in the consumer cloud are compelling. They will become more so over time, and as large enterprises realize that there is no strategic value in common IT, there will be a similar shift for businesses.

12th June
written by simplelight

One of the major issues with large data centers is power. This applies both to the large data centers run by Microsoft and Google and to large enterprise data centers, which tend to be very energy inefficient.

Definition of Power Effectiveness: Data Center Power Usage Effectiveness (PUE) is defined as the ratio of total data center power draw to IT (server) power draw. Thus a PUE of 2.0 means that the data center must draw 2 watts for every 1 watt of power consumed by IT (server) equipment. The ideal number would be 1.0, which means there is zero overhead. The overhead power is used by lighting, power delivery, UPS, chillers, fans, air conditioning, etc. Google claims to have achieved a PUE of 1.3 to 1.7. Microsoft runs somewhere close to 1.8. Most of corporate America runs between 2.0 and 2.5.

A typical large data center these days costs in the range of $150 million to $300 million, depending on size and location. A 15 MW data center facility is approximately $200 million. This is a capital cost, so it is depreciated over time.

Most of the facility cost is power related: anywhere from 75% to 80% of the cost goes to power infrastructure (PDUs, chillers, UPS, etc.).

A typical 15 MW data center with 50,000 servers costs about $6.0 million per month in operating expense (excluding people cost). The share of that going to power infrastructure (PDUs, chillers, UPS, etc.) is between 20% and 24%, and actual power for the servers is another 18% to 20%, so the total power cost is between 38% and 44%. These numbers reflect what Microsoft or Google would achieve; the EPA has done a study and believes the figures are closer to 50% for inefficient data centers.
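A quick sketch of that arithmetic, taking the midpoints of the quoted ranges (22% infrastructure, 19% server power) against the $6.0M monthly operating expense:

```shell
awk 'BEGIN {
  opex         = 6.0e6   # monthly operating expense (excluding people)
  infra_share  = 0.22    # power infrastructure share (midpoint of 20-24%)
  server_share = 0.19    # server power share (midpoint of 18-20%)
  printf "power infrastructure: $%.1fM/month\n", opex * infra_share / 1e6
  printf "server power:         $%.1fM/month\n", opex * server_share / 1e6
  printf "total power share:    %.0f%%\n", (infra_share + server_share) * 100
}'
# → $1.3M/month infrastructure, $1.1M/month server power, 41% total
```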

10th June
written by simplelight

If you use Google Analytics’ Site Overlay functionality, it occasionally results in a white or gray haze over your website which prevents you from clicking on any of the links.

The good news is that your browser is the only one affected (none of your customers will see the same effect). All you have to do to fix the problem on your end is clear your cookies (specifically a cookie called GASO).

28th May
written by simplelight

If you want to share video and visual information from your desktop you should check out Dyyno. They have combined some pretty cool video compression technology with a peer-to-peer networking layer and the result is very slick.

Their technology provides the plumbing for Xfire’s live video service. It’s still in beta but if you need a WoW fix, that’s the site to visit.

7th February
written by simplelight

The following gems and plugins are the most popular as of Nov 12th, 2008:

  • Javascript Framework: jQuery (56%), Prototype (43%)
  • Skeleton: Bort
  • Mocking: Mocha
  • Exception Notification: Hoptoad
  • Full text search: Thinking Sphinx
  • Uploading: Paperclip
  • User authentication: Restful_authentication (keep an eye on Authlogic)
  • HTML/XML Parsing: Hpricot
  • View Templates: Haml

New Relic has a good article on the state of the Rails stack.

1st July
written by simplelight

Every now and then it is forcefully driven home to me that Linux is not yet ready for mass adoption. I have been trying to set up the back/forward buttons on my mouse in Feisty Fawn. There is no reason why this should be difficult, but the official instructions are alarmingly non-deterministic! Exhortations to “experiment” are just plain annoying. Plug and Play (TM) might not be perfect, but it gets the job done most of the time.