1-page websites

I realized today that the apps I use every day are all 1-page websites. Gmail. Bloglines. Google search. Actually, Google is the king of the 1-page websites – almost all their products consist of only 1 page.

Twitter is 1-page, because there is 1 page that you spend 90% of your time on. Flickr is multipage, although its main function (viewing photos) is 1-page. Digg is 1-page. Mmm…

But what bothers me about these immersive worlds (isn’t there a better name?) is that they’re all supercommercial. Why would that be?

You know this problem in IA when you design sites without real content, and before you know it there are loads of excerpts all over the page that don’t really mean anything, ending in 3 dots “…”? It leads to a homepage like this one for example, just lots of excerpted content that doesn’t really do much for anyone.

I’ve got a word for that. Excerptitis. Maybe you have a better one?

Ah, I messed up the site, but now it works again, and THE COMMENTS WORK! (excuse the all-caps)

The comments work! Yey!

April was not only the hottest and the driest April ever in Belgium, but most likely (if there’s no rain tomorrow) it will also be the first month ever without any rain at all.

So far so good, life in Belgium :) The weather has been incredible.

A bunch of presentations on scaling websites: Twitter, Flickr, Bloglines, Vox and more.

(I changed the title because “top 10” posts are indeed sucky. Also: looking for my Colombia travel site?)

By the way, here’s the RSS feed of my blog, in case you’d like to subscribe.

I always love to read scaling discussions, especially about popular web apps, and there are loads of them out there. Here’s my overview of the best. By the way, the best book on scaling apps I’ve ever read is Building Scalable Websites, by Cal Henderson (the Flickr guy).

It’s dog-eared on my desk, and taught me about sharding (which I used extensively for mefeedia). Sharding is when you cut a really big table into pieces, so you can put those on separate servers. It means you have to make changes to your code, and your database isn’t so database-y anymore, but it works. For example, online games use sharding to grow their virtual worlds, because there’s no way they could serve all that information from 1 db cluster.
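To make the idea concrete, here is a minimal sketch of sharding by user ID. It is not how mefeedia or Flickr did it: the host names, the `shard_for` function and the `ShardedTable` class are all made up for illustration, and plain dicts stand in for the database servers.

```python
import hashlib

# Hypothetical database hosts; each one would hold a slice of the big table.
SHARDS = ["db1.example.internal", "db2.example.internal", "db3.example.internal"]


def shard_for(user_id, shards=SHARDS):
    """Pick the shard that holds this user's rows, using a stable hash."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


class ShardedTable:
    """One logical table split across several shards (dicts stand in for servers)."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)
        self.shards = {name: {} for name in self.shard_names}

    def put(self, user_id, row):
        # The application code, not the database, decides where the row lives.
        self.shards[shard_for(user_id, self.shard_names)][user_id] = row

    def get(self, user_id):
        return self.shards[shard_for(user_id, self.shard_names)].get(user_id)


table = ShardedTable(SHARDS)
table.put(42, {"name": "user42"})
print(shard_for(42), table.get(42))
```

This is also where the "not so database-y" part shows up: a join or a count across all users now means asking every shard. And with simple modulo routing like this, adding a shard moves most keys, which is why a lookup table from user to shard is a common alternative.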

Scaling Twitter with Ruby.

Twitter is hot today, and they ran into some serious scaling problems, although the app itself is quite simple: it consists of messages of at most 140 characters. The lessons are the same as for most apps: memcache like crazy, and optimize the database (the biggest bottleneck most of the time).
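"Memcache like crazy" usually boils down to the pattern below: look in the cache first, and only hit the database on a miss. This is a rough sketch, not Twitter's code. `TinyCache` is an in-process stand-in I made up so the example runs without a memcached server, and `FakeDatabase` and `get_timeline` are equally hypothetical.

```python
import time


class TinyCache:
    """In-process stand-in for a memcached client: get/set with an optional TTL."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = time.monotonic() + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)


class FakeDatabase:
    """Pretend database that counts how often it is queried."""

    def __init__(self):
        self.calls = 0

    def load_timeline(self, user_id):
        self.calls += 1
        return [f"message {n} for user {user_id}" for n in range(3)]


def get_timeline(cache, db, user_id):
    key = f"timeline:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: the database is not touched
    timeline = db.load_timeline(user_id)
    cache.set(key, timeline, ttl=60)  # keep it for a minute
    return timeline


cache, db = TinyCache(), FakeDatabase()
get_timeline(cache, db, 7)
get_timeline(cache, db, 7)
print(db.calls)  # the second call is served from the cache
```

The hard part in real life is invalidation: when a user posts a new message, the cached timeline has to be deleted or updated, otherwise people see stale data until the TTL runs out.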

Also, Ruby on Rails scales pretty much the same way as PHP and other similar languages: with a shared nothing architecture. Shared nothing means that no single thing is shared by all servers, since that would become a bottleneck.

PHP, for example, has shared nothing architecture out of the box, except perhaps for sessions, but that’s easily solved by storing sessions in a db (which then has its own scaling approach) and not in the filesystem. Here’s a talk by Rasmus Lerdorf that explains scaling with PHP5. (Here’s the mp3 audio recorded by Niall Kennedy).

Blaine Cook made this presentation:
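Here is a small sketch of the sessions-in-a-db idea, written in Python and not PHP so it runs on its own. The `DbSessionStore` and `WebServer` classes are invented for this example, and an in-memory SQLite database stands in for the shared session database. The point is only that any web server can handle any request, because no session lives on a particular machine's disk.

```python
import json
import sqlite3
import uuid


class DbSessionStore:
    """Keeps sessions in a database table so they are not tied to one web server."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )

    def create(self, data):
        session_id = uuid.uuid4().hex
        self.conn.execute(
            "INSERT INTO sessions (id, data) VALUES (?, ?)",
            (session_id, json.dumps(data)),
        )
        self.conn.commit()
        return session_id

    def load(self, session_id):
        row = self.conn.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


class WebServer:
    """A stateless web server: everything it knows about a user is in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, username):
        return self.store.create({"user": username})

    def whoami(self, session_id):
        session = self.store.load(session_id)
        return session["user"] if session else None


conn = sqlite3.connect(":memory:")  # stand-in for the shared session database
web1 = WebServer("web1", DbSessionStore(conn))
web2 = WebServer("web2", DbSessionStore(conn))

sid = web1.login("ann")   # the login request lands on web1
print(web2.whoami(sid))   # the next request lands on web2 and still works
```

Of course the session table is now the shared thing, which is why it needs its own scaling approach (sharding by session ID, or keeping sessions in memcached, for instance).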

Scaling Flickr.

Cal Henderson wrote the above book, and also has a good presentation: Scaling Flickr slides as PDF.

One of the problems you get into when scaling something like Flickr where you store LOTS of stuff, is that you can’t just store that on a hard drive anymore: it’s not big enough. Apart from just using Amazon’s S3 service (which rocks – I used it for mefeedia and I know lots of startups who use it), there are other solutions. A good presentation on that by Cal is this one:

Cal (he’s a busy dude) also made this presentation about scaling web apps, generally:

John Allspaw (flickr plumbr) also has a good presentation about scaling Flickr:

Scaling LiveJournal.

LiveJournal was one of the first social networks, before that word meant anything, and they’ve partly invented how to scale standard PHP/MySQL/Apache apps. They developed memcached, which is now used by almost everyone who wants to scale their site.

Brad Fitzpatrick has a good set of slides on how they evolved the service, here’s a PDF version. And here’s the slideshow embedded:

Kevin Rose mentioned this was “the bible for scaling Digg” – and I think quite a few other web apps are based on this.

Six Apart.

The LiveJournal guys with all their scaling expertise were acquired by Six Apart, and they soon launched Vox. And of course, here’s a presentation on making Vox scalable:

Bloglines.

Bloglines’ scaling problems were slightly different from your average web app, since they are an aggregator of feeds. That means they have billions of blog posts they have to keep and serve to users, and that creates its own scaling problems. The Bloglines approach was to store all that stuff in a special filesystem and not in a database. Today it’d be easier to do this since there are a few filesystems that do that, or you could just go with S3 again. Mark Fletcher (who also sold Onelist to Yahoo, which is now Yahoo Groups) has given a few talks on scaling Onelist and Bloglines: here’s the mp3 audio version, and here’s the PDF of that talk. And a text transcript.

Last.fm

Last.fm is one of the aggregation-type apps: they gather a lot of data about what music you listen to. Similarly to Bloglines, that causes its own scaling problems:

Slideshare.

All the slides in this post are hosted by Slideshare, an incredible service by my fellow information architect Rashmi Sinha and team. When I found out about the project, I emailed her: “brilliant and so obvious once you think of it”. Like many startups, they use S3 to serve their content, and they have the obligatory yet interesting slides to explain how:

I haven’t linked to lots of good thinking about scaling, or to technical resources and stuff. But the presentations should get you going in the world of memcached, Perlbal, shared nothing and federation :) Enjoy!

PS: See also How I Unexpectedly Found Myself Doing Consulting For Startups (this is a post on my “professional” site. I haven’t been able to figure out when to post here or there, any tips on that?).

Update: more presentations.

Another great talk in video this time, from the MySQL Bay Area Community Meetup, May 2007:

Finally, Dan Pritchett has a good presentation on scaling eBay (PDF). 26 billion SQL queries per day! 300+ new features per quarter! 4 architecture versions since 1998 and some pretty crazy scaling of the search.

New: presentation on how Facebook uses PHP APC cache (PDF).

A talk on YouTube scalability: “In the summer of 2006, they grew from 30 million pages per day to 100 million pages per day, in a 4 month period. Thumbnails turn out to be surprisingly hard to serve efficiently. (I ran into this with mefeedia too, luckily Amazon S3 came to the rescue by then.)” YouTube uses Python, Apache, MySQL, Memcached.

NEW: Front end scaling is important too, and often ignored. Here’s a good presentation from the Yahoo guys:

Microsoft’s profits continue to be staggering: with a quarterly revenue of $14.4 billion, it takes Microsoft only:

  • 10 hours or so (yes, hours!) to exceed Red Hat’s quarterly net income of $20.5 million.
  • four days to exceed Research In Motion’s quarterly net income of $187.9 million.
  • four days to exceed Starbucks’ quarterly net income of $205 million.
  • one week to exceed Nike’s quarterly net income of $350.8 million.
  • two weeks to exceed McDonald’s quarterly net income of $762 million.
  • two weeks to exceed Apple’s quarterly net income of $770 million.
  • 18 days to exceed Google’s quarterly net income of $1 billion.
  • 23 days to exceed Coca-Cola’s quarterly net income of $1.26 billion.
  • five weeks to exceed IBM’s quarterly net income of $1.85 billion.
  • 10 weeks to exceed Wal-Mart’s quarterly net income of $3.9 billion.
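The arithmetic behind a list like this is easy to check. The sketch below is just that, a sketch: `days_to_exceed` is a made-up helper, a quarter is rounded to 90 days, and the only Microsoft figure quoted above is the $14.4 billion revenue. The times in the list look much closer to what you get when you plug in a quarterly net income of roughly $4.9 billion, but that number is an assumption, not something stated in this post.

```python
def days_to_exceed(target, quarterly_amount, days_in_quarter=90):
    """Days needed to accumulate `target` at a steady daily rate."""
    daily_rate = quarterly_amount / days_in_quarter
    return target / daily_rate


MICROSOFT_REVENUE = 14.4e9        # quarterly revenue, as quoted above
ASSUMED_NET_INCOME = 4.93e9       # ASSUMPTION: not stated in this post

targets = {
    "Red Hat": 20.5e6,
    "Google": 1.0e9,
    "Wal-Mart": 3.9e9,
}

for company, net_income in targets.items():
    by_revenue = days_to_exceed(net_income, MICROSOFT_REVENUE)
    by_profit = days_to_exceed(net_income, ASSUMED_NET_INCOME)
    print(f"{company}: {by_revenue:.1f} days of revenue, {by_profit:.1f} days of assumed profit")
```

With the assumed profit figure, Red Hat comes out at about 9 hours and Google at about 18 days, in line with the list. With revenue, Red Hat would take only about 3 hours.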

What’s wrong with the workhack todo list: it makes todo items that are done disappear. I like to see what I’ve accomplished, to get that feeling of satisfaction, of knowing you’ve done at least *something* the past 2 days.

I hear mefeedia is doing great, numbers continue to grow fast. It’s very satisfying to see that the people I sold it to are continuing to build it out in the original spirit.