Sunday, November 14th, 2004
I had previously scraped my RSS numbers together by extrapolating from Bloglines subscriber data, but today’s Boing Boing article on RSS bandwidth includes a helpful link to Boing Boing’s own Web stats page.
File type | Hits | % of hits | Bandwidth | % of bandwidth
xml [1] | 2230472 | 11.1 % | 46.36 GB | 22 %
html | 1924753 | 9.6 % | 43.47 GB | 20.6 %
rdf [2] | 225715 | 1.1 % | 4.11 GB | 1.9 %

[1] from /rss.xml and /atom.xml
[2] from /index.rdf
That’s right: they’ve served about 50 GB of RSS/Atom data so far this month (roughly 3.6 GB/day, since we’re only halfway through November). Straight HTML accounts for only about 43.5 GB over the same period, which is surprising: RSS readers are pulling down more bits of text and markup than web browsers.
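As a quick check on that arithmetic, here’s a minimal Python sketch that adds up the feed rows from the stats table and divides by the days elapsed. The 14-day figure is my assumption (taken from the date of this post), and the variable names are purely illustrative.

```python
# Bandwidth figures (GB) copied from the stats table above.
bandwidth_gb = {"xml": 46.36, "html": 43.47, "rdf": 4.11}

# Assumption: 14 days of November had elapsed when the stats were read.
DAYS_ELAPSED = 14

# Feeds are the xml (/rss.xml, /atom.xml) and rdf (/index.rdf) rows.
feed_gb = bandwidth_gb["xml"] + bandwidth_gb["rdf"]
feed_gb_per_day = feed_gb / DAYS_ELAPSED

print(f"feeds: {feed_gb:.2f} GB total, {feed_gb_per_day:.1f} GB/day")
print(f"html:  {bandwidth_gb['html']:.2f} GB total")
```

Under those assumptions the feeds come to a little over 50 GB, or about 3.6 GB/day, which is more than the HTML total.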
I had previously conjectured (based on a rather larger RSS feed size of 40 KB; the logs say the average is more like 20 KB) that Boing Boing serves 22 GB/day (40 KB × 11,500 Bloglines subscribers × 48 requests per day), so my estimate was perhaps a little high.
In the most conservative case (every reader polling every half-hour, around the clock), this means that Boing Boing has at least (3.6 GB·day⁻¹) / (48 polls·user⁻¹·day⁻¹) / (20 KB·poll⁻¹) = 3,750 unique RSS readers. Many clients don’t poll every half-hour or all day long, so there are probably quite a few more. What this tells me is that bandwidth problems are real, and we can expect them to get worse as more users discover RSS (to wit: yet more mainstream press, this time in Sunday’s TIME Magazine Europe).
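For anyone who wants to plug in their own numbers, here is a rough Python sketch of both back-of-the-envelope calculations: my earlier bandwidth guess and the reader lower bound above. It assumes decimal units (1 GB = 1,000,000 KB), and the function names are made up for illustration.

```python
# Assumption: decimal units, 1 GB = 1,000,000 KB.
KB_PER_GB = 1_000_000


def daily_bandwidth_gb(feed_kb, subscribers, polls_per_day):
    """Bandwidth served per day if every subscriber polls at the given rate."""
    return feed_kb * subscribers * polls_per_day / KB_PER_GB


def min_readers(daily_gb, polls_per_day, feed_kb):
    """Lower bound on readers, assuming each one polls at the given rate."""
    return daily_gb * KB_PER_GB / (polls_per_day * feed_kb)


# Earlier guess: 40 KB feed, 11,500 Bloglines subscribers, 48 polls/day.
earlier_estimate = daily_bandwidth_gb(40, 11_500, 48)

# Lower bound from the logs: 3.6 GB/day, 48 polls/day, 20 KB feed.
lower_bound = min_readers(3.6, 48, 20)

print(f"earlier estimate: {earlier_estimate:.2f} GB/day")
print(f"reader lower bound: {lower_bound:,.0f}")
```

Lowering the polls-per-day figure (say, to 24 for hourly polling) raises the reader count proportionally, which is why 3,750 is a floor rather than an estimate.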
Judging by these stats (and Glenn Fleishman’s figures), it’s clearly still important that we repair the distribution architecture of RSS. Even with the best-behaved clients, the growing user population spells DOOM for polled RSS.
[Aside: Glenn's graph shows a beautiful weekly heartbeat in RSS bandwidth; he attributes this to well-behaved readers cooling off over the weekend when his XML feed is completely static. I think that's part of it, but I'd also be willing to bet that some of this can be attributed to clients being switched off over the weekend. Rodrigo Rodrigues had some great p2p membership graphs in his IRIS workshop talk that showed many hosts disconnecting for about two days—Saturday and Sunday.]
[Note: This article was updated as of about 21:15 to reflect a closer reading of BB's stats page.]