
I’m a huge fan of digg.com. It provides me, and millions of others, with a constant source of interesting content to ponder, and for many sites around the internet it can provide a much valued boost in traffic. Of course, when you start talking about digg.com and traffic, the infamous digg effect always comes up. Put simply, this is the latest take on being slashdotted. When a popular site lets its users submit links, a story that does well sends a surge of traffic to the linked site. In many cases this results in that site becoming unavailable.
I’ve recently fallen foul of the digg effect myself, with a site becoming unavailable until I moved hosts. It turns what should be an exciting time into a worrying one. On digg itself there have been numerous comments mentioning that a large number of sites seem to be going down. Given these two factors, I wanted to see for myself. In this post you’ll see what I found by recording some important data about the stories that make it to the digg.com front page over the course of a single day: that’s 24 hours and EVERY story that made it to the venerable homepage spots. Read on for what I’ve discovered so far.
So let me set this up for you. I took every story that appeared in the digg.com front page RSS feed and recorded the pertinent data about it. In this case, the pertinent data is the number of diggs, the Alexa Rank, whether the site remained up and whether the duggmirror site caught a mirror of the content. These are all related to the digg effect and should help quantify its effect on a wide variety of sites. This article is just a summary of the data I’ve collected; look for a more detailed analysis at a later date (hopefully within a few weeks).
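For anyone who wants to repeat the exercise, here is a minimal sketch of how one row of this data might be represented and summarised. The data in this post was collected by hand, so this is not the method used here; the names `StoryRecord` and `summarise` are hypothetical, and the three sample rows are invented purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoryRecord:
    """One front-page story as seen at the moment it was checked (hypothetical layout)."""
    title: str
    diggs: int                  # digg count at the time of the snapshot
    alexa_rank: Optional[int]   # None when Alexa has no traffic data for the site
    site_up: bool               # was the linked site reachable?
    mirrored: bool              # did duggmirror hold a copy of the content?


def summarise(records: List[StoryRecord]) -> dict:
    """Collapse a list of snapshots into the headline figures."""
    down = [r for r in records if not r.site_up]
    ranked = [r.alexa_rank for r in down if r.alexa_rank is not None]
    return {
        "stories": len(records),
        "total_diggs": sum(r.diggs for r in records),
        "down": len(down),
        "down_mirrored": sum(1 for r in down if r.mirrored),
        # Mean rank of downed sites, ignoring sites with no Alexa data
        "mean_rank_down": sum(ranked) / len(ranked) if ranked else None,
    }


# Invented sample rows, NOT taken from the real data set
sample_records = [
    StoryRecord("Example A", 120, 1500, True, False),
    StoryRecord("Example B", 300, None, False, True),
    StoryRecord("Example C", 80, 900000, False, False),
]

summary = summarise(sample_records)
print(summary)
```

Keeping the Alexa Rank optional matters because a fair number of sites have no traffic data at all, and treating a missing rank as zero would badly skew any average.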
Over 24 hours there were 127 different stories that appeared on the digg.com front page. This covered the period from 20:30 GMT on Monday 26th February to 20:30 GMT on Tuesday 27th February. As expected there were peaks and troughs, with 10 minutes at the most popular time producing the same number of stories as half an hour in the quiet time. These periods, predictably, mirror the peak internet usage times in the US. During this time the stories had collected 53,317 diggs as I saw them. Bear in mind that I took a “snapshot” of the data at the point I accessed each story. It is highly likely that the number of diggs is now far greater than that for the same set of stories.
During this period, and once again as a snapshot of how I saw the sites, roughly 6% (8 out of 127) were unavailable. The popular duggmirror service caught about 62% (5 out of 8) of the sites that went down, although how much of the content was available varied from complete to very minimal. In one case, duggmirror appeared to have caught the content but only part of the style sheet. As a result the page showed its text in the same colour as the background, making it impossible to read without selecting the text.
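The percentages above are simple ratios of the counts given in the text. As a quick check, assuming nothing beyond those three counts (the helper name `percentage` is my own):

```python
def percentage(part: int, total: int) -> float:
    """Return part/total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * part / total, 1)


stories_total = 127  # front-page stories in the 24 hour window
sites_down = 8       # stories whose site was unavailable when checked
mirrored = 5         # downed sites for which duggmirror held a copy

down_pct = percentage(sites_down, stories_total)
mirrored_pct = percentage(mirrored, sites_down)

print(f"{down_pct}% down, {mirrored_pct}% mirrored")
# prints "6.3% down, 62.5% mirrored"
```

Note that the mirror figure is a share of the 8 downed sites, not of all 127 stories.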
I thought this would be the most clear cut part of the data. I assumed that sites higher up the Alexa Rankings would comfortably survive the digg effect whereas sites further down the rankings would struggle. This was partly true, although there are always exceptions. The average Alexa Rank of a downed site was 708,640, which seems reasonable enough: any site with even a modicum of traffic will make it into the top 2 million sites. Perhaps the surprising detail is that the highest ranked site that went down sat at 1,087, which makes it a popular site in its own right. I would have expected a site of this popularity to survive. The least popular sites are difficult to judge. It seems as if many people use digg.com to launch their sites. This is indicated by the number of sites without traffic data available, which typically means the site is new.
Something that did really surprise me, and that calls into question the need for digg in the first place, is the number of very popular sites that had stories on the front page. Specifically there were numerous stories from the New York Times, TechCrunch, Engadget, BBC News and other major news outlets. I am sure this is content that most of digg’s users could find themselves.
Somewhat unrelated to the question I originally posed is the number of diggs needed to get a story to the front page of digg.com. In summary, this ranged from 1,614 down to just 67. Of course this just represents the point at which I viewed the story, but I will maintain it is within a 20% tolerance given the frequency with which I was visiting the site. No doubt this reinforces the quality over quantity nature of digg’s algorithms.
I am releasing the summary of the data. The screenshot on the left covers the main areas. You could also visit the Google Spreadsheet here. I will be releasing the full set of data when I finish the main writeup. Just a note on the data, for clarity. This was all done by hand, by me. This probably isn’t the best way of doing it but for this set of data it was the most efficient (if I wrote a script I would have to test it to be sure of its reliability etc.). As such there is a chance that there are mistakes present, although I do not believe this to be the case. Likewise, the data captured represents the point at which I accessed digg.com. It is therefore unlikely that the figures remain the same, and unlikely that you saw the same figures yourself. I am also aware of the failings of Alexa as an accurate measure of traffic but it represents an easy way to gather this information. When I complete the full writeup of the data, I’ll break it down a bit more. Expect category and site specific data, which I really think will be interesting.
Please remember that it took a bit of effort to capture this data, so please give due credit if you are using it. Also check the site’s licence, to your left.
This post was written on Wednesday, February 28th 2007 by Simon T and has been categorised under Casestudies, Technology, Web 2.0.