Preliminary results from Blogroll Ranking
Who are the influential bloggers? Which blogs matter? What metrics would you use to even begin to answer these questions?
I’ve been exploring alternative methods of ranking over the past few months. The best results are coming from examining blogrolls. When you think about it, blogrolls comprise the links in a huge implicit trust network. For now I’m calling the calculated score “PeopleRank”. It’s kinda like PageRank, in that blogroll links from higher PeopleRank-ed blogs count more. E.g. if Om Malik has you on his blogroll, that counts a lot more for your ranking than the blogroll of your niece on Livejournal. (No offense to your niece.)
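To make the "links from higher-ranked blogs count more" idea concrete, here is a minimal sketch using the textbook PageRank power iteration over blogroll links. It is not the actual PeopleRank code: the function name `people_rank`, the damping factor of 0.85, the iteration count, and the toy blog names are all assumptions for illustration, and the scores here are normalized to sum to 1, which is a different scale from the numbers in the table.

```python
def people_rank(blogrolls, damping=0.85, iterations=50):
    """Toy PageRank-style score over blogroll links.

    blogrolls maps each blog to the list of blogs on its blogroll.
    damping and iterations are assumed values, not tuned ones.
    """
    # Collect every blog that links or is linked to.
    blogs = set(blogrolls)
    for targets in blogrolls.values():
        blogs.update(targets)
    n = len(blogs)

    # Start with every blog equally ranked.
    rank = {blog: 1.0 / n for blog in blogs}

    for _ in range(iterations):
        # Every blog gets a small baseline score.
        new_rank = {blog: (1.0 - damping) / n for blog in blogs}
        for blog in blogs:
            targets = blogrolls.get(blog, [])
            if targets:
                # A blog splits its own score among the blogs it lists.
                share = damping * rank[blog] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A blog with no blogroll spreads its score evenly.
                share = damping * rank[blog] / n
                for other in blogs:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical network: "you" and "cousin" each sit on exactly one
# blogroll, but the blog listing "you" is itself listed by three others.
toy_blogrolls = {
    "om": ["you"],
    "niece": ["cousin"],
    "fan1": ["om"],
    "fan2": ["om"],
    "fan3": ["om"],
}
scores = people_rank(toy_blogrolls)
print(scores["you"] > scores["cousin"])
```

In this toy network both "you" and "cousin" have a blogroll count of 1, yet "you" ends up with the higher score because the one blog vouching for it is itself well vouched for.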
So here are the top 50 blogs as ranked by the preliminary algorithm: (Commentary and caveats follow)
|Blog Name||URL||PeopleRank||Blogroll Count|
|TechCrunch (Arrington & Friends)||http://www.techcrunch.com/||16.88550||74|
|Subscribe to Posts (RSS)||http://feeds.feedburner.com/||10.35721||58|
|Advertise on this blog||http://money.cnn.com/services/mediakit/||8.24951||14|
|Creating Passionate Users||http://headrush.typepad.com/creating_passionate_users/||8.05627||51|
|Brad Feld – Feld Thoughts||http://www.feld.com/blog/||7.76376||57|
|How to Change the World||http://blog.guykawasaki.com/||7.36782||39|
|David Jones/PR Works||http://www.prworks.ca/||6.67162||11|
|New World Notes||http://secondlife.blogs.com||6.47961||6|
|Talking Points Memo: by Joshua Micah Marshall||http://www.talkingpointsmemo.com/||6.30786||23|
|Three Kid Circus||http://www.threekidcircus.com/threekidcircus/||6.10842||109|
|Rain City Real Estate Guide||http://www.raincityguide.com/||6.06087||11|
|Jottings By An Employer's Lawyer||http://employerslawyer.blogspot.com||5.95257||7|
|Joho the Blog||http://www.hyperorg.com/blogger/||5.85586||23|
|Jeneane Sessum – Allied||http://allied.blogspot.com||5.73544||91|
|Her Bad Mother||http://www.badladies.blogspot.com||5.73306||108|
|B.L. Ochman's Weblog||http://www.whatsnextblog.com/||5.69226||11|
|Techdirt (Mike Masnick)||http://www.techdirt.com/||5.64693||21|
|This Blog Sits at the||http://www.cultureby.com/trilogy/||5.50986||9|
Caveats of this calculation:
- Results with ~5K blogs crawled.
- Blogroll Count = Number of blogrolls this blog appears on = How many people publicly admit to reading this blog.
- The interesting datapoints are where the PeopleRank ordering puts a blog higher in the list than one with a higher blogroll count — those fewer subscribers must be “more important”.
- This crawl took Lijit user blogs as the starting seeds, giving an overall tech bias.
- However, there was a period when the crawler went unchecked into what can only be called “The Mommy-o-sphere”, so there is an overrepresentation of Mom-blogs in the dataset.
- Our blogroll detector algorithm still gets false positives, thus the high rank for “Subscribe to Feedburner” and the multiple ColoradoStartups.com listings.
- Some blogs use a Blogrolling widget for a “Web Ring” functionality, thus erroneously appearing as blogrolls. This explains most of the 100+ blogroll counts.
- We need better de-duping. Several blogs appeared under multiple URLs, which split their links and reduced their overall scores.
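The de-duping caveat above could be approached by mapping each URL to a canonical key before any counting or ranking happens. The sketch below is one possible approach, not what the crawler does today: the helper names `canonical_blog_url` and `merge_blogroll_counts`, the particular normalization rules (drop the scheme, a leading "www.", trailing slashes, and index pages), and the example.com URLs are all assumptions.

```python
from urllib.parse import urlsplit


def canonical_blog_url(url):
    """Reduce a blog URL to a key that ignores cosmetic differences."""
    parts = urlsplit(url.strip())
    # hostname is already lowercased by urlsplit; guard against None.
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    # Treat common index pages as the directory itself (assumed rule).
    for suffix in ("/index.html", "/index.php"):
        if path.lower().endswith(suffix):
            path = path[: -len(suffix)]
    return host + path


def merge_blogroll_counts(counts):
    """Sum blogroll counts for URLs that share a canonical key."""
    merged = {}
    for url, count in counts.items():
        key = canonical_blog_url(url)
        merged[key] = merged.get(key, 0) + count
    return merged


# Hypothetical counts for one blog that was crawled under three URLs.
raw_counts = {
    "http://www.example.com/blog/": 10,
    "http://example.com/blog": 4,
    "http://EXAMPLE.com/blog/index.html": 2,
}
print(merge_blogroll_counts(raw_counts))
```

Merging the URLs before the ranking pass, rather than adding up scores afterwards, keeps all of a blog's incoming blogroll links pointed at a single node.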
So how is this different from existing rankings? Until now, the most common methods have fallen into one of two camps:
- Number of subscribers. I.e. a pure democracy. Use some combination of Feedburner (for RSS readers) and some web analytics (for web readers) to count the raw number of people reading a blog.
- Raw number of incoming links (citations). This is similar, except that links are counted instead of subscribers.
Note that neither method discriminates between the blogs “casting the votes”. It doesn’t matter if that 24th reader of your blog happens to be Scoble. Nor does it matter if those 3 citations to your blog in the last month (Technorati defines this as “very low authority”) came from Seth Godin, Fred Wilson, and Guy Kawasaki.
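A tiny example shows the difference between counting votes and weighing them. This is a one-step simplification rather than the full ranking calculation, and the function names, blog names, and voter scores are all made up for illustration.

```python
def raw_citation_count(citing_blogs):
    """Pure democracy: every distinct citing blog counts as one vote."""
    return len(set(citing_blogs))


def weighted_citation_score(citing_blogs, voter_score, default=1.0):
    """Each distinct citing blog contributes its own score as its vote."""
    return sum(voter_score.get(blog, default) for blog in set(citing_blogs))


# Hypothetical voter scores: three well-regarded blogs, three obscure ones.
voter_score = {
    "big_1": 5.0, "big_2": 5.0, "big_3": 5.0,
    "small_1": 0.5, "small_2": 0.5, "small_3": 0.5,
}
cited_by_big = ["big_1", "big_2", "big_3"]
cited_by_small = ["small_1", "small_2", "small_3"]

# Raw counts cannot tell the two blogs apart.
print(raw_citation_count(cited_by_big), raw_citation_count(cited_by_small))
# Weighted scores can.
print(weighted_citation_score(cited_by_big, voter_score),
      weighted_citation_score(cited_by_small, voter_score))
```

Both blogs have three citations, so a raw count rates them the same, while the weighted version separates them by who is doing the citing.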
Initial results are encouraging, and I hope to do more analysis this week. What do you think? If you have any suggestions or ideas, please get in touch with me.