Archive for the ‘digg’ Category

Digg Effects, Neurons and your Personal Blogging Threshold

Monday, June 11th, 2007

First, I’d like to welcome the new readers who found wanderingstan this weekend via the posts on TUAW, Gizmodo, and others!

For those of you just joining the story: Last Thursday I posted a picture taken by Gwen Bell of her housemate’s baked MacBook. As soon as Paul showed me the photo I knew it had the potential to go viral.

And so it did. The TUAW version of the story was Dugg (1,828 times so far), and I’ve had almost 15,000 reads of that post.

This got me thinking: The web is beginning to operate very much the way our brain does. This is especially visible in the case of Digg.

You see, every neuron has a firing threshold. It has inputs from many other neurons, and when enough of those incoming neurons fire, the cumulative effect may be enough to cause it to fire.

Each digg story operates the same way. Each digg is an input activation for the story. When enough activation occurs within a short time period, the firing threshold is crossed: the story moves to the front page. (Note that the exact value of this threshold is a secret which Kevin Rose isn’t telling!) This “front page firing” causes activation of millions of readers.
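For the code-minded, here is a rough Python sketch of the story-as-neuron idea. Everything in it is my own assumption: the Story class, the sliding time window, and the numbers are made up for illustration, since the real threshold (and however Digg actually computes it) is secret.

```python
from collections import deque


class Story:
    """Toy 'neuron': fires (hits the front page) when enough diggs arrive close together."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold            # hypothetical value; the real one is secret
        self.window_seconds = window_seconds  # how long one digg keeps "activating" the story
        self.recent = deque()                 # timestamps of diggs still inside the window
        self.on_front_page = False

    def digg(self, timestamp):
        """Register one digg at `timestamp` (in seconds); return True once the story has fired."""
        self.recent.append(timestamp)
        # Forget activations that have fallen out of the time window.
        while timestamp - self.recent[0] > self.window_seconds:
            self.recent.popleft()
        if len(self.recent) >= self.threshold:
            self.on_front_page = True
        return self.on_front_page


# Three diggs spread far apart never add up...
slow = Story(threshold=3, window_seconds=60)
slow_results = [slow.digg(t) for t in (0, 100, 200)]   # [False, False, False]

# ...but the same three diggs in a quick burst cross the threshold.
fast = Story(threshold=3, window_seconds=60)
fast_results = [fast.digg(t) for t in (0, 10, 20)]     # [False, False, True]
```

The point of the sketch is only that timing matters as much as the count: the same number of inputs either fires the story or fizzles, depending on how tightly they cluster.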

Here’s a typical response chart for a neuron, and the physical structures responsible.

So what would this look like in Digg/Web-land? Here’s my take:

To stretch this analogy further: Some of those readers will love the story so much that they then blog about it. In other words, the reader’s Personal Blogging Threshold™ was crossed, just like the firing threshold of the neuron. I suppose you could say twitter is the correct outlet for people with low PBTs. :)

And how else can I end a post like this than with a “Digg This” button? Here you go!

Costs and Transparency in Ranking Systems

Monday, March 12th, 2007

Can a ranking system be transparent, inclusive, and successful? That was the topic of a long conversation last week with Lijit’s senior developer Derek Greentree. We kept coming back to questions about transparency and the cost of acquiring votes. And in the end we decided that this is some sort of rule:

The maximum success possible for a system is a function of the transparency of the algorithms and the cost of acquiring votes.

Consider this rough chart:

System                    | Transparency | Cost/Exclusivity
Online
Digg                      | Low          | Low - Pass a CAPTCHA.
Google                    | Low          | Lower - Create a web page with links.
SomethingAwful.com Forums | High         | Med - $10 cover charge
Offline
Political Democracy       | High         | High - Become a citizen.
Academy Awards            | High         | High - Become a member of the academy.
American Idol             | High         | Med - Cost of text message

I hear you asking, “Why do Digg and Google get ‘Low’ marks for transparency?”

Digg ranks news stories by the number of members who vote for (“digg”) each candidate. It’s pretty much a pure democracy, with an added time component: old articles are worth less. On the other hand, Google ranks pages by a more complicated algorithm known as PageRank, which treats links on web pages as “votes” for other pages, with some pages’ votes worth more than others. It’s a bit like the electoral college, with an added semantic component: pages not related to the search query are worth less.
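To make the “some votes are worth more than others” idea concrete, here is a small Python sketch of the simplified, textbook form of PageRank. It is not Google’s actual ranking code: the pagerank function, the three-page toy web, and the iteration count are my own illustration, the 0.85 damping factor is just the commonly cited value, and the sketch assumes every linked page also appears as a key in the dictionary.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the list of pages it links to.

    Assumes every link target is itself a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with everyone equal
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its own rank evenly among the pages it "votes" for.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its vote evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Toy web: a and b both link to c, and c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"]}
toy_rank = pagerank(toy_web)
# c collects two votes, so it ranks highest; a gets one vote, but from the
# highly ranked c, so it beats b, which nobody links to.
```

Notice that a and b each cast exactly one vote, yet a ends up far ahead of b purely because of who voted for it. That is the electoral-college flavor described above.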

Do those descriptions sound about right? The thing is, neither is true these days. PageRank is now only one small ingredient of a page’s search ranking. Anyone who pays attention to their page in search listings is familiar with the “Google Dance,” when rankings can change unpredictably and sometimes unfairly. Google has become a black box. Digg’s newfound popularity has it struggling to deal with spammers, and it has also begun to shroud its algorithms in secrecy. The most recent issue of Wired magazine has an article, “Herding the Mob,” that quotes Digg founder Kevin Rose as saying there are antihacking techniques that he can’t talk about.

Jay and Kevin said they couldn’t explicitly detail how Digg’s ranking algorithm works because it would be used by those who want to game the system (the aiding the enemy defense is popular these days), but they gave enough information to understand the basics of how Digg’s version of a democracy works.

So what we see is that these two popular online ranking systems began with public algorithms, but have retreated into secrecy. 

On the other hand, systems like the US election process remain part of the public record. Of course, in a democracy it costs a lot to get a vote. For one thing, you have to be born. And if you really want to cheat, you have to mess around with getting the IDs of dead people and other very messy activities. In the recent Iraq elections they took the extra measure of dipping each voter’s finger in permanent ink to prevent double voting.

Is this trend necessary? What are the underlying principles?

The trend seems to be that to thwart spammers in popular systems, transparency must go down or cost must go up. And in the online world, costs are dropping so low that transparency is being forced down as well.

The web has seen a lot of systems that begin with low costs and high transparency. That’s the very definition of openness. But as the systems experience success, they have 3 choices:

  • Raise the costs. E.g. SomethingAwful.com added a $10 cover charge to participate in voting. Metafilter added a $5 cover charge.
  • Obscure the algorithms. E.g. Digg adding secret “anti-gaming” algorithms
  • Become irrelevant. E.g. Usenet forums overrun with spammers

The most popular choice seems to be obscuring the algorithms.

Should we be alarmed at this? Imagine if the US government took the same approach: they will tell us who won the election, but the exact algorithm used to determine the winner can’t be revealed! One can argue that getting on the front page of a Google search or the front page of Digg is not nearly as important as an election. But the value of such positioning is only increasing, and the bad guys are already trying to rig these elections!

I would argue that low transparency is a form of editing. When Digg or Google says that they must keep their algorithms secret, they are in effect saying “Our algorithms are fair, but we can’t tell them to you. You can trust us.” But do we really trust them? Should we? If some quirk of Google’s algorithms somehow helps a company they have a partnership with, how motivated will they be to fix it? 

Anyways, those are some beginning thoughts on the subject. Any ideas from you would be appreciated, as I feel there is a lot more to explore here.

Gaming Digg, and the Lijit List

Thursday, October 19th, 2006


SpikeTheVote is a way for people to game the Digg voting system. It’s very clever.

We collectively vote each other’s stories to the front page.

I act as the middle man, verifying votes and keeping everyone in line. If someone stops digging, they won’t earn enough points to get their own stories dugg.

Spike the Vote works on a point system. Each day I give you a mission with several stories to Digg. 20% of your mission involves digging stories submitted by users in this community, while 80% of your mission is completely random. This is to eliminate footprints and keep things anonymous.

You earn 1 point for each story Dugg. Once you earn enough points, you can trade them in for Diggs on your own stories.
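(Stepping outside the quote for a moment: the scheme described above boils down to very little bookkeeping, which is part of why it is so hard to stop. Here is a rough Python sketch of it. The build_mission function, the Ledger class, the mission size of 10, and the redemption cost are all names and numbers I made up; the site only says 20% community stories, 80% random, and 1 point per digg.)

```python
import random


def build_mission(community_stories, random_stories, size=10, community_share=0.2, rng=random):
    """Pick a daily mission: a small share of community stories hidden among random ones.

    `size` is a hypothetical mission length; the 20/80 split comes from the description.
    """
    n_community = round(size * community_share)
    picks = rng.sample(community_stories, n_community)
    picks += rng.sample(random_stories, size - n_community)
    rng.shuffle(picks)  # mix them so the community stories don't stand out
    return picks


class Ledger:
    """Tracks points: +1 per story dugg, spent later on diggs for your own story."""

    def __init__(self):
        self.points = {}

    def record_digg(self, member):
        self.points[member] = self.points.get(member, 0) + 1

    def redeem(self, member, cost):
        """Spend `cost` points (a made-up price) if the member has earned enough."""
        if self.points.get(member, 0) < cost:
            return False
        self.points[member] -= cost
        return True


mission = build_mission(
    community_stories=["c%d" % i for i in range(5)],
    random_stories=["r%d" % i for i in range(20)],
)
```

With a mission of 10 stories, only 2 come from the community, and from the outside each individual digg looks like any other registered user’s vote.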

I agree with Fabian that this isn’t inherently evil, but it’s obviously not something that Digg or its users want.

This is the same root problem that comes up again and again on the internet. If all it takes to get a “vote” is to fill out a registration form or create a link, then any “election” can be gamed. The same goes for product reviews and search results.

What’s the solution?

The feature is in its infancy, but the Lijit List aims to solve exactly this problem. It combines a Digg-like voting system with a social network. Votes from people closer to you are given more weight than votes from strangers.
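To give a feel for what “closer votes count more” could mean, here is a minimal Python sketch. To be clear, this is not how the Lijit List is actually implemented: the function names, the halving of weight per hop, and the small fixed weight for strangers are all assumptions I picked for illustration.

```python
from collections import deque


def distances_from(graph, start):
    """Breadth-first search: number of hops from `start` to every reachable person."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for friend in graph.get(person, ()):
            if friend not in dist:
                dist[friend] = dist[person] + 1
                queue.append(friend)
    return dist


def personalized_scores(graph, votes, viewer, decay=0.5, stranger_weight=0.05):
    """Score each story for `viewer`; `votes` maps story -> list of voters.

    A vote from someone `d` hops away counts decay**d (assumed weighting);
    a vote from someone outside the viewer's network counts `stranger_weight`.
    """
    dist = distances_from(graph, viewer)
    scores = {}
    for story, voters in votes.items():
        total = 0.0
        for voter in voters:
            if voter in dist:
                total += decay ** dist[voter]
            else:
                total += stranger_weight
        scores[story] = total
    return scores


friends = {"me": ["ann", "bob"], "ann": ["me", "cat"], "bob": ["me"], "cat": ["ann"]}
votes = {
    "knitting": ["ann", "bob"],                    # two direct friends
    "gadgets": ["s1", "s2", "s3", "s4", "s5"],     # five strangers
}
my_scores = personalized_scores(friends, votes, viewer="me")
# Two votes from friends outweigh five votes from strangers.
```

Under a scheme like this, a ring of strangers trading votes barely moves your list, because none of them sit anywhere near you in the network.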

This should also avoid the very Digg-ish problem of nerd-centricity. The core users of Digg are nerds (like me), and the content reflects that. But the Lijit solution means that your list is customized to you and your network. If lots of your friends are into knitting, then you can expect to see a lot of stories about knitting. (I do not expect to ever see a story about knitting on Digg!)

Like I said, this feature is just getting going. We need more volume in the system to really get it tuned, so why not join the beta test? Be sure to let me know how it works for you.

Three sources of trusted information

Tuesday, July 25th, 2006

Half the challenge of explaining Outfoxed to people is convincing them that it’s actually a more natural way of doing things than what we’ve been conditioned to accept as normal in the online world. What I want to show is that the online world has automated only our least preferred methods of getting information, and it’s time someone automated the most preferred.

Imagine for a moment that your car breaks down while you’re on an out-of-town business trip to Springville. It’s easy enough to find a mechanic online or in the Springville yellow pages, but how will you find a good one? Since you know nothing about Springville, you’ve got three choices:

  1. Rely on an external trusted authority. Look for mechanics that are endorsed by the Springville Better Business Bureau, or by the American Association of Auto Mechanics.
  2. Rely on the wisdom of the crowd. Look for things like “Voted best Springville Mechanic 2003-2006” or “One of top ten trusted mechanics in Springville by Springville Times poll.”
  3. Rely on your network. Call your spouse and ask if she knows anyone in Springville; turns out her old college roommate lives there. Call the roommate and get a great mechanic recommendation.

Assuming all of these options turned up some recommendations, which would you take? If you’re like me, your network trumps options 1 and 2 every time.

Your network is the preferred way to get information, even over external authorities or the wisdom of the crowd.

But how do things look in the online world? What sources can you turn to?

  1. Rely on an external trusted authority. Look for the “Trust-e” or “BBB” or “Verisign” seal on a page.
  2. Rely on the wisdom of the crowd. If it’s a book you’re buying, look at how many stars it has on Amazon. If you’re on eBay, count the stars of the seller. 
  3. Rely on your network. Call your friends and ask if they’ve heard of this website, or if they know this seller on eBay. 

The thing to notice is that the first 2 options have been completely automated by the web. External authorities were more a thing of the 1990s Web 1.0 world, but currently Web 2.0 is positively drunk with wisdom-of-the-crowd techniques. It’s so easy! Just give every user a way to vote, and you’ve got instant crowd statistics. This is digg, this is delicious/popular, this is Amazon and IMDB ratings, this is YouTube stars.

But the network option sounds strange here, doesn’t it? It’s odd to bug a friend over a mere webpage, and the odds of them knowing one particular seller on eBay are staggeringly small. The problem is that this option is not yet automated. To use your network, you have to use the old slow methods of telephoning and emailing, and you have to bug a lot of people.

Source             | Real-world                                 | Online equivalent
External Authority | Better Business Bureau                     | trust-e, BBB
Wisdom of Crowd    | Newspaper Surveys                          | digg, amazon
Social Network     | Call your friends, ask for recommendations | ???

Of course, social networking sites like MySpace, Facebook, or LinkedIn could step into that gap, but they seem to be content with only helping people find dates or business contacts.  In my next post, I’ll talk about how the current crop of sites are missing the boat.