Wednesday, February 08, 2006

Das Analysis: More About The Reddit Model And The Future Of The Web

OK, hubristic enough title for you? Just trying to post reddit style ;-)

Yesterday, I made my first post about reddit. It was received with a warm thud on reddit itself, in terms of the score. But it did elicit some interesting comments and suggestions which I have been thinking about.

Then today, Ben of Ben and Alice posted part of what was missing from my misguided rant. His points are very valid, he expresses them much better than I could have, and for some insane reason he actually gathered evidence to back them up! So you should read what he has to say, and probably look at the reddit post linking to his and the attendant comments, before continuing.

OK, now that you've read that (if you haven't, then DO IT NOW! IT'S NOT A TUMAH!), the first thing I want to touch on is the headline factor. As redditor rnichols so wisely pointed out in a comment on the reddit posting of the link to my original blog post, the "killer headline" is very important in determining the success of a reddit post. So what makes a killer headline in the context of reddit posting? The fourth question on the reddit FAQ, about getting your submissions noticed, is a good start to figuring this out. The only truly important thing I could add to this is: don't change the original title just because you can. If it is clear, concise and sounds interesting when you read it out loud (ideally to a geek audience), go with it.

Now, to be clear, in case you didn't look at all of the comments on my original reddit post: I was wrong about a few things. Not the least of which is that all reddit votes are equal, which in retrospect would have been revealed by taking more than a cursory glance at the hot section and comparing the scores and recentness of posts in relation to their ranking. It's obvious that while the post with the most votes isn't necessarily the top-ranked one because of its relative lack of freshness, posts with roughly the same freshness are ordered by their scores. There is probably just some simple algebra going on in the Python code to sort these into their respective slots.
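
To make the "simple algebra" guess concrete, here is a minimal sketch of one way score and freshness could be combined. This is purely my own illustration, not reddit's actual code: the Post class, the hot_rank function, the gravity exponent and the sample posts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    score: int        # net votes, with every vote counting equally
    age_hours: float  # time since the post was submitted


def hot_rank(post, gravity=1.5):
    """Hypothetical formula: divide the score by a power of the age
    so that older posts sink even when they have many votes."""
    return post.score / (post.age_hours + 2) ** gravity


# Made-up sample data, for illustration only.
posts = [
    Post("old favorite", score=200, age_hours=48.0),
    Post("fresh and popular", score=40, age_hours=2.0),
    Post("fresh but ignored", score=5, age_hours=2.0),
]

# Highest hot_rank first.
ranked = sorted(posts, key=hot_rank, reverse=True)
for post in ranked:
    print(post.title, round(hot_rank(post), 3))
```

With these made-up numbers, the post with the most votes ends up last because it is two days old, while the two fresh posts are ordered by score, which matches what the hot section appears to do.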

Which brings us to the filtering system. The question is, how exactly does it work? The speculation is that it only looks at things like keywords and URLs. If so, then as Ben so eloquently points out, that isn't good enough. It should at least be collaborative, like the Amazon or Netflix or TiVo systems, and figure out what you would like or dislike based on what other people with similar tastes and interests say they like or dislike. This would be a "killer app" feature to put into reddit. But it would not be easy to implement or maintain. It would cost a lot of developer time and system resources.
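
For the curious, here is a rough sketch of what "collaborative" means in practice: predict how one user would vote on a post from the votes of users with similar histories. It is a toy user-based approach under my own assumptions (cosine similarity, a similarity-weighted average, invented users and posts), not a description of what reddit, Amazon, Netflix or TiVo actually run.

```python
from math import sqrt

# Hypothetical vote data: user -> {post: +1 for an upvote, -1 for a downvote}
votes = {
    "alice": {"python-tips": 1, "lisp-essay": 1, "celebrity-news": -1},
    "bob": {"python-tips": 1, "lisp-essay": 1, "haskell-intro": 1},
    "carol": {"python-tips": -1, "celebrity-news": 1, "haskell-intro": -1},
}


def similarity(a, b):
    """Cosine similarity between two users' vote dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[p] * b[p] for p in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict(user, post, all_votes):
    """Similarity-weighted average of the other users' votes on a post.

    A positive result suggests the user would like the post,
    a negative one that they would not.
    """
    num = den = 0.0
    for other, their_votes in all_votes.items():
        if other == user or post not in their_votes:
            continue
        sim = similarity(all_votes[user], their_votes)
        num += sim * their_votes[post]
        den += abs(sim)
    return num / den if den else 0.0


print(predict("alice", "haskell-intro", votes))
```

Here alice has never seen "haskell-intro", but bob (who votes like her) upvoted it and carol (who votes the opposite way) downvoted it, so the prediction comes out positive. The expensive part is obvious even in this toy: every prediction compares the user against everyone else, which is where the developer time and system resources would go.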

So here's the question: will reddit do this? If so, they could implement the same basic approach to handle comments, and they could probably apply the same general methodology to determining how to rank posts in the hot section. The way that would work, as I see it, would basically be to create a recommendation filter for reddit as a whole, with freshness as an added variable in the calculation.

And as I mentioned in my first post on this, if they don't do it, who will? Perhaps Digg, or Newsvine, or even Google? I'm rooting for reddit because they are a cool, sleek upstart in this field. But we all know how that story can turn out. Let's just hope reddit is Cinderella and the future is holding a glass slipper.

1 Comment:

Ben said...

Reddit may well already use a limited/optimized version of the collaborative filtering you and I are talking about (comment from them would be very interesting -- I'm a bit surprised they don't jump in when there are reddit-related comments on posts).

A good place to start, if you're interested in learning more about how recommendation systems work and what the current technology is, is the GroupLens publications page:

http://www.grouplens.org/publications.html

especially take a look at this paper:

http://www.cs.umn.edu/research/GroupLens/papers/pdf/algs.pdf