The new Google News service kind of bugs me. The FAQ says this:
The headlines on the Google News homepage are selected entirely by a computer algorithm, based on many factors including how often and on what sites a story appears elsewhere on the web. This is very much in the tradition of Google’s web search, which relies heavily on the collective judgment of web publishers to determine which sites offer the most valuable and relevant information. Google News relies in a similar fashion on the editorial judgment of online news organizations to determine which stories are most deserving of inclusion and prominence on the Google News page.
Huh. So the most reported stories show up on Google News, which causes people to report the stories more, and so on. In engineering, they call this positive feedback. It is not always a good thing.
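To make the loop concrete, here is a toy sketch of that feedback. It is not how Google News actually works; the function names, the starting counts, and the `boost` factor are all made up for illustration. The only assumption is that the story shown on the front page picks up extra reports in proportion to how many it already has.

```python
def simulate_feedback(counts, rounds, top_k=1, boost=0.5):
    """Toy positive-feedback model (hypothetical, for illustration only).

    Each round, the top_k most-reported stories are "featured" and gain
    boost * (their current count) in new reports. Other stories gain nothing.
    """
    counts = list(counts)
    for _ in range(rounds):
        # Rank story indices from most reported to least reported.
        ranked = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
        for i in ranked[:top_k]:
            counts[i] += counts[i] * boost
    return counts


def leader_share(counts):
    """Fraction of all reports that belong to the most-reported story."""
    return max(counts) / sum(counts)


# Three stories that start out nearly tied (made-up numbers).
start = [12.0, 10.0, 9.0]
end = simulate_feedback(start, rounds=5)

print("leader share before:", round(leader_share(start), 3))
print("leader share after: ", round(leader_share(end), 3))
```

Under these assumptions a story that starts with a small lead ends up with most of the coverage, and the runners-up never move.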
Mind you, Google’s always used algorithms like this for its search. Daypop, the popular (and currently dead) weblog search service, creates a similar effect with its Top Forty listing of popular links from the world’s weblogs. So this is nothing new, per se.
Still. I have a penchant for the unexplored, the new, and the underreported. It seems to me that Google is encouraging the homogenization of the Web here. The algorithm is problematic when applied to news, and it has the same problems when applied to web search.
Discussing this is, alas, met with scorn from the weblogging community. Daniel Brandt is a bit of a loon, admittedly, and his personal stake in these arguments is well documented. But there’s some truth at the core of his complaints. Besides, you’d kind of expect Doc Searls to stand up for Google. He’s one of the guys who benefits from PageRank.
When Doc Searls says “Why is this bad? Because PageRank doesn’t give a fair shake to stuff nobody points to? What user would want that?” I am forced to reply, “Users who want to find stuff off the beaten path.” PageRank is great for building up an initial concept of the Web; if you’re starting from scratch, you get an accurate picture of which sites are important. But from that point on, link-based ranking makes it harder for completely new sites to break into the rankings. Until established sites start linking to them, new clusters of link relationships won’t be ranked as highly as the old clusters.
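Here is a rough sketch of that effect, using a textbook-style PageRank iteration rather than anything Google actually runs. The five-page graph, the page names, and the damping value of 0.85 are my assumptions for the sake of the example. The old pages all link to each other; the new pages link to each other and to an old page, but no old page links back.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated iteration (illustrative sketch only).

    links maps each page to the list of pages it links to. Every link
    target must also appear as a key in links.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            # A page with no outgoing links spreads its rank over all pages.
            targets = links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical web: an established cluster and a newcomer cluster.
links = {
    "old_a": ["old_b", "old_c"],
    "old_b": ["old_a", "old_c"],
    "old_c": ["old_a", "old_b"],
    "new_x": ["new_y", "old_a"],
    "new_y": ["new_x", "old_a"],
}

ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph every old page outranks every new page, even though the new pages link to each other just as faithfully. The newcomers only climb once somebody in the old cluster points at them.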
So that’s why Google News kind of bugs me.
Disclaimer: I used to work for AltaVista.
Is there an algorithm for indexing pages that is fairer and at least as scalable?
If I knew that, I might still be at AV…
A little while ago, I kvetched about the algorithms that produce Google News. I feel somewhat vindicated, since I don’t