Product category: Laboratory and scientific training and education
News Release from: Indiana University School of Informatics | Subject: Topical interests and search engine bias
Edited by the Laboratorytalk Editorial Team on 10 August 2006

Research throttles idea of search engine dominance

Study challenges the 'Googlearchy' theory - the perception that search engines push web traffic toward popular sites, thus creating a monopoly over lesser-known sites

Search engines are not biased towards well-known websites. In fact, they produce an egalitarian effect on where traffic is directed, say researchers at the Indiana University School of Informatics.

Their study, 'Topical interests and the mitigation of search engine bias', appears in the Aug 7-11 issue of the Proceedings of the National Academy of Sciences and challenges the 'Googlearchy' theory - the perception that search engines push web traffic toward popular sites, thus creating a monopoly over lesser-known sites.

As the web becomes larger and more complex, search engines have taken on an increased role in guiding Internet users to their destinations.

Yet, some are concerned that search engines, by means of their ranking algorithms, create a vicious cycle where popular sites receive more and more hits.

"Empirical data do not support the idea of a vicious cycle amplifying the rich-get-richer dynamic of the web," said Filippo Menczer, associate professor of informatics and computer science.

"Our study demonstrates that popular sites receive on average far less traffic than predicted by the Googlearchy theory and that the playing field is more even."

Menczer was joined in the study by IU post-doctoral fellow Santo Fortunato; Alessandro Flammini, assistant professor of informatics; and Alessandro Vespignani, professor of informatics.

A search engine is a complex system designed to find information stored on the web, allowing users to look for content meeting specific criteria, typically a word or phrase.

The engine then retrieves a set of references closely matching the criteria, returning a list of 'hits' ranked by page relevance.

The IU team pooled their expertise in web mining, networks and complex systems to collect empirical data from various search engines.

In one scenario, users browsed the web using only random links.

In another, users visited only pages returned by the search engines.

The researchers have also studied the critical role of search engines in shaping the evolution of the web.

"A simple ranking mechanism provides an elegant model to understand the genesis of a broad class of complex systems, including social and technological networks such as the internet and the world wide web," Fortunato said.

"These networks possess a peculiar 'long-tail' structure in which a few nodes attract a great majority of connections."

The long-tail structure of the web is commonly explained through rich-get-richer models that require knowledge of the prestige of each node in the network.

However, those who create and link web pages may not know the prestige values of target pages.

In another study, 'Scale-free network growth by ranking', published in May in the journal Physical Review Letters, the IU researchers showed that all that is necessary to give rise to a long-tail network is to have the nodes sorted according to any prestige measure, even if the exact values are unknown.

If new nodes are linked to old ones according to their ranking order, a long tail emerges.
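
The ranking idea can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the researchers' own code: the function name grow_by_rank, the use of node age as the prestige ordering, the rule that a node at rank r is chosen with probability proportional to r to the power of minus alpha, and the parameter alpha itself are all illustrative choices. The sketch uses only the order of the nodes, never an exact prestige value.

```python
import random


def grow_by_rank(n_nodes, alpha=1.0, seed=0):
    """Grow a network in which each new node links to one existing node.

    Assumptions (illustrative, not taken from the article):
    - existing nodes are ranked by age, the oldest node having rank 1;
    - the node at rank r is chosen with probability proportional to r ** -alpha.

    Returns a list holding the number of incoming links of every node.
    """
    rng = random.Random(seed)
    in_degree = [0]   # the network starts with a single node
    weights = [1.0]   # selection weight of rank 1, i.e. 1 ** -alpha
    for new in range(1, n_nodes):
        # Existing nodes are 0 .. new-1; node i holds rank i + 1.
        target = rng.choices(range(new), weights=weights, k=1)[0]
        in_degree[target] += 1
        # Register the new node: no incoming links yet, rank new + 1.
        in_degree.append(0)
        weights.append((new + 1) ** -alpha)
    return in_degree


if __name__ == "__main__":
    degrees = grow_by_rank(2000, alpha=1.0, seed=1)
    top = sorted(degrees, reverse=True)[:5]
    print("five best-connected nodes:", top)
    print("nodes with no incoming links:", degrees.count(0))
```

Running the script prints the incoming-link counts of the best-connected nodes alongside the number of nodes that received no links at all, which gives a rough feel for how uneven the resulting distribution is.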

The IU researchers' model finds a striking application in understanding the evolution and social impact of search engines.

"By sorting results, search engines give us a simple mechanism to interpret how the web grows and how traffic is distributed among web sites," said Menczer.

The ranking model can help understand the dynamics of other complex networks besides the web.

For example, in a social system, one may be able to tell which of two people is richer without knowing their bank account balances.

Such a criterion might explain the frequency and robustness of the complex structure observed in many real networks.
