Friday, 13 May 2011

Aggregated searches (by Robert Villa)

Modern web search engines retrieve much more than just web pages. Carrying out a search on Google or Bing will often return map, image and video results, as well as the ever-present adverts. This type of display, in which the results from many different information sources (or verticals) are integrated into a single page, has been called ‘aggregated search’ (or, in Google’s case, ‘universal search’). For example, the result list for the Google query “Sheffield Wednesday” includes a news result followed, further down the ranking, by a series of five image results.

A number of issues arise from this type of presentation. For one, there is the issue of how the system chooses which information sources should be displayed for a given query. Some queries may be highly oriented towards a particular information source, e.g. for the query “Caravaggio”, image results may be highly appropriate.
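One simple way to picture this selection step is as a threshold on per-source scores. The sketch below is only an illustration under stated assumptions: the function name `select_verticals`, the threshold value and the scores themselves are all hypothetical, and nothing here describes how Google or Bing actually choose sources.

```python
def select_verticals(orientation_scores, threshold=0.5):
    """Return the sources whose score meets the threshold, highest first.

    orientation_scores maps a source name to a number in [0, 1] estimating
    how strongly the query is oriented towards that source (hypothetical).
    """
    chosen = [(source, score)
              for source, score in orientation_scores.items()
              if score >= threshold]
    # Highest-scoring sources first, so the most appropriate source leads.
    chosen.sort(key=lambda pair: pair[1], reverse=True)
    return [source for source, _ in chosen]


# Made-up scores for the query "Caravaggio": strongly image-oriented.
caravaggio_scores = {"images": 0.9, "news": 0.2, "maps": 0.1, "video": 0.6}
selected = select_verticals(caravaggio_scores)
```

With these made-up numbers, images and video pass the threshold while news and maps do not. In practice the hard part is estimating the scores, which this sketch simply takes as given.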

Then there is the issue of result presentation. Google’s universal search uses a “blended” approach, where results from the different sources are (generally) placed within the search result ranking (as for the “Sheffield Wednesday” query). Alternatively, a “non-blended” approach can be used, where results from each information source are placed in a separate result list, in a different section of the interface. This was the approach used in Yahoo Alpha. A mixed approach is also possible, e.g. Google now places any map result on the right-hand side of the screen, separate from the ranked list.
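The difference between the two layouts can be sketched in a few lines of Python. This is a toy illustration, not any search engine’s implementation: the function names `blended` and `non_blended`, the `(position, source, items)` tuples and the result labels are all hypothetical.

```python
def blended(web_results, blocks):
    """Place each source's block inside a single ranked list.

    blocks is a list of (position, source, items) tuples, where position
    is the index of the web result the block should appear before.
    """
    page = [("web", result) for result in web_results]
    ordered = sorted(blocks, key=lambda block: block[0])
    for offset, (position, source, items) in enumerate(ordered):
        # Positions past the end of the ranking put the block at the bottom;
        # offset accounts for blocks already inserted higher up the page.
        page.insert(min(position, len(web_results)) + offset, (source, items))
    return page


def non_blended(web_results, blocks):
    """Keep each source in its own section, ignoring ranking position."""
    sections = {"web": list(web_results)}
    for _position, source, items in blocks:
        sections.setdefault(source, []).extend(items)
    return sections


# Toy data: one news block near the top, one image block further down.
web = ["w1", "w2", "w3"]
blocks = [(1, "news", ["n1"]), (3, "images", ["i1", "i2"])]
page = blended(web, blocks)
sections = non_blended(web, blocks)
```

In the blended page the news block sits between the first and second web results and the image block follows the last one, much like the “Sheffield Wednesday” example. The non-blended version returns the same results grouped by source, leaving it to the interface to decide where each section goes.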

Lastly, there is the ever-present issue of evaluation. Should currently disparate collections (ImageCLEF, TREC, TRECVID, etc.) be combined to provide a single large, multi-source collection for evaluation? If so, how should the results from the multiple sources be evaluated?

Shanu Sushmita is a PhD student at Glasgow University who has been investigating many of these issues, especially those concerning result presentation in aggregated interfaces. One of the interesting results of her work is the importance of the “source-orientation” of a query in the presentation of blended search results. It turns out that not all information sources have the same impact on the user. For example, moving image results from the top to the middle of the ranking makes little difference to the click-through rate. Move news results to the middle of the ranking, however, and the click-through rate is likely to fall. This is, perhaps, another complication which search engines should take account of (or advantage of).

