Thursday, 5 March 2009

What separates query refinement, clustering, and faceted search?

I've been thinking recently about what separates the different interactive information retrieval techniques, a term I am using loosely for now. There's interactive query refinement or expansion, which suggests potential changes to a query to explore sub-groups of the results. There's clustering, which analyses the results for clusters in order to help users explore sub-groups of the results. And there's faceted search, which provides many different types of categorisation over the results in order to help users explore sub-groups of the results.

Each of these can be used to explore groups in the results, and they mainly differ by the back-end system that is used to label the sub-groups. Each also comes with a typical interaction model. IQE usually sends a new query to the server and returns a new set of results. Clustering interfaces typically allow users to choose one cluster at a time to view. Faceted browsers, like Flamenco or mSpace, typically allow users to apply and unapply a series of filters.
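To make the contrast concrete, here is a minimal sketch (the data, class, and function names are all hypothetical, purely for illustration) of the two interaction styles: an IQE-style refinement that issues a fresh query and replaces the result set, versus faceted browsing where filters accumulate over one result set and can be removed independently.

```python
# Hypothetical toy collection for illustration only.
DOCS = [
    {"title": "Jazz piano basics", "genre": "jazz", "decade": "1990s"},
    {"title": "Swing era history", "genre": "jazz", "decade": "1940s"},
    {"title": "Punk rock zines",   "genre": "punk", "decade": "1970s"},
]

def run_query(terms):
    """IQE-style interaction: each refinement sends a new query and
    returns a brand-new set of results."""
    return [d for d in DOCS
            if all(t in d["title"].lower() for t in terms)]

class FacetedResults:
    """Faceted-browsing interaction: filters are applied and unapplied
    over a single persistent result set."""
    def __init__(self, docs):
        self.docs = docs
        self.filters = {}  # facet name -> selected value

    def apply(self, facet, value):
        self.filters[facet] = value

    def unapply(self, facet):
        # Backing out one filter restores the wider view,
        # without restarting from a new query.
        self.filters.pop(facet, None)

    def view(self):
        return [d for d in self.docs
                if all(d.get(f) == v for f, v in self.filters.items())]

results = FacetedResults(DOCS)
results.apply("genre", "jazz")
results.apply("decade", "1940s")
print(len(results.view()))  # narrowed to one document
results.unapply("decade")
print(len(results.view()))  # back to both jazz documents
```

The point of the sketch is that the difference is purely in the interaction model: both styles end up selecting sub-groups of the same results, but only the faceted model keeps the user's previous selections live and reversible.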

My question is how much of the effect is down to the back-end method and how much is down to the interaction model. Marti Hearst wrote a great article in the CACM that highlighted the advantages of faceted exploration over clustering, but the majority of her points concern the quality of the data produced, such as the completeness of the categories.

It would be interesting to compare the specific effect of interaction style, such as allowing users to apply and unapply a series of interactive query refinements, rather than sending off new queries as a new starting point. The nearest research I can think of is the work by Hoeber, which allows users to turn query refinement filters on and off over the list of results. The aim of such a study would be to weigh the benefit of implementing increasingly complicated back-ends against simply improving the interactivity of the search interface and the range of search tactics it supports.


Daniel Tunkelang said...

Really good question, and I like where you're going with this. Part of the thinking behind my recent post on ranked set retrieval was to get a simple evaluation framework for an assortment of interactive IR techniques. Maybe it's too simple, but today everything we have is so complicated that we end up with ungeneralizable user studies.

Max L. Wilson said...

Daniel, it was actually something I misread in your blog that cued this thought in my head, somewhere around your mention of false dichotomies. It was timely, as later that day I read a paper that I thought had largely confused the techniques.

I agree, and I think this really is an element of the Cranfield-style standard evaluation for the HCIR community, which you have previously blogged about.