Saturday, 28 June 2008

the best way to provide faceted search?

As research has produced an increasing number of insights into the different ways of providing faceted metadata to users in the form of a faceted browser, the question has become: what actually is the best way to provide faceted search? This question has not really come up in typical information retrieval, as each bit of research has (usually) incrementally improved system performance, and a good keyword-based search system will try to include all the advancements in its UI (not that Google is providing interactive query refinements).

We actually do see faceted search all over the place. iTunes has it in their 'browser' function (3 columns that filter to the right). Google product search lets you refine by facets like brand and price (each facet filters every other facet). Endeca seem to be selling it to everyone these days (right on!), including Walmart and Borders.
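To make the two filtering styles mentioned above concrete, here is a minimal sketch in Python. The catalogue, the field names and both function names are made up for illustration; this is not how iTunes, Google or Endeca actually implement it, just one plausible reading of "columns that filter to the right" versus "each facet filters every other facet".

```python
from collections import Counter

# Hypothetical toy catalogue; the field names are illustrative only.
ITEMS = [
    {"genre": "Jazz", "artist": "Miles Davis", "album": "Kind of Blue"},
    {"genre": "Jazz", "artist": "John Coltrane", "album": "Blue Train"},
    {"genre": "Rock", "artist": "Queen", "album": "Jazz"},
    {"genre": "Rock", "artist": "Queen", "album": "Innuendo"},
]


def matches(item, selections):
    """True if the item agrees with every selected facet value."""
    return all(item[facet] == value for facet, value in selections.items())


def mutual_counts(items, selections, facets):
    """Every facet filters every other facet (assumed behaviour).

    The values offered for a facet are counted over the items that match
    the selections made in all the *other* facets.
    """
    counts = {}
    for facet in facets:
        others = {f: v for f, v in selections.items() if f != facet}
        counts[facet] = Counter(i[facet] for i in items if matches(i, others))
    return counts


def directional_columns(items, selections, column_order):
    """Columns filter to the right (assumed behaviour).

    Each column is restricted only by selections in the columns to its left.
    """
    columns = {}
    left = {}
    for facet in column_order:
        columns[facet] = sorted({i[facet] for i in items if matches(i, left)})
        if facet in selections:
            left[facet] = selections[facet]
    return columns


if __name__ == "__main__":
    order = ["genre", "artist", "album"]
    picked = {"artist": "Queen"}
    print(mutual_counts(ITEMS, picked, order))
    print(directional_columns(ITEMS, picked, order))
```

With an artist selected in the middle column, the directional version leaves the genre column untouched, while the mutual version narrows genre to the values that still have matching items. That difference is a large part of what the layout question is about.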

There are actually two layers to this question: how best to provide a faceted classification, and how best to provide a faceted browser. The former has been well investigated, with advice from Marti Hearst. Endeca certainly seem to ask 'after each click in a facet, what is the best set of facets and values to show the user?'. The second question is less well studied. Even one of Endeca's clients, the NCSU library, is asking: should we have the facets on the left or the right? Should we show a breadcrumb or a list of decisions? How does this affect the user?

Further to these layout questions, I have been trying to work out for a while now whether the structured and consistent iTunes approach is better or worse than the dynamic, adaptive approach taken by Endeca, especially with all the additional functionality (e.g. column swapping and backward highlighting) we have been adding to the iTunes-style approach with mSpace. There are even more questions to ask. Maybe it's a case of when one is better than the other? Finally, can we somehow take the best of both worlds, and figure out what to add to our faceted browsers to make them incrementally stronger?

Friday, 20 June 2008

privacy and social in-security

It's by no means been the focus of the collaborative search workshop this week, but the issue of privacy, not surprisingly, came up in terms of what your collaborators can see about your actions. There are obvious things to be concerned about, like whether you all have the same clearance. But recently I have heard a lot of people talking not about the insecurity of the systems, but the insecurity of the users!

The example we heard here, proposed by Merrie Ringel Morris, was searching for something with someone senior to you, and doing something stupid. What if you do not want them to see that you're being 'sub-optimal'? Or what if you have to look up something they said, when you are trying to make a good impression?

These are interesting privacy and 'insecurity' issues, where users still want to protect themselves. I've not seen much on it, though. Anybody?

Wednesday, 18 June 2008

first faceted system?

Daniel Tunkelang, Chief Scientist at Endeca, has passed on an excellent entry on perhaps the first faceted system. It has come at a very timely point during the JCDL08 conference, where faceted browsing has been quite core to a number of discussions. I'll post more about this later. Someone even asked me about Ranganathan's colon classification (research core to the start of faceted classifications) after my talk on a longitudinal study of faceted and keyword use, and now I have a link to research it further - thanks Daniel.

Monday, 16 June 2008

collaborative IR workshop

Let me start by apologising for any rubbish you find below - I just landed in Pittsburgh for JCDL08. I keep going from hot sweats to shivers depending on whether I'm outdoors or in!

Next week is what looks to be an exciting first international workshop on collaborative information retrieval. To be clear, the main focus is on teams of people trying to achieve a shared goal, either co-located or distributed, and either at the same time or asynchronously.

One of the papers sets itself apart from the crowd, as far as I am concerned, but also worries me slightly. Instead of delving straight into ideas of communication and task-allocation, one author (who shall remain nameless till after the workshop - but attendees have been asked to read each paper before the event) steps back and asks: what is the definition of collaboration, and how does it differ from, or subsume, cooperation, coordination, and many more similar terms? His paper is clearly well researched and well informed, but the level of model-detail also worries me: how much detail is too much when designing a model of these things? Interfaces that try to differentiate and support each term individually could be confusing. The discussion will certainly be valuable, and that and other papers will make a very interesting workshop. Stay tuned to hear more about it.

first time for JCDL08!

Thursday, 12 June 2008

exhibiting exploratory behaviour

Next week I am giving a talk on our paper at JCDL08, on the longitudinal real-world usage of a website that has both faceted and keyword search persistently available. One of the aims of this research was to see how people changed behaviour over time, as they grew more familiar with both the data and the website. This is motivated, of course, by the notion of Exploratory Search, which represents users who don't necessarily know what they are looking for or how to find it.

It has only struck me recently how undefined exploratory behaviour really is. Originally, it was suggested that people who are exploring would click around on things such as facets and categories, rather than keyword search, because they do not know what to search for. Then later they would keyword search, because they have learned what's available on the site.

The alternative view is that people who really don't know what to search for start with a 'vague query', and then use the facets to refine it.

What we saw in the study is that people exhibit either pattern of behaviour at any stage, so the order of actions is not the variable that defines exploratory behaviour. For example, some experienced users were using the facets to produce very specific queries, rather than typing Boolean queries into the keyword search box. Similarly, we saw experienced users start with a keyword search and then narrow the results down effectively.

So what variables do identify exploratory behaviour? Is it this effective behaviour? If we see a lot of similar queries, or a lot of swapping and changing within one facet, does that make someone a learner? Because I can sure think of occasions when one problem involves selecting lots of items in a column, regardless of whether I'm good or bad at it. Where to go on holiday? You could select lots of countries and cities.

One of our earlier papers (a few years ago) thought maybe it was the idea of backing out of your decisions.
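For what it's worth, the candidate variables floated above (what a session starts with, how much swapping happens inside one facet, how often decisions are backed out of) are all easy to pull out of an interaction log. The sketch below assumes a made-up log format of (action, facet, value) tuples and a hypothetical `session_features` helper; it is not the analysis from our paper, and none of these features is established as a definition of exploratory behaviour.

```python
from collections import Counter

# Hypothetical log format: (action, facet, value).
# facet is None for keyword searches; the action names are assumptions.
SESSION = [
    ("keyword", None, "holiday"),
    ("select", "country", "Spain"),
    ("select", "country", "Italy"),
    ("deselect", "country", "Spain"),
    ("select", "city", "Rome"),
]


def session_features(actions):
    """Summarise one session with a few candidate 'exploratory' signals."""
    first = actions[0][0] if actions else None
    # How many selections were made in each facet.
    selects = Counter(facet for action, facet, _ in actions if action == "select")
    # A deselect is treated here as backing out of a decision.
    backouts = sum(1 for action, _, _ in actions if action == "deselect")
    return {
        "starts_with": first,
        "max_selections_in_one_facet": max(selects.values(), default=0),
        "backouts": backouts,
    }


if __name__ == "__main__":
    print(session_features(SESSION))
```

The holiday example shows the problem nicely: lots of selections in the country facet could mean a learner, or just someone whose task genuinely involves several countries.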

in the beginning...

...there was space - for blogging. Inspired by the ever interesting blogs produced by Daniel Tunkelang (Chief Scientist at Endeca), on the topics of HCI and IR/IS, I have decided to try and blog some of my own thoughts. I'd put some in here, but then the title would be misleading, and then it would be much harder to find! The interesting stuff should start asap.