DBpedia, tags, and Structured Searching


DBpedia recently added a search facility that beautifully shows off the power of structured searching--treating content on the web as a database, rather than as a big pile of documents.

Google-like, you have a simple, one-input interface to start your search over the DBpedia data set: Wikipedia, put into the nice structure of RDF (i.e., the semantic web). Un-Google-like, the results pull up an image for each hit if one exists (a very nice touch!). And, even better, you get a tag cloud at the top that you can use to filter the results. So, for example, type in "education" and you get a tag cloud that includes such filters as "academic degree," "professional association," "civil right," "government department," and many more. Click the filters to narrow down your search.

As is to be expected, there are plenty of unexpected tags in the cloud. The tag "party" shows up in the same search, for example (it leads to "Liberia Education and Development Party"). So does "album," leading to a number of albums with the word "education" in the title. That will be both a strength and a weakness, depending on the user. To people with strong training in traditional library searches (lots of booleans and classification systems), it could be a little disconcerting. But, if you put that aside, it provides a way into the data that is more intuitive in some ways. Say, for example, I heard a really great song, but the only thing I remember about it is one word from the title of the album. . .

It also exposed a searching methodology I hadn't anticipated. One-word searches, followed by filtering with the tag cloud, yielded better results than trying to put together a string of keywords that I thought would get me there. Google (and for that matter most every non-traditional search) has me trained to search by piling on as many possible keywords as I can. Here, the filtering through structured web techniques makes that the wrong approach. Instead, one word to get me in the right neighborhood is the best plan, then the tags will guide me through what's actually there, rather than my best guesses about what I hope is there.
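To make that workflow concrete, here is a minimal sketch in Python of "one word in, then filter by tag." It is not how DBpedia Search is implemented (I don't know its internals); the records, their tags, and the function names search, tag_cloud, and filter_by_tag are all made up for illustration, and the matching is a plain substring test on the label.

```python
from collections import Counter

# Toy records, invented for illustration only: a label plus a list of tags.
RECORDS = [
    {"label": "Liberia Education and Development Party", "tags": ["party"]},
    {"label": "Bachelor of Education", "tags": ["academic degree"]},
    {"label": "Master of Education", "tags": ["academic degree"]},
    {"label": "Right to education", "tags": ["civil right"]},
    {"label": "Department of Education", "tags": ["government department"]},
    {"label": "Music Education Sessions", "tags": ["album"]},  # made-up title
    {"label": "Martin Scorsese", "tags": ["person"]},
]


def search(term, records=RECORDS):
    """Return the records whose label contains the term, ignoring case."""
    needle = term.lower()
    return [r for r in records if needle in r["label"].lower()]


def tag_cloud(hits):
    """Count how many hits carry each tag; the counts are the 'cloud'."""
    return Counter(tag for r in hits for tag in r["tags"])


def filter_by_tag(hits, tag):
    """Keep only the hits that carry the chosen tag."""
    return [r for r in hits if tag in r["tags"]]


# Step 1: a single word gets us into the right neighborhood.
hits = search("education")

# Step 2: the cloud shows what is actually there, expected or not.
cloud = tag_cloud(hits)

# Step 3: clicking a tag narrows the hits.
narrowed = filter_by_tag(hits, "party")
```

The point of the sketch is that the cloud is computed from the hits themselves, so the filters on offer always describe what is really in the result set rather than what I guessed might be there.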

It seems to me that other structured web applications could allow the same principle to work, based perhaps on multiple clouds, where each cloud is a slice-and-dice around a different axis. So, in addition to the "tag" cloud, maybe a "foaf:created" cloud, a "sioc:has_member" cloud, a "skos:broader" cloud, etc. That would be more complex than the general user might want, but for more particular audiences it would work well, I think. Or, make that the next generation of "Advanced Search" options.
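A rough sketch of the multiple-cloud idea, again in Python and again under loud assumptions: the triples below are invented, "tag" is a stand-in predicate name, and clouds_by_predicate and narrow are hypothetical helpers, not part of any real library. Each predicate gets its own cloud, and picking a value in any one cloud narrows the result set.

```python
from collections import Counter, defaultdict

# Toy (subject, predicate, object) triples; every value here is made up.
TRIPLES = [
    ("ex:Pedagogy", "tag", "teaching"),
    ("ex:Pedagogy", "skos:broader", "ex:Education"),
    ("ex:Curriculum", "tag", "teaching"),
    ("ex:Curriculum", "tag", "school"),
    ("ex:Curriculum", "skos:broader", "ex:Education"),
    ("ex:Cataloguing", "tag", "library"),
    ("ex:Cataloguing", "skos:broader", "ex:LibraryScience"),
]


def clouds_by_predicate(triples, subjects):
    """Build one cloud (value -> count) per predicate, over the given subjects."""
    clouds = defaultdict(Counter)
    for s, p, o in triples:
        if s in subjects:
            clouds[p][o] += 1
    return clouds


def narrow(triples, subjects, predicate, value):
    """Keep the subjects that have the given value for the given predicate."""
    matching = {s for s, p, o in triples if p == predicate and o == value}
    return subjects & matching


# Start from some result set, then slice it along two different axes.
results = {"ex:Pedagogy", "ex:Curriculum", "ex:Cataloguing"}
clouds = clouds_by_predicate(TRIPLES, results)

# Choosing a value from the "skos:broader" cloud narrows the results.
narrowed = narrow(TRIPLES, results, "skos:broader", "ex:Education")
```

After narrowing, the clouds would be rebuilt from the smaller result set, so each axis keeps reflecting only what is left.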



Hi Patrick,

I am the developer of DBpedia Search and so I'm highly interested in the way people use it :)

What you wrote about using multiple tag clouds for different axes/dimensions of data properties is especially interesting. I’m trying out several ways to enable the user to do just that kind of “advanced searching”, and I’ll publish some of my ideas/prototypes soon. The big issue with that is finding the right mix of being generic and being useable…

One thing I would like to point you at:
The search engine does not only find results that contain the search term.
A search for “Scorsese” will not only find results with that term in their label, but also results that are somehow related, for example “The Departed”, because Martin Scorsese is the director of that film. That’s why “film” will appear in the tag cloud…
I posted a short introduction to DBpedia Search on my blog and will post some details soon.

I’d appreciate your feedback.

Georgi

Georgi,

Many thanks for filling out my observations. I'm quite excited by the work you are doing, in large part because I work closely with the library staff here at University of Mary Washington, and the new ways to discover information are near and dear to their work (as well as mine).

I agree that a big issue in the advanced searching will be "finding the right mix of being generic and being useable." I think that's related to the unexpected results I was thinking of. For better or worse (perhaps mostly worse), usability will be highly correlated with user expectations, and most users aren't expecting a search that works this way. They'll quickly discover the vast improvement, but I anticipate an adjustment period, during which they'll learn to trust that the tags that appear are there for a reason. Sometimes the reason will be intuitive, as in your example of Scorsese and film, and sometimes a little less intuitive. But then, the less intuitive results are sometimes the most interesting (sometimes even serendipitous), which is part of the virtue I see here.

Thanks again.
