Thursday, February 16, 2006

Thoughts about "the Search"

So I've been reading the book with quite some interest, and I'll blurb about a few things that stuck and might be worth mentioning. In the very beginning J. Battelle (can't find the page number now, that damn Ctrl+F does not work with paper... wait a minute, maybe I should go to Amazon and search inside the book!) mentions that about 25% of the searches people make are intended to find or re-discover something that they've seen or read before. That's a lot! When reading it I thought of myself and how many times that has happened to me. Of course, with that Battelle makes the case for the clickstream and how it will be the key element of search in the days and years to come. I kinda also like the social bookmarking angle on it, "keep found things found", but that requires intent and action from the user, which can be avoided by just saving every click you make. That comes close to one of the themes that interest me: the level of user input required.

I especially liked the parts where Battelle was envisioning the future and letting off his obsession with the Google personalities (on quite a few occasions he talked about them, them, them, like they were some bloody Siamese twins with a weird management style. It shows that the author is a freelance journalist: aren't all bosses power-obsessed freaks!?). The parts where he talked about personalised search (p. 259) and how search would be linked to all types of services around us (p. 167) were inspiring. Everything would be indexed, even your kids and car keys. Imagine: "Honey, can't find my kids, can you Google them for me?" Well, seriously, I really dig the idea that if I make a search on the Web, the results I get would not be the same as the results the guy next to me gets, even for the same search term.

The whole deal with AdWords and purchasing keywords for marketing purposes makes me think of "tags" and how we could use that idea for better things in education. It was something that I was thinking of last summer when writing this, but now, reading about AdWords and such, it all became clearer to me. People tagging their own content (photos, blog postings, school assignments, etc.) would not only tag it for their personal knowledge management purposes. Like an advertiser buying a keyword to match it with others' demand, the tagger would also better match her own content to whatever is out there that might be relevant to her. For educational purposes, both formal and informal, the possibilities could be enormous. And I'm not talking about matching you with people selling an on-line course in French, but about matching you with other people who share your interests, creating social networks where they previously didn't exist and allowing learning to take new turns.
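
To make the tag-matching idea a bit more concrete, here is a rough sketch in Python. Everything in it is my own assumption for illustration: the names, the tag sets, the threshold, and the choice of Jaccard overlap as the matching score. It is not how AdWords or any real service matches keywords.

```python
def jaccard(a, b):
    """Share of tags two people have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_taggers(me, everyone, threshold=0.25):
    """Rank other people by tag overlap with `me`, best match first.

    `threshold` is an arbitrary cut-off chosen for this sketch.
    """
    mine = everyone[me]
    scored = [
        (other, jaccard(mine, tags))
        for other, tags in everyone.items()
        if other != me
    ]
    kept = [(other, score) for other, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical taggers and their tags (invented data).
taggers = {
    "anna": {"french", "grammar", "podcast", "travel"},
    "ben": {"french", "podcast", "cooking"},
    "carla": {"physics", "simulation"},
}

matches = similar_taggers("anna", taggers)
print(matches)
```

With this invented data only ben shares tags with anna, so he is the only match. A real system would probably weight rare tags more heavily than common ones, but that is beyond this sketch.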

Towards the end of the book (p. 137) J. Battelle talks about the differences between Google and Yahoo!. That was a very insightful part, and something that we've seen coming. Think of the difference between Google search results (algorithm + AdWords) vs. Yahoo! with some editorial touch adding Yahoo!'s search shortcuts, Google News (algorithm) vs. Yahoo! News (editorial), and now the whole service that Yahoo! is offering with 360° (hmm, not much hype there anymore and still in beta!). MySearch is about adding a social layer of Friends and Friends-of-a-Friend on top of your search "to better understand the intent of your query" (p. 259). I love that last phrase!

This brings me back to quite a few discussions that I've had with my professor about the direction my research should take. He seems to be the Google guy, whereas I see something more interesting and, ahem, intentional, in Yahoo!'s approach. I want to be able to give the steering wheel to the user and say, "Fine, point me in the direction that you want to go and I'll refine the search and give you the best stuff!", instead of trusting a God-almighty algorithm to know what you like better than you do yourself. I see this polarisation appearing in many places, services and applications on the Web, and I think it's great. I always think that with all the technologies that we have today we should be able to serve people's different preferences, be it the Google guy or the Yahoo! guy.

Finally, I want to highlight this about how search pretty much still works these days: "Like DOS before Windows or the Mac, search's user interface is pretty much command driven: you punch in a query, you get a list of results." I think there is much to improve out there (like I think there is much to improve with anything that requires a command line, e.g. installing most applications on Linux and how it sucks deeply), and the good thing is that, like the book says on one page, search as a research area is still pretty uncharted: some estimate that we've covered only a single-digit percentage of it so far. Good, more to do for me :)

Monday, February 13, 2006

Weekly PhD report week 5-6

Work done:
  • Defined my research questions (below). Of course this is a piece of work that undergoes constant modifications, but at least this is now what I'm focusing on.
  • Worked on the plan for the Social Information Retrieval chapter. This is something that I'll be working on for the next 3 months.
    • The idea is to review some 20 LORs that use reviewing and ratings for LOs, look into classifying them using the EQO model and repository, and eventually create an XML schema that allows saving them in an interoperable format. This way not only the metadata about an LO is interoperable, but also the annotation part.
    • This becomes part of the work Jehad is doing.

Things to do:
  • Continue with paper reviews
  • Continue the LOR annotation reviews
  • Enjoy the Idaho snow for the upcoming 2 weeks (20.2.-7.3.)

PhD research questions

For the last few weeks I've been trying to concentrate on defining my research questions. It's not an easy task. Firstly, there are so many things to look at, and then once you start framing the questions, they all become slippery and unclear, redundant and such. However, this is my latest take on it:

Main question: What are the main barriers to the discovery, re-discovery and use of learning objects, and how can Social Information Retrieval help?

  • Subquestion 1: Are there any specific requirements for a recommender system for LOs?
    Do we need to have any kind of “pedagogically aware” data, or is it enough to use the existing models (such as movie and book recommenders) based on collaborative filtering (CF) and content filtering methods?

  • Subquestion 2: What is the role of explicit information input such as ratings, reviews and annotations in enhancing the discovery and use of LOs? Related to "classical" collaborative filtering using ratings and reviews (these are the reviews that people do mainly to help other people).

  • Subquestion 3: How can tags, lists and bookmarks enhance the discovery, recovery and use of LOs? E.g. “Playlist”, a feature in the teachers' basket that allows them to make lists of LOs for a lesson, course, etc. (“sequencing”). This is stuff that people do for themselves, as part of PKM.

  • Subquestion 4: When using social information retrieval (SIR) techniques, what are the differences between anonymous input (such as classical collaborative filtering using ratings and reviews) and self-manifested, explicit social networks (e.g. Yahoo MyRank2) as input for a recommender system? How do users perceive these, are there preferences, what types of interactions are there, etc.? What would a hybrid model look like?
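
To get a feel for the difference Subquestion 4 is after, here is a toy sketch in Python. It is only an illustration under my own assumptions: the ratings, the friend list, the `alpha` mixing parameter and all the function names are invented, and cosine similarity over co-rated items stands in for "classical" CF. It does not describe how Yahoo! or any existing LOR actually works.

```python
from math import sqrt

# Hypothetical ratings: user -> {learning object id: rating from 1 to 5}
ratings = {
    "me":    {"lo1": 5, "lo2": 4},
    "alice": {"lo1": 5, "lo2": 4, "lo3": 2},  # a stranger with similar taste
    "bob":   {"lo1": 1, "lo2": 2, "lo3": 5},  # a declared friend
}
# Hypothetical explicit social network: user -> set of declared friends
friends = {"me": {"bob"}}


def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def anonymous_weight(user, other):
    """Classical CF: trust whoever rates like you, known or not."""
    return cosine(ratings[user], ratings[other])


def social_weight(user, other):
    """Explicit network: trust only the people you declared as friends."""
    return 1.0 if other in friends.get(user, set()) else 0.0


def hybrid_weight(user, other, alpha=0.5):
    """Mix of the two; alpha is an arbitrary knob for this sketch."""
    return (alpha * anonymous_weight(user, other)
            + (1 - alpha) * social_weight(user, other))


def predict(user, item, weight):
    """Weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        w = weight(user, other)
        num += w * theirs[item]
        den += w
    return num / den if den else None


anonymous = predict("me", "lo3", anonymous_weight)
social = predict("me", "lo3", social_weight)
hybrid = predict("me", "lo3", hybrid_weight)
print(anonymous, social, hybrid)
```

In this made-up case the friend loves lo3 while the similar-tasting stranger does not, so the social prediction comes out highest, the anonymous one lowest, and the hybrid in between. Which of these users would actually prefer is exactly the open question.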