Wednesday, August 20, 2008

On the memory lane of the Internet - Paris 8

It's great to be getting older. It turns the Internet into a memory lane, something like cleaning out the old cupboards in the place where you grew up, finding old pictures, mails, etc.

I just received a mail from someone asking me if this (updated link to Internet Archive: https://web.archive.org/web/20090501062208/http://membres.lycos.fr/riina/dea.html) was something that I had written. It was "mon mémoire du DEA" from 1999, from the department of Hypermedia at Paris 8! I had not even put my name on it, but somehow this person was able to find it and associate it with me. Best of all, she still found it useful for her studies: she wanted to cite it in her own dissertation on language teaching and new technologies, but since the text carried neither the author's name nor the publication date, she found me. You never know, do you now..

So here it is:
Riina Vuorikari
ENSEIGNEMENT ET APPRENTISSAGE
LE CAS DES LANGUES ÉTRANGÈRES
EFFETS DES TECHNOLOGIES DE L’INFORMATION ET DE LA COMMUNICATION
Date de parution: septembre 1999
Lieu de publication: L'Université de Paris 8, Saint-Denis, France.

Notice the way the title was written: no commas, just different lines. Pretty arty, ha? It was mostly influenced by, hm, my really eccentric promotor, J. Feat. So I went back to the Yahoo! mail account that I already used back then to check the mails between us from my student days. Man, he was somewhat strange, but who would not be in Paris 8!

I always joked that the hardest task in it all was to get out of there with a diploma, what a mess. But fun. A good place to hang out. Check out what the French version of Wikipedia says about it. The English version is lame; it's hard to capture that feeling of the "papas cools", all the old hippies from the late sixties who had installed themselves there ever since the youth revolution of 1968. Every day there was (I bet still is) a student "manif", a little protest or a petition being signed about this or that. When I read parts of my dissertation I noticed how that radicalism had snuck in..

On the one hand, the Internet offers "free access to the world"; it is "international, pluricultural and multilingual". But this is an idealised image of the Internet. On the other hand, the Internet is seen as a medium conditioned by McWorld and by the concepts of Americanism on a global scale. Cultural homogenisation and electronic commerce count on the idea that consumption becomes the sole human activity, one that standardises consumers' tastes.

..and at the end about the future perspectives:

Another avenue of research will be the industrialisation of education. The questions raised by these trends are multidimensional: Can human interaction be replaced? Will qualified teachers be replaced by less qualified ones once the course content has been put on the Internet or on CD-ROM? Who will own the content of these courses, which have become products to exploit: the professor or the faculty administration? Who will have an interest in selling this content, who will hold the copyright, and who will make the money?
Ouch. Then again, there is lots of good stuff too. I love the translation of "knowledge network" into "le tissu de savoir", or calling the whole Internet a "tissu social virtuel"! What foresight! I remember sitting with my supervisor in the Montmartre graveyard while he explained that instead of talking about the Internet, I could use the term "tissu social virtuel", like a web woven into a tissue where the threads are all interwoven, to illustrate the use of the Internet. Pretty funny in its own way..

This is my opening line:
The Internet is often associated with the concept of interactivity. Is it possible to exploit this property to implement specific techniques for teaching modern foreign languages? Unlike the child, who learns their mother tongue in a naturally interactive environment - and permanently so - the student regularly attends a course where the linguistic immersion is artificial and of short duration. To what extent is it possible to reconstitute, from a "tissu social virtuel" (= the Internet), the ideal conditions for learning, particularly for learning foreign languages?
I remember that when I was finally writing up my dissertation, my promotor was merciless. He really cracked the whip on me. But he knew how far to push, and in the end I too was really pleased with the result. After all, it was "mention bien". When I thanked him for this, he said "Please, don't you give no 'thank you' -- after all, it's my job...". That's a true educator! But hey, he could have taken some credit for it.

A funny thing was that his English was perfect, but I only learned that after we were done with all the writing. He was harsh on my French too. When I had already handed in the first version of my dissertation, he congratulated me on it. But halfway down his mail he says that in its current version no one can read it without great difficulty:
However, you will only receive your diploma if you later deposit a new copy, corrected of all the mistakes in French -- you have to understand that this copy is destined for the library, and that, as it stands, no one can read it without great fatigue...
The final discussions we had before my defense took place in Montmartre. We met either in that same café where they shot scenes for Amélie, or at the graveyard. I remember being quite surprised to see those places in the movie.

It was also fun now to check up on some professors from Paris 8. I found Jean-Pierre Balpe, who at the time was the head of the department of Hypermedia. Jean Clément, who taught us a course on hypermedia. Imad Saleh, one of my "rapporteurs", is now the head of Laboratoire Paragraphe, which is the nickname of the department (so French, I love it!). Or Jean-Louis Weissberg - I never understood anything about his lectures; he talked about "telepresence" and such. I only remember the soulful intellectual Greek philosophy student who engaged in discussions with him about things that I could not follow. After all, they all carried the legacy of Pierre Lévy, who had left the department just a year earlier.

I tried to dig around my Lycos account, where this stuff now resides, but I could not find my password and the system does not recognise my username :( But I found some other old stuff, like the index page of the site that I made at that time for Finnish students in Paris, Suomalainen Osakunta, my cool-looking CV, and the akseli site that Pia, "Puumatyttö", made for Finnish artists and that I helped with at some point. Lovely!


Well there, that was a nice moment of memories from Paris and the end of the millennium.

Thursday, August 07, 2008

Can Social Information save teachers' time when choosing interesting learning resources?

One of my research questions is aimed at understanding what so-called Social Information can do to help teachers choose the right learning resources from a seemingly overwhelming collection. By Social Information I mean information about previous users' interactions with a resource. I am mainly interested in explicit annotations like ratings and tags, and more implicit ones like bookmarks.

As I'm interested in the use of resources that come from countries other than the users' own, I think Social Information (SI) should display not only the annotations, but also information about where the annotating users come from.

One thing I hypothesise is that Social Information, when associated with conventional metadata about learning resources, can make the decision-making process faster for teachers, for example when looking at a list of search results. As a multilingual repository can contain metadata in different languages, one could speculate that Social Information indicating the origin of the users who have previously annotated a resource could help other users make up their minds (see the image for an example).

We were interested in two different aspects:
  1. Does the appearance of Social Information make the decision making process any faster?
  2. Does the appearance of Social Information make the users choose more resources?

Method

We had 25 users from five different European countries: primary and secondary teachers of science, language learning and ICT in Finland, Estonia, Hungary, Belgium and Italy. xx of them are female and xx male. xx participants are under 30 years old, xx under 40, xx under 50 and xx under 60 years old.

They have been part of the MELT project since Summer 2007. In March 2008 they were invited to create a profile on the MELT portal, where they are able to access multilingual learning resources for different topical areas.

We designed an experiment where teachers were shown two different mock-ups of a search result list with learning resources and their associated metadata. One of the lists showed what we call the conventional metadata, such as title, URL, language of the resource, a short description, subject area, type of content and its target audience. Here is an example.

The other list had the same metadata, but we also added the Social Information from previous users. This could be the tags in their original language, the number of times the resource was bookmarked (favourited) and the ratings. For bookmarks, we also mentioned which countries the users came from. An example of this was shown above, in the first image of this post.

We had 48 learning resources that came from different countries and were in different languages. About half of them were in English and the other half in other languages, which also seems to reflect the division of the resources that users have bookmarked on the portal. The resources covered language learning, primary education, ICT and science, matching the teachers' subject areas. I'll prepare better information about this later.

We had 12 learning resources on a page imitating the list of search results that a user could get on a repository. In total, there were 4 such pages for each user; we call them sets. Every second set had only the conventional metadata, and every other additionally the Social Information as indicated above.

At the beginning of each set the participants were asked to write their names and the time when they started with the set of 12 resources. At the end, when they submitted their results, the system recorded the time. To answer our first question, we were interested in how much time teachers spent evaluating the appropriateness of 12 resources.

The teachers were asked to look at the metadata of each resource, and at the resource itself if it looked interesting, and to answer one single question: "Would you use this resource, or parts of it, in your teaching next Fall?" They answered on a scale of 1 to 5, 1 being "I don't teach the topic", 2 = No, 3 = Maybe not, 4 = Maybe and 5 = Yes. Looking at the number of resources that users chose in their topical areas would give us an indication of whether resources with more Social Information attached were chosen more often than the ones without.

Because of the low number of participants (n=25) we decided on a within-subject design for this experiment, in which the same group of subjects serves in both treatments, i.e. they received both the material with conventional metadata and the material with Social Information.

Moreover, we split the participants into two groups. Group 1 had 12 participants and Group 2 had 13. Group 1 started with a set with conventional metadata and Group 2 with a set of resources that had Social Information added to it. When analysing the results, we found that one user in Group 2 had consistently entered incorrect times. We excluded these times from the counts of time spent per set, leaving 12 participants in each group. In both groups there were also a few cases where the start time had been forgotten.

Results

Descriptive statistics

We had 1129 responses to our questions, which means 71 responses were left blank. In 53% of the cases the users answered that they do not teach the topic, meaning they deemed the resource unsuitable for their own topical area. In 25.6% of the cases users found resources that they said they would use (yes or maybe), whereas 21.6% of the resources were judged not useful for the upcoming school year (no, maybe not). The mean response was 2.23 (Min=1, Max=4); the standard deviation was 1.138.

Q1: Does the appearance of Social Information make the decision making process any faster? Time spent on 12 resources (i.e. set)

On average, users spent a bit more time on the sets that did not contain Social Information. The average time to review a set of 12 resources was 9 minutes with conventional metadata and 8 minutes with Social Information.

I do not know yet whether this is a significant difference (my SPSS license ran out), but one could take it as at least a sign of good news for Social Information. We can imagine that users go through a lot of resources when browsing a learning resource repository (I currently do not have the logs on the number of resources that users review per session, but I will produce them). So if you think of small cycles and multiply that difference by, say, 10, you could arrive at significant time savings when Social Information is made available to speed up the decision making process.
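Since the per-user set times are recorded, the significance check could eventually be done without SPSS; a paired t-test over each user's average time per treatment would match the within-subject design. A minimal sketch, where the minute values are invented for illustration and are not the study's data:

```python
from scipy import stats

# Hypothetical per-user average minutes per set of 12 resources.
# These numbers are invented for illustration; they are NOT the study data.
conventional = [9.5, 8.0, 10.2, 7.8, 9.1, 8.7, 11.0, 9.9, 8.4, 9.6, 10.5, 8.9]
with_social  = [8.1, 7.5, 9.0, 7.9, 8.2, 7.8, 10.1, 8.8, 7.6, 8.5, 9.7, 8.0]

# Paired (dependent) t-test: the same subjects served in both treatments.
t_stat, p_value = stats.ttest_rel(conventional, with_social)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```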

Individual differences

Still looking at the average times spent, we can see many individual differences. In the chart below, the blue line shows the amount of time that participants spent with conventional metadata and the red one with Social Information added. You can see that for some users one metadata setting seems faster, but overall the lines follow one another pretty closely, apart from some oddballs (like user 23). You can also see a wide variety of personal styles: some users scrutinise resources with great care (users 7 and 8), whereas others go through them very fast (users 16 and 17).

I'd like to mention that here it does not matter that some of the resources are outside the competence area of the participants. We focus purely on the time they spent going through the pages and deciding whether some of the resources would be useful to them in the upcoming school year. However, this becomes crucial for answering our second question:

Q2: Does the appearance of Social Information make the users choose more resources?

The table below presents the results when I looked at the number of resources chosen per set. There were 4 different sets, each containing 12 learning resources in different languages. The two treatments meant that teachers reviewed 2 sets with Social Information available and two sets without. As teachers came from different topical backgrounds, I excluded the responses of users who said that they do not teach the topic of the given resource. In the table below you can see the percentage of positive responses (maybe use, use).

It appears that teachers consistently chose more resources when the Social Information was not available. This is contrary to what I expected. I have not calculated the significance of these results, but the differences do look big; in some cases, like in the 2nd set, about 15 percentage points in favour of no Social Information.
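For the chosen-resource counts, one way to check significance once the real counts are tabulated would be a chi-squared test on a 2x2 contingency table (treatment vs. positive/other responses). The counts below are made up purely for illustration:

```python
from scipy.stats import chi2_contingency

# Hypothetical counts of positive ("maybe"/"yes") vs. other responses,
# invented for illustration; they are NOT the study data.
#                positive  other
without_si = [130, 340]
with_si    = [100, 370]

chi2, p, dof, expected = chi2_contingency([without_si, with_si])
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")
```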

In a way, maybe the appearance of SI makes the teachers more careful or critical when choosing resources?

What is needed now is a follow-up study at the end of this school term to check whether these teachers actually used the resources in their teaching. Or I could check whether they have bookmarked these resources on the MELT portal; they know the resources are available there. MORE to follow...

Wednesday, August 06, 2008

Assume nothing. Plan for everything.

I noticed this ad at the train station when I was returning from my weekend sailing trip in Friesland, NL. It kind of captured the mood of the first half of this year, or better, the lesson I wish I had learned during that time. Assume nothing. Plan for everything. And then plan some more.

I think I sometimes assume too much and take things for granted. I think that everyone else is "with me" in the same thing or on a same (mental) trip, and I do not bother to explain the plot, set out my major expectations and go through all the details, etc. Then at one point, usually when it is already too late, I realise that this is not what I assumed it was.

Last weekend this dangerous thinking left me in the water when we accidentally capsized our small sailboat (seen in the pic)...and there was no plan for what to do then. (One can be found here.)

Clearly, once again I had been trapped by my not-so-productive way of thinking. We had already been sailing a 6-metre open Falk boat for a day with a crew of 4. My pal was skippering, and the three of us others had some or good sailing experience. I got in some really good sailing :) I managed to get us through a pretty rough channel against a strong headwind. We had to tack at least ten times to get to the other side, but I managed it all: we did not lose too much speed in the turns, and I made the guys jump from side to side to give the weight to keep the boat from flipping. I realised that Falks are darn sensitive to your weight; it really makes a difference which side you lean on.


So along comes the next day and we are sort of returning. We have one reef down because of the heavy winds. The skipper was steering, I was taking pics in the bow, and the two others were somewhere in the middle. The next thing I know, one of us slips on the floor and tumbles to the side. Mind you, these are small boats, so by the time I turn to watch, he's trying to grab onto something and I'm sure he's going to fall off the boat.

With the sudden shift of his body weight and a heavy gust from the side, the next thing I observe is that he is not going to fall off the boat; the boat is going to fall with him. I see our chart flying, the bottles and the gas jar landing in the water, and soon everyone else follows. Since I had wedged myself firmly in the bow to take pictures, I found myself standing on the lower side wall of the boat, which now lay horizontal in the water, slowly sinking as more and more water got in.

I was not sure what to do. Would it be better to stay in the boat or leave it? After a thought of the boat turning turtle, and seeing myself tangled in the ropes under it, I descend into the water and start collecting our belongings. Soon enough I realise how heavy it is to swim with clothes on, think "heck with our empty water bottles and the fake Crocs I tried to save", and swim to the others on the other side of the boat. The skipper lifted me onto the keel. I was like "so what now", but I did not get any instructions. I had not prepared for this and was not sure what to do next. Our weight on the keel had no effect on righting the boat (which was the goal the skipper pursued).

Then, mere seconds later, the rescue troop was behind us. I could not figure out how they got there so soon, but later they told me it was because of the regatta that was going on; they had seen us capsize. They put a rope on our mast and, with the help of a second boat, got it back upright. The mast was all muddy: the wind had pushed the boat so much that instead of turning all the way around, it had got stuck in the shallow water. No wonder our weight did not do a thing to right the boat.

We were eventually towed back to the harbor by these Dutch gentlemen of the sea and left sorting out our dripping wet packages, cursing about wet mobile phones and the bill to pay for some gear lost from the boat.

In retrospect it feels just like what happened with the PhD, where I got "snowflake*d" (i.e. my adviser at the time kicked me out of the programme). It was just like with sailing: one day you feel like you can do all the good moves, and the next you find yourself swimming in the water wondering "this is not how I assumed it would end". The worst part was that I was not prepared for it; I had not thought about what plan B, or C for that matter, would be.

Reflecting on all this makes me see a recurring pattern that I'd better avoid in the future. It's clear that accidents will always happen and mistakes will be made, but there are also precautions that can be taken to prevent them: careful planning, and someone taking clear leadership of things. The other thing is that when accidents happen or mistakes are made, one needs to know what to do next. Be prepared for them. And then some more.

Being comfortable with other people taking the lead is good; good leaders need good followers, as Dan is known to have said once. Perhaps it is time for me to think more about taking leadership of things and to try to ensure, at least in my own life, that I'm prepared for everything. That is one characteristic that I expect from the skipper of a boat, and it probably should go for other areas too.

Monday, July 28, 2008

Measures for cross-border actions with Tags and Resources

If I break down the triple of {user, tag(s), resource}, I can study the pairwise relations separately:
  • User - resource
  • User - tags

  • Resource - users
  • Resource - tags

  • Tag - resource
  • Tag - users

And as I am especially interested in the cross-border actions, I would study the cases where:

User country ≠ Resource country

In this case I am interested in studying users' collections of bookmarked resources, especially establishing which countries the resources originate from. Using the cross-border metric I can take a snapshot of the resources and calculate a cross-border resource value for the user.
  • E.g. User(Finland) has bookmarked Resource1(Poland), Resource2(Spain) and Resource3(Finland).

  • This gives User(Finland) a resource profile of Poland 33%, Spain 33% and Finland 33%.

  • In this case, as the user is from Finland, the cross-border share would be 66%, which would translate to a value of .66 if we imagine the cross-border value running between 0 and 1.
So what, you say. It makes a difference, I say.
  • This allows me to categorise this user as a cross-border user of resources. I assume that users differ in their inclination to use resources that come from other countries; some use them a lot, others do not want to bother with them.

  • So this metric allows me to study who does what, and thus better understand our user base.

  • In the long run this will of course make it easier to recommend resources to users, as their profile already shows that they are inclined to use cross-border resources.
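This bookkeeping is simple enough to sketch in code; the data layout (the anchor's origin plus a list of the bookmarked items' origins) and the function name below are my own illustration, not MELT's actual schema. The same calculation applies symmetrically to a resource's user profile and to the tag-language profiles further down, since each is just the share of items whose origin differs from the anchor's.

```python
from collections import Counter

def origin_profile(anchor_origin, item_origins):
    """Return the share of each origin among the items, plus the
    cross-border value: the fraction of items whose origin differs
    from the anchor's."""
    counts = Counter(item_origins)
    total = len(item_origins)
    profile = {origin: n / total for origin, n in counts.items()}
    cross_border = sum(n for origin, n in counts.items()
                       if origin != anchor_origin) / total
    return profile, cross_border

# The example above: a Finnish user bookmarking Polish, Spanish
# and Finnish resources.
profile, cb = origin_profile("Finland", ["Poland", "Spain", "Finland"])
print(profile)        # each country holds a third of the profile
print(round(cb, 2))   # 0.67 -- the "about 66%" cross-border value
```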
Resource country ≠ User country

This allows me to look at things from a different point of view. Here, I am interested in establishing a profile for a resource. It appears that some resources are used a lot by people from different countries, whereas others are used predominantly by users from the same country that the resource itself comes from.
  • E.g. Resource(Finland) has been bookmarked by User1(Poland), User2(Spain) and User3(Finland).
  • This gives Resource(Finland) a user profile of Poland 33%, Spain 33% and Finland 33%.

  • In this case, as the resource is from Finland, the cross-border share of users would be 66%, which would translate to a value of .66 if we imagine the cross-border value running between 0 and 1.
So what, you ask again. I think it's cool, because then I can quickly, and in an automated way, calculate which of my resources have a high potential to cross borders easily. First of all, this will help me study whether there are some characteristics that make these resources cross borders.

Second, we can use this information to filter out the resources that we think cross borders easily. This could be cool on our portal, for example: we could flag these resources for users, and furthermore, we could give them priority when other repositories are harvesting us or searching us in a federated manner.

Resource country ≠ Tag language
It'll also be interesting to create profiles for resources based on tags in different languages. For tags, we do not trace the country of origin, just the language. So in this case I'm interested in looking at a resource's tag profile.
  • E.g. Resource(Finland) has been given Tag1(Polish), Tag2(Spanish) and Tag3(Finnish).
  • This gives Resource(Finland) a tag profile of Polish 33%, Spanish 33% and Finnish 33%.

  • In this case, as the resource is from Finland, the cross-border share of tags would be 66%, i.e. a value of .66, as above.
This too is an indication that the resource has the potential to cross borders. Tags in different languages might yield some interesting information on how a learning resource could be used in a new context. In this example the resource was created in Finland, so one could assume it has some underlying ingredients that make it suitable for the Finnish curriculum. On the other hand, the fact that users have added tags in Polish and Spanish too might indicate that this resource is also useful for teachers in those countries.

An interesting case seems to emerge here for topics like language learning, say, English as a Second Language (ESL). Language learning and teaching resources seem to be easily reusable in another language context. Interestingly, though, we've seen that in these cases teachers tend to tag them in the language in question.
E.g. User(Finland) has added the tag "English" to an ESL Resource(Poland).

Tag language ≠ Resource country

We can also look at things from the tag's perspective.
  • E.g. Tag(Finnish) has been added to Resource1(Poland), Resource2(Spain) and Resource3(Finland).
  • This gives Tag(Finnish) a resource profile of Poland 33%, Spain 33% and Finland 33%.

  • In this case, as the tag is Finnish and only one of the resources is from Finland, the cross-border value would be .66, as above.
This allows us to observe cases where a tag is related to learning resources that most likely share some thematic resemblance. It could be, for example, science resources from different countries that Finnish teachers have collected. In such a case we can also find evidence that these resources were adaptable to the Finnish curriculum despite coming from other countries.

Tag language ≠ User country

On the other hand, we also find tags that have been used by users from different countries. These are the tags that we have previously identified as "travel well" tags. They have some interesting properties that make them easily understandable without translation, e.g. names (people, countries, places), acronyms and common terms (web2.0).

By looking at the connection between tag language and user country we can possibly identify such tags. The other common case seems to be that people have tagged the resource in English. In any case, if many people have done so, we can identify these terms and analyse them manually. The hypothesis is that they are either "travel well" tags or super-popular tags that would also score high on the tag non-obviousness metric of Farooq et al. (2007).
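A sketch of how such candidate tags could be pulled out automatically before the manual analysis; the triple layout, the example tags and the two-country threshold are my own assumptions:

```python
from collections import defaultdict

# Hypothetical (user_country, tag, resource_id) triples -- invented data.
triples = [
    ("Finland", "web2.0",   "r1"), ("Hungary", "web2.0",   "r2"),
    ("Belgium", "web2.0",   "r3"), ("Finland", "fysiikka", "r4"),
    ("Estonia", "Comenius", "r5"), ("Italy",   "Comenius", "r6"),
]

# For each tag, collect the set of countries of the users who applied it.
countries_per_tag = defaultdict(set)
for country, tag, _resource in triples:
    countries_per_tag[tag].add(country)

# Candidate "travel well" tags: applied by users from two or more countries.
candidates = sorted(tag for tag, cs in countries_per_tag.items() if len(cs) >= 2)
print(candidates)  # ['Comenius', 'web2.0']
```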

User country - Tag language
Lastly, just to enumerate all the cases, we also have the relation between user country and tag language. This can be used to study users' personal tagging behaviour. In a previous study in Calibrate we found that, on average, users tag in their mother tongue and in English (75% to 25%). It seems, though, that things look different in MELT, where teachers are tagging more in English.

We are not sure whether these are personal preferences or the influence of social awareness, as in MELT tags are made readily available to others through a tag cloud, whereas in Calibrate they were only used for personal knowledge management reasons.

In any case, this relation allows us to measure individual differences between users and thus understand our user-base and possible user scenarios better.


What next? I will run a case study applying these measures to the MELT tags that we've got in the system so far.

Dataset:
  • Learning resources: 199
  • Users: 40 (From Fi, Hu, Et, Be, At, It)
  • Tags:
    • 572 distinct tags
    • 969 tag applications
    • 75% of tags were used only once
    • 25% of tags were used more than once

Tags and SNA measures

Studies on tags commonly take the triple of {user, tag(s), item} as the unit of study. That's also what I'm interested in, especially the underlying structures that build relationships between users, tags and items. Some apply Social Network Analysis to study, for example, the centrality measures of the network.

A run-down of SNA measures from Wikipedia

Betweenness
The degree to which an individual lies between other individuals in the network; the extent to which a node is directly connected to nodes that are not directly connected to each other; an intermediary, liaison, bridge. It is thus the number of people whom a person connects indirectly through their direct links.
(Somewhere else: the betweenness measure indicates a node or nodes that connect clusters of nodes. Nodes that have high betweenness have high influence over what information flows in the network.)
Closeness
The degree to which an individual is near all other individuals in a network (directly or indirectly). It reflects the ability to access information through the "grapevine" of network members. Thus, closeness is the inverse of the sum of the shortest distances between an individual and every other person in the network.
(Degree) centrality
The count of the number of ties to other actors in the network. See also degree (graph theory).
Flow betweenness centrality
The degree to which a node contributes to the sum of maximum flow between all pairs of nodes (excluding that node).
Eigenvector centrality
A measure of the importance of a node in a network. It assigns relative scores to all nodes in the network based on the principle that connections to high-scoring nodes contribute more to the score of the node in question.
Centralization
The difference between the number of links for each node, divided by the maximum possible sum of differences. A centralised network has many of its links concentrated around one or a few nodes, while a decentralised network is one in which there is little variation in the number of links the nodes possess.
Clustering coefficient
A measure of the likelihood that two associates of a node are associates themselves. A higher clustering coefficient indicates a greater 'cliquishness'.
Cohesion
The degree to which actors are connected directly to each other by cohesive bonds. Groups are identified as ‘cliques’ if every actor is directly tied to every other actor, ‘social circles’ if there is less stringency of direct contact, which is imprecise, or as structurally cohesive blocks if precision is wanted.
(Individual-level) density
The degree to which a respondent's ties know one another; the proportion of ties among an individual's nominees. Network- or global-level density is the proportion of ties in a network relative to the total number possible (sparse versus dense networks).
Path Length
The distances between pairs of nodes in the network. Average path-length is the average of these distances between all pairs of nodes.
Radiality
The degree to which an individual's network reaches out into the wider network and provides novel information and influence.
Reach
The degree to which any member of a network can reach the other members of the network.
Structural cohesion
The minimum number of members who, if removed from a group, would disconnect the group.
Structural equivalence
Refers to the extent to which actors have a common set of linkages to other actors in the system. The actors don’t need to have any ties to each other to be structurally equivalent.
Structural hole
Static holes that can be strategically filled by connecting one or more links to link together other points. Linked to ideas of social capital: if you link to two people who are not linked you can control their communication.
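To make a couple of the definitions above concrete, here is a minimal Python sketch (my own toy example, not taken from any of the glossary's sources) computing degree centrality and the clustering coefficient on a small undirected graph stored as a dict of neighbour sets:

```python
# Toy undirected graph: node -> set of neighbours.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

def degree_centrality(g, node):
    """Number of ties a node has, normalised by the n-1 possible ties."""
    return len(g[node]) / (len(g) - 1)

def clustering_coefficient(g, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    neighbours = list(g[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbours[j] in g[neighbours[i]]
    )
    return 2 * links / (k * (k - 1))

print(degree_centrality(graph, "a"))       # 1.0: "a" is tied to everyone else
print(clustering_coefficient(graph, "a"))  # only the pair b-c is linked
```

Here "a" has the maximum degree centrality of 1.0, and its clustering coefficient is 1/3 because only one of the three pairs among its neighbours (b-c) is itself connected.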
What made me think of this now was reading the mini study Using Social Network Analysis to Highlight an Emerging Online Community of Practice by Anthony Cocciolo, Hui Soo Chae and Gary Natriello (Teachers College, Columbia University).

The method they used caught my attention. They used
..System Theory to define the uploading and downloading of materials as "communicative acts", the users of the system as the "actors", and the cumulative communicative exchanges as "interactions" (Buckley, 1967). .. this particular systems arrangement is useful because it provides a readily available metric for assessing actors' interactions within a network.
I think it might be interesting to think how this could be used to study the underlying networks with tags.



The most comprehensive reference is: Wasserman, Stanley, & Faust, Katherine. (1994). Social Network Analysis: Methods and Applications. Cambridge: Cambridge University Press. A short, clear basic summary is in Krebs, Valdis. (2000). "The Social Life of Routers." Internet Protocol Journal, 3 (December): 14-25.

Thursday, July 10, 2008

Notes: Tagging tagging. Analysing user keywords in scientific bibliography management systems

An interesting paper in JoDI about tagging in a bibliography management system.
Tagging tagging. Analysing user keywords in scientific bibliography management systems
Christian Wolff, Markus Heckner, Susanne Mühlbacher
Journal of Digital Information, Vol 9, No 27 (2008)

Some outcomes:

a category model for tags in a scientific bibliography management scenario. This model covers linguistic features, the relation between tags and the text of the tagged resources, as well as functional and semantic aspects of social tags.
Here is an image of the model that I copied from the paper:










This is actually a really cool model for tags. So far I've been using three categories from the MovieLens and Golder (2006)/Huberman (2005) studies: factual, subjective and personal. I've noticed, though, that I've added many sub-categories to the factual one.

As in this model, I've discovered very similar types of tags. The "Functional Category Model" is especially interesting: it has two sub-classes:
  • subject related (e.g. resource related and content related) and
  • non-subject related, personal tags (e.g. affective, time and task related, tag avoidance=no tags).
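The two sub-classes can be sketched as a simple data structure. The example tags below are made up by me for illustration; only the category and sub-class names follow the model described above:

```python
# Hypothetical example tags sorted into the two functional sub-classes.
functional_categories = {
    "subject related": {
        "resource related": ["paper", "video", "simulation"],
        "content related": ["photosynthesis", "fractions", "grammar"],
    },
    "non-subject related": {
        "affective": ["cool", "interesting"],
        "time and task related": ["to_read", "for_5th_grade_lesson"],
        "tag avoidance": [],  # the "no tags" case
    },
}

def category_of(tag):
    """Look up which (category, sub-class) pair a tag falls into, or None."""
    for top, subs in functional_categories.items():
        for sub, tags in subs.items():
            if tag in tags:
                return (top, sub)
    return None

print(category_of("cool"))  # ('non-subject related', 'affective')
```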

Other things:
The "typical tag" is a single-word noun, taken from the title of the respective article (identical or variation), thus directly related to the respective subject.
Yep, we have many of these too! When I talk about these I refer to the non-obviousness metric from Farooq et al. (2007).

In contrast to previous studies the number of non-subject related tags remains rather low in the scientific data we observed and the full potential of tagging systems to describe qualities or aspects of resources does not seem to be used. But the absence of tags like cool, interesting, to_read does not mean that users who tagged the resource do not think it is cool, of interest or worthy of reading, but simply that the users did not express their ideas they may have or may not have about the resource.

This is interesting too. I think each audience tags differently. Our target audience is teachers, about 35-55 years old. They do not seem to go around tagging learning resources with tags like cool, etc.
Compared to author keywords, social tags tend to introduce less and simpler concepts: Altogether, only one third of the social tags matched with (the far more numerous) authors’ keywords. Moreover, tags tend to be more general and users tag their articles more general and with less words than authors.

This is also interesting. There are some studies that have compared the tags and expert indexer keywords and have found even less overlap, if I remember right.

I love this one, it is so much the case:
Additionally, it shows that the respective system environment, e.g. tag suggestions, has a major influence on the tagging behaviour in terms of spelling errors, tag usage and creation of a specific tagging languages. This extends the number of the main influential factors on tagging behaviour being personal tendency and community influence through the additional component system influence.


They also flag comparative studies across tagging platforms as an interesting study area. I've looked a bit at different tagging systems for educational resources. This version is an old one, but I post it anyway:

Vuorikari, R., Poldoja, H. (submitted). Comparing tagging and its purposes across learning resource repositories. pdf

Wednesday, July 09, 2008

Teachers as Netpromotors of digital content

I made a survey with 28 teachers from different European countries on multilingual learning resources. You can find those 28 resources in this list. Our portal has a lot of multilingual resources that come from a variety of Ministries of Education in Europe.

But - we do not know for sure whether teachers find resources useful that come from different countries than they do, and that are in different languages than they speak. Hence my little survey. You can read more details here.

We only considered responses from teachers who came from different countries than the 18 resources in our survey did. A quick round of results:
  • 43% of respondents found resources, which came from a different country than they did, of use for preparation purposes.

  • 41% of respondents found resources, which came from a different country than they did, of use for teaching purposes.

  • 65% of respondents said that they would share these resources, or parts of them, with their colleagues and friends.

  • Even among respondents who said they did not have expertise in the given subject area, 35% thought that they would share the resource with their colleagues.
These were the results on a scale of 1-5 (n=254).







It made me think that:

a) If teachers use multilingual or foreign-language resources, they most likely use them both for preparation and for teaching purposes. We do not know, though, whether they would use the resource in their teaching themselves or let pupils interact with it directly.

b) Teachers are good filters. More teachers said that they would be willing to share resources with their colleagues than would actually use them themselves. It might be that this happens with a resource which they think is interesting but which does not match their curriculum goals for the year. They might say, "Hey, my colleague would love this, I'll send it to her!" This is the basic mechanism of viral marketing; how can we leverage it on a learning portal?

c) "Would you like to share it with your colleagues" is one of the key questions when studying customer satisfaction and loyalty, topic that we in learning repositories often neglect. If teachers are happy users, or if teachers find good material on the portal, they can become promoters of those resources. This might be very important especially when we deal with resources that are in multiple languages, because sometimes it is hard to discovery those resources.

If we take the teachers in the survey, we could calculate the Net Promoter Score by subtracting the % Detractors (i.e. the ones in my survey who rated this 1 or 2 on the scale of 1-5) from the % Promoters (i.e. the ones who rated this 4 or 5).

Take the case of sharing: it would be 65% - 22% = 43%. That is a pretty good Net Promoter Score; most companies score around 5 to 10%, and it is very unusual to score above 50%.
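The calculation above is easy to script. A minimal Python sketch, with a made-up ratings list whose split matches the sharing question (65% promoters, 22% detractors, 13% passives):

```python
def net_promoter_score(ratings):
    """NPS on a 1-5 scale: % promoters (4-5) minus % detractors (1-2).
    Ratings of 3 are passives; they only dilute the percentages."""
    n = len(ratings)
    promoters = sum(1 for r in ratings if r >= 4)
    detractors = sum(1 for r in ratings if r <= 2)
    return 100 * (promoters - detractors) / n

# Illustrative distribution: 65 promoters, 13 passives, 22 detractors.
ratings = [5] * 40 + [4] * 25 + [3] * 13 + [2] * 11 + [1] * 11
print(net_promoter_score(ratings))  # 43.0
```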

This can indicate that teachers are willing to put their credibility on the line by recommending, to a friend, a resource that comes from a different country than their own!

Now, I just have to think of the best way to do this ;)

A draft idea for a paper: A case study on teachers' use of social tagging tools to create collections of resources - and how to consolidate them?

UPDATE: the submitted paper, comments welcome!

This paper explores how a group of pilot teachers (16) create collections of digital learning resources using tagging tools. We study two different tools: an educational portal (MELT) and del.icio.us. We first look at the characteristics of these collections (number of resources, languages of resources, number of tags used, etc), and then propose a way to display the resources and tags from del.icio.us on the learning portal (MELT) using Attention Profiling Markup Language (APML). This allows a higher level of integration between a learning portal and an external social tagging service like del.icio.us, and thus allows a wider variety of digital learning resources to be discovered.

Method

We selected 16 pilot teachers from the MELT project as the subjects of this study. These teachers have an account both on the MELT portal and on the delicious bookmarking service. They are primary and secondary teachers in science, language learning and ICT in Finland, Estonia, Hungary and Belgium. 7 of them are female and 10 male. One participant is under 30 years old, 8 are under 40, 5 under 50, and 3 under 60.

They have been part of the MELT project since Summer 2007, when they were first introduced to delicious during a summer school. In March 2008 they were also invited to create a profile on the MELT portal, where they were able to access multilingual learning resources for different topical areas.

From the MELT portal we know the detailed profiles of these teachers: their names, the topics they teach, the country where they teach and the languages they speak. Moreover, we have information about the learning resources that they have bookmarked using the portal, including information about the resource itself and the tags applied. We additionally asked for their delicious usernames so they could be part of this small study.

From delicious, using the html service, we were able to download the last 100 bookmarks and tags that these teachers had posted. We also took all the data regarding the tags and people these users had in their networks. Lastly, we recorded the number of posts each teacher had on their account.

We collected the following data for our selected 16 users:

















Additionally, the delicious data contained the following information regarding the networks. Two people had chosen to keep their networks private:
  • Number of distinct people in the networks: 104
  • Number of people in the networks: 270
Results
Discussion


References

del.icio.us API and other not so successful trials

I am getting somewhat disappointed in some of these web 2.0 "things". Take, for example, the delicious API.

I wanted to download the posts of a number of people in my network to study what the hell they are doing. The API allows you to download all your own posts in a neat xml format. That's cool, I thought, let me just do this for 20 of my buddies, and I can better study how teachers are bookmarking - especially how they are bookmarking websites that are not from their own countries or in their own languages (e.g. cross-border use).

The delicious API only allows you to get the latest 30 posts from people whose password you do not know. wtf? The same goes if you try to get them through RSS feeds: you only get 30. Then there is the html code that you can use, but it also only allows you to get 100 posts. What about the rest, those 999 posts that I want? That stuff is so badly documented on the site that it's very annoying. Why not just be frank about it and say this is how things are?

I do not understand why the API, RSS or html code is limited when all that stuff is freely viewable anyway. So I tried using wget to suck that stuff out, but there is something fishy there too and I can never get past 100 posts. So, I guess that just makes me limit my study to a sample of 100 posts per user. Easy.

The other thing that I've been slightly disappointed with lately is APML and a number of tools that are available for tracking your online profile, like engagd.com or tagurself.com.

You know what, the idea is great, but those tools/widgets suck, and they are so badly documented that it makes you just wanna cry. I've tried like 3 times in engagd to make my APML profile out of 2 different feeds, and it never works. Tagurself cannot even load the example from the url that they themselves have posted. wtf?

Moreover, Yahoo! pipes are also somewhat strange, they never actually seem to post what they should. I put one example in one of my last posts and it hardly ever loads. Not so fun.

Hmm... I guess if more people used all these 2.0 tools, and not only talked about their potentially revolutionary usage by non-savvy web users, we would have to face the fact that the user-created web is far from being so revolutionary and does not empower users like me. Instead, I'd like to see those folks walk that talk, sit down on their asses and finally get past the BETA versions of their tools to actually make them work properly. Dude, I cannot wait to get rid of all the BETA versions on the web.

Tuesday, July 08, 2008

Friday, June 13, 2008

Pipe trial for delicious

Monday, June 02, 2008

This is it! Resources that cross boundaries

Ok, I think this graph is the coolest kid in the blog!!




What you can see here are the communities of users by mother tongue (nodes); the edges are the resources that these users have added to their collections.

This is a great visualisation of communities of practice. What you can see at a glance is that the learning resources these users have added to their collections are very much community oriented, in this case divided by language.

I sometimes frame my research question as the following:
Does a multi-lingual and multi-cultural learning resources portal rather act as one system divided into different language or country groups, or is it more like one monolingual system with its own sub-groups and communities of practice (think of a system like delicious) that cross the language and cultural borders?
This visualisation seems to point more to the first option (this REALLY needs to be further investigated!!): it seems that users are divided into groups by mother tongue. I say so because you cannot see many resources that are shared among the groups.

To play around with this yourself, make sure that you click on the arrowhead down in the menu bar. This allows you to see in which direction the links go. They often just go in one direction.

There are some resources that indicate communities of interest between countries. For example, in this image, we can see that there are some resources that are shared by both Estonians and Lithuanians. One of them is highlighted in orange.

These are the interesting resources, as they cross boundaries. The more I think of it, the more I'm convinced that you cannot call these boundary objects (see my previous post). If I got the boundary object right, they are objects that help two groups talk to one another because the groups do not share the same language or jargon. But in this case, I think it's the contrary: these people share so much that they can even share resources in Russian (these being ex-Soviet countries, Russian is common knowledge).

Anyway, even if the rather disappointing news is that users on an international portal seem to stick together based on their mother tongue rather than common educational interests, the good news is that I believe that by making more social cues and traces available to them, they would actually start exploring resources in other languages and other areas.

And besides, who says that my data here actually displays this community correctly!? This is based only on the common resources that users have put into their collections. Actually, LeMill is more of an authoring environment, so maybe a better way to study this community would be through the collaborative authoring of learning resources? Or something else, like common search terms or tags.

So, take this exploratory description of this data set with a little bit of skepticism!

In what languages are the resources that end-up in collections?

Well then, I guess that will be a no-brainer...

In this visualisation, you can see the languages of resources (e.g. English) as nodes and the languages of users as edges (e.g. en, de..).


If you click, for example, on English, a lot of edges are highlighted. Those are the mother tongues of users who have bookmarked these resources. After a little bit of playing, you'll find that English resources, and the ones with no language specified, seem to be the most popular with users.

However, it is cool to see that resources in other languages also end up in users' collections. Here, for example, you can see that resources in Czech (sorry for the misspelling) are also used by users with Polish and Lithuanian as their mother tongue.

More analyses are needed to give you any numbers, but this already is an interesting insight.

Resources country of origin and user mother tongue

This visualisation shows the links between the country, where the resources in the collections were created in, and the mother tongue of the users who had added them in their collections. You can explore the diagram by yourself.

This image here shows how, for example, resources created in Finland (the orange node in the network) have ended up in collections of users who speak Hungarian, Estonian, Lithuanian, etc. as their mother tongue.

Note that this graph does not make any assumption about the language these resources are in! If my guess is right, most of these resources were in English, not in Finnish..

But anyhow, I find that a demonstration that these resources can cross borders of some kind. In this case, a Finn created the resource. There may be just a very small hint in the design of the resource that a Finn made it, but still some of the underlying pedagogical assumptions or some hints of the Finnish curriculum might be embedded in these resources. Nevertheless, or thanks to that, the resources created in Finland seem like a hit (they are in 8 different language groups).

Ok, to be more truthful, I think it is because LeMill was created in Finland that so many of the Finnish resources are shared.

About networks of resources and users

This visualisation is to explore the networks of users that form between resources that are shared in collections. I think this is one of the most interesting visualisations of the dataset, and the one that inspires me the most.

Same as before, click to interact within the image, or if you click on the title on top of the image, you can get the network in a bigger window.


What's there? It's a network diagram where the nodes represent users (user id number) and the edges are the names of learning resources that these users have saved in their collections.


You can zoom into the diagram and explore it. Same as with the previous post, we can see that lots of the resources that users have put in their collections are not shared with other users. These are the singletons that are not part of the common network here.

Then, there are some star like structures that can be found. Like this one. Here the resource highlighted is something that both users (user 59 and 155) had added into their collection.

What I think, and would almost bet on, is that if these users were made aware that they share this resource in their collections, they would be interested in looking at what other resources are in the other person's collection. In this case user 59 could be interested in looking at what user 155 has put in her collection.

This is basically the idea of making underlying social networks visible in a repository to allow social navigation of like-minded users' collections. Or, if you wish, a recommender could take advantage of these underlying connections as well. For the recommender, though, the data is very sparse, as can be seen from the visualisation. For that reason, I think we should first explore social navigation possibilities, and then move on to recommenders when we get more data.

These resources that connect users, or in some cases (hopefully one day) even communities, are valuable stuff. I have previously referred to this as one way to identify learning resources that cross borders easily. In this case, the two communities could speak different languages or be from different countries.

Some suggested that these objects could be also boundary objects. I cannot get my hands on the original article now (frustration of working from home!), so I am referencing some others that reference it:
Star (1989) and Star and Griesemer (1989), on the other hand, are concerned with the distribution of artefacts across communities. Boundary objects are artefacts used by communities: they cross the boundaries between communities and retain their structure, but are interpreted differently by them. The notion of boundary objects was developed by Star (1989) and Star and Griesemer (1989) as a way to explain co-ordination work between communities.
In a larger sense, maybe some of them could be boundary objects. I will need to think about this more..

Anyway, here is another little visualisation that is actually an overview of the resources that users have saved in their collections. You can visualise it in many ways; use the ordering function on the top.




Star, S. L. 1989. The structure of ill-structured solutions: boundary objects and heterogeneous distributed problem solving. In Distributed Artificial intelligence (Vol. 2), M. Huhns, Ed. Morgan Kaufmann Publishers, San Francisco, CA, 37-54.

Learning resources as part of collections - what about the network?

I'm just exploring a new dataset that I got from LeMill; it contains information about the learning resources that users have put in their "collections". Collections is a tool for users to create their own sub-sets of resources and give them a common title. E.g. I find 5 resources on pyramids, add them to my collection, and call it "Pyramids for 5th graders", as I am going to use it during the History lesson that I teach to 5th graders.

I think that the collections tool is an excellent tool, also for me as a researcher ;) What I am interested in knowing is whether we could make the links between these collections visible. The links would, of course, be the resources that are shared between collections.

Let's just explore this early visualisation of LOs connecting the collections. Click on "click to interact", and you get the live image. Alternatively, you can click on the title in the image, and you'll have the whole visualisation in a bigger interface. So what's there?





What you first see is a top-level overview of users' collections as a network diagram. It first looks like a grid; the ones in the top left-hand corner are small ones, they only contain a few resources. The ones towards the bottom right corner look clunkier and visibly bigger; they include many more resources and actually overlap one another.

You can start zooming in with your mouse. You'll see that some names start appearing. Those are the names of the collections and the resources within them. With a right click of your mouse, you see a hand appear. This allows you to move within the visualisation. What you see here is a huge number of what are called “singletons” in network jargon. These singletons are collections, but they do not have any connections through shared resources to other collections.

Now, try to locate yourself in the area where that big cluster is, at the bottom right hand corner.

Now, instead of looking at separate little singletons, we are hovering over a “giant component”. This is clearly the largest group of nodes within this network and some of them seem interconnected. By interconnection I mean that the same resource is in more than one collection.

You can visualise this nicely if you click on some of the big nodes. The node will be highlighted in orange. This way you can see which resources are related to this collection (the collection name is the node). Interestingly, you'll see some of the resources act as connections between different collections.

What we can already quickly see is that so-called “middle regions” are entirely missing from this network. They represent rather isolated groups that interact amongst themselves. In our case they would be a few resources that are in a few collections by a few users. There do not seem to be any such "isolated stars" in this network of collections. The cool thing about isolated stars is that over some period of time they might merge with the giant component. This would happen through a resource that is shared by both the giant component and the smaller entity.
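For the record, finding the singletons and the giant component in a network like this is a plain connected-components computation. A rough Python sketch with breadth-first search, on a made-up edge list (collection names c1..c5 are illustrative, not from the LeMill data):

```python
from collections import deque

def connected_components(nodes, edges):
    """Group nodes into connected components using breadth-first search."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            if node in component:
                continue
            component.add(node)
            queue.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

# Made-up network: c1..c3 share resources, c4 and c5 are singletons.
nodes = ["c1", "c2", "c3", "c4", "c5"]
edges = [("c1", "c2"), ("c2", "c3")]
parts = connected_components(nodes, edges)
giant = max(parts, key=len)
singletons = [p for p in parts if len(p) == 1]
print(giant)            # {'c1', 'c2', 'c3'}
print(len(singletons))  # 2
```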

Ok, a visualisation is just a visualisation, a snapshot of a moment. More work is needed to properly analyse what is going on and, most importantly, whether this has anything to do with how we can make a repository of learning resources a better place.

Well, I of course am on my SNA trip and think that it can help anything and everything, but more about that later..

Reference


From Attention Metadata to Participatory Metadata

Capturing and taking advantage of users’ actions on the Web has come a long way since business models were first implemented around the idea of the clickstream in the ’90s. Instead of commercial sites taking advantage of the attention that users pay to different products, in recent years the tide has turned, with arguments that interactions with content (e.g. buying, listening, reading feeds) and users’ reactions to that content (e.g. ratings, reviews, tags) should be something that the user can control.

AttentionTrust.org, for example, calls this "attention data" and argues that it is a valuable resource that reflects a user’s interests, activities and values, and thus serves as a proxy for their attention.

AttentionXML [1] is an open specification to capture an individual’s clicks in order to track the user’s behaviour and information consumption on the Web. The Contextualized Attention Metadata (CAM) schema was built upon it with an extension that allows capturing observations about users’ activities in any kind of tool, not just a browser or newsreader (Najjar et al. 2006a, b).

Attention Profiling Markup Language (APML), on the other hand, offers a way for a user to create a personal attention profile, which is portable, sharable and captures the user’s attention on self-defined services. Moreover, the social aspect of the Web, where users not only interact with resources but actually participate in communities and create content, has created a need for users to capture these participatory aspects of their attention.

Thus the User Labor Markup Language (ULML) proposes an open data structure to outline the metrics of user participation in social web services. One of the ULML use cases, for example, is around creating metadata (e.g. tagging, voting, commenting, etc.) as a way to improve and maintain users’ existence in the social web. All these specifications serve the same goal: being openly transparent about one’s interests on the Web in order to make the best use of them for the user’s own benefit.
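To give a feel for what such a profile looks like, here is a minimal APML fragment sketched from memory of the 0.6 draft spec; treat element names, attributes and the concept values as approximate, and the "melt" source and tag as my own made-up examples:

```xml
<APML xmlns="http://www.apml.org/apml-0.6" version="0.6">
  <Head>
    <Title>Example attention profile</Title>
  </Head>
  <Body defaultprofile="tags">
    <Profile name="tags">
      <ImplicitData>
        <Concepts>
          <!-- One weighted interest, derived from tagging activity -->
          <Concept key="social tagging" value="0.8" from="melt"
                   updated="2008-06-01T00:00:00Z"/>
        </Concepts>
      </ImplicitData>
    </Profile>
  </Body>
</APML>
```

The point is simply that each interest is a key with a weight and a source, which is what makes the profile portable between services.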

I'm currently thinking with my study buddy Nikos Manouselis about how we could save such attention profiles from different repositories to have a more holistic picture of what users do on educational repositories or on federations of them. I think that alone would be a great advance for the research.

Second, it might be that the same user has profiles in different repositories (like I have one in MELT, in LeMill and in OERCommons), so this would allow the user to consolidate her interests and the resources found in different places, like the bookmarks or collections that I have created in these different repositories. It would be nice to have a personal tagcloud based on my attention in different repositories, to let me access resources in all these services this way.
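Consolidating such a cross-repository tag cloud is essentially just summing per-repository tag counts. A tiny Python sketch (the repository names and tag counts are made up for illustration):

```python
from collections import Counter

# Hypothetical per-repository tag counts for one user.
melt_tags = Counter({"science": 5, "multilingual": 3, "energy": 2})
lemill_tags = Counter({"science": 2, "pyramids": 4})
oercommons_tags = Counter({"energy": 1, "multilingual": 2})

# The consolidated tag cloud is the element-wise sum of the counters.
consolidated = melt_tags + lemill_tags + oercommons_tags
print(consolidated.most_common(3))
# [('science', 7), ('multilingual', 5), ('pyramids', 4)]
```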

Third, there are resources that many of the educational repositories share. In MELT, for example, most of our bookmarks are on resources from LeMill. It is of interest to LeMill to know that they have fans and users in MELT, so this is info that can be fed back from MELT to LeMill, and they can boost their stats with it! Not to mention getting back the participatory information from MELT, e.g. users' tags, ratings, etc.

The fourth advantage is that this type of profile information could show which resources from LeMill have been of use to the "extended community" (i.e. outside of LeMill's own user base). This info could help them boost their reputation in the network of repositories. If we knew that half of the repositories in the federation actually have users who interact with LeMill resources, that would give LeMill a great boost as an interesting repository to play with, a reputable provider of resources (someone pointed this out saying, hey, think of eBay's reputation points for sellers!). I had already toyed with the idea of a "travel well" value for each repository in the federation, based on the evidence of previous cross-border use of its resources (tracked down, of course, using something like a portable profile).
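Just to make that concrete, here is one entirely hypothetical way such a "travel well" score per repository could be computed from cross-repository usage evidence (the repository names and usage sets are invented):

```python
def travel_well_score(provider, usage):
    """Fraction of the other federation repositories whose users have
    interacted with at least one of `provider`'s resources.

    `usage` maps a repository name to the set of providers whose
    resources its users bookmarked, tagged or rated.
    """
    others = [repo for repo in usage if repo != provider]
    if not others:
        return 0.0
    fans = [repo for repo in others if provider in usage[repo]]
    return len(fans) / len(others)

# Hypothetical cross-repository usage evidence:
usage = {
    "LeMill": {"LeMill"},
    "MELT": {"LeMill", "OERCommons"},
    "OERCommons": {"OERCommons"},
    "Repo4": {"LeMill"},
}
print(travel_well_score("LeMill", usage))  # 2 of the 3 other repositories
```

A real score would of course weight by how much interaction there is, not just whether any exists.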

Of course, finally, such a thing could be used for recommendation purposes, allowing users to swiftly find resources of interest without even noticing that they have gone to a different repository. Like the previous idea of cross-repository tag clouds.



Friday, May 30, 2008

Visualising networks of learning resources

I'm looking at the first dataset of bookmarks from the MELT portal. Here you can see some of the first visualisations, created using Many Eyes. Click on the interact button in the pic and it loads. This is a treemap visualisation of the bookmarks that users have made so far.

What do you see here? You first see boxes in different colours. They are "boxed" by the user IDs. The bigger a box is, the more learning resources that person has bookmarked. If you hover your mouse over the boxes, you can see the IDs of the resources. These, of course, mean nothing to you now, but imagine if they were links to the resources?

Next you can explore the data a bit further. Drag the mother tongue box at the top of the graph to the first place. Now the boxes are grouped by the languages spoken by the users. You'll see that Hungarian speakers have been busy on the portal: they have the most bookmarks.

Third, you can explore further by dragging the obj_lang box to the first place. This shows the languages of the bookmarked resources. Interestingly, it turns out that most of these resources are in English. However, the diversity is there to be observed: users have found resources in many different languages useful.
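Re-ordering the treemap hierarchy is really just grouping the same bookmark records by a different key. A tiny Python sketch with made-up records (the field values and counts are my own invention, not the MELT data):

```python
from collections import Counter

# Hypothetical bookmark records: (user_id, mother_tongue, obj_lang)
bookmarks = [
    ("u1", "hu", "en"), ("u1", "hu", "en"), ("u2", "hu", "hu"),
    ("u3", "fi", "en"), ("u4", "nl", "fr"),
]

# Treemap with user IDs first: box size = bookmarks per user.
by_user = Counter(b[0] for b in bookmarks)
# Drag "mother tongue" to the first place: size = bookmarks per language group.
by_tongue = Counter(b[1] for b in bookmarks)
# Drag "obj_lang" to the first place: size = bookmarks per resource language.
by_obj_lang = Counter(b[2] for b in bookmarks)

print(by_tongue)    # here the "hu" speakers have the most bookmarks
print(by_obj_lang)  # and most bookmarked resources are in "en"
```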

Let's go further. The next one is a network diagram. If you click on "click to interact" you can also zoom into the visualisation.

What do you see here? It's a network that consists of users' mother tongues and the learning resources that those users bookmarked on the portal. You see 4 quite big vertices, which are the mother tongues of the users.
"A network consists of a set of objects called vertices connected by edges. The visualization of the network is optimized to keep strongly related items in close proximity to each other. In this way, the overall arrangement of vertices in the network is very telling of the structure of the connections between vertices (vertices that are far away are weakly related to each other). In this visualization, the size of a vertex is proportional to the number of edges emanating from it."
Take the Hungarian speakers, for example. They are the ones who use the portal most, and they have actually bookmarked a fair amount of resources. At the end of each edge you can see an ID number. Those are the IDs of the learning resources that these teachers have bookmarked. The same goes for Finnish speakers, Dutch speakers, etc.

Interestingly, we can see from this visualisation that not many resources are shared among users from different language groups. A few are, though: take, for example, the LeMill resource that is visualised in orange in the image here. It has edges linking it to Finnish, German and Hungarian speakers. I counted 14 resources in this small dataset that were shared by users from different countries; that's about 13% of the resources.
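That "shared across language groups" count can be read straight off the edge list of the network. A small sketch with made-up edges; the resource IDs are hypothetical:

```python
from collections import defaultdict

# Hypothetical edges of the bipartite network: (mother_tongue, resource_id)
edges = [
    ("hu", "r1"), ("hu", "r2"), ("fi", "r2"),
    ("de", "r2"), ("fi", "r3"), ("nl", "r4"),
]

# Which language communities bookmarked each resource?
groups = defaultdict(set)
for tongue, resource in edges:
    groups[resource].add(tongue)

# Candidate "travel well" resources: bookmarked by more than one group.
shared = sorted(r for r, tongues in groups.items() if len(tongues) > 1)
print(shared)                     # ['r2']
print(len(shared) / len(groups))  # 0.25, the share that crosses borders
```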

This type of resource is what we call a "travel well" resource, as it can cross borders. In this case those borders are linguistic. The resource also acts as a bridge between these different language communities. If you look at the resource in question, you'll find that it is for teaching English (as a foreign language) and it is in English. Thus, it is not that surprising that it is well accepted in many language communities.

Finally, I also visualised the languages of learning resources instead of the resource ID. You can find it here. As you see from the image on the right, I have highlighted the languages of resources from Dutch speaking users. They have been pretty busy finding resources in all kinds of languages!

Tuesday, May 20, 2008

Call: WORKSHOP ON SOCIAL INFORMATION RETRIEVAL FOR TECHNOLOGY ENHANCED LEARNING (SIRTEL'08)

Good news, we are ready to roll out the call for contributions for our 2nd workshop! This time we are planning more time for discussions and brainstorming-type exercises that participants can lead! This was the feedback from last year, so you see that we are taking it seriously :)

Check out the formats for contributions: Research papers and System Demos are the more conventional stuff that we welcome, whereas Hands-On proposals are there to let us all loose and think about how we could use ideas from exciting existing systems to enhance and support learning and teaching. Oh, and then there are of course the Pecha Kucha talks. That makes me really curious: someone said that they would not really work for computer science. I hope we are able to prove that wrong ;)


WORKSHOP ON SOCIAL INFORMATION RETRIEVAL FOR TECHNOLOGY ENHANCED LEARNING (link)

in the 3rd European Conference on Technology Enhanced Learning (EC-TEL08), Maastricht, The Netherlands

IMPORTANT DATES

  • Contribution Submission: June 29, 2008
  • Results Notification: August 3, 2008
  • Camera Ready Submission: August 31, 2008
  • Workshop date: September 17, 2008
  • Main conference dates: September 18-19, 2008

CALL FOR WORKSHOP CONTRIBUTIONS

After the successful first SIRTEL workshop last year, we are delighted to welcome
exciting new contributions for the 2nd Social Information Retrieval for Technology Enhanced Learning (SIRTEL) workshop:

  • Research papers
  • System Demos
  • Hands-On proposals
  • "Pecha Kucha" talks*


RATIONALE

Learning and teaching resources are available on the Web - both in terms of digital learning
content and people resources (e.g. other learners, experts, tutors). They can be used to
facilitate teaching and learning tasks. Developing, deploying and
evaluating social information retrieval (SIR) methods, techniques and systems that provide
learners and teachers with guidance in a potentially overwhelming variety of choices remains to be tackled.

The aim of the SIRTEL’08 workshop is to look beyond recent achievements and discuss
specific topics, emerging research issues, new trends and endeavours in SIR for Technology Enhanced Learning (TEL). The
workshop will bring together researchers and practitioners to present, and more importantly,
to discuss the current status of research in SIR and TEL and its implications for science
and teaching.


TOPICS OF INTEREST (but not limited to):


Technology Enhanced Learning (TEL) and Social Information Retrieval (SIR) techniques such as:

  • Recommender systems
  • Social collaborative searching, browsing and sharing of queries
  • Social network analysis
  • Game-theoretic approaches to select learning materials and learning partners in the long tail
  • Social bookmarking and tagging, folksonomies
  • Annotations, ratings and evaluations


Concepts for Social Information Retrieval (SIR)

  • Defining the scope, purpose and objects of social information retrieval in TEL
  • Defining user requirements for the deployment of SIR systems in a learning setting
  • Current and new trends in SIR methods for TEL
  • Approaches to TEL metadata that reflect social ties and collaborative experiences in the field of education
  • Analytical modelling of strategic intentions in TEL communities
  • Interoperability of SIR systems for TEL


Implementation of SIR in TEL

  • Methods and models of SIR in the area of learning and teaching
  • Social processes and metaphors in learning communities and social networks for searching, acquiring and sharing information
  • Pedagogical aspects of SIR in TEL; how to scaffold students, activity patterns, etc.
  • Integrating SIR services in existing learning platforms
  • Visualisation techniques to support SIR in TEL
  • Successful scaffolding techniques for SIR implementation

Evaluation of SIR in TEL

  • Ideas on how we can make evaluation more empirical
  • Best practices
  • Evaluation of the success and acceptance of SIR systems in the context of teaching, learning and/or TEL community building
  • Challenges and enablers
  • Evaluating the performance and measuring the effectiveness of SIR systems in learning applications
  • Evaluating user satisfaction with SIR systems in supporting learning and teaching, etc.


WORKSHOP SUBMISSIONS

This year we base our call for contributions on last year’s comments, where the participants wanted more time for discussions, for picking each other’s brains and for forecasting how SIR could be used in TEL. Apart from the more conventional contributions, we also have new formats for you to consider!

  • Research papers (4-8 pages)
    to present exciting new work that is not mature enough for a long conference/journal paper. We especially value papers with a focus on evaluating early results and making them available for further discussion among practitioners.

  • Work in progress and System demos (up to 4 pages)
    allow participants to share the basics of their SIR for TEL applications. Papers can be short (up to 4 pages), but other formats such as screencasts or YouTube-type recordings of the demo are also welcome. Also include information on how others can access your system and test it.

  • Hands-On proposals (1-pager)
    Got a good idea for a SIRTEL implementation? Toying with ideas for SIRTEL prototypes, either totally new ones or based on some existing application (e.g. Amazon, Flickr, Digg, ..)? Interested in “pimping-up” your current LMS or platform to support social networks?
    Create a little scenario and write it down so that others can follow your thinking. Put in a few screenshots to illustrate your point better. During the session, which you will lead, the participants will have their hands and brains on your idea. The outcome will help you with the requirements for implementation in a TEL setting. Early ideas welcome!

  • Abstract for Pecha Kucha (5 min talk)
    Want to share your discussion ideas on SIRTEL concepts with others? We are listening! To leverage the face-to-face nature of the workshop, we invite you to submit an abstract for a Pecha Kucha-type presentation-discussion moment which you will lead during the workshop. Your talk can be max. 4 minutes long; the participants will decide how much discussion will follow.


Papers are to be submitted to: https://togather.eu/handle/123456789/274
Accepted papers will be published online as EC-TEL workshop proceedings
as part of the CEUR Workshop proceedings series.

The two best papers of the workshop will be published in a special issue of
the International Journal of Technology-Enhanced Learning (IJTEL)
http://www.inderscience.com/browse/index.php?journalCODE=ijtel

More information at the submission site. All questions and submissions should be sent to: sirtel @ cs.kuleuven.be


PROGRAM COMMITTEE

  • Alexander Felfernig, University of Klagenfurt, Germany
  • Barry Smyth, University College Dublin, Ireland
  • Brandon Muramatsu, Utah State University, USA
  • Clemens Cap, University of Rostock, TBC
  • Frans van Assche, European Schoolnet, Belgium
  • Fridolin Wild, Vienna University of Economics and Business Administration, Austria
  • Hendrik Drachsler, Open University of the Netherlands, The Netherlands
  • Jon Dron, Athabasca University, Canada
  • Lisa Petrides, ISKME, USA
  • Marc Spaniol, Max-Planck-Institute for Informatics, Germany
  • Markus Strohmaier, Technical University of Graz, TBC
  • Martin Memmel, DFKI, Germany
  • Martin Wolpers, Fraunhofer, Germany
  • Miguel-Angel Sicilia, University of Alcala, Spain
  • Nikos Manouselis, Greek Research & Technology Network, Greece
  • Oliver Bohl, Accenture GmbH, Germany
  • Rick D. Hangartner, MyStrands, USA
  • Selmin Nurcan, University of Paris 1, France
  • Yiwei Cao, RWTH Aachen University, Germany

ORGANISERS

  • Riina Vuorikari, Katholieke Universiteit Leuven (K.U.Leuven) & European Schoolnet (EUN), Belgium
  • Barbara Kieslinger, Centre for Social Innovation (ZSI), Austria
  • Ralf Klamma, RWTH Aachen University, Germany
  • Prof. Erik Duval, Katholieke Universiteit Leuven (K.U.Leuven), Belgium & ARIADNE Foundation

Tuesday, May 13, 2008

Mine/d your data

I just participated in a week-long data mining course at the university. It was hard work, but actually a lot of fun. We ploughed through a lot of things, including association rules, clustering, logistic regression, decision trees and neural networks, and also learned, well, made acquaintance with, some of the data mining software like SAS Enterprise Miner, and used MATLAB to check out the neural networks. What a strange world.

In one exercise we used the German credit dataset and wanted to come up with a decision tree to sort the bad customers from the good ones. After lots of clicking, choosing values and setting roles, we came up with a tree that had an error rate of 47%. Wow. The banker might as well flip a coin to decide which customers to give credit to. OK, probably a bad example; we did learn afterwards about the cost of misclassification, so we were able to do better. But anyway, it just kind of made me laugh.
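The cost-of-misclassification point can be shown with a few lines of arithmetic. This is just an illustration with made-up costs and probabilities, not the SAS setup we used in class:

```python
def expected_costs(p_bad, cost_fp, cost_fn):
    """Expected cost of each decision for an applicant whose estimated
    probability of being a bad risk is p_bad.

    cost_fp: cost of refusing a good customer (lost business)
    cost_fn: cost of granting credit to a bad one (default)
    """
    grant = p_bad * cost_fn          # we only pay if the customer defaults
    refuse = (1 - p_bad) * cost_fp   # we only pay if the customer was good
    return grant, refuse

# With symmetric costs, an applicant with a 25% bad-risk score gets credit...
print(expected_costs(0.25, cost_fp=1, cost_fn=1))  # (0.25, 0.75) -> grant
# ...but if a default costs five times more, the same applicant is refused.
print(expected_costs(0.25, cost_fp=1, cost_fn=5))  # (1.25, 0.75) -> refuse
```

So the same tree can make very different decisions once the two kinds of error stop being equally expensive.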

I was reading this blog and came across this interesting information about the data mining methods that "miners" choose to use. Now that I know what all those words mean, this became an interesting piece of information for me :)

• Correspondingly, the most commonly used algorithms are regression (79 percent), decision trees (77 percent) and cluster analysis (72 percent). Again, this reflects what we have seen in our own work. Regression certainly remains the algorithm of choice for large sections of the academic community and within the financial services sector. More and more data miners, however, are using decision trees, and cluster analysis has long been the bedrock of the marketing community.
I personally thought that the most useful techniques for me would be mining association rules, cluster analysis and maybe decision trees. To be seen.
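To remind myself what association rules actually measure, here is a bare-bones support/confidence computation in Python; the tag "baskets" below are made up:

```python
def support_confidence(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent,
    where each transaction is a set of items (here: tags on a resource)."""
    n = len(transactions)
    both = sum(1 for t in transactions
               if antecedent <= t and consequent <= t)  # <= is subset test
    ante = sum(1 for t in transactions if antecedent <= t)
    support = both / n
    confidence = both / ante if ante else 0.0
    return support, confidence

# Hypothetical "baskets": tags co-occurring on bookmarked resources.
transactions = [
    {"maps", "geography"}, {"maps", "geography", "history"},
    {"maps", "history"}, {"geography"},
]
s, c = support_confidence(transactions, {"maps"}, {"geography"})
print(s, c)  # rule holds in half the baskets, and in 2 of 3 "maps" baskets
```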

What actually pretty much amazed me was that data mining is very much about predicting missing values, i.e. the same methods that many recommender systems/studies use to predict missing ratings. Another thing that was totally new to me was that data mining and machine learning are actually very related, well, quasi-overlapping, I guess.