Exploiting the Wisdom of the Crowds for Characterizing and Connecting Heterogeneous Resources

Ricardo Kawase presenting @HT2014 in Santiago, Chile (picture taken by Christoph Trattner)

Heterogeneous content is an inherent problem for cross-system search, recommendation and personalization. In this paper we investigate differences in topic coverage and the impact of topics in different kinds of Web services. We use entity extraction and categorization to create fingerprints that allow for meaningful comparison. As a base taxonomy, we use the 23 main categories of the Wikipedia Category Graph, which has been assembled over the years by the wisdom of the crowds. Following a proof of concept of our approach, we analyze differences in topic coverage and topic impact. The results show many differences between Web services such as Twitter, Flickr and Delicious, which reflect users’ behavior and the usage of each system. The paper concludes with a user study that demonstrates the benefits of fingerprints over traditional textual methods for recommendations of heterogeneous resources.
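To make the idea of a fingerprint concrete, the following is a minimal sketch, not the method from the paper. It assumes that entity extraction has already mapped each entity in a resource to one or more top-level categories, that a fingerprint is a normalized vector of category weights, and that two fingerprints are compared with cosine similarity. The function names, the equal-share weighting and the category labels in the example are all hypothetical.

```python
from collections import Counter
from math import sqrt


def fingerprint(entity_categories):
    """Build a normalized category-weight vector for one resource.

    entity_categories: one list of category labels per extracted entity.
    Assumption: each entity spreads a weight of 1.0 evenly over its categories.
    """
    weights = Counter()
    for categories in entity_categories:
        if not categories:
            continue  # entity could not be categorized; skip it
        share = 1.0 / len(categories)
        for category in categories:
            weights[category] += share
    total = sum(weights.values())
    if total == 0:
        return {}
    return {category: w / total for category, w in weights.items()}


def cosine(a, b):
    """Cosine similarity between two sparse fingerprints (dicts)."""
    dot = sum(w * b.get(category, 0.0) for category, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical resources from two different services.
tweet = fingerprint([["Technology"], ["Technology", "Science"]])
photo = fingerprint([["Nature"], ["Geography"]])
print(cosine(tweet, photo))
```

Because both resources are projected onto the same small set of categories, a tweet and a photo description can be compared even when they share no words, which is the property the abstract contrasts with purely textual methods.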

Authors: Ricardo Kawase, Patrick Siehndel, Bernardo Pereira Nunes, Eelco Herder and Wolfgang Nejdl

PDF: kawase-ht2014

Online Prototype: http://twikime.l3s.uni-hannover.de

To the Point: A Shortcut to Essential Learning

The volume of information on the Web is constantly growing. Consequently, finding specific pieces of information becomes harder. Wikipedia, the largest online reference Website, is beginning to witness this phenomenon. Learners often turn to Wikipedia to learn facts about different subjects. However, as time passes, Wikipedia articles get larger and specific information becomes more difficult to locate. In this work, we propose an automatic annotation method that is able to precisely assign categories to any textual resource. Our approach relies on semantically enhanced annotations and Wikipedia’s categorization schema. The results of a user study show that our proposed method provides solid results for classifying text and useful support for locating information. As an implication, our research will help future learners to easily identify desired learning topics in large textual resources.

Authors: Ricardo Kawase, Patrick Siehndel and Bernardo Pereira Nunes

PDF: kawase-icalt2014

Finding relevant missing references in learning courses.

Reference sites play an increasingly important role in learning processes. Teachers use these sites in order to identify topics that should be covered by a course or a lecture. Learners visit online encyclopedias and dictionaries to find alternative explanations of concepts, to learn more about a topic, or to better understand the context of a concept. Ideally, a course or lecture should cover all key concepts of the topic that it encompasses, but often time constraints prevent complete coverage. In this paper, we propose an approach to identify missing references and key concepts in a corpus of educational lectures. For this purpose, we link concepts in educational material to the organizational and linking structure of Wikipedia. Identifying missing resources enables learners to improve their understanding of a topic, and allows teachers to investigate whether their learning material covers all necessary concepts.

Authors: Patrick Siehndel, Ricardo Kawase, Asmelash Teka Hadgu and Eelco Herder

PDF: siehndel-www13-lile13

TwikiMe! User profiles that make sense.

The use of social media has been increasing rapidly in recent years. Social media, such as Twitter, has become an important source of information for a variety of people. The public availability of data describing some of these social networks has led to a great deal of research in this area. Link prediction, user classification and community detection are some of the main research areas related to social networks. In this paper, we present a user modeling framework that uses Wikipedia to model user interests inside a social network. Our model of user interests reflects the areas a user is interested in, as well as the level of expertise a user has in a certain field.

Authors: Patrick Siehndel, Ricardo Kawase

PDF: siehndel-iswc2012

Online Prototype: http://twikime.l3s.uni-hannover.de/twikime.php

Hyperlink of Men

Ricardo Kawase presenting @LAWEB2012

Hand-made hyperlinks are increasingly outnumbered by automatically generated links, which are usually based on text similarity or some sort of recommendation algorithm. In this paper we explore the current use and appreciation of automatically generated links. To what extent do they prevail on the Web, in what forms do they appear, and do users think those generated links are just as good as human-created links? To answer these questions, we first propose a model for extracting contextual information of a hyperlink. Second, we develop a hyperlink ranker to assign relevance to each existing human-generated link. With the outcomes of the hyperlink ranker, together with two other recommendation strategies, we performed a user study with over 100 participants. Results indicate that automated links are “good enough”, and even preferred in some user contexts. Still, they do not provide the deeper knowledge expressed by human authors.

Venue: LAWEB2012

Authors: Ricardo Kawase, Patrick Siehndel, Eelco Herder and Wolfgang Nejdl

PDF: kawase-laweb2012