To the Point: A Shortcut to Essential Learning

The volume of information on the Web is constantly growing. Consequently, finding specific pieces of information becomes a harder task. Wikipedia, the largest online reference Website, is beginning to witness this phenomenon. Learners often turn to Wikipedia in order to learn facts regarding different subjects. However, as time passes, Wikipedia articles get larger and specific information becomes more difficult to locate. In this work, we propose an automatic annotation method that is able to precisely assign categories to any textual resource. Our approach relies on semantically enhanced annotations and Wikipedia’s categorization schema. The results of a user study show that our proposed method provides solid results for classifying text and provides useful support for locating information. As an implication, our research will help future learners to easily identify learning topics of interest in large textual resources.

Authors: Ricardo Kawase, Patrick Siehndel and Bernardo Pereira Nunes

PDF: kawase-icalt2014

Interlinking Documents based on Semantic Graphs

Connectivity and relatedness of Web resources are two concepts that define to what extent different parts are connected or related to one another. Measuring connectivity and relatedness between Web resources is a growing field of research, often the starting point of recommender systems. Although relatedness is liable to subjective interpretations, connectivity is not. Given the Semantic Web’s ability to link Web resources, connectivity can be measured by exploiting the links between entities. Further, these connections can be exploited to uncover relationships between Web resources. In this paper, we apply and expand a relationship assessment methodology from social network theory to measure the connectivity between documents. The connectivity measures are used to identify connected and related Web resources. Our approach is able to expose relations that traditional text-based approaches fail to identify. We validate and assess our proposed approaches through an evaluation on a real-world dataset, where results show that the proposed techniques outperform state-of-the-art approaches.

Authors: Bernardo Pereira Nunes, Ricardo Kawase, Besnik Fetahu, Stefan Dietze, Marco A. Casanova and Diana Maynard

PDF: nunes-kes2013

Boosting Retrieval of Digital Spoken Content

Every day, the Internet expands as millions of new multimedia objects are uploaded in the form of audio, video and images. While traditional text-based content is indexed by search engines, this indexing cannot be applied to audio and video objects, resulting in a plethora of multimedia content that is inaccessible to a majority of online users. To address this issue, we introduce a technique of automatic, semantically enhanced description generation for multimedia content. The objective is to facilitate indexing and retrieval of the objects with the help of traditional search engines. Essentially, the technique automatically generates static Web pages that describe the content of the digital audio and video objects. These descriptions are then organized in such a way as to facilitate locating corresponding audio and video segments. The technique employs a combination of Web services and concurrently provides description translation and semantic enhancement. Thorough analysis of the click-data, comparing accesses to the digital content before and after automatic description generation, suggests a significant increase in the number of retrieved items. This outcome, however, is not limited to visibility: by supporting multilingual access, the technique also lowers language barriers.

Venue: KES (Selected Papers) 2012

Authors: Bernardo Pereira Nunes, Alexander Mera, Marco A. Casanova and Ricardo Kawase

PDF: nunes-kes(selected)2012

Automatically generating multilingual, semantically enhanced, descriptions of digital audio and video objects on the Web

Every day, millions of new images, videos and audio files are uploaded to the Web. However, unlike text-based content, audio and video objects cannot be indexed by search engines. Thus, much valuable multimedia content stays unreachable for a great majority of online users. To overcome this problem, we introduce a technique that automatically generates semantically enhanced descriptions of audio and video objects. The goal is to facilitate indexing and retrieval of the objects with the help of traditional search engines. Basically, the technique automatically generates static Web pages that describe the content of the digital audio and video objects, organized in such a way as to facilitate locating segments of the audio or video that correspond to the descriptions. The technique is a mashup of Web services that also provides translation of the descriptions and semantic enhancement. We thoroughly analyzed the click-data, comparing accesses to the digital content before and after the automatic generation of the descriptions. The outcomes suggest that the technique significantly improves the retrieval of items, not only in terms of visibility but also by bringing down language barriers through support for multilingual access.

Venue: KES2012

Authors: Bernardo Pereira Nunes, Alexander Mera, Marco A. Casanova and Ricardo Kawase

PDF: nunes-kes2012