Those were the days: learning to rank social media posts for reminiscence

Social media posts are a great source for life summaries aggregating activities, events, interactions and thoughts of the last months or years. They can be used for personal reminiscence as well as for keeping track of developments in the lives of not-so-close friends. One of the core challenges of automatically creating such summaries is to decide which posts are memorable, i.e., should be considered for retention, and which ones can be forgotten. To address this challenge, we design and conduct user evaluation studies and construct a corpus that captures human expectations towards content retention. We analyze this corpus to identify a small set of seed features that are most likely to characterize memorable posts. Next, we compile a broader set of features that are leveraged to build general and personalized machine-learning models to rank posts for retention. By applying feature selection, we identify a compact yet effective subset of these features. The models trained with the presented feature sets outperform the baseline models, which exploit an intuitive set of temporal and social features.

Authors: Kaweh Djafari Naini, Ricardo Kawase, Nattiya Kanhabua, Claudia Niederée, Ismail Sengor Altingovde

Methods for web revisitation prediction: survey and experimentation

More than 45% of the pages that we visit on the Web are pages that we have visited before. Browsers support revisits with various tools, including bookmarks, history views and URL auto-completion. However, these tools only support revisits to a small number of frequently and recently visited pages. Several browser plugins and extensions have been proposed to better support the long tail of less frequently visited pages, using recommendation and prediction techniques. In this article, we present a systematic overview of revisitation prediction techniques, dividing them into two main types and several subtypes. We also explain how the individual prediction techniques can be combined into comprehensive revisitation workflows that achieve higher accuracy. We investigate the performance of the most important workflows and provide a statistical analysis of the factors that affect their predictive accuracy. Further, we provide an upper bound for the accuracy of revisitation prediction using an ‘oracle’ that discards non-revisited pages.

Authors: George Papadakis, Ricardo Kawase, Eelco Herder, Wolfgang Nejdl

Towards a Semantically Enriched Online Newspaper

The Internet plays a major role as a source of news. Many publishers offer online versions of their newspapers to paying customers. Online newspapers bear more similarity to traditional print papers than to regular news sites. In close collaboration with Mediengruppe Madsack, publisher of newspapers in several German federal states, we aim to provide a semantically enriched online newspaper. News articles are annotated with relevant entities – places, persons and organizations. These annotations form the basis for an entity-based ‘Theme Radar’, a dashboard for monitoring articles related to the users’ explicitly indicated and inferred interests.

Authors: Ricardo Kawase, Eelco Herder, Patrick Siehndel

PDF: kawase-iswc2014

Predicting User Locations and Trajectories

Location-based services usually recommend new locations based on the user’s current location or a given destination. However, human mobility consists to a large extent of routine behavior and visits to previously visited locations. In this paper, we show how daily and weekly routines can be modeled with basic prediction techniques. We compare the methods based on their performance, entropy and correlation measures. Further, we discuss how location prediction for everyday activities can be used for personalization techniques, such as timely or delayed recommendations.

Authors: Eelco Herder, Patrick Siehndel, Ricardo Kawase

PDF: herder-umap2014