Methods for web revisitation prediction: survey and experimentation

More than 45% of the pages that we visit on the Web are pages that we have visited before. Browsers support revisits with various tools, including bookmarks, history views and URL auto-completion. However, these tools only support revisits to a small number of frequently and recently visited pages. Several browser plugins and extensions have been proposed to better support the long tail of less frequently visited pages, using recommendation and prediction techniques. In this article, we present a systematic overview of revisitation prediction techniques, classifying them into two main types and several subtypes. We also explain how the individual prediction techniques can be combined into comprehensive revisitation workflows that achieve higher accuracy. We investigate the performance of the most important workflows and provide a statistical analysis of the factors that affect their predictive accuracy. Further, we provide an upper bound for the accuracy of revisitation prediction using an ‘oracle’ that discards non-revisited pages.

Authors: George Papadakis, Ricardo Kawase, Eelco Herder & Wolfgang Nejdl

Client- and Server-side Revisitation Prediction with SUPRA

Users of collaborative applications, as well as individual users in their private environment, return to previously visited Web pages for various reasons; apart from pages visited due to backtracking, they typically have a number of favorite or important pages that they monitor, or tasks that reoccur on an infrequent basis. In this paper, we introduce a library of methods that facilitate revisitation through the effective prediction of the next page request. It is based on a generic framework that inherently incorporates contextual information and handles both server- and client-side applications uniformly. Unlike existing approaches, the methods it encompasses are real-time, since they do not rely on training data or machine learning algorithms. We evaluate them over two large, real-world datasets, with the outcomes suggesting a significant improvement over methods typically used in this context. We have also made our implementation and data publicly available, thus encouraging other researchers to use it as a benchmark and to extend it with new techniques for supporting users’ navigational activity.

Venue: WIMS2012

Authors: George Papadakis, Ricardo Kawase, and Eelco Herder

PDF: papadakis-wims12

Beyond the Usual Suspects: Context-Aware Revisitation Support

Ricardo Kawase @HT2011

A considerable share of our activity on the Web involves revisits to pages or sites. Reasons for revisiting include active monitoring of content, verification of information, regular use of online services, and reoccurring tasks. Browser support for revisitation is mainly focused on frequently and recently visited pages. In this paper we present a dynamic browser toolbar that provides recommendations beyond these usual suspects, balancing diversity and relevance. The recommendation method used is a combination of ranking and propagation methods. Experimental outcomes show that this algorithm performs significantly better than the baseline method. Further experiments address the question of whether it is more appropriate to recommend specific pages or rather (portal pages of) Web sites. We conducted two user studies with a dynamic toolbar that relies on our recommendation algorithm. The outcomes confirm that users appreciate and use the contextual recommendations provided by the toolbar.

Venue: HT2011

Authors: Ricardo Kawase, George Papadakis, Eelco Herder and Wolfgang Nejdl

Award: Engelbart Best Paper Award (HT2011)

PDF: kawase-ht2011