What can be Found on the Web and How: A Characterization of Web Browsing Patterns


In this paper, we suggest a novel approach to studying user browsing behavior, i.e., the ways users get to different pages on the Web. Namely, we classify all user browsing paths leading to web pages into several types, or browsing patterns. To define browsing patterns, we consider three important points of the browsing path: its origin, the last page before the user gets to the domain of the target page, and the target page referrer. Each point can be of several types, which leads to 56 possible patterns. The distribution of the browsing paths over these patterns forms the navigational profile of a web page.
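As an illustration of the definition above, a navigational profile can be viewed as a normalized histogram over pattern labels. The following is a minimal sketch under stated assumptions, not the implementation used in this study: the `BrowsingPath` type, the `navigational_profile` function, and the type labels (`"search"`, `"direct"`, and so on) are hypothetical placeholders and do not reproduce the actual taxonomy behind the 56 patterns.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class BrowsingPath(NamedTuple):
    """One browsing path, reduced to the three points that define its pattern.

    All type labels are hypothetical placeholders for illustration only.
    """
    origin: str      # type of the page where the path starts
    pre_domain: str  # type of the last page before entering the target domain
    referrer: str    # type of the target page referrer


def navigational_profile(paths: Iterable[BrowsingPath]) -> Dict[BrowsingPath, float]:
    """Return the share of browsing paths that fall into each observed pattern."""
    counts = Counter(paths)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pattern: n / total for pattern, n in counts.items()}


# Toy data: four paths leading to the same (hypothetical) target page.
paths = [
    BrowsingPath("search", "search", "internal"),
    BrowsingPath("search", "search", "internal"),
    BrowsingPath("direct", "direct", "direct"),
    BrowsingPath("social", "external", "internal"),
]
profile = navigational_profile(paths)
```

Patterns that never occur for a page are simply absent from the returned dictionary, i.e., they have an implicit share of zero.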

We conducted a comprehensive large-scale study of navigational profiles of different web pages. First, we demonstrated that the navigational profile of a web page carries crucial information about the properties of this page (e.g., its popularity and age). Second, we found that the Web consists of several typical non-overlapping clusters formed by pages with similar ranges of incoming traffic. These clusters can be characterized by the functionality of their pages.