Modern web pages are highly dynamic, often deriving not just their behavior but also their structure from the execution of JavaScript code. Important page functionality commonly continues well past a page load event. As a result, web pages must be treated as applications running continuously in a browser rather than static entities which are simply downloaded and rendered. A failure to adopt this behavioral approach in web measurement risks overlooking important web page characteristics and oversimplifying pages, sites, and the ecosystem as a whole. Accordingly, studies have increasingly leveraged different forms of browser instrumentation which produce distinct abstractions of web page behavior. These abstractions are not interchangeable. The benefits of a particular abstraction in terms of the level of detail and the semantic value of the resultant dataset must be weighed against costs, including the overhead of instrumentation, computing resources, and analysis effort. When researchers select representations of web page behavior that are poorly suited to answer their research questions, they risk gathering inadequate data, overcomplicating their studies, or both.
This thesis outlines and explores a framework for reasoning about trade-offs between web page abstractions in empirical studies. Our framework consists of four distinct categories of behavioral page representations: Inputs and Outputs, Feature Usage, Runtime Behavior, and Execution Traces. In the context of this framework, we present a series of applied web measurement studies, which investigate topics including real-time technology adoption, covert in-browser crypto-mining (or "cryptojacking"), browser fingerprinting, JavaScript code obfuscation, and online scams. For each study, we examine the costs and benefits of our chosen abstractions in the context of our framework and consider how different methodologies might alter study results.
We generalize our findings, discussing the affordances of each category in our framework and offering insights into the types of research questions each category is best suited to address. We argue that a structured approach to weighing trade-offs between abstractions, such as the one presented here, leads to more efficient and effective studies, and clarifies areas of need for future work in the development of new behavioral web measurement techniques.