- Internet Traffic Analysis and Secure E-voting
- Spam and Phishing Detection
- Advanced Malware Detection Techniques
- Privacy, Security, and Data Protection
- Web Application Security Vulnerabilities
- Web Data Mining and Analysis
- Privacy-Preserving Technologies in Data
- User Authentication and Security Systems
- Security and Verification in Computing
- Caching and Content Delivery
- Network Security and Intrusion Detection
- Hate Speech and Cyberbullying Detection
- Cybercrime and Law Enforcement Studies
- Cryptography and Data Security
- Complex Network Analysis Techniques
- Access Control and Trust
- Vehicular Ad Hoc Networks (VANETs)
- Sexuality, Behavior, and Technology
- Cholinesterase and Neurodegenerative Diseases
- Personal Information Management and User Behavior
- Frailty in Older Adults
- Copyright and Intellectual Property
- Alzheimer's Disease Research and Treatments
- Dementia and Cognitive Impairment Research
- Information and Cyber Security
University of St Andrews
2024
Nia Association
2022
ID Genomics (United States)
2022
University of Illinois Chicago
2013-2017
User demand for blocking advertising and tracking online is large and growing. Existing tools, both deployed and described in research, have proven useful, but lack either the completeness or robustness needed for a general solution. Existing detection approaches generally focus on only one aspect of advertising or tracking (e.g. URL patterns, code structure), making existing approaches susceptible to evasion. In this work we present AdGraph, a novel graph-based machine learning approach for detecting advertising and tracking resources on the web. AdGraph differs from existing approaches by building...
Doxing is online abuse where a malicious party harms another by releasing identifying or sensitive information. Motivations for doxing include personal, competitive, and political reasons, and web users of all ages, genders and internet experience have been targeted. Existing research on doxing is primarily qualitative. This work improves our understanding of doxing by being the first to take a quantitative approach. We do so by designing and deploying a tool which can detect dox files and measure the frequency, content, targets, and effects...
Modern web browsers have accrued an incredibly broad set of features since being invented for hypermedia dissemination in 1990. Many of these features benefit users by enabling new types of web applications. However, some features also bring risk to users' privacy and security, whether through implementation error, unexpected composition, or unintended use. Currently there is no general methodology for weighing these costs and benefits. Restricting access to only the features which are necessary for delivering desired functionality on a given...
Modern web browsers are incredibly complex, with millions of lines of code and over one thousand JavaScript functions and properties available to website authors. This work investigates how these browser features are used on the modern, open web. We find that JavaScript features differ wildly in popularity, with 50% of provided features never used on the web's 10,000 most popular sites according to Alexa.
Accurate web measurement is critical for understanding and improving security and privacy online. Such measurements implicitly assume that automated crawls generalize to the typical web user experience. But anecdotal evidence suggests the web behaves differently when seen via well-known measurement endpoints or automation frameworks, for various reasons. Our work improves the state of web measurement by investigating how key measurements differ when using naive crawling tool defaults vs. careful attempts to match "real" users across the Tranco top 25k domains. We...
Filter lists play a large and growing role in protecting and assisting web users. The vast majority of popular filter lists are crowd-sourced, where a large number of people manually label resources related to undesirable web resources (e.g. ads, trackers, paywall libraries), so that they can be blocked by browsers and extensions.
Content blocking is an important part of a performant, user-serving, privacy respecting web. Current content blockers work by building trust labels over URLs. While useful, this approach has many well understood shortcomings. Attackers may avoid detection by changing URLs or domains, bundling unwanted code with benign code, or inlining code in pages. The common flaw in existing approaches is that they evaluate code based on its delivery mechanism, not its behavior. In this work we address this problem by building a system for generating...
Billions of people use cloud-based storage for personal files. While many are likely aware of the extent to which they store information in the cloud, it is unclear whether users are fully aware of what they are storing online. We recruited 30 research subjects from Craigslist to investigate how users interact with and understand the privacy issues of cloud storage. We studied this phenomenon through surveys, an interview, and custom software which lets users see and delete their photos stored in the cloud. We found that a majority of users had private photos stored in the cloud that they did not intend to upload, and a large...
Cookie stuffing is an activity which allows unscrupulous actors online to defraud affiliate marketing programs by causing themselves to receive credit for purchases made by web users, even if the affiliate marketer did not actively perform any marketing for the affiliate program. Using 2 months of HTTP request logs from a large public university, we present an empirical study of fraud in affiliate marketing programs. First, we develop an efficient, decision-tree based technique for detecting cookie-stuffing in HTTP request logs. Our technique replicates domain-informed human labeling of the same data...
Ad and tracking blocking extensions are popular tools for improving web performance, privacy and aesthetics. Content blocking extensions generally rely on filter lists to decide whether a web request is associated with tracking or advertising, and so should be blocked. Millions of web users rely on filter lists to protect their privacy and improve their browsing experience.
Threshold aggregation reporting systems promise a practical, privacy-preserving solution for developers to learn how their applications are used "in-the-wild". Unfortunately, proposed systems to date prove impractical for wide scale adoption, suffering from a combination of requiring: i) prohibitive trust assumptions; ii) high computation costs; or iii) massive user bases. As a result, adoption of truly-private approaches has been limited to only a small number of enormous (and enormously...
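To make the threshold property concrete, the following is a minimal plaintext sketch in Python: a value is released only if at least k distinct clients reported it. It is not the system the abstract describes. It assumes a fully trusted aggregator that sees every report and client identifier in the clear, which is exactly the kind of trust assumption private designs try to avoid. The function name `threshold_aggregate`, the report format, and the sample values are hypothetical.

```python
from collections import Counter


def threshold_aggregate(reports, k):
    """Return counts only for values reported by at least k distinct clients.

    `reports` is an iterable of (client_id, value) pairs (hypothetical format).
    """
    # Deduplicate per client so a single client cannot push a value
    # past the threshold by reporting it repeatedly.
    unique_reports = {(client_id, value) for client_id, value in reports}
    counts = Counter(value for _, value in unique_reports)
    # Values seen by fewer than k clients are suppressed entirely.
    return {value: n for value, n in counts.items() if n >= k}


if __name__ == "__main__":
    sample = [
        ("c1", "feature_a"),
        ("c2", "feature_a"),
        ("c3", "feature_a"),
        ("c1", "feature_a"),     # repeat from the same client, counted once
        ("c4", "rare_setting"),  # reported by one client only, suppressed
    ]
    print(threshold_aggregate(sample, k=3))
```

Suppressing rare values is what limits the exposure of unusual, potentially identifying reports. The difficulty the abstract points to is achieving the same effect without an aggregator that can read every individual report.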
Funding the production of quality online content is a pressing problem for content producers. The most common funding method, online advertising, is rife with well-known performance and privacy harms, and an intractable subject-agent conflict: many users do not want to see advertisements, depriving the site of needed funding.
Despite active privacy research on sophisticated web tracking techniques (e.g., fingerprinting, cache collusion, bounce tracking, CNAME cloaking), most tracking on the web is basic "stateful" tracking enabled by classical browser storage policies sharing per-site storage across all HTTP contexts. Alternative, privacy-preserving storage policies, especially for third-party contexts, have been proposed and even deployed, but these policies can break websites that presume traditional, non-partitioned storage. Such breakage discourages...
This work presents a systematic study of UID smuggling, an emerging tracking technique that is designed to evade browsers' privacy protections. Browsers are increasingly attempting to prevent cross-site tracking by partitioning the storage where trackers store user identifiers (UIDs). UID smuggling allows trackers to synchronize UIDs across sites by inserting UIDs into users' navigation requests. Trackers can thus regain the ability to aggregate users' activities and behaviors across sites, in defiance of browser protections.
Cloud based storage accounts like web email are compromised on a daily basis. At the same time, billions of Internet users store private information in these accounts. As the web matures and users accrue more information, these accounts become a single point of failure for both users' online identities and large amounts of their private information. This paper presents two contributions: first, the heterogeneous documents abstraction, which is a data-centric strategy for protecting high value information stored in globally accessible storage. Secondly, we present drano,...
Most popular web browsers include "reader modes" that improve the user experience by removing un-useful page elements. Reader modes reformat the page to hide elements that are not related to the page's main content. Such page elements include site navigation, advertising related videos and images, and most JavaScript. The intended end result is that users can enjoy the content they are interested in, without distraction.
We present the first extensive measurement of the privacy properties of the advertising systems used by privacy-focused search engines. We propose an automated methodology to study the impact of clicking on ads on three popular private search engines which have advertising-based business models: StartPage, Qwant, and DuckDuckGo, and we compare them to two dominant data-harvesting ones: Google and Bing. We investigate the possibility of third parties tracking users when clicking on ads by analyzing first-party storage, redirection domain paths, and requests sent...
The public suffix list is a community-maintained list of rules that can be applied to domain names to determine how they should be grouped into logical organizations or companies. We present the first large-scale measurement study of how the public suffix list is used by open-source software on the Web and the privacy harm resulting from projects using outdated versions of the list. We measure how often developers include out-of-date versions of the public suffix list in their projects, how old the included lists are, and estimate the real-world harm with a model based on a large-scale crawl of the Web. We find that incorrect use of the list is common...
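As a rough illustration of how such rules group domain names, the Python sketch below applies a simplified version of the public suffix matching algorithm (longest matching rule wins, `*` matches one label, `!` marks an exception) to a tiny hand-written rule set. The rule sets and function names are assumptions made for this example, not the real list or any project's code. The simplification ignores internationalized names and the split between ICANN and private rules.

```python
def public_suffix(domain, rules):
    """Return the public suffix of `domain` under a simplified rule matcher."""
    labels = domain.lower().strip(".").split(".")
    best = 1  # implicit default rule "*": the last label alone is a suffix
    for rule in rules:
        is_exception = rule.startswith("!")
        rule_labels = rule.lstrip("!").split(".")
        n = len(rule_labels)
        if n > len(labels):
            continue
        tail = labels[-n:]
        if all(r == "*" or r == t for r, t in zip(rule_labels, tail)):
            if is_exception and n > 1:
                # An exception rule wins outright; the suffix is the rule
                # with its leftmost label removed.
                return ".".join(labels[-(n - 1):])
            best = max(best, n)
    return ".".join(labels[-best:])


def registrable_domain(domain, rules):
    """Return the public suffix plus one label, or None if there is none."""
    labels = domain.lower().strip(".").split(".")
    k = len(public_suffix(domain, rules).split("."))
    if len(labels) <= k:
        return None  # the name is itself a public suffix
    return ".".join(labels[-(k + 1):])


# Hypothetical miniature rule sets, for illustration only.
CURRENT_RULES = ["com", "uk", "co.uk", "*.ck", "!www.ck"]
OUTDATED_RULES = ["com", "uk"]  # missing the "co.uk" rule

if __name__ == "__main__":
    for name in ["a.shop.example.co.uk", "www.example.com", "www.ck"]:
        print(name, registrable_domain(name, CURRENT_RULES))
    # With the outdated rules, every site under co.uk collapses into one group.
    print(registrable_domain("a.shop.example.co.uk", OUTDATED_RULES))
```

The final call hints at the kind of harm the abstract measures: when a rule is missing, software treats unrelated sites under the same suffix as one organization, so boundaries meant to separate them no longer apply.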
In this paper, we design and deploy a synchronized multi-vantage point web measurement study to explore the comparability of web measurements across vantage points (VPs). We describe in reproducible detail the system with which we performed synchronized crawls on the Alexa top 5K domains from four distinct network VPs: research university, cloud datacenter, residential network, and Tor gateway proxy. Apart from the expected poor results from Tor, we observed no shocking disparities across VPs, but did find a significant impact of a VP's reliability...