Your browsing history can accurately identify who you are according to a study conducted by Mozilla researchers
Mozilla researchers have confirmed that browsing histories can be used to compile unique browsing profiles and that these profiles can be used to track users. Many third parties are also pervasive enough across the web to collect browsing histories that are sufficient to serve as an identifier.
This is not the first time that researchers have demonstrated that browsing profiles are distinctive and stable enough to be used as identifiers. Sarah Bird, Ilana Segall and Martin Lopatka set out to replicate the results presented in a 2012 paper by Lukasz Olejnik, Claude Castelluccia and Artur Janc using finer-grained data, and they extended that work to detail the privacy risk posed by the aggregation of browsing histories.
The data was collected from approximately 52,000 Firefox users who chose to share data for research and product development purposes beyond what is covered by Mozilla's default data collection policies. The researchers collected data for 7 days, paused for 7 days, and then resumed for another 7 days. After analyzing the collected data, they identified 48,919 distinct browsing profiles, 99% of which were unique. The original 2012 paper observed a set of approximately 400,000 browsing history profiles, 94% of which were unique.
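To make the notion of a "unique" profile concrete, here is a minimal sketch of one simple way to measure it. It assumes a profile is just the set of domains a user visited and counts a user as unique when nobody else has the same set; the study's exact definition may differ. The function name `unique_fraction`, the `top_sites` parameter and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def unique_fraction(histories, top_sites=None):
    """Return the fraction of users whose profile no other user shares.

    histories: dict mapping a user id to an iterable of visited domains.
    top_sites: optional set of domains; if given, profiles are truncated
               to these domains before comparison (an illustrative stand-in
               for restricting the analysis to the most popular sites).
    """
    if not histories:
        return 0.0
    profiles = []
    for visited in histories.values():
        domains = set(visited)
        if top_sites is not None:
            domains &= top_sites
        profiles.append(frozenset(domains))
    # Count how many users share each distinct profile.
    counts = Counter(profiles)
    unique_users = sum(1 for p in profiles if counts[p] == 1)
    return unique_users / len(profiles)


# Hypothetical toy data, not taken from the study.
histories = {
    "user_a": ["news.example", "mail.example", "shop.example"],
    "user_b": ["news.example", "mail.example"],
    "user_c": ["news.example", "mail.example"],
    "user_d": ["video.example", "news.example"],
}

print(unique_fraction(histories))                              # 0.5
print(unique_fraction(histories, top_sites={"news.example"}))  # 0.0
```

In this toy data, two of the four users have a profile nobody else shares. Truncating every profile to a single popular site collapses them all into the same profile, which shows why the number of sites considered matters for uniqueness.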
"The high uniqueness holds even when the histories are truncated to only 100 top sites. We then find that for users who visited 50 or more distinct domains during the two-week data collection period, about 50% can be re-identified using the top 10,000 sites. The possibility of re-identification increased to more than 80% for users who visited 150 or more distinct domains," note Mozilla researchers.
They also confirmed that browsing history profiles are stable over time, a second prerequisite for these profiles to be repeatedly linked to specific users and used for online tracking.
"Our re-identification rates in a pool of 1,766 were below 10 percent for 100 sites despite more than 90 percent uniqueness across datasets, but increased to about 80 percent when we consider 10,000 sites," they added.
Finally, some companies such as Alphabet (Google) and Facebook are able to observe an even larger share of the web than they could at the time of the 2012 study. This may give them deep visibility into browsing activity, which they could use for effective online tracking even when users browse the Internet from different devices.
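Re-identification can be illustrated with a short sketch: take each user's profile from a later collection period and match it to the most similar profile from an earlier period. Jaccard similarity between sets of domains is used here as an assumption for illustration, not as a description of the researchers' actual pipeline. The names `jaccard` and `reidentification_rate` and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def reidentification_rate(first_period, second_period):
    """Fraction of users whose later profile is closest to their own earlier one.

    Both arguments map a user id to the set of domains visited in that period.
    """
    if not first_period or not second_period:
        return 0.0
    correct = 0
    for user, later_profile in second_period.items():
        # Pick the earlier profile that is most similar to the later one.
        best_match = max(
            first_period,
            key=lambda candidate: jaccard(first_period[candidate], later_profile),
        )
        if best_match == user:
            correct += 1
    return correct / len(second_period)


# Hypothetical toy data, not taken from the study.
first_period = {
    "u1": {"news.example", "mail.example", "code.example"},
    "u2": {"news.example", "shop.example", "video.example"},
    "u3": {"mail.example", "bank.example"},
}
second_period = {
    "u1": {"news.example", "code.example", "docs.example"},
    "u2": {"shop.example", "video.example"},
    "u3": {"news.example", "mail.example"},
}

rate = reidentification_rate(first_period, second_period)
print(f"{rate:.2f}")  # two of the three users are matched correctly
```

Here `u3` is mismatched because the later profile contains only popular domains that overlap more with `u1`'s earlier history. That is consistent with the quoted finding that users with few distinct domains are harder to re-identify.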
Other recent research has shown that anonymizing browsing patterns or profiles through generalization does not sufficiently protect user anonymity.
Regulation is needed
Lukasz Olejnik, a privacy researcher and one of the authors of the 2012 paper, said the results of the new research are a welcome confirmation that web browsing histories are personal data that can reveal information about users or be used to track them.
"In some ways, browsing history resembles biometric-type data because of its uniqueness and stability," he commented. He also pointed out that since this data is used to distinguish individuals from many others, it automatically falls under the General Data Protection Regulation (GDPR).
"Web browsing histories are private data and, in some contexts, they are personal data. This is what the state of the art in research indicates. Technology should follow. Data processing regulations and standards should also follow. So should enforcement," he concluded.