Product detail

Analyzer of JavaScript calls on web pages

BEDNÁŘ, M.; SCHAUER, M.

Product type

software

Abstract

The software consists of two modules:
- a tool for automatic web crawling and capturing JavaScript calls (hereinafter the "Crawler"),
- a tool for analysing the acquired data and mining information from it (hereinafter the "Analyzer").

The Crawler (https://github.com/martinbednar/web_crawler) automatically visits websites and uses a customized Web API Manager extension to capture the JavaScript calls each page makes. Individual calls are stored in a database. The tool can record and store hundreds of thousands of calls from a single website and collects several terabytes of data when run over the million most visited sites.

The Crawler can also intercept JavaScript calls while a security extension (e.g. uBlock Origin) is installed. This was used to obtain two datasets: one for browsing with a security extension and one without it.

The Analyzer (https://github.com/martinbednar/web_crawler_data_analysis) processes the collected data and displays aggregated results and significant values. With the two datasets, the Analyzer can compare JavaScript calls made with and without a security extension, which helps answer research questions about security and privacy on the Web.

Using the tool we found, for example, that on the 250 thousand most visited websites (according to the Tranco list), the security extension uBlock Origin blocked approximately 30% of all JavaScript calls, and the Range API (https://developer.mozilla.org/en-US/docs/Web/API/Range) was suppressed the most. The complete results are published on the FIT cloud (https://nextcloud.fit.vutbr.cz/s/LHxP4cYaTnoNHWQ).
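The comparison of the two datasets can be illustrated with a minimal Python sketch. It is not taken from the Analyzer repository: the function names, the per-API call counters and the numbers below are hypothetical and chosen only to show how a blocked share and the most suppressed API could be derived from call counts recorded with and without a security extension.

```python
from collections import Counter
from typing import Optional


def blocked_share(baseline: Counter, with_extension: Counter) -> float:
    """Fraction of baseline calls that no longer appear with the extension."""
    total = sum(baseline.values())
    if total == 0:
        return 0.0
    # Count only drops; an API called more often with the extension adds 0.
    blocked = sum(
        max(count - with_extension.get(api, 0), 0)
        for api, count in baseline.items()
    )
    return blocked / total


def most_suppressed(baseline: Counter, with_extension: Counter) -> Optional[str]:
    """API with the largest absolute drop in calls, or None for empty input."""
    drops = {
        api: count - with_extension.get(api, 0)
        for api, count in baseline.items()
    }
    return max(drops, key=drops.get) if drops else None


# Hypothetical call counts per API (made-up numbers, not the published results).
baseline = Counter({"Range": 1000, "Document": 600, "Storage": 400})
with_extension = Counter({"Range": 500, "Document": 550, "Storage": 350})

share = blocked_share(baseline, with_extension)
top_api = most_suppressed(baseline, with_extension)
print(f"blocked share: {share:.0%}, most suppressed API: {top_api}")
```

The sketch measures suppression as the absolute drop in call count per API; a relative drop per API would be an equally reasonable choice and could rank the APIs differently.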

Keywords

JavaScript, API, Web browser, Web crawl, Security, Privacy, Fingerprint

Create date

18. 11. 2021

Location

https://polcak.github.io/jsrestrictor-dev/blogarticles/crawling_results.html

Possibilities of use

Use of the result by another entity always requires acquiring a licence

Licence fee

The licensor does not require a licence fee for the result
