Product detail
BEDNÁŘ, M.; SCHAUER, M.
Product type
software
Abstract
The software consists of two modules:
- a tool for automatic web crawling and capturing JavaScript calls (hereinafter the "Crawler"),
- a tool for analyzing the acquired data and mining information from it (hereinafter the "Analyzer").

The Crawler (https://github.com/martinbednar/web_crawler) automatically visits websites and uses a customized version of the Web API Manager extension to capture the JavaScript calls each page makes. Each call is stored in a database. The tool can record and store hundreds of thousands of calls from a single website, and a crawl of the one million most-visited sites yields a few terabytes of data. The Crawler can also intercept JavaScript calls while a security extension (e.g. uBlock Origin) is installed. This was used to obtain two datasets: one for browsing with a security extension and one for browsing without it.

The Analyzer (https://github.com/martinbednar/web_crawler_data_analysis) processes the collected data and displays aggregated results and significant values. Given the two datasets, the Analyzer can compare JavaScript calls made with and without a security extension, which helps answer research questions about security and privacy on the Web. For example, using the tool we found that on the 250,000 most-visited websites (according to the Tranco list), approximately 30% of all JavaScript calls were blocked when the uBlock Origin security extension was installed, and calls to the Range API (https://developer.mozilla.org/en-US/docs/Web/API/Range) were suppressed the most. The complete results were published on the FIT cloud (https://nextcloud.fit.vutbr.cz/s/LHxP4cYaTnoNHWQ).
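The comparison of the two datasets can be illustrated with a minimal sketch. It is not the Analyzer's actual code: the record shape (an `(api, method)` tuple per intercepted call), the function names, and the sample data are all assumptions made for illustration, and the real tool reads its calls from a database.

```python
from collections import Counter

def count_calls_by_api(calls):
    """Count intercepted calls per API interface name.

    `calls` is a hypothetical list of (api, method) tuples.
    """
    return Counter(api for api, _method in calls)

def compare_datasets(without_ext, with_ext):
    """Compare calls recorded without and with a security extension.

    Returns the overall fraction of calls that disappeared and the
    per-API drop in call count (negative if an API was called more often).
    """
    base = count_calls_by_api(without_ext)
    guarded = count_calls_by_api(with_ext)
    total_base = sum(base.values())
    total_guarded = sum(guarded.values())
    overall = 1 - total_guarded / total_base if total_base else 0.0
    dropped = {api: n - guarded.get(api, 0) for api, n in base.items()}
    return overall, dropped

# Made-up sample data, not taken from the published results.
without_ext = (
    [("Range", "setStart")] * 4
    + [("HTMLCanvasElement", "toDataURL")] * 4
    + [("Storage", "getItem")] * 2
)
with_ext = (
    [("Range", "setStart")] * 1
    + [("HTMLCanvasElement", "toDataURL")] * 4
    + [("Storage", "getItem")] * 2
)

overall, dropped = compare_datasets(without_ext, with_ext)
most_suppressed = max(dropped, key=dropped.get)
print(f"blocked share: {overall:.0%}, most suppressed API: {most_suppressed}")
```

Counting per API rather than per method keeps the sketch short; grouping by the full `(api, method)` pair would work the same way with a different `Counter` key.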
Keywords
JavaScript, API, Web browser, Web crawl, Security, Privacy, Fingerprint
Date of creation
18 November 2021
Location
https://polcak.github.io/jsrestrictor-dev/blogarticles/crawling_results.html
Usage options
Use of the result by another entity always requires obtaining a license
License fee
The licensor does not require a license fee for the result
Documents
crawlink_results_JScalls_with_uBlock.png
crawlink_results_JScalls_without_uBlock_opensource.png
crawling_results.md
crawlink_results_JScalls_without_uBlock.png
crawlink_results_APIs_without_uBlock_opensource.png
crawlink_results_APIs_with_uBlock.png
crawlink_results_APIs_with_uBlock_opensource.png
crawlink_results_APIs_without_uBlock.png
crawlink_results_JScalls_with_uBlock_opensource.png