Imprint crawler |
Java |
A web crawler that automatically extracts legal notice information from websites while taking German legal aspects into account. |
https://github.com/rat-extensions/imprint-crawler
|
Readability score |
Python |
A Python tool that extracts the main text content of a web document and analyzes its readability. Input data: URL or search query. Output data: for a URL, the detected language, the readability score calculated with several formulae, and the average reading time. |
https://github.com/rat-extensions/readability-score
|
Forum scraper |
Python |
An extension to extract comments from German online news services. |
https://github.com/rat-extensions/forum-scraper
|
EI logger |
TypeScript, Java |
A browser extension for conducting interactive information retrieval studies. Study participants work on search tasks with search engines of their choice, and the extension saves both their search queries and their clicks on search results. |
https://github.com/rat-extensions/EI_Logger_BA
|
Identifying affiliate links in web pages |
Python |
A Python tool that extracts all affiliate links from a web document and scores the page according to the number and prominence of those links. |
https://github.com/rat-extensions/Identifying-affiliate-links-in-webpages
|
App reviews scraper |
Python |
An app review scraper that visits the designated URLs of a set of applications and exports the scraped reviews and relevant information. |
https://github.com/rat-extensions/app-reviews-scraper
|
Visualizations of IR measures |
Python |
This add-on helps researchers create initial visualizations based on standard IR evaluation measures. The theme is set in a config.toml file. |
https://github.com/rat-extensions/ir-evaluation
|
Scraping news articles |
Python |
This Python tool retrieves the homepages of given news portals and scrapes the HTML text of the articles found. Each text is saved in a separate file. For each portal, an overview file is created, which contains the metadata of the articles and the corresponding file paths. |
https://github.com/rat-extensions/NewsArticlesScraper
|