Product detail

WTF-LOD Extractor

OTRUSINA, L. SMRŽ, P.

Product type

software

Abstract

This software creates the Web TextFull linkage to Linked Open Data (WTF-LOD) dataset, intended for large-scale evaluation of named entity recognition (NER) systems. The dataset is built from the largest publicly available textual corpora, including Wikipedia dumps, monthly CommonCrawl runs, and ClueWeb09/12. The software also de-duplicates the data and applies advanced cleaning procedures.

Keywords

named entity evaluation, linked open data, CommonCrawl, ClueWeb, Wikipedia

Create date

31. 12. 2015

Location

http://www.fit.vutbr.cz/research/prod/index.php?id=480

Possibilities of use

Use of the result by another entity always requires obtaining a licence

Licence fee

The licence provider does not require a licence fee for the result
