DocRED: A Large-Scale Document-Level Relation Extraction Dataset (Reading Notes)

This paper introduces a dataset, specifically one for document-level relation extraction (RE).

Paper download link

Problem Statement

Existing datasets for document-level RE:

  • either only have a small number of manually-annotated relations and entities,
  • or exhibit noisy annotations from distant supervision,
  • or serve specific domains or approaches.

Contribution (DocRED)

  • constructed from Wikipedia and Wikidata
  • DocRED contains 132,375 entities and 56,354 relational facts annotated on 5,053 Wikipedia documents
  • at least 40.7% of the relational facts in DocRED can only be extracted from multiple sentences, so models must read across sentences rather than within a single one
  • also provide large-scale distantly supervised data to support weakly supervised RE research

  • experiments indicate that document-level RE is more difficult for existing methods than sentence-level RE
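The multi-sentence statistic above can be approximated from the annotations themselves. The sketch below is a rough illustration, not necessarily how the authors computed the 40.7% figure. It assumes each document is a dict in the style of the released DocRED JSON files, with `sents`, `vertexSet`, and `labels` fields, where each label has `h`, `t`, `r`, and an `evidence` list of sentence indices. The toy document, its entities, and the helper name `multi_sentence_fraction` are hypothetical.

```python
from typing import Dict, List


def multi_sentence_fraction(docs: List[Dict]) -> float:
    """Return the share of labeled facts whose evidence spans 2+ sentences."""
    total = 0
    multi = 0
    for doc in docs:
        for fact in doc.get("labels", []):
            total += 1
            # Count distinct evidence sentences for this (head, relation, tail) fact.
            if len(set(fact.get("evidence", []))) > 1:
                multi += 1
    return multi / total if total else 0.0


# Hypothetical toy document; field names are assumed to follow the DocRED layout.
toy_docs = [
    {
        "title": "Toy document",
        "sents": [
            ["Acme", "Corp", "is", "based", "in", "Springfield", "."],
            ["Springfield", "is", "a", "city", "in", "Freedonia", "."],
        ],
        "vertexSet": [
            [{"name": "Acme Corp", "sent_id": 0, "pos": [0, 2], "type": "ORG"}],
            [
                {"name": "Springfield", "sent_id": 0, "pos": [5, 6], "type": "LOC"},
                {"name": "Springfield", "sent_id": 1, "pos": [0, 1], "type": "LOC"},
            ],
            [{"name": "Freedonia", "sent_id": 1, "pos": [5, 6], "type": "LOC"}],
        ],
        "labels": [
            # Supported by sentence 0 alone.
            {"h": 0, "t": 1, "r": "P159", "evidence": [0]},
            # Needs both sentences: company -> city -> country.
            {"h": 0, "t": 2, "r": "P17", "evidence": [0, 1]},
        ],
    }
]

fraction = multi_sentence_fraction(toy_docs)
print(f"{fraction:.1%} of facts need more than one sentence")
```

Counting distinct evidence sentences is only a proxy: a fact with several evidence sentences listed may still be recoverable from one of them, so this count is a heuristic, not an exact measure of how many facts truly require multiple sentences.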

Data

[Figure: 20190701156194521958350.jpg]

