Table 11 Comparison with existing disease-disease association extraction datasets

From: A study on large-scale disease causality discovery from biomedical literature

| Dataset | Feature | Disease entity | Relation pair | Performance |
| --- | --- | --- | --- | --- |
| Method in this study | Constructs a semantic predicate list for disease causality to support automatic identification of disease causalities | 14,335 standardized disease entities | 6,084 types of bidirectional relations (66,393 SPOs) and 92,557 types of unidirectional relations (17,608 SPOs) | Accuracy of 96.97% in disease causality extraction |
| dRiskKB | Text corpus of 21,354,075 MEDLINE records; disease risk-specific syntactic patterns are used to automatically extract disease risk pairs | 12,981 diseases | 34,448 unique disease relation pairs | Identified patterns: average precision 0.99; exactly matched pairs: 0.919; partially matched pairs: 0.988 |
| A publicly available DDAE dataset extracted from literature [10] | 521 PubMed abstracts containing positive, negative, and null DDAs; disease mentions are annotated using dependency tree-based relation rules and DNorm | 12,346 diseases | 3,322 disease-disease pairs | Annotated DDAE dataset with a final kappa value of 76% |