Dataset Description: Hetionet is an integrative network of biomedical knowledge assembled from 29 different databases of genes, compounds, diseases, and more. The network combines over 50 years of biomedical information into a single resource. TDC processes the dataset into a list of triplets (source, relation, target), where each row contains source_type, source_id, target_type, target_id, the relation type, and the direction of the relation.
Dataset Statistics: 47,031 nodes (11 types) and 2,250,197 relationships (24 types).
from tdc.resource import BioKG
data = BioKG(name = 'HetioNet')
data.get_data()
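The row layout described above lends itself to simple filtering and counting. The following is a minimal sketch using only the Python standard library and a handful of made-up rows. The field name `relation`, the node IDs, the relation labels, and the `direction` values are illustrative assumptions, not values taken from the dataset, and the helper functions are hypothetical rather than part of TDC.

```python
from collections import Counter, namedtuple

# One edge per row, mirroring the fields listed in the dataset description.
# The field name "relation" is an assumption made for this sketch.
Edge = namedtuple(
    "Edge",
    ["source_type", "source_id", "target_type", "target_id", "relation", "direction"],
)

# Hypothetical toy rows; IDs, relation labels, and directions are placeholders.
toy_edges = [
    Edge("Compound", "C1", "Disease", "D1", "treats", "both"),
    Edge("Compound", "C1", "Gene", "G1", "binds", "both"),
    Edge("Gene", "G1", "Disease", "D1", "associates", "both"),
    Edge("Compound", "C2", "Disease", "D1", "treats", "both"),
]


def edges_between(edges, source_type, target_type):
    """Return the edges whose endpoints have the given node types."""
    return [
        e for e in edges
        if e.source_type == source_type and e.target_type == target_type
    ]


def relation_counts(edges):
    """Count how many edges carry each relation type."""
    return Counter(e.relation for e in edges)


compound_disease = edges_between(toy_edges, "Compound", "Disease")
counts = relation_counts(toy_edges)
```

On the real data, the same two helpers could be applied once the rows returned by `data.get_data()` are converted into tuples of this shape; how that conversion looks depends on the return type, which this sketch does not assume.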
References: Himmelstein, Daniel S., and Sergio E. Baranzini. “Heterogeneous network edge prediction: a data integration approach to prioritize disease-associated genes.” PLoS Comput Biol 11.7 (2015): e1004259.
Dataset License: CC BY 4.0.