MicroRNA-Target Interaction Prediction Task Overview

Definition: MicroRNAs (miRNAs) are small noncoding RNAs that play an important role in regulating biological processes such as cell proliferation and cell differentiation. They usually act by downregulating their gene targets. The task is to predict the interaction activity between a miRNA and a gene target.

Impact: Accurately predicting unknown interactions between miRNAs and targets can lead to a more complete understanding of disease mechanisms and may yield potential disease target biomarkers. Such predictions can also help identify miRNA hits as candidates for miRNA therapeutics.

Generalization: The model needs to learn the biochemistry of miRNAs and target proteins so that it can extrapolate to novel miRNAs and targets across disease groups and tissues.

Product: Small-molecule, miRNA therapeutic.

Pipeline: Basic biomedical research, target discovery, activity.

miRTarBase

Dataset Description: miRTarBase [1] has accumulated more than three hundred and sixty thousand miRNA-target interactions (MTIs), collected by manually surveying the pertinent literature after systematically applying natural language processing (NLP) to the text to filter research articles related to functional studies of miRNAs. The collected MTIs are generally validated experimentally by reporter assays, western blots, microarrays, and next-generation sequencing experiments. TDC uses miRBase [2] to obtain the mature miRNA sequences.

Task Description: Binary Classification. Given the miRNA mature sequence and target amino acid sequence, predict their likelihood of interaction.
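
As an illustration of how the two input sequences might be turned into numeric features before classification, here is a minimal one-hot encoding sketch. The helper name `one_hot`, the two alphabets, and both example sequences are assumptions made for this example; they are not part of TDC or taken from the dataset.

```python
# Hypothetical alphabets: RNA bases for the miRNA, 20 standard residues for the target.
RNA_ALPHABET = "ACGU"
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq, alphabet):
    """Encode a sequence as a list of one-hot rows, one row per character.

    Characters outside the alphabet (e.g. ambiguity codes) become all-zero rows.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in seq.upper():
        row = [0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1
        rows.append(row)
    return rows


# Illustrative inputs only, not rows from miRTarBase.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
target = "MKTAYIAKQR"

mirna_encoded = one_hot(mirna, RNA_ALPHABET)
target_encoded = one_hot(target, AMINO_ACIDS)
print(len(mirna_encoded), "x", len(mirna_encoded[0]))
print(len(target_encoded), "x", len(target_encoded[0]))
```

A binary classifier would then take the pair of encoded sequences as input and output a score for the likelihood of interaction. One-hot encoding is only one of several reasonable featurizations.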

Dataset Statistics: 400,082 MTI pairs, 3,465 miRNAs, 21,242 targets
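
To put these counts in perspective, the short calculation below estimates what fraction of all possible miRNA-target combinations is recorded as an interaction. It uses only the three figures above and assumes every miRNA could in principle be paired with every target.

```python
# Counts taken from the dataset statistics above.
n_pairs = 400_082
n_mirnas = 3_465
n_targets = 21_242

# Assumption: any miRNA may be paired with any target.
possible_pairs = n_mirnas * n_targets
density = n_pairs / possible_pairs

print(f"{possible_pairs:,} possible pairs")
print(f"{density:.2%} of them are recorded interactions")
```

Roughly half a percent of the possible pairs are recorded, so the interaction matrix is very sparse.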

Dataset Split: Random Split

from tdc.multi_pred import MTI
# Load the miRTarBase dataset
data = MTI(name = 'miRTarBase')
# Split the pairs at random into training, validation and test sets
split = data.get_split()
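
The idea behind a random split can be sketched in plain Python as follows. This is not TDC's implementation: the function name `random_split`, the 0.7/0.1/0.2 fractions, the seed, and the toy pairs are all assumptions chosen for illustration.

```python
import random


def random_split(records, frac=(0.7, 0.1, 0.2), seed=42):
    """Shuffle records and cut them into train/valid/test partitions.

    `frac` gives the train, valid and test proportions; the test set
    receives whatever remains after the first two cuts.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(frac[0] * len(shuffled))
    n_valid = round(frac[1] * len(shuffled))
    return {
        "train": shuffled[:n_train],
        "valid": shuffled[n_train:n_train + n_valid],
        "test": shuffled[n_train + n_valid:],
    }


# Toy (miRNA, target) pairs standing in for real dataset rows.
toy_pairs = [(f"miRNA_{i % 5}", f"target_{i}") for i in range(20)]
toy_split = random_split(toy_pairs)
print({name: len(part) for name, part in toy_split.items()})
```

Because pairs are assigned at random, the same miRNA or target can appear in more than one partition. A random split therefore does not by itself test extrapolation to unseen miRNAs or targets.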

Note: The dataset contains only positive pairs. To get the negative samples, you can call:

data = data.neg_sample(frac = 1)
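
For intuition, negative sampling for pair data can be sketched as drawing random miRNA-target combinations that do not appear among the known positives. The code below is an illustrative stand-in, not TDC's implementation; the function name `sample_negatives` and the toy pairs are hypothetical.

```python
import random


def sample_negatives(positives, frac=1.0, seed=0):
    """Draw unobserved (miRNA, target) pairs to serve as presumed negatives.

    Returns about `frac` times as many negatives as there are positives,
    capped at the number of unobserved combinations that exist.
    """
    positive_set = set(positives)
    mirnas = sorted({m for m, _ in positive_set})
    targets = sorted({t for _, t in positive_set})

    n_wanted = int(frac * len(positive_set))
    n_available = len(mirnas) * len(targets) - len(positive_set)
    n_wanted = min(n_wanted, n_available)

    rng = random.Random(seed)
    negatives = set()
    while len(negatives) < n_wanted:
        pair = (rng.choice(mirnas), rng.choice(targets))
        if pair not in positive_set:  # skip known interactions
            negatives.add(pair)
    return sorted(negatives)


# Toy positive pairs standing in for real dataset rows.
toy_positives = [("miR-a", "T1"), ("miR-a", "T2"), ("miR-b", "T3"), ("miR-c", "T1")]
toy_negatives = sample_negatives(toy_positives, frac=1.0)
print(toy_negatives)
```

Pairs sampled this way are only presumed negatives. Absence from the database means the interaction has not been reported, not that it has been experimentally ruled out.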

References:

[1] Chou, Chih-Hung, et al. “miRTarBase update 2018: a resource for experimentally validated microRNA-target interactions.” Nucleic acids research 46.D1 (2018): D296-D302.

[2] Kozomara, Ana, Maria Birgaoanu, and Sam Griffiths-Jones. “miRBase: from microRNA sequences to function.” Nucleic acids research 47.D1 (2019): D155-D162.

Dataset License: CC BY 4.0.