Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CC(C)Oc1ccc(NC(=O)[C@@H]2C[C@@H]3CC[C@@H]2N(S(=O)(=O)c2ccsc2)C3)cc1. The target protein (Q9H5J4) has sequence MNMSVLTLQEYEFEKQFNENEAIQWMQENWKKSFLFSALYAAFIFGGRHLMNKRAKFELRKPLVLWSLTLAVFSIFGALRTGAYMVYILMTKGLKQSVCDQGFYNGPVSKFWAYAFVLSKAPELGDTIFIILRKQKLIFLHWYHHITVLLYSWYSYKDMVAGGGWFMTMNYGVHAVMYSYYALRAAGFRVSRKFAMFITLSQITQMLMGCVVNYLVFCWMQHDQCHSHFQNIFWSSLMYLSYLVLFCHFFFEAYIGKMRKTTKAE. The pIC50 is 7.2. (2) The drug is O=C(Nc1cccc2c(=O)[nH]ccc12)c1cccs1. The target protein (O88554) has sequence MAPRRQRSGSGRRVLNEAKKVDNGNKATEDDSPPGKKMRTCQRKGPMAGGKDADRTKDNRDSVKTLLLKGKAPVDPECAAKLGKAHVYCEGDDVYDVMLNQTNLQFNNNKYYLIQLLEDDAQRNFSVWMRWGRVGKTGQHSLVTCSGDLNKAKEIFQKKFLDKTKNNWEDRENFEKVPGKYDMLQMDYAASTQDESKTKEEETLKPESQLDLRVQELLKLICNVQTMEEMMIEMKYDTKRAPLGKLTVAQIKAGYQSLKKIEDCIRAGQHGRALVEACNEFYTRIPHDFGLSIPPVIRTEKELSDKVKLLEALGDIEIALKLVKSERQGLEHPLDQHYRNLHCALRPLDHESNEFKVISQYLQSTHAPTHKDYTMTLLDVFEVEKEGEKEAFREDLPNRMLLWHGSRLSNWVGILSHGLRVAPPEAPITGYMFGKGIYFADMSSKSANYCFASRLKNTGLLLLSEVALGQCNELLEANPKAQGLLRGKHSTKGMGKMAPS.... The pIC50 is 5.2. (3) The target protein sequence is MTKIALIGSGQIGAIVGELCLLENLGDLILYDVVPGIPQGKALDLKHFSTILGVNRNILGTNQIEDIKDADIIVITAGVQRKEGMTREDLIGVNGKIMKSVAESVKLHCSKAFVICVSNPLDIMVNVFHKFSNLPHEKICGMAGILDTSRYCSLIADKLKVSAEDVNAVILGGHGDLMVPLQRYTSVNGVPLSEFVKKNMISQNEIQEIIQKTRNMGAEIIKLAKASAAFAPAAAITKMIKSYLYNENNLFTCAVYLNGHYNCSNLFVGSTAKINNKGAHPVEFPLTKEEQDLYTESIASVQSNTQKAFDLIK. The drug is COc1cc(CCOc2cc(O)c3c(=O)cc(C(=O)O)oc3c2)ccc1NC(=O)C(=O)O. The pIC50 is 3.6. (4) The small molecule is CC1(C)C=Cc2c(ccc(C(=O)/C=C/c3ccc(O)cc3)c2O)O1. 
The target protein (P11926) has sequence MNNFGNEEFDCHFLDEGFTAKDILDQKINEVSSSDDKDAFYVADLGDILKKHLRWLKALPRVTPFYAVKCNDSKAIVKTLAATGTGFDCASKTEIQLVQSLGVPPERIIYANPCKQVSQIKYAANNGVQMMTFDSEVELMKVARAHPKAKLVLRIATDDSKAVCRLSVKFGATLRTSRLLLERAKELNIDVVGVSFHVGSGCTDPETFVQAISDARCVFDMGAEVGFSMYLLDIGGGFPGSEDVKLKFEEITGVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKIVLKEQTGSDDEDESSEQTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDRIVERCDLPEMHVGDWMLFENMGAYTVAAASTFNGFQRPTIYYVMSGPAWQLMQQFQNPDFPPEVEEQDASTLPVSCAWESGMKRHRAACASASINV. The pIC50 is 5.3. (5) The target protein sequence is MDTSGHFHDSGVGDLDEDPKCPCPSSGDEQQQQQQQQQQQQPPPPAPPAAPQQPLGPSLQPQPPQLQQQQQQQQQQQQQQQQQQQPPHPLSQLAQLQSQPVHPGLLHSSPTAFRAPPSSNSTAILHPSSRQGSQLNLNDHLLGHSPSSTATSGPGGGSRHRQASPLVHRRDSNPFTEIAMSSCKYSGGVMKPLSRLSASRRNLIEAETEGQPLQLFSPSNPPEIVISSREDNHAHQTLLHHPNATHNHQHAGTTASSTTFPKANKRKNQNIGYKLGHRRALFEKRKRLSDYALIFGMFGIVVMVIETELSWGLYSKDSMFSLALKCLISLSTIILLGLIIAYHTREVQLFVIDNGADDWRIAMTYERILYISLEMLVCAIHPIPGEYKFFWTARLAFSYTPSRAEADVDIILSIPMFLRLYLIARVMLLHSKLFTDASSRSIGALNKINFNTRFVMKTLMTICPGTVLLVFSISLWIIAAWTVRVCERYHDQQDVTSNFL.... The drug is Clc1cc(NC(Cc2ccccc2)c2ccccc2)n2nccc2n1. The pIC50 is 5.1.
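The definition above, pIC50 = -log10(IC50 in M), can be sketched as a small unit conversion. This is only an illustration: the function names are hypothetical, and taking IC50 in nanomolar as input is an assumption, since the record states only that IC50 is expressed in molar units before the logarithm is taken.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in M), so the nM value is first scaled to molar.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition: recover IC50 in nanomolar from a pIC50 value."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # A pIC50 of 7.2, as in example (1), corresponds to an IC50 of roughly 63 nM.
    print(round(ic50_nm_from_pic50(7.2), 1))
    # An IC50 of 1000 nM (1 micromolar) corresponds to a pIC50 of about 6.
    print(round(pic50_from_ic50_nm(1000.0), 3))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold change in IC50, which is why higher values mean more potent binding.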