Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is OC[C@@H]1[C@@H](O)[C@H](O)[C@@H](O)CN1CCCCCCOc1cccc(-c2c[nH]nn2)c1. The target protein sequence is MKKTWWKEGVAYQIYPRSFMDANGDGIGDLRGIIEKLDYLVELGVDIVWICPIYRSPNADNGYDISDYYAIMDEFGTMDDFDELLAQAHRRGLKIILDLVINHTSDEHPWFIESRSSRDNPKRDWYIWRDGKDGREPNNWESIFGGSAWQYDERTGQYYLHLFDVKQPDLNWENSEVRQALYDMINWWLDKGIDGFRIDAISHIKKKPGLPDLPNPKGLKYVPSFAAHMNQPGIMEYLRELKEQTFARYDIMTVGEANGVTVDEAEQWVGEENGVFHMIFQFEHLGLWKRKADGSIDVRRLKRTLTKWQKGLENRGWNALFLENHDLPRSVSTWGNDREYWAESAKALGALYFFMQGTPFIYQGQEIGMTNVQFSDIRDYRDVAALRLYELERANGRTHEEVMKIIWKTGRDNSRTPMQWSDAPNAGFTTGTPWIKVNENYRTINVEAERRDPNSVWSFYRQMIQLRKANELFVYGAYDLLLENHPSIYAYTRTLGRDRA.... The pIC50 is 5.6. (2) The compound is CCCC[C@]1(CC)CS(=O)(=O)c2cc(CNCC(=O)O)c(OC)cc2[C@@H](c2ccccc2)N1. The target protein (Q62633) has sequence MDNSSVCSPNATFCEGDSCLVTESNFNAILSTVMSTVLTILLAMVMFSMGCNVEINKFLGHIKRPWGIFVGFLCQFGIMPLTGFILSVASGILPVQAVVVLIMGCCPGGTGSNILAYWIDGDMDLSVSMTTCSTLLALGMMPLCLFIYTKMWVDSGTIVIPYDSIGISLVALVIPVSIGMFVNHKWPQKAKIILKIGSIAGAILIVLIAVVGGILYQSAWIIEPKLWIIGTIFPIAGYSLGFFLARLAGQPWYRCRTVALETGMQNTQLCSTIVQLSFSPEDLNLVFTFPLIYTVFQLVFAAIILGMYVTYKKCHGKNDAEFLEKTDNDMDPMPSFQETNKGFQPDEK. The pIC50 is 9.5. (3) The compound is Cc1ccc(C(=O)Nc2cccc(C(F)(F)F)c2)cc1-c1cc(N2CCOCC2)c(=O)n(C)n1. 
The target protein sequence is MEHIQGAWKTISNGFGFKDAVFDGSSCISPTIVQQFGYQRRASDDGKLTDPSKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARLDWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPTMCVDWSNIRQLLLFPNSTIGDSGVPALPSLTMRRMRESVSRMPVSSQHRYSTPHAFTFNTSSPSSEGSLSQRQRSTSTPNVHMVSTTLPVDSRMIEDAIRSHSESASPSALSSSPNNLSPTGWSQPKTPVPAQRERAPVSGTQEKNKIRPRGQRDSSEEWEIEASEVMLSTRIGSGSFGTVYKGKWHGDVAVKILKVVDPTPEQFQAFRNEVAVLRKTRHVNILLFMGYMTKDNLAIVTQWCEGSSLYKHLHVQETKFQMFQLIDIARQTAQGMDYLHAKNIIHRDMKSNNIFLHEGLTVKIGDFGLATVKSRWSGSQ.... The pIC50 is 9.3. (4) The compound is CC(=N)N1CCC(Oc2ccc3c(c2)OCC(=O)N3Cc2cc(-c3cccc(C(=N)N)c3)no2)CC1. The target protein (P00763) has sequence MRALLFLALVGAAVAFPVDDDDKIVGGYTCQENSVPYQVSLNSGYHFCGGSLINDQWVVSAAHCYKSRIQVRLGEHNINVLEGNEQFVNAAKIIKHPNFDRKTLNNDIMLIKLSSPVKLNARVATVALPSSCAPAGTQCLISGWGNTLSSGVNEPDLLQCLDAPLLPQADCEASYPGKITDNMVCVGFLEGGKDSCQGDSGGPVVCNGELQGIVSWGYGCALPDNPGVYTKVCNYVDWIQDTIAAN. The pIC50 is 3.6. (5) The drug is Nc1nc2c(-c3ccc(S(=O)(=O)NCC4(O)CNC4)c([SH](=O)=O)c3-c3nn[nH]n3)cccc2s1. The target protein (C7C422) has sequence MELPNIMHPVAKLSTALAAALMLSGCMPGEIRPTIGQQMETGDQRFGDLVFRQLAPNVWQHTSYLDMPGFGAVASNGLIVRDGGRVLVVDTAWTDDQTAQILNWIKQEINLPVALAVVTHAHQDKMGGMDALHAAGIATYANALSNQLAPQEGMVAAQHSLTFAANGWVEPATAPNFGPLKVFYPGPGHTSDNITVGIDGTDIAFGGCLIKDSKAKSLGNLGDADTEHYAASARAFGAAFPKASMIVMSHSAPDSRAAITHTARMADKLR. The pIC50 is 9.8. (6) The drug is O=C(N/N=C/c1cc([N+](=O)[O-])ccc1O)c1ccc(O)cc1O. The target protein sequence is FANLRKVLISDSLDPCCRKILQDGGLQVVEKQNLSKEELIAELQDCEGLIVRSATKVTADVINAAEKLQVVGRAGTGVDNVDLEAATRKGILVMNTPNGNSLSAAELTCGMIMCLARQIPQATASMKDGKWERKKFMGTELNGKTLGILGLGRIGREVATRMQSFGMKTIGYDPIISPEVSASFGVQQLPLEEIWPLCDFITVHTPLLPSTTGLLNDNTFAQCKKGVRVVNCARGGIVDEGALLRALQSGQCAGAALDVFTEEPPRDRALVDHENVISCPHLGASTKEAQSRCGEEIAVQFVDMVKGKSLTG. The pIC50 is 4.5. (7) The drug is CCCC[C@@H](O)/C=C(C)/C=C/C=C\C(=O)N1CCCC1=O. 
The target protein (P00639) has sequence MRGTRLMGLLLALAGLLQLGLSLKIAAFNIRTFGETKMSNATLASYIVRIVRRYDIVLIQEVRDSHLVAVGKLLDYLNQDDPNTYHYVVSEPLGRNSYKERYLFLFRPNKVSVLDTYQYDDGCESCGNDSFSREPAVVKFSSHSTKVKEFAIVALHSAPSDAVAEINSLYDVYLDVQQKWHLNDVMLMGDFNADCSYVTSSQWSSIRLRTSSTFQWLIPDSADTTATSTNCAYDRIVVAGSLLQSSVVPGSAAPFDFQAAYGLSNEMALAISDHYPVEVTLT. The pIC50 is 3.7.
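
The pIC50 values in the examples above follow the definition given in the dataset description, pIC50 = -log10(IC50 in M). A minimal sketch of that conversion is shown here. The function names and the nanomolar helper are illustrative assumptions, not part of BindingDB or of any dataset tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def nanomolar_to_molar(ic50_nm: float) -> float:
    """Hypothetical helper: assumes the input IC50 is reported in nM."""
    return ic50_nm * 1e-9


if __name__ == "__main__":
    # A 1 nM inhibitor has a pIC50 of about 9; higher means more potent.
    print(round(ic50_to_pic50(nanomolar_to_molar(1.0)), 3))
    # IC50 in mol/L implied by the pIC50 of 5.6 in example (1).
    print(pic50_to_ic50(5.6))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50. The pIC50 of 5.6 in example (1) corresponds to an IC50 of roughly 2.5 micromolar, and the pIC50 of 9.5 in example (2) to roughly 0.3 nanomolar.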