Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (Q61009) has sequence MGGSSRARWVALGLGALGLLFAALGVVMILMVPSLIKQQVLKNVRIDPSSLSFGMWKEIPVPFYLSVYFFEVVNPNEVLNGQKPVVRERGPYVYREFRQKVNITFNDNDTVSFVENRSLHFQPDKSHGSESDYIVLPNILVLGGSILMESKPVSLKLMMTLALVTMGQRAFMNRTVGEILWGYDDPFVHFLNTYLPDMLPIKGKFGLFVGMNNSNSGVFTVFTGVQNFSRIHLVDKWNGLSKIDYWHSEQCNMINGTSGQMWAPFMTPESSLEFFSPEACRSMKLTYNESRVFEGIPTYRFTAPDTLFANGSVYPPNEGFCPCRESGIQNVSTCRFGAPLFLSHPHFYNADPVLSEAVLGLNPNPKEHSLFLDIHPVTGIPMNCSVKMQLSLYIKSVKGIGQTGKIEPVVLPLLWFEQSGAMGGKPLSTFYTQLVLMPQVLHYAQYVLLGLGGLLLLVPIICQLRSQEKCFLFWSGSKKGSQDKEAIQAYSESLMSPAAK.... The drug is COc1ccc(-c2nnn(CC(=O)N(C3CCCCC3)C(C)C(=O)N3CCCCC3)n2)cc1OC. The pIC50 is 5.2. (2) The compound is Cc1c(CCC(=O)O)c(=O)oc2c(C)c(OCc3ccc(-c4ccc(Cl)c(Cl)c4)cc3)ccc12. The target protein sequence is MSDVMADRTPPHNIEAEQAVLGAILIDQDALTSASELLVPDSFYRTKHQKIFEVMLGLSDKGEPIDLVMMTSAMADQGLLEEVGGVSYLAELAEVVPTAANVEYYARIIAEKALLRRLIRTATHIVSDGYEREDDVDGLLNEAEKKILEVSHQTNAKAFQNIKDVLVDAYDKIELLHNQKGEVTGIPTGFTELDKMTAGFQRNDLIIVAARPSVGKTAFSLNIAQNVATKTDENVAIFSLEMGADQLVMRMLCAEGNIDAQRLRTGSLTSDDWAKLTMAMGSLSNAGIYIDDTPGIKVNEIRAKCRRLKQEQGLGMILIDYLQLIQGSGKSGENRQQEVSEISRTLKGIARELQVPVIALSQLSRGVESRQDKRPMMSDIRESGSIEQDADIVAFLYREDYYDRETENKNTIEIIIAKQRNGPVGSVELAFVKEFNKFVNLERRFEDGHAPPA. The pIC50 is 5.5. (3) The small molecule is Cc1ccc(COC(=O)N2c3ccccc3Oc3ccccc32)cc1. 
The target protein (P51577) has sequence MAGCCSVLGSFLFEYDTPRIVLIRSRKVGLMNRAVQLLILAYVIGWVFVWEKGYQETDSVVSSVTTKAKGVAVTNTSQLGFRIWDVADYVIPAQEENSLFIMTNMIVTVNQTQSTCPEIPDKTSICNSDADCTPGSVDTHSSGVATGRCVPFNESVKTCEVAAWCPVENDVGVPTPAFLKAAENFTLLVKNNIWYPKFNFSKRNILPNITTSYLKSCIYNAQTDPFCPIFRLGTIVEDAGHSFQEMAVEGGIMGIQIKWDCNLDRAASLCLPRYSFRRLDTRDLEHNVSPGYNFRFAKYYRDLAGKEQRTLTKAYGIRFDIIVFGKAGKFDIIPTMINVGSGLALLGVATVLCDVIVLYCMKKKYYYRDKKYKYVEDYEQGLSGEMNQ. The pIC50 is 5.0. (4) The drug is C=C1C(=O)N[C@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(=O)O)[C@H](C)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](/C=C/C(C)=C/[C@H](C)[C@H](Cc2ccccc2)OC)[C@H](C)C(=O)N[C@@H](C(=O)O)CCC(=O)N1C. The target protein (P36873) has sequence MADLDKLNIDSIIQRLLEVRGSKPGKNVQLQENEIRGLCLKSREIFLSQPILLELEAPLKICGDIHGQYYDLLRLFEYGGFPPESNYLFLGDYVDRGKQSLETICLLLAYKIKYPENFFLLRGNHECASINRIYGFYDECKRRYNIKLWKTFTDCFNCLPIAAIVDEKIFCCHGGLSPDLQSMEQIRRIMRPTDVPDQGLLCDLLWSDPDKDVLGWGENDRGVSFTFGAEVVAKFLHKHDLDLICRAHQVVEDGYEFFAKRQLVTLFSAPNYCGEFDNAGAMMSVDETLMCSFQILKPAEKKKPNATRPVTPPRGMITKQAKK. The pIC50 is 7.5. (5) The small molecule is Cn1nc(Nc2nc(N[C@@H](COC(=O)N3CCOC(CN)C3)c3ccccc3)nc3n[nH]cc23)cc1C(C)(C)C. The target protein (Q9HW02) has sequence MVKEPNGVTRTMRRIRRIHFVGIGGAGMCGIAEVLLNLGYEVSGSDLKASAVTERLEKFGAQIFIGHQAENADGADVLVVSSAINRANPEVASALERRIPVVPRAEMLAELMRYRHGIAVAGTHGKTTTTSLIASVFAAGGLDPTFVIGGRLNAAGTNAQLGASRYLVAEADESDASFLHLQPMVAVVTNIDADHMATYGGDFNKLKKTFVEFLHNLPFYGLAVMCVDDPVVREILPQIARPTVTYGLSEDADVRAINIRQEGMRTWFTVLRPEREPLDVSVNMPGLHNVLNSLATIVIATDEGISDEAIVQGLSGFQGVGRRFQVYGELQVEGGSVMLVDDYGHHPREVAAVIKAIRGGWPERRLVMVYQPHRYTRTRDLYEDFVQVLGEANVLLLMEVYPAGEEPIPGADSRQLCHSIRQRGQLDPIYFERDADLAPLVKPLLRAGDILLCQGAGDVGGLAPQLIKNPLFAGKGGKGA. The pIC50 is 9.2. (6) The small molecule is Cc1ccsc1C(=O)n1nc(Nc2ccc(S(N)(=O)=O)cc2)nc1N. The target is XTSFAESXKPVQQPSAFGS. The pIC50 is 7.6. (7) The small molecule is N[C@@H](CSC(Cc1ccccc1)(c1ccccc1)c1cccc(Cl)c1)C(=O)O. 
The target protein (P52732) has sequence MASQPNSSAKKKEEKGKNIQVVVRCRPFNLAERKASAHSIVECDPVRKEVSVRTGGLADKSSRKTYTFDMVFGASTKQIDVYRSVVCPILDEVIMGYNCTIFAYGQTGTGKTFTMEGERSPNEEYTWEEDPLAGIIPRTLHQIFEKLTDNGTEFSVKVSLLEIYNEELFDLLNPSSDVSERLQMFDDPRNKRGVIIKGLEEITVHNKDEVYQILEKGAAKRTTAATLMNAYSSRSHSVFSVTIHMKETTIDGEELVKIGKLNLVDLAGSENIGRSGAVDKRAREAGNINQSLLTLGRVITALVERTPHVPYRESKLTRILQDSLGGRTRTSIIATISPASLNLEETLSTLEYAHRAKNILNKPEVNQKLTKKALIKEYTEEIERLKRDLAAAREKNGVYISEENFRVMSGKLTVQEEQIVELIEKIGAVEEELNRVTELFMDNKNELDQCKSDLQNKTQELETTQKHLQETKLQLVKEEYITSALESTEEKLHDAASKLL.... The pIC50 is 6.9. (8) The small molecule is NCCCn1cc(C2=C(c3c[nH]c4ccccc34)C(=O)NC2=O)c2ccccc21. The target protein sequence is MDGTAAEPRPGAGSLQHAQPPPQPRKKRPEDFKFGKILGEGSFSTVVLARELATSREYAIKILEKRHIIKENKVPYVTRERDVMSRLDHPFFVKLYFTFQDDEKLYFGMSYAKNGELLKYIRKIGSFDETCTRFYTAEIVSALEYLHGKGIIHRDLKPENILLNEDMHIQITDFGTAKVLSPESKQARANSFVGTAQYVSPELLTEKSACKSSDLWALGCIIYQLVAGLPPFRAGNEYLIFQKIIKLEYDFPEKFFPKARDLVEKLLVLDATKRLGCEEMEGYGPLKAHPFFESVTWENLHQQTPPKLT. The pIC50 is 5.5. (9) The target protein (Q28838) has sequence MGSLQPDAGNASWNGTEAPGGGARATPYSLQVTLTLVCLAGLLMLFTVFGNVLVIIAVFTSRALKAPQNLFLVSLASADILVATLVIPFSLANEVMGYWYFGKAWCEIYLALDVLFCTSSIVHLCAISLDRYWSITQAIEYNLKRTPRRIKAIIVTVWVISAVISFPPLISFEKKRGRSGQPSAEPRCEINDQKWYVISSSIGSFFAPCLIMILVYVRIYQIAKRRTRVPPSRRGPDATAAELPGSAERRPNGLGPERGGVGPVGAEVESLQVQLNGAPGEPAPAGAGADALDLEESSSSEHAERPPGSRRSERGPRAKGKARASQVKPGDSLPRRGPGATGLGAPTAGPAEERSGGGAKASRWRGRQNREKRFTFVLAVVIGVFVVCWFPFFFTYTLTAIGCPVPPTLFKFFFWFGYCNSSLNPVIYTIFNHDFRRAFKKILCRGDRKRIV. The pIC50 is 5.5. The drug is COc1cccc2c1CC(N1CCCC1)CO2.
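The header defines pIC50 = -log10(IC50 in M). The following is a minimal sketch of that conversion in both directions, for readers who want to relate the pIC50 values above to concentrations. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and are not part of BindingDB or of this dataset's tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar.

    IC50 [M] = 10 ** (-pIC50), and 1 M = 1e9 nM,
    so IC50 [nM] = 10 ** (9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # pIC50 values taken from examples (1) and (4) in the record above
    for value in (5.2, 7.5):
        print(f"pIC50 {value} corresponds to about {pic50_to_ic50_nm(value):.1f} nM")
```

Under this definition, the pIC50 of 5.2 in example (1) corresponds to an IC50 of roughly 6.3 µM, and each additional pIC50 unit means a tenfold lower IC50, which is why higher values indicate greater potency.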