This data is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The prediction target is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCCCCCc1ccn(Cc2ccccc2Cl)c(=O)c1. The target protein sequence is MIIKPRVRGFICVTAHPTGCEANVKKQIDYVTTEGPIANGPKRVLVIGASTGYGLAARITAAFGCGADTLGVFFERPGEEGKPGTSGWYNSAAFHKFAAQKGLYAKSINGDAFSDEIKQLTIDAIKQDLGQVDQVIYSLASPRRTHPKTGEVFNSALKPIGNAVNLRGLDTDKEVIKESVLQPATQSEIDSTVAVMGGEDWQMWIDALLDAGVLAEGAQTTAFTYLGEKITHDIYWNGSIGAAKKDLDQKVLAIRESLAAHGGGDARVSVLKAVVSQASSAIPMMPLYLSLLFKVMKEKGTHEGCIEQVYSLYKDSLCGDSPHMDQEGRLRADYKELDPEVQNQVQQLWDQVTNDNIYQLTDFVGYKSEFLNLFGFGIDGVDYDADVNPDVKIPNLIQG. The pIC50 is 5.7. (2) The small molecule is O=C(N[C@@H](Cc1c[nH]c2ccccc12)C(=O)O)c1ccc2c(c1)nc(-c1ccoc1)n2C1CCCCC1. The target protein (P0A8V2) has sequence MVYSYTEKKRIRKDFGKRPQVLDVPYLLSIQLDSFQKFIEQDPEGQYGLEAAFRSVFPIQSYSGNSELQYVSYRLGEPVFDVQECQIRGVTYSAPLRVKLRLVIYEREAPEGTVKDIKEQEVYMGEIPLMTDNGTFVINGTERVIVSQLHRSPGVFFDSDKGKTHSSGKVLYNARIIPYRGSWLDFEFDPKDNLFVRIDRRRKLPATIILRALNYTTEQILDLFFEKVIFEIRDNKLQMELVPERLRGETASFDIEANGKVYVEKGRRITARHIRQLEKDDVKLIEVPVEYIAGKVVAKDYIDESTGELICAANMELSLDLLAKLSQSGHKRIETLFTNDLDHGPYISETLRVDPTNDRLSALVEIYRMMRPGEPPTREAAESLFENLFFSEDRYDLSAVGRMKFNRSLLREEIEGSGILSKDDIIDVMKKLIDIRNGKGEVDDIDHLGNRRIRSVGEMAENQFRVGLVRVERAVKERLSLGDLDTLMPQDMINAKPISA.... The pIC50 is 3.5. (3) The small molecule is O=C(O)Cc1cc(Cl)c(Oc2ccc(O)c(-c3ccc(O)cc3)c2)c(Cl)c1.
The target protein (P10828) has sequence MTPNSMTENGLTAWDKPKHCPDREHDWKLVGMSEACLHRKSHSERRSTLKNEQSSPHLIQTTWTSSIFHLDHDDVNDQSVSSAQTFQTEEKKCKGYIPSYLDKDELCVVCGDKATGYHYRCITCEGCKGFFRRTIQKNLHPSYSCKYEGKCVIDKVTRNQCQECRFKKCIYVGMATDLVLDDSKRLAKRKLIEENREKRRREELQKSIGHKPEPTDEEWELIKTVTEAHVATNAQGSHWKQKRKFLPEDIGQAPIVNAPEGGKVDLEAFSHFTKIITPAITRVVDFAKKLPMFCELPCEDQIILLKGCCMEIMSLRAAVRYDPESETLTLNGEMAVTRGQLKNGGLGVVSDAIFDLGMSLSSFNLDDTEVALLQAVLLMSSDRPGLACVERIEKYQDSFLLAFEHYINYRKHHVTHFWPKLLMKVTDLRMIGACHASRFLHMKVECPTELFPPLFLEVFED. The pIC50 is 7.1. (4) The target protein (Q96NT5) has sequence MEGSASPPEKPRARPAAAVLCRGPVEPLVFLANFALVLQGPLTTQYLWHRFSADLGYNGTRQRGGCSNRSADPTMQEVETLTSHWTLYMNVGGFLVGLFSSTLLGAWSDSVGRRPLLVLASLGLLLQALVSVFVVQLQLHVGYFVLGRILCALLGDFGGLLAASFASVADVSSSRSRTFRMALLEASIGVAGMLASLLGGHWLRAQGYANPFWLALALLIAMTLYAAFCFGETLKEPKSTRLFTFRHHRSIVQLYVAPAPEKSRKHLALYSLAIFVVITVHFGAQDILTLYELSTPLCWDSKLIGYGSAAQHLPYLTSLLALKLLQYCLADAWVAEIGLAFNILGMVVFAFATITPLMFTGYGLLFLSLVITPVIRAKLSKLVRETEQGALFSAVACVNSLAMLTASGIFNSLYPATLNFMKGFPFLLGAGLLLIPAVLIGMLEKADPHLEFQQFPQSP. The compound is Nc1nc(N)c2c(CCCc3csc(C(=O)N[C@@H](CCC(=O)O)C(=O)O)c3)coc2n1. The pIC50 is 6.0.
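The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is only an illustration of the arithmetic. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and are not part of BindingDB or of any dataset tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # pIC50 labels of the four examples in this record.
    for label in (5.7, 3.5, 7.1, 6.0):
        print(f"pIC50 {label} corresponds to IC50 of {pic50_to_ic50_nm(label):.1f} nM")
```

Under this definition, each unit increase in pIC50 corresponds to a tenfold lower IC50, so the pIC50 of 7.1 in example (3) implies an IC50 of roughly 79 nM, while the 3.5 in example (2) implies roughly 316 µM.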