This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted quantity is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 6.8. The target protein (O95180) has sequence MTEGARAADEVRVPLGAPPPGPAALVGASPESPGAPGREAERGSELGVSPSESPAAERGAELGADEEQRVPYPALAATVFFCLGQTTRPRSWCLRLVCNPWFEHVSMLVIMLNCVTLGMFRPCEDVECGSERCNILEAFDAFIFAFFAVEMVIKMVALGLFGQKCYLGDTWNRLDFFIVVAGMMEYSLDGHNVSLSAIRTVRVLRPLRAINRVPSMRILVTLLLDTLPMLGNVLLLCFFVFFIFGIVGVQLWAGLLRNRCFLDSAFVRNNNLTFLRPYYQTEEGEENPFICSSRRDNGMQKCSHIPGRRELRMPCTLGWEAYTQPQAEGVGAARNACINWNQYYNVCRSGDSNPHNGAINFDNIGYAWIAIFQVITLEGWVDIMYYVMDAHSFYNFIYFILLIIVGSFFMINLCLVVIATQFSETKQRESQLMREQRARHLSNDSTLASFSEPGSCYEELLKYVGHIFRKVKRRSLRLYARWQSRWRKKVDPSAVQGQGP.... The compound is COc1cc(Cn2ncc(NC(=O)Cc3ccc(C4(F)COC4)cc3)n2)ccc1F. (2) The small molecule is O=C1O/C(=C/I)CCC1c1cccc2ccccc12. The target protein (O60733) has sequence MQFFGRLVNTFSGVTNLFSNPFRVKEVAVADYTSSDRVREEGQLILFQNTPNRTWDCVLVNPRNSQSGFRLFQLELEADALVNFHQYSSQLLPFYESSPQVLHTEVLQHLTDLIRNHPSWSVAHLAVELGIRECFHHSRIISCANCAENEEGCTPLHLACRKGDGEILVELVQYCHTQMDVTDYKGETVFHYAVQGDNSQVLQLLGRNAVAGLNQVNNQGLTPLHLACQLGKQEMVRVLLLCNARCNIMGPNGYPIHSAMKFSQKGCAEMIISMDSSQIHSKDPRYGASPLHWAKNAEMARMLLKRGCNVNSTSSAGNTALHVAVMRNRFDCAIVLLTHGANADARGEHGNTPLHLAMSKDNVEMIKALIVFGAEVDTPNDFGETPTFLASKIGRLVTRKAILTLLRTVGAEYCFPPIHGVPAEQGSAAPHHPFSLERAQPPPISLNNLELQDLMHISRARKPAFILGSMRDEKRTHDHLLCLDGGGVKGLIIIQLLIAI.... The pIC50 is 7.5. (3) The small molecule is Cn1c(SCc2cc(=O)oc3cc(O)c(O)cc23)nc2ccccc21. 
The target protein sequence is MDRMYEQNQMPHNNEAEQSVLGSIIIDPELINTTQEVLLPESFYRGAHQHIFRAMMHLNEDNKEIDVVTLMDQLSTEGTLNEAGGPQYLAELSTNVPTTRNVQYYTDIVSKHALKRRLIQTADSIANDGYNDELELDAILSDAERRILELSSSRESDGFKDIRDVLGQVYETAEELDQNSGQTPGIPTGYRDLDQMTAGFNRNDLIILAARPSVGKTAFALNIAQKVATHEDMYTVGIFSLEMGADQLATRMICSSGNVDSNRLRTGTMTEEDWSRFTIAVGKLSRTKIFIDDTPGIRINDLRSKCRRLKQEHGLDMIVIDYLQLIQGSGSRASDNRQQEVSEISRTLKALARELKCPVIALSQLSRGVEQRQDKRPMMSDIRESGSIEQDADIVAFLYRDDYYNRGGDEDDDDDGGFEPQTNDENGEIEIIIAKQRNGPTGTVKLHFMKQYNKFTDIDYAHADMM. The pIC50 is 4.8. (4) The compound is Cc1ccc(-n2nc(C(=O)OCCn3c([N+](=O)[O-])cnc3C)cc2-c2ccc(I)cc2)cc1. The target protein sequence is MTNVLIEDLKWRGLIYQQTDEQGIEDLLNKEQVTLYCGADPTADSLHIGHLLPFLTLRRFQEHGHRPIVLIGGGTGMIGDPSGKSEERVLQTEEQVDKNIEGISKQMHNIFEFGTDHGAVLVNNRDWLGQISLISFLRDYGKHVGVNYMLGKDSIQSRLEHGISYTEFTYTILQAIDFGHLNRELNCKIQVGGSDQWGNITSGIELMRRMYGQTDAYGLTIPLVTKSDGKKFGKSESGAVWLDAEKTSPYEFYQFWINQSDEDVIKFLKYFTFLGKEEIDRLEQSKNEAPHLREAQKTLAEEVTKFIHGEDALNDAIRISQALFSGDLKSLSAKELKDGFKDVPQVTLSNDTTNIVEVLIETGISPSKRQAREDVNNGAIYINGERQQDVNYALAPEDKIDGEFTIIRRGKKKYFMVNYQ. The pIC50 is 4.4. (5) The small molecule is CC(=O)N1CCN(c2ccc(OC[C@@H]3CO[C@](Cn4ccnc4)(c4ccc(Cl)cc4Cl)O3)cc2)CC1. The target protein (P08683) has sequence MDPVLVLVLTLSSLLLLSLWRQSFGRGKLPPGPTPLPIIGNTLQIYMKDIGQSIKKFSKVYGPIFTLYLGMKPFVVLHGYEAVKEALVDLGEEFSGRGSFPVSERVNKGLGVIFSNGMQWKEIRRFSIMTLRTFGMGKRTIEDRIQEEAQCLVEELRKSKGAPFDPTFILGCAPCNVICSIIFQNRFDYKDPTFLNLMHRFNENFRLFSSPWLQVCNTFPAIIDYFPGSHNQVLKNFFYIKNYVLEKVKEHQESLDKDNPRDFIDCFLNKMEQEKHNPQSEFTLESLVATVTDMFGAGTETTSTTLRYGLLLLLKHVDVTAKVQEEIERVIGRNRSPCMKDRSQMPYTDAVVHEIQRYIDLVPTNLPHLVTRDIKFRNYFIPKGTNVIVSLSSILHDDKEFPNPEKFDPGHFLDERGNFKKSDYFMPFSAGKRICAGEALARTELFLFFTTILQNFNLKSLVDVKDIDTTPAISGFGHLPPFYEACFIPVQRADSLSSHL.... The pIC50 is 4.7. (6) The compound is COC(=O)c1cc(-c2ccc3[nH]c(-c4nc(C)no4)cc3c2)n(C)n1. 
The target protein (P33435) has sequence MHSAILATFFLLSWTPCWSLPLPYGDDDDDDLSEEDLVFAEHYLKSYYHPATLAGILKKSTVTSTVDRLREMQSFFGLEVTGKLDDPTLDIMRKPRCGVPDVGEYNVFPRTLKWSQTNLTYRIVNYTPDMSHSEVEKAFRKAFKVWSDVTPLNFTRIYDGTADIMISFGTKEHGDFYPFDGPSGLLAHAFPPGPNYGGDAHFDDDETWTSSSKGYNLFIVAAHELGHSLGLDHSKDPGALMFPIYTYTGKSHFMLPDDDVQGIQFLYGPGDEDPNPKHPKTPEKCDPALSLDAITSLRGETMIFKDRFFWRLHPQQVEAELFLTKSFWPELPNHVDAAYEHPSRDLMFIFRGRKFWALNGYDILEGYPRKISDLGFPKEVKRLSAAVHFENTGKTLFFSENHVWSYDDVNQTMDKDYPRLIEEEFPGIGNKVDAVYEKNGYIYFFNGPIQFEYSIWSNRIVRVMPTNSILWC. The pIC50 is 6.7. (7) The pIC50 is 4.2. The target is SSSEEGLTCRGIPNSISI. The compound is COc1ccc(/C(=N/NC(=O)NCCCC(C)Nc2cc(OC)cc3cccnc23)c2ccccc2)cc1. (8) The drug is CCC(CC)O[C@@H]1C=C(C(=O)O)C[C@H](N)[C@H]1NC(C)=O. The target protein sequence is MNPNKKIITIGSICMVTGMVSLMLQIGNLISIWVSHSIHTGNQHKAEPISNTNFLTEKAVASVKLAGNSSLCPINGWAVYSKDNSIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPHRTLMSCPVGEAPSPYNSRFESVAWSASACHDGTSWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFTVMTDGPSNGQASHKIFKMEKGKVVKSVELDAPNYHYEECSCYPDAGEITCVCRDNWHGSNRPWVSFNQNLEYQIGYICSGVFGDNPRPNDGTGSCGPVSSNGAYGVKGFSFKYGNGVWIGRTKSTNSRSGFEMIWDPNGWTETDSSFSVKQDIVAITDWSGYSGSFVQHPELTGLDCIRPCFWVELIRGRPKESTIWTSGSSISFCGVNSDTVGWSWPDGAELPFTIDK. The pIC50 is 9.4. (9) The drug is CC1CC(=O)NN=C1c1ccc(NC2=C(Cc3ccccc3)C(=O)CCC2)cc1F. The target protein sequence is MGAFSGSCRPKINPLTPFPGFYPCSEIEDPAEKGDRKLNKGLNRNSLPTPQLRRSSGTSGLLPVEQSSRWDRNNGKRPHQEFGISSQGCYLNGPFNSNLLTIPKQRSSSVSLTHHVGLRRAGVLSSLSPVNSSNHGPVSTGSLTNRSPIEFPDTADFLNKPSVILQRSLGNAPNTPDFYQQLRNSDSNLCNSCGHQMLKYVSTSESDGTDCCSGKSGEEENIFSKESFKLMETQQEEETEKKDSRKLFQEGDKWLTEEAQSEQQTNIEQEVSLDLILVEEYDSLIEKMSNWNFPIFELVEKMGEKSGRILSQVMYTLFQDTGLLEIFKIPTQQFMNYFRALENGYRDIPYHNRIHATDVLHAVWYLTTRPVPGLQQIHNGCGTGNETDSDGRINHGRIAYISSKSCSNPDESYGCLSSNIPALELMALYVAAAMHDYDHPGRTNAFLVATNAPQAVLYNDRSVLENHHAASAWNLYLSRPEYNFLLHLDHVEFKRFRFLV.... The pIC50 is 9.5. (10) The small molecule is C[C@@H]1OC(=O)[C@@H]1NC(=O)OCc1ccc(C2CCCCC2)cc1. 
The target protein (Q5KTC7) has sequence MGTPAIRAACHGAHLALALLLLLSLSDPWLWATAPGTPPLFNVSLDAAPELRWLPMLQHYDPDFVRAAVAQVIGDRVPQWILEMIGEIVQKVESFLPQPFTSEIRGICDYLNLSLAEGVLVNLAYEASAFCTSIVAQDSQGRIYHGRNLDYPFGNALRKLTADVQFVKNGQIVFTATTFVGYVGLWTGQSPHKFTISGDERDKGWWWENMIAALSLGHSPISWLIRKTLTESEDFEAAVYTLAKTPLIADVYYIVGGTSPQEGVVITRDRGGPADIWPLDPLNGAWFRVETNYDHWEPVPKRDDRRTPAIKALNATGQAHLSLETLFQVLSVFPVYNNYTIYTTVMSAAEPDKYMTMIRNPS. The pIC50 is 8.2.
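The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration and not part of the dataset or any BindingDB tooling; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the nanomolar conversion assumes 1 M = 1e9 nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 (1e-6 M) gives a pIC50 of about 6.
    print(ic50_to_pic50(1e-6))
    # A higher pIC50 means a lower IC50, i.e. a more potent compound.
    print(pic50_to_ic50_nm(8.2), pic50_to_ic50_nm(4.4))
```

Under this definition, each unit of pIC50 is a tenfold change in IC50, so a pIC50 of 8.2 corresponds to an IC50 in the single-digit nanomolar range, while a pIC50 of 4.4 corresponds to tens of micromolar.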