From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COc1cc(-c2ccc(CN[C@H](C)c3cccc4ccncc34)cc2)ccc1C(=O)O. The target protein (P10759) has sequence MPLFKLTGQGKQIDDAMRSFAEKVFASEVKDEGGRHEISPFDVDEICPISLREMQAHIFHMENLSMSMDGRRKRRFQGRKTVNLSIPQSETSSTKLSHIEEFISSSPTYESVPDFQRVQITGDYASGVTVEDFEVVCKGLYRALCIREKYMQKSFQRFPKTPSKYLRNIDGEALVAIESFYPVFTPPPKKGEDPFRREDLPANLGYHLKMKGGVIYIYPDEAAASRDEPKPYPYPNLDDFLDDMNFLLALIAQGPVKTYTHRRLKFLSSKFQVHQMLNEMDELKELKNNPHRDFYNCRKVDTHIHAAACMNQKHLLRFIKKSYHIDADRVVYSTKEKNLTLKELFAQLNMHPYDLTVDSLDVHAGRQTFQRFDKFNDKYNPVGASELRDLYLKTDNYINGEYFATIIKEVGADLVDAKYQHAEPRLSIYGRSPDEWSKLSSWFVGNRIYCPNMTWMIQVPRIYDVFRSKNFLPHFGKMLENIFLPVFEATINPQTHPDLS.... The pIC50 is 7.7. (2) The small molecule is O=C(O)CNC(=O)C(=O)O. The target protein sequence is MEPGCDEFLPPPECPVFEPSWAEFQDPLGYIAKIRPIAEKSGICKIRPPADWQPPFAVEVDNFRFTPRVQRLNELEAQTRVKLNYLDQIAKFWEIQGSSLKIPNVERKILDLYSLSKIVIEEGGYEAICKDRRWARVAQRLHYPPGKNIGSLLRSHYERIIYPYEMFQSGANHVQCNTHPFDNEVKDKEYKPHSIPLRQSVQPSKFSSYSRRAKRLQPDPEPTEEDIEKHPELKKLQIYGPGPKMMGLGLMAKDKDKTVHKKVTCPPTVTVKDEQSGGGNVSSTLLKQHLSLEPCTKTTMQLRKNHSSAQFIDSYICQVCSRGDEDDKLLFCDGCDDNYHIFCLLPPLPEIPRGIWRCPKCILAECKQPPEAFGFEQATQEYSLQSFGEMADSFKSDYFNMPVHMVPTELVEKEFWRLVSSIEEDVTVEYGADIHSKEFGSGFPVSNSKQNLSPEEKEYATSGWNLNVMPVLDQSVLCHINADISGMKVPWLYVGMVFSA.... The pIC50 is 5.2. (3) The drug is CC1OC(c2ccccc2)N(Cc2ccccc2)C1=O. The target is PDASQDDGPAVERPSTEL. The pIC50 is 4.1. (4) The compound is CN[C@@H](C)C(=O)N[C@H]1CCC[C@H]2C[C@H]3CCN(C(=O)Oc4ccccc4)C[C@H]3N2C1=O. The pIC50 is 7.3. 
The target protein (Q13490) has sequence MHKTASQRLFPGPSYQNIKSIMEDSTILSDWTNSNKQKMKYDFSCELYRMSTYSTFPAGVPVSERSLARAGFYYTGVNDKVKCFCCGLMLDNWKLGDSPIQKHKQLYPSCSFIQNLVSASLGSTSKNTSPMRNSFAHSLSPTLEHSSLFSGSYSSLSPNPLNSRAVEDISSSRTNPYSYAMSTEEARFLTYHMWPLTFLSPSELARAGFYYIGPGDRVACFACGGKLSNWEPKDDAMSEHRRHFPNCPFLENSLETLRFSISNLSMQTHAARMRTFMYWPSSVPVQPEQLASAGFYYVGRNDDVKCFCCDGGLRCWESGDDPWVEHAKWFPRCEFLIRMKGQEFVDEIQGRYPHLLEQLLSTSDTTGEENADPPIIHFGPGESSSEDAVMMNTPVVKSALEMGFNRDLVKQTVQSKILTTGENYKTVNDIVSALLNAEDEKREEEKEKQAEEMASDDLSLIRKNRMALFQQLTCVLPILDNLLKANVINKQEHDIIKQKT.... (5) The drug is N=C1Cc2ccccc2CC1[PH](O)(O)O. The target protein (Q01693) has sequence MKYTKTLLAMVLSATFCQAYAEDKVWISIGADANQTVMKSGAESILPNSVASSGQVWVGQVDVAQLAELSHNMHEEHNRCGGYMVHPSAQSAMAASAMPTTLASFVMPPITQQATVTAWLPQVDASQITGTISSLESFTNRFYTTTSGAQASDWIASEWQALSASLPNASVKQVSHSGYNQKSVVMTITGSEAPDEWIVIGGHLDSTIGSHTNEQSVAPGADDDASGIAAVTEVIRVLSENNFQPKRSIAFMAYAAEEVGLRGSQDLANQYKSEGKNVVSALQLDMTNYKGSAQDVVFITDYTDSNFTQYLTQLMDEYLPSLTYGFDTCGYACSDHASWHNAGYPAAMPFESKFNDYNPRIHTTQDTLANSDPTGSHAKKFTQLGLAYAIEMGSATGDTPTPGNQLEDGVPVTDLSGSRGSNVWYTFELETQKNLQITTSGGYGDLDLYVKFGSKASKQNWDCRPYLSGNNEVCTFNNASPGTYSVMLTGYSNYSGASLK.... The pIC50 is 3.8. (6) The target protein (Q61180) has sequence MLDHTRAPELNLDLDLDVSNSPKGSMKGNNFKEQDLCPPLPMQGLGKGDKREEQALGPEPSEPRQPTEEEEALIEFHRSYRELFQFFCNNTTIHGAIRLVCSKHNRMKTAFWAVLWLCTFGMMYWQFALLFEEYFSYPVSLNINLNSDKLVFPAVTVCTLNPYRYTEIKEDLEELDRITEQTLFDLYKYNSSYTRQAGGRRRSTRDLRGALPHPLQRLRTPPPPNPARSARSASSSVRDNNPQVDRKDWKIGFQLCNQNKSDCFYQTYSSGVDAVREWYRFHYINILSRLPDTSPALEEEALGSFIFTCRFNQAPCNQANYSQFHHPMYGNCYTFNNKNNSNLWMSSMPGVNNGLSLTLRTEQNDFIPLLSTVTGARVMVHGQDEPAFMDDGGFNVRPGVETSISMRKEALDSLGGNYGDCTENGSDVPVKNLYPSKYTQQVCIHSCFQENMIKKCGCAYIFYPKPKGVEFCDYLKQSSWGYCYYKLQAAFSLDSLGCFS.... The pIC50 is 8.2. The small molecule is Cl.Nc1nc(N)c(C(=O)N[C@H]2CCC[N+](CCCc3ccc(OCC(=O)NCCc4ccccn4)cc3)(CCCc3ccc(OCC(=O)NCCc4ccccn4)cc3)C2)nc1Cl.[Cl-].
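
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the bindingdb_ic50 dataset tooling. The function names are hypothetical, and the nanomolar helper assumes the raw IC50 is reported in nM, which this record does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nanomolar_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit, 1 nM = 1e-9 M)."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of 6.
    print(ic50_molar_to_pic50(1e-6))
    # A pIC50 of 7.7, as in example (1), is an IC50 of roughly 20 nM.
    print(pic50_to_ic50_molar(7.7) * 1e9)
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50. The 8.2 in example (6) is therefore about 25,000 times more potent than the 3.8 in example (5).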