From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Cc1[nH]c2c(C#N)cnn2c(=O)c1C(C)C. The target protein sequence is MEPGCDEFLPPPECPVFEPSWAEFQDPLGYIAKIRPIAEKSGICKIRPPADWQPPFAVEVDNFRFTPRVQRLNELEAQTRVKLNYLDQIAKFWEIQGSSLKIPNVERKILDLYSLSKIVIEEGGYEAICKDRRWARVAQRLHYPPGKNIGSLLRSHYERIIYPYEMFQSGANHVQCNTHPFDNEVKDKEYKPHSIPLRQSVQPSKFSSYSRRAKRLQPDPEPTEEDIEKHPELKKLQIYGPGPKMMGLGLMAKDKDKTVHKKVTCPPTVTVKDEQSGGGNVSSTLLKQHLSLEPCTKTTMQLRKNHSSAQFIDSYICQVCSRGDEDDKLLFCDGCDDNYHIFCLLPPLPEIPRGIWRCPKCILAECKQPPEAFGFEQATQEYSLQSFGEMADSFKSDYFNMPVHMVPTELVEKEFWRLVSSIEEDVTVEYGADIHSKEFGSGFPVSNSKQNLSPEEKEYATSGWNLNVMPVLDQSVLCHINADISGMKVPWLYVGMVFSA.... The pIC50 is 5.2. (2) The drug is CC(C)c1nc(N2CCN(c3ncc(OCc4ccncc4C#N)cn3)CC2)no1. The target protein (Q7TQP3) has sequence MESSFSFGVILAVLTILIIAVNALVVVAMLLSIYKNDGVGLCFTLNLAVADTLIGVAISGLVTDQLSSSAQHTQKTLCSLRMAFVTSSAAASVLTVMLIAFDRYLAIKQPLRYFQIMNGLVAGACIAGLWLVSYLIGFLPLGVSIFQQTTYHGPCSFFAVFHPRFVLTLSCAGFFPAVLLFVFFYCDMLKIASVHSQQIRKMEHAGAMAGAYRPPRSVNDFKAVRTIAVLIGSFTLSWSPFLITSIVQVACHKCCLYQVLEKYLWLLGVGNSLLNPLIYAYWQREVRQQLYHMALGVKKFFTSILLLLPARNRGPERTRESAYHIVTISHPELDG. The pIC50 is 5.6. (3) The compound is CCCCCCCCCCCCCCCCOCCCOP(=O)(O)CO[C@H](C)Cn1cnc2c(N)ncnc21. The target protein sequence is PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPENPYNTPVFAIKRKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYVDDLYVGSDLEIGQHRTKIEELRQHLLRWGLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTNRGRQKVVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQ.... 
The pIC50 is 8.7. (4) The pIC50 is 7.1. The target protein (Q54801) has sequence MTKKIVAIWAQDEEGVIGKENRLPWHLPAELQHFKETTLNHAILMGRVTFDGMGRRLLPKRETLILTRNPEEKIDGVATFQDVQSVLDWYQAQEKNLYIIGGKQIFQAFEPYLDEVIVTHIHARVEGDTYFPEELDLSLFETVSSKFYAKDEKNPYDFTIQYRKRKEV. The drug is COc1cc(Cc2cnc(N)nc2N)cc(OC)c1OC. (5) The compound is O=C(NCC(F)(F)F)[C@@H]1CN(Cc2ccc(-c3ccccn3)o2)CCN1C[C@@H](O)C[C@@H](Cc1ccccc1)C(=O)N[C@H]1c2ccccc2OC[C@H]1O. The target protein sequence is PQITLWKRPIVTIKIGGQLKEALLDTGADDTVLEEMSLPGKWKPKIIGGIGGFVKVRQYDQVPIEICGHKVIGTVLIGPTPANIIGRNLMTQLGCTLNF. The pIC50 is 9.3. (6) The drug is O=C1Nc2ccc(Cl)cc2/C1=C\c1ccc(C(=O)N/N=C/c2cccc(O)c2O)cc1. The pIC50 is 4.0. The target protein (P07374) has sequence MKLSPREVEKLGLHNAGYLAQKRLARGVRLNYTEAVALIASQIMEYARDGEKTVAQLMCLGQHLLGRRQVLPAVPHLLNAVQVEATFPDGTKLVTVHDPISRENGELQEALFGSLLPVPSLDKFAETKEDNRIPGEILCEDECLTLNIGRKAVILKVTSKGDRPIQVGSHYHFIEVNPYLTFDRRKAYGMRLNIAAGTAVRFEPGDCKSVTLVSIEGNKVIRGGNAIADGPVNETNLEAAMHAVRSKGFGHEEEKDASEGFTKEDPNCPFNTFIHRKEYANKYGPTTGDKIRLGDTNLLAEIEKDYALYGDECVFGGGKVIRDGMGQSCGHPPAISLDTVITNAVIIDYTGIIKADIGIKDGLIASIGKAGNPDIMNGVFSNMIIGANTEVIAGEGLIVTAGAIDCHVHYICPQLVYEAISSGITTLVGGGTGPAAGTRATTCTPSPTQMRLMLQSTDDLPLNFGFTGKGSSSKPDELHEIIKAGAMGLKLHEDWGSTPA.... (7) The small molecule is Cc1onc(-c2ccccc2Cl)c1C(=O)N[C@@H]1C(=O)N2[C@@H](C(=O)O)C(C)(C)S[C@H]12. The target protein sequence is MMKKSLCCALLLTASFSTFAAAKTEQQIADIVNRTITPLMQEQAIPGMAVAVIYQGKPYYFTWGKADIANNHPVTQQTLFELGSVSKTFNGVLGGDCIARGEIKLSDPVTKYWPELTGKQWQGIRLLHLATYTAGGLPLQIPDDVRDKAALLHFYQNWQPQWTPGAKRLYANSSIGLFGALAVKPSGMSYEEAMTRRVLQPLKLAHTWITVPENEQKDYAWGYREGKPVHVSPGQLDAEAYGVKSSVIDMARWVQANMDASHVQEKTLQQGIALAQSRYWRIGDMYQGLGWEMLNWPLKADSIINGSDSKVALAALPAVEVNPPAPAVKASWVHKTGSTGGFGSYVAFVPEKNLGIVMLANKSYPNPVRVEAAWRILEKLQ. The pIC50 is 8.4.
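
The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and the nanomolar variant assumes the raw IC50 value is reported in nM.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) corresponds to a pIC50 of 6.
    print(pic50_from_ic50_molar(1e-6))
    # A pIC50 of 8.4, as in example (7), corresponds to an IC50 of roughly 4 nM.
    print(ic50_molar_from_pic50(8.4) * 1e9)
```

Because the scale is logarithmic, each additional pIC50 unit means a tenfold lower IC50, which is why higher values indicate a more potent compound.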