Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. From a dataset of Drug-target binding data from BindingDB using IC50 measurements. (1) The small molecule is Cc1nnc(C2CCNCC2)o1. The target protein (P30191) has sequence MLLLLPWLFSLLWIENAQAQLEDEGNFYSENVSRILDNLLEGYDNRLRPGFGGAVTEVKTDIYVTSFGPVSDVEMEYTMDVFFRQTWTDERLKFKGPAEILSLNNLMVSKIWTPDTFFRNGKKSIAHNMTTPNKLFRLMHNGTILYTMRLTINADCPMRLVNFPMDGHACPLKFGSYAYPKSEIIYTWKKGPLYSVEVPEESSSLLQYDLIGQTVSSETIKSNTGEYVIMTVYFHLQRKMGYFMIQIYTPCIMTVILSQVSFWINKESVPARTVFGITTVLTMTTLSISARHSLPKVSYATAMDWFIAVCFAFVFSALIEFAAVNYFTNLQSQKAERQAQTAAKPPVAKSKTTESLEAEIVVHSDSKYHLKKRISSLTLPIVPSSEASKVLSRTPILPSTPVTPPLLLPAIGGTSKIDQYSRILFPVAFAGFNLVYWIVYLSKDTMEVSSTVE. The pIC50 is 4.1. (2) The small molecule is CN1CCc2cc(O)c(O)cc2C1. The target protein (Q9ERK7) has sequence MARCSNSMALLFSFGLLWLCSGVLGTDTEERLVEHLLDPSRYNKLIRPATNGSELVTVQLMVSLAQLISVHEREQIMTTNVWLTQEWEDYRLTWKPEDFDNMKKVRLPSKHIWLPDVVLYNNADGMYEVSFYSNAVVSYDGSIFWLPPAIYKSACKIEVKHFPFDQQNCTMKFRSWTYDRTEIDLVLKSDVASLDDFTPSGEWDIIALPGRRNENPDDSTYVDITYDFIIRRKPLFYTINLIIPCVLITSLAILVFYLPSDCGEKMTLCISVLLALTVFLLLISKIVPPTSLDVPLVGKYLMFTMVLVTFSIVTSVCVLNVHHRSPTTHTMAPWVKVVFLEKLPTLLFLQQPRHRCARQRLRLRRRQREREGAGTLFFREGPAADPCTCFVNPASMQGLAGAFQAEPAAAGLGRSMGPCSCGLREAVDGVRFIADHMRSEDDDQSVREDWKYVAMVIDRLFLWIFVFVCVFGTIGMFLQPLFQNYTATTFLHSDHSAPSS.... The pIC50 is 4.8. (3) The small molecule is CC(C)(O)COc1cc(-c2ccc(N3CC[C@H](Oc4cnccn4)C3)nc2)c2c(C#N)cnn2c1. 
The target protein sequence is HCYHKFAHKPPISSAEMTFRRPAQAFPVSYSSSGARRPSLDSMENQVSVDAFKILEDPKWEFPRKNLVLGKTLGEGEFGKVVKATAFHLKGRAGYTTVAVKMLKENASPSELRDLLSEFNVLKQVNHPHVIKLYGACSQDGPLLLIMEYAKYGSLRGFLRESRKVGPGYLGSGGSRNSSSLDHPDERALTMGDLISFAWQISQGMQYLAEMKLVHRDLAARNILVAEGRKMKISDFGLSRDVYEEDSYVKRSQGRIPVKWMAIESLFDHIYTTQSDVWSFGVLLWEIVTLGGNPYPGIPPERLFNLLKTGHRMERPDNCSEEMYRLMLQCWKQEPDKRPVFADISKDLEKMMVKRRDYLDLAASTPSDSLIYDDGLSEEETPLVDCNNAPLPRALPSTWIENKLYGMSDPNWPGESPVPLTRADGTNTGFPRYPNDSVYANWMLSPSAAKLMDTFDS. The pIC50 is 6.1. (4) The compound is CCCN(CCC)S(=O)(=O)c1ccc(C(=O)O)cc1. The target protein (Q4U2R8) has sequence MAFNDLLQQVGGVGRFQQIQVTLVVLPLLLMASHNTLQNFTAAIPTHHCRPPADANLSKNGGLEVWLPRDRQGQPESCLRFTSPQWGLPFLNGTEANGTGATEPCTDGWIYDNSTFPSTIVTEWDLVCSHRALRQLAQSLYMVGVLLGAMVFGYLADRLGRRKVLILNYLQTAVSGTCAAFAPNFPIYCAFRLLSGMALAGISLNCMTLNVEWMPIHTRACVGTLIGYVYSLGQFLLAGVAYAVPHWRHLQLLVSAPFFAFFIYSWFFIESARWHSSSGRLDLTLRALQRVARINGKREEGAKLSMEVLRASLQKELTMGKGQASAMELLRCPTLRHLFLCLSMLWFATSFAYYGLVMDLQGFGVSIYLIQVIFGAVDLPAKLVGFLVINSLGRRPAQMAALLLAGICILLNGVIPQDQSIVRTSLAVLGKGCLAASFNCIFLYTGELYPTMIRQTGMGMGSTMARVGSIVSPLVSMTAELYPSMPLFIYGAVPVAASAV.... The pIC50 is 5.1. (5) The compound is N[C@H]1CCCC[C@H]1Nc1cc2cn[nH]c(=O)c2c(Nc2cccc3cccnc23)n1. The target protein sequence is ASSGMADSANHLPFFFGNITREEAEDYLVQGGMSDGLYLLRQSRNYLGGFALSVAHGRKAHHYTIERELNGTYAIAGGRTHASPADLCHYHSQESDGLVCLLKKPFNRPQGVQPKTGPFEDLKENLIREYVKQTWNLQGQALEQAIISQKPQLEKLIATTAHEKMPWFHGKISREESEQIVLIGSKTNGKFLIRARDNNGSYALCLLHEGKVLHYRIDKDKTGKLSIPEGKKFDTLWQLVEHYSYKADGLLRVLTVPCQKIGTQGNVNFGGRPQLPGSHPATWSAGGIISRIKSYSFPKPGHRKSSPAQGNRQESTVSFNPYEPELAPWAADKGPQREALPMDTEVYESPYADPEEIRPKEVYLDRKLLTLEDKELGSGNFGTVKKGYYQMKKVVKTVAVKILKNEANDPALKDELLAEANVMQQLDNPYIVRMIGICEAESWMLVMEMAELGPLNKYLQQNRHVKDKNIIELVHQVSMGMKYLEESNFVHRDLAARNVL.... The pIC50 is 7.3. 
(6) The target protein (P49888) has sequence MNSELDYYEKFEEVHGILMYKDFVKYWDNVEAFQARPDDLVIATYPKSGTTWVSEIVYMIYKEGDVEKCKEDVIFNRIPFLECRKENLMNGVKQLDEMNSPRIVKTHLPPELLPASFWEKDCKIIYLCRNAKDVAVSFYYFFLMVAGHPNPGSFPEFVEKFMQGQVPYGSWYKHVKSWWEKGKSPRVLFLFYEDLKEDIRKEVIKLIHFLERKPSEELVDRIIHHTSFQEMKNNPSTNYTTLPDEIMNQKLSPFMRKGITGDWKNHFTVALNEKFDKHYEQQMKESTLKFRTEI. The pIC50 is 5.5. The small molecule is O=c1c([CH+][N-]OCCCCCO[N-][CH+]c2cc(Br)cc([N+](=O)[O-])c2O)coc2c(Cl)cc(Cl)cc12. (7) The small molecule is CC(N)Cc1ccc(O)cc1. The target protein (P30679) has sequence MARSLTWRCCPWCLTEDEKAAARVDQEINRILLEQKKQDRGELKLLLLGPGESGKSTFIKQMRIIHGAGYSEEERKGFRPLVYQNIFVSMRAMIEAMERLQIPFSRPESKHHASLVMSQDPYKVTTFEKRYAAAMQWLWRDAGIRAYYERRREFHLLDSAVYYLSHLERITEEGYVPTAQDVLRSRMPTTGINEYCFSVQKTNLRIVDVGGQKSERKKWIHCFENVIALIYLASLSEYDQCLEENNQENRMKESLALFGTILELPWFKSTSVILFLNKTDILEEKIPTSHLATYFPSFQGPKQDAEAAKRFILDMYTRMYTGCVDGPEGSKKGARSRRLFSHYTCATDTQNIRKVFKDVRDSVLARYLDEINLL. The pIC50 is 4.5. (8) The drug is O=C(O)c1cccc(-c2ccncc2)c1. The target protein sequence is MRVLALSAVFLVASIIGMPAVAKEWQENKSWNAHFTEHKSQGVVVLWNENKQQGFTNNLKRANQAFLPASTFKIPNSLIALDLGVVKDEHQVFKWDGQTRDIATWNRDHNLITAMKYSVVPVYQEFARQIGEARMSKMLHAFDYGNEDISGNVDSFWLDGGIRISATEQISFLRKLYHNKLHVSERSQRIVKQAMLTEANGDYIIRAKTGYSTRIEPKIGWWVGWVELDDNVWFFAMNMDMPTSDGLGLRQAITKEVLKQEKIIP. The pIC50 is 3.6. (9) The small molecule is CCC1(C)C(=O)Nc2ccc(S(=O)(=O)Nc3ccc(F)cc3F)cc2C(=O)N1C. The target protein (Q921V5) has sequence MRFRIYKRKVLILTLVVAACGFVLWSSNGRQRKSDALGPPLLDAEPVRGAGHLAVSVGIRRVSNESAAPLVPAVPRPEVDNLTLRYRSLVYQLNFDQMLRNVGNDGTWSPGELVLVVQVHNRPEYLRLLIDSLRKAQGIQEVLVIFSHDFWSAEINSLISRVDFCPVLQVFFPFSIQLYPNEFPGSDPRDCPRDLKKNAALKLGCINAEYPDSFGHYREAKFSQTKHHWWWKLHFVWERVKVLQDYTGLILFLEEDHYLAPDFYHVFKKMWKLKQQECPGCDVLSLGTYTTIRSFYGIADKVDVKTWKSTEHNMGLALTRDAYQKLIECTDTFCTYDDYNWDWTLQYLTLACLPKIWKVLVPQAPRIFHAGDCGMHHKKTCRPSTQSAQIESLLNSNKQYLFPETLVIGEKFPMAAISPPRKNGGWGDIRDHELCKSYRRLQ. The pIC50 is 6.2.
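For reference, the pIC50 scale defined at the top (pIC50 = -log10(IC50 in M)) can be converted to and from raw IC50 concentrations. The following is a minimal sketch; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced for illustration and are not part of the dataset or any BindingDB tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 value back to an IC50 expressed in nanomolar."""
    ic50_molar = 10 ** (-pic50)  # invert the -log10 transform
    return ic50_molar * 1e9      # mol/L -> nmol/L


if __name__ == "__main__":
    # pIC50 values taken from the examples above.
    for label, value in [("(5)", 7.3), ("(8)", 3.6)]:
        print(f"{label} pIC50 {value} -> IC50 of {pic50_to_ic50_nm(value):.3g} nM")
```

Under this definition each unit of pIC50 is a tenfold change in IC50, so the 7.3 reported in example (5) corresponds to roughly 50 nM, while the 3.6 in example (8) corresponds to roughly 250 µM.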