Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. From a dataset of Drug-target binding data from BindingDB using IC50 measurements. (1) The target protein (P16497) has sequence MEQDTQHVKPLQTKTDIHAVLASNGRIIYISANSKLHLGYLQGEMIGSFLKTFLHEEDQFLVESYFYNEHHLMPCTFRFIKKDHTIVWVEAAVEIVTTRAERTEREIILKMKVLEEETGHQSLNCEKHEIEPASPESTTYITDDYERLVENLPSPLCISVKGKIVYVNSAMLSMLGAKSKDAIIGKSSYEFIEEEYHDIVKNRIIRMQKGMEVGMIEQTWKRLDGTPVHLEVKASPTVYKNQQAELLLLIDISSRKKFQTILQKSRERYQLLIQNSIDTIAVIHNGKWVFMNESGISLFEAATYEDLIGKNIYDQLHPCDHEDVKERIQNIAEQKTESEIVKQSWFTFQNRVIYTEMVCIPTTFFGEAAVQVILRDISERKQTEELMLKSEKLSIAGQLAAGIAHEIRNPLTAIKGFLQLMKPTMEGNEHYFDIVFSELSRIELILSELLMLAKPQQNAVKEYLNLKKLIGEVSALLETQANLNGIFIRTSYEKDSIYIN.... The pIC50 is 3.3. The drug is Cc1cc(N(C)C(=O)c2cc(I)cc(I)c2O)c(Cl)cc1C(C#N)c1ccc(Cl)cc1. (2) The small molecule is CCN(c1cc(C#CC2CCN(C)CC2)cc(C(=O)NCc2c(C)cc(C)[nH]c2=O)c1C)C1CCOCC1. The target protein sequence is ATKAARKSAPATGGVKKPHRYRPG. The pIC50 is 7.0. (3) The compound is N#Cc1ccc(-c2cnc3cnc(-c4ccc(C(=O)N5CCN(C6CC6)CC5)cc4)cn23)cc1. The target protein (Q9BUB5) has sequence MVSSQKLEKPIEMGSSEPLPIADGDRRRKKKRRGRATDSLPGKFEDMYKLTSELLGEGAYAKVQGAVSLQNGKEYAVKIIEKQAGHSRSRVFREVETLYQCQGNKNILELIEFFEDDTRFYLVFEKLQGGSILAHIQKQKHFNEREASRVVRDVAAALDFLHTKDKVSLCHLGWSAMAPSGLTAAPTSLGSSDPPTSASQVAGTTGIAHRDLKPENILCESPEKVSPVKICDFDLGSGMKLNNSCTPITTPELTTPCGSAEYMAPEVVEVFTDQATFYDKRCDLWSLGVVLYIMLSGYPPFVGHCGADCGWDRGEVCRVCQNKLFESIQEGKYEFPDKDWAHISSEAKDLISKLLVRDAKQRLSAAQVLQHPWVQGQAPEKGLPTPQVLQRNSSTMDLTLFAAEAIALNRQLSQHEENELAEEPEALADGLCSMKLSPPCKSRLARRRALAQAGRGEDRSPPTAL. The pIC50 is 6.0. (4) The pIC50 is 5.6. The drug is Nc1ncnc2c1ncn2C1CC(COP(=O)(O)O)C(OP(=O)(O)O)C1. 
The target protein (P49652) has sequence MTEALISAALNGTQPELLAGGWAAGNASTKCSLTKTGFQFYYLPTVYILVFITGFLGNSVAIWMFVFHMRPWSGISVYMFNLALADFLYVLTLPALIFYYFNKTDWIFGDVMCKLQRFIFHVNLYGSILFLTCISVHRYTGVVHPLKSLGRLKKKNAVYVSSLVWALVVAVIAPILFYSGTGVRRNKTITCYDTTADEYLRSYFVYSMCTTVFMFCIPFIVILGCYGLIVKALIYKDLDNSPLRRKSIYLVIIVLTVFAVSYLPFHVMKTLNLRARLDFQTPQMCAFNDKVYATYQVTRGLASLNSCVDPILYFLAGDTFRRRLSRATRKSSRRSEPNVQSKSEEMTLNILTEYKQNGDTSL. (5) The small molecule is CNc1nc(-c2ccc3c(c2)CCN3C(=O)c2ccccc2OCc2ccc(Cl)cc2)cs1. The target protein sequence is MSGGAAEKQSSTPGSLFLSPPAPAPKNGSSSDSSVGEKLGAAAADAVTGRTEEYRRRRHTMDKDSRGAAATTTTTEHRFFRRSVICDSNATALELPGLPLSLPQPSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGEQAVAGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKEPQEERSQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVYKGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNIVRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVLRSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGSVKIGDLGLATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCMLEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEIIEGCIRQNKDERYSIKDLLNHAFFQEETGVRVELAE. The pIC50 is 6.7. (6) The drug is CN[C@@H](C)C(=O)NC(Cc1ccc(NC(=O)c2ccc(C(=O)N[C@H]3C[C@@H](C(=O)NC4CCCc5ccccc54)N(C(=O)[C@@H](NC(=O)[C@H](C)NC)C(C)(C)C)C3)cc2)cc1)C(=O)N1C[Si](C)(C)C[C@H]1C(=O)NC1CCCc2ccccc21. The target protein sequence is SGVSSDRNFPNSTNSPRNPAMAEYEARIVTFGTWTSSVNKEQLARAGFYALGEGDKVKCFHCGGGLTDWKPSEDPWEQHAKWYPGCKYLLDEKGQEYINNIHLTHSLEESLGRTAE. The pIC50 is 7.9. (7) The compound is COC(=O)c1ccc(NS(=O)(=O)c2cccc(-c3ccccc3O)c2)cc1O. The target protein (Q16877) has sequence MASPRELTQNPLKKIWMPYSNGRPALHACQRGVCMTNCPTLIVMVGLPARGKTYISKKLTRYLNWIGVPTREFNVGQYRRDVVKTYKSFEFFLPDNEEGLKIRKQCALAALRDVRRFLSEEGGHVAVFDATNTTRERRATIFNFGEQNGYKTFFVESICVDPEVIAANIVQVKLGSPDYVNRDSDEATEDFMRRIECYENSYESLDEDLDRDLSYIKIMDVGQSYVVNRVADHIQSRIVYYLMNIHVTPRSIYLCRHGESELNLKGRIGGDPGLSPRGREFAKSLAQFISDQNIKDLKVWTSQMKRTIQTAEALGVPYEQWKVLNEIDAGVCEEMTYEEIQDNYPLEFALRDQDKYRYRYPKGESYEDLVQRLEPVIMELERQENVLVICHQAVMRCLLAYFLDKAAEQLPYLKCPLHTVLKLTPVAYGCKVESIFLNVAAVNTHRDRPQNVDISRPPEEALVTVPAHQ. The pIC50 is 5.2.
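The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be sketched in a few lines. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the input is assumed to already be in molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


# Illustrative values only: 100 nM is 1e-7 M.
print(ic50_to_pic50(1e-7))
print(pic50_to_ic50_nm(7.0))
```

Under this definition, each unit increase in pIC50 corresponds to a tenfold lower IC50, so the pIC50 of 7.0 in example (2) corresponds to an IC50 of 100 nM.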