From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is O=C(NCC1CCCO1)c1ccc(N2CCC(Oc3ccccc3Cl)CC2)nn1. The target protein (P13516) has sequence MPAHMLQEISSSYTTTTTITAPPSGNEREKVKTVPLHLEEDIRPEMKEDIHDPTYQDEEGPPPKLEYVWRNIILMVLLHLGGLYGIILVPSCKLYTCLFGIFYYMTSALGITAGAHRLWSHRTYKARLPLRIFLIIANTMAFQNDVYEWARDHRAHHKFSETHADPHNSRRGFFFSHVGWLLVRKHPAVKEKGGKLDMSDLKAEKLVMFQRRYYKPGLLLMCFILPTLVPWYCWGETFVNSLFVSTFLRYTLVLNATWLVNSAAHLYGYRPYDKNIQSRENILVSLGAVGEGFHNYHHTFPFDYSASEYRWHINFTTFFIDCMAALGLAYDRKKVSKATVLARIKRTGDGSHKSS. The pIC50 is 5.7. (2) The compound is CC(C)[C@H](NC(=O)[C@@H]1CSSC[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](Cc2ccccc2)C(=O)N[C@@H](C(c2ccccc2)c2ccccc2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc2ccc(O)cc2)C(=O)N1)C(=O)O. The target protein (Q9UKP6) has sequence MALTPESPSSFPGLAATGSSVPEPPGGPNATLNSSWASPTEPSSLEDLVATGTIGTLLSAMGVVGVVGNAYTLVVTCRSLRAVASMYVYVVNLALADLLYLLSIPFIVATYVTKEWHFGDVGCRVLFGLDFLTMHASIFTLTVMSSERYAAVLRPLDTVQRPKGYRKLLALGTWLLALLLTLPVMLAMRLVRRGPKSLCLPAWGPRAHRAYLTLLFATSIAGPGLLIGLLYARLARAYRRSQRASFKRARRPGARALRLVLGIVLLFWACFLPFWLWQLLAQYHQAPLAPRTARIVNYLTTCLTYGNSCANPFLYTLLTRNYRDHLRGRVRGPGSGGGRGPVPSLQPRARFQRCSGRSLSSCSPQPTDSLVLAPAAPARPAPEGPRAPA. The pIC50 is 6.6. (3) The compound is O=C(CCl)Nc1cn(-c2ccccc2)nc1-c1ccc(F)cc1. 
The target protein (P9WHJ3) has sequence MTQTPDREKALELAVAQIEKSYGKGSVMRLGDEARQPISVIPTGSIALDVALGIGGLPRGRVIEIYGPESSGKTTVALHAVANAQAAGGVAAFIDAEHALDPDYAKKLGVDTDSLLVSQPDTGEQALEIADMLIRSGALDIVVIDSVAALVPRAELEGEMGDSHVGLQARLMSQALRKMTGALNNSGTTAIFINQLRDKIGVMFGSPETTTGGKALKFYASVRMDVRRVETLKDGTNAVGNRTRVKVVKNKCLAEGTRIFDPVTGTTHRIEDVVDGRKPIHVVAAAKDGTLHARPVVSWFDQGTRDVIGLRIAGGAIVWATPDHKVLTEYGWRAAGELRKGDRVAQPRRFDGFGDSAPIPADHARLLGYLIGDGRDGWVGGKTPINFINVQRALIDDVTRIAATLGCAAHPQGRISLAIAHRPGERNGVADLCQQAGIYGKLAWEKTIPNWFFEPDIAADIVGNLLFGLFESDGWVSREQTGALRVGYTTTSEQLAHQIH.... The pIC50 is 6.7. (4) The compound is Oc1ccc(Nc2nc(-c3ccc(O)cc3O)cs2)cc1. The target protein (P04745) has sequence MKLFWLLFTIGFCWAQYSSNTQQGRTSIVHLFEWRWVDIALECERYLAPKGFGGVQVSPPNENVAIHNPFRPWWERYQPVSYKLCTRSGNEDEFRNMVTRCNNVGVRIYVDAVINHMCGNAVSAGTSSTCGSYFNPGSRDFPAVPYSGWDFNDGKCKTGSGDIENYNDATQVRDCRLSGLLDLALGKDYVRSKIAEYMNHLIDIGVAGFRIDASKHMWPGDIKAILDKLHNLNSNWFPEGSKPFIYQEVIDLGGEPIKSSDYFGNGRVTEFKYGAKLGTVIRKWNGEKMSYLKNWGEGWGFMPSDRALVFVDNHDNQRGHGAGGASILTFWDARLYKMAVGFMLAHPYGFTRVMSSYRWPRYFENGKDVNDWVGPPNDNGVTKEVTINPDTTCGNDWVCEHRWRQIRNMVNFRNVVDGQPFTNWYDNGSNQVAFGRGNRGFIVFNNDDWTFSLTLQTGLPAGTYCDVISGDKINGNCTGIKIYVSDDGKAHFSISNSAED.... The pIC50 is 3.9. (5) The drug is COc1ccc2c(c1)[C@@]13CCCC[C@@H]1[C@@H](C2)N(C)CC3. The target protein (P19224) has sequence MACLLRSFQRISAGVFFLALWGMVVGDKLLVVPQDGSHWLSMKDIVEVLSDRGHEIVVVVPEVNLLLKESKYYTRKIYPVPYDQEELKNRYQSFGNNHFAERSFLTAPQTEYRNNMIVIGLYFINCQSLLQDRDTLNFFKESKFDALFTDPALPCGVILAEYLGLPSVYLFRGFPCSLEHTFSRSPDPVSYIPRCYTKFSDHMTFSQRVANFLVNLLEPYLFYCLFSKYEELASAVLKRDVDIITLYQKVSVWLLRYDFVLEYPRPVMPNMVFIGGINCKKRKDLSQEFEAYINASGEHGIVVFSLGSMVSEIPEKKAMAIADALGKIPQTVLWRYTGTRPSNLANNTILVKWLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTLNVLEMTSEDLENALKAVINDKSYKENIMRLSSLHKDRPVEPLDLAVFWVEFVMRHKGAPHLRPAAHDLTWYQYHSLDVIGFLLAVVLT.... The pIC50 is 3.5. (6) The compound is CC(C)C[C@H](NC(=O)[C@H](Cc1ccccc1)NCC(=O)O)C(=O)O. 
The target protein (Q61391) has sequence MGRSESQMDITDINAPKPKKKQRWTPLEISLSVLVLLLTIIAVTMIALYATYDDGICKSSDCIKSAARLIQNMDASVEPCTDFFKYACGGWLKRNVIPETSSRYSNFDILRDELEVILKDVLQEPKTEDIVAVQKAKTLYRSCINESAIDSRGGQPLLKLLPDIYGWPVASDNWDQTYGTSWTAEKSIAQLNSKYGKKVLINFFVGTDDKNSTQHIIHFDQPRLGLPSRDYYECTGIYKEACTAYVDFMISVARLIRQEQSLPIDENQLSLEMNKVMELEKEIANATTKPEDRNDPMLLYNKMTLAKLQNNFSLEVNGKSFSWSNFTNEIMSTVNINIQNEEEVVVYAPEYLTKLKPILTKYSPRDLQNLMSWRFIMDLVSSLSRNYKESRNAFRKALYGTTSETATWRRCANYVNGNMENAVGRLYVEAAFAGESKHVVEDLIAQIREVFIQTLDDLTWMDAETKKKAEEKALAIKERIGYPDDIISNENKLNNEYLEL.... The pIC50 is 4.0. (7) The drug is COc1ccccc1N1CCN(c2ncnc3c2cnn3-c2ccccc2)C(C)C1. The target protein (P11168) has sequence MTEDKVTGTLVFTVITAVLGSFQFGYDIGVINAPQQVIISHYRHVLGVPLDDRKAINNYVINSTDELPTISYSMNPKPTPWAEEETVAAAQLITMLWSLSVSSFAVGGMTASFFGGWLGDTLGRIKAMLVANILSLVGALLMGFSKLGPSHILIIAGRSISGLYCGLISGLVPMYIGEIAPTALRGALGTFHQLAIVTGILISQIIGLEFILGNYDLWHILLGLSGVRAILQSLLLFFCPESPRYLYIKLDEEVKAKQSLKRLRGYDDVTKDINEMRKEREEASSEQKVSIIQLFTNSSYRQPILVALMLHVAQQFSGINGIFYYSTSIFQTAGISKPVYATIGVGAVNMVFTAVSVFLVEKAGRRSLFLIGMSGMFVCAIFMSVGLVLLNKFSWMSYVSMIAIFLFVSFFEIGPGPIPWFMVAEFFSQGPRPAALAIAAFSNWTCNFIVALCFQYIADFCGPYVFFLFAGVLLAFTLFTFFKVPETKGKSFEEIAAEFQ.... The pIC50 is 5.6. (8) The drug is C[C@H]1CN(c2cc(-c3n[nH]c4ccc(OC5(C)CC5)cc34)ncn2)C[C@@H](C)O1. The target protein sequence is HSDSISSLASEREYITSLDLSANELRDIDALSQKCCISVHLEHLEKLELHQNALTSFPQQLCETLKSLTHLDLHSNKFTSFPSYLLKMSCIANLDVSRNDIGPSVVLDPTVKCPTLKQFNLSYNQLSFVPENLTDVVEKLEQLILEGNKISGICSPLRLKELKILNLSKNHISSLSENFLEACPKVESFSARMNFLAAMPFLPPSMTILKLSQNKFSCIPEAILNLPHLRSLDMSSNDIQYLPGPAHWKSLNLRELLFSHNQISILDLSEKAYLWSRVEKLHLSHNKLKEIPPEIGCLENLTSLDVSYNLELRSFPNEMGKLSKIWDLPLDELHLNFDFKHIGCKAKDIIRFLQQRLKKAVPYNRMKLMIVGNTGSGKTTLLQQLMKTKKSDLGMQSATVGIDVKDWPIQIRDKRKRDLVLNVWDFAGREEFYSTHPHFMTQRALYLAVYDLSKGQAEVDAMKPWLFNIKARASSSPVILVGTHLDVSDEKQRKACMSKI.... The pIC50 is 8.2. (9) The compound is O=C1NC(=O)c2c1c1c3ccccc3[nH]c1c1[nH]c3ccc(Br)cc3c21. 
The target protein (P11275) has sequence MATITCTRFTEEYQLFEELGKGAFSVVRRCVKVLAGQEYAAKIINTKKLSARDHQKLEREARICRLLKHPNIVRLHDSISEEGHHYLIFDLVTGGELFEDIVAREYYSEADASHCIQQILEAVLHCHQMGVVHRDLKPENLLLASKLKGAAVKLADFGLAIEVEGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVDLWACGVILYILLVGYPPFWDEDQHRLYQQIKAGAYDFPSPEWDTVTPEAKDLINKMLTINPSKRITAAEALKHPWISHRSTVASCMHRQETVDCLKKFNARRKLKGAILTTMLATRNFSGGKSGGNKKNDGVKESSESTNTTIEDEDTKVRKQEIIKVTEQLIEAISNGDFESYTKMCDPGMTAFEPEALGNLVEGLDFHRFYFENLWSRNSKPVHTTILNPHIHLMGDESACIAYIRITQYLDAGGIPRTAQSEETRVWHRRDGKWQIVHFHRSGAPSVLPH. The pIC50 is 5.9. (10) The small molecule is CC1(NCC(=O)N2C[C@H](F)C[C@@H]2C#N)CCN(S(C)(=O)=O)CC1. The target protein (P28843) has sequence MKTPWKVLLGLLGVAALVTIITVPIVLLSKDEAAADSRRTYSLADYLKSTFRVKSYSLWWVSDFEYLYKQENNILLLNAEHGNSSIFLENSTFESFGYHSVSPDRLFVLLEYNYVKQWRHSYTASYNIYDVNKRQLITEEKIPNNTQWITWSPEGHKLAYVWKNDIYVKVEPHLPSHRITSTGEENVIYNGITDWVYEEEVFGAYSALWWSPNNTFLAYAQFNDTGVPLIEYSFYSDESLQYPKTVWIPYPKAGAVNPTVKFFIVNIDSLSSSSSAAPIQIPAPASVARGDHYLCDVVWATEERISLQWLRRIQNYSVMAICDYDKINLTWNCPSEQQHVEMSTTGWVGRFRPAEPHFTSDGSSFYKIISDKDGYKHICHFPKDKKDCTFITKGAWEVISIEALTSDYLYYISNQYKEMPGGRNLYKIQLTDHTNVKCLSCDLNPERCQYYAVSFSKEAKYYQLGCWGPGLPLYTLHRSTDHKELRVLEDNSALDRMLQD.... The pIC50 is 8.6.
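
The pIC50 definition used in these records, pIC50 = -log10(IC50 in M), can be sketched in Python as follows. This is a minimal illustration and not part of the dataset's own tooling. The function names ic50_nm_to_pic50 and pic50_to_ic50_nm are hypothetical, and the nanomolar input unit is an assumption made for convenience; the records above state only the final pIC50 values.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in mol/L), so a lower IC50 gives a higher pIC50.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: recover IC50 in nanomolar from a pIC50 value."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # Convert two of the pIC50 values listed above back to concentrations.
    for value in (8.6, 3.5):
        print(f"pIC50 {value} -> IC50 of about {pic50_to_ic50_nm(value):.3g} nM")
```

Read this way, the pIC50 of 8.6 in record (10) corresponds to an IC50 of about 2.5 nM, while the 3.5 in record (5) corresponds to about 316 µM, which is why higher values mean more potent binding.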