Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB measured as IC50. (1) The small molecule is CC(C)C[C@H](NC(=O)[C@H](C)NC(=O)[C@H](Cc1ccccc1)NC(=O)c1ccc(C(=O)O)cc1)C(=O)N[C@@H](CCCC[N+](C)(C)C)C(=O)N[C@@H](CO)C(N)=O. The target protein (O95931) has sequence MELSAIGEQVFAVESIRKKRVRKGKVEYLVKWKGWPPKYSTWEPEEHILDPRLVMAYEEKEERDRASGYRKRGPKPKRLLLQRLYSMDLRSSHKAKGKEKLCFSLTCPLGSGSPEGVVKAGAPELVDKGPLVPTLPFPLRKPRKAHKYLRLSRKKFPPRGPNLESHSHRRELFLQEPPAPDVLQAAGEWEPAAQPPEEEADADLAEGPPPWTPALPSSEVTVTDITANSITVTFREAQAAEGFFRDRSGKF. The pIC50 is 5.2. (2) The compound is O=C(O)COc1cccc(CCCn2cc(C(c3ccc(F)cc3)c3ccc(F)cc3)ccc2=O)c1. The target protein sequence is MANATLKPLCPVLKDMSLLGSHSNSSLRYMDHISVLLHGLAALLGLVENGLIVFVVGCRMRQTVVTTWALHLALSDLLASAALPFFTYFLAVGHSWELGTAFCKLHSSVFFLNMFASGFLLSAISLDRCVRVVHPVWAQNHRSVSVARRVCAVLWALALLNTVPYFVFRDTILRRDGRTMCYYNVLLLAPAGDHNATCGTRQMALALSKFLLAFALPLGIIAASHAVVSARLQRRPQGGVRPGRFVRLVAAVVAAFALCWGPYHAFSLIEARAHAVPSLRPLAWRALPFVSSLAFINSVVNPLLYVLTCPDVGRKLRRSLRAVLESVLVDDGELGSRYRRRGGSSSPAVASASSSLSLAPATHQACSLLRWLRGSRGTGSDDAPSSASGQG. The pIC50 is 6.4. (3) The compound is CC(=NNC(=O)c1cc(Br)cc(Br)c1O)c1cc2ccccc2[nH]1. The target protein (Q6GG09) has sequence MRKTKIVCTIGPASESEEMIEKLINAGMNVARLNFSHGSHEEHKGRIDTIRKVAKRLDKIVAILLDTKGPEIRTHNMKDGIIELERGNEVIVSMNEVEGTPEKFSVTYENLINDVQVGSYILLDDGLIELQVKDIDHAKKEVKCDILNSGELKNKKGVNLPGVRVSLPGITEKDAEDIRFGIKENVDFIAASFVRRPSDVLEIREILEEQKANISVFPKIENQEGIDNIEEILEVSDGLMVARGDMGVEIPPEKVPMVQKDLIRQCNKLGKPVITATQMLDSMQRNPRATRAEASDVANAIYDGTDAVMLSGETAAGLYPEEAVKTMRNIAVSAEAAQDYKKLLSDRTKLVETSLVNAIGISVAHTALNLNVKAIVAATESGSTARTISKYRPHSDIIAVTPSEETARQCSIVWGVQPVVKKGRKSTDALLNNAVATAVETGRVTNGDLIIITAGVPTGETGTTNMMKIHLVGDEIANGQGIGRGSVVGTTLVAETVKDL.... The pIC50 is 7.1. (4) The drug is Cc1cccc(C)c1NC(C#N)c1ccccc1OCc1ccccc1. 
The target protein (Q8NHU3) has sequence MDIIETAKLEEHLENQPSDPTNTYARPAEPVEEENKNGNGKPKSLSSGLRKGTKKYPDYIQIAMPTESRNKFPLEWWKTGIAFIYAVFNLVLTTVMITVVHERVPPKELSPPLPDKFFDYIDRVKWAFSVSEINGIILVGLWITQWLFLRYKSIVGRRFCFIIGTLYLYRCITMYVTTLPVPGMHFQCAPKLNGDSQAKVQRILRLISGGGLSITGSHILCGDFLFSGHTVTLTLTYLFIKEYSPRHFWWYHLICWLLSAAGIICILVAHEHYTIDVIIAYYITTRLFWWYHSMANEKNLKVSSQTNFLSRAWWFPIFYFFEKNVQGSIPCCFSWPLSWPPGCFKSSCKKYSRVQKIGEDNEKST. The pIC50 is 4.5. (5) The target protein (P35236) has sequence MVQAHGGRSRAQPLTLSLGAAMTQPPPEKTPAKKHVRLQERRGSNVALMLDVRSLGAVEPICSVNTPREVTLHFLRTAGHPLTRWALQRQPPSPKQLEEEFLKIPSNFVSPEDLDIPGHASKDRYKTILPNPQSRVCLGRAQSQEDGDYINANYIRGYDGKEKVYIATQGPMPNTVSDFWEMVWQEEVSLIVMLTQLREGKEKCVHYWPTEEETYGPFQIRIQDMKECPEYTVRQLTIQYQEERRSVKHILFSAWPDHQTPESAGPLLRLVAEVEESPETAAHPGPIVVHCSAGIGRTGCFIATRIGCQQLKARGEVDILGIVCQLRLDRGGMIQTAEQYQFLHHTLALYAGQLPEEPSP. The pIC50 is 5.4. The compound is COc1cc(/C=C2\SC(=O)N(CC(=O)Nc3cccc(C)c3)C2=O)cc(Cl)c1OCC(=O)O. (6) The drug is COc1ccccc1Nc1cc(S(=O)(=O)[O-])c(N)c2c1C(=O)c1ccccc1C2=O. The target protein (Q63371) has sequence MERDNGTIQAPGLPPTTCVYREDFKRLLLPPVYSVVLVVGLPLNVCVIAQICASRRTLTRSAVYTLNLALADLLYACSLPLLIYNYARGDHWPFGDLACRLVRFLFYANLHGSILFLTCISFQRYLGICHPLAPWHKRGGRRAAWVVCGVVWLVVTAQCLPTAVFAATGIQRNRTVCYDLSPPILSTRYLPYGMALTVIGFLLPFTALLACYCRMARRLCRQDGPAGPVAQERRSKAARMAVVVAAVFVISFLPFHITKTAYLAVRSTPGVSCPVLETFAAAYKGTRPFASANSVLDPILFYFTQQKFRRQPHDLLQKLTAKWQRQRV. The pIC50 is 4.0. (7) The drug is O=C(Nc1cccc(C(F)(F)F)c1)c1ccc2nc(-c3ccco3)c(-c3ccco3)nc2c1. The target protein sequence is MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKDQRTTGKSWGKPGYPWPLYGNEGLGWAGWLLSPRGSRPSWGPNDPRHRSRNVGKVIDTL. The pIC50 is 4.3. (8) The compound is O=C(O)CN1C(=O)/C(=C/c2cn(-c3ccccc3)nc2-c2cccs2)SC1=S. 
The target protein sequence is MPFVNKQFNYKDPVNGVDIAYIKIPNAGQMQPVKAFKIHNKIWVIPERDTFTNPEEGDLNPPPEAKQVPVSYYDSTYLSTDNEKDNYLKGVTKLFERIYSTDLGRMLLTSIVRGIPFWGGSTIDTELKVIDTNCINVIQPDGSYRSEELNLVIIGPSADIIQFECKSFGHDVLNLTRNGYGSTQYIRFSPDFTFGFEESLEVDTNPLLGAGKFATDPAVTLAHELIHAEHRLYGIAINPNRVFKVNTNAYYEMSGLEVSFEELRTFGGHDAKFIDSLQENEFRLYYYNKFKDVASTLNKAKSIIGTTASLQYMKNVFKEKYLLSEDTSGKFSVDKLKFDKLYKMLTEIYTEDNFVNFFKVINRKTYLNFDKAVFRINIVPDENYTIKDGFNLKGANLSTNFNGQNTEINSRNFTRLKNFTGLFEF. The pIC50 is 3.8. (9) The small molecule is Cc1cc(N2CCN(c3ccnc([C@@H](C)O)n3)CC2)nc(C)n1. The target protein (Q00796) has sequence MAAAAKPNNLSLVVHGPGDLRLENYPIPEPGPNEVLLRMHSVGICGSDVHYWEYGRIGNFIVKKPMVLGHEASGTVEKVGSSVKHLKPGDRVAIEPGAPRENDEFCKMGRYNLSPSIFFCATPPDDGNLCRFYKHNAAFCYKLPDNVTFEEGALIEPLSVGIHACRRGGVTLGHKVLVCGAGPIGMVTLLVAKAMGAAQVVVTDLSATRLSKAKEIGADLVLQISKESPQEIARKVEGQLGCKPEVTIECTGAEASIQAGIYATRSGGNLVLVGLGSEMTTVPLLHAAIREVDIKGVFRYCNTWPVAISMLASKSVNVKPLVTHRFPLEKALEAFETFKKGLGLKIMLKCDPSDQNP. The pIC50 is 7.5. (10) The pIC50 is 4.1. The small molecule is CCn1cnc2c(Nc3cc(F)cc(F)c3)nc(C#N)nc21. The target protein (P06873) has sequence MRLSVLLSLLPLALGAPAVEQRSEAAPLIEARGEMVANKYIVKFKEGSALSALDAAMEKISGKPDHVYKNVFSGFAATLDENMVRVLRAHPDVEYIEQDAVVTINAAQTNAPWGLARISSTSPGTSTYYYDESAGQGSCVYVIDTGIEASHPEFEGRAQMVKTYYYSSRDGNGHGTHCAGTVGSRTYGVAKKTQLFGVKVLDDNGSGQYSTIIAGMDFVASDKNNRNCPKGVVASLSLGGGYSSSVNSAAARLQSSGVMVAVAAGNNNADARNYSPASEPSVCTVGASDRYDRRSSFSNYGSVLDIFGPGTSILSTWIGGSTRSISGTSMATPHVAGLAAYLMTLGKTTAASACRYIADTANKGDLSNIPFGTVNLLAYNNYQA.
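
The label definition stated at the top, pIC50 = -log10(IC50 in M), can be sketched in code as follows. This is only an illustrative sketch: the function names are hypothetical and are not part of the dataset or of any BindingDB tooling, and the nanomolar helper assumes the usual 1 M = 1e9 nM conversion.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) gives a pIC50 of about 6.
    print(round(pic50_from_ic50_molar(1e-6), 3))
    # The pIC50 of 5.2 in example (1) corresponds to roughly 6.3 micromolar.
    print(round(ic50_nm_from_pic50(5.2), 1))
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50, which is why higher values mean more potent binding.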