From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CC(C)OC(=O)C(C)(C)C(c1ccc(Nc2ccc3ccccc3c2)cc1)n1ccnc1. The target protein (O43174) has sequence MGLPALLASALCTFVLPLLLFLAAIKLWDLYCVSGRDRSCALPLPPGTMGFPFFGETLQMVLQRRKFLQMKRRKYGFIYKTHLFGRPTVRVMGADNVRRILLGEHRLVSVHWPASVRTILGSGCLSNLHDSSHKQRKKVIMRAFSREALECYVPVITEEVGSSLEQWLSCGERGLLVYPEVKRLMFRIAMRILLGCEPQLAGDGDSEQQLVEAFEEMTRNLFSLPIDVPFSGLYRGMKARNLIHARIEQNIRAKICGLRASEAGQGCKDALQLLIEHSWERGERLDMQALKQSSTELLFGGHETTASAATSLITYLGLYPHVLQKVREELKSKGLLCKSNQDNKLDMEILEQLKYIGCVIKETLRLNPPVPGGFRVALKTFELNGYQIPKGWNVIYSICDTHDVAEIFTNKEEFNPDRFMLPHPEDASRFSFIPFGGGLRSCVGKEFAKILLKIFTVELARHCDWQLLNGPPTMKTSPTVYPVDNLPARFTHFHGEI. The pIC50 is 6.5. (2) The compound is O=C(Nc1ccccc1-c1nnn(-c2ccccc2)n1)c1ccccc1. The target protein (P08183) has sequence MDLEGDRNGGAKKKNFFKLNNKSEKDKKEKKPTVSVFSMFRYSNWLDKLYMVVGTLAAIIHGAGLPLMMLVFGEMTDIFANAGNLEDLMSNITNRSDINDTGFFMNLEEDMTRYAYYYSGIGAGVLVAAYIQVSFWCLAAGRQIHKIRKQFFHAIMRQEIGWFDVHDVGELNTRLTDDVSKINEGIGDKIGMFFQSMATFFTGFIVGFTRGWKLTLVILAISPVLGLSAAVWAKILSSFTDKELLAYAKAGAVAEEVLAAIRTVIAFGGQKKELERYNKNLEEAKRIGIKKAITANISIGAAFLLIYASYALAFWYGTTLVLSGEYSIGQVLTVFFSVLIGAFSVGQASPSIEAFANARGAAYEIFKIIDNKPSIDSYSKSGHKPDNIKGNLEFRNVHFSYPSRKEVKILKGLNLKVQSGQTVALVGNSGCGKSTTVQLMQRLYDPTEGMVSVDGQDIRTINVRFLREIIGVVSQEPVLFATTIAENIRYGRENVTMDEI.... The pIC50 is 4.7. 
(3) The target protein (P09955) has sequence MLAFLILVTVTLASAHHSGEHFEGEKVFRVNVEDENDISLLHELASTRQIDFWKPDSVTQIKPHSTVDFRVKAEDILAVEDFLEQNELQYEVLINNLRSVLEAQFDSRVRTTGHSYEKYNNWETIEAWTKQVTSENPDLISRTAIGTTFLGNNIYLLKVGKPGPNKPAIFMDCGFHAREWISHAFCQWFVREAVLTYGYESHMTEFLNKLDFYVLPVLNIDGYIYTWTKNRMWRKTRSTNAGTTCIGTDPNRNFDAGWCTTGASTDPCDETYCGSAAESEKETKALADFIRNNLSSIKAYLTIHSYSQMILYPYSYDYKLPENNAELNNLAKAAVKELATLYGTKYTYGPGATTIYPAAGGSDDWAYDQGIKYSFTFELRDKGRYGFILPESQIQATCEETMLAIKYVTNYVLGHL. The pIC50 is 8.4. The drug is NCCCCCC(CS)C(=O)O. (4) The compound is CS(=O)(=O)c1cc(Cl)cc(COC(=O)N2CCC3(CC2)CN(C(=O)c2cccc(S(N)(=O)=O)c2)C3)c1. The target protein (Q13822) has sequence MARRSSFQSCQIISLFTFAVGVNICLGFTAHRIKRAEGWEEGPPTVLSDSPWTNISGSCKGRCFELQEAGPPDCRCDNLCKSYTSCCHDFDELCLKTARGWECTKDRCGEVRNEENACHCSEDCLARGDCCTNYQVVCKGESHWVDDDCEEIKAAECPAGFVRPPLIIFSVDGFRASYMKKGSKVMPNIEKLRSCGTHSPYMRPVYPTKTFPNLYTLATGLYPESHGIVGNSMYDPVFDATFHLRGREKFNHRWWGGQPLWITATKQGVKAGTFFWSVVIPHERRILTILQWLTLPDHERPSVYAFYSEQPDFSGHKYGPFGPEMTNPLREIDKIVGQLMDGLKQLKLHRCVNVIFVGDHGMEDVTCDRTEFLSNYLTNVDDITLVPGTLGRIRSKFSNNAKYDPKAIIANLTCKKPDQHFKPYLKQHLPKRLHYANNRRIEDIHLLVERRWHVARKPLDVYKKPSGKCFFQGDHGFDNKVNSMQTVFVGYGSTFKYKTK.... The pIC50 is 6.2. (5) The drug is Cc1ccn2nc3c(c2n1)CN([C@H]1CC[C@H](c2cc(F)c(F)cc2F)[C@@H](N)C1)C3. The target protein (P70470) has sequence MCGNNMSAPMPAVVPAARKATAAVIFLHGLGDTGHGWAEAFAGIKSSHIKYICPHAPVMPVTLNMSMMMPSWFDIIGLSPDSQEDESGIKQAAETVKALIDQEVKNGIPSNRIILGGFSQGGALSLYTALTTQQKLAGVTALSCWLPLRASFSQGPINSANRDISVLQCHGDCDPLVPLMFGSLTVERLKGLVNPANVTFKVYEGMMHSSCQQEMMDVKYFIDKLLPPID. The pIC50 is 5.7. (6) The compound is COc1ccc(CNC(=O)c2cc(=O)c3c(O)cc(OCCc4ccc(NC(=O)C(=O)O)c(OC)c4)cc3o2)cc1. The target protein (Q27743) has sequence MAPKAKIVLVGSGMIGGVMATLIVQKNLGDVVLFDIVKNMPHGKALDTSHTNVMAYSNCKVSGSNTYDDLAGADVVIVTAGFTKAPGKSDKEWNRDDLLPLNNKIMIEIGGHIKKNCPNAFIIVVTNPVDVMVQLLHQHSGVPKNKIIGLGGVLDTSRLKYYISQKLNVCPRDVNAHIVGAHGNKMVLLKRYITVGGIPLQEFINNKLISDAELEAIFDRTVNTALEIVNLHASPYVAPAAAIIEMAESYLKDLKKVLICSTLLEGQYGHSDIFGGTPVVLGANGVEQVIELQLNSEEKAKFDEAIAETKRMKALA. 
The pIC50 is 5.8. (7) The drug is O=C(Nc1nc(CN=C=S)cs1)c1ccccc1. The pIC50 is 6.5. The target protein (Q3THK7) has sequence MALCNGDSKPENAGGDLKDGSHHYEGAVVILDAGAQYGKVIDRRVRELFVQSEIFPLETPAFAIKEQGFRAIIISGGPNSVYAEDAPWFDPAIFTIGKPILGICYGMQMMNKVFGGTVHKKSVREDGVFNISMDNTCSLFRGLQKEEIVLLTHGDSVDKVADGFKVVARSGNIVAGIANESKKLYGVQFHPEVGLTENGKVILKNFLYDIAGCSGNFTVQNRELECIREIKEKVGTSKVLVLLSGGVDSTVCTALLNRALNQDQVIAVHIDNGFMRKRESQSVEEALKKLGIQVKVINAAHSFYNGTTTLPISDEDRTPRKRISKTLNMTTSPEEKRKIIGDTFVKIANEVIGEMSLKPEEVFLAQGTLRPDLIESASLVASGKAELIKTHHNDTELIRKLREEGKVIEPLKDFHKDEVRILGRELDLPEELVSRHPFPGPGLAIRVICAEEPYICKDFPETNNILKIVADFSASVKKPHTLLQRVKACTTEEDQEKLMQ....
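As a note on the label scale defined at the top (pIC50 = -log10(IC50 in M)), the conversion between IC50 and pIC50 can be sketched as below. This is a minimal illustration, not part of the dataset: the function names ic50_to_pic50 and pic50_to_ic50_nm are hypothetical, and the only assumption is that IC50 is expressed in mol/L before taking the logarithm.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 expressed in nanomolar."""
    return 10 ** (-pic50) * 1e9


# A 1 micromolar IC50 corresponds to a pIC50 of 6.
print(ic50_to_pic50(1e-6))

# A pIC50 of 6.5, as in example (1), corresponds to roughly 316 nM.
print(round(pic50_to_ic50_nm(6.5), 1))
```

Because the scale is logarithmic, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent binding.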