Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them as pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB using IC50 measurements. (1) The drug is CC(C)C[C@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)OCc1ccccc1)C(=O)N[C@@H](CC(C)C)[C@@H](O)CC(=O)NC1CCCCC1.O=C(O)C(F)(F)F. The target protein sequence is MNNYFLRKENFFILFCFVFVSIFFVSNVTIIKCNNVENKIDNVGKKIENVGKKIGDMENKNDNVENKNDNVGNKNDNVKNASSDLYKYKLYGDIDEYAYYFLDIDIGKPSQRISLILDTGSSSLSFPCNGCKDCGIHMEKPYNLNYSKTSSILYCNKSNCPYGLKCVGNKCEYLQSYCEGSQIYGFYFSDIVTLPSYNNKNKISFEKLMGCHMHEESLFLHQQATGVLGFSLTKPNGVPTFVDLLFKHTPSLKPIYSICVSEHGGELIIGGYEPDYFLSNQKEKQKMDKSDNNSSNKGNVSIKLKNNDKNDDEENNSKDVIVSNNVEDIVWQAITRKYYYYIKIYGLDLYGTNIMDKKELDMLVDSGSTFTHIPENIYNQINYYLDILCIHDMTNIYEINKRLKLTNESLNKPLVYFEDFKTALKNIIQNENLCIKIVDGVQCWKSLENLPNLYITLSNNYKMIWKPSSYLYKKESFWCKGLEKQVNNKPILGLTFFKNK.... The pIC50 is 7.1. (2) The drug is CCN1CCN(C(=O)N[C@@H](C(=O)N[C@@H]2C(=O)N3[C@@H](C(=O)O)C(C)(C)S[C@H]23)c2ccccc2)C(=O)C1=O. The target protein sequence is MKLNHFQGALYPWRFCVIVGLLLAMVGAIVWRIVDLHVIDHDFLKGQGDARSVRHIAIPAHRGLITDRNGEPLAVSTPVTTLWANPKELMAAKERWPQLAAALGQDTKLFADRIEQNAEREFIYLVRGLTPEQGEGVISLKVPGVYSIEEFRRFYPAGEVVAHAVGFTDVDDRGREGIELAFDEWLAGVPGKRQVLKDRRGRVIKDVQVTKNAKPGKTLALSIDLRLQYLAHRELRNALVENGAKAGSLVIMDVKTGEILAMTNQPTYNPNNRRNLQPAAMRNRAMIDVFEPGSTVKPFSMSAALASGRWKPSDIVDVYPGTLQIGRYTIRDVSRNSRQLDLTGILIKSSNVGISKIAFDIGAESIYSVMQQVGLGQDTGLGFPGERVGNLPNHRKWPKAETATLAYGYGLSVTAIQLAHAYAALANDGKSVPLSMTRVDRVPDGVQVISPEVASTVQGMLQQVVEAQGGVFRAQVPGYHAAGKSGTARKVSVGTKGYRE.... The pIC50 is 6.8. (3) The compound is CC(=O)N[C@@H]1[C@@H](O)C[C@@](O)(C(=O)O)O[C@H]1[C@H](O)[C@H](O)CCP(=O)(O)O. 
The target protein (Q8TBE9) has sequence MGLSRVRAVFFDLDNTLIDTAGASRRGMLEVIKLLQSKYHYKEEAEIICDKVQVKLSKECFHPYNTCITDLRTSHWEEAIQETKGGAANRKLAEECYFLWKSTRLQHMTLAEDVKAMLTELRKEVRLLLLTNGDRQTQREKIEACACQSYFDAVVVGGEQREEKPAPSIFYYCCNLLGVQPGDCVMVGDTLETDIQGGLNAGLKATVWINKNGIVPLKSSPVPHYMVSSVLELPALLQSIDCKVSMST. The pIC50 is 5.0. (4) The drug is COc1cccc(Cn2cnc3cc(C(=O)O)ccc32)c1. The pIC50 is 2.9. The target protein sequence is MSVSIQGQFPGRRLRRLRKHDFSRRLVAENQLSVNDLIYPMFILMGKDRREKVDSMPGVERLSIDLMLEEAQYLANLGVPAIALFPVVNQDAKSLCAAEAYNPEGLVQRAVRALKEHVPQMGVITDVALDPFTTHGQDGIIDEQGYVLNDETTEVLVKQALSHAQAGADVVAPSDMMDGRIGRIRQALEEAGYIHTQIMAYSAKYASNYYGPFRDAVGSSANLKGGNKKNYQMDPANSDEALHEVAMDINEGADMVMVKPGMPYLDVVRRVKTELQVPTFAYQVSGEYAMHKAAIMNGWLKERETVFESLLCFKRAGADGILTYFAKEVAEWLAEDSAKAAQFLPKK. (5) The compound is CC1(C)CC=C(c2nc([C@@H]3CC(C)(C)O[C@](C)(C(=O)O)C3)ccc2NC(=O)c2nc(C#N)c[nH]2)CC1. The target protein (Q00495) has sequence MELGPPLVLLLATVWHGQGAPVIEPSGPELVVEPGETVTLRCVSNGSVEWDGPISPYWTLDPESPGSTLTTRNATFKNTGTYRCTELEDPMAGSTTIHLYVKDPAHSWNLLAQEVTVVEGQEAVLPCLITDPALKDSVSLMREGGRQVLRKTVYFFSAWRGFIIRKAKVLDSNTYVCKTMVNGRESTSTGIWLKVNRVHPEPPQIKLEPSKLVRIRGEAAQIVCSATNAEVGFNVILKRGDTKLEIPLNSDFQDNYYKKVRALSLNAVDFQDAGIYSCVASNDVGTRTATMNFQVVESAYLNLTSEQSLLQEVSVGDSLILTVHADAYPSIQHYNWTYLGPFFEDQRKLEFITQRAIYRYTFKLFLNRVKASEAGQYFLMAQNKAGWNNLTFELTLRYPPEVSVTWMPVNGSDVLFCDVSGYPQPSVTWMECRGHTDRCDEAQALQVWNDTHPEVLSQKPFDKVIIQSQLPIGTLKHNMTYFCKTHNSVGNSSQYFRAVS.... The pIC50 is 8.6. (6) The compound is Cn1c(=O)c2c(-c3cccc(C#N)c3)n3c(c2n(C)c1=O)C(c1nc(Cl)cs1)COC(CO)C3. 
The target protein sequence is WTTPILKKGYRQHLELSDVYQAPSSDSADHLSEQLEREWDREQASKKNPKLIHALRRCFFWRFIFYGILLYLGEVTKAVQPLLLGRIIASYDPDNKVERSIAIYLGIGLCLLFIVRTLLLHPAIFGLHRIGMQMRIAMFSLIYKKTLKLSSRVLDKISIGQLVSLLSNNLNKFDEGLALAHFVWIAPLQVALLMGLLWELLQFSAFCGLGLLIILVFFQAILGKMMVKYRVELKLTKKAAYTRFLTSSAFFFSGFFVVLLAVLPYTVLNGIILRKIFTTISFCIVLRMAVTRQLPTAVQTWYDSIGMITKVQDFLQYQEYKILEYNLMTTDVTMENVSAFWEEGFGELLEKVQLNNDDRKLSNDDDNPSLGHICFLENPVLKNISFKVEKGEMLAITGSTGAGKDISKFAEKDNTILGEGGVTLSGGQRARISLARAVYKDADVYLLDSPFGYLDVLTEEQIFENCVCKLMANKTRILVTSKMEHLKKADKILILHEGSS.... The pIC50 is 8.7. (7) The drug is CNc1ccccc1CN. The target protein (Q43077) has sequence MASTTTMRLALFSVLTLLSFHAVVSVTPLHVQHPLDPLTKEEFLAVQTIVQNKYPISNNRLAFHYIGLDDPEKDHVLRYETHPTLVSIPRKIFVVAIINSQTHEILINLRIRSIVSDNIHNGYGFPILSVDEQSLAIKLPLKYPPFIDSVKKRGLNLSEIVCSSFTMGWFGEEKNVRTVRLDCFMKESTVNIYVRPITGITIVADLDLMKIVEYHDRDIEAVPTAENTEYQVSKQSPPFGPKQHSLTSHQPQGPGFQINGHSVSWANWKFHIGFDVRAGIVISLASIYDLEKHKSRRVLYKGYISELFVPYQDPTEEFYFKTFFDSGEFGFGLSTVSLIPNRDCPPHAQFIDTYVHSANGTPILLKNAICVFEQYGNIMWRHTENGIPNESIEESRTEVNLIVRTIVTVGNYDNVIDWEFKASGSIKPSIALSGILEIKGTNIKHKDEIKEDLHGKLVSANSIGIYHDHFYIYYLDFDIDGTHNSFEKTSLKTVRIKDGS.... The pIC50 is 4.0.
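For reference, the pIC50 scale defined in the task header (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration of the stated formula only, not code from the dataset or from BindingDB; the function names `ic50_to_pic50`, `ic50_nm_to_pic50`, and `pic50_to_ic50` are hypothetical and chosen here for clarity.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nanomolar: float) -> float:
    """Same conversion for an IC50 given in nM (assumed input unit: 1 nM = 1e-9 M)."""
    return ic50_to_pic50(ic50_nanomolar * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)
```

Under this definition, a pIC50 of 7.1 corresponds to an IC50 of about 79 nM, and a pIC50 of 4.0 corresponds to 100 µM, so each unit of pIC50 is a tenfold change in potency.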