Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd, drug-target binding data from BindingDB using Kd measurements. (1) The small molecule is C[N+]1(C)[C@H]2CC(OC(=O)[C@H](CO)c3ccccc3)C[C@@H]1[C@H]1O[C@@H]21. The target protein sequence is MTLHSQSTTSPLFPQISSSWVHSPSEAGLPLGTVTQLGSYQISQETGQFSSQDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLASADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRCKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEIEGRIEGRIEGRTRSQITKRKRMSLIKEKKAAQTLSAILLAFIITWTPYNIMVLVNTFADSAIPKTYWNLGYWLCYINSTVNPVAYALSNKTFRTTFKTLLLSQSDKRKRRKQQYQQRQSVIFHKRVPEQAL. The pKd is 9.2. (2) The drug is O=C(N/N=C/c1ccc(Sc2ccccn2)o1)c1cccc(F)c1. The target protein sequence is SKRSSDPSPAGDNEIERVFVWDLDETIIIFHSLLTGTFASRYGKDTTTSVRIGLMMEEMIFNLADTHLFFNDLEDCDQIHVDDVSSDDNGQDLSTYNFSADGFHSSAPGANLCLGSGVHGGVDWMRKLAFRYRRVKEMYNTYKNNVGGLIGTPKRETWLQLRAELEALTDLWLTHSLKALNLINSRPNCVNVLVTTTQLIPALAKVLLYGLGSVFPIENIYSATKTGKESCFERIMQRFGRKAVYVVIGDGVEEEQGAKKHNMPFWRISCHADLEALRHALELEYL. The pKd is 5.2. (3) The drug is COc1cc2c(N3CCN(C(=O)Nc4ccc(OC(C)C)cc4)CC3)ncnc2cc1OCCCN1CCCCC1. The target protein sequence is HHSTVADGLITTLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGHYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKSDVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIHQAFETMFQES. The pKd is 5.0. (4) The compound is CNC(=O)c1c(F)cccc1Nc1nc(Nc2cc3c(cc2OC)CCN3C(=O)CN(C)C)nc2[nH]ccc12. 
The target protein (Q15746) has sequence MGDVKLVASSHISKTSLSVDPSRVDSMPLTEAPAFILPPRNLCIKEGATAKFEGRVRGYPEPQVTWHRNGQPITSGGRFLLDCGIRGTFSLVIHAVHEEDRGKYTCEATNGSGARQVTVELTVEGSFAKQLGQPVVSKTLGDRFSAPAVETRPSIWGECPPKFATKLGRVVVKEGQMGRFSCKITGRPQPQVTWLKGNVPLQPSARVSVSEKNGMQVLEIHGVNQDDVGVYTCLVVNGSGKASMSAELSIQGLDSANRSFVRETKATNSDVRKEVTNVISKESKLDSLEAAAKSKNCSSPQRGGSPPWAANSQPQPPRESKLESCKDSPRTAPQTPVLQKTSSSITLQAARVQPEPRAPGLGVLSPSGEERKRPAPPRPATFPTRQPGLGSQDVVSKAANRRIPMEGQRDSAFPKFESKPQSQEVKENQTVKFRCEVSGIPKPEVAWFLEGTPVRRQEGSIEVYEDAGSHYLCLLKARTRDSGTYSCTASNAQGQLSCSW.... The pKd is 8.0. (5) The compound is NC(=O)c1cccc(Br)c1. The target protein sequence is MPIHNLNHVNMFLQVIASGSISSAARILRKSHTAVSSAVSNLEIDLCVELVRRDGYKVEPTEQALRLIPYMRSLLNYQQLIGDIAFNLNKGPRNLRVLLDTAIPPSFCDTVSSVLLDDFNMVSLIRTSPADSLATIKQDNAEIDIAITIDEELKISRFNQCVLGYTKAFVVAHPQHPLCNASLHSIASLANYRQISLGSRSGQHSNLLRPVSDKVLFVENFDDMLRLVEAGVGWGIAPHYFVEERLRNGTLAVLSELYEPGGIDTKVYCYYNTALESERSFLRFLESARQRLRELGRQRFDDAPAWQPSIVETAQRRSGPKALAYRQRAAPE. The pKd is 5.1. (6) The compound is Nc1nc(N)c2nc(-c3cccc(O)c3)c(-c3cccc(O)c3)nc2n1. The target protein (O95835) has sequence MKRSEKPEGYRQMRPKTFPASNYTVSSRQMLQEIRESLRNLSKPSDAAKAEHNMSKMSTEDPRQVRNPPKFGTHHKALQEIRNSLLPFANETNSSRSTSEVNPQMLQDLQAAGFDEDMVIQALQKTNNRSIEAAIEFISKMSYQDPRREQMAAAAARPINASMKPGNVQQSVNRKQSWKGSKESLVPQRHGPPLGESVAYHSESPNSQTDVGRPLSGSGISAFVQAHPSNGQRVNPPPPPQVRSVTPPPPPRGQTPPPRGTTPPPPSWEPNSQTKRYSGNMEYVISRISPVPPGAWQEGYPPPPLNTSPMNPPNQGQRGISSVPVGRQPIIMQSSSKFNFPSGRPGMQNGTGQTDFMIHQNVVPAGTVNRQPPPPYPLTAANGQSPSALQTGGSAAPSSYTNGSIPQSMMVPNRNSHNMELYNISVPGLQTNWPQSSSAPAQSSPSSGHEIPTWQPNIPVRSNSFNNPLGNRASHSANSQPSATTVTAITPAPIQQPVKS.... The pKd is 5.3. (7) The compound is CCC(C(=O)Nc1ccncc1)c1ccccc1. 
The target protein sequence is MSAVALPRVSGGHDEHGHLEEFRTDPIGLMQRVRDECGDVGTFQLAGKQVVLLSGSHANEFFFRAGDDDLDQAKAYPLMTPIFGEGVVFDASPERRKEMLHNAALRGEQMKGHAATIEDQVRRMIADWGEAGEIDLLDFFAELTIYTSSACLIGKKFRDQLDGRFAKLYHELERGTDPLAYVDPYLPIESFRRRDEARNGLVALVADIMNGRIANPPTDKSDRDMLDVLIAVKAETGTPRFSADEITGMFISMMFAGHHTSSGTASWTLIELMRHRDAYAAVIDELDELYGDGRSVSFHALRQIPQLENVLKETLRLHPPLIILMRVAKGEFEVQGHRIHEGDLVAASPAISNRIPEDFPDPHDFVPARYEQPRQEDLLNRWTWIPFGAGRHRCVGAAFAIMQIKAIFSVLLREYEFEMAQPPESYRNDHSKMVVQLAQPACVRYRRRTGV. The pKd is 4.7. (8) The compound is CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H]1CCCN1C(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCC(=O)O)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O. The target protein (P9WGI5) has sequence MADAPTRATTSRVDSDLDAQSPAADLVRVYLNGIGKTALLNAAGEVELAKRIEAGLYAEHLLETRKRLGENRKRDLAAVVRDGEAARRHLLEANLRLVVSLAKRYTGRGMPLLDLIQEGNLGLIRAMEKFDYTKGFKFSTYATWWIRQAITRGMADQSRTIRLPVHLVEQVNKLARIKREMHQHLGREATDEELAAESGIPIDKINDLLEHSRDPVSLDMPVGSEEEAPLGDFIEDAEAMSAENAVIAELLHTDIRSVLATLDEREHQVIRLRFGLDDGQPRTLDQIGKLFGLSRERVRQIERDVMSKLRHGERADRLRSYAS. The pKd is 6.9. (9) The compound is COc1cc(N2CCC(N3CCN(C)CC3)CC2)ccc1Nc1ncc(Cl)c(Nc2ccccc2S(=O)(=O)C(C)C)n1. The target protein (Q8TBX8) has sequence MASSSVPPATVSAATAGPGPGFGFASKTKKKHFVQQKVKVFRAADPLVGVFLWGVAHSINELSQVPPPVMLLPDDFKASSKIKVNNHLFHRENLPSHFKFKEYCPQVFRNLRDRFGIDDQDYLVSLTRNPPSESEGSDGRFLISYDRTLVIKEVSSEDIADMHSNLSNYHQYIVKCHGNTLLPQFLGMYRVSVDNEDSYMLVMRNMFSHRLPVHRKYDLKGSLVSREASDKEKVKELPTLKDMDFLNKNQKVYIGEEEKKIFLEKLKRDVEFLVQLKIMDYSLLLGIHDIIRGSEPEEEAPVREDESEVDGDCSLTGPPALVGSYGTSPEGIGGYIHSHRPLGPGEFESFIDVYAIRSAEGAPQKEVYFMGLIDILTQYDAKKKAAHAAKTVKHGAGAEISTVHPEQYAKRFLDFITNIFA. The pKd is 5.6.
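
The pKd definition given at the top (pKd = -log10(Kd in M)) can be illustrated with a small conversion sketch. The function names kd_to_pkd and pkd_to_kd are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of any dataset tooling.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the definition: Kd in mol/L = 10 ** (-pKd)."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # pKd values taken from examples (1) and (3) above.
    for pkd in (9.2, 5.0):
        kd_nanomolar = pkd_to_kd(pkd) * 1e9
        print(f"pKd {pkd} corresponds to Kd of about {kd_nanomolar:.3g} nM")
```

Under this definition, each unit increase in pKd corresponds to a tenfold smaller Kd, so the pKd of 9.2 in example (1) is a sub-nanomolar Kd, while the pKd of 5.0 in example (3) is a Kd of 10 micromolar.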