Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COCCNC(=N)NCCC[C@H](N)C(=O)O. The target protein (O94760) has sequence MAGLGHPAAFGRATHAVVRALPESLGQHALRSAKGEEVDVARAERQHQLYVGVLGSKLGLQVVELPADESLPDCVFVEDVAVVCEETALITRPGAPSRRKEVDMMKEALEKLQLNIVEMKDENATLDGGDVLFTGREFFVGLSKRTNQRGAEILADTFKDYAVSTVPVADGLHLKSFCSMAGPNLIAIGSSESAQKALKIMQQMSDHRYDKLTVPDDIAANCIYLNIPNKGHVLLHRTPEEYPESAKVYEKLKDHMLIPVSMSELEKVDGLLTCCSVLINKKVDS. The pIC50 is 4.5. (2) The drug is CCN(CCO)c1ncc(C(=O)Nc2ccc(OC(F)(F)Cl)cc2)cc1-c1cncnc1. The target protein sequence is NLFVALYDFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHSWYHGPVSRNAAEYLLSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSSESRFNTLAELVHHHSTVADGLITTLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKSDVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIHQAFETMFQESSISDEVEKELGKQGV. The pIC50 is 9.2. (3) The compound is CC(C)C(C(=O)OC(C)(C)C)N1C(=O)C(NC(=O)OCc2ccccc2)CS1(=O)=O. The target protein sequence is MLKRLKEKSNDEIVQNTINKRINFIFGVIVFIFAVLVLRLGYLQIAQGSHYKQIIKNDENITVNESVPRGRILDRNGKVLVDNASKMAITYTRGRKTTQSEMLDTAEKLSKLIKMDTKKITERDKKDFWIQLHPKKAKAMMTKEQAMLADGSIKQDQYDKQLLSKIGKSQLDELSSKDLQVLAIFREMNAGTVLDPQMIKNEDVSEKEYAAVSQQLSKLPGVNTSMDWDRKYPYGDTLRGIFGDVSTPAEGIPKELTEHYLSKGYSRNDRVGKSYLEYQYEDVLRGKKKEMKYTTDKSGKVTSSEVLNPGARGQDLKLTIDIDLQKEVEALLDKQIKKLRSQGAKDMDNAMMVVQNPKNGDILALAGKQINKSGKMTDYDIGTFTSQFAVGSSVKGGTLLAGYQNKAIKVGETMVDEPLHFQGGLTKRSYFNKNGHVTINDKQALMRSSNVYMFKTALKLAGDPYYSGMALPSDISSPAQKLRRGLNQVGLGVKTGIDLP.... The pIC50 is 3.2. (4) The small molecule is CC1=C(C(=O)OCc2ccc3c(c2)OCO3)[C@@H](c2ccco2)N(C)C(=O)N1C. 
The target protein (Q63008) has sequence MEGAEAGARATFGAWDYGVFATMLLVSTGIGLWVGLARGGQRSADDFFTGGRQLAAVPVGLSLAASFMSAVQVLGVPAEAARYGLKFLWMCAGQLLNSLLTAFLFLPIFYRLGLTSTYQYLELRFSRAVRLCGTLQYLVATMLYTGIVIYAPALILNQVTGLDIWASLLSTGIICTLYTTVGGMKAVVWTDVFQVVVMLVGFWVILARGVILLGGPRNVLSLAQNHSRINLMDFDPDPRSRYTFWTFIVGGTLVWLSMYGVNQAQVQRYVACHTEGKAKLALLVNQLGLFLIVASAACCGIVMFVYYKDCDPLLTGRISAPDQYMPLLVLDIFEDLPGVPGLFLACAYSGTLSTASTSINAMAAVTVEDLIKPRMPGLAPRKLVFISKGLSFIYGSACLTVAALSSLLGGGVLQGSFTVMGVISGPLLGAFTLGMLLPACNTPGVLSGLAAGLAVSLWVAVGATLYPPGEQTMGVLPTSAAGCTNDSVLLGPPGATNASN.... The pIC50 is 9.1. (5) The pIC50 is 5.1. The compound is COc1ccc(/C=C2/C(=O)Nc3ccccc32)cc1. The target protein (P35918) has sequence MESKALLAVALWFCVETRAASVGLPGDFLHPPKLSTQKDILTILANTTLQITCRGQRDLDWLWPNAQRDSEERVLVTECGGGDSIFCKTLTIPRVVGNDTGAYKCSYRDVDIASTVYVYVRDYRSPFIASVSDQHGIVYITENKNKTVVIPCRGSISNLNVSLCARYPEKRFVPDGNRISWDSEIGFTLPSYMISYAGMVFCEAKINDETYQSIMYIVVVVGYRIYDVILSPPHEIELSAGEKLVLNCTARTELNVGLDFTWHSPPSKSHHKKIVNRDVKPFPGTVAKMFLSTLTIESVTKSDQGEYTCVASSGRMIKRNRTFVRVHTKPFIAFGSGMKSLVEATVGSQVRIPVKYLSYPAPDIKWYRNGRPIESNYTMIVGDELTIMEVTERDAGNYTVILTNPISMEKQSHMVSLVVNVPPQIGEKALISPMDSYQYGTMQTLTCTVYANPPLHHIQWYWQLEEACSYRPGQTSPYACKEWRHVEDFQGGNKIEVTKN.... (6) The drug is COc1cc(OC)cc(OCCCNc2nc(N)[nH]c(=O)c2N=O)c1. The target protein (P0AC13) has sequence MKLFAQGTSLDLSHPHVMGILNVTPDSFSDGGTHNSLIDAVKHANLMINAGATIIDVGGESTRPGAAEVSVEEELQRVIPVVEAIAQRFEVWISVDTSKPEVIRESAKVGAHIINDIRSLSEPGALEAAAETGLPVCLMHMQGNPKTMQEAPKYDDVFAEVNRYFIEQIARCEQAGIAKEKLLLDPGFGFGKNLSHNYSLLARLAEFHHFNLPLLVGMSRKSMIGQLLNVGPSERLSGSLACAVIAAMQGAHIIRVHDVKETVEAMRVVEATLSAKENKRYE. The pIC50 is 6.2. (7) The pIC50 is 3.4. The target protein (P10584) has sequence MKAPVRVAVTGAAGQIGYSLLFRIAAGEMLGKDQPVILQLLEIPQAMKALEGVVMELEDCAFPLLAGLEATDDPKVAFKDADYALLVGAAPRKAGMERRDLLQVNGKIFTEQGRALAEVAKKDVKVLVVGNPANTNALIAYKNAPGLNPRNFTAMTRLDHNRAKAQLAKKTGTGVDRIRRMTVWGNHSSTMFPDLFHAEVDGRPALELVDMEWYEKVFIPTVAQRGAAIIQARGASSAASAANAAIEHIRDWALGTPEGDWVSMAVPSQGEYGIPEGIVYSFPVTAKDGAYRVVEGLEINEFARKRMEITAQELLDEMEQVKALGLI. 
The small molecule is CN(C)CCCn1cc(C2=C(c3cn(C)c4ccccc34)C(=O)NC2=O)c2ccccc21. (8) The drug is CCCCCCCCCCCCCCC[C@@]1(O)C[N+](C)(C)C[C@@H](CC(=O)[O-])O1. The target protein (P32198) has sequence MAEAHQAVAFQFTVTPDGIDLRLSHEALKQICLSGLHSWKKKFIRFKNGIITGVFPANPSSWLIVVVGVISSMHAKVDPSLGMIAKISRTLDTTGRMSSQTKNIVSGVLFGTGLWVAVIMTMRYSLKVLLSYHGWMFAEHGKMSRSTKIWMAMVKVLSGRKPMLYSFQTSLPRLPVPAVKDTVSRYLESVRPLMKEEDFQRMTALAQDFAVNLGPKLQWYLKLKSWWATNYVSDWWEEYIYLRGRGPLMVNSNYYAMEMLYITPTHIQAARAGNTIHAILLYRRTLDREELKPIRLLGSTIPLCSAQWERLFNTSRIPGEETDTIQHIKDSRHIVVYHRGRYFKVWLYHDGRLLRPRELEQQMQQILDDPSEPQPGEAKLAALTAADRVPWAKCRQTYFARGKNKQSLDAVEKAAFFVTLDESEQGYREEDPEASIDSYAKSLLHGRCFDRWFDKSITFVVFKNSKIGINAEHSWADAPVVGHLWEYVMATDVFQLGYSE.... The pIC50 is 4.4.
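Note on the label scale: the task description defines pIC50 = -log10(IC50 in M). The short Python sketch below shows that conversion in both directions, so the values listed above can be read back as concentrations. The function names are hypothetical and are not part of the bindingdb_ic50 dataset or any BindingDB tooling. It assumes the IC50 is given in molar units.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # The logarithm is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition and report the IC50 in nM (1 M = 1e9 nM)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # Illustrative inputs only: two of the pIC50 labels quoted in the examples.
    for label in (9.2, 4.5):
        print(f"pIC50 {label} corresponds to {pic50_to_ic50_nanomolar(label):.3g} nM")
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.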