Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CCN1CCN(C(=O)N[C@@H](C(=O)N[C@H](C(=O)O)[C@@H]2N[C@@H](C(=O)O)C(C)(C)S2)C2C=CC=CC2)C(=O)C1=O. The target protein sequence is MKLNHFQGALYPWRFCVIVGLLLAMVGAIVWRIVDLHVIDHDFLKGQGDARSVRHIAIPAHRGLITDRNGEPLAVSTPVTTLWANPKELMAAKERWPQLAAALGQDTKLFADRIEQNAEREFIYLVRGLTPEQGEGVISLKVPGVYSIEEFRRFYPAGEVVAHAVGFTDVDDRGREGIELAFDEWLAGVPGKRQVLKDRRGRVIKDVQVTKNAKPGKTLALSIDLRLQYLAHRELRNALVENGAKAGSLVIMDVKTGEILAMTNQPTYNPNNRRNLQPAAMRNRAMIDVFEPGSTVKPFSMSAALASGRWKPSDIVDVYPGTLQIGRYTIRDVSRNSRQLDLTGILIKSSNVGISKIAFDIGAESIYSVMQQVGLGQDTGLGFPGERVGNLPNHRKWPKAETATLAYGYGLSVTAIQLAHAYAALANDGKSVPLSMTRVDRVPDGVQVISPEVASTVQGMLQQVVEAQGGVFRAQVPGYHAAGKSGTARKVSVGTKGYRE.... The pIC50 is 3.7. (2) The drug is CCc1ccc(-c2ccc(C(=O)NC[C@H]3OC(n4cnc5c(N)ncnc54)[C@H](O)[C@@H]3O)cc2)cc1. The target protein (P04406) has sequence MGKVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVKAENGKLVINGNPITIFQERDPSKIKWGDAGAEYVVESTGVFTTMEKAGAHLQGGAKRVIISAPSADAPMFVMGVNHEKYDNSLKIISNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKTVDGPSGKLWRDGRGALQNIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTANVSVVDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEHQVVSSDFNSDTHSSTFDAGAGIALNDHFVKLISWYDNEFGYSNRVVDLMAHMASKE. The pIC50 is 3.0. (3) The compound is Cc1[nH]n(-c2ccccc2)c(=S)c1C=Nc1ccccn1. The target protein sequence is MNKQRIYSIVAILLFVVGGVLIGKPFYDGYQAEKKQTENVQAVQKMDYEKHETEFVDASKIDQPDLAEVANASLDKKQVIGRISIPSVSLELPVLKSSTEKNLLSGAATVKENQVMGKGNYALAGHNMSKKGVLFSDIASLKKGDKIYLYDNENEYEYAVTGVSEVTPDKWEVVEDHGKDEITLITCVSVKDNSKRYVVAGDLVGTKAKK. The pIC50 is 5.8. (4) The drug is Cc1cc(NS(=O)(=O)c2ccc(NC(=S)NC(=O)c3cccc(S(=O)(=O)N(C)C)c3)cc2)no1. 
The target protein (Q62848) has sequence MASPRTRKVLKEVRAQDENNVCFECGAFNPQWVSVTYGIWICLECSGRHRGLGVHLSFVRSVTMDKWKDIELEKMKAGGNAKFREFLEAQDDYEPSWSLQDKYSSRAAALFRDKVATLAEGKEWSLESSPAQNWTPPQPKTLQFTAHRPAGQPQNVTTSGDKAFEDWLNDDLGSYQGAQENRYVGFGNTVPPQKREDDFLNSAMSSLYSGWSSFTTGASKFASAAKEGATKFGSQASQKASELGHSLNENVLKPAQEKVKEGRIFDDVSSGVSQLASKVQGVGSKGWRDVTTFFSGKAEDTSDRPLEGHSYQNSSGDNSQNSTIDQSFWETFGSAEPPKAKSPSSDSWTCADASTGRRSSDSWDIWGSGSASNNKNSNSDGWESWEGASGEGRAKATKKAAPSTAADEGWDNQNW. The pIC50 is 4.3. (5) The compound is O=C(CCl)c1cscc1Cl. The target is XTSFAESXKPVQQPSAFGS. The pIC50 is 4.6. (6) The drug is CCC(=O)Nc1ccc2ncnc(Nc3ccc(NC(=O)Nc4cc(C(C)(C)C)nn4-c4cccc(C)c4)cc3)c2c1. The target protein sequence is QTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVMEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQCWRKDPEERPTFEYLQAFLEDYFTSTEPQYQPGENL. The pIC50 is 7.6. (7) The small molecule is CCCCC[C@H](NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](Cc1ccccc1)C(N)=O. The target protein (P56481) has sequence MDLLKLNRSLQGPGPGSGSSLCRPGVSLLNSSSAGNLSCETPRIRGTGTRELELTIRITLYAVIFLMSVGGNVLIIVVLGLSRRLRTVTNAFLLSLAVSDLLLAVACMPFTLLPNLMGTFIFGTVICKAVSYLMGVSVSVSTLNLAAIALERYSAICRPLQARVWQTRSHAARVILATWLLSGLLMVPYPVYTVVQPVGPRILQCMHLWPSERVQQMWSVLLLILLFFIPGVVMAVAYGLISRELYLGLRFDGDNDSETQSRVRNQGGLPGGAAAPGPVHQNGGCRHVTSLTGEDSDGCYVQLPRSRLEMTTLTTPTTGPGPGPRPNQAKLLAKKRVVRMLLVIVLLFFVCWLPVYSANTWRAFDGPGARRALAGAPISFIHLLSYTSACANPLVYCFMHRRFRQACLDTCARCCPRPPRARPRPLPDEDPPTPSIASLSRLSYTTISTLGPG. The pIC50 is 8.5.
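
The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset's own tooling: the function names `ic50_to_pic50`, `ic50_nm_to_pic50`, and `pic50_to_ic50_molar` are hypothetical, and the nanomolar helper assumes the raw measurement is reported in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 in nanomolar (assumed unit): 1 nM = 1e-9 M."""
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from examples (1) and (7) in the record above.
    for label in (3.7, 8.5):
        print(f"pIC50 {label} -> IC50 = {pic50_to_ic50_molar(label):.3e} M")
```

Under this definition, the label 3.7 in example (1) corresponds to an IC50 of roughly 200 µM (weak binding), while the label 8.5 in example (7) corresponds to about 3.2 nM (potent binding). Each unit increase in pIC50 is a tenfold drop in IC50.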