From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COc1ccc(-c2cnc(C(=O)N3CCN(c4cc(C(=O)O)c5ccccc5c4)CC3)nc2-c2ccc(C)cc2F)cc1. The target protein (P15382) has sequence MILSNTTAVTPFLTKLWQETVQQGGNMSGLARRSPRSSDGKLEALYVLMVLGFFGFFTLGIMLSYIRSKKLEHSNDPFNVYIESDAWQEKDKAYVQARVLESYRSCYVVENHLAIEQPNTHLPETKPSP. The pIC50 is 5.0. (2) The compound is Clc1ccc(CSC(Cn2ccnc2)c2ccc(Cl)cc2Cl)cc1. The target protein (P28776) has sequence MALSKISPTEGSRRILEDHHIDEDVGFALPHPLVELPDAYSPWVLVARNLPVLIENGQLREEVEKLPTLSTDGLRGHRLQRLAHLALGYITMAYVWNRGDDDVRKVLPRNIAVPYCELSEKLGLPPILSYADCVLANWKKKDPNGPMTYENMDILFSFPGGDCDKGFFLVSLLVEIAASPAIKAIPTVSSAVERQDLKALEKALHDIATSLEKAKEIFKRMRDFVDPDTFFHVLRIYLSGWKCSSKLPEGLLYEGVWDTPKMFSGGSAGQSSIFQSLDVLLGIKHEAGKESPAEFLQEMREYMPPAHRNFLFFLESAPPVREFVISRHNEDLTKAYNECVNGLVSVRKFHLAIVDTYIMKPSKKKPTDGDKSEEPSNVESRGTGGTNPMTFLRSVKDTTEKALLSWP. The pIC50 is 4.5. (3) The target protein (Q9EPX4) has sequence MEVPGANATSANTTSIPGTSTLCSRDYKITQVLFPLLYTVLFFAGLITNSLAMRIFFQIRSKSNFIIFLKNTVISDLLMILTFPFKILSDAKLGAGHLRTLVCQVTSVTFYFTMYISISFLGLITIDRYLKTTRPFKTSSPSNLLGAKILSVAIWAFMFLLSLPNMILTNRRPKDKDITKCSFLKSEFGLVWHEIVNYICQVIFWINFLIVIVCYSLITKELYRSYVRTRGSAKAPKKRVNIKVFIIIAVFFICFVPFHFARIPYTLSQTRAVFDCNAENTLFYVKESTLWLTSLNACLDPFIYFFLCKSFRNSLMSMLRCSTSGANKKKGQEGGDPSEETPM. The drug is CCCSc1nc(N[C@@H]2C[C@H]2c2ccc(F)c(F)c2)c2nnn([C@H]3[C@H](O)[C@H](O)[C@@H](OCCO)[C@@H]3O)c2n1. The pIC50 is 5.9. (4) The small molecule is CCCCCCCCCCCCCCC(COCc1ccccc1)NC(=O)/C=C/C(=O)O. 
The target protein (P47713) has sequence MSFIDPYQHIIVEHQYSHKFTVVVLRATKVTKGTFGDMLDTPDPYVELFISTTPDSRKRTRHFNNDINPVWNETFEFILDPNQENVLEITLMDANYVMDETLGTATFPVSSMKVGEKKEVPFIFNQVTEMILEMSLEVCSCPDLRFSMALCDQEKTFRQQRKENIKENMKKLLGPKKSEGLYSTRDVPVVAILGSGGGFRAMVGFSGVMKALYESGILDCATYIAGLSGSTWYMSTLYSHPDFPEKGPEEINEELMKNVSHNPLLLLTPQKVKRYVESLWKKKSSGQPVTFTDIFGMLIGETLIQNRMSMTLSSLKEKVNAARCPLPLFTCLHVKPDVSELMFADWVEFSPYEIGMAKYGTFMAPDLFGSKFFMGTVVKKYEENPLHFLMGVWGSAFSILFNRVLGVSGSQNKGSTMEEELENITAKHIVSNDSSDSDDEAQGPKGTENEEAEKEYQSDNQASWVHRMLMALVSDSALFNTREGRAGKVHNFMLGLNLNT.... The pIC50 is 4.2. (5) The drug is CC(C)Oc1cc2c(-c3cc(N4CCS(=O)CC4)ncn3)n[nH]c2cc1F. The target protein sequence is HSDSISSLASEREYITSLDLSANELRDIDALSQKCCISVHLEHLEKLELHQNALTSFPQQLCETLKSLTHLDLHSNKFTSFPSYLLKMSCIANLDVSRNDIGPSVVLDPTVKCPTLKQFNLSYNQLSFVPENLTDVVEKLEQLILEGNKISGICSPLRLKELKILNLSKNHISSLSENFLEACPKVESFSARMNFLAAMPFLPPSMTILKLSQNKFSCIPEAILNLPHLRSLDMSSNDIQYLPGPAHWKSLNLRELLFSHNQISILDLSEKAYLWSRVEKLHLSHNKLKEIPPEIGCLENLTSLDVSYNLELRSFPNEMGKLSKIWDLPLDELHLNFDFKHIGCKAKDIIRFLQQRLKKAVPYNRMKLMIVGNTGSGKTTLLQQLMKTKKSDLGMQSATVGIDVKDWPIQIRDKRKRDLVLNVWDFAGREEFYSTHPHFMTQRALYLAVYDLSKGQAEVDAMKPWLFNIKARASSSPVILVGTHLDVSDEKQRKACMSKI.... The pIC50 is 8.2. (6) The drug is C[C@]1(Cn2ccnn2)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. The pIC50 is 3.8. The target protein sequence is MQNTLKLLSVITCLAATVQGALAANIDESKIKDTVDDLIQPLMQKNNIPGMSVAVTVNGKNYIYNYGLAAKQPQQPVTENTLFEVGSLSKTFAATLASYAQVSGKLSLDQSVSHYVPELRGSSFDHVSVLNVGTHTSGLQLFMPEDIKNTTQLMAYLKAWKPADAAGTHRVYSNIGTGLLGMIAAKSLGVSYEDAIEKTLLPQLGMHHSYLKVPADQMENYAWGYNKKDEPVHGNMEILGNEAYGIKTTSSDLLRYVQANMGQLKLDANAKMQQALTATHTGYFKSGEITQDLMWEQLPYPVSLPNLLTGNDMAMTKSVATPIVPPLPPQENVWINKTGSTNGFGAYIAFVPAKKMGIVMLANKNYSIDQRVTVAYKILSSLEGNK. (7) The small molecule is O=C(NCC(F)(F)F)[C@@H]1CN(Cc2ccn(-c3ccccc3)c2)CCN1C[C@@H](O)C[C@@H](Cc1cccnc1)C(=O)N[C@H]1c2ccccc2OC[C@H]1O. The target protein sequence is PQITLWKRPIVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKIIGGIGGFVKVREYDQIPVEICGHKAIGTVLIGPTPFNVIGRNLMTQLGCTLNF. The pIC50 is 8.6. (8) The pIC50 is 7.8. 
The target protein (P51683) has sequence MEDNNMLPQFIHGILSTSHSLFTRSIQELDEGATTPYDYDDGEPCHKTSVKQIGAWILPPLYSLVFIFGFVGNMLVIIILIGCKKLKSMTDIYLLNLAISDLLFLLTLPFWAHYAANEWVFGNIMCKVFTGLYHIGYFGGIFFIILLTIDRYLAIVHAVFALKARTVTFGVITSVVTWVVAVFASLPGIIFTKSKQDDHHYTCGPYFTQLWKNFQTIMRNILSLILPLLVMVICYSGILHTLFRCRNEKKRHRAVRLIFAIMIVYFLFWTPYNIVLFLTTFQESLGMSNCVIDKHLDQAMQVTETLGMTHCCINPVIYAFVGEKFRRYLSIFFRKHIAKRLCKQCPVFYRETADRVSSTFTPSTGEQEVSVGL. The small molecule is CCO[C@H]1CN([C@H]2CC[C@@H](c3ccccc3)CC2)C[C@@H]1NC(=O)CNC(=O)c1cccc(C(F)(F)F)c1.
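
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration only: the function names and the nanomolar helper are assumptions introduced here and are not part of BindingDB or the bindingdb_ic50 dataset.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Hypothetical helper: convert nM to M first, then apply the definition."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 5.0, as in example (1), corresponds to an IC50 near 10 micromolar.
    print(ic50_molar_from_pic50(5.0))
    # Higher pIC50 means a lower IC50, i.e. a more potent compound.
    print(pic50_from_ic50_nanomolar(10_000.0))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50.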