Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB based on IC50 measurements. (1) The drug is CCCc1cc(C(=O)Nc2c(O)c3ccc(O[C@@H]4OC(C)(C)[C@H](OC)[C@@H](OC(=O)c5ccc(C)[nH]5)[C@H]4O)c(C)c3oc2=O)ccc1O. The target protein (P0AES6) has sequence MSNSYDSSSIKVLKGLDAVRKRPGMYIGDTDDGTGLHHMVFEVVDNAIDEALAGHCKEIIVTIHADNSVSVQDDGRGIPTGIHPEEGVSAAEVIMTVLHAGGKFDDNSYKVSGGLHGVGVSVVNALSQKLELVIQREGKIHRQIYEHGVPQAPLAVTGETEKTGTMVRFWPSLETFTNVTEFEYEILAKRLRELSFLNSGVSIRLRDKRDGKEDHFHYEGGIKAFVEYLNKNKTPIHPNIFYFSTEKDGIGVEVALQWNDGFQENIYCFTNNIPQRDGGTHLAGFRAAMTRTLNAYMDKEGYSKKAKVSATGDDAREGLIAVVSVKVPDPKFSSQTKDKLVSSEVKSAVEQQMNELLAEYLLENPTDAKIVVGKIIDAARAREAARRAREMTRRKGALDLAGLPGKLADCQERDPALSELYLVEGDSAGGSAKQGRNRKNQAILPLKGKILNVEKARFDKMLSSQEVATLITALGCGIGRDEYNPDKLRYHSIIIMTDAD.... The pIC50 is 7.7. (2) The small molecule is C[C@H](CCC(=O)NCCS(=O)(=O)O)[C@H]1CC[C@H]2[C@@H]3CC[C@@H]4C[C@H](O)CC[C@]4(C)[C@H]3CC[C@@]21C. The target protein (Q03145) has sequence MELRAVGFCLALLWGCALAAAAAQGKEVVLLDFAAMKGELGWLTHPYGKGWDLMQNIMDDMPIYMYSVCNVVSGDQDNWLRTNWVYREEAERIFIELKFTVRDCNSFPGGASSCKETFNLYYAESDVDYGTNFQKRQFTKIDTIAPDEITVSSDFEARNVKLNVEERMVGPLTRKGFYLAFQDIGACVALLSVRVYYKKCPEMLQSLARFPETIAVAVSDTQPLATVAGTCVDHAVVPYGGEGPLMHCTVDGEWLVPIGQCLCQEGYEKVEDACRACSPGFFKSEASESPCLECPEHTLPSTEGATSCQCEEGYFRAPEDPLSMSCTRPPSAPNYLTAIGMGAKVELRWTAPKDTGGRQDIVYSVTCEQCWPESGECGPCEASVRYSEPPHALTRTSVTVSDLEPHMNYTFAVEARNGVSGLVTSRSFRTASVSINQTEPPKVRLEDRSTTSLSVTWSIPVSQQSRVWKYEVTYRKKGDANSYNVRRTEGFSVTLDDLAP.... The pIC50 is 4.1. (3) The drug is CN[C@@H](C)C(=O)N[C@H]1CN(C(=O)CCCC(C)=O)c2ccccc2N(Cc2c(OC)ccc3cc(Br)ccc23)C1=O. The target protein sequence is MRHHHHHHRDHFALDRPSETHADYLLRTGQVVDISDTIYPRNPAMYSEEARLKSFQNWPDYAHLTPRELASAGLYYTGIGDQVQCFACGGKLKNWEPGDRAWSEHRRHFPNCFFVLGRNLNIRSE. The pIC50 is 7.5.
(4) The drug is Cc1cc2c(CC(=O)O)cccc2n1C(=O)c1ccc(OCCc2ccccc2)cc1. The target protein (P70263) has sequence MNESYRCQTSTWVERGSSATMGAVLFGAGLLGNLLALVLLARSGLGSCRPGPLHPPPSVFYVLVCGLTVTDLLGKCLISPMVLAAYAQNQSLKELLPASGNQLCETFAFLMSFFGLASTLQLLAMAVECWLSLGHPFFYQRHVTLRRGVLVAPVVAAFCLAFCALPFAGFGKFVQYCPGTWCFIQMIHKERSFSVIGFSVLYSSLMALLVLATVVCNLGAMYNLYDMHRRQRHYPHRCSRDRAQSGSDYRHGSLHPLEELDHFVLLALMTVLFTMCSLPLIYRAYYGAFKLENKAEGDSEDLQALRFLSVISIVDPWIFIIFRTSVFRMLFHKVFTRPLIYRNWSSHSQQSNVESTL. The pIC50 is 6.9. (5) The drug is C[C@@H](O)[C@@H](NC(=O)[C@@H](CCC(N)=O)NC(=O)CNC(=O)[C@H]1CCCN1C(=O)[C@H](Cc1cnc[nH]1)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CS)C(=O)N[C@H](CS)C(=O)O. The target protein (P16233) has sequence MLPLWTLSLLLGAVAGKEVCYERLGCFSDDSPWSGITERPLHILPWSPKDVNTRFLLYTNENPNNFQEVAADSSSISGSNFKTNRKTRFIIHGFIDKGEENWLANVCKNLFKVESVNCICVDWKGGSRTGYTQASQNIRIVGAEVAYFVEFLQSAFGYSPSNVHVIGHSLGAHAAGEAGRRTNGTIGRITGLDPAEPCFQGTPELVRLDPSDAKFVDVIHTDGAPIVPNLGFGMSQVVGHLDFFPNGGVEMPGCKKNILSQIVDIDGIWEGTRDFAACNHLRSYKYYTDSIVNPDGFAGFPCASYNVFTANKCFPCPSGGCPQMGHYADRYPGKTNDVGQKFYLDTGDASNFARWRYKVSVTLSGKKVTGHILVSLFGNKGNSKQYEIFKGTLKPDSTHSNEFDSDVDVGDLQMVKFIWYNNVINPTLPRVGASKIIVETNVGKQFNFCSPETVREEVLLTLTPC. The pIC50 is 4.8. (6) The small molecule is O=C(O)c1cccc2c1nc(-c1ccc(Cl)cc1)c1ccncc12. The target protein sequence is MSGPVPSRARVYTDVNTHRPREYWDYESHVVEWGNQDDYQLVRKLGRGKYSEVFEAINITNNEKVVVKILKPVKKKKIKREIKILENLRGGPNIITLADIVKDPVSRTPALVFEHVNNTDFKQLYQTLTDYDIRFYMYEILKALDYCHSMGIMHRDVKPHNVMIDHEHRKLRLIDWGLAEFYHPGQEYNVRVASRYFKGPELLVDYQMYDYSLDMWSLGCMLASMIFRKEPFFHGHDNYDQLVRIAKVLGTEDLYDYIDKYNIELDPRFNDILGRHSRKRWERFVHSENQHLVSPEALDFLDKLLRYDHQSRLTAREAMEHPYFYTVVKDQARMGSS. The pIC50 is 5.7.
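
The pIC50 definition given in the task statement (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration only; the function names ic50_to_pic50 and pic50_to_ic50 are hypothetical and are not part of the dataset or of any BindingDB tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 expressed in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # pIC50 labels quoted in the examples above, converted back to molar IC50.
    for label in (7.7, 4.1, 7.5, 6.9, 4.8, 5.7):
        ic50_nm = pic50_to_ic50(label) * 1e9  # mol/L -> nmol/L
        print(f"pIC50 {label:.1f} -> IC50 ~ {ic50_nm:,.0f} nM")
```

Under this definition a pIC50 of 7.7 corresponds to an IC50 of roughly 20 nM, while 4.1 corresponds to roughly 79 µM, which is why higher values indicate more potent binders.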