From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(C)(C)c1cc(C(=O)/C(C#N)=N/Nc2cccc(Cl)c2Cl)no1. The pIC50 is 4.5. The target protein (Q9EQZ6) has sequence MVAAHAAHSQSSAEWIACLDKRPLERSSEDVDIIFTRLKGVKAFEKFHPNLLRQICLCGYYENLEKGITLFRQGDIGTNWYAVLAGSLDVKVSETSSHQDAVTICTLGIGTAFGESILDNTPRHATIVTRESSELLRIEQEDFKALWEKYRQYMAGLLAPPYGVMETGSNNDRIPDKENTPLIEPHVPLRPAHTITKVPSEKILRAGKILRIAILSRAPHMIRDRKYHLKTYRQCCVGTELVDWMIQQTSCVHSRTQAVGMWQVLLEDGVLNHVDQERHFQDKYLFYRFLDDEREDAPLPTEEEKKECDEELQDTMLLLSQMGPDAHMRMILRKPPGQRTVDDLEIIYDELLHIKALSHLSTTVKRELAGVLIFESHAKGGTVLFNQGEEGTSWYIILKGSVNVVIYGKGVVCTLHEGDDFGKLALVNDAPRAASIVLREDNCHFLRVDKEDFNRILRDVEANTVRLKEHDQDVLVLEKVPAGNRAANQGNSQPQQKYTV.... (2) The compound is O=C1C=C(Nc2cccc(O)c2)C(=O)C=C1Nc1cccc(O)c1. The target protein (P39052) has sequence MGNRGMEELIPLVNKLQDAFSSIGQSCHLDLPQIAVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLILQLIFSKTEYAEFLHCKSKKFTDFDEVRQEIEAETDRVTGTNKGISPVPINLRVYSPHVLNLTLIDLPGITKVPVGDQPPDIEYQIKDMILQFISRESSLILAVTPANMDLANSDALKLAKEVDPQGLRTIGVITKLDLMDEGTDARDVLENKLLPLRRGYIGVVNRSQKDIEGRKDIRAALAAERKFFLSHPAYRHMADRMGTPHLQKTLNQQLTNHIRESLPTLRSKLQSQLLSLEKEVEEYKNFRPDDPTRKTKALLQMVQQFGVDFEKRIEGSGDQVDTLELSGGARINRIFHERFPFELVKMEFDEKDLRREISYAIKNIHGVRTGLFTPDLAFEAIVKKQVVKLKEPCLKCVDLVIQELISTVRQCTSKLSSYPRLREETERIVTTYIREREGRTKDQILLLIDIEQSYINTNHEDFIGFANAQ.... The pIC50 is 4.4. (3) The small molecule is CN1C[C@@H](COc2ccc(C(=O)Nc3cc(CC(=O)O)ccc3Cl)c(Cl)c2)Oc2ccccc21. 
The target protein (P70263) has sequence MNESYRCQTSTWVERGSSATMGAVLFGAGLLGNLLALVLLARSGLGSCRPGPLHPPPSVFYVLVCGLTVTDLLGKCLISPMVLAAYAQNQSLKELLPASGNQLCETFAFLMSFFGLASTLQLLAMAVECWLSLGHPFFYQRHVTLRRGVLVAPVVAAFCLAFCALPFAGFGKFVQYCPGTWCFIQMIHKERSFSVIGFSVLYSSLMALLVLATVVCNLGAMYNLYDMHRRQRHYPHRCSRDRAQSGSDYRHGSLHPLEELDHFVLLALMTVLFTMCSLPLIYRAYYGAFKLENKAEGDSEDLQALRFLSVISIVDPWIFIIFRTSVFRMLFHKVFTRPLIYRNWSSHSQQSNVESTL. The pIC50 is 8.4. (4) The compound is CCOc1ccc(/C=C2\SC(=O)N(CC)C2=O)cc1. The target protein (P42582) has sequence MFPSPALTPTPFSVKDILNLEQQQRSLASGDLSARLEATLAPASCMLAAFKPEAYSGPEAAASGLAELRAEMGPAPSPPKCSPAFPAAPTFYPGAYGDPDPAKDPRADKKELCALQKAVELDKAETDGAERPRARRRRKPRVLFSQAQVYELERRFKQQRYLSAPERDQLASVLKLTSTQVKIWFQNRRYKCKRQRQDQTLELLGPPPPPARRIAVPVLVRDGKPCLGDPAAYAPAYGVGLNAYGYNAYPYPSYGGAACSPGYSCAAYPAAPPAAQPPAASANSNFVNFGVGDLNTVQSPGMPQGNSGVSTLHGIRAW. The pIC50 is 5.3. (5) The small molecule is CC(C)(C)N1CCC(NC(=O)c2cccc(Br)c2)C1. The target protein (Q96T88) has sequence MWIQVRTMDGRQTHTVDSLSRLTKVEELRRKIQELFHVEPGLQRLFYRGKQMEDGHTLFDYEVRLNDTIQLLVRQSLVLPHSTKERDSELSDTDSGCCLGQSESDKSSTHGEAAAETDSRPADEDMWDETELGLYKVNEYVDARDTNMGAWFEAQVVRVTRKAPSRDEPCSSTSRPALEEDVIYHVKYDDYPENGVVQMNSRDVRARARTIIKWQDLEVGQVVMLNYNPDNPKERGFWYDAEISRKRETRTARELYANVVLGDDSLNDCRIIFVDEVFKIERPGEGSPMVDNPMRRKSGPSCKHCKDDVNRLCRVCACHLCGGRQDPDKQLMCDECDMAFHIYCLDPPLSSVPSEDEWYCPECRNDASEVVLAGERLRESKKKAKMASATSSSQRDWGKGMACVGRTKECTIVPSNHYGPIPGIPVGTMWRFRVQVSESGVHRPHVAGIHGRSNDGAYSLVLAGGYEDDVDHGNFFTYTGSGGRDLSGNKRTAEQSCDQK.... The pIC50 is 4.0. (6) The pIC50 is 5.4. The compound is Cc1nc2cc(OC[C@H](O)CN3CCN(Cc4noc(-c5cccc(Cl)c5)n4)CC3)ccc2s1. 
The target protein (P07872) has sequence MNPDLRKERASATFNPELITHILDGSPENTRRRREIENLILNDPDFQHEDYNFLTRSQRYEVAVKKSATMVKKMREYGISDPEEIMWFKKLYLANFVEPVGLNYSMFIPTLLNQGTTAQQEKWMRPSQELQIIGTYAQTEMGHGTHLRGLETTATYDPKTQEFILNSPTVTSIKWWPGGLGKTSNHAIVLAQLITQGECYGLHAFVVPIREIGTHKPLPGITVGDIGPKFGYEEMDNGYLKMDNYRIPRENMLMKYAQVKPDGTYVKPLSNKLTYGTMVFVRSFLVGNAAQSLSKACTIAIRYSAVRRQSEIKQSEPEPQILDFQTQQYKLFPLLATAYAFHFVGRYMKETYLRINESIGQGDLSELPELHALTAGLKAFTTWTANAGIEECRMACGGHGYSHSSGIPNIYVTFTPACTFEGENTVMMLQTARFLMKIYDQVRSGKLVGGMVSYLNDLPSQRIQPQQVAVWPTMVDINSLEGLTEAYKLRAARLVEIAAK.... (7) The compound is CCc1cn([C@@H]2O[C@H](CNC(=O)C3c4cccc(C)c4Oc4c(C)cccc43)[C@@H](O)[C@@H]2F)c(=O)[nH]c1=O. The target protein sequence is MASHAGQQHAPAFGQAARASGPTDGRAASRPSHRQGASGARGDPELPTLLRVYIDGPHGVGKTTTSAQLMEALGPRDNIVYVPEPMTYWQVLGASETLTNIYNTQHRLDRGEISAGEAAVVMTSAQITMSTPYAATDAVLAPHIGGEAVGPQAPPPALTLVFDRHPIASLLCYPAARYLMGSMTPQAVLAFVALMPPTAPGTNLVLGVLPEAEHADRLARRQRPGERLDLAMLSAIRRVYDLLANTVRYLQRGGRWREDWGRLTGVAAATPRPDPEDGAGSLPRIEDTLFALFRVPELLAPNGDLYHIFAWVLDVLADRLLPMHLFVLDYDQSPVGCRDALLRLTAGMIPTRVTTAGSIAEIRDLARTFAREVGGV. The pIC50 is 9.3. (8) The pIC50 is 4.8. The compound is COCCCCn1cnc2c(=O)[nH]c(Nc3ccccc3)nc21. The target protein (P04407) has sequence MASHAGQQHAPAFGQAARASGPTDGRAASRPSHRQGASEARGDPELPTLLRVYIDGPHGVGKTTTSAQLMEALGPRDNIVYVPEPMTYWQVLGASETLTNIYNTQHRLDRGEISAGEAAVVMTSAQITMSTPYAATDAVLAPHIGGEAVGPQAPPPALTLVFDRHPIASLLCYPAARYLMGSMTPQAVLAFVALMPPTAPGTNLVLGVLPEAEHADRLARRQRPGERLDLAMLSAIRRVYDLLANTVRYLQRGGRWREDWGRLTGVAAATPRPDPEDGAGSLPRIEDTLFALFRVPELLAPNGDLYHIFAWVLDVLADRLLPMHLFVLDYDQSPVGCRDALLRLTAGMIPTRVTTAGSIAEIRDLARTFAREVGGV. (9) The drug is CC(C)[C@H](C(=O)Nc1ncc(C(F)(F)F)s1)c1ccc(Cl)cc1. 
The target protein (O15552) has sequence MLPDWKSSLILMAYIIIFLTGLPANLLALRAFVGRIRQPQPAPVHILLLSLTLADLLLLLLLPFKIIEAASNFRWYLPKVVCALTSFGFYSSIYCSTWLLAGISIERYLGVAFPVQYKLSRRPLYGVIAALVAWVMSFGHCTIVIIVQYLNTTEQVRSGNEITCYENFTDNQLDVVLPVRLELCLVLFFIPMAVTIFCYWRFVWIMLSQPLVGAQRRRRAVGLAVVTLLNFLVCFGPYNVSHLVGYHQRKSPWWRSIAVVFSSLNASLDPLLFYFSSSVVRRAFGRGLQVLRNQGSSLLGRRGKDTAEGTNEDRGVGQGEGMPSSDFTTE. The pIC50 is 5.8.
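The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration only: the function names are hypothetical and are not part of the bindingdb_ic50 dataset or of any BindingDB tooling, and the nanomolar conversion is an added convenience that assumes IC50 values are reported in nM.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition and express the IC50 in nM (1 M = 1e9 nM)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # Labels taken from records (3) and (5) above; higher pIC50 means more potent.
    for label in (8.4, 4.0):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nanomolar(label):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, so the label 8.4 in record (3) reflects a much lower IC50 than the label 4.0 in record (5).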