Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB, using IC50 measurements). (1) The drug is CO[C@H]1CC[C@H](Cc2ccccc2)N(C(=O)n2ncc(C(O)(c3ccc(F)cc3)c3ccc(F)cc3)n2)C1. The target protein (Q9BV23) has sequence MDLDVVNMFVIAGGTLAIPILAFVASFLLWPSALIRIYYWYWRRTLGMQVRYVHHEDYQFCYSFRGRPGHKPSILMLHGFSAHKDMWLSVVKFLPKNLHLVCVDMPGHEGTTRSSLDDLSIDGQVKRIHQFVECLKLNKKPFHLVGTSMGGQVAGVYAAYYPSDVSSLCLVCPAGLQYSTDNQFVQRLKELQGSAAVEKIPLIPSTPEEMSEMLQLCSYVRFKVPQQILQGLVDVRIPHNNFYRKLFLEIVSEKSRYSLHQNMDKIKVPTQIIWGKQDQVLDVSGADMLAKSIANCQVELLENCGHSVVMERPRKTAKLIIDFLASVHNTDNNKKLD. The pIC50 is 6.9. (2) The drug is NCCCC(=O)O[C@H]1[C@H](O)[C@@H](CO)O[C@H]1n1cc(C=CBr)c(=O)[nH]c1=O. The target protein (Q9XZT6) has sequence MAEAASCARKGTKYAEGTQPFTVLIEGNIGSGKTTYLNHFEKYKNDICLLTEPVEKWRNVNGVNLLELMYKDPKKWAMPFQSYVTLTMLQSHTAPTNKKLKIMERSIFSARYCFVENMRRNGSLEQGMYNTLEEWYKFIEESIHVQADLIIYLRTSPEVAYERIRQRARSEESCVPLKYLQELHELHEDWLIHQRRPQSCKVLVLDADLNLENIGTEYQRSESSIFDAISSNQQPSPVLVSPSKRQRVAR. The pIC50 is 3.8. (3) The compound is O=C1OCC/C1=C\c1ccc2ncsc2c1. The target protein (P49759) has sequence MRHSKRTYCPDWDDKDWDYGKWRSSSSHKRRKRSHSSAQENKRCKYNHSKMCDSHYLESRSINEKDYHSRRYIDEYRNDYTQGCEPGHRQRDHESRYQNHSSKSSGRSGRSSYKSKHRIHHSTSHRRSHGKSHRRKRTRSVEDDEEGHLICQSGDVLSARYEIVDTLGEGAFGKVVECIDHKAGGRHVAVKIVKNVDRYCEAARSEIQVLEHLNTTDPNSTFRCVQMLEWFEHHGHICIVFELLGLSTYDFIKENGFLPFRLDHIRKMAYQICKSVNFLHSNKLTHTDLKPENILFVQSDYTEAYNPKIKRDERTLINPDIKVVDFGSATYDDEHHSTLVSTRHYRAPEVILALGWSQPCDVWSIGCILIEYYLGFTVFPTHDSKEHLAMMERILGPLPKHMIQKTRKRKYFHHDRLDWDEHSSAGRYVSRRCKPLKEFMLSQDVEHERLFDLIQKMLEYDPAKRITLREALKHPFFDLLKKSI. The pIC50 is 5.0. (4) The compound is O=C1CN(c2ccc(C(=O)NC3CCN(CC(F)(F)F)CC3)cn2)CCN1. 
The target protein (O60760) has sequence MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPEIKSTLPFGKIPILEVDGLTLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDFMSCFPWAEKKQDVKEQMFNELLTYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDLLDNHPRLVTLRKKVQAIPAVANWIKRRPQTKL. The pIC50 is 6.2. (5) The compound is CCOC(=O)c1cc(C#N)c2cccc(O[C@@H](CC(C)C)C(=O)NC(CC(=O)O)c3ccccc3)n12. The target protein (P51452) has sequence MSGSFELSVQDLNDLLSDGSGCYSLPSQPCNEVTPRIYVGNASVAQDIPKLQKLGITHVLNAAEGRSFMHVNTNANFYKDSGITYLGIKANDTQEFNLSAYFERAADFIDQALAQKNGRVLVHCREGYSRSPTLVIAYLMMRQKMDVKSALSIVRQNREIGPNDGFLAQLCQLNDRLAKEGKLKP. The pIC50 is 4.0. (6) The small molecule is S=C(NCCc1ccccc1)Nc1c(Cl)cccc1Cl. The target protein (P54829) has sequence MNYEGARSERENHAADDSEGGALDMCCSERLPGLPQPIVMEALDEAEGLQDSQREMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFLLACGVLWFSGYGHIWSQNATNLVSSLLTLLKQLEPTAWLDSGTWGVPSLLLVFLSVGLVLVTTLVWHLLRTPPEPPTPLPPEDRRQSVSRQPSFTYSEWMEEKIEDDFLDLDPVPETPVFDCVMDIKPEADPTSLTVKSMGLQERRGSNVSLTLDMCTPGCNEEGFGYLMSPREESAREYLLSASRVLQAEELHEKALDPFLLQAEFFEIPMNFVDPKEYDIPGLVRKNRYKTILPNPHSRVCLTSPDPDDPLSSYINANYIRGYGGEEKVYIATQGPIVSTVADFWRMVWQEHTPIIVMITNIEEMNEKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTEERGLKHYWFTSWPDQKTPDRAPPLLHLVREVEEAAQQEGPHCAPIIVHCSAGI.... The pIC50 is 4.8. (7) The compound is C=C(C)c1ccc(O)cc1. The target protein sequence is MSFPATPDYTGLNKPVGQEVSIKGLKASEGTIPADVRGAFFRAVPDPQFPPFFHPDTALSDDGMISRVLFNADGTVDYDIRYVQTPRWKAERAAGKRLFGRYRNPYTNDPSAFDLEGTVSNTTPVWHA. The pIC50 is 4.9. (8) The small molecule is Cc1ccc(S(=O)(=O)Nc2ccc(C(=O)/C=C/c3ccc(O)cc3)cc2)cc1. 
The target protein sequence is MLAPGSSRVELFKRQSSKVPFEKDGKVTERVVHSFRLPALVNVDGVMVAIADARYETSNDNSLIDTVAKYSVDDGETWETQIAIKNSRASSVSRVVDPTVIVKGNKLYVLVGSYNSSRSYWTSHGDARDWDILLAVGEVTKSTAGGKITASIKWGSPVSLKEFFPAEMEGMHTNQFLGGAGVAIVASNGNLVYPVQVTNKKKQVFSKIFYSEDEGKTWKFGKGRSAFGCSEPVALEWEGKLIINTRVDYRRRLVYESSDMGNSWLEAVGTLSRVWGPSPKSNQPGSQSSFTAVTIEGMRVMLFTHPLNFKGRWLRDRLNLWLTDNQRIYNVGQVSIGDENSAYSSVLYKDDKLYCLHEINSNEVYSLVFARLVGELRIIKSVLQSWKNWDSHLSSICTPADPAASSSERGCGPAVTTVGLVGFLSHSATKTEWEDAYRCVNASTANAERVPNGLKFAGVGGGALWPVSQQGQNQRYRFANHAFTVVASVTIHEVPSVASP.... The pIC50 is 4.1. (9) The compound is CC(=O)c1ccc([C@H]2C[C@@]3(C)[C@@H](CC[C@@]3(O)C(F)(F)C(F)(F)F)[C@@H]3CCC4=CC(=O)CCC4=C32)cc1. The target protein (P06537) has sequence MDSKESLAPPGRDEVPSSLLGRGRGSVMDLYKTLRGGATVKVSASSPSVAAASQADSKQQRILLDFSKGSASNAQQQQQQQQPQPDLSKAVSLSMGLYMGETETKVMGNDLGYPQQGQLGLSSGETDFRLLEESIANLNRSTSRPENPKSSTPAAGCATPTEKEFPQTHSDPSSEQQNRKSQPGTNGGSVKLYTTDQSTFDILQDLEFSAGSPGKETNESPWRSDLLIDENLLSPLAGEDDPFLLEGDVNEDCKPLILPDTKPKIQDTGDTILSSPSSVALPQVKTEKDDFIELCTPGVIKQEKLGPVYCQASFSGTNIIGNKMSAISVHGVSTSGGQMYHYDMNTASLSQQQDQKPVFNVIPPIPVGSENWNRCQGSGEDNLTSLGAMNFAGRSVFSNGYSSPGMRPDVSSPPSSSSTATGPPPKLCLVCSDEASVCHYGVLTCGSCKVFFKRAVEGQHNYLCAGRNDCIIDKIRRKNCPACRYRKCLQAGMNLEARKT.... The pIC50 is 7.8. (10) The small molecule is Cc1nc2cnccc2n1-c1ccc(C2=Nc3c(CO)nn(C)c3N(C)C(=O)C2)cc1. The target protein (P21556) has sequence MELNSSSRVDSEFRYTLFPIVYSIIFVLGIIANGYVLWVFARLYPSKKLNEIKIFMVNLTVADLLFLITLPLWIVYYSNQGNWFLPKFLCNLAGCLFFINTYCSVAFLGVITYNRFQAVKYPIKTAQATTRKRGIALSLVIWVAIVAAASYFLVMDSTNVVSNKAGSGNITRCFEHYEKGSKPVLIIHICIVLGFFIVFLLILFCNLVIIHTLLRQPVKQQRNAEVRRRALWMVCTVLAVFVICFVPHHMVQLPWTLAELGMWPSSNHQAINDAHQVTLCLLSTNCVLDPVIYCFLTKKFRKHLSEKLNIMRSSQKCSRVTTDTGTEMAIPINHTPVNPIKN. The pIC50 is 7.8.
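For reference, the pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration only: the function names are hypothetical and are not part of the bindingdb_ic50 dataset or any BindingDB tooling, and the nanomolar helper assumes the input IC50 is given in nM.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nM (assumed unit) to pIC50; 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to pIC50 = 6.
    print(round(ic50_molar_to_pic50(1e-6), 3))
    # Example (1) above reports pIC50 = 6.9; recover the implied IC50 in M.
    print(pic50_to_ic50_molar(6.9))
```

Under this definition, the pIC50 of 6.9 in example (1) corresponds to an IC50 of about 1.26e-7 M (roughly 126 nM), and each one-unit increase in pIC50 corresponds to a tenfold lower IC50.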