Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. From a dataset of Drug-target binding data from BindingDB using IC50 measurements. (1) The drug is CC(=O)CN(Cc1ccc2nc(C)[nH]c(=O)c2c1)c1ccc(C(=O)N[C@@H](CCC(=O)O)C(=O)O)cc1. The target protein (P07607) has sequence MLVVGSELQSDAQQLSAEAPRHGELQYLRQVEHILRCGFKKEDRTGTGTLSVFGMQARYSLRDEFPLLTTKRVFWKGVLEELLWFIKGSTNAKELSSKGVRIWDANGSRDFLDSLGFSARQEGDLGPVYGFQWRHFGAEYKDMDSDYSGQGVDQLQKVIDTIKTNPDDRRIIMCAWNPKDLPLMALPPCHALCQFYVVNGELSCQLYQRSGDMGLGVPFNIASYALLTYMIAHITGLQPGDFVHTLGDAHIYLNHIEPLKIQLQREPRPFPKLKILRKVETIDDFKVEDFQIEGYNPHPTIKMEMAV. The pIC50 is 4.6. (2) The small molecule is CC1(C)CC2(CC(C)(C)c3c(Br)c(OS(=O)(=O)[O-])c(OS(=O)(=O)[O-])c(Br)c32)c2c(Br)c(OS(=O)(=O)[O-])c(OS(=O)(=O)[O-])c(Br)c21. The target protein sequence is MKRFFFYKDGELFHDRKCTRRVSRSNATYEVLRHVRIPAHLHDVRVLEQTYEQAQRGLVFVGLDSKQRAQYFYGKLHVQRRHERRDAIFVRVFGIMDGLRRFIDRHLDSPEAARDSKTQLAIFLLMETSFYIRTGKMRYFRENETVGMLTLKNRHLHEEDGLLAIRFVGKDQVTHEFRVHAEDRLYVPLMRLRDADAPEAFLFSRLSERVVYRFLRTFGVRVKDLRTYGVNVTFLSSLWTNVKELPALPSARKLVALSIRQTAEAIGHSPHIARQAYMALTVLELARDASVFEHIRTLSRDEFLAFIVDYVKRRARARCPETG. The pIC50 is 4.1. (3) The drug is COc1cc(NC(=O)c2ccc(-c3ccccc3OC(F)(F)F)c(CO)c2)ccc1S(N)(=O)=O. The target protein (P22199) has sequence METKGYHSLPEGLDMERRWSQVSQTLERSSLGPAERTTENNYMEIVNVSCVSGAIPNNSTQGSSKEKHELLPYIQQDNSRSGILPSDIKTELESKELSATVAESMGLYMDSVRDAEYTYDQQNQQGSLSPTKIYQNMEQLVKFYKENGHRSSTLSAMSRPLRSFMPDSAASMNGGALRAIVKSPIICHEKSSSVSSPLNMASSVCSPVGINSMSSSTTSFGSFPVHSPITQGTSLTCSPSVENRGSRSHSPTHASNVGSPLSSPLSSMKSPISSPPSHCSVKSPVSSPNNVPLRSSVSSPANLNNSRCSVSSPSNNTNNRSTLSSPTASTVGSIGSPISNAFSYATSGASAGAGAIQDVVPSPDTHEKGAHDVPFPKTEEVEKAISNGVTGPLNIVQYIKSEPDGAFSSSCLGGNSKISPSSPFSVPIKQESSKHSCSGASFKGNPTVNPFPFMDGSYFSFMDDKDYYSLSGILGPPVPGFDGSCEDSAFPVGIKQEPDD.... The pIC50 is 7.2. 
(4) The small molecule is O=S(=O)(O)c1ccc2cc(N=Nc3cc(S(=O)(=O)O)c4cccnc4c3O)ccc2c1. The target protein (Q12913) has sequence MKPAAREARLPPRSPGLRWALPLLLLLLRLGQILCAGGTPSPIPDPSVATVATGENGITQISSTAESFHKQNGTGTPQVETNTSEDGESSGANDSLRTPEQGSNGTDGASQKTPSSTGPSPVFDIKAVSISPTNVILTWKSNDTAASEYKYVVKHKMENEKTITVVHQPWCNITGLRPATSYVFSITPGIGNETWGDPRVIKVITEPIPVSDLRVALTGVRKAALSWSNGNGTASCRVLLESIGSHEELTQDSRLQVNISGLKPGVQYNINPYLLQSNKTKGDPLGTEGGLDASNTERSRAGSPTAPVHDESLVGPVDPSSGQQSRDTEVLLVGLEPGTRYNATVYSQAANGTEGQPQAIEFRTNAIQVFDVTAVNISATSLTLIWKVSDNESSSNYTYKIHVAGETDSSNLNVSEPRAVIPGLRSSTFYNITVCPVLGDIEGTPGFLQVHTPPVPVSDFRVTVVSTTEIGLAWSSHDAESFQMHITQEGAGNSRVEITT.... The pIC50 is 4.2. (5) The drug is O=C(Nc1cc([N+](=O)[O-])ccc1O)c1cccc(C(=O)Nc2cc([N+](=O)[O-])ccc2O)c1. The target protein (P11940) has sequence MNPSAPSYPMASLYVGDLHPDVTEAMLYEKFSPAGPILSIRVCRDMITRRSLGYAYVNFQQPADAERALDTMNFDVIKGKPVRIMWSQRDPSLRKSGVGNIFIKNLDKSIDNKALYDTFSAFGNILSCKVVCDENGSKGYGFVHFETQEAAERAIEKMNGMLLNDRKVFVGRFKSRKEREAELGARAKEFTNVYIKNFGEDMDDERLKDLFGKFGPALSVKVMTDESGKSKGFGFVSFERHEDAQKAVDEMNGKELNGKQIYVGRAQKKVERQTELKRKFEQMKQDRITRYQGVNLYVKNLDDGIDDERLRKEFSPFGTITSAKVMMEGGRSKGFGFVCFSSPEEATKAVTEMNGRIVATKPLYVALAQRKEERQAHLTNQYMQRMASVRAVPNPVINPYQPAPPSGYFMAAIPQTQNRAAYYPPSQIAQLRPSPRWTAQGARPHPFQNMPGAIRPAAPRPPFSTMRPASSQVPRVMSTQRVANTSTQTMGPRPAAAAAA.... The pIC50 is 5.0. (6) The drug is C[C@]12C[C@H](O)[C@H]3[C@@H](CCC4=CC(=O)CC[C@@]43C)[C@@H]1CC[C@]2(O)C(=O)COC(=O)c1cc([N+](=O)[O-])ccc1Cl. The target protein sequence is MLEPLRLSQLTVALDARLIGEDAVFSAVSTDSRAIGPGQLFIALSGPRFDGHDYLAEVAAKGAVAALVEREVADAPLPQLLVRDTRAALGRLGALNRRKFTGPLAAMTGSSGKTAVKEMLASILRTQAGDAESVLATRGNLNNDLGVPLTLLQLAPQHRSAVIELGASRIGEIAYTVELTRPHVAIITNAGTAHVGEFGGPEKIVEAKGEILEGLAADGTAVLNLDDKAFDTWKARASGRPLLTFSLDRPQADFRAADLQRDARGCMGFRLQGVAGEAQVQLNLLGRHNVANALAAAAAAHALGVPLDGIVAGLQALQPVKGRAVAQLTASGLRVIDDSYNANPASMLAAIDILSGFSGRTVLVLGDMGELGSWAEQAHREVGAYAAGKVSALYAVGPLMAHAVQAFGATGRHFADQASLIGALATEQPTTTILIKGSRSAAMDKVVAALCGSSEESH. The pIC50 is 2.3. 
(7) The compound is O=C(NC1CCOCC1)c1ccc(-c2cc(-c3c[nH]nc3-c3ccccn3)ccn2)cc1. The target protein (P01137) has sequence MPPSGLRLLLLLLPLLWLLVLTPGRPAAGLSTCKTIDMELVKRKRIEAIRGQILSKLRLASPPSQGEVPPGPLPEAVLALYNSTRDRVAGESAEPEPEPEADYYAKEVTRVLMVETHNEIYDKFKQSTHSIYMFFNTSELREAVPEPVLLSRAELRLLRLKLKVEQHVELYQKYSNNSWRYLSNRLLAPSDSPEWLSFDVTGVVRQWLSRGGEIEGFRLSAHCSCDSRDNTLQVDINGFTTGRRGDLATIHGMNRPFLLLMATPLERAQHLQSSRHRRALDTNYCFSSTEKNCCVRQLYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLDTQYSKVLALYNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS. The pIC50 is 6.3. (8) The drug is CCOc1cc2oc(-c3ccc(F)cc3)c(C(=O)NC)c2cc1-c1cnc(OC)c(C(=O)NC23CC4CC42C3)c1. The target protein (Q9WMX2) has sequence MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKARQPEGRAWAQPGYPWPLYGNEGLGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLEDGVNYATGNLPGCSFSIFLLALLSCLTIPASAYEVRNVSGVYHVTNDCSNASIVYEAADMIMHTPGCVPCVRENNSSRCWVALTPTLAARNASVPTTTIRRHVDLLVGAAALCSAMYVGDLCGSVFLVAQLFTFSPRRHETVQDCNCSIYPGHVTGHRMAWDMMMNWSPTAALVVSQLLRIPQAVVDMVAGAHWGVLAGLAYYSMVGNWAKVLIVMLLFAGVDGGTYVTGGTMAKNTLGITSLFSPGSSQKIQLVNTNGSWHINRTALNCNDSLNTGFLAALFYVHKFNSSGCPERMASCSPIDAFAQGWGPITYNESHSSDQRPYCWHYAPRPCGIVPAA.... The pIC50 is 8.6.
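The pIC50 definition used in the task statement, pIC50 = -log10(IC50 in M), can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset or its tooling: the function names `ic50_to_pic50`, `pic50_to_ic50`, and `ic50_nm_to_pic50` are hypothetical, and the nanomolar helper assumes the input IC50 is expressed in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit).

    1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


if __name__ == "__main__":
    # Labels taken from examples (1) and (8) above.
    for label in (4.6, 8.6):
        print(f"pIC50 {label} corresponds to IC50 = {pic50_to_ic50(label):.3g} M")
```

Read this way, the label 4.6 in example (1) corresponds to an IC50 of roughly 25 µM, while the label 8.6 in example (8) corresponds to roughly 2.5 nM, which is why a higher pIC50 means a more potent compound.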