Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(C)c1ccc(CCN(C)C2CCN(c3nc(N)n[nH]3)CC2)cc1. The target protein (Q9D7Q1) has sequence MVQSLAWAGVMTLLMVQWGSAAKLVCYLTNWSQYRTEAVRFFPRDVDPNLCTHVIFAFAGMDNHQLSTVEHNDELLYQELNSLKTKNPKLKTLLAVGGWTFGTQKFTDMVATASNRQTFVKSALSFLRTQGFDGLDLDWEFPGGRGSPTVDKERFTALIQDLAKAFQEEAQSSGKERLLLTAAVPSDRGLVDAGYEVDKIAQSLDFINLMAYDFHSSLEKTTGHNSPLYKRQGESGAAAEQNVDAAVTLWLQKGTPASKLILGMPTYGRSFTLASSSDNGVGAPATGPGAPGPYTKDKGVLAYYEACSWKERHRIEDQKVPYAFQDNQWVSFDDVESFKAKAAYLKQKGLGGAMVWVLDLDDFKGSFCNQGPYPLIRTLRQELNLPSETPRSPEQIIPEPRPSSMPEQGPSPGLDNFCQGKADGVYPNPGDESTYYNCGGGRLFQQSCPPGLVFRASCKCCTWS. The pIC50 is 5.2. (2) The compound is CN[C@@H]1CSSC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@@H]([C@@H](C)O)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@@H]([C@@H](C)O)NC(=O)[C@H](Cc2ccc(CNC(C)C)cc2)NC(=O)[C@H](Cc2ccc3ccccc3c2)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@@H](Cc2ccccc2)NC(=O)[C@@H](CCCCN)NC1=O. The target protein (P31391) has sequence MSAPSTLPPGGEEGLGTAWPSAANASSAPAEAEEAVAGPGDARAAGMVAIQCIYALVCLVGLVGNALVIFVILRYAKMKTATNIYLLNLAVADELFMLSVPFVASSAALRHWPFGSVLCRAVLSVDGLNMFTSVFCLTVLSVDRYVAVVHPLRAATYRRPSVAKLINLGVWLASLLVTLPIAIFADTRPARGGQAVACNLQWPHPAWSAVFVVYTFLLGFLLPVLAIGLCYLLIVGKMRAVALRAGWQQRRRSEKKITRLVLMVVVVFVLCWMPFYVVQLLNLFVTSLDATVNHVSLILSYANSCANPILYGFLSDNFRRFFQRVLCLRCCLLEGAGGAEEEPLDYYATALKSKGGAGCMCPPLPCQQEALQPEPGRKRIPLTRTTTF. The pIC50 is 6.0. (3) The compound is O=C(O)c1cc(-c2ccc(Cl)c(Cl)c2)ncn1. 
The target protein (Q91WN4) has sequence MASSDTQGKRVAVIGGGLVGALNACFLAKRNFQVDVYEAREDIRVAKSARGRSINLALSYRGRQALKAIGLEDQIVSKGVPMKARMIHSLSGKKSAIPYGNKSQYILSISRENLNKDLLTAVESYANAKVHFGHKLSKCIPEEGVLTVLGPDKVPRDVTCDLVVGCDGAYSTVRAHLMKKPRFDYTQQYIPHGYMELTIPPKNGEYAMEPNCLHIWPRNAYMMIALPNMDKSFTCTLFMPFEEFERLPTRSDVLDFFQKNFPDAIPLMGEQALMRDFFLLPAQPMISVKCSPFHLKSHCVLMGDAAHAIVPFFGQGMNAGFEDCLVFDELMDKFNNNLSMCLPEFSRFRIPDDHAISDLSMYNYIEMRAHVNSRWFLFQKLLDKFLHAIMPSTFIPLYTMVAFTRIRYHEAVLRWHWQKKVINRGLFVLGSLIAIGGTYLLVHHLSLRPLEFLRRPAWMGTTGYWTRSTDISLQVPWSY. The pIC50 is 8.2. (4) The compound is COc1ccccc1NC1=NC(c2ccc(Cl)cc2)N=C(N)N1. The target is TRQARRNRRRRWRERQR. The pIC50 is 4.1. (5) The drug is NC(=NCCC[C@H](N)C[C@@H](O)CCCO)N[N+](=O)[O-]. The target protein (P29473) has sequence MGNLKSVGQEPGPPCGLGLGLGLGLCGKQGPASPAPEPSRAPAPATPHAPDHSPAPNSPTLTRPPEGPKFPRVKNWELGSITYDTLCAQSQQDGPCTPRCCLGSLVLPRKLQTRPSPGPPPAEQLLSQARDFINQYYSSIKRSGSQAHEERLQEVEAEVASTGTYHLRESELVFGAKQAWRNAPRCVGRIQWGKLQVFDARDCSSAQEMFTYICNHIKYATNRGNLRSAITVFPQRAPGRGDFRIWNSQLVRYAGYRQQDGSVRGDPANVEITELCIQHGWTPGNGRFDVLPLLLQAPDEAPELFVLPPELVLEVPLEHPTLEWFAALGLRWYALPAVSNMLLEIGGLEFSAAPFSGWYMSTEIGTRNLCDPHRYNILEDVAVCMDLDTRTTSSLWKDKAAVEINLAVLHSFQLAKVTIVDHHAATVSFMKHLDNEQKARGGCPADWAWIVPPISGSLTPVFHQEMVNYILSPAFRYQPDPWKGSATKGAGITRKKTFKE.... The pIC50 is 2.6. (6) The drug is Cc1ccc(OC(=O)c2ccccc2)c(C(=O)c2ccccc2)c1. The target protein (Q7T3S7) has sequence MKTLWIVAVWLIAVEGNLYQFGRMIWNRTGKLPILSYGSYGCYCGWGGQGPPKDATDRCCLVHDCCYTRVGDCSPKMTLYSYRFENGDIICDNKDPCKRAVCECDREAAICLGENVNTYDKKYKSYEDCTEEVQEC. The pIC50 is 4.0. (7) The drug is C[C@]12CC[C@H]3[C@@H](CC[C@H]4CC(=O)CC[C@@]43C)[C@@H]1CC[C@@H]2O. 
The target protein (P08689) has sequence MEKGEVASLRCRLLLLLLLLTLPPTHQGRTLRHIDPIQSAQDSPAKYLSNGPGQEPVTVLTIDLTKISKPSSSFEFRTWDPEGVIFYGDTNTEDDWFMLGLRDGQLEIQLHNLWARLTVGFGPRLNDGRWHPVELKMNGDSLLLWVDGKEMLCLRQVSASLADHPQLSMRIALGGLLLPTSKLRFPLVPALDGCIRRDIWLGHQAQLSTSARTSLGNCDVDLQPGLFFPPGTHAEFSLQDIPQPHTDPWTFSLELGFKLVDGAGRLLTLGTGTNSSWLTLHLQDQTVVLSSEAEPKLALPLAVGLPLQLKLDVFKVALSQGPKMEVLSTSLLRLASLWRLWSHPQGHLSLGALPGEDSSASFCLSDLWVQGQRLDIDKALSRSQDIWTHSCPQSPSNDTHTSH. The pIC50 is 7.6. (8) The drug is CC(C)c1ccccc1-c1ncc2[nH]c(=O)n(Cc3ccc(-c4nc(C(F)(F)F)nn4C)cc3)c2n1. The target protein (O94782) has sequence MPGVIPSESNGLSRGSPSKKNRLSLKFFQKKETKRALDFTDSQENEEKASEYRASEIDQVVPAAQSSPINCEKRENLLPFVGLNNLGNTCYLNSILQVLYFCPGFKSGVKHLFNIISRKKEALKDEANQKDKGNCKEDSLASYELICSLQSLIISVEQLQASFLLNPEKYTDELATQPRRLLNTLRELNPMYEGYLQHDAQEVLQCILGNIQETCQLLKKEEVKNVAELPTKVEEIPHPKEEMNGINSIEMDSMRHSEDFKEKLPKGNGKRKSDTEFGNMKKKVKLSKEHQSLEENQRQTRSKRKATSDTLESPPKIIPKYISENESPRPSQKKSRVKINWLKSATKQPSILSKFCSLGKITTNQGVKGQSKENECDPEEDLGKCESDNTTNGCGLESPGNTVTPVNVNEVKPINKGEEQIGFELVEKLFQGQLVLRTRCLECESLTERREDFQDISVPVQEDELSKVEESSEISPEPKTEMKTLRWAISQFASVERIVG.... The pIC50 is 7.3. (9) The small molecule is O=C(C(=S)N1CCOCC1)c1ccccc1Br. The target protein (O43175) has sequence MAFANLRKVLISDSLDPCCRKILQDGGLQVVEKQNLSKEELIAELQDCEGLIVRSATKVTADVINAAEKLQVVGRAGTGVDNVDLEAATRKGILVMNTPNGNSLSAAELTCGMIMCLARQIPQATASMKDGKWERKKFMGTELNGKTLGILGLGRIGREVATRMQSFGMKTIGYDPIISPEVSASFGVQQLPLEEIWPLCDFITVHTPLLPSTTGLLNDNTFAQCKKGVRVVNCARGGIVDEGALLRALQSGQCAGAALDVFTEEPPRDRALVDHENVISCPHLGASTKEAQSRCGEEIAVQFVDMVKGKSLTGVVNAQALTSAFSPHTKPWIGLAEALGTLMRAWAGSPKGTIQVITQGTSLKNAGNCLSPAVIVGLLKEASKQADVNLVNAKLLVKEAGLNVTTSHSPAAPGEQGFGECLLAVALAGAPYQAVGLVQGTTPVLQGLNGAVFRPEVPLRRDLPLLLFRTQTSDPAMLPTMIGLLAEAGVRLLSYQTSLV.... The pIC50 is 3.8. (10) The drug is Nc1ccc(S(=O)(=O)c2ccccc2)cc1. 
The target protein sequence is MDIIEESNKCKENNKGNIVVLNFGTTDKTNAVTILETALYLTEKYIGKIINTSYMYETVPEYVVLDKSDIPKNIIGEDDPYDVSSLNDLVKGLEKSKYENVFQGEENLVSQCEYERFLNNKDLFENKIKQISTEKYESETSNIIKENDEIMKINLEKHKNKYYTSYFYNLVVVFKCFIDDPLNLLVILKYIEHLMKRKNSKEVEKFENRLIDIDILFFNNYTIFEKNINLTKNDLYTIMCKYINIEYDNSSSDNCNKLSRNIEEIKDNIKFLSIPHVYTKHRYSILLCLNDIMPNYKHNALKETINKLHEEFITSFSKLYNTCIKKYNKRLYVLKNEVLCLKEKTNIVGILNTNYNSFSDGGLFVKPNIAVHRMFQMIKEGVDIIDIGGESSAPFVSHNPEIKERDLVIPVLELFEQEWNKMLQIRENGMEKQKDKLNQNDLSLQKKTSTIYKPPISIDTMNYDLFKECVDKNLVDILNDISACTNDPKIIKLLKKKN. The pIC50 is 4.0.
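The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `ic50_to_pic50` and `pic50_to_ic50` and the unit table `_TO_MOLAR` are hypothetical helpers introduced here, not part of BindingDB or of the dataset's own tooling.

```python
import math

# Multipliers that convert a concentration in the given unit to molar (M).
# Assumption: these are the units the raw IC50 values might be reported in.
_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 in M) for an IC50 given in `unit`."""
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50 * _TO_MOLAR[unit])


def pic50_to_ic50(pic50, unit="nM"):
    """Invert the transform: return the IC50 in `unit` for a given pIC50."""
    return 10.0 ** (-pic50) / _TO_MOLAR[unit]


if __name__ == "__main__":
    # Example: convert the pIC50 of record (1) back to a micromolar IC50.
    print(round(pic50_to_ic50(5.2, unit="uM"), 2))
```

Under this definition, a pIC50 of 5.2 corresponds to an IC50 of roughly 6.3 µM, and each additional pIC50 unit means a tenfold lower IC50, which is why higher values indicate a more potent compound.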