This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(C)c1c(N(Cc2ccc(OC(F)(F)F)cc2)S(=O)(=O)c2ccc(C(=O)[O-])cc2)nc2ccccn12. The target protein sequence is MSFEGARLSMRNRRNGTLDSTRTLYSSTSRSTDVSYSESDLVNFIQANFKKRECVFFTKDSKATENVCKCGYAQSQHIEGTQINSNEKWNYKKHTKEFPTDAFGDIQFETLGKKGKYIRLSCDTDAETLYELLTQHWHLKTPNLVISVTGGAKNFALKPRMRKIFSRLIYIAQSKGAWILTGGTHYGLMKYIGEVVRDNTISRNSEENIVAIGIAAWGMVSNRDTLLRNCDAEGYFSAQYIMDDFKRDPLYILDNNHTHLLLVDNGCHGHPTVEAKLRNQLEKYISERTIQDSNYGGKIPIVCFAQGGGRETLKAINTSIKSKIPCVVVEGSGQIADVIASLVEVEDVLTSSVVKEKLVRFLPRTVSRLPEEETESWIKWLKEILESSHLLTVIKMEEAGDEIVSNAISYALYKAFSTNEQDKDNWNGQLKLLLEWNQLDLANEEIFTNDRRWESADLQEVMFTALIKDRPKFVRLFLENGLNLRKFLTNDVLTELFSNH.... The pIC50 is 8.6. (2) The drug is CC[C@]1(C)NC(=O)c2cc(S(=O)(=O)Nc3ccc(C(F)(F)F)cc3)ccc2NC1=O. The target protein (Q6E213) has sequence MLLPSKKDLKTALDVFAVFQWSFSALLITTTVIAVNLYLVVFTPYWPVTVLILTWLAFDWKTPQRGGRRFTCVRHWRLWKHYSDYFPLKLLKTHDICPSRNYILVCHPHGLFAHGWFGHFATEASGFSKIFPGITPYILTLGAFFWMPFLREYVMSTGACSVSRSSIDFLLTHKGTGNMVIVVIGGLAECRYSLPGSSTLVLKNRSGFVRMALQHGVPLIPAYAFGETDLYDQHIFTPGGFVNRFQKWFQSMVHIYPCAFYGRGFTKNSWGLLPYSRPVTTIVGEPLPMPKIENPSQEIVAKYHTLYIDALRKLFDQHKTKFGISETQELEII. The pIC50 is 4.0. (3) The drug is COc1ccc(CCNC(C)CCc2ccc(OC)cc2)cc1. The target is TRQARRNRRRRWRERQR. The pIC50 is 4.1. (4) The drug is NC(=O)c1ccc(Oc2ccc(C(N)=O)cc2)cc1. The target protein sequence is AGQTLKGPWNNLERLAENTGEFQEVVRAFYDTLDAARSSIRVVRVERVSHPLLQQQYELYRERLLQRCERRPVEQVLYHGTTAPAVPDICAHGFNRSFCGRNATVYGKGVYFARRASLSVQDAYSPPNADGHKAVFVARVLTGDYGQGRRGLRAPPLRGPGHVLLRYDSAVDCICQPSIFVIFHDTQALPTHLITCEHVPRASPDDPSG. The pIC50 is 6.2. (5) The compound is Cc1cn(CC(OCP(=O)(O)O)Oc2ccccc2)c(=O)[nH]c1=O. The pIC50 is 6.1. 
The target protein (Q5FVR2) has sequence MAAPGTPPPLAPETAGADSGGGSGEHRQLPELIRLKRNGGHLSEADIRNFVHALMDGRAQDTQIGAMLMAIRLQGMDLEETSVLTQALAESGQQLEWPKAWHQQLVDKHSTGGVGDKVSLVLAPALAACGCKVPMISGRSLGHTGGTLDKLESIPGFSVTQSPEQMLQILEEVGCCIVGQSEKLVPADGILYAARDVTATVDSVPLITASILSKKAVEGLSTLVVDVKFGGAAVFPDQEKARELAKMLVRVGMGLGLQVAAALTAMDNPLGRNVGHTLEVEEALLCLDGAGPPDLRDLVIRLGGAILWLSGQAETQDQGAARVAAALDDGSALHRFQLMLSAQGVDPGLARALCSGSPTQRRQLLPHARKQEELLSPADGIVECVRALPLACVLHELGAGRSRAGQPIRPGVGAELLVDVGQWLSRGTPWLRVHLDGPALSSQQRRTLLGALVLSDRAPFKAPSPFAELVLPPTTP. (6) The pIC50 is 8.5. The target protein sequence is APITAYAQQTRGLLGTIVTSLTGRDKNVVTGEVQVLSTTTQTFLGTTVGGVMWTVYHGAGSRTLAGAKHPALQMYTNVDQDLVGWPAPPGAKSLELCTCGSADLYLVTRDADVIPARRRGDSTASLLSPRPLACLKGSSGGPVMCPSGHVAGIFRAAVCTRGVAKALQFIPVETLSTQARSPSFSDNSTPPAVPQSYQVGYLHAPTGSGKSTKVPAAYVAQGYNVLVLNPSVAATLGFGSYMSRAHGIDPNIRTGNRTVTTGAKLTYSTYGKFLADGGCSGGAYDVIICDECHAQDATSILGIGTVLDQAETAGVRLTVLATATPPGSITVPHSNIEEVALGSEGGIPFYGKAIPIAQLEGGRHLIFCHSRKKCDELASKLRGMGLNAVAYYRGLDVSVIPTVGDVVVCATDALMTGFTGDFDSVIDCNVAVEQYVDFSLDPTFSIETRTAPQDAVSRSQRRGRTGRGRPSTYRYVTPGERPSGMFDSVVLCECYDAGCS.... The small molecule is C=C[C@@H]1C[C@]1(NC(=O)[C@@H]1C[C@@]2(CN1C(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@@H]1CCCCN1C(C)C)C1(C)CCCCC1)C(C)(C)C)C(C)(C)C21CCC1)C(=O)NS(=O)(=O)N(CC)CC. (7) The compound is CCCCCCCc1nc(C(=O)NCC(=O)O)c(O)c2ccccc12. The target protein sequence is MASESETLNPSARIMTFYPTMEEFRNFSRYIAYIESQGAHRAGLAKVVPPKEWKPRASYDDIDDLVIPAPIQQLVTGQSGLFTQYNIQKKAMTVREFRKIANSDKYCTPRYSEFEELERKYWKNLTFNPPIYGADVNGTLYEKHVDEWNIGRLRTILDLVEKESGITIEGVNTPYLYFGMWKTSFAWHTEDMDLYSINYLHFGEPKSWYSVPPEHGKRLERLAKGFFPGSAQSCEAFLRHKMTLISPLMLKKYGIPFDKVTQEAGEFMITFPYGYHAGFNHGFNCAESTNFATRRWIEYGKQAVLCSCRKDMVKISMDVFVRKFQPERYKLWKAGKDNTVIDHTLPTPEAAEFLKESELPPRAGNEEECPEEDMEGVEDGEEGDLKTSLAKHRIGTKRHRVCLEIPQEVSQSELFPKEDLSSEQYEMTECPAALAPVRPTHSSVRQVEDGLTFPDYSDSTEVKFEELKNVKLEEEDEEEEQAAAALDLSVNPASVGGRLV.... The pIC50 is 4.4.
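The definition used above, pIC50 = -log10(IC50 in M), can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset: the function names and the choice of nanomolar as the input unit are assumptions made for the example.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in M).

    The nanomolar input unit is an assumption for this sketch.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50), returned in nM."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # pIC50 values taken from the examples above.
    for pic50 in (8.6, 4.0, 6.2):
        print(f"pIC50 {pic50} corresponds to IC50 = {pic50_to_ic50_nm(pic50):.3g} nM")
```

Under this definition, the pIC50 of 8.6 in example (1) corresponds to an IC50 of roughly 2.5 nM, while the pIC50 of 4.0 in example (2) corresponds to 100 µM, which is why a higher value means a more potent compound.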