This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCC[C@@H]1C[C@@H](N2CCCC2)C[C@@]2(O1)C(=O)Nc1ccccc12. The target protein (O77759) has sequence MEPGGARLRLQRTEGPGGEREHQPCRDGNTETHRAPDLVKWTRHMEAVKAQLLEQAQGQLRELLDRAMWEAIQSYPSQDKPPPLPPPDSLSRTQEPSLGKQKVFIIRKSLLDELMEVQHFRTIYHMFIAGLCVFIISTLAIDFIDEGRLLLEFDLLIFSFGQLPLALVTWVPMFLSTLLAPYQALRLWARPGARGTWTLGAGLGCALLAAHALVLCALPVHVAVEHQLPPASRCVLVFEQVRFLMKSYSFLREAVPGTLRARRGEGIQAPSFSSYLYFLFCPTLIYRETYPRTPYIRWNYVAKNFAQALGCVLYACFILGRLCVPVFANMSREPFSTRALVLSILHATLPGIFMLLLIFFAFLHCWLNAFAEMLRFGDRMFYRDWWNSTSFSNYYRTWNVVVHDWLYSYVYQDGLWLLGAQARGVAMLGVFLVSAVAHEYIFCFVLGFFYPVMLILFLVIGGMLNFMMHDQHTGPAWNVLMWTMLFLGQGIQVSLYCQEW.... The pIC50 is 4.6. (2) The drug is CC(=O)c1cc(-c2ccc(N(C)C)cc2)c2ccccn12. The target protein (Q9UIF9) has sequence MEMEANDHFNFTGLPPAPAASGLKPSPSSGEGLYTNGSPMNFPQQGKSLNGDVNVNGLSTVSHTTTSGILNSAPHSSSTSHLHHPSVAYDCLWNYSQYPSANPGSNLKDPPLLSQFSGGQYPLNGILGGSRQPSSPSHNTNLRAGSQEFWANGTQSPMGLNFDSQELYDSFPDQNFEVMPNGPPSFFTSPQTSPMLGSSIQTFAPSQEVGSGIHPDEAAEKEMTSVVAENGTGLVGSLELEEEQPELKMCGYNGSVPSVESLHQEVSVLVPDPTVSCLDDPSHLPDQLEDTPILSEDSLEPFNSLAPEPVSGGLYGIDDTELMGAEDKLPLEDSPVISALDCPSLNNATAFSLLADDSQTSTSIFASPTSPPVLGESVLQDNSFDLNNGSDAEQEEMETQSSDFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLVVSPAASPEISPEVCPAASTVVSPAVFSVVSPASSAVLPAVSLEVPLTASVTSPKASPVTS.... The pIC50 is 6.0. (3) The small molecule is NC(=O)c1cn(-c2ccc(S(N)(=O)=O)cc2)nc1-c1ccccc1. The target is PDASQDDGPAVERPSTEL. The pIC50 is 4.1. (4) The drug is NC(=O)CC1COc2cc(F)ccc2N1C(=O)c1ccc2c(c1)NC(=O)CO2. 
The target protein sequence is TISRALTPSPVMVLENIEPEIVYAGYDSSKPDTAENLLSTLNRLAGKQMIQVVKWAKVLPGFKNLPLEDQITLIQYSWMCLSSFALSWRSYKHTNSQFLYFAPDLVFNEEKMHQSAMYELCQGMHQISLQFVRLQLTFEEYTIMKVLLLLSTIPKDGLKSQAAFEEMRTNYIKELRKMVTKCPNNSGQSWQRFYQLTKLLDSMHDLVSDLLEFCFYTFRESHALKVEFPAMLVEIISDQLPKVESGNAKPLYFHRK. The pIC50 is 6.6. (5) The small molecule is CCCOc1cc(C2(C)CNC(=O)O2)ccc1OC. The target protein sequence is MEKLSYHSICTSEEWQGLMQFTLPVRLCKEIELFHFDIGPFENMWPGIFVYMVHRSCGTSCFELEKLCRFIMSVKKNYRRVPYHNWKHAVTVAHCMYAILQNNHTLFTDLERKGLLIACLCHDLDHRGFSNSYLQKFDHPLAALYSTSTMEQHHFSQTVSILQLEGHNIFSTLSSSEYEQVLEIIRKAIIATDLALYFGNRKQLEEMYQTGSLNLNNQSHRDRVIGLMMTACDLCSVTKLWPVTKLTANDIYAEFWAEGDEMKKLGIQPIPMMDRDKKDEVPQGQLGFYNAVAIPCYTTLTQILPPTEPLLKACRDNLSQWEKVIRGEETATWISSPSVAQKAAASED. The pIC50 is 4.2. (6) The compound is CC(C)=CCC/C(C)=C/CC/C(C)=C/CC/C=C(\C)CC/C=C(\C)CCCO. The target protein sequence is MWTFLGIATFTYFYKKCGDFVSLANKELLLGVLVFLSLGLVLSYRCRYRNGALLGRQQSGSQFAVFSDILSALPLIGFFWAKSPTGSEKKEQLGSRRGKKGSNISETTLVGAAASPLISSQNDPEIIIVGSGVLGSALAAVLSRDGRKVTVIERDLKEPDRILGEYLQPGGCHVLKDLGLEDTMEGIDAQVVDGYIIHDQESKSEVQIPFPLSENNHVQSGRAFRHGRFIMGLRKAAMAEPNAKFIEGTVLQLLEEEDVVLGVQYRDKETGDIKELHAPLTIVADGLFSKFRKNLISNKVSVSSHFVGFLMENAPQFKANHAELVLANPSPVLIYQISPSETRVLVDIRGEMPRNLREYMIENIYPQLPDHLKEPFLEASQNSHLRSMPASFLPSSPVNKRGVLLLGDAHNMRHPLTGGGMTVAFNDIKLWRKLLKGIPDLYDDAAILQAKKSFYWTRKMSHSFVVNVLAQALYELFSATDDSLYQLRKACFFYFKLGGE.... The pIC50 is 5.4. (7) The target protein (Q15418) has sequence MPLAQLKEPWPLMELVPLDPENGQTSGEEAGLQPSKDEGVLKEISITHHVKAGSEKADPSHFELLKVLGQGSFGKVFLVRKVTRPDSGHLYAMKVLKKATLKVRDRVRTKMERDILADVNHPFVVKLHYAFQTEGKLYLILDFLRGGDLFTRLSKEVMFTEEDVKFYLAELALGLDHLHSLGIIYRDLKPENILLDEEGHIKLTDFGLSKEAIDHEKKAYSFCGTVEYMAPEVVNRQGHSHSADWWSYGVLMFEMLTGSLPFQGKDRKETMTLILKAKLGMPQFLSTEAQSLLRALFKRNPANRLGSGPDGAEEIKRHVFYSTIDWNKLYRREIKPPFKPAVAQPDDTFYFDTEFTSRTPKDSPGIPPSAGAHQLFRGFSFVATGLMEDDGKPRAPQAPLHSVVQQLHGKNLVFSDGYVVKETIGVGSYSECKRCVHKATNMEYAVKVIDKSKRDPSEEIEILLRYGQHPNIITLKDVYDDGKHVYLVTELMRGGELLDK.... The pIC50 is 9.0. 
The drug is CCn1c(-c2nonc2N)nc2c(C#CCO)ncc(OCCCN)c21. (8) The drug is CNc1cc(Cn2nc(C(=O)O)cc2C)c2nc(N3CC=CCC3)sc2c1. The pIC50 is 5.1. The target protein (P70597) has sequence MSPYGLNLSLVDEATTCVTPRVPNTSVVLPTGGNGTSPALPIFSMTLGAVSNVLALALLAQVAGRLRRRRSTATFLLFVASLLAIDLAGHVIPGALVLRLYTAGRAPAGGACHFLGGCMVFFGLCPLLLGCGMAVERCVGVTQPLIHAARVSVARARLALALLAAMALAVALLPLVHVGHYELQYPGTWCFISLGPPGGWRQALLAGLFAGLGLAALLAALVCNTLSGLALLRARWRRRRSRRFRENAGPDDRRRWGSRGLRLASASSASSITSTTAALRSSRGGGSARRVHAHDVEMVGQLVGIMVVSCICWSPLLVLVVLAIGGWNSNSLQRPLFLAVRLASWNQILDPWVYILLRQAMLRQLLRLLPLRVSAKGGPTELSLTKSAWEASSLRSSRHSGFSHL.
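
The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is only a minimal illustration of the unit conversion, not part of the dataset or of any BindingDB tooling; the function names `ic50_molar_to_pic50` and `pic50_to_ic50_molar` are hypothetical and chosen here for clarity.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from the records above: 4.6 (record 1) and 9.0 (record 7).
    for label in (4.6, 9.0):
        ic50_nm = pic50_to_ic50_molar(label) * 1e9  # mol/L -> nmol/L
        print(f"pIC50 {label} corresponds to an IC50 of about {ic50_nm:.3g} nM")
```

Under this definition, the pIC50 of 9.0 in record (7) corresponds to an IC50 of about 1 nM, while the pIC50 of 4.6 in record (1) corresponds to roughly 25 µM, which is why a higher value means a more potent compound.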