This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@H](C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](Cc1ccccc1)C(=O)O)[C@@H](C)O. The target protein (P07742) has sequence MHVIKRDGRQERVMFDKITSRIQKLCYGLNMDFVDPAQITMKVIQGLYSGVTTVELDTLAAETAATLTTKHPDYAILAARIAVSNLHKETKKVFSDVMEDLYNYINPHNGRHSPMVASSTLDIVMANKDRLNSAIIYDRDFSYNYFGFKTLERSYLLKINGKVAERPQHMLMRVSVGIHKEDIDAAIETYNLLSEKWFTHASPTLFNAGTNRPQLSSCFLLSMKDDSIEGIYDTLKQCALISKSAGGIGVAVSCIRATGSYIAGTNGNSNGLVPMLRVYNNTARYVDQGGNKRPGAFAIYLEPWHLDIFEFLDLKKNTGKEEQRARDLFFALWIPDLFMKRVETNQDWSLMCPNECPGLDEVWGEEFEKLYESYEKQGRVRKVVKAQQLWYAIIESQTETGTPYMLYKDSCNRKSNQQNLGTIKCSNLCTEIVEYTSKDEVAVCNLASLALNMYVTPEHTYDFEKLAEVTKVIVRNLNKIIDINYYPIPEAHLSNKRHRP.... The pIC50 is 4.6. (2) The compound is O=C1c2cccc3c(NCCO)ccc(c23)C(=O)N1c1ccccc1. The target protein sequence is MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKDRRSTGKSWGKPGYPWPLYGNEGCGWAGWLLSPRGSRPTWGPTDPRHRSRNLGRVIDTITCGFADLMGYIPVVGAPVGGVARALAHGVRVLEDGINYATGNLPGCSFSIFLLALLSCVTVPVSAVEVRNISSSYYATNDCSNNSITWQLTDAVLHLPGCVPCENDNGTLHCWIQVTPNVAVKHRGALTRSLRTHVDMIVMAATACSALYVGDVCGAVMILSQAFMVSPQRHNFTQECNCSIYQGHITGHRMAWDMMLSWSPTLTMILAYAARVPELVLEIIFGGHWGVVFGLAYFSMQGAWAKVIAILLLVAGVDATTYSSGQEAGRTVAGFAGLFTTGAKQNLYLINTNGSWHINRTALNCNDSLQTGFLASLFYTHKFNSSGCPERLSSCRGLDDFRIGWGTLEYETNVTNDGDMRPYCWHYPPRPCGIVP.... The pIC50 is 4.3. (3) The drug is CCCCCCCCCCCCCCCCOCCCOP(=O)(O)CO[C@H](C)Cn1cnc2c(N)ncnc21. 
The target protein sequence is PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTELEKEGKISKIGPENPYNTPVFAIKKKNSTRWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWGLFTPDEKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTNRGRQKVVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQ.... The pIC50 is 8.4. (4) The target protein (Q3ZAV1) has sequence MAFPELLDRVGGRGRFQLLQAVALVTPILWVTTQNMLENFSAAVPHHRCWVPLLDNSTSQASIPGDFGRDVLLAVSIPPGPDQRPHQCLRFRQPQWQLIESNTTATNWSDADTEPCEDGWVYDHSTFRSTIVTTWDLVCDSQALRPMAQSIFLAGILVGAAVCGHASDRFGRRRVLTWSYLLVSVSGTIAALMPTFPLYCLFRFLVASAVAGVMMNTASLLMEWTSAQAGPLMMTLNALGFSFGQVLTGSVAYGVRSWRMLQLAVSAPFFLFFVYSWWLPESARWLITVGRLDQSLRELQRVAAVNRRKAEADTLTVEVLRSAMQEEPNGNQAGARLGTLLHTPGLRLRTFISMLCWFAFGFTFYGLALDLQALGSNIFLLQALIGIVDLPVKMGSLLLLSRLGRRLCQASSLVLPGLCILANILVPREMGILRSSLAVLGLGSLGAAFTCVTIFSSELFPTVIRMTAVGLGQVAARGGAMLGPLVRLLGVYGSWLPLLV.... The pIC50 is 4.7. The compound is O=c1c(O)c(-c2ccc(O)cc2O)oc2cc(O)cc(O)c12. (5) The small molecule is NC(=O)c1ccc(Oc2cc(NC(=O)N3CCC(O)(c4ccc(F)cc4)CC3)cc(Oc3ccc(F)cc3)c2)cc1. The target protein (P47752) has sequence MGGLYSEYLNPEKVQEHYNYTKETLDMQETPSRKVASAFIIILCCAIVVENLLVLIAVARNSKFHSAMYLFLGNLAASDLLAGVAFVANTLLSGPVTLSLTPLQWFAREGSAFITLSASVFSLLAIAIERQVAIAKVKLYGSDKSCRMLMLIGASWLISLILGGLPILGWNCLDHLEACSTVLPLYAKHYVLCVVTIFSVILLAIVALYVRIYFVVRSSHADVAGPQTLALLKTVTIVLGVFIICWLPAFSILLLDSTCPVRACPVLYKAHYFFAFATLNSLLNPVIYTWRSRDLRREVLRPLLCWRQGKGATGRRGGNPGHRLLPLRSSSSLERGLHMPTSPTFLEGNTVV. The pIC50 is 7.8.
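
The pIC50 definition stated in the task description, pIC50 = -log10(IC50 in M), can be illustrated with a minimal Python sketch. The function names below (ic50_to_pic50, pic50_to_ic50, ic50_nm_to_pic50) are hypothetical helpers introduced only for illustration and are not part of the dataset or of BindingDB; the nanomolar variant assumes the raw IC50 is reported in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nanomolar: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit): 1 nM = 1e-9 M."""
    return ic50_to_pic50(ic50_nanomolar * 1e-9)


if __name__ == "__main__":
    # Labels taken from examples (1) and (3) above, converted back to molar IC50.
    for label in (4.6, 8.4):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50(label):.3g} M")
```

Under this definition, the label 4.6 in example (1) corresponds to an IC50 of about 25 µM, while the label 8.4 in example (3) corresponds to about 4 nM, which is why a higher pIC50 means a more potent compound.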