This drug-target binding data is from BindingDB and uses IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCOC(=O)Nc1ccc2c(c1)N(C(=O)CN1C(=O)C=CC1=O)c1ccccc1CC2. The target protein (Q14994) has sequence MASREDELRNCVVCGDQATGYHFNALTCEGCKGFFRRTVSKSIGPTCPFAGSCEVSKTQRRHCPACRLQKCLDAGMRKDMILSAEALALRRAKQAQRRAQQTPVQLSKEQEELIRTLLGAHTRHMGTMFEQFVQFRPPAHLFIHHQPLPTLAPVLPLVTHFADINTFMVLQVIKFTKDLPVFRSLPIEDQISLLKGAAVEICHIVLNTTFCLQTQNFLCGPLRYTIEDGARVSPTVGFQVEFLELLFHFHGTLRKLQLQEPEYVLLAAMALFSPDRPGVTQRDEIDQLQEEMALTLQSYIKGQQRRPRDRFLYAKLLGLLAELRSINEAYGYQIQHIQGLSAMMPLLQEICS. The pIC50 is 4.4. (2) The drug is O=C1CCc2c(Oc3ccc4c(c3)C3C(C4)C3C(=O)O)ccnc2N1. The target protein sequence is MEHIQGAWKTISNGFGFKDAVFDGSSCISPTIVQQFGYQRRASDDGKLTDPSKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARLDWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPTMCVDWSNIRQLLLFPNSTIGDSGVPALPSLTMRRMRESVSRMPVSSQHRYSTPHAFTFNTSSPSSEGSLSQRQRSTSTPNVHMVSTTLPVDSRMIEDAIRSHSESASPSALSSSPNNLSPTGWSQPKTPVPAQRERAPVSGTQEKNKIRPRGQRDSSDDWEIEASEVMLSTRIGSGSFGTVYKGKWHGDVAVKILKVVDPTPEQFQAFRNEVAVLRKTRHVNILLFMGYMTKDNLAIVTQWCEGSSLYKHLHVQETKFQMFQLIDIARQTAQGMDYLHAKNIIHRDMKSNNIFLHEGLTVKIGDFGLATVKSRWSGSQ.... The pIC50 is 5.3. (3) The pIC50 is 5.3. The target protein (P34896) has sequence MTMPVNGAHKDADLWSSHDKMLAQPLKDSDVEVYNIIKKESNRQRVGLELIASENFASRAVLEALGSCLNNKYSEGYPGQRYYGGTEFIDELETLCQKRALQAYKLDPQCWGVNVQPYSGSPANFAVYTALVEPHGRIMGLDLPDGGHLTHGFMTDKKKISATSIFFESMPYKVNPDTGYINYDQLEENARLFHPKLIIAGTSCYSRNLEYARLRKIADENGAYLMADMAHISGLVAAGVVPSPFEHCHVVTTTTHKTLRGCRAGMIFYRKGVKSVDPKTGKEILYNLESLINSAVFPGLQGGPHNHAIAGVAVALKQAMTLEFKVYQHQVVANCRALSEALTELGYKIVTGGSDNHLILVDLRSKGTDGGRAEKVLEACSIACNKNTCPGDRSALRPSGLRLGTPALTSRGLLEKDFQKVAHFIHRGIELTLQIQSDTGVRATLKEFKERLAGDKYQAAVQALREEVESFASLFPLPGLPDF. 
The compound is COc1cccc(C2(C3CCC3)C(C#N)=C(N)Oc3[nH]nc(C)c32)c1. (4) The compound is COc1ccc(N=Nc2cc(-c3ccccc3)ccc2O)cc1. The target protein (Q9R0P9) has sequence MQLKPMEINPEMLNKVLAKLGVAGQWRFADVLGLEEETLGSVPSPACALLLLFPLTAQHENFRKKQIEELKGQEVSPKVYFMKQTIGNSCGTIGLIHAVANNQDKLEFEDGSVLKQFLSETEKLSPEDRAKCFEKNEAIQAAHDSVAQEGQCRVDDKVNFHFILFNNVDGHLYELDGRMPFPVNHGASSEDSLLQDAAKVCREFTEREQGEVRFSAVALCKAA. The pIC50 is 4.9. (5) The drug is O=C(O)c1ccc(-c2nc(-c3ccc(-c4ccccc4)cc3)c(-c3ccc(-c4ccccc4)cc3)[nH]2)cc1. The target protein (O50979) has sequence MPPKVKIKNDFEIFRKELEILYKKYLNNELSYLKLKEKLKILAENHKAILFRKDKFTNRSIILNLSKTRKIIKEYINLSVIERIRRDNTFLFFWKSRRIKELKNIGIKDRKKIEELIFSNQMNDEKSYFQYFIDLFVTPKWLNDYAHKYKIEKINSYRKEQIFVKINLNTYIEIIKLLLNQSRDIRLKFYGVLMAIGRRPVEVMKLSQFYIADKNHIRMEFIAKKRENNIVNEVVFPVFADPELIINSIKEIRYMEQTENLTKEIISSNLAYSYNRLFRQIFNNIFAPEESVYFCRAIYCKFSYLAFAPKNMEMNYWITKVLGHEPNDITTAFHYNRYVLDNLDDKADNSLLTLLNQRIYTYVRRKATYSTLTMDRLESLIKEHHIFDDNYIKTLIVIKNLMLKDNLETLAMVRGLNVKIRKAFKATYGYNYNYIKLTEYLSIIFNYKL. The pIC50 is 5.0.
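
The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration only: the function names `pic50_from_ic50_molar` and `ic50_molar_from_pic50` are hypothetical and not part of the dataset or any BindingDB tooling, and the input is assumed to already be in molar units.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50). (Hypothetical helper.)"""
    if ic50_molar <= 0:
        # The logarithm is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50). (Hypothetical helper.)"""
    return 10.0 ** (-pic50)


# Example: an IC50 of 10 micromolar (1e-5 M) corresponds to a pIC50 of about 5.
example_pic50 = pic50_from_ic50_molar(1e-5)
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.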