Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCC(CC)Oc1cc(C)c(-c2cc(-c3cnc(NC(=O)OC)s3)nc(-c3cnccn3)n2)c(C)c1. The target protein (Q96S53) has sequence MDRSKRNSIAGFPPRVERLEEFEGGGGGEGNVSQVGRVWPSSYRALISAFSRLTRLDDFTCEKIGSGFFSEVFKVRHRASGQVMALKMNTLSSNRANMLKEVQLMNRLSHPNILRFMGVCVHQGQLHALTEYINSGNLEQLLDSNLHLPWTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNCLIKRDENGYSAVVADFGLAEKIPDVSMGSEKLAVVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEIIARIQADPDYLPRTENFGLDYDAFQHMVGDCPPDFLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQEEEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIPHKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYYRPRDGAARTPKVNPFSARQDLMGGKIKFFDLPSKSVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWRSLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLKYRVKEIPPFR.... The pIC50 is 4.5. (2) The compound is NC(=O)c1ccc2[nH]cc(C3=CCC(NCCCNC4CC=C(c5c[nH]c6ccc(C(N)=O)cc56)CC4)CC3)c2c1. The target protein sequence is MDVFSFGQGNNTTASQEPFGTGGNVTSISDVTFSYQVITSLLLGTLIFCAVLGNACVVAAIALERSLQNVANYLIGSLAVTDLMVSVLVLPMAALYQVLNKWTLGQVTCDLFIALDVLCCTSSILHLCAIALDRYWAITDPIDYVNKRTPRRAAALISLTWLIGFLISIPPMLGWRTPEDRSDPDACTISKDHGYTIYSTFGAFYIPLLLMLVLYGRIFRAARFRIRKTVRKVEKKGAGTSLGTSSAPPPKKSLNGQPGSGDWRRCAENRAVGTPCTNGAVRQGDDEATLEVIEVHRVGNSKEHLPLPSESGSNSYAPACLERKNERNAEAKRKMALARERKTVKTLGIIMGTFILCWLPFFIVALVLPFCESSCHMPALLGAIINWLGYSNSLLNPVIYAYFNKDFQNAFKKIIKCKFCRR. The pIC50 is 7.7. (3) The drug is CCCCCC(=O)N[C@H](C(=O)N[C@@H](CCC(=O)N(C)C)C(=O)N[C@@H](CC(C)C)C(=O)[C@@]1(C)CO1)C(C)C. The target protein (P23639) has sequence MTDRYSFSLTTFSPSGKLGQIDYALTAVKQGVTSLGIKATNGVVIATEKKSSSPLAMSETLSKVSLLTPDIGAVYSGMGPDYRVLVDKSRKVAHTSYKRIYGEYPPTKLLVSEVAKIMQEATQSGGVRPFGVSLLIAGHDEFNGFSLYQVDPSGSYFPWKATAIGKGSVAAKTFLEKRWNDELELEDAIHIALLTLKESVEGEFNGDTIELAIIGDENPDLLGYTGIPTDKGPRFRKLTSQEINDRLEAL. The pIC50 is 6.8. 
(4) The target protein sequence is MPPPDKARRDVLISKALSYLLRHGAEKEKLSIDDQGYVKISDVLSHQRLKSLKTTRDDINRIVQENDKKRFTIKDDMICANQGHSLKAVKNDNLTPMTVDELNQLRIYHGTYRTKLPLIKSSGGLSKMNRNHIHFTCEQYSTCSGIRYNANVLIYINASKCIEHGIVFYKSLNNVILTSGDKDGKLSWEFIDRIVGLDGNEINKEQV. The compound is O=C1/C(=C/C=C/c2ccco2)SC(=S)N1NS(=O)(=O)c1ccccc1. The pIC50 is 4.7. (5) The pIC50 is 6.6. The target protein sequence is GHMQTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMSKGCLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQCWRKDPEERPTFEYLQAFLEDYFTSTEPQYQPGENL. The drug is CN(C)C/C=C/C(=O)Nc1ccc2ncnc(Nc3cccc(Br)c3)c2c1.
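The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50`, `ic50_nm_to_pic50`, and `pic50_to_ic50_nm` are hypothetical, and the assumption that raw IC50 values arrive in nanomolar is ours, not something the record states.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50.

    Assumption: the input is in nM. Since 1 nM = 1e-9 M,
    -log10(x * 1e-9) simplifies to 9 - log10(x).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: return the IC50 in nanomolar for a given pIC50."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 labels that appear in the examples above.
    for label in (4.5, 7.7, 6.8, 4.7, 6.6):
        print(f"pIC50 {label} corresponds to IC50 of {pic50_to_ic50_nm(label):.1f} nM")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in potency: a label of 4.5 corresponds to an IC50 of roughly 31.6 µM, while 7.7 corresponds to roughly 20 nM.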