Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COc1ccc(/N=c2\oc3cc(O)ccc3cc2C(=O)Nc2ccccn2)c(OC)c1. The target protein (P14550) has sequence MAASCVLLHTGQKMPLIGLGTWKSEPGQVKAAVKYALSVGYRHIDCAAIYGNEPEIGEALKEDVGPGKAVPREELFVTSKLWNTKHHPEDVEPALRKTLADLQLEYLDLYLMHWPYAFERGDNPFPKNADGTICYDSTHYKETWKALEALVAKGLVQALGLSNFNSRQIDDILSVASVRPAVLQVECHPYLAQNELIAHCQARGLEVTAYSPLGSSDRAWRDPDEPVLLEEPVVLALAEKYGRSPAQILLRWQVQRKVICIPKSITPSRILQNIKVFDFTFSPEEMKQLNALNKNWRYIVPMLTVDGKRVPRDAGHPLYPFNDPY. The pIC50 is 4.4. (2) The target protein (P23639) has sequence MTDRYSFSLTTFSPSGKLGQIDYALTAVKQGVTSLGIKATNGVVIATEKKSSSPLAMSETLSKVSLLTPDIGAVYSGMGPDYRVLVDKSRKVAHTSYKRIYGEYPPTKLLVSEVAKIMQEATQSGGVRPFGVSLLIAGHDEFNGFSLYQVDPSGSYFPWKATAIGKGSVAAKTFLEKRWNDELELEDAIHIALLTLKESVEGEFNGDTIELAIIGDENPDLLGYTGIPTDKGPRFRKLTSQEINDRLEAL. The drug is CCCCCC(=O)N[C@H](C(=O)N[C@@H](CCC(=O)N(C)C)C(=O)N[C@@H](CC(C)C)C(=O)[C@@]1(C)CO1)C(C)C. The pIC50 is 7.0. (3) The drug is CC1CCCN1c1cccc(Nc2cc(-c3ccc(O)cc3)nn3ccnc23)n1. The target protein sequence is PKEVYLDRKLLTLEDKELGSGNFGTVKKGYYQMKKVVKTVAVKILKNEANDPALKDELLAEANVMQQLDNPYIVRMIGICEAESWMLVMEMAELGPLNKYLQQNRHVKDKNIIELVHQVSMGMKYLEESNFVHRDLAARNVLLVTQHYAKISDFGLSKALRADENYYKAQTHGKWPVKWYAPECINYYKFSSKSDVWSFGVLMWEAFSYGQKPYRGMKGSEVTAMLEKGERMGCPAGCPREMYDLMNLCWTYDVENRPGFAAVELRLRNYYYDVVN. The pIC50 is 7.2. (4) The small molecule is Nc1nc2c(ncn2[C@H]2C[C@@H]3OP(=O)(O)OC[C@H]4O[C@@H](n5cnc6c(=O)[nH]c(N)nc65)C[C@@H]4OP(=O)(O)OC[C@H]3O2)c(=O)[nH]1. 
The target protein sequence is MNDLNVLVLEDEPFQRLVAVTALKKVVPGSILEAADGKEAVAILESCGHVDIAICDLQMSGMDGLAFLRHASLSGKVHSVILSSEVDPILRQATISMIECLGLNFLGDLGKPFSLERITALLTRYNARRQDLPRQIEVAELPSVADVVRGLDNGEFEAYYQPKVALDGGGLIGAEVLARWNHPHLGVLPPSHFLYVMETYNLVDKLFWQLFSQGLATRRKLAQLGQPINLAFNVHPSQLGSRALAENISALLTEFHLPPSSVMFEITETGLISAPASSLENLVRLRIMGCGLAMDDFGAGYSSLDRLCEFPFSQIKLDRTFVQKMKTQPRSCAVISSVVALAQALGISLVVEGVESDEQRVRLIELGCSIAQGYLFARPMPEQHFLDYCSGS. The pIC50 is 7.7. (5) The drug is COCCNC(=O)c1c(O)c2ncc(Cc3ccc(F)cc3)cc2n(CC(=O)N2CCOCC2)c1=O. The target protein sequence is FLDGIDKAQEEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTVHTDNGSNFTSTTVKAACWWAGIKQEFGIPYNPQSQGVIESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKELQKQITKIQNFRVYYRDSRDPVWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED. The pIC50 is 8.1. (6) The drug is Nc1cccc(-c2cccc(-c3cn[nH]c3N)c2)c1. The target protein (P33261) has sequence MDPFVVLVLCLSCLLLLSIWRQSSGRGKLPPGPTPLPVIGNILQIDIKDVSKSLTNLSKIYGPVFTLYFGLERMVVLHGYEVVKEALIDLGEEFSGRGHFPLAERANRGFGIVFSNGKRWKEIRRFSLMTLRNFGMGKRSIEDRVQEEARCLVEELRKTKASPCDPTFILGCAPCNVICSIIFQKRFDYKDQQFLNLMEKLNENIRIVSTPWIQICNNFPTIIDYFPGTHNKLLKNLAFMESDILEKVKEHQESMDINNPRDFIDCFLIKMEKEKQNQQSEFTIENLVITAADLLGAGTETTSTTLRYALLLLLKHPEVTAKVQEEIERVVGRNRSPCMQDRGHMPYTDAVVHEVQRYIDLIPTSLPHAVTCDVKFRNYLIPKGTTILTSLTSVLHDNKEFPNPEMFDPRHFLDEGGNFKKSNYFMPFSAGKRICVGEGLARMELFLFLTFILQNFNLKSLIDPKDLDTTPVVNGFASVPPFYQLCFIPV. The pIC50 is 4.6. 
(7) The small molecule is CC[C@H](C)[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C)NC(=O)[C@@H](N)Cc1cnc[nH]1)[C@@H](C)O)[C@@H](C)O)C(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCN=C(N)N)C(N)=O)C(C)C. The target protein (P43220) has sequence MAGAPGPLRLALLLLGMVGRAGPRPQGATVSLWETVQKWREYRRQCQRSLTEDPPPATDLFCNRTFDEYACWPDGEPGSFVNVSCPWYLPWASSVPQGHVYRFCTAEGLWLQKDNSSLPWRDLSECEESKRGERSSPEEQLLFLYIIYTVGYALSFSALVIASAILLGFRHLHCTRNYIHLNLFASFILRALSVFIKDAALKWMYSTAAQQHQWDGLLSYQDSLSCRLVFLLMQYCVAANYYWLLVEGVYLYTLLAFSVLSEQWIFRLYVSIGWGVPLLFVVPWGIVKYLYEDEGCWTRNSNMNYWLIIRLPILFAIGVNFLIFVRVICIVVSKLKANLMCKTDIKCRLAKSTLTLIPLLGTHEVIFAFVMDEHARGTLRFIKLFTELSFTSFQGLMVAILYCFVNNEVQLEFRKSWERWRLEHLHIQRDSSMKPLKCPTSSLSSGATAGSSMYTATCQASCS. The pIC50 is 7.8. (8) The small molecule is NCc1ccncc1NC1CC1. The target protein (P12807) has sequence MERLRQIASQATAASAAPARPAHPLDPLSTAEIKAATNTVKSYFAGKKISFNTVTLREPARKAYIQWKEQGGPLPPRLAYYVILEAGKPGVKEGLVDLASLSVIETRALETVQPILTVEDLCSTEEVIRNDPAVIEQCVLSGIPANEMHKVYCDPWTIGYDERWGTGKRLQQALVYYRSDEDDSQYSHPLDFCPIVDTEEKKVIFIDIPNRRRKVSKHKHANFYPKHMIEKVGAMRPEAPPINVTQPEGVSFKMTGNVMEWSNFKFHIGFNYREGIVLSDVSYNDHGNVRPIFHRISLSEMIVPYGSPEFPHQRKHALDIGEYGAGYMTNPLSLGCDCKGVIHYLDAHFSDRAGDPITVKNAVCIHEEDDGLLFKHSDFRDNFATSLVTRATKLVVSQIFTAANYEYCLYWVFMQDGAIRLDIRLTGILNTYILGDDEEAGPWGTRVYPNVNAHNHQHLFSLRIDPRIDGDGNSAAACDAKSSPYPLGSPENMYGNAFYS.... The pIC50 is 3.0. (9) The drug is COc1cc2ncnc(Nc3ccc(OCc4cccc(F)c4)c(Cl)c3)c2cc1NC(=O)C(CSC(N)=O)CSC(N)=O. 
The target protein sequence is MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVESIQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGCTGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYVVTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLSINATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKEITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCK.... The pIC50 is 7.3. (10) The compound is COc1ccc2[nH]nc(C(=O)NCC3CCN(CCC(=O)O)CC3)c2c1. The target protein (P49841) has sequence MSGRPRTTSFAESCKPVQQPSAFGSMKVSRDKDGSKVTTVVATPGQGPDRPQEVSYTDTKVIGNGSFGVVYQAKLCDSGELVAIKKVLQDKRFKNRELQIMRKLDHCNIVRLRYFFYSSGEKKDEVYLNLVLDYVPETVYRVARHYSRAKQTLPVIYVKLYMYQLFRSLAYIHSFGICHRDIKPQNLLLDPDTAVLKLCDFGSAKQLVRGEPNVSYICSRYYRAPELIFGATDYTSSIDVWSAGCVLAELLLGQPIFPGDSGVDQLVEIIKVLGTPTREQIREMNPNYTEFKFPQIKAHPWTKVFRPRTPPEAIALCSRLLEYTPTARLTPLEACAHSFFDELRDPNVKLPNGRDTPALFNFTTQELSSNPPLATILIPPHARIQAAASTPTNATAASDANTGDRGQTNNAASASASNST. The pIC50 is 6.2.
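The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset tooling. The function names ic50_to_pic50, ic50_nm_to_pic50 and pic50_to_ic50 are hypothetical, and the nanomolar helper assumes raw IC50 values are supplied in nM, which this record does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Assumed convenience helper: IC50 in nanomolar, converted to mol/L first."""
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: returns IC50 in mol/L."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 7.0 corresponds to an IC50 of 1e-7 M (100 nM).
    print(ic50_to_pic50(1e-7))
    print(ic50_nm_to_pic50(100.0))
    # A pIC50 of 4.4 corresponds to an IC50 of roughly 4e-5 M (about 40 uM).
    print(pic50_to_ic50(4.4))
```

Because the scale is logarithmic, a one-unit increase in pIC50 means a tenfold lower IC50, which is why higher values indicate more potent binding.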