Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is COc1ccc(C2CNC(=O)C2)cc1OC1CCCC1. The target protein sequence is SAAEEETRELQSLAAAVVPSAQTLKITDFSFSDFELSDLETALCTIRMFTDLNLVQNFQMKHEVLCRWILSVKKNYRKNVAYHNWRHAFNTAQCMFAALKAGKIQNKLTDLEILALLIAALSHDLDHRGVNNSYIQRSEHPLAQLYCHSIMEHHHFDQCLMILNSPGNQILSGLSIEEYKTTLKIIKQAILATDLALYIKRRGEFFELIRKNQFNLEDPHQKELFLAMLMTACDLSAITKPWPIQQRIAELVATEFFDQGDRERKELNIEPTDLMNREKKNKIPSMQVGFIDAICLQLYEALTHVSEDCFPLLDGCRKNRQKWQALAEQQEKMLINGESGQAKRN. The pIC50 is 3.7. (2) The small molecule is O=C(Cc1ccccc1)NC(CSSCC(NC(=O)Cc1ccccc1)C(=O)O)C(=O)O. The target protein (Q5XJA0) has sequence MSALGESRTRLCDQFAFVSGSDSAVAQCYLAENEWDMERALNSFFEAHMDSVFDEEAEKTEVTGNKRKDDTAEASGTKKKLKTDNADCIDLTAEEPTCSITVNSKENQAENGTAKSEVEDSKLSIISWNVDGLDTLNLADRARGLCSYLALYTPDVVFLQELIPAYVQYLKKRAVSYLFFEGSDDGYFTGIMLRKSRVKFLESEIICFPTTQMMRNLLIAQVTFSGQKLYLMTSHLESCKNQSQERTKQLRVVLQKIKEAPEDAIVIFAGDTNLRDAEVANVGGLPAGVCDVWEQLGKQEHCRYTWDTKANSNKTVPYVSRCRFDRIFLRSAKTAPPVTPDHMALIGMEKLDCGRYTSDHWGIYCTFNT. The pIC50 is 4.7. (3) The small molecule is O=S1(=O)c2ccccc2CN1c1cc(F)cc(Oc2ccccc2)c1. The target protein (Q5NUL3) has sequence MSPECARAAGDAPLRSLEQANRTRFPFFSDVKGDHRLVLAAVETTVLVLIFAVSLLGNVCALVLVARRRRRGATACLVLNLFCADLLFISAIPLVLAVRWTEAWLLGPVACHLLFYVMTLSGSVTILTLAAVSLERMVCIVHLQRGVRGPGRRARAVLLALIWGYSAVAALPLCVFFRVVPQRLPGADQEISICTLIWPTIPGEISWDVSFVTLNFLVPGLVIVISYSKILQTSEHLLDARAVVTHSEITKASRKRLTVSLAYSESHQIRVSQQDFRLFRTLFLLMVSFFIMWSPIIITILLILIQNFKQDLVIWPSLFFWVVAFTFANSALNPILYNMTLCRNEWKKIFCCFWFPEKGAILTDTSVKRNDLSIISG. The pIC50 is 5.0. (4) The drug is C[C@H](NC(=O)C1CN(c2ncnn3cc(-c4ccc(C(=O)N5CCC5)cc4)cc23)C1)c1ccc(F)cc1. The pIC50 is 7.3. 
The target protein sequence is MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVECKAYNDVGKT.... (5) The drug is COc1ccc([C@H](Cc2c(Cl)c[n+]([O-])cc2Cl)OC(=O)c2ccc(CNC(C(=O)OC[C@H]3CCCN(C)C3)c3ccccc3)s2)cc1OC. The target protein sequence is MKEHGGTFSSTGISGGSGDSAMDSLQPLQPNYMPVCLFAEESYQKLAMETLEELDWCLDQLETIQTYRSVSEMASNKFKRMLNRELTHLSEMSRSGNQVSEYISNTFLDKQNDVEIPSPTQKDREKKKKQQLMTQISGVKKLMHSSSLNNTSISRFGVNTENEDHLAKELEDLNKWGLNIFNVAGYSHNRPLTCIMYAIFQERDLLKTFRISSDTFITYMMTLEDHYHSDVAYHNSLHAADVAQSTHVLLSTPALDAVFTDLEILAAIFAAAIHDVDHPGVSNQFLINTNSELALMYNDESVLENHHLAVGFKLLQEEHCDIFMNLTKKQRQTLRKMVIDMVLATDMSKHMSLLADLKTMVETKKVTSSGVLLLDNYTDRIQVLRNMVHCADLSNPTKSLELYRQWTDRIMEEFFQQGDKERERGMEISPMCDKHTASVEKSQVGFIDYIVHPLWETWADLVQPDAQDILDTLEDNRNWYQSMIPQSPSPPLDEQNRDCQ.... The pIC50 is 8.2. (6) The drug is C=C(C)c1cc(-c2cc(NC(=O)c3cccc(C(F)(F)F)c3)ccc2C)nc(N2CCOCC2)n1. The target protein sequence is MEHIQGAWKTISNGFGFKDAVFDGSSCISPTIVQQFGYQRRASDDGKLTDPSKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARLDWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPTMCVDWSNIRQLLLFPNSTIGDSGVPALPSLTMRRMRESVSRMPVSSQHRYSTPHAFTFNTSSPSSEGSLSQRQRSTSTPNVHMVSTTLPVDSRMIEDAIRSHSESASPSALSSSPNNLSPTGWSQPKTPVPAQRERAPVSGTQEKNKIRPRGQRDSSEEWEIEASEVMLSTRIGSGSFGTVYKGKWHGDVAVKILKVVDPTPEQFQAFRNEVAVLRKTRHVNILLFMGYMTKDNLAIVTQWCEGSSLYKHLHVQETKFQMFQLIDIARQTAQGMDYLHAKNIIHRDMKSNNIFLHEGLTVKIGDFGLATVKSRWSGSQ.... The pIC50 is 9.3. (7) The compound is CC(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@H](C(=O)N[C@@H](CC(=O)O)C(=O)CCCc1ccccc1)C(C)C. 
The target protein (P55210) has sequence MADDQGCIEEQGVEDSANEDSVDAKPDRSSFVPSLFSKKKKNVTMRSIKTTRDRVPTYQYNMNFEKLGKCIIINNKNFDKVTGMGVRNGTDKDAEALFKCFRSLGFDVIVYNDCSCAKMQDLLKKASEEDHTNAACFACILLSHGEENVIYGKDGVTPIKDLTAHFRGDRCKTLLEKPKLFFIQACRGTELDDGIQADSGPINDTDANPRYKIPVEADFLFAYSTVPGYYSWRSPGRGSWFVQALCSILEEHGKDLEIMQILTRVNDRVARHFESQSDDPHFHEKKQIPCVVSMLTKELYFSQ. The pIC50 is 10.0. (8) The drug is O=C(NNC(=S)NC(=O)c1ccccc1Cl)c1cccc([N+](=O)[O-])c1. The target protein sequence is MGVQAGLFGMLGFLGVALGGSPALRWYRTSCHLTKAVPGNPLGYLSFLAKDAQGLALIHARWDAHRRLQACSWEDEPELTAAYGALCAHETAWGSFIHTPGPELQRALATLQSQWEACRALEESPAGARKKRAAGQSGVPGGGHQREKRGWTMPGTLWCGVGDSAGNSSELGVFQGPDLCCREHDRCPQNISPLQYNYGIRNYRFHTISHCDCDTRFQQCLQNQHDSISDIVGVAFFNVLEIPCFVLEEQEACVAWYWWGGCRMYGTVPLARLQPRTFYNASWSSRATSPTPSSRSPAPPKPRQKQHLRKGPPHQKGSKRPSKANTTALQDPMVSPRLDVAPTGLQGPQGGLKPQGARWVCRSFRRHLDQCEHQIGPREIEFQLLNSAQEPLFHCNCTRRLARFLRLHSPPEVTNMLWELLGTTCFKLAPPLDCVEGKNCSRDPRAIRVSARHLRRLQQRRHQLQDKGTDERQPWPSEPLRGPMSFYNQCLQLTQAARRP.... The pIC50 is 4.7. (9) The drug is CC(C)n1c(=O)n(C(=O)NCCCN2CCN(C)CC2)c2ccccc21. The target protein (O70528) has sequence MDKLDANVSSKEGFGSVEKVVLLTFLSAVILMAILGNLLVMVAVCRDRQLRKIKTNYFIVSLAFADLLVSVLVMPFGAIELVQDIWVYGEMFCLVRTSLDVLLTTASIFHLCCISLDRYYAICCQPLVYRNKMTPLRIALMLGGCWVIPMFISFLPIMQGWNNIGIVDLIEKRKFNQNSNSTYCVFMVNKPYAITCSVVAFYIPFLLMVLAYYRIYVTAKEHARQIQVLQRAGAPAEGRPQPADQHSTHRMRTETKAAKTLCIIMGCFCLCWAPFFVTNIVDPFIDYTVPGQLWTAFLWLGYINSGLNPFLYAFLNKSFRRAFLIILCCDDERYRRPSILGQTVPCSTTTINGSTHVLRDTVECGGQWESQCHPAASSPLVAAQPIDT. The pIC50 is 6.0. (10) The compound is O=C(O)COc1ccc(CNc2cccc(-c3c(C(=O)c4ccccc4)cnc4c(C(F)(F)F)cccc34)c2)cc1. The target protein sequence is MREQCVLSEEQIRKKKIRKQQQQESQSQSQSPVGPQGSSSSASGPGASPGGSEAGSQGSGEGEGVQLTAAQELMIQQLVAAQLQCNKRSFSDQPKVTPWPLGADPQSRDARQQRFAHFTELAIISVQEIVDFAKQVPGFLQLGREDQIALLKASTIEIMLLETARRYNHETECITFLKDFTYSKDDFHRAGLQVEFINPIFEFSRAMRRLGLDDAEYALLIAINIFSADRPNVQEPGRVEALQQPYVEALLSYTRIKRPQDQLRFPRMLMKLVSLRTLSSVHSEQVFALRLQDKKLPPLLSEIWDVHE. The pIC50 is 7.9.
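The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, assuming IC50 values are supplied in molar units; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers and are not part of the dataset or of any BindingDB tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of about 6.
    print(round(ic50_to_pic50(1e-6), 3))
    # pIC50 values as reported in records (1) and (7) above.
    for label, pic50 in [("(1)", 3.7), ("(7)", 10.0)]:
        print(label, f"pIC50={pic50}", f"IC50 ~ {pic50_to_ic50_nm(pic50):.3g} nM")
```

Because the scale is logarithmic, each unit of pIC50 corresponds to a tenfold change in IC50, so the 3.7 reported for record (1) is roughly 200 µM while the 10.0 reported for record (7) is about 0.1 nM.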