Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is C[C@H](CCC(=O)NCC(=O)O)[C@H]1CC[C@H]2[C@@H]3CC[C@@H]4C[C@H](O)CC[C@]4(C)[C@H]3CC[C@@]21C. The target protein sequence is MNSKSAQGLAGLRNLGNTCFMNSILQCLSNTRELRDYCLQRLYMRDLHHGSNAHTALVEEFAKLIQTIWTSSPNDVVSPSEFKTQIQRYAPRFVGYNQQDAQEFLRFLLDGLHNEVNRVTLRPKSNPENLDHLPDDEKGRQMWRKYLEREDSRIGDLFVGQLKSSLTCTDCGYCSTVFDPFWDLSLPIAKRGYPEVTLMDCMRLFTKEDVLDGDEKPTCCRCRGRKRCIKKFSIQRFPKILVLHLKRFSESRIRTSKLTTFVNFPLRDLDLREFASENTNHAVYNLYAVSNHSGTTMGGHYTAYCRSPGTGEWHTFNDSSVTPMSSSQVRTSDAYLLFYELASPPSRM. The pIC50 is 4.3. (2) The drug is Cc1c(C(=O)Nc2cccc(Br)c2)cccc1[N+](=O)[O-]. The target protein sequence is MISKLKPQFMFLPKKHILSYCRKDVLNLFEQKFYYTSKRKESNNMKNESLLRLINYNRYYNKIDSNNYYNGGKILSNDRQYIYSPLCEYKKKINDISSYVSVPFKINIRNLGTSNFVNNKKDVLDNDYIYENIKKEKSKHKKIIFLLFVSLFGLYGFFESYNPEFFLYDIFLKFCLKYIDGEICHDLFLLLGKYNILPYDTSNDSIYACTNIKHLDFINPFGVAAGFDKNGVCIDSILKLGFSFIEIGTITPRGQTGNAKPRIFADVESRSIINSCGFNNMGCDKVTENLILFRKRQEEDKLLSKHIVGVSIGKNKDTVNIVDDLKYCINKIGRYADYIAINVSSPNTPGLRDNQEAGKLKNIILSVKEEIDNLEKNNIMNDESTYNEDNKIVEKKNNFNKNNSHMMKDAKDNFLWFNTTKKKPLVFVKLAPDLNQEQKKEIADVLLETNIDGMIISNTTTQINDIKSFENKKGGVSGAKLKDISTKFICEMYNYTNKQI.... The pIC50 is 5.9. (3) The compound is O=C(NCCCCCCc1ccccc1)SCc1ccccc1F. 
The target protein (Q9WVG5) has sequence MRNTVFLLGFWSVYCYFPAGSITTLRPQGSLRDEHHKPTGVPATARPSVAFNIRTSKDPEQEGCNLSLGDSKLLENCGFNMTAKTFFIIHGWTMSGMFESWLHKLVSALQMREKDANVVVVDWLPLAHQLYTDAVNNTRVVGQRVAGMLDWLQEKEEFSLGNVHLIGYSLGAHVAGYAGNFVKGTVGRITGLDPAGPMFEGVDINRRLSPDDADFVDVLHTYTLSFGLSIGIRMPVGHIDIYPNGGDFQPGCGFNDVIGSFAYGTISEMVKCEHERAVHLFVDSLVNQDKPSFAFQCTDSSRFKRGICLSCRKNRCNNIGYNAKKMRKKRNSKMYLKTRAGMPFKVYHYQLKVHMFSYNNSGDTQPTLYITLYGSNADSQNLPLEIVEKIELNATNTFLVYTEEDLGDLLKMRLTWEGVAHSWYNLWNEFRNYLSQPSNPSRELYIRRIRVKSGETQRKVTFCTQDPTKSSISPGQELWFHKCQDGWKMKNKTSPFVNLA.... The pIC50 is 6.2. (4) The small molecule is CC[C@H](C)[C@H](NC(C)=O)C(=O)N[C@H](C(=O)N[C@@H](Cc1ccccc1)[C@H](O)C(=O)N1CSC(C)(C)[C@@H]1C(=O)N[C@H](C(=O)N[C@@H](CCSC)C(N)=O)[C@@H](C)CC)c1ccccc1. The target protein (P10274) has sequence MGQIFSRSASPIPRPPRGLAAHHWLNFLQAAYRLEPGPSSYDFHQLKKFLKIALETPARICPINYSLLASLLPKGYPGRVNEILHILIQTQAQIPSRPAPPPPSSPTHDPPDSDPQIPPPYVEPTAPQVLPVMHPHGAPPNHRPWQMKDLQAIKQEVSQAAPGSPQFMQTIRLAVQQFDPTAKDLQDLLQYLCSSLVASLHHQQLDSLISEAETRGITGYNPLAGPLRVQANNPQQQGLRREYQQLWLAAFAALPGSAKDPSWASILQGLEEPYHAFVERLNIALDNGLPEGTPKDPILRSLAYSNANKECQKLLQARGHTNSPLGDMLRACQTWTPKDKTKVLVVQPKKPPPNQPCFRCGKAGHWSRDCTQPRPPPGPCPLCQDPTHWKRDCPRLKPTIPEPEPEEDALLLDLPADIPHPKNLHRGGGLTSPPTLQQVLPNQDPASILPVIPLDPARRPVIKAQVDTQTSHPKTIEALLDTGADMTVLPIALFSSNTPL.... The pIC50 is 6.5. (5) The drug is CCn1cc(C[C@H](NC(=O)[C@@H]2CCC(=O)N2)C(=O)N2CCC[C@H]2C(N)=O)nc1I. The target protein sequence is MDGPSNVSLVHGDTTLGLPEYKVVSVLLVLLVCTVGIVGNAMVVLVVLTSRDMHTPTNCYLVSLALADLIVLLAAGLPNVSDSLVGHWIYGHAGCLGITYFQYLGINVSSCSILAFTVERYIAICHPMRAQTVCTVARARRIIAGIWGVTSLYCLLWFFLVDLNVRDNQRLECGYKVSRGLYLPIYLLDFAVFFIAPLLGTLVLYGFIGRILFQSPLSQEAWQKERQSHGQSEGTPGNCSRSKSSMSSRKQ. The pIC50 is 5.6.
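The pIC50 values quoted in the examples follow the definition stated at the top, pIC50 = -log10(IC50 in M). A minimal Python sketch of that conversion is given below. It assumes the raw IC50 is supplied in nanomolar, which is an assumption about the input unit and not something the text above states. The helper names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical and are not part of the dataset.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in M)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: recover IC50 in nanomolar from a pIC50 value."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # pIC50 labels taken from examples (1)-(5) above.
    for label in (4.3, 5.9, 6.2, 6.5, 5.6):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.0f} nM")
```

Under this definition, the pIC50 of 4.3 in example (1) corresponds to an IC50 of roughly 50 µM, and the pIC50 of 6.5 in example (4) to roughly 316 nM, which is why a higher value means a more potent compound.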