Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(=O)N[C@@H](C)COc1ccc(Oc2ccc(OC(C)C)cc2)cc1. The target protein sequence is MVLLLFLTCLVFSCLTISWLKIWGKMTDSKPLSNSKVDASLLSSKEESFSASDQSEEHGDCSCPLTTPDQEELASHGGPVDASQQRNSVPSSHQKPPRNPLSSNDTCSSPELQTNGVAAPGSEVPEANGLPFPARPQTQRTGSPTREDKKQAHIKRQLMTSFILGSLDDNSSDEDPSASSFQTSSRKGSRASLGTLSQEAALNTADPESHTPTMRPSMSGLHLVKRGREHKKLDLHRDFTVASPAEFVTRFGGNRVIETVLIANNGIAAVKCMRSIRRWAYEMFRNERAIRFVVMVTPEDLKANAEYIKMADQYVPVPGGPNNNNYANVELIIDIAKRIPVQAVWAGWGHASENPKLPELLCKHEIAFLGPPSEAMWALGDKISSTIVAQTLQIPTLPWSGSGLTVEWTEDSQHQGKCISVPEDVYEQGCVRDVDEGLQAAEKVGFPLMIKASEGGGGKGIRRAESAEDFPMLFRQVQSEIPGSPIFLMKLAQNARHLEV.... The pIC50 is 5.8. (2) The drug is CC[C@H](C)[C@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(C)=O)C(c1ccccc1)c1ccccc1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)O. The target protein (P21451) has sequence MQSSASRCGRALVALLLACGLLGVWGEKRGFPPAQATPSLLGTKEVMTPPTKTSWTRGSNSSLMRSSAPAEVTKGGRVAGVPPRSFPPPCQRKIEINKTFKYINTIVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIIIDIPINAYKLLAGDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWTAVEIVLIWVVSVVLAVPEAIGFDVITSDYKGKPLRVCMLNPFQKTAFMQFYKTAKDWWLFSFYFCLPLAITAIFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFALCWLPLHLSRILKLTLYDQSNPQRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKRFKNCFKSCLCCWCQTFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS. The pIC50 is 6.1. (3) The drug is Cc1ncoc1-c1nnc(SCCCN2CC[C@]3(C[C@H]3c3ccc(C(F)(F)F)cc3)C2)n1C. 
The target protein (P40989) has sequence MSYNDPNLNGQYYSNGDGTGDGNYPTYQVTQDQSAYDEYGQPIYTQNQLDDGYYDPNEQYVDGTQFPQGQDPSQDQGPYNNDASYYNQPPNMMNPSSQDGENFSDFSSYGPPSGTYPNDQYTPSQMSYPDQDGSSGASTPYGNGVVNGNGQYYDPNAIEMALPNDPYPAWTADPQSPLPIEQIEDIFIDLTNKFGFQRDSMRNMFDHFMTLLDSRSSRMSPEQALLSLHADYIGGDTANYKKWYFAAQLDMDDEIGFRNMKLGKLSRKARKAKKKNKKAMQEASPEDTEETLNQIEGDNSLEAADFRWKSKMNQLSPFEMVRQIALFLLCWGEANQVRFTPECLCFIYKCASDYLDSAQCQQRPDPLPEGDFLNRVITPLYRFIRSQVYEIVDGRYVKSEKDHNKVIGYDDVNQLFWYPEGIAKIVMEDGTRLIDLPAEERYLKLGEIPWDDVFFKTYKETRSWLHLVTNFNRIWIMHISVYWMYCAYNAPTFYTHNYQQ.... The pIC50 is 6.7. (4) The drug is Brc1ccc2c(c1)CCN2. The target protein (Q969J5) has sequence MMPKHCFLGFLISFFLTGVAGTQSTHESLKPQRVQFQSRNFHNILQWQPGRALTGNSSVYFVQYKIMFSCSMKSSHQKPSGCWQHISCNFPGCRTLAKYGQRQWKNKEDCWGTQELSCDLTSETSDIQEPYYGRVRAASAGSYSEWSMTPRFTPWWETKIDPPVMNITQVNGSLLVILHAPNLPYRYQKEKNVSIEDYYELLYRVFIINNSLEKEQKVYEGAHRAVEIEALTPHSSYCVVAEIYQPMLDRRSQRSEERCVEIP. The pIC50 is 5.0. (5) The target protein (P68403) has sequence MADPAAGPPPSEGEESTVRFARKGALRQKNVHEVKNHKFTARFFKQPTFCSHCTDFIWGFGKQGFQCQVCCFVVHKRCHEFVTFSCPGADKGPASDDPRSKHKFKIHTYSSPTFCDHCGSLLYGLIHQGMKCDTCMMNVHKRCVMNVPSLCGTDHTERRGRIYIQAHIDREVLIVVVRDAKNLVPMDPNGLSDPYVKLKLIPDPKSESKQKTKTIKCSLNPEWNETFRFQLKESDKDRRLSVEIWDWDLTSRNDFMGSLSFGISELQKAGVDGWFKLLSQEEGEYFNVPVPPEGSEGNEELRQKFERAKIGQGTKAPEEKTANTISKFDNNGNRDRMKLTDFNFLMVLGKGSFGKVMLSERKGTDELYAVKILKKDVVIQDDDVECTMVEKRVLALPGKPPFLTQLHSCFQTMDRLYFVMEYVNGGDLMYHIQQVGRFKEPHAVFYAAEIAIGLFFLQSKGIIYRDLKLDNVMLDSEGHIKIADFGMCKENIWDGVTTKT.... The small molecule is Cn1cc(C2=C(c3cn(C)c4ccc(O)cc34)C(=O)NC2=O)c2ccccc21. The pIC50 is 6.0. (6) The compound is C=CC(=O)NC[C@H](NC(=O)NC(C)(C)C)C(=O)N1CC2[C@@H]([C@H]1C(=O)NC(CC1CCC1)C(=O)C(N)=O)C2(C)C. The target protein sequence is APITAYAQQTRGLLGCIITSLTGRDKNQVEGEVQIVSTAAQTFLATCINGVCWTVYHGAGTRTIASPKGPVIQMYTNVDQDLVGWPAPQGARSLTPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPAGHAVGLFRAAVCTRGVAKAVAFIPVENLETTMRS. The pIC50 is 5.6. (7) The drug is CCCCSc1nc2c(c(=O)n1-c1ccccc1)SCC2. 
The target protein sequence is MVLVLHHILIAVVQFLRRGQQVFLKPDEPPPPPQPCADSLQDALLSLGSVIDISGLQRAVKEALSAVLPRVETVYTYLLDGESQLVCEDPPHELPQEGKVREAIISQKRLGCNGLGFSDLPGKPLARLVAPLAPDTQVLVMPLADKEAGAVAAVILVHCGQLSDNEEWSLQAVEKHTLVALRRVQVLQQRGPREAPRAVQNPPEGTAEDQKGGAAYTDRDRKILQLCGELYDLDASSLQLKVLQYLQQETRASRCCLLLVSEDNLQLSCKVIGDKVLGEEVSFPLTGCLGQVVEDKKSIQLKDLTSEDVQQLQSMLGCELQAMLCVPVISRATDQVVALACAFNKLEGDLFTDEDEHVIQHCFHYTSTVLTSTLAFQKEQKLKCECQALLQVAKNLFTHLDDVSVLLQEIITEARNLSNAEICSVFLLDQNELVAKVFDGGVVDDESYEIRIPADQGIAGHVATTGQILNIPDAYAHPLFYRGVDDSTGFRTRNILCFPI.... The pIC50 is 4.4. (8) The drug is COc1ccc(NC(=O)Cn2c(-c3cscn3)nc3ccccc32)cc1. The target protein sequence is MRILQRALTFEDVLMVPRKSSVLPKDVSLKSRLTKNIRLNIPFISAAMDTVTEHKTAIAMARLGGIGIVHKNMDIQTQVKEITKVKKSESGVINDPIFIHAHKTLADAKVITDNYKISGVPVVDDKGLLIGILTNRDVRFETDLSKKVGDVMTKMPLVTAHVGISLDEASDLMHKHKIEKLPIVDKDNVLKGLITIKDIQKRIEYPEANKDDFGRLRVGAAIGVGQLDRAEMLVKAGVDVLVLDSAHGHSANILHTLEEIKKSLVVDVIVGNVVTKEATSDLISAGADAVKVGIGPGSICTTRIVAGVGMPQVSAIDNCVEVASKFDIPVIADGGIRYSGDVAKALALGASSVMIGSLLAGTEESPGDFMIYQGRQYKSYRGMGSIGAMTKGSSDRYFQEGVASEKLVPEGIEGRVPYRGKVSDMIFQLVGGVRSSMGYQGAKNILELYQNAEFVEITSAGLKESHVHGVDITKEAPNYYG. The pIC50 is 6.2. (9) The small molecule is COc1ccc(CN2CCc3sc(/C=C4/C(=O)N5C(C(=O)[O-])=CS[C@H]45)cc3C2)cc1. The target protein (P25910) has sequence MKTVFILISMLFPVAVMAQKSVKISDDISITQLSDKVYTYVSLAEIEGWGMVPSNGMIVINNHQAALLDTPINDAQTEMLVNWVTDSLHAKVTTFIPNHWHGDCIGGLGYLQRKGVQSYANQMTIDLAKEKGLPVPEHGFTDSLTVSLDGMPLQCYYLGGGHATDNIVVWLPTENILFGGCMLKDNQATSIGNISDADVTAWPKTLDKVKAKFPSARYVVPGHGDYGGTELIEHTKQIVNQYIESTSKP. The pIC50 is 7.1.
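
The pIC50 definition given at the top (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is only an illustrative sketch: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and not part of the dataset or of any BindingDB tooling, and the nanomolar output unit is an assumption chosen for readability.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # The logarithm is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (assumed unit)."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # Record (4) above reports pIC50 = 5.0; print the implied IC50 in nM.
    print(pic50_to_ic50_nm(5.0))
    # A 1 micromolar IC50 expressed as pIC50.
    print(ic50_to_pic50(1e-6))
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50, which is why higher values mean more potent compounds.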