This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (P46925) has sequence MDITVREHDFKHGFIKSNSTFDGLNIDNSKNKKKIQKGFQILYVLLFCSVMCGLFYYVYENVWLQRDNEMNEILKNSEHLTIGFKVENAHDRILKTIKTHKLKNYIKESVNFLNSGLTKTNYLGSSNDNIELVDFQNIMFYGDAEVGDNQQPFTFILDTGSANLWVPSVKCTTAGCLTKHLYDSSKSRTYEKDGTKVEMNYVSGTVSGFFSKDLVTVGNLSLPYKFIEVIDTNGFEPTYTASTFDGILGLGWKDLSIGSVDPIVVELKNQNKIENALFTFYLPVHDKHTGFLTIGGIEERFYEGPLTYEKLNHDLYWQITLDAHVGNIMLEKANCIVDSGTSAITVPTDFLNKMLQNLDVIKVPFLPFYVTLCNNSKLPTFEFTSENGKYTLEPEYYLQHIEDVGPGLCMLNIIGLDFPVPTFILGDPFMRKYFTVFDYDNHSVGIALAKKNL. The drug is CCCCCc1ccc(C(=O)N(CCN(C(C)C)C(C)C)Cc2ccc(-c3ccc4c(c3)OCO4)cc2)cc1. The pIC50 is 5.8. (2) The small molecule is NC(=O)CC(NC(=O)C1(NC(=O)C(N)Cc2ccc(OP(=O)(O)O)c(N)c2)CCCCC1)C(N)=O. The target protein (P98077) has sequence MTQGPGGRAPPAPPAPPEPEAPTTFCALLPRMPQWKFAAPGGFLGRGPAAARAAGASGGADPQPEPAGPGGVPALAAAVLGACEPRCAAPCPLPALSRCRGAGSRGSRGGRGAAGSGDAAAAAEWIRKGSFIHKPAHGWLHPDARVLGPGVSYVVRYMGCIEVLRSMRSLDFNTRTQVTREAINRLHEAVPGVRGSWKKKAPNKALASVLGKSNLRFAGMSISIHISTDGLSLSVPATRQVIANHHMPSISFASGGDTDMTDYVAYVAKDPINQRACHILECCEGLAQSIISTVGQAFELRFKQYLHSPPKVALPPERLAGPEESAWGDEEDSLEHNYYNSIPGKEPPLGGLVDSRLALTQPCALTALDQGPSPSLRDACSLPWDVGSTGTAPPGDGYVQADARGPPDHEEHLYVNTQGLDAPEPEDSPKKDLFDMRPFEDALKLHECSVAAGVTAAPLPLEDQWPSPPTRRAPVAPTEEQLRQEPWYHGRMSRRAAERM.... The pIC50 is 8.3. (3) The small molecule is Cn1cc(C2=C(c3cn(C)c4cccc([N+](=O)[O-])c34)C(=O)NC2=O)c2ccccc21. 
The target protein (P68403) has sequence MADPAAGPPPSEGEESTVRFARKGALRQKNVHEVKNHKFTARFFKQPTFCSHCTDFIWGFGKQGFQCQVCCFVVHKRCHEFVTFSCPGADKGPASDDPRSKHKFKIHTYSSPTFCDHCGSLLYGLIHQGMKCDTCMMNVHKRCVMNVPSLCGTDHTERRGRIYIQAHIDREVLIVVVRDAKNLVPMDPNGLSDPYVKLKLIPDPKSESKQKTKTIKCSLNPEWNETFRFQLKESDKDRRLSVEIWDWDLTSRNDFMGSLSFGISELQKAGVDGWFKLLSQEEGEYFNVPVPPEGSEGNEELRQKFERAKIGQGTKAPEEKTANTISKFDNNGNRDRMKLTDFNFLMVLGKGSFGKVMLSERKGTDELYAVKILKKDVVIQDDDVECTMVEKRVLALPGKPPFLTQLHSCFQTMDRLYFVMEYVNGGDLMYHIQQVGRFKEPHAVFYAAEIAIGLFFLQSKGIIYRDLKLDNVMLDSEGHIKIADFGMCKENIWDGVTTKT.... The pIC50 is 5.7. (4) The small molecule is CC1(C)CC=C(c2nc([C@H]3CC(C)(C)O[C@](C)(CO)C3)ccc2NC(=O)c2nc(C#N)c[nH]2)CC1. The target protein (Q00495) has sequence MELGPPLVLLLATVWHGQGAPVIEPSGPELVVEPGETVTLRCVSNGSVEWDGPISPYWTLDPESPGSTLTTRNATFKNTGTYRCTELEDPMAGSTTIHLYVKDPAHSWNLLAQEVTVVEGQEAVLPCLITDPALKDSVSLMREGGRQVLRKTVYFFSAWRGFIIRKAKVLDSNTYVCKTMVNGRESTSTGIWLKVNRVHPEPPQIKLEPSKLVRIRGEAAQIVCSATNAEVGFNVILKRGDTKLEIPLNSDFQDNYYKKVRALSLNAVDFQDAGIYSCVASNDVGTRTATMNFQVVESAYLNLTSEQSLLQEVSVGDSLILTVHADAYPSIQHYNWTYLGPFFEDQRKLEFITQRAIYRYTFKLFLNRVKASEAGQYFLMAQNKAGWNNLTFELTLRYPPEVSVTWMPVNGSDVLFCDVSGYPQPSVTWMECRGHTDRCDEAQALQVWNDTHPEVLSQKPFDKVIIQSQLPIGTLKHNMTYFCKTHNSVGNSSQYFRAVS.... The pIC50 is 8.0. (5) The drug is OC[C@H]1NC[C@H](O)[C@@H](O)[C@H]1O. The target protein (P35573) has sequence MGHSKQIRILLLNEMEKLEKTLFRLEQGYELQFRLGPTLQGKAVTVYTNYPFPGETFNREKFRSLDWENPTEREDDSDKYCKLNLQQSGSFQYYFLQGNEKSGGGYIVVDPILRVGADNHVLPLDCVTLQTFLAKCLGPFDEWESRLRVAKESGYNMIHFTPLQTLGLSRSCYSLANQLELNPDFSRPNRKYTWNDVGQLVEKLKKEWNVICITDVVYNHTAANSKWIQEHPECAYNLVNSPHLKPAWVLDRALWRFSCDVAEGKYKEKGIPALIENDHHMNSIRKIIWEDIFPKLKLWEFFQVDVNKAVEQFRRLLTQENRRVTKSDPNQHLTIIQDPEYRRFGCTVDMNIALTTFIPHDKGPAAIEECCNWFHKRMEELNSEKHRLINYHQEQAVNCLLGNVFYERLAGHGPKLGPVTRKHPLVTRYFTFPFEEIDFSMEESMIHLPNKACFLMAHNGWVMGDDPLRNFAEPGSEVYLRRELICWGDSVKLRYGNKPE.... The pIC50 is 5.0.
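The pIC50 definition quoted in the task description (pIC50 = -log10(IC50 in M)) can be made concrete with a short conversion sketch. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced only for illustration; they are not part of BindingDB or any library, and they assume the IC50 is given in molar units as the definition states.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 labels taken from the examples in this record.
    for label in (5.8, 8.3, 5.7, 8.0, 5.0):
        print(f"pIC50 {label:.1f} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent: the label 8.3 in example (2) corresponds to an IC50 of roughly 5 nM, while 5.8 in example (1) corresponds to roughly 1.6 µM.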