This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCOC(=O)c1c(C)n(Cc2ccccc2)c2ccc(OCCCN3CCCCC3)cc12. The target protein (P0A0J7) has sequence MNKQIFVLYFNIFLIFLGIGLVIPVLPVYLKDLGLTGSDLGLLVAAFALSQMIISPFGGTLADKLGKKLIICIGLILFSVSEFMFAVGHNFSVLMLSRVIGGMSAGMVMPGVTGLIADISPSHQKAKNFGYMSAIINSGFILGPGIGGFMAEVSHRMPFYFAGALGILAFIMSIVLIHDPKKSTTSGFQKLEPQLLTKINWKVFITPVILTLVLSFGLSAFETLYSLYTADKVNYSPKDISIAITGGGIFGALFQIYFFDKFMKYFSELTFIAWSLLYSVVVLILLVFANGYWSIMLISFVVFIGFDMIRPAITNYFSNIAGERQGFAGGLNSTFTSMGNFIGPLIAGALFDVHIEAPIYMAIGVSLAGVVIVLIEKQHRAKLKEQNM. The pIC50 is 5.2. (2) The small molecule is Cc1ccc(NC(=O)c2cccc(C(F)(F)F)c2)cc1-c1cc(N2CCOCC2)nc(N2CCS(=O)(=O)CC2)n1. The target protein sequence is QEKNKIRPRGQRDSSEEWEIEASEVMLSTRIGSGSFGTVYKGKWHGDVAVKILKVVDPTPEQFQAFRNEVAVLRKTRHVNILLFMGYMTKDNLAIVTQWCEGSSLYKHLHVQETKFQMFQLIDIARQTAQGMDYLHAKNIIHRDMKSNNIFLHEGLTVKIGDFGLATVKSRWSGSQQVEQPTGSVLWMAPEVIRMQDNNPFSFQSDVYSYGIVLYELMTGELPYSHINNRDQIIFMVGRGYASPDLSKLYKNCPKAMKRLVADCVKKVKEERPLFPQILSSIELLQHSLPKINRSASEPSLHRAAHTEDINACTLTTSPRLPVF. The pIC50 is 9.0. (3) The drug is N[C@H]1CC[C@H](Nc2nc(Nc3ccc(CN4CCOCC4)cc3)c3ncn(-c4ccccc4)c3n2)CC1. The pIC50 is 6.2. The target is PFCDPK1(Pfalciparum). (4) The small molecule is O=CN(O)CCCc1ccccc1. The target protein (P44786) has sequence MTALNVLIYPDDHLKVVCEPVTKVNDAIRKIVDDMFDTMYQEKGIGLAAPQVDILQRIITIDVEGDKQNQFVLINPEILASEGETGIEEGCLSIPGFRALVPRKEKVTVRALDRDGKEFTLDADGLLAICIQHEIDHLNGILFVDYLSPLKRQRIKEKLIKYKKQIAKS. The pIC50 is 6.8.
(5) The target protein (P00748) has sequence MRALLLLGFLLVSLESTLSIPPWEAPKEHKYKAEEHTVVLTVTGEPCHFPFQYHRQLYHKCTHKGRPGPQPWCATTPNFDQDQRWGYCLEPKKVKDHCSKHSPCQKGGTCVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLLRFFHKNEIWYRTEQAAVARCQCKGPDAHCQRLASQACRTNPCLHGGRCLEVEGHRLCHCPVGYTGAFCDVDTKASCYDGRGLSYRGLARTTLSGAPCQPWASEATYRNVTAEQARNWGLGGHAFCRNPDNDIRPWCFVLNRDRLSWEYCDLAQCQTPTQAAPPTPVSPRLHVPLMPAQPAPPKPQPTTRTPPQSQTPGALPAKREQPPSLTRNGPLSCGQRLRKSLSSMTRVVGGLVALRGAHPYIAALYWGHSFCAGSLIAPCWVLTAAHCLQDRPAPEDLTVVLGQERRNHSCEPCQTLAVRSYRLHEAFSPVSYQHDLALLRLQEDADGSCALLSPYVQPVCLPSGAARPSETTLC.... The drug is CCOC(=O)c1cccc(NC(=O)CSc2nnc(-c3ccoc3C)n2CCOC)c1. The pIC50 is 4.3. (6) The drug is COc1ccc(-c2c(C(N)=O)c3n(c2C2CC2)CCN(C(=O)NCC24CC5CC(CC(C5)C2)C4)C3)cc1. The target protein (P49674) has sequence MELRVGNKYRLGRKIGSGSFGDIYLGANIASGEEVAIKLECVKTKHPQLHIESKFYKMMQGGVGIPSIKWCGAEGDYNVMVMELLGPSLEDLFNFCSRKFSLKTVLLLADQMISRIEYIHSKNFIHRDVKPDNFLMGLGKKGNLVYIIDFGLAKKYRDARTHQHIPYRENKNLTGTARYASINTHLGIEQSRRDDLESLGYVLMYFNLGSLPWQGLKAATKRQKYERISEKKMSTPIEVLCKGYPSEFSTYLNFCRSLRFDDKPDYSYLRQLFRNLFHRQGFSYDYVFDWNMLKFGAARNPEDVDRERREHEREERMGQLRGSATRALPPGPPTGATANRLRSAAEPVASTPASRIQPAGNTSPRAISRVDRERKVSMRLHRGAPANVSSSDLTGRQEVSRIPASQTSVPFDHLGK. The pIC50 is 8.4.
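The pIC50 definition used throughout this record (pIC50 = -log10(IC50 in M)) can be made concrete with a small conversion sketch. The function names `ic50_molar_to_pic50` and `pic50_to_ic50_nanomolar` are hypothetical helpers introduced here for illustration only; they are not part of BindingDB or of any dataset tooling.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in molar (M) to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition: IC50 in M is 10**(-pIC50); scale by 1e9 for nM."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 labels of the six examples in this record, in order.
    for pic50 in (5.2, 9.0, 6.2, 6.8, 4.3, 8.4):
        print(f"pIC50 {pic50:.1f} -> IC50 of about {pic50_to_ic50_nanomolar(pic50):.3g} nM")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: a pIC50 of 9.0 corresponds to an IC50 of 1 nM, while 5.2 corresponds to roughly 6.3 µM.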