This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCc1[nH]c2nc(N)[nH]c(=O)c2c1Sc1ccc(Cl)cc1. The target protein (P0A884) has sequence MKQYLELMQKVLDEGTQKNDRTGTGTLSIFGHQMRFNLQDGFPLVTTKRCHLRSIIHELLWFLQGDTNIAYLHENNVTIWDEWADENGDLGPVYGKQWRAWPTPDGRHIDQITTVLNQLKNDPDSRRIIVSAWNVGELDKMALAPCHAFFQFYVADGKLSCQLYQRSCDVFLGLPFNIASYALLVHMMAQQCDLEVGDFVWTGGDTHLYSNHMDQTHLQLSREPRPLPKLIIKRKPESIFDYRFEDFEIEGYDPHPGIKAPVAI. The pIC50 is 4.5. (2) The drug is CCCN1CCN(S(=O)(=O)c2cnc(O[C@H](C)COC)c(-c3nc4c(CC)n(C)nc4c(=O)[nH]3)c2)CC1. The target protein (Q28263) has sequence MGEVTAEQVEKFLDSNIIFAKQYYNLRYRAKVISDMLGAKEAAVDFSNYHSLSSVEESEIIFDLLRDFQENLQAERCIFNVMKKLCFLLQADRMSLFMYRVRNGIAELATRLFNVHKDAVLEECLVAPDSEIVFPLDMGVVGHVAHSKKIANVVNTEEDEHFCDFVDTLTEYQTKNILASPIMNGKDVVAVIMAVNKVDEPHFTKRDEEILLKYLNFANLIMKVYHLSYLHNCETRRGQILLWSGSKVFEELTDIERQFHKALYTVRAFLNCDRYSVGLLDMTKQKEFFDVWPVLMGEAPPYSGPRTPDGREINFYKVIDYILHGKEDIKVIPNPPPDHWALVSGLPTYVAQNGLICNIMNAPAEDFFAFQKEPLDESGWMIKNVLSMPIVNKKEEIVGVATFYNRKDGKPFDEMDETLMESLAQFLGWSVLNPDTYESMNRLENRKDIFQDMVKYHVKCDNEEIQKILKTREVYGKEPWECEEEELAEILQGELPDAEK.... The pIC50 is 5.9. (3) The small molecule is O=C(Nc1ccnc(N2CCNCC2)n1)[C@@H]1CC[C@@H]2CN1C(=O)N2OS(=O)(=O)O. The target protein sequence is ATALTNLVAEPFAKLEQDFGGSIGVYAMDTGSGATVSYRAEERFPLCSSFKGFLAAAVLARSQQQAGLLDTPIRYGKNALVPWSPISEKYLTTGMTVAELSAAAVQYSDNAAANLLLKELGGPAGLTAFMRSIGDTTFRLDRWELELNSAIPGDARDTSSPRAVTESLQKLTLGSALAAPQRQQFVDWLKGNTTGNHRIRAAVPADWAVGDKTGTCGVYGTANDYAVVWPTGRAPIVLAVYTRAPNKDDKYSEAVIAAAA. The pIC50 is 7.7. (4) The drug is O=C([O-])C1=CS[C@@H]2/C(=C\c3cc4n(n3)CCSC4)C(=O)N12. 
The target protein (P52663) has sequence MSLNVKQSRIAILFSSCLISISFFSQANTKGIDEIKNLETDFNGRIGVYALDTGSGKSFSYRANERFPLCSSFKGFLAAAVLKGSQDNRLNLNQIVNYNTRSLEFHSPITTKYKDNGMSLGDMAAAALQYSDNGATNIILERYIGGPEGMTKFMRSIGDEDFRLDRWELDLNTAIPGDERDTSTPAAVAKSLKTLALGNILSEHEKETYQTWLKGNTTGAARIRASVPSDWVVGDKTGSCGAYGTANDYAVVWPKNRAPLIISVYTTKNEKEAKHEDKVIAEASRIAIDNLK. The pIC50 is 7.7. (5) The drug is C=C1C(C)=C(C(=O)[O-])N2C(=O)/C(=C/c3ccccn3)[C@H]2S1(=O)=O. The target protein sequence is MMKKSLCCALLLGISCSALATPVSEKQLAEVVANTVTPLMKAQSVPGMAVAVIYQGKPHYYTFGKADIAANKPVTPQTLFELGSISKTFTGVLGGDAIARGEISLDDPVTRYWPQLTGKQWQGIRMLDLATYTAGGLPLQVPDEVTDNASLLRFYQNWQPQWKPGTTRLYANASIGLFGALAVKPSGMPYEQAMTTRVLKPLKLDHTWINVPKAEEAHYAWGYRDGKAVRAVRVSPGMLDAQAYGVKTNVQDMANWVMANMAPENVADASLKQGIALAQSRYWRIGSMYQGLGWEMLNWPVEANTVVEGSDSKVALAPLPVAEVNPPAPPVKASWVHKTGSTGGFGSYVAFIPEKQIGIVMLANTSYPNPARVEAAYHILEALQ. The pIC50 is 7.4.
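
The pIC50 definition used for the labels above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of BindingDB or of the dataset's tooling; the function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical, and the only assumption is that IC50 is expressed in molar units, as the definition states.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in molar (M) to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: recover IC50 in molar (M) from pIC50."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # pIC50 labels taken from the five examples above.
    for pic50 in (4.5, 5.9, 7.7, 7.7, 7.4):
        ic50_nm = pic50_to_ic50(pic50) * 1e9  # convert M to nM for display
        print(f"pIC50 {pic50} corresponds to IC50 of about {ic50_nm:.1f} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why a higher pIC50 means a more potent compound.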