This is drug-target binding data from BindingDB, based on IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=C(O)c1cc2c(C#Cc3cc(F)cc(F)c3)c(-c3ccccc3)oc2cc1O. The target protein (Q9UNH5) has sequence MAAESGELIGACEFMKDRLYFATLRNRPKSTVNTHYFSIDEELVYENFYADFGPLNLAMVYRYCCKLNKKLKSYSLSRKKIVHYTCFDQRKRANAAFLIGAYAVIYLKKTPEEAYRALLSGSNPPYLPFRDASFGNCTYNLTILDCLQGIRKGLQHGFFDFETFDVDEYEHYERVENGDFNWIVPGKFLAFSGPHPKSKIENGYPLHAPEAYFPYFKKHNVTAVVRLNKKIYEAKRFTDAGFEHYDLFFIDGSTPSDNIVRRFLNICENTEGAIAVHCKAGLGRTGTLIACYVMKHYRFTHAEIIAWIRICRPGSIIGPQQHFLEEKQASLWVQGDIFRSKLKNRPSSEGSINKILSGLDDMSIGGNLSKTQNMERFGEDNLEDDDVEMKNGITQGDKLRALKSQRQPRTSPSCAFRSDDTKGHPRAVSQPFRLSSSLQGSAVTLKTSKMALSPSATAKRINRTSLSSGATVRSFSINSRLASSLGNLNAATDDPENKKT.... The pIC50 is 4.7. (2) The small molecule is Cc1c2c(O[C@H](C)[C@H]3CNC(=O)C3)cc(-c3cnn(C4CC4)c3)cc2nn1C. The target protein sequence is PMDTEVYESPYADPEEIRPKEVYLDRKLLTLEDKELGSGNFGTVKKGYYQMKKVVKTVAVKILKNEANDPALKDELLAEANVMQQLDNPYIVRMIGICEAESWMLVMEMAELGPLNKYLQQNRHVKDKNIIELVHQVSMGMKYLEESNFVHRDLAARNVLLVTQHYAKISDFGLSKALRADENYYKAQTHGKWPVKWYAPECINYYKFSSKSDVWSFGVLMWEAFSYGQKPYRGMKGSEVTAMLEKGERMGCPAGCPREMYDLMNLCWTYDVENRPGFAAVELRLRNYYYDVVN. The pIC50 is 8.6. (3) The drug is NCCCC[C@H](OP(=O)(O)CCCCc1ccccc1)C(=O)N1c2ccccc2C[C@H]1C(=O)O. 
The target protein (P12822) has sequence MGAAPGRRGPRLLRPPPPLLLLLLLLRPPPAALTLDPGLLPGDFAADEAGARLFASSYNSSAEQVLFRSTAASWAHDTNITAENARRQEEEALLSQEFAEAWGKKAKELYDPVWQNFTDPELRRIIGAVRTLGPANLPLAKRQQYNSLLSNMSQIYSTGKVCFPNKTASCWSLDPDLNNILASSRSYAMLLFAWEGWHNAVGIPLKPLYQEFTALSNEAYRQDGFSDTGAYWRSWYDSPTFEEDLERIYHQLEPLYLNLHAYVRRVLHRRYGDRYINLRGPIPAHLLGNMWAQSWESIYDMVVPFPDKPNLDVTSTMVQKGWNATHMFRVAEEFFTSLGLLPMPPEFWAESMLEKPEDGREVVCHASAWDFYNRKDFRIKQCTQVTMDQLSTVHHEMGHVQYYLQYKDQPVSLRRANPGFHEAIGDVLALSVSTPAHLHKIGLLDHVTNDTESDINYLLKMALEKIAFLPFGYLVDQWRWGVFSGRTPSSRYNFDWWYLR.... The pIC50 is 8.3. (4) The compound is COC1=C2C[C@@H](C)C[C@H](OC)[C@H](O)[C@@H](C)/C=C(\C)[C@H](OC(N)=O)[C@@H](OC)/C=C\C=C(/C)C(=O)NC(=CC1=O)C2=O. The target protein sequence is MPEEVQTQDQPMETFAVQTFAFQAEIAQLMSLIYESLTDPSKLDSGK. The pIC50 is 6.5. (5) The compound is C[C@H]1CC[C@H](N)CC1. The target protein (Q9GT49) has sequence MPTLQSLAVPFGCVQGYAPGGIPAYSNKHESYFSGERSIDGNLFCGFKYQCVEFARRWLFERKSLVLPDVDWAVHIFNLKEVSDARTGKPVRCVAIRNGTAAKPVVDSLLIYPSDDYSPVGHVAAITEVGDKWVRIADQNHRFHKWDANYAAELPLIHEKGVWTILDPLEDEVLKPLGWVTFPDTPDRNPNEPLVLHESLHFKRGELPTLRRLTFTPTSREKDWLDLTNEAEAYFADVCGIDVKNPKLEKASYYQMNRELYLDCAKYGNQLHQMFLEATKFVLGSDELLRLFCIPEEYWPRLRHSWETQPHAITGRFDFAFDEDTQQFKCFEYNADSASTLLECGVIQQKWARSVGLDDGTTYSSGSLVSSRLQLAWEMAEVTGRVHFLIDNDDEEHYTALYVMQHASAAGLETKLCVLFDEFHFDENGVVVDSDGVAVTTVWKTWMWETAIADHQKARVQRGNDWRPTPKDEVRLCDILLGPNWDLRVFEPMWKIIPSN.... The pIC50 is 7.0.
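
The pIC50 definition in the task description (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is only an illustration of the formula, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the nanomolar conversion is an added convenience that assumes 1 M = 1e9 nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 [M] = 10 ** (-pIC50), and 1 M = 1e9 nM,
    so IC50 [nM] = 10 ** (9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 expressed in mol/L.
    print(ic50_to_pic50(1e-6))
    # The pIC50 labels of examples (1) and (2), converted back to nM.
    print(pic50_to_ic50_nm(4.7))
    print(pic50_to_ic50_nm(8.6))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean a more potent compound.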