This is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is Cc1csc2c1-n1cccc1/C2=N/OCCN(C)C. The target protein (P28222) has sequence MEEPGAQCAPPPPAGSETWVPQANLSSAPSQNCSAKDYIYQDSISLPWKVLLVMLLALITLATTLSNAFVIATVYRTRKLHTPANYLIASLAVTDLLVSILVMPISTMYTVTGRWTLGQVVCDFWLSSDITCCTASILHLCVIALDRYWAITDAVEYSAKRTPKRAAVMIALVWVFSISISLPPFFWRQAKAEEEVSECVVNTDHILYTVYSTVGAFYFPTLLLIALYGRIYVEARSRILKQTPNRTGKRLTRAQLITDSPGSTSSVTSINSRVPDVPSESGSPVYVNQVKVRVSDALLEKKKLMAARERKATKTLGIILGAFIVCWLPFFIISLVMPICKDACWFHLAIFDFFTWLGYLNSLINPIIYTMSNEDFKQAFHKLIRFKCTS. The pIC50 is 5.6. (2) The compound is N#Cc1ccc([C@H]2CCCCc3cncn32)c(-c2ccc(F)cc2)c1. The target protein (P30099) has sequence MGACDNDFIELHSRVTADVWLARPWQCLHRTRALGTTATLAPKTLKPFEAIPQYSRNKWLKMIQILREQGQENLHLEMHQAFQELGPIFRHSAGGAQIVSVMLPEDAEKLHQVESILPRRMHLEPWVAHRELRGLRRGVFLLNGAEWRFNRLKLNPNVLSPKAVQNFVPMVDEVARDFLEALKKKVRQNARGSLTMDVQQSLFNYTIEASNFALFGERLGLLGHDLNPGSLKFIHALHSMFKSTTQLLFLPRSLTRWTSTQVWKEHFDAWDVISEYANRCIWKVHQELRLGSSQTYSGIVAALITQGALPLDAIKANSMELTAGSVDTTAIPLVMTLFELARNPDVQQALRQETLAAEASIAANPQKAMSDLPLLRAALKETLRLYPVGGFLERILNSDLVLQNYHVPAGTLVLLYLYSMGRNPAVFPRPERYMPQRWLERKRSFQHLAFGFGVRQCLGRRLAEVEMLLLLHHMLKTFQVETLRQEDVQMAYRFVLMPSS.... The pIC50 is 9.0. (3) The drug is Cc1ccc(Cl)cc1-c1cn(-c2cc(N)ncn2)cc1C#N. The target protein sequence is LPEPSCPQLATLTSQCLTYEPTQRPSFRTILRDLTRLQPHNLADVLTVNPDSPASDPTVFHKRYLKKIRDLGEGHFGKVSLYCYDPTNDGTGEMVAVKALKADCGPQHRSGWKQEIDILRTLYHEHIIKYKGCCEDQGEKSLQLVMEYVPLGSLRDYLPRHSIGLAQLLLFAQQICEGMAYLHAQHYIHRDLAARNVLLDNDRLVKIGDFGLAKAVPEGHEYYRVREDGDSPVFWYAPECLKEYKFYYASDVWSFGVTLYELLTHCDSSQSPPTKFLELIGIAQGQMTVLRLTELLERGERLPRPDKCPCEVYHLMKNCWETEASFRPTFENLIPILKTVHEKYQGQAPSVFSVC. The pIC50 is 7.4.
(4) The drug is CCOc1ccc(-c2ccc(Cn3c(CC(C)(C)C(=O)O)c(SC(C)(C)C)c4cc(OCc5ccc(C)cn5)ccc43)cc2)cn1. The target protein (P20291) has sequence MDQEAVGNVVLLAIVTLISVVQNAFFAHKVELESKAQSGRSFQRTGTLAFERVYTANQNCVDAYPTFLVVLWTAGLLCSQVPAAFAGLMYLFVRQKYFVGYLGERTQSTPGYIFGKRIILFLFLMSLAGILNHYLIFFFGSDFENYIRTITTTISPLLLIP. The pIC50 is 6.6. (5) The small molecule is COc1cc2nc3n(c(=O)c2cc1OC)CCc1c-3[nH]c2ccccc12. The target protein (Q16678) has sequence MGTSLSPNDPWPLNPLSIQQTTLLLLLSVLATVHVGQRLLRQRRRQLRSAPPGPFAWPLIGNAAAVGQAAHLSFARLARRYGDVFQIRLGSCPIVVLNGERAIHQALVQQGSAFADRPAFASFRVVSGGRSMAFGHYSEHWKVQRRAAHSMMRNFFTRQPRSRQVLEGHVLSEARELVALLVRGSADGAFLDPRPLTVVAVANVMSAVCFGCRYSHDDPEFRELLSHNEEFGRTVGAGSLVDVMPWLQYFPNPVRTVFREFEQLNRNFSNFILDKFLRHCESLRPGAAPRDMMDAFILSAEKKAAGDSHGGGARLDLENVPATITDIFGASQDTLSTALQWLLLLFTRYPDVQTRVQAELDQVVGRDRLPCMGDQPNLPYVLAFLYEAMRFSSFVPVTIPHATTANTSVLGYHIPKDTVVFVNQWSVNHDPLKWPNPENFDPARFLDKDGLINKDLTSRVMIFSVGKRRCIGEELSKMQLFLFISILAHQCDFRANPNEP.... The pIC50 is 6.8. (6) The small molecule is CCCCc1oc2ccccc2c1C(=O)c1cc(I)c(OCCN)c(I)c1. The target protein (P18113) has sequence MTPNSMTENRLPAWDKQKPHPDRGQDWKLVGMSEACLHRKSHVERRGALKNEQTSSHLIQATWASSIFHLDPDDVNDQSVSSAQTFQTEEKKCKGYIPSYLDKDELCVVCGDKATGYHYRCITCEGCKGFFRRTIQKSLHPSYSCKYEGKCIIDKVTRNQCQECRFKKCIYVGMATDLVLDDSKRLAKRKLIEENREKRRREELQKSIGHKPEPTDEEWELIKTVTEAHVATNAQGSHWKQKRKFLPEDIGQAPIVNAPEGGQVDLEAFSHFTKIITPAITRVVDFAKKLPMFCELPCEDQIILLKGCCMEIMSLRAAVRYDPDSETLTLNGEMAVTRGQLKNGGLGVVSDAIFDLGMSLSSFNLDDTEVALLQAVLLMSSDRPGLACVERIEKYQDSFLLAFEHYINYRKHHVTHFWPKLLMKVTDLRMIGACHASRFLHMKVECPTELFPPLFLEVFED. The pIC50 is 4.7.
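The pIC50 definition used in this record (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion, which may help when reading the labels in concentration terms. The function names below are hypothetical and are not part of the dataset; the nanomolar input unit is an assumption chosen because IC50 values are commonly reported in nM.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in M).

    Assumes the input is in nM; 1 nM = 1e-9 M, so pIC50 = 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in nM = 10 ** (9 - pIC50)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # Labels taken from the examples in this record.
    for label in (5.6, 9.0, 7.4, 6.6, 6.8, 4.7):
        print(f"pIC50 {label:.1f} -> IC50 of about {ic50_nm_from_pic50(label):.3g} nM")
```

Under this definition, example (2) with pIC50 9.0 corresponds to an IC50 of 1 nM, while example (6) with pIC50 4.7 corresponds to roughly 20 µM, which is why higher values mean more potent.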