From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (Q9NR21) has sequence MWEANPEMFHKAEELFSKTTNNEVDDMDTSDTQWGWFYLAECGKWHMFQPDTNSQCSVSSEDIEKSFKTNPCGSISFTTSKFSYKIDFAEMKQMNLTTGKQRLIKRAPFSISAFSYICENEAIPMPPHWENVNTQVPYQLIPLHNQTHEYNEVANLFGKTMDRNRIKRIQRIQNLDLWEFFCRKKAQLKKKRGVPQINEQMLFHGTSSEFVEAICIHNFDWRINGIHGAVFGKGTYFARDAAYSSRFCKDDIKHGNTFQIHGVSLQQRHLFRTYKSMFLARVLIGDYINGDSKYMRPPSKDGSYVNLYDSCVDDTWNPKIFVVFDANQIYPEYLIDFH. The pIC50 is 5.3. The compound is O=C(c1ccc(Nc2nc3ccccc3n3nnnc23)cc1)N1CCN(c2ncccn2)CC1. (2) The drug is C=CC(=O)Oc1ccc(C(C)(C)CC)cc1. The target protein sequence is EEMIRSLQQRPEPTPEEWDLIHIATEAHRSTNAQGSHWKQRRKFLPDDIGQSPIVSMPDGDKVDLEAFSEFTKIITPAITRVVDFAKKLPMFSELPCEDQIILLKGCCMEIMSLRAAVRYDPESDTLTLSGEMAVKREQLKNGGLGVVSDAIFELGKSLSAFNLDDTEVALLQAVLLMSTDRSGLLCVDKIEKSQEAYLLAFEHYVNHRKHNIPHFWPKLLMKEREVQSSILYKGAAAEGRPGGSLGVHPEGQQLLGMHVVQV. The pIC50 is 4.2. (3) The compound is CCOC(OCC)c1ccc(/C=C2\COC/C(=C\c3ccc(C(OCC)OCC)cc3)C2=O)cc1. The target is SSSEEGLTCRGIPNSISI. The pIC50 is 4.0. (4) The drug is COCc1ccccc1-c1ccccc1-c1nnnn1-c1ccccc1F. The target protein (O15554) has sequence MGGDLVLGLGALRRRKRLLEQEKSLAGWALVLAGTGIGLMVLHAEMLWFGGCSWALYLFLVKCTISISTFLLLCLIVAFHAKEVQLFMTDNGLRDWRVALTGRQAAQIVLELVVCGLHPAPVRGPPCVQDLGAPLTSPQPWPGFLGQGEALLSLAMLLRLYLVPRAVLLRSGVLLNASYRSIGALNQVRFRHWFVAKLYMNTHPGRLLLGLTLGLWLTTAWVLSVAERQAVNATGHLSDTLWLIPITFLTIGYGDVVPGTMWGKIVCLCTGVMGVCCTALLVAVVARKLEFNKAEKHVHNFMMDIQYTKEMKESAARVLQEAWMFYKHTRRKESHAARRHQRKLLAAINAFRQVRLKHRKLREQVNSMVDISKMHMILYDLQQNLSSSHRALEKQIDTLAGKLDALTELLSTALGPRQLPEPSQQSK. The pIC50 is 7.2. (5) The compound is CCOc1ccc(Nc2nc3cc(S(C)(=O)=O)ccc3o2)cc1. 
The target protein (O70342) has sequence MEVKLEEHFNKTFVTENNTAASQNTASPAWEDYRGTENNTSAARNTAFPVWEDYRGSVDDLQYFLIGLYTFVSLLGFMGNLLILMAVMKKRNQKTTVNFLIGNLAFSDILVVLFCSPFTLTSVLLDQWMFGKAMCHIMPFLQCVSVLVSTLILISIAIVRYHMIKHPISNNLTANHGYFLIATVWTLGFAICSPLPVFHSLVELKETFGSALLSSKYLCVESWPSDSYRIAFTISLLLVQYILPLVCLTVSHTSVCRSISCGLSHKENRLEENEMINLTLHPSKKSRDQAKPPSTQKWSYSFIRKHRRRYSKKTACVLPAPAGPSQEKHLTVPENPGSVRSQLSPSSKVIPGVPICFEVKPEESSDAQEMRVKRSLTRIKKRSRSVFYRLTILILVFAVSWMPLHVFHVVTDFNDNLISNRHFKLVYCICHLLGMMSCCLNPILYGFLNNGIKADLRALIHCLHMS. The pIC50 is 6.7. (6) The pIC50 is 6.5. The target protein (P58295) has sequence MDCSAPKEMNKPPTNILEATVPGHRDSPRAPRTSPEQDLPAAAPAAAVQPPRVPRSASTGAQTFQSADARACEAQRPGVGFCKLSSPQAQATSAALRDLSEGHSAQANPPSGAAGAGNALHCKIPALRGPEEDENVSVGKGTLEHNNTPAVGWVNMSQSTVVLGTDGIASVLPGSVATTTIPEDEQGDENKARGNWSSKLDFILSMVGYAVGLGNVWRFPYLAFQNGGGAFLIPYLMMLALAGLPIFFLEVSLGQFASQGPVSVWKAIPALQGCGIAMLIISVLIAIYYNVIICYTLFYLFASFVSVLPWGSCNNPWNTPECKDKTKLLLDSCVIGDHPKIQIKNSTFCMTAYPNLTMVNFTSQANKTFVSGSEEYFKYFVLKISAGIEYPGEIRWPLAFCLFLAWVIVYASLAKGIKTSGKVVYFTATFPYVVLVILLIRGVTLPGAGAGIWYFITPKWEKLTDATVWKDAATQIFFSLSAAWGGLITLSSYNKFHNNC.... The drug is CCc1ccc(/C(=C\C[C@H](N)C(=O)[O-])c2ccc(F)cc2F)cc1CC. (7) The compound is Cc1c(Oc2ccc(Cl)cc2O)c(=O)oc2cc(O)ccc12. The target protein sequence is MSVLHRFYLFFLFTKFFHCYKISYVLKNAKLAPNHAIKNINSLNLLSENKKENYYYCGENKVALVTGAGRGIGREIAKMLAKSVSHVICISRTQKSCDSVVDEIKSFGYESSGYAGDVSKKEEISEVINKILTEHKNVDILVNNAGITRDNLFLRMKNDEWEDVLRTNLNSLFYITQPISKRMINNRYGRIINISSIVGLTGNVGQANYSSSKAGVIGFTKSLAKELASRNITVNAIAPGFISSDMTDKISEQIKKNIISNIPAGRMGTPEEVANLACFLSSDKSGYINGRVFVIDGGLSP. The pIC50 is 4.0. (8) The target protein (P65502) has sequence MKKIVLYGGQFNPIHTAHMIVASEVFHELQPDEFYFLPSFMSPLKKHNNFIDVQHRLTMIQMIIDELGFGDICDDEIKRGGQSYTYDTIKAFKEQHKDSELYFVIGTDQYNQLEKWYQIEYLKEMVTFVVVNRDKNSQNVENAMIAIQIPRVDISSTMIRQRVSEGKSIQVLVPKSVENYIKGEGLYEH. The pIC50 is 4.8. The small molecule is COC(=O)CCCC1=CC2=CC(=O)C(C)(OC(=O)CCc3ccccc3)C(=O)C2=CO1.
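The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in a few lines. This is only an illustration of the arithmetic. The function names `ic50_to_pic50`, `pic50_to_ic50`, and `ic50_nm_to_pic50` are hypothetical and not part of the dataset or of any BindingDB tooling, and the nanomolar helper assumes the raw IC50 is reported in nM, which the record itself does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Assumption: the input is in nanomolar; rescale to mol/L first."""
    return ic50_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # Labels taken from the examples in this record.
    for label in (5.3, 4.2, 7.2):
        ic50_um = pic50_to_ic50(label) * 1e6  # mol/L -> micromolar
        print(f"pIC50 {label} -> IC50 of about {ic50_um:.3g} uM")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: the label 5.3 in example (1) corresponds to an IC50 of roughly 5 µM, while 7.2 in example (4) corresponds to roughly 0.06 µM, which is why higher values mean more potent.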