This dataset contains drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCCCCCCCCCCCC/C=C/[C@@H](O)[C@H](CC(F)(F)P(=O)([O-])OCC[N+](C)(C)C)NC(=O)CCCCCCCCCCCCCCCCC. The target protein (O60906) has sequence MKPNFSLRLRIFNLNCWGIPYLSKHRADRMRRLGDFLNQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSGIIGSGLCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHLSGMVLNAYVTHLHAEYNRQKDIYLAHRVAQAWELAQFIHHTSKKADVVLLCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEGNTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFETTTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGPAERSPLMCVLKEAWTELGLGMAQARWWATFASYVIGLGLLLLALLCVLAAGGGAGEAAILLWTPSVGLVLWAGAFYLFHVQEVNGLYRAQAELQHVLGRAREAQDLGPEPQPALLLGQQEGDRTKEQ. The pIC50 is 3.4. (2) The small molecule is CC(CCc1ccco1)NC(=O)CCc1nnc2ccc(NCc3ccccc3Cl)nn12. The target protein (P41235) has sequence MRLSKTLVDMDMADYSAALDPAYTTLEFENVQVLTMGNDTSPSEGTNLNAPNSLGVSALCAICGDRATGKHYGASSCDGCKGFFRRSVRKNHMYSCRFSRQCVVDKDKRNQCRYCRLKKCFRAGMKKEAVQNERDRISTRRSSYEDSSLPSINALLQAEVLSRQITSPVSGINGDIRAKKIASIADVCESMKEQLLVLVEWAKYIPAFCELPLDDQVALLRAHAGEHLLLGATKRSMVFKDVLLLGNDYIVPRHCPELAEMSRVSIRILDELVLPFQELQIDDNEYAYLKAIIFFDPDAKGLSDPGKIKRLRSQVQVSLEDYINDRQYDSRGRFGELLLLLPTLQSITWQMIEQIQFIKLFGMAKIDNLLQEMLLGGSPSDAPHAHHPLHPHLMQEHMGTNVIVANTMPTHLSNGQMCEWPRPRGQAATPETPQPSPPGGSGSEPYKLLPGAVATIVKPLSAIPQPTITKQEVI. The pIC50 is 4.4. (3) The compound is CC(=O)N[C@@H]1[C@@H](N)C=C(C(=O)O)O[C@H]1[C@H](O)[C@H](O)CO.
The target protein (Q99519) has sequence MTGERPSTALPDRRWGPRILGFWGGCRVWVFAAIFLLLSLAASWSKAENDFGLVQPLVTMEQLLWVSGRQIGSVDTFRIPLITATPRGTLLAFAEARKMSSSDEGAKFIALRRSMDQGSTWSPTAFIVNDGDVPDGLNLGAVVSDVETGVVFLFYSLCAHKAGCQVASTMLVWSKDDGVSWSTPRNLSLDIGTEVFAPGPGSGIQKQREPRKGRLIVCGHGTLERDGVFCLLSDDHGASWRYGSGVSGIPYGQPKQENDFNPDECQPYELPDGSVVINARNQNNYHCHCRIVLRSYDACDTLRPRDVTFDPELVDPVVAAGAVVTSSGIVFFSNPAHPEFRVNLTLRWSFSNGTSWRKETVQLWPGPSGYSSLATLEGSMDGEEQAPQLYVLYEKGRNHYTESISVAKISVYGTL. The pIC50 is 3.0. (4) The drug is CCc1cc(OCc2ccc(-c3ccccc3-c3nnn[nH]3)cc2)c2ccc(OC)cc2n1. The target protein (Q9WV26) has sequence MILNSSTEDGIKRIQDDCPKAGRHSYIFVMIPTLYSIIFVVGIFGNSLVVIVIYFYMKLKTVASVFLLNLALADICFLLTLPLWAVYTAMEYRWPFGNYLCKIASASVSFNLYASVFLLTCLSIDRYLAIVHPMKSRLRRTMLVAKVTCVIIWLMAGLASLPAVIHRNVFFIENTNITVCAFHYESQNSTLPIGLGLTKNILGFMFPFLIILTSYTLIWKALKKAYEIQKNKPRNDDIFKIIMAIVLFFFFSWVPHQIFTFLDVLIQLGIIHDCKISDIVDTAMPITICIAYFNNCLNPLFYGFLGKKFKKYFLQLLKYIPPKAKSHSTLSTKMSTLSYRPSDNVSSSAKKPVQCFEVE. The pIC50 is 6.7. (5) The drug is O=c1[nH]c(O)c(Cc2cccc(OCc3ccccc3)c2)c(=O)[nH]1. The target protein sequence is MDNLLRHLKISKEQITPVVLVVGDPGRVDKIKVVCDSYVDLAYNREYKSVECHYKGQKFLCVSHGVGSAGCAVCFEELCQNGAKVIIRAGSCGSLQPDLIKRGDICICNAAVREDRVSHLLIHGDFPAVGDFDVYDTLNKCAQELNVPVFNGISVSSDMYYPNKIIPSRLEDYSKANAAVVEMELATLMVIGTLRKVKTGGILIVDGCPFKWDEGDFDNNLVPHQLENMIKIALGACAKLATKYA. The pIC50 is 3.0. (6) The target protein (P32214) has sequence MRFLLLNRFTLLLLLLVSPTPVLQAPTNLTDSGLDQEPFLYLVGRKKLLDAQYKCYDRIQQLPPYEGEGPYCNRTWDGWMCWDDTPAGVMSYQHCPDYFPDFDPTEKVSKYCDENGEWFRHPDSNRTWSNYTLCNAFTPDKLHNAYVLYYLALVGHSMSIAALIASMGIFLFFKNLSCQRVTLHKNMFLTYILNSIIIIIHLVEVVPNGDLVRRDPMHIFHHNTYMWTMQWELSPPLPLSAHEGKMDPHDSEVISCKILHFFHQYMMACNYFWMLCEGIYLHTLIVMAVFTEDQRLRWYYLLGWGFPIVPTIIHAITRAVYYNDNCWLSTETHLLYIIHGPVMAALVVNFFFLLNIVRVLVTKMRQTHEAEAYMYLKAVKATMVLVPLLGIQFVVFPWRPSNKVLGKIYDYLMHSLIHFQGFFVATIYCFCNHEVQVTLKRQWAQFKIQWSHRWGRRRRPTNRVVSAPRAVAFAEPGGLPIYICHQEPRNPPVSNNEGEE.... 
The drug is CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H]1CCCN1C(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H]1CCCNC(=O)C[C@H](NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](Cc2ccc(O)cc2)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCSC)NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)[C@H](CO)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(N)=O)NC(=O)CNC(=O)[C@@H](N)CS)[C@@H](C)O)[C@@H](C)O)[C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](Cc2c[nH]cn2)N1)[C@@H](C)O)C(=O)NCC(=O)N[C@H](C(=O)NCC(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(N)=O)C(C)C. The pIC50 is 6.8. (7) The small molecule is O=c1c(CCOCc2ccccc2)c(O)n(-c2ccc(Cl)cc2)n1-c1ccc(Cl)cc1. The target protein (P0A749) has sequence MDKFRVQGPTKLQGEVTISGAKNAALPILFAALLAEEPVEIQNVPKLKDVDTSMKLLSQLGAKVERNGSVHIDARDVNVFCAPYDLVKTMRASIWALGPLVARFGQGQVSLPGGCTIGARPVDLHISGLEQLGATIKLEEGYVKASVDGRLKGAHIVMDKVSVGATVTIMCAATLAEGTTIIENAAREPEIVDTANFLITLGAKISGQGTDRIVIEGVERLGGGVYRVLPDRIETGTFLVAAAISRGKIICRNAQPDTLDAVLAKLRDAGADIEVGEDWISLDMHGKRPKAVNVRTAPHPAFPTDMQAQFTLLNLVAEGTGFITETVFENRFMHVPELSRMGAHAEIESNTVICHGVEKLSGAQVMATDLRASASLVLAGCIAEGTTVVDRIYHIDRGYERIEDKLRALGANIERVKGE. The pIC50 is 4.5. (8) The compound is O=C(O)[C@H]1/C(=C/CO)O[C@@H]2CC(=O)N21. The target protein sequence is MIKSSWRKIAMLAAAVPLLLASGALWASTDAIHQKLTDLEKRSGGRLGVALINTADNSQILYRGDERFAMCSTSKVMAAAAVLKQSESNKEVVNKRLEINAADLVVWSPITEKHLQSGMTLAELSAATLQYSDNTAMNLIIGYLGGPEKVTAFARSIGDATFRLDRTEPTLNTAIPGDERDTSTPLAMAESLRKLTLGDALGEQQRAQLVTWLKGNTTGGQSIRAGLPESWVVGDKTGAGDYGTTNDIAVIWPEDHAPLILVTYFTQPQQDAKNRKEVLAAAAKIVTEGL. The pIC50 is 7.5.
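
The pIC50 definition used in these records, pIC50 = -log10(IC50 in M), can be sketched as a pair of conversion helpers. This is a minimal illustration only; the function names ic50_to_pic50 and pic50_to_ic50 are hypothetical and are not part of BindingDB or the bindingdb_ic50 dataset.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: recover IC50 in mol/L from a pIC50 value."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Hypothetical usage: print the molar IC50 implied by a few pIC50 labels
    for label in (3.0, 4.4, 7.5):
        print(f"pIC50 {label} corresponds to IC50 = {pic50_to_ic50(label):.3g} M")
```

Under this definition, a label of 7.5 corresponds to an IC50 of roughly 32 nM, while a label of 3.0 corresponds to 1 mM, which is why higher values indicate more potent binding.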