From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC1(C)CC2(CCCN(C3CCN(C(=O)c4c(NC(N)=O)sc5ccccc45)CC3)C2)C(=O)O1. The target protein (Q13085) has sequence MDEPSPLAQPLELNQHSRFIIGSVSEDNSEDEISNLVKLDLLEEKEGSLSPASVGSDTLSDLGISSLQDGLALHIRSSMSGLHLVKQGRDRKKIDSQRDFTVASPAEFVTRFGGNKVIEKVLIANNGIAAVKCMRSIRRWSYEMFRNERAIRFVVMVTPEDLKANAEYIKMADHYVPVPGGPNNNNYANVELILDIAKRIPVQAVWAGWGHASENPKLPELLLKNGIAFMGPPSQAMWALGDKIASSIVAQTAGIPTLPWSGSGLRVDWQENDFSKRILNVPQELYEKGYVKDVDDGLQAAEEVGYPVMIKASEGGGGKGIRKVNNADDFPNLFRQVQAEVPGSPIFVMRLAKQSRHLEVQILADQYGNAISLFGRDCSVQRRHQKIIEEAPATIATPAVFEHMEQCAVKLAKMVGYVSAGTVEYLYSQDGSFYFLELNPRLQVEHPCTEMVADVNLPAAQLQIAMGIPLYRIKDIRMMYGVSPWGDSPIDFEDSAHVPC.... The pIC50 is 7.1. (2) The small molecule is CC(C)(C)CC[C@H](c1ccc(C(=O)NCc2nn[nH]n2)cc1)N1C(=O)C(c2ccc(C(F)(F)F)cc2)=N[C@]12CC[C@@H](C(C)(C)C)CC2. The target protein (Q61606) has sequence MPLTQLHCPHLLLLLLVLSCLPEAPSAQVMDFLFEKWKLYSDQCHHNLSLLPPPTELVCNRTFDKYSCWPDTPPNTTANISCPWYLPWYHKVQHRLVFKRCGPDGQWVRGPRGQPWRNASQCQLDDEEIEVQKGVAKMYSSQQVMYTVGYSLSLGALLLALVILLGLRKLHCTRNYIHGNLFASFVLKAGSVLVIDWLLKTRYSQKIGDDLSVSVWLSDGAMAGCRVATVIMQYGIIANYCWLLVEGVYLYSLLSLATFSERSFFSLYLGIGWGAPLLFVIPWVVVKCLFENVQCWTSNDNMGFWWILRIPVFLALLINFFIFVHIIHLLVAKLRAHQMHYADYKFRLARSTLTLIPLLGVHEVVFAFVTDEHAQGTLRSTKLFFDLFLSSFQGLLVAVLYCFLNKEVQAELMRRWRQWQEGKALQEERLASSHGSHMAPAGPCHGDPCEKLQLMSAGSSSGTGCVPSMETSLASSLPRLADSPT. The pIC50 is 6.7. (3) The drug is Cc1cncc(-c2cc3c(cn2)cnn3-c2cccc(C3CNCCC3(F)F)n2)n1. 
The target protein sequence is MGPAAPLALPPPALPDPAGEPARGQPRQRPQSSSDSPSALRASRSQSRNATRSLSPGRRLSPSSLRRRCCSSRHRRRTDTLEVGMLLSKINSLAHLRARPCNDLHATKLAPGKEKEPLESQYQVGPLLGSGGFGSVYSGIRVADNLPVAIKHVEKDRISDWGELPNGTRVPMEVVLLKKVSSDFSGVIRLLDWFERPDSFVLILERPEPVQDLFDFITERGALQEDLARGFFWQVLEAVRHCHNCGVLHRDIKDENILIDLSRGEIKLIDFGSGALLKDTVYTDFDGTRVYSPPEWIRYHRYHGRSAAVWSLGILLYDMVCGDIPFEHDEEIIKGQVFFRQTVSSECQHLIKWCLSLRPSDRPSFEEIRNHPWMQGDLLPQAASEIHLHSLSPGSSK. The pIC50 is 6.7. (4) The pIC50 is 7.0. The target protein sequence is MDGTAAEPRPGAGSLQHAQPPPQPRKKRPEDFKFGKILGEGSFSTVVLARELATSREYAIKILEKRHIIKENKVPYVTRERDVMSRLDHPFFVKLYFTFQDDEKLYFGLSYAKNGELLKYIRKIGSFDETCTRFYTAEIVSALEYLHGKGIIHRDLKPENILLNEDMHIQITDFGTAKVLSPESKQARANSFVGTAQYVSPELLTEKSACKSSDLWALGCIIYQLVAGLPPFRAGNEYLIFQKIIKLEYDFPEKFFPKARDLVEKLLVLDATKRLGCEEMEGYGPLKAHPFFESVTWENLHQQTPPKLT. The compound is CNc1ncc2cc(OCCNc3ncc(C#N)cc3C(=O)N[C@@H](C)c3ccc(F)cc3)ccc2n1. (5) The compound is CC1CCCC(n2cnc(CC(CCCN)C(=O)O)c2)C1. The target protein (Q96IY4) has sequence MKLCSLAVLVPIVLFCEQHVFAFQSGQVLAALPRTSRQVQVLQNLTTTYEIVLWQPVTADLIVKKKQVHFFVNASDVDNVKAHLNVSGIPCSVLLADVEDLIQQQISNDTVSPRASASYYEQYHSLNEIYSWIEFITERHPDMLTKIHIGSSFEKYPLYVLKVSGKEQAAKNAIWIDCGIHAREWISPAFCLWFIGHITQFYGIIGQYTNLLRLVDFYVMPVVNVDGYDYSWKKNRMWRKNRSFYANNHCIGTDLNRNFASKHWCEEGASSSSCSETYCGLYPESEPEVKAVASFLRRNINQIKAYISMHSYSQHIVFPYSYTRSKSKDHEELSLVASEAVRAIEKISKNTRYTHGHGSETLYLAPGGGDDWIYDLGIKYSFTIELRDTGTYGFLLPERYIKPTCREAFAAVSKIAWHVIRNV. The pIC50 is 7.7. (6) The pIC50 is 6.3. The small molecule is N#C[C@@]1(NC(=O)[C@@H]([NH3+])Cc2cccs2)C[C@@H]1c1ccccc1. The target protein (P28293) has sequence MQPLLLLLTFILLQGDEAGKIIGGREARPHSYPYMAFLLIQSPEGLSACGGFLVREDFVLTAAHCLGSSINVTLGAHNIQMRERTQQLITVLRAIRHPDYNPQNIRNDIMLLQLRRRARRSGSVKPVALPQASKKLQPGDLCTVAGWGRVSQSRGTNVLQEVQLRVQMDQMCANRFQFYNSQTQICVGNPRERKSAFRGDSGGPLVCSNVAQGIVSYGSNNGNPPAVFTKIQSFMPWIKRTMRRFAPRYQRPANSLSQAQT. (7) The compound is CC(=O)NC(CCC(=O)O)P(=O)(O)CC(CCC(=O)O)C(=O)O. 
The target protein (A4Q9F0) has sequence MPSLPQDGVIQGSSPVDLGTELPYQCTMKRKVRKKKKKGIITANVAGTKFEIVRLVIDEMGFMKTPDEDETSNLIWCDAAVQQEKITDLQNYQRINHFPGMGEICRKDFLARNMTKMIKSRPMDYTFVPRTWIFPSEYTQFQNYVKELKKKRKQKTFIVKPANGAMGHGISLIRNGDKVPSQDHLIVQEYIEKPFLMEGYKFDLRIYILVTSCDPLKIFLYHDGLVRMGTEKYIPPNESNLTQLYMHLTNYSVNKHNERFERNETEDKGSKRSIKWFTEFLQANQHDVTKFWSDISELVVKTLIVAEPHVLHAYRMCRPGQPPGSESVCFEVLGFDILLDRKLKPWLLEINRAPSFGTDQKIDYDVKRGVLLNALKLLNIRTSDKRKNLAKQKAEAQRRLYGQNPVRRLSPGSSDWEQQRHQLERRKEELKERLLQVRKQVSQEEHENRHMGNYRRIYPPEDKALLEKYEGLLAVAFQTFLSGRAASFQREMNNPLKKMR.... The pIC50 is 3.8.
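
The pIC50 definition stated above, pIC50 = -log10(IC50 in M), can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the code assumes the IC50 value has already been converted to molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform and report the IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # Hypothetical input: an IC50 of 100 nM expressed in mol/L
    print(ic50_to_pic50(100e-9))
    # Convert a pIC50 label back to a concentration in nM
    print(pic50_to_ic50_nm(7.0))
```

On this scale a pIC50 of 7.0 corresponds to an IC50 of 100 nM, and each additional unit corresponds to a tenfold lower IC50, which is why higher values indicate a more potent compound.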