From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CCNC(=O)Nc1cc(Nc2ccc(OC)cc2OC)c(C(=O)Nc2cccnc2)cn1. The target protein (Q2FYS5) has sequence MNKQNNYSDDSIQVLEGLEAVRKRPGMYIGSTDKRGLHHLVYEIVDNSVDEVLNGYGNEIDVTINKDGSISIEDNGRGMPTGIHKSGKPTVEVIFTVLHAGGKFGQGGYKTSGGLHGVGASVVNALSEWLEVEIHRDGNIYHQSFKNGGSPSSGLVKKGKTKKTGTKVTFKPDDTIFKASTSFNFDVLSERLQESAFLLKNLKITLNDLRSGKERQEHYHYEEGIKEFVSYVNEGKEVLHDVATFSGEANGIEVDVAFQYNDQYSESILSFVNNVRTKDGGTHEVGFKTAMTRVFNDYARRINELKTKDKNLDGNDIREGLTAVVSVRIPEELLQFEGQTKSKLGTSEARSAVDSVVADKLPFYLEEKGQLSKSLVKKAIKAQQAREAARKAREDARSGKKNKRKDTLLSGKLTPAQSKNTEKNELYLVEGDSAGGSAKLGRDRKFQAILPLRGKVINTEKARLEDIFKNEEINTIIHTIGAGVGTDFKIEDSNYNRVII.... The pIC50 is 6.1. (2) The small molecule is CCCCc1ccc(NS(=O)(=O)c2ccc3c4c(cccc24)C(=O)N3)cc1. The target protein sequence is MAVRELPGAWNFRDVADTATALRPGRLFRSSELSRLDDAGRATLRRLGITDVADLRSSREVARRGPGRVPDGIDVHLLPFPDLADDDADDSAPHETAFKRLLTNDGSNGESGESSQSINDAATRYMTDEYRQFPTRNGAQRALHRVVTLLAAGRPVLTHCFAGKDRTGFVVALVLEAVGLDRDVIVADYLRSNDSVPQLRARISEMIQQRFDTELAPEVVTFTKARLSDGVLGVRAEYLAAARQTIDETYGSLGGYLRDAGISQATVNRMRGVLLG. The pIC50 is 5.8. (3) The pIC50 is 5.7. The small molecule is N#C[C@@H]1CCCN1C(=O)CNC(=O)C12CC3CC(CC(O)(C3)C1)C2. The target protein (P97321) has sequence MKTWLKTVFGVTTLAALALVVICIVLRPSRVYKPEGNTKRALTLKDILNGTFSYKTYFPNWISEQEYLHQSEDDNIVFYNIETRESYIILSNSTMKSVNATDYGLSPDRQFVYLESDYSKLWRYSYTATYYIYDLQNGEFVRGYELPRPIQYLCWSPVGSKLAYVYQNNIYLKQRPGDPPFQITYTGRENRIFNGIPDWVYEEEMLATKYALWWSPDGKFLAYVEFNDSDIPIIAYSYYGDGQYPRTINIPYPKAGAKNPVVRVFIVDTTYPHHVGPMEVPVPEMIASSDYYFSWLTWVSSERVCLQWLKRVQNVSVLSICDFREDWHAWECPKNQEHVEESRTGWAGGFFVSTPAFSQDATSYYKIFSDKDGYKHIHYIKDTVENAIQITSGKWEAIYIFRVTQDSLFYSSNEFEGYPGRRNIYRISIGNSPPSKKCVTCHLRKERCQYYTASFSYKAKYYALVCYGPGLPISTLHDGRTDQEIQVLEENKELENSLRN.... 
(4) The drug is COc1c(N2CC3CCCNC3C2)c(F)cc2c1N(C1CC1)CC(C(=O)O)C2=O. The target protein (P9WG47) has sequence MTDTTLPPDDSLDRIEPVDIEQEMQRSYIDYAMSVIVGRALPEVRDGLKPVHRRVLYAMFDSGFRPDRSHAKSARSVAETMGNYHPHGDASIYDSLVRMAQPWSLRYPLVDGQGNFGSPGNDPPAAMRYTEARLTPLAMEMLREIDEETVDFIPNYDGRVQEPTVLPSRFPNLLANGSGGIAVGMATNIPPHNLRELADAVFWALENHDADEEETLAAVMGRVKGPDFPTAGLIVGSQGTADAYKTGRGSIRMRGVVEVEEDSRGRTSLVITELPYQVNHDNFITSIAEQVRDGKLAGISNIEDQSSDRVGLRIVIEIKRDAVAKVVINNLYKHTQLQTSFGANMLAIVDGVPRTLRLDQLIRYYVDHQLDVIVRRTTYRLRKANERAHILRGLVKALDALDEVIALIRASETVDIARAGLIELLDIDEIQAQAILDMQLRRLAALERQRIIDDLAKIEAEIADLEDILAKPERQRGIVRDELAEIVDRHGDDRRTRIIA.... The pIC50 is 5.0. (5) The small molecule is CC(C)(C)NC(=O)CCC(NC(=O)c1ccc(NCc2cnc3nc(N)nc(N)c3n2)cc1)C(=O)O. The target protein sequence is MVRPLNCIVAVSQNMGIGKNGDLPWPPLRNEFKYFQRMTTTSSVEGKQNLVIMGRKTWFSIPEKNRPLKDRINIVLSRELKEPPRGAHFLAKSLDDALRLIEQPELASKVDMVWIVGGSSVYQEAMNQPGHLRLFVTRIMQEFESDTFFPEIDLGKYKLLPEYPGVLSEVQEEKGIKYKFEVYEKKD. The pIC50 is 7.4. (6) The small molecule is CC(C)C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)c1cnccn1)B(O)O. The target protein (P28063) has sequence MALLDLCGAARGQRPEWAALDAGSGGRSDPGHYSFSAQAPELALPRGMQPTAFLRSFGGDQERNVQIEMAHGTTTLAFKFQHGVIVAVDSRATAGSYISSLRMNKVIEINPYLLGTMSGCAADCQYWERLLAKECRLYYLRNGERISVSAASKLLSNMMLQYRGMGLSMGSMICGWDKKGPGLYYVDDNGTRLSGQMFSTGSGNTYAYGVMDSGYRQDLSPEEAYDLGRRAIAYATHRDNYSGGVVNMYHMKEDGWVKVESSDVSDLLYKYGEAAL. The pIC50 is 7.8. (7) The pIC50 is 6.1. The target protein (Q9ERZ8) has sequence MADPGDGPRAAPGDVAEPPGDESGTSGGEAFPLSSLANLFEGEEGSSSLSPVDASRPAGPGDGRPNLRMKFQGAFRKGVPNPIDLLESTLYESSVVPGPKKAPMDSLFDYGTYRHHPSDNKRWRRKVVEKQPQSPKAPAPQPPPILKVFNRPILFDIVSRGSTADLDGLLSYLLTHKKRLTDEEFREPSTGKTCLPKALLNLSNGRNDTIPVLLDIAERTGNMREFINSPFRDIYYRGQTALHIAIERRCKHYVELLVAQGADVHAQARGRFFQPKDEGGYFYFGELPLSLAACTNQPHIVNYLTENPHKKADMRRQDSRGNTVLHALVAIADNTRENTKFVTKMYDLLLLKCSRLFPDSNLETVLNNDGLSPLMMAAKTGKIGVFQHIIRREVTDEDTRHLSRKFKDWAYGPVYSSLYDLSSLDTCGEEVSVLEILVYNSKIENRHEMLAVEPINELLRDKWRKFGAVSFYINVVSYLCAMVIFTLTAYYQPLEGTPPY.... 
The compound is Cc1nc(C)c(-c2csc(Nc3ccc(C(=O)N4C5COCC4CC(=O)C5)cn3)n2)s1. (8) The pIC50 is 6.0. The target protein sequence is MTDKAFTEHQFWSTQPVRQPGAPDADKVGFIMESSLDAVPAEPYSLPSTFEWWSPDVANPEDLRGVHELLRDNYVEDSESMFRFNYSEEFLRWALMPPGYHQSWHVGVRLKSNKSVLGFVAGVPITMRLGTPKMVLEKREHGEDGGEEVINDYLEPQTICEINFLCVHKKLRQRRLGPILIKEVTRRVNLMNIWHAVYTSGTLLPTPFAKGHYFHRSLNSQKLVDVKFSGIPPHYKRFQNPVAVMERLYRLPDKTKTRGLRLMEPADVPQVTQLLLKRLASFDVAPVFNEEEVAHYFLPREGVVFSYVVESPVGPGKDEENAGKASKGTPTGTKCVTGGCEKVITDFFSFYSLPSTIIGNSNHSLLKVAYVYYTAATSVSITQLVNDLLIIVKLNGFDVCNVVDIYDNGTYLKELKFSPGDGNLYYYFYNWSYPSIPANEVGLVMV. The drug is Cc1nn(C)c(C)c1N[S+](=O)([O-])c1c(Cl)cccc1Cl. (9) The drug is COc1ccc(-c2cc(-c3cc4c(=O)[nH]cnc4[nH]3)ccn2)cc1. The target protein (P49137) has sequence MLSNSQGQSPPVPFPAPAPPPQPPTPALPHPPAQPPPPPPQQFPQFHVKSGLQIKKNAIIDDYKVTSQVLGLGINGKVLQIFNKRTQEKFALKMLQDCPKARREVELHWRASQCPHIVRIVDVYENLYAGRKCLLIVMECLDGGELFSRIQDRGDQAFTEREASEIMKSIGEAIQYLHSINIAHRDVKPENLLYTSKRPNAILKLTDFGFAKETTSHNSLTTPCYTPYYVAPEVLGPEKYDKSCDMWSLGVIMYILLCGYPPFYSNHGLAISPGMKTRIRMGQYEFPNPEWSEVSEEVKMLIRNLLKTEPTQRMTITEFMNHPWIMQSTKVPQTPLHTSRVLKEDKERWEDVKEEMTSALATMRVDYEQIKIKKIEDASNPLLLKRRKKARALEAAALAH. The pIC50 is 6.3. (10) The small molecule is COC(=O)[C@@H](N)CCB1O[C@@]2(C)CCC3C[C@@]2(O1)C3(C)C. The target protein (P68890) has sequence MKISEEEVRHVAKLSKLSFSESETTTFATTLSKIVDMVELLNEVDTEGVAITTTMADKKNVMRQDVAEEGTDRALLFKNVPEKENHFIKVPAILDDGGDA. The pIC50 is 7.2.
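The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers and are not part of the bindingdb_ic50 dataset or any library.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # A 1 micromolar IC50 (1e-6 M) corresponds to a pIC50 of 6.
    print(ic50_to_pic50(1e-6))
    # Each unit of pIC50 is a tenfold change in IC50, so higher means more potent.
    print(pic50_to_ic50_nm(7.0), pic50_to_ic50_nm(6.0))
```

Under this definition, a pIC50 of 6.1, as in example (1), corresponds to an IC50 of roughly 794 nM.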