From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is N#CC(c1ccc(Cl)cc1)c1ccc(Cl)cn1. The target protein sequence is MGMRTVLTGLAGMLLGSMMPVQADMPRPTGLAADIRWTAYGVPHIRAKDERGLGYGIGYAYARDNACLLAEEIVTARGERARYFGSEGKSSAELDNLPSDIFYAWLNQPEALQAFWQAQTPAVRQLLEGYAAGFNRFLREADGKTTSCLGQPWLRAIATDDLLRLTRRLLVEGGVGQFADALVAAAPPGTEKVALSGEQAFQVAEQRRQRFRLERGSNAIAVGSERSADGKGMLLANPHFPWNGAMRFYQMHLTIPGRLDVMGASLPGLPVVNIGFSRHLAWTHTVDTSSHFTLYRLALDPKDPRRYLVDGRSLPLEEKSVAIEVRGADGKLSRVEHKVYQSIYGPLVVWPGKLDWNRSEAYALRDANLENTRVLQQWYSINQASDVADLRRRVEALQGIPWVNTLAADEQGNALYMNQSVVPYLKPELIPACAIPQLVAEGLPALQGQDSRCAWSRDPAAAQAGITPAAQLPVLLRRDFVQNSNDSAWLTNPASPLQGF.... The pIC50 is 6.9. (2) The small molecule is Cc1c2c(O[C@H](C)[C@H]3CNC(=O)C3)cc(-c3cnn(C4CC(C)(C)OC4(C)C)c3)cc2nn1C. The target protein sequence is MAQKENSYPWPYGRQTAPSGLSTLPQRVLRKEPVTPSALVLMSRSNVQPTAAPGQKVMENSSGTPDILTRHFTIDDFEIGRPLGKGKFGNVYLAREKKSHFIVALKVLFKSQIEKEGVEHQLRREIEIQAHLHHPNILRLYNYFYDRRRIYLILEYAPRGELYKELQKSCTFDEQRTATIMEELADALMYCHGKKVIHRDIKPENLLLGLKGELKIADFGWSVHAPSLRRKTMCGTLDYLPPEMIEGRMHNEKVDLWCIGVLCYELLVGNPPFESASHNETYRRIVKVDLKFPASVPMGAQDLISKLLRHNPSERLPLAQVSAHPWVRANSRRVLPPSALQ. The pIC50 is 4.3. (3) The drug is O=C(c1ccc(F)cc1)C1CCN(CC(=O)N(Cc2nc3c(c(=O)[nH]2)COCC3)CC2CC2)CC1. 
The target protein (O95271) has sequence MAASRRSQHHHHHHQQQLQPAPGASAPPPPPPPPLSPGLAPGTTPASPTASGLAPFASPRHGLALPEGDGSRDPPDRPRSPDPVDGTSCCSTTSTICTVAAAPVVPAVSTSSAAGVAPNPAGSGSNNSPSSSSSPTSSSSSSPSSPGSSLAESPEAAGVSSTAPLGPGAAGPGTGVPAVSGALRELLEACRNGDVSRVKRLVDAANVNAKDMAGRKSSPLHFAAGFGRKDVVEHLLQMGANVHARDDGGLIPLHNACSFGHAEVVSLLLCQGADPNARDNWNYTPLHEAAIKGKIDVCIVLLQHGADPNIRNTDGKSALDLADPSAKAVLTGEYKKDELLEAARSGNEEKLMALLTPLNVNCHASDGRKSTPLHLAAGYNRVRIVQLLLQHGADVHAKDKGGLVPLHNACSYGHYEVTELLLKHGACVNAMDLWQFTPLHEAASKNRVEVCSLLLSHGADPTLVNCHGKSAVDMAPTPELRERLTYEFKGHSLLQAAREA.... The pIC50 is 7.5. (4) The drug is CN[C@@H](C)C(=O)N[C@@H](Cc1ccc(OCc2ccc(C(=O)NC3C[C@@H](C(=O)NC4CCCc5ccccc54)N(C(=O)[C@@H](NC(=O)[C@H](C)NC)C(C)(C)C)C3)c(F)c2)cc1)C(=O)N1Cc2ccccc2CC1C(=O)N[C@@H]1CCCc2ccccc21. The target protein sequence is RDHFALDRPSETHADYLLRTGQVVDISDTIYPRNPAMYSEEARLKSFQNWPDYAHLTPRELASAGLYYTGIGDQVQCFACGGKLKNWEPGDRAWSEHRRHFPNCFFVLGRNLNIRSE. The pIC50 is 8.7. (5) The small molecule is COc1cc(C=NNC(N)=S)cc(Br)c1OCc1ccccc1F. The target protein (P06492) has sequence MDLLVDELFADMNADGASPPPPRPAGGPKNTPAAPPLYATGRLSQAQLMPSPPMPVPPAALFNRLLDDLGFSAGPALCTMLDTWNEDLFSALPTNADLYRECKFLSTLPSDVVEWGDAYVPERTQIDIRAHGDVAFPTLPATRDGLGLYYEALSRFFHAELRAREESYRTVLANFCSALYRYLRASVRQLHRQAHMRGRDRDLGEMLRATIADRYYRETARLARVLFLHLYLFLTREILWAAYAEQMMRPDLFDCLCCDLESWRQLAGLFQPFMFVNGALTVRGVPIEARRLRELNHIREHLNLPLVRSAATEEPGAPLTTPPTLHGNQARASGYFMVLIRAKLDSYSSFTTSPSEAVMREHAYSRARTKNNYGSTIEGLLDLPDDDAPEEAGLAAPRLSFLPAGHTRRLSTAPPTDVSLGDELHLDGEDVAMAHADALDDFDLDMLGDGDSPGPGFTPHDSAPYGALDMADFEFEQMFTDALGIDEYGG. The pIC50 is 4.9. (6) The compound is CCCCCCCCNC1CC(CO)C(O)C(O)C1O. 
The target protein (Q9H227) has sequence MAFPAGFGWAAATAAYQVEGGWDADGKGPCVWDTFTHQGGERVFKNQTGDVACGSYTLWEEDLKCIKQLGLTHYRFSLSWSRLLPDGTTGFINQKGIDYYNKIIDDLLKNGVTPIVTLYHFDLPQTLEDQGGWLSEAIIESFDKYAQFCFSTFGDRVKQWITINEANVLSVMSYDLGMFPPGIPHFGTGGYQAAHNLIKAHARSWHSYDSLFRKKQKGMVSLSLFAVWLEPADPNSVSDQEAAKRAITFHLDLFAKPIFIDGDYPEVVKSQIASMSQKQGYPSSRLPEFTEEEKKMIKGTADFFAVQYYTTRLIKYQENKKGELGILQDAEIEFFPDPSWKNVDWIYVVPWGVCKLLKYIKDTYNNPVIYITENGFPQSDPAPLDDTQRWEYFRQTFQELFKAIQLDKVNLQVYCAWSLLDNFEWNQGYSSRFGLFHVDFEDPARPRVPYTSAKEYAKIIRNNGLEAHL. The pIC50 is 3.0. (7) The compound is COc1cc2c(Nc3ccc(NC(=O)c4ccccc4)cc3)ncnc2cc1OCCCN1CCOCC1. The target protein (Q6DE08) has sequence MSYKENLNPSSYTSKFTTPSSATAAQRVLRKEPYVSTFTTPSDNLLAQRTQLSRITPSASSSVPGRVAVSTEMPSQNTALAEMPKRKFTIDDFDIGRPLGKGKFGNVYLAREKQNKFIMALKVLFKSQLEKEGVEHQLRREIEIQSHLRHPNILRMYNYFHDRKRIYLMLEFAPRGELYKELQKHGRFDEQRSATFMEELADALHYCHERKVIHRDIKPENLLMGYKGELKIADFGWSVHAPSLRRRTMCGTLDYLPPEMIEGKTHDEKVDLWCAGVLCYEFLVGMPPFDSPSHTETHRRIVNVDLKFPPFLSDGSKDLISKLLRYHPPQRLPLKGVMEHPWVKANSRRVLPPVYQSTQSK. The pIC50 is 7.3. (8) The drug is CN1CCc2c(sc3c2C2=N/C(=C4/C=CC=CC4=O)NN2C=N3)C1. The target protein (P07711) has sequence MNPTLILAAFCLGIASATLTFDHSLEAQWTKWKAMHNRLYGMNEEGWRRAVWEKNMKMIELHNQEYREGKHSFTMAMNAFGDMTSEEFRQVMNGFQNRKPRKGKVFQEPLFYEAPRSVDWREKGYVTPVKNQGQCGSCWAFSATGALEGQMFRKTGRLISLSEQNLVDCSGPQGNEGCNGGLMDYAFQYVQDNGGLDSEESYPYEATEESCKYNPKYSVANDTGFVDIPKQEKALMKAVATVGPISVAIDAGHESFLFYKEGIYFEPDCSSEDMDHGVLVVGYGFESTESDNNKYWLVKNSWGEEWGMGGYVKMAKDRRNHCGIASAASYPTV. The pIC50 is 4.5. (9) The small molecule is Cc1cc(=O)n2[nH]c(C(F)(F)F)c(-c3ccc(Cl)cc3)c2n1. 
The target protein (P9WNS3) has sequence MLQQIRGPADLQHLSQAQLRELAAEIREFLIHKVAATGGHLGPNLGVVELTLALHRVFDSPHDPIIFDTGHQAYVHKMLTGRSQDFATLRKKGGLSGYPSRAESEHDWVESSHASAALSYADGLAKAFELTGHRNRHVVAVVGDGALTGGMCWEALNNIAASRRPVIIVVNDNGRSYAPTIGGVADHLATLRLQPAYEQALETGRDLVRAVPLVGGLWFRFLHSVKAGIKDSLSPQLLFTDLGLKYVGPVDGHDERAVEVALRSARRFGAPVIVHVVTRKGMGYPPAEADQAEQMHSTVPIDPATGQATKVAGPGWTATFSDALIGYAQKRRDIVAITAAMPGPTGLTAFGQRFPDRLFDVGIAEQHAMTSAAGLAMGGLHPVVAIYSTFLNRAFDQIMMDVALHKLPVTMVLDRAGITGSDGASHNGMWDLSMLGIVPGIRVAAPRDATRLREELGEALDVDDGPTALRFPKGDVGEDISALERRGGVDVLAAPADGLN.... The pIC50 is 4.0.
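The pIC50 values above follow the definition given at the top of the record, pIC50 = -log10(IC50 in M). A minimal sketch of that conversion in both directions is shown here; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced for illustration and are not part of the dataset or of BindingDB.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 expressed in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # Labels taken from the examples in this record.
    for label in (6.9, 4.3, 8.7, 3.0):
        print(f"pIC50 {label} corresponds to IC50 of {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, the pIC50 of 6.9 in example (1) corresponds to an IC50 of roughly 126 nM, and the pIC50 of 3.0 in example (6) corresponds to 1 mM, which is why higher values indicate more potent compounds.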