From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is OC[C@H]1O[C@@H](n2cnc3c(N[C@@H]4CCOC4)ncnc32)[C@H](O)[C@@H]1O. The target protein (O43868) has sequence MEKASGRQSIALSTVETGTVNPGLELMEKEVEPEGSKRTDAQGHSLGDGLGPSTYQRRSRWPFSKARSFCKTHASLFKKILLGLLCLAYAAYLLAACILNFQRALALFVITCLVIFVLVHSFLKKLLGKKLTRCLKPFENSRLRLWTKWVFAGVSLVGLILWLALDTAQRPEQLIPFAGICMFILILFACSKHHSAVSWRTVFSGLGLQFVFGILVIRTDLGYTVFQWLGEQVQIFLNYTVAGSSFVFGDTLVKDVFAFQALPIIIFFGCVVSILYYLGLVQWVVQKVAWFLQITMGTTATETLAVAGNIFVGMTEAPLLIRPYLGDMTLSEIHAVMTGGFATISGTVLGAFIAFGVDASSLISASVMAAPCALASSKLAYPEVEESKFKSEEGVKLPRGKERNVLEAASNGAVDAIGLATNVAANLIAFLAVLAFINAALSWLGELVDIQGLTFQVICSYLLRPMVFMMGVEWTDCPMVAEMVGIKFFINEFVAYQQLS.... The pIC50 is 3.6. (2) The small molecule is O=C(O)c1cnc2cc(O)ccc2c1O. The target protein (P11708) has sequence MSEPIRVLVTGAAGQIAYSLLYSIGNGSVFGKDQPIILVLLDITPMMGVLDGVLMELQDCALPLLKDVIATDKEEIAFKDLDVAILVGSMPRRDGMERKDLLKANVKIFKCQGAALDKYAKKSVKVIVVGNPANTNCLTASKSAPSIPKENFSCLTRLDHNRAKAQIALKLGVTSDDVKNVIIWGNHSSTQYPDVNHAKVKLQAKEVGVYEAVKDDSWLKGEFITTVQQRGAAVIKARKLSSAMSAAKAICDHVRDIWFGTPEGEFVSMGIISDGNSYGVPDDLLYSFPVTIKDKTWKIVEGLPINDFSREKMDLTAKELAEEKETAFEFLSSA. The pIC50 is 3.0. (3) The drug is C[C@H]1c2ccccc2N(Cc2ccccc2)C[C@@H](C)N1C(=O)NCc1ccccc1. 
The target protein (Q9P0X4) has sequence MAESASPPSSSAAAPAAEPGVTTEQPGPRSPPSSPPGLEEPLDGADPHVPHPDLAPIAFFCLRQTTSPRNWCIKMVCNPWFECVSMLVILLNCVTLGMYQPCDDMDCLSDRCKILQVFDDFIFIFFAMEMVLKMVALGIFGKKCYLGDTWNRLDFFIVMAGMVEYSLDLQNINLSAIRTVRVLRPLKAINRVPSMRILVNLLLDTLPMLGNVLLLCFFVFFIFGIIGVQLWAGLLRNRCFLEENFTIQGDVALPPYYQPEEDDEMPFICSLSGDNGIMGCHEIPPLKEQGRECCLSKDDVYDFGAGRQDLNASGLCVNWNRYYNVCRTGSANPHKGAINFDNIGYAWIVIFQVITLEGWVEIMYYVMDAHSFYNFIYFILLIIVGSFFMINLCLVVIATQFSETKQREHRLMLEQRQRYLSSSTVASYAEPGDCYEEIFQYVCHILRKAKRRALGLYQALQSRRQALGPEAPAPAKPGPHAKEPRHYHGKTKGQGDEGRH.... The pIC50 is 8.1. (4) The drug is C[C@@H]1[C@H]2C3=CC[C@@H]4[C@@]5(C)CC[C@H](O)C(C)(C)[C@@H]5CC[C@@]4(C)[C@]3(C)CC[C@@]2(C(=O)O)CC[C@H]1C. The target protein (P59264) has sequence HLLQFRKMIKKMTGKEPIVSYAFYGCYCGKGGRGKPKDATDRCCFVHDCCYEKVTGCDPKWSYYTYSLEDGDIVCEGDPYCTKVKCECDKKAAICFRDNLKTYKNRYMTFPDIFCTDPTEGC. The pIC50 is 5.6. (5) The compound is CCOc1cc(/C=C2\C(=O)NC(=O)N(CCc3ccc(F)cc3)C2=O)ccc1O. The target protein (P00651) has sequence MMYSKLLTLTTLLLPTALALPSLVERACDYTCGSNCYSSSDVSTAQAAGYQLHEDGETVGSNSYPHKYNNYEGFDFSVSSPYYEWPILSSGDVYSGGSPGADRVVFNENNQLAGVITHTGASGNNFVECT. The pIC50 is 4.5. (6) The compound is CC(C)Cn1c(=O)n(C)c(=O)c2[nH]cnc21. The target protein sequence is MERAGPSFGQQRQQQQPQQQKQQQRDQDSVEAWLDDHWDFTFSYFVRKATREMVNAWFAERVHTIPVCKEGIRGHTESCSCPLQQSPRADNSAPGTPTRKISASEFDRPLRPIVVKDSEGTVSFLSDSEKKEQMPLTPPRFDHDEGDQCSRLLELVKDISSHLDVTALCHKIFLHIHGLISADRYSLFLVCEDSSNDKFLISRLFDVAEGSTLEEVSNNCIRLEWNKGIVGHVAALGEPLNIKDAYEDPRFNAEVDQITGYKTQSILCMPIKNHREEVVGVAQAINKKSGNGGTFTEKDEKDFAAYLAFCGIVLHNAQLYETSLLENKRNQVLLDLASLIFEEQQSLEVILKKIAATIISFMQVQKCTIFIVDEDCSDSFSSVFHMECEELEKSSDTLTREHDANKINYMYAQYVKNTMEPLNIPDVSKDKRFPWTTENTGNVNQQCIRSLLCTPIKNGKKNKVIGVCQLVNKMEENTGKVKPFNRNDEQFLEAFVIFCG.... The pIC50 is 5.3. (7) The drug is O=C(NCCCCCN1CCCC1)c1cncc(Br)c1. 
The target protein (Q86YI8) has sequence MDSDSCAAAFHPEEYSPSCKRRRTVEDFNKFCTFVLAYAGYIPYPKEELPLRSSPSPANSTAGTIDSDGWDAGFSDIASSVPLPVSDRCFSHLQPTLLQRAKPSNFLLDRKKTDKLKKKKKRKRRDSDAPGKEGYRGGLLKLEAADPYVETPTSPTLQDIPQAPSDPCSGWDSDTPSSGSCATVSPDQVKEIKTEGKRTIVRQGKQVVFRDEDSTGNDEDIMVDSDDDSWDLVTCFCMKPFAGRPMIECNECHTWIHLSCAKIRKSNVPEVFVCQKCRDSKFDIRRSNRSRTGSRKLFLD. The pIC50 is 4.3. (8) The small molecule is O=c1[nH]ccc2nc(-c3ccc(CN4CCN(c5ncccn5)CC4)cc3)c(-c3ccccc3)cc12. The target protein sequence is MSDVAIVKEGWLHKRGEYIKTWRPRYFLLKNDGTFIGYKERPQDVDQRCAPLNNFSVAQCQLMKTERPRPNTFIIRCLQWTTVIERTFHVETPEEREEWTTAIQTVADGLKKQEEEEMDFRSGSPSDNSGAEEMEVSLAKPKHRVTMNEFEYLKLLGKGTFGKVILVKEKATGRYYAMKILKKEVIVAKDEVAHTLTENRVLQNSRHPFLTALKYSFQTHDRLCFVMEYANGGELFFHLSRERVFSEDRARFYGAEIVSALDYLHSEKNVVYRDLKLENLMLDKDGHIKITDFGLSKEGIKDGATMKTFSGTPEYLAPEVLEDNDYGRAVDWWGLGVVMYEMMSGRLPFYNQDHEKLFELILMEEIRFPRTLGPEAKSLLSGLLKKDPKQRLGGGSEDAKEIMQHRFFAGIVWQHVYEKKLSPPFKPQVTSETDTRYFDEEFTAQMITITPPDQDDSMECVDSERRPHFPQFSYSASGTA. The pIC50 is 6.4. (9) The compound is N=C(N)NC(=O)c1ncc(-c2cccc(-c3ccccc3)c2)nc1N. The target protein (O35240) has sequence MKPRSGLEEAQRRQASDIRVFASSCTMHGLGHIFGPGGLTLRRGLWATAVLLSLAAFLYQVAERVRYYGEFHHKTTLDERESHQLTFPAVTLCNINPLRRSRLTPNDLHWAGTALLGLDPAEHAAYLRALGQPPAPPGFMPSPTFDMAQLYARAGHSLEDMLLDCRYRGQPCGPENFTVIFTRMGQCYTFNSGAHGAELLTTPKGGAGNGLEIMLDVQQEEYLPIWKDMEETPFEVGIRVQIHSQDEPPAIDQLGFGAAPGHQTFVSCQQQQLSFLPPPWGDCNTASLDPDDFDPEPSDPLGSPRPRPSPPYSLIGCRLACESRYVARKCGCRMMHMPGNSPVCSPQQYKDCASPALDAMLRKDTCVCPNPCATTRYAKELSMVRIPSRASARYLARKYNRSESYITENVLVLDIFFEALNYEAVEQKAAYEVSELLGDIGGQMGLFIGASLLTILEILDYLCEVFQDRVLGYFWNRRSAQKRSGNTLLQEELNGHRTHV.... The pIC50 is 6.0.
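
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical, and the only assumption is that IC50 is expressed in molar units, as the definition says.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in molar units = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # pIC50 labels taken from entries (1) and (3) of this record
    for label in (3.6, 8.1):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50(label):.3g} M")
```

Under this definition each pIC50 unit is a tenfold change in IC50, so the label 8.1 in entry (3) corresponds to an IC50 of roughly 7.9 nM, while 3.6 in entry (1) corresponds to roughly 251 µM.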