Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted quantity is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB using IC50 measurements. (1) The small molecule is CN(CCO)c1ccc(C(=O)Nc2sc(Nc3ccc4ccccc4c3)nc2C(N)=O)cc1. The target protein (Q9UKE5) has sequence MASDSPARSLDEIDLSALRDPAGIFELVELVGNGTYGQVYKGRHVKTGQLAAIKVMDVTGDEEEEIKQEINMLKKYSHHRNIATYYGAFIKKNPPGMDDQLWLVMEFCGAGSVTDLIKNTKGNTLKEEWIAYICREILRGLSHLHQHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQLDRTVGRRNTFIGTPYWMAPEVIACDENPDATYDFKSDLWSLGITAIEMAEGAPPLCDMHPMRALFLIPRNPAPRLKSKKWSKKFQSFIESCLVKNHSQRPATEQLMKHPFIRDQPNERQVRIQLKDHIDRTKKKRGEKDETEYEYSGSEEEEEENDSGEPSSILNLPGESTLRRDFLRLQLANKERSEALRRQQLEQQQRENEEHKRQLLAERQKRIEEQKEQRRRLEEQQRREKELRKQQEREQRRHYEEQMRREEERRRAEHEQEYIRRQLEEEQRQLEILQQQLLHEQALLLEYKRKQLEEQRQAERLQRQLKQERDYL.... The pIC50 is 7.6. (2) The drug is O=C1C(Cl)=C(N2C(=O)CCC2=O)C(=O)c2ccccc21. The target protein (Q27352) has sequence MREAICIHIGQAGCQVGNACWELFCLEHGIQPDGAMPSDKTIGVEDDAFNTFFSETGAGKHVPRAVFLDLEPTVVDEIRTGTYRQLFHPEQLISGKEDAANNYARGHYTIGKEIVDLCLDRIRKLADNCTGLQGFLVYHAVGGGTGSGLGALLLERLSVDYGKKSKLGYTVYPSPQVSTAVVEPYNSVLSTHSLLEHTDVAAMLDNEAIYDLTRANLDIERPTYTNLNRLIGQVVSALTASLRFDGALNVDLTEFQTNLVPYPRIHFVLTTYAPVISAEKAYHEQLSVSEISNAVFEPASMMTKCDPRHGKYMACCLMYRGDVVPKDVNAAVATIKTKRTIQFVDWSPTGFKCGINYQPPTVVPGGDLAKVQRAVCMIANSTAIAEVFARIDHKFDLMYSKRAFVHWYVGEGMEEGEFSEAREDLAALEKDYEEVGAESADMEGEEDVEEY. The pIC50 is 5.2. (3) The compound is CSCC[C@H](NC(=O)[C@H](CCCCNC(=O)COCC(=O)Nc1ccc(CCC(=O)N2CCC2=O)cc1)NC(=O)[C@H](CCCCNC(C)=O)NC(=O)[C@@H]1CSSC[C@H](NC(=O)[C@@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](Cc2ccccc2)NC(C)=O)C(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](Cc2c[nH]c3ccccc23)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N2CCC[C@H]2C(=O)N[C@@H](Cc2cnc[nH]2)C(=O)N1)C(N)=O. 
The target protein (P49764) has sequence MLVMKLFTCFLQVLAGLAVHSQGALSAGNNSTEVEVVPFNEVWGRSYCRPMEKLVYILDEYPDEVSHIFSPSCVLLSRCSGCCGDEGLHCVPIKTANITMQILKIPPNRDPHFYVEMTFSQDVLCECRPILETTKAERRKTKGKRKRSRNSQTEEPHP. The pIC50 is 8.4. (4) The compound is CC(C)=CCC/C(C)=C/CC/C(C)=C/CNC(=O)C(CP(=O)(O)O)C(=O)O. The target protein (P17256) has sequence MLSEVLLVSAPGKVILHGEHAVVHGKVALAVALNLRTFLVLRPQSNGKVSLNLPNVGIKQVWDVATLQLLDTGFLEQGDVPAPTLEQLEKLKKVAGLPRDCVGNEGLSLLAFLYLYLAICRKQRTLPSLDIMVWSELPPGAGLGSSAAYSVCVAAALLTACEEVTNPLKDRGSIGSWPEEDLKSINKWAYEGERVIHGNPSGVDNSVSTWGGALRYQQGKMSSLKRLPALQILLTNTKVPRSTKALVAGVRSRLIKFPEIMAPLLTSIDAISLECERVLGEMAAAPVPEQYLVLEELMDMNQHHLNALGVGHASLDQLCQVTAAHGLHSKLTGAGGGGCGITLLKPGLERAKVEAAKQALTGCGFDCWETSIGAPGVSMHSATSIEDPVRQALGL. The pIC50 is 3.7. (5) The pIC50 is 6.7. The target protein (Q2FYS5) has sequence MNKQNNYSDDSIQVLEGLEAVRKRPGMYIGSTDKRGLHHLVYEIVDNSVDEVLNGYGNEIDVTINKDGSISIEDNGRGMPTGIHKSGKPTVEVIFTVLHAGGKFGQGGYKTSGGLHGVGASVVNALSEWLEVEIHRDGNIYHQSFKNGGSPSSGLVKKGKTKKTGTKVTFKPDDTIFKASTSFNFDVLSERLQESAFLLKNLKITLNDLRSGKERQEHYHYEEGIKEFVSYVNEGKEVLHDVATFSGEANGIEVDVAFQYNDQYSESILSFVNNVRTKDGGTHEVGFKTAMTRVFNDYARRINELKTKDKNLDGNDIREGLTAVVSVRIPEELLQFEGQTKSKLGTSEARSAVDSVVADKLPFYLEEKGQLSKSLVKKAIKAQQAREAARKAREDARSGKKNKRKDTLLSGKLTPAQSKNTEKNELYLVEGDSAGGSAKLGRDRKFQAILPLRGKVINTEKARLEDIFKNEEINTIIHTIGAGVGTDFKIEDSNYNRVII.... The drug is CCCn1c(=O)ccc2c(-c3cnc(-c4cccnc4)s3)n[nH]c21. (6) The target protein (Q9Y6I3) has sequence MSTSSLRRQMKNIVHNYSEAEIKVREATSNDPWGPSSSLMSEIADLTYNVVAFSEIMSMIWKRLNDHGKNWRHVYKAMTLMEYLIKTGSERVSQQCKENMYAVQTLKDFQYVDRDGKDQGVNVREKAKQLVALLRDEDRLREERAHALKTKEKLAQTATASSAAVGSGPPPEAEQAWPQSSGEEELQLQLALAMSKEEADQPPSCGPEDDAQLQLALSLSREEHDKEERIRRGDDLRLQMAIEESKRETGGKEESSLMDLADVFTAPAPAPTTDPWGGPAPMAAAVPTAAPTSDPWGGPPVPPAADPWGGPAPTPASGDPWRPAAPAGPSVDPWGGTPAPAAGEGPTPDPWGSSDGGVPVSGPSASDPWTPAPAFSDPWGGSPAKPSTNGTTAAGGFDTEPDEFSDFDRLRTALPTSGSSAGELELLAGEVPARSPGAFDMSGVRGSLAEAVGSPPPAATPTPTPPTRKTPESFLGPNAALVDLDSLVSRPGPTPPGAKA.... The drug is O=C(O)c1cc(P(=O)(O)O)n[nH]1. The pIC50 is 4.3. 
(7) The small molecule is Cc1nn(C)c(C)c1N[S+](=O)([O-])c1ccc(N2CCc3c(N)cccc3C2)nc1. The target protein (Q9UVX3) has sequence MSDSKDRKGKAPEGQSSEKKDGAVNITPQMAESLLENNPALRNETAGMDKDKAAEAMRKMNIAELLTGLSVSGKNQKDMASYKFWQTQPVPRFDETSTDTGGPIKIIDPEKVSKEPDALLEGFEWATLDLTNETELQELWDLLTYHYVEDDNAMFRFRYSQSFLHWALMSPGWKKEWHVGVRATKSRKLVASICGVPTEINVRNQKLKVVEINFLCIHKKLRSKRLTPVLIKEITRRCYLNGIYQAIYTAGVVLPTPVSSCRYYHRPLDWLKLYEVGFSPLPAGSTKARQITKNHLPSTTSTPGLRPMEPKDIDTVHDLLQRYLSRFALNQAFTREEVDHWLVHKPETVKEQVVWAYVVEDPETHKITDFFSFYNLESTVIQNPKHDNVRAAYLYYYATETAFTNNMKALKERLLMLMNDALILAKKAHFDVFNALTLHDNPLFLEQLKFGAGDGQLHFYLYNYRTAPVPGGVNEKNLPDEKRMGGVGIVML. The pIC50 is 4.8.
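
Note on the target scale: the task statement defines pIC50 = -log10(IC50 in M). The sketch below shows that conversion in both directions. It assumes raw IC50 values are supplied in nanomolar, which is not stated in the records above, and the function names are hypothetical, chosen for illustration only.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50 = -log10(IC50 in M)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: pIC50 -> IC50 in nanomolar."""
    return 10.0 ** (-pic50) * 1e9  # M -> nM


if __name__ == "__main__":
    # Labels taken from records (1) to (7) above.
    for label in (7.6, 5.2, 8.4, 3.7, 6.7, 4.3, 4.8):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50. Under this conversion the label 7.6 in record (1) corresponds to an IC50 of roughly 25 nM.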