This data comes from the BindingDB drug-target binding dataset, using IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 7.3. The target protein (P56068) has sequence MKIGVFDSGVGGFSVLKSLLKAQLFDEIIYYGDSARVPYGTKDPTTIKQFGLEALDFFKPHQIKLLIVACNTASALALEEMQKHSKIPVVGVIEPSILAIKRQVKDKNAPILVLGTKATIQSNAYDNALKQQGYLNVSHLATSLFVPLIEESILEGELLETCMRYYFTPLEILPEVVILGCTHFPLIAQKIEGYFMEHFALSTPPLLIHSGDAIVEYLQQNYALKKNACAFPKVEFHASGDVVWLEKQAKEWLKL. The compound is Cc1oc(S(C)(=O)=O)cc1-c1c2c(=O)n(C)c(=O)n(CC3CC3)c2nn1Cc1ccnc2ccc(Cl)cc12. (2) The drug is CCC(C)CC(C)CCCCCCCCC(=O)NC1C[C@@H](O)[C@@H](O)NC(=O)C2CN(C[C@@H]2O)C(=O)[C@H]([C@H](O)CCN)NC(=O)[C@H]([C@H](O)[C@@H](O)c2ccc(O)cc2)NC(=O)C2C[C@@H](O)CN2C(=O)[C@H]([C@@H](C)O)NC1=O. The target protein sequence is MANWQNTDPNGNYYYNGAENNEFYDQDYASQQPEQQQGGEGYYDEYGQPNYNYMNDPQQGQMPQQQPGGYENDGYYDSYYNNQMNAGVGNGLGPDQTNFSDFSSYGPPPFQNNQANYTPSQLSYSNNGMGSNGMNMSGSSTPVYGNYDPNAIAMTLPNDPYPAWTADPQSPVSIEQIEDVFIDLTNKFGFQRDSMRNIFDLFMTLLDSRTSRMSPDQALLSVHADYIGGDTANYKKWYFAAQLDMDDEVGFRNMNLGKLSRKARKAKKKNKKAMEEANPEDAAEVLNKIEGDNSLEASDFRWKTKMNMLTPIERVRQVALYMLIWGEANQVRFTSECLCFIYKCASDYLESPLCQQRTEPIPEGDYLNRVITPIYQFIRNQVYEIVDGPFMSKREKEKDHNKIIGYDDVNQLFWYPEGITKIVLEDGTKLTDIPSEERYLRLGEVAWNDVFFKTYKETRTWLHLVTNFNRIWIMHVSVYWMYVAYNSPTFYTHNYQQLVN.... The pIC50 is 8.0. (3) The small molecule is CCCC(C(=O)O)c1c(C)nc2sc3c(c2c1-c1ccc(OC)c(OC)c1)CCCC3. 
The target protein (P03367) has sequence MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGHIARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLREDLAFLQGKAREFSSEQTRANSPTISSEQTRANSPTRRELQVWGRDNNSLSEAGADRQGTVSFNF.... The pIC50 is 4.5. (4) The small molecule is O=C(c1cc([C@H]2CCCN2c2cc(F)cc(F)c2)c2oc(N3CCOCC3)cc(=O)c2c1)N1CCOCC1. The target protein (Q13574) has sequence METFFRRHFRGKVPGPGEGQQRPSSVGLPTGKARRRSPAGQASSSLAQRRRSSAQLQGCLLSCGVRAQGSSRRRSSTVPPSCNPRFIVDKVLTPQPTTVGAQLLGAPLLLTGLVGMNEEEGVQEDVVAEASSAIQPGTKTPGPPPPRGAQPLLPLPRYLRRASSHLLPADAVYDHALWGLHGYYRRLSQRRPSGQHPGPGGRRASGTTAGTMLPTRVRPLSRRRQVALRRKAAGPQAWSALLAKAITKSGLQHLAPPPPTPGAPCSESERQIRSTVDWSESATYGEHIWFETNVSGDFCYVGEQYCVARMLKSVSRRKCAACKIVVHTPCIEQLEKINFRCKPSFRESGSRNVREPTFVRHHWVHRRRQDGKCRHCGKGFQQKFTFHSKEIVAISCSWCKQAYHSKVSCFMLQQIEEPCSLGVHAAVVIPPTWILRARRPQNTLKASKKKKRASFKRKSSKKGPEEGRWRPFIIRPTPSPLMKPLLVFVNPKSGGNQGAK.... The pIC50 is 4.3.
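
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names below are hypothetical and are not part of the dataset or of any BindingDB tooling. The nanomolar helper assumes the raw IC50 is reported in nM, which is an assumption about the input and not something stated in this record.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def nanomolar_to_pic50(ic50_nm: float) -> float:
    """Hypothetical helper: assumes the input IC50 is expressed in nM."""
    return ic50_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # The labels used in the examples above, converted back to molar IC50.
    for label in (7.3, 8.0, 4.5, 4.3):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50(label):.3e} M")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, so the compound labelled 8.0 is roughly 3000 times more potent by IC50 than the one labelled 4.5.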