Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB using IC50 measurements). (1) The small molecule is CC(C)NCc1ccc(C[C@@H]2NC(=O)[C@H](Cc3c[nH]c4ccccc34)NC(=O)[C@H]3CCC(=O)NCCCC(=O)NCCC[C@@H](NC(=O)[C@H](Cc4ccccc4)NC(=O)[C@@H]([C@@H](C)O)NC2=O)C(=O)N[C@@H](CO)C(=O)N[C@H](C(=O)O)CSSC[C@@H](NC(=O)[C@@H](N)Cc2ccc(O)cc2)C(=O)N[C@H](CCCCN)C(=O)N[C@H](Cc2ccccc2)C(=O)N3)cc1. The target protein (P35346) has sequence MEPLFPASTPSWNASSPGAASGGGDNRTLVGPAPSAGARAVLVPVLYLLVCAAGLGGNTLVIYVVLRFAKMKTVTNIYILNLAVADVLYMLGLPFLATQNAASFWPFGPVLCRLVMTLDGVNQFTSVFCLTVMSVDRYLAVVHPLSSARWRRPRVAKLASAAAWVLSLCMSLPLLVFADVQEGGTCNASWPEPVGLWGAVFIIYTAVLGFFAPLLVICLCYLLIVVKVRAAGVRVGCVRRRSERKVTRMVLVVVLVFAGCWLPFFTVNIVNLAVALPQEPASAGLYFFVVILSYANSCANPVLYGFLSDNFRQSFQKVLCLRKGSGAKDADATEPRPDRIRQQQEATPPAHRAAANGLMQTSKL. The pIC50 is 6.0. (2) The drug is NC/C(=C/F)CCc1ccc(F)cc1. The target protein (O70423) has sequence MTQKTTLVLLALAVITIFALVCVLLAGRSGDGGGLSQPLHCPSVLPSVQPRTHPSQSQPFADLSPEELTAVMSFLTKHLGPGLVDAAQARPSDNCVFSVELQLPAKAAALAHLDRGGPPPVREALAIIFFGGQPKPNVSELVVGPLPHPSYMRDVTVERHGGPLPYYRRPVLDREYQDIEEMIFHRELPQASGLLHHCCFYKHQGQNLLTMTTAPRGLQSGDRATWFGLYYNLSGAGFYPHPIGLELLIDHKALDPALWTIQKVFYQGRYYESLTQLEDQFEAGLVNVVLVPNNGTGGSWSLKSSVPPGPAPPLQFHPQGPRFSVQGSQVSSSLWAFSFGLGAFSGPRIFDIRFQGERVAYEISVQEAIALYGGNSPASMSTCYVDGSFGIGKYSTPLIRGVDCPYLATYVDWHFLLESQAPKTLRDAFCVFEQNQGLPLRRHHSDFYSHYFGGVVGTVLVVRSVSTLLNYDYIWDMVFHPNGAIEVKFHATGYISSAFF.... The pIC50 is 8.0. 
(3) The target protein (Q99788) has sequence MRMEDEDYNTSISYGDEYPDYLDSIVVLEDLSPLEARVTRIFLVVVYSIVCFLGILGNGLVIIIATFKMKKTVNMVWFLNLAVADFLFNVFLPIHITYAAMDYHWVFGTAMCKISNFLLIHNMFTSVFLLTIISSDRCISVLLPVWSQNHRSVRLAYMACMVIWVLAFFLSSPSLVFRDTANLHGKISCFNNFSLSTPGSSSWPTHSQMDPVGYSRHMVVTVTRFLCGFLVPVLIITACYLTIVCKLQRNRLAKTKKPFKIIVTIIITFFLCWCPYHTLNLLELHHTAMPGSVFSLGLPLATALAIANSCMNPILYVFMGQDFKKFKVALFSRLVNALSEDTGHSSYPSHRSFTKMSSMNERTSMNERETGML. The pIC50 is 5.0. The compound is CC1=CCC[C@H]1NC(=O)Nc1ccc(Cl)c(S(=O)(=O)[C@@]2(C)CCOC2)c1O. (4) The compound is Nc1nonc1C(=O)NCCN=Cc1cc(Cc2ccccc2)cc(Br)c1O. The target protein (P61981) has sequence MVDREQLVQKARLAEQAERYDDMAAAMKNVTELNEPLSNEERNLLSVAYKNVVGARRSSWRVISSIEQKTSADGNEKKIEMVRAYREKIEKELEAVCQDVLSLLDNYLIKNCSETQYESKVFYLKMKGDYYRYLAEVATGEKRATVVESSEKAYSEAHEISKEHMQPTHPIRLGLALNYSVFYYEIQNAPEQACHLAKTAFDDAIAELDTLNEDSYKDSTLIMQLLRDNLTLWTSDQQDDDGGEGNN. The pIC50 is 4.3. (5) The drug is CC1(C)NC(C)(C)C(c2cccc3c2oc2ccccc23)=C1c1nc2c(C(N)=O)cccc2[nH]1. The target protein (P27008) has sequence MAEATERLYRVEYAKSGRASCKKCSESIPKDSLRMAIMVQSPMFDGKVPHWYHFSCFWKVGHSIRQPDTEVDGFSELRWDDQQKVKKTAEAGGVAGKGQHGGGGKAEKTLGDFAAEYAKSNRSTCKGCMEKIEKGQMRLSKKMLDPEKPQLGMIDRWYHPTCFVKNRDELGFRPEYSASQLKGFSLLSAEDKEALKKQLPAVKSEGKRKCDEVDGIDEVAKKKSKKGKDKESSKLEKALKAQNELVWNIKDELKKACSTNDLKELLIFNQQQVPSGESAILDRVADGMAFGALLPCKECSGQLVFKSDAYYCTGDVTAWTKCMVKTQNPSRKEWVTPKEFREISYLKKLKIKKQDRLFPPESSAPAPPAPPVSITSAPTAVNSSAPADKPLSNMKILTLGKLSQNKDEAKAMIEKLGGKLTGSANKASLCISTKKEVEKMSKKMEEVKAANVRVVCEDFLQDVSASAKSLQELLSAHSLSSWGAEVKVEPGEVVVPKGKS.... The pIC50 is 5.7.
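
The task statement defines pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion, the Python below maps between a molar IC50 and pIC50. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced only for illustration and are not part of the dataset or any library.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) corresponds to a pIC50 of about 6.
    print(ic50_to_pic50(1e-6))
    # A pIC50 of 8 corresponds to an IC50 of about 10 nM.
    print(pic50_to_ic50_nm(8.0))
```

Under this definition, each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why a higher value means a more potent compound.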