Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is Cc1ccc(C2c3c(nc4c(c3N)CCCC4)Oc3c2c(=O)oc2ccccc32)cc1. The target protein (O42275) has sequence MKILDALLFPVIFIMFFIHLSIAQTDPELTIMTRLGQVQGTRLPVPDRSHVIAFLGIPFAEPPLGKMRFKPPEPKKPWNDVFDARDYPSACYQYVDTSYPGFSGTEMWNPNRMMSEDCLYLNVWVPATPRPHNLTVMVWIYGGGFYSGSSSLDVYDGRYLAHSEKVVVVSMNYRVSAFGFLALNGSAEAPGNVGLLDQRLALQWVQDNIHFFGGNPKQVTIFGESAGAASVGMHLLSPDSRPKFTRAILQSGVPNGPWRTVSFDEARRRAIKLGRLVGCPDGNDTDLIDCLRSKQPQDLIDQEWLVLPFSGLFRFSFVPVIDGVVFPDTPEAMLNSGNFKDTQILLGVNQNEGSYFLIYGAPGFSKDNESLITREDFLQGVKMSVPHANEIGLEAVILQYTDWMDEDNPIKNREAMDDIVGDHNVVCPLQHFAKMYAQYSILQGQTGTASQGNLGWGNSGSASNSGNSQVSVYLYMFDHRASNLVWPEWMGVIHGYEIEF.... The pIC50 is 7.8. (2) The drug is COc1cc(/C=C/C(=O)c2ccc(NC(=O)CN3CCN(c4cc5c(cc4F)c(=O)c(C(=O)NO)cn5C4CC4)CC3)cc2)cc(OC)c1OC. The target protein (P14916) has sequence MKLTPKELDKLMLHYAGELAKKRKEKGIKLNYVEAVALISAHIMEEARAGKKTAAELMQEGRTLLKPDDVMDGVASMIHEVGIEAMFPDGTKLVTVHTPIEANGKLVPGELFLKNEDITINEGKKAVSVKVKNVGDRPVQIGSHFHFFEVNRCLDFDREKTFGKRLDIASGTAVRFEPGEEKSVELIDIGGNRRIFGFNALVDRQADNESKKIALHRAKERGFHGAKSDDNYVKTIKE. The pIC50 is 3.0. (3) The compound is COc1ccc(-c2nc(C(=O)Nc3ccccc3N3CCNCC3)cs2)cc1C. The target protein (O14757) has sequence MAVPFVEDWDLVQTLGEGAYGEVQLAVNRVTEEAVAVKIVDMKRAVDCPENIKKEICINKMLNHENVVKFYGHRREGNIQYLFLEYCSGGELFDRIEPDIGMPEPDAQRFFHQLMAGVVYLHGIGITHRDIKPENLLLDERDNLKISDFGLATVFRYNNRERLLNKMCGTLPYVAPELLKRREFHAEPVDVWSCGIVLTAMLAGELPWDQPSDSCQEYSDWKEKKTYLNPWKKIDSAPLALLHKILVENPSARITIPDIKKDRWYNKPLKKGAKRPRVTSGGVSESPSGFSKHIQSNLDFSPVNSASSEENVKYSSSQPEPRTGLSLWDTSPSYIDKLVQGISFSQPTCPDHMLLNSQLLGTPGSSQNPWQRLVKRMTRFFTKLDADKSYQCLKETCEKLGYQWKKSCMNQVTISTTDRRNNKLIFKVNLLEMDDKILVDFRLSKGDGLEFKRHFLKIKGKLIDIVSSQKIWLPAT. The pIC50 is 6.2. 
(4) The drug is C[C@]1(Cn2ccnn2)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. The target protein (P62593) has sequence MSIQHFRVALIPFFAAFCLPVFAHPETLVKVKDAEDQLGARVGYIELDLNSGKILESFRPEERFPMMSTFKVLLCGAVLSRVDAGQEQLGRRIHYSQNDLVEYSPVTEKHLTDGMTVRELCSAAITMSDNTAANLLLTTIGGPKELTAFLHNMGDHVTRLDRWEPELNEAIPNDERDTTMPAAMATTLRKLLTGELLTLASRQQLIDWMEADKVAGPLLRSALPAGWFIADKSGAGERGSRGIIAALGPDGKPSRIVVIYTTGSQATMDERNRQIAEIGASLIKHW. The pIC50 is 6.9. (5) The small molecule is CCNC(=O)Nc1cn2c(-c3nccc(C4CC4)n3)cc(-c3cccnc3)cc2n1. The target protein (P0A4L9) has sequence MTEEIKNLQAQDYDASQIQVLEGLEAVRMRPGMYIGSTSKEGLHHLVWEIVDNSIDEALAGFASHIQVFIEPDDSITVVDDGRGIPVDIQEKTGRPAVETVFTVLHAGGKFGGGGYKVSGGLHGVGSSVVNALSTQLDVHVHKNGKIHYQEYRRGHVVADLEIVGDTDKTGTTVHFTPDPKIFTETTIFDFDKLNKRIQELAFLNRGLQISITDKRQGLEQTKHYHYEGGIASYVEYINENKDVIFDTPIYTDGEMDDITVEVAMQYTTGYHENVMSFANNIHTHEGGTHEQGFRTALTRVINDYARKNKLLKDNEDNLTGEDVREGLTAVISVKHPNPQFEGQTKTKLGNSEVVKITNRLFSEAFSDFLMENPQIAKRIVEKGILAAKARVAAKRAREVTRKKSGLEISNLPGKLADCSSNNPAETELFIVEGDSAGGSAKSGRNREFQAILPIRGKILNVEKASMDKILANEEIRSLFTAMGTGFGAEFDVSKARYQK.... The pIC50 is 6.5. (6) The compound is COc1ccc(Oc2cccc(CN3CCN(C(=O)Oc4ccc([N+](=O)[O-])cc4)CC3)c2)cc1. The target protein sequence is MDLDVVNMFVIAGGTLAIPILAFVASFLLWPSALIRIYYWYWRRTLGMQVRYAHHEDYQFCYSFRGRPGHKPSILMLHGFSAHKDMWLSVVKFLPKNLHLVCVDMPGHEGTTRSSLDDLSIVGQVKRIHQFVECLKLNKKPFHLIGTSMGGHVAGVYAAYYPSDVCSLSLVCPAGLQYSTDNPFVQRLKELEESAAIQKIPLIPSTPEEMSEMLQLCSYVRFKVPQQILQGLVDVRIPHNSFYRKLFLEIVNEKSRYSLHENMDKIKVPTQIIWGKQDQVLDVSGADILAKSISNSQVEVLENCGHSVVMERPRKTAKLIVDFLASVHNTDNKKLN. The pIC50 is 6.3. (7) The drug is CCOC(=O)c1ccc(OC(=O)P(=O)([O-])[O-])cc1. 
The target protein (P04293) has sequence MFSGGGGPLSPGGKSAARAASGFFAPAGPRGASRGPPPCLRQNFYNPYLAPVGTQQKPTGPTQRHTYYSECDEFRFIAPRVLDEDAPPEKRAGVHDGHLKRAPKVYCGGDERDVLRVGSGGFWPRRSRLWGGVDHAPAGFNPTVTVFHVYDILENVEHAYGMRAAQFHARFMDAITPTGTVITLLGLTPEGHRVAVHVYGTRQYFYMNKEEVDRHLQCRAPRDLCERMAAALRESPGASFRGISADHFEAEVVERTDVYYYETRPALFYRVYVRSGRVLSYLCDNFCPAIKKYEGGVDATTRFILDNPGFVTFGWYRLKPGRNNTLAQPAAPMAFGTSSDVEFNCTADNLAIEGGMSDLPAYKLMCFDIECKAGGEDELAFPVAGHPEDLVIQISCLLYDLSTTALEHVLLFSLGSCDLPESHLNELAARGLPTPVVLEFDSEFEMLLAFMTLVKQYGPEFVTGYNIINFDWPFLLAKLTDIYKVPLDGYGRMNGRGVFR.... The pIC50 is 3.3. (8) The drug is O=C(CC1CC(c2ccc(O)cc2)=NO1)NC1CCCCC1. The target protein (P34884) has sequence MPMFIVNTNVPRASVPEGFLSELTQQLAQATGKPAQYIAVHVVPDQLMTFSGTNDPCALCSLHSIGKIGGAQNRNYSKLLCGLLSDRLHISPDRVYINYYDMNAANVGWNGSTFA. The pIC50 is 4.6. (9) The compound is O=C(O)Cc1cccc(Cc2nc3c(F)c(F)cc(F)c3s2)c1. The target protein (P80276) has sequence MASHLVLYTGAKMPILGLGTWKSPPGKVTEAVKVAIDLGYRHIDCAHVYQNENEVGLGLQEKLQGQVVKREDLFIVSKLWCTDHEKNLVKGACQTTLRDLKLDYLDLYLIHWPTGFKPGKDPFPLDGDGNVVPDESDFVETWEAMEELVDEGLVKAIGVSNFNHLQVEKILNKPGLKYKPAVNQIEVHPYLTQEKLIEYCKSKGIVVTAYSPLGSPDRPWAKPEDPSLLEDPRIKAIAAKYNKTTAQVLIRFPMQRNLIVIPKSVTPERIAENFQVFDFELSPEDMNTLLSYNRNWRVCALMSCASHKDYPFHEEY. The pIC50 is 8.2. (10) The drug is Cn1c(N(Cc2ccc(C(=O)Nc3nnn[nH]3)cc2)[C@H]2CC[C@H](C(C)(C)C)CC2)nc2cc(F)ccc21. The target protein (P09681) has sequence MVATKTFALLLLSLFLAVGLGEKKEGHFSALPSLPVGSHAKVSSPQPRGPRYAEGTFISDYSIAMDKIHQQDFVNWLLAQKGKKNDWKHNITQREARALELASQANRKEEEAVEPQSSPAKNPSDEDLLRDLLIQELLACLLDQTNLCRLRSR. The pIC50 is 6.0.
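Note on units: the definition above, pIC50 = -log10(IC50 in M), can be inverted to read a pIC50 label back as a concentration. The following is a minimal sketch of that arithmetic only. The function names are hypothetical and are not part of BindingDB or of the bindingdb_ic50 dataset, and reporting the result in nM is an assumption made for readability.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Apply the definition pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition (IC50 in M = 10 ** -pIC50), then convert M to nM."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 labels taken from examples (1), (2) and (9) above.
    for pic50 in (7.8, 3.0, 8.2):
        ic50_nm = pic50_to_ic50_nanomolar(pic50)
        print(f"pIC50 {pic50} corresponds to an IC50 of about {ic50_nm:.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, so the compound in example (9) is far more potent against its target than the compound in example (2) is against its own.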