Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCS(=O)(=O)Nc1ccc2c(c1)C(C(=Nc1ccc(CN3CCCCC3)cc1)c1ccccc1)C(=O)N2. The target protein (Q6DE08) has sequence MSYKENLNPSSYTSKFTTPSSATAAQRVLRKEPYVSTFTTPSDNLLAQRTQLSRITPSASSSVPGRVAVSTEMPSQNTALAEMPKRKFTIDDFDIGRPLGKGKFGNVYLAREKQNKFIMALKVLFKSQLEKEGVEHQLRREIEIQSHLRHPNILRMYNYFHDRKRIYLMLEFAPRGELYKELQKHGRFDEQRSATFMEELADALHYCHERKVIHRDIKPENLLMGYKGELKIADFGWSVHAPSLRRRTMCGTLDYLPPEMIEGKTHDEKVDLWCAGVLCYEFLVGMPPFDSPSHTETHRRIVNVDLKFPPFLSDGSKDLISKLLRYHPPQRLPLKGVMEHPWVKANSRRVLPPVYQSTQSK. The pIC50 is 8.5. (2) The small molecule is CNC(=O)c1ccc2c(c1)nc(Nc1ccc(OC)cc1)n2Cc1ccccc1C(F)(F)F. The target protein (Q12840) has sequence MAETNNECSIKVLCRFRPLNQAEILRGDKFIPIFQGDDSVVIGGKPYVFDRVFPPNTTQEQVYHACAMQIVKDVLAGYNGTIFAYGQTSSGKTHTMEGKLHDPQLMGIIPRIARDIFNHIYSMDENLEFHIKVSYFEIYLDKIRDLLDVTKTNLSVHEDKNRVPFVKGCTERFVSSPEEILDVIDEGKSNRHVAVTNMNEHSSRSHSIFLINIKQENMETEQKLSGKLYLVDLAGSEKVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGTKSYVPYRDSKMTRILQDSLGGNCRTTMFICCSPSSYNDAETKSTLMFGQRAKTIKNTASVNLELTAEQWKKKYEKEKEKTKAQKETIAKLEAELSRWRNGENVPETERLAGEEAALGAELCEETPVNDNSSIVVRIAPEERQKYEEEIRRLYKQLDDKDDEINQQSQLIEKLKQQMLDQEELLVSTRGDNEKVQRELSHLQSENDAAKDEVKEVLQALEELAVNYDQ.... The pIC50 is 4.3. (3) The drug is CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)N(C)C(=O)[C@H](Cc1ccccc1)NC(=O)OC(C)(C)C)C(C)C)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)OC. 
The target protein (P06281) has sequence MDRRRMPLWALLLLWSPCTFSLPTRTATFERIPLKKMPSVREILEERGVDMTRLSAEWGVFTKRPSLTNLTSPVVLTNYLNTQYYGEIGIGTPPQTFKVIFDTGSANLWVPSTKCSRLYLACGIHSLYESSDSSSYMENGSDFTIHYGSGRVKGFLSQDSVTVGGITVTQTFGEVTELPLIPFMLAKFDGVLGMGFPAQAVGGVTPVFDHILSQGVLKEEVFSVYYNRGSHLLGGEVVLGGSDPQHYQGNFHYVSISKTDSWQITMKGVSVGSSTLLCEEGCAVVVDTGSSFISAPTSSLKLIMQALGAKEKRIEEYVVNCSQVPTLPDISFDLGGRAYTLSSTDYVLQYPNRRDKLCTLALHAMDIPPPTGPVWVLGATFIRKFYTEFDRHNNRIGFALAR. The pIC50 is 6.0. (4) The small molecule is CC(C)(O)c1cc(-c2nc3ccc(F)cc3n2C2CC2)cnn1. The target protein (P15538) has sequence MALRAKAEVCMAVPWLSLQRAQALGTRAARVPRTVLPFEAMPRRPGNRWLRLLQIWREQGYEDLHLEVHQTFQELGPIFRYDLGGAGMVCVMLPEDVEKLQQVDSLHPHRMSLEPWVAYRQHRGHKCGVFLLNGPEWRFNRLRLNPEVLSPNAVQRFLPMVDAVARDFSQALKKKVLQNARGSLTLDVQPSIFHYTIEASNLALFGERLGLVGHSPSSASLNFLHALEVMFKSTVQLMFMPRSLSRWTSPKVWKEHFEAWDCIFQYGDNCIQKIYQELAFSRPQQYTSIVAELLLNAELSPDAIKANSMELTAGSVDTTVFPLLMTLFELARNPNVQQALRQESLAAAASISEHPQKATTELPLLRAALKETLRLYPVGLFLERVASSDLVLQNYHIPAGTLVRVFLYSLGRNPALFPRPERYNPQRWLDIRGSGRNFYHVPFGFGMRQCLGRRLAEAEMLLLLHHVLKHLQVETLTQEDIKMVYSFILRPSMFPLLTFR.... The pIC50 is 6.1. (5) The pIC50 is 6.0. The target protein (P67870) has sequence MSSSEEVSWISWFCGLRGNEFFCEVDEDYIQDKFNLTGLNEQVPHYRQALDMILDLEPDEELEDNPNQSDLIEQAAEMLYGLIHARYILTNRGIAQMLEKYQQGDFGYCPRVYCENQPMLPIGLSDIPGEAMVKLYCPKCMDVYTPKSSRHHHTDGAYFGTGFPHMLFMVHPEYRPKRPANQFVPRLYGFKIHPMAYQLQLQAASNFKSPVKTIR. The drug is CC(Nc1nc(NC2CC2)n2ncc(/C=C3\NC(=O)NC3=O)c2n1)c1ccc(Br)cc1. (6) The drug is CC1(C)C=C(c2ccc(-c3nc4c(C(N)=O)cccc4[nH]3)cc2)C(C)(C)N1O. 
The target protein (P27008) has sequence MAEATERLYRVEYAKSGRASCKKCSESIPKDSLRMAIMVQSPMFDGKVPHWYHFSCFWKVGHSIRQPDTEVDGFSELRWDDQQKVKKTAEAGGVAGKGQHGGGGKAEKTLGDFAAEYAKSNRSTCKGCMEKIEKGQMRLSKKMLDPEKPQLGMIDRWYHPTCFVKNRDELGFRPEYSASQLKGFSLLSAEDKEALKKQLPAVKSEGKRKCDEVDGIDEVAKKKSKKGKDKESSKLEKALKAQNELVWNIKDELKKACSTNDLKELLIFNQQQVPSGESAILDRVADGMAFGALLPCKECSGQLVFKSDAYYCTGDVTAWTKCMVKTQNPSRKEWVTPKEFREISYLKKLKIKKQDRLFPPESSAPAPPAPPVSITSAPTAVNSSAPADKPLSNMKILTLGKLSQNKDEAKAMIEKLGGKLTGSANKASLCISTKKEVEKMSKKMEEVKAANVRVVCEDFLQDVSASAKSLQELLSAHSLSSWGAEVKVEPGEVVVPKGKS.... The pIC50 is 7.1. (7) The small molecule is Cc1cc(C)cc(CNc2nc(-c3cccc(C)n3)c(-c3ccc4ncnn4c3)s2)c1. The target protein sequence is MEAAVAAPRPRLLLLVLAAAAAAAAALLPGATALQCFCHLCTKDNFTCVTDGLCFVSVTETTDKVIHNSMCIAEIDLIPRDRPFVCAPSSKTGSVTTTYCCNQDHCNKIELPTTVKSSPGLGPVELAAVIAGPVCFVCISLMLMVYICHNRTVIHHRVPNEEDPSLDRPFISEGTTLKDLIYDMTTSGSGSGLPLLVQRTIARTIVLQESIGKGRFGEVWRGKWRGEEVAVKIFSSREERSWFREAEIYQTVMLRHENILGFIAADNKDNGTWTQLWLVSDYHEHGSLFDYLNRYTVTVEGMIKLALSTASGLAHLHMEIVGTQGKPAIAHRDLKSKNILVKKNGTCCIADLGLAVRHDSATDTIDIAPNHRVGTKRYMAPEVLDDSINMKHFESFKRADIYAMGLVFWEIARRCSIGGIHEDYQLPYYDLVPSDPSVEEMRKVVCEQKLRPNIPNRWQSCEALRVMAKIMRECWYANGAARLTALRIKKTLSQLSQQEG.... The pIC50 is 7.7. (8) The pIC50 is 7.1. The target protein (Q28039) has sequence MAAAQGPVAPSSLEQNGAVPSEATKKDQNLKRGNWGNQIEFVLTSVGYAVGLGNVWRFPYLCYRNGGGAFMFPYFIMLIFCGIPLFFMELSFGQFASQGCLGVWRISPMFKGVGYGMMVVSTYIGIYYNVVICIAFYYFFSSMTPVLPWTYCNNPWNTPDCMSVLDNPNITNGSQPPALPGNVSQALNQTLKRTSPSEEYWRLYVLKLSDDIGNFGEVRLPLLGCLGVSWVVVFLCLIRGVKSSGKVVYFTATFPYVVLTILFIRGVTLEGAFTGIMYYLTPQWDKILEAKVWGDAASQIFYSLGCAWGGLVTMASYNKFHNNCYRDSVIISITNCATSVYAGFVIFSILGFMANHLGVDVSRVADHGPGLAFVAYPEALTLLPISPLWSLLFFFMLILLGLGTQFCLLETLVTAIVDEVGNEWILQKKTYVTLGVAVAGFLLGIPLTSQAGIYWLLLMDNYAASFSLVIISCIMCVSIMYIYGHQNYFQDIQMMLGFPP.... The small molecule is O=C(N[C@@H]1COCC[C@H]1N1CCCC1)c1ccc(C(F)(F)F)cc1C1CC1. (9) The small molecule is O=C(O)c1ccccc1-c1ccccc1C(=O)Nc1ccc2c(c1)Cc1cc(F)ccc1-2. 
The target protein sequence is MRVLVCGGAGYIGSHFVRALLRDTNHSVVIVDSLVGTHGKSDHVETRENVARKLQQSDGPKPPWADRYAALEVGDVRNEDFLNGVFTRHGPIDAVVHMCAFLAVGESVRDPLKYYDNNVVGILRLLQAMLLHKCDKIIFSSSAAIFGNPTMGSVSTNAEPIDINAKKSPESPYGESKLIAERMIRDCAEAYGIKGICLRYFNACGAHEDGDIGEHYQGSTHLIPIILGRVMSDIAPDQRLTIHEDASTDKRMPIFGTDYPTPDGTCVRDYVHVCDLASAHILALDYVEKLGPNDKSKYFSVFNLGTSRGYSVREVIEVARKTTGHPIPVRECGRREGDPAYLVAASDKAREVLGWKPKYDTLEAIMETSWKFQRTHPNGYASQENGTPGGRTTKL. The pIC50 is 5.4. (10) The compound is CC(C)(C)c1ccc(NC(=O)CSc2nc(C3CCCCC3)c(C#N)c(=O)[nH]2)cc1. The target protein (P0C6X7) has sequence MESLVLGVNEKTHVQLSLPVLQVRDVLVRGFGDSVEEALSEAREHLKNGTCGLVELEKGVLPQLEQPYVFIKRSDALSTNHGHKVVELVAEMDGIQYGRSGITLGVLVPHVGETPIAYRNVLLRKNGNKGAGGHSYGIDLKSYDLGDELGTDPIEDYEQNWNTKHGSGALRELTRELNGGAVTRYVDNNFCGPDGYPLDCIKDFLARAGKSMCTLSEQLDYIESKRGVYCCRDHEHEIAWFTERSDKSYEHQTPFEIKSAKKFDTFKGECPKFVFPLNSKVKVIQPRVEKKKTEGFMGRIRSVYPVASPQECNNMHLSTLMKCNHCDEVSWQTCDFLKATCEHCGTENLVIEGPTTCGYLPTNAVVKMPCPACQDPEIGPEHSVADYHNHSNIETRLRKGGRTRCFGGCVFAYVGCYNKRAYWVPRASADIGSGHTGITGDNVETLNEDLLEILSRERVNINIVGDFHLNEEVAIILASFSASTSAFIDTIKSLDYKSFK.... The pIC50 is 3.5.
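Note on the label scale: the task definition above gives pIC50 = -log10(IC50 in M). A minimal sketch of that conversion follows, assuming the IC50 value is already in molar units. The function names are hypothetical and not part of BindingDB or the bindingdb_ic50 dataset, and the nanomolar helper is an added convenience rather than something the record specifies.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # Example (1) above reports pIC50 = 8.5, i.e. an IC50 of roughly 3.16 nM.
    print(round(pic50_to_ic50_nm(8.5), 2))
    # A 1 micromolar IC50 corresponds to pIC50 of about 6.
    print(round(ic50_to_pic50(1e-6), 3))
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50, so higher values mean more potent binding.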