Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCc1nc(CC(C)(C)C)c(CN)c(-c2ccc(C)cc2)c1C(=O)O. The target protein (Q9EPB1) has sequence MGLHPCSPVDHGVPSWVLVLLLTLGLCSLQATADSVLDPDFRENYFEQYMDHFNFESFSNKTFGQRFLVSDKFWKMGEGPIFFYTGNEGDIWSLANNSGFIVELAAQQEALLVFAEHRYYGKSLPFGVQSTQRGYTQLLTVEQALADFAVLLQALRHNLGVQDAPTIAFGGSYGGMLSAYMRMKYPHLVAGALAASAPVIAVAGLGNPDQFFRDVTADFYGQSPKCAQAVRDAFQQIKDLFLQGAYDTISQNFGTCQSLSSPKDLTQLFGFARNAFTVLAMMDYPYPTNFLGPLPANPVKVGCERLLSEGQRIMGLRALAGLVYNSSGMEPCFDIYQMYQSCADPTGCGTGSNARAWDYQACTEINLTFDSNNVTDMFPEIPFSDELRQQYCLDTWGVWPRPDWLQTSFWGGDLKAASNIIFSNGDLDPWAGGGIQRNLSTSIIAVTIQGGAHHLDLRASNSEDPPSVVEVRKLEATLIREWVAAARLKQPAEAQWPGPK.... The pIC50 is 5.4. (2) The drug is O=C([O-])c1cccc(Cc2cc(Cl)ccc2OCc2ccccc2Cl)n1. The target protein (P34995) has sequence MSPCGPLNLSLAGEATTCAAPWVPNTSAVPPSGASPALPIFSMTLGAVSNLLALALLAQAAGRLRRRRSAATFLLFVASLLATDLAGHVIPGALVLRLYTAGRAPAGGACHFLGGCMVFFGLCPLLLGCGMAVERCVGVTRPLLHAARVSVARARLALAAVAAVALAVALLPLARVGRYELQYPGTWCFIGLGPPGGWRQALLAGLFASLGLVALLAALVCNTLSGLALLRARWRRRSRRPPPASGPDSRRRWGAHGPRSASASSASSIASASTFFGGSRSSGSARRARAHDVEMVGQLVGIMVVSCICWSPMLVLVALAVGGWSSTSLQRPLFLAVRLASWNQILDPWVYILLRQAVLRQLLRLLPPRAGAKGGPAGLGLTPSAWEASSLRSSRHSGLSHF. The pIC50 is 7.4. (3) The pIC50 is 4.1. The small molecule is O=C(O)c1ccccc1N=NCc1ccc(CN=Nc2ccccc2C(=O)O)cc1. The target protein sequence is MATSRAALCAVAVVCVVLAAACAPARAIYVGTPAAALFEEFKRTYRRAYGTLAEEQQRLANFERNLELMREHQARNPHARFGITKFFDLSEAEFAARYLNGAAYFAAAKQHAGQHYRKARADLSAVPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWARAGHGLVSLSEQQLVSCDDKDNGCNGGLMLQAFEWLLRHMYGIVFTEKSYPYTSGNGDVAECLNSSKLVPGARIDGYVMIPSNETVMAAWLAENGPIAIGVDASSFMSYQSGVLTSCAGDALNHGVLLVGYNTTGGVPYCVIKNSWGEDWGEKGYVRVAMGLNACLLSEYPVSAHVPQSLTPALTASGNFCEACWTVMLHRILSVLKTNGWLLGRRPSARWREDGARGGQ. 
(4) The small molecule is Cc1noc(C)c1-c1ccc2c(c1)C(c1ccsc1)(N1CCC[C@@H](N)C1)C(=O)N2. The target protein sequence is MSAESGPGTRLRNLPVMGDGLETSQMSTTQAQAQPQPANAASTNPPPPETSNPNKPKRQTNQLQYLLRVVLKTLWKHQFAWPFQQPVDAVKLNLPDYYKIIKTPMDMGTIKKRLENNYYWNAQECIQDFNTMFTNCYIYNKPGDDIVLMAEALEKLFLQKINELPTEETEIMIVQAKGRGRGRKETGTAKPGVSTVPNTTQASTPPQTQTPQPNPPPVQATPHPFPAVTPDLIVQTPVMTVVPPQPLQTPPPVPPQPQPPPAPAPQPVQSHPPIIAATPQPVKTKKGVKRKADTTTPTTIDPIHEPPSLPPEPKTTKLGQRRESSRPVKPPKKDVPDSQQHPAPEKSSKVSEQLKCCSGILKEMFAKKHAAYAWPFYKPVDVEALGLHDYCDIIKHPMDMSTIKSKLEAREYRDAQEFGADVRLMFSNCYKYNPPDHEVVAMARKLQDVFEMRFAKMPDEPEEPVVAVSSPAVPPPT. The pIC50 is 7.6. (5) The small molecule is O=c1c(O)c(-c2cc(O)c(O)c(O)c2)oc2cc(O)cc(O)c12. The target protein (P27189) has sequence MAEVKKPFHEQVAERLIEQLKAGTAPWQKPWEPGMPGSFIPLNPTTGKRYKGINAIQLMAQGHADPRWMTYKQAAAAGAQVRRGEKGTPIQYWKFSEEQTKTDEQTGKPVLDANGDPVKVTVQLERPRVFFATVFNAEQIDGLPPLERKEQTWSAVERAEHILAASGATIRHGEHDRAFYRPSTDSIHLPDKGQFPSADNYYATALHELGHWTGHPSRLDRDLAHPFGSEGYAKEELRAEIASMILGDELGIGHDPGQHAAYVGSWIKALQEDPLEIFRAAADAEKIQDFVLAFEQKQIQEQTTQQAIEPAQGATMEQQQDQVARPAIAPADELIAQTLRMYRAGAEPAEGNQSLAALTETTLGFELPADWTGRVQVQANVEVEHDGERSVVPAGDREPEFWGVYANHAWGGHQWLADFAGPDAQTNAEALADRLAVIDAYATANEYEQAAKFARIHEERVRRDPNSTDEDRVAAKEARKAAEGTAMLHDEDLQRRIADY.... The pIC50 is 3.1. (6) The compound is COC1=C(Br)[C@@H](O)[C@]2(C=C1Br)CC(C(=O)NCCCOc1c(Br)cc(CC(N=O)C(=O)NCCc3cnc(N)[nH]3)cc1Br)=NO2. The target protein (P38650) has sequence MSETGGGEDGSAGLEVSAVQNVADVSVLQKHLRKLVPLLLEDGGDAPAALEAALEEKSALEQMRKFLSDPQVHTVLVERSTLKEDVGDEGEEEKEFISYNINIDIHYGVKSNSLAFIKRAPVIDADKPVSSQLRVLTLSEDSPYETLHSFISNAVAPFFKSYIRESGKADRDGDKMAPSVEKKIAELEMGLLHLQQNIEIPEISLPIHPIITNVAKQCYERGEKPKVTDFGDKVEDPTFLNQLQSGVNRWIREIQKVTKLDRDPASGTALQEISFWLNLERALYRIQEKRESPEVLLTLDILKHGKRFHATVSFDTDTGLKQALETVNDYNPLMKDFPLNDLLSATELDKIRQALVAIFTHLRKIRNTKYPIQRALRLVEAISRDLSSQLLKVLGTRKLMHVAYEEFEKVMVACFEVFQTWDDEYEKLQVLLRDIVKRKREENLKMVWRINPAHRKLQARLDQMRKFRRQHEQLRAVIVRVLRPQVTAVAQQNQGEAPEP.... The pIC50 is 4.5. (7) The drug is N#Cc1cccc(Nc2nccc(-c3ccnc(N4CCOCC4)c3)n2)c1. 
The target protein (Q96Q15) has sequence MSRRAPGSRLSSGGGGGGTKYPRSWNDWQPRTDSASADPDNLKYSSSRDRGGSSSYGLQPSNSAVVSRQRHDDTRVHADIQNDEKGGYSVNGGSGENTYGRKSLGQELRVNNVTSPEFTSVQHGSRALATKDMRKSQERSMSYSDESRLSNLLRRITREDDRDRRLATVKQLKEFIQQPENKLVLVKQLDNILAAVHDVLNESSKLLQELRQEGACCLGLLCASLSYEAEKIFKWIFSKFSSSAKDEVKLLYLCATYKALETVGEKKAFSSVMQLVMTSLQSILENVDTPELLCKCVKCILLVARCYPHIFSTNFRDTVDILVGWHIDHTQKPSLTQQVSGWLQSLEPFWVADLAFSTTLLGQFLEDMEAYAEDLSHVASGESVDEDVPPPSVSLPKLAALLRVFSTVVRSIGERFSPIRGPPITEAYVTDVLYRVMRCVTAANQVFFSEAVLTAANECVGVLLGSLDPSMTIHCDMVITYGLDQLENCQTCGTDYIISV.... The pIC50 is 7.1. (8) The drug is O=C(O)CC1=COc2ccccc2O1. The target protein (Q8VD48) has sequence MLLWVLALLFLCAFLWNYKGQLKIADIADKYIFITGCDSGFGNLAARTFDRKGFRVIAACLTESGSEALKAKTSERLHTVLLDVTNPENVKETAQWVKSHVGEKGLWGLINNAGVLGVLAPTDWLTVDDYREPIEVNLFGLINVTLNMLPLVKKARGRVINVSSIGGRLAFGGGGYTPSKYAVEGFNDSLRRDMKAFGVHVSCIEPGLFKTGLADPIKTTEKKLAIWKHLSPDIKQQYGEGYIEKSLHRLKSSTSSVNLDLSLVVECMDHALTSLFPKTRYTAGKDAKTFWIPLSHMPAALQDFLLLKEKVELANPQAV. The pIC50 is 4.0. (9) The compound is Brc1ccc2nc(-c3cccc(Oc4ccccc4)c3)cc(-c3ccccc3)c2c1. The target protein (P20831) has sequence MAELPQSRINERNITSEMRESFLDYAMSVIVARALPDVRDGLKPVHRRILYGLNEQGMTPDKSYKKSARIVGDVMGKYHPHGDSSIYEAMVRMAQDFSYRYPLVDGQGNFGSMDGDGAAAMRYTEARMTKITLELLRDINKDTIDFIDNYDGNEREPSVLPARFPNLLANGASGIAVGMATNIPPHNLTELINGVLSLSKNPDISIAELMEDIEGPDFPTAGLILGKSGIRRAYETGRGSIQMRSRAVIEERGGGRQRIVVTEIPFQVNKARMIEKIAELVRDKKIDGITDLRDETSLRTGVRVVIDVRKDANASVILNNLYKQTPLQTSFGVNMIALVNGRPKLINLKEALVHYLEHQKTVVRRRTQYNLRKAKDRAHILEGLRIALDHIDEIISTIRESDTDKVAMESLQQRFKLSEKQAQAILDMRLRRLTGLERDKIEAEYNELLNYISELEAILADEEVLLQLVRDELTEIRDRFGDDRRTEIQLGGFEDLEDED.... The pIC50 is 6.3.
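The label definition used above, pIC50 = -log10(IC50 in M), can be sketched as a small conversion helper. This is only an illustration: the function names are hypothetical and not part of BindingDB or any dataset tooling, and it assumes raw IC50 values are given in nanomolar, which should be checked against the actual source units.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nM (assumed input unit) to pIC50."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: recover the IC50 in nM from a pIC50 value."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # Higher pIC50 means a lower IC50, i.e. a more potent compound.
    for label in (3.1, 5.4, 7.6):
        print(f"pIC50 {label} corresponds to IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit between two examples corresponds to a tenfold difference in IC50.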