Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCCCCCCCCCCC(=O)Oc1cccc2c1C(=O)C=CC2=O. The target protein (Q9Y253) has sequence MATGQDRVVALVDMDCFFVQVEQRQNPHLRNKPCAVVQYKSWKGGGIIAVSYEARAFGVTRSMWADDAKKLCPDLLLAQVRESRGKANLTKYREASVEVMEIMSRFAVIERASIDEAYVDLTSAVQERLQKLQGQPISADLLPSTYIEGLPQGPTTAEETVQKEGMRKQGLFQWLDSLQIDNLTSPDLQLTVGAVIVEEMRAAIERETGFQCSAGISHNKVLAKLACGLNKPNRQTLVSHGSVPQLFSQMPIRKIRSLGGKLGASVIEILGIEYMGELTQFTESQLQSHFGEKNGSWLYAMCRGIEHDPVKPRQLPKTIGCSKNFPGKTALATREQVQWWLLQLAQELEERLTKDRNDNDRVATQLVVSIRVQGDKRLSSLRRCCALTRYDAHKMSHDAFTVIKNCNTSGIQTEWSPPLTMLFLCATKFSASAPSSSTDITSFLSSDPSSLPKVPVTSSEAKTQGSGPAVTATKKATTSLESFFQKAAERQKVKEASLSS.... The pIC50 is 5.2. (2) The compound is COc1ccc2ncc(=O)n(CCN3CCC(NCc4ccc5c(n4)NC(=O)CO5)CC3)c2c1. The target protein (Q9Y3Q4) has sequence MDKLPPSMRKRLYSLPQQVGAKAWIMDEEEDAEEEGAGGRQDPSRRSIRLRPLPSPSPSAAAGGTESRSSALGAADSEGPARGAGKSSTNGDCRRFRGSLASLGSRGGGSGGTGSGSSHGHLHDSAEERRLIAEGDASPGEDRTPPGLAAEPERPGASAQPAASPPPPQQPPQPASASCEQPSVDTAIKVEGGAAAGDQILPEAEVRLGQAGFMQRQFGAMLQPGVNKFSLRMFGSQKAVEREQERVKSAGFWIIHPYSDFRFYWDLTMLLLMVGNLIIIPVGITFFKDENTTPWIVFNVVSDTFFLIDLVLNFRTGIVVEDNTEIILDPQRIKMKYLKSWFMVDFISSIPVDYIFLIVETRIDSEVYKTARALRIVRFTKILSLLRLLRLSRLIRYIHQWEEIFHMTYDLASAVVRIVNLIGMMLLLCHWDGCLQFLVPMLQDFPDDCWVSINNMVNNSWGKQYSYALFKAMSHMLCIGYGRQAPVGMSDVWLTMLSMI.... The pIC50 is 4.0. (3) The drug is Nc1ncnc2c1ncn2[C@@H]1O[C@H](COS(=O)(=O)NC(=O)[C@@H](N)CS)[C@@H](O)[C@H]1O. 
The target protein sequence is MISGAPSQDSLLPDNRHAADYQQLRERLIQELNLTPQQLHEESNLIQAGLDSIRLMRWLHWFRKNGYRLTLRELYAAPTLAAWNQLMLSRSPENAEEETPPDESSWPNMTESTPFPLTPVQHAYLTGRMPGQTLGGVGCHLYQEFEGHCLTASQLEQAITTLLQRHPMLHIAFRPDGQQVWLPQPYWNGVTVHDLRHNDAESRQAYLDALRQRLSHRLLRVEIGETFDFQLTLLPDNRHRLHVNIDLLIMDASSFTLFFDELNALLAGESLPAIDTRYDFRSYLLHQQKINQPLRDDARAYWLAKASTLPPAPVLPLACEPATLREVRNTRRRMIVPATRWHAFSNRAGEYGVTPTMALATCFSAVLARWGGLTRLLLNITLFDRQPLHPAVGAMLADFTNILLLDTACDGDTVSNLARKNQLTFTEDWEHRHWSGVELLRELKRQQRYPHGAPVVFTSNLGRSLYSSRAESPLGEPEWGISQTPQVWIDHLAFEHHGEV.... The pIC50 is 7.4. (4) The drug is Cc1cccc(CCN2C3=NC[C@H](Cc4ccc(O)cc4)N3C[C@@H]2CCCNC(=O)C2CCC2)c1. The target protein (Q04637) has sequence MNKAPQSTGPPPAPSPGLPQPAFPPGQTAPVVFSTPQATQMNTPSQPRQHFYPSRAQPPSSAASRVQSAAPARPGPAAHVYPAGSQVMMIPSQISYPASQGAYYIPGQGRSTYVVPTQQYPVQPGAPGFYPGASPTEFGTYAGAYYPAQGVQQFPTGVAPTPVLMNQPPQIAPKRERKTIRIRDPNQGGKDITEEIMSGARTASTPTPPQTGGGLEPQANGETPQVAVIVRPDDRSQGAIIADRPGLPGPEHSPSESQPSSPSPTPSPSPVLEPGSEPNLAVLSIPGDTMTTIQMSVEESTPISRETGEPYRLSPEPTPLAEPILEVEVTLSKPVPESEFSSSPLQAPTPLASHTVEIHEPNGMVPSEDLEPEVESSPELAPPPACPSESPVPIAPTAQPEELLNGAPSPPAVDLSPVSEPEEQAKEVTASMAPPTIPSATPATAPSATSPAQEEEMEEEEEEEEGEAGEAGEAESEKGGEELLPPESTPIPANLSQNLE.... The pIC50 is 4.3. (5) The compound is CC1(C)[C@H](NC(=O)/C(=N\OCc2cc(=O)c(O)cn2O)c2csc(N)n2)C(=O)N1OS(=O)(=O)O. The target protein sequence is MFKTTLCALLITASCSTFAA. The pIC50 is 6.3. (6) The drug is O=C(OCC1OC(OC(=O)c2cc(O)c(O)c(OC(=O)c3cc(O)c(O)c(O)c3)c2)C(OC(=O)c2cc(O)c(O)c(OC(=O)c3cc(O)c(O)c(O)c3)c2)C(OC(=O)c2cc(O)c(O)c(OC(=O)c3cc(O)c(O)c(O)c3)c2)C1OC(=O)c1cc(O)c(O)c(OC(=O)c2cc(O)c(O)c(O)c2)c1)c1cc(O)c(O)c(OC(=O)c2cc(O)c(O)c(O)c2)c1. 
The target protein (P06869) has sequence MKVWLASLFLCALVVKNSEGGSVLGAPDESNCGCQNGGVCVSYKYFSRIRRCSCPRKFQGEHCEIDASKTCYHGNGDSYRGKANTDTKGRPCLAWNAPAVLQKPYNAHRPDAISLGLGKHNYCRNPDNQKRPWCYVQIGLRQFVQECMVHDCSLSKKPSSSVDQQGFQCGQKALRPRFKIVGGEFTEVENQPWFAAIYQKNKGGSPPSFKCGGSLISPCWVASAAHCFIQLPKKENYVVYLGQSKESSYNPGEMKFEVEQLILHEYYREDSLAYHNDIALLKIRTSTGQCAQPSRSIQTICLPPRFTDAPFGSDCEITGFGKESESDYLYPKNLKMSVVKLVSHEQCMQPHYYGSEINYKMLCAADPEWKTDSCKGDSGGPLICNIEGRPTLSGIVSWGRGCAEKNKPGVYTRVSHFLDWIQSHIGEEKGLAF. The pIC50 is 8.4.
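
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small unit conversion. This is a minimal illustration, not part of the dataset tooling; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the input is assumed to be already expressed in mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the direction of the scale:
    # a lower IC50 (more potent) yields a higher pIC50.
    print(ic50_to_pic50(1e-6))
    print(pic50_to_ic50_nm(9.0))
```

Under this definition, a pIC50 of 5.2, as in example (1), corresponds to an IC50 of roughly 6.3 micromolar, while a pIC50 of 8.4, as in example (6), corresponds to roughly 4 nanomolar.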