Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is Cc1noc(C)c1-c1ccc2c(c1)C(O)(C1CCCCC1=O)C(=O)N2. The target protein sequence is MSTATTVAPAGIPATPGPVNPPPPEVSNPSKPGRKTNQLQYMQNVVVKTLWKHQFAWPFYQPVDAIKLNLPDYHKIIKNPMDMGTIKKRLENNYYWSASECMQDFNTMFTNCYIYNKPTDDIVLMAQALEKIFLQKVAQMPQEEVELLPPAPKGKGRKPAAGAQSAGTQQVAAVSSVSPATPFQSVPPTVSQTPVIAATPVPTITANVTSVPVPPAAAPPPPATPIVPVVPPTPPVVKKKGVKRKADTTTPTTSAITASRSESPPPLSDPKQAKVVARRESGGRPIKPPKKDLEDGEVPQHAGKKGKLSEHLRYCDSILREMLSKKHAAYAWPFYKPVDAEALELHDYHDIIKHPMDLSTVKRKMDGREYPDAQGFAADVRLMFSNCYKYNPPDHEVVAMARKLQDVFEMRFAKMPDEPVEAPALPAPAAPMVSK. The pIC50 is 7.6. (2) The compound is Cc1ccc(S(=O)(=O)C[n+]2sc(N(c3ccccc3)c3ccccc3)nc2-c2ccccc2)cc1. The target protein (Q03330) has sequence MVTKHQIEEDHLDGATTDPEVKRVKLENNVEEIQPEQAETNKQEGTDKENKGKFEKETERIGGSEVVTDVEKGIVKFEFDGVEYTFKERPSVVEENEGKIEFRVVNNDNTKENMMVLTGLKNIFQKQLPKMPKEYIARLVYDRSHLSMAVIRKPLTVVGGITYRPFDKREFAEIVFCAISSTEQVRGYGAHLMNHLKDYVRNTSNIKYFLTYADNYAIGYFKKQGFTKEITLDKSIWMGYIKDYEGGTLMQCSMLPRIRYLDAGKILLLQEAALRRKIRTISKSHIVRPGLEQFKDLNNIKPIDPMTIPGLKEAGWTPEMDALAQRPKRGPHDAAIQNILTELQNHAAAWPFLQPVNKEEVPDYYDFIKEPMDLSTMEIKLESNKYQKMEDFIYDARLVFNNCRMYNGENTSYYKYANRLEKFFNNKVKEIPEYSHLID. The pIC50 is 4.5. (3) The drug is N#Cc1ccc2ccc(/C=C/c3ccc(O)c(O)c3)nc2c1O. 
The target protein (P03354) has sequence MEAVIKVISSACKTYCGKTSPSKKEIGAMLSLLQKEGLLMSPSDLYSPGSWDPITAALSQRAMILGKSGELKTWGLVLGALKAAREEQVTSEQAKFWLGLGGGRVSPPGPECIEKPATERRIDKGEEVGETTVQRDAKMAPEETATPKTVGTSCYHCGTAIGCNCATASAPPPPYVGSGLYPSLAGVGEQQGQGGDTPPGAEQSRAEPGHAGQAPGPALTDWARVREELASTGPPVVAMPVVIKTEGPAWTPLEPKLITRLADTVRTKGLRSPITMAEVEALMSSPLLPHDVTNLMRVILGPAPYALWMDAWGVQLQTVIAAATRDPRHPANGQGRGERTNLNRLKGLADGMVGNPQGQAALLRPGELVAITASALQAFREVARLAEPAGPWADIMQGPSESFVDFANRLIKAVEGSDLPPSARAPVIIDCFRQKSQPDIQQLIRTAPSTLTTPGEIIKYVLDRQKTAPLTDQGIAAAMSSAIQPLIMAVVNRERDGQTG.... The pIC50 is 4.4. (4) The small molecule is CCCC[C@H]1C(=O)N(C)[C@@H](CCCC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)NCC(N)=O)CSCC(=O)N[C@@H](Cc2ccc(O)cc2)C(=O)N(C)[C@@H](C)C(=O)N[C@@H](CCC(=O)O)C(=O)N2CCC[C@H]2C(=O)N[C@@H](CCCN)C(=O)N[C@@H](CC(C)C)C(=O)N2CCC[C@H]2C(=O)N[C@@H](Cc2c[nH]c3ccccc23)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc2c[nH]c3ccccc23)C(=O)N1C. The target protein sequence is FTVTVPKDLYVVEYGSNMTIECKFPVEKQLDLAALIVYWEMEDKNIIQFVHGEEDLKVQHSSYRQRARLLKDQLSLGNAALQITDVKLQDAGVYRCMISYGGADYKRITVKVNAPYNKINQRILVVDPVTSEHELTCQAEGYPKAEVIWTSSDHQVLSGKTTTTNSKREEKLFNVTSTLRINTTTNEIFYCTFRRLDPEENHTAELVIPELPLAHPPNERTGSSETVRFQGHHHHHH. The pIC50 is 6.3. (5) The compound is O=C(Nc1ccc(Cl)c(Cl)c1)Nc1ccc(Cl)c(Cl)c1. The target protein (P95276) has sequence MSQVHRILNCRGTRIHAVADSPPDQQGPLVVLLHGFPESWYSWRHQIPALAGAGYRVVAIDQRGYGRSSKYRVQKAYRIKELVGDVVGVLDSYGAEQAFVVGHDWGAPVAWTFAWLHPDRCAGVVGISVPFAGRGVIGLPGSPFGERRPSDYHLELAGPGRVWYQDYFAVQDGIITEIEEDLRGWLLGLTYTVSGEGMMAATKAAVDAGVDLESMDPIDVIRAGPLCMAEGARLKDAFVYPETMPAWFTEADLDFYTGEFERSGFGGPLSFYHNIDNDWHDLADQQGKPLTPPALFIGGQYDVGTIWGAQAIERAHEVMPNYRGTHMIADVGHWIQQEAPEETNRLLLDFLGGLRP. The pIC50 is 6.3. (6) The compound is CC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@H](C(N)=O)C(C)C. 
The target protein (Q13477) has sequence MDFGLALLLAGLLGLLLGQSLQVKPLQVEPPEPVVAVALGASRQLTCRLACADRGASVQWRGLDTSLGAVQSDTGRSVLTVRNASLSAAGTRVCVGSCGGRTFQHTVQLLVYAFPDQLTVSPAALVPGDPEVACTAHKVTPVDPNALSFSLLVGGQELEGAQALGPEVQEEEEEPQGDEDVLFRVTERWRLPPLGTPVPPALYCQATMRLPGLELSHRQAIPVLHSPTSPEPPDTTSPESPDTTSPESPDTTSQEPPDTTSPEPPDKTSPEPAPQQGSTHTPRSPGSTRTRRPEISQAGPTQGEVIPTGSSKPAGDQLPAALWTSSAVLGLLLLALPTYHLWKRCRHLAEDDTHPPASLRLLPQVSAWAGLRGTGQVGISPS. The pIC50 is 3.9. (7) The drug is CCCc1c(Sc2ccccc2)[nH]c2nc(N)nc(N)c12. The target protein (Q07422) has sequence MQKPVCLVVAMTPKRGIGINNGLPWPHLTTDFKHFSRVTKTTPEEASRLNGWLPRKFAKTGDSGLPSPSVGKRFNAVVMGRKTWESMPRKFRPLVDRLNIVVSSSLKEEDIAAEKPQAEGQQRVRVCASLPAALSLLEEEYKDSVDQIFVVGGAGLYEAALSLGVASHLYITRVAREFPCDVFFPAFPGDDILSNKSTAAQAAAPAESVFVPFCPELGREKDNEATYRPIFISKTFSDNGVPYDFVVLEKRRKTDDAATAEPSNAMSSLTSTRETTPVHGLQAPSSAAAIAPVLAWMDEEDRKKREQKELIRAVPHVHFRGHEEFQYLDLIADIINNGRTMDDRTGVGVISKFGCTMRYSLDQAFPLLTTKRVFWKGVLEELLWFIRGDTNANHLSEKGVKIWDKNVTREFLDSRNLPHREVGDIGPGYGFQWRHFGAAYKDMHTDYTGQGVDQLKNVIQMLRTNPTDRRMLMTAWNPAALDEMALPPCHLLCQFYVNDQ.... The pIC50 is 6.9. (8) The small molecule is O=C1CCN([C@H]2CCCC[C@@H]2OCCc2cccc3ccccc23)C1. The pIC50 is 5.0. The target protein (Q63881) has sequence MAAGVAAWLPFARAAAIGWMPVASGPMPAPPRQERKRTQDALIVLNVSGTRFQTWQDTLERYPDTLLGSSERDFFYHPETQQYFFDRDPDIFRHILNFYRTGKLHYPRHECISAYDEELAFFGLIPEIIGDCCYEEYKDRRRENAERLQDDADTDNTGESALPTMTARQRVWRAFENPHTSTMALVFYYVTGFFIAVSVIANVVETVPCGSSPGHIKELPCGERYAVAFFCLDTACVMIFTVEYLLRLAAAPSRYRFVRSVMSIIDVVAILPYYIGLVMTDNEDVSGAFVTLRVFRVFRIFKFSRHSQGLRILGYTLKSCASELGFLLFSLTMAIIIFATVMFYAEKGSSASKFTSIPAAFWYTIVTMTTLGYGDMVPKTIAGKIFGSICSLSGVLVIALPVPVIVSNFSRIYHQNQRADKRRAQKKARLARIRAAKSGSANAYMQSKRNGLLSNQLQSSEDEPAFVSKSGSSFETQHHHLLHCLEKTTNHEFVDEQVFE.... (9) The small molecule is C=CCNC(=O)O[C@H]1CC[C@@](CNC(=O)c2cc(F)ccc2OC)(c2ccccc2)CC1. 
The target protein (P22001) has sequence MDERLSLLRSPPPPSARHRAHPPQRPASSGGAHTLVNHGYAEPAAGRELPPDMTVVPGDHLLEPEVADGGGAPPQGGCGGGGCDRYEPLPPSLPAAGEQDCCGERVVINISGLRFETQLKTLCQFPETLLGDPKRRMRYFDPLRNEYFFDRNRPSFDAILYYYQSGGRIRRPVNVPIDIFSEEIRFYQLGEEAMEKFREDEGFLREEERPLPRRDFQRQVWLLFEYPESSGPARGIAIVSVLVILISIVIFCLETLPEFRDEKDYPASTSQDSFEAAGNSTSGSRAGASSFSDPFFVVETLCIIWFSFELLVRFFACPSKATFSRNIMNLIDIVAIIPYFITLGTELAERQGNGQQAMSLAILRVIRLVRVFRIFKLSRHSKGLQILGQTLKASMRELGLLIFFLFIGVILFSSAVYFAEADDPTSGFSSIPDAFWWAVVTMTTVGYGDMHPVTIGGKIVGSLCAIAGVLTIALPVPVIVSNFNYFYHRETEGEEQSQYM.... The pIC50 is 7.3. (10) The small molecule is O=C1Nc2ccccc2/C1=C1/Nc2cc(Br)ccc2C1=O. The target protein sequence is MSGRPRTTSFAESCKPVQQPSAFGSMKVSRDKDGSKVTTVVATPGQGPDRPQEVSYTDTKVIGNGSFGVVYQAKLCDSGELVAIKKVLQDKRFKNRELQIMRKLDHCNIVRLRYFFYSSGEKKDEVYLNLVLDYVPETVYRVARHYSRAKQTLPVIYVKLYMYQLFRSLAYIHSFGICHRDIKPQNLLLDPDTAVLKLCDFGSAKQLVRGEPNVSYICSRYYRAPELIFGATDYTSSIDVWSAGCVLAELLLGQPIFPGDSGVDQLVEIIKVLGTPTREQIREMNPNYTEFKFPQIKAHPWTKDSSGTGHFTSGVRVFRPRTPPEAIALCSRLLEYTPTARLTPLEACAHSFFDELRDPNVKLPNGRDTPALFNFTTQELSSNPPLATILIPPHARIQAAASTPSNATAASDTNAGDRGQTNNTASASASNST. The pIC50 is 4.7.
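For reference, the pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched as below. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and the nanomolar variant assumes the raw IC50 is reported in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 in nanomolar (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: recover the IC50 in nanomolar from a pIC50."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of 6.
    print(round(ic50_to_pic50(1e-6), 3))
    # The pIC50 of 7.6 in example (1) corresponds to roughly 25 nM.
    print(round(pic50_to_ic50_nm(7.6), 1))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold change in IC50, and higher values mean more potent binding.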