Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CN(C)CCCN1C(=O)NC(c2ccccc2)(c2ccccc2)C1=O. The target protein (P04775) has sequence MARSVLVPPGPDSFRFFTRESLAAIEQRIAEEKAKRPKQERKDEDDENGPKPNSDLEAGKSLPFIYGDIPPEMVSEPLEDLDPYYINKKTFIVLNKGKAISRFSATSALYILTPFNPIRKLAIKILVHSLFNVLIMCTILTNCVFMTMSNPPDWTKNVEYTFTGIYTFESLIKILARGFCLEDFTFLRNPWNWLDFTVITFAYVTEFVNLGNVSALRTFRVLRALKTISVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALIGLQLFMGNLRNKCLQWPPDNSTFEINITSFFNNSLDWNGTAFNRTVNMFNWDEYIEDKSHFYFLEGQNDALLCGNSSDAGQCPEGYICVKAGRNPNYGYTSFDTFSWAFLSLFRLMTQDFWENLYQLTLRAAGKTYMIFFVLVIFLGSFYLINLILAVVAMAYEEQNQATLEEAEQKEAEFQQMLEQLKKQQEEAQAAAAAASAESRDFSGAGGIGVFSESSSVASKLSSKSEKE.... The pIC50 is 4.1. (2) The small molecule is O=C(NCC(F)(F)F)[C@@H]1CN(Cc2ccc(-c3ccccc3)o2)CCN1C[C@@H](O)C[C@@H](Cc1cccnc1)C(=O)N[C@H]1c2ccccc2OC[C@H]1O. The target protein sequence is PQITLWKRPIVTIKIGGQLKEALLDTGADDTVLEEMSLPGKWKPKIIGGIGGFVKVRQYDQVPIEICGHKVIGTVLIGPTPANIIGRNLMTQLGCTLNF. The pIC50 is 9.8. (3) The small molecule is COc1cc(C(=O)N(C)C2CCN(C)CC2)ccc1Nc1ncc2c(n1)-c1c(nn(C)c1-c1ccc(-c3cnn(C)c3)cc1)CC2. The target protein sequence is FVFLFSVVIGSIYLFLRKRQPDGPLGPLYASSNPEYLSASDVFPCSVYVPDEWEVSREKITLLRELGQGSFGMVYEGNARDIIKGEAETRVAVKTVNESASLRERIEFLNEASVMKGFTCHHVVRLLGVVSKGQPTLVVMELMAHGDLKSYLRSLRPEAENNPGRPPPTLQEMIQMAAEIADGMAYLNAKKFVHRDLAARNCMVAHDFTVKIGDFGMTRDIYETDYYRKGGKGLLPVRWMAPESLKDGVFTTSSDMWSFGVVLWEITSLAEQPYQGLSNEQVLKFVMDGGYLDQPDNCPERVTDLMRMCWQFNPKMRPTFLEIVNLLKDDLHPSFPEVSFFHSEENKAPESEELEMEFEDMENVPLDRSSHCQREEAGGRDGGSSLGFKRSYEEHIPYTHMNGGKKN. The pIC50 is 9.6. (4) The drug is CC(C)C[C@@H](C=O)NC(=O)[C@@H](NC(=O)c1ccco1)C(C)C. 
The target protein sequence is MAEEFITPVYCTGVSAQVQKQRAKELGLGRHENAIKYLGQDYEQLRVHCLQRGALFRDEAFPPVPQSLGFKELGPNSSKTYGIKWKRPTELFSNPQFIVDGATRTDICQGALGACWLLAAIASLTLNDTLLHRVVPHGQSFQDGYAGIFHFQLWQFGEWVDVVVDDLLPTKDGKLVFVHSAQGNEFWSALLEKAYAKVNGSYEALSGGSTSEGFEDFTGGVTEWYELRKAPSDLYNIILKALERGSLLGCSIDISSILDMEAVTFKKLVKGHAYSVTGAKQVNYQGQMVNLIRMRNPWGEVEWTGAWSDGSSEWNGVDPYVREQLRIKMEDGEFWMSFRDFMREFTRLEICNLTPDALKSQRFRNWNTTLYEGTWRRGSTAGGCRNYPATFWVNPQFKIRLEETDDPDPDDYGGRESGCSFLLALMQKHRRRERRFGRDMETIGFAVYEVPPELVGQPAVHLKRDFFLANASRARSEQFINLREVSTRFRLPPGEYVVVP.... The pIC50 is 6.1. (5) The drug is CN(C)CCOc1ccc(/C(=C(/CCCO)c2ccccc2)c2ccc(F)cc2)cc1.Cl. The target protein (P62508) has sequence MDSVELCLPESFSLHYEEELLCRMSNKDRHIDSSCSSFIKTEPSSPASLTDSVNHHSPGGSSDASGSYSSTMNGHQNGLDSPPLYPSAPILGGSGPVRKLYDDCSSTIVEDPQTKCEYMLNSMPKRLCLVCGDIASGYHYGVASCEACKAFFKRTIQGNIEYSCPATNECEITKRRRKSCQACRFMKCLKVGMLKEGVRLDRVRGGRQKYKRRIDAENSPYLNPQLVQPAKKPYNKIVSHLLVAEPEKIYAMPDPTVPDSDIKALTTLCDLADRELVVIIGWAKHIPGFSTLSLADQMSLLQSAWMEILILGVVYRSLSFEDELVYADDYIMDEDQSKLAGLLDLNNAILQLVKKYKSMKLEKEEFVTLKAIALANSDSMHIEDVEAVQKLQDVLHEALQDYEAGQHMEDPRRAGKMLMTLPLLRQTSTKAVQHFYNIKLEGKVPMHKLFLEMLEAKV. The pIC50 is 5.4. (6) The small molecule is C[C@H]1CN(CCC(C(=O)NCc2cc(C(F)(F)F)cc(C(F)(F)F)c2)c2csc(NC(=O)Cc3ccccc3)n2)CC[C@]12C=Cc1ccccc12. The target protein (P51683) has sequence MEDNNMLPQFIHGILSTSHSLFTRSIQELDEGATTPYDYDDGEPCHKTSVKQIGAWILPPLYSLVFIFGFVGNMLVIIILIGCKKLKSMTDIYLLNLAISDLLFLLTLPFWAHYAANEWVFGNIMCKVFTGLYHIGYFGGIFFIILLTIDRYLAIVHAVFALKARTVTFGVITSVVTWVVAVFASLPGIIFTKSKQDDHHYTCGPYFTQLWKNFQTIMRNILSLILPLLVMVICYSGILHTLFRCRNEKKRHRAVRLIFAIMIVYFLFWTPYNIVLFLTTFQESLGMSNCVIDKHLDQAMQVTETLGMTHCCINPVIYAFVGEKFRRYLSIFFRKHIAKRLCKQCPVFYRETADRVSSTFTPSTGEQEVSVGL. The pIC50 is 6.8. (7) The compound is CCCS(=O)(=O)Nc1ccc(Cl)c(Nc2ncccc2-c2ncnc3[nH]cnc23)c1F. The target is CKENALLRYLLDKDD. The pIC50 is 7.0. (8) The drug is CN1C(=O)[C@@]23C[C@]4([C@]56C[C@@]78SS[C@@](CO)(C(=O)N7[C@H]5Nc5ccccc56)N(C)C8=O)c5ccccc5N[C@@H]4N2C(=O)[C@]1(CO)SS3. 
The target protein (Q9Z148) has sequence MRGLPRGRGLMRARGRGRAAPTGGRGRGRGGAHRGRGRPRSLLSLPRAQASWAPQLPAGLTGPPVPCLPSQGEAPAEMGALLLEKEPRGAAERVHSSLGDTPQSEETLPKANPDSLEPAGPSSPASVTVTVGDEGADTPVGAASLIGDEPESLEGDGGRIVLGHATKSFPSSPSKGGACPSRAKMSMTGAGKSPPSVQSLAMRLLSMPGAQGAATAGPEPSPATTAAQEGQPKVHRARKTMSKPSNGQPPIPEKRPPEVQHFRMSDDMHLGKVTSDVAKRRKLNSGSLSEDLGSAGGSGDIILEKGEPRPLEEWETVVGDDFSLYYDAYSVDERVDSDSKSEVEALAEQLSEEEEEEEEEEEEEEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKKKWRKDSPWVKPSRKRRKREPPRAKEPRGVNGVGSSGPSEYMEVPLGSLELPSEGTLSPNHAGVSNDTSSLETERGFEELPLCSCRMEAPKIDRISERAGHKCM.... The pIC50 is 5.6. (9) The small molecule is C[C@@H]1C[C@H](N)CN(c2ccncc2NCc2ccc(F)c(-c3c(F)ccc(C4(O)CC4)c3F)n2)C1. The target protein (P11309) has sequence MLLSKINSLAHLRAAPCNDLHATKLAPGKEKEPLESQYQVGPLLGSGGFGSVYSGIRVSDNLPVAIKHVEKDRISDWGELPNGTRVPMEVVLLKKVSSGFSGVIRLLDWFERPDSFVLILERPEPVQDLFDFITERGALQEELARSFFWQVLEAVRHCHNCGVLHRDIKDENILIDLNRGELKLIDFGSGALLKDTVYTDFDGTRVYSPPEWIRYHRYHGRSAAVWSLGILLYDMVCGDIPFEHDEEIIRGQVFFRQRVSSECQHLIRWCLALRPSDRPTFEEIQNHPWMQDVLLPQETAEIHLHSLSPGPSK. The pIC50 is 8.0.
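
The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be illustrated with a short conversion. This is a minimal sketch, not part of the dataset: it assumes the raw IC50 is given in nanomolar, which the text above does not state, and the function names are hypothetical.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50 = -log10(IC50 in M)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition: return the IC50 in nanomolar for a given pIC50."""
    return (10.0 ** (-pic50)) * 1e9  # M -> nM


if __name__ == "__main__":
    # A lower IC50 gives a higher pIC50, i.e. a more potent compound.
    for ic50_nm in (10.0, 1000.0, 100000.0):
        print(f"IC50 = {ic50_nm:g} nM -> pIC50 = {pic50_from_ic50_nm(ic50_nm):.1f}")
```

Under this definition, the pIC50 of 8.0 in example (9) corresponds to an IC50 of about 10 nM, and each unit increase in pIC50 means a tenfold lower IC50.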