Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is N[C@@H](CCC(=O)Nc1ccccc1N1CCOCC1)C(=O)O. The target protein (Q15758) has sequence MVADPPRDSKGLAAAEPTANGGLALASIEDQGAAAGGYCGSRDQVRRCLRANLLVLLTVVAVVAGVALGLGVSGAGGALALGPERLSAFVFPGELLLRLLRMIILPLVVCSLIGGAASLDPGALGRLGAWALLFFLVTTLLASALGVGLALALQPGAASAAINASVGAAGSAENAPSKEVLDSFLDLARNIFPSNLVSAAFRSYSTTYEERNITGTRVKVPVGQEVEGMNILGLVVFAIVFGVALRKLGPEGELLIRFFNSFNEATMVLVSWIMWYAPVGIMFLVAGKIVEMEDVGLLFARLGKYILCCLLGHAIHGLLVLPLIYFLFTRKNPYRFLWGIVTPLATAFGTSSSSATLPLMMKCVEENNGVAKHISRFILPIGATVNMDGAALFQCVAAVFIAQLSQQSLDFVKIITILVTATASSVGAAGIPAGGVLTLAIILEAVNLPVDHISLILAVDWLVDRSCTVLNVEGDALGAGLLQNYVDRTESRSTEPELIQ.... The pIC50 is 3.2. (2) The compound is O=C(O)c1ccc(C2CN(Cc3ccc([C@H]4COc5ccccc5O4)cc3)C2)cc1. The target protein (P09960) has sequence MPEIVDTCSLASPASVCRTKHLHLRCSVDFTRRTLTGTAALTVQSQEDNLRSLVLDTKDLTIEKVVINGQEVKYALGERQSYKGSPMEISLPIALSKNQEIVIEISFETSPKSSALQWLTPEQTSGKEHPYLFSQCQAIHCRAILPCQDTPSVKLTYTAEVSVPKELVALMSAIRDGETPDPEDPSRKIYKFIQKVPIPCYLIALVVGALESRQIGPRTLVWSEKEQVEKSAYEFSETESMLKIAEDLGGPYVWGQYDLLVLPPSFPYGGMENPCLTFVTPTLLAGDKSLSNVIAHEISHSWTGNLVTNKTWDHFWLNEGHTVYLERHICGRLFGEKFRHFNALGGWGELQNSVKTFGETHPFTKLVVDLTDIDPDVAYSSVPYEKGFALLFYLEQLLGGPEIFLGFLKAYVEKFSYKSITTDDWKDFLYSYFKDKVDVLNQVDWNAWLYSPGLPPIKPNYDMTLTNACIALSQRWITAKEDDLNSFNATDLKDLSSHQL.... The pIC50 is 9.5. (3) The drug is Nc1ccc(S(=O)(=O)c2ccc(N)cc2)cc1. 
The target protein sequence is MDIIEESNKCKENNKGNIVVLNFGTTDKTNAVTILETALYLTEKYIGKIINTSYMYETVPEYVVLDKSDIPKNIIGEDDPYDVSSLNDLVKGLEKSKYENVFQGEENLVSQCEYERFLNNKDLFENKIKQISTEKYESETSNIIKENDEIMKINLEKHKNKYYTSYFYNLVVVFKCFIDDPLNLLVILKYIEHLMKRKNSKEVEKFENRLIDIDILFFNNYTIFEKNINLTKNDLYTIMCKYINIEYDNSSSDNCNKLSRNIEEIKDNIKFLSIPHVYTKHRYSILLCLNDIMPNYKHNALKETINKLHEEFITSFSKLYNTCIKKYNKRLYVLKNEVLCLKEKTNIVGILNTNYNSFSDGGLFVKPNIAVHRMFQMIKEGVDIIDIGGESSAPFVSHNPEIKERDLVIPVLELFEQEWNKMLQIRENGMEKQKDKLNQNDLSLQKKTSTIYKPPISIDTMNYDLFKECVDKNLVDILNDISACTNDPKIIKLLKKKN. The pIC50 is 4.9. (4) The drug is CNc1nc(-c2ccc3c(c2)CCN3C(=O)c2ccccc2F)cs1. The target protein sequence is SQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVYKGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNIVRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVLRSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGSVKIGDLGLATLKRASFAKSVIGTPEFMAPEMYEEKYDE. The pIC50 is 5.1. (5) The drug is O=C(C[C@@H]1C(=O)Nc2ccccc2N1S(=O)(=O)c1ccc(Cl)c(Cl)c1)NCCc1ccc(C2=NCCN2)cc1. The target protein (Q9BDQ5) has sequence MASRAPLELLPLNRSQLSPPNATTCDDAPEAWDLLHRVLPSVIIIICVCGLLGNLLVLAVLLRPRRRLNVAEMYLANLAASDLVFVLGLPFWAANISNQFRWPFGGLLCRLVNGVIKANLFISIFLVVAISRDRYRALVHPMATRRRRQARATCVLIWVAGSLLSVPTFLFRSIEAVPELNNDSACVLLHPPGAWHVARMVELNVLGFLLPLAAIVFFNCHILASLRGRPEVRGARCGGPPDGRTTALILTFVAAFLVCWTPYHFFAFLEFLTQVQVVRGCFWENFKDLGLQYASFFAFINSCLNPVIYVFVGRLFRTRVWDLFKQCAPRRPPAVSWSHRKRVLQLFWQN. The pIC50 is 7.8. (6) The small molecule is CC(C)Oc1ccc(CNC(=O)C2CCN(S(=O)(=O)c3cn(C(C)C)cn3)CC2)cc1. The target protein (P9WHH9) has sequence MTHYDVVVLGAGPGGYVAAIRAAQLGLSTAIVEPKYWGGVCLNVGCIPSKALLRNAELVHIFTKDAKAFGISGEVTFDYGIAYDRSRKVAEGRVAGVHFLMKKNKITEIHGYGTFADANTLLVDLNDGGTESVTFDNAIIATGSSTRLVPGTSLSANVVTYEEQILSRELPKSIIIAGAGAIGMEFGYVLKNYGVDVTIVEFLPRALPNEDADVSKEIEKQFKKLGVTILTATKVESIADGGSQVTVTVTKDGVAQELKAEKVLQAIGFAPNVEGYGLDKAGVALTDRKAIGVDDYMRTNVGHIYAIGDVNGLLQLAHVAEAQGVVAAETIAGAETLTLGDHRMLPRATFCQPNVASFGLTEQQARNEGYDVVVAKFPFTANAKAHGVGDPSGFVKLVADAKHGELLGGHLVGHDVAELLPELTLAQRWDLTASELARNVHTHPTMSEALQECFHGLVGHMINF. The pIC50 is 5.4. 
(7) The small molecule is O=C(Nc1ccc(S(=O)(=O)Nc2nccs2)cc1)C(c1ccccc1)c1ccccc1. The target protein (Q9NY46) has sequence MAQALLVPPGPESFRLFTRESLAAIEKRAAEEKAKKPKKEQDNDDENKPKPNSDLEAGKNLPFIYGDIPPEMVSEPLEDLDPYYINKKTFIVMNKGKAIFRFSATSALYILTPLNPVRKIAIKILVHSLFSMLIMCTILTNCVFMTLSNPPDWTKNVEYTFTGIYTFESLIKILARGFCLEDFTFLRDPWNWLDFSVIVMAYVTEFVSLGNVSALRTFRVLRALKTISVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALIGLQLFMGNLRNKCLQWPPSDSAFETNTTSYFNGTMDSNGTFVNVTMSTFNWKDYIGDDSHFYVLDGQKDPLLCGNGSDAGQCPEGYICVKAGRNPNYGYTSFDTFSWAFLSLFRLMTQDYWENLYQLTLRAAGKTYMIFFVLVIFLGSFYLVNLILAVVAMAYEEQNQATLEEAEQKEAEFQQMLEQLKKQQEEAQAVAAASAASRDFSGIGGLGELLESSSEASKLSSKSAKEWR.... The pIC50 is 7.7. (8) The small molecule is CCOc1cc2ncnc(Nc3ccc4c(cnn4Cc4ccccc4)c3)c2cc1NC(=O)/C=C/CN(C)C. The target protein (P04626) has sequence MELAALCRWGLLLALLPPGAASTQVCTGTDMKLRLPASPETHLDMLRHLYQGCQVVQGNLELTYLPTNASLSFLQDIQEVQGYVLIAHNQVRQVPLQRLRIVRGTQLFEDNYALAVLDNGDPLNNTTPVTGASPGGLRELQLRSLTEILKGGVLIQRNPQLCYQDTILWKDIFHKNNQLALTLIDTNRSRACHPCSPMCKGSRCWGESSEDCQSLTRTVCAGGCARCKGPLPTDCCHEQCAAGCTGPKHSDCLACLHFNHSGICELHCPALVTYNTDTFESMPNPEGRYTFGASCVTACPYNYLSTDVGSCTLVCPLHNQEVTAEDGTQRCEKCSKPCARVCYGLGMEHLREVRAVTSANIQEFAGCKKIFGSLAFLPESFDGDPASNTAPLQPEQLQVFETLEEITGYLYISAWPDSLPDLSVFQNLQVIRGRILHNGAYSLTLQGLGISWLGLRSLRELGSGLALIHHNTHLCFVHTVPWDQLFRNPHQALLHTANRP.... The pIC50 is 7.9.
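The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here for illustration, not part of BindingDB or any dataset tooling, and the nanomolar back-conversion assumes only the standard factor of 1 M = 1e9 nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of about 6.
    print(ic50_to_pic50(1e-6))
    # pIC50 values taken from examples (1) and (2) above, converted back to nM.
    for value in (3.2, 9.5):
        print(value, pic50_to_ic50_nm(value))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why example (2) at 9.5 is far more potent than example (1) at 3.2.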