Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is N#Cc1cccc(-c2cc(C(N)=O)c3[nH]ccc3c2)c1. The pIC50 is 5.0. The target protein (O15111) has sequence MERPPGLRPGAGGPWEMRERLGTGGFGNVCLYQHRELDLKIAIKSCRLELSTKNRERWCHEIQIMKKLNHANVVKACDVPEELNILIHDVPLLAMEYCSGGDLRKLLNKPENCCGLKESQILSLLSDIGSGIRYLHENKIIHRDLKPENIVLQDVGGKIIHKIIDLGYAKDVDQGSLCTSFVGTLQYLAPELFENKPYTATVDYWSFGTMVFECIAGYRPFLHHLQPFTWHEKIKKKDPKCIFACEEMSGEVRFSSHLPQPNSLCSLVVEPMENWLQLMLNWDPQQRGGPVDLTLKQPRCFVLMDHILNLKIVHILNMTSAKIISFLLPPDESLHSLQSRIERETGINTGSQELLSETGISLDPRKPASQCVLDGVRGCDSYMVYLFDKSKTVYEGPFASRSLSDCVNYIVQDSKIQLPIIQLRKVWAEAVHYVSGLKEDYSRLFQGQRAAMLSLLRYNANLTKMKNTLISASQQLKAKLEFFHKSIQLDLERYSEQMTY.... (2) The small molecule is CC(C)[NH2+]C[C@@H](O)COc1cccc2ccccc12. The target protein (P10633) has sequence MELLNGTGLWSMAIFTVIFILLVDLMHRRHRWTSRYPPGPVPWPVLGNLLQVDLSNMPYSLYKLQHRYGDVFSLQKGWKPMVIVNRLKAVQEVLVTHGEDTADRPPVPIFKCLGVKPRSQGVILASYGPEWREQRRFSVSTLRTFGMGKKSLEEWVTKEAGHLCDAFTAQAGQSINPKAMLNKALCNVIASLIFARRFEYEDPYLIRMVKLVEESLTEVSGFIPEVLNTFPALLRIPGLADKVFQGQKTFMALLDNLLAENRTTWDPAQPPRNLTDAFLAEVEKAKGNPESSFNDENLRMVVVDLFTAGMVTTATTLTWALLLMILYPDVQRRVQQEIDEVIGQVRCPEMTDQAHMPYTNAVIHEVQRFGDIAPLNLPRFTSCDIEVQDFVIPKGTTLIINLSSVLKDETVWEKPHRFHPEHFLDAQGNFVKHEAFMPFSAGRRACLGEPLARMELFLFFTCLLQRFSFSVPVGQPRPSTHGFFAFPVAPLPYQLCAVVR.... The pIC50 is 3.7. (3) The small molecule is O=C(O)c1cc(Cc2cc(C(=O)O)c(O)c(-c3ccccc3)c2)cc(-c2ccccc2)c1O. 
The target protein (P25044) has sequence MAAAPWYIRQRDTDLLGKFKFIQNQEDGRLREATNGTVNSRWSLGVSIEPRNDARNRYVNIMPYERNRVHLKTLSGNDYINASYVKVNVPGQSIEPGYYIATQGPTRKTWDQFWQMCYHNCPLDNIVIVMVTPLVEYNREKCYQYWPRGGVDDTVRIASKWESPGGANDMTQFPSDLKIEFVNVHKVKDYYTVTDIKLTPTDPLVGPVKTVHHFYFDLWKDMNKPEEVVPIMELCAHSHSLNSRGNPIIVHCSAGVGRTGTFIALDHLMHDTLDFKNITERSRHSDRATEEYTRDLIEQIVLQLRSQRMKMVQTKDQFLFIYHAAKYLNSLSVNQ. The pIC50 is 3.6. (4) The drug is C[C@H](O)[C@H](NC(=O)[C@@H]1CCCN1C(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(N)=O. The target protein (P09790) has sequence MKAKRFKINAISLSIFLAYALTPYSEAALVRDDVDYQIFRDFAENKGKFFVGATDLSVKNKRGQNIGNALSNVPMIDFSVADVNKRIATVVDPQYAVSVKHAKAEVHTFYYGQYNGHNDVADKENEYRVVEQNNYEPHKAWGASNLGRLEDYNMARFNKFVTEVAPIAPTDAGGGLDTYKDKNRFSSFVRIGAGRQLVYEKGVYHQEGNEKGYDLRDLSQAYRYAIAGTPYKDINIDQTMNTEGLIGFGNHNKQYSAEELKQALSQDALTNYGVLGDSGSPLFAFDKQKNQWVFLGTYDYWAGYGKKSWQEWNIYKKEFADKIKQHDNAGTVKGNGEHHWKTTGTNSHIGSTAVRLANNEGDANNGQNVTFEDNGTLVLNQNINQGAGGLFFKGDYTVKGANNDITWLGAGIDVADGKKVVWQVKNPNGDRLAKIGKGTLEINGTGVNQGQLKVGDGTVILNQKADADKKVQAFSQVGIVSGRGTLVLNSSNQINPDNLY.... The pIC50 is 3.4.
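The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration and not part of the dataset tooling. The function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical, as is the assumption that raw IC50 values are given in nanomolar and must be converted to molar first.

```python
import math


def ic50_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed input unit) to pIC50."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Inverse conversion: pIC50 back to IC50 in nanomolar."""
    return 10.0 ** (-pic50) * 1e9  # M -> nM


if __name__ == "__main__":
    # A pIC50 of 5.0, as in example (1), corresponds to an IC50 of about 10 uM.
    print(round(ic50_to_pic50(10_000.0), 2))
    # Lower pIC50 values, such as 3.4 in example (4), mean weaker binding.
    print(round(pic50_to_ic50(3.4) / 1000.0, 1), "uM")
```

Because the scale is logarithmic, each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent binding.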