Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 6.4. The target protein (O75909) has sequence MKENKENSSPSVTSANLDHTKPCWYWDKKDLAHTPSQLEGLDPATEARYRREGARFIFDVGTRLGLHYDTLATGIIYFHRFYMFHSFKQFPRYVTGACCLFLAGKVEETPKKCKDIIKTARSLLNDVQFGQFGDDPKEEVMVLERILLQTIKFDLQVEHPYQFLLKYAKQLKGDKNKIQKLVQMAWTFVNDSLCTTLSLQWEPEIIAVAVMYLAGRLCKFEIQEWTSKPMYRRWWEQFVQDVPVDVLEDICHQILDLYSQGKQQMPHHTPHQLQQPPSLQPTPQVPQVQQSQPSQSSEPSQPQQKDPQQPAQQQQPAQQPKKPSPQPSSPRQVKRAVVVSPKEENKAAEPPPPKIPKIETTHPPLPPAHPPPDRKPPLAAALGEAEPPGPVDATDLPKVQIPPPAHPAPVHQPPPLPHRPPPPPPSSYMTGMSTTSSYMSGEGYQSLQSMMKTEGPSYGALPPAYGPPAHLPYHPHVYPPNPPPPPVPPPPASFPPPAIP.... The drug is CN1CC[C@H](c2c(O)cc(O)c3c(=O)cc(/C=C/c4c(Cl)cccc4Cl)oc23)[C@H](O)C1. (2) The drug is O=C(CO)C[n+]1ccc(/C=C/c2cccc3ccccc23)cc1. The target protein (P32738) has sequence MPILEKAPQKMPVKASSWEELDLPKLPVPPLQQTLATYLQCMQHLVPEEQFRKSQAIVKRFGAPGGLGETLQEKLLERQEKTANWVSEYWLNDMYLNNRLALPVNSSPAVIFARQHFQDTNDQLRFAACLISGVLSYKTLLDSHSLPTDWAKGQLSGQPLCMKQYYRLFSSYRLPGHTQDTLVAQKSSIMPEPEHVIVACCNQFFVLDVVINFRRLSEGDLFTQLRKIVKMASNEDERLPPIGLLTSDGRSEWAKARTVLLKDSTNRDSLDMIERCICLVCLDGPGTGELSDTHRALQLLHGGGCSLNGANRWYDKSLQFVVGRDGTCGVVCEHSPFDGIVLVQCTEHLLKHMMTSNKKLVRADSVSELPAPRRLRLKCSPETQGHLASSAEKLQRIVKNLDFIVYKFDNYGKTFIKKQKYSPDGFIQVALQLAYYRLYQRLVPTYESASIRRFQEGRVDNIRSATPEALAFVQAMTDHKAAMPASEKLQLLQTAMQAHK.... The pIC50 is 5.5. (3) The compound is O=C(O)CS(=O)(=O)c1cccc(COc2ccccc2)c1. 
The target protein (P04014) has sequence MADDSGTENEGSGCTGWFMVEAIVEHTTGTQISEDEEEEVEDSGYDMVDFIDDRHITQNSVEAQALFNRQEADAHYATVQDLKRKYLGSPYVSPISNVANAVESEISPRLDAIKLTTQPKKVKRRLFETRELTDSGYGYSEVEAATQVEKHGDPENGGDGQERDTGRDIEGEGVEHREAEAVDDSTREHADTSGILELLKCKDIRSTLHGKFKDCFGLSFVDLIRPFKSDRTTCADWVVAGFGIHHSIADAFQKLIEPLSLYAHIQWLTNAWGMVLLVLIRFKVNKSRCTVARTLGTLLNIPENHMLIEPPKIQSGVRALYWFRTGISNASTVIGEAPEWITRQTVIEHSLADSQFKLTEMVQWAYDNDICEESEIAFEYAQRGDFDSNARAFLNSNMQAKYVKDCAIMCRHYKHAEMKKMSIKQWIKYRGTKVDSVGNWKPIVQFLRHQNIEFIPFLSKLKLWLHGTPKKNCIAIVGPPDTGKSCFCMSLIKFLGGTVI.... The pIC50 is 4.1. (4) The drug is CC1(C)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. The pIC50 is 4.6. The target protein sequence is MSRTGRLSVFFSAIFPLLTLTNMAEAASQPPQVTVDKLKRLENDFGGRIGVYAIDTGSNKTFGYRANERFPLCSSFKGFLAAAVLSKSQQQEGLLNQRIRYDNRVMEPHSPVTEKQITTGMTVAELSAATLQYSDNGAANLLLEKLIGGPEGMTSFMRSIGDNVFRLDRWELELNSAIPGDDRDTSTPKAVAESMQKLAFGNVLGLTERHQLMDWFKGNTTGGARIRASVPANWVVGDKTGTCGVYGTANDYAVIWPVGHAPIVLAVYTSKPDKNSKHSDAVIADASRIVLESFNIDALRMATGKSIGF. (5) The small molecule is Cn1c(-c2ccc(C=O)s2)nc2c3ccccc3c3ccccc3c21. The target protein (Q96SD1) has sequence MSSFEGQMAEYPTISIDRFDRENLRARAYFLSHCHKDHMKGLRAPTLKRRLECSLKVYLYCSPVTKELLLTSPKYRFWKKRIISIEIETPTQISLVDEASGEKEEIVVTLLPAGHCPGSVMFLFQGNNGTVLYTGDFRLAQGEAARMELLHSGGRVKDIQSVYLDTTFCDPRFYQIPSREECLSGVLELVRSWITRSPYHVVWLNCKAAYGYEYLFTNLSEELGVQVHVNKLDMFRNMPEILHHLTTDRNTQIHACRHPKAEEYFQWSKLPCGITSRNRIPLHIISIKPSTMWFGERSRKTNVIVRTGESSYRACFSFHSSYSEIKDFLSYLCPVNAYPNVIPVGTTMDKVVEILKPLCRSSQSTEPKYKPLGKLKRARTVHRDSEEEDDYLFDDPLPIPLRHKVPYPETFHPEVFSMTAVSEKQPEKLRQTPGCCRAECMQSSRFTNFVDCEESNSESEEEVGIPASLQGDLGSVLHLQKADGDVPQWEVFFKRNDEIT.... The pIC50 is 4.7. (6) The compound is CC(=O)N(O)CCCP(=O)(O)O. 
The target protein (A0QVH7) has sequence MTTSAASGEPGRQRVLILGSTGSIGTQALEVIAANPDRFEVVGLAAGGGNPELLAAQRAQTGVAAVAVADPAAAEAVGDVRYSGPDAVTRLVEDTEADVVLNALVGALGLQPTLAALATGARLALANKESLVAGGPLVLKAAAPGQIVPVDSEHSAMAQCLRGGTRAELDKIVLTASGGPFLGWSAEDLKSVTPEQAGKHPTWSMGPMNTLNSATLVNKGLELIETHLLFGVDYDDIDVVVHPQSIVHSMATFTDGSTLAQASPPDMKLPIALALGWPDRIAGAAAACDFSTASTWEFLPLDNAVFPAVDLARFAGKQGGCLTAVYNSANEEAAEAFLDGRIGFPDIVETVGDVLHAADRWAAEPATVDDVLDAQRWAREQARGVVEQKSVRRGLVTK. The pIC50 is 6.5.
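For reference, the pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and the nanomolar helper assumes the raw IC50 was reported in nM, which the text above does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 (assumes the input unit is nM)."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 (mol/L) = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Example (1) above reports a pIC50 of 6.4, i.e. an IC50 of roughly 398 nM.
    ic50_m = pic50_to_ic50_molar(6.4)
    print(f"pIC50 6.4 -> IC50 = {ic50_m:.3e} M ({ic50_m * 1e9:.0f} nM)")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold change in IC50, so the 6.4 in example (1) indicates roughly 200 times higher potency than the 4.1 in example (3).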