Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COc1ccc2[nH]cc(C[C@@H]3CCCN3C)c2c1. The target protein (Q60484) has sequence MSPPNQSEEGLPQEASNRSLNATETPGDWDPGLLQALKVSLVVVLSIITLATVLSNAFVLTTILLTRKLHTPANYLIGSLATTDLLVSILVMPISIAYTTTRTWNFGQILCDIWVSSDITCCTASILHLCVIALDRYWAITDALEYSKRRTAGHAGAMIAAVWVISICISIPPLFWRQAQAQEEMSDCLVNTSQISYTIYSTCGAFYIPSVLLIILYSRIYRAARSRILNPPSLSGKRFTTAHLITGSAGSSLCSLNPSLHEGHMHPGSPLFFNHVRIKLADSVLERKRISAARERKATKTLGIILGAFIVCWLPFFVVSLVLPICRDSCWIHPALFDFFTWLGYLNSLINPIIYTVFNEDFRQAFQKVVHFRKAS. The pIC50 is 6.4. (2) The compound is Cc1cc(/C=C2/S/C(=N/c3ccc(Cl)cc3)NC2=O)c(C)n1-c1cc(C(=O)O)cc(C(=O)O)c1. The target protein (P22188) has sequence MADRNLRDLLAPWVPDAPSRALREMTLDSRVAAAGDLFVAVVGHQADGRRYIPQAIAQGVAAIIAEAKDEATDGEIREMHGVPVIYLSQLNERLSALAGRFYHEPSDNLRLVGVTGTNGKTTTTQLLAQWSQLLGEISAVMGTVGNGLLGKVIPTENTTGSAVDVQHELAGLVDQGATFCAMEVSSHGLVQHRVAALKFAASVFTNLSRDHLDYHGDMEHYEAAKWLLYSEHHCGQAIINADDEVGRRWLAKLPDAVAVSMEDHINPNCHGRWLKATEVNYHDSGATIRFSSSWGDGEIESHLMGAFNVSNLLLALATLLALGYPLADLLKTAARLQPVCGRMEVFTAPGKPTVVVDYAHTPDALEKALQAARLHCAGKLWCVFGCGGDRDKGKRPLMGAIAEEFADVAVVTDDNPRTEEPRAIINDILAGMLDAGHAKVMEGRAEAVTCAVMQAKENDVVLVAGKGHEDYQIVGNQRLDYSDRVTVARLLGVIA. The pIC50 is 4.0. (3) The drug is CCOC(=O)c1c(NC(=O)c2cc(S(=O)(=O)N(CC)CC)ccc2Cl)sc2c1CCC2. 
The target protein sequence is MKLTIHEIAQVVGAKNDISIFEDTQLEKAEFDSRLIGTGDLFVPLKGARDGHDFIETAFENGAAVTLSEKEVSNHPYILVDDVLTAFQSLASYYLEKTTVDVFAVTGSNGKTTTKDMLAHLLSTRYKTYKTQGNYNNEIGLPYTVLHMPEGTEKLVLEMGQDHLGDIHLLSELARPKTAIVTLVGEAHLAFFKDRSEIAKGKMQIADGMASGSLLLAPADPIVEDYLPIDKKVVRFGQGAELEITDLVERKDSLTFKANFLEQALDLPVTGKYNATNAMIASYVALQEGVSEEQIRLAFQHLELTRNRTEWKKAANGADILSDVYNANPTAMKLILETFSAIPANEGGKKIAVLADMKELGDQSVQLHNQMILSLSPDVLDIVIFYGEDIAQLAQLASQMFPIGHVYYFKKTEDQDQFEDLVKQVKESLGAHDQILLKGSNSMNLAKLVESLENEDK. The pIC50 is 4.0. (4) The compound is CC(C)C[C@@H](NC(=O)N1CCOCC1)C(=O)N[C@@H](C=CS(=O)(=O)c1ccccc1)CCc1ccccc1. The target protein (Q61096) has sequence MSGSYPSPKGIHPFLLLALVVGGAVQASKIVGGHEARPHSRPYVASLQLSRFPGSHFCGGTLIHPRFVLTAAHCLQDISWQLVTVVLGAHDLLSSEPEQQKFTISQVFQNNYNPEENLNDVLLLQLNRTASLGKEVAVASLPQQDQTLSQGTQCLAMGWGRLGTQAPTPRVLQELNVTVVTFLCREHNVCTLVPRRAAGICFGDSGGPLICNGILHGVDSFVIRECASLQFPDFFARVSMYVDWIQNVLRGAEP. The pIC50 is 4.7. (5) The drug is CC(c1ccccc1)C(CS)C(=O)NC(COCc1ccccc1)C(=O)O. The target protein (P09470) has sequence MGAASGQRGRWPLSPPLLMLSLLVLLLQPSPAPALDPGLQPGNFSPDEAGAQLFAESYNSSAEVVMFQSTVASWAHDTNITEENARRQEEAALVSQEFAEVWGKKAKELYESIWQNFTDSKLRRIIGSIRTLGPANLPLAQRQQYNSLLSNMSRIYSTGKVCFPNKTATCWSLDPELTNILASSRSYAKLLFAWEGWHDAVGIPLKPLYQDFTAISNEAYRQDDFSDTGAFWRSWYESPSFEESLEHIYHQLEPLYLNLHAYVRRALHRRYGDKYVNLRGPIPAHLLGDMWAQSWENIYDMVVPFPDKPNLDVTSTMVQKGWNATHMFRVSEEFFTSLGLSPMPPEFWAESMLEKPTDGREVVCHASAWDFYNRKDFRIKQCTRVTMEQLATVHHEMGHVQYYLQYKDLHVSLRRGANPGFHEAIGDVLALSVSTPAHLHKIGLLDHVTNDIESDINYLLKMALEKIAFLPFGYLVDQWRWGVFSGRTPPSRYNFDWWYL.... The pIC50 is 7.4. (6) The small molecule is CCCCCc1c(C)nc(Nc2ccccc2)[nH]c1=O. The target protein (P35639) has sequence MAAESLPFTLETVSSWELEAWYEDLQEVLSSDEIGGTYISSPGNEEEESKTFTTLDPASLAWLTEEPGPTEVTRTSQSPRSPDSSQSSMAQEEEEEEQGRTRKRKQSGQCPARPGKQRMKEKEQENERKVAQLAEENERLKQEIERLTREVETTRRALIDRMVSLHQA. The pIC50 is 5.0.
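
The pIC50 definition used in the examples above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset. The function names are hypothetical, and it assumes the raw IC50 values are given in nanomolar, which the record itself does not state.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: pIC50 back to an IC50 in nanomolar."""
    return (10.0 ** -pic50) * 1e9  # M -> nM
```

Under this definition, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.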