Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is c1cc2oc(-c3cnc4ccc(OCC[C@@H]5CCCN5)nn34)cc2cn1. The target protein sequence is MVSSQKLEKPIEMGSSEPLPIADGDRRRKKKRRGRATDSLPGKFEDMYKLTSELLGEGAYAKVQGAVSLQNGKEYAVKIIEKQAGHSRSRVFREVETLYQCQGNKNILELIEFFEDDTRFYLVFEKLQGGSILAHIQKQKHFNEREASRVVRDVAAALDFLHTKGIAHRDLKPENILCESPEKVSPVKICDFDLGSGMKLNNSCTPITTPELTTPCGSAEYMAPEVVEVFTDQATFYDKRCDLWSLGVVLYIMLSGYPPFVGHCGADCGWDRGEVCRVCQNKLFESIQEGKYEFPDKDWAHISSEAKDLISKLLVRDAKQRLSAAQVLQHPWVQGQAPEKGLPDPQVLQRNSSTMDLTLFAAEAIALNRQLSQHEENELAEEPEALADGLCSMKLSPPCKSRLARRRALAQAGRGEDRSPPTAL. The pIC50 is 6.9. (2) The pIC50 is 7.8. The compound is CCOC(=O)N[C@H](Cc1cccc2ccccc12)C(=O)N(C)[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(=O)OC)C(C)C. The target protein (P04229) has sequence MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVRLLERCIYNQEESVRFDSDVGEYRAVTELGRPDAEYWNSQKDLLEQRRAAVDTYCRHNYGVGESFTVQRRVEPKVTVYPSKTQPLQHHNLLVCSVSGFYPGSIEVRWFRNGQEEKAGVVSTGLIQNGDWTFQTLVMLETVPRSGEVYTCQVEHPSVTSPLTVEWRARSESAQSKMLSGVGGFVLGLLFLGAGLFIYFRNQKGHSGLQPTGFLS. (3) The target protein (P55899) has sequence MGVPRPQPWALGLLLFLLPGSLGAESHLSLLYHLTAVSSPAPGTPAFWVSGWLGPQQYLSYNSLRGEAEPCGAWVWENQVSWYWEKETTDLRIKEKLFLEAFKALGGKGPYTLQGLLGCELGPDNTSVPTAKFALNGEEFMNFDLKQGTWGGDWPEALAISQRWQQQDKAANKELTFLLFSCPHRLREHLERGRGNLEWKEPPSMRLKARPSSPGFSVLTCSAFSFYPPELQLRFLRNGLAAGTGQGDFGPNSDGSFHASSSLTVKSGDEHHYCCIVQHAGLAQPLRVELESPAKSSVLVVGIVIGVLLLTAAAVGGALLWRRMRSGLPAPWISLRGDDTGVLLPTPGEAQDADLKDVNVIPATA. The pIC50 is 8.8. 
The small molecule is CC(C)C[C@H]1C(=O)N[C@@H](Cc2ccc(O)cc2)C(=O)C(=O)N2CCC[C@H]2C(=O)N[C@H](C(N)=O)CSSC(C)(C)[C@H](NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](CCCNC(=N)N)NCCNC(=O)CCC(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](Cc2ccccc2)C(=O)N[C@@H]2C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](Cc3cnc[nH]3)C(=O)N[C@@H](Cc3ccccc3)C(=O)NCC(=O)N(C)CC(=O)N(C)[C@@H](CC(C)C)C(=O)N[C@@H](Cc3ccc(O)cc3)C(=O)C(=O)N3CCC[C@H]3C(=O)N[C@H](C(N)=O)CSSC2(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](Cc2cnc[nH]2)C(=O)N[C@@H](Cc2ccccc2)C(=O)NCC(=O)N(C)CC(=O)N1C. (4) The drug is O=c1[nH]c(-c2ccc(C(F)(F)F)cc2)nc2c1CSCC2. The target protein (Q2NL67) has sequence MDIKGQFWNDDDSEGDNESEEFLYGVQGSCAADLYRHPQLDADIEAVKEIYSENSVSIREYGTIDDVDIDLHINISFLDEEVSTAWKVLRTEPIVLRLRFSLSQYLDGPEPSIEVFQPSNKEGFGLGLQLKKILGMFTSQQWKHLSNDFLKTQQEKRHSWFKASGTIKKFRAGLSIFSPIPKSPSFPIIQDSMLKGKLGVPELRVGRLMNRSISCTMKNPKVEVFGYPPSPQAGLLCPQHVGLPPPARTSPLVSGHCKNIPTLEYGFLVQIMKYAEQRIPTLNEYCVVCDEQHVFQNGSMLKPAVCTRELCVFSFYTLGVMSGAAEEVATGAEVVDLLVAMCRAALESPRKSIIFEPYPSVVDPTDPKTLAFNPKKKNYERLQKALDSVMSIREMTQGSYLEIKKQMDKLDPLAHPLLQWIISSNRSHIVKLPLSRLKFMHTSHQFLLLSSPPAKEARFRTAKKLYGSTFAFHGSHIENWHSILRNGLVNASYTKLQLHG.... The pIC50 is 5.0. (5) The compound is COc1cc2c(Nc3ccc(Cl)c(Cl)c3F)ncnc2cc1OC[C@@H]1CN2CCC[C@H]2CO1. The target protein (P54760) has sequence MELRVLLCWASLAAALEETLLNTKLETADLKWVTFPQVDGQWEELSGLDEEQHSVRTYEVCDVQRAPGQAHWLRTGWVPRRGAVHVYATLRFTMLECLSLPRAGRSCKETFTVFYYESDADTATALTPAWMENPYIKVDTVAAEHLTRKRPGAEATGKVNVKTLRLGPLSKAGFYLAFQDQGACMALLSLHLFYKKCAQLTVNLTRFPETVPRELVVPVAGSCVVDAVPAPGPSPSLYCREDGQWAEQPVTGCSCAPGFEAAEGNTKCRACAQGTFKPLSGEGSCQPCPANSHSNTIGSAVCQCRVGYFRARTDPRGAPCTTPPSAPRSVVSRLNGSSLHLEWSAPLESGGREDLTYALRCRECRPGGSCAPCGGDLTFDPGPRDLVEPWVVVRGLRPDFTYTFEVTALNGVSSLATGPVPFEPVNVTTDREVPPAVSDIRVTRSSPSSLSLAWAVPRAPSGAVLDYEVKYHEKGAEGPSSVRFLKTSENRAELRGLKRG.... The pIC50 is 5.0. (6) The drug is O=C(Nc1ccc(Oc2ncnc3n[nH]cc23)cc1)Nc1ccc(C(F)(F)F)cc1Cl. The pIC50 is 5.1. 
The target protein (Q5GIT4) has sequence MAKTSYALLLLDILLTFNVAKAIELRFVPDPPTLNITEKTIKINASDTLQITCRGRQILEWSTPHNRTSSETRLTISDCSGDGLFCSTLTLSKAVANETGEYRCFYKSLPKEDGKTSVAVYVFIQDYRTPFVRIAQDYDVVFIREGEQVVIPCLVSVEDLNVTLYTKYPVKELSTDGKEVIWDSRRGFILPSRVVSYAGVVYCQTTIRNETFQSSPYIVAVVGYKIYDLTLSPQHERLTVGERLILNCTAHTELNVGIDFQWTFPHEKRSVNGSMSTSRYKTSSNKKKLWNSLELSNTLTVENVTLNDTGEYICTASSGQMQKIAQASLIVYEKPFIALSDQLWQTVEAKAGDAEAKILVKYYAYPEPAVRWYKNDQLIVLRDEYRMKFYRGVHLTIYGVTEKDAGNYTVVMTNKITKEEQRRTFQLVVNDLPRIFEKDVSLDRDVHMYGSSPTLTCTASGGSSPVTIKWQWMPREDCPVRFLPKSDTRMAKCDKWREMS....
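Note on the label scale: the task statement defines pIC50 = -log10(IC50 in M). The sketch below only illustrates that conversion. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced for this note, not part of BindingDB or any dataset loader, and the sketch assumes the IC50 value is already expressed in mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 100 nM inhibitor is 1e-7 mol/L
    print(ic50_to_pic50(1e-7))
    # The pIC50 of 5.0 reported in records (4) and (5), expressed in nM
    print(pic50_to_ic50_nm(5.0))
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: a pIC50 of 5.0 corresponds to an IC50 of 10 µM, and a pIC50 of 8.0 corresponds to 10 nM.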