Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is C[C@H](CCC(=O)OC(C)(C)C)C1CC[C@H]2C3CC[C@@H]4C[C@H](OC(=O)CCC(=O)O)CC[C@]4(C)C3CC[C@]12C. The target protein (Q11205) has sequence MKCSLRVWFLSMAFLLVFIMSLLFTYSHHSMATLPYLDSGTLGGTHRVKLVPGYTGQQRLVKEGLSGKSCTCSRCMGDAGTSEWFDSHFDSNISPVWTRDNMNLTPDVQRWWMMLQPQFKSHNTNEVLEKLFQIVPGENPYRFRDPQQCRRCAVVGNSGNLRGSGYGQEVDSHNFIMRMNQAPTVGFEKDVGSRTTHHFMYPESAKNLPANVSFVLVPFKALDLMWIASALSTGQIRFTYAPVKSFLRVDKEKVQIYNPAFFKYIHDRWTEHHGRYPSTGMLVLFFALHVCDEVNVYGFGADSRGNWHHYWENNRYAGEFRKTGVHDADFEAHIIDILAKASKIEVYRGN. The pIC50 is 4.9. (2) The compound is NC(CCSCCCCC(=O)O)P(=O)(O)O. The target protein (Q93088) has sequence MPPVGGKKAKKGILERLNAGEIVIGDGGFVFALEKRGYVKAGPWTPEAAVEHPEAVRQLHREFLRAGSNVMQTFTFYASEDKLENRGNYVLEKISGQEVNEAACDIARQVADEGDALVAGGVSQTPSYLSCKSETEVKKVFLQQLEVFMKKNVDFLIAEYFEHVEEAVWAVETLIASGKPVAATMCIGPEGDLHGVPPGECAVRLVKAGASIIGVNCHFDPTISLKTVKLMKEGLEAARLKAHLMSQPLAYHTPDCNKQGFIDLPEFPFGLEPRVATRWDIQKYAREAYNLGVRYIGGCCGFEPYHIRAIAEELAPERGFLPPASEKHGSWGSGLDMHTKPWVRARARKEYWENLRIASGRPYNPSMSKPDGWGVTKGTAELMQQKEATTEQQLKELFEKQKFKSQ. The pIC50 is 3.7. (3) The drug is COc1cc(OC)c(C(=O)/C=C/c2cccc([N+](=O)[O-])c2)c(OC)c1. The target protein (Q86VQ6) has sequence MERSPPQSPGPGKAGDAPNRRSGHVRGARVLSPPGRRARLSSPGPSRSSEAREELRRHLVGLIERSRVVIFSKSYCPHSTRVKELFSSLGVECNVLELDQVDDGARVQEVLSEITNQKTVPNIFVNKVHVGGCDQTFQAYQSGLLQKLLQEDLAYDYDLIIIGGGSGGLSCAKEAAILGKKVMVLDFVVPSPQGTSWGLGGTCVNVGCIPKKLMHQAALLGQALCDSRKFGWEYNQQVRHNWETMTKAIQNHISSLNWGYRLSLREKAVAYVNSYGEFVEHHKIKATNKKGQETYYTAAQFVIATGERPRYLGIQGDKEYCITSDDLFSLPYCPGKTLVVGASYVALECAGFLAGFGLDVTVMVRSILLRGFDQEMAEKVGSYMEQHGVKFLRKFIPVMVQQLEKGSPGKLKVLAKSTEGTETIEGVYNTVLLAIGRDSCTRKIGLEKIGVKINEKSGKIPVNDVEQTNVPYVYAVGDILEDKPELTPVAIQSGKLLAQR.... The pIC50 is 4.9. 
(4) The compound is NC(=O)[C@@H]1CCCN1C(=O)[C@H](Cc1nc[nH]c1Br)NC(=O)c1cnccn1. The target protein (P21761) has sequence MENDTVSEMNQTELQPQAAVALEYQVVTILLVVIICGLGIVGNIMVVLVVMRTKHMRTPTNCYLVSLAVADLMVLVAAGLPNITDSIYGSWVYGYVGCLCITYLQYLGINASSCSITAFTIERYIAICHPIKAQFLCTFSRAKKIIIFVWAFTSIYCMLWFFLLDLNISTYKNAVVVSCGYKISRNYYSPIYLMDFGVFYVVPMILATVLYGFIARILFLNPIPSDPKENSKMWKNDSIHQNKNLNLNATNRCFNSTVSSRKQVTKMLAVVVILFALLWMPYRTLVVVNSFLSSPFQENWFLLFCRICIYLNSAINPVIYNLMSQKFRAAFRKLCNCKQKPTEKAANYSVALNYSVIKESDRFSTELEDITVTDTYVSTTKVSFDDTCLASEN. The pIC50 is 4.3. (5) The drug is CCCCCCCCCCCCC/C=C/[C@@H](O)[C@H](COC(=O)NCc1cccnc1)NC(=O)C(C)(C)C. The target protein (Q9ET64) has sequence MKHNFSLRLRVFNLNCWDIPYLSKHRADRMKRLGDFLNLESFDLALLEEVWSEQDFQYLKQKLSLTYPDAHYFRSGIIGSGLCVFSRHPIQEIVQHVYTLNGYPYKFYHGDWFCGKAVGLLVLHLSGLVLNAYVTHLHAEYSRQKDIYFAHRVAQAWELAQFIHHTSKKANVVLLCGDLNMHPKDLGCCLLKEWTGLRDAFVETEDFKGSEDGCTMVPKNCYVSQQDLGPFPFGVRIDYVLYKAVSGFHICCKTLKTTTGCDPHNGTPFSDHEALMATLCVKHSPPQEDPCSAHGSAERSALISALREARTELGRGIAQARWWAALFGYVMILGLSLLVLLCVLAAGEEAREVAIMLWTPSVGLVLGAGAVYLFHKQEAKSLCRAQAEIQHVLTRTTETQDLGSEPHPTHCRQQEADRAEEK. The pIC50 is 5.7.
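The pIC50 definition in the task description (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The helper names below (`nm_to_molar`, `ic50_to_pic50`, `pic50_to_ic50`) are hypothetical and are not part of BindingDB or of the bindingdb_ic50 dataset. The sketch also assumes that a raw IC50 might be reported in nanomolar, which the text above does not state.

```python
import math


def nm_to_molar(ic50_nm: float) -> float:
    """Convert a concentration from nanomolar to molar (assumed input unit)."""
    return ic50_nm * 1e-9


def ic50_to_pic50(ic50_molar: float) -> float:
    """Apply pIC50 = -log10(IC50 in M); a higher value means a more potent compound."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in molar units")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 10,000 nM (10 micromolar) corresponds to a pIC50 of about 5.
    print(ic50_to_pic50(nm_to_molar(10_000)))
    # A pIC50 of 4.9, as in examples (1) and (3), converted back to molar IC50.
    print(pic50_to_ic50(4.9))
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50.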