Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (P69332) has sequence MTPTTTTAELTTEFDYDEDATPCVFTDVLNQSKPVTLFLYGVVFLFGSIGNFLVIFTITWRRRIQCSGDVYFINLAAADLLFVCTLPLWMQYLLDHNSLASVPCTLLTACFYVAMFASLCFITEIALDRYYAIVYMRYRPVKQACLFSIFWWIFAVIIAIPHFMVVTKKDNQCMTDYDYLEVSYPIILNVELMLGAFVIPLSVISYCYYRISRIVAVSQSRHKGRIVRVLIAVVLVFIIFWLPYHLTLFVDTLKLLKWISSSCEFERSLKRALILTESLAFCHCCLNPLLYVFVGTKFRQELHCLLAEFRQRLFSRDVSWYHSMSFSRRSSPSRRETSSDTLSDEVCRVSQIIP. The pIC50 is 4.0. The small molecule is OC(CCCN1CCC(O)(c2ccc(Cl)cc2)CC1)c1ccccc1. (2) The drug is C/C(=N\Nc1nc(-c2cc3ccccc3oc2=O)cs1)c1ccc(Br)cc1. The target protein (Q3ZCJ2) has sequence MAASCILLHTGQKMPLIGLGTWKSDPGQVKAAIKYALSVGYRHIDCAAIYGNETEIGEALKENVGPGKLVPREELFVTSKLWNTKHHPEDVEPALRKTLADLQLEYLDLYLMHWPYAFERGDSPFPKNADGTIRYDSTHYKETWRALEALVAKGLVRALGLSNFNSRQIDDVLSVASVRPAVLQVECHPYLAQNELIAHCQARNLEVTAYSPLGSSDRAWRDPEEPVLLKEPVVLALAEKHGRSPAQILLRWQVQRKVSCIPKSVTPSRILENIQVFDFTFSPEEMKQLDALNKNLRFIVPMLTVDGKRVPRDAGHPLYPFNDPY. The pIC50 is 5.8. (3) The drug is N#CCC1(n2cc(-c3ncnc4[nH]ccc34)cn2)CN(C2CCN(C(=O)c3ccnc(C(F)(F)F)c3F)CC2)C1. The target protein (P30530) has sequence MAWRCPRMGRVPLAWCLALCGWACMAPRGTQAEESPFVGNPGNITGARGLTGTLRCQLQVQGEPPEVHWLRDGQILELADSTQTQVPLGEDEQDDWIVVSQLRITSLQLSDTGQYQCLVFLGHQTFVSQPGYVGLEGLPYFLEEPEDRTVAANTPFNLSCQAQGPPEPVDLLWLQDAVPLATAPGHGPQRSLHVPGLNKTSSFSCEAHNAKGVTTSRTATITVLPQQPRNLHLVSRQPTELEVAWTPGLSGIYPLTHCTLQAVLSDDGMGIQAGEPDPPEEPLTSQASVPPHQLRLGSLHPHTPYHIRVACTSSQGPSSWTHWLPVETPEGVPLGPPENISATRNGSQAFVHWQEPRAPLQGTLLGYRLAYQGQDTPEVLMDIGLRQEVTLELQGDGSVSNLTVCVAAYTAAGDGPWSLPVPLEAWRPGQAQPVHQLVKEPSTPAFSWPWWYVLLGAVVAAACVLILALFLVHRRKKETRYGEVFEPTVERGELVVRYRV.... The pIC50 is 6.5. 
(4) The target protein sequence is MSQFNQNNKQIDVMGIRKILPHRYPFALLDKIVDWSVEDRTIVAQKNVTINEDFFNGHFPDFPVMPGVLIVEAMAQATAILGELMAETLFAHVVEKAGGGRRTFMLAGIDKVRVKRPVVPGDVLVIESRMVKQKNIICTAESVAKVDGQIVCSAELMAAYKDY. The pIC50 is 5.1. The small molecule is COc1c(O)cc2oc3cc(O)c(CC=C(C)C)c(O)c3c(=O)c2c1CC=C(C)C. (5) The small molecule is CN(C)[C@@H]1C(=O)C(C(N)=O)C(=O)[C@@]2(O)C(=O)C3C(=O)c4c(O)cccc4[C@H](CSCCC(=O)O)C3[C@H](O)C12. The target protein (P0AEY8) has sequence MQNKLASGARLGRQALLFPLCLVLYEFSTYIGNDMIQPGMLAVVEQYQAGIDWVPTSMTAYLAGGMFLQWLLGPLSDRIGRRPVMLAGVVWFIVTCLAILLAQNIEQFTLLRFLQGISLCFIGAVGYAAIQESFEEAVCIKITALMANVALIAPLLGPLVGAAWIHVLPWEGMFVLFAALAAISFFGLQRAMPETATRIGEKLSLKELGRDYKLVLKNGRFVAGALALGFVSLPLLAWIAQSPIIIITGEQLSSYEYGLLQVPIFGALIAGNLLLARLTSRRTVRSLIIMGGWPIMIGLLVAAAATVISSHAYLWMTAGLSIYAFGIGLANAGLVRLTLFASDMSKGTVSAAMGMLQMLIFTVGIEISKHAWLNGGNGLFNLFNLVNGILWLSLMVIFLKDKQMGNSHEG. The pIC50 is 4.4. (6) The compound is CCCN(CCC)C(=O)[C@@H]1OC(C(=O)O)=C[C@H](N=C(N)N)[C@H]1NC(C)=O. The target protein (P16207) has sequence MLPSTIQTLTLFLTSGGVLLSLYVSASLSYLLYSDILLKFSPKITAPTMTLDCTNASNVQAVNRSATKEMTFLLPEPEWTYPRLSCQGSTFQKALLISPHRFGEARGNSAPLIIREPFIACGPKECKHFALTHYAAQPGGYYNGTREDRNKLRHLISVKLGKIPTVENSIFHMAAWSGSACHDGREWTYIGVDGPDSNALIKIKYGEAYTDTYHSYANNILRTQESACNCIGGDCYLMITDGSASGISKCRFLKIREGRIIKEIFPTGRVEHTEECTCGFASNKTIECACRDNNYTAKRPFVKLNVETDTAEIRLMCTETYLDTPRPDDGSITGPCESNGDKGRGGIKGGFVHQRMASKIGRWYSRTMSKTERMGMELYVKYDGDPWTDSDALDPSGVMVSIKEPGWYSFGFEIKDKKCDVPCIGIEMVHDGGKKTWHSAATAIYCLMGSGQLLWDTVTGVDMAL. The pIC50 is 6.3.
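The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the choice of nanomolar as the input unit is an assumption about how the raw measurements might be stored.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: pIC50 back to an IC50 in nanomolar."""
    return (10.0 ** -pic50) * 1e9  # M -> nM


if __name__ == "__main__":
    # pIC50 labels taken from examples (1) to (6) above
    for label in (4.0, 5.8, 6.5, 5.1, 4.4, 6.3):
        print(f"pIC50 {label} corresponds to IC50 of {pic50_to_ic50_nm(label):.1f} nM")
```

Under these assumptions, the label 4.0 in example (1) corresponds to an IC50 of about 100 µM, and the label 6.5 in example (3) to roughly 316 nM, which shows why a higher pIC50 means a more potent compound.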