Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is Cc1ccc(Sc2ccc(Nc3cc(S(=O)(=O)[O-])c(N)c4c3C(=O)c3ccccc3C4=O)cc2)cc1C.[Na+]. The target protein (Q15077) has sequence MEWDNGTGQALGLPPTTCVYRENFKQLLLPPVYSAVLAAGLPLNICVITQICTSRRALTRTAVYTLNLALADLLYACSLPLLIYNYAQGDHWPFGDFACRLVRFLFYANLHGSILFLTCISFQRYLGICHPLAPWHKRGGRRAAWLVCVAVWLAVTTQCLPTAIFAATGIQRNRTVCYDLSPPALATHYMPYGMALTVIGFLLPFAALLACYCLLACRLCRQDGPAEPVAQERRGKAARMAVVVAAAFAISFLPFHITKTAYLAVRSTPGVPCTVLEAFAAAYKGTRPFASANSVLDPILFYFTQKKFRRRPHELLQKLTAKWQRQGR. The pIC50 is 5.4. (2) The drug is O=C1c2ccc(O)cc2CCC1Cc1ccc(O)cc1. The target protein (Q09128) has sequence MSCPIDKRRTLIAFLRRLRDLGQPPRSVTSKASASRAPKEVPLCPLMTDGETRNVTSLPGPTNWPLLGSLLEIFWKGGLKKQHDTLAEYHKKYGQIFRMKLGSFDSVHLGSPSLLEALYRTESAHPQRLEIKPWKAYRDHRNEAYGLMILEGQEWQRVRSAFQKKLMKPVEIMKLDKKINEVLADFLERMDELCDERGRIPDLYSELNKWSFESICLVLYEKRFGLLQKETEEEALTFITAIKTMMSTFGKMMVTPVELHKRLNTKVWQAHTLAWDTIFKSVKPCIDNRLQRYSQQPGADFLCDIYQQDHLSKKELYAAVTELQLAAVETTANSLMWILYNLSRNPQAQRRLLQEVQSVLPDNQTPRAEDLRNMPYLKACLKESMRLTPSVPFTTRTLDKPTVLGEYALPKGTVLTLNTQVLGSSEDNFEDSHKFRPERWLQKEKKINPFAHLPFGIGKRMCIGRRLAELQLHLALCWIIQKYDIVATDNEPVEMLHLGI.... The pIC50 is 5.0. (3) The compound is Cn1nnnc1C(/C=C/[C@@H]1C[C@@H](O)CC(=O)O1)=C(c1ccc(F)cc1)c1ccc(F)cc1. 
The target protein (P17425) has sequence MPGSLPLNAEACWPKDVGIVALEIYFPSQYVDQAELEKYDGVDAGKYTIGLGQARMGFCTDREDINSLCLTVVQKLMERNSLSYDCIGRLEVGTETIIDKSKSVKSNLMQLFEESGNTDIEGIDTTNACYGGTAAVFNAVNWIESSSWDGRYALVVAGDIAIYASGNARPTGGVGAVALLIGPNAPVIFDRGLRGTHMQHAYDFYKPDMLSEYPVVDGKLSIQCYLSALDRCYSVYRKKIRAQWQKEGKDKDFTLNDFGFMIFHSPYCKLVQKSLARMFLNDFLNDQNRDKNSIYSGLEAFGDVKLEDTYFDRDVEKAFMKASAELFNQKTKASLLVSNQNGNMYTSSVYGSLASVLAQYSPQQLAGKRIGVFSYGSGLAATLYSLKVTQDATPGSALDKITASLCDLKSRLDSRTCVAPDVFAENMKLREDTHHLANYIPQCSIDSLFEGTWYLVRVDEKHRRTYARRPSTNDHSLDEGVGLVHSNTATEHIPSPAKKV.... The pIC50 is 7.4. (4) The small molecule is N=C(N)N1CCCC(C[C@@H](C=O)NC(=O)CN2CCCC[C@H](NS(=O)(=O)Cc3ccccc3)C2=O)C1. The target protein (Q29463) has sequence MHPLLILAFVGAAVAFPSDDDDKIVGGYTCAENSVPYQVSLNAGYHFCGGSLINDQWVVSAAHCYQYHIQVRLGEYNIDVLEGGEQFIDASKIIRHPKYSSWTLDNDILLIKLSTPAVINARVSTLLLPSACASAGTECLISGWGNTLSSGVNYPDLLQCLVAPLLSHADCEASYPGQITNNMICAGFLEGGKDSCQGDSGGPVACNGQLQGIVSWGYGCAQKGKPGVYTKVCNYVDWIQETIAANS. The pIC50 is 4.8. (5) The small molecule is Nc1cncc(-c2cn3ccnc3c(Nc3ccc(N4CCN(CC(CO)CO)CC4)cc3)n2)n1. The target protein (P08962) has sequence MAVEGGMKCVKFLLYVLLLAFCACAVGLIAVGVGAQLVLSQTIIQGATPGSLLPVVIIAVGVFLFLVAFVGCCGACKENYCLMITFAIFLSLIMLVEVAAAIAGYVFRDKVMSEFNNNFRQQMENYPKNNHTASILDRMQADFKCCGAANYTDWEKIPSMSKNRVPDSCCINVTVGCGINFNEKAIHKEGCVEKIGGWLRKNVLVVAAAALGIAFVEVLGIVFACCLVKSIRSGYEVM. The pIC50 is 6.9. (6) The compound is CCCCCCCCCCCCCCCCCCOc1ccc(N2CCN(C(=O)c3ccc(Cc4nc(=O)o[nH]4)cc3)CC2)cc1. The target protein (P14555) has sequence MKTLLLLAVIMIFGLLQAHGNLVNFHRMIKLTTGKEAALSYGFYGCHCGVGGRGSPKDATDRCCVTHDCCYKRLEKRGCGTKFLSYKFSNSGSRITCAKQDSCRSQLCECDKAAATCFARNKTTYNKKYQYYSNKHCRGSTPRC. The pIC50 is 7.2.
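
The pIC50 definition in the header (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the code assumes IC50 values are supplied in mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and return the IC50 in nanomolar.

    IC50 [M] = 10 ** (-pIC50), and 1 M = 1e9 nM, so IC50 [nM] = 10 ** (9 - pIC50).
    """
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 values listed for records (1) to (6) above.
    for pic50 in (5.4, 5.0, 7.4, 4.8, 6.9, 7.2):
        print(f"pIC50 {pic50:.1f} -> IC50 of about {pic50_to_ic50_nm(pic50):.1f} nM")
```

Under this definition, the pIC50 of 7.4 in record (3) corresponds to an IC50 of roughly 40 nM, while the pIC50 of 5.0 in record (2) corresponds to 10 µM, so each unit increase in pIC50 reflects a tenfold gain in potency.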