Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=C1Nc2cc(Nc3cccc(NC(=O)c4cccc(C(F)(F)F)c4)c3)ccc2/C1=C/c1ccc[nH]1. The target protein (Q03142) has sequence MWLLLALLSIFQGTPALSLEASEEMEQEPCLAPILEQQEQVLTVALGQPVRLCCGRTERGRHWYKEGSRLASAGRVRGWRGRLEIASFLPEDAGRYLCLARGSMTVVHNLTLLMDDSLTSISNDEDPKTLSSSSSGHVYPQQAPYWTHPQRMEKKLHAVPAGNTVKFRCPAAGNPMPTIHWLKDGQAFHGENRIGGIRLRHQHWSLVMESVVPSDRGTYTCLVENSLGSIRYSYLLDVLERSPHRPILQAGLPANTTAVVGSDVELLCKVYSDAQPHIQWLKHVVINGSSFGADGFPYVQVLKTTDINSSEVEVLYLRNVSAEDAGEYTCLAGNSIGLSYQSAWLTVLPEEDLTWTTATPEARYTDIILYVSGSLVLLVLLLLAGVYHRQVIRGHYSRQPVTIQKLSRFPLARQFSLESRSSGKSSLSLVRGVRLSSSGPPLLTGLVNLDLPLDPLWEFPRDRLVLGKPLGEGCFGQVVRAEAFGMDPSRPDQTSTVAVK.... The pIC50 is 5.3. (2) The small molecule is CCS(=O)(=O)CCN(C(=O)Cc1ccc(F)c(C(F)(F)F)c1)[C@H](C)c1nc2ncccc2c(=O)n1-c1ccc(OCC(F)(F)F)cc1. The target protein (Q5KSK8) has sequence MVPEMSERQVFQASELTYLLENCSSSYDYAENESDSCCASPPCPQDISLNFDRAFLPALYGLLFLLGLLGNGAVAAVLCSQRAARTSTDTFLLHLAVADMLLVLTLPLWRVDTAVQWVFGSGLCKVAGALFNINFYAGALLLACISFDRYLSIVHATQPYRRGPPARVTLTCVVVWGLCLFFAIPDFIFLSANRDERLNAMHCRYNFPQVGRTALRGLQLVAGFLLPLLVMAYCYARILAVLLVSRGQRRQRRMRLVVVVVVAFALCWTPYHLVVLVDTLMDLGALDRNCGRESRVDVAKSVTSGLGYMHCCLNPLLYAFVGVKFRERMWMLLLRLGCPDHRGHQRHPTLSRRESSWSETPSTPR. The pIC50 is 8.3. (3) The small molecule is CN1CCN(Cc2ccc(NC(=O)Nc3ccc(Oc4ccnc(N)n4)cc3)cc2C(F)(F)F)CC1. The target protein sequence is MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRXDTETEGVPSTAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFLHSHRVLHRDLKPQNLLINTEGAIKLCDFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQDVTKPVPHLRL. The pIC50 is 6.1. 
(4) The compound is Cc1ccc(NC(=O)c2ccc(CN3CCN(C)CC3)c(C(F)(F)F)c2)cc1C#Cc1nn(C(C)C)c2ncnc(N)c12. The target protein sequence is MGSNKSKPKDASQRRRSLEPAENVHGAGGGAFPASQTPSKPASADGHRGPSAAFAPAAAEPKLFGGFNSSDTVTSPQRAGPLAGGVTTFVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLSTGQTGYIPSNYVAPSDSIQAEEWYFGKITRRESERLLLNAENPRGTFLVRESETTKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRTQFNSLQQLVAYYSKHADGLCHRLTTVCPTSKPQTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMSKGSLLDFLKGETGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQ.... The pIC50 is 8.2. (5) The pIC50 is 6.2. The compound is C(=C/c1cc2ccc(C3=NCCN3)cc2[nH]1)\c1cc2ccc(C3=NCCN3)cc2[nH]1. The target protein (P9WJ60) has sequence MNLVSEKEFLDLPLVSVAEIVRCRGPKVSVFPFDGTRRWFHLECNPQYDDYQQAALRQSIRILKMLFEHGIETVISPIFSDDLLDRGDRYIVQALEGMALLANDEEILSFYKEHEVHVLFYGDYKKRLPSTAQGAAVVKSFDDLTISTSSNTEHRLCFGVFGNDAAESVAQFSISWNETHGKPPTRREIIEGYYGEYVDKADMFIGFGRFSTFDFPLLSSGKTSLYFTVAPSYYMTETTLRRILYDHIYLRHFRPKPDYSAMSADQLNVLRNRYRAQPDRVFGVGCVHDGIWFAEG. (6) The drug is CC[C@H](C)[C@@H]1NC(=O)[C@H](Cc2ccc(O)cc2)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]2CCCN2C(=O)[C@@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CCCNC(=N)N)NC(C)=O)CSSC[C@@H](C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CCCNC(=N)N)C(N)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]2CCCN2C(=O)[C@H](C)NC(=O)[C@H](Cc2ccc(O)cc2)NC(=O)[C@H](CO)NC1=O. The target protein (P01116) has sequence MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHHYREQIKRVKDSEDVPMVLVGNKCDLPSRTVDTKQAQDLARSYGIPFIETSAKTRQRVEDAFYTLVREIRQYRLKKISKEEKTPGCVKIKKCIIM. The pIC50 is 4.2. (7) The drug is C[C@@]1(CSc2nc[nH]n2)S[C@@H]2[C@H](Br)C(=O)N2[C@H]1C(=O)O. 
The target protein (P30897) has sequence MNVRQHKASFFSVVITFLCLTLSLNANATDSVLEAVTNAETELGARIGLAAHDLETGKRWEHKSNERFPLSSTFKTLACANVLQRVDLGKERIDRVVRFSESNLVTYSPVTEKHVGKKGMSLAELCQATLSTSDNSAANFILQAIGGPKALTKFLRSIGDDTTRLDRWEPELNEAVPGDKRDTTTPIAMVTTLEKLLIDETLSIKSRQQLESWLKGNEVGDALFRKGVPSDWIVADRTGAGGYGSRAITAVMWPPNRKPIVAALYITETDASFEERNAVIAKIGEQIAKTVLMENSRN. The pIC50 is 8.1.
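
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here, and the nanomolar conversion assumes the usual 1 M = 1e9 nM scaling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (hypothetical helper)."""
    # IC50 in M is 10**(-pIC50); multiplying by 1e9 gives nM.
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 (1e-6 M) corresponds to a pIC50 of 6.
    print(ic50_to_pic50(1e-6))
    # Labels taken from examples (1) and (2) above, converted back to nM.
    for label in (5.3, 8.3):
        print(label, round(pic50_to_ic50_nm(label), 1))
```

Under this definition, a pIC50 of 5.3 corresponds to an IC50 of roughly 5 µM and a pIC50 of 8.3 to roughly 5 nM, which is why higher values mean more potent binding.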