From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Cc1c(NC(=O)c2nccs2)cccc1-c1ccc(C(N)=O)c2[nH]c(-c3cnn(C)c3)cc12. The target protein sequence is GSWEIDPKDLTFLKELGTGQFGVVKYGKWRGQYDVAIKMIKEGSMSEDEFIEEAKVMMNLSHEKLVQLYGVCTKQRPIFIITEYMANGCLLNYLREMRHRFQTQQLLEMCKDVCEAMEYLESKQFLHRDLAARNCLVNDQGVVKVSDFGLSRYVLDDEYTSSVGSKFPVRWSPPEVLMYSKFSSKSDIWAFGVLMWEIYSLGKMPYERFTNSETAEHIAQGLRLYRPHLASEKVYTIMYSCWHEKADERPTFKILLSNILDVMDEES. The pIC50 is 7.0. (2) The small molecule is CC(C)[C@@H]1CC[C@@H](C)C[C@H]1NC(=O)Oc1ccc(-c2ccccc2)cc1. The target protein (Q8R455) has sequence MSFEGARLSMRSRRNGTLGSTRTLYSSVSRSTDVSYSESDLVNFIQANFKKRECVFFTRDSKAMESICKCGYAQSQHIEGTQINQNEKWNYKKHTKEFPTDAFGDIQFETLGKKGKYLRLSCDTDSETLYELLTQHWHLKTPNLVISVTGGAKNFALKPRMRKIFSRLIYIAQSKGAWILTGGTHYGLMKYIGEVVRDNTISRNSEENIVAIGIAAWGMVSNRDTLIRNCDDEGHFSAQYIMDDFMRDPLYILDNNHTHLLLVDNGCHGHPTVEAKLRNQLEKYISERTSQDSNYGGKIPIVCFAQGGGRETLKAINTSVKSKIPCVVVEGSGQIADVIASLVEVEDVLTSSMVKEKLVRFLPRTVSRLPEEEIESWIKWLKEILESPHLLTVIKMEEAGDEVVSSAISYALYKAFSTNEQDKDNWNGQLKLLLEWNQLDLASDEIFTNDRRWESADLQEVMFTALIKDRPKFVRLFLENGLNLQKFLTNEVLTELFSTH.... The pIC50 is 6.2. (3) The compound is Nc1c(S(=O)(=O)[O-])cc(Nc2ccc(Nc3nc(=O)[nH]c(=O)[nH]3)cc2)c2c1C(=O)c1ccccc1C2=O. The target protein (P49653) has sequence MVRRLARGCWSAFWDYETPKVIVVRNRRLGFVHRMVQLLILLYFVWYVFIVQKSYQDSETGPESSIITKVKGITMSEDKVWDVEEYVKPPEGGSVVSIITRIEVTPSQTLGTCPESMRVHSSTCHSDDDCIAGQLDMQGNGIRTGHCVPYYHGDSKTCEVSAWCPVEDGTSDNHFLGKMAPNFTILIKNSIHYPKFKFSKGNIASQKSDYLKHCTFDQDSDPYCPIFRLGFIVEKAGENFTELAHKGGVIGVIINWNCDLDLSESECNPKYSFRRLDPKYDPASSGYNFRFAKYYKINGTTTTRTLIKAYGIRIDVIVHGQAGKFSLIPTIINLATALTSIGVGSFLCDWILLTFMNKNKLYSHKKFDKVRTPKHPSSRWPVTLALVLGQIPPPPSHYSQDQPPSPPSGEGPTLGEGAELPLAVQSPRPCSISALTEQVVDTLGQHMGQRPPVPEPSQQDSTSTDPKGLAQL. The pIC50 is 6.7. 
(4) The small molecule is O=C(Nc1ccc(OC(F)(F)F)cc1)Nc1ccc(C(=O)NCCN2CCOCC2)cc1. The target protein (P34913) has sequence MTLRAAVFDLDGVLALPAVFGVLGRTEEALALPRGLLNDAFQKGGPEGATTRLMKGEITLSQWIPLMEENCRKCSETAKVCLPKNFSIKEIFDKAISARKINRPMLQAALMLRKKGFTTAILTNTWLDDRAERDGLAQLMCELKMHFDFLIESCQVGMVKPEPQIYKFLLDTLKASPSEVVFLDDIGANLKPARDLGMVTILVQDTDTALKELEKVTGIQLLNTPAPLPTSCNPSDMSHGYVTVKPRVRLHFVELGSGPAVCLCHGFPESWYSWRYQIPALAQAGYRVLAMDMKGYGESSAPPEIEEYCMEVLCKEMVTFLDKLGLSQAVFIGHDWGGMLVWYMALFYPERVRAVASLNTPFIPANPNMSPLESIKANPVFDYQLYFQEPGVAEAELEQNLSRTFKSLFRASDESVLSMHKVCEAGGLFVNSPEEPSLSRMVTEEEIQFYVQQFKKSGFRGPLNWYRNMERNWKWACKSLGRKILIPALMVTAEKDFVLV.... The pIC50 is 8.7. (5) The pIC50 is 5.0. The target protein (Q01726) has sequence MAVQGSQRRLLGSLNSTPTAIPQLGLAANQTGARCLEVSISDGLFLSLGLVSLVENALVVATIAKNRNLHSPMYCFICCLALSDLLVSGSNVLETAVILLLEAGALVARAAVLQQLDNVIDVITCSSMLSSLCFLGAIAVDRYISIFYALRYHSIVTLPRARRAVAAIWVASVVFSTLFIAYYDHVAVLLCLVVFFLAMLVLMAVLYVHMLARACQHAQGIARLHKRQRPVHQGFGLKGAVTLTILLGIFFLCWGPFFLHLTLIVLCPEHPTCGCIFKNFNLFLALIICNAIIDPLIYAFHSQELRRTLKEVLTCSW. The drug is CCCC[C@@H]1NC(=O)CC[C@@H](C(N)=O)NC(=O)[C@H](Cc2c[nH]c3ccccc23)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](Cc2ccccc2)NC(=O)[C@H](Cc2cnc[nH]2)NC1=O. (6) The small molecule is C/C=C/C[C@@H](C)[C@@H](O)[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N(C)CC(=O)N(C)[C@H](CC(C)C)C(=O)N[C@H](C(C)C)C(=O)N(C)[C@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@H](C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N1C. The target protein (P21462) has sequence METNSSLPTNISGGTPAVSAGYLFLDIITYLVFAVTFVLGVLGNGLVIWVAGFRMTHTVTTISYLNLAVADFCFTSTLPFFMVRKAMGGHWPFGWFLCKFVFTIVDINLFGSVFLIALIALDRCVCVLHPVWTQNHRTVSLAKKVIIGPWVMALLLTLPVIIRVTTVPGKTGTVACTFNFSPWTNDPKERINVAVAMLTVRGIIRFIIGFSAPMSIVAVSYGLIATKIHKQGLIKSSRPLRVLSFVAAAFFLCWSPYQVVALIATVRIRELLQGMYKEIGIAVDVTSALAFFNSCLNPMLYVFMGQDFRERLIHALPASLERALTEDSTQTSDTATNSTLPSAEVELQAK. The pIC50 is 5.0. (7) The drug is CO[C@@H](C[C@H](O)C(C)[C@H](O)C(C)CO)[C@H]1OC2(C[C@@H](O)[C@H](C)C(CCO)O2)C(C)(C)[C@H]1OP(=O)(O)O. 
The target protein (Q13522) has sequence MEQDNSPRKIQFTVPLLEPHLDPEAAEQIRRRRPTPATLVLTSDQSSPEIDEDRIPNPHLKSTLAMSPRQRKKMTRITPTMKELQMMVEHHLGQQQQGEEPEGAAESTETQESRPPGIPDTEVESRLGTSGTAKKTAECIPKTHERGSKEPSTKEPSTHIPPLDSKGANSV. The pIC50 is 5.0.
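The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here for illustration. They are not part of BindingDB or of any dataset tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A pIC50 of 7.0 corresponds to an IC50 of about 100 nM,
    # and a pIC50 of 5.0 to about 10,000 nM (10 uM).
    for p in (7.0, 5.0):
        print(f"pIC50 {p} corresponds to IC50 of about {pic50_to_ic50_nm(p):.0f} nM")
```

Because the scale is logarithmic, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent compounds.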