This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CCOc1ccc(Cc2cc([C@@H]3OC(OC(C)C)[C@@H](O)[C@H](O)[C@H]3O)ccc2Cl)cc1. The target protein (Q923I7) has sequence MEQHVEAGSELGEQKVLIDNPADILVIAAYFLLVIGVGLWSMFRTNRGTVGGYFLAGRSMVWWPVGASLFASNIGSGHFVGLAGTGAASGLAVAGFEWNALFVVLLLGWLFVPVYLTAGVITMPQYLRKRFGGHRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNIYASVIALLGITMIYTVTGGLAALMYTDTVQTFVILAGAFILTGYAFHEVGGYSGLFDKYLGAMTSLTVSKDPSVGNISSTCYQPRPDSYHLLRDPVTGDLPWPALLLGLTIVSGWYWCSDQVIVQRCLAGKNLTHIKAGCILCGYLKLMPMFLMVMPGMISRILYPDEVACVVPEVCKRVCGTEVGCSNIAYPRLVVKLMPNGLRGLMLAVMLAALMSSLASIFNSSSTLFTMDIYTRLRPRAGDKELLLVGRLWVVFIVAVSVAWLPVVQAAQGGQLFDYIQSVSSYLAPPVSAVFVLALFVPRVNEKGAFWGLVGGLLMGLARLIP.... The pIC50 is 6.7. (2) The compound is O=C(NC(c1ccccc1)c1ccccc1)C1CCCN1C(=[SH+])[N-]Nc1ccccc1[N+](=O)[O-]. The target protein (P30547) has sequence MDNVLPVDSDLFPNISTNTSEPNQFVQPAWQIVLWAAAYTVIVVTSVVGNVVVMWIILAHKRMRTVTNYFLVNLAFAEASMAAFNTVVNFTYAVHNEWYYGLFYCKFHNFFPIAAVFASIYSMTAVAFDRYMAIIHPLQPRLSATATKVVICVIWVLALLLAFPQGYYSTTETMPGRVVCMIEWPSHPDKIYEKVYHICVTVLIYFLPLLVIGYAYTVVGITLWASEIPGDSSDRYHEQVSAKRKVVKMMIVVVCTFAICWLPFHIFFLLPYINPDLYLKKFIQQVYLAIMWLAMSSTMYNPIIYCCLNDRFRLGFKHAFRCCPFISAADYEGLEMKSTRYFQTQGSVYKVSRLETTISTVVGAHEEDPEEGPKATPSSLDLTSNGSSRSNSKTVTESSSFYSNMLS. The pIC50 is 5.8. (3) The compound is CC(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@H](Cc1ccccc1)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)NCCCOCCOCCOCCCNC(=O)COCC(N)=O.
The target protein (Q01727) has sequence MSTQEPQKSLLGSLNSNATSHLGLATNQSEPWCLYVSIPDGLFLSLGLVSLVENVLVVIAITKNRNLHSPMYYFICCLALSDLMVSVSIVLETTIILLLEAGILVARVALVQQLDNLIDVLICGSMVSSLCFLGIIAIDRYISIFYALRYHSIVTLPRARRAVVGIWMVSIVSSTLFITYYKHTAVLLCLVTFFLAMLALMAILYAHMFTRACQHAQGIAQLHKRRRSIRQGFCLKGAATLTILLGIFFLCWGPFFLHLLLIVLCPQHPTCSCIFKNFNLFLLLIVLSSTVDPLIYAFRSQELRMTLKEVLLCSW. The pIC50 is 6.8. (4) The target protein (P33534) has sequence MESPIQIFRGDPGPTCSPSACLLPNSSSWFPNWAESDSNGSVGSEDQQLESAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSAVYLMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLASSVGISAIVLGGTKVREDVDVIECSLQFPDDEYSWWDLFMKICVFVFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITKLVLVVVAVFIICWTPIHIFILVEALGSTSHSTAALSSYYFCIALGYTNSSLNPVLYAFLDENFKRCFRDFCFPIKMRMERQSTNRVRNTVQDPASMRDVGGMNKPV. The pIC50 is 7.0. The small molecule is CN(C(=O)Cc1ccc(Cl)c(Cl)c1)C(CN1CCCC1)c1ccc(NC(=O)CNC(=S)N=C2C=CC(=C3c4ccc(O)cc4Oc4cc(O)ccc43)C(C(=O)O)=C2)cc1. (5) The small molecule is O=C1C(=O)N(Cc2cccnc2)C(c2cccc(F)c2)C1C(=O)c1ccc2c(c1)OCO2. The target protein sequence is MSGSTQPVAQTWRATEPRYPPHSLSYPVQIARTHTDVGLLEYQHHSRDYASHLSPGSIIQPQRRRPSLLSEFQPGNERSQELHLRPESHSYLPELGKSEMEFIESKRPRLELLPDPLLRPSPLLATGQPAGSEDLTKDRSLTGKLEPVSPPSPPHTDPELELVPPRLSKEELIQNMDRVDREITMVEQQISKLKKKQQQLEEEAAKPPEPEKPVSPPPIESKHRSLVQIIYDENRKKAEAAHRILEGLGPQVELPLYNQPSDTRQYHENIKINQAMRKKLILYFKRRNHARKQWEQKFCQRYDQLMEAWEKKVERIENNPRRRAKESKVREYYEKQFPEIRKQRELQERMQSRVGQRGSGLSMSAARSEHEVSEIIDGLSEQENLEKQMRQLAVIPPMLYDADQQRIKFINMNGLMADPMKVYKDRQVMNMWSEQEKETFREKFMQHPKNFGLIASFLERKTVAECVLYYYLTKKNENYKSLVRRSYRRRGKSQQQQQQQ.... The pIC50 is 5.3. (6) The small molecule is Oc1cnn(-c2cccc(C(F)(F)F)c2)c1. 
The target protein (P34913) has sequence MTLRAAVFDLDGVLALPAVFGVLGRTEEALALPRGLLNDAFQKGGPEGATTRLMKGEITLSQWIPLMEENCRKCSETAKVCLPKNFSIKEIFDKAISARKINRPMLQAALMLRKKGFTTAILTNTWLDDRAERDGLAQLMCELKMHFDFLIESCQVGMVKPEPQIYKFLLDTLKASPSEVVFLDDIGANLKPARDLGMVTILVQDTDTALKELEKVTGIQLLNTPAPLPTSCNPSDMSHGYVTVKPRVRLHFVELGSGPAVCLCHGFPESWYSWRYQIPALAQAGYRVLAMDMKGYGESSAPPEIEEYCMEVLCKEMVTFLDKLGLSQAVFIGHDWGGMLVWYMALFYPERVRAVASLNTPFIPANPNMSPLESIKANPVFDYQLYFQEPGVAEAELEQNLSRTFKSLFRASDESVLSMHKVCEAGGLFVNSPEEPSLSRMVTEEEIQFYVQQFKKSGFRGPLNWYRNMERNWKWACKSLGRKILIPALMVTAEKDFVLV.... The pIC50 is 4.3.
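The pIC50 definition given in the task description, pIC50 = -log10(IC50 in M), can be illustrated with a minimal Python sketch. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced only for this illustration and are not part of the dataset or of BindingDB; the sketch assumes IC50 is supplied in molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of about 6.
    print(round(ic50_to_pic50(1e-6), 3))
    # Record (1) above reports pIC50 6.7, i.e. an IC50 of roughly 200 nM.
    print(round(pic50_to_ic50_nm(6.7), 1))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent compounds.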