This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is O=C(CN1CCN(c2ccc(F)cn2)CC1)c1coc2ccccc12. The target protein (Q60714) has sequence MRAPGAGTASVASLALLWFLGLPWTWSAAAAFCVYVGGGGWRFLRIVCKTARRDLFGLSVLIRVRLELRRHRRAGDTIPCIFQAVARRQPERLALVDASSGICWTFAQLDTYSNAVANLFRQLGFAPGDVVAVFLEGRPEFVGLWLGLAKAGVVAALLNVNLRREPLAFCLGTSAAKALIYGGEMAAAVAEVSEQLGKSLLKFCSGDLGPESILPDTQLLDPMLAEAPTTPLAQAPGKGMDDRLFYIYTSGTTGLPKAAIVVHSRYYRIAAFGHHSYSMRAADVLYDCLPLYHSAGNIMGVGQCVIYGLTVVLRKKFSASRFWDDCVKYNCTVVQYIGEICRYLLRQPVRDVEQRHRVRLAVGNGLRPAIWEEFTQRFGVPQIGEFYGATECNCSIANMDGKVGSCGFNSRILTHVYPIRLVKVNEDTMEPLRDSEGLCIPCQPGEPGLLVGQINQQDPLRRFDGYVSDSATNKKIAHSVFRKGDSAYLSGDVLVMDELG.... The pIC50 is 5.7. (2) The small molecule is COc1cc(/C=N/NC(=O)c2ccc(NC(=O)c3ccccc3)cc2)cc(Br)c1O. The pIC50 is 4.9. The target protein (Q9Y294) has sequence MAKVQVNNVVVLDNPSPFYNPFQFEITFECIEDLSEDLEWKIIYVGSAESEEYDQVLDSVLVGPVPAGRHMFVFQADAPNPGLIPDADAVGVTVVLITCTYRGQEFIRVGYYVNNEYTETELRENPPVKPDFSKLQRNILASNPRVTRFHINWEDNTEKLEDAESSNPNLQSLLSTDALPSASKGWSTSENSLNVMLESHMDCM. (3) The compound is COc1ccc(-n2c(=S)[nH]c3c(oc4ccccc43)c2=O)cc1. The target protein sequence is MSGSTQPVAQTWRATEPRYPPHSLSYPVQIARTHTDVGLLEYQHHSRDYASHLSPGSIIQPQRRRPSLLSEFQPGNERSQELHLRPESHSYLPELGKSEMEFIESKRPRLELLPDPLLRPSPLLATGQPAGSEDLTKDRSLTGKLEPVSPPSPPHTDPELELVPPRLSKEELIQNMDRVDREITMVEQQISKLKKKQQQLEEEAAKPPEPEKPVSPPPIESKHRSLVQIIYDENRKKAEAAHRILEGLGPQVELPLYNQPSDTRQYHENIKINQAMRKKLILYFKRRNHARKQWEQKFCQRYDQLMEAWEKKVERIENNPRRRAKESKVREYYEKQFPEIRKQRELQERMQSRVGQRGSGLSMSAARSEHEVSEIIDGLSEQENLEKQMRQLAVIPPMLYDADQQRIKFINMNGLMADPMKVYKDRQVMNMWSEQEKETFREKFMQHPKNFGLIASFLERKTVAECVLYYYLTKKNENYKSLVRRSYRRRGKSQQQQQQQ.... The pIC50 is 5.3. (4) The drug is CCCCCCCCCCCCc1c(O)cc(O)c(C(=O)CCCCCCCCCCC)c1O. 
The target protein (Q9QXX3) has sequence MLLLLLLLLLGPGPGFSEATRRSHVYKRGLLELAGTLDCVGPRSPMAYMNYGCYCGLGGHGEPRDAIDWCCYHHDCCYSRAQDAGCSPKLDRYPWKCMDHHILCGPAENKCQELLCRCDEELAYCLAGTEYHLKYLFFPSILCEKDSPKCN. The pIC50 is 5.8. (5) The small molecule is CC1(C(=O)N(CCCC(=O)O)Cc2ccc3cc[nH]c3c2)CCN1C(=O)Cc1csc2ccccc12. The target protein (O15552) has sequence MLPDWKSSLILMAYIIIFLTGLPANLLALRAFVGRIRQPQPAPVHILLLSLTLADLLLLLLLPFKIIEAASNFRWYLPKVVCALTSFGFYSSIYCSTWLLAGISIERYLGVAFPVQYKLSRRPLYGVIAALVAWVMSFGHCTIVIIVQYLNTTEQVRSGNEITCYENFTDNQLDVVLPVRLELCLVLFFIPMAVTIFCYWRFVWIMLSQPLVGAQRRRRAVGLAVVTLLNFLVCFGPYNVSHLVGYHQRKSPWWRSIAVVFSSLNASLDPLLFYFSSSVVRRAFGRGLQVLRNQGSSLLGRRGKDTAEGTNEDRGVGQGEGMPSSDFTTE. The pIC50 is 7.3. (6) The drug is O=c1c(O)c(-c2ccc(O)c(O)c2)oc2cc(O)cc(O)c12. The target protein sequence is MAFLWLLSCWALLGTTFGCGVPAIHPVLSGLSRIVNGEDAVPGSWPWQVSLQDKTGFHFCGGSLISEDWVVTAAHCGVRTSDVVVAGEFDQGSDEENIQVLKIAKVFKNPKFSILTVNNDITLLKLATPARFSQTVSAVCLPSADDDFPAGTLCATTGWGKTKYNANKTPDKLQQAALPLLSNAECKKSWGRRITDVMICAGASGVSSCMGDSGGPLVCQKDGAWTLVGIVSWGSDTCSTSSPGVYARVTKLIPWVQKILAAN. The pIC50 is 4.0. (7) The target protein (P24822) has sequence MQGPWVLLLLGLRLQLSLSVIPVEEENPAFWNKKAAEALDAAKKLQPIQTSAKNLIIFLGDGMGVPTVTATRILKGQLEGHLGPETPLAMDRFPYMALSKTYSVDRQVPDSASTATAYLCGVKTNYKTIGLSAAARFDQCNTTFGNEVFSVMYRAKKAGKSVGVVTTTRVQHASPSGTYVHTVNRNWYGDADMPASALREGCKDIATQLISNMDINVILGGGRKYMFPAGTPDPEYPNDANETGTRLDGRNLVQEWLSKHQGSQYVWNREQLIQKAQDPSVTYLMGLFEPVDTKFDIQRDPLMDPSLKDMTETAVKVLSRNPKGFYLFVEGGRIDRGHHLGTAYLALTEAVMFDLAIERASQLTSERDTLTIVTADHSHVFSFGGYTLRGTSIFGLAPLNALDGKPYTSILYGNGPGYVGTGERPNVTAAESSGSSYRRQAAVPVKSETHGGEDVAIFARGPQAHLVHGVQEQNYIAHVMASAGCLEPYTDCGLAPPADE.... The drug is O=C(CCNS(=O)(=O)c1ccc2[nH]c(=O)oc2c1)NCCCc1ccc2c(c1)OCO2. The pIC50 is 4.4.
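
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is only an illustrative sketch: the function names are hypothetical and not part of the dataset, and the nanomolar variant assumes the raw IC50 is given in nM, which this record does not state.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Same conversion for an IC50 in nM (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 5.7, as in example (1), corresponds to roughly 2 micromolar.
    print(f"{ic50_molar_from_pic50(5.7):.2e} M")
```

Because the scale is logarithmic, each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.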