This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted quantity is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is Cc1cc(Nc2cc(C(F)(F)F)cc(C(F)(F)F)c2)n2ncnc2n1. The target protein sequence is MKRFDERMNKEKSKHKKVLFFIFSSIVGLYMYFESYNPEFFMYDVFLDFCLNYVDSEVCHDLFLLLGKYGLLPYDTSNDSVYATSDIKNLNFINPFGVAAGFDKNGICIDSILKLGFSFIEIGTITPKPQKGNNKPRIFRDVENKSIINACGFNNIGCDKVTENLINFRKKQEEDKLLSKHIVGVSIGKNKHTENIVDDLKYSIYKIARYADYIAINVSSPNTPGLRDNQESNKLKNIILFVKQEINKIEQIGHNGETFWMNTIKKKPLVFVKLAPDLENSEKKKIAQVLLDTGIDGMIISNTTINKMDIKSFEDKKGGVSGKKLKDLSTNLISDMYIYTNKQIPIIASGGILTGADALEKIEAGASVCQLYSCLVFNGVKSAIQIKREFNNALYQKGYYNLREAIGKKHSNAKSLKV. The pIC50 is 4.0. (2) The small molecule is CCNC(=O)c1c2n(c3c(N4CCN(CCc5ccc(F)c(F)c5)CC4)ncnc13)CCCC2. The target protein (O35379) has sequence MALRSFCSADGSDPLWDWNVTWHTSNPDFTKCFQNTVLTWVPCFYLWSCFPLYFFYLSRHDRGYIQMTHLNKTKTALGFFLWIICWADLFYSFWERSQGVLRAPVLLVSPTLLGITMLLATFLIQLERRKGVQSSGIMLTFWLVALLCALAILRSKIISALKKDAHVDVFRDSTFYLYFTLVLVQLVLSCFSDCSPLFSETVHDRNPCPESSASFLSRITFWWITGMMVHGYRQPLESSDLWSLNKEDTSEEVVPVLVNNWKKECDKSRKQPVRIVYAPPKDPSKPKGSSQLDVNEEVEALIVKSPHKDREPSLFKVLYKTFGPYFLMSFLYKALHDLMMFAGPKILELIINFVNDREAPDWQGYFYTALLFVSACLQTLALHQYFHICFVSGMRIKTAVVGAVYRKALLITNAARKSSTVGEIVNLMSVDAQRFMDLATYINMIWSAPLQVILALYFLWLSLGPSVLAGVAVMILMVPLNAVMAMKTKTYQVAHMKSKD.... The pIC50 is 4.0. (3) The drug is Cc1ccc(Sc2ccc(Nc3cc(S(=O)(=O)[O-])c(N)c4c3C(=O)c3ccccc3C4=O)cc2)cc1C.[Na+]. 
The target protein (P41231) has sequence MAADLGPWNDTINGTWDGDELGYRCRFNEDFKYVLLPVSYGVVCVPGLCLNAVALYIFLCRLKTWNASTTYMFHLAVSDALYAASLPLLVYYYARGDHWPFSTVLCKLVRFLFYTNLYCSILFLTCISVHRCLGVLRPLRSLRWGRARYARRVAGAVWVLVLACQAPVLYFVTTSARGGRVTCHDTSAPELFSRFVAYSSVMLGLLFAVPFAVILVCYVLMARRLLKPAYGTSGGLPRAKRKSVRTIAVVLAVFALCFLPFHVTRTLYYSFRSLDLSCHTLNAINMAYKVTRPLASANSCLDPVLYFLAGQRLVRFARDAKPPTGPSPATPARRRLGLRRSDRTDMQRIEDVLGSSEDSRRTESTPAGSENTKDIRL. The pIC50 is 5.0. (4) The compound is C[C@@H](O)[C@H]1C(=O)N2C(C(=O)[O-])=C(Sc3nc4ccc(N)cc4s3)[C@H](C)[C@H]12.[Na+]. The pIC50 is 4.4. The target protein sequence is MKKIKIVPLILIVVVVGFGIYFYASKDKEINNTIDAIEDKNFKQVYKDSSYISKSDNGEVEMTERPIKIYNSLGVKDINIQDRKIKKVSKNKKRVDAQYKIKTNYGNIDRNVQFNFVKEDGMWKLDWDHSVIIPGMQKDQSIHIENLKSERGKILDRNNVELANTGTHMRLGIVPKNVSKKDYKAIAKELSISEDYINNKWIKIGYKMIPSFHFKTVKKMDEYLSDFAKKFHLTTNETESRNYPLEKATSHLLGYVGPINSEELKQKEYKGYKDDAVIGKKGLEKLYDKKLQHEDGYRVTIVDDNSNTIAHTLIEKKKKDGKDIQLTIDAKVQKSIYNNMKNDYGSGTAIHPQTGELLALVSTPSYDVYPFMYGMSNEEYNKLTEDKKEPLLNKFQITTSPGSTQKILTAMIGLNNKTLDDKTSYKIDGKGWQKDKSWGGYNVTRYEVVNGNIDLKQAIESSDNIFFARVALELGSKKFEKGMKKLGVGEDIPSDYPFYN.... (5) The target protein (O00142) has sequence MLLWPLRGWAARALRCFGPGSRGSPASGPGPRRVQRRAWPPDKEQEKEKKSVICVEGNIASGKTTCLEFFSNATDVEVLTEPVSKWRNVRGHNPLGLMYHDASRWGLTLQTYVQLTMLDRHTRPQVSSVRLMERSIHSARYIFVENLYRSGKMPEVDYVVLSEWFDWILRNMDVSVDLIVYLRTNPETCYQRLKKRCREEEKVIPLEYLEAIHHLHEEWLIKGSLFPMAAPVLVIEADHHMERMLELFEQNRDRILTPENRKHCP. The small molecule is Cc1cn([C@H]2C[C@H](NC(=S)Nc3ccc(Cl)c(C(F)(F)F)c3)[C@@H](CO)O2)c(=O)[nH]c1=O. The pIC50 is 6.8. (6) The small molecule is COc1cc(C(=O)N2CO[C@](CCN3CCN(c4ccc(C(C)=O)cc4)CC3)(c3ccc(Cl)c(Cl)c3)C2)cc(OC)c1OC. 
The target protein (Q64077) has sequence MGACVIVTNTNISSGLESNTTGITAFSMPTWQLALWATAYLALVLVAVTGNATVTWIILAHQRMRTVTNYFIVNLALADLCMAAFNAAFNFVYASHNIWYFGRAFCYFQNLFPITAMFVSIYSMTAIAIDRYMAIVHPFQPRLSAPSTKAVIGGIWLVALALAFPQCFYSTITEDEGATKCVVAWPEDSRDKSLLLYHLVVIVLIYLLPLTVMFVAYSIIGLTLWRRAVPRHQAHGANLRHLQAKKKFVKTMVLVVVTFAICWLPYHLYFILGSFQEDIYCHKFIQQVYLALFWLAMSSTMYNPIIYCCLNRRFRSGFRLAFRCCPWVTPTEEDKLELTHTPSFSLRVNRCHTKEILFMAGDTVPSEATNGQAGGPQDRESVELSSLPGCRAGPSILAKASS. The pIC50 is 7.1. (7) The compound is O=P(O)(O)C(CC1CCCCC1)c1cccc2ccccc12. The target protein (P15309) has sequence MRAAPLLLARAASLSLGFLFLLFFWLDRSVLAKELKFVTLVFRHGDRSPIDTFPTDPIKESSWPQGFGQLTQLGMEQHYELGEYIRKRYRKFLNESYKHEQVYIRSTDVDRTLMSAMTNLAALFPPEGVSIWNPILLWQPIPVHTVPLSEDQLLYLPFRNCPRFQELESETLKSEEFQKRLHPYKDFIATLGKLSGLHGQDLFGIWSKVYDPLYCESVHNFTLPSWATEDTMTKLRELSELSLLSLYGIHKQKEKSRLQGGVLVNEILNHMKRATQIPSYKKLIMYSAHDTTVSGLQMALDVYNGLLPPYASCHLTELYFEKGEYFVEMYYRNETQHEPYPLMLPGCSPSCPLERFAELVGPVIPQDWSTECMTTNSHQGTEDSTD. The pIC50 is 4.9. (8) The drug is O=c1ccc(-c2c(-c3ccc(F)cc3Cl)nc3ccccn23)nn1-c1c(Cl)cccc1Cl. The target protein (Q9WUI1) has sequence MSGPRAGFYRQELNKTVWEVPQRLQGLRPVGSGAYGSVCSAYDARLRQKVAVKKLSRPFQSLIHARRTYRELRLLKHLKHENVIGLLDVFTPATSIEDFSEVYLVTTLMGADLNNIVKCQALSDEHVQFLVYQLLRGLKYIHSAGIIHRDLKPSNVAVNEDCELRILDFGLARQADEEMTGYVATRWYRAPEIMLNWMHYNQTVDIWSVGCIMAELLQGKALFPGNDYIDQLKRIMEVVGTPSPEVLAKISSEHARTYIQSLPPMPQKDLSSVFHGANPLAIDLLGRMLVLDSDQRVSAAEALAHAYFSQYHDPDDEPEAEPYDESVEAKERTLEEWKELTYQEVLSFKPLEPSQLPGTHEIEQ. The pIC50 is 7.2.
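
The pIC50 values in the records above follow the stated definition, pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion, the Python helpers below map between IC50 and pIC50. The function names are hypothetical and not part of the dataset, and the nanomolar helper assumes the raw measurement is reported in nM, which this record does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nanomolar_to_pic50(ic50_nm: float) -> float:
    """Convenience wrapper, assuming the raw IC50 is given in nM."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # Convert the pIC50 values listed in the records back to molar IC50.
    for pic50 in (4.0, 5.0, 4.4, 6.8, 7.1, 4.9, 7.2):
        print(f"pIC50 {pic50:.1f} -> IC50 {pic50_to_ic50_molar(pic50):.3e} M")
```

Under this definition a pIC50 of 4.0 corresponds to an IC50 of 1e-4 M (100 µM), and a pIC50 of 7.2 corresponds to roughly 6.3e-8 M (about 63 nM), which is why higher values indicate more potent binding.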