This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is Cc1cn(-c2cc(C(=O)Nc3cccc(Nc4ccc5c(c4)NC(=O)/C5=C\c4ccc[nH]4)c3)cc(C(F)(F)F)c2)cn1. The target protein (Q78DX7) has sequence MKNICWLTLKLVKFVVLGCIIWISVAQSTVLSSCLTSCVTNLGRQLDSGTRYNLSEACIHGCQFWNSVDQETCALKCNDTYATICERESCEVGCSNAEGSYEEEVLESTELPTAPFASSIGSHGVTLRWNPANISGVKYIIQWKYAQLPGSWTFTETVSKLSYTVEPLHPFTEYIFRVVWIFTAQLHLYSPPSPSYRTHPYGVPETAPLILNMESWSPDTVEVSWAPPHFPGGPILGYNLRLISKNQKLDSGTQRTSFQFYSTLPNTTYRFSIAAVNEVGEGPEAESTVTTPSPSVQEEEQWLFLSRKTSLRKRSLKYLVDEAHCLWSDAIHHNITGISVYAQQQVVYFSEGTVIWMKGAANMSDVSDLRIFYQGSGLVSSISIDWLYQRMYFIMDKLVYVCELKNCSNLEEITPFSLIAPQKVVVDSYNGYLFYLLRDGIYRVNLPLPSGRDTKAVRIVESGTLKDFAVKPQSKRIIYFNDTMQLFMSTFLDGSAFHRV.... The pIC50 is 5.2. (2) The small molecule is CCc1nc(N)nc(N)c1C#CCc1cc(OC)cc(-c2ccc(C(=O)O)cc2)c1. The target protein sequence is MKVSLIAAMDKNRVIGKENDIPWRIPEDWEYVKNTTKGYPIILGRKNLESIGRALPGRRNIILTRDKGFSFNGCEIVHSIEDVFELCNSEEEIFIFGGEQIYNLFLPYVEKMYITKIHYEFEGDTFFPEVNYEEWNEVSVTQGITNEKNPYTYYFHIYERKAS. The pIC50 is 6.8. (3) The drug is N#CCNC(=O)[C@@H]1CCCC[C@H]1CSc1ccc(Cl)cc1Cl. The target protein (P43236) has sequence MWGLKVLLLPVVSFALHPEEILDTQWELWKKTYSKQYNSKVDEISRRLIWEKNLKHISIHNLEASLGVHTYELAMNHLGDMTSEEVVQKMTGLKVPPSRSHSNDTLYIPDWEGRTPDSIDYRKKGYVTPVKNQGQCGSCWAFSSVGALEGQLKKKTGKLLNLSPQNLVDCVSENYGCGGGYMTNAFQYVQRNRGIDSEDAYPYVGQDESCMYNPTGKAAKCRGYREIPEGNEKALKRAVARVGPVSVAIDASLTSFQFYSKGVYYDENCSSDNVNHAVLAVGYGIQKGNKHWIIKNSWGESWGNKGYILMARNKNNACGIANLASFPKM. The pIC50 is 7.0. (4) The compound is Nc1ccc(-c2cnc3[nH]cc(-c4ccc5[nH]ccc5c4)c3c2)cn1. 
The target protein (Q80XI6) has sequence MEPLKNLFLKSPLGSWNGSGSGGGGGTGGVRPEGSPKATAAYANPVWTALFDYEPNGQDELALRKGDRVEVLSRDAAISGDEGWWAGQVGGQVGIFPSNYVSRGGGPPPCEVASFQELRLEEVIGIGGFGKVYRGSWRGELVAVKAARQDPDEDISVTAESVRQEARLFAMLAHPNIIALKAVCLEEPNLCLVMEYAAGGPLSRALAGRRVPPHVLVNWAVQIARGMHYLHCEALVPVIHRDLKSNNILLLQPIEGDDMEHKTLKITDFGLAREWHKTTQMSAAGTYAWMAPEVIKASTFSKGSDVWSFGVLLWELLTGEVPYRGIDCLAVAYGVAVNKLTLPIPSTCPEPFAQLMADCWAQDPHRRPDFASILQQLEALEAQVLREMPRDSFHSMQEGWKREIQGLFDELRAKEKELLSREEELTRAAREQRSQAEQLRRREHLLAQWELEVFERELTLLLQQVDRERPHVRRRRGTFKRSKLRARDGGERISMPLDFK.... The pIC50 is 6.4. (5) The drug is COc1ccc(C(C)=NOC(N)=O)cc1OC1CCCC1. The target protein sequence is SFLDNHKKLTPRRDVPTYPKYLLSPETIEALRKPTFDVWLWEPNEMLSCLEHMYHDLGLVRDFSINPVTLRRWLFCVHDNYRNNPFHNFRHCFCVAQMMYSMVWLCSLQEKFSQTDILILMTAAICHDLDHPGYNNTYQINARTELAVRYNDISPLENHHCAVAFQILAEPECNIFSNIPPDGFKQIRQGMITLILATDMARHAEIMDSFKEKMENFDYSNEEHMTLLKMILIKCCDISNEVRPMEVAEPWVDCLLEEYFMQSDREKSEGLPVAPFMDRDKVTKATAQIGFIKFVLIPMFETVTKLFPMVEEIMLQPLWESRDRYEELKRIDDAMKELQKKTDSLTSGATEKSRERSRDVKNSEGDCA. The pIC50 is 3.7. (6) The small molecule is CC(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](Cc1ccccc1)[C@H](O)C(=O)N1CSC(C)(C)[C@H]1C(=O)NCC(C)C)C(C)(C)C)c1ccccc1. The target protein sequence is MTVLPIALFSSNTPLRNTSVLGAGGQTQDHFKLTSLPVLIRLPFRTTPIVLTSCLVDTKNNWAIIGRDALQQCQGALYLPEAKGPPVILPIQAPAVLGLEHLPRPPEISQFPLNQNGSRPCNTWSGRPWRQAISNPTPGQEITQYSQLKRPMEPGDSSTTCGPLTL. The pIC50 is 7.0.
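
The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration only: the function names `pic50_from_ic50_molar`, `pic50_from_ic50_nm`, and `ic50_nm_from_pic50` are hypothetical helpers, not part of BindingDB or of the bindingdb_ic50 dataset, and the nanomolar variant assumes the raw IC50 is reported in nM.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Apply the record's definition: pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Same definition for an IC50 in nanomolar (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in nM = 10 ** (9 - pIC50)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # Labels taken from the examples in this record.
    for label in (5.2, 6.8, 7.0, 6.4, 3.7):
        print(f"pIC50 {label} -> IC50 of about {ic50_nm_from_pic50(label):.3g} nM")
```

Under this definition, a label of 7.0 corresponds to an IC50 of 100 nM, and each one-unit increase in pIC50 means a tenfold lower IC50, which is why higher values indicate more potent compounds.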