Programming Q&A   Published: 2022-05-31   Source site: 大佬教程 (code.js-code.com)
How do I implement Open XML document protection (the documentProtection class)?

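I'm trying to implement the Open XML documentProtection hash protection of an MS Word (2019) document in Python, in order to test the hashing algorithm. So I created a Word document and protected it against editing with this password: johnjohn. Then, opening the document as ZIP/XML, I see the following in the documentProtection section: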

<w:documentProtection w:edit="readOnly" w:enforcement="1" w:cryptProviderType="rsaAES" w:cryptAlgorithmClass="hash" w:cryptAlgorithmType="typeAny" w:cryptAlgorithmSid="14" w:cryptSpinCount="100000" w:hash="pVjR9ktO9vlxijXcMPlH+4PLwD4Xwy1aqbNQOFmwaspvBjipNh//T8S3nBhq6HRoRVfWL6s/+NdUCPTxUr0vZw==" w:salt="pH1TDVHSfGBxkd3Q88UNhQ==" />
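For anyone who wants to pull these attributes out programmatically rather than by eye, here is a minimal sketch using only the standard library. It assumes the element sits in word/settings.xml inside the .docx archive; the helper name read_protection and the trimmed-down SAMPLE string are made up for illustration and are not part of the original post.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w:" prefix in settings.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_protection(settings_xml: str) -> dict:
    """Return the documentProtection attributes with the namespace stripped."""
    root = ET.fromstring(settings_xml)
    elem = root.find(f"{{{W_NS}}}documentProtection")
    if elem is None:
        return {}
    # ElementTree stores names as "{namespace}local"; keep only the local part
    return {name.split("}", 1)[-1]: value for name, value in elem.attrib.items()}


# Hypothetical, trimmed-down settings.xml holding only the element of interest
SAMPLE = (
    f'<w:settings xmlns:w="{W_NS}">'
    '<w:documentProtection w:edit="readOnly" w:enforcement="1" '
    'w:cryptAlgorithmSid="14" w:cryptSpinCount="100000" '
    'w:salt="pH1TDVHSfGBxkd3Q88UNhQ=="/>'
    "</w:settings>"
)

info = read_protection(SAMPLE)
print(info["cryptAlgorithmSid"], info["cryptSpinCount"], info["salt"])
```

With a real document, the XML string could be read with zipfile.ZipFile(path).read("word/settings.xml") before passing it to the helper.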

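According to the Open XML documentation (ECMA-376-1:2016, §17.15.1.29):

  • cryptAlgorithmSid="14" points to the SHA-512 algorithm
  • cryptSpinCount="100000" means the hashing must be done in 100k rounds, using the following algorithm (quoted from the above standard):

Specifies the number of times the hashing function shall be iteratively run (runs using each iteration's result plus a 4 byte value (0-based, little endian) containing the number of the iteration as the input for the next iteration) when attempting to compare a user-supplied password with the value stored in the hashValue attribute.

The BASE64-encoded salt used for hashing ("pH1TDVHSfGBxkd3Q88UNhQ==") is prepended to the original password. The target BASE64-encoded hash must be "pVjR9ktO9vlxijXcMPlH+4PLwD4Xwy1aqbNQOFmwaspvBjipNh//T8S3nBhq6HRoRVfWL6s/+NdUCPTxUr0vZw==".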
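As a quick sanity check on these values, the sketch below decodes the two BASE64 strings quoted above and compares their sizes with the SHA-512 digest size. It only confirms that the attribute lengths are consistent with SHA-512; it does not reproduce the hash itself. The variable names are illustrative.

```python
import base64
import hashlib
import struct

SALT_B64 = "pH1TDVHSfGBxkd3Q88UNhQ=="
HASH_B64 = ("pVjR9ktO9vlxijXcMPlH+4PLwD4Xwy1aqbNQOFmwaspvBjipNh//"
            "T8S3nBhq6HRoRVfWL6s/+NdUCPTxUr0vZw==")

salt = base64.b64decode(SALT_B64)
target = base64.b64decode(HASH_B64)

print(len(salt))                      # 16 (bytes of salt)
print(len(target))                    # 64 (bytes in the stored hash)
print(hashlib.sha512().digest_size)   # 64, so the stored value is SHA-512 sized

# The "4 byte value (0-based, little endian)" for iteration number 1
counter = struct.pack("<I", 1)
print(counter)                        # b'\x01\x00\x00\x00'
```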

So my Python script tries to generate the same hash using the algorithm described, as follows:

import hashlib
import base64
import struct

TARGET_HASH = 'pVjR9ktO9vlxijXcMPlH+4PLwD4Xwy1aqbNQOFmwaspvBjipNh//T8S3nBhq6HRoRVfWL6s/+NdUCPTxUr0vZw=='
TARGET_SALT = 'pH1TDVHSfGBxkd3Q88UNhQ=='
bsalt = base64.b64decode(TARGET_SALT)

def hashit(what, alg='sha512', **kwargs):
    if alg == 'sha1':
        return hashlib.sha1(what)
    elif alg == 'sha512':
        return hashlib.sha512(what)
    # etc...
    else:
        raise Exception(f'Unsupported hash algorithm: {alg}')

def gethash(data, salt=None, alg='sha512', iters=100000, base64result=True, returnString=True):
    # encode password in UTF-16le
    # ECMA-376-1:2016 17.15.1.29 (p. 1026)
    if isinstance(data, str):
        data = data.encode('UTF-16-le')
    # prepend salt if provided
    if salt is not None:
        if isinstance(salt, str):
            salt = salt.encode('UTF-16-le')
        ghash = salt + data
    else:
        ghash = data
    # hash iteratively for 'iters' rounds
    for i in range(iters):
        try:
            # next hash = hash(previous data) + 4-byte integer (previous round number) with LE byte ordering
            # ECMA-376-1:2016 17.15.1.29 (p. 1020)
            ghash = hashit(ghash, alg).digest() + struct.pack('<I', i)
        except Exception as err:
            print(err)
            break
    # remove trailing round number bytes
    ghash = ghash[:-4]
    # BASE64 encode if requested
    if base64result:
        ghash = base64.b64encode(ghash)
    # return as an ASCII string if requested
    if returnString:
        ghash = ghash.decode()
    return ghash

But when I run

print(gethash('johnjohn', bsalt))

I get the following hash, which does not equal the target hash:

G47RT4/+JdE6pnrP6R_28_11845@qUKa3JyL8abeYSCX+E4+9J+6shiZqImBJ8M6bb+IMKEdvKd6+9dVnQ3oeOsgQz/aCdcQ==

Is my implementation wrong somewhere, or do you think there is a difference in the low-level hash function implementation (Python's hashlib vs. Open XML)?

Update

I realized that Word uses a legacy algorithm to pre-process passwords (for compatibility with older versions). This algorithm is described in detail in ECMA-376-1:2016 Part 4 (Transitional Migration Features, §14.8.1 "Legacy Password Hash Algorithm"). So I managed to write a script that reproduces the official ECMA example:

# InitialCodeArray and EncryptionMatrix are the constant tables of the legacy
# algorithm from the standard (definitions not shown here)

def strtobytes(s, trunc=15):
    b = s.encode('UTF-16-le')
    # 'UTF-16-le' never emits a BOM, so there is nothing to strip
    pwdlen = min(trunc, len(s))
    if pwdlen < 1:
        return None
    # keep the low byte of each UTF-16 unit, or the high byte if the low byte is zero
    return bytes([b[i] or b[i+1] for i in range(0, pwdlen * 2, 2)])

def process_pwd(pwd):
    # 1. PREPARE PWD STRING (TRUNCATE, CONVERT TO BYTES)
    pw = strtobytes(pwd) if isinstance(pwd, str) else pwd[:15]
    if not pw:
        return None
    pwdlen = len(pw)
    # 2. HIGH WORD CALC
    HW = InitialCodeArray[pwdlen - 1]
    for i in range(pwdlen):
        r = 15 - pwdlen + i
        for ibit in range(7):
            if pw[i] & (0x0001 << ibit):
                HW ^= EncryptionMatrix[r][ibit]
    # 3. LO WORD CALC
    LW = 0
    for i in reversed(range(pwdlen)):
        LW = (((LW >> 14) & 0x0001) | ((LW << 1) & 0x7FFF)) ^ pw[i]
    LW = (((LW >> 14) & 0x0001) | ((LW << 1) & 0x7FFF)) ^ pwdlen ^ 0xCE4B
    # 4. COMBINE AND REVERSE
    return bytes([LW & 0xff, LW >> 8, HW & 0xff, HW >> 8])

So when I run process_pwd('Example'), I get what ECMA says (0x7EEDCE64). The hashing function was also modified (the initial SALT + HASH should not be included in the main iteration loop, as I found on a forum):

def gethash(data, salt=None, alg='sha512', iters=100000, base64result=True, returnString=True):

    def hashit(what, alg='sha512'):
        return getattr(hashlib, alg)(what)

    # encode password with legacy algorithm if a string is given
    if isinstance(data, str):
        data = process_pwd(data)
        if data is None:
            print('WRONG PASSWORD STRING!')
            return None
    # prepend salt if provided
    if salt is not None:
        if isinstance(salt, str):
            salt = process_pwd(salt)
            if salt is None:
                print('WRONG SALT STRING!')
                return None
        ghash = salt + data
    else:
        ghash = data
    # initial hash (salted)
    ghash = hashit(ghash, alg).digest()
    # hash iteratively for 'iters' rounds
    for i in range(iters):
        try:
            # next hash = hash(previous data + 4-byte integer (previous round number) with LE byte ordering)
            # ECMA-376-1:2016 17.15.1.29 (p. 1020)
            ghash = hashit(ghash + struct.pack('<I', i), alg).digest()
        except Exception as err:
            print(err)
            return None
    # BASE64 encode if requested
    if base64result:
        ghash = base64.b64encode(ghash)
    # return as an ASCII string if requested
    if returnString:
        ghash = ghash.decode()
    return ghash

However many times I re-check this code, I can't see any more errors. But I still can't reproduce the target hash from the test Word document:

myhash = gethash('johnjohn', base64.b64decode('pH1TDVHSfGBxkd3Q88UNhQ=='))
print(myhash)
print(TARGET_HASH == myhash)

I get:

wut2VOpT+X8pKXky6u/+YtwRX2inDv1WVC8FtZcdxKsyX0gHNBJGYwBgV8xzq7Rke/hWMfWe9JVvqDQAZ11A5w==

False

Solution

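Had to look at this today as well and managed to reverse-engineer it.

In simple terms, the steps are:

  1. Truncate the password to 15 characters (it is unclear whether this is ASCII or UTF-8 encoded; some documentation refers to "Unicode passwords", but all the examples appear to be ASCII-based). My implementation just takes the truncated bytes after UTF-8 conversion, which preserves the ASCII set.
  2. Grab a high-order word from a magic list based on the length of the password. If the password has length 0, it is just two zero bytes.
  3. For each byte in the password, grab the bits from the encryption matrix row that matches its position (note that the last character always corresponds to the last row, so the first rows of the matrix may go unused if the password is shorter than 15 bytes). For the first to the seventh bit, if the bit is set, XOR the matching matrix entry into the current value of the high-order word. Repeat for each character.
  4. Grab a low-order word (2 bytes) and initialise it to zero. Perform this operation for each character, starting from the last character in the password and working forward: low-order word = (((low-order word >> 14) & 0x0001) | ((low-order word << 1) & 0x7FFF)) ^ character (byte). (<< and >> are the left and right shift operators; |, & and ^ are bitwise OR, AND and XOR respectively.)
  5. Then do: low-order word = (((low-order word >> 14) & 0x0001) | ((low-order word << 1) & 0x7FFF)) ^ password length ^ 0xCE4B.
  6. Form the key by appending the low-order word to the high-order word. Then reverse the byte order.
  7. For some reason, Microsoft Word then takes the Unicode hex string representation of the above key and converts that representation back into bytes (see the link in the code comments).
  8. Now compute the hash once, with the salt bytes prepended to the above result. If there are no salt bytes, skip this step.
  9. If iterations need to be computed, for each iteration convert the iteration count (0-based) to a 32-bit (4-byte) little-endian integer and append it to the current computed hash (the documentation is unclear about this and just says to "add" the bytes, but to match the output I had to append them). Then apply the requested hash algorithm (Word appears to default to SHA512, but from testing I found that it handles the other options well too).
  10. Return the above as a base-64 encoded string. This is what appears in the documentProtection attribute.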
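For readers following the question's Python code, steps 1 to 6 can be sketched roughly as below. This is an illustration rather than a port of any official implementation: the names INITIAL_CODE, ENCRYPTION_MATRIX and legacy_key are made up here, the table values are the constants of the legacy algorithm written as 16-bit integers, and the password is assumed to be ASCII-range text as in step 1.

```python
# Constant tables of the legacy password hash algorithm, as 16-bit words
INITIAL_CODE = [
    0xE1F0, 0x1D0F, 0xCC9C, 0x84C0, 0x110C, 0x0E10, 0xF1CE, 0x313E,
    0x1872, 0xE139, 0xD40F, 0x84F9, 0x280C, 0xA96A, 0x4EC3,
]
ENCRYPTION_MATRIX = [
    [0xAEFC, 0x4DD9, 0x9BB2, 0x2745, 0x4E8A, 0x9D14, 0x2A09],
    [0x7B61, 0xF6C2, 0xFDA5, 0xEB6B, 0xC6F7, 0x9DCF, 0x2BBF],
    [0x4563, 0x8AC6, 0x05AD, 0x0B5A, 0x16B4, 0x2D68, 0x5AD0],
    [0x0375, 0x06EA, 0x0DD4, 0x1BA8, 0x3750, 0x6EA0, 0xDD40],
    [0xD849, 0xA0B3, 0x5147, 0xA28E, 0x553D, 0xAA7A, 0x44D5],
    [0x6F45, 0xDE8A, 0xAD35, 0x4A4B, 0x9496, 0x390D, 0x721A],
    [0xEB23, 0xC667, 0x9CEF, 0x29FF, 0x53FE, 0xA7FC, 0x5FD9],
    [0x47D3, 0x8FA6, 0x0F6D, 0x1EDA, 0x3DB4, 0x7B68, 0xF6D0],
    [0xB861, 0x60E3, 0xC1C6, 0x93AD, 0x377B, 0x6EF6, 0xDDEC],
    [0x45A0, 0x8B40, 0x06A1, 0x0D42, 0x1A84, 0x3508, 0x6A10],
    [0xAA51, 0x4483, 0x8906, 0x022D, 0x045A, 0x08B4, 0x1168],
    [0x76B4, 0xED68, 0xCAF1, 0x85C3, 0x1BA7, 0x374E, 0x6E9C],
    [0x3730, 0x6E60, 0xDCC0, 0xA9A1, 0x4363, 0x86C6, 0x1DAD],
    [0x3331, 0x6662, 0xCCC4, 0x89A9, 0x0373, 0x06E6, 0x0DCC],
    [0x1021, 0x2042, 0x4084, 0x8108, 0x1231, 0x2462, 0x48C4],
]


def rotate15(word: int) -> int:
    """Rotate the low 15 bits of word left by one position."""
    return ((word >> 14) & 0x0001) | ((word << 1) & 0x7FFF)


def legacy_key(password: str) -> bytes:
    """Steps 1-6: return the 4-byte legacy key in reversed byte order."""
    pw = password.encode("utf-8")[:15]            # step 1 (ASCII-range assumption)
    n = len(pw)
    high = INITIAL_CODE[n - 1] if n else 0        # step 2
    for i, byte in enumerate(pw):                 # step 3
        row = ENCRYPTION_MATRIX[15 - n + i]       # last byte maps to the last row
        for bit in range(7):
            if byte & (1 << bit):
                high ^= row[bit]
    low = 0
    for byte in reversed(pw):                     # step 4
        low = rotate15(low) ^ byte
    low = rotate15(low) ^ n ^ 0xCE4B              # step 5
    key = high.to_bytes(2, "big") + low.to_bytes(2, "big")
    return key[::-1]                              # step 6


print(legacy_key("Example").hex().upper())  # 7EEDCE64
```

The printed value matches the 0x7EEDCE64 example that the question quotes from ECMA, which is a useful check that the tables were typed in correctly.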

Here is my implementation in C# (NuGet):

/// <summary>
/// Class that generates hashes suitable for use with OpenXML Wordprocessing ML documents with the documentProtection element.
/// </summary>
public class WordprocessingMLDocumentProtectionHashGenerator
{
    private static readonly byte[][] HighOrderWords = new byte[][]
    {
        new byte[] { 0xE1, 0xF0 }, new byte[] { 0x1D, 0x0F }, new byte[] { 0xCC, 0x9C }, new byte[] { 0x84, 0xC0 }, new byte[] { 0x11, 0x0C },
        new byte[] { 0x0E, 0x10 }, new byte[] { 0xF1, 0xCE }, new byte[] { 0x31, 0x3E }, new byte[] { 0x18, 0x72 }, new byte[] { 0xE1, 0x39 },
        new byte[] { 0xD4, 0x0F }, new byte[] { 0x84, 0xF9 }, new byte[] { 0x28, 0x0C }, new byte[] { 0xA9, 0x6A }, new byte[] { 0x4E, 0xC3 }
    };

    // Indexed as [row, bit, byte]: 15 rows, 7 bits per row, 2 bytes per entry.
    private static readonly byte[,,] EncryptionMatrix = new byte[,,]
    {
        { { 0xAE, 0xFC }, { 0x4D, 0xD9 }, { 0x9B, 0xB2 }, { 0x27, 0x45 }, { 0x4E, 0x8A }, { 0x9D, 0x14 }, { 0x2A, 0x09 } },
        { { 0x7B, 0x61 }, { 0xF6, 0xC2 }, { 0xFD, 0xA5 }, { 0xEB, 0x6B }, { 0xC6, 0xF7 }, { 0x9D, 0xCF }, { 0x2B, 0xBF } },
        { { 0x45, 0x63 }, { 0x8A, 0xC6 }, { 0x05, 0xAD }, { 0x0B, 0x5A }, { 0x16, 0xB4 }, { 0x2D, 0x68 }, { 0x5A, 0xD0 } },
        { { 0x03, 0x75 }, { 0x06, 0xEA }, { 0x0D, 0xD4 }, { 0x1B, 0xA8 }, { 0x37, 0x50 }, { 0x6E, 0xA0 }, { 0xDD, 0x40 } },
        { { 0xD8, 0x49 }, { 0xA0, 0xB3 }, { 0x51, 0x47 }, { 0xA2, 0x8E }, { 0x55, 0x3D }, { 0xAA, 0x7A }, { 0x44, 0xD5 } },
        { { 0x6F, 0x45 }, { 0xDE, 0x8A }, { 0xAD, 0x35 }, { 0x4A, 0x4B }, { 0x94, 0x96 }, { 0x39, 0x0D }, { 0x72, 0x1A } },
        { { 0xEB, 0x23 }, { 0xC6, 0x67 }, { 0x9C, 0xEF }, { 0x29, 0xFF }, { 0x53, 0xFE }, { 0xA7, 0xFC }, { 0x5F, 0xD9 } },
        { { 0x47, 0xD3 }, { 0x8F, 0xA6 }, { 0x0F, 0x6D }, { 0x1E, 0xDA }, { 0x3D, 0xB4 }, { 0x7B, 0x68 }, { 0xF6, 0xD0 } },
        { { 0xB8, 0x61 }, { 0x60, 0xE3 }, { 0xC1, 0xC6 }, { 0x93, 0xAD }, { 0x37, 0x7B }, { 0x6E, 0xF6 }, { 0xDD, 0xEC } },
        { { 0x45, 0xA0 }, { 0x8B, 0x40 }, { 0x06, 0xA1 }, { 0x0D, 0x42 }, { 0x1A, 0x84 }, { 0x35, 0x08 }, { 0x6A, 0x10 } },
        { { 0xAA, 0x51 }, { 0x44, 0x83 }, { 0x89, 0x06 }, { 0x02, 0x2D }, { 0x04, 0x5A }, { 0x08, 0xB4 }, { 0x11, 0x68 } },
        { { 0x76, 0xB4 }, { 0xED, 0x68 }, { 0xCA, 0xF1 }, { 0x85, 0xC3 }, { 0x1B, 0xA7 }, { 0x37, 0x4E }, { 0x6E, 0x9C } },
        { { 0x37, 0x30 }, { 0x6E, 0x60 }, { 0xDC, 0xC0 }, { 0xA9, 0xA1 }, { 0x43, 0x63 }, { 0x86, 0xC6 }, { 0x1D, 0xAD } },
        { { 0x33, 0x31 }, { 0x66, 0x62 }, { 0xCC, 0xC4 }, { 0x89, 0xA9 }, { 0x03, 0x73 }, { 0x06, 0xE6 }, { 0x0D, 0xCC } },
        { { 0x10, 0x21 }, { 0x20, 0x42 }, { 0x40, 0x84 }, { 0x81, 0x08 }, { 0x12, 0x31 }, { 0x24, 0x62 }, { 0x48, 0xC4 } }
    };

    /// <summary>
    /// Generates a base-64 string according to the Wordprocessing ML Document DocumentProtection security algorithm.
    /// </summary>
    /// <param name="password"></param>
    /// <param name="salt"></param>
    /// <param name="iterations"></param>
    /// <param name="hashAlgorithmname"></param>
    /// <returns></returns>
    public string GenerateHash(string password, byte[] salt, int iterations, HashAlgorithmName hashAlgorithmname)
    {
        if (password == null)
        {
            throw new ArgumentNullException(nameof(password));
        }

        // Algorithm given in ECMA-376, 1st Edition, December 2006
        // https://www.ecma-international.org/wp-content/uploads/ecma-376_first_edition_december_2006.zip
        // Alternatively: https://c-rex.net/projects/samples/ooxml/e1/Part4/OOXML_P4_DOCX_documentProtection_topic_ID0EJVTX.html
        byte[] passwordBytes = Encoding.UTF8.GetBytes(password);
        passwordBytes = passwordBytes.Take(15).ToArray();
        int passwordLength = passwordBytes.Length;

        // If the password length is 0,the key is 0.
        byte[] highOrderWord = new byte[] { 0x00,0x00 };
        if (passwordLength > 0)
        {
            highOrderWord = HighOrderWords[passwordLength - 1].ToArray();
        }
        for (int i = 0; i < passwordLength; i++)
        {
            byte passwordByte = passwordBytes[i];
            int encryptionMatrixIndex = i + (EncryptionMatrix.GetLength(0) - passwordLength);

            BitArray bitArray = passwordByte.ToBitArray();

            for (int j = 0; j < EncryptionMatrix.GetLength(1); j++)
            {
                bool isSet = bitArray[j];

                if (isSet)
                {
                    for (int k = 0; k < EncryptionMatrix.GetLength(2); k++)
                    {
                        highOrderWord[k] = (byte)(highOrderWord[k] ^ EncryptionMatrix[encryptionMatrixIndex, j, k]);
                    }
                }
            }
        }

        byte[] lowOrderWord = new byte[] { 0x00,0x00 };
        BitSequence lowOrderBitSequence = lowOrderWord.ToBitSequence();
        BitSequence bitSequence1 = new byte[] { 0x00,0x01 }.ToBitSequence();
        BitSequence bitSequence7FFF = new byte[] { 0x7F,0xFF }.ToBitSequence();

        for (int i = passwordLength - 1; i >= 0; i--)
        {
            byte passwordByte = passwordBytes[i];
            lowOrderBitSequence = (((lowOrderBitSequence >> 14) & bitSequence1) | ((lowOrderBitSequence << 1) & bitSequence7FFF)) ^ new byte[] { 0x00,passwordByte }.ToBitSequence();
        }

        lowOrderBitSequence = (((lowOrderBitSequence >> 14) & bitSequence1) | ((lowOrderBitSequence << 1) & bitSequence7FFF)) ^ new byte[] { 0x00, (byte)passwordLength }.ToBitSequence() ^ new byte[] { 0xCE, 0x4B }.ToBitSequence();
        lowOrderWord = lowOrderBitSequence.ToByteArray();

        byte[] key = highOrderWord.Concat(lowOrderWord).ToArray();
        key = key.Reverse().ToArray();

        // https://docs.microsoft.com/en-us/openspecs/office_standards/ms-oe376/fb220a2f-88d4-488c-a9b7-e094756b6699
        // In Word, an additional third stage is added to the process of hashing and storing a user supplied password. In this third stage, the reversed byte order legacy hash from the second stage shall be converted to Unicode hex string representation [Example: If the single byte string 7EEDCE64 is converted to Unicode hex string it will be represented in memory as the following byte stream: 37 00 45 00 45 00 44 00 43 00 45 00 36 00 34 00. end example], and that value shall be hashed as defined by the attribute values.
        key = Encoding.Unicode.GetBytes(BitConverter.ToString(key).Replace("-", string.Empty));

        HashAlgorithm hashAlgorithm = hashAlgorithmname.Create();

        byte[] computedHash = key;

        if (salt != null)
        {
            computedHash = salt.Concat(key).ToArray();
        }

        // Word requires that the initial hash of the password with the salt not be considered in the count.
        computedHash = hashAlgorithm.ComputeHash(computedHash);

        for (int i = 0; i < iterations; i++)
        {
            // ISO/IEC 29500-1 Fourth Edition,2016-11-01
            // 17.15.1.29 - spinCount
            // Specifies the number of times the hashing function shall be iteratively run (runs using each iteration's result plus a 4 byte value (0-based, little endian) containing the number of the iteration as the input for the next iteration) when attempting to compare a user-supplied password with the value stored in the hashValue attribute.
            byte[] iterationBytes = BitConverter.GetBytes(i);
            computedHash = computedHash.Concat(iterationBytes).ToArray();
            computedHash = hashAlgorithm.ComputeHash(computedHash);
        }

        return Convert.ToBase64String(computedHash);
    }
}
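For comparison with the question's Python attempt, the last stages of the class above (the Unicode hex string, the initial salted hash, and the spin loop) might be sketched like this. The function names key_to_hex_bytes and protection_hash are invented for this illustration; the input is the 4-byte reversed legacy key, and like the C# code the sketch always performs the initial hash, with or without a salt.

```python
import base64
import hashlib
import struct
from typing import Optional


def key_to_hex_bytes(key: bytes) -> bytes:
    """Third stage: upper-case hex text of the key, encoded as UTF-16LE."""
    return key.hex().upper().encode("utf-16-le")


def protection_hash(key: bytes, salt: Optional[bytes], spin_count: int,
                    alg: str = "sha512") -> str:
    """Hash the hex-string form of the key as the C# GenerateHash does."""
    data = (salt or b"") + key_to_hex_bytes(key)
    # The initial salted hash is not counted as one of the iterations
    digest = hashlib.new(alg, data).digest()
    for i in range(spin_count):
        # Append the 0-based iteration number as a 4-byte little-endian value
        digest = hashlib.new(alg, digest + struct.pack("<I", i)).digest()
    return base64.b64encode(digest).decode("ascii")


example_key = bytes.fromhex("7EEDCE64")   # legacy key for "Example"
print(key_to_hex_bytes(example_key).hex(" "))
# 37 00 45 00 45 00 44 00 43 00 45 00 36 00 34 00
```

The missing hex-string stage is the main difference from the second gethash in the question, which fed the raw 4 key bytes straight into the salted hash.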

I tested it with your example hash and checked that it passes:

[TestClass]
[TestCategory("WordprocessingMLDocumentProtectionHashGenerator")]
public class WordprocessingMLDocumentProtectionHashGeneratorTests
{
    [TestMethod]
    public void GeneratesKnownHashes()
    {
        WordprocessingMLDocumentProtectionHashGenerator wordprocessingMLDocumentProtectionHashGenerator = new WordprocessingMLDocumentProtectionHashGenerator();

        Assert.AreEqual("sstT7oPzpUQTchSUE6WbidCrZv1c8k+/5D1Pm+weZt7QoaeSnBEg/cZFg2W+1eohg1mgXGXLci1CWbnbHDYsXQ==", wordprocessingMLDocumentProtectionHashGenerator.GenerateHash("Example", Convert.FromBase64String("KPr2WqWFihenPDtAmpqUtw=="), 100000, HashAlgorithmName.SHA512));
        Assert.AreEqual("uBuZhlyVTOQtRwQuOGjY7GU3FnJbe1VFKvN+j9u27HSbthOY+n1/daU/WCkqV40fG6HxX+pxgR+Ow4ZvAE7aZg==", wordprocessingMLDocumentProtectionHashGenerator.GenerateHash("password", Convert.FromBase64String("On9D022mrdqvHTb6eEkFGA=="), 100000, HashAlgorithmName.SHA512));
        // The next three assertions are missing arguments in the source (marked …), so they are left commented out.
        // Assert.AreEqual("mkGbBri0a1icL1nJKTQL7PyLUY2Uei2wymHC0Y6s1+DOMYvPWdB6cy0Npao15O0+yqtyZW4hAP0+dcdyrEk7qg==", wordprocessingMLDocumentProtectionHashGenerator.GenerateHash(…, …, …, HashAlgorithmName.SHA512));
        // Assert.AreEqual("qdPI8cSBM/21Mr29mfFrR6l7hIn8oLKKT1nTDXHsAQA=", wordprocessingMLDocumentProtectionHashGenerator.GenerateHash("TesteRMAN", …, …, HashAlgorithmName.SHA256));
        // Assert.AreEqual("d5FZvHnQhm6Mzqy6cYE7ZbniYXA/8qJxkAze0sFcNirWYhaLpScmSsfBHptuEmuBreLuNjyV5IjdUoOFWM9mbQ==", wordprocessingMLDocumentProtectionHashGenerator.GenerateHash(…, null, …, HashAlgorithmName.SHA512));
        Assert.AreEqual("pVjR9ktO9vlxijXcMPlH+4PLwD4Xwy1aqbNQOFmWaSpvBjipNh//T8S3nBhq6HRoRVfWL6s/+NdUCPTxUr0vZw==", wordprocessingMLDocumentProtectionHashGenerator.GenerateHash("johnjohn", Convert.FromBase64String("pH1TDVHSfGBxkd3Q88UNhQ=="), 100000, HashAlgorithmName.SHA512));
    }
}
