
Office file (DOC, DOCX, PPT…) Password cracking. Guide with real life examples!

Cover image


Ecosystem Fit

This page mirrors the original Medium article into the 1200km.com Docusaurus ecosystem. The original article flow, images, screenshots, infographics, and technical blocks are preserved from the export.

Unlock the secrets of Office file password cracking with our in-depth guide. Learn the tools, techniques, and strategies used to breach all types of Office file encryption, illustrated with vivid real-life examples. Whether you’re a cybersecurity enthusiast or a professional, this article will provide you with actionable insights into the world of digital security and password recovery.

About author

Hello and welcome to my article. My name is Andrey, and I am a penetration tester and cybersecurity researcher.

Disclaimer: Educational Purpose Only

The information provided in this article is intended for educational purposes only. The techniques and methods described herein are discussed as a means to understand and improve security measures and should not be used for illegal purposes. The author and publisher disclaim any liability from the misuse of this information. Readers are urged to use this knowledge to enhance their cybersecurity defenses and are reminded that unauthorized hacking into any system is illegal and unethical.

Office file password cracking

I have a password-protected Office file, Secret.docx.

Article image

Brute force/Dictionary Brute force

1. Extraction of Encrypted Data

When you have an Office file that is password-protected, the actual password isn’t stored anywhere in plaintext or any easily readable format. Instead, the file stores verification data derived from the password (a salt and an encrypted verifier), which cracking tools treat as a "hash". This data is generated when the password is set on the file. By creating a hash file from the Office file, you extract this representation of the password, which is what you will attempt to crack.

For this purpose you can use office2john:

office2john Secret.docx > hash.txt && sed 's/[^:]*:\(.*\)/\1/' hash.txt > temp.txt && mv temp.txt hash.txt

If you get the error "office2john: command not found":

Verify office2john Availability

After installing, you should check if office2john is indeed available:

  • Navigate to the directory where John the Ripper is installed, particularly where the run subdirectory is located.

  • Check if office2john is listed there by running ls, or try executing it directly from that directory (python3 ./office2john.py).
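The check above can also be scripted. This is only a minimal sketch: the function name find_office2john is made up for illustration, and the fallback pattern simply takes the snap path used in this article with the revision number replaced by a wildcard, which is an assumption about your installation.

```python
import glob
import shutil


def find_office2john(snap_pattern="/snap/john-the-ripper/*/run/office2john.py"):
    """Return a path to office2john, or None if it cannot be found."""
    # 1. Look on PATH under the two names the script is usually invoked by.
    for name in ("office2john", "office2john.py"):
        found = shutil.which(name)
        if found:
            return found
    # 2. Fall back to the snap layout from this article (revision wildcarded).
    #    This location is an assumption; adjust it to your own install.
    matches = sorted(glob.glob(snap_pattern))
    return matches[-1] if matches else None


if __name__ == "__main__":
    path = find_office2john()
    print(path or "office2john not found; check your John the Ripper install")
```

If the function returns a path ending in .py, run it with python3 in front, as in the command that follows.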

Try this Command:

python3 /snap/john-the-ripper/639/run/office2john.py ./Secret.docx > hash.txt && sed 's/[^:]*:\(.*\)/\1/' hash.txt > temp.txt && mv temp.txt hash.txt

Article image

Part 1: Hash Extraction

  • **python3**: This invokes the Python interpreter. The script office2john.py is written in Python, so it requires Python to run.

  • **/snap/john-the-ripper/639/run/office2john.py**: This is the path to the office2john.py script, which is part of the John the Ripper toolset. The script is specifically designed to extract password hashes from Office files. The revision number in the path (639 here) depends on your installation.

  • **./Secret.docx**: The path to the Office document from which you want to extract the hash. This path suggests that the document is located in the current directory.

  • **>**: This operator redirects the output of the command on the left (the output from office2john.py) into the file on the right (hash.txt).

  • **hash.txt**: This file is where the extracted hash will be stored.

Part 2: Format Adjustment

  • **sed 's/[^:]*:\(.*\)/\1/' hash.txt > temp.txt**: This command uses sed, a stream editor, to process the text in hash.txt. The sed command used here removes everything before the first colon, including the colon itself, effectively isolating the hash. It uses a regular expression to accomplish this:

  • [^:]*: matches all characters up to and including the first colon (filename and colon).

  • \(.*\) captures everything after the colon (the hash).

  • The entire line is then replaced with just the captured group, which is the hash.

  • **temp.txt**: A temporary file where the processed hash is stored.

Part 3: Update Original Hash File

  • **mv temp.txt hash.txt**: This command moves or renames temp.txt to hash.txt, effectively updating the original hash file with the processed data. This step ensures that hash.txt now only contains the necessary hash data, without the filename or any other preceding characters.
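To see what the sed step does without touching any files, here is a small Python sketch of the same substitution. The function name is hypothetical and the sample line is a placeholder in "filename:hash" form, not real output from office2john.

```python
import re


def strip_filename_prefix(line):
    """Drop everything up to and including the first colon, like the sed call."""
    # Same idea as sed 's/[^:]*:\(.*\)/\1/': a run of non-colon characters,
    # one colon, then the rest of the line captured as group 1.
    return re.sub(r"^[^:]*:(.*)$", r"\1", line)


# Placeholder line; the hash body is NOT real data.
sample = "Secret.docx:$office$*2007*<fields omitted>"
print(strip_filename_prefix(sample))  # $office$*2007*<fields omitted>
```

A line without a colon is returned unchanged, which matches how sed behaves when the pattern does not match.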

2. Compatibility with Cracking Tools

Tools like [hashcat](https://hashcat.net/hashcat/) and John the Ripper are designed to work with hashes rather than directly with files or passwords. They use various algorithms to attempt to match provided hashes with hashes generated from potential passwords. In essence, these tools need the specific hash data to function correctly. By converting the Office file into a hash format using tools like office2john, you transform the password protection into a form that these cracking tools can process.
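The matching idea can be illustrated with a toy example. This is a sketch only: toy_hash uses PBKDF2-HMAC-SHA1 as a stand-in and is not the scheme MS Office uses, and the salt, iteration count, and candidate list are made up for the demonstration.

```python
import hashlib


def toy_hash(password, salt):
    """Toy stand-in for a slow password hash (NOT the MS Office scheme)."""
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt, 1000).hex()


def crack(target_hex, salt, candidates):
    """Return the first candidate whose hash equals the target, else None."""
    for candidate in candidates:
        if toy_hash(candidate, salt) == target_hex:
            return candidate
    return None


salt = b"demo-salt"
target = toy_hash("123456", salt)  # pretend this value came out of hash.txt
print(crack(target, salt, ["password", "qwerty", "123456"]))  # 123456
```

Real tools do the same compare-and-repeat loop, only with the correct algorithm for the file format and with far more candidates per second.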

3. Efficiency and Focus

When you extract the hash from an Office file, you’re focusing the password cracking effort directly on what needs to be decoded, the password’s hash, rather than dealing with the entire file encryption scheme. This makes the cracking process more direct and efficient because the tool can concentrate all its computational power on breaking the hash, rather than navigating through file encryption methods, which might include additional complexities.

4. Enables Automated and Targeted Attacks

Creating a hash file allows the use of automated tools that can apply complex, targeted attacks like brute force, dictionary attacks, and others. These tools can handle large volumes of data and apply sophisticated patterns and methods to efficiently crack the password. Without converting the Office file’s protection into a hash, leveraging these powerful tools wouldn’t be possible.

For example, suppose I know that the password for this file consists only of digits and is at most 7 characters long.

Command:

hashcat -a 3 -m 9400 --increment --increment-min 1 --increment-max 7 hash.txt ?d?d?d?d?d?d?d

Article image

Detailed Breakdown of the Command

  • hashcat: This is the command to invoke the hashcat tool, which is one of the most powerful password recovery tools available, supporting numerous algorithms and attack modes.

  • List with all options here.

  • -a 3: Sets the attack mode to 3, which is brute force. This mode attempts to crack passwords by trying every possible combination within the defined character set and mask.

  • List with all attack modes here.

  • -m 9400: Sets the hash mode to 9400, which is MS Office 2007. If hashcat does not accept the hash, try the other MS Office modes (for example 9500 for Office 2010 or 9600 for Office 2013). The mode is necessary because different types of hashes require different handling and algorithms for effective cracking.

  • Table with hash modes here.

  • --increment: This option enables mask increment. It is particularly useful when you do not know the exact length of the password but you have a range in mind. hashcat starts at the shortest length and increases it until the password is found or the specified maximum is reached.

  • --increment-min 1: Sets the minimum starting length for the incremental attack at 1, meaning hashcat will start by trying all single-digit possibilities.

  • --increment-max 7: Sets the maximum length for the incremental attack at 7, meaning hashcat will increment the password length up to 7 digits, trying all combinations at each length.

  • hash.txt: This is the file containing the hash you aim to crack. This file should be prepared beforehand, containing the hash data extracted from the target Office file.

  • ?d?d?d?d?d?d?d: This mask tells hashcat to use digits (0-9) in every position. In this command, hashcat starts with the first ?d and adds one position at a time, up to the seven digits specified.
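It helps to know how many candidates such a mask produces before starting. A rough sketch of the arithmetic (the function name is hypothetical): each ?d position has 10 possible values, and with increment every length from the minimum to the maximum is tried.

```python
def mask_keyspace(charset_size, min_len, max_len):
    """Count candidates for a uniform mask tried at every length in the range."""
    return sum(charset_size ** length for length in range(min_len, max_len + 1))


# ?d has 10 symbols; lengths 1..7 match --increment-min 1 / --increment-max 7.
total = mask_keyspace(10, 1, 7)
print(total)  # 11111110
```

Dividing this number by the speed hashcat reports on your hardware gives a worst-case estimate of how long the run can take.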

Article image

Done! Password was found — “123456”

Command flow for password-protected Office file simple brute force:

python3 /snap/john-the-ripper/639/run/office2john.py ./Secret.docx > hash.txt && sed 's/[^:]*:\(.*\)/\1/' hash.txt > temp.txt && mv temp.txt hash.txt
hashcat -a 3 -m 9400 --increment --increment-min 1 --increment-max 7 hash.txt ?d?d?d?d?d?d?d

For more complicated passwords, I use a dictionary attack:

Download or create a file with passwords (a dictionary).
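If you create the dictionary yourself, a few lines of Python are enough. This is a minimal sketch; the base words, the suffixes, and both function names are made up for illustration, and a real list would be built from what you know about the target.

```python
from itertools import product


def build_wordlist(base_words, suffixes):
    """Combine each base word with each suffix, keeping first-seen order."""
    seen = set()
    candidates = []
    for word, suffix in product(base_words, suffixes):
        candidate = word + suffix
        if candidate not in seen:  # skip duplicates
            seen.add(candidate)
            candidates.append(candidate)
    return candidates


def write_wordlist(path, candidates):
    """Write one candidate per line, the layout a dictionary attack expects."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(candidates) + "\n")


words = build_wordlist(["secret", "office"], ["", "1", "123"])
print(words)
# ['secret', 'secret1', 'secret123', 'office', 'office1', 'office123']
```

Calling write_wordlist("custom.txt", words) would then save the list to a file (the file name is just an example).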

Article image

Use this list with hashcat to run a dictionary attack.

Command:

hashcat -a 0 -m 9400 ./hash.txt ./best1050.txt

Article image

Detailed Breakdown of the Command

  • hashcat: This is the command to invoke the hashcat tool, which is one of the most powerful password recovery tools available, supporting numerous algorithms and attack modes.

  • List with all options here.

  • -a 0: Sets the attack mode to 0, which is a dictionary attack. In this mode, hashcat uses a list of predefined words or phrases as potential passwords from a specified wordlist file.

  • List with all attack modes here.

  • -m 9400: Sets the hash mode to 9400, which is MS Office 2007. If hashcat does not accept the hash, try the other MS Office modes (for example 9500 for Office 2010 or 9600 for Office 2013). The mode is necessary because different types of hashes require different handling and algorithms for effective cracking.

  • Table with hash modes here.

  • hash.txt: This is the file containing the hash you aim to crack. This file should be prepared beforehand, containing the hash data extracted from the target Office file.

  • best1050.txt: This is the wordlist or dictionary file that hashcat will use as the source of potential passwords. The file "best1050.txt" should contain a list of passwords that hashcat will try against the hash. Each line in the file should represent a different password attempt.
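Picking the -m value can be partly automated, because the extracted hash names the Office version near its start. The sketch below is an assumption-laden helper, not part of hashcat or John the Ripper: the function name is hypothetical, the mode numbers are the ones I understand hashcat's mode table to list for "$office$" hashes, and the sample string is a placeholder rather than a real hash.

```python
import re

# Mode numbers as I understand hashcat's table for "$office$" hashes.
OFFICE_MODES = {"2007": 9400, "2010": 9500, "2013": 9600}


def guess_hashcat_mode(hash_line):
    """Return a likely -m value for an office2john hash, or None if unknown."""
    match = re.search(r"\$office\$\*(\d{4})\*", hash_line)
    if match is None:
        return None  # e.g. legacy "$oldoffice$" hashes, which use other modes
    return OFFICE_MODES.get(match.group(1))


print(guess_hashcat_mode("$office$*2013*<fields omitted>"))  # 9600
```

When the helper returns None, check the hash mode table by hand instead of guessing.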

Article image

Done! Password was found — “123456”

Command flow for a password-protected Office file, dictionary attack:

python3 /snap/john-the-ripper/639/run/office2john.py ./Secret.docx > hash.txt && sed 's/[^:]*:\(.*\)/\1/' hash.txt > temp.txt && mv temp.txt hash.txt
hashcat -a 0 -m 9400 ./hash.txt ./best1050.txt
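If you run these attacks often, it can be handy to build the hashcat command lines in a script. A minimal sketch, assuming only the flags already used in this article; the two function names are hypothetical, and nothing is executed here, the commands are only printed.

```python
import shlex


def mask_attack_cmd(hash_file, max_len, mode=9400):
    """Build the argv for a digits-only mask attack with increment."""
    return ["hashcat", "-a", "3", "-m", str(mode), "--increment",
            "--increment-min", "1", "--increment-max", str(max_len),
            hash_file, "?d" * max_len]


def dictionary_attack_cmd(hash_file, wordlist, mode=9400):
    """Build the argv for a dictionary attack."""
    return ["hashcat", "-a", "0", "-m", str(mode), hash_file, wordlist]


# shlex.join (Python 3.8+) quotes arguments so the line is safe to paste.
print(shlex.join(mask_attack_cmd("hash.txt", 7)))
print(shlex.join(dictionary_attack_cmd("./hash.txt", "./best1050.txt")))
```

To actually launch an attack, the returned list could be passed to subprocess.run(cmd, check=True) on a machine where hashcat is installed.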

Good luck!

1200km@gmail.com