Passphrase-wordlist - Shameless clone of passphrase wordlist

Overview

This repository is NOT official -- the original repository is located on GitLab at https://gitlab.com/initstring/passphrase-wordlist

This repository is only a tribute

Overview

People think they are getting smarter by using passphrases. Let's prove them wrong!

This project includes a massive wordlist of phrases (over 20 million) and two hashcat rule files for GPU-based cracking. The rules will create over 1,000 permutations of each phase.

To use this project, you need:

  • The wordlist hosted here (right-click, save-as).
  • Both hashcat rules here.

WORDLIST LAST UPDATED: 2021-10-04

Usage

Generally, you will use with hashcat's -a 0 mode which takes a wordlist and allows rule files. It is important to use the rule files in the correct order, as rule #1 mostly handles capital letters and spaces, and rule #2 deals with permutations.

Here is an example for NTLMv2 hashes: If you use the -O option, watch out for what the maximum password length is set to - it may be too short.

hashcat -a 0 -m 5600 hashes.txt passphrases.txt -r passphrase-rule1.rule -r passphrase-rule2.rule -O -w 3

Sources Used

Some sources are pulled from a static dataset, like a Kaggle upload. Others I generate myself using various scripts and APIs. I might one day automate that via CI, but for now you can see how I update the dynamic sources here.

source file name source type description
wiktionary-2021-09-29.txt dynamic Article titles scraped from Wiktionary's index dump here.
wikipedia-2021-09-29.txt dynamic Article titles scraped from the Wikipedia pages-articles-multistream-index dump generated 29-Sept-2021 here.
urban-dictionary-2021-09-29.txt dynamic Urban Dictionary dataset pulled using this script.
know-your-meme-2021-09-29.txt dynamic Meme titles from KnownYourMeme scraped using my tool here.
imdb-titles-2021-09-29.txt dynamic IMDB dataset using the "primaryTitle" column from title.basics.tsv.gz file available here
global-poi-2021-09-29.txt dynamic Global POI dataset using the 'allCountries' file from 29-Sept-2021.
billboard-titles-2021-10-04.txt dynamic Album and track names using Ultimate Music Database, scraped with a fork of mwkling's tool, modified to grab Billboard Singles (1940-2021) and Billboard Albums (1970-2021) charts.
billboard-artists-2021-10-04.txt dynamic Artist names using Ultimate Music Database, scraped with a fork of mwkling's tool, modified to grab Billboard Singles (1940-2021) and Billboard Albums (1970-2021) charts.
book.txt static Kaggle dataset with titles from over 300,000 books.
rstone-top-100.txt static
(could be dynamic in future)
Song lyrics for Rolling Stone's "top 100" artists using my lyric scraping tool.
cornell-movie-titles-raw.txt static Movie titles from this Cornell project.
cornell-movie-lines.txt static Movie lines from this Cornell project.
author-quotes-raw.txt static Quotables dataset on Kaggle.
1800-phrases-raw.txt static 1,800 English Phrases.
15k-phrases-raw.txt static 15,000 Useful Phrases.

Hashcat Rules

The rule files are designed to both "shape" the password and to mutate it. Shaping is based on the idea that human beings follow fairly predictable patterns when choosing a password, such as capitalising the first letter of each word and following the phrase with a number or special character. Mutations are also fairly predictable, such as replacing letters with visually-similar special characters.

Given the phrase take the red pill the first hashcat rule will output the following:

take the red pill
take-the-red-pill
take.the.red.pill
take_the_red_pill
taketheredpill
Take the red pill
TAKE THE RED PILL
tAKE THE RED PILL
Taketheredpill
tAKETHEREDPILL
TAKETHEREDPILL
Take The Red Pill
TakeTheRedPill
Take-The-Red-Pill
Take.The.Red.Pill
Take_The_Red_Pill

Adding in the second hashcat rule makes things get a bit more interesting. That will return a huge list per candidate. Here are a couple examples:

[email protected]!
[email protected]
taketheredpill2020!
T0KE THE RED PILL

Additional Info

Optionally, some researchers might be interested in:

  • The raw source files mentioned in the table above. You can download them by appending the file name to https://f002.backblazeb2.com/file/passphrase-wordlist/.
  • The script I use to clean the raw sources into the wordlist here.

The cleanup script works like this:

$ python3.6 cleanup.py infile.txt outfile.txt
Reading from ./infile.txt: 505 MB
Wrote to ./outfile.txt: 250 MB
Elapsed time: 0:02:53.062531

Enjoy!

Owner
Jeff McJunkin
Jeff McJunkin
CamOver is a camera exploitation tool that allows to disclosure network camera admin password.

CamOver is a camera exploitation tool that allows to disclosure network camera admin password. Features Exploits vulnerabilities in most popul

EntySec 247 Jan 02, 2023
Python-based proof-of-concept tool for generating payloads that utilize unsafe Java object deserialization.

Python-based proof-of-concept tool for generating payloads that utilize unsafe Java object deserialization.

Astro 9 Sep 27, 2022
SARA - Simple Android Ransomware Attack

SARA - Simple Android Ransomware Attack Disclaimer The author is not responsible for any issues or damage caused by this program. Features User can cu

Termux Hackers 99 Jan 04, 2023
OpenSource Poc && Vulnerable-Target Storage Box.

reapoc OpenSource Poc && Vulnerable-Target Storage Box. We are aming to collect different normalized poc and the vulerable target to verify it. Now re

cckuailong 560 Dec 23, 2022
Colin O'Flynn's Hacakday talk at Remoticon 2021 support repo.

Hardware Hacking Resources This repo holds some of the examples used in Colin's Hardware Hacking talk at Remoticon 2021. You can see the very sketchy

Colin O'Flynn 19 Sep 12, 2022
An experimental script to perform bulk parsing of arbitrary file features with YARA and console logging.

RonnieColemanYARAParser This script is named after Ronnie Coleman, and peforms bulk lifts on arbitary file features using YARA console logging. Requir

Steve 20 Dec 13, 2022
This repository uses a mixture of numbers, alphabets, and other symbols found on the computer keyboard

This repository uses a mixture of numbers, alphabets, and other symbols found on the computer keyboard to form a 16-character password which is unpredictable and cannot easily be memorised.

Mohammad Shaad Shaikh 1 Nov 23, 2021
A Python Bytecode Disassembler helping reverse engineers in dissecting Python binaries

A Python Bytecode Disassembler helping reverse engineers in dissecting Python binaries by disassembling and analyzing the compiled python byte-code(.pyc) files across all python versions (including P

neeraj 95 Dec 26, 2022
Execution After Redirect (EAR) / Long Response Redirection Vulnerability Scanner written in python3

Execution After Redirect (EAR) / Long Response Redirection Vulnerability Scanner written in python3, It Fuzzes All URLs of target website & then scan them for EAR

Pushpender Singh 9 Dec 12, 2022
A small script to export all AWAF policies from a BIG-IP device

This script leverages BIG-IP iControl REST API to export ALL AWAF policies in the system and saves them locally. The policies can be exported in the following formats: xml, plc and json.

3 Feb 03, 2022
Searches filesystem for CVE-2021-44228 and CVE-2021-45046 vulnerable instances of log4j library, including embedded (jar/war/zip) packaged ones.

log4shell_finder Python port of https://github.com/mergebase/log4j-detector log4j-detector is copyright (c) 2021 - MergeBase Software Inc. https://mer

Hynek Petrak 33 Jan 04, 2023
Script for automatic dump and brute-force passwords using Volatility Framework

Volatility-auto-hashdump Script for automatic dump and brute-force passwords using Volatility Framework

whoamins 11 Apr 11, 2022
Proof of Concept Exploit for ManageEngine ServiceDesk Plus CVE-2021-44077

CVE-2021-44077 Proof of Concept Exploit for CVE-2021-44077: PreAuth RCE in ManageEngine ServiceDesk Plus 11306 Based on: https://xz.aliyun.com/t/106

Horizon 3 AI Inc 25 Nov 09, 2022
an impacket-dependent script exploiting CVE-2019-1040

dcpwn an impacket-dependent script exploiting CVE-2019-1040, with code partly borrowed from those security researchers that I'd like to say thanks to.

QAX A-Team 71 Nov 30, 2022
A DOM-based G-Suite password sprayer and user enumerator

A DOM-based G-Suite password sprayer and user enumerator

Mayk 1 Apr 07, 2022
Simple and easy framework for phishing 🎣

👋 It's in beta, I'm still building How to install Linux and Termux: Clone Rp: git clone https://github.com/J4c5/superfish.git Install the dependencie

Jack 4 Jan 27, 2022
🔍 IRIS: An open-source intelligence framework

IRIS is an open-source OSINT framework, consisting of modules to find information about a target by scraping sites and fetching data from APIs.

IRIS 79 Dec 20, 2022
com_media allowed paths that are not intended for image uploads to RCE

CVE-2021-23132 com_media allowed paths that are not intended for image uploads to RCE. CVE-2020-24597 Directory traversal in com_media to RCE Two CVEs

KIEN HOANG 67 Nov 09, 2022
About Hive Burp Suite Extension

Hive Burp Suite Extension Description Hive extension for Burp Suite. This extension allows you to send data from Burp to Hive in one click. Create iss

7 Dec 07, 2022
IDA plugin for quickly copying disassembly as encoded hex bytes

HexCopy IDA plugin for quickly copying disassembly as encoded hex bytes. This whole plugin just saves you two extra clicks... but if you are frequentl

OALabs 46 Oct 30, 2022