Mathias Bynens

PBKDF2+HMAC hash collisions explained

· tagged with Bash, cryptography, JavaScript, Python

Crypto enthousiast Christian ‘CodesInChaos’ Winnerlein recently tweeted:

plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd and eBkXQTfuBqp'cTcar&g* have the same PBKDF2-HMAC-SHA1 hash.

This intrigued me, so I decided to find out what is going on exactly, and why this happens. If you’re curious too, keep reading.

To confirm these findings, I wrote a Node.js script:

#!/usr/bin/env node

var crypto = require('crypto');
var assert = require('assert');

var salt = 'hunter2'; // can be anything
var iterations = 4; // can be any number
var keyLength = 16; // can be any number

var hash = function(passphrase) {
return crypto.pbkdf2Sync(passphrase, salt, iterations, keyLength).toString();
};

var string1 = 'plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd';
var string2 = 'eBkXQTfuBqp\'cTcar&g*';

var hash1 = hash(string1);
var hash2 = hash(string2);

assert(string1 != string2, 'Passwords should be different');
assert(hash1 == hash2, 'Hashes should be the same (collision)');

Running the script confirms that both strings indeed have the same PBKDF2-HMAC-SHA1 hash.

Explanation

PBKDF2 is a widely used method to derive a key of given length based on a given password, salt and number of iterations. In this case it specifically uses HMAC with the SHA-1 hash function, which is the default as per RFC2898.

HMAC has an interesting property: if a supplied key is longer than the block size of the hash function that’s being used, it uses the hash of the key rather than the key itself.

SHA-1 has a block size of 512 bits, which equals 64 bytes.

So in this case, if the supplied key takes up more than 64 bytes, then SHA1(key) is used as the key. More generally, for any chosen_password larger than 64 bytes, the following holds true (pseudo-code):

PBKDF2_HMAC_SHA1(chosen_password) == PBKDF2_HMAC_SHA1(HEX_TO_STRING(SHA1(chosen_password)))

Note that the smallest password of the two always has a length of 20 characters, because SHA1 hashes always consist of exactly 40 hexadecimal digits, and two hexadecimal digits are used for each character in the colliding password.

For example, in Bash:

$ echo -n 'plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd' | openssl dgst -sha1 | xxd -r -p
eBkXQTfuBqp'cTcar&g*

That is why plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd and eBkXQTfuBqp'cTcar&g* have the same PBKDF2-HMAC-SHA1 hash.

Consequences

This effectively means you can come up with as many PBKDF2-HMAC-SHA1 collisions as you like. In fact, as long as PBKDF2 is used in combination with HMAC and any hashing algorithm, the same trick can be applied — the only variable is the hash function’s block size. It’s trivial to find colliding passwords when hashing with PBKDF2-HMAC-anything.

So why did Chris choose plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd and eBkXQTfuBqp'cTcar&g*, of all possible collisions? Wouldn’t it be more fun to tweet about a collision with, say, lolololololololololololololololololololololololololololololololol?

$ echo -n 'lolololololololololololololololololololololololololololololololol' | openssl dgst -sha1
6ff0b2d76dae24f8b58472ba19f918f07359c0c0

$ echo -n 'lolololololololololololololololololololololololololololololololol' | openssl dgst -sha1 | xxd -r -p
o��m�$�r���sY��

While it’s easy to find a collision for any given string larger than 64 bytes (just run the above Bash command), it gets trickier if you want the colliding password to consist of readable ASCII characters only. SHA1 hashes can contain any hexadecimal digits, and converting such a hash back into a string is likely to result in at least one character outside of the printable ASCII range ([\x20-\x7E]).

I wrote a Python script to brute-force PBKDF2-HMAC-SHA1 collisions where the large (> 64 bytes) password has a prefix of choice, and where the colliding password consists of printable ASCII characters only.

#!/usr/bin/env python
# coding=utf-8

import hashlib
import itertools
import re
import string
import sys

TOTAL_LENGTH = 65
PREFIX = sys.argv[1] if len(sys.argv) > 1 else ''

prefix_length = len(PREFIX)
brute_force_length = TOTAL_LENGTH - prefix_length
passwords = itertools.product(string.ascii_lowercase, repeat=brute_force_length)
regex_printable = re.compile('[\x20-\x7E]+$')
base_hasher = hashlib.sha1()
base_hasher.update(PREFIX)

for item in itertools.imap(''.join, passwords):
hasher = base_hasher.copy()
hasher.update(item)
sha1_hash = hasher.digest()
if regex_printable.match(sha1_hash):
print u'%s \U0001F4A5 %s'.encode('utf-8') % (PREFIX + item, sha1_hash)

First, I let it run for a few hours without specifying a prefix, and it found the following collisions:

$ ./brute-force.py
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaadzuyfdt 💥 /JRb+z%,6f{$;*|#\LHT
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaebihgje 💥 @3d9ggezHn@iy,vV/#YC
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaagkpicoe 💥 m;Ec4m@1JW)TOSgGl3ZO
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaidwsoeu 💥 Y#pt*^.[}~.6jx!:fu'P
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaitkpvbh 💥 RFvc?%tbygGt(fy7G*+,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaajhixpyq 💥 @x!iEK2B*N]X`S$u"CEV
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaakidzupe 💥 1t_lP?o}R;YWoJPF7!GY
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaamyrpmpv 💥 &nbSlEfC.X`D0(l)x[tV
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaandkfdci 💥 %U;/> ,3S/4dv!fUku*N
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaowdgicp 💥 b<-'^;Qt7~G[8>\6=wH(
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaqdpodre 💥 EmjaaG|_\Eq;+Wgl%<@)
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaqsmqyjo 💥 oZD49:*Cd)PFCubU[^)_
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaasbvoipq 💥 Woyw!itp af;uJo'Z-x#
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaslsvwra 💥 +ME:wn{F[<f_Zw%yWN\j
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaatljisbf 💥 w<0k!([95gEP%G^?&tP*
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaavdramcj 💥 ?!`e6 m]e/JJubY`|ZM1
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaavutrypa 💥 d63mH`L=IW3Ucwb.FRec
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaawdxljmb 💥 h?2O+Pm5^x|^`du`A:@^
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaxdovzru 💥 ks*A]XD!U1I4[`!:+@s)

Brute-forcing collisions with chosen prefixes is much more fun, though:

$ ./brute-force.py 'trololololololololololololololololololololololololololo'
trololololololololololololololololololololololololololoaaabaslrsc 💥 M,2p[;|4rkZQ,G5`W 9j
trololololololololololololololololololololololololololoaaablkqqic 💥 Fs>"Y0,'Pj:.D8SYiJ!4
trololololololololololololololololololololololololololoaaacblecnb 💥 _0jZI5O+23MGs-1*I\oQ
trololololololololololololololololololololololololololoaaaltyeuck 💥 &8T-gvxEwb`>r\E_.E2@
trololololololololololololololololololololololololololoaaalyxrcss 💥 TGPwn!L?}]x20&U}[^*&
trololololololololololololololololololololololololololoaaaotcvgrg 💥 _b7z7m(^!?`wI;!_~vl<
trololololololololololololololololololololololololololoaaarbaxieg 💥 zR;cd7\'~6zr,*NHQsd$
trololololololololololololololololololololololololololoaaarwxcedc 💥 D`nPK5Is$>.vO9Pi!8\c
trololololololololololololololololololololololololololoaaasghxurd 💥 VI\T]xa:oZ{H04g0`|).
trololololololololololololololololololololololololololoaaayavvpbc 💥 {Aj_idpOz)S/mmc$k/aT
trololololololololololololololololololololololololololoaaazdtxalo 💥 xgdMs~n(Y<jD"?3^#"?}
trololololololololololololololololololololololololololoaabdfbdoeb 💥 Bz+\his88.X1=5@{%<vW
trololololololololololololololololololololololololololoaabhtispji 💥 _z)7#`~I8$!q$x#h_VUI
trololololololololololololololololololololololololololoaabjvvgkfw 💥 h7Kt8oO&#r$jG=BM^DVh
trololololololololololololololololololololololololololoaabkgljluc 💥 agtQL=tZ55nk['ts1t'|
trololololololololololololololololololololololololololoaabmaaoavm 💥 ~;Zp2M=!k>hE[c]6C'nI
trololololololololololololololololololololololololololoaabmhyfjbt 💥 \pEYgz=BP`W3e89zov-R
trololololololololololololololololololololololololololoaabnyplvil 💥 &_f[/u^R@CL|PnNgYI{V
trololololololololololololololololololololololololololoaabogonxnp 💥 FD(}rO@+GIpZo<];#<^o
trololololololololololololololololololololololololololoaabpskyycf 💥 j./w=Ct}HQF:Y5>@a\\Y
trololololololololololololololololololololololololololoaabsnvfhys 💥 ;>[}h6^ha.SNtW:]0`|1
trololololololololololololololololololololololololololoaabtuuecis 💥 emmP*T3"k4$iObPFRFxm
trololololololololololololololololololololololololololoaabuhtrkfz 💥 bMYCnc[o}r\}gcRaL^CG


$ ./brute-force.py 'collision_my_mission_when_the_dawn_breaks_with_a_hand_'
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaaaappxze 💥 uYmwXw'c04\LHh=i!E|*
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaaarcgqpq 💥 s%K>G0KHOdx#l*_y"pDj
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaabejvnge 💥 zI%j'd1f4$U&iZ@w|$!
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaabzkzrce 💥 <9.f8Zu5e?H"U&;ha!8%
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaacbszkmz 💥 k^VyGxWJJNy&{^Aa"}Nl
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaaddcwxmd 💥 oSnr*x=`Ky4`b<{Uh+C@
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaadqzyizo 💥 &6JZj4jQ0(E0[LrN'K:Y
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaadsjngbc 💥 DRq9</HIxBLD9Q]Ll6Bj
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaafcgqfrb 💥 6f.ZfjBU2TLkd2k\+Qxu
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaafgrkiwn 💥 Ac#Db(cOs\VWz6PU6/sU
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaafzbytnn 💥 M"F<(|/*K5TS~`:?[kjG
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaafzwvfen 💥 Z)Jd;6Ql[j5#EE8)=f/U
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaagkntuew 💥 voC,;s}M;vfggl}}6sr'
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaagssyiwu 💥 F..Ncx3Z.C]/2n.XX%bu
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaaibapezf 💥 v/+|qU1erE#4(<ko0D P
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaajdxotoa 💥 dNLKU-sFik_=AK[FKJTb
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaakhddjzb 💥 n(?}=^N^52WaozN7*UAh
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaakxafnmx 💥 "Dy.9Giik;[i$cC#hP_H
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaamdzdfpk 💥 5wE}l`Rn]nhiX]N/"EII
collision_my_mission_when_the_dawn_breaks_with_a_hand_aaaamofydxb 💥 z\ c_>Q`F?s;jybKeM#(


$ ./brute-force.py 'chosen-prefix_hash_collisions_ftw_'
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaabcjenbr 💥 e=D&^kQ^qs+^PRL!j_.q
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaafikpjor 💥 |BHwN@zatb0TT:5I3|7<
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaaguatnrr 💥 #f.FDRuJz1V%"i*8nLOz
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaahqmauyt 💥 dUH}]4wi($\O/L`fej0[
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaaifuaosw 💥 iKd3djP<]g(WLt%2>Uq;
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaajkzsniz 💥 D>e{,OD%KwQb-y\67nn2
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaajzgkoyv 💥 WGoWj5#JC+N! w*oH.A
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaanlmxgoc 💥 >IZASesr9T+_bA?64 LU
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaaogmbbtc 💥 #f,U~l|=7p<L{'l;wrcr
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaaolrkjhw 💥 D.^GkKUq$%yn$kmT_,'Z
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaapetgdgr 💥 6I}z8;8M4Nts,D~D&Z-@
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaapgkvttr 💥 auk+ *^c%1^\PC2qVi;g
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaapvkfwkv 💥 .,vNus(7m0'TyIEKHaKX
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaaqlxcssl 💥 ri%kSz\<mF +|,}=~NB9
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaasacwjzq 💥 Zx25[b K:94U>DPvV/(P
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaasqapqmw 💥 sc'S^?vOtrg;"8gF)U}f
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaatczlqqs 💥 bMob/6%C,=?an0fCtl^O
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaaxtjsnyi 💥 E0|^i;HKYSht*@T'FAqc
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaazjcrflx 💥 `]|D]>=4b8O>0{vwZ],e
chosen-prefix_hash_collisions_ftw_aaaaaaaaaaaaaaaaaaaaaaaazxwmsbm 💥 G0FFrj-[w]m pfx>W$Un

Comments

Thomas Geraghty wrote on :

Cool article as always! I’m just curious as to why you switched from Node to Python for the brute force collisions? I’m not trying to push some sort of Lang A vs. Lang B agenda, just wondering if there was some reasoning behind your decision to switch that I’m missing :)

wrote on :

Thomas: The reason I used Python for the brute-force script is the amazingly powerful itertools.product. You could do something similar in JavaScript, but you’d have to write it yourself.

On the other hand, I used Node.js to confirm the collisions because its built-in crypto module supports PBKDF2 out of the box.

Solar Designer wrote on :

A couple of weeks ago, I tweeted this:

scrypt(PBKDF2-HMAC-SHA256-fail-affects-scrypt-no-security-issue-bGoDFpr8) = scrypt(;`B3nR6wQ2-_LSg"mH #yszm`[#z8B&L) for any salt, N, r, p

The embedded " and ` are not great, but on the other hand there’s no '. The total number of different characters in the shorter password is 26. (The total number of different characters across both passwords could be made much smaller by repeating the same character in place of the message. Then it’d probably contain only 9 different characters.)

This may be usable to check whether a site uses something based on HMAC-SHA256 (with PBKDF2’s order of key and salt inputs) or not.

CodesInChaos wrote on :

I chose a random long password because I preferred it looking random. More mysterious that way — a chosen password makes it too obvious it’s just a cheap trick.

My code is quite similar to yours. I created a random 65-char string as starting point. Then on each iteration increment it in Base26. Compute its SHA-1 hash, check if the hash is nice and printable. Repeat.

Z wrote on :

The moral of the story? Using too looooooong passwords is not as secure as one might think.

I saw admins using 25-character-long passwords, while the hash was based on DES (yes, we are talking about Oracle 9 passwords).

Leave a comment

Comment on “PBKDF2+HMAC hash collisions explained”

Some Markdown is allowed; HTML isn’t. Keyboard shortcuts are available.

It’s possible to add emphasis to text:

_Emphasize_ some terms. Perhaps you’d rather use **strong emphasis** instead?

Select some text and press + I on Mac or Ctrl + I on Windows to make it italic. For bold text, use + B or Ctrl + B.

To create links:

Here’s an inline link to [Google](http://www.google.com/).

If the link itself is not descriptive enough to tell users where they’re going, you might want to create a link with a title attribute, which will show up on hover:

Here’s a [poorly-named link](http://www.google.com/ "Google").

Use backticks (`) to create an inline <code> span:

In HTML, the `p` element represents a paragraph.

Select some inline text and press + K on Mac or Ctrl + K on Windows to make it a <code> span.

Indent four spaces to create an escaped <pre><code> block:

    printf("goodbye world!"); /* his suicide note
was in C */

Alternatively, you could use triple backtick syntax:

```
printf("goodbye world!"); /* his suicide note
was in C */
```

Select a block of text (more than one line) and press + K on Mac or Ctrl + K on Windows to make it a preformatted <code> block.

Quoting text can be done as follows:

> Lorem iPad dolor sit amet, consectetur Apple adipisicing elit,
> sed do eiusmod incididunt ut labore et dolore magna aliqua Shenzhen.
> Ut enim ad minim veniam, quis nostrud no multi-tasking ullamco laboris
> nisi ut aliquip iPad ex ea commodo consequat.

Select a block of text and press + E on Mac or Ctrl + E on Windows to make it a <blockquote>.