Password Patterns

In December 2009, a major data breach occurred on the Internet. Around 32 million user passwords of the rockyou.com web portal were stolen by an attacker who used SQL injection. He obtained all the passwords and made them available for download on the Internet in anonymized form (i.e. without usernames).

Security experts started analyzing the passwords, and Imperva released a study on their security level. Its results were as follows:

Key findings

  • About 30% of users chose passwords with a length of six characters or fewer.
  • Almost 60% of users chose their passwords from a limited set of alphanumeric characters.
  • Nearly 50% of users used names, slang words, dictionary words or trivial passwords.
  • Only 0.2% of rockyou.com users had a password that could be considered strong based on NASA recommendations, which require that a password be eight characters or longer and contain a mixture of special characters, numbers, and both lower and upper case letters.

The 20 most commonly used passwords

1. 123456
2. 12345
3. 123456789
4. Password
5. iloveyou
6. princess
7. rockyou
8. 1234567
9. 12345678
10. abc123
11. Nicole
12. Daniel
13. babygirl
14. monkey
15. Jessica
16. Lovely
17. michael
18. Ashley
19. 654321
20. Qwerty

Password Length Distribution

As the figure shows, about 60% of the passwords are quite insecure: they contain only lower case letters, only upper case letters, or only digits. The remaining 40% are more secure and contain mixed letters, digits and/or special characters.

As security experts always repeat, a secure password must contain lower and upper case letters, numbers and special characters. This makes passwords more secure against brute-forcing and dictionary attacks.
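
The composition rule can be written down as a simple check. The following is a minimal sketch, assuming ASCII passwords, Python's string.punctuation as the set of special characters, and the eight-character minimum quoted in the key findings. The function name is hypothetical.

```python
import string


def meets_composition_rule(password: str, min_length: int = 8) -> bool:
    """Return True if the password is long enough and uses all four character classes."""
    has_lower = any(c in string.ascii_lowercase for c in password)
    has_upper = any(c in string.ascii_uppercase for c in password)
    has_digit = any(c in string.digits for c in password)
    has_special = any(c in string.punctuation for c in password)
    return (
        len(password) >= min_length
        and has_lower
        and has_upper
        and has_digit
        and has_special
    )


# A few passwords taken from the article itself.
for candidate in ["123456", "iloveyou", "Password1.", "z6iFk#rdlr"]:
    print(candidate, meets_composition_rule(candidate))
```

A check like this only looks at which character classes are present. It says nothing about how predictable the arrangement of the characters is.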

This raises the following question: do two passwords that have the same length and contain the same number of lower case letters, upper case letters, numbers and special characters provide the same level of security? The answer is no. Consider the two passwords “z6iFk#rdlr” and “Password1.”. Both contain 7 lower case letters, 1 upper case letter, 1 number and 1 special character. But the first is more secure than the second, since it appears to be randomly generated, whereas the second follows a pattern that can jeopardize its security. If many passwords share the same pattern, attackers can exploit it to run automated attacks similar to dictionary attacks. The pattern here consists of the following aspects:

  • The first letter is a capital letter.
  • The password is based on a dictionary word.
  • A number and then a special character are appended to the dictionary word.
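
The comparison of the two passwords can be reproduced with a short sketch. It counts the character classes of each password and then tests for the three-part pattern listed above. The helper names are hypothetical, ASCII input is assumed, and the regular expression does not check whether the letters form a dictionary word; it only checks the shape.

```python
import re
import string


def composition(password: str) -> dict:
    """Count lower case, upper case, digit and special characters."""
    return {
        "lower": sum(c in string.ascii_lowercase for c in password),
        "upper": sum(c in string.ascii_uppercase for c in password),
        "digit": sum(c in string.digits for c in password),
        "special": sum(c in string.punctuation for c in password),
    }


# One capital letter, lower case letters, one digit, one punctuation character.
PUNCT_CLASS = "[" + re.escape(string.punctuation) + "]"
CAPITALIZED_WORD_DIGIT_PUNCT = re.compile("[A-Z][a-z]+[0-9]" + PUNCT_CLASS)


def matches_pattern(password: str) -> bool:
    """Return True if the whole password has the shape 'Word' + digit + punctuation."""
    return CAPITALIZED_WORD_DIGIT_PUNCT.fullmatch(password) is not None


for candidate in ["z6iFk#rdlr", "Password1."]:
    print(candidate, composition(candidate), matches_pattern(candidate))
```

Both passwords produce the same counts, but only the second one matches the pattern.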

Security-minded people want to follow the recommendations for choosing secure passwords, but they cannot remember complicated, randomly generated ones. My impression has always been that they settle for a middle way: they choose a password with mixed character sets that is still easy to remember. This leads them to apply “password patterns”. To test this idea, I analyzed the 32.6 million passwords further. The aim of my analysis is to define some password patterns and measure how often each occurs in the password list.

The Analysis

For the analysis, I imported the 32.6 million passwords into a database table (the exact number is 32,603,348). I used the [:alpha:], [:digit:] and [:punct:] character class definitions to group the different character sets within passwords. These definitions represent the following character sets:

[:alpha:] Any alphabetic character, A to Z or a to z
[:digit:] Only the digits 0 to 9
[:punct:] Punctuation symbols (i.e. . , " ' ? ! ; : # $ % & ( ) * + - / < > = @ [ ] ^ _ { } | ~)
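
The grouping into character sets can be sketched in a few lines of Python. This is only an illustration, not the database queries used for the analysis: the helper names char_class and structure are hypothetical, and Python's string.ascii_letters, string.digits and string.punctuation are assumed to stand in for [:alpha:], [:digit:] and [:punct:] on ASCII input.

```python
import string
from itertools import groupby


def char_class(ch: str) -> str:
    """Map a single character to 'alpha', 'digit', 'punct' or 'other'."""
    if ch in string.ascii_letters:
        return "alpha"
    if ch in string.digits:
        return "digit"
    if ch in string.punctuation:
        return "punct"
    return "other"


def structure(password: str) -> str:
    """Collapse runs of the same class into one label, joined with '+'."""
    return "+".join(cls for cls, _run in groupby(password, key=char_class))


for candidate in ["password", "password.", "password1.", "passw0rd"]:
    print(candidate, "->", structure(candidate))
```

Collapsing each run of identical classes into one label gives a compact description of a password's shape, independent of its length.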

Password Patterns

The first pattern I analyzed is “concatenation” of different character sets. In this pattern, people append one or more character sets to another (for example, “password.” or “password1.”). The first is an example of the alpha+punct dual concatenation; the second is an example of the alpha+digit+punct triple concatenation.

The second pattern I analyzed is “replacement” of certain letters. In this pattern, people replace certain letters in a password with a digit or punctuation character. An example is “passw0rd”, where the letter o is replaced with the digit zero.

1. Concatenation Password Pattern

People concatenate different character sets. For example, they append a single number (mostly 1) or the “.” symbol to a dictionary word. The following sections give the frequencies of all possible concatenations of the character sets.

1.1. No Concatenation
For the sake of completeness, I analyzed the “no concatenation” case as well; that is, I searched for passwords containing only alpha, only digit or only punctuation characters. The following table shows the number of occurrences in the password list for each character set. According to the results, 44% of passwords contain only alpha characters (i.e. lower and/or upper case letters).

alpha 14,366,751 (44%)
digit 5,192,998 (16%)
punct 4,860 (0.015%)
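
The percentages in this table follow directly from the counts and the total of 32,603,348 passwords. The short sketch below redoes that arithmetic; the counts are taken from the table, while the variable and function names are hypothetical.

```python
TOTAL_PASSWORDS = 32_603_348

# Counts of passwords that consist of a single character set (from the table).
single_set_counts = {
    "alpha": 14_366_751,
    "digit": 5_192_998,
    "punct": 4_860,
}


def share(count: int, total: int = TOTAL_PASSWORDS) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


for name, count in single_set_counts.items():
    print(f"{name}: {share(count):.3f}%")

# Everything else must mix at least two character sets.
mixed = TOTAL_PASSWORDS - sum(single_set_counts.values())
print(f"mixed: {share(mixed):.1f}%")
```

The remainder, roughly 40%, is the share of passwords that combine more than one character set.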

1.2. Dual Concatenation

In this pattern, I searched for passwords that match any of the “alpha+digit”, “alpha+punct” or “digit+punct” concatenations, as well as their reverse combinations. I did not check whether the alpha part is a dictionary word, although the majority appear to be. The following table shows the frequencies of the possible concatenations.

Alpha+Digit 9,834,095 (30%)
  mekster11, khas8950, emilio1, holiday2, caitlin1, cats13, toohott69, cheer99, may2204, betteroff6, love1129

Alpha+Punct 240,993 (0.74%)
  olives!, skittles?, cheaphat!, skating., junkbox!, easymac*, itsmiller!, balboa!, bobbiedee!, hotbitch., password!, sowhat?, iloveyou!, redbag., yankees!, princess!, iluvyou!

Digit+Alpha 895,916 (2.75%)
  04maxima, 33orange, 12344321a, 1234567a, 118jefferson, 98101ef, 36987l, 1sweetness, 1simpleplan, 1loveyou, 5pointstar, 98765432q, 12345a, 1capital, 123xyz, 16inches, 50cent

Digit+Punct 12,646 (0.04%)
  78963., 13659*, 83593113$$, 123456], 369*, 1977.., 022590!!, 8825##, 92102310., 3636369., 1457., 963., 24824**

Punct+Alpha 16,090 (0.05%)
  *forever, !cheeky, $tevenrules, *phsyco, -angel, []dauoa, !qwert, !loveu, $prite, .com, *Twist, $upersonic, *jordan, $tennis, *jessica

Punct+Digit 3,395 (0.01%)
  ,123456, /8520, *41681, .31331, $$$4369, +2511161897, .09164232572, -11185, !034780, ~@~@~@123, *13961, ****1, ~123456, {0106860511
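
A search for dual concatenations can be expressed with regular expressions. The sketch below is one possible formulation and not necessarily the queries used for the analysis. It assumes ASCII passwords and that each part is a single uninterrupted run of one character set; the names CLASSES, DUAL_PATTERNS and dual_pattern are hypothetical.

```python
import re
import string
from collections import Counter
from itertools import permutations

# Regex fragment for one run of each character set (ASCII only, an assumption).
CLASSES = {
    "alpha": "[A-Za-z]+",
    "digit": "[0-9]+",
    "punct": "[" + re.escape(string.punctuation) + "]+",
}

# One compiled pattern per ordered pair: alpha+digit, digit+alpha, and so on.
DUAL_PATTERNS = {
    f"{first}+{second}": re.compile(CLASSES[first] + CLASSES[second])
    for first, second in permutations(CLASSES, 2)
}


def dual_pattern(password: str):
    """Return the name of the dual concatenation the password matches, or None."""
    for name, regex in DUAL_PATTERNS.items():
        if regex.fullmatch(password):
            return name
    return None


# Examples taken from the table above, plus one triple concatenation.
sample = ["emilio1", "olives!", "04maxima", "78963.", "*forever", "/8520", "password1."]
print(Counter(dual_pattern(p) for p in sample))
```

Running the same function over the whole list and tallying the results with a Counter would give per-pattern counts of the kind shown in the table.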

1.3. Triple Concatenation

In this pattern, I searched for passwords that match any of the following triple combinations: “alpha+digit+punct”, “alpha+punct+digit”, “digit+alpha+punct”, “digit+punct+alpha”, “punct+alpha+digit” or “punct+digit+alpha”. Again, I did not check whether the alpha part is a dictionary word, although the majority appear to be.

Alpha + Digit + Punct 82,151 (0.25%)
  teenager1@, abc123., karl143., windowsxp1!, kelvin258/, jessie18;, pretti7*, jordans07., JUNE24,, briana20., softball4!, blue42!, space1*, class08!, sonny21., mkjoy8!, Mas28@*, abc123!, roach89!, any83*

Alpha + Punct + Digit 185,610 (0.57%)
  kaitlyn.1, poopp<3, t=48697123, franco_1, dude!2, chris#6, tommy.2359, iloveyou*1, Summer#5, watru^2, beautiful_01

Digit + Alpha + Punct 13,298 (0.04%)
  1hawaiian!, 1wish!, 072305AJ$, 1TIKA!!, 4evergreen!!, 123abc., 1love!, 707sucks!, 123loveme!, 1fighter/, 50cent., 1andonly., 1srael**

Digit + Punct + Alpha 18,218 (0.06%)
  11!!JesusS, 6.five, 555-oup, 7-boss, 1!iloveyou, 1*princess, 305-boy, 123!qaz, 100%jumper, 1986@Jessica, 15-red, 1-Love

Punct + Alpha + Digit 9,940 (0.03%)
  .disney2, @$$baba82, *k123456, $hortii88, *supergirl12, *ILOVEYA7, *june7, $iloveu40, !batman76, @love2, $outh408, .loveable1, `cpecan10, *martin23.

Punct + Digit + Alpha 12,592 (0.04%)
  #1CHRIZ, #1kingsfan, <3ilovemanuel, !11Mom, *789ab, #1hawaiian, #1carlos, #1lover, #1lady


Based on the statistics for concatenation, the most commonly used dual combination is “alpha+digit” and the most commonly used triple combination is “alpha+punct+digit”.

2. Replacement Password Pattern

The second security pattern is replacement. People tend to replace certain letters in words with digits or punctuation characters. For example, “o” is replaced with zero (0), and “S” is replaced with “$” or five (5). The following table gives some examples of the replacement pattern. The counts in the second column are not exact, since they include false positives.

Replacement | Count | Examples

Alpha letter replaced with a digit
o -> zero (0) | 30,485 | il0veyou, ge0rge, m0vie, br0ken, passw0rd, c0llege, br0ther, n0thing, t0psecret, m0nkey, 1o/22/2003
i/l -> one (1) | 57,456 | 1loveyou, P1ayer, mel1ssa, stup1d, denn1s, w1lliams, f1lipana, pr1ncess, 1srael**
s -> five (5) | 9,867 | du5tin, ju5tin, east5ide, augu5t, it5easy, eclip5e
b/g -> six (6) | 7,059 | straw6erry, soccer6irl, short6one, hun6ry
g -> nine (9) | 6,599 | an9els, en9ine

Alpha letter replaced with a punctuation character
s -> $ | n.a. | $prite, be$tfriend, ju$tin, two$hort, $pecial, $ummer, $upersonic, $tevenrules, $outh
i/l -> | | n.a. | love|y, my|ove, actual|y, M|ChElLe
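
One way to reduce false positives when looking for replacements is to undo the substitutions and check the result against a word list. The sketch below illustrates that idea; it is not the method used to produce the counts above. The mapping follows the table, while the function names and the tiny word list are hypothetical.

```python
from itertools import product

# Substituted character -> letters it may stand for (taken from the table).
SUBSTITUTIONS = {"0": "o", "1": "il", "5": "s", "6": "bg", "9": "g", "$": "s", "|": "il"}

# Hypothetical stand-in for a real dictionary file.
WORDS = {"password", "princess", "monkey", "justin", "supersonic"}


def undo_replacements(password: str) -> set:
    """Return every lower case candidate obtained by reversing the substitutions."""
    options = [SUBSTITUTIONS.get(ch, ch) for ch in password.lower()]
    return {"".join(combo) for combo in product(*options)}


def is_replacement(password: str, words: set = WORDS) -> bool:
    """True if the password contains a substitute and maps back to a known word."""
    if not any(ch in SUBSTITUTIONS for ch in password):
        return False
    return bool(undo_replacements(password) & words)


for candidate in ["passw0rd", "pr1ncess", "ju$tin", "password", "may2204"]:
    print(candidate, is_replacement(candidate))
```

The number of candidates doubles for every ambiguous character such as 1 or 6, so a real implementation would need a cap for long digit-heavy passwords.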

3. Additional Patterns

There are also some additional interesting password patterns within the list that can be taken into consideration:

Dates 4,167 4/30/04, 12/02/03, 06/27/00, 19/03/1988
Keyboard sequences n.a. 123456 (in top 10), 12345678 (in top 10), qwerty (in top 20), qwertz (97), asdf (157), asdfg (1,190), asdfgh (2,908)
Keyboard reverse sequences n.a. 654321 (in top 20), trewq (14), ytrewq (160)
Starting with #1 8,617 #1kingsfan
Ending with 1. 3,047 dark1.
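
These additional patterns are simple enough to test with a few string checks. The sketch below is an assumption about how such checks could look, not the queries behind the counts above. The keyboard rows (including a QWERTZ row for “qwertz”), the minimum sequence length and all names are hypothetical, and the date check only tests the shape, not whether the date is valid.

```python
import re

# Shape of a date such as 4/30/04 or 19/03/1988; calendar validity is not checked.
DATE = re.compile(r"\d{1,2}/\d{1,2}/(\d{2}|\d{4})")

# Keyboard rows to search for sequences (QWERTY and QWERTZ top rows assumed).
KEYBOARD_ROWS = ["1234567890", "qwertyuiop", "qwertzuiop", "asdfghjkl", "zxcvbnm"]


def is_keyboard_sequence(password: str, min_length: int = 4) -> bool:
    """True if the password is a forward or reversed slice of a keyboard row."""
    lowered = password.lower()
    if len(lowered) < min_length:
        return False
    return any(lowered in row or lowered in row[::-1] for row in KEYBOARD_ROWS)


def additional_patterns(password: str) -> list:
    """Return the labels of all additional patterns the password matches."""
    labels = []
    if DATE.fullmatch(password):
        labels.append("date")
    if is_keyboard_sequence(password):
        labels.append("keyboard sequence")
    if password.startswith("#1"):
        labels.append("starts with #1")
    if password.endswith("1."):
        labels.append("ends with 1.")
    return labels


for candidate in ["4/30/04", "19/03/1988", "qwertz", "trewq", "#1kingsfan", "dark1.", "monkey"]:
    print(candidate, additional_patterns(candidate))
```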

The Symbols
People use certain symbols far more often than others. The most commonly used punctuation character is the period (.) at 0.7%, followed by the underscore (_) at 0.58% and the exclamation mark (!) at 0.55%. The following table gives the frequency of each punctuation symbol in the password list.

. 226,980 (0.7%)
, 27,722 (0.09%)
" 3,172 (0.01%)
' 16,097 (0.05%)
? 24,744 (0.08%)
! 179,666 (0.55%)
; 14,378 (0.044%)
: 7,239 (0.022%)
# 60,016 (0.18%)
$ 31,501 (0.1%)
% 11,282 (0.03%)
& 28,553 (0.088%)
( 16,557 (0.05%)
) 18,349 (0.056%)
* 95,400 (0.3%)
+ 24,000 (0.073%)
- 126,908 (0.39%)
/ 37,836 (0.12%)
< 11,856 (0.036%)
> 2,755 (0.008%)
= 18,741 (0.057%)
@ 104,336 (0.32%)
[ 7,722 (0.02%)
] 10,731 (0.033%)
(symbol missing) 4,149 (0.013%)
^ 5,863 (0.018%)
_ 187,603 (0.58%)
{ 1,056 (0.003%)
} 933 (0.003%)
| 506 (0.002%)
~ 5,823 (0.018%)
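
A symbol tally of this kind can be sketched with a Counter. The article does not state whether each figure counts passwords that contain the symbol or total occurrences of the symbol; the sketch assumes the former. The function name and the sample list are hypothetical.

```python
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def symbol_frequencies(passwords) -> Counter:
    """Count, per punctuation symbol, how many passwords contain it at least once."""
    counts = Counter()
    for password in passwords:
        # A set ensures a symbol is counted once per password, even if repeated.
        counts.update(set(password) & PUNCTUATION)
    return counts


# Small sample built from examples that appear in the article.
sample = ["olives!", "franco_1", "abc123.", "1srael**", "password", "dark1."]
frequencies = symbol_frequencies(sample)
for symbol, count in frequencies.most_common():
    print(symbol, count, f"({100.0 * count / len(sample):.1f}%)")
```

To count total occurrences instead, the call to set(password) would be replaced by iterating over the characters directly.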

Conclusion

My pattern analysis produced the following results:

  • The most commonly used special character is the period (.).
  • The most commonly used dual concatenation of alpha-digit-punct characters is “alpha+digit” with 30%.
  • The most commonly used triple concatenation of alpha-digit-punct characters is “alpha+punct+digit” with 0.57%.
  • For the replacement pattern, replacing the letter i or l with the number 1 is the most commonly used pattern.

Password patterns might fuel the next generation of dictionary attacks. By uncovering common password patterns, attackers can enhance their tools to carry out pattern-based brute-force attacks.
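
A rough calculation shows why a known pattern helps an attacker so much. The sketch below compares the number of candidates for a random 10-character password with the number of candidates for the pattern “capitalized dictionary word + one digit + one punctuation character”. The dictionary size of 100,000 words is a hypothetical figure chosen for illustration, and the function names are assumptions.

```python
import string

# Printable ASCII letters, digits and punctuation: 52 + 10 + 32 characters.
ALPHABET_SIZE = len(string.ascii_letters + string.digits + string.punctuation)


def random_space(length: int) -> int:
    """Number of passwords of the given length drawn freely from the alphabet."""
    return ALPHABET_SIZE ** length


def pattern_space(dictionary_size: int) -> int:
    """Number of candidates of the form word + one digit + one punctuation mark."""
    return dictionary_size * len(string.digits) * len(string.punctuation)


DICTIONARY_SIZE = 100_000  # hypothetical word list size

print("random 10 characters:", random_space(10))
print("word + digit + punct:", pattern_space(DICTIONARY_SIZE))
print("ratio:", random_space(10) // pattern_space(DICTIONARY_SIZE))
```

Under these assumptions the pattern-based search space is smaller by many orders of magnitude, even though both kinds of password satisfy the same composition rule.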

Finally, I suggest the following measures against password patterns:

  • Do not choose or use any password based on a common pattern!
  • Let a random password generator (e.g. the pwgen Firefox add-on) create strong passwords for you.
  • If you are bad at remembering passwords, create a single strong password (i.e. a master password), remember it, and use a password manager (e.g. Sxipper, KeePass) protected with that master password. Then let the password manager generate and store strong, unique passwords for you.
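
For illustration, a random password generator can be sketched with Python's standard secrets module. This is a minimal example and not a replacement for the tools named above. The function name, the default length of 16 and the requirement that all four character classes appear are assumptions.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 16) -> str:
    """Return a random password that contains all four character classes."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit all four classes")
    while True:
        # secrets.choice draws from a cryptographically secure random source.
        password = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if (
            any(c in string.ascii_lowercase for c in password)
            and any(c in string.ascii_uppercase for c in password)
            and any(c in string.digits for c in password)
            and any(c in string.punctuation for c in password)
        ):
            return password


print(generate_password())
```

Candidates that miss a character class are discarded and redrawn, so every accepted password is equally likely and no position is tied to a fixed class.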

12 Responses to Password Patterns

  2. Emin

    Another relevant discussion can be followed here as well: http://seclists.org/pen-test/2010/Sep/31

  3. Dennis Groves

    It is worth noting that no matter how 'strong' your password is; it doesn't mitigate against the threat of SQL injection attack; which is how these passwords were disclosed to begin with. In fact it is my conjecture that attackers get more passwords through application security attacks and failure to encrypt databases than by poor password practices and brute force attacks.

    degroves

    • doug

      @Dennis,
      Your statement is incorrect if you look at databases that only store the password hash. It is considered bad practice to store a password in plain text.

      • Joe Wulf

        Just like it is considered a bad practice to use '123456' as a password. Theory and implementation/execution are two entirely different matters.

  6. The author says the password “z6iFk#rdlr” is more secure than “Password1.”. This is only true for certain threats (can my password easily be guessed by others). I think the first is less secure because of another threat (me forgetting my password, which would stop me from reading my mail). Most advisories about secure passwords focus on complex character sets in the passwords. But increasing the length of the password using only lowercase characters helps against both threats: the password is hard to guess for others and easy to remember for me.

  8. Munyaradzi

    The reason people choose password patterns is to assist in remembering the passwords. Humans are not good at remembering passwords like *au&*99klhJ . We need patterns for cognition. And the fact that every service requires a password makes it worse, because I will have to remember 10+ passwords of that nature. In my humble opinion, passwords should be done away with soon, before we start afflicting ourselves with "denial of service" as a result of "password amnesia" or, better yet, as a result of a multitude of "forgot password" page requests! Perhaps the use of certificates or something used across various vendors would be good. Of course this has security implications, but a challenge+response should work it out.

  9. Brian Svidergol

    The next step should be to use this information in an attempt to crack passwords (thus validating that you've improved over a standard dictionary or brute force attack). Example: Obtain another set of random user passwords that are publicly available and brute force them while recording the length of time to get each password (and the total time to get 50%, 75%, etc.). Then, write up a routine to brute force them while using your password pattern information (placing emphasis on alpha+digit and the preferred punctuation). What kind of gains (if any) are seen? And what do you lose? (I'm thinking you may get the weak passwords more quickly but slow down the ability to get the stronger passwords.) Interesting though.

  10. While the article will definitely help in the future writing of password cracking tools, it is great for improving password security as well. I think that MS and others advising on password security during OS installation should include some advice from this article. Good job, hacker!
