Password Strength Validation with Regular Expressions

Regular expressions are complex and elegant at the same time. They can look as though someone was randomly hammering on the keyboard, yet they are an efficient way to describe the structure of text and to match those structures. That makes them handy for defining what a string should look like, and therefore well suited to data validation.

To validate a US phone number you might create a simple regular expression, \d{3}-\d{3}-\d{4}, which will match a phone number like 123-555-1212. Regular expressions become difficult, though, when the format is not quite as clear cut. What if we need to support (123) 555-1212 and just 555-1212? That’s where things get more complex. But this is not about validating phone numbers. In this post I will look at how to make assertions about the complexity of a string, which is very useful if you want to enforce the complexity of a user-created password.

The key to making password strength validation easy using regular expressions is to understand zero-width positive lookahead assertions (also known as zero-width positive lookaheads). Now that’s a mouthful, isn’t it? Luckily the concept itself is a lot simpler than the name.

Zero-width positive lookahead assertions

A zero-width positive lookahead assertion is simply an assertion about whether a match exists at the current position. Rather than returning a match, it merely reports true or false to say whether that match exists, and it consumes no characters. It is used as a qualification for another match.

The general form of this is:

(?= some_expression)

For example:

  • The regular expression z matches the z in the string zorched.
  • The regular expression z(?=o) also matches the z in the string zorched. It does not match the zo, but only the z.
  • The regular expression z(?=o) does NOT match the z in the string pizza because the assertion of z followed by an o is not true.
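
The three cases above can be checked with a short sketch. It assumes Python's standard re module, which the article itself does not name; the lookahead syntax is the same in most regex flavors.

```python
import re

# z(?=o) matches a "z" only when the next character is "o".
m = re.search(r"z(?=o)", "zorched")
print(m.group(0))  # "z": the "o" is checked but not consumed
print(m.span())    # (0, 1)

# In "pizza" no "z" is followed by an "o", so nothing matches.
no_match = re.search(r"z(?=o)", "pizza")
print(no_match)    # None
```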

Making Assertions About Password Complexity

Now that you know how to make assertions about the contents of a string without actually matching on that string, you can start deciding what you want to actually assert. Remember that on their own these lookaheads do not match anything, but they modify what is matched.

Assert a string is 8 or more characters:
(?=.{8,})

Assert a string contains at least 1 lowercase letter (zero or more characters followed by a lowercase character):
(?=.*[a-z])

Assert a string contains at least 1 uppercase letter (zero or more characters followed by an uppercase character):
(?=.*[A-Z])

Assert a string contains at least 1 digit:
(?=.*[\d])

Assert a string contains at least 1 special character:
(?=.*[\W])

Assert a string contains at least 1 special character or a digit:
(?=.*[\d\W])

These are of course just a few common examples but there are many more that you could create as well.
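
As a rough sketch, each assertion can be tried on its own against a candidate string. Python's re module is an assumption here, and the names ASSERTIONS and check_assertions are hypothetical, made up for this illustration. re.match is used so that every lookahead is evaluated at the start of the string.

```python
import re

# The assertion patterns listed above, keyed by a descriptive label.
ASSERTIONS = {
    "8+ characters": r"(?=.{8,})",
    "lowercase": r"(?=.*[a-z])",
    "uppercase": r"(?=.*[A-Z])",
    "digit": r"(?=.*[\d])",
    "special": r"(?=.*[\W])",
    "digit or special": r"(?=.*[\d\W])",
}

def check_assertions(candidate):
    """Return {label: bool}, evaluating each lookahead at position 0."""
    return {
        label: re.match(pattern, candidate) is not None
        for label, pattern in ASSERTIONS.items()
    }

for label, ok in check_assertions("abc123").items():
    print(f"{label}: {ok}")
```

Note that in Python \W means "anything that is not a letter, digit, or underscore", so an underscore does not count as a special character with these patterns.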

Applying Assertions to Create a Complexity Validation

Knowing that these make assertions about elements in a string, but not about a match itself, you need to combine them with a matching regular expression to create your validation match.

.* matches zero or more characters.
^ matches the beginning of a string.
$ matches the end of a string.

Put together, ^.*$ matches any single line (including an empty line). With what you now know about zero-width positive lookahead assertions, you can combine a “match everything” with assertions about that line to limit what is matched.

If I want to match a line with at least 1 lowercase character then I can use:
^.*(?=.*[a-z]).*$
(Which reads something like: start of string, zero or more characters, assert that somewhere in the string is a lowercase character, zero or more trailing characters, the end of the string)

The part that makes this all interesting is that you can combine any number of assertions about the string into one larger expression that will create your rules for complexity. So if you want to match a string at least 6 characters long, with at least one lower case and at least one uppercase letter you could use something like:
^.*(?=.{6,})(?=.*[a-z])(?=.*[A-Z]).*$

And if you want to throw in some extra complexity and require at least one digit or one symbol you could make a match like:
^.*(?=.{6,})(?=.*[a-z])(?=.*[A-Z])(?=.*[\d\W]).*$
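
Here is a minimal sketch of that last expression used as a validator. Python's re module is an assumption, and the names COMPLEXITY and is_complex_enough are hypothetical.

```python
import re

# 6+ characters, one lowercase letter, one uppercase letter,
# and one digit or symbol, exactly as written above.
COMPLEXITY = re.compile(r"^.*(?=.{6,})(?=.*[a-z])(?=.*[A-Z])(?=.*[\d\W]).*$")

def is_complex_enough(password):
    """True when the whole password satisfies every assertion."""
    return COMPLEXITY.match(password) is not None

for candidate in ["Passw0rd", "password", "Ab1", "Abcdef", "Abcde!"]:
    print(candidate, is_complex_enough(candidate))
```

Only "Passw0rd" and "Abcde!" pass: "password" has no uppercase letter, "Ab1" is too short, and "Abcdef" has neither a digit nor a symbol.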

There you go. Now you can create regular expressions to check the complexity of passwords.

36 thoughts on “Password Strength Validation with Regular Expressions”

  1. Hi,
    Thanks for the explanation.

    How do you go about requiring at least 3 of 4 assertions? We require 3 of 4 within the sets upper, lower, numeric, and special. I want to use {3,} somewhere I think. One idea I had would be to use alternation to list out all of the possibilities. Is that possible with assertions?

    Thanks.

  2. I have the same issue as Dave. How can I handle this in RegEx?

    Password must be at least 12 characters long.

    AND

    Password must contain 3 out of the 4 of these:
    • lowercase letter
    • uppercase letter
    • number
    • special character ! @ $ % ^ & * ( ) + ?

  3. Dave and Lee,
    The assertions don’t return matches, just true/false, so they can’t be counted using repetition operations like {3,}. I think your best bet in those cases is to separate them out into multiple regex expressions and just count your matches.

    As Dave suggests, you can also use alternation (or operator) to do this, but you then have to create all the permutations of the assertions:

    ^.*(?=.{12,})(((?=.*[a-z])(?=.*[A-Z])(?=.*[\d]))|((?=.*[a-z])(?=.*[\d])(?=.*[\W]))|((?=.*[A-Z])(?=.*[\d])(?=.*[\W]))|((?=.*[a-z])(?=.*[A-Z])(?=.*[\W]))).*$

    It gets pretty gnarly pretty quickly…
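
The "separate expressions and count your matches" idea could look roughly like the sketch below. Python is an assumption, the helper name meets_three_of_four is hypothetical, and \W stands in for "special character", which is broader than the specific symbol list in the question.

```python
import re

# One small pattern per character class instead of one giant alternation.
CLASS_PATTERNS = [r"[a-z]", r"[A-Z]", r"\d", r"\W"]

def meets_three_of_four(password, min_length=12):
    """Require the minimum length plus at least 3 of the 4 classes."""
    if len(password) < min_length:
        return False
    matched = sum(1 for p in CLASS_PATTERNS if re.search(p, password))
    return matched >= 3

print(meets_three_of_four("abcdefghijk1A"))  # True: lower, digit, upper
print(meets_three_of_four("abcdefghijkl1"))  # False: only lower and digit
```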

  4. I modified Geoff’s regex for my situation for:
    8 characters only

    Password must contain 3 out of the 4 of these:
    • lowercase letter
    • uppercase letter
    • number
    • special character ! @ $ % ^ & * ( ) + ?

    ^(((?=.*[a-z])(?=.*[A-Z])(?=.*[\d]))|((?=.*[a-z])(?=.*[A-Z])(?=.*[\W]))|((?=.*[a-z])(?=.*[\d])(?=.*[\W]))|((?=.*[A-Z])(?=.*[\d])(?=.*[\W]))).{8,8}$

  5. hi,
    thanks a lot.
    I’ve modified the expression as given below, because IE7 was showing validation errors.
    /^.*(?=.{8,})(?=.*[A-Z])(?=.*[\d])(?=.*[\W]).*$/i;
    It works fine in all browsers.

  6. Excellent article, almost. There is one major problem with the structure of the recommended validation expressions (the ones wrapped in ^ and $ anchors). You have erroneously included an (unnecessary) dot-star at the beginning before the assertions. All of these lookahead assertions should be applied at only one location: the beginning of the string. (The dot-star following the assertions is the expression which will match the string.)

    Having two sets of dot-stars makes the expression inefficient (especially for non-matching cases), because the first greedy dot-star consumes the entire string and is then forced to backtrack to a location where all of the following assertions match. But the problem is not limited to efficiency; this is also an accuracy issue. If one of the validation requirements is that the password have a maximum length, then this structure will fail to reject a password which is too long. For example: let’s say that our requirement is that a password must be from 8 to 12 characters long. Your suggested logic would recommend the following expression:

    ^.*(?=.{8,12}).*$

    However, this does not work – it matches passwords having length greater than 12. The simple and effective solution is to remove the (unnecessary) initial dot-star from all these validation expressions.

  7. My previous comment was not quite complete. To correctly apply a max-length assertion, the lookahead needs to include an end of string anchor like so:

    ^(?=.{8,12}$).*$
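
The difference between the two expressions is easy to see in a quick sketch. Python's re module is an assumption, and the variable names are hypothetical.

```python
import re

# The structure criticised above: leading dot-star, no anchor in the lookahead.
WITH_LEADING_DOT_STAR = re.compile(r"^.*(?=.{8,12}).*$")
# The corrected form: lookahead applied at the start and anchored with $.
ANCHORED_LOOKAHEAD = re.compile(r"^(?=.{8,12}$).*$")

too_long = "a" * 20
just_right = "a" * 10
too_short = "a" * 7

# The first form wrongly accepts the 20-character password.
print(WITH_LEADING_DOT_STAR.match(too_long) is not None)  # True
# The anchored form rejects it and still accepts a 10-character one.
print(ANCHORED_LOOKAHEAD.match(too_long) is not None)     # False
print(ANCHORED_LOOKAHEAD.match(just_right) is not None)   # True
print(ANCHORED_LOOKAHEAD.match(too_short) is not None)    # False
```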

  8. Question:

    Shouldn’t this line:
    Assert a string contains at least 1 special character or a digit: (?=.*[\d\W])

    Read more like:
    Assert a string contains at least 1 special character followed by a digit: (?=.*[\d\W])

    Or modifying the regexp to include a “pipe” character between the \d and \W:
    (?=.*[\d|\W])

  9. This limits the password length in case your encryption mechanism will blow through the character limits of the database:
    ^(?=.{6})(?=.*[a-z])(?=.*[A-Z])(?=.*[\d]).{6,15}$
    where 15 is max size and 6 is min.

  10. Nice article.
    We have a requirement to ensure that there are no more than 2 consecutive identical characters. For example abbc is good but abbbc is not good.
    Is there a way to enforce this policy using regex approach above?
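
One possible approach, offered as a sketch rather than a definitive answer: combine a negative lookahead, (?!...), with a backreference. Neither construct is covered in the article above, so treat both as assumptions about your regex flavor; Python's re module supports them. The name NO_TRIPLES is hypothetical.

```python
import re

# (.)\1\1 matches any character followed by two more copies of itself.
# The negative lookahead rejects strings that contain such a run anywhere.
NO_TRIPLES = re.compile(r"^(?!.*(.)\1\1).*$")

print(NO_TRIPLES.match("abbc") is not None)   # True: "bb" is allowed
print(NO_TRIPLES.match("abbbc") is not None)  # False: "bbb" is rejected
```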

  11. Thank you for translating positive lookahead assertions into English. I was able to match 3 of 4 of the criteria using your post and lobo’s comment. The following matches a string of at least 8 characters with at least one lowercase, at least one digit and at least one uppercase or special character:
    ^.*(?=.{8,})(?=.*[a-z])(?=.*[\d])(?=.*[A-Z@!#$%]).*$

  13. Will it work for passwords in languages other than English? I expect some problems with the upper and lower case rules.

    In my opinion regular expressions are too powerful a tool for this task. A simple single loop would be enough.
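
A loop along those lines might look like the sketch below. Python is an assumption, and the function name and the exact rules (mixed case plus a digit, 8 or more characters) are hypothetical. Python's str methods are Unicode-aware, so non-English letters are classified by case, whereas [a-z] and [A-Z] only cover unaccented Latin letters.

```python
import re

def has_mixed_case_and_digit(password, min_length=8):
    """Single pass over the characters using Unicode-aware str methods."""
    has_lower = has_upper = has_digit = False
    for ch in password:
        if ch.islower():
            has_lower = True
        elif ch.isupper():
            has_upper = True
        elif ch.isdigit():
            has_digit = True
    return len(password) >= min_length and has_lower and has_upper and has_digit

# A Cyrillic password passes the loop but has no [a-z] character at all.
print(has_mixed_case_and_digit("Пароль123"))  # True
print(re.search(r"[a-z]", "Пароль123"))       # None
```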

  14. Awesome post! Thank you very much. Just what I needed to help me “wrap my head around” the whole process. Quite simple when it’s explained correctly. lol

  15. One more max-length evolution to this very useful post:
    according to (?=X) syntax, X can force match to the end of the string overruling the necessary trailing .*…
    ^(?=.{8,15}$)(?=.*[A-Za-z])(?=.*[\d\W]).*$

  16. Hi, I have a password I want to validate with these rules:
    at least 1 digit, starts with a letter, no white spaces and a minimum length of 6

    I made this regex but I’m not sure if it covers all cases (I’m new with regex and not that good with them :P )

    ^[a-zA-Z]\S*(?=\S{5,})(?=\S*[\d])\S*$

    Can you tell me if this is OK?

    Thanks

  17. Hi
    Thanks for the info. It is very useful.
    I have one more password rule, which needs to restrict consecutive characters.
    For aabbbccc or 12345, it should say “please avoid consecutive chars”.
    Could someone please help with that?
    Thanks
    usha

  19. Sir,
    How do I use the expression ^(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[0-9]).{7,}$? Where should this expression go?
