Dec 20, 2024

Passwords

2024-12-20 07:20 +0100

If we are to believe The Internet you need to have different passwords for every site, all your passwords need to be very long and very complex and if a site doesn’t offer 2FA (Two Factor Authentication) then you have basically already been hacked.

This is, of course, wildly out of proportion. It is all based on things that are individually true but when taken all together park us squarely in silly town.

All the advice is given with the best of intentions, but between forgetting that we are humans and the arrogant assumption that you are too stupid to understand the nuance and make intelligent decisions on your own we arrive at this absolute mess.

You can use a password manager to achieve most of this, and if you can find one you trust there is nothing wrong with using one. I’m sure the password manager companies love to spread the idea that it is the only solution and you need one. That said, I’m fairly certain they are more of a response to the prevailing winds than they are the cause - this kind of advice predates password managers by decades.

“So what should I actually do?” you might reasonably be asking at this point. As we will get into, it is somewhat complicated, and once you know, you might also revert to giving the silly advice from the beginning, because it takes too long to explain, and “you wouldn’t understand anyway…”

Guess, Guess and Guess Some More

It is important to realize that the way password cracking works is by guessing. 20 years ago there were vulnerabilities to exploit and many service operators simply stored passwords in plain text in their databases, but these aren’t really concerns today. And so all password cracking is a guessing game. That doesn’t mean the hacker sits there and meticulously types in different guesses hoping to get lucky. The guessing is done by computers and it is fast; a billion guesses per second against a naïve implementation is entirely within the realm of the possible for a single person hobby hacker.

Additionally, the hacker isn’t guessing at random. We have learned a lot over the past 20 years about how people pick passwords, and this has allowed hackers to focus their guessing on the much smaller space of passwords that a human might actually have chosen.

Online vs. Offline Attacks

When I said a naïve implementation above, it is important to understand that just because it is theoretically possible to make billions of guesses per second, that doesn’t mean you get to do that in real life. And this is the first major conflation that leads to the disproportionate advice.

Consider your credit card. It has a 4 or 6 digit PIN code, and they are only numbers; even the 6 digit one has only 1 million possible PIN codes. If it was actually possible to try billions of PIN codes per second, it would be possible to crack a stolen credit card in less than the blink of an eye.

So why doesn’t that happen?

This is where the difference between an online and an offline attack comes into play. The way the credit card system works, the PIN code isn’t stored on the card itself; when you put it in the ATM or use it in a shop, etc., the PIN code is essentially sent to your credit card company for verification. And as you have probably learned the hard way once or twice in your life, you only get 3 wrong tries. After that the card gets blocked and you have to talk to someone to get 3 more tries or maybe even a new credit card.

The point is you have been forced to play by the rules of the system owner. It doesn’t matter how many PIN codes you can generate per second, because the credit card company only lets you try 3 times, and 3 in a million chances aren’t exactly what anyone would call good. Even 3 in the 10'000 offered by the 4 digit PIN are pretty much not even worth trying.
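
To put numbers on that, here is a minimal Python sketch of the arithmetic. It assumes the PIN was picked uniformly at random (real people are less kind to themselves), and the function names are just mine for illustration.

```python
from fractions import Fraction

def chance_of_guessing(pin_digits, tries):
    """Chance of hitting a uniformly random PIN within `tries` distinct guesses."""
    total = 10 ** pin_digits
    return Fraction(min(tries, total), total)

def seconds_to_exhaust(pin_digits, guesses_per_second):
    """Time needed to try every possible PIN at a given guessing rate."""
    return 10 ** pin_digits / guesses_per_second

print(chance_of_guessing(4, 3))    # 3/10000
print(chance_of_guessing(6, 3))    # 3/1000000
print(seconds_to_exhaust(6, 1e9))  # 0.001 (one millisecond, if nobody stops you)
```

The same 6 digit PIN is either hopeless or trivial to crack, depending entirely on who gets to set the rules for guessing.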

Most systems are a lot more lenient than credit cards. If you try to log into your favorite social media site you may have noticed that you get a lot more than 3 tries - there probably isn’t a limit. If they are security conscious they may have implemented a rate limit. A rate limit is a limit on how often you get to try, maybe you only get 100 tries an hour or 50 tries a day, or maybe each try increases a delay before you get to try again.
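
As a rough illustration of the "each try increases a delay" variant, here is a small Python sketch. It is not how any particular site does it; the class name, the doubling policy and the default values are all assumptions of mine.

```python
class LoginThrottle:
    """Hypothetical policy: double the wait after every failed attempt, per account."""

    def __init__(self, base_delay=1.0, max_delay=3600.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.failures = {}      # account -> number of consecutive failures
        self.locked_until = {}  # account -> time when the next try is allowed

    def allowed(self, account, now):
        """True if this account may attempt a login at time `now` (in seconds)."""
        return now >= self.locked_until.get(account, 0.0)

    def record_failure(self, account, now):
        """Register a wrong password and return the delay that was imposed."""
        count = self.failures.get(account, 0) + 1
        self.failures[account] = count
        delay = min(self.base_delay * 2 ** (count - 1), self.max_delay)
        self.locked_until[account] = now + delay
        return delay

    def record_success(self, account):
        """A correct login clears the penalty."""
        self.failures.pop(account, None)
        self.locked_until.pop(account, None)


throttle = LoginThrottle()
print([throttle.record_failure("alice", now=0.0) for _ in range(5)])
# [1.0, 2.0, 4.0, 8.0, 16.0]
```

With these made-up settings the wait reaches the one hour cap after 13 wrong guesses, which is barely noticeable for a human with a typo and ruinous for anyone trying to read the dictionary.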

But even if the system operator has done nothing at all to rate limit, the mere fact that the tries have to be done over a network is going to drop the possible number of guesses per second dramatically. Depending on the size of the system and how stealthy the hacker feels the need to be, it could be anywhere from 10 per second to 100'000 per second, but nothing approaching billions.

Offline attacks really only come into play when a security breach has already happened and someone has gained access to a database containing user passwords. As mentioned earlier, it used to be that a lot of places just stored the passwords in plain text and no guesses would be involved. However, this is rarely the case anymore.

This is where “use a different password for every site” becomes relevant advice. Of course, the more nuanced advice is: do not reuse passwords that you use to protect something you actually care about. If you use the same password for the website with the cat pictures as you use for the one where you discuss recipes, it is hardly a real problem.

If on the other hand you use the same password for your bank as you use for the aforementioned site with the cat pictures you have suddenly made the security of your bank account depend on the security practices of the people that run the cat picture site, and even if they are competent they are unlikely to apply the same level of diligence as your bank. After all they are just a cat picture site - why would they?

Then again, your bank probably uses 2FA and the problem is kind of moot. Now the hackers would have to compromise the cat picture site, get access to the second factor and link those two pieces of information. And realistically there are easier ways to scam you out of your money.

Password Complexity

So how complex does a password really need to be?

The answer is: Not terribly. The problem is that, unless you understand something about how password cracking works, there are a number of traps. And to make matters worse, many of the password complexity meters on websites are more counterproductive than they are helpful.

We have all been told that your password needs to have lower case letters, upper case letters, numbers and symbols. Based on this alone, the password P4ssw0rd! would seem to be a high quality password, but in reality that particular password is so bad it might even fail on a system that only allows for 3 tries.

But why is it that bad? First off, it is on the list. There is a list (there is probably more than one) with all the most commonly used passwords, things like 1234, password, guessthis, along with millions of other human attempts to be clever.

Additionally, letter substitution, where a small number of letters in a known word have been replaced with numbers, is among the first things cracking algorithms try once they get through the list of common passwords. On top of that, we capitalized the first letter and only the first letter, a very predictable human preference. And it doesn’t even end there: we placed the special character at the end of the word like punctuation, and we used one that is actual valid punctuation, another very predictable and very human behavior.
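
To make that concrete, here is a toy Python sketch of the kind of rules described above. Real cracking tools have far larger rule sets; the substitution table, the suffix list and the function names here are small made-up examples, not taken from any actual tool.

```python
from itertools import product

# Hypothetical, deliberately tiny rule set for illustration.
LEET = {"a": "a4@", "e": "e3", "i": "i1!", "o": "o0", "s": "s5$"}
SUFFIXES = ["", "!", "1", "123", "?", "."]

def variants(word):
    """Yield predictable human-style variations of one dictionary word."""
    choices = [LEET.get(ch, ch) for ch in word.lower()]
    for letters in product(*choices):
        base = "".join(letters)
        # Try the word as-is and with only the first letter capitalized.
        for cased in (base, base.capitalize()):
            for suffix in SUFFIXES:
                yield cased + suffix

def guesses_needed(target, wordlist):
    """Count guesses until `target` is produced, or None if it never is."""
    count = 0
    for word in wordlist:
        for candidate in variants(word):
            count += 1
            if candidate == target:
                return count
    return None

print(guesses_needed("P4ssw0rd!", ["password"]))  # found within 648 guesses
print(guesses_needed("wWkH8cDF", ["password"]))   # None
```

The single word password only expands to 648 candidates under these rules, so all the "complexity" in P4ssw0rd! buys next to nothing, while a properly random string is simply never generated.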

Variations on the word password are particularly terrible, but any word in the dictionary is vulnerable, and the algorithms do in fact use the dictionary as a basis for guessing.

Does this mean that using a word in a language other than English, in particular one that contains characters unique to that language, like Rødgrød or Straẞe or going full Кири́ллица (Cyrillic), makes stronger passwords?

Yes it does, though it is hard to say by how much. It depends on where the attacker is from and how much research has been done on password cracking in that particular language, but English words are certainly the most vulnerable.

LæsøStraẞe is quite possibly an extremely strong password that is very easy to remember. Læsø is the name of a Danish island, Straẞe means street in German, and streets named for islands are quite common in Denmark. A password cracking algorithm tuned to Danish people who also speak German (a very common combination, in a rather small population group) would guess this password in seconds, but one tuned for English would quite possibly never guess it.

And yes, that does mean that including one or more letters that do not exist in English makes your password stronger. Whether or not that means you should do so is less obvious. You could run into trouble entering the password in some circumstances, and some rare/old systems might not even support it. In other words, while it does help, it may be too impractical to be worth it.

Two Factor Authentication

You may have noticed that more and more websites implement mandatory 2FA and you might reasonably conclude that this means that 2FA has become necessary because password cracking has evolved so much and computers have become super fast.

But in reality they do it for two reasons:

They are tired of dealing with hacked accounts that were hacked because someone used a really terrible password. And it reflects poorly on them to get hacked even if it is entirely the fault of the user.
Culturally there is a movement on the internet that 2FA is the only way to be sure; any responsible website uses 2FA. And it is a lot easier to implement 2FA than it is to argue with people on The Internet.

Now, there is no question that 2FA does in fact enhance authentication security. But it is not a magic bullet. It does nothing to stop session theft, for example. And it doesn’t help that some large social media video platforms, who shall remain nameless, allow you to change your login credentials without having to reauthenticate with 2FA even when 2FA is enabled.

It also doesn’t prevent attacks where the user is tricked into doing something stupid.

I’m not suggesting that 2FA is bad, it is not. It does improve the situation for a subset of cases especially when implemented correctly, but it doesn’t magically make the problem go away and it doesn’t mean you can now start acting recklessly without consequences.

And the biggest downside: You may lose that second factor whatever it is, and if you do you lose access. Access you would not have lost if all you had to do was remember a password. And yes they usually offer recovery keys, but now you have to figure out how to store a whole bunch of these safely and that is a lot easier said than done, and those are also subject to being lost.

Rather amusingly it probably also hurts phone sales. I, for one, am not replacing my phone until I absolutely have to, because migrating all the 2FA apps is a royal pain (some are worse than others).

Other Uses for Passwords

So far we have been focusing on passwords used for authentication - that is: passwords used to prove that you are who you say you are and by inference that you are allowed to do whatever it is you are trying to do.

Authentication invariably has two parties, the party who wishes to carry out whatever action it might be and the party capable of carrying out said action but unwilling to do so for just anyone. So you have to prove to the second party that you are someone for whom they are willing to do whatever it is you wish to do.

Simple example: You want to transfer money from your bank account to some other bank account. Your bank is perfectly capable of doing this whether you want them to or not, but (perhaps wisely) they only do so if and when you want them to. Authentication is the part where you convince your bank that you are who you say you are.

Passwords are a strong mechanism for authentication and have been used for this exact purpose since long before computers were invented. That is why it is called a “pass word” in the first place; the sentry will only let you pass if you know the word, the “password”.

The problem of course is that computers are incredibly trusting, if you were to go up to the sentry and start reading words from the dictionary… Well we can all imagine how that would end. But a computer happily keeps saying no until you get it right at which point it will let you pass.

Now of course we can and, as we have talked about, do program computers to be less trusting, and this is why offline attacks are rarely relevant.

This does, however, change when we start talking about other uses for passwords.

The most common such use is for encryption. We derive a cryptographic key from a password and use this to encrypt something. This is perfectly viable and we use it a lot.

In this scenario there is no second party with the power to read our data at will, we have full control and we don’t need to trust anyone, this is good! The downside to this, however, is that there is no second party to refuse the interloper when they start reading the dictionary, and the computer will dutifully keep saying no until the right word is found.

That is to say, in this scenario we are always talking about an offline attack. The same rule applies, of course: the attacker still has to get access to the encrypted data in the first place to try this attack. But now, if they do and they are successful, they get directly at whatever it is we were trying to protect in the first place.

And if we are using encryption, the objective generally is to keep the data safe even if it falls into the hands of someone we do not wish to have access to it. As such, arguing that it is hard to get at in the first place does not obviate the need for a good password.

Threats

It is essential when choosing a password to understand what it is we are trying to defend against. Are we trying to prevent a random attacker from gaining access to our social media accounts, from where they might use our name to carry out scams that hurt our friends? Or are we trying to protect the plans for an invasion against a powerful enemy with nearly inexhaustible resources and a keen interest in exactly our information? Or somewhere in between?

We can of course decide that we want all our passwords to always be viable in the most extreme scenario, which is what the typical Internet advice does. And an argument can be made that, if you are advising strangers in unknown circumstances and you only have 140 characters in which to do so, this is the only reasonable advice to give. That, however, does not mean it is good advice.

The most important element in this is: are you trying to protect yourself against a targeted attack or an untargeted attack? A targeted attack is where someone is going for you specifically, for some reason that is specific to you. A scammer that calls you on the phone is not a targeted attack; a con man that chats you up in a bar is not a targeted attack. They don’t care about you specifically, you are just convenient.

Reasons you might be the target of an attack:

You are a celebrity
You have a jealous ex with the appropriate skills or contacts
You angered the wrong people on The Internet
You are suspected of a crime
You are involved in clandestine operations of some kind

As may be obvious most of us are rather unlikely to be the subject of targeted attacks, and if your password is part of a large leak with millions of other users it doesn’t need to be nearly as good as it would need to be in the targeted case.

Think of it this way: if the attacker has a million passwords and the cost of cracking each password in a period of time acceptable to the attacker is $1, that is a million dollars, or in other words not happening. Now, a lot of the passwords are likely to be a lot cheaper than that, but if you can make yours a $10 password the attacker is going to give up and move on to the next password.

Maybe the data will keep floating around and in 10 years it will have become cheap enough to crack yours. Then it is rather important whether or not it is protecting something that is still important in 10 years. If it is simply authentication then you can just change it when you learn of the attack and everything is fine; if it is directly protecting data also in the attacker’s possession, that might not be good enough.

But if someone wants to get at you specifically, then a cost of $10 is trivial and the question becomes how expensive it has to be and how long it has to stay safe for. If you can push it to, say, $1 million to crack it in 1 year, then you have likely ruled out most attackers, and if you are reading this blog post as your primary means to learn how to protect yourself, we have probably arrived at more than good enough for whatever situation you are likely to find yourself in.

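That said, it is just a math problem at this point. If $1 million isn’t enough, add one more random character to your password and the price just passed $60 million: with lower case (26) + upper case (26) + numbers (10) = 62 possible characters, each extra random character multiplies the cost by 62.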
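
A quick Python sketch of that multiplication, assuming the attacker has to search the whole space and that cost scales linearly with the number of guesses (the function name is mine):

```python
ALPHABET_SIZE = 26 + 26 + 10  # lower case + upper case + digits = 62

def cost_after_extra_chars(base_cost, extra_chars, alphabet_size=ALPHABET_SIZE):
    """Each extra random character multiplies the search space, and so the cost."""
    return base_cost * alphabet_size ** extra_chars

for extra in range(3):
    print(extra, f"${cost_after_extra_chars(1_000_000, extra):,}")
# 0 $1,000,000
# 1 $62,000,000
# 2 $3,844,000,000
```

Two extra characters already put the price tag in the billions.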

Protecting Against Offline Attacks

If all you are in control of is the password, then making it resilient to a targeted offline attack can be very difficult indeed. Yes, a 16 character random password from a pool of letters, numbers and special characters is still good enough, even if it is only protected by the most basic of not outright broken mechanisms (i.e. individually salted HMAC-SHA256). A single modern GPU¹ can execute billions of guesses per second in this scenario, but 16 characters from a pool of 85 unique letters, numbers and special characters still gives multiple nonillion possible combinations, a nonillion being a thousand billion billion billion. In other words, even if we as a species made it our primary goal, we would still be unlikely to be able to crack it in your lifetime.
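
For anyone who wants to check the size of that number, here is a short Python sketch. The rate of one billion guesses per second is just a stand-in for "billions", not a measured figure, and the function names are mine.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def keyspace(pool_size, length):
    """Number of possible passwords of `length` drawn from `pool_size` symbols."""
    return pool_size ** length

def years_to_exhaust(pool_size, length, guesses_per_second):
    """Years needed to try every combination at a fixed guessing rate."""
    return keyspace(pool_size, length) / guesses_per_second / SECONDS_PER_YEAR

space = keyspace(85, 16)
print(f"{space / 10**30:.1f} nonillion combinations")
print(f"{years_to_exhaust(85, 16, 1e9):.1e} years at a billion guesses per second")
```

That comes out at a bit over 7 nonillion combinations, and hundreds of trillions of years for a single GPU at that assumed rate.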

The problem of course is that you would have to put in some serious effort to remember even just one such password.

Now, luckily, you are not that interesting, we are not going to make it our goal as a species to crack your password nor does it have to be protected in the weakest possible way.

A technique referred to as key stretching² allows us to effectively define how hard it is to make each guess, the only real downside to this is that you also pay this price each time you log in.

But scale is massively in your favor here, you only need to check one time each time you want to log in.

The attacker on the other hand, does benefit from being able to parallelize, that is make multiple guesses simultaneously, where that obviously doesn’t make much sense for you.

All this considered it is realistic to bring a modern GPU down from billions of guesses per second to less than 10'000 guesses per second and still make it tolerable for you to log in even on an old slow computer.
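
Here is a minimal Python sketch of what key stretching looks like in practice, using PBKDF2-SHA512 with 300'000 iterations (the same parameters my calculations are based on). The function names, the 16 byte salt and the record layout are assumptions for illustration only, not a recommendation for how to build a login system.

```python
import hashlib
import hmac
import os
import time

ITERATIONS = 300_000  # the iteration count the calculations assume

def stretch(password, salt, iterations=ITERATIONS):
    """Derive a 64 byte key; every single guess has to pay for all iterations."""
    return hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)

def new_record(password, iterations=ITERATIONS):
    """Create what would be stored: a random salt, the cost, and the derived key."""
    salt = os.urandom(16)  # assumed salt size
    return salt, iterations, stretch(password, salt, iterations)

def verify(password, record):
    """Redo the same work for a login attempt and compare in constant time."""
    salt, iterations, expected = record
    return hmac.compare_digest(stretch(password, salt, iterations), expected)

if __name__ == "__main__":
    start = time.perf_counter()
    record = new_record("wWkH8cDF")
    print(f"one derivation took {time.perf_counter() - start:.2f} s")
    print(verify("wWkH8cDF", record), verify("P4ssw0rd!", record))
```

You pay for one derivation per login, which on an ordinary computer is a fraction of a second. The attacker pays the same price for every single guess.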

And now suddenly we can drop your random password down from 16 to 8 characters, and we can scrap the annoying special characters and still keep the cost of cracking your password in less than a year north of a million dollars.
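
To see roughly where that comes from, here is a Python sketch that converts the search space into GPU-years. It assumes 10'000 guesses per second per GPU and that the attacker on average finds the password after searching half the space. What a GPU-year costs depends on hardware and electricity prices, so I leave that part out.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def gpu_years(pool_size, length, guesses_per_second_per_gpu, fraction_searched=0.5):
    """GPU-years needed to search a given fraction of the password space."""
    guesses = pool_size ** length * fraction_searched
    return guesses / guesses_per_second_per_gpu / SECONDS_PER_YEAR

print(round(gpu_years(62, 8, 10_000)))       # expected case, half the space
print(round(gpu_years(62, 8, 10_000, 1.0)))  # worst case, the whole space
```

That is roughly 350 GPUs running flat out for a year in the expected case, and about twice that to be sure, all for one password.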

That is enough to stand up to most targeted attacks, and wWkH8cDF sure is a lot easier to remember than gRnDn7N;a5$F)!#C

It is important to understand that this only works if the password is randomly generated. If you do it by hand, even by blindly hammering the keyboard, it is very unlikely to remain this good. It may still be good enough for your particular case, but it is really hard to tell for sure.
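
If you have Python at hand, a properly random password of this kind takes a few lines. This is a minimal sketch using the standard library's secrets module; the function name and the default length of 8 are my choices, matching the numbers above.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 26 + 26 + 10 = 62 symbols

def random_password(length=8, alphabet=ALPHABET):
    """Pick each character independently using the OS's secure random source."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(random_password())
print(random_password(16, ALPHABET + "!#$%&()*+-./:;<=>?@"))
```

The point of using secrets rather than your own fingers (or the random module) is that every character is equally likely and independent of the others, which is the assumption all the math rests on.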

And I should mention that the two passwords I just posted are no longer good options, if you try to be clever thinking no one would ever think you would be dumb enough to pick one of them, you are vastly underestimating how dumb some people are willing to think you are.

So why doesn’t everyone just do this?

Traditionally we have sent the password in plain text to the server (yes, hopefully encrypted with HTTPS) and the server then performs all the work to check the password. It would be rather expensive for the server to run every password through such a computationally heavy process.

That doesn’t mean it couldn’t be done. A modern browser on a computer that is still likely to be in use by any sane person today is perfectly capable of performing the process suggested above to the degree suggested above. But it would require moving that calculation from the server to the client, and for some reason very few service providers seem willing to do this. Also, the culture still dictates 2FA, and if you are doing 2FA there is really no need to bother, even if it would be a better experience. And unlike 2FA, key stretching still doesn’t work if the user picks 1234 as their password.

However, if you use a password manager that was written by someone even remotely awake, then this process is being used to derive the encryption key for storing your passwords.

So What Should You Actually Do?

First and foremost, forget about being clever; the computer you are up against is too stupid to be tricked. Secondly, stop worrying so much; your password is not about to be hacked by a thousand hungry hackers in a foreign country.

Even if you handcraft your password, so long as it is reasonably random in nature and not overly short, it will be fine for authentication purposes in any practical scenario you are likely to encounter.

As a simple rule of thumb 8 characters is about as short as you should ever go, and the more human friendly you want it to be the longer you should make it.

In truth, there are lots of places where you could get away with 1234 as your password, places where no one has ever tried and no one is ever likely to try to crack your password, identifying them is a bit tricky and I would never recommend that you try it. But it is worth understanding that not everything is under constant attack.

If you are able to remember 8 properly random characters then you are fine for pretty much any practical scenario a normal person is ever going to find themselves in. Even your bank without 2FA would be fine with a password like this, provided this is the only place you use that password.

And writing your passwords down on paper with a pen is a perfectly reasonable solution for most cases, provided your physical space is reasonably safe. You may want to take some extra care to protect this piece of paper if you have teenagers in the house though.

The most important thing you can do is to not reuse a password you use for something you really care about, especially do not reuse it somewhere with substantially lower security needs.

Using 2FA where security is both really important to you and highly profitable for an attacker to breach, such as your bank, does make a lot of sense, and the consequences should you lose your second factor are usually far more manageable than getting hacked.

Using a reputable password manager or (shameless plug) a high security note program like Mimiri Notes is a good way to take away the complexity of keeping track of passwords, recovery keys and other sensitive data.

¹ I’m basing my calculations off an RTX 4090
² I’m basing my calculations off of PBKDF2-SHA512 with 300'000 iterations