In defense of deterministic password managers

tldr; Although they have some shortcomings, deterministic password management is a great idea with a lot of benefits!

Today I read this post which was trying to convince people to switch to stateful password managers which store all passwords in some central data storage locally or on the cloud. I have to admit that the article had some valid points, but generally, it did poorly to convince me that deterministic password managers are bad.

First let me explain what a non-deterministic password manager is.

Normally, when you register in a website, you use a password manager to keep track of all of your passwords. So after signing up for a new email account (e.g. mahdix@gmail.com), you will store email id and it’s password (which is chosen by you), in the password manager’s database.

password-manager
How password manager works? [Image taken from Zoho.com website]
Obviously in this scheme, the central database becomes the most important component which needs extra protection. Where will this database be stored? Either in the service provider’s servers which is out of your control, or locally on your laptop or mobile phone. So if there is a data breach or your mobile phone is infected with a virus, your passwords can be in wrong hands! Of course there are a number of security measures to protect confidentiality of these data but as long as you have no control over them, there is a risk.

What is the solution? There has recently been a new way to manage password which I call deterministic password management (or DPM for short). This means that there is no central password database. Everything is inside your head! There is no need to have a cloud-based server to store your passwords. Generally, you have a basic passphrase which I call the “master password”. All other passwords are generated uniquely using a formula based on “master password” and some other public information. As a result, you don’t need to memorize 100 passwords or save them anywhere! Just memorize the master password and the reset can be automated.

For example for website “gmail.com” and username “mahdix” the password will be calculated using some formula like this:

password = Hash(master_password, 'gmail.com', 'mahdix')

password-managers-no-storage

The most important advantage of this scheme is that there is no need to store sensitive data anywhere! I would like to stress the term “sensitive” because later I will refer to this feature. The other advantage is that your password will all be strong passwords because they are generated using a hash function which causes the result to be semi-random. There will no longer be passwords like “letmein” or “admin” in this scheme.

What are flaws of Deterministic Password Managers?

Now, back to the initial point! The blog post discusses some reasons why this type of password management is not secure and as the author claims: “has fatal flaws”.

There are four points discussed in the article:

  1. DPMs cannot accommodate varying password policies without keeping state: I disagree with this one. As long as the hash function is complex enough (which normally is), the generated password can be accoreding to every sane password policy. The examples that author provides: “password must be at least 8 characters long” or “password must be lowercase only” either are by default satisfied or don’t make sense! I have never seen a website which insists password must be lowercase.
  2. DPMs cannot handle recovation of exposed passwords without keeping state: I agree with this one but see point below.
  3. DPMs cannot store existing secrets: Agreed but it can be fixed without loosing security. See the point below.
  4. Exposed master password alone exposes all of your site passwords: I disagree with this point because it is not an issue of DPMs per-se. If your email account’s password is exposed, it will probably have similar effect! That is how email works. This is the natural side-effect of using email.

State vs Sensitive data

I would like to differentiate between state of a password manager with sensitive data (e.g. passwords). Of course for a traditional password manager like 1Password, these are the same: State of the password manager is the collection of stored password. So it is sensitive data by definition.

But for DPMs this is not the case. You can have some basic and minimal state which is not sensitive. What does that mean? It means that this state can be publicly exposed and available without loosing security of any of your accounts. The only important security feature of state in this sense is “Integrity”. They should not be altered without your consent. If this feature is satisfied then there is no problem for a DPM to have publicly available state.

For example you can store this state in your Google Drive or Dropbox or even on GitHub! Even if you want to maintain higher level of security you can encrypt those information using the “master password”, but that won’t be required.

Now, how can using state address the two problems stated above: Password revocation and storing existing secrets?

How to store existing secrets in a DPM?

Suppose that you already have an account with password: EP. Now you want to switch to DPM. Can you keep this password? It is possible using State.

Suppose that DLM expects the password of this account to be CP where CP is result of the Hash function formula of the system. It is enough to store “XP=CP^EP” in the State storage of the system (^ represents XOR operation). Now, it is impossible to retain any information about “EP” with having only “XP“. But user can easily calculate “EP” by calculating “CP” and then: “EP=CP^XP” (You can use any other function with similar property).

Using this method, a user can store his existing password without revealing any sensitive information. Same approach can be used to renew a password. Just keep a public parameter (like counter), in the State data. It won’t expose anything sensitive and can be used to renew the password. Just increment the counter and the password will be:

Password = Hash(master_password, website, user_id, counter)

This can easily be incorporated to any DLM and provide features to re-new password or add pre-existing passwords.

Of course there will always be a trade-off . So for above mentioned advantages, there are some drawbacks too. Most notably, if your master password is exposed or stolen, all of your accounts will be at risk. This is the natural consequence of using “Master Password”.

Why do I prefer statically typed programming languages?

tldr;  Dynamic typing sucks!

I have been working with a dynamic programming language, Perl, for the past 1.5 years. I also have worked with Java and C# languages during my career and at this point, I can really say that static typing makes the life of developers much easier.

So let me first explain the difference between these two terms and then I will talk about my reasons.

Dynamic Typing

Dynamically typed programming languages, are languages where the type of the data is not specified in the source code, but when the program is being executed. So the developer is not responsible for thinking about whether some variable should hold a number, string or an instance of DateTime object. He just uses a variable name. When the code is being executed, the runtime environment (interpreter, JIT compiler, …) will deduce the type which is most appropriate for that variable.

For example, the developer can write something like this, in Perl:

my $birth_date = Some::Module::get_birth_date();
$birth_date->plus_time_interval('18y');
my $day_of_week = $birth_date->day_of_week;

In the above code, type of ‘$birth_date‘ and ‘$day_of_week‘ are not explicitly specified. When someone reads this piece of code, it clearly implies that `$day_of_week` should be an integer number, but in a large codebase, this is not always possible when you look at the code. So, for example, you really don’t know the type of `$birth_date` unless you read the source code of the `get_birth_date` function, or you can’t know if `plus_time_interval‘ expects an integer or string or something else unless you have used that function before.

Now assume someone is responsible for maintaining or fixing a bug in a 100K SLOC codebase. When investigating the flow of execution of the code, it sometimes becomes really hard and complicated to find out the real type of a variable.

Examples of dynamically typed programming languages are: Perl, Python and Ruby and Javascript.

Static typing

On the other hand, in statically typed programming languages, the developer is responsible to explicitly state the expected type of all of the variables or function outputs. This will limit the scope of valid operations on a variable and help the compiler (or possibly the interpreter) to check for invalid operations. For example, when the developer defines an integer variable, he cannot treat that variable as a string or assign a string literal to it because the compiler will catch such errors. An example of statically typed code:

int x = 12;
int y = x+y;
Customer c = find_customer("id", y);
c.name = "Mahdi";
c.save();

In the above code, type of the variables `x`, `y` and the input/output of the function `find_customer` are all specified statically within the source code. You cannot pass an integer parameter as the first argument of `find_customer` and compile the code because you will get a compiler error.

So writing a statically typed code seems harder because the developer should write more code and keep track of variable types in his mind when writing code. Right? Yes! That is right if and only if you want to write a few lines of code and keep that for a relatively short period of time and ignore all of that afterward. But if you want to keep a relatively large code base over a longer period of time (which is the case most of the time, in thr software industry) this burden proves to have some benefits.

But if you want to keep a relatively large code base over a longer period of time (which is the case most of the time, in the software industry) this burden proves to have some benefits.

Examples of statically typed programming languages include C, C++, Java and C#.

Why static typing?

As I pointed out in the previous section, unless you want to write a few lines of code, keep it for a short period of time and discard everything afterward, the statically typed programming language have a clear advantage over dynamically typed ones.

The point is, the source code of a software is written once, but it will be read a lot of times (for review, bug fix, refactoring, migrations, optimizations, …). So the basic idea in static typing is that you write some more code and take care of some more things in the code when writing the code, and in exchange, you (or others) will have a much easier life when they want to read it. Another clear benefit is that compiler can help you find out lots of possible errors before releasing your code to the production environment.

I have had numerous cases of “Can't locate object method 'xyz' via package "ABC::DEF"` error in production server (Because in the code, someone was trying to call a method on a variable which was expected to be of type T, but at runtime, it had another type). That is because of we, human beings, make mistakes. And computers are there to help us prevent those mistakes. If we just ignore their help (And use dynamic typing), then this is what happens next :-)

But don’t get me wrong. There are cases where dynamic typing is the preferred choice when writing a software system. Generally speaking, if the code base size is relatively small (say 1-10K SLOC) with a small development team (~5 people), then dynamic typing can have benefits. Because obviously developers can write the code much faster and time-to-market will be much shorter. And this is what normally happens for startups where founders want to deliver the product/prototype as soon as possible.

I have seen some people say that we can achieve same advantages of static typing in a dynamically typed programming language by writing good tests. This is possible in theory but if you implement all of the required test cases for this purpose, then you will be doing what the computer is supposed to do! Manually doing something which can be automated. And by the way, the result code based will become much larger because of all those tests! which eliminates one of the important advantages of dynamic typing: more concise code with less boilerplate.