Bitwarden Research 4: Conclusion

To recap this series:

Post 1: we introduced the Bitwarden server security model
Post 2: we walked through our hashcat plugin with advice on writing your own
Post 3: we released a Python implementation of ASP.NET Core’s Data Protection to support automatically retrieving and formatting hashes from a Bitwarden server instance
Post 4: benchmarks, lessons learned, and final words

Lessons

Things that are theoretically possible are usually practically possible too. Everyone knows that password hashes are brute-forceable, but it’s easy to stop there. With a little push, we were able to develop the capability to make it practical too, even against a novel hash type.
Writing your first hashcat plugin is difficult, but it is achievable. I strongly believe that future plugins will be much easier to write.
For many parts of this research project, it was essential to have a full reimplementation of algorithms using cryptographic primitives (PBKDF2 isn’t a primitive). There is the temptation to say “well I have my Python script using hashlib.pbkdf2_hmac library function calls, that’ll work”, but it just wasn’t possible to split the underlying HMAC calls between the kernel functions without a deeper understanding of the underlying algorithms and the ability to print out intermediate values.
If things aren’t working like you expect, ~~rtfm~~ consult the documentation. I spent a while struggling with Perl’s Crypt::PBKDF2->new(...)->PBKDF2(a, b) in a test script before realizing that the function actually takes (salt, password), the opposite order from Python libraries. Checking the docs earlier, instead of assuming I was doing something else wrong, would have saved me some time.
“Encryption at rest” is a frequently abused concept. It usually means “let’s lock the safe but keep the key right next to it”. In security, words are meaningless; it’s implementations that matter.
Server operators should continue to secure server database files, but don’t need to be more concerned than they were. See the advice section in the first post for more details.
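
To illustrate the lesson about reimplementing from primitives, here is a minimal sketch of PBKDF2-HMAC-SHA256 built directly on HMAC, following the standard PBKDF2 definition. It is not the code from our plugin; the function name and the `trace` flag are made up for this example. The point is that once the first HMAC call and the iteration loop are written out separately, intermediate values can be printed and compared against another implementation.

```python
import hashlib
import hmac
import struct


def pbkdf2_hmac_sha256(password: bytes, salt: bytes, iterations: int,
                       dklen: int = 32, trace: bool = False) -> bytes:
    """PBKDF2 written out in terms of HMAC-SHA256 (hypothetical helper)."""
    derived = b""
    block_index = 1
    while len(derived) < dklen:
        # First HMAC of the block: U_1 = HMAC(password, salt || INT_32_BE(i)).
        u = hmac.new(password, salt + struct.pack(">I", block_index),
                     hashlib.sha256).digest()
        t = int.from_bytes(u, "big")
        if trace:
            print(f"block {block_index} U_1 = {u.hex()}")
        # Remaining iterations: U_j = HMAC(password, U_{j-1}), XORed together.
        for _ in range(2, iterations + 1):
            u = hmac.new(password, u, hashlib.sha256).digest()
            t ^= int.from_bytes(u, "big")
        derived += t.to_bytes(32, "big")
        block_index += 1
    return derived[:dklen]


if __name__ == "__main__":
    # Sanity check against the library function the sketch is meant to mirror.
    ours = pbkdf2_hmac_sha256(b"password", b"salt", 1000, trace=True)
    assert ours == hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 1000)
```

The separation between the first HMAC call and the loop is the kind of boundary that matters when the work has to be divided between kernel functions, although the exact split in any given plugin is a design decision this sketch does not try to reproduce.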

Alternative Method

Near the end of the technical work I had the realization that there was another way to solve the problem, and it was probably a better way. Our hashcat plugin brute-forces the Master Password by calculating the full authentication hash for each password candidate and comparing it against the value from the database. If you want to challenge yourself, see if you can come up with another avenue of brute-forcing, referring to the intro post as needed.

The alternative method is brute-forcing by first calculating the Master Key, and instead of using it to calculate the Master Password Hash and then authentication hash, using the Master Key to decrypt the Protected Symmetric Key with AES-256. With a naive single-threaded Python implementation, this method was roughly 30% faster than the full-hashing method.
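
The difference between the two methods can be sketched roughly as follows. This is an illustration under several assumptions, not our implementation: the function names are hypothetical, the KDF parameters (PBKDF2-HMAC-SHA256 salted with the account email, and a single extra round for the Master Password Hash) are my simplified reading of the Bitwarden security model, the server-side encoding is simplified, and the AES-256 decryption attempt is left as a caller-supplied `decrypts_ok` callback so that the sketch stays within the standard library.

```python
import hashlib
import hmac
from typing import Callable


def master_key(password: bytes, email: bytes, client_iters: int) -> bytes:
    # Assumed client-side KDF: PBKDF2-HMAC-SHA256 salted with the email.
    return hashlib.pbkdf2_hmac("sha256", password, email, client_iters, dklen=32)


def check_full_hash(password: bytes, email: bytes, client_iters: int,
                    server_salt: bytes, server_iters: int,
                    stored_hash: bytes) -> bool:
    """Full-hashing method: recompute every layer, compare with the database."""
    mk = master_key(password, email, client_iters)
    # Master Password Hash: one more round keyed by the Master Key (assumed).
    mph = hashlib.pbkdf2_hmac("sha256", mk, password, 1, dklen=32)
    # Server-side layer (simplified; real encodings may differ).
    auth = hashlib.pbkdf2_hmac("sha256", mph, server_salt, server_iters,
                               dklen=len(stored_hash))
    return hmac.compare_digest(auth, stored_hash)


def check_alternative(password: bytes, email: bytes, client_iters: int,
                      decrypts_ok: Callable[[bytes], bool]) -> bool:
    """Alternative method: stop at the Master Key and try to decrypt with it."""
    mk = master_key(password, email, client_iters)
    # decrypts_ok is a hypothetical callback that attempts AES-256 decryption
    # of the Protected Symmetric Key and reports whether the result is valid.
    return decrypts_ok(mk)
```

The alternative check skips the Master Password Hash and the whole server-side layer, replacing them with a single decryption attempt per candidate.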

I only looked at the Bitwarden architecture after implementing the full-hashing method in hashcat, and by that time our prototype was working well enough to be functional in our engagements, which unfortunately left me no compelling excuse to take the time to rewrite the plugin.

Benchmarks

The following are very approximate benchmarks. The relevant hardware components are an AMD Ryzen 5 1600 CPU and an NVIDIA GeForce RTX 4090 Founders Edition.

benchmark	speed
naive Python	3 H/s
naive Python, alternative method	4 H/s
hashcat plugin, run on CPU	24 H/s
hashcat plugin, run on GPU via CUDA	12,000 H/s
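
To put these rates in perspective, the following sketch estimates how long each setup would take to exhaust a candidate list. The rates are the approximate figures from the table; the 10-million-entry wordlist is a hypothetical size chosen only for illustration, and the function name is made up for this example.

```python
# Approximate rates from the benchmark table, in hashes per second.
BENCHMARKS_HPS = {
    "naive Python": 3,
    "naive Python, alternative method": 4,
    "hashcat plugin, run on CPU": 24,
    "hashcat plugin, run on GPU via CUDA": 12_000,
}

# Hypothetical wordlist size, not a figure from our engagements.
WORDLIST_SIZE = 10_000_000


def seconds_to_exhaust(candidates: int, hashes_per_second: float) -> float:
    """Worst-case time to try every candidate at a constant rate."""
    return candidates / hashes_per_second


if __name__ == "__main__":
    for name, rate in BENCHMARKS_HPS.items():
        seconds = seconds_to_exhaust(WORDLIST_SIZE, rate)
        print(f"{name}: {seconds / 3600:,.1f} hours ({seconds / 86400:,.2f} days)")
```

At these rates the GPU run works through the list in minutes, while the naive Python implementation needs more than a month for the same list.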

Ethics

When releasing offensive security research, there is always an ethical consideration: does the benefit of releasing the research outweigh the harm? In this case, even more than usual, the answer is a resounding yes.

First, the benefits: It raises awareness of the ability of attackers to brute-force compromised Bitwarden server databases and provides defenders with real-world benchmark numbers to judge risk. Even though this is a known attack, the public commonly doesn’t take attacks seriously unless a proof of concept is available. By publishing a PoC, we provide a concrete appraisal of the risk associated with database compromise for those who deploy Bitwarden server instances. In addition, by providing source code and the documentation in this series, we help other security researchers build on this work and expand the state of the art. More practically, it also lets sysadmins attempt recovery of their own secrets after a password is forgotten or a critical individual becomes unavailable.

Next, the harm, or lack of it: This series covers a well-known attack that is documented in the Bitwarden security model; no new vulnerability is being published. The code published won’t even work against modern Bitwarden hashes. In addition, attackers can’t even use the technique described in this series without first compromising a Bitwarden server database. In my experience, an attacker with the capabilities to compromise a Bitwarden server instance also has the capability to independently recreate this research. In summary, this research would not give an attacker an ability to conduct an attack that they wouldn’t already be able to execute.

What’s Next?

Even though we have released a fully working hashcat plugin to crack certain Bitwarden server hashes and the tooling to retrieve them from a server database, there are still limitations. As discussed in the first post, the plugin only supports older hashes, and as discussed earlier in this post, there is a more efficient algorithm for cracking passwords.

Even with those limitations, we were still able to crack multiple real-world databases. The passwords were of sufficient complexity that it would not have been possible to crack any of them with a naive Python implementation.

Despite this success, there is room for improvement:

Adding support for client-side Argon2 hashing. This would be in the _init and _loop kernels.
Switching the server-side algorithm to attempt decryption of the Symmetric Key instead of calculating the second/outer layer of PBKDF2 hashing. This would be in the _init2, _loop2, and _comp kernels.

Both modifications would also require corresponding changes to the C module.

Please open an issue in the repo with comments, questions, or improvements you’ve made.

ivision Security Research

During ivision engagements, we frequently encounter challenges like the one described in this blog post series, where we have to analyze a system and implement a custom tool or exploit. If your company could benefit from a security assessment or other bespoke service that makes use of our broad skill set, check out our main site for more information or contact us and mention this post.