One of the more common findings we report for Android security reviews is an issue involving hard coded secrets. This blog post will specifically focus on hard coded secrets used for encrypting application data. I’ll try to use a bit of light threat modeling and risk assessment to guide the process, just as I would when approaching a problem like this on an engagement.

Threat model

When evaluating whether or not to encrypt our application data, and which encryption method to use, it is important to have a threat model in place to guide our choice. The threat model starts with the data we want to protect and what encrypting application data can achieve. Here, we’ll focus on information disclosure as our primary threat. Android applications can hold sensitive information such as passwords, valid API tokens, location data, personally identifying information, etc. There are good reasons to store this information, so what follows will operate under the assumption that “don’t store it on the phone” isn’t a valid solution.

The next question we’ll have to think about is, “how would such data be exfiltrated from the application itself?” If there was no method to exfiltrate the data, we wouldn’t need to encrypt it in the first place!

The list would look something like this:

  • Rooted phone
  • Hard disk forensics
  • Social engineering
  • ADB backup
  • Android’s run-as command with a debuggable application
  • Bug in application allowing for data exfiltration
  • Code execution vulnerability in the application itself

Evaluating the risk

The first, and most important, thing to remember when viewing an issue such as “hard coded secrets” is that the attacker has full access to an application’s APK. This includes all resources and all Dalvik bytecode, which can be disassembled into smali or even decompiled back into Java. They will also have the ability to debug our application while it is running on the phone. While there are obfuscation methods to make these tasks significantly more difficult, they do not fundamentally address the problem we’re investigating here; therefore, the assumption of a reversible application will stand throughout this assessment.

First, it is worthwhile to note here that Android makes a concerted effort to isolate applications from each other. Primarily, our application is given its own unique Linux user and group, and any files made by the application are only accessible by this user and group unless we explicitly state otherwise. Android also makes heavy use of SELinux to further sandbox applications, including its own services, so that compromising them doesn’t necessarily mean full root access. In some cases, these protections are already strong enough to require no extra encryption.

Rooted Phone

The case of a rooted phone is fairly straightforward: Android does not allow a regular user to root their phone easily. Given this, we know that a user with a rooted phone has either done so willingly, or has fallen victim to sophisticated malware that exploits Android itself. In the first case we have two options: employ rooting detection and change the application’s behavior on a rooted phone, or accept the risk of a user who rooted their own phone being more vulnerable. The second case requires a more sophisticated attacker, so we will need to evaluate the probability that our application will be the target for this attacker. The second case is particularly difficult to deal with, but Android can still potentially protect some application secrets even in the case of rooting.

Hard Disk Forensics

Hard disk forensics would be an issue if a user’s phone were to fall into the hands of someone who knows how to image and read a hard disk. This isn’t as hard as it sounds, so it is a worthwhile method to consider. Luckily Android provides file based or full disk encryption to help here.

Social Engineering

Social engineering is always a possibility for stealing a user’s secrets and encryption doesn’t necessarily help here. While it would help in the case where the attacker is trying to convince someone to install some malware on their phone that targets your application, that is also mitigated through general Android security practices. If this is our only reason for encrypting application data and we think it is valid, then it might be better to tackle this problem at a different level.

ADB Backup

ADB backup is a pretty well publicized issue in which users with Android debugging enabled could inadvertently allow application data to be backed up by a malicious USB port or by an attacker with physical access to their phone. The former would look something like plugging a phone into some USB port that “looks safe” but actually attempts to use the adb backup command. Neither of these is particularly easy to pull off anymore, since the phone needs to be unlocked and user interaction is required to start the backup. This can also be mitigated through other means, particularly a configuration option in the AndroidManifest.xml file (the android:allowBackup attribute).

run-as

Applications that are debuggable allow anyone that can get a shell on the phone to use the run-as {application} command to access the filesystem as if they were the application itself. Android warns you when applications are hard coded to be debuggable, but the mitigation here is simple: don’t deliver a debuggable application.

Application Bugs

Bugs in the application are an interesting vector to take into account. The risk here is that data will be exfiltrated from the application due to an issue with application logic. Exposed IPC services (things such as exported Services, Broadcast Receivers, and Content Providers) would be a good place to look for these types of issues. These are some of the things that make Android so versatile, but they are also a major target for malware. If we have exported IPC services, we can bet that is the first place an attacker will look for an entry point into our application.

Code Execution Vulnerabilities

Code execution in the application itself could be game over for any application data without a sufficiently robust encryption mechanism. We could consider attempting to mitigate the damage this causes to application data integrity and secrecy, but it is difficult in general.

Putting it all together

Now that we have an overview of what is needed to protect application data in an Android application, we can then move to trying to understand potential solutions. First off, we’ll separate these into a list of things that are already, at least partially, mitigated by Android and things that aren’t at all.

Mitigated (at least partially) by Android:

  • Rooted phone
  • Hard disk forensics
  • ADB backup
  • Debuggable application allowing run-as

Not mitigated by Android:

  • Social engineering
  • Bug in application allowing for data exfiltration
  • Bug in application allowing arbitrary command execution

Let’s ignore social engineering due to the complexity of that topic. If we think there is a chance our users will be targets of social engineering attacks that hinge on a lack of encryption, then the solutions we come up with for the other issues should cover us there, so that vector doesn’t help us much and just adds a lot of noise. The data exfiltration and command execution issues are entirely on us, and for the [partially] mitigated threats we’d be adding defense in depth. Looking at this, we can come up with the following scenarios for which our encryption would be vital to protecting application data:

  1. A malicious actor has discovered a method to exfiltrate data from our application due to a logic bug
  2. A malicious actor has discovered a method to run arbitrary code in the context of our application
  3. A malicious actor has discovered a flaw in the Android operating system that allows access to our application sandbox

These scenarios will guide our thought process on how to protect application data on the device. Note that we’re not thinking about data in or out of the device, only data at rest, on the device’s disk or in memory. Data in transit is a whole different game and has different out of the box solutions such as TLS. The last issue on that list we’ll try to keep in mind as best we can with the caveat that it is probably less likely than the others and less under our control. If you are handling critical user data, you may need to think about that as a more immediate threat when reading through the next parts.

Do we even need to encrypt?

With all of that in mind we can now ask the question: Do I even need to encrypt application data? If the answer is no, thanks for reading! Seriously though, there is nothing wrong with coming to the conclusion that we do not have data that is sensitive enough to merit encryption at rest on the device itself given the protections already in place. Not everything needs to be encrypted, and we can focus our development time on other, more pertinent issues if this is the case. As long as we understand and can accept the potential risk, we can move on. If all your application does is allow me to put digital stickers on top of my cat pictures, I’m not going to expect encryption.

If we do think that we have data that is sensitive and at risk in the scenarios we’ve discussed above, let’s move on and think about how it is done.

Hard Coded Keys

So we finally got to where we started: people are hard coding encryption keys, smacking the roof, and saying “this bad boy has Military Grade Encryption”. Given what we’ve thought about now, it should be easy to see why this doesn’t solve the problem, but let’s go over it methodically.

There are two ways we can include a hard coded key in an application:

  1. Plaintext
  2. Obfuscated

If our malicious actor is able to get a hold of an APK – something we are assuming true in our model above – method number one is bypassed in a few seconds via apktool and grep, so we’re going to go ahead and give a firm answer: No. Given the attack scenarios described above, we gain almost nothing from a hard coded plaintext encryption key.
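To make that concrete, here is a minimal sketch of the anti-pattern and the attack, written as plain JVM Kotlin so it runs off-device. The key value and the function names (appEncrypt, attackerDecrypt) are made up for illustration and are not from any real application.

```kotlin
import javax.crypto.Cipher
import javax.crypto.spec.GCMParameterSpec
import javax.crypto.spec.SecretKeySpec

// Anti-pattern: this string ships inside every copy of the APK (hypothetical value).
val HARD_CODED_KEY = "0123456789abcdef0123456789abcdef"

// What the application does on a user's device.
fun appEncrypt(plaintext: ByteArray): ByteArray {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, SecretKeySpec(HARD_CODED_KEY.toByteArray(), "AES"))
    // Prefix the randomly generated 12 byte IV so it travels with the ciphertext.
    return cipher.iv + cipher.doFinal(plaintext)
}

// What an attacker does on their own machine once the string has been found in the APK.
fun attackerDecrypt(recoveredKey: String, blob: ByteArray): ByteArray {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    val spec = GCMParameterSpec(128, blob.copyOfRange(0, 12))
    cipher.init(Cipher.DECRYPT_MODE, SecretKeySpec(recoveredKey.toByteArray(), "AES"), spec)
    return cipher.doFinal(blob.copyOfRange(12, blob.size))
}
```

Nothing in attackerDecrypt depends on the victim’s device: the recovered string is all that is needed, and it is the same string for every install.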

Next, we’ll move on to option two: obfuscated. I think everyone is tired of the phrase “security through obscurity” at this point, but that is what we’d be getting here. The honest truth is that we do gain something from this method: we will stop the people who do not have the time, desire, or skill to reverse our obfuscation algorithm. The problem left over here is twofold:

  1. There aren’t only unskilled, unmotivated attackers
  2. Software is easy to share

Given those two issues in combination, we can see how obfuscation isn’t adding much either. Once anyone reverses the obfuscation algorithm, they can release a tool to do it for everyone. We’d then have to answer by changing our algorithm and the cat and mouse game continues. This costs the attacker time, but it does not fundamentally address the issue.
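As a toy illustration of why this doesn’t hold up, assume the obfuscation is a single byte XOR mask. Both the mask and the key below are hypothetical, and real schemes are more elaborate, but the shape of the attack is the same: once the routine has been read out of the decompiled code, it is a few lines that anyone can run.

```kotlin
// XOR with a fixed mask is its own inverse, so one function both hides and reveals.
fun xorMask(data: ByteArray, mask: Byte): ByteArray =
    ByteArray(data.size) { i -> (data[i].toInt() xor mask.toInt()).toByte() }

// Build time: only the masked bytes end up in the APK (hypothetical key and mask).
val shippedBytes = xorMask("0123456789abcdef".toByteArray(), 0x5A)

// Attacker side: the same routine, copied from the decompiled application,
// recovers the key for every install at once.
val recoveredKey = xorMask(shippedBytes, 0x5A).decodeToString()
```

The attacker’s version of xorMask is exactly the “tool” that gets shared: it works against every copy of the application until we ship a new scheme.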

Both methods are going to help us in the case that data is leaked without any context about what the data is, but that isn’t a particularly important scenario in our model here: it is unrealistic to think an attacker just found some bits from one of our users out in the wild where bits roam free. We end up with the conclusion that hard coded encryption keys do not address the problem we wanted to solve with encryption in the first place: if an attacker is able to exfiltrate encrypted data from our application on any device, they can decrypt it with minimal effort.

Key derivation from user supplied data

So hard coded keys didn’t address the problem of an attacker being able to decrypt exfiltrated application data. The shortcomings were:

  1. A plaintext key was recoverable
  2. A user who recovered the key from a single APK could decrypt exfiltrated data from any application instance

Trying to address these two issues simultaneously leads to a potential time honored solution: per user key generation with PBKDF2. PBKDF2 has been around for a while and it is used to generate cryptographic keys from user supplied data. This is great because it gives us solutions to both of the issues hard coded keys left us with. First, the keys aren’t going to be stored anywhere in plaintext in the APK itself, and second, successfully recovering Bob’s key does not, in general, allow us to decrypt Alice’s data: we’d need to get Alice’s key too. PBKDF2 works by taking user input, combining it with a random salt, and then repeatedly applying a keyed hash (HMAC) for a configurable number of iterations to end up with a key that can be used for cryptography. Note that the random salt does not have to be secret, but it must not be static. The point of the salt is to stop Bob’s qwerty1 from ending up with the same end key as Alice’s qwerty1: so even though Bob and Alice have the same password, we’d never know by looking at the output key.
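A minimal sketch of that derivation, using the standard javax.crypto classes, might look like the following. The function names, the 16 byte salt, the 256 bit output, and the default iteration count are illustrative assumptions rather than recommendations, and the PBKDF2WithHmacSHA256 algorithm name is assumed to be available on the target platform.

```kotlin
import java.security.SecureRandom
import javax.crypto.SecretKeyFactory
import javax.crypto.spec.PBEKeySpec

// A fresh, non-secret salt: generate it once per user and store it next to the ciphertext.
fun newSalt(): ByteArray = ByteArray(16).also { SecureRandom().nextBytes(it) }

// Derive 256 bits of key material from the password and salt.
// The iteration count is an assumed value; tune it to what the target devices tolerate.
fun deriveKey(password: CharArray, salt: ByteArray, iterations: Int = 100_000): ByteArray {
    val spec = PBEKeySpec(password, salt, iterations, 256)
    return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256").generateSecret(spec).encoded
}
```

Calling deriveKey twice with the same password and salt gives the same key, which is what lets us decrypt later. Calling it with Bob’s salt and then Alice’s salt gives two unrelated keys, even if both chose qwerty1.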

PBKDF2 is great and it might be the solution we end up with, but we have to think about some of its shortcomings. First, the essential secret is whatever is chosen as the user’s password. We know, through decades of experience, that this is not a good thing to fall back on if we can avoid it: people choose weak passwords. This is true especially when they have to type them on mobile phone keyboards every time they want to use an application. If every application on the phone was requesting a password, we could imagine a quick convergence to passwords such as password123 or qwerty1. For usability this often gets translated into taking output from a PIN screen and passing that through PBKDF2, but that leaves us in an even worse situation with respect to brute force: it is possible to exhaust the entire 4 digit PIN space in less than an hour, which is barely better than a plaintext key.
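The brute force against a PIN derived key can be sketched in a few lines. Here pinToKey and bruteForcePin are hypothetical names, and the isCorrect callback stands in for whatever check the attacker has available, such as “does this key decrypt the stolen data without an authentication error?”.

```kotlin
import javax.crypto.SecretKeyFactory
import javax.crypto.spec.PBEKeySpec

// Same derivation the application would use, applied to a PIN instead of a password.
fun pinToKey(pin: String, salt: ByteArray, iterations: Int): ByteArray {
    val spec = PBEKeySpec(pin.toCharArray(), salt, iterations, 256)
    return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256").generateSecret(spec).encoded
}

// Walk the whole 4 digit space (0000 to 9999) and return the first PIN whose key is accepted.
fun bruteForcePin(salt: ByteArray, iterations: Int, isCorrect: (ByteArray) -> Boolean): String? =
    (0..9999).asSequence()
        .map { it.toString().padStart(4, '0') }
        .firstOrNull { isCorrect(pinToKey(it, salt, iterations)) }
```

There are only 10,000 candidates. If we assume each derivation costs the attacker 100 ms, the whole space takes 10,000 × 0.1 s = 1,000 s, or roughly 17 minutes, and the salt does nothing to slow this down because it is stored alongside the data.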

Lastly, the keys are based on data that is in plaintext at some point in the application’s memory and the key itself is stored in application memory. This would mean that the full key could be recovered through bugs that let an attacker read application memory or through command execution inside of the application itself. These are not common bugs in Android applications by any means, but it is still worth mentioning.

With regards to our threats this leaves us in the following situation:

  1. Given code execution or the ability to read application memory the full key can be recovered in plaintext
  2. There is a reasonable chance an attacker could brute force a key generated by a weak password
  3. Users are not always responsible with password storage and could leak it themselves

This leads us to the conclusion that PBKDF2 is better than the hard coded alternative, but it’d be nice if we could do something more for both UX and cryptographic reasons.

Generating a random key on first startup

Another, similar option would be to generate a random key when the application is first installed. This is nice because we’ll end up with a strong key if we generate it using a secure random number generator and we’ll have a different key per application instance: this solves both of our problems with the hard coded keys. The problem here is storage: we have to store the key somewhere, and, in our motivation for encryption, we’re assuming that the application sandbox is not safe enough for the storage of secrets in plaintext. This leads us to storing the user’s encryption key on our server and requiring internet access to use the application. If this fits our application, then maybe we could go with this route. We’ll have to think about the issues involved with storing the key on our server and safely transmitting it, but that could be something we’re OK with.
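Generating the key itself is the easy part. A sketch using the standard JCA classes follows; generateInstallKey is a made up name, and where the returned key is kept afterwards is exactly the open question described above.

```kotlin
import java.security.SecureRandom
import javax.crypto.KeyGenerator
import javax.crypto.SecretKey

// Create a random 256 bit AES key, intended to be called once per application instance.
fun generateInstallKey(): SecretKey {
    val keyGen = KeyGenerator.getInstance("AES")
    keyGen.init(256, SecureRandom())
    return keyGen.generateKey()
}
```

Unlike the hard coded and password derived keys, this key is both unpredictable and different for every install, but its raw bytes are available to our code (through the encoded property) and have to be persisted somewhere.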

With regards to our threats this leaves us in the following situation:

  1. Given code execution or the ability to read application memory the full key can be recovered in plaintext
  2. The key must be stored somewhere that the application can access, and we are in charge of the security of that storage
  3. If the key is transmitted over the internet, we have to think about security of the transport

Android’s KeyStore API

The PBKDF2 method led us down the right track of getting a potentially unique key per user and removing it from the APK in plaintext. The shortcomings with this method were:

  1. We rely on a user’s password or, worse, a PIN
  2. The user’s password and key are in application memory and can be recovered

The random generation method also led us down the right track with a strong and much more likely to be unique key per user, but it came with a storage problem. How can we remove that? You can bet the people working on Android have thought about this problem too, and they implemented a solution that looks to work quite well but that I don’t see in use much: the Android KeyStore API. This quote is promising:

The Android Keystore system lets you store cryptographic keys in a container to make it more difficult to extract from the device. Once keys are in the keystore, they can be used for cryptographic operations with the key material remaining non-exportable.

Our best solution for a strong key had a storage issue, and we were hesitant to require internet access for that storage, as that also puts the burden of secure storage onto our servers and we have to think about the transport protocol. Android offers us a nice solution for key storage here: let it handle it. This is a very attractive solution because it means less code for us and potentially better security, a double win. Even better, we can scale how much user interaction is required with how sensitive we think our data is. It is possible to completely hide the process of generating and storing a key, or we can require the user to authenticate (for example with their lock screen credential) before the key can be used. We can have multiple keys with varying access requirements for different data. One of the nicest things is that, for the developer, the process is completely transparent. Look at the following code:

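import android.security.keystore.KeyGenParameterSpec
import android.security.keystore.KeyProperties
import javax.crypto.KeyGenerator
...
private const val ENC_KEYSTORE_PROVIDER = "AndroidKeyStore"
private const val ENC_APP_DATA_KEY_ALIAS = "application_data"

...
// Ask the AndroidKeyStore provider for the generator so the key is created and held by the keystore
val keyGen = KeyGenerator.getInstance(KeyProperties.KEY_ALGORITHM_AES, ENC_KEYSTORE_PROVIDER)
keyGen.init(
   KeyGenParameterSpec.Builder(ENC_APP_DATA_KEY_ALIAS, KeyProperties.PURPOSE_ENCRYPT or KeyProperties.PURPOSE_DECRYPT)
       // Request 256 bits explicitly rather than relying on the provider default
       .setKeySize(256)
       .setBlockModes(KeyProperties.BLOCK_MODE_GCM)
       .setEncryptionPaddings(KeyProperties.ENCRYPTION_PADDING_NONE)
       .build()
)
// The key is stored under the alias; our code only ever receives a handle to it
keyGen.generateKey()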

With just this code we have created a 256 bit key for AES encryption in GCM mode and had Android handle its storage. There is no way to read this key without exploiting the underlying Android code or potentially the Android operating system itself, and, even in that case, the storage and operations should be happening in a trusted execution environment, making the exploit even more difficult. This means that even we don’t have access to the key itself, barring a pretty serious exploit. Usage of these keys works just the same as typical encryption code in Android:

import java.security.KeyStore
import javax.crypto.Cipher
import javax.crypto.SecretKey
import javax.crypto.spec.GCMParameterSpec

private fun getEncryptionKey() : SecretKey {
    val store = KeyStore.getInstance(ENC_KEYSTORE_PROVIDER)
    // AndroidKeyStore is not loaded from a file or unlocked with a store password
    store.load(null)
    // No password is passed here; access rules for the key are set in KeyGenParameterSpec
    return store.getKey(ENC_APP_DATA_KEY_ALIAS, null) as SecretKey
}

class AppDataEncrypter(private val mSecretKey: SecretKey = getEncryptionKey()) {
    companion object {
        private const val IV_LENGTH = 12   // bytes
        private const val TAG_LEN = 128    // bits
        private const val ENC_TYPE_STRING = "AES/GCM/NoPadding"
    }

    // Output layout: 12 byte IV followed by the ciphertext and GCM tag
    fun encrypt(buf: ByteArray) : ByteArray {
        val cipher = Cipher.getInstance(ENC_TYPE_STRING)
        // No IV is supplied, so a fresh random one is generated for every call
        cipher.init(Cipher.ENCRYPT_MODE, mSecretKey)
        val enc = cipher.doFinal(buf)
        return cipher.iv + enc
    }

    fun decrypt(buf: ByteArray) : ByteArray {
        val cipher = Cipher.getInstance(ENC_TYPE_STRING)
        // Split the stored IV from the ciphertext produced by encrypt()
        val gcmSpec = GCMParameterSpec(TAG_LEN, buf.sliceArray(0.until(IV_LENGTH)))
        val enc = buf.sliceArray(IV_LENGTH.until(buf.size))
        cipher.init(Cipher.DECRYPT_MODE, mSecretKey, gcmSpec)
        return cipher.doFinal(enc)
    }
}

There is a lot of work going on behind the scenes involving a Binder interface and a Trusted Execution Environment (see https://source.android.com/security/keystore/), but we generally won’t have to worry about any of that on the implementation level. You may need to handle some exceptions and fall back to a different method if certain features are missing from the phone itself, but that would be a smaller fraction of users and a worst case scenario.

There is a downside to the simple method shown here: the user’s data is very tightly coupled to their phone and this specific application instance. This could be problematic for data that we actually want shared by multiple devices or clients but also want to be able to encrypt and decrypt on the phone. There is nothing stopping us from storing a key generated in a different way in the same store, or from using this method to create a key encrypting key to wrap that master key. The only restriction we have here is that there is a key that is bound to the application instance that we cannot export. Everything else we’d typically want to do with an encryption key is fair game.
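The key encrypting key idea can be sketched as follows. To keep the example runnable off-device, the kek parameter here is an ordinary in-memory AES key standing in for the non-exportable KeyStore key; on a phone we would pass the key returned from the KeyStore instead. The function names are hypothetical.

```kotlin
import javax.crypto.Cipher
import javax.crypto.KeyGenerator
import javax.crypto.SecretKey
import javax.crypto.spec.GCMParameterSpec
import javax.crypto.spec.SecretKeySpec

// Encrypt the raw bytes of the shared master key under the device bound key.
// The result (IV followed by ciphertext and tag) can be stored in the application sandbox.
fun wrapMasterKey(kek: SecretKey, masterKey: SecretKey): ByteArray {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, kek)
    return cipher.iv + cipher.doFinal(masterKey.encoded)
}

// Recover the master key; this only works where the key encrypting key is usable.
fun unwrapMasterKey(kek: SecretKey, wrapped: ByteArray): SecretKey {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.DECRYPT_MODE, kek, GCMParameterSpec(128, wrapped.copyOfRange(0, 12)))
    return SecretKeySpec(cipher.doFinal(wrapped.copyOfRange(12, wrapped.size)), "AES")
}
```

The trade-off is that the unwrapped master key does live in application memory while it is in use, so this gives us portability of the data at the cost of some of the isolation the KeyStore provides.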

With regard to our threats there is one issue worth noting:

  1. Given code execution or a decryption oracle in the application itself, an attacker can still use the key even though they cannot read it

This is a real possibility and worth keeping in mind, but the mitigations here are fully in the application’s court and not an issue at the design phase. A significantly higher bar has been set for our theoretical attacker: they are going to need code execution, an application level bug providing a decryption interface, or a full compromise of the Android system.

Conclusions

We’ve examined the pitfalls of storing sensitive application data on an Android phone and looked into how people typically solve this problem. We’ve found that the hard coded plaintext key does not offer much by way of encryption, even if obfuscated. We then used what we learned from the issues with those keys to consider a key generation algorithm as a solution to key storage: let the user decide how to store it. This had the typical pitfalls associated with passwords, accentuated by the frustration of using a small phone keyboard to type in complex passwords as well as usability issues with multiple applications. Then we had to decide where to store a strong, random key and decided that Android’s KeyStore API was actually a great fit.

I encourage you to try to use the KeyStore API next time you think your application needs to encrypt data on the device itself. As I said, the example provided above is not the only way to go about using this API and there are plenty more schemes that can be used, but the key takeaway is that you don’t need to worry about storing an encryption key anymore, which is a bigger win than you might have initially imagined.