How it works
What is end-to-end encryption, and why is it important? Learn how Bunkyr eliminates the traditional tradeoffs and makes security accessible to all users.
Bunkyr allows security-centric applications to eliminate significant usability issues around account recovery. For applications that are looking to bolster their security, Bunkyr enables migration to more advanced security techniques without impacting their existing user experiences. End-to-end encrypted data now has a secure recovery method that is familiar and easy to use for all end users.
Approaches to protecting user data
Plaintext (authentication only)
Most end-user data stored in a database or in files is stored completely as-is, with the only protection coming from controlling access to the database or filesystem. If an attacker gains access to the system (via a software bug, weak or compromised password, social engineering, etc.), the one line of defense protecting that data has been defeated and any sensitive user data stored on the system can be leaked. This is the default and lowest-cost approach but has serious vulnerabilities to a wide range of attacks, and is considered insecure for use in production systems.
At-rest encryption
Current industry-standard practice is to encrypt data at rest, for instance using a Key Management Service (KMS) provided by AWS or Google Cloud Platform. In this scenario, entire databases are encrypted with a single master key, so anyone who reaches the stored data without that key cannot read it. This is a definite improvement over storing data in plaintext, but because both software and system administrators need the key in order to access the data, the key is often stored in plaintext and under the same access controls as the data itself. With present-day cyberattacks often focusing on gaining large-scale access to company infrastructure, an attacker who gains that access can often obtain the master key as well, and with it the sensitive user data the key protects.
Hence, the data under at-rest encryption schemes can be seen as effectively stored in plaintext with only authentication, since the key must still be accessible for normal day-to-day operations. This presents a significant attack surface for a full-scale data breach, and there are many instances of large breaches happening despite the usage of at-rest encryption, including a 2019 Capital One data breach that cost the company a $190 million settlement, along with other associated remediation and reputation costs.
End-to-end encryption
An end-to-end encrypted system encrypts each user's data with their own unique key (typically derived from their password), such that the company cannot access that data at all unless the user provides their key (or password). From a privacy standpoint, this is the strongest method to protect user data because only that user controls access to their own data. Because applications no longer have offline access to their users' sensitive data, each individual user's key must be compromised to get access to their data. Major data breaches of large numbers of users' records (and the associated costs and liability that go along with them) are effectively no longer feasible.
Stricter subsets of end-to-end encryption can also be used to enable zero-knowledge architectures, where the company's systems are never provided the key and thus have zero risk of ever having the user's sensitive data. An example of this is client-side encryption, where the user's device is sent their encrypted data and they decrypt it locally, without the key or decrypted data ever touching the company's servers. This method is extremely popular for higher-security applications such as cryptocurrency wallets, secure cloud storage, and encrypted messaging applications.
The typical implementation of end-to-end encryption is shown in the above diagram. At account creation, a randomized account master key is generated to encrypt the user's data, and that key is stored encrypted with a key that only the user controls (often derived from a user's password with a key-derivation function like PBKDF2). When a user logs in and provides their key (or password), the account master key can be decrypted and then used to access their data.
Additionally, to protect against the user forgetting or losing their primary key or password, the master key is also encrypted with a secondary backup key. This key is given to the user during account creation; they are asked to store it and to provide it again if they lose their primary key. Familiar forms of this are seed phrases for cryptocurrency wallets and printed backup codes for encrypted cloud storage.
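As a rough illustration of the scheme described above, the following Python sketch wraps a single account master key twice: once under a password and once under a backup code. It is not any particular product's implementation. The function names, the iteration count, and the XOR-plus-HMAC wrapping are assumptions chosen so that the example runs with only the standard library; a production system would normally wrap the key with an authenticated cipher such as AES-GCM.

```python
import hashlib
import hmac
import secrets

# Illustrative work factor (an assumption); production values are typically higher.
PBKDF2_ITERATIONS = 200_000


def derive_wrapping_keys(secret: str, salt: bytes) -> tuple[bytes, bytes]:
    """Stretch a password or backup code into a 32-byte mask key and a 32-byte MAC key."""
    okm = hashlib.pbkdf2_hmac("sha256", secret.encode(), salt, PBKDF2_ITERATIONS, dklen=64)
    return okm[:32], okm[32:]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def wrap_master_key(master_key: bytes, secret: str) -> dict:
    """Toy key wrap: mask the 32-byte master key, then authenticate the result."""
    assert len(master_key) == 32
    salt = secrets.token_bytes(16)
    mask_key, mac_key = derive_wrapping_keys(secret, salt)
    wrapped = xor_bytes(master_key, mask_key)
    tag = hmac.new(mac_key, wrapped, hashlib.sha256).digest()
    return {"salt": salt, "wrapped": wrapped, "tag": tag}


def unwrap_master_key(record: dict, secret: str) -> bytes:
    """Recover the master key, or raise if the password or backup code is wrong."""
    mask_key, mac_key = derive_wrapping_keys(secret, record["salt"])
    expected = hmac.new(mac_key, record["wrapped"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, record["tag"]):
        raise ValueError("wrong password or backup code")
    return xor_bytes(record["wrapped"], mask_key)


# Account creation: one random master key, stored only in wrapped form.
master_key = secrets.token_bytes(32)
backup_code = secrets.token_hex(16)  # shown to the user once, to be stored offline
primary_record = wrap_master_key(master_key, "correct horse battery staple")
backup_record = wrap_master_key(master_key, backup_code)
```

Either record unwraps to the same master key, so the user's data never has to be re-encrypted when the password changes; only the small wrapped record is replaced.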
Overall, end-to-end encryption provides incredibly strong security and privacy guarantees, because control of the data stays with the users instead of the company.
So why don't more people use end-to-end encryption?
Inevitably, users lose their passwords. This isn't a big deal in most applications; with plaintext storage or at-rest encryption, the company can simply reset their password and restore access to their data, as the company is the one in control of it. With end-to-end encryption, however, the secondary key is the only remaining way back into the account, period, as the company no longer has access to the data without that key.
But what are the chances a paper code is still lying around years later? How many users might get a new computer and forget to bring their codes with them? The reality is that users regularly lose access to their backup methods, losing assets like millions of dollars of cryptocurrency or precious files stored encrypted in the cloud.
"I forgot the seed phrase to my crypto wallet,
now all my funds are lost"
Recovery codes and seed phrases are equivalent to secondary passwords that are used less frequently, meaning the same problems that affect passwords also affect the secondary keys. Here lies the fundamental problem with end-to-end encryption in its current state: the average user is forgetful. To make it work, we need something that is easy, familiar, and accessible to the user, without sacrificing security.
This problem exists across several industries that need to protect sensitive data. Major examples include cloud storage providers, cryptocurrency wallets, IoT devices, educational technology, healthcare, and personal finance applications. If these applications elect to use end-to-end encryption, their users are met with a major, scary disclaimer, like the example screenshot below from an encrypted cloud storage provider.
Bunkyr brings end-to-end encryption to existing applications without forcing their users to accept risk of total data loss. For high-security applications already using end-to-end encryption, Bunkyr provides a stable recovery method that's accessible to all users.
Bunkyr does not alter normal usage by end users; users still log in directly as normal, with no additional API calls or integration. Our lightweight API is accessed only during account setup, when a user generates their recovery key for the first time, and during account recovery, when a user who has forgotten their password regenerates their recovery key to regain access to their data.
Integrating with Bunkyr replaces unwieldy backup methods such as seed phrases and printed recovery codes. For end users, setting up recovery for their sensitive data is as simple as signing in with Google or Apple (or any other OAuth provider).
With a stable long-term recovery option, users can have the confidence that they will not be locked out of their accounts when they forget their passwords, and companies do not need to risk unhappy users to provide modern data security.
Bunkyr is built to be developer-friendly from the ground up. Simply redirect the user to Bunkyr during account setup (generate a key) or account recovery (regenerate a key), and Bunkyr will return the user to your application with their key. Our API is user-transparent, meaning the user only ever interacts with the OAuth provider of their choice; they won't even know Bunkyr exists.
- For applications using client-side encryption, the key-generation process can be completed entirely from the end user's device, preserving existing zero-knowledge architectures.
The security details
Bunkyr composes recovery keys from three pieces of distributed information:
- A token released by the end user
- A user-specific token stored by the customer application
- Bunkyr-hosted, defense-grade cryptographic hardware
All three pieces of information are used in the algorithm that derives the key, so all parties must release their piece of information or the key cannot be generated. This ensures that neither Bunkyr nor the customer application can regenerate the recovery encryption key alone or in tandem, without the user's permission, nor can the user's recovery method be used without approval of the customer application.
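The text above does not specify the derivation algorithm, so the sketch below shows only one possible way to obtain the "all three pieces required" property. It assumes the hardware holds a secret that is used as an HMAC key over the two tokens. The function names and the HMAC-SHA256 construction are illustrative assumptions, not Bunkyr's actual design, and in a real deployment the hardware secret would never leave the device.

```python
import hashlib
import hmac


def length_prefixed(*parts: bytes) -> bytes:
    """Join inputs unambiguously by prefixing each one with its 4-byte length."""
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def derive_recovery_key(user_token: bytes, app_token: bytes, hardware_secret: bytes) -> bytes:
    """Derive a 32-byte key that depends on all three inputs (hypothetical construction)."""
    # Assumption: the hardware secret keys the HMAC, so holding both tokens
    # is still not enough to compute the result without the hardware.
    message = length_prefixed(user_token, app_token)
    return hmac.new(hardware_secret, message, hashlib.sha256).digest()


key = derive_recovery_key(b"user-oauth-token", b"app-stored-token", b"hardware-secret")
```

Because every input feeds the same keyed hash, changing or withholding any one of them yields an unrelated key, which is the behavior the paragraph above describes.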
Bunkyr ties each user's key to a specific set of hardware instances, with each piece of hardware being cryptographically unique. This means that an identical piece of hardware that was not originally used to generate the key would not return the same value required to derive the key, preventing attacks where a different piece of hardware might be used. Bunkyr's architecture guards against hardware failures by allowing any one of the original set of hardware to regenerate the key; so long as one of the original hardware instances survives, the end user's data is protected. Bunkyr can also enroll new hardware when a user regenerates their key, to continually refresh the hardware set and guard against failures.
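One way to picture the "any one of the original set" property is to keep a separately masked copy of the recovery key for each enrolled hardware instance. The sketch below is a guess at how such a scheme could work, not a description of Bunkyr's internals: the hardware names, the per-instance secrets held in a dictionary, and the XOR masking are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


def hardware_response(hardware_secret: bytes, user_token: bytes, app_token: bytes) -> bytes:
    """Value only this hardware instance can compute for this user's two tokens."""
    message = b"".join(len(t).to_bytes(4, "big") + t for t in (user_token, app_token))
    return hmac.new(hardware_secret, message, hashlib.sha256).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def enroll(recovery_key: bytes, hardware_secret: bytes, user_token: bytes, app_token: bytes) -> bytes:
    """Return the masked copy of the recovery key tied to one hardware instance."""
    return xor_bytes(recovery_key, hardware_response(hardware_secret, user_token, app_token))


def regenerate(masked_copy: bytes, hardware_secret: bytes, user_token: bytes, app_token: bytes) -> bytes:
    """Unmask the recovery key using the instance that produced the masked copy."""
    return xor_bytes(masked_copy, hardware_response(hardware_secret, user_token, app_token))


# Key generation: enroll three hypothetical hardware instances.
user_token, app_token = b"user-token", b"app-token"
hardware = {name: secrets.token_bytes(32) for name in ("hw-a", "hw-b", "hw-c")}
recovery_key = secrets.token_bytes(32)
masked = {
    name: enroll(recovery_key, secret, user_token, app_token)
    for name, secret in hardware.items()
}

# Later: suppose only hw-b survives. It alone can regenerate the key.
regenerated = regenerate(masked["hw-b"], hardware["hw-b"], user_token, app_token)

# Regeneration is also a chance to enroll a replacement instance.
hardware["hw-d"] = secrets.token_bytes(32)
masked["hw-d"] = enroll(regenerated, hardware["hw-d"], user_token, app_token)
```

In this sketch the masked copies are useless without the matching hardware secret, and a look-alike device with a different secret simply produces a different, wrong key.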
With this architecture, even if an attacker compromises the customer application, Bunkyr's database and source code, and the end user's OAuth login, the attacker is still unable to generate a key without full access to an exact instance of hardware that was used to generate it (which is hosted in secure datacenters). Because the end user's data is protected by end-to-end encryption, even a worst-case scenario compromise of all three parties results in just one user's data being leaked. No more newsworthy data breaches.
The detailed diagram above shows the full account recovery process with all parties:
- When a user forgets their password, they submit a recovery request to the customer application (this is generally a typical "forgot your password?" form)
- The customer application can perform additional identity verification steps at this point (email verification, two-factor authentication, offline ID verification, etc.)
- The Bunkyr API is called, redirecting the end user along with the user-specific token stored by the customer
- Bunkyr then receives the user's piece of information in the form of an OAuth token or a token stored on an authorized device
- The two tokens are combined with information from our cryptographic hardware to regenerate the recovery key
- The end user then returns to the customer application with the key attached, where their data can be decrypted and then re-encrypted with a new password
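The steps above can be simulated end to end in a few lines of Python. Everything here is a simplified stand-in: the token values, the HMAC-based key regeneration, and the toy XOR-plus-HMAC key wrap are assumptions made so the example runs with only the standard library, and no real Bunkyr API call is shown. The sketch also assumes the envelope scheme described earlier, so recovery rewraps the account master key under the new password rather than re-encrypting all of the user's data.

```python
import hashlib
import hmac
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def regenerate_recovery_key(user_token: bytes, app_token: bytes, hardware_secret: bytes) -> bytes:
    """Combine both tokens with the hardware secret (hypothetical construction)."""
    message = b"".join(len(t).to_bytes(4, "big") + t for t in (user_token, app_token))
    return hmac.new(hardware_secret, message, hashlib.sha256).digest()


def password_key(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)


def wrap(master_key: bytes, wrapping_key: bytes) -> bytes:
    """Toy key wrap: 32 masked bytes followed by a 32-byte HMAC tag."""
    mask = hmac.new(wrapping_key, b"mask", hashlib.sha256).digest()
    wrapped = xor_bytes(master_key, mask)
    tag = hmac.new(wrapping_key, b"tag" + wrapped, hashlib.sha256).digest()
    return wrapped + tag


def unwrap(blob: bytes, wrapping_key: bytes) -> bytes:
    wrapped, tag = blob[:32], blob[32:]
    expected = hmac.new(wrapping_key, b"tag" + wrapped, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("wrapping key does not match")
    mask = hmac.new(wrapping_key, b"mask", hashlib.sha256).digest()
    return xor_bytes(wrapped, mask)


def recover_account(user_token, app_token, hardware_secret, recovery_blob, new_password):
    """Regenerate the recovery key, unwrap the master key, rewrap it under a new password."""
    recovery_key = regenerate_recovery_key(user_token, app_token, hardware_secret)
    master_key = unwrap(recovery_blob, recovery_key)
    salt = secrets.token_bytes(16)
    return salt, wrap(master_key, password_key(new_password, salt))


# State established at account setup (all values are illustrative).
app_token = secrets.token_bytes(32)        # stored by the customer application
hardware_secret = secrets.token_bytes(32)  # held inside the cryptographic hardware
user_token = b"oauth-subject:alice"        # released by the user via OAuth sign-in
master_key = secrets.token_bytes(32)
recovery_blob = wrap(master_key, regenerate_recovery_key(user_token, app_token, hardware_secret))

# The user forgot their password and completes the recovery flow.
new_salt, new_blob = recover_account(
    user_token, app_token, hardware_secret, recovery_blob, "a brand new password"
)
```

After recovery, the customer application stores the new salt and wrapped key in place of the old password-wrapped record, and the user logs in with the new password as usual.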