Companies are collecting more sensitive data than ever before. And with more data, there is more risk. The risk associated with managing sensitive data forces companies to make a tradeoff: data privacy or data utility.
- If you want data privacy, you can lock sensitive data in silos. But this causes data to go unutilized, which can put you at a disadvantage.
- If you want data utility, you can try building complex privacy tools and programs to allow your team to leverage the sensitive data. But this is extremely difficult and can often go wrong, putting your data at risk.
How can developers get the best of both worlds? That’s where data privacy vaults come in.
The concept of data privacy vaults was born at companies like Apple, Google, and Netflix. A data privacy vault is a secure, isolated database designed to store, manage, and use sensitive data. Let’s break that down:
- Secure: Vaults have encryption, tokenization, masking, and other privacy-preserving technologies built in.
- Isolated: Vaults are segregated from your other infrastructure and services, and they’re only available through privileged access.
- Store: Vaults must have all the characteristics of a production-critical data store, such as high availability, high throughput, and support for standard SQL interfaces.
- Manage: Vaults must have built-in data governance tools that can enforce granular access control policies.
- Use: Vaults must have features that let you use data in a privacy-preserving way, such as privacy-preserving analytics and a secure interoperability layer.
Skyflow empowers developers at companies of all sizes with a state-of-the-art data privacy vault delivered through a seamless API.
The Skyflow Vault consists of four pillars that each contribute to the secure storage and usage of data:
- Governance
- Interoperability Layer
- Secure Storage
- Trusted Infrastructure
Skyflow Vaults have a sophisticated governance engine built in, which allows you to enforce granular, policy-based access controls at the data layer itself.
Skyflow exposes a simple policy-expression language that is used to define policies. The example below shows a policy rule that allows reads of Social Security numbers only in masked form.
ALLOW READ ON identifiers.ssn WITH REDACTION = MASKED
Policies such as this one can then be attached to roles, which can be assigned to both users and machine identities. This ensures governed access to the data from both people and downstream applications.
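To make the idea of role-attached policies concrete, here is a minimal sketch in Python of how a rule like the one above could be evaluated at read time. It is a toy model, not Skyflow's governance engine or API: the `Rule` class, the `read_column` function, the redaction constants, and the `support_agent` role are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical redaction levels; the names echo the policy example, not a real API.
PLAIN_TEXT, MASKED, REDACTED = "PLAIN_TEXT", "MASKED", "REDACTED"


@dataclass(frozen=True)
class Rule:
    action: str     # e.g. "READ"
    column: str     # e.g. "identifiers.ssn"
    redaction: str  # one of the redaction levels above


def mask_ssn(value: str) -> str:
    """Hide everything except the last four characters of an SSN-like string."""
    return "XXX-XX-" + value[-4:]


def read_column(policies: dict[str, list[Rule]], role: str, column: str, value: str) -> str:
    """Return the value as the role is allowed to see it; deny when no rule matches."""
    for rule in policies.get(role, []):
        if rule.action == "READ" and rule.column == column:
            if rule.redaction == PLAIN_TEXT:
                return value
            if rule.redaction == MASKED:
                return mask_ssn(value)
            return "*REDACTED*"
    raise PermissionError(f"role {role!r} may not read {column}")


# Assumed role-to-policy mapping: support agents may read SSNs only in masked form.
policies = {"support_agent": [Rule("READ", "identifiers.ssn", MASKED)]}

masked = read_column(policies, "support_agent", "identifiers.ssn", "123-45-6789")
print(masked)  # XXX-XX-6789
```

Because the check runs at the data layer, a person and a machine identity holding the same role get the same masked view, and a role with no matching rule is denied by default.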
Visit the Governance documentation to learn more.
Skyflow Vaults let developers get value from their data, whether they are working with third parties or operating on the sensitive data itself, without bringing that data into their own infrastructure or services and without provisioning or managing compute infrastructure themselves.
Skyflow offers multiple ways to interact with third parties:
- Connections: These proxy functions help you build your own connections to any third-party API to securely send and receive sensitive data. For instance, suppose you want to send credit card data to your payments processor. With Connections, you can make a call to Stripe with tokenized credit card information. Connections routes the request through the Vault, where the tokenized data is swapped for real values, and then sends it to Stripe. Visit the Connections documentation to learn more.
- Vault Functions: You can also run generalized business logic on sensitive data with Vault Functions. For instance, suppose you want to make a decision based on a user’s credit score. You could write a function that approves an application if the user’s score is greater than a certain threshold and denies it otherwise, and then deploy that function to the Vault. It would be exposed to you as a single API endpoint to call.
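The Connections flow described above (tokens in, real values swapped in at the vault, request forwarded to the third party) can be sketched in a few lines of Python. Everything here is an illustrative assumption: `ToyVault`, `send_via_connection`, and `fake_processor` are hypothetical stand-ins, and no real Skyflow or Stripe API is called.

```python
import secrets


class ToyVault:
    """In-memory stand-in for a vault: maps opaque tokens to real values."""

    def __init__(self) -> None:
        self._values: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        token = "tok_" + secrets.token_hex(8)
        self._values[token] = value
        return token

    def is_token(self, candidate: object) -> bool:
        return isinstance(candidate, str) and candidate in self._values

    def detokenize(self, token: str) -> str:
        return self._values[token]


def fake_processor(payload: dict) -> dict:
    """Hypothetical third-party endpoint; it only reports what it received."""
    return {"status": "ok", "last4": payload["card_number"][-4:]}


def send_via_connection(vault: ToyVault, payload: dict, third_party) -> dict:
    """Swap tokens for real values inside the vault boundary, then forward the request."""
    outbound = {
        key: vault.detokenize(val) if vault.is_token(val) else val
        for key, val in payload.items()
    }
    return third_party(outbound)


vault = ToyVault()
token = vault.tokenize("4111111111111111")

# The application only ever handles the token, never the card number itself.
response = send_via_connection(vault, {"card_number": token, "amount": 1999}, fake_processor)
print(response)  # {'status': 'ok', 'last4': '1111'}
```

The point of the design is that the real card number exists only inside the vault and at the third party; your own services store and pass around the token.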
Vaults store data in isolated databases that have a number of privacy-preserving technologies built in. These technologies include (but are not limited to) polymorphic encryption, data de-identification, and tokenization.
- Polymorphic Encryption: Data is encrypted using a variety of encryption schemes, which lets users perform certain operations on encrypted data, such as aggregation and comparison, without having to decrypt it.
- Data De-identification: Upon retrieval, data can be dynamically de-identified depending on who is accessing the data. Skyflow offers powerful masking capabilities (for example, masking everything but the last 4 digits of a credit card) and can also redact sensitive data completely.
- Tokenization: Skyflow can generate tokens for sensitive data that can be safely stored on your infrastructure. Skyflow’s tokenization engine offers a variety of token types, including random tokens, format-preserving tokens, deterministic tokens, and more. See the Tokenization documentation for more information.
In addition, Skyflow Vaults are built on top of a highly scalable, enterprise-ready RDBMS. You can bring your own customizable schema, and a robust key management option allows you to manage your own encryption keys. You can also choose between multi-tenant and single-tenant deployments, with VPC and PrivateLink support.
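As a rough illustration of the masking and token types mentioned above, the following Python sketch shows last-four masking, a random token, and a deterministic token. The function names and the demo key are hypothetical, and the keyed-hash (HMAC) approach to deterministic tokens is a common general technique used here for illustration, not a description of how Skyflow generates tokens.

```python
import hashlib
import hmac
import secrets


def mask_card(number: str, visible: int = 4) -> str:
    """Replace all but the last `visible` digits with asterisks."""
    digits = number.replace(" ", "").replace("-", "")
    return "*" * (len(digits) - visible) + digits[-visible:]


def random_token() -> str:
    """Random token: unrelated to the input, different on every call."""
    return secrets.token_hex(16)


def deterministic_token(value: str, key: bytes) -> str:
    """Deterministic token: the same value and key always give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"demo-key-not-for-production"  # assumed key, for illustration only
card = "4111 1111 1111 1111"

print(mask_card(card))  # ************1111
print(deterministic_token(card, key) == deterministic_token(card, key))  # True
print(random_token() == random_token())  # False
```

Deterministic tokens let you join or look up records by token without seeing the underlying value, while random tokens reveal nothing about whether two records share a value. A real vault would also store the mapping from each token back to the original data; this sketch only generates the tokens.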
The infrastructure that is the foundation for Skyflow Vaults meets all of the following qualifications:
- Secure: It is isolated in a virtual private cloud (VPC).
- BYOK: It supports bringing your own encryption keys.
- Highly available: It maintains high availability, so you don’t have to worry about infrastructure failures. Services are architected to transparently handle and recover from failures without service disruption or data loss, and robust disaster recovery covers catastrophic events.
- Compliant: It is SOC 2, HIPAA, and PCI compliant.
- Zero-trust: It continually verifies permissions and access.
- Global: It uses multi-zone and multi-region deployments to help overcome network disruptions.