As companies work to gather customer information from large data stores, it is becoming more critical to de-identify data as it moves within an organization, between third-party partners, and among various customer-focused applications. While healthcare professionals operating under the Health Insurance Portability and Accountability Act (HIPAA) privacy standards have been aware of this imperative for years, data privacy concerns affecting personally identifiable information (PII) have recently become a top priority among global regulators, consumers, and companies. Gartner reports that by the end of 2024, 75% of all consumer information globally will be regulated in some way.
California has recently enacted two data privacy laws, the California Consumer Privacy Act (CCPA) and the California Privacy Rights Act (CPRA). In addition, the EU’s General Data Protection Regulation (GDPR) is now being enforced. Had companies like Facebook used de-identification techniques to conceal consumer data, they could have avoided severe penalties, which can reach billions of dollars, according to Joseph Williams, partner in cybersecurity practices at Infosys Consulting.
One concern is the risk to companies’ reputations should cybercriminals obtain and abuse their customers’ personal information. Cybersecurity experts believe that most consumers have already been the victims of a data breach over the last decade. Various techniques are currently being employed to de-identify data, including redaction, aggregation, tokenization, the privacy vault method, and synthetic data generation. Confidential computing is also an emerging technology designed to secure data in use.
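To make three of these techniques concrete, here is a minimal Python sketch of redaction, tokenization, and aggregation. It is an illustration under stated assumptions, not a production design: the names `redact_emails`, `TokenVault`, and `age_band` are hypothetical, the email pattern is deliberately simple, and the in-memory dictionary stands in for what would normally be a separately secured store.

```python
import re
import secrets

# Simplified pattern for illustration; real-world email matching is messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text):
    """Redaction: replace anything that looks like an email address."""
    return EMAIL_RE.sub("[REDACTED]", text)


class TokenVault:
    """Tokenization: swap a sensitive value for a random token.

    The token-to-value mapping lives only inside the vault, so systems
    that see the token cannot recover the original value on their own.
    (Hypothetical in-memory stand-in for a secured store.)
    """

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to one token.
        if value not in self._value_to_token:
            token = "tok_" + secrets.token_hex(8)
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return self._value_to_token[value]

    def detokenize(self, token):
        return self._token_to_value[token]


def age_band(age, width=10):
    """Aggregation: report a range instead of an exact age."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


if __name__ == "__main__":
    vault = TokenVault()
    record = {"name": "Jane Doe", "age": 37, "note": "Reach me at jane@example.com"}
    shared = {
        "name": vault.tokenize(record["name"]),
        "age": age_band(record["age"]),
        "note": redact_emails(record["note"]),
    }
    print(shared)
```

The trade-offs differ by technique: redaction and aggregation discard detail permanently, while tokenization is reversible for anyone with access to the vault, which is why the vault itself has to be protected.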
For most companies, data de-identification will not necessarily protect them from the consequences of a significant data breach, but it can help ensure that the customer data they exchange during regular operations is safeguarded from casual or misinformed misuse or exposure. Most organizations can use a service from an existing software vendor, such as Salesforce or Snowflake. The key challenge lies in determining when and where de-identification is necessary, and which technique will meet their particular requirements without adversely affecting other business processes.