Cyber security and data security are closely related concepts that operate at different levels and provide different safeguards.
Cyber security is primarily about controlling access to systems and data through security mechanisms that span every layer, from the physical network layer all the way to the application layer. These mechanisms come mainly in the form of encryption and digital signature algorithms, identity and access management systems, partitioning and segregation of networks and databases, and secure coding practices. The zero-trust architecture brings all of these together in a coherent definition of what good looks like for cyber security.
Data security is also about controlling access to data and, more importantly, controlling what can and cannot be safely inferred from data that are provided to users. While access control can be accomplished with cyber security and supporting functions like metadata management, controlling what can be inferred from data, usually in the service of higher-level organisational goals like protecting user privacy and confidentiality, requires a different class of technologies.
There are four confidential-computing technologies that are applicable to protecting data security in a wide range of applications. Secure multiparty computation, through the use of secret-sharing schemes, permits multiple parties to jointly compute a function without divulging each party’s secret information to the others. Homomorphic encryption tackles the same problem from a different perspective, by encrypting sensitive data in a way that allows arithmetic operations to be performed directly on the encrypted data. Privacy issues can also arise through so-called database reconstruction attacks, which seek to reverse-engineer a database through a series of carefully crafted queries. To address this, the differential privacy framework can be used to provide individuals with guaranteed plausible deniability by adding suitably calibrated noise to query outcomes. Finally, federated learning allows multiple parties to jointly train a machine learning model on distributed data without pooling the raw data in one place. A detailed survey of these technologies and representative use cases in the Intelligence domain can be found at https://arxiv.org/abs/2408.09935.
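To make two of these ideas concrete, here is a minimal Python sketch of additive secret sharing (the building block behind a simple multiparty sum) and of the Laplace mechanism (one common way to calibrate noise for differential privacy). It is an illustration under stated assumptions, not a production design: the modulus `PRIME`, the function names, and the single-process simulation of the parties are all hypothetical choices made for this example, and a real deployment would need secure channels between parties and a noise sampler hardened against floating-point attacks.

```python
import random
import secrets

# Assumed modulus for this sketch; inputs and their sum must stay below it.
PRIME = 2**61 - 1


def share(secret: int, n_parties: int) -> list[int]:
    """Split `secret` into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: list[int]) -> int:
    """Recombine additive shares into the value they encode."""
    return sum(shares) % PRIME


def secure_sum(private_inputs: list[int]) -> int:
    """Simulate parties jointly computing a sum without revealing inputs.

    Party i splits its input into shares and hands share j to party j.
    Each party then publishes only the sum of the shares it received.
    """
    n = len(private_inputs)
    distributed = [share(x, n) for x in private_inputs]
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return reconstruct(partial_sums)


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a counting-query answer with epsilon-differential privacy.

    A count changes by at most 1 when one individual is added or removed
    (sensitivity 1), so the Laplace noise scale is 1 / epsilon.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    print(secure_sum([3, 5, 9]))  # the parties learn the total only
    print(private_count(100, epsilon=1.0, rng=random.Random()))
```

Any single share, or any single published partial sum, is uniformly distributed modulo `PRIME` and so reveals nothing about an individual input on its own. In `private_count`, a smaller `epsilon` means more noise and therefore stronger deniability at the cost of accuracy.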
It is important to keep the distinction between cyber security and data security in mind when assessing the different technologies available for protecting our privacy and confidentiality. For example, it is not sufficient to have Solid, the open standard for structuring data, digital identities, and applications on the Web championed by Tim Berners-Lee. While it is a particularly idealistic version of what a large-scale zero-trust architecture for decentralised personal databases spread across the Web can and should look like, Solid does not actually offer any protection against what can be inferred from data that a user explicitly approves for use by third parties. For that, one needs to implement one or more of the four confidential-computing technologies described above on top of Solid.