A recurring joke on tech roast channels is that of multi-billion-dollar corporations and digital tech giants profiteering on consumer data while preaching about data security and privacy compliance. As hilarious and entertaining as those jokes are, the reality isn't far off. With every touchpoint and mouse click being tracked, information is indeed the most sought-after asset of this digital age, as evidenced by the exorbitant sums companies are willing to throw at it.
These are data points on actual human beings, reflecting their tastes and interests, at times even forming a digital clone of avid internet citizens. And in the wake of the AI boom, where almost every other product ships a generative AI integration and a privacy clause that defaults to sharing data for model training, genuine data privacy and protection are nearly impossible to achieve. Reflecting on this makes us wish we had supreme control over our own data: deciding whom to share it with, what they can use it for, and revoking access at our choosing. Wouldn't that be great?
And that is the promise of the Gateway protocol!
Gateway started as a credentialing protocol, decentralizing the issuance, receipt, and consumption of verified credentials in hopes of fixing the glaring shortcomings of established web2 credential management. It was only a matter of time before its standard of verifiable on-chain credentials expanded to the overarching data landscape of our digital ecosystem.
Imagine this: you have been a long-time Netflix user and have spent considerable screen time with it. Meta, at some point, wants access to your Netflix footprint to serve targeted ads and content recommendations across its suite of apps. Today, this happens largely without your acknowledgement or awareness. The deal is struck between the corporations, and the user is a mere data point.
Gateway proposes to change this dynamic and hands you the reins to make critical decisions about your data.
Notice that the scenario above involves three parties: Netflix (data contributor), Meta (data consumer), and the user (data creator). Gateway takes this distinction one step further by assigning ownership and access privileges to these parties. Netflix remains the data contributor, compiling the screen time, the watchlist, the genres of interest, and the like, but ownership of this data is anchored to its creator: the user. The user thereby becomes an intermediary in the data transaction, with extensive visibility and control over it.
Architecturally, the Gateway protocol has data-contributing organizations bundle specific formats of data whose ownership rests with the users. The user can then decide to share a static snapshot of this data upon request from a consumer, who in turn pays for access to the information.
The technical flow starts with the contributing organization packaging high-value, sensitive user data into pre-defined template formats called 'data models' to create 'Personal Data Assets' (PDAs). Each PDA is an encrypted file targeted at a specific user, encapsulating the sensitive data the organization has collected about that user. PDAs are first verified against their data model schema by entities called validators. Upon successful verification, the PDA-creation transaction is recorded on the Gateway Ledger, the protocol's layer-1 blockchain, providing global verifiability and auditability of all transactions.
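To make the flow concrete, here is a minimal sketch of PDA creation in Python. The schema shape, field names, and the `validate_pda`/`record_on_ledger` helpers are all hypothetical illustrations, not the protocol's actual API; the content hash simply stands in for a ledger entry that makes the transaction auditable.

```python
# Illustrative sketch only: data-model schemas and validator logic here are
# hypothetical stand-ins for the Gateway protocol's real components.
import hashlib
import json

# A hypothetical 'data model' schema: field names mapped to expected types.
STREAMING_DATA_MODEL = {"user_id": str, "screen_time_hours": float, "watchlist": list}

def validate_pda(payload: dict, schema: dict) -> bool:
    """A validator-style check: every schema field present with the right type."""
    return all(isinstance(payload.get(field), typ) for field, typ in schema.items())

def record_on_ledger(payload: dict) -> str:
    """Stand-in for a Gateway Ledger entry: a content hash enables auditability."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

pda = {"user_id": "alice", "screen_time_hours": 412.5, "watchlist": ["show-a"]}
assert validate_pda(pda, STREAMING_DATA_MODEL)   # validators approve the PDA
tx_id = record_on_ledger(pda)                    # transaction becomes auditable
```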
At this point, the contributing organization interacts with a smart contract to create a Data Assurance Commitment, specifying explicit storage parameters for the sensitive data in question. To store this highly sensitive user data, the protocol uses special nodes called Sharders that partition the encrypted payload into smaller fragments. These shards are then stored in specialized storage units called Encrypted Data Vaults (EDVs), according to the storage parameters defined in the Data Assurance Commitment. EDVs stake tokens to take part in this process.
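The sharding step can be sketched as follows. The commitment parameters (`shard_size`, `replicas`) are assumed for illustration; the real Data Assurance Commitment fields are not specified here.

```python
# Hypothetical sketch of the Sharder step: split an encrypted blob into
# fixed-size fragments to be placed in EDVs per the commitment's parameters.
def shard(blob: bytes, shard_size: int) -> list:
    """Partition an encrypted payload into fragments of at most shard_size bytes."""
    return [blob[i:i + shard_size] for i in range(0, len(blob), shard_size)]

commitment = {"shard_size": 4, "replicas": 2}  # assumed storage parameters
encrypted_pda = b"0123456789abcdef"
shards = shard(encrypted_pda, commitment["shard_size"])

assert b"".join(shards) == encrypted_pda  # fragments reassemble losslessly
print(len(shards))  # → 4 fragments, each destined for an EDV
```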
These EDVs constitute the data layer of the protocol and are rewarded or penalized based on their ability to prove they are storing the data. This is done through the Challenge Protocol, a smart contract in the validation layer that periodically challenges the EDVs to prove data storage. On a successful challenge response, an EDV can claim rewards from the Challenge Pool, which distributes funds among EDVs, validators, and other participants based on their roles and SLA adherence. EDVs must also demonstrate their operational status via heartbeat mechanisms during data write operations. In case of failure, the staked amount is slashed.
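A toy model of such a challenge round is shown below. The nonce-plus-hash construction, reward and slash amounts are my own illustrative assumptions; the actual Challenge Protocol contract may use a different proof-of-storage scheme.

```python
# Toy challenge round: the contract sends a random nonce, and the EDV must
# return H(nonce || shard), which only a node actually holding the shard can
# compute. Reward/slash amounts are hypothetical.
import hashlib
import os

def challenge_response(stored_shard: bytes, nonce: bytes) -> str:
    return hashlib.sha256(nonce + stored_shard).hexdigest()

def verify(expected_shard: bytes, nonce: bytes, response: str) -> bool:
    return response == hashlib.sha256(nonce + expected_shard).hexdigest()

stake = 100
shard = b"fragment-07"
nonce = os.urandom(16)                       # fresh per challenge
resp = challenge_response(shard, nonce)      # honest EDV still holds the shard

if verify(shard, nonce, resp):
    stake += 5    # claim a reward from the Challenge Pool
else:
    stake -= 50   # failed proof: staked tokens are slashed
print(stake)  # → 105
```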
The PDAs thus created are encrypted with the public keys of their respective users, making each user the supreme authority over their own sensitive data. The protocol goes further by applying dual encryption to these PDAs: the data is first encrypted with a one-time symmetric key, and that key is in turn encrypted with the user's public encryption key. This ensures that only the intended recipient can decrypt the symmetric key and, by extension, the PDA data. The dual-layer approach delivers both the speed of symmetric encryption and the security of public-key encryption, maintaining data confidentiality and integrity.
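The envelope pattern is easy to see in miniature. The XOR "ciphers" below are deliberately toy stand-ins (a real deployment would use something like AES for the data layer and the user's actual public key, e.g. RSA or ECIES, for the key wrap); only the two-layer flow is the point.

```python
# Toy envelope-encryption sketch of the dual-layer scheme. The XOR keystream
# "cipher" is NOT secure; it only illustrates encrypt/wrap/unwrap/decrypt.
import hashlib
import os

def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Layer 1: encrypt the PDA payload with a one-time symmetric key.
one_time_key = os.urandom(32)
pda_ciphertext = keystream_xor(b"sensitive viewing history", one_time_key)

# Layer 2: wrap the one-time key for the user (stand-in for a public-key wrap).
user_key = os.urandom(32)  # placeholder for the user's key pair
wrapped_key = keystream_xor(one_time_key, user_key)

# Only the user can unwrap the key, and only then decrypt the payload.
recovered_key = keystream_xor(wrapped_key, user_key)
plaintext = keystream_xor(pda_ciphertext, recovered_key)
assert plaintext == b"sensitive viewing history"
```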
At the other end, data-consuming organizations deal directly with the users whose data they want. This starts with the organization drafting a 'Data Request' that specifies a template ID for the data, the specific fields being requested, and the intended use of the data, reflecting the transparency the protocol is built around. The Data Request is sent to the particular user whose sensitive data is being sought, giving the user clear visibility into the intentions behind the request and its implications.
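A Data Request might look roughly like the sketch below. Every field name here is an illustrative assumption, not the protocol's actual request format.

```python
# Hypothetical shape of a Data Request; field names are illustrative only.
data_request = {
    "template_id": "streaming-activity-v1",   # which data model is targeted
    "requested_fields": ["watchlist", "screen_time_hours"],
    "intended_use": "Ad relevance tuning on the requester's platforms",
    "requester": "meta.example",
    "owner": "alice",
}

def summarize(req: dict) -> str:
    """What the user sees before deciding to approve or deny the request."""
    return (f"{req['requester']} requests {', '.join(req['requested_fields'])} "
            f"for: {req['intended_use']}")

print(summarize(data_request))
```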
Upon receiving a Data Request, the user can freely decide whether or not to grant the requester access to their sensitive data. If approved, the data owner creates a proxy re-encryption key specifically for that requester. This key allows the data to be re-encrypted without revealing the original encryption key. Using it, the storage provider re-encrypts the requested data on behalf of the owner, and the key is securely stored in the EDVs associated with the data. The requester can then use the Gateway API to pay for access and retrieve the encrypted data from the EDV.
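The essential property of proxy re-encryption, that the storage provider can transform a ciphertext from the owner's key to the requester's key without ever learning the underlying secret, can be illustrated with a toy XOR construction. This is not a real PRE scheme (those are typically built on ElGamal- or pairing-based cryptography); it only demonstrates the flow.

```python
# Toy proxy re-encryption over XOR key-wrapping (conceptual only, NOT secure).
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

owner_key = os.urandom(32)
requester_key = os.urandom(32)
symmetric_key = os.urandom(32)   # the PDA's one-time data key

# Stored in the EDV: the data key wrapped under the owner's key.
wrapped_for_owner = xor(symmetric_key, owner_key)

# The owner derives a re-encryption key for this specific requester.
re_key = xor(owner_key, requester_key)

# The storage provider applies re_key without learning symmetric_key or
# either party's key on its own.
wrapped_for_requester = xor(wrapped_for_owner, re_key)

# Only the requester can now unwrap the data key.
assert xor(wrapped_for_requester, requester_key) == symmetric_key
```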
This very idea of returning data ownership to individuals promises a secure, user-friendly internet that respects privacy. This fosters trust between consumers and products, assuring them that their data isn't spread across the global internet. Meanwhile, products can still effectively serve their users.
Gateway is promising to open a gateway for a healthier internet data landscape.