How do you prove that you know secret information without having to expose it?

by Caleb Lau


Preface

In a traditional sense, authentication commonly involves three factors – a user (with the intend to access a particular information), a server (destination holding the information), and a secret information (identification of authority for access). A common implementation of this involves transferring the entered password on the user/client end as plaintext to the server, where it is hashed and compared against the server’s stored hash. A better practice is to have the communication channel encrypted, and upon reaching the server the encrypted packet would be decrypted for the hashing to take place.

This relies on a few assumptions – that the communication channel has not been tampered with (spoofing, MITM attacks), and the server is not compromised such that a malicious actor is sitting on the server waiting for the password (server is trusted) – which could be a luxurious assumption to make in some use cases.

In other cases, the authentication process instead requires proving of certain actions without exposing any details (e.g., electronic voting where the vote is completed, and executed correctly, without exposing details about the vote itself). In those instances, we would want to look into alternative methods with additional attributes that could facilitate such security.

Zero-Knowledge Proof

Enter zero-knowledge proof, where a form of authentication takes place with the user/client proving knowledge of a secret information, without actually exposing the secret information itself to the requesting party nor to third parties. The requesting party should be sufficiently convinced that knowledge of the secret information is proven, however where third parties observe the authentication process, ambiguity (therefore lack of certain knowledge) arises whether the user/client actually have that knowledge.

Therefore, three properties could be observed to qualify zero-knowledge proof:

  1. Completeness – Where proof is provided and is true, the user/client will convince the requesting party.
  2. Soundness – Where proof is provided and is false, the user/client will not be able to convince requesting party.
  3. Zero-Knowledge – Where proof is provided and is true, a malicious requesting party will not be able to know anything else other than the proof is true.

Example – Same Objects, Different Colours

The quickest way to visualise this is probably through an example. Say Alice has two objects of the same shape and size, but with different colours. Alice is tasked to convince Bob, who is colour blind, that both the objects indeed have different colours yet she needs to keep the colours secret from Bob.

A proof protocol could be developed as such:

The protocol starts with a state where Bob holds out with his hands both the objects in plain sight of Alice. Subsequently Bob brings both hands with the objects to his back, and execute one of two actions: switch or stay. Where switch, Bob switches the objects between two hands. Where stay, the objects stay in their original state. Upon completion Bob brings out both his hands and have Alice provide a statement whether a switch had been done or not. This could be repeated as many rounds till Bob is satisfied.

If both the objects are truly of different colours, Alice should not have any problems telling if the objects were switched between hands; However if both objects have the same colours, while Alice may be able to guess correctly the first time around, the probability of guessing correctly each round decreases (0.5 *0.5*0.5…), which could be denoted with 2^-t.This translates to Alice being able to correctly answer 5 consecutive times a mere 3% possibility.

This protocol fulfills all the properties of a zero-knowledge proof as earlier mentioned:

  1. Completeness – If Alice could reliably answer each time whether the object switched or stayed, then there is a distinct difference between the objects, which in this case being the colour.
  2. Soundness – If Alice could not reliably answer, then there are no distinction between the objects, therefore Alice is lying.
  3. Zero-Knowledge – Bob has no knowledge of what the exact colours are apart from knowing the colours are indeed different. There is also an additional zero-knowledge requirement fulfilled in such that a third party colour blind observer observing this entire process would also not be able tell with certainty whether Bob and Alice had conspired and predetermined the actions to take (stay or switch); Therefore gaining no knowledge in distinguishing the objects in the process.

Sure… But What About the Real World?

In real life it is highly unlikely that you’d stumble onto the exact same scenario as above, so we should look into a more practical application fulfilling the three properties of a zero-knowledge proof. One such application uses discrete logarithm (ref: https://link.springer.com/chapter/10.1007%2F3-540-39118-5_13).

Alice has a secret key x. She creates y with (g^x) mod p = y , where g is a generator value, and p an arbitrarily large prime number. g, p and y is shared to Bob for verification on a later date. So when Alice shows up, Alice produces a random seed r and computes C = (g^r) mod p, providing Bob with C.

From there, Bob randomly requests for one of the below repeated over a number of times:

  1. Provide me with random seed r.
  2. Provide me with value derived from (x + r) mod (p – 1).

Where scenario 1 is asked for, Bob calculates (g^r) mod p to derive C, and cross checks against the C Alice provides. This allows Bob to know that Alice is using the same and p with (g^r) mod p to derive C, hence having a high likelihood of knowing x. Where scenario 2 is asked for, Bob computes g^((x+r)mod(p-1)) mod p, then checks this against (C * y) mod p, which would unlikely be the same if x is not known.

Ending Notes

This being a very brief introduction, where in reality there are multitudes of variants and possibilities for zero-knowledge proofs to be worked on (e.g. zkSNARK), and some very nice use cases already out in the wild (e.g. anonymous payment channels). Continuous research for both use cases and technical implementations are are expected and will be very much looked forward to.