CL15 Anonymous ID MappingThe CL-15 method is used to map two pseudonyms together, without revealing the user's ID. In this case the user has an ID (UID) and wishes to map System A to System B. System A has a value of xa and System B has a value of xb. Both System A and System B have public and private keys. In this case we use RSA for the public key encryption. |
Method
We have data sources which can span across the public sector. This might relate to health records, tax records, and so on. The nightmare situation for many is where governments create a data lake, and where all of your data is linked and merged. This would allow your GP to see your tax return, and the tax inspect the chance to see your health record. An improved solution is thus for citizens to own their own identity, and then allow them to link it between disparate systems.
So, let’s say that you don’t trust your government to match your identity across different data systems. Two of the best methods proposed for this matching are known as CL-15 and CL-17:
- Camenisch, Lehmann. (Un)linkable Pseudonyms for Governmental Databases. CCS’15. [link]
- Camenisch, Lehmann. Privacy- preserving User-auditable Pseudonym Systems. IEEE EuroSP’17.
Within these papers, Camenisch and Lehmann outline a way that the user can own their own ID (UID), and then to store a pseudonym on System A (PsIDA), and another one on System B (PsIDB). A controller then links the two IDs together, but cannot tell the identity of the person involved.
In this case System A has a value of xa, and System B has a value of xb. Next the user ID (UID) is used to generate the pseudo ID in each system. For System A, the pseudo ID will be \(UID^{xa}\), and, in System B, it will be \(UID^{xb}\). Because of the security of discrete logs, neither System A nor System B can determine the value of UID:
The pseudo IDs are then (and where \(p\) is a prime number):
\(PsIDA = UID^{xa} \pmod p\)
\(PsIDB = UID^{xb} \pmod p\)
Now we want to link Bob from System A to System B. First System A encrypts PsIDA with the public key of System B:
\(C = \text{Enc}(pubB, PsIDA)\)
Next we use homomorphic encryption to raise this cipher to the power of \(\frac{xb}{xa}\):
\(C’ = \text{Enc}(pubB,PsIDA)^{\frac{xb}{xa}}\)
This becomes:
\( C’ = \text{Enc}(pubB,UID^{xa})^{\frac{xb}{xa}} \)
The controller can then pass this cipher value to System B, and through the wonder of homomorphic encryption:
\(C’ = \text{Enc}(pubB,UID^{xb})\)
System B can then use its private key to get:
\(UID^{xb}\)
And thus System B will know the pseudo ID, but the controller or System A cannot tell the ID stored on System A. A sample run is:
The sample Python coding is:
import sys import libnum import Crypto.Util.number val=0 bits=60 p=Crypto.Util.number.getPrime(bits, randfunc=Crypto.Random.get_random_bytes) q=Crypto.Util.number.getPrime(bits, randfunc=Crypto.Random.get_random_bytes) n=p*q e=65537 PHI=(p-1)*(q-1) d=libnum.invmod(e,PHI) uid = 11 xa = 23 xb = 7 id1=pow(uid,xa,n) id2=pow(uid,xb,n) print("ID1:\t\t",id1) print("ID2:\t\t",id2) print("xa:\t\t",xa) print("xb:\t\t",xb) print("\ne:\t\t",e) print("N:\t\t",n) print("p:\t\t",p) print("q:\t\t",q) print("RSA Encryption parameters. Public key: [e,N].") cipher1=pow(id1,e,n) print("\nCipher1:\t",cipher1) val=pow(cipher1,xb,n) val=pow(val,libnum.invmod(xa,PHI),n) val=pow(val,d,n) print("Derived ID2:",val)
A sample run is:
ID1: 929293739471222707 ID2: 21611482313284249 xa: 11 xb: 10 e: 65537 N: 509628704099738411389114128951088319 p: 826182604440061657 q: 616847536320538967 RSA Encryption parameters. Public key: [e,N]. Cipher1: 458944241885995240133251377203010898 Derived ID2: 21611482313284249
We can also minimise the code with:
from Crypto.Util.number import * from Crypto import Random import Crypto import gmpy2 bits=128 if (len(sys.argv)>1): msg=str(sys.argv[1]) if (len(sys.argv)>2): bits=int(sys.argv[2]) p = Crypto.Util.number.getPrime(bits, randfunc=Crypto.Random.get_random_bytes) q = Crypto.Util.number.getPrime(bits, randfunc=Crypto.Random.get_random_bytes) n = p*q PHI=(p-1)*(q-1) # RSA Encryption keys e=65537 d=(gmpy2.invert(e, PHI)) uid = 1234 xa = 23 xb = 27 id1=pow(uid,xa,n) id2=pow(uid,xb,n) cipherID=pow(id1,e,n) val=pow(cipherID,xb,n) val=pow(val,gmpy2.invert(xa,PHI),n) val=pow(val,d,n) print "ID1:\t\t",id1 print "ID2:\t\t",id2 print "Derived ID2:",val
Presentation
The following is a presentation: