Embedding Data in an Elliptic Curve PointWith Elliptic Curve Cryptography (ECC), we use points on an elliptic curve to represent our data. These are defined with (x,y) points. With this, it is possible to embedded a certain amount of data into the point - as long as it still gives us a valid point on the curve. This example will Curve 25519 to embed data, and is achieved by storing the length of the data in a single byte in the point, and then store the data as byte values in the point (followed by random data). To decode, we can then just read the first byte for the length of the message, and then reveal the data encoded into the point. |
Outline
One of the most beautiful things in our online world is humble elliptic curve. It is there whenever you connect to a Web site with the ECDH (Elliptic Curve Diffie Hellman) key exchange method, and in proving the identity of a Web site with ECDSA (Elliptic Curve Digital Signature Algorithm). It is at core of trust in Bitcoin and Ethereum, and is generally making our digital work more private and trusted.
With elliptic curves, we have (x,y) points on a curve that maps to a defined equation, such as:
\(y^2=x^3 + ax + b \pmod p\)
and where \(a\), \(b\), and \(p\) (a prime number) are the characteristics of the curve. With ECC (Elliptic Curve Cryptography), we then define a base point on the curve (G), and can then perform simple operations such as point add (\(P+P\)) or point double (\(2.P\)). This allows us to compute points through multiplication, where we take a scale value (\(x\)) and produce:
\(Y=x.G\)
which is the point \(G\) added \(x\) times. Typically we use \(x\) as a private key value, and \(Y\) as the associated public key. At the current time — for safe curves (eg Curve 25519, NIST P256, and secp256k1) — it is extremely expensive to recover \(x\) from \(Y\), even though we know \(G\).
Embedding data in a point
Now, let’s look at the magic of embedding data in a point. The method we will use is defined in the Kyber package here]:
func (P *point) Embed(data []byte, rand cipher.Stream) kyber.Point { // How many bytes to embed? dl := P.EmbedLen() if dl > len(data) { dl = len(data) } for { // Pick a random point, with optional embedded data var b [32]byte rand.XORKeyStream(b[:], b[:]) if data != nil { b[0] = byte(dl) // Encode length in low 8 bits copy(b[1:1+dl], data) // Copy in data to embed } if !P.ge.FromBytes(b[:]) { // Try to decode continue // invalid point, retry } // If we're using the full group, // we just need any point on the curve, so we're done. // if c.full { // return P,data[dl:] // } // We're using the prime-order subgroup, // so we need to make sure the point is in that subencoding. // If we're not trying to embed data, // we can convert our point into one in the subgroup // simply by multiplying it by the cofactor. if data == nil { P.Mul(cofactorScalar, P) // multiply by cofactor if P.Equal(nullPoint) { continue // unlucky; try again } return P // success } // Since we need the point's y-coordinate to hold our data, // we must simply check if the point is in the subgroup // and retry point generation until it is. var Q point Q.Mul(primeOrderScalar, P) if Q.Equal(nullPoint) { return P // success } // Keep trying... } }
As we are dealing with 256-bit point values, we are limited in the amount of data we can embed into a single point. The method basically picks a random point on the curve. We now define the first byte for the length of the data and then add the data onto the point. In the following b[] is a byte array that will hold the point, and where we store the length in the first byte:
if data != nil { b[0] = byte(dl) // Encode length in low 8 bits copy(b[1:1+dl], data) // Copy in data to embed }
There is then a check that this is a valid (x,y) point — by computing the y-axis point. If it is, we have successfully embedded our data into the point. As each point is 256 bits, and we have used 8 bits for the length, the maximum amount of data we can embed is 248 bits (31 bytes). We could thus store a SHA-1 hash (160 bits) in a point, or a string of up to 31 characters.
Coding
So let’s take a simple example. Let’s say we have a secret value of x, and we want to hide our message (msg) with this. To encrypt we could perform:
\(M=\textrm{encode}(msg) \)
\(C = x.M\)
and where \(M\) is the encoded point for our message, and \(C\) is the cipher message at a point on the curve. To decrypt, we then take the inverse of \(x\), and compute:
\(Inv_x = x^{-1}\)
\(D = Inv_x.C\)
\(msg = \textrm{decode}(D)\)
and we should have recovered our message in \(M\). Note that \(x\) and \(Inv_x\) are scalar values, and \(C\) is an elliptic curve point. Obviously, to decode the point, we need to read the length in the first byte of the point, and then read a given number of bytes to recover the message.
Here is the code:
package main import ( "fmt" "os" "go.dedis.ch/kyber/v3/group/edwards25519" "go.dedis.ch/kyber/v3/util/random" ) func main() { message:="Testing" argCount := len(os.Args[1:]) if (argCount>0) {message= string(os.Args[1])} suite := edwards25519.NewBlakeSHA256Ed25519() x := suite.Scalar().Pick(suite.RandomStream()) M := suite.Point().Embed([]byte(message), random.New()) C:= suite.Point().Mul(x, M) invx := suite.Scalar().Inv(x) D:=suite.Point().Mul(invx,C) rtn, _:= D.Data() fmt.Printf("Message:\t\t%s\n" , message) fmt.Printf("Message point:\t\t%s\n" , M.String()) fmt.Printf("\nPrivate key (x):\t%s\n" , x) fmt.Printf("\nInv private key (x^{-1}):\t%s\n" , invx) fmt.Printf("\nCipher point (C=x.M):\t%s\n" ,C) fmt.Printf("\nMessage:\t%s\n" , string(rtn)) }
And a sample run:
Message: Hello Message point: 0548656c6c6fafa4123b2c2b8eb4684fba762b954737738e386570c706979e93 Private key (x): 53456343d65c3fab484f793581b2d9f79cf1a7e991c27fe95836fef266c3d005 Inv private key (x^{-1}): bad034fe125aaea5824205aef0ac60ec7e42ee11d419db2b912cbf01d7ff5f01 Cipher point (C=x.M): 060c7d8e1a93b4fd565df942f95a6e2d5a2a80441c929ee039aeb889f6c2baa8 Message: Hello
For "Hello", we see the encoded point as:
05 48656c6c6f 9024908b7d61a7ccb61e6df0bdf14b54116f4091340d5941c8d3
and where "05" identifies that we have five bytes in the message, and 0x48 represents 'H', 0x65 for 'e', 0x6c for 'l', and 0x6f for 'o'. The rest of the point is a random value.