Correlation Power Analysis on Speck - Part 1

Speck - Correlation Power Analysis

The following blog article describes a Correlation Power Analysis (CPA) attack on the Speck cipher. I will begin by giving a brief introduction to the Speck cipher and CPA attacks in general. After this, I will kick it off with the interesting part (Attacking Speck in practice).

Speck

Speck is a symmetric ARX cipher. This means that only the operations ADD, ROTATE and XOR are used within the cipher. This makes the cipher very performant in soft- and hardware. Speck can be operated in different modes, differentiating in a difference in key- and block length. The following tables gives brief overview of the available modes:

Speck Blocklength Keylength Number of Rounds
Speck3264 32 Bit 64 Bit 22
Speck4872 48 Bit 72 Bit 22
Speck4896 48 Bit 96 Bit 23
Speck6496 64 Bit 96 Bit 26
Speck64128 64 Bit 128 Bit 27
Speck9696 96 Bit 96 Bit 28
Speck96144 96 Bit 144 Bit 29
Speck128128 128 Bit 128 Bit 32
Speck128192 128 Bit 192 Bit 33
Speck1281256 128 Bit 256 Bit 34

Speck3264 over 22 rounds has been chosen for the CPA attack.

As Speck is a simple cipher, the following (short) pseudocode already describes the full cipher:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
pt = Plaintext Bytes              Pt = Plaintext as 16 Bit Words
ct = Ciphertext Bytes             Ct = Ciphertext as 16 Bit Words
k = Key as Bytes                  K = Key as 16 Bit Words

// Key Schedule
D=K[3], C=K[2], B=K[1], A=K[0]

for i in 0..<22
    rk[i]=A
    ER16(B, A, i++)
    rk[i]=A
    ER16(C, A, i++)
    rk[i]=A
    ER16(D, A, i++)

// Encryption
Ct[0]=Pt[0]; Ct[1]=Pt[1];

for i in 0..<22
    ER16(Ct[1], Ct[0], rk[i++])

The round function ER16 used in Speck is called in all 22 rounds during encryption and is also used as a function for generating the round key.

The structure of the function is as follows:

Correlation Power Analysis

Correlation Power Analysis (CPA) is a variant of Differential Power Analysis (DPA). The attack is based on the Pearson Correlation Coefficient, which can be used to calculate the linear relationship between two measurement values. In this case, it can be used to find correlations between the plaintext and the power consumption of the chip. Through CPA, the Speck round key can be extracted byte by byte, from which the encryption key can be calculated.

Basically, CPA relies on a concrete operation of an intermediate value in the round function ER16. Once the key is included in the computation, an attack can be attempted. In case of Speck, the XOR operation of the key with the intermediate value from the AND operation of the two plaintexts can be used:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
void FuncER16(u16 *x, u16 *y, u16 k)
{
    u16 tmp_x = *x;
    u16 tmp_y = *y;

    *x = (((tmp_x)>>(7)) | ((tmp_x)<<(16-(7))));  // ROR(7)
    *x += *y;

    *x = *x ^ k;   // <-- This operation can be attacked by CPA

    *y = (((tmp_y)<<(2)) | (tmp_y>>(16-(2))));   // ROL(2)
    *y = *y ^ *x;
}

A good (general) hands-on on the topic can be found in the Chipwhisperer Wiki.

Theoretical Attack

For the theoretical attack, a software implementation of the Speck roundfunction is required to identify if there is a potential correlation. To know how to approach building such a model, a short introduction to the Hamming Weight (HW) is required.

Hamming Weight

A chip has a certain base consumption while no active operations are performed (IDLE). If bytes in the chip are changed ($0 \rightarrow 1 ; 1 \rightarrow 0$), power is required and the power consumption increases as a result. This behavior can be simulated in the model by the so-called Hamming distance (HD). The Hamming distance describes the number of changed bits between two data sets:

$$HD(0100101, 0010101) = 2$$

A partial round in Speck always uses an XOR operation to calculate a partial key. The difference of these two values xored , is called the Hamming weight (HW):

$$HD(0100101, 0010101) = HW(0100101 \oplus 0010101)$$

Therefore, the Hamming weight is used in the model to model power consumption.

Speck Model - Simulation

For a software model, the below python code was used. It was enough to skip the 22 rounds of calling ER16() and only calling it once. Otherwise, it is basically the final step of the Speck encryption scheme.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
ALPHA = 7
BETA = 2
WORD_SIZE = 16

mod_mask = (2 ** WORD_SIZE) -1
mod_mask_sub = (2 ** WORD_SIZE)

# Speck Roundfunction
def ER16(x, y, k):
    rs_x = ((x << (16 - ALPHA)) + (x >> ALPHA)) & mod_mask
    add_sxy = (rs_x + y) & mod_mask
    new_x = k ^ add_sxy
    ls_y = ((y >> (16 - BETA)) + (y << BETA)) & mod_mask
    new_y = new_x ^ ls_y
    return new_x, new_y

# Speck `encryption` model
def simple_speck(plaintext, key):
    Ct_0 = plaintext[0]
    Ct_1 = plaintext[1]
    
    Ct_1, Ct_0 = ER16(Ct_1, Ct_0, key)  
    return popcount((Ct_1 << 8) + Ct_0)

The function popcount() is used for calculating the Hamming Weight. A performant implementation could look like this:

1
2
3
4
5
6
# From the Book: A Hackers Delight
def popcount(x):
    x -= (x >> 1) & 0x5555555555555555
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0f
    return ((x * 0x0101010101010101) & 0xffffffffffffffff ) >> 56

When simulating the attack with a fixed first keybyte (0x55) and a included noise level, the following correlations for all possible keybytes arise:

It can be seen that the highest correlation is actually measured for the correct first keybyte, 0x55.

Speck Model - Hardware

The next step is to bring this attack to real hardware and see if the same model works to find a correlation in the real power traces - this is were the fun part begins.

To attack Speck on a real hardware, the awesome Chipwhisperer was used. The first step was to implement the Speck cipher for the Chipwhisperer and integrate it in something like the CWs SimpleSerial interface to have a proper setup for testing, encrypting multiple plaintext and capturing the power traces.

Info: The speck-simple-serial firmware can be found in my Github, together with instructions on how to build and flash it.

Breaking first two keybytes

The first thing to do is to generate random plaintext pairs, let the CW encrypt the plaintexts and capture the traces that thereby occur.

Using the Python API for the Chipwhipserer, it was straight forward to capture the traces:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
from tqdm.notebook import trange
import random
import numpy as np

ktp = cw.ktp.Basic()

trace_array = []    # For the power-traces
textin_array = []   # For the plaintexts

pt = b"\x4c\x69\x74\x65" 
random.seed(0x1337) 

# Number of power-traces to capture
N = 2000

# For the nice UI output
for i in trange(N, desc='Capturing traces'):
    pt = bytes([random.randint(0, 255) for i in range(4)])
    scope.arm()
    
    target.simpleserial_write('e', pt)
    ret = scope.capture()
    if ret:
        print("Target timed out!")
        continue
    
    response = target.simpleserial_read('c', 4)
    
    trace_array.append(scope.get_last_trace())
    textin_array.append(pt)

trace_array = np.array(trace_array)

The capture will result in two arrays, the array with the power traces (trace_array) and for the plaintexts (textin_array).

Plotting the trace of one encryption looks like this (For 5000 sample points within one trace):

In the next step, the model that we created in the first step (simple_speck) gets executed for every plaintext that was used on the Hardware to obtain the Hamming Weight. This procedure is done for every possible keybyte (aka 256 times). After that, the correlation between the power-traces of keybyte X and the model executions for keybyte X is calculated over all plaintexts. The highest correlation should hopefully indicate the first keybyte for Speck.

The following graphic will hopefully make it more clear:

When plotting the correlations over all keybytes, it can be seen the the correlation for the correct keybyte (0x22) is the maximum correlation:

Boom! The first keybyte was successfully obtained with a power-analysis attack!

The calculations, plots and sample traces can be found in a IPython notebook in my Github