VMPC One-Way Function and VMPC Encryption Technology by Bartosz Zoltak

VMPC-MAC Authenticated Encryption Scheme

1. Introduction
2. Specification of a 4-level VMPC-MAC scheme
3. Performance of the VMPC-MAC scheme
4. Best known attack against the VMPC-MAC scheme
5. Test output of the VMPC-MAC scheme

Download VMPC Encryption Technology Implementation:
VMPC Encryption Technology Implementation in C
VMPC Encryption Technology Implementation in Pascal/Delphi
VMPC Encryption Technology Implementation in Assembler

Download paper
"VMPC-MAC: A Stream Cipher Based
Authenticated Encryption Scheme":
vmpc_mac.pdf (151 KB)
vmpc_mac.ps (253 KB)
vmpc_mac.dvi (43 KB)

Paper also available at
IACR ePrint archive

1. Introduction

VMPC-MAC scheme is an algorithm for computing Message Authentication Codes (MACs) for the VMPC Stream Cipher. A MAC allows to verify whether the message output from the decryption process is exactly the same message that was encrypted (i.e. that there was no transmission error, that nobody intercepted and changed (forged) our message and that we used the correct key to decrypt the message).

According to current state of the art the VMPC-MAC scheme provides a 2^144 level of security (resistance to best found message-forgery attack) (for value 4 of the d parameter). The security level can be extended by increasing the d parameter (e.g. 5-level VMPC-MAC scheme requires an effort of 2^200 to be forged)

The VMPC-MAC employs some of the internal-state-data of the VMPC Stream Cipher (the P permutation) to compute the MAC. This allowed to simplify the design and minimize the amount of additional-to-encryption computations and achieve high performance in software implementations of the complete encrypt-and-authenticate process. Performance of the VMPC-MAC scheme reaches a rate of about 27 clock cycles / byte (about 100MBytes / second on an Intel Pentium 2,66GHz processor).

2. Specification of 4-level VMPC-MAC scheme (d=4)

Notation :

	P	: 256-byte table storing a permutation of integers {0,1,...,255}
	n, s	: 8-bit integer variables
	T	: 32-byte table (the general size of T is (8*d) bytes)
	x1, x2, x3, x4	: 8-bit integer variables (the general number of x variables is d)
	M	: 20-byte table the resulting MAC will be stored in
	m, g, R, i	: temporary integer variables
	L	: length of plaintext in bytes
	+	denotes addition modulo 256

Table 1. VMPC-MAC scheme

1.1 Run the VMPC Key Scheduling Algorithm (VMPC-KSA or VMPC-KSA3).
1.2 m = g = x1 = x2 = x3 = x4 = 0; T[i]=0 for i ∈ {0,1,...,31}

2. repeat steps 3-16 L times:
    3. s = P[ s + P[n] ]
    4. Ciphertext[m] = Plaintext[m] xor P[P[P[s]]+1]

    5. x4 = P[ x4 + x3 ]
    6. x3 = P[ x3 + x2 ]
    7. x2 = P[ x2 + x1 ]
    8. x1 = P[ x1 + s + Ciphertext[m] ]

    9. T[g]   = T[g]   xor x1
    10. T[g+1] = T[g+1] xor x2
    11. T[g+2] = T[g+2] xor x3
    12. T[g+3] = T[g+3] xor x4

    13. swap P[n] with P[s]
    14. g = (g + 4) mod 32
    15. n = n + 1
    16. Increment m

17. Execute the Post Processing Phase specified in Table S.17.

18. Input T to the VMPC-KSA round function: KSARound(T, 32).
    This is equivalent to executing the algorithm in Table S.18.

19. Store 20 new outputs of the VMPC Stream Cipher in table M.
    This is equivalent to executing the algorithm in Table S.19.

20. Append the 20-byte MAC (stored in table M) to the Ciphertext.

Table S.17. The Post Processing Phase (step 17 of the algorithm specified in Table 1)

17.1. R = 1
17.2. repeat steps (17.3 - 17.15) 24 times:
      17.3. s = P[ s + P[n] ]

      17.4. x4 = P[ x4 + x3 + R ]
      17.5. x3 = P[ x3 + x2 + R ]
      17.6. x2 = P[ x2 + x1 + R ]
      17.7. x1 = P[ x1 + s + R ]

      17.8. T[g]   = T[g]   xor x1
      17.9. T[g+1] = T[g+1] xor x2
      17.10. T[g+2] = T[g+2] xor x3
      17.11. T[g+3] = T[g+3] xor x4

      17.12. swap P[n] with P[s]
      17.13. g = (g + 4) mod 32
      17.14. n = n + 1
      17.15. R = R + 1

Table S.18. Step 18 (KSARound(T, 32)) of the algorithm specified in Table 1

18.1. n = i = 0
18.2. repeat steps 18.3 - 18.6 768 times:
      18.3. s = P[ s + P[n] + T[i] ]
      18.4. swap P[n] with P[s]
      18.5. i = (i + 1) mod 32
      18.6. n = n + 1

Table S.19. Step 19 (20 VMPC Stream Cipher outputs) of the algorithm specified in Table 1

19.0. Make sure that n = 0 (as should be after step 18)
19.1. repeat steps 19.2 - 19.5 20 times:
      19.2. s = P[ s + P[n] ]
      19.3. M[n] = P[P[P[s]]+1]
      19.4. swap P[n] with P[s]
      19.5. n = n + 1

3. Performance of the VMPC-MAC scheme

Performance of a moderately optimized 32-bit assembler implementation of the VMPC-MAC scheme measured on an Intel Pentium 4, 2.66 GHz processor, is given in Table 2.

Table 2. Software performance of the VMPC-MAC scheme

MBytes / s	MBits / s	cycles / byte
100	800	27

4. Best known attack against the VMPC-MAC scheme

The most efficient (chosen ciphertext) attack found against the VMPC-MAC scheme is described in this section.

The attack assumes that the attacker has full passive and active access to the ciphertext and can use an unlimited number of verification attempts for the new message. The purpose of the attacker is to introduce a new ciphertext which conforms the MAC of the original one.

The attack model begins with a random (or intended by the attacker) change of one bit (or byte) of the ciphertext - Ct[m]. The purpose of the attacker is to hide this change by manipulating the remaining part of the ciphertext in such way as to leave the resulting MAC unchanged.

Let Ct[m] denote m-th byte of Ciphertext
Let xk(m) denote the value of the xk variable (for k=1,2,3 or 4) in iteration m
Let n = (m * d) modulo 32
Let (+) denote addition modulo 32

A change of Ct[m] unconditionally causes a change of x1(m), since P is a permutation.

Because x1(m) and only x1(m) directly updates x2(m+1) and indirectly updates x3(m+2) and x4(m+3), the variables x2(m+1), x3(m+2) and x4(m+3) will be unconditionally changed too.

The following elements of table T will be updated and unconditionally changed by those variables: T[n] changed by x1(m), T[n(+)5] changed by x2(m+1), T[n(+)10] changed by x3(m+2) and T[n(+)15] changed by x4(m+3).

The most efficient method of reverting these changes found forces the attacker to perform the following changes of the ciphertext:

1. Change Ct[m+1] in such way as to make x4(m+4) return to its original value. The unavoidable cost of this is a change of x1(m+1), x2(m+2) and x3(m+3).

[x3(m+3) must be changed in such way as to make x4(m+4) = (x4(m+3) + x3(m+3)) modulo 256 return to its original value]

As a result T[n(+)4] is changed by x1(m+1), T[n(+)9] is changed by x2(m+2) and T[n(+)14] is changed by x3(m+3). T[n(+)19] remains unchanged because the change of x4(m+4) was reverted.

2. Change Ct[m+2] in such way as to make x3(m+4) return to its original value. The unavoidable cost of this is a change of x1(m+2) and x2(m+3).

As a result T[n(+)8] is changed by x1(m+2) and T[n(+)13] is changed by x2(m+3). T[n(+)18] remains unchanged because the change of x3(m+4) was reverted.

3. Change Ct[m+3] in such way as to make x2(m+4) return to its original value. The unavoidable cost of this is a change of x1(m+3).

As a result T[n(+)12] is changed by x1(m+3). T[n(+)17] remains unchanged because the change of x2(m+4) was reverted.

4. Change Ct[m+4] in such way as to make x1(m+4) return to its original value. As a result T[n(+)16] remains unchanged.

At this moment the attacker succeeded in stopping the avalanche of changes of elements of T, resulting from a change of Ct[m], by reverting the changes of x1,x2,x3,x4 in the earliest possible iteration (m+4). The cost of this is an unavoidable change of 10 elements of the T table (T[n, n(+)4, n(+)5, n(+)8, n(+)9, n(+)10, n(+)12, n(+)13, n(+)14, n(+)15]).

To complete a successful forgery, the attacker needs to revert the changes of these elements of T, too. Operations similar to steps 1-4 need to be performed to refrain x1,x2,x3,x4 from causing more damage to T and the additional requirement - to revert the already caused changes to T - needs to be satisfied. The most efficient approach found achieves that in the following steps 5-9:

5. Change Ct[m+8] in such way as to change x1(m+8) in such way as to revert the change of T[n], make x2(m+9) change in such way as to revert the change of T[n(+)5], make x3(m+10) change in such way as to revert the change of T[n(+)10], and make x4(m+11) change in such way as to revert the change of T[n(+)15].

6. Change Ct[m+9] in such way as to make x4(m+12) return to its original value, make x1(m+9) change in such way as to revert the change of T[n(+)4], make x2(m+10) change in such way as to revert the change of T[n(+)9], make x3(m+11) change in such way as to revert the change of T[n(+)14]. T[n(+)19] remains unchanged because the change of x4(m+12) was reverted.

7. Change Ct[m+10] in such way as to make x3(m+12) return to its original value, make x1(m+10) change in such way as to revert the change of T[n(+)8], make x2(m+11) change in such way as to revert the change of T[n(+)13]. T[n(+)18] remains unchanged because the change of x3(m+12) was reverted.

8. Change Ct[m+11] in such way as to make x2(m+12) return to its original value, make x1(m+11) change in such way as to revert the change of T[n(+)12]. T[n(+)17] remains unchanged because the change of x2(m+12) was reverted.

9. Change Ct[m+12] in such way as to make x1(m+12) return to its original value. As a result T[n(+)16] remains unchanged.

A total complexity of the described attack is determined by the total number of changes to variables x1,x2,...,xd and T[0,1,..., 8*d-1], which need to be reverted. Steps 1-9 determine this complexity, for the assumed d=4, to 256^18 = 2^144. Extending the the d parameter to 5 would, in an analogous attack, yield a complexity of 256^25 = 2^200 (which would also imply an increase of the size of the MAC to 25 or more bytes).

5. Test output of the VMPC-MAC scheme

An example 20-byte MAC generated by the algorithm described in Section 2 is given in Table 3. The MAC is generated for an example key and Initialization Vector specified in Table 3 and for a 256-byte plaintext message consisting of consecutive numbers from 0 to 255 (Plaintext[i]=i for i=0,1,...,255; e.g. Plaintext[150]=150)

Table 3. Example MAC produced by the VMPC-MAC scheme

Key (hex)	`96, 61, 41, 0A, B7, 97, D8, A9, EB, 76, 7C, 21, 17, 2D, F6, C7`
Initialization Vector (hex)	`4B, 5C, 2F, 00, 3E, 67, F3, 95, 57, A8, D2, 6F, 3D, A2, B1, 55`
256-byte Message (dec)	`0, 1, 2, 3, ..., 252, 253, 254, 255`
Resulting 20-byte MAC (hex)	`9B, DA, 16, E2, AD, 0E, 28, 47, 74, A3, AC, BC, 88, 35, A8, 32, 6C, 11, FA, AD`