**Extension fields part I**

## Motivation

Extension fields is the mathematics involved in two example cases:

- When you burn data on a disk. The data can then be retrieved when read even when there are scratches on the disc. Even if there are errors on the disc when read due to the scratches, some algorithm called an error correcting code shall correct the errors for you up to a certain limit, beyond which the disc is corrupted. This algorithm/code called a BCH code works in the binary extension field GF(2
^{m}) to accomplish this. - When you want to skype, only the caller and receiver receiver the video streams from the webcam from both ends. This is called encryption and decryption and is a branch of mathematics/engineering known as cryptography. The cryptographic code that accomplishes this, known as AES works in the binary extension field GF(2
^{m}).

### crypto system definition

A system that can perform encryption and decryption is known as a **cryptographic system/cryptosystem**.

### cryptographic protocol definition

There is a difference between a cryptosystem and a cryptographic protocol is that while a crypto system deals with encryption and decryption, a crypto protocol may deal with several crypto systems in a given use case scenario. According to Wikipedia, as it is tied toa scenario, the crypto protocol has to provide a majority of the following:

- Key agreement/establishment between communicating parties.
- Communicating entity authentication.
- Secure application level data transport
- Non repudiation methods, mostly through digital signatures.
- secret sharing methods
- symmetric encryption and/or public key/multi party encryption. The use of both is called a mixed encrypton scheme.

Example: Transport layer security(TLS) which ensures secure communication over the web. It involves: An authentication mechanism through the use of X.509 certificates and key exchange grouped into one operation known as a TLS handshake. It also provides secure application level data transport after the handshake. It thus accomplishes 5 out of 6 of the above objectives of a crypto protocal.

## More about cryptosystems

**Encryption**– A means of hiding information(**plain text**) so that even if it is intercepted its secrets can’t be known. The form in which it is hidden in is called **cipher text.** Therefore encryption is a transformation of plain text to cipher text.

**Decryption**– A means of unhiding information so that the secret_systems can be known. Therefore decryption is a transformation of cipher text to plain text.

A **key** is thus introduced to ensure only the people needed to read the message can do so. This also allows the cryptographic system to be publicly available to be tested without worrying about the key. The flaws of the cryptographic system can then be corrected which leads to even stronger security. Even an attacker of the system may know details of the system itself but decryption would be infeasible without the key which is ideally only known to the sender and receiver.

Any cryptographic system thus comprises of 3 parts:

- The
**plain text** - The
**key** - The
**cipher text**

An attack on a cryptographic system is taken as an attack on any of the above 3 individually or through the interrelationships between them. It is thus important to decouple the three as much as possible in an encryption system.

According to Claude Shannon, there are two primitive operations upon which reliable encryption algorithms can be built:

**Confusion**– The relationship between key and cipher text is obscured. This is commonly implemented in substitution operations(S – boxes from a system view perspective).**Diffusion**– The effect of a plain text element should be spread out all through the cipher text. The goal is to hide the statistical properties of the plain text so that it is not evident by looking/peeking at the cipher text.

The only thing Shannon did not talk about is the plain text – key relationship. Since both are inputs to a cryptographic system, it is best that they are decoupled as much as possible with the key being as random as possible at the input, but shared between the sending and receiving parties. This generation of the key is classified under “key generation techniques”.

To understand how an attacker can attack a general encryption system, we thus look at the different ways the attacks can happen in 4 ways:

**Cipher text only attack**– The attacker has the some cipher text and thus needs to find a) The key if he/she is interested in decrypting subsequent messages b) The plain text if he /she is interested in the current plain text. This would most likely be due to some habit developed by the encrypter that would indicate where the interesting parts of the cipher text are. To do a) and b), the feasible option would be a brute force attack where the attacker would go through every feasible key combination till the plain text makes sense. The remedy would be to use a large key space till the in-feasibility / investment into such an attack is much greater than the gain from obtaining the cipher text. This can be viewed in terms of time or resources.**Known plain text attack**– Attacker has access to parts of the plain text and the corresponding cipher text.**Chosen plain text attack**– An attacker has access to a given plain text and corresponding cipher text of his/her choice.**Chosen cipher text attack**– The attacker can decrypt parts of the plain text but does not know the whole decryption key.

### The Advanced encryption standard (AES) as a cryptosystem.

**NOTE**: AES can be viewed as both a cryptographic system and a cryptographic standard.

The **advanced encryption standard(AES)** is a type of private key/symmetric encryption where the sender and receiver use a pre shared key to determine the contents of the message.

It is published as a Federal Information Processing Standards Publication 197 (FIPS PUB 197). The actual publication dated November 26 2001 can be found here

In a symmetric encryption scheme, the sender, say Alice, may encrypt the message into a non understandable text called cipher text to Bob, who then uses the same key to decrypt the cipher text and obtain the message sent.

The AES encryption block diagram looks as below:

In order to understand a sample use of extension fields GF(q^{m}) where in this case q=2, as the data is usually binary in nature, we thus need to look deeper in the AES encryption block above.

Each pass pass through the AES algorithm by a given data block is called a **round**. The number of rounds per 16 byte block is a function of the key length with:

- 128 bits having 10 rounds per block.
- 192 bits having 12 rounds per block.
- 256 bits having 14 rounds per block.

The path the data follows in the AES encryption block is known as the **data path**. It represents the continuous changes the data undergoes. The data path at a given point is also known as the **state**.

Within a round there, all layers layers that work on all 128 bits of the 16 byte data block. There exists 3 layers. Each of the 3 layers work repeatedly on the data in successive procedures known as rounds.

The various layers in a round include:

**Key addition layer**– A 128 bit round key is derived from the main key then XOR’ed with the data path. The function that derives the round keys from the main key is known as a**key schedule**. The extension field operation involved here is**GF(2**.^{m}) addition**Byte substitution/ sub bytes layer**– Each byte in the state is transformed by the use of lookup tables with special math properties. This causes**confusion**where the relation between the key and cipher text is hidden. This is usually a non linear transformation. The extension field operation involved here is**GF(2**.^{m}) inversion**Diffusion layer shift rows sub layer –**This layer treats the 16 bytes as a concatenated block of 4, 4 byte blocks. It then cyclically right shifts each byte by various amounts. This shall be shown later. The is no extension field operation here, just computational right shift instructions.**Diffusion layer mix columns sub layer –**The extension field operation involved here is**GF(2**.^{m}) multiplication

The next post shall go through the various extension field operations and how they relate to AES in depth in order to gain an intuition.

The post after next will go through an encryption and decryption operation manually on a sample 16 byte block while referencing code from libsodium / mbedTLS (C library) from start to finish. The result will then be verified using the python cryptographic library pycrypto.