
Endianer Pwns

This is a short article about the different byte orders. For simplicity, I only show sequences of up to 32 bits. For the examples, I use a 32 bit register and name the bytes where our data is stored A, B, C and D. Typically, you have either big endian or little endian, but there are also exotic orders that are called middle or mixed endian.

I don’t go into bit orders, nibble swaps and 64 bit values in this article, as that would go beyond its scope.

Common Byte Orders

The least significant byte is always on the right side and the most significant byte is always on the left side. The letters A to D show where the bytes are stored in the register; the address increases from A to D.

For 16 bits the orders are:

  • Little Endian (LE): BA

  • Big Endian (BE): AB

For 32 bit the byte orders are:

  • Little Endian (LE): DC BA

  • Big Endian (BE): AB CD

  • Mixed Little Endian (MLE): CD AB

  • Mixed Big Endian (MBE): BA DC

Check it with the dump 50 6F 6E FF:

  • The byte A at the register contains: 50

  • The byte B at the register contains: 6F

  • The byte C at the register contains: 6E

  • The byte D at the register contains: FF
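The letter notation can be turned into a small helper. This is only a minimal sketch in Python; the names `ORDERS` and `reorder` are made up for this illustration and do not come from any library. It arranges the four dump bytes from most significant to least significant for each byte order.

```python
# Register layout per byte order, written from the most significant
# byte (left) to the least significant byte (right).
ORDERS = {
    "LE": "DCBA",   # little endian
    "BE": "ABCD",   # big endian
    "MLE": "CDAB",  # mixed little endian
    "MBE": "BADC",  # mixed big endian
}


def reorder(raw: bytes, order: str) -> bytes:
    """Return the four raw bytes arranged most significant byte first."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    # 'A' is the byte at the lowest address, 'D' the one at the highest.
    return bytes(raw[ord(letter) - ord("A")] for letter in ORDERS[order])


raw = bytes.fromhex("506F6EFF")
for name in ORDERS:
    print(name, reorder(raw, name).hex(" ").upper())
```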

Integers

unsigned 16 bit little endian (BA)

Raw data
50 6F 6E FF
Byte 1 Byte 0 Outcome
6F 50 28496
FF 6E 65390

signed 16 bit little endian (BA)

Raw data
50 6F 6E FF
Byte 1 Byte 0 Outcome
6F 50 28496
FF 6E -146

unsigned 16 bit big endian (AB)

Raw data
50 6F 6E FF
Byte 1 Byte 0 Outcome
50 6F 20591
6E FF 28415

signed 16 bit big endian (AB)

Raw data
50 6F 6E FF
Byte 1 Byte 0 Outcome
50 6F 20591
6E FF 28415
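The 16 bit values can be cross-checked with Python's built-in `int.from_bytes`. The following is a sketch under the assumption that the dump is read as two independent 16 bit words; `decode_int16` is a hypothetical helper name chosen for this example.

```python
raw = bytes.fromhex("506F6EFF")


def decode_int16(word: bytes, byteorder: str, signed: bool) -> int:
    """Decode one 2-byte word; byteorder is "little" (BA) or "big" (AB)."""
    return int.from_bytes(word, byteorder=byteorder, signed=signed)


# The dump holds two 16 bit words: 50 6F and 6E FF.
for word in (raw[0:2], raw[2:4]):
    for byteorder in ("little", "big"):
        for signed in (False, True):
            value = decode_int16(word, byteorder, signed)
            kind = "signed" if signed else "unsigned"
            print(word.hex(" ").upper(), byteorder, kind, value)
```

Only the little endian reading of 6E FF changes sign, because that is the only case where the most significant byte (FF) has its top bit set.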

unsigned 32 bit little endian (DCBA)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
FF 6E 6F 50 4285427536

signed 32 bit little endian (DCBA)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
FF 6E 6F 50 -9539760

unsigned 32 bit big endian (ABCD)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
50 6F 6E FF 1349480191

signed 32 bit big endian (ABCD)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
50 6F 6E FF 1349480191

unsigned 32 bit mixed little endian (CDAB)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6E FF 50 6F 1862226031

signed 32 bit mixed little endian (CDAB)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6E FF 50 6F 1862226031

unsigned 32 bit mixed big endian (BADC)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6F 50 FF 6E 1867579246

signed 32 bit mixed big endian (BADC)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6F 50 FF 6E 1867579246
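The 32 bit results can be reproduced in a few lines of Python. This is a sketch, not a reference implementation: `decode_int32` is a hypothetical name, and the mixed orders are handled by first arranging the bytes most significant byte first and then decoding them as big endian.

```python
def decode_int32(raw: bytes, order: str, signed: bool = False) -> int:
    """Decode four bytes using one of the layouts DCBA, ABCD, CDAB, BADC."""
    a, b, c, d = raw  # bytes at increasing addresses
    layouts = {
        "DCBA": (d, c, b, a),  # little endian
        "ABCD": (a, b, c, d),  # big endian
        "CDAB": (c, d, a, b),  # mixed little endian
        "BADC": (b, a, d, c),  # mixed big endian
    }
    msb_first = bytes(layouts[order])
    return int.from_bytes(msb_first, byteorder="big", signed=signed)


raw = bytes.fromhex("506F6EFF")
for order in ("DCBA", "ABCD", "CDAB", "BADC"):
    print(order, decode_int32(raw, order), decode_int32(raw, order, signed=True))
```

Signed and unsigned only differ for DCBA, since FF is the only byte with the top bit set and it ends up as the most significant byte only in that layout.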

Floats (IEEE 754)

float 32 bit little endian (DCBA)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
FF 6E 6F 50 -3.16934e+38

float 32 bit big endian (ABCD)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
50 6F 6E FF 1.60681e+10

float 32 bit mixed little endian (CDAB)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6E FF 50 6F 3.9508e+28

float 32 bit mixed big endian (BADC)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6F 50 FF 6E 6.46817e+28
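The float values can be checked with Python's `struct` module. The sketch below assumes that the two mixed orders can be expressed as a 16 bit word swap followed by a plain big endian or little endian read; the variable names are illustrative only.

```python
import struct

raw = bytes.fromhex("506F6EFF")
# Swap the two 16 bit words: 50 6F 6E FF -> 6E FF 50 6F.
word_swapped = raw[2:4] + raw[0:2]

floats = {
    "DCBA": struct.unpack("<f", raw)[0],           # little endian
    "ABCD": struct.unpack(">f", raw)[0],           # big endian
    "CDAB": struct.unpack(">f", word_swapped)[0],  # mixed little endian
    "BADC": struct.unpack("<f", word_swapped)[0],  # mixed big endian
}

for order, value in floats.items():
    # "g" formatting prints six significant digits.
    print(order, f"{value:g}")
```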

8 Bit ASCII

ascii little endian (DCBA)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
FF 6E 6F 50 ÿnoP

ascii big endian (ABCD)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
50 6F 6E FF Ponÿ

ascii mixed little endian (CDAB)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6E FF 50 6F nÿPo

ascii mixed big endian (BADC)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6F 50 FF 6E oPÿn
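Strictly speaking, FF lies outside 7 bit ASCII. The ÿ in the tables matches what an 8 bit code page such as ISO 8859-1 (Latin-1) assigns to FF. Assuming that code page, the rows can be reproduced with the sketch below; `decode_latin1` is a hypothetical helper name.

```python
# The four table rows, written most significant byte first.
rows = {
    "DCBA": "FF 6E 6F 50",
    "ABCD": "50 6F 6E FF",
    "CDAB": "6E FF 50 6F",
    "BADC": "6F 50 FF 6E",
}


def decode_latin1(hex_row: str) -> str:
    """Map every byte of the row to the Latin-1 character with that value."""
    return bytes.fromhex(hex_row).decode("latin-1")


for order, hex_row in rows.items():
    print(order, decode_latin1(hex_row))

# A strict ASCII decoder rejects FF instead of printing a character.
try:
    bytes.fromhex("FF").decode("ascii")
except UnicodeDecodeError as error:
    print("not ASCII:", error.reason)
```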

UTF-8

utf-8 little endian (DCBA)

FF is invalid. The byte FF never occurs in well-formed UTF-8, so it is shown as the replacement character (�).

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
FF 6E 6F 50 �noP

utf-8 big endian (ABCD)

FF is invalid. The byte FF never occurs in well-formed UTF-8, so it is shown as the replacement character (�).

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
50 6F 6E FF Pon�

utf-8 mixed little endian (CDAB)

FF is invalid. The byte FF never occurs in well-formed UTF-8, so it is shown as the replacement character (�).

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6E FF 50 6F n�Po

utf-8 mixed big endian (BADC)

FF is invalid. The byte FF never occurs in well-formed UTF-8, so it is shown as the replacement character (�).

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6F 50 FF 6E oP�n
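The UTF-8 rows can be reproduced with Python's built-in decoder. This is a minimal sketch that assumes the lenient `errors="replace"` mode, which substitutes U+FFFD (�) for bytes that cannot be decoded; `decode_utf8` is a hypothetical helper name.

```python
# The four table rows, written most significant byte first.
rows = {
    "DCBA": "FF 6E 6F 50",
    "ABCD": "50 6F 6E FF",
    "CDAB": "6E FF 50 6F",
    "BADC": "6F 50 FF 6E",
}


def decode_utf8(hex_row: str) -> str:
    """Decode a row as UTF-8, replacing undecodable bytes with U+FFFD."""
    return bytes.fromhex(hex_row).decode("utf-8", errors="replace")


for order, hex_row in rows.items():
    print(order, decode_utf8(hex_row))
```

With the default strict mode, the same call raises `UnicodeDecodeError` at the FF byte instead of substituting a character.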

UTF-16

utf-16 little endian (DCBA)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
FF 6E 6F 50 ョ潐

utf-16 big endian (ABCD)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
50 6F 6E FF 偯滿

utf-16 mixed little endian (CDAB)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6E FF 50 6F 滿偯

utf-16 mixed big endian (BADC)

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6F 50 FF 6E 潐ョ
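For UTF-16, each pair of bytes forms one 16 bit code unit. The sketch below assumes that a table row, being written most significant byte first, can be read as two big endian code units; `decode_utf16_row` is a hypothetical helper name. It prints the code points rather than the glyphs, which avoids font differences (U+FF6E, for example, is the halfwidth form of ョ).

```python
# The four table rows, written most significant byte first.
rows = {
    "DCBA": "FF 6E 6F 50",
    "ABCD": "50 6F 6E FF",
    "CDAB": "6E FF 50 6F",
    "BADC": "6F 50 FF 6E",
}


def decode_utf16_row(hex_row: str) -> str:
    """Read the row as two big endian 16 bit code units."""
    return bytes.fromhex(hex_row).decode("utf-16-be")


for order, hex_row in rows.items():
    text = decode_utf16_row(hex_row)
    print(order, " ".join(f"U+{ord(ch):04X}" for ch in text))
```

None of the four code units falls into the surrogate range D800 to DFFF, so every row decodes to two ordinary characters.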

UTF-32

utf-32 little endian (DCBA)

It’s an invalid char because the maximum valid value is 00 10 FF FF (Plane 16).

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
FF 6E 6F 50

utf-32 big endian (ABCD)

It’s an invalid char because the maximum valid value is 00 10 FF FF (Plane 16).

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
50 6F 6E FF

utf-32 mixed little endian (CDAB)

It’s an invalid char because the maximum valid value is 00 10 FF FF (Plane 16).

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6E FF 50 6F

utf-32 mixed big endian (BADC)

It’s an invalid char because the maximum valid value is 00 10 FF FF (Plane 16).

Raw data
50 6F 6E FF
Byte 3 Byte 2 Byte 1 Byte 0 Outcome
6F 50 FF 6E
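The validity rule for UTF-32 boils down to a range check. Here is a minimal sketch; `MAX_CODE_POINT` and `is_valid_utf32` are names invented for this example, and the extra surrogate check is an addition that does not matter for this particular dump.

```python
MAX_CODE_POINT = 0x10FFFF  # 00 10 FF FF, the last code point of plane 16

# The four table rows, written most significant byte first.
rows = {
    "DCBA": "FF 6E 6F 50",
    "ABCD": "50 6F 6E FF",
    "CDAB": "6E FF 50 6F",
    "BADC": "6F 50 FF 6E",
}


def is_valid_utf32(hex_row: str) -> bool:
    """Check whether the 32 bit value of a row is a legal Unicode scalar value."""
    value = int.from_bytes(bytes.fromhex(hex_row), byteorder="big")
    is_surrogate = 0xD800 <= value <= 0xDFFF
    return value <= MAX_CODE_POINT and not is_surrogate


for order, hex_row in rows.items():
    print(order, hex_row, "valid" if is_valid_utf32(hex_row) else "invalid")
```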

Summary

Looking at all this data, you need to know the byte order of the source and the byte order of the destination it has to be swapped to. To interpret the data correctly, you must also know what it represents.

Normally this is easy, since most processor architectures use either little endian or big endian (check the picture below). For 64 bit this continues in exactly the same way; only the middle or mixed endian orders are less clear-cut, especially for 64 bit.

32 bit little ↔ big endian swap
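A 32 bit little ↔ big endian swap can be sketched with shifts and masks. The function names `swap32` and `swap_words` are hypothetical, and the code assumes the value fits into 32 bits. The word swap is included because the mixed orders can be reached by combining it with the full byte swap.

```python
def swap32(value: int) -> int:
    """Reverse the four bytes of a 32 bit value (ABCD <-> DCBA)."""
    return (
        ((value & 0x000000FF) << 24)
        | ((value & 0x0000FF00) << 8)
        | ((value & 0x00FF0000) >> 8)
        | ((value & 0xFF000000) >> 24)
    )


def swap_words(value: int) -> int:
    """Exchange the two 16 bit halves of a 32 bit value (ABCD <-> CDAB)."""
    return ((value & 0x0000FFFF) << 16) | ((value >> 16) & 0x0000FFFF)


big = 0x506F6EFF
print(f"{swap32(big):08X}")              # byte swap
print(f"{swap_words(big):08X}")          # word swap
print(f"{swap_words(swap32(big)):08X}")  # byte swap, then word swap
```

Applying either function twice returns the original value, so the same routine works in both directions.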