By engaging with this demonstration, you should be able to:
The contents of this demo are available in text and video form. You may watch the videos, or read the text, and you will cover the same information. The videos are about 40 minutes of content with short, periodic checks on your understanding.
The radio on the Feather M0+ LoRa boards used in MAE 4220 takes data in bytes. You could feed the radio bytes manually, but keeping track of what those bytes mean quickly becomes tedious. We use C++ structures and datatypes to allow you to think more about high level values such as temperatures and times, and less about whether data bytes are in the correct order. The Things Network Community Edition (the internet site that your data are sent to) also just receives bytes. Understanding the basics of how different kinds of numbers are represented as bytes allows you to decode a list of bytes into high level values such as temperatures and times.
This is a short introduction to understanding and converting between the binary, hexadecimal and decimal number systems. If you understand this already, great! Otherwise, read on!
In the internet of things, we send data up through the internet and to do this we manipulate and read small amounts of data. In C++ we do this by manipulating data in structs. The data that we manipulate are often shown in a form that we are not used to, Binary or Hexadecimal. Sometimes, they are shown in decimal (what we’re used to).
Decimal is also known as Base 10, meaning we have 10 distinct digits, 0 through 9. A position counts up through these digits, and when it passes the highest digit, 9, it resets to 0 and the next position to the left increments by 1. So when the ones place increments past 9, the ones place resets to 0 and the tens place increments from 0 to 1.
000
001
002
…
009
010
This pattern continues in subsequent positions such as the hundreds place, where the hundreds place increments from 0 to 1 and the tens and ones places reset back to 0.
098
099
100
Another way to look at interpreting these numbers is to add each place up by the value of the number, times the base raised to the place’s power.
934
9 * 10^2 = 900 +
3 * 10^1 = 30 +
4 * 10^0 = 4
= 934 in decimal
This example is indeed redundant because we have converted from decimal back to decimal. However, it is important in understanding how other base systems work.
Binary
In binary, instead of ten distinct numbers, we only have two: 0 and 1. Now, it only takes one increment before a place gets filled up and increments the next place so we count from 0 to 5 in this fashion.
Binary and Decimal representations of counting up.
0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5
The same logic applies here for understanding the meaning of Binary numbers, for instance 10110
1 * 2^(4) = 16
0 * 2^(3) = 0
1 * 2^(2) = 4
1 * 2^(1) = 2
0 * 2^(0) = 0
= 22 in Decimal
Hexadecimal
There are many other number systems that are also very important, particularly Hexadecimal, which has 16 distinct “numbers”.
We don’t have 16 distinct numerals, so we use letters to represent the rest of them. So, the “numbers” are 0-9 and then A-F.
0 = 0
…
9 = 9
A = 10
B = 11
…
F = 15
We use the same logic to understand these numbers and convert them into decimal.
0x0A2D to Decimal
A * 16^(2) = 10 * 16^2 = 2560
2 * 16^(1) = 2 * 16^1 = 32
D * 16^(0) = 13 * 16^0 = 13
= 2605 in Decimal
While we don’t often use any other bases, theoretically one could use a different base.
Base 7
2601 would equal 981 in decimal
You are free to use converters! This is tedious to do by hand.
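If you would rather let code do the conversion, the place-value rule above can be sketched in a few lines of plain C++. The helper names `digitValue` and `parseInBase` are made up for this illustration and are not part of any library used in this demo.

```cpp
#include <assert.h>
#include <stdint.h>

// Value of one digit character: '0'-'9' -> 0-9, 'A'-'F' or 'a'-'f' -> 10-15.
// Returns -1 for any other character.
int digitValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: interpret a digit string in the given base (2 to 16).
// Walking left to right, multiply the running total by the base and add the
// next digit, which gives the same result as summing digit * base^place.
// Returns 0 if a character is not a valid digit in that base.
uint32_t parseInBase(const char* digits, uint32_t base) {
    assert(base >= 2 && base <= 16);
    uint32_t total = 0;
    for (const char* p = digits; *p != '\0'; ++p) {
        int d = digitValue(*p);
        if (d < 0 || (uint32_t)d >= base) {
            return 0;  // not a digit of this base
        }
        total = total * base + (uint32_t)d;
    }
    return total;
}
```

With this sketch, `parseInBase("10110", 2)` gives 22, `parseInBase("0A2D", 16)` gives 2605, and `parseInBase("2601", 7)` gives 981, matching the hand calculations above.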
The smallest amount of data something can have on a computer is a bit which either has the value 0 or 1. In C++, the smallest directly-changeable unit of memory is a byte. All bytes are a collection of 8 bits. The table below represents a byte with the eight bit positions (7 through 0) labeled.
1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
We are going to read this as a number in binary and convert it to decimal. As a convention, bit N of an unsigned integer like the one above adds 2^N to the byte’s value when that bit is 1. So, we get the value in base 10 through the addition pattern introduced above.
$10010001_2 = 2^7 + 2^4 + 2^0 = 145_{10}$
The subscripts notate the base.
Signed vs Unsigned terminology
This convention only works for positive integers so how do we represent negative integers?
This is where we use “signed” integers. The convention for bit contributions is the same, except that the most significant bit (bit N) contributes -2^N instead of +2^N. So, if we treated that byte as a signed integer, we would instead perform the following addition:
$10010001_2 = -2^7 + 2^4 + 2^0 = -111_{10}$
That convention, called “Two’s Complement,” makes addition and subtraction in
hardware work out nicely.
In these tutorials, simply pay attention to whether
bits are supposed to represent a signed or unsigned integer. The bits are simple
to reinterpret in C++ code, and binary-decimal converters often give both a signed
and an unsigned conversion.
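As a rough illustration of the two readings, the sketch below sums the bit weights by hand and then lets the compiler reinterpret the same byte. The function names are invented for this example. Copying the byte into an `int8_t` with `memcpy` is used here because the fixed-width signed types are specified as two's complement.

```cpp
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Read a byte as an unsigned integer: bit N adds 2^N when it is set.
int unsignedValue(uint8_t bits) {
    int total = 0;
    for (int n = 0; n < 8; ++n) {
        if (bits & (1u << n)) {
            total += (1 << n);
        }
    }
    return total;
}

// Read the same bits as two's complement: bit 7 contributes -2^7.
int signedValue(uint8_t bits) {
    // Bits 6..0 keep their positive weights.
    int total = unsignedValue(static_cast<uint8_t>(bits & 0x7F));
    if (bits & 0x80) {
        total -= 128;
    }
    return total;
}

// Let the compiler reinterpret the bits by copying them into an int8_t.
int8_t reinterpretAsSigned(uint8_t bits) {
    int8_t result;
    memcpy(&result, &bits, sizeof(result));
    assert(result == signedValue(bits));  // both routes should agree
    return result;
}
```

For the byte 10010001 (0x91), `unsignedValue` gives 145 while `signedValue` and `reinterpretAsSigned` give -111.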
The main way that we distinguish the two in code is by the type name: an int is signed and a uint is unsigned. You may also see data types like unsigned int in some libraries.
**For the rest of this demo, please assume that integers are unsigned unless stated otherwise.**
When we use the datatype uint8_t, this means an unsigned integer that is 8 bits (1 byte) large. We can also have a uint16_t, which is 16 bits (2 bytes) large.
Addresses
While we have seen bytes whose bits all represent a number, one can also decide that some of the bits in a byte mean something else, such as an address.
It is a design decision for us to assign a special meaning to a byte. For example, we have defined the following byte so that bits 7:5 mean a “Write Address” and bits 4:0 mean a “Write Value”.
1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
In that case, the write address is 4, and the write value is 17. Feel free to use a binary to decimal converter.
In the following byte, the upper four bits (7:4) are a “Write Address” and the rest (3:0) are the “Write Value.”
1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
In embedded coding, many bits of bytes have special meaning because of a design decision. You will often have to read documentation so that you can agree with library authors and hardware designers on what some bits mean. You will also often have the freedom to choose what you want to make some bits mean.
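One common way to pull such fields out of a byte is to shift the field down to bit 0 and mask off everything above it. The sketch below assumes the two layouts described above. `extractBits` and `packAddressValue` are names made up for this example, not functions from a library.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: return bits hi..lo (inclusive) of a byte as a number.
// Shifting right by lo drops the bits below the field, and the mask keeps
// only the (hi - lo + 1) bits that belong to it.
uint8_t extractBits(uint8_t byte, unsigned hi, unsigned lo) {
    assert(hi <= 7 && lo <= hi);
    unsigned width = hi - lo + 1;
    unsigned mask = (1u << width) - 1u;
    return (uint8_t)((byte >> lo) & mask);
}

// Hypothetical helper for the 3-bit address / 5-bit value layout:
// the address goes in bits 7:5 and the value in bits 4:0.
uint8_t packAddressValue(uint8_t address, uint8_t value) {
    assert(address < 8 && value < 32);
    return (uint8_t)((address << 5) | value);
}
```

For the byte 0x91, `extractBits(0x91, 7, 5)` is 4 and `extractBits(0x91, 4, 0)` is 17. Going the other way, `packAddressValue(4, 17)` rebuilds 0x91.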
We can also group bytes together. Below is a 16-bit integer formed by putting two bytes next to each other.
0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
As a result, we can represent very large numbers precisely by using a lot of bytes. In practice, plain numbers do not use more than four or eight bytes (32-bit or 64-bit numbers).
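In code, putting two bytes next to each other amounts to shifting the high byte up by 8 bits and filling the low 8 bits with the other byte. This is a minimal sketch. `makeU16` is a hypothetical name chosen for this example.

```cpp
#include <stdint.h>

// Place the high byte in bits 15:8 and the low byte in bits 7:0.
constexpr uint16_t makeU16(uint8_t highByte, uint8_t lowByte) {
    return (uint16_t)(((uint16_t)highByte << 8) | lowByte);
}

// 0100_1111 is 0x4F and 1001_0001 is 0x91, so together they read 0x4F91.
static_assert(makeU16(0x4F, 0x91) == 0x4F91, "two bytes form one 16-bit value");
static_assert(makeU16(0x4F, 0x91) == 20369, "the same value written in decimal");
```

The `static_assert` lines are checked by the compiler, so a wrong expectation would show up as a compile error rather than at run time.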
1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Address | Value |
I want to write a value of 24 to address 4. What byte do I have to write?
We can also use base 16, or hexadecimal notation, to represent byte values. Hexadecimal uses 0-9 and then A-F. Each character represents four bits. Notate hexadecimal numbers as 0x############ for as many bits as you want to represent. The following table lists binary to hex conversions.
Binary | Hex | Binary | Hex | Binary | Hex | Binary | Hex |
---|---|---|---|---|---|---|---|
0000 | 0x00 | 0100 | 0x04 | 1000 | 0x08 | 1100 | 0x0C |
0001 | 0x01 | 0101 | 0x05 | 1001 | 0x09 | 1101 | 0x0D |
0010 | 0x02 | 0110 | 0x06 | 1010 | 0x0A | 1110 | 0x0E |
0011 | 0x03 | 0111 | 0x07 | 1011 | 0x0B | 1111 | 0x0F |
For example, the byte 0b10010001 used before has the following conversion to hex:
1001_0001 = 0x91
We can use _ to make long strings of bits more readable.
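The nibble-by-nibble conversion in the table can be sketched as below. The helper names are invented for this example. Note that in C++ source code (C++14 and later) binary literals use a `0b` prefix and the digit separator is `'` rather than `_`.

```cpp
#include <stdint.h>

// Upper and lower four bits (nibbles) of a byte.
constexpr uint8_t highNibble(uint8_t byte) { return (uint8_t)(byte >> 4); }
constexpr uint8_t lowNibble(uint8_t byte)  { return (uint8_t)(byte & 0x0F); }

// Hex character for a 4-bit value, following the binary-to-hex table.
constexpr char hexDigit(uint8_t nibble) {
    return "0123456789ABCDEF"[nibble & 0x0F];
}

// The byte from the text, written as a binary literal and as a hex literal.
static_assert(0b1001'0001 == 0x91, "binary and hex spell the same byte");
static_assert(hexDigit(highNibble(0x91)) == '9', "upper nibble 1001 is 9");
static_assert(hexDigit(lowNibble(0x91)) == '1', "lower nibble 0001 is 1");
```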
Bytes are stored at memory addresses. The following table lists a few bytes and their addresses. (completely arbitrary)
Address | Value |
---|---|
0x00001006 | 0x00 |
0x00001005 | 0xFA |
0x00001004 | 0xCE |
0x00001003 | 0xB0 |
0x00001002 | 0x0C |
0x00001001 | 0x4F |
0x00001000 | 0x91 |
C++ gives us the power to interpret bytes as data. Bytes could be a multi-byte integer such as 0x4F91 that appeared earlier in this demonstration.
Address | Value |
---|---|
0x00001006 | 0x00 |
0x00001005 | 0xFA |
0x00001004 | 0xCE |
0x00001003 | 0xB0 |
0x00001002 | 0x0C |
0x00001001 | 0x4F |
0x00001000 | 0x91 |
On the Arduino Uno and the Feather M0+ LoRa, multi-byte values are stored in little-endian order, meaning that less significant bytes go into lower memory addresses. The low byte 0x91 sits at 0x00001000 and the high byte 0x4F at 0x00001001, so we can say that the 16-bit integer 0x4F91 is at memory address 0x00001000.
To check your understanding, what 32-bit integer is at memory address 0x00001000?
I want to store 0x12345678 at address 0x00001001. Please fill in the correct bytes. (feel free to fill in text on a spreadsheet for this one)
Address | Value |
---|---|
0x00001006 | 0x?? |
0x00001005 | 0x?? |
0x00001004 | 0x12 |
0x00001003 | 0x34 |
0x00001002 | 0x56 |
0x00001001 | 0x78 |
0x00001000 | 0x?? |
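To see the little-endian layout independently of any particular processor, one can split a value into bytes by hand. The sketch below is only an illustration. `storeLE32` is a made-up helper, and the array stands in for the seven memory addresses in the table.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: write a 32-bit value into four consecutive bytes in
// little-endian order, least significant byte at the lowest index.
void storeLE32(uint32_t value, uint8_t* dest) {
    assert(dest != nullptr);
    dest[0] = (uint8_t)(value & 0xFF);          // bits 7:0
    dest[1] = (uint8_t)((value >> 8) & 0xFF);   // bits 15:8
    dest[2] = (uint8_t)((value >> 16) & 0xFF);  // bits 23:16
    dest[3] = (uint8_t)((value >> 24) & 0xFF);  // bits 31:24
}
```

If `mem[0]` plays the role of address 0x00001000, then `storeLE32(0x12345678, &mem[1])` leaves 0x78, 0x56, 0x34, 0x12 in `mem[1]` through `mem[4]`, the same order as in the table.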
Bytes could also hold a custom data structure. Those bytes could mean packet 0x4F of 0x91, data payload 0xFACEB00C.
Key Recipe
In order to spare us from undefined program behavior, we will follow a recipe for custom data structures. To load, pack, or encode a structure, we will load values into a specially identified C++ structure, and then copy it into its destination. To unload, unpack, or decode a structure, we will assemble larger values by copying bytes to appropriate places. This demonstration will give several examples of packing and unpacking structures. The video demonstrates the Arduino IDE, but the examples should work in other Arduino compilers such as TinkerCad.
Recall that we talked about a group of bytes at address 0x00001000 that meant packet 0x4F of 0x91, data payload 0xFACEB00C. We can define a data structure, notated as a struct in C++, to assist us in loading data.
struct __attribute__((__packed__)) pkt_fmt{
  uint8_t totalPackets;
  uint8_t packetNumber;
  uint8_t payload[4];
};
void setup(){
  pkt_fmt mypkt = {
    .totalPackets = 0x91,
    .packetNumber = 0x4F,
    .payload = {0x0C, 0xB0, 0xCE, 0xFA}
  };
  Serial.begin(115200);
  while(!Serial);//You have to open Serial Monitor with Feather M0+.
  Serial.print("Packet number ");Serial.print(mypkt.packetNumber);
  Serial.print(" of "); Serial.println(mypkt.totalPackets);
}
void loop(){}
Let us unpack what went on in that code. We define struct pkt_fmt to have three members.
totalPackets is an 8-bit unsigned integer.
Aside: uint8_t is available with #include <cstdint>, which Arduino includes automatically in your main sketch file. int8_t is an 8-bit signed integer, and uint32_t is a 32-bit unsigned integer. On Feather M0+, standard ints are signed or unsigned and use 8, 16, 32, or 64 bits.
packetNumber is another 8-bit unsigned integer.
payload is an array of 4 uint8_ts. The [] tell the compiler to reserve 4 consecutive bytes in the struct for that member. We will talk more about arrays soon.
__attribute__((__packed__)) is required on the Feather M0+ to tell the compiler to pack the struct members densely. Otherwise, the compiler adds padding which makes the actual struct bytes not match up with our intuition.
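One way to see the effect of packing is to compare sizes at compile time. The two structs below are made up for this illustration and use the GCC/Clang attribute syntax shown above. The exact size of the unpacked struct depends on the platform, so the sketch only checks a lower bound for it.

```cpp
#include <stddef.h>
#include <stdint.h>

// Same members, without and with the packed attribute.
struct UnpackedExample {
    uint8_t  flag;
    uint32_t count;
};

struct __attribute__((__packed__)) PackedExample {
    uint8_t  flag;
    uint32_t count;
};

// Packed: 1 + 4 bytes with nothing in between.
static_assert(sizeof(PackedExample) == 5, "packed struct has no padding");
static_assert(offsetof(PackedExample, count) == 1, "count follows flag directly");

// Unpacked: the compiler is free to insert padding so that count is aligned.
static_assert(sizeof(UnpackedExample) >= sizeof(PackedExample),
              "padding can only make the struct larger");
```

On many 32-bit and 64-bit platforms the unpacked version comes out at 8 bytes, with 3 padding bytes after `flag`, but that figure is typical rather than guaranteed.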
To check your understanding:
Please define a struct with the following members:
- 8-bit signed integer, “armSetVeloc”
- 32-bit signed integer, “armPosition”
- Data array of 12 bytes, “data”
An answer is below:
struct __attribute__((__packed__)) pkt_fmt{
  int8_t armSetVeloc;
  int32_t armPosition;
  uint8_t data[12];
};
You can get the memory address of a piece of data using the & operator. Memory addresses are useful for copying data. There is a function called memcpy that allows you to copy a desired number of bytes from one memory address to another (in the same order). Let’s try copying our pkt_fmt struct to a byte array of the same size.
struct __attribute__((__packed__)) pkt_fmt{
  uint8_t totalPackets;
  uint8_t packetNumber;
  uint8_t payload[4];
};
void setup(){
  pkt_fmt mypkt = {
    .totalPackets = 0x91,
    .packetNumber = 0x4F,
    .payload = {0x0C, 0xB0, 0xCE, 0xFA}
  };
  Serial.begin(115200);
  while(!Serial);//You have to open Serial Monitor with Feather M0+.
  Serial.print("Packet number ");Serial.print(mypkt.packetNumber);
  Serial.print(" of "); Serial.println(mypkt.totalPackets);
  uint8_t mypkt2[sizeof(pkt_fmt)];//sizeof is a compile-time operation
  memcpy(mypkt2, &mypkt, sizeof(pkt_fmt));
}
void loop(){}
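Note that the name of an array, used this way, gives the address of its first element. That is why mypkt2 is passed to memcpy without &, while the struct mypkt needs one.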
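To check what the copy actually produces, the same idea can be tried outside of a sketch in plain C++. The version below is an illustration only. `packetToBytes` and `matchesAddressTable` are hypothetical names, and the struct is filled in positionally instead of with named members.

```cpp
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct __attribute__((__packed__)) pkt_fmt {
    uint8_t totalPackets;
    uint8_t packetNumber;
    uint8_t payload[4];
};

// Hypothetical helper: copy the struct's bytes, in memory order, into out.
// out must have room for sizeof(pkt_fmt) bytes.
void packetToBytes(const pkt_fmt& pkt, uint8_t* out) {
    assert(out != nullptr);
    memcpy(out, &pkt, sizeof(pkt_fmt));
}

// Compare the copy against the byte values listed in the address table.
bool matchesAddressTable() {
    pkt_fmt pkt = {0x91, 0x4F, {0x0C, 0xB0, 0xCE, 0xFA}};
    uint8_t bytes[sizeof(pkt_fmt)];
    packetToBytes(pkt, bytes);  // array name, so no & is needed

    const uint8_t expected[] = {0x91, 0x4F, 0x0C, 0xB0, 0xCE, 0xFA};
    return sizeof(bytes) == sizeof(expected)
        && memcmp(bytes, expected, sizeof(expected)) == 0;
}
```

Because every member here is a single byte, the result does not depend on byte order. `matchesAddressTable()` returns true when the copied bytes are 0x91, 0x4F, 0x0C, 0xB0, 0xCE, 0xFA from lowest to highest address.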
How do we go from an array of bytes to a packed struct? The 99%, works-on-most-compilers method is to use memcpy. However, we will decode an array of bytes by extracting the desired values byte by byte. Observe this new example:
struct __attribute__((__packed__)) pkt_fmt{
  uint32_t totalPackets;
  uint8_t packetNumber;
  uint16_t illumination;
};
struct decodedObject {
  uint32_t totalPackets;
  uint8_t packetNumber;
  uint16_t illumination;
};
void setup(){
  pkt_fmt mypkt = {
    .totalPackets = 0x91,
    .packetNumber = 0x4F,
    .illumination = 42
  };
  struct {
    uint8_t bytes[sizeof(pkt_fmt)];
  } input;
  memcpy(input.bytes, &mypkt, sizeof(pkt_fmt));
  decodedObject decoded;
  //Cast each byte to uint32_t before shifting so the shift is done in 32 bits,
  //even on boards such as the Uno where a plain int is only 16 bits.
  decoded.totalPackets = (uint32_t)input.bytes[0]
    +((uint32_t)input.bytes[1]<<8)
    +((uint32_t)input.bytes[2]<<16)
    +((uint32_t)input.bytes[3]<<24);
  decoded.packetNumber = input.bytes[4];
  decoded.illumination = input.bytes[5] + (input.bytes[6]<<8);
  Serial.begin(115200);
  while(!Serial);//You have to open Serial Monitor with Feather M0+.
  Serial.print("Illumination ");Serial.println(decoded.illumination);
}
void loop(){}
The left-shift operator byte<<N shifts an integer left by N bits. Below is an example of a left shift by 3 bits (N=3):
Before:
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
After:
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
We are able to assemble multi-byte integers by using bit shifting and addition because each byte, once shifted, occupies its own group of bit positions and every other bit is zero. No two terms can have a 1 in the same position, so the addition never carries and shift-and-add works.
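The shift from the table and the shift-and-add decode can both be checked in a few lines. This is a sketch under the assumption that the input bytes are in little-endian order. `loadLE32` and `loadLE32WithOr` are names invented for this example. Each byte is widened to `uint32_t` before shifting so that the shift is carried out in 32 bits.

```cpp
#include <stdint.h>

// The shift from the table: 0x0091 moved left by 3 bits becomes 0x0488.
static_assert((uint16_t)(0x0091 << 3) == 0x0488, "left shift by 3 bits");

// Hypothetical helper: assemble four little-endian bytes with shift-and-add.
constexpr uint32_t loadLE32(const uint8_t* bytes) {
    return  (uint32_t)bytes[0]
         + ((uint32_t)bytes[1] << 8)
         + ((uint32_t)bytes[2] << 16)
         + ((uint32_t)bytes[3] << 24);
}

// Because the shifted bytes never overlap, bitwise OR gives the same result.
constexpr uint32_t loadLE32WithOr(const uint8_t* bytes) {
    return  (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Given the payload bytes 0x0C, 0xB0, 0xCE, 0xFA in that order, both functions return 0xFACEB00C.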
Check for understanding
Below is a packet format.
struct __attribute__((__packed__)) pkt_fmt{
  uint16_t vBat;
  uint16_t vBus;
  uint16_t CO2;
};
Please write a snippet of code that decodes that struct. You may assume that your input is in input.bytes and that decoded has members of the same names and types as the pkt_fmt struct. A solution is below.
decoded.vBat = input.bytes[0] + (input.bytes[1]<<8);
decoded.vBus = input.bytes[2] + (input.bytes[3]<<8);
decoded.CO2 = input.bytes[4] + (input.bytes[5]<<8);
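To try the solution on some data, the three decode lines can be wrapped in a function and run in plain C++. Everything below is a sketch. `loadLE16` and `decodePacket` are hypothetical names, and any sample bytes you feed it are your own test data rather than real sensor readings.

```cpp
#include <assert.h>
#include <stdint.h>

struct decodedObject {
    uint16_t vBat;
    uint16_t vBus;
    uint16_t CO2;
};

// Read one little-endian 16-bit value starting at bytes[offset].
uint16_t loadLE16(const uint8_t* bytes, unsigned offset) {
    return (uint16_t)(bytes[offset] + ((uint16_t)bytes[offset + 1] << 8));
}

// Hypothetical wrapper around the three decode lines from the solution.
decodedObject decodePacket(const uint8_t* bytes) {
    assert(bytes != nullptr);
    decodedObject decoded;
    decoded.vBat = loadLE16(bytes, 0);
    decoded.vBus = loadLE16(bytes, 2);
    decoded.CO2  = loadLE16(bytes, 4);
    return decoded;
}
```

For example, the made-up input bytes 0x10, 0x0E, 0x88, 0x13, 0x90, 0x01 decode to vBat = 3600, vBus = 5000 and CO2 = 400.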