training-labs

Prelab 3a: Intro to Numbers and Structures in C++

Objective

By engaging with this demonstration, you should be able to:

break up multi-byte little-endian integers into bytes
copy bytes from one memory address to another
be aware of struct padding issues that arise on the Feather M0+

Videos and Text

The contents of this demo are available in text and video form. You may watch the videos, or read the text, and you will cover the same information. The videos are about 40 minutes of content with short, periodic checks on your understanding.

C++ Motivation

The radio on the Feather M0+ LoRa boards used in MAE 4220 takes data in bytes. You could feed the radio a long list of bytes by hand, but it quickly becomes tedious to keep track of what those bytes mean. We use C++ structures and datatypes so that you can think more about high-level values such as temperatures and times, and less about whether data bytes are in the correct order. The Things Network Community Edition (the internet site that your data are sent to) also just receives bytes. Understanding the basics of how different kinds of numbers are represented as bytes allows you to decode a list of bytes into high-level values such as temperatures and times.

Number Systems Introduction

This is a short introduction to understanding and converting between the binary, hexadecimal, and decimal number systems. If you understand this already, great! Otherwise, read on!

In the internet of things, we send data up through the internet, and to do this we read and manipulate small amounts of data. In C++ we do this by manipulating data in structs. The data are often shown in a form that we are not used to: binary or hexadecimal. Sometimes they are shown in decimal (what we’re used to).

Decimal is also known as base 10, meaning that each place holds one of 10 distinct digits, 0 through 9. A place counts up as the number gets higher, and when it passes the highest digit, 9, it resets to 0 and the next place to the left increments by 1. So when the ones place increments past 9, the ones place resets to 0 and the tens place increments from 0 to 1.
000
001
002
…
009
010

This pattern continues in subsequent positions such as the hundreds place where the hundreds place increments from 0 to 1 and the tens and ones place reset back to 0.
098
099
100

Another way to interpret these numbers is to multiply each digit by the base raised to the power of that digit’s place (counting places from 0 on the right), and then add up the results.
934

9 * 10^2 = 900 +
3 * 10^1 = 30 +
4 * 10^0 = 4
= 934 in decimal
This example is indeed redundant because we have converted from decimal back to decimal. However, it is important in understanding how other base systems work.

Binary

In binary, instead of ten distinct digits, we only have two: 0 and 1. Now it only takes one increment before a place fills up and carries into the next place, so we count from 0 to 5 in this fashion.

Binary and Decimal representations of counting up.
0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5

The same logic applies here for understanding the meaning of binary numbers, for instance 10110:
1 * 2^(4) = 16
0 * 2^(3) = 0
1 * 2^(2) = 4
1 * 2^(1) = 2
0 * 2^(0) = 0

= 22 in Decimal

Hexadecimal

There are many other number systems that are also very important, particularly hexadecimal, which has 16 distinct “numbers”. We don’t have 16 distinct numerals, so we use letters to represent the rest of them. So, the “numbers” are 0-9 and then A-F.
0 = 0
…
9 = 9
A = 10
B = 11
…
F = 15
We use the same logic to understand these numbers and convert them into decimal.
0x0A2D to Decimal

A * 16^(2) = 10 * 16^2 = 2560
2 * 16^(1) = 2 * 16^1 = 32
D * 16^(0) = 13 * 16^0 = 13
= 2605 in Decimal

While we don’t often use any other bases, theoretically one could use a different base.
Base 7
2601 would equal 981 in decimal

You are free to use converters! This is tedious to do by hand.
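
The place-value rule used in the worked examples above can also be written as a short loop. The following is a minimal sketch and not part of the lab code: `digitValue` and `toDecimal` are hypothetical helper names, the digit string is assumed to contain no `0x` prefix, and the code is plain C++ so it can be tried on a desktop compiler as well.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: numeric value of one digit character.
// '0'-'9' map to 0-9 and 'A'-'F' (either case) map to 10-15; anything else gives -1.
int digitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Hypothetical helper: read a digit string written in `base` (2 to 16).
// Multiplying the running total by the base before adding each new digit
// gives the same result as summing digit * base^place.
uint32_t toDecimal(const char* digits, int base) {
  assert(base >= 2 && base <= 16);
  uint32_t total = 0;
  for (const char* p = digits; *p != '\0'; ++p) {
    int d = digitValue(*p);
    assert(d >= 0 && d < base);  // every character must be a valid digit for this base
    total = total * static_cast<uint32_t>(base) + static_cast<uint32_t>(d);
  }
  return total;
}
```

With these helpers, `toDecimal("10110", 2)` gives 22, `toDecimal("0A2D", 16)` gives 2605, and `toDecimal("2601", 7)` gives 981, matching the worked examples.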

Numbers in C++

The smallest unit of data on a computer is a bit, which has either the value 0 or the value 1. In C++, the smallest directly-changeable unit of memory is a byte. Each byte is a collection of 8 bits. The table below represents a byte with the eight bit positions (7 through 0) labeled.

1	0	0	1	0	0	0	1
7	6	5	4	3	2	1	0

We are going to read this as a number in binary and convert it to decimal. As a convention, bit N of an unsigned integer like the one above adds 2^N to the value when that bit is 1. So, we get the value in base 10 through the same addition pattern introduced above.

10010001₂ = 2⁷ + 2⁴ + 2⁰ = 145₁₀

The subscripts notate the base.

Signed vs Unsigned terminology

This convention only works for non-negative integers, so how do we represent negative integers?

This is where we use “signed” integers. The convention for bit contributions is the same, except that the most significant bit (bit N, the highest-numbered bit) contributes -2^N. So, if we treated that byte as a signed integer, we would instead perform the following addition:

10010001₂ = -2⁷ + 2⁴ + 2⁰ = -111₁₀

That convention, called “Two’s Complement,” makes addition and subtraction in hardware work out nicely.
In these tutorials, simply pay attention to whether bits are supposed to represent a signed or unsigned integer. The bits are simple to reinterpret in C++ code, and binary-decimal converters often give both a signed and an unsigned conversion.
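
To make the two readings concrete, here is a minimal sketch that evaluates the same 8-bit pattern under both conventions. `asUnsigned` and `asSigned` are hypothetical names chosen for illustration. They spell out the arithmetic rather than relying on a cast, so the result does not depend on the compiler.

```cpp
#include <cstdint>

// Hypothetical helper: value of an 8-bit pattern under the unsigned convention,
// where every set bit N contributes +2^N.
constexpr int asUnsigned(uint8_t bits) {
  return bits;
}

// Hypothetical helper: value of the same pattern under two's complement.
// Bit 7 contributes -128 instead of +128, so subtract 256 when it is set.
constexpr int asSigned(uint8_t bits) {
  return (bits & 0x80) ? static_cast<int>(bits) - 256 : static_cast<int>(bits);
}

static_assert(asUnsigned(0x91) == 145, "0b10010001 read as unsigned");
static_assert(asSigned(0x91) == -111, "0b10010001 read as signed");
```

In sketch code the same reinterpretation is usually written as a cast such as `static_cast<int8_t>(value)`. That cast behaves this way on GCC, but the language only guarantees it from C++20 onward.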

The main way that we distinguish the two in code is by the type name: a type starting with int, such as int8_t, is signed, and a type starting with uint, such as uint8_t, is unsigned. You may also see data types like unsigned int in some libraries.

**For the rest of this demo, please assume that integers are unsigned unless stated otherwise.**

The datatype uint8_t is an unsigned integer that is 8 bits (1 byte) large. There is also uint16_t, which is 16 bits (2 bytes) large.

Addresses

While we have seen bytes whose bits together represent a number, one can also decide that some of the bits in a byte mean something else, such as an address.

It is a design decision for us to assign a special meaning to a byte. For example, we have defined the following byte so that bits 7:5 mean a “Write Address” and bits 4:0 mean a “Write Value”.

1	0	0	1	0	0	0	1
7	6	5	4	3	2	1	0

In that case, the write address is 4, and the write value is 17. Feel free to use a binary to decimal converter.

In the following byte, the first four bits are a “Write Address” and the rest are the “Write Value.”

1	0	0	1	0	0	0	1
7	6	5	4	3	2	1	0

With that layout, the write address is 9 (1001₂) and the write value is 1 (0001₂).

In embedded coding, many bits of bytes have special meaning because of a design decision. You will often have to read documentation so that you can agree with library authors and hardware designers on what some bits mean. You will also often have the freedom to choose what you want to make some bits mean.
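
Pulling fields like these out of a byte is usually done with a shift and a mask. The sketch below assumes the first layout described above (bits 7:5 are the write address, bits 4:0 are the write value). The function names are hypothetical.

```cpp
#include <cstdint>

// Assumed layout (first example above): bits 7:5 = write address, bits 4:0 = write value.
constexpr uint8_t writeAddress(uint8_t b) {
  return static_cast<uint8_t>(b >> 5);    // slide bits 7:5 down to positions 2:0
}

constexpr uint8_t writeValue(uint8_t b) {
  return static_cast<uint8_t>(b & 0x1F);  // mask 0b00011111 keeps only bits 4:0
}

static_assert(writeAddress(0x91) == 4, "bits 7:5 of 0b10010001 are 100");
static_assert(writeValue(0x91) == 17, "bits 4:0 of 0b10010001 are 10001");
```

A different layout only changes the shift distance and the mask. That is why the documentation for a chip or library has to state which bits belong to which field.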

We can also group bytes together. Below is a 16-bit integer formed by putting two bytes next to each other.

0	1	0	0	1	1	1	1	1	0	0	1	0	0	0	1
15	14	13	12	11	10	9	8	7	6	5	4	3	2	1	0

As a result, we can represent very large numbers precisely by using a lot of bytes. In practice, plain numbers do not use more than four or eight bytes (32-bit or 64-bit numbers).

Check for understanding

1	0	0	1	0	0	0	1
7	6	5	4	3	2	1	0
Address			Value

I want to write a value of 24 to address 4. What byte do I have to write?

Answer

I have to write the value 152. The address of 4 is 100₂. The value 24 is 11000₂. So, combined, the byte is 10011000₂, which is 152₁₀.
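
The same answer can be computed in code by shifting the address into bits 7:5 and combining it with the value. This is a minimal sketch for the 3-bit address, 5-bit value layout of this exercise. `packWrite` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: build one byte with the address in bits 7:5 and the
// value in bits 4:0. The masks discard any bits that do not fit in a field.
constexpr uint8_t packWrite(uint8_t address, uint8_t value) {
  return static_cast<uint8_t>(((address & 0x07) << 5) | (value & 0x1F));
}

static_assert(packWrite(4, 24) == 152, "address 4, value 24 -> 0b10011000");
```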

Hex notation and memory addresses

We can also use base 16, or hexadecimal notation, to represent byte values. Hexadecimal uses 0-9 and then A-F. Each character represents four bits, so one byte is two hexadecimal characters. Write hexadecimal numbers with a 0x prefix followed by as many hexadecimal characters as you need, such as 0x91 or 0x00001000. The following table lists binary to hex conversions.

Binary	Hex	Binary	Hex	Binary	Hex	Binary	Hex
0000	0x00	0100	0x04	1000	0x08	1100	0x0C
0001	0x01	0101	0x05	1001	0x09	1101	0x0D
0010	0x02	0110	0x06	1010	0x0A	1110	0x0E
0011	0x03	0111	0x07	1011	0x0B	1111	0x0F

For example, the byte 0b10010001 used before has the following conversion to hex:

1001_0001 = 0x91

We can use _ to make long strings of bits more readable.
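
C++ accepts these notations directly in source code, as the minimal sketch below shows. Note that the `_` separator above is a writing convention only and is not valid inside a C++ literal. C++14 and later accept an apostrophe instead, as in `0b1001'0001`. The constant and function names are hypothetical.

```cpp
#include <cstdint>

// The same byte written three ways. The 0b prefix is standard since C++14
// and is also accepted as an extension by GCC in earlier modes.
constexpr uint8_t fromBinary  = 0b10010001;
constexpr uint8_t fromHex     = 0x91;
constexpr uint8_t fromDecimal = 145;

static_assert(fromBinary == fromHex && fromHex == fromDecimal,
              "same value, different notation");

// Hypothetical helpers: split a byte into its two hexadecimal characters.
// Each character covers four bits.
constexpr uint8_t highNibble(uint8_t b) { return static_cast<uint8_t>(b >> 4); }    // 0x91 -> 0x9
constexpr uint8_t lowNibble(uint8_t b)  { return static_cast<uint8_t>(b & 0x0F); }  // 0x91 -> 0x1
```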

Bytes are stored at memory addresses. The following table lists a few bytes and their addresses. (completely arbitrary)

Address	Value
0x00001006	0x00
0x00001005	0xFA
0x00001004	0xCE
0x00001003	0xB0
0x00001002	0x0C
0x00001001	0x4F
0x00001000	0x91

C++ gives us the power to interpret bytes as data. Bytes could be a multi-byte integer such as 0x4F91, which appeared earlier in this demonstration.

On the Arduino Uno and the Feather M0+ LoRa, multi-byte values are stored in little-endian order, meaning that less significant bytes go into lower memory addresses. The address of a multi-byte value is the address of its lowest byte, so we can say that the 16-bit integer 0x4F91 is at memory address 0x00001000.

To check your understanding, what 32-bit integer is at memory address 0x00001000?

Answer

The integer 0xB00C4F91 is at address 0x00001000.

I want to store 0x12345678 at address 0x00001001. Please fill in the correct bytes. (feel free to fill in text on a spreadsheet for this one)

Answer

Address	Value
0x00001006	0x??
0x00001005	0x??
0x00001004	0x12
0x00001003	0x34
0x00001002	0x56
0x00001001	0x78
0x00001000	0x??

We don't care about the value of the 0x?? bytes.
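
Both little-endian exercises above can be reproduced with a byte array standing in for memory. This is a minimal sketch, not the lab's method: `storeLittleEndian32` and `loadLittleEndian32` are hypothetical helpers that place and read the bytes explicitly, so they behave the same on any processor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: write a 32-bit value into memory in little-endian order,
// least significant byte first (at the lowest address).
void storeLittleEndian32(uint32_t value, uint8_t* dest) {
  assert(dest != nullptr);
  for (int i = 0; i < 4; ++i) {
    dest[i] = static_cast<uint8_t>(value >> (8 * i));  // the cast keeps only the low 8 bits
  }
}

// Hypothetical helper: read four consecutive bytes back as one 32-bit value.
uint32_t loadLittleEndian32(const uint8_t* src) {
  assert(src != nullptr);
  uint32_t value = 0;
  for (int i = 0; i < 4; ++i) {
    value |= static_cast<uint32_t>(src[i]) << (8 * i);
  }
  return value;
}
```

If a seven-byte array holds the table values starting at address 0x00001000, loading from element 0 gives 0xB00C4F91. Storing 0x12345678 at element 1 writes 0x78, 0x56, 0x34, 0x12 into elements 1 through 4 and leaves the other bytes untouched.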

Bytes could also hold a custom data structure. Those bytes could mean packet 0x4F of 0x91, data payload 0xFACEB00C.

Working with bytes in C++

Key Recipe

In order to spare us from undefined program behavior, we will follow a recipe for custom data structures. To load, pack, or encode a structure, we will load values into a specially identified C++ structure, and then copy it into its destination. To unload, unpack, or decode a structure, we will assemble larger values by copying bytes to appropriate places. This demonstration will give several examples of packing and unpacking structures. The video demonstrates the Arduino IDE, but the examples should work in other Arduino compilers such as Tinkercad.

Recall that we talked about a group of bytes at address 0x00001000 that meant packet 0x4F of 0x91, data payload 0xFACEB00C. We can define a data structure, notated as a struct in C++, to assist us in loading data.

struct __attribute__((__packed__)) pkt_fmt{
  uint8_t totalPackets;
  uint8_t packetNumber;
  uint8_t payload[4];
};

void setup(){
  pkt_fmt mypkt = {
    .totalPackets = 0x91,
    .packetNumber = 0x4F,
    .payload = {0x0C, 0xB0, 0xCE, 0xFA}
  };

  Serial.begin(115200);
  while(!Serial);//You have to open Serial Monitor with Feather M0+.

  Serial.print("Packet number ");Serial.print(mypkt.packetNumber);
    Serial.print(" of "); Serial.println(mypkt.totalPackets);
}
void loop(){}

Let us unpack what went on in that code. We define struct pkt_fmt to have three members. totalPackets is an 8-bit unsigned integer.

Aside: uint8_t is available with #include <cstdint>, which Arduino includes automatically in your main sketch file. int8_t is an 8-bit signed integer, and uint32_t is a 32-bit unsigned integer. On Feather M0+, these fixed-width integer types come in signed and unsigned versions that use 8, 16, 32, or 64 bits.

packetNumber is another 8-bit unsigned integer. payload is an array of 4 uint8_ts. The [] tell the compiler to reserve 4 consecutive bytes in the struct for that member. We will talk more about arrays soon.

__attribute__((__packed__)) is required on the Feather M0+ to tell the compiler to pack the struct members densely. Otherwise, the compiler adds padding which makes the actual struct bytes not match up with our intuition.
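
The effect of padding is easiest to see by comparing sizes. The sketch below is an illustration with hypothetical struct names. It uses members of mixed sizes because padding only appears when a member needs alignment, which the all-`uint8_t` struct above never does. It assumes a GCC-compatible compiler, since `__attribute__((__packed__))` is a compiler extension.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical structs with identical members; only the packed attribute differs.
struct PaddedExample {
  uint8_t  flag;
  uint32_t counter;
  uint16_t reading;
};

struct __attribute__((__packed__)) PackedExample {
  uint8_t  flag;
  uint32_t counter;
  uint16_t reading;
};

// Packed: the members sit back to back, 1 + 4 + 2 = 7 bytes.
static_assert(sizeof(PackedExample) == 7, "packed struct has no padding");
static_assert(offsetof(PackedExample, counter) == 1, "counter directly follows flag");

// The unpacked layout is chosen by the compiler; it is never smaller than the packed one.
static_assert(sizeof(PaddedExample) >= sizeof(PackedExample), "padding only adds bytes");
```

On compilers that align `uint32_t` to 4 bytes, `PaddedExample` typically occupies 12 bytes, so copying it byte for byte would send 5 bytes that carry no data and would shift every field after `flag`.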

To check your understanding:

Please define a struct with the following members:

8-bit signed integer, “armSetVeloc”

32-bit signed integer, “armPosition”

Data array of 12 bytes, “data”

An answer is below:

Solution

struct __attribute__((__packed__)) pkt_fmt{
   int8_t armSetVeloc;
   int32_t armPosition;
   uint8_t data[12];
};

You can get the memory address of a piece of data using the & operator. Memory addresses are useful for copying data. There is a function called memcpy that allows you to copy a desired number of bytes from one memory address to another (in the same order). Let’s try copying our pkt_fmt struct to a byte array of the same size.

struct __attribute__((__packed__)) pkt_fmt{
  uint8_t totalPackets;
  uint8_t packetNumber;
  uint8_t payload[4];
};

void setup(){
  pkt_fmt mypkt = {
    .totalPackets = 0x91,
    .packetNumber = 0x4F,
    .payload = {0x0C, 0xB0, 0xCE, 0xFA}
  };

  Serial.begin(115200);
  while(!Serial);//You have to open Serial Monitor with Feather M0+.

  Serial.print("Packet number ");Serial.print(mypkt.packetNumber);
    Serial.print(" of "); Serial.println(mypkt.totalPackets);

  uint8_t mypkt2[sizeof(pkt_fmt)];//sizeof is a compile-time operation
  memcpy(mypkt2, &mypkt, sizeof(pkt_fmt));
}
void loop(){}

Note that when an array name such as mypkt2 is passed to a function like memcpy, it is treated as the address of the array's first element, so it needs no & operator.
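
To see which byte ends up where after the copy, the memcpy call can be wrapped in a small function and the result inspected. This is a minimal sketch that reuses the `pkt_fmt` struct from the example above. `packetToBytes` and `sameBytes` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct __attribute__((__packed__)) pkt_fmt {
  uint8_t totalPackets;
  uint8_t packetNumber;
  uint8_t payload[4];
};

// Hypothetical helper: copy a packet into `out`, which must have room for
// sizeof(pkt_fmt) bytes. `out` is already an address, so it needs no &;
// `pkt` is a struct, so its address is taken with &.
void packetToBytes(const pkt_fmt& pkt, uint8_t* out) {
  assert(out != nullptr);
  std::memcpy(out, &pkt, sizeof(pkt_fmt));
}

// Hypothetical helper: true when the struct and the byte copy hold identical bytes.
bool sameBytes(const pkt_fmt& pkt, const uint8_t* bytes) {
  return std::memcmp(&pkt, bytes, sizeof(pkt_fmt)) == 0;
}
```

For the packet used above, the copy holds 0x91, 0x4F, 0x0C, 0xB0, 0xCE, 0xFA in elements 0 through 5, which is the order in which the members are declared.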

How do we go from an array of bytes to a packed struct? The usual method, which works on most compilers, is to memcpy the bytes into the struct. In this demonstration, however, we will decode an array of bytes by extracting the desired values byte by byte. Observe this new example:

struct __attribute__((__packed__)) pkt_fmt{
  uint32_t totalPackets;
  uint8_t packetNumber;
  uint16_t illumination;
};

struct decodedObject {
  uint32_t totalPackets;
  uint8_t packetNumber;
  uint16_t illumination;
};

void setup(){
  pkt_fmt mypkt = {
    .totalPackets = 0x91,
    .packetNumber = 0x4F,
    .illumination = 42
  };

  struct {
    uint8_t bytes[sizeof(pkt_fmt)];
  } input;
  memcpy(input.bytes, &mypkt, sizeof(pkt_fmt));
  decodedObject decoded;


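  // Widen each byte to 32 bits before shifting so that no bits are lost,
  // even on boards such as the Arduino Uno where int is only 16 bits.
  decoded.totalPackets = (uint32_t)input.bytes[0]
                       +((uint32_t)input.bytes[1]<<8)
                       +((uint32_t)input.bytes[2]<<16)
                       +((uint32_t)input.bytes[3]<<24);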

  decoded.packetNumber = input.bytes[4];

  decoded.illumination = input.bytes[5] + (input.bytes[6]<<8);




  Serial.begin(115200);
  while(!Serial);//You have to open Serial Monitor with Feather M0+.
  Serial.print("Illumination ");Serial.println(decoded.illumination);
}
void loop(){}

The left-shift operator value<<N shifts an integer left by N bits and fills the vacated low bits with zeros. Below is an example of a left shift by 3 bits (N=3):

Before:

0	0	0	0	0	0	0	0	1	0	0	1	0	0	0	1
15	14	13	12	11	10	9	8	7	6	5	4	3	2	1	0

After:

0	0	0	0	0	1	0	0	1	0	0	0	1	0	0	0
15	14	13	12	11	10	9	8	7	6	5	4	3	2	1	0

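We are able to assemble multi-byte integers by using bit shifting and addition because each shifted byte lands in its own group of bit positions, and every other bit of the shifted value is zero. No two terms have a 1 in the same position, so the addition never carries and shift-and-add works.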
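
When the same decoding is needed in several places, the shifts can be collected into small helpers. The sketch below is one possible way to do that, with hypothetical names `readU16LE` and `readU32LE`. It widens each byte before shifting, which matters on boards where `int` is 16 bits, and it combines the pieces with `|`, which gives the same result as `+` here because the shifted bytes never overlap.

```cpp
#include <cstdint>

// Hypothetical helper: read a little-endian 16-bit value from two consecutive bytes.
constexpr uint16_t readU16LE(const uint8_t* b) {
  return static_cast<uint16_t>(b[0] | (static_cast<uint16_t>(b[1]) << 8));
}

// Hypothetical helper: read a little-endian 32-bit value from four consecutive bytes.
// Each byte is widened to 32 bits first so the shift cannot push bits out of a
// narrower intermediate type.
constexpr uint32_t readU32LE(const uint8_t* b) {
  return  static_cast<uint32_t>(b[0])
       | (static_cast<uint32_t>(b[1]) << 8)
       | (static_cast<uint32_t>(b[2]) << 16)
       | (static_cast<uint32_t>(b[3]) << 24);
}
```

With the earlier memory table as input, `readU16LE` on the bytes 0x91, 0x4F gives 0x4F91, and `readU32LE` on 0x91, 0x4F, 0x0C, 0xB0 gives 0xB00C4F91.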

Check for understanding

Below is a packet format.
struct __attribute__((__packed__)) pkt_fmt{
 uint16_t vBat;
 uint16_t vBus;
 uint16_t CO2;
};
Please write a snippet of code that decodes that struct. You may assume that your input is in input.bytes and that decoded has members of the same names and types as the pkt_fmt struct. A solution is below.

Solution

decoded.vBat = input.bytes[0] + (input.bytes[1]<<8);
decoded.vBus = input.bytes[2] + (input.bytes[3]<<8);
decoded.CO2 = input.bytes[4] + (input.bytes[5]<<8);
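
One way to gain confidence in a decoder is to compare it against the memcpy approach mentioned earlier. The sketch below does that for this packet format. The function names are hypothetical, both decoders return a `pkt_fmt` for brevity, and the memcpy version gives matching results only under the assumption that the code runs on a little-endian processor such as the Feather M0+.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct __attribute__((__packed__)) pkt_fmt {
  uint16_t vBat;
  uint16_t vBus;
  uint16_t CO2;
};

// Shift-and-add decode, field by field, as in the solution above.
// The result does not depend on how the processor stores multi-byte integers.
pkt_fmt decodeByShifts(const uint8_t* bytes) {
  assert(bytes != nullptr);
  pkt_fmt decoded;
  decoded.vBat = static_cast<uint16_t>(bytes[0] + (bytes[1] << 8));
  decoded.vBus = static_cast<uint16_t>(bytes[2] + (bytes[3] << 8));
  decoded.CO2  = static_cast<uint16_t>(bytes[4] + (bytes[5] << 8));
  return decoded;
}

// memcpy decode. Assumption: the processor is little-endian, so the raw bytes
// already have the in-memory layout of the packed struct.
pkt_fmt decodeByMemcpy(const uint8_t* bytes) {
  assert(bytes != nullptr);
  pkt_fmt decoded;
  std::memcpy(&decoded, bytes, sizeof(pkt_fmt));
  return decoded;
}
```

For the bytes 0x74, 0x0E, 0x88, 0x13, 0x9F, 0x01, the shift-and-add decoder yields vBat = 3700, vBus = 5000, and CO2 = 415, and on a little-endian processor the memcpy decoder yields the same three values.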