
Prelab 3a: Intro to Numbers and Structures in C++

Objective

By engaging with this demonstration, you should be able to:

Videos and Text

The contents of this demo are available in text and video form. You may watch the videos, or read the text, and you will cover the same information. The videos are about 40 minutes of content with short, periodic checks on your understanding.

C++ Motivation

The radios on the Feather M0+ LoRa boards used in MAE 4220 take data in bytes. You could feed the radio a long list of bytes by hand, but it quickly becomes tedious to keep track of what those bytes mean. We use C++ structures and datatypes so that you can think more about high-level values such as temperatures and times, and less about whether data bytes are in the correct order. The Things Network Community Edition (the internet site that your data are sent to) also just receives bytes. Understanding the basics of how different kinds of numbers are represented as bytes allows you to decode a list of bytes into high-level values such as temperatures and times.

Number Systems Introduction

This is a short introduction to understanding and converting between the binary, hexadecimal, and decimal number systems. If you understand this already, great! Otherwise, read on!

In the internet of things, we send data up through the internet, and to do this we manipulate and read small amounts of data. In C++, we do this by manipulating data in structs. The data are often shown in a form that we are not used to: binary or hexadecimal. Sometimes, they are shown in decimal (what we’re used to).

Decimal, also known as base 10, has ten distinct digits, 0 through 9. A position counts up through these digits; when it passes the highest digit, 9, it resets to 0 and the position to its left increments by 1. So when the ones place increments past 9, the ones place resets to 0 and the tens place increments from 0 to 1.
000
001
002

009
010

This pattern continues in subsequent positions such as the hundreds place where the hundreds place increments from 0 to 1 and the tens and ones place reset back to 0.
098
099
100

Another way to interpret these numbers is to multiply each digit by the base raised to the power of the digit’s place, and then add up the results.
934

9 * 10^2 = 900 +
3 * 10^1 = 30 +
4 * 10^0 = 4
= 934 in decimal
This example is indeed redundant because we have converted from decimal back to decimal. However, it is important in understanding how other base systems work.

Binary

In binary, instead of ten distinct digits, we have only two: 0 and 1. A place now fills up after a single increment and carries into the next place, so we count from 0 to 5 in this fashion.

Binary and Decimal representations of counting up.
0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5

The same logic applies to reading binary numbers. For instance, take 10110:

1 * 2^4 = 16 +
0 * 2^3 = 0 +
1 * 2^2 = 4 +
1 * 2^1 = 2 +
0 * 2^0 = 0

= 22 in decimal

Hexadecimal

Many other number systems are also important, particularly hexadecimal, which has 16 distinct “digits”. We do not have 16 distinct numerals, so we use letters for the rest: the digits are 0-9 and then A-F.
0 = 0
…
9 = 9
A = 10
B = 11
…
F = 15
We use the same logic to read these numbers and convert them into decimal.
0x0A2D to Decimal

A * 16^2 = 10 * 16^2 = 2560 +
2 * 16^1 = 2 * 16^1 = 32 +
D * 16^0 = 13 * 16^0 = 13
= 2605 in decimal

While we don’t often use any other bases, theoretically one could use a different base.
Base 7
2601 would equal 981 in decimal

You are free to use converters! This is tedious to do by hand.
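For readers who would like to see the place-value rule as code, here is a minimal sketch. The function names `digitValue` and `parseInBase` are made up for this illustration and are not part of the lab code. It assumes bases from 2 to 16 and does no overflow checking.

```cpp
#include <stdint.h>

// Value of one digit character: '0'-'9' -> 0-9, 'A'-'F' or 'a'-'f' -> 10-15.
// Returns -1 for any other character.
int digitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Hypothetical helper: read `digits` as a number written in `base` (2 to 16).
// Multiplying the running total by the base before adding each new digit
// gives the same result as summing digit * base^place over every place.
uint32_t parseInBase(const char* digits, int base) {
  uint32_t total = 0;
  for (const char* p = digits; *p != '\0'; ++p) {
    int d = digitValue(*p);
    if (d < 0 || d >= base) continue;  // skip characters that are not digits in this base
    total = total * static_cast<uint32_t>(base) + static_cast<uint32_t>(d);
  }
  return total;
}
```

Calling `parseInBase("10110", 2)`, `parseInBase("0A2D", 16)`, and `parseInBase("2601", 7)` reproduces the three worked examples in this section.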

Numbers in C++

The smallest amount of data something can have on a computer is a bit which either has the value 0 or 1. In C++, the smallest directly-changeable unit of memory is a byte. All bytes are a collection of 8 bits. The table below represents a byte with the eight bit positions (7 through 0) labeled.

10010001
76543210

We are going to read this as a binary number and convert it to decimal. By convention, bit N of an unsigned integer like the one above adds 2^N to the value when that bit is 1. So, we get the value in base 10 through the addition pattern introduced above.

10010001_2 = 2^7 + 2^4 + 2^0 = 128 + 16 + 1 = 145_10

The subscripts (written here after an underscore) notate the base.

Signed vs Unsigned terminology

This convention only works for positive integers so how do we represent negative integers?

This is where we use “signed” integers. The convention for bit contributions is the same, except that the most significant bit (bit N) contributes -2^N instead of +2^N. So, if we treated that byte as a signed integer, we would instead perform the following addition:

10010001_2 = -2^7 + 2^4 + 2^0 = -128 + 16 + 1 = -111_10

That convention, called “Two’s Complement,” makes addition and subtraction in hardware work out nicely.
In these tutorials, simply pay attention to whether bits are supposed to represent a signed or unsigned integer. The bits are simple to reinterpret in C++ code, and binary-decimal converters often give both a signed and an unsigned conversion.
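The two readings of the same byte can be sketched in code as follows. The function names `unsignedValue` and `signedValue` are made up for this illustration; they spell out the bit-by-bit sums by hand so that the only difference between the two conventions is visible.

```cpp
#include <stdint.h>

// Unsigned reading: every bit N that is 1 contributes +2^N.
int unsignedValue(uint8_t byte) {
  int total = 0;
  for (int n = 0; n < 8; ++n) {
    if ((byte >> n) & 1) total += (1 << n);
  }
  return total;
}

// Signed (two's complement) reading: bits 0 to 6 contribute +2^N as before,
// but bit 7, the most significant bit, contributes -2^7.
int signedValue(uint8_t byte) {
  int total = 0;
  for (int n = 0; n < 7; ++n) {
    if ((byte >> n) & 1) total += (1 << n);
  }
  if ((byte >> 7) & 1) total -= (1 << 7);
  return total;
}
```

In everyday code you would not write these loops. On the two's complement platforms used in this course, storing the byte in an int8_t instead of a uint8_t is expected to give the same signed reading.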

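The main way that we distinguish the two in code is by the type name: types whose names begin with int (such as int8_t) are signed, and types whose names begin with uint (such as uint8_t) are unsigned. You may also see data types like unsigned int in some libraries.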

**For the rest of this demo, please assume that integers are unsigned unless stated otherwise.**

The datatype uint8_t is an unsigned integer that is 8 bits (1 byte) large. Likewise, a uint16_t is 16 bits (2 bytes) large.
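As a quick sanity check on those sizes, the sketch below asks the compiler to confirm them. It assumes a C++11 or newer compiler (needed for `static_assert`), and `fitsInUint8` is a made-up helper for illustration.

```cpp
#include <stdint.h>

// Compile-time checks: the build fails if any of these is false.
static_assert(sizeof(uint8_t) == 1, "uint8_t is one byte");
static_assert(sizeof(uint16_t) == 2, "uint16_t is two bytes");
static_assert(UINT8_MAX == 255, "largest uint8_t is 2^8 - 1");
static_assert(UINT16_MAX == 65535, "largest uint16_t is 2^16 - 1");

// Hypothetical helper: true when `value` can be stored in a uint8_t without loss.
bool fitsInUint8(long value) {
  return value >= 0 && value <= UINT8_MAX;
}
```

A check like `fitsInUint8` is worth considering before you squeeze a sensor reading into a one-byte packet field.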

Addresses

So far, all the bits in these bytes have represented numbers. One can also decide that some of the bits in a byte mean something else, such as an address.

It is a design decision for us to assign a special meaning to a byte. For example, the following byte we have defined to have bits 7:5 mean a “Write Address”, with bits 4:0 meaning a “Write Value”.

10010001
76543210

In that case, the write address is 4, and the write value is 17. Feel free to use a binary to decimal converter.

In the following byte, the first four bits are a “Write Address” and the rest are the “Write Value.”

10010001
76543210

In embedded coding, many bits of bytes have special meaning because of a design decision. You will often have to read documentation so that you can agree with library authors and hardware designers on what some bits mean. You will also often have the freedom to choose what you want to make some bits mean.
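Pulling the two fields back out of such a byte takes one shift and one mask. The sketch below covers both layouts described above; the function names are made up for this illustration.

```cpp
#include <stdint.h>

// Layout 1: bits 7:5 hold the write address, bits 4:0 hold the write value.
uint8_t address3(uint8_t byte) { return static_cast<uint8_t>(byte >> 5); }
uint8_t value5(uint8_t byte)   { return static_cast<uint8_t>(byte & 0x1F); }

// Layout 2: bits 7:4 hold the write address, bits 3:0 hold the write value.
uint8_t address4(uint8_t byte) { return static_cast<uint8_t>(byte >> 4); }
uint8_t value4(uint8_t byte)   { return static_cast<uint8_t>(byte & 0x0F); }
```

For the byte 10010001, the first layout gives address 4 and value 17, and the second layout gives address 9 and value 1. The same bits carry different meanings depending on the layout we agree on.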

We can also group bytes together. Below is a 16-bit integer formed by putting two bytes next to each other.

 0  1  0  0  1  1  1  1    1  0  0  1  0  0  0  1
15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

As a result, we can represent very large numbers precisely by using a lot of bytes. In practice, plain numbers do not use more than four or eight bytes (32-bit or 64-bit numbers).

Check for understanding

100 10001
765 43210
Address (bits 7:5), Value (bits 4:0)

I want to write a value of 24 to address 4. What byte do I have to write?

Answer

I have to write the value 152. The address 4 is 100_2. The value 24 is 11000_2. So, combined, the byte is 10011000_2, which is 152_10.
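The same answer can be computed in code. This is a minimal sketch for the 3-bit address, 5-bit value layout; `packWrite` is a made-up name, and the masks simply discard any bits that do not fit in a field.

```cpp
#include <stdint.h>

// Hypothetical helper: build one byte with the address in bits 7:5
// and the value in bits 4:0.
uint8_t packWrite(uint8_t address, uint8_t value) {
  uint8_t addressBits = static_cast<uint8_t>((address & 0x07) << 5);  // keep 3 bits, move to 7:5
  uint8_t valueBits   = static_cast<uint8_t>(value & 0x1F);           // keep 5 bits in 4:0
  return static_cast<uint8_t>(addressBits | valueBits);
}
```

`packWrite(4, 24)` gives 152, matching the answer worked out by hand.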

Hex notation and memory addresses

We can also use base 16, or hexadecimal notation, to represent byte values. Hexadecimal uses 0-9 and then A-F. Each character represents four bits. Notate hexadecimal numbers as 0x############ for as many bits as you want to represent. The following table lists binary to hex conversions.

Binary Hex Binary Hex Binary Hex Binary Hex
0000 0x00 0100 0x04 1000 0x08 1100 0x0C
0001 0x01 0101 0x05 1001 0x09 1101 0x0D
0010 0x02 0110 0x06 1010 0x0A 1110 0x0E
0011 0x03 0111 0x07 1011 0x0B 1111 0x0F

For example, the byte 0b10010001 used before has the following conversion to hex:

1001_0001 = 0x91

We can use _ to make long strings of bits more readable.
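C++ source code accepts these notations directly. The sketch below assumes a compiler with C++14 support. Note that inside C++ code the digit separator is an apostrophe, not the underscore used in the text above. On an older toolchain you may need to drop the separator. The helper names `upperByte` and `lowerByte` are made up for this illustration.

```cpp
#include <stdint.h>

// Three spellings of the same byte value: binary, hexadecimal, decimal.
static_assert(0b10010001 == 0x91, "binary and hex literals agree");
static_assert(0b1001'0001 == 145, "digit separators do not change the value");

// Hypothetical helpers: split a 16-bit value into its two bytes.
// Each byte is exactly two hex digits, so 0x4F91 splits into 0x4F and 0x91.
uint8_t upperByte(uint16_t v) { return static_cast<uint8_t>(v >> 8); }
uint8_t lowerByte(uint16_t v) { return static_cast<uint8_t>(v & 0xFF); }
```

Because one hex digit is exactly four bits, splitting a hex number into bytes is just a matter of cutting it into pairs of digits.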

Bytes are stored at memory addresses. The following table lists a few bytes and their addresses. (completely arbitrary)

Address Value
0x00001006 0x00
0x00001005 0xFA
0x00001004 0xCE
0x00001003 0xB0
0x00001002 0x0C
0x00001001 0x4F
0x00001000 0x91

C++ gives us the power to interpret bytes as data. Bytes could be a multi-byte integer such as 0x4F91 that appeared earlier in this demonstration.

Address Value
0x00001006 0x00
0x00001005 0xFA
0x00001004 0xCE
0x00001003 0xB0
0x00001002 0x0C
0x00001001 0x4F
0x00001000 0x91

On the Arduino Uno and the Feather M0+ LoRa, multi-byte values are stored in little-endian order, meaning that the least significant byte goes into the lowest memory address. We say that the 16-bit integer 0x4F91 is at memory address 0x00001000, the address of its lowest byte.

To check your understanding, what 32-bit integer is at memory address 0x00001000?

Answer

The integer 0xB00C4F91 is at address 0x00001000.
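Reading a little-endian integer out of memory can be sketched as follows. The array `memory` is a stand-in for the example table, with index 0 playing the role of address 0x00001000. The names `memory`, `readLE16`, and `readLE32` are made up for this illustration.

```cpp
#include <stdint.h>

// The bytes from the table, lowest address first.
const uint8_t memory[7] = {0x91, 0x4F, 0x0C, 0xB0, 0xCE, 0xFA, 0x00};

// Read a 16-bit little-endian value starting at p: p[0] is the low byte.
uint16_t readLE16(const uint8_t* p) {
  return static_cast<uint16_t>(p[0] | (static_cast<uint16_t>(p[1]) << 8));
}

// Read a 32-bit little-endian value starting at p.
// Each byte is widened to 32 bits before it is shifted into place.
uint32_t readLE32(const uint8_t* p) {
  return  static_cast<uint32_t>(p[0])
       | (static_cast<uint32_t>(p[1]) << 8)
       | (static_cast<uint32_t>(p[2]) << 16)
       | (static_cast<uint32_t>(p[3]) << 24);
}
```

`readLE16(memory)` gives 0x4F91 and `readLE32(memory)` gives 0xB00C4F91, matching the answers above. Starting two bytes later, `readLE32(memory + 2)` gives 0xFACEB00C.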

I want to store 0x12345678 at address 0x00001001. Please fill in the correct bytes. (feel free to fill in text on a spreadsheet for this one)

Answer
Address Value
0x00001006 0x??
0x00001005 0x??
0x00001004 0x12
0x00001003 0x34
0x00001002 0x56
0x00001001 0x78
0x00001000 0x??
We don't care about the value of the 0x?? bytes.
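Going the other way, storing a 32-bit value in little-endian order, can be sketched like this. `writeLE32` is a made-up name for this illustration.

```cpp
#include <stdint.h>

// Hypothetical helper: store `value` in the four bytes starting at `dest`,
// least significant byte first (little-endian order).
void writeLE32(uint8_t* dest, uint32_t value) {
  dest[0] = static_cast<uint8_t>(value);        // lowest 8 bits
  dest[1] = static_cast<uint8_t>(value >> 8);
  dest[2] = static_cast<uint8_t>(value >> 16);
  dest[3] = static_cast<uint8_t>(value >> 24);  // highest 8 bits
}
```

If `mem` is a 7-byte array standing in for addresses 0x00001000 through 0x00001006, then `writeLE32(&mem[1], 0x12345678)` fills in `mem[1]` through `mem[4]` with 0x78, 0x56, 0x34, 0x12 and leaves the other bytes untouched.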

Bytes could also hold a custom data structure. Those bytes could mean packet 0x4F of 0x91, data payload 0xFACEB00C.

Working with bytes in C++

Key Recipe

In order to spare us from undefined program behavior, we will follow a recipe for custom data structures. To load, pack, or encode a structure, we will load values into a specially identified C++ structure, and then copy it into its destination. To unload, unpack, or decode a structure, we will assemble larger values by copying bytes to appropriate places. This demonstration will give several examples of packing and unpacking structures. The video demonstrates the Arduino IDE, but the examples should work in other Arduino compilers such as TinkerCad.

Recall that we talked about a group of bytes at address 0x00001000 that meant packet 0x4F of 0x91, data payload 0xFACEB00C. We can define a data structure, notated as a struct in C++, to assist us in loading data.

struct __attribute__((__packed__)) pkt_fmt{
  uint8_t totalPackets;
  uint8_t packetNumber;
  uint8_t payload[4];
};

void setup(){
  pkt_fmt mypkt = {
    .totalPackets = 0x91,
    .packetNumber = 0x4F,
    .payload = {0x0C, 0xB0, 0xCE, 0xFA}
  };

  Serial.begin(115200);
  while(!Serial);//You have to open Serial Monitor with Feather M0+.

  Serial.print("Packet number ");Serial.print(mypkt.packetNumber);
    Serial.print(" of "); Serial.println(mypkt.totalPackets);
}
void loop(){}

Let us unpack what went on in that code. We define struct pkt_fmt to have three members. totalPackets is an 8-bit unsigned integer.

Aside: uint8_t is available with #include <cstdint>, which Arduino includes automatically in your main sketch file. int8_t is an 8-bit signed integer, and uint32_t is a 32-bit unsigned integer. On the Feather M0+, these fixed-width integer types come in signed (intN_t) and unsigned (uintN_t) versions that use 8, 16, 32, or 64 bits.

packetNumber is another 8-bit unsigned integer. payload is an array of 4 uint8_ts. The [] tell the compiler to reserve 4 consecutive bytes in the struct for that member. We will talk more about arrays soon.

__attribute__((__packed__)) is required on the Feather M0+ to tell the compiler to pack the struct members densely. Otherwise, the compiler adds padding which makes the actual struct bytes not match up with our intuition.
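One way to see the effect of packing is to compare struct sizes. The sketch below uses a made-up struct with mixed member sizes (it is not one of the lab's packet formats) and assumes a GCC-compatible compiler, which is what `__attribute__((__packed__))` requires.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical example struct: 4 + 1 + 2 = 7 bytes of members.
struct __attribute__((__packed__)) PackedReading {
  uint32_t timestamp;
  uint8_t  channel;
  uint16_t level;
};

// Same members without the attribute; the compiler may insert padding.
struct UnpackedReading {
  uint32_t timestamp;
  uint8_t  channel;
  uint16_t level;
};

static_assert(sizeof(PackedReading) == 7, "packed struct has no padding");
static_assert(offsetof(PackedReading, level) == 5, "level directly follows channel");
static_assert(sizeof(UnpackedReading) >= sizeof(PackedReading), "padding can only add bytes");
```

How much padding the unpacked version gets depends on the compiler and the processor, which is exactly why we pack structs that describe bytes sent over the radio.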

To check your understanding:

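Please define a packed struct with the following members, in order: an 8-bit signed integer named armSetVeloc, a 32-bit signed integer named armPosition, and an array of 12 unsigned 8-bit integers named data.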

An answer is below:

Solution
struct __attribute__((__packed__)) pkt_fmt{
   int8_t armSetVeloc;
   int32_t armPosition;
   uint8_t data[12];
};

You can get the memory address of a piece of data using the & operator. Memory addresses are useful for copying data. There is a function called memcpy that allows you to copy a desired number of bytes from one memory address to another (in the same order). Let’s try copying our pkt_fmt struct to a byte array of the same size.

struct __attribute__((__packed__)) pkt_fmt{
  uint8_t totalPackets;
  uint8_t packetNumber;
  uint8_t payload[4];
};

void setup(){
  pkt_fmt mypkt = {
    .totalPackets = 0x91,
    .packetNumber = 0x4F,
    .payload = {0x0C, 0xB0, 0xCE, 0xFA}
  };

  Serial.begin(115200);
  while(!Serial);//You have to open Serial Monitor with Feather M0+.

  Serial.print("Packet number ");Serial.print(mypkt.packetNumber);
    Serial.print(" of "); Serial.println(mypkt.totalPackets);

  uint8_t mypkt2[sizeof(pkt_fmt)];//sizeof is a compile-time operation
  memcpy(mypkt2, &mypkt, sizeof(pkt_fmt));
}
void loop(){}

Note that the name of an array, used where an address is expected, gives the address of its first element. That is why mypkt2 needs no & in the memcpy call.
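To check what actually lands in the byte array, the copy can be wrapped in a small function and inspected. This is a sketch only: `packToBytes` is a made-up helper, and the struct is the same pkt_fmt as in the sketch above.

```cpp
#include <stdint.h>
#include <string.h>

struct __attribute__((__packed__)) pkt_fmt {
  uint8_t totalPackets;
  uint8_t packetNumber;
  uint8_t payload[4];
};

// Hypothetical helper: copy a packet into `out`, byte for byte.
// `out` must have room for at least sizeof(pkt_fmt) bytes.
void packToBytes(const pkt_fmt& pkt, uint8_t* out) {
  memcpy(out, &pkt, sizeof(pkt_fmt));
}
```

After packing the example packet, the array holds 0x91, 0x4F, 0x0C, 0xB0, 0xCE, 0xFA in that order: the members appear in the order they are declared in the struct, which matches the memory table from earlier.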

How do we go from an array of bytes back to a packed struct? The method that works on most compilers, most of the time, is to use memcpy. However, we will decode an array of bytes by extracting the desired values byte by byte. Observe this new example:

struct __attribute__((__packed__)) pkt_fmt{
  uint32_t totalPackets;
  uint8_t packetNumber;
  uint16_t illumination;
};

struct decodedObject {
  uint32_t totalPackets;
  uint8_t packetNumber;
  uint16_t illumination;
};

void setup(){
  pkt_fmt mypkt = {
    .totalPackets = 0x91,
    .packetNumber = 0x4F,
    .illumination = 42
  };

  struct {
    uint8_t bytes[sizeof(pkt_fmt)];
  } input;
  memcpy(input.bytes, &mypkt, sizeof(pkt_fmt));
  decodedObject decoded;


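  // Cast each byte to the destination width before shifting so that no bits
  // are lost on boards where int is only 16 bits wide (such as the Arduino Uno).
  decoded.totalPackets = (uint32_t)input.bytes[0]
                       +((uint32_t)input.bytes[1]<<8)
                       +((uint32_t)input.bytes[2]<<16)
                       +((uint32_t)input.bytes[3]<<24);

  decoded.packetNumber = input.bytes[4];

  decoded.illumination = input.bytes[5] + ((uint16_t)input.bytes[6]<<8);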




  Serial.begin(115200);
  while(!Serial);//You have to open Serial Monitor with Feather M0+.
  Serial.print("Illumination ");Serial.println(decoded.illumination);
}
void loop(){}

The left-shift operator in value<<N shifts the bits of an integer left by N positions, filling the vacated low bits with zeros. Below is an example of a left shift by 3 bits (N=3):

Before:

 0  0  0  0  0  0  0  0    1  0  0  1  0  0  0  1
15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

After:

 0  0  0  0  0    1  0  0  1  0  0  0  1    0  0  0
15 14 13 12 11   10  9  8  7  6  5  4  3    2  1  0

We can assemble multi-byte integers with bit shifting and addition because, after shifting, each byte occupies its own group of bit positions and every other bit is zero. No two terms ever have a 1 in the same position, so the addition never carries, and shift-and-add works.
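A small sketch can confirm both the shift example and the claim that addition is safe here. The function names are made up for this illustration. Because the shifted bytes never overlap, adding them gives the same result as combining them with bitwise OR.

```cpp
#include <stdint.h>

// Shift a 16-bit value left by 3 bits, as in the example above.
uint16_t shiftLeft3(uint16_t v) {
  return static_cast<uint16_t>(v << 3);
}

// Assemble a 16-bit value from a low byte and a high byte by shift-and-add.
uint16_t joinByAdd(uint8_t low, uint8_t high) {
  return static_cast<uint16_t>(low + (static_cast<uint16_t>(high) << 8));
}

// The same assembly using bitwise OR instead of addition.
uint16_t joinByOr(uint8_t low, uint8_t high) {
  return static_cast<uint16_t>(low | (static_cast<uint16_t>(high) << 8));
}
```

`shiftLeft3(0x0091)` gives 0x0488, the bit pattern shown under "After". Both `joinByAdd(0x91, 0x4F)` and `joinByOr(0x91, 0x4F)` give 0x4F91.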

Check for understanding

Below is a packet format.

struct __attribute__((__packed__)) pkt_fmt{
 uint16_t vBat;
 uint16_t vBus;
 uint16_t CO2;
};

Please write a snippet of code that decodes that struct. You may assume that your input is in input.bytes and that decoded has members of the same names and types as the pkt_fmt struct. A solution is below.

Solution
decoded.vBat = input.bytes[0] + (input.bytes[1]<<8);
decoded.vBus = input.bytes[2] + (input.bytes[3]<<8);
decoded.CO2 = input.bytes[4] + (input.bytes[5]<<8);
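For reference, the solution can also be wrapped into a complete function that you can test on its own. This is one possible arrangement, not the only one: the names `decodedReadings`, `readU16`, and `decodePacket` are made up for this sketch, and the input bytes in the usage note are invented sample values.

```cpp
#include <stdint.h>

// Decoded form of the packet, with the same member names and types as pkt_fmt.
struct decodedReadings {
  uint16_t vBat;
  uint16_t vBus;
  uint16_t CO2;
};

// Read one little-endian 16-bit value: bytes[0] is the low byte.
uint16_t readU16(const uint8_t* bytes) {
  return static_cast<uint16_t>(bytes[0] + (static_cast<uint16_t>(bytes[1]) << 8));
}

// Decode the six packet bytes into the three readings.
decodedReadings decodePacket(const uint8_t* bytes) {
  decodedReadings decoded;
  decoded.vBat = readU16(bytes + 0);
  decoded.vBus = readU16(bytes + 2);
  decoded.CO2  = readU16(bytes + 4);
  return decoded;
}
```

For example, the invented input bytes 0xE4, 0x0C, 0x88, 0x13, 0x9A, 0x01 decode to vBat = 3300, vBus = 5000, and CO2 = 410.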