Nearly Complete Guide to RNG on a microcontroller
Posted: February 12, 2022 at 11:50 AM
Security depends upon cryptography and which in turn depends upon a Random Number Generator (RNG). An RNG is used for key generation (both symmetric and asymmetric) and key negotiation (session establishment). The later is an absolute requirement to ensure that communications can be secured. The former (key generation) can be used at first boot for personalization, but isn’t necessary as it could be done when personalizing the device at programming or first deployment.
There are two types of RNGs, the first is a True Random Number Generator (TRNG). This is one that takes some non-deterministic process, often physical, and measures it. Often, these are slow and are not uniform, requiring a post processing step before the are useful.
The second type is a Pseudo Random Number Generator (PRNG)NIST also refers to a PRNG as a Deterministic Random Bit Generator (DRBG).. PRNGs take a seed, and can generate large, effectively unlimited amounts of random data, when seeded properly. The issue is than if someone is able to obtain the seed, they will be able to predict the subsequent values, allowing breaking security.
The standard practice is to gather data from a TRNG, and use it to seed a PRNG. It used to be common that the PRNG should more additional random data mixed in, but I agree w/ djb (D. J. Bernstein) that once seeded, no additional seeding is neededSee his blog post Entropy Attacks! as modern PRNGs are secure and can generate random data such that their state will not leak.That is, taking it’s output, that neither past nor future output can be predicted.
There are lots of libraries and papers that talk about how to solve the problem for RNGs on a microcontroller that may not have an integrated [T]RNG block, but I have not been able to find a complete guide for integrating their work into a project where even a relative beginner could get it functional.
This article was written as I developed the lora-irrigation project. This project will be used as an example, and the code reference is mostly licensed under the 2-clause BSD license, and so is freely usable for your own projects.
Sources of Randomness¶
As mentioned, most microcontrollers do not have a dedicated hardware block like modern AMD64 (aka x86-64) processors do w/ the RDRAND instruction. Though they do not, there are other sources that are available.
The first, and easiest one is the Analog Digital Converter (ADC). Even if the ADC pin is tied to ground, the process of digital conversion is not 100% deterministic as there are errors in the converter or noise introduced on the pin.The article ADC Input Noise: The Good, The Bad, and The Ugly. Is No Noise Good Noise? talks about this.
The data sheet for the microcontroller will help determine the expected randomness from the part. In the case of the STM32L151CC that I’m using, Table 57 of the data sheet lists the Effective number of bits (ENOB) as typically 10 bits, which is a couple bits short of the 12 bit resolution of the ADC. This means that the 2 least significant bits are likely to have some noise in them. I did a run, and collected 114200 samples from the ADC. The Shannon entropy calculated using the empirical probabilities was 2.48. Now this is not strictly Shannon entropy, as the values were calculated from the experiment, and Shannon entropy should be calculated from the a priori probabilities. Discarding the 0’s (which makes up over half the results) improves the entropy calculation to 3.29. The min-entropy, Forward reference: min-entropy awk script a better indicator of entropy, calculation is 1.2 bits, and if all the 0’s are dropped, it improves to 2.943. This does help, but in the end, subtracting the data sheet’s ENOB from the ADC resolution does result in an approximate estimate of entropy.
It is possibly that a correlation analysis between samples could further reduce the entropy gathers via the ADC, but with sufficient collection, this should be able to be avoided.
The second is using uninitialized SRAM. It turns out that this has been studied in Software Only, Extremely Compact, Keccak-based Secure PRNG on ARM Cortex-M and Secure PRNG Seeding on Commercial Off-the-Shelf Microcontrollers. Depending upon how the SRAM is designed in the chip, it can create a situation where each bit of SRAM will be indeterminate at boot up. Both of these papers studied a similar microcontroller, an STM32F100R8 to the one I am using, a STM32L151CC.
I ran my own experiments where I powered on an STM3L151CC and dumped the SRAM 8 times and analyzed the results. I limited my analysis to 26863 bytes the 32 KiBytes of ram (remaining was data/bss or stack, so would not change, or was zeros). I then calculated the min-entropy for each bit across power cycles and the resulting sum was 11188, or approximately .416 bits per byte. This is 5.2% and in line with what the later paper observed for a similar device.
Part of using a source of randomness is making sure that it is usable. In the case of the ADC, each reading can be evaluated against previous reads to ensure that the data being obtained is possibly random. In the case of SRAM, this is more tricky, as the state of SRAM is static, and short of a reset, will not change. This means that to use SRAM, proper analysis of the device, or family of devices, need to be evaluated for suitability. There are cases where a device’s SRAM does not provide adequate entropy, as discussed in the papers, and so this method should not be used in those cases, or not solely relied upon.
The following is an awk
script for calculating the min-entropy of the
provided data. Each sample must be the first item on a line, and each
sample must be a hexadecimal value w/o any leading 0x
or other leading identifier:
# Copyright 2021 John-Mark Gurney
# This script is licensed under the 2-clause BSD license
function max(a, b)
{
if (a > b)
return a;
else
return b;
}
{
v = ("0x" $1) + 0; a[NR] = v;
maxv = max(maxv, v);
}
END {
tcnt = length(a);
me = 0;
for (bit = 0; 2^bit <= maxv; bit += 1) {
cnt0 = 0;
cnt1 = 0;
for (i in a) {
tbit = int((a[i] / 2 ^ bit) % 2);
if (tbit)
cnt1 += 1;
else
cnt0 += 1;
}
v = -log(max(cnt0, cnt1) / tcnt) / log(2);
print "bit " bit ":\t" v;
me += v;
}
printf "total:\t%0.3f\n", me;
}
It is also possible that there are other parts of the board/design that could be a source of randomness. The project that started this journey is using LoRa for communication. It turns out that the sample code for the radio chip (LoRaMac‑node) implements a random interface. The function just waits one milisecond, reads the RSSI value, takes the low bit and repeats this 32 times to return a 32-bit word. There are issues with this as I cannot find any description of the expected randomness in the data sheet, nor in the code. It also does not do any conditioning, so just because it returns 32-bits, does not guarantee 32-bits of usable entropy. I have briefly looked at the output, and there does appear to be higher lengths of runs than expected. Another issue is that it’s collection takes a while, as the fastest is 1 bit per ms. So, assuming the need to collect 8 bits for 1 bit of entropy (pure speculation), that means at minimum 2 seconds to collect the 2048 bits necessary for 256 bits of entropy.
Uniquifying¶
One of the other ways to help ensure that a microcontroller is to integrate per device values into the PRNG. This does not guarantee uniqueness between boots, but it does make it harder to attack if an attacker is able to control the other sources of randomness.
In the case of the STM32L151 chip I am using, there is a unique device id register. The device register is programmed at the factory. Because it is unknown if this unique id is recorded by the manufacturer, and possibly traced through the supply chain, and no guarantees are made to both the uniqueness or privacy, it has limited use to provide any serious additional randomization.
Another method, is to write entropy at provisioning time. This can be done in either flash memory or EEPROM, which may have a more granular write access.
Using SRAM¶
The tricky part of using SRAM is figuring out how to access the uninitialized memory. Despite having full access to the environment, modifying the startup code, which is often written in assembly, to do the harvesting makes an implementation less portable. Using standard C, or another high level language, makes this easier, but we need to know where the end of the data and bss segments are. This is where looking at the linker script will come in.
A linker script is used to allocate and map the program’s data to the
correct locations. This includes allocating memory so that all the
code and data fits in flash, but also allocating ram for variables, and
stack. Often there will be a symbol provided that marks where the data
and bss sections in ram end, and the heap should begin. For example,
in STM32L151CCUX_FLASH.ld
at lines 185 &
186
it defines the symbols end
and _end
, the later of which is often
used by sbrk
(or _sbrk
in my project’s case in
libnosys
A sample _sbrk
is in utils_syscalls.c,
though this particular implementation is not used by my project.)
to allocate memory for the heap. Using sbrk is the easiest method to
access uninitalized SRAM, but modifying or adding a symbol can be used
if your microcontroller’s framework does not support sbrk.
Putting it together¶
It is accepted that integrating as many difference sournces of entropy (TRNGs) is best. This ensures that as long as any single soruce is good, or each one is not great, but combined they provide enough entropy (preferably at least 128 bits), that the seeded PRNG will be secure and unpredictable.
As some sources are only available at first boot, e.g. SRAM, it is best to save a fork of the PRNG to stable storage. In my implementation, I decided to use EEPROM for this. I added an additional EEPROM section in the linker script, and then added a symbol rng_save that is put in this section. This should be 256-bits (32-bytes) as the savings of smaller does not make sense, and any proper PRNG when seeded with 256-bits will provide enough randomness. Writing to EEPROM does require a little more work to have the code save to this region, rather than RAM, but the STM32 HAL layer has functions that make this easy.
It would be great if the PRNG seed could be stored in read-once, write-once memory to ensure that it can be read, mixed in with any additional entropy, and then written out, but I do not know of any microcontroller that supports this feature.
Part of this is is to ensure that the the state between the saved seed, and the PRNG state used for this boot is disjoint, and that if either seed is compromised, neither can be backtracked to obtain the other. In the case of strobe, the function strobe_randomize does a RATCHET operation at the end, which ensure the state cannot be rolled back to figure out what was generated, and as the generated bytes does not contain the entire state of the PRNG, it cannot be used to reconstruct the future seed.
Another advantage of using EEPROM is the ability to provide an initial set of entropy bytes at firmware flashing time. I did attempt to add this, but OpenOCD, which I use for programming the Node151 device, does not support programming EEPROM, so in my case, this was not possibleDespite not using it, the infrastructure to generate perso entropy is still present in the Makefile.. I could have added an additional source data file to the flash, but figured that the other sources of entropy were adequate enough for my project.