Part 1: Memory Optimization in Embedded Systems in the C Language - Where to Start?

Developing embedded applications involves improving existing applications and adding new features. Each change, however, has an impact on the amount of memory required for the system to function properly. In the early stages of a project, we often don't realize how close we are to the physical limits of memory. Eventually, there comes a point where it is no longer possible to choose a different hardware platform. How do we solve this problem?

What are embedded systems?

An embedded system is software that is tightly coupled to the hardware it runs on and designed to perform specific tasks. Smart sensors in our homes are one example: they perform narrow tasks, such as collecting and reporting room temperature measurements. Limiting functionality in this way reduces the cost of producing the hardware itself, since such simple embedded applications do not require many resources.

Embedded systems vs. operating systems

In contrast to embedded systems, operating systems such as Windows or Android not only run on a wide range of devices, but also allow the development of applications that run on them. They require much more computing power and much more memory, which automatically drives up the price of the hardware. 

The hypothetical situation of building a network of smart sensors, with each device running on Windows, would be a curiosity. Of course, a suitable Windows application can handle this challenge, but it would be like using an excavator to replant flowers in the garden. For this reason, dedicated systems are being developed that provide all the necessary functionality with as little hardware as possible.

Choosing a microcontroller for a specific embedded system

To function properly, an embedded application requires a microcontroller to control the device. It is a chip that contains the processor, memory, and input/output interfaces. These components are a factor in selecting a microcontroller for a particular embedded system. Once you have found a suitable candidate that has the required interfaces and sufficient processing power, it is time to consider memory. Manufacturers often offer microcontrollers in a variety of variants, with different amounts of memory, and at different prices.

Challenges and ways to optimize memory in embedded systems

Estimating the size of an application in the early stages of a project is a major challenge. Before mass production begins, it is possible to change the processor variant to an option with more memory, but once production begins or the product is released to the market, the only feasible solutions are in the application itself. It is not possible to increase the amount of memory available through a software update. However, you can try to implement the same functionality using fewer resources. This brings us to the topic of optimizing memory usage in C-based embedded systems.

Optimizing memory usage – where to start?

It is worth making sure that changes to recover memory do not cause unexpected application behavior. This applies to any change in the source code. Therefore, before making a change, it is a good idea to arm yourself with a set of regression tests and a script that returns the amount of memory used by the application. Such a set of tools will provide the developer with continuous and readily available feedback on whether new changes have a positive impact on the size of the application and whether they cause new problems. 
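
The feedback loop described above can be sketched as a small check that compares the current build with a stored baseline and with the limits of the hardware. This is only an illustration: the structures, the function name `check_memory`, and the return codes are assumptions made for this example, and the numbers would come from whatever memory measurement script the project uses.

```c
#include <stddef.h>

/* Hypothetical memory report for one build; all sizes are in bytes. */
struct mem_report {
    size_t rom_used;
    size_t ram_used;
};

/* Hypothetical limits of the chosen microcontroller variant. */
struct mem_budget {
    size_t rom_total;
    size_t ram_total;
};

/*
 * Returns 0 when the build fits the budget and did not grow compared
 * with the baseline, 1 when it grew but still fits, and 2 when it no
 * longer fits. A CI job could use the value as its exit code.
 */
int check_memory(const struct mem_report *now,
                 const struct mem_report *baseline,
                 const struct mem_budget *budget)
{
    if (now->rom_used > budget->rom_total ||
        now->ram_used > budget->ram_total) {
        return 2;
    }
    if (now->rom_used > baseline->rom_used ||
        now->ram_used > baseline->ram_used) {
        return 1;
    }
    return 0;
}
```

Treating growth as a warning and overflow as a failure is one possible policy; a project may prefer to fail on any growth while an optimization effort is under way.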

CI/CD, or continuous integration (CI) and continuous deployment/delivery (CD), are practices and tools designed to streamline and automate the software development process. 

CI is designed to regularly verify that new code integrates well with the current implementation. It includes all kinds of automated testing. CD, on the other hand, extends CI to include steps that enable continuous readiness for the next software release. When the release process is known, it can be automated, minimizing the risk of human error and reducing preparation time. If the regression tests and the memory measurement script can run on another machine as part of a CI/CD system, we gain an ideal environment in which to work. The topic of testing is very broad and will not be covered here, but the information in the following sections of the article should help you create a memory measurement script.

The benefits of CI/CD include:

  • Early bug detection: A set of automated tests that runs frequently, for example as nightly tests, provides continuous feedback on new bugs.
  • Increased software quality: Higher test coverage combined with frequent test execution allows bugs to be caught and fixed quickly, before they reach the main code base.
  • Shorter release cycle: An automated release process means that a product in its current state can be made ready for release quickly, typically in a matter of hours. A manual process would take longer and be more prone to human error.

Key step – memory analysis

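RAM (random access memory) and ROM (read-only memory) are the two basic types of memory found in embedded systems. Each has a different role: RAM is volatile working memory that holds variables, the stack, and the heap while the program runs, whereas ROM (in practice usually flash) is non-volatile memory that holds the program code, constants, and the initial values of variables.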

With a set of tools that lets you work on the code safely, you can start making changes. So where should we look for savings? The primary method of monitoring memory usage is to analyze the map file generated by the linker, which tells us where specific symbols are located in memory and how large they are. From this file, we can extract a list of everything in RAM and ROM.

Address space – how data is distributed in memory

The figure above shows the distribution of data in memory. We list the address ranges that correspond to the different sections of RAM and ROM. In the mapfile we will see such a structure. During further optimization, we will focus on such elements as:

  • .text – read-only data, program instructions - ROM
  • .data – initialized variables - RAM and ROM
  • .bss – uninitialized variables - RAM
  • heap – an area of memory for dynamic allocation - RAM
  • stack – a part of memory for storing local variables and stack frames, so that the application can be built from functions - RAM
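
To make the list above more concrete, here is a minimal sketch showing where typical C objects usually end up. The names are invented for this example, and the exact placement is an assumption: it depends on the toolchain and the linker script, and many toolchains put constants in a separate read-only section such as .rodata rather than in .text itself.

```c
#include <stdint.h>
#include <stdlib.h>

/* Constant table: typically placed in a read-only section (ROM). */
static const uint8_t crc_seed_table[4] = { 0x1D, 0x2F, 0x31, 0x9B };

/* Initialized variable: .data (lives in RAM, initial value kept in ROM). */
static uint32_t sample_period_ms = 1000;

/* Zero-initialized variable: .bss (RAM only, cleared at startup). */
static uint32_t samples_taken;

/* The function body is program code: .text (ROM). */
uint32_t take_sample(uint8_t raw)
{
    /* Local variable: lives on the stack (or in a register). */
    uint32_t mixed = (uint32_t)(raw ^ crc_seed_table[samples_taken % 4u]);

    samples_taken++;
    return mixed + sample_period_ms;
}

/* Dynamic allocation: the buffer comes from the heap (RAM). */
uint8_t *make_buffer(size_t len)
{
    return calloc(len, 1u);
}
```

After building such a file, each of these symbols can be looked up in the map file to confirm which section it was actually assigned to.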

It is worth converting the memory map to a .csv file, which allows sorting and filtering for easier analysis. To optimize ROM, we look at .text and .data symbols; to optimize RAM, we look at .bss and .data symbols. What about the stack and the heap? The heap is used for dynamic allocation, while the stack holds local variables and stack frames at run time. Both reside in RAM, so reducing either of them has a positive impact on RAM utilization.
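
The rule of thumb above (ROM is .text plus .data, RAM is .data plus .bss) can be sketched as a small helper that sums the rows of such a .csv file. The row layout "section,symbol,size" with sizes in decimal bytes is an assumption made for this example, as are all function names; real map files differ between linkers, so the conversion step has to produce whatever layout the helper expects.

```c
#include <stdio.h>
#include <string.h>

/* Running totals per section, in bytes. */
struct section_totals {
    unsigned long text;
    unsigned long data;
    unsigned long bss;
};

/*
 * Adds one CSV row of the ASSUMED form "section,symbol,size" to the
 * totals. Returns 1 if the row was counted, 0 if it was malformed
 * (for example a header row) or belongs to a section we do not track.
 */
int add_csv_row(struct section_totals *t, const char *row)
{
    char section[16];
    char symbol[64];
    unsigned long size;

    if (sscanf(row, "%15[^,],%63[^,],%lu", section, symbol, &size) != 3) {
        return 0;
    }
    if (strcmp(section, ".text") == 0) {
        t->text += size;
    } else if (strcmp(section, ".data") == 0) {
        t->data += size;
    } else if (strcmp(section, ".bss") == 0) {
        t->bss += size;
    } else {
        return 0;
    }
    return 1;
}

/* ROM holds the code and the initial values of .data. */
unsigned long rom_usage(const struct section_totals *t)
{
    return t->text + t->data;
}

/* RAM holds .data and .bss; stack and heap are sized separately. */
unsigned long ram_usage(const struct section_totals *t)
{
    return t->data + t->bss;
}
```

Note that .data is counted twice on purpose: its initial values occupy ROM, and the variables themselves occupy RAM.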

Compilation flags – optimization methods provided by the GCC compiler

Before you make any changes to your code, it's a good idea to familiarize yourself with the optimization options that the compiler provides. Some of them affect the speed and size of the application. In some cases, using -Os optimizations can solve memory issues. Here is the list of GCC compiler flags, taken from the GCC manual:

  • -O or -O1: Optimize. Optimizing compilation takes somewhat more time, and a lot more memory for a large function. With -O, the compiler tries to reduce code size and execution time, without performing any optimizations that take a great deal of compilation time.
  • -O2: Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. As compared to -O, this option increases both compilation time and the performance of the generated code.
  • -O3: Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on other optimizations. Some of these trade code size for speed, so -O3 can produce a larger binary.
  • -O0: Reduce compilation time and make debugging produce the expected results. This is the default.
  • -Os: Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size.
  • -Ofast: Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs.
  • -Og: Optimize debugging experience. -Og enables optimizations that do not interfere with debugging. It should be the optimization level of choice for the standard edit-compile-debug cycle, offering a reasonable level of optimization while maintaining fast compilation and a good debugging experience.
  • -Oz: Optimize aggressively for size rather than speed. This may increase the number of instructions executed if those instructions require fewer bytes to encode. -Oz behaves similarly to -Os including enabling most -O2 optimizations.
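
When comparing builds made with different flags, it can help to record in the firmware itself how it was optimized. The sketch below relies on two macros that GCC predefines: `__OPTIMIZE__` (set in all optimizing compilations) and `__OPTIMIZE_SIZE__` (set when optimizing for size). The function name and the returned labels are assumptions made for this example, and other compilers may not define these macros at all.

```c
/*
 * Returns a short label describing how this translation unit was
 * optimized, based on macros predefined by GCC.
 */
const char *build_optimization_label(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "size";       /* size-oriented build, e.g. -Os */
#elif defined(__OPTIMIZE__)
    return "optimized";  /* some other optimization level is active */
#else
    return "none";       /* e.g. -O0 */
#endif
}
```

Printing this label next to the measured section sizes makes it harder to compare, by accident, a debug build against a size-optimized one.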

Architecture to support memory management

At the architecture design stage, it is worth making certain decisions that will make memory management easier later. A good practice is to divide the application into independent modules that can be included or excluded at your discretion. This is especially useful when the application is customized for different clients.
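
One common way to make modules includable or excludable in C is conditional compilation, so that a disabled module contributes neither code nor data to the image. The sketch below is an illustration under assumptions: the `CONFIG_MODULE_*` switches, the module names, and the table layout are all hypothetical, and the switches would normally be set per client by the build system.

```c
#include <stddef.h>

/* Hypothetical feature switches; defaults apply if the build sets none. */
#ifndef CONFIG_MODULE_SENSOR
#define CONFIG_MODULE_SENSOR 1
#endif
#ifndef CONFIG_MODULE_LOGGER
#define CONFIG_MODULE_LOGGER 1
#endif
#ifndef CONFIG_MODULE_DISPLAY
#define CONFIG_MODULE_DISPLAY 0
#endif

struct module {
    const char *name;
    void (*init)(void);
};

/* One bit per module, set by its init function. */
static unsigned ready_mask;

#if CONFIG_MODULE_SENSOR
static void sensor_init(void) { ready_mask |= 1u << 0; }
#endif
#if CONFIG_MODULE_LOGGER
static void logger_init(void) { ready_mask |= 1u << 1; }
#endif
#if CONFIG_MODULE_DISPLAY
static void display_init(void) { ready_mask |= 1u << 2; }
#endif

/* Only the enabled modules are compiled into the table. */
static const struct module modules[] = {
#if CONFIG_MODULE_SENSOR
    { "sensor", sensor_init },
#endif
#if CONFIG_MODULE_LOGGER
    { "logger", logger_init },
#endif
#if CONFIG_MODULE_DISPLAY
    { "display", display_init },
#endif
    { NULL, NULL } /* terminator keeps the table valid with no modules */
};

/* Initializes every enabled module and returns how many there were. */
size_t init_enabled_modules(void)
{
    size_t count = 0;
    const struct module *m;

    for (m = modules; m->name != NULL; m++) {
        m->init();
        count++;
    }
    return count;
}

unsigned modules_ready_mask(void)
{
    return ready_mask;
}
```

With the defaults above, only the sensor and logger modules are built; compiling with `-DCONFIG_MODULE_DISPLAY=1` would add the display module without touching the rest of the code.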

In addition, it is a good practice to initialize the required modules at program startup, so that the amount of memory they need to operate correctly is verified at that point. If the application can be varied according to the customer's needs, building it from modules that are initialized right after startup allows a quick check that a given combination is capable of running on specific hardware. Having this mechanism in place from the beginning will save time in the future.
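
A minimal sketch of this idea, assuming a C11 compiler: every module takes its memory from one fixed pool during initialization, so a combination of modules that does not fit is detected immediately at startup rather than at some later point in operation. The pool size, the module names, and `system_start` are assumptions made for this example, not a description of a specific product.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 1024u /* hypothetical RAM budget for module data */

static alignas(max_align_t) uint8_t pool[POOL_SIZE];
static size_t pool_used;

/* Bump allocator: memory is handed out at startup and never freed. */
static void *pool_alloc(size_t size)
{
    const size_t align = alignof(max_align_t);
    size_t rounded;

    if (size > POOL_SIZE) {
        return NULL;
    }
    rounded = (size + align - 1u) / align * align;
    if (rounded > POOL_SIZE - pool_used) {
        return NULL;
    }
    pool_used += rounded;
    return &pool[pool_used - rounded];
}

static uint16_t *sensor_samples;
static char *log_buffer;

static int sensor_init(void)
{
    sensor_samples = pool_alloc(64u * sizeof *sensor_samples);
    return sensor_samples != NULL;
}

static int logger_init(void)
{
    log_buffer = pool_alloc(512u);
    return log_buffer != NULL;
}

typedef int (*module_init_fn)(void);
static const module_init_fn startup_modules[] = { sensor_init, logger_init };

/*
 * Initializes all modules. Returns 1 if every module got its memory,
 * 0 otherwise. The number of pool bytes taken so far is always
 * reported through bytes_used.
 */
int system_start(size_t *bytes_used)
{
    size_t i;
    int ok = 1;

    pool_used = 0;
    for (i = 0; i < sizeof startup_modules / sizeof startup_modules[0]; i++) {
        if (!startup_modules[i]()) {
            ok = 0;
            break;
        }
    }
    *bytes_used = pool_used;
    return ok;
}
```

Because nothing is allocated after startup, the value reported by `system_start` is the module memory requirement for that configuration, which can be logged and compared across customer variants.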

Key takeaways

Everything we have described above will allow you to prepare for safe and fast code changes that will help optimize application memory. The next step is to choose RAM and ROM optimization methods, which we will describe in the next article.


Author

Mateusz Szpiech

Embedded Software Engineer at Comarch