Keil Logo

Technical Support

On-Line Manuals

Compiler User Guide

Preface Overview of the Compiler Getting Started with the Compiler Compiler Features Compiler Coding Practices The compiler as an optimizing compiler Compiler optimization for code size versus speed Compiler optimization levels and the debug view Selecting the target processor at compile time Enabling FPU for bare-metal Optimization of loop termination in C code Loop unrolling in C code Compiler optimization and the volatile keyword Code metrics Code metrics for measurement of code size and data Stack use in C and C++ Benefits of reducing debug information in objects Methods of reducing debug information in objects a Guarding against multiple inclusion of header file Methods of minimizing function parameter passing o Returning structures from functions through regist Functions that return the same result when called Comparison of pure and impure functions Recommendation of postfix syntax when qualifying f Inline functions Compiler decisions on function inlining Automatic function inlining and static functions Inline functions and removal of unused out-of-line Automatic function inlining and multifile compilat Restriction on overriding compiler decisions about Compiler modes and inline functions Inline functions in C++ and C90 mode Inline functions in C99 mode Inline functions and debugging Types of data alignment Advantages of natural data alignment Compiler storage of data objects by natural byte a Relevance of natural data alignment at compile tim Unaligned data access in C and C++ code The __packed qualifier and unaligned data access i Unaligned fields in structures Performance penalty associated with marking whole Unaligned pointers in C and C++ code Unaligned Load Register (LDR) instructions generat Comparisons of an unpacked struct, a __packed stru Compiler support for floating-point arithmetic Default selection of hardware or software floating Example of hardware and software support differenc Vector Floating-Point (VFP) architectures Limitations on hardware handling of floating-point Implementation of Vector Floating-Point (VFP) supp Compiler and library support for half-precision fl Half-precision floating-point number format Compiler support for floating-point computations a Types of floating-point linkage Compiler options for floating-point linkage and co Floating-point linkage and computational requireme Processors and their implicit Floating-Point Units Integer division-by-zero errors in C code Software floating-point division-by-zero errors in About trapping software floating-point division-by Identification of software floating-point division Software floating-point division-by-zero debugging New language features of C99 New library features of C99 // comments in C99 and C90 Compound literals in C99 Designated initializers in C99 Hexadecimal floating-point numbers in C99 Flexible array members in C99 __func__ predefined identifier in C99 inline functions in C99 long long data type in C99 and C90 Macros with a variable number of arguments in C99 Mixed declarations and statements in C99 New block scopes for selection and iteration state _Pragma preprocessing operator in C99 Restricted pointers in C99 Additional library functions in C99 Complex numbers in C99 Boolean type and in C99 Extended integer types and functions in floating-point environment access in C99 snprintf family of functions in C99 type-generic math macros in C99 wide character I/O functions in C99 How to prevent uninitialized data from being initi Compiler Diagnostic Messages Using the Inline and Embedded Assemblers of the AR Compiler Command-line Options Language Extensions Compiler-specific Features C and C++ Implementation Details What is Semihosting? Via File Syntax Summary Table of GNU Language Extensions Standard C Implementation Definition Standard C++ Implementation Definition C and C++ Compiler Implementation Limits

Compiler optimization levels and the debug view

4.3 Compiler optimization levels and the debug view

The precise optimizations performed by the compiler depend both on the level of optimization chosen, and whether you are optimizing for performance or code size.

The compiler supports the following optimization levels:
Minimum optimization. Turns off most optimizations. When debugging is enabled, this option gives the best possible debug view because the structure of the generated code directly corresponds to the source code. All optimization that interferes with the debug view is disabled. In particular:
  • Breakpoints can be set on any reachable point, including dead code.
  • The value of a variable is available everywhere within its scope, except where it is uninitialized.
  • Backtrace gives the stack of open function activations that is expected from reading the source.


Although the debug view produced by -O0 corresponds most closely to the source code, users might prefer the debug view produced by -O1 because this improves the quality of the code without changing the fundamental structure.


Dead code includes reachable code that has no effect on the result of the program, for example an assignment to a local variable that is never used. Unreachable code is specifically code that cannot be reached via any control flow path, for example code that immediately follows a return statement.
Restricted optimization. The compiler only performs optimizations that can be described by debug information. Removes unused inline functions and unused static functions. Turns off optimizations that seriously degrade the debug view. If used with --debug, this option gives a generally satisfactory debug view with good code density.
The differences in the debug view from –O0 are:
  • Breakpoints cannot be set on dead code.
  • Values of variables might not be available within their scope after they have been initialized. For example if their assigned location has been reused.
  • Functions with no side-effects might be called out of sequence, or might be omitted if the result is not needed.
  • Backtrace might not give the stack of open function activations that is expected from reading the source because of the presence of tailcalls.
The optimization level –O1 produces good correspondence between source code and object code, especially when the source code contains no dead code. The generated code can be significantly smaller than the code at –O0, which can simplify analysis of the object code.
High optimization. If used with --debug, the debug view might be less satisfactory because the mapping of object code to source code is not always clear. The compiler might perform optimizations that cannot be described by debug information.
This is the default optimization level.
The differences in the debug view from –O1 are:
  • The source code to object code mapping might be many to one, because of the possibility of multiple source code locations mapping to one point of the file, and more aggressive instruction scheduling.
  • Instruction scheduling is allowed to cross sequence points. This can lead to mismatches between the reported value of a variable at a particular point, and the value you might expect from reading the source code.
  • The compiler automatically inlines functions.
Maximum optimization. When debugging is enabled, this option typically gives a poor debug view. ARM recommends debugging at lower optimization levels.
If you use -O3 and -Otime together, the compiler performs extra optimizations that are more aggressive, such as:
  • High-level scalar optimizations, including loop unrolling. This can give significant performance benefits at a small code size cost, but at the risk of a longer build time.
  • More aggressive inlining and automatic inlining.
These optimizations effectively rewrite the input source code, resulting in object code with the lowest correspondence to source code and the worst debug view. The --loop_optimization_level=option controls the amount of loop optimization performed at –O3 –Otime. The higher the amount of loop optimization the worse the correspondence between source and object code.
For extra information about the high level transformations performed on the source code at –O3 –Otime use the --remarks command-line option.
Because optimization affects the mapping of object code to source code, the choice of optimization level with -Ospace and -Otime generally impacts the debug view.
The option -O0 is the best option to use if a simple debug view is required. Selecting -O0 typically increases the size of the ELF image by 7 to 15%. To reduce the size of your debug tables, use the --remove_unneeded_entities option.
Related information
Non-ConfidentialPDF file icon PDF versionARM DUI0375H
Copyright © 2007, 2008, 2011, 2012, 2014-2016 ARM. All rights reserved. 
  Arm logo
Important information

This site uses cookies to store information on your computer. By continuing to use our site, you consent to our cookies.

Change Settings

Privacy Policy Update

Arm’s Privacy Policy has been updated. By continuing to use our site, you consent to Arm’s Privacy Policy. Please review our Privacy Policy to learn more about our collection, use and transfers
of your data.