Compiler User Guide


4.6 Optimization of loop termination in C code

Loops are a common construct in most programs. Because a significant amount of execution time is often spent in loops, it is worthwhile paying attention to time-critical loops.

The loop termination condition can cause significant overhead if written without caution. Where possible:
  • Use simple termination conditions.
  • Write count-down-to-zero loops.
  • Use counters of type unsigned int.
  • Test for equality against zero.
Following any or all of these guidelines, separately or in combination, is likely to result in better code.
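As a minimal sketch of the four guidelines applied together, the hypothetical function sum_down below (the name and the array-summing task are assumptions, not part of the original guide) uses an unsigned int counter that starts at the iteration count, counts down, and is tested for inequality with zero:

```c
/* Hypothetical example: sum the first n elements of data.
 * The counter starts at n and counts down to zero, so the loop
 * needs no separate limit value to compare against. */
int sum_down(const int *data, unsigned int n)
{
    int total = 0;
    unsigned int i;

    for (i = n; i != 0; i--)
        total += data[i - 1];   /* i runs n..1, so the index runs n-1..0 */

    return total;
}
```

Note that with an unsigned counter the test must be i != 0 (or i > 0). A test of i >= 0 is always true for an unsigned type, so such a loop would never terminate.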
The following table shows two sample implementations of a routine to calculate n! that together illustrate loop termination overhead. The first implementation calculates n! using an incrementing loop, while the second routine calculates n! using a decrementing loop.

Table 4-1 C code for incrementing and decrementing loops

Incrementing loop:

int fact1(int n)
{
    int i, fact = 1;
    for (i = 1; i <= n; i++)
        fact *= i;
    return (fact);
}

Decrementing loop:

int fact2(int n)
{
    unsigned int i, fact = 1;
    for (i = n; i != 0; i--)
        fact *= i;
    return (fact);
}

The following table shows the corresponding disassembly of the machine code produced by the compiler for each of the sample implementations above, where the C code for both implementations has been compiled using the options -O2 -Otime.

Table 4-2 C Disassembly for incrementing and decrementing loops

Incrementing loop:

fact1 PROC
    MOV      r2, r0
    MOV      r0, #1
    CMP      r2, #1
    MOV      r1, r0
    BXLT     lr
|L1.20|
    MUL      r0, r1, r0
    ADD      r1, r1, #1
    CMP      r1, r2
    BLE      |L1.20|
    BX       lr
    ENDP

Decrementing loop:

fact2 PROC
    MOVS     r1, r0
    MOV      r0, #1
    BXEQ     lr
|L1.12|
    MUL      r0, r1, r0
    SUBS     r1, r1, #1
    BNE      |L1.12|
    BX       lr
    ENDP

Comparing the disassemblies shows that the ADD and CMP instruction pair in the incrementing loop has been replaced with a single SUBS instruction in the decrementing loop. Because SUBS updates the condition flags, the BNE that follows tests the result against zero directly and no separate compare instruction is needed.
In addition to saving an instruction in the loop, the variable n does not have to be kept live across the loop, so the decrementing loop also uses one register fewer. This eases register allocation. The saving is even greater if the original termination condition involves a function call, because the call is then made on every iteration. For example:
for (...; i < get_limit(); ...);
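The cost of a function call in the termination condition can be made visible with a small sketch. Everything here is hypothetical: get_limit() is a stand-in that returns a fixed value, and limit_calls is a counter added only to show how often it is called. The rewrite is valid only when the limit does not change while the loop runs and the loop body does not depend on the call being repeated.

```c
/* Hypothetical stand-in for the get_limit() call in the text.
 * limit_calls records how many times it has been invoked. */
unsigned int limit_calls = 0;

unsigned int get_limit(void)
{
    limit_calls++;
    return 8u;
}

/* Limit re-evaluated in the termination test: one call per test. */
unsigned int count_up(void)
{
    unsigned int i, work = 0;

    for (i = 0; i < get_limit(); i++)
        work += 1;
    return work;
}

/* Limit read once to initialize the counter, then count down to zero. */
unsigned int count_down(void)
{
    unsigned int i, work = 0;

    for (i = get_limit(); i != 0; i--)
        work += 1;
    return work;
}
```

Both functions perform eight iterations, but count_up() calls get_limit() nine times (once per test, including the final failing test), while count_down() calls it once.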
The technique of initializing the loop counter to the number of iterations required, and then decrementing down to zero, also applies to while and do statements.
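A sketch of the same count-down technique in while and do form follows. The function names fact_while and fact_do are hypothetical and chosen only to mirror the n! example above.

```c
/* Hypothetical while-loop version of the decrementing n! routine. */
unsigned int fact_while(unsigned int n)
{
    unsigned int fact = 1;

    while (n != 0) {
        fact *= n;
        n--;
    }
    return fact;
}

/* Hypothetical do-loop version. A do loop always executes its body
 * at least once, so the n == 0 case is guarded explicitly. */
unsigned int fact_do(unsigned int n)
{
    unsigned int fact = 1;

    if (n != 0) {
        do {
            fact *= n;
        } while (--n != 0);
    }
    return fact;
}
```

In both forms the parameter n itself serves as the counter, so no separate limit has to be kept for the termination test.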
Non-Confidential ARM DUI0375H
Copyright © 2007, 2008, 2011, 2012, 2014-2016 ARM. All rights reserved. 