Optimization of loop termination in
C code
Loops are a common construct in most programs. Because a significant
amount of execution time is often spent in loops, it is worthwhile
paying attention to time-critical loops.
The loop termination condition can cause significant overhead
if written without caution. Where possible:
use simple termination
conditions
write count-down-to-zero loops
use counters of type unsigned int
test for equality against zero.
Following any or all of these guidelines, separately or in
combination, is likely to result in better code.
Table 3 shows
two sample implementations of a routine to calculate n! that
together illustrate loop termination overhead. The first implementation
calculates n! using an incrementing loop, while
the second routine calculates n! using a decrementing
loop.
Table 3. C code for incrementing and decrementing loops
| Incrementing loop | Decrementing loop |
|---|
int fact1(int n)
{
int i, fact = 1;
for (i = 1; i <= n; i++)
fact *= i;
return (fact);
}
|
int fact2(int n)
{
unsigned int i, fact = 1;
for (i = n; i != 0; i--)
fact *= i;
return (fact);
}
|
Table 4 shows
the corresponding disassembly of the machine code produced by the
compiler for each of the sample implementations of Table 3, where the C code
for both implementations has been compiled using the options -O2 -Otime.
Table 4. C Disassembly for incrementing and decrementing loops
| Incrementing loop | Decrementing loop |
|---|
fact1 PROC
MOV r2, r0
MOV r0, #1
CMP r2, #1
MOV r1, r0
BXLT lr
|L1.20|
MUL r0, r1, r0
ADD r1, r1, #1
CMP r1, r2
BLE |L1.20|
BX lr
ENDP
|
fact2 PROC
MOVS r1, r0
MOV r0, #1
BXEQ lr
|L1.12|
MUL r0, r1, r0
SUBS r1, r1, #1
BNE |L1.12|
BX lr
ENDP
|
Comparing the disassemblies of Table 4 shows that the ADD and CMP instruction
pair in the incrementing loop disassembly has been replaced with
a single SUBS instruction in the decrementing
loop disassembly. This is because a compare with zero can be used
instead.
In addition to saving an instruction in the loop, the variable n does
not have to be saved across the loop, so the use of a register is
also saved in the decrementing loop disassembly. This eases register
allocation. It is even more important if the original termination
condition involves a function call. For example:
for (...; i < get_limit(); ...);
The technique of initializing the loop counter to the number
of iterations required, and then decrementing down to zero, also
applies to while and do statements.
See also