The ARM compiler is a highly optimizing compiler that targets both small code size and high performance. It performs optimizations common to other optimizing compilers: data-flow optimizations such as common sub-expression elimination, and loop optimizations such as loop combining and loop distribution. In addition, it performs a range of optimizations specific to ARM architecture-based processors.
Even though the compiler is highly optimizing, you can often significantly improve the performance of your C or C++ code by selecting the correct optimization criteria, target processor and architecture, and inlining options, and by adopting good RISC programming practices.
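To make two of the named optimizations concrete, the following sketch shows hand-written "before" and "after" forms of common sub-expression elimination and loop combining. It illustrates the general techniques only and does not show what the ARM compiler emits for any particular input. The function names are hypothetical, and the loop example assumes that the two arrays do not overlap.

```c
#include <stddef.h>

/* Common sub-expression elimination, "before" form:
   the product a * b appears twice in the source. */
int cse_before(int a, int b, int c, int d)
{
    return (a * b + c) + (a * b + d);
}

/* "After" form: the shared product is computed once and reused.
   An optimizing compiler can typically make this change itself. */
int cse_after(int a, int b, int c, int d)
{
    int t = a * b;
    return (t + c) + (t + d);
}

/* Loop combining, "before" form: two loops over the same index range. */
void scale_and_offset_before(int *x, int *y, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        x[i] = x[i] * 2;
    }
    for (i = 0; i < n; i++) {
        y[i] = y[i] + 1;
    }
}

/* "After" form: one loop does both updates, halving the loop overhead.
   Assumption: x and y refer to non-overlapping arrays. */
void scale_and_offset_after(int *x, int *y, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        x[i] = x[i] * 2;
        y[i] = y[i] + 1;
    }
}
```

Each pair of functions computes the same result. Loop distribution is the reverse of the second transformation: a single loop is split into several, which can help when the combined loop body needs more values than fit in registers.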