Keil Logo Arm Logo

Technical Support

On-Line Manuals

Compiler User Guide

Conventions and Feedback Overview of the Compiler Getting Started with the Compiler Compiler Features Compiler Coding Practices The compiler as an optimizing compiler Compiler optimization for code size versus speed Compiler optimization levels and the debug view Selecting the target CPU at compile time Optimization of loop termination in C code Loop unrolling in C code Compiler optimization and the volatile keyword Code metrics Code metrics for measurement of code size and data Stack use in C and C++ Benefits of reducing debug information in objects Methods of reducing debug information in objects a Guarding against multiple inclusion of header file Methods of minimizing function parameter passing o Functions that return multiple values through regi Functions that return the same result when called Comparison of pure and impure functions Recommendation of postfix syntax when qualifying f Inline functions Compiler decisions on function inlining Automatic function inlining and static functions Inline functions and removal of unused out-of-line Automatic function inlining and multifile compilat Restriction on overriding compiler decisions about Compiler modes and inline functions Inline functions in C++ and C90 mode Inline functions in C99 mode Inline functions and debugging Types of data alignment Advantages of natural data alignment Compiler storage of data objects by natural byte a Relevance of natural data alignment at compile tim Unaligned data access in C and C++ code The __packed qualifier and unaligned data access i Unaligned fields in structures Performance penalty associated with marking whole Unaligned pointers in C and C++ code Unaligned Load Register (LDR) instructions generat Comparisons of an unpacked struct, a __packed stru Compiler support for floating-point arithmetic Default selection of hardware or software floating Example of hardware and software support differenc Vector Floating-Point (VFP) architectures Limitations on hardware handling of floating-point Implementation of Vector Floating-Point (VFP) supp Compiler and library support for half-precision fl Half-precision floating-point number format Compiler support for floating-point computations a Types of floating-point linkage Compiler options for floating-point linkage and co Floating-point linkage and computational requireme Processors and their implicit Floating-Point Units Integer division-by-zero errors in C code About trapping integer division-by-zero errors wit About trapping integer division-by-zero errors wit Identification of integer division-by-zero errors Examining parameters when integer division-by-zero Software floating-point division-by-zero errors in About trapping software floating-point division-by Identification of software floating-point division Software floating-point division-by-zero debugging New language features of C99 New library features of C99 // comments in C99 and C90 Compound literals in C99 Designated initializers in C99 Hexadecimal floating-point numbers in C99 Flexible array members in C99 __func__ predefined identifier in C99 inline functions in C99 long long data type in C99 and C90 Macros with a variable number of arguments in C99 Mixed declarations and statements in C99 New block scopes for selection and iteration state _Pragma preprocessing operator in C99 Restricted pointers in C99 Additional <math.h> library functions in C99 Complex numbers in C99 Boolean type and <stdbool.h> in C99 Extended integer types and functions in <inttyp <fenv.h> floating-point environment access i <stdio.h> snprintf family of functions in C9 <tgmath.h> type-generic math macros in C99 <wchar.h> wide character I/O functions in C9 How to prevent uninitialized data from being initi Compiler Diagnostic Messages Using the Inline and Embedded Assemblers of the AR

Comparisons of an unpacked struct, a __packed struct, and a struct with individually __packed fields, and of a __packed struct and a #pragma packed struct

Comparisons of an unpacked struct, a __packed struct, and a struct with individually __packed fields, and of a __packed struct and a #pragma packed struct

These comparisons illustrate the differences between the methods of packing structures.

Show/hideComparison of an unpacked struct, a __packed struct, and a struct with individually __packed fields

The differences between not packing a struct, packing an entire struct, and packing individual fields of a struct are illustrated by the three implementations of a struct shown in Table 12.

Table 12. C code for an unpacked struct, a packed struct, and a struct with individually packed fields

Unpacked struct__packed struct__packed fields
struct foo
{
    char one;
    short two;
    char three;
    int four;
} c;
__packed struct foo
{
    char one;
    short two;
    char three;
    int four;
} c;
struct foo
{
    char one;
    __packed short two;
    char three;
    int four;
} c;

In the first implementation, the struct is not packed. In the second implementation, the entire structure is qualified as __packed. In the third implementation, the __packed attribute is removed from the structure and the individual field that is not naturally aligned is declared as __packed.

Table 13 shows the corresponding disassembly of the machine code produced by the compiler for each of the sample implementations of Table 12, where the C code for each implementation has been compiled using the option -O2.

Table 13. Disassembly for an unpacked struct, a packed struct, and a struct with individually packed fields

Unpacked struct__packed struct__packed fields
; r0 contains address of c
; char one
LDRB    r1, [r0, #0]
; short two
LDRSH   r2, [r0, #2]
; char three
LDRB    r3, [r0, #4]
; int four
LDR     r12, [r0, #8]
; r0 contains address of c
; char one
LDRB  r1, [r0, #0]
; short two
LDRB  r2, [r0, #1]
LDRSB r12, [r0, #2]
ORR   r2, r12, r2, LSL #8
; char three
LDRB  r3, [r0, #3]
; int four
ADD   r0, r0, #4
BL    __aeabi_uread4
; r0 contains address of c
; char one
LDRB  r1, [r0, #0]
; short two
LDRB  r2, [r0, #1]
LDRSB r12, [r0, #2]
ORR   r2, r12, r2, LSL #8
; char three
LDRB  r3, [r0, #3]
; int four
LDR   r12, [r0, #4]

Note

The -Ospace and -Otime compiler options control whether accesses to unaligned elements are made inline or through a function call. Using -Otime results in inline unaligned accesses. Using -Ospace results in unaligned accesses made through function calls.

In the disassembly of the unpacked struct in Table 13, the compiler always accesses data on aligned word or halfword addresses. The compiler is able to do this because the struct is padded so that every member of the struct lies on its natural size boundary.

In the disassembly of the __packed struct in Table 13, fields one and three are aligned on their natural size boundaries by default, so the compiler makes aligned accesses. The compiler always carries out aligned word or halfword accesses for fields it can identify as being aligned. For the unaligned field two, the compiler uses multiple aligned memory accesses (LDR/STR/LDM/STM), combined with fixed shifting and masking, to access the correct bytes in memory. The compiler calls the ARM Embedded Application Binary Interface (AEABI) runtime routine __aeabi_uread4 for reading an unsigned word at an unknown alignment to access field four because it is not able to determine that the field lies on its natural size boundary.

In the disassembly of the struct with individually packed fields in Table 13, fields one, two, and three are accessed in the same way as in the case where the entire struct is qualified as __packed. In contrast to the situation where the entire struct is packed, however, the compiler makes a word-aligned access to the field four. This is because the presence of the __packed short within the structure helps the compiler to determine that the field four lies on its natural size boundary.

Show/hideComparison of a __packed struct and a #pragma packed struct

The differences between a __packed struct and a #pragma packed struct are illustrated by the two implementations of a struct shown in Table 14.

Table 14. C code for a packed struct and a pragma packed struct

__packed struct#pragma packed struct


__packed struct foobar
{
    char x;
    short y[10];
};

short get_y0(struct foobar *s)
{
    // Unaligned-capable load
    return *s->y;
}
short *get_y(struct foobar *s)
{
    return s->y;    // Compile error
}
#pragma push
#pragma pack(1)
struct foobar
{
    char x;
    short y[10];
};
#pragma pop
short get_y0(struct foobar *s)
{
    // Unaligned-capable load
    return *s->y;
}
short *get_y(struct foobar *s)
{
    return s->y;    // No error
    // Potentially illegal unaligned load,
    // depending on use of result
}

In the first implementation, taking the address of a field in a __packed struct or a __packed field in a struct yields a __packed pointer, and the compiler will generate a type error if you try to implicitly cast this to a non-__packed pointer. In the second implementation, in contrast, taking the address of a field in a #pragma packed struct does not yield a __packed-qualified pointer. However, the field might not be properly aligned for its type, and dereferencing such an unaligned pointer results in Undefined behavior.

Copyright © 2007-2008, 2011-2012 ARM. All rights reserved.ARM DUI 0375D
Non-ConfidentialID062912

Keil logo

Arm logo
Important information

This site uses cookies to store information on your computer. By continuing to use our site, you consent to our cookies.