

3.5.2 Single precision data type for IEEE 754 arithmetic

A float value is 32 bits wide. The structure is:
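S (bit 31) | Exp (bits 30 to 23) | Frac (bits 22 to 0)
Figure 3-1 IEEE 754 single-precision floating-point format

The S field gives the sign of the number. It is 0 for positive, or 1 for negative.

The Exp field gives the exponent of the number, as a power of two. It is biased by 0x7F (127),
so that very small numbers have exponents near zero and very large
numbers have exponents near 0xFF (255). For example, an Exp of 0x7F (127) means the number
is at least 1.0 and less than 2.0, and an Exp of 0x80 (128) means the number is at least 2.0
and less than 4.0.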

The Frac field gives the fractional part of the number. It usually has an implicit 1 bit
on the front that is not stored, to save space. For example, if Exp is 0x7F, a Frac of all
zeros gives 1.0, and a Frac with only its most significant bit set gives 1.5.

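
The three fields described above can be pulled out of a float with shifts and masks. The following is a minimal sketch, assuming that float is a 32-bit IEEE 754 single-precision value on the target and that the fields sit at the standard positions (S in bit 31, Exp in bits 30 to 23, Frac in bits 22 to 0). The names single_fields and split_single are hypothetical and are not part of any ARM library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type, not part of any ARM library. */
typedef struct {
    uint32_t s;    /* sign: 0 for positive, 1 for negative */
    uint32_t exp;  /* biased exponent, 0 to 0xFF */
    uint32_t frac; /* 23-bit fraction; the implicit leading 1 is not stored */
} single_fields;

/* Assumes float is a 32-bit IEEE 754 single-precision value. */
static single_fields split_single(float x)
{
    uint32_t bits;
    single_fields f;

    memcpy(&bits, &x, sizeof bits);  /* copy the raw bit pattern */
    f.s    = bits >> 31;             /* top bit */
    f.exp  = (bits >> 23) & 0xFFu;   /* next 8 bits */
    f.frac = bits & 0x7FFFFFu;       /* low 23 bits */
    return f;
}
```

For instance, split_single(1.5f) would be expected to give s = 0, exp = 0x7F, and frac = 0x400000 (only the most significant fraction bit set).
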
In general, the numeric value of a bit pattern in this format is given by
the formula:
(-1)^{S} * 2^{(Exp - 0x7F)} * (1 + Frac * 2^{-23})
Numbers stored in this form are called normalized numbers.
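
As an illustration of the formula, the sketch below evaluates it directly from the three field values. It is only meant for normalized numbers (Exp from 1 to 254). The function names pow2 and normalized_value are hypothetical, and the power of two is computed by repeated scaling so that the example needs no math library.

```c
#include <stdint.h>

/* Hypothetical helper: 2 raised to an integer power, by repeated scaling. */
static double pow2(int n)
{
    double r = 1.0;
    for (; n > 0; n--) r *= 2.0;
    for (; n < 0; n++) r /= 2.0;
    return r;
}

/* (-1)^S * 2^(Exp - 0x7F) * (1 + Frac * 2^-23)
 * Valid only for normalized patterns, that is 1 <= exp <= 254. */
static double normalized_value(uint32_t s, uint32_t exp, uint32_t frac)
{
    double significand = 1.0 + (double)frac * pow2(-23);
    double magnitude = pow2((int)exp - 0x7F) * significand;
    return s ? -magnitude : magnitude;
}
```

With Exp = 0x7F, the Frac values 0x000000, 0x200000, 0x400000 and 0x600000 evaluate to 1.0, 1.25, 1.5 and 1.75 respectively.
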
The minimum and maximum exponent values, 0 and 255, are special
cases. Exponent 255 is used to represent infinity and to store Not
a Number (NaN) values. Infinity can occur as a result
of dividing by zero, or as a result of computing a value that is
too large to store in this format. NaN values are used for special
purposes. Infinity is stored by setting Exp to 255 and Frac to all
zeros. If Exp is 255 and Frac is nonzero, the bit pattern represents
a NaN.
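
The two Exp = 255 cases can be told apart by testing the Frac field. This is a minimal sketch operating on a raw 32-bit pattern; the function names are hypothetical, and the field positions are the standard IEEE 754 ones rather than anything specific to the ARM libraries.

```c
#include <stdint.h>

/* Nonzero if Exp is 255 and Frac is all zeros (plus or minus infinity). */
static int is_infinity_bits(uint32_t bits)
{
    return ((bits >> 23) & 0xFFu) == 0xFFu && (bits & 0x7FFFFFu) == 0;
}

/* Nonzero if Exp is 255 and Frac is nonzero (a NaN). */
static int is_nan_bits(uint32_t bits)
{
    return ((bits >> 23) & 0xFFu) == 0xFFu && (bits & 0x7FFFFFu) != 0;
}
```

For example, 0x7F800000 and 0xFF800000 are positive and negative infinity, while 0x7FC00000 is one of the many NaN patterns.
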
Exponent 0 can represent very small numbers in a special way.
If Exp is zero, then the Frac field has
no implicit 1 on the front. This means that the format can store
0.0, by setting both Exp and Frac to
all 0 bits. It also means that numbers that are too small to store
using Exp >= 1 are stored with less precision
than the ordinary 23 bits. These are called denormals.
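
The text above does not give a formula for the Exp = 0 case. The sketch below assumes the usual IEEE 754 rule that the scale is fixed at 2^-126 and the implicit 1 is dropped, so the value is (-1)^S * Frac * 2^-149. The function name denormal_value is hypothetical.

```c
#include <stdint.h>

/* Value of a pattern with Exp = 0, under the assumed IEEE 754 rule
 * (-1)^S * 2^-126 * (Frac * 2^-23) = (-1)^S * Frac * 2^-149. */
static double denormal_value(uint32_t s, uint32_t frac)
{
    double magnitude = (double)frac;
    int i;

    for (i = 0; i < 149; i++)
        magnitude /= 2.0;   /* each halving is exact in double precision */
    return s ? -magnitude : magnitude;
}
```

With Frac = 0 this gives 0.0, and with Frac = 1 it gives 2^-149 (about 1.4e-45), the smallest positive single-precision value. Because fewer Frac bits are significant as the value shrinks, the precision falls below the ordinary 23 bits.
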
