Shift-JIS and UTF-8 implementation
Table 7 lists the functions that select the Shift-JIS (Japanese characters)
or UTF-8 (Unicode characters) locale categories.
Table 7. Default Shift-JIS and UTF-8 locales
| Function | Description |
|---|---|
| __use_sjis_ctype | Sets the character set to the Shift-JIS multibyte encoding of Japanese characters |
| __use_utf8_ctype | Sets the character set to the UTF-8 multibyte encoding of all Unicode characters |
The following list describes the effects of Shift-JIS and
UTF-8 encoding:
- The ordinary ctype functions behave correctly on any byte value that is a
  self-contained character in Shift-JIS. For example, half-width katakana
  characters that Shift-JIS encodes as single bytes between 0xA6 and 0xDF are
  treated as alphabetic by isalpha().
- UTF-8 encoding uses the same set of self-contained characters as the ASCII
  character set.
- The multibyte conversion functions, such as mbrtowc(), mbsrtowcs(), and
  wcrtomb(), all convert between wide strings in Unicode and multibyte
  character strings in Shift-JIS or UTF-8.
- printf("%ls") converts a Unicode wide string into Shift-JIS or UTF-8 output,
  and scanf("%ls") converts Shift-JIS or UTF-8 input into a Unicode wide string.
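
The conversion behavior described in the list above can be sketched using only standard C calls. This is a minimal illustration, not code from the library: the helper names count_multibyte_chars() and wide_to_multibyte() are hypothetical. Both helpers depend on whichever LC_CTYPE locale is active, so with a Shift-JIS or UTF-8 locale selected the same code would be expected to handle multibyte sequences, while in the default "C" locale it handles only single-byte characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in a multibyte string by
   decoding it one character at a time with mbrtowc().
   Returns (size_t)-1 if the string is not valid in the active locale. */
size_t count_multibyte_chars(const char *s)
{
    assert(s != NULL);

    mbstate_t state;
    memset(&state, 0, sizeof state);   /* initial conversion state */

    size_t remaining = strlen(s);
    size_t count = 0;

    while (remaining > 0) {
        wchar_t wc;
        size_t used = mbrtowc(&wc, s, remaining, &state);

        if (used == (size_t)-1 || used == (size_t)-2)
            return (size_t)-1;         /* invalid or truncated sequence */
        if (used == 0)
            break;                     /* decoded a null character */

        s += used;
        remaining -= used;
        count++;
    }
    return count;
}

/* Hypothetical helper: convert a wide string to a multibyte string by
   formatting it with "%ls".
   Returns the number of bytes written (excluding the terminator),
   or -1 on a conversion error or if the output buffer is too small. */
int wide_to_multibyte(char *out, size_t out_size, const wchar_t *ws)
{
    assert(out != NULL && ws != NULL);

    int n = snprintf(out, out_size, "%ls", ws);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

The byte count returned by wide_to_multibyte() and the character count returned by count_multibyte_chars() are equal only when every character is encoded as a single byte. For multibyte text the byte count is larger.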
You can arbitrarily switch between multibyte locales and single-byte
locales at runtime if you include more than one in your application.
By default, only one locale at a time is included.
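
As a rough sketch of switching at runtime, the following assumes the switch is made through the standard setlocale() function with the LC_CTYPE category. The helper name switch_ctype_locale() is hypothetical, and this section does not state which locale name strings the library accepts, so the name is passed in as a parameter rather than hard-coded.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: try to make the named locale the active LC_CTYPE
   locale. Returns 1 on success. Returns 0 if the request cannot be
   honored (for example, because that locale was not included in the
   application), in which case the active locale is left unchanged. */
int switch_ctype_locale(const char *name)
{
    assert(name != NULL);
    return setlocale(LC_CTYPE, name) != NULL;
}
```

Checking the return value matters here, because a request for a locale that is not present fails without changing the active locale.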
See also