Annex D (normative) Compatibility features [depr]

D.20 Deprecated standard code conversion facets [depr.locale.stdcvt]

The header <codecvt> provides code conversion facets for various character encodings.

D.20.1 Header <codecvt> synopsis [depr.codecvt.syn]

namespace std {
  enum codecvt_mode {
    consume_header = 4,
    generate_header = 2,
    little_endian = 1
  };

  template<class Elem, unsigned long Maxcode = 0x10ffff, codecvt_mode Mode = (codecvt_mode)0>
    class codecvt_utf8 : public codecvt<Elem, char, mbstate_t> {
    public:
      explicit codecvt_utf8(size_t refs = 0);
      ~codecvt_utf8();
    };

  template<class Elem, unsigned long Maxcode = 0x10ffff, codecvt_mode Mode = (codecvt_mode)0>
    class codecvt_utf16 : public codecvt<Elem, char, mbstate_t> {
    public:
      explicit codecvt_utf16(size_t refs = 0);
      ~codecvt_utf16();
    };

  template<class Elem, unsigned long Maxcode = 0x10ffff, codecvt_mode Mode = (codecvt_mode)0>
    class codecvt_utf8_utf16 : public codecvt<Elem, char, mbstate_t> {
    public:
      explicit codecvt_utf8_utf16(size_t refs = 0);
      ~codecvt_utf8_utf16();
    };
}

D.20.2 Requirements [depr.locale.stdcvt.req]

For each of the three code conversion facets codecvt_utf8, codecvt_utf16, and codecvt_utf8_utf16:
  • Elem is the wide-character type, such as wchar_t, char16_t, or char32_t.
  • Maxcode is the largest wide-character code that the facet will read or write without reporting a conversion error.
  • If (Mode & consume_header), the facet shall consume an initial header sequence, if present, when reading a multibyte sequence to determine the endianness of the subsequent multibyte sequence to be read.
  • If (Mode & generate_header), the facet shall generate an initial header sequence when writing a multibyte sequence to advertise the endianness of the subsequent multibyte sequence to be written.
  • If (Mode & little_endian), the facet shall generate a multibyte sequence in little-endian order, as opposed to the default big-endian order.
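[Example: The effect of Maxcode can be sketched as follows. This is a minimal illustration, not normative text. It assumes an implementation that still ships <codecvt> together with the (also deprecated) std::wstring_convert, and it relies on wstring_convert reporting a failed conversion by throwing std::range_error when no error string was supplied. The helper name encodes_within is hypothetical.

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if every code point in `s` can be
// written as UTF-8 by a facet whose Maxcode is the template argument.
template <unsigned long Maxcode>
bool encodes_within(const std::u32string& s) {
    std::wstring_convert<std::codecvt_utf8<char32_t, Maxcode>, char32_t> conv;
    try {
        conv.to_bytes(s);
        return true;
    } catch (const std::range_error&) {
        // The facet reported a conversion error, for example because a
        // code point was larger than Maxcode.
        return false;
    }
}
```

With this sketch, encodes_within<0xff>(U"\u00E9") is expected to succeed, while encodes_within<0xff>(U"\u20AC") is expected to fail because U+20AC exceeds the chosen Maxcode. end example]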
For the facet codecvt_utf8:
  • The facet shall convert between UTF-8 multibyte sequences and UCS-2 or UTF-32 (depending on the size of Elem) within the program.
  • Endianness shall not affect how multibyte sequences are read or written.
  • The multibyte sequences may be written as either a text or a binary file.
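[Example: A possible round trip between UTF-8 and UTF-32 using codecvt_utf8 with Elem = char32_t. This is a sketch under the assumption that std::wstring_convert is available; the function names utf8_to_utf32 and utf32_to_utf8 are hypothetical and not part of the library.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: decode a UTF-8 byte string into UTF-32 code points.
std::u32string utf8_to_utf32(const std::string& bytes) {
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    return conv.from_bytes(bytes);
}

// Hypothetical helper: encode UTF-32 code points as a UTF-8 byte string.
std::string utf32_to_utf8(const std::u32string& text) {
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    return conv.to_bytes(text);
}
```

Because Elem is 32 bits wide here, the in-program form is UTF-32, so a code point outside the Basic Multilingual Plane occupies a single element. end example]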
For the facet codecvt_utf16:
  • The facet shall convert between UTF-16 multibyte sequences and UCS-2 or UTF-32 (depending on the size of Elem) within the program.
  • Multibyte sequences shall be read or written according to the Mode flag, as set out above.
  • The multibyte sequences may be written only as a binary file. Attempting to write to a text file produces undefined behavior.
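[Example: The Mode flags can be illustrated by comparing the bytes produced with and without them. This is a sketch, assuming std::wstring_convert is available and that the header sequence is the UTF-16 byte order mark; the names bom_le, utf16be, and utf16le_bom are hypothetical. The result is a byte string held in memory; when such bytes are written to a file, the stream would need to be opened in binary mode, per the requirement above.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// codecvt_mode is an unscoped enumeration, so combining enumerators
// with | yields an int that has to be cast back to the enumeration type.
constexpr std::codecvt_mode bom_le =
    static_cast<std::codecvt_mode>(std::generate_header | std::little_endian);

// Hypothetical helper: default Mode, i.e. big-endian with no header.
std::string utf16be(const std::u32string& text) {
    std::wstring_convert<std::codecvt_utf16<char32_t>, char32_t> conv;
    return conv.to_bytes(text);
}

// Hypothetical helper: little-endian output preceded by a header.
std::string utf16le_bom(const std::u32string& text) {
    std::wstring_convert<std::codecvt_utf16<char32_t, 0x10ffff, bom_le>,
                         char32_t> conv;
    return conv.to_bytes(text);
}
```

For the input U"A", utf16be is expected to yield the two bytes 00 41, and utf16le_bom the four bytes FF FE 41 00. end example]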
For the facet codecvt_utf8_utf16:
  • The facet shall convert between UTF-8 multibyte sequences and UTF-16 (one or two 16-bit codes) within the program.
  • Endianness shall not affect how multibyte sequences are read or written.
  • The multibyte sequences may be written as either a text or a binary file.
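[Example: Unlike codecvt_utf8, this facet represents a supplementary-plane character as a surrogate pair within the program. The following sketch assumes std::wstring_convert is available; the function name utf8_to_utf16 is hypothetical.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: decode a UTF-8 byte string into UTF-16 code units.
std::u16string utf8_to_utf16(const std::string& bytes) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.from_bytes(bytes);
}
```

The four-byte UTF-8 sequence F0 9F 98 80 (U+1F600) is expected to produce the two 16-bit codes 0xD83D and 0xDE00. end example]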
The encoding forms UTF-8, UTF-16, and UTF-32 are specified in ISO/IEC 10646.
The encoding form UCS-2 is specified in ISO/IEC 10646-1:1993.