UTF-16

  1. The character is specified as one or two UTF-16 code units in hexadecimal notation.

    这个字符指定为十六进制形式的一个或两个UTF-16编码单元。

  2. In addition, there is another encoding scheme called UTF-16 that can also be used to represent supplementary characters.

    另外,还有一种称为UTF-16的编码方案,它也可以用来表示补充字符。

  3. The input file should be encoded in UTF-8 or UTF-16 format.

    应该将输入文件编码为UTF-8或UTF-16格式。

  4. The DOMString type is explicitly specified to consist of wide UTF-16 characters.

    DOMString类型被显式指定包括宽UTF-16字符。

  5. Other Unicode formats such as UTF-16 tend to contain numerous zero bytes.

    其他Unicode格式如UTF-16往往包含很多零字节。

  6. First, unlike UTF-16, UTF-8 has no endianness issues.

    首先,与UTF-16相比,UTF-8没有endianness问题。

  7. UTF-8 was chosen as the default format for character data columns, with UTF-16 for graphic data columns.

    UTF-8被选择为字符数据列的默认格式,其中UTF-16用于图形数据列。

  8. This method completely ignores all the encoding information available, and the returned string is always encoded in UTF-16.

    这个方法完全忽略所有可用的编码信息,所返回的字符串总是用UTF-16编码的。

  9. Several encodings are used for Unicode: the two most popular are UTF-8 and UTF-16.

    有几种编码可用于Unicode:最为常用的两个是UTF-8和UTF-16。

  10. Compressed UTF-8 will likely be close in size to compressed UTF-16, regardless of the initial size difference.

    压缩后,UTF-8和UTF-16的大小差不多,不论原始大小相差多少。

  11. UTF-16 is a variable-width character encoding, once surrogate pairs are taken into account.

    如果考虑到替换对,UTF-16是一种变长字符编码。

  12. For instance, the Java™ language's internal representation of strings is based on UTF-16, which makes indexing into the string much faster.

    比如,Java™语言中字符串的内部表示采用UTF-16,因此对字符串的索引更快。

  13. But even when you're encoding CJK XML in UTF-8, the actual size gain compared to UTF-16 probably isn't so large.

    但即使用UTF-8编码CJKXML,实际的大小可能也比UTF-16小。

  14. Xerces-C++ uses this larger character representation to exchange text as UTF-16 as opposed to UTF-8 or ISO-8859.

    Xerces-C++使用更大的字符表示UTF-16而不是UTF-8或者ISO-8859交换文本。

  15. This paper presents a "Fake UTF-16" coding algorithm, so that all XML parsers can handle GB code in an easy and universal fashion.

    本文提出伪UTF16编解码算法,为XML中文数据的解析提供了简单、通用的方法。

  16. Thus the encoding attribute of the manually transcoded XML string is still "UTF-16" instead of "Big5".

    因此这个以手工方式转码的XML字符串的编码属性仍然是“UTF-16”而不是“Big5”。

  17. In UTF-16, you don't always know whether the byte "0x41" is the letter "A".

    在UTF-16中,就不能确定字节“0x41”是不是字母“A”。

  18. UTF-8 is less likely than UTF-16 or other Unicode encodings to cause problems for systems that are unaware of Unicode and XML.

    与UTF-16或其他Unicode编码相比,对于不支持Unicode和XML的系统,UTF-8更不容易造成问题。

  19. Google doesn't even allow alternate encodings of Unicode such as UTF-16, much less non-Unicode encodings like ISO-8859-1.

    Google甚至不允许其他Unicode编码(如UTF-16),更不用说ISO-8859-1这样的非Unicode编码了。

  20. Unicode defines character encodings in three distinct sizes (UTF-8, UTF-16, and UTF-32), while the traditional character type is 8 bits.

    Unicode用三种不同的大小定义字符编码UTF-8、UTF-16和UTF-32而传统的字符类型是8位的。

  21. The third subclause gives fixed-point pseudo-code for the remaining modules of the coder. GB Support of XML Parser Using "Fake UTF-16" Coding Algorithm

    第3小节给出用于编码器的保持系数的定点伪码。一个解决XML解析器对中文数据处理的伪UTF-16编码算法

  22. IDENTITY_16BIT collator implements CESU-8 (an 8-bit compatibility encoding scheme for UTF-16).

    IDENTITY16BIT排序器实现CESU-8(一种8位的兼容UTF-16的编码方案)。

  23. For instance, if UTF-16 data is naively loaded into a C string, the string may be truncated on the second byte of the first ASCII character.

    比方说,如果UTF-16数据原样加载到C字符串中,字符串可能从第一个ASCII字符的第二个字节截断。

  24. The character-based functions may need to convert the input data string to an intermediate Unicode code page, like UTF-16 or UTF-32, before its processing can be done.

    基于字符的函数可能需要将输入数据字符串转换为一个中间的UNICODE代码页,比如UTF-16或UTF-32,然后才能对它进行处理。

  25. Characters in the ASCII range occupy only half the space in UTF-8 that they do in some other encodings of Unicode, particularly UTF-16.

    与其他Unicode编码特别是UTF-16相比,在UTF-8中ASCII字符占用的空间只有一半。

  26. Omit the XML declaration, and use the UTF-8 encoding, or use a UTF-16 Unicode Byte Order Mark (BOM) at the beginning of your document.

    在文档的开头部分,省略XML声明,并使用UTF-8编码,或者使用UTF-16Unicode字节顺序标记(ByteOrderMark,BOM)。

  27. Note this only does something if the string has a BOM, otherwise it is assumed that the string isn't UTF-16 and it is returned unmodified.

    注意这个函数只在字符串拥有BOM时有效,否则它推测字符串不是UTF-16编码的而返回没有经过修改的原始值。

  28. There are others (UTF-16 and UTF-32, for example) defined by the Unicode Consortium, but UTF-8 is the best supported encoding for international character sets.

    Unicode协会还规定了其他一些编码方式(如UTF-16,UTF-32),但UTF-8是国际字符集支持得最好的一种。
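
Several of the sentences above state concrete properties of UTF-16 (surrogate pairs, zero bytes in ASCII text, byte order and the BOM, the size of ASCII characters compared with UTF-8). The following is a minimal sketch that checks those properties using Python's built-in codecs; it is not taken from any of the quoted sources, and the variable names are illustrative assumptions.

```python
import codecs

ascii_text = "A"
supplementary = "\U0001F600"  # a code point outside the Basic Multilingual Plane

# ASCII characters take 1 byte in UTF-8 but 2 bytes in UTF-16 (sentence 25).
utf8_len = len(ascii_text.encode("utf-8"))
utf16_len = len(ascii_text.encode("utf-16-le"))

# The same character has two byte orders, and each contains a zero byte
# (sentences 5, 6 and 23).
little_endian = ascii_text.encode("utf-16-le")  # b"A\x00"
big_endian = ascii_text.encode("utf-16-be")     # b"\x00A"

# A supplementary character needs a surrogate pair, i.e. two 16-bit code
# units, which is why UTF-16 is variable-width (sentences 1 and 11).
code_units = len(supplementary.encode("utf-16-le")) // 2

# The byte 0x41 is not always the letter "A": here it is half of U+4100
# (sentence 17).
contains_0x41 = 0x41 in "\u4100".encode("utf-16-be")

# Python's generic "utf-16" codec prepends a BOM when encoding and uses it
# to pick the byte order when decoding (sentences 26 and 27).
with_bom = ascii_text.encode("utf-16")
has_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
round_trip = with_bom.decode("utf-16")
```

The explicit `utf-16-le` and `utf-16-be` codecs are used wherever byte counts matter, because the generic `utf-16` codec adds two extra bytes for the BOM.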