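Software generally decides a text's character set and encoding in one of three ways: checking the signature at the start of the file, prompting the user to choose, or guessing according to a set of rules.

The most standard approach is to check the first few bytes of the text for a byte order mark (BOM). The leading bytes map to a charset/encoding as follows:

Leading bytes    Charset/encoding
EF BB BF         UTF-8
FE FF            UTF-16/UCS-2, big endian
FF FE            UTF-16/UCS-2, little endian
FF FE 00 00      UTF-32/UCS-4, little endian
00 00 FE FF      UTF-32/UCS-4, big endian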
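The header check described above can be sketched in Python roughly as follows. This is a minimal illustration, not a complete detector: the function name `detect_bom` and the returned codec names are assumptions made for this example, and the BOM constants come from the standard `codecs` module. The 4-byte UTF-32 signatures are tested before the 2-byte UTF-16 ones, because the UTF-32 little-endian signature (FF FE 00 00) begins with the UTF-16 little-endian signature (FF FE).

```python
import codecs
from typing import Optional

# Signatures ordered longest first, so that UTF-32 LE (FF FE 00 00)
# is not mistaken for UTF-16 LE (FF FE).
# The codec names on the right are this sketch's own choice of labels.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def detect_bom(data: bytes) -> Optional[str]:
    """Return an encoding name if data starts with a known BOM, else None.

    Hypothetical helper for illustration only.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A file without a signature yields `None`, in which case the software has to fall back on one of the other two methods: asking the user or guessing. Note also that the bytes FF FE 00 00 could in principle be UTF-16 little-endian text whose first character is U+0000; the sketch assumes UTF-32 in that case.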