Namespace Pot.UTF8
Defined in: <pot.js>.
| Constructor Attributes | Constructor Name and Description |
|---|---|
|
UTF-8 and UTF-16 utilities.
|
| Method Attributes | Method Name and Description |
|---|---|
| <static> |
Pot.UTF8.byteOf(string)
Gets the byte size of string as UTF-8.
|
| <static> |
Pot.UTF8.convertEncodingToUnicode(data, (from))
Convert encoding to Unicode string.
|
| <static> |
Pot.UTF8.decode(string)
Convert to UTF-16 string from UTF-8 string.
|
| <static> |
Pot.UTF8.encode(string)
Convert to UTF-8 string from UTF-16 string.
|
Namespace Detail
Pot.UTF8
UTF-8 and UTF-16 utilities.
Mutual conversion between UTF-8 and UTF-16.
RFC 2044, RFC 2279: UTF-8, a transformation format of ISO 10646
- See:
- http://www.ietf.org/rfc/rfc2279.txt
Note that using "encodeURIComponent" or "decodeURIComponent" to
convert a string that includes surrogate pair or characters
U+FFFE or U+FFFF then will raise URIError.
U+FFFF and U+FFFE will convert unexpect result on SpiderMonkey.
This methods implements convertion functions for
UTF-8 and UTF-16 compatible with calling of
"unescape(encodeURIComponent(string))" and
"decodeURIComponent(escape(string))".
Example: decodeURIComponent(encodeURIComponent('\uFFFF')) === '\uFFFF'; Results: false (SpiderMonkey) Example: decodeURIComponent(encodeURIComponent('\uD811')) === '\uD811'; Results: URIError
Method Detail
<static>
{Function}
Pot.UTF8.byteOf(string)
Gets the byte size of string as UTF-8.
var string = 'abc123あいうえお'; var length = string.length; var byteSize = Pot.UTF8.byteOf(string); debug(string + ' : length = ' + length + ', byteSize = ' + byteSize); // @results // length = 11 // byteSize = 21
- Parameters:
- {String} string
- The target string.
- Returns:
- {Number} The UTF-8 byte size of string.
<static>
{Function}
Pot.UTF8.convertEncodingToUnicode(data, (from))
Convert encoding to Unicode string.
This function requires BlobBuilder and FileReader API.
If environment not supported HTML5 API, it will be raised by Deferred.
// 'こんにちは。ほげほげ'
var unicode = [
12371, 12435, 12395, 12385, 12399, 12290,
12411, 12370, 12411, 12370
];
// Shift_JIS: 'こんにちは。ほげほげ'
var sjis = [
130, 177, 130, 241, 130, 201,
130, 191, 130, 205, 129, 66,
130, 217, 130, 176, 130, 217,
130, 176
];
// EUC-JP: 'こんにちは。ほげほげ'
var eucjp = [
164, 179, 164, 243, 164, 203,
164, 193, 164, 207, 161, 163,
164, 219, 164, 178, 164, 219,
164, 178
];
// UTF-8: 'こんにちは。ほげほげ'
var utf8 = [
227, 129, 147, 227, 130, 147,
227, 129, 171, 227, 129, 161,
227, 129, 175, 227, 128, 130,
227, 129, 187, 227, 129, 146,
227, 129, 187, 227, 129, 146
];
Pot.convertEncodingToUnicode(sjis, 'Shift_JIS').then(function(res) {
Pot.debug('SJIS to Unicode:');
Pot.debug(res); // 'こんにちは。ほげほげ'
}).then(function() {
return Pot.convertEncodingToUnicode(eucjp, 'EUC-JP').
then(function(res) {
Pot.debug('EUC-JP to Unicode:');
Pot.debug(res); // 'こんにちは。ほげほげ'
});
}).then(function() {
return Pot.convertEncodingToUnicode(utf8, 'UTF-8').
then(function(res) {
Pot.debug('UTF-8 to Unicode:');
Pot.debug(res); // 'こんにちは。ほげほげ'
});
});
- Parameters:
- {TypedArray|Array|Blob} data
- The target data.
- {(String)} (from)
- (optional) Character encoding from.
- Returns:
- {Pot.Deferred} A new instance of Pot.Deferred that has Unicode string.
<static>
{Function}
Pot.UTF8.decode(string)
Convert to UTF-16 string from UTF-8 string.
var string = 'hogeほげ';
var encoded = Pot.utf8Encode(string);
var decoded = Pot.utf8Decode(encoded);
var toCharCode = function(s) {
return Pot.map(s.split(''), function(c) {
return c.charCodeAt(0);
});
};
Pot.debug(toCharCode(encoded));
// [104, 111, 103, 101, 227, 129, 187, 227, 129, 146]
Pot.debug(decoded); // 'hogeほげ'
Pot.debug(decoded === string); // true
- Parameters:
- {String} string
- UTF-8 string.
- Returns:
- {String} UTF-16 string.
<static>
{Function}
Pot.UTF8.encode(string)
Convert to UTF-8 string from UTF-16 string.
var string = 'hogeほげ';
var encoded = Pot.utf8Encode(string);
var decoded = Pot.utf8Decode(encoded);
var toCharCode = function(s) {
return Pot.map(s.split(''), function(c) {
return c.charCodeAt(0);
});
};
Pot.debug(toCharCode(encoded));
// [104, 111, 103, 101, 227, 129, 187, 227, 129, 146]
Pot.debug(decoded); // 'hogeほげ'
Pot.debug(decoded === string); // true
- Parameters:
- {String} string
- UTF-16 string.
- Returns:
- {String} UTF-8 string.