Text encoding

From Boktai Hacking Wiki
= Encoding tables =
* [https://git.sr.ht/~raphi/bokmagic/tree/master/item/resources/U3IE/encoding.tbl Boktai 1 (U)]
* [https://git.sr.ht/~raphi/bokasm/tree/master/item/bokasm/wh.sjs.tbl Boktai 3 (J)]
* Others: ''TODO''

Latest revision as of 21:14, 18 November 2024

Text is stored uncompressed in the Script directory. Each character is encoded as either 1 or 2 bytes.

  • 1 byte: Used for ASCII characters, i.e. codes <= 127 (0x7F). A few ASCII codes have been replaced by other characters; see the encoding tables for the exceptions.
  • 2 bytes: Used for special characters (e.g. arrows) and for non-English characters (e.g. Japanese). The 1st byte is between 0x80 and 0x85.

In other words, when reading a string from its start, a byte with its top bit set (>= 0x80) is the 1st byte of a 2-byte character. Otherwise, it is a 1-byte ASCII character.

Example: the bytes 0x41, 0x80, 0x25 encode the two characters A★ in Boktai 1 (U): 0x41 is the 1-byte character A, and 0x80 0x25 is the 2-byte character ★.
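
The top-bit rule above can be sketched in Python as follows. This is only a minimal illustration, not code from the games or from any existing tool: the function name split_characters is hypothetical, and the sketch only splits raw bytes into 1-byte and 2-byte units. It does not look anything up in the encoding tables, and it makes no assumption about string terminators or control codes, which this page does not describe.

```python
def split_characters(data: bytes) -> list[bytes]:
    """Split raw script bytes into 1-byte and 2-byte character units.

    Hypothetical helper: it applies only the top-bit rule described
    on this page and does not map units to actual characters.
    """
    units = []
    i = 0
    while i < len(data):
        if data[i] & 0x80:
            # Top bit set: this is the 1st byte of a 2-byte character.
            if i + 1 >= len(data):
                raise ValueError(f"truncated 2-byte character at offset {i}")
            units.append(data[i:i + 2])
            i += 2
        else:
            # Top bit clear: a 1-byte (ASCII-range) character.
            units.append(data[i:i + 1])
            i += 1
    return units


# The example from this page: 0x41, 0x80, 0x25 (A★ in Boktai 1 (U)).
units = split_characters(bytes([0x41, 0x80, 0x25]))
print([u.hex() for u in units])  # ['41', '8025']
```

To turn the units into text, each one would then be looked up in the encoding table for the game in question.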
