Add UTF-8 capabilities #3

@javierguerragiraldez

Lua 5.3 already includes some UTF-8 support:

  • The '\u{XXX}' escape embeds the UTF-8 encoding of a code point in string literals.
  • The %U conversion in lua_pushfstring.
  • The utf8 library (code point handling only, no Unicode semantics).
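
For reference, a minimal sketch of what the literal escape and the utf8 library give you today, assuming a stock Lua 5.3 (or later) interpreter. The sample string is only an illustration:

```lua
-- Requires Lua 5.3 or later.
local s = "caf\u{E9}"   -- U+00E9 is stored as two bytes: 0xC3 0xA9

print(#s)               -- length in bytes
print(utf8.len(s))      -- length in code points

-- utf8.codes yields (byte position, code point) pairs.
for pos, cp in utf8.codes(s) do
  print(pos, string.format("U+%04X", cp))
end

-- utf8.char builds the same bytes as the literal escape.
print(utf8.char(0x20AC) == "\u{20AC}")
```

Note that everything here works on code points only; nothing in the utf8 library knows about case, normalization, or grapheme clusters.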

Surprisingly, it does not seem to include:

  • %U in string.format
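
A small sketch of the gap and one possible workaround, assuming Lua 5.3 or 5.4 where string.format rejects %U. The workaround simply passes the result of utf8.char through %s; it is an illustration, not a proposal for the final API:

```lua
-- Requires Lua 5.3 or later.
-- In 5.3 and 5.4, string.format raises an error for the %U conversion.
local ok = pcall(string.format, "%U", 0x20AC)
print(ok)

-- Workaround: encode the code point with utf8.char and format it with %s.
local price = string.format("price: %s%d", utf8.char(0x20AC), 10)
print(price == "price: \u{20AC}10")
```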

Other things that could be managed by a separate / optional library:

  • conversion between different encodings (Windows still uses a mixture of UCS-2 and UTF-16)
  • collation
  • normalization, case folding
  • text boundaries
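
Of these, encoding conversion is the one that needs no large tables. As a rough sketch, assuming Lua 5.3 or later (for utf8.codes, string.pack, and integer bit operators), a UTF-8 to UTF-16LE converter could look like this. The function name utf8_to_utf16le is hypothetical, not an existing API:

```lua
-- Requires Lua 5.3 or later.
-- Hypothetical helper: converts a valid UTF-8 string to UTF-16LE bytes.
-- utf8.codes raises an error if the input is not valid UTF-8.
local function utf8_to_utf16le(s)
  local out = {}
  for _, cp in utf8.codes(s) do
    if cp >= 0x10000 then
      -- Code points above the BMP become a surrogate pair.
      cp = cp - 0x10000
      out[#out + 1] = string.pack("<I2I2",
        0xD800 + (cp >> 10),      -- high surrogate
        0xDC00 + (cp & 0x3FF))    -- low surrogate
    else
      out[#out + 1] = string.pack("<I2", cp)
    end
  end
  return table.concat(out)
end

print(#utf8_to_utf16le("A\u{20AC}\u{1F600}"))  -- byte length of the result
```

Collation, normalization, case folding, and text boundaries are a different matter, since they depend on the Unicode data tables.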

The most obvious objection to including these capabilities in the language is the need for big tables. I think it would be valuable to evaluate what the basic language can do to make a binding as transparent as possible, without a hard dependency.
