Invisible XML Specification Errata

Version: 2023-05-09

Status

This document lists errata on the Invisible XML 1.0 Specification.

Proposed errata

E003: Ignore UTF-8 BOM

The second paragraph of the section titled “The Grammar” is changed to indicate that a UTF-8 BOM must be ignored.

A grammar is an optional prolog, followed by a sequence of one or more rules, surrounded and separated by spacing and comments. Spacing and comments are entirely optional, except that rules must be separated by at least one of either (error S01). If an input grammar encoded in UTF-8 begins with a byte order mark (BOM), the BOM must be ignored.

A new paragraph is added between the second and third paragraphs of the section titled “Parsing”.

If an input encoded in UTF-8 begins with a BOM, the BOM should be ignored.
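As an illustration only, one way a processor might ignore a leading UTF-8 BOM is sketched below in Python. The function names `strip_utf8_bom` and `decode_utf8_input` are hypothetical and are not part of the specification. The sketch assumes the grammar or input has already been identified as UTF-8 and is available as bytes.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if data.startswith(codecs.BOM_UTF8):  # the three bytes EF BB BF
        return data[len(codecs.BOM_UTF8):]
    return data


def decode_utf8_input(data: bytes) -> str:
    """Decode UTF-8 bytes to text, ignoring a single leading BOM."""
    return strip_utf8_bom(data).decode("utf-8")
```

Only one BOM at the very start is removed. A U+FEFF character that appears anywhere else is left in place and reaches the parser as ordinary input.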

Accepted errata

E002: A string cannot contain C0 or C1 control characters

Accepted as an erratum against 1.0 at the meeting of 10 January 2023.

The definition of a quoted string is changed to exclude the C0 and C1 control characters.

A string cannot contain any of the C0 or C1 control characters, including a line-break (error S11). The enclosing quote is represented in a string by doubling it; these two strings are identical: 'Isn''t it?' and "Isn't it?", as are these: "He said ""Don't!""" and 'He said "Don''t!"'.

The error S11 is amended to include the C0 and C1 control characters.

S11
It is an error if a string contains a C0 or C1 control character, including a line break.
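The amended S11 check could be sketched as follows. This is a minimal Python illustration, not text from the specification, and the function name `first_control_character` is hypothetical. It assumes C0 means U+0000 to U+001F and C1 means U+0080 to U+009F. The erratum does not say how U+007F (DEL) should be handled, so the sketch leaves it alone.

```python
from typing import Optional


def first_control_character(text: str) -> Optional[int]:
    """Return the index of the first C0 or C1 control character, or None.

    Assumed ranges: C0 is U+0000..U+001F, C1 is U+0080..U+009F.
    U+007F (DEL) is in neither range and is not reported here.
    """
    for index, ch in enumerate(text):
        code_point = ord(ch)
        if code_point <= 0x1F or 0x80 <= code_point <= 0x9F:
            return index
    return None
```

A processor following this approach would raise S11 whenever the function returns an index for the content of a quoted string. Line feed (U+000A), carriage return (U+000D), and next line (U+0085) all fall inside the assumed ranges, so line breaks need no separate check.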

Resolved errata

E001: The Unicode character class “LC”

Accepted as an erratum against 1.0 at the meeting of 11 October 2022. Resolved in the current specification on 15 October 2022.

The definition of the nonterminal letter is changed to ["A"-"Z" | "a"-"z"] to address the fact that recent versions of Unicode include the class “LC”.

A class is one or two letters, representing any character from the Unicode character category [Categories] of that name, which must exist (error S10). E.g. [Ll] matches any lower-case letter, [Ll; Lu] matches any upper- or lower-case character.

   -class: code.
    @code: capital, letter?.
 -capital: ["A"-"Z"].
  -letter: ["A"-"Z" | "a"-"z"].
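To show the effect of the revised rules, the Python sketch below checks a class code against the same shape (one capital, optionally followed by one letter of either case) and then against a list of Unicode general category names, which is the condition behind error S10. The names `CLASS_CODE`, `UNICODE_CATEGORIES`, `is_wellformed_class_code`, and `class_exists` are hypothetical, and the hard-coded category list is an assumption standing in for whatever Unicode data a real processor consults.

```python
import re

# Mirrors the revised rules: capital, letter?  with letter = ["A"-"Z" | "a"-"z"].
CLASS_CODE = re.compile(r"[A-Z][A-Za-z]?")

# Unicode general category names, long-established values plus the grouping "LC".
# Assumption: a real processor would take this list from its Unicode database.
UNICODE_CATEGORIES = frozenset(
    """
    L LC Lu Ll Lt Lm Lo
    M Mn Mc Me
    N Nd Nl No
    P Pc Pd Ps Pe Pi Pf Po
    S Sm Sc Sk So
    Z Zs Zl Zp
    C Cc Cf Cs Co Cn
    """.split()
)


def is_wellformed_class_code(code: str) -> bool:
    """True if code has the syntactic shape required by the grammar."""
    return CLASS_CODE.fullmatch(code) is not None


def class_exists(code: str) -> bool:
    """True if code is well formed and names a known category (else error S10)."""
    return is_wellformed_class_code(code) and code in UNICODE_CATEGORIES
```

With the revised definition of letter, "LC" is accepted syntactically and also names an existing category, while a code such as "Xx" is well formed but would raise S10 because no such category exists.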