iXML Community Group Test Suite

21 Jun 2022 (28 Jun 2022)

Top-level catalog for tests in the iXML Community Group Test Suite.

Tests have been contributed from several sources, but the core of the collection is the set of tests contributed by Steven Pemberton in December 2021.

Error tests

28 Jun 2022

Tests intended to demonstrate errors that processors are required to raise.

non-XML-char-in-input-output-errors

Created 20 Jun 2022 by MSM

Follow-up to invalid-char. Tests serialization of non-XML characters in a different way.

In this test set, the grammar accepts a range of non-XML characters and hides some of them, but not all. Input in which only 'hidden' control characters are present produces output; input in which non-hidden controls are present should produce a dynamic error.
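The distinction between the two outcomes can be sketched in Python. This is a minimal illustration, not part of the test suite: `is_xml_char` follows the XML 1.0 `Char` production, while `SHOWN` and `expected_outcome` are hypothetical names introduced here. The sketch assumes the input consists only of characters the grammar accepts (#00 through #7E), and it treats CR as harmless because CR is shown but is itself an XML character.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# C0 code points that the grammar leaves unmarked (no "-"), i.e. shown.
SHOWN = {0x02, 0x03, 0x05, 0x07, 0x0B, 0x0D, 0x11, 0x13, 0x17, 0x1D, 0x1F}


def expected_outcome(text: str) -> str:
    """Hypothetical helper: predict the outcome for an accepted input.

    A shown character that is not an XML character cannot be serialized,
    so its presence is assumed to lead to a dynamic error.
    """
    for ch in text:
        cp = ord(ch)
        if cp in SHOWN and not is_xml_char(cp):
            return "dynamic error"
    return "output"
```

Under these assumptions, an input containing only SOH, RS, EOT, and printable text maps to "output", while any input containing STX (#02) maps to "dynamic error".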

Invisible XML Grammar

{ Grammar for experiments with non-XML characters in input.

  Some input characters are to be shown, some to be hidden.
  We choose to show the characters with prime character
  numbers and hide those with composite numbers.

  Since we need non-XML characters in the input, and the only 
  non-XML characters we can make are in the C0 range, the 
  input and output are likely to be challenging to check.
  
  You have been warned.
}
S: (show; hide; printable)*.

NUL: -#00 { null }.
SOH: -#01 { start of header }.
STX:  #02 { start of text }.
ETX:  #03 { end of text }.
EOT: -#04 { end of transmission }.
ENQ:  #05 { enquiry ('are you there?') }.
ACK: -#06 { acknowledgement ('yes i am here') }.
BEL:  #07 { bell }.
BS:  -#08 { backspace }.
HT:  -#09 { horizontal tab, an XML character, but 
           not a prime, so it's in the 'hide' category }.
LF:  -#0A { linefeed, also an XML character }.
VT:   #0B { vertical tabulation }.
FF:  -#0C { form feed }.
CR:   #0D { carriage return (an XML character) }.
SO:  -#0E { shift out (to alternate character set) }.
SI:  -#0F { shift in (from alternate character set) }.
DLE: -#10 { data link escape }.
DC1:  #11 { device-control 1 / XON }.
DC2: -#12 { device-control 2 }.
DC3:  #13 { device-control 3 / XOFF }.
DC4: -#14 { device-control 4 }.
NAK: -#15 { negative acknowledgement }.
SYN: -#16 { synchronous idle }.
ETB:  #17 { end of transmission block / end of paragraph }.
CAN: -#18 { cancel }.
EM:  -#19 { end of medium / em-space / beginning of paragraph }.
SUB: -#1A { substitute (often used for end of file) }.
ESC: -#1B { escape }.
FS:  -#1C { file separator }.
GS:   #1D { group separator }.
RS:  -#1E { record separator }.
US:   #1F { unit separator }.

show: STX; ETX; ENQ; BEL; VT; CR; DC1; DC3; ETB; GS; US.
hide:  NUL; SOH; EOT; ACK; BS; HT; LF; FF; SO; SI; DLE; DC2;
        DC4; NAK; SYN; CAN; EM; SUB; ESC; FS; RS.

-printable: [#20 - #7E].
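
The prime/composite rule described in the grammar's opening comment can be checked mechanically. The following Python sketch is an illustration only; `is_prime` and `C0_NAMES` are names introduced here, not part of the test suite. It rebuilds the `show` and `hide` lists from the code point numbers. Note that 0 and 1 are neither prime nor composite; the grammar places NUL and SOH in the `hide` category, and the sketch does the same by treating every non-prime as hidden.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for the range 0..31."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


# Nonterminal names from the grammar, indexed by code point #00..#1F.
C0_NAMES = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]

# Prime code points are shown; everything else is hidden.
show = [name for cp, name in enumerate(C0_NAMES) if is_prime(cp)]
hide = [name for cp, name in enumerate(C0_NAMES) if not is_prime(cp)]

print("show:", "; ".join(show))
print("hide:", "; ".join(hide))
```

The two printed lists should agree with the alternatives of the `show` and `hide` rules in the grammar above.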

Test case: SOH-RS-EOT

Repository URI: …/tests/error/test-catalog.xml

Created 20 Jun 2022 by MSM

Input string

String value cannot be inlined in web page.

Input contains SOH, RS, and EOT control characters.
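
Because the control characters in such an input are invisible in most viewers, it can help to map them to the Unicode Control Pictures block (U+2400 to U+241F) before inspecting a file. The Python sketch below shows one way to do this; the function name `visible` and the `sample` string are hypothetical, and the sample is not the actual test input.

```python
def visible(text: str) -> str:
    """Replace each C0 control character with its Unicode control picture."""
    return "".join(
        chr(0x2400 + ord(ch)) if ord(ch) < 0x20 else ch
        for ch in text
    )


# Hypothetical sample containing SOH, LF, RS, and EOT; not the real test input.
sample = "\x01first line\nsecond \x1e line\x04"
print(visible(sample))
```

Line feeds are also replaced by their control picture here, so the whole input appears on one line.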

Expected result

<S>
   <hide>
      <SOH/>
   </hide>This file begins with a start-of-header (SOH, ^A) character,<hide>
      <LF/>
   </hide>followed by some lines of text,<hide>
      <LF/>
   </hide>a record-separacter character (RS, ^^, right &gt;<hide>
      <RS/>
   </hide>&lt; here),<hide>
      <LF/>
   </hide>and then more text<hide>
      <LF/>
   </hide>and finally at the end an end-of-transmission (EOT, ^D) character.<hide>
      <EOT/>
   </hide>
</S>