iXML Community Group Test Suite

23 Oct 2023 (22 Nov 2023)

Top-level catalog for tests in the iXML Community Group Test Suite.

Tests have been contributed from several sources, but the core of the test collection is the set of tests contributed by Steven Pemberton in December 2021.

Tests producing parse trees

22 Nov 2022

Tests provided by Steven Pemberton in December 2021, with corrections of 21 December. Reorganized by Norm Tovey-Walsh, February 2022.

ixml tests

Created 16 Dec 2021 by SP

Updated 21 Dec 2021 by SP

Corrected input, grammar, or output for 5 tests

Updated 30 Dec 2021 by MSM

Updated catalog, corrected many tests.

Updated 30 May 2022 by MSM

Added whitespace-and-delimiters test sets.

unicode-classes

Created 13 Jun 2023 by SP

Updated 15 Jun 2023 by NDW

Added to the test catalog

Updated 16 Nov 2023 by MSM

Added dependencies to mark this as requiring Unicode 14.0 or later

Depends on Unicode version 14.0, 15.0, or 15.1.
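
A test runner might use the dependency above to decide whether to run or skip this test set. The following is a minimal sketch in Python, not part of the test suite: the function name `unicode_dependency_met` and the set `SUPPORTED_UNICODE_VERSIONS` are hypothetical, and the check compares against the Unicode version of Python's own `unicodedata` module, which is only a stand-in for whatever an iXML processor actually supports.

```python
import unicodedata

# Versions listed by the catalog entry as acceptable dependencies.
SUPPORTED_UNICODE_VERSIONS = {"14.0", "15.0", "15.1"}


def unicode_dependency_met(version: str = unicodedata.unidata_version) -> bool:
    """Return True if `version` (e.g. "15.0.0") matches a listed major.minor."""
    major_minor = ".".join(version.split(".")[:2])
    return major_minor in SUPPORTED_UNICODE_VERSIONS
```

Called without arguments, the function reports whether the running Python interpreter's character database matches one of the listed versions; a runner could skip the test set when it returns `False`.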

Invisible XML Grammar

{ Unicode classes test}
{Each input line starts with the one or two letter name of the class it is testing, a space, and a list of unicode characters that are in that class.

The output is an element of that classname for each line of input, with the characters as content, with the exception of the control characters, for which the element content is a "." for each character in the input.}

classes: line+.
-line: ( C; Cc; Cf; Cn; Co; Cs; L; LC; Ll; Lm; Lo; Lt; Lu; M; Mc; Me; Mn; N; Nd; Nl; No; P; Pc; Pd; Pe; Pf; Pi; Po; Ps; S; Sc; Sk; Sm; So; Z; Zl; Zp; Zs), newline.
-newline: (-#a; -#d)+.

  C: -"C ", (-[C], +".")*.
  L: -"L ", [L]*.
  M: -"M ", [M]*.
  N: -"N ", [N]*.
  P: -"P ", [P]*.
  S: -"S ", [S]*.
  Z: -"Z ", [Z]*.
  
  Cc: -"Cc ", (-[Cc], +".")*.
  Cf: -"Cf ", (-[Cf], +".")*.
  Cn: -"Cn ", (-[Cn], +".")*.
  Co: -"Co ", (-[Co], +".")*.
  Cs: -"Cs ", (-[Cs], +".")*.
  LC: -"LC ", [LC]*.
  Ll: -"Ll ", [Ll]*.
  Lm: -"Lm ", [Lm]*.
  Lo: -"Lo ", [Lo]*.
  Lt: -"Lt ", [Lt]*.
  Lu: -"Lu ", [Lu]*.
  Mc: -"Mc ", [Mc]*.
  Me: -"Me ", [Me]*.
  Mn: -"Mn ", [Mn]*.
  Nd: -"Nd ", [Nd]*.
  Nl: -"Nl ", [Nl]*.
  No: -"No ", [No]*.
  Pc: -"Pc ", [Pc]*.
  Pd: -"Pd ", [Pd]*.
  Pe: -"Pe ", [Pe]*.
  Pf: -"Pf ", [Pf]*.
  Pi: -"Pi ", [Pi]*.
  Po: -"Po ", [Po]*.
  Ps: -"Ps ", [Ps]*.
  Sc: -"Sc ", [Sc]*.
  Sk: -"Sk ", [Sk]*.
  Sm: -"Sm ", [Sm]*.
  So: -"So ", [So]*.
  Zl: -"Zl ", [Zl]*.
  Zp: -"Zp ", [Zp]*.
  Zs: -"Zs ", [Zs]*.
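
The mapping this grammar describes can be approximated outside an iXML processor. The sketch below is an illustration only, written in Python under several assumptions: `in_class` and `line_to_element` are hypothetical helper names, class membership is taken from Python's `unicodedata` (whose Unicode version may differ from the one the test requires), and the serialization is hand-rolled rather than produced by a parser.

```python
import unicodedata
from xml.sax.saxutils import escape


def in_class(ch: str, name: str) -> bool:
    """True if `ch` belongs to the Unicode class `name` (e.g. "L", "Lu", "LC")."""
    cat = unicodedata.category(ch)
    if name == "LC":                  # cased letters: Lu, Ll, or Lt
        return cat in ("Lu", "Ll", "Lt")
    return cat.startswith(name)       # "L" matches Lu, Ll, ...; "Lu" matches only Lu


def line_to_element(line: str) -> str:
    """Turn one input line, e.g. "Nd 09", into the element described by the grammar."""
    name, _, chars = line.partition(" ")
    bad = [c for c in chars if not in_class(c, name)]
    if bad:
        raise ValueError(f"{bad!r} not in class {name}")
    # For the C classes each matched character is deleted and a "." is inserted.
    content = "." * len(chars) if name.startswith("C") else escape(chars)
    return f"<{name}>{content}</{name}>" if content else f"<{name}/>"
```

For example, `line_to_element("Sm +<=>")` yields `<Sm>+&lt;=&gt;</Sm>`, and a `Cc` line with three control characters yields `<Cc>...</Cc>`.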

Test case: unicode-classes

Repository URI: …/tests/correct/test-catalog.xml

Created 13 Jun 2023 by SP

Updated 15 Jun 2023 by NDW

Supplied expected output

Input string

String value cannot be inlined in web page.

Expected result

<classes>
   <Lm>ʰ</Lm>
   <Lo>ªאتܐޓߊࠀࡀऄঅਅઅଅஅఅಅഅඅกກༀကᄀሀᐁᚁᚠᜀᜠᝀᝠកᠠᢰᤁᥐᦀᨀᨠᬅᮃᯀᰀᱚᳩℵⴰⶀぁァㄅ智取威虎山</Lo>
   <Ll>aàdžµßαϐюաდᏸℊℹⰰⲁa𐐨𐓘𐖗𐳀𑣁ðþ</Ll>
   <Lu>AÀDŽ</Lu>
   <Lt>Dž</Lt>
   <LC>aàdžAÀDŽDžΘϢЮԱႠᎠℂÅⰁⲀA𝐀𝓐𝕲</LC>
   <L>aàdžAÀDŽDžʰªאتܐޓߊࠀࡀऄঅਅઅଅஅఅಅഅඅกກༀကᄀሀᐁᚁᚠᜀᜠᝀᝠកᠠᢰᤁᥐᦀᨀᨠᬅᮃᯀᰀᱚᳩℵⴰⶀぁァㄅ智取威虎山</L>
   <Mc>ऻ</Mc>
   <Me>҈</Me>
   <Mn>̀</Mn>
   <M>ऻ҈</M>
   <Nd>0٩۲߀०০੦૦</Nd>
   <Nl>Ⅻⅻ</Nl>
   <No>²½</No>
   <N>0٩۲߀०০੦૦Ⅻⅻ²½</N>
   <Pc>_‿⁀⁔︳︴﹍﹎﹏_</Pc>
   <Pd>-</Pd>
   <Pe>)]}</Pe>
   <Pf>»’”›</Pf>
   <Pi>«‘‛“</Pi>
   <Po>!"#%&amp;'*,./:;?@\¡§¶·</Po>
   <Ps>([{</Ps>
   <P>_‿⁀⁔︳︴﹍﹎﹏_-)]}»’”›«‘‛“!"#%&amp;'*,./:;?@\¡§¶·([{</P>
   <Sc>$¢£¤¥€</Sc>
   <Sk>^`¨¯´</Sk>
   <Sm>+&lt;=&gt;|~¬→</Sm>
   <So>¦©®°</So>
   <S>$¢£¤¥€^`¨¯´+&lt;=&gt;|~¬→¦©®°</S>
   <Zl>&#x2028;</Zl>
   <Zp>
</Zp>
   <Zs>  </Zs>
   <Z>&#x2028;
  </Z>
   <Co>.</Co>
   <Cc>...</Cc>
   <Cf>.</Cf>
   <Cs/>
   <Cn>.</Cn>
   <C>.....</C>
</classes>
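
A result of this shape can be sanity-checked by confirming that every character sits in an element named for its class. The following Python sketch is one hedged way to do that; the function name `mismatches` is hypothetical, the check relies on Python's `unicodedata` rather than on an iXML processor, and it assumes the C-class elements hold only "." placeholders as described in the grammar comment.

```python
import unicodedata
import xml.etree.ElementTree as ET


def mismatches(xml_text: str) -> list:
    """List (element name, character) pairs whose category does not fit the element."""
    problems = []
    for elem in ET.fromstring(xml_text):
        name, text = elem.tag, elem.text or ""
        if name.startswith("C"):
            # C-class content is a run of "." placeholders, not the characters themselves.
            problems += [(name, ch) for ch in text if ch != "."]
            continue
        allowed = ("Lu", "Ll", "Lt") if name == "LC" else (name,)
        for ch in text:
            if not unicodedata.category(ch).startswith(allowed):
                problems.append((name, ch))
    return problems
```

An empty list means no character was found outside its element's class; a character may still be reported if the Python build uses an older Unicode version than the test requires.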