Invisible XML is a language for describing the implicit structure of data, and a set of technologies for making that structure explicit as XML markup. It allows you to write a declarative description of the format of some text and then leverage that format to represent the text as structured information.
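As a rough illustration of the idea, the sketch below takes a short date string and makes its implicit structure explicit as XML. It is not an Invisible XML processor and uses no ixml library: the grammar in the comment is an illustrative ixml grammar for the same format, and the regular expression, the `DATE_PATTERN` name, and the `date_to_xml` function are hypothetical stand-ins that only mimic what a real implementation would derive from such a grammar.

```python
import re
import xml.etree.ElementTree as ET

# An ixml grammar describing this format might look like the following.
# It is shown for illustration only and is NOT executed by the code below:
#
#   date: day, -" ", month, -" ", year.
#   day: digit, digit?.
#   month: ["A"-"Z"], ["a"-"z"]+.
#   year: digit, digit, digit, digit.
#   -digit: ["0"-"9"].

# Hand-written stand-in for the grammar above (an assumption of this sketch).
DATE_PATTERN = re.compile(
    r"(?P<day>[0-9]{1,2}) (?P<month>[A-Z][a-z]+) (?P<year>[0-9]{4})"
)


def date_to_xml(text: str) -> str:
    """Return the date in `text` as an XML string with explicit structure."""
    match = DATE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a date in the expected format: {text!r}")
    root = ET.Element("date")
    # Each named part of the input becomes a child element; the separating
    # spaces are dropped, as the "-" marks in the grammar would do.
    for name in ("day", "month", "year"):
        ET.SubElement(root, name).text = match.group(name)
    return ET.tostring(root, encoding="unicode")


print(date_to_xml("12 December 2023"))
# prints <date><day>12</day><month>December</month><year>2023</year></date>
```

With a real implementation, the grammar alone would be the declarative description, and the processor would produce the XML without any hand-written matching code.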
Specification
Invisible XML (ixml) 1.0 was published on 10 June, 2022. The same specification was published as a Final Community Group Report by the W3C on 12 December, 2023. The grammar of Invisible XML is defined with Invisible XML. The ixml grammar for ixml is available in ixml format or in xml format. Invisible XML was developed by the Invisible Markup Community Group (CG) at the W3C.
The current working draft is also available. This draft includes the resolved errata and any additional proposals adopted since the 1.0 specification was published.
Discussion
If you’re interested in following along in the discussion, please join the community group! You can also:
- Browse the archives of the mailing list (and subscribe!)
- Review the issues list.
- Keep track of current proposals on the dashboard.
Tutorials
- Steven Pemberton’s hands-on tutorial is a great way to begin learning how to write your own grammars.
- His advanced tutorial explores some more complex Invisible XML parsing challenges.
- Norm Tovey-Walsh has written introductory material about ixml and about writing grammars on xml.com.
Implementations
- Aparecium is Michael Sperberg-McQueen’s proof-of-concept implementation in XQuery.
- CoffeePot is Norm Tovey-Walsh’s Java implementation.
- EarleyBird is M. Joel Dubinko’s Rust implementation (in development).
- ixampl is Steven Pemberton’s implementation (it is available online as a web service).
- jωiXML is John Lumley’s browser-based JavaScript implementation.
- JayParser is Tomos Hillman’s XSLT implementation (in progress).
- Hywel is Bethan Tovey-Walsh’s Python implementation (in development).
- Markup Blitz is Gunther Rademacher’s Java implementation.
Test suite
The ixml GitHub repository includes a comprehensive test suite. It is also available as a browsable catalog.
Schemas
The XML versions of Invisible XML grammars conform to the RELAX NG schemas provided.
Example grammars
A selection of sample grammars is available, including:
- ABNF, the grammar notation used in IETF Requests for Comments, as defined in RFC 5234.
- ISBN, International Standard Book Numbers.
- ISO 8601, the standard ISO date, time, datetime, duration, interval, and recurrence formats, as defined in ISO 8601:2004.
- BCP 47, tags for identifying languages, as defined in BCP 47.
In addition, Gunther Rademacher's grammar converter is able to convert Invisible XML to W3C-style EBNF. This can be used to produce syntax diagrams using the RR tool.
Other resources
- Gingersnap is a library of XSLT modules for processing the XML form of Invisible XML grammars.
- Steven Pemberton discussed Invisible XML in a podcast for BrightTALK.
Other things named “ixml”
If you found your way to this page with a web search for “ixml” and you weren’t looking for Invisible XML, you might have been looking for one of these:
- iXML, a standard for embedded metadata in production media files.
- ixml, an iterative event-driven XML parser with a standard Python iterator interface.
- IXmlNode, a Microsoft DOM interface.
- iXML Library, an SAP API for handling XML documents in DOM format. (sic)
Bibliography
There are a few additional papers and presentations on the development of Invisible XML:
- Hillman, Tomos. XSLT Earley: First Steps to a Declarative Parser Generator. Presented in two halves at XML Prague, 2020 and Declarative Amsterdam, 2020.
- Pemberton, Steven. Invisible XML: introduces the concepts, and develops a notation to support them.
- Pemberton, Steven. Data just wants to be (format) neutral: discusses issues with automatic serialisation, and the relationship between Invisible XML grammars and data schemas.
- Pemberton, Steven. Parse Earley, Parse Often: How to parse anything to XML: discusses issues around grammar design, in particular the parsing algorithms used to recognise any document and the conversion of the resulting parse tree into XML, and gives a new perspective on a classic algorithm.
- Pemberton, Steven. On the Descriptions of Data: discusses the usability of notations and changes to the design following experience with using it, giving examples of its use to develop data descriptions, and, in passing, suggests other output formats.
- Pemberton, Steven. On the Specification of Invisible XML: describes decisions made during the production of the specification of ixml.
- Sperberg-McQueen, C. M. “Aparecium: An XQuery / XSLT library for invisible XML.” Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 - August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). https://doi.org/10.4242/BalisageVol23.Sperberg-McQueen01.
- Tovey-Walsh, Norm. “Ambiguity in iXML: and how to control it.” Presented at Balisage: The Markup Conference 2023, Washington, DC, July 31 - August 4, 2023. In Proceedings of Balisage: The Markup Conference 2023. Balisage Series on Markup Technologies, vol. 28 (2023). https://doi.org/10.4242/BalisageVol28.Tovey-Walsh01.