Support for the xml namespace in Codesynthesis xsd cxx-parser

There are two ways to access the xml:base attribute using Codesynthesis xsd cxx-parser. (See also similar work on cxx-tree.) At the moment, this page and the files it references are a work in progress. For ways to handle all attributes in the xml namespace, see the description of the data model in the second half of this page.

Everything presented here requires at least version 3.0.0 of xsd. Everything has been tested with versions 3.0.0, 3.1.0 and 3.2.0. In the source code, the differences between the xsd versions are handled with preprocessor conditionals.

The easier way applies when the XML Schema allows or requires the xml:base attribute explicitly. This is demonstrated in the following files.

  • strict.xsd: An XML Schema referencing xml:base explicitly.

  • w3.org-xml-1998.xsd: An XML Schema defining xml:base explicitly.

  • sample.xml: Sample input.

  • Makefile: The GNU Makefile to build and run the program.

As you can see, there is no source code for this case. Instead, the --generate-print-impl and --generate-test-driver options of cxx-parser are used. Together they generate a program that prints the content of the XML file that is its input. You can use the implementation files generated by xsd (named strict-pimpl.cxx, strict-pimpl.hxx and strict-driver.cxx in this case) as the starting point for your own implementation.

Accessing xml:base when it is covered by an '<xs:anyAttribute namespace="##other"/>' in the XML Schema requires some coding, as demonstrated in the following files.

  • lax.xsd: An XML Schema where the xml:base attribute is covered by '<xs:anyAttribute namespace="##other"/>'.

  • lax-pimpl.hpp: The implementation for the parsers generated by xsd, and for the parser of the xml:base attribute. The contents of this file might also have been part of file driver.cpp below.

  • driver.cpp: The driver.

  • using-lax.xml: Sample input.

  • Makefile: The GNU Makefile to build and run the program.

The attribute sna was added to demonstrate that this implementation does not interfere with other attributes.
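The idea behind the wildcard case can be sketched without xsd at all. The following is a minimal illustration, not the code in lax-pimpl.hpp: the class element_handler and its member any_attribute() are hypothetical stand-ins for a parser implementation and its wildcard callback, and the callback signature is simplified to plain std::string arguments. The only fixed fact used is the namespace URI that the xml prefix is bound to.

```cpp
#include <cassert>
#include <string>

// The namespace URI bound to the "xml" prefix by the XML Namespaces spec.
const std::string xml_ns = "http://www.w3.org/XML/1998/namespace";

// Hypothetical stand-in for a parser implementation class. any_attribute()
// mimics a wildcard callback that receives namespace, local name and value;
// it is NOT the signature used by xsd.
struct element_handler
{
  std::string base;   // last xml:base value seen by the wildcard callback
  int ignored = 0;    // wildcard attributes that were left alone

  void any_attribute(const std::string& ns,
                     const std::string& name,
                     const std::string& value)
  {
    if (ns == xml_ns && name == "base")
      base = value;   // pick out xml:base
    else
      ++ignored;      // anything else is not touched
  }
};
```

Feeding the handler an xml:base attribute followed by an unrelated attribute leaves base set and ignored at 1, which is the "does not interfere" property in miniature.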


Note that I use the extensions .cpp and .hpp for manually crafted code. Extensions .cxx and .hxx are reserved for files generated by xsd. In the Makefiles I use option --force-overwrite, hence generated files will be quietly overwritten. You have been warned.

The default target in the Makefiles builds and runs the driver program and stores the program's stdout and stderr in file driver.txt. This uses the tee utility, which I assume is part of any Linux distribution.

There is also a tarball file containing all source files referenced here. It expands in the current working directory with a directory for each implementation. The names of these directories are (in order of appearance on this page): parser-generated, parser-wildcard, parser-datamodel and parser-anytype.

A data model for all attributes in the xml namespace.
As this data model uses xercesc to manipulate URIs, the implementations below will not work with expat.

To handle the inheritance of xml:base, xml:lang and xml:space, and to enforce the uniqueness of xml:id, I implemented a data model.
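Inheritance of this kind is commonly modelled as a stack with one frame per open element. The sketch below assumes that design. The names xml_frame and xml_scope are hypothetical and do not come from XMLnamespace.hpp. It stores xml:base values as given, without resolving relative URIs, and it assumes "default" as the initial xml:space value.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical frame holding the effective values for one open element.
struct xml_frame
{
  std::string base;   // effective xml:base (stored as given, not resolved)
  std::string lang;   // effective xml:lang
  std::string space;  // effective xml:space
};

// Hypothetical data model: a stack with one frame per open element.
class xml_scope
{
public:
  // Assumption: outside any element, xml:space is "default".
  xml_scope() { stack_.push_back(xml_frame{"", "", "default"}); }

  // Element start: the child begins with a copy of the parent's values.
  void start_element()
  {
    xml_frame inherited = stack_.back();
    stack_.push_back(inherited);
  }

  // Element end: the parent's values become current again.
  void end_element()
  {
    assert(stack_.size() > 1);  // must match an earlier start_element()
    stack_.pop_back();
  }

  // Setters change the innermost open element only.
  void base(const std::string& v)  { stack_.back().base = v; }
  void lang(const std::string& v)  { stack_.back().lang = v; }
  void space(const std::string& v) { stack_.back().space = v; }

  // Getter for the values in effect at the current position.
  const xml_frame& current() const { return stack_.back(); }

private:
  std::vector<xml_frame> stack_;
};
```

A child that sets xml:lang overrides the value for itself and its descendants only; once it ends, the parent's value is visible again.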

In the implementation below (named parser-datamodel), the XML Schema references all attributes in the xml namespace explicitly.

  • XMLnamespace.hpp and XMLnamespace.cpp: The implementation of the data model. These files also implement classes derived from xml_schema::document and xml_schema::complex_content. In the derived classes, operations in the data model proper are called to synchronize the stack of inherited attribute values.

  • allattributes.xsd: An XML Schema referencing all attributes in the xml namespace explicitly.

  • allattributes-pimpl.hpp and allattributes-pimpl.cpp: The application logic, demonstrating the getters of the data model.

    The setters in the data model related to xml:base and xml:id are also called here.

  • w3.org-xml-1998.xsd and w3.org-xml-1998-pimpl.cpp: The xml namespace and related code.

    The setters in the data model related to xml:lang and xml:space are called here.

  • allattributes-driver.cpp: The driver program.

  • sample.xml: Sample input.

  • expect-fail-1.xml: Testcase to show what happens with an invalid value for xml:space.

  • expect-fail-2.xml: Testcase attempting to show what happens with an invalid value for xml:base. However, a scheme fail:/ appears to be valid.

  • Makefile: The GNU Makefile to build and run the program.
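The outcome of the two expect-fail testcases can be made plausible with a small sketch. It does not reproduce what the schema validation or xercesc actually do; both function names are hypothetical. The first function encodes the two values that the XML specification allows for xml:space. The second checks only the scheme part of a URI against the grammar in RFC 3986 (a letter followed by letters, digits, "+", "-" or "."), which is enough to see why a made-up scheme such as fail passes a purely syntactic check.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// xml:space admits exactly two values according to the XML specification.
inline bool valid_xml_space(const std::string& v)
{
  return v == "default" || v == "preserve";
}

// Syntactic check of the scheme part only, following RFC 3986:
//   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
// No registry of known schemes is consulted.
inline bool has_wellformed_scheme(const std::string& uri)
{
  const std::string::size_type colon = uri.find(':');
  if (colon == std::string::npos || colon == 0)
    return false;  // no scheme present at all
  if (!std::isalpha(static_cast<unsigned char>(uri[0])))
    return false;  // a scheme must start with a letter
  for (std::string::size_type i = 1; i < colon; ++i)
  {
    const unsigned char c = static_cast<unsigned char>(uri[i]);
    if (!std::isalnum(c) && c != '+' && c != '-' && c != '.')
      return false;
  }
  return true;
}
```

With these definitions, has_wellformed_scheme("fail:/") is true: nothing in the syntax requires a scheme to be registered or meaningful.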


In xsd version 3.2.0 the way to derive your own classes from xml_schema::simple_content and xml_schema::complex_content has changed. This derivation is needed to override _start_element() and _end_element() (and, for parser-anytype, to override _any_attribute() and _attribute()). Instead of deriving the _pimpl classes from both the corresponding _pskel class and xml_schema::simple_content or xml_schema::complex_content, the _pimpl class now derives from a template over the _pskel class implementing the overrides. This template class apparently can be used for both simple types and complex types.
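The shape of the 3.2.0 style derivation can be illustrated with stand-in classes. In this sketch item_pskel is a hypothetical replacement for a generated skeleton, tracking is a hypothetical name for the template, and the callbacks take a single std::string instead of the arguments xsd passes. Only the inheritance pattern is the point.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a generated skeleton class. The real _pskel
// classes have a richer interface and different callback signatures.
struct item_pskel
{
  virtual ~item_pskel() {}
  virtual void _start_element(const std::string& name)
  { log.push_back("skel start " + name); }
  virtual void _end_element(const std::string& name)
  { log.push_back("skel end " + name); }
  std::vector<std::string> log;  // records calls that reached the skeleton
};

// Template over the skeleton: it adds the overrides on top of whatever
// skeleton it is given, so one template serves every element type.
template <typename Skel>
struct tracking : Skel
{
  int depth = 0;

  void _start_element(const std::string& name) override
  {
    ++depth;                     // a data model would push a frame here
    Skel::_start_element(name);  // then let the skeleton do its work
  }

  void _end_element(const std::string& name) override
  {
    Skel::_end_element(name);
    assert(depth > 0);
    --depth;                     // a data model would pop a frame here
  }
};

// The implementation class derives from the template, not from the
// skeleton directly and not from a second base class.
struct item_pimpl : tracking<item_pskel> {};
```

Because the template takes the skeleton as a parameter, the same overrides can be reused for any skeleton that offers these two virtual functions.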

When the attributes in the xml namespace are covered by '<xs:anyAttribute namespace="##other"/>', there are two ways to handle them.

  1. Provide an implementation of the _any_attribute() function as a mixin. It is based on, and very similar to, the parser-wildcard example.

  2. Use the low-level _attribute() callback to intercept all attributes. The override handles the attributes in the xml namespace itself and delegates parsing of other attributes to the original implementation. One advantage of this approach is that it supports the attributes in the xml namespace even if the schema does not allow them (provided you have disabled validation in the underlying XML parser or are using validation in the generated code).
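The second way, intercept and delegate, can be sketched as follows. The classes generated_parser and intercepting_parser and the member attribute() are hypothetical stand-ins, not the xsd classes or the real _attribute() signature. The sketch only shows the control flow: attributes in the xml namespace are consumed by the override, and everything else is passed to the base implementation untouched.

```cpp
#include <cassert>
#include <map>
#include <string>

// The namespace URI bound to the "xml" prefix.
const std::string xml_namespace_uri = "http://www.w3.org/XML/1998/namespace";

// Hypothetical stand-in for the generated attribute dispatch.
struct generated_parser
{
  virtual ~generated_parser() {}

  virtual void attribute(const std::string& ns,
                         const std::string& name,
                         const std::string& value)
  {
    (void) ns;               // the stand-in ignores the namespace
    declared[name] = value;  // pretend the schema-driven code handled it
  }

  std::map<std::string, std::string> declared;
};

// Override that sees every attribute before the generated code does.
struct intercepting_parser : generated_parser
{
  std::map<std::string, std::string> xml_attrs;

  void attribute(const std::string& ns,
                 const std::string& name,
                 const std::string& value) override
  {
    if (ns == xml_namespace_uri)
      xml_attrs[name] = value;  // handled here, the schema is not consulted
    else
      generated_parser::attribute(ns, name, value);  // delegate the rest
  }
};
```

Since the xml namespace attributes never reach the base class in this arrangement, it does not matter whether the schema declares them.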

In the implementation below (named parser-anytype) the difference between these two ways can be seen in files XMLnamespace.hpp and XMLnamespace.cpp, surrounded by #ifdef XMLISDECLD / #else / #endif statements. This code is based on examples provided by Boris Kolpackov.

The Makefile builds and runs two programs, named declared (using _any_attribute()) and intercept (using _attribute()), that are built (with some make magic) from the same sources. The output of these programs is stored in files declared.txt and intercept.txt.
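Building two programs from one set of sources usually comes down to compiling the same file twice, once with the macro defined (for example with -DXMLISDECLD on the compiler command line) and once without. The sketch below shows that selection in isolation. The function variant() is hypothetical, and the mapping of the macro to the program names is an assumption: it guesses that a defined XMLISDECLD selects the _any_attribute() code and therefore the program named declared.

```cpp
#include <cassert>
#include <string>

// Assumption: XMLISDECLD defined   -> the _any_attribute() variant,
//             XMLISDECLD undefined -> the _attribute() variant.
// variant() is a hypothetical helper that reports which branch was built.
inline std::string variant()
{
#ifdef XMLISDECLD
  return "declared";   // built with -DXMLISDECLD
#else
  return "intercept";  // built without the macro
#endif
}
```

Each compilation sees exactly one of the two branches, so the two programs can share every other line of source.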

  • XMLnamespace.hpp and XMLnamespace.cpp: The implementation of the data model. This is almost identical to the data model in parser-datamodel above.

    In this case however, the setters in the data model related to all attributes in the xml namespace are called in the derived classes implemented here.

  • lax.xsd: An XML Schema where the attributes in the xml namespace are covered by '<xs:anyAttribute namespace="##other"/>'.

  • lax-pimpl.hpp: The application logic, demonstrating the getters of the data model. Note that in this case, no setters in the data model are called here.

  • w3.org-xml-1998.xsd, w3.org-xml-1998.map and w3.org-xml-1998-pimpl.hpp: The xml namespace and related code.

  • driver.cpp: The driver program.

  • using-lax.xml: Sample input.

  • dupl-id.xml: Show what happens if xml:id has a non-unique value.

  • Makefile: The GNU Makefile to build and run the programs.
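Enforcing the uniqueness of xml:id, which dupl-id.xml exercises, needs little more than a set of the values seen so far. The following sketch assumes that a duplicate is reported by throwing an exception. The class id_registry is hypothetical, and how the data model in XMLnamespace.cpp actually reports the error is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical registry of xml:id values for one document.
class id_registry
{
public:
  // Records an id. Assumption: a repeated value is reported by throwing.
  void add(const std::string& id)
  {
    // insert() returns a pair whose second member is false when the value
    // was already present.
    if (!seen_.insert(id).second)
      throw std::invalid_argument("duplicate xml:id: " + id);
  }

  std::size_t size() const { return seen_.size(); }

private:
  std::set<std::string> seen_;
};
```

The registry would live as long as the document is being parsed, since xml:id values must be unique across the whole document rather than per element.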

mailto:jnw@xs4all.nl