XML / XPath / XSLT

Escaping Strings in XPath 1.0
XML / XPath / XSLT c++ xpath
Published: 2008-06-03
XPath is a language for selecting nodes from an XML document. XPath is used extensively in XSLT and other XML technologies. I also vastly prefer using XPath (e.g. with XPathNavigator) over the XML DOM when manipulating XML in a non-streaming fashion. In XPath, strings must be delimited by either single or double quotes. Given a quote character used to delimit a string, one can’t represent that same quote character within the string. Read more...
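One common way around this limitation is to split the string and reassemble it with XPath's concat() function. The following is a minimal sketch of that idea, not necessarily the approach taken in the full post; the function name MakeXPathStringLiteral is hypothetical.

```cpp
#include <string>

// Hypothetical helper (an illustration, not code from the post): returns an
// XPath 1.0 expression that evaluates to the given string value.
std::string MakeXPathStringLiteral(const std::string& value)
{
    // No double quote inside: delimit with double quotes.
    if (value.find('"') == std::string::npos)
        return '"' + value + '"';

    // No single quote inside: delimit with single quotes.
    if (value.find('\'') == std::string::npos)
        return '\'' + value + '\'';

    // Both kinds of quote are present: split the value at each double quote
    // and join the pieces with concat(), quoting each double quote as '"'.
    std::string result = "concat(";
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = value.find('"', start);
        if (pos == std::string::npos) {
            // Last piece: everything after the final double quote.
            result += '"' + value.substr(start) + '"';
            break;
        }
        result += '"' + value.substr(start, pos - start) + "\",'\"',";
        start = pos + 1;
    }
    result += ')';
    return result;
}
```

For the input a"b'c this produces concat("a",'"',"b'c"). Empty pieces are emitted as "" rather than skipped, which keeps the sketch short and is still a valid concat() argument.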
Microsoft’s XmlLite
XML / XPath / XSLT win32 xml
Published: 2007-07-12
Microsoft has created a new, lightweight C++ XML processing library called XmlLite. It includes a streaming XML writing class patterned after .NET’s System.Xml.XmlWriter. This library makes the IXmlWriter from my Implementing IXmlWriter series obsolete for Windows developers.
XmlTextWriter Can Produce Invalid XML
XML / XPath / XSLT csharp xml
Published: 2007-06-16
XmlTextWriter is .NET’s class for writing XML in a forward-only streaming manner. It is highly efficient and is the preferred way to generate XML in .NET in most circumstances. I find XmlTextWriter so useful that I wrote a partial C++ implementation of it in my Implementing IXmlWriter series. Unfortunately, XmlTextWriter isn’t quite as strict as it could be. It will let slip some invalid XML such as duplicate attributes, invalid Unicode characters in the range 0x0 to 0x20, and invalid element and attribute names. Read more...
XSLT Number Formatting Notes
XML / XPath / XSLT xslt
Published: 2006-02-02
When using XSLT’s format-number() function to format a decimal, consider using a zero in the least significant place of the integer part of your format string. This will allow a number with a 0 integer part to display correctly. For example:

    Number    format-number using #,###    format-number using #,##0
    12345     12,345                       12,345
    5         5                            5
              No output!

This also applies to decimals: Read more...
Implementing IXmlWriter Part 14: Supporting Writing To A Stream
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2006-01-10
This is part 14/14 of my Implementing IXmlWriter post series. Today I will add support for writing the generated XML to a C++ stream to last time’s IXmlWriter. Finally, the reason why I’ve insisted on calling this series IXmlWriter (instead of StringXmlWriter) should become clear: I’ve been planning on supporting writing the generated XML to more than just a string. Specifically, today I will add the ability to write the XML to a C++ ostream object, a base class in the C++ iostream library which defines a writable stream. Read more...
Implementing IXmlWriter Part 13: Putting IXmlWriter Behind A Pimpl Firewall
Implementing IXmlWriter c++ ixmlwriter pimpl xml
Published: 2005-12-15
This is part 13/14 of my Implementing IXmlWriter post series. As the private members of IXmlWriter are, in my judgment, getting too numerous and too likely to change, today I will put last time’s IXmlWriter behind a compilation firewall (pimpl). The idea behind the pimpl idiom is to hide as much of the class definition as possible in order to avoid requiring users of the class to recompile if the class’s private members are changed. Read more...
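The shape of the idiom can be sketched in a few lines. This is an illustration only, not the series' code: the class SketchWriter and its members are hypothetical, and std::unique_ptr is a modern choice that postdates the original post. The header and source file are shown together in one listing for brevity.

```cpp
#include <memory>
#include <string>

// ---- sketch_writer.h (hypothetical header seen by users) ----
class SketchWriter
{
public:
    SketchWriter();
    ~SketchWriter();                      // defined where Impl is complete
    void WriteRaw(const std::string& text);
    std::string GetXmlString() const;

private:
    struct Impl;                          // declared here, defined elsewhere
    std::unique_ptr<Impl> m_impl;         // the only private data member
};

// ---- sketch_writer.cpp (hypothetical implementation file) ----
// Private members live here, so changing them does not touch the header.
struct SketchWriter::Impl
{
    std::string xml;
};

SketchWriter::SketchWriter() : m_impl(new Impl) {}
SketchWriter::~SketchWriter() = default;

void SketchWriter::WriteRaw(const std::string& text)
{
    m_impl->xml += text;
}

std::string SketchWriter::GetXmlString() const
{
    return m_impl->xml;
}
```

Adding a member to Impl changes only the implementation file, so code that includes the header does not need to be recompiled. The cost is one extra heap allocation and an extra pointer indirection per call.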
Implementing IXmlWriter Part 12: Supporting Pretty-Printing
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-12-13
This is part 12/14 of my Implementing IXmlWriter post series. Today I will add support for pretty-printing to last time’s IXmlWriter. Pretty-printing is the addition of whitespace at predetermined locations to make the resulting XML easier to read than when it is all on one line. In the .NET Framework’s System.Xml.XmlTextWriter class, it is supported by the properties Formatting, which allows you to enable or disable pretty-printing; Indentation, which allows you to specify how many whitespace characters indentation should use; and IndentChar, which allows you to specify the whitespace character to use for indentation. Read more...
Implementing IXmlWriter Part 11: Supporting Namespaces
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-12-07
This is part 11/14 of my Implementing IXmlWriter post series. Today I will add support for namespaces to last time’s IXmlWriter. Namespaces are defined by the W3C recommendation Namespaces in XML. Using namespaces requires two parts: a namespace declaration, which associates a prefix with a namespace name (a user-defined, ideally globally-unique string which defines the namespace, often in the form of a URL); and the assignment of XML elements and attributes to this namespace by using the aforementioned prefix. Read more...
Implementing IXmlWriter Part 10: Supporting WriteComment()
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-12-02
This is part 10/14 of my Implementing IXmlWriter post series. Today I will add support for the function WriteComment() to last time’s IXmlWriter. Quoting from Section 2.5: Comments of the XML 1.0 spec: “Comments MAY appear anywhere in a document outside other markup; in addition, they MAY appear within the document type declaration at places allowed by the grammar.” Considering this, we should allow writing comments in virtually every WriteState that the IXmlWriter can be in. Read more...
Implementing IXmlWriter Part 9: Supporting WriteStartDocument() and WriteEndDocument()
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-11-22
This is part 9/14 of my Implementing IXmlWriter post series. Today I will add support for the functions WriteStartDocument() and WriteEndDocument() to last time’s IXmlWriter. WriteStartDocument() writes the XML declaration (i.e. <?xml version="1.0"?>) and WriteEndDocument() closes all open attributes and elements and sets the IXmlWriter back in the initial state. Adding support for these functions is straightforward. Note that I have introduced a new IXmlWriter state called WriteState_Prolog; this will be important later. Read more...
Implementing IXmlWriter Part 8: Supporting WriteStartAttribute() and WriteEndAttribute()
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-11-18
This is part 8/14 of my Implementing IXmlWriter post series. Today I will add support for the functions WriteStartAttribute() and WriteEndAttribute() to last time’s IXmlWriter. These functions are (obviously) used to denote the start and end of an attribute; the attribute value is written using WriteString() (this usage is analogous to WriteStartElement() and WriteEndElement()). Because WriteString() must now be aware of whether it is writing an attribute value or element content, I must keep track of the state the IXmlWriter is in — a change that affects nearly every function. Read more...
Implementing IXmlWriter Part 7: Cleaning Up
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-11-17
This is part 7/14 of my Implementing IXmlWriter post series. Wow, I can’t believe that it’s been over a month already since my last IXmlWriter post. I guess my vacation ruined my exercise plan and my blogging habits. It’s well past time to get back into both. Rather than introduce a new test case, I’m going to spend today “cleaning up” the previous version of IXmlWriter. The first cleanup method is trivial but overdue — I will separate the implementation of IXmlWriter from its interface, as a user of IXmlWriter shouldn’t be particularly concerned about its implementation. Read more...
Implementing IXmlWriter Part 6: Escaping Attribute Content
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-12
This is part 6/14 of my Implementing IXmlWriter post series. Last time’s IXmlWriter has a serious bug: it doesn’t properly escape attribute values, which can lead to malformed XML. Consider the following test case:

    StringXmlWriter xmlWriter;
    xmlWriter.WriteStartElement("root");
    xmlWriter.WriteStartElement("element");
    xmlWriter.WriteAttributeString("att", "\"");
    xmlWriter.WriteEndElement();
    xmlWriter.WriteEndElement();
    std::string strXML = xmlWriter.GetXmlString();

The previous version of IXmlWriter will generate the XML string <root><element att="""/></root>, which is not well-formed and will be rejected by an XML parser. Read more...
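A minimal sketch of the kind of escaping an attribute value needs is shown below. It assumes the value is always delimited by double quotes; the function names EscapeAttributeValue and FormatAttribute are hypothetical and are not taken from the series.

```cpp
#include <string>

// Hypothetical helper: escapes the characters that would break an attribute
// value delimited by double quotes.
std::string EscapeAttributeValue(const std::string& value)
{
    std::string escaped;
    escaped.reserve(value.size());
    for (char ch : value) {
        switch (ch) {
        case '&': escaped += "&amp;";  break;  // starts an entity reference
        case '<': escaped += "&lt;";   break;  // not allowed in attribute values
        case '"': escaped += "&quot;"; break;  // would end the value early
        default:  escaped += ch;       break;
        }
    }
    return escaped;
}

// Hypothetical helper: formats name="value" with the value escaped.
std::string FormatAttribute(const std::string& name, const std::string& value)
{
    return name + "=\"" + EscapeAttributeValue(value) + '"';
}
```

With this in place, the attribute from the test case above would be written as att="&quot;". The sketch does not validate the attribute name and does not handle control characters or the whitespace normalization that parsers apply to attribute values.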
Implementing IXmlWriter Part 5: Supporting WriteAttributeString()
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-11
This is part 5/14 of my Implementing IXmlWriter post series. Today I will add support for writing attributes to yesterday’s version of IXmlWriter. To learn more about attributes, see the W3C XML 1.0 Recommendation. Writing attributes will be supported with the function WriteAttributeString(). Here’s today’s test case:

    StringXmlWriter xmlWriter;
    xmlWriter.WriteStartElement("root");
    xmlWriter.WriteStartElement("element");
    xmlWriter.WriteAttributeString("att", "value");
    xmlWriter.WriteEndElement();
    xmlWriter.WriteEndElement();
    std::string strXML = xmlWriter.GetXmlString();
    // strXML should be <root><element att="value"/></root>

Because the changes in Implementing IXmlWriter Part 4 keep start elements unclosed until another function is called which requires them to be closed (e. Read more...
Implementing IXmlWriter Part 4: Collapsing Empty Elements
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-10
This is part 4/14 of my Implementing IXmlWriter post series. One of the enhancements that XML introduced over SGML was a shorthand for specifying an element with no content by adding a trailing slash to the end of the start tag. For example, <br/> is equivalent to <br></br>. Let’s add this functionality to the previous version of IXmlWriter. Here’s the test case: StringXmlWriter xmlWriter; xmlWriter. Read more...
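One way to get this behavior is to leave each start tag unclosed until the writer knows whether the element has content. The sketch below illustrates that idea under simplifying assumptions (elements only, no text, attributes, or name validation); the class SketchXmlWriter is hypothetical and is not the series' implementation.

```cpp
#include <string>
#include <vector>

// Hypothetical writer: defers the closing '>' of a start tag so that an
// element with no content can be collapsed to <name/>.
class SketchXmlWriter
{
public:
    void WriteStartElement(const std::string& name)
    {
        CloseStartTagIfOpen();            // the parent now has content
        m_xml += '<' + name;              // note: no '>' yet
        m_openElements.push_back(name);
        m_startTagOpen = true;
    }

    void WriteEndElement()
    {
        if (m_openElements.empty())
            return;                       // nothing to close
        if (m_startTagOpen) {
            m_xml += "/>";                // no content was written: collapse
            m_startTagOpen = false;
        } else {
            m_xml += "</" + m_openElements.back() + '>';
        }
        m_openElements.pop_back();
    }

    std::string GetXmlString() const { return m_xml; }

private:
    void CloseStartTagIfOpen()
    {
        if (m_startTagOpen) {
            m_xml += '>';
            m_startTagOpen = false;
        }
    }

    std::string m_xml;
    std::vector<std::string> m_openElements;  // names of unclosed elements
    bool m_startTagOpen = false;              // last start tag still lacks '>'
};
```

Writing a root element that contains an empty br element yields <root><br/></root>: the inner start tag is still open when WriteEndElement() is called, so it collapses, while the outer one was closed when its child was started.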
Implementing IXmlWriter Part 3: Supporting WriteElementString()
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-07
This is part 3/14 of my Implementing IXmlWriter post series. Today’s addition to the previous iteration of IXmlWriter is quite trivial: supporting the WriteElementString() method. Here’s the test case:

    StringXmlWriter xmlWriter;
    xmlWriter.WriteStartElement("root");
    xmlWriter.WriteElementString("element", "value");
    xmlWriter.WriteEndElement();
    std::string strXML = xmlWriter.GetXmlString();
    // strXML should be <root><element>value</element></root>

Implementation is extremely simple because WriteElementString() is nothing but a convenience method which calls WriteStartElement(), WriteString(), and WriteEndElement(). Read more...
Implementing IXmlWriter Part 2: Escaping Element Content
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-10-06
This is part 2/14 of my Implementing IXmlWriter post series. In the previous post of this series, we ended up with a simple class which could write XML elements and element content to a std::string. However, this code has a common, serious problem that was mentioned in my post Don’t Form XML Using String Concatenation: it doesn’t properly escape XML special characters such as & and <. This means that if you call WriteString() with one of these characters, the generated XML will be malformed and an XML parser will not be able to parse it. Read more...
Implementing IXmlWriter Part 1: The Basics
Implementing IXmlWriter c++ ixmlwriter xml
Published: 2005-09-30
This is part 1/14 of my Implementing IXmlWriter post series. After writing my blog post Don’t Form XML Using String Concatenation, I realized that writing a C++ System.Xml.XmlWriter workalike involves some interesting challenges. Therefore, I’ve decided to write a series of blog posts about building a streaming C++ XML generator, a.k.a. IXmlWriter, step-by-step. For this series, I will follow the practice of test-driven development and write a test case followed by an implementation which passes the test case. Read more...
Don’t Form XML Using String Concatenation
XML / XPath / XSLT c++ xml
Published: 2005-09-16
It seems very common for developers to create XML using string concatenation, as in:

    std::string CreateXML(const std::string& strValue)
    {
        std::string strXML("<tag>");
        strXML += strValue;
        strXML += "</tag>";
        return strXML;
    }

As any experienced XML developer knows, this code has a bug: strValue must be escaped (& must be converted to &amp;, < must be converted to &lt;, etc. Read more...
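For comparison, here is a minimal sketch of the same function with the value escaped before it is concatenated. The names EscapeElementContent and CreateXmlEscaped are hypothetical, and the sketch covers only &, < and >; the post itself argues for using a proper XML writer rather than hand-rolled escaping.

```cpp
#include <string>

// Hypothetical helper: escapes the characters that are significant in
// element content.
std::string EscapeElementContent(const std::string& text)
{
    std::string escaped;
    escaped.reserve(text.size());
    for (char ch : text) {
        switch (ch) {
        case '&': escaped += "&amp;"; break;
        case '<': escaped += "&lt;";  break;
        case '>': escaped += "&gt;";  break;  // needed when it follows "]]"
        default:  escaped += ch;      break;
        }
    }
    return escaped;
}

// Same shape as CreateXML above, but the value is escaped first.
std::string CreateXmlEscaped(const std::string& strValue)
{
    return "<tag>" + EscapeElementContent(strValue) + "</tag>";
}
```

Calling CreateXmlEscaped("a & b < c") returns <tag>a &amp; b &lt; c</tag>. The sketch does nothing about characters that are not allowed in XML at all, or about character encoding.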