Don't Form XML Using String Concatenation

It seems very common for developers to create XML using string concatenation, as in:

std::string CreateXML
    (
    const std::string& strValue
    )
{
    std::string strXML("<tag>");
    strXML += strValue;
    strXML += "</tag>";
    return strXML;
}

As any experienced XML developer knows, this code has a bug: strValue must be escaped (& must be converted to &amp;, < must be converted to &lt;, etc.), or the generated XML will not be well-formed and an XML parser will reject it. One could write a simple function to handle this escaping, but many other issues can creep into XML generation: string encoding, illegal element and attribute names, collapsing empty elements, making sure all opened elements are closed, the performance of string concatenation, and so on. I therefore recommend putting all of this logic into a single class. In fact, I suggest modelling the class after .NET’s streaming XML generator, System.Xml.XmlWriter.
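For reference, the "simple function" mentioned above might look something like the sketch below. EscapeXml is a hypothetical name, not something from a particular library, and the sketch assumes the input is already in the encoding the document will declare. It covers only the five characters that XML predefines entities for.

```cpp
#include <string>

// Hypothetical helper (illustration only): replaces the five characters that
// have predefined XML entities so the result can be used as element text or
// inside a quoted attribute value.
std::string EscapeXml(const std::string& strValue)
{
    std::string strEscaped;
    strEscaped.reserve(strValue.size());
    for (char ch : strValue)
    {
        switch (ch)
        {
        case '&':  strEscaped += "&amp;";  break;
        case '<':  strEscaped += "&lt;";   break;
        case '>':  strEscaped += "&gt;";   break;
        case '"':  strEscaped += "&quot;"; break;
        case '\'': strEscaped += "&apos;"; break;
        default:   strEscaped += ch;       break;
        }
    }
    return strEscaped;
}
```

Appending EscapeXml(strValue) instead of strValue would fix the immediate bug in the concatenation example, but it does nothing about encoding, invalid names, or unclosed elements, which is why a dedicated writer class is the better home for this logic.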

Typically, this class (let’s call it IXmlWriter) will be instantiated high up on the call stack and passed as a parameter to functions which generate XML, as in:

void CreateXML
    (
    IXmlWriter& w,
    const std::string& strValue
    )
{
    w.WriteStartElement(std::string("tag"));
    w.WriteString(strValue);
    w.WriteEndElement();
    // Or w.WriteElementString(std::string("tag"), strValue);
}

If you still require the XML as a string, you can write an implementation of IXmlWriter that writes to a resizable string buffer (such as std::ostringstream) and then retrieve the buffer’s contents as a string once all XML writing has finished.
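As a rough illustration, a string-backed writer could be sketched as follows. Only the three method names used in CreateXML come from the text above. The class name StringXmlWriter, the GetXml accessor, and the member names are assumptions made for this example. It escapes element text, tracks open elements, and collapses empty ones, but it deliberately leaves out attributes, name validation, and encoding.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Minimal interface with the methods used by CreateXML; loosely modelled on
// System.Xml.XmlWriter, not a complete port of it.
class IXmlWriter
{
public:
    virtual ~IXmlWriter() {}
    virtual void WriteStartElement(const std::string& strName) = 0;
    virtual void WriteString(const std::string& strValue) = 0;
    virtual void WriteEndElement() = 0;
};

// Hypothetical implementation that accumulates output in a std::ostringstream.
class StringXmlWriter : public IXmlWriter
{
public:
    StringXmlWriter() : m_bTagOpen(false) {}

    virtual void WriteStartElement(const std::string& strName)
    {
        CloseStartTag();
        m_stream << '<' << strName;
        m_openElements.push_back(strName);
        m_bTagOpen = true;  // '>' is deferred so an empty element can collapse
    }

    virtual void WriteString(const std::string& strValue)
    {
        CloseStartTag();
        for (std::string::size_type i = 0; i < strValue.size(); ++i)
        {
            switch (strValue[i])
            {
            case '&': m_stream << "&amp;"; break;
            case '<': m_stream << "&lt;";  break;
            case '>': m_stream << "&gt;";  break;
            default:  m_stream << strValue[i]; break;
            }
        }
    }

    virtual void WriteEndElement()
    {
        assert(!m_openElements.empty());  // no matching WriteStartElement
        if (m_bTagOpen)
        {
            m_stream << "/>";  // nothing was written inside: emit <tag/>
            m_bTagOpen = false;
        }
        else
        {
            m_stream << "</" << m_openElements.back() << '>';
        }
        m_openElements.pop_back();
    }

    // Returns everything written so far.
    std::string GetXml() const { return m_stream.str(); }

private:
    // Finishes a pending start tag before any content is written.
    void CloseStartTag()
    {
        if (m_bTagOpen)
        {
            m_stream << '>';
            m_bTagOpen = false;
        }
    }

    std::ostringstream m_stream;
    std::vector<std::string> m_openElements;  // names of elements still open
    bool m_bTagOpen;
};
```

A caller would construct a StringXmlWriter, pass it to CreateXML, and then call GetXml() once writing has finished. Because the functions that generate XML see only IXmlWriter, the same code could later target a file or a socket without changes.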

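Writing this code is a bit more work, but it will pay off in the end: escaping, encoding, and element-closing bugs get fixed in one place rather than at every call site.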

Update 2005-09-21 12:20 PM: Removed code fragment of what an ostringstream-driven IXmlWriter might look like in anticipation of future blog posts.