This is part 11/14 of my Implementing IXmlWriter post series.
Today I will add support for namespaces to last time’s IXmlWriter.
Namespaces are defined by the W3C recommendation Namespaces in XML. Using namespaces requires two parts: a namespace declaration, which associates a prefix with a namespace name (a user-defined, ideally globally-unique string which defines the namespace, often in the form of a URL); and the assignment of XML elements and attributes to this namespace by using the aforementioned prefix.
Here’s an example of an XML document that uses namespaces:
<?xml version="1.0"?>
<bk:book xmlns:bk='urn:loc.gov:books'>
</bk:book>
The xmlns:bk='urn:loc.gov:books' is the namespace declaration, and it assigns the prefix bk: to the namespace name urn:loc.gov:books. The book element is declared as a member of the urn:loc.gov:books namespace (and not the default, empty namespace) by the usage of this prefix.
There are a few subtleties to the use of namespaces. A common one is that while you can declare a default namespace into which unprefixed elements are automatically assigned (through the use of xmlns="..."), unprefixed attributes are not automatically assigned into this namespace. In other words, the following XML fragments are not equivalent because the title attributes are in different namespaces:
<!-- The title attribute is in the
     urn:loc.gov:books namespace -->
<bk:book bk:title='Cheaper by the Dozen'
         xmlns:bk='urn:loc.gov:books'>
</bk:book>
<!-- The title attribute is in the
     empty namespace -->
<book title='Cheaper by the Dozen'
      xmlns='urn:loc.gov:books'>
</book>
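To make the rule concrete, here is a minimal sketch of how a reader might resolve a qualified name into a (namespace name, local name) pair. The names PrefixBindings, ExpandedName, LookUp and Resolve are hypothetical helpers invented for this illustration; they are not part of IXmlWriter. The sketch also assumes that an undeclared prefix simply resolves to the empty namespace, whereas a real parser would report an error.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical types for illustration only (not part of IXmlWriter).
// Maps a prefix to a namespace name; the empty prefix "" stands for the
// default namespace declared with xmlns="...".
typedef std::map<std::string, std::string> PrefixBindings;
// (namespace name, local name)
typedef std::pair<std::string, std::string> ExpandedName;

// Returns the namespace bound to prefix, or "" if there is no binding.
// Assumption: a real parser would reject an undeclared prefix instead.
static std::string LookUp(const PrefixBindings& bindings,
                          const std::string& prefix)
{
    PrefixBindings::const_iterator found = bindings.find(prefix);
    return (found != bindings.end()) ? found->second : std::string();
}

// Resolves a qualified name such as "bk:title" or "title".
ExpandedName Resolve(const std::string& qname,
                     const PrefixBindings& bindings,
                     bool isAttribute)
{
    assert(!qname.empty());
    const std::string::size_type colon = qname.find(':');
    if (colon != std::string::npos)
    {
        // Prefixed names use the prefix binding, for elements and
        // attributes alike.
        return ExpandedName(LookUp(bindings, qname.substr(0, colon)),
                            qname.substr(colon + 1));
    }
    if (isAttribute)
    {
        // Unprefixed attributes never pick up the default namespace.
        return ExpandedName(std::string(), qname);
    }
    // Unprefixed elements fall into the default namespace, if any.
    return ExpandedName(LookUp(bindings, std::string()), qname);
}
```

With the bindings from the two fragments above, Resolve("bk:title", ..., true) yields the urn:loc.gov:books namespace, while Resolve("title", ..., true) yields the empty namespace even though the book element next to it resolves to urn:loc.gov:books.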
An important point to note (and one which we will take advantage of shortly) is that the prefix itself carries no meaning: it is simply shorthand for denoting that an XML element or attribute belongs to a namespace. In other words, if I replaced bk: with foobar: everywhere in the above code, the resulting document would be equivalent to the original. Therefore, for now, I choose not to let users of IXmlWriter control the namespace prefixes; I will assign them automatically as ns1:, ns2:, …
In order to keep track of what namespaces have already been declared, I will store them (in addition to the namespace-qualified name QName) in m_openedElements. Because I need the ability to search all declared namespaces for all opened elements, I will change m_openedElements from a std::stack to a std::vector. Furthermore, because namespaces can only be declared in an essentially stack-like manner, I can assign namespace prefixes by simply counting up the total number of namespace prefixes already declared and adding one. I will not support default namespaces at this time.
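Stripped of all XML output, the bookkeeping described above might look roughly like the following sketch. PrefixTracker and its members are hypothetical names used only for this illustration; the real writer keeps the same information inside m_openedElements alongside each element's QName.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the writer's namespace bookkeeping.
class PrefixTracker
{
private:
    // Namespaces declared on one open element (namespace -> prefix)
    typedef std::map<std::string, std::string> Scope;
    // One entry per open element, outermost first
    std::vector<Scope> m_scopes;

public:
    // Call when an element is opened.
    void PushElement()
    {
        m_scopes.push_back(Scope());
    }

    // Call when an element is closed; its declarations go out of scope.
    void PopElement()
    {
        assert(!m_scopes.empty());
        m_scopes.pop_back();
    }

    // Returns the prefix for ns. If no open element has declared ns yet,
    // declares it on the innermost open element, numbering the new prefix
    // by the count of declarations currently in scope plus one.
    std::string PrefixFor(const std::string& ns)
    {
        assert(!m_scopes.empty());
        std::size_t declared = 0;
        for (std::size_t i = 0; i < m_scopes.size(); ++i)
        {
            Scope::const_iterator found = m_scopes[i].find(ns);
            if (found != m_scopes[i].end())
            {
                return found->second;
            }
            declared += m_scopes[i].size();
        }
        std::ostringstream name;
        name << "ns" << (declared + 1);
        m_scopes.back()[ns] = name.str();
        return name.str();
    }
};
```

Because closing an element discards its declarations, two sibling elements that each declare a namespace both receive the prefix ns1, which is the behavior the "sibling" test case below expects.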
Here are the five test cases I developed for this functionality:
// Test simple namespaces
StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root", "namespace1");
xmlWriter.WriteEndElement();
std::string strXML = xmlWriter.GetXmlString();
// strXML should be <ns1:root xmlns:ns1="namespace1"/>
// Test attribute namespacing
StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root", "namespace1");
xmlWriter.WriteAttributeString("att", "namespace1", "value");
xmlWriter.WriteEndElement();
std::string strXML = xmlWriter.GetXmlString();
// strXML should be <ns1:root xmlns:ns1="namespace1" ns1:att="value"/>
// Test child namespace declarations
StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root", "namespace1");
xmlWriter.WriteElementString("child", "namespace2", "value");
xmlWriter.WriteEndElement();
std::string strXML = xmlWriter.GetXmlString();
// strXML should be:
// <ns1:root xmlns:ns1="namespace1"><ns2:child xmlns:ns2="namespace2">value</ns2:child></ns1:root>
// Complicated namespace test
StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root", "namespace1");
xmlWriter.WriteStartElement("child", "namespace2");
xmlWriter.WriteAttributeString("att1", "namespace1", "value1");
xmlWriter.WriteAttributeString("att2", "namespace2", "value2");
xmlWriter.WriteAttributeString("att3", "namespace3", "value3");
xmlWriter.WriteAttributeString("att4", "value4");
xmlWriter.WriteStartElement("child2", "namespace3");
xmlWriter.WriteString("value");
xmlWriter.WriteEndElement();
xmlWriter.WriteEndElement();
xmlWriter.WriteEndElement();
std::string strXML = xmlWriter.GetXmlString();
// strXML should be (on one line):
// <ns1:root xmlns:ns1="namespace1">
// <ns2:child xmlns:ns2="namespace2" ns1:att1="value1" ns2:att2="value2" xmlns:ns3="namespace3" ns3:att3="value3" att4="value4">
// <ns3:child2>value</ns3:child2>
// </ns2:child>
// </ns1:root>
// Test "sibling" namespace declarations
StringXmlWriter xmlWriter;
xmlWriter.WriteStartElement("root");
xmlWriter.WriteStartElement("child1", "namespace1");
xmlWriter.WriteEndElement();
xmlWriter.WriteStartElement("child2", "namespace1");
xmlWriter.WriteEndElement();
xmlWriter.WriteEndElement();
std::string strXML = xmlWriter.GetXmlString();
// strXML should be (on one line):
// <root>
// <ns1:child1 xmlns:ns1="namespace1"/>
// <ns1:child2 xmlns:ns1="namespace1"/>
// </root>
Here’s the new header file:
// StringXmlWriter.h
#include <map>
#include <string>
#include <vector>

class StringXmlWriter
{
private:
    enum WriteState
    {
        WriteState_Attribute,   // An attribute value is being written
        WriteState_Content,     // Element content is being written
        WriteState_Element,     // An element start tag has been written (and is unclosed)
        WriteState_Prolog,      // The prolog is being written
        WriteState_Start,       // No Write() methods have been called
    };

    struct OpenElement
    {
        explicit OpenElement(const std::string& localName) :
            QName(localName)
        {
        }

        explicit OpenElement(const std::string& localName,
                             const std::string& prefix) :
            QName(prefix.empty() ? localName : prefix + ":" + localName)
        {
        }

        // The qualified name (namespace prefix-included) of the
        // opened element
        std::string QName;

        // All namespaces declared in this element (maps namespace
        // to namespace prefix)
        typedef std::map<std::string, std::string> Namespaces_t;
        Namespaces_t Namespaces;
    };

    WriteState m_writeState;

    // Need to use a vector instead of a stack because we must be able
    // to iterate over each opened element in the stack to see if a
    // namespace has already been declared.
    typedef std::vector<OpenElement> OpenedElements_t;
    OpenedElements_t m_openedElements;

    std::string m_xmlStr;

public:
    StringXmlWriter();

    std::string GetXmlString() const;

    void WriteAttributeString(const std::string& localName,
                              const std::string& text);
    void WriteAttributeString(const std::string& localName,
                              const std::string& ns,
                              const std::string& text);
    void WriteComment(const std::string& text);
    void WriteElementString(const std::string& localName,
                            const std::string& text);
    void WriteElementString(const std::string& localName,
                            const std::string& ns,
                            const std::string& text);
    void WriteEndAttribute();
    void WriteEndDocument();
    void WriteEndElement();
    void WriteStartAttribute(const std::string& localName);
    void WriteStartAttribute(const std::string& localName,
                             const std::string& ns);
    void WriteStartDocument();
    void WriteStartElement(const std::string& localName);
    void WriteStartElement(const std::string& localName,
                           const std::string& ns);
    void WriteString(const std::string& text);

private:
    // Disable copy construction and assignment
    StringXmlWriter(const StringXmlWriter&);
    StringXmlWriter& operator=(const StringXmlWriter&);

    std::string GetExistingNamespacePrefix(const std::string& ns);
    std::string GetNextNamespacePrefix(const std::string& ns);
};
Here’s the new implementation file:
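// StringXmlWriter.cpp
#include "StringXmlWriter.h"
#include <algorithm>   // std::find_if
#include <functional>  // std::binary_function, std::bind2nd
#include <sstream>     // std::stringstream
#define ARRAYSIZE(x) ( sizeof(x) / sizeof(x[0]) )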
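struct CharTranslation
{
    char OriginalChar;
    const char* ReplacementString;
};

// Characters that must be escaped inside a double-quoted attribute value
static const CharTranslation AttributeValueTranslations[] =
{
    { '"', "&quot;" },
    { '&', "&amp;" },
};

// Characters that must be escaped inside element content
static const CharTranslation CharDataTranslations[] =
{
    { '&', "&amp;" },
    { '<', "&lt;" },
    { '>', "&gt;" },
};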
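// Note: std::binary_function and std::bind2nd were deprecated in C++11 and
// removed in C++17, so this code needs a pre-C++17 language mode.
struct OriginalCharEquals :
    public std::binary_function<CharTranslation, char, bool>
{
    bool operator() (const CharTranslation& translation, char ch) const
    {
        return (translation.OriginalChar == ch);
    }
};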
static std::string TranslateString(const std::string& originalStr,
                                   const CharTranslation* translations,
                                   int numTranslations)
{
    // Actually one past end, needed for proper std::find_if semantics
    const CharTranslation* endTranslations = translations + numTranslations;

    std::string translatedStr;
    for (std::string::const_iterator stringIter = originalStr.begin();
         stringIter != originalStr.end();
         ++stringIter)
    {
        char ch = *stringIter;
        const CharTranslation* translation = std::find_if
        (
            translations,
            endTranslations,
            std::bind2nd(OriginalCharEquals(), ch)
        );
        if (translation != endTranslations)
        {
            translatedStr += translation->ReplacementString;
        }
        else
        {
            translatedStr += ch;
        }
    }
    return translatedStr;
}
StringXmlWriter::StringXmlWriter() : m_writeState(WriteState_Start)
{
}

std::string StringXmlWriter::GetXmlString() const
{
    return m_xmlStr;
}

void StringXmlWriter::WriteAttributeString(const std::string& localName,
                                           const std::string& text)
{
    WriteStartAttribute(localName);
    WriteString(text);
    WriteEndAttribute();
}

void StringXmlWriter::WriteAttributeString(const std::string& localName,
                                           const std::string& ns,
                                           const std::string& text)
{
    WriteStartAttribute(localName, ns);
    WriteString(text);
    WriteEndAttribute();
}
void StringXmlWriter::WriteComment(const std::string& text)
{
    switch (m_writeState)
    {
    case WriteState_Element:
        // An element is currently open. Close the element so we can open
        // a new one.
        m_xmlStr += '>';
        m_writeState = WriteState_Content;
        // FALL THROUGH
    case WriteState_Content:
    case WriteState_Prolog:
    case WriteState_Start:
        m_xmlStr += "<!--";
        m_xmlStr += text;
        m_xmlStr += "-->";
        break;
    default:
        // It doesn't make sense to allow writing comments when writing an
        // attribute value.
        // TODO: Generate error
        break;
    }
}
void StringXmlWriter::WriteElementString(const std::string& localName,
                                         const std::string& text)
{
    WriteStartElement(localName);
    WriteString(text);
    WriteEndElement();
}

void StringXmlWriter::WriteElementString(const std::string& localName,
                                         const std::string& ns,
                                         const std::string& text)
{
    WriteStartElement(localName, ns);
    WriteString(text);
    WriteEndElement();
}

void StringXmlWriter::WriteEndAttribute()
{
    switch (m_writeState)
    {
    case WriteState_Attribute:
        m_xmlStr += '"';
        m_writeState = WriteState_Element;
        break;
    default:
        // TODO: Generate error
        break;
    }
}
void StringXmlWriter::WriteEndDocument()
{
    switch (m_writeState)
    {
    case WriteState_Attribute:
        WriteEndAttribute();
        // FALL THROUGH
    case WriteState_Content:
    case WriteState_Element:
        while (!m_openedElements.empty())
        {
            WriteEndElement();
        }
        break;
    case WriteState_Start:
    case WriteState_Prolog:
        // DO NOTHING
        break;
    default:
        // TODO: Generate error
        break;
    }
    m_writeState = WriteState_Start;
}
void StringXmlWriter::WriteEndElement()
{
    switch (m_writeState)
    {
    case WriteState_Content:
        {
            m_xmlStr += "</";
            m_xmlStr += m_openedElements.back().QName;
            m_xmlStr += '>';
            m_openedElements.pop_back();
            m_writeState = WriteState_Content;
            break;
        }
    case WriteState_Element:
        {
            m_xmlStr += "/>";
            m_openedElements.pop_back();
            m_writeState = WriteState_Content;
            break;
        }
    default:
        // TODO: Generate error
        break;
    }
}
void StringXmlWriter::WriteStartAttribute(const std::string& localName)
{
    WriteStartAttribute(localName, "");
}

void StringXmlWriter::WriteStartAttribute(const std::string& localName,
                                          const std::string& ns)
{
    switch (m_writeState)
    {
    case WriteState_Element:
        {
            std::string nsPrefix;
            bool mustDeclareNamespace = false;
            if (!ns.empty()) {
                nsPrefix = GetExistingNamespacePrefix(ns);
                if (nsPrefix.empty()) {
                    nsPrefix = GetNextNamespacePrefix(ns);
                    m_openedElements.back().Namespaces[ns] = nsPrefix;
                    mustDeclareNamespace = true;
                }
            }
            if (mustDeclareNamespace) {
                m_xmlStr += " xmlns:";
                m_xmlStr += nsPrefix;
                m_xmlStr += "=\"";
                m_xmlStr += ns;
                m_xmlStr += '"';
            }
            m_xmlStr += ' ';
            if (!nsPrefix.empty()) {
                m_xmlStr += nsPrefix;
                m_xmlStr += ':';
            }
            m_xmlStr += localName;
            m_xmlStr += "=\"";
            m_writeState = WriteState_Attribute;
            break;
        }
    default:
        // TODO: Generate error
        break;
    }
}
void StringXmlWriter::WriteStartDocument()
{
    switch (m_writeState)
    {
    case WriteState_Start:
        m_xmlStr += "<?xml version=\"1.0\"?>";
        m_writeState = WriteState_Prolog;
        break;
    default:
        // TODO: Generate error
        break;
    }
}

void StringXmlWriter::WriteStartElement(const std::string& localName)
{
    WriteStartElement(localName, "");
}
void StringXmlWriter::WriteStartElement(const std::string& localName,
                                        const std::string& ns)
{
    switch (m_writeState)
    {
    case WriteState_Element:
        // An element is currently open. Close the element so we can open
        // a new one.
        m_xmlStr += '>';
        // FALL THROUGH
    case WriteState_Content:
    case WriteState_Prolog:
    case WriteState_Start:
        {
            std::string nsPrefix;
            bool mustDeclareNamespace = false;
            if (!ns.empty()) {
                nsPrefix = GetExistingNamespacePrefix(ns);
                if (nsPrefix.empty()) {
                    nsPrefix = GetNextNamespacePrefix(ns);
                    mustDeclareNamespace = true;
                }
            }
            OpenElement openElement(localName, nsPrefix);
            if (mustDeclareNamespace) {
                openElement.Namespaces[ns] = nsPrefix;
            }
            m_openedElements.push_back(openElement);
            m_xmlStr += '<';
            if (!nsPrefix.empty()) {
                m_xmlStr += nsPrefix;
                m_xmlStr += ':';
            }
            m_xmlStr += localName;
            if (mustDeclareNamespace) {
                m_xmlStr += " xmlns:";
                m_xmlStr += nsPrefix;
                m_xmlStr += "=\"";
                m_xmlStr += ns;
                m_xmlStr += '"';
            }
            m_writeState = WriteState_Element;
            break;
        }
    default:
        // TODO: Generate error
        break;
    }
}
void StringXmlWriter::WriteString(const std::string& text)
{
    switch (m_writeState)
    {
    case WriteState_Attribute:
        m_xmlStr += TranslateString
        (
            text,
            AttributeValueTranslations,
            ARRAYSIZE(AttributeValueTranslations)
        );
        break;
    case WriteState_Element:
        // An element is currently open. Close the element so we can start
        // writing the element content.
        m_xmlStr += '>';
        m_writeState = WriteState_Content;
        // FALL THROUGH
    case WriteState_Content:
        m_xmlStr += TranslateString
        (
            text,
            CharDataTranslations,
            ARRAYSIZE(CharDataTranslations)
        );
        break;
    default:
        // TODO: Generate error
        break;
    }
}
std::string StringXmlWriter::GetExistingNamespacePrefix(const std::string& ns)
{
    for (OpenedElements_t::const_iterator openElemIter = m_openedElements.begin();
         openElemIter != m_openedElements.end();
         ++openElemIter)
    {
        OpenElement::Namespaces_t::const_iterator nsIter =
            openElemIter->Namespaces.find(ns);
        if (nsIter != openElemIter->Namespaces.end())
        {
            return nsIter->second;
        }
    }
    return "";
}

std::string StringXmlWriter::GetNextNamespacePrefix(const std::string& ns)
{
    // Namespace prefixes are named ns1, ns2, … They directly correlate to
    // the total number of namespaces already declared.
    size_t totalNumNamespaces = 0;
    for (OpenedElements_t::const_iterator iter = m_openedElements.begin();
         iter != m_openedElements.end();
         ++iter)
    {
        totalNumNamespaces += iter->Namespaces.size();
    }
    std::stringstream ss;
    ss << "ns" << (totalNumNamespaces + 1);
    return ss.str();
}