Getting CDATA values using DOM and XPath in PHP

That title is a mouthful, isn’t it?

Anyway, one of the recent issues I’ve had in my coding journey is using DOM and XPath together to navigate around an XML document. The problem came when the text value of an element was wrapped in a CDATA section:

public function getNodeValueByPath($query)
{
    $xpath = new DOMXPath($this->xmlDOM);
    $node = $xpath->query($query . '/text()');
    return $node->item(0)->nodeValue;
}

With the text() node test appended to the XPath query, this returned the content fine unless the content was wrapped in a CDATA section. The resulting query looks like this:

/root/chapters/chapter/text()

I used Xacobeo to explore the document and discovered that the CDATA section was being treated as an additional node, so text() matched more than one node and the first one was not the content I wanted.
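To see the same thing without Xacobeo, here is a minimal sketch that lists what text() matches. The sample document is hypothetical and assumes the CDATA section sits on its own line inside the element, so it is surrounded by whitespace; your document may be laid out differently.

```php
<?php
// Hypothetical sample: the CDATA section is on its own line, so the
// parser also keeps the whitespace before and after it as text nodes.
$xml = <<<XML
<root>
  <chapters>
    <chapter>
      <![CDATA[Chapter <one>]]>
    </chapter>
  </chapters>
</root>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$nodes = $xpath->query('/root/chapters/chapter/text()');

// Print the index, class and (JSON-escaped) value of every matched node.
for ($i = 0; $i < $nodes->length; $i++) {
    $node = $nodes->item($i);
    printf("%d: %s %s\n", $i, get_class($node), json_encode($node->nodeValue));
}
```

With a layout like this, the list should contain a whitespace-only DOMText node first and the DOMCdataSection node second, which is why item(0) alone does not return the content.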

Luckily, as long as you know that you’re accessing the text node, you can do a quick check to see if that second node exists:

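public function getNodeValueByPath($query)
{
    $xpath = new DOMXPath($this->xmlDOM);
    $nodes = $xpath->query($query . '/text()');

    // Handle CDATA: when a second node exists, it is the CDATA section
    // and the first node is only the whitespace in front of it.
    if (isset($nodes->item(1)->nodeValue)) {
        return $nodes->item(1)->nodeValue;
    }

    // No CDATA section: the first text node holds the content.
    return $nodes->item(0)->nodeValue;
}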

This checks whether a second node exists and, if so, returns its value instead of the first node's.
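If you would rather not depend on the position of the CDATA node, one possible variation is to join every text and CDATA child of the first matching element and trim the result. This is only a sketch under a few assumptions: it is written as a standalone function that takes the DOMDocument as a parameter (instead of using $this->xmlDOM), it returns null when nothing matches, and it trims surrounding whitespace, which may not be what you want if leading or trailing whitespace is significant.

```php
<?php
// Sketch: return the combined text/CDATA content of the first element
// matched by $query, or null if the query fails or matches nothing.
function getNodeValueByPath(DOMDocument $dom, $query)
{
    $xpath = new DOMXPath($dom);
    $elements = $xpath->query($query);

    if ($elements === false || $elements->length === 0) {
        return null;
    }

    $value = '';
    foreach ($elements->item(0)->childNodes as $child) {
        // DOMCdataSection extends DOMText, so this covers both kinds.
        if ($child instanceof DOMText) {
            $value .= $child->nodeValue;
        }
    }

    // Assumption: whitespace around the content is not significant.
    return trim($value);
}

// Usage with a hypothetical document:
$dom = new DOMDocument();
$dom->loadXML("<root><chapters><chapter>\n  <![CDATA[Chapter <one>]]>\n</chapter></chapters></root>");

echo getNodeValueByPath($dom, '/root/chapters/chapter'), "\n"; // prints: Chapter <one>
```

Because it concatenates the nodes rather than picking one by index, the same function also works for plain text content and for a CDATA section with no whitespace around it.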