setcooki
ALLROUND WEB DEVELOPER

28
Sep 11

PHP DOM XML to Array

What is the fastest way to parse an XML string into an equivalent, valid array representation?

I have seen quite a few attempts on different blogs, and the comments on php.net for some of the functions suggest some valid approaches. The problem with most of them, I find, is that the XML is not tested before it is passed to the function that transforms it into an array. When you want to make sure you have a valid XML string, the PHP DOMDocument class and its related classes provide everything you need, from iterating through child nodes to saving string representations to a file. On top of that, DOM checks that an XML string is well-formed when it is passed to the loading methods of the class. I also believe that any approach using DOM, rather than handling the XML string with regex or the like, should be much faster and safer to rely on.
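To make the "test before you transform" step concrete, here is a minimal sketch of a loader that rejects strings DOM cannot parse. The helper name `loadCheckedXml` is hypothetical and not part of the original function; it only relies on `DOMDocument::loadXML()` and the libxml error functions that ship with PHP.

```php
<?php
// Hypothetical helper (not from the original post): load an XML string and
// fail loudly if libxml reports that it is not well-formed.
function loadCheckedXml(string $xml): DOMDocument
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);

    // Fetch the collected errors, then restore the previous error handling mode.
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if($ok === false)
    {
        $message = isset($errors[0]) ? trim($errors[0]->message) : 'unknown parse error';
        throw new InvalidArgumentException('XML is not well-formed: ' . $message);
    }
    return $doc;
}
```

Anything that comes back from this helper is a parsed document, so a DOM based conversion function can walk it without having to guard against broken markup.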

Once the XML string has been validated, your XML to array function should never throw any errors. And since you have already validated your XML with DOM, you might as well use the DOM functions to transform it into an array, rather than only testing the string for validity. So here is a function I found a while ago (I don't remember where exactly, so no credits :/), which I optimized a little to get rid of unwanted notices and to make it more reliable:

function xmlToArray(DOMNode $node)
{
    $result = array();
    $group = array();
    $attrs = null;
    $children = null;

    if($node->hasAttributes())
    {
        $attrs = $node->attributes;
        foreach($attrs as $k => $v)
        {
            $result[$v->name] = $v->value;
        }
    }

    $children = $node->childNodes;

    if(!empty($children))
    {
        if((int)$children->length === 1)
        {
            $child = $children->item(0);

            if($child !== null && $child->nodeType === XML_TEXT_NODE)
            {
                $result['#value'] = $child->nodeValue;
                if(count($result) == 1)
                {
                    return $result['#value'];
                }else{
                    return $result;
                }
            }
        }

        for($i = 0; $i < (int)$children->length; $i++)
        {
            $child = $children->item($i);

            if($child !== null)
            {
                if(!isset($result[$child->nodeName]))
                {
                    $result[$child->nodeName] = xmlToArray($child);
                }else{
                    if(!isset($group[$child->nodeName]))
                    {
                        $result[$child->nodeName] = array($result[$child->nodeName]);
                        $group[$child->nodeName] = 1;
                    }
                    $result[$child->nodeName][] = xmlToArray($child);
                }
            }
        }
    }
    return $result;
}

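// loadXML() is an instance method and returns false when the string is not well-formed
$doc = new DOMDocument();
if($doc->loadXML('your xml string'))
{
    $arr = xmlToArray($doc);
    print_r($arr);
}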

If you want control over how the function handles nodes that contain both a text value and attributes, look at the attributes loop; you can alter it however you like. In its current form, as soon as a node has an attribute, the function returns an associative array of attribute name => attribute value pairs, with the actual text node value under the #value key. As long as you always check for arrays at the lowest depth, where you expect the node values, you should be fine.
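The "always check for arrays" advice can be sketched as a small accessor. The name `nodeText` and the sample entries are hypothetical; the two shapes (a plain string, or an array with attributes plus a `#value` key) are the ones described above.

```php
<?php
// Hypothetical helper: return the text value of an entry produced by xmlToArray(),
// whether it is a plain string or an array carrying attributes plus '#value'.
function nodeText($entry, $default = null)
{
    if(is_array($entry))
    {
        // Node had attributes (or no text at all): the text lives under '#value'.
        return array_key_exists('#value', $entry) ? $entry['#value'] : $default;
    }
    // Node had no attributes: the entry is the text value itself.
    return $entry;
}

// Hand-written sample entries in the two shapes.
$plain = '9.99';                                             // e.g. <price>9.99</price>
$withAttrs = array('currency' => 'EUR', '#value' => '9.99'); // e.g. <price currency="EUR">9.99</price>

echo nodeText($plain), "\n";
echo nodeText($withAttrs), "\n";
```

Both calls yield the same text value, so the calling code does not need to know whether a node happened to carry attributes.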


Copyright © 2012 setcooki