DOMEntity

Represents an entity in an XML document as a DOM tree.

parseDOM takes either a range of characters or a dxml.parser.EntityRange and generates a DOMEntity from that XML.

When parseDOM processes the XML, it returns a DOMEntity representing the entire document. Even though the XML document itself isn't technically an entity in the XML document, it's simplest to treat it as if it were an EntityType.elementStart with an empty name. That DOMEntity then contains child entities that recursively define the DOM tree through their children.

For DOMEntities of type EntityType.elementStart, DOMEntity.children gives access to all of the child entities of that start tag. Other DOMEntities have no children.

Note that the type determines which properties of the DOMEntity can be used, and it can determine whether functions which a DOMEntity is passed to are allowed to be called. Each function lists which EntityTypes are allowed, and it is an error to call them with any other EntityType.
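The rule that the type decides which properties may be used can be sketched as a small recursive walk. This is only an illustration: countElements is a hypothetical helper, not part of dxml, and it assumes that children is only read on entities of type EntityType.elementStart.

```d
import dxml.dom : DOMEntity, parseDOM;
import dxml.parser : EntityType;

// Hypothetical helper (not part of dxml): counts the element entities
// (start tags and empty-element tags) in the subtree rooted at `entity`,
// including `entity` itself. `entity` must be an EntityType.elementStart.
size_t countElements(R)(DOMEntity!R entity)
{
    assert(entity.type == EntityType.elementStart);

    size_t count = 1;
    foreach (child; entity.children)
    {
        // Check the type first: only start tags may be asked for children.
        if (child.type == EntityType.elementStart)
            count += countElements(child);
        else if (child.type == EntityType.elementEmpty)
            ++count;
        // Comments, text, CDATA sections, and PIs are not counted.
    }
    return count;
}

void main()
{
    auto dom = parseDOM("<root><a><b/></a><!-- note --><c>text</c></root>");
    auto root = dom.children[0];
    assert(countElements(root) == 4); // root, a, b, c
}
```

Starting the walk at dom.children[0] rather than at dom keeps the nameless document entity out of the count.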

If parseDOM is given a range of characters, it in turn passes that to dxml.parser.parseXML to do the actual XML parsing. As such, that overload accepts an optional dxml.parser.Config as a template argument to configure the parser.

If parseDOM is given an EntityRange, the range does not have to be at the start of the document, so parseDOM can be used to create a DOM for just a portion of the document. It still returns a DOMEntity with the type EntityType.elementStart and an empty name. It iterates the range until it reaches either the end of the range or the end tag matching the start tag that is the parent of the entity that was at the front of the range when it was passed to parseDOM. The EntityRange is passed by ref, so if it was not at the top level when it was passed to parseDOM (and thus still has elements in it when parseDOM returns), the range will then be at the entity after that matching end tag, and the application can continue to process the range after that if it so chooses.

Postblit

this(this)
this(this)
Undocumented in source.

Public Imports

std.algorithm.searching
public import std.algorithm.searching : canFind;
std.range
public import std.range : only, takeExactly;
std.typecons
public import std.typecons : Tuple;
dxml.parser
public import dxml.parser : TextPos;

Members

Aliases

Attribute
alias Attribute = Tuple!(SliceOfR, "name", SliceOfR, "value", TextPos, "pos")

The exact instantiation of std.typecons.Tuple that attributes returns a range of.

SliceOfR
alias SliceOfR = R

The type used when any slice of the original range of characters is used. If the range was a string or supports slicing, then SliceOfR is the same type as the range; otherwise, it's the result of calling std.range.takeExactly on it.
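A minimal sketch of the difference, assuming the result of std.algorithm.iteration.filter stands in for a character range without slicing (the always-true predicate is only there to hide the string behind a range type):

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : filter;
import dxml.dom : parseDOM;

void main()
{
    // A string is used as-is, so names and text come back as plain strings.
    auto fromString = parseDOM("<root/>");
    static assert(is(typeof(fromString).SliceOfR == string));
    assert(fromString.children[0].name == "root");

    // filter yields a forward range without slicing, so SliceOfR is a
    // takeExactly wrapper and is compared element by element instead.
    auto fromRange = parseDOM("<root/>".filter!(c => true));
    static assert(!is(typeof(fromRange).SliceOfR == string));
    assert(equal(fromRange.children[0].name, "root"));
}
```

Code that should work with either kind of input can use equal for every comparison, since it also accepts strings.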

SliceOfR
alias SliceOfR = typeof(takeExactly(R.init, 42))
Undocumented in source.

Properties

attributes
auto attributes [@property getter]

Returns a dynamic array of attributes for a start tag, where each attribute is represented as a Tuple!(SliceOfR, "name", SliceOfR, "value", TextPos, "pos").
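As a short illustration of reading that array, the sketch below looks up attributes by name. attributeOr is a hypothetical helper rather than part of dxml, and it is written for string input only to keep the types simple.

```d
import dxml.dom : DOMEntity, parseDOM;
import dxml.parser : EntityType, TextPos;

// Hypothetical helper (not part of dxml): returns the value of the
// attribute called `name`, or `fallback` if the tag has no such attribute.
string attributeOr(DOMEntity!string tag, string name, string fallback)
{
    foreach (attr; tag.attributes)
    {
        if (attr.name == name)
            return attr.value;
    }
    return fallback;
}

void main()
{
    auto root = parseDOM("<root a='foo' b='bar'/>").children[0];
    assert(root.type == EntityType.elementEmpty);

    // Attributes come back in document order as name/value/pos tuples.
    auto attrs = root.attributes;
    assert(attrs.length == 2);
    assert(attrs[0].name == "a");
    assert(attrs[0].value == "foo");
    assert(attrs[0].pos == TextPos(1, 7)); // line 1, column 7

    assert(attributeOr(root, "b", "") == "bar");
    assert(attributeOr(root, "missing", "n/a") == "n/a");
}
```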

children
DOMEntity[] children [@property getter]

Returns the child entities of the current entity.

name
SliceOfR name [@property getter]

Gives the name of this DOMEntity.

path
SliceOfR[] path [@property getter]

Gives the list of the names of the parent start tags of this DOMEntity.
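The sketch below shows how path might be combined with name, assuming that path holds only the names of the enclosing start tags and leaves out the nameless document entity. fullPath is a hypothetical helper, not part of dxml.

```d
import std.array : join;
import dxml.dom : DOMEntity, parseDOM;

// Hypothetical helper (not part of dxml): joins the names of the parent
// start tags and the entity's own name into a slash-separated string.
string fullPath(DOMEntity!string entity)
{
    return (entity.path ~ entity.name).join("/");
}

void main()
{
    auto dom = parseDOM("<root><bar><baz/></bar></root>");
    auto root = dom.children[0];
    auto bar = root.children[0];
    auto baz = bar.children[0];

    assert(root.path.length == 0);      // top-level element: no parents
    assert(bar.path == ["root"]);
    assert(baz.path == ["root", "bar"]);

    assert(fullPath(baz) == "root/bar/baz");
}
```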

pos
TextPos pos [@property getter]

The position in the original text where the entity starts.

text
SliceOfR text [@property getter]

Returns the textual value of this DOMEntity.

type
EntityType type [@property getter]

The EntityType for this DOMEntity.

Return Value

A DOMEntity representing the DOM tree from the point in the document that was passed to parseDOM (the start of the document if a range of characters was passed, and wherever in the document the range was if an EntityRange was passed).

Throws

XMLParsingException if the parser encounters invalid XML.
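Since parseDOM builds the whole tree before returning, invalid XML should surface as an exception from the parseDOM call itself. The sketch below wraps that in a hypothetical helper, parsesCleanly, which is not part of dxml; the exception's pos member is referenced on the assumption that it reports where parsing failed.

```d
import dxml.dom : parseDOM;
import dxml.parser : XMLParsingException;

// Hypothetical helper (not part of dxml): reports whether parseDOM can
// build a tree from `xml` without the parser rejecting it.
bool parsesCleanly(string xml)
{
    try
    {
        auto dom = parseDOM(xml);
        return true;
    }
    catch (XMLParsingException e)
    {
        // e.pos (a TextPos) and e.msg describe where and why parsing failed.
        return false;
    }
}

void main()
{
    assert(parsesCleanly("<root><foo/></root>"));

    // The end tag </root> does not match the open start tag <foo>.
    assert(!parsesCleanly("<root><foo></root>"));
}
```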

Examples

parseDOM with the default Config and a range of characters.

import std.range.primitives;

auto xml = "<root>\n" ~
           "    <!-- no comment -->\n" ~
           "    <foo></foo>\n" ~
           "    <baz>\n" ~
           "        <xyzzy>It's an adventure!</xyzzy>\n" ~
           "    </baz>\n" ~
           "    <tag/>\n" ~
           "</root>";

auto dom = parseDOM(xml);
assert(dom.type == EntityType.elementStart);
assert(dom.name.empty);
assert(dom.children.length == 1);

auto root = dom.children[0];
assert(root.type == EntityType.elementStart);
assert(root.name == "root");
assert(root.children.length == 4);

assert(root.children[0].type == EntityType.comment);
assert(root.children[0].text == " no comment ");

assert(root.children[1].type == EntityType.elementStart);
assert(root.children[1].name == "foo");
assert(root.children[1].children.length == 0);

auto baz = root.children[2];
assert(baz.type == EntityType.elementStart);
assert(baz.name == "baz");
assert(baz.children.length == 1);

auto xyzzy = baz.children[0];
assert(xyzzy.type == EntityType.elementStart);
assert(xyzzy.name == "xyzzy");
assert(xyzzy.children.length == 1);

assert(xyzzy.children[0].type == EntityType.text);
assert(xyzzy.children[0].text == "It's an adventure!");

assert(root.children[3].type == EntityType.elementEmpty);
assert(root.children[3].name == "tag");

parseDOM with simpleXML and a range of characters.

import std.range.primitives : empty;

auto xml = "<root>\n" ~
           "    <!-- no comment -->\n" ~
           "    <foo></foo>\n" ~
           "    <baz>\n" ~
           "        <xyzzy>It's an adventure!</xyzzy>\n" ~
           "    </baz>\n" ~
           "    <tag/>\n" ~
           "</root>";

auto dom = parseDOM!simpleXML(xml);
assert(dom.type == EntityType.elementStart);
assert(dom.name.empty);
assert(dom.children.length == 1);

auto root = dom.children[0];
assert(root.type == EntityType.elementStart);
assert(root.name == "root");
assert(root.children.length == 3);

assert(root.children[0].type == EntityType.elementStart);
assert(root.children[0].name == "foo");
assert(root.children[0].children.length == 0);

auto baz = root.children[1];
assert(baz.type == EntityType.elementStart);
assert(baz.name == "baz");
assert(baz.children.length == 1);

auto xyzzy = baz.children[0];
assert(xyzzy.type == EntityType.elementStart);
assert(xyzzy.name == "xyzzy");
assert(xyzzy.children.length == 1);

assert(xyzzy.children[0].type == EntityType.text);
assert(xyzzy.children[0].text == "It's an adventure!");

assert(root.children[2].type == EntityType.elementStart);
assert(root.children[2].name == "tag");
assert(root.children[2].children.length == 0);

parseDOM with simpleXML and an EntityRange.

import std.range.primitives : empty;
import dxml.parser : parseXML;

auto xml = "<root>\n" ~
           "    <!-- no comment -->\n" ~
           "    <foo></foo>\n" ~
           "    <baz>\n" ~
           "        <xyzzy>It's an adventure!</xyzzy>\n" ~
           "    </baz>\n" ~
           "    <tag/>\n" ~
           "</root>";

auto range = parseXML!simpleXML(xml);
auto dom = parseDOM(range);
assert(range.empty);

assert(dom.type == EntityType.elementStart);
assert(dom.name.empty);
assert(dom.children.length == 1);

auto root = dom.children[0];
assert(root.type == EntityType.elementStart);
assert(root.name == "root");
assert(root.children.length == 3);

assert(root.children[0].type == EntityType.elementStart);
assert(root.children[0].name == "foo");
assert(root.children[0].children.length == 0);

auto baz = root.children[1];
assert(baz.type == EntityType.elementStart);
assert(baz.name == "baz");
assert(baz.children.length == 1);

auto xyzzy = baz.children[0];
assert(xyzzy.type == EntityType.elementStart);
assert(xyzzy.name == "xyzzy");
assert(xyzzy.children.length == 1);

assert(xyzzy.children[0].type == EntityType.text);
assert(xyzzy.children[0].text == "It's an adventure!");

assert(root.children[2].type == EntityType.elementStart);
assert(root.children[2].name == "tag");
assert(root.children[2].children.length == 0);

parseDOM with an EntityRange which is not at the start of the document.

import std.range.primitives : empty;
import dxml.parser : parseXML, skipToPath;

auto xml = "<root>\n" ~
           "    <!-- no comment -->\n" ~
           "    <foo></foo>\n" ~
           "    <baz>\n" ~
           "        <xyzzy>It's an adventure!</xyzzy>\n" ~
           "    </baz>\n" ~
           "    <tag/>\n" ~
           "</root>";

auto range = parseXML!simpleXML(xml).skipToPath("baz/xyzzy");
assert(range.front.type == EntityType.elementStart);
assert(range.front.name == "xyzzy");

auto dom = parseDOM(range);
assert(range.front.type == EntityType.elementStart);
assert(range.front.name == "tag");

assert(dom.type == EntityType.elementStart);
assert(dom.name.empty);
assert(dom.children.length == 1);

auto xyzzy = dom.children[0];
assert(xyzzy.type == EntityType.elementStart);
assert(xyzzy.name == "xyzzy");
assert(xyzzy.children.length == 1);

assert(xyzzy.children[0].type == EntityType.text);
assert(xyzzy.children[0].text == "It's an adventure!");

parseDOM at compile-time

import std.range.primitives : empty;

enum xml = "<!-- comment -->\n" ~
           "<root>\n" ~
           "    <foo>some text<whatever/></foo>\n" ~
           "    <bar/>\n" ~
           "    <baz></baz>\n" ~
           "</root>";

enum dom = parseDOM(xml);
static assert(dom.type == EntityType.elementStart);
static assert(dom.name.empty);
static assert(dom.children.length == 2);

static assert(dom.children[0].type == EntityType.comment);
static assert(dom.children[0].text == " comment ");

Meta