Whether the parser should throw when it encounters any entity references
other than the five entity references defined in the XML standard.

Any other entity references would have to be defined in the DTD in
order to be valid. And in order to know what XML they represent (which
could be arbitrarily complex, even effectively inserting entire XML
documents into the middle of the XML), the DTD would have to be parsed.
However, dxml does not support parsing the DTD beyond what is required
to correctly parse past it, and replacing entity references with what
they represent would not work with the slicing semantics that
EntityRange provides. As such, it is not possible for dxml to
correctly handle any entity references other than the five which are
defined in the XML standard, and even those are only parsed by using
dxml.util.decodeXML or dxml.util.parseStdEntityRef.
By default, EntityRange validates that every entity reference is one
of the five predefined entity references, but otherwise, it lets them
pass through as normal text. It does not replace them with what they
represent.
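
As a minimal sketch of that division of labor, the following assumes that dxml is available as a dependency (e.g. via dub). decodedRootText is a hypothetical helper written for this illustration, not part of dxml. It shows that the parser returns the text with its references intact and that decoding is a separate, explicit step.

```d
import dxml.parser;
import dxml.util : decodeXML, parseStdEntityRef;

// Hypothetical helper (not part of dxml): returns the decoded text of the
// first entity after the root start tag, which is assumed to be text.
string decodedRootText(string xml)
{
    auto range = parseXML(xml);
    range.popFront(); // move from the root start tag to its text
    assert(range.front.type == EntityType.text);
    return range.front.text.decodeXML();
}

void main()
{
    auto range = parseXML("<root>&lt;tag&gt; &amp; more</root>");
    range.popFront();

    // The parser hands back the slice with the references untouched.
    assert(range.front.text == "&lt;tag&gt; &amp; more");

    // Decoding only happens when it is asked for.
    assert(decodedRootText("<root>&lt;tag&gt; &amp; more</root>")
           == "<tag> & more");

    // parseStdEntityRef consumes a single predefined reference from the
    // front of the range that it is given.
    auto rest = "&amp;foo";
    assert(rest.parseStdEntityRef() == '&');
    assert(rest == "foo");
}
```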

As such, the default behavior of EntityRange is to throw an
XMLParsingException when it encounters an entity reference
which is not one of the five defined by the XML standard. With that
behavior, there is no risk of processing an XML document as if it had
no entity references and ending up with what the program using the
parser would probably consider incorrect results. However, there are
cases where a program may find it acceptable to treat entity references
as normal text and ignore them. A program that wishes to take that
approach can set throwOnEntityRef to ThrowOnEntityRef.no.
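
The default behavior can be sketched as follows, again assuming that dxml is available as a dependency. parsesWithDefaults is a hypothetical helper written for this illustration. It walks the whole document with the default configuration and reports whether an XMLParsingException was thrown.

```d
import dxml.parser;

// Hypothetical helper (not part of dxml): returns true if the document can
// be parsed from start to finish with the default configuration, which
// uses ThrowOnEntityRef.yes.
bool parsesWithDefaults(string xml)
{
    try
    {
        // Iterating forces the parser to process every entity.
        foreach(entity; parseXML(xml)) {}
        return true;
    }
    catch(XMLParsingException e)
    {
        return false;
    }
}

void main()
{
    // Only predefined references: parses without complaint.
    assert(parsesWithDefaults("<root>&amp;&lt;&gt;&apos;&quot;</root>"));

    // &foobar; is syntactically valid but not predefined, so the default
    // configuration rejects it.
    assert(!parsesWithDefaults("<root>&foobar;</root>"));
}
```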

If throwOnEntityRef == ThrowOnEntityRef.no, then any entity
reference that EntityRange encounters will be validated to ensure that
it is syntactically valid (i.e. that the characters it contains form
what could be a valid entity reference assuming that the DTD declared it
properly), but otherwise, EntityRange will treat it as normal
text, just like it treats the five predefined entity references as
normal text.
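
A minimal sketch of the lenient configuration, assuming that dxml is available as a dependency. collectText is a hypothetical helper written for this illustration. It shows that an undeclared but well-formed reference passes through as text, while a malformed one still throws.

```d
import std.exception : assertThrown;
import dxml.parser;

// Hypothetical helper (not part of dxml): collects the text of every text
// entity, parsing with ThrowOnEntityRef.no so that references which are
// not predefined are kept as-is.
string[] collectText(string xml)
{
    enum config = makeConfig(ThrowOnEntityRef.no);
    string[] texts;
    foreach(entity; parseXML!config(xml))
    {
        if(entity.type == EntityType.text)
            texts ~= entity.text;
    }
    return texts;
}

void main()
{
    // Both the predefined reference and the unknown one come back verbatim.
    assert(collectText("<root><a>&amp;</a><b>&foobar;</b></root>")
           == ["&amp;", "&foobar;"]);

    // "--" is not a valid name for an entity, so "&--;" is rejected even
    // with ThrowOnEntityRef.no.
    assertThrown!XMLParsingException(collectText("<root>&--;</root>"));
}
```

Note that the text returned for `<b>` is still the literal "&foobar;". Deciding what to do with it (drop it, report it, or substitute something) is left to the application.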

Note that any valid XML entity reference which contains start or end
tags must contain the matching end or start tags, and entity references
cannot contain incomplete fragments of XML (e.g. the start or end of a
comment). So, missing entity references should only affect the data in
the XML document and not its overall structure (if that were not true,
attempting to ignore entity references as ThrowOnEntityRef.no
does would be a disaster in the making). However, how reasonable it is
to miss that data depends entirely on the application and what the XML
documents it's parsing contain. Hence, the behavior is configurable.