stripIndent

Strips the indent from a character range (most likely from Entity.text). The idea is that if the XML is formatted to be human-readable, and it's multiple lines long, the lines are likely to be indented, but the application probably doesn't want that extra whitespace. So, stripIndent and withoutIndent attempt to intelligently strip off the leading whitespace.

For these functions, whitespace is considered to be some combination of ' ', '\t', and '\r' ('\n' is used to delineate lines, so it's not considered whitespace).

Whitespace characters are stripped from the start of the first line, and then that same number of whitespace characters is stripped from the beginning of each subsequent line (or up to the first non-whitespace character if the line starts with fewer whitespace characters).

If the first line has no leading whitespace, then the leading whitespace on the second line is treated as the indent. This handles the case where text immediately follows a start tag and the subsequent lines are indented, as opposed to the text starting on the line after the start tag.

If neither of the first two lines has any leading whitespace, then no whitespace is stripped.
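
As a rough illustration of the rules above, the following sketch applies them using only Phobos. The names `isIndentChar`, `leadingIndent`, and `stripIndentSketch` are hypothetical and are not part of dxml. The sketch only covers the per-line rule: it ignores any special handling of a leading newline or of a trailing whitespace-only line, and it should not be read as dxml's implementation.

```d
import std.algorithm.comparison : min;
import std.algorithm.iteration : map;
import std.array : join, split;

// Hypothetical helper: true for the characters these functions treat as
// whitespace (' ', '\t', and '\r'; '\n' only separates lines).
bool isIndentChar(char c)
{
    return c == ' ' || c == '\t' || c == '\r';
}

// Hypothetical helper: number of whitespace characters at the start of a line.
size_t leadingIndent(string line)
{
    size_t n = 0;
    while (n < line.length && isIndentChar(line[n]))
        ++n;
    return n;
}

// Sketch of the per-line stripping rule only (assumption: not dxml's code).
string stripIndentSketch(string text)
{
    auto lines = text.split("\n");
    if (lines.length == 0)
        return text;

    // The indent comes from the first line, or from the second line if the
    // first line has no leading whitespace.
    size_t indent = leadingIndent(lines[0]);
    if (indent == 0 && lines.length > 1)
        indent = leadingIndent(lines[1]);

    // Strip at most `indent` characters, and never anything but whitespace.
    return lines.map!(l => l[min(indent, leadingIndent(l)) .. $]).join("\n");
}

unittest
{
    assert(stripIndentSketch("    foo\n      bar\n  baz") == "foo\n  bar\nbaz");
    assert(stripIndentSketch("foo\n    bar\n        baz") == "foo\nbar\n    baz");
    assert(stripIndentSketch("foo\nbar\n    baz") == "foo\nbar\n    baz");
    assert(stripIndentSketch("\tfoo\n\t\tbar") == "foo\n\tbar");
}
```

Taking the minimum of the indent and the line's own leading whitespace is what guarantees that a badly indented line never loses non-whitespace characters.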

So, if the text is well-formatted, the indent should be removed cleanly. If it's unformatted or badly formatted, no characters other than leading whitespace are removed, so in principle no real data is lost. Of course, it's up to the programmer to decide whether it's better for the application to try to cleanly strip the indent or to leave the text as-is.

The difference between stripIndent and withoutIndent is that stripIndent returns a string, whereas withoutIndent returns a lazy range of code units. In the case where a string is passed to stripIndent, it will simply return the original string if there is no indent (whereas in other cases, stripIndent and withoutIndent are forced to return new ranges).
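
A minimal sketch of that difference, assuming dxml is available as a dependency and that both functions are imported from `dxml.util`. The input strings are made up for illustration.

```d
import dxml.util : stripIndent, withoutIndent;
import std.algorithm.comparison : equal;

unittest
{
    auto text = "    foo\n        bar";

    // stripIndent eagerly produces a string.
    string stripped = text.stripIndent();
    assert(stripped == "foo\n    bar");

    // withoutIndent produces a lazy range, so it is compared element-wise.
    auto lazyText = text.withoutIndent();
    assert(equal(lazyText, "foo\n    bar"));

    // Other character ranges are accepted too; stripIndent still returns
    // a string.
    assert("    foo\n        bar"w.stripIndent() == "foo\n    bar");

    // Text without an indent comes back with the same contents.
    assert("foo\nbar".stripIndent() == "foo\nbar");
}
```

The lazy variant may be preferable when the result is only iterated once, since it avoids building a new string.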

string stripIndent(R)(R range)
if (isForwardRange!R && isSomeChar!(ElementType!R))

Parameters

range R

A range of characters.

Return Value

Type: string

The text with the indent stripped from each line. stripIndent returns a string, whereas withoutIndent returns a lazy range of code units (so it could be a range of char or wchar and not just dchar; which it is depends on the code units of the range being passed in).

Examples

import dxml.util : stripIndent, withoutIndent;
import std.algorithm.comparison : equal;

// The prime use case for these two functions is for an Entity.text section
// that is formatted to be human-readable, and the rules of what whitespace
// is stripped from the beginning or end of the range are geared towards
// the text coming from a well-formatted Entity.text section.
{
    import dxml.parser;
    auto xml = "<root>\n" ~
               "    <code>\n" ~
               "    bool isASCII(string str)\n" ~
               "    {\n" ~
               "        import std.algorithm : all;\n" ~
               "        import std.ascii : isASCII;\n" ~
               "        return str.all!isASCII();\n" ~
               "    }\n" ~
               "    </code>\n" ~
               "</root>";
    auto range = parseXML(xml);
    // Skip the <root> and <code> start tags to reach the text.
    range.popFront();
    range.popFront();
    assert(range.front.type == EntityType.text);
    assert(range.front.text ==
           "\n" ~
           "    bool isASCII(string str)\n" ~
           "    {\n" ~
           "        import std.algorithm : all;\n" ~
           "        import std.ascii : isASCII;\n" ~
           "        return str.all!isASCII();\n" ~
           "    }\n" ~
           "    ");
    assert(range.front.text.stripIndent() ==
           "bool isASCII(string str)\n" ~
           "{\n" ~
           "    import std.algorithm : all;\n" ~
           "    import std.ascii : isASCII;\n" ~
           "    return str.all!isASCII();\n" ~
           "}");
}

// The indent that is stripped matches the amount of whitespace at the front
// of the first line.
assert(("    start\n" ~
        "    foo\n" ~
        "    bar\n" ~
        "        baz\n" ~
        "        xyzzy\n" ~
        "           ").stripIndent() ==
       "start\n" ~
       "foo\n" ~
       "bar\n" ~
       "    baz\n" ~
       "    xyzzy\n" ~
       "       ");

// If the first line has no leading whitespace but the second line does,
// then the second line's leading whitespace is treated as the indent.
assert(("foo\n" ~
        "    bar\n" ~
        "        baz\n" ~
        "        xyzzy").stripIndent() ==
       "foo\n" ~
       "bar\n" ~
       "    baz\n" ~
       "    xyzzy");

// If the text starts with a newline, that newline is stripped, and the
// indent is taken from the following line.
assert(("\n" ~
        "    foo\n" ~
        "    bar\n" ~
        "        baz\n" ~
        "        xyzzy").stripIndent() ==
       "foo\n" ~
       "bar\n" ~
       "    baz\n" ~
       "    xyzzy");

// If neither of the first two lines has leading whitespace, then nothing
// is stripped.
assert(("foo\n" ~
        "bar\n" ~
        "    baz\n" ~
        "    xyzzy\n" ~
        "    ").stripIndent() ==
       "foo\n" ~
       "bar\n" ~
       "    baz\n" ~
       "    xyzzy\n" ~
       "    ");

// If a subsequent line starts with less whitespace than the indent, then
// all of its leading whitespace is stripped but no other characters are
// stripped.
assert(("      foo\n" ~
        "         bar\n" ~
        "   baz\n" ~
        "         xyzzy").stripIndent() ==
       "foo\n" ~
       "   bar\n" ~
       "baz\n" ~
       "   xyzzy");

// If the last line is just the indent, then it and the newline before it
// are stripped.
assert(("    foo\n" ~
        "       bar\n" ~
        "    ").stripIndent() ==
       "foo\n" ~
       "   bar");

// If the last line is just whitespace, but it's more than the indent, then
// the whitespace after the indent is kept.
assert(("    foo\n" ~
        "       bar\n" ~
        "       ").stripIndent() ==
       "foo\n" ~
       "   bar\n" ~
       "   ");

// withoutIndent does the same as stripIndent but with a lazy range.
assert(equal(("  foo\n" ~
              "    bar\n" ~
              "    baz\n").withoutIndent(),
             "foo\n" ~
             "  bar\n" ~
             "  baz"));
