Internet Explorer’s DOM has a few issues regarding whitespace. Whitespace symbols in HTML 4.01 are any of the following symbols:
- ASCII space (&#x0020;)
- ASCII tab (&#x0009;)
- ASCII form feed (&#x000C;)
- Zero-width space (&#x200B;)
Refer to the W3C's HTML 4.01 specification (section 9.1, "White space") for details.
Querying innerHTML and outerHTML
IE adds all sorts of whitespace to the markup when you query these properties. For example:
var container = document.createElement("div");
container.innerHTML = "<div><ul><li>I <em>like</em> sushi!</li></ul></div>";
alert(container.innerHTML);
<DIV>·<UL>·<LI>I <EM>like</EM> sushi!</LI></UL></DIV>
Where the "·" symbols denote the added whitespace symbols. In this example the added whitespace (probably a carriage return and a newline) makes the alert break the markup across lines, so it physically displays:
<DIV>
<UL>
<LI>I <EM>like</EM> sushi!</LI></UL></DIV>
Maybe the implementor was trying to be helpful by adding these mysterious newline symbols before every block-level element for automatic readability. What a damn fool.
Is there a fix? Well, if you really need innerHTML to be precise, you could walk the DOM tree yourself and spit out the markup as you traverse. Or you could bite the bullet and parse the string using regular expressions: for every opening block-level element tag, check for preceding whitespace symbols and eat them.
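The regular-expression route might look something like the sketch below. The helper name stripAddedWhitespace and the list of block-level tag names are my own assumptions for illustration, not anything IE or the spec hands you. Note that it cannot tell IE's added whitespace apart from whitespace that was genuinely in your markup before a block-level tag, so it eats both.

```javascript
// Hypothetical helper (not a built-in): removes whitespace that sits
// directly before an opening block-level tag in an innerHTML string.
// The tag list is an assumed subset of HTML 4.01 block-level and
// table/list elements; extend it to match your own markup.
var WHITESPACE_BEFORE_BLOCK_TAG =
  /[ \t\r\n\f\u200B]+(<(?:address|blockquote|div|dl|dd|dt|fieldset|form|h[1-6]|hr|li|ol|pre|p|table|tbody|tfoot|thead|td|th|tr|ul)\b)/gi;

function stripAddedWhitespace(html) {
  // "$1" puts the matched tag opening back and drops the whitespace run.
  return html.replace(WHITESPACE_BEFORE_BLOCK_TAG, "$1");
}
```

You would then call alert(stripAddedWhitespace(container.innerHTML)) instead of alerting innerHTML directly. Whitespace before inline elements such as em, and before closing tags, is left alone.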
Creating DOM trees via innerHTML
When creating DOM trees via innerHTML, IE does not always create a DOM tree that reflects the exact HTML contents you pass it. This is because IE automatically collapses whitespace (normalization on the fly). For example:
var container = document.createElement("div");
container.innerHTML = "\n Apples \n";
alert(container.firstChild.nodeValue.length);
All browsers except for IE print "10"; IE collapses the surrounding whitespace and prints "7".
To overcome this: don't use innerHTML. If you need the DOM tree to be precise, manually create the DOM structures yourself. I tried using "pre" white-space styles but it still normalized. You could use a pre element, but if your HTML contains block-level elements the markup will be invalid (pre only allows a select few inline-level elements).
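For the "Apples" example above, creating the structure manually could be as simple as the sketch below. The helper name appendExactText and its doc parameter are my own invention for illustration; createTextNode and appendChild are the standard DOM methods. Because the string never passes through the HTML parser, there is nothing to normalize.

```javascript
// Hypothetical helper (not a built-in): appends `text` to `parent` as a
// text node created directly through the DOM, bypassing innerHTML.
// `doc` is the document used to create the node; pass `document` in a browser.
function appendExactText(parent, text, doc) {
  var node = doc.createTextNode(text); // standard DOM Level 1 method
  parent.appendChild(node);
  return node;
}

// Browser usage, mirroring the innerHTML example:
//   var container = document.createElement("div");
//   var node = appendExactText(container, "\n Apples \n", document);
//   alert(node.nodeValue.length);
```

The string "\n Apples \n" is 10 characters long, so that is the length the text node should report if the whitespace survives intact.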
Note: you may find this useful: JS2HTML
See this bug report at quirksmode for more details.
Hope this article relieves some of your IE headaches… it probably just aggravated you.