IE Whitespace madness

Internet Explorer’s DOM has a few issues with whitespace. HTML 4.01 defines the following characters as whitespace:

  • ASCII space (U+0020)
  • ASCII tab (U+0009)
  • ASCII form feed (U+000C)
  • Zero-width space (U+200B)

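Line breaks (carriage return and line feed) count as whitespace too. Refer to the W3C’s HTML 4.01 specification, section 9.1 (“White space”), for details.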

Querying innerHTML and outerHTML

IE’s innerHTML and outerHTML properties add all sorts of whitespace when you read them. For example:

    var container = document.createElement("div");
    container.innerHTML = "<div><ul><li>I <em>like</em> sushi!</li></ul></div>";
    alert(container.innerHTML);

Prints:

<DIV>[][]<UL>[][]<LI>I <EM>like</EM> sushi!</LI></UL></DIV>

Here each “[]” denotes an added whitespace character. They are probably a carriage return and a newline, because the alert breaks the line at each pair and actually displays:

<DIV>
<UL>
<LI>I <EM>like</EM> sushi!</LI></UL></DIV>
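If you want to see exactly which characters IE added rather than guess, you can escape them before alerting. This is only a rough sketch; showWhitespace is a name I made up for illustration, and it only handles carriage returns, newlines and tabs:

```javascript
// Hypothetical helper: replaces invisible characters with visible escapes
// so they show up in an alert or a log.
function showWhitespace(text) {
    return text
        .replace(/\r/g, "\\r")   // carriage return -> literal "\r"
        .replace(/\n/g, "\\n")   // newline         -> literal "\n"
        .replace(/\t/g, "\\t");  // tab             -> literal "\t"
}

// In a browser:
// alert(showWhitespace(container.innerHTML));
```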

Maybe the implementor was trying to be helpful by adding these mysterious newline symbols before every block-level element for automatic readability. What a damn fool.

Is there a fix? Well, if you really need innerHTML to be precise you could walk the DOM tree yourself and spit out the markup as you traverse. Or you could bite the bullet and parse the string using regular expressions: for every opening block-level element tag, check for preceding whitespace symbols and eat them.
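The regular expression route might look something like the sketch below. The names BLOCK_TAGS and stripBlockWhitespace are mine, and the tag list is illustrative rather than complete (I included li because IE adds whitespace before it in the example above). Bear in mind that it also eats whitespace you put before a block-level tag on purpose, so treat it as a starting point only:

```javascript
// Illustrative, NOT exhaustive, list of tags to treat as block-level.
var BLOCK_TAGS = "address|blockquote|div|dl|fieldset|form|h[1-6]|hr|li|ol|p|pre|table|ul";

// One or more whitespace characters directly before an opening block-level tag.
// The tag itself is captured so it can be put back by the replacement.
var BEFORE_BLOCK = new RegExp("[ \\t\\r\\n\\f]+(<(?:" + BLOCK_TAGS + ")\\b)", "gi");

// Hypothetical helper: removes whitespace that precedes opening block-level tags.
function stripBlockWhitespace(html) {
    return html.replace(BEFORE_BLOCK, "$1");
}

// In a browser:
// alert(stripBlockWhitespace(container.innerHTML));
```

Whitespace before inline tags such as em is left alone, because those tags are not in the list.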

Creating DOM trees via innerHTML

When creating DOM trees via innerHTML, IE does not always create a DOM tree to reflect the exact HTML contents you pass it. This is because IE automatically collapses whitespace (normalization on the fly). For example:

    var container = document.createElement("div");
    container.innerHTML = "\n Apples \n";
    alert(container.firstChild.nodeValue.length);

All browsers except IE print “10”. IE collapses the surrounding whitespace and prints “7”.

To overcome this, don’t use innerHTML: if you need the DOM tree to be precise, create the DOM structures manually. I tried the “pre” white-space style, but IE still normalized the whitespace. You could use a pre element, but if your HTML contains block-level elements the markup will be invalid (pre allows only inline-level elements, and not even all of those).
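For plain text, creating the structure manually can be as small as one createTextNode call. Here is a minimal sketch, assuming that text nodes created this way never go through the HTML parser and so keep their whitespace. appendExactText is a name I made up, and the document is passed in as a parameter so the helper is not tied to the global one:

```javascript
// Hypothetical helper: appends `text` to `parent` as a text node instead of
// assigning it through innerHTML, so no HTML parsing takes place.
function appendExactText(doc, parent, text) {
    var node = doc.createTextNode(text);
    parent.appendChild(node);
    return node;
}

// In a browser:
// var container = document.createElement("div");
// appendExactText(document, container, "\n Apples \n");
// alert(container.firstChild.nodeValue.length);
```

Mixed content (text plus elements) needs the same treatment node by node, with createElement for the elements and createTextNode for the text between them.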

Note: you may find this useful: JS2HTML

See this bug report at quirksmode for more details.

Hope this article relieves some of your IE headaches… it probably just aggravated you.
