The Web needs a Superfund-like cleanup

Yahoo News. CNN. eBay. Any major site you visit today makes hefty use (or, more accurately, abuse) of HTML. We’ve pushed HTML far beyond what it was designed to do, and browsers (and developers!) pay the price. Plaxo’s Joseph Smarr gave a talk on just what the end result of this is (featured on planet.mozilla earlier, but here’s a link to the horse’s mouth: High Performance JavaScript video).

This has been bugging me for a few weeks now, as a deeper-level problem. Part of my job (not just at DVC, but at previous companies) involves figuring out why products I work on break on certain web pages. QA says “We’re broken on this page,” and before I can work on the bug I have to dismantle the page. Imagine how much fun it is to dismantle code like:
<div><div><div><div><div><div><div><div><div>…</div></div></div></div></div></div></div></div></div>…

Or the horrors of table-based layout, especially nested-table-based layout. Dismantling a page like this is officially known as “minimizing a testcase”, but realistically it ought to be called a steaming pile of fertilizer. No one wants to do it, and pages using hyperbloated markup make it ten times harder. (Especially when one character of whitespace can make the bug disappear.) Then there’s the worst feeling of all: minimizing a testcase, fixing the bug, and then finding out there’s something else busted on the same page, which your fix didn’t catch.
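The dismantling described above can be partly automated. What follows is a minimal sketch in Python, not the process used at any of the companies mentioned. It assumes the page can be treated as a list of lines and that a hypothetical `still_fails` callback reports whether the bug still reproduces (in practice that callback would have to load the candidate page in the browser under test). The names `minimize_lines` and `still_fails` are illustrative only.

```python
from typing import Callable, List


def minimize_lines(lines: List[str],
                   still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop chunks of lines while the bug still reproduces.

    `still_fails` is a hypothetical predicate supplied by the caller.
    """
    if not still_fails(lines):
        raise ValueError("the starting page does not reproduce the bug")

    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and still_fails(candidate):
                # The removed chunk was irrelevant to the bug; keep it out.
                lines = candidate
            else:
                # The chunk is needed to reproduce the bug; move past it.
                i += chunk
        chunk //= 2  # retry with smaller chunks, down to single lines
    return lines


# Toy example: the "bug" needs both the table line and the float line.
page = ["<div>", "<table>", "<p>hi</p>", "style=float", "</div>", "<span>"]
reduced = minimize_lines(
    page, lambda ls: "<table>" in ls and "style=float" in ls
)
```

Line-level removal is a simplification. It can leave ill-formed markup behind, and since even one character of whitespace can decide whether a bug appears, any reduced page still needs to be rechecked by hand.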

Unfortunately, it’s a vicious cycle. Major sites want to work in all major browsers, and major browsers don’t want to break major sites. People are quick to blame Internet Explorer (and I’m one of those people), but that’s not enough. We need to understand that HTML + JavaScript + CSS + AJAX + smart developers = radioactive sludge code that just barely does what we want.

Seriously, how much code should a web page need to implement a tabbox?

I just spotted a Quote of the Day in Gmail that really summarizes this well, from Gen. George S. Patton: “If everyone is thinking alike, then somebody isn’t thinking.”

There are design flaws in the whole process that make some of this unfixable in its current state. XML Namespaces has been around since 1999, but HTML isn’t XML (unless you convert wholesale to XHTML). So as long as we keep generating web pages with just HTML, we’re stuck. On the other hand, HTML isn’t going to go away any time soon.

Maybe it’s a standardized user-interface language we need. Mozilla has XUL (which has never worked as well on web pages as it has for chrome apps), Microsoft has XAML (oh, wait, you need Vista for that, don’t you?), and somewhere there’s a W3C discussion about creating a unified UI language (but how credible is the W3C among developers these days?).

The Tamarin project should improve JS performance significantly (and when it does, Microsoft and the others will be forced to respond – good news for all users), but this doesn’t solve the underlying problem – it just makes the problematic code run faster.

Of course, all this is “Web 2.0” – but I don’t think anyone really understands that Web 2.0 should be easier to work with than Web 1.0. Web 2.0 shouldn’t just be about cool widgets.

How can we, particularly Mozilla developers, contribute to a fix? (It’d be the height of arrogance to ask “How can we fix this ourselves?”.)

One way would be to encourage web sites to switch to XHTML (and serve it as something other than text/html). This would enable mixing HTML with other languages more efficiently (even more than with XML data islands, which end up being generic XML). Those who say, “But Internet Explorer doesn’t really support XHTML,” should start screaming at Internet Explorer’s team about it. Here’s a really good question for Microsoft: for the same effect, would IE’s parser team prefer to eat hundreds of HTML tags, plus hundreds of lines of JavaScript, plus CSS, or would the team prefer to eat a few dozen XAML tags mixed in with XHTML? I know which I would pick – the one that requires fewer bytes to express in web page source code.
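As a rough illustration of serving XHTML as something other than text/html, a server can check the request’s Accept header and send application/xhtml+xml only to browsers that advertise it. This is a minimal sketch under stated assumptions: the function name `pick_content_type` is hypothetical, the parsing is deliberately simplified, and falling back to text/html is one possible compromise rather than a recommendation from this post.

```python
def pick_content_type(accept_header: str) -> str:
    """Return the MIME type to serve, based on a raw Accept header value.

    Hypothetical helper: serves XHTML as XML only when the client lists
    application/xhtml+xml with a non-zero quality value.
    """
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # the default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"
    # The client did not ask for XHTML; fall back to plain HTML.
    return "text/html"


xml_aware = pick_content_type(
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
)
legacy = pick_content_type("image/gif, image/jpeg, */*")
```

Browsers that receive the text/html fallback parse the page as ordinary HTML, so markup from other namespaces will not work for them. That is exactly the gap described above.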

Another way would be to make a standard user-interface language for the web – and implement it so that web pages could use it. Stop downloading cruft from the web – store it locally on the machine as sandboxed components. (Think XTF or XBL without privileges to implement user-interface.) Make it something that doesn’t need a whole package from Yahoo!, another package from Google, another package from Tom’s Best User Interface Widgets Page, etc. Just make it something everyone can use. And if it means a plug-in for a stubborn vendor, hire someone to write the damn plug-in! (Provided that the plug-in itself has a clearly available spec and a clearly available owner.)

I think the biggest impact we could have would be to ask the people who build and maintain these major sites, “Hey, what can we add support for that would make your jobs easier and your code cleaner?” It would not surprise me to find out that Mozilla was already talking behind the scenes to CNN, eBay, Yahoo, etc. – specifically to their web engineers. It also wouldn’t surprise me if they weren’t, and were instead focusing their efforts on making the browser UI better, the user experience better, and on meeting standards that these major sites just don’t give a damn about – while still trying to render these sites well.

I don’t think HTML is broken, nor JavaScript, CSS, or anything else. I simply think we’re demanding far too much from them, and we need a better solution for the parts of the web that really put HTML to the test. Even WHATWG doesn’t go far enough…

</dvorak>

6 thoughts on “The Web needs a Superfund-like cleanup”

  1. To make Internet Explorer “support” XHTML, the page should be sent as application/xml or text/xml, and it should reference a simple XSLT file through a processing instruction (wrapped in a conditional comment, so other browsers ignore it) that simply copies all elements in the XHTML namespace (http://www.w3.org/1999/xhtml) into the resulting document. Elements from other namespaces could also be processed in the XSLT document if IE doesn’t support them.

  2. For CSS I’d like to see:
    * border-radius
    * multiple backgrounds
    * stretchable/scalable backgrounds
    * something like XUL’s image-region (clipping on backgrounds)
    * CSS animations (see http://webkit.org/blog/138/css-animation/)
    * a more UI-like, simplified box/layout model. (Ever tried making a grid-like layout? How about vertically centering something? Heights are a pain.)
    * embeddable fonts (font-face)
    Just think, no more image replacement/Flash for headers in a different font, no more complex JavaScript library to move something slightly to the left, no more divs layered inside of divs to make something as simple as rounded corners…

  3. Isn’t there work going on with an XBL2 spec? If browser vendors would support it we could do advanced stuff with simple HTML and leave old browsers/IE with just the HTML.

  4. There’s a huge chunk of the HTML5 spec dedicated to extra UI form controls and interactive scriptable things. Admittedly, it’s the part I completely skipped over, but it’s nice to see they’re doing something about it.
    That datagrid element could’ve saved me from writing a few hundred lines of radioactive sludge code myself…
