(X)HTML and Beyond

In my previous article, (X)HTML Wars (The History of HTML), I looked at the history of HTML, standards and browsers. What I left out was where these technologies are going. With the release of the (X)HTML5 draft we can see the direction the W3C wants to push, but I pose the question: is HTML a dying language? After writing my previous article I had a discussion with a workmate about the nature of HTML, mainly in connection with my closing remarks. This discussion led to a number of interesting ideas, a few of which I want to discuss here.

XHTML is an XML-based language, meaning it uses the XML format to store its data (classic HTML has its roots in SGML, but the markup looks much the same). At present, HTML is moving further towards defining only the logical structure of web documents, but we still have some remnants of style elements (such as header and list tags). This is part of a wider trend in web development: the transition to object- or container-based programming, namely assigning styles through CSS to div elements containing data. This trend began early this century with the transition from table-based layouts to div-based layouts. The advantage, of course, is the ability to create tier-based frameworks, or to separate style from logical elements, making documents easier to traverse. I believe HTML should continue this trend and, rather than bowing to the ever-expanding reach of the search engine giants (nofollow from Google?), we should be creating HTML documents that use a single type of container: the div.

While tags like h1 are useful due to their default behaviours and search engine benefits, I believe the advantages of removing them far outweigh the advantages of keeping them. Google and other search engines have a mandate to follow the trends of the industry rather than define them, and the default behaviours can easily be replicated using simple CSS, leaving very little except an argument for simplicity.

Once we get to this level we can truly explore the nature of the web, and I would pose the question: is XML even the right format for HTML? At the time of XHTML’s inception XML was the latest buzz technology in town; everyone was writing XML-based formats, and with the inception of SOAP and XSLT it gathered widespread popularity. I would question, though, whether it was XML as a format that gathered the popularity, or the fact that until that point the concept of industry standards-based formats hadn’t existed. To be honest, a format such as JSON can do just as good a job of providing the structure of the document as XML, if not better. Look at the following example, contrasting HTML in XML first and then in JSON.

<title>Untitled Document</title>
<div id="header">This is header</div>
<div id="content">This is content</div>
<div id="footer">This is footer</div>

And now in JSON:

{"head": { "title": "Untitled Document" },
 "body": { "header": "This is header", "content": "This is content", "footer": "This is footer" } }

JSON and XML are both self-describing formats, meaning their tags or keys are human readable. Unlike XML, though, JSON is significantly faster to write, since there are no closing tags to repeat.
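As a rough illustration of how little ceremony JSON needs on the scripting side, here is a minimal sketch that parses the JSON version of the example and reads values straight off the result. The `describe` helper is a hypothetical name used only for this illustration; `JSON.parse` and `Object.keys` are standard JavaScript.

```javascript
// The JSON form of the example document, held as a string.
const text =
  '{"head": {"title": "Untitled Document"},' +
  ' "body": {"header": "This is header",' +
  ' "content": "This is content",' +
  ' "footer": "This is footer"}}';

// Hypothetical helper: summarises a parsed page as "title: container, container, ...".
function describe(page) {
  return page.head.title + ": " + Object.keys(page.body).join(", ");
}

// JSON.parse turns the string into plain objects; no tree-walking API is needed.
const page = JSON.parse(text);
console.log(describe(page)); // prints "Untitled Document: header, content, footer"
```

Each container is then just a property, so `page.body.header` gives the header text directly.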

If you know your JSON you may have noticed a mistake in the above example: the two versions are not strictly equivalent. Rather than defining a container called div and giving it an id attribute, as I did in the XML example, I simply created a container named after the div’s id.
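For comparison, a more literal translation would keep the container type and the id as separate fields. The sketch below expands the id-keyed shorthand into that explicit form; the `toElements` helper and the `tag`, `id` and `text` field names are assumptions made for this illustration, not part of any standard.

```javascript
// Hypothetical helper: expands the id-keyed shorthand into explicit
// elements, each with a tag name, an id attribute and its text content.
function toElements(body) {
  return Object.entries(body).map(([id, text]) => ({ tag: "div", id, text }));
}

// The shorthand body from the JSON example.
const shorthand = {
  header: "This is header",
  content: "This is content",
  footer: "This is footer"
};

// Print the explicit form as indented JSON.
console.log(JSON.stringify(toElements(shorthand), null, 2));
```

The explicit form is closer to the XML version but noticeably more verbose, which is the trade-off the shorthand avoids.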

This is just one of many examples of how HTML is becoming an almost obsolete format; I know that when I work with data in JavaScript these days I work strictly with JSON rather than XML. The point of my example, though, is that it is possible to take this even further and produce web pages that include no HTML at all. We should be able to write a web document completely in a scripting language, defining our ‘containers’ or ‘divs’ in any format we choose (whether that be JSON, XML or any other format).
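To make the idea a little more concrete, here is a minimal sketch of a document defined purely as script data, with the markup generated from it rather than written by hand. It assumes the same head/body shape as the JSON example; `escapeHtml` and `renderPage` are hypothetical helpers, and since browsers today still consume markup, the sketch emits an HTML string as its final step.

```javascript
// Escape characters that have special meaning in markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical renderer: every key in page.body becomes a div with that id.
function renderPage(page) {
  const title = "<title>" + escapeHtml(page.head.title) + "</title>";
  const divs = Object.entries(page.body).map(
    ([id, text]) =>
      '<div id="' + escapeHtml(id) + '">' + escapeHtml(text) + "</div>"
  );
  return [title, ...divs].join("\n");
}

// The document itself is plain script data; no markup is written by hand.
const doc = {
  head: { title: "Untitled Document" },
  body: {
    header: "This is header",
    content: "This is content",
    footer: "This is footer"
  }
};

console.log(renderPage(doc));
```

Because the renderer is the only place that knows about markup, the same data could be fed to a different renderer if the target format ever changed.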

Due to HTML’s universal and simple nature I’m convinced we won’t see dramatic and sweeping changes like the ones I suggest here for many years, but I am equally convinced that this is the direction the web is going. The ability to handle web documents in any format of choice, not just XML (even object-based formats!), means we can take standards-based documents to a whole new level, without having to sacrifice readability or efficient traversal of documents.

(X)HTML and Beyond by Marc Loney is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 Australia License.

