Cyber Security Awareness Month - Day 16: W3C and HTML

Published: 2012-10-16
Last Updated: 2012-10-16 21:40:02 UTC
by Johannes Ullrich (Version: 1)

The W3C (World Wide Web Consortium) is responsible for defining standards around HTML. One of the most prominent current developments is HTML 5.

HTML 5 is not just about the HTML "mark-up" language. The standard also adds a large set of JavaScript APIs for geolocation, storage, media access and other features.

HTML is also defined by the "WHATWG" (Web Hypertext Application Technology Working Group), an organization not associated with the W3C. The WHATWG was created by Apple, Opera and Mozilla after the companies felt that the W3C's HTML Working Group (HTMLWG) wasn't moving fast enough.

These days, the HTMLWG and the WHATWG are working together, but they take different approaches to the future development of HTML. The WHATWG defines HTML as a constantly developing, "living" standard. The HTMLWG takes snapshots of the WHATWG standard and publishes each of them as an HTML version.

Here are some of the more recent notable additions to HTML, which are usually kept under the umbrella of "HTML 5":

- Access to hardware sensors: Most browsers already support GPS geolocation, or access to other geolocation services of the host (e.g. via WiFi). Sensors like the accelerometers commonly found in mobile devices are supported as well. Support for access to cameras and microphones has emerged recently, but is still spotty.

- Extended storage options: Traditionally, web applications had to store data in cookies. Cookies are rather limited in size, and don't scale to larger sizes because they are sent with each request. With HTML 5, web applications can store several megabytes of data in the browser (the exact limit depends on the browser and the storage API), and if that's not enough, they can ask the user for permission to store more.

- Offline applications: An application may provide a manifest listing all files (HTML, JavaScript) that are required to run the application offline.

- Video/Audio codecs: The <video> and <audio> tags allow for the playback of video and audio without the help of plugins like Flash or Java. However, not all browsers support the same codecs.

- Client input validation: Many web applications use JavaScript to validate user input on the client. In HTML 5, this can be done within the "input" tag by specifying a regular expression in its "pattern" attribute. Just like JavaScript client validation, this should never be relied on for security purposes, but it can make an application more usable.
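
As a rough illustration of the last point, the check a browser performs for a regular expression on an input field can be repeated on the server, where it actually counts. The sketch below assumes the expression is supplied through the "pattern" attribute, which browsers match against the entire value. The function name matchesPattern, the ZIP code pattern and the sample values are made up for illustration.

```javascript
// Hypothetical helper: re-check a value on the server using the same
// regular expression that the HTML "pattern" attribute uses on the client.
// Browsers anchor the pattern so that it must match the whole value.
function matchesPattern(pattern, value) {
  const anchored = new RegExp('^(?:' + pattern + ')$');
  return anchored.test(value);
}

// Assumed example: a 5-digit ZIP code field, e.g.
//   <input type="text" name="zip" pattern="[0-9]{5}">
const ZIP_PATTERN = '[0-9]{5}';

console.log(matchesPattern(ZIP_PATTERN, '32256'));          // true
console.log(matchesPattern(ZIP_PATTERN, '32256<script>'));  // false
```

An attacker can bypass the browser entirely and send any value directly, so only the server-side check has any security value.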

There are many more features that are part of the most recent HTML specs, and browsers are starting to implement them. Which features you will find depends on the browser you are using.
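
Because support differs between browsers, pages commonly test for a feature before using it. The following is a minimal sketch of such a check; the function name detectFeatures and the selection of features are assumptions for illustration, and the function takes the global object as a parameter so it can also be run outside a browser.

```javascript
// Hypothetical helper: report which of a few HTML 5 features the given
// global object (normally "window") exposes. It only checks that the
// property exists, not that the feature works or that the user allows it.
function detectFeatures(win) {
  const nav = win.navigator || {};
  return {
    geolocation: 'geolocation' in nav,
    localStorage: 'localStorage' in win,
    applicationCache: 'applicationCache' in win,
    video: 'HTMLVideoElement' in win
  };
}

// In a browser, pass the real window; elsewhere fall back to an empty object.
console.log(detectFeatures(typeof window !== 'undefined' ? window : {}));
```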

But with great power comes great responsibility. All these features need to be implemented correctly in order to avoid security vulnerabilities in the browser. The browser is also highly exposed, as it constantly downloads and executes code from various sites. The fundamental problem in HTML is that data (HTML) and code (JavaScript) aren't well separated from each other. This missing separation opens the door to issues like XSS.
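
One common way to cope with the missing separation is to encode untrusted data before it is placed into HTML, so the browser treats it as text rather than markup. The following is a minimal sketch and not a complete XSS defense. The function name escapeHtml and the sample input are hypothetical, and the encoding shown only fits data placed into HTML element content or quoted attribute values.

```javascript
// Hypothetical helper: replace the characters that have special meaning in
// HTML with their entity equivalents.
function escapeHtml(untrusted) {
  return String(untrusted)
    .replace(/&/g, '&amp;')   // must run first so later entities stay intact
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const comment = '<script>alert(1)</script>';
console.log('<p>' + escapeHtml(comment) + '</p>');
// prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Data that ends up in other contexts, such as inside a script block, a URL or a style sheet, needs a different encoding.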

There is also no good way to "sign" a piece of JavaScript like you would sign a desktop application. The best you can do is to protect it in transit via SSL.

Johannes B. Ullrich, Ph.D.
SANS Technology Institute

1 comment(s)


Another problem is that browsers "compete" based on how fast they run Javascript. Faster execution can come at the cost of less input validation. On a related note, it will be interesting to see how Windows 8 will be received. Without looking at the added security-related features, reviewers are probably going to compare it speed-wise to Windows 7 and XP and conclude that it is not much faster - and probably slower.
