The id attribute just got more classy in HTML5
One of the more subtle yet awesome changes that HTML5 brings applies to the id attribute. I already tweeted about this a few months ago, but I think this is interesting enough to write about in more than 140 characters.
How id differs between HTML 4.01 and HTML5
The HTML 4.01 spec states that ID tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens (-), underscores (_), colons (:), and periods (.). For the class attribute, there is no such limitation. Classnames can contain any character, and they don’t have to start with a letter to be valid.
HTML5 gets rid of the additional restrictions on the id attribute. The only requirements left — apart from being unique in the document — are that the value must contain at least one character (can’t be empty), and that it can’t contain any space characters.
This means the rules that apply to values of class and id attributes are now very similar in HTML5.
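To make the difference concrete, here is a minimal sketch of both rule sets as Python checks. The function names are made up for illustration, the set of "space characters" (space, tab, line feed, form feed, carriage return) is my reading of the HTML5 definition, and the uniqueness requirement is not checked since it depends on the whole document.

```python
import re

# HTML 4.01: a letter, then any number of letters, digits,
# hyphens, underscores, colons, and periods.
HTML4_ID = re.compile(r"[A-Za-z][A-Za-z0-9\-_:.]*")

# Assumption: HTML5 "space characters" are space, tab, LF, FF, and CR.
HTML5_SPACE_CHARS = " \t\n\f\r"


def is_valid_html4_id(value: str) -> bool:
    """Hypothetical helper: True if value is a valid HTML 4.01 ID token."""
    return HTML4_ID.fullmatch(value) is not None


def is_valid_html5_id(value: str) -> bool:
    """Hypothetical helper: True if value is non-empty and has no space characters."""
    return len(value) > 0 and not any(ch in HTML5_SPACE_CHARS for ch in value)
```

With these checks, a value such as ♥ or [items][0][name] fails the HTML 4.01 test but passes the HTML5 one, while an empty string or a value containing a space fails both.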
Err, what?
Although that probably sounds boring, this actually is pretty cool. In HTML 4.01, the following code is perfectly valid:
<p class="#">Foo.
<p class="##">Bar.
<p class="♥">Baz.
<p class="©">Inga.
<p class="{}">Lorem.
<p class="“‘’”">Ipsum.
<p class="⌘⌥">Dolor.
<p class="{}">Sit.
<p class="[attr=value]">Amet.
Heck, you could even use a brainfuck program as a classname:
<p class="++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.">Hello world!
I’ve put up a demo page with some other examples, but I’m sure you can think of more. After all, the possibilities are endless :)
So what’s new?
In HTML5, you can take all of these groovy classnames and use them as values for id attributes. Yes, HTML5 is that awesome.
<p id="#">Foo.
<p id="##">Bar.
<p id="♥">Baz.
<p id="©">Inga.
<p id="{}">Lorem.
<p id="“‘’”">Ipsum.
<p id="⌘⌥">Dolor.
<p id="{}">Sit.
<p id="[attr=value]">Amet.
<p id="++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.">Hello world!
…you get the idea. I remade the same demo page as before to use ids instead of classes.
How about the CSS then?
When writing CSS for this markup, you need to consider some rules. For example, you can’t just use ## { color: #f00; } to target the element with id="#". Instead, you’ll have to escape the weird characters (in this case, the second #). Doing so cancels the meaning of special CSS characters and allows you to refer to characters you cannot easily put in a document, like crazy Unicode shizzle.
Here’s a simple list of rules you should keep in mind when writing a CSS selector for a funky id or class:
- If the first character of the classname or ID you’re trying to target is numeric, you’ll need to escape it based on its Unicode code point. For example, the code point for the character 1 is U+0031, so you would escape it as "\31 " (with a trailing space). Basically, to escape any numeric character, just prefix it with \3 and append a space character. Yay Unicode!
- Other than that, alphanumeric characters as well as non-alphanumeric characters that can’t possibly convey any meaning in CSS (e.g. ♥) can just be used unescaped.
- For non-alphanumeric characters, you can use the Unicode code point. For example, the colon character (:) is U+003A, so if you want to use a colon in a CSS selector, escape it into "\3A " (note the space character at the end).
Some examples that illustrate this:
.\3A \`\( { } /* will match elements with class=":`(" */
.\31 a2b3c { } /* will match elements with class="1a2b3c" */
#\#fake\-id {} /* will match the element with id="#fake-id" */
#© { } /* will match the element with id="©" */
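The rules above can be sketched as a small Python function that turns an id or class value into the text to put after # or . in a selector. This is only an illustration under stated assumptions: css_escape_ident is a hypothetical name, hyphens are escaped only when they come first, and the colon and control characters use the code point form while other ASCII punctuation gets a plain backslash.

```python
def css_escape_ident(value: str) -> str:
    """Hypothetical helper: escape value for use after '#' or '.' in a CSS selector."""
    parts = []
    for index, ch in enumerate(value):
        code = ord(ch)
        if "0" <= ch <= "9":
            # A leading digit becomes its code point in hex plus a space.
            parts.append(f"\\{code:X} " if index == 0 else ch)
        elif (ch.isascii() and ch.isalpha()) or ch == "_" or code >= 0x80:
            # Letters, underscores, and non-ASCII characters stay unescaped.
            parts.append(ch)
        elif ch == "-":
            # Assumption: only a leading hyphen needs escaping.
            parts.append("\\-" if index == 0 else "-")
        elif ch == ":" or ch == " " or code < 0x20 or code == 0x7F:
            # Code point form, mirroring the \3A example above.
            parts.append(f"\\{code:X} ")
        else:
            # Any other ASCII punctuation: prefix it with a backslash.
            parts.append("\\" + ch)
    return "".join(parts)
```

For example, css_escape_ident("1a2b3c") produces the same text as the second selector above. For "#fake-id" it leaves the inner hyphen unescaped, which should match the same element as the \- form shown in the examples.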
For more, check out the demo page.
Comments
Glenn Glerum wrote on :
Mathias, is this backwards compatible? Or do older browsers just ignore IDs like that? And what does it do for semantics? It’s fun when you can use a class like "i-just-♥-this-sub-navigation„ø¤º°¨¨°º¤ø ¸„ø¤º°¨¨°º¤ø" but would you put it to practice?
seutje wrote on :
The night after I noticed this in the HTML5 spec was my worst night ever. Never before have I had such violent nightmares about IRC support and people doing fucked up shit, wondering why it doesn’t work entirely as expected… :(
Mathias wrote on :
Glenn: All tests on both demo pages pass in every A-grade browser, including IE6. So yeah, I’d say it’s backwards compatible.
Mathias wrote on :
I’m getting a lot of “Who would ever use this?” and “Any real use case here?” responses, so it seems a little more explanation is needed. While some of my examples will most likely never be used in production, the fact that HTML5 now allows IDs to contain just about any character (as was already the case for the class attribute in HTML 4.01) is definitely an improvement.
As some people on Hacker News have pointed out, this is pretty damn useful:
- An <input> element with name="[items][0][name]" can now finally have an id matching the name attribute. This would be invalid HTML 4.01, but valid HTML5:
<input type="text" id="[items][0][name]" name="[items][0][name]">
- […] "id1"), transliterating words, or replacing letters with ‘similar’ ones (‘O’ or ‘OE’ for ‘Ø’).
Kroc Camen wrote on :
seutje: This is just standardizing what all browsers already support. Developers have been able to do this all along anyway. Yes, you can shoot yourself in the foot with it, but being able to use accented characters in class and id names is a definite plus and much welcomed.
seutje wrote on :
Kroc Camen: I know, I’ve already run into this nightmare as someone was using underscores in his class names and wasn’t escaping these in the CSS, which caused IE6 to completely ignore it, while all other browsers gladly accept it unescaped: http://jsbin.com/esofe3 IE6 will show all green, all other browsers will show all red.
Albert wrote on :
Marvelous! Random usage off the top of my head: links, #456bereast, #321contact, #24ways, etc. Nice, nice, nice!
Weston Ruter wrote on :
Does this new relaxing of ID restrictions apply to HTML5 in the XML serialization, e.g. XHTML5?
David Bishop wrote on :
I don’t see how this makes HTML5 more classy. This just seems… unnecessary at best. Sometimes restrictions such as what exists in the HTML 4.01 spec are necessary to keep developers from doing crazy stuff.
I’m just not sure why most developers would need this; I’m not sure the minor gains are worth the possible headaches that can now be made by poor programmers.
Mathias wrote on :
Weston: Yes, this works in XHTML5 as well.
David: The “classy” part is a pun, since the id attribute restrictions in HTML5 are very similar to those of the class attribute (in HTML4+). HTML5 gives developers more freedom to choose which characters they want to use for IDs. I’m not sure why you think this is a bad thing. To me, it’s definitely an improvement.
Vic Shoup wrote on :
David Bishop: Agreed. Sounds really cool with all the flexibility until you start accounting for all the other things it can impact… Then you have to go in and do those things differently so they behave normally under HTML 5.
Weston Ruter wrote on :
Fascinating. If this works in XHTML5 as well, isn’t this a direct violation of the XML spec? I guess not if DTDs aren’t used anyway and so the id attributes aren’t of the ID type, so they don’t have to be XML Names.
Mathias wrote on :
Weston: To be honest, I wouldn’t know if this is a violation of the XML spec or not. I’m not much of an XML guy.
I just recreated the entire testcase in XHTML5 (I had only tested a few IDs in XML before) and it turns out that in XML mode, there are three invalid ID values on my demo pages:
- id="<p>"
- id="<><<<>><>"
- id="++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>."
I had to remove these or the page wouldn’t be rendered.
So, in XHTML5, IDs cannot contain an unescaped less-than sign (<). Other than that, everything seems to work fine. Use of the greater-than sign (>) presents no problem whatsoever, and I can see why.
Note that it would be possible to use these three IDs in XHTML by wrapping the contents of the <style> element in a CDATA block so the XML parser ignores it. Then, you could escape the id attribute values in the XHTML to prevent the “Unescaped < not allowed in attribute values” error, but that kind of defeats the purpose IMHO.
Is this what you expected? What does the XML spec say about this?
Weston Ruter wrote on :
The XML spec says that you are a very bad man: http://www.w3.org/TR/REC-xml/#id
Christo wrote on :
David Bishop: I agree with David, nice to know the restrictions have been lifted, but let’s stick to self-documenting ids and classes. Future development would be a nightmare if people used such ridiculous naming techniques...
Alex wrote on :
This is wicked! Will solve a lot of my problems. But still a lot of libraries don’t handle this correctly yet e.g. jQuery. And it will destroy all backward compatibility.
Cheers, Alex
Ant Gray wrote on :
Still I see no reason to use that. What is the point of making class and id names harder to read and type?