cactus
@cactus

<div /> is not an empty div in HTML5

saw some discussion on twitter about this blog post

and i haven't decided if i agree with the opinions but i am definitely having a hard time making my peace with this:

<input />This text is outside the input.
<input>This text is outside the input.</input>
<div>This text is inside the div.</div>
<div />This text is inside the div.

Why did anyone ever think this was okay? For what conceivable reason did anyone sign off on this bullshit?

I was just complaining earlier about worse-is-better, and this is the exact sort of worse that worse-is-better defends. If I let the brain worms take me and I forget that better things are possible, would I be able to rationalize bullshit like this? If I Nuance™ myself into thinking bad things are good, would I be happy?


tabatkins
@tabatkins
  1. HTML was originally (kinda) an SGML language. SGML was ridiculously over-complicated, so nobody actually implemented it properly, but it was also kinda a "formatting commands" language. This whole tree structure bullshit is a modern invention, comparatively. Having null tags (those without children, like <img> and <input>) wasn't very weird.
  2. Somebody squinted really hard at SGML and invented XML from what they thought they saw. Self-closing tags (<div />) came from SGML's "short tags" feature where you could write a tag like <div/ to imply the end tag. (Note the lack of >!)
  3. XML IS THE FUTURE ALL MUST BE XML, Jake's article basically explains what went on here.
  4. When you've got a trillion documents (not exaggerating that number) in some bizarre bastard syntax, "just document what's out there" starts looking really attractive, for good reason.



in reply to @cactus's post:

Nope, Jake is 100% correct in the post. This is what it means to truly, Truly live with 30 years of backwards compatibility. Much of it isn't beautiful but it sure is robust and interoperable.

Yeah, stuff like this makes me feel like no one was at the steering wheel when HTML and CSS were created or continuously updated, so convincing ourselves that Everything Is Fine™ seems like buying into the Kool-Aid

the history of it goes something like this:

  • when html was created, <div /> was totally invalid syntax, and whether an element was self-closing or not depended on what the element was. hence, <br> is always self-closing and there is no </br> tag in html. this was fine at the time because html was only intended for writing simple documents without much nesting. in fact, due to the relative simplicity of html at the time and the fact that javascript and css did not exist until years after html's creation, there was generally no reason to even parse html into a tree. <b> would turn on bold, </b> would turn off bold, etc.
  • later, xml was created, where <foo /> meant a self-closing foo element, and no elements were specially handled
  • the w3c decided xml was the future and created xhtml, which was a way to write web documents in xml. because it was xml, <br> would no longer work, and you instead had to write <br />
  • a lot of people who didn't really understand the finer points of this beyond "xhtml is the future now" and "now you write <br /> for some reason" made a bunch of web pages that they thought were xhtml but were actually being parsed as html because they were served with the wrong mime type
  • the only reason html parsers didn't choke on this pseudo-xhtml is because they were actually parsing the / as an attribute name. hence, / did not actually close the tag
  • later, html5 came about, which was largely an effort to nail down and specify how browsers actually parse html, since up to this point it was a huge disgusting mess where browsers were basically forced to reverse-engineer each other to figure out how they handled different errors
  • making / self-closing in html, when up to this point it was never actually valid html syntax, would break too many pages, so instead it is now just ignored
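
to make the end result concrete, here is a rough python sketch of the two rules described above: whether an element can have children depends only on which element it is, and a trailing / is ignored. this is a toy, not a real html parser and not how any browser is implemented; the names VOID_ELEMENTS, TOKEN and parse are made up for illustration, the void list is only a subset, and attributes, comments and the spec's error recovery are left out.

```python
import re

# Assumption: a small subset of HTML's void elements (the ones that never
# have children). The real list in the HTML spec is longer.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}

# Matches <tag>, <tag />, </tag>, or a run of text. Attributes are not handled.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9-]*)\s*(/?)>|([^<]+)")


def parse(html):
    """Build a toy tree: each element is [tag, child, ...]; text is a str."""
    root = ["#root"]
    stack = [root]  # currently open elements, innermost last
    for closing, tag, _slash, text in TOKEN.findall(html):
        if text:
            stack[-1].append(text)
        elif closing:
            # Close the nearest open element with this name, if there is one.
            # A stray end tag such as </input> matches nothing and is dropped.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i][0] == tag.lower():
                    del stack[i:]
                    break
        else:
            node = [tag.lower()]
            stack[-1].append(node)
            # Only the element's name decides whether it stays open.
            # The trailing "/" (captured as _slash) is deliberately ignored.
            if node[0] not in VOID_ELEMENTS:
                stack.append(node)
    return root


print(parse("<input />This text is outside the input."))
# ['#root', ['input'], 'This text is outside the input.']
print(parse("<div />This text is inside the div."))
# ['#root', ['div', 'This text is inside the div.']]
```

the second call is the case from the original post: the / changes nothing, so the div stays open and swallows the text after it.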

edit: just realized i'm basically just summarizing the article (thought i'd read it already but i hadn't and turns out it goes over most of the same stuff) but maybe this summary will be helpful

in reply to @tabatkins's post:

I do wish they’d made self closing tags valid when they explicitly added custom tags. I’m fine with it being different from the rest of HTML, I understand the compatibility issue. I wish we could do that right now.

The compat issue is that HTML "had" custom tags for decades before we finally blessed them in the spec. Giving them a new behavior might have broken some pages.

There's also a small moral hazard - if custom elements have more conveniences than the built-ins, it encourages using custom even when there's an appropriate built-in, which loses the semantics of the built-in, harming users for a tiny convenience bump for devs.

in reply to @kaybee's post: