No, no, no, no, not another @domains discussion

This is my attempt to work out my feelings about the DITA @domains attribute. It's very technical. If you're reading this, you probably shouldn't be. If you want something technical and/or DTD related, go read my ditasplainer instead - it's at least supposed to be easy to read.

TL;DR version (because this is really TL)

I've heard at least 10 justifications for the @domains attribute and the syntactically-complex tokens it uses. I'll respond to each of those below, but in summary:
  • Two are currently necessary for some DITA processing - but I think they could be handled in other ways
  • Three are potentially (but not realistically) useful
  • Five have no practical use

With a record like that, why would we want to keep it around in future versions of DITA?

What is my object here?

To convince anybody who cares that we should reduce or eliminate the @domains attribute in DITA 2.0. It's incredibly complex and easy to mess up, with few or no repercussions when you do mess it up. So (beyond a couple of very specific cases), it should be destroyed.

Is this heresy?

No, I don't think so. I'm a firm believer that if we've gotten little to no use out of a 10-year-old feature – especially such a technically complex feature – we should consider removing it.

If I said domains should go away, that would be heresy. I am not saying that. Domain extensions are good. Please don't go away thinking I want to get rid of them.

My relationship with the @domains attribute...

In the first days of DITA, I understood the @domains attribute. It was very technical, but had a designated (if still theoretical) benefit.

As DITA moved into adolescence, I questioned that original impression. Especially as, like a teenager, @domains became more and more awkward.

While I still understand a lot of theoretical arguments for the @domains attribute, I no longer think most of them are realistic. I'm analyzing those arguments here in an attempt to work out that conflict in my own mind. Reminder: this is my internal argument about the @domains attribute and the tokens that go in it – not about actual domains. Domain extensions are good. They make me happy. They make DITA happy. Use them. Love them. They just might make you happy too.


It's possible that I'm missing a critical argument in favor of the @domains attribute tokens. It's also possible that my logic is wrong. Either one would put my conclusions at risk. If so, let me know!

In other words, this is just me, rambling. You were warned in the <shortdesc> to stay away. If you choose to ignore that warning, do not blame me when the dragons attack.

What goes in the @domains attribute?

A DITA Document Type Shell is made up of modules. Per the DITA specification, each module must add a token to @domains. This means that by analyzing @domains, you can tell precisely what modules are part of the shell – which brings along some real and some imagined benefits.

It's now my contention that most of our rules for constructing tokens in @domains serve no practical purpose and have no practical benefit; instead, they serve primarily to make sure you are following unrelated best practices with regard to modular grammar files.

OK, what do those tokens look like? And please indicate how they got worse between versions.

Tokens looked something like this in DITA 1.0 - a clear syntax, with clear (if mostly theoretical) instructions for processors:
(topic pr-d)
DITA 1.1 added a new way to declare attribute domains. Processing requirements for attribute domains differ, so the syntax for attribute domain tokens differed slightly. The tweaked syntax for attribute tokens made processing easier; @domains is still on solid ground. For example, an imagined @guitarChord attribute might result in:
a(base guitarChord)
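
To make the two early syntaxes concrete, here is a rough Python sketch of how a processor might pull those tokens apart. It assumes only the DITA 1.0 and 1.1 forms shown above; the function name parse_domains and the dictionary keys are made up for illustration and do not come from the specification or from any toolkit.

```python
import re

# Hypothetical helper (not from any DITA toolkit): matches "(...)" with an
# optional one-letter prefix, such as the "a" used for attribute domains.
TOKEN = re.compile(r"(?<!\S)([a-z]?)\(([^()]*)\)")


def parse_domains(value):
    """Split a @domains value into labelled tokens (DITA 1.0/1.1 forms only)."""
    tokens = []
    for prefix, body in TOKEN.findall(value):
        names = body.split()
        if not names:
            continue  # ignore an empty "()" rather than guess at its meaning
        if prefix == "":
            kind = "element"
        elif prefix == "a":
            kind = "attribute"
        else:
            kind = "other"  # later token types are out of scope for this sketch
        tokens.append({"kind": kind, "ancestry": names[:-1], "module": names[-1]})
    return tokens


for token in parse_domains("(topic pr-d) a(base guitarChord)"):
    print(token)
```

The last name in each token is treated as the module being declared, and everything before it as its ancestry. That is about as much structure as these two early forms carry.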

DITA 1.2 brought new domain modules and new ways to combine domains. These had to get new tokens with new syntax because … well, by this point that's just how things were done. Processing instructions / practical usage for the new tokens and new syntax became more abstract (IMHO too abstract and ill-defined to be useful). I won't even bother with samples because ugggghhh.

Finally, DITA 1.3 brought yet another way to combine domains, and I think you can guess what that meant. Yep – new syntax for new tokens, because … just because. And by this point, the new tokens look rather obscene. I mean, this (directly from the DITA 1.3 specification) is obscene, right?
(topic reference cppApiRef+cpp-d+compilerTypeAtt-d)

By this point I've lost any sense of how this could be useful.
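
For what it's worth, taking that token apart mechanically is not the hard part. Here is a minimal Python sketch, assuming the + simply joins module names inside the final slot of the token; split_combined is a hypothetical name, and the sketch says nothing about what a processor is supposed to do with the result.

```python
def split_combined(token):
    """Split one parenthesized token into its ancestry and its '+'-joined modules."""
    names = token.strip().strip("()").split()
    if not names:
        raise ValueError("empty token")
    *ancestry, last = names
    # Assumption: names joined by "+" are the modules that were combined.
    return {"ancestry": ancestry, "combined_modules": last.split("+")}


print(split_combined("(topic reference cppApiRef+cpp-d+compilerTypeAtt-d)"))
```

Getting three module names out of the token is easy. Knowing why a processor would want them is the part I can't answer.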

What are @domains tokens used for?

What are the arguments, real or imagined, for all of these tokens? I've heard many. I decided to list and respond to each one I know of.

By the end, I had a book. Even the one or two people interested in this subject won't read that much. So I broke the arguments out into sub-topics. Only read those that interest you. Which is probably none of them, because again, gahhhhh.

Also, if I've already convinced you, don't read any further. For the sake of your mental health.

So: when are @domains tokens useful?

Not often. Today, I only see real value in tokens that support attribute specialization, which was added in DITA 1.1. Click through to see why these are useful.

Attribute domain tokens allow you to generalize and specialize attributes
This is hard to get around. But there are other ways we could do it in the future.
Attribute domain tokens allow you to recognize attributes that can be used for filtering / flagging
Definitely used today. And there are definite alternatives for the future.
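
To show why these two uses are hard to get around, here is a small Python sketch of both. It assumes attribute tokens of the form a(base guitarChord) or a(props someAttribute), and it assumes generalized values are written as name(value) on the ancestor attribute. The function names are hypothetical, and real processors handle more cases than this.

```python
import re

# Matches attribute domain tokens only, e.g. "a(base guitarChord)".
ATTR_TOKEN = re.compile(r"(?<!\S)a\(([^()]*)\)")


def attribute_ancestry(domains):
    """Map each specialized attribute name to its root ancestor attribute."""
    result = {}
    for body in ATTR_TOKEN.findall(domains):
        names = body.split()
        if len(names) >= 2:
            result[names[-1]] = names[0]
    return result


def filter_attributes(domains):
    """Specialized attributes descended from props, i.e. usable for filtering."""
    ancestry = attribute_ancestry(domains)
    return sorted(name for name, root in ancestry.items() if root == "props")


def generalize(attrs, domains):
    """Rewrite specialized attributes as name(value) on their root attribute."""
    ancestry = attribute_ancestry(domains)
    out = {}
    for name, value in attrs.items():
        root = ancestry.get(name)
        target = root if root else name
        piece = f"{name}({value})" if root else value
        # Simplification: values landing on the same attribute are space-joined.
        out[target] = f"{out[target]} {piece}" if target in out else piece
    return out


domains = "(topic pr-d) a(base guitarChord) a(props deliveryTarget)"
print(filter_attributes(domains))
print(generalize({"id": "intro", "guitarChord": "Am"}, domains))
```

Without the token, nothing in the document itself says that guitarChord descends from base. That is the information any replacement mechanism would have to supply some other way.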

When are @domains tokens theoretically useful, but not actually useful?

Also not often. As before, click through if you have any interest in why I think these aren't useful.

Element domain tokens help with standalone generalization
Yeah, but … that doesn't happen much, and by the way, isn't needed.
Element domain tokens enable generalization during conref
I used to think this was useful - enough that I put quite a bit of work into an implementation. But … no longer.
Need to generalize when domains are specialized from structures
Is anybody really going to run into this? And if so, can we just keep it simple?

Which expectations for @domains tokens are fantasy, and / or have zero practical use?

Most of them. Admittedly, this is my own perspective, but here we go.

Use @domains to allow automatic DTD assembly via "registry in the cloud"
I'll admit this is one of my favorite arguments to abandon.
Constraint domain tokens can disable conref based on context
This was part of the constraint definition in DITA 1.2. Warning: this is one of my <rant>soapbox</rant> topics when it comes to @domains.
Structural modules need tokens
"I need a token like (topic reference) because I need a token for every module. You know, for completeness." That was actually one of the main arguments for these tokens. But are they useful? See the following.

The collection of tokens in @domains serves to define the document
While interesting, this is a fatally flawed argument, so I've added some ominousMusic.
Strict constraints allow grammar file creators to disable conref
Almost as frustrating and rant-worthy as ordinary constraint tokens, but (so far) virtually unused, so few have been harmed by the associated rules.

Are there any other places where domain tokens veer into nonsense?

New grammar formats bring new concerns
I think I'm right here. I might not be. But if I am, it's another indication that our rules might be nonsense.
What happens if they're not declared properly?
With the clear exception of attribute domains … generally nothing, at least outside of a number of uncommon or unlikely cases. So I wonder again … what's the point?

No! Bad information architect! Don't do that!

I am certain @domains tokens did not start out this way - but apart from attribute extensions, it's my contention that today the different tokens primarily serve to ensure that information architects follow other, unrelated rules. The requirement for domain tokens helps force you to follow a modular design when creating a constraint or specialization, a design that is, in fact, already encoded as a requirement in the DITA specification.

But at the same time - you can be modular without the tokens. Architects can follow modular rules without an extra step to ensure that they're following the rules.

And ugggghhh, the tokens are complex, ugly, and serve very few of the functions that have been claimed. DITA does not gain by forcing people to use them, even when their use encourages other good practices.

What is to be done?

The rules for @domains are complex.

The rules for @domains edge cases are poorly specified.

It's actually not too hard to find edge cases where the rules fall apart and become (in a real and practical sense) useless.

For DITA 2.0:

  • When tokens are built on an imagined-but-unrealized premise, they are useless, and we should remove them.
  • When tokens serve no practical use, especially if they bring along syntax that is hard to understand or get right, we should remove them.
  • Keep only tokens that have a real, practical reason to exist, in real documents. But even there, if we can find better ways to do the same thing … then do the better thing.
  • If all of that leaves us with no tokens for @domains … then let's get rid of it!