Reporting techniques for DITA topics


I have worked on a customized DITA implementation on Arbortext at my company. We had no reporting on top of it, so I built it myself. I want to find out how this works with other DITA implementations. How do you find out things like:

  • How many topics and ditamaps a writer owns
  • How many topics the writer has printed
  • How many products the writer contributes to in the hierarchy
  • How many obsolete topics a writer has

And more analytics like that. At my company I have created an internal tool that provides various dashboards with these analytics for writers and managers. But do other DITA tools come with this out of the box?


We authored in Arbortext and then Oxygen, and always wrote our own little shell scripts for reports. You could search on various metadata, but that’s not the same as a report or dashboard!

I’d love to know how you determine that a topic is obsolete. I think that’s the biggest challenge in a mature company. Also, crisp definitions of products are challenging as over time, products become services…

Anyway, kudos to you for developing the reports & dashboards. I don’t know of any DITA/XML authoring tool that provides that kind of reporting.



Good to know that not many DITA/XML tools provide these reports, so I haven’t just rebuilt something that already exists in other systems :)

Obsolete topics have an obsolete date field set. Topics without an obsolete date are valid. That’s how I determine whether a topic is obsolete.
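
For anyone who wants to try this kind of report outside a CCMS, here is a rough sketch in Python of counting topics and obsolete topics per writer. It is not my actual tool. It assumes the owner is stored in the standard `<prolog><author>` element and that the obsolete date lives in `<othermeta name="obsolete-date" content="..."/>`; the name `obsolete-date` and the function `topic_report` are made up for illustration, so substitute whatever your implementation really uses.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def topic_report(root_dir):
    """Count owned topics and obsolete topics per author under root_dir."""
    owned, obsolete = Counter(), Counter()
    for path in Path(root_dir).rglob("*.dita"):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that do not parse as standalone XML
        # Assumption: the first <author> in the prolog is the owner.
        author = (root.findtext("prolog/author") or "unknown").strip()
        owned[author] += 1
        # Assumption (hypothetical convention): a topic is obsolete when
        # <othermeta name="obsolete-date" content="..."/> has a value.
        for meta in root.iterfind("prolog/metadata/othermeta"):
            if meta.get("name") == "obsolete-date" and meta.get("content"):
                obsolete[author] += 1
                break
    return owned, obsolete
```

Calling `owned, obsolete = topic_report("path/to/content")` gives two counters keyed by author name, which is enough to feed a simple table or dashboard. Topics with no author are grouped under "unknown".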


I would expect that sort of reporting to come from your CCMS. We’re in the middle of migrating our content from essentially a static repository to a true DITA CCMS, and I’m certainly hoping for much more functionality like this. At the very least, I know that we’ll be able to find out where the topics are used, what topics link to each other, etc. I would certainly hope that we could do some more high-powered searching on various attributes of files (owner, obsolescence, last updated, etc.).
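
Until the CCMS is in place, a basic where-used report can be approximated with a script. The sketch below is only an illustration, not what any CCMS does internally: the function name `where_used` is hypothetical, and it only looks at literal `<topicref>` elements with a local `href`, so specialized elements such as `<chapter>` and key-based references (`keyref`) are not covered.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path


def where_used(root_dir):
    """Map each referenced topic file to the set of ditamaps that reference it."""
    usage = defaultdict(set)
    base = Path(root_dir).resolve()
    for map_path in base.rglob("*.ditamap"):
        try:
            tree = ET.parse(map_path)
        except ET.ParseError:
            continue  # skip maps that do not parse as standalone XML
        for ref in tree.iter("topicref"):
            href = ref.get("href")
            if not href or "://" in href:
                continue  # skip grouping topicrefs and external links
            # Drop any #fragment and resolve relative to the map's folder.
            target = (map_path.parent / href.split("#")[0]).resolve()
            usage[target].add(map_path)
    return usage
```

Topics under the content folder that never appear as a key in the result are not referenced by any map, which can be a useful starting list when hunting for orphaned or obsolete content.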