Thursday, 10 October 2013

HTML5 key for publishers in the digital era

I have been working in the media & publishing world for a couple of years now. One thing I have noticed is the vast amount of effort publishers have to put in to get their content into a digital format. Typically, authoring tools like Microsoft Word and other desktop publishing suites are used to create the content. This content is then transformed into formats like DocBook or proprietary formats to drive the print process.

Publishers are in the midst of creating e-book variants of these to provide a digital copy, and some even repurpose their content into online databases, dropping the book concept as a container of content.

It's not uncommon for publishers to still have their authoring & editorial infrastructure from the print era, with a layer added on top to produce the digital counterparts. This way publishers got into the digital world fairly easily and without too much investment. The drawback, of course, is that they aren't really working in a "digital first" manner, which leaves them playing catch-up in the rapidly changing digital world. Even if they do tend to write for the web, they still use the tools from before.

Forget about print?

If a publisher stopped thinking about print, I'm sure they would have a totally different technology stack to publish for the web and other digital products. However, most publishers still rely on their print products for a great deal of their revenue. And besides, there are still a lot of cases where a good printed book is preferable.

Focus on digital 

So what if publishers could focus on digital publishing but still support printed books as well? I strongly believe that today's technology allows us to do so. The key technologies are HTML5 and CSS3, which publishers can use for both the authoring and the publishing processes. Imagine the possibilities of using one set of languages for both worlds, especially languages that are used all over our digital world. Just to name a few:

  • Authors would be able to edit their content right inside their browsers using HTML5 editors (like Raptor Editor) that could be part of an authoring platform. This way they can collaborate far better than with the Word and email tools they use today.
  • Author round-tripping would be far simpler: no more conversion back and forth between the author's format (e.g. MS Word) and the publisher's internal format (e.g. DocBook).
  • Authors get direct visual feedback, as the content they provide can immediately be presented in a layout that closely resembles the printed end result, especially with the upcoming CSS extensions around paged media.
  • Whenever a piece of content is accessed by anyone at the publisher, they can immediately view it in their browser: no more XML code for editorial staff. This matters all the more given that most of today's applications are web based.
  • No special tools are needed for proofing and previewing the content, as intermediate results don't need to go through special rendering systems.
  • ...
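
To make the visual feedback point a bit more concrete, here is a minimal sketch of how a piece of content could be wrapped in an HTML5 page carrying a print-oriented stylesheet. The `@page` rule and `break-before` come from the CSS paged media and fragmentation work; the page size, margins, class name and the function name `render_preview` are purely illustrative assumptions, and how well a given browser or renderer honours these rules will vary.

```python
from html import escape

# Illustrative print stylesheet (assumed values): A5 pages, each chapter on a new page.
PRINT_CSS = """
@page {
  size: A5;
  margin: 20mm 15mm;
}
section.chapter {
  break-before: page;
}
"""


def render_preview(title, paragraphs):
    """Return a standalone HTML5 page for one chapter (hypothetical helper)."""
    body = "\n".join(f"    <p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '  <meta charset="utf-8">\n'
        f"  <title>{escape(title)}</title>\n"
        f"  <style>{PRINT_CSS}</style>\n"
        "</head>\n"
        "<body>\n"
        '  <section class="chapter">\n'
        f"    <h1>{escape(title)}</h1>\n"
        f"{body}\n"
        "  </section>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(render_preview("Forget about print?", ["A good printed book is still preferable in many cases."]))
```

The same file can be opened in a browser for proofing and handed to a print renderer, which is the "one set of languages for both worlds" idea in miniature.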

Can it be done?

The big question, of course, is whether HTML5 is rich enough to convey all the semantics required for print publishing. I do believe so. In fact, Sanders Kleinfeld at O'Reilly believes so as well, judging by his open spec called HTMLBook, which he described in his recent article.

Sanders also presented his view in "The Case for Authoring and Producing Books in (X)HTML5" at Balisage: The Markup Conference in 2013.

So, if you're a publisher, grab some of your books and express them in HTML5 syntax. See how far you get, and reap the benefits of the thousands of surrounding web technologies.
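
As a starting point for such an experiment, here is a rough sketch that turns a very simple in-memory book into HTML5. It marks up the book and its chapters with `data-type` attributes in the spirit of HTMLBook; treat the exact attribute values as my reading of that spec and check it before relying on them. The function names and the shape of the input data are hypothetical.

```python
from html import escape


def chapter_to_html(title, paragraphs):
    """Render one chapter as an HTML5 section (hypothetical helper)."""
    lines = ['<section data-type="chapter">', f"  <h1>{escape(title)}</h1>"]
    lines += [f"  <p>{escape(text)}</p>" for text in paragraphs]
    lines.append("</section>")
    return "\n".join(lines)


def book_to_html(book_title, chapters):
    """Render a whole book; `chapters` is a list of (title, paragraphs) pairs."""
    parts = [
        "<!DOCTYPE html>",
        '<html lang="en">',
        "<head>",
        '  <meta charset="utf-8">',
        f"  <title>{escape(book_title)}</title>",
        "</head>",
        '<body data-type="book">',
        f"<h1>{escape(book_title)}</h1>",
    ]
    parts += [chapter_to_html(title, paragraphs) for title, paragraphs in chapters]
    parts += ["</body>", "</html>"]
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    sample = [
        ("Forget about print?", ["Most publishers still rely on print revenue."]),
        ("Focus on digital", ["HTML5 and CSS3 can serve authoring and publishing."]),
    ]
    print(book_to_html("HTML5 for publishers", sample))
```

Real books will quickly need more than headings and paragraphs (footnotes, indexes, cross-references), and that is exactly where you find out how far HTML5 takes you.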