Next Gen CMI

Show, Don’t Tell: How We Learned the Importance of Content Samples the Hard Way

April 5, 2016

What may be correct in theory may or may not hold in practice. It’s almost magical how this saying by the famous German philosopher Immanuel Kant fits so many aspects of life. You may wonder why there is a touch of philosophy in a technology blog. Here’s why. Through our engagements with several leading publishing companies the world over, we’ve come to believe that the best way to develop content management and publishing systems is to focus on representative samples of content. In essence, more than design principles (read: theory), what does the trick is an actual, live piece of content (read: practice).

So, what are content management systems?

Simply put, content management systems help businesses process content and render it in a variety of ways suited to business requirements. Take, for example, a legal publisher who wants to generate multiple outputs from the same master copy of a certain law: the paper edition, the online edition, the annotated version, the abridged version, and so on. The same holds where publishing is a support function rather than the main line of business: for instance, the myriad renditions of service manuals that heavy equipment manufacturers produce.

Is this as simple as: in, process, out?

Well, the answer is YES and NO.

The basic process is in fact quite simple: input content, let the built-in rules engine process it, and then receive the same content in the form you asked for. However, for a company to develop an effective content management system, the three main steps (input conversion, enrichment processing, and output formatting) need to be considered as individual capsules of transformation.
Each step has a document as input and a document as output. Therefore, to list the salient features of the system as a whole, one needs to account for the finer nuances of each of these transformations. And here’s where the need for content samples comes in. To develop a content management system that delivers exactly what you want, use sample documents to show the type and structure of the input and the output at each step.
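To make the "document in, document out" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Document` type, the three step functions, and the rules inside them are hypothetical and do not come from any particular CMS. The point is only that each step can be fed a sample document and its output inspected on its own.

```python
from typing import Callable, Dict, List

# Hypothetical model: a document is a flat dictionary of named fields.
Document = Dict[str, str]
Step = Callable[[Document], Document]


def convert_input(doc: Document) -> Document:
    """Input conversion: normalise legacy field names (illustrative rule only)."""
    return {key.strip().lower(): value for key, value in doc.items()}


def enrich(doc: Document) -> Document:
    """Enrichment processing: add a derived field, here a word count of the body."""
    enriched = dict(doc)
    enriched["word_count"] = str(len(doc.get("body", "").split()))
    return enriched


def format_output(doc: Document) -> Document:
    """Output formatting: add a plain-text rendition built from title and body."""
    rendered = dict(doc)
    rendered["rendition"] = f"{doc.get('title', '')}\n\n{doc.get('body', '')}"
    return rendered


def run_pipeline(doc: Document, steps: List[Step]) -> List[Document]:
    """Run the steps in order and keep every intermediate document.

    Keeping the snapshots means each transformation can be checked
    against its own sample rather than only the final output.
    """
    snapshots = [doc]
    for step in steps:
        snapshots.append(step(snapshots[-1]))
    return snapshots


if __name__ == "__main__":
    sample = {" Title ": "Act 12", "Body": "Short sample text"}
    for stage in run_pipeline(sample, [convert_input, enrich, format_output]):
        print(stage)
```

Because `run_pipeline` returns the document after every step, a reviewer can compare each snapshot with the sample agreed for that stage.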

Okay, too much theory, right? Let’s look at an example of how useful a sample document is when laying down system specifications.

Metadata mapping specs

Old field    New field
---------    ---------
Author       Contributor
Title        Headline


Input conversion: In a recent content migration project, significant work was done to specify how legacy reports and their metadata were to be converted (on a one-time basis) to fit a new editorial workflow. The specifications took the form of mapping tables (a sample is shown above), which were reviewed multiple times and then guided the development of the automated conversion that brought old reports into the new CMS repository.
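A mapping table like the sample above translates almost directly into code. The sketch below is illustrative only: the two field pairs come from the sample table, while the function name, the record layout, and the decision to report unmapped fields instead of silently dropping them are assumptions, not a description of the project's actual conversion.

```python
from typing import Dict, List, Tuple

# The two rows of the sample mapping table: old field -> new field.
FIELD_MAP: Dict[str, str] = {
    "Author": "Contributor",
    "Title": "Headline",
}


def migrate_metadata(
    record: Dict[str, str],
    field_map: Dict[str, str] = FIELD_MAP,
) -> Tuple[Dict[str, str], List[str]]:
    """Rename legacy fields according to the mapping table.

    Returns the converted record plus the names of any legacy fields
    the table does not cover, so gaps in the spec are visible.
    (Reporting unmapped fields is an assumption of this sketch.)
    """
    converted: Dict[str, str] = {}
    unmapped: List[str] = []
    for old_name, value in record.items():
        if old_name in field_map:
            converted[field_map[old_name]] = value
        else:
            unmapped.append(old_name)
    return converted, unmapped


if __name__ == "__main__":
    legacy = {"Author": "A. Writer", "Title": "Q1 Report", "Region": "EMEA"}
    print(migrate_metadata(legacy))
```

Note that code like this can follow the table perfectly and still produce the wrong result if the table itself is wrong, which is why the table alone is not enough to verify a migration.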

Output formatting: This was developed according to the provided specs, in which some metadata fields controlled the color of the header graphic, while others determined how the output document’s frames and sidebars were displayed. These formatting specifications were conveyed by way of a sample document.

End-to-end testing: When it was time for user acceptance testing, a certain display issue was initially attributed to the output formatting process. Further analysis, however, traced the root cause to the input conversion step. It was confirmed that the migration had been carried out in complete accordance with the specification that was laid down; the specification itself, however, was wrong.

Even though the input specification was complete, with all possible cases covered, it was very much the culprit here. What’s even more interesting is that the output specification, which showed only representative data in the expected format, was instrumental in detecting, and subsequently fixing, the error. Therefore, no matter how precise and comprehensive your content mapping tables may be in theory, in practice, it is the visual samples that people find easier to understand and verify. Concrete examples are definitely a lot easier to relate to than abstract rules.
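One way to put a sample document to work is to compare what the system actually produces with the approved sample and show the differences line by line. The following is a minimal sketch that uses Python's standard `difflib` module; the function name and the sample text are hypothetical, and a real output document would of course be richer than a couple of text lines.

```python
import difflib
from typing import List


def diff_against_sample(actual: str, expected_sample: str) -> List[str]:
    """Return a unified diff between the approved sample and the actual output.

    An empty list means the output matches the sample exactly.
    """
    return list(
        difflib.unified_diff(
            expected_sample.splitlines(),
            actual.splitlines(),
            fromfile="approved_sample",
            tofile="pipeline_output",
            lineterm="",
        )
    )


if __name__ == "__main__":
    approved = "Headline: Q1 Report\nContributor: A. Writer"
    produced = "Headline: \nContributor: A. Writer"
    for line in diff_against_sample(produced, approved):
        print(line)
```

A reviewer who sees an empty headline in the diff does not need to read the mapping rules to know that something upstream went wrong.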

In pioneering computer scientist Niklaus Wirth’s famous formulation, algorithms + data structures = programs. Data samples, being concrete, are easier to see and understand than algorithms. So, when developing a content management system with a group of solution architects, technology specialists, and business users, it is far easier and a lot more productive to work with actual samples of structured data than to talk the language of the algorithms that underlie the rules of data transformation. The proof is in the pudding: it’s often easier to specify the whole process with examples of the end product, that is, the final, formatted, published document.

Charlie Hamu is a Domain Consultant with the Media & Information Systems (MIS) business unit at Tata Consultancy Services (TCS). He has around two decades of experience working with leading publishing companies as a content and systems architect, consultant, best-practices coordinator, and content manager. At TCS, Hamu offers consulting services and technical support to global publishing companies in redesigning the ingestion, management, rendering and delivery of their core content.