Exporting InDesign to HTML: Tables

Tl;dr: skip to Putting It All Together to read the conclusion without having to read the adventure it took to get there.

Overview

This is part 2 of a multi-part series where I do a deep dive into Adobe InDesign’s export to HTML functionality, with this post’s focus on tables.

Part 1, Exporting InDesign to HTML: The Basics, took a look at exporting basic document elements: title, headers, text (including bolded and italicized text), bulleted lists and numbered lists.

Background

I much prefer posting long-form documents to the web as HTML. It’s not that I dislike PDFs, but experiencing a PDF on a smartphone and, sometimes, on a tablet is irritating. But creating a PDF is much faster than marking up a long document in HTML.

Enter Adobe InDesign, a document production app most often associated with print documents (and PDFs). But in the past few years Adobe has worked to make it viable in the digital publishing space. And one aspect of that is the ability to export a document to HTML.

That functionality intrigues me because the potential efficiencies created by a fully-featured, well-executed export function are measurable. So I decided to do a deep dive into this with three questions framing my assessment:

  • How semantic and clean is the markup on export?
  • How does a complex document export to HTML?
  • How robust are the options when exporting to HTML?

The Approach

I want to continue being methodical about this so that, as I find pitfalls in the process, I know where to fix things and can adjust along the way.

My approach:

  1. Update the source Word document I’d created in Part 1 to include a table;
  2. Use the standard paragraph and character styles found in Word to be consistent across apps;
  3. Format the table in Word so that the first row of the table is a header; and
  4. Iteratively add complexity once the simple markup export succeeds, and again after each added level of complexity exports successfully.

Update the Source Word Doc

I added a simple three column by three row table to the Word doc and assigned the top row as the header.

Added a simple table to the source Word document.

Update the InDesign Document

I re-imported the text from the Word doc to InDesign and realized several things:

  • There is no standard paragraph style in Word for table headers;
  • In many instances the typography in tables is different than the body copy type; and
  • I can’t assign the <th> and <td> tags to the “Normal” Word paragraph style, so I’m going to need to create some new styles.

So I created the two new paragraph styles:

  • Table Header: to customize the font and font size for a table’s header and to map the <th> tag;
  • Table Normal: to customize the font and font size for content in a table and to map the <td> tag.

I also updated two Word docs with the new styles:

  • Style Mapping – Word to HTML doc: since this is my “source of truth” document, all the styles should be in this document.
  • InDesign to HTML Word Doc: the document with the dummy text that I’m using to test the functionality

Export Attempt 1

With my new table-specific paragraph styles created, including the mapping to the <th> and <td> tags, I did my first export. I kept the same “Export to HTML” settings in InDesign that I’d done in Part 1.

The table code is very clean, except that mapping the <th> and <td> tags to my newly created, table-specific paragraph styles ended up duplicating the tags.

I also noted that there is no <caption> tag inside the <table> markup.

The <th> and <td> tags are duplicated because I mapped the paragraph styles to them.

Export Attempt 2

I removed the <th> and <td> tag mapping in the Table Header and Table Normal paragraph styles respectively. InDesign wouldn’t let the field be blank, so I chose “[Automatic].”

I created the “Table Caption” paragraph style and mapped it to the <p> tag with the class “table-caption.”

This time the export:

  • Put the table caption in a <p class="table-caption"> tag, and
  • Surrounded the table cell text in <p> tags. In my research, proper HTML table markup does not require text to be surrounded by <p> tags.

This time InDesign surrounded the text in the table cells with <p> tags.

I could not find a way to have the text export without the <p> tag, so I assigned the class “excess-p” to both paragraph styles. That way I can do a find/replace in the HTML to remove it after export.
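As a rough sketch of that post-export cleanup, the find/replace could also be scripted. This assumes the export writes the class exactly as <p class="excess-p"> with no other attributes, which I have not confirmed. The function name strip_excess_p is made up for illustration.

```python
import re

# Assumption: InDesign writes the class attribute exactly like this,
# with no other attributes on the <p> tag.
EXCESS_P = re.compile(r'<p class="excess-p">(.*?)</p>')


def strip_excess_p(html: str) -> str:
    """Remove the <p class="excess-p"> wrapper but keep the text inside it."""
    return EXCESS_P.sub(r"\1", html)


if __name__ == "__main__":
    # Hypothetical table cell, not copied from a real export.
    print(strip_excess_p('<td><p class="excess-p">East</p></td>'))
```

Paragraphs without the class are left alone, so body copy elsewhere in the document keeps its <p> tags.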

Export Attempt 3

Before exporting I updated the “CSS Options” in InDesign’s “Export to HTML > Advanced” panel settings to:

  • Include classes in HTML,
  • Generate CSS, and
  • Preserve local overrides

On export I received this error:

Error notification: CSS name collision : 2 detected. Paragraph Style “Table Header” and “Table Normal” generate conflict css name “excess-p”

But it let me continue and exported the HTML. This export produced some messier markup. InDesign:

  • Created an external CSS file (probably because the “Generate CSS” checkbox was checked), and
  • Added unnecessary classes to the <table>, <col>, <tr> and <td> tags; it also added unnecessary classes to the nested unordered lists elsewhere in the markup.

Export Attempt 4

This time I updated the “CSS Options” in InDesign’s “Export to HTML > Advanced” panel settings to Include classes in HTML but unchecked Generate CSS, which then made the Preserve local overrides option unavailable.

This time I did not get the CSS error prior to export and InDesign only assigned unnecessary classes to the <table>, <tr> and <td> tags.

Export Attempt 5

After doing some searching on the Web, I came no closer to finding a way to prevent InDesign from adding unnecessary classes to the <table>, <tr> and <td> tags, so I decided to take a look at table and cell styles.

Cell Styles

Cell styles, like paragraph and character styles, provide extensive options to quickly style a table cell. One of those options is to assign a paragraph style to text that appears in a table cell. I created two cell styles:

  • Table Header: to create a standard design for table headers across Brettro documents, including assigning the Table Header paragraph style; and
  • Table Data: to create a standard design for table cells across Brettro documents, including assigning the Table Normal paragraph style.

Table Styles

Table styles, also like paragraph and character styles, provide extensive options to quickly (and consistently) format tables in documents, like choosing cell styles for header rows, body rows and footer rows. I created a table style called “Table Normal” with the “Header Rows” set to the “Table Header” cell style and the “Body Rows” set to “Table Data” cell style.

The Export

When exported with table and cell styles applied, InDesign adds those styles as classes to the HTML markup. While they’re still unnecessary, it is a hook for a search-and-replace to quickly remove them in the HTML.

Now that these are automatically included, I updated the “Table Header” and “Table Normal” paragraph styles export mapping to “[Automatic]” and removed the “excess-p” class. It is unnecessary.

Putting It All Together

It does not appear that InDesign provides a way to export tables to HTML without assigning unnecessary classes and HTML tags. That’s unfortunate, but regular expressions, run through the find/replace feature available in just about every code editor, quickly remove the extra classes and tags.

To create clean, semantically correct HTML markup for tables you’ll need to do two things:

  1. Create paragraph styles specific to tables and then create cell and table styles. You’ll only need to do this once and then add those styles to your “Presos, Proposals and Pub Type Styles” CC library for repeated quick access to them.
  2. After exporting to HTML, do several find/replace steps using regular expressions to remove the extra markup. You’ll need to do this with every document you export, but it is a very quick step.

Create Paragraph, Cell and Table Styles

  • Create “Table Header” paragraph style with export mapping the “Tag:” to [Automatic].
  • Create “Table Normal” paragraph style with export mapping the “Tag:” to [Automatic].
  • Create “Table Caption” paragraph style with export mapping the “Tag:” to p with class table-caption.
  • Create “Table Header” cell style assigning the “Table Header” paragraph style.
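  • Create “Table Data” cell style assigning the “Table Normal” paragraph style.
  • Create “Table Normal” table style with “Header Rows” set to the “Table Header” cell style and “Body Rows” set to the “Table Data” cell style.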

Find and Replace

Use regular expressions to quickly remove unnecessary classes, spaces and tags:

  1. Fix the <p class="table-caption"> and the <table> tag:
    1. Find: <p class="table-caption">(.*?)</p>[\r\n\t]+<table id="(.*?)" class="Table-Normal">
    2. Replace: <table>\n\t\t\t\t<caption>$1</caption> (The \n inserts a new line and the multiple \t’s insert tabs.)
  2. Fix the <tr> tags:
    1. Find: <tr class="Table-Normal">
    2. Replace: <tr> (No regular expression is needed here.)
  3. Fix the <th> tags that appear as <td> tags, remove unnecessary classes and unnecessary <p> tags:
    1. Find: <td class="Table-Normal Table-Header">[\r\n\t]+<p>(.*?)</p>[\r\n\t]+</td>
    2. Replace: <th>$1</th>
  4. Fix the <td> tags, remove unnecessary classes and unnecessary <p> tags:
    1. Find: <td class="Table-Normal Table-Data">[\r\n\t]+<p>(.*?)</p>[\r\n\t]+</td>
    2. Replace: <td>$1</td>
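If you would rather script these steps than run them by hand in an editor, the same four find/replace pairs can be applied in order with Python’s re module. This is a minimal sketch, not a tested production tool. The patterns are copied from the steps above, with the editor’s $1 written as \1. The function name and the sample markup (including the id value) are made up for illustration and are not taken from a real InDesign export.

```python
import re

# Find/replace pairs from the steps above, in order.
# The editor's "$1" backreference is written "\1" in Python.
RULES = [
    (r'<p class="table-caption">(.*?)</p>[\r\n\t]+<table id="(.*?)" class="Table-Normal">',
     '<table>\n\t\t\t\t<caption>\\1</caption>'),
    (r'<tr class="Table-Normal">', '<tr>'),
    (r'<td class="Table-Normal Table-Header">[\r\n\t]+<p>(.*?)</p>[\r\n\t]+</td>',
     r'<th>\1</th>'),
    (r'<td class="Table-Normal Table-Data">[\r\n\t]+<p>(.*?)</p>[\r\n\t]+</td>',
     r'<td>\1</td>'),
]


def clean_table_markup(html: str) -> str:
    """Apply each find/replace rule to the exported HTML, in order."""
    for pattern, replacement in RULES:
        html = re.sub(pattern, replacement, html)
    return html


# Hypothetical sample of exported markup (tab-indented, as the patterns expect).
SAMPLE = (
    '<p class="table-caption">Quarterly totals</p>\n'
    '<table id="table-1" class="Table-Normal">\n'
    '\t<tr class="Table-Normal">\n'
    '\t\t<td class="Table-Normal Table-Header">\n\t\t\t<p>Region</p>\n\t\t</td>\n'
    '\t</tr>\n'
    '\t<tr class="Table-Normal">\n'
    '\t\t<td class="Table-Normal Table-Data">\n\t\t\t<p>East</p>\n\t\t</td>\n'
    '\t</tr>\n'
    '</table>\n'
)

if __name__ == "__main__":
    print(clean_table_markup(SAMPLE))
```

Note that [\r\n\t]+ matches only line breaks and tabs, so the patterns assume the exported file is indented with tabs rather than spaces. If your export differs, the character class would need adjusting.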

What’s Next?

Most long-form documents have more complexity than just the basic HTML tags I mapped in part 1 and the table workflow I mapped here, so stay tuned for additional entries on this topic as I explore this functionality.

Update: The Bland Brettrospective

It’s been a minute since I posted about creating a new custom theme for brettro.com, so I thought I’d post an update.

When I sat down and really started thinking about how I wanted to approach this, I realized how much had changed since I developed the last brettro.com WordPress theme. I decided I wanted to keep up with and learn about as many of those changes as possible throughout the entire design and development process, so I’m tackling each step with intention and a willingness to learn.

Step 1: Develop a Design System

Design systems seem to be the natural evolution of atomic design and provide an incredible amount of detail from the vision of a design to the exact pixel dimension of a rounded corner on a button and just about everything in between. I want to do a deep dive into this, create one for Brettrospective and document what I learn along the way.

Step 2: Use a New Tool to Design the User Interface

In 2017 Adobe released XD, an app specifically for creating user interfaces for websites and apps. From the little bit of it I’ve used, it seems to be an incredibly useful and functional app and one that I can see becoming the de facto standard for creating and prototyping interfaces. I’m excited to dive in and learn this tool.

Step 3: Assess my Development Toolbox

I want to take a look at the tools I use to do my actual development work. I’m both very comfortable with and a huge fan of all the tools I use. I know that Panic, the makers of Coda, have Nova, a new code editor, on the horizon, and I am very excited about that. And I recently decided to start using GitLab instead of GitHub. So those are a few changes on the horizon.

Step 4: Develop the Theme

Even though the Brettro WordPress theme won’t be available for anyone to use, I want to develop it to the standard that would get it approved for posting on the WordPress theme directory. Making sure it takes advantage of the newest features of WordPress is an important learning moment for me, as is understanding what elements must be and should be included in a theme. I plan to use the underscores theme as my foundation.

I also want to give some thought to what CSS framework I might use. My previous theme used Bourbon and Bourbon Neat. Recently the folks who created both frameworks discontinued development on Neat and are encouraging people to use modern, native CSS features like Grid and Flexbox. They’re smart to do so and I’m excited to dive into those two CSS features as well.

Documenting It All

Like I said in my original post about this, I plan on documenting the things I learn, the challenges I face and the victories I achieve while creating this new theme. So stay tuned…

Submitting to Subversion

This is the first in a four-part series about how Brettro integrated Subversion into its workflow. Part two will discuss the products Brettro uses to manage its SVN repositories, part three will discuss how Brettro uses SVN with ExpressionEngine and part four will talk about how Brettro uses SVN with WordPress.

About a year ago I tackled the task of integrating version control into my coding practices. I had spent a frustrating amount of time either recreating my code base as I started on new projects or backtracking and rewriting code when I would get a flash of inspiration to try something new. Also, I have both an iMac and a MacBook Pro, I use them interchangeably, and I wanted to have the most up-to-date code on both machines. Plus, it seemed all the pros were doing it and, since I consider myself one, I should join the gang.

Why Subversion?

Usually I do quite a bit of research, compare features and discover the value of one product over another. In choosing SVN, however, all I knew was that the WordPress Core Team used it. That, in and of itself, was a strong enough testimonial for me to dive right in.

Getting Started

When I know nothing about a topic, I buy a book, which is the first thing I did. I picked up Apress’s Practical Subversion, second edition, plopped down and started reading. I read chapters one, two and six completely as they provided an overview of version control in general, a “crash course” in Subversion and best practices in using Subversion. (By the way, Subversion is also known as SVN.)

Although the book is clearly and plainly written, I was still confused as to the best way to get started and the best way to integrate SVN into my current workflow. I asked folks how they used it. I tweeted about my confusion consistently. I read blog entries and articles ad nauseam. I definitely hadn’t had my “ah ha!” moment yet.

Integrating SVN into My Workflow

As it turns out and after some starts, stops and stumbles, I realized that my current workflow wasn’t so much a “workflow” as much as it was a “jumble-of-tasks-that-stumbled-over-themselves” to get a project done. So I began to map out two workflows: one for managing Brettro web properties and one for creating and managing client web properties. This was really helpful as it:

  • Clarified the basic SVN concepts of trunks, tags and branches, and
  • Formalized how I create, produce and maintain website code.

Creating a Foundation: My HTML ‘Codebase’

With a better understanding of the basic SVN terminology and process, I decided to start with a fresh series of HTML, CSS and JavaScript files that would serve as the basis of all my website code from here on out, and I’d call it my “codebase.” This seemed to be a great time to:

  • Make the switch from HTML 4/XHTML 1.1 to HTML5,
  • Adopt HTML5Boilerplate, a well-maintained framework established to ensure HTML5 code worked fairly well on legacy browsers (like any version of Internet Explorer before IE9),
  • Adopt 960.gs, another well-maintained framework established to ensure CSS consistency across browsers, and
  • Create the Brettro website design style manual.

With my basic HTML5 codebase complete, it was time to venture into the world of SVN. My goals at this point:

  • Be able to modify my HTML5 codebase as necessary for both my purposes and as both HTML5Boilerplate and 960.gs released improvements,
  • Create a branch of this codebase for my ExpressionEngine development,
  • Create a branch of this codebase for my WordPress development.

Repositories, Trunks, Branches, Tags and Working Copies

Let’s get some basic terminology out of the way:

  • Repository: the “repository” (or “repo”) is the container where your SVN-managed code is kept;
  • Trunk: the “trunk” is the main codebase of a project where most of your development will occur;
  • Branch: a “branch” is an offshoot of the trunk (do you see the tree metaphor?) whereby you might want to try out an idea or a feature that may not actually make it into production;
  • Tag: a “tag” is a copy of either the trunk or a branch frozen at a specific point in time, such as a release (the tree metaphor comes to a screeching halt here); a tag is never modified once it’s created;
  • Working Copy: the “working copy” is either the trunk or a branch copied to your computer from the repo to allow you to make changes.

After all this reading, contemplating, starting, stopping, deleting and creating, I settled on a basic structure that works for me. I’d like to think it’s a pretty common structure because it is based on my understanding of how WordPress organizes their SVN repository (and I like to use best practices because there’s no need to reinvent the wheel):

Trunk

This is where I do my “next major version” development. For example, right after I finished my HTML5 codebase, HTML5Boilerplate released version 2 of their product. Rather than integrate those changes into version 1 of my codebase, I’ll integrate them into version 2 and do that development here.

Branches

When I imported my initial HTML5 codebase into my first SVN repo, I immediately created a branch and named it “1.0.” The “1.0” branch is where I squash bugs and make minor fixes to my HTML5 codebase, then release them as dot releases. These releases are merged back down to the trunk so that they are included in the next major version codebase release.

Tags

After creating my “1.0” branch, I also created a “1.0” tag. This gives me a complete capture of version 1 of my HTML5 codebase so that I have a stable, working copy to use to create new projects.
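To make the trunk, branch and tag structure concrete, here is a small sketch that builds the svn copy commands for a release. It only prints the commands and does not run them. The repository URL is a placeholder and the function name is made up for illustration. Copying the tag from the branch rather than from the trunk is an assumption, since a tag can be a copy of either.

```python
import shlex
from typing import List


def release_commands(repo_url: str, version: str) -> List[List[str]]:
    """Build the svn commands that create a branch and a tag for a version."""
    trunk = f"{repo_url}/trunk"
    branch = f"{repo_url}/branches/{version}"
    tag = f"{repo_url}/tags/{version}"
    return [
        # A branch is a cheap server-side copy of the trunk.
        ["svn", "copy", trunk, branch, "-m", f"Create the {version} branch"],
        # Assumption: the tag is copied from the branch, then never modified.
        ["svn", "copy", branch, tag, "-m", f"Tag the {version} release"],
    ]


if __name__ == "__main__":
    # Placeholder URL; substitute your own repository location.
    for command in release_commands("https://svn.example.com/codebase", "1.0"):
        print(shlex.join(command))
```

Building the commands as lists keeps the commit messages intact as single arguments if you later pass them to subprocess.run instead of printing them.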

Committing to SVN (See what I did there?)

By finally having an understanding of basic SVN techniques, by documenting my SVN structure and practices and by creating my first version controlled codebase, I was ready to take a deeper dive. Next I created branches of my codebase for both my WordPress codebase and my ExpressionEngine codebase. Stay tuned to the next three parts to learn what software I use and what workflow I use to manage sites built with either of these content management systems.

Strategory

About nine months ago, I wrote about a strategic shift in the foundational elements Brettro will use to design websites for its customers. The rumblings about HTML5 were becoming louder, I had just finished deploying my first full-scale, medium size website using ExpressionEngine and I had deployed a handful of mediocre (on the coding side) WordPress sites. (They have since become less mediocre.) The benefit of using products with a solid, committed design, developer and fan base behind them, like ExpressionEngine and WordPress, really began to gel with me.

Platforms and Coding

My skill is in coding, my passion is in creating meaningful, usable user experiences and content, my “superhero strength” is managing projects and my “weakest link” is my design chops. Recognizing this, I decided it was time to step up my game and create a plan combining my skill, passion, “superhero strength” and “weakest link” into a functioning business strategy and workflow.

Step one:

  • professionalize my internal processes,
  • document coding standards for Brettro,
  • develop a sustainable platform for client documentation, and
  • really sink my teeth into merging my “mad coding skills” with the best practices for developing websites using both WordPress and ExpressionEngine.

In that time, and largely through Twitter, I have found an incredible and helpful community surrounding both EE and WP. I started more professional code management using Subversion (and an amazing Mac client called Versions). I have launched my HTML5 codebase. I have created nearly 13 WordPress plugins (and, in many cases, had to completely scrap and re-create them as I learned more and more about WP coding best practices).

In short, I’m well on my way to achieving my goals of becoming more familiar with and better at coding for two brilliantly executed, powerhouse web publishing platforms.

But there’s more to do…

Design, Design, Design

If anything has suffered through all this, it is the design of the Brettro web properties. They are certainly not horrible—and definitely don’t rank amongst the legacy 1996-era sites with all the visual appeal of an avocado green refrigerator—but they are my weakest link.

With comfortable knowledge of HTML5, Subversion and WordPress under my belt, it is time to focus on Step Two: creating a visually rich, responsive, usable, compelling interface for our websites. (The Brettro web properties use WordPress, but that’s a different discussion for a different time. Coming soon, though.)

I’m excited by this next step. Design has always been the biggest struggle for me. I understand the basic concepts, but the crippling self-doubt (which apparently plagues most of the design types I follow on Twitter) about the elegance and visual appeal of the concepts I develop keeps me from executing my best work. It’s in there. I know it is. I just need to take a deep breath, walk away from time to time, and push through the doubt.

Soon enough you’ll see those efforts come to light. And your feedback will be important.

But there’s still more to do…

Content, Education, Engagement

This is the point where I become overwhelmed by the sheer volume of things I need to do, learn and participate in. But then I recognize how excited I am by all of it! And really, this batch of “more to do” (also known as Step Three) is best done by intertwining it throughout the other efforts on a day-to-day basis:

  • Education in this arena never stops. And that’s one thing I absolutely love about it.
  • Strategically writing content for the web is such a compelling activity, especially with advocates (and professional heroes of mine) like Kristina Halvorson and Erin Kissane making such strong showings for its benefits.
  • And Washington, DC has some absolutely brilliant programmers and designers with whom to engage. I look forward to becoming an active and beneficial part of the community.

So, guess what? It’s time to get busy. Come back to read about and see the progress as it unfolds. It’s going to be exciting!