Exporting InDesign to HTML: the Basics

TL;DR: skip to Putting It All Together to read the conclusion without the adventure it took to get there.

Overview

I much prefer posting long-form documents to the web as HTML. It’s not that I dislike PDFs, but reading one on a smartphone (and sometimes on a tablet) is irritating. Still, creating a PDF is much faster than marking up a long document in HTML.

Enter Adobe InDesign, a document production app most often associated with print documents (and PDFs). But in the past few years Adobe has worked to make it viable in the digital publishing space. And one aspect of that is the ability to export a document to HTML.

That functionality intrigues me because a fully featured, well-executed export function could save a meaningful amount of production time. So I decided to do a deep dive into this with three questions framing my assessment:

  • How semantic and clean is the markup on export?
  • How does a complex document export to HTML?
  • How robust are the options when exporting to HTML?

The Approach

I wanted to be methodical about this so that, as I found pitfalls in the process, I would know where to fix things and could adjust along the way.

My approach:

  1. Create a source Word document;
  2. Use the standard paragraph and character styles found in Word to be consistent across apps;
  3. Start simple by focusing on the (probably) most-often-used paragraph styles: Title, Heading 1, Heading 2, Heading 3 and Normal (which is the default style for body copy); and
  4. Iteratively add complexity once the simple markup exported successfully, confirming success at each added level of complexity.

Creating a Source Content Document

I started with a dummy Word doc, as most writing starts in a word processor like Word. My document contains multiple pages of “lorem ipsum” dummy text sprinkled with a title, several headings, secondary headings and tertiary headings.

Basic Document Formatting

I applied the following standard Word paragraph styles to the text in my document:

  • Title: for the document title
  • Heading 1: for the primary headings
  • Heading 2: for the secondary headings
  • Heading 3: for the tertiary headings
  • Normal: for the body text
Word doc with Title, Heading 1, Heading 2, Heading 3 and Normal paragraph styles in use.

With that start I decided to introduce some mild complexity to the document by adding:

  • A bulleted list using the “Bullets” icon in the ribbon
  • A nested bulleted list, also with the “Bullets” icon
  • A numbered list using the “Numbering” icon in the ribbon
  • A nested numbered list, also with the “Numbering” icon

Each of these list items is assigned the “List Paragraph” paragraph style by Word.

Adding Complexity with Text Formatting

Most documents have bold or italicized text somewhere in them as well as a hyperlink or two. I added those to the document as well.

  • Bold text: using the “Bold” button in the ribbon
  • Italic text: using the “Italic” button in the ribbon

Creating the InDesign Document

I created a basic letter-sized InDesign document and imported the source Word document using Place… from the File menu. InDesign imported from Word all the paragraph and character styles I had used. (An imported style is identified by an icon in either the “Paragraph Styles” or “Character Styles” palette that looks like a down arrow pointing into a hard drive.)

The imported style icon

Assigning HTML Tags to Paragraph Styles

With the Word document imported, we can now assign the HTML tags to paragraph and character styles. To do so, double-click on any of the paragraph styles. As an example, I double-clicked on “Normal.”

The mapping of paragraph and character styles happens in the “Export Tagging” setting in the Style Options window.

To set the mapping for the “Normal” paragraph style do the following:

  1. In the “EPUB and HTML” settings area:
    1. Tag: set to p
    2. Include Classes in HTML: uncheck
  2. In the “PDF” settings area set the Tag to P.
  3. Click OK.

Now anything assigned the “Normal” paragraph style will export inside an HTML <p> tag. (You’ll also note the icon indicating an imported style is now gone. That’s because the style has been adjusted in InDesign.)

I set the mapping for the Title, Heading 1, Heading 2 and Heading 3 styles with the following settings:

  • Title: “Tag” set to h1; “Include Classes in HTML” unchecked
  • Heading 1: “Tag” set to h2; “Include Classes in HTML” unchecked
  • Heading 2: “Tag” set to h3; “Include Classes in HTML” unchecked
  • Heading 3: “Tag” set to h4; “Include Classes in HTML” unchecked

In just about every case I can think of, an HTML page has only one <h1> tag, and it is reserved for the page’s title, which is why I set the Title style to “h1.” From there, it can be confusing to assign each Heading style a tag one level lower than its style number (for example, “Heading 1” as “h2”), but doing so creates properly, semantically structured HTML.
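To make the one-level offset concrete, here is a minimal Python sketch of the mapping. It is only an illustration of the idea, not how InDesign performs the export: the render function and the sample paragraphs are hypothetical, and only the style-to-tag pairs come from the settings above.

```python
from html import escape

# Style-to-tag pairs taken from the export tagging settings described above.
STYLE_TO_TAG = {
    "Title": "h1",
    "Heading 1": "h2",
    "Heading 2": "h3",
    "Heading 3": "h4",
    "Normal": "p",
}


def render(paragraphs):
    """Wrap each (style, text) pair in its mapped tag, one element per line."""
    lines = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG[style]
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)


# Hypothetical sample content, standing in for the lorem ipsum document.
sample = [
    ("Title", "Lorem Ipsum"),
    ("Heading 1", "Dolor Sit"),
    ("Normal", "Amet & consectetur."),
]
html_out = render(sample)
print(html_out)
```

Because the Title style takes <h1>, every Heading style lands one tag level down, and the page keeps a single top-level heading.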

Assigning HTML Tags to Character Styles

I followed the same process for mapping the character styles as I did with the paragraph styles using the following settings:

  • Emphasis: “Tag” set to em; “Include Classes in HTML” unchecked
  • Hyperlink: “Tag” kept at Automatic; “Include Classes in HTML” unchecked
  • Strong: “Tag” set to strong; “Include Classes in HTML” unchecked

Export Attempt 1

With my HTML tags mapped to my paragraph and character styles, it’s time to try my first export!

Export Screen

To export to HTML:

  1. Choose Export… from the File menu
  2. In the “Export” window:
    1. Enter a filename in the “Save As” field (it is pre-populated with the name of your InDesign file)
    2. Choose HTML from the “Format” dropdown menu
    3. Click Save
  3. In the “HTML Export Options” window:
    1. General: I left everything as-is
    2. Image: I left everything as-is
    3. Advanced: I chose Include classes in HTML with Generate CSS and Preserve Local Overrides checked.
  4. Click OK

The HTML exported and a browser window opened to display it.

HTML Markup Review

It’s tough to see how semantically correct HTML markup is in a browser window, so I opened the HTML file in Nova, my favorite code editor for the Mac.

The first HTML export in Nova, my favorite code editor for the Mac.

A quick scan of the markup shows it is mostly free of superfluous code, classes or IDs. Specifically:

  • Title style mapped to <h1>
  • Heading 1 mapped to <h2>
  • Heading 2 mapped to <h3>
  • Bolded text (assigned the Strong style) mapped to <strong>
  • Italicized text (assigned the Emphasis style) mapped to <em>

The bulleted lists and numbered lists are messier.

The bulleted and numbered lists have unnecessary classes and no indentation for nested lists.

Fixing the Bulleted and Numbered Lists

Both the bulleted and numbered lists are assigned the “List Paragraph” paragraph style by Word. I did not assign an export mapping to it. After trying a few things that didn’t work in InDesign’s export mapping for the “List Paragraph” style, I returned to my source Word document and did the following:

  1. Assigned the bulleted lists the “List Bullet” paragraph style.
  2. Assigned the numbered lists the “List Number” paragraph style.
  3. For nested lists, I assigned the “List Bullet 2” and “List Bullet 3” styles depending on the level of nesting.

Back in InDesign, I then mapped the List Number, List Bullet, List Bullet 2 and List Bullet 3 paragraph styles by:

  1. Setting the “Tag” to li and unchecking “Include Classes in HTML” in the InDesign export mapping for the List Bullet, List Bullet 2, List Bullet 3 and List Number paragraph styles
  2. Choosing Don’t include classes in HTML in the “Advanced” panel of the “HTML Export Options” screen (when you do File > Export…)

With that, the HTML exported without the unnecessary classes.
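Checking an export for leftover classes by eye gets tedious on a long document. As a rough sketch, a few lines of Python using the standard library’s html.parser can list every tag that still carries a class or id attribute. The AttributeAudit class, the audit function and the two snippets are my own hypothetical examples, not actual InDesign output.

```python
from html.parser import HTMLParser


class AttributeAudit(HTMLParser):
    """Collect every start tag that carries a class or id attribute."""

    def __init__(self):
        super().__init__()
        self.flagged = []  # list of (tag, attribute name, attribute value)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("class", "id"):
                self.flagged.append((tag, name, value))


def audit(markup):
    """Return the flagged attributes found in a string of HTML."""
    parser = AttributeAudit()
    parser.feed(markup)
    parser.close()
    return parser.flagged


# Hypothetical snippets for illustration only.
before = '<ul><li class="List-Paragraph">One</li></ul>'
after = "<ul><li>One</li></ul>"
print(audit(before))
print(audit(after))
```

An empty list means no class or id attributes were found; to audit a real export, read the exported file’s contents into a string and pass it to audit.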

Bullet and number lists exported without unnecessary classes.

What Did I Find?

In short, InDesign’s Export to HTML functionality is very robust and produces very clean, semantically correct HTML markup. But it does take some effort to set up the styles correctly to get there.

This underscored the value I found in the Adobe CC Libraries functionality as those carefully constructed paragraph and character styles can be added to a CC Library with every setting intact. This makes it extremely easy to expand this workflow to multiple people without requiring them to completely understand the nuances of HTML markup and the nuances of setting up the styles correctly.

And each time I added complexity to a document and met with success I wanted to push the functionality further and further. To that end, I have three more blog posts planned:

  • Often-used HTML tags
  • Tables
  • Images

Putting It All Together

The Export to HTML functionality requires paragraph and character styles mapped to HTML tags.

To create a repeatable, sustainable workflow, the most important first step is aligning HTML tags to paragraph and character styles and saving those styles either in an InDesign template document or in a CC Library. I prefer the CC Libraries because they are available anywhere and can be shared with others.

Here is a table with the mapping I did in this post:

Style Name      Style Type   HTML Tag
Title           Paragraph    <h1>
Heading 1       Paragraph    <h2>
Heading 2       Paragraph    <h3>
Heading 3       Paragraph    <h4>
Normal          Paragraph    <p>
Strong          Character    <strong>
Emphasis        Character    <em>
List Bullet     Paragraph    <li>
List Bullet 2   Paragraph    <li>
List Bullet 3   Paragraph    <li>
List Number     Paragraph    <li>

Style Mapping to HTML Tags

Most long-form documents have more complexity than just the HTML tags I mapped here, so I’ll update this table as I explore this functionality and add complexity.

Stay tuned for the additional three entries on this topic.

Website Content Workflow

As I’ve (slowly) begun writing again, I struggled to remember my workflow for ideating, writing, editing, storing and posting content. It seemed like a smart thing to document, and doing so was a great reminder of my process. So, here it is. It very closely mirrors the workflow I established at my full-time gig and is a workflow that works well for one person writing content, mostly blog entries, for a small website.

The workflow is simple:

  1. Ideate and Plan
  2. Write and Edit
  3. Publish

Ideate and Plan

I use OneNote to brainstorm, organize and schedule my blog entries. I have a section in my Brettro OneNote notebook called “Brettro Website Content.” Inside that section I created pages for each major section of the Brettro website:

  • Blog Entries
  • Pages
  • Portfolio

Each page is then split into two sections:

Brainstorms

OneNote list for brainstorming content

This section is a freeform, bulleted list of ideas for entries; sometimes a bullet is just a word and sometimes it is a more in-depth, complete thought.

Schedule

Planning table in OneNote

This section is a four-column table:

  • Title: the title of the entry
  • Description: the description is a more fleshed-out version of the brainstorm bullet and many times becomes the entry’s excerpt
  • Scheduled Post: anticipated posting date; my goal is to post every two weeks, but clearly that hasn’t been the case in 2019.
  • Actual Post: the date the entry is actually posted.

Like I said in my “My Creative Toolbox” entry, I love OneNote and went all-in on OneNote in 2014 and never looked back. It’s available on every device, it’s easy to send items to it from other apps and its interface is elegant.

Write and Edit

Brettro’s content template in Microsoft Word

Years ago I created a content template Word document that includes places for the excerpt, the meta description, meta tags (though those are less necessary now), the copy and custom text for posting to both Facebook and Twitter. The template has worked really well.

File Structure and Storage

In order to have my drafts and related imagery available everywhere, I keep my website content on my OneDrive. (I have an Office 365 subscription, which I find very useful.) I have a folder named “Brettro Web Content” with four folders inside it: three that match my OneNote pages (blogs, pages and portfolio) and an additional one for graphics.

Inside each of those folders I create a unique folder for each blog entry, page or portfolio item (respectively) so that I can store the text and any graphics or photos together. For consistency, I use the following naming conventions for those content folders:

  • Blog entry: entry - YYYY-MM-DD - slug
  • Page: page - section - slug
  • Portfolio: project - YYYY-MM-DD - slug
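For anyone who would rather generate these folder names than type them, here is a small Python sketch of the three conventions. The function names and the slugify rule (lowercase words joined by hyphens) are assumptions for illustration; the conventions above do not say how a slug is formed.

```python
import re
from datetime import date


def slugify(title):
    """Assumed slug rule: lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def entry_folder(title, posted):
    """Blog entry: entry - YYYY-MM-DD - slug"""
    return f"entry - {posted.isoformat()} - {slugify(title)}"


def page_folder(section, title):
    """Page: page - section - slug"""
    return f"page - {section} - {slugify(title)}"


def project_folder(title, posted):
    """Portfolio: project - YYYY-MM-DD - slug"""
    return f"project - {posted.isoformat()} - {slugify(title)}"


# The date and section below are made up for the example.
print(entry_folder("Website Content Workflow", date(2019, 6, 1)))
print(page_folder("about", "My Creative Toolbox"))
```

Using date.isoformat() keeps the YYYY-MM-DD part zero-padded, so the folders sort chronologically by name.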

The graphics folder contains template Photoshop and Illustrator files for sizing featured images and images in the content column correctly.

Publish

Posting an entry to WordPress is a copy-and-paste effort from the Word doc. On the Windows version of Word there is a “Post to Blog” functionality that, for whatever reason, isn’t available on the Mac. That’s disappointing because I think having that functionality would be nice. For the moment, I open my content entry in Word, open WordPress and then copy and paste the content and upload the images to create my entry.