Jan 08 2019
 

Markdown discovery

In August 2018 a work colleague sent me some notes in the form of a .md file and a corresponding PDF. I looked up fileinfo and learnt that this was Markdown. As a long-time note taker in ASCII files, I found the idea of formatting text simply so that it could later be converted to HTML very appealing. I could not believe I had not heard of it before or thought of it myself. When doing web pages I've always striven to have readable HTML. For an example of unreadable HTML, save a Word file as HTML and look at the source.

Syntax highlighting

I immediately started using Markdown and reading about it. It was released by John Gruber in 2004. I have used EditPad as my text editor for many years and I am used to opening files in it and having them syntax highlighted, e.g. for HTML, XML and JSON (you need the Pro version for syntax highlighting). My .md files were not syntax highlighted. EditPad Pro has a Syntax Coloring Scheme Editor which people can use to create schemes and then share with other EditPad users by uploading them. But surely someone had done it already? I found a syntax highlighter for EditPad on GitHub, which seemed to work. But better still, I then found out that I had not updated to the latest EditPad, and Markdown support was added in EditPad Pro 7.6.2 :-). It turns out the GitHub scheme is faulty, and it is redundant anyway now that EditPad has built-in support.

What is Markdown though?

After a week or so of note taking with Markdown I figured it would be nice to see some output, have a table of contents and see all my pages together. Reading the EditPad Pro 7.6.2 release notes, I wondered how to print to HTML, and I even emailed support asking where to go to convert the .md file to a PDF. The ever patient Jan Goyvaerts explained to me that an editor like EditPad allows you to edit Markdown but cannot render Markdown as a formatted page. He added that this is similar to being able to edit HTML without being able to render the HTML as a formatted page. I could not believe I had not reached the same conclusion! Update Feb 2019: Visual Studio Code does a fantastic job of previewing Markdown.

I've been using Markdown for over 4 months now, and only in writing this article today did I learn that Markdown is also a tool for converting the Markdown syntax to HTML! Had I taken in all of Jan's email at the start, I would have seen that he diligently explained that another application is needed to convert Markdown to PDF, or that one can use the original Markdown.pl Perl script to convert Markdown to HTML and then print that to PDF from a browser. Indeed, in Gruber's readme, he says:

Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML).

Thus, “Markdown” is two things: a plain text markup syntax, and a software tool, written in Perl, that converts the plain text markup to HTML.

The download contains the Perl script, Markdown.pl, and refers to Gruber's webpage for the Markdown syntax.

Converting Markdown to HTML

I wanted to generate HTML from multiple files, i.e. create a table of contents of the pages and join them together into a single HTML output. I discovered Pandoc and learnt you could use it to convert Markdown to HTML. It works well.

pandoc *.md > markdown_book.html

will merge all the .md files in the current directory, in alphabetical order, before translating them to HTML. Of course, with this command there's no auto-generated table of contents.
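Pandoc has a `--toc` option for standalone output, which is probably the first thing to try. For a quick look at the structure of the merged notes without converting anything, a small script can also list the headings. The following is only a minimal sketch in Python, not something used for this post: the function names `outline` and `merged_outline` are made up for illustration, it assumes the files are UTF-8, and it only recognises `#`-style headings outside backtick-fenced code blocks.

```python
import re
from pathlib import Path

# Matches ATX headings such as "# Title" or "## Title ##" (1 to 6 hashes).
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")


def outline(markdown_text):
    """Return an indented plain-text outline of the headings in markdown_text."""
    entries = []
    in_fence = False
    for line in markdown_text.splitlines():
        # Toggle on backtick fences so "# comments" inside code are skipped.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            entries.append("    " * (level - 1) + "- " + match.group(2))
    return "\n".join(entries)


def merged_outline(directory="."):
    """Outline all .md files in directory, taken in alphabetical order."""
    paths = sorted(Path(directory).glob("*.md"))
    texts = [path.read_text(encoding="utf-8") for path in paths]
    return outline("\n\n".join(texts))
```

Calling `print(merged_outline("."))` in the notes directory prints one bullet per heading, indented by heading level, in the same alphabetical file order that the shell uses to expand `*.md`.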

The manual suggests spelling out the input and output formats:

pandoc -f markdown -t html5 -o myfile.html myfile.md

but the simpler form works fine:

pandoc myfile.md > myfile.html

There is one gotcha, however: the Markdown file must first be in UTF-8, otherwise you get the error:

pandoc: Cannot decode byte '\x92': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream

My editor (EditPad) has a really handy option to convert the text encoding to Unicode UTF-8, after which you can resave the file. Alternatively, you can convert the file from the PowerShell command line using the not very convenient:

Get-Content .\myfile.md | Set-Content -Encoding utf8 myfile.utf-8.md
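The same conversion can be sketched in a few lines of Python. This is a hedged illustration, not part of the original workflow: it assumes the file was saved as Windows-1252 (in which the byte `\x92` from the error above is the curly apostrophe ’), and the names `recode_to_utf8` and `convert_file` are hypothetical. If the file is in some other encoding, pass that encoding instead.

```python
from pathlib import Path


def recode_to_utf8(data, source_encoding="cp1252"):
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def convert_file(source, target, source_encoding="cp1252"):
    """Write a UTF-8 copy of source to target, leaving source untouched."""
    # Working on raw bytes keeps the line endings exactly as they were.
    raw = Path(source).read_bytes()
    Path(target).write_bytes(recode_to_utf8(raw, source_encoding))
```

For example, `convert_file("myfile.md", "myfile.utf-8.md")` produces a copy that Pandoc can read, provided the assumption about the source encoding holds.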

Another gotcha for me is remembering to leave blank lines around indented code blocks, like the one above, otherwise they will not be indented!

Markdown alternatives

Researching more options, I found out that (arguably) the main Markdown alternatives are a massive rabbit hole, and there are loads of others too… Creole and Textile briefly had me excited…

You can find blog posts on all the alternatives which extol their virtues, e.g. asciidoc: an awesome markdown alternative. Then, you will also find “When comparing reStructuredText vs AsciiDoc, the Slant community recommends reStructuredText for most people”. Another religious topic methinks!

DokuWiki

It started to really annoy me that there are so many variants of markup syntax all over the place! I started to think about DokuWiki, which I have used as a personal wiki since the mid noughties and which has its own markup syntax too. Could I not use DokuWiki syntax independently of the wiki? On the DokuWiki forum I learnt that this was not possible. I then found out about Zim: most of Zim's syntax is DokuWiki syntax, and you can use it like a notebook and also export to a joined-up website.

I also learnt that Pandoc had support for DokuWiki, but for output only; that is, Pandoc can convert to DokuWiki syntax but not from it. It turned out there had been an open issue for the last 4 years to add a DokuWiki reader to Pandoc, so I added a comment with my use case. Coincidentally, the issue was closed today and the reader will be included in the next Pandoc release 🙂

Markdown cheatsheet

The Markdown alternatives led me back to, well, Markdown, though I hope to remain curious. I guess there is a reason Markdown has not been updated since 2004. As Gruber says:

Markdown is not a replacement for HTML, or even close to it. Its syntax is very small, corresponding only to a very small subset of HTML tags. The idea is not to create a syntax that makes it easier to insert HTML tags. In my opinion, HTML tags are already easy to insert. The idea for Markdown is to make it easy to read, write, and edit prose. HTML is a publishing format; Markdown is a writing format. Thus, Markdown’s formatting syntax only addresses issues that can be conveyed in plain text.

I wanted a typical one-page cheat sheet I could hang on my wall. None of the cheat sheets were ideal for me: they were too long, or missing information such as how to deal with nested lists and code blocks. The one I settled on is from Ahred code.

Alternative to HTML generation

For wiki pages and blog posts there are alternatives to generating HTML output from your Markdown.

Confluence has its own markup syntax of course, though it is WYSIWYG now. There are a number of ways to use Markdown in a Confluence page, but the simplest is via Insert > Markup and then selecting Markdown.

If you have your own WordPress.org site there is a plugin, WP-Markdown, that allows you to paste your Markdown into a new post, but it is not maintained and I would not recommend it. Instead I use Pandoc and paste the HTML. I used Markdown for this post in August 2018 as well as for this page itself. Have a look at the Markdown used for this page in side-by-side view with the HTML.

Trying Markdown

Gruber has a sandbox where you can try Markdown, and you can use Pandoc online to render it to HTML.

Alternatively, there are loads of online Markdown editors, such as the in-browser Markdown editor at stackedit.io and Markdowner. I like the side-by-side editors where you can see the markup in one pane and the HTML output in another. hackmd.io works great for this, and each edit also gets its own URL.