Simple Phoenix LiveView App: Markdown show notes

In this episode, we'll set up an enhanced version of Markdown for editing show notes, and sanitize user-entered content.

Installing the needed dependencies

We'll be using a stripped-down version of Alchemist Markdown, which relies on Earmark. Since the site will also allow user comments in the future, we'll need an HTML sanitizer too. Add these to the deps section in mix.exs:

      {:earmark, "~> 1.4"},
      {:html_sanitize_ex, "~> 1.3"},

Alchemist Markdown

We'll be using the same Markdown that powers this site (or at least part of it). Much of it is to fill in some gaps between Earmark's feature set and what I want, but it's not worth investing a great deal of time into.

It's just a small wrapper and it's not published on Hex, so just add this file under your lib directory and name it alchemist_markdown.ex:

defmodule AlchemistMarkdown do
  def to_html(markdown \\ "", opts \\ [])

  def to_html(markdown, _opts) do
    markdown
    |> hrs
    |> divs
    |> Earmark.as_html!(earmark_options())
    |> HtmlSanitizeEx.html5()
    |> smalls
    |> bigs
  end

  # For now, we'll just replace H1 and H2 tags with H3s. As the site grows and
  # it becomes necessary, we'll add more restrictions for commenters.
  def to_commenter_html(markdown) do
    to_html(markdown)
    |> h3_is_max
  end

  # Earmark doesn't support adding CSS classes to divs yet
  def divs(text) do
    matcher = ~r{(^|\n)::div((\.[\w-]*)*) ?(.*?)(\n):/div ?}s
    matches = Regex.run(matcher, text)

    case matches do
      nil ->
        text

      [matched_part, _, classes, _, inner_md | _tail] ->
        classname = classes |> String.split(".", trim: true) |> Enum.join(" ")
        html = "<div class=\"#{classname}\">#{to_html(inner_md)}</div>"
        String.replace(text, matched_part, html)
    end
  end

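  # ++text++ becomes <big>text</big>. The lazy (.+?) keeps two ++...++ spans
  # on the same line from being merged into one.
  def bigs(text) do
    replace_unless_pre(text, ~r/\+\+(.+?)\+\+/, "<big>\\1</big>")
  end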

  def hrs(text) do
    Regex.replace(~r{(^|\n)([-*])( *\2 *)+\2}s, text, "\\1<hr />")
  end

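  # Demote H1 and H2 headings to H3. The lazy (.*?) makes each heading match
  # on its own instead of spanning from the first opening tag to the last closing tag.
  def h3_is_max(text) do
    text = Regex.replace(~r{<h1([^<]*>(.*?)<\/)h1>}s, text, "<h3\\1h3>")
    Regex.replace(~r{<h2([^<]*>(.*?)<\/)h2>}s, text, "<h3\\1h3>")
  end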

  # Replace the input text based on the regex and replacement text provided
  # ... except leave everything inside <pre> blocks as is
  def replace_unless_pre(text, rexp, replacement) do
    Regex.split(~r|<pre[^<]*>.*?<\/pre>|s, text, include_captures: true)
    |> Enum.map(fn str ->
      case String.starts_with?(str, "<pre") do
        true -> str
        _ -> Regex.replace(rexp, str, replacement)
      end
    end)
    |> Enum.join("")
  end

  # --text-- becomes <small>text</small> (lazy match, same reasoning as bigs/1)
  def smalls(text) do
    replace_unless_pre(text, ~r/--(.+?)--/, "<small>\\1</small>")
  end

  defp earmark_options() do
    %Earmark.Options{
      code_class_prefix: "lang-",
      smartypants: false
    }
  end
end
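
To get a feel for the extra syntax without installing anything, here is a minimal sketch that copies just the regex patterns from the module into a throwaway module. The name SyntaxSketch and the div_parts function are made up for illustration. It skips Earmark, the sanitizer and the pre guard entirely, so it only shows what each pattern matches, not the full output of to_html.

```elixir
defmodule SyntaxSketch do
  # Same pattern as AlchemistMarkdown.hrs/1: three or more - or * on a line
  def hrs(text), do: Regex.replace(~r{(^|\n)([-*])( *\2 *)+\2}s, text, "\\1<hr />")

  # Same patterns as bigs/1 and smalls/1, minus the <pre> guard
  def bigs(text), do: Regex.replace(~r/\+\+(.+)\+\+/, text, "<big>\\1</big>")
  def smalls(text), do: Regex.replace(~r/--(.+)--/, text, "<small>\\1</small>")

  # Hypothetical helper: returns the class string and the inner Markdown
  # that divs/1 would pull out of a ::div ... :/div block
  def div_parts(text) do
    case Regex.run(~r{(^|\n)::div((\.[\w-]*)*) ?(.*?)(\n):/div ?}s, text) do
      nil ->
        nil

      [_matched, _, classes, _, inner_md | _tail] ->
        {classes |> String.split(".", trim: true) |> Enum.join(" "), inner_md}
    end
  end
end

IO.inspect(SyntaxSketch.hrs("Intro\n---\nOutro"))
# "Intro\n<hr />\nOutro"
IO.inspect(SyntaxSketch.bigs("This is ++big++ text"))
# "This is <big>big</big> text"
IO.inspect(SyntaxSketch.smalls("--fine print--"))
# "<small>fine print</small>"
IO.inspect(SyntaxSketch.div_parts("::div.note.wide\nHello *there*\n:/div"))
# {"note wide", "\nHello *there*"}
```

In the real module the inner Markdown of a div goes back through to_html, and the big and small replacements run after the sanitizer so those tags aren't stripped.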

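Note: A previous version of this file matched |\r\n|\r|\n in a couple of functions to capture newlines from any OS. The AlchemistMarkdown module above uses just \n plus the s modifier after the closing delimiter of the regular expression. The s modifier is called dotall; it makes . match newline characters too, so a pattern like .* can span multiple lines. It does not change what \n itself matches, so text with bare \r line endings would need to be normalized before conversion.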
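
A quick way to check what the s modifier does and doesn't do is to try it in IEx. The sketch below uses only the standard library; the NewlineSketch module and its normalize function are hypothetical and not part of Alchemist Markdown. They just show one possible way to convert \r\n and \r line endings to \n before running the converter.

```elixir
defmodule NewlineSketch do
  # Hypothetical helper: turn \r\n and bare \r line endings into \n
  def normalize(text), do: String.replace(text, ~r/\r\n?/, "\n")
end

# Without s, "." does not match a newline; with s (dotall) it does
IO.inspect(Regex.match?(~r/a.b/, "a\nb"))
# false
IO.inspect(Regex.match?(~r/a.b/s, "a\nb"))
# true

# The s modifier does not change what \n matches: the \r is left behind
IO.inspect(Regex.split(~r/\n/s, "one\r\ntwo"))
# ["one\r", "two"]

IO.inspect(NewlineSketch.normalize("one\r\ntwo\rthree"))
# "one\ntwo\nthree"
```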

Generate HTML notes from Markdown

We'll use the AlchemistMarkdown module to generate HTML show notes every time the Markdown show notes are modified. To do so, we'll add a gen_notes function to podcast.ex and update the changeset to use it:

  def changeset(podcast, attrs) do
    podcast
    |> cast(attrs, [:audio_url, :is_published, :notes_html, :notes_md, :subtitle, :title])
    |> validate_required([:audio_url, :notes_md, :subtitle, :title])
    |> unique_constraint(:title)
    |> gen_notes()
  end

  defp gen_notes(%{valid?: true, changes: %{notes_md: text}} = changeset) do
    put_change(changeset, :notes_html, AlchemistMarkdown.to_html(text))
  end

  defp gen_notes(changeset), do: changeset
end
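
The two gen_notes clauses rely on pattern matching against the changeset, which is easy to try out in isolation. Here is a rough sketch that uses plain maps in place of a real Ecto.Changeset so it runs without Ecto or Earmark. The GenNotesSketch module and its fake_to_html function are stand-ins invented for this example, and the Map.put call only approximates what put_change does.

```elixir
defmodule GenNotesSketch do
  # Stand-in for AlchemistMarkdown.to_html/1 so the sketch needs no deps
  defp fake_to_html(text), do: "<p>" <> text <> "</p>"

  # Matches only when the changeset is valid AND notes_md is among the changes
  def gen_notes(%{valid?: true, changes: %{notes_md: text}} = changeset) do
    # Approximates Ecto.Changeset.put_change/3 by writing into :changes
    %{changeset | changes: Map.put(changeset.changes, :notes_html, fake_to_html(text))}
  end

  # Anything else (invalid, or notes_md unchanged) passes through untouched
  def gen_notes(changeset), do: changeset
end

edited = %{valid?: true, changes: %{notes_md: "Hello"}}
title_only = %{valid?: true, changes: %{title: "New title"}}
invalid = %{valid?: false, changes: %{notes_md: "Hello"}}

IO.inspect(GenNotesSketch.gen_notes(edited).changes)
IO.inspect(GenNotesSketch.gen_notes(title_only) == title_only)
IO.inspect(GenNotesSketch.gen_notes(invalid) == invalid)
```

Only the first map gains a notes_html entry. Editing just the title, or submitting an invalid form, never triggers the Markdown conversion.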

Why use this pattern?

As demonstrated in the video, it would have been easier to just put AlchemistMarkdown.to_html(@podcast.notes_md) inside our show template to generate the show notes HTML from the Markdown previously saved. It worked fine, too. The site was completely usable, and in fact this site you're reading now used to render Markdown-generated HTML that way for nearly a year!

The reason for the above logic in the changeset is that our Markdown conversion is expensive. Earmark is not a particularly fast library. It's written in Elixir, not C or another low-level language. On top of that, Alchemist Markdown adds another layer of regex functions. Though it's perfectly usable for a typical blog, its performance isn't great. In load tests on a $5/month droplet, the breaking point for Markdown pages rendered that way was about 100 requests/second vs. about 1000 requests/second for simple HTML-based templates.

I solved the problem by setting up an ETS cache, which was fun, but ultimately not needed for this situation. Since a content site is read far more often than it's written to, the Markdown can just be converted to HTML once on each save to the DB and then read any number of times without repeating the (somewhat) expensive conversion.
