In this episode, we'll set up an enhanced version of Markdown for editing show notes, and sanitize user-entered content.
Installing the needed dependencies
We'll be using a stripped-down version of Alchemist Markdown, which relies on Earmark. Since the site will also allow user commenting in the future, we'll also need an HTML sanitizer. Add these to the deps section in mix.exs:
{:earmark, "~> 1.4"},
{:html_sanitize_ex, "~> 1.3"},
Alchemist Markdown
We'll be using the same Markdown that powers this site (or at least part of it). Most of it fills gaps between Earmark's feature set and what I want, and it isn't worth investing a great deal of time in.
It's just a small wrapper and it's not published on Hex, so just add this file under your lib directory and name it alchemist_markdown.ex:
defmodule AlchemistMarkdown do
  def to_html(markdown \\ "", opts \\ [])

  def to_html(markdown, _opts) do
    markdown
    |> hrs()
    |> divs()
    |> Earmark.as_html!(earmark_options())
    |> HtmlSanitizeEx.html5()
    |> smalls()
    |> bigs()
  end

  # For now, we'll just replace H1 and H2 tags with H3s, but as the site grows
  # and it becomes necessary, we'll add more restrictions to commenters.
  def to_commenter_html(markdown) do
    markdown
    |> to_html()
    |> h3_is_max()
  end

  # Earmark doesn't support adding CSS classes to divs yet
  def divs(text) do
    matcher = ~r{(^|\n)::div((\.[\w-]*)*) ?(.*?)(\n):/div ?}s
    matches = Regex.run(matcher, text)

    case matches do
      nil ->
        text

      [matched_part, _, classes, _, inner_md | _tail] ->
        classname = classes |> String.split(".", trim: true) |> Enum.join(" ")
        # Quote the attribute so several classes stay inside one class value
        html = "<div class=\"#{classname}\">#{to_html(inner_md)}</div>"
        String.replace(text, matched_part, html)
    end
  end

  # Lazy (.+?) so two ++big++ spans on one line are wrapped separately
  def bigs(text) do
    replace_unless_pre(text, ~r/\+\+(.+?)\+\+/, "<big>\\1</big>")
  end

  def hrs(text) do
    Regex.replace(~r{(^|\n)([-*])( *\2 *)+\2}s, text, "\\1<hr />")
  end

  # Lazy (.*?) so each heading is matched on its own rather than
  # everything between the first opening and the last closing tag
  def h3_is_max(text) do
    text = Regex.replace(~r{<h1([^<]*>(.*?)<\/)h1>}s, text, "<h3\\1h3>")
    Regex.replace(~r{<h2([^<]*>(.*?)<\/)h2>}s, text, "<h3\\1h3>")
  end

  # Replace the input text based on the regex and replacement text provided
  # ... except leave everything inside <pre> blocks as is
  def replace_unless_pre(text, rexp, replacement) do
    # Lazy (.*?) so text between two separate <pre> blocks is still processed
    Regex.split(~r|<pre[^<]*>.*?<\/pre>|s, text, include_captures: true)
    |> Enum.map(fn str ->
      case String.starts_with?(str, "<pre") do
        true -> str
        _ -> Regex.replace(rexp, str, replacement)
      end
    end)
    |> Enum.join("")
  end

  # Lazy (.+?) so two --small-- spans on one line are wrapped separately
  def smalls(text) do
    replace_unless_pre(text, ~r/--(.+?)--/, "<small>\\1</small>")
  end

  defp earmark_options() do
    %Earmark.Options{
      code_class_prefix: "lang-",
      smartypants: false
    }
  end
end
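To make the custom syntax concrete, here is a small sketch that replays the two patterns from divs and hrs on a sample string. It uses only the standard library, so it runs without Earmark or HtmlSanitizeEx; the sample Markdown and the variable names are made up for illustration.

```elixir
# Hypothetical sample input using the ::div and "- - -" syntax.
markdown = "::div.note.wide\nHello **there**\n:/div\n- - -\nOutro"

# Same pattern divs/1 uses: capture the dotted class list and the inner Markdown.
div_matcher = ~r{(^|\n)::div((\.[\w-]*)*) ?(.*?)(\n):/div ?}s
[_matched, _, classes, _, inner_md | _] = Regex.run(div_matcher, markdown)

# ".note.wide" becomes a space-separated class list.
classname = classes |> String.split(".", trim: true) |> Enum.join(" ")
IO.inspect(classname) # "note wide"
IO.inspect(inner_md)  # "\nHello **there**"

# Same pattern hrs/1 uses: a line of three or more "-" or "*",
# optionally separated by spaces, becomes an <hr />.
hr_matcher = ~r{(^|\n)([-*])( *\2 *)+\2}s
with_hr = Regex.replace(hr_matcher, markdown, "\\1<hr />")
IO.inspect(with_hr)
```

In the real module the inner Markdown is passed back through to_html, so Markdown inside a div is rendered as well.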
Note: A previous version of this file matched |\r\n|\r|\n to capture newlines from any OS in a couple of functions, but the version above replaces it with just \n and adds the s modifier after the closing delimiter of the regular expression. The s modifier is called dotall, and it makes . match newline characters as well. Windows-style \r\n line endings still contain a \n, so the patterns continue to match on them.
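A quick way to check what the s modifier does is to try it in iex. The snippet below is only an illustration with throwaway strings; it shows that s affects what . matches, and that a literal \n in a pattern finds the \n inside a \r\n pair while leaving the \r in place.

```elixir
# Without the s modifier, "." stops at a newline.
IO.inspect(Regex.match?(~r/a.b/, "a\nb"))  # false

# With s (dotall), "." matches the newline too.
IO.inspect(Regex.match?(~r/a.b/s, "a\nb")) # true

# A literal \n still matches the \n of a Windows-style \r\n,
# but the \r stays in the output.
crlf = Regex.replace(~r/(^|\n)---/, "x\r\n---", "\\1<hr />")
IO.inspect(crlf) # "x\r\n<hr />"
```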
Generate HTML notes from Markdown
We'll use the AlchemistMarkdown module to generate HTML show notes every time the Markdown show notes are modified. To do so, we'll add a gen_notes function to podcast.ex and update the changeset to use it:
  def changeset(podcast, attrs) do
    podcast
    |> cast(attrs, [:audio_url, :is_published, :notes_html, :notes_md, :subtitle, :title])
    |> validate_required([:audio_url, :notes_md, :subtitle, :title])
    |> unique_constraint(:title)
    |> gen_notes()
  end

  # Regenerate the HTML notes only when the changeset is valid and notes_md changed
  defp gen_notes(%{valid?: true, changes: %{notes_md: text}} = changeset) do
    put_change(changeset, :notes_html, AlchemistMarkdown.to_html(text))
  end

  defp gen_notes(changeset), do: changeset
end
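The two gen_notes clauses rely on pattern matching to decide when to regenerate the HTML. The sketch below mimics that with plain maps standing in for Ecto changesets, so it runs without Ecto or Earmark. GenNotesSketch, its put_change helper, and fake_to_html are hypothetical stand-ins, not the real Ecto or Alchemist Markdown functions.

```elixir
defmodule GenNotesSketch do
  # Runs only when the "changeset" is valid AND notes_md is among the changes.
  def gen_notes(%{valid?: true, changes: %{notes_md: text}} = changeset) do
    put_change(changeset, :notes_html, fake_to_html(text))
  end

  # Anything else (invalid, or notes_md untouched) passes through as is.
  def gen_notes(changeset), do: changeset

  # Hypothetical stand-in for Ecto.Changeset.put_change/3 on a plain map.
  defp put_change(changeset, key, value) do
    %{changeset | changes: Map.put(changeset.changes, key, value)}
  end

  # Hypothetical stand-in for AlchemistMarkdown.to_html/1.
  defp fake_to_html(text), do: "<p>" <> text <> "</p>"
end

valid = GenNotesSketch.gen_notes(%{valid?: true, changes: %{notes_md: "Hi"}})
invalid = GenNotesSketch.gen_notes(%{valid?: false, changes: %{notes_md: "Hi"}})
untouched = GenNotesSketch.gen_notes(%{valid?: true, changes: %{title: "New"}})

IO.inspect(valid.changes)     # notes_html is now set
IO.inspect(invalid.changes)   # unchanged
IO.inspect(untouched.changes) # unchanged
```

Because the first clause needs notes_md in changes, editing only the title of a podcast does not trigger a new Markdown conversion.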
Why use this pattern?
As demonstrated in the video, it would have been easier to just put AlchemistMarkdown.to_html(@podcast.notes_md) inside our show template to generate the show notes HTML from the Markdown previously saved. It worked fine, too. The site was completely usable, and in fact this site you're reading now used to render Markdown-generated HTML that way for nearly a year!
The reason for the above logic in the changeset is that our Markdown conversion is expensive. Earmark is not a particularly fast library. It's written in Elixir, not C or another lower-level language. On top of that, Alchemist Markdown adds another layer of regex functions. Though it's perfectly usable for a typical blog, its performance isn't great. In load tests on a $5/month droplet, the breaking point for Markdown pages rendered that way was about 100 requests/second vs about 1000 requests/second for simple HTML-based templates.
I solved the problem by setting up an ETS cache, which was fun, but ultimately not needed for this situation. Since a content site is viewed far more than it's written to, the Markdown can just be converted to HTML once upon each save to the DB and then be read any number of times without needing to repeat the (somewhat) expensive conversion.
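For comparison, a read-through ETS cache can be sketched in a few lines. This is not the cache this site used, just a minimal illustration of the idea; the module name, table name, and the render function passed in are all hypothetical.

```elixir
defmodule NotesCacheSketch do
  @table :notes_html_cache

  # Create a named, public table owned by the calling process.
  def init do
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
  end

  # Return the cached HTML for key, rendering and storing it on a miss.
  def fetch(key, render_fun) do
    case :ets.lookup(@table, key) do
      [{^key, html}] ->
        html

      [] ->
        html = render_fun.()
        :ets.insert(@table, {key, html})
        html
    end
  end
end

NotesCacheSketch.init()

# Count how often the (pretend) expensive conversion actually runs.
calls = :counters.new(1, [])

render = fn ->
  :counters.add(calls, 1, 1)
  "<p>Hi</p>"
end

first = NotesCacheSketch.fetch({:podcast, 1}, render)
second = NotesCacheSketch.fetch({:podcast, 1}, render)
IO.inspect({first, second, :counters.get(calls, 1)})
```

A cache like this also needs invalidation whenever the notes change, which is part of why storing the generated HTML at save time is the simpler option here.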
1 Comment
Is it possible to add some buttons with a menu for these features? I mean like the Trix Editor? To help normal users...