"Safe" markdown processor for PHP?


PHP Markdown has a sanitizer option, but it doesn't appear to be advertised anywhere. Take a look at the top of the Markdown_Parser class in markdown.php (starts on line 191 in version 1.0.1m). We're interested in lines 209-211:

    # Change to `true` to disallow markup or entities.
    var $no_markup = false;
    var $no_entities = false;

If you change those to true, markup and entities, respectively, should be escaped rather than inserted verbatim. There doesn't appear to be any built-in way to change those (e.g., via the constructor), but you can always add one:

    function do_markdown($text, $safe = false) {
        $parser = new Markdown_Parser;
        if ($safe) {
            $parser->no_markup = true;
            $parser->no_entities = true;
        }
        return $parser->transform($text);
    }

Note that the above function creates a new parser on every run rather than caching it like the provided Markdown function (lines 43-56) does, so it might be a bit on the slow side.
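If the per-call construction cost matters, one option is to keep one parser per mode in a static variable, much as the bundled Markdown function does. This is only a sketch: it assumes markdown.php (PHP Markdown 1.0.1m) has already been included so that Markdown_Parser exists, and do_markdown_cached is a made-up name, not part of the library.

```php
<?php
// Assumes markdown.php has been loaded elsewhere, e.g. require_once 'markdown.php';
// do_markdown_cached() is a hypothetical helper, not part of PHP Markdown.
function do_markdown_cached($text, $safe = false) {
    // One parser per mode, created on first use and reused afterwards.
    static $parsers = array();

    $key = $safe ? 'safe' : 'raw';
    if (!isset($parsers[$key])) {
        $parser = new Markdown_Parser;
        if ($safe) {
            $parser->no_markup = true;
            $parser->no_entities = true;
        }
        $parsers[$key] = $parser;
    }
    return $parsers[$key]->transform($text);
}
```

Keeping two instances avoids flipping the flags back and forth on a shared parser between calls.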


JavaScript Markdown Editor Hypothesis:

  • Use a JavaScript-driven Markdown editor, e.g., one based on showdown
  • Remove all icons and visual cues from the toolbar for unwanted items
  • Set up a JavaScript filter to clean up unwanted markup on submission
  • Test and harden all JavaScript changes and filters locally on your computer
  • Mirror those filters in the PHP submission script, to catch the same markup on the server side
  • Remove all references to unwanted items from Help/Tutorials

I've created a Markdown editor in JavaScript, but it has enhanced features. That took a big chunk of time and SVN revisions. But I don't think it would be that tough to alter a Markdown editor to limit the HTML allowed.
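To make the "mirror those filters in PHP" step concrete, here is a rough sketch of what one server-side filter might look like. Which items count as unwanted is up to you; this example simply assumes inline images are the thing being disallowed, and strip_markdown_images is a hypothetical name.

```php
<?php
// Hypothetical server-side filter: suppose inline images are the "unwanted item".
// It rewrites ![alt](url) to just the alt text before the Markdown parser runs.
function strip_markdown_images($text) {
    // Matches inline images only; reference-style images (![alt][id]) are not handled.
    return preg_replace('/!\[([^\]]*)\]\([^)]*\)/', '$1', $text);
}

// Example: the image is reduced to its alt text, ordinary links are left alone.
$clean = strip_markdown_images('See ![logo](http://example.com/logo.png) or [docs](http://example.com)');
// $clean is now 'See logo or [docs](http://example.com)'
```

The same pattern would have to be applied in the JavaScript filter so that the preview and the stored result agree.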


How about running htmlspecialchars on the user-entered input before processing it through Markdown? It should escape anything dangerous, but leave everything that Markdown understands.

I'm trying to think of a case where this wouldn't work, but can't think of anything offhand.
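A quick way to probe this idea is to look at what htmlspecialchars actually does to a few representative inputs. The sketch below uses a hypothetical wrapper, escape_for_markdown, and only the built-in htmlspecialchars. Two of the samples look worth checking before relying on the approach: a leading `>` (Markdown's blockquote marker) is escaped too, and a link with a javascript: URL contains nothing for htmlspecialchars to escape.

```php
<?php
// Hypothetical helper: escape HTML metacharacters before handing text to Markdown.
function escape_for_markdown($text) {
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$samples = array(
    '<script>alert(1)</script>',     // raw HTML: neutralized
    '> quoted line',                 // blockquote marker: becomes &gt;
    '[click](javascript:alert(1))',  // link URL: passes through unchanged
);
foreach ($samples as $s) {
    echo escape_for_markdown($s), "\n";
}
// Output:
// &lt;script&gt;alert(1)&lt;/script&gt;
// &gt; quoted line
// [click](javascript:alert(1))
```

So raw tags are handled, but blockquotes would probably stop rendering as blockquotes, and link targets reach the parser untouched and would need their own check.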