A little problem came up with some user-submitted content on a platform I’m working with.
A form allows users to submit content with tinyMCE. If the content is pasted from MS Word, the source ends up littered with HTML conditional comments, which can have a detrimental effect on the page that later displays it.
After discovering what was going on, I thought that the best time to capture the offending content is when the text is submitted. Using a regular expression, I can capture the HTML conditionals as well as remove any unnecessary comments:
<code>
function clear_html_comments($html) {
    // The s modifier lets . match newlines; this avoids the slow (.|\s) alternation.
    return preg_replace('/<!--.*?-->/s', '', $html);
}
</code>
That’s it. Just pass in the content from tinyMCE and anything wrapped in HTML comments, including the Word conditionals, should be stripped before the content is returned.
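To show the idea end to end, here is a minimal sketch of a slightly more defensive variant, run against a made-up Word-style paste. The name `clear_html_comments_safe` and the sample string are my own assumptions for illustration, not part of the platform. `preg_replace` returns `null` when the regex engine fails (for example, if it hits its backtracking limit on a very large paste), so this version falls back to the original input in that case; whether that is the right fallback depends on how strict you need to be.

```php
<?php
// Hypothetical wrapper: same idea as the function above, plus a null check.
function clear_html_comments_safe(string $html): string
{
    // The s modifier makes . match newlines, so multi-line comments are caught.
    $clean = preg_replace('/<!--.*?-->/s', '', $html);

    // preg_replace returns null on a PCRE error; fall back to the input unchanged.
    return $clean ?? $html;
}

// Made-up sample resembling content pasted from MS Word.
$pasted = "<p>Hello</p><!--[if gte mso 9]><xml>\n<w:WordDocument></w:WordDocument>\n</xml><![endif]--><p>World</p><!-- note -->";

echo clear_html_comments_safe($pasted);
// prints: <p>Hello</p><p>World</p>
```

Both the conditional block and the ordinary comment are removed, while the surrounding paragraphs are left alone.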