Taking care of HTML comments in PHP

A little problem came up with some user submitted content on a platform I’m working with.

A form allows users to submit content with tinyMCE. If the content is pasted from MS Word, the source is then littered with HTML conditional comments that can have a detrimental effect on the page that returns it.

After discovering what was going on, I thought that the best time to capture the offending content is when the text is submitted. Using a regular expression, I can capture the HTML conditionals as well as remove any unnecessary comments:

<code>
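function clear_html_comments($html)
{
    // The "s" modifier lets "." match newlines, so multi-line comments
    // (including Word's conditional comments) are removed too.
    return preg_replace('/<!--.*?-->/s', '', $html);
}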
</code>

That’s it. Just pass in the content from tinyMCE and anything wrapped in HTML comments is stripped out before it can be returned to the page.
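
As a quick check, here is a minimal sketch of the same idea run against the sort of markup Word leaves behind. The sample string and the function name strip_html_comments_sketch are made up for illustration. The fallback on null is my own assumption about what you would want if the regex engine fails.

<code>
<?php
// Hypothetical sample: roughly what arrives after a paste from MS Word.
$pasted = "<p>Hello</p><!--[if gte mso 9]><xml>\n<w:WordDocument></w:WordDocument>\n</xml><![endif]--><p>World</p><!-- note -->";

// Same idea as clear_html_comments(), using the "s" modifier so that
// "." also matches the newlines inside multi-line comments.
function strip_html_comments_sketch($html)
{
    $clean = preg_replace('/<!--.*?-->/s', '', $html);
    // preg_replace() returns null on failure; fall back to the input then.
    return $clean === null ? $html : $clean;
}

echo strip_html_comments_sketch($pasted);
// prints: <p>Hello</p><p>World</p>
</code>

Both the conditional comment and the ordinary one are gone, and the markup around them is untouched.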

Praise Be for Regular Expressions

I used a regular expression in VB for the first time ever (I think) today!

The reason is that I cached some generated ASP and realised that when I needed to use it on different pages, the id lookup for certain elements would change. So, no problem (except for the fact that I’m coding in ASP):

<code>Dim regExpObj, file_content, newStr
Set regExpObj = New RegExp
With regExpObj
    .Pattern = "idTag[0-9]+" 'Matches the id name that I'm using,
        'regardless of what number the generated code makes it
    .Global = True 'Matches all occurrences, not just the first
End With

newStr = regExpObj.Replace(file_content, "idTag3")

response.write newStr
</code>

Easy.
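
For comparison with the PHP function earlier, the same replacement can be sketched with preg_replace(), which replaces every match by default (the equivalent of .Global = True). The cached fragment below is invented for illustration, and matching one or more digits is an assumption about how the generated ids are numbered.

<code>
<?php
// Hypothetical cached fragment; the real generated markup isn't shown here.
$file_content = '<div id="idTag7"></div><label for="idTag12"></label>';

// "[0-9]+" matches one or more digits, so idTag12 is replaced whole
// rather than leaving a stray trailing digit behind.
$new_str = preg_replace('/idTag[0-9]+/', 'idTag3', $file_content);

echo $new_str;
// prints: <div id="idTag3"></div><label for="idTag3"></label>
</code>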