This is a simple HTML string sanitizing tool.
It allows rich but safe HTML content to be published on your pages.
The script is lightweight and, to some degree, customizable.
- The function handles blacklisted tags first: they are discarded without further processing.
- For the remaining tags, it checks link protocols for code injection and strips them if JavaScript is encountered.
- Next, it removes all event handler code from the event attributes of every tag.
- Lastly, it restores images after stripping their event attributes. This step is performed directly on the source string, because an image with a faulty source fires its onerror event as soon as it is converted to a DOM element, which lets an attacker execute malicious code even before the image is appended to the document. This is the only part of the process that operates at the string level, but it is an absolutely necessary one!
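To make the string-level image step and the protocol check more concrete, here is a minimal sketch. It is not the code of the tool itself: the names stripImgEventAttributes and isScriptUrl are hypothetical, the regular expressions are a simplification rather than a full HTML parser, and isScriptUrl assumes it receives an attribute value whose HTML entities have already been decoded.

```javascript
// Hypothetical helpers for illustration only; not the tool's actual code.

// Matches a whole <img ...> tag, treating quoted attribute values as opaque
// so that a ">" inside a value does not end the match early.
const IMG_TAG = /<img\b(?:"[^"]*"|'[^']*'|[^'">])*>/gi;

// Matches one attribute: leading whitespace, a name, and an optional value.
const ATTRIBUTE = /\s+([^\s=>\/]+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+))?/g;

// Removes every on* attribute (onerror, onload, ...) from <img> tags while
// the markup is still a plain string, before any DOM element is created.
function stripImgEventAttributes(html) {
  return html.replace(IMG_TAG, (tag) =>
    tag.replace(ATTRIBUTE, (attr, name) =>
      name.toLowerCase().startsWith("on") ? "" : attr
    )
  );
}

// Reports whether a link value uses the javascript: protocol.
// URL parsers drop tabs and newlines and trim leading control characters,
// so all characters up to U+0020 are removed first (deliberately over-strict).
function isScriptUrl(url) {
  const normalized = url.replace(/[\u0000-\u0020]/g, "").toLowerCase();
  return normalized.startsWith("javascript:");
}

console.log(stripImgEventAttributes('<img src="x" onerror="alert(1)">'));
// prints: <img src="x">
console.log(isScriptUrl("JaVa\tScript:alert(1)")); // prints: true
console.log(isScriptUrl("https://example.com/")); // prints: false
```

Walking the attributes one by one, instead of searching the tag for "on...=", keeps text inside a quoted value (an alt text, for example) from being mistaken for an attribute.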
All questions, suggestions, or remarks are welcome.
P.S.: Just before posting, I decided to also restrict the use of inline styles, since older versions of IE evaluate JavaScript in the values of this property (for example, through CSS expression()).
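As a rough illustration of that extra restriction, the attribute filter can treat style the same way as the event attributes. This is only a sketch under assumptions: isForbiddenAttribute and filterAttributes are hypothetical names, and the attributes are assumed to be available as a plain name-to-value object.

```javascript
// Hypothetical helpers for illustration only; not the tool's actual code.

// An attribute is rejected if it is an event handler (on*) or an inline style.
function isForbiddenAttribute(name) {
  const lower = name.toLowerCase();
  return lower.startsWith("on") || lower === "style";
}

// Returns a copy of a name-to-value attribute map without the rejected entries.
function filterAttributes(attributes) {
  return Object.fromEntries(
    Object.entries(attributes).filter(([name]) => !isForbiddenAttribute(name))
  );
}

const cleaned = filterAttributes({
  src: "a.png",
  STYLE: "width: expression(alert(1))",
  onclick: "run()",
});
console.log(cleaned); // only the src entry remains
```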