Earlier today, Cory Fowler suggested I write up a post discussing the differences
between the AntiXss library and the methods found in HttpUtility, and how AntiXss
helps defend against cross site scripting (xss). As I was thinking about what to write,
it occurred to me that I really had no idea how it did what it did, and why it differed
from HttpUtility. <side-track>I’m kinda wondering how many other people
out there run into the same thing? We are told to use some technology because
it does xyz better than abc, but when it comes right down to it, we aren’t quite sure
of the internals. Just a thought for later I suppose. </side-track>
A Quick Refresher
To quickly summarize what xss is: if you have a textbox on your website that someone
can enter text into, and you then display that same text on another page, the user could
maliciously add <script> tags to do anything they want with JavaScript.
This usually results in redirecting visitors to another website that shows advertisements
or tries to install malware.
The way to stop this is to not trust any input, and to encode any character that could
be part of a tag as an HTML-encoded entity before writing it back out.
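To make that concrete, here is a minimal sketch (the malicious URL is made up, and I'm reaching for HttpUtility.HtmlEncode here just to illustrate the idea of encoding before output):

```csharp
using System;
using System.Web;

class XssRefresher
{
    static void Main()
    {
        // Pretend this string came from a textbox on the site.
        string userInput = "<script>window.location = 'http://evil.example';</script>";

        // Vulnerable: the input is dropped into the page exactly as typed,
        // so the <script> tag runs in the visitor's browser.
        string unsafeHtml = "<p>" + userInput + "</p>";

        // Safer: encode first, so the browser shows the tag as plain text
        // instead of executing it.
        string safeHtml = "<p>" + HttpUtility.HtmlEncode(userInput) + "</p>";

        Console.WriteLine(unsafeHtml);
        Console.WriteLine(safeHtml);
    }
}
```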
HttpUtility does this though, right?
The HttpUtility class definitely does do this. However, it is relatively limited
in how it encodes potentially malicious text. It works from a short list of known-bad
characters, encoding things like the angle brackets < and > to &lt; and &gt;. This can
get tricky, because anything that short list doesn't account for passes through
untouched, which (at least in theory) leaves room for bypasses.
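A rough sketch of that behaviour (the exact output can vary a little between framework versions, but the brackets, ampersand, and double quote are the usual suspects):

```csharp
using System;
using System.Web;

class HttpUtilityEncoding
{
    static void Main()
    {
        // Known-bad characters are swapped for entities:
        //   <  ->  &lt;     >  ->  &gt;
        //   &  ->  &amp;    "  ->  &quot;
        Console.WriteLine(HttpUtility.HtmlEncode("<b>\"Tom & Jerry\"</b>"));

        // Everything else (letters, digits, most punctuation) passes
        // through untouched; the method only reacts to its short list.
        Console.WriteLine(HttpUtility.HtmlEncode("plain text, no tags here"));
    }
}
```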
Enter AntiXss
The AntiXss library works in essentially the opposite manner. Instead of a black-list
of bad characters, it keeps a white-list of allowed characters (the usual a-z, A-Z, 0-9,
and a handful of other safe characters) and encodes everything else.
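A quick sketch of the difference in practice (this assumes the 4.x-era Encoder.HtmlEncode entry point in Microsoft.Security.Application; older releases of the library expose the same behaviour as AntiXss.HtmlEncode):

```csharp
using System;
using Microsoft.Security.Application; // ships with the AntiXSS library

class AntiXssEncoding
{
    static void Main()
    {
        // White-listed characters (a-z, A-Z, 0-9 and a few other safe ones)
        // come back unchanged.
        Console.WriteLine(Encoder.HtmlEncode("Hello123"));

        // Anything off the white-list gets encoded, including characters
        // that HttpUtility.HtmlEncode would have left alone (like the
        // apostrophe below).
        Console.WriteLine(Encoder.HtmlEncode("O'Brien & Sons <admin>"));
    }
}
```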
Further Reading
I'm not really doing you, dear reader, any favors by reiterating what dozens of people
have said before me (and probably said it better), so here are a couple of links that
contain loads of information on actually using the AntiXss library and protecting
your website from cross site scripting: