is it okay to "repeatedly" xss-clean data in CodeIgniter?



It's not going to hurt you, but it is definitely pointless.

There's a very good chance that eventually, you will reach a point where the global XSS filter becomes cumbersome. Since it can't be disabled per controller without extensive hacks, and it makes access to the raw $_REQUEST data impossible, you will need to disable it globally. This will happen the moment you want to process a single piece of trusted data, or data that isn't HTML output and must remain intact.
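For reference, in CodeIgniter 2.x/3.x the global filter is a single all-or-nothing switch in the main config file; there is no per-controller setting. A sketch (the commented line shows the explicit per-field alternative once the global filter is off):

```php
<?php
// application/config/config.php (CodeIgniter 2.x/3.x)
// The global filter is all-or-nothing; there is no per-controller switch.
$config['global_xss_filtering'] = FALSE;

// With the global filter off, clean only where you decide to, e.g. in a
// controller:
// $clean = $this->security->xss_clean($this->input->post('field'));
```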

Using it as a form validation rule is pointless and potentially destructive as well. Imagine what this site would be like if every time you typed <script> it was replaced with [removed], with no way to revert it later. For another example, what if a user uses some "XSS" content in his password? Your application will end up silently altering the input.

Just use the XSS filter where you need it: on your HTML output, places where javascript can be executed.
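To illustrate escaping at the output stage, here is a plain-PHP sketch using htmlspecialchars(), which is essentially what CodeIgniter's html_escape() helper wraps. The stored data stays intact; it is only encoded at the moment it is rendered into HTML:

```php
<?php
// Escape untrusted data at the point of output, not on input.
// htmlspecialchars() is what CodeIgniter's html_escape() helper wraps.
$comment = '<script>alert("xss")</script>';

// The raw value is stored unmodified; encode only when rendering HTML:
echo htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');
// &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

The same raw value can later be escaped differently for other contexts (JSON, URLs, SQL placeholders), which is exactly what input-time filtering makes impossible.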


Yes. Assume your input is A. Then, let's say you run xss_clean to get XSS-safe content:

B = xss_clean(A)

Now, let's say I do it again to get C:

C = xss_clean(B)

Now, if B and C differ, it must mean that B still contained some XSS-unsafe content, which would clearly mean that xss_clean is broken, since it did not clean A properly. So as long as you assume that the function returns XSS-safe content, you are good to go.
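The argument above can be sketched in plain PHP. Since xss_clean itself needs the framework bootstrapped, strip_tags() stands in here as a hypothetical idempotent sanitiser; the point is only the B === C comparison:

```php
<?php
// Idempotency argument, with strip_tags() as a stand-in sanitiser
// (xss_clean itself requires the CodeIgniter framework).
function sanitize(string $a): string
{
    return strip_tags($a);
}

$a = 'Hello <script>alert(1)</script> world';
$b = sanitize($a);   // first pass
$c = sanitize($b);   // second pass

// If the sanitiser works, a second pass finds nothing left to remove:
var_dump($b === $c); // bool(true)
```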

One argument that can be made is: what if the function modifies even XSS-safe content? Well, that would suck, and it would still mean the function is broken, but in my experience that is not the case (I have never seen it behave like this).

The only drawback I see is the additional processing overhead, but doing it twice is fine (once with global filtering, and once explicitly, in case global filtering is turned off sometime by someone), and that is a reasonable overhead cost for the security assurance.

Also, if I may add, CodeIgniter's xss_clean doesn't really parse the HTML and drop the tags and such. It simply converts the < and > to &lt; and &gt;. So with that in mind, I don't see anything that could go wrong.


Using xss_clean even once is bad as far as I am concerned. This routine attempts to sanitise your data by removing parts or replacing parts. It is lossy and not guaranteed to return the same content when run multiple times. It is also hard to predict and will not always act appropriately. Given the amount of things it does to try to sanitise a string there is a massive performance hit for using this on input. Even the tiniest bit of input such as a=b will cause a flurry of activity for xss_clean.

I would like to say that you should never use xss_clean, but realistically I can't. This system is made for inexperienced developers who do not know how to safely manage user content. I'm an experienced developer, so I can say that no project I am working on should ever use xss_clean. The fact is, though, the corruption issues will be less problematic for inexperienced developers with simpler usage, and ultimately it probably will make their code more secure, even if they should be making their code secure themselves rather than relying on quick, dirty and cheap hacks. On the other hand, xss_clean isn't guaranteed to make your code completely secure and can ultimately make things worse by giving a false sense of security. You are recommended to really study instead, to make sure you understand exactly what your code does, so you can make it truly secure. xss_clean does not compensate for code errors, it compensates for coder errors.

Ideally, xss_clean wants to be done only on output (and wants to be replaced with htmlentities, etc.), but most people won't bother with this, as it's simpler for them to violate data purity by filtering all input rather than filtering output (something can be input once but output ten times). Then again, an undisciplined developer may not put xss_clean on one out of those ten cases of output.

Realistically, however, the only decent way is to properly encode everything in the view the moment it is to be displayed on a page. The problem with pre-emptive encoding is that you store data that might be incorrectly encoded, and you can double-encode data if it is input, then output into a form, then input again. If you think of something like an edit box, you can have some serious problems with data growth. Not all sanitation removes content. For example, addslashes adds content: if you have a slash in your content, every time you run addslashes a new slash is added, causing the data to grow.

Although there is a good chance your data will end up embedded in HTML, you also can't always know where data will end up. Suddenly you get a new requirement that applies to previous data, and that's it, you're screwed, because you applied a lossy filter to incoming data prior to storage. Lossy, in this case, might mean losing your job after corrupting all the user data in your database. Your data is usually the most valuable thing in a web application, and this is a big problem with pre-emptive encoding. It is easier to work with data when you always know it is pure and can escape it according to the situation at hand; if your data could be in any condition down the line, that becomes very problematic. The filtering can also cause occasional logical breakages: since the sanitisation can remove content, two strings that don't match can be made to match.
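The data-growth problem with addslashes mentioned above is easy to demonstrate; each pass escapes the characters added by the previous pass:

```php
<?php
// addslashes() is not idempotent: each pass escapes the output of the
// previous one, so repeated "sanitising" makes the data grow.
$s = "O'Brien";           // 7 chars
$s = addslashes($s);      // O\'Brien   (8 chars)
$s = addslashes($s);      // O\\\'Brien (10 chars)
echo $s, "\n";
```

Run it through an input-then-output-then-input cycle a few more times and the stored value no longer resembles what the user typed.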

Many of the problems with xss_clean on input are the same as, or similar to, those of magic_quotes: http://en.wikipedia.org/wiki/Magic_quotes

Summary: You should not use it. Instead, block bad data on user input and escape properly on output. If you want to sanitise user data, it should happen in the client (browser, form validation) so that the user can see it; you should never have invisible data alteration. If you must run xss_clean, you should only run it once, on output. If you're going to use it to validate input, then reject the input whenever $posted_data !== xss_clean($posted_data).
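The reject-instead-of-clean rule from the summary can be sketched like this. strip_tags() again stands in for xss_clean(), which needs the framework; in a real CodeIgniter form-validation callback you would compare against $this->security->xss_clean($input) instead:

```php
<?php
// Reject-don't-mangle validation sketch. strip_tags() stands in for
// CodeIgniter's xss_clean(); the pattern is the same either way:
// if cleaning would alter the input, refuse it rather than silently
// storing a corrupted value.
function is_clean(string $input): bool
{
    return $input === strip_tags($input);
}

var_dump(is_clean('just a normal password'));     // bool(true)
var_dump(is_clean('p4ss<script>word</script>'));  // bool(false)
```

This way a password containing "XSS-looking" characters is rejected with an error message the user can see, instead of being altered behind their back.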