PHP Data Validation

The WordPress documentation on data validation is a good reference, especially its section on philosophy. As a best practice, always correct the format first, then check the value against a whitelist.

// format correction
$action = isset($_GET['action']) ? (int) $_GET['action'] : 0; // missing input falls through to the default case
// whitelist
switch ($action) {
    case 1:
        do_this();
        break;
    case 2:
        do_that();
        break;
    case 0:
    default:
        die("Don't know this action!");
}

However, the above example handles only one input ($_GET['action']). What if your web app (or controller) needs several inputs? You surely don't want to write the same code over and over, like this:

// correct the format
$id = (int) $_GET['id'];
$name = (string) trim($_POST['name']);
$email = (string) trim($_POST['email']);
$about = (string) htmlspecialchars(trim($_POST['about']), ENT_QUOTES, 'UTF-8');
// then do validation
if ($id == 0) {
    $id = 1;
}
if (empty($name)) {
    $msg = 'Name is required';
}
...

Most PHP frameworks I know either validate the inputs one by one or use just one input source (GET, POST, or COOKIE). It's better to use $_REQUEST, since some inputs can be passed either in a <form> or in the query string. For example, given the URL /blog/post/?id=10&do=edit and, inside the page, <input type="hidden" name="do" value="save" />, you can read both values from $_REQUEST['do'], which determines what action needs to be performed:

$do = isset($_REQUEST['do']) ? (string) $_REQUEST['do'] : 'view'; // default to viewing when no action is given
switch ($do) {
    case 'edit':
        get_post($id);
        break;
    case 'save':
        save_post($id);
        break;
    case 'view':
    default:
        view_post($id);
        break;
}

This pattern leads to PHP's filter functions, in particular filter_input_array(). It takes a source type (INPUT_GET, INPUT_POST, etc.) and an array of filter definitions, and validates each input. In the returned array, a value is false if its filter failed and null if the input was not set. This function is really convenient when we need to validate a long list of inputs.
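As a rough illustration of how such a definition array behaves, here is a minimal sketch. It uses filter_var_array(), which accepts the same definition format as filter_input_array() but reads from an array you pass in, so it can be run from the command line. The sample data and the field names (id, email, page, name) are made up for the example.

```php
<?php
// Hypothetical sample data standing in for $_GET. In a real request you would
// call filter_input_array(INPUT_GET, $definition) instead.
$data = array('id' => '10', 'email' => 'not-an-email', 'page' => '5000');

$definition = array(
    'id'    => FILTER_VALIDATE_INT,
    'email' => FILTER_VALIDATE_EMAIL,
    'page'  => array(
        'filter'  => FILTER_VALIDATE_INT,
        'options' => array('min_range' => 1, 'max_range' => 1000),
    ),
    'name'  => FILTER_DEFAULT, // this key is not present in $data
);

$result = filter_var_array($data, $definition);

// 'id' is converted to the integer 10.
// 'email' and 'page' fail their filters, so both become false.
// 'name' is missing from the data, so it is added as null.
var_dump($result);
```

Note that a failed input only yields false; nothing in the result says which rule was broken.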

However, there's a drawback. Since all the validators for an input are put under a single key ('input_name' => array(...list of various validators...)), it is difficult to report accurately which validation failed. We can therefore write our own version of filter_input_array() that follows the earlier pattern (format first, then whitelist). The validation function would work like this:

$input = validate_input(array(
    'id' => array('filter' => 'int', 'options' => array(
        'range' => array('max' => 1000, 'msg' => 'Max. value for ID is 1000')
    )),
    'name' => array('filter' => 'string', 'options' => array(
        'required' => array('value' => true, 'msg' => 'This input is required'),
        'alphanumeric' => array('value' => true, 'msg' => 'Name must be alphanumeric')
    ))
));
// now we can use $input
$input['id'];
$input['name'];
// to get error msg
$err_msg['id']; // which may contain the error msg of specific validator
$err_msg['name'];
// to check if the whole form passed the validator or not, simply:
if (empty($err_msg)) {
    // form is valid
}
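One possible way to implement such a function is sketched below. This is only an illustration under several assumptions: the extra $source and &$err_msg parameters are not part of the call shown above (they are added here to avoid globals and make the sketch runnable on its own), the behaviour of the 'required', 'alphanumeric' and 'range' validators is guessed from their names, and only the first failing validator per input is reported.

```php
<?php
/**
 * Hypothetical sketch of validate_input(): cast each input to its declared
 * type first, then run the named validators and record the first failure.
 */
function validate_input(array $rules, array $source, array &$err_msg = array())
{
    $input = array();
    $err_msg = array();

    foreach ($rules as $key => $rule) {
        // Treat missing or non-scalar input (e.g. arrays) as an empty string.
        $raw = (isset($source[$key]) && is_scalar($source[$key])) ? $source[$key] : '';

        // Step 1: format correction.
        $value = ($rule['filter'] === 'int') ? (int) $raw : trim((string) $raw);
        $input[$key] = $value;

        // Step 2: run each validator; unknown validator names are ignored.
        $options = isset($rule['options']) ? $rule['options'] : array();
        foreach ($options as $validator => $opt) {
            $ok = true;
            switch ($validator) {
                case 'required':
                    $ok = !$opt['value'] || (string) $value !== '';
                    break;
                case 'alphanumeric':
                    $ok = !$opt['value'] || (string) $value === ''
                        || ctype_alnum((string) $value);
                    break;
                case 'range':
                    $ok = (!isset($opt['min']) || $value >= $opt['min'])
                        && (!isset($opt['max']) || $value <= $opt['max']);
                    break;
            }
            if (!$ok) {
                $err_msg[$key] = $opt['msg'];
                break; // leave the validator loop: first failure wins
            }
        }
    }

    return $input;
}

// Usage with made-up request data instead of $_REQUEST.
$err_msg = array();
$input = validate_input(array(
    'id' => array('filter' => 'int', 'options' => array(
        'range' => array('max' => 1000, 'msg' => 'Max. value for ID is 1000'),
    )),
    'name' => array('filter' => 'string', 'options' => array(
        'required' => array('value' => true, 'msg' => 'This input is required'),
        'alphanumeric' => array('value' => true, 'msg' => 'Name must be alphanumeric'),
    )),
), array('id' => '5000', 'name' => ' bob! '), $err_msg);
```

Here $input holds the format-corrected values (5000 and 'bob!'), while $err_msg carries one message per failed input, so the form can show exactly which rule was broken.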

Additional note: String input.

For string input, trim() is actually enough, and the value can then be stored in the database as-is (through a parameterized query). You only need to escape the value when outputting it. When you echo the value into HTML, make sure to always encode it first:

<p class="comments"><?php echo htmlspecialchars($value, ENT_QUOTES, 'UTF-8'); ?></p>

This not only prevents XSS issues but also displays Unicode characters correctly. Avoid htmlentities(), as it may corrupt Unicode characters when it is called without the correct character set.
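To make the effect concrete, here is a small sketch of what the encoding does to a value that contains both markup and a non-ASCII character. The sample string is made up for the example.

```php
<?php
// Hypothetical user-supplied value containing markup and a UTF-8 character.
$value = '<script>alert("x")</script> café';

// Encode at output time only; the stored value stays untouched.
$safe = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt; café
```

The angle brackets and quotes are turned into entities, so the browser shows them as text, while the "é" passes through unchanged.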
