mirror of
https://github.com/php/php-src.git
synced 2024-10-19 23:44:13 +00:00
332 lines
17 KiB
Plaintext
332 lines
17 KiB
Plaintext
Input Filter Extension
|
|
~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Introduction
|
|
============
|
|
We all know that you should always check input variables, but PHP does not
|
|
offer really good functionality for doing this in a safe way. The Input Filter
|
|
extension is meant to address this issue by implementing a set of filters and
|
|
mechanisms that users can use to safely access their input data.
|
|
|
|
|
|
Change Log
|
|
==========
|
|
2005-10-27
|
|
* Updated filter_data prototype
|
|
* Added filter constants
|
|
* Fixed minor problems
|
|
* Changes by David Tulloh
|
|
|
|
2005-10-05
|
|
* Changed "input_filter.paranoid_admin_default_filter" to
|
|
"filter.default".
|
|
* Updated API prototypes to reflect implementation.
|
|
* Added 'on' and 'off' to the boolean filter.
|
|
* Removed min_range and max_range flags from the float filter.
|
|
* Added validate_url, validate_email and validate_ip filters.
|
|
* Updated allows flags for all filters.
|
|
|
|
2005-08-15
|
|
* Unmade *source* a bitmask, it doesn't make sense to do.
|
|
* Changed return value of filters which got invalid data from 'false' to
|
|
'null.
|
|
* Failed filters do not throw an E_NOTICE any longer.
|
|
* Added a magic_quotes sanitizing filter.
|
|
|
|
|
|
General Considerations
|
|
======================
|
|
* If the filter's expected input data mask does not match the provided data
|
|
for logical filters the filter function returns "false". If the data was
|
|
not found, "null" is returned.
|
|
* Character filters always return a string.
|
|
* With the input filter extension enabled, and the
|
|
input_filter.paranoid_admin_default_filter is set to something != 'raw',
|
|
then all entries in the affected super globals will be passed through the
|
|
configured filter. The 'callback' filter can not be used here, as that
|
|
requieres a PHP script to be running already.
|
|
* As the input filter acts on input data before the magic quotes function
|
|
mangles data, all access through the filter() function will not have any
|
|
quotes or slashes added - it will be the pure data as send by the browser.
|
|
* All flags mentioned here should be prepended with `FILTER_FLAG_` when used
|
|
with PHP.
|
|
|
|
|
|
API
|
|
===
|
|
mixed *input_get* (int *source*, string *name*, [, int *filter* [, mixed *filter_options*, [ string *characterset* ] ]);
|
|
Returns the filtered variable *$name* from source *$source*. It uses the
|
|
filter as specified in *$filter* with a constant, and additional options
|
|
to the filter through *$filter_options*.
|
|
|
|
mixed *input_get_args* (array *definitions*, int *source*, [, array *data*]);
|
|
Returns an array with all filtered variables defined in 'definition'.
|
|
The keys are used as the name of the argument. The value can be either
|
|
an integer (flags) or an array of options. This array can contain
|
|
the 'filter' type, the 'flags', the 'otptions' or the 'charset'
|
|
|
|
bool *input_has_variable (int *source*, string *name*);
|
|
Returns *true* if the variable with the name *name* exists in *source*, or
|
|
*false* otherwise.
|
|
|
|
array *input_filters_list* ();
|
|
Returns a list with all supported filter names.
|
|
|
|
mixed *filter_data* (mixed *variable*, int *filter* [, mixed *filter_options*, [ string *characterset* ] ]);
|
|
Filters the user supplied variable *$variable* in the same manner as
|
|
*input_get*.
|
|
|
|
*$source*:
|
|
|
|
* INPUT_POST 0
|
|
* INPUT_GET 1
|
|
* INPUT_COOKIE 2
|
|
* INPUT_ENV 4
|
|
* INPUT_SERVER 5 (not implemented yet)
|
|
* INPUT_SESSION 6 (not implemented yet)
|
|
|
|
|
|
General flags
|
|
=============
|
|
|
|
* FILTER_FLAG_SCALAR
|
|
* FILTER_FLAG_ARRAY
|
|
|
|
These two constants define whether to allow arrays in the source values. The
|
|
default value is SCALAR for input_get_args and ARRAY for the other functions
|
|
(< 0.9.5). These constants also insure that the function returns the correct
|
|
type, if you ask for an array, you will get an array even if the source is
|
|
only one value. However, if you ask for a scalar and the source is an array,
|
|
the result will be FALSE (invalid).
|
|
|
|
|
|
Logical Filters
|
|
===============
|
|
|
|
These filters check whether passed data was valid, and do never mangle input
|
|
variables, but ofcourse they can deny the whole input variable getting to the
|
|
application by returning false.
|
|
|
|
The constants should be prepended by `FILTER_VALIDATE_` when used with php.
|
|
|
|
================ ========== =========== ==================================================
|
|
Name Constant Return Type Description
|
|
================ ========== =========== ==================================================
|
|
int INT integer Returns the input variable as an integer
|
|
|
|
$filter_options - an array with the optional
|
|
elements:
|
|
|
|
* min_range: Minimal number that is allowed
|
|
(inclusive)
|
|
* max_range: Maximum number that is allowed
|
|
(inclusive)
|
|
* flags: A bitmask supporting the following flags:
|
|
|
|
- ALLOW_OCTAL: allow octal numbers with the format
|
|
0nn as input too.
|
|
- ALLOW_HEX: allow hexadecimal numbers with the
|
|
format 0xnn or 0Xnn too.
|
|
|
|
boolean BOOLEAN boolean Returns *true* for '1', 'on' and 'true' and *false*
|
|
for '0', 'off' and 'false'
|
|
|
|
float FLOAT float Returns the input variable as a floating point value
|
|
|
|
validate_regexp REGEXP string Matches the input value as a string against the
|
|
regular expression. If there is a match then the
|
|
string is returned, otherwise the filter returns
|
|
*null*.
|
|
Remarks: Only available if pcre has been compiled
|
|
into PHP.
|
|
|
|
validate_url URL string Validates an URL's format.
|
|
|
|
$filter_options - an bitmask that supports the
|
|
following flags:
|
|
|
|
* SCHEME_REQUIRED: The 'schema' part of the URL
|
|
needs to in the passed URL.
|
|
* HOST_REQUIRED: The 'host' part of the URL
|
|
needs to in the passed URL.
|
|
* PATH_REQUIRED: The 'path' part of the URL
|
|
needs to in the passed URL.
|
|
* QUERY_REQUIRED: The 'query' part of the URL
|
|
needs to in the passed URL.
|
|
|
|
validate_email EMAIL string Validates the passed string against a reasonably
|
|
good regular expression for validating an email
|
|
address.
|
|
|
|
validate_ip IP string Validates a string representing an IP address.
|
|
|
|
$filter_options - an bitmask that supports the
|
|
following flags:
|
|
|
|
* IPV4: Allows IPv4 addresses.
|
|
* IPV6: Allows IPv6 addresses.
|
|
* NO_RES_RANGE: Disallows addresses in reversed
|
|
ranges (IPv4 only)
|
|
* NO_PRIV_RANGE: Disallows addresses in private
|
|
ranges (IPv4 only)
|
|
================ ========== =========== ==================================================
|
|
|
|
|
|
Sanitizing Filters
|
|
==================
|
|
|
|
These filters remove data, or change data depending on the filter, and the
|
|
set rules for this specific filter. Instead of taking an *options* array, they
|
|
use this parameter for flags for the specific filter.
|
|
|
|
The constants should be prepended by `FILTER_SANITIZE_` when used with php.
|
|
|
|
============= ================ =========== =====================================================
|
|
Name Constant Return Type Description
|
|
============= ================ =========== =====================================================
|
|
string STRING string Returns the input variable as a string after it has
|
|
been stripped of XML/HTML tags and other evil things
|
|
that can cause XSS problems.
|
|
|
|
$filter_options - an bitmask that supports the
|
|
following flags:
|
|
|
|
* NO_ENCODE_QUOTES: Prevents single and double
|
|
quotes from being encoded as numerical HTML
|
|
entities.
|
|
* STRIP_LOW: excludes all characters < 0x20 from the
|
|
allowed character list
|
|
* STRIP_HIGH: excludes all characters >= 0x80 from
|
|
the allowed character list
|
|
* ENCODE_LOW: allows characters < 0x20 but encodes
|
|
them as numerical HTML entities
|
|
* ENCODE_HIGH: allows characters >= 0x80 but encodes
|
|
them as numerical HTML entities
|
|
* ENCODE_AMP: encodes & as &
|
|
|
|
The flags STRIP_LOW and ENCODE_LOW are mutual
|
|
exclusive, and so are STRIP_HIGH and ENCODE_HIGH. In
|
|
the case they clash, the characters will be
|
|
stripped.
|
|
|
|
stripped STRIPPED string Alias for 'string'.
|
|
|
|
encoded ENCODED string Encodes all characters outside the range
|
|
"a-zA-Z0-9-._" as URL encoded values.
|
|
|
|
$filter_options - an bitmask that supports the
|
|
following flags:
|
|
|
|
* STRIP_LOW: excludes all characters < 0x20 from the
|
|
allowed character list
|
|
* STRIP_HIGH: excludes all characters >= 0x80 from
|
|
the allowed character list
|
|
* ENCODE_LOW: allows characters < 0x20 but encodes
|
|
them as numerical HTML entities
|
|
* ENCODE_HIGH: allows characters >= 0x80 but encodes
|
|
them as numerical HTML entities
|
|
|
|
special_chars SPECIAL_CHARS string Encodes the 'special' characters ' " < > &, \0 and
|
|
everything below 0x20 as numerical HTML entities.
|
|
|
|
$filter_options - an bitmask that supports the
|
|
following flags:
|
|
|
|
* STRIP_LOW: excludes all characters < 0x20 from the
|
|
allowed character list. If this is not set, then
|
|
those characters are encoded as numerical HTML
|
|
entities
|
|
* STRIP_HIGH: excludes all characters >= 0x80 from
|
|
the allowed character list
|
|
* ENCODE_HIGH: allows characters >= 0x80 but encodes
|
|
them as numerical HTML entities
|
|
|
|
unsafe_raw UNSAFE_RAW string Returns the input variable as a string without
|
|
XML/HTML being stripped from the input value.
|
|
|
|
$filter_options - an bitmask that supports the
|
|
following flags:
|
|
|
|
* STRIP_LOW: excludes all characters < 0x20 from the
|
|
allowed character list
|
|
* STRIP_HIGH: excludes all characters >= 0x80 from
|
|
the allowed character list
|
|
* ENCODE_LOW: allows characters < 0x20 but encodes
|
|
them as numerical HTML entities
|
|
* ENCODE_HIGH: allows characters >= 0x80 but encodes
|
|
them as numerical HTML entities
|
|
* ENCODE_AMP: encodes & as &
|
|
|
|
The flags STRIP_LOW and ENCODE_LOW are mutual
|
|
exclusive, and so are STRIP_HIGH and ENCODE_HIGH. In
|
|
the case they clash, the characters will be
|
|
stripped.
|
|
|
|
email EMAIL string Removes all characters that can not be part of a
|
|
correctly formed e-mail address (exception are
|
|
comments in the email address) (a-z A-Z 0-9 " ! # $
|
|
% & ' * + - / = ? ^ _ ` { | } ~ @ . [ ]). This
|
|
filter does `not` validate if the e-mail address has
|
|
the correct format, use the validate_email filter
|
|
for that.
|
|
|
|
url URL string Removes all characters that can not be part of a
|
|
correctly formed URI. (a-z A-Z 0-9 $ - _ . + ! * ' (
|
|
) , { } | \ ^ ~ [ ] ` < > # % " ; / ? : @ & =) This
|
|
filter does `not` validate if a URI has the correct
|
|
format, use the validate_url filter for that.
|
|
|
|
number_int NUMBER_INT int Removes all characters that are [^0-9+-].
|
|
|
|
number_float NUMBER_FLOAT float Removes all characters that are [^0-9+-].
|
|
|
|
$filter_options - an bitmask that supports the
|
|
following flags:
|
|
|
|
* ALLOW_FRACTION: adds "." to the characters that
|
|
are not stripped.
|
|
* ALLOW_THOUSAND: adds "," to the characters that
|
|
are not stripped.
|
|
* ALLOW_SCIENTIFIC: adds "eE" to the characters that
|
|
are not stripped.
|
|
|
|
magic_quotes MAGIC_QUOTES string BC filter for people who like magic quotes.
|
|
============= ================ =========== =====================================================
|
|
|
|
|
|
Callback Filter
|
|
===============
|
|
|
|
This filter will callback to the specified callback function as specified with
|
|
the *filter_options* parameter. All variants of callback functions are
|
|
supported:
|
|
|
|
* function with *'functionname'*
|
|
* static method with *array('classname', 'methodname')*
|
|
* dynamic method with *array(&$this, 'methodname')*
|
|
|
|
The constants should be prepended by `FILTER_` when used with php.
|
|
|
|
============= =========== =========== =====================================================
|
|
Name Constant Return Type Description
|
|
============= =========== =========== =====================================================
|
|
callback CALLBACK mixed Calls the callback function/method with the input
|
|
variable's value by reference which can do filtering
|
|
and modifying of the input value. If the callback
|
|
function returns "false" then the input value is
|
|
supposed to be incorrect and the returned value will
|
|
be 'false' (and an E_NOTICE will be raised).
|
|
============= =========== =========== =====================================================
|
|
|
|
The callback function's prototype is:
|
|
|
|
boolean callback(&$value, $characterset);
|
|
With *$value* being a reference to the input variable and *$characterset*
|
|
containing the same value as this parameter's value in the call to
|
|
*input_get()* or *input_get_array()*. If the *$characterset* parameter was
|
|
not passed, it defaults to *'null'*.
|
|
|
|
Version: $Id$
|
|
.. vim: et syn=rst tw=78
|
|
|