Input Filter Extension
~~~~~~~~~~~~~~~~~~~~~~

Introduction
============

We all know that input variables should always be checked, but PHP offers
little built-in functionality for doing this safely. The Input Filter
extension is meant to address this by implementing a set of filters and
mechanisms that users can use to safely access their input data.

Change Log
==========

2005-10-27

* Updated filter_data prototype
* Added filter constants
* Fixed minor problems
* Changes by David Tulloh

2005-10-05

* Changed "input_filter.paranoid_admin_default_filter" to
  "filter.default".
* Updated API prototypes to reflect implementation.
* Added 'on' and 'off' to the boolean filter.
* Removed min_range and max_range flags from the float filter.
* Added validate_url, validate_email and validate_ip filters.
* Updated allowed flags for all filters.

2005-08-15

* *source* is no longer a bitmask; treating it as one did not make sense.
* Changed return value of filters which got invalid data from 'false' to
  'null'.
* Failed filters do not throw an E_NOTICE any longer.
* Added a magic_quotes sanitizing filter.

General Considerations
======================

* For logical filters, if the provided data does not match the filter's
  expected input data mask, the filter function returns "false". If the data
  was not found, "null" is returned.
* Character filters always return a string.
* With the input filter extension enabled and the filter.default setting
  (formerly input_filter.paranoid_admin_default_filter) set to something
  != 'raw', all entries in the affected super globals are passed through the
  configured filter. The 'callback' filter cannot be used here, as that
  requires a PHP script to be running already.
* As the input filter acts on input data before the magic quotes function
  mangles data, all access through the filter() function will not have any
  quotes or slashes added; it will be the pure data as sent by the browser.
* All flags mentioned here should be prepended with `FILTER_FLAG_` when used
  with PHP.

API
===

mixed *input_get* (int *source*, string *name* [, int *filter* [, mixed *filter_options* [, string *characterset* ]]]);
    Returns the filtered variable *$name* from source *$source*. It uses the
    filter specified by the constant in *$filter*, and passes additional
    options to that filter through *$filter_options*.
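
The function names in this draft are not the ones found in released PHP
versions, where the call that corresponds to *input_get* is filter_input().
The sketch below assumes that mapping; the helper name page_number() and the
variable name 'page' are made up for illustration.

```php
<?php
// Hypothetical helper: read ?page=N from the query string, defaulting to 1.
function page_number(): int
{
    $page = filter_input(
        INPUT_GET,
        'page',
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 1]]
    );

    // null: the variable was not sent; false: it was sent but is not valid.
    if ($page === null || $page === false) {
        return 1;
    }
    return $page;
}

echo page_number(), "\n";
```

Run from the command line there is no query string, so the helper falls back
to 1.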

mixed *input_get_args* (array *definitions*, int *source* [, array *data*]);
    Returns an array with all filtered variables defined in *definitions*.
    The keys are used as the names of the arguments. Each value can be either
    an integer (flags) or an array of options. This array can contain
    the 'filter' type, the 'flags', the 'options' or the 'charset'.
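
As a rough illustration of a definitions array, the sketch below uses
filter_var_array(), which is the closest function in released PHP versions
(note that it takes the data first and has no *source* argument). The sample
data and key names are assumptions.

```php
<?php
// Stand-in for request data; in a real script this comes from the client.
$data = ['id' => '42', 'email' => 'not-an-address', 'extra' => 'ignored'];

// Keys are argument names; values are a filter constant or an option array.
$definitions = [
    'id'    => ['filter' => FILTER_VALIDATE_INT, 'options' => ['min_range' => 1]],
    'email' => FILTER_VALIDATE_EMAIL,
    'age'   => FILTER_VALIDATE_INT,
];

$clean = filter_var_array($data, $definitions);

// 'id' is valid, 'email' is invalid (false), 'age' was not supplied (null),
// and 'extra' is dropped because it has no definition.
var_dump($clean);
```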

bool *input_has_variable* (int *source*, string *name*);
    Returns *true* if the variable with the name *name* exists in *source*, or
    *false* otherwise.
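
In released PHP versions the equivalent check is filter_has_var(); treating
it as the counterpart of *input_has_variable* is an assumption. A small
sketch, with 'page' as a made-up variable name:

```php
<?php
// On the command line there is no query string, so this reports false.
$hasPage = filter_has_var(INPUT_GET, 'page');
var_dump($hasPage);

// The check looks at the original request data, not at later writes to
// the superglobal, so this still reports false.
$_GET['page'] = '3';
$stillMissing = filter_has_var(INPUT_GET, 'page');
var_dump($stillMissing);
```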

array *input_filters_list* ();
    Returns a list with all supported filter names.
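
A quick way to inspect the available filters, sketched with filter_list() and
filter_id() from released PHP versions (assumed here to correspond to
*input_filters_list*):

```php
<?php
// Print every supported filter name together with its numeric ID.
foreach (filter_list() as $name) {
    printf("%-20s %d\n", $name, filter_id($name));
}
```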

mixed *filter_data* (mixed *variable*, int *filter* [, mixed *filter_options* [, string *characterset* ]]);
    Filters the user supplied variable *$variable* in the same manner as
    *input_get*.
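
For data that does not come from a superglobal, a minimal sketch using
filter_var(), the name under which this functionality appears in released PHP
versions (an assumption as far as this draft is concerned). The sample values
are invented.

```php
<?php
// Values from any origin, for example fields read from a CSV file.
$ok  = filter_var('42', FILTER_VALIDATE_INT);
$bad = filter_var('4x2', FILTER_VALIDATE_INT);

var_dump($ok);   // the integer 42
var_dump($bad);  // false, because the string is not a valid integer
```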

*$source*:

* INPUT_POST     0
* INPUT_GET      1
* INPUT_COOKIE   2
* INPUT_ENV      4
* INPUT_SERVER   5 (not implemented yet)
* INPUT_SESSION  6 (not implemented yet)

General flags
=============

* FILTER_FLAG_SCALAR
* FILTER_FLAG_ARRAY

These two constants define whether to allow arrays in the source values. The
default value is SCALAR for input_get_args and ARRAY for the other functions
(< 0.9.5). These constants also ensure that the function returns the correct
type: if you ask for an array, you will get an array even if the source is
only one value. However, if you ask for a scalar and the source is an array,
the result will be FALSE (invalid).
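
Released PHP versions expose this behaviour under different constant names
(FILTER_REQUIRE_SCALAR, FILTER_REQUIRE_ARRAY and FILTER_FORCE_ARRAY). Mapping
the two draft constants onto them is an assumption; with that caveat, the
rules above can be sketched as:

```php
<?php
$single = '5';
$many   = ['1', '2', 'x'];

// Asking for an array wraps a lone value in an array.
$forced = filter_var($single, FILTER_VALIDATE_INT, FILTER_FORCE_ARRAY);

// Array input is filtered element by element when an array is requested.
$each = filter_var($many, FILTER_VALIDATE_INT, FILTER_REQUIRE_ARRAY);

// Asking for a scalar while the source is an array is invalid.
$scalar = filter_var($many, FILTER_VALIDATE_INT, FILTER_REQUIRE_SCALAR);

var_dump($forced, $each, $scalar);
```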

Logical Filters
===============

These filters check whether the passed data is valid. They never modify input
variables, but they can of course keep the whole input variable from reaching
the application by returning false.

The constants should be prepended by `FILTER_VALIDATE_` when used with PHP.

================ ========== =========== ==================================================
Name             Constant   Return Type Description
================ ========== =========== ==================================================
int              INT        integer     Returns the input variable as an integer.

                                        $filter_options - an array with the optional
                                        elements:

                                        * min_range: Minimum number that is allowed
                                          (inclusive)
                                        * max_range: Maximum number that is allowed
                                          (inclusive)
                                        * flags: A bitmask supporting the following
                                          flags:

                                          - ALLOW_OCTAL: allow octal numbers in the
                                            format 0nn as input too.
                                          - ALLOW_HEX: allow hexadecimal numbers in
                                            the format 0xnn or 0Xnn too.
boolean          BOOLEAN    boolean     Returns *true* for '1', 'on' and 'true', and
                                        *false* for '0', 'off' and 'false'.
float            FLOAT      float       Returns the input variable as a floating point
                                        value.
validate_regexp  REGEXP     string      Matches the input value as a string against the
                                        regular expression. If there is a match then the
                                        string is returned, otherwise the filter returns
                                        *null*.

                                        Remarks: Only available if pcre has been compiled
                                        into PHP.
validate_url     URL        string      Validates a URL's format.

                                        $filter_options - a bitmask that supports the
                                        following flags:

                                        * SCHEME_REQUIRED: The 'scheme' part of the URL
                                          needs to be in the passed URL.
                                        * HOST_REQUIRED: The 'host' part of the URL
                                          needs to be in the passed URL.
                                        * PATH_REQUIRED: The 'path' part of the URL
                                          needs to be in the passed URL.
                                        * QUERY_REQUIRED: The 'query' part of the URL
                                          needs to be in the passed URL.
validate_email   EMAIL      string      Validates the passed string against a reasonably
                                        good regular expression for validating an email
                                        address.
validate_ip      IP         string      Validates a string representing an IP address.

                                        $filter_options - a bitmask that supports the
                                        following flags:

                                        * IPV4: Allows IPv4 addresses.
                                        * IPV6: Allows IPv6 addresses.
                                        * NO_RES_RANGE: Disallows addresses in reserved
                                          ranges (IPv4 only)
                                        * NO_PRIV_RANGE: Disallows addresses in private
                                          ranges (IPv4 only)
================ ========== =========== ==================================================
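
A few of the logical filters in the table can be tried out as follows. This
is a sketch against filter_var() and the FILTER_VALIDATE_* and FILTER_FLAG_*
constants of released PHP versions, which may differ in detail from this
draft; the sample values are invented.

```php
<?php
// Integer within an inclusive range.
$range    = ['options' => ['min_range' => 1, 'max_range' => 10]];
$inRange  = filter_var('7', FILTER_VALIDATE_INT, $range);
$outRange = filter_var('11', FILTER_VALIDATE_INT, $range);

// Hexadecimal and octal notation are rejected unless explicitly allowed.
$hex   = filter_var('0x1A', FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_HEX);
$octal = filter_var('017', FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_OCTAL);
$plain = filter_var('017', FILTER_VALIDATE_INT);

// Boolean words.
$on  = filter_var('on', FILTER_VALIDATE_BOOLEAN);
$off = filter_var('off', FILTER_VALIDATE_BOOLEAN);

// IP address restricted to IPv4 outside the private ranges.
$ipFlags = FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE;
$public  = filter_var('8.8.8.8', FILTER_VALIDATE_IP, $ipFlags);
$private = filter_var('192.168.0.1', FILTER_VALIDATE_IP, $ipFlags);

var_dump($inRange, $outRange, $hex, $octal, $plain, $on, $off, $public, $private);
```

Valid input comes back converted to the filter's return type, and anything
that fails validation comes back as false.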

Sanitizing Filters
==================

These filters remove or change data, depending on the filter and the rules
set for it. Instead of taking an *options* array, they use this parameter for
filter-specific flags.

The constants should be prepended by `FILTER_SANITIZE_` when used with PHP.

============= ================ =========== =====================================================
Name          Constant         Return Type Description
============= ================ =========== =====================================================
string        STRING           string      Returns the input variable as a string after it has
                                           been stripped of XML/HTML tags and other evil things
                                           that can cause XSS problems.

                                           $filter_options - a bitmask that supports the
                                           following flags:

                                           * NO_ENCODE_QUOTES: Prevents single and double
                                             quotes from being encoded as numerical HTML
                                             entities.
                                           * STRIP_LOW: excludes all characters < 0x20 from the
                                             allowed character list
                                           * STRIP_HIGH: excludes all characters >= 0x80 from
                                             the allowed character list
                                           * ENCODE_LOW: allows characters < 0x20 but encodes
                                             them as numerical HTML entities
                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
                                             them as numerical HTML entities
                                           * ENCODE_AMP: encodes & as &amp;

                                           The flags STRIP_LOW and ENCODE_LOW are mutually
                                           exclusive, and so are STRIP_HIGH and ENCODE_HIGH. In
                                           the case they clash, the characters will be
                                           stripped.
stripped      STRIPPED         string      Alias for 'string'.
encoded       ENCODED          string      Encodes all characters outside the range
                                           "a-zA-Z0-9-._" as URL encoded values.

                                           $filter_options - a bitmask that supports the
                                           following flags:

                                           * STRIP_LOW: excludes all characters < 0x20 from the
                                             allowed character list
                                           * STRIP_HIGH: excludes all characters >= 0x80 from
                                             the allowed character list
                                           * ENCODE_LOW: allows characters < 0x20 but encodes
                                             them as numerical HTML entities
                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
                                             them as numerical HTML entities
special_chars SPECIAL_CHARS    string      Encodes the 'special' characters ' " < > &, \0 and
                                           everything below 0x20 as numerical HTML entities.

                                           $filter_options - a bitmask that supports the
                                           following flags:

                                           * STRIP_LOW: excludes all characters < 0x20 from the
                                             allowed character list. If this is not set, then
                                             those characters are encoded as numerical HTML
                                             entities
                                           * STRIP_HIGH: excludes all characters >= 0x80 from
                                             the allowed character list
                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
                                             them as numerical HTML entities
unsafe_raw    UNSAFE_RAW       string      Returns the input variable as a string without
                                           XML/HTML being stripped from the input value.

                                           $filter_options - a bitmask that supports the
                                           following flags:

                                           * STRIP_LOW: excludes all characters < 0x20 from the
                                             allowed character list
                                           * STRIP_HIGH: excludes all characters >= 0x80 from
                                             the allowed character list
                                           * ENCODE_LOW: allows characters < 0x20 but encodes
                                             them as numerical HTML entities
                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
                                             them as numerical HTML entities
                                           * ENCODE_AMP: encodes & as &amp;

                                           The flags STRIP_LOW and ENCODE_LOW are mutually
                                           exclusive, and so are STRIP_HIGH and ENCODE_HIGH. In
                                           the case they clash, the characters will be
                                           stripped.
email         EMAIL            string      Removes all characters that cannot be part of a
                                           correctly formed e-mail address (the exception is
                                           comments in the e-mail address) (a-z A-Z 0-9 " ! # $
                                           % & ' * + - / = ? ^ _ ` { | } ~ @ . [ ]). This
                                           filter does *not* validate whether the e-mail
                                           address has the correct format; use the
                                           validate_email filter for that.
url           URL              string      Removes all characters that cannot be part of a
                                           correctly formed URI. (a-z A-Z 0-9 $ - _ . + ! * ' (
                                           ) , { } | \ ^ ~ [ ] ` < > # % " ; / ? : @ & =) This
                                           filter does *not* validate whether a URI has the
                                           correct format; use the validate_url filter for
                                           that.
number_int    NUMBER_INT       int         Removes all characters that are [^0-9+-].
number_float  NUMBER_FLOAT     float       Removes all characters that are [^0-9+-].

                                           $filter_options - a bitmask that supports the
                                           following flags:

                                           * ALLOW_FRACTION: adds "." to the characters that
                                             are not stripped.
                                           * ALLOW_THOUSAND: adds "," to the characters that
                                             are not stripped.
                                           * ALLOW_SCIENTIFIC: adds "eE" to the characters that
                                             are not stripped.
magic_quotes  MAGIC_QUOTES     string      BC filter for people who like magic quotes.
============= ================ =========== =====================================================
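
To show how the sanitizing filters differ from the logical ones, here is a
sketch using filter_var() with the FILTER_SANITIZE_* constants of released
PHP versions (an assumption relative to this draft). The input strings are
made up. Note that released versions return strings from the number filters.

```php
<?php
// Keep only digits, plus and minus.
$int = filter_var('abc-12+3def', FILTER_SANITIZE_NUMBER_INT);

// The same, additionally keeping "." and optionally ",".
$float = filter_var(
    '1,234.56 USD',
    FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION
);
$thousands = filter_var(
    '1,234.56 USD',
    FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND
);

// Encode the 'special' characters as numerical HTML entities.
$html = filter_var('<a href="x">', FILTER_SANITIZE_SPECIAL_CHARS);

// Drop characters that cannot appear in an e-mail address.
$mail = filter_var('john (doe)@example.com', FILTER_SANITIZE_EMAIL);

var_dump($int, $float, $thousands, $html, $mail);
```

Nothing is rejected here: every call returns a cleaned-up string, which is
why a sanitized value may still need a logical filter afterwards.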

Callback Filter
===============

This filter calls the callback function specified in the *filter_options*
parameter. All variants of callback functions are supported:

* function with *'functionname'*
* static method with *array('classname', 'methodname')*
* dynamic method with *array(&$this, 'methodname')*

The constants should be prepended by `FILTER_` when used with PHP.

============= =========== =========== =====================================================
Name          Constant    Return Type Description
============= =========== =========== =====================================================
callback      CALLBACK    mixed       Calls the callback function/method with the input
                                      variable's value by reference; the callback can
                                      filter and modify that value. If the callback
                                      returns "false", the input value is considered
                                      incorrect and the returned value will be 'false'
                                      (and an E_NOTICE will be raised).
============= =========== =========== =====================================================

The callback function's prototype is:

boolean callback(&$value, $characterset);

*$value* is a reference to the input variable and *$characterset* contains
the same value as this parameter's value in the call to *input_get()* or
*input_get_args()*. If the *$characterset* parameter was not passed, it
defaults to *'null'*.
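
A minimal callback sketch follows. It is written against FILTER_CALLBACK as
found in released PHP versions, where the callback receives the value as a
plain argument and its return value becomes the filter result; that differs
from the by-reference prototype above and is an assumption about the reader's
PHP version. The function name normalise_name() is hypothetical.

```php
<?php
// Hypothetical callback: trim and lower-case a name, reject empty input.
function normalise_name(string $value)
{
    $value = strtolower(trim($value));
    return $value === '' ? false : $value;
}

$name  = filter_var('  Alice ', FILTER_CALLBACK, ['options' => 'normalise_name']);
$empty = filter_var('   ', FILTER_CALLBACK, ['options' => 'normalise_name']);

var_dump($name);   // the cleaned-up string
var_dump($empty);  // false, as returned by the callback
```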

Version: $Id$

.. vim: et syn=rst tw=78