Input Filter Extension
~~~~~~~~~~~~~~~~~~~~~~

Introduction
============
We all know that you should always check input variables, but PHP does not
offer really good functionality for doing this in a safe way. The Input Filter
extension is meant to address this issue by implementing a set of filters and
mechanisms that users can use to safely access their input data.


Change Log
==========
2005-10-27
    * Updated filter_data prototype
    * Added filter constants
    * Fixed minor problems
    * Changes by David Tulloh

2005-10-05
    * Changed "input_filter.paranoid_admin_default_filter" to
      "filter.default".
    * Updated API prototypes to reflect implementation.
    * Added 'on' and 'off' to the boolean filter.
    * Removed min_range and max_range flags from the float filter.
    * Added validate_url, validate_email and validate_ip filters.
    * Updated allowed flags for all filters.

2005-08-15
    * *source* is no longer a bitmask; that did not make sense.
    * Changed return value of filters which got invalid data from 'false' to
      'null'.
    * Failed filters do not throw an E_NOTICE any longer.
    * Added a magic_quotes sanitizing filter.


General Considerations
======================
* If the provided data does not match the input data mask that a logical
  filter expects, the filter function returns "false". If the data was not
  found, "null" is returned.
* Character filters always return a string.
* With the input filter extension enabled and the filter.default setting
  (formerly input_filter.paranoid_admin_default_filter) set to something
  != 'raw', all entries in the affected super globals are passed through the
  configured filter. The 'callback' filter can not be used here, as that
  requires a PHP script to be running already.
* As the input filter acts on input data before the magic quotes function
  mangles data, data accessed through the filter functions will not have any
  quotes or slashes added - it will be the pure data as sent by the browser.
* All flags mentioned here should be prepended with ``FILTER_FLAG_`` when used
  with PHP.
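
The "false" versus "null" distinction above can be sketched as follows. This
is a minimal sketch, not the draft API: it assumes the function names that
later shipped with PHP (``filter_var()`` and ``filter_input()``) rather than
the prototypes listed in this document, and ``no_such_field`` is a
hypothetical field name that is assumed to be absent from the request.

```php
<?php
// Assumption: released function names, not the draft input_get()/filter_data().

// A logical filter that receives non-matching data returns false.
$invalid = filter_var('abc', FILTER_VALIDATE_INT);

// A variable that is not present in the source yields null.
// 'no_such_field' is a hypothetical name assumed to be absent.
$missing = filter_input(INPUT_GET, 'no_such_field', FILTER_VALIDATE_INT);

// Matching data comes back converted to the filter's return type.
$valid = filter_var('42', FILTER_VALIDATE_INT);

var_dump($invalid, $missing, $valid);
```

Comparing with ``===`` keeps the three outcomes apart, since "false", "null"
and a valid ``0`` are all falsy in a loose comparison.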


API
===
mixed *input_get* (int *source*, string *name* [, int *filter* [, mixed *filter_options* [, string *characterset* ]]]);
    Returns the filtered variable *$name* from source *$source*. It uses the
    filter specified by the constant in *$filter*, and passes additional
    options to the filter through *$filter_options*.

mixed *input_get_args* (array *definitions*, int *source* [, array *data*]);
    Returns an array with all filtered variables defined in *definitions*.
    The keys are used as the names of the arguments. Each value can be either
    an integer (flags) or an array of options. This array can contain
    the 'filter' type, the 'flags', the 'options' or the 'charset'.

bool *input_has_variable* (int *source*, string *name*);
    Returns *true* if the variable with the name *name* exists in *source*, or
    *false* otherwise.

array *input_filters_list* ();
    Returns a list with all supported filter names.

mixed *filter_data* (mixed *variable*, int *filter* [, mixed *filter_options* [, string *characterset* ]]);
    Filters the user supplied variable *$variable* in the same manner as
    *input_get*.

*$source*:

* INPUT_POST     0
* INPUT_GET      1
* INPUT_COOKIE   2
* INPUT_ENV      4
* INPUT_SERVER   5 (not implemented yet)
* INPUT_SESSION  6 (not implemented yet)
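
As a rough illustration of the *definitions* array described for
*input_get_args*, the sketch below uses the functions that later shipped with
PHP, which is an assumption relative to this draft: ``filter_var_array()``
stands in for *input_get_args* with explicit *data*, ``filter_has_var()`` for
*input_has_variable*, and ``filter_list()`` for *input_filters_list*. The
field names and values in ``$data`` are hypothetical.

```php
<?php
// Hypothetical form data; in a web request this would come from one of the
// INPUT_* sources instead of a literal array.
$data = [
    'age'   => '31',
    'email' => 'user@example.com',
    'tags'  => ['php', '<b>filter</b>'],
];

// Keys name the arguments. Each value is either a filter constant or an
// array with 'filter', 'flags' and 'options' entries.
$definitions = [
    'age'   => [
        'filter'  => FILTER_VALIDATE_INT,
        'options' => ['min_range' => 0, 'max_range' => 150],
    ],
    'email' => FILTER_VALIDATE_EMAIL,
    'tags'  => [
        'filter' => FILTER_SANITIZE_SPECIAL_CHARS,
        'flags'  => FILTER_REQUIRE_ARRAY,
    ],
];

// One call filters every defined argument.
$result = filter_var_array($data, $definitions);

// Existence check against a source, and the list of supported filter names.
$hasAge = filter_has_var(INPUT_POST, 'age');
$names  = filter_list();

var_dump($result, $hasAge);
```

Keeping all definitions in one array puts the expected shape of the input in
a single place, which is the main reason to prefer it over repeated single
lookups.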


General flags
=============

* FILTER_FLAG_SCALAR
* FILTER_FLAG_ARRAY

These two constants define whether to allow arrays in the source values. The
default value is SCALAR for input_get_args and ARRAY for the other functions
(< 0.9.5). These constants also ensure that the function returns the correct
type: if you ask for an array, you will get an array even if the source is
only one value. However, if you ask for a scalar and the source is an array,
the result will be FALSE (invalid).
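
The scalar/array behaviour described above might look like this in practice.
The sketch assumes the flag names that later shipped with PHP
(``FILTER_REQUIRE_SCALAR``, ``FILTER_REQUIRE_ARRAY`` and
``FILTER_FORCE_ARRAY``) and ``filter_var()``; the draft names
FILTER_FLAG_SCALAR and FILTER_FLAG_ARRAY are not used here.

```php
<?php
// Assumption: released flag names, not the draft FILTER_FLAG_SCALAR/ARRAY.

// Asking for a scalar while the source is an array gives false (invalid).
$scalarFromArray = filter_var(['1', '2'], FILTER_VALIDATE_INT, FILTER_REQUIRE_SCALAR);

// Asking for an array filters each element.
$arrayFromArray = filter_var(['1', '2'], FILTER_VALIDATE_INT, FILTER_REQUIRE_ARRAY);

// Forcing an array wraps a single source value in an array.
$arrayFromScalar = filter_var('5', FILTER_VALIDATE_INT, FILTER_FORCE_ARRAY);

var_dump($scalarFromArray, $arrayFromArray, $arrayFromScalar);
```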
101
102
103Logical Filters
104===============
105
106These filters check whether passed data was valid, and do never mangle input
107variables, but ofcourse they can deny the whole input variable getting to the
108application by returning false.
109
110The constants should be prepended by `FILTER_VALIDATE_` when used with php.
111
112================ ========== =========== ==================================================
113Name             Constant   Return Type Description
114================ ========== =========== ==================================================
115int              INT        integer     Returns the input variable as an integer
116
117                                        $filter_options - an array with the optional
118                                        elements:
119
120                                        * min_range: Minimal number that is allowed
121                                          (inclusive)
122                                        * max_range: Maximum number that is allowed
123                                          (inclusive)
124                                        * flags: A bitmask supporting the following flags:
125
126                                          - ALLOW_OCTAL: allow octal numbers with the format
127                                            0nn as input too.
128                                          - ALLOW_HEX: allow hexadecimal numbers with the
129                                            format 0xnn or 0Xnn too.
130
131boolean          BOOLEAN    boolean     Returns *true* for '1', 'on' and 'true' and *false*
132                                        for '0', 'off' and 'false'
133
134float            FLOAT      float       Returns the input variable as a floating point value
135
136validate_regexp  REGEXP     string      Matches the input value as a string against the
137                                        regular expression. If there is a match then the
138                                        string is returned, otherwise the filter returns
139                                        *null*.
140                                        Remarks: Only available if pcre has been compiled
141                                        into PHP.
142
143validate_url     URL        string      Validates an URL's format.
144
145                                        $filter_options - an bitmask that supports the
146                                        following flags:
147
148                                        * SCHEME_REQUIRED: The 'schema' part of the URL
149                                          needs to in the passed URL.
150                                        * HOST_REQUIRED: The 'host' part of the URL
151                                          needs to in the passed URL.
152                                        * PATH_REQUIRED: The 'path' part of the URL
153                                          needs to in the passed URL.
154                                        * QUERY_REQUIRED: The 'query' part of the URL
155                                          needs to in the passed URL.
156
157validate_email   EMAIL      string      Validates the passed string against a reasonably
158                                        good regular expression for validating an email
159                                        address.
160
161validate_ip      IP         string      Validates a string representing an IP address.
162
163                                        $filter_options - an bitmask that supports the
164                                        following flags:
165
166                                        * IPV4: Allows IPv4 addresses.
167                                        * IPV6: Allows IPv6 addresses.
168                                        * NO_RES_RANGE: Disallows addresses in reversed
169                                          ranges (IPv4 only)
170                                        * NO_PRIV_RANGE: Disallows addresses in private
171                                          ranges (IPv4 only)
172================ ========== =========== ==================================================
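
A few of the logical filters in the table can be exercised as below. This is
a sketch under the assumption that the released ``filter_var()`` function and
the released constant names are used; the option layout (an array with an
``options`` key, or a plain flag bitmask) follows the released extension and
may differ from the draft prototypes in this document. The input values are
made up for illustration.

```php
<?php
// Integer with an inclusive range.
$range = ['options' => ['min_range' => 1, 'max_range' => 100]];
$inRange    = filter_var('42', FILTER_VALIDATE_INT, $range);
$outOfRange = filter_var('420', FILTER_VALIDATE_INT, $range);

// Hexadecimal (0xnn) and octal (0nn) input must be allowed explicitly.
$hex     = filter_var('0x1A', FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_HEX);
$octal   = filter_var('0755', FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_OCTAL);
$noOctal = filter_var('0755', FILTER_VALIDATE_INT);

// Boolean: 'on' maps to true, 'off' maps to false.
$on  = filter_var('on', FILTER_VALIDATE_BOOLEAN);
$off = filter_var('off', FILTER_VALIDATE_BOOLEAN);

// IP address: flags restrict the address family and exclude ranges.
$publicIp = filter_var(
    '8.8.8.8',
    FILTER_VALIDATE_IP,
    FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE
);
$privateIp = filter_var('192.168.1.10', FILTER_VALIDATE_IP, FILTER_FLAG_NO_PRIV_RANGE);

// Regular expression: the matching string is returned unchanged.
$pattern = ['options' => ['regexp' => '/^[a-z]+[0-9]+$/']];
$matched = filter_var('abc123', FILTER_VALIDATE_REGEXP, $pattern);

var_dump($inRange, $outOfRange, $hex, $octal, $on, $off, $publicIp, $privateIp, $matched);
```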
173
174
175Sanitizing Filters
176==================
177
178These filters remove data, or change data depending on the filter, and the
179set rules for this specific filter. Instead of taking an *options* array, they
180use this parameter for flags for the specific filter.
181
182The constants should be prepended by `FILTER_SANITIZE_` when used with php.
183
184============= ================ =========== =====================================================
185Name          Constant         Return Type Description
186============= ================ =========== =====================================================
187string        STRING           string      Returns the input variable as a string after it has
188                                           been stripped of XML/HTML tags and other evil things
189                                           that can cause XSS problems.
190
191                                           $filter_options - an bitmask that supports the
192                                           following flags:
193
194                                           * NO_ENCODE_QUOTES: Prevents single and double
195                                             quotes from being encoded as numerical HTML
196                                             entities.
197                                           * STRIP_LOW: excludes all characters < 0x20 from the
198                                             allowed character list
199                                           * STRIP_HIGH: excludes all characters >= 0x80 from
200                                             the allowed character list
201                                           * ENCODE_LOW: allows characters < 0x20 but encodes
202                                             them as numerical HTML entities
203                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
204                                             them as numerical HTML entities
205                                           * ENCODE_AMP: encodes & as &amp;
206
207                                           The flags STRIP_LOW and ENCODE_LOW are mutual
208                                           exclusive, and so are STRIP_HIGH and ENCODE_HIGH. In
209                                           the case they clash, the characters will be
210                                           stripped.
211
212stripped      STRIPPED         string      Alias for 'string'.
213
214encoded       ENCODED          string      Encodes all characters outside the range
215                                           "a-zA-Z0-9-._" as URL encoded values.
216
217                                           $filter_options - an bitmask that supports the
218                                           following flags:
219
220                                           * STRIP_LOW: excludes all characters < 0x20 from the
221                                             allowed character list
222                                           * STRIP_HIGH: excludes all characters >= 0x80 from
223                                             the allowed character list
224                                           * ENCODE_LOW: allows characters < 0x20 but encodes
225                                             them as numerical HTML entities
226                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
227                                             them as numerical HTML entities
228
229special_chars SPECIAL_CHARS    string      Encodes the 'special' characters ' " < > &, \0 and
230                                           everything below 0x20 as numerical HTML entities.
231
232                                           $filter_options - an bitmask that supports the
233                                           following flags:
234
235                                           * STRIP_LOW: excludes all characters < 0x20 from the
236                                             allowed character list. If this is not set, then
237                                             those characters are encoded as numerical HTML
238                                             entities
239                                           * STRIP_HIGH: excludes all characters >= 0x80 from
240                                             the allowed character list
241                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
242                                             them as numerical HTML entities
243
244unsafe_raw    UNSAFE_RAW       string      Returns the input variable as a string without
245                                           XML/HTML being stripped from the input value.
246
247                                           $filter_options - an bitmask that supports the
248                                           following flags:
249
250                                           * STRIP_LOW: excludes all characters < 0x20 from the
251                                             allowed character list
252                                           * STRIP_HIGH: excludes all characters >= 0x80 from
253                                             the allowed character list
254                                           * ENCODE_LOW: allows characters < 0x20 but encodes
255                                             them as numerical HTML entities
256                                           * ENCODE_HIGH: allows characters >= 0x80 but encodes
257                                             them as numerical HTML entities
258                                           * ENCODE_AMP: encodes & as &amp;
259
260                                           The flags STRIP_LOW and ENCODE_LOW are mutual
261                                           exclusive, and so are STRIP_HIGH and ENCODE_HIGH. In
262                                           the case they clash, the characters will be
263                                           stripped.
264
265email         EMAIL            string      Removes all characters that can not be part of a
266                                           correctly formed e-mail address (exception are
267                                           comments in the email address) (a-z A-Z 0-9 " ! # $
268                                           % & ' * + - / = ? ^ _ ` { | } ~ @ . [ ]). This
269                                           filter does `not` validate if the e-mail address has
270                                           the correct format, use the validate_email filter
271                                           for that.
272
273url           URL              string      Removes all characters that can not be part of a
274                                           correctly formed URI. (a-z A-Z 0-9 $ - _ . + ! * ' (
275                                           ) , { } | \ ^ ~ [ ] ` < > # % " ; / ? : @ & =) This
276                                           filter does `not` validate if a URI has the correct
277                                           format, use the validate_url filter for that.
278
279number_int    NUMBER_INT       int         Removes all characters that are [^0-9+-].
280
281number_float  NUMBER_FLOAT     float       Removes all characters that are [^0-9+-].
282
283                                           $filter_options - an bitmask that supports the
284                                           following flags:
285
286                                           * ALLOW_FRACTION: adds "." to the characters that
287                                             are not stripped.
288                                           * ALLOW_THOUSAND: adds "," to the characters that
289                                             are not stripped.
290                                           * ALLOW_SCIENTIFIC: adds "eE" to the characters that
291                                             are not stripped.
292
293magic_quotes  MAGIC_QUOTES     string      BC filter for people who like magic quotes.
294============= ================ =========== =====================================================
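
The effect of several sanitizing filters and their flags can be sketched as
follows, again assuming the released ``filter_var()`` function and constant
names rather than the draft API. The input strings are invented for
illustration. In the released extension the number filters return the cleaned
value as a string, so the checks below compare against strings.

```php
<?php
// number_int keeps only digits, '+' and '-'.
$int = filter_var('abc-12x3', FILTER_SANITIZE_NUMBER_INT);

// number_float strips '.' and ',' unless the matching flags are given.
$floatPlain = filter_var('1,234.56 USD', FILTER_SANITIZE_NUMBER_FLOAT);
$floatFlags = filter_var(
    '1,234.56 USD',
    FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND
);

// special_chars encodes ' " < > & as numerical HTML entities.
$special = filter_var('<a href="x">Tom & Jerry</a>', FILTER_SANITIZE_SPECIAL_CHARS);

// encoded URL-encodes everything outside a-zA-Z0-9-._
$encoded = filter_var('a b&c=d', FILTER_SANITIZE_ENCODED);

// email removes characters that can not appear in an address.
$email = filter_var('jo hn(doe)@exa mple.com', FILTER_SANITIZE_EMAIL);

// unsafe_raw leaves the value alone unless flags ask for stripping.
$raw = filter_var("tab\there", FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_LOW);

var_dump($int, $floatPlain, $floatFlags, $special, $encoded, $email, $raw);
```

Sanitizing only removes or encodes characters. A sanitized e-mail address or
URL still has to go through the matching logical filter before it can be
treated as well formed.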
295
296
297Callback Filter
298===============
299
300This filter will callback to the specified callback function as specified with
301the *filter_options* parameter. All variants of callback functions are
302supported:
303
304* function with *'functionname'*
305* static method with *array('classname', 'methodname')*
306* dynamic method with *array(&$this, 'methodname')*
307
308The constants should be prepended by `FILTER_` when used with php.
309
310============= =========== =========== =====================================================
311Name          Constant    Return Type Description
312============= =========== =========== =====================================================
313callback      CALLBACK    mixed       Calls the callback function/method with the input
314                                      variable's value by reference which can do filtering
315                                      and modifying of the input value. If the callback
316                                      function returns "false" then the input value is
317                                      supposed to be incorrect and the returned value will
318                                      be 'false' (and an E_NOTICE will be raised).
319============= =========== =========== =====================================================
320
321The callback function's prototype is:
322
323boolean callback(&$value, $characterset);
324    With *$value* being a reference to the input variable and *$characterset*
325    containing the same value as this parameter's value in the call to
326    *input_get()* or *input_get_array()*. If the *$characterset* parameter was
327    not passed, it defaults to *'null'*.
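
The three callback variants could be used as sketched below. Note the
assumption: this uses the released ``filter_var()`` with ``FILTER_CALLBACK``,
where the callback receives the value as its only argument and returns the
filtered result, which differs from the by-reference prototype drafted above.
``InputCleaner`` and its method are hypothetical names introduced for the
example.

```php
<?php
// Hypothetical helper class used to show the static-method variant.
final class InputCleaner
{
    public static function trimAndLower(string $value): string
    {
        return strtolower(trim($value));
    }
}

// Plain function name.
$upper = filter_var('hello', FILTER_CALLBACK, ['options' => 'strtoupper']);

// Static method as array('classname', 'methodname').
$lower = filter_var(
    '  MiXeD ',
    FILTER_CALLBACK,
    ['options' => ['InputCleaner', 'trimAndLower']]
);

// Closure that rejects non-digit input by returning false.
$digitsOnly = function (string $value) {
    return ctype_digit($value) ? (int) $value : false;
};
$accepted = filter_var('7', FILTER_CALLBACK, ['options' => $digitsOnly]);
$rejected = filter_var('seven', FILTER_CALLBACK, ['options' => $digitsOnly]);

var_dump($upper, $lower, $accepted, $rejected);
```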

Version: $Id$

.. vim: et syn=rst tw=78