History log of /php-src/ext/mbstring/tests/gh13123.phpt (Results 1 – 1 of 1)
Revision Date Author Comments
# 67051eb8 11-Jan-2024 Alex Dowad

Fix segfault caused by use of 'pass' encoding when mbstring converts multipart form POST data

When mbstring.encoding_translation=1, and PHP receives an (RFC1867)
form-based file upload,

Fix segfault caused by use of 'pass' encoding when mbstring converts multipart form POST data

When mbstring.encoding_translation=1, and PHP receives an (RFC1867)
form-based file upload, and the Content-Disposition HTTP header contains
a filename for the uploaded file, PHP will internally invoke mbstring
code to 1) try to auto-detect the text encoding of the filename, and if
that succeeds, 2) convert the filename to internal text encoding.

In such cases, the candidate text encodings which are considered during
"auto-detection" are those listed in the INI parameter
mbstring.http_input. Further, mbstring.http_input is one of the few
contexts where mbstring allows the magic string "pass" to appear in
place of an actual text encoding name.

Before mbstring's encoding auto-detection function was reimplemented,
the old implementation would never return "pass", even if "pass" was the
only candidate it was given to choose from. It is not clear if this was
intended by the original developers or not. This behavior was the result
of some rather subtle details of the implementation.

After mbstring's auto-detection function was reimplemented, if the new
implementation was given only one candidate to choose, and it was not
running in 'strict' mode, it would always return that candidate, even
if the candidate was the non-encoding "pass".

The upshot of all of this: Previously, if
mbstring.encoding_translation=1 and mbstring.http_input=pass, encoding
conversion of RFC1867 filenames would never be attempted. But after
the reimplementation, encoding 'conversion' would occur (uselessly).

Further, in December 2022, I reimplemented the relevant bit of
encoding conversion code. When doing this, I never bothered to
implement encoding/decoding routines for the non-encoding "pass",
because I thought that they would never be used. Well, in the one case
described above, those routines *would* have been used, had they
actually existed. Because they didn't exist, we get a nice NULL pointer
dereference and ensuing segfault instead.

Instead of 'fixing' this by adding encoding/decoding routines for the
non-encoding "pass", I have modified the function which the RFC1867
form-handling code invokes to auto-detect input encoding. This function
will never return "pass" now, just like the previous implementation.

Thanks to the GitHub user 'tstangner' for reporting this bug.

show more ...