History log of /PHP-7.4/ext/mbstring/mbstring.c (Results 76 – 100 of 627)
Revision Date Author Comments
# fb9bf5b6 03-Aug-2017 Nikita Popov

Revert/fix substitution character fallback

The introduced checks were not correct in two respects:
* It was checked whether the source encoding of the string matches
the internal encoding, while the actually relevant encoding is
the *target* encoding.
* Even if the correct encoding is used, the checks are still too
conservative. Just because something is not a "Unicode-encoding"
does not mean that it does not map any non-ASCII characters.

I've reverted the added checks and instead adjusted mbfl_convert
to first try to use the provided substitution character and if
that fails, perform the fallback to '?' at that point. This means
that any codepoint mapped in the target encoding should now be
correctly supported and anything else should fall back to '?'.
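
The fallback order described above can be sketched as follows. This is a minimal illustration, not the mbfl_convert code: `latin1_encode` and `convert_with_fallback` are hypothetical names, and ISO-8859-1 stands in for an arbitrary target encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical target encoding: ISO-8859-1 maps only U+0000..U+00FF,
 * each to the byte with the same value. */
static bool latin1_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp > 0xFF) {
        return false; /* not representable in the target encoding */
    }
    *out = (unsigned char)cp;
    return true;
}

/* Encode cp. If the target cannot represent it, try the configured
 * substitution character; if that is not representable either,
 * fall back to '?'. */
static unsigned char convert_with_fallback(uint32_t cp, uint32_t substitute)
{
    unsigned char out;
    if (latin1_encode(cp, &out)) {
        return out;
    }
    if (latin1_encode(substitute, &out)) {
        return out;
    }
    return '?';
}
```

The decision is made against the target encoding at conversion time, so no up-front check of the substitution character against the source or internal encoding is needed.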


# a8a9e93e 03-Aug-2017 Nikita Popov

Revert/fix mb_substitute_character() codepoint checks

The introduced checks did not treat "non-Unicode" encodings correctly,
because they treated the passed integer as encoded in the internal
encoding in that case, while in actuality the substitute character
is always a Unicode codepoint.

Additionally checking the codepoint against the internal encoding
is not correct in any case, because the substitution character must
be mapped in the *target* encoding of the conversion, which does
not necessarily coincide with the internal encoding (the internal
encoding is the default *source* encoding, not *target* encoding).

This reverts the checks back to simple range checks, but in a way
that still resolves #69079: Characters outside the Basic
Multilingual Plane are now accepted and Surrogate Codepoints are
rejected. A distinction between UTF-8 and non-UTF-8 encodings is
not made for surrogate checks (as in the original patch), as
surrogates are always illegal on their own. Specifying a surrogate
as substitution character would only make sense if you could
specify a substitution string with more than one character --
however we do not support that.
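
A simple range check of the kind described could look like the sketch below. The function name is hypothetical, and the exact bounds used by mbstring (for example how it treats 0 or negative input) are not reproduced here; only the two rules stated in the message are shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: accept any Unicode codepoint up to U+10FFFF,
 * including those outside the Basic Multilingual Plane, but reject
 * surrogates, which are never valid on their own. */
static bool is_valid_substitute_codepoint(uint32_t cp)
{
    if (cp > 0x10FFFF) {
        return false; /* beyond the Unicode codespace */
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return false; /* surrogate range */
    }
    return true;
}

/* Usage sketch. */
static void check_examples(void)
{
    assert(is_valid_substitute_codepoint(0x1F600));  /* outside the BMP: accepted */
    assert(!is_valid_substitute_codepoint(0xD800));  /* lone surrogate: rejected */
}
```

The check does not depend on any encoding, which matches the point that the substitution character is always given as a Unicode codepoint.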


# 2cc1cbf2 28-Jul-2017 Fabien Villepinte

Fix Bug #75001: Wrong reflection on mb_eregi_replace


# 582a65b0 27-Jul-2017 Nikita Popov

Implement full case mapping

Implement full case mapping according to SpecialCasing.txt and
also full case folding according to CaseFolding.txt (F). There
are a number of caveats:

* Only language-agnostic and unconditional full case mapping
is implemented. The only language-agnostic conditional case
mapping rule relates to Greek sigma in final position
(Final_Sigma). Correctly handling this requires both arbitrary
lookahead and lookbehind, which would require some larger
changes to how the case mapping is implemented. This is a
possible future extension.
* The only language-specific handling that is implemented is
for Turkish dotted/undotted Is, if the ISO-8859-9 encoding
is used. This matches the previous behavior and makes sure
that no codepoints not supported by the encoding are
produced. A future extension would be to also handle the
Turkish mappings specified by SpecialCasing.txt based on
the mbfl internal language.
* Full case folding is implemented, but case-insensitive mb_*
operations continue to use simple case folding. The reason is
that full case folding of the haystack string may change the
position at which a match occurred. This would have to be
mapped back into the position in the original string.
* mb_convert_case() exposes both the full and the simple case
mapping / folding, where full is the default. The constants
are:

* MB_CASE_LOWER (used by mb_strtolower)
* MB_CASE_UPPER (used by mb_strtoupper)
* MB_CASE_TITLE
* MB_CASE_FOLD
* MB_CASE_LOWER_SIMPLE
* MB_CASE_UPPER_SIMPLE
* MB_CASE_TITLE_SIMPLE
* MB_CASE_FOLD_SIMPLE (used by case-insensitive operations)
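
To illustrate the difference between simple and full mapping, the sketch below handles a single special case, U+00DF (ß), whose full uppercase mapping in SpecialCasing.txt is the two codepoints "SS". The function names and the tiny ASCII-only table are assumptions for illustration and are unrelated to the actual mbstring tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simple mapping: always exactly one output codepoint.
 * Only ASCII is handled in this sketch; U+00DF has no simple
 * uppercase mapping and is returned unchanged. */
static uint32_t upper_simple(uint32_t cp)
{
    if (cp >= 'a' && cp <= 'z') {
        return cp - 0x20;
    }
    return cp;
}

/* Full mapping: may expand one codepoint into several.
 * Writes up to 3 codepoints into out and returns how many were written. */
static size_t upper_full(uint32_t cp, uint32_t out[3])
{
    assert(out != NULL);
    if (cp == 0x00DF) { /* sharp s -> "SS" */
        out[0] = 'S';
        out[1] = 'S';
        return 2;
    }
    out[0] = upper_simple(cp);
    return 1;
}
```

Because the full mapping can change the number of codepoints, offsets computed on the mapped string no longer line up with the original. That is the reason given above for keeping case-insensitive operations on simple case folding.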


# 9ac7c1e7 27-Jul-2017 Nikita Popov

Use case-folding for case insensitive comparisons

Instead of using lowercasing.


# f56b0afe 26-Jul-2017 Nikita Popov

Avoid some unnecessary mbfl_strlen() calculations


# 13a26290 25-Jul-2017 Anatol Belski

size_t fixes


# 445e13b1 23-Jul-2017 Nikita Popov

Add MBFL_SUBSTR_TO_END mode to mbfl_substr

This takes the substr from the offset to the end of the string.
This avoids pointless searching for the end position and also
saves us a length calculation in the strstr family of functions.
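
The idea can be sketched for UTF-8 as below. This is not the mbfl_substr implementation: `utf8_substr_to_end` is a hypothetical helper, and it assumes the input is valid UTF-8 with a known byte length.

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the tail of s that starts at character offset
 * `from`, and store the tail's byte length in *out_len. Only the first
 * `from` characters are scanned; the end position is never searched for
 * because the tail simply runs to the end of the buffer. */
static const char *utf8_substr_to_end(const char *s, size_t len,
                                      size_t from, size_t *out_len)
{
    size_t i = 0;

    assert(s != NULL && out_len != NULL);
    while (i < len && from > 0) {
        i++; /* lead byte of the current character */
        /* skip continuation bytes of the form 10xxxxxx */
        while (i < len && ((unsigned char)s[i] & 0xC0) == 0x80) {
            i++;
        }
        from--;
    }
    *out_len = len - i;
    return s + i;
}
```

A caller such as a strstr-style function that wants "everything after the match" can use this form directly, without first counting the characters in the whole string to compute a length argument.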


# bff11c38 23-Jul-2017 Nikita Popov

Remove more obsolete length checks


# 78944bdf 23-Jul-2017 Anatol Belski

remove cast


# 6809be20 23-Jul-2017 Anatol Belski

fix warnings and datatype

indent


# bd63c0f5 23-Jul-2017 Nikita Popov

Fix bug #73528


# 80463579 23-Jul-2017 Nikita Popov

Remove confusing null checks in mb_send_mail

These are required parameters, they cannot be missing.


# 9af5b7f3 23-Jul-2017 Nikita Popov

Fix use after free in mb_send_mail


# 4fbd7ccb 22-Jul-2017 Anatol Belski

touch yet more places for datatypes


# 61784bcb 22-Jul-2017 Anatol Belski

sync libmbfl allocator with the size_t changes


# e0825ec6 22-Jul-2017 Anatol Belski

Mitigation for ssize_t issue in 22a5f554a8766d63fd2c2ce91a90ebacb13c0f6a

and some more


# 1388751f 20-Jul-2017 Nikita Popov

Use fast zpp in mb_strlen()

For short strings this function is now sufficiently fast for zpp
to be a bottleneck.


# b3c1d9d1 20-Jul-2017 Nikita Popov

Directly use encodings instead of no_encoding in libmbfl

In particular strings now store encoding rather than the
no_encoding.

I've also pruned out libmbfl APIs that existed in two forms, one
using no_encoding and the other using encoding. We were not actually
using any of the former.


# 77cb7bd8 20-Jul-2017 Nikita Popov

Free last_used_encoding_name in RSHUTDOWN

efree() cannot be used in GSHUTDOWN


# ba383b82 20-Jul-2017 Nikita Popov

Add basic mbstring encoding cache

Store the last used encoding and compare against it. It's quite
likely that an application is going to be using the same encoding
again and again.

The actual mbfl_name2encoding() function could also be optimized
to use a hash lookup rather than a linear scan, but we don't have
a hashtable implementation in libmbfl...
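
A one-entry cache in front of a linear scan might be sketched like this. All names here (`encoding`, `lookup_linear`, `get_encoding`, `slow_lookups`) are hypothetical stand-ins; the real code works with mbfl_name2encoding() and also has to deal with aliases and case-insensitive names, which this sketch ignores by using exact string comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an encoding descriptor. */
typedef struct {
    const char *name;
} encoding;

static const encoding encodings[] = { {"UTF-8"}, {"ISO-8859-1"}, {"SJIS"} };

static unsigned slow_lookups;     /* counts linear scans, for illustration */
static const encoding *last_used; /* the one-entry cache */

/* Linear scan over all known encodings. */
static const encoding *lookup_linear(const char *name)
{
    slow_lookups++;
    for (size_t i = 0; i < sizeof encodings / sizeof encodings[0]; i++) {
        if (strcmp(encodings[i].name, name) == 0) {
            return &encodings[i];
        }
    }
    return NULL;
}

/* Compare against the last used encoding first; scan only on a miss. */
static const encoding *get_encoding(const char *name)
{
    assert(name != NULL);
    if (last_used != NULL && strcmp(last_used->name, name) == 0) {
        return last_used;
    }
    const encoding *enc = lookup_linear(name);
    if (enc != NULL) {
        last_used = enc;
    }
    return enc;
}
```

When an application keeps passing the same encoding name, every call after the first costs a single string comparison instead of a scan over the whole table.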


# 264387e3 20-Jul-2017 Nikita Popov

Add php_mb_get_no_encoding() helper function


# adaea775 20-Jul-2017 Nikita Popov

Switch libmbfl to use size_t

Switch mbfl_string and related structures to use size_t lengths.

Quite likely that I broke some things along the way...


# 9c73be89 19-Jul-2017 Nikita Popov

Directly accept encoding in php_unicode_convert_case()

As a side-effect mb_strtolower() and mb_strtoupper() now correctly
handle a NULL encoding parameter by using the internal encoding.
This is what caused the two test changes.


# 4128746b 19-Jul-2017 Nikita Popov

Add php_mb_get_encoding() convenience function

