#
935fef29 |
| 22-Oct-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Optimize DOM HTML serialization for UTF-8 (#16376)

* Use a direct call for decoding the UTF-8 buffer
* Add fast path for UTF-8 HTML serialization

This patch adds a fast path to the HTML serialization encoding that has to encode to UTF-8. Because the DOM internally represents all strings using UTF-8, we only need to validate here.

Tested on Wikipedia English home page on an i7-4790:
```
Benchmark 1: ./sapi/cli/php x.php
  Time (mean ± σ):     516.0 ms ±   6.4 ms    [User: 511.2 ms, System: 3.5 ms]
  Range (min … max):   506.0 ms … 527.1 ms    10 runs

Benchmark 2: ./sapi/cli/php_old x.php
  Time (mean ± σ):     682.8 ms ±   6.5 ms    [User: 676.8 ms, System: 3.8 ms]
  Range (min … max):   675.8 ms … 695.6 ms    10 runs

Summary
  ./sapi/cli/php x.php ran
    1.32 ± 0.02 times faster than ./sapi/cli/php_old x.php
```
(And if you're interested: it takes over a second on my machine using the old DOMDocument class)

Future optimizations are certainly possible, but let's start here.
|
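To illustrate why the UTF-8 case in commit 935fef29 can skip transcoding and "only validate", here is a minimal sketch of a strict UTF-8 validator. It is not the code from the patch: the function name `utf8_is_valid` is hypothetical, and the real implementation works on the serializer's buffers rather than on a plain byte array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a php-src function): returns true if
 * buf[0..len) is well-formed UTF-8. Rejects overlong encodings,
 * surrogates, and code points above U+10FFFF. */
bool utf8_is_valid(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t need;      /* number of continuation bytes expected */
        uint32_t cp, min; /* decoded code point and smallest legal value */

        if (b < 0x80) {   /* ASCII: nothing more to check */
            i++;
            continue;
        } else if ((b & 0xE0) == 0xC0) {
            need = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            need = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {
            need = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return false; /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need) {
            return false; /* sequence is cut off at the end of the input */
        }
        for (size_t k = 1; k <= need; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80) {
                return false; /* not a continuation byte */
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false; /* overlong, out of range, or surrogate */
        }
        i += need + 1;
    }
    return true;
}
```

If the input passes this check, the bytes can be written out unchanged, which is the saving the fast path relies on; a transcoding path would instead decode and re-encode every code point.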
#
baa76be6 |
| 12-Oct-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Use SWAR to seek for non-ASCII UTF-8 in DOM parsing (#16350)

GitHub FYP test case:
```
Benchmark 1: ./sapi/cli/php test.php
  Time (mean ± σ):     502.8 ms ±   6.2 ms    [User: 498.3 ms, System: 3.2 ms]
  Range (min … max):   495.2 ms … 509.8 ms    10 runs

Benchmark 2: ./sapi/cli/php_old test.php
  Time (mean ± σ):     518.4 ms ±   4.3 ms    [User: 513.9 ms, System: 3.2 ms]
  Range (min … max):   511.5 ms … 525.5 ms    10 runs

Summary
  ./sapi/cli/php test.php ran
    1.03 ± 0.02 times faster than ./sapi/cli/php_old test.php
```

Wikipedia English homepage test case:
```
Benchmark 1: ./sapi/cli/php test.php
  Time (mean ± σ):     301.1 ms ±   4.2 ms    [User: 295.5 ms, System: 4.8 ms]
  Range (min … max):   296.3 ms … 308.8 ms    10 runs

Benchmark 2: ./sapi/cli/php_old test.php
  Time (mean ± σ):     308.2 ms ±   1.7 ms    [User: 304.6 ms, System: 2.9 ms]
  Range (min … max):   306.9 ms … 312.8 ms    10 runs

Summary
  ./sapi/cli/php test.php ran
    1.02 ± 0.02 times faster than ./sapi/cli/php_old test.php
```
|
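The SWAR ("SIMD within a register") idea named in commit baa76be6 can be sketched as follows: instead of testing one byte at a time, load eight bytes into a 64-bit word and test all eight high bits with a single mask. This is an illustrative sketch only; the function name `first_non_ascii` is hypothetical and the actual patch may differ in word size, alignment handling, and how it locates the exact byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a php-src function): returns the index of the
 * first byte >= 0x80 in buf[0..len), or len if every byte is ASCII. */
size_t first_non_ascii(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t i = 0;

    /* Word-at-a-time phase: a byte is non-ASCII exactly when its high bit
     * is set, so one AND against 0x80 repeated eight times checks eight
     * bytes at once. memcpy avoids unaligned-access and aliasing issues. */
    while (len - i >= sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, buf + i, sizeof word);
        if (word & UINT64_C(0x8080808080808080)) {
            break; /* some byte in this word is non-ASCII; find it below */
        }
        i += sizeof word;
    }

    /* Byte-at-a-time phase: handles the tail and pinpoints the exact byte
     * inside the word that tripped the mask (endianness-independent). */
    while (i < len && buf[i] < 0x80) {
        i++;
    }
    return i;
}
```

The benefit depends on how much of the input is ASCII, which is consistent with the modest 2 to 3 percent gains in the benchmarks quoted in the commit message.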
#
1e949d18 |
| 04-Oct-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Fix edge-case in DOM parsing decoding

There are three connected subtle issues:
1) The fast path didn't correctly handle the case where the decoder requests more data. This caused a bogus additional replacement sequence to be outputted when encountering an incomplete sequence at the edges of a buffer.
2) The finishing of decoding incorrectly assumed that the fast path cannot be in a state where the last few bytes were an incomplete sequence, but this is not true as shown by test 08.
3) The finishing of decoding could output bytes twice because it called into dom_process_parse_chunk() twice without clearing the decoded data. However, calling twice is not even necessary as the entire buffer cannot be filled up entirely.

Closes GH-16226.
|
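The buffer-edge problem described in commit 1e949d18 arises whenever input is decoded in chunks: a multi-byte UTF-8 sequence can be split across two chunks, and the bytes at the end of the first chunk are incomplete rather than invalid. The sketch below shows one generic way to detect that situation so the trailing bytes can be carried over instead of being replaced. It is an assumption-laden illustration, not the fix from the commit: the name `utf8_incomplete_tail` is hypothetical, and the actual decoder keeps its own internal state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a php-src function): returns how many bytes at
 * the end of buf[0..len) form the start of a UTF-8 sequence that is cut
 * off by the end of the chunk. Returns 0 if the chunk ends on a sequence
 * boundary or if the tail is not a truncated sequence. */
size_t utf8_incomplete_tail(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    /* A lead byte can be followed by at most 3 continuation bytes, so a
     * truncated sequence is at most 3 bytes long. */
    size_t max_back = len < 3 ? len : 3;

    for (size_t back = 1; back <= max_back; back++) {
        unsigned char b = buf[len - back];
        size_t expected; /* total length announced by the lead byte */

        if ((b & 0xC0) == 0x80) {
            continue; /* continuation byte: keep looking for the lead */
        }
        if (b < 0x80) {
            expected = 1;
        } else if ((b & 0xE0) == 0xC0) {
            expected = 2;
        } else if ((b & 0xF0) == 0xE0) {
            expected = 3;
        } else if ((b & 0xF8) == 0xF0) {
            expected = 4;
        } else {
            return 0; /* invalid lead byte: an error, not a truncation */
        }
        /* Truncated only if the lead byte promises more bytes than the
         * chunk actually contains after it. */
        return expected > back ? back : 0;
    }
    return 0; /* only continuation bytes seen: nothing to carry over */
}
```

A chunked decoder built on this would hold back the returned number of bytes and prepend them to the next chunk; only when no further input arrives should the leftover bytes be reported as a malformed sequence.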
#
88393cfa |
| 26-Aug-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Fix GH-13988: Storing DOMElement consume 4 times more memory in PHP 8.1 than in PHP 8.0 We avoid creating backing storage by using the feature introduced in f78d5cfcd2fe06ddd6da33ff880c6823072adc1b. Closes GH-15593.
|
#
d32b97a1 |
| 23-Aug-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Fix NULL pointer dereference with NULL content in legacy nodes in title getting (#15558)
|
#
5853cdb7 |
| 20-Aug-2024 |
Gina Peter Banyard |
Use "must not" instead of "cannot" wording
|
#
6d9a74cd |
| 18-Aug-2024 |
Gina Peter Banyard |
ext/dom: Use standard wording for ValueError
|
#
80a4783d |
| 18-Jul-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Deduplicate NULL checks in ext/dom (#15015) This introduces a new helper php_dom_create_nullable_object() that does the NULL check and puts NULL in return_value. Otherwise it runs php_dom_create_object(). This deduplicates a bit of code.
|
#
6980eba8 |
| 10-Jul-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Support templated content The template element in HTML 5 is special in the sense that it does not add its contents into the DOM tree, but instead keeps them in a separate shadow DOM document fragment. Interacting with the DOM tree cannot touch the elements in the document fragment. Closes GH-14906.
|
#
4ef75391 |
| 09-Jul-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Split off private data from the ns mapper
|
#
88da9149 |
| 27-Apr-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Implement CSS selectors
|
#
48c9f1e2 |
| 27-Apr-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Implement Dom\HTMLElement class
|
#
78401ba8 |
| 07-Apr-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Implement Dom\Document::$title setter
|
#
04af9603 |
| 07-Apr-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Implement Dom\Document::$title getter
|
#
a12db3b6 |
| 23-Mar-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Implement Dom\Document::$body setter
|
#
287cf917 |
| 23-Mar-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Implement Dom\Document::$head
|
#
a1485df5 |
| 23-Mar-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Implement Dom\Document::$body getter
|
#
11accb5c |
| 25-Jun-2024 |
Arnaud Le Blanc |
Preferably include from build dir (#13516)

* Include from build dir first

This fixes out of tree builds by ensuring that configure artifacts are included from the build dir.

Before, out of tree builds would preferably include files from the src dir, as the include path was defined as follows (ignoring includes from ext/ and sapi/):

    -I$(top_builddir)/main
    -I$(top_srcdir)
    -I$(top_builddir)/TSRM
    -I$(top_builddir)/Zend
    -I$(top_srcdir)/main
    -I$(top_srcdir)/Zend
    -I$(top_srcdir)/TSRM
    -I$(top_builddir)/

As a result, an out of tree build would include configure artifacts such as `main/php_config.h` from the src dir.

After this change, the include path is defined as follows:

    -I$(top_builddir)/main
    -I$(top_builddir)
    -I$(top_srcdir)/main
    -I$(top_srcdir)
    -I$(top_builddir)/TSRM
    -I$(top_builddir)/Zend
    -I$(top_srcdir)/Zend
    -I$(top_srcdir)/TSRM

* Fix extension include path for out of tree builds

* Include config.h with the brackets form

`#include "config.h"` searches in the directory containing the including-file before any other include path. This can include the wrong config.h when building out of tree and a config.h exists in the source tree.

Using `#include <config.h>` uses exclusively the include path, and gives priority to the build dir.
|
#
84a0da15 |
| 09-Jun-2024 |
Peter Kokot |
Sync #if/ifdef/defined (#14508)

This syncs CPP macro conditions:
- _WIN32
- _WIN64
- HAVE_ALLOCA_H
- HAVE_ALPHASORT
- HAVE_ARPA_INET_H
- HAVE_CONFIG_H
- HAVE_DIRENT_H
- HAVE_DLFCN_H
- HAVE_GETTIMEOFDAY
- HAVE_LIBDL
- HAVE_POLL_H
- HAVE_PWD_H
- HAVE_SCANDIR
- HAVE_SYS_FILE_H
- HAVE_SYS_PARAM_H
- HAVE_SYS_SOCKET_H
- HAVE_SYS_TIME_H
- HAVE_SYS_TYPES_H
- HAVE_SYS_WAIT_H
- HAVE_UNISTD_H
- PHP_WIN32
- ZEND_WIN32

These are either undefined or defined to 1 in Autotools and Windows.

Follow up of GH-5526 (-Wundef).
|
#
1fdbb0ab |
| 12-May-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Get rid of unused declarations
|
#
e7af2bfd |
| 12-May-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Get rid of reserved name usage
|
#
44485892 |
| 10-May-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Factor out all common code for XML serialization and merge common paths
|
#
6e7adb3c |
| 09-May-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Update ext/dom names after policy change (#14171)
|
#
191d0501 |
| 23-Mar-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Cleanup dom_html_document_encoding_write() (#13788)
|
#
b9559738 |
| 13-Mar-2024 |
Niels Dossche <7771979+nielsdos@users.noreply.github.com> |
Only register error handling when observable Closes GH-13702.
|