Upgrading from PHP-Parser 4.x to 5.0
====================================

### PHP version requirements

PHP-Parser now requires PHP 7.4 or newer to run. It is, however, still possible to *parse* code for older versions while running on a newer version.

### PHP 5 parsing support

The dedicated parser for PHP 5 has been removed. The PHP 7 parser now accepts a `PhpVersion` argument, which can be used to improve compatibility with older PHP versions.

In particular, if an older `PhpVersion` is specified, then:

 * For versions before PHP 7.0, `$foo =& new Bar()` assignments are allowed without error.
 * For versions before PHP 7.0, invalid octal literals `089` are allowed without error.
 * For versions before PHP 7.0, unicode escape sequences `\u{123}` in strings are not parsed.
 * Type hints are interpreted as a class `Name` or as a built-in `Identifier` depending on PHP
   version, for example `int` is treated as a class name on PHP 5.6 and as a built-in on PHP 7.0.

However, some aspects of PHP 5 parsing are no longer supported:

 * Some variables like `$$foo[0]` are valid in both PHP 5 and PHP 7, but are interpreted differently. In that case, the PHP 7 AST will always be constructed (`($$foo)[0]` rather than `${$foo[0]}`).
 * Declarations of the form `global $$var[0]` are not supported in PHP 7 and will cause a parse error. In error recovery mode, it is possible to continue parsing after such declarations.
 * The PHP 7 parser will accept many constructs that are not valid in PHP 5. However, this was also true of the dedicated PHP 5 parser.

The following symbols are affected by this removal:

 * The `PhpParser\Parser\Php5` class has been removed.
 * The `PhpParser\Parser\Multiple` class has been removed. While not strictly related to PHP 5 support, this functionality is no longer useful without it.
 * The `PhpParser\ParserFactory::ONLY_PHP5` and `PREFER_PHP5` options have been removed.
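
The version-dependent handling of type hints described in this section can be observed with a small script. This is only a sketch: it assumes PHP-Parser 5.x is installed through Composer and that `vendor/autoload.php` sits next to the script; the `$typeClasses` variable is purely illustrative.

```php
<?php
// Assumption: nikic/php-parser ^5.0 installed via Composer; the autoload path is illustrative.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\ParserFactory;
use PhpParser\PhpVersion;

$code = '<?php function f(int $x) {}';
$factory = new ParserFactory();

// Parse the same code for two target versions and record the class of the type node.
$typeClasses = [];
foreach (['5.6', '7.0'] as $version) {
    $parser = $factory->createForVersion(PhpVersion::fromString($version));
    $stmts = $parser->parse($code);
    // $stmts[0] is the function declaration; its first parameter carries the type node.
    $typeClasses[$version] = get_class($stmts[0]->params[0]->type);
}

foreach ($typeClasses as $version => $class) {
    echo $version, ' => ', $class, "\n";
}
```

When targeting PHP 5.6 the type node should be a `Name`, and when targeting PHP 7.0 an `Identifier`, in line with the list above.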

### Changes to the parser factory

The `ParserFactory::create()` method has been removed in favor of three new methods that provide more fine-grained control over the PHP version being targeted:

 * `createForNewestSupportedVersion()`: Use this if you don't know the PHP version of the code you're parsing. It's better to assume a too new version than a too old one.
 * `createForHostVersion()`: Use this if you're parsing code for the PHP version you're running on.
 * `createForVersion()`: Use this if you know the PHP version of the code you want to parse.

The `createForNewestSupportedVersion()` and `createForHostVersion()` methods are available since PHP-Parser 4.18.0, to allow libraries to support PHP-Parser 4 and 5 at the same time more easily.

In all cases, the PHP version is a fairly weak hint that is only used on a best-effort basis. The parser will usually accept code for newer versions if it does not have any backwards-compatibility implications.

For example, if you specify version `"8.0"`, then `class ReadOnly {}` is treated as a valid class declaration, while using `public readonly int $prop` will lead to a parse error. However, `final public const X = Y;` will be accepted for both `"8.0"` and `"8.1"`, even though final class constants were only introduced in PHP 8.1.

```php
use PhpParser\ParserFactory;
use PhpParser\PhpVersion;

$factory = new ParserFactory();

# Before
$parser = $factory->create(ParserFactory::PREFER_PHP7);

# After (this is roughly equivalent to PREFER_PHP7 behavior)
$parser = $factory->createForNewestSupportedVersion();
# Or
$parser = $factory->createForHostVersion();

# Before
$parser = $factory->create(ParserFactory::ONLY_PHP5);
# After (supported on a best-effort basis)
$parser = $factory->createForVersion(PhpVersion::fromString("5.6"));
```

### Changes to the throw representation

Previously, `throw` statements like `throw $e;` were represented using the `Stmt\Throw_` class,
while uses inside other expressions (such as `$x ??
throw $e`) used the `Expr\Throw_` class.

Now, `throw $e;` is represented as a `Stmt\Expression` that contains an `Expr\Throw_`. The
`Stmt\Throw_` class has been removed.

```php
# Code
throw $e;

# Before
Stmt_Throw(
    expr: Expr_Variable(
        name: e
    )
)

# After
Stmt_Expression(
    expr: Expr_Throw(
        expr: Expr_Variable(
            name: e
        )
    )
)
```

### Changes to the array destructuring representation

Previously, the `list($x) = $y` destructuring syntax was represented using a `Node\Expr\List_`
node, while `[$x] = $y` used a `Node\Expr\Array_` node, the same node used for the creation
(rather than destructuring) of arrays.

Now, destructuring is always represented using `Node\Expr\List_`. The `kind` attribute with value
`Node\Expr\List_::KIND_LIST` or `Node\Expr\List_::KIND_ARRAY` specifies which syntax was actually
used.

```php
# Code
[$x] = $y;

# Before
Expr_Assign(
    var: Expr_Array(
        items: array(
            0: Expr_ArrayItem(
                key: null
                value: Expr_Variable(
                    name: x
                )
                byRef: false
                unpack: false
            )
        )
    )
    expr: Expr_Variable(
        name: y
    )
)

# After
Expr_Assign(
    var: Expr_List(
        items: array(
            0: ArrayItem(
                key: null
                value: Expr_Variable(
                    name: x
                )
                byRef: false
                unpack: false
            )
        )
    )
    expr: Expr_Variable(
        name: y
    )
)
```

### Changes to the name representation

Previously, `Name` nodes had a `parts` subnode, which stored an array of name parts, split by
namespace separators. Now, `Name` nodes instead have a `name` subnode, which stores a plain string.

For example, the name `Foo\Bar` was previously represented by `Name(parts: ['Foo', 'Bar'])` and is
now represented by `Name(name: 'Foo\Bar')` instead.

It is possible to convert the name to the previous representation using `$name->getParts()`.
The `Name` constructor continues to accept both the string and the array representation.

The `Name::getParts()` method is available since PHP-Parser 4.16.0, to allow libraries to support
PHP-Parser 4 and 5 at the same time more easily.

### Changes to the block representation

Previously, code blocks `{ ... }` were always flattened into their parent statement list. For
example `while ($x) { $a; { $b; } $c; }` would produce the same node structure as
`while ($x) { $a; $b; $c; }`, namely a `Stmt\While_` node whose `stmts` subnode is an array of three
statements.

Now, the nested `{ $b; }` block is represented using an explicit `Stmt\Block` node. However, the
outer `{ $a; { $b; } $c; }` block is still represented using a simple array in the `stmts` subnode.

```php
# Code
while ($x) { $a; { $b; } $c; }

# Before
Stmt_While(
    cond: Expr_Variable(
        name: x
    )
    stmts: array(
        0: Stmt_Expression(
            expr: Expr_Variable(
                name: a
            )
        )
        1: Stmt_Expression(
            expr: Expr_Variable(
                name: b
            )
        )
        2: Stmt_Expression(
            expr: Expr_Variable(
                name: c
            )
        )
    )
)

# After
Stmt_While(
    cond: Expr_Variable(
        name: x
    )
    stmts: array(
        0: Stmt_Expression(
            expr: Expr_Variable(
                name: a
            )
        )
        1: Stmt_Block(
            stmts: array(
                0: Stmt_Expression(
                    expr: Expr_Variable(
                        name: b
                    )
                )
            )
        )
        2: Stmt_Expression(
            expr: Expr_Variable(
                name: c
            )
        )
    )
)
```

### Changes to comment assignment

Previously, comments were assigned to all nodes starting at the same position. Now they will be
assigned to the outermost node only.

```php
# Code
// Comment
$a + $b;

# Before
Stmt_Expression(
    expr: Expr_BinaryOp_Plus(
        left: Expr_Variable(
            name: a
            comments: array(
                0: // Comment
            )
        )
        right: Expr_Variable(
            name: b
        )
        comments: array(
            0: // Comment
        )
    )
    comments: array(
        0: // Comment
    )
)

# After
Stmt_Expression(
    expr: Expr_BinaryOp_Plus(
        left: Expr_Variable(
            name: a
        )
        right: Expr_Variable(
            name: b
        )
    )
    comments: array(
        0: // Comment
    )
)
```

### Renamed nodes

A number of AST nodes have been renamed or moved in the AST hierarchy:

 * `Node\Scalar\LNumber` is now `Node\Scalar\Int_`.
 * `Node\Scalar\DNumber` is now `Node\Scalar\Float_`.
 * `Node\Scalar\Encapsed` is now `Node\Scalar\InterpolatedString`.
 * `Node\Scalar\EncapsedStringPart` is now `Node\InterpolatedStringPart` and no longer extends
   `Node\Scalar` or `Node\Expr`.
 * `Node\Expr\ArrayItem` is now `Node\ArrayItem` and no longer extends `Node\Expr`.
 * `Node\Expr\ClosureUse` is now `Node\ClosureUse` and no longer extends `Node\Expr`.
 * `Node\Stmt\DeclareDeclare` is now `Node\DeclareItem` and no longer extends `Node\Stmt`.
 * `Node\Stmt\PropertyProperty` is now `Node\PropertyItem` and no longer extends `Node\Stmt`.
 * `Node\Stmt\StaticVar` is now `Node\StaticVar` and no longer extends `Node\Stmt`.
 * `Node\Stmt\UseUse` is now `Node\UseItem` and no longer extends `Node\Stmt`.

The old class names have been retained as aliases for backwards compatibility. However, the `Node::getType()` method will now always return the new name (e.g. `ClosureUse` instead of `Expr_ClosureUse`).

### Modifiers

Modifier flags (as used by the `$flags` subnode of `Class_`, `ClassMethod`, `Property`, etc.)
are now available as class constants on a separate `PhpParser\Modifiers` class, instead of being part of `PhpParser\Node\Stmt\Class_`, to make it clearer that these are used by many different nodes. The old constants are deprecated, but are still available.

```php
PhpParser\Node\Stmt\Class_::MODIFIER_PUBLIC          -> PhpParser\Modifiers::PUBLIC
PhpParser\Node\Stmt\Class_::MODIFIER_PROTECTED       -> PhpParser\Modifiers::PROTECTED
PhpParser\Node\Stmt\Class_::MODIFIER_PRIVATE         -> PhpParser\Modifiers::PRIVATE
PhpParser\Node\Stmt\Class_::MODIFIER_STATIC          -> PhpParser\Modifiers::STATIC
PhpParser\Node\Stmt\Class_::MODIFIER_ABSTRACT        -> PhpParser\Modifiers::ABSTRACT
PhpParser\Node\Stmt\Class_::MODIFIER_FINAL           -> PhpParser\Modifiers::FINAL
PhpParser\Node\Stmt\Class_::MODIFIER_READONLY        -> PhpParser\Modifiers::READONLY
PhpParser\Node\Stmt\Class_::VISIBILITY_MODIFIER_MASK -> PhpParser\Modifiers::VISIBILITY_MASK
```

### Changes to node constructors

Node constructor arguments accepting types no longer accept plain strings. Either an `Identifier` or `Name` (or `ComplexType`) should be passed instead. This affects the following constructor arguments:

* The `'returnType'` key of the `$subNodes` argument of `Node\Expr\ArrowFunction`.
* The `'returnType'` key of the `$subNodes` argument of `Node\Expr\Closure`.
* The `'returnType'` key of the `$subNodes` argument of `Node\Stmt\ClassMethod`.
* The `'returnType'` key of the `$subNodes` argument of `Node\Stmt\Function_`.
* The `$type` argument of `Node\NullableType`.
* The `$type` argument of `Node\Param`.
* The `$type` argument of `Node\Stmt\Property`.
* The `$type` argument of `Node\Stmt\ClassConst`.

To follow the previous behavior, an `Identifier` should be passed, which indicates a built-in type.
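
As a rough illustration of the constructor change, the following sketch builds a few nodes with explicit type nodes. It assumes PHP-Parser 5.x installed through Composer with `vendor/autoload.php` next to the script; the variable names and the `Foo\Bar` class name are made up for the example.

```php
<?php
// Assumption: nikic/php-parser ^5.0 installed via Composer; the autoload path is illustrative.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\Node\Expr\Variable;
use PhpParser\Node\Identifier;
use PhpParser\Node\Name;
use PhpParser\Node\NullableType;
use PhpParser\Node\Param;

// Before (4.x), a plain string was accepted:
//     new Param(new Variable('x'), null, 'int');

// After: a built-in type is passed as an Identifier ...
$builtin = new Param(new Variable('x'), null, new Identifier('int'));

// ... and a class type is passed as a Name.
$classTyped = new Param(new Variable('y'), null, new Name('Foo\Bar'));

// NullableType wraps an Identifier or Name in the same way.
$nullable = new NullableType(new Identifier('string'));

echo $builtin->type->toString(), "\n";
echo $classTyped->type->toString(), "\n";
echo $nullable->type->toString(), "\n";
```

Choosing between `Identifier` and `Name` is left to the caller here, since the constructors no longer guess from a string whether a built-in or a class type was meant.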

### Changes to the pretty printer

A number of changes to the standard pretty printer have been made, to make it match contemporary coding style conventions (and in particular PSR-12). Options to restore the previous behavior are not provided, but it is possible to override the formatting methods (such as `pStmt_ClassMethod`) with your preferred formatting.

Return types are now formatted without a space before the `:`:

```php
# Before
function test() : Type
{
}

# After
function test(): Type
{
}
```

`abstract` and `final` are now printed before visibility modifiers:

```php
# Before
public abstract function test();

# After
abstract public function test();
```

A space is now printed between `use` and the following `(` for closures:

```php
# Before
function () use($var) {
};

# After
function () use ($var) {
};
```

Backslashes in single-quoted strings are now only printed if they are necessary:

```php
# Before
'Foo\\Bar';
'\\\\';

# After
'Foo\Bar';
'\\\\';
```

`else if` structures will now omit redundant braces:

```php
# Before
else {
    if ($x) {
        // ...
    }
}

# After
else if ($x) {
    // ...
}
```

The pretty printer now has a `phpVersion` option, which takes a `PhpVersion` object and defaults to PHP 7.4. The pretty printer will make formatting choices to make the code valid for that version. It currently controls the following behavior:

* For PHP >= 7.0 (default), short array syntax `[]` will be used by default. This does not affect nodes that specify an explicit array syntax using the `kind` attribute.
* For PHP >= 7.0 (default), parentheses around `yield` expressions will only be printed when necessary. Previously, parentheses were always printed, even if `yield` was used as a statement.
* For PHP >= 7.1 (default), the short array syntax `[]` will be used for destructuring by default (instead of `list()`). This does not affect nodes that specify an explicit syntax using the `kind` attribute.
* For PHP >= 7.3 (default), a newline is no longer forced after heredoc/nowdoc strings, as the requirement for this has been removed with the introduction of flexible heredoc/nowdoc strings.
* For PHP >= 7.3 (default), heredoc/nowdoc strings are now indented just like regular code. This has been allowed since the introduction of flexible heredoc/nowdoc strings.

### Changes to precedence handling in the pretty printer

The pretty printer now more accurately models operator precedence. Especially for unary operators, fewer unnecessary parentheses will be printed. Conversely, many bugs where semantically meaningful parentheses were omitted have been fixed.

To support these changes, precedence is now handled differently in the pretty printer. The internal `p()` method, which is used to recursively print nodes, now has the following signature:

```php
protected function p(
    Node $node, int $precedence = self::MAX_PRECEDENCE, int $lhsPrecedence = self::MAX_PRECEDENCE,
    bool $parentFormatPreserved = false
): string;
```

The `$precedence` is the precedence of the direct parent operator (if any), while `$lhsPrecedence` is the precedence of the nearest binary operator on whose left-hand side the node occurs. For unary operators, only the `$lhsPrecedence` is relevant.

Recursive calls in pretty-printer methods should generally continue calling `p()` without additional parameters. However, pretty-printer methods for operators that participate in precedence resolution need to be adjusted.
For example, typical implementations for operators look as follows now:

```php
protected function pExpr_BinaryOp_Plus(
    BinaryOp\Plus $node, int $precedence, int $lhsPrecedence
): string {
    return $this->pInfixOp(
        BinaryOp\Plus::class, $node->left, ' + ', $node->right, $precedence, $lhsPrecedence);
}

protected function pExpr_UnaryPlus(
    Expr\UnaryPlus $node, int $precedence, int $lhsPrecedence
): string {
    return $this->pPrefixOp(Expr\UnaryPlus::class, '+', $node->expr, $precedence, $lhsPrecedence);
}
```

The new `$precedence` and `$lhsPrecedence` arguments need to be passed down to the `pInfixOp()`, `pPrefixOp()` and `pPostfixOp()` methods.

### Changes to the node traverser

If there are multiple visitors, the node traverser will now call `leaveNode()` and `afterTraverse()` methods in the reverse order of the corresponding `enterNode()` and `beforeTraverse()` calls:

```php
# Before
$visitor1->enterNode($node);
$visitor2->enterNode($node);
$visitor1->leaveNode($node);
$visitor2->leaveNode($node);

# After
$visitor1->enterNode($node);
$visitor2->enterNode($node);
$visitor2->leaveNode($node);
$visitor1->leaveNode($node);
```

Additionally, the special `NodeVisitor` return values have been moved from `NodeTraverser` to `NodeVisitor`. The old names are deprecated, but still available.

```php
PhpParser\NodeTraverser::REMOVE_NODE                        -> PhpParser\NodeVisitor::REMOVE_NODE
PhpParser\NodeTraverser::DONT_TRAVERSE_CHILDREN             -> PhpParser\NodeVisitor::DONT_TRAVERSE_CHILDREN
PhpParser\NodeTraverser::DONT_TRAVERSE_CURRENT_AND_CHILDREN -> PhpParser\NodeVisitor::DONT_TRAVERSE_CURRENT_AND_CHILDREN
PhpParser\NodeTraverser::STOP_TRAVERSAL                     -> PhpParser\NodeVisitor::STOP_TRAVERSAL
```

Visitors can now also be passed directly to the `NodeTraverser` constructor:

```php
# Before (and still supported)
$traverser = new NodeTraverser();
$traverser->addVisitor(new NameResolver());

# After
$traverser = new NodeTraverser(new NameResolver());
```

### Changes to token representation

Tokens are now internally represented using the `PhpParser\Token` class, which exposes the same base interface as
the `PhpToken` class introduced in PHP 8.0. On PHP 8.0 or newer, `PhpParser\Token` extends `PhpToken`, otherwise
it extends a polyfill implementation. The most important parts of the interface may be summarized as follows:

```php
class Token {
    public int $id;
    public string $text;
    public int $line;
    public int $pos;

    public function is(int|string|array $kind): bool;
}
```

The token array is now an array of `Token`s, rather than an array of arrays and strings.
Additionally, the token array is now terminated by a sentinel token with ID 0.

### Changes to the lexer

The lexer API is reduced to a single `Lexer::tokenize()` method, which returns an array of tokens. The `startLexing()` and `getNextToken()` methods have been removed.

Responsibility for determining start and end attributes for nodes has been moved from the lexer to the parser. The lexer no longer accepts an options array.
The `usedAttributes` option has been removed without replacement, and the parser will now unconditionally add the `comments`, `startLine`, `endLine`, `startFilePos`, `endFilePos`, `startTokenPos` and `endTokenPos` attributes.

There should no longer be a need to directly interact with the `Lexer` for end users, as the `ParserFactory` will create an appropriate instance, and no additional configuration of the lexer is necessary. To use formatting-preserving pretty printing, the setup boilerplate changes as follows:

```php
# Before

$lexer = new Lexer\Emulative([
    'usedAttributes' => [
        'comments',
        'startLine', 'endLine',
        'startTokenPos', 'endTokenPos',
    ],
]);

$parser = new Parser\Php7($lexer);
$oldStmts = $parser->parse($code);
$oldTokens = $lexer->getTokens();

$traverser = new NodeTraverser();
$traverser->addVisitor(new NodeVisitor\CloningVisitor());
$newStmts = $traverser->traverse($oldStmts);

# After

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$oldStmts = $parser->parse($code);
$oldTokens = $parser->getTokens();

$traverser = new NodeTraverser(new NodeVisitor\CloningVisitor());
$newStmts = $traverser->traverse($oldStmts);
```

### Miscellaneous changes

 * The deprecated `Builder\Param::setTypeHint()` method has been removed in favor of `Builder\Param::setType()`.
 * The deprecated `Error` constructor taking a start line has been removed. Pass `['startLine' => $startLine]` attributes instead.
 * The deprecated `Comment::getLine()`, `Comment::getTokenPos()` and `Comment::getFilePos()` methods have been removed. Use `Comment::getStartLine()`, `Comment::getStartTokenPos()` and `Comment::getStartFilePos()` instead.
 * `Comment::getReformattedText()` now normalizes CRLF newlines to LF newlines.
 * The `Node::getLine()` method has been deprecated. Use `Node::getStartLine()` instead.
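
Two of the replacements listed above can be sketched as follows. The snippet assumes PHP-Parser 5.x installed through Composer with `vendor/autoload.php` next to the script; the message text, line numbers and variable names are illustrative only.

```php
<?php
// Assumption: nikic/php-parser ^5.0 installed via Composer; the autoload path is illustrative.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\Comment;
use PhpParser\Error;

// Before (4.x): new Error('Something went wrong', 12);
// After: the start line is passed as part of the attributes array.
$error = new Error('Something went wrong', ['startLine' => 12]);
$errorLine = $error->getStartLine();

// Before (4.x): $comment->getLine();
// After: use the getStart*() accessors. The second constructor argument is the start line.
$comment = new Comment("/**\r\n * Doc\r\n */", 3);
$commentLine = $comment->getStartLine();

// getReformattedText() is described above as normalizing CRLF newlines to LF.
$reformatted = $comment->getReformattedText();

echo $errorLine, "\n";
echo $commentLine, "\n";
```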