Walking the AST
===============

The most common way to work with the AST is by using a node traverser and one or more node visitors.
As a basic example, the following code changes all literal integers in the AST into strings (e.g.,
`42` becomes `'42'`).

```php
use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract};

$traverser = new NodeTraverser;
$traverser->addVisitor(new class extends NodeVisitorAbstract {
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Scalar\Int_) {
            return new Node\Scalar\String_((string) $node->value);
        }
    }
});

$stmts = ...;
$modifiedStmts = $traverser->traverse($stmts);
```

Visitors can either be passed to the `NodeTraverser` constructor, or added using `addVisitor()`:

```php
$traverser = new NodeTraverser($visitor1, $visitor2, $visitor3);

// Equivalent to:
$traverser = new NodeTraverser();
$traverser->addVisitor($visitor1);
$traverser->addVisitor($visitor2);
$traverser->addVisitor($visitor3);
```

Node visitors
-------------

Each node visitor implements an interface with the following four methods:

```php
interface NodeVisitor {
    public function beforeTraverse(array $nodes);
    public function enterNode(Node $node);
    public function leaveNode(Node $node);
    public function afterTraverse(array $nodes);
}
```

The `beforeTraverse()` and `afterTraverse()` methods are called before and after the traversal
respectively, and are passed the entire AST. They can be used to perform any necessary state
setup or cleanup.

The `enterNode()` method is called when a node is first encountered, before its children are
processed ("preorder"). The `leaveNode()` method is called after all children have been visited
("postorder").
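
One way to observe this order directly is a small tracing visitor that records every call. The
following is a minimal sketch, assuming PHP-Parser 5.x installed through Composer (the
`vendor/autoload.php` path is an assumption about your project layout). `TraceVisitor` is a
hypothetical helper name, not part of the library.

```php
<?php
// Assumption: nikic/php-parser 5.x is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory};

// Hypothetical helper (not part of PHP-Parser): logs each enter/leave call.
final class TraceVisitor extends NodeVisitorAbstract {
    /** @var string[] One entry per visitor call, in call order. */
    public array $log = [];

    public function enterNode(Node $node) {
        $this->log[] = 'enterNode(' . $node->getType() . ')';
        return null; // null leaves the node unchanged
    }

    public function leaveNode(Node $node) {
        $this->log[] = 'leaveNode(' . $node->getType() . ')';
        return null;
    }
}

$parser = (new ParserFactory)->createForNewestSupportedVersion();
$stmts = $parser->parse("<?php printLine('Hello World!!!');");

$visitor = new TraceVisitor;
(new NodeTraverser($visitor))->traverse($stmts);

// Print the recorded calls, one per line.
echo implode("\n", $visitor->log), "\n";
```

Each entry in `$visitor->log` corresponds to one `enterNode()` or `leaveNode()` call, in the order
in which the traverser made it.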

For example, if we have the following excerpt of an AST

```
Expr_FuncCall(
    name: Name(
        name: printLine
    )
    args: array(
        0: Arg(
            name: null
            value: Scalar_String(
                value: Hello World!!!
            )
            byRef: false
            unpack: false
        )
    )
)
```

then the enter/leave methods will be called in the following order:

```
enterNode(Expr_FuncCall)
enterNode(Name)
leaveNode(Name)
enterNode(Arg)
enterNode(Scalar_String)
leaveNode(Scalar_String)
leaveNode(Arg)
leaveNode(Expr_FuncCall)
```

A common pattern is that `enterNode` is used to collect some information and then `leaveNode`
performs modifications based on that. At the time when `leaveNode` is called, all the code inside
the node will have already been visited and necessary information collected.

As you usually do not want to implement all four methods, it is recommended that you extend
`NodeVisitorAbstract` instead of implementing the interface directly. The abstract class provides
empty default implementations.

Modifying the AST
-----------------

There are a number of ways in which the AST can be modified from inside a node visitor. The first
and simplest is to change AST properties inside the visitor:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Scalar\Int_) {
        // increment all integer literals
        $node->value++;
    }
}
```

The second is to replace a node entirely by returning a new node:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Expr\BinaryOp\BooleanAnd) {
        // Convert all $a && $b expressions into !($a && $b)
        return new Node\Expr\BooleanNot($node);
    }
}
```

Doing this is supported both inside `enterNode` and `leaveNode`.
However, you have to be mindful about
where you perform the replacement: If a node is replaced in `enterNode`, then the recursive traversal
will also consider the children of the new node. If you aren't careful, this can lead to infinite
recursion. For example, let's take the previous code sample and use `enterNode` instead:

```php
public function enterNode(Node $node) {
    if ($node instanceof Node\Expr\BinaryOp\BooleanAnd) {
        // Convert all $a && $b expressions into !($a && $b)
        return new Node\Expr\BooleanNot($node);
    }
}
```

Now `$a && $b` will be replaced by `!($a && $b)`. Then the traverser will go into the first (and
only) child of `!($a && $b)`, which is `$a && $b`. The transformation applies again and we end up
with `!!($a && $b)`. This will continue until PHP hits the memory limit.

Finally, there are three special replacement types. The first is removal of a node:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Stmt\Return_) {
        // Remove all return statements
        return NodeVisitor::REMOVE_NODE;
    }
}
```

Node removal only works if the parent structure is an array. This means that usually it only makes
sense to remove nodes of type `Node\Stmt`, as they always occur inside statement lists (and a few
more node types like `Arg` or `Expr\ArrayItem`, which are also always part of lists).

On the other hand, removing a `Node\Expr` does not make sense: If you have `$a * $b`, there is no
meaningful way in which the `$a` part could be removed.
If you want to remove an expression, you
generally want to remove it together with a surrounding expression statement:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Stmt\Expression
        && $node->expr instanceof Node\Expr\FuncCall
        && $node->expr->name instanceof Node\Name
        && $node->expr->name->toString() === 'var_dump'
    ) {
        return NodeVisitor::REMOVE_NODE;
    }
}
```

This example will remove all calls to `var_dump()` which occur as expression statements. This means
that `var_dump($a);` will be removed, but `if (var_dump($a))` will not be removed (and there is no
obvious way in which it can be removed).

Another way to remove nodes is to replace them with `null`. For example, all `else` statements could
be removed as follows:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Stmt\Else_) {
        return NodeVisitor::REPLACE_WITH_NULL;
    }
}
```

This is only safe to do if the subnode the node is stored in is nullable. `Node\Stmt\Else_` only
occurs inside `Node\Stmt\If_::$else`, which is nullable, so this particular replacement is safe.

Besides removing nodes, it is also possible to replace one node with multiple nodes. This
only works if the parent structure is an array.

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Stmt\Return_ && $node->expr !== null) {
        // Convert "return foo();" into "$retval = foo(); return $retval;"
        $var = new Node\Expr\Variable('retval');
        return [
            new Node\Stmt\Expression(new Node\Expr\Assign($var, $node->expr)),
            new Node\Stmt\Return_($var),
        ];
    }
}
```

Short-circuiting traversal
--------------------------

An AST can easily contain thousands of nodes, and traversing over all of them may be slow,
especially if you have more than one visitor. In some cases, it is possible to avoid a full
traversal.

If you are looking for all class declarations in a file (and assuming you're not interested in
anonymous classes), you know that once you've seen a class declaration, there is no point in also
checking all its child nodes, because PHP does not allow nesting classes. In this case, you can
instruct the traverser to not recurse into the class node:

```php
private $classes = [];
public function enterNode(Node $node) {
    if ($node instanceof Node\Stmt\Class_) {
        $this->classes[] = $node;
        return NodeVisitor::DONT_TRAVERSE_CHILDREN;
    }
}
```

Of course, this option is only available in `enterNode`, because it's already too late by the time
`leaveNode` is reached.

If you are only looking for one specific node, it is also possible to abort the traversal entirely
after finding it. For example, if you are looking for the node of a class with a certain name (and
discounting exotic cases like conditionally defining a class two times), you can stop traversal
once you have found it:

```php
private $class = null;
public function enterNode(Node $node) {
    if ($node instanceof Node\Stmt\Class_ &&
        $node->namespacedName->toString() === 'Foo\Bar\Baz'
    ) {
        $this->class = $node;
        return NodeVisitor::STOP_TRAVERSAL;
    }
}
```

This works both in `enterNode` and `leaveNode`. Note that this particular case can also be more easily
handled using a `NodeFinder`, which will be introduced below.

Multiple visitors
-----------------

A single traverser can be used with multiple visitors:

```php
$traverser = new NodeTraverser;
$traverser->addVisitor($visitorA);
$traverser->addVisitor($visitorB);
$stmts = $traverser->traverse($stmts);
```

It is important to understand that if a traverser is run with multiple visitors, the visitors will
be interleaved.
Given the following AST excerpt

```
Stmt_Return(
    expr: Expr_Variable(
        name: foobar
    )
)
```

the following method calls will be performed:

```php
$visitorA->enterNode(Stmt_Return)
$visitorB->enterNode(Stmt_Return)
$visitorA->enterNode(Expr_Variable)
$visitorB->enterNode(Expr_Variable)
$visitorB->leaveNode(Expr_Variable)
$visitorA->leaveNode(Expr_Variable)
$visitorB->leaveNode(Stmt_Return)
$visitorA->leaveNode(Stmt_Return)
```

That is, when visiting a node, `enterNode()` and `leaveNode()` will always be called for all
visitors, with the `leaveNode()` calls happening in the reverse order of the `enterNode()` calls.
Running multiple visitors in parallel improves performance, as the AST only has to be traversed
once. However, it is not always possible to write visitors in a way that allows interleaved
execution. In this case, you can always fall back to performing multiple traversals:

```php
$traverserA = new NodeTraverser;
$traverserA->addVisitor($visitorA);
$traverserB = new NodeTraverser;
$traverserB->addVisitor($visitorB);
$stmts = $traverserA->traverse($stmts);
$stmts = $traverserB->traverse($stmts);
```

When using multiple visitors, it is important to understand how they interact with the various
special `enterNode`/`leaveNode` return values:

 * If *any* visitor returns `DONT_TRAVERSE_CHILDREN`, the children will be skipped for *all*
   visitors.
 * If *any* visitor returns `DONT_TRAVERSE_CURRENT_AND_CHILDREN`, the children will be skipped for *all*
   visitors, and all *subsequent* visitors will not visit the current node.
 * If *any* visitor returns `STOP_TRAVERSAL`, traversal is stopped for *all* visitors.
 * If a visitor returns a replacement node, subsequent visitors will be passed the replacement node,
   not the original one.
 * If a visitor returns `REMOVE_NODE`, subsequent visitors will not see this node.
 * If a visitor returns `REPLACE_WITH_NULL`, subsequent visitors will not see this node.
 * If a visitor returns an array of replacement nodes, subsequent visitors will see neither the node
   that was replaced, nor the replacement nodes.

Simple node finding
-------------------

While the node visitor mechanism is very flexible, creating a node visitor can be overly cumbersome
for minor tasks. For this reason a `NodeFinder` is provided, which can find AST nodes that either
satisfy a certain callback, or which are instances of a certain node type. A few examples follow:

```php
use PhpParser\{Node, NodeFinder};

$nodeFinder = new NodeFinder;

// Find all class nodes.
$classes = $nodeFinder->findInstanceOf($stmts, Node\Stmt\Class_::class);

// Find all classes that extend another class.
$extendingClasses = $nodeFinder->find($stmts, function(Node $node) {
    return $node instanceof Node\Stmt\Class_
        && $node->extends !== null;
});

// Find first class occurring in the AST. Returns null if no class exists.
$class = $nodeFinder->findFirstInstanceOf($stmts, Node\Stmt\Class_::class);

// Find first class that has name $name (namespacedName is set by the NameResolver visitor).
$class = $nodeFinder->findFirst($stmts, function(Node $node) use ($name) {
    return $node instanceof Node\Stmt\Class_
        && $node->namespacedName->toString() === $name;
});
```

Internally, the `NodeFinder` also uses a node traverser. It only simplifies the interface for a
common use case.

Parent and sibling references
-----------------------------

The node visitor mechanism is somewhat rigid, in that it prescribes an order in which nodes should
be accessed: From parents to children. However, it can often be convenient to operate in the
reverse direction: When working on a node, you might want to check if the parent node satisfies a
certain property.

PHP-Parser does not add parent (or sibling) references to nodes by default, but you can enable them
using the `ParentConnectingVisitor` or `NodeConnectingVisitor`. See the [FAQ](FAQ.markdown) for
more information.
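
As a rough illustration, the sketch below runs a `ParentConnectingVisitor` in a first pass and then
reads the parent of a `return` statement through the `parent` attribute. It assumes PHP-Parser 5.x
installed through Composer with default visitor options; the `vendor/autoload.php` path and the
sample source string are assumptions made for this example.

```php
<?php
// Assumption: nikic/php-parser 5.x is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\{Node, NodeFinder, NodeTraverser, ParserFactory};
use PhpParser\NodeVisitor\ParentConnectingVisitor;

$parser = (new ParserFactory)->createForNewestSupportedVersion();
$stmts = $parser->parse('<?php function f() { return 42; }');

// First pass: store each node's parent in its "parent" attribute.
$stmts = (new NodeTraverser(new ParentConnectingVisitor))->traverse($stmts);

// Afterwards any node can look upwards, here from the first return statement.
$return = (new NodeFinder)->findFirstInstanceOf($stmts, Node\Stmt\Return_::class);
$parent = $return->getAttribute('parent');

// Print the node type of the parent.
echo $parent->getType(), "\n";
```

Connecting parents in a separate traversal keeps the references complete before any other visitor
relies on them. Top-level nodes have no parent, so `getAttribute('parent')` returns `null` for them.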