Walking the AST
===============

The most common way to work with the AST is by using a node traverser and one or more node visitors.
As a basic example, the following code changes all literal integers in the AST into strings (e.g.,
`42` becomes `'42'`).

```php
use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract};

$traverser = new NodeTraverser;
$traverser->addVisitor(new class extends NodeVisitorAbstract {
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Scalar\Int_) {
            return new Node\Scalar\String_((string) $node->value);
        }
    }
});

$stmts = ...;
$modifiedStmts = $traverser->traverse($stmts);
```
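
To try the example end to end, the sketch below supplies a parsed snippet for `$stmts` and prints the result. It assumes PHP-Parser 5.x installed through Composer (hence the `vendor/autoload.php` path); the input code and the `$output` variable are illustrative only.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory, PrettyPrinter};

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse('<?php echo 42 + 1;'); // illustrative input

$traverser = new NodeTraverser();
$traverser->addVisitor(new class extends NodeVisitorAbstract {
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Scalar\Int_) {
            // Replace the integer literal with a string literal of the same digits.
            return new Node\Scalar\String_((string) $node->value);
        }
        return null; // leave every other node unchanged
    }
});

$modifiedStmts = $traverser->traverse($stmts);
$output = (new PrettyPrinter\Standard())->prettyPrint($modifiedStmts);
echo $output, "\n";
```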

Visitors can either be passed to the `NodeTraverser` constructor or added using `addVisitor()`:

```php
$traverser = new NodeTraverser($visitor1, $visitor2, $visitor3);

// Equivalent to:
$traverser = new NodeTraverser();
$traverser->addVisitor($visitor1);
$traverser->addVisitor($visitor2);
$traverser->addVisitor($visitor3);
```

Node visitors
-------------

Each node visitor implements an interface with the following four methods:

```php
interface NodeVisitor {
    public function beforeTraverse(array $nodes);
    public function enterNode(Node $node);
    public function leaveNode(Node $node);
    public function afterTraverse(array $nodes);
}
```

The `beforeTraverse()` and `afterTraverse()` methods are called before and after the traversal
respectively, and are passed the entire AST. They can be used to perform any necessary state
setup or cleanup.

The `enterNode()` method is called when a node is first encountered, before its children are
processed ("preorder"). The `leaveNode()` method is called after all children have been visited
("postorder").

For example, if we have the following excerpt of an AST

```
Expr_FuncCall(
    name: Name(
        name: printLine
    )
    args: array(
        0: Arg(
            name: null
            value: Scalar_String(
                value: Hello World!!!
            )
            byRef: false
            unpack: false
        )
    )
)
```

then the enter/leave methods will be called in the following order:

```
enterNode(Expr_FuncCall)
enterNode(Name)
leaveNode(Name)
enterNode(Arg)
enterNode(Scalar_String)
leaveNode(Scalar_String)
leaveNode(Arg)
leaveNode(Expr_FuncCall)
```
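
The call order can be checked with a small logging visitor. This is a minimal sketch assuming PHP-Parser 5.x and a Composer autoloader; the `$logger` visitor and its `$log` property are names made up for this example. Only the call expression is traversed, so the surrounding `Stmt_Expression` does not show up in the log.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory};

// Hypothetical helper visitor that records every enter/leave call in order.
$logger = new class extends NodeVisitorAbstract {
    public array $log = [];

    public function enterNode(Node $node) {
        $this->log[] = 'enterNode(' . $node->getType() . ')';
        return null;
    }

    public function leaveNode(Node $node) {
        $this->log[] = 'leaveNode(' . $node->getType() . ')';
        return null;
    }
};

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse("<?php printLine('Hello World!!!');");

// Traverse only the function call, not the expression statement wrapping it.
(new NodeTraverser($logger))->traverse([$stmts[0]->expr]);
echo implode("\n", $logger->log), "\n";
```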

A common pattern is that `enterNode` is used to collect some information and then `leaveNode`
performs modifications based on that. By the time `leaveNode` is called, all the code inside
the node has already been visited and the necessary information collected.
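
A minimal sketch of this pattern, assuming PHP-Parser 5.x and a Composer autoloader: `enterNode` counts the `return` statements inside each function, and `leaveNode` stores the total on the function node once all of its children have been visited. The `returnCount` attribute and the `$counters` stack are hypothetical names, and returns inside nested closures are counted towards the enclosing function for simplicity.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeFinder, NodeTraverser, NodeVisitorAbstract, ParserFactory};

$visitor = new class extends NodeVisitorAbstract {
    /** @var int[] one counter per function currently being visited */
    private array $counters = [];

    public function enterNode(Node $node) {
        if ($node instanceof Node\Stmt\Function_) {
            $this->counters[] = 0; // start collecting for this function
        } elseif ($node instanceof Node\Stmt\Return_ && $this->counters !== []) {
            $this->counters[count($this->counters) - 1]++;
        }
        return null;
    }

    public function leaveNode(Node $node) {
        if ($node instanceof Node\Stmt\Function_) {
            // All children have been visited, so the count is complete.
            // 'returnCount' is a hypothetical attribute name chosen for this sketch.
            $node->setAttribute('returnCount', array_pop($this->counters));
        }
        return null;
    }
};

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse('<?php function f($x) { if ($x) { return 1; } return 2; }');
$stmts = (new NodeTraverser($visitor))->traverse($stmts);

$function = (new NodeFinder())->findFirstInstanceOf($stmts, Node\Stmt\Function_::class);
echo $function->getAttribute('returnCount'), "\n";
```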

As you usually do not want to implement all four methods, it is recommended that you extend
`NodeVisitorAbstract` instead of implementing the interface directly. The abstract class provides
empty default implementations.

Modifying the AST
-----------------

There are a number of ways in which the AST can be modified from inside a node visitor. The first
and simplest is to change AST properties directly inside the visitor:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Scalar\Int_) {
        // increment all integer literals
        $node->value++;
    }
}
```

The second is to replace a node entirely by returning a new node:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Expr\BinaryOp\BooleanAnd) {
        // Convert all $a && $b expressions into !($a && $b)
        return new Node\Expr\BooleanNot($node);
    }
}
```

Doing this is supported inside both `enterNode` and `leaveNode`. However, you have to be mindful
about where you perform the replacement: If a node is replaced in `enterNode`, then the recursive
traversal will also consider the children of the new node. If you aren't careful, this can lead to
infinite recursion. For example, let's take the previous code sample and use `enterNode` instead:

```php
public function enterNode(Node $node) {
    if ($node instanceof Node\Expr\BinaryOp\BooleanAnd) {
        // Convert all $a && $b expressions into !($a && $b)
        return new Node\Expr\BooleanNot($node);
    }
}
```

Now `$a && $b` will be replaced by `!($a && $b)`. Then the traverser will go into the first (and
only) child of `!($a && $b)`, which is `$a && $b`. The transformation applies again and we end up
with `!!($a && $b)`. This will continue until PHP hits the memory limit.
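
If the replacement has to happen in `enterNode`, one possible guard is to mark nodes that were already wrapped. The sketch below assumes PHP-Parser 5.x with a Composer autoloader; `negated` is a hypothetical attribute name, not something the library defines.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory, PrettyPrinter};

$visitor = new class extends NodeVisitorAbstract {
    public function enterNode(Node $node) {
        if ($node instanceof Node\Expr\BinaryOp\BooleanAnd
            && !$node->getAttribute('negated', false)
        ) {
            // Mark the node first, so that it is skipped when the traverser
            // descends into the newly created BooleanNot and meets it again.
            $node->setAttribute('negated', true);
            return new Node\Expr\BooleanNot($node);
        }
        return null;
    }
};

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse('<?php $a && $b;');
$stmts = (new NodeTraverser($visitor))->traverse($stmts);
echo (new PrettyPrinter\Standard())->prettyPrint($stmts), "\n";
```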

Finally, there are three special replacement types. The first is removal of a node:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Stmt\Return_) {
        // Remove all return statements
        return NodeVisitor::REMOVE_NODE;
    }
}
```

Node removal only works if the parent structure is an array. This means that usually it only makes
sense to remove nodes of type `Node\Stmt`, as they always occur inside statement lists (and a few
more node types like `Arg` or `Expr\ArrayItem`, which are also always part of lists).

On the other hand, removing a `Node\Expr` does not make sense: If you have `$a * $b`, there is no
meaningful way in which the `$a` part could be removed. If you want to remove an expression, you
generally want to remove it together with a surrounding expression statement:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Stmt\Expression
        && $node->expr instanceof Node\Expr\FuncCall
        && $node->expr->name instanceof Node\Name
        && $node->expr->name->toString() === 'var_dump'
    ) {
        return NodeVisitor::REMOVE_NODE;
    }
}
```

This example will remove all calls to `var_dump()` which occur as expression statements. This means
that `var_dump($a);` will be removed, but `if (var_dump($a))` will not be removed (and there is no
obvious way in which it can be removed).
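
The difference between the two cases can be seen by running the visitor on a small input. This is a sketch assuming PHP-Parser 5.x with a Composer autoloader; the sample code and variable names are illustrative.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeTraverser, NodeVisitor, NodeVisitorAbstract, ParserFactory, PrettyPrinter};

$visitor = new class extends NodeVisitorAbstract {
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Stmt\Expression
            && $node->expr instanceof Node\Expr\FuncCall
            && $node->expr->name instanceof Node\Name
            && $node->expr->name->toString() === 'var_dump'
        ) {
            return NodeVisitor::REMOVE_NODE;
        }
        return null;
    }
};

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse('<?php var_dump($a); if (var_dump($a)) { echo 1; }');
$stmts = (new NodeTraverser($visitor))->traverse($stmts);

// Only the standalone call is gone; the call in the if condition is still there.
echo (new PrettyPrinter\Standard())->prettyPrint($stmts), "\n";
```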

Another way to remove nodes is to replace them with `null`. For example, all `else` statements could
be removed as follows:

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Stmt\Else_) {
        return NodeVisitor::REPLACE_WITH_NULL;
    }
}
```

This is only safe to do if the subnode the node is stored in is nullable. `Node\Stmt\Else_` only
occurs inside `Node\Stmt\If_::$else`, which is nullable, so this particular replacement is safe.

Besides removing nodes, it is also possible to replace one node with multiple nodes. This
only works if the parent structure is an array.

```php
public function leaveNode(Node $node) {
    if ($node instanceof Node\Stmt\Return_ && $node->expr !== null) {
        // Convert "return foo();" into "$retval = foo(); return $retval;"
        $var = new Node\Expr\Variable('retval');
        return [
            new Node\Stmt\Expression(new Node\Expr\Assign($var, $node->expr)),
            new Node\Stmt\Return_($var),
        ];
    }
}
```
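
Run against a small function, this replacement turns a single statement into two. The following is a sketch assuming PHP-Parser 5.x with a Composer autoloader; the sample function `f()` and the `$body` variable are made up for illustration.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory, PrettyPrinter};

$visitor = new class extends NodeVisitorAbstract {
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Stmt\Return_ && $node->expr !== null) {
            $var = new Node\Expr\Variable('retval');
            // Returning an array splices both statements into the parent list.
            return [
                new Node\Stmt\Expression(new Node\Expr\Assign($var, $node->expr)),
                new Node\Stmt\Return_($var),
            ];
        }
        return null;
    }
};

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse('<?php function f() { return foo(); }');
$stmts = (new NodeTraverser($visitor))->traverse($stmts);

// The function body now holds an assignment followed by a return.
$body = $stmts[0]->stmts;
echo count($body), "\n";
echo (new PrettyPrinter\Standard())->prettyPrint($stmts), "\n";
```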

Short-circuiting traversal
--------------------------

An AST can easily contain thousands of nodes, and traversing over all of them may be slow,
especially if you have more than one visitor. In some cases, it is possible to avoid a full
traversal.

If you are looking for all class declarations in a file (and assuming you're not interested in
anonymous classes), you know that once you've seen a class declaration, there is no point in also
checking all its child nodes, because PHP does not allow nesting classes. In this case, you can
instruct the traverser to not recurse into the class node:

```php
private $classes = [];
public function enterNode(Node $node) {
    if ($node instanceof Node\Stmt\Class_) {
        $this->classes[] = $node;
        return NodeVisitor::DONT_TRAVERSE_CHILDREN;
    }
}
```

Of course, this option is only available in `enterNode`, because it's already too late by the time
`leaveNode` is reached.

If you are only looking for one specific node, it is also possible to abort the traversal entirely
after finding it. For example, if you are looking for the node of a class with a certain name (and
discounting exotic cases like conditionally defining a class two times), you can stop traversal
once you have found it:

```php
private $class = null;
public function enterNode(Node $node) {
    if ($node instanceof Node\Stmt\Class_ &&
        $node->namespacedName->toString() === 'Foo\Bar\Baz'
    ) {
        $this->class = $node;
        return NodeVisitor::STOP_TRAVERSAL;
    }
}
```

This works both in `enterNode` and `leaveNode`. Note that this particular case can also be more
easily handled using a `NodeFinder`, which will be introduced below.
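
`namespacedName` is only populated when the `NameResolver` visitor runs, so a complete version of this lookup would typically register it ahead of the finding visitor. The sketch below assumes PHP-Parser 5.x with a Composer autoloader; the `$finder` visitor, its `$visited` log and the sample classes are hypothetical.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeTraverser, NodeVisitor, NodeVisitorAbstract, ParserFactory};
use PhpParser\NodeVisitor\NameResolver;

$finder = new class extends NodeVisitorAbstract {
    public ?Node\Stmt\Class_ $class = null;
    public array $visited = []; // short names of all classes seen before stopping

    public function enterNode(Node $node) {
        if (!$node instanceof Node\Stmt\Class_) {
            return null;
        }
        $this->visited[] = (string) $node->name;
        // isset() also covers classes for which no namespaced name was set.
        if (isset($node->namespacedName)
            && $node->namespacedName->toString() === 'Foo\Bar\Baz'
        ) {
            $this->class = $node;
            return NodeVisitor::STOP_TRAVERSAL;
        }
        return null;
    }
};

$code = <<<'CODE'
<?php
namespace Foo\Bar;

class Qux {}
class Baz {}
class Later {}
CODE;

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse($code);

// NameResolver is registered first, so namespacedName is set before $finder runs.
(new NodeTraverser(new NameResolver(), $finder))->traverse($stmts);
echo implode(', ', $finder->visited), "\n";
```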

Multiple visitors
-----------------

A single traverser can be used with multiple visitors:

```php
$traverser = new NodeTraverser;
$traverser->addVisitor($visitorA);
$traverser->addVisitor($visitorB);
$stmts = $traverser->traverse($stmts);
```

It is important to understand that if a traverser is run with multiple visitors, the visitors will
be interleaved. Given the following AST excerpt

```
Stmt_Return(
    expr: Expr_Variable(
        name: foobar
    )
)
```

the following method calls will be performed:

```php
$visitorA->enterNode(Stmt_Return)
$visitorB->enterNode(Stmt_Return)
$visitorA->enterNode(Expr_Variable)
$visitorB->enterNode(Expr_Variable)
$visitorB->leaveNode(Expr_Variable)
$visitorA->leaveNode(Expr_Variable)
$visitorB->leaveNode(Stmt_Return)
$visitorA->leaveNode(Stmt_Return)
```

That is, when visiting a node, `enterNode()` and `leaveNode()` will always be called for all
visitors, with the `leaveNode()` calls happening in the reverse order of the `enterNode()` calls.
Running multiple visitors in parallel improves performance, as the AST only has to be traversed
once. However, it is not always possible to write visitors in a way that allows interleaved
execution. In this case, you can always fall back to performing multiple traversals:

```php
$traverserA = new NodeTraverser;
$traverserA->addVisitor($visitorA);
$traverserB = new NodeTraverser;
$traverserB->addVisitor($visitorB);
$stmts = $traverserA->traverse($stmts);
$stmts = $traverserB->traverse($stmts);
```

When using multiple visitors, it is important to understand how they interact with the various
special enterNode/leaveNode return values:

 * If *any* visitor returns `DONT_TRAVERSE_CHILDREN`, the children will be skipped for *all*
   visitors.
 * If *any* visitor returns `DONT_TRAVERSE_CURRENT_AND_CHILDREN`, the children will be skipped for
   *all* visitors, and all *subsequent* visitors will not visit the current node.
 * If *any* visitor returns `STOP_TRAVERSAL`, traversal is stopped for *all* visitors.
 * If a visitor returns a replacement node, subsequent visitors will be passed the replacement node,
   not the original one.
 * If a visitor returns `REMOVE_NODE`, subsequent visitors will not see this node.
 * If a visitor returns `REPLACE_WITH_NULL`, subsequent visitors will not see this node.
 * If a visitor returns an array of replacement nodes, subsequent visitors will see neither the node
   that was replaced, nor the replacement nodes.
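
The first rule can be observed with two small visitors, one that skips class bodies and one that merely records what it sees. This is a sketch assuming PHP-Parser 5.x with a Composer autoloader; `$skipper` and `$recorder` are hypothetical names.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeTraverser, NodeVisitor, NodeVisitorAbstract, ParserFactory};

// Tells the traverser not to descend into class declarations.
$skipper = new class extends NodeVisitorAbstract {
    public function enterNode(Node $node) {
        if ($node instanceof Node\Stmt\Class_) {
            return NodeVisitor::DONT_TRAVERSE_CHILDREN;
        }
        return null;
    }
};

// Records the type of every node it gets to enter.
$recorder = new class extends NodeVisitorAbstract {
    public array $types = [];

    public function enterNode(Node $node) {
        $this->types[] = $node->getType();
        return null;
    }
};

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse('<?php class A { public function m() {} } function f() {}');
(new NodeTraverser($skipper, $recorder))->traverse($stmts);

// The recorder enters the class itself, but never the method inside it.
echo implode(', ', $recorder->types), "\n";
```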

Simple node finding
-------------------

While the node visitor mechanism is very flexible, creating a node visitor can be overly cumbersome
for minor tasks. For this reason a `NodeFinder` is provided, which can find AST nodes that either
satisfy a certain callback, or which are instances of a certain node type. A few examples:

```php
use PhpParser\{Node, NodeFinder};

$nodeFinder = new NodeFinder;

// Find all class nodes.
$classes = $nodeFinder->findInstanceOf($stmts, Node\Stmt\Class_::class);

// Find all classes that extend another class
$extendingClasses = $nodeFinder->find($stmts, function(Node $node) {
    return $node instanceof Node\Stmt\Class_
        && $node->extends !== null;
});

// Find first class occurring in the AST. Returns null if no class exists.
$class = $nodeFinder->findFirstInstanceOf($stmts, Node\Stmt\Class_::class);

// Find first class that has name $name
// (namespacedName is set by the NameResolver visitor)
$class = $nodeFinder->findFirst($stmts, function(Node $node) use ($name) {
    return $node instanceof Node\Stmt\Class_
        && $node->namespacedName->toString() === $name;
});
```
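
Looking a class up by its namespaced name presupposes that name resolution has already run. The following is a minimal sketch, assuming PHP-Parser 5.x with a Composer autoloader and using the bundled `NameResolver` visitor; the sample code and the `App\B` name are illustrative.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeFinder, NodeTraverser, ParserFactory};
use PhpParser\NodeVisitor\NameResolver;

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse('<?php namespace App; class A {} class B extends A {}');

// Resolve names first, so that class nodes carry a namespacedName.
$stmts = (new NodeTraverser(new NameResolver()))->traverse($stmts);

$nodeFinder = new NodeFinder();
$classes = $nodeFinder->findInstanceOf($stmts, Node\Stmt\Class_::class);

$name = 'App\B'; // illustrative lookup target
$class = $nodeFinder->findFirst($stmts, function (Node $node) use ($name) {
    return $node instanceof Node\Stmt\Class_
        && isset($node->namespacedName)
        && $node->namespacedName->toString() === $name;
});

echo count($classes), "\n";
echo $class->name->toString(), "\n";
```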

Internally, the `NodeFinder` also uses a node traverser. It only simplifies the interface for a
common use case.

Parent and sibling references
-----------------------------

The node visitor mechanism is somewhat rigid, in that it prescribes an order in which nodes should
be accessed: From parents to children. However, it can often be convenient to operate in the
reverse direction: When working on a node, you might want to check if the parent node satisfies a
certain property.

PHP-Parser does not add parent (or sibling) references to nodes by default, but you can enable them
using the `ParentConnectingVisitor` or `NodeConnectingVisitor`. See the [FAQ](FAQ.markdown) for
more information.
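
As a rough illustration, the sketch below uses `ParentConnectingVisitor` to find variables that are returned directly. It assumes PHP-Parser 5.x with a Composer autoloader and the visitor's default behaviour of storing the parent node in the `parent` attribute; the sample code is made up.

```php
<?php
require 'vendor/autoload.php'; // assumption: PHP-Parser 5.x installed via Composer

use PhpParser\{Node, NodeFinder, NodeTraverser, ParserFactory};
use PhpParser\NodeVisitor\ParentConnectingVisitor;

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse('<?php function f($x) { $y = $x; return $y; }');

// First pass: attach a 'parent' attribute to every non-root node.
$stmts = (new NodeTraverser(new ParentConnectingVisitor()))->traverse($stmts);

// Second pass: pick the variables whose direct parent is a return statement.
$returned = (new NodeFinder())->find($stmts, function (Node $node) {
    return $node instanceof Node\Expr\Variable
        && $node->getAttribute('parent') instanceof Node\Stmt\Return_;
});

foreach ($returned as $variable) {
    echo $variable->name, "\n";
}
```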