######
 zval
######

PHP is a dynamic language. A variable can typically contain a value of any type, and the type of the
variable may even change during the execution of the program. Under the hood, this is implemented
through the ``zval`` struct. It is one of the most important data structures in php-src. It is
implemented as a "tagged union", meaning it stores what type of value it contains, and the value
itself. Let's look at the type first.

************
 zval types
************

.. code:: c

   #define IS_UNDEF     0 /* A variable that was never written to. */
   #define IS_NULL      1
   #define IS_FALSE     2
   #define IS_TRUE      3
   #define IS_LONG      4 /* An integer value. */
   #define IS_DOUBLE    5 /* A floating point value. */
   #define IS_STRING    6
   #define IS_ARRAY     7
   #define IS_OBJECT    8
   #define IS_RESOURCE  9
   #define IS_REFERENCE 10

These simple integer constants determine what kind of value is currently stored in a variable. If
you are a PHP developer, these types should sound fairly familiar. They are pretty much an exact
reflection of the types you may use in regular PHP code. One small oddity is that ``IS_FALSE`` and
``IS_TRUE`` are implemented as separate types, instead of as a single ``IS_BOOL`` type.

Some of these types are self-contained: they don't store any auxiliary data. This includes
``IS_UNDEF``, ``IS_NULL``, ``IS_FALSE`` and ``IS_TRUE``. For the rest of the types, we are going to
require some additional memory to store the actual value of the variable.
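
The split between self-contained types and types that need extra storage can be sketched as a pair
of small predicates. This is only an illustration: the ``demo_*`` helpers are hypothetical and do
not exist in php-src, and only the ``IS_*`` values are taken from the listing above.

.. code:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stdint.h>

   /* Tags copied from the listing above (only the ones used here). */
   #define IS_UNDEF     0
   #define IS_NULL      1
   #define IS_FALSE     2
   #define IS_TRUE      3
   #define IS_LONG      4
   #define IS_STRING    6
   #define IS_REFERENCE 10

   /* Hypothetical helper: does this tag need data in addition to the tag itself? */
   static bool demo_type_has_payload(uint8_t type)
   {
       assert(type <= IS_REFERENCE);
       /* IS_UNDEF, IS_NULL, IS_FALSE and IS_TRUE are fully described by the tag. */
       return type >= IS_LONG;
   }

   /* Hypothetical helper: a PHP bool is spread across two separate tags. */
   static bool demo_type_is_bool(uint8_t type)
   {
       return type == IS_FALSE || type == IS_TRUE;
   }

With these helpers, ``demo_type_has_payload(IS_TRUE)`` is false while
``demo_type_has_payload(IS_STRING)`` is true.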

************
 zend_value
************

.. code:: c

   typedef union _zend_value {
       zend_long         lval; /* long value, i.e. int. */
       double            dval; /* double value, i.e. float. */
       zend_refcounted  *counted;
       zend_string      *str;
       zend_array       *arr;
       zend_object      *obj;
       zend_resource    *res;
       zend_reference   *ref;
       // Less important for now.
       zend_ast_ref     *ast;
       zval             *zv;
       void             *ptr;
       zend_class_entry *ce;
       zend_function    *func;
       struct {
           uint32_t w1;
           uint32_t w2;
       } ww;
   } zend_value;

A C union is a data type that may store any one of its members at a time, by being (at least) as big
as its biggest member. For example, ``zend_value`` may store the ``lval`` member, or the ``dval``
member, but never both at the same time. However, the union itself doesn't record which member is
currently stored. Remembering this is our job, and that's exactly what the ``IS_*`` constants are
for.
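
To make the idea concrete, here is a minimal stand-alone sketch of a tagged union with only two
members. The ``demo_*`` names are hypothetical, and ``int64_t`` is used as a stand-in for
``zend_long``; the real structures are discussed in the rest of this chapter.

.. code:: c

   #include <assert.h>
   #include <stdint.h>

   #define IS_LONG   4
   #define IS_DOUBLE 5

   /* Reduced stand-in for zend_value with only two members. */
   typedef union {
       int64_t lval;
       double  dval;
   } demo_value;

   /* Hypothetical tagged union: the tag records which member is live. */
   typedef struct {
       demo_value value;
       uint8_t    type;
   } demo_tagged;

   static demo_tagged demo_from_long(int64_t n)
   {
       demo_tagged t;
       t.value.lval = n;
       t.type = IS_LONG;
       return t;
   }

   static demo_tagged demo_from_double(double d)
   {
       demo_tagged t;
       t.value.dval = d;
       t.type = IS_DOUBLE;
       return t;
   }

   /* Convert to double, reading only the member that the tag says is stored. */
   static double demo_to_double(const demo_tagged *t)
   {
       switch (t->type) {
           case IS_LONG:
               return (double) t->value.lval;
           case IS_DOUBLE:
               return t->value.dval;
           default:
               assert(0 && "unexpected tag");
               return 0.0;
       }
   }

Reading ``value.dval`` when the tag says ``IS_LONG`` would reinterpret the integer's bytes as a
floating point number, which is why every access goes through the tag first.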

The first few members of ``zend_value`` mostly mirror the ``IS_*`` constants, with the exception of
``counted``. ``counted`` polymorphically refers to any `reference counted <todo>`__ value, including
strings, arrays, objects, resources and references. ``null`` and ``bool`` are missing from
``zend_value`` because their types are self-contained.
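
The polymorphism of ``counted`` works because every reference counted structure starts with the same
header, so a pointer to any of them can also be read as a pointer to that header. The following is a
simplified sketch of that layout trick. The ``demo_*`` types are hypothetical and much smaller than
the real php-src structures.

.. code:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical common header shared by all reference counted values. */
   typedef struct {
       uint32_t refcount;
       uint32_t type_info;
   } demo_refcounted_h;

   /* Generic view of a reference counted value: only the header is known. */
   typedef struct {
       demo_refcounted_h gc;
   } demo_refcounted;

   /* A string-like value: header first, then the type-specific fields. */
   typedef struct {
       demo_refcounted_h gc;
       size_t            len;
       const char       *val;
   } demo_string;

   typedef union {
       demo_refcounted *counted;
       demo_string     *str;
   } demo_value;

   /* Increment the counter without knowing the concrete type behind the pointer. */
   static uint32_t demo_addref(demo_value *v)
   {
       assert(v->counted != NULL);
       return ++v->counted->refcount;
   }

A value stored through ``str`` can therefore have its counter adjusted through ``counted``, without
a ``switch`` over the type.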

The rest of the fields aren't important for now.

******
 zval
******

Together, the value and the tag make up the ``zval``, along with some other fields. It may look
intimidating at first. We'll go over it step by step.

.. code:: c

   typedef struct _zval_struct zval;

   struct _zval_struct {
       zend_value value;
       union {
           uint32_t type_info;
           struct {
               ZEND_ENDIAN_LOHI_3(
                   uint8_t type, /* active type */
                   uint8_t type_flags,
                   union {
                       uint16_t extra; /* not further specified */
                   } u)
           } v;
       } u1;
       union {
           uint32_t next;           /* hash collision chain */
           uint32_t cache_slot;     /* cache slot (for RECV_INIT) */
           uint32_t opline_num;     /* opline number (for FAST_CALL) */
           uint32_t lineno;         /* line number (for ast nodes) */
           uint32_t num_args;       /* arguments number for EX(This) */
           uint32_t fe_pos;         /* foreach position */
           uint32_t fe_iter_idx;    /* foreach iterator index */
           uint32_t guard;          /* recursion and single property guard */
           uint32_t constant_flags; /* constant flags */
           uint32_t extra;          /* not further specified */
       } u2;
   };

``zval.value`` reserves space for the actual variable data, as discussed above.

``zval.u1`` stores the variable type, the given ``IS_*`` constant, along with some other flags. Its
definition looks a bit complicated. You can think of the entire field as a 4-byte integer, split
into 3 parts. ``v.type`` stores the actual variable type, ``v.type_flags`` is used for some
`reference counting <todo>`__ flags, and ``v.u.extra`` is pretty much unused.

``zval.u2`` defines some more storage for various contexts that is often unoccupied. It's there
because the memory would otherwise be wasted due to padding, so we may as well make use of it. We'll
go over the relevant ones in their corresponding chapters.
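
The three-part split of ``u1`` and the "free" space used by ``u2`` can be sketched with plain shifts
and a reduced struct. The ``demo_*`` names are hypothetical. The shift and mask values are an
assumption, chosen so that ``type`` is the lowest byte of ``type_info``, which is the layout that
``ZEND_ENDIAN_LOHI_3`` is meant to guarantee regardless of byte order.

.. code:: c

   #include <assert.h>
   #include <stdint.h>

   #define IS_STRING 6

   /* Assumed layout: bits 0-7 type, bits 8-15 type_flags, bits 16-31 extra. */
   #define DEMO_TYPE_MASK        0xffu
   #define DEMO_TYPE_FLAGS_SHIFT 8
   #define DEMO_EXTRA_SHIFT      16

   static uint32_t demo_pack_type_info(uint8_t type, uint8_t type_flags, uint16_t extra)
   {
       return (uint32_t) type
            | ((uint32_t) type_flags << DEMO_TYPE_FLAGS_SHIFT)
            | ((uint32_t) extra << DEMO_EXTRA_SHIFT);
   }

   static uint8_t demo_type(uint32_t type_info)
   {
       return (uint8_t) (type_info & DEMO_TYPE_MASK);
   }

   static uint8_t demo_type_flags(uint32_t type_info)
   {
       return (uint8_t) ((type_info >> DEMO_TYPE_FLAGS_SHIFT) & 0xffu);
   }

   /* Same shape as zval: an 8-byte value followed by two 4-byte fields. */
   typedef struct {
       union {
           int64_t lval;
           double  dval;
           void   *ptr;
       } value;
       uint32_t type_info; /* plays the role of u1 */
       uint32_t u2;        /* occupies bytes that would otherwise be padding */
   } demo_zval;

On common platforms ``sizeof(demo_zval)`` is 16. Where the value member is aligned to 8 bytes,
dropping ``u2`` would not shrink the struct, because the compiler would pad it back up to a multiple
of 8.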

********
 Macros
********

The fields in ``zval`` should never be accessed directly. Instead, there is a plethora of macros to
access them, concealing some of the implementation details of the ``zval`` struct. For many macros,
there's a ``_P``-suffixed variant that performs the same operation on a pointer to the given
``zval``.

.. list-table:: ``zval`` macros
   :header-rows: 1

   -  -  Macro
      -  Description
   -  -  ``Z_TYPE[_P]``
      -  Access the ``zval.u1.v.type`` part of ``type_info``, containing the ``IS_*`` type.
   -  -  ``Z_LVAL[_P]``
      -  Access the underlying ``int`` value.
   -  -  ``Z_DVAL[_P]``
      -  Access the underlying ``float`` value.
   -  -  ``Z_STR[_P]``
      -  Access the underlying ``zend_string`` pointer.
   -  -  ``Z_STRVAL[_P]``
      -  Access the string's raw ``char *`` pointer.
   -  -  ``Z_STRLEN[_P]``
      -  Access the string's length.
   -  -  ``ZVAL_COPY_VALUE(t, s)``
      -  Copy one ``zval`` to another, including type and value.
   -  -  ``ZVAL_COPY(t, s)``
      -  Same as ``ZVAL_COPY_VALUE``, but if the value is reference counted, increase the counter.
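
The sketch below models how such macros hang together: the ``_P`` variants simply dereference the
pointer and delegate to the plain variant, and the two copy operations differ only in whether the
counter is touched. Everything prefixed with ``DEMO_`` or ``demo_`` is hypothetical and only mirrors
the naming of the real macros. In particular, the ``refcounted`` field is a simplified stand-in for
the flag kept in ``type_flags``.

.. code:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>

   #define IS_LONG   4
   #define IS_STRING 6

   typedef struct {
       uint32_t refcount;
   } demo_refcounted;

   typedef struct {
       demo_refcounted gc;
       size_t          len;
       const char     *val;
   } demo_string;

   typedef struct {
       union {
           int64_t          lval;
           demo_refcounted *counted;
           demo_string     *str;
       } value;
       uint8_t type;
       uint8_t refcounted; /* stand-in for the reference counting flag */
   } demo_zval;

   /* Plain variants operate on a zval, _P variants on a pointer to one. */
   #define DEMO_Z_TYPE(zv)        ((zv).type)
   #define DEMO_Z_TYPE_P(zv_p)    DEMO_Z_TYPE(*(zv_p))
   #define DEMO_Z_LVAL(zv)        ((zv).value.lval)
   #define DEMO_Z_LVAL_P(zv_p)    DEMO_Z_LVAL(*(zv_p))
   #define DEMO_Z_STRLEN(zv)      ((zv).value.str->len)
   #define DEMO_Z_STRLEN_P(zv_p)  DEMO_Z_STRLEN(*(zv_p))

   /* Copy type and value only; the counter is left alone. */
   static void demo_copy_value(demo_zval *dst, const demo_zval *src)
   {
       *dst = *src;
   }

   /* Copy, then take a reference if the value is reference counted. */
   static void demo_copy(demo_zval *dst, const demo_zval *src)
   {
       demo_copy_value(dst, src);
       if (dst->refcounted) {
           assert(dst->value.counted != NULL);
           dst->value.counted->refcount++;
       }
   }

In this model, copying a string with ``demo_copy_value`` leaves its counter unchanged, while
``demo_copy`` increments it by one.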

..
   _todo: There are many more.

******************
 Other zval types
******************

``zval``\ s are sometimes used internally with types that don't exist in userland.

.. code:: c

   #define IS_CONSTANT_AST 11
   #define IS_INDIRECT     12
   #define IS_PTR          13
   #define IS_ALIAS_PTR    14
   #define _IS_ERROR       15

``IS_CONSTANT_AST`` is used to represent constant values (the right hand side of ``const``,
property/parameter initializers, etc.) before they are evaluated. The evaluation of a constant
expression is not always possible during compilation, because it may contain references to values
only available at runtime. Until that evaluation is possible, the constants contain the AST of the
expression rather than the concrete values. Check the `parser <todo>`__ chapter for more information
on ASTs. When this type is set, the ``zval.value.ast`` union member is set accordingly.

``IS_INDIRECT`` indicates that the ``zval.value.zv`` member is populated. This field stores a
pointer to some other ``zval``. This type is mainly used in two situations, namely for intermediate
values between ``FETCH`` and ``ASSIGN`` instructions, and for the sharing of variables in the symbol
table.
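
Following an indirect ``zval`` amounts to a single pointer hop. The sketch below uses a reduced,
hypothetical ``demo_zval`` and a hypothetical ``demo_deindirect`` helper to show how a write through
an indirect slot lands in the ``zval`` it points to.

.. code:: c

   #include <assert.h>
   #include <stdint.h>

   #define IS_LONG     4
   #define IS_INDIRECT 12

   typedef struct demo_zval demo_zval;

   struct demo_zval {
       union {
           int64_t    lval;
           demo_zval *zv; /* populated when type is IS_INDIRECT */
       } value;
       uint8_t type;
   };

   /* Return the zval that actually holds the value. */
   static demo_zval *demo_deindirect(demo_zval *z)
   {
       if (z->type == IS_INDIRECT) {
           assert(z->value.zv != 0);
           z = z->value.zv;
       }
       return z;
   }

A slot of type ``IS_INDIRECT`` never holds a value of its own in this model. Both the slot and its
target resolve to the same ``demo_zval``, so they always observe the same value.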

..
   _todo: There are many more.

``IS_PTR`` is used for pointers to arbitrary data. Most commonly, this type is used internally for
``HashTable``, as ``HashTable`` may only store ``zval`` values. For example, ``EG(class_table)``
represents the class table, which is a hash map of class names to the corresponding
``zend_class_entry``, representing the class. The same goes for functions and many other data types.
``IS_ALIAS_PTR`` is used for class aliases registered via ``class_alias``. It allows differentiating
between members in the class table that are aliases and those that are actual classes. Otherwise, it
is essentially the same as ``IS_PTR``. Arbitrary data is accessed through ``zval.value.ptr``, and
cast to the correct type depending on context. If ``ptr`` stores a class or function, the
``zval.value.ce`` or ``zval.value.func`` fields may be used, respectively.
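
As a rough illustration of storing non-``zval`` data in a ``zval``-only container, the sketch below
uses a flat array of buckets instead of a real ``HashTable``. All ``demo_*`` names are hypothetical,
and the linear lookup is only there to keep the example short.

.. code:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   #define IS_PTR       13
   #define IS_ALIAS_PTR 14

   /* Hypothetical, heavily reduced class entry. */
   typedef struct {
       const char *name;
   } demo_class_entry;

   typedef struct {
       union {
           void             *ptr; /* arbitrary data */
           demo_class_entry *ce;  /* typed view of the same pointer */
       } value;
       uint8_t type;
   } demo_zval;

   /* Stand-in for a hash table bucket: a key and a zval, nothing else. */
   typedef struct {
       const char *key;
       demo_zval   val;
   } demo_bucket;

   /* Look up a class by key; returns NULL if no bucket matches. */
   static demo_class_entry *demo_find_class(const demo_bucket *table, size_t count, const char *key)
   {
       for (size_t i = 0; i < count; i++) {
           if (strcmp(table[i].key, key) == 0) {
               assert(table[i].val.type == IS_PTR || table[i].val.type == IS_ALIAS_PTR);
               return table[i].val.value.ce;
           }
       }
       return NULL;
   }

   /* The type tag is the only thing that tells an alias from a real class. */
   static int demo_is_alias(const demo_zval *zv)
   {
       return zv->type == IS_ALIAS_PTR;
   }

An alias and the original class resolve to the same ``demo_class_entry`` pointer; only the tag of
the containing ``demo_zval`` differs.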

``_IS_ERROR`` is used as an error value for some `object handlers <todo>`__. It is described in more
detail in its own chapter.

.. code:: c

   /* Fake types used only for type hinting.
    * These are allowed to overlap with the types below. */
   #define IS_CALLABLE 12
   #define IS_ITERABLE 13
   #define IS_VOID     14
   #define IS_STATIC   15
   #define IS_MIXED    16
   #define IS_NEVER    17

   /* used for casts */
   #define _IS_BOOL   18
   #define _IS_NUMBER 19

These flags are never actually stored in ``zval.u1``, which is why some of their values may overlap
with the internal types shown earlier. They are used for type hinting and in the
`object handler <todo>`__ API.

This only leaves the ``zval.value.ww`` field. In short, this field is used on 32-bit platforms when
copying data from one ``zval`` to another. Normally, ``zval.value.counted`` is copied as a generic
value, no matter what the actual underlying type is. ``zend_value`` always consists of 8 bytes due
to the ``double`` field. On 32-bit platforms, however, pointers consist of only 4 bytes. Because
copying the pointer alone would miss the other 4 bytes, they are copied manually using
``z->value.ww.w2 = _w2;``. This happens in the ``ZVAL_COPY_VALUE_EX`` macro, so you won't ever have
to care about this.
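
For the curious, the copy strategy can be modeled as follows. This is a sketch of the idea, not the
actual macro: ``demo_copy_value`` and the reduced ``demo_zval`` are hypothetical, and a runtime
``sizeof`` check stands in for the compile-time platform switch.

.. code:: c

   #include <assert.h>
   #include <stdint.h>

   typedef union {
       int64_t lval;
       double  dval;
       void   *counted;
       struct {
           uint32_t w1;
           uint32_t w2;
       } ww;
   } demo_value;

   typedef struct {
       demo_value value;
       uint32_t   type_info;
   } demo_zval;

   /* Copy value and type by moving the generic pointer member. */
   static void demo_copy_value(demo_zval *dst, const demo_zval *src)
   {
       void *counted = src->value.counted;
       uint32_t type_info = src->type_info;

       assert(dst != src);
       if (sizeof(void *) == 4) {
           /* 32-bit: the pointer covers only the first 4 bytes, so copy w2 as well. */
           uint32_t w2 = src->value.ww.w2;
           dst->value.counted = counted;
           dst->value.ww.w2 = w2;
       } else {
           /* 64-bit: the pointer already covers all 8 bytes of the value. */
           dst->value.counted = counted;
       }
       dst->type_info = type_info;
   }

Copying a ``double`` through this function preserves all 8 bytes on both kinds of platform, even
though the code never mentions ``dval``.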