Threads
=======

Wait a minute? Why are we on threads? Aren't event loops supposed to be **the
way** to do *web-scale programming*? Well... no. Threads are still the medium in
which processors do their jobs. Threads are therefore mighty useful sometimes, even
though you might have to wade through various synchronization primitives.

Threads are used internally to fake the asynchronous nature of system calls
that would otherwise block. libuv also uses threads to allow you, the
application, to perform a task asynchronously that is actually blocking, by
spawning a thread and collecting the result when it is done.

Today there are two predominant thread libraries: the Windows threads
implementation and POSIX's :man:`pthreads(7)`. libuv's thread API is analogous to
the pthreads API and often has similar semantics.

A notable aspect of libuv's thread facilities is that they form a
self-contained section within libuv. Whereas other features intimately depend
on the event loop and callback principles, threads are completely agnostic:
they block as required, signal errors directly via return values, and, as shown
in the :ref:`first example <thread-create-example>`, don't even require a
running event loop.

libuv's thread API is also very limited, since the semantics and syntax of
threads are different on all platforms, with different levels of completeness.

This chapter makes the following assumption: **There is only one event loop,
running in one thread (the main thread)**. No other thread interacts
with the event loop (except using ``uv_async_send``).

Core thread operations
----------------------

There isn't much here: you just start a thread using ``uv_thread_create()`` and
wait for it to finish using ``uv_thread_join()``.

.. _thread-create-example:

.. rubric:: thread-create/main.c
.. literalinclude:: ../../code/thread-create/main.c
    :language: c
    :linenos:
    :lines: 26-36
    :emphasize-lines: 3-7

.. tip::

    ``uv_thread_t`` is just an alias for ``pthread_t`` on Unix, but this is an
    implementation detail; avoid depending on it to always be true.

The second parameter is the function which will serve as the entry point for
the thread; the last parameter is a ``void *`` argument which can be used to pass
custom parameters to the thread. The function ``hare`` will now run in a separate
thread, scheduled pre-emptively by the operating system:

.. rubric:: thread-create/main.c
.. literalinclude:: ../../code/thread-create/main.c
    :language: c
    :linenos:
    :lines: 6-14
    :emphasize-lines: 2

Unlike ``pthread_join()``, which allows the target thread to pass back a value to
the calling thread using a second parameter, ``uv_thread_join()`` does not. To
send values use :ref:`inter-thread-communication`.

Synchronization Primitives
--------------------------

This section is purposely spartan. This book is not about threads, so I only
catalogue any surprises in the libuv APIs here. For the rest you can look at
the :man:`pthreads(7)` man pages.

Mutexes
~~~~~~~

The mutex functions are a **direct** map to the pthread equivalents.

.. rubric:: libuv mutex functions
.. code-block:: c

    int uv_mutex_init(uv_mutex_t* handle);
    int uv_mutex_init_recursive(uv_mutex_t* handle);
    void uv_mutex_destroy(uv_mutex_t* handle);
    void uv_mutex_lock(uv_mutex_t* handle);
    int uv_mutex_trylock(uv_mutex_t* handle);
    void uv_mutex_unlock(uv_mutex_t* handle);

The ``uv_mutex_init()``, ``uv_mutex_init_recursive()`` and ``uv_mutex_trylock()``
functions will return 0 on success, and an error code otherwise.

If libuv has been compiled with debugging enabled, ``uv_mutex_destroy()``,
``uv_mutex_lock()`` and ``uv_mutex_unlock()`` will ``abort()`` on error.
Similarly ``uv_mutex_trylock()`` will abort if the error is anything *other
than* ``EAGAIN`` or ``EBUSY``.

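
To make the calls above concrete, here is a minimal sketch (not one of the
guide's example programs) in which two threads increment a shared counter
under a ``uv_mutex_t``. The names ``counter``, ``counter_lock``, ``add_many``
and ``run_counter_demo`` are made up for this illustration, and error handling
is kept to the bare minimum.

.. code-block:: c

    #include <uv.h>

    #define ROUNDS 100000

    static uv_mutex_t counter_lock;
    static long counter;              /* guarded by counter_lock */

    /* Thread entry point: bump the shared counter ROUNDS times. */
    static void add_many(void *arg) {
        int i;
        (void) arg;                   /* unused in this sketch */
        for (i = 0; i < ROUNDS; i++) {
            uv_mutex_lock(&counter_lock);
            counter++;
            uv_mutex_unlock(&counter_lock);
        }
    }

    /* Returns the final counter value, or -1 if setup failed. */
    long run_counter_demo(void) {
        uv_thread_t a, b;

        counter = 0;
        if (uv_mutex_init(&counter_lock) != 0)
            return -1;
        if (uv_thread_create(&a, add_many, NULL) != 0) {
            uv_mutex_destroy(&counter_lock);
            return -1;
        }
        if (uv_thread_create(&b, add_many, NULL) != 0) {
            uv_thread_join(&a);
            uv_mutex_destroy(&counter_lock);
            return -1;
        }

        /* Block until both threads have returned from add_many(). */
        uv_thread_join(&a);
        uv_thread_join(&b);
        uv_mutex_destroy(&counter_lock);
        return counter;
    }

Calling ``run_counter_demo()`` from ``main()`` should give ``2 * ROUNDS``.
Note that no event loop is involved anywhere in this sketch.
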

Recursive mutexes are supported, but you should not rely on them. Also, they
should not be used with ``uv_cond_t`` variables.

The default BSD mutex implementation will raise an error if a thread which has
locked a mutex attempts to lock it again. For example, a construct like::

    uv_mutex_init(a_mutex);
    uv_mutex_lock(a_mutex);
    uv_thread_create(thread_id, entry, (void *)a_mutex);
    uv_mutex_lock(a_mutex);
    // more things here

can be used to wait until another thread initializes some stuff and then
unlocks ``a_mutex``, but it will crash your program in debug mode. In other
builds the second call to ``uv_mutex_lock()`` fails in a platform-dependent
way, typically by deadlocking; it cannot report an error, since the function
returns ``void``.

.. note::

    Mutexes on Windows are always recursive.

Locks
~~~~~

Read-write locks are a more granular access mechanism. Two readers can access
shared memory at the same time. A writer may not acquire the lock when it is
held by a reader. A reader or writer may not acquire a lock when a writer is
holding it. Read-write locks are frequently used in databases. Here is a toy
example.

.. rubric:: locks/main.c - simple rwlocks
.. literalinclude:: ../../code/locks/main.c
    :language: c
    :linenos:
    :emphasize-lines: 13,16,27,31,42,55

Run this and observe how the readers will sometimes overlap. In case of
multiple writers, schedulers will usually give them higher priority, so if you
add two writers, you'll see that both writers tend to finish first before the
readers get a chance again.

We also use barriers in the above example so that the main thread can wait for
all readers and writers to indicate they have ended.

Others
~~~~~~

libuv also supports semaphores_, `condition variables`_ and barriers_ with APIs
very similar to their pthread counterparts.

.. _semaphores: https://en.wikipedia.org/wiki/Semaphore_(programming)
.. _condition variables: https://en.wikipedia.org/wiki/Monitor_(synchronization)#Condition_variables_2
.. _barriers: https://en.wikipedia.org/wiki/Barrier_(computer_science)

In addition, libuv provides a convenience function ``uv_once()``. Multiple
threads can attempt to call ``uv_once()`` with a given guard and a function
pointer, **only the first one will win; the function will be called once and
only once**::

    /* Initialize guard */
    static uv_once_t once_only = UV_ONCE_INIT;

    int i = 0;

    /* Runs at most once, no matter how many threads call uv_once(). */
    void increment(void) {
        i++;
    }

    void thread1(void) {
        /* ... work */
        uv_once(&once_only, increment);  /* uv_once() takes a pointer to the guard */
    }

    void thread2(void) {
        /* ... work */
        uv_once(&once_only, increment);
    }

    int main(void) {
        /* ... spawn threads */
    }

After all threads are done, ``i == 1``.

libuv v0.11.11 and later also provide a ``uv_key_t`` struct and API_ for
thread-local storage.

.. _api: http://docs.libuv.org/en/v1.x/threading.html#thread-local-storage

.. _libuv-work-queue:

libuv work queue
----------------

``uv_queue_work()`` is a convenience function that allows an application to run
a task in a separate thread, and have a callback that is triggered when the
task is done. It is a seemingly simple function, but what makes
``uv_queue_work()`` tempting is that it allows potentially any third-party
library to be used with the event-loop paradigm. When you use event loops, it
is *imperative to make sure that no function which runs periodically in the
loop thread blocks when performing I/O or is a serious CPU hog*, because this
means that the loop slows down and events are not being handled at full
capacity.

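
As a rough sketch of the shape this takes (assuming the libuv 1.x API), the
snippet below queues a single job, lets the loop run until it completes, and
hands the result back through the request's ``data`` field. The names
``square_job``, ``do_square``, ``after_square`` and ``square_on_threadpool``
are hypothetical, and the "work" is deliberately trivial; a real task would
block or burn CPU here.

.. code-block:: c

    #include <uv.h>

    /* Hypothetical payload carried through uv_work_t.data. */
    typedef struct {
        int input;
        long result;
        int after_status;   /* status seen by the after-work callback */
    } square_job;

    /* Runs on a thread-pool thread: the blocking or CPU-heavy part. */
    static void do_square(uv_work_t *req) {
        square_job *job = req->data;
        job->result = (long) job->input * job->input;
    }

    /* Runs on the loop thread once do_square() has returned. */
    static void after_square(uv_work_t *req, int status) {
        square_job *job = req->data;
        job->after_status = status;
    }

    /* Queues one job and runs a private loop until it is done.
     * Returns n * n, or -1 on failure. */
    long square_on_threadpool(int n) {
        uv_loop_t loop;
        uv_work_t req;
        square_job job;
        long out = -1;

        job.input = n;
        job.result = 0;
        job.after_status = -1;

        if (uv_loop_init(&loop) != 0)
            return -1;

        req.data = &job;
        if (uv_queue_work(&loop, &req, do_square, after_square) == 0) {
            /* Returns once there are no active handles or requests left. */
            uv_run(&loop, UV_RUN_DEFAULT);
            if (job.after_status == 0)
                out = job.result;
        }

        uv_loop_close(&loop);
        return out;
    }

No lock is needed in this particular sketch because the loop thread only reads
``job`` after the *after* callback has fired, at which point the worker is
finished with it.
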

However, a lot of existing code out there features blocking functions (for
example, a routine which performs I/O under the hood) that are meant to be used
with threads if you want responsiveness (the classic 'one thread per client'
server model). Getting them to play with an event loop library generally
involves rolling your own system of running the task in a separate thread.
libuv just provides a convenient abstraction for this.

Here is a simple example inspired by `node.js is cancer`_. We are going to
calculate Fibonacci numbers, sleeping a bit along the way, but run it in
a separate thread so that the blocking and CPU bound task does not prevent the
event loop from performing other activities.

.. rubric:: queue-work/main.c - lazy fibonacci
.. literalinclude:: ../../code/queue-work/main.c
    :language: c
    :linenos:
    :lines: 17-29

The actual task function is simple, with nothing to show that it is going to be
run in a separate thread. The ``uv_work_t`` structure is the clue. You can pass
arbitrary data through it using the ``void* data`` field and use it to
communicate to and from the thread. But be sure you are using proper locks if
you are changing things while both threads may be running.

The trigger is ``uv_queue_work``:

.. rubric:: queue-work/main.c
.. literalinclude:: ../../code/queue-work/main.c
    :language: c
    :linenos:
    :lines: 31-44
    :emphasize-lines: 10

The thread function will be launched in a separate thread and passed the
``uv_work_t`` structure. Once the function returns, the *after* function
will be called on the thread the event loop is running in. It will be passed
the same structure.

For writing wrappers to blocking libraries, a common :ref:`pattern <baton>`
is to use a baton to exchange data.

Since libuv version ``0.9.4`` an additional function, ``uv_cancel()``, is
available. This allows you to cancel tasks on the libuv work queue. Only tasks
that *are yet to be started* can be cancelled. If a task has *already started
executing, or it has finished executing*, ``uv_cancel()`` **will fail**.

``uv_cancel()`` is useful to clean up pending tasks if the user requests
termination. For example, a music player may queue up multiple directories to
be scanned for audio files. If the user terminates the program, it should quit
quickly and not wait until all pending requests are run.

Let's modify the Fibonacci example to demonstrate ``uv_cancel()``. We first set
up a signal handler for termination.

.. rubric:: queue-cancel/main.c
.. literalinclude:: ../../code/queue-cancel/main.c
    :language: c
    :linenos:
    :lines: 43-

When the user triggers the signal by pressing ``Ctrl+C`` we call
``uv_cancel()`` on all the work requests. ``uv_cancel()`` returns ``0`` only
for requests it actually cancelled; for those that are already executing or
finished it returns a non-zero error code.

.. rubric:: queue-cancel/main.c
.. literalinclude:: ../../code/queue-cancel/main.c
    :language: c
    :linenos:
    :lines: 33-41
    :emphasize-lines: 6

For tasks that do get cancelled successfully, the *after* function is called
with ``status`` set to ``UV_ECANCELED``.

.. rubric:: queue-cancel/main.c
.. literalinclude:: ../../code/queue-cancel/main.c
    :language: c
    :linenos:
    :lines: 28-31
    :emphasize-lines: 2

``uv_cancel()`` can also be used with ``uv_fs_t`` and ``uv_getaddrinfo_t``
requests. For the filesystem family of functions, ``uv_fs_t.result`` will be
set to ``UV_ECANCELED`` (older, pre-1.0 releases reported this through
``uv_fs_t.errorno``).

.. TIP::

    A well designed program would have a way to terminate long running workers
    that have already started executing. Such a worker could periodically check
    for a variable that only the main thread sets to signal termination.

.. _inter-thread-communication:

Inter-thread communication
--------------------------

Sometimes you want various threads to actually send each other messages *while*
they are running. For example you might be running some long duration task in
a separate thread (perhaps using ``uv_queue_work``) but want to notify progress
to the main thread. This is a simple example of a download manager informing
the user of the status of running downloads.

.. rubric:: progress/main.c
.. literalinclude:: ../../code/progress/main.c
    :language: c
    :linenos:
    :lines: 7-8,35-
    :emphasize-lines: 2,11

The async thread communication works *on loops*, so although any thread can be
the message sender, only threads with libuv loops can be receivers (or rather
the loop is the receiver). libuv will invoke the callback (``print_progress``)
with the async watcher whenever it receives a message.

.. warning::

    It is important to realize that since the message send is *async*, the callback
    may be invoked immediately after ``uv_async_send`` is called in another
    thread, or it may be invoked after some time. libuv may also combine
    multiple calls to ``uv_async_send`` and invoke your callback only once. The
    only guarantee that libuv makes is that the callback function is called *at
    least once* after the call to ``uv_async_send``. If you have no pending
    calls to ``uv_async_send``, the callback won't be called. If you make two
    or more calls, and libuv hasn't had a chance to run the callback yet, it
    *may* invoke your callback *only once* for the multiple invocations of
    ``uv_async_send``. Your callback will never be called twice for just one
    event.

.. rubric:: progress/main.c
.. literalinclude:: ../../code/progress/main.c
    :language: c
    :linenos:
    :lines: 10-24
    :emphasize-lines: 7-8

In the download function, we modify the progress indicator and queue the message
for delivery with ``uv_async_send``. Remember: ``uv_async_send`` is also
non-blocking and will return immediately.

.. rubric:: progress/main.c
.. literalinclude:: ../../code/progress/main.c
    :language: c
    :linenos:
    :lines: 31-34

The callback is a standard libuv pattern, extracting the data from the watcher.

Finally it is important to remember to clean up the watcher.

.. rubric:: progress/main.c
.. literalinclude:: ../../code/progress/main.c
    :language: c
    :linenos:
    :lines: 26-29
    :emphasize-lines: 3

After this example, which showed the abuse of the ``data`` field, bnoordhuis_
pointed out that using the ``data`` field is not thread safe, and
``uv_async_send()`` is actually only meant to wake up the event loop. Use
a mutex or rwlock to ensure accesses are performed in the right order.

.. note::

    Mutexes and rwlocks **DO NOT** work inside a signal handler, whereas
    ``uv_async_send`` does.

One use case where ``uv_async_send`` is required is when interoperating with
libraries that require thread affinity for their functionality. For example in
node.js, a v8 engine instance, its contexts and its objects are bound to the
thread that the v8 instance was started in. Interacting with v8 data structures
from another thread can lead to undefined results. Now consider some node.js
module which binds a third party library. It may go something like this:

1. In node, the third party library is set up with a JavaScript callback to be
   invoked for more information::

       var lib = require('lib');
       lib.on_progress(function() {
           console.log("Progress");
       });

       lib.do();

       // do other stuff

2. ``lib.do`` is supposed to be non-blocking but the third party lib is
   blocking, so the binding uses ``uv_queue_work``.

3. The actual work being done in a separate thread wants to invoke the progress
   callback, but cannot directly call into v8 to interact with JavaScript. So
   it uses ``uv_async_send``.

4. The async callback, invoked in the main loop thread, which is the v8 thread,
   then interacts with v8 to invoke the JavaScript callback.

----

.. _node.js is cancer: http://widgetsandshit.com/teddziuba/2011/10/node-js-is-cancer.html
.. _bnoordhuis: https://github.com/bnoordhuis
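
Steps 2 to 4 of the list above can be sketched in plain C, leaving v8 out of
the picture and assuming the libuv 1.x callback signatures. This is only an
illustration of the hand-off, not how any particular node.js binding is
written: the names ``progress_percent``, ``blocking_work``, ``on_progress``,
``work_done`` and ``run_progress_demo`` are invented, and the place where
a binding would call into JavaScript is marked with a comment. The shared value
is guarded by a mutex, as recommended above, rather than passed through the
``data`` field.

.. code-block:: c

    #include <uv.h>

    static uv_async_t progress_async;
    static uv_mutex_t progress_lock;
    static int progress_percent;   /* written by the worker, guarded by progress_lock */
    static int last_reported;      /* touched only on the loop thread */

    /* Step 3: the worker thread publishes progress and wakes the loop. */
    static void blocking_work(uv_work_t *req) {
        int pct;
        (void) req;
        for (pct = 25; pct <= 100; pct += 25) {
            uv_mutex_lock(&progress_lock);
            progress_percent = pct;
            uv_mutex_unlock(&progress_lock);
            uv_async_send(&progress_async);   /* non-blocking; sends may be coalesced */
        }
    }

    /* Step 4: runs on the loop thread. A binding would invoke the
     * JavaScript callback here. */
    static void on_progress(uv_async_t *handle) {
        (void) handle;
        uv_mutex_lock(&progress_lock);
        last_reported = progress_percent;
        uv_mutex_unlock(&progress_lock);
    }

    /* Loop thread, after blocking_work() has returned. */
    static void work_done(uv_work_t *req, int status) {
        (void) req;
        (void) status;
        /* Pick up the final value ourselves in case the last wakeup has
         * not been delivered yet, then let the loop wind down. */
        on_progress(&progress_async);
        uv_close((uv_handle_t *) &progress_async, NULL);
    }

    /* Returns the last progress value seen on the loop thread, or -1. */
    int run_progress_demo(void) {
        uv_loop_t loop;
        uv_work_t req;
        int result = -1;

        progress_percent = 0;
        last_reported = 0;

        if (uv_loop_init(&loop) != 0)
            return -1;
        if (uv_mutex_init(&progress_lock) != 0) {
            uv_loop_close(&loop);
            return -1;
        }

        if (uv_async_init(&loop, &progress_async, on_progress) == 0) {
            /* Step 2: push the blocking call onto the thread pool. */
            if (uv_queue_work(&loop, &req, blocking_work, work_done) != 0)
                uv_close((uv_handle_t *) &progress_async, NULL);
            uv_run(&loop, UV_RUN_DEFAULT);
            result = last_reported;
        }

        uv_mutex_destroy(&progress_lock);
        uv_loop_close(&loop);
        return result;
    }

Because libuv may merge several ``uv_async_send`` calls into one callback, the
sketch only ever relies on seeing the *latest* value, never on seeing every
intermediate one.
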