Error handling in QUIC code
===========================

Current situation with TLS
--------------------------

Errors are pushed onto the error stack (strictly a queue, but the term
"error stack" is used throughout the code base) during libssl API calls.
In most (if not all) cases they should appear there only if the API call
returns an error value. The `SSL_get_error()` call depends on the stack
being clean before the API call so that it can properly determine whether
the API call caused a library or system (I/O) error.
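The dependency on a clean stack can be illustrated with a small
self-contained C model. This is a sketch only: `toy_err_stack`,
`toy_classify()` and the result names are hypothetical stand-ins, not libssl
API. As a simplification, the model assumes a failed call is classified as a
library error whenever any entry is present on the stack.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_MAX 8

/* Toy stand-in for one thread's error stack (not the real ERR_STATE). */
struct toy_err_stack {
    int reasons[TOY_STACK_MAX];
    size_t count;
};

/* Hypothetical, simplified result codes. */
enum toy_result { TOY_WANT_RETRY, TOY_LIBRARY_ERROR };

/* Push one reason code onto the stack. */
static void toy_raise(struct toy_err_stack *st, int reason)
{
    assert(st->count < TOY_STACK_MAX);
    st->reasons[st->count++] = reason;
}

/* Drop all entries, as an application would do before an I/O call. */
static void toy_clear(struct toy_err_stack *st)
{
    st->count = 0;
}

/*
 * Classify a failed call (ret <= 0) by looking at the stack. This is the
 * simplifying assumption of the model: any entry means "library error".
 */
static enum toy_result toy_classify(const struct toy_err_stack *st, int ret)
{
    assert(ret <= 0);
    return st->count > 0 ? TOY_LIBRARY_ERROR : TOY_WANT_RETRY;
}
```

In this model an entry left over from an earlier, unrelated call makes a
harmless would-block result look like a library error. Clearing the stack
before the call (in real code, with ERR_clear_error()) avoids that.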

The error stacks are thread-local. Libssl API calls from separate threads
push errors to these separate error stacks. It is unusual to invoke libssl
APIs with the same SSL object from different threads, but even if it happens,
it is not a problem, as applications are supposed to check for errors
immediately after the API call on the same thread. There is no such thing as
a thread-assisted mode of operation in TLS.

Constraints
-----------

We need to keep using the existing ERR API, as doing otherwise would
complicate existing applications and break our API compatibility promise.
Even the ERR_STATE structure is public, although deprecated, and thus its
layout and semantics cannot be changed.

The error stack is accessed without a lock (because it is thread-local).
This complicates _moving errors between threads_.

Error stack entries contain allocated data, so copying entries between
threads implies either duplicating that data or losing it.

Assumptions
-----------

This document assumes the actual error state of the QUIC connection (or
stream, for stream-level errors) is handled separately from the auxiliary
error reason entries on the error stack.

We can assume the internal assistance thread is well behaved with regard
to the error stack.

We assume there are two types of errors that can be raised in QUIC
library calls and in the subordinate libcrypto (and provider) calls. The
first type is an intermittent error that does not really affect the state
of the QUIC connection, for example EAGAIN returned from a syscall, or
unavailability of some algorithm where there are other algorithms to try.
The second type is a permanent error that affects the error state of the
QUIC connection. Operations on QUIC streams (SSL_write(), SSL_read()) can
also trigger errors. If such an error causes the QUIC connection to enter
an error state, it is permanent. If it affects only the stream, it is left
on the error stack of the thread that called SSL_write() or SSL_read() on
the stream.

Design
------

The return value of SSL_get_error() on QUIC connections or streams does
not depend on the error stack contents.

Intermittent errors are handled within the library and cleared from the
error stack before returning to the user.

Permanent errors that happen within the internal assistance thread, within
SSL_tick() processing, or when calling SSL_read()/SSL_write() on a stream
need to be replicated for SSL_read()/SSL_write() calls on other streams.
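The first two design points can be sketched with a toy model in C. All names
here (`toy_err_stack`, `toy_operation()`, `toy_get_error()`, the enum values)
are hypothetical and exist only for illustration. The sketch assumes
intermittent errors are cleared by rewinding the stack to the depth it had on
entry, and that the reported result is derived only from a separately stored
connection state.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_MAX 16

/* Toy stand-in for a thread-local error stack (not the real ERR_STATE). */
struct toy_err_stack {
    int reasons[TOY_STACK_MAX];
    size_t count;
};

/* Hypothetical connection state, kept separately from the error stack. */
enum toy_conn_state { TOY_CONN_OK, TOY_CONN_FAILED };

/* Hypothetical, simplified result codes. */
enum toy_result { TOY_RESULT_NONE, TOY_RESULT_WANT_RETRY, TOY_RESULT_FATAL };

static void toy_raise(struct toy_err_stack *st, int reason)
{
    assert(st->count < TOY_STACK_MAX);
    st->reasons[st->count++] = reason;
}

/*
 * Run one operation. An intermittent failure (would_block != 0) pushes an
 * entry while it is being handled; the entry is dropped again by rewinding
 * to the depth recorded on entry, so the caller never sees it.
 */
static int toy_operation(struct toy_err_stack *st, int would_block)
{
    size_t mark = st->count;

    if (would_block) {
        toy_raise(st, 11);  /* an EAGAIN-like reason, value is arbitrary */
        st->count = mark;   /* intermittent: clear before returning */
        return 0;
    }
    return 1;
}

/* The result depends only on the connection state, never on the stack. */
static enum toy_result toy_get_error(enum toy_conn_state state, int ret)
{
    if (ret > 0)
        return TOY_RESULT_NONE;
    return state == TOY_CONN_FAILED ? TOY_RESULT_FATAL : TOY_RESULT_WANT_RETRY;
}
```

Because `toy_get_error()` never inspects the stack, a stale entry left by the
application does not change the result. In real code, ERR_set_mark() and
ERR_pop_to_mark() provide a comparable rewind; whether the QUIC code uses them
for this purpose is not stated here.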

Implementation
--------------

There is an error stack in QUIC_CHANNEL which serves as temporary storage
for errors happening in the internal assistance thread. When a permanent
error is detected, the error stack entries are moved to this error stack
in QUIC_CHANNEL.

When returning to an application from an SSL_read()/SSL_write() call with
a permanent connection error, entries from the QUIC_CHANNEL error stack
are copied to the thread-local error stack. They are also always kept on
the QUIC_CHANNEL error stack for possible further calls from an
application. An additional error reason,
SSL_R_QUIC_CONNECTION_TERMINATED, is added to the stack.
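The move-then-copy flow described above can be modelled as follows. This is
a minimal sketch, not the QUIC_CHANNEL code: the `toy_*` types and functions
are hypothetical, `TOY_R_CONNECTION_TERMINATED` is an arbitrary stand-in for
SSL_R_QUIC_CONNECTION_TERMINATED, and each entry is assumed to carry one
heap-allocated detail string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_STACK_MAX 16
#define TOY_R_CONNECTION_TERMINATED 1000 /* arbitrary stand-in value */

struct toy_err_entry {
    int reason;
    char *data; /* heap-allocated detail string, may be NULL */
};

struct toy_err_stack {
    struct toy_err_entry entries[TOY_STACK_MAX];
    size_t count;
};

/* Duplicate a string; on allocation failure the detail is simply lost. */
static char *toy_dup(const char *s)
{
    size_t n;
    char *copy;

    if (s == NULL)
        return NULL;
    n = strlen(s) + 1;
    copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

static int toy_push(struct toy_err_stack *st, int reason, const char *data)
{
    if (st->count >= TOY_STACK_MAX)
        return 0;
    st->entries[st->count].reason = reason;
    st->entries[st->count].data = toy_dup(data);
    st->count++;
    return 1;
}

/* Move: ownership of the allocated data is transferred, src is emptied. */
static void toy_move_all(struct toy_err_stack *dst, struct toy_err_stack *src)
{
    size_t i;

    for (i = 0; i < src->count && dst->count < TOY_STACK_MAX; i++)
        dst->entries[dst->count++] = src->entries[i];
    for (; i < src->count; i++) /* entries that did not fit are dropped */
        free(src->entries[i].data);
    src->count = 0;
}

/* Copy: the channel keeps its entries, the caller gets duplicates. */
static void toy_report(struct toy_err_stack *thread_stack,
                       const struct toy_err_stack *channel_stack)
{
    size_t i;

    for (i = 0; i < channel_stack->count; i++)
        toy_push(thread_stack, channel_stack->entries[i].reason,
                 channel_stack->entries[i].data);
    toy_push(thread_stack, TOY_R_CONNECTION_TERMINATED, NULL);
}

static void toy_clear(struct toy_err_stack *st)
{
    size_t i;

    for (i = 0; i < st->count; i++)
        free(st->entries[i].data);
    st->count = 0;
}
```

Calling `toy_report()` for two different application stacks leaves the
channel stack untouched and gives both callers the same reasons, each with its
own copy of the detail string.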

SSL_tick() return value
-----------------------

The return value of SSL_tick() does not depend on whether there is
a permanent error on the connection. SSL_tick() may return an error only
when processing it hit a fatal error, such as a memory allocation
failure, after which further SSL_tick() calls make no sense.
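A toy model of this rule, under stated assumptions: `toy_tick()`, `toy_conn`
and the event names are hypothetical, and the 1 (success) / 0 (failure)
return convention is assumed for the sketch rather than taken from this
document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection state, kept separately from the return value. */
enum toy_conn_state { TOY_CONN_OK, TOY_CONN_FAILED };

struct toy_conn {
    enum toy_conn_state state;
};

/* Hypothetical outcomes of the work done by one tick. */
enum toy_tick_event {
    TOY_EVENT_NONE,
    TOY_EVENT_PROTOCOL_ERROR,
    TOY_EVENT_OUT_OF_MEMORY
};

/* Returns 1 if the tick itself was processed, 0 on a fatal failure. */
static int toy_tick(struct toy_conn *conn, enum toy_tick_event ev)
{
    assert(conn != NULL);

    switch (ev) {
    case TOY_EVENT_OUT_OF_MEMORY:
        return 0; /* fatal: the tick could not be processed at all */
    case TOY_EVENT_PROTOCOL_ERROR:
        conn->state = TOY_CONN_FAILED; /* permanent error is recorded */
        return 1;                      /* but the tick itself succeeded */
    case TOY_EVENT_NONE:
    default:
        return 1;
    }
}
```

In this model a permanent connection error is visible only through the
connection state (and, in the real design, through later SSL_read() or
SSL_write() calls), not through the tick return value.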

Multi-stream-multi-thread mode
------------------------------

Nothing needs special handling in multi-stream-multi-thread mode, as the
error stack entries are always copied from the QUIC_CHANNEL after the
failure. So if multiple threads are calling SSL_read()/SSL_write()
simultaneously, they all get the same error stack entries to report to
the user.