<!--
Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al.

SPDX-License-Identifier: curl
-->

# curl connection filters

Connection filters are a design in the internals of curl, not visible in its
public API. They were added in curl v7.87.0. This document describes the
concepts, their high-level implementation and the motivations.

## Filters

A "connection filter" is a piece of code that is responsible for handling a
range of operations of curl's connections: reading, writing, waiting on
external events, connecting and closing down - to name the most important
ones.

The most important feature of connection filters is that they can be stacked
on top of each other (or "chained" if you prefer that metaphor). In the common
scenario that you want to retrieve an `https:` URL with curl, you need two
basic things to send the request and get the response: a TCP connection,
represented by a `socket`, and an SSL instance to encrypt and decrypt over
that socket. You write your request to the SSL instance, which encrypts and
writes that data to the socket, which then sends the bytes over the network.

With connection filters, curl's internal setup looks something like this (`cf`
stands for connection filter):

```
Curl_easy *data         connectdata *conn        cf-ssl        cf-socket
+----------------+      +-----------------+      +-------+     +--------+
|https://curl.se/|----> | properties      |----> | keys  |---> | socket |--> OS --> network
+----------------+      +-----------------+      +-------+     +--------+

 Curl_write(data, buffer)
  --> Curl_cfilter_write(data, data->conn, buffer)
       ---> conn->filter->write(conn->filter, data, buffer)
```

While connection filters all do different things, they look the same from the
"outside". The code in `data` and `conn` does not really know **which**
filters are installed. `conn` just writes into the first filter, whatever that
is.

The same is true for filters. Each filter has a pointer to the `next` filter.
When SSL has encrypted the data, it does not write to a socket, it writes to
the next filter. Whether that is a socket, a file, or an HTTP/2 connection is
of no concern to the SSL filter.

This allows stacking, as in:

```
Direct:
  http://localhost/      conn -> cf-socket
  https://curl.se/       conn -> cf-ssl -> cf-socket
Via http proxy tunnel:
  http://localhost/      conn -> cf-http-proxy -> cf-socket
  https://curl.se/       conn -> cf-ssl -> cf-http-proxy -> cf-socket
Via https proxy tunnel:
  http://localhost/      conn -> cf-http-proxy -> cf-ssl -> cf-socket
  https://curl.se/       conn -> cf-ssl -> cf-http-proxy -> cf-ssl -> cf-socket
Via http proxy tunnel via SOCKS proxy:
  http://localhost/      conn -> cf-http-proxy -> cf-socks -> cf-socket
```
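
The stacking above can be illustrated with a small, self-contained model. This
is a sketch, not curl's code: `toy_filter`, `toy_write`, `xor_write` and
`sink_write` are hypothetical names, and an XOR with a fixed value merely
stands in for what an SSL filter would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_filter;

/* Every filter type exposes the same write signature. */
typedef size_t toy_write_fn(struct toy_filter *f, const char *buf,
                            size_t len);

struct toy_filter {
  toy_write_fn *write;       /* implementation for this filter type */
  struct toy_filter *next;   /* next filter in the chain, NULL at the end */
  char out[64];              /* sink only: the bytes "on the wire" */
  size_t outlen;
};

/* Stand-in for cf-socket: end of chain, stores what would be sent. */
static size_t sink_write(struct toy_filter *f, const char *buf, size_t len)
{
  size_t room = sizeof(f->out) - f->outlen;
  if(len > room)
    len = room;
  memcpy(f->out + f->outlen, buf, len);
  f->outlen += len;
  return len;
}

/* Stand-in for cf-ssl: transforms the data, then writes to whatever
   filter comes next, without knowing its type. */
static size_t xor_write(struct toy_filter *f, const char *buf, size_t len)
{
  char tmp[64];
  size_t i;
  assert(f->next);
  if(len > sizeof(tmp))
    len = sizeof(tmp);
  for(i = 0; i < len; ++i)
    tmp[i] = (char)(buf[i] ^ 0x20);
  return f->next->write(f->next, tmp, len);
}

/* What `conn` does: write into the first filter, whatever it is. */
static size_t toy_write(struct toy_filter *first, const char *buf,
                        size_t len)
{
  return first->write(first, buf, len);
}
```

Chaining an `xor_write` filter in front of a `sink_write` filter models the
`https:` case, while calling `toy_write()` on the sink directly models the
plain `http:` case. The caller is the same in both.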

### Connecting/Closing

Before `Curl_easy` can send the request, the connection needs to be
established. This means that all connection filters have done whatever they
need to do: waiting for the socket to be connected, doing the TLS handshake,
performing the HTTP tunnel request, etc. This has to be done in reverse order:
the last filter has to do its connect first, then the one above can start,
etc.

In principle, each filter does the following:

```
static CURLcode
myfilter_cf_connect(struct Curl_cfilter *cf,
                    struct Curl_easy *data,
                    bool *done)
{
  CURLcode result;

  if(cf->connected) {            /* we and all below are done */
    *done = TRUE;
    return CURLE_OK;
  }
                                 /* Let the filters below connect */
  result = cf->next->cft->connect(cf->next, data, done);
  if(result || !*done)
    return result;               /* below errored/not finished yet */

  /* MYFILTER CONNECT THINGS */  /* below connected, do our thing */
  *done = cf->connected = TRUE;  /* done, remember, return */
  return CURLE_OK;
}
```

Closing a connection works similarly. The `conn` tells the first filter to
close. Contrary to connecting, the filter does its own things first, before
telling the next filter to close.
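
The opposite ordering of connect and close can be made visible with a toy
chain that records which filter acts when. This is a minimal sketch: `toy_cf`,
`toy_connect`, `toy_close` and the one-letter filter labels are made up for
illustration, and unlike a real filter the connect here always completes in a
single call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_cf {
  char name;              /* one-letter label, e.g. 'S' for SSL */
  struct toy_cf *next;    /* filter below, NULL at the end of the chain */
  bool connected;
};

/* Append one character to the NUL-terminated order record. */
static void record_step(char *order, size_t size, char c)
{
  size_t n;
  assert(order && size);
  n = strlen(order);
  if(n + 1 < size) {
    order[n] = c;
    order[n + 1] = '\0';
  }
}

/* Connect: the filters below go first, then this one does its work. */
static void toy_connect(struct toy_cf *cf, char *order, size_t size)
{
  if(cf->connected)
    return;
  if(cf->next)
    toy_connect(cf->next, order, size);
  record_step(order, size, cf->name);
  cf->connected = true;
}

/* Close: this filter does its own work first, then tells the next. */
static void toy_close(struct toy_cf *cf, char *order, size_t size)
{
  if(cf->connected) {
    record_step(order, size, cf->name);
    cf->connected = false;
  }
  if(cf->next)
    toy_close(cf->next, order, size);
}
```

For a chain `S -> P -> T` (SSL over proxy over TCP), connecting records `TPS`
and closing records `SPT`.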

### Efficiency

There are two things curl is concerned about: efficient memory use and fast
transfers.

The memory footprint of a filter is relatively small:

```
struct Curl_cfilter {
  const struct Curl_cftype *cft; /* the type providing implementation */
  struct Curl_cfilter *next;     /* next filter in chain */
  void *ctx;                     /* filter type specific settings */
  struct connectdata *conn;      /* the connection this filter belongs to */
  int sockindex;                 /* TODO: like to get rid of this */
  BIT(connected);                /* != 0 iff this filter is connected */
};
```
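
To get a feeling for that size, the struct can be mirrored in a standalone
program. This is a sketch with assumptions: the `toy_` names are hypothetical,
the pointed-to types are left opaque, and `BIT()` is modelled as a one-bit
`unsigned int` bit-field, which may not match every build.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cftype;       /* opaque here: one static instance per type */
struct toy_connectdata;  /* opaque here: the owning connection */

#define TOY_BIT(x) unsigned int x:1   /* assumed shape of BIT() */

struct toy_cfilter {
  const struct toy_cftype *cft;
  struct toy_cfilter *next;
  void *ctx;
  struct toy_connectdata *conn;
  int sockindex;
  TOY_BIT(connected);
};

/* Per-filter overhead in bytes, excluding whatever ctx points to. */
static size_t toy_cfilter_size(void)
{
  return sizeof(struct toy_cfilter);
}
```

On common ABIs this comes to four pointers plus two `int`-sized slots, which
is typically 40 bytes on 64-bit platforms.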

The filter type `cft` is a singleton, one static struct for each type of
filter. The `ctx` is where a filter holds its specific data. That varies by
filter type. An http-proxy filter keeps the ongoing state of the CONNECT here
and frees it once the tunnel has been established. The SSL filter keeps the
`SSL*` (if OpenSSL is used) here until the connection is closed.

`conn` is a reference to the connection this filter belongs to, so nothing
extra besides the pointer itself.

Several things that were previously kept in `struct connectdata` now go into
the `filter->ctx` *when needed*. So, the memory footprint for connections that
do *not* use an http proxy, or socks, or https is lower.

As to transfer efficiency, writing and reading through a filter comes at near
zero cost *if the filter does not transform the data*. An http proxy or socks
filter, once it is connected, just passes the calls through. The
implementations of those filters look like this:

```
ssize_t Curl_cf_def_send(struct Curl_cfilter *cf, struct Curl_easy *data,
                         const void *buf, size_t len, CURLcode *err)
{
  return cf->next->cft->do_send(cf->next, data, buf, len, err);
}
```

The `recv` implementation is equivalent.
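
As a sketch of that symmetry, the receive side of a pass-through filter can be
modelled in a standalone toy. The types and names below (`toy_cf`,
`toy_cftype`, `toy_def_recv`, `mem_recv`) are hypothetical stand-ins, not
curl's definitions, and the error out-parameter is reduced to an `int`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

struct toy_cf;

struct toy_cftype {
  ssize_t (*do_recv)(struct toy_cf *cf, char *buf, size_t len, int *err);
};

struct toy_cf {
  const struct toy_cftype *cft;
  struct toy_cf *next;
  const void *ctx;
};

/* Pass-through: no transformation, just forward to the filter below. */
static ssize_t toy_def_recv(struct toy_cf *cf, char *buf, size_t len,
                            int *err)
{
  return cf->next->cft->do_recv(cf->next, buf, len, err);
}

/* End of chain for the demo: hands out bytes from a string in ctx. */
static ssize_t mem_recv(struct toy_cf *cf, char *buf, size_t len, int *err)
{
  const char *src = cf->ctx;
  size_t n = strlen(src);
  if(n > len)
    n = len;
  memcpy(buf, src, n);
  cf->ctx = src + n;     /* advance past what was delivered */
  *err = 0;
  return (ssize_t)n;
}

static const struct toy_cftype toy_cft_passthrough = { toy_def_recv };
static const struct toy_cftype toy_cft_mem = { mem_recv };
```

A filter of type `toy_cft_passthrough` placed above one of type `toy_cft_mem`
returns exactly what the lower filter delivers, at the cost of one extra
indirect call.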

## Filter Types

The currently existing filter types (curl 8.5.0) are:

* `TCP`, `UDP`, `UNIX`: filters that operate on a socket, providing raw I/O.
* `SOCKET-ACCEPT`: special TCP filter for a socket that has been
  `accept()`ed on a `listen()`ing socket.
* `SSL`: filter that applies TLS en-/decryption and handshake. Manages the
  underlying TLS backend implementation.
* `HTTP-PROXY`, `H1-PROXY`, `H2-PROXY`: the first manages the connection to an
  HTTP proxy server and uses one of the others depending on which ALPN
  protocol has been negotiated.
* `SOCKS-PROXY`: filter for the various SOCKS proxy protocol variations.
* `HAPROXY`: filter for the protocol of the same name, providing client IP
  information to a server.
* `HTTP/2`: filter for handling multiplexed transfers over an HTTP/2
  connection.
* `HTTP/3`: filter for handling multiplexed transfers over an HTTP/3+QUIC
  connection.
* `HAPPY-EYEBALLS`: meta filter that implements IPv4/IPv6 "happy eyeballing".
  It creates up to 2 sub-filters that race each other for a connection.
* `SETUP`: meta filter that manages the creation of sub-filter chains for a
  specific transport (e.g. TCP or QUIC).
* `HTTPS-CONNECT`: meta filter that races a TCP+TLS and a QUIC connection
  against each other to determine if HTTP/1.1, HTTP/2 or HTTP/3 shall be used
  for a transfer.

Meta filters combine other filters for a specific purpose, mostly during
connection establishment. Other filters like `TCP`, `UDP` and `UNIX` are only
found at the end of filter chains. SSL filters provide encryption, of
course. Protocol filters change the bytes sent and received.

## Filter Flags

Filter types carry flags that inform what they do. These are (for now):

* `CF_TYPE_IP_CONNECT`: this filter type talks directly to a server. This does
  not have to be the server the transfer wants to talk to, for example when a
  proxy server is used.
* `CF_TYPE_SSL`: this filter type provides encryption.
* `CF_TYPE_MULTIPLEX`: this filter type can manage multiple transfers in parallel.

Filter types can combine these flags. For example, the HTTP/3 filter types
have `CF_TYPE_IP_CONNECT`, `CF_TYPE_SSL` and `CF_TYPE_MULTIPLEX` set.

Flags are useful to extrapolate properties of a connection. To check if a
connection is encrypted, libcurl inspects the filter chain in place, top down,
for `CF_TYPE_SSL`. If it finds `CF_TYPE_IP_CONNECT` before any `CF_TYPE_SSL`,
the connection is not encrypted.

For example, `conn1` is for a `http:` request using a tunnel through an HTTP/2
`https:` proxy. `conn2` is a `https:` HTTP/2 connection to the same proxy.
`conn3` uses HTTP/3 without proxy. The filter chains would look like this
(simplified):

```
conn1 --> `HTTP-PROXY` --> `H2-PROXY` --> `SSL` --> `TCP`
flags:                     `IP_CONNECT`   `SSL`     `IP_CONNECT`

conn2 --> `HTTP/2` --> `SSL` --> `HTTP-PROXY` --> `H2-PROXY` --> `SSL` --> `TCP`
flags:                 `SSL`                      `IP_CONNECT`   `SSL`     `IP_CONNECT`

conn3 --> `HTTP/3`
flags:    `SSL|IP_CONNECT`
```

Inspecting the filter chains, `conn1` is seen as unencrypted, since it
contains an `IP_CONNECT` filter before any `SSL`. `conn2` is clearly encrypted
as an `SSL` flagged filter is seen first. `conn3` is also encrypted as the
`SSL` flag is checked before the presence of `IP_CONNECT`.

Similar checks can determine if a connection is multiplexed or not.
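
The top-down inspection fits in a few lines. The following is a simplified
sketch: the flag values, the `toy_cf` struct and the function names are
assumptions made for this example, not libcurl's actual definitions. The
multiplex check is one plausible reading of "similar checks".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values; only the names mirror the flags listed above. */
#define CF_TYPE_IP_CONNECT  (1 << 0)
#define CF_TYPE_SSL         (1 << 1)
#define CF_TYPE_MULTIPLEX   (1 << 2)

struct toy_cf {
  const char *name;
  int flags;
  struct toy_cf *next;
};

/* Walk the chain top down. The first filter carrying SSL or IP_CONNECT
   decides. SSL is tested first, so a filter with both flags counts as
   encrypted. */
static bool toy_conn_is_ssl(const struct toy_cf *cf)
{
  for(; cf; cf = cf->next) {
    if(cf->flags & CF_TYPE_SSL)
      return true;
    if(cf->flags & CF_TYPE_IP_CONNECT)
      return false;
  }
  return false;
}

/* Same idea for multiplexing: give up at the first filter that talks to
   a server or encrypts without being multiplexed itself. */
static bool toy_conn_is_multiplex(const struct toy_cf *cf)
{
  for(; cf; cf = cf->next) {
    if(cf->flags & CF_TYPE_MULTIPLEX)
      return true;
    if(cf->flags & (CF_TYPE_IP_CONNECT | CF_TYPE_SSL))
      return false;
  }
  return false;
}
```

Applied to the three chains above, `toy_conn_is_ssl()` yields false for
`conn1` and true for `conn2` and `conn3`.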

## Filter Tracing

Filters may make use of special trace macros like `CURL_TRC_CF(data, cf, msg,
...)`, with `data` being the transfer and `cf` being the filter instance.
These traces are normally not active and their execution is guarded so that
they are cheap to ignore.

Users of `curl` may activate them by adding the name of the filter type to the
`--trace-config` argument. For example, in order to get more detailed tracing
of an HTTP/2 request, invoke curl with:

```
> curl -v --trace-config ids,time,http/2 https://curl.se
```

This gives you trace output with time information, transfer+connection ids
and details from the `HTTP/2` filter. Filter type names in the trace config
are case insensitive. You may use `all` to enable tracing for all filter
types. When using `libcurl` you may call `curl_global_trace(config_string)` at
the start of your application to enable filter details.
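
How a guarded trace macro and a case-insensitive config match could fit
together is sketched below. Everything here is hypothetical (`toy_cf`,
`toy_trace_enabled`, `TOY_TRC_CF`, `toy_trc_write`). It only models the
behavior described above and is not how curl parses its trace config, which
may accept more syntax than this toy does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Compare the first `len` chars of `token` with all of `name`,
   ignoring ASCII case. */
static bool token_equals(const char *token, size_t len, const char *name)
{
  size_t i;
  if(strlen(name) != len)
    return false;
  for(i = 0; i < len; ++i) {
    if(tolower((unsigned char)token[i]) != tolower((unsigned char)name[i]))
      return false;
  }
  return true;
}

/* True if the comma-separated `config` enables tracing for the filter
   type `name`, either by naming it or via "all". */
static bool toy_trace_enabled(const char *config, const char *name)
{
  const char *p = config;
  while(*p) {
    size_t len = strcspn(p, ",");
    if(token_equals(p, len, name) || token_equals(p, len, "all"))
      return true;
    p += len;
    if(*p == ',')
      ++p;
  }
  return false;
}

struct toy_cf {
  const char *type_name;
  bool trace;            /* decided once, checked on every trace call */
};

static int toy_trc_calls;   /* counts formatted messages, for the demo */

static void toy_trc_write(const struct toy_cf *cf, const char *fmt, ...)
{
  va_list ap;
  fprintf(stderr, "[%s] ", cf->type_name);
  va_start(ap, fmt);
  vfprintf(stderr, fmt, ap);
  va_end(ap);
  fputc('\n', stderr);
  ++toy_trc_calls;
}

/* Guard first, format later: a disabled trace costs one branch. */
#define TOY_TRC_CF(cf, ...) \
  do { if((cf)->trace) toy_trc_write((cf), __VA_ARGS__); } while(0)
```

With the config string `ids,time,http/2`, a filter whose type name is `HTTP/2`
gets its `trace` flag set, while an `SSL` filter does not, so its
`TOY_TRC_CF()` calls never reach the formatting code.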

## Meta Filters

"Meta filters" is a catch-all name for filter types that do not change the
transfer data in any way but provide other important services to curl. In
general, it is possible to do all sorts of silly things with them. One of the
commonly used, important things is "eyeballing".

The `HAPPY-EYEBALLS` filter is involved in the connect phase. Its job is to
try the various IPv4 and IPv6 addresses that are known for a server. If only
one address family is known (or configured), it tries the addresses one after
the other with timeouts calculated from the number of addresses and the
overall connect timeout.

When more than one address family is to be tried, it splits the address list
into IPv4 and IPv6 and makes parallel attempts. The connection filter chain
looks like this:

```
* create connection for http://curl.se
conn[curl.se] --> SETUP[TCP] --> HAPPY-EYEBALLS --> NULL
* start connect
conn[curl.se] --> SETUP[TCP] --> HAPPY-EYEBALLS --> NULL
                                 - ballerv4 --> TCP[151.101.1.91]:443
                                 - ballerv6 --> TCP[2a04:4e42:c00::347]:443
* v6 answers, connected
conn[curl.se] --> SETUP[TCP] --> HAPPY-EYEBALLS --> TCP[2a04:4e42:c00::347]:443
* transfer
```
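
The race in the diagram above can be modelled as a polling loop over two
attempts. This is only a sketch under simplifying assumptions: `toy_baller`,
`toy_eyeballs_step` and the "steps until connected" counter are invented for
illustration, whereas real attempts are driven by socket events and timers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_baller {
  const char *addr;      /* address this attempt connects to */
  int steps_left;        /* polls until the attempt succeeds */
  bool active;           /* false once torn down */
};

/* Advance one attempt; returns true once it is connected. */
static bool baller_poll(struct toy_baller *b)
{
  if(!b->active)
    return false;
  if(b->steps_left > 0)
    --b->steps_left;
  return b->steps_left == 0;
}

/* Poll all attempts once. On the first success, tear down the others
   and return the winner, otherwise return NULL (not done yet). */
static struct toy_baller *toy_eyeballs_step(struct toy_baller *ballers,
                                            size_t count)
{
  size_t i, j;
  for(i = 0; i < count; ++i) {
    if(baller_poll(&ballers[i])) {
      for(j = 0; j < count; ++j) {
        if(j != i)
          ballers[j].active = false;
      }
      return &ballers[i];
    }
  }
  return NULL;
}
```

Calling `toy_eyeballs_step()` repeatedly until it returns non-NULL mirrors the
"v6 answers, connected" step: the winner stays in the chain and the other
attempt is discarded.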

The modular design of connection filters, and the fact that they can be
plugged into each other, is used to control the parallel attempts. When a
`TCP` filter does not connect (in time), it is torn down and another one is
created for the next address. This keeps the `TCP` filter simple.

The `HAPPY-EYEBALLS` filter, on the other hand, stays focused on its side of
the problem. It can also be used for other types of connection by giving it
another filter type to try. This is how happy eyeballing works for QUIC:

```
* create connection for --http3-only https://curl.se
conn[curl.se] --> SETUP[QUIC] --> HAPPY-EYEBALLS --> NULL
* start connect
conn[curl.se] --> SETUP[QUIC] --> HAPPY-EYEBALLS --> NULL
                                  - ballerv4 --> HTTP/3[151.101.1.91]:443
                                  - ballerv6 --> HTTP/3[2a04:4e42:c00::347]:443
* v6 answers, connected
conn[curl.se] --> SETUP[QUIC] --> HAPPY-EYEBALLS --> HTTP/3[2a04:4e42:c00::347]:443
* transfer
```

When we plug these two variants together, we get the `HTTPS-CONNECT` filter
type that is used for `--http3` when **both** HTTP/3 and HTTP/2 or HTTP/1.1
shall be attempted:

```
* create connection for --http3 https://curl.se
conn[curl.se] --> HTTPS-CONNECT --> NULL
* start connect
conn[curl.se] --> HTTPS-CONNECT --> NULL
                  - SETUP[QUIC] --> HAPPY-EYEBALLS --> NULL
                                    - ballerv4 --> HTTP/3[151.101.1.91]:443
                                    - ballerv6 --> HTTP/3[2a04:4e42:c00::347]:443
                  - SETUP[TCP]  --> HAPPY-EYEBALLS --> NULL
                                    - ballerv4 --> TCP[151.101.1.91]:443
                                    - ballerv6 --> TCP[2a04:4e42:c00::347]:443
* v4 QUIC answers, connected
conn[curl.se] --> HTTPS-CONNECT --> SETUP[QUIC] --> HAPPY-EYEBALLS --> HTTP/3[151.101.1.91]:443
* transfer
```