Design Problem: Abstract Record Layer
=====================================

This document covers the design of an abstract record layer for use in (D)TLS.
The QUIC record layer is handled separately.

A record within this document refers to a packet of data. It will typically
contain some header data and some payload data, and will often be
cryptographically protected. A record may or may not have a one-to-one
correspondence with network packets, depending on the implementation details of
an individual record layer.

The term record comes directly from the TLS and DTLS specifications.

Libssl supports a number of different types of record layer, and record layer
variants:

- Standard TLS record layer
- Standard DTLS record layer
- Kernel TLS record layer

Within the TLS record layer there are options to handle "multiblock" and
"pipelining" which are different approaches for supporting the reading or
writing of multiple records at the same time. All record layer variants also
have to be able to handle different protocol versions.

These different record layer implementations, variants and protocol versions
have each been added at different times and over many years. The result is that
each took a slightly different approach to achieving the goals that were
appropriate at the time, and the integration points where they were added are
spread throughout the code.

The introduction of QUIC support will see the implementation of a new record
layer, i.e. the QUIC-TLS record layer. This refers to the "inner" TLS
implementation used by QUIC. Records here will be in the form of QUIC CRYPTO
frames.

Requirements
------------

The technical requirements
[document](https://github.com/openssl/openssl/blob/master/doc/designs/quic-design/quic-requirements.md)
lists these requirements that are relevant to the record layer:

* The current libssl record layer includes support for TLS, DTLS and KTLS. QUIC
  will introduce another variant and there may be more over time. The OMC
  requires a pluggable record layer interface to be implemented to enable this
  to be less intrusive, more maintainable, and to harmonize the existing record
  layer interactions between TLS, DTLS, KTLS and the planned QUIC protocols. The
  pluggable record layer interface will be internal only for MVP and be public
  in a future release.

* The minimum viable product (MVP) for the next release is a pluggable record
  layer interface and a single stream QUIC client in the form of s_client that
  does not require significant API changes. In the MVP, interoperability should
  be prioritized over strict standards compliance.

* Once we have a fully functional QUIC implementation (in a subsequent release),
  it should be possible for external libraries to be able to use the pluggable
  record layer interface and it should offer a stable ABI (via a provider).

The MVP requirements are:

* a pluggable record layer (not public for MVP)

Candidate Solutions that were considered
----------------------------------------

This section outlines two different solution approaches that were considered for
the abstract record layer.

### Use a METHOD based approach

A METHOD based approach is simply a structure containing function pointers. It
is a common pattern in the OpenSSL codebase. Different strategies for
implementing a METHOD can be employed, but these differences are hidden from
the caller of the METHOD.
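
The pattern described above can be sketched in a few lines of C. This is an
illustration only, not OpenSSL code: `DEMO_METHOD`, the two implementations and
`demo_record_len()` are hypothetical names, and the overhead figures are
arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified METHOD: a table of function pointers. */
typedef struct demo_method_st {
    const char *name;
    /* Returns the number of framing bytes added to a payload. */
    size_t (*overhead)(size_t payload_len);
} DEMO_METHOD;

static size_t plain_overhead(size_t payload_len)
{
    (void)payload_len;
    return 5;                   /* illustrative fixed-size header */
}

static size_t padded_overhead(size_t payload_len)
{
    /* illustrative header plus padding up to a 16 byte boundary */
    return 5 + (16 - payload_len % 16) % 16;
}

static const DEMO_METHOD demo_plain_method  = { "plain",  plain_overhead };
static const DEMO_METHOD demo_padded_method = { "padded", padded_overhead };

/* The caller only sees the METHOD; the strategy behind it is hidden. */
static size_t demo_record_len(const DEMO_METHOD *meth, size_t payload_len)
{
    assert(meth != NULL && meth->overhead != NULL);
    return payload_len + meth->overhead(payload_len);
}
```

The caller in `demo_record_len()` is identical for both implementations, which
is the property the record layer design relies on.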

In this solution we would seek to implement a different METHOD for each of the
types of record layer that we support, i.e. there would be one for the standard
TLS record layer, one for the standard DTLS record layer, one for kernel TLS and
one for QUIC-TLS.

In the MVP the METHOD approach would be private. However, once it has
stabilised, it would be straightforward to supply public functions to enable
end user applications to construct their own METHODs.

This option is simpler to implement than the alternative of having a provider
based approach. However, it could be used as a "stepping stone" for that, i.e.
the MVP could implement a METHOD based approach, and subsequent releases could
convert the METHODs into fully fetchable algorithms.

Pros:

* Simple approach that has been used historically in OpenSSL
* Could be used as the basis for the final public solution
* Could also be used as the basis for a fetchable solution in a subsequent
  release
* If this option is later converted to a fetchable solution then much of the
  effort involved in making the record layer fetchable can be deferred to a
  later release

Cons:

* Not consistent with the provider based approach we used for extensibility in
  3.0
* If this option is implemented and later converted to a fetchable solution then
  some rework might be required

### Use a provider based approach

This approach is very similar to the alternative METHOD based approach. The
main difference is that the record layer implementations would be held in
providers and "fetched" in much the same way that cryptographic algorithms are
fetched in OpenSSL 3.0.

This approach is more consistent with the approach adopted for extensibility in
3.0. METHODs are being deprecated, with providers being used extensively
instead.

Complex objects (e.g. an `SSL` object) cannot be passed across the
libssl/provider boundary. This imposes some restrictions on the design of the
functions that can be implemented. Additionally, implementing the infrastructure
for a new fetchable operation is more involved than a METHOD based approach.

Pros:

* Consistent with the extensibility solution used in 3.0
* If this option is implemented immediately in the MVP then it would avoid later
  rework if adopted in a subsequent release

Cons:

* More complicated to implement than the simple METHOD based approach
* Cannot pass complex objects across the provider boundary

### Selected solution

The METHOD based approach has been selected for MVP, with the expectation that
subsequent releases will convert it to a full provider based solution accessible
to third party applications.

Solution Description: The METHOD based approach
-----------------------------------------------

This section focuses on the selected approach of using METHODs and further
elaborates on how the design works.

A proposed internal record method API is given in
[Appendix A](#appendix-a-the-internal-record-method-api).

An `OSSL_RECORD_METHOD` represents the implementation of a particular type of
record layer. It contains a set of function pointers to represent the various
actions that can be performed by a record layer.

An `OSSL_RECORD_LAYER` object represents a specific instantiation of a
particular `OSSL_RECORD_METHOD`. It contains the state used by that
`OSSL_RECORD_METHOD` for a specific connection (i.e. `SSL` object). Any `SSL`
object will have at least 2 `OSSL_RECORD_LAYER` objects associated with it - one
for reading and one for writing. In some cases there may be more than 2 - for
example in DTLS it may be necessary to retransmit records from a previous epoch.
There will be different `OSSL_RECORD_LAYER` objects for different protection
levels or epochs. It may be that different `OSSL_RECORD_METHOD`s are used for
different protection levels. For example a connection might start using the
standard TLS record layer during the handshake, and later transition to using
the kernel TLS record layer once the handshake is complete.

A new `OSSL_RECORD_LAYER` is created by calling the `new` function of the
associated `OSSL_RECORD_METHOD`, and freed by calling the `free` function. The
parameters to the `new` function also supply all of the cryptographic state
(e.g. keys, IVs, symmetric encryption algorithms, hash algorithm, etc.) used by
the record layer. The internal structure details of an `OSSL_RECORD_LAYER` are
entirely hidden from the rest of libssl and can be specific to the given
`OSSL_RECORD_METHOD`. In practice the standard internal TLS, DTLS and KTLS
`OSSL_RECORD_METHOD`s all use a common `OSSL_RECORD_LAYER` structure. However,
the QUIC-TLS implementation is likely to use a different structure layout.

All of the header and payload data for a single record will be represented by an
`OSSL_RECORD_TEMPLATE` structure when writing. Libssl will construct a set of
templates for records to be written out and pass them to the "write" record
layer. In most cases only a single record is ever written out at one time,
however there are some cases (such as when using the "pipelining" or
"multiblock" optimisations) where multiple records can be written in one go.

It is the record layer's responsibility to know whether it can support multiple
records in one go or not. It is libssl's responsibility to split the payload
data into `OSSL_RECORD_TEMPLATE` objects. Libssl will call the record layer's
`get_max_records()` function to determine how many records a given payload
should be split into. If that value is more than one, then libssl will construct
(up to) that number of `OSSL_RECORD_TEMPLATE`s and pass the whole set to the
record layer's `write_records()` function.
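
A minimal sketch of the splitting step described above, assuming the number of
records allowed (`maxrecs`) has already been obtained from `get_max_records()`
and that `fraglen` is the fragment size to use. `demo_split_payload()` is a
hypothetical helper, not a libssl function; the template structure repeats the
fields listed in Appendix A.

```c
#include <assert.h>
#include <stddef.h>

/* Same fields as OSSL_RECORD_TEMPLATE in Appendix A. */
typedef struct ossl_record_template_st {
    int type;
    unsigned int version;
    const unsigned char *buf;
    size_t buflen;
} OSSL_RECORD_TEMPLATE;

/*
 * Hypothetical helper: fill up to |maxrecs| templates with consecutive
 * slices of |payload|, each at most |fraglen| bytes long. Returns the number
 * of templates filled. |*consumed| is set to the number of payload bytes
 * covered by those templates.
 */
static size_t demo_split_payload(OSSL_RECORD_TEMPLATE *templates,
                                 size_t maxrecs, int type,
                                 unsigned int version,
                                 const unsigned char *payload, size_t len,
                                 size_t fraglen, size_t *consumed)
{
    size_t n = 0, off = 0;

    assert(fraglen > 0);
    while (n < maxrecs && off < len) {
        size_t chunk = len - off < fraglen ? len - off : fraglen;

        templates[n].type = type;
        templates[n].version = version;
        templates[n].buf = payload + off;   /* no copy: points into payload */
        templates[n].buflen = chunk;
        off += chunk;
        n++;
    }
    *consumed = off;
    return n;
}
```

Any payload bytes beyond `*consumed` would need a further round of templates
once the first set has been written.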

The implementation of the `write_records` function must construct the
appropriate number of records, apply protection to them as required and then
write them out to the underlying transport layer BIO. In the event that not
all the data can be transmitted at the current time (e.g. because the underlying
transport has indicated a retry), then the `write_records` function will return
a "retry" response. It is permissible for the data to be partially sent, but
this is still considered a "retry" until all of the data is sent.

On a success or retry response libssl may free its buffers immediately. The
`OSSL_RECORD_LAYER` object will have to buffer any untransmitted data until it
is eventually sent.

If a "retry" occurs, then libssl will subsequently call `retry_write_records`
and continue to do so until a success return value is received. Libssl will
never call `write_records` a second time until a previous call to
`write_records` or `retry_write_records` has indicated success.
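
The write/retry contract above can be illustrated with a toy record layer.
Everything here is assumed for illustration: `DEMO_RL` stands in for an
`OSSL_RECORD_LAYER` plus its transport, the mock transport accepts at most
`budget` bytes per call (assumed to be greater than zero), and no protection is
applied. Returning a fatal error when a write is attempted while data is still
pending is this sketch's choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OSSL_RECORD_RETURN_SUCCESS  1
#define OSSL_RECORD_RETURN_RETRY    0
#define OSSL_RECORD_RETURN_FATAL   -2

/* Hypothetical stand-in for a record layer plus its transport. */
typedef struct demo_rl_st {
    unsigned char pending[256]; /* data accepted but not yet transmitted */
    size_t pendlen;             /* bytes held in |pending| */
    size_t sent;                /* bytes of |pending| already transmitted */
    size_t budget;              /* bytes the mock transport takes per call */
    size_t wire;                /* total bytes "on the wire" so far */
} DEMO_RL;

/* Push as much buffered data as the mock transport will take. */
static int demo_flush(DEMO_RL *rl)
{
    size_t left = rl->pendlen - rl->sent;
    size_t n = left < rl->budget ? left : rl->budget;

    rl->sent += n;
    rl->wire += n;
    if (rl->sent < rl->pendlen)
        return OSSL_RECORD_RETURN_RETRY;    /* partial send is still a retry */
    rl->pendlen = rl->sent = 0;
    return OSSL_RECORD_RETURN_SUCCESS;
}

/* Copies the caller's data, so the caller may free it even on retry. */
static int demo_write_records(DEMO_RL *rl, const unsigned char *buf,
                              size_t len)
{
    assert(rl != NULL);
    if (rl->pendlen != 0 || len > sizeof(rl->pending))
        return OSSL_RECORD_RETURN_FATAL;    /* previous write not finished */
    memcpy(rl->pending, buf, len);
    rl->pendlen = len;
    return demo_flush(rl);
}

static int demo_retry_write_records(DEMO_RL *rl)
{
    assert(rl != NULL);
    return demo_flush(rl);
}

/*
 * Caller side: write once, then retry until success. Returns the number of
 * calls made, or 0 on a fatal error.
 */
static int demo_send_all(DEMO_RL *rl, const unsigned char *buf, size_t len)
{
    int calls = 1;
    int ret = demo_write_records(rl, buf, len);

    while (ret == OSSL_RECORD_RETURN_RETRY) {
        ret = demo_retry_write_records(rl);
        calls++;
    }
    return ret == OSSL_RECORD_RETURN_SUCCESS ? calls : 0;
}
```

The tight loop in `demo_send_all()` only keeps the example short. A real caller
would more likely hand the retry back to the application and try again once the
transport is writable.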

Libssl will read records by calling the `read_record` function. The
`OSSL_RECORD_LAYER` may read multiple records in one go and buffer them, but the
`read_record` function only ever returns one record at a time. The
`OSSL_RECORD_LAYER` object owns the buffers for the record that has been read
and supplies a pointer into that buffer back to libssl for the payload data, as
well as other information about the record such as its length and the type of
data contained in it. Each record has an associated opaque handle `rechandle`.
The record data must remain buffered by the `OSSL_RECORD_LAYER` until it has
been released via a call to `release_record()`.
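
The ownership rules for reading can be sketched as follows. This is a reduced
model under stated assumptions: `DEMO_RL` and `DEMO_REC` are hypothetical, the
records are taken to be already read and processed, and the version, epoch and
sequence number outputs of the real `read_record` are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_RECS 4

/* Hypothetical buffered record owned by the record layer. */
typedef struct demo_rec_st {
    uint8_t type;
    const unsigned char *data;
    size_t datalen;
} DEMO_REC;

typedef struct demo_rl_st {
    DEMO_REC recs[DEMO_MAX_RECS]; /* records already read and processed */
    size_t numrecs;               /* how many entries of |recs| are valid */
    size_t next_read;             /* next record to hand out */
    size_t next_release;          /* oldest record not yet released */
} DEMO_RL;

/* Hand out one record at a time. Returns 1 on success, 0 if none buffered. */
static int demo_read_record(DEMO_RL *rl, void **rechandle, uint8_t *type,
                            const unsigned char **data, size_t *datalen)
{
    DEMO_REC *rec;

    assert(rl != NULL && rechandle != NULL);
    if (rl->next_read >= rl->numrecs)
        return 0;
    rec = &rl->recs[rl->next_read++];
    *rechandle = rec;           /* opaque to the caller */
    *type = rec->type;
    *data = rec->data;          /* points into record layer owned storage */
    *datalen = rec->datalen;
    return 1;
}

/* Release in read order. Returns 1 on success, 0 on an unexpected handle. */
static int demo_release_record(DEMO_RL *rl, void *rechandle)
{
    assert(rl != NULL);
    if (rl->next_release >= rl->next_read
            || rechandle != (void *)&rl->recs[rl->next_release])
        return 0;
    rl->recs[rl->next_release].data = NULL;   /* buffer may now be reused */
    rl->recs[rl->next_release].datalen = 0;
    rl->next_release++;
    return 1;
}
```

The caller never copies or frees the payload itself. It only holds the pointer
until it hands the handle back through the release call.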

A record layer implementation supplies various functions to enable libssl to
query the current state. In particular:

`unprocessed_read_pending()`: to query whether there is data buffered that has
already been read from the underlying BIO, but not yet processed.

`processed_read_pending()`: to query whether there is data buffered that has
been read from the underlying BIO and has been processed. The data is not
necessarily application data.

`app_data_pending()`: to query the amount of processed application data that is
buffered and available for immediate read.

`get_alert_code()`: to query the alert code that should be used in the event
that a previous attempt to read or write records failed.

`get_state()`: to obtain a printable string to describe the current state of the
record layer.

`get_compression()`: to obtain information about the compression method
currently being used by the record layer.

`get_max_record_overhead()`: to obtain the maximum number of bytes the record
layer will add to the payload bytes before transmission. This does not include
any expansion that might occur during compression. Currently this is only
implemented for DTLS.

In addition, libssl will tell the record layer about various events that might
occur that are relevant to the record layer's operation:

`set1_bio()`: called if the underlying BIO being used by the record layer has
been changed.

`set_protocol_version()`: called during protocol version negotiation when a
specific protocol version has been selected.

`set_plain_alerts()`: to indicate that receiving unencrypted alerts is allowed
in the current context, even if normally we would expect to receive encrypted
data. This is only relevant for TLSv1.3.

`set_first_handshake()`: called at the beginning and end of the first handshake
for any given (D)TLS connection.

`set_max_pipelines()`: called to configure the maximum number of pipelines of
data that the record layer should process in one go. By default this is 1.

`set_in_init()`: called by libssl to tell the record layer whether we are
currently `in_init` or not. Defaults to "true".

`set_options()`: called by libssl in the event that the current set of options
to use has been updated.

`set_max_frag_len()`: called by libssl to set the maximum allowed fragment
length that is in force at the moment. This might be the result of user
configuration, or it may be negotiated during the handshake.

`increment_sequence_ctr()`: force the record layer to increment its sequence
counter. In most cases the record layer will entirely manage its own sequence
counters. However, in the `DTLSv1_listen()` corner case, libssl needs to
initialise the record layer with an incremented sequence counter.

`alloc_buffers()`: called by libssl to request that the record layer allocate
its buffers. This is a hint only and the record layer is expected to manage its
own buffer allocation and freeing.

`free_buffers()`: called by libssl to request that the record layer free its
buffers. This is a hint only and the record layer is expected to manage its own
buffer allocation and freeing.

Appendix A: The internal record method API
------------------------------------------

The internal `recordmethod.h` header file for the record method API:

```` C
/*
 * We use the term "record" here to refer to a packet of data. Records are
 * typically protected via a cipher and MAC, or an AEAD cipher (although not
 * always). This usage of the term record is consistent with the TLS concept.
 * In QUIC the term "record" is not used but it is analogous to the QUIC term
 * "packet". The interface in this file applies to all protocols that protect
 * records/packets of data, i.e. (D)TLS and QUIC. The term record is used to
 * refer to both contexts.
 */

/*
 * An OSSL_RECORD_METHOD is a protocol specific method which provides the
 * functions for reading and writing records for that protocol. Which
 * OSSL_RECORD_METHOD to use for a given protocol is defined by the SSL_METHOD.
 */
typedef struct ossl_record_method_st OSSL_RECORD_METHOD;

/*
 * An OSSL_RECORD_LAYER is just an externally defined opaque pointer created by
 * the method
 */
typedef struct ossl_record_layer_st OSSL_RECORD_LAYER;


# define OSSL_RECORD_ROLE_CLIENT 0
# define OSSL_RECORD_ROLE_SERVER 1

# define OSSL_RECORD_DIRECTION_READ  0
# define OSSL_RECORD_DIRECTION_WRITE 1

/*
 * Protection level. For <= TLSv1.2 only "NONE" and "APPLICATION" are used.
 */
# define OSSL_RECORD_PROTECTION_LEVEL_NONE        0
# define OSSL_RECORD_PROTECTION_LEVEL_EARLY       1
# define OSSL_RECORD_PROTECTION_LEVEL_HANDSHAKE   2
# define OSSL_RECORD_PROTECTION_LEVEL_APPLICATION 3

# define OSSL_RECORD_RETURN_SUCCESS           1
# define OSSL_RECORD_RETURN_RETRY             0
# define OSSL_RECORD_RETURN_NON_FATAL_ERR    -1
# define OSSL_RECORD_RETURN_FATAL            -2
# define OSSL_RECORD_RETURN_EOF              -3

/*
 * Template for creating a record. A record consists of the |type| of data it
 * will contain (e.g. alert, handshake, application data, etc) along with a
 * buffer of payload data in |buf| of length |buflen|.
 */
struct ossl_record_template_st {
    int type;
    unsigned int version;
    const unsigned char *buf;
    size_t buflen;
};

typedef struct ossl_record_template_st OSSL_RECORD_TEMPLATE;

/*
 * Rather than a "method" approach, we could make this fetchable - Should we?
 * There could be some complexity in finding suitable record layer implementations
 * e.g. we need to find one that matches the negotiated protocol, cipher,
 * extensions, etc. The selection_cb approach given above doesn't work so well
 * if unknown third party providers with OSSL_RECORD_METHOD implementations are
 * loaded.
 */

/*
 * If this becomes public API then we will need functions to create and
 * free an OSSL_RECORD_METHOD, as well as functions to get/set the various
 * function pointers....unless we make it fetchable.
 */
struct ossl_record_method_st {
    /*
     * Create a new OSSL_RECORD_LAYER object for handling the protocol version
     * set by |vers|. |role| is 0 for client and 1 for server. |direction|
     * indicates either read or write. |level| is the protection level as
     * described above. |settings| are mandatory settings that will cause the
     * new() call to fail if they are not understood (for example to require
     * Encrypt-Then-Mac support). |options| are optional settings that will not
     * cause the new() call to fail if they are not understood (for example
     * whether to use "read ahead" or not).
     *
     * The BIO in |transport| is the BIO for the underlying transport layer.
     * Where the direction is "read", then this BIO will only ever be used for
     * reading data. Where the direction is "write", then this BIO will only
     * ever be used for writing data.
     *
     * An SSL object will always have at least 2 OSSL_RECORD_LAYER objects in
     * force at any one time (one for reading and one for writing). In some
     * protocols more than 2 might be used (e.g. in DTLS for retransmitting
     * messages from an earlier epoch).
     *
     * The created OSSL_RECORD_LAYER object is stored in *ret on success (or
     * NULL otherwise). The return value will be one of
     * OSSL_RECORD_RETURN_SUCCESS, OSSL_RECORD_RETURN_FATAL or
     * OSSL_RECORD_RETURN_NON_FATAL_ERR. A non-fatal return means that creation
     * of the record layer has failed because it is unsuitable, but an
     * alternative record layer can be tried instead.
     */

    /*
     * If we eventually make this fetchable then we will need to use something
     * other than EVP_CIPHER. Also mactype would not be a NID, but a string. For
     * now though, this works.
     */
    int (*new_record_layer)(OSSL_LIB_CTX *libctx,
                            const char *propq, int vers,
                            int role, int direction,
                            int level,
                            uint16_t epoch,
                            unsigned char *key,
                            size_t keylen,
                            unsigned char *iv,
                            size_t ivlen,
                            unsigned char *mackey,
                            size_t mackeylen,
                            const EVP_CIPHER *ciph,
                            size_t taglen,
                            int mactype,
                            const EVP_MD *md,
                            COMP_METHOD *comp,
                            BIO *prev,
                            BIO *transport,
                            BIO *next,
                            BIO_ADDR *local,
                            BIO_ADDR *peer,
                            const OSSL_PARAM *settings,
                            const OSSL_PARAM *options,
                            const OSSL_DISPATCH *fns,
                            void *cbarg,
                            OSSL_RECORD_LAYER **ret);
    int (*free)(OSSL_RECORD_LAYER *rl);

    int (*reset)(OSSL_RECORD_LAYER *rl); /* Is this needed? */

    /* Returns 1 if we have unprocessed data buffered or 0 otherwise */
    int (*unprocessed_read_pending)(OSSL_RECORD_LAYER *rl);

    /*
     * Returns 1 if we have processed data buffered that can be read or 0 otherwise
     * - not necessarily app data
     */
    int (*processed_read_pending)(OSSL_RECORD_LAYER *rl);

    /*
     * The amount of processed app data that is internally buffered and
     * available to read
     */
    size_t (*app_data_pending)(OSSL_RECORD_LAYER *rl);

    /*
     * Find out the maximum number of records that the record layer is prepared
     * to process in a single call to write_records. It is the caller's
     * responsibility to ensure that no call to write_records exceeds this
     * number of records. |type| is the type of the records that the caller
     * wants to write, and |len| is the total amount of data that it wants
     * to send. |maxfrag| is the maximum allowed fragment size based on user
     * configuration, or TLS parameter negotiation. |*preffrag| contains on
     * entry the default fragment size that will actually be used based on user
     * configuration. This will always be less than or equal to |maxfrag|. On
     * exit the record layer may update this to an alternative fragment size to
     * be used. This must always be less than or equal to |maxfrag|.
     */
    size_t (*get_max_records)(OSSL_RECORD_LAYER *rl, uint8_t type, size_t len,
                              size_t maxfrag, size_t *preffrag);

    /*
     * Write |numtempl| records from the array of record templates pointed to
     * by |templates|. Each record should be no longer than the maximum
     * fragment length in force (see get_max_records() and
     * set_max_frag_len()), and there should be no more records than the
     * value returned by get_max_records().
     * Where possible the caller will attempt to ensure that all records are the
     * same length, except the last record. This may not always be possible so
     * the record method implementation should not rely on this being the case.
     * In the event of a retry the caller should call retry_write_records()
     * to try again. No more calls to write_records() should be attempted until
     * retry_write_records() returns success.
     * Buffers allocated for the record templates can be freed immediately after
     * write_records() returns - even in the case of a retry.
     * The record templates represent the plaintext payload. The encrypted
     * output is written to the |transport| BIO.
     * Returns:
     *  1 on success
     *  0 on retry
     * -1 on failure
     */
    int (*write_records)(OSSL_RECORD_LAYER *rl, OSSL_RECORD_TEMPLATE *templates,
                         size_t numtempl);

    /*
     * Retry a previous call to write_records. The caller should continue to
     * call this until the function returns with success or failure. After
     * each retry more of the data may have been incrementally sent.
     * Returns:
     *  1 on success
     *  0 on retry
     * -1 on failure
     */
    int (*retry_write_records)(OSSL_RECORD_LAYER *rl);

    /*
     * Read a record and return the record layer version and record type in
     * the |rversion| and |type| parameters. |*data| is set to point to a
     * record layer buffer containing the record payload data and |*datalen|
     * is filled in with the length of that data. The |epoch| and |seq_num|
     * values are only used if DTLS has been negotiated. In that case they are
     * filled in with the epoch and sequence number from the record.
     * An opaque record layer handle for the record is returned in |*rechandle|
     * which is used in a subsequent call to |release_record|. The buffer must
     * remain available until release_record is called.
     *
     * Internally the OSSL_RECORD_METHOD implementation may read/process
     * multiple records in one go and buffer them.
     */
    int (*read_record)(OSSL_RECORD_LAYER *rl, void **rechandle, int *rversion,
                       uint8_t *type, unsigned char **data, size_t *datalen,
                       uint16_t *epoch, unsigned char *seq_num);
    /*
     * Release a buffer associated with a record previously read with
     * read_record. Records are guaranteed to be released in the order that they
     * are read.
     */
    int (*release_record)(OSSL_RECORD_LAYER *rl, void *rechandle);

    /*
     * In the event that a fatal error is returned from the functions above then
     * get_alert_code() can be called to obtain a more detailed identifier for
     * the error. In (D)TLS this is the alert description code.
     */
    int (*get_alert_code)(OSSL_RECORD_LAYER *rl);

    /*
     * Update the transport BIO from the one originally set in the
     * new_record_layer call
     */
    int (*set1_bio)(OSSL_RECORD_LAYER *rl, BIO *bio);

    /* Called when protocol negotiation selects a protocol version to use */
    int (*set_protocol_version)(OSSL_RECORD_LAYER *rl, int version);

    /*
     * Whether we are allowed to receive unencrypted alerts, even if we might
     * otherwise expect encrypted records. Ignored by protocol versions where
     * this isn't relevant
     */
    void (*set_plain_alerts)(OSSL_RECORD_LAYER *rl, int allow);

    /*
     * Called immediately after creation of the record layer if we are in a
     * first handshake. Also called at the end of the first handshake
     */
    void (*set_first_handshake)(OSSL_RECORD_LAYER *rl, int first);

    /*
     * Set the maximum number of pipelines that the record layer should process.
     * The default is 1.
     */
    void (*set_max_pipelines)(OSSL_RECORD_LAYER *rl, size_t max_pipelines);

    /*
     * Called to tell the record layer whether we are currently "in init" or
     * not. Default at creation of the record layer is "yes".
     */
    void (*set_in_init)(OSSL_RECORD_LAYER *rl, int in_init);

    /*
     * Get a short or long human readable description of the record layer state
     */
    void (*get_state)(OSSL_RECORD_LAYER *rl, const char **shortstr,
                      const char **longstr);

    /*
     * Set new options or modify ones that were originally specified in the
     * new_record_layer call.
     */
    int (*set_options)(OSSL_RECORD_LAYER *rl, const OSSL_PARAM *options);

    const COMP_METHOD *(*get_compression)(OSSL_RECORD_LAYER *rl);

    /*
     * Set the maximum fragment length to be used for the record layer. This
     * will override any previous value supplied for the "max_frag_len"
     * setting during construction of the record layer.
     */
    void (*set_max_frag_len)(OSSL_RECORD_LAYER *rl, size_t max_frag_len);

    /*
     * The maximum expansion in bytes that the record layer might add while
     * writing a record
     */
    size_t (*get_max_record_overhead)(OSSL_RECORD_LAYER *rl);

    /*
     * Increment the record sequence number
     */
    int (*increment_sequence_ctr)(OSSL_RECORD_LAYER *rl);

    /*
     * Allocate read or write buffers. Does nothing if already allocated.
     * Assumes default buffer length and 1 pipeline.
     */
    int (*alloc_buffers)(OSSL_RECORD_LAYER *rl);

    /*
     * Free read or write buffers. Fails if there is pending read or write
     * data. Buffers are automatically reallocated on next read/write.
     */
    int (*free_buffers)(OSSL_RECORD_LAYER *rl);
};
````
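
The non-fatal return value documented for `new_record_layer` implies that a
caller can try several record methods in turn. The sketch below shows one way
that might look. It is an assumption about usage, not the libssl
implementation: `DEMO_METHOD`, `DEMO_LAYER` and `demo_select()` are
hypothetical, the constructor is cut down to a single version argument, and the
rule that the kernel variant only accepts version 0x0303 is invented for the
example.

```c
#include <assert.h>
#include <stddef.h>

#define OSSL_RECORD_RETURN_SUCCESS         1
#define OSSL_RECORD_RETURN_NON_FATAL_ERR  -1
#define OSSL_RECORD_RETURN_FATAL          -2

typedef struct demo_layer_st { const char *kind; } DEMO_LAYER;

/* Hypothetical, heavily reduced method: only a constructor. */
typedef struct demo_method_st {
    int (*new_record_layer)(int vers, DEMO_LAYER **ret);
} DEMO_METHOD;

static DEMO_LAYER demo_ktls_layer = { "ktls" };
static DEMO_LAYER demo_tls_layer  = { "tls" };

/* Invented rule: pretend the kernel variant only handles version 0x0303. */
static int demo_ktls_new(int vers, DEMO_LAYER **ret)
{
    *ret = NULL;
    if (vers != 0x0303)
        return OSSL_RECORD_RETURN_NON_FATAL_ERR; /* unsuitable, try another */
    *ret = &demo_ktls_layer;
    return OSSL_RECORD_RETURN_SUCCESS;
}

static int demo_tls_new(int vers, DEMO_LAYER **ret)
{
    *ret = NULL;
    if (vers < 0x0301 || vers > 0x0304)
        return OSSL_RECORD_RETURN_FATAL;
    *ret = &demo_tls_layer;
    return OSSL_RECORD_RETURN_SUCCESS;
}

static const DEMO_METHOD demo_ktls_method = { demo_ktls_new };
static const DEMO_METHOD demo_tls_method  = { demo_tls_new };

/* Try each candidate in order, moving on only after a non-fatal failure. */
static int demo_select(const DEMO_METHOD *const *cands, size_t n, int vers,
                       DEMO_LAYER **ret)
{
    size_t i;
    int rc = OSSL_RECORD_RETURN_FATAL;

    assert(cands != NULL && ret != NULL);
    *ret = NULL;
    for (i = 0; i < n; i++) {
        rc = cands[i]->new_record_layer(vers, ret);
        if (rc != OSSL_RECORD_RETURN_NON_FATAL_ERR)
            break;
    }
    /* Running out of candidates is treated as fatal in this sketch. */
    return rc == OSSL_RECORD_RETURN_SUCCESS ? rc : OSSL_RECORD_RETURN_FATAL;
}
```

Ordering the candidates from most to least specialised lets the general
implementation act as the fallback.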