3.1.1 Transport layer and packets
The transport layer is able to build, compress, encipher, decipher, authenticate and parse ssh2 binary packets. It dispatches packets to the other software components and collects packets from them. Unlike other components, the transport layer is always running. It is specified in RFC 4253.
The transport layer needs a reliable data pipe in order to exchange packets between the client and server instances. A TCP connection is commonly used.
The first step of the protocol consists of exchanging version strings. Once both sides have transmitted their version strings in a human-readable format, they switch to the ssh2 binary packet format.
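As an illustration of the version string exchange, the sketch below checks a single identification line of the form "SSH-2.0-softwareversion SP comments CR LF" described in RFC 4253 and extracts the software version token. The function name ident_parse and its return convention are assumptions made for this example; they are not part of the library API, and the sketch ignores the "SSH-1.99-" compatibility prefix.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a library function: validates one
   identification line "SSH-2.0-softwareversion[ SP comments] CR LF"
   and copies the softwareversion token into `software`.
   Returns 0 on success, -1 if the line is malformed. */
int ident_parse(const char *line, size_t len,
                char *software, size_t software_size)
{
  static const char prefix[] = "SSH-2.0-";
  const size_t prefix_len = sizeof(prefix) - 1;

  /* RFC 4253 limits the line to 255 bytes, CR LF included. */
  if (len > 255 || len < prefix_len + 2)
    return -1;
  if (memcmp(line, prefix, prefix_len) != 0)
    return -1;
  if (line[len - 2] != '\r' || line[len - 1] != '\n')
    return -1;

  /* The software version runs up to the first space or the CR. */
  const char *start = line + prefix_len;
  size_t n = 0;
  while (prefix_len + n < len - 2 && start[n] != ' ')
    n++;

  if (n == 0 || n >= software_size)
    return -1;
  memcpy(software, start, n);
  software[n] = '\0';
  return 0;
}
```

Only after both peers have accepted such a line does the stream switch to binary packets.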
Binary packets are variable-sized arrays of bytes. A packet has a size header and a numerical message identifier that determines its role in the protocol. The interpretation of the packet content depends on the message id. Dispatching packets to the higher layers does not require nesting multiple packet headers. There is a single header and the message identifier is used to dispatch packets to the right software component.
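The single header can be pictured with the following sketch, which follows the clear-text layout given in RFC 4253 section 6: a 32-bit packet length, an 8-bit padding length, the payload whose first byte is the message identifier, then the padding. The struct pkt_view and the function pkt_parse are hypothetical names used for illustration only. The sketch assumes the buffer holds exactly one deciphered packet without its MAC, and it does not check the padding alignment rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative view of a clear-text packet; names are hypothetical. */
struct pkt_view {
  uint8_t        msg_id;       /* first payload byte */
  const uint8_t *content;      /* payload bytes following the message id */
  size_t         content_len;
};

/* Layout: uint32 packet_length | byte padding_length | payload | padding.
   packet_length counts everything after itself, MAC excluded.
   Returns 0 on success, -1 if the buffer is not a well-formed packet. */
int pkt_parse(const uint8_t *buf, size_t len, struct pkt_view *out)
{
  if (len < 6)                        /* size field, padding_length, msg id */
    return -1;

  uint32_t pkt_len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16)
                   | ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
  uint8_t pad_len = buf[4];

  if (pkt_len != len - 4)             /* buffer must hold exactly one packet */
    return -1;
  if ((size_t)pad_len + 2 > pkt_len)  /* padding_length byte + msg id + padding */
    return -1;

  size_t payload_len = pkt_len - 1 - pad_len;
  out->msg_id = buf[5];
  out->content = buf + 6;
  out->content_len = payload_len - 1;
  return 0;
}
```

Once msg_id is known, the content can be handed to the component in charge of that message without any further header.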
By looking at the assh_ssh_msg_e enum, it is easy to see that there are four categories of messages:
Transport related messages, starting at id 1,
Key-exchange related messages, starting at id 30,
User Authentication Protocol messages, starting at id 50 and
Connection Protocol messages, starting at id 80.
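The four categories listed above can be sketched as a classification of the message identifier. The enum msg_category and the function msg_classify are hypothetical and are not library identifiers. The upper bound of 127 for connection messages is taken from the number ranges in RFC 4250; identifiers above it are reserved or local extensions and are reported as "other" here.

```c
#include <stdint.h>

/* Hypothetical categories, named for this example only. */
enum msg_category {
  MSG_CAT_OTHER,       /* id 0, reserved ids and local extensions */
  MSG_CAT_TRANSPORT,   /* ids 1 to 29 */
  MSG_CAT_KEX,         /* ids 30 to 49 */
  MSG_CAT_USERAUTH,    /* ids 50 to 79 */
  MSG_CAT_CONNECTION   /* ids 80 to 127 */
};

/* Maps a message identifier to the component family that handles it. */
enum msg_category msg_classify(uint8_t id)
{
  if (id == 0)
    return MSG_CAT_OTHER;
  if (id < 30)
    return MSG_CAT_TRANSPORT;
  if (id < 50)
    return MSG_CAT_KEX;
  if (id < 80)
    return MSG_CAT_USERAUTH;
  if (id < 128)
    return MSG_CAT_CONNECTION;
  return MSG_CAT_OTHER;
}
```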
Some of the message identifiers used by parts of the protocol have multiple meanings. This is because some parts of the protocol are specified and implemented as extensions. The interpretation of the content of those messages depends on the selected and currently running extension.
The format of the packet content depends on the message identifier. Most of the time, the content is composed of a mix of 8-bit and 32-bit fixed-size fields as well as some variable-sized fields starting with a 32-bit length header.
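A minimal sketch of how such fields might be decoded is given below. It assumes the network byte order (most significant byte first) used by the ssh2 data types. The struct reader and the rd_u8, rd_u32 and rd_string functions are hypothetical helpers written for this example, not library functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds-checked cursor over the content of one packet. */
struct reader {
  const uint8_t *p;     /* next unread byte */
  size_t         left;  /* bytes remaining */
};

/* Reads an 8-bit field. Returns 0 on success, -1 if the data is exhausted. */
int rd_u8(struct reader *r, uint8_t *v)
{
  if (r->left < 1)
    return -1;
  *v = r->p[0];
  r->p += 1;
  r->left -= 1;
  return 0;
}

/* Reads a 32-bit field, most significant byte first. */
int rd_u32(struct reader *r, uint32_t *v)
{
  if (r->left < 4)
    return -1;
  *v = ((uint32_t)r->p[0] << 24) | ((uint32_t)r->p[1] << 16)
     | ((uint32_t)r->p[2] << 8)  |  (uint32_t)r->p[3];
  r->p += 4;
  r->left -= 4;
  return 0;
}

/* Reads a variable-sized field: 32-bit length header, then that many bytes.
   The cursor is left untouched on failure. */
int rd_string(struct reader *r, const uint8_t **data, uint32_t *len)
{
  struct reader tmp = *r;
  if (rd_u32(&tmp, len) != 0 || *len > tmp.left)
    return -1;
  *data = tmp.p;
  tmp.p += *len;
  tmp.left -= *len;
  *r = tmp;
  return 0;
}
```

Checking every length header against the remaining byte count matters because the header value comes from the remote peer.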
The Transport Layer Protocol is designed so that incoming and outgoing packets are processed by the selected cryptographic algorithms. However, the binary packets are neither enciphered nor compressed before the end of the first key-exchange process, which performs the initial algorithm selection.
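According to the original specification, the authentication token is computed by the retained MAC algorithm over the clear text packet and is appended, unencrypted, after the enciphered packet. This is usually called the Encrypt-and-Mac order. Because the receiver deciphers the packet before checking the token, the whole packet can be encrypted, size field included.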
When a more recent message authentication algorithm is selected that uses the Encrypt-Then-Mac order, the size field has to remain in clear text so that the message authentication token can be located at the end of the undeciphered packet. This is needed because the underlying network pipe is stream oriented rather than datagram oriented, which means it is not required to preserve packet boundaries. The task of recovering packet boundaries is left to the ssh2 transport layer.
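Recovering packet boundaries in the Encrypt-Then-Mac case can be sketched as follows: the clear-text size field is read from the head of the byte stream, and the authentication token is expected right after the number of enciphered bytes it announces. The function name etm_next_packet, its return convention and the MAX_PACKET_LENGTH bound are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PACKET_LENGTH 35000u  /* assumed sanity bound for this sketch */

/* Hypothetical framing helper for a stream with a clear-text size field.
   Returns the total size of the next packet on the wire (size field,
   enciphered body and MAC), 0 if more bytes are needed, or -1 if the
   size field is implausible. */
long etm_next_packet(const uint8_t *stream, size_t avail, size_t mac_len)
{
  if (avail < 4)
    return 0;

  uint32_t pkt_len = ((uint32_t)stream[0] << 24) | ((uint32_t)stream[1] << 16)
                   | ((uint32_t)stream[2] << 8)  |  (uint32_t)stream[3];

  /* At least a padding_length byte, a message id and the four padding
     bytes that RFC 4253 requires as a minimum. */
  if (pkt_len < 6 || pkt_len > MAX_PACKET_LENGTH)
    return -1;

  size_t total = 4 + (size_t)pkt_len + mac_len;
  if (avail < total)
    return 0;                 /* the MAC has not been fully received yet */

  /* The MAC occupies stream[4 + pkt_len .. total - 1]. */
  return (long)total;
}
```

The caller would verify the token over the undeciphered bytes first, and decipher the body only when the check succeeds.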
When an authenticated cipher is used instead, no separate message authentication algorithm is used. Depending on the design of the authenticated cipher algorithm, the size field may not be enciphered. Note that reliably hiding the actual size of the packets requires additional measures when transmitting the packet stream over a TCP session.
When compression is enabled, the compression step is performed before encryption.