Architecture

Collabora Online is designed to be self-contained and secure out of the box. It has its own built-in web server, which is often run behind a reverse proxy for flexibility and added security, and it loads documents in isolated per-document processes.

Internally, each document is loaded in a separate process, which is isolated on multiple levels.

Conceptually, there are three (3) groups of processes: WSD, ForKit, and Kit.

A high-level sketch looks as follows:

../_images/architecture-sketch.png

WSD

The main process, WSD (Web Service Daemon), is responsible for spawning ForKit, setting up the childroot directory, and listening for incoming connections.

Once ready, the WSD process accepts incoming connections on its public port (9980 by default). This is the primary server socket, which accepts client connections.

ForKit

The ForKit process is responsible for loading the Collabora Office libraries. These dynamic shared objects (DSOs for short) implement the core logic of viewing and editing documents. This is referred to as the Core.

In addition to the core logic, these libraries also implement the API used to communicate with WSD, and therefore the interface with Collabora Online. This layer is called the Kit.

Once the Kit and Core libraries are loaded, ForKit is ready to fork additional processes that can load documents. The ForKit process never loads documents, however. It is only responsible for forking additional processes to load documents, per WSD’s demands.

The name ‘ForKit’, therefore, is a play on both ‘fork it’ and ‘fork Kit’ by contracting the latter into a single word.

Kit

The Kit process is forked from ForKit and is therefore by and large identical to it. There are, however, very important differences. Unlike ForKit, the Kit process is only responsible for loading documents, passing input and commands from users to the document, and passing the events generated by the document back to the client code. It therefore doesn’t fork any processes. Instead, it creates an isolated environment before it even becomes visible to WSD as available to load a document. See Kit Isolation below.

Kit Isolation

Before the Kit process is isolated, it needs to create a shadow file system (see below for details). The shadow file system can be prepared by bind-mounting from a template directory (preferred), by linking (if possible) or, failing that, by copying the files. The template directory contains the files needed to run the Core, as well as various system files (from /etc) for the timezone, hosts, and similar settings.

When bind-mount is enabled (default) and available (some systems and especially containers might not allow it), it is very fast to set up and is done with read-only permissions, which adds yet another layer of security.
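
The link-then-copy fallback described above can be sketched as follows. This is only an illustration under stated assumptions, not Collabora Online's implementation: the function name populate_shadow_tree is hypothetical, and the preferred bind-mount path is left out because it requires mount privileges.

```python
import os
import shutil
from pathlib import Path


def populate_shadow_tree(template: Path, shadow: Path) -> dict:
    """Mirror `template` into `shadow`, hard-linking files where possible.

    Hypothetical helper: returns how many files were linked and how many
    had to be copied because linking failed.
    """
    stats = {"linked": 0, "copied": 0}
    for root, _dirs, files in os.walk(template):
        target_dir = shadow / Path(root).relative_to(template)
        target_dir.mkdir(parents=True, exist_ok=True)
        for name in files:
            src = Path(root) / name
            dst = target_dir / name
            try:
                os.link(src, dst)  # cheap: no file data is duplicated
                stats["linked"] += 1
            except OSError:
                # For example a cross-device link, or a file system
                # without hard-link support: fall back to copying.
                shutil.copy2(src, dst)
                stats["copied"] += 1
    return stats
```

Linking is tried first because it avoids duplicating file data; copying always works but costs time and disk space for every jail.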

After the shadow file system is ready, chroot(2) is called to change the visible root file system for the current Kit process to be that of the shadow file system. No file outside of this root is visible or accessible to the Kit process, isolating it completely from the host system.

Next, the capabilities that were needed to execute the above (specifically, mount and chroot are privileged) are dropped to minimize the available system calls.

Finally, all system calls are disabled, even the unprivileged ones, with the exception of the strictly required ones. For example, the kill(2) system call is disabled, as are accept(2) and listen(2), which are used to create listening sockets.

Architecture

There are three processes: CoolWSD, CoolForKit, and CoolKit.

WSD is the top-level server and is intended to run as a service. It is responsible for spawning ForKit and listening on a public port for client connections.

The ForKit is only responsible for forking Kit instances. There is only one ForKit per WSD instance and there is one Kit instance per document.

WSD listens on a public port and, using internal pipes, requests the ForKit to fire a child (Kit) instance to host documents. The ForKit then has to find an existing Kit that hosts that document, using the public URI as the unique key, and forward the request to this existing Kit, which then loads a new view of the document.

There is a singleton Admin class that gets notified of all the important changes and updates the AdminModel object accordingly. The AdminModel object has subscribers, which correspond to admin panel sessions. A subscriber can subscribe to specific commands to receive live notifications about them and update the UI accordingly.
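
The subscription mechanism can be pictured with a small observer-style sketch. The class below only borrows the AdminModel name from the text; its methods, the subscriber identifiers, and the command names used in the test are hypothetical and are not taken from the actual code base.

```python
from collections import defaultdict


class AdminModel:
    """Toy model: subscribers register for commands and get notified."""

    def __init__(self):
        self._subscriptions = defaultdict(set)  # command -> subscriber ids
        self._outbox = defaultdict(list)        # subscriber id -> messages

    def subscribe(self, subscriber_id: str, command: str) -> None:
        self._subscriptions[command].add(subscriber_id)

    def unsubscribe(self, subscriber_id: str, command: str) -> None:
        self._subscriptions[command].discard(subscriber_id)

    def notify(self, command: str, detail: str) -> None:
        """Queue a message for every session subscribed to `command`."""
        for subscriber_id in sorted(self._subscriptions.get(command, ())):
            self._outbox[subscriber_id].append(f"{command} {detail}")

    def messages_for(self, subscriber_id: str) -> list:
        return list(self._outbox.get(subscriber_id, []))
```

Only sessions that subscribed to a command receive it, so an admin panel that shows just the document list is not flooded with unrelated updates.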

Whether a document is loaded for the first time, or this is a new view on an existing one, the Kit connects via a socket to WSD on an internal port. WSD acts as a bridge between the client and Kit by tunnelling the traffic between the two sockets (the one between the client and WSD and the one between WSD and Kit).

File System

WSD is given the childroot argument through the configuration (child_root_path). This is the root directory of the jailed file system. This path can be anywhere, but here we’ll designate it as:

/childroot

Before spawning a ForKit instance, WSD needs to generate a random JailID to use as the jail directory name. This JailID is then passed to ForKit as the jailid argument.

Note: for security reasons, this directory name is randomly generated and should not be given out to the client. Since there is only one ForKit per WSD instance, there is also one JailID between them.

The ForKit creates a chroot in this directory (the jail directory):

/childroot/jailid/

ForKit copies the LO instdir (essentially installing LO in the chroot), then copies the Kit binary into the jail directory upon startup. Once done, it calls chroot and drops its capabilities.

ForKit then waits on a read pipe to which WSD writes when a new request from a client is received. ForKit is responsible for spawning (or forking) Kit instances. For our purposes, it doesn’t matter whether Kit is spawned or forked.

Every document is hosted by a Kit instance. Each document is stored in a dedicated directory within the jail directory. The document root within the jail is /user/docs. The absolute path on the system (which isn’t accessible to the Kit process as it’s jailed) is:

/childroot/jailid/user/docs

Within this path, each document gets its own sub-directory based on another random identifier, the ChildId (which could be the Process ID of the Kit). This ChildId will be given out to clients to facilitate the insertion and downloading of documents. (Although strictly speaking the client can use the main document URI as key, this is the current design.)

/childroot/jailid/user/docs/childid
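
The directory layout above can be summarised in a short sketch. The helper names and the hexadecimal identifier format are assumptions made for illustration; the text only states that both identifiers are random.

```python
import secrets
from pathlib import Path


def new_random_id(nbytes: int = 8) -> str:
    """Return an unguessable hex identifier (the format is an assumption)."""
    return secrets.token_hex(nbytes)


def jail_root(child_root: Path, jail_id: str) -> Path:
    """Host path of the jail directory, e.g. /childroot/jailid."""
    return child_root / jail_id


def document_dir(child_root: Path, jail_id: str, child_id: str) -> Path:
    """Host path of one document's directory inside the jail."""
    return jail_root(child_root, jail_id) / "user" / "docs" / child_id


def path_inside_jail(child_id: str) -> Path:
    """The same directory as the jailed Kit sees it after chroot."""
    return Path("/user/docs") / child_id
```

The two views of the same directory differ only by the jail prefix, which is why the JailID never needs to leave the server.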

Client Connection

A request from a client to load a document will trigger the following chain of events.

  • WSD public socket will receive the connection request followed by a “load” command. The connection includes the wopiSrc unique URL, which includes the user’s token.

  • An instance of DocumentBroker with the given wopiSrc (without the user token) is searched for. If one exists, the document is already loaded and this is a new view of the existing document. Otherwise, a new DocumentBroker instance is created and registered internally.

  • WSD finds an available Kit process. If none is available, a request is made to ForKit to spawn more.

  • A ClientSession (ToClient) is created and takes ownership of the incoming socket to handle the client traffic.

  • ForKit sends the Kit a request to host the URI via an internal Unix socket.

  • Kit connects to WSD on an internal port.

  • The Kit internally creates Document and ChildSession instances to abstract the document and views on it, respectively.

  • WSD creates another ClientSession (ToPrisoner) to service Kit.

  • ClientSession (ToClient) is linked to the ToPrisoner instance, copies the document into the jail (first load only) and sends (via ToPrisoner) the load request to Kit.

  • Kit loads the document and sets up callbacks with LOKit.

  • ClientSession (ToClient) and ClientSession (ToPrisoner) tunnel the traffic between clients and the Kit both ways.

../_images/client-connection.png
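
The DocumentBroker lookup in the second step can be sketched as a dictionary keyed by the wopiSrc with the token removed. This is a simplified model, not the actual C++ code: the registry class is hypothetical, and it assumes the token travels in query parameters named access_token and access_token_ttl.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of the query parameters that carry the user's token.
TOKEN_PARAMS = {"access_token", "access_token_ttl"}


def doc_key(wopi_src: str) -> str:
    """Strip token parameters so all users of a document share one key."""
    parts = urlsplit(wopi_src)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in TOKEN_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), ""))


class BrokerRegistry:
    """Hypothetical stand-in for the internal DocumentBroker registry."""

    def __init__(self):
        self._brokers = {}

    def find_or_create(self, wopi_src: str):
        """Return (broker, is_new); an existing broker means a new view."""
        key = doc_key(wopi_src)
        broker = self._brokers.get(key)
        is_new = broker is None
        if is_new:
            broker = {"key": key, "views": 0}
            self._brokers[key] = broker
        broker["views"] += 1
        return broker, is_new
```

Two users opening the same document with different tokens resolve to the same key, so the second request becomes a new view rather than a second load.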

Tile Rendering

The document is rendered into raster images on the server (the Kit) and sent to the client in pre-defined dimensions called tiles. The tiles are tracked on the client and displayed.

When there is a modification (by any other user, or the current one) the Kit will send invalidation notifications to all (active) clients. Each client in its turn will send requests to render the tiles that are out of date.

The server tracks the requests from all clients and renders each unique tile request only once. A tile is identified not only by its coordinates and size, but also by the zoom factor. This makes sure that no tile is rendered more than once. The tiles are cached, so subsequent requests for the same unique tile are served from the cache.

Rendering is expensive, and it pays to minimize it where possible.
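
A minimal sketch of such a cache is shown below. The TileKey fields follow the description above (coordinates, size, zoom); the class names, the render callback, and the coarse invalidate-everything policy are assumptions for illustration and do not mirror the real tile cache.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileKey:
    """Everything that makes a tile unique: position, size, and zoom."""
    x: int
    y: int
    width: int
    height: int
    zoom: int


class TileCache:
    def __init__(self, render):
        self._render = render  # callable: TileKey -> bytes (expensive)
        self._tiles = {}
        self.render_count = 0

    def get(self, key: TileKey) -> bytes:
        """Render on the first request only; later requests hit the cache."""
        if key not in self._tiles:
            self._tiles[key] = self._render(key)
            self.render_count += 1
        return self._tiles[key]

    def invalidate(self) -> None:
        """Drop every cached tile after a modification (deliberately coarse)."""
        self._tiles.clear()
```

Because the key is a frozen dataclass, two requests that agree on every field hash to the same entry, while the same area at a different zoom is a separate tile.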

Protocol

The protocol between the client and server uses plain text, with occasional JSON where structured data is needed. It is documented separately.

Payloads, in some cases, need to be in binary. This is the case, for example, for rendered tiles. These tile responses must contain the binary data of the tiles they contain.
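
One way to combine a plain-text command with a binary payload is to put the text on the first line and the raw bytes after the first newline. The framing below is an assumption made purely for illustration; the actual message format is the one given in the separate protocol documentation, and the function names and argument names here are hypothetical.

```python
def encode_message(command: str, args: dict, payload: bytes = b"") -> bytes:
    """Build 'command: k=v k=v' + newline + optional binary payload."""
    header = command + ":" + "".join(f" {k}={v}" for k, v in args.items())
    return header.encode("utf-8") + b"\n" + payload


def decode_message(frame: bytes):
    """Split a frame into (command, args, payload).

    Only the first newline is treated as a separator, so the payload may
    itself contain newline bytes (as image data usually does).
    """
    header, _sep, payload = frame.partition(b"\n")
    command, _colon, rest = header.decode("utf-8").partition(":")
    args = dict(token.split("=", 1) for token in rest.split())
    return command, args, payload
```

Keeping the header textual makes messages easy to log and debug, while the payload is passed through untouched.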

Threading

The threading model is as simple as possible. Specifically, each document is handled on the WSD side with a single thread. Similarly, in the Kit, each document has a primary single thread. In both of these cases, this primary thread is responsible for the socket communication as well as the handling of commands/events.

On the WSD side, the DocumentBroker is the owner of this thread. It is regulated through the poll system call, which, when there is no new data in the sockets to read and no data to write, puts the thread into an efficient wait state until there is new input from any of the sockets (belonging to that particular document), or a timeout. This approach is both simple (minimal thread synchronization concerns) and efficient.

Similarly, in the Kit, the thread that handles document input, commands, and events is the same thread that handles the socket logic. This is done through the runLoop() Kit API, which registers callbacks that the document’s main thread calls in its main loop.
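
The wait-for-input-or-timeout pattern can be illustrated with Python's selectors module, which wraps poll-style system calls. This is a generic sketch of the technique rather than the DocumentBroker code; the function name and the labels attached to sockets are hypothetical.

```python
import selectors
import socket


def poll_once(selector: selectors.BaseSelector, timeout: float) -> list:
    """Block until a registered socket is readable or the timeout expires.

    Returns a list of (label, data) pairs, one per readable socket. An
    empty list means the timeout elapsed with no input.
    """
    received = []
    for key, _events in selector.select(timeout):
        data = key.fileobj.recv(4096)
        received.append((key.data, data))
    return received


def make_selector(labelled_sockets: dict) -> selectors.BaseSelector:
    """Register every socket of one document with a single selector."""
    selector = selectors.DefaultSelector()
    for label, sock in labelled_sockets.items():
        selector.register(sock, selectors.EVENT_READ, data=label)
    return selector
```

A single thread calling poll_once in a loop services all sockets of one document, so no locking is needed between them.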

File Server

CoolWSD acts as a file server, in addition to being the server component that handles document loading, editing, saving, etc. The file server in CoolWSD serves only known files. That is, the files are known in advance: they are enumerated, read from disk, and loaded into an in-memory cache, all during server startup and initialization.

This has a number of benefits, beyond performance. First, only the known directory (browser/dist) is served from. This avoids the risk of exposing any files outside of the known directory. Second, files that should not be served (for any reason) are detected and excluded at start up. In addition, the file server is responsible for serving service-specific files, such as hosting and discovery, as well as favicon.ico, robots.txt, etc.

Once the file server is initialized, it can serve only the files that it has cached in memory. All other requests are ignored, with an appropriate error logged and an HTTP error code returned (such as 403 or 404).

File serving is done exclusively over HTTP and HTTPS. The only supported HTTP verbs are GET and POST. GET is used for file serving, while POST is used exclusively for interacting with documents (for example, converting, downloading, etc.).

The reason for using POST for document interactions is the authentication key, which is used to authenticate the user against the storage server and to authorize access to the document in question. It is safest when transmitted in the body of the request rather than in the address, as GET would do.
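
The preload-then-serve behaviour can be sketched as follows. It is a simplified model under stated assumptions: the class name is hypothetical, and the suffix-based exclusion rule merely stands in for whatever checks the real server applies at startup.

```python
from pathlib import Path


class StaticFileCache:
    """Load a directory into memory once; serve nothing else afterwards."""

    def __init__(self, root: Path, excluded_suffixes=(".map",)):
        self._files = {}
        root = root.resolve()
        for path in sorted(root.rglob("*")):
            # Files that should not be served are skipped at startup.
            if path.is_file() and path.suffix not in excluded_suffixes:
                url_path = "/" + path.relative_to(root).as_posix()
                self._files[url_path] = path.read_bytes()

    def serve(self, url_path: str):
        """Return (status, body); the disk is never touched per request."""
        body = self._files.get(url_path)
        if body is None:
            return 404, b""
        return 200, body
```

Because requests are answered from a dictionary lookup only, a path such as /../etc/passwd cannot escape the served directory: it simply is not a known key.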

Finally, for loading documents and for the connection between the server and client, the HTTP socket is upgraded to WebSocket, which is used for the duration of the session that a user has on a document.

Communication Security

When the host integration client initiates a request to view or edit a document in Collabora Online, it will generate and transmit an authentication token to the WSD. The WSD will send back this authentication token with any WOPI request used to read or save the document. This token will need to be verified by the host (the WOPI Server). The WOPI request is done over HTTPS, and WSD will verify the validity of the TLS certificate of the host when performing a WOPI request. This is the access control mechanism for the publicly facing part of Collabora Online.

Behind this public side, the WSD will establish a client connection between itself and a newly created instance of LOKit. That instance is contained in a chroot jail, where system calls and file system access are limited and where only one document is loaded at a time. The LOKit instance and WSD run on the same machine, and the communication is done over a UNIX domain socket. The legitimacy of the connection is verified using the standard UNIX socket peer credentials mechanism and by matching the uid / gid.

Note that this doesn’t protect against any tampering done by the super-user (root) on the machine running WSD and LOKit.
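
On Linux, peer credentials of a UNIX domain socket can be read with the SO_PEERCRED socket option. The sketch below illustrates that general mechanism and a uid / gid comparison; it is not the WSD code, the function names are hypothetical, and it assumes a Linux system.

```python
import os
import socket
import struct

# Layout of Linux's struct ucred: pid_t pid; uid_t uid; gid_t gid.
_UCRED_FORMAT = "iII"


def peer_credentials(sock: socket.socket):
    """Return (pid, uid, gid) of the peer of a connected UNIX socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(_UCRED_FORMAT))
    return struct.unpack(_UCRED_FORMAT, raw)


def is_trusted_peer(sock: socket.socket) -> bool:
    """Accept the peer only if it runs under our own uid and gid."""
    _pid, uid, gid = peer_credentials(sock)
    return uid == os.geteuid() and gid == os.getegid()
```

The kernel fills in these credentials, so an unprivileged peer cannot forge them; as the note above says, this offers no protection against root on the same machine.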