11 KiB
Embedded Protocol Version 2: Draft 1.1
(Issue 1, Issue 2, Issue 3, Changelog)
Table of Contents
Background
This section is non-normative.
Since the embedded Sass protocol was first released, a number of improvements have been suggested by embedded host authors that would either be backwards-incompatible, or would require cumbersome redundancies in the protocol to make them backwards-compatible.
In the meantime, to provide an efficient Sass CLI along with embedded hosts, the Dart Sass embedded compiler was merged into the main Dart Sass executable. Because this is effectively also a breaking change in how hosts obtain and invoke the compiler, it seemed like a good opportunity to make the other breaking protocol changes that had been piling up.
Summary
This section is non-normative.
This proposal makes three breaking changes to the embedded Sass protocol:
-
Replace the compilation IDs in the protobufs with one directly encoded in the wire protocol so that compiler implementations can efficiently route requests to different workers without reparsing messages.
-
Use optional fields as defined in protocol buffers 3.15.0 instead of treating default field values as absent.
-
Move
CompileSuccess.loaded_urls
toCompileResponse.loaded_urls
so it's available even when compilation fails.
Design Decisions
Length Before Compilation ID
This proposal places the compilation ID for each request after the length. The length is defined as the length of the protocol buffer plus the length of the compilation ID.
Another possible approach would be to have the compilation ID first, then the length, and have the length list only the length of the protocol buffer itself. This approach would have the benefit of being somewhat easier to decode, since the only length-delimited chunk of input would be parsed as a unit by the protocol buffer parser rather than needing to be sliced out of the (compilation ID, protocol buffer) pair.
However, the embedded protocol is intended to be transport-independent, and not all transports will necessarily require an explicit length encoding at all. Although today all use of the embedded protocol is over stdin/stdout which requires a length delimiter, it's likely that we'll eventually add WASM support as well which won't.
We want to make it easy for the same compiler and host codebases to support multiple transports. Making the length-delimiting process a layer that can be transparently applied to the same message blobs that are used in other transport layers makes it easier for the underlying endpoint logic to just work with those blobs regardless of where they came from.
For languages that can parse protocol buffers from a subsequence of binary content without copying, it should be easy to slice off the compilation ID and parse the remaining binary data. For those that can't, the true length of the protocol buffer can be determined by taking the given length and subtracting the length of the compilation ID, which is given by the following table:
compilation ID | length in bytes |
---|---|
[0, 128) | 1 |
[128, 16384) | 2 |
[16384, 2097152) | 3 |
[2097152, 268435456) | 4 |
[268435456, 4294967296) | 5 |
Cross-Compilation State
We have a future goal to (optionally) share state across compilations, to more efficiently compile projects with many small entrypoints where the bulk of the complexity is in static shared libraries. If/when we support this, there could be two broad implementation strategies for a compiler with worker-based parallelism like Dart Sass:
-
Run each compilation in a separate worker and keep shared state in one or more shared workers. This allows for more parallelism between compilations, but requires state (or requests for information about state) to be serialized across worker boundaries, adding a potentially substantial amount of overhead.
-
Run all compilations that share state in a single worker, allowing zero-overhead access to the shared state but requiring those compilations to run serially rather than in parallel.
It's not clear which of these will be more efficient in which circumstances,
although option 1 is certainly substantially more complex to implement. The
protocol as listed here—without an explicit CompilationRequest.id
field—is
only compatible with option 1, assuming that each compilation ID corresponds to
a separate worker as intended.
However, this isn't a fatal flaw. It would be a non-breaking change to add
CompilationRequest.id
(and *.compilation_id
) back later on if we decide to
support option 2. Hosts that were built to target the current version of the
protocol wouldn't set CompilationRequest.id
, which means it would default to
zero, which will work fine since they're already ensuring each
CompilationRequest
has a different wire-level compilation ID.
Outbound Request IDs
Given that each compilation is expected to run single-threaded in and of itself,
there's theoretically no more need for fields like ImportRequest.id
. Each
compilation ID will only have one request at a time, so we could just declare
that any response with a given compilation ID is for the single outstanding
request.
However, the expectation that each compilation be single-threaded isn't a requirement. One could imagine a multithreaded Sass compiler that actually is capable of fielding multiple concurrent requests as it compiles independent chunks of a given stylesheet or resolves loads eagerly. We don't want to cut off this possibility, so we retain the outbound request IDs.
Overview
Packet Structure
Replace the last paragraph of the embedded protocol overview with:
Each message in the embedded protocol is sent as a packet which contains two values: an unsigned varint up to 32 bits long known as the "compilation ID", and a protocol buffer that contains the protobuf message. For streams (like standard input and output) that don't have built-in message boundaries, every packet must begin with another unsigned varint indicating the length in bytes of the remaining message (including the compilation ID). This matches the best practice described in the protocol buffer documentation.
Because JavaScript can't easily represent integers larger than 2^53 - 1, the length may be no more than 2^53 - 1. Because it's so unlikely that this will come up in practice, implementations are not required to verify it.
For a length-delimited stream, each packet has the following structure:
╔══════════╦══════════════════╗
║ varint ║ Length ║
╠══════════╬══════════════════╣
║ varint ║ Compilation ID ║
╠══════════╬══════════════════╣
║ protobuf ║ Protobuf Message ║
╚══════════╩══════════════════╝
RPCs
Type Definitions
Replace the following RPC type definitions:
-
Requests usually include a mandatory
uint32 id
field so that the other endpoint can respond, except forCompileRequest
which uses the compilation ID as its ID. All request message types end inRequest
. -
Responses usually include a mandatory
uint32 id
field whose value must be the same as their associated request'sid
, except forCompileResponse
which uses the compilation ID as its ID. All response message types begin with the corresponding request name and end withResponse
.
ID Requirements
Replace the paragraph that beings "Implementations must guarantee that they use
a unique id
for every request" with:
Each endpoint must guarantee that each request's id
doesn't match the id
of
any other outstanding request with the same compilation ID from that endpoint.
The same id
may be used for an inbound request and an outbound request, and
the same id
may be used for two requests with different compilation IDs. The
host must similarly guarantee that a CompileRequest
's compilation ID doesn't
match the compilation ID of any other outstanding CompileRequest
. The compiler
must ensure that all outbound requests' compilation IDs match that of the
CompileRequest
that triggered its associated compilation.
The compilation ID 0 is reserved for VersionRequest
and VersionResponse
,
since they're not specific to any individual compilation.
The compilation ID and normal request id
4294967295
is reserved for error
handling. (This is the maximum number representable by a uint32
.)
Optional and Mandatory Fields
Replace the paragraphs about optional and mandatory fields with:
If a field is not optional, the the endpoint that sends that message must
guarantee that it's set to a meaningful value, and the endpoint that receives it
must reject the message if it's not set. Because protocol buffers allow all
oneof
fields to be unset, the spec describes those that require values as
"mandatory".
Error Handling
Append to the first paragraph:
The compilation ID must match the compilation ID of the request or response that triggered the error.
Protocol Buffer
ID Fields
Remove the following fields:
CompileRequest.id
CompileResponse.id
LogEvent.compilation_id
CanonicalizeRequest.compilation_id
ImportRequest.compilation_id
FileImportRequest.compilation_id
FunctionCallRequest.compilation_id
Optional Fields
Mark the following fields as optional, and update each one's specification to check if they're unset instead of the default values:
ImportSuccess.source_map_url
LogEvent.span
SourceSpan.end
Loaded URLs
Remove CompileSuccess.loaded_urls
and add CompileResponse.loaded_urls
with
the same specification.