Skip to main content

Compact reference

Overview

Compact is a strongly typed, statically typed, bounded smart contract language, designed to be used in combination with TypeScript for writing smart contracts for the three-part structure of Midnight, where contracts have the following components:

  • a replicated component on a public ledger,
  • a zero-knowledge circuit component that confidentially proofs the replicated component resulted from a valid execution according to the logic of the contract, and
  • a local, off-chain component that can perform arbitrary code.

Each Compact program (also referred to as a contract) can contain several kinds of program elements:

  • module and import forms for management of namespaces and separate files,
  • declarations of program-defined types,
  • declarations of the data that the contract stores in the public ledger,
  • declarations of witnesses, which are callback functions supplied by the TypeScript runner,
  • definitions of circuits, which are functions serving as the operational core of a smart contract, and
  • the definition of at most one constructor, which is a function called when a contract is created and can be used to initialize public and private state.

Compact is similar to TypeScript: it has a syntax similar to that of JavaScript, and it layers a type system over the JavaScript syntax. Compact deviates intentionally from TypeScript, however, in several important ways:

  • Unlike TypeScript, Compact is strongly typed. Programs cannot bypass the static type system via missing type declarations or unsafe casts. Furthermore, the JavaScript code produced by the Compact compiler includes run-time checks enforcing the static types of values that come from outside Compact as well as preventing external application of a Compact circuit with more or fewer than the declared number of arguments.
  • Compact provides namespace management via static rather than dynamic modules, and these modules can be parameterized via compile-time generic parameters, which include size as well as type parameters.
  • Because every Compact program must compile into a set of finite proving circuits, the language is bounded: all Compact types have sizes that are fixed at compile time, loops are bounded either by constant bounds or by the size of an object of constant size, and recursion is disallowed.
  • Compact numeric values are limited to unsigned integers either with a program-declared range or with a range determined by the field size of the target proving system.
  • Compact distinguishes certain values as potentially containing private data that should typically be protected, and it requires explicit declaration of the disclosure of any potential private data via disclose() wrappers. The basics of this are discussed in the description of disclose, and a more thorough description and discussion appears in the separate document Explicit disclosure in Compact.

Like TypeScript, Compact compiles into JavaScript, but it also produces a TypeScript definition file so it effectively also compiles into TypeScript. It produces separate TypeScript definition files and JavaScript implementation files rather than simply producing TypeScript for three reasons:

  • to allow compiled Compact programs to be used without requiring an additional TypeScript compilation step,
  • to permit the generated code to check function argument counts without disabling compile-time argument-type checks when called from TypeScript, and
  • so that it can generate a sourcemap file that properly maps elements of the generated JavaScript code (e.g., variable bindings and references) to the corresponding elements of the source Compact code.

For each circuit that touches the public ledger and hence requires a proof for on-chain execution, the Compact compiler also produces proving circuits in a zero-knowledge intermediate language (zkir), and it uses a zkir compiler to produce proving keys for each such circuit.

Finally, the compact compiler also produces a JSON-formatted contract info file that contains information about the program and its compiled representation, including version numbers and the types and characteristics of the contract's exported circuits.

This document explains each syntactic category individually. It starts by introducing the building blocks that are used in various contexts: identifiers, constants, types, generic parameters, and patterns. Then it describes the structure of Compact programs, each kind of program element, and the statements and expressions that can appear within circuit and constructor bodies. Finally, it discusses the TypeScript target.

Writing a contract provides a small example of what a Compact program looks like. It also introduces the basic building blocks of a Compact contract. The full grammar of Compact is provided separately.

Notation

The syntax of Compact programs is given by EBNF grammar snippets that use the following notational conventions:

  • Keywords and punctuation are in monospaced font.
  • Terminal and nonterminal names are in emphasized font.
  • Alternation is indicated by a vertical bar (|).
  • Optional items are indicated by the superscript opt.
  • Repetition is specified by ellipses. The notation XX, where X is a grammar symbol, represents zero or more occurrences of X. The notation X ,, X, where X is a grammar symbol and , is a literal comma, represents zero or more occurrences of X separated by commas. In either case, when the ellipsis is marked with the superscript 1, i.e., ⋯¹, the notation represents a sequence containing at least one X. When such a sequence is followed by ,opt, an optional trailing comma is allowed, but only if there is at least one X. For example, idid represents zero or more ids, and expr , ⋯¹ , expr ,opt represents one or more comma-separated exprs possibly followed by an extra comma. The rules involving commas apply equally to semicolons, i.e., apply when , is replaced by ;.

Every program is formed of characters that are organized into atomic sequences of characters known as tokens. Each keyword and punctuation symbol appearing in the grammar snippets represents itself exactly, i.e., represents the token consisting of the same sequence of characters. For example, when the keyword circuit appears in a grammar snippet, it matches only the token circuit, and when the punctuation symbol : appears, it matches only the token :.

Each terminal name appearing in the grammar snippets represents a set of possible tokens. For example, the terminal name id represents the set of all identifiers: when id appears in a grammar snippet, it matches any identifier. The sets of tokens represented by the terminal names appearing in the grammar snippets are described in Terminal names below.

Each nonterminal name appearing in the grammar snippets represents a sequence of tokens that comprise some structured piece of a program. For example, the terminal name expr matches any sequence of tokens that can be interpreted as an expression, such as 3 + x or a ? b : c. The set of structures represented by each nonterminal name is given in the various sections of this reference manual along with typing and evaluation rules where appropriate. For example, the structure of a circuit-definition is described in Circuits definitions.

Terminal names

The following terminal names appear in the grammar snippets.

  • id, module-name, function-name, struct-name, enum-name, contract-name, tvar-name, and type-name all represent identifier tokens.

  • nat represents natural-number literals.

  • str and file represent string literals.

  • version represents version literals (in pragmas).

While identifiers and string literals are each represented by more than one name, each represents the entire set of possible identifier or string-literal tokens. The grammar snippets use different terminal names only to suggest their use, e.g, module-name for module names versus tvar-name for type variable names.

Static and dynamic errors

The compiler detects various kinds of static errors, e.g., malformed syntax, references to undefined identifiers, and type mismatches. When it detects one or more static errors, it prints descriptive error messages for the errors and terminates without generating any output.

The code the compiler generates and the run-time libraries it uses detect various kinds of dynamic errors, e.g., attempts from code outside of Compact to call Compact circuits with wrong numbers or types of arguments or attempts to cast an unsigned value to an unsigned type that is too small for the value. These errors are reported when the generated code is run and cause evaluation of the current circuit to halt.

Identifiers, bindings, and scope

Identifiers are used in Compact, as in most other programming languages, to name things. Syntactically, an identifier is a token (atomic sequence of characters), beginning with with an alphabetic character, a dollar sign ($), or an underscore (_) followed by one or more alphabetic characters, digits (0 - 9), dollar signs, or underscores.

Some identifiers are reserved words. Of these, some are used as keywords in the syntax of the Compact language, e.g., module, import, circuit, and for. Others, specifically keywords reserved by JavaScript and TypeScript, are considered reserved for future use in Compact, e.g., self and class. Still others, specifically every identifier that begins with __compact, are reserved for use by the compiler. A comprehensive list of keywords and reserved words is given in keywords and reserved words.

The remaining identifiers can be used to name specific instances of various kinds of entities, including modules, types, generic parameters, ledger fields, function (circuit or witness) names, function parameters, and local variables. An identifier associated with, i.e., bound to, one of these entities can be referenced anywhere within the scope of the binding. Compact is lexically scoped, so the scope of each binding is limited to a specific region of the program text. The binding might be shadowed (hidden from view) in some region of program text within its scope that contains a binding for the same identifier.

It is a static error for an identifier to be bound more than once in a same scope, except that function overloading allows more than one function with the same name to be visible in the same scope with different signatures, i.e., different numbers or kinds of generic parameters and/or different numbers or types of run-time parameters.

The scope of each binding depends upon where it appears, as described below. (The caveat "except where shadowed" is not explicitly stated but applies in each case.)

  • Identifiers bound at the outermost level of a contract (referred to as the top level) are visible throughout the contract, but not within any modules that are imported from separate files.
  • Identifiers bound at the outermost level of a module are visible throughout the module. They are not visible outside of the module unless exported: any exported binding is also visible if and where it is imported from the module.
  • The generic parameters of a module, structure declaration, or function declaration are visible throughout the declaration.
  • The run-time parameters of a circuit or constructor are visible within its body.
  • Identifiers defined by a const binding inside a block are visible throughout the block.
  • Identifiers defined by a const binding inside a for-loop header are visible throughout the for loop.

Every reference to an identifier must appear within the scope of a binding for the identifier, in which case we say that the identifier is bound to the entity associated with the identifier by that binding. Otherwise, the reference is a static error.

For example:

circuit c(): Field {
const answer = 42;
{
const answer = 12;
assert(answer != 42, "shadowing did not work!");
}
return answer; // returns 42 (the outer 'answer')
}

The identifier c is bound to the circuit named c, and this binding is visible throughout the contract, though no references to c appear in the example. The first (outer) binding of the identifier answer to the value 42 is visible throughout the body of c except where shadowed by the second (inner) binding of answer within in the inner block, so the reference to answer in return answer evaluates to 42. The second (inner) binding of answer to 42 is visible throughout the inner block, so the reference to answer in answer != 42 evaluates to 12.

In addition to having a scope, every binding also has a lifetime. For circuit and witness bindings, the lifetime is effectively permanent, i.e., the binding is always available for use whenever the program is run.

The lifetimes of ledger-field bindings begin when they are first initialized and are effectively permanent from that point on; although the value of a field can change over time, the association of the ledger-field name with the ledger field's location in the (replicated) public state of a contract never changes.

On the other hand, bindings for module names, type names, and generic parameters are live only when a program is compiled, i.e., they help determine the structure of the program and the shape of the data used by the program but are not needed once the program has been compiled. (TypeScript bindings for type names exported from the program's top level do live on, however, in the generated TypeScript definition file.)

Variable bindings, i.e., bindings of circuit parameters, constructor parameters, and local variables bound by const statements and for loops, have dynamic lifetimes. The bindings of a circuit's or constructor's parameters start a new lifetime when the circuit or constructor is called that ends when the circuit or constructor exits. A variable binding established by a const statement starts a new lifetime when the const statement is evaluated that ends when the block containing the const statement exits. A const binding established by the const subform of a for-loop header starts a new lifetime on each iteration of the loop that ends when that iteration ends.

Variable bindings can have multiple lifetimes, because a circuit might be called multiple times, a block might be evaluated multiple times, and a for loop might be evaluated multiple times and/or have multiple iterations. Variables in Compact are immutable, however: they have the same value over the entire lifetime of the variable's binding. Thus, they are referred to as variables not because their values can vary over any single lifetime but because they can have different values in different lifetimes.

Generic parameters and arguments

Various entities, specifically module declarations, structure declarations, type-alias declarations, circuit definitions, and witness declarations, can have generic parameters, i.e., compile-time type and numeric (natural-number) parameters whose values are given at the use site rather than fixed at the point of declaration. This allows the same generic code to be used with different specific types, bounds, and sizes. Except where shadowed, generic parameters are visible throughout the entire entity. In particular, the generic parameters of a module are visible within the program elements that appear within the body of the module.

When present, generic parameters are enclosed in angle brackets following the name of the generic entity (module, structure, type alias, circuit, or witness). Each parameter is either a type name (e.g., T) or a hash-prefixed natural-number name (e.g., #N). Generic natural-number parameters are prefixed by # to distinguish them from generic type parameters.

gparams< generic-param ,, generic-param ,opt >
generic-param# tvar-name
|tvar-name

Generic entities must be specialized at the point of use to produce non-generic entities at compile time by supplying them with generic arguments. Any attempt to use a generic entity without specializing it is a static error. Generic arguments are also enclosed in angle brackets. Each generic argument must be a type, a natural number literal, or the type or numeric value of a generic parameter.

gargs< garg ,, garg ,opt >
gargnat
|type

The syntax of types allows for type references, including references to generic parameters, so any generic argument can pass along the value of a generic type or natural-number parameter that is visible at the point of specialization.

The # used to distinguish generic natural-number parameters from generic type parameters need not and must not be used at the point of specialization. It is a static error, however, if a generic argument supplied for a generic parameter is not numeric when a numeric value is expected or is not a type when a type is expected.

The example below demonstrates the use of two levels of generic parameterization, one at the module level and one at the circuit level.

module M<T, #N> {
export circuit foo<A>(x: T, v: Vector<N, A>): Vector<N, [A, T]> {
return map((y) => [y, x], v);
}
}
import M<Boolean, 3>;
export circuit bar1(): Vector<3, [Uint<8>, Boolean]> {
return foo<Uint<8>>(true, [101, 103, 107]);
}
export circuit bar2(): Vector<3, [Boolean, Boolean]> {
return foo<Boolean>(false, [true, false, true]);
}

The body of circuit foo is generic with respect to the module's type parameters T and N as well as to the circuit's own parameter A. The module is specialized at the point of import, while the circuit is specialized at the point of call (in both bar1 and bar2).

Compact types

Compact is statically typed: every expression in a Compact program must have a static type. For named circuits and witnesses, the parameters and return types must be explicitly declared. For anonymous circuit expressions, the parameters and return types do not need to be declared but can be. The types of const bindings can also be declared or not.

The language is strongly typed: the compiler rejects programs that do not type check. For example, it rejects programs in which a circuit or witness with a parameter type annotation is called with an incorrectly typed argument for that parameter, and it rejects programs where a circuit with a return-type annotation returns an incorrectly typed value. If an optional type annotation is omitted, the compiler attempts to infer a type and it rejects the program if no such type can be inferred.

Types consist of built-in primitive types, ledger-state types, program-defined types, and references to generic type parameters in scope. When the term "type" occurs in this document without any other qualifier, it means either a primitive type, ledger-state type, a program-defined type, or a generic type parameter in scope. The use of ledger-state types is, at present, limited to typing the result of default<T> to obtain the default value of type T, and only constant bindings can have a ledger-state type.

A generic type is not a valid type and so cannot, for example, be used as the type of a parameter or return value. Any attempt to do so is a static error. As with any other generic entity, it must be specialized at the point of use.

Primitive types

The following are the primitive types of Compact:

typetref
|Boolean
|Field
|Uint < tsize >
|Uint < tsize .. tsize >
|Bytes < tsize >
|Opaque < str >
|Vector < tsize , type >
|[ type ,, type ,opt ]
trefid gargsopt
tsizenat
|id
  • Boolean is the type of Boolean values. There are only two values of Boolean type. They are the values of the expressions true and false.

  • Uint<m..n>, where m is the literal 0 or generic natural-number parameter bound to 0, and where n is a non-zero natural number literal or a generic natural-number parameter bound to a non-zero natural number, is the type of bounded unsigned integer values between 0 (inclusive) and n (exclusive). (While the lower bound is currently required to be 0, this restriction might be lifted at some point.) Uint types with different upper bounds are different types, although the one with the lower bound is a subtype of the other. The compiler and run-time system might impose a limit on the range of supported unsigned integer values. If so, it is a static error whenever a Uint type includes values that exceed this limit. The current limit, if any, is given in Implementation-specific limits.

  • Uint<n>, where n is a non-zero natural number literal or generic natural-number parameter bound to a non-zero natural number, is the type of sized unsigned integer values with binary representations using up to n bits. This is the same type as Uint<0..m> where m is equal to 2n. Sized integer types can be seen as a convenience for programmers. Uint<32>, for example, can be more obvious and less error-prone than the equivalent Uint<0..4294967296>. Any Compact program that uses sized integer types can be rewritten to one that uses only bounded integer types, but the converse is not true.

  • Field represents the set of unsigned integers with values up to the order of the native prime field of the ZK proving system. The current maximum field value is given in Implementation-specific limits.

  • [T1, ⋯, Tn], where T1, ⋯, Tn are zero or more comma-separated types, is the type of tuple values with element types T1, ⋯, Tn. Tuples are heterogeneous: any element type can differ from any of the others. The length of a tuple type is the number n of element types. Two tuple types with different lengths are different types. Two tuple types where any element type of one differs from the corresponding element type of the other are also different types, though one of the tuple types might be a subtype of the other.

  • Vector<n, T>, where n is a natural number literal or generic natural-number parameter and T is a type, is a shorthand notation for the tuple type [T, ⋯, T] with n occurrences of the type T. Note that a vector type and the corresponding tuple type are two different ways of writing exactly the same type. Unless otherwise specified, type rules for vector types are derived from the rules for the corresponding tuple type.

  • Bytes<n>, where n is a natural number literal or a generic natural-number parameter, is the type of byte vectors of length n. Bytes types with different lengths are different types. Bytes types are used in the Compact standard library for hashing. String literals in Compact also have a Bytes type, where n is the number of bytes in the UTF-8 encoding of the string.

  • Opaque<s>, where s is a string literal, is the type of opaque values with tag s. Opaque types with different tags are different types. Opaque values can be manipulated in witnesses but they are opaque to circuits. They are represented in circuits as their hash. The only tags currently allowed are "string" and "Uint8Array".

Program-defined types

Programs can define three kinds of new types: structures, enumerations, and contracts. They can also define structural and nominal aliases for existing types.

Structure types

Structure types are defined via struct declarations with the following form:

struct-declarationexportopt struct struct-name gparamsopt { typed-id ;; typed-id ;opt } ;opt
|exportopt struct struct-name gparamsopt { typed-id ,, typed-id ,opt } ;opt
typed-idid : type

A structure declaration has a sequence of named fields which must be separated either by commas or by semicolons. Comma and semicolon separators cannot be mixed within a single structure declaration. A trailing separator is allowed, but not required.

Each structure field must have a type annotation. Here are a couple of examples:

struct Thing {
triple: Vector<3, Field>,
flag: Boolean,
}

struct NumberAnd<T> {
num: Uint<32>;
item: T
}

The first declaration introduces a structure type named Thing with two fields: triple (a vector with three Field elements) and flag (a Boolean). The second introduces a generic structure type named NumberAnd with generic parameter T and two fields: num (a 32-bit unsigned integer) and item (a value of type T).

Generic structure types are not fixed types and must eventually be specialized by supplying generic arguments, e.g., NumberAnd<Uint<8>>. When any generic structure type is specialized, it must be fully specialized: the number of supplied generic arguments must match the number of declared generic parameters. The effect of specializing a generic structure type is to produce the same type as one in which the generic parameters are replaced by the generic argument values. For example, NumberAnd<Uint<8>> is equivalent to NumberAnd if NumberAnd had been defined by:

struct NumberAnd {
num: Uint<32>;
item: Uint<8>
}

It is possible and common for a generic structure type to be specialized via different generic arguments to produce different specialized structure types in different parts of a program.

Structure typing is always nominal: two types are equivalent only if they have the same names and same fields. They are distinct if they have different names even if they have the same fields. More precisely: each structure type is the same as any other structure type that has the same name, same element names (in the same order), and same element types (in the same order). It is distinct from every other type.

This means, for example, that the following program is well-typed:

module M {
struct NumberAnd {
num: Uint<32>;
item: Uint<8>
}
export circuit bar(x: NumberAnd): NumberAnd {
return x;
}
}
import M;
struct NumberAnd<T> {
num: Uint<32>;
item: T
}
export circuit foo(x: NumberAnd<Uint<8>>): NumberAnd<Uint<8>> {
return bar(x);
}

Structure types must not be recursive, i.e., they cannot contain elements of the same type as the structure, either directly or indirectly. An attempt to define a recursive structure type is a static error. For example, it is a static error to use the following pair of declarations:

struct Even {
predecessor: Odd
}

struct Odd {
predecessor: Even
}

export circuit doesntWork(s: Even): Odd {
return s.predecessor;
}

Values of structure types are created with structure-creation expressions and accessed via structure-field-access expressions.

Enumeration types

Enumeration types are defined via enum declarations with the following form:

enum-declarationexportopt enum enum-name { id , ⋯¹ , id ,opt } ;opt

An enumeration declaration has a non-empty sequence of named elements separated by commas. A trailing comma is allowed but not required.

An enumeration declaration introduces a named enumeration type, such as Arrow in the example below:

enum Arrow { up, down, left, right };

Within the scope of this declaration, a value of type Arrow can have one of four values, selected via Arrow.up, Arrow.down, Arrow.left, and Arrow.right.

Two enumeration types are the same if they have the same name and the same element names (in the same order) and distinct otherwise.

Contract types

Compact 1.0 does not fully implement declarations of contracts and the cross-contract calls they support, but the keyword contract used to declare contracts is reserved for this use.

Type aliases

Type aliases can be created via type declarations of the form:

type-alias-declarationexportopt newopt type type-name gparamsopt = type ;

Within the scope of a type-alias declaration of type-name for type, type-name is itself a type. Type aliases are either structural or nominal, depending on whether the optional new modifier is present:

  • A type alias type-name for type declared without the optional new modifier is a structural type alias, i.e., type-name is the same type and is fully interchangeable with type.
  • A type alias type-name for type declared with the optional new modifier is a nominal type alias, i.e., type-name is a distinct type compatible with type but neither a subtype of nor a supertype of type (or any other type).

Any nominal type alias type-name for some type type is compatible with type in the following senses:

  • values of type type-name have the same representation as values of type type
  • values of type type-name can be used by primitive operations that require a value of type type
  • values of type type-name can be cast explicitly to type, and
  • values of type type can be cast explicitly to type type-name.

For example, within the scope of

new type V3U16 = Vector<3, Uint<16>>

a value of type V3U16 can be referenced or sliced just like a vector of type Vector<3, Uint<16>>, but it cannot, for example, be passed to a function that expects a value of type Vector<3, Uint<16>> without an explicit cast.

When one operand of an arithmetic operation (e.g., +) receives a value of some nominal type alias type-name, the other operand must also be of type type-name, and the result of performing the operation is cast to type type-name. This might cause a dynamic error if the result cannot be represented by type type-name.

Values of any nominal type alias type-name cannot be compared directly using, e.g., <, or ==, with values of any other type, including with values of type type. Such comparisons require one of the operands to be cast to the type of the other.

Both structural and nominal type aliases can take generic parameters, e.g.:

type V3<T> = Vector<3, T>;

and

new type VField<#N> = Vector<N, Field>;

When a generic nominal type is specialized, the specialized type is a nominal type.

Subtyping and least upper bounds

Some Compact types are related to other types via subtyping. Informally, if a type T is a subtype of a type S (equivalently, S is a supertype of type T), then every value of type T is also a value of type S, i.e., any value of type T can be used where a value of type S is expected without the need for an explicit cast. For example, a circuit or witness can be called with argument expressions whose types are subtypes of the corresponding parameter type annotations, and a const binding statement with a type annotation can be given a value with an expression whose type is a subtype of the type annotation.

Subtyping is exclusively defined by the following rules:

  • Any type T is a subtype of itself (subtyping is reflexive)
  • Uint<0..n> is a subtype of Uint<0..m>` if n is less than (or equal to) m
  • Uint<0..n> is a subtype of Field if n-1 is less than or equal to the maximum field value. (Whether Field is a subtype of Uint<0..n> if n-1 is greater than the maximum field value is currently unspecified.)
  • The tuple type [T1, ⋯, Tn] is a subtype of the tuple type [S1, ⋯, Sn] if they have the same length and each type Ti is a subtype of the corresponding type Si.

The least upper bound (with respect to subtyping) of a non-empty set of types {T1, ⋯, Tn} is a type S such that:

  • S is an upper bound: Ti is a subtype of S for all i in the range 1..n, and
  • S is the least upper bound: for all upper bounds R of the set of types {T1, ⋯, Tn}, S is a subtype of R.

Note that least upper bounds do not exist for all sets of types, because some types (such as Boolean and Field) are not related.

Tuple and vector types: Every vector type is equivalent to some tuple type. Specifically, as noted in the earlier section on Primitive types, the vector type Vector<n, T> is equivalent to the tuple type [T, ⋯, T] with n occurrences of T.

Thus the above subtyping rule for tuple types applies to vector types as well: every vector type is a subtype of the equivalent tuple type and possibly of some other tuple and vector types. In general, a vector type Vector<n, T> is a subtype of a tuple type [S1, ⋯, Sn] if T is a subtype of each of the types S1, ⋯, Sn. This means, for instance, that a vector can often be passed to a circuit where a tuple is expected.

On the other hand, tuple types do not always have equivalent vector types. For example, neither [Boolean, Field] nor [Uint<8>, Uint<16>] is equivalent to any vector type. We say, however, that a tuple type [T1, ⋯, Tn] with possibly distinct types T1, ⋯, Tn "has a vector type" if the least upper bound S of the set of types {T1, ⋯, Tn} exists. In that case, the tuple type has the vector type Vector<n, S>. Some operations over tuples (such as mapping and folding) require the tuple type to have a vector type.

When a tuple type has a vector type, the tuple type is a subtype of the vector type, but it might not be the same as the vector type. For example, [Uint<16>, Uint<16>] has the vector type Vector<2, Uint<16>>, and the two types are the same, whereas [Uint<8>, Uint<16>] also has the vector type Vector<2, Uint<16>>, but the types are not the same.

Patterns and destructuring

The parameters of a circuit or constructor and the target of a const binding are specified via patterns:

patternid
|[ patternopt ,, patternopt ,opt ]
|{ pattern-struct-elt ,, pattern-struct-elt ,opt }
pattern-struct-eltid
|id : pattern

In its simplest form, a pattern is just an identifier. For example, in the code below, the parameter of sumTuple is the identifier x and the targets of the two const bindings are the identifiers a and b.

circuit sumTuple(x: [Field, Field]): Field {
const a = x[0], b = x[1];
return a + b;
}

When the parameter type is a tuple, vector, or struct, it is often convenient to use one of the destructuring forms of patterns to name individual pieces of the tuple or struct at the point of binding rather than extracting them at each point of use. For example, one could replace the above with:

circuit sumTuple(x: [Field, Field]): Field {
const [a, b] = x;
return a + b;
}

or more simply with:

circuit sumTuple([a, b]: [Field, Field]): Field {
return a + b;
}

Here is a similar example that destructures a struct instead of a tuple:

struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x, y}: S): Uint<64> {
return x + y;
}

Whereas the elements of a tuple pattern are necessarily given in order, the elements of a struct pattern need not be consistent with the order of the fields in the declaration. For example, the definition of sumStruct below is equivalent to the one above, even though the order of the pattern elements has been swapped:

struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({y, x}: S): Uint<64> {
return x + y;
}

By default, the names bound by the pattern are the same as the names of the structure elements. When this is not convenient, it is possible to choose different names for the structure elements:

struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x: a, y}: S): Uint<64> {
return a + y;
}

While x: a looks like an identifier with a type annotation, in this context it simply indicates that a rather than x is bound to the value in the x field.

Patterns can be arbitrarily nested, e.g.:

struct S { x: Uint<16>, y: Uint<32> }
circuit sumTupleStruct([{x: a1, y: b1}, {x: a2, y: b2}]: [S, S]): Uint<64> {
return a1 + b1 + a2 + b2;
}

It is permissible and sometimes useful to not name certain parts of the tuple or struct.

struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1}, , {y: b3}]: [S, S, S]): Uint<64> {
return b1 + b3;
}

Here the input is a tuple with three elements, but the pattern skips the second by putting two commas between the first and third. Similarly, while each element of the tuple is a struct with both x and y fields, the pattern ignores the x fields simply by failing to mention them.

It is a static error if a pattern implies a different shape from the declared or inferred type of value to be destructured. For example:

struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x, y}: [Uint<16>, Uint<32>]): Uint<64> {
return x + y;
}

fails because it tries to treat a tuple as a struct, while:

struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1}, , , {y: b3}]: [S, S, S]): Uint<64> {
return b1 + b3;
}

fails because it implies that the input tuple has four elements (including two skipped elements) when it actually has only three, and:

struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1}, , {z: b3}]: [S, S, S]): Uint<64> {
return b1 + b3;
}

fails because it tries to name a nonexistent z field in one of the structs.

Trailing commas in a pattern imply nothing about the structure of the input and are ignored:

struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1,}, , {y: b3,},]: [S, S, S]): Uint<64> {
return b1 + b3;
}

Programs

A compact program is a sequence of zero or more program elements.

Briefly:

The order of program elements in a program or module is unimportant, except that any module must be defined before any import of the module, and any program-defined types used as generic parameters by an import form must be defined before the import form.

Detailed descriptions of struct, enum, contract, and type-alias declarations appear in Compact types above. Detailed descriptions of the remaining program elements are described in the following section.

Pragmas

A pragma takes the following form and declares a constraint on either the compiler version (id = compiler_version) or the language version (id = language_version)

version is a dot-separated pair or trio of natural numbers, so along with nat this allows version to be either a single natural number, a pair of natural numbers separated by a ., or a trio of natural numbers separate by a ., e.g., 1, 1.2, or 1.2.7.

For example, if a program includes the following pragma form:

pragma compiler_version >= 1.0.0 && !1.0.5;

All versions of the compiler from 1.0.0 up except for 1.0.5 will accept the pragma and compile the program, and all other versions will print an appropriate message and discontinue compilation of the program.

Modules, exports, and imports

Modules in Compact are used for namespace management and also possibly to split programs into multiple files. A module is a named collection of program elements created via a module definition, which takes the following form:

module-definitionexportopt module module-name gparamsopt { program-elementprogram-element }

A module definition makes a binding from module-name to the module visible in the program or module containing the module definition. Any bindings established by program elements within the module are not made visible, at least not until the module is imported.

A module can have generic parameters, in which case it is a generic module and must be specialized with generic arguments at the point of import.

Exports

By default, identifier bindings created by the program elements within the body of a module are visible only within the module, i.e., they are not exported from the module. Any identifier defined at or imported into the top level of a module can be exported from the module in one of two ways: (1) by prefixing the definition with the export keyword, or by listing the identifier in a separate export declaration:

export-formexport { id ,, id ,opt } ;opt

For example, the following module exports G and S but not F.

module M {
export { G };
export struct S { x: Uint<16>, y: Boolean }
circuit F(s: S): Boolean {
return s.y;
}
circuit G(s: S): Uint<16> {
return F(s) ? s.x : 0;
}
}

Exporting a binding from a module has no effect unless the module is imported.

Imports

A module can be imported into another module or into the program top level, making some or all of its exported bindings visible there, potentially with a prefix.

import-formimport import-selectionopt import-name gargsopt import-prefixopt ;
import-selection{ import-element ,, import-element ,opt } from
import-elementid
|id as id
import-nameid
|file
import-prefixprefix id

For example:

module Runner {
export circuit start(): [] {}
export circuit stop(): [] {}
}
module UseRunner1 {
import Runner;
// start and stop are now in scope
}
module UseRunner2 {
import { start } from Runner;
// start is now in scope, but not stop
}
module UseRunner3 {
import Runner prefix Runner$;
// Runner$start and Runner$stop are now in scope, but not stop or run
}

and

module Identity<T> {
export { id }
circuit id(x: T): T {
return x;
}
}
import Identity<Field>;
// id is now in scope, specialized to type Field

When import-name is an identifier and an import for import-name appears before any visible definition of import-name, the module is assumed to reside in the filesystem, and it is imported directly from there. If import-name is an identifier module-name, a definition for module module-name must be contained within the file module-name.compact in the same directory as the importing file or in one of the directories in the Compact path. If import-name is a string "{prefix/}module-name" where {prefix/} is either empty or is a pathname ending in a directory separator, a definition for a module named module-name must be contained within a file module-name.compact that is either:

  • (a) if {prefix/}module-name.compact is an absolute pathname, then exactly at {prefix/}module-name.compact, otherwise
  • (b) at {prefix/}module-name.compact relative to the directory of the importing file or to one of the directories in the Compact path. Details on the search order and the mechanism for setting the Compact path are given in Compiler Usage.

In any of these cases, it is a static error if module-name.compact is not found, if it does not contain a definition for a module named module-name, or if it contains anything else other than comments and whitespace.

Several examples follow.

Example 1: The file M.compact below contains a single module definition:

module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}

Then, test1.compact import M from M.compact:

import M;
export { F };

whereas test2.compact uses its own definition of M:

module M {
export { G };
export struct S { x: Uint<16>, y: Boolean }
circuit G(x: S): Boolean {
return x.y;
}
}
import M;
export { G };

Importing by a pathname allows multiple modules with the same name to be imported into the same scope. For example:

The file M.compact below contains a single module definition, as before:

module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}

and A/M.compact contains a different module definition:

module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}

Then the program test.compact can define M and import all three of M, "M", and "A/M":

module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}
import M prefix M1$;
import "M" prefix M2$;
import "A/M" prefix M3$;

export { M1$F, M2$F, M3$F };

The compact standard library

Compact's standard library can be imported by import CompactStandardLibrary. The standard library defines a number of useful types and circuits along with ledger-state types such as Counter, Map, and MerkleTree.

Top-level exports

Certain kinds of program elements can be exported from a contract's top level, namely circuits, program-defined types, and ledger fields. Exporting them makes them visible outside of the contract, i.e., to the TypeScript driver for the smart contract.

The circuits exported at the top level of a contract (i.e., not merely exported from a module) are the entry points of the contract. (A Compact program has no "main" entry point, but is more similar to a library containing multiple entry points that share a common store.) Although multiple circuits with the same name are allowed generally to support function overloading, it is a static error if more than one circuit with the same name is exported from the top level. It is also a static error for a generic circuit, i.e., one with generic parameters, to be exported from the top level.

Program-defined types exported from the top level of the main file can be used to describe the argument and return types of witnesses and exported circuits; these may accept generic arguments, but generic arguments marked as sizes rather than types are dropped in the exported type. For example:

export struct S<#n, T> { v: Vector<n, T>; curidx: Uint<0..n> }

is exported for use as a TypeScript type with the T parameter but not the n parameter, i.e.,:

export type S<T> = { v: T[]; curidx: bigint }

Ledger field names exported from the top level are visible for direct inspection by code outside of the contract via the generated TypeScript ledger() function.

It is a static error to export any other kind of binding from the top level.

Include files

Compact allows programs and modules to be split into multiple files and spliced together via include forms, which have the following syntax, where file is a string literal specifying a filesystem pathname for file to be included:

include-forminclude file ;

file can be an absolute pathname, one that is relative to the directory of the including file, or one that is relative to one of the directories in the Compact path. Details on the search order and the mechanism for setting the Compact path are given in Compiler Usage.

It is a static error if the file is not present or cannot be read. If present and readable, the file must contain a sequence of syntactically valid program elements, and these elements are treated as if they had been present in the including file in place of the include form.

Declaring witnesses for private state management

A user's private state should be maintained in some secure way by the TypeScript driver of a smart contract and never stored directly in the public state of the contract. A contract must sometimes prove something about some piece of private state, however, as well as cause an update to the private state.

The TypeScript driver of the smart contract can provide pieces of private state to the contract via the arguments of some exported circuit, and it can update the private state based on the return values of the exported circuit.

A circuit can also access or update private state as it operates via witnesses. Witnesses are callback functions provided by the TypeScript driver.

Witnesses must be declared to make them visible to the circuits of a contract. A witness declaration does not include a body, because the implementation is provided by the TypeScript driver.

witness-declarationexportopt witness id gparamsopt simple-parameter-list : type ;
simple-parameter-list( typed-id ,, typed-id ,opt )

Witness declarations can appear anywhere among the program elements of a module or the contract's top level.

For instance:

witness W(x: Uint<16>): Bytes<32>;

defines a witness W, to which the contract provides a 16-bit unsigned value and from which the contract receives 32 bytes of some presumably private data.

danger

Do not assume in your contract that the code of any witness function is the code that you wrote in your own implementation. Any DApp may provide any implementation that it wants for your witness functions. Results from them should be treated as untrusted input.

Declaring and maintaining public state

A contract declares the shape of its public state through ledger declarations.

Each ledger declaration defines one piece of information that the contract might store in the public ledger. Multiple ledger declarations can appear in a program, or none. They can appear anywhere among the program elements of a module or the contract's top level.

ledger-declarationexportopt sealedopt ledger id : type ;

A ledger declaration binds a ledger field name to one of a set of predefined ledger-state types. For instance:

import CompactStandardLibrary;

ledger val: Field;
export ledger cnt: Counter;
sealed ledger u8list: List<Uint<8>>;
export sealed ledger mapping: Map<Boolean, Field>;

All ledger fields are initialized to the default values of their ledger-state types. The contract constructor, if any, runs after the default values are established, allowing the constructor to override the defaults.

Ledger-state types

The following ledger-state types are supported.

  • T, for any regular Compact type T
  • Counter
  • Set<T>, for any Compact type T
  • Map<K, T>, for any Compact types K and T
  • Map<K, V>, for any Compact type K and ledger-state type V (see the following section)
  • List<T>, for any Compact type T
  • MerkleTree<n, T>, for any n, 1 < n ≤ 32, and any Compact type T
  • HistoricMerkleTree<n, T>, for any n, 1 < n ≤ 32, and any Compact type T
  • Kernel, a special type that yields access to ledger operations that do not depend on a specific ledger state

Each ledger type supports a set of operations, which can be invoked with

F.op(e, ⋯, e)

where F is a field name, op is the name of a ledger operation supported by the ledger-state type of F, and each e is an argument expression.

A ledger field that is declared with a Compact type T implicitly has the type Cell<T>, which supports several operations, including read, write, and reset_to_default. For example:

ledger F: Uint<16>;
export circuit putF(x: Uint<16>): [] {
F.write(disclose(x));
}
export circuit getF(): Uint<16> {
return F.read();
}

The read operation of the Cell ledger-state type can be abbreviated to, simply, a reference to the field name, and the write(e) operation can be abbreviated to an assignment of the form F = e. So the above can be written more simply as:

ledger F: Uint<16>;
export circuit putF(x: Uint<16>): [] {
F = disclose(x);
}
export circuit getF(): Uint<16> {
return F;
}

The read operation of the Counter type can be abbreviated in the same way, and its increment and decrement operations can be abbreviated to assignments of the form F += e and F -= e. For example:

import CompactStandardLibrary;
ledger F: Counter;
export circuit incrF(): [] {
F += 1;
}
export circuit decrF(): [] {
F -= 1;
}
export circuit getF(): Uint<64> {
return F;
}

The selection of an appropriate ledger-state type can reduce a transaction's dependency on the exact contents of the ledger state and thus reduce the chance of a transaction being rejected when the proof is checked on chain. For example, the preceding example could be written using a Cell instead:

ledger F: Uint<64>;
export circuit incrF(): [] {
F = F + 1 as Uint<64>;
}
export circuit decrF(): [] {
F = F - 1;
}
export circuit getF(): Uint<64> {
return F;
}

incrF and decrF read the value of F before operating on it and writing it back to F. This read commits the transaction to the current value of the field. If F has the value, say, 27 when the transaction is created, F must still have the value 27 when the proof is checked on chain. The Counter versions, however, do not read the value but merely request that the current value be incremented or decremented; thus, it does not have any such constraint.

A comprehensive description of ledger-state types and operations can be found in the Compact ledger data type documentation.

Nested state types in the Map type

In most cases, ledger-state types cannot be nested within other ledger-state types. However, while Map keys must have regular Compact types, Map values can have either regular Compact types or ledger-state types (except Kernel).

Attempting to nest a ledger-state type anywhere but within a Map value is a static error. Attempting to nest the Kernel type within a Map value is also a static error.

Here is a small example showing two levels of nesting:

import CompactStandardLibrary;

ledger fld: Map<Boolean, Map<Field, Counter>>;

export circuit initNestedMap(b: Boolean): [] {
fld.insert(disclose(b), default<Map<Field, Counter>>);
}

export circuit initNestedCounter(b: Boolean, n: Field): [] {
fld.lookup(b).insert(disclose(n), default<Counter>);
}

export circuit incrementNestedCounter1(b: Boolean, n: Field, k: Uint<16>): [] {
fld.lookup(b).lookup(n).increment(disclose(k));
}

export circuit incrementNestedCounter2(b: Boolean, n: Field, k: Uint<16>): [] {
fld.lookup(b).lookup(n) += disclose(k);
}

export circuit readNestedCounter1(b: Boolean, n: Field): Uint<64> {
return fld.lookup(b).lookup(n).read();
}

export circuit readNestedCounter2(b: Boolean, n: Field): Uint<64> {
return fld.lookup(b).lookup(n);
}

In this example,

  • fld is bound to a Map from Boolean values to Maps from Field values to Counters
  • initNestedMap can be used to create the inner Map for a particular outer-Map key
  • initNestedCounter can be used to create a Counter for a given outer-Map key and a given inner-Map key
  • either incrementNestedCounter1 or incrementNestedCounter2 can be used to increment an existing Counter for a given outer-Map key and a given inner-Map key
  • either readNestedCounter1 or readNestedCounter2 can be used to read the value of an existing Counter for a given outer-Map key and a given inner-Map key.

Notes:

  • Nested ledger-state values must be initialized before first use. The syntax default<T> can be used to create default ledger-state type values, just as it can be used to create default Compact type values. Attempting to operate on a nested ledger-state value without first initializing it is a dynamic error.

  • Ledger state type values are not first-class objects, so when accessing a nested value, the entire indirection chain must be used. For example, the following is a static error:

    export circuit incrementNestedCounter(b: Boolean, n: Field, k: Uint<16>): [] {
    fld.lookup(b); // ERROR: incomplete chain of indirects
    }
  • When the last operation is a write of a base type, increment of a Counter type, or decrement of a Counter type, one can replace the write, increment, or decrement operation with the =, +=, or -= assignment syntax, as illustrated by incrementNestedCounter1 and incrementNestedCounter2, which have the same behavior.

  • When the last operation is a read of a Counter or base type one can omit the explicit read() indirect, as illustrated by readNestedCounter1 and readNestedCounter2, which have the same behavior.

  • For convenience, local variables can hold default values of ledger-state types, so the following definition of initNestedMap is equivalent to the one above.

    export circuit initNestedMap(b: Boolean): [] {
    const t = default<Map<Field, Counter>>;
    fld.insert(disclose(b), t);
    }

Sealed and unsealed ledger fields

Any ledger field can be marked sealed by prefixing the ledger field declaration with the optional modifier sealed. A sealed field cannot be set except during contract initialization. That is, its value can be modified only by the contract constructor (if any), either directly within the body of the constructor or via helper circuits called by the constructor. The sealed keyword must come after the export keyword (if present) and before the ledger keyword, as in the following example:

sealed ledger field1: Uint<32>;
export sealed ledger field2: Uint<32>;

circuit init(x: Uint<32>): [] {
field2 = disclose(x);
}

constructor(x: Uint<16>) {
field1 = 2 * disclose(x);
init(x);
}

It is a static error if a sealed ledger field is updated by any code that is reachable from an exported circuit.

Contract constructor

A contract can be initialized via a contract constructor defined at the program's top level.

constructor-definitionconstructor pattern-parameter-list block
pattern-parameter-list( typed-pattern ,, typed-pattern ,opt )
typed-patternpattern : type

The constructor, if any, is typically used to initialize public state and can also be used to initialize private state through witness calls. At most one contract constructor can be defined for a contract, and it must appear only at the program top level, i.e., it cannot be defined in a module. To initialize ledger fields that are visible only within a module, the constructor can call a circuit that is exported from the module. For example:

module PublicState {
enum STATE { unset, set }
ledger state: STATE;
ledger value: Field;
export circuit init(v: Field): [] {
value = disclose(v);
state = STATE.set;
}
}

import PublicState;

constructor(v: Field) {
init(v);
}

Each constructor parameter must have an explicit type annotation. The type of each variable binding arising from the binding of identifiers in each parameter pattern to the corresponding pieces of the input is the type of the corresponding part of the declared type's structure.

The return type of the constructor is always []. Any attempt to return another type of value using return expr; where the type of expr is something other than [], is a static error.

Circuit definitions

The basic operational element in Compact is the circuit. This corresponds closely to a function in most languages but is designed to be compilable into a zero-knowledge circuit. The key limitation of circuits relative to functions in most languages is that circuits cannot be recursive, either directly or indirectly.

Compact supports two kinds of circuits: named circuits and anonymous circuits. Named circuits are described here, and anonymous circuits are described in Circuit and witness calls.

Named circuit definitions have the following syntax:

circuit-definitionexportopt pureopt circuit function-name gparamsopt pattern-parameter-list : type block

A circuit definition binds function-name to a circuit with the given parameters, return type, and body. The optional export modifier indicates that the circuit binding should be exported from the enclosing module or the program itself, if the circuit is defined outside of any module. The optional pure modifier indicates that the circuit is pure.

If any generic parameters are present (gparams is present and is nonempty), the circuit is generic and must be specialized (provided with generic arguments) at the point of call.

Circuits can take zero or more parameters. The parameters are all patterns containing identifiers to be bound to selected pieces of the argument values. In the simplest case, a pattern is just an identifier and is bound to the argument value as a whole. The bindings established by the parameters are visible within (and only within) the block that constitutes the body of the circuit. Each parameter must have an explicit type annotation, and at the point of every call to the circuit, the type of the corresponding argument expression must be a subtype of that type. The type of each variable binding arising from the binding of identifiers in each parameter pattern to the corresponding pieces of the input is the type of the corresponding part of the declared type's structure. For example, in the body of sumStruct below:

struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x, y}: S): Uint<64> {
return x + y;
}

The variable binding established for x by the pattern {x, y} has type Uint<16>, and the variable binding for y has type Uint<32>.

Every named circuit's return type must be explicitly declared, and it is a static error if the circuit can return a value that is not a subtype of that type.

The body is evaluated each time the circuit is called.

Pure and impure circuits

A Compact circuit is considered pure if it computes its outputs from its inputs without reference to or modification of public state (via the ledger) or private state (via witnesses). In practice, the compiler considers a circuit to be impure if the body of the circuit contains a ledger operation, a call to any impure circuit, or a call to a witness.

Some external circuits defined in CompactStandardLibrary are witnesses; calls to these make the caller impure. The remainder are considered pure, so calls to those do not make the caller impure.

A Compact program can declare a circuit to be pure by prefixing the circuit definition with the pure modifier, which must follow the export modifier, if present, e.g.:

pure circuit c(a: Field): Field { }

export pure circuit c(a: Field): Field { }

The only effect of the pure modifier is that the compiler flags the declaration as an error if its own analysis determines that the circuit is actually impure. The pure modifier allows an application to ensure that the circuit is present in the PureCircuits type declaration and via the pureCircuits constant in the TypeScript module produced for a (correct) Compact program by the Compact compiler.

Blocks

A block is a group of statements enclosed in braces:

block{ stmtstmt }

A block can be used in place of a single statement anywhere a single statement is allowed, which is useful for allowing multiple statements to be evaluated by the "then" and "else" parts of an if statement or the body of a for statement.

The body of every circuit definition and the body of the constructor, if any, is always a block. The right-hand side of the arrow in an anonymous circuit can be either a block or an expression.

The statements within a block occupy a nested scope: variable bindings created by const statements within the block are not visible outside the block, and they can shadow identifier bindings with the same names that exist outside the block.

A block is well-typed if the statements within it are well-typed.

A block is evaluated by evaluating the statements in sequence.

Statements

Compact statements are used to perform computations or effects with the help of Compact expressions.

Statements either do not have values or (in the case of expression sequences serving as statements) the values are ignored. Thus it is not necessary to talk about the type of a statement. Nevertheless, each statement has typing rules that must be followed, such as that the type of the test expression of an if statement must be Boolean.

The first subsection of this section presents a grammar snippet that summarizes the syntax of Compact statements. The remaining subsections describe the typing and evaluation rules of the various kinds of statements.

Syntax of statements

stmtif ( expr-seq ) stmt
|stmt0
stmt0expr-seq ;
|const cbinding , ⋯¹ , cbinding ;
|if ( expr-seq ) stmt0 else stmt
|for ( const id of start .. end ) stmt
|for ( const id of expr-seq ) stmt
|return expr-seq ;
|return ;
|block
starttsize
endtsize

The grammar snippet above shows that a statement (stmt) is either a one-armed if statement or some other kind of statement (stmt0). This structure is used to enforce the restriction that the "then" part of a two-armed if cannot be a one-armed if. This is often left ambiguous in a language grammar, with a separate note to say that the ambiguity is resolved by associating each "else" part with the closest enclosing if statement, but here the constraint is explicit in the grammar.

Expression sequences used as statements

Any expression sequence (expr-seq), i.e., comma-separated sequence of one or more expressions to be evaluated in sequence, can be used as a statement. Expression sequences are described in their own section.

const statements

const statements create local variable bindings. Every const statement takes the following form:

const cbinding , ⋯¹ , cbinding ;

where each cbinding takes the following form:

cbindingoptionally-typed-pattern = expr
optionally-typed-patternpattern
|typed-pattern

A const statement is typed by typing each of its cbinding subforms. A cbinding subform is typed by typing the expression on the right-hand side of the =. If a type T is declared on the left-hand side, the type of the expression must be a subtype of T, otherwise it is a static error. It is also a static error if the pattern implies a different structure from the type of the expression. For example, it is a static error if the pattern implies that the expression's value is a tuple when the type is actually, say, Field.

The type of each variable binding arising from the binding of identifiers in each pattern to the corresponding pieces of the input is the type of the corresponding part of the structure of the declared type, if present, otherwise of the inferred type. For example, in the following code, the binding for x has type Boolean, and the binding for y has type [Uint<64>, Uint<64>]:

witness w(): [Boolean, [Uint<16>, Uint<32>]];
circuit foo(): [Uint<64>, Uint<64>] {
const [x, y]: [Boolean, [Uint<64>, Uint<64>]] = w();
return x ? y : [0, 0];
}

while in the following, x still has type Boolean but y has (the inferred) type [Uint<16>, Uint<64>].

witness w(): [Boolean, [Uint<16>, Uint<32>]];
circuit foo(): [Uint<64>, Uint<64>] {
const [x, y] = w();
return x ? y : [0, 0];
}

Except where shadowed, the scope of each variable bound by a const statement is the entirety of the innermost block containing the const statement. It must not be referenced, however, before it is given a binding. Any attempt to do so is a static error. For example, the reference to x on the first line of the body in the definition of foo below is a static error:

circuit foo(a: Uint<16>): Field {
const y = x + a;
const x = 7;
return y;
}

Similarly, the reference to x in the first cbinding of the const statement below is also a static error:

const y = x, x = 7;

A const statement is evaluated by evaluating the cbinding subforms in order so that the variables given values by each cbinding are available to be referenced by the cbindings that follow.

The evaluation of each cbinding involves determining the value v of the expression on the right-hand side of the =, then giving values to identifiers in the pattern p on the left-hand side to the corresponding pieces of v as described earlier in Patterns and destructuring.

Any variable bound by const may not be reused within a block, although a const binding in a nested block might shadow it. Variables are immutable, although the same variable might take on different values at different times if it is contained within a block of code that is evaluated more than once, such as would be the case for the body of a circuit that is called more than once.

if statements

An if statement is used to determine the flow of control through the statements of a circuit or constructor body. A one-armed if expression has a "test part" (an expr-seq enclosed in parentheses) and a "then part" (then-statement):

if (expr-seq) then-statement

A two-armed if statement has a test part, a then part, and an "else part" (else-statement):

if (expr-seq) then-statement else else-statement

The typing of an if statement requires only that the type of expr-seq must be Boolean.

Evaluating an if expression involves first determining the value v of expr-seq. If v is true, then-statement is evaluated. Otherwise, v must be false, in which case else-statement (if present), is evaluated.

for statements

for statements are used to iterate over a sequence of values. In Compact, in contrast to most languages, the number of iterations can always be determined at compile time. That is, the number of iterations is bounded either by constant numeric bounds or by size of an object of constant size. This restriction is motivated by the need for the compiler to generate finite proving circuits.

Compact supports two kinds of for statements. The first iterates over vectors, tuples, and byte vectors and takes the following form:

for (const x of expr) stmt

This kind of for statement is typed by typing expr and verifying that it is a Vector type, a tuple type that has a vector type, or a Bytes type.

Evaluating this kind of for requires determining the vector, tuple, or byte-vector value v of expr then evaluating stmt once for each element of v with x bound to the value of each element of v in turn.

The second form iterates over a range of unsigned integer values and takes the following form:

for (const i of start .. end) stmt

In this form, each of start and end must be a literal unsigned integer or reference to a generic natural-number parameter, and end must be greater than or equal to start. Otherwise, it is a static error.

This form is always well-typed.

Evaluating this kind of for requires evaluating stmt with i bound to k for each k in the range start (inclusive) to end (exclusive).

return statements cannot be used to return from within for statements. It is therefore a static error for stmt to be a return statement or for a return statement to appear within stmt (except where it appears nested within an anonymous circuit).

Iteration can also be accomplished via map and fold expressions.

return statements

A return statement can be used to exit from the closest enclosing anonymous circuit, if any, or otherwise from the enclosing constructor or named circuit, and to return to the caller either an explicit return value, i.e., the value of expr-seq in this form of return:

return expr-seq;

or the default value [] in this form of return:

return;

A circuit or constructor body can also exit without an explicit return statement: any path through the body that does not end in an explicit return statement is treated as if ended with return;.

A return statement is always well-typed if it exits from an anonymous circuit without a declared return type. Otherwise, a return statement is well-typed if the type of expr-seq, or [] if no expr-seq is present, is a subtype of the expected return type. If the return form exits from a named circuit or an anonymous circuit with a declared return type, the expected type is the declared return type, while if it exits from the constructor, the expected return type is [].

An implication of these rules is that it is a static error to exit without an explicit return value from a circuit with a declared return type other than [].

When a return statement is evaluated, the expr-seq, if present, is evaluated, the circuit or constructor exits immediately without evaluating any subsequent statements, and it returns to the caller the value of expr-seq or [] if no expr-seq is present.

Expressions

Compact expressions are used to compute values, to cause effects, or both.

Every Compact expression must be well-typed (free from static type errors). If any expression within a program contains a static type error, it is a static error: the Compact compiler reports the error and does not produce any target code (TypeScript or zkir) for the program. The static type of a well-typed expression is either a Compact type or a ledger-state type.

Every well-typed Compact expression either evaluates to a value or causes a dynamic error, and it might have effects. The evaluation of an expression is defined in terms of the evaluation of its subexpressions.

The first subsection of this section provides a grammar snippet that summarizes the syntax of Compact expressions, and the second discusses how the grammar reflects the rules for precedence and associativity of operators. The remaining sections describe the typing and evaluation rules for each kind of expression.

Syntax of expressions

expr-seqexpr
|expr , ⋯¹ , expr , expr
exprexpr0 ? expr : expr
|expr0 = expr
|expr0 += expr
|expr0 -= expr
|expr0
expr0expr0 || expr1
|expr1
expr1expr1 && expr2
|expr2
expr2expr2 == expr3
|expr2 != expr3
|expr3
expr3expr4 < expr4
|expr4 <= expr4
|expr4 >= expr4
|expr4 > expr4
|expr4
expr4expr4 as type
|expr5
expr5expr5 + expr6
|expr5 - expr6
|expr6
expr6expr6 * expr7
|expr7
expr7! expr7
|expr8
expr8expr8 [ expr ]
|expr8 . id
|expr8 . id ( expr ,, expr ,opt )
|expr9
expr9fun ( expr ,, expr ,opt )
|map ( fun , expr , ⋯¹ , expr ,opt )
|fold ( fun , expr , expr , ⋯¹ , expr ,opt )
|slice < tsize > ( expr , expr )
|[ tuple-arg ,, tuple-arg ,opt ]
|Bytes [ bytes-arg ,, bytes-arg ,opt ]
|tref { struct-arg ,, struct-arg ,opt }
|assert ( expr , str )
|disclose ( expr )
|term
termid
|true
|false
|nat
|str
|pad ( nat , str )
|default < type >
|( expr-seq )
struct-argexpr
|id : expr
|... expr
bytes-argtuple-arg
tuple-argexpr
|... expr

Precedence and associativity

The structure of the expression grammar unambiguously reflects the precedence and associativity of operators.

Each group from expr-seq through term represents a precedence level, with expr-seq at a higher precedence level than expr, expr at a higher level than expr0, and so on. For example, expr6 (multiplication) is at a higher precedence level and so binds more tightly than expr5 (addition) because the grammar permits the operands of an addition expression to be multiplication expressions, but not vice versa. For example, x * y + z is parsed as the addition of x * y and z rather than the multiplication of x and y + z, because the expr5 production for addition is not reachable from the expr6 production for multiplication. One can still write x * (y + z) because an expr6 can be a term (via the "fall through" productions expr6expr7, expr7expr8, and so on), a term can be a parenthesized expr-seq, and an expr-seq can be an expr.

The grammar enforces associativity by requiring the left or right operand of an operator or both to be at a higher precedence level. Specifically, left-associativity is expressed by requiring the right operand to be at a higher level while allowing the left operand to be at the same level. For instance, expr5expr5 + expr6 enforces the left-associativity of addition, since the left operand must be at a higher level. This means, for example, that x + y - z can be treated only as the subtraction of z from x + y rather than as the addition of x and y - z. Right-associativity is expressed in the opposite manner, e.g., expr0 ? expr : expr enforces the right-associativity of the ternary ?: operator. Non-associative operators such as the relational operators <, <=, >=, and > require both operands to be at a strictly higher level (expr4 on both sides of expr3), preventing chaining like a < b < c.

The middle operand of the ternary ?: operator plays no roll in determining precedence and associativity and can be any kind of expression. (The grammar could even have allowed it to be an expression sequence, but Compact follows TypeScript and JavaScript in restricting it to be a single expression.) In fact, the grammar has many examples where a lower precedence expression is embedded in a higher-precedence expression; this does not result in any ambiguity because the operand is not exposed on the left or right side of the expression where it can be confused as an operand of some other operator.

Parenthesized expressions

Compact allows parenthesized expressions of the form (e), where e is an expression or, more generally, an expression sequence. Parenthesized expressions can be used to control the order of operations or simply to make the default order of operations explicit.

For example, x + y * z computes the product of y and z and adds x to the result, because * has higher precedence than +. x + (y * z) computes the same thing but makes the order of operations more clear, while (x + y) * z computes the sum of x and y and multiplies it by z.

The type of a parenthesized expression is the type of the embedded expression sequence, and its value is the value of the subexpression.

Expression sequences

An expression sequence (expr-seq in the grammar above) is a comma-separated sequence of one or more expressions.

expr,, expr

Expression sequences can appear only in a few contexts. When an expression sequence is required in a context where only a single expression is permitted, the expression sequence can be wrapped in parentheses:

(expr,, expr)

The type of an expression sequence is the type of its last subexpression. The types of the other subexpressions are ignored and are not constrained in any way.

An expression sequence is evaluated by evaluating the subexpressions from left to right, and its value is the value of the last subexpression. The values of the other subexpressions are ignored; these other subexpressions are evaluated solely for their effects.

Literals

Compact has syntax for Boolean, numeric, and string literal expressions.

A Boolean literal is one of the reserved words true or false. The static type of a Boolean literal is Boolean and its value is the corresponding Boolean value.

A numeric literal is a non-negative integer written in decimal, binary, octal, or hexadecimal notation as follows:

  • decimal: either the single digit 0 or a sequence of one or more decimal digits (0-9) starting with a non-zero digit, e.g., 0 or 7091;
  • binary: the prefix 0b or 0B followed by one or more binary digits (0 or 1), e.g., 0b1101001;
  • octal: the prefix 0o or 0O followed by one or more octal digits (0-7), e.g., 0o5073; or
  • hexadecimal: the prefix 0x or 0X followed by one or more hexadecimal digits (0-9, a-f, A-F), e.g., 0x8f0a.

An occurrence of a numeric literal can arise via specialization of a generic module or circuit in which a reference to a generic natural-number parameter occurs in an expression context. For example, if foo is defined as follows:

circuit foo<#N>(): Uint<16> {
return N;
}

the call foo<17>(), at least in effect, gives rise to a copy of foo in which N has been replaced by 17.

circuit foo(): Uint<16> {
return 17;
}

This is treated exactly as if the numeric literal appeared in place of the reference to N in the source code of the program.

It is a static error if the number n denoted by a numeric literal exceeds the maximum unsigned value and the maximum field value.

If n does not exceed the maximum representable unsigned value, the literal's type is Uint<0..k>, k = n+1, and its value is the unsigned integer n.

If n exceeds the maximum representable unsigned value but not the maximum representable field value, the literal must be directly cast to Field, i.e., n as Field. Otherwise it is a static error. The type of n as Field is Field, and its value is the field value n.

String literals can be either simple string literals or pad expressions. String literals produce byte vectors; Compact has no dedicated String type.

Simple string literals use TypeScript string literal syntax. They can be enclosed in either single (') or double (") quotes, and they can contain escaped characters. The length n of a simple string literal is the length of its UTF-8 encoding. The static type of a string literal of length n is Bytes<n>, and its value is a byte vector containing its UTF-8 encoding.

A pad expression pad(n, s) is also a string literal, where pad is a reserved word, n is a natural number literal, and s is a simple string literal. The length of s must be less than or equal to n; otherwise, it is a static error. The static type of a padded string literal pad(n, s) is Bytes<n>, and its value is a byte vector containing the UTF-8 encoding of s followed by 0 bytes up to the padded length n.

Default values of a type

The expression default<T>, where T is a Compact type or a ledger-state type, has static type T and evaluates to the default value of that type as follows:

  • Boolean: the value of the literal false
  • Uint<0..n> and Uint<n>: 0
  • Field: 0
  • [T1, , Tn]: a tuple with n elements, each of which is the default value of the corresponding type Ti
  • Vector<n, T>: a vector with n elements, each of which is the default value of type T.
  • Bytes<n>: a byte vector of length n, each element of which is 0.
  • Opaque<"string">: a zero-length string
  • Opaque<"Uint8Array">: a zero-length Uint8Array
  • Structure types: a struct with each element set to the default value of its type
  • Enumeration types: the value of E.id where E is the name of the enumeration type and id is the first element
  • alias types: the default value of the underlying type
  • Counter: a Counter initialized to 0
  • Set<T>: an empty Set
  • Map<K, T>: an empty Map
  • List<T>: an empty List
  • MerkleTree<n, T>: an empty Merkle tree
  • HistoricMerkleTree<n, T>: an empty historic Merkle tree

Default values are not defined for contract types or the Kernel ledger-state type, so it is a static error if T in default<T> is one of these types.

Variable references

A variable reference is an identifier reference to a variable binding in scope.

The type of a variable reference is the type of the binding to which it refers.

Each call to a circuit (or constructor) associates a new set of values with each variable bound by the circuit's (or constructor's) parameters, and each block in the circuit (or constructor) associates a new set of values with each of the const bindings created within the block. The value of each variable reference is the value currently associated with the variable.

Conditional expressions

Compact supports conditional expressions of the form

e0 ? e1 : e2

where e0, e1, and e2 are expressions.

The type of e0 must be Boolean. The types of e1 and e2 must be related. That is, the type of e1 must be a subtype of the type of e2, or the type of e2 must be a subtype of the type of e1.

The type of the entire expression is the type of e2 if e1 is a subtype of e2 and the type of e1 if e2 is a subtype of e1.

A conditional expression e0 ? e1 : e2 is evaluated by first evaluating e0. The value v of e0 determines which of the other subexpressions is evaluated:

  • if v is true, then e1 is evaluated and its value is the value of the entire expression
  • if v is false, then e2 is evaluated and its value is the value of the entire expression

The evaluation rules ensure that only one of e1 and e2 is evaluated.

Relational comparison expressions

Relational comparison expressions take the form e1 op e2 where e1 and e2 are expressions and op is one of Compact's relational operators. The relational operators are equals (==), not equals (!=), less than (<), greater than (>), less than or equals (<=), and greater than or equals (>=).

Equals and not equals require the types of e1 and e2 to be related. That is, the type of e1 must be a subtype of the type of e2, or the type of e2 must be a subtype of the type of e1. Otherwise, it is a static error.

Less than, greater than, less than or equals, and greater than or equals require the types of e1 and e2 to be unsigned integer types. Otherwise, it is a static error. (Values of type Field cannot be compared with these operators.)

The type of the result is Boolean.

A relational comparison expression e1 op e2 is evaluated by evaluating e1, then evaluating e2, then comparing the resulting values as described below.

Equals and not equals

Equality depends on the types of the operands according to the following rules. The rules need not and therefore do not cover cases where one operand is not a subtype of the other, as required by typing rules above. For example, if one operand has type Boolean, so does the other; if one operand has a Bytes type of length n, so does the other; and if one has a specific structure type, so does the other.

  • Boolean: the values are equal if both are true or both are false.
  • Field and Uint types: the values are equal if the integer values are equal.
  • Bytes: the values are equal if the corresponding bytes at each index are equal.
  • Vector and tuple types: the values are equal if the corresponding values at each index are equal.
  • Structure types: the values are equal if the values of the corresponding fields are equal.
  • Enumeration types: the values are equal if they are the same enumeration member.
  • Opaque types: the values are equal if the run-time values are equal according to JavaScript's strict equality (===) operator.

Otherwise the values are not equal.

For equal, the relational comparison expression evaluates to true if the values are equal; otherwise it evaluates to false. Likewise, for not equals, the relational comparison expression evaluates to false if the values are equal; otherwise it evaluates to true.

Less than, greater than, less than or equals, and greater than or equals

The values are compared according to the corresponding relational operation. The relational comparison expression evaluates to true if the relation holds; otherwise it evaluates to false.

Short-circuit logical expressions

Compact supports short-circuit logical expressions of the form e1 op e2 where e1 and e2 are expressions and op is one of the logical operators or (||) or and (&&).

Logical expressions require the types of both subexpressions to be Boolean. Otherwise it is a static error. The logical expression itself also has type Boolean.

Evaluating e1 op e2 involves first evaluating e1, then:

  • For ||, if v is true, e2 is not evaluated, and the value of the entire expression is true. Otherwise, e2 is evaluated, and its value is the value of the entire expression.
  • For &&, if v is false, e2 is not evaluated, and the value of the entire expression is false. Otherwise, e2 is evaluated, and its value is the value of the entire expression.

These are short-circuiting operators because they do not evaluate the second operand if the final value is determined from the first.

Boolean negation expressions

Compact supports unary Boolean negation expressions of the form !e where e is an expression.

The type of e must be of type Boolean, otherwise it is a static error. The Boolean-negation expression itself also has type Boolean.

A Boolean negation expression !e is evaluated by evaluating e to determine its value v. Then if v is true, the value of the Boolean negation expression is false, and If v is false, the value of the Boolean negation expression is true.

Binary arithmetic expressions

Binary arithmetic expressions have the form e1 op e2, where e1 and e2 are operand expressions and op is one of Compact's binary arithmetic operators. The binary arithmetic operators are add (+), subtract (-) and multiply (*).

Arithmetic expressions require the type of each operand to be a numeric type, that is, either a Field, a Uint, or a type alias for a Field or a Uint. Putting aside type aliases for now, the type of the result depends on the types of the operands as follows:

  • If both operands have type Field, the result has type Field.
  • If both operands have Uint types, i.e., e1 has type Uint<0..m> for some m and e2 has type Uint<0..n> for some n, the type of the result depends on the operation as follows:
    • for add, the result has type Uint<0..m+n>,
    • for subtract, the result has type Uint<0..m>, and
    • for multiply, the result has type Uint<0..m⋅n>.
  • If one operand has type Field and the other has type Uint<0..n> where n-1 is less than or equal to the maximum field value, the Uint operand is implicitly cast to Field, and the result has type Field. (If n-1 is greater than the maximum field value, the behavior of the compiler is currently unspecified.)

For arithmetic operations with Uint result types, it is a static error if the result's bound would be greater than the maximum unsigned integer, if any.

Evaluating an arithmetic expression e1 op e2 involves first evaluating e1, then evaluating e2. Integer addition, subtraction, or multiplication is then used on the operand values. The overflow and underflow behavior differs for Field (either operand has type Field) and Uint operations (both operands have a Uint type):

  • Field arithmetic overflow and underflow wraps around 0; that is, the result of an arithmetic operation whose result is a Field is the actual arithmetic value modulo k, where k is one more than the maximum field value.
  • Uint addition and multiplication cannot overflow: the static type of the result is always large enough to hold the result value.
  • Uint subtraction results in a dynamic error if the resulting value would be negative, i.e., if the value of e2 is greater the value of e1.

If the type of either operand is a structural type alias for a Field or Uint it is treated identically to that Field or Uint, both in type checking and evaluation.

On the other hand, if the type of either operand is a nominal type alias T for a Field or a Uint, the type of the other operand must also have type T, and the type of the result also has type T. Evaluation proceeds as if the two operands had the underlying Field or Uint type with the value cast back to T, which can result in a dynamic error if T is a nominal alias for a type Uint<0..n> and the value is not less than n. For example:

new type Feet = Uint<32>;
circuit foo(x: Feet, y: Feet, scale: Uint<32>): Feet {
return (x + y) * (scale as Feet);
}

computes the sum of x and y multiplied by scale. If the value of this computation fits in 32 bits, foo returns the value. Otherwise, the enclosing top-level circuit or constructor halts with a message indicating that the cast of the value to Feet failed.

Tuple creation

New tuple values are created with expressions of the form [tuple-arg, ⋯, tuple-arg] where tuple-arg, ⋯, tuple-arg is a sequence of zero or more comma-separated tuple arguments. A non-empty sequence can have an optional trailing comma. Each tuple argument is either an expression or a spread.

If a tuple argument is an expression e of type T, it contributes a single element of type T to the new tuple: the value of e.

If a tuple argument is a spread ... e, the type T of e must be a tuple type, a Vector type, or a Bytes type. Otherwise, it is a static error.

If T is a tuple type [U1, ⋯, Un], the spread contributes n new elements of types U1, ⋯, Un to the new tuple, i.e., the elements of the tuple value of e. If T is a Vector type Vector<n, U>, the spread contributes n new elements of type U to the new tuple, i.e., the elements of the Vector value of e. If T is a Bytes type Bytes<n>, the spread contributes n new elements of type Uint<8> to the new tuple, i.e., the elements of the byte-vector value of e.

The type of a tuple-creation expression is the tuple type whose elements are the types contributed by each of the tuple arguments in order.

A tuple-creation expression is evaluated by evaluating the tuple arguments from left to right and creating a new tuple whose elements are the values contributed by each of the tuple arguments in order. The length of the new tuple is the total number of contributed elements, which can be less than or greater than number of tuple arguments.

Byte-vector creation

New byte vectors are created with expressions of the form Bytes [tuple-arg, ⋯, tuple-arg]. Bytes creation is essentially the same as tuple creation except that the types of the contributed elements must all be subtypes of Uint<8> (if not, it is a static error), and the result is a new byte vector of type Bytes<n> where n is the number of contributed elements.

Tuple, vector, and byte-vector references

Compact allows references to individual elements of sequence values (tuples, vectors, and byte vectors) via the syntax e[index] where e is an expression and index is a numeric literal, generic natural-number parameter reference, or an expression that can be reduced to a numeric constant at compile time as described below.

The type of e must be a sequence type, i.e., a tuple type, a vector type, or a Bytes type. The type of index must be Uint<n> for any n. The eventual constant value of index must be less than the length of the sequence value as determined by its type. Violating any of these constraints is a static error.

The unsigned integer value of index must be computable at compile time via the following rules, or it is a static error.

  • Any index can be a constant, i.e., a numeric literal or a generic natural-number parameter reference.
  • If the type of e is a tuple type that has a vector type, the rule for vector types below applies, with e treated as having the vector type. If the type of e is a tuple type that does not have a vector type, however, index must be a constant. This restriction allows the compiler to compute a type for tuple reference expressions without first attempting to reduce index to a numeric value.
  • If the type of e is a vector or byte-vector type, index must be a constant-valued expression, where a constant-valued expression is:
    • a constant,
    • a reference to a generic natural-number parameter,
    • a reference to a variable bound to a constant-valued expression,
    • a reference to a variable bound in the header of a for loop, or
    • the result of adding, subtracting, or multiplying two constant-valued expressions.

The type of a byte-vector reference is Uint<8>. The type of a vector reference where the vector has element type T is T. The type of a tuple reference where the tuple has type [T1, ⋯, Tn] and index is the constant i or the value i of a generic natural-number parameter is Ti. Otherwise, the tuple type must have a vector type Vector<n, T> and the type of the tuple reference is T.

The value of a sequence reference is the value of the ith (zero-based) element of the result of evaluating e, where i is the (eventually) compile-time constant value of index.

Tuple, vector, and byte-vector slices

Slicing a sequence value such as a tuple, vector or byte vector produces a subsequence of the original value. It is similar to a sequence reference but extracts a sequence of values rather than a single value from the sequence value.

Slice expressions take the form slice<k>(e, index), where slice is a keyword, k is a constant or numeric generic parameter reference specifying the fixed size of the slice, e is an expression, and index is a numeric literal, a generic natural-number parameter reference, or an expression that can be reduced to a numeric constant at compile time as described above in the section on sequence references. It is a static error if index + k exceeds the length of the sequence value.

The type of a byte-vector slice is Bytes<k>. The type of a vector slice where the vector has element type T is Vector<k, T>. The type of a tuple slice where the tuple has type [T1, ⋯, Tn] and index is the constant i or the value i of a generic natural-number parameter is the subsequence [Ti, ⋯, Tj] starting with element i (zero-based) and ending with element j=i+k-1 of [T1, ⋯, Tn]. Otherwise, the tuple type must have a vector type Vector<n, T> and the type of the slice is Vector<k, T>.

The value of a slice expression slice<k>(e, index) is the subsequence of the original tuple, vector, or byte vector from i (zero-based, inclusive) through i+k (zero-based, exclusive) of the result of evaluating e, where i is the (eventually) compile-time constant value of index.

For example, if getMiddle is defined as follows:

export circuit getMiddle(x: Bytes<5>): Bytes<3> {
return slice<3>(x, 1);
}

The call

getMiddle(Bytes[17, 18, 19, 20, 21])

evaluates to the equivalent of

Bytes[18, 19, 20]

Structure creation

Structure values are created with structure-creation expressions of the form T {struct-arg, ⋯, struct-arg}, where T is a structure type name S or specialized generic structure type S<garg, ⋯, garg>, and struct-arg, ⋯, struct-arg is a sequence of zero or more comma-separated structure arguments.

A structure argument can be one of three things:

  • a positional argument, which takes the form of an expression e,
  • a named argument, which takes the form id: e, where id is a field name and e is an expression, or
  • a spread argument, which takes the form ...e, where ... is the literal three dots (ellipsis) token and e is an expression.

The sequence of structure arguments within a structure-creation expression can consist either of:

  • zero or more positional arguments followed by zero or more named arguments, or
  • a spread argument followed by zero or more named arguments.

In the first case, the values of the n positional arguments become the values of the first n fields of the created structure and must be given in the same order as the fields in the structure declaration. The named elements specify the values of the remaining fields by name; that is, the value of e in the named argument id: e becomes the value of the field named id.

In the second case, the expression e in the spread argument ...e must have the same structure type T as the one being created, and each field of the created structure is given the value of the corresponding field from the value of e if not overridden by one of the named arguments.

Named arguments need not appear in any particular order.

It is possible for only positional or only named arguments to be present. It is also possible, though not useful, for only a spread argument to be present.

The examples below demonstrate the use of positional and spread field values:

struct S { a: Uint<32>, b: Boolean, c: Bytes<8> }
circuit f(x: Uint<32>, y: Boolean, z: Bytes<8>): S {
const s1 = S { c: z, a: x, b: y };
// Alternatively, s1 can be created with the positional syntax S { x, y, z }
// or a mix of positional and named field values S { x, c: z, b: y }.

const s2 = S { ...s1, b: true };
// s2 is created using the spread syntax. So, s2 has the same field values
// as s1 except that b is true.

const s3 = S { ...s2, c: 'abcdefgh' };
// s3 is also created using the spread syntax. s3 has the same field values
// as s2 except that c is 'abcdefgh'.

return s3;
}

The structure type name must be bound to a structure type in scope. If the structure is generic, it must be fully specialized with generic arguments enclosed in angle brackets.

The static type of a non-generic structure-creation expression is the named structure type, while the static type of a generic structure-creation expression is a structure with the same name as the generic type and field types obtained by substituting the generic arguments for the generic parameters in the structure's declaration.

If a spread argument is not present: The number of structure arguments must match the number of fields in the corresponding structure declaration: a value must be given for every field. Positional arguments must appear before named arguments. A field name must not occur more than once among the named arguments, and each field name that does occur must be the name of a field in the corresponding structure declaration whose value is not given positionally. The type of a positional field subexpression must be a subtype of the declared type of the (positionally) corresponding field in the structure declaration. Similarly, the type of a named field subexpression must be a subtype of the declared type of the corresponding (named) field in the structure declaration.

If a spread argument is present: The spread argument must come first in the sequence of structure arguments. The type of the spread subexpression must be the same as the structure to be created. There must not be any positional arguments. A field name must not occur more than once among the named arguments, and each field name that does occur must be the name of a field in the corresponding structure declaration. The type of a named field subexpression must be a subtype of the declared type of the corresponding (named) field in the structure declaration.

Violating any of the above constraints is a static error.

A structure-creation expression is evaluated by evaluating the structure argument expressions in order from left to right and constructing a structure value whose fields values are based on the corresponding structure arguments: if there is a positional or named argument for a field, the field's value is the value of the expression, otherwise there must be a spread argument and the field's value is the value of the corresponding field in the (structure) value of the spread expression.

Structure field access

A structure field access is an expression of the form e.id where e is an expression with a structure type T and id is an identifier. It is a static error if T does not have a field named id.

The type of any structure field access e.id where e has type T is the type of the id field of T.

The value of any structure field access e.id is the result of evaluating the subexpression e and extracting the value of the resulting structure's id field.

Some expressions of the form e.id can also be enumeration member selection or ledger-state operations. e.id is recognized as a structure field access only when the type of e is a structure type.

Enumeration member selection

An expression of the form E.id where E is the name of an enumeration type and id is an identifier is an enumeration member selection. It is a static error if E does not have a member named id.

The type of any enumeration member selection E.id is E.

E.id is a constant. That is, the value of any enumeration member selection E.id is E.id.

Some expressions of the form E.id can also be structure field accesses or ledger-state operations. E.id is recognized as an enumeration member selection only when the type of e is an enumeration type.

Circuit and witness calls

Circuits and witnesses, collectively referred to as functions, are called via expressions of the form fun(e, ⋯, e), where fun is a function expression and e, ⋯, e is a sequence of zero or more comma-separated argument expressions.

The function expression fun can take one of the following forms:

funid gargsopt
|arrow-parameter-list return-typeopt => block
|arrow-parameter-list return-typeopt => expr
|( fun )
arrow-parameter-list( optionally-typed-pattern ,, optionally-typed-pattern ,opt )
return-type: type

In the simplest form, fun is just a identifier id referring to a witness or named circuit. When id refers to a generic witness or circuit, it must be fully specialized via generic parameters.

fun can also be an anonymous circuit, also known as an arrow circuit. This form consists of a parameter list (optionally-typed-pattern, ⋯, optionally-typed-pattern) followed by an optional return type : type, an arrow (=>), and a body, which can be a block or an expr. Each optionally-typed-pattern is a pattern with an optional type annotation : type.

Finally, fun can be a parenthesized function (fun).

Because circuits and witnesses are not first class, fun cannot be a variable name or arbitrary expression.

Anonymous circuits cannot have generic parameters. Since they appear only in contexts where they are directly called, the circuit would have to be immediately specialized, in which case only one specialization can exist and the non-generic version would be clearer.

The underlying function of a function expression is a non-parenthesized function expression. For a function name it is the function name, for an anonymous circuit it is the anonymous circuit, and for a parenthesized function it is the underlying function of the parenthesized function expression.

Type checking a function call depends on the form of the underlying function.

  • For a named function: Function names may be overloaded: more than one function of the same name may be present in the scope of a call to that function. A call whose underlying function is a name can thus have zero or more candidate functions, i.e., all of those that have the given name and are present in the scope of the call.

    A candidate function is not necessarily compatible with the number and kinds of the generic parameter values nor with the number and types of the argument expressions provided at the call site. It is compatible if and only if the number of generic parameters is the same as the number of generic arguments, each generic argument is of the required kind (numeric or type), the number of declared parameters is the same as the number of argument expressions, and the type of each argument is a subtype of the declared type of the corresponding parameter. If exactly one candidate is compatible, the call is made to that candidate. It is a static error if there is not exactly one compatible candidate.

    The static type of a call to a named function is the declared return type of the called function.

  • For an anonymous circuit: Parameters are type-checked or inferred as follows:

    • If there is a type annotation for a parameter, it is a static error if the type of the corresponding argument expression is not a subtype of the type annotation.

    • If there is no type annotation for a parameter, the parameter's type is inferred as the type of the corresponding argument expression.

    Within the body of the anonymous circuit, the type of each variable binding arising from the binding of identifiers in each parameter pattern to the corresponding pieces of the corresponding argument is the type of the corresponding part of the structure of the declared or inferred parameter type.

    If there is a return-type annotation, it is a static error if the body can return a type of value that is not a subtype of the return-type annotation. A return statement of the form return; implicitly returns a value of type [], as does every control flow path through a body that does not explicitly end with a return statement.

    If there is no return-type annotation, then a return type is inferred from the body as the least upper bound of the types of the values that can be returned from the body (explicitly or implicitly). It is a static error if these types do not have a least upper bound.

    The static type of a call to an anonymous circuit is the declared or inferred return type.

Calls are evaluated by evaluating the argument expressions in order from left to right. Then, if a circuit is being called, the statements in the body of the circuit are executed with the parameter names bound to the corresponding argument values. The value of a circuit call is the value returned from the execution of the body. If a witness is being called, the contract invokes the externally provided witness function with the argument values. The value of a witness call is the value returned by the witness function.

Map and fold expressions

Compact supports expressions that perform the higher-order operations map and (left-to-right) fold over vectors, tuples that have vector types (not arbitrary tuples), and byte vectors.

Map expressions have the form map(fun, e, ⋯¹, e) where map is a keyword, fun is a circuit or witness, and each e is an expression. The map operator creates a vector containing the results of applying fun to the corresponding elements of the sequence values of the expressions.

Fold expressions have the form fold(fun, init, e, ⋯¹, e) where fold is a keyword, fun is a circuit or witness, init is an expression, and each e is an expression. The fold operator accumulates a value starting with the value of init and updating it by applying fun to the current value and the corresponding elements of the sequence values of the expressions in turn.

A detailed description of the circuit or witness fun is given in Circuit and witness calls above.

A map expression is type-checked by checking the type of the witness or circuit fun to find its declared or inferred parameter types and its declared or inferred return type R. fun must have at least one parameter, and the map expression must have the same number of sequence-value subexpressions as the number of parameters of fun. Each of the sequence-value subexpressions must have a vector or byte-vector type, and all these types must have the same length n. If the type of the ith parameter to fun is T, then the ith sequence-value subexpression must have the vector type Vector<n, S> where S is a subtype of T, or it must be the byte-vector type Bytes<n> and Uint<8> must be a subtype of T. The type of the map expression is Vector<n, R>; map produces a vector regardless of whether the input sequence-value subexpressions are vectors, tuples, byte vectors, or some combination.

A fold expression is type-checked by checking the type of the witness or circuit fun to find its parameter types and its return type R. fun must have at least two parameters, and the type of the first parameter must be the same type as the return type R. The fold expression must have one fewer sequence-value subexpression than the number of parameters of fun. The first subexpression init gives the initial value for the fold. It must have a type which is a subtype R. Each of the sequence-value subexpressions must have a vector or byte-vector type, and all these types must have the same length n. If the type of parameter i+1 of fun is T, then the type of the ith sequence-value subexpression must have the vector type Vector<n, S> where S is a subtype of T, or it must be the byte-vector type Bytes<n> and Uint<8> must be a subtype of T. The type of the entire expression is R.

Map expressions are evaluated by evaluating the sequence-value subexpressions from left to right to produce the input sequence values. The witness or circuit fun is then applied in turn, from index 0 up to index n-1, to arguments taken from the input sequence values. The result is a vector of length n where each ith element is the result of applying fun to the ith elements of the corresponding input sequence values.

Fold expressions are evaluated by evaluating the initial value expression init and then evaluating the sequence-value subexpressions from left to right. The values of the sequence-value expressions are the input sequence values. The witness or circuit fun is then applied in turn, from index 0 up to index n-1, to an accumulator value argument and arguments taken from the input sequence values. The 0th (initial) accumulator value is the value of the expression init, and each subsequent i+1th accumulator value is the result of applying fun to the ith accumulator value and to the ith elements of the corresponding input sequence values. The result is the nth (final) accumulator value where n is the length of the input vectors.

Type cast expressions

Values of one type can be cast to another, when allowed, via a cast, which takes the form e as T where e is an expression and T is a type. The effects of casting an expression e of type T to another type U depend on the specific types involved. For some types T and U, the cast is not allowed, which is a static error. Even if the cast is allowed, some values of type T might not be representable as values of type U. In such cases, a run-time check of the value of e is required and can result in a dynamic error. In cases where the run-time representation of T is different from the run-time representation of U, the value of e must be converted at run time from the one representation to the other.

The type of any (allowed) cast e as T is, naturally, T.

Evaluation of a cast proceeds by evaluation of its subexpression, followed by any checks and conversions required by the cast. If a check fails, a dynamic error is reported.

TypeScript casts of the form <T>e are not supported in Compact.

The following rules govern when casts are allowed, when casts might cause dynamic errors, and when casts might require run-time conversions. Any cast not covered by one of the rules is not allowed.

Upcasts

  • Upcasts, i.e., casts from a type to a supertype (e.g., from Vector<3, Uint<8>> to Vector<3, Uint<16>>), are always allowed, never result in dynamic errors, and, except in the case where the type and the supertype are the same, might require run-time conversions. Upcasts are never required: it is always possible to use a value of type T as a value of a supertype of T.

    This rule is straightforward, but some implications regarding tuple and vector types are worth noting:

    • Since Vector<n, T> is the same type as [T, ⋯, T] of length n, a cast from either of these types to the other is allowed but not required.
    • If a tuple type [T1, ⋯, Tn] has a vector type Vector<n, S>, then since the tuple type is a subtype of the vector type, a cast from the tuple type to the vector type is allowed but not required. For example, values of type [Uint<8>, Uint<16>] can be used where values of type Vector<2, Uint<16>> are required.
    • Since a vector type Vector<n, T> is a subtype of any tuple type [S1, ⋯, Sn] where T is a subtype of each S, a cast from the vector type to the tuple type is allowed but not required. For example, values of type Vector<2, Uint<8>> can be used where values of type [Uint<8>, Uint<16>] are required.

Downcasts of Field and Uint types

  • Any Uint downcast, i.e., any cast from Uint<0..m> to Uint<0..n>, m > n, is allowed, requires a run-time check, and might require a run-time conversion. The same applies to any Field to Uint casts where the maximum Field value exceeds the maximum value of the Uint type. (If the maximum value of the Uint type exceeds the maximum Field value, the behavior of the compiler is currently unspecified.)

Casts of Boolean to and from Field and Uint

  • Any cast from Boolean to Field or any Uint type is allowed, sometimes requires a run-time check, and requires a run-time conversion (of false to 0 and true to 1). A run-time check is required for Uint<0..1>.
  • Any casts from Field or any Uint type to Boolean is allowed, cannot result in a dynamic error, and requires a simple run-time conversion (of 0 to false and any other value to true).

Casts of Enum types to and from Field and Uint

  • Any cast from an enum type to Field or any Uint type is allowed, sometimes requires a run-time check, and might require a run-time conversion. A run-time check is required for a cast from an enum type with n elements to Uint<0..m>, n > m.
  • Any cast from Field or any Uint type to an enum type is allowed, sometimes requires a run-time check, and might require a run-time conversion. A run-time check is required for casts from Field. For an enum type with n elements, a run-type check is required for casts from Uint<0..m>, m > n.

Casts of Bytes to and from Field and Uint

  • Any cast from Bytes<m>, m != 0, to Field or any Uint type is allowed, requires a run-time check, and requires a run-time conversion. That is, the byte-vector value of the expression is converted into a Field or Uint value with the least-significant byte of the result corresponding to the first byte in the byte vector. A dynamic error occurs if the result exceeds the maximum value of the Field or Uint type.
  • Any cast from Field or any Uint type to Bytes<m>, m != 0, is allowed, requires a run-time check, and requires a run-time conversion. That is, the integer value of the expression is converted into a byte vector with the first byte of the byte vector corresponding to the least-significant byte of the integer. A dynamic error occurs if the result does not fit in m bytes.

Casts of Bytes to and from Vector and tuple types

  • Any cast from Bytes<m> to Vector<m, Field> or Vector<m, Uint<0..k>>, k ≥ 256 is allowed, cannot cause a dynamic error, but does require a run-time conversion.
  • Any cast from Bytes<m> to tuple type [T1, ⋯, Tm], where each Ti is Field or Uint<0..k> for some k ≥ 256 is allowed, cannot cause a dynamic error, but does require a run-time conversion.
  • Any cast from Vector<m, Uint<0..k>>, k ≤ 256, to Bytes<m> is allowed, cannot cause a dynamic error, but does require a run-time conversion.
  • Any cast from tuple type [T1, ⋯, Tm] to Bytes<m>, where each Ti is Uint<0..k> for some k ≤ 256 is allowed, cannot cause a dynamic error, but does require a run-time conversion.

Ledger state operations

A Compact program interacts with its public state by invoking operations upon the ledger. There are two different forms of ledger operations.

Kernel operations are operations that do not depend on specific ledger state. They can be invoked by expressions of the form id.op(e, ⋯, e), where id is the name of a ledger field declared to have the special ledger-state type Kernel, op is the name of a builtin kernel operation, and e, ⋯, e is a comma-separated sequence of zero or more argument expressions. The CompactStandardLibrary predefines the ledger field name kernel to have ledger-state type Kernel, so for example, the built-in self operation can be called from a circuit as follows:

import CompactStandardLibrary;
circuit f(): ContractAddress {
return kernel.self();
}

The static type of a kernel operation expression is the return type of the corresponding kernel operation according to the the Compact ledger data type documentation.

Kernel operations are evaluated by evaluating the argument subexpressions in order from left to right and then invoking the corresponding kernel operation with the argument values.

Ledger-state operations are operations on the contract's declared public ledger-state. In its simplest form, a ledger-state operation is an expression F.op(e, ⋯, e), where F is a ledger field name declared via a ledger field declaration, op is the name of a ledger-state operator, and each e is an argument expression. F can also be another ledger-state operation, allowing ledger-state operations to be chained when the result of a ledger-state operation might itself have a ledger-state type. Compact also supports shorthands for certain ledger-state operations; these are described below in Ledger-state-operation shorthands.

The static type of a ledger-state operation is the return type of the corresponding ledger-state operation according to the the Compact ledger data type documentation. This might be a Compact type or it might be a ledger-state type. A ledger-state operation whose return type is a ledger-state type must be chained, i.e., immediately subject to another ledger-state operation. It is a static error if this is not the case.

A ledger-state operation F.op(e, ⋯, e) is evaluated by evaluating F if F is itself a ledger-state operation, then evaluating the argument subexpressions from left to right, and then invoking the operation op with the argument values on the portion of the public ledger state identified by F.

Ledger-state-operation shorthands

Compact supports several shorthands for common ledger-state operations.

The simplest of these is a shorthand for the ledger-state operation F.read() where F is a ledger-state field name or itself another ledger-state operation. If the type of F is a ledger-state type that supports the read operation, F.read() can be abbreviated, simply, to F.

The other shorthands all take the form of assignment statements and abbreviate the ledger operations F.write(e), F.increment(e), and F.decrement(e), where e is an expression.

If the type of F is a ledger-state type that supports the write operation, F.write(e) can be abbreviated F = e. Similarly, if the type of F is a ledger-state type that supports the increment operation, F.increment(e) can be abbreviated F += e, and if the type of F is a ledger-state type that supports the decrement operation, F.decrement(e) can be abbreviated F -= e.

The typing and evaluation rules for the abbreviations are identical to expressions they abbreviate.

Asserts

An assertion has the form assert(e, msg) where e is an expression and msg is a string message. e must have a Boolean type, otherwise it is a static error.

The type of every assert form is [].

Evaluation proceeds by evaluating e. If the value of e is true (the assertion succeeds), the assert form produces the value []. If, however, the value of e is false (the assertion fails), it is a dynamic error, i.e., the assert form halts computation of the enclosing top-level circuit or constructor with the message msg.

Each assertion is checked at run time and constrained in-circuit.

Explicit disclosure

Private data can enter a Compact contract via the arguments to exported circuits and via the return values of witnesses. Compact supports selective disclosure, in which the contract decides which private data, if any, should be disclosed in the transcripts and public state of the contract, and how that private data should be encoded.

Compact itself has no information about the nature of any private data and whether and how it is necessary to disclose it, the decision to do so rests with the contract developer. Compact does, however, require that such disclosure be explicitly declared to prevent unintentional disclosure, so that privacy is the default and disclosure is the exception.

The mechanism for explicitly declaring disclosure is simple: disclosure of private data (exported circuit arguments, witness return values, and anything derived from private data) must be acknowledged by wrapping an expression whose value contains private data in a disclose() wrapper before storing it in the public state, returning it from an exported circuit, or using it in conditional expressions that might affect what is stored in the public state or returned from an exported circuit.

For example, the following contract attempts to disclose private data without explicitly declaring the disclosure when it stores the value of the exported parameter x into the ledger field F:

ledger F: Uint<16>;
export circuit setf(x: Uint<16>): [] {
F = x;
}

Compiling this program produces the following error message in Version 1.0 of the Compact compiler:

Exception: testfile.compact line 3 char 5:
potential witness-value disclosure must be declared but is not:
witness value potentially disclosed:
the value of parameter x of exported circuit setf at line 2 char 21
nature of the disclosure:
ledger operation might disclose the witness value
via this path through the program:
the right-hand side of = at line 3 char 5

The remedy in this case is simple: wrap the reference to x in a disclose() wrapper.

ledger F: Uint<16>;
export circuit setf(x: Uint<16>): [] {
F = disclose(x);
}

The disclose() wrapper does not cause disclosure in itself; in fact, it has no effect other than telling the compiler that it is okay to disclose the value of the wrapped expression. Equivalently, it tells the compiler to pretend that the value of the wrapped expression does not contain witness data whether it actually does or not.

The compiler works hard to ensure there are no false alarms and also to provide precise and accurate information about the nature of the disclosure and how it occurs.

Often, the same value is disclosed along different paths, and the nature of the disclosure might be more obscure, particularly if it involves indirect disclosure via conditional expressions. Detailed advice about how to understand and remedy disclosures is beyond the scope of this document, but a few things are worth noting:

  • It is often sufficient to correct first disclosure issue reported at a given disclosure if multiple disclosures are listed.
  • Whenever feasible, the disclose() wrapper should be placed as close to the actual disclosure site as possible.
  • When disclosing a hashed or otherwise obfuscated variant of witness data, place the disclosure somewhere on the path between input and disclosure after the obfuscation to help prevent disclosure of the unobfuscated input.

More details are given in Explicit Disclosure in Compact: The Midnight "Witness Protection Program".

Run-time data representation

The TypeScript representation of a Compact type is defined in Representations in TypeScript.

Compact represents values exactly as TypeScript represents values, i.e., as ordinary JavaScript values. So a Compact Boolean is represented at run time as a JavaScript boolean, a Compact tuple is represented as a JavaScript array, and enum values are represented by numbers.

To maintain type safety, Compact verifies at run time that values passed by an outside caller to an exported circuit or returned from an outside witness have the expected types. This is necessary even when the caller or witness is written in properly typed TypeScript because some Compact types have size and range limits that are not expressible via the TypeScript type system:

  • Field values are limited by a maximum field value.
  • Uint values are limited by the declared bounds.
  • Bytes, Vector, and tuple values are limited by their lengths.
  • Enum values are limited by the maximum index for the enum elements.

It is also necessary because compile-time type checks are easily defeated in TypeScript and are nonexistent when a caller or witness is coded in JavaScript.

Certain values to be stored in public state require size, length, and other properties to be maintained explicitly, because these properties cannot be determined solely from the JavaScript representation of the value. For this purpose, the @midnight/compact-runtime package provides explicit run-time types satisfying the CompactType<T> interface, where T is the corresponding TypeScript type. This representation is not user-facing most of the time, except when replicating the behavior of the operations implemented in @midnight/compact-runtime.

The following types or constructors can be used to create a CompactType instance for a primitive type:

  • Boolean - CompactTypeBoolean
  • Field - CompactTypeField
  • Uint<0..n> - new CompactTypeUnsignedInteger(n, length), where length is the number of bytes required to store n
  • Uint<n> - same as Uint<0..(2 ^ n)>
  • Bytes<n> - new CompactTypeBytes(n)
  • Vector<n, T> - new CompactTypeVector(n, rt_T), where rt_T is the run-time type of T
  • Opaque<"String"> - CompactTypeString
  • Opaque<"Uint8Array"> - CompactTypeUint8Array
  • JubjubPoint - CompactTypeJubjubPoint.

For program-defined types, structures are not currently easily constructed at run time and require implementing CompactType<T> manually or using compiler internals. Enumerations are exposed through new CompactTypeEnum(maxValue, length), where maxValue is the maximum permissible integer assignable, and length its representation length in bytes (almost always 1).

TypeScript target

When compiled, a contract generates several artifacts. Key to these are the exported circuits from the contract's top level. These are divided into two categories: pure circuits and impure circuits.

In the contract directory, the semantics of the contract is encoded in TypeScript, in the form of a index.js JavaScript implementation file and a index.d.ts type declaration file. For most uses, it is recommended to rely on the information and interface provided in index.d.ts.

For each of the impure circuits that touches the public ledger, a zero-knowledge prover/verifier key pair is also generated, as well as instructions for proof generation. These can be found in the output directory's keys and zkir subdirectories respectively.

Structure of the exported TypeScript

The exported TypeScript exposes a number of declarations that can be used to interact with the contract from any TypeScript application. Some of these also require use of the @midnight/compact-runtime library, which all contracts depend upon and which implements key built-in behaviors.

A contract exports the following in the TypeScript module:

  • The TypeScript type corresponding to each program-defined type exported from the contract's top level
  • A Witnesses<PS> type, which describes the format external witnesses must satisfy to instantiate the contract
  • A ImpureCircuits<PS> type, which describes the set of impure circuits exported from the contract's top level
  • A ProvableCircuits<PS> type, which describes the set of circuits exported from the contract's top level that have verifier keys (i.e., exported circuits that touch the public ledger)
  • A PureCircuits type, which describes the set of pure circuits exported from the contract's top level
  • A Circuits<PS> type, which describes the set of all exported circuits
  • A Contract<PS = any, W extends Witnesses<PS> = Witnesses<PS>> class, which:
    • can be constructed by passing in an instance of W
    • exposes members circuits: Circuits<PS>, impureCircuits: ImpureCircuits<PS>, and provableCircuits: ProvableCircuits<PS>
    • provides initial contract states via initialState(context: __compactRuntime.ConstructorContext<PS>, x: T, y: T, ...): runtime.ConstructorResult<PS> where x: T, y: T, ... are the contract constructor parameters and their type declarations if they exist
  • A constant pureCircuits: PureCircuits object, providing all pure circuits as pure functions
  • A Ledger type, providing views into a current ledger state, by permitting direct calls of all read functions of ledger objects, as well of some TypeScript specific ones that cannot be called from Compact, such as iterators
  • A ledger(state: runtime.StateValue | runtime.ChargedState): Ledger constructor of the Ledger type, giving access to the values of exported ledger fields.

The argument PS for a number of these should be interpreted as the type of the local/private state. For the most part, circuit and witness functions are translated simply by translating their Compact types into corresponding TypeScript types for parameters and return values. For PureCircuits, this is all that happens, for the other _Circuits instances, they receive an additional first parameter of type runtime.CircuitContext<PS>, and their result type R is wrapped in runtime.CircuitResults<PS, R>. For Witnesses, they receive an additional first parameter of type runtime.WitnessContext<Ledger, PS>, and their result type R is wrapped in [PS, R]. See the runtime API docs for the details of these types. This wrapping makes the entirety of the contract code functional, ensuring calls have no hidden side effects.

Representations in TypeScript

Compact's primitive types are represented in TypeScript as follows:

  • Boolean - boolean
  • Field - bigint with run-time bounds checks
  • Uint<n> / Uint<0..n> - bigint with run-time bounds checks
  • [T, ...] - the TypeScript tuple type [S, ...] or else the TypeScript array type S[] with run-time length checks, where S is the TypeScript representation of the corresponding type T
  • Bytes<n> - Uint8Array with run-time length checks
  • Opaque<"string"> - string
  • Opaque<"Uint8Array"> - Uint8Array

Program-defined types are represented in TypeScript as follows:

  • enum instances - a number with run-time membership checks
  • struct instances with fields a: A, b: B, ... - an object { a: A, b: B, ... } where A, B, ... are the TypeScript representations of the Compact types.

Note that other Opaque types are currently not supported.

Implementation-specific limits

Compact compiler version 1.0 limits certain datatype values and sizes:

  • The maximum value of a Field value is 52435875175126190479447740508185965837690552500527637822603658699938581184512. This value is dictated by the scalar field size of the ZK proving system.
  • The maximum value of a Uint value is 452312848583266388373324160190187140051835877600158453279131187530910662655. This value is 256k-1 where k is the number of bytes (31) that fits in a Field.
  • The maximum length of a vector or byte vector is 16777216. This value is chosen as a reasonable upper bound that helps prevent the compiler from looping indefinitely on programs that operate on excessively large vector and byte-vector types.

These limits may change in future versions of the compiler.

External resources

Compact formal grammar

See here.

Compact keywords and reserved words

See here.

Compact ledger-state types and operations

See here.

Runtime API

See here

Compiler usage

See here.

Explicit disclosure in Compact

See here.

Writing a contract in Compact

See here.