Compact reference
Overview
Compact is a strongly typed, statically typed, bounded smart contract language, designed to be used in combination with TypeScript for writing smart contracts for the three-part structure of Midnight, where contracts have the following components:
- a replicated component on a public ledger,
- a zero-knowledge circuit component that confidentially proofs the replicated component resulted from a valid execution according to the logic of the contract, and
- a local, off-chain component that can perform arbitrary code.
Each Compact program (also referred to as a contract) can contain several kinds of program elements:
- module and import forms for management of namespaces and separate files,
- declarations of program-defined types,
- declarations of the data that the contract stores in the public ledger,
- declarations of witnesses, which are callback functions supplied by the TypeScript runner,
- definitions of circuits, which are functions serving as the operational core of a smart contract, and
- the definition of at most one constructor, which is a function called when a contract is created and can be used to initialize public and private state.
Compact is similar to TypeScript: it has a syntax similar to that of JavaScript, and it layers a type system over the JavaScript syntax. Compact deviates intentionally from TypeScript, however, in several important ways:
- Unlike TypeScript, Compact is strongly typed. Programs cannot bypass the static type system via missing type declarations or unsafe casts. Furthermore, the JavaScript code produced by the Compact compiler includes run-time checks enforcing the static types of values that come from outside Compact as well as preventing external application of a Compact circuit with more or fewer than the declared number of arguments.
- Compact provides namespace management via static rather than dynamic modules, and these modules can be parameterized via compile-time generic parameters, which include size as well as type parameters.
- Because every Compact program must compile into a set of finite proving circuits, the language is bounded: all Compact types have sizes that are fixed at compile time, loops are bounded either by constant bounds or by the size of an object of constant size, and recursion is disallowed.
- Compact numeric values are limited to unsigned integers either with a program-declared range or with a range determined by the field size of the target proving system.
- Compact distinguishes certain values as potentially containing private data that
should typically be protected, and it requires explicit declaration of the
disclosure of any potential private data via
disclose()wrappers. The basics of this are discussed in the description ofdisclose, and a more thorough description and discussion appears in the separate document Explicit disclosure in Compact.
Like TypeScript, Compact compiles into JavaScript, but it also produces a TypeScript definition file so it effectively also compiles into TypeScript. It produces separate TypeScript definition files and JavaScript implementation files rather than simply producing TypeScript for three reasons:
- to allow compiled Compact programs to be used without requiring an additional TypeScript compilation step,
- to permit the generated code to check function argument counts without disabling compile-time argument-type checks when called from TypeScript, and
- so that it can generate a sourcemap file that properly maps elements of the generated JavaScript code (e.g., variable bindings and references) to the corresponding elements of the source Compact code.
For each circuit that touches the public ledger and hence requires a proof for on-chain execution, the Compact compiler also produces proving circuits in a zero-knowledge intermediate language (zkir), and it uses a zkir compiler to produce proving keys for each such circuit.
Finally, the compact compiler also produces a JSON-formatted contract info file that contains information about the program and its compiled representation, including version numbers and the types and characteristics of the contract's exported circuits.
This document explains each syntactic category individually. It starts by introducing the building blocks that are used in various contexts: identifiers, constants, types, generic parameters, and patterns. Then it describes the structure of Compact programs, each kind of program element, and the statements and expressions that can appear within circuit and constructor bodies. Finally, it discusses the TypeScript target.
Writing a contract provides a small example of what a Compact program looks like. It also introduces the basic building blocks of a Compact contract. The full grammar of Compact is provided separately.
Notation
The syntax of Compact programs is given by EBNF grammar snippets that use the following notational conventions:
- Keywords and punctuation are in
monospacedfont. - Terminal and nonterminal names are in emphasized font.
- Alternation is indicated by a vertical bar (
|). - Optional items are indicated by the superscript opt.
- Repetition is specified by ellipses.
The notation X ⋯ X, where X is a grammar symbol, represents zero
or more occurrences of X.
The notation X
,⋯,X, where X is a grammar symbol and,is a literal comma, represents zero or more occurrences of X separated by commas. In either case, when the ellipsis is marked with the superscript 1, i.e., ⋯¹, the notation represents a sequence containing at least one X. When such a sequence is followed by,opt, an optional trailing comma is allowed, but only if there is at least one X. For example, id ⋯ id represents zero or more ids, and expr,⋯¹,expr,opt represents one or more comma-separated exprs possibly followed by an extra comma. The rules involving commas apply equally to semicolons, i.e., apply when,is replaced by;.
Every program is formed of characters that are organized into atomic
sequences of characters known as tokens.
Each keyword and punctuation symbol appearing in the grammar snippets
represents itself exactly, i.e., represents the token consisting of
the same sequence of characters.
For example, when the keyword circuit appears in a grammar snippet,
it matches only the token circuit, and when the punctuation symbol
: appears, it matches only the token :.
Each terminal name appearing in the grammar snippets represents a set of possible tokens. For example, the terminal name id represents the set of all identifiers: when id appears in a grammar snippet, it matches any identifier. The sets of tokens represented by the terminal names appearing in the grammar snippets are described in Terminal names below.
Each nonterminal name appearing in the grammar snippets represents a
sequence of tokens that comprise some structured piece of a program.
For example, the terminal name expr matches any sequence of tokens
that can be interpreted as an expression, such as 3 + x or
a ? b : c.
The set of structures represented by each nonterminal name is given
in the various sections of this reference manual along with typing
and evaluation rules where appropriate.
For example, the structure of a circuit-definition is described
in Circuits definitions.
Terminal names
The following terminal names appear in the grammar snippets.
-
id, module-name, function-name, struct-name, enum-name, contract-name, tvar-name, and type-name all represent identifier tokens.
-
nat represents natural-number literals.
-
str and file represent string literals.
-
version represents version literals (in pragmas).
While identifiers and string literals are each represented by more than one name, each represents the entire set of possible identifier or string-literal tokens. The grammar snippets use different terminal names only to suggest their use, e.g, module-name for module names versus tvar-name for type variable names.
Static and dynamic errors
The compiler detects various kinds of static errors, e.g., malformed syntax, references to undefined identifiers, and type mismatches. When it detects one or more static errors, it prints descriptive error messages for the errors and terminates without generating any output.
The code the compiler generates and the run-time libraries it uses detect various kinds of dynamic errors, e.g., attempts from code outside of Compact to call Compact circuits with wrong numbers or types of arguments or attempts to cast an unsigned value to an unsigned type that is too small for the value. These errors are reported when the generated code is run and cause evaluation of the current circuit to halt.
Identifiers, bindings, and scope
Identifiers are used in Compact, as in most other programming languages, to
name things.
Syntactically, an identifier is a token (atomic sequence of characters), beginning
with with an alphabetic character, a dollar sign ($), or an underscore (_)
followed by one or more alphabetic characters, digits (0 - 9), dollar signs,
or underscores.
Some identifiers are reserved words.
Of these, some are used as keywords in the syntax of the Compact language, e.g.,
module, import, circuit, and for.
Others, specifically keywords reserved by JavaScript and TypeScript, are considered
reserved for future use in Compact, e.g., self and class.
Still others, specifically every identifier that begins with __compact,
are reserved for use by the compiler.
A comprehensive list of keywords and reserved words is given
in keywords and reserved words.
The remaining identifiers can be used to name specific instances of various kinds of entities, including modules, types, generic parameters, ledger fields, function (circuit or witness) names, function parameters, and local variables. An identifier associated with, i.e., bound to, one of these entities can be referenced anywhere within the scope of the binding. Compact is lexically scoped, so the scope of each binding is limited to a specific region of the program text. The binding might be shadowed (hidden from view) in some region of program text within its scope that contains a binding for the same identifier.
It is a static error for an identifier to be bound more than once in a same scope, except that function overloading allows more than one function with the same name to be visible in the same scope with different signatures, i.e., different numbers or kinds of generic parameters and/or different numbers or types of run-time parameters.
The scope of each binding depends upon where it appears, as described below. (The caveat "except where shadowed" is not explicitly stated but applies in each case.)
- Identifiers bound at the outermost level of a contract (referred to as the top level) are visible throughout the contract, but not within any modules that are imported from separate files.
- Identifiers bound at the outermost level of a module are visible throughout the module. They are not visible outside of the module unless exported: any exported binding is also visible if and where it is imported from the module.
- The generic parameters of a module, structure declaration, or function declaration are visible throughout the declaration.
- The run-time parameters of a circuit or constructor are visible within its body.
- Identifiers defined by a
constbinding inside a block are visible throughout the block. - Identifiers defined by a
constbinding inside afor-loop header are visible throughout theforloop.
Every reference to an identifier must appear within the scope of a binding for the identifier, in which case we say that the identifier is bound to the entity associated with the identifier by that binding. Otherwise, the reference is a static error.
For example:
circuit c(): Field {
const answer = 42;
{
const answer = 12;
assert(answer != 42, "shadowing did not work!");
}
return answer; // returns 42 (the outer 'answer')
}
The identifier c is bound to the circuit named c, and this binding is visible
throughout the contract, though no references to c appear in the example.
The first (outer) binding of the identifier answer to the value 42 is visible
throughout the body of c except where shadowed by the second (inner) binding of
answer within in the inner block, so the reference to answer in return answer
evaluates to 42.
The second (inner) binding of answer to 42 is visible throughout the inner
block, so the reference to answer in answer != 42 evaluates to 12.
In addition to having a scope, every binding also has a lifetime. For circuit and witness bindings, the lifetime is effectively permanent, i.e., the binding is always available for use whenever the program is run.
The lifetimes of ledger-field bindings begin when they are first initialized and are effectively permanent from that point on; although the value of a field can change over time, the association of the ledger-field name with the ledger field's location in the (replicated) public state of a contract never changes.
On the other hand, bindings for module names, type names, and generic parameters are live only when a program is compiled, i.e., they help determine the structure of the program and the shape of the data used by the program but are not needed once the program has been compiled. (TypeScript bindings for type names exported from the program's top level do live on, however, in the generated TypeScript definition file.)
Variable bindings, i.e., bindings of circuit parameters, constructor parameters,
and local variables bound by const statements and for loops, have dynamic
lifetimes.
The bindings of a circuit's or constructor's parameters start a new lifetime
when the circuit or constructor is called that ends when the circuit or constructor
exits.
A variable binding established by a const statement starts a new lifetime when
the const statement is evaluated that ends when the block containing the const
statement exits.
A const binding established by the const subform of a for-loop header starts
a new lifetime on each iteration of the loop that ends when that iteration ends.
Variable bindings can have multiple lifetimes, because a circuit might be called
multiple times, a block might be evaluated multiple times, and a for loop might
be evaluated multiple times and/or have multiple iterations.
Variables in Compact are immutable, however: they have the same value over the
entire lifetime of the variable's binding.
Thus, they are referred to as variables not because their values can vary over
any single lifetime but because they can have different values in different
lifetimes.
Generic parameters and arguments
Various entities, specifically module declarations, structure declarations, type-alias declarations, circuit definitions, and witness declarations, can have generic parameters, i.e., compile-time type and numeric (natural-number) parameters whose values are given at the use site rather than fixed at the point of declaration. This allows the same generic code to be used with different specific types, bounds, and sizes. Except where shadowed, generic parameters are visible throughout the entire entity. In particular, the generic parameters of a module are visible within the program elements that appear within the body of the module.
When present, generic parameters are enclosed in angle brackets following the
name of the generic entity (module, structure, type alias, circuit, or witness).
Each parameter is either a type name (e.g., T) or a hash-prefixed natural-number name
(e.g., #N).
Generic natural-number parameters are prefixed by # to distinguish them from
generic type parameters.
| gparams | ⟶ | < generic-param , ⋯ , generic-param ,opt > |
| generic-param | ⟶ | # tvar-name |
| | | tvar-name |
Generic entities must be specialized at the point of use to produce non-generic entities at compile time by supplying them with generic arguments. Any attempt to use a generic entity without specializing it is a static error. Generic arguments are also enclosed in angle brackets. Each generic argument must be a type, a natural number literal, or the type or numeric value of a generic parameter.
The syntax of types allows for type references, including references to generic parameters, so any generic argument can pass along the value of a generic type or natural-number parameter that is visible at the point of specialization.
The # used to distinguish generic natural-number parameters from generic type parameters
need not and must not be used at the point of specialization.
It is a static error, however, if a generic argument supplied for a generic
parameter is not numeric when a numeric value is expected or is not a type when
a type is expected.
The example below demonstrates the use of two levels of generic parameterization, one at the module level and one at the circuit level.
module M<T, #N> {
export circuit foo<A>(x: T, v: Vector<N, A>): Vector<N, [A, T]> {
return map((y) => [y, x], v);
}
}
import M<Boolean, 3>;
export circuit bar1(): Vector<3, [Uint<8>, Boolean]> {
return foo<Uint<8>>(true, [101, 103, 107]);
}
export circuit bar2(): Vector<3, [Boolean, Boolean]> {
return foo<Boolean>(false, [true, false, true]);
}
The body of circuit foo is generic with respect to the module's type parameters
T and N as well as to the circuit's own parameter A.
The module is specialized at the point of import, while the circuit is specialized
at the point of call (in both bar1 and bar2).
Compact types
Compact is statically typed: every expression in a Compact program must have
a static type.
For named circuits and witnesses, the parameters and return types must be explicitly
declared.
For anonymous circuit expressions, the parameters and return types do not need to
be declared but can be.
The types of const bindings can also be declared or not.
The language is strongly typed: the compiler rejects programs that do not type check. For example, it rejects programs in which a circuit or witness with a parameter type annotation is called with an incorrectly typed argument for that parameter, and it rejects programs where a circuit with a return-type annotation returns an incorrectly typed value. If an optional type annotation is omitted, the compiler attempts to infer a type and it rejects the program if no such type can be inferred.
Types consist of built-in primitive types, ledger-state types, program-defined types,
and references to generic type parameters in scope.
When the term "type" occurs in this document without any other qualifier, it means
either a primitive type, ledger-state type, a program-defined type, or a generic type
parameter in scope.
The use of ledger-state types is, at present, limited to typing the result of
default<T> to obtain the default value of type T, and only constant bindings
can have a ledger-state type.
A generic type is not a valid type and so cannot, for example, be used as the type of a parameter or return value. Any attempt to do so is a static error. As with any other generic entity, it must be specialized at the point of use.
Primitive types
The following are the primitive types of Compact:
| type | ⟶ | tref |
| | | Boolean | |
| | | Field | |
| | | Uint < tsize > | |
| | | Uint < tsize .. tsize > | |
| | | Bytes < tsize > | |
| | | Opaque < str > | |
| | | Vector < tsize , type > | |
| | | [ type , ⋯ , type ,opt ] | |
| tref | ⟶ | id gargsopt |
| tsize | ⟶ | nat |
| | | id |
-
Booleanis the type of Boolean values. There are only two values ofBooleantype. They are the values of the expressionstrueandfalse. -
Uint<m..n>, where m is the literal0or generic natural-number parameter bound to0, and where n is a non-zero natural number literal or a generic natural-number parameter bound to a non-zero natural number, is the type of bounded unsigned integer values between0(inclusive) andn(exclusive). (While the lower bound is currently required to be0, this restriction might be lifted at some point.)Uinttypes with different upper bounds are different types, although the one with the lower bound is a subtype of the other. The compiler and run-time system might impose a limit on the range of supported unsigned integer values. If so, it is a static error whenever aUinttype includes values that exceed this limit. The current limit, if any, is given in Implementation-specific limits. -
Uint<n>, where n is a non-zero natural number literal or generic natural-number parameter bound to a non-zero natural number, is the type of sized unsigned integer values with binary representations using up to n bits. This is the same type asUint<0..m>where m is equal to 2n. Sized integer types can be seen as a convenience for programmers.Uint<32>, for example, can be more obvious and less error-prone than the equivalentUint<0..4294967296>. Any Compact program that uses sized integer types can be rewritten to one that uses only bounded integer types, but the converse is not true. -
Fieldrepresents the set of unsigned integers with values up to the order of the native prime field of the ZK proving system. The current maximum field value is given in Implementation-specific limits. -
[T1, ⋯, Tn], where T1, ⋯, Tn are zero or more comma-separated types, is the type of tuple values with element types T1, ⋯, Tn. Tuples are heterogeneous: any element type can differ from any of the others. The length of a tuple type is the number n of element types. Two tuple types with different lengths are different types. Two tuple types where any element type of one differs from the corresponding element type of the other are also different types, though one of the tuple types might be a subtype of the other. -
Vector<n, T>, where n is a natural number literal or generic natural-number parameter and T is a type, is a shorthand notation for the tuple type[T, ⋯, T]with n occurrences of the type T. Note that a vector type and the corresponding tuple type are two different ways of writing exactly the same type. Unless otherwise specified, type rules for vector types are derived from the rules for the corresponding tuple type. -
Bytes<n>, where n is a natural number literal or a generic natural-number parameter, is the type of byte vectors of length n.Bytestypes with different lengths are different types.Bytestypes are used in the Compact standard library for hashing. String literals in Compact also have aBytestype, where n is the number of bytes in the UTF-8 encoding of the string. -
Opaque<s>, where s is a string literal, is the type of opaque values with tag s.Opaquetypes with different tags are different types. Opaque values can be manipulated in witnesses but they are opaque to circuits. They are represented in circuits as their hash. The only tags currently allowed are"string"and"Uint8Array".
Program-defined types
Programs can define three kinds of new types: structures, enumerations, and contracts. They can also define structural and nominal aliases for existing types.
Structure types
Structure types are defined via struct declarations with the following form:
| struct-declaration | ⟶ | exportopt struct struct-name gparamsopt { typed-id ; ⋯ ; typed-id ;opt } ;opt |
| | | exportopt struct struct-name gparamsopt { typed-id , ⋯ , typed-id ,opt } ;opt | |
| typed-id | ⟶ | id : type |
A structure declaration has a sequence of named fields which must be separated either by commas or by semicolons. Comma and semicolon separators cannot be mixed within a single structure declaration. A trailing separator is allowed, but not required.
Each structure field must have a type annotation. Here are a couple of examples:
struct Thing {
triple: Vector<3, Field>,
flag: Boolean,
}
struct NumberAnd<T> {
num: Uint<32>;
item: T
}
The first declaration introduces a structure type named Thing with two fields:
triple (a vector with three Field elements) and flag (a Boolean).
The second introduces a generic structure type named NumberAnd with generic
parameter T and two fields: num (a 32-bit unsigned integer) and item
(a value of type T).
Generic structure types are not fixed types and must eventually be specialized
by supplying generic arguments, e.g., NumberAnd<Uint<8>>.
When any generic structure type is specialized, it must be fully specialized:
the number of supplied generic arguments must match the number of declared
generic parameters.
The effect of specializing a generic structure type is to produce the same type
as one in which the generic parameters are replaced by the generic argument
values.
For example, NumberAnd<Uint<8>> is equivalent to NumberAnd if NumberAnd
had been defined by:
struct NumberAnd {
num: Uint<32>;
item: Uint<8>
}
It is possible and common for a generic structure type to be specialized via different generic arguments to produce different specialized structure types in different parts of a program.
Structure typing is always nominal: two types are equivalent only if they have
the same names and same fields.
They are distinct if they have different names even if they have the same fields.
More precisely: each structure type is the same as any other structure type
that has the same name, same element names (in the same order), and same element
types (in the same order).
It is distinct from every other type.
This means, for example, that the following program is well-typed:
module M {
struct NumberAnd {
num: Uint<32>;
item: Uint<8>
}
export circuit bar(x: NumberAnd): NumberAnd {
return x;
}
}
import M;
struct NumberAnd<T> {
num: Uint<32>;
item: T
}
export circuit foo(x: NumberAnd<Uint<8>>): NumberAnd<Uint<8>> {
return bar(x);
}
Structure types must not be recursive, i.e., they cannot contain elements of the same type as the structure, either directly or indirectly. An attempt to define a recursive structure type is a static error. For example, it is a static error to use the following pair of declarations:
struct Even {
predecessor: Odd
}
struct Odd {
predecessor: Even
}
export circuit doesntWork(s: Even): Odd {
return s.predecessor;
}
Values of structure types are created with structure-creation expressions and accessed via structure-field-access expressions.
Enumeration types
Enumeration types are defined via enum declarations with the following form:
An enumeration declaration has a non-empty sequence of named elements separated by commas. A trailing comma is allowed but not required.
An enumeration declaration introduces a named enumeration type, such as Arrow
in the example below:
enum Arrow { up, down, left, right };
Within the scope of this declaration, a value of type Arrow can have one of
four values, selected via Arrow.up, Arrow.down, Arrow.left, and Arrow.right.
Two enumeration types are the same if they have the same name and the same element names (in the same order) and distinct otherwise.
Contract types
Compact 1.0 does not fully implement declarations of contracts and the
cross-contract calls they support, but the keyword contract used to declare
contracts is reserved for this use.
Type aliases
Type aliases can be created via type declarations of the form:
Within the scope of a type-alias declaration of type-name for type, type-name
is itself a type.
Type aliases are either structural or nominal, depending on whether the optional
new modifier is present:
- A type alias type-name for type declared without the optional
newmodifier is a structural type alias, i.e., type-name is the same type and is fully interchangeable with type. - A type alias type-name for type declared with the optional
newmodifier is a nominal type alias, i.e., type-name is a distinct type compatible with type but neither a subtype of nor a supertype of type (or any other type).
Any nominal type alias type-name for some type type is compatible with type in the following senses:
- values of type type-name have the same representation as values of type type
- values of type type-name can be used by primitive operations that require a value of type type
- values of type type-name can be cast explicitly to type, and
- values of type type can be cast explicitly to type type-name.
For example, within the scope of
new type V3U16 = Vector<3, Uint<16>>
a value of type V3U16 can be referenced or sliced just like a vector
of type Vector<3, Uint<16>>, but it cannot, for example, be passed to a function
that expects a value of type Vector<3, Uint<16>> without an explicit cast.
When one operand of an arithmetic operation (e.g., +) receives a value of some
nominal type alias type-name, the other operand must also be of type type-name,
and the result of performing the operation is cast to type type-name.
This might cause a dynamic error if the result cannot be represented by type
type-name.
Values of any nominal type alias type-name cannot be compared directly using,
e.g., <, or ==, with values of any other type, including with values of
type type.
Such comparisons require one of the operands to be cast to the type of the other.
Both structural and nominal type aliases can take generic parameters, e.g.:
type V3<T> = Vector<3, T>;
and
new type VField<#N> = Vector<N, Field>;
When a generic nominal type is specialized, the specialized type is a nominal type.
Subtyping and least upper bounds
Some Compact types are related to other types via subtyping.
Informally, if a type T is a subtype of a type S (equivalently, S is a
supertype of type T), then every value of type T is also a value of type S,
i.e., any value of type T can be used where a value of type S is expected without
the need for an explicit cast.
For example, a circuit or witness can be called with argument expressions whose
types are subtypes of the corresponding parameter type annotations, and
a const binding statement with a type annotation can be given a value with an
expression whose type is a subtype of the type annotation.
Subtyping is exclusively defined by the following rules:
- Any type T is a subtype of itself (subtyping is reflexive)
Uint<0..n>is a subtype ofUint<0..m>` if n is less than (or equal to) mUint<0..n>is a subtype ofFieldif n-1 is less than or equal to the maximum field value. (WhetherFieldis a subtype ofUint<0..n>if n-1 is greater than the maximum field value is currently unspecified.)- The tuple type
[T1, ⋯, Tn]is a subtype of the tuple type[S1, ⋯, Sn]if they have the same length and each type Ti is a subtype of the corresponding type Si.
The least upper bound (with respect to subtyping) of a non-empty set of types {T1, ⋯, Tn} is a type S such that:
- S is an upper bound: Ti is a subtype of S for all i in the range 1..n, and
- S is the least upper bound: for all upper bounds R of the set of types {T1, ⋯, Tn}, S is a subtype of R.
Note that least upper bounds do not exist for all sets of types, because
some types (such as Boolean and Field) are not related.
Tuple and vector types:
Every vector type is equivalent to some tuple type.
Specifically, as noted in the earlier section on Primitive types,
the vector type Vector<n, T> is equivalent to the tuple type
[T, ⋯, T] with n occurrences of T.
Thus the above subtyping rule for tuple types applies to vector types as well:
every vector type is a subtype of the equivalent tuple type and possibly of some
other tuple and vector types.
In general, a vector type Vector<n, T> is a subtype of a tuple type
[S1, ⋯, Sn] if T is a subtype of each of the
types S1, ⋯, Sn.
This means, for instance, that a vector can often be passed to a circuit where a
tuple is expected.
On the other hand, tuple types do not always have equivalent vector types.
For example, neither [Boolean, Field] nor [Uint<8>, Uint<16>] is
equivalent to any vector type.
We say, however, that a tuple type [T1, ⋯, Tn] with
possibly distinct types T1, ⋯, Tn "has a vector type" if the
least upper bound S of the set of types {T1, ⋯, Tn}
exists.
In that case, the tuple type has the vector type Vector<n, S>.
Some operations over tuples (such as mapping and folding) require the tuple type
to have a vector type.
When a tuple type has a vector type, the tuple type is a subtype of the vector
type, but it might not be the same as the vector type.
For example, [Uint<16>, Uint<16>] has the vector type Vector<2, Uint<16>>,
and the two types are the same, whereas [Uint<8>, Uint<16>] also has the
vector type Vector<2, Uint<16>>, but the types are not the same.
Patterns and destructuring
The parameters of a circuit or constructor and the target of a const binding
are specified via patterns:
| pattern | ⟶ | id |
| | | [ patternopt , ⋯ , patternopt ,opt ] | |
| | | { pattern-struct-elt , ⋯ , pattern-struct-elt ,opt } | |
| pattern-struct-elt | ⟶ | id |
| | | id : pattern |
In its simplest form, a pattern is just an identifier.
For example, in the code below, the parameter of sumTuple is the identifier
x and the targets of the two const bindings are the identifiers a and b.
circuit sumTuple(x: [Field, Field]): Field {
const a = x[0], b = x[1];
return a + b;
}
When the parameter type is a tuple, vector, or struct, it is often convenient to use one of the destructuring forms of patterns to name individual pieces of the tuple or struct at the point of binding rather than extracting them at each point of use. For example, one could replace the above with:
circuit sumTuple(x: [Field, Field]): Field {
const [a, b] = x;
return a + b;
}
or more simply with:
circuit sumTuple([a, b]: [Field, Field]): Field {
return a + b;
}
Here is a similar example that destructures a struct instead of a tuple:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x, y}: S): Uint<64> {
return x + y;
}
Whereas the elements of a tuple pattern are necessarily given in order, the
elements of a struct pattern need not be consistent with the order of the fields
in the declaration.
For example, the definition of sumStruct below is equivalent to the one above,
even though the order of the pattern elements has been swapped:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({y, x}: S): Uint<64> {
return x + y;
}
By default, the names bound by the pattern are the same as the names of the structure elements. When this is not convenient, it is possible to choose different names for the structure elements:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x: a, y}: S): Uint<64> {
return a + y;
}
While x: a looks like an identifier with a type annotation, in this context
it simply indicates that a rather than x is bound to the value in the
x field.
Patterns can be arbitrarily nested, e.g.:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumTupleStruct([{x: a1, y: b1}, {x: a2, y: b2}]: [S, S]): Uint<64> {
return a1 + b1 + a2 + b2;
}
It is permissible and sometimes useful to not name certain parts of the tuple or struct.
struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1}, , {y: b3}]: [S, S, S]): Uint<64> {
return b1 + b3;
}
Here the input is a tuple with three elements, but the pattern skips the
second by putting two commas between the first and third.
Similarly, while each element of the tuple is a struct with both x and y
fields, the pattern ignores the x fields simply by failing to mention them.
It is a static error if a pattern implies a different shape from the declared or inferred type of value to be destructured. For example:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x, y}: [Uint<16>, Uint<32>]): Uint<64> {
return x + y;
}
fails because it tries to treat a tuple as a struct, while:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1}, , , {y: b3}]: [S, S, S]): Uint<64> {
return b1 + b3;
}
fails because it implies that the input tuple has four elements (including two skipped elements) when it actually has only three, and:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1}, , {z: b3}]: [S, S, S]): Uint<64> {
return b1 + b3;
}
fails because it tries to name a nonexistent z field in one of the structs.
Trailing commas in a pattern imply nothing about the structure of the input and are ignored:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1,}, , {y: b3,},]: [S, S, S]): Uint<64> {
return b1 + b3;
}
Programs
A compact program is a sequence of zero or more program elements.
Briefly:
- A pragma form allows the program to declare the version of the compiler and/or the language that it requires.
- A module definition defines a Compact module, which also contains a sequence of program elements in its own nested scope.
- An export form exports bindings from a module or from the program itself.
- An import form imports bindings from a Compact module.
- An include form allows program elements to be included from other files.
- A structure definition defines a structure type.
- An enumeration definition define an enumeration type.
- A contract declaration declares a contract type.
- A type-alias definition defines a type alias, possibly creating a distinct type.
- A witness declaration declare a witness, which is a callback function provided by the Dapp.
- A ledger declaration declares one field of the contract's public state.
- A constructor definition defines the contract's constructor, if any.
- A circuit definition defines a circuit.
The order of program elements in a program or module is unimportant, except that any module must be defined before any import of the module, and any program-defined types used as generic parameters by an import form must be defined before the import form.
Detailed descriptions of struct, enum, contract, and type-alias declarations appear in Compact types above. Detailed descriptions of the remaining program elements are described in the following section.
Pragmas
A pragma takes the following form and declares a constraint on either the compiler
version (id = compiler_version) or the language version (id = language_version)
| pragma-form | ⟶ | pragma id version-expr ; |
| version-expr | ⟶ | version-expr || version-expr0 |
| | | version-expr0 | |
| version-expr0 | ⟶ | version-expr0 && version-term |
| | | version-term | |
| version-term | ⟶ | version-atom |
| | | ! version-atom | |
| | | < version-atom | |
| | | <= version-atom | |
| | | >= version-atom | |
| | | > version-atom | |
| | | ( version-expr ) | |
| version-atom | ⟶ | nat |
| | | version |
version is a dot-separated pair or trio of natural numbers, so along
with nat this allows version to be either a single natural number,
a pair of natural numbers separated by a ., or a trio of natural numbers
separate by a ., e.g., 1, 1.2, or 1.2.7.
For example, if a program includes the following pragma form:
pragma compiler_version >= 1.0.0 && !1.0.5;
All versions of the compiler from 1.0.0 up except for 1.0.5 will accept the pragma and compile the program, and all other versions will print an appropriate message and discontinue compilation of the program.
Modules, exports, and imports
Modules in Compact are used for namespace management and also possibly to split programs into multiple files. A module is a named collection of program elements created via a module definition, which takes the following form:
| module-definition | ⟶ | exportopt module module-name gparamsopt { program-element ⋯ program-element } |
A module definition makes a binding from module-name to the module visible in the program or module containing the module definition. Any bindings established by program elements within the module are not made visible, at least not until the module is imported.
A module can have generic parameters, in which case it is a generic module and must be specialized with generic arguments at the point of import.
Exports
By default, identifier bindings created by the program elements within the body
of a module are visible only within the module, i.e., they are not exported from
the module.
Any identifier defined at or imported into the top level of a module can be exported
from the module in one of two ways: (1) by prefixing the definition with the
export keyword, or by listing the identifier in a separate export declaration:
For example, the following module exports G and S but not F.
module M {
export { G };
export struct S { x: Uint<16>, y: Boolean }
circuit F(s: S): Boolean {
return s.y;
}
circuit G(s: S): Uint<16> {
return F(s) ? s.x : 0;
}
}
Exporting a binding from a module has no effect unless the module is imported.
Imports
A module can be imported into another module or into the program top level, making some or all of its exported bindings visible there, potentially with a prefix.
| import-form | ⟶ | import import-selectionopt import-name gargsopt import-prefixopt ; |
| import-selection | ⟶ | { import-element , ⋯ , import-element ,opt } from |
| import-element | ⟶ | id |
| | | id as id | |
| import-name | ⟶ | id |
| | | file | |
| import-prefix | ⟶ | prefix id |
For example:
module Runner {
export circuit start(): [] {}
export circuit stop(): [] {}
}
module UseRunner1 {
import Runner;
// start and stop are now in scope
}
module UseRunner2 {
import { start } from Runner;
// start is now in scope, but not stop
}
module UseRunner3 {
import Runner prefix Runner$;
// Runner$start and Runner$stop are now in scope, but not stop or run
}
and
module Identity<T> {
export { id }
circuit id(x: T): T {
return x;
}
}
import Identity<Field>;
// id is now in scope, specialized to type Field
When import-name is an identifier and an import for import-name appears before any visible definition
of import-name, the module is assumed to reside in the filesystem, and it is imported
directly from there.
If import-name is an identifier module-name, a definition for module module-name
must be contained within
the file module-name.compact in the same directory as the importing file or in one of the
directories in the Compact path.
If import-name is a string "{prefix/}module-name" where {prefix/} is either empty or is
a pathname ending in a directory separator, a definition for a module named
module-name must be contained within a file module-name.compact that is either:
- (a) if
{prefix/}module-name.compactis an absolute pathname, then exactly at{prefix/}module-name.compact, otherwise - (b) at
{prefix/}module-name.compactrelative to the directory of the importing file or to one of the directories in the Compact path. Details on the search order and the mechanism for setting the Compact path are given in Compiler Usage.
In any of these cases, it is a static error if module-name.compact is not found, if it
does not contain a definition for a module named module-name, or if it contains anything
else other than comments and whitespace.
Several examples follow.
Example 1: The file M.compact below contains a single module definition:
module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}
Then, test1.compact import M from M.compact:
import M;
export { F };
whereas test2.compact uses its own definition of M:
module M {
export { G };
export struct S { x: Uint<16>, y: Boolean }
circuit G(x: S): Boolean {
return x.y;
}
}
import M;
export { G };
Importing by a pathname allows multiple modules with the same name to be imported into the same scope. For example:
The file M.compact below contains a single module definition, as before:
module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}
and A/M.compact contains a different module definition:
module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}
Then the program test.compact can define M and import all three of
M, "M", and "A/M":
module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}
import M prefix M1$;
import "M" prefix M2$;
import "A/M" prefix M3$;
export { M1$F, M2$F, M3$F };
The compact standard library
Compact's standard library can be imported by import CompactStandardLibrary.
The standard library defines a number of useful types and circuits along with
ledger-state types such as Counter, Map, and MerkleTree.
Top-level exports
Certain kinds of program elements can be exported from a contract's top level, namely circuits, program-defined types, and ledger fields. Exporting them makes them visible outside of the contract, i.e., to the TypeScript driver for the smart contract.
The circuits exported at the top level of a contract (i.e., not merely exported from a module) are the entry points of the contract. (A Compact program has no "main" entry point, but is more similar to a library containing multiple entry points that share a common store.) Although multiple circuits with the same name are allowed generally to support function overloading, it is a static error if more than one circuit with the same name is exported from the top level. It is also a static error for a generic circuit, i.e., one with generic parameters, to be exported from the top level.
Program-defined types exported from the top level of the main file can be used to describe the argument and return types of witnesses and exported circuits; these may accept generic arguments, but generic arguments marked as sizes rather than types are dropped in the exported type. For example:
export struct S<#n, T> { v: Vector<n, T>; curidx: Uint<0..n> }
is exported for use as a TypeScript type with the T parameter but not the n
parameter, i.e.,:
export type S<T> = { v: T[]; curidx: bigint }
Ledger field names exported from the top level are visible for direct inspection
by code outside of the contract via the generated TypeScript ledger() function.
It is a static error to export any other kind of binding from the top level.
Include files
Compact allows programs and modules to be split into multiple files and spliced
together via include forms, which have the following syntax, where file is a
string literal specifying a filesystem pathname for file to be included:
| include-form | ⟶ | include file ; |
file can be an absolute pathname, one that is relative to the directory of the including file, or one that is relative to one of the directories in the Compact path. Details on the search order and the mechanism for setting the Compact path are given in Compiler Usage.
It is a static error if the file is not present or cannot be read.
If present and readable, the file must contain a sequence of syntactically
valid program elements, and these elements are treated as if they had
been present in the including file in place of the include form.
Declaring witnesses for private state management
A user's private state should be maintained in some secure way by the TypeScript driver of a smart contract and never stored directly in the public state of the contract. A contract must sometimes prove something about some piece of private state, however, as well as cause an update to the private state.
The TypeScript driver of the smart contract can provide pieces of private state to the contract via the arguments of some exported circuit, and it can update the private state based on the return values of the exported circuit.
A circuit can also access or update private state as it operates via witnesses. Witnesses are callback functions provided by the TypeScript driver.
Witnesses must be declared to make them visible to the circuits of a contract. A witness declaration does not include a body, because the implementation is provided by the TypeScript driver.
| witness-declaration | ⟶ | exportopt witness id gparamsopt simple-parameter-list : type ; |
| simple-parameter-list | ⟶ | ( typed-id , ⋯ , typed-id ,opt ) |
Witness declarations can appear anywhere among the program elements of a module or the contract's top level.
For instance:
witness W(x: Uint<16>): Bytes<32>;
defines a witness W, to which the contract provides a 16-bit unsigned value
and from which the contract receives 32 bytes of some presumably private data.
Do not assume in your contract that the code of any witness function
is the code that you wrote in your own implementation. Any DApp may
provide any implementation that it wants for your witness functions.
Results from them should be treated as untrusted input.
Declaring and maintaining public state
A contract declares the shape of its public state through ledger declarations.
Each ledger declaration defines one piece of information that the contract might store in the public ledger. Multiple ledger declarations can appear in a program, or none. They can appear anywhere among the program elements of a module or the contract's top level.
A ledger declaration binds a ledger field name to one of a set of predefined ledger-state types. For instance:
import CompactStandardLibrary;
ledger val: Field;
export ledger cnt: Counter;
sealed ledger u8list: List<Uint<8>>;
export sealed ledger mapping: Map<Boolean, Field>;
All ledger fields are initialized to the default values of their ledger-state types. The contract constructor, if any, runs after the default values are established, allowing the constructor to override the defaults.
Ledger-state types
The following ledger-state types are supported.
- T, for any regular Compact type T
CounterSet<T>, for any Compact type TMap<K, T>, for any Compact types K and TMap<K, V>, for any Compact type K and ledger-state type V (see the following section)List<T>, for any Compact type TMerkleTree<n, T>, for any n, 1 < n ≤ 32, and any Compact type THistoricMerkleTree<n, T>, for any n, 1 < n ≤ 32, and any Compact type TKernel, a special type that yields access to ledger operations that do not depend on a specific ledger state
Each ledger type supports a set of operations, which can be invoked with
F.op(e, ⋯, e)
where F is a field name, op is the name of a ledger operation supported by the ledger-state type of F, and each e is an argument expression.
A ledger field that is declared with a Compact type T implicitly has the type
Cell<T>, which supports several operations, including read, write, and
reset_to_default.
For example:
ledger F: Uint<16>;
export circuit putF(x: Uint<16>): [] {
F.write(disclose(x));
}
export circuit getF(): Uint<16> {
return F.read();
}
The read operation of the Cell ledger-state type can be abbreviated to,
simply, a reference to the field name, and the write(e) operation can be abbreviated
to an assignment of the form F = e.
So the above can be written more simply as:
ledger F: Uint<16>;
export circuit putF(x: Uint<16>): [] {
F = disclose(x);
}
export circuit getF(): Uint<16> {
return F;
}
The read operation of the Counter type can be abbreviated in the same way,
and its increment and decrement operations can be abbreviated to assignments
of the form F += e and F -= e.
For example:
import CompactStandardLibrary;
ledger F: Counter;
export circuit incrF(): [] {
F += 1;
}
export circuit decrF(): [] {
F -= 1;
}
export circuit getF(): Uint<64> {
return F;
}
The selection of an appropriate ledger-state type can reduce a transaction's
dependency on the exact contents of the ledger state and thus reduce the chance
of a transaction being rejected when the proof is checked on chain.
For example, the preceding example could be written using a Cell instead:
ledger F: Uint<64>;
export circuit incrF(): [] {
F = F + 1 as Uint<64>;
}
export circuit decrF(): [] {
F = F - 1;
}
export circuit getF(): Uint<64> {
return F;
}
incrF and decrF read the value of F before operating on it and writing
it back to F.
This read commits the transaction to the current value of the field.
If F has the value, say, 27 when the transaction is created, F must still
have the value 27 when the proof is checked on chain.
The Counter versions, however, do not read the value but merely request
that the current value be incremented or decremented; thus, it does not have
any such constraint.
A comprehensive description of ledger-state types and operations can be found in the Compact ledger data type documentation.
Nested state types in the Map type
In most cases, ledger-state types cannot be nested within other ledger-state types.
However, while Map keys must have regular Compact types, Map values can have
either regular Compact types or ledger-state types (except Kernel).
Attempting to nest a ledger-state type anywhere but within a Map value is a
static error.
Attempting to nest the Kernel type within a Map value is also a static error.
Here is a small example showing two levels of nesting:
import CompactStandardLibrary;
ledger fld: Map<Boolean, Map<Field, Counter>>;
export circuit initNestedMap(b: Boolean): [] {
fld.insert(disclose(b), default<Map<Field, Counter>>);
}
export circuit initNestedCounter(b: Boolean, n: Field): [] {
fld.lookup(b).insert(disclose(n), default<Counter>);
}
export circuit incrementNestedCounter1(b: Boolean, n: Field, k: Uint<16>): [] {
fld.lookup(b).lookup(n).increment(disclose(k));
}
export circuit incrementNestedCounter2(b: Boolean, n: Field, k: Uint<16>): [] {
fld.lookup(b).lookup(n) += disclose(k);
}
export circuit readNestedCounter1(b: Boolean, n: Field): Uint<64> {
return fld.lookup(b).lookup(n).read();
}
export circuit readNestedCounter2(b: Boolean, n: Field): Uint<64> {
return fld.lookup(b).lookup(n);
}
In this example,
fldis bound to aMapfromBooleanvalues toMaps fromFieldvalues toCountersinitNestedMapcan be used to create the innerMapfor a particular outer-MapkeyinitNestedCountercan be used to create aCounterfor a given outer-Mapkey and a given inner-Mapkey- either
incrementNestedCounter1orincrementNestedCounter2can be used to increment an existingCounterfor a given outer-Mapkey and a given inner-Mapkey - either
readNestedCounter1orreadNestedCounter2can be used to read the value of an existingCounterfor a given outer-Mapkey and a given inner-Mapkey.
Notes:
-
Nested ledger-state values must be initialized before first use. The syntax
default<T>can be used to create default ledger-state type values, just as it can be used to create default Compact type values. Attempting to operate on a nested ledger-state value without first initializing it is a dynamic error. -
Ledger state type values are not first-class objects, so when accessing a nested value, the entire indirection chain must be used. For example, the following is a static error:
export circuit incrementNestedCounter(b: Boolean, n: Field, k: Uint<16>): [] {
fld.lookup(b); // ERROR: incomplete chain of indirects
} -
When the last operation is a write of a base type, increment of a Counter type, or decrement of a Counter type, one can replace the
write,increment, ordecrementoperation with the=,+=, or-=assignment syntax, as illustrated byincrementNestedCounter1andincrementNestedCounter2, which have the same behavior. -
When the last operation is a read of a Counter or base type one can omit the explicit
read()indirect, as illustrated byreadNestedCounter1andreadNestedCounter2, which have the same behavior. -
For convenience, local variables can hold default values of ledger-state types, so the following definition of
initNestedMapis equivalent to the one above.export circuit initNestedMap(b: Boolean): [] {
const t = default<Map<Field, Counter>>;
fld.insert(disclose(b), t);
}
Sealed and unsealed ledger fields
Any ledger field can be marked sealed by prefixing the ledger field declaration
with the optional modifier sealed.
A sealed field cannot be set except during contract initialization. That is, its
value can be modified only by the contract constructor (if any),
either directly within the body of the constructor or via helper
circuits called by the constructor.
The sealed keyword must come after the export keyword (if present) and before
the ledger keyword, as in the following example:
sealed ledger field1: Uint<32>;
export sealed ledger field2: Uint<32>;
circuit init(x: Uint<32>): [] {
field2 = disclose(x);
}
constructor(x: Uint<16>) {
field1 = 2 * disclose(x);
init(x);
}
It is a static error if a sealed ledger field is updated by any code that is reachable from an exported circuit.
Contract constructor
A contract can be initialized via a contract constructor defined at the program's top level.
| constructor-definition | ⟶ | constructor pattern-parameter-list block |
| pattern-parameter-list | ⟶ | ( typed-pattern , ⋯ , typed-pattern ,opt ) |
| typed-pattern | ⟶ | pattern : type |
The constructor, if any, is typically used to initialize public state and can also be used to initialize private state through witness calls. At most one contract constructor can be defined for a contract, and it must appear only at the program top level, i.e., it cannot be defined in a module. To initialize ledger fields that are visible only within a module, the constructor can call a circuit that is exported from the module. For example:
module PublicState {
enum STATE { unset, set }
ledger state: STATE;
ledger value: Field;
export circuit init(v: Field): [] {
value = disclose(v);
state = STATE.set;
}
}
import PublicState;
constructor(v: Field) {
init(v);
}
Each constructor parameter must have an explicit type annotation. The type of each variable binding arising from the binding of identifiers in each parameter pattern to the corresponding pieces of the input is the type of the corresponding part of the declared type's structure.
The return type of the constructor is always [].
Any attempt to return another type of value using return expr; where the
type of expr is something other than [], is a static error.
Circuit definitions
The basic operational element in Compact is the circuit. This corresponds closely to a function in most languages but is designed to be compilable into a zero-knowledge circuit. The key limitation of circuits relative to functions in most languages is that circuits cannot be recursive, either directly or indirectly.
Compact supports two kinds of circuits: named circuits and anonymous circuits. Named circuits are described here, and anonymous circuits are described in Circuit and witness calls.
Named circuit definitions have the following syntax:
| circuit-definition | ⟶ | exportopt pureopt circuit function-name gparamsopt pattern-parameter-list : type block |
A circuit definition binds function-name to a circuit with the given parameters,
return type, and body.
The optional export modifier indicates that the circuit binding should
be exported from the enclosing module or the program itself, if the circuit
is defined outside of any module.
The optional pure modifier indicates that the
circuit is pure.
If any generic parameters are present (gparams is present and is nonempty), the circuit is generic and must be specialized (provided with generic arguments) at the point of call.
Circuits can take zero or more parameters.
The parameters are all patterns containing identifiers
to be bound to selected pieces of the argument values.
In the simplest case, a pattern is just an identifier and is bound to the argument
value as a whole.
The bindings established by the parameters are visible within (and only within)
the block that constitutes the body of the circuit.
Each parameter must have an explicit type annotation, and at the point of every call to the
circuit, the type of the corresponding argument expression must be a subtype of
that type.
The type of each variable binding arising from the binding of identifiers in
each parameter pattern to the corresponding pieces of the input is the type of
the corresponding part of the declared type's structure.
For example, in the body of sumStruct below:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x, y}: S): Uint<64> {
return x + y;
}
The variable binding established for x by the pattern {x, y} has type
Uint<16>, and the variable binding for y has type Uint<32>.
Every named circuit's return type must be explicitly declared, and it is a static error if the circuit can return a value that is not a subtype of that type.
The body is evaluated each time the circuit is called.
Pure and impure circuits
A Compact circuit is considered pure if it computes its outputs from its inputs without reference to or modification of public state (via the ledger) or private state (via witnesses). In practice, the compiler considers a circuit to be impure if the body of the circuit contains a ledger operation, a call to any impure circuit, or a call to a witness.
Some external circuits defined in CompactStandardLibrary are witnesses; calls
to these make the caller impure. The remainder are considered pure, so calls to
those do not make the caller impure.
A Compact program can declare a circuit to be pure by prefixing the circuit
definition with the pure modifier, which must follow the export modifier,
if present, e.g.:
pure circuit c(a: Field): Field { ⋯ }
export pure circuit c(a: Field): Field { ⋯ }
The only effect of the pure modifier is that the compiler flags the
declaration as an error if its own analysis determines that the circuit is
actually impure. The pure modifier allows an application to ensure that the
circuit is present in the PureCircuits type declaration and via the
pureCircuits constant in the TypeScript module produced for a (correct)
Compact program by the Compact compiler.
Blocks
A block is a group of statements enclosed in braces:
A block can be used in place of a single statement anywhere a single statement
is allowed, which is useful for allowing multiple statements to be evaluated by the
"then" and "else" parts of an if statement or the body of a for statement.
The body of every circuit definition and the body of the constructor, if any, is always a block. The right-hand side of the arrow in an anonymous circuit can be either a block or an expression.
The statements within a block occupy a nested scope: variable bindings created by
const statements within the block are not visible outside the block, and they can
shadow identifier bindings with the same names that exist outside the block.
A block is well-typed if the statements within it are well-typed.
A block is evaluated by evaluating the statements in sequence.
Statements
Compact statements are used to perform computations or effects with the help of Compact expressions.
Statements either do not have values or (in the case of
expression sequences serving as statements)
the values are ignored.
Thus it is not necessary to talk about the type of a statement.
Nevertheless, each statement has typing rules that must be followed, such as that
the type of the test expression of an if statement must be Boolean.
The first subsection of this section presents a grammar snippet that summarizes the syntax of Compact statements. The remaining subsections describe the typing and evaluation rules of the various kinds of statements.
Syntax of statements
| stmt | ⟶ | if ( expr-seq ) stmt |
| | | stmt0 | |
| stmt0 | ⟶ | expr-seq ; |
| | | const cbinding , ⋯¹ , cbinding ; | |
| | | if ( expr-seq ) stmt0 else stmt | |
| | | for ( const id of start .. end ) stmt | |
| | | for ( const id of expr-seq ) stmt | |
| | | return expr-seq ; | |
| | | return ; | |
| | | block | |
| start | ⟶ | tsize |
| end | ⟶ | tsize |
The grammar snippet above shows that a statement (stmt) is either a one-armed if statement
or some other kind of statement (stmt0).
This structure is used to enforce the restriction that the "then" part of a
two-armed if cannot be a one-armed if.
This is often left ambiguous in a language grammar, with a separate note to say
that the ambiguity is resolved by associating each "else" part with the closest
enclosing if statement, but here the constraint is explicit in the grammar.
Expression sequences used as statements
Any expression sequence (expr-seq), i.e., comma-separated sequence of one or more expressions to be evaluated in sequence, can be used as a statement. Expression sequences are described in their own section.
const statements
const statements create local variable bindings.
Every const statement takes the following form:
const cbinding , ⋯¹ , cbinding ;
where each cbinding takes the following form:
| cbinding | ⟶ | optionally-typed-pattern = expr |
| optionally-typed-pattern | ⟶ | pattern |
| | | typed-pattern |
A const statement is typed by typing each of its cbinding subforms.
A cbinding subform is typed by typing the expression on the right-hand side
of the =.
If a type T is declared on the left-hand side, the type of the expression
must be a subtype of T, otherwise it is a static error.
It is also a static error if the pattern implies a different structure from
the type of the expression.
For example, it is a static error if the pattern implies that the expression's
value is a tuple when the type is actually, say, Field.
The type of each variable binding arising from the binding of identifiers in each
pattern to the corresponding pieces of the input is the type of the corresponding
part of the structure of the declared type, if present, otherwise of the inferred
type.
For example, in the following code, the binding for x has type Boolean, and
the binding for y has type [Uint<64>, Uint<64>]:
witness w(): [Boolean, [Uint<16>, Uint<32>]];
circuit foo(): [Uint<64>, Uint<64>] {
const [x, y]: [Boolean, [Uint<64>, Uint<64>]] = w();
return x ? y : [0, 0];
}
while in the following, x still has type Boolean but y has (the inferred)
type [Uint<16>, Uint<64>].
witness w(): [Boolean, [Uint<16>, Uint<32>]];
circuit foo(): [Uint<64>, Uint<64>] {
const [x, y] = w();
return x ? y : [0, 0];
}
Except where shadowed, the scope of each variable bound by a const statement
is the entirety of the innermost block containing the const statement.
It must not be referenced, however, before it is given a binding.
Any attempt to do so is a static error.
For example, the reference to x on the first line of the body in the
definition of foo below is a static error:
circuit foo(a: Uint<16>): Field {
const y = x + a;
const x = 7;
return y;
}
Similarly, the reference to x in the first cbinding of the const
statement below is also a static error:
const y = x, x = 7;
A const statement is evaluated by evaluating the cbinding subforms in order
so that the variables given values by each cbinding are available to be
referenced by the cbindings that follow.
The evaluation of each cbinding involves determining the value v of the
expression on the right-hand side of the =, then giving values to identifiers
in the pattern p on the left-hand side to the corresponding pieces of v
as described earlier in Patterns and destructuring.
Any variable bound by const may not be reused within a block, although a
const binding in a nested block might shadow it.
Variables are immutable, although the same variable might take on different values
at different times if it is contained within a block of code that is evaluated more
than once, such as would be the case for the body of a circuit that is called
more than once.
if statements
An if statement is used to determine the flow of control through the
statements of a circuit or constructor body.
A one-armed if expression has a "test part" (an expr-seq enclosed
in parentheses) and a "then part" (then-statement):
if (expr-seq) then-statement
A two-armed if statement has a test part, a then part, and an "else part"
(else-statement):
if (expr-seq) then-statement else else-statement
The typing of an if statement requires only that the type of expr-seq must be
Boolean.
Evaluating an if expression involves first determining the value v of
expr-seq.
If v is true, then-statement is evaluated.
Otherwise, v must be false, in which case else-statement
(if present), is evaluated.
for statements
for statements are used to iterate over a sequence of values.
In Compact, in contrast to most languages, the number of iterations can
always be determined at compile time.
That is, the number of iterations is bounded either by constant numeric
bounds or by size of an object of constant size.
This restriction is motivated by the need for the compiler to generate
finite proving circuits.
Compact supports two kinds of for statements.
The first iterates over vectors, tuples, and byte vectors and takes the
following form:
for (const x of expr) stmt
This kind of for statement is typed by typing expr and verifying that
it is a Vector type, a tuple type that
has a vector type, or a Bytes type.
Evaluating this kind of for requires determining the vector, tuple, or
byte-vector value v of expr then evaluating
stmt once for each element of v with x bound to the value of each element
of v in turn.
The second form iterates over a range of unsigned integer values and takes the following form:
for (const i of start .. end) stmt
In this form, each of start and end must be a literal unsigned integer or reference to a generic natural-number parameter, and end must be greater than or equal to start. Otherwise, it is a static error.
This form is always well-typed.
Evaluating this kind of for requires evaluating stmt with i bound
to k for each k in the range start (inclusive) to end (exclusive).
return statements cannot be used to return from within for statements.
It is therefore a static error for stmt to be a return statement
or for a return statement to appear within stmt (except where it
appears nested within an anonymous circuit).
Iteration can also be accomplished via
map and fold expressions.
return statements
A return statement can be used to exit from the closest enclosing
anonymous circuit, if any, or otherwise from
the enclosing constructor or named circuit, and to return to the caller either
an explicit return value, i.e., the value of expr-seq
in this form of return:
return expr-seq;
or the default value [] in this form of return:
return;
A circuit or constructor body can also exit without an explicit return statement:
any path through the body that does not end in an explicit return statement is
treated as if ended with return;.
A return statement is always well-typed if it exits from an anonymous circuit
without a declared return type.
Otherwise, a return statement is well-typed if the type of expr-seq, or
[] if no expr-seq is present, is a subtype of the expected return type.
If the return form exits from a named circuit or an anonymous circuit
with a declared return type, the expected type is the declared return type,
while if it exits from the constructor, the expected return type is [].
An implication of these rules is that it is a static error to exit without an
explicit return value from a circuit with a declared return type other than [].
When a return statement is evaluated, the expr-seq, if present, is evaluated,
the circuit or constructor exits immediately without evaluating any
subsequent statements, and it returns to the caller the value of expr-seq or
[] if no expr-seq is present.
Expressions
Compact expressions are used to compute values, to cause effects, or both.
Every Compact expression must be well-typed (free from static type errors). If any expression within a program contains a static type error, it is a static error: the Compact compiler reports the error and does not produce any target code (TypeScript or zkir) for the program. The static type of a well-typed expression is either a Compact type or a ledger-state type.
Every well-typed Compact expression either evaluates to a value or causes a dynamic error, and it might have effects. The evaluation of an expression is defined in terms of the evaluation of its subexpressions.
The first subsection of this section provides a grammar snippet that summarizes the syntax of Compact expressions, and the second discusses how the grammar reflects the rules for precedence and associativity of operators. The remaining sections describe the typing and evaluation rules for each kind of expression.
Syntax of expressions
| expr-seq | ⟶ | expr |
| | | expr , ⋯¹ , expr , expr | |
| expr | ⟶ | expr0 ? expr : expr |
| | | expr0 = expr | |
| | | expr0 += expr | |
| | | expr0 -= expr | |
| | | expr0 | |
| expr0 | ⟶ | expr0 || expr1 |
| | | expr1 | |
| expr1 | ⟶ | expr1 && expr2 |
| | | expr2 | |
| expr2 | ⟶ | expr2 == expr3 |
| | | expr2 != expr3 | |
| | | expr3 | |
| expr3 | ⟶ | expr4 < expr4 |
| | | expr4 <= expr4 | |
| | | expr4 >= expr4 | |
| | | expr4 > expr4 | |
| | | expr4 | |
| expr4 | ⟶ | expr4 as type |
| | | expr5 | |
| expr5 | ⟶ | expr5 + expr6 |
| | | expr5 - expr6 | |
| | | expr6 | |
| expr6 | ⟶ | expr6 * expr7 |
| | | expr7 | |
| expr7 | ⟶ | ! expr7 |
| | | expr8 | |
| expr8 | ⟶ | expr8 [ expr ] |
| | | expr8 . id | |
| | | expr8 . id ( expr , ⋯ , expr ,opt ) | |
| | | expr9 | |
| expr9 | ⟶ | fun ( expr , ⋯ , expr ,opt ) |
| | | map ( fun , expr , ⋯¹ , expr ,opt ) | |
| | | fold ( fun , expr , expr , ⋯¹ , expr ,opt ) | |
| | | slice < tsize > ( expr , expr ) | |
| | | [ tuple-arg , ⋯ , tuple-arg ,opt ] | |
| | | Bytes [ bytes-arg , ⋯ , bytes-arg ,opt ] | |
| | | tref { struct-arg , ⋯ , struct-arg ,opt } | |
| | | assert ( expr , str ) | |
| | | disclose ( expr ) | |
| | | term | |
| term | ⟶ | id |
| | | true | |
| | | false | |
| | | nat | |
| | | str | |
| | | pad ( nat , str ) | |
| | | default < type > | |
| | | ( expr-seq ) | |
| struct-arg | ⟶ | expr |
| | | id : expr | |
| | | ... expr | |
| bytes-arg | ⟶ | tuple-arg |
| tuple-arg | ⟶ | expr |
| | | ... expr |
Precedence and associativity
The structure of the expression grammar unambiguously reflects the precedence and associativity of operators.
Each group from expr-seq through term represents a precedence level,
with expr-seq at a higher precedence level than expr, expr at a higher level
than expr0, and so on.
For example, expr6 (multiplication) is at a higher precedence level
and so binds more tightly than expr5 (addition) because the grammar
permits the operands of an addition expression to be multiplication expressions,
but not vice versa.
For example, x * y + z is parsed as the addition of x * y and z
rather than the multiplication of x and y + z, because the expr5
production for addition is not reachable from the expr6 production
for multiplication.
One can still write x * (y + z) because an expr6 can be a term (via the
"fall through" productions expr6 ⟶ expr7,
expr7 ⟶ expr8, and so on),
a term can be a parenthesized expr-seq, and an expr-seq can be an expr.
The grammar enforces associativity by requiring the left or right
operand of an operator or both to be at a higher precedence level.
Specifically, left-associativity is expressed by requiring the right operand to
be at a higher level while allowing the left operand to be at the same level.
For instance, expr5 ⟶ expr5 + expr6
enforces the left-associativity of addition, since the left operand must be at
a higher level.
This means, for example, that x + y - z can be treated only as the subtraction of z
from x + y rather than as the addition of x and y - z.
Right-associativity is expressed in the opposite manner, e.g.,
expr0 ? expr : expr enforces the right-associativity of the
ternary ?: operator.
Non-associative operators such as the relational operators <, <=, >=, and
> require both operands to be at a strictly higher level (expr4
on both sides of expr3), preventing chaining like a < b < c.
The middle operand of the ternary ?: operator plays no roll in determining
precedence and associativity and can be any kind of expression.
(The grammar could even have allowed it to be an expression sequence, but Compact
follows TypeScript and JavaScript in restricting it to be a single expression.)
In fact, the grammar has many examples where a lower precedence expression is
embedded in a higher-precedence expression; this does not result in any
ambiguity because the operand is not exposed on the left or right side of the
expression where it can be confused as an operand of some other operator.
Parenthesized expressions
Compact allows parenthesized expressions of the form (e), where
e is an expression or, more generally, an expression sequence.
Parenthesized expressions can be used to control the order of operations or simply
to make the default order of operations explicit.
For example, x + y * z computes the product of y and z and adds x to the
result, because * has higher precedence than +.
x + (y * z) computes the same thing but makes the order of operations more
clear, while (x + y) * z computes the sum of x and y and multiplies it by z.
The type of a parenthesized expression is the type of the embedded expression sequence, and its value is the value of the subexpression.
Expression sequences
An expression sequence (expr-seq in the grammar above) is a comma-separated sequence of one or more expressions.
expr, ⋯, expr
Expression sequences can appear only in a few contexts. When an expression sequence is required in a context where only a single expression is permitted, the expression sequence can be wrapped in parentheses:
(expr, ⋯, expr)
The type of an expression sequence is the type of its last subexpression. The types of the other subexpressions are ignored and are not constrained in any way.
An expression sequence is evaluated by evaluating the subexpressions from left to right, and its value is the value of the last subexpression. The values of the other subexpressions are ignored; these other subexpressions are evaluated solely for their effects.
Literals
Compact has syntax for Boolean, numeric, and string literal expressions.
A Boolean literal is one of the reserved words true or false.
The static type of a Boolean literal is Boolean and its value is the
corresponding Boolean value.
A numeric literal is a non-negative integer written in decimal, binary, octal, or hexadecimal notation as follows:
- decimal: either the single digit
0or a sequence of one or more decimal digits (0-9) starting with a non-zero digit, e.g.,0or7091; - binary: the prefix
0bor0Bfollowed by one or more binary digits (0or1), e.g.,0b1101001; - octal: the prefix
0oor0Ofollowed by one or more octal digits (0-7), e.g.,0o5073; or - hexadecimal: the prefix
0xor0Xfollowed by one or more hexadecimal digits (0-9,a-f,A-F), e.g.,0x8f0a.
An occurrence of a numeric literal can arise via specialization of a
generic module or circuit in which a reference to a generic natural-number parameter
occurs in an expression context.
For example, if foo is defined as follows:
circuit foo<#N>(): Uint<16> {
return N;
}
the call foo<17>(), at least in effect, gives rise to a copy of foo in
which N has been replaced by 17.
circuit foo(): Uint<16> {
return 17;
}
This is treated exactly as if the numeric literal appeared in place of the
reference to N in the source code of the program.
It is a static error if the number n denoted by a numeric literal exceeds the maximum unsigned value and the maximum field value.
If n does not exceed the maximum representable unsigned value, the literal's
type is Uint<0..k>, k = n+1, and its value is the unsigned integer n.
If n exceeds the maximum representable unsigned value but not the
maximum representable field value, the literal must be directly cast to
Field, i.e., n as Field.
Otherwise it is a static error.
The type of n as Field is Field, and its value is the
field value n.
String literals can be either simple string literals or pad expressions.
String literals produce byte vectors; Compact has no dedicated String type.
Simple string literals use TypeScript string literal syntax.
They can be enclosed in either single (') or double (") quotes, and they can
contain escaped characters.
The length n of a simple string literal is the length of its UTF-8 encoding.
The static type of a string literal of length n is Bytes<n>, and its
value is a byte vector containing its UTF-8 encoding.
A pad expression pad(n, s) is also a string literal, where pad is a reserved
word, n is a natural number literal, and s is a simple string literal.
The length of s must be less than or equal to n; otherwise, it is a static error.
The static type of a padded string literal pad(n, s) is Bytes<n>, and its
value is a byte vector containing the UTF-8 encoding of s followed by 0
bytes up to the padded length n.
Default values of a type
The expression default<T>, where T is a Compact type or a ledger-state type,
has static type T and evaluates to the default value of that type as follows:
Boolean: the value of the literalfalseUint<0..n>andUint<n>:0Field:0[T1,⋯,Tn]: a tuple with n elements, each of which is the default value of the corresponding type TiVector<n,T>: a vector with n elements, each of which is the default value of type T.Bytes<n>: a byte vector of length n, each element of which is0.Opaque<"string">: a zero-length stringOpaque<"Uint8Array">: a zero-lengthUint8Array- Structure types: a struct with each element set to the default value of its type
- Enumeration types: the value of E
.id where E is the name of the enumeration type and id is the first element - alias types: the default value of the underlying type
Counter: a Counter initialized to0Set<T>: an empty SetMap<K,T>: an empty MapList<T>: an empty ListMerkleTree<n,T>: an empty Merkle treeHistoricMerkleTree<n,T>: an empty historic Merkle tree
Default values are not defined for contract types or the Kernel ledger-state type,
so it is a static error if T in default<T> is one of these types.
Variable references
A variable reference is an identifier reference to a variable binding in scope.
The type of a variable reference is the type of the binding to which it refers.
Each call to a circuit (or constructor) associates a new set of values with each
variable bound by the circuit's (or constructor's) parameters, and each block
in the circuit (or constructor) associates a new set of values with each of the
const bindings created within the block.
The value of each variable reference is the value currently associated with the
variable.
Conditional expressions
Compact supports conditional expressions of the form
e0 ? e1 : e2
where e0, e1, and e2 are expressions.
The type of e0 must be Boolean.
The types of e1 and e2 must be related.
That is, the type of e1 must be a subtype of the type of e2,
or the type of e2 must be a subtype of the type of e1.
The type of the entire expression is the type of e2 if e1 is a subtype of e2 and the type of e1 if e2 is a subtype of e1.
A conditional expression e0 ? e1 : e2
is evaluated by first evaluating e0. The value v of e0
determines which of the other subexpressions is evaluated:
- if v is
true, then e1 is evaluated and its value is the value of the entire expression - if v is
false, then e2 is evaluated and its value is the value of the entire expression
The evaluation rules ensure that only one of e1 and e2 is evaluated.
Relational comparison expressions
Relational comparison expressions take the form e1 op e2
where e1 and e2 are expressions and op is one of
Compact's relational operators.
The relational operators are equals (==), not equals (!=), less
than (<), greater than (>), less than or equals (<=), and
greater than or equals (>=).
Equals and not equals require the types of e1 and e2 to be related. That is, the type of e1 must be a subtype of the type of e2, or the type of e2 must be a subtype of the type of e1. Otherwise, it is a static error.
Less than, greater than, less than or equals, and greater than or equals require
the types of e1 and e2 to be unsigned integer types.
Otherwise, it is a static error.
(Values of type Field cannot be compared with these operators.)
The type of the result is Boolean.
A relational comparison expression e1 op e2 is evaluated by evaluating e1, then evaluating e2, then comparing the resulting values as described below.
Equals and not equals
Equality depends on the types of the operands according to the following rules.
The rules need not and therefore do not cover cases where one operand is not
a subtype of the other, as required by typing rules above.
For example, if one operand has type Boolean, so does the other; if one
operand has a Bytes type of length n, so does the other; and if one
has a specific structure type, so does the other.
Boolean: the values are equal if both aretrueor both arefalse.FieldandUinttypes: the values are equal if the integer values are equal.Bytes: the values are equal if the corresponding bytes at each index are equal.Vectorand tuple types: the values are equal if the corresponding values at each index are equal.- Structure types: the values are equal if the values of the corresponding fields are equal.
- Enumeration types: the values are equal if they are the same enumeration member.
Opaquetypes: the values are equal if the run-time values are equal according to JavaScript's strict equality (===) operator.
Otherwise the values are not equal.
For equal, the relational comparison expression evaluates to true if the
values are equal; otherwise it evaluates to false.
Likewise, for not equals, the relational comparison expression evaluates to
false if the values are equal; otherwise it evaluates to true.
Less than, greater than, less than or equals, and greater than or equals
The values are compared according to the corresponding relational operation.
The relational comparison expression evaluates to true if the relation holds;
otherwise it evaluates to false.
Short-circuit logical expressions
Compact supports short-circuit logical expressions of the form
e1 op e2 where e1 and e2
are expressions and op is one of the logical operators or (||) or and
(&&).
Logical expressions require the types of both subexpressions to be Boolean.
Otherwise it is a static error.
The logical expression itself also has type Boolean.
Evaluating e1 op e2 involves first evaluating e1, then:
- For
||, if v istrue, e2 is not evaluated, and the value of the entire expression istrue. Otherwise, e2 is evaluated, and its value is the value of the entire expression. - For
&&, if v isfalse, e2 is not evaluated, and the value of the entire expression isfalse. Otherwise, e2 is evaluated, and its value is the value of the entire expression.
These are short-circuiting operators because they do not evaluate the second operand if the final value is determined from the first.
Boolean negation expressions
Compact supports unary Boolean negation expressions of the form !e where e
is an expression.
The type of e must be of type Boolean, otherwise it is a
static error.
The Boolean-negation expression itself also has type Boolean.
A Boolean negation expression !e is evaluated by evaluating e to
determine its value v.
Then if v is true, the value of the Boolean negation expression is false,
and If v is false, the value of the Boolean negation expression is true.
Binary arithmetic expressions
Binary arithmetic expressions have the form e1 op e2,
where e1 and e2 are operand expressions and op is one of
Compact's binary arithmetic operators.
The binary arithmetic operators are add (+), subtract (-) and
multiply (*).
Arithmetic expressions require the type of each operand to be a numeric
type, that is, either a Field, a Uint, or a type alias for a Field or
a Uint.
Putting aside type aliases for now, the type of the result depends on the
types of the operands as follows:
- If both operands have type
Field, the result has typeField. - If both operands have
Uinttypes, i.e., e1 has typeUint<0..m>for some m and e2 has typeUint<0..n>for some n, the type of the result depends on the operation as follows:- for add, the result has type
Uint<0..m+n>, - for subtract, the result has type
Uint<0..m>, and - for multiply, the result has type
Uint<0..m⋅n>.
- for add, the result has type
- If one operand has type
Fieldand the other has typeUint<0..n>where n-1 is less than or equal to the maximum field value, theUintoperand is implicitly cast toField, and the result has typeField. (If n-1 is greater than the maximum field value, the behavior of the compiler is currently unspecified.)
For arithmetic operations with Uint result types, it is a static error if
the result's bound would be greater than the
maximum unsigned integer, if any.
Evaluating an arithmetic expression e1 op e2 involves
first evaluating e1, then evaluating e2.
Integer addition, subtraction, or multiplication is then used on the operand
values.
The overflow and underflow behavior differs for Field (either operand has
type Field) and Uint operations (both operands have a Uint type):
Fieldarithmetic overflow and underflow wraps around 0; that is, the result of an arithmetic operation whose result is aFieldis the actual arithmetic value modulo k, where k is one more than the maximum field value.Uintaddition and multiplication cannot overflow: the static type of the result is always large enough to hold the result value.Uintsubtraction results in a dynamic error if the resulting value would be negative, i.e., if the value of e2 is greater the value of e1.
If the type of either operand is a structural type alias for a Field or
Uint it is treated identically to that Field or Uint, both in type checking
and evaluation.
On the other hand, if the type of either operand is a nominal type alias
T for a Field or a Uint, the type of the other operand must also have
type T, and the type of the result also has type T.
Evaluation proceeds as if the two operands had the underlying Field or Uint
type with the value cast back to T, which can result in a dynamic error if T
is a nominal alias for a type Uint<0..n> and the value is not less than n.
For example:
new type Feet = Uint<32>;
circuit foo(x: Feet, y: Feet, scale: Uint<32>): Feet {
return (x + y) * (scale as Feet);
}
computes the sum of x and y multiplied by scale.
If the value of this computation fits in 32 bits, foo returns the value.
Otherwise, the enclosing top-level circuit or constructor halts with a
message indicating that the cast of the value to Feet failed.
Tuple creation
New tuple values are created with expressions of the form [tuple-arg, ⋯, tuple-arg] where
tuple-arg, ⋯, tuple-arg is a sequence of zero or more comma-separated tuple
arguments.
A non-empty sequence can have an optional trailing comma.
Each tuple argument is either an expression or a spread.
If a tuple argument is an expression e of type T, it contributes a single element of type T to the new tuple: the value of e.
If a tuple argument is a spread ... e, the type T of e must be
a tuple type, a Vector type, or a Bytes type.
Otherwise, it is a static error.
If T is a tuple type [U1, ⋯, Un], the
spread contributes n new elements of types U1, ⋯, Un
to the new tuple, i.e., the elements of the tuple value of e.
If T is a Vector type Vector<n, U>, the spread contributes
n new elements of type U to the new tuple, i.e., the elements
of the Vector value of e.
If T is a Bytes type Bytes<n>, the spread contributes n new elements of
type Uint<8> to the new tuple, i.e., the elements of the byte-vector value
of e.
The type of a tuple-creation expression is the tuple type whose elements are the types contributed by each of the tuple arguments in order.
A tuple-creation expression is evaluated by evaluating the tuple arguments from left to right and creating a new tuple whose elements are the values contributed by each of the tuple arguments in order. The length of the new tuple is the total number of contributed elements, which can be less than or greater than number of tuple arguments.
Byte-vector creation
New byte vectors are created with expressions of the form
Bytes [tuple-arg, ⋯, tuple-arg].
Bytes creation is essentially the same as tuple creation
except that the types of the contributed elements must all be subtypes of Uint<8>
(if not, it is a static error), and the result is a new byte vector of type
Bytes<n> where n is the number of contributed elements.
Tuple, vector, and byte-vector references
Compact allows references to individual elements of sequence values (tuples,
vectors, and byte vectors) via the syntax e[index] where e is
an expression and index is a numeric literal, generic natural-number parameter reference,
or an expression that can be reduced to a numeric constant at compile time as
described below.
The type of e must be a sequence type, i.e., a tuple type, a vector type,
or a Bytes type.
The type of index must be Uint<n> for any n.
The eventual constant value of index must be less than the length of the
sequence value as determined by its type.
Violating any of these constraints is a static error.
The unsigned integer value of index must be computable at compile time via the following rules, or it is a static error.
- Any index can be a constant, i.e., a numeric literal or a generic natural-number parameter reference.
- If the type of e is a tuple type that has a vector type, the rule for vector types below applies, with e treated as having the vector type. If the type of e is a tuple type that does not have a vector type, however, index must be a constant. This restriction allows the compiler to compute a type for tuple reference expressions without first attempting to reduce index to a numeric value.
- If the type of e is a vector or byte-vector type, index must be a
constant-valued expression, where a constant-valued expression is:
- a constant,
- a reference to a generic natural-number parameter,
- a reference to a variable bound to a constant-valued expression,
- a reference to a variable bound in the header of a
forloop, or - the result of adding, subtracting, or multiplying two constant-valued expressions.
The type of a byte-vector reference is Uint<8>.
The type of a vector reference where the vector has element type T is T.
The type of a tuple reference where the tuple has type [T1, ⋯,
Tn] and index is the constant i or the value
i of a generic natural-number parameter is Ti.
Otherwise, the tuple type must have a vector type
Vector<n, T> and the type of the tuple reference is T.
The value of a sequence reference is the value of the ith (zero-based) element of the result of evaluating e, where i is the (eventually) compile-time constant value of index.
Tuple, vector, and byte-vector slices
Slicing a sequence value such as a tuple, vector or byte vector produces a subsequence of the original value. It is similar to a sequence reference but extracts a sequence of values rather than a single value from the sequence value.
Slice expressions take the form slice<k>(e, index), where slice
is a keyword, k is a constant or numeric generic parameter reference specifying
the fixed size of the slice, e is an expression, and index is a numeric
literal, a generic natural-number parameter reference, or an expression that can be
reduced to a numeric constant at compile time as described above in the section on
sequence references.
It is a static error if index + k exceeds the length of the sequence value.
The type of a byte-vector slice is Bytes<k>.
The type of a vector slice where the vector has element type T is
Vector<k, T>.
The type of a tuple slice where the tuple has type
[T1, ⋯, Tn] and index is the constant i
or the value i of a generic natural-number parameter
is the subsequence [Ti, ⋯, Tj] starting with
element i (zero-based) and ending with element j=i+k-1 of [T1,
⋯, Tn].
Otherwise, the tuple type must have a vector
type Vector<n, T> and the type of
the slice is Vector<k, T>.
The value of a slice expression slice<k>(e, index) is the subsequence
of the original tuple, vector, or byte vector from i (zero-based, inclusive)
through i+k (zero-based, exclusive) of the result of evaluating e, where i
is the (eventually) compile-time constant value of index.
For example, if getMiddle is defined as follows:
export circuit getMiddle(x: Bytes<5>): Bytes<3> {
return slice<3>(x, 1);
}
The call
getMiddle(Bytes[17, 18, 19, 20, 21])
evaluates to the equivalent of
Bytes[18, 19, 20]
Structure creation
Structure values are created with structure-creation expressions of the form
T {struct-arg, ⋯, struct-arg}, where T is a structure type name S
or specialized generic structure type S<garg, ⋯, garg>, and
struct-arg, ⋯, struct-arg is a sequence of zero or more comma-separated
structure arguments.
A structure argument can be one of three things:
- a positional argument, which takes the form of an expression e,
- a named argument, which takes the form id
:e, where id is a field name and e is an expression, or - a spread argument, which takes the form
...e, where...is the literal three dots (ellipsis) token and e is an expression.
The sequence of structure arguments within a structure-creation expression can consist either of:
- zero or more positional arguments followed by zero or more named arguments, or
- a spread argument followed by zero or more named arguments.
In the first case, the values of the n positional arguments become the values
of the first n fields of the created structure and must be given in the same
order as the fields in the structure declaration.
The named elements specify the values of the remaining fields by name; that is,
the value of e in the named argument id: e becomes the value of the
field named id.
In the second case, the expression e in the spread argument ...e must
have the same structure type T as the one being created, and each field of the
created structure is given the value of the corresponding field from the value
of e if not overridden by one of the named arguments.
Named arguments need not appear in any particular order.
It is possible for only positional or only named arguments to be present. It is also possible, though not useful, for only a spread argument to be present.
The examples below demonstrate the use of positional and spread field values:
struct S { a: Uint<32>, b: Boolean, c: Bytes<8> }
circuit f(x: Uint<32>, y: Boolean, z: Bytes<8>): S {
const s1 = S { c: z, a: x, b: y };
// Alternatively, s1 can be created with the positional syntax S { x, y, z }
// or a mix of positional and named field values S { x, c: z, b: y }.
const s2 = S { ...s1, b: true };
// s2 is created using the spread syntax. So, s2 has the same field values
// as s1 except that b is true.
const s3 = S { ...s2, c: 'abcdefgh' };
// s3 is also created using the spread syntax. s3 has the same field values
// as s2 except that c is 'abcdefgh'.
return s3;
}
The structure type name must be bound to a structure type in scope. If the structure is generic, it must be fully specialized with generic arguments enclosed in angle brackets.
The static type of a non-generic structure-creation expression is the named structure type, while the static type of a generic structure-creation expression is a structure with the same name as the generic type and field types obtained by substituting the generic arguments for the generic parameters in the structure's declaration.
If a spread argument is not present: The number of structure arguments must match the number of fields in the corresponding structure declaration: a value must be given for every field. Positional arguments must appear before named arguments. A field name must not occur more than once among the named arguments, and each field name that does occur must be the name of a field in the corresponding structure declaration whose value is not given positionally. The type of a positional field subexpression must be a subtype of the declared type of the (positionally) corresponding field in the structure declaration. Similarly, the type of a named field subexpression must be a subtype of the declared type of the corresponding (named) field in the structure declaration.
If a spread argument is present: The spread argument must come first in the sequence of structure arguments. The type of the spread subexpression must be the same as the structure to be created. There must not be any positional arguments. A field name must not occur more than once among the named arguments, and each field name that does occur must be the name of a field in the corresponding structure declaration. The type of a named field subexpression must be a subtype of the declared type of the corresponding (named) field in the structure declaration.
Violating any of the above constraints is a static error.
A structure-creation expression is evaluated by evaluating the structure argument expressions in order from left to right and constructing a structure value whose fields values are based on the corresponding structure arguments: if there is a positional or named argument for a field, the field's value is the value of the expression, otherwise there must be a spread argument and the field's value is the value of the corresponding field in the (structure) value of the spread expression.
Structure field access
A structure field access is an expression of the form e.id where e is
an expression with a structure type T and id is an identifier.
It is a static error if T does not have a field named id.
The type of any structure field access e.id where e has type T is the
type of the id field of T.
The value of any structure field access e.id is the result of evaluating the
subexpression e and extracting the value of the resulting structure's id field.
Some expressions of the form e.id can also be
enumeration member selection or
ledger-state operations.
e.id is recognized as a structure field access only when the
type of e is a structure type.
Enumeration member selection
An expression of the form E.id where E is the name of an enumeration
type and id is an identifier is an enumeration member selection.
It is a static error if E does not have a member named id.
The type of any enumeration member selection E.id is E.
E.id is a constant.
That is, the value of any enumeration member selection E.id is E.id.
Some expressions of the form E.id can also be
structure field accesses or
ledger-state operations.
E.id is recognized as an enumeration member selection only when the
type of e is an enumeration type.
Circuit and witness calls
Circuits and witnesses, collectively referred to as functions, are called via
expressions of the form fun(e, ⋯, e), where fun is a function expression
and e, ⋯, e is a sequence of zero or more comma-separated argument expressions.
The function expression fun can take one of the following forms:
| fun | ⟶ | id gargsopt |
| | | arrow-parameter-list return-typeopt => block | |
| | | arrow-parameter-list return-typeopt => expr | |
| | | ( fun ) | |
| arrow-parameter-list | ⟶ | ( optionally-typed-pattern , ⋯ , optionally-typed-pattern ,opt ) |
| return-type | ⟶ | : type |
In the simplest form, fun is just a identifier id referring to a witness or named circuit. When id refers to a generic witness or circuit, it must be fully specialized via generic parameters.
fun can also be an anonymous circuit, also known as an arrow circuit.
This form consists of a parameter list
(optionally-typed-pattern, ⋯, optionally-typed-pattern) followed
by an optional return type : type,
an arrow (=>), and a body, which can be a block or an expr.
Each optionally-typed-pattern is a pattern
with an optional type annotation : type.
Finally, fun can be a parenthesized function (fun).
Because circuits and witnesses are not first class, fun cannot be a variable name or arbitrary expression.
Anonymous circuits cannot have generic parameters. Since they appear only in contexts where they are directly called, the circuit would have to be immediately specialized, in which case only one specialization can exist and the non-generic version would be clearer.
The underlying function of a function expression is a non-parenthesized function expression. For a function name it is the function name, for an anonymous circuit it is the anonymous circuit, and for a parenthesized function it is the underlying function of the parenthesized function expression.
Type checking a function call depends on the form of the underlying function.
-
For a named function: Function names may be overloaded: more than one function of the same name may be present in the scope of a call to that function. A call whose underlying function is a name can thus have zero or more candidate functions, i.e., all of those that have the given name and are present in the scope of the call.
A candidate function is not necessarily compatible with the number and kinds of the generic parameter values nor with the number and types of the argument expressions provided at the call site. It is compatible if and only if the number of generic parameters is the same as the number of generic arguments, each generic argument is of the required kind (numeric or type), the number of declared parameters is the same as the number of argument expressions, and the type of each argument is a subtype of the declared type of the corresponding parameter. If exactly one candidate is compatible, the call is made to that candidate. It is a static error if there is not exactly one compatible candidate.
The static type of a call to a named function is the declared return type of the called function.
-
For an anonymous circuit: Parameters are type-checked or inferred as follows:
-
If there is a type annotation for a parameter, it is a static error if the type of the corresponding argument expression is not a subtype of the type annotation.
-
If there is no type annotation for a parameter, the parameter's type is inferred as the type of the corresponding argument expression.
Within the body of the anonymous circuit, the type of each variable binding arising from the binding of identifiers in each parameter pattern to the corresponding pieces of the corresponding argument is the type of the corresponding part of the structure of the declared or inferred parameter type.
If there is a return-type annotation, it is a static error if the body can return a type of value that is not a subtype of the return-type annotation. A return statement of the form
return;implicitly returns a value of type[], as does every control flow path through a body that does not explicitly end with a return statement.If there is no return-type annotation, then a return type is inferred from the body as the least upper bound of the types of the values that can be returned from the body (explicitly or implicitly). It is a static error if these types do not have a least upper bound.
The static type of a call to an anonymous circuit is the declared or inferred return type.
-
Calls are evaluated by evaluating the argument expressions in order from left to right. Then, if a circuit is being called, the statements in the body of the circuit are executed with the parameter names bound to the corresponding argument values. The value of a circuit call is the value returned from the execution of the body. If a witness is being called, the contract invokes the externally provided witness function with the argument values. The value of a witness call is the value returned by the witness function.
Map and fold expressions
Compact supports expressions that perform the higher-order operations map and (left-to-right) fold over vectors, tuples that have vector types (not arbitrary tuples), and byte vectors.
Map expressions have the form map(fun, e, ⋯¹, e) where map is a
keyword, fun is a circuit or witness, and each e is an expression.
The map operator creates a vector containing the results of applying fun
to the corresponding elements of the sequence values of the expressions.
Fold expressions have the form fold(fun, init, e, ⋯¹, e) where
fold is a keyword, fun is a circuit or witness, init is an expression,
and each e is an expression.
The fold operator accumulates a value starting with the value of init and
updating it by applying fun to the current value and the corresponding elements of
the sequence values of the expressions in turn.
A detailed description of the circuit or witness fun is given in Circuit and witness calls above.
A map expression is type-checked by checking the type of the witness or circuit
fun to find its declared or inferred parameter types and its declared or
inferred return type R.
fun must have at least one parameter, and the map expression must have the same
number of sequence-value subexpressions as the number of parameters of fun.
Each of the sequence-value subexpressions must have a vector or byte-vector type,
and all these types must have the same length n.
If the type of the ith parameter to fun is T, then the ith sequence-value
subexpression must have the vector type Vector<n, S> where S is a subtype of
T, or it must be the byte-vector type Bytes<n> and Uint<8> must be a subtype
of T.
The type of the map expression is Vector<n, R>; map produces a vector
regardless of whether the input sequence-value subexpressions are vectors, tuples,
byte vectors, or some combination.
A fold expression is type-checked by checking the type of the witness or circuit
fun to find its parameter types and its return type R.
fun must have at least two parameters, and the type of the first parameter must
be the same type as the return type R.
The fold expression must have one fewer sequence-value subexpression than the
number of parameters of fun.
The first subexpression init gives the initial value for the fold.
It must have a type which is a subtype R.
Each of the sequence-value subexpressions must have a vector or byte-vector type,
and all these types must have the same length n.
If the type of parameter i+1 of fun is T, then the type of the ith
sequence-value subexpression must have the vector type Vector<n, S> where S
is a subtype of T, or it must be the byte-vector type Bytes<n> and Uint<8>
must be a subtype of T.
The type of the entire expression is R.
Map expressions are evaluated by evaluating the sequence-value subexpressions from left to right to produce the input sequence values. The witness or circuit fun is then applied in turn, from index 0 up to index n-1, to arguments taken from the input sequence values. The result is a vector of length n where each ith element is the result of applying fun to the ith elements of the corresponding input sequence values.
Fold expressions are evaluated by evaluating the initial value expression init and then evaluating the sequence-value subexpressions from left to right. The values of the sequence-value expressions are the input sequence values. The witness or circuit fun is then applied in turn, from index 0 up to index n-1, to an accumulator value argument and arguments taken from the input sequence values. The 0th (initial) accumulator value is the value of the expression init, and each subsequent i+1th accumulator value is the result of applying fun to the ith accumulator value and to the ith elements of the corresponding input sequence values. The result is the nth (final) accumulator value where n is the length of the input vectors.
Type cast expressions
Values of one type can be cast to another, when allowed, via a cast, which
takes the form e as T where e is an expression and T is a type.
The effects of casting an expression e of type T to another type U depend
on the specific types involved.
For some types T and U, the cast is not allowed, which is a static error.
Even if the cast is allowed, some values of type T might not be representable
as values of type U.
In such cases, a run-time check of the value of e is required and can result
in a dynamic error.
In cases where the run-time representation of T is different from the run-time
representation of U, the value of e must be converted at run time from the
one representation to the other.
The type of any (allowed) cast e as T is, naturally, T.
Evaluation of a cast proceeds by evaluation of its subexpression, followed by any checks and conversions required by the cast. If a check fails, a dynamic error is reported.
TypeScript casts of the form <T>e are not supported in Compact.
The following rules govern when casts are allowed, when casts might cause dynamic errors, and when casts might require run-time conversions. Any cast not covered by one of the rules is not allowed.
Upcasts
-
Upcasts, i.e., casts from a type to a supertype (e.g., from
Vector<3, Uint<8>>toVector<3, Uint<16>>), are always allowed, never result in dynamic errors, and, except in the case where the type and the supertype are the same, might require run-time conversions. Upcasts are never required: it is always possible to use a value of type T as a value of a supertype of T.This rule is straightforward, but some implications regarding tuple and vector types are worth noting:
- Since
Vector<n, T>is the same type as[T, ⋯, T]of length n, a cast from either of these types to the other is allowed but not required. - If a tuple type
[T1, ⋯, Tn]has a vector typeVector<n, S>, then since the tuple type is a subtype of the vector type, a cast from the tuple type to the vector type is allowed but not required. For example, values of type[Uint<8>, Uint<16>]can be used where values of typeVector<2, Uint<16>>are required. - Since a vector type
Vector<n, T>is a subtype of any tuple type[S1, ⋯, Sn]where T is a subtype of each S, a cast from the vector type to the tuple type is allowed but not required. For example, values of typeVector<2, Uint<8>>can be used where values of type[Uint<8>, Uint<16>]are required.
- Since
Downcasts of Field and Uint types
- Any
Uintdowncast, i.e., any cast fromUint<0..m>toUint<0..n>, m > n, is allowed, requires a run-time check, and might require a run-time conversion. The same applies to anyFieldtoUintcasts where the maximumFieldvalue exceeds the maximum value of theUinttype. (If the maximum value of theUinttype exceeds the maximumFieldvalue, the behavior of the compiler is currently unspecified.)
Casts of Boolean to and from Field and Uint
- Any cast from
BooleantoFieldor anyUinttype is allowed, sometimes requires a run-time check, and requires a run-time conversion (offalseto0andtrueto1). A run-time check is required forUint<0..1>. - Any casts from
Fieldor anyUinttype toBooleanis allowed, cannot result in a dynamic error, and requires a simple run-time conversion (of0tofalseand any other value totrue).
Casts of Enum types to and from Field and Uint
- Any cast from an enum type to
Fieldor anyUinttype is allowed, sometimes requires a run-time check, and might require a run-time conversion. A run-time check is required for a cast from an enum type with n elements toUint<0..m>, n > m. - Any cast from
Fieldor anyUinttype to an enum type is allowed, sometimes requires a run-time check, and might require a run-time conversion. A run-time check is required for casts fromField. For an enum type with n elements, a run-type check is required for casts fromUint<0..m>, m > n.
Casts of Bytes to and from Field and Uint
- Any cast from
Bytes<m>, m != 0, toFieldor anyUinttype is allowed, requires a run-time check, and requires a run-time conversion. That is, the byte-vector value of the expression is converted into aFieldorUintvalue with the least-significant byte of the result corresponding to the first byte in the byte vector. A dynamic error occurs if the result exceeds the maximum value of theFieldorUinttype. - Any cast from
Fieldor anyUinttype toBytes<m>, m != 0, is allowed, requires a run-time check, and requires a run-time conversion. That is, the integer value of the expression is converted into a byte vector with the first byte of the byte vector corresponding to the least-significant byte of the integer. A dynamic error occurs if the result does not fit in m bytes.
Casts of Bytes to and from Vector and tuple types
- Any cast from
Bytes<m>toVector<m,Field>orVector<m,Uint<0..k>>, k ≥ 256 is allowed, cannot cause a dynamic error, but does require a run-time conversion. - Any cast from
Bytes<m>to tuple type[T1, ⋯, Tm], where each Ti isFieldorUint<0..k>for some k ≥ 256 is allowed, cannot cause a dynamic error, but does require a run-time conversion. - Any cast from
Vector<m,Uint<0..k>>, k ≤ 256, toBytes<m>is allowed, cannot cause a dynamic error, but does require a run-time conversion. - Any cast from tuple type
[T1, ⋯, Tm]toBytes<m>, where each Ti isUint<0..k>for some k ≤ 256 is allowed, cannot cause a dynamic error, but does require a run-time conversion.
Ledger state operations
A Compact program interacts with its public state by invoking operations upon the ledger. There are two different forms of ledger operations.
Kernel operations are operations that do not depend on specific ledger
state. They can be invoked by expressions of the form id.op(e, ⋯, e), where
id is the name of a ledger field declared to have the special ledger-state type Kernel,
op is the name of a builtin kernel operation, and
e, ⋯, e is a comma-separated sequence of zero or more argument expressions.
The CompactStandardLibrary predefines the ledger field name kernel to have
ledger-state type Kernel, so for example, the built-in self operation can be called
from a circuit as follows:
import CompactStandardLibrary;
circuit f(): ContractAddress {
return kernel.self();
}
The static type of a kernel operation expression is the return type of the corresponding kernel operation according to the the Compact ledger data type documentation.
Kernel operations are evaluated by evaluating the argument subexpressions in order from left to right and then invoking the corresponding kernel operation with the argument values.
Ledger-state operations are operations on the contract's declared public ledger-state.
In its simplest form, a ledger-state operation is an expression
F.op(e, ⋯, e), where F is a ledger field name
declared via a ledger field declaration,
op is the name of a ledger-state operator, and each e is an argument expression.
F can also be another ledger-state operation, allowing ledger-state operations
to be chained when the result of a ledger-state operation might itself have a
ledger-state type.
Compact also supports shorthands for certain ledger-state operations; these are
described below in Ledger-state-operation shorthands.
The static type of a ledger-state operation is the return type of the corresponding ledger-state operation according to the the Compact ledger data type documentation. This might be a Compact type or it might be a ledger-state type. A ledger-state operation whose return type is a ledger-state type must be chained, i.e., immediately subject to another ledger-state operation. It is a static error if this is not the case.
A ledger-state operation F.op(e, ⋯, e) is evaluated by evaluating F if F
is itself a ledger-state operation, then evaluating the argument subexpressions
from left to right, and then invoking the operation op with the argument
values on the portion of the public ledger state identified by F.
Ledger-state-operation shorthands
Compact supports several shorthands for common ledger-state operations.
The simplest of these is a shorthand for the ledger-state operation
F.read() where F is a ledger-state field name or itself another
ledger-state operation.
If the type of F is a ledger-state type that supports the read
operation, F.read() can be abbreviated, simply, to F.
The other shorthands all take the form of assignment statements and
abbreviate the ledger operations F.write(e), F.increment(e), and
F.decrement(e), where e is an expression.
If the type of F is a ledger-state type that supports the write
operation, F.write(e) can be abbreviated F = e.
Similarly, if the type of F is a ledger-state type that supports the
increment operation, F.increment(e) can be abbreviated F += e,
and if the type of F is a ledger-state type that supports the
decrement operation, F.decrement(e) can be abbreviated F -= e.
The typing and evaluation rules for the abbreviations are identical to expressions they abbreviate.
Asserts
An assertion has the form assert(e, msg) where e is an expression and msg
is a string message.
e must have a Boolean type, otherwise it is a static error.
The type of every assert form is [].
Evaluation proceeds by evaluating e.
If the value of e is true (the assertion succeeds), the assert form produces
the value [].
If, however, the value of e is false (the assertion fails), it is a dynamic
error, i.e., the assert form halts computation of the enclosing top-level
circuit or constructor with the message msg.
Each assertion is checked at run time and constrained in-circuit.
Explicit disclosure
Private data can enter a Compact contract via the arguments to exported circuits and via the return values of witnesses. Compact supports selective disclosure, in which the contract decides which private data, if any, should be disclosed in the transcripts and public state of the contract, and how that private data should be encoded.
Compact itself has no information about the nature of any private data and whether and how it is necessary to disclose it, the decision to do so rests with the contract developer. Compact does, however, require that such disclosure be explicitly declared to prevent unintentional disclosure, so that privacy is the default and disclosure is the exception.
The mechanism for explicitly declaring disclosure is simple: disclosure of private
data (exported circuit arguments, witness return values, and anything derived from
private data) must be acknowledged by wrapping an expression whose value contains
private data in a disclose() wrapper before storing it in the public state, returning
it from an exported circuit, or using it in conditional expressions that might
affect what is stored in the public state or returned from an exported circuit.
For example, the following contract attempts to disclose private data without
explicitly declaring the disclosure when it stores the value of the exported
parameter x into the ledger field F:
ledger F: Uint<16>;
export circuit setf(x: Uint<16>): [] {
F = x;
}
Compiling this program produces the following error message in Version 1.0 of the Compact compiler:
Exception: testfile.compact line 3 char 5:
potential witness-value disclosure must be declared but is not:
witness value potentially disclosed:
the value of parameter x of exported circuit setf at line 2 char 21
nature of the disclosure:
ledger operation might disclose the witness value
via this path through the program:
the right-hand side of = at line 3 char 5
The remedy in this case is simple: wrap the reference to x in a disclose()
wrapper.
ledger F: Uint<16>;
export circuit setf(x: Uint<16>): [] {
F = disclose(x);
}
The disclose() wrapper does not cause disclosure in itself; in fact, it has no
effect other than telling the compiler that it is okay to disclose the value of
the wrapped expression.
Equivalently, it tells the compiler to pretend that the value of the wrapped
expression does not contain witness data whether it actually does or not.
The compiler works hard to ensure there are no false alarms and also to provide precise and accurate information about the nature of the disclosure and how it occurs.
Often, the same value is disclosed along different paths, and the nature of the disclosure might be more obscure, particularly if it involves indirect disclosure via conditional expressions. Detailed advice about how to understand and remedy disclosures is beyond the scope of this document, but a few things are worth noting:
- It is often sufficient to correct first disclosure issue reported at a given disclosure if multiple disclosures are listed.
- Whenever feasible, the
disclose()wrapper should be placed as close to the actual disclosure site as possible. - When disclosing a hashed or otherwise obfuscated variant of witness data, place the disclosure somewhere on the path between input and disclosure after the obfuscation to help prevent disclosure of the unobfuscated input.
More details are given in Explicit Disclosure in Compact: The Midnight "Witness Protection Program".
Run-time data representation
The TypeScript representation of a Compact type is defined in Representations in TypeScript.
Compact represents values exactly as TypeScript represents values, i.e., as ordinary JavaScript values. So a Compact Boolean is represented at run time as a JavaScript boolean, a Compact tuple is represented as a JavaScript array, and enum values are represented by numbers.
To maintain type safety, Compact verifies at run time that values passed by an outside caller to an exported circuit or returned from an outside witness have the expected types. This is necessary even when the caller or witness is written in properly typed TypeScript because some Compact types have size and range limits that are not expressible via the TypeScript type system:
Fieldvalues are limited by a maximum field value.Uintvalues are limited by the declared bounds.Bytes,Vector, and tuple values are limited by their lengths.- Enum values are limited by the maximum index for the enum elements.
It is also necessary because compile-time type checks are easily defeated in TypeScript and are nonexistent when a caller or witness is coded in JavaScript.
Certain values to be stored in public state require size, length, and other
properties to be maintained explicitly, because these properties cannot
be determined solely from the JavaScript representation of the value.
For this purpose, the @midnight/compact-runtime package provides explicit
run-time types satisfying the CompactType<T> interface, where T is the
corresponding TypeScript type.
This representation is not user-facing most of the time, except
when replicating the behavior of the operations implemented
in @midnight/compact-runtime.
The following types or constructors can be used to create a CompactType instance
for a primitive type:
Boolean-CompactTypeBooleanField-CompactTypeFieldUint<0..n>-new CompactTypeUnsignedInteger(n,length), where length is the number of bytes required to store nUint<n>- same asUint<0..(2 ^n)>Bytes<n>-new CompactTypeBytes(n)Vector<n,T>-new CompactTypeVector(n,rt_T), where rt_T is the run-time type of TOpaque<"String">-CompactTypeStringOpaque<"Uint8Array">-CompactTypeUint8ArrayJubjubPoint-CompactTypeJubjubPoint.
For program-defined types, structures are not currently easily
constructed at run time and require implementing CompactType<T>
manually or using compiler internals. Enumerations are exposed
through new CompactTypeEnum(maxValue, length), where maxValue is
the maximum permissible integer assignable, and length its
representation length in bytes (almost always 1).
TypeScript target
When compiled, a contract generates several artifacts. Key to these are the exported circuits from the contract's top level. These are divided into two categories: pure circuits and impure circuits.
In the contract directory, the semantics of the contract is encoded in
TypeScript, in the form of a index.js JavaScript implementation file and a
index.d.ts type declaration file. For most uses, it is recommended to rely on the
information and interface provided in index.d.ts.
For each of the impure circuits that touches the public ledger, a zero-knowledge prover/verifier key pair is also
generated, as well as instructions for proof generation. These can be found
in the output directory's keys and zkir subdirectories respectively.
Structure of the exported TypeScript
The exported TypeScript exposes a number of declarations that can be used
to interact with the contract from any TypeScript application. Some of these
also require use of the @midnight/compact-runtime library, which all
contracts depend upon and which implements key built-in behaviors.
A contract exports the following in the TypeScript module:
- The TypeScript type corresponding to each program-defined type exported from the contract's top level
- A
Witnesses<PS>type, which describes the format external witnesses must satisfy to instantiate the contract - A
ImpureCircuits<PS>type, which describes the set of impure circuits exported from the contract's top level - A
ProvableCircuits<PS>type, which describes the set of circuits exported from the contract's top level that have verifier keys (i.e., exported circuits that touch the public ledger) - A
PureCircuitstype, which describes the set of pure circuits exported from the contract's top level - A
Circuits<PS>type, which describes the set of all exported circuits - A
Contract<PS = any, W extends Witnesses<PS> = Witnesses<PS>>class, which:- can be constructed by passing in an instance of
W - exposes members
circuits: Circuits<PS>,impureCircuits: ImpureCircuits<PS>, andprovableCircuits: ProvableCircuits<PS> - provides initial contract states via
initialState(context: __compactRuntime.ConstructorContext<PS>,x:T,y:T, ...): runtime.ConstructorResult<PS>where x:T,y:T, ...are the contract constructor parameters and their type declarations if they exist
- can be constructed by passing in an instance of
- A constant
pureCircuits: PureCircuitsobject, providing all pure circuits as pure functions - A
Ledgertype, providing views into a current ledger state, by permitting direct calls of all read functions ofledgerobjects, as well of some TypeScript specific ones that cannot be called from Compact, such as iterators - A
ledger(state: runtime.StateValue | runtime.ChargedState): Ledgerconstructor of theLedgertype, giving access to the values of exported ledger fields.
The argument PS for a number of these should be interpreted as the type of the
local/private state. For the most part, circuit and witness functions are
translated simply by translating their Compact types into corresponding
TypeScript types for parameters and return values. For PureCircuits, this is
all that happens, for the other _Circuits instances, they receive an
additional first parameter of type runtime.CircuitContext<PS>, and their result
type R is wrapped in runtime.CircuitResults<PS, R>. For Witnesses, they
receive an additional first parameter of type runtime.WitnessContext<Ledger, PS>, and their result type R is wrapped in [PS, R]. See the runtime API
docs for the
details of these types. This wrapping makes the entirety of the contract code
functional, ensuring calls have no hidden side effects.
Representations in TypeScript
Compact's primitive types are represented in TypeScript as follows:
Boolean-booleanField-bigintwith run-time bounds checksUint<n>/Uint<0..n>-bigintwith run-time bounds checks[T, ...]- the TypeScript tuple type[S, ...]or else the TypeScript array type S[]with run-time length checks, where S is the TypeScript representation of the corresponding type TBytes<n>-Uint8Arraywith run-time length checksOpaque<"string">-stringOpaque<"Uint8Array">-Uint8Array
Program-defined types are represented in TypeScript as follows:
enuminstances - anumberwith run-time membership checksstructinstances with fieldsa: A, b: B, ...- an object{ a: A, b: B, ... }whereA,B, ... are the TypeScript representations of the Compact types.
Note that other Opaque types are currently not supported.
Implementation-specific limits
Compact compiler version 1.0 limits certain datatype values and sizes:
- The maximum value of a
Fieldvalue is 52435875175126190479447740508185965837690552500527637822603658699938581184512. This value is dictated by the scalar field size of the ZK proving system. - The maximum value of a
Uintvalue is 452312848583266388373324160190187140051835877600158453279131187530910662655. This value is 256k-1 where k is the number of bytes (31) that fits in aField. - The maximum length of a vector or byte vector is 16777216. This value is chosen as a reasonable upper bound that helps prevent the compiler from looping indefinitely on programs that operate on excessively large vector and byte-vector types.
These limits may change in future versions of the compiler.
External resources
Compact formal grammar
See here.
Compact keywords and reserved words
See here.
Compact ledger-state types and operations
See here.
Runtime API
See here
Compiler usage
See here.
Explicit disclosure in Compact
See here.
Writing a contract in Compact
See here.