CSPM files consist of a number of definitions, which are described below.
Some of these definitions can also be given at the FDR
The Interactive Prompt and within Let
expressions. In this Section
we give an overview of the type of declarations allowed in CSPM. A
declaration list is simply a list of the declarations in this section,
separated by newlines.
The simplest type of definition in CSPM binds the value of an expression to a pattern. In particular, if p
is
a pattern and e
is an expression, both of some type a
, then p = e
matches the value of e
against p
and, if p
does match then binds all
the variables of p
, otherwise throws an error. For example, x = 5
binds x
to 5
, (x, y) = (5, f(5))
binds x
to 5
and y
to
f(5)
. Alternatively, given <x> = <>
the following error is thrown upon
evaluating x:
Pattern match failure: Value <> does not match the pattern <x>
CSPM provides a rich syntax for defining functions that is highly expressive.
The simplest example of a function is the identify function, which simply
returns its argument unaltered. This can be written in CSPM as id(x) = x
and
has type
. CSPM also allows function definitions to use
pattern matching to define functions. For example, given
the definition of (a) -> a
f
as:
f(0) = 1
f(1) = 2
f(_) = error("Disallowed argument")
Then f(0)
evaluates to 1
, f(1)
evaluates to 2
, whilst any other
argument evaluates to an error. Functions can also take multiple arguments,
separated by commas. Thus, f(x,y) = x*y
defines the multiplication function.
CSPM also provides support for curried function definitions. For example,
f(x)(y) = x+y
means that f
is of type
(noting that (
Int
) -> (
Int
) ->
Int
->
is right associative). Evaluating f(4)
yields a function
to which the second argument may be passed. Thus, f(4)(2) = 6
.
Functions and constants may have
their type specified by providing a type annotation on a separate line to
the main definition. For example, the following specifies that the function
f
has the type of the identity function:
f :: (a) -> a
It is also possible to specify that a type variable has certain type
constraints. For example, the following requires that g
is a function that
takes two arguments of the same type that must satisfy the Eq
type constraint:
g :: Eq a => (a, a) -> Bool
The type annotations, in addition to being a useful way of documenting programs,
are used by the type-checker to guide it to deducing the correct type,
particularly for functions that use Dot
in various complex ways. Further,
the use of type annotations will result in more useful errors being emitted.
In general, a type annotation is a line of the form:
n1, n2, ..., nM :: Type
n1, n2, ..., nM :: TypeConstraint a => Type
n1, n2, ..., nM :: (TypeConstraint a, ..., TypeConstraint a) => Type
where the ni
are names whose definitions are given at the same lexical level
(i.e. if the type of f
is declared inside a let expression, then f
must
also be declared inside exactly the same let expression); TypeConstraint
is
a type constraint; Type
is a Type System Type
expression.
Type variables within type annotations are scoped in the same way as variables. For example, consider the following definition, which contains a let expression:
f :: Set a => (a) -> {a}
f(x) =
let
g :: (a) -> {a}
g(x) = {x}
within g(x)
In the above, the type variable a
in the type annotation for g
is
precisely the same type variable as in the type annotation for f
. Hence,
the type annotation for g
does not require the Set
type constraint
since this has already been specified in the type annotation for f
(in
general, type constraints may only be specified in the type annotation where
a type variable is created).
Warning
Specifying type annotations for non-existent names will result in an error, as will specifying multiple type annotations for the same name. Further, type constraints may only be specified in the type annotation that creates the type variable.
See also
New in version 3.0.
CSPM also allows structured datatypes to be declared, which are similar to
Haskell’s data
declarations. The simplest kind of datatype declaration
simply declares constants:
datatype NamedColour = Red | Green | Blue
This declares Red
, Green
and Blue
as symbols of type
NamedColour
, and binds NamedColour
to the set {Red, Green, Blue}
.
Datatypes can also have parameters. For example, we could add a RGB colour data
constructor as follows:
datatype ComplexColour = Named.NamedColour | RGB.{0..255}.{0..255}.{0..255}
This declares Named
to be a data-constructor, of type NamedColour =>
ComplexColour
, RGB
to be a data-constructor of type Int => Int => Int =>
RGB
and ComplexColour
to be the set:
union({Named.c | c <- NamedColour},
{RGB.r.g.b | r <- {0..255}, g <- {0..255}, b <- {0..255}})
Thus, in general, if a datatype T
is declared, then T
is bound to the
set of all possible datatype values that may be constructed. Note that, as
mentioned in Dot
, if an invalid data-value is constructed then an error
is thrown. Thus, constructing RGB.1000.0.0
would throw an error, as 1000
is not in the permitted range of values. This is the primary difference between
datatypes in other languages and CSPM: CSPM requires the actual set of values
allowed at runtime to be declared, whilst other languages merely require the
type. This is done to allow Prefix
to be used more conveniently.
One important consideration is that the datatype must be rectangular, in that in must be decomposable into a Cartesian product. For example, consider the following:
datatype X = A.{x.y | x <- {0..2}, y <- {0..2}, x+y < 2}
This datatype is not rectangular as the datatype declaration cannot be rewritten
as the Cartesian product of a number of sets, since A.0.1
is valid, whilst
A.1.1
is not. This will result in the following error being thrown:
./test.csp:1:14-57:
The set: {0.0, 0.1, 1.0}
cannot be decomposed into a cartesian product (i.e. it is not rectangular).
The cartesian product is equal to: {0.0, 0.1, 1.0, 1.1}
and thus the following values are missing: {1.1}
Datatypes can also be recursive. For example, the type of binary trees storing integers can be declared as follows:
datatype Tree = Leaf.Int | Node.Int.Tree.Tree
For example, the following function flattens a binary tree to a list of its contents:
flatten(Leaf.x) = <x>
flatten(Node.x.l.r) = flatten(l)^<x>^flatten(r)
This function is of type (Tree) -> <Int>
.
In general a CSPM datatype declaration takes the following form:
datatype N = C1.te1 | C2.te2 | ...
where N is the datatype name, the tei
are optional and are
Type Expressions (which differ from regular expressions) and the
Ci
are the clause names. As a result of this, each Ci
is bound to a
datatype constructor that when dotted with appropriate values (according to
tei
), yield a datatype of type N
. In particular, if tei
is of type
{t1.(...).tN}
, then Ci
is of type t1 => ... => tN => N
. Further,
N
is bound to the set of all valid datatypes values of type N
.
Note
CSPM datatypes cannot be parametric, so it is not possible, for instance, to declare a binary tree that stores any type at its node.
See also
Variable Pattern
for a warning on channel names to avoid.
The expressions in datatype and channel declarations are interpreted differently
to regular expressions. Type expressions can either be any
Expressions that evaluates to a set or a dot separated list of type
expressions, or a tuple of type expressions. Thus a type expression te
is
of the form:
e | (te1, ..., teN) | te1.(...).teN
where e
denotes an :ref`expression <csp_expression>` of type {a}
. A type
expression of the form (te1, ..., teN)
constructs the set of all tuples
where the ith element is drawn from tei
. For example, ({0..1},
{2..3})
evaluates to {(0, 2), (0, 3), (1, 2), (1, 3)}
. A type expression
of the form te1.(...).teN
evaluates like a tuple type expression, but
instead dots together the value. Thus, {0..1}.{2..3}
evaluates to {0.2,
0.3, 1.2, 1.3}
. For example, consider the following datatype declarations:
datatype X = A.(Bool, Bool)
datatype Y = B.Bool.Bool
This means that X
is equal to the set {A.(false, false), A.(false, true),
A.(true, false), A.(true, true)}
, whilst Y
is equal to the set
{B.false.false, B.false.true, B.true.false, B.true.true}
.
CSPM channels are used to create events and are declared in a very similar way to datatypes. For example:
channel done
channel x, y : {0..1}.Bool
declares three channels, one that takes no parameters (and hence done
is of
type Event
), and two that have two components: a value from {0.1}
and a boolean. Thus, the set of events defined by the above is {done,
x.0.false, x.1.false, x.0.true, x.1.true, y.0.false, y.1.false, y.0.true,
y.1.true}
. These events can then by used by Prefix
to declare processes
such as P = x?a?b -> STOP
.
In general, channels are declared using the following syntax:
channel n1, ..., nM : te
where te
is a Type Expressions and the ni
are unqualified
names
. As a result of this, n1
, nM
are bound to event
constructors that when dotted with appropriate values (according to te
),
yield an event``. In particular, if te
is of type {t1.(...).tN}
, then
each ni
is of type t1 => ... => tN => Event
.
See also
Variable Pattern
for a warning on channel names to avoid.
CSPM allows subtypes to be declared, which bind a set to a subset of datatype values. For example, consider the following:
datatype Type = Y.Bool | Z.Bool
subtype SubType = Y.Bool
This creates a datatype, as above, but additionally binds
SubType
to the set {Y.false, Y.true}
. In general a subtype takes the
following form:
subtype N = C1.te1 | C2.te2 | ...
where N
is the name of the subtype, the Ci
are the names of existing
data constructors and the tei
are Type Expressions. Note that
the Ci.tei
must all be of some common type T
, which must be the type of
a datatype (e.g., in the above example, Y.Bool
is of type Type
).
Named types simply associate a name with a set of values, defined using type expressions. For example:
nametype X = {0..1}.{0..1}
binds X
to the set {0.0, 0.1, 1.0, 1.1}
. This is no more powerful than
a Set Comprehension
, but as it uses Type Expressions, it can
be significantly more convenient. The general form of a nametype is:
nametype X = te
where te
is a Type Expressions.
CSPM also allows various assertions to be defined in CSPM files. These are then added to the list of assertions in FDR in order to allow convenient execution of the assertions. FDR permits several different forms of assertions, as described below. Further, options such as partial order reduction may be specified to. See Assertion Options for further details.
The simplest assertion in CSPM are refinement assertions, which are lines of the form:
assert P [T= Q
The above will cause FDR to check whether \(P \sqsubseteq_T Q\). The semantic model can be any of the following:
[T=
;[F=
;[FD=
.It is possible to assert that a process is deadlock free, in a particular
semantic model. In all cases, FDR internally converts this into a refinement
assertion of the form \(DF(A) \sqsubseteq P\), where A
is the
alphabet of events that P
performs and DF(A)
is the most
non-deterministic process over A
, i.e. \(DF(A) =
\repintchoice_{x \in A} x \prefix DF(A)\). Intuitively, in the failures model,
this means that the process can never get into a state where no event is
offered. This assertion can be written as:
assert P :[deadlock free]
which defaults to checking the assertion in the failures-divergences model, or a particular semantic model can be specified using:
assert P :[deadlock free [F]]
which insists that the failures model will be used to check the assertion. Note that only the failures and failures-divergences models are supported for deadlock freedom assertions.
Hint
If the process P
is known to be divergence free, then checking the
deadlock freedom assertion in the failures model, instead of the failures
divergences model will result in increased performance. If P
is not
known to be divergence free, then separately checking that P
is
livelock free and that P
satisfies an appropriate (livelock free)
traces or failures specification will, again, result in increased
performance.
In the failures-divergences model it is also possible to check if a process
can diverge, i.e. perform an infinite amount of internal work (this is also
known as a livelock freedom check). FDR converts all such checks into a
refinement assertion of the form \(CHAOS(A) \sqsubseteq P\), where A
is the alphabet of the process P
. This essentially says that the process
can have arbitrary behaviour, but may never diverge. This can be written as:
assert P :[divergence free]
which defaults to checking in the failures-divergences model, or a particular semantic model can be specified as follows:
assert P :[divergence free [FD]]
which insists that the failures-divergences model will be used to check the assertion. Note that only the failures-divergences model is supported for divergence freedom assertions.
FDR can be used to check if a given process is deterministic by asserting:
assert P :[deterministic]
The above assertion will check if P
is deterministic in the failures-
divergences model. As with other property checks, a specific model can be
specified as follows:
assert P :[deterministic [FD]]
Note that FDR considers a process \(P\) to be deterministic providing no witness to non-determinism exists, where a witness consists of a trace \(tr\) and an event \(ev\) such that \(P\) can both accept and refuse \(a\) after \(tr\). Formally, in the failures model, \(P\) is non-deterministic iff \(\exists tr, a \cdot tr \verb!^! \langle a \rangle \in traces(P) \land (tr, \{a\}) \in failures(P)\). In the failures-divergences model, \(failures\) is replaced by \(failures_{\perp}\).
Internally, for the failures-divergences model, FDR actually constructs a
deterministic version of the process P
(using deter
) and then
checks if deter(P) [FD= P
. Thus, whilst the compilation phase can appear
to take a long time to finish, the checking phase is typically faster. For
the failures model, FDR uses Lazic’s determinism check which runs two copies
of the process in parallel. If the process contains a lot of taus, then this
can cause the number of states that need to be checked to increase greatly.
FDR has support for asserting that a process has a certain trace. For instance:
assert P :[has trace]: <a, b, c>
asserts that P
can perform the trace <a, b, c>
in the failures-divergences model. This assertion supports
the following semantic models:
The traces model, written as :[has trace [T]]:
. This means that P
must be able to
perform this trace.
The failures model, written as :[has trace [F]]:
. This means that P
must not be
able to refuse to perform this trace. For instance:
assert STOP |~| a -> STOP :[has trace [F]]: <a>
would fail in the failures model, since the process may refuse to perform the a
.
The failures-divergences model, written as :[has trace [FD]]:
. This means
that P
must not be able to refuse to perform this trace, and may not diverge whilst performing it.
The intention is that this assertion is used for simple unit tests of processes, rather than as a specification in itself.
New in version 3.3.0.
assert
not P [T= Q
. Note that if a negated assertion fails, it is not possible to
debug it since this merely implies that the underlying assertion passed.
Conversely, a passing negated assertion can be debugged.It is also possible to specify options to the assertions. Not all options are supported by all assertions (and there are no general rules for this in many cases), so FDR will throw an error when it is unable to support a particular assertion option.
Warning
Partial order reduction is still experimental, and improvements are still required, particularly to performance.
FDR can use partial order reduction to attempt to automatically reduce the size of the state space of a system in a safe manner. For example, the assertion:
assert P [T= Q :[partial order reduce]
is precisely the same as the standard trace assertion, but FDR will attempt
to automatically reduce the state space. Partial order reduction also has
three difference modes precise
(default), hybrid
and fast
. This
achieve progressively smaller reductions in the state space, but should run
faster. The mode can be selected as follows:
assert P [T= Q :[partial order reduce [hybrid]]
Partial order reduction will only work on some examples. Generally, it is simplest to try it and to see what state space reduction and time difference it causes. Further, if partial order reduction works well on one instance of a problem, it will also work on larger instances.
In general, partial order reduction will be effective on examples where there
is a parallel composition where each process, or group of processes, has a
number of events that are independent of the events that other processes
perform. For example, dinning philosophers
achieves excellent partial order reduction because each philosopher has a
number of events that do not conflict with other philosophers (e.g.
thinks.i
, eats.i
etc). Partial order reduction will remove the
redundant interleavings.
The degree of reduction that partial order reduction can achieve is also affected by the specification in question. It is most efficient when given a deadlock-freedom assertion in the failures model. In general, refinement assertions will achieve less reduction. To improve the amount of reduction that can be achieved in refinement checks, the specification alphabet should be made as small as possible.
New in version 3.1.
This allows tau to be given priority over a specific set of events. For example, the assertion:
assert P [T= Q :[tau priority]: {tock}
is equivalent to the assertion:
assert prioritise(P, <{}, {tock})>) [T= prioritise(Q, <{}, {tock})>)
For further information see prioritise
.
New in version 2.94.
A number of functions within FDR are available only if they are imported using
transparent
or external
. Transparent functions are all applied to
processes, and are semantics preserving. External functions are either
non-semantics preserving process functions, or are utility functions provided
by FDR. See Compression Functions for examples of transparent functions, and
mtransclose
for an example of an external function.
Transparent and external functions may be imported into a script as follows: Transparent and External Functions
CSPM also supports a limited version of modules that allow code to be
encapsulated, to avoid leaking. For example, the following declares a simple
module A
:
module A
X = 2
Y = X + 2
exports
Z = 2 + Y + X
endmodule
f(x) = A::Z
As seen above, modules can have a number of private definitions (in this case
X
and Y
are private and can only be referred to by Z
), and public
or exported definitions (in the above case, just Z
). Exported definitions
are accessible in other parts of the same script using their fully qualified
name, which consists of ModuleName::VariableName
. The general syntax for
modules is as follows:
module ModuleName
<private declaration list>
exports
<public declaration list>
endmodule
where both declaration lists can contain any declarations with the exception of assertions. Modules may also be nested, as the following example shows:
module M1
X = 1
exports
Y = 1 + M2::Y
module M2
X = 2
exports
Y = 1 + X
endmodule
endmodule
f = 1 + M1::Y + M1::M2::Y
Nested modules can either be public or private.
Modules may also be defined to take arguments. For example, consider the module:
module M1(X)
check(x) = member(x, X)
exports
f(y) = if check(y) then "Inside X" else "Outside of X"
datatype X = Y | Z
endmodule
The above defines a module M1
that takes a single parameter X
, which
must be of type
, for some type {a}
. The arguments
in the module definition are in scope within all of the declarations defined
within the module. Instances of the module may then be created to call the
functions within a particular instance of the module. For example, if the
following module instance is declared:a
instance M2 = M1({0})
the expression M2::f(0)
would evaluate to “Inside X”. Note that declarations
inside parametrised modules are accessible only via a module instance (thus
M1::f(0)
is an invalid expression). If a datatype is declared inside a
parametrised module, then different module instances will contain distinct
versions of the datatype. For example give the module instances:
instance M2 = M1({0})
instance M3 = M1({0})
M2::X
and M3::X
are not of the same type and hence the expression
M2::X == M3::X
would result in a type error.
More generally, a module that takes N arguments can be declared by:
module ModuleName(Arg1, ..., ArgN)
<private declaration list>
exports
<public declaration list>
endmodule
An instance of a module that takes N arguments can be declared as follows:
instance ModuleInstanceName = ModuleName(E1, ..., EN)
where E1
and EN
are expressions of a type appropriate for
ModuleName
. Instances of modules that do not take parameters can also be
created simply by eliding the (E1, ..., EN)
portion of the instance
declaration.
Warning
An instance declaration can only appear strictly after the definition of the module in the input file.
Tock CSP is a discretely-timed variant of CSP that uses the event tock
to
model the passage of time. In order to specify tock-CSP processes, CSPM
includes a timed section syntax that automatically translates CSP processes
to tock-CSP. For example:
channel tock
OneStep(_) = 1
Timed(OneStep) {
P = a -> P
}
P' = a -> tock -> P' [] tock -> P'
will translate P
into tock-CSP. The resulting process will, in fact, be
equivalent to P'
(see below for the full translation).
In general, timed sections are written as:
Timed(expression) {
declarations
}
where:
declarations
Let
or
module, then the usual restrictions apply). Note that
within these declarations timed_priority
and WAIT
may be
used.expression
(
Event
) ->
Int
. This function returns the number of time units
that the given event takes to complete. For example, in the above example,
each event is defined as taking 1 time unit (note that this can be defined
using a Lambda
).Before using a timed section, tock
must be declared as an event. Failure to
do so, or defining tock as something of an incompatible type, will result in a
type-checker error.
The definitions inside the timed section are translated into tock-CSP according to the following translation, assuming that the timing function is called time:
Process | Translated To |
---|---|
STOP |
TSTOP , where TSTOP = tock -> TSTOP . |
SKIP |
TSKIP , where TSKIP = SKIP [] tock -> TSKIP . |
a -> P |
X = tock -> X [] a -> tock -> ... -> tock -> P where time(a)
tocks occur after the a . |
c?x -> P(x) |
X = tock -> X [] c?x -> tock -> ... -> tock -> P(x) where
time(c.x) tocks occur after the c.x . |
P [] Q |
P [+ {tock} +] Q |
b & P |
if b then P else TSTOP |
P [A || B] Q |
P [union({tock},A) || union({tock},B)] Q |
P [| A |] Q |
P [| union({tock},A) |] Q |
P ||| Q |
P [| {tock} |] Q |
P /\ Q |
P /+ {tock} +\ Q |
P [+ A +] Q |
P [+ union({tock}, A) +] Q |
P /+ A +\ Q |
P /+ union({tock}, A) +\ Q |
The remaining operators, i.e. Exception
, Hide
,
Internal Choice
, Rename
, Sequential Composition
and
Sliding Choice
, are not affected by the
translation. All the replicated operators are translated analogously to the
above.
Warning
At the present time, Linked Parallel
and the Non-deterministic Input of
Prefix
are not permitted inside timed sections.
See also
Synchronising External Choice
External Choice
into tock-CSP.Synchronising Interrupt
Interrupt
into tock-CSP.New in version 2.94.
Changed in version 3.0: In FDR2, tock
was automatically defined when a timed section was
encountered. FDR3 requires tock
to be defined to be defined by the user
as an event, in order to allow timed sections in more contexts. In addition,
FDR2 defined TOCKS as a global constant in a file that mentioned timed
sections; FDR3 no longer does so.
If a CSPM file contains a statement of the form:
print 1+1
print head(<5..>)
then FDR will add the statements to a list on the right hand side of the session window, thus allowing the expressions to be easily evaluated.
Sometimes it can helpful to split a single CSPM file into several files,
either because some code is common to several problems, or to simply break up
large file. This can be accomplished by using an include "file.csp"
expression in the file. For example, if file1.csp
and file2.csp
are in
the same directory, and file1.csp
contains:
include "file2.csp"
f = g + 2
and file2.csp
contains:
g = 2
then this is equivalent to a single file that contains:
g = 2
f = g + 2
Each include
statement must be on a separate line and all paths are
relative to the file that contains a particular include
statement.