1 Summary

Higher-order languages have been widely studied in functional programming, following the \(\lambda \)-calculus. In a higher-order calculus, variables may be instantiated with terms of the language. When a variable has multiple occurrences, this mechanism results in the possibility of copying terms of the language.
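
As a minimal illustration of this copying mechanism (a standard example, not taken from the references above), a function whose bound variable occurs twice duplicates its argument when applied:
\[
(\lambda x.\, x\, x)\, M \longrightarrow M\, M .
\]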

Proving the equivalence of computer programs is an important but challenging problem. Equivalence between two programs means that the programs should behave “in the same manner” under any context [Mor68]. Finding effective methods for equivalence proofs is particularly challenging in higher-order languages: pure functional languages like the \(\lambda \)-calculus, and richer languages including non-functional features such as non-determinism, information hiding mechanisms (e.g., generative names, store, data abstraction), concurrency, and so on.

Bisimulation [Par81a, Par81b, Mil89, San09, San12] has emerged as a very powerful operational method for proving equivalence of programs in various kinds of languages, due to the associated co-inductive proof method. Further, a number of enhancements of the bisimulation method have been studied, usually called up-to techniques. To be useful, the behavioral relation resulting from bisimulation—bisimilarity—should be a congruence. Bisimulation has been transplanted onto higher-order languages by Abramsky [Abr90]. This version of bisimulation, called applicative bisimulation, and variants of it have received considerable attention [Gor93, GR96, Pit97, San98, Las98]. In short, two functions \(P\) and \(Q\) are applicatively bisimilar when their applications \(P(M)\) and \(Q(M)\) are applicatively bisimilar for any argument \(M\).
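
Schematically, and glossing over details (the notation below is ours, not fixed by the references above), a relation \(\mathcal{R}\) on closed terms is an applicative simulation if
\[
P \mathrel{\mathcal{R}} Q \;\wedge\; P \Downarrow \lambda x.\,P' \;\Longrightarrow\; \exists Q'.\; Q \Downarrow \lambda x.\,Q' \;\wedge\; \forall M \text{ closed},\; P'\{M/x\} \mathrel{\mathcal{R}} Q'\{M/x\} ,
\]
and applicative bisimilarity is the largest relation \(\mathcal{R}\) such that both \(\mathcal{R}\) and its converse are applicative simulations.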

Applicative bisimulations have some serious limitations. First, they are unsound in the presence of generative names [JR99] or data abstraction [SP05], because they apply bisimilar functions to an identical argument. Second, congruence proofs of applicative bisimulations are notoriously hard. Such proofs usually rely on Howe’s method [How96]. The method appears however rather subtle and fragile, for instance in the presence of generative names [JR99], non-determinism [How96], or concurrency (e.g., [FHJ98]). Also, the method is very syntactic and provides little intuition about when and why it works. Related to the problems with congruence are also the difficulties of applicative bisimulations with “up-to context” techniques (the usefulness of these techniques in higher-order languages and their problems with applicative bisimulations have been extensively studied by Lassen [Las98]; see also [San98, KW06]).

Congruence proofs for bisimulations usually exploit the bisimulation method itself to establish that the closure of the bisimilarity under contexts is again a bisimulation. To see why, intuitively, this proof does not work for applicative bisimulation, consider a pair of bisimilar functions \(P_1, Q_1\) and another pair of bisimilar terms \(P_2, Q_2\). In an application context they yield the terms \(P_1 P_2\) and \(Q_1 Q_2\) which, if bisimilarity is a congruence, should be bisimilar. However, the arguments supplied to the functions \(P_1\) and \(Q_1\), namely \(P_2\) and \(Q_2\), are bisimilar but not necessarily identical; hence we are unable to apply the bisimulation hypothesis to the functions.
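
Schematically, using the clause sketched above (again with our notation), congruence would require
\[
P_1 \mathrel{\mathcal{R}} Q_1 \;\wedge\; P_2 \mathrel{\mathcal{R}} Q_2 \;\stackrel{?}{\Longrightarrow}\; P_1 P_2 \mathrel{\mathcal{R}} Q_1 Q_2 ,
\]
whereas the applicative clause only tells us something about \(P_1\) and \(Q_1\) when they are applied to one and the same argument \(M\).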

Proposals for improving applicative bisimilarity include environmental bisimulations [SKS11, KLS11, PS12] and logical bisimulations [SKS07]. A key idea of environmental bisimulations is to make a clear distinction between the tested terms and the environment. An element of an environmental bisimulation has, in addition to the tested terms, a further component, the environment, which expresses the observer’s current knowledge. (In languages richer than pure \(\lambda \)-calculi, there may be other components, for instance to keep track of generated names.) The bisimulation requirements for higher-order inputs and outputs naturally follow. For instance, in higher-order outputs, the values emitted by the tested terms are published to the environment, and are added to it, as part of the updated current knowledge. In contrast, when the tested terms perform a higher-order input (e.g., in \(\lambda \)-calculi the tested terms are functions that require an argument), the arguments supplied are terms that the observer can build using the current knowledge; that is, terms obtained by composing the values currently in the environment using the operators of the calculus.
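
As a rough sketch in a pure \(\lambda\)-calculus (notation ours, with several details elided), an element \((\mathcal{E}, P, Q)\) of an environmental bisimulation, with environment \(\mathcal{E}\), is subject to two requirements: if \(P\) evaluates to a value \(V\), then \(Q\) must evaluate to a value \(W\) and the pair \((V, W)\) is added to \(\mathcal{E}\), reflecting the observer's increased knowledge (the output clause); and for every pair of abstractions \((\lambda x.\,P', \lambda x.\,Q')\) in \(\mathcal{E}\) and every pair of arguments \((A, B)\) in the context closure of \(\mathcal{E}\), the terms \(P'\{A/x\}\) and \(Q'\{B/x\}\) must again be related under \(\mathcal{E}\) (the input clause).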

A possible drawback of environmental bisimulations over, say, applicative bisimulations, is that the set of arguments to related functions that has to be considered in the bisimulation clause is larger (since it also includes non-identical arguments). A remedy to this is offered by up-to techniques (in particular, techniques involving up-to contexts), which are easier to establish for environmental bisimulations than for applicative bisimulations, and which allow us to considerably enhance the bisimulation proof method.
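
For instance, in an up-to-context technique (described here only schematically), the derivatives of related terms are not required to be in the candidate relation \(\mathcal{R}\) itself, but only in its context closure \(\mathcal{C}(\mathcal{R})\); this can drastically reduce the size of the relation one has to exhibit, since common surrounding contexts can be discarded.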

The difference between environmental bisimulations and logical bisimulations is that the latter do not make use of an explicit environment: the environment is implicitly taken to be the set of pairs forming the bisimulation. This simplifies the definition, but has the drawback of making the bisimulation functional non-monotone. In \(\lambda \)-calculi one is usually able to show that the functional nevertheless has a greatest fixed point, which coincides with contextual equivalence. But in richer languages this does not appear to be possible.
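
Schematically (again with our notation), the clause for related functions in a logical bisimulation \(\mathcal{R}\) asks that, for all arguments \((A, B)\) in the context closure of \(\mathcal{R}\) itself, the resulting terms be again related; since \(\mathcal{R}\) thus occurs both as the source of the arguments to be tested and as the target where the derivatives must land, enlarging \(\mathcal{R}\) creates new obligations as well as new freedom, and the associated functional need not be monotone.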

For bisimulation and coinductive techniques, a non-trivial extension of higher-order languages concerns probabilities. Probabilistic models are more and more pervasive. Not only are they a formidable tool when dealing with uncertainty and incomplete information, but they are sometimes a necessity rather than an option, as in computational cryptography (where, e.g., secure public key encryption schemes need to be probabilistic [GM84]). A natural way to deal computationally with probabilistic models is to allow probabilistic choice as a primitive when designing algorithms, thus switching from the usual, deterministic computation to a new paradigm, called probabilistic computation. Examples of application areas in which probabilistic computation has proved useful include natural language processing [MS99], robotics [Thr02], computer vision [CRM03], and machine learning [Pea88].

This new form of computation, of course, needs to be made available to programmers. And indeed, various programming languages have been introduced in recent years, ranging from abstract ones [JP89, RP02, PPT08] to more concrete ones [Pfe01, Goo13], and inspired by various programming paradigms, whether imperative, functional, or even object-oriented. A quite common scheme consists in endowing a deterministic language with one or more primitives for probabilistic choice, such as binary probabilistic choice or primitives for distributions.
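
A typical such primitive (in a standard notation, not fixed by the references above) is a binary choice operator \(\oplus\): the term \(M \oplus N\) evaluates to \(M\) with probability \(\frac{1}{2}\) and to \(N\) with probability \(\frac{1}{2}\), so that the evaluation of a term yields a (sub)distribution over values rather than a single value.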

One class of languages which copes well with probabilistic computation is that of functional languages. Indeed, viewing algorithms as functions allows a smooth integration of distributions into the picture, nicely reflected at the level of types through monads [GAB+13, RP02]. As a matter of fact, many existing probabilistic programming languages [Pfe01, Goo13] are designed around the \(\lambda \)-calculus or one of its incarnations, like Scheme. All of these allow one to write higher-order functions (programs can take functions as inputs and produce them as outputs).

Bisimulation and context equivalence in a probabilistic \(\lambda \)-calculus have been considered in [ALS14], where a technique is proposed for proving congruence of probabilistic applicative bisimilarity. While the technique follows Howe’s method, some of the technicalities are quite different, relying on non-trivial “disentangling” properties for sets of real numbers, these properties being themselves proved with tools from linear algebra. The bisimulation is proved to be sound for contextual equivalence. Completeness, however, fails: applicative bisimilarity is strictly finer. A subtle aspect is also the late vs. early formulation of bisimilarity: with a choice operator the two versions are semantically different, and the congruence proof of bisimilarity crucially relies on the late style.
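
Roughly, and glossing over the precise formulation of [ALS14], the bisimulation is defined in the style of Larsen and Skou on a labelled Markov chain whose states include terms and values: an equivalence relation \(\mathcal{R}\) is a probabilistic applicative bisimulation if, whenever \(P \mathrel{\mathcal{R}} Q\), evaluating \(P\) and evaluating \(Q\) assign the same probability to each \(\mathcal{R}\)-equivalence class of values, and related abstractions remain related after application to any (identical) closed argument.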

Context equivalence and bisimilarity, however, coincide on pure \(\lambda \)-terms. The resulting equality is that induced by Levy-Longo trees (LLT), generally accepted as the finest extensional equivalence on pure \(\lambda \)-terms under a lazy regime. The proof follows Böhm-out techniques along the lines of [San94, SW01]. The result is in sharp contrast with what happens under a nondeterministic interpretation of choice (or in the absence of choice), where context equivalence is coarser than LLT equality.

A coinductive characterisation of context equivalence on the whole probabilistic language is possible via an extension in which weighted formal sums — terms akin to distributions — may appear in redex position. Thinking of distributions as sets of terms, the construction reminds us of the reduction of nondeterministic to deterministic automata. The technical details are however quite different, because we are in a higher-order language and therefore — once more — we are faced with the congruence problem for bisimulation, and because formal sums may contain an infinite number of terms. The proof of congruence of bisimulation in this extended language uses the technique of logical bisimulation, therefore allowing bisimilar functions to be tested with bisimilar (rather than identical) arguments (more precisely, the arguments should be in the context closure of the bisimulation). In the probabilistic setting, however, the ordinary logical bisimulation game has to be modified substantially. For instance, formal sums represent possible evolutions of running terms, hence they should appear in redex position only (allowing them anywhere would complicate matters considerably). The obligation of redex position for certain terms is in contrast with the basic schema of logical bisimulation, in which related terms can be used as arguments to bisimilar functions and can therefore end up in arbitrary positions. This problem is solved by moving to coupled logical bisimulations, where a bisimulation is formed by a pair of relations, one on ordinary terms, the other on terms extended with formal sums. The bisimulation game is played on both relations, but only the first relation is used to assemble input arguments for functions.
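
To give an idea (the notation is ours), a formal sum can be thought of as a weighted sum \(\sum_{i \in I} p_i M_i\), with weights \(p_i \in [0,1]\) such that \(\sum_{i \in I} p_i \le 1\) and with \(I\) possibly infinite, recording with which probability a running term may currently be each of the \(M_i\). In a coupled logical bisimulation \((\mathcal{R}_1, \mathcal{R}_2)\), the relation \(\mathcal{R}_1\) is on ordinary terms and supplies the arguments for functions, while \(\mathcal{R}_2\) is on terms extended with formal sums in redex position.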

In higher-order languages, coinductive equivalences and techniques appear to be more fundamental than in first-order languages. Evidence of this is given by the above-mentioned results of correspondence between forms of bisimilarity and contextual equivalence in various \(\lambda \)-calculi. Contextual equivalence is a ‘may’ form of testing that, in first-order languages (e.g., CCS), is quite different from bisimilarity or even simulation equivalence. Indeed, in general, higher-order languages have a stronger discriminating power than first-order languages [BSV14]. For instance, if we use higher-order languages to test first-order languages, using (may-like) contextual equivalence, then the equivalences induced are often finer than the equivalences induced by first-order languages (usually trace equivalence); moreover, the natural definition of the former equivalences is coinductive, whereas that of the latter equivalences is inductive. In distributed higher-order languages, a construct that may strongly enhance the discriminating power is passivation [SS03, GH05a, LSS09a, LSS09b, LSS11, LPSS11, PS12, KH13]. Passivation offers the capability of capturing the content of a certain location into a variable, possibly copying it, and then restarting the execution in different contexts. The same discriminating power can also be obtained in call-by-value \(\lambda \)-calculi (that is, without concurrency or nondeterminism) extended with a location-like construct akin to a store of imperative \(\lambda \)-calculi, and operators for reading the content of this location, overriding it, and, if the location contains a process, for consuming such a process (i.e., performing observations on the process actions). When the tested first-order processes are probabilistic, the difference in discriminating power between first-order and higher-order languages increases further: in higher-order languages equipped with passivation, or in a call-by-value \(\lambda \)-calculus, bisimilarity may be recovered [BSV14].