Byzantine Generals in the Permissionless Setting

In the distributed computing literature, consensus protocols have traditionally been studied in a setting where all participants are known to each other from the start of the protocol execution. In the parlance of the ‘blockchain’ literature, this is referred to as the permissioned setting. What differentiates the most prominent blockchain protocol Bitcoin from these previously studied protocols is that it operates in a permissionless setting, i.e. it is a protocol for establishing consensus over an unknown network of participants that anybody can join, with as many identities as they like in any role. The arrival of this new form of protocol brings with it many questions. Beyond Bitcoin, what can we prove about permissionless protocols in a general sense? How does recent work on permissionless protocols in the blockchain literature relate to the well-developed history of research on permissioned protocols in distributed computing?

To answer these questions, we describe a formal framework for the analysis of both permissioned and permissionless systems. Our framework allows for “apples-to-apples” comparisons between different categories of protocols and, in turn, the development of theory to formally discuss their relative merits. A major benefit of the framework is that it facilitates the application of a rich history of proofs and techniques in distributed computing to problems in blockchain and the study of permissionless systems. Within our framework, we then address the questions above. We consider the Byzantine Generals Problem  as a formalisation of the problem of reaching consensus, and address a programme of research that asks, “Under what adversarial conditions, and for what types of permissionless protocol, is consensus possible?”
We prove several results for this programme, our main result being that deterministic consensus is not possible for decentralised permissionless protocols. To close, we give a list of seven open questions.


Byzantine Generals in the Permissionless Setting, FC23 pdf.

This is joint work with Tim Roughgarden. This paper also replaces earlier versions of the paper, “A General Framework  for the Security Analysis of Blockchain Protocols” and “Resource Pools and the CAP Theorem”. 

The idemetric property: when most distances are (almost) the same.

In this paper my coauthors and I introduce the idemetric property, which formalises the idea that most nodes in a graph have similar distances between them, and which turns out to be quite standard amongst small-world network models. Modulo reasonable sparsity assumptions, we are then able to show that a strong form of idemetricity is actually equivalent to a very weak expander condition (PUMP). This provides a direct way of providing short proofs that small-world network models such as the Watts-Strogatz model are strongly idemetric (for a wide range of parameters), and also provides further evidence that being idemetric is a common property. We also consider how satisfaction of the idemetric property is relevant to algorithm design.

Proceedings of the Royal Society A, 475(2222), 2019, pdf.

This is joint work with Barmpalias, Huang, Li, Li, Pan and Roughgarden.

Compression of data streams down to their information content.

According to Kolmogorov complexity, every finite binary string is compressible to a shortest code — its information content — from which it is effectively recoverable. In this paper my coauthor Barmpalias and I investigate the extent to which this holds for infinite binary sequences (streams). We devise a new coding method which uniformly codes every stream X into an algorithmically random stream Y, in such a way that the first n bits of X are recoverable from the first I(X_n) bits of Y, where I is any partial computable information content measure which is defined on all prefixes of X, and where X_n is the initial segment of X of length n. As a consequence, if g is any computable upper bound on the initial segment prefix-free complexity of X, then X is computable from an algorithmically random Y with oracle-use at most g. Alternatively (making no use of such a computable bound g) one can achieve an oracle-use bounded above by K(X_n)+log n. This provides a strong analogue of Shannon’s source coding theorem for algorithmic information theory.

With Barmpalias, IEEE Transactions on Information Theory,  65 (7), 2019, pdf.

C.e. reals and random oracles

Left c.e. reals are those which can be effectively approximated from below, as the limit of an increasing sequence of rationals. Correspondingly, right c.e. reals are those which can be effectively approximated from above. Randomness for reals, on the other hand, can be defined via a number of essentially equivalent paradigms, the most well known of which is probably the treatment of Martin-Loef in terms of effectively null sets. In this sequence of papers, we establish a number of basic properties for random c.e. reals, as well as coding techniques involving random oracles more generally. For left-c.e. reals, one highlight is a result established in “Differences of c.e. reals” (see the link below) which asserts that for any pair of left c.e. reals a and b there exists a unique number r > 0 such that qa – b is a 1-random left-c.e. real for each positive rational q > r and a 1-random right-c.e. real for each positive rational q < r. Based on this result we develop a theory of differences of halting probabilities, which answers a number of questions concerning Martin-Loef random left-c.e. reals, including one of the few remaining open problems from the list of open questions in algorithmic randomness by Miller and Nies in 2006. Our methods also suffice to show that all effective approximations to a given random c.e. real are essentially the same in a very strong sense. For random oracles more generally, we extend the classic result of Kucera and Gacs, which states than any infinite sequence can be coded into a Martin-Loef sequence from which it can be effectively recovered. In “Optimal redudancy in computations from random oracles”, we develop a method of coding into random oracles which is optimal in terms  of oracle use.

Monotonous betting strategies in warped casinos, with Barmpalias and Fang, Information and Computation, 271, 2020, pdf.

Optimal asymptotic bounds on the oracle use in computations from Chaitin’s Omega, with Barmpalias and Fang, Journal of Computer and System Sciences, 82, 1283-1299, 2016, pdf.

Optimal redundancy in computations from random oracles, with Barmpalias, Journal of Computer and System Sciences, 92, 1-8, 2018, pdf.

Differences of halting probabilities, with Barmpalias, Journal of Computer and System Sciences, 89, 349-360, 2017, pdf.

A note on the differences of computably enumerable reals, with Barmpalias, Proceedings of the International Symposium on Computability and Complexity, Lecture Notes in Computer Science, 10010, Springer, 2017, pdf.

Lower bounds on the redundancy in computations from random oracles via betting strategies with restricted wagers, with Barmpalias and Teutsch, Information and Computation, 251, 287-300, 2016, pdf.

Computing halting probabilities from other halting probabilities, with Barmpalias, Theoretical Computer Science, 660, 16-22, 2017, pdf.

Pointed computations and Martin-Loef randomness, with Barmpalias and Li, Computability, vol. 7, no. 2-3, 171-177, 2018, pdf.

Limits of the Kucera-Gacs coding method, with Barmpalias, Post-proceedings volume of SEALS 2016, South Eastern Logic Symposium. World Scientific 2017, pdf.

Cooperative behaviour on networks

Prisoner’s Dilemma games have become a well-established paradigm for studying the mechanisms by which cooperative behaviour may evolve in societies consisting of selfish individuals. Recent research has focussed on the effect of spatial and connectivity structure in promoting the emergence of cooperation in scenarios where individuals play games with their neighbors, using simple `memoryless’ rules to decide their choice of strategy in repeated games. While heterogeneity and structural features such as clustering have been seen to lead to reasonable levels of cooperation in very restricted settings, no conditions on network structure have been established which robustly ensure the emergence of cooperation in a manner which is not overly sensitive to parameters such as network size, average degree, or the initial proportion of cooperating individuals. Here we consider a natural random network model, with parameters which allow us to vary the level of `community’ structure in the network, as well as the number of high degree hub nodes. We investigate the effect of varying these structural features and show that, for appropriate choices of these parameters, cooperative behaviour does now emerge in a truly robust fashion and to a previously unprecedented degree. The implication is that cooperation (as modelled here by Prisoner’s Dilemma games) can become the social norm in societal structures divided into smaller communities, and in which hub nodes provide the majority of inter-community connections.

Establishing social cooperation: the role of hubs and community structure, with Cooper, Li, Pan and Yong, to appear in Network Science: pdf.

Sex vs Asex

The question as to why most complex organisms reproduce sexually remains a very active research area in evolutionary biology. Theories dating back to Weismann have suggested that the key may lie in the creation of increased variability in offspring, causing enhanced response to selection. Under appropriate conditions, selection is known to result in the generation of negative linkage disequilibrium, with the effect of recombination then being to increase genetic variance by reducing these negative associations between alleles. It has therefore been a matter of significant interest to understand precisely those conditions resulting in negative linkage disequilibrium, and to recognise also the conditions in which the corresponding increase in genetic variation will be advantageous. In joint work with Antonio Montalban, we prove results establishing basic conditions under which negative linkage disequilibrium will be created, and describing the long term effect on population fitness. For infinite population models in which gene fitnesses combine additively with zero-epistasis, we show that, during the process of asexual propagation, a negative linkage disequilibrium will be created and maintained, meaning that an occurrence of recombination at any stage of the process will cause an immediate increase in fitness variance. In contexts where there is a large but finite bound on allele fitnesses, the non-linear nature of the effect of recombination on a population presents serious obstacles in establishing convergence to an equilibrium, or even the positions of fixed points in the corresponding dynamical system. We describe techniques for analysing the long term behaviour of sexual and asexual populations for such models, and use these techniques to establish conditions resulting in higher fitnesses for sexually reproducing populations.

Sex versus Asex: an analysis of the role of variance conversion, with Antonio Montalban, Theoretical Population Biology, 114, 2017, pdf.

Schelling segregation

Schelling’s model of segregation looks to explain the way in which particles or agents of two types may come to arrange themselves spatially into configurations consisting of large homogeneous clusters, i.e. connected regions consisting of only one type. As one of the earliest agent based models studied by economists and perhaps the most famous model of self-organising behaviour, it also has direct links to areas at the interface between computer science and statistical mechanics, such as the Ising model and the study of contagion and cascading phenomena in networks.

While the model has been extensively studied it has largely resisted rigorous analysis, prior results from the literature generally pertaining to variants of the model which are tweaked so as to be amenable to standard techniques from statistical mechanics or stochastic evolutionary game theory. Recently Brandt, Immorlica, Kamath and Kleinberg provided the first rigorous analysis of the unperturbed model, for a specific set of input parameters. In the following sequence of papers my co-authors George Barmpalias, Richard Elwes and I provide a rigorous analysis of the model’s behaviour much more generally and establish some surprising forms of threshold behaviour, for the two and three dimensional as well as the one-dimensional model.

Digital morphogenesis via Schelling segregation, with Barmpalias and Elwes, FOCS 2014, 55th Annual IEEE Symposium on Foundations of Computer Science, Oct. 18-21, Philadelphia, pdf.

Tipping points in Schelling segregation, with Barmpalias and Elwes, Journal of Statistical Physics 158, 806-852, 2016, pdf.

From randomness to order: Schelling segregation in two or three dimensions, with Barmpalias and Elwes, Journal of Statistical Physics 164 (6), 1460-1487, 2016, pdf.

Minority population in the one-dimensional Schelling model of segregation, with Barmpalias and Elwes, Journal of Statistical Physics 173(5), 1408–1458, 2018, pdf.

Typicality and the Turing degrees

The Turing degree of a real measures the computational difficulty of producing its binary expansion. Since Turing degrees are tailsets, it follows from Kolmogorov’s 0-1 law that for any property which may or may not be satisfied by any given Turing degree, the satisfying class will either be of Lebesgue measure 0 or 1, so long as it is measurable. So either the typical degree satisfies the property, or else the typical degree satisfies its negation. Further, there is then some level of randomness sufficient to ensure typicality in this regard. In this paper, we prove results in a new programme of research which aims to establish the (order theoretically) definable properties of the typical Turing degree.

The typical Turing degree, with Barmpalias and Day, Proceedings of the London Mathematical Society (2014) 109 (1). pp. 1-39, pdf. 

Computable structures

A computable structure is given by a computable domain, and then a set of computable relations and functions defined on that domain. The study of computable structures, going back as far as the work of Frohlich and Shepherdson, Rabin, and Malcev is part of a long-term programme to understand the algorithmic content of mathematics.

In mathematics generally, the notion of isomorphism is used to determine structures which are essentially the same. Within the context of effective (algorithmic) mathematics, however, one is presented with the possibility that pairs of computable structures may exist which, while isomorphic, fail to have a computable isomorphism between them. Thus the notion of computable categoricity has become of central importance: a computable structure S is computably categorical if any two computable presentations A and B of S are computably isomorphic. In this paper, my co-authors Downey, Kach, Lempp, Montalban, Turetsky and I, answer one of the longstanding questions in computable structure theory, showing the class of computably categorical structures has no simple structural or syntactic characterisation.

The complexity of computable categoricity, with Downey, Kach, Lempp, Montalban, and Turetsky, Advances in Mathematics 268 (2015), 423–466, pdf.