This is an invited expository article for the Notices of the American Mathematical Society, which requires (almost) no background knowledge, and aims to introduce cryptocurrency protocols in simple terms: pdf.
Blockchain protocols differ in fundamental ways, including the mechanics of selecting users to produce blocks (e.g., proof-of-work vs. proof-of-stake) and the method to establish consensus (e.g., longest chain rules vs. Byzantine fault-tolerant (BFT) inspired protocols). These fundamental differences have hindered “apples-to-apples” comparisons between different categories of blockchain protocols and, in turn, the development of theory to formally discuss their relative merits.
This paper presents a parsimonious abstraction sufficient for capturing and comparing properties of many well-known permissionless blockchain protocols, simultaneously capturing essential properties of both proof-of-work (PoW) and proof-of-stake (PoS) protocols, and of both longest-chain-type and BFT-type protocols. Our framework blackboxes the precise mechanics of the user selection process, allowing us to isolate the properties of the selection process that are significant for protocol design.
We demonstrate the utility of our general framework with several concrete results that delineate which properties are achievable by different types of blockchains under different synchrony assumptions. For example:
- We prove a CAP-type impossibility theorem asserting that liveness with an unknown level of participation (as in typical PoW protocols) rules out security in a partially synchronous setting (as enjoyed by several BFT-type PoS protocols).
- Delving deeper into the partially synchronous setting, we prove that
a necessary and sufficient condition for security is the production of “certificates,” meaning stand-alone proofs of block confirmation.
- Restricting to synchronous settings, we prove that typical protocols with a known level of participation (including longest chain-type PoS protocols) can be adapted to provide certificates, but those with an unknown level of participation cannot.
- Finally, we use our framework to articulate a modular two-step approach to blockchain security analysis: (i) Prove liveness and security properties for a permissioned version of a protocol; (ii) Prove that the method of user selection extends these properties (with high probability) to the original permissionless protocol.
A General Framework for the Security Analysis of Blockchain Protocols, pdf.
This is joint work with Tim Roughgarden.
In this paper my coauthors and I introduce the idemetric property, which formalises the idea that most nodes in a graph have similar distances between them, and which turns out to be quite standard amongst small-world network models. Modulo reasonable sparsity assumptions, we are then able to show that a strong form of idemetricity is actually equivalent to a very weak expander condition (PUMP). This provides a direct way of providing short proofs that small-world network models such as the Watts-Strogatz model are strongly idemetric (for a wide range of parameters), and also provides further evidence that being idemetric is a common property. We also consider how satisfaction of the idemetric property is relevant to algorithm design.
To appear in the Proceedings of the Royal Society A: pdf.
This is joint work with Barmpalias, Huang, Li, Li, Pan and Roughgarden.
According to Kolmogorov complexity, every finite binary string is compressible to a shortest code — its information content — from which it is effectively recoverable. In this paper my coauthor Barmpalias and I investigate the extent to which this holds for infinite binary sequences (streams). We devise a new coding method which uniformly codes every stream X into an algorithmically random stream Y, in such a way that the first n bits of X are recoverable from the first I(X_n) bits of Y, where I is any partial computable information content measure which is defined on all prefixes of X, and where X_n is the initial segment of X of length n. As a consequence, if g is any computable upper bound on the initial segment prefix-free complexity of X, then X is computable from an algorithmically random Y with oracle-use at most g. Alternatively (making no use of such a computable bound g) one can achieve an oracle-use bounded above by K(X_n)+log n. This provides a strong analogue of Shannon’s source coding theorem for algorithmic information theory.
With Barmpalias, to appear in IEEE Transactions on Information Theory: pdf.
Left c.e. reals are those which can be effectively approximated from below, as the limit of an increasing sequence of rationals. Correspondingly, right c.e. reals are those which can be effectively approximated from above. Randomness for reals, on the other hand, can be defined via a number of essentially equivalent paradigms, the most well known of which is probably the treatment of Martin-Loef in terms of effectively null sets. In this sequence of papers, we establish a number of basic properties for random c.e. reals, as well as coding techniques involving random oracles more generally. For left-c.e. reals, one highlight is a result established in “Differences of c.e. reals” (see the link below) which asserts that for any pair of left c.e. reals a and b there exists a unique number r > 0 such that qa – b is a 1-random left-c.e. real for each positive rational q > r and a 1-random right-c.e. real for each positive rational q < r. Based on this result we develop a theory of differences of halting probabilities, which answers a number of questions concerning Martin-Loef random left-c.e. reals, including one of the few remaining open problems from the list of open questions in algorithmic randomness by Miller and Nies in 2006. Our methods also suffice to show that all effective approximations to a given random c.e. real are essentially the same in a very strong sense. For random oracles more generally, we extend the classic result of Kucera and Gacs, which states than any infinite sequence can be coded into a Martin-Loef sequence from which it can be effectively recovered. In “Optimal redudancy in computations from random oracles”, we develop a method of coding into random oracles which is optimal in terms of oracle use.
Monotonous betting strategies in warped casinos, with Barmpalias and Fang, submitted, pdf.
Optimal asymptotic bounds on the oracle use in computations from Chaitin’s Omega, with Barmpalias and Fang, Journal of Computer and System Sciences, 82, 1283-1299, 2016, pdf.
Optimal redundancy in computations from random oracles, with Barmpalias, Journal of Computer and System Sciences (in press), pdf.
Differences of halting probabilities, with Barmpalias, Journal of Computer and System Sciences, 89, 349-360, 2017, pdf.
A note on the differences of computably enumerable reals, with Barmpalias, Proceedings of the International Symposium on Computability and Complexity, Lecture Notes in Computer Science, 10010, Springer, 2017, pdf.
Lower bounds on the redundancy in computations from random oracles via betting strategies with restricted wagers, with Barmpalias and Teutsch, Information and Computation, 251, 287-300, 2016, pdf.
Computing halting probabilities from other halting probabilities, with Barmpalias, Theoretical Computer Science, 660, 16-22, 2017, pdf.
Pointed computations and Martin-Loef randomness, with Barmpalias and Li, Computability (in press), pdf.
Limits of the Kucera-Gacs coding method, with Barmpalias, Post-proceedings volume of SEALS 2016, South Eastern Logic Symposium. World Scientific 2017, pdf.
Prisoner’s Dilemma games have become a well-established paradigm for studying the mechanisms by which cooperative behaviour may evolve in societies consisting of selfish individuals. Recent research has focussed on the effect of spatial and connectivity structure in promoting the emergence of cooperation in scenarios where individuals play games with their neighbors, using simple `memoryless’ rules to decide their choice of strategy in repeated games. While heterogeneity and structural features such as clustering have been seen to lead to reasonable levels of cooperation in very restricted settings, no conditions on network structure have been established which robustly ensure the emergence of cooperation in a manner which is not overly sensitive to parameters such as network size, average degree, or the initial proportion of cooperating individuals. Here we consider a natural random network model, with parameters which allow us to vary the level of `community’ structure in the network, as well as the number of high degree hub nodes. We investigate the effect of varying these structural features and show that, for appropriate choices of these parameters, cooperative behaviour does now emerge in a truly robust fashion and to a previously unprecedented degree. The implication is that cooperation (as modelled here by Prisoner’s Dilemma games) can become the social norm in societal structures divided into smaller communities, and in which hub nodes provide the majority of inter-community connections.
Establishing social cooperation: the role of hubs and community structure, with Cooper, Li, Pan and Yong, to appear in Network Science: pdf.
The question as to why most complex organisms reproduce sexually remains a very active research area in evolutionary biology. Theories dating back to Weismann have suggested that the key may lie in the creation of increased variability in offspring, causing enhanced response to selection. Under appropriate conditions, selection is known to result in the generation of negative linkage disequilibrium, with the effect of recombination then being to increase genetic variance by reducing these negative associations between alleles. It has therefore been a matter of significant interest to understand precisely those conditions resulting in negative linkage disequilibrium, and to recognise also the conditions in which the corresponding increase in genetic variation will be advantageous. In joint work with Antonio Montalban, we prove results establishing basic conditions under which negative linkage disequilibrium will be created, and describing the long term effect on population fitness. For infinite population models in which gene fitnesses combine additively with zero-epistasis, we show that, during the process of asexual propagation, a negative linkage disequilibrium will be created and maintained, meaning that an occurrence of recombination at any stage of the process will cause an immediate increase in fitness variance. In contexts where there is a large but finite bound on allele fitnesses, the non-linear nature of the effect of recombination on a population presents serious obstacles in establishing convergence to an equilibrium, or even the positions of fixed points in the corresponding dynamical system. We describe techniques for analysing the long term behaviour of sexual and asexual populations for such models, and use these techniques to establish conditions resulting in higher fitnesses for sexually reproducing populations.
Sex versus Asex: an analysis of the role of variance conversion, with Antonio Montalban, Theoretical Population Biology, 114, 2017, pdf.
Schelling’s model of segregation looks to explain the way in which particles or agents of two types may come to arrange themselves spatially into configurations consisting of large homogeneous clusters, i.e. connected regions consisting of only one type. As one of the earliest agent based models studied by economists and perhaps the most famous model of self-organising behaviour, it also has direct links to areas at the interface between computer science and statistical mechanics, such as the Ising model and the study of contagion and cascading phenomena in networks.
While the model has been extensively studied it has largely resisted rigorous analysis, prior results from the literature generally pertaining to variants of the model which are tweaked so as to be amenable to standard techniques from statistical mechanics or stochastic evolutionary game theory. Recently Brandt, Immorlica, Kamath and Kleinberg provided the first rigorous analysis of the unperturbed model, for a specific set of input parameters. In the following sequence of papers my co-authors George Barmpalias, Richard Elwes and I provide a rigorous analysis of the model’s behaviour much more generally and establish some surprising forms of threshold behaviour, for the two and three dimensional as well as the one-dimensional model.
Digital morphogenesis via Schelling segregation, with Barmpalias and Elwes, FOCS 2014, 55th Annual IEEE Symposium on Foundations of Computer Science, Oct. 18-21, Philadelphia, pdf.
Tipping points in Schelling segregation, with Barmpalias and Elwes, Journal of Statistical Physics 158, 806-852, 2016, pdf.
From randomness to order: Schelling segregation in two or three dimensions, with Barmpalias and Elwes, Journal of Statistical Physics 164 (6), 1460-1487, 2016, pdf.
Minority population in the one-dimensional Schelling model of segregation, with Barmpalias and Elwes, Journal of Statistical Physics 173(5), 1408–1458, 2018, pdf.
The Turing degree of a real measures the computational difficulty of producing its binary expansion. Since Turing degrees are tailsets, it follows from Kolmogorov’s 0-1 law that for any property which may or may not be satisfied by any given Turing degree, the satisfying class will either be of Lebesgue measure 0 or 1, so long as it is measurable. So either the typical degree satisfies the property, or else the typical degree satisfies its negation. Further, there is then some level of randomness sufficient to ensure typicality in this regard. In this paper, we prove results in a new programme of research which aims to establish the (order theoretically) definable properties of the typical Turing degree.
The typical Turing degree, with Barmpalias and Day, Proceedings of the London Mathematical Society (2014) 109 (1). pp. 1-39, pdf.
A computable structure is given by a computable domain, and then a set of computable relations and functions defined on that domain. The study of computable structures, going back as far as the work of Frohlich and Shepherdson, Rabin, and Malcev is part of a long-term programme to understand the algorithmic content of mathematics.
In mathematics generally, the notion of isomorphism is used to determine structures which are essentially the same. Within the context of effective (algorithmic) mathematics, however, one is presented with the possibility that pairs of computable structures may exist which, while isomorphic, fail to have a computable isomorphism between them. Thus the notion of computable categoricity has become of central importance: a computable structure S is computably categorical if any two computable presentations A and B of S are computably isomorphic. In this paper, my co-authors Downey, Kach, Lempp, Montalban, Turetsky and I, answer one of the longstanding questions in computable structure theory, showing the class of computably categorical structures has no simple structural or syntactic characterisation.
The complexity of computable categoricity, with Downey, Kach, Lempp, Montalban, and Turetsky, Advances in Mathematics 268 (2015), 423–466, pdf.