Exhalation

The universe began as an enormous breath being held.

Here is an interesting story on entropy and consciousness:

Exhalation

Why Probabilistic Models?

For all its merits, the “deep learning revolution” has entrenched, at least in some circles, a monolithic approach to machine learning that may be too limiting for its own good: all you need are artificial neural network (ANN) models; massive amounts of data, time, and resources; and, of course, backpropagation to fit the data. Alternative approaches typically get short shrift from practitioners and even from academics: Why try anything else?

One of the main apparent casualties of this dominant perspective is the class of probabilistic models at large. Standard ANN-based models only account for uncertainty — more or less explicitly — at their inputs or outputs, while the process transforming inputs to outputs is deterministic. This is typically done in one of two ways:

  • ANN-based descriptive probabilistic models: The output y, given the input x, follows a distribution p(y|x) that is parameterized by the output of an ANN f(x) (e.g., f(x) defines the natural parameters of an exponential-family distribution) — the ANN hence describes the distribution of the output;
  • ANN-based prescriptive probabilistic models: The output y is produced by an ANN f(x) applied to a random input x (e.g., implicit generative models as used in GANs).* A minimal sketch of both cases follows this list.
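
Here is that sketch, written in plain NumPy with a placeholder one-hidden-layer network; the architecture, the Gaussian output, and the random weights are illustrative assumptions rather than details from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, b1, w2, b2):
    """A generic one-hidden-layer network f(x); the weights are placeholders."""
    h = np.tanh(x @ w1 + b1)
    return h @ w2 + b2

# Placeholder parameters (in practice, these would be fit by backpropagation).
d_in, d_hid = 3, 16
w1, b1 = rng.normal(size=(d_in, d_hid)), np.zeros(d_hid)
w2, b2 = rng.normal(size=(d_hid, 2)), np.zeros(2)

x = rng.normal(size=(1, d_in))

# 1) Descriptive: the ANN output parameterizes p(y|x), here a Gaussian whose
#    mean and log-variance are the two components of f(x).
mean, log_var = mlp(x, w1, b1, w2, b2)[0]
y_descriptive = rng.normal(loc=mean, scale=np.exp(0.5 * log_var))

# 2) Prescriptive (implicit): a random input z is pushed through the same kind
#    of ANN and the output itself is the sample; the distribution of
#    y_prescriptive is defined only implicitly, as in a GAN generator.
z = rng.normal(size=(1, d_in))
y_prescriptive = mlp(z, w1, b1, w2, b2)[0]

print(y_descriptive, y_prescriptive)
```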

By excluding randomness from the process connecting inputs and outputs, ANN-based models are limited in their capacity to model structured uncertainty and to encode domain knowledge, and are not well suited to provide a framework for causal (as opposed to merely correlative) inference.

In contrast, more general probabilistic models define, in a prescriptive or descriptive fashion, a structured collection of random variables, with the semantic relationships among the variables described by directed or undirected graphs. In a probabilistic model, uncertainty can be modeled throughout the process relating input and output variables.** Probabilistic models were at the heart of the so-called expert systems, and were, perhaps curiously, the framework used to develop the very first deep learning algorithms for neural networks. They also provide a useful starting point to reason about causality.
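
To make this concrete, here is a minimal sketch of ancestral sampling from a small directed model with the illustrative factorization p(z, x, y) = p(z) p(x | z) p(y | x, z); the linear-Gaussian factors are arbitrary choices, and each of them could equally well be parameterized locally by an ANN (see the second footnote):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_joint(n):
    """Ancestral sampling along the directed graph z -> x -> y (with an edge z -> y)."""
    z = rng.normal(size=n)                        # root factor p(z)
    x = 2.0 * z + rng.normal(scale=0.5, size=n)   # factor p(x | z)
    y = x - z + rng.normal(scale=0.1, size=n)     # factor p(y | x, z)
    return z, x, y

z, x, y = sample_joint(10_000)

# Uncertainty is injected at every node of the graph, not only at the output:
# the marginal variance of y reflects the noise introduced at z, x, and y.
print(np.var(y))
```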

This insistence on deterministic models has arguably hindered progress in problems for which probabilistic models are a more natural fit. Two cases in point come to mind: metalearning and Spiking Neural Networks (SNNs). Metalearning, or learning to learn, can be naturally represented within a probabilistic model with latent variables, while deterministic frameworks remain rather ad hoc and questionable. In the case of SNNs, the dominant models used to derive training rules define neurons as deterministic threshold-activated devices. This makes it possible to leverage gradient-based training only at the cost of accepting various heuristics and approximations. In contrast, in a probabilistic framework, training rules can be naturally derived from first principles.
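
As an illustration of the latent-variable view of metalearning, the following minimal sketch defines a hierarchical generative model with a shared hyperparameter theta and a per-task latent variable phi; the Gaussian and linear choices are assumptions made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Shared hyperparameters theta = (prior mean, prior std) over the task-specific
# latent variable phi; "learning to learn" amounts to inferring theta from many
# tasks, while adapting to a new task is (approximate) posterior inference over phi.
theta = (0.0, 1.0)

def sample_task(n_points=20):
    """Generate one task: draw its latent variable, then its data set."""
    phi = rng.normal(theta[0], theta[1])                  # task latent: p(phi | theta)
    x = rng.uniform(-1.0, 1.0, size=n_points)
    y = phi * x + rng.normal(scale=0.1, size=n_points)    # task data: p(y | x, phi)
    return x, y

tasks = [sample_task() for _ in range(10)]
print(len(tasks), tasks[0][0].shape)
```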

But there are signs that probabilistic modeling and programming may be making a comeback, with the support of companies like Intel and Uber. Efforts to integrate ANN-based techniques and probabilistic programming may lead the next wave of innovations in machine learning.


*Descriptive and prescriptive models can also be mixed, as in variational autoencoders.

** ANN-based models can be used to define local parameterizations of portions — known as factors or conditional distributions — of the graph.

The Fifth London Symposium on Information Theory (LSIT)

A few months after the England national team's first-ever appearance in a World Cup, and a month after the BBC's first overseas live TV broadcast (from France), a distinguished group of academics gathered at the Royal Society in London to talk about information theory. It was 1950 — only two years after the publication of Shannon's seminal paper and of Wiener's “Cybernetics” — and the new ideas of information, control, and feedback were quickly making their way from engineering to the natural, social, and human sciences, begetting new insights and raising new questions.

This “cybernetic moment” [2] underpinned the first four editions of the London Symposium on Information Theory (LSIT), with the first meeting in 1950 followed by the symposia in 1952, 1955, and 1960. The program in 1950, shown in the figure, featured two talks by Shannon on communications and coding, as well as a number of presentations on topics ranging from physics, statistics, and radar, to linguistics, neuroscience, psychology, and neurophysiology. The first LSIT was also notable for two written contributions by Alan Turing, who could not attend in person. One of the contributions offered the following ante-litteram definition of machine learning:

If […] the operations of the machine itself could alter its instructions, there is the possibility that a learning process could by this means completely alter the programme in the machine.

According to the report [2], the second meeting, held in 1952, was characterized by an “emphasis on the transmission and analysis of speech”, while the third LSIT in 1955 again covered a wide variety of topics, including “anatomy, animal welfare, anthropology, […] neuropsychiatry, […] phonetics, political theory”. David Slepian, one of the participants and a future author of pioneering contributions to the field, would later write in his Bell Labs report about this third meeting that the “best definition I was able to get as to what constituted ‘The Information Theory’ was ‘the sort of material on this program’!” [2]. At the same time, the heterogeneity of topics in the program may have been one of the motivations behind Shannon's “Bandwagon” paper published the following year [3]. In it, Shannon famously warned against the indiscriminate application of information theory based solely on the abstract relevance of the concept of information to many scientific and philosophical fields.

The fourth LSIT was held in 1960 and featured among its speakers Marvin Minsky, one of the founding fathers of Artificial Intelligence (AI), who delivered a talk entitled “Learning in Random Nets”.

In the middle of our own “AI moment”, the time seemed right to bring back to London the discussion initiated in the fifties and sixties during the first four LSIT editions. And so, with a temporal leap of almost sixty years, the fifth LSIT was held at King’s College London on May 30-31, 2019. The symposium was organized by Deniz Gündüz and Osvaldo Simeone from Imperial College London and King’s College London, respectively — two institutions that featured prominently in the first editions of LSIT (see figure).

While heeding Shannon's warning, the program of the symposium aimed at exploring the “daisy” of intersections of information theory with fields such as statistics, machine learning, physics, communication theory, and computer science. Each day featured two keynote talks, along with two invited sessions and a poster session with invited as well as contributed posters submitted to an open call. The first day was devoted to the intersection between machine learning and information theory, while the second day focused on novel applications of information theory.

The first day started with a keynote by Michael Gastpar (EPFL), who presented a talk entitled “Information measures, learning and generalization”. This was followed by an invited session on “Information theory and data-driven methods”, chaired by Iñaki Esnaola (University of Sheffield), which featured talks by Bernhard Geiger (Graz University of Technology) on “How (not) to train your neural network using the information bottleneck principle”; by Jonathan Scarlett (National University of Singapore) on “Converse bounds for Gaussian process bandit optimization”; by Changho Suh (KAIST) on “Matrix completion with graph side information”; and by Camilla Hollanti (Aalto University) on “In the quest for the capacity of private information retrieval from coded and colluding servers”. The session was interrupted by a fire alarm that was carefully timed by the organizers in order to give the attendees more time to enjoy the storied surroundings of the Strand Campus of King's College London. After lunch, the symposium resumed with a keynote talk by Phil Schniter (Ohio State University) on “Recent advances in approximate message passing”, followed by an invited session on “Statistical signal processing”, organized by Ramji Venkataramanan (University of Cambridge), which featured talks by Po-Ling Loh (University of Wisconsin-Madison) on “Teaching and learning in uncertainty”; by Cynthia Rush (Columbia University) on “SLOPE is better than LASSO”; by Jean Barbier (EPFL) on “Mutual information for the dense stochastic block model: A direct proof”; and by Galen Reeves (Duke University) on “The geometry of community detection via the MMSE matrix”. The first day ended with a poster session organized by Bruno Clerckx (Imperial College London), and with wine, refreshments, and the view of the Thames and Waterloo Bridge from the 8th floor of Bush House.

The second and last day started off with a keynote by Kannan Ramchandran (Berkeley) on “Beyond communications: Codes offer a CLEAR advantage (Computing, LEArning, and Recovery)”. Next was an invited session on “Information theory and frontiers in communications”, chaired by Zoran Cvetkovic (King's College London), with talks by Ayfer Özgür (Stanford) on “Distributed learning under communication constraints”; by Mark Wilde (Louisiana State University) on “A tale of quantum data processing and recovery”; by Michèle Wigger (Telecom ParisTech) on “Networks with mixed delay constraints”; by Aaron Wagner (Cornell University) on “What hockey and foraging animals can teach us about feedback communication”; and by Ofer Shayevitz (Tel Aviv University) on “The minimax quadratic risk of distributed correlation estimation”. The afternoon session was opened by Yiannis Kontoyiannis (University of Cambridge), who gave a keynote on “Bayesian inference for discrete time series using context trees”, and continued with an invited session on “Post-quantum cryptography”, organized by Cong Ling (Imperial College London), with talks by Shun Watanabe (Tokyo University of Agriculture and Technology) on “Change of measure argument for strong converse and application to parallel repetition”; Qian Guo (University of Bergen) on “Decryption failure attacks on post-quantum cryptographic primitives with error-correcting codes”; Leo Ducas (CWI) on “Polynomial time bounded distance decoding near Minkowski's bound in discrete logarithm lattices”; and Thomas Prest (PQShield) on “Unifying leakage models on a Rényi day”. Like the first, the second was not a rainy day, and attendees were able to enjoy the view from the Bush House terrace with wine and mezes while discussing the results from the poster sessions organized by Mario Berta (Imperial College London) and Kai-Kit Wong (University College London).

Videos of all talks will be made available shortly.

Registration was free and more than 150 students, researchers, and academics were in attendance. Support was provided by the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme (Grant Agreements No. 725731 and 677854).

The organizers hope that this edition will be the first of many more, with the sixth LSIT planned for 2021. LSIT has outlived the cybernetic movement, and it may continue well beyond the current “AI moment”.

(Written by Osvaldo Simeone and Deniz Gündüz)

References

[1] N. Blachman, “Report on the third London Symposium on Information Theory,” IRE Transactions on Information Theory, vol. 2, no. 1, pp. 17-23, March 1956.

[2] R. R. Kline, “The Cybernetics Moment: Or Why We Call Our Age the Information Age”, Johns Hopkins University Press, 2015.

[3] C. E. Shannon, “The bandwagon”, IRE Transactions on Information Theory, vol. 2, no. 1, March 1956.

The Extranet of Skills

Now don’t go getting us wrong.

We want the best for you. We want to make the world more connected. We want you to share your skills and ideas with the world. We want you to be yourself. We want you to feel less alone. We want you to find others just like you. 

We want to make your life easier. We want to help you sort out everyday little problems. We want you to find only the information you want when you want it. We want you to know where you are right now. We want you to want to be where we know you’ll be. We want you to want to do what we know you’ll do. We want you to share your special moments with us. We want you to record your life because you mean so much to us and to the world. We want you to know that we’re very interested in what matters to you. We want you to know it matters to us too.

We want to count every step you take. We want you to be fit and strong. We want to monitor your emotions, predict your needs, and converse with you. We also want to help the less fortunate, and isn’t it nice that we can use the same tools for that too. We want to know your DNA so that we can find, for a small amount of money, where you come from and alert you of any potential genetic health issue; and we want it only for these totally legitimate reasons as a useful service to you.

We want to know what music you listen to, which series you like, which movies you watch, which books you read. We also want to probe your moral choices — just to entertain you. We want to know what you are wearing, who you look up to. We want to tailor our advertising bespoke to you. We want it to be right for you. We want you to take our fun psychological test to find out what kind of person you really are and who you’ll vote for in the next election or what you will vote for in the next poll.

We want to be there in your home. We want you to think of us as a family member. We’re interested in everything you say. We want you to see through that screen or through our special glasses (maybe only when you are at work for now) while you’re looking at something entirely other than us. We want to educate your children, and we need their personal data in order to tailor our service to them (your kids are as unique and special as you are).

We want to learn from you. We want our machines to learn from you. We want you to tell us what you see and hear, so that our machines can learn how to see and hear for you, whether you want it or not. We want to be creative so you don’t have to be.

We want you to know we’re keeping you safe. We want you to know we respect and protect your privacy. We want you to know that we believe privacy is a human right and a civil liberty, as long as you use our platform. We want to assure you that you have control, as long as you use our platform.

We want your pasts and your presents because we want your futures too.

(Inspired by, and partly quoted from, “Spring” by Ali Smith.)

Surveillance Capitalism

Definition: 1.  A new economic order that claims human experience as free raw material for hidden commercial practices of extraction, prediction, and sales; 2. A parasitic economic logic in which the production of goods and services is subordinated to a new global architecture of behavioral modification; 3. A rogue mutation of capitalism marked by concentration of wealth, knowledge, and power unprecedented in human history; 4. The foundational framework of a surveillance economy; 5. As significant a threat to human nature in the twenty-first century as industrial capitalism was to the natural world in the nineteenth and twentieth; 6. The origin of a new instrumentarian power that asserts dominance over society and presents startling challenges to market democracy; 7. A movement that aims to impose a new collective order based on total certainty; 8. An expropriation of critical human rights that is best understood as a coup from above: an overthrow of people’s sovereignty.

(from “The age of surveillance capitalism” by Shoshana Zuboff)

Future Politics

The currency and lifeblood of politics is information, and the way we collect, organize, and process information is undergoing profound changes — what does this mean for our political systems?

Many events of the last two years, including the recent protests in France and the current impasse over Brexit in the UK, point to a future politics of right- and left-wing populism, to social fragmentation, to an increasing power of the state and of tech companies in terms of force, scrutiny, and perception-control, and to a breakdown of the liberal democratic system that was once famously hailed as the end of history.

In Future Politics, Jamie Susskind offers some hope that “a new and more robust form of democracy” may instead slowly emerge, supercharged by digital technologies and guided by the “vigilance, prudence, curiosity, persistence, assertiveness, and public-spiritedness” of this and of future generations:

“The solution, I hope, will be […] one that combines the most promising elements of Deliberative Democracy, Direct Democracy, Wiki Democracy, Data Democracy, and AI Democracy.”

Whether or not one shares Susskind’s well-argued and informed outlook (the book is highly recommended), it is worth briefly reviewing these ideas.

  • Deliberative Democracy: Decisions should be taken in a way that grants everyone the same opportunity to participate in the discussion — deliberation, and not just voting, should be central to the decision process. Deliberation should be in principle facilitated by digital platforms, but so far human nature has been in the way: fake news, isolated political communities, bullying and trolling covered by anonymity, and racist chatbots have dominated online discussion. And yet. New promising platforms are emerging, and there are even calls for the use of deliberative democracy tools to solve deadlocks caused by today’s political systems.
  • Direct Democracy: Decisions on all issues should be taken through voting. Digital platforms, such as DemocracyOS, offer an ideal tool to elicit citizens’ preferences. Voting on every single issue may, however, be impractical or undesirable — what do I know about fiscal policy or waterway regulation? Complementing Direct Democracy, Liquid Democracy would allow me to delegate my vote on unfamiliar issues to trusted experts whose opinions I expect to share.
  • Wiki Democracy: Laws should be collaboratively written and edited, and perhaps encoded and recorded in a system of smart contracts. As pointed out by Jaron Lanier, under a Wiki Democracy, “superenergized people would be struggling to shift the wording of the tax code on a frantic, never-ending basis.” Not to mention if bots got into the game.
  • Data Democracy: Decisions should be taken on the basis of data continuously and uniformly collected from all citizens and agreed-upon machine learning algorithms. Data Democracy would obviate the need for constant voting, but how would we define a consensual and transparent moral framework to inform the operation of the algorithms? Where would Data Democracy leave human will and conscious participation to political life?
  • AI Democracy: Laws, and the corresponding code, should be written by an “AI” system. AI could implement policies by directly responding to the individual preferences of citizens or groups of citizens. As for Data Democracy, AI Democracy raises key issues of transparency and consensus. It also easily brings to mind Iain Banks’ AI-based Culture hyperpower and its clashes with civilizations that do not share its underlying moral framework.

As these examples show, digital technology would be a key enabler for all these new and old forms of democracy. Behind it all is the figure of the engineer working on the code, the algorithms, the databases, and the platforms. At the time of the first democracy, Plato wrote:

“There will be no end to the troubles of states, or of humanity itself, till philosophers become kings in this world, or till those we now call kings and rulers really and truly become philosophers, and political power and philosophy thus come into the same hands.”

Today, at this critical juncture in the history of democracy, it may be that the only way to avoid a dark future for states and humanity is for engineers to become “philosophers”, educated in the consequences and implications of their design choices.

Before Pressing “Submit”

“I do not care to show that I was right, but to determine if I was. […] Yes, we will restore everything, everything in doubt. And we will not proceed with seven-league boots, but at a snail’s pace. And what we find today, tomorrow we will cancel from the blackboard and we will not rewrite it again, unless the day after tomorrow we find it again. If any discovery follows our predictions, we will consider it with special distrust. […] And only when we have failed, when, beaten without hope, we will be reduced to licking our wounds, then with death in the soul we will begin to ask ourselves if by chance we were not right […]” (Galileo Galilei as imagined by Bertolt Brecht)