The Cyberiad

Head to your local bookstore (if you still have one), and you’ll find a growing number of recent novels with plots built on some projection of the role of AI in the near future. None of them seems to me even close to matching the range, urgency, technical prowess, and sheer fun of Stanislaw Lem’s The Cyberiad, published in Polish in the same year as the premiere of Help! by The Beatles (1965).

(The book was published in English only nine years later. Incidentally, 1974 is also when the FBI received a letter from Philip K. Dick maintaining that Stanislaw Lem was “probably a composite committee rather than an individual”, and that the committee operated on the orders of the Communist party to “control opinion through criticism and pedagogic essay”.)

I was unaware of this book until recently, but I have since learned that it has quite a following. Renowned cosmologist Sean Carroll described it as “a wide-ranging exploration of robotics, technology, computation and social structures.” And that it is, while also being a sort of Decameron set in an intergalactic medieval universe. The stories in the collection follow two “constructors” roaming a universe of kings, knights, robots, and dragons. The constructors are in the business of building AI solutions — not the term used by Lem, who was concerned with cybernetics, but that’s what we would call them today — for wealthy patrons. Here are some of my favorites.

General AI. To put this tale in modern terms, imagine the chief scientist at a top data company who has just completed years of training of the most powerful machine learning model based on all data available to mankind. Is this finally the dawn of general AI? The scientist starts the machine in the presence of a colleague. When asked to sum 2 and 2, the machine responds 7. No fix for this apparent bug can be found, despite the desperate efforts of the machine’s creator. The other scientist comments admiringly:

there is no question but that we have here a stupid machine, and not merely stupid in the usual, normal way, oh no! This is, as far as I can determine – and you know I am something of an expert – this is the stupidest thinking machine in the entire world, and that’s nothing to sneeze at! To construct deliberately such a machine would be far from easy; in fact I would say that no one could manage it. For the thing is not only stupid, but stubborn as a mule.

In the story, the machine ends up chasing its maker. Today it may find applications for speech writing or as a predictive model for politicians.

AI for the military. It is undeniable that two of the most successful applications of AI so far have been targeted advertising and military technology (police AI technology has had some setbacks). Lem imagines a new military AI technology with the power to create a perfect army: for each soldier, “a plug is screwed in front, a socket in the back”, and, lo and behold, the platoon acts as a single mind. When deployed by two eager kings, here is what happens:

As formation joined formation, in proportion there developed an esthetic sense, […] the weightiest metaphysical questions were considered, and, with an absentmindedness characteristic of great genius, these large units lost their weapons, […] and completely forgot that there was a war on […] both armies went hand in hand, picking flowers beneath the fluffy white clouds, on the field of the battle that never was.

If life were only like this.

AI and art. AI has found its way into museums, concert halls, and galleries around the world. In one of the tales, Lem has a constructor build an AI poet. Puzzling over the best design, the constructor reads every book of poetry he can get his hands on, until he finally realizes that

in order to program a poetry machine, one would first have to repeat the entire Universe from the beginning.

Not one to be discouraged by such trifles, the constructor builds a machine to model the universe from the Big Bang to the present. After some tweaking, the machine outputs gems such as this one:

“Cancel me not — for what then shall remain?/ Abscissas, some mantissas, modules, modes./ A root or two, a torus, and a node:/ The inverse of my verse, a null domain./ […] I see the eigenvalue in thine eye,/ I hear the tender tensor in thy sigh./ Bernoulli would have been content to die,/ Had he but known such a^2cos2φ!”

Through sonnets and cantos of such supreme quality, the AI poet ends up causing severe attacks of “esthetic ecstasy” across the galaxy, forcing the authorities to sentence it to forced exile.

AI and Information. Machine learning works by finding informative patterns in data. How informative the patterns are depends on the end user, who may or may not find new knowledge or utility in them: informative but useless patterns are everywhere. Lem imagines the possibility of designing a

Demon of the Second Kind, which […] will extract information […] about everything that was, is, may be or ever will be.

In a manner similar to its predecessor, the new demon peers through an opening of a box filled with some gas, but, instead of merely selecting molecules based on their velocities, it lets out “only significant information, keeping in all the nonsense.” This way, the demon extracts “from the dance of atoms only information that is genuine, like mathematical theorems, fashion magazines, blueprints, historical chronicles, or a recipe for ion crumpets.” And there would indeed be a lot of information to retrieve:

in a milligram of air and in a fraction of a second, there would come into being all the cantos of all the epic poems to be written in the next million years, as well as an abundance of wonderful truths.

And yet, even to the most avid information junkie, “all this information, entirely true and meaningful in every particular, was absolutely useless”, leaving the poor end user of the story entangled in miles and miles of paper, unable to move.

And much more. The collection covers much more, including AI and morality (“all these processes take place only because I programmed them,…”, but maybe “a sufferer is one who behaves like a sufferer!”), AI lawyers and advisers, and even a (rather disappointing!) civilization that has achieved the Highest Possible Level of Development.



Possible Minds

While many of us worry about the rise of cyborgs as inorganic machines equipped with artificial intelligence, cyborgs emerging from the integration of humans and computers have been among us for decades. And not as passive agents: the consequences of their actions have irrevocably changed the environment, society, and the human condition.

These words of Norbert Wiener, written in 1950 in The Human Use of Human Beings, frame the problem in stark and precise terms:

the machine […], which can learn and can make decisions on the basis of its learning, will in no way be obliged to make such decisions as we should have made, or will be acceptable to us […] Whether we entrust our decisions to machines of metal, or those machines of flesh and blood which are bureaus and vast laboratories and corporations […] the hour is very late, and the choice of good and evil knocks at our door.

A timely discussion of this and other prescient ideas from Wiener’s work can be found in the interesting collection Possible Minds: Twenty-Five Ways of Looking at AI edited by John Brockman.


(Photo caption: According to the photographer’s notes, Fred Bender was using this device, installed on the Northern State Parkway on Long Island, to let his wife know he was late for dinner. Nov. 6, 1959.)

Superintelligence. Hyperintelligence. Singularity. Artificial General Intelligence. If you have already stopped reading, I don’t blame you: is there anything more to these than generous extrapolations from the state of present technology? Of course, generous extrapolations have often been proven right: The first cellphone — “the Brick” — went on sale in 1983 with a price tag of $3,995, a weight of about two pounds, and a battery good for 20 minutes of use after a 10-hour charge. But, when it comes to AI, the issue may not be one of scale but rather of false advertising and misplaced priorities.

False advertising and misplaced priorities. For the former, here is how Peter Thiel puts it: “At its core, artificial intelligence is a military technology […] what is powerful about actually existing AI is its application to relatively mundane tasks like computer vision and data analysis.” And that is hard to argue against.

But what concerns us here, in these days of unprecedented temperatures, is the issue of misplaced priorities. It is evident to anyone who understands the concept of “scientific consensus” that the planet is warming and that time is running out. Some see AI as part of a solution, but it is certainly also part of the problem. As reported by MIT Technology Review, in the absence of significant innovation in materials, chip manufacturing and design, data centers’ AI workloads could account for a tenth of the world’s electricity usage by 2025, and training several popular and large AI models “produces nearly five times the entire lifetime emissions of the average American car.” And the contribution of AI to energy consumption is not likely to abate since, with the advent of 5G, “[t]here will also be more information for models to crunch, thanks to the rise of things like autonomous vehicles and sensors embedded in other smart devices.”

gAIa. Enter James Lovelock, the centenarian (as of this July) inventor of the Gaia theory — not exactly one to forget the environment. His theory famously states that living organisms and their inorganic surroundings on Earth form a self-regulating system that maintains suitable conditions for life on the planet. His latest book, published on the occasion of his 100th birthday, is tellingly entitled Novacene: The Coming Age of Hyperintelligence.

This is how Lovelock introduces his main argument in the book:

There have been two previous decisive events in the history of our planet. The first was about 3.4 billion years ago when photosynthetic bacteria first appeared. Photosynthesis is the conversion of sunlight to usable energy. The second was in 1712 when Newcomen created an efficient machine that converted the sunlight locked in coal directly into work. We are now entering the third phase in which we — and our cyborg successors — convert sunlight directly into information.

Our cyborg successors. Lovelock envisions a future in which electronic intelligence with replicating capabilities — cyborgs, but not the humanoid type of much science fiction — develops a separate biosphere from ours, watching us move and act in slow motion just as we may watch a garden grow (he provides an estimate of the cyborg-to-human speedup: 10,000 times). The emergence of cyborgs will be the product of the same evolutionary process that has selected us for intelligence. As he explains it, “it seems that the prime objective (of evolution) is to convert all of matter and radiation into information.”

But we should not despair: cyborgs will need us organic beings to help keep the planet cool. At least until that is no longer possible, and life will make another leap — from organic to pure electronic information.

Why Probabilistic Models?

For all its merits, the “deep learning revolution” has entrenched, at least in some circles, a monolithic approach to machine learning that may be too limiting for its own good: All you need is artificial neural network (ANN) models; massive amounts of data, time, and resources; and, of course, backpropagation to fit the data. Alternative approaches typically get short shrift from practitioners and even from academics: Why try anything else?

One of the main apparent casualties of this dominant perspective is the class of probabilistic models at large. Standard ANN-based models only account for uncertainty — more or less explicitly — at their inputs or outputs, while the process transforming inputs to outputs is deterministic. This is typically done in one of two ways:

  • ANN-based descriptive probabilistic models: The output y, given the input x, is defined by a probability p(y|x) that is parameterized by the output of an ANN f(x) (e.g., f(x) defines the natural parameters of an exponential-family distribution) — the ANN hence describes the distribution of the output;
  • ANN-based prescriptive probabilistic models: The output y is produced by an ANN f(x) fed with a random input x (e.g., the implicit generative models used in GANs).*
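
To make the two flavors concrete, here is a minimal sketch: a toy tanh layer stands in for the ANN, and a Gaussian plays the exponential-family role. All names and numbers are illustrative, not a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy one-layer "ANN" f(x) = tanh(Wx + b); any network would do here.
def ann(x, W, b):
    return np.tanh(W @ x + b)

# Descriptive: the ANN output parameterizes the distribution of y given x,
# here the mean of a Gaussian with fixed standard deviation sigma.
def descriptive_sample(x, W, b, sigma=0.1):
    mu = ann(x, W, b)             # f(x) describes p(y|x)
    return rng.normal(mu, sigma)  # y ~ N(f(x), sigma^2)

# Prescriptive: the ANN deterministically maps a random input to y, as in
# the implicit generative models of GANs; p(y) is never written down.
def prescriptive_sample(W, b, dim=2):
    z = rng.standard_normal(dim)  # randomness enters only at the input
    return ann(z, W, b)           # y = f(z)

W = rng.standard_normal((2, 2))
b = np.zeros(2)
x = np.array([1.0, -1.0])
print(descriptive_sample(x, W, b))
print(prescriptive_sample(W, b))
```

The descriptive model assigns an explicit density to every output, while the prescriptive model only knows how to generate samples; this is the distinction that separates likelihood-based models from implicit models such as GANs.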

By excluding randomness from the process connecting inputs and outputs, ANN-based models are limited in their capacity to model structured uncertainty and to encode domain knowledge, and are not well suited to provide a framework for causal (as opposed to merely correlative) inference.

In contrast, more general probabilistic models define, in a prescriptive or descriptive fashion, a structured collection of random variables, with the semantic relationships among the variables described by directed or undirected graphs. In a probabilistic model, uncertainty can be modeled throughout the process relating input and output variables.** Probabilistic models were at the heart of the so-called expert systems, and were, perhaps curiously, the framework used to develop the very first deep learning algorithms for neural networks. They also provide a useful starting point to reason about causality.
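
As a toy illustration, consider a directed model z → x → y in which uncertainty enters at every stage, so that the latent cause z can be inferred from the output y. The specific distributions and numbers below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Directed probabilistic model  z -> x -> y: uncertainty at every stage,
# not just at the input or output as in a standard ANN pipeline.
def sample_joint():
    z = rng.binomial(1, 0.5)                   # latent cause,  p(z)
    x = rng.normal(2.0 * z, 1.0)               # measurement,   p(x|z)
    y = rng.binomial(1, 1 / (1 + np.exp(-x)))  # noisy label,   p(y|x)
    return z, x, y

# Monte Carlo inference of the latent cause from the output: p(z=1 | y=1)
samples = [sample_joint() for _ in range(100_000)]
num = sum(1 for z, _, y in samples if z == 1 and y == 1)
den = sum(1 for _, _, y in samples if y == 1)
posterior = num / den
print(posterior)  # roughly 0.63: observing y=1 raises the belief that z=1
```

Ancestral sampling plus simple Monte Carlo counting suffices in this tiny graph; realistic models call for dedicated inference algorithms such as belief propagation or variational methods.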

This insistence on deterministic models has arguably hindered progress in problems for which probabilistic models are a more natural fit. Two cases in point come to mind: metalearning and Spiking Neural Networks (SNNs). Metalearning, or learning to learn, can be naturally represented within a probabilistic model with latent variables, while deterministic frameworks remain rather ad hoc and questionable. In the case of SNNs, the dominant models used to derive training rules define neurons as deterministic threshold-activated devices. This makes it possible to leverage gradient-based training only at the cost of accepting various heuristics and approximations. In contrast, in a probabilistic framework, training rules can be naturally derived from first principles.
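
As a sketch of the last point, suppose the spiking neuron is modeled probabilistically as a Bernoulli generalized linear model (GLM): the maximum-likelihood gradient of its weights is then exact, with no surrogate derivative needed for a hard threshold. The neuron and the task below are hypothetical choices for illustration:

```python
import numpy as np

def sigmoid(u):
    return 1 / (1 + np.exp(-u))

# Probabilistic spiking neuron (a GLM): spike s ~ Bernoulli(sigmoid(w . x)).
# The log-likelihood gradient is exact -- no heuristic surrogate is needed
# for the non-differentiable threshold of deterministic spiking models.
def grad_log_lik(w, x, s):
    return (s - sigmoid(w @ x)) * x  # standard Bernoulli-GLM result

rng = np.random.default_rng(2)
w = np.zeros(3)                      # weights; x[2] acts as a bias input
for _ in range(5000):
    x = np.array([rng.binomial(1, 0.5), rng.binomial(1, 0.5), 1.0])
    s = x[0]                         # target: spike iff input line 0 spikes
    w += 0.1 * grad_log_lik(w, x, s) # stochastic gradient ascent on log p(s|x)

p_on = sigmoid(w @ np.array([1.0, 0.0, 1.0]))
p_off = sigmoid(w @ np.array([0.0, 1.0, 1.0]))
print(p_on, p_off)  # firing probability tracks input line 0
```

The same first-principles recipe, maximizing the likelihood of observed spikes, extends to networks of such neurons, which is precisely the appeal of the probabilistic framework here.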

But there are signs that probabilistic modelling and programming may be making a comeback, with the support of companies like Intel and Uber. Efforts to integrate ANN-based techniques and probabilistic programming may lead the next wave of innovations in machine learning.

*Descriptive and prescriptive models can also be mixed as in variational autoencoders.

** ANN-based models can be used to define local parameterizations of portions — known as factors or conditional distributions — of the graph.

The Fifth London Symposium on Information Theory (LSIT)

A few months after the England national team’s first-ever appearance in a World Cup, and a month after the BBC’s first overseas live TV broadcast (from France), a distinguished group of academics gathered at the Royal Society in London to talk about information theory. It was 1950 — only two years after the publication of Shannon’s seminal paper and of Wiener’s “Cybernetics” — and the new ideas of information, control, and feedback were quickly making their way from engineering to the natural, social, and human sciences, begetting new insights and raising new questions.

This “cybernetic moment” [2] underpinned the first four editions of the London Symposium on Information Theory (LSIT), with the first meeting in 1950 followed by the symposia in 1952, 1955, and 1960. The program in 1950, shown in the figure, featured two talks by Shannon on communications and coding, as well as a number of presentations on topics ranging from physics, statistics, and radar, to linguistics, neuroscience, psychology, and neurophysiology. The first LSIT was also notable for two written contributions by Alan Turing, who could not attend in person. One of the contributions offered the following ante-litteram definition of machine learning:

If […] the operations of the machine itself could alter its instructions, there is the possibility that a learning process could by this means completely alter the programme in the machine.

According to the report [2], the second meeting, held in 1952, was characterized by an “emphasis on the transmission and analysis of speech”, while the third LSIT in 1955 again covered a wide variety of topics, including “anatomy, animal welfare, anthropology, […] neuropsychiatry, […] phonetics, political theory”. David Slepian, one of the participants and a future author of pioneering contributions to the field, would later write in his Bell Labs report about this third meeting that the “best definition I was able to get as to what constituted ‘The Information Theory’ was ‘the sort of material on this program’!” [2]. At the same time, the heterogeneity of topics in the program may have been one of the motivations behind Shannon’s “Bandwagon” paper, published the following year [3]. In it, Shannon famously warned against the indiscriminate application of information theory based solely on the abstract relevance of the concept of information to many scientific and philosophical fields.

The fourth LSIT was held in 1960 and featured among its speakers Marvin Minsky, one of the founding fathers of Artificial Intelligence (AI), who delivered a talk entitled “Learning in Random Nets”.

In the middle of our own “AI moment”, the time seemed right to bring back to London the discussion initiated in the fifties and sixties during the first four LSIT editions. And so, with a temporal leap of almost sixty years, the fifth LSIT was held at King’s College London on May 30-31, 2019. The symposium was organized by Deniz Gündüz and Osvaldo Simeone from Imperial College London and King’s College London, respectively — two institutions that featured prominently in the first editions of LSIT (see figure).

While heeding Shannon’s warning, the program of the symposium aimed at exploring the “daisy” of intersections of information theory with fields such as statistics, machine learning, physics, communication theory, and computer science. Each day featured two keynote talks, along with two invited sessions, and a poster session with invited as well as contributed posters submitted to an open call. The first day was devoted to the intersection between machine learning and information theory, while the second day focused on novel applications of information theory.

The first day started with a keynote by Michael Gastpar (EPFL), who presented a talk entitled “Information measures, learning and generalization”. This was followed by an invited session on “Information theory and data-driven methods”, chaired by Iñaki Esnaola (University of Sheffield), which featured talks by Bernhard Geiger (Graz University of Technology) on “How (not) to train your neural network using the information bottleneck principle”; by Jonathan Scarlett (National University of Singapore) on “Converse bounds for Gaussian process bandit optimization”; by Changho Suh (KAIST) on “Matrix completion with graph side information”; and by Camilla Hollanti (Aalto University) on “In the quest for the capacity of private information retrieval from coded and colluding servers”. The session was interrupted by a fire alarm that was carefully timed by the organizers in order to give the attendees more time to enjoy the storied surroundings of the Strand Campus of King’s College London.

After lunch, the afternoon kicked off with a keynote talk by Phil Schniter (Ohio State University) on “Recent advances in approximate message passing”, which was followed by an invited session on “Statistical signal processing”, organized by Ramji Venkataramanan (University of Cambridge), which featured talks by Po-Ling Loh (University of Wisconsin-Madison) on “Teaching and learning in uncertainty”; by Cynthia Rush (Columbia University) on “SLOPE is better than LASSO”; by Jean Barbier (EPFL) on “Mutual information for the dense stochastic block model: A direct proof”; and by Galen Reeves (Duke University) on “The geometry of community detection via the MMSE matrix”. The first day ended with a poster session organized by Bruno Clerckx (Imperial College London), and with wine, refreshments, and the view of the Thames and Waterloo Bridge from the 8th floor of Bush House.


The second and last day started off with a keynote by Kannan Ramchandran (Berkeley) on “Beyond communications: Codes offer a CLEAR advantage (Computing, LEArning, and Recovery)”. Next was an invited session on “Information theory and frontiers in communications”, chaired by Zoran Cvetkovic (King’s College London), with talks by Ayfer Özgür (Stanford) on “Distributed learning under communication constraints”; by Mark Wilde (Louisiana State University) on “A tale of quantum data processing and recovery”; by Michèle Wigger (Telecom ParisTech) on “Networks with mixed delay constraints”; by Aaron Wagner (Cornell University) on “What hockey and foraging animals can teach us about feedback communication”; and by Ofer Shayevitz (Tel Aviv University) on “The minimax quadratic risk of distributed correlation estimation”. The afternoon session was opened by Yiannis Kontoyiannis (University of Cambridge), who gave a keynote on “Bayesian inference for discrete time series using context trees”, and continued with an invited session on “Post-quantum cryptography”, organized by Cong Ling (Imperial College London), with talks by Shun Watanabe (Tokyo University of Agriculture and Technology) on “Change of measure argument for strong converse and application to parallel repetition”; by Qian Guo (University of Bergen) on “Decryption failure attacks on post-quantum cryptographic primitives with error-correcting codes”; by Leo Ducas (CWI) on “Polynomial time bounded distance decoding near Minkowski’s bound in discrete logarithm lattices”; and by Thomas Prest (PQShield) on “Unifying leakage models on a Rényi day”. Like the first, the second was not a rainy day, and attendees were able to enjoy the view from the Bush House terrace with wine and mezes while discussing results from the poster sessions organized by Mario Berta (Imperial College London) and Kai-Kit Wong (University College London).

Videos of all talks are available here.

Registration was free and more than 150 students, researchers, and academics were in attendance. Support was provided by the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme (Grant Agreements No. 725731 and 677854).

The organizers hope for this edition to be the first of many, with the sixth LSIT planned for 2021. LSIT has outlived the cybernetic movement, and it may continue well beyond the current “AI moment”.

(Written by Osvaldo Simeone and Deniz Gündüz)


[1] N. Blachman, “Report on the third London Symposium on Information Theory,” IRE Transactions on Information Theory, vol. 2, no. 1, pp. 17-23, March 1956.

[2] R. R. Kline, “The Cybernetics Moment: Or Why We Call Our Age the Information Age”, Johns Hopkins University Press, 2015.

[3] C. E. Shannon, “The bandwagon”, IRE Transactions on Information Theory, vol. 2, no. 1, Mar. 1956.

The Extranet of Skills

Now don’t go getting us wrong.

We want the best for you. We want to make the world more connected. We want you to share your skills and ideas with the world. We want you to be yourself. We want you to feel less alone. We want you to find others just like you. 

We want to make your life easier. We want to help you sort out everyday little problems. We want you to find only the information you want when you want it. We want you to know where you are right now. We want you to want to be where we know you’ll be. We want you to want to do what we know you’ll do. We want you to share your special moments with us. We want you to record your life because you mean so much to us and to the world. We want you to know that we’re very interested in what matters to you. We want you to know it matters to us too.

We want to count every step you take. We want you to be fit and strong. We want to monitor your emotions, predict your needs, and converse with you. We also want to help the less fortunate, and isn’t it nice that we can use the same tools for that too. We want to know your DNA so that we can find, for a small amount of money, where you come from and alert you to any potential genetic health issues; and we want it only for these totally legitimate reasons as a useful service to you.

We want to know what music you listen to, which series you like, which movies you watch, which books you read. We also want to probe your moral choices — just to entertain you. We want to know what you are wearing, who you look up to. We want to tailor our advertising bespoke to you. We want it to be right for you. We want you to take our fun psychological test to find out what kind of person you really are and who you’ll vote for in the next election or what you will vote for in the next poll.

We want to be there in your home. We want you to think of us as a family member. We’re interested in everything you say. We want you to see through that screen or through our special glasses (maybe only when you are at work for now) while you’re looking at something entirely other than us. We want to educate your children, and we need their personal data in order to tailor our service to them (your kids are as unique and special as you are).

We want to learn from you. We want our machines to learn from you. We want you to tell us what you see and hear, so that our machines can learn how to see and hear for you, whether you want it or not. We want to be creative so you don’t have to be.

We want you to know we’re keeping you safe. We want you to know we respect and protect your privacy. We want you to know that we believe privacy is a human right and a civil liberty, as long as you use our platform. We want to assure you that you have control, as long as you use our platform.

We want your pasts and your presents because we want your futures too.

(Inspired by, and partly quoted from, “Spring” by Ali Smith.)