Abduction and The Myth of AI

Let us start with a quiz (say a pub quiz, with apologies to non-Brits): When was the following written?

“The present desire for the mechanical replacement of the human mind has its sharp limits. When the task done by an individual is narrowly and sharply understood, it is not too difficult to find a fairly adequate replacement either by a purely mechanical device or by an organization in which human minds are put together as if they were cogs in such a device. […] To expect to obtain new ideas of real significance […] without the leadership of a first-rate mind is another form of the fallacy of the monkeys and the typewriter […].”

The use of the term “mechanical” may offer a hint that this is not a recent quote, but replace “mechanical” with “digital” and you’ll have what reads to me as an accurate picture of the current state of affairs in the field of AI. It was written by Norbert Wiener, the founder of cybernetics, in 1954 in a book that was published decades after the author’s death. In it, Wiener notes that there is a clear distinction between narrow cognitive tasks, on the one hand, and the higher level of intelligence producing “new ideas of real significance”, on the other. Despite being a decades-old observation, this differentiation tends to be lost in many prominent accounts of AI that assume a natural progression from one level of intelligence to the other.

This belief in an inevitable evolution from narrow to general AI is widespread, but logically hard to justify, notwithstanding the appeal of ideas such as superintelligence and singularity: How can a lower-grade intelligence develop one of higher complexity? As John von Neumann wrote in 1948, “An organization which synthesizes something is necessarily more complicated, of a higher order, than the organization it synthesizes”. The origin, substance, and implications of the “myth” of AI that builds on this belief are the subject of Erik J. Larson’s informative new book.

In Larson’s view, the simplification of intelligence implied by the assumption of a continuum between narrow and general AI originated with the very birth of the field of AI in the work of Alan Turing. Turing came to believe that intuition could be programmed in a computing machine, and realized via deduction and induction. Deductive reasoning requires a codified set of true statements to form the basis of logical inference (if I am typing, and typing requires a keyboard, then I am using a keyboard). In contrast, induction posits the stationarity and “light-tailed” distribution of the possible states of the world, ruling out “black swans” like pandemics or market crashes and enabling extrapolation from data (if I have seen stock A go down many times as stock B goes up, then upon seeing stock B go up I will predict that stock A will go down).
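As an aside, the two modes of inference can be contrasted in a toy code sketch; the rule, the stocks, and all the numbers below are invented purely for illustration:

```python
# Deduction: apply codified true statements (rules) to reach certain conclusions.
facts = {"typing"}
rules = [({"typing"}, "using_keyboard")]  # "typing requires a keyboard"
for premises, conclusion in rules:
    if premises <= facts:
        facts.add(conclusion)
assert "using_keyboard" in facts  # the conclusion follows with certainty

# Induction: extrapolate from observed co-movements of two stocks, implicitly
# assuming that the historical distribution persists (no "black swans").
history = [(-1, +1), (-1, +1), (-1, +1), (+1, -1)]  # (move of A, move of B)
a_moves_when_b_up = [a for a, b in history if b > 0]
prob_a_down_given_b_up = (
    sum(1 for a in a_moves_when_b_up if a < 0) / len(a_moves_when_b_up)
)
print(prob_a_down_given_b_up)  # 1.0 on this toy history
```

The deductive conclusion is certain given the rules; the inductive prediction is only as good as the assumption that the past distribution keeps holding.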

Larson then argues that the key barrier standing in the way of a direct path between current and general AI is the lack of any known approach to programming a third form of inference, distinct from deduction and induction: abduction. Introduced by Charles Sanders Peirce in the second half of the 19th century, abduction is a form of inferential reasoning whereby probable causes — Peirce called them “hypotheses” — are guessed from evidence and subsequently tested by means of experiments. Abduction can be implemented by Bayes’ rule when prior conditions can be specified in a meaningful way. More generally, abduction entails forming hypotheses based on intuition and creative leaps of logic. Maxwell’s unification of electricity and magnetism, Einstein’s space-time, and Gödel’s incompleteness theorems are all outcomes of abductive reasoning.
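When priors and likelihoods can indeed be meaningfully specified, Bayesian abduction reduces to a few lines of arithmetic. A minimal sketch (the hypotheses and every probability below are made up for illustration):

```python
# Guess the most probable "hypothesis" (Peirce's term) from a piece of
# evidence via Bayes' rule. All numbers are invented.
priors = {"flu": 0.10, "cold": 0.30, "healthy": 0.60}
likelihood_fever = {"flu": 0.90, "cold": 0.40, "healthy": 0.02}  # p(fever | h)

evidence = sum(priors[h] * likelihood_fever[h] for h in priors)
posterior = {h: priors[h] * likelihood_fever[h] / evidence for h in priors}

best_guess = max(posterior, key=posterior.get)
print(best_guess)  # 'cold': its posterior (~0.54) narrowly beats 'flu' (~0.41)
```

In Peirce’s full scheme, this guess would only be the first step, to be followed by experimental testing.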

In Larson’s well-argued view, the belief that more research on current AI methods may eventually lead to general AI, as well as the underlying assumption of the “programmability” of intelligence, may have nefarious implications for science. If all scientific progress were within the reach of programmable reasoning, all we would need to invest in is computing resources and facilities where swarms of researchers would be tasked with mechanically following codified rules. If, instead, we believe with Wiener that there are still important scientific truths to be discovered that cannot be deduced from known facts or induced from data, we should promote the education of “first-rate minds” by rewarding original thinking and creativity.

Semantics, AI, and The Real World


In mid-February of last year, my family took the momentous decision to adopt a shared calendar — an actual paper calendar to keep track of travels, classes, and various engagements. Everyone’s schedules were getting too complex, and we thought that a paper trail of evidence would limit opportunities for recrimination. Ten days later, the calendar was demoted to the role of scratch paper, trips no longer a possibility, classes cancelled, engagements reduced to Zoom calls.

The “AIs” on my smart phones did not register the change, sending daily reminders about the best way to get to work, going so far as to suggest the best route by car (I have not owned one since before the time the words “smart” and “phone” meant anything together). My two-year-old son, instead, took the change in stride, learning quickly how to navigate upturned schedules and moods, masks nothing more than what (some) people wear to go out for a walk.

The AIs were stuck within the “machine”, crunching data in search of regularities that did not exist; while my son took reality in, connecting the dots between behavior and consequences, between words and objects, between symbols and the real world.

Which leads us to the term “semantics”, which has gone through changes of its own in recent years: following Brian Cantwell Smith’s phrasing, in computer science semantics has come to refer to the behavioral consequences, within the computer system, of a program being executed, while its traditional use describes the relationships and consequences of a given program, idea, or action in the real world. According to the new definition, the AIs in my smart phone are engaged in semantically meaningful tasks, while the old definition would assign the “semantic badge” only to my son’s efforts.

There are, of course, ways in which the AIs can receive feedback from the real world. Think of robots finding optimal trajectories in an environment via reinforcement learning; or the (infamous) A/B testing. But these are all mediated by some algorithmic encoder converting real-world inputs, such as user experience, into numerical quantities, such as binary feedback. Uncertainty, imperfections, and complexities of the real world are simplified in ways that may distort their actual significance. Paraphrasing Smith again, in the words of Paul Taylor, “if we seem to inhabit a world that is constructed of well-defined objects exemplifying properties and standing in unambiguous relations, that is an achievement of our intelligence, not a truth that can be used when engineering an artificial intelligence”.

Following the lead of computer scientists, communication technology researchers (yes, like me) have also started using the term “semantics” to refer to goal-oriented communications, which moves past the classical goal of transferring bits towards the aim of transferring actionable information. This approach inherits the philosophical limitations of the terminology used in computer science. One may be justifiably suspicious about the need for introducing this new language. Time will tell if this will go the way of my family’s first paper calendar or if it is here to stay.

A Course on Machine Learning for Engineers

A couple of years ago, in a different era, I posted a set of notes on machine learning. Since then, I have been asked a number of times to share slides to be used for teaching a course on machine learning for engineers. It has taken me some time, but I can finally link to a page that contains a set of slides covering material from basic background in probability and linear algebra to advanced topics such as generative adversarial networks. This is work in progress, and more chapters are in the pipeline. I welcome comments and feedback.

Free NRG

Some energy can be turned into useful work; and some is merely a reflection of a system’s complexity and remains unusable. If you’ll excuse some math, we may write that the free energy F — the part of the energy that can be turned into work — equals the total energy E minus the system’s entropy S — the unusable bit — as in F = E - S. (I am forgetting the temperature here, but hopefully you’ll excuse this too.) This idea goes back to the roots of thermodynamics and of the second industrial revolution, and it lurks in the background of a surprisingly (for me) large number of applications in machine learning, information theory, and even neuroscience.
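For the curious, the relation can be checked numerically on a toy two-level system at unit temperature (the energy levels below are arbitrary):

```python
import math

# Two-level system with Boltzmann probabilities at temperature T = 1.
energies = [0.0, 1.0]
Z = sum(math.exp(-e) for e in energies)            # partition function
p = [math.exp(-e) / Z for e in energies]           # Boltzmann distribution

E = sum(pi * ei for pi, ei in zip(p, energies))    # average energy
S = -sum(pi * math.log(pi) for pi in p)            # entropy (in nats)
F = E - S                                          # free energy at T = 1

# Standard identity: for a Boltzmann distribution, F = -log Z.
print(F, -math.log(Z))  # the two values coincide
```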

Take machine learning. A successful learning algorithm — don’t call it AI — should perform well not only on the training data it has available in the lab, but also on new data it will encounter in the future when deployed in the real world. To paraphrase Borges, to learn is “to forget differences, generalize, make abstractions.” An algorithm that is too complex is bound to capture inadvertently some of the noise in the training data that has no bearing on real-world conditions. Now, think of the total energy E as the performance of the algorithm in the lab and of the entropy S as the part of this performance that is actually just an artifact of the complexity of the algorithm. The real-world performance of the learning algorithm is then exactly the free energy F = E - S.

More on the connections of free energy to modelling, inference, learning, and optimization can be found in these lecture notes written with Sharu Jose.

The Machine Does Not Stop

In a few weeks, we went from expecting the technological singularity to make humanity obsolete in our lifetimes to being shocked and outraged at the limitations of our technological tools. Shouldn’t our most advanced “AI” have predicted the spread of the virus? Shouldn’t it have pointed to a cure or a vaccine by now? And yet, in an odd turn, technology is taking over even more of our personal spaces and relationships, recording and predicting, repackaging and reporting, connecting and exposing.

Where we go from here is what many pundits are rushing to opine on. What does it all mean for democracy and globalization? What does it mean for the economy? How do we trade lives for livelihoods? And how do we agree on common policies when most of the world population cannot afford the luxury of being socially isolated and watched over by machines of loving grace?

When reality shifts, stories that were just outside the light cone of reality take their place in the tree of possibilities that unrolls from now. One of them is the 1909 short story The Machine Stops by E. M. Forster. Here is how it starts:

Imagine, if you can, a small room, hexagonal in shape, like the cell of a bee. It is lighted neither by window nor by lamp, yet it is filled with a soft radiance. There are no apertures for ventilation, yet the air is fresh. There are no musical instruments, and yet, at the moment that my meditation opens, this room is throbbing with melodious sounds. […] Above her, beneath her, and around her, the Machine hummed eternally; she did not notice the noise, for she had been born with it in her ears.

As we are told, or forced at gunpoint, not to leave our houses, this isn’t as hard to imagine as in Edwardian times, or even a few weeks ago: You have your own underground pod, for you to live in isolation, with food, music, and communications — both text and video — all delivered by the Machine. Travel is discouraged as a dangerous frivolity to be avoided.

In each room there sat a human being, eating, or sleeping, or producing ideas.

Lacking any link to the world and to other people, the ideas being produced are valued for their abstraction, remoteness, and purity:

Let your ideas be second-hand, and if possible tenth-hand, for then they will be far removed from that disturbing element — direct observation.

And they are exchanged with

several thousand people, [as] in certain directions human intercourse had advanced enormously.

What happens when the Machine takes over? In the short story, the system eventually collapses, possibly from neglect and condescension. Those dependent on the Machine perish, their lives made meaningless and incomprehensible outside the system. Only the “surface-dwellers”, those who kept a link with the natural world unmediated by the Machine, survive, inheriting a world in need of rebuilding.

This is Not Spam

One day you tell a lie. The story you tell is outrageous; it flies in the face of science and reason. Like that 18th-century woman who claimed to be able to give birth to rabbits.

The person you tell the lie to is intrigued: if this is true, he would be the one disclosing it to the world. Sometimes science is wrong; sometimes special things happen. Why not to him?

So, the person tells the lie to another person, this time a renowned scientist. His career has seen better days, and he is eager to hear more. You put up a little show, and he is convinced.

So, he posts a scientific paper explaining how he found out about it and how this changes everything.

The scientific community is skeptical, but the story slowly goes viral. The article is a sensation, producing and reproducing memes in all corners of the web: once again science has failed, once again they are not telling us the whole truth.

Some researchers in Silicon Valley decide to run the story through an AI to get a second opinion. Unfortunately, the AI misinterprets a sentence, and outputs just this: It was a teenage wedding, and the old folks wished them well. This puzzles everyone.

Most politicians are cautious, while others declare this to be yet another example that you cannot trust the establishment.

A few days later you run for office.

(This is mostly just to recommend Dexter Palmer’s clever new novel.)

Brain-Inspired Computing


(The following is a partial reproduction of the editorial published on the IEEE Signal Processing Magazine for the special issue on Brain-Inspired Computing co-edited with Bipin Rajendran, Andre Gruning, Evangelos Eleftheriou, Mike Davies, Sophie Deneve, and Guang-Bin Huang)

Context. The success of Artificial Neural Networks (ANNs) in carrying out various specialized cognitive tasks has brought along renewed efforts to apply machine learning (ML) tools for economic, commercial, and societal aims, while also raising expectations regarding the advent of an “Artificial General Intelligence”. Recent much-publicized examples of ML breakthroughs include the ANN-based algorithm AlphaGo, which has proven capable of beating human champions at the complex strategic game of Go. The emergence of a new generation of ANN-based ML tools has built upon the unprecedented availability of computing power in data centers and cloud computing platforms. For example, the AlphaGo Zero version required weeks of training over 64 GPU workers and 19 CPU parameter servers, with an estimated hardware cost of $25 million; and OpenAI’s video game-playing program needed training for an equivalent of 45,000 years of game play, costing millions of dollars in rented cloud computing services.

Recent studies have more generally quantified the requirements of ANN-based models in terms of energy, time, and memory consumption both in the training and in the inference (run-time) phases. As an example, a recent work by researchers from the University of Massachusetts Amherst has concluded that training a single ANN-based ML model can emit as much carbon as five cars in their lifetimes.

The massive resource requirements of ANN-based ML raise important questions regarding the accessibility of the technology to the general public and to smaller businesses. Furthermore, they pose an important impediment to the deployment of powerful ML algorithms on low-power mobile or embedded devices.

The importance of developing suitable methods for low-power AI to be implemented on mobile and embedded devices is attested by its central role in applications such as digital health, the tactile Internet, smart cities, and smart homes. In light of this, key industrial players, including Apple, Google, Huawei, and IBM, are investing in the development of new chips optimized for streaming matrix arithmetic that promise to make ANN-based inference more energy efficient through complexity-reduction techniques such as quantization and pruning.

Neuromorphic, or brain-inspired, computing. In contrast to ANNs, the human brain is capable of performing more general and complex tasks at a minute fraction of the power, time, and space required by state-of-the-art supercomputers. An emerging line of work, often collectively labeled as “neuromorphic” computing, aims at uncovering novel computational frameworks that mimic the operation of the brain in a quest for orders-of-magnitude improvements in terms of energy efficiency and resource requirements.

There may be many reasons for the unmatched efficiency of the human brain as an adaptive learning and inference machine. Among these, none appears to be more fundamental, and more fundamentally different from the operation of a digital computer, than the way in which neurons encode information: with time, rather than merely over time. Biological neurons can be thought of as complex dynamic systems with internal analog dynamics that communicate through the timing of all-or-nothing — and hence digital — spikes. This is in stark contrast to the static analog operation of neurons in an ANN. Biological neurons are connected through complex networks characterized by large fan-out, feedback, and recurrent signaling paths, unlike the feedforward or chain-like recurrent architectures of ANNs. As studied in theoretical neuroscience, the sparse, dynamic, and event-driven operation of biological neurons makes it possible to implement complex online adaptation and learning mechanisms via local synaptic plasticity rules, at minimal energy consumption.

Based on these observations, brain-inspired neuromorphic signal processing and learning algorithms and hardware platforms have recently emerged as a low-power alternative to energy-hungry ANNs. Unlike conventional neural networks, Spiking Neural Networks (SNNs) are trainable dynamic systems that make use of the temporal dimension, not just as a neutral substrate for computing, but as a means to encode and process information in the form of asynchronous spike trains. In SNNs, inter-neuron communications and intra-neuron computing are carried out on sparse spiking, and hence time-encoded, signals.
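A leaky integrate-and-fire (LIF) neuron, the simplest spiking-neuron model, illustrates this mix of analog internal dynamics and all-or-nothing outputs. The following is a minimal sketch with illustrative parameters, not the neuron model of any specific platform:

```python
import math

def lif_run(inputs, tau=10.0, threshold=1.0, dt=1.0):
    """Simulate a leaky integrate-and-fire neuron on a list of input currents."""
    decay = math.exp(-dt / tau)  # per-step leak of the membrane potential
    v, spikes = 0.0, []
    for x in inputs:
        v = decay * v + x        # analog internal dynamics: leaky integration
        if v >= threshold:       # all-or-nothing, and hence digital, spike
            spikes.append(1)
            v = 0.0              # reset after spiking
        else:
            spikes.append(0)
    return spikes

# A constant input current is turned into a sparse, time-encoded spike train.
print(lif_run([0.3] * 20))  # with these parameters, a spike every fourth step
```

Information is carried by the spike timings, while the energy cost scales with the number of spikes rather than with the number of time steps.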

This has motivated the development of prototype neuromorphic hardware platforms that are able to process time-encoded data. These platforms include IBM’s TrueNorth, SpiNNaker, developed within the Human Brain Project, Intel’s Loihi, and more advanced proof-of-concept prototypes based on nanoscale memristive devices. These systems are typically based on hybrid digital-analog circuitry and in-memory computing, and they have already provided convincing proof-of-concept evidence of the remarkable energy savings that can be achieved with respect to conventional neural networks. Furthermore, SNNs have the unique advantage of being able to natively process spiking data as produced by emerging audio and video sensors inspired by biology, such as silicon cochleas or Dynamic Vision Sensor (DVS) cameras.

The role of Signal Processing in Neuromorphic Computing. Work on neuromorphic computing has been pursued, often in parallel, by researchers in machine learning, computational neuroscience, and hardware design. While the problems under study — regression, classification, control, and learning — are central to signal processing, the signal processing community has by and large not been involved in the definition of this emerging field. Nevertheless, with the increasing availability of neuromorphic chips and platforms, it is the view of the guest editors that progress in the field of neuromorphic computing calls for an interdisciplinary effort by researchers in signal processing in concert with researchers in machine learning, hardware design, system design, and computational neuroscience.

From a signal processing perspective, the specific features and constraints of neuromorphic computing platforms open interesting new problems concerning regression, classification, control, and learning. In particular, SNNs consist of asynchronous distributed architectures that process sparse binary time series by means of local spike-driven computations, local or global feedback, and online learning. Ideally, they are characterized by a graceful degradation in performance as the number of spikes, and hence the energy usage, of the network decreases. As an example, recent work has shown that SNNs can obtain satisfactory solutions to the sparse regression (LASSO) problem much more quickly than conventional iterative algorithms (see, e.g., [8]). Solutions leverage tools that are well-known to signal processing researchers, such as variational inference, nonlinear systems, and stochastic gradient descent.
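For reference, the conventional iterative baseline for LASSO is easy to sketch. The following is ISTA (iterative shrinkage-thresholding) on synthetic data, shown only as the kind of solver SNN-based approaches have been compared against, not the method of [8]:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))                 # sensing matrix
x_true = np.zeros(10)
x_true[[2, 7]] = [1.5, -2.0]                      # sparse ground truth
y = A @ x_true                                    # noiseless measurements

lam = 0.1                                         # sparsity penalty
step = 1.0 / np.linalg.norm(A, 2) ** 2            # 1 / Lipschitz constant

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

x = np.zeros(10)
for _ in range(500):                              # ISTA iterations
    # Gradient step on the quadratic term, then shrinkage on the l1 term.
    x = soft_threshold(x - step * A.T @ (A @ x - y), step * lam)

print(np.round(x, 2))  # estimate should concentrate on indices 2 and 7
```

Each iteration requires dense matrix-vector products, which is precisely the kind of synchronous arithmetic that event-driven spiking implementations aim to avoid.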

This special issue. The scope of the field, encompassing neuroscience, hardware design, and machine learning, makes it difficult for a non-expert to find a suitable entry point in the literature. It is the goal of this special issue to bring together key researchers in this area, with the aim of providing the readership of the IEEE Signal Processing Magazine with up-to-date and survey-style papers on algorithmic, hardware, and neuroscience perspectives on the state-of-the-art of this emerging field. The special issue can be found here and further information is available here.



Go to your local bookstore (if you still have one), and you’ll find a growing number of recent novels with plots built on some projection of the role of AI in the near future. None of them seems to me even close to matching the range, urgency, technical prowess, and sheer fun of Stanislaw Lem’s The Cyberiad, published in Polish in the same year as the premiere of Help! by The Beatles (1965).

(The book was published in English only nine years later. Incidentally, 1974 is also when the FBI received a letter from Philip K. Dick maintaining that Stanislaw Lem was “probably a composite committee rather than an individual”, and that the committee operated on the orders of the Communist party to “control opinion through criticism and pedagogic essay”.)

I was unaware of this book until recently, but I have since learned that it has quite a following. Renowned cosmologist Sean Carroll described it as “a wide-ranging exploration of robotics, technology, computation and social structures.” And that it is, while also being a sort of Decameron set in an intergalactic medieval universe. The stories in the collection follow two “constructors” roaming a universe of kings, knights, robots, and dragons. The constructors are in the business of building AI solutions — not the term used by Lem, who was concerned with cybernetics, but that’s what we would call them today — for wealthy patrons. Here are some of my favorites.

General AI. To put this tale in modern terms, imagine the chief scientist at a top data company who has just completed years of training of the most powerful machine learning model based on all data available to mankind. Is this finally the dawn of general AI? The scientist starts the machine in the presence of a colleague. When asked to sum 2 and 2, the machine responds 7. No fix for this apparent bug can be found, despite the desperate efforts of the machine’s creator. The other scientist comments admiringly:

there is no question but that we have here a stupid machine, and not merely stupid in the usual, normal way, oh no! This is, as far as I can determine – and you know I am something of an expert – this is the stupidest thinking machine in the entire world, and that’s nothing to sneeze at! To construct deliberately, such a machine would be far from easy; in fact I would say that no one could manage it. For the thing is not only stupid, but stubborn as a mule.

In the story, the machine ends up chasing its maker. Today it may find applications for speech writing or as a predictive model for politicians.

AI for military. It is undeniable that two of the most successful applications of AI so far have been targeted advertising and military technology (police AI technology has had some setbacks). Lem imagines a new military AI technology with the power of creating a perfect army: for each soldier, “a plug is screwed in front, a socket in the back“, and, lo and behold, the platoon acts as a single mind. When deployed by two eager kings, here is what happens:

As formation joined formation, in proportion there developed an esthetic sense, […] the weightiest metaphysical questions were considered, and, with an absentmindedness characteristic of great genius, these large units lost their weapons, […] and completely forgot that there was a war on […] both armies went hand in hand, picking flowers beneath the fluffy white clouds, on the field of the battle that never was.

If life were only like this.

AI and art. AI has found its way into museums, concert halls, and galleries around the world. In one of the tales, Lem has a constructor build an AI poet. Puzzling over the best design, the constructor reads every book of poetry he can get his hands on, until he finally realizes that

in order to program a poetry machine, one would first have to repeat the entire Universe from the beginning.

Not one to be discouraged by such trifles, the constructor builds a machine to model the universe from the Big Bang to the present. After some tweaking, the machine outputs gems such as this one:

“Cancel me not — for what then shall remain?/ Abscissas, some mantissas, modules, modes./ A root or two, a torus, and a node:/ The inverse of my verse, a null domain./ […] I see the eigenvalue in thine eye,/ I hear the tender tensor in thy sigh./ Bernoulli would have been content to die,/ Had he but known such a^2cos2φ!”

Through sonnets and cantos of such supreme quality, the AI poet ends up causing severe attacks of “esthetic ecstasy” across the galaxy, forcing the authorities to sentence it to exile.

AI and Information. Machine learning works by finding informative patterns in data. How informative the patterns are depends on the end user, who may or may not find new knowledge or utility in them: informative but useless patterns are everywhere. Lem imagines the possibility of designing a

Demon of the Second Kind, which […] will extract information […] about everything that was, is, may be or ever will be.

In a manner similar to its predecessor, the new demon peers through an opening of a box filled with some gas, but, instead of merely selecting molecules based on their velocities, it lets out “only significant information, keeping in all the nonsense.” This way, the demon extracts “from the dance of atoms only information that is genuine, like mathematical theorems, fashion magazines, blueprints, historical chronicles, or a recipe for ion crumpets.” And there would indeed be a lot of information to retrieve:

in a milligram of air and in a fraction of second, there would come into being all the cantos of all the epic poems to be written in the next million years, as well as an abundance of wonderful truths.

And yet, even to the most avid information junkie, “all this information, entirely true and meaningful in every particular, was absolutely useless”, leaving the poor end user of the story entangled in miles and miles of paper, unable to move.

And much more. The collection covers much more, including AI and morality (“all these processes take place only because I programmed them,…”, but maybe “a sufferer is one who behaves like a sufferer!”), AI lawyers and advisers, and even a (rather disappointing!) civilization that has achieved the Highest Possible Level of Development.