Human-Imitative AI vs. Useful AI

In a recent post, Michael I. Jordan distinguishes the following aspirations for AI-related research:

  • Human imitation: AI should behave in a way that is indistinguishable from the behavior of a human being.
  • Intelligence Augmentation (IA): AI should augment human capacity to think, communicate, and create.
  • Intelligence Infrastructure (II): AI should manage a network of computing and communicating agents, markets, and repositories of information in a way that is efficient, supportive of human and societal needs, and economically and legally viable.

Human-imitative AI is often singled out by technologists, academicians, journalists and venture capitalists as the real aspiration and end goal of AI research. However, most progress in “AI” to date has not concerned high-level abstract thinking machines, but rather low-level engineering solutions with roots in operations research, statistics, pattern recognition, information theory and control theory.

In fact, as argued by Jordan, the single-minded focus on human-imitative AI has become a distraction from the more useful endeavor of addressing IA and II. To this end, the post calls for the founding of a new engineering discipline building on

ideas that the preceding century gave substance to — ideas such as “information,” “algorithm,” “data,” “uncertainty,” “computing,” “inference,” and “optimization.” Moreover, since much of the focus of the new discipline will be on data from and about humans, its development will require perspectives from the social sciences and humanities.

The development of such a discipline would call not only for large-scale targeted research efforts, but also for new higher education programs. As proposed in “Robot-Proof”, the new programs should impart data literacy, technological literacy, and human literacy:

Students will need data literacy to manage the flow of big data, and technological literacy to know how their machines work, but human literacy–the humanities, communication, and design–to function as a human being. Life-long learning opportunities will support their ability to adapt to change.


Information Theory is…


[According to Google’s Talk to Books]

“Information theory is a branch of applied mathematics providing a framework allowing the quantification of the information generated or transmitted through a communication channel.” from The Manual of Photography by Elizabeth Allen, Sophie Triantaphillidou

“Information theory is a mathematical theory dealing with highly precise aspects of the communication of information in terms of bits through well-defined channels. The theory of international politics developed in this volume is non-mathematical and non-precise.” from System and Process in International Politics by Morton A. Kaplan

“Information theory is a branch of science, mathematics, and engineering that studies information in a physical and mathematical context rather than a psychological framework.” from Assessing and Measuring Environmental Impact and Sustainability by Jiri J Klemes

“Information theory is the science of message transmission developed by Claude Shannon and other engineers at Bell Telephone Laboratories in the late 1940s. It provides a mathematical means of measuring information.” from The Creation Hypothesis: Scientific Evidence for an Intelligent Designer by James Porter Moreland

“Information theory is a branch of statistics and probability theory dealing with the study of data, ways to manipulate it (for instance, cryptography and compression) and communicate it (for instance, data transmissions and communication systems).” from AI Game Development: Synthetic Creatures with Learning and Reactive Behaviors by Alex J. Champandard

“Information theory provides a means to quantify the complexity of information that can be used in the design of communication systems (Shannon 1948). It originated during World War II as a tool for assuring the successful transmission…” from Oceanography and Marine Biology: An Annual Review by R. N. Gibson, R. J. A. Atkinson, J. D. M. Gordon

And so on…
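Common to all of these definitions is the idea that information can be quantified. As a minimal, purely illustrative Python sketch, the Shannon entropy of a message (the average number of bits per symbol needed to encode it) can be computed from empirical symbol frequencies:

```python
from collections import Counter
from math import log2

# Shannon entropy: the average number of bits per symbol needed to encode
# a message, computed here from empirical symbol frequencies.
def entropy(message):
    counts = Counter(message)
    total = len(message)
    return -sum(c / total * log2(c / total) for c in counts.values())

print(entropy("abracadabra"))  # about 2.04 bits per symbol
```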

 

AI = RL + DL?


One often hears the following two statements made in the same breath: that the current resurgence of artificial intelligence research will fundamentally transform the way in which we live, and that we are on the verge of mastering general intelligence. The frequent conflation of these two strikingly different assessments has led to predictions ranging from an evolutionary catastrophe for the human race to the onset of a downward innovation cycle in AI brought about by overhype. To make sense of these claims, it is useful to take a deeper look at the current state of the art and to walk a few steps back in history for some perspective.

One of the most publicized successes of modern AI is given by programs, developed most notably by DeepMind, that have mastered complex games such as Go, obtaining super-human performance. The topic is considered to be of sufficient dramatic heft to justify a Netflix production. An aspect that has particularly captured the attention of commentators is the capability of these algorithms to learn from a blank slate, not requiring even an initial nudge by the programmers towards strategies that have been found to be effective by human players.

The engine underlying these programs is reinforcement learning, which builds on the idea of using feedback from a large number of simulated experiences to slowly gather information about the environment and/or the optimal strategies to be adopted. The specific algorithms employed date from the 70s and can be eloquently explained by a cartoon. What the new AI wave has brought to the table is hence, by and large, not a novel understanding of intelligence, but rather a bag of clever tricks that allows old algorithms to make effective statistical use of unprecedented computational resources. Today’s algorithms would look very familiar to a researcher of the 80s. But they would also be useless to her: with the computing technology of the time, she would have to wait a few million years to obtain the results that now take just a few days on modern processors.
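To make the flavor of these methods concrete, here is a minimal sketch of tabular Q-learning, one such decades-old algorithm, run on an invented toy chain environment (the environment and all constants are illustrative assumptions, not anything from DeepMind’s systems):

```python
import random

# Minimal tabular Q-learning on a toy chain: states 0..4, actions 0 (left)
# and 1 (right); reaching state 4 ends the episode with reward 1. The agent
# gathers information purely from feedback on simulated experiences.
N_STATES, ALPHA, GAMMA, EPSILON = 5, 0.1, 0.9, 0.2
Q = [[0.0, 0.0] for _ in range(N_STATES)]

for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
        if random.random() < EPSILON:
            action = random.randint(0, 1)
        else:
            action = max((0, 1), key=lambda a: Q[state][a])
        next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Feedback from the simulated step slowly improves the value estimate.
        Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
        state = next_state

print(Q)  # action 1 (right) should end up with the higher value in every state
```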

In fact, even the most straightforward black-box optimization schemes, such as evolution strategies, have been shown to provide state-of-the-art results. These schemes merely test many random perturbations of the current strategy by leveraging computing parallelism, and they modify the current solution according to the feedback received from simulations of the environment. As such, these methods rely entirely on the capability of the computing system to simulate the effect of many variations of the actions on the environment across test runs.
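A bare-bones sketch of such an evolution strategy might look as follows, with a simple quadratic objective standing in for the simulated environment (the objective and all constants are assumptions made for illustration); note that each perturbation could be scored in parallel:

```python
import numpy as np

# A bare-bones evolution strategy: sample random perturbations of the current
# parameter vector, score each one with the (simulated) objective, and move
# the solution along the reward-weighted average of the perturbations.
def objective(theta):                   # stand-in for a simulated environment
    return -np.sum((theta - 3.0) ** 2)  # maximized at theta = 3 everywhere

theta = np.zeros(10)
sigma, lr, n_samples = 0.1, 0.02, 100

for step in range(300):
    noise = np.random.randn(n_samples, theta.size)
    rewards = np.array([objective(theta + sigma * eps) for eps in noise])
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # normalize feedback
    theta += lr / (n_samples * sigma) * noise.T @ rewards          # update step

print(np.round(theta, 2))  # should approach 3.0 in every coordinate
```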

The fact that sheer computing power is to be credited for the most visible successes of AI should give us pause when commenting on our understanding of intelligence. The key principles at play in the current AI wave are in fact still the same ones postulated by Norbert Wiener’s cybernetics, namely information processing and feedback. Whether these are the right principles on which to build a theory of intelligence appears to be a valid open question.

As nicely summarized in Jessica Riskin’s essay, our understanding of intelligent behavior has over the centuries shifted to concentrate on different manifestations of intelligence, such as motion or programmability. For example, following Aristotle’s definition of living beings as things that can move at will, hydraulic automata that can make water travel upward, against gravity, would qualify as intelligent. In light of the apparent limitations of our current understanding of intelligence, artificial or otherwise, will some new principle of intelligence emerge that will make our current established framework appear as quaint as Aristotle’s?

Mind-Body Computing

As any quantitative researcher knows all too well, visualizing data facilitates interpretation and extrapolation, and can be a powerful tool for solving problems and motivating decisions and actions. Visualization allows one to leverage our intuitive sense of space in order to grasp connections and relationships, as well as to notice parallels and analogies. (It can, of course, also be used to confuse.)

Modern machine learning algorithms operate on very high-dimensional data structures that cannot be directly visualized by humans. In this sense, machines can “see”, and “think”, in spaces that are inaccessible to our eyes. To put it in Richard Hamming’s words:

“Just as there are odors that dogs can smell and we cannot, as well as sounds that dogs can hear and we cannot, so too there are wavelengths of light we cannot see and flavors we cannot taste. Why then, given our brains wired the way they are, does the remark ‘Perhaps there are thoughts we cannot think’, surprise you?”

In an era of smart homes, smart cities, and smart governments, methods that visualize high-dimensional data in two dimensions can allow us to bridge, albeit partially, the understanding gap between humans and algorithms. This explains the success of techniques such as t-SNE, especially when coupled with interactive graphical interfaces.
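As a minimal sketch, assuming scikit-learn and matplotlib are installed, t-SNE can project the 64-dimensional handwritten-digit dataset down to two dimensions for inspection:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

digits = load_digits()  # 1,797 images of handwritten digits, 64 features each
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(digits.data)

# Each 64-dimensional image becomes a point in the plane; clusters of the
# same color reveal structure that is invisible in the raw data.
plt.scatter(embedding[:, 0], embedding[:, 1], c=digits.target, cmap="tab10", s=5)
plt.title("t-SNE projection of 64-dimensional digit images")
plt.show()
```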

As virtual and mixed reality technologies vie with standard two-dimensional interfaces to become the dominant medium between us and the machines, data visualization and interactive representation stand to gain another dimension. And the difference may not be merely a quantitative one. As suggested by Jaron Lanier:

“People think differently when they express themselves physically. […] Having a piano in front of me makes me smarter by applying the biggest part of my cortex, the part associated with haptics. […] Don’t just think of VR as the place where you can look at a molecule in 3-D, or perhaps handle one, like all those psychiatrists in Freud avatars. No! VR is the place where you become a molecule. Where you learn to think like a molecule. Your brain is waiting for the chance.”

As such, VR may allow humans “to explore motor cortex intelligence”. Can this result in a new wave of innovations and discoveries?

The Rise of Hybrid Digital-Analog

As a keen observer of nature, Leonardo da Vinci was more comfortable with geometry than with arithmetic. Shapes, being continuous quantities, were easier to fit into, and disappear within, the observable world than discrete, discontinuous numbers. For centuries after Leonardo, physics shared his preference for analog thinking, building on calculus to describe macroscopic phenomena. The analog paradigm was upended at the beginning of the last century, when the quantum revolution revealed that the microscopic world behaves digitally, with observable quantities taking only discrete values. Quantum physics is, however, at heart a hybrid analog-digital theory, as it requires the presence of analog hidden variables to model the digital observations.

Computing technology appears to be following a similar path. The state-of-the-art computer that Claude Shannon found in Vannevar Bush’s lab at MIT in the thirties was analog: turning its wheels would set the parameters of a differential equation to be solved by the computer via integration. Shannon’s thesis and the invention of the transistor ushered in the era of digital computing and the information age, relegating analog computing to little more than a historical curiosity.

But analog computing retains important advantages over digital machines. Analog computers can be faster in carrying out specialized tasks. As an example, deep neural networks, which have led to the well-publicized breakthroughs in pattern recognition, reinforcement learning, and data generation tasks, are inherently analog (although they are currently mostly implemented on digital platforms). Furthermore, while the reliance of digital computing on either-or choices can provide a higher accuracy, it can also yield catastrophic failures. In contrast, the lower accuracy of analog systems is accompanied by a gradual performance loss in case of errors. Finally, analog computers can leverage time, not just as a neutral substrate for computation as in digital machines, but as an additional information-carrying dimension. The resulting space-time computing has the potential to reduce the energetic and spatial footprint of information processing.

The outlined complementarity of analog and digital computing has led experts to predict that hybrid digital-analog computers will be the way of the future. Even in the eighties, Terrence J. Sejnowski is reported to have said: “I suspect the computers used in the future will be hybrid designs, incorporating analog and digital.” This conjecture is supported by our current understanding of the operation of biological neurons, which communicate using the digital language of spikes, but maintain internal analog states in the form of membrane potentials.

With the emergence of academic and commercial neuromorphic processors, the rise of hybrid digital-analog computing may be just around the corner. As is often the case, the trend has been anticipated by fiction. In Autonomous, robots have a digital main logic unit with a human brain as a coprocessor to interpret people’s reactions and emotions. Analog elements can support common sense and humanity, in contrast to digital AI that “can make a perfect chess move while the room is on fire.” Similarly, in H(a)ppy and Gnomon, analog is an element of disruption and reason in an ideally ordered and purified world under constant digital surveillance.

(Update: Here is a recent relevant article.)

When Message and Meaning are One and the Same

The indigenous creatures of Embassytown — an outpost of the human diaspora somewhere/somewhen in the space-time imagined by China Miéville — communicate via the Language. Despite requiring two coordinated sound sources to be spoken, the Language does not have the capacity to express any duplicitous thought: Every message, in order to be perceived as part of the Language, must correspond to a physical reality. A spoken message is hence merely a link to an existing object, and it ceases being recognized as a message when the linked object is no longer in existence.

As Miéville describes it: “… each word of Language, sound isomorphic with some Real: not a thought, not really, only self-expressed worldness […] Language had always been redundant: it had only ever been the world.”

The Language upends Shannon’s premise that the semantic aspects of communication are irrelevant to the problem of transferring and storing information. In the Language, recorded sounds, untied to the state of the mind that produced them, do not carry any information. In a reversal of Shannon’s framework, information is thus inextricably linked to its meaning, and preserving information requires the maintenance of the physical object that embodies its semantics.

When message and meaning are one and the same as in the Language, information cannot be represented in any format other than in its original expression; Shannon’s information theory ceases to be applicable; and information becomes analog, irreproducible, and intrinsically physical. (And, as the events in the novel show, interactions with the human language may lead to some dramatic unforeseen consequences.)

A Few Things I Didn’t Know About Claude Shannon

[Photo: Claude Shannon, US mathematician, 1962]

  • While he was a student at MIT, Claude Shannon, the future father of Information Theory, trained as an aircraft pilot in his spare time (over the protestations of his instructor, who was worried about damaging such a promising brain).
  • What do Coco Chanel, Truman Capote, Albert Camus, Gandhi, Malcolm X and Claude Shannon have in common? They were all photographed by Henri Cartier-Bresson (see photo).
  • Having pioneered artificial intelligence research with his maze-solving mouse and his chess-playing machine, in 1984 Shannon proposed the following targets for 2001: 1) Beat the chess world champion (check); 2) Generate a poem accepted for publication by the New Yorker (work in progress); 3) Prove the Riemann hypothesis (work in progress); 4) Pick stocks outperforming the prime rate by 50% (check, although perhaps with some delay).
  • Shannon corresponded with L. Ron Hubbard of Scientology fame, writing about him that he “has been doing very interesting work lately in using a modified hypnotic technique for therapeutic purposes”, although he later conceded that he did not know “whether or not his treatment contains anything of value”.
  • He is quoted as saying that great insights spring from a “constructive dissatisfaction”, that is, “a slight irritation when things don’t quite look right”.

(From “A Mind at Play”, an excellent book about Claude Shannon by Jimmy Soni and Rob Goodman.)

The Network & the Network


In “The City & the City“, China Miéville imagines an usual coexistence arrangement between two cities located in the same geographical area that provides a surprisingly apt metaphor for the concept of network slicing in 5G networks — from the city & the city to the network & the network.

The two cities, Besźel and Ul Qoma, occupy the same physical location, with buildings, squares, streets and parks either allocated completely to one city or “crosshatched”, that is, shared. The separation and isolation between the two cities is not ensured by physical borders, but is rather enforced by cultural customs and legal norms. The inhabitants of each city are taught from childhood to “unsee” anything that lies in the other city, consciously ignoring people, cars and buildings, even though they share the same sidewalks, roads and city blocks. Recognition of “alter” areas and citizens is made possible by the different architectures, languages and clothing styles adopted in the two cities. Breaching the logical divide between Besźel and Ul Qoma by entering areas or interacting with denizens of the other city is a serious crime dealt with by a special police force. (Prospective tourists in Besźel or Ul Qoma are required to attend a long preliminary course to learn how to “unsee”.)

And now for the two networks: Experts predict an upcoming upheaval in telecommunication networks to parallel the recent revolution in computing brought on by cloudification. Just as computing and storage have become readily available on demand to individuals, companies and governments on shared cloud platforms, network slicing technologies are expected to enable the on-demand instantiation of wireless services on a common network substrate. Networking and wireless access for, say, a start-up offering IoT or vehicular communication applications, could be quickly set up on the hardware and spectrum managed by an infrastructure provider. Each service would run its own network on the same physical infrastructure but on logically separated slices — the packets and signals of one slice “unseeing” those of the other. In keeping with the metaphor, ensuring the isolation and security of the coexisting slices is among the key challenges facing this potentially revolutionary technology.
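As a purely illustrative sketch (the classes and methods below are invented for the analogy and do not correspond to any real slicing API), one can picture an infrastructure provider instantiating logically isolated slices on a shared resource pool:

```python
# Toy model: one physical substrate, several logically separated slices.
class InfrastructureProvider:
    def __init__(self, total_bandwidth_mhz):
        self.free_mhz = total_bandwidth_mhz  # shared physical resource pool
        self.slices = {}

    def instantiate_slice(self, tenant, bandwidth_mhz):
        # On-demand instantiation: a slice only gets resources that are not
        # already committed to another tenant.
        if bandwidth_mhz > self.free_mhz:
            raise RuntimeError("insufficient shared spectrum for a new slice")
        self.free_mhz -= bandwidth_mhz
        self.slices[tenant] = {"bandwidth_mhz": bandwidth_mhz, "packets": []}

    def send(self, tenant, packet):
        # Isolation: packets live entirely within the tenant's own slice,
        # "unseeing" the traffic of every other slice.
        self.slices[tenant]["packets"].append(packet)

provider = InfrastructureProvider(total_bandwidth_mhz=100)
provider.instantiate_slice("iot-startup", 20)        # IoT service slice
provider.instantiate_slice("vehicular-service", 40)  # vehicular comms slice
provider.send("iot-startup", "sensor reading")
print(provider.slices["iot-startup"]["packets"])     # the other slice never sees this
```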

 

The Rebirth of Expertise?

These days, conversations on almost any topic — be it finance, health care, art, the economy, music, or even religion — do not seem complete without a lively, and more or less informed, exchange on AI and machine learning. The crux of the discussion typically rests on the role of humans in the increasingly large number of enterprises that depend on machines for decision making and manufacturing. In this context, a distinction that may prove useful in thinking about a future society of humans and “intelligent” machines is the one proposed back in the 60s in the field of psychology between fluid and crystallized intelligence. As recently pointed out by Sarah Harper, taken to its logical end point, this idea may yield some possibly counter-intuitive conclusions regarding the parts to be played by AI and by different generations in the workplace.

Fluid intelligence relates to the ability to solve new problems by applying well-defined logical rules, such as by means of inductive or deductive reasoning. Fluid intelligence does not depend on any external prior knowledge about the world and the problem domain. In contrast, crystallized intelligence is the capacity to build on one’s experience and knowledge to acquire new skills and to solve problems.

In humans, fluid intelligence tends to decrease with age, while crystallized intelligence follows an inverse trend, peaking much later in life. Machines appear to have surpassed humans in terms of fluid intelligence, given their unprecedented capability to recognize patterns in large volumes of data and to optimize actions over long time horizons. But building general-purpose skills based on expertise in a computer, that is, generating artificial crystallized intelligence, is broadly considered to be unattainable with current AI techniques (listen to Obama’s eloquent explanation of this point!). Current state-of-the-art machine learning methods in fact cannot even explain why they output given decisions.

So there you have it — in a system that can leverage the fluid intelligence of sophisticated AI tools, the crystallized intelligence borne out of the experience of older women or men may become more valuable than the speed and flexibility of fresh graduates. Considering the predictions of an increased lifespan, this sounds like good news — can it be that expertise is not dead after all?