Having taught courses on machine learning, I am often asked by colleagues and students with a background in engineering to suggest “the best place to start” to get into this subject. I typically respond with a list of books: for a general, but slightly outdated introduction, read this book; for a detailed survey of methods based on probabilistic models, check this other reference; to learn about statistical learning, I found this text useful; and so on. This answer strikes me, and most likely also my interlocutors, as quite unsatisfactory. This is especially so since the size of many of these books may be discouraging for busy professionals and students working on other projects. These notes are my first attempt to offer a basic and compact reference that describes key ideas and principles in simple terms and within a unified treatment, encompassing also more recent developments and pointers to the literature for further study. This is a work in progress and feedback is very welcome!
Feed the data on the left (adapted from this book by Pearl and co-authors) to a learning machine. With confidence, the trained algorithm will predict lower cholesterol levels for individuals who exercise less. While counter-intuitive, the prediction is sound and supported by the available data. Nonetheless, no one could in good faith use the output of this algorithm as a prescription to reduce the number of hours at the gym.
This is clearly a case of correlation being distinct from causation. But how do we know? And how can we ensure that an AI Doctor would not interpret the data incorrectly and produce a harmful diagnosis?
We know because we have prior information on the problem domain. Thanks to our past experience, we can explain away this spurious correlation by including another measurable variable in the model, namely age. To see this, consider the same data, now redrawn by highlighting the age of the individual corresponding to each data point. The resulting figure, shown on the right, reveals that older people (within the observed bracket) tend to have higher cholesterol as well as to exercise more: age is a common cause of both exercise and cholesterol levels. In order to capture the causal relationship between the latter variables, we hence need to adjust for age. Doing this requires considering the trend within each age group separately, recovering the expected conclusion that exercising is useful to lower one’s cholesterol.
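To make the effect of adjusting for age concrete, here is a minimal sketch on synthetic data. The numbers are made up for illustration and are not the data in the figures: I assume that cholesterol rises with age and falls with exercise, and that older individuals exercise more. The pooled trend and the within-age trends then point in opposite directions.

```python
from statistics import mean


def slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Hypothetical data-generating rule (an assumption, not the original data set):
# older people exercise more, and cholesterol = 100 + 3 * age - 5 * hours.
records = []  # (age, weekly exercise hours, cholesterol)
for age in (20, 30, 40, 50):
    base_hours = (age - 20) / 5
    for extra in (0, 1, 2):
        hours = base_hours + extra
        records.append((age, hours, 100 + 3 * age - 5 * hours))

# Trend seen by a learner that ignores age: a single line through all points.
pooled = slope([h for _, h, _ in records], [c for _, _, c in records])

# Trend after adjusting for age: one line per age group.
by_age = {}
for age in sorted({a for a, _, _ in records}):
    group = [(h, c) for a, h, c in records if a == age]
    by_age[age] = slope([h for h, _ in group], [c for _, c in group])

print(f"pooled slope: {pooled:+.2f}")
for age, s in by_age.items():
    print(f"age {age}: slope {s:+.2f}")
```

The pooled slope is positive (more exercise, more cholesterol), while the slope within every age group is negative. Note that the code only knows to group by age because we told it to, which is exactly the prior knowledge discussed above.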
And yet an AI Doctor that is given only the data set in the first figure would have no way of deciding that the observed upward trend hides a spurious correlation through another variable. More generally, just like the AI Doctor blinded by a wrong model, AI algorithms used for hiring, rating teachers’ performance or credit assessment can confuse correlation with causation and produce biased, or even discriminatory, decisions.
As seen, solving this problem would require making modeling choices, identifying relevant variables and their relationships — a task that appears to require a human in the loop. Add this to the, still rather short, list of new jobs created by the introduction of AI and machine learning technologies in the workplace.
An IEEE report on recent work by my research group:
Mobile batteries are quickly drained when running augmented reality (AR) applications like Pokémon Go. A proposed system leverages mobile edge computing to conserve the battery life of phones running AR applications. https://goo.gl/FKQtd6
- While he was a student at MIT, Claude Shannon, the future father of Information Theory, trained as an aircraft pilot in his spare time (to the protestations of the instructor, who was worried about damaging such a promising brain).
- What do Coco Chanel, Truman Capote, Albert Camus, Gandhi, Malcolm X and Claude Shannon have in common? They were all photographed by Henri Cartier-Bresson (see photo).
- Having pioneered artificial intelligence research with his maze-solving mouse and his chess-playing machine, in 1984 Shannon proposed the following targets for 2001: 1) Beat the chess world champion (check); 2) Generate a poem accepted for publication by the New Yorker (work in progress); 3) Prove the Riemann hypothesis (work in progress); 4) Pick stocks outperforming the prime rate by 50% (check, although perhaps with some delay).
- Shannon corresponded with L. Ron Hubbard of Scientology fame, writing about him that he “has been doing very interesting work lately in using a modified hypnotic technique for therapeutic purposes”, although he later conceded that he did not know “whether or not his treatment contains anything of value”.
- He is quoted as saying that great insights spring from a “constructive dissatisfaction”, that is, “a slight irritation when things don’t quite look right”.
(From “A Mind at Play”, an excellent book about Claude Shannon by Jimmy Soni and Rob Goodman.)
In a formal field such as Information Theory (IT), the boundary between possible and impossible is well delineated: given a problem, the optimality of a solution can be in principle checked and determined unambiguously. As a pertinent example, IT says that there are ways to compress an “information source”, say a class of images, up to some file size, and that no conceivable solution could do any better than the theoretical limit. This is often a cause of confusion among newcomers, who tend to more naturally focus on improving existing solutions — say on producing a better compression algorithm as in “Silicon Valley” — rather than asking if the effort could at all be fruitful due to intrinsic informational limits.
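As a small illustration of such a limit, the sketch below computes the Shannon entropy of a toy memoryless source and compares it with the rate achieved by a general-purpose compressor. The four-symbol alphabet, the probabilities and the use of `zlib` are my own assumptions for the example. For a source of this kind, the entropy is the smallest average number of bits per symbol that any lossless scheme can achieve.

```python
import math
import random
import zlib


def entropy_bits(probs):
    """Shannon entropy, in bits per symbol, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical memoryless source over four symbols (illustrative values).
symbols = b"ABCD"
probs = [0.7, 0.15, 0.1, 0.05]

rng = random.Random(0)  # fixed seed so that the run is reproducible
n = 100_000
data = bytes(rng.choices(symbols, weights=probs, k=n))

limit = entropy_bits(probs)           # theoretical limit, bits per symbol
compressed = zlib.compress(data, 9)   # a practical, general-purpose compressor
rate = 8 * len(compressed) / n        # achieved bits per symbol

print(f"entropy limit: {limit:.3f} bits/symbol")
print(f"zlib rate:     {rate:.3f} bits/symbol")
```

The compressor does much better than the 8 bits per symbol of the raw file, but it stays above the entropy. Knowing the size of that gap tells us how much room is left for a better algorithm, and when there is none.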
The strong formalism has been among the key reasons for the many successes of IT, but — some may argue — it has also hindered its applications to a broader set of problems. (Claude Shannon himself famously warned about an excessively liberal use of the theory.) It is not unusual for IT experts to look with suspicion at fields such as Machine Learning (ML) in which the boundaries between possible and impossible are constantly redrawn by advances in algorithm design and computing power.
In fact, a less formal field such as ML allows practice to precede theory, letting the former push the state-of-the-art boundary in the process. As a case in point, deep neural networks, which power countless algorithms and applications, are still hardly understood from a theoretical viewpoint. The same is true for the more recent algorithmic framework of Generative Adversarial Networks (GANs). GANs can generate realistic images of faces, animals and rooms from datasets of related examples, producing fake faces, animals and rooms that cannot be distinguished from their real counterparts. It is expected that soon enough GANs will even be able to generate videos of events that never happened — watch Françoise Hardy discuss the current US president in the 60’s. While the theory may be lagging behind, these methods are making significant practical contributions.
Interestingly, GANs can be interpreted in terms of information-theoretic quantities (namely the Jensen-Shannon divergence), showing that the gap between the two fields is perhaps not as unbridgeable as it has broadly been assumed to be, at least in recent years.
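For readers who have not met it, the Jensen-Shannon divergence between two distributions p and q is the average of the Kullback-Leibler divergences of p and q from their mixture m = (p + q) / 2. A minimal sketch for discrete distributions follows. The function names are mine, and this only illustrates the quantity itself, not a GAN training procedure.

```python
import math


def kl_bits(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence_bits(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture of p and q
    return 0.5 * kl_bits(p, m) + 0.5 * kl_bits(q, m)


real = [0.5, 0.3, 0.2]   # hypothetical distribution of "real" data
fake = [0.2, 0.3, 0.5]   # hypothetical distribution of generated data

print(js_divergence_bits(real, real))          # identical distributions
print(js_divergence_bits([1, 0], [0, 1]))      # distributions with disjoint support
print(js_divergence_bits(real, fake))          # something in between
```

Unlike the Kullback-Leibler divergence, this measure is symmetric and bounded: it is zero only when the two distributions coincide and reaches one bit when they do not overlap at all. Loosely speaking, a generator that drives this quantity to zero produces samples that are statistically indistinguishable from the real ones.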
In “The City & the City”, China Miéville imagines an unusual coexistence arrangement between two cities located in the same geographical area that provides a surprisingly apt metaphor for the concept of network slicing in 5G networks: from the city & the city to the network & the network.
The two cities: Besźel and Ul Qoma occupy the same physical location, with buildings, squares, streets and parks either allocated completely to one city or “crosshatched”, that is, shared. The separation and isolation between the two cities is not ensured by physical borders, but is rather enforced by cultural customs and legal norms. The inhabitants of each city are taught from childhood to “unsee” anything that lies in the other city, consciously ignoring people, cars and buildings, even though they share the same sidewalks, roads and city blocks. Recognition of “alter” areas and citizens is made possible by the different architectures, language and clothing styles adopted in the two cities. Breaching the logical divide between Besźel and Ul Qoma by entering areas or interacting with denizens of the other city is a serious crime dealt with by a special police force. (Prospective tourists in Besźel or Ul Qoma are required to attend a long preliminary course to learn how to “unsee”.)
And now for the two networks: Experts predict an upcoming upheaval in telecommunication networks to parallel the recent revolution in computing brought on by cloudification. Just as computing and storage have become readily available on demand to individuals, companies and governments on shared cloud platforms, network slicing technologies are expected to enable the on-demand instantiation of wireless services on a common network substrate. Networking and wireless access for, say, a start-up offering IoT or vehicular communication applications, could be quickly set up on the hardware and spectrum managed by an infrastructure provider. Each service would run its own network on the same physical infrastructure but on logically separated slices — the packets and signals of one slice “unseeing” those of the other. In keeping with the metaphor, ensuring the isolation and security of the coexisting slices is among the key challenges facing this potentially revolutionary technology.
These days, conversations on almost any topic — be it finance, health care, art, the economy, music, or even religion — do not seem complete without a lively, and more or less informed, exchange on AI and on machine learning. The crux of the discussion typically rests on the role of humans in the increasingly large number of enterprises that depend on machines for decision making and manufacturing. In this context, a distinction that may prove useful in thinking about a future society of humans and “intelligent” machines is that proposed back in the 60s in the field of psychology between fluid and crystallized intelligence. As recently pointed out by Sarah Harper, taken to its logical end point, this idea may yield some possibly counter-intuitive conclusions regarding the parts to be played by AI and by different generations in the workplace.
Fluid intelligence relates to the ability to solve new problems by applying well-defined logical rules, such as by means of inductive or deductive reasoning. Fluid intelligence does not depend on any external prior knowledge about the world and the problem domain. In contrast, crystallized intelligence is the capacity to build on one’s experience and knowledge to acquire new skills and to solve problems.
In humans, fluid intelligence tends to decrease with age, while crystallized intelligence follows an inverse trend, peaking much later in life. Machines appear to have surpassed humans in terms of fluid intelligence, given their unprecedented capability to recognize patterns in large volumes of data and to optimize actions over long time horizons. But building general-purpose skills based on expertise in a computer, that is, generating artificial crystallized intelligence, is broadly considered to be unattainable with current AI techniques (listen to Obama’s eloquent explanation of this point!). Current state-of-the-art machine learning methods in fact cannot even explain why they output given decisions.
So there you have it: in a system that can leverage the fluid intelligence of sophisticated AI tools, the crystallized intelligence born out of the experience of older women or men may become more valuable than the speed and flexibility of fresh graduates. Considering the predictions of an increased lifespan, this sounds like good news. Can it be that expertise is not dead after all?
A prime example of the complex relationship between digital technologies and the legal system is the fluidity and geographical variance of the laws that regulate broadband access. The discussion is typically framed (as far as I can tell from my outsider’s perspective) around two absolute principles, namely network neutrality and network vitality. The net neutrality and net vitality camps, at least in their purest expressions, often seem uninterested in hearing each other’s arguments. This tends to hide from public discussion the layered technological, economic, moral and legal aspects that underlie the delicate balance between access and economic incentives that is at the core of the issue. And things appear to be getting even more involved with the advent of 5G.
Net neutrality is, for purists, the principle that all bits are created equal. Accordingly, broadband access providers should not be allowed to “throttle” packets on the basis, for instance, of their application (e.g., BitTorrent) or their origin (as determined by the IP address). The network should be “dumb” and only convey bits between the two ends of a communication session. Regulation that upholds net neutrality rules is in place in many countries around the world, including in the EU and the US. Under the previous US administration, the FCC reclassified broadband Internet access as a “common carrier”, that is, as a public utility, in 2015, allowing the enforcement of net neutrality rules. Under the new administration, this decision now appears likely to be reversed.
The counterarguments to net neutrality typically center around some notion of net vitality, which refers broadly to the dynamism of the broadband Internet ecosystem, particularly as it pertains to investment and growth. The term was coined in a report by the Media Institute, where a quantitative index was proposed as a compound measure of the net vitality of a country in terms of applications and content (e.g., access, e-government, social network penetration, app development), devices (e.g., smart phone penetration and sales), networks (e.g., cybersecurity, investment, broadband prices), and macroeconomic factors (e.g., number and evaluation of start-ups).
Net neutrality purists — not all advocates fall in this category — believe that allowing broadband access providers to discriminate on the basis of a packet’s identity would pose a threat to freedom of expression and competition. Without net neutrality rules, telecom operators could in fact block competitors’ services, and also favor deep-pocketed internet companies, such as the Frightful Five (Alphabet, Amazon, Apple, Facebook and Microsoft), that can outspend start-ups for faster access. A case in point is the ban of Google Wallet by Verizon Wireless, AT&T, and T-Mobile to promote their competing Isis (!) mobile payment system.
The net vitality camp, headed by broadband access providers and economists, deems net neutrality rules to be an impediment to investment and growth. As claimed in a 2016 manifesto by European telecom operators, only by charging more for better service can sufficient revenue be raised by broadband access providers to fund new infrastructure and services.
Digging a little deeper, one finds that the issue is more complex than implied by the arguments of the two camps. To start, some discrimination among the bits carried by the network may in fact serve a useful purpose. For instance, by letting some packets be transported for free, telecom operators can offer zero-cost Internet access to the poorest communities in the developing world as in the Facebook Zero and Google Free Zone projects. And packet prioritization is in fact already implemented in LTE networks as a necessary means to ensure call quality for Voice over LTE (VoLTE is not considered to be a broadband Internet service and hence not subject to net neutrality regulations).
That net neutrality is a more subtle requirement than the “every bit is created equal” mantra is in fact well recognized by many net neutrality advocates. When making the case for net neutrality rules, the then-president Obama called for “no blocking, no throttling, no special treatment at interconnections, and no paid prioritization to speed content transmission”, hence stopping short of prescribing full bit equivalence. Tim Berners-Lee and Eric Schmidt have also voiced similar opinions.
The planned transition to 5G systems is bound to add a further layer of complexity to the relationship between net neutrality and net vitality. 5G networks are indeed expected not only to provide broadband access, but also to serve vertical industries through the deployment of ultra-reliable and low-latency communication services. In this context, it seems apparent that bits carrying information about, say, a remote surgery or the control of a vehicle, should not be treated in the same way as bits encoding an email.
As the example of VoLTE shows, a general solution may lie in isolating mobile broadband services, on which strong net neutrality guarantees can be enforced, from other types of traffic, such as ultra-reliable and machine-type communications, on which traffic differentiation may be allowed. The feasibility of this approach is reinforced by the fact that isolation is a central feature of network slicing, a technology that will allow operators of 5G to create virtual networks that are fine-tuned for specific applications.
The feedback loop between politics and technology appears to be one of the most potent motors of human history, its pace quickening with each innovation cycle.
In antiquity, the Greek city states and the Persian and Roman Empires created the necessary conditions and incentives for mathematicians and engineers to pursue specific technologies, such as aqueducts and catapults. Later, the feedback loop began closing, with technological advances directly informing the political system. In Marx’s words:
“The hand-mill gives you society with the feudal lord; the steam-mill, society with the industrial capitalist.”
Fast-forwarding to today, much has been written about the way in which new communication tools have enabled the resurgence of popular and populist movements in the Middle East, Europe and the USA. It is also often argued that it is politics that is bound to play catch-up in this process, with technological firms leading the way towards decentralization and acceleration.
But, in light of recent political developments around immigration, the feedback loop appears to be promptly working its way in the reverse direction. Politics, in fact, seem bound to dislodge the barycenter of technological innovation from its current established poles. Technology has thrived in environments characterized by diversity and mobility. Financial tech firms have found fertile ground in London and Internet companies have taken over Silicon Valley. But the threat, even if not yet directly enforced, of new walls and restrictions in the UK and USA is pushing gifted PhD students and Post-Docs to choose alternative destinations, such as Canada, New Zealand, Mexico and Singapore.
Looking further ahead into the future, to update Marx’s quote, 3D printing may give us a post-scarcity and post-work society. As imagined by Cory Doctorow in “Walkaway”, if tools such as 3D printers fulfill their promises, objects and foodstuffs will be just a few clicks away, produced at close to zero cost by repurposing unused materials and by transforming bits into things. In an economy of abundance, there would be no need to keep jobs, maintain a currency, protect private property or support the current education system. This would strip the nation state of its main functions and usher in new ways of organizing societies and, inevitably, new technologies.
Having read yet another discouraging article on the state of our planet, a group of French filmmakers embarked on an optimistic globe-trotting quest for climate change solutions that took them to the UK, USA, Denmark, France, Switzerland, Finland and India. The result is a 2015 documentary entitled “Tomorrow” that is now playing in US theaters. The movie is meant as an antidote to the fatalism that can stem from familiarity with the scientific consensus on global warming.
Shot at a time in which prospects were not as dire as they appear today after the latest policy shifts in the United States, the documentary finds hope in innovative sustainable approaches to agriculture, energy production, finance, democracy and education. A common underlying element of all the surveyed solutions is their reliance on social, local and decentralized mechanisms. Including the inevitable interview with Vandana Shiva and an expected visit to a Finnish elementary school, the film uncovers a heartening set of initiatives, such as permaculture and the adoption of local currencies alongside conventional government-backed money.
Conspicuously missing from any of the solutions discussed in the movie are information and communication technologies (excluding the fleeting appearance of a smartphone used to pay in a local currency). This may not come as a surprise given the measurable decrease in “closeness, connection, and conversation quality” among people in local communities that has resulted from the widespread adoption of mobile phones. A city of the future ruled by smart devices seems indeed destined to be a lonely place, incompatible with the development of meaningful social programs. If we also factor in the energy footprint of producing digital devices and running telecom networks, the case for considering communication technology as a contributor to climate change appears to be well motivated.
Nonetheless, communication technology has a potentially important role to play in combating global warming. Smart phones are already used for emergency preparedness and coordination to respond to the effects of a changing climate. And, as discussed in a recent report of the Brookings Institution, the upcoming fifth generation of wireless networks may prove to be a significant asset in key areas such as water management, air quality control, energy production, transportation, and building design.
Take water management. It has been reported that two thirds of the world’s population may face water shortages by 2025. Thanks to Internet-of-Things (IoT)-enabled sensor and actuator networks, the efficiency in the use of this scarce natural resource may be improved via monitoring (e.g., of the concentration of dangerous chemicals), leakage detection, measuring home usage, adaptive irrigation tailored to measured moisture levels, and smart chillers for industries.
Leveraging the benefits of connectedness for a more efficient use of natural resources while at the same time building more resilient and sustainable communities may prove a delicate balancing act, but one that could prove critical for the future of our planet.