AI weekly (47/2019)

My selection of news on AI/ML and Data Science

AI weekly (47/2019)

My selection of news on AI/ML and Data Science

+++ AI will speed up the hunt for Nazca Lines in southern Peru +++ Increasing transparency with Google Cloud Explainable AI +++ RecSim: Configurable Simulation Platform for Recommender Systems +++ Safety Gym: Reinforcement Learning Agents That Respect Safety Constraints +++ Sony Accounces the Establishment of Sony AI +++ Complete Data Science Project Template with Mlflow for Non-Dummies +++ Summer school on Deep Bayesian methods +++ How to apply machine learning and deep learning methods to audio analysis +++

Breakthrough — Or So They Say

AI will speed up the hunt for Nazca Lines in southern Peru. Created between 200 BCE and 600 CE, Nazca Lines were made by removing stones to reveal the white sand beneath and depict various geometric shapes, people, and animals, known as geoglyphs. In a collaboration with IBM Research, researchers are now deploying deep-learning algorithms to help speed up the hunt for them. They are using a cloud platform to stitch together massive amounts of geospatial data, including lidar, drone, and satellite imagery and geographical surveys, to create high-fidelity maps of the search areas. They are then training a neural network to recognize the data patterns of known lines to look for new ones. Even during testing, the team discovered a new design—a smaller humanoid figure—that had been missed in previously collected data. The discovery took only two months; previous methods had typically taken years. Press coverage can be found here and here.

Tools and Frameworks

Increasing transparency with Google Cloud Explainable AI. We are excited to announce our latest step in improving the interpretability of AI with Google Cloud AI Explanations. Explanations quantifies each data factor’s contribution to the output of a machine learning model. These summaries help enterprises understand why the model made the decisions it did. You can use this information to further improve your models or share useful insights with the model’s consumers.

RecSim: A Configurable Simulation Platform for Recommender Systems. Reinforcement learning (RL) is the de facto standard ML approach for addressing sequential decision problems, and as such is a natural paradigm in collaborative interactive recommenders (CIRs). However, it remains under-investigated in research and under-utilized in practice. To address this, we have developed RecSim, a configurable platform for authoring simulation environments to facilitate the study of RL algorithms in recommender systems (and CIRs in particular). RecSim allows both researchers and practitioners to test the limits of existing RL methods in synthetic recommender settings. Further details of the RecSim framework can be found in the white paper, while code and colabs/tutorials are available here.

Safety Gym: Reinforcement Learning Agents That Respect Safety Constraints. OpenAI is releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training. If deep reinforcement learning is applied to the real world, whether in robotics or internet-based tasks, it will be important to have algorithms that are safe even while learning—like a self-driving car that can learn to avoid accidents without actually having to experience them.

Business News and Applications

Sony Announces the Establishment of Sony AI. This new R&D organization, with offices in Tokyo, Austin, and a European city to be named, will advance fundamental research and development of AI. Focus will be on unleashing human imagination and creativity with AI. At first, there will be three projects in the works-one for gaming, one for gastronomy and another for imaging and sensing. The new AI tech developed through these projects “will be critical to further enhancing the value of Sony’s gaming and sensor businesses in coming years.” More info here and here.

AI will disrupt white-collar workers the most, predicts a new report. A surprise finding: Conventional wisdom says robotics, artificial intelligence, and automation will radically alter work for blue-collar truckers and factory workers. In fact, white-collar jobs will be affected more, according to a new analysis by the Brookings Institution. Other studies on this topic, with different findings, can be found here and here.

Much of what’s being sold as “AI” today is snake oil. It does not and cannot work. In a talk at MIT, Arvind Narayanan (from Princeton) describes why this happening, how we can recognize flawed AI claims, and push back.

Publications

A network of science: 150 years of Nature papers. Beautiful presentation of the network of paper citations at Nature. This short video is nice introduction to the project.

Facebook’s Head of AI Says the Field Will Soon ‘Hit the Wall’. Jerome Pesenti is VP of AI at Facebook and oversees hundreds of scientists and engineers whose work shapes the company’s direction and its impact on the wider world. WIRED has interviewed him. Here some interesting statements: “As a lab, our objective is to match human intelligence. We’re still very, very far from that, but we think it’s a great objective. But I think many people in the lab, including Yann, believe that the concept of “AGI” is not really interesting and doesn’t really mean much. […] Deep learning and current AI, if you are really honest, has a lot of limitations. We are very very far from human intelligence, and there are some criticisms that are valid: It can propagate human biases, it’s not easy to explain, it doesn’t have common sense, it’s more on the level of pattern matching than robust semantic understanding. But we’re making progress in addressing some of these, and the field is still progressing pretty fast. You can apply deep learning to mathematics, to understanding proteins, there are so many things you can do with it. […] When you scale deep learning, it tends to behave better and to be able to solve a broader task in a better way. So, there’s an advantage to scaling. But clearly the rate of progress is not sustainable. If you look at top experiments, each year the cost it going up 10-fold. Right now, an experiment might be in seven figures, but it’s not going to go to nine or ten figures, it’s not possible, nobody can afford that. It means that at some point we’re going to hit the wall. In many ways we already have. Not every area has reached the limit of scaling, but in most places, we’re getting to a point where we really need to think in terms of optimization, in terms of cost benefit, and we really need to look at how we get most out of the compute we have.”

Tutorials

Complete Data Science Project Template with Mlflow for Non-Dummies. A nice, long blog post discussing best practices for setting up a data science project, model development and experimentation, centered around mlflow. The source code for the data science project template can be found on GitLab.

Summer school on Deep Bayesian methods. Videos, slides, and assignments for a course on deep bayesian networks, in Moscow (Aug 2019). At Deep|Bayes summer school, we will discuss how Bayesian Methods can be combined with Deep Learning and lead to better results in machine learning applications. Recent research has proven that the use of Bayesian approach can be beneficial in various ways. School participants will learn methods and techniques that are crucial for understanding current research in machine learning. They will also have hands-on experience with using probabilistic modeling to build neural generative and discriminative models, learn modern stochastic optimization methods and regularization techniques for neural networks, and master the ways to reason about the uncertainty about the weight of the neural networks and their predictions.

How to apply machine learning and deep learning methods to audio analysis. This post is focused on showing how data scientists and AI practitioners can use comet.ml to apply machine learning and deep learning methods in the domain of audio analysis. Quote: “To understand how models can extract information from digital audio signals, we’ll dive into some of the core feature engineering methods for audio analysis. We will then use Librosa, a great python library for audio analysis, to code up a short python example training a neural architecture on the UrbanSound8k dataset.”

See also