AI weekly (48/2019)

My selection of news on AI/ML and Data Science

AI weekly (48/2019)

My selection of news on AI/ML and Data Science

+++ Introducing TF-GAN: A lightweight GAN library for TensorFlow 2.0 +++ Global AI Survey: AI proves its worth, but few scale impact +++ AR-Net: A simple Auto-Regressive Neural Network for time-series +++ China’s billion-dollar ed-tech companies are planning to export their vision overseas +++ Deep Learning with PyTorch +++ **Corpus Wide Argument Mining - a Working Solution +++ Cost-benefit analysis in R +++ How to apply machine/deep learning to audio analysis, with comet.ml +++ Collection of Jupyter notebooks for quantitative finance +++ Create a streetmap of your favorite city with ggplot2 +++ Tutorial on Graph Neural Networks for Computer Vision and Beyond (in 3 parts) +++

Breakthrough — Or So They Say

Tools and Frameworks

Introducing TF-GAN: A lightweight GAN library for TensorFlow 2.0. The TensorFlow team has released some handy code for training GANs with TF 2.0, along with a self-study GAN course and tutorials here and here.

Business News and Applications

Global AI Survey: AI proves its worth, but few scale impact. Most companies report measurable benefits from AI where it has been deployed; however, much work remains to scale impact, manage risks, and retrain the workforce. This new report from McKinsey & Company shows how a group of high performers is leading the way.

China’s billion-dollar ed-tech companies are planning to export their vision overseas. Experts agree AI will be important in 21st-century education. While academics have puzzled over best practices, China hasn’t waited but invested heavily in AI-enabled teaching and learning. Tech giants, startups, and education incumbents have all jumped in. Tens of millions of students now use some form of AI to learn—whether through extracurricular tutoring programs like Squirrel’s, through digital learning platforms like 17ZuoYe, or even by using facial recognition in their main classrooms. It’s the world’s biggest experiment on AI in education, and no one can predict the outcome. An article in Technology Review provides in-depth coverage of Squirrel. Squirrel is already exporting its technology abroad. It has cultivated its international reputation by appearing at some of the largest AI conferences around the world and bringing on reputable collaborators affiliated with MIT, Harvard, and other prestigious research institutes. Li has also recruited several Americans to serve on his executive team, with the intent of pushing into the US and Europe in the next two years. One of them is Tom Mitchell, the dean of computer science at Carnegie Mellon. Another company covered is Alo7, an ed-tech company with the mission of teaching English.

Publications

AR-Net: A simple Auto-Regressive Neural Network for time-series (arXiv:1911.12436). Researchers from facebook present a new framework that combines the best of both traditional statistical models and neural network models for time series modeling. Classical models such as autoregression (AR) exploit the inherent characteristics of a time series, leading to a more concise model. This is possible because the model makes strong assumptions about the data, such as the true order of the AR process. These models, however, do not scale well for a large volume of training data, particularly if there are long-range dependencies or complex interactions. The proposed framework models proven classic AR methods using a feedforward neural network approach. The AR-Net scales well to large orders, making it possible to estimate long-range dependencies. In addition, it automatically selects and estimates the important coefficients of a sparse AR process, eliminating the need to know the true order of the AR process.

Corpus Wide Argument Mining - a Working Solution (arXiv:1911.10763). Quote: “One of the main tasks in argument mining is the retrieval of argumentative content pertaining to a given topic. Most previous work addressed this task by retrieving a relatively small number of relevant documents as the initial source for such content. Here we present a first end-to-end high-precision, corpus-wide argument mining system. This is made possible by combining sentence-level queries over an appropriate indexing of a very large corpus of newspaper articles, with an iterative annotation scheme. This scheme addresses the inherent label bias in the data and pinpoints the regions of the sample space whose manual labeling is required to obtain high-precision among top-ranked candidates.” This is one of the core technologies behind IBM’s Project Debater.

Deep Learning with PyTorch. Manning and PyTorch are offering a free collection of excerpts from their upcoming book, Deep Learning with PyTorch. It provides a detailed, hands-on introduction to building and training neural networks with PyTorch, a popular open source machine learning framework.

Tutorials

Cost-benefit analysis in R. This tutorial by Peter Ellis shows how to perform cost-benefit analysis in R and how to build in uncertainty in a much clearer way than is generally done.

How to apply machine/deep learning to audio analysis, with comet.ml This post is focused on showing how data scientists and AI practitioners can use comet.ml to apply machine learning and deep learning methods in the domain of audio analysis. To understand how models can extract information from digital audio signals, we’ll dive into some of the core feature engineering methods for audio analysis.

Collection of Jupyter notebooks for quantitative finance. Nice collection on github of interactive notebooks about quantitative finance. It contains several topics that are not so popular nowadays, but that can be very powerful, e.g. PDE methods, Lévy processes, Fourier methods or Kalman filters. There’s also a worthwhile discussion about this project and related resources on Hacker News.

Create a streetmap of your favorite city with ggplot2. In this tutorial, and its sources here and here, you’ll learn how to create streetmaps for any city. We will use the packages osmdata and ggplot2. With osmdata we can extract the streets from the Openstreetmap database. Openstreetmap is a freely available database of all locations in the world under an open license. Therefore, we don’t need an API key or the like to get the data.

Tutorial on Graph Neural Networks for Computer Vision and Beyond (in 3 parts). I’m answering questions that AI/ML/CV people not familiar with graphs or graph neural networks typically ask. I provide PyTorch examples to clarify the idea behind this relatively new and exciting kind of model. Part 1, part 2, part 3.

See also