AI weekly (05/2020)

My selection of news on AI/ML and Data Science

AI weekly (05/2020)

My selection of news on AI/ML and Data Science

+++ Google says its new chatbot Meena is the best in the world +++ Facebook is Open-sourcing Polygames, a new framework for training AI bots through self-play +++ An AI Epidemiologist Sent the First Warnings of the Wuhan Virus +++ Europe wants single data market to break U.S. tech giants’ dominance +++ Meena: Towards a Conversational Agent that Can Chat About Anything +++ Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles +++ A Survey of Deep Learning Techniques for Autonomous Driving +++

Breakthrough — Or So They Say

Google says its new chatbot Meena is the best in the world. Google’s publication on Meena (see below in Publications section) is featured in MIT Technology Review, but also more extensively and critically on VentureBeat (and earlier on here), highlighting the following passage in the orginial preprint: “Tackling safety and bias in the models is a key focus area for us, and given the challenges related to this, we are not currently releasing an external research demo.”

Tools and Frameworks

Facebook is Open-sourcing Polygames, a new framework for training AI bots through self-play. Polygames is a new open source AI research framework for training agents to master strategy games through self-play, rather than by studying extensive examples of successful gameplay. Because it is more flexible and has more features than previous frameworks, Polygames can help researchers with advancing and benchmarking a broad range of zero learning (ZL) techniques that don’t require training data sets. Polygames’ architecture makes it compatible with more kinds of games — including Breakthrough, Hex, Havannah, Minishogi, Connect6, Minesweeper, Mastermind, EinStein würfelt nicht!, Nogo, and Othello — than previous systems, such as AlphaZero and ELF OpenGo. In addition to building and evaluating ZL methods across a variety of games, Polygames allows researchers to study transfer learning, meaning the applicability of a model trained on one game to succeed at others. Details can be found on arXiv:2001.09832 and github.

Business News and Applications

An AI Epidemiologist Sent the First Warnings of the Wuhan Virus. Wired features the company BlueDot which tracks, contextualizes, and anticipates infectious disease risks. Their AI-driven algorithm scours foreign-language news reports, animal and plant disease networks, and official proclamations to give its clients advance warning to avoid danger zones like Wuhan. They also draw upon access to global airline ticketing data that can help predict where and when infected residents are headed next. The company which was founded in 2014 now has 40 employees—physicians and programmers who devise the disease surveillance analytic program, which uses natural-language processing and machine learning techniques to sift through news reports in 65 languages, along with airline data and reports of animal disease outbreaks. BlueDot had successfully predicted the spread of Zika virus from Brasil to Florida in 2015, and they have now correctly anticipated that the virus would jump from Wuhan to Bangkok, Seoul, Taipei, and Tokyo in the days following its initial appearance.

Europe wants single data market to break U.S. tech giants’ dominance. According to a European Commission proposal seen by Reuters, the European Union wants to create a single market in data aimed at challenging the dominance of tech giants such as Facebook, Google and Amazon. The paper states that “a small number of big tech firms hold a large part of the world’s data. This is a major weakness for data-driven businesses to emerge, grow and innovate today, including in Europe, but huge opportunities lie ahead.” Measures to achieve that goal shall include an array of new rules covering cross-border data use, data interoperability and standards related to manufacturing, climate change, the auto industry, healthcare, financial services, agriculture and energy.

Publications

Meena: Towards a Conversational Agent that Can Chat About Anything (arXiv:2001.09977). Modern conversational agents (chatbots) tend to be highly specialized — they perform well as long as users don’t stray too far from their expected usage. Now the Google Brain team presents “Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations. This 2.6B parameter neural network is trained to minimize perplexity, an automatic metric that we compare against human judgement of multi-turn conversation quality”. At its heart lies the Evolved Transformer seq2seq architecture, a Transformer architecture discovered by evolutionary neural architecture search to improve perplexity. Meena can conduct conversations that are more sensible and specific than existing state-of-the-art chatbots. It was trained on 341 GB of text, filtered from public domain social media conversations. Compared to an existing state-of-the-art generative model, OpenAI GPT-2, Meena has 1.7x greater model capacity and was trained on 8.5x more data.

Survey of Deep Reinforcement Learning for Motion Planning of Autonomous Vehicles (arXiv:2001.11231). One of the many research streams in the field of autonomous vehicles deals with the different layers of motion planning, such as strategic decisions, trajectory planning, and control. A wide range of techniques in machine learning itself have been developed, and this article describes one of these fields, deep reinforcement learning (DRL). The main elements of designing such a system are the modeling of the environment, the modeling abstractions, the description of the state and the perception models, the appropriate rewarding, and the realization of the underlying neural network. The paper describes vehicle models, simulation possibilities and computational requirements. Strategic decisions on different layers and the observation models, e.g., continuous and discrete state representations, grid-based, and camera-based solutions are presented. The paper surveys the state-of-art solutions systematized by the different tasks and levels of autonomous driving, such as car-following, lane-keeping, trajectory following, merging, or driving in dense traffic.

A Survey of Deep Learning Techniques for Autonomous Driving (arXiv:1910.07738). Another related survey, from October 2019, that takes a broader look at deep learning in autonomous driving, surveying modular perception-planning-action pipelines and end-to-end systems that directly map sensory information to steering commands. In addition, it outlines challenges in designing AI architectures for autonomous driving, such as their safety, training data sources and computational hardware.

Tutorials

Portfolio starter kit. How do you choose assets? What amount of money should you put into each asset? Should you keep the same relative weighting or should you change it? And if you do change it, how frequently? This nice post is the first one in longer series concentrating on finanical portfolio construction, with data science and R guiding the way. Follow-ons can be found here.

See also