+++ Microsoft, Google and Other Tech Companies Work on AI Ethics +++ Introducing Dreamer: Scalable Reinforcement Learning Using World Models +++
Breakthrough — Or So They Say
Tools and Frameworks
Business News and Applications
Microsoft, Google and Other Tech Companies Work on AI Ethics. In spite of rising concern for the societal implications of artificial intelligence, it remains challenging for practitioners to identify the harmful repercussions of their own systems prior to deployment, and, once deployed, emergent issues can become difficult or impossible to trace back to their source. In the last weeks, both Google and Microsoft have published checklists that will help practitioners address this challenge. Google’s “Closing the AI Accountability Gap” proposes the SMACTR framework (Scoping, Mapping, Artifact Collection, Testing, and Reflection), which aims to encourage companies to perform audits before an AI model is deployed for customer use. In a similar vein, Microsoft’s “Co-Designing Checklists to Understand Organizational Challenges and Opportunities around Fairness in AI” develops a detailed checklist with sections on Envision, Define, Prototype, Build, Launch and Evolve.
Publications
Introducing Dreamer: Scalable Reinforcement Learning Using World Models. Model-free approaches to RL, which learn to predict successful actions through trial and error, have enabled DeepMind’s DQN to play Atari games and AlphaStar to beat world champions at Starcraft II, but require large amounts of environment interaction, limiting their usefulness for real-world scenarios. In contrast, model-based RL approaches additionally learn a simplified model of the environment. This world model lets the agent predict the outcomes of potential action sequences, allowing it to play through hypothetical scenarios to make informed decisions in new situations, thus reducing the trial and error necessary to achieve goals. Now, in the preprint arXiv:1912.01603 and a related blog post, Google AI and DeepMind present Dreamer, a reinforcement learning agent that solves long-horizon tasks from images purely by latent imagination. It can efficiently learn behaviors by propagating analytic gradients of learned state values back through trajectories imagined in the compact state space of a learned world model. On 20 challenging visual control tasks, Dreamer exceeds existing approaches in data-efficiency, computation time, and final performance. Dreamer consists of three processes that are typical for model-based methods: learning the world model, learning behaviors from predictions made by the world model, and executing its learned behaviors in the environment to collect new experience. To learn behaviors, Dreamer uses a value network to take into account rewards beyond the planning horizon and an actor network to efficiently compute actions.