+++ Powered by AI: Turning any 2D photo into 3D using convolutional neural nets +++ When AI Can’t Replace a Worker, It Watches Them Instead +++ Co-creator of YOLO has turned his back on computer vision over ethical issues +++ Fastai: A Layered API for Deep Learning +++
Breakthrough — Or So They Say
Powered by AI: Turning any 2D photo into 3D using convolutional neural nets. Facebook has developed a system that allows to infer the 3D structure of any image. Building this enhanced 3D Photos technique required overcoming a variety of technical challenges, such as training a model that correctly infers 3D positions of an extremely wide variety of subject matter and optimizing the system so that it works on-device on typical mobile processors in a fraction of a second. To overcome these challenges, they trained a convolutional neural network (CNN) on millions of pairs of public 3D images and their accompanying depth maps, and leveraged a variety of mobile-optimization techniques previously developed by Facebook AI. Related research is featured in another post on their blog.
Tools and Frameworks
Business News and Applications
When AI Can’t Replace a Worker, It Watches Them Instead. AI may not steal your job, but it can tell the boss when you’re slacking.
Drishti, a startup based in Palo Alto and Bengaluru, tracks the productivity of industrial workers by recognizing their actions on the assembly line. According to Wired, automotive parts giant Denso is using the technology to eliminate bottlenecks in its factory in Battle Creek, Michigan. Drishti trains the system to recognize standardized actions in the client’s industrial processes. The training data includes video of many different people from a variety of angles, so the software can classify actions regardless of who is performing them. A camera over each station captured workers’ movements as they assembled parts for auto heat-management systems. The system tracks how long it takes them to complete their tasks and alerts managers of significant deviations from the norm. Workers see a live display of their performance metrics. It shows them a smiley face if they’re ahead of schedule, a frown if they fall behind. Drishti tirelessly logs the “cycle time” for every worker and station all day, for every shift. Plant managers use the data to track output and find and eliminate even subtle bottlenecks in production. The system has helped factories achieve double-digit improvements in several productivity indicators, a Denso executive told Forbes.
Co-creator of YOLO has turned his back on computer vision over ethical issues. The co-creator of the popular object-recognition network You Only Look Once (YOLO) said on Twitter that he no longer works on computer vision because the technology has “almost no upside and enormous downside risk.” In a 2018 TEDx talk he had already publicly stated that he had been “horrified” to learn that the U.S. Army used his algorithms to help battlefield drones track targets.
Publications
Fastai: A Layered API for Deep Learning. The open access journal Information is publishing a special issue “Machine Learning with Python”. One of the articles features fastai. Here’s the abstract: “fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying patterns of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai includes: a new type dispatch system for Python along with a semantic type hierarchy for tensors; a GPU-optimized computer vision library which can be extended in pure Python; an optimizer which refactors out the common functionality of modern optimizers into two basic pieces, allowing optimization algorithms to be implemented in 4–5 lines of code; a novel 2-way callback system that can access any part of the data, model, or optimizer and change it at any point during training; a new data block API; and much more. We used this library to successfully create a complete deep learning course, which we were able to write more quickly than using previous approaches, and the code was more clear. The library is already in wide use in research, industry, and teaching.”