This one-day workshop 13 July 2018 brought together the Machine Learning and Formal Methods communities. Here is a summary of some take-aways. Highlights were the talks by Pushmeet Kohli from Google DeepMind UK, Alison Lowndes from NVIDIA, and Adnan Darwiche UCLA. Melinda Hodkiewicz (SHL)and Ashwin D’Cruz (ex-SHL now working for Calipsa in London) attended. https://www.floc2018.org/summit-on-machine-learning/.
Pushmeet Kohli (Google DeepMind): Challenges for AI are to ensure it is a) robust to adversaries, b) generalises well to variations in the real world, c) it is fair, d) it is compliant with regulations. When talking about ‘fairness’, he split the discussion into the “What do we mean by fair?” and “How to make AI fair”. He did not answer the “what” question. Instead saying that “this needed to be set in regulations”. As far as the “how”, Kohli suggested three steps 1) rigorous testing, 2) developing robust AI, and 3) verifying AI systems. A significant challenge for AI is that test set evaluation approaches commonly used in ML are inappropriate for 1) adversarial environments and 2) safety critical domains. In safety critical domains loss functions are unbounded. Also you would need a lot of samples of bad events for test set evaluation, and we can’t afford to do this. He then gave some examples of work his team at Google is doing (see his recent ICML, ICLR and NIPS papers) and left us with the idea that we might need a new language for AI which has suitable inductive bias (set of assumptions that the learning algorithm uses to predict outputs given inputs it has not encountered) and with the right expressiveness to describe what’s going on.
Comment: Melinda asked the room if there was anyone at the workshop working for legislators or regulators; there was not. It is not clear how legislators are going to develop workable regulations and regulators have the capacity to assess practice against these regulations without a good understanding of the issues being discussed at these types of events.
Alison Lowndes (NVIDIA): NVDIA has developed massive simulation platforms and Alison talked about their work on Jetson Xavier, an AI computer for autonomous machines delivering GPU workstation performance in a single embedded module https://developer.nvidia.com/jetson-xavier-devkit. She observed that while Reinforcement Learning was highly fashionable (80 papers/day published on arXiv), it is not commercial yet. Classical ML (SVM, MLP, GBDT) are still very relevant and widely used. Convolution neural networks are widely used. She expressed concern about the “common person’s voice in the room” and suggested that philosophy and psychology will become more important. Finally relevant for the SHL and Makers she said that educational institutions can get a free DevKit from NVDIA https://developer.nvidia.com/teaching-kits
Andre Platzer (Carnegie Mellon) talked on ‘safe’ reinforcement learning via formal methods with a focus on safety critical systems. How do you demonstrate that the algorithm is “provably safe”? He talked about the need to 1) learn safety, 2) learn a safety policy, and 3) verify and about the issue of “what if the model is incorrect”? As far as safety policy, this appears to be based on the idea of “have we seen this output before and it was ok, then it should be safe, given the same context”, how do we know to trust this and how do we know if the context has changed? Andre runs the Logical Systems Labs http://www.ls.cs.cmu.edu/ and has a text book on Logical Foundations of Cyber-Physical Systems.
Sumit Gulwani from Microsoft talked abotu their PROSE kit https://microsoft.github.io/prose/. This is about programming by examples – the automatic generation of programs from input-output examples. It can build programs in various languages such as Python/R/C# and some of its functionality is baked in Excel. Some scripts that are now tedious to write can be automated.
Marta Kwiatkowska (Oxford) talked about her team’s work on ‘safety verification for deep learning networks with proven guarantees’. She made the point that while there are an infinite set of possible outcomes we only measure ‘accuracy’ on a finite data set. She demonstrated how deep learning networks are unstable to adversarial perturbations using image processing of a car sign (plenty of papers on this on arxiv) and asked ‘how can we verify that such behaviour cannot occur’. Marta’s group at Oxford is involved in modelling and automated verification techniques for software systems. One of the current projects in her group is safety and trust for mobile autonomous robots. http://www.cs.ox.ac.uk/people/marta.kwiatkowska/research.html.
Adnan Darwiche from UCLA presented on “what just happened in AI”. It draws on his recent paper “Human-Level Intelligence or Animal-Like Abilities” https://arxiv.org/abs/1707.04327. He made a number of interesting observations 1) Lots on new AI applications, 2) AI has been around >50 years, 3) The AI curriculum is almost unchanged. Essentially every behaviour can be captured to some extent by a function. We are now building bigger functions and we have more data. A deep learning NN is a function and architecting the structure of a NN is function engineering. Next he moved onto how our perception of value has changed. Model-based approaches try and understand a system whereas ML models translate without insight. We have realised in many cases (e.g. social media) you don’t need the understanding to be useful. The ease at which we can get results that may be the same or only slightly better than what we can do with model based methods is very attractive. However he warned about the growing gap between hype and reality and reminded the audience of a period he described as the “AI winter”. He warned about a lost generation of AI researchers who are well versed in NN models but not in Logic, the need to understand the limitations of function-based approaches and to characterise deep learning functions in a scientifically precise manner. It’s worth reading his paper from the link above for a fully discussion of his concerns.