‘F’ to ‘A’ on the N.Y. Regents Science Exams: An Overview of the Aristo Project
Performing well on standardized exams has been a longstanding challenge for AI. Even in 2016, the best AI system achieved less than 60% on an 8th Grade science exam challenge. Recently, AI2’s Aristo system achieved surprising success on the Grade 8 New York Regents Science Exams, scoring over 90% on the exam’s non-diagram, multiple-choice (NDMC) questions. How was it able to do this, and what mistakes does it still make? In this talk, I will give an overview of Aristo and the impact of its various components, in particular its new language model (LM) solvers. I will also present several analyses of what is going on inside Aristo, probing in particular how far the LM solvers go beyond simple pattern matching and what kinds of errors still occur. Finally, I will speculate on the larger quest towards knowledgeable machines that can reason, explain, and interact, and on what additional capabilities are needed to reach this broader goal.
Speaker Details
Peter Clark is a Senior Research Manager and founding member of AI2, and has led Project Aristo since its inception in 2014. His research focuses on natural language processing, machine inference, and commonsense reasoning, and on the interplay between these three areas.
- Series: Microsoft Research Talks
- Speakers: Peter Clark
- Affiliation: Allen Institute for AI