Workshop
Math AI for Education (MATHAI4ED): Bridging the Gap Between Research and Smart Education
Pan Lu · Yuhuai Wu · Sean Welleck · Xiaodan Liang · Eric Xing · James McClelland

Tue Dec 14 08:55 AM -- 06:05 PM (PST)

Mathematical reasoning is a unique aspect of human intelligence and a fundamental building block for scientific and intellectual pursuits. However, learning mathematics is often a challenging human endeavor that relies on expert instructors to create, teach and evaluate mathematical material. From an educational perspective, AI systems that aid in this process offer increased inclusion and accessibility, efficiency, and understanding of mathematics. Moreover, building systems capable of understanding, creating, and using mathematics offers a unique setting for studying reasoning in AI. This workshop will investigate the intersection of mathematics education and AI, including applications to teaching, evaluation, and assisting. Enabling these applications requires not only innovations in math AI research, but also a better understanding of the challenges in real-world education scenarios. Hence, we will bring together a group of experts from a diverse set of backgrounds, institutions, and disciplines to drive progress on these and other real-world education scenarios, and to discuss the promise and challenge of integrating mathematical AI into education.

Tue 8:55 a.m. - 9:00 a.m. Introduction and Opening Remarks (Remarks)
Tue 9:00 a.m. - 9:01 a.m. Introduction of the talk speaker (Introduction)
Tue 9:01 a.m. - 9:26 a.m. Solving Math Problems by Joint Parsing and Cognitive Reasoning (Invited Talk) Song-Chun Zhu
Tue 9:26 a.m. - 9:30 a.m. Talk Q&A (Q&A)
Tue 9:30 a.m. - 9:31 a.m. Introduction of the talk speaker (Introduction)
Tue 9:31 a.m. - 9:56 a.m. Natural Language Processing meets Educational Data Science (Invited Talk) Mrinmaya Sachan
Tue 9:56 a.m. - 10:00 a.m. Talk Q&A (Q&A)
Tue 10:00 a.m. - 10:30 a.m. Poster Session 1 (Poster Session) Please join us in GatherTown for our poster session. The posters are as follows:
    33833 Geometric Question Answering Towards Multimodal Numerical Reasoning
    33832 Towards Diagram Understanding and Cognitive Reasoning in Icon Question Answering
    33830 Towards Grounded Natural Language Proof Generation
    33828 Theorem-Aware Geometry Problem Solving with Symbolic Reasoning and Theorem Prediction
    33827 REAL2: An end-to-end memory-augmented solver for math word problems
    33826 GeoRE: A Relation Extraction Dataset for Chinese Geometry Problems
    33823 MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education
    Jiaqi Chen · Tony Xia · Sean Welleck · Jiacheng (Gary) Liu · Ran Gong · Shifeng Huang · Wei Yu · Tracy Jia Shen
Tue 10:30 a.m. - 11:00 a.m. Coffee Break (Break)
Tue 11:00 a.m. - 12:00 p.m. Interview with Stephen Wolfram (Interview) Stephen Wolfram · Danielle R Mayer
Tue 12:00 p.m. - 1:00 p.m. Lunch Break (Break)
Tue 1:00 p.m. - 1:01 p.m. Introduction of the talk speaker (Introduction)
Tue 1:01 p.m. - 1:26 p.m. Understanding and Knowledge Extraction from Mathematical and Scientific Text (Invited Talk) Hanna Hajishirzi
Tue 1:26 p.m. - 1:30 p.m. Talk Q&A (Q&A)
Tue 1:30 p.m. - 1:31 p.m. Introduction of the talk speaker (Introduction)
Tue 1:31 p.m. - 1:56 p.m. Free-form Grading of Math Assignments: A case study in collaboration with Art of Problem Solving (Invited Talk) Yuri Burda
Tue 1:56 p.m. - 2:00 p.m. Talk Q&A (Q&A)
Tue 2:00 p.m. - 2:01 p.m. Introduction of the talk speaker (Introduction)
Tue 2:01 p.m. - 2:26 p.m. FACT: An automated teaching assistant for middle school math classrooms (Invited Talk) Kurt VanLehn
Tue 2:26 p.m. - 2:30 p.m. Talk Q&A (Q&A)
Tue 2:30 p.m. - 3:00 p.m. Poster Session 2 (Poster Session) Please join us in GatherTown for our poster session. The posters are as follows:
    33831 Gamifying Math Education using Object Detection
    33829 Who Gets the Benefit of the Doubt? Racial Bias in Machine Learning Algorithms Applied to Secondary School Math Education
    33825 Phygital Math Learning with Handwriting for Kids
    33824 Exploring Student Representation For Neural Cognitive Diagnosis
    33822 An Empirical Study of Finding Similar Exercises
    33821 Evaluation of mathematical questioning strategies using data collected through weak supervision
    Yueqiu Sun · Haewon Jeong · Nrupatunga . · Hengyao Bao · Tongwen Huang · Debajyoti Datta
Tue 3:00 p.m. - 3:30 p.m. Coffee Break (Break)
Tue 3:30 p.m. - 3:31 p.m. Introduction of the talk speaker (Introduction)
Tue 3:31 p.m. - 3:56 p.m. Weaving AI Into Education (Invited Talk) Sumeet Singh
Tue 3:56 p.m. - 4:00 p.m. Talk Q&A (Q&A)
Tue 4:00 p.m. - 4:01 p.m. Introduction of the contributed talk speaker (Introduction)
Tue 4:01 p.m. - 4:16 p.m. MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education (Contributed Talk)
Best Paper Award for NeurIPS 2021 MathAI4Ed Workshop.
Since the introduction of the original BERT (i.e., BASE BERT), researchers have developed various customized BERT models with improved performance for specific domains and tasks by exploiting the benefits of transfer learning.
Due to the nature of mathematical texts, which often use domain-specific vocabulary along with equations and math symbols, we posit that the development of a new BERT model for mathematics would be useful for many mathematical downstream tasks. In this paper, we introduce our multi-institutional effort (i.e., two learning platforms and three academic institutions in the US) toward this need: MathBERT, a model created by pre-training the BASE BERT model on a large mathematical corpus ranging from pre-kindergarten (pre-k), to high-school, to college graduate level mathematical content. In addition, we select three general NLP tasks that are often used in mathematics education: prediction of knowledge component, auto-grading open-ended Q&A, and knowledge tracing, to demonstrate the superiority of MathBERT over BASE BERT. Our experiments show that MathBERT outperforms prior best methods by 1.2-22% and BASE BERT by 2-8% on these tasks. In addition, we build a mathematics-specific vocabulary, mathVocab, to train with MathBERT. We release MathBERT for public usage at: https://github.com/tbs17/MathBERT.
Tracy Jia Shen
Tue 4:16 p.m. - 4:20 p.m. Contributed Talk Q&A (Q&A)
Tue 4:20 p.m. - 4:21 p.m. Introduction of the contributed talk speaker (Introduction)
Tue 4:21 p.m. - 4:36 p.m. Towards Grounded Natural Language Proof Generation (Contributed Talk)
When a student is working on a mathematical proof, it is often helpful to receive suggestions about how to proceed. To this end, we provide an initial study of two generation tasks in natural mathematical language: suggesting the next step in a proof, and full-proof generation. As proofs are grounded in past results (e.g., theorems and definitions), we study knowledge-grounded generation methods, and find that conditioning on retrieved or ground-truth knowledge greatly improves generations. We characterize error types and provide directions for future research.
Jiacheng Liu
Tue 4:36 p.m. - 4:40 p.m. Contributed Talk Q&A (Q&A)
Tue 4:40 p.m. - 5:00 p.m. Coffee Break (Break)
Tue 5:00 p.m. - 6:00 p.m. Panel Discussion (Panel) Jo Boaler · Yuri Burda · Chris Piech · Sumeet Singh · Kurt VanLehn
Tue 6:00 p.m. - 6:05 p.m. Closing Remarks (Remarks)

- Evaluation of mathematical questioning strategies using data collected through weak supervision (Poster)
High-fidelity, AI-based simulated classroom systems enable teachers to rehearse effective teaching strategies. However, dialogue-oriented, open-ended conversations, such as teaching a student about scale factor, can be difficult to model. This paper presents a high-fidelity, AI-based classroom simulator to help teachers rehearse research-based mathematical questioning skills. We take a human-centered approach to designing our system, relying on advances in deep learning, uncertainty quantification, and natural language processing while acknowledging the limitations of conversational agents for specific pedagogical needs. Using experts' input directly during the simulation, we demonstrate how conversation success rate and high user satisfaction can be achieved.
Debajyoti Datta · Maria Phillips · Jim P. Bywater · Jennifer L. Chiu · Ginger S. Watson · Laura E Barnes · Donald Brown

- An Empirical Study of Finding Similar Exercises (Poster)
Education artificial intelligence aims to benefit tasks in the education domain, such as intelligent test paper generation and consolidation exercises, where the main underlying technique is matching exercises, known as the finding similar exercises (FSE) problem. Most existing approaches emphasize their model's ability to represent the exercise; unfortunately, many challenges remain, such as the scarcity of data, insufficient understanding of exercises, and high label noise.
We release a Chinese education pre-trained language model, BERT$_{Edu}$, for the label-scarce dataset and introduce exercise normalization to overcome the diversity of mathematical formulas and terms in exercises. We discover new auxiliary tasks based on problem-solving ideas and propose a very effective MoE-enhanced multi-task model for the FSE task to attain a better understanding of exercises. In addition, confidence learning is used to prune the training set and overcome high noise in the labeled data. Experiments show that the methods proposed in this paper are very effective.
Tongwen Huang · Li Xihua

- MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education (Poster)
Since the introduction of the original BERT (i.e., BASE BERT), researchers have developed various customized BERT models with improved performance for specific domains and tasks by exploiting the benefits of transfer learning. Due to the nature of mathematical texts, which often use domain-specific vocabulary along with equations and math symbols, we posit that the development of a new BERT model for mathematics would be useful for many mathematical downstream tasks. In this paper, we introduce our multi-institutional effort (i.e., two learning platforms and three academic institutions in the US) toward this need: MathBERT, a model created by pre-training the BASE BERT model on a large mathematical corpus ranging from pre-kindergarten (pre-k), to high-school, to college graduate level mathematical content. In addition, we select three general NLP tasks that are often used in mathematics education: prediction of knowledge component, auto-grading open-ended Q&A, and knowledge tracing, to demonstrate the superiority of MathBERT over BASE BERT. Our experiments show that MathBERT outperforms prior best methods by 1.2-22% and BASE BERT by 2-8% on these tasks.
In addition, we build a mathematics-specific vocabulary, mathVocab, to train with MathBERT. We release MathBERT for public usage at: https://github.com/tbs17/MathBERT.
Tracy Jia Shen · Michiharu Yamashita · Ethan Prihar · Neil Heffernan · Xintao Wu · Ben Graff · Dongwon Lee

- Exploring Student Representation For Neural Cognitive Diagnosis (Poster)
Cognitive diagnosis, the goal of which is to obtain the proficiency level of students on specific knowledge concepts, is a fundamental task in smart educational systems. Previous works usually represent each student as a trainable knowledge proficiency vector, which cannot capture the relations between concepts or the basic profile (e.g., memory or comprehension) of students. In this paper, we propose a method of student representation that exploits the hierarchical relations of knowledge concepts and student embeddings. Specifically, since the proficiency on parent knowledge concepts reflects the correlation between knowledge concepts, we get the first knowledge proficiency with a parent-child concept projection layer. In addition, a low-dimensional dense vector is adopted as the embedding of each student, and we obtain the second knowledge proficiency with a fully connected layer. Then, we combine the two proficiency vectors above to get the final representation of students. Experiments show the effectiveness of the proposed representation method.
Hengyao Bao · Li Xihua

- Phygital Math Learning with Handwriting for Kids (Poster)
To provide fun learning and concept apprehension for online education, the content and experience are of prime importance. In this work, we present Phygital (Physical + Digital) math learning through handwriting with traditional pen and paper, which is vital for a child's cognitive and motor skill development. Our system provides interactive educational content for 3-10 year old kids, with real-time feedback and evaluation, recognizing handwriting at high precision/recall.
The real-time feedback, along with a virtual assisting character, is developed in line with a child's thinking ability and age. Our system is used across geographies at a huge scale.
Nrupatunga . · Aashish Kumar · Anoop Kolar Rajagopal

- GeoRE: A Relation Extraction Dataset for Chinese Geometry Problems (Poster)
Relation extraction is an important foundation for many natural language understanding applications, as well as geometry problem solving. In this paper, we present GeoRE, a relation extraction dataset for Chinese geometry problems. To the best of our knowledge, GeoRE is the first Chinese relation extraction dataset about geometry problems. It consists of 12,901 geometry problems on 43 shapes, covering 19 positional relations and 4 quantitative relations. We experiment with various state-of-the-art (SOTA) models, and the best model achieves an F1 score of only 70.3% on GeoRE. This shows that GeoRE presents a challenge for future research.
Wei Yu · Shuyu Miao · Xun Zhou · Jingdong Liu · Yongfu Zha · Yongjian Zhang · Mengzhu Wang · Xiaodong Wang

- REAL2: An end-to-end memory-augmented solver for math word problems (Poster)
Math word problem (MWP) solving has recently shown encouraging progress, e.g., in Recall and Learn (REAL), which solves a problem by retrieving the most similar questions based on a pre-trained memory module. In this article, we verify the effectiveness of different neural memory modules that can be trained end-to-end. Specifically, we first propose a Top-N pre-ranking process to retrieve candidate questions based on a Word2Vec model, and then we utilize a trainable memory module to re-rank the candidates to obtain the most similar Top-K questions. With this simple modification, we establish a stronger framework, REAL2, that achieves state-of-the-art results. Code will be made public, and we hope it will make research on analogical learning for the MWP task more accessible.
Shifeng Huang · Jiawei Wang · Jiao Xu · Da Cao · Ming Yang

- Theorem-Aware Geometry Problem Solving with Symbolic Reasoning and Theorem Prediction (Poster)
Geometry problem solving is challenging as it requires abstract problem understanding and symbolic reasoning with axiomatic knowledge. However, current datasets are either small in scale or not publicly available. Thus, we construct a new large-scale benchmark, Geometry3K, consisting of 3,002 geometry problems with dense annotation in formal language. We further propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem Solver (Inter-GPS). Inter-GPS first parses the problem text and diagram into formal language automatically via rule-based text parsing and neural object detection, respectively. Unlike implicit learning in existing methods, Inter-GPS incorporates theorem knowledge as conditional rules and performs symbolic reasoning step by step. Also, a theorem predictor is designed to infer the theorem application sequence fed to the symbolic solver for a more efficient and reasonable search path. Extensive experiments on the Geometry3K and GEOS datasets demonstrate that Inter-GPS achieves significant improvements over existing methods. The project is available at https://lupantech.github.io/inter-gps.
Pan Lu · Ran Gong · Shibiao Jiang · Liang Qiu · Siyuan Huang · Xiaodan Liang · Song-Chun Zhu

- Who Gets the Benefit of the Doubt? Racial Bias in Machine Learning Algorithms Applied to Secondary School Math Education (Poster)
Machine learning algorithms are rapidly being adopted to aid pedagogical decision-making in applications ranging from grading to student placement. Are these algorithms fair?
We prove that, for predicting students' math performance, the standard machine learning practice of selecting a model that maximizes predictive accuracy can result in algorithms that give significantly more benefit of the doubt to White and Asian students and are more punitive to Black, Hispanic, and Native American students. This disparity is masked by comparatively high predictive accuracy across both groups. We suggest new interventions that help close this performance gap and do not require the use of a different algorithm for each student group. Together, our results suggest new best practices for applying machine learning to education-related applications.
Haewon Jeong · Michael D. Wu · Nilanjana Dasgupta · Muriel Medard · Flavio Calmon

- Towards Grounded Natural Language Proof Generation (Poster)
When a student is working on a mathematical proof, it is often helpful to receive suggestions about how to proceed. To this end, we provide an initial study of two generation tasks in natural mathematical language: suggesting the next step in a proof, and full-proof generation. As proofs are grounded in past results (e.g., theorems and definitions), we study knowledge-grounded generation methods, and find that conditioning on retrieved or ground-truth knowledge greatly improves generations. We characterize error types and provide directions for future research.
Sean Welleck · Jiacheng (Gary) Liu · Yejin Choi

- Gamifying Math Education using Object Detection (Poster)
Manipulatives used in the right way help improve understanding of mathematical concepts, leading to better learning outcomes. In this paper, we present a phygital (physical + digital), curriculum-inspired teaching system for kids aged 5-8 to learn geometry using shape tile manipulatives. Combining smaller shapes to form larger ones is an important skill kids learn early on, which requires shape tiles to be placed close to each other in the play area.
This introduces a challenge of oriented object detection for densely packed objects with arbitrary orientations. Leveraging simulated data for neural network training and lightweight mobile architectures, we enable our system to understand user interactions and provide real-time audiovisual feedback. Experimental results show that our network runs in real time with high precision/recall on consumer devices, thereby providing a consistent and enjoyable learning experience.
Rohit Nambiar · Yueqiu Sun · Vivek Vidyasagaran

- Towards Diagram Understanding and Cognitive Reasoning in Icon Question Answering (Poster)
Current visual question answering (VQA) tasks mainly consider answering human-annotated questions for natural images. However, aside from natural images, abstract diagrams with semantic richness are still understudied in visual understanding and reasoning research. In this work, we introduce a new challenge of Icon Question Answering (IconQA) with the goal of answering a question in an icon image context. We release IconQA, a large-scale dataset that consists of 107,439 questions, which highlights the importance of abstract diagram understanding and comprehensive cognitive reasoning. IconQA requires not only perception skills like object recognition and text understanding, but also diverse cognitive reasoning skills, such as geometric reasoning, commonsense reasoning, and arithmetic reasoning. To help potential IconQA models learn semantic representations for icon images, we further release an icon dataset, Icon645, which contains 645,687 colored icons in 377 classes. We conduct extensive user studies and blind experiments and reproduce a wide range of advanced VQA methods to benchmark the IconQA task. Also, we develop a strong IconQA baseline, Patch-TRM, that applies a pyramid cross-modal Transformer with input diagram embeddings pre-trained on the icon dataset. IconQA and Icon645 are available at https://iconqa.github.io.
Pan Lu · Liang Qiu · Jiaqi Chen · Tanglin Xia · Yizhou Zhao · Wei Zhang · Zhou Yu · Xiaodan Liang · Song-Chun Zhu

- Geometric Question Answering Towards Multimodal Numerical Reasoning (Poster)
Automatic math problem solving has recently attracted increasing attention as a long-standing AI benchmark. In this paper, we focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge. However, existing methods are highly dependent on handcrafted rules and are evaluated only on small-scale datasets. Therefore, we propose a Geometric Question Answering dataset, GeoQA, containing 5,010 geometric problems with corresponding annotated programs, which illustrate the solving process of the given problems. Compared with GeoS, another publicly available dataset, GeoQA is 25 times larger, and its program annotations can provide a practical testbed for future research on explicit and explainable numerical reasoning. Moreover, we introduce a Neural Geometric Solver (NGS) to address geometric problems by comprehensively parsing multimodal information and generating interpretable programs. We further add multiple self-supervised auxiliary tasks on NGS to enhance cross-modal semantic representation. Extensive experiments on GeoQA validate the effectiveness of our proposed NGS and auxiliary tasks. However, the results are still significantly lower than human performance, which leaves large room for future research.
Jiaqi Chen · Jianheng Tang · Jinghui Qin · Xiaodan Liang · Lingbo Liu · Eric Xing · Liang Lin