A Dialogue-based Tutoring System (DBTS) is a system that educates learners through conversation. A DBTS helps learners improve their abilities and can provide additional practice opportunities in fields such as language, science, and mathematics. Recent advances in artificial intelligence and natural language processing have significantly improved the conversational quality of DBTSs built on generative language models. Implementing such a DBTS requires a large and varied collection of conversation-based tutoring data in the target field. However, crowdsourcing such dialogue datasets is costly, and training annotators demands considerable time and money. Moreover, insufficiently trained annotators can lower data quality. This paper therefore proposes an automated method for constructing datasets for dialogue-based math tutoring systems, designs a benchmark for evaluating such systems on the automatically constructed data, and presents baseline performance on that benchmark. Compared with traditional manual dataset construction, the automated approach not only saves time and cost but also yields more diverse and consistent data. The data follow a 1:1 dialogue-based guided practice setting, in which each dialogue scenario is designed so that the teacher first teaches new knowledge and the teacher and student then solve problems collaboratively. The math problems target the difficulty level of elementary mathematics and consist of math word problems from GSM8K (Grade School Math 8K).