cot_datasets.md

Collected from "Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future".

| Task | Dataset | Year | Size | Input | Output | Rationale | Description | Paper |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mathematical Reasoning | AddSub | 2014 | 395 | Question | Number | Equation | Simple arithmetic | Learning to Solve Arithmetic Word Problems with Verb Categorization |
| Mathematical Reasoning | SingleEq | 2015 | 508 | Question | Number | Equation | Simple arithmetic | Parsing Algebraic Word Problems into Equations |
| Mathematical Reasoning | MultiArith | 2015 | 600 | Question | Number | Equation | Simple arithmetic | Solving General Arithmetic Word Problems |
| Mathematical Reasoning | MAWPS | 2016 | 3,320 | Question | Number | Equation | Simple arithmetic | MAWPS: A Math Word Problem Repository |
| Mathematical Reasoning | AQUA-RAT | 2017 | 100,000 | Question | Option | Natural Language | Math reasoning with NL rationale | Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems |
| Mathematical Reasoning | ASDiv | 2020 | 2,305 | Question | Number | Equation | Multi-step math reasoning | A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers |
| Mathematical Reasoning | SVAMP | 2021 | 1,000 | Question | Number | Equation | Multi-step math reasoning | Are NLP Models really able to Solve Simple Math Word Problems? |
| Mathematical Reasoning | GSM8K | 2021 | 8,792 | Question | Number | Natural Language | Multi-step math reasoning | Training Verifiers to Solve Math Word Problems |
| Mathematical Reasoning | GSM-Hard | 2023 | 936 | Question | Number | Natural Language | GSM8K with larger numbers | PAL: Program-aided Language Models |
| Mathematical Reasoning | MathQA | 2019 | 37,297 | Question | Number | Operation | Annotated based on AQUA | MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms |
| Mathematical Reasoning | DROP | 2019 | 96,567 | Question+Passage | Number+Span | Equation | Reading comprehension form | DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs |
| Mathematical Reasoning | TheoremQA | 2023 | 800 | Question+Theorem | Number | - | Answer based on theorems | TheoremQA: A Theorem-driven Question Answering Dataset |
| Mathematical Reasoning | TAT-QA | 2021 | 16,552 | Question+Table+Text | Number+Span | Operation | Answer based on tables | TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance |
| Mathematical Reasoning | FinQA | 2021 | 8,281 | Question+Table+Text | Number | Operation | Answer based on tables | FinQA: A Dataset of Numerical Reasoning over Financial Data |
| Mathematical Reasoning | ConvFinQA | 2022 | 3,892 | Question+Table+Dialog | Number | Operation | Multi-turn dialogs | ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering |
| Mathematical Reasoning | MATH | 2021 | 12,500 | Question | Number | Natural Language | Challenging competition math problems | Measuring Mathematical Problem Solving With the MATH Dataset |
| Mathematical Reasoning | NumGLUE | 2022 | 101,835 | Question+Text | Number+Span | - | Multi-task benchmark | NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks |
| Mathematical Reasoning | LILA | 2022 | 133,815 | Question+Text | Free-form | Program | Multi-task benchmark | LILA: A Unified Benchmark for Mathematical Reasoning |
| Commonsense Reasoning | ARC | 2021 | 7,787 | Question | Option | - | From science exams | Think you have Solved Direct-Answer Question Answering? Try ARC-DA, the Direct-Answer AI2 Reasoning Challenge |
| Commonsense Reasoning | OpenBookQA | 2018 | 5,957 | Question+Context | Option | - | Open-book knowledge | Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering |
| Commonsense Reasoning | PIQA | 2020 | 21,000 | Goal+Solution | Option | - | Physical commonsense knowledge | PIQA: Reasoning about Physical Commonsense in Natural Language |
| Commonsense Reasoning | CommonsenseQA | 2019 | 12,247 | Question | Option | - | Derived from ConceptNet | CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge |
| Commonsense Reasoning | CommonsenseQA 2.0 | 2021 | 14,343 | Question | Yes/No | - | High-quality annotation via gamification | CommonsenseQA 2.0: Exposing the Limits of AI through Gamification |
| Commonsense Reasoning | Event2Mind | 2018 | 25,000 | Event | Intent+Reaction | - | Intention commonsense reasoning | Event2Mind: Commonsense Inference on Events, Intents, and Reactions |
| Commonsense Reasoning | McTaco | 2019 | 13,225 | Question | Option | - | Event temporal commonsense reasoning | Going on a vacation takes longer than Going for a walk: A Study of Temporal Commonsense Understanding |
| Commonsense Reasoning | CosmosQA | 2019 | 35,588 | Question+Paragraph | Option | - | Narrative commonsense reasoning | Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning |
| Commonsense Reasoning | ComValidation | 2019 | 11,997 | Statement | Option | - | Commonsense verification | Does it Make Sense? And Why? A Pilot Study for Sense Making and Explanation |
| Commonsense Reasoning | ComExplanation | 2019 | 11,997 | Statement | Option/Free-form | - | Commonsense explanation | Does it Make Sense? And Why? A Pilot Study for Sense Making and Explanation |
| Commonsense Reasoning | StrategyQA | 2021 | 2,780 | Question | Yes/No | - | Multi-hop commonsense reasoning | Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies |
| Symbolic Reasoning | Last Letter Concat | 2022 | - | Words | Letters | - | Rule-based | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models |
| Symbolic Reasoning | Coin Flip | 2022 | - | Statement | Yes/No | - | Rule-based | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models |
| Symbolic Reasoning | Reverse List | 2022 | - | List | Reversed List | - | Rule-based | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models |
| Symbolic Reasoning | BigBench | 2022 | - | - | - | - | Contains multiple symbolic reasoning datasets | Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models |
| Symbolic Reasoning | BigBench-Hard | 2023 | - | - | - | - | Contains multiple symbolic reasoning datasets | Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them |
| Logical Reasoning | ReClor | 2020 | 6,138 | Question+Context | Option | - | Questions from GMAT and LSAT | ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning |
| Logical Reasoning | LogiQA | 2020 | 8,678 | Question+Paragraph | Option | - | Questions from China Civil Service Exam | LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning |
| Logical Reasoning | ProofWriter | 2021 | 20,192 | Question+Rule | Answer+Proof | Entailment Tree | Reasoning process generation | ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language |
| Logical Reasoning | FOLIO | 2022 | 1,435 | Conclusion+Premise | Yes/No | - | First-order logic | FOLIO: Natural Language Reasoning with First-Order Logic |
| Logical Reasoning | DEER | 2024 | 1,200 | Fact | Rule | - | Inductive reasoning | Language Models as Inductive Reasoners |
| Logical Reasoning | PrOntoQA | 2023 | - | Question+Context | Yes/No+Process | First-Order Logic | Deductive reasoning | Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought |
| Multimodal Reasoning | VCR | 2019 | 264,720 | Question+Image | Option | Natural Language | Visual commonsense reasoning | From Recognition to Cognition: Visual Commonsense Reasoning |
| Multimodal Reasoning | VisualCOMET | 2020 | 1,465,704 | Image+Event | Action+Intent | - | Visual commonsense reasoning | VisualCOMET: Reasoning About the Dynamic Context of a Still Image |
| Multimodal Reasoning | PMR | 2022 | 15,360 | Image+Background | Option | - | Premise-based multi-modal reasoning | Premise-based Multimodal Reasoning: Conditional Inference on Joint Textual and Visual Clues |
| Multimodal Reasoning | ScienceQA | 2022 | 21,208 | Q+Image+Context | Option | Natural Language | Multi-modal reasoning with NL rationales | Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering |
| Multimodal Reasoning | VLEP | 2020 | 28,726 | Premise+Video | Option | - | Video event prediction | What is More Likely to Happen Next? Video-and-Language Future Event Prediction |
| Multimodal Reasoning | CLEVRER | 2020 | 305,280 | Question+Video | Option/Free-form | Program | Video temporal and causal reasoning | CLEVRER: Collision Events for Video Representation and Reasoning |
| Multimodal Reasoning | STAR | 2021 | 600,000 | Question+Video | Option | - | Video situated reasoning | STAR: A Benchmark for Situated Reasoning in Real-World Videos |
| Multimodal Reasoning | NEXT-QA | 2021 | 47,692 | Question+Video | Option | - | Video temporal, causal, and commonsense reasoning | NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions |
| Multimodal Reasoning | Causal-VidQA | 2022 | 107,600 | Question+Video | Free-form | Natural Language | Video causal and commonsense reasoning | From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering |
| Multimodal Reasoning | News-KVQA | 2022 | 1,041,352 | Q+V+KG | Option | - | Video reasoning with external knowledge | NewsKVQA: Knowledge-Aware News Video Question Answering |