Adding reasoning to your AI? Take these resources; they may help you on your way.

| Resource | Description | Links |
|---|---|---|
| AGI/causality/formal grammar | | |
| DeepMind Chomsky Hierarchy | Problems crafted for FSM/PDA/TM | [1] |
| automata | a neurallambda tool to generate from grammars | [1] |
| im a strange dataset | Tough for LLMs because of self-references. | [1] |
| DiagGSM8k | NL Reasoning Benchmark | [1] |
| CLadder | Causal reasoning | [1] |
| Cause-Effect Pairs | 108 datasets of 2-variable dynamics (not NL) | [1] |
| MNLI Entailment | sentence parsing + entailment | [1] |
| AGENT/TOOL | | |
| THUDM AgentInstruct | long form dialogs | [1] |
| WANG AgentInstruct | GPT-3 synthesized instructions | [1] |
| KnowLM Tool | prompt + tool call + answer | [1] |
| Glaive Tool Usage | system prompt listing tools + prompt + answer | [1] |
| opentoolformer retrieval | prompt + tool call | [1] |
| CODE | | |
| rosetta | same program, many different languages | [1] |
| EvoEval Tool Use | 100 prompts + code + tests | [1] |
| MATH/LOGIC | | |
| gsm8k | Grade School Math 8k | [1] |
| MetaMath | one-shot math | [1] |
| MetaMathFewShot | few-shot math | [1] |
| MathPile | 9B tokens from filtered internet | [1] |
| LogiQA | NL multiple choice, requires abstraction | [1] |
| Logic-LM | a model combining automated theorem provers and LLMs | [1] |
| Coq Facts | 270k Coq theorem prover programs | [1] |
| NATURAL LANGUAGE | | |
| Nous Open Reasoning | community-contributed tasks | [1] |
| UltraInteract_sft | GPT-generated iterated reasoning dialogs | [1] |
| CoGnition | NL compositional generalization | [1] |
| Winogrande | ambiguous sentences, fill in 1 word | [1] |
| Winograd_wsc | ambiguous sentences, choose the right word | [1] |
| Contradiction | 2 phrases, do they contradict | [1] |
| Recognizing Textual Entailment | 2 phrases, do they entail each other | [1] |
| Textual Entailment Pool | more entailment | [1] |
| Answer Validation | 2 phrases, does the answer solve question | [1] |
| Monotonicity Entailment | x is true, does y follow | [1] |
| entailment | passage, question -> T/F | [1] |
| Commonsense QA | multi-choice QA | [1] |
| GLUE | several datasets | [1] |
| custom multi-hop | use wikipedia's graph of articles | |
| MUD videogames | (various could be training data) | |
| skunkworks/reasoning | wide variety of NL tasks | [1] |
| TOY PROBLEMS | | |
| arc-like | 1D visual puzzles, great seq reasoning | [1] |
| re-arc | 2D reverse engineered ARC | [1] |
| ARC | competition | [1] |
| (misc) | xLSTM paper lists several in appendix | [1] |
| expand polynomials | algebraic expansion | [Abstractor] |
| linear eq | solve algebraic eqs | [Abstractor] |
| Match-To-Sample | cogsci test for relational reasoning | [1] MLPs Learn In Context |
| Oddball Detection | cogsci test for relational reasoning | [1] MLPs Learn In Context |
| regression | with in-context learning, good reasoning test | [1] MLPs Learn In Context |
| clustering | with in-context learning, good reasoning test | [1] MLPs Learn In Context |
| COGS | compositional generalization | [1] |
| SCAN | systematicity, "$x to the left" | [1] [2] |
| clevr | 2D image of 3D shapes + natural language QA | [1] [2] |
| lambda calc + beta reductions | generator code, single+multistep | [1] |
| lichess-puzzles | chess puzzles | [1] |
| pointer net problems | convex hull, TSP, triangulation | [1] |
| Big Bench Hard | 23 challenges (only 6k datapoints) | [1] |
| logical entailment dataset | logic strings by DeepMind | [1] |
| logical entailment dataset code | (generate it yourself) | [1] |
| FSM Game | generate strings according to grammar | |
| Adaptive Grammar | grammar rule might change | |
| String/Graph Rewriting | | string_rewriting.py |
| LibraryOfLogic | generate NL from multiple games | [1] |
| AB-XY Game | | |
| word ladder | | |
| parser | | |
| longest common subsequence | | |
| string reversal | | |
| wisconsin card sorting | | |
| anagram | | |
| palindrome | | |
| permutation composition | | |
| TOKEN AUGMENTED REASONING | | |
| Reasoning tokens | Self-Reasoning Tokens, teaching models to think ahead | [1] |
| Quiet-STaR | LLMs Can Teach Themselves to Think Before Speaking | [1] |
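
Several of the toy problems above have no link because they are meant to be generated yourself. As a rough illustration of the "FSM Game" entry (generate strings according to a grammar), here is a minimal sketch. Everything in it is an assumption made for the example: the toy language `(ab)+c`, the `TRANSITIONS` table, and the helper names `generate`, `accepts`, and `make_examples` do not come from neurallambda or any dataset listed here.

```python
import random

# Hypothetical FSM for the regular language (ab)+c, chosen only for illustration.
# Each state maps an emitted symbol to the next state.
TRANSITIONS = {
    "start": {"a": "saw_a"},
    "saw_a": {"b": "saw_b"},
    "saw_b": {"a": "saw_a", "c": "accept"},
    "accept": {},
}


def generate(rng, max_len=20):
    """Random-walk the FSM from 'start' and return one accepted string."""
    while True:
        state, out = "start", []
        while state != "accept" and len(out) < max_len:
            symbol, state = rng.choice(sorted(TRANSITIONS[state].items()))
            out.append(symbol)
        if state == "accept":
            return "".join(out)
        # Walk ran past max_len without accepting: discard it and retry.


def accepts(s):
    """Return True if the FSM ends in 'accept' after consuming all of s."""
    state = "start"
    for ch in s:
        if ch not in TRANSITIONS[state]:
            return False
        state = TRANSITIONS[state][ch]
    return state == "accept"


def make_examples(n, seed=0):
    """Build (string, label) pairs; about half get one symbol perturbed."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        s = generate(rng)
        if rng.random() < 0.5:
            i = rng.randrange(len(s))
            s = s[:i] + rng.choice("abc") + s[i + 1:]
        # The label is always recomputed, so it stays correct even when the
        # perturbation happens to leave the string valid.
        examples.append((s, accepts(s)))
    return examples


if __name__ == "__main__":
    for text, label in make_examples(5):
        print(text, label)
```

Swapping in a different `TRANSITIONS` table midway through generation would be one way to approximate the "Adaptive Grammar" entry, though that is a guess at its intent rather than a description of an existing implementation.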