About the ARC Challenge

The Abstraction and Reasoning Corpus (ARC) is a benchmark dataset designed to measure general fluid intelligence in AI systems. It consists of tasks in which the AI must infer a pattern from a few examples and apply it to a new input.

Each task contains:

  • Training examples showing input-output pairs that demonstrate the pattern
  • A test input where the AI must predict the correct output
  • The ground truth test output for evaluation
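
To make the task structure concrete, here is a minimal sketch of one task in the JSON-style layout used by the public ARC dataset (`train` and `test` lists of `input`/`output` grids, each grid a list of rows of integers 0-9). The grids, the `mirror_left_right` rule, and the helper names are made up for illustration and are not taken from this page.

```python
# Toy task in the public ARC layout; the grids are invented for illustration.
task = {
    "train": [
        {"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]},
        {"input": [[0, 0], [2, 0]], "output": [[0, 0], [0, 2]]},
    ],
    "test": [
        {"input": [[3, 0], [0, 0]], "output": [[0, 3], [0, 0]]},
    ],
}


def mirror_left_right(grid):
    """Hypothetical candidate rule: flip every row horizontally."""
    return [list(reversed(row)) for row in grid]


def is_correct(predicted, expected):
    """Exact match: grid shape and every cell must agree."""
    return predicted == expected


def fits_training(rule, task):
    """Check whether a rule reproduces every training output."""
    return all(
        is_correct(rule(pair["input"]), pair["output"])
        for pair in task["train"]
    )


if fits_training(mirror_left_right, task):
    test_pair = task["test"][0]
    prediction = mirror_left_right(test_pair["input"])
    print("test solved:", is_correct(prediction, test_pair["output"]))
```

The ground truth test output is used only for scoring; a model sees the training pairs and the test input.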

This page showcases different transduction and induction models attempting to solve the ARC validation set. For each task, the models generate multiple candidate solutions, which are then ranked using strategies such as test-time fine-tuning and reranking.
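
The page does not spell out how its ranking strategies work. As a rough illustration of what ranking candidates means, the sketch below uses simple frequency voting as a stand-in (not the models' actual method): identical candidate grids are pooled, the most frequent come first, and a task counts as solved if the ground truth appears among the top `k`. The function names and the default `k=2` are assumptions.

```python
from collections import Counter


def rank_by_votes(candidates):
    """Rank candidate grids by how often they were generated.

    Grids are converted to tuples so they can be counted; ties keep
    first-seen order (Counter.most_common behaviour).
    """
    keys = [tuple(tuple(row) for row in grid) for grid in candidates]
    counts = Counter(keys)
    return [[list(row) for row in key] for key, _ in counts.most_common()]


def solved_within_top_k(ranked, expected, k=2):
    """A task counts as solved if the expected grid is in the top k."""
    return expected in ranked[:k]


# Four sampled candidates for one test input (made-up grids).
candidates = [[[0, 1]], [[1, 0]], [[0, 1]], [[1, 1]]]
ranked = rank_by_votes(candidates)
print(ranked[0])
print(solved_within_top_k(ranked, [[1, 0]], k=2))
```

A learned reranker or a model fine-tuned at test time would replace the vote count with its own score, but the top-`k` check would stay the same.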

The visualization allows you to:

  • Compare different model variants and their performance
  • View training examples and test cases
  • Examine candidate solutions generated by the models
  • Track success rates and solution rankings