QuArch

Question-Answering Computer Architecture Dataset

What is QuArch?

QuArch (Question-Answering in Computer Architecture) is a specialized dataset designed to support AI-driven question answering in the domain of computer architecture and hardware. It is built from the Archipedia corpus, a comprehensive collection of scholarly articles, technical documentation, and insights spanning decades. QuArch consists of questions on a wide range of computer architecture topics, with answers grounded in curated technical content.

QuArch aims to provide structured datasets for both straightforward and complex questions in areas such as processor design, memory systems, and performance optimization. It is intended to serve as a benchmark dataset that helps researchers build and evaluate language models (LMs) for accuracy and relevance in the computer architecture domain.

The alpha release of QuArch v0.1 offers a foundation of question-answer pairs, designed to assess the computer architecture knowledge embedded in LMs today and bridge the gap between AI agent capabilities and specialized knowledge in computing hardware and architecture. For more details about QuArch, refer to this paper:

Resources

Explore the QuArch initiative through the following links:



The QuArch v1.0 dataset will be hosted at the Hugging Face link above after our crowdsourcing effort. In the meantime, to request access to the alpha version of the dataset (QuArch v0.1), email contact.quarch@gmail.com.

Model Leaderboard

QuArch is being used to assess the computer architecture domain knowledge of language models, as a step toward building AI agents that can assist computer architects in reasoning about system problems, trade-offs, and optimizations.

Rank  Date          Model          Organization  Accuracy (%)
1     Oct 15, 2024  claude-3.5     Anthropic     83.76
2     Oct 2, 2024   gpt-4o         OpenAI        83.38
3     Oct 2, 2024   llama-3.1-70b  Meta          78.72
4     Oct 15, 2024  gemini-1.5     Google        78.07
5     Oct 1, 2024   llama-3.1-8b   Meta          71.73
6     Oct 15, 2024  llama-3.2-3b   Meta          59.51
7     Oct 15, 2024  gemma-2-27b    Google        60.35
8     Oct 1, 2024   mistral-7b     Mistral AI    61.97
9     Oct 1, 2024   llama-3.2-1b   Meta          48.77
10    Oct 1, 2024   gemma-2-9b     Google        47.93
11    Oct 1, 2024   gemma-2-2b     Google        38.62
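
As a rough illustration of what an accuracy percentage like those on the leaderboard measures, the sketch below scores a model on a handful of question-answer pairs. It is not the QuArch evaluation harness. The record fields ("question", "answer"), the exact-match comparison, the function names, and the toy records are all assumptions made for this example, and none of the data comes from QuArch itself.

```python
from typing import Callable, Iterable, Mapping


def accuracy_percent(
    records: Iterable[Mapping[str, str]],
    predict: Callable[[str], str],
) -> float:
    """Return the percentage of records whose predicted answer matches the reference."""
    records = list(records)
    if not records:
        raise ValueError("need at least one question-answer pair")
    # Assumed scoring rule: case-insensitive exact match after trimming whitespace.
    correct = sum(
        1
        for record in records
        if predict(record["question"]).strip().lower()
        == record["answer"].strip().lower()
    )
    return 100.0 * correct / len(records)


# Placeholder records, NOT taken from QuArch; the field names are assumptions.
toy_records = [
    {"question": "Which cache level sits closest to the core?", "answer": "L1"},
    {"question": "What does ISA stand for?", "answer": "Instruction set architecture"},
    {"question": "What does TLB stand for?", "answer": "Translation lookaside buffer"},
    {"question": "What does DRAM stand for?", "answer": "Dynamic random-access memory"},
]


def toy_model(question: str) -> str:
    """Hypothetical stand-in for a language model: knows three of the four answers."""
    known = {
        "Which cache level sits closest to the core?": "l1",
        "What does ISA stand for?": "instruction set architecture",
        "What does TLB stand for?": "translation lookaside buffer",
    }
    return known.get(question, "unknown")


print(f"{accuracy_percent(toy_records, toy_model):.2f}%")  # prints 75.00%
```

In practice, `toy_model` would be replaced by a call to the language model under evaluation, and the matching rule would follow whatever answer format the dataset actually uses.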

QuArch Embeddings