What is QuArch?
QuArch (Question-Answering in Computer Architecture) is a specialized dataset designed to support AI-driven question answering in the domain of computer architecture and hardware. Built from the Archipedia corpus—a comprehensive collection of scholarly articles, technical documentation, and insights spanning decades—QuArch consists of questions on a wide range of computer architecture topics, with answers grounded in curated technical content.
QuArch aims to provide structured datasets for both straightforward and complex questions in areas such as processor design, memory systems, and performance optimization. In tackling these technical topics, QuArch is intended to serve as a benchmark dataset that helps researchers build and evaluate language models (LMs) for accuracy and relevance in the computer architecture domain.
The alpha release of QuArch v0.1 offers a foundation of question-answer pairs, designed to assess the computer architecture knowledge embedded in LMs today and bridge the gap between AI agent capabilities and specialized knowledge in computing hardware and architecture. For more details about QuArch, refer to this paper:
Resources
Explore the QuArch initiative through the following links:
The QuArch v1.0 dataset will be hosted at the Hugging Face link above following our crowdsourcing effort. In the meantime, if you are interested in accessing the alpha version of the dataset (QuArch v0.1), please submit a request by email to contact.quarch@gmail.com.
Model Leaderboard
QuArch is used to assess domain knowledge in computer architecture, with the goal of building AI agents that can assist computer architects in reasoning about system problems, trade-offs, and optimizations.
| Rank | Date | Model | Organization | Accuracy (%) |
|---|---|---|---|---|
| 1 | Oct 15, 2024 | claude-3.5 | Anthropic | 83.76 |
| 2 | Oct 2, 2024 | gpt-4o | OpenAI | 83.38 |
| 3 | Oct 2, 2024 | llama-3.1-70b | Meta | 78.72 |
| 4 | Oct 15, 2024 | gemini-1.5 | | 78.07 |
| 5 | Oct 1, 2024 | llama-3.1-8b | Meta | 71.73 |
| 6 | Oct 1, 2024 | mistral-7b | Mistral AI | 61.97 |
| 7 | Oct 15, 2024 | gemma-2-27b | | 60.35 |
| 8 | Oct 15, 2024 | llama-3.2-3b | Meta | 59.51 |
| 9 | Oct 1, 2024 | llama-3.2-1b | Meta | 48.77 |
| 10 | Oct 1, 2024 | gemma-2-9b | | 47.93 |
| 11 | Oct 1, 2024 | gemma-2-2b | | 38.62 |
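As a rough illustration of how a leaderboard accuracy figure like those above can be computed, the sketch below scores a model's predicted answers against reference answers. This is a minimal, hypothetical example: the answer format (single-letter multiple-choice labels) and the exact-match scoring rule are assumptions, not the actual QuArch evaluation protocol.

```python
def accuracy(predictions, references):
    """Percentage of predictions that exactly match the reference answers.

    Exact-match scoring is an assumption here; the real QuArch
    evaluation may use a different matching rule.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must be the same length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)

# Toy example with hypothetical multiple-choice answer labels:
preds = ["B", "A", "C", "B"]
refs = ["B", "A", "D", "B"]
print(f"{accuracy(preds, refs):.2f}%")  # 75.00%
```

A real harness would additionally parse the model's free-form output into one of the candidate answer labels before scoring.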