An expert-curated benchmark that tests whether frontier AI
models can reason about analog circuit design.
| Score | Meaning | Criteria |
|---|---|---|
| 4 | Correct | Correct conclusion and reasoning; topology, device roles, dominant mechanism, trend, and key assumptions are right. |
| 3 | Mostly correct | Main conclusion is right, with a minor omission, imprecision, or modeling flaw that does not change the result. |
| 2 | Partially correct | Identifies some relevant mechanism, but misses an important circuit detail, trend, or design consequence. |
| 1 | Mostly incorrect | Main conclusion is wrong, but the answer contains a small amount of relevant circuit understanding. |
| 0 | Incorrect / unusable | Fundamentally wrong, internally inconsistent, or based on a mistaken topology/device/connection. |
Judges score each answer against a golden solution, prioritizing the quality of the analog-circuit reasoning over surface similarity to the reference wording. See GitHub for full details.
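As a rough illustration of how the 0 to 4 rubric could be handled in code, the sketch below encodes the five levels as an enum and averages a list of judged scores. The names `RubricScore` and `normalized_score` are hypothetical, and scaling the mean to 0-100 is an assumption made for this example; the benchmark's actual aggregation is not described here.

```python
from enum import IntEnum
from statistics import mean


class RubricScore(IntEnum):
    """Rubric levels, named after the 'Meaning' column of the table above."""
    INCORRECT = 0
    MOSTLY_INCORRECT = 1
    PARTIALLY_CORRECT = 2
    MOSTLY_CORRECT = 3
    CORRECT = 4


def normalized_score(scores):
    """Hypothetical aggregate: mean rubric score scaled to the range 0-100.

    Assumption: this is an illustrative choice, not the benchmark's
    documented scoring method.
    """
    if not scores:
        raise ValueError("need at least one judged answer")
    # RubricScore(s) raises ValueError for any value outside 0-4.
    validated = [int(RubricScore(s)) for s in scores]
    return 100.0 * mean(validated) / int(RubricScore.CORRECT)


print(normalized_score([4, 3, 2]))  # 75.0
```

Using an enum keeps out-of-range judge outputs from slipping silently into an average, which matters when scores are parsed from free-form judge responses.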