Three open-source engines come up in every accessibility stack: axe-core, Lighthouse and Pa11y. They're often pitted against each other, wrongly: they don't serve the same purpose. Here's when to use each, and how they complement one another.
axe-core — maximum precision
Maintained by Deque. Of the three, it has the broadest, most up-to-date rule catalogue and the fewest false positives. It's the reference engine for a serious audit and for CI blocking. When you want a reliable verdict, issue by issue, it's the one.
Lighthouse — the quick score
Built into Chrome DevTools. It runs axe-core as its engine, but only on a subset of the rules, and returns a score from 0 to 100. Perfect for tracking a trend or showing a figure to an executive, but not enough to decide on a fix: the score alone lacks the issue-by-issue detail.
Pa11y — the scriptable crawl
An older command-line tool, built to review entire sites in batch. Its rules are less up to date than axe's, but it is handy in a cron job over thousands of pages. You use it for coverage, not for finesse.
The practical verdict
- Serious one-off audit: axe-core (via a dedicated tool or on its own)
- Dashboard to show an executive: Lighthouse
- Full in-house crawl of a 10,000-page site: Pa11y
- CI/CD pipeline blocking at the PR: axe-core CLI
The common ground, and the common limit
All three rest, directly or indirectly, on automatable rules, which cover roughly 30 to 40% of the criteria. None of them judges whether an `alt` text is relevant or how the page really behaves under a screen reader. Stacking the three doesn't cover more criteria: it covers more pages, faster, for the same rule scope.
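A toy illustration of that last point, in Python, with made-up numbers: 100 criteria, of which 35 are treated as automatable, and three hypothetical rule scopes. None of these figures describes the real tools. The only point is that a union of subsets cannot exceed the automatable set.

```python
# Hypothetical numbers, for illustration only: 100 criteria, 35 automatable.
TOTAL_CRITERIA = 100
automatable = set(range(1, 36))

# Made-up rule scopes; each one is a subset of the automatable criteria.
scopes = {
    "axe-core": set(range(1, 36)),
    "Lighthouse": set(range(1, 21)),
    "Pa11y": set(range(5, 31)),
}

# Sanity check on the assumption above: no tool reaches past automatable rules.
assert all(scope <= automatable for scope in scopes.values())


def coverage(tool_names: list[str]) -> float:
    """Share of all criteria touched by at least one of the given tools."""
    covered = set().union(*(scopes[name] for name in tool_names))
    return len(covered) / TOTAL_CRITERIA


print(coverage(["axe-core"]))                         # 0.35
print(coverage(["axe-core", "Lighthouse", "Pa11y"]))  # 0.35
```

With these invented scopes, adding two tools leaves the figure unchanged: what grows when you stack them is the number of pages checked, not the number of criteria.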
Frequently asked questions
Should I use all three?
Rarely. axe-core covers the essentials for the audit and the CI; Lighthouse adds the steering score; Pa11y only makes sense for massive crawls. For most sites, axe-core alone, used well, is enough on the automated side.
Which one should block a CI build?
The axe-core CLI, with its `--exit` flag, which returns a failure exit code when violations are found. It's the most reliable and precise way to stop a regression from reaching production.
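To make the blocking logic concrete, here is a minimal sketch in Python of what "exit on violation" amounts to. It assumes a results object shaped like the one axe-core reports (a `violations` list whose entries carry `id`, `impact` and `nodes`). The function name `gate_exit_code` and the sample data are hypothetical, written by hand for the example, and this is not how the CLI itself is implemented.

```python
from typing import Any


def gate_exit_code(results: dict[str, Any]) -> int:
    """Return 1 if the results object lists any violation, else 0."""
    violations = results.get("violations", [])
    for violation in violations:
        # One line per failing rule, with the number of affected DOM nodes.
        nodes = len(violation.get("nodes", []))
        print(f"{violation['id']} ({violation.get('impact')}): {nodes} node(s)")
    return 1 if violations else 0


# Hand-written samples, not real tool output.
failing = {
    "violations": [
        {"id": "image-alt", "impact": "critical", "nodes": [{}, {}]},
        {"id": "color-contrast", "impact": "serious", "nodes": [{}]},
    ]
}
clean = {"violations": []}

print(gate_exit_code(failing))  # 1
print(gate_exit_code(clean))    # 0
```

In a pipeline, the returned value would be passed to `sys.exit` so that any violation fails the job.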
Do these tools measure RGAA conformance?
No, they test against WCAG. For an RGAA verdict, you need a layer that maps the results to the French criteria and supplements them, which is what a dedicated audit adds on top of the engine.
Do Lighthouse and axe give the same score?
Not exactly. Lighthouse only runs a subset of axe-core's rules and weights its score its own way. A Lighthouse accessibility "100" therefore doesn't mean "zero axe violations": it's a trend indicator, not an audit verdict.
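A small worked example, in Python, of why the two figures diverge. It assumes a score computed as a weighted share of passing rules over a subset. The rule names (`rule-a` to `rule-d`) and the weights are placeholders, not Lighthouse's real audits or weights.

```python
def weighted_score(passed: dict[str, bool], weights: dict[str, int]) -> int:
    """Weighted share of passing rules, scaled to 0-100.

    Only rules listed in `weights` count. Anything else is ignored,
    which is how a subset can hide a violation.
    """
    total = sum(weights.values())
    earned = sum(w for rule, w in weights.items() if passed.get(rule, False))
    return round(100 * earned / total)


# Placeholder rule names and weights, not Lighthouse's real ones.
subset_weights = {"rule-a": 10, "rule-b": 7, "rule-c": 7}

# Hypothetical full-engine run: rule-d fails but sits outside the subset.
full_run = {"rule-a": True, "rule-b": True, "rule-c": True, "rule-d": False}

score = weighted_score(full_run, subset_weights)
violations = [rule for rule, ok in full_run.items() if not ok]
print(score, violations)  # 100 ['rule-d']
```

The score reaches 100 while the full rule set still reports one violation. The weighting also means that two pages with the same number of failures can end up with different scores.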
Can you run Pa11y and axe together?
Yes, and it's a common combination: Pa11y to sweep thousands of URLs in batch, axe-core for a precise verdict on the key pages. They don't compete, they cover two different needs — breadth and depth.
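The breadth-plus-depth split can be sketched as follows, in Python. The two checkers are stubs standing in for real tool invocations (a batch sweep and a precise audit). Their names, their return value (an issue count) and the example URLs are all assumptions made for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical checker type: takes a URL, returns a number of issues found.
Checker = Callable[[str], int]


def sweep_then_dig(
    urls: list[str],
    key_pages: list[str],
    broad_check: Checker,
    deep_check: Checker,
    workers: int = 8,
) -> tuple[dict[str, int], dict[str, int]]:
    """Run the broad check on every URL in parallel, the deep one on key pages."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result.
        broad = dict(zip(urls, pool.map(broad_check, urls)))
    deep = {url: deep_check(url) for url in key_pages}
    return broad, deep


# Stub checkers: no network access, fixed answers.
def fake_broad(url: str) -> int:
    return 0 if url.endswith("/") else 1


def fake_deep(url: str) -> int:
    return 3


urls = [
    "https://example.com/",
    "https://example.com/contact",
    "https://example.com/cart",
]
broad, deep = sweep_then_dig(urls, ["https://example.com/cart"], fake_broad, fake_deep)
print(broad)
print(deep)
```

Every URL gets the cheap pass, and only the pages listed in `key_pages` get the expensive one, which keeps the precise audit affordable on a large site.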
Do these tools detect keyboard problems?
Only very partially. They spot a missing accessible name or a `tabindex` greater than 0, but not whether keyboard navigation actually works end to end. That test stays manual, whatever the engine.
Which one should I keep if I can only keep one?
axe-core. It's the most precise, the most up to date, and it serves both the one-off audit and CI blocking. Lighthouse and Pa11y are complements as needed: steering score, or massive crawl.