RESEARCH BRIEF · WHITEPAPER v2.4 · 38 PAGES · 47 CITATIONS · UPDATED MAY 12, 2026

How we measure task-level AI exposure.

The TaskExposed index is a quantitative score that estimates, for each profession in the U.S. labor force, what fraction of paid time is spent on tasks that current frontier models can plausibly substitute for or assist with.

01 · Definitions

Tasks, not jobs.

We define a task as a discrete, observable unit of paid work, drawn from the O*NET 28.3 occupational taxonomy maintained by the U.S. Department of Labor. A profession is a weighted bag of tasks. Models substitute for or assist with tasks, not whole jobs.

Throughout this document, exposure refers to the share of time-weighted task value that current frontier models can perform at or above human-median quality, holding context, tools, and oversight constant.

Definition

$$E_{\text{profession}} = \sum_{i \in \text{tasks}} w_i \cdot c(t_i, M)$$

where $w_i$ is the time weight (BLS), $c$ is the capability score, and $M$ is the frontier model set.
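
As a minimal sketch of the definition above, the following Python computes a profession's exposure as a time-weighted sum of per-task capability scores. The task names, minutes, and scores are invented for illustration and are not values from the index. Normalizing the time weights so they sum to 1 is an assumption; the definition does not say how the weights are scaled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    minutes: float     # hypothetical time spent on the task (stands in for the BLS time weight)
    capability: float  # c(t_i, M), assumed here to lie in [0, 1]


def profession_exposure(tasks: list[Task]) -> float:
    """Return sum_i w_i * c(t_i, M), with w_i normalized to sum to 1 (assumption)."""
    total_minutes = sum(t.minutes for t in tasks)
    if total_minutes <= 0:
        raise ValueError("tasks must have a positive total time")
    return sum((t.minutes / total_minutes) * t.capability for t in tasks)


# Illustrative profession made of three invented tasks.
example = [
    Task("write status reports", minutes=120, capability=0.9),
    Task("review pull requests", minutes=240, capability=0.5),
    Task("plan architecture", minutes=120, capability=0.25),
]

print(f"{profession_exposure(example):.4f}")
```

With these made-up inputs the weights are 0.25, 0.5, and 0.25, so the result is 0.225 + 0.25 + 0.0625 = 0.5375.
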
02 · Data sources

Four primary inputs.

| Input | Source | Contents | Provider / reference |
|---|---|---|---|
| Task taxonomy | O*NET 28.3 | 923 occupations · 19,260 tasks · 277 work activities | U.S. Dept of Labor |
| Labor market | BLS OEWS 2025 | 154M workers · wages · employment · time-use | Bureau of Labor Statistics |
| Capability mapping | Anthropic 2024 | Task-level model performance benchmarks | Internal & arXiv |
| Exposure research | Eloundou et al. | GPTs are GPTs: exposure framework foundations | arXiv:2303.10130 |

03 · The capability matrix

52 capabilities × 6 frontier models.

Each task is decomposed into one or more capability primitives — atomic skills like “summarize unstructured text” or “debug deterministic code under spec.” We score every capability against six frontier models, refreshed quarterly.

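| Capability | Claude 4.5 | GPT-5 | Gemini 2.5 | Llama 4 | o-Series | Median |
|---|---|---|---|---|---|---|
| Generate boilerplate code | 0.94 | 0.92 | 0.89 | 0.81 | 0.86 | 0.89 |
| Summarize unstructured text | 0.91 | 0.93 | 0.88 | 0.85 | 0.84 | 0.88 |
| Multi-turn empathetic dialog | 0.62 | 0.58 | 0.55 | 0.41 | 0.49 | 0.55 |
| Triage production incident logs | 0.41 | 0.45 | 0.38 | 0.31 | 0.46 | 0.41 |
| Design system architecture | 0.28 | 0.31 | 0.25 | 0.19 | 0.27 | 0.27 |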
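
The Median column can be reproduced from the per-model scores. The sketch below assumes the median is taken over the five model columns shown in this excerpt; it uses Python's standard `statistics.median` and copies the scores from the table above. The dictionary and function names are illustrative.

```python
from statistics import median

# Scores copied from the table above, in column order:
# Claude 4.5, GPT-5, Gemini 2.5, Llama 4, o-Series.
SCORES = {
    "Generate boilerplate code": [0.94, 0.92, 0.89, 0.81, 0.86],
    "Summarize unstructured text": [0.91, 0.93, 0.88, 0.85, 0.84],
    "Multi-turn empathetic dialog": [0.62, 0.58, 0.55, 0.41, 0.49],
    "Triage production incident logs": [0.41, 0.45, 0.38, 0.31, 0.46],
    "Design system architecture": [0.28, 0.31, 0.25, 0.19, 0.27],
}


def capability_medians(scores: dict[str, list[float]]) -> dict[str, float]:
    """Return the median score across models for each capability."""
    return {capability: median(values) for capability, values in scores.items()}


for capability, value in capability_medians(SCORES).items():
    print(f"{capability}: {value:.2f}")
```

For all five rows the computed median matches the published Median column (0.89, 0.88, 0.55, 0.41, 0.27).
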
04 · Scoring & aggregation

Bottom-up, weighted, classed.

Tasks are aggregated bottom-up. Each task is given a capability-weighted exposure score (0–100), then classified into one of three buckets at the canonical thresholds:

| Class | Score | Meaning |
|---|---|---|
| AI-Substitutable | ≥ 75 | Frontier models meet human-median quality without prompted oversight. |
| AI-Assisted | 40 – 74 | Models reliably accelerate but require human review for correctness. |
| Human-Critical | < 40 | Models underperform humans materially; oversight is the work. |

Pipeline: O*NET Tasks (19,260 atomic units) → BLS Time-Use (minutes per task) → Capability Matrix (52 caps × 6 models) → Task Score (per-task 0–100) → Profession Index (time-weighted agg.)
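
The scoring and bucketing step can be sketched in a few lines of Python. Two things here are assumptions rather than the index's documented method: the task score is taken as a weighted mean of capability scores scaled to 0–100, and a fractional score between 74 and 75 is treated as AI-Assisted. The function names and the example weights are hypothetical.

```python
def task_score(capabilities: dict[str, float], weights: dict[str, float]) -> float:
    """Capability-weighted task score on a 0-100 scale (weighting scheme is assumed)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(weights[name] * capabilities[name] for name in weights)
    return 100.0 * weighted / total


def classify(score: float) -> str:
    """Map a 0-100 task score to one of the three classes listed above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 75:
        return "AI-Substitutable"
    if score >= 40:  # assumption: 74 < score < 75 also counts as AI-Assisted
        return "AI-Assisted"
    return "Human-Critical"


# Median capability scores from the capability matrix; the equal weights are invented.
medians = {
    "Summarize unstructured text": 0.88,
    "Triage production incident logs": 0.41,
}
weights = {
    "Summarize unstructured text": 1.0,
    "Triage production incident logs": 1.0,
}

score = task_score(medians, weights)
print(f"{score:.1f}", classify(score))
```

With equal weights, this hypothetical task scores about 64.5 and falls in the AI-Assisted class.
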
05 · Updates & versioning

Re-scored quarterly. Versioned forever.

The capability matrix is re-benchmarked every quarter as new frontier models are released. Every change is committed to a versioned dataset, so you can pin any report to a historical dataset for longitudinal research.

| Version | Date | Change |
|---|---|---|
| v2.4 | May 12, 2026 | Added GPT-5 + Claude 4.5 to capability matrix |
| v2.3 | Feb 2, 2026 | O*NET 28.3 task taxonomy migration |
| v2.2 | Nov 18, 2025 | Added six creative-professional families |
| v2.1 | Aug 24, 2025 | Re-classification thresholds adjusted (+5pp) |
| v2.0 | May 4, 2025 | Major rewrite: capability matrix introduced |
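
For longitudinal work it helps to know which dataset version was current on a given date. The sketch below resolves that from the release dates in the changelog above. It does not describe how the real dataset is stored or pinned; `RELEASES` and `version_as_of` are hypothetical names used only for illustration.

```python
from bisect import bisect_right
from datetime import date

# Release dates taken from the changelog above, oldest first.
RELEASES = [
    (date(2025, 5, 4), "v2.0"),
    (date(2025, 8, 24), "v2.1"),
    (date(2025, 11, 18), "v2.2"),
    (date(2026, 2, 2), "v2.3"),
    (date(2026, 5, 12), "v2.4"),
]


def version_as_of(day: date) -> str:
    """Return the latest listed version released on or before `day`."""
    release_dates = [released for released, _ in RELEASES]
    index = bisect_right(release_dates, day)
    if index == 0:
        raise ValueError(f"no listed version was released on or before {day}")
    return RELEASES[index - 1][1]


print(version_as_of(date(2025, 12, 31)))  # v2.2
```

A report dated on a release day resolves to that day's version, and dates before May 4, 2025 raise an error because the changelog lists nothing earlier.
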
06 · Limitations

Where this score is wrong.

Exposure is not adoption. Substitution is not extinction. Below-average models matter less than the marginal user. This index tells you where the ceiling is, not where the market actually lands. Read the full whitepaper for our priors, our anti-bias adjustments, and the things that genuinely surprised us.

07 · Citations

Primary references.

[1] Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: An early look at the labor market impact potential of large language models. arXiv:2303.10130.
[2] Acemoglu, D., & Restrepo, P. (2022). Tasks, automation, and the rise in US wage inequality. Econometrica, 90(5), 1973–2016.
[3] O*NET Resource Center. (2026). O*NET 28.3 database. U.S. Department of Labor/Employment and Training Administration.
[4] Bureau of Labor Statistics. (2025). Occupational Employment and Wage Statistics (OEWS) 2025. U.S. Department of Labor.
[5] Anthropic. (2024). Task-level model performance benchmarks [Internal research report]. Anthropic, PBC.
[6] Autor, D., Levy, F., & Murnane, R. J. (2003). The skill content of recent technological change: An empirical exploration. Quarterly Journal of Economics, 118(4), 1279–1333.