Source files
Benchmarks are derived exclusively from Machine-Readable Files (MRFs) published by health insurers under 45 CFR § 147.210 — the CMS Transparency in Coverage rule. No estimates, no scraping, no secondary sources.
OHBS-5-UHC-Texas: the Optum Health Behavioral Services professional network.
The file is 5.9 MB compressed, fully streaming-parseable, and contains CPT codes directly.
UHC's general commercial files, by contrast, begin with facility revenue codes, so reaching
the CPT rows means reading past gigabytes of irrelevant data.
Pipeline steps
The pipeline runs in five discrete stages. Each produces a versioned artifact. No stage mutates its inputs.
Fetch: Download and parse MRF
The UHC table-of-contents (85,321 blobs) is fetched and the OHBS-5 download URL extracted.
The MRF is streamed with ijson using the Schema 2.0 provider reference
structure: NPIs live at
provider_references → provider_groups → npi (not directly at
provider_references → npi). Output: Parquet of NPI×rate rows for CPT 90837.
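The nesting described above can be illustrated with a minimal sketch. It uses the standard-library `json` module on a tiny in-memory sample rather than streaming with ijson, and the field names (`in_network`, `negotiated_rates`, `negotiated_prices`, `provider_group_id`) are assumed from the public Transparency in Coverage in-network schema. The function name and the sample values are hypothetical.

```python
import json

def npi_rate_rows(mrf, billing_code="90837"):
    """Yield (npi, rate, negotiated_type, expiration_date) for one billing code."""
    # NPIs sit under provider_references -> provider_groups -> npi,
    # so collect them per provider_group_id first.
    npis_by_ref = {}
    for ref in mrf.get("provider_references", []):
        npis = []
        for group in ref.get("provider_groups", []):
            npis.extend(group.get("npi", []))
        npis_by_ref[ref["provider_group_id"]] = npis

    for item in mrf.get("in_network", []):
        if item.get("billing_code") != billing_code:
            continue
        for negotiated in item.get("negotiated_rates", []):
            for ref_id in negotiated.get("provider_references", []):
                for npi in npis_by_ref.get(ref_id, []):
                    for price in negotiated.get("negotiated_prices", []):
                        yield (
                            npi,
                            price["negotiated_rate"],
                            price["negotiated_type"],
                            price.get("expiration_date"),
                        )

# Hypothetical two-provider sample, not real MRF content.
SAMPLE = json.loads("""
{
  "provider_references": [
    {"provider_group_id": 1,
     "provider_groups": [{"npi": [1111111111, 2222222222]}]}
  ],
  "in_network": [
    {"billing_code": "90837",
     "negotiated_rates": [
       {"provider_references": [1],
        "negotiated_prices": [
          {"negotiated_rate": 110.30,
           "negotiated_type": "negotiated",
           "expiration_date": "9999-12-31"}]}]},
    {"billing_code": "99213", "negotiated_rates": []}
  ]
}
""")

rows = list(npi_rate_rows(SAMPLE))
```

Reading `npi` directly off each `provider_references` entry would find nothing in this layout, which is why the lookup table is built from `provider_groups` first.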
Registry: Build Texas MH provider list
NPPES bulk CSV is filtered to Texas individual providers with MH taxonomy codes. Each NPI is assigned one credential bucket using T5 precedence rules. Output: Parquet with NPI, state, taxonomy, bucket.
Canonical: Join and deduplicate
Rate Parquet joined to provider registry on NPI (inner join — only providers with confirmed Texas MH taxonomy are kept). T3 rate-type filter applied. T4 dedup key applied. Output: canonical rates fact table.
Score: Compute cell statistics
Grouped by (payer, billing code, credential bucket). Computes P10–P90, IQR, IQR/median ratio, T1 heterogeneity check, T5 subgroup comparison. T6 confidence label assigned. Output: cell statistics Parquet + QA sheet.
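As a rough illustration of the grouping and dispersion statistics, the sketch below uses only the standard library. The percentile interpolation method is not stated in this document, so `method="inclusive"` is an assumption, as are the function names and the sample rates.

```python
import statistics
from collections import defaultdict

def cell_stats(rates):
    """Percentiles, IQR and IQR/median for one cell (needs at least 2 rates)."""
    # Assumption: linear interpolation with the "inclusive" method.
    deciles = statistics.quantiles(rates, n=10, method="inclusive")
    p25, p50, p75 = statistics.quantiles(rates, n=4, method="inclusive")
    iqr = p75 - p25
    return {
        "n": len(rates),
        "p10": deciles[0],
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "p90": deciles[8],
        "iqr": iqr,
        "iqr_over_median": iqr / p50 if p50 else None,
    }

def score(rows):
    """Group (payer, billing_code, bucket, rate) rows into cells and score each."""
    cells = defaultdict(list)
    for payer, code, bucket, rate in rows:
        cells[(payer, code, bucket)].append(rate)
    return {key: cell_stats(rates) for key, rates in cells.items()}

# Hypothetical rates for a single cell.
demo_rows = [("UHC", "90837", "lcsw", r) for r in (100, 105, 110, 115, 120)]
demo = score(demo_rows)[("UHC", "90837", "lcsw")]
```

The IQR/median ratio is unitless, so cells with very different price levels can be compared on the same dispersion scale.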
Publish: Build frozen payload and render PDF
For each Moderate+ confidence cell, a frozen JSON payload is constructed with all
statistics, metadata, and methodology fields. A SHA-256 hash is computed over the
payload body and stamped as source_snapshot_id. WeasyPrint renders the
PDF from a Jinja2 template; the output PDF hash is written back to the payload.
The same payload always produces the same PDF.
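One way to make a payload hash reproducible is to hash a canonical serialization of the body. The sketch below assumes sorted keys and compact separators. This document does not specify the serialization actually used, so treat the function name and the canonicalization choice as illustrative only.

```python
import hashlib
import json

def snapshot_id(payload_body):
    """SHA-256 hex digest of a canonical JSON encoding of the payload body."""
    # Assumption: sorted keys and compact separators define the canonical form.
    canonical = json.dumps(
        payload_body,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

# Key order does not affect the hash, but any value change does.
id_a = snapshot_id({"n": 6569, "p50": 110.30})
id_b = snapshot_id({"p50": 110.30, "n": 6569})
id_c = snapshot_id({"n": 6570, "p50": 110.30})
```

Without a fixed key order and separator style, two logically identical payloads could serialize differently and produce different hashes.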
T1 — Pooling rule
T2 — File inclusion
Files matching medicare|medicaid|chip|dental|vision|allowed.amount are excluded.
Only in-network rate files for commercial products are included.
Allowed-amount files report actual paid amounts, not contracted rates — a different
statistic that would contaminate the benchmark.
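The T2 pattern reads like a regular-expression alternation. A minimal sketch of such a filter follows, assuming the pattern is applied case-insensitively to a file name or description. The matching target, the function name, and the sample file names are all hypothetical.

```python
import re

# Pattern copied from T2; "." in "allowed.amount" matches any single character.
EXCLUDE = re.compile(
    r"medicare|medicaid|chip|dental|vision|allowed.amount",
    re.IGNORECASE,
)

def is_included(file_label):
    """Return True when the label matches none of the excluded terms."""
    return EXCLUDE.search(file_label) is None

# Hypothetical file labels for illustration.
labels = [
    "2026-01_OHBS-5_in-network-rates.json.gz",
    "2026-01_Medicare-Advantage_in-network-rates.json.gz",
    "2026-01_allowed-amounts.json.gz",
]
kept = [label for label in labels if is_included(label)]
```

A plain substring alternation like this can over-match (for example, any label containing "chip"), so a production filter would likely need word boundaries or a manual review list.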
T3 — Rate types included
Included negotiated_type values: negotiated and fee schedule only.
Rates of type derived are excluded.
In the current data, all rows are of the negotiated type. No fee schedule or derived
rows are present. This is disclosed in each report's rate-type mix field.
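The rate-type filter and the rate-type mix can be sketched together. The row layout (a dict with a `negotiated_type` key) and the function name are assumptions for illustration.

```python
from collections import Counter

INCLUDED_TYPES = {"negotiated", "fee schedule"}

def filter_rate_types(rows):
    """Keep rows with an included negotiated_type and report the resulting mix."""
    kept = [row for row in rows if row["negotiated_type"] in INCLUDED_TYPES]
    mix = dict(Counter(row["negotiated_type"] for row in kept))
    return kept, mix

# Hypothetical rows for illustration.
sample_rows = [
    {"npi": 1111111111, "negotiated_type": "negotiated", "rate": 110.30},
    {"npi": 2222222222, "negotiated_type": "derived", "rate": 95.00},
    {"npi": 3333333333, "negotiated_type": "negotiated", "rate": 120.00},
]
kept_rows, rate_type_mix = filter_rate_types(sample_rows)
```

Reporting the mix alongside the filtered rows makes it visible when a cell consists entirely of one rate type.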
For the buyer: every rate in this benchmark reflects a bilateral contract between UHC and an individual provider — not a computed estimate or percentage-of-allowed formula. When the median is $110.30, that is the median of 6,569 actual contracted prices.
T4 — Dedup key
Rows are deduplicated on (payer_brand, network_id, npi, billing_code, negotiated_type,
rate, expiration_date). One row is kept per unique combination.
The same NPI can appear under multiple provider_group_id entries. Without dedup, the same $110 rate for a given
NPI would appear multiple times, over-representing providers with complex plan
structures. The dedup key preserves all meaningfully distinct rate entries (e.g.
different expiration dates or different plan products) while removing exact duplicates.
The result is one effective rate per NPI in the current cohort.
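A minimal sketch of the T4 dedup follows, assuming rows are dicts that carry the key fields plus a `provider_group_id`. The function name and sample values are hypothetical.

```python
DEDUP_KEY = (
    "payer_brand", "network_id", "npi", "billing_code",
    "negotiated_type", "rate", "expiration_date",
)

def dedup(rows):
    """Keep the first row seen for each unique dedup-key combination."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[field] for field in DEDUP_KEY)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Hypothetical rows: the first two differ only by provider_group_id.
base = {
    "payer_brand": "UHC", "network_id": "OHBS-5", "npi": 1111111111,
    "billing_code": "90837", "negotiated_type": "negotiated",
    "rate": 110.00, "expiration_date": "9999-12-31",
}
rows_in = [
    {**base, "provider_group_id": 1},
    {**base, "provider_group_id": 2},
    {**base, "provider_group_id": 1, "expiration_date": "2026-12-31"},
]
rows_out = dedup(rows_in)
```

Because `provider_group_id` is not part of the key, the repeated listing collapses to one row, while the row with a different expiration date survives.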
T5 — Credential bucket precedence
psychologist > lcsw > lmft > lpc_lmhc.
One provider, one bucket.
Psychologist: 103TC0700X, 103TC2200X, 103TP2700X, 103TP0016X, 103T00000X, 103TF0000X, 103TH0004X, 103TM1800X
LCSW: 1041C0700X
LMFT: 106H00000X
LPC/LMHC: 101YM0800X, 101YP1400M
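The precedence rule can be sketched as a first-match lookup over the bucket order. The taxonomy codes are copied verbatim from the list above. The function name and the behaviour for providers with no matching code (returning `None`) are assumptions.

```python
PRECEDENCE = ["psychologist", "lcsw", "lmft", "lpc_lmhc"]

BUCKET_CODES = {
    "psychologist": {
        "103TC0700X", "103TC2200X", "103TP2700X", "103TP0016X",
        "103T00000X", "103TF0000X", "103TH0004X", "103TM1800X",
    },
    "lcsw": {"1041C0700X"},
    "lmft": {"106H00000X"},
    "lpc_lmhc": {"101YM0800X", "101YP1400M"},
}

def assign_bucket(taxonomy_codes):
    """Return the highest-precedence bucket matching any of the provider's codes."""
    codes = set(taxonomy_codes)
    for bucket in PRECEDENCE:
        if codes & BUCKET_CODES[bucket]:
            return bucket
    return None  # assumption: no matching mental-health taxonomy

# A provider listing both an LCSW and a psychologist code lands in one bucket.
dual = assign_bucket(["1041C0700X", "103T00000X"])
```

Walking the buckets in precedence order guarantees each NPI receives exactly one bucket even when NPPES lists several taxonomies.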
T6 — Confidence labels
| Tier | n requirement | Dispersion requirement | Interpretation |
|---|---|---|---|
| High | ≥ 100 | IQR/median ≤ 0.50 | Large, coherent cohort. Statistic is stable and representative. |
| Moderate | ≥ 30 | IQR/median ≤ 0.80 | Sufficient for directional benchmark. Report with appropriate caveats. |
| Sparse | ≥ 10 | Any | Small cohort. Use for orientation only; do not cite in negotiations. |
| Suppressed | < 10 | — | Too few providers. Not published. Would identify individual rates. |
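
The tier table can be read as a cascade of checks. The sketch below assumes that a cell failing a tier's dispersion requirement falls through to the next tier it qualifies for. That fall-through behaviour and the function name are assumptions, not something the table states.

```python
def confidence_label(n, iqr_over_median):
    """Map cohort size and IQR/median to a T6 confidence tier."""
    if n < 10:
        return "Suppressed"
    if n >= 100 and iqr_over_median <= 0.50:
        return "High"
    if n >= 30 and iqr_over_median <= 0.80:
        return "Moderate"
    # Assumption: any cell with n >= 10 that misses the tiers above is Sparse.
    return "Sparse"

# The two Texas cells reported in this document.
masters = confidence_label(6569, 0.16)
psychologist = confidence_label(1274, 0.00)
```

Under this reading, a large but highly dispersed cell (for example n = 200 with IQR/median of 0.90) would be labelled Sparse rather than High.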
T7 — Release ID and versioning
Release ID format: {STATE}-{CPT}-{PAYER}-{BUCKET}-{YEAR}Q{QUARTER}.
Example: TX-90837-UHC-MASTERS-2026Q1.
The SHA-256 hash of the payload body is stamped as source_snapshot_id. The rendered PDF
hash is separately computed and stored back in the payload.
This means: (1) the same input always produces the same PDF,
(2) any change to methodology or data produces a different hash,
(3) a buyer can verify their PDF against the published hash.
When source data is updated (quarterly), a new release ID is issued.
Old reports retain their original release IDs and remain valid as historical records.
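The release ID format in T7 can be assembled with a small helper. The function name and the validation rules (quarter restricted to 1 through 4, components upper-cased) are assumptions.

```python
def release_id(state, cpt, payer, bucket, year, quarter):
    """Build a {STATE}-{CPT}-{PAYER}-{BUCKET}-{YEAR}Q{QUARTER} release ID."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"quarter must be 1-4, got {quarter!r}")
    return f"{state.upper()}-{cpt}-{payer.upper()}-{bucket.upper()}-{year}Q{quarter}"

example_id = release_id("TX", "90837", "UHC", "masters", 2026, 1)
next_quarter_id = release_id("TX", "90837", "UHC", "masters", 2026, 2)
```

A quarterly data refresh changes only the trailing `{YEAR}Q{QUARTER}` component, so earlier release IDs stay distinct and can remain valid as historical records.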
Confidence and publishability
A cell must reach Moderate confidence before it is offered for sale. Cells below Moderate are suppressed entirely — we do not publish statistics we cannot stand behind.
UHC CPT 90837 · Master's-level (TX): n=6,569, IQR/p50=0.16 → High
UHC CPT 90837 · Psychologist (TX): n=1,274, IQR/p50=0.00 → High
Reproducibility
Every report PDF can be traced back to its exact source data and pipeline state.
Frozen payload guarantee
When you purchase a report, the PDF you receive corresponds to a specific frozen JSON payload.
The payload body SHA-256 (source_snapshot_id) and the PDF SHA-256
(pdf_hash) are both stamped in the report's provenance block.
If you receive the same release ID from two sources, you can verify they are identical
by comparing the hashes. Any difference in methodology or data will produce a different
source_snapshot_id.
What this data is not
These benchmarks are not predictions of what any payer will offer you. They are not guarantees that the median rate is attainable, and they are not legal advice.
They are an empirical snapshot of what one payer was contracted to pay a specific cohort of Texas providers as of a specific date, derived from that payer's own mandated public filings. The median may be above or below Medicare. Your rate may be above or below the median. Both are factual findings, not endorsements.
Individual contracted rates depend on your specific agreement with the payer. The benchmark shows you what the market looks like. If the data surprises you, that is the data working as intended.
Legal basis
Authorization
45 CFR § 147.210 (Transparency in Coverage rule) requires health insurers to publish in-network negotiated rate MRFs and explicitly authorizes third-party access, aggregation, and redistribution of the data. CMS has confirmed that analysis tools and information products derived from MRF data are an intended use of the regulation.
What is and is not published
RateScope publishes aggregate cohort statistics only (percentiles, IQR, provider counts). Individual NPI-level rates are not published, sold, or disclosed. No provider is identifiable from the published statistics. Suppression (n < 10) prevents publication of statistics that would allow reverse-engineering of individual rates.
No safe harbor
The DOJ/FTC 1996 healthcare safe harbor (which previously protected certain benchmark activities) was rescinded in February 2023. RateScope's legal posture relies on the 45 CFR § 147.210 authorization defense, not the safe harbor.
Disclaimer
Benchmark statistics are derived from publicly available data and are provided for informational purposes only. This is not legal, financial, or contractual advice. Individual contracted rates depend on specific agreements between providers and insurers. RateScope makes no warranty that published statistics reflect any individual provider's current contracted rate.