
What Is CMS Public Data?

Quick Answer

CMS (Centers for Medicare & Medicaid Services) public data is the collection of datasets the federal government publishes about the Medicare program, available for free at data.cms.gov. The most useful dataset for provider analysis is the Medicare Physician & Other Practitioners file, which lists the CPT/HCPCS codes billed by each Medicare fee-for-service provider, with service counts, beneficiary counts, submitted charges, allowed amounts, and actual payments. Provider-code combinations with fewer than 11 beneficiaries are withheld for privacy. The dataset covers 1.2 million+ providers and over 10 million billing records annually. Additional CMS public datasets include hospital utilization data, prescription drug spending, quality measures (MIPS scores), and geographic utilization reports. NPIxray uses the Medicare Physician & Other Practitioners dataset as its primary data source, processing 1.175M provider records and 8.15M billing records to power its free revenue analysis tool. Source: NPIxray analysis of 1.175M Medicare providers and 8.15M billing records.

1.2M+ providers in the CMS public dataset
10M+ billing records published annually
Data includes every CPT code, volume, and payment
Available free at data.cms.gov in multiple formats

Key CMS Public Datasets

CMS publishes dozens of datasets, but the most relevant for provider analysis are:

Medicare Physician & Other Practitioners - by Provider and Service: the core dataset, showing every code each provider bills, with volumes and payments. This is what NPIxray uses.
Medicare Physician & Other Practitioners - by Provider: aggregated summary per provider (total services, total payments, beneficiary count) without code-level detail.
Medicare Physician & Other Practitioners - by Geography and Service: aggregated by state and code, useful for geographic benchmarking.
Hospital Compare: quality and safety data for hospitals.
Part D Prescriber Data: prescription drug prescribing patterns by provider.

All datasets are available at data.cms.gov in multiple formats (CSV, API, bulk download) and are updated periodically — typically annually for the physician billing datasets.

What Is in the Provider-Service Dataset?

The Medicare Physician & Other Practitioners - by Provider and Service file is the most granular publicly available Medicare billing dataset. Each row represents one provider billing one specific HCPCS/CPT code in one place of service. Columns include:

Rndrng_NPI: provider NPI number
Rndrng_Prvdr_Last_Org_Name: provider last name or organization name
Rndrng_Prvdr_First_Name: first name
Rndrng_Prvdr_Crdntls: credentials such as MD, DO, NP
Rndrng_Prvdr_Gndr: gender
Rndrng_Prvdr_St1, Rndrng_Prvdr_St2, Rndrng_Prvdr_City, Rndrng_Prvdr_State_Abrvtn, Rndrng_Prvdr_Zip5: practice address
Rndrng_Prvdr_Type: specialty
HCPCS_Cd: the CPT/HCPCS code billed
HCPCS_Desc: code description
Place_Of_Srvc: facility (F) vs. office or other non-facility (O)
Tot_Benes: unique beneficiaries
Tot_Srvcs: total service count
Avg_Sbmtd_Chrg: average submitted charge per service
Avg_Mdcr_Alowd_Amt: average Medicare allowed amount per service
Avg_Mdcr_Pymt_Amt: average Medicare payment per service

Charges and amounts in this file are per-service averages. Totals such as Tot_Sbmtd_Chrg and Tot_Mdcr_Alowd_Amt appear in the by Provider summary file, and can be approximated here by multiplying an average by Tot_Srvcs.

Records are suppressed (not included) when a provider billed a specific code for fewer than 11 beneficiaries, protecting patient privacy for rare procedures.
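To make the row layout concrete, here is a minimal sketch that parses a few made-up rows and estimates each provider's total Medicare payment as Avg_Mdcr_Pymt_Amt times Tot_Srvcs, summed over codes. The sample values and the helper name estimated_payments are hypothetical. Because suppressed rows are absent from the file, totals built this way will understate a provider's actual Medicare payments.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration; real files carry many more columns.
SAMPLE = """Rndrng_NPI,HCPCS_Cd,Tot_Benes,Tot_Srvcs,Avg_Mdcr_Pymt_Amt
1000000001,99213,120,300,60.00
1000000001,99214,80,150,90.00
1000000002,99213,40,50,58.00
"""

def estimated_payments(csv_text):
    """Sum Tot_Srvcs * Avg_Mdcr_Pymt_Amt per NPI (hypothetical helper)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        services = float(row["Tot_Srvcs"])
        avg_payment = float(row["Avg_Mdcr_Pymt_Amt"])
        totals[row["Rndrng_NPI"]] += services * avg_payment
    return dict(totals)

payments = estimated_payments(SAMPLE)
for npi, total in sorted(payments.items()):
    print(npi, round(total, 2))
```

NPIs are kept as strings rather than integers so they can be joined to other files without formatting surprises.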

How NPIxray Uses CMS Data

NPIxray processes the raw CMS dataset and transforms it into actionable revenue intelligence. For each provider, NPIxray calculates:

E&M code distribution (percentage of visits at each code level) compared to specialty benchmarks
Care management program indicators (presence of CCM, RPM, BHI billing codes)
AWV completion rate (AWV codes billed vs. total Medicare beneficiary count)
Total Medicare revenue versus specialty and geographic peers
Per-patient revenue versus specialty median
A gap analysis estimating missed revenue from undercoding, missing care management programs, and low AWV completion
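The first three per-provider metrics described above could be computed along the following lines. This is a sketch under stated assumptions, not NPIxray's actual implementation: the code sets are illustrative choices (established-patient office visits 99211-99215, CCM 99490, RPM 99454 and 99457, BHI 99484, AWV G0438 and G0439), and profile_provider is a hypothetical name. The total beneficiary count is passed in separately because per-code Tot_Benes values cannot be summed without double-counting patients.

```python
# Illustrative code sets; assumptions, not NPIxray's actual lists.
EM_CODES = ["99211", "99212", "99213", "99214", "99215"]
PROGRAM_CODES = {
    "CCM": {"99490"},
    "RPM": {"99454", "99457"},
    "BHI": {"99484"},
}
AWV_CODES = {"G0438", "G0439"}

def profile_provider(services_by_code, total_benes):
    """services_by_code maps HCPCS code -> Tot_Srvcs for one provider."""
    em_total = sum(services_by_code.get(code, 0) for code in EM_CODES)
    # Share of E&M visits billed at each level (0.0 when no E&M billed).
    em_share = {
        code: (services_by_code.get(code, 0) / em_total if em_total else 0.0)
        for code in EM_CODES
    }
    # A program counts as present if any of its codes appears at all.
    programs = {
        name: any(code in services_by_code for code in codes)
        for name, codes in PROGRAM_CODES.items()
    }
    awv_services = sum(services_by_code.get(code, 0) for code in AWV_CODES)
    awv_rate = awv_services / total_benes if total_benes else 0.0
    return {"em_share": em_share, "programs": programs, "awv_rate": awv_rate}

services = {"99213": 300, "99214": 150, "99215": 50, "G0439": 40, "99490": 25}
profile = profile_provider(services, total_benes=400)
print(profile)
```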

This analysis runs instantly when you enter any NPI number at npixray.com. The underlying calculations use specialty-specific benchmark tables derived from the full 1.175M-provider dataset, so each provider is compared against peers in the same specialty.
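One plausible way to build a specialty benchmark is to take the median per-patient revenue within each specialty and measure each provider against it. The sketch below assumes that approach; the record keys (specialty, payment, benes), the function names, and the sample figures are all hypothetical and are not taken from NPIxray.

```python
from collections import defaultdict
from statistics import median

def specialty_medians(providers):
    """Median payment per beneficiary for each specialty."""
    by_specialty = defaultdict(list)
    for p in providers:
        if p["benes"] > 0:  # skip providers with no reported beneficiaries
            by_specialty[p["specialty"]].append(p["payment"] / p["benes"])
    return {name: median(values) for name, values in by_specialty.items()}

def gap_vs_median(provider, medians):
    """Positive result: provider earns less per patient than the median."""
    per_patient = provider["payment"] / provider["benes"]
    return medians[provider["specialty"]] - per_patient

# Made-up providers for illustration.
providers = [
    {"specialty": "Family Practice", "payment": 30000.0, "benes": 300},
    {"specialty": "Family Practice", "payment": 45000.0, "benes": 300},
    {"specialty": "Family Practice", "payment": 80000.0, "benes": 400},
    {"specialty": "Cardiology", "payment": 90000.0, "benes": 300},
]
medians = specialty_medians(providers)
gap = gap_vs_median(providers[0], medians)
print(medians, gap)
```

A median is used rather than a mean so that a handful of very high-volume providers does not pull the benchmark upward.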

Accessing CMS Data Directly

For analysts and researchers who want to work with raw CMS data, several access methods are available:

Bulk download: the full dataset is available as delimited text files (2GB+) from data.cms.gov. These can be loaded into databases, spreadsheets (with limitations due to size), or statistical software.
Socrata API: data.cms.gov supports the Socrata Open Data API (SODA), allowing programmatic queries with filtering, pagination, and aggregation. The API endpoint supports SQL-like queries using SoQL syntax.
CMS Data Explorer: a web-based interface on data.cms.gov for simple queries and visualizations directly in the browser.
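A filtered, paginated query of the kind described above might be assembled as follows, assuming the endpoint accepts the standard SODA parameters ($select, $where, $order, $limit, $offset). The base URL is a placeholder, not a real CMS endpoint, and build_query_url is a hypothetical helper; confirm the actual endpoint and supported query syntax in the dataset's API documentation on data.cms.gov before relying on this.

```python
from urllib.parse import urlencode

# Placeholder endpoint; substitute the URL documented for the dataset.
BASE_URL = "https://example.invalid/resource/DATASET_ID.json"

def build_query_url(npi, limit=1000, offset=0):
    """Build a SoQL-style query URL for one provider's rows."""
    # NPIs are 10-digit numbers; rejecting anything else also keeps
    # untrusted input out of the $where clause.
    if not (npi.isdigit() and len(npi) == 10):
        raise ValueError("NPI must be a 10-digit string")
    params = {
        "$select": "HCPCS_Cd, Tot_Srvcs, Avg_Mdcr_Pymt_Amt",
        "$where": f"Rndrng_NPI = '{npi}'",
        "$order": "HCPCS_Cd",   # stable ordering so pages do not overlap
        "$limit": limit,
        "$offset": offset,
    }
    return BASE_URL + "?" + urlencode(params)

first_page = build_query_url("1000000001")
second_page = build_query_url("1000000001", offset=1000)
print(first_page)
```

To page through results, increase offset by limit until a request returns fewer rows than limit.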

For most users, NPIxray provides the insights they need without the complexity of working with raw data. For researchers conducting large-scale analyses, the bulk download combined with a database (PostgreSQL, BigQuery) is the most efficient approach.
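As a small-scale illustration of the bulk-download-plus-database workflow, the sketch below loads rows into SQLite and runs an aggregate query. SQLite stands in here for PostgreSQL or BigQuery only because it ships with Python; the table name, the column subset, and the sample rows are assumptions made for the example.

```python
import csv
import io
import sqlite3

# Made-up sample standing in for the downloaded file.
SAMPLE = """Rndrng_NPI,HCPCS_Cd,Tot_Benes,Tot_Srvcs,Avg_Mdcr_Pymt_Amt
1000000001,99213,120,300,60.00
1000000001,99214,80,150,90.00
1000000002,99213,40,50,58.00
"""

def load_rows(conn, csv_file):
    """Stream rows from an open CSV file into a provider_service table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS provider_service ("
        "npi TEXT, hcpcs_cd TEXT, tot_benes INTEGER, "
        "tot_srvcs REAL, avg_pymt REAL)"
    )
    reader = csv.DictReader(csv_file)
    # A generator keeps memory use flat even for multi-gigabyte files.
    rows = (
        (r["Rndrng_NPI"], r["HCPCS_Cd"], int(r["Tot_Benes"]),
         float(r["Tot_Srvcs"]), float(r["Avg_Mdcr_Pymt_Amt"]))
        for r in reader
    )
    conn.executemany("INSERT INTO provider_service VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load_rows(conn, io.StringIO(SAMPLE))
top_codes = conn.execute(
    "SELECT hcpcs_cd, SUM(tot_srvcs) AS n FROM provider_service "
    "GROUP BY hcpcs_cd ORDER BY n DESC"
).fetchall()
print(top_codes)
```

With a real download, pass an open file handle instead of the in-memory sample, and add an index on npi before running per-provider queries.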

Data Limitations and Considerations

CMS public data has several important limitations:

Medicare only: the data covers Medicare fee-for-service billing only. It excludes Medicare Advantage, Medicaid, private insurance, and uninsured patients. For providers with a mixed payer base, Medicare represents approximately 40-60% of their total billing activity.
Time lag: data is typically released 1-2 years after the service year. The most recent available dataset in 2026 covers 2024 services.
Suppression: records with fewer than 11 beneficiaries per code are suppressed, which can cause underreporting for providers who bill rare codes.
Allowed amount vs. payment: the dataset shows both allowed amounts and actual payments, which differ due to patient cost-sharing and sequestration.
Facility vs. professional: the dataset separates facility and non-facility billing, which can create confusion if not properly aggregated.
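The last two limitations can be handled together when preparing the data. The sketch below merges a provider's facility (F) and non-facility (O) rows for the same code using service-weighted totals, then reports what share of the allowed amount Medicare actually paid. It assumes a per-service allowed-amount column named Avg_Mdcr_Alowd_Amt (check the CMS data dictionary for the exact name), and the function name and sample values are hypothetical.

```python
from collections import defaultdict

def combine_place_of_service(rows):
    """Merge F and O rows per (NPI, code) using service-weighted totals."""
    acc = defaultdict(lambda: {"srvcs": 0.0, "allowed": 0.0, "paid": 0.0})
    for r in rows:
        a = acc[(r["Rndrng_NPI"], r["HCPCS_Cd"])]
        a["srvcs"] += r["Tot_Srvcs"]
        # Convert per-service averages to totals before adding them up;
        # averaging the averages directly would ignore volume.
        a["allowed"] += r["Tot_Srvcs"] * r["Avg_Mdcr_Alowd_Amt"]
        a["paid"] += r["Tot_Srvcs"] * r["Avg_Mdcr_Pymt_Amt"]
    # Tot_Benes is deliberately not summed: one patient can appear in both
    # the facility and the non-facility row.
    return {
        key: {
            "Tot_Srvcs": a["srvcs"],
            "allowed": a["allowed"],
            "paid": a["paid"],
            "paid_share": a["paid"] / a["allowed"] if a["allowed"] else 0.0,
        }
        for key, a in acc.items()
    }

# Made-up rows for one provider and code, split by place of service.
rows = [
    {"Rndrng_NPI": "1000000001", "HCPCS_Cd": "99213", "Place_Of_Srvc": "O",
     "Tot_Srvcs": 200, "Avg_Mdcr_Alowd_Amt": 80.0, "Avg_Mdcr_Pymt_Amt": 60.0},
    {"Rndrng_NPI": "1000000001", "HCPCS_Cd": "99213", "Place_Of_Srvc": "F",
     "Tot_Srvcs": 100, "Avg_Mdcr_Alowd_Amt": 50.0, "Avg_Mdcr_Pymt_Amt": 40.0},
]
merged = combine_place_of_service(rows)
print(merged)
```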

Despite these limitations, CMS public data is the most comprehensive source of provider billing intelligence available anywhere — and it is completely free.

Frequently Asked Questions

Is CMS data really free?

Yes. All CMS public datasets are available at no cost. They are published under open data principles as taxpayer-funded government data. No registration, API key, or payment is required to access the data through data.cms.gov or the Socrata API.

How often is CMS data updated?

The Medicare Physician & Other Practitioners dataset is typically updated annually, with a 1-2 year lag from the service date. CMS usually releases the new dataset in the spring or summer. Other CMS datasets (hospital quality, Part D) have their own update schedules.

Does CMS data include patient information?

No. CMS public data is aggregated at the provider level and contains no patient-identifiable information. Records with fewer than 11 beneficiaries per code are suppressed to prevent any possibility of patient identification. The data is fully HIPAA compliant.

See Your Practice's Specific Numbers

Enter any NPI number to instantly see missed revenue from E&M coding gaps, CCM, RPM, BHI, and AWV programs — based on real CMS Medicare data.
