US Healthcare Interview Questions

TL;DR

40+ US Healthcare interview questions with answers, organized by topic. Covers system fundamentals, the claims process, coding systems, analytics (HEDIS/STAR), EDI transactions, and scenario-based questions for data professionals.

Short on time? Focus on Claims Process, HEDIS/STAR Analytics, and EDI Transactions: these are the topics most likely to come up in healthcare data interviews.
YMYL Disclaimer: This content is for informational purposes only. It is designed for data professionals preparing for healthcare domain interviews — not for medical advice or clinical decision-making.

Healthcare System Fundamentals

Q: What is the US Healthcare system and how does it differ from single-payer systems?

The US uses a multi-payer system — a patchwork of private insurance companies, employer-sponsored plans, and government programs (Medicare, Medicaid) all operating in parallel. Each payer has its own rules, rates, and coverage decisions. In a single-payer system (like Canada or the UK's NHS), one government entity funds healthcare for everyone. The US system results in higher administrative costs but more choice. The US spends roughly $4.5 trillion annually on healthcare — about 17% of GDP — the highest per capita of any country.

Q: What are the main types of health insurance plans (HMO, PPO, EPO)?

HMO (Health Maintenance Organization): Requires a primary care physician (PCP) as a gatekeeper. Need referrals to see specialists. Must stay in-network. Lower premiums but less flexibility.

PPO (Preferred Provider Organization): No PCP required. Can see specialists directly. Can go out-of-network (at higher cost). Higher premiums but more flexibility.

EPO (Exclusive Provider Organization): Hybrid — no referrals needed (like PPO) but must stay in-network (like HMO). No out-of-network coverage except emergencies.

Other types include POS (Point of Service) plans that combine HMO and PPO features, and HDHP (High Deductible Health Plans) paired with HSA savings accounts.

Q: What is the difference between Medicare and Medicaid?

Medicare is a federal program for people 65+, certain disabled individuals, and those with end-stage renal disease. It has 4 parts: Part A (hospital), Part B (outpatient/physician), Part C (Medicare Advantage, run by private plans), Part D (prescription drugs). Part A is funded mainly through payroll taxes; Parts B and D are funded mainly through general federal revenue and beneficiary premiums.

Medicaid is a joint federal-state program for low-income individuals and families. Eligibility and benefits vary by state. Covers about 90 million people. Each state administers its own Medicaid program under federal guidelines.

Key difference: Medicare is based on age (or disability), Medicaid is based on income. Some people qualify for both ("dual-eligible").

Q: What role do PBMs play in healthcare?

Pharmacy Benefit Managers (PBMs) are intermediaries between insurance companies, pharmacies, and drug manufacturers. They manage prescription drug benefits by: (1) negotiating drug prices and rebates with manufacturers, (2) creating and maintaining formularies (lists of covered drugs organized by tier), (3) processing pharmacy claims, and (4) running mail-order pharmacies. The top 3 PBMs (CVS Caremark, Express Scripts, OptumRx) control about 80% of the market. PBMs are controversial because their rebate negotiations lack transparency, and critics argue they don't always pass savings to patients.

Q: What is a self-insured employer plan and how does it differ from fully-insured?

In a fully-insured plan, the employer pays fixed premiums to an insurance company, which assumes all financial risk for employee claims.

In a self-insured (self-funded) plan, the employer pays claims directly out of its own funds. The employer bears the financial risk. They typically hire a TPA (Third-Party Administrator) to handle claims processing and often purchase stop-loss insurance to cap catastrophic claims.

About 65% of covered workers in the US are on self-insured plans. Self-insured plans are regulated under federal ERISA law (not state insurance law), which gives employers more flexibility in plan design. This distinction is important for data professionals because the data flows and reporting requirements differ between the two models.

Q: What is the role of CMS in US Healthcare?

The Centers for Medicare & Medicaid Services (CMS) is the federal agency within HHS that administers Medicare, Medicaid, CHIP, and the ACA Marketplace. CMS is the single largest payer in US healthcare, covering over 150 million Americans. Key responsibilities include: (1) setting reimbursement rates for Medicare, (2) running quality programs such as STAR ratings, which draw on measure sets like NCQA's HEDIS, (3) regulating Medicare Advantage plans, (4) enforcing compliance with healthcare regulations, (5) publishing data sets used widely in analytics (e.g., CMS Provider Utilization data, Hospital Compare). For data professionals, CMS is often the ultimate source of rules, codes, and benchmarks.


Claims Process

Q: Walk through the claims lifecycle from patient encounter to payment.

The claims lifecycle has 7 key steps:

1. Patient encounter: Patient visits a provider (doctor, hospital, lab).
2. Charge capture: Provider documents services and assigns diagnosis (ICD-10) and procedure (CPT/HCPCS) codes.
3. Claim creation: Billing department creates a claim — 837P (professional) or 837I (institutional).
4. Claim submission: Claim is sent electronically through a clearinghouse to the payer.
5. Adjudication: Payer validates the claim against eligibility, benefits, medical policies, and fee schedules. The claim is either paid, denied, or pended for review.
6. Payment (ERA/EOB): Payer sends an 835 (Electronic Remittance Advice) to the provider and an EOB (Explanation of Benefits) to the patient. Payment is issued.
7. Patient responsibility: Provider bills patient for remaining balance (copay, coinsurance, deductible).

The entire cycle typically takes 30–90 days. Each step generates data that feeds into analytics.

Q: What is the difference between a claim rejection and a denial?

A rejection happens before adjudication — the claim has errors that prevent it from being processed (invalid member ID, missing fields, wrong format). Rejected claims are returned to the provider and never enter the payer's system. They can be corrected and resubmitted.

A denial happens during or after adjudication — the claim was processed but the payer determined it shouldn't be paid (service not covered, prior authorization missing, duplicate claim). Denials require a formal appeals process to overturn.

Key metric: Denial rate = denied claims / total claims. Industry average is 5–10%. High denial rates are a major revenue cycle problem.

Q: What is claims adjudication?

Claims adjudication is the process a payer uses to evaluate and decide on a claim. The system runs the claim through a series of automated checks: (1) Eligibility verification — was the member enrolled on the date of service? (2) Benefit verification — is this service covered under their plan? (3) Medical policy rules — does this service meet medical necessity criteria? (4) Duplicate check — has this exact claim been submitted before? (5) Pricing — what's the allowed amount based on the fee schedule or contract? The output is one of three statuses: Paid, Denied, or Pended (held for manual review). Modern systems auto-adjudicate 80–90% of claims.
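
The check sequence above can be sketched as a small rule pipeline. This is an illustration only: the field names on `Claim`, the order of the checks, and the choice to pend (rather than deny) a claim that fails the medical policy check are all assumptions, not any payer's actual logic.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    # Hypothetical flags, assumed to be computed upstream for this sketch.
    member_active: bool     # enrolled on the date of service?
    service_covered: bool   # covered under the member's plan?
    meets_policy: bool      # passes automated medical policy rules?
    is_duplicate: bool      # same claim already on file?
    billed: float           # amount billed by the provider
    allowed: float          # fee schedule or contract amount

def adjudicate(claim: Claim) -> tuple[str, float]:
    """Run the checks in order and return (status, payable amount)."""
    if not claim.member_active:
        return ("Denied", 0.0)
    if not claim.service_covered:
        return ("Denied", 0.0)
    if not claim.meets_policy:
        return ("Pended", 0.0)   # assumption: route to manual review
    if claim.is_duplicate:
        return ("Denied", 0.0)
    # Pricing: pay the lesser of the billed and allowed amounts.
    return ("Paid", min(claim.billed, claim.allowed))

status, amount = adjudicate(Claim(True, True, True, False, billed=150.0, allowed=92.5))
print(status, amount)
```

A real adjudication engine would also record a reason code for each denial or pend, which is what denial analytics are later built on.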

Q: What is a clean claim and what is the industry target for clean claim rates?

A clean claim is a claim that passes all validation checks on first submission — no errors, no missing information, no need for additional documentation. It can be adjudicated without manual intervention.

The industry target is a 95%+ clean claim rate. Best-in-class organizations achieve 98%. Every unclean claim costs an estimated $25–$30 to rework. A hospital processing 100,000 claims/month at 90% clean rate would spend an extra $250K–$300K monthly just on rework. That's why clean claim rate is one of the most watched KPIs in revenue cycle management.
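
The rework arithmetic above is easy to reproduce. A minimal sketch, using only the figures quoted in the paragraph; the function name is made up for illustration.

```python
def monthly_rework_cost(claims_per_month: int, clean_rate: float, cost_per_rework: float) -> float:
    """Estimated monthly cost of reworking claims that were not clean."""
    # Round to whole claims to avoid floating-point noise in (1 - clean_rate).
    unclean_claims = round(claims_per_month * (1 - clean_rate))
    return unclean_claims * cost_per_rework

low = monthly_rework_cost(100_000, 0.90, 25)
high = monthly_rework_cost(100_000, 0.90, 30)
print(low, high)
```

Running the same function with a 0.95 clean rate halves the number of claims needing rework, which is why even small rate improvements are worth tracking.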

Q: What are the three levels of appeals for denied claims?

For Medicare (commercial plans follow a similar pattern):

Level 1 — Internal appeal (Redetermination): Provider submits additional documentation to the same payer. Must be filed within 120 days. The payer's own team reviews the denial.

Level 2 — External review (Reconsideration): If Level 1 fails, an independent review entity (Qualified Independent Contractor for Medicare) reviews the case. A fresh set of eyes outside the payer.

Level 3 — Administrative Law Judge (ALJ) hearing: If Level 2 fails and the amount meets the threshold ($180+ for Medicare in 2024), the provider can request a formal hearing before an ALJ. Beyond this, there are further levels (Medicare Appeals Council, federal court) but they're rarely reached.

Successfully appealing denials is a significant revenue recovery opportunity. Data analysts often track appeal success rates by denial reason code.

Q: What is the difference between a facility claim and a professional claim?

Professional claim (CMS-1500 / 837P): Submitted by individual providers (physicians, therapists, nurse practitioners) for services they personally performed. Uses CPT codes. Identifies the rendering provider by NPI.

Facility claim (UB-04 / 837I): Submitted by institutions (hospitals, skilled nursing facilities, home health agencies) for facility fees — room & board, operating room, supplies, nursing care. Uses revenue codes and may include DRG assignments for inpatient stays.

A single hospital visit often generates both types: the hospital submits a facility claim for the room and equipment, and the physician submits a professional claim for their services. This is why you'll often see two charges for one encounter in claims data.


Coding Systems

Q: What is CPT and give an example code?

CPT (Current Procedural Terminology) is a coding system maintained by the AMA (American Medical Association) that describes medical procedures and services performed by providers. Category I CPT codes are 5-digit numeric codes; Category II and III codes are four digits followed by a letter.

Example: 99213 — Office visit, established patient, low-moderate complexity. This is one of the most commonly billed codes in outpatient care.

CPT is organized into 3 categories: Category I (standard procedures, e.g., 99213), Category II (performance tracking, e.g., 1234F), and Category III (emerging technology, e.g., 0001T). Category I codes are what you'll encounter most in claims data.

Q: What is ICD-10 and how does it differ from ICD-9?

ICD-10 (International Classification of Diseases, 10th Revision) is the diagnosis coding system used to describe why a patient sought care. The US uses ICD-10-CM (Clinical Modification) for diagnoses and ICD-10-PCS (Procedure Coding System) for inpatient procedures.

Key differences from ICD-9:
Code structure: ICD-9 used 3–5 digit numeric codes (~14,000 codes). ICD-10 uses 3–7 character alphanumeric codes (~70,000 codes).
Specificity: ICD-10 is far more granular. ICD-9 had one code for "fracture of femur." ICD-10 distinguishes left vs. right, open vs. closed, initial vs. subsequent encounter.
Example: ICD-9 250.00 (diabetes) became ICD-10 E11.9 (type 2 diabetes without complications).

The US transitioned to ICD-10 on October 1, 2015. If you're working with historical data, you may need to map between the two systems.

Q: What are DRGs and how do they affect hospital reimbursement?

DRGs (Diagnosis Related Groups) are a classification system that groups hospital inpatient stays into roughly 750 categories based on diagnosis, procedures, age, complications, and discharge status. Each DRG has a relative weight that reflects the expected resource intensity.

Reimbursement: Medicare pays hospitals a fixed amount per DRG, regardless of the actual cost of the stay. If the hospital treats the patient for less, they profit. If it costs more, they absorb the loss. This is called the Prospective Payment System (PPS).

Payment = DRG weight × base rate (adjusted for local wages, teaching status, etc.)

This incentivizes efficiency: hospitals want to code accurately (to capture the highest appropriate DRG) and reduce length of stay. For data analysts, DRG analysis is critical for understanding inpatient revenue and case mix.
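
The payment formula can be worked through in a few lines. This is a sketch with made-up numbers: the base rate and DRG weights below are hypothetical, and real payments apply further adjustments (local wages, teaching status, outliers) that are omitted here. The average weight across a set of stays is the usual way to express case mix.

```python
from statistics import mean

def drg_payment(drg_weight: float, base_rate: float) -> float:
    """Payment = DRG relative weight x hospital base rate (adjustments omitted)."""
    return round(drg_weight * base_rate, 2)

BASE_RATE = 6000.0                 # hypothetical hospital base rate
stay_weights = [1.5, 0.5, 1.0]     # hypothetical DRG weights for three stays

payment = drg_payment(stay_weights[0], BASE_RATE)
case_mix_index = mean(stay_weights)   # average DRG weight across stays
print(payment, case_mix_index)
```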

Q: What is an NPI number?

The NPI (National Provider Identifier) is a unique 10-digit identification number assigned to healthcare providers by CMS. It's required for all HIPAA-covered transactions (claims, eligibility checks, referrals).

There are two types: Type 1 (individual providers — doctors, nurses, therapists) and Type 2 (organizational providers — hospitals, clinics, group practices). A single physician may have a Type 1 NPI and be associated with a group's Type 2 NPI.

NPIs are public and searchable in the NPPES (National Plan and Provider Enumeration System) registry. In claims data, the NPI is the primary key for identifying providers. It replaced multiple legacy identifiers (UPIN, Medicare PIN) that were used before 2007.
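
Because the NPI is used as a join key, it is worth validating before loading. The last digit of an NPI is a check digit computed with the Luhn algorithm over the first nine digits prefixed with the constant 80840. The sketch below checks only that the number is well formed; it does not confirm that the NPI is actually registered in NPPES.

```python
def is_valid_npi(npi: str) -> bool:
    """Return True if the 10-digit NPI passes the Luhn check with the 80840 prefix."""
    if len(npi) != 10 or not npi.isdigit():
        return False
    digits = [int(c) for c in "80840" + npi]
    total = 0
    # Walk right to left, doubling every second digit (starting next to the check digit).
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9   # same as summing the two digits of the product
        total += digit
    return total % 10 == 0

print(is_valid_npi("1234567893"))   # commonly used sample NPI
```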

Q: Explain the difference between CPT, ICD-10, HCPCS, and NDC codes.

Each code system answers a different question on a claim:

ICD-10: "What's wrong?" (Diagnosis). Example: E11.9 = Type 2 diabetes. Used to justify medical necessity.

CPT: "What was done?" (Procedure). Example: 99213 = Office visit. Published by AMA. Covers physician services.

HCPCS: "What was used/supplied?" (Healthcare Common Procedure Coding System). Level I is just CPT codes. Level II covers items CPT doesn't: ambulance services, durable medical equipment, prosthetics. Example: E0601 = CPAP device. Alphanumeric codes starting with a letter.

NDC: "Which specific drug?" (National Drug Code). 10-digit, three-segment code identifying the labeler (manufacturer or distributor), product, and package size. Example: 0069-3150-83 (labeler 0069, product 3150, package 83). Used on pharmacy claims.

In claims data, you'll typically see ICD-10 + CPT on medical claims, and NDC codes on pharmacy claims.
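
One practical wrinkle when joining on NDC: the hyphenated 10-digit code comes in three layouts (4-4-2, 5-3-2, 5-4-1), while claims commonly carry an 11-digit 5-4-2 form with a leading zero added to the short segment. A minimal sketch of that normalization, assuming the input still has its hyphens (without them the layout is ambiguous); the function name is illustrative.

```python
def ndc_to_11(ndc: str) -> str:
    """Convert a hyphenated 10-digit NDC to the 11-digit 5-4-2 form (no hyphens)."""
    parts = ndc.split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a hyphenated NDC, got: {ndc}")
    labeler, product, package = parts
    if (len(labeler), len(product), len(package)) not in {(4, 4, 2), (5, 3, 2), (5, 4, 1)}:
        raise ValueError(f"unrecognized NDC layout: {ndc}")
    # Left-pad whichever segment is short so the result is always 5-4-2.
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)

ndc11 = ndc_to_11("0069-3150-83")
print(ndc11)
```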


Healthcare Analytics

Q: What are HEDIS measures and why are they important?

HEDIS (Healthcare Effectiveness Data and Information Set) is a set of standardized performance measures developed by NCQA (National Committee for Quality Assurance). It evaluates health plan performance across 90+ measures in 6 domains: effectiveness of care, access/availability, experience of care, utilization, health plan descriptive information, and measures reported using electronic clinical data.

Examples of HEDIS measures:
Breast Cancer Screening (BCS): % of women 50–74 who had a mammogram in past 2 years
Controlling High Blood Pressure (CBP): % of members with hypertension whose BP is controlled
Comprehensive Diabetes Care (CDC): % of diabetic members with HbA1c testing

HEDIS is important because it directly feeds into CMS STAR ratings for Medicare Advantage plans, which affect bonus payments worth billions of dollars. Poor HEDIS scores = lower STAR rating = less revenue. This makes HEDIS data one of the most analytically important datasets in healthcare.

Q: Explain CMS STAR ratings and how they impact Medicare Advantage plans.

CMS STAR ratings are a 1–5 star quality rating system for Medicare Advantage (MA) and Part D prescription drug plans. Ratings are based on ~40 measures across 5 categories: outcomes, intermediate outcomes, patient experience, access, and process.

Financial impact is massive:
• Plans rated 4+ stars receive 5% quality bonus payments on top of their Medicare benchmarks
• 5-star plans can also enroll members year-round (not just during Annual Enrollment)
• Plans below 3 stars for 3 consecutive years face CMS sanctions and potential termination

For a large MA plan with 500,000+ members, the difference between 3.5 and 4.0 stars can be $50–$100 million per year in bonus payments. This is why health plans invest heavily in STAR ratings improvement and why analytics teams track these measures obsessively.

Q: What is risk adjustment and how do HCC codes work?

Risk adjustment is a methodology that adjusts payments to health plans based on the health status (risk) of their enrolled members. The idea is simple: a plan with sicker members should get paid more than a plan with healthier members.

HCC (Hierarchical Condition Categories) is the risk adjustment model used by CMS for Medicare Advantage. It works like this:
1. ICD-10 diagnosis codes from claims are mapped to HCC categories (~86 categories in the V24 model; counts and numbering differ in the newer V28 model)
2. Each HCC has a coefficient (weight) that adds to the member's risk score
3. Demographic factors (age, sex, Medicaid dual-eligibility) also contribute
4. The total Risk Adjustment Factor (RAF) score determines the payment

Example: A 70-year-old with diabetes (HCC 19) and CHF (HCC 85) has a higher RAF than a healthy 70-year-old, so the plan receives more per month. The average RAF score is normalized to 1.0. A RAF of 1.5 means 50% more payment than average.

Risk adjustment is a major area for analytics: plans need to ensure all diagnoses are properly documented and coded to capture accurate RAF scores. Undercoding means leaving money on the table.
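
The additive structure of the RAF score can be sketched as follows. Every number here is hypothetical: real demographic factors and HCC coefficients come from the CMS model tables for the payment year, and the real model also includes interaction terms and hierarchies that this sketch leaves out.

```python
# Hypothetical values for illustration only (not actual CMS coefficients).
DEMOGRAPHIC_FACTOR = 0.40                            # assumed factor for a 70-year-old
HCC_COEFFICIENTS = {"HCC19": 0.10, "HCC85": 0.33}    # assumed weights

def raf_score(demographic_factor: float, hccs: list[str]) -> float:
    """RAF = demographic factor + sum of coefficients for documented HCCs."""
    return round(demographic_factor + sum(HCC_COEFFICIENTS[h] for h in hccs), 3)

healthy = raf_score(DEMOGRAPHIC_FACTOR, [])
with_conditions = raf_score(DEMOGRAPHIC_FACTOR, ["HCC19", "HCC85"])
print(healthy, with_conditions)
```

If the diabetes code is missing from this year's claims, the member's score falls by that HCC's coefficient, which is the mechanism behind "undercoding" in the paragraph above.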

Q: What is the difference between fee-for-service and value-based care?

Fee-for-Service (FFS): The traditional model. Providers are paid for each service they perform, so more visits, tests, and procedures mean more revenue. This incentivizes volume over quality and is widely cited as one driver of rising US healthcare costs.

Value-Based Care (VBC): The emerging model. Providers are paid based on patient outcomes and quality metrics, not volume. Types include:
Pay-for-performance (P4P): Bonuses for meeting quality targets
Bundled payments: One fixed payment for an entire episode of care (e.g., knee replacement start to finish)
Capitation: Fixed per-member-per-month (PMPM) payment regardless of services used
Shared savings (ACOs): Providers share in the savings if total costs come in under a benchmark

CMS is aggressively pushing the industry toward VBC. By 2030, CMS wants all Medicare beneficiaries in a value-based arrangement. For data professionals, VBC means tracking outcomes, quality measures, and total cost of care — very different from counting claims volume.

Q: What types of analytics are commonly used in healthcare?

Healthcare analytics spans five main areas:

1. Claims analytics: Analyzing medical and pharmacy claims for cost trends, utilization patterns, denial rates, and provider performance. The bread and butter of healthcare data teams.

2. Clinical analytics: Using EHR (Electronic Health Record) data to improve clinical outcomes — readmission prediction, sepsis early warning, medication adherence tracking.

3. Quality analytics: HEDIS measure calculation, STAR ratings tracking, and quality improvement initiatives. Directly tied to revenue for MA plans.

4. Population health analytics: Identifying high-risk patient cohorts, care gap analysis, chronic disease management, and health equity metrics.

5. Financial/actuarial analytics: Medical loss ratio (MLR), PMPM cost trending, reserve estimation, and risk scoring for premium setting.

Most healthcare data professionals work across several of these areas. SQL, Python, and BI tools (Power BI, Tableau) are the core technical skills.

Q: What is population health management?

Population health management (PHM) is the practice of improving health outcomes for a defined group of people by analyzing data across the population to identify risks and target interventions. Instead of treating patients one at a time when they show up sick, PHM proactively manages entire populations.

Key components:
Risk stratification: Segmenting members into risk tiers (healthy, rising risk, high risk, catastrophic) using claims, labs, and social determinants of health (SDoH)
Care gap identification: Finding members overdue for screenings, vaccinations, or chronic disease follow-ups (directly feeds HEDIS)
Care management programs: Assigning high-risk members to nurse care managers for proactive outreach
Social determinants of health: Analyzing factors like zip code, income, food access, and transportation that impact outcomes

For data professionals, PHM means building risk models, running care gap reports, and measuring program effectiveness (did our diabetes management program actually reduce HbA1c levels and ER visits?).


Data Standards & EDI

Q: What are the key EDI transactions in healthcare (837, 835, 270/271)?

EDI (Electronic Data Interchange) is the standardized electronic format for healthcare transactions mandated by HIPAA. The key transaction types:

837 (Claim): The claim submission. 837P = professional claims (physician), 837I = institutional claims (hospital), 837D = dental claims. This is the most data-rich transaction — contains patient, provider, diagnosis, procedure, and charge information.

835 (Remittance): The payment explanation sent from payer to provider. Shows what was paid, what was adjusted, and why. The electronic version of an EOB. Critical for payment reconciliation.

270/271 (Eligibility): 270 is the eligibility inquiry (is this patient covered?), 271 is the response. Typically run before a visit to verify active coverage.

276/277 (Claim Status): 276 is the status inquiry, 277 is the response. Used to check where a claim is in the adjudication process.

278 (Prior Authorization): Request and response for service pre-approval.

All of these follow the X12 standard: each transaction is a sequence of segments, conventionally terminated by a tilde (~), with elements separated by asterisks (*). They look nothing like JSON or XML; they're flat, cryptic, and require specialized parsers.
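
To make the format concrete, here is a minimal sketch that splits a trimmed, made-up 837P fragment into segments and elements. It assumes the conventional X12 delimiters (`~` ends a segment, `*` separates elements, `:` separates components); a real file declares its delimiters in the ISA header, and production work should use a proper X12 parser rather than string splitting.

```python
# Hypothetical, heavily trimmed fragment; a real 837 has many more segments.
raw = "CLM*A123*150***11:B:1~HI*ABK:E119~SV1*HC:99213*150*UN*1***1~"

def parse_segments(text: str, seg_term: str = "~", elem_sep: str = "*") -> list[list[str]]:
    """Split X12 text into a list of segments, each a list of elements."""
    return [seg.split(elem_sep) for seg in text.split(seg_term) if seg.strip()]

segments = parse_segments(raw)

clm = next(s for s in segments if s[0] == "CLM")
claim_id, total_charge = clm[1], float(clm[2])     # CLM01, CLM02
place_of_service = clm[5].split(":")[0]            # first component of CLM05

sv1 = next(s for s in segments if s[0] == "SV1")
procedure_code = sv1[1].split(":")[1]              # code after the HC qualifier

print(claim_id, total_charge, place_of_service, procedure_code)
```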

Q: What is HIPAA and what are its key provisions for data?

HIPAA (Health Insurance Portability and Accountability Act of 1996) has three major components relevant to healthcare data:

1. Privacy Rule: Defines PHI (Protected Health Information), identified through 18 identifiers including name, SSN, DOB, address, medical record numbers, and IP addresses. Without patient authorization, PHI may generally be used or disclosed only for treatment, payment, and healthcare operations. Covered entities must provide a Notice of Privacy Practices.

2. Security Rule: Requires administrative, physical, and technical safeguards to protect electronic PHI (ePHI). This includes access controls, encryption, audit trails, and breach notification procedures.

3. Transaction and Code Sets Rule: Mandates the use of standardized EDI formats (837, 835, etc.) for electronic transactions.

For data professionals: you must understand de-identification (Safe Harbor method removes 18 identifiers, Expert Determination uses statistical methods), minimum necessary standard (only access the PHI you need), and BAA (Business Associate Agreement) requirements if your company handles PHI on behalf of a covered entity. Civil penalties for HIPAA violations are tiered by culpability, starting at roughly $100 per violation, with inflation-adjusted annual caps that reach roughly $2 million per violation category.

Q: What is FHIR and how does it differ from HL7 v2?

Both are standards for exchanging clinical data between healthcare systems, but they represent different eras:

HL7 v2 (late 1980s): The workhorse of healthcare interoperability. Uses pipe-delimited messages (e.g., MSH|^~\&|...). Extremely widespread: an estimated 95% of US healthcare organizations use it. But it's flexible to a fault: implementations vary wildly between vendors, making true interoperability difficult. Point-to-point integration that doesn't scale well.

FHIR (Fast Healthcare Interoperability Resources, 2014): The modern successor. Built on RESTful APIs, uses JSON/XML. Resources (Patient, Observation, Encounter, Claim) are accessed via standard HTTP methods. Much easier for developers to work with. Supports OAuth2 for authentication.

Key differences: HL7 v2 is message-based (push), FHIR is resource-based (pull via API). HL7 v2 requires custom interfaces for each connection, FHIR uses standardized endpoints. FHIR is mandated by the 21st Century Cures Act for patient data access.

In practice, most organizations run both: HL7 v2 for internal system communication and FHIR for external data sharing and patient-facing apps.

Q: What is a clearinghouse and why is it needed?

A clearinghouse is an intermediary that routes electronic claims between providers and payers. Think of it as a postal service for healthcare claims.

Why they exist: There are thousands of providers and hundreds of payers. Without a clearinghouse, each provider would need a direct electronic connection to every payer — an O(n×m) problem. The clearinghouse acts as a hub, reducing this to O(n+m).
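
The scaling argument is easy to put numbers on. The provider and payer counts below are hypothetical, chosen only to show the order-of-magnitude difference.

```python
def direct_connections(providers: int, payers: int) -> int:
    """Every provider connects to every payer: n x m links."""
    return providers * payers

def hub_connections(providers: int, payers: int) -> int:
    """Everyone connects once to the clearinghouse: n + m links."""
    return providers + payers

direct = direct_connections(5_000, 300)
via_hub = hub_connections(5_000, 300)
print(direct, via_hub)
```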

What clearinghouses do:
Format validation: Check claims for errors before sending to the payer (scrubbing)
Translation: Convert between different EDI versions and formats
Routing: Send claims to the correct payer based on payer ID
Status tracking: Provide claim status visibility across payers

Major clearinghouses include Change Healthcare (now part of Optum/UHG), Availity, and TriZetto. Change Healthcare processes about 15 billion transactions annually. The 2024 Change Healthcare cyberattack demonstrated how critical (and concentrated) this infrastructure is.

Q: What data elements are typically on an 837 professional claim?

An 837P (professional claim) contains these core data elements organized into loops and segments:

Header information: Submitter, receiver, payer IDs, transaction date

Subscriber/Patient: Member ID, name, DOB, gender, address, relationship to subscriber, group number

Billing/Rendering Provider: NPI (Type 1 and Type 2), taxonomy code, name, address, tax ID

Claim-level data: Claim ID, total charge amount, place of service code (11=office, 21=inpatient, 23=ER), claim frequency code (original, replacement, void), prior authorization number

Diagnosis codes: Up to 12 ICD-10-CM codes, with a principal diagnosis indicator

Service lines: Each line has a CPT/HCPCS code, modifier(s), date of service, units, line charge amount, rendering provider NPI, diagnosis pointer (links to which ICD-10 code justifies this service)

Understanding this structure is essential for building claims data warehouses and writing accurate analytics queries. Each element maps to columns in your claims tables.


Scenario Questions

Q: A client reports their clean claim rate dropped from 95% to 85%. How would you investigate?

Structured investigation approach:

1. Scope the problem: When did the drop start? Is it across all payers or specific ones? All providers or a subset?

-- Weekly clean claim rate by payer over the last 90 days (PostgreSQL syntax)
SELECT payer_name,
  DATE_TRUNC('week', submission_date) AS week,
  COUNT(*) AS total_claims,
  SUM(CASE WHEN is_clean = 1 THEN 1 ELSE 0 END) AS clean_claims,
  ROUND(100.0 * SUM(CASE WHEN is_clean = 1 THEN 1 ELSE 0 END) / COUNT(*), 1) AS clean_pct
FROM claims
WHERE submission_date >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY payer_name, DATE_TRUNC('week', submission_date)
ORDER BY payer_name, week;

2. Analyze rejection reasons: Group rejections by reason code to find the top drivers.

-- Top 10 rejection reasons and each one's share of all rejections in the window
SELECT rejection_code, rejection_description,
  COUNT(*) AS cnt,
  ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 1) AS pct_of_total
FROM claim_rejections
WHERE submission_date >= '2026-01-01'
GROUP BY rejection_code, rejection_description
ORDER BY cnt DESC
LIMIT 10;

3. Look for root cause: Common culprits include: a system upgrade that changed field mappings, a new payer with different validation rules, a new provider group with incorrect credentialing, or an NPI/taxonomy code mismatch after a provider changed practice.

4. Check for correlations: Did the drop correlate with a billing system update? New provider onboarding? A payer changing their validation rules?

5. Recommend fix and monitoring: Implement pre-submission scrubbing rules targeting the top rejection reasons. Set up alerts if clean claim rate drops below 93%.

Q: You're building a dashboard to track HEDIS measures. What metrics and dimensions would you include?

Key metrics (KPIs):
• Compliance rate per measure (numerator / denominator × 100)
• Gap count — members in the denominator who haven't met the measure
• Year-over-year trend per measure
• Days remaining in measurement year
• Projected final rate (based on current trajectory)
• Distance to STAR threshold (e.g., need 72% for 4 stars, currently at 68%)
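
The first and last KPIs above reduce to simple arithmetic. A minimal sketch, using the 68% and 72% figures from the bullet as the example; the member counts are made up, and the closure estimate assumes the denominator stays fixed for the rest of the measurement year.

```python
def compliance_rate(numerator: int, denominator: int) -> float:
    """Percent of eligible members who have met the measure."""
    return round(100 * numerator / denominator, 1)

def gap_count(numerator: int, denominator: int) -> int:
    """Members in the denominator who have not yet met the measure."""
    return denominator - numerator

def closures_needed(numerator: int, denominator: int, target_pct: int) -> int:
    """Additional gap closures needed to reach the target rate."""
    required = -(-target_pct * denominator // 100)   # ceiling division on integers
    return max(0, required - numerator)

rate = compliance_rate(680, 1000)
gaps = gap_count(680, 1000)
to_close = closures_needed(680, 1000, 72)
print(rate, gaps, to_close)
```

Expressing distance to threshold as a member count rather than a percentage gives outreach teams a concrete target.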

Dimensions (slicers):
• Measure name (BCS, CBP, CDC-HbA1c, etc.)
• Product/plan (MA-HMO, MA-PPO)
• Market/region
• Provider/medical group
• Member demographics (age band, gender, risk tier)
• Care gap closure source (claims, supplemental data, chart review)

Dashboard structure:
Executive summary: Overall STAR rating projection, top measures at risk
Measure detail: Drill-down per measure showing compliance rate, trend, and gap count
Provider scorecard: Which medical groups are driving gaps
Member outreach list: Actionable list of members with open care gaps for outreach campaigns

The most critical design decision: the dashboard should answer "what do we need to do TODAY to move the needle?" not just report historical numbers.

Q: A Medicare Advantage plan's STAR rating dropped from 4.0 to 3.5. What data would you analyze to find the cause?

This is a multi-million dollar problem. Structured approach:

1. Identify which measures drove the drop:

-- Measure-level year-over-year change, largest declines first
SELECT measure_id, measure_name,
  prior_year_score, current_year_score,
  prior_year_stars, current_year_stars,
  (current_year_score - prior_year_score) AS score_change
FROM star_ratings_detail
WHERE plan_id = 'H1234'
ORDER BY score_change ASC;

2. Decompose the STAR formula: STAR ratings weight different categories differently. Check if the drop came from HEDIS clinical measures (heavily weighted), CAHPS patient experience surveys, HOS (Health Outcomes Survey), or Part D medication adherence.

3. For each dropped measure, analyze the numerator/denominator: Did the denominator grow (more eligible members) or the numerator shrink (fewer members meeting the measure)? A plan that acquired a large new group might see dilution.

4. Check operational factors: Did supplemental data submission deadlines get missed? Were chart reviews completed? Did a major provider group leave the network? Did CAHPS survey response rates drop (low response rates hurt)?

5. External factors: Did CMS change cut-points or methodology? (CMS periodically tightens STAR thresholds.) Were there COVID-related measure adjustments that expired?

6. Recommend recovery plan: Prioritize measures closest to the next cut-point threshold. A measure at 71% where 72% earns an extra star is worth more focus than one at 60% needing 72%.

Q: Your risk adjustment team believes diagnoses are being undercoded. How would you validate this with data?

Undercoding means valid diagnoses aren't being captured on claims, leading to lower RAF scores and reduced revenue. Here's how to validate:

1. Diagnosis persistence analysis: Find chronic conditions documented last year but missing this year. Chronic diseases don't disappear.

-- Chronic HCCs documented in 2025 but absent in 2026.
-- DISTINCT in the subqueries prevents one row per claim from multiplying the results.
SELECT m.member_id, m.member_name, h.hcc_code, h.hcc_description,
  py.documented AS prior_year, cy.documented AS current_year
FROM members m
JOIN member_hcc_history h ON m.member_id = h.member_id
LEFT JOIN (SELECT DISTINCT member_id, hcc_code, 1 AS documented
           FROM claims_hcc WHERE service_year = 2025) py
  ON m.member_id = py.member_id AND h.hcc_code = py.hcc_code
LEFT JOIN (SELECT DISTINCT member_id, hcc_code, 1 AS documented
           FROM claims_hcc WHERE service_year = 2026) cy
  ON m.member_id = cy.member_id AND h.hcc_code = cy.hcc_code
WHERE py.documented = 1 AND cy.documented IS NULL
  AND h.is_chronic = 1;

2. Compare RAF scores year-over-year: If the average RAF dropped without a change in population mix, that suggests coding decline.

3. Benchmark against expected prevalence: If 8% of your 65+ population has CHF but only 4% have CHF codes on claims, you likely have undercoding.

4. Provider-level analysis: Compare HCC capture rates across providers. Providers with suspiciously low rates may need coding education.

5. Pharmacy-to-diagnosis gap: Members on insulin (from pharmacy claims) without a diabetes diagnosis code on medical claims is a strong signal of undercoding.
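
The pharmacy-to-diagnosis check in step 5 is essentially a set difference. A minimal sketch with made-up member IDs; in practice the two sets would come from pharmacy and medical claims queries, and flagged members are candidates for chart review, not automatic coding.

```python
def diagnosis_gap(drug_members: set[str], dx_members: set[str]) -> list[str]:
    """Members with the drug on pharmacy claims but no matching diagnosis on medical claims."""
    return sorted(drug_members - dx_members)

insulin_members = {"M001", "M002", "M003"}        # hypothetical: filled insulin this year
diabetes_dx_members = {"M001", "M003", "M004"}    # hypothetical: diabetes code on a claim

suspects = diagnosis_gap(insulin_members, diabetes_dx_members)
print(suspects)
```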

Revenue impact: Quantify the gap: (expected RAF - actual RAF) × members × monthly capitation rate × 12 = annual revenue left on the table.
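
The revenue formula can be worked through directly. All inputs below are hypothetical, and the sketch assumes the monthly rate is the payment for a member with a RAF of 1.0.

```python
def annual_revenue_gap(expected_raf: float, actual_raf: float,
                       members: int, monthly_rate: float) -> float:
    """(expected RAF - actual RAF) x members x monthly rate x 12."""
    return round((expected_raf - actual_raf) * members * monthly_rate * 12, 2)

# Hypothetical plan: 10,000 members, $1,000 per month at RAF 1.0, RAF understated by 0.05.
gap = annual_revenue_gap(1.05, 1.00, 10_000, 1_000.0)
print(gap)
```

Even a small RAF shortfall scales quickly with membership, which is why the estimate is usually the headline number in an undercoding analysis.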

Q: A payer asks you to build a fraud detection model for claims. What features would you engineer?

Healthcare fraud costs an estimated $100+ billion annually. Feature engineering categories:

Provider-level features:
• Claims volume vs. peer average (by specialty and geography)
• Average charge per claim vs. peers
• % of claims at highest complexity E&M codes (upcoding signal)
• Number of unique patients per day (impossible volume detection)
• Rendering provider billing under multiple groups

Claim-level features:
• Unbundling patterns (billing separate CPTs that should be a single bundled code)
• Weekend/holiday billing frequency
• Services billed to deceased members
• Duplicate claims with slight modifications
• Services inconsistent with patient demographics (pediatric codes for adults)

Network/relationship features:
• Self-referral rates
• Unusual referral patterns between providers
• Patients traveling long distances past closer providers
• Phantom patient detection (members with claims but no other activity)

Temporal features:
• Sudden volume spikes for a provider
• Claims clustered just below review thresholds
• Billing patterns that changed sharply after an audit

Model approach: Start with rules-based anomaly detection (easier to explain to regulators), then layer in supervised models (gradient boosting, random forest) trained on confirmed fraud cases. Explainability is critical — you need to justify each flag to investigators.
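
As one example of the rules-based starting point, the upcoding feature (share of claims at the highest complexity E&M level) can be compared against peers with a z-score. The provider IDs, shares, and the 1.5 cutoff below are all made up for illustration; a real peer group would be defined by specialty and geography and would be much larger.

```python
from statistics import mean, pstdev

# Hypothetical share of E&M claims billed at the top level, one value per provider.
top_level_share = {"P1": 0.10, "P2": 0.12, "P3": 0.08, "P4": 0.11, "P5": 0.60}

def peer_z_scores(values: dict[str, float]) -> dict[str, float]:
    """Standardize each provider's value against the peer group mean and std dev."""
    data = list(values.values())
    mu, sigma = mean(data), pstdev(data)
    if sigma == 0:
        return {provider: 0.0 for provider in values}
    return {provider: (v - mu) / sigma for provider, v in values.items()}

z = peer_z_scores(top_level_share)
flagged = [provider for provider, score in z.items() if score > 1.5]   # assumed cutoff
print(flagged)
```

A flag like this is a prompt for review rather than evidence of fraud; a provider with a sicker patient panel can legitimately bill more high-complexity visits.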

Q: You're migrating healthcare data from HL7 v2 to FHIR. What challenges would you expect?

This is one of the hardest interoperability problems in healthcare. Key challenges:

1. Data model mismatch: HL7 v2 is message-based (ADT, ORM, ORU messages). FHIR is resource-based (Patient, Encounter, Observation). A single HL7 v2 ADT message might map to multiple FHIR resources (Patient + Encounter + Coverage + Location). There's no 1:1 mapping.

2. Vocabulary differences: HL7 v2 allows local code systems and free text. FHIR strongly encourages standard terminologies (SNOMED CT, LOINC, RxNorm). You'll need to build terminology mapping tables.

3. HL7 v2 variability: Every vendor implements HL7 v2 differently. "Z-segments" (custom extensions) are everywhere. You'll spend more time understanding the source system's specific HL7 v2 implementation than writing the actual transform.

4. Historical data quality: Legacy HL7 v2 data often has missing fields, inconsistent formats, and embedded free text where coded values should be. You need a data quality triage strategy.

5. Identity resolution: HL7 v2 systems often use local MRNs. FHIR expects consistent patient identifiers. You'll need a master patient index (MPI) strategy to deduplicate and link records.

6. Ongoing dual maintenance: During migration, you'll run both systems in parallel. Every downstream consumer needs to handle both formats until cutover. Plan for a long coexistence period.

Practical advice: Start with a priority subset (lab results or ADT events), not everything at once. Use a FHIR facade/adapter pattern that translates on-the-fly rather than a big-bang migration. Tools like Microsoft FHIR Converter and HAPI FHIR can accelerate development.
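
To give a feel for the transform work, here is a minimal sketch that maps one HL7 v2 PID segment to a FHIR Patient resource as a plain dictionary. The sample segment is made up, the field positions follow the standard PID layout (PID-3 identifier, PID-5 name, PID-7 birth date, PID-8 sex), and the sketch ignores repetitions, escape sequences, and Z-segments, which is exactly where real implementations get hard.

```python
import json

# Hypothetical PID segment from an ADT message.
pid_segment = "PID|1||MRN12345^^^HOSP^MR||DOE^JANE||19800115|F"

# HL7 v2 administrative sex codes mapped to FHIR gender values.
GENDER_MAP = {"F": "female", "M": "male", "O": "other", "U": "unknown"}

def pid_to_patient(segment: str) -> dict:
    """Map a PID segment to a minimal FHIR Patient resource (as a dict)."""
    fields = segment.split("|")                    # fields[n] corresponds to PID-n
    mrn = fields[3].split("^")[0]                  # first component of PID-3
    family, given = (fields[5].split("^") + [""])[:2]
    dob = fields[7]                                # YYYYMMDD
    return {
        "resourceType": "Patient",
        "identifier": [{"value": mrn}],
        "name": [{"family": family, "given": [given]}],
        "birthDate": f"{dob[:4]}-{dob[4:6]}-{dob[6:8]}",
        "gender": GENDER_MAP.get(fields[8], "unknown"),
    }

patient = pid_to_patient(pid_segment)
print(json.dumps(patient, indent=2))
```

The same ADT message would also feed Encounter, Coverage, and Location resources, each needing its own mapping like this one.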