Why Racial Bias Persists in Medical Algorithms and How to Fix It

Medical algorithms are machine learning or statistical models that ingest patient data—demographics, lab results, diagnoses—and output predictions or recommendations to clinicians, insurers, and patients. They are used in risk scores for diseases such as heart failure, triage tools for emergency departments, eligibility determinations for organ transplants, and priority rankings for population health outreach. The goal is to improve efficiency and reduce human error by letting data drive decisions. That technical promise, however, hides underlying social choices and assumptions. Algorithms don’t arise in a vacuum; they reflect the data they’re trained on and the values of the systems in which they’re embedded. When those systems carry biases, the algorithms can replicate or magnify them.

Because health systems often use proxies like cost to estimate illness, many algorithms inadvertently rank patients based on spending rather than severity. This matters because Black and Brown patients often receive less care and therefore have lower expenditures, which causes algorithms to misjudge their health needs. Furthermore, the variables included in these models—like race, neighborhood, or socio-economic status—can encode structural inequities from the outset. Understanding how these tools work is the first step in addressing their weaknesses.

Where Does Bias in Medical Algorithms Come From?

Bias in medical algorithms can originate from multiple sources. Sampling bias occurs when the data used to train a model are not representative of the populations the model will serve. For example, training a sepsis prediction model predominantly on data from White, insured patients can lead to poor performance for uninsured or non-White patients whose symptoms and healthcare-seeking patterns differ. Measurement bias arises when the proxies used to measure outcomes are flawed. Using cost as a proxy for severity systematically underestimates the burden of disease on marginalized communities because they often have less access to healthcare and accumulate lower medical bills. Label bias happens when the labels used for training—such as “high risk”—are influenced by subjective clinician judgments that may be biased by stereotypes.
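
To make the cost-as-proxy problem concrete, here is a minimal sketch that simulates two groups with identical distributions of clinical need but unequal spending, then ranks patients by cost. The group labels, effect size, and cutoff are synthetic assumptions chosen purely for illustration, not figures from any real study.

```python
# Minimal synthetic illustration of measurement/label bias: ranking patients
# by a cost proxy instead of true clinical need. All numbers are made up.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two groups, A and B, with identical distributions of true clinical need.
group = rng.choice(["A", "B"], size=n)
true_need = rng.gamma(shape=2.0, scale=1.0, size=n)

# Observed cost tracks need, but group B accumulates ~40% less spending
# at the same level of illness (reduced access, under-treatment).
access_factor = np.where(group == "B", 0.6, 1.0)
cost = true_need * access_factor * rng.lognormal(0.0, 0.3, size=n)

def share_of_group_b(score, top_frac=0.03):
    """Fraction of group-B patients among the top `top_frac` ranked by `score`."""
    cutoff = np.quantile(score, 1 - top_frac)
    selected = score >= cutoff
    return (group[selected] == "B").mean()

print("Group B share of population:       ", round((group == "B").mean(), 3))
print("Group B share, ranked by true need:", round(share_of_group_b(true_need), 3))
print("Group B share, ranked by cost:     ", round(share_of_group_b(cost), 3))
```

Ranking by true need selects both groups at roughly equal rates; ranking by the cost proxy sharply under-selects the group whose spending is suppressed, even though its members are just as sick.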

Bias can also stem from structural factors in healthcare systems. Healthcare data reflect decades of inequities: who gets insured, who can afford visits, and who is believed when they report symptoms. By feeding historical data into an algorithm, we risk encoding past discrimination into future decisions. Additionally, a lack of diversity in the teams developing these algorithms can lead to blind spots. If designers don’t understand how race intersects with health, they may overlook variables or relationships that capture social determinants of health, producing simplistic and harmful models.

Why Bias Persists in Medical Algorithms

One reason bias persists is that the healthcare industry often treats algorithms as objective. There is a prevailing belief that because something is data-driven, it is free of human prejudice. In reality, algorithms are built on human-generated data and choices. Algorithmic bias persists when organizations do not audit their models regularly or consider how variables might proxy for race and class. Many hospitals and insurers adopt third-party tools with little transparency into how they were built, and clinicians may trust them uncritically. The persistence of bias is also tied to financial incentives: risk scores can be profitable for insurers and providers who focus on cost-control, even if those scores undervalue minority patients’ needs.

Regulatory and legal frameworks have lagged behind the rapid deployment of healthcare AI. Unlike pharmaceuticals, most algorithms do not undergo rigorous clinical trials before use. There are few requirements for algorithm developers to report performance metrics across racial, ethnic, or socioeconomic groups. Without pressure from regulators or lawsuits, companies may see little reason to modify models that appear to perform “well enough” on average, even if they fail some populations. Bias persists because the default approach of using historical data and standard metrics implicitly accepts systemic inequities as normal.

Consequences of Bias: Real-World Harms

The consequences of bias in medical algorithms are significant and measurable. A widely used tool in the United States to manage care for high-risk patients systematically under-enrolled Black patients in special care programs because it used past healthcare costs to predict future needs. As researchers noted, if the tool had no racial bias, the percentage of Black patients receiving extra help would more than double. This means that biased algorithms can contribute to poor health outcomes by preventing marginalized populations from receiving preventive care. In the realm of organ transplantation, algorithms used to estimate kidney function often adjust for race, labeling Black patients as healthier than they are and delaying listing for transplants. Decisions like these can cost or save lives, yet they rest on crude racial categories rather than underlying biology.

Biased algorithms also erode trust in healthcare. When patients learn that a tool rated them lower because of their race or neighborhood, they may become reluctant to engage with medical systems. Clinicians may feel disempowered if they suspect a model is unfair but lack authority to override it. The combined effect is a feedback loop: marginalized communities get worse care, which lowers their health outcomes, which reinforces stereotypes and justifies further neglect. Addressing algorithmic bias is therefore not just a technical problem but a matter of social justice and public trust.

Fixing Bias: Solutions and Reforms

Fixing bias in medical algorithms requires changes at multiple levels. First, developers must prioritize representative data. This means deliberately seeking data from diverse populations and ensuring that variables capturing socioeconomic status, access to care, and environmental exposures are included. When certain populations are underrepresented in training data, developers can apply techniques like reweighting or synthetic data generation to balance the dataset. Second, outcome proxies must be carefully chosen. Instead of using cost as a proxy for illness, algorithms could use biomarkers, lab values, or patient-reported outcomes that better reflect health status and are less correlated with wealth.
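
As a sketch of the reweighting idea, one simple option is to weight each training record by the inverse of its group’s frequency so that underrepresented groups contribute equally to the training loss. The column names (`group`, `outcome`) and the use of logistic regression below are illustrative assumptions, not a prescribed method.

```python
# Sketch of inverse-frequency reweighting so an underrepresented group
# contributes equally to the training loss. Column names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def inverse_frequency_weights(groups: pd.Series) -> np.ndarray:
    """Weight each row by 1 / (share of its group), normalized to mean 1."""
    shares = groups.map(groups.value_counts(normalize=True))
    weights = 1.0 / shares
    return (weights / weights.mean()).to_numpy()

def fit_reweighted(df: pd.DataFrame, feature_cols, outcome_col="outcome",
                   group_col="group") -> LogisticRegression:
    """Fit a simple risk model with per-row weights balancing group influence."""
    weights = inverse_frequency_weights(df[group_col])
    model = LogisticRegression(max_iter=1000)
    model.fit(df[feature_cols], df[outcome_col], sample_weight=weights)
    return model
```

Reweighting changes how much each group influences the fit, but it cannot repair a flawed outcome proxy; the two fixes are complementary.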

Third, algorithm developers and purchasers should conduct bias audits. These audits compare model performance across demographic groups and identify disparities. If an algorithm performs worse for Black patients, for example, developers can adjust the model or create separate models tailored to different populations. Fourth, transparency is key: organizations that produce or use algorithms should disclose their methodologies and allow independent researchers to test for bias. The adoption of algorithmic impact assessments—similar to environmental impact statements—could become standard, requiring developers to anticipate and mitigate social risks before deployment.
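
A basic audit of this kind can be a few lines of analysis. The sketch below assumes hypothetical inputs (true outcomes, a model’s binary predictions, and group labels) and reports sensitivity, false-negative rate, and selection rate per group; it is one reasonable starting point, not a complete fairness evaluation.

```python
# Sketch of a simple bias audit: compare an existing model's error rates
# across demographic groups. Inputs are hypothetical arrays.
import pandas as pd
from sklearn.metrics import confusion_matrix

def audit_by_group(y_true, y_pred, groups) -> pd.DataFrame:
    """Per-group sensitivity, false-negative rate, and selection rate."""
    df = pd.DataFrame({"y": y_true, "pred": y_pred, "group": groups})
    rows = []
    for name, g in df.groupby("group"):
        tn, fp, fn, tp = confusion_matrix(g["y"], g["pred"], labels=[0, 1]).ravel()
        rows.append({
            "group": name,
            "n": len(g),
            "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
            "false_negative_rate": fn / (tp + fn) if (tp + fn) else float("nan"),
            "selection_rate": (tp + fp) / len(g),
        })
    return pd.DataFrame(rows)

# Example use: audit_by_group(y_true, model.predict(X), df["race_ethnicity"])
# Large gaps in sensitivity or false-negative rate flag groups the model underserves.
```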

The Role of Social Determinants and Structural Racism

Beyond technical fixes, addressing algorithmic bias demands a broader understanding of health inequities. Social determinants of health—such as housing, education, employment, transportation, and food security—shape who gets sick and who gets treated. Models that ignore these determinants risk misattributing differences in outcomes to individual choices or genetics rather than to structural factors. For example, an algorithm might flag a patient as non-compliant for missing appointments without recognizing that they lack reliable transportation or childcare. Embedding social variables into algorithms can help capture the real drivers of risk, but it also requires careful handling to avoid stigmatization.

Structural racism operates through laws, policies, and institutional practices that produce and maintain racial inequalities. Acknowledging structural racism in healthcare means recognizing how redlining, segregation, and discrimination influence health outcomes. Medical algorithms trained on data reflecting these injustices will replicate them unless consciously corrected. Solutions therefore include investing in community health, addressing environmental hazards in minority neighborhoods, and ensuring that marginalized voices are represented in algorithm development. Equity cannot be achieved by tweaking equations alone; it requires dismantling the underlying structures that produce biased data.

A Path Forward for Equitable Healthcare Technology

Creating equitable healthcare technology is an ongoing process. One promising direction is community-based participatory research, where researchers partner with community members to design and evaluate interventions. This ensures that the concerns and needs of marginalized groups shape the development of algorithms. Another avenue is regulatory reform: governments can establish standards for algorithmic fairness, require reporting of demographic performance, and enforce anti-discrimination laws in healthcare AI. Insurers and hospitals can also tie reimbursement or adoption decisions to fairness metrics, creating financial incentives for equitable models.

Education and training are crucial. Clinicians must be trained to understand the limitations of algorithms and to recognize when they may be biased. Data scientists need to learn about health disparities, ethics, and how to engage with communities. Patients should be informed about how algorithms might influence their care and have avenues to contest decisions. The ultimate goal is a healthcare system where artificial intelligence and machine learning help to reduce inequities rather than exacerbate them. Achieving this goal requires combining technological innovation with social and structural reform.

Case Studies: Racial Bias in Medical Algorithms

Medical algorithms shape decisions across the healthcare spectrum, from allocating organs for transplant to predicting patient deterioration. Real-world examples reveal how these tools can replicate existing inequities when not carefully designed and monitored.

One widely discussed case involved a population health management algorithm used by U.S. hospitals and insurers. The model was designed to predict which patients would benefit from more intensive care management programs. Rather than directly using clinical need as the target, the algorithm relied on historical healthcare expenditures. Because the cost of care is lower for Black patients—owing to longstanding inequities in access and treatment—the algorithm systematically underestimated the risk levels for Black patients. A study found that this model effectively excluded many Black patients who would have benefitted from additional resources, demonstrating how proxy variables can hide discriminatory assumptions and cause real harm.

Another example comes from cardiology risk scores. Many widely adopted equations include race as a factor when estimating the probability of adverse cardiac events. For instance, calculators for heart-failure prognosis or anticoagulation therapy often assign different baseline risk values to Black versus non-Black patients based on historical data. These adjustments can result in underestimating the severity of illness in Black patients or delaying interventions. When the underlying data reflect structural disparities in healthcare utilization, embedding a race correction in the model perpetuates, rather than corrects, those inequities.

Algorithms used in kidney transplant allocation have also come under scrutiny. Equations for the estimated glomerular filtration rate (eGFR), used to assess kidney function, have often included a race-based multiplier for patients identified as Black. This adjustment can artificially inflate kidney function estimates, delaying eligibility for transplant for Black patients. While the race coefficient was intended to account for average differences in creatinine generation, often attributed to muscle mass, it fails to account for the heterogeneity within populations and the many non-genetic factors that influence kidney health. As a result, Black patients are routinely referred for transplant evaluation later than their White counterparts.
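
For readers who want to see the mechanism, the sketch below implements the 2009 CKD-EPI creatinine equation, which multiplies the result by 1.159 when a patient is recorded as Black, alongside the 2021 race-free refit. The coefficients are transcribed from the published equations and should be verified against the primary sources; the code is illustrative only and never intended for clinical use.

```python
# Illustrative only, NOT for clinical use: 2009 CKD-EPI creatinine equation
# (with its 1.159 race multiplier) versus the 2021 race-free refit.
# Coefficients transcribed from the published equations; verify before relying on them.

def egfr_ckd_epi_2009(scr_mg_dl: float, age: int, female: bool, black: bool) -> float:
    kappa, alpha = (0.7, -0.329) if female else (0.9, -0.411)
    egfr = (141
            * min(scr_mg_dl / kappa, 1.0) ** alpha
            * max(scr_mg_dl / kappa, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159  # race multiplier: reports higher kidney function for Black patients
    return egfr

def egfr_ckd_epi_2021(scr_mg_dl: float, age: int, female: bool) -> float:
    kappa, alpha = (0.7, -0.241) if female else (0.9, -0.302)
    egfr = (142
            * min(scr_mg_dl / kappa, 1.0) ** alpha
            * max(scr_mg_dl / kappa, 1.0) ** -1.200
            * 0.9938 ** age)
    return egfr * 1.012 if female else egfr

# Same labs, same patient: the 2009 equation reports ~16% higher eGFR when the
# patient is recorded as Black, which can delay crossing a referral threshold.
print(egfr_ckd_epi_2009(1.4, 55, female=False, black=False))
print(egfr_ckd_epi_2009(1.4, 55, female=False, black=True))
print(egfr_ckd_epi_2021(1.4, 55, female=False))
```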

These case studies underscore the central lessons: the choice of outcome variables, the inclusion of race corrections, and the reliance on proxy metrics can embed structural inequality into algorithmic outputs. Only by interrogating these design choices, auditing models for bias, and involving affected communities in the development process can we ensure that algorithms mitigate, rather than reinforce, healthcare disparities.

Looking Forward: Toward Equitable Algorithms

The journey toward fairness in medical algorithms does not end with identifying bias or auditing existing models. It requires a proactive, iterative approach that anticipates inequity and embeds equity from the outset. Developers must commit to inclusive data practices, ensuring that training datasets capture the diversity of human experiences and avoid over-reliance on proxies that mask structural discrimination. This means investing in data collection that reaches marginalized populations, engaging community leaders, and continuously updating models as demographics and clinical practices evolve.

Equity-focused algorithm development also demands robust oversight structures. Regulatory agencies, professional organizations, and healthcare institutions should establish guidelines for fairness, transparency, and accountability. These standards can include mandatory bias assessments, public reporting of algorithmic performance across demographic groups, and mechanisms to pause deployment when harmful disparities emerge. Ultimately, a commitment to equitable algorithms goes beyond technical refinement; it calls for aligning technology with social justice, ensuring that the innovations intended to save lives do not inadvertently deepen the very inequities they aim to resolve.
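
One minimal way to operationalize such a pause mechanism is to monitor live selection rates by group and flag deployment for review when the gap crosses a chosen threshold. The 0.8 ratio and column names below are illustrative assumptions, not a regulatory standard.

```python
# Sketch of a post-deployment disparity monitor: compare how often each group
# is selected by a live model and recommend a pause if the gap is too large.
import pandas as pd

def disparity_check(predictions: pd.DataFrame, group_col: str = "group",
                    selected_col: str = "selected", min_ratio: float = 0.8) -> dict:
    """Return each group's selection rate and whether deployment should pause."""
    rates = predictions.groupby(group_col)[selected_col].mean()
    ratio = rates.min() / rates.max() if rates.max() > 0 else 1.0
    return {
        "selection_rates": rates.to_dict(),
        "min_to_max_ratio": ratio,
        "pause_recommended": ratio < min_ratio,
    }
```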

Conclusion

Racial bias in medical algorithms is not an isolated technical glitch but a symptom of broader inequities in healthcare and society. These models draw on data reflecting historical and contemporary discrimination, and without deliberate efforts, they will perpetuate the same injustices. To fix them, we must recognize that medical algorithms are socio-technical systems shaped by human choices and values. We need representative data, careful proxy selection, ongoing bias audits, transparency, and regulatory oversight. We also need to address the social determinants and structural racism that underpin disparities.

By combining technical adjustments with structural reforms, healthcare organizations can build algorithms that support equitable care. This involves investing in underserved communities, engaging them in research, and ensuring that fairness metrics guide algorithmic design and adoption. The promise of data-driven medicine should be realized for everyone, not just those who have historically benefitted. Only then can we harness the power of algorithms to create a more just and healthy society.

David Nguyen

David is a storyteller who uses his writing as a platform to share his thoughts and experiences. His main goal is to spark curiosity and encourage dialogue on a wide range of topics.
