The rise of artificial intelligence (AI) in healthcare presents immense opportunities to improve diagnosis, treatment and research. However, the efficacy of AI-driven innovations relies heavily on the quality, diversity and representativeness of the datasets used to train these systems. A critical gap exists in women’s health data, which has perpetuated healthcare disparities and inequities in outcomes. Addressing this imbalance is essential to ensure AI technologies benefit all populations equitably, particularly women, who face unique health challenges that are often under-researched or misrepresented.
A Historical Oversight in Women’s Health Research
The underrepresentation of women in health research is rooted in historical biases. For much of the 20th century, clinical trials excluded women due to a combination of misguided assumptions and ethical concerns. It was widely believed that findings from male subjects could be generalised to women, as men were seen as biologically representative of the broader population. Additionally, concerns about hormonal fluctuations interfering with research outcomes and risks to pregnancy led to policies such as the 1977 FDA recommendation to exclude women of childbearing potential from early drug trials.
The thalidomide scandal of the late 1950s and early 1960s, in which thousands of children were born with severe birth defects, had already entrenched the reluctance to include women in clinical studies. This avoidance, though seemingly precautionary, overlooked the pressing need to understand how treatments affect women differently.
In the 1990s, this perspective began to shift. The NIH Revitalization Act of 1993 mandated the inclusion of women and minorities in NIH-funded clinical research, recognising the importance of diverse data. While this policy was a step forward, progress has been uneven. Even today, a significant proportion of NIH-funded studies fail to stratify findings by sex or to consider gender as a variable. This persistent oversight has contributed to tangible consequences, such as women experiencing adverse drug reactions at twice the rate of men. Moreover, conditions predominantly affecting women, such as endometriosis, migraines and autoimmune diseases, receive disproportionately little research funding relative to diseases primarily affecting men, even when adjusted for disease burden. These gaps in data and funding have limited the development of AI models capable of addressing women’s specific health needs.
The State of Women’s Health Data
While some progress has been made in gathering women’s health data, the available datasets often paint an incomplete picture. Notable initiatives, such as the All of Us Research Program, the Nurses’ Health Study (NHS) and the Women’s Health Initiative (WHI), provide valuable insights but also highlight the limitations of current data collection efforts.
The All of Us Research Program, for instance, is a landmark NIH initiative aiming to collect health information from over a million participants in the United States. Women make up 60% of its participants, which makes it a promising starting point for building more inclusive datasets. However, as an observational study, it can reveal correlations but cannot by itself establish causation, which limits its application in certain areas.
Similarly, the NHS, initiated in 1976, has provided long-term data on chronic disease risk factors in women. Yet its criteria for participation (married female registered nurses aged 30–55) excluded younger and older women, unmarried women and those outside the nursing profession. The WHI, launched in the 1990s, offers insights into postmenopausal health but similarly lacks the breadth required for a holistic understanding of women’s health across the lifespan.
While valuable, these datasets remain fragmented, focusing on specific demographics or conditions. Moreover, their largely observational nature limits the ability to assess cause-and-effect relationships, which is critical for developing effective AI applications. To train AI systems that accurately address women’s health needs, it is imperative to expand and enhance these datasets with more diverse, longitudinal and interventional data.
Advancing Women’s Health Through AI
Closing the data gap in women’s health requires a multifaceted approach. Public-private partnerships could be pivotal in creating robust datasets by pooling resources from different sectors. Pharmaceutical companies, for example, hold vast troves of proprietary data from clinical trials. Combining these data with the work of public institutions such as the NIH, and with anonymised medical records, could significantly enrich the datasets available for AI training.
Additionally, consumer-driven data sources present an untapped opportunity. Apps such as Clue and Flo, along with wearable devices from companies such as Apple and Oura, collect menstrual and reproductive health data from millions of users. Clue, for instance, allows users to opt in to contributing de-identified data for research purposes. Similarly, the Apple Women’s Health Study, run in partnership with Harvard, aims to advance understanding of conditions like polycystic ovary syndrome (PCOS) and of menopausal transitions. If these data streams were integrated into larger research efforts, AI models could better capture women’s reproductive health, menstrual cycles and related physiological signals.
Another key area of focus should be funding for research into diseases that affect women disproportionately or differently. Conditions like autoimmune disorders, cardiovascular diseases and Alzheimer’s require more attention to improve outcomes and address the unique challenges women face. For instance, women are 50% more likely than men to die within a year of a heart attack, and 78% of Americans with autoimmune diseases are women. Investments in these areas could enhance health outcomes, reduce diagnostic delays and improve survival rates.
The integration of AI in healthcare has the potential to revolutionise the field, but its success depends on addressing critical gaps in the underlying data. The historical underrepresentation of women in health research has left a legacy of inequities that must be rectified. By fostering collaboration across sectors, leveraging consumer health data and prioritising funding for under-researched conditions, we can build AI systems that genuinely reflect the diversity of the populations they serve. Only through these efforts can we ensure that AI technologies fulfil their promise to improve healthcare outcomes for all, particularly for women, who have long been overlooked in medical research.
Source: Digital Health Insights