Dale Sanders
Chief Strategy Officer

Mayur Saxena
CEO and Co-Founder
Droice Labs

Embracing the noise and bias in healthcare data


Watch the webinar on-demand

“Let the model fluctuate around the data” is a common saying among physicists and data scientists working in national intelligence. In principle, it means letting the data speak for itself and reveal the patterns that describe its essence.

But in healthcare, that’s not usually the case. Rather, there’s a long-held practice of force-fitting disparate data into predefined models like HL7, OMOP, and FHIR, whether or not the data was collected using those models, and then analyzing it. Indeed, in any discussion of healthcare analytics, poor data quality is an ever-present topic because of missing and incorrect data, data biases, and other types of noise.

Historically, the healthcare industry has addressed this problem through labor-intensive, error-prone data cleansing and curation. Until healthcare data quality can be improved and standardized at the point of production, we can borrow lessons from space and national intelligence and let disparate data speak for itself by acknowledging and embracing noise and biases. Watch the replay for a thoughtful discussion about:

  • The current state of healthcare data, the pipes that carry and transform this data, vocabularies that encode health information, and how these factors impact patient data analytics

  • “Information models” vs. “data models” for clinical data

  • Data science techniques for combining text, discrete, and categorical data to identify novel patient cohorts and subtypes for population health and clinical research, without imposing an a priori data model