Rapidly and Comprehensively Characterize Real World Data

Achieve unmatched efficiency and accuracy in clinical data assessment for all your research studies

How our algorithm works

The foundation of all Cornerstone modules is the multi-step Cornerstone generative AI algorithm that characterizes and evaluates clinical data at scale.

1. Connect a dataset and learn its structure and contents rapidly

Cornerstone AI maps and harmonizes clinical data, making it analysis-ready across formats and sources.

2. Transform the data

Cornerstone AI processes and validates data, enabling precise modeling and assessments.

3. Model and detect errors

Cornerstone AI detects and explains data errors, creating quality metrics.

Product Features

Data Quality Assessment Browser

Quality metrics at dataset, table, patient & field level
Summarize patient counts across all attributes of data

Cohort Builder

Accurately identify patients meeting inclusion and exclusion criteria at scale
Leverage all aspects of the Cornerstone AI algorithm (i.e., mapping to ontology, data transformation, and generation of quality metrics)

Benefits

Achieve consistent, reliable quality checks within and across datasets

Speed up data assessments from many weeks to hours

Comprehensive characterization enables smarter decision-making

Amplify your RWD Maturity

01.

Confirm that you are getting what you are buying

Applicable to new and existing data purchases
Example: for oncology gene mutation data, quickly filter for key attributes of clinical interest

02.

Become a leader in real world data quality

Dynamic quality report with ability to query data rapidly
A library of data assets
Objective and timely quality ratings

03.

Unified, consistent quality framework for evaluation

Point-and-click interface that is easy to use for both technical and non-technical users
Achieve more with less technical manpower
Bring feedback to data sources to improve underlying data quality

Case Study

Previous Vendor

Billable Model (H*R)

10 weeks to produce a data assessment

Static Excel report

Shallow results limited to set variables

Expensive

Cornerstone AI

Software license

Hours to produce a data assessment

Explore data dynamically in software interface

Comprehensive coverage of all variables

5x more affordable than the vendor solution on an annual basis

Testimonials

“It is imperative that sponsors can rapidly and comprehensively assess the quality of data sources required for scientific discovery and drug development which will ultimately drive patient benefit. Cornerstone has built a platform that enables sponsors to make RWD decisions in a consistent and objective manner as well as improve the underlying data quality so that industry can accelerate the process of bringing new cures to market.”

Daniel Lane
Real World Data Expert & Strategist, Novartis

"Cornerstone AI's software serves as a valuable automated quality platform we can continue to use to detect and make further quality improvements to existing and new datasets."

Dan Levy
Chief Data Officer, OM1

Frequently Asked Questions

The types of issues commonly observed include:
- Date sequencing errors and dates too far in the past
- Clinical inconsistencies across records or tables
- Biologically implausible values
- Misplaced values
- Unstandardized or mis-standardized text fields
- Tokenization errors
- Clinically duplicate records
For each anomaly, the system identifies the likely erroneous record(s) as well as an interpretable reason for the issue, such as those listed above.
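
To make the issue types above concrete, here is a minimal Python sketch of how a few of them (date sequencing, dates too far in the past, biologically implausible values, and duplicate records) could be flagged, with a record ID and a readable reason for each. It is an illustration only and not the Cornerstone AI algorithm, which is described as self-learning rather than rule-based. The Visit layout, the field names, the find_anomalies helper, and the thresholds are all hypothetical assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


# Hypothetical record layout; field names are illustrative, not Cornerstone's schema.
@dataclass(frozen=True)
class Visit:
    record_id: str
    patient_id: str
    birth_date: date
    visit_date: date
    heart_rate_bpm: Optional[float]


# Assumed thresholds, chosen only for illustration.
EARLIEST_PLAUSIBLE_DATE = date(1900, 1, 1)
HEART_RATE_RANGE = (20.0, 300.0)


def find_anomalies(visits):
    """Return (record_id, reason) pairs for a few simple rule-based checks."""
    anomalies = []
    seen = {}  # maps the clinical content of a visit to the first record carrying it
    for v in visits:
        # Date sequencing: a visit cannot precede the patient's birth.
        if v.visit_date < v.birth_date:
            anomalies.append((v.record_id, "date sequencing error: visit precedes birth"))
        # Dates too far in the past.
        if v.birth_date < EARLIEST_PLAUSIBLE_DATE:
            anomalies.append((v.record_id, "date too far in the past"))
        # Biologically implausible values.
        if v.heart_rate_bpm is not None:
            low, high = HEART_RATE_RANGE
            if not low <= v.heart_rate_bpm <= high:
                anomalies.append((v.record_id, "biologically implausible value: heart rate"))
        # Duplicates: same patient, date, and measurement under a different record ID.
        key = (v.patient_id, v.visit_date, v.heart_rate_bpm)
        if key in seen:
            anomalies.append((v.record_id, f"clinically duplicate record of {seen[key]}"))
        else:
            seen[key] = v.record_id
    return anomalies


visits = [
    Visit("r1", "p1", date(1980, 5, 1), date(2021, 3, 2), 72.0),
    Visit("r2", "p1", date(1980, 5, 1), date(2021, 3, 2), 72.0),   # same content as r1
    Visit("r3", "p2", date(1990, 1, 1), date(1985, 6, 1), 80.0),   # visit before birth
    Visit("r4", "p3", date(1875, 1, 1), date(2020, 1, 1), 640.0),  # old date, impossible heart rate
]

for record_id, reason in find_anomalies(visits):
    print(record_id, "->", reason)
```

Fixed rules like these only catch problems someone anticipated in advance. They are shown here to illustrate the "record plus interpretable reason" output format, not the detection method itself.
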
The Cornerstone system supports clinical data related to any indication. The Cornerstone AI algorithms are self-learning and therefore indication-agnostic. Types of clinical data we’ve worked with include Rheumatoid Arthritis, Infertility, Oncology, Knee Surgery Recovery, COVID, Alzheimer's Disease, and healthy control cohorts.
Our platform is built for and has demonstrated success with Real World Data (RWD) as well as data from patient registries, and Phase II, III, and IV clinical trials.
We’ve helped improve data quality in datasets ranging from fewer than 100 patients to more than 100,000 patients.
Having worked with a large number of healthcare datasets from many sources, we’ve seen a wide variety of data problems. The good news is that many of these are fixable with adjustments to the data processing pipeline. This is valuable for our customers who are data providers, and for our biopharma customers who have licensed data and can provide targeted feedback to those providers. For example, we can categorize errors that come from incorrectly merged source tables, from incomplete parsing of JSON-formatted data, or from incorrect NLP of unstructured information in notes.
