Dataset Comparison

Last updated: Jul 18, 2026, 7:28 AM EDT

This page compares the four primary result groups side by side. Each group is a separately analyzed dataset, and the groups are listed from the most to the least restrictive quality criteria. Researchers can use this comparison to assess how data cleaning decisions affect statistical conclusions.

Sample Overview

#	Result Group	N	Definition
1	Conservative Clean	157	Prolific APPROVED + all quality checks (IRI, duration >= 540s, reCAPTCHA, straightlining, auth)
2	Flexible Clean	253	Prolific APPROVED + basic quality (all 3 IRIs + duration >= 480s)
3	Prolific Accepted	476	All deduplicated V2 rows with Prolific APPROVED status
4	All V2 Finished	749	Finished + duration >= 120s (extreme speeders excluded)
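
The four definitions above can be read as predicates over a single response record. The Python below is a rough sketch of that reading, not the project's actual pipeline: the `Response` fields and function names are hypothetical, deduplication is not modeled, and it assumes the Conservative Clean "IRI" check means passing all three IRI items.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical field names; the real export columns may differ.
    prolific_status: str      # e.g. "APPROVED"
    finished: bool
    duration_s: int           # completion time in seconds
    iris_passed: int          # number of IRI checks passed (0 to 3)
    recaptcha_ok: bool
    straightlining_ok: bool
    auth_ok: bool


def all_v2_finished(r):
    # Finished, with extreme speeders (< 120 s) excluded.
    return r.finished and r.duration_s >= 120


def prolific_accepted(r):
    return r.prolific_status == "APPROVED"


def flexible_clean(r):
    return prolific_accepted(r) and r.iris_passed == 3 and r.duration_s >= 480


def conservative_clean(r):
    # Assumption: "IRI" here means all three IRI items were passed.
    return (prolific_accepted(r) and r.iris_passed == 3
            and r.duration_s >= 540 and r.recaptcha_ok
            and r.straightlining_ok and r.auth_ok)


def groups_for(r):
    """Return the names of every result group this response falls into."""
    checks = [
        ("Conservative Clean", conservative_clean),
        ("Flexible Clean", flexible_clean),
        ("Prolific Accepted", prolific_accepted),
        ("All V2 Finished", all_v2_finished),
    ]
    return [name for name, check in checks if check(r)]
```

Under these assumptions, an approved response that took 500 seconds and passed every check would fall into Flexible Clean, Prolific Accepted, and All V2 Finished, but not Conservative Clean.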

Core Metrics Comparison

Descriptive statistics, reliability coefficients (Cronbach’s α), and correlations between constructs (B = Barriers, R = Readiness, M = Maturity) are shown for each result group. Each Δ column gives that group's value minus the Conservative Clean value (the strictest dataset). Deltas greater than 0.05 are highlighted in amber.

Metric	Conservative Clean	Flexible Clean	Prolific Accepted	All V2 Finished	Δ Flexible	Δ Prolific	Δ All
Barrier Grand Mean	2.8705	2.8897	2.8913	2.8962	+0.0192	+0.0208	+0.0257
Barrier SD	0.6676	0.7220	0.7203	0.7811	+0.0544	+0.0527	+0.1135
Readiness Grand Mean	3.0373	3.0603	3.0940	3.2057	+0.0230	+0.0567	+0.1684
Readiness SD	0.5945	0.6605	0.6517	0.7188	+0.0660	+0.0572	+0.1243
Maturity Grand Mean	3.0672	3.0689	3.1215	3.2312	+0.0017	+0.0543	+0.1640
Maturity SD	0.7074	0.8057	0.7846	0.8036	+0.0983	+0.0772	+0.0962
B-R Correlation	-0.3987	-0.4272	-0.3194	-0.2552	-0.0285	+0.0793	+0.1435
B-M Correlation	-0.1655	-0.2761	-0.2358	-0.2278	-0.1106	-0.0703	-0.0623
R-M Correlation	0.5692	0.6929	0.7144	0.7153	+0.1237	+0.1452	+0.1461
Alpha Barriers	0.8660	0.8846	0.8858	0.9071	+0.0186	+0.0198	+0.0411
Alpha Readiness	0.8688	0.9083	0.9093	0.9257	+0.0395	+0.0405	+0.0569
Alpha Maturity	0.8204	0.8815	0.8779	0.8866	+0.0611	+0.0575	+0.0662
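
As a minimal sketch of how the Δ columns can be reproduced from the table, the Python below subtracts the Conservative Clean value from each other group's value and flags large deltas. The function names are illustrative, and applying the 0.05 threshold to the absolute value of the delta (so that negative shifts are also flagged) is an assumption of this sketch.

```python
AMBER_THRESHOLD = 0.05


def deltas(values):
    """Differences from the first entry (Conservative Clean), rounded to 4 places."""
    baseline = values[0]
    return [round(v - baseline, 4) for v in values[1:]]


def is_amber(delta, threshold=AMBER_THRESHOLD):
    # Assumption: the highlight applies to the magnitude of the delta.
    return abs(delta) > threshold


# Values copied from the table, in the order:
# Conservative Clean, Flexible Clean, Prolific Accepted, All V2 Finished.
rows = {
    "Barrier SD": [0.6676, 0.7220, 0.7203, 0.7811],
    "B-M Correlation": [-0.1655, -0.2761, -0.2358, -0.2278],
    "Alpha Barriers": [0.8660, 0.8846, 0.8858, 0.9071],
}

for name, values in rows.items():
    row_deltas = deltas(values)
    flags = [is_amber(d) for d in row_deltas]
    print(name, row_deltas, flags)
```

With these inputs, every Barrier SD and B-M Correlation delta exceeds the threshold in magnitude, while none of the Alpha Barriers deltas does.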

Survey Demographics Comparison (Qualtrics)

The tables below break down role and organization size for each result group, based on self-reported survey responses (Q1, Q4). These are organizational demographics from the TABS instrument. Prolific platform demographics (age, sex, ethnicity, plus prescreener fields such as industry, company size, and occupation) are collected separately and can be used to cross-validate the survey responses by matching on Prolific Participant ID.

Tech vs Non-Tech Composition

Result Group	N	Technical	Non-Technical	% Tech
Conservative Clean	157	30	127	19.1%
Flexible Clean	253	57	196	22.5%
Prolific Accepted	476	105	371	22.1%
All V2 Finished	749	186	563	24.8%

Organization Size Distribution

Result Group	<100	100-499	500-999	1000-4999	5000-9999	10000+
Conservative Clean	28	52	16	31	11	19
Flexible Clean	41	82	36	49	14	31
Prolific Accepted	83	142	70	86	39	56
All V2 Finished	121	222	119	137	59	91

Filter Bias Analysis

This analysis tests whether stricter quality filters disproportionately exclude certain demographics. A chi-square test of independence is computed across the four result groups for role, organization size, and profit model. A non-significant result means the demographic mix does not differ detectably between groups.

Demographic Category	Chi-Square (χ²)	df	p-value	Interpretation
Role (Tech vs Non-Tech)	3.01	3	0.3902	No significant difference (demographics stable)
Organization Size	6.42	15	0.9719	No significant difference (demographics stable)
Profit Model	3.25	6	0.7768	No significant difference (demographics stable)
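
The test statistics can be recomputed from the count tables on this page. The sketch below uses only the Python standard library and treats the four result groups as the rows of a contingency table; the function names are illustrative, and the p-value helper relies on the closed-form series for the chi-square upper tail rather than on any particular statistics package.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_k (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = math.erfc(math.sqrt(half))
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (k + 0.5)
    return total


# Technical / Non-Technical counts per result group, from the composition table.
role_counts = [[30, 127], [57, 196], [105, 371], [186, 563]]

stat, df = chi_square_independence(role_counts)
p_value = chi2_sf(stat, df)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

Applied to the Tech vs Non-Tech counts, this gives χ² ≈ 3.01 with df = 3, in line with the Role row of the table. The same function accepts the organization size and profit model counts unchanged.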

Profit Model Distribution

Result Group	For-Profit	Non-Profit	Government/Public Sector
Conservative Clean	111	23	23
Flexible Clean	177	42	34
Prolific Accepted	341	82	53
All V2 Finished	549	117	83

Effect Size Comparison

Cohen’s d effect sizes for key group comparisons are reported for all four result groups, showing how each effect size shifts as the sample becomes less restrictive.

Tech vs Non-Tech (Cohen’s d)

Construct	Conservative Clean	Flexible Clean	Prolific Accepted	All V2 Finished
barriers	-0.086	-0.130	-0.010	-0.066
readiness	0.509	0.612	0.604	0.497
maturity	0.174	0.331	0.367	0.368

Large vs Small/Medium Org (Cohen’s d)

Construct	Conservative Clean	Flexible Clean	Prolific Accepted	All V2 Finished
barriers	0.409	0.313	0.072	0.026
readiness	-0.221	-0.186	-0.036	0.015
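
For reference, Cohen’s d is a standardized mean difference. The sketch below uses the common pooled-standard-deviation form; the page does not state which variant or sign convention was used, so both the pooling and the "first group minus second group" direction are assumptions here, and the scores are made-up toy values rather than study data.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d (group_a minus group_b) using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Toy scores for illustration only (not taken from the survey).
group_one = [2.0, 3.0, 4.0]
group_two = [1.0, 2.0, 3.0]
print(cohens_d(group_one, group_two))
```

A positive value means the first group scored higher on average, expressed in units of the pooled standard deviation.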

Interpretation Guide

Metrics that remain stable across all four groups suggest robust findings that are not sensitive to data cleaning decisions. Metrics that show large deltas (highlighted in amber) between Conservative Clean and less restrictive groups warrant further investigation, as the finding may depend on sample composition.

As a rule of thumb: if a Cohen’s d shifts by more than 0.1 between Conservative Clean and All V2 Finished, the effect size may be inflated or attenuated by lower-quality responses. Similarly, if Cronbach’s α drops below 0.70 in larger samples, it may indicate that less-engaged respondents are adding noise to the scale.
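
The rule of thumb can be applied mechanically to the values reported on this page. The sketch below is one possible reading of it: the names are illustrative, the shift is taken as an absolute difference between Conservative Clean and All V2 Finished, and the α check covers the three larger samples.

```python
D_SHIFT_LIMIT = 0.1
ALPHA_FLOOR = 0.70

# (Conservative Clean, All V2 Finished) Cohen's d values from the tables above.
cohens_d_values = {
    ("Tech vs Non-Tech", "barriers"): (-0.086, -0.066),
    ("Tech vs Non-Tech", "readiness"): (0.509, 0.497),
    ("Tech vs Non-Tech", "maturity"): (0.174, 0.368),
    ("Large vs Small/Medium", "barriers"): (0.409, 0.026),
    ("Large vs Small/Medium", "readiness"): (-0.221, 0.015),
}

# Cronbach's alpha for Flexible Clean, Prolific Accepted, All V2 Finished.
alpha_larger_samples = {
    "Barriers": [0.8846, 0.8858, 0.9071],
    "Readiness": [0.9083, 0.9093, 0.9257],
    "Maturity": [0.8815, 0.8779, 0.8866],
}


def d_shift_exceeds(conservative, all_v2, limit=D_SHIFT_LIMIT):
    # Assumption: the shift is measured as an absolute difference.
    return abs(all_v2 - conservative) > limit


flagged_effects = [
    key for key, (conservative, all_v2) in cohens_d_values.items()
    if d_shift_exceeds(conservative, all_v2)
]
low_alpha_scales = [
    scale for scale, alphas in alpha_larger_samples.items()
    if min(alphas) < ALPHA_FLOOR
]

print("Effect sizes shifting by more than 0.1:", flagged_effects)
print("Scales with alpha below 0.70:", low_alpha_scales)
```

On the reported values, this flags the Tech vs Non-Tech maturity effect and both Large vs Small/Medium effects, and no scale falls below the α floor.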
