Dataset Comparison

Last updated: Apr 17, 2026, 6:46 AM EDT

This page compares the four primary result groups side by side. Each group is a separately defined dataset, and the quality criteria become progressively less restrictive from group 1 to group 4. Researchers can use this comparison to assess how data cleaning decisions affect statistical conclusions.

Sample Overview

#	Result Group	N	Definition
1	Conservative Clean	89	Prolific APPROVED + all quality checks (IRI, duration >= 540s, reCAPTCHA, straightlining, auth)
2	Flexible Clean	140	Prolific APPROVED + basic quality (all 3 IRIs + duration >= 480s)
3	Prolific Accepted	261	All deduplicated V2 rows with Prolific APPROVED status
4	All V2 Finished	410	Finished + duration >= 120s (extreme speeders excluded)
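
The four definitions above can be read as row-level predicates. The following is a minimal sketch, not the pipeline actually used: every field name is hypothetical, it assumes rows are already deduplicated V2 responses, and it assumes the "IRI" check in Conservative Clean means passing all 3 IRIs, as in Flexible Clean.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Response:
    # All field names are hypothetical; the real export columns may differ.
    finished: bool
    duration_s: float
    prolific_approved: bool
    iris_passed: int         # number of the 3 IRI checks passed
    recaptcha_ok: bool
    no_straightlining: bool
    auth_ok: bool


def groups_for(r: Response) -> List[str]:
    """Return every result group whose stated definition the response meets."""
    groups = []
    basic_quality = r.prolific_approved and r.iris_passed == 3
    if (basic_quality and r.duration_s >= 540 and r.recaptcha_ok
            and r.no_straightlining and r.auth_ok):
        groups.append("Conservative Clean")
    if basic_quality and r.duration_s >= 480:
        groups.append("Flexible Clean")
    if r.prolific_approved:
        groups.append("Prolific Accepted")
    if r.finished and r.duration_s >= 120:
        groups.append("All V2 Finished")
    return groups
```

Returning all matching groups, rather than a single label, avoids assuming that the groups are strictly nested.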

Core Metrics Comparison

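Descriptive statistics, reliability coefficients (Cronbach’s α, labeled Alpha), and correlations are shown for each result group. B, R, and M abbreviate Barriers, Readiness, and Maturity. Each Δ column shows that group's value minus the Conservative Clean value (the strictest dataset). Deltas larger than 0.05 in magnitude are highlighted in amber.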

Metric	Conservative Clean	Flexible Clean	Prolific Accepted	All V2 Finished	Δ Flexible	Δ Prolific	Δ All
Barrier Grand Mean	2.8354	2.8135	2.7944	2.7591	-0.0219	-0.0410	-0.0763
Barrier SD	0.6252	0.7115	0.7092	0.7658	+0.0863	+0.0840	+0.1406
Readiness Grand Mean	3.0520	3.0862	3.1260	3.2284	+0.0342	+0.0740	+0.1764
Readiness SD	0.5643	0.6573	0.6701	0.7194	+0.0930	+0.1058	+0.1551
Maturity Grand Mean	3.0526	3.0656	3.1530	3.2593	+0.0130	+0.1004	+0.2067
Maturity SD	0.6988	0.8064	0.8072	0.8074	+0.1076	+0.1084	+0.1086
B-R Correlation	-0.4265	-0.4485	-0.3457	-0.3042	-0.0220	+0.0808	+0.1223
B-M Correlation	-0.1783	-0.3141	-0.2815	-0.3189	-0.1358	-0.1032	-0.1406
R-M Correlation	0.5783	0.7065	0.7208	0.7235	+0.1282	+0.1425	+0.1452
Alpha Barriers	0.8535	0.8757	0.8764	0.8997	+0.0222	+0.0229	+0.0462
Alpha Readiness	0.8677	0.9171	0.9183	0.9317	+0.0494	+0.0506	+0.0640
Alpha Maturity	0.8291	0.8871	0.8899	0.8909	+0.0580	+0.0608	+0.0618
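
The Δ columns can be recomputed from the first four value columns. The sketch below does this for two rows of the table; the function names are illustrative, and applying the 0.05 amber threshold to the absolute value of the delta is an assumption.

```python
AMBER_THRESHOLD = 0.05  # threshold from the page; use of |delta| is assumed


def deltas(row):
    """row: values ordered Conservative, Flexible, Prolific, All V2 Finished."""
    baseline = row[0]
    return [round(value - baseline, 4) for value in row[1:]]


def amber_flags(row):
    """True where a delta would be highlighted under the assumed rule."""
    return [abs(d) > AMBER_THRESHOLD for d in deltas(row)]


barrier_sd = [0.6252, 0.7115, 0.7092, 0.7658]
alpha_barriers = [0.8535, 0.8757, 0.8764, 0.8997]

print(deltas(barrier_sd), amber_flags(barrier_sd))
print(deltas(alpha_barriers), amber_flags(alpha_barriers))
```

Under this rule all three Barrier SD deltas are flagged, while none of the Alpha Barriers deltas are.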

Survey Demographics Comparison (Qualtrics)

This section shows the role distribution and organization size breakdown for each result group, based on self-reported survey responses (Q1, Q4). These are organizational demographics from the TABS instrument. Prolific platform demographics (age, sex, ethnicity, plus prescreener fields such as industry, company size, and occupation) are collected separately and can be used to cross-validate the survey responses via Prolific Participant ID.

Tech vs Non-Tech Composition

Result Group	N	Technical	Non-Technical	% Tech
Conservative Clean	89	19	70	21.3%
Flexible Clean	140	34	106	24.3%
Prolific Accepted	261	61	200	23.4%
All V2 Finished	410	108	302	26.3%

Organization Size Distribution

Result Group	<100	100-499	500-999	1000-4999	5000-9999	10000+
Conservative Clean	11	31	8	22	6	11
Flexible Clean	19	44	18	34	8	17
Prolific Accepted	45	70	35	53	21	37
All V2 Finished	59	119	62	81	33	56

Filter Bias Analysis

This analysis tests whether stricter quality filters disproportionately exclude certain demographics. A chi-square test of independence is computed across the four result groups for role, organization size, and profit model; a non-significant result indicates that the demographic mix does not differ detectably between groups.

Demographic Category	Chi-Square (χ²)	df	p-value	Interpretation
Role (Tech vs Non-Tech)	1.39	3	0.7072	No significant difference (demographics stable)
Organization Size	8.37	15	0.9079	No significant difference (demographics stable)
Profit Model	3.38	6	0.7605	No significant difference (demographics stable)
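
As a check on the Role row, the Pearson chi-square statistic can be recomputed from the Tech vs Non-Tech counts reported on this page. The sketch below treats the four result groups as rows of a 4 x 2 contingency table; the function name is illustrative, and this is one plausible reading of how the page's values were produced.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and category
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: Conservative, Flexible, Prolific Accepted, All V2 Finished
# Columns: Technical, Non-Technical
role_counts = [[19, 70], [34, 106], [61, 200], [108, 302]]

stat, dof = chi_square_independence(role_counts)
print(f"chi2 = {stat:.2f}, df = {dof}")  # chi2 = 1.39, df = 3
```

This agrees with the Role row (χ² = 1.39, df = 3). If SciPy is available, `scipy.stats.chi2.sf(stat, dof)` gives the corresponding p-value.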

Profit Model Distribution

Result Group	For-Profit	Non-Profit	Government/Public Sector
Conservative Clean	63	15	11
Flexible Clean	95	29	16
Prolific Accepted	192	46	23
All V2 Finished	305	70	35

Effect Size Comparison

Cohen’s d effect sizes for key group comparisons, reported for each of the four result groups, show how effect sizes shift as the sample becomes less restrictive.

Tech vs Non-Tech (Cohen’s d)

Construct	Conservative Clean	Flexible Clean	Prolific Accepted	All V2 Finished
barriers	-0.093	-0.264	-0.045	-0.132
readiness	0.438	0.578	0.555	0.455
maturity	0.263	0.384	0.363	0.321

Large vs Small/Medium Org (Cohen’s d)

Construct	Conservative Clean	Flexible Clean	Prolific Accepted	All V2 Finished
barriers	0.515	0.371	0.157	0.162
readiness	-0.065	-0.166	0.100	0.069
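
For reference, Cohen’s d is commonly computed as the difference in group means divided by the pooled standard deviation. The sketch below uses that definition; whether this page uses the pooled form, and the sign convention (first group minus second), are assumptions, and the scores are made-up illustrative values, not study data.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical scores, for illustration only
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
print(round(cohens_d(group_a, group_b), 4))  # -0.6325
```

A negative value means the first group scored lower on average than the second.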

Interpretation Guide

Metrics that remain stable across all four groups suggest robust findings that are not sensitive to data cleaning decisions. Metrics that show large deltas (highlighted in amber) between Conservative Clean and less restrictive groups warrant further investigation, as the finding may depend on sample composition.

As a rule of thumb: if a Cohen’s d shifts by more than 0.1 between Conservative Clean and All V2 Finished, the effect size may be inflated or attenuated by lower-quality responses. Similarly, if Cronbach’s α drops below 0.70 in larger samples, it may indicate that less-engaged respondents are adding noise to the scale.
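
The two rules of thumb can be applied directly to the values reported on this page. The sketch below is illustrative: the variable names are hypothetical, and comparing the absolute shift in d against 0.1 is an assumed reading of "shifts by more than 0.1".

```python
SHIFT_THRESHOLD = 0.1   # Cohen's d shift, Conservative Clean vs All V2 Finished
ALPHA_FLOOR = 0.70      # Cronbach's alpha floor from the rule of thumb

# (comparison, construct): (Conservative Clean d, All V2 Finished d)
cohens_d_table = {
    ("Tech vs Non-Tech", "barriers"): (-0.093, -0.132),
    ("Tech vs Non-Tech", "readiness"): (0.438, 0.455),
    ("Tech vs Non-Tech", "maturity"): (0.263, 0.321),
    ("Large vs Small/Medium Org", "barriers"): (0.515, 0.162),
    ("Large vs Small/Medium Org", "readiness"): (-0.065, 0.069),
}

# Alpha per scale, ordered Conservative, Flexible, Prolific, All V2 Finished
alphas = {
    "Barriers": [0.8535, 0.8757, 0.8764, 0.8997],
    "Readiness": [0.8677, 0.9171, 0.9183, 0.9317],
    "Maturity": [0.8291, 0.8871, 0.8899, 0.8909],
}

flagged_d = [
    key for key, (conservative, all_v2) in cohens_d_table.items()
    if abs(all_v2 - conservative) > SHIFT_THRESHOLD
]
low_alpha = [name for name, values in alphas.items() if min(values) < ALPHA_FLOOR]

print(flagged_d)
print(low_alpha)
```

With the reported values, only the two Large vs Small/Medium Org effect sizes exceed the 0.1 shift, and no scale falls below the 0.70 floor in any group.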

Key Findings - effect sizes, t-tests, and ANOVA per result group
Sensitivity Analysis - metric-level sensitivity across sample definitions
Descriptive Statistics - correlation matrices per result group