Research Foundations

VoxKit interprets and acts on findings from key speech pathology research studies to support research teams and reduce technical barriers.

MFA Approaches Human-Level Reliability

Mahr et al., 2021

Key Finding

MFA-SAT reached 86% accuracy on child speech (ages 3-6), the only system approaching human interrater agreement.

VoxKit Implementation

VoxKit defaults to MFA while supporting alternative engines for comparative research.

Critical Consideration

MFA-SAT was trained on adult speech; researchers should validate performance on their specific datasets.

Phoneme Class Reliability Varies

Mahr et al., 2021

Key Finding

Vowels showed 83% accuracy across systems. Fricative accuracy improved significantly with child age (odds ratio, OR = 1.29 per year).
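To make the per-year odds ratio concrete, the sketch below converts a baseline accuracy into odds, multiplies the odds by 1.29 for each additional year of age, and converts back to a probability. The function name `accuracy_after_years` and the 50% baseline are illustrative assumptions; the baseline is not a figure from the study.

```python
def accuracy_after_years(baseline_prob: float, odds_ratio_per_year: float, years: float) -> float:
    """Scale the baseline odds by the per-year odds ratio and return a probability."""
    odds = baseline_prob / (1.0 - baseline_prob)      # probability -> odds
    scaled = odds * odds_ratio_per_year ** years      # odds multiply once per year
    return scaled / (1.0 + scaled)                    # odds -> probability


# Hypothetical baseline of 50% accuracy, chosen only to illustrate the arithmetic.
for years in range(4):
    print(years, round(accuracy_after_years(0.50, 1.29, years), 3))
```

Because the ratio compounds, three additional years multiply the odds by about 2.15 (1.29 cubed), not by 3 × 1.29.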

VoxKit Implementation

VoxKit tracks alignment metadata and speaker age, enabling age-stratified accuracy analysis.

Critical Consideration

These patterns emerged from elicited single-word productions and may not generalize to spontaneous speech.
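As a rough illustration of the age-stratified analysis described above, the sketch below groups per-phone agreement flags by whole-year age and reports accuracy per group. The record fields (`age_years`, `correct`) and the function name are hypothetical and are not VoxKit's actual output schema; the records are made up.

```python
from collections import defaultdict


def accuracy_by_age(records):
    """Return {age_in_whole_years: fraction_correct} for a list of record dicts."""
    totals = defaultdict(lambda: [0, 0])  # age -> [correct_count, total_count]
    for rec in records:
        bucket = totals[int(rec["age_years"])]  # truncate to whole years
        bucket[0] += 1 if rec["correct"] else 0
        bucket[1] += 1
    return {age: correct / total for age, (correct, total) in sorted(totals.items())}


# Made-up records for illustration only.
records = [
    {"age_years": 3.4, "correct": True},
    {"age_years": 3.9, "correct": False},
    {"age_years": 5.1, "correct": True},
    {"age_years": 5.6, "correct": True},
]
print(accuracy_by_age(records))
```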

Clinical AI Often Overfits Small Datasets

Berisha & Liss, 2024

Key Finding

Most clinical speech datasets contain only minutes to hours of audio with uncertain labels, leading to poor generalization.

VoxKit Implementation

VoxKit enables comprehensive metadata tracking and versioned provenance for transparent reporting.

Critical Consideration

VoxKit facilitates rigorous documentation but cannot solve fundamental overfitting; researchers must still apply appropriate validation.
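One common safeguard, offered here as an assumption and not as something the cited paper or VoxKit prescribes, is to split data by speaker so that no speaker appears in both the training and test sets. A minimal sketch with hypothetical field and function names:

```python
import random


def split_by_speaker(items, test_fraction=0.2, seed=0):
    """Split items into (train, test) so that each speaker lands in exactly one set."""
    speakers = sorted({item["speaker"] for item in items})
    rng = random.Random(seed)            # fixed seed keeps the split reproducible
    rng.shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])
    train = [item for item in items if item["speaker"] not in test_speakers]
    test = [item for item in items if item["speaker"] in test_speakers]
    return train, test


# Made-up items: five speakers, three recordings each.
items = [{"speaker": f"S{n}", "clip": k} for n in range(5) for k in range(3)]
train, test = split_by_speaker(items)
```

Splitting at the recording level instead would let a model score well simply by recognizing speakers it has already seen, which is one way small datasets produce optimistic results.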

Interpretable Measures Outperform Black Boxes

Berisha & Liss, 2024

Key Finding

Low-dimensional, clinically grounded measures (e.g., hypernasality, articulatory precision) outperform opaque embeddings.

VoxKit Implementation

VoxKit's analyzer architecture supports custom, interpretable feature extraction from alignment outputs.
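For a sense of what an interpretable feature derived from alignment output can look like, the sketch below computes mean vowel duration from a list of `(label, start, end)` intervals. The interval format, the vowel subset, and the function name are assumptions made for illustration; this is not VoxKit's analyzer API.

```python
from statistics import mean

# Illustrative subset of ARPABET vowel labels, not a complete inventory.
VOWELS = {"AA", "AE", "AH", "IY", "UW"}


def mean_vowel_duration(intervals):
    """Return the mean duration in seconds of vowel intervals, or None if there are none."""
    durations = [
        end - start
        for label, start, end in intervals
        if label.rstrip("012") in VOWELS  # drop trailing stress digits, e.g. "AE1" -> "AE"
    ]
    return mean(durations) if durations else None


# Made-up alignment of the word "cat" for illustration only.
intervals = [("K", 0.00, 0.08), ("AE1", 0.08, 0.20), ("T", 0.20, 0.27)]
print(mean_vowel_duration(intervals))
```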

Critical Consideration

Alignment errors can propagate to downstream measures. Researchers must validate that phonetic boundaries are reliable for their specific analyses.
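One way to check boundary reliability is to compare automatic boundaries with a hand-corrected subset and report the share that fall within a tolerance. The sketch below assumes the two lists are already paired one-to-one; the 20 ms default is a commonly used tolerance chosen here for illustration, and the function name is hypothetical.

```python
def boundary_agreement(auto_bounds, manual_bounds, tolerance=0.020):
    """Return the fraction of paired boundaries (in seconds) that differ by at most `tolerance`."""
    if len(auto_bounds) != len(manual_bounds):
        raise ValueError("boundary lists must be paired one-to-one")
    if not auto_bounds:
        return None
    hits = sum(abs(a - m) <= tolerance for a, m in zip(auto_bounds, manual_bounds))
    return hits / len(auto_bounds)


# Made-up boundary times for illustration only.
auto = [0.10, 0.25, 0.40]
manual = [0.11, 0.30, 0.41]
print(boundary_agreement(auto, manual))
```

A suitable tolerance depends on the downstream measure: duration-based features are more sensitive to small boundary shifts than features averaged over a whole segment.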

VoxKit's Research-Driven Design

Guided workflows: Guidance and layout can be customized to fit the direction and use case of a specific study

Flexible architecture: Base classes and modular components allow developers to extend and adapt VoxKit

Metadata-rich outputs: Metadata tracking enables reportable results and reduces overhead

Honest positioning: VoxKit is research infrastructure, not a clinical decision-support system; it reduces technical barriers while maintaining scientific rigor

VoxKit prioritizes usability, flexibility, and transparency so that researchers can work with cutting-edge methods.