Enabling cross-functional teams to conduct phonetic research with ease.
To accelerate discovery and improve care in speech-language pathology by providing an intuitive, transparent platform for advanced forced alignment, pronunciation assessment, and phonetic research. Accessible to every researcher, not just programmers.
Abstract base class patterns allow for the rapid addition of new alignment toolkits and graphical components.
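As a rough illustration of the abstract base class pattern, the sketch below shows how a new alignment toolkit could plug into a shared interface. The names `AlignmentEngine`, `UniformEngine`, `register`, and `REGISTRY` are hypothetical and are not taken from VoxKit's codebase.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple, Type

Interval = Tuple[str, float, float]  # (label, start_seconds, end_seconds)


class AlignmentEngine(ABC):
    """Hypothetical base class; VoxKit's real interface may differ."""

    name: str = "base"

    @abstractmethod
    def align(self, audio_path: str, transcript: str) -> List[Interval]:
        """Return one labelled time interval per unit in the transcript."""


# Hypothetical registry so a GUI could list engines by name.
REGISTRY: Dict[str, Type[AlignmentEngine]] = {}


def register(engine_cls: Type[AlignmentEngine]) -> Type[AlignmentEngine]:
    REGISTRY[engine_cls.name] = engine_cls
    return engine_cls


@register
class UniformEngine(AlignmentEngine):
    """Toy engine: splits a known duration evenly across the words."""

    name = "uniform"

    def __init__(self, duration: float) -> None:
        self.duration = duration

    def align(self, audio_path: str, transcript: str) -> List[Interval]:
        words = transcript.split()
        if not words:
            return []
        step = self.duration / len(words)
        return [(w, i * step, (i + 1) * step) for i, w in enumerate(words)]


engine = REGISTRY["uniform"](duration=2.0)
intervals = engine.align("sample.wav", "hello world")
```

Because the base class declares `align` as abstract, a subclass that forgets to implement it cannot be instantiated, which catches incomplete toolkit integrations early.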
Quickly adjust visual flow and guidance to fit the needs of specific research studies and user groups.
Clean separation between GUI, storage, engine, and analyzer layers enables independent development and testing of each component.
Every dataset, model, and alignment stores provenance information for reproducibility and data sharing.
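A provenance record of this kind might look like the following minimal sketch. The `Provenance` class and its field names are assumptions made for illustration, not VoxKit's actual schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record attached to a stored artifact."""

    artifact_type: str          # e.g. "dataset", "model", or "alignment"
    engine: str                 # name of the toolkit that produced it
    parameters: dict            # settings used for the run
    parent_ids: tuple = ()      # identifiers of the inputs it was derived from
    created_at: str = field(default_factory=_utc_now)

    def fingerprint(self) -> str:
        """Hash everything except the timestamp, so identical runs match."""
        payload = asdict(self)
        payload.pop("created_at")
        blob = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()


record = Provenance(
    artifact_type="alignment",
    engine="demo-engine",
    parameters={"beam": 10},
    parent_ids=("dataset-1", "model-1"),
)
digest = record.fingerprint()
```

Excluding the timestamp from the fingerprint is a design choice in this sketch: two runs with the same inputs and settings then produce the same digest, which makes it easy to check whether a shared result was generated the same way.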
VoxKit's modular API integrates speech processing engines (toolkits), ranging from established libraries to cutting-edge models.
Industry-standard forced alignment with speaker-adaptive training. Achieves human-level reliability on diverse speech samples.
Tools:
Alternative alignment engine implementing state-of-the-art Wav2Vec2-based phonetic alignment.
Tools:
Lightweight automatic speech recognition (ASR) engine for transcription tasks, optimized for speed and efficiency.
Tools:
Multi-step workflows for common research tasks, from model training to alignment generation and pronunciation assessment. Stackers can be added, removed, and reordered to build custom pipelines.
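The idea of composing a pipeline from ordered steps can be sketched as follows. `Pipeline`, `make_step`, and the step names are hypothetical stand-ins. Each placeholder step only records that it ran, whereas a real step would call an engine.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Callable[[Dict], Dict]


class Pipeline:
    """Hypothetical ordered collection of named steps."""

    def __init__(self) -> None:
        self._steps: List[Tuple[str, Step]] = []

    def add(self, name: str, step: Step, position: Optional[int] = None) -> "Pipeline":
        """Append a step, or insert it at a given index."""
        if position is None:
            self._steps.append((name, step))
        else:
            self._steps.insert(position, (name, step))
        return self

    def remove(self, name: str) -> "Pipeline":
        """Drop every step registered under this name."""
        self._steps = [(n, s) for n, s in self._steps if n != name]
        return self

    def names(self) -> List[str]:
        return [n for n, _ in self._steps]

    def run(self, context: Dict) -> Dict:
        """Pass the context through each step in order."""
        for _, step in self._steps:
            context = step(context)
        return context


def make_step(label: str) -> Step:
    """Placeholder step that records its label in the context log."""
    def step(context: Dict) -> Dict:
        return {**context, "log": context.get("log", []) + [label]}
    return step


pipeline = Pipeline()
pipeline.add("train", make_step("train")).add("align", make_step("align"))
pipeline.add("score", make_step("score"))
pipeline.add("transcribe", make_step("transcribe"), position=0)
pipeline.remove("train")  # e.g. when a pretrained model is used instead
result = pipeline.run({"dataset": "demo"})
```

Here each step receives and returns a plain dictionary, so steps stay independent of one another and can be rearranged without code changes.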
Generate text transcriptions from audio datasets using integrated ASR engines.
Workflow Steps:
Train custom acoustic models on your datasets with configurable hyperparameters.
Workflow Steps:
Generate phoneme-level alignments using trained or pretrained models.
Workflow Steps:
Extract Goodness of Pronunciation scores for pronunciation assessment and speech disorder analysis.
Workflow Steps:
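For readers unfamiliar with Goodness of Pronunciation (GOP), one common posterior-based simplification scores a phone as the average log posterior of the expected phone over the frames aligned to it. The sketch below assumes that formulation. Whether VoxKit computes GOP this way is not stated here, and `gop_score` and its inputs are hypothetical.

```python
import math
from typing import Dict, Sequence


def gop_score(
    frame_posteriors: Sequence[Dict[str, float]],
    phone: str,
    floor: float = 1e-10,
) -> float:
    """Average log posterior of `phone` across its aligned frames.

    Each element of `frame_posteriors` maps phone labels to posterior
    probabilities for one frame. Scores are <= 0; values closer to 0
    indicate the acoustic model agrees more with the expected phone.
    """
    if not frame_posteriors:
        raise ValueError("at least one frame is required")
    total = 0.0
    for frame in frame_posteriors:
        # Floor the probability to avoid log(0) for missing phones.
        total += math.log(max(frame.get(phone, 0.0), floor))
    return total / len(frame_posteriors)


# Toy posteriors for a segment where the expected phone is "s".
clear = [{"s": 0.9, "z": 0.1}, {"s": 0.8, "z": 0.2}]
unclear = [{"s": 0.3, "z": 0.7}, {"s": 0.2, "z": 0.8}]
clear_score = gop_score(clear, "s")
unclear_score = gop_score(unclear, "s")
```

In this toy example the clearly produced segment scores closer to zero than the unclear one, which is the ordering a threshold-based assessment would rely on.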
Extract structured metadata from datasets at registration time, enabling quality assurance and tailored visualization.
Extracts core dataset metadata, including file counts and speaker count.
Analyzer Outputs:
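As a hedged sketch of what extracting file and speaker counts could involve, the function below summarizes a list of file paths. It assumes a corpus layout in which each audio file sits in a directory named after its speaker. Both that layout and the name `summarize_dataset` are assumptions for illustration.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Dict, Iterable

# Assumed set of audio extensions; adjust for the corpus at hand.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3"}


def summarize_dataset(paths: Iterable[str]) -> Dict[str, object]:
    """Count files per extension and distinct speaker directories."""
    by_extension: Counter = Counter()
    speakers = set()
    for raw in paths:
        path = PurePosixPath(raw)
        extension = path.suffix.lower()
        by_extension[extension] += 1
        if extension in AUDIO_EXTENSIONS:
            # Assumption: the parent directory name is the speaker ID.
            speakers.add(path.parent.name)
    return {
        "file_count": sum(by_extension.values()),
        "files_by_extension": dict(by_extension),
        "speaker_count": len(speakers),
    }


summary = summarize_dataset([
    "corpus/spk1/a.wav",
    "corpus/spk1/a.TextGrid",
    "corpus/spk2/b.wav",
])
```

Working from a list of paths rather than touching the filesystem keeps the sketch easy to test. A real analyzer would typically gather the paths by walking the dataset directory first.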
Extensible analyzer system allows researchers to add more...
Analyzer Outputs:
Graphical interface eliminates command-line barriers while maintaining full control over analysis parameters and model configurations.
Extensible architecture with well-documented APIs enables integration of proprietary tools and custom analysis pipelines.
Ready to explore VoxKit's capabilities in depth?