This is a partial copy of the CoVoST2 dataset. The main difference is that the audio data is included in the dataset itself, which makes it easier to use and allows browsing the samples with the HF Dataset Viewer. The drawback of this approach is that every audio sample in the EN_XX subsets is duplicated, which increases the size of the dataset. For that reason, not all of the data is included: only the validation and test subsets are available, and from the XX_EN subsets only fr, es, and zh-CN are included.
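The availability rules above can be sketched as a small check that runs before loading. This is only an illustration under stated assumptions: the `<source>_<target>` config naming (for example `fr_en`, `en_de`) is assumed rather than taken from this card, the helper names `is_available` and `load_subset` are hypothetical, and the sketch assumes no EN_XX subset is excluded because the card lists none. The repository ID is not given here, so it is left as a parameter for the caller to supply.

```python
# Splits and XX_EN source languages stated in the description above.
AVAILABLE_SPLITS = ("validation", "test")
AVAILABLE_XX_EN_SOURCES = ("fr", "es", "zh-CN")


def is_available(config: str, split: str) -> bool:
    """Return True if the description implies this subset and split are included.

    Assumption: `config` looks like "<source>_<target>", e.g. "fr_en" or "en_de".
    """
    if split not in AVAILABLE_SPLITS:
        return False
    source, _, target = config.partition("_")
    if source == "en":
        # Assumption: the card lists no exclusions among the EN_XX subsets.
        return bool(target) and target != "en"
    return target == "en" and source in AVAILABLE_XX_EN_SOURCES


def load_subset(repo_id: str, config: str, split: str):
    """Load one subset with Hugging Face `datasets`; `repo_id` comes from the caller."""
    if not is_available(config, split):
        raise ValueError(f"{config!r}/{split!r} is not part of this partial copy")
    # Imported lazily so the availability check works without `datasets` installed.
    from datasets import load_dataset
    return load_dataset(repo_id, config, split=split)
```

Checking availability first gives a clear error for a missing subset, such as a `train` split or a `de_en` config, instead of a failed download.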