Model Hub
Browse PQC-verified AI models, datasets, and tools
Multilingual Speech Commands Dataset (15 Languages, Augmented)
This dataset contains augmented speech command samples in 15 languages, derived from multiple public datasets. Only commands that overlap with the Google Speech Commands (GSC) vocabulary are included, which makes the dataset suitable for multilingual keyword spotting tasks aligned with GSC-style classification. Audio samples have been augmented using standard audio techniques to improve model robustness (e.g., time-shifting… See the full description on the dataset page: https://huggingface.co/datasets/artur-muratov/multilingual-speech-commands-15lang.
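The description names time-shifting as one of the augmentations but does not say how it was applied. As a rough illustration only, the sketch below shows one common form of time-shifting: moving a waveform left or right by a few samples and padding the gap with silence. The function names `shift_by` and `time_shift` and the `max_shift` parameter are hypothetical and are not taken from the dataset's own tooling.

```python
import random


def shift_by(samples, shift):
    """Shift a waveform by `shift` samples, zero-padding the vacated region.

    Positive `shift` delays the audio (silence at the start);
    negative `shift` advances it (silence at the end).
    The output always has the same length as the input.
    """
    n = len(samples)
    if shift == 0 or n == 0:
        return list(samples)
    if abs(shift) >= n:
        # The whole clip is shifted out of the window.
        return [0.0] * n
    if shift > 0:
        return [0.0] * shift + list(samples[: n - shift])
    return list(samples[-shift:]) + [0.0] * (-shift)


def time_shift(samples, max_shift, rng=None):
    """Apply a random shift drawn uniformly from [-max_shift, max_shift].

    `rng` is an optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    return shift_by(samples, rng.randint(-max_shift, max_shift))


if __name__ == "__main__":
    clip = [0.1, 0.2, 0.3, 0.4, 0.5]
    print(shift_by(clip, 2))
    print(time_shift(clip, max_shift=2, rng=random.Random(0)))
```

Zero-padding keeps the clip length fixed, which matters for GSC-style classifiers that expect fixed-length inputs; wrapping the audio around instead is another option some pipelines use.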
DartLab Data
Structured company data from DART and EDGAR disclosure filings: 2,700 Korean companies and 970 US companies.
What is this? Pre-collected Parquet files from DartLab, a Python library that turns DART (Korea) and EDGAR (US) disclosure filings into one structured company map. The files contain corporate disclosure data collected from Korea's DART electronic disclosure system and the US SEC EDGAR. This dataset is the data layer behind DartLab. When you run dartlab.Company("005930"), the library automatically downloads the… See the full description on the dataset page: https://huggingface.co/datasets/eddmpython/dartlab-data.