---
language:
- en
- hi
- or
- bn
- ta
- te
- kn
- ml
- mr
- gu
license: cc-by-nc-4.0
pipeline_tag: audio-classification
library_name: transformers
tags:
- language-identification
- indian-languages
- multilingual
- speech
- asr-preprocessing
- callcenter-ai
- speech-analytics
- audio-classification
- wav2vec2
- transformers
- pytorch
- huggingface
---
**Model Name:** open-vakgyata

**Model Overview:**
open-vakgyata is an open-source language identification model that detects which of ten languages spoken in India is used in a speech input.

**Supported Languages:**

| Language        | Code  |
|-----------------|-------|
| English (India) | en-IN |
| Hindi           | hi-IN |
| Odia            | or-IN |
| Bengali         | bn-IN |
| Tamil           | ta-IN |
| Telugu          | te-IN |
| Kannada         | kn-IN |
| Malayalam       | ml-IN |
| Marathi         | mr-IN |
| Gujarati        | gu-IN |

**Specification:**
- Supported sampling rate: 16000 Hz
- Recommended audio format: 16 kHz, 16-bit PCM

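Before running the model, it can help to confirm that a WAV file already matches the recommended format. The sketch below uses only Python's standard `wave` module. The helper name `matches_recommended_format` is hypothetical and not part of this repository, and it only inspects the header of uncompressed PCM WAV files.

```python
import wave

TARGET_RATE = 16000   # supported sampling rate in Hz
TARGET_WIDTH = 2      # bytes per sample, i.e. 16-bit PCM


def matches_recommended_format(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit PCM.

    Hypothetical helper for illustration; not part of open-vakgyata.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == TARGET_RATE
            and wav.getsampwidth() == TARGET_WIDTH
        )
```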
**Usage:**

```py
from transformers import Wav2Vec2ForSequenceClassification, AutoFeatureExtractor
import torch

device = "cpu"  # or "cuda"

model_id = "onecxi/open-vakgyata"

processor = AutoFeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id).to(device)
```

**Inference:**

```py
import torchaudio

audio, sr = torchaudio.load("path/to/audio.wav")  # shape: (channels, samples)

# Downmix to mono and resample to the 16 kHz rate the model expects
audio = audio.mean(dim=0)
if sr != 16000:
    audio = torchaudio.functional.resample(audio, orig_freq=sr, new_freq=16000)
    sr = 16000

# Process the waveform and move to the appropriate device
inputs = processor(audio.numpy(), sampling_rate=sr, return_tensors="pt").to(device)

# Perform inference
with torch.no_grad():
    logits = model(**inputs).logits

# Get language probabilities and pick the most likely label
probs = logits.softmax(dim=-1).cpu().numpy()
language = model.config.id2label[int(probs.argmax())]

print(language)
```

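The inference snippet prints only the single most likely language. When a confidence estimate or a ranked list of candidates is useful, the logits can be turned into probabilities and sorted. The following is a minimal, dependency-free sketch of that step. The function name `top_k_languages`, the example logits, and the three-entry label mapping are all hypothetical; in practice the inputs would come from `logits[0].tolist()` and `model.config.id2label`.

```python
import math


def top_k_languages(logits, id2label, k=3):
    """Rank labels by softmax probability and return the top k as (label, prob)."""
    # Subtract the max logit for numerical stability before exponentiating
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices from most to least probable
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical logits and label mapping, for illustration only;
# in practice use logits[0].tolist() and model.config.id2label.
example_logits = [0.2, 2.5, -1.0]
example_labels = {0: "en-IN", 1: "hi-IN", 2: "ta-IN"}

for label, prob in top_k_languages(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```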
---

## **Citation**

If you use this model in your research or application, please consider citing the model and its base source:

```
@misc{vakgyata2024,
  title={vakgyata: Language Identification for Indian Speech},
  author={OneCXI},
  year={2024},
  url={https://huggingface.co/onecxi/open-vakgyata}
}
```

---