---
language:
- en
- hi
- or
- bn
- ta
- te
- kn
- ml
- mr
- gu
license: cc-by-nc-4.0
pipeline_tag: audio-classification
library_name: transformers
tags:
- language-identification
- indian-languages
- multilingual
- speech
- asr-preprocessing
- callcenter-ai
- speech-analytics
- audio-classification
- wav2vec2
- transformers
- pytorch
- huggingface
---
Model Name: open-vakgyata

Model Overview:
open-vakgyata is an open-source, wav2vec2-based language identification model that classifies speech input into one of the ten languages listed below: nine Indian languages plus Indian English.

Supported Languages:

| Language        | Code  |
|-----------------|-------|
| English (India) | en-IN |
| Hindi           | hi-IN |
| Odia            | or-IN |
| Bengali         | bn-IN |
| Tamil           | ta-IN |
| Telugu          | te-IN |
| Kannada         | kn-IN |
| Malayalam       | ml-IN |
| Marathi         | mr-IN |
| Gujarati        | gu-IN |

Specification:
- Supported Sampling Rate: 16000 Hz
- Recommended Audio Format: 16 kHz, 16-bit PCM

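Before running inference, it can help to confirm that a file already matches the recommended format. The following is a minimal sketch using only Python's standard `wave` module, which reads uncompressed PCM WAV files. The helper name `check_wav_format` and the mono-channel check are assumptions made for illustration and are not part of the model's API.

```python
import wave

EXPECTED_RATE_HZ = 16000   # supported sampling rate from the specification
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit PCM


def check_wav_format(path):
    """Return a list of mismatches against the recommended format (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {EXPECTED_RATE_HZ} Hz"
            )
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(
                f"sample width is {8 * wav.getsampwidth()} bits, expected 16 bits"
            )
        # Assumption: single-channel input; the specification does not state a channel count.
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels found, expected mono")
    return problems


# Example call (hypothetical path): an empty list means the file matches.
# print(check_wav_format("path/to/audio.wav"))
```
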
Usage:

```py
from transformers import Wav2Vec2ForSequenceClassification, AutoFeatureExtractor
import torch

device = "cpu"  # or "cuda"

model_id = "onecxi/open-vakgyata"

# Load the feature extractor and the classification model
processor = AutoFeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id).to(device)
```

Inference:

```py
import torchaudio

audio, sr = torchaudio.load("path/to/audio.wav")

# Downmix to mono and resample to the supported 16 kHz rate if needed
audio = audio.mean(dim=0)
if sr != 16000:
    audio = torchaudio.functional.resample(audio, sr, 16000)
    sr = 16000

# Process the waveform and move to the appropriate device
inputs = processor(audio.numpy(), sampling_rate=sr, return_tensors="pt").to(device)

# Perform inference
with torch.no_grad():
    logits = model(**inputs).logits

# Get language probabilities
probs = logits.softmax(dim=-1).cpu().numpy()
language = model.config.id2label.get(int(probs.argmax()))

print(language)
```

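The inference snippet prints only the top-scoring label. When a confidence value or the runner-up languages are useful, the softmax output can be ranked instead. A minimal sketch in plain Python follows. The function `rank_languages`, the example logits, and the three-entry `id2label` mapping are hypothetical stand-ins. With the real model, one would pass `logits[0].tolist()` and `model.config.id2label`, whose exact label strings this README does not list.

```python
import math


def rank_languages(logits, id2label, top_k=3):
    """Return the top_k (label, probability) pairs, highest probability first."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices by probability, descending.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(id2label[i], p) for i, p in ranked[:top_k]]


# Hypothetical values for illustration only.
example_id2label = {0: "hi-IN", 1: "ta-IN", 2: "bn-IN"}
example_logits = [2.0, 0.5, -1.0]

for label, prob in rank_languages(example_logits, example_id2label):
    print(f"{label}: {prob:.3f}")
```
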
---

## Citation

If you use this model in your research or application, please consider citing it:

```
@misc{vakgyata2024,
  title={vakgyata: Language Identification for Indian Speech},
  author={OneCXI},
  year={2024},
  url={https://huggingface.co/onecxi/open-vakgyata}
}
```

---