---
library_name: transformers
license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
- generated_from_trainer
- speech-recognition
- audio-classification
- voicemail-detection
model-index:
- name: wav2vec-vm-finetune
  results: []
language:
- en
metrics:
- accuracy
---

# wav2vec-vm-finetune

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) for voicemail detection. It was trained on a dataset of call recordings to distinguish between voicemail greetings and live human responses.

## Model description

This model builds on wav2vec2-xls-r-300m, a self-supervised speech model pretrained on large-scale multilingual data. We fine-tuned it on the first two seconds of audio from each call.

## Intended uses & limitations

Intended uses:

- Automated voicemail detection in AI-powered call assistants.
- Filtering voicemail responses in customer service and sales call automation.

Limitations:

- Trained only on English-language audio.
- Assumes the audio track from the answering side is isolated and contains no audio from the caller.
- Designed to run on the first two seconds of audio in a call.

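Since the model is designed for the first two seconds of audio, input clips need to be cut to that window before classification. The following is a minimal sketch of that step, not the preprocessing used in training. It assumes mono audio held as a plain sequence of floats, a 16 kHz sample rate (the rate the base checkpoint expects), and zero-padding for clips shorter than two seconds. The function name `first_seconds` and the padding choice are illustrative assumptions.

```python
def first_seconds(samples, sample_rate, seconds=2.0):
    """Return exactly `seconds` worth of samples from the start of a mono clip.

    Longer clips are truncated. Shorter clips are right-padded with silence
    (0.0); this padding is an assumption, not documented training behavior.
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    target_len = int(round(sample_rate * seconds))
    window = list(samples[:target_len])
    window.extend([0.0] * (target_len - len(window)))
    return window


# Synthetic example clips: 3 seconds and 1 second of constant signal at 16 kHz.
SAMPLE_RATE = 16_000
long_clip = [0.1] * (3 * SAMPLE_RATE)
short_clip = [0.1] * (1 * SAMPLE_RATE)

long_window = first_seconds(long_clip, SAMPLE_RATE)
short_window = first_seconds(short_clip, SAMPLE_RATE)
print(len(long_window), len(short_window))  # 32000 32000
```

The resulting window is what would be handed to the classifier. Audio recorded at another rate would need resampling first, which this sketch does not cover.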
## Training and evaluation data

The model was trained on a proprietary dataset of call recordings, each labeled as one of:
- Live human responses
- Voicemail greetings

The dataset includes several types of voicemail recordings to improve generalization.

## Evaluation metrics

The model achieved:
- 98% accuracy on voicemail detection.

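For reference, accuracy here means the fraction of clips whose predicted label matches the reference label. A minimal sketch of that calculation follows. The label strings `"voicemail"` and `"human"` and the five toy examples are hypothetical and are not drawn from the evaluation set.

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy data with hypothetical label names; not the model's real evaluation data.
references = ["voicemail", "human", "voicemail", "human", "voicemail"]
predictions = ["voicemail", "human", "human", "human", "voicemail"]

score = accuracy(predictions, references)
print(f"{score:.0%}")  # 80% (4 of 5 correct)
```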
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
- mixed_precision_training: Native AMP

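Two of the values above follow from the others. The total train batch size is the per-device batch size multiplied by the gradient accumulation steps (16 × 2 = 32, which is consistent with a single device). The linear scheduler ramps the learning rate up over the 500 warmup steps and then decays it linearly to zero. The sketch below reproduces both calculations in plain Python. The total step count of 5,000 is a hypothetical placeholder, since the dataset size is not published, and the function names are illustrative.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_schedule_lr(step, total_steps, base_lr=3e-4, warmup_steps=500):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup: scale from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 5_000  # hypothetical; the real value depends on dataset size

print(effective_batch_size(16, 2))  # 32
for step in (0, 250, 500, 2_750, 5_000):
    print(step, f"{linear_schedule_lr(step, TOTAL_STEPS):.2e}")
```

With these settings the rate peaks at 0.0003 at step 500 and is back to half that value midway through the decay phase.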
### Framework versions

- Transformers 4.48.2
- PyTorch 2.5.1+cu124
- Datasets 1.18.3
- Tokenizers 0.21.0