---
library_name: transformers
license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
- generated_from_trainer
- speech-recognition
- audio-classification
- voicemail-detection
model-index:
- name: wav2vec-vm-finetune
  results: []
language:
- en
metrics:
- accuracy
---

# wav2vec-vm-finetune

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) for **voicemail detection**. It was trained on a dataset of call recordings to distinguish **voicemail greetings** from **live human responses**.

## Model description

This model builds on **wav2vec2-xls-r-300m**, a self-supervised speech model trained on large-scale multilingual data. We fine-tuned it on the first two seconds of audio from each call.

## Intended uses & limitations

Intended uses:
- Automated voicemail detection in AI-powered call assistants.
- Filtering voicemail responses in customer service and sales call automation.

Limitations:
- Trained only on English-language audio.
- Assumes the audio track being classified is isolated and contains no audio from the caller.
- Designed to classify only the first two seconds of audio on a call.

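The two-second constraint above can be applied before inference. The following is a minimal sketch, not the authors' preprocessing. It assumes mono audio already resampled to 16 kHz (the rate the base XLS-R model expects), assumes shorter clips are zero-padded (the card does not say how short clips were handled), and assumes the checkpoint loads with `AutoModelForAudioClassification`. The names `crop_first_seconds` and `classify` and the `model_id` argument are hypothetical; the card does not state the repository id or the label names.

```python
TARGET_SAMPLE_RATE = 16_000  # XLS-R models expect 16 kHz input
CLIP_SECONDS = 2.0           # the card says the model targets the first two seconds


def crop_first_seconds(samples, sample_rate=TARGET_SAMPLE_RATE, seconds=CLIP_SECONDS):
    """Return exactly `seconds` of audio from the start of `samples`.

    Longer clips are truncated. Shorter clips are zero-padded, which is an
    assumption: the card does not describe how short clips were handled.
    """
    wanted = int(round(sample_rate * seconds))
    clip = list(samples[:wanted])
    clip.extend([0.0] * (wanted - len(clip)))
    return clip


def classify(samples, model_id):
    """Classify a clip; `model_id` is a placeholder for the real repository id."""
    # Imported lazily so the cropping helper works without torch installed.
    import torch
    from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

    extractor = AutoFeatureExtractor.from_pretrained(model_id)
    model = AutoModelForAudioClassification.from_pretrained(model_id)
    inputs = extractor(
        crop_first_seconds(samples),
        sampling_rate=TARGET_SAMPLE_RATE,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    # Label names come from the checkpoint's config and are not listed in the card.
    return model.config.id2label[int(logits.argmax(dim=-1))]
```

Cropping before the feature extractor keeps inference inputs the same length the card describes, whatever the length of the original recording.
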
## Training and evaluation data

The model was trained on a proprietary dataset of call recordings, each labeled as one of two classes:
- **Live human responses**
- **Voicemail greetings**

The dataset includes diverse voicemail recordings across multiple types to improve generalization.

## Evaluation metrics

The model achieved:
- **98% accuracy** on voicemail detection.

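For reference, accuracy is the fraction of clips whose predicted label matches the true label. The sketch below illustrates the calculation on made-up labels. The label strings and the `accuracy` helper are hypothetical, and the example values are not drawn from the model's evaluation set.

```python
def accuracy(predictions, references):
    """Fraction of predictions that equal the reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Illustrative labels only, not real evaluation data.
references = ["voicemail", "human", "voicemail", "human"]
predictions = ["voicemail", "human", "human", "human"]
print(accuracy(predictions, references))  # 3 of 4 correct
```
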
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
- mixed_precision_training: Native AMP

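Two of the values above follow from the others. The total train batch size is `train_batch_size * gradient_accumulation_steps = 16 * 2 = 32`, which is consistent with a single training device (an inference from the numbers, not something the card states). The `linear` scheduler with 500 warmup steps ramps the learning rate from 0 to 3e-4 and then decays it linearly to 0 at the last step. Below is a minimal sketch of that schedule, assuming it behaves like the linear schedule with warmup in Transformers. `total_steps` depends on the dataset size, which the card does not give, so it is left as a parameter, and `linear_lr` is a hypothetical helper.

```python
BASE_LR = 3e-4
WARMUP_STEPS = 500
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2

# Effective batch size per optimizer step (assumes one device).
EFFECTIVE_BATCH_SIZE = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def linear_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at optimizer step `step`: linear warmup, then linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

For example, with a hypothetical `total_steps=5000`, `linear_lr(250, 5000)` is half the peak rate and `linear_lr(500, 5000)` is the peak of 3e-4.
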
### Framework versions

- Transformers 4.48.2
- PyTorch 2.5.1+cu124
- Datasets 1.18.3
- Tokenizers 0.21.0