speech31/XLS-R-tamil-phoneme


	---
	datasets:
	- mozilla-foundation/common_voice_16_1
	language:
	- ta
	metrics:
	- wer
	pipeline_tag: automatic-speech-recognition
	---
	This model is fine-tuned on the Tamil subset of Common Voice 16.1. The transcripts were preprocessed with Epitran, which transliterates the Tamil text into IPA. Epitran's 'tam-Taml' language code was used to generate the phoneme list below:

	* Vowels:
	  * Monophthongs: 'a', 'aː', 'e', 'eː', 'i', 'iː', 'o', 'oː', 'u', 'uː'
	  * Diphthongs: 'aj', 'aʋ'

	* Consonants:
	  * Nasals: 'm', 'n̪', 'n', 'ɳ', 'ɲ', 'ŋ'
	  * Stops: 'p', 't̪', 'ʈ', 'k'
	  * Affricates: 't͡ʃ', 'd͡ʒ'
	  * Fricatives: 's', 'ʂ', 'ʃ', 'h'
	  * Tap: 'ɾ'
	  * Trill: 'r'
	  * Approximants: 'ʋ', 'ɻ', 'j', 'l', 'ɭ'
	  * Consonant cluster: 'kʂ'
	* Special symbols: '்' (the pulli, which marks the absence of the inherent vowel)
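
The phoneme list above can be used to split an IPA string into symbols, for example when building labels or inspecting model output. The snippet below is a minimal sketch and not the preprocessing code used for this model. The names `PHONEMES`, `UNK`, `tokenize_ipa`, and `tamil_to_ipa` are hypothetical, the greedy longest-match strategy is an assumption, and the sample strings are hand-written illustrations rather than Epitran output. `tamil_to_ipa` assumes the third-party `epitran` package is installed.

```python
import unicodedata

# Phoneme inventory copied from the list above (hypothetical variable name).
PHONEMES = [
    "a", "aː", "e", "eː", "i", "iː", "o", "oː", "u", "uː",  # monophthongs
    "aj", "aʋ",                                              # diphthongs
    "m", "n̪", "n", "ɳ", "ɲ", "ŋ",                            # nasals
    "p", "t̪", "ʈ", "k",                                      # stops
    "t͡ʃ", "d͡ʒ",                                              # affricates
    "s", "ʂ", "ʃ", "h",                                      # fricatives
    "ɾ", "r",                                                # tap, trill
    "ʋ", "ɻ", "j", "l", "ɭ",                                 # approximants
    "kʂ",                                                    # consonant cluster
    "்",                                                     # pulli
]

UNK = "<unk>"  # placeholder for characters outside the inventory (assumption)


def tokenize_ipa(text, inventory=PHONEMES):
    """Split an IPA string into phonemes by greedy longest match.

    Longer symbols are tried first, so 'aː' wins over 'a' and 'aj' wins
    over 'a' followed by 'j'. This is one possible policy, not necessarily
    the one used to train the model.
    """
    # Normalize both sides so combining marks compare consistently.
    text = unicodedata.normalize("NFD", text)
    symbols = sorted(
        (unicodedata.normalize("NFD", s) for s in inventory),
        key=len,
        reverse=True,
    )
    tokens = []
    i = 0
    while i < len(text):
        for sym in symbols:
            if text.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            # No inventory symbol matches here: emit UNK and skip one character.
            tokens.append(UNK)
            i += 1
    return tokens


def tamil_to_ipa(text):
    """Transliterate Tamil script to IPA with Epitran's 'tam-Taml' code."""
    import epitran  # third-party; imported lazily so the rest runs without it

    return epitran.Epitran("tam-Taml").transliterate(text)
```

For example, `tokenize_ipa("kaːj")` splits the long vowel as a single symbol, and any character outside the list is mapped to `UNK` instead of raising an error. Because matching is greedy, a sequence such as 'a' followed by 'j' is always read as the diphthong 'aj'; a different policy may be preferable if that distinction matters.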