ON DOCUMENTING LOW RESOURCED INDIAN LANGUAGES: INSIGHTS FROM KANAUJI SPEECH CORPUS
Abstract
Well-designed and well-developed corpora can help considerably in bridging the gap between theory and practice in language documentation and revitalization, in building language technology applications, in testing language hypotheses, and in numerous other important areas. Developing a corpus for an under-resourced or endangered language raises several problems and issues. The present study starts with an overview of the role that corpora (speech corpora in particular) can play in the language documentation and revitalization process. It then provides a brief account of the situation of endangered languages and corpus development efforts in India. Thereafter, it discusses the various issues involved in the construction of a speech corpus for low-resourced languages. Insights are drawn from the speech database of Kanauji of Kanpur, an endangered variety of Western Hindi spoken in Uttar Pradesh. The Kanauji speech database is being developed at the Indian Institute of Technology Ropar, Punjab.