Documenting oral languages and culture grows visibility for low-resource languages
Originally published on Global Voices
Editor's note: From April 12–18, 2022, Amrit Sufi will be hosting the @AsiaLangsOnline rotating Twitter account, which explores how technology can be used to revitalize Asian languages. Read more about the campaign here.
Amrit Sufi is a researcher and academician who is currently working on the digitization of endangered oral language and culture. She is also a speaker of an endangered language called Angika, an Indo-Aryan language of the Anga region, which falls within the Indian states of Bihar, Jharkhand, and West Bengal. Sufi has a master's degree in English Language and Literature from Hemwati Nandan Bahuguna Garhwal University in Uttarakhand, India, and worked as a lecturer in English literature at Doon University, Dehradun. Rising Voices talked with Sufi about her passion for the digitization of oral culture.
Angika is considered the oldest language of the Bengali family and is very similar to Maithili. Due to the predominance of the Hindi language, languages like Angika are being threatened in India. Sufi believes that documenting oral languages and culture as audio and video helps grow visibility for many low-resource languages. To test this, she documented some Angika folk songs as an experiment.
In 2021, Sufi worked as a coordinator on an Oral Culture Transcription Toolkit project, which is funded by the Wikimedia Foundation. The project has worked with native people and experts to build an online toolkit that will enable people to upload media in endangered languages, including Angika. The toolkit will aid in the further documentation of languages and cultures and will also help in the creation of video transcriptions and subtitles.
Rising Voices interviewed Amrit Sufi over email. The interview has been edited for clarity.
Rising Voices (RV): Please tell us about yourself and your language-related work.
Amrit Sufi (AS): I am a researcher, currently pursuing a Master of Arts in Linguistics from the Jawaharlal Nehru University (JNU), in New Delhi, India. I have experience as an academician in English Literature. I am also working as a case study researcher for India for the project “Indigenous and minority language community needs for secure technologies: a participatory research project” by Rising Voices.
With a Wikimedian friend of mine, Nitesh Gill, I have created a toolkit for digitizing oral culture on Wikimedia and other online platforms. The toolkit gives detailed instructions on how to record oral culture, how to upload it to Wikimedia Commons, and how to create a transcription and upload it to Wikisource.
Apart from these, I regularly edit for the Angika Wikipedia incubator project.
RV: What is the current state of your language both online and offline?
AS: Angika is a vulnerable language as per the UNESCO Map of World Languages in Danger. There are attempts to include it in education at primary, secondary and university levels. Speakers of Angika usually have to learn a second language for educational and economic purposes. Angika activists are trying to bring it onto digital platforms like Wikipedia and its sister projects, YouTube, and websites dedicated to the publication of Angika literature. However, the growth of Angika on digital platforms seems slow.
RV: What are your motivations for seeing your language present in digital spaces?
AS: Angika is spoken primarily in Bihar, India. There are several negative stereotypes regarding the state, whether looming crime or a lack of education. I believe these stereotypes have roots in its economic deprivation and its misrepresentation by the media. I hope that normalizing the use of Angika in digital spaces will ensure that the speakers will be more confident in their identity in offline and online spaces. The main goal hence is an assertion of one’s identity and pride in it.
RV: Describe some of the challenges that prevent your language from being fully utilized online.
AS: The most commonly stated problem regarding the use of Angika in digital spaces that I’ve seen language activists be vocal about is the lack of a script. Though Angika was historically written using the Kaithi script, it is now written using the popular Devanagari script (same script as Hindi). Using this script does not fulfil the purpose of representing certain unique vowel sounds in Angika; it is also clubbed as a ‘Hindi-belt language.’ Another glaring challenge is the lack of digital material and resources like apps in this language. It results in an automatic shift from Angika to Hindi/English.
RV: What concrete steps do you think can be taken to encourage younger people to begin learning their language or keep using their language?
AS: There should be government policies that create employment opportunities for Angika speakers since there is a lot of migration by people looking for work. Making digital resources available in the language and developing Unicode to accommodate the sounds and symbols of the language would be a good first step. Ensuring digital safety and privacy would also be encouraging for users.
In November 2021, Amrit Sufi co-facilitated a series of Language Digital Activism Workshops for India organized by Rising Voices. In December 2021, Sufi joined a Rising Voices panel at the IGF-2021 session “Building the wiki-way for low-resource languages.”