Namibia is home to 2.5 million people and has a rich cultural and colonial history spanning more than 100 years.
The stories of the Namibian people, including their cultural practices, knowledge, and history, have not been told from their own perspectives. As Göring said at the Nuremberg trials, “The victor will always be the judge, and the vanquished the accused.”
This project therefore aims to capture this knowledge in its historical and cultural context for two languages: Khoekhoegowab, one of the most critically endangered languages, and Oshiwambo, the most widely spoken language in Namibia. In doing so, it will provide data for NLP tasks.
The project builds on prior efforts to create cultural and historical texts in Khoekhoegowab by crowdsourcing a speech dataset from two groups: 300 war veterans, drawn from a potential 10,000 Namibian war veterans who are mostly Oshiwambo-speaking, and a community of Khoekhoegowab elders, whose traditional methods are still used for monitoring and tracking in wildlife conservation.
The project will consider various data-gathering methods, such as interviews, focus groups, and web apps, to capture the data. The speech data will be annotated and translated into English.