Friday, May 11, 2012

How and Why I think Malawi has to Move on with Software Localization

We are in the technological era, and the use of ICT is no longer news. Even in the remote areas of Africa, we find people using at least a mobile phone. It has been argued that even smartphones are becoming more popular in villages. This is good news, but not everyone is enjoying it to the maximum. Let's face reality: a large section of villagers are not enjoying the services that the rest of us enjoy, such as blogs and social media (WhatsApp, Twitter, Tumblr, Facebook, Skype, etc.). Most of them just know how to make and answer calls. A few more are able to read and compose text messages, but I have seen many who do not even know what to do when a text message arrives on their phone.

I have always argued in various forums that language is a large part of this barrier, and that software localization is one of the best ways to remove it and make ICT appropriate to a target population. Localization of software involves adapting technologies to the linguistic, cultural and technical requirements of a target group of people. However, if not handled carefully, software localization cannot achieve its intended purpose. One of the problems that native speakers face is managing the influx of new terminology that is forcing its way into the language's vocabulary. Generally, most people do not mind much about that as long as they can communicate. But language regulatory bodies and linguists always have great concerns over the unsupervised growth of terminology.

Leaving terminology to grow by itself is very dangerous. Since many of the terms that come in are loanwords, they pose a threat to the linguistic and phonological structures of the target language. For example, in Chichewa, the term banki is borrowed from the English word bank. The plural is banki or mabanki, and it is not yet clear which is the correct form. Regular users may not think that this is an issue, but for linguists it creates problems and inconsistencies when trying to categorise words like these grammatically. You may also wish to recall that Chichewa nouns affect sentence structure because they determine the right agreement markers (called agwirizanitsi in Chichewa) to associate with. In the Chichewa noun system, the noun banki belongs to class 9. Generally, class 9 plurals are in class 10. Thus, banki (sing. class 9) => banki (pl. class 10) makes sense, while banki (sing. class 9) => mabanki (pl. class 2/class 6) does not, yet it is the most commonly used plural form. Compare the markers in the following two sentences: Banki zina zinatsegulidwa kale koma zilibe anthu ambiri. and Mabanki ena anatsegulidwa kale koma alibe anthu ambiri. (Both mean the same: "Some banks were opened long ago but they do not have a lot of customers.") This is also the case with other loanwords like ofesi (office), kapu (cup), et cetera. (For those who do not understand this number-based noun classification system: the word banki sounds as if it belongs naturally to the I-Zi class, but using mabanki creates a new classification, I-Ma, which is ungrammatical, just as the pluralizations nkhuku => mankhuku and nyanja => manyanja sound very awkward.)
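For readers who think in code, the agreement pattern in the two example sentences can be sketched as a small lookup table. This is only an illustration, not a grammar of Chichewa: the CONCORD table and the function name are made up for this sketch, the forms are copied from the two sentences above, and the class numbers simply follow the labels used in this post.

```python
# Agreement forms copied from the two example sentences in the post.
# Hypothetical table for illustration; it covers only these two patterns.
CONCORD = {
    10: {"some": "zina", "subject": "zi"},  # I-Zi pattern: banki -> banki
    6: {"some": "ena", "subject": "a"},     # Ma- plural: banki -> mabanki
}


def some_banks_sentence(plural_noun, noun_class):
    """Build 'Some banks were opened long ago but ...' for a given plural."""
    forms = CONCORD[noun_class]
    subject = forms["subject"]
    return (
        f"{plural_noun.capitalize()} {forms['some']} "
        f"{subject}natsegulidwa kale koma {subject}libe anthu ambiri."
    )


print(some_banks_sentence("banki", 10))
print(some_banks_sentence("mabanki", 6))
```

The point of the sketch is that the noun's class, not the noun itself, decides every marker that follows it, which is why an unsettled plural like banki/mabanki changes the whole sentence.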

The worst-case scenario is where the language becomes completely immersed in another language. Look at this Chichewa sentence in youth slang: Ndatrapa ngini ija magaye. Panopa ndikudona pa nide, titchekana boboo thayimu ina yake. Only 38% of it is Chichewa (5 of 13 words: ija, panopa, pa, ina, yake). The rest is Chinglish: Chichewalised English. If we juxtapose it with an equivalent English sentence, it can be seen that this phrase is “skin-to-skin transliteratable” (forget about semantics here): I have trapped that thing, guys. I am downing to my den. We will check each other some time. So little by little English is eating away at our language, and if nothing is done soon, we will lose our beautiful language. Of course, I have a problem with youth slang. I have always argued elsewhere that it is volatile and unpredictable. As such, we cannot rely on it very much, though we cannot deny the fact that it is influencing the Chichewa language in general.
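The 5/13 figure above is easy to check mechanically. The following is a minimal sketch, assuming the five words listed in this post are the only native ones in the sentence; the function name and the simple punctuation stripping are illustrative choices, not a real language-identification method.

```python
# The slang sentence quoted in the post.
sentence = (
    "Ndatrapa ngini ija magaye. Panopa ndikudona pa nide, "
    "titchekana boboo thayimu ina yake."
)

# Words the post identifies as plain Chichewa (assumption: the list is complete).
native_words = {"ija", "panopa", "pa", "ina", "yake"}


def native_share(text, native):
    """Return (native word count, total word count, fraction native)."""
    tokens = [word.strip(".,").lower() for word in text.split()]
    hits = [word for word in tokens if word in native]
    return len(hits), len(tokens), len(hits) / len(tokens)


hits, total, share = native_share(sentence, native_words)
print(f"{hits}/{total} = {share:.0%}")  # prints "5/13 = 38%"
```

The same count could be run over a larger sample of text if someone compiled a proper word list, but that list would need to come from speakers and linguists, not from a script.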

Linguists tell us that language (just like any cultural element) is dynamic. Take English as an example. In the 1500s and 1600s, no one raised eyebrows if you spoke like this: I have purposed to come unto thee, but was let hitherto. In this statement let means to prevent. But you can agree with me that the word let now means allow, as in this phrase: Please let me go. This is exactly the opposite of the original meaning. Similarly, the word gay does not carry the same meaning it used to carry some years ago, because it is now more associated with sexual orientation and not necessarily with excitement. Chichewa has also changed over time. For example, the Chichewa in the widely used Bible version, Buku Lopatulika ndilo Mawu a Mulungu, was translated by William Percival Johnson in 1912 and has not been modified since. (By this, I am not referring to the parallel translations/versions Malembo Woyera or Buku Loyera.) There are some grammatical and semantic errors, but the Bible Society of Malawi is afraid to correct them (I don't know why). Leaving that aside, 1912 Chichewa is not the same as 2012 Chichewa. Exactly 100 years have passed, and a lot of things have changed in the Chichewa language. For example, 1 Timothy 3:6 is translated as Asakhale wophunza. Wophunza means novice, but it took me time (and age) to grasp its meaning and understand that wophunza is the root of wophunzira (student). Nowadays, a better translation would be Asakhale wongobadwa mwatsopano (i.e. He must not be a new convert). In addition, in those days a town was called mudzi, but today we know it as tawuni, and indeed a tawuni is not a mudzi (village). Given another 100 years or so, pure Chichewa will not be the same.

The whole issue of localization comes to a bottleneck because there seems to be a tug of war between developers of new terminology and its users. Terminologists are developing new terms fast, while the users are not ready or willing to use them. In languages of business, terminology flows in easily. But that leaves other languages with the task of generating new terminology or risking dilution. However, when viewed from a positive angle, localization is a way of preserving a language. It is well known that languages from the West are mostly associated with economic influence and are little by little subduing indigenous languages.

In this technological era, every language that wants to survive has to move with the times. English is adapting fast. Words like mouse, server, breadcrumbs and web do not have the same meanings as they had before the 1960s. Similarly, words like blog, facebook (verb), google (verb) and tweet (verb) have only just been born with the invention of new technology. So what is all this noise I am trying to make? We still need translation for our languages to survive, and also for the large section of Malawians who do not understand English. But we need to adjust with the times. The language should retain its originality without imposing unnecessary rigour on contemporary readers and writers.

1 comment:

  1. My personal take is that borrowing is excessive when you start replacing existing native terms.

    The official vs grassroots problem is interestingly complex and, I think, to a large extent language-specific. Official bodies in underfunded languages tend to be slow and reactive, rather than proactive. In the case of Gaelic, it has been pretty much a 1-2 person effort to streamline the (existing) confusion of terms (like half a dozen words for "import") and thrash out the rest. By now, it's becoming the de facto "standard" (bearing in mind the problem of uptake in such a tiny language).

    I think in a language like Chichewa the "total" grassroots approach has a much better chance, at least if you can get enough people talking about technology in your language. Amongst a million people, someone WILL think of a really great word for "addon", so it's then just a question of picking up on it and codifying.

    Identifying such words is tricky of course. But you could use web tools to map the use of terminology - we have a tool to do the reverse, in a way but I guess you could use it to identify specific words. I'm not making much sense; but if you go to and enter a word, for example "gille" and then click on the blue underlined word, you get a Google map which displays users who know and use a word. In our case, we're trying to determine which lexical items are actually still in use and which are obsolete (we have the problem of too many people using old dictionaries). But you could use something like that to map use of technical terminology, no?