10. Where do Pandunia words come from?
Principles
Most Pandunia words are already international – at least in some part of the world! The three key criteria for selecting words for Pandunia are:
- Equality: Words are borrowed equally from different regions of the world. In practice, this means that Pandunia borrows words from the languages of Africa, Asia, Europe and the Americas.
- Prevalence: Widely spread words are favored. The more people know a word, the better.
- Simplicity: Word forms that are easy to pronounce are favored.
Cultures of the world
Hartmut Traunmüller divided the world into four major cultural spheres in his article A Universal Interlanguage: Some Basic Considerations. The languages within a cultural sphere share words (loan words and loan translations) and cultural concepts.
- The Euro-American cultural sphere
  - This sphere covers all of Europe, the Americas, Australia and various smaller regions.
  - The languages of the West have been greatly influenced by Greek and Latin and, in modern times, by French and English.
- The Afro-Asian (or the Islamic) cultural sphere
  - This sphere includes the languages of areas where Islam is the main religion.
  - It spans from the Atlantic coast of Africa to the Pacific islands of Indonesia and the Philippines.
  - The languages of this cultural sphere are influenced by Persian and especially Arabic, which is the language of the Quran, the holy book of Islam.
- The South Asian (or the Indian) cultural sphere
  - This sphere covers the very populous subcontinent of India, Indochina and more.
  - The classical languages of this sphere are Sanskrit, Tamil and Pāli.
  - Indian vocabulary has been spread by Hinduism and especially Buddhism in all directions in Asia and elsewhere.
- The East Asian (or the Chinese) cultural sphere
  - This cultural sphere grew around ancient China, the Middle Kingdom.
  - All languages of East Asia are saturated with loan words from Chinese.
  - The biggest modern Chinese language, Mandarin, competes for the title of the most spoken language in the world today.
The cultural spheres are roughly outlined in the picture below.
Languages of the world and world languages
It is estimated that over 6000 different languages are spoken in the world. Some languages are spoken by many people while others are spoken by only a few. Native and non-native speakers of the five most widely spoken languages together add up to more than half of the total population of the world. It is impossible to include all languages in the construction of a world language because there are so many of them. The number of source languages should be small enough for one person to work with.
So, which languages should be taken in?
The Power Language Index (PLI) provides an answer to this question. It is a tool for comparing the efficacy of languages, created by Kai L. Chan, PhD. It compares languages on how well they provide a speaker with the following five opportunities:
- Geography: The ability to travel
- Economy: The ability to participate in economic activities
- Communication: The ability to participate in dialogue
- Knowledge and media: The ability to consume knowledge and media
- Diplomacy: The ability to engage in international relations
Chan builds a ranking of languages based on a combination of the above-listed opportunities. This ranking is used as a reference in Pandunia.
The main source languages for Pandunia
Most Pandunia words are borrowed from 21 widely spoken languages as listed in the table below. The languages are selected so that they represent different language families, different geographical regions and different cultures.
The following table is ordered by rank in the Power Language Index. The numbers of speakers are from the Power Language Index and Wikipedia.
Language | Native speakers | Non-native speakers | PLI ranking | Cultural sphere | Language family |
---|---|---|---|---|---|
English | 446 million | 510 million | 1 | Euro-American | Indo-European |
Mandarin Chinese | 960 million | 178 million | 2 | East Asian | Sino-Tibetan |
French | 80 million | 192 million | 3 | Euro-American | Indo-European |
Spanish | 470 million | 70 million | 4 | Euro-American | Indo-European |
Arabic | 295 million | 132 million | 5 | Afro-Asian | Afro-Asiatic |
Russian | 150 million | 115 million | 6 | Euro-American | Indo-European |
German | 76 million | 59 million | 7 | Euro-American | Indo-European |
Hindi-Urdu | 442 million | 214 million | 8 | Indian & Afro-Asian | Indo-European |
Japanese | 125 million | 1 million | 9 | East Asian | Japonic |
Portuguese | 215 million | 32 million | 10 | Euro-American | Indo-European |
Cantonese | 80 million | ½ million | 11 | East Asian | Sino-Tibetan |
Malay | 77 million | 204 million | 14 | Indian & Afro-Asian | Malayo-Polynesian |
Korean | 80 million | 1 million | 16 | East Asian | Koreanic |
Turkish | 82 million | 6 million | 18 | Afro-Asian | Turkic |
Persian | 56 million | 21 million | 29 | Afro-Asian | Indo-European |
Bengali | 210 million | 19 million | 30 | Indian & Afro-Asian | Indo-European |
Swahili | 20 million | 80 million | 37 | Afro-Asian | Niger-Congo |
Tamil | 78 million | 8 million | 38 | Indian | Dravidian |
Vietnamese | 76 million | 1 million | 43 | East Asian | Austroasiatic |
Hausa | 51 million | 26 million | 114 | Afro-Asian | Afro-Asiatic |
Fula | 42 million | 10 million | 119 | Afro-Asian | Niger-Congo |
They also represent a good mix of cultures and regions of the world. The table below shows the number of countries by continent where the 21 source languages have an official or national status.
Language | America | Europe | Africa | Asia | Oceania |
---|---|---|---|---|---|
English | 14 | 3 | 23 | 5 | 14 |
French | 2 | 5 | 21 | | 1 |
Spanish | 18 | 1 | 1 | | |
Portuguese | 1 | 1 | 6 | 1 | |
Russian | | 2 | | 3 | |
German | | 6 | | | |
Arabic | | | 11 | 12 | |
Swahili | | | 5 | | |
Fula | | | 3 | | |
Hausa | | | 2 | | |
Turkish | | 1 | | 1 | |
Persian | | | | 3 | |
Hindi-Urdu | | | | 2 | |
Bengali | | | | 2 | |
Tamil | | | | 3 | |
Malay | | | | 4 | |
Mandarin | | | | 3 | |
Cantonese | | | | 2* | |
Japanese | | | | 1 | |
Korean | | | | 2 | |
Vietnamese | | | | 1 | |
* Cantonese is the official language of Hong Kong and Macau, which are not countries but special administrative regions.
Word selection method
There are many international words, because languages influence each other all the time. Some words are international in the West, some in the East, and some are even global. Pandunia aims to use words that are as international, intercontinental and global as possible.
Words that are specific to a certain culture shall be adopted from languages that best represent that culture.
Words for objects of nature (for example plants and animal species) shall be adopted from a language that is spoken in the area where that object is found.
So the first question is, does the word belong to a certain region or culture?
Yes. → Select the word from languages that are important in that region or culture.
No. → Use the following word selection method.
- Collect translations for a given word in the 21 languages listed in the previous section, using printed or electronic dictionaries, Wiktionary, reliable machine translation, or some other tool.
- Identify groups of similar words.
  - Similar words can be historically related
  - or they can sound alike by coincidence.
- Select the most international group of similar words.
  - The more international, the better.
  - The best words are cross-cultural.
  - If there's no cross-cultural word, select the one that is known by the greatest number of first-language speakers.
- Select a word form that represents the group well and also fits well into Pandunia.
  - Strip off unnecessary prefixes and suffixes.
  - Use the sounds, the spelling and the normal word structure of Pandunia.
- Make sure that the new word is not identical or too similar to any previously existing Pandunia word.
Normally a word appears in at least two of the source languages. If there is no common word, partially similar words can be selected. Only as a last resort can a word from a single language be accepted.
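The grouping and selection steps above can be sketched in code. The following is only a minimal illustration, not the procedure actually used for Pandunia: the similarity measure (`difflib` ratio on simplified romanizations), the 0.6 threshold, the `Candidate` type and the scoring order (number of cultural spheres first, then native speakers) are all assumptions of this sketch. The sample words are simplified from the 'tea' example in this document, with English tea and French thé added only to form a second group; speaker counts come from the source-language table.

```python
from difflib import SequenceMatcher
from typing import NamedTuple


class Candidate(NamedTuple):
    language: str
    spheres: tuple        # cultural spheres, as in the source-language table
    word: str             # simplified romanization (assumption of this sketch)
    native_millions: int  # native speakers in millions


# Hypothetical sample data for the concept 'tea'.
CANDIDATES = [
    Candidate("Mandarin", ("East Asian",), "cha", 960),
    Candidate("Hindi-Urdu", ("Indian", "Afro-Asian"), "chay", 442),
    Candidate("Russian", ("Euro-American",), "chay", 150),
    Candidate("Turkish", ("Afro-Asian",), "chay", 82),
    Candidate("English", ("Euro-American",), "tea", 446),
    Candidate("French", ("Euro-American",), "the", 80),
]


def similar(a: str, b: str, threshold: float = 0.6) -> bool:
    """Crude string similarity; a stand-in for a human judgment of likeness."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def group_similar(candidates, threshold: float = 0.6):
    """Greedily put each word into the first group whose first member it resembles."""
    groups = []
    for cand in candidates:
        for group in groups:
            if similar(cand.word, group[0].word, threshold):
                group.append(cand)
                break
        else:
            groups.append([cand])
    return groups


def score(group):
    """Rank cross-cultural groups first, then groups with more native speakers."""
    spheres = {sphere for cand in group for sphere in cand.spheres}
    speakers = sum(cand.native_millions for cand in group)
    return (len(spheres), speakers)


def most_international(candidates):
    return max(group_similar(candidates), key=score)


best = most_international(CANDIDATES)
print(sorted({cand.word for cand in best}))
```

With this sample data the cha/chay group wins, because it spans all four cultural spheres while the tea/thé group covers only one. Choosing the final word form and checking it against existing Pandunia words are left out of the sketch.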
Word statistics
Figure 1. This bar chart shows the percentage of Pandunia's base words that are similar to words in each source language.
Figure 2. This pie chart shows how much influence each source language has on Pandunia.
Figure 3. This network diagram shows how many Pandunia words the source languages have in common with each other.
Figure 3 is a network diagram of the 21 source languages of Pandunia. The circles symbolize source languages. The larger the diameter, the more words Pandunia has borrowed from that language. Lines between the circles indicate how many Pandunia words the languages connected by the line have in common. The thicker the line is, the more words the connected languages have in common with each other and Pandunia.
Examples
Selecting the word for 'language'
First, possible candidates are sought among widely spoken languages. The search reveals several international words.
- Arabic لغة /luɣa/ is also known in Swahili lugha. It is also known in Persian and Turkic languages but with the meaning "dictionary".
- Persian زبان /zæba:n/ has spread to Urdu and Punjabi among others.
- Latinate lingua is found in the Romance languages and it has spread to most European languages in words like linguistics and multilingual.
- Indo-Aryan भाषा /bʱaʂa/ is used in Hindi and Bangla and it has spread to several neighbouring languages including Telugu, Thai and Indonesian.
The most prevalent of these words is /bʱaʂa/. It is recognised nearly everywhere in India, Indochina and the Malay archipelago, which are some of the most densely populated areas in the world.
Language | Spoken word | Written word |
---|---|---|
Hindi | bʱa:ʂa: | भाषा |
Punjabi | bʱa:ʃa: | ਭਾਸ਼ਾ |
Gujarati | bʱa:ʃa: | ભાષા |
Marathi | bʱa:ɕa: | भाषा |
Bangla | bʱaʃa | ভাষা |
Telugu | ba:ʃa | భాష |
Tamil | ba:ɕai | பாசை |
Thai | pʰa:sa: | ภาษา |
Indonesian | bahasa | bahasa |
Javanese | basa | basa |
Sundanese | basa | basa |
As you can see, the same word is written and pronounced differently in different languages. This is typical of international words: almost every language adapts them to its own spelling system. Likewise, the word must be adapted to the spelling and pronunciation rules of Pandunia. Only the root of the word is borrowed into Pandunia. The root is bhāṣ-, as we can see from derived words like Hindi द्विभाषी /dvibhāṣī/ and Bangla দ্বিভাষিক /dibhaśik/. Both of them mean 'bilingual'.
So the Pandunia root for 'language' becomes bash, and it serves as the root for many derived words, including dubashik (du-bash-ik) 'bilingual' and polibashik (poli-bash-ik) 'multilingual, polyglot'.
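As a rough illustration of this adaptation and derivation, the sketch below maps the transliterated root bhāṣ- to bash and then builds the two derived words. The `SOUND_MAP` table and the function names are hypothetical and cover only the sounds in this one example; they are not Pandunia's full adaptation rules.

```python
import unicodedata

# Assumed mapping for this single example (bhāṣ- -> bash) only.
SOUND_MAP = {
    "bh": "b",   # aspirated stop -> plain stop
    "ā": "a",    # long vowel -> plain vowel
    "ṣ": "sh",   # retroflex sibilant -> sh
}


def adapt_root(root: str) -> str:
    """Rewrite a transliterated root using the assumed sound mapping."""
    text = unicodedata.normalize("NFC", root.lower()).rstrip("-")
    for source, target in SOUND_MAP.items():
        text = text.replace(unicodedata.normalize("NFC", source), target)
    return text


def derive(*parts: str) -> str:
    """Join prefix, root and suffix into one word, e.g. du + bash + ik."""
    return "".join(parts)


root = adapt_root("bhāṣ-")
print(root, derive("du", root, "ik"), derive("poli", root, "ik"))
```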
Examples of global words
bir 'beer'
- English beer
- German Bier
- French bière /biɛʁ/
- Italian birra
- Turkish bira
- Arabic بيرَه (bīra)
- Amharic ቢራ (bira)
- Rwanda byere
- Swahili bia
- Hindi बियर (biyar)
- Bangla বিয়ার (biyar)
- Telugu బీరు (bīru)
- Malay bir
- Japanese ビール (bīru)
- Wu Chinese 啤酒 (bi-jieu)
- Mandarin 啤酒 (pí-jiǔ)
- Vietnamese bia
cha 'tea'
- Mandarin 茶 (chá)
- Japanese 茶 (cha)
- Korean 차 (cha)
- Vietnamese trà /tʂa/
- Persian چای (chây)
- Bangla চা (cha)
- Hindi चाय (chāy)
- Russian чай (chay)
- Turkish çay
- Arabic شاي /ʃay/
- Hausa shayi
- Fula ataayi
- Swahili chai
- Portuguese chá
- English cha, chai, char 'types of tea'
motor 'motor, engine'
- English motor
- Spanish motor
- French moteur
- Russian мотор (motor)
- Persian موتور (motor)
- Hindi मोटर (motar)
- Turkish motor
- Arabic موتور (mutūr)
- Hausa mota
- Swahili mota
- Japanese モーター (mōtā)
- Korean 모터 (moteo)
- Mandarin 摩托 (mótuō)
- Vietnamese mô-tơ
sherte 'shirt'
- English shirt /ʃɜː(ɹ)t/
- German T-Shirt /ti:.ʃøːɐt/ 'T-shirt'
- French T-shirt /ti.ʃɛʁt/ 'T-shirt'
- Persian تیشِرْت (ti-šert) 'T-shirt'
- Hindi टी-शर्ट (ṭī-śarṭ) 'T-shirt'
- Bangla শার্ট (śarṭo)
- Tamil சட்டை (caṭṭai)
- Turkish tişört 'T-shirt'
- Arabic تِي شِيرْت (tī šīrt) 'T-shirt'
- Swahili shati /ʃati/
- Yoruba ṣẹ́ẹ̀tì
- Mandarin T恤衫 (tī-xùshān) 'T-shirt'
- Cantonese 恤衫 (seot1 saam1)
- Japanese シャツ (shatsu)
- Korean 셔츠 (syeocheu)
Examples of scattered words
amir 'order, command'
Originally an Arabic word, it has been borrowed into European languages as emir 'commander of an Islamic nation' and as admiral 'commander of a navy'.
- African and Asian languages
  - Arabic أَمْر /ʾamr/ 'command'
  - Persian امر /amr/
  - Turkish emir
  - Swahili amri
  - Hausa umarni 'to command'
- Euro-American languages
  - English emir and admiral 'types of commander'
  - French amiral 'admiral'
  - Russian эмир (emir)
bandera 'flag'
- Portuguese bandeira
- Spanish bandera
- French bannière /baniɛʁ/
- English banner /bænəɹ/
- Malay bendera /bəndera/
- Amharic ባንዴራ (bandera)
- Swahili bandera
- Kongo bandêla
kamar 'room, chamber'
- Euro-American languages
  - Italian camera 'chamber'
  - Portuguese câmara 'chamber'
  - Spanish cámara 'chamber'
  - German Kammer 'chamber'
- South Asian languages
  - Hindi कमरा (kamrā)
  - Urdu کمرا (kamrā)
  - Malay kamar
Examples of unrelated but similar words
jen 'person, people'
The word jen is combined from several unrelated sources.
- East Asian languages
  - Mandarin 人 /ʐən/ 'person'
  - Wu 人 /zəŋ/ 'person'
  - Japanese 人 (jin) 'person'
- Euro-American languages
  - French gens /ʒã/ 'people'
  - Portuguese gente /ʒenti/ 'people'
- South Asian languages
  - Hindi जन (jan) 'person, people'
  - Bengali জন (jôn) 'counter word for people'
  - Khmer ជន (jon) 'person, people'
  - Thai ชน (chon) 'person, people'
kat 'to cut'
- Euro-American languages
  - English cut /kʌt/
- South Asian languages
  - Hindi काटना (katnā)
  - Bengali কাটা (kata)
- Afro-Asian languages
  - Arabic قَطَعَ (qaṭa’a)
  - Swahili -kata
- East Asian languages
  - Wu Chinese 隔 /kɐʔ/
  - Vietnamese cắt /kɐʔt/
sui 'water'
- East Asian languages
  - Mandarin 水 (shuǐ)
  - Cantonese 水 (seoi)
  - Wu Chinese 水 (su)
  - Japanese 水 (sui)
  - Korean 수 (su)
  - Vietnamese thuỷ
- South West Asian languages
  - Turkish su
  - Turkmen suw
  - Kazakh su
Examples of Afro-Asian words
dua 'prayer'
- Arabic دعاء (duʿā)
- Persian دعاء (do'a)
- Hindi दुआ (duā)
- Bangla দুয়া (dua)
- Turkish dua
- Malay doa
- Hausa addu'a
- Yoruba àdúrà
- Fula du'aade
kitab 'writing'
This word means 'book' in many languages. The original Arabic word means 'writing' in general.
- Arabic كتاب (kitāb)
- Persian کتاب (ketâb)
- Urdu کتاب (kitāb)
- Indonesian kitab
- Turkish kitap
- Hausa littafi
- Oromo kitaaba
- Swahili kitabu
Examples of East Asian words
Sinitic words are words from Middle Chinese that are used today in languages of East Asia, including Chinese languages, Japanese, Korean and Vietnamese. Sinitic words are single-syllable words or compounds of syllabic elements.
Middle Chinese had lexical tone. Today Chinese languages and Vietnamese have tones, but they are not the same as in Middle Chinese. Japanese and Korean are not tonal languages, so they have ignored the tones. Pandunia also ignores the tones. (Ignoring the tones is about the same as ignoring the stress accent or pitch accent in words from other source languages.)
Middle Chinese had unreleased stop consonants, which are usually written in the Latin alphabet as -p, -t and -k. Cantonese, Vietnamese and Korean keep them mostly as they were. Mandarin has deleted them. Japanese has added a vowel to ease pronunciation. Pandunia keeps the final stops and adds an optional schwa sound.
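Ignoring tones can be illustrated with a small helper that removes tone marking from romanized syllables. This is only a sketch under stated assumptions: it handles the four pinyin tone diacritics and trailing tone digits of the kind used in the Cantonese romanizations in this document, and the function name `strip_tones` is hypothetical. Vietnamese spelling marks tone and vowel quality with diacritics in a different way and would need its own table.

```python
import unicodedata

# Combining marks for the four Mandarin pinyin tones:
# macron, acute accent, caron, grave accent.
PINYIN_TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(syllable: str) -> str:
    """Remove pinyin tone diacritics and trailing tone digits from a syllable."""
    decomposed = unicodedata.normalize("NFD", syllable)
    kept = "".join(ch for ch in decomposed if ch not in PINYIN_TONE_MARKS)
    # Recompose so that non-tonal marks (such as the dots on ü) stay attached.
    return unicodedata.normalize("NFC", kept).rstrip("0123456789")


for form in ["shān", "mén", "chū", "mun4", "ceot1"]:
    print(form, "->", strip_tones(form))
```

The helper only removes tone marking; it does not restore or add final stops, so a form like chū still has to be compared with ceot1 and chut by hand when choosing the Pandunia form.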
chut 'exit, leave'
- Cantonese 出 (ceot1)
- Mandarin 出 (chū)
- Hakka 出 (chut)
- Japanese 出 (shutsu)
- Korean 출 (chul)
- Vietnamese xuất
mun 'door, opening'
- Cantonese 門 (mun4)
- Mandarin 门 (mén)
- Japanese 門 (mon)
- Korean 문 (mun)
- Vietnamese môn
shan 'mountain'
- Mandarin 山 (shān)
- Cantonese 山 (saan1)
- Japanese 山 (san)
- Korean 산 (san)
- Vietnamese sơn
shim 'heart'
- Mandarin 心 (xīn)
- Cantonese 心 (sam)
- Japanese 心 (shin)
- Korean 심 (sim)
- Vietnamese tâm
Examples of Euro-American words
Typically Euro-American words have the following structure: prefix + root + suffixes. In most cases the root ends in a consonant.
For example in Spanish, the root cort- (short) can be combined with affixes to produce different kinds of words.
- Adjectives: cort-o (masc.), cort-a (fem.)
- Noun: cort-edad
- Verb: a-cort-ar
English also uses comparable affixes.
- Adjectives: short, short-er, short-est
- Nouns: short-ness, short-y
- Verb: short-en
Pandunia borrows only the bare roots of Western words. The purpose is to select a form that sounds familiar to speakers of as many languages as possible.
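Reducing a word to its bare root can be sketched as simple affix stripping. The example below is a naive illustration that uses only the affixes from the Spanish cort- examples above; the affix lists, the `min_len` guard and the function name `bare_root` are assumptions of this sketch, and a real tool would need much larger, language-specific lists.

```python
# Affixes taken from the Spanish examples above (cort-o, cort-a, cort-edad, a-cort-ar).
PREFIXES = ("a",)
SUFFIXES = ("edad", "ar", "o", "a")


def bare_root(word: str, min_len: int = 3) -> str:
    """Strip at most one known suffix and one known prefix from a word.

    min_len keeps the remaining root from becoming implausibly short.
    """
    # Try longer suffixes first so that 'edad' wins over a shorter match.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[: -len(suffix)]
            break
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    return word


for form in ["corto", "corta", "cortedad", "acortar"]:
    print(form, "->", bare_root(form))
```

All four forms reduce to cort, which is the kind of shared root that is then respelled for Pandunia.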
korte 'short'
- English short
- German kurz
- French court
- Spanish corto
- Portuguese curto
- Russian короткий (korotkiy)
nov 'new'
- English new, novel
- German neu
- French nouveau
- Spanish nuevo
- Portuguese novo
- Russian новый (novîy)
marche 'walk'
- English march 'military walk'
- German Marsch 'military walk'
- French marche 'walk'
- Spanish marcha
- Portuguese marcha
- Russian марш (marš)