The database consists of TSV (tab-separated values) files. Each row in each file has a unique identifier followed by a word, a definition or some other piece of data. The identifier links each definition to the corresponding word in every language. It also links each word in one language to its translations in the other languages. Other data, such as the word class or the pronunciation in IPA or another transcription system, can be linked to the word in the same way.
The listing below shows excerpts from a definition file in English and from data files in English (eng), German (deu) and Swahili (swa). The data files cover translation, transcription, grammatical gender or noun class ("genus") and style ("stilus").
file: eng_definition.tsv
anatomia:facies.n the front part of the head (esp. of a human)
anatomia:bucca.n the soft skin on each side of the face, below the eyes
file: eng.tsv
anatomia:facies.n face
anatomia:facies.n-2 visage
anatomia:facies.n-3 mug
anatomia:facies.a facial
file: eng-IPA.tsv
anatomia:facies.n feɪs
anatomia:facies.n-2 ˈvɪzɪd͡ʒ
anatomia:facies.n-3 mʌɡ
anatomia:facies.a ˈfeɪ.ʃəl
file: eng-stilus.tsv
anatomia:facies.n-2 lit.
anatomia:facies.n-3 sl.
file: deu.tsv
anatomia:facies.n Gesicht
anatomia:facies.n-2 Visage
anatomia:facies.n-3 Fresse
anatomia:facies.a Gesichts-
file: deu-genus.tsv
anatomia:facies.n n
anatomia:facies.n-2 f
anatomia:facies.n-3 f
file: deu-stilus.tsv
anatomia:facies.n-2 sl.
anatomia:facies.n-3 sl.
file: swa.tsv
anatomia:facies.n uso
anatomia:facies.n-2 sura
anatomia:facies.n-3 wajihi
anatomia:facies.a -a uso
file: swa-genus.tsv
anatomia:facies.n 11/10
anatomia:facies.n-2 9/10
anatomia:facies.n-3 9/10
file: swa-stilus.tsv
anatomia:facies.n-2 lit.
anatomia:facies.n-3 lit.
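As a rough illustration of how the identifier ties the files of one language together, here is a minimal Python sketch. It assumes that every row is an identifier, a tab and a value, and it uses in-memory strings copied from the listing above instead of real files. The helper names `load_tsv` and `describe` are made up for this example and are not part of Panlexia.

```python
import csv
import io

# Excerpts from the listing above; a real script would open the files instead.
ENG = "anatomia:facies.n\tface\nanatomia:facies.n-2\tvisage\nanatomia:facies.n-3\tmug\n"
ENG_IPA = "anatomia:facies.n\tfeɪs\nanatomia:facies.n-2\tˈvɪzɪd͡ʒ\nanatomia:facies.n-3\tmʌɡ\n"
ENG_STILUS = "anatomia:facies.n-2\tlit.\nanatomia:facies.n-3\tsl.\n"


def load_tsv(text):
    """Return {identifier: value} for two-column TSV text (assumed layout)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def describe(ident, words, ipa, stilus):
    """Join a word with its optional transcription and style label."""
    parts = [words[ident]]
    if ident in ipa:
        parts.append("/" + ipa[ident] + "/")
    if ident in stilus:
        parts.append("(" + stilus[ident] + ")")
    return " ".join(parts)


words, ipa, stilus = load_tsv(ENG), load_tsv(ENG_IPA), load_tsv(ENG_STILUS)
for ident in words:
    print(ident, "=>", describe(ident, words, ipa, stilus))
```

Because the style file only has rows for the second and third word, the plain word "face" simply gets no style label. Missing rows in an optional file are treated as "no data", not as an error.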
Scripts or programs could combine selected data from the files above for different purposes. For example, one could create an English–German dictionary with entries like these:
face n. Gesicht n, (sl.) Visage f, (sl.) Fresse f
facial adj. Gesichts-
mug n. (Slangwort für Gesicht) Visage f, Fresse f
visage n. Gesicht n
This is only one possibility. The same program that generated the English–German dictionary could be reused to generate German–English, Swahili–German, German–Swahili, Swahili–English and English–Swahili dictionaries. When someone adds translations for a new language, such as Quechua, Esperanto or Pandunia, to the database, one could generate dictionaries that may never have been made before, such as Quechua–Swahili and Esperanto–Quechua. In this way everyone's contribution to the Panlexia database will benefit everybody else.
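To make the idea of one program serving every language pair more concrete, here is a small Python sketch. It is only an illustration under some assumptions: the data is typed in by hand from the listing above instead of being read from the TSV files, the suffixes "-2" and "-3" are taken to number further words for the same meaning (this is inferred from the examples, not from a specification), and the function names `base_meaning`, `format_word` and `make_dictionary` are hypothetical. It does not try to reproduce the style-dependent wording of the "mug" and "visage" entries above.

```python
import re

# Data copied by hand from the listing; a real script would load <lang>.tsv etc.
WORDS = {
    "eng": {"anatomia:facies.n": "face", "anatomia:facies.n-2": "visage",
            "anatomia:facies.n-3": "mug", "anatomia:facies.a": "facial"},
    "deu": {"anatomia:facies.n": "Gesicht", "anatomia:facies.n-2": "Visage",
            "anatomia:facies.n-3": "Fresse", "anatomia:facies.a": "Gesichts-"},
    "swa": {"anatomia:facies.n": "uso", "anatomia:facies.n-2": "sura",
            "anatomia:facies.n-3": "wajihi", "anatomia:facies.a": "-a uso"},
}
GENUS = {
    "deu": {"anatomia:facies.n": "n", "anatomia:facies.n-2": "f",
            "anatomia:facies.n-3": "f"},
    "swa": {"anatomia:facies.n": "11/10", "anatomia:facies.n-2": "9/10",
            "anatomia:facies.n-3": "9/10"},
}
STILUS = {
    "eng": {"anatomia:facies.n-2": "lit.", "anatomia:facies.n-3": "sl."},
    "deu": {"anatomia:facies.n-2": "sl.", "anatomia:facies.n-3": "sl."},
    "swa": {"anatomia:facies.n-2": "lit.", "anatomia:facies.n-3": "lit."},
}


def base_meaning(ident):
    """Strip a trailing "-2", "-3", ... (assumed to number synonyms)."""
    return re.sub(r"-\d+$", "", ident)


def format_word(lang, ident):
    """Render one word with its optional style label and genus."""
    parts = []
    if ident in STILUS.get(lang, {}):
        parts.append("(" + STILUS[lang][ident] + ")")
    parts.append(WORDS[lang][ident])
    if ident in GENUS.get(lang, {}):
        parts.append(GENUS[lang][ident])
    return " ".join(parts)


def make_dictionary(src, dst):
    """Return {headword in src: translations in dst sharing its base meaning}."""
    by_meaning = {}
    for ident in WORDS[dst]:
        by_meaning.setdefault(base_meaning(ident), []).append(format_word(dst, ident))
    return {
        headword: ", ".join(by_meaning[base_meaning(ident)])
        for ident, headword in WORDS[src].items()
        if base_meaning(ident) in by_meaning
    }


# The same function serves any pair of languages present in the data.
for pair in (("eng", "deu"), ("swa", "eng")):
    for headword, translations in sorted(make_dictionary(*pair).items()):
        print(headword + ": " + translations)
```

Adding a new language in this sketch means adding one more key to `WORDS` (and optionally to `GENUS` and `STILUS`); `make_dictionary` itself does not change.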
The meaning identifiers are a key ingredient of this project. Each one gives a mini definition of the meaning, using some scientific terminology or sometimes English, in the form <terminology>:<term>.<word_class>. For example:
anatomia:facies.n
anatomia:bucca.n
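A program that works with these identifiers will probably need to split them into their parts. The following Python sketch shows one way to do that. The pattern is an assumption based only on the examples in this post, including the optional numeric suffix seen in identifiers like anatomia:facies.n-2. The names `MeaningId` and `parse_meaning_id` are invented for the illustration.

```python
import re
from typing import NamedTuple, Optional


class MeaningId(NamedTuple):
    terminology: str
    term: str
    word_class: str
    variant: Optional[int]  # numeric suffix such as -2; None when absent


# <terminology>:<term>.<word_class>, optionally followed by -<number>
_ID_PATTERN = re.compile(r"^([^:\s]+):([^\s]+)\.([^.\s-]+)(?:-(\d+))?$")


def parse_meaning_id(ident):
    """Split an identifier into its parts, or raise ValueError if it does not fit."""
    match = _ID_PATTERN.match(ident)
    if match is None:
        raise ValueError("not a meaning identifier: " + repr(ident))
    terminology, term, word_class, variant = match.groups()
    return MeaningId(terminology, term, word_class, int(variant) if variant else None)


print(parse_meaning_id("anatomia:facies.n"))
print(parse_meaning_id("anatomia:facies.n-2"))
```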
Here is a possible division of meaning categories (from Loanwords in the World's Languages: A Comparative Handbook) that we could use initially in Panlexia:
- The physical world
- Kinship
- Animals
- Anatomy
- Food and drink
- Clothing and grooming
- The house
- Agriculture and vegetation
- Basic actions and technology
- Motion
- Possession
- Spatial relations
- Quantity
- Time
- Sense perception
- Emotions and values
- Cognition
- Speech and language
- Social and political relations
- Warfare and hunting
- Law
- Religion and belief
- Modern world
- Miscellaneous function words
This is how Panlexia should work in principle. In practice, a lot of work remains before everything is settled.