WordNet - A Lexical Database for English

Discussion about the Panlexia project
seweli
Posts: 46
Joined: 2024-09-29 20:49

Post by seweli »

WordNet® is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept.
https://wordnet.princeton.edu/

Although it is downloadable, I don't think it would be useful that way, because it doesn't link to any translations.

But we could take inspiration from its semantic fields for a concept, and in the far future it may help us find new concepts to add to Panlexia.

Example with a search on "soya":

http://wordnetweb.princeton.edu/perl/we ... o3=&o4=&h=

http://wordnetweb.princeton.edu/perl/we ... =00&s=soya

Warning: it's very slow. That would be a good reason to download it 😅
Last edited by seweli on 2024-10-30 11:35, edited 2 times in total.
pandunia-guru
Posts: 47
Joined: 2024-09-28 09:28

Re: WordNet - A Lexical Database for English

Post by pandunia-guru »

WordNet is a great find, but it's hard to say how useful it is for Panlexia. WordNet is a monolingual English database and it doesn't link to any translations, as you said. However, I found a word list in Concepticon, Borin 2015 1532, which "contains an additional mapping to the Princeton Wordnet".

I tried a few searches on the slow web interface and found that WordNet provides good semantic categorization. For example, dream has the categories <noun.cognition> in the sense of 'a series of mental images while asleep' and <noun.feeling> in the sense of 'a cherished desire', among others. So WordNet truly grasps the polysemy of English words. As a verb, WordNet categorizes dream as <verb.perception> and sleep as <verb.body>, whereas in Concepticon both were in the Body category, which is clearly wrong.

So it seems like WordNet categorizes concepts better than Concepticon. Therefore we can consult WordNet when we try to figure out which category a word should go in. By the way, WordNet categorizes brass into <noun.substance>. ;)
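
To make the comparison concrete, here is a minimal Python sketch of a sense table keyed by word. It uses only the senses quoted above. The `SENSES` table and the `categories_of` helper are hypothetical names for illustration, not WordNet's own API or file format.

```python
# Hypothetical sense table: only the senses mentioned in this post are listed.
SENSES = {
    "dream": [
        ("noun.cognition", "a series of mental images while asleep"),
        ("noun.feeling", "a cherished desire"),
        ("verb.perception", None),  # gloss not quoted in the post
    ],
    "sleep": [
        ("verb.body", None),  # gloss not quoted in the post
    ],
}


def categories_of(word, pos=None):
    """Return the categories recorded for a word, optionally filtered by part of speech."""
    result = []
    for category, _gloss in SENSES.get(word, []):
        # The part of speech is the text before the first dot, e.g. "noun" in "noun.feeling".
        if pos is None or category.split(".", 1)[0] == pos:
            result.append(category)
    return result


print(categories_of("dream"))
print(categories_of("dream", pos="verb"))
print(categories_of("sleep", pos="verb"))
```

Because each sense carries its own category, one English word can sit in several categories at once, which is the polysemy point made above.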
pandunia-guru
Posts: 47
Joined: 2024-09-28 09:28

Re: WordNet - A Lexical Database for English

Post by pandunia-guru »

I downloaded the WordNet files. The file wn3.1.dict.tar.gz includes the newest database. I looked inside it and found that the file names reveal the concept categories.

Adjective and adverb categories don't have any useful information.
  • adj.all
  • adj.pert
  • adj.ppl
  • adv.all
WordNet categorizes nouns into 25 categories and verbs into 15 categories.
  1. noun.act
  2. noun.animal
  3. noun.artifact
  4. noun.attribute
  5. noun.body
  6. noun.cognition
  7. noun.communication
  8. noun.event
  9. noun.feeling
  10. noun.food
  11. noun.group
  12. noun.location
  13. noun.motive
  14. noun.object
  15. noun.person
  16. noun.phenomenon
  17. noun.plant
  18. noun.possession
  19. noun.process
  20. noun.quantity
  21. noun.relation
  22. noun.shape
  23. noun.state
  24. noun.substance
  25. noun.time
  1. verb.body
  2. verb.change
  3. verb.cognition
  4. verb.communication
  5. verb.competition
  6. verb.consumption
  7. verb.contact
  8. verb.creation
  9. verb.emotion
  10. verb.motion
  11. verb.perception
  12. verb.possession
  13. verb.social
  14. verb.stative
  15. verb.weather
I think that the category names don't exactly match what is displayed in the web interface. For example, instead of "food" there was "consumption", which is actually better wording for a category that includes food and drink and maybe also intoxicants. So I have to study it some more.
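
Since the file names follow a `pos.category` pattern, they can be split mechanically. The sketch below groups a handful of the names listed above by part of speech. `FILE_NAMES` is just a sample I typed in, and `group_by_pos` is a hypothetical helper, not something shipped with WordNet.

```python
from collections import defaultdict

# A sample of the file names listed above; extend with the full list as needed.
FILE_NAMES = [
    "adj.all", "adj.pert", "adj.ppl", "adv.all",
    "noun.food", "noun.substance",
    "verb.body", "verb.consumption",
]


def group_by_pos(names):
    """Split each 'pos.category' name and collect the categories under their part of speech."""
    groups = defaultdict(list)
    for name in names:
        pos, category = name.split(".", 1)
        groups[pos].append(category)
    return dict(groups)


groups = group_by_pos(FILE_NAMES)
for pos, categories in groups.items():
    print(pos, categories)
```

In the lists above, food appears among the noun categories and consumption among the verb categories, so it may be worth checking whether the web interface was showing a verb sense.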
seweli
Posts: 46
Joined: 2024-09-29 20:49

Re: WordNet - A Lexical Database for English

Post by seweli »

Substance is a nice category for the alloy brass.

But I hesitate over consumption because it's a very polysemous word:
https://en.m.wiktionary.org/wiki/consumption

Another point: I never thought of having different categories for verbs, nouns and adjectives... but maybe it's a good idea. And if we did, it would make sense to reverse the hierarchy and begin the id with the POS.

I hope your brain is clearer than mine on the topic, because if not, it will be either very drafty or very slow. I prefer drafty.
pandunia-guru
Posts: 47
Joined: 2024-09-28 09:28

Re: WordNet - A Lexical Database for English

Post by pandunia-guru »

In my opinion the best order is when the word-class marker comes last. I like to group, for example, to clean (v.), clean (adj.) and (the act of) cleaning (n.) in the same semantic category. Apparently that wouldn't even be possible in WordNet!

You are right about consumption. What about ingestion? According to Wiktionary it means 'the process of ingesting, or consuming something orally, whether it be food, drink, medicine, or other substance'.
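
Here is a rough Python sketch of what grouping with the word-class marker last could look like. The category name `cleanliness`, the markers `V`, `A`, `N`, and the `word.POS` label shape are all placeholders I made up for illustration. Nothing here is a decided Panlexia format.

```python
from collections import defaultdict

# Hypothetical entries: (semantic category, word, word-class marker).
ENTRIES = [
    ("cleanliness", "clean", "V"),
    ("cleanliness", "clean", "A"),
    ("cleanliness", "cleaning", "N"),
    ("ingestion", "drink", "V"),
]


def group_by_category(entries):
    """Collect 'word.POS' labels under their semantic category, ignoring word class for grouping."""
    groups = defaultdict(list)
    for category, word, pos in entries:
        groups[category].append(f"{word}.{pos}")
    return dict(groups)


grouped = group_by_category(ENTRIES)
for category, labels in grouped.items():
    print(category, labels)
```

With the category first and the word class last, the verb, adjective and noun around "clean" end up in one group instead of three POS-specific ones.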
seweli
Posts: 46
Joined: 2024-09-29 20:49

Re: WordNet - A Lexical Database for English

Post by seweli »

It works!

Question: would it be

ingestion:cake.N

?
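
If the id does take that shape, splitting and rebuilding it is simple. This is only a sketch that assumes a `category:word.POS` layout read off the example above. `ConceptId`, `parse_id` and `format_id` are hypothetical names.

```python
from typing import NamedTuple


class ConceptId(NamedTuple):
    category: str  # semantic category, e.g. "ingestion"
    word: str      # the word itself, e.g. "cake"
    pos: str       # word-class marker, e.g. "N"


def parse_id(text):
    """Split 'category:word.POS' into its three parts (assumed layout)."""
    category, rest = text.split(":", 1)
    # Split on the last dot so the word-class marker is always the final part.
    word, pos = rest.rsplit(".", 1)
    return ConceptId(category, word, pos)


def format_id(cid):
    """Rebuild the id string from its parts."""
    return f"{cid.category}:{cid.word}.{cid.pos}"


cid = parse_id("ingestion:cake.N")
print(cid)
print(format_id(cid))
```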