Find synonyms and hyponyms using Python nltk and WordNet

What are WordNet, hyponyms, and synonyms?

WordNet is a large collection of English words that are grouped by meaning and linked to one another through semantic relations, which is why it is also called a lexical database.

WordNet groups nouns, verbs, adjectives, and adverbs that share a meaning into sets of synonyms called synsets. A synset can in turn belong to another, more general synset. For example, the synsets “brick” and “concrete” belong to the synset “construction materials”, and the synset “brick” also belongs to another synset called “brickwork”. In this example, brick and concrete are hyponyms (more specific terms) of the synset construction materials, and construction materials is their hypernym (the more general term). Synonyms, by contrast, are the words that sit together inside a single synset.

You can picture WordNet roughly as a tree of synsets: the words inside one node are synonyms of each other, and the hyponyms of a node are the nodes below it. It is not strictly a tree, because a synset such as “brick” in the example above can sit under more than one parent.
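To make the picture concrete, here is a minimal sketch of such a hierarchy using plain dictionaries. The data is a hypothetical miniature invented for illustration (it is not real WordNet content), and the names `SYNSETS`, `HYPONYMS`, and `all_hyponyms` are assumptions of this sketch, not part of NLTK.

```python
# Hypothetical miniature hierarchy (not real WordNet data).
# Each synset name maps to the synonyms it contains.
SYNSETS = {
    "construction_material": ["construction material", "building material"],
    "brick": ["brick"],
    "concrete": ["concrete"],
    "firebrick": ["firebrick"],
}

# Each synset name maps to its direct hyponyms (the nodes one level below).
HYPONYMS = {
    "construction_material": ["brick", "concrete"],
    "brick": ["firebrick"],
    "concrete": [],
    "firebrick": [],
}


def all_hyponyms(name):
    """Return every synset below `name`, at any depth, in depth-first order."""
    found = []
    for child in HYPONYMS[name]:
        found.append(child)
        found.extend(all_hyponyms(child))
    return found


# Synonyms: the words stored in the same node.
print(SYNSETS["construction_material"])
# Hyponyms: every node below the starting node.
print(all_hyponyms("construction_material"))
```

The recursive walk in `all_hyponyms` is the same idea the NLTK scripts later in this post apply to real synsets.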

What is Python NLTK?

The Natural Language Toolkit (NLTK) is a Python library for processing human language. Besides its many natural language processing features, it ships with a large collection of data sets and corpora. WordNet is one of the corpora available through NLTK data.

How to install NLTK and WordNet?

To install NLTK on Linux or macOS, run the following command:

sudo pip install nltk

For full installation details, including other platforms, visit the official NLTK installation page.

Once NLTK is installed, you can download WordNet using the NLTK data interface. Follow the instructions given in the official NLTK data documentation.
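As a rough sketch of the download step, the snippet below checks whether the WordNet corpus is already present and fetches it with `nltk.download("wordnet")` if it is not. The helper name `ensure_wordnet` is an assumption of this sketch, and the download itself needs a network connection.

```python
import importlib.util


def ensure_wordnet():
    """Return True if WordNet data is usable, downloading it if needed."""
    if importlib.util.find_spec("nltk") is None:
        return False  # NLTK itself is not installed

    import nltk

    try:
        # Raises LookupError when the corpus is not in any NLTK data path
        nltk.data.find("corpora/wordnet")
    except LookupError:
        # nltk.download returns True on success and False on failure
        return bool(nltk.download("wordnet", quiet=True))
    return True
```

Calling `ensure_wordnet()` once before running the scripts below should be enough. It returns False if NLTK is missing or the download fails.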

How do you find all the synonyms and hyponyms of a given word?

With the data downloaded, the NLTK API can fetch the synsets of a given word directly. To fetch all the hyponyms of a word, we have to navigate recursively from each node to the nodes below it in the WordNet hierarchy. The Python scripts below cover both cases.

Get all synonyms or Thesaurus for a given word
```
from nltk.corpus import wordnet as wn

input_word = input("Enter word to get different meanings: ")

# Each synset returned for the word is one distinct meaning
for i, synset in enumerate(wn.synsets(input_word)):
    print("Meaning", i, "NLTK ID:", synset.name())
    print("Definition:", synset.definition())
    print()
```
For example, entering car prints every synset of the word car along with its definition.
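The script above prints each meaning and its definition. To turn those synsets into a flat list of synonyms, one option is to gather the lemma names of every synset. The sketch below assumes only that each synset exposes `lemma_names()`, as NLTK synsets do. `synonyms_from_synsets` and `FakeSynset` are hypothetical names used so the example runs without NLTK installed.

```python
def synonyms_from_synsets(synsets):
    """Collect the unique lemma names across synsets, in first-seen order."""
    seen = []
    for synset in synsets:
        for name in synset.lemma_names():
            if name not in seen:
                seen.append(name)
    return seen


class FakeSynset:
    """Hypothetical stand-in for an NLTK synset, used only for this demo."""

    def __init__(self, names):
        self._names = names

    def lemma_names(self):
        return list(self._names)


# Made-up demo data, not real WordNet output
demo = [FakeSynset(["car", "auto", "automobile"]), FakeSynset(["car", "railcar"])]
print(synonyms_from_synsets(demo))  # ['car', 'auto', 'automobile', 'railcar']
```

With NLTK and WordNet available, passing `wn.synsets(input_word)` instead of the demo list should work the same way.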
Get all the hyponyms and hypernyms for a given word
```
from itertools import chain
from nltk.corpus import wordnet as wn

input_word = input("Enter word to get hyponyms and hypernyms: ")

for i, synset in enumerate(wn.synsets(input_word)):
    print("Meaning", i, "NLTK ID:", synset.name())
    # Flatten the lemma names of every directly related synset
    hypernyms = chain.from_iterable(s.lemma_names() for s in synset.hypernyms())
    hyponyms = chain.from_iterable(s.lemma_names() for s in synset.hyponyms())
    print("Hypernyms:", ", ".join(hypernyms))
    print("Hyponyms:", ", ".join(hyponyms))
    print()
```
Hypernyms are the synsets directly above a given synset, that is, its more general terms. Extracting all the hyponyms and hypernyms of a word is sometimes described as extracting its ontology. The script above prints them for every meaning of the word it looks up.
Get all Hyponyms with synsetID

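Each synset has an ID, which is the offset of that synset in the WordNet data file for its part of speech. If you know the ID of a synset and want the IDs of its hyponyms rather than meanings and definitions, you can walk the hierarchy recursively. The script below collects the IDs of the most specific hyponyms, those that have no hyponyms of their own: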
```
from nltk.corpus import wordnet as wn

leaf_offsets = []

offset = int(input("Enter synset ID: "))
# 'n' restricts the lookup to noun synsets
synset = wn.synset_from_pos_and_offset('n', offset)

def traverse(synset):
    hyponyms = synset.hyponyms()
    if not hyponyms:
        # No hyponyms below this synset, so record its ID
        leaf_offsets.append(synset.offset())
    else:
        for hyponym in hyponyms:
            traverse(hyponym)

traverse(synset)
print(leaf_offsets)
```
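Note that `traverse` above records an ID only for synsets that have no hyponyms of their own, that is, the leaves. If you want the ID of every hyponym at every depth, a variant along the following lines may help. It assumes only that each synset exposes `hyponyms()` and `offset()`, as NLTK synsets do. `all_hyponym_offsets` and `FakeSynset` are hypothetical names used so the example runs without NLTK.

```python
def all_hyponym_offsets(synset):
    """Return the offsets of every hyponym below `synset`, at any depth."""
    offsets = []
    for hypo in synset.hyponyms():
        offsets.append(hypo.offset())
        offsets.extend(all_hyponym_offsets(hypo))
    return offsets


class FakeSynset:
    """Hypothetical stand-in for an NLTK synset, used only for this demo."""

    def __init__(self, offset, hyponyms=()):
        self._offset = offset
        self._hyponyms = list(hyponyms)

    def offset(self):
        return self._offset

    def hyponyms(self):
        return self._hyponyms


# Made-up three-level hierarchy: 1 -> 2 -> (3, 4)
leaf_a, leaf_b = FakeSynset(3), FakeSynset(4)
middle = FakeSynset(2, [leaf_a, leaf_b])
root = FakeSynset(1, [middle])
print(all_hyponym_offsets(root))  # [2, 3, 4]
```

In real WordNet data a synset can be reachable through more than one parent, so the result may contain duplicates. Wrapping it in `set(...)` is one way to remove them.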

Source : StackOverflow
