Wikipedia on the terminal, using Python

I wanted to look up information from my terminal on Ubuntu. My requirement was that any data I wanted should be output in the terminal itself. I didn't want to open a browser each time I wanted some information, and all I needed was the text, not images and videos. Viewing the information in the terminal was therefore convenient and efficient.

I searched for a Python implementation and learned that, using the Wikipedia API, one can send an HTTP request to the website with a query action and the JSON format and get back a JSON object. This can be implemented using the Requests module in Python. The returned JSON object holds the data on the queried topic.
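As a rough illustration of such a request, here is a minimal sketch. It assumes Python 3 and uses the standard library's urllib in place of Requests so that it runs without extra installs (with Requests, `requests.get(url, params=params)` would play the same role). The function names, the `prop=extracts` parameters and the User-Agent string are my own choices for the example, not something taken from the scripts used later in this post.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(title, lang="simple"):
    """Build a Wikipedia API URL that asks for a plain-text intro as JSON."""
    params = {
        "action": "query",    # the "query action" mentioned above
        "format": "json",     # ask for a JSON response
        "prop": "extracts",   # assumption: use the text extract property
        "exintro": 1,         # only the introduction
        "explaintext": 1,     # plain text instead of HTML
        "titles": title,
    }
    query = urllib.parse.urlencode(params)
    return "https://%s.wikipedia.org/w/api.php?%s" % (lang, query)


def fetch_json(url):
    """Send the HTTP request and decode the JSON body (needs network access)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "wiki-terminal-sketch/0.1"}  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(build_query_url("Ubuntu"))
```

Calling `fetch_json(build_query_url("Ubuntu"))` would then return the JSON object as a Python dictionary.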

The next step was to parse it. I found that Beautiful Soup can be used to extract only the data, and it was one of the best options available. After that, the only thing left to do was to print the extracted data.
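The parsing step could look roughly like the sketch below. It assumes Python 3 and a response shaped like the query API's usual `query` / `pages` layout, with an HTML `extract` field on each page. To keep it free of extra installs, the standard library's `HTMLParser` stands in for Beautiful Soup here; with Beautiful Soup, `BeautifulSoup(html, "html.parser").get_text()` would do the tag stripping. The names `TextOnly`, `html_to_text` and `first_extract` are made up for the example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text between tags, dropping the markup itself."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Strip the tags from an HTML fragment and return the bare text."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def first_extract(api_response):
    """Return the text of the first page that has an extract, else None."""
    pages = api_response.get("query", {}).get("pages", {})
    for page in pages.values():
        if "extract" in page:
            return html_to_text(page["extract"])
    return None


if __name__ == "__main__":
    # Hand-written sample response, not real API output.
    sample = {"query": {"pages": {"1": {"title": "X",
                                        "extract": "<p><b>X</b> is a letter.</p>"}}}}
    print(first_extract(sample))
```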

I found 2 scripts that do just that. These scripts, however, don't use the Requests module; they use urllib and urllib2 instead.
Advantage:
The advantage of using these scripts is that only a brief summary of the topic being searched is shown, and most of the time that is all we want.

The second advantage is that if the topic has sections, they are shown as well.

Disadvantage:
What this lacks is good pattern matching for searches: if it doesn't find the exact words in the article, nothing is returned. This also happens when a search term matches multiple results.
Sometimes the data returned is either too little or too much.
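One possible way around the exact-match limitation is to ask Wikipedia for title suggestions first and then look up the best one. The scripts used here do not do this. The sketch below is only an assumption-laden illustration in Python 3, using the API's `opensearch` action, which returns a list of the form [search term, [titles], [descriptions], [urls]]. The function names are hypothetical.

```python
import urllib.parse


def build_search_url(term, lang="simple", limit=5):
    """Build an opensearch URL that returns title suggestions as JSON."""
    params = {
        "action": "opensearch",
        "format": "json",
        "search": term,
        "limit": limit,
    }
    query = urllib.parse.urlencode(params)
    return "https://%s.wikipedia.org/w/api.php?%s" % (lang, query)


def suggested_titles(opensearch_response):
    """Pick the list of suggested titles out of a decoded opensearch response."""
    # Expected layout: [search term, [titles], [descriptions], [urls]]
    if len(opensearch_response) < 2:
        return []
    return list(opensearch_response[1])


if __name__ == "__main__":
    print(build_search_url("pytho"))
```

The first suggested title could then be passed on as the exact article name.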

Procedure:

First download and save these 2 Python files.
wikipedia.py
wiki2plain.py

Then create a Python file named wiki.py and paste the following script into it.

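# Python 2 script (uses raw_input and the print statement)
from wikipedia import *
from wiki2plain import *

lang = 'simple'
wiki = Wikipedia(lang)
try:
    data1 = raw_input("enter search query: ")
    raw = wiki.article(data1)
except Exception:
    # Treat any lookup failure as "no article found"
    raw = None

if raw:
    # Convert the raw article markup to plain text and print it
    wiki2plain = Wiki2Plain(raw)
    content = wiki2plain.text
    print content
else:
    print "No text returned"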

Save it in the same folder as the other two files and run wiki.py.
Enter the search term to get the results.

Screenshots:

Screenshot from 2014-04-20 22:21:28

Screenshot from 2014-04-20 22:21:06

Akshay Pai



  • April 21, 2014 at 5:37 am

    I get the following error. How do I solve this?

    Traceback (most recent call last):
    File "wiki.py", line 1, in <module>
    from wikipedia import *
    File "/home/jairovelez/Escritorio/wikipedia.py", line 4, in <module>
    import yaml
    ImportError: No module named yaml

    • April 21, 2014 at 7:43 am

      Hi,
      The error occurred because the yaml module is not installed.
      To install it, type the following command in your terminal:

      sudo pip install pyyaml
      Then enter your password and let it install. If it says pip is not installed, install pip first by typing
      sudo apt-get install python-pip
