Search Wikipedia from Command Prompt

This tutorial shows how to quickly set up Python scripts to search Wikipedia right from the terminal. It is ideal for people who want to read up on a topic without having to open a browser, wait for it to load, and search. The terminal does the job more quickly and efficiently.

The way to solve this is to use the Wikipedia API: send an HTTP request to the website with a query action, ask for JSON, and get the response back as a JSON object. This can be implemented using the Requests module in Python.
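To make the request concrete, here is a minimal sketch of such a query. It is not the code used by the scripts in this tutorial: it assumes Python 3 and the standard library only (urllib instead of Requests), and it assumes the extracts property of the query action is what you want. The function names and the User-Agent string are placeholders made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_query_url(title, lang="simple"):
    """Build a MediaWiki API URL that asks for the intro of one article."""
    params = {
        "action": "query",    # the query action described above
        "format": "json",     # ask for a JSON response
        "prop": "extracts",   # article text
        "exintro": 1,         # only the part before the first heading
        "explaintext": 1,     # plain text instead of HTML
        "redirects": 1,       # follow redirects to the target article
        "titles": title,
    }
    return "https://%s.wikipedia.org/w/api.php?%s" % (lang, urlencode(params))


def fetch_json(url):
    """Send the HTTP GET request and decode the JSON body (needs network access)."""
    # The User-Agent value is a placeholder for this sketch.
    request = Request(url, headers={"User-Agent": "wiki-cli-example/0.1"})
    with urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_json(url) to actually hit the API.
    print(build_query_url("Python (programming language)"))
```

Keeping the URL building separate from the network call makes it easy to check the query by pasting the printed URL into a browser first.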

The next step is to parse this response. The JSON itself can be decoded with Python's built-in json module, and I found that Beautiful Soup, one of the best options available, can be used to parse any HTML it contains. Once the parsing is complete, we only have to display the data.
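As a rough illustration of the parsing step, the sketch below pulls the summary text out of a response. It assumes Python 3 and a response shaped like the one the query action returns when extracts are requested (pages keyed by page ID). The sample payload is hand-written for this example, not real API output, and extract_summary is a hypothetical helper name.

```python
import json

# Hand-written sample in the shape of a query/extracts response; not real data.
SAMPLE_RESPONSE = """
{"query": {"pages": {"12345": {"pageid": 12345, "title": "Example",
 "extract": "Example is a placeholder article used for illustration."}}}}
"""


def extract_summary(response_text):
    """Return the first page's extract, or None if no page has one."""
    data = json.loads(response_text)
    pages = data.get("query", {}).get("pages", {})
    for page in pages.values():
        # A missing article has a "missing" key and no "extract".
        if "extract" in page:
            return page["extract"]
    return None


print(extract_summary(SAMPLE_RESPONSE))
```

Returning None for a missing article lets the caller decide what to print instead of crashing on a KeyError.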

I found two scripts that do just that. These scripts, however, don't use the Requests module; they use urllib and urllib2, so they need Python 2.

Advantage:
The advantage of this approach is that only a brief summary of the topic being searched is shown, and most of the time that is all we want.

The second advantage is that if the topic has sections, they are shown as well.

Disadvantage:
What this lacks is good pattern matching for searches: if it doesn't find the exact words in the article, nothing is returned. This also happens if a search term has multiple results.
Sometimes the data returned is either too little or too much.
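One possible way around the exact-match limitation, which the two scripts below do not implement, is to ask the API for title suggestions first and then look up the best one. The sketch assumes Python 3 and the API's opensearch action, whose JSON response is a list of the form [term, [titles], [descriptions], [urls]]. The helper names and the sample payload are made up for illustration.

```python
import json
from urllib.parse import urlencode


def build_suggest_url(term, lang="simple", limit=5):
    """Build an opensearch URL that returns article titles matching a term."""
    params = {
        "action": "opensearch",
        "format": "json",
        "search": term,
        "limit": limit,
    }
    return "https://%s.wikipedia.org/w/api.php?%s" % (lang, urlencode(params))


def parse_suggestions(response_text):
    """Return the list of suggested titles from an opensearch response."""
    data = json.loads(response_text)
    # Element 0 is the search term; element 1 is the list of titles.
    return data[1] if len(data) > 1 else []


# Hand-written sample in the opensearch shape; not real API output.
SAMPLE = (
    '["pytho", ["Python", "Python (programming language)"], ["", ""], '
    '["https://simple.wikipedia.org/wiki/Python", '
    '"https://simple.wikipedia.org/wiki/Python_(programming_language)"]]'
)

print(parse_suggestions(SAMPLE))
```

The first suggested title could then be passed to the article lookup in place of the raw search term.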

Procedure:

First download and save these two Python files:
wikipedia.py: This Python program forms the URL for the search term to get the article from Wikipedia. Once the URL is formed, it sends a request using the urllib Python library, performs the search, and gets the whole article data from Wikipedia.

wiki2plain.py: This program converts the full document received from the previous program to readable text. Usually, the response from the previous program is in the form of JSON/HTML, so we use this program to parse it and get meaningful data on the topic.
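To give a feel for what the conversion step involves, here is a minimal sketch that turns HTML into plain text. It is not the code in wiki2plain.py: it assumes Python 3 and uses only the standard library's html.parser, and the TextExtractor class and html_to_text function are names invented for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Strip tags from an HTML fragment and collapse whitespace."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


print(html_to_text("<p>The <b>quick</b> brown fox.</p><script>var x = 1;</script>"))
```

Beautiful Soup would do the same job with less code, at the cost of an extra dependency.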

Then create a Python file, name it wiki.py, and paste the following script into it:


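# Python 2 script: wikipedia.py and wiki2plain.py rely on urllib2.
from wikipedia import Wikipedia
from wiki2plain import Wiki2Plain

lang = 'simple'  # language edition to search
wiki = Wikipedia(lang)
try:
    query = raw_input("Enter search query: ")
    raw = wiki.article(query)
except Exception:
    # Treat any lookup failure as "no article found".
    raw = None

if raw:
    # Convert the raw article into readable text and show it.
    wiki2plain = Wiki2Plain(raw)
    content = wiki2plain.text
    print content
else:
    print "No text returned"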

This code simply imports the previously downloaded files and lets you enter a topic interactively and search Wikipedia.
Save it in the same folder as the other two files and run wiki.py with Python 2.

Enter the search term and get the results.


akshay pai

