Search Wikipedia from Command Prompt
This tutorial shows how to quickly set up Python scripts to search Wikipedia right from the terminal. This is ideal for people who want to read up on a topic without having to open a browser, wait for it to load, and then search. The terminal does the job more quickly and efficiently.
The way to solve this is to use the Wikipedia API: send an HTTP request to the website as a query action that asks for JSON format, and get the response back as a JSON object. This could be implemented using the Requests module in Python.
The next step is to parse this response. I found that Beautiful Soup, an HTML parsing library, can be used for any markup it contains; it was one of the best options available. Once the parsing is complete, we only have to display the data.
I found two scripts that do just that. These scripts, however, don't use the Requests module; they use urllib and urllib2.
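The request-and-parse flow described above can be sketched with nothing but the standard library. This is a minimal Python 3 illustration, not the code of the two scripts. The function names (build_query_url, extract_summary, fetch_summary) and the User-Agent string are made up for this example. It assumes the MediaWiki API endpoint at /w/api.php and the prop=extracts query, which returns the introduction of a page as plain text.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(term, lang="simple"):
    """Build a MediaWiki API URL asking for the plain-text intro of `term`."""
    params = {
        "action": "query",     # the query action
        "format": "json",      # ask for a JSON response
        "prop": "extracts",    # page extracts
        "exintro": 1,          # only the text before the first section heading
        "explaintext": 1,      # plain text instead of HTML
        "redirects": 1,        # follow redirects to the target page
        "titles": term,
    }
    query = urllib.parse.urlencode(params)
    return "https://%s.wikipedia.org/w/api.php?%s" % (lang, query)


def extract_summary(response):
    """Return the first non-empty extract in a parsed response, or None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        if page.get("extract"):
            return page["extract"]
    return None


def fetch_summary(term, lang="simple"):
    """Send the request and return the summary text (needs network access)."""
    request = urllib.request.Request(
        build_query_url(term, lang),
        headers={"User-Agent": "wiki-terminal-example/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=10) as reply:
        return extract_summary(json.loads(reply.read().decode("utf-8")))
```

Calling fetch_summary("Linux") should return the introduction of that article as a string, or None when no page with that exact title exists.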
Advantages:
Only a brief summary of the topic being searched is shown, and most of the time that is all we want.
If the topic has sections, they are shown as well.
Disadvantages:
It does not have good pattern matching for searches: if it doesn't find the exact words in the article, nothing is returned. This also happens if a search term has multiple results.
Sometimes the data returned is either too little or too much.
Procedure:
First, download and save these two Python files:
wikipedia.py: This Python program forms the URL for the search term in order to get the article from Wikipedia. Once the URL is formed, it sends a request using the urllib Python library, performs the search, and gets the whole article data from Wikipedia.
wiki2plain.py: This program converts the full document received from the previous program into readable text. Usually, the response from the previous program is in JSON/HTML form, so this program parses it to extract meaningful data on the topic.
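To give an idea of what the conversion step involves, here is a rough Python 3 sketch that turns an HTML fragment into readable text. It is not the code of wiki2plain.py. It uses the standard library's html.parser as a stand-in, and the names TextExtractor and html_to_text are invented for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    SKIPPED = {"script", "style"}
    BLOCK_TAGS = {"p", "h1", "h2", "h3", "li"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            # End of a block element: start a new line in the output.
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Strip tags from `html` and return one cleaned-up line per block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

For example, html_to_text("<h2>Cats</h2><p>A <b>cat</b> is an animal.</p>") gives the heading and the sentence on separate lines, with the tags removed.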
Then create a Python file, name it wiki.py, and paste the following script into it:
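# Python 2 script (it uses raw_input and the print statement)
from wikipedia import *
from wiki2plain import *

lang = 'simple'  # language edition to search (Simple English Wikipedia)
wiki = Wikipedia(lang)

try:
    data1 = raw_input("enter search query: ")
    raw = wiki.article(data1)  # fetch the raw article for the search term
except Exception:
    raw = None  # the lookup failed, e.g. no article with that title

if raw:
    # Convert the raw article into readable plain text and show it
    wiki2plain = Wiki2Plain(raw)
    content = wiki2plain.text
    print content
else:
    print "No text returned"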
This code just imports the previously downloaded files and lets you enter a topic interactively and search Wikipedia.
Save it in the same folder as the other two files and run wiki.py.
Enter the search term and get the results.
Screenshots :