IPython is a command shell for interactive computing in multiple programming languages, originally developed for the Python programming language, that offers introspection, rich media, shell syntax, tab completion, and history.
IPython provides a great interactive shell for users to play with code. When you want to try out some sample Python code or test a script, IPython is the “Thing” for you. In this post, I will do some web scraping using IPython.
NOTE: This was tested on Ubuntu 14.04.
There are several ways to install IPython. I am going to install it via pip. To install pip, run:
$ sudo apt-get install python-pip
This installs pip (a tool for installing Python packages). To install IPython via pip, run:
$ pip install ipython
This installs IPython on your machine. We could also install IPython using apt-get, but we need pip anyway to install the packages required for web scraping. That is why I have used pip to install IPython.
Python Packages Required
For this post, we need two Python packages:
Python’s requests module, to send GET requests and receive the responses. To install it, run:
$ pip install requests
The Beautiful Soup module, to extract information from an HTML page. To install it, run:
$ pip install bs4
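As an optional sanity check, the sketch below reports which versions of these packages pip has installed. It assumes Python 3.8 or newer (for importlib.metadata), and the helper name installed_version is made up for this post. Note that "pip install bs4" pulls in the beautifulsoup4 distribution, so that is the name looked up here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a pip package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# "beautifulsoup4" is the distribution that provides the bs4 module.
for name in ("ipython", "requests", "beautifulsoup4"):
    print(name, installed_version(name) or "not installed")
```

If any line says "not installed", rerun the matching pip command above before going further.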
With this much done, let's start scraping.
Let the Fun Begin
For this example, I am going to get the “Latest News” in Student Corner from
First of all, start IPython by running the ipython command in your terminal.
This opens an IPython interpreter (you can also save the code in a script and run it). Then enter the code below, one line at a time, pressing Enter after each line:
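import requests, bs4
url = "http://gndec.ac.in"
# Download the page
req = requests.get(url)
soup = bs4.BeautifulSoup(req.text, 'html.parser')
# select() returns a list, so the calls cannot be chained; use one CSS selector instead
spans = soup.select("#block-block-15 .content p span")
output = [span.getText() for span in spans]
print("\n".join(output))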
You will have all the latest news in your Terminal!
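If you would rather save the code in a script, here is a minimal sketch of one way to organise it, not the only way. The function names fetch_html and extract_text, the 10-second timeout, and the SAMPLE_HTML snippet are all assumptions made for illustration; the sample markup only imitates the block id and tags used above and is not copied from the real site. Keeping the download and the parsing in separate functions lets you try the parsing step without a network connection.

```python
import bs4
import requests

URL = "http://gndec.ac.in"  # site used in this post
# The block id, class and tags from the post, written as one CSS selector.
SELECTOR = "#block-block-15 .content p span"


def fetch_html(url, timeout=10):
    """Download a page and return its HTML, raising an error on a bad HTTP status."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_text(html, selector):
    """Return the stripped text of every element matching the CSS selector."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.select(selector)]


# Invented markup for an offline check; the real page may be laid out differently.
SAMPLE_HTML = """
<div id="block-block-15">
  <div class="content">
    <p><span>First notice</span></p>
    <p><span>Second notice</span></p>
  </div>
</div>
"""

for item in extract_text(SAMPLE_HTML, SELECTOR):
    print(item)
```

To run it against the live site, call extract_text(fetch_html(URL), SELECTOR) instead. If the site changes its layout, the selector will match nothing and you will get an empty list rather than an error.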