
How To Install Beautiful Soup Scraping Library for Python?

Published on September 20th, 2013

Beautiful Soup is a Python library for scraping data from web pages. Python also has Scrapy, a larger and more full-featured framework for web scraping, but Beautiful Soup is very lightweight and gets the job done quickly.

You can install Beautiful Soup using either of the following two commands:

pip install beautifulsoup4
easy_install beautifulsoup4

Sample output from pip:

root@dm:~# pip install beautifulsoup4
Downloading/unpacking beautifulsoup4
  Downloading beautifulsoup4-4.3.1.tar.gz (142Kb): 142Kb downloaded
  Running setup.py egg_info for package beautifulsoup4

Installing collected packages: beautifulsoup4
  Running setup.py install for beautifulsoup4

Successfully installed beautifulsoup4
Cleaning up...
root@dm:~#

Sample output from easy_install:

root@dm [/home/d]# easy_install beautifulsoup4
Searching for beautifulsoup4
Reading http://pypi.python.org/simple/beautifulsoup4/
Best match: beautifulsoup4 4.3.1
Downloading https://pypi.python.org/packages/source/b/beautifulsoup4/beautifulsoup4-4.3.1.tar.gz#md5=508095f2784c64114e06856edc1dafed
Processing beautifulsoup4-4.3.1.tar.gz
Running beautifulsoup4-4.3.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-gMGxK0/beautifulsoup4-4.3.1/egg-dist-tmp-0kAJy4
zip_safe flag not set; analyzing archive contents...
Adding beautifulsoup4 4.3.1 to easy-install.pth file

Installed /usr/local/lib/python2.7/site-packages/beautifulsoup4-4.3.1-py2.7.egg
Processing dependencies for beautifulsoup4
Finished processing dependencies for beautifulsoup4
root@dm [/home/d]#
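
Once either command finishes, it can be worth confirming that the library is importable. The following is a minimal sketch, not part of the original walkthrough: it assumes Python 3 (for importlib.util), and the helper names beautiful_soup_installed and extract_links, as well as the SAMPLE_HTML string, are made up for illustration. The beautifulsoup4 package is imported under the name bs4.

```python
import importlib.util

# Hypothetical sample document used only for this illustration.
SAMPLE_HTML = (
    "<html><body><h1>Hello</h1>"
    "<a href='https://example.com'>link</a></body></html>"
)


def beautiful_soup_installed():
    """Return True if the bs4 package (provided by beautifulsoup4) can be found."""
    return importlib.util.find_spec("bs4") is not None


def extract_links(html):
    """Return the href of every <a> tag in html, or None if bs4 is missing."""
    if not beautiful_soup_installed():
        return None
    from bs4 import BeautifulSoup

    # "html.parser" is Python's built-in parser, so no extra package is needed.
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


if __name__ == "__main__":
    if beautiful_soup_installed():
        import bs4

        print("Beautiful Soup version:", bs4.__version__)
        print("Links found:", extract_links(SAMPLE_HTML))
    else:
        print("Beautiful Soup is not installed; run: pip install beautifulsoup4")
```

If the import check fails right after a seemingly successful install, a likely cause is that pip or easy_install belongs to a different Python interpreter than the one running the script.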

Extracting data from public websites by scraping is common, but you should always check whether scraping a given site is legal before you do it.

How To Install Beautiful Soup Scraping Library for Python? was originally published on Digitizor.com on September 20, 2013 - 8:23 am (Indian Standard Time)