
Python — Beautiful Soup All Notes with Projects

HKN MZ
3 min read · May 2, 2021

Beautiful Soup is a Python package for getting data out of HTML, XML, and other markup documents. We cannot use this package to get data from JavaScript-rendered or dynamically loaded pages, because it does not execute scripts. It only works with the content returned for the URL that you give (fetched separately, for example with an HTTP library) and then stops.
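To make that limit concrete, here is a minimal sketch. The HTML string and its ids (`static`, `dynamic`) are made up for illustration and stand in for a page that has already been downloaded. Beautiful Soup parses the markup exactly as given and never runs the script, so the element the script would fill stays empty.

```python
from bs4 import BeautifulSoup

# Hypothetical page source, standing in for markup that was already downloaded.
html = """
<html><head><title>Demo page</title></head>
<body>
<p id="static">Static text</p>
<div id="dynamic"></div>
<script>document.getElementById("dynamic").textContent = "Loaded by JS";</script>
</body></html>
"""

# Parse with the parser that ships with Python's standard library.
soup = BeautifulSoup(html, "html.parser")

title = soup.title.string                           # text inside <title>
static_text = soup.find(id="static").get_text()     # present in the markup
dynamic_text = soup.find(id="dynamic").get_text()   # script never ran, so empty

print(title)
print(static_text)
print(repr(dynamic_text))
```

If a page builds its content with JavaScript, the usual options are to call the underlying data endpoint directly or to render the page in a browser-automation tool first and hand the resulting HTML to Beautiful Soup.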

Installing Beautiful Soup in your Python environment is easy. Just type pip install bs4 (this pulls in the beautifulsoup4 package).

pip install bs4

Beautiful Soup supports the HTML parser included in Python’s standard library, but it also supports a number of third-party Python parsers. One of them is the lxml parser. If you want to use it, you have to install it separately. There are four parser options in total. Let's summarize the advantages and disadvantages of each one.

BeautifulSoup(markup, "html.parser") Advantages: Batteries included, decent speed. Disadvantages: Not as fast as lxml, less lenient than html5lib.

BeautifulSoup(markup, "lxml") Advantages: Very fast, lenient. Disadvantages: External C dependency.

BeautifulSoup(markup, "xml") Advantages: Very fast, the only currently supported XML parser. Disadvantages: External C dependency.

BeautifulSoup(markup, "html5lib") Advantages: Extremely lenient, creates valid HTML5. Disadvantages: Very slow, external Python dependency.
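Since lxml and html5lib are optional installs, one way to handle the choice is to try the preferred parser and fall back to the built-in one. The sketch below is only an illustration: `make_soup` and its default preference order are hypothetical names chosen here, not part of Beautiful Soup. It relies on `FeatureNotFound`, the exception Beautiful Soup raises when the requested parser is not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred=("lxml", "html5lib", "html.parser")):
    """Hypothetical helper: return (soup, parser_name) for the first parser available."""
    for parser in preferred:
        try:
            return BeautifulSoup(markup, parser), parser
        except FeatureNotFound:
            # This parser is not installed; try the next one in the list.
            continue
    raise RuntimeError("No usable parser found")


# Deliberately broken markup: neither <p> nor <b> is closed.
soup, used = make_soup("<p>Unclosed paragraph<b>bold")

print(used)                      # which parser was actually picked
print(soup.find("b").get_text())
```

Different parsers can build slightly different trees from invalid markup, so it is worth fixing the parser name explicitly once a script's results matter.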

— — — — — — — — — — — — — — — — — — — — — — — — — —

Project-1) Examine a basic HTML document
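The project's own code is not included in this excerpt. As a rough sketch of what examining a basic HTML document can look like, the example below parses a small made-up document (the `html_doc` string, its class names, and the example.com links are all assumptions for illustration) and reads the title, a heading, the paragraphs, and the link targets.

```python
from bs4 import BeautifulSoup

# Made-up document used only for illustration.
html_doc = """<html><head><title>My First Page</title></head>
<body>
<h1 class="heading">Hello</h1>
<p class="intro">First paragraph with a <a href="https://example.com/a">link A</a>.</p>
<p>Second paragraph with a <a href="https://example.com/b">link B</a>.</p>
</body></html>"""

soup = BeautifulSoup(html_doc, "html.parser")

page_title = soup.title.string                          # text of the <title> tag
heading = soup.h1.get_text()                            # first <h1> in the document
paragraph_count = len(soup.find_all("p"))               # every <p> tag
links = [a["href"] for a in soup.find_all("a")]         # href of every <a> tag
intro = soup.find("p", class_="intro").get_text()       # filter by CSS class

print(page_title, heading, paragraph_count)
print(links)
print(intro)
```

Note that `class_` has a trailing underscore because `class` is a reserved word in Python.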


Written by HKN MZ

I write about SQL Server, Elasticsearch, and Python. I have been a database administrator working with SQL Server and Elasticsearch for more than 5 years.
