Software Details:
Version: 0.99999 / 1.0b3
Upload Date: 12 May 15
Distribution Type: Freeware
Downloads: 46
HTML5Lib is an HTML parser that follows the official WHATWG HTML5 specification.
The parser is designed to handle all flavours of HTML and parses invalid documents using well-defined error handling rules compatible with the behaviour of major desktop web browsers.
The parsed output is placed in a tree structure.
It supports output to ElementTree, DOM and lxml tree formats as well as a simple custom format.
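As a rough illustration of the error handling and tree output described above, here is a minimal sketch using the Python package. It assumes a reasonably recent html5lib release for Python is installed; the sample markup and variable names are made up for the example.

```python
import html5lib

# Deliberately sloppy markup: no doctype, no <html>/<body>, unclosed <p> tags.
markup = "<title>Demo</title><p>One<p>Two"

# treebuilder="etree" returns an xml.etree.ElementTree element.
# namespaceHTMLElements=False keeps plain tag names such as "p"
# instead of names qualified with the XHTML namespace.
root = html5lib.parse(markup, treebuilder="etree", namespaceHTMLElements=False)

# The parser supplies the missing <html>, <head> and <body> elements
# and closes the first <p> when the second one starts.
title = root.find("head/title").text
paragraphs = [p.text for p in root.findall("body/p")]

print(root.tag, title, paragraphs)
```

Passing `treebuilder="dom"` or `treebuilder="lxml"` instead should select the other tree formats listed above, the latter only if lxml is installed.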
HTML5Lib is packaged with distutils.
HTML5Lib is also available in:
- Ruby
- Python
- PHP
What is new in this release:
- Parses valid and invalid HTML documents to a tree
- Support for minidom, ElementTree (including cElementTree and lxml.etree), BeautifulSoup (deprecated) and custom simpletree output formats
- DOM to SAX converter
- Reports parse errors
- Character encoding detection
- Filtering and serializing of trees
- HTML+CSS sanitizer
- Many unit tests
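Two of the features listed above, parse error reporting and tree serialization, might be exercised as in the following sketch. It assumes the Python package in a reasonably recent release; the input string and variable names are illustrative only.

```python
import html5lib

# Build a parser explicitly so that its error list can be inspected afterwards.
parser = html5lib.HTMLParser(tree=html5lib.getTreeBuilder("etree"))

# Input with no doctype and unclosed paragraphs; it is still parsed to a tree.
tree = parser.parse("<p>One<p>Two")

# Each recorded error is a tuple of (position, error code, extra data).
error_codes = [code for _position, code, _data in parser.errors]

# Serialize the tree back to markup. omit_optional_tags=False asks the
# serializer to write out tags it would otherwise leave implicit.
output = html5lib.serialize(tree, tree="etree", omit_optional_tags=False)

print(error_codes)
print(output)
```

By default the parser recovers from errors and only records them; constructing it with `strict=True` should make it raise an exception on the first parse error instead.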