Software Details:
Version: 1.0.3
Upload Date: 13 Apr 15
Distribution Type: Freeware
Downloads: 4
Originally designed as a wrapper around lxml, it now extends lxml with the features commonly needed for HTML data mining.
Features:
- General features:
  - jQuery-like CSS selectors
  - Simple access to element attributes
  - Easy conversion of HTML to other formats (BBCode, Markdown, etc.)
  - A few handy functions for working with text
  - Preserves all original lxml features
- Functions for working with plain text:
  - to_unicode -- Convert a string to a Unicode string
  - strip_accents -- Strip accents from a string
  - strip_symbols -- Strip unwanted Unicode symbols from a string
  - strip_spaces -- Strip excess spaces from a string
  - strip_linebreaks -- Strip excess line breaks from a string
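To give a rough idea of what the text helpers listed above do, here is a minimal standard-library sketch. These are illustrative stand-ins, not the package's own code: the function bodies, and in particular the meaning of "excess" (runs of spaces collapsed to one, three or more line breaks collapsed to two), are assumptions made for this example.

```python
import re
import unicodedata


# Illustrative re-implementations only; the real package may behave differently.

def strip_accents(text):
    """Remove combining accent marks, keeping the base characters."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_spaces(text):
    """Collapse runs of spaces and tabs into one space and trim both ends."""
    # Assumption: "excess" means more than one consecutive space or tab.
    return re.sub(r"[ \t]+", " ", text).strip()


def strip_linebreaks(text):
    """Collapse three or more consecutive line breaks into two."""
    # Assumption: a single blank line between paragraphs is kept.
    return re.sub(r"\n{3,}", "\n\n", text)


if __name__ == "__main__":
    print(strip_accents("café"))
    print(strip_spaces("  too    many   spaces  "))
    print(repr(strip_linebreaks("first\n\n\n\nsecond")))
```

Helpers of this kind are typically applied after extracting text from parsed HTML, so that the result can be compared or stored in a normalized form.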
Requirements:
- lxml