forked from html5lib/html5lib-python
A Python HTML parser/tokenizer based on the WHATWG HTML5 spec (Required by python-sphinx, python-sphinxcontrib-htmlhelp, qt-doc, qt-webengine) | (PKGBUILD: https://archlinux.org/packages/extra/any/python-html5lib)
sysfce2/python-html5lib
html5lib is a pure-Python library for parsing HTML. It is designed to conform to the HTML 5 specification, which has formalized the error handling algorithms of popular web browsers.

= Installation =

html5lib is packaged with distutils. To install it, use:

    $ python setup.py install

= Tests =

You may wish to check that your installation succeeded by running the test suite. All the tests can be run by invoking runtests.py in the tests/ directory.

= Usage =

Simple usage follows this pattern:

    import html5lib
    f = open("mydocument.html")
    parser = html5lib.HTMLParser()
    document = parser.parse(f)

More documentation is available in the docstrings or from http://code.google.com/p/html5lib/wiki/UserDocumentation

= Bugs =

Please report any bugs on the issue tracker: http://code.google.com/p/html5lib/issues/list

= Get Involved =

Contributions to code or documentation are actively encouraged. Submit patches to the issue tracker or discuss changes on IRC in the #whatwg channel on freenode.net.
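
As a follow-up to the usage pattern given under Usage, here is a minimal sketch that wraps the same calls in a helper which closes the file when parsing finishes. The helper name `parse_file` and its optional `parser` argument are assumptions made for illustration and are not part of html5lib; only `html5lib.HTMLParser()` and its `parse()` method come from the README. Opening the file in binary mode is also a choice made here, on the assumption that the parser should see the raw bytes.

```python
def parse_file(path, parser=None):
    """Parse the HTML file at `path` and return the resulting document.

    `parse_file` is a hypothetical helper, not an html5lib function.
    `parser` may be any object with a `parse(stream)` method; when it is
    omitted, an html5lib.HTMLParser is created, as in the README.
    """
    if parser is None:
        # Imported lazily so the helper can be exercised with a stand-in
        # parser on systems where html5lib is not installed.
        import html5lib
        parser = html5lib.HTMLParser()

    # The context manager closes the file even if parsing raises.
    with open(path, "rb") as f:
        return parser.parse(f)
```

With html5lib installed, `document = parse_file("mydocument.html")` should be equivalent to the four-line pattern in the README, except that the file handle is released as soon as parsing is done.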