This is a Python library of web-related functions, such as:

remove comments or tags from HTML snippets
extract the base URL from HTML snippets
translate entities in HTML strings
convert raw HTTP headers to dicts and vice versa
construct HTTP auth headers
convert HTML pages to Unicode
sanitize URLs (like browsers do)
extract arguments from URLs
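
To give a rough idea of what a few of the operations listed above amount to, here is a minimal sketch written with the Python standard library only. It is not the library's own code: the function names, signatures, and the default encoding are assumptions chosen for illustration, and the library's actual functions may differ in both interface and behavior.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def basic_auth_header(username, password, encoding="ISO-8859-1"):
    """Build the value of an HTTP Basic ``Authorization`` header (illustrative)."""
    credentials = f"{username}:{password}".encode(encoding)
    return b"Basic " + base64.b64encode(credentials)


def headers_raw_to_dict(raw):
    """Parse a raw header block (bytes) into {name: [value, ...]} (illustrative)."""
    headers = {}
    for line in raw.splitlines():
        if b":" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split(b":", 1)
        headers.setdefault(name.strip(), []).append(value.strip())
    return headers


def url_query_parameter(url, name, default=None):
    """Return the first value of query argument ``name`` in ``url`` (illustrative)."""
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else default
```

For example, `url_query_parameter("http://example.com/?id=200&foo=bar", "id")` returns `"200"`, and a header name that appears several times in the raw block keeps all of its values in one list.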