"PyKHTML is... A Python module for writing website scrapers/spiders. Whereas traditional methods focus on writing the code to parse HTML/forms themselves, PyKHTML uses the excellent KHTML engine to do all the trudge work. It therefore handles webpages very well (even the severely crufty ones) and is pretty darn fast (implemented in C++). As a bonus the module handles JavaScript and cookies transparently. Hurrah!"
Last changelog:
The PyKHTML Changelog is available at http://paul.giannaros.org/pykhtml/changelog.htm
Ratings & Comments
5 Comments
Is it possible to create thumbnails of web pages?
Not at the moment, but that should be easily accomplished. Is it something you'd find useful? If so, I could have a go at implementing it.
Yes. What do you think about PyKDE? Are all KDE libraries already ported to Python?
PyKDE is excellent. It wraps pretty much all of the functionality of kdelibs and is a pleasure to work with -- I would highly recommend it. PyKDE4 is not ready yet; it will be released when the KDE4 API is stable and finalised.
More or less implemented in the development repository (though with a requirement that you're in GUI debug mode, for the moment). You can get instructions on how to check it out at http://paul.giannaros.org/pykhtml/download.htm