Grub
Grub Next Generation is a distributed web crawling system (clients and servers) that helps build and maintain an index of the Web. It uses a client-server architecture in which clients crawl the web and report updates to the server. The peer-to-peer grubclient software crawls during computer idle time.
Pywebcrawler - A lightweight, easy to use Python web crawler API
pywebcrawler is a simple, lightweight, yet effective web-crawler API written entirely in Python, which aims to ease customization of crawling and data fetching. Via a Journal object you define which URLs to crawl and what to do with the pages once they have been downloaded. Interacting with the crawler through Request and Response objects makes it easy to handle situations where, for example, HTTP authentication or cookie support is needed. As of May 2010, pywebcrawler is still
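The journal-driven design described for pywebcrawler can be illustrated with a small standalone sketch. This is not pywebcrawler's actual API: the `Journal`, `Request`, `Response`, and `crawl` names below are hypothetical stand-ins written with the Python standard library only, and the `fake_fetch` function and `example.test` URLs replace real HTTP access so the example runs offline.

```python
import re
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional, Set


@dataclass
class Request:
    """Hypothetical request object: a URL plus optional headers (e.g. cookies)."""
    url: str
    headers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Response:
    """Hypothetical response object: the originating request, status and body."""
    request: Request
    status: int
    body: str


class Journal:
    """Hypothetical journal: decides which URLs to crawl and handles each page."""

    def __init__(self, start_urls: Iterable[str]) -> None:
        self._pending = deque(start_urls)
        self._seen: Set[str] = set(self._pending)
        self.visited: List[str] = []

    def next_request(self) -> Optional[Request]:
        # Return the next URL to fetch, or None when nothing is left.
        if not self._pending:
            return None
        # A demo cookie shows where per-request customization would go.
        return Request(self._pending.popleft(), {"Cookie": "session=demo"})

    def handle(self, response: Response) -> None:
        # Record the page, then queue any links not seen before.
        self.visited.append(response.request.url)
        if response.status != 200:
            return
        for link in re.findall(r'href="([^"]+)"', response.body):
            if link not in self._seen:
                self._seen.add(link)
                self._pending.append(link)


def crawl(journal: Journal, fetch: Callable[[Request], Response]) -> None:
    # Drive the journal until it has no more requests to issue.
    while (request := journal.next_request()) is not None:
        journal.handle(fetch(request))


# In-memory stand-in for the web, so the sketch needs no network access.
FAKE_SITE = {
    "http://example.test/": '<a href="http://example.test/a">a</a> '
                            '<a href="http://example.test/b">b</a>',
    "http://example.test/a": '<a href="http://example.test/">home</a>',
    "http://example.test/b": "no links here",
}


def fake_fetch(request: Request) -> Response:
    body = FAKE_SITE.get(request.url)
    if body is None:
        return Response(request, 404, "")
    return Response(request, 200, body)


journal = Journal(["http://example.test/"])
crawl(journal, fake_fetch)
print(journal.visited)
```

Passing the fetch function in as a parameter keeps the journal logic independent of how pages are retrieved, so a real HTTP client with authentication or cookie handling could be substituted without changing the journal.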