Displaying 1 to 10 of 15 results
Grub Next Generation is a distributed web crawling system (clients/servers) that helps build and maintain an index of the Web. It uses a client-server architecture in which clients crawl the web and update the server. The peer-to-peer grubclient software crawls during computer idle time.
Open Search Server is both a modern crawler and search engine and a suite of high-powered full-text search algorithms. It is built on proven open source technologies such as Lucene, ZK (zkoss), Tomcat, POI, and TagSoup. Open Search Server is a stable, high-performance piece of software.
ASPseek is Internet search engine software developed by SWsoft. ASPseek consists of an indexing robot, a search daemon, and a CGI search frontend. It can index as many as a few million URLs and supports searching for words and phrases, wildcards, and Boolean queries. Search results can be limited to a given time period, site, or Web space (a set of sites) and sorted by relevance (PageRank is used) or by date.
Andjing Web Crawler 0.01 pre-alpha. Andjing is a basic web crawler/spider written in PHP and running in a CLI environment. Requirements: PHP, MySQL. To do: switch the database from MySQL to SQLite to save CPU resources. What you can do: modify this application into a powerful email harvester and/or content crawler. Application usage: extract the files, create the database and table from the included SQL dump file, edit config.php as needed, then run C:\\andjing>php.exe andjing.php http://some
C Crawler is a web crawler built in C# on the .NET Framework (C# 3.5). It contains a simple extension, a web content categorizer, which can separate web pages based on their content ...
A (very basic) web crawling library for Common Lisp. Little attention has been paid to efficiency, but since this is the only web-crawling library I can find for CL, hopefully it will get better. It could be a lot better :) Patches and suggestions welcome! Send to asokoloski (but no spam!) at gmail . com Usage: (web-crawler:start-crawl "http://start/url" (lambda (uri page-text) (do-something-with (or uri page-text))))
Indexfirst - A URL-incrementing web crawler with an "index first, ask questions later" philosophy
The aim of the Indexfirst Project is to create indexfirst, an open source, URL-incrementing web crawler with the following features: it can be invoked with a URL structure containing a mixture of fixed and incrementable parts, e.g. '' would use the backslash pairs to identify which parts to increment ('part1' and 'part2') and would increment them according to rules specified elsewhere. It can be invoked with a starting string for each incrementable part of the URL structure mentioned above, a characte
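The URL-incrementing idea described above can be illustrated with a short Python sketch. Note that the placeholder syntax (`{part1}`, `{part2}`) and the range rules here are assumptions for illustration, since the project's own template example was lost from this listing:

```python
import itertools
import re

def expand_urls(template, ranges):
    """Expand a URL template whose {name} parts are incrementable.

    template -- URL with placeholders, e.g. "http://host/{part1}/{part2}.html"
    ranges   -- maps each placeholder name to the sequence of values to try
    """
    # Find the incrementable parts in the order they appear in the template.
    names = re.findall(r"\{(\w+)\}", template)
    # Walk every combination of values, varying the last part fastest.
    for combo in itertools.product(*(ranges[n] for n in names)):
        url = template
        for name, value in zip(names, combo):
            url = url.replace("{" + name + "}", str(value))
        yield url

# Increment part1 over 1..2 and part2 over "a".."b" (hypothetical rules).
urls = list(expand_urls("http://example.com/{part1}/{part2}.html",
                        {"part1": range(1, 3), "part2": ["a", "b"]}))
# urls now holds the four fixed/incremented combinations in order.
```

A real crawler of this kind would fetch each generated URL and index whatever responds, rather than following links, which is what makes the approach "index first, ask questions later".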
This project is still in its absolute infancy. craWWWler will be a large-scale web crawler written in C++ (no MFC). It currently has a very basic plugin architecture controlled by a deliberately thin manager. The manager, however, is designed to act more like an ignition switch, occasional pump, and emergency shutdown. The manager is responsible for allowing one or more plugins to subscribe to the output of other plugins. In this way, the plugins do not have to pass large amounts of data to other plugins
iCrawler is a web-based crawler system offering features such as multithreading. iCrawler is also an extensible crawler that supports adding new features. Build environment: C# programming language, MS SQL database, ASP.NET.