Displaying 1 to 10 from 34 results
Grub Next Generation is distributed web crawling system (clients/servers) which helps to build and maintain index of the Web. It is client-server architecture where client crawls the web and updates the server. The peer-to-peer grubclient software crawls during computer idle time.
Open Search Server is both a modern crawler and search engine and a suite of high-powered full text search algorithms. Built using the best open source technologies like lucene, zkoss, tomcat, poi, tagsoup. Open Search Server is a stable, high-performance piece of software.
An open source .NET web crawler written in C# using SQL 2005/2008. Arachnode.net is a complete and comprehensive .NET web crawler for downloading, indexing and storing Internet content including e-mail addresses, files, hyperlinks, images, and Web pages.
ASPseek is an Internet search engine software developed by SWsoft.ASPseek consists of an indexing robot, a search daemon, and a CGI search frontend. It can index as many as a few million URLs and search for words and phrases, use wildcards, and do a Boolean search. Search results can be limited to time period given, site or Web space (set of sites) and sorted by relevance (PageRank is used) or date.
mnoGoSearch for UNIX consists of a command line indexer and a search program which can be run under Apache Web Server, or any other HTTP server supporting CGI interface. mnoGoSearch for Unix is distributed in sources and can be compiled with a number of databases, depending on user's choice. It is known to work on a wide variety of the modern Unix operating systems including Linux, FreeBSD, Mac OSX, Solaris and others.
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project. Heritrix is designed to respect the robots.txt exclusion directives and META robots tags, and collect material at a measured, adaptive pace unlikely to disrupt normal website activity.
Andjing Web Crawler 0.01 pre AlphaAndjing is a basic web crawler/spider written in PHP and running in CLI environment. Requirements:PHP MySQL To Do:Change database using SQLite instead of MySQL to save more CPU resource. What You Can Do:You can modify this application into a powerfull email harvester and or content crawler. Application Usage:Extract the files Create database and table from SQL dump file included Edit config.php and change as needed Run C:\\andjing>php.exe andjing.php http://some
C Crawler is a web crawler build in C# with Dotnet framework, built in C# 3.5 version. it contains a simple extention of web content categorizer, which can saparate between the web page depending on their content ...