Grub

Grub Next Generation is a distributed web crawling system (clients and servers) that helps build and maintain an index of the Web. It uses a client-server architecture in which the client crawls the web and updates the server. The peer-to-peer grubclient software crawls during computer idle time.

Open Search Server

Open Search Server is both a modern crawler and search engine and a suite of high-powered full-text search algorithms. It is built on open source technologies such as Lucene, ZK (zkoss), Tomcat, POI, and TagSoup. Open Search Server is a stable, high-performance piece of software.

ASPseek

ASPseek is Internet search engine software developed by SWsoft. ASPseek consists of an indexing robot, a search daemon, and a CGI search frontend. It can index up to a few million URLs and search for words and phrases, use wildcards, and run Boolean searches. Search results can be limited to a given time period, site, or Web space (a set of sites) and sorted by relevance (PageRank is used) or date.

Pavuk

Pavuk is a UNIX program used to mirror the contents of WWW documents or files. It transfers documents from HTTP, FTP, Gopher and optionally from HTTPS (HTTP over SSL) servers. Pavuk has an optional GUI based on the GTK2 widget set.

Anole-spider - a python spider

A Python spider that is easy to customize.

Andjing - PHP web crawler/spider

Andjing Web Crawler 0.01 pre-alpha. Andjing is a basic web crawler/spider written in PHP that runs in a CLI environment.
Requirements: PHP, MySQL.
To do: switch the database from MySQL to SQLite to use less CPU.
What you can do: modify this application into a powerful email harvester and/or content crawler.
Application usage: extract the files; create the database and table from the included SQL dump file; edit config.php and change it as needed; then run C:\andjing>php.exe andjing.php http://some

Baidu-relative-search-words - crawls Baidu's related search words and saves them in a txt file

Starting from a set of original words, the tool crawls their related words and saves them in a txt file. For example, from the word "上海天气" ("Shanghai weather"), it can crawl the following words: 上海天气预报, 上海一周天气, 上海一周天气预报, 上海天气预报查询, 上海明天天气, 上海的天气, 上海天气查询, 上海明天天气预报, 上海今日天气, 上海下周天气
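
The expansion described above can be read as a breadth-first walk over "related word" suggestions. The following is only a minimal Python sketch of that idea, not the project's actual code: the names `crawl_related_words`, `save_words`, `fake_fetch_related`, and the `max_words` limit are all hypothetical, and the lookup is a hard-coded stand-in rather than a real request to Baidu.

```python
from collections import deque
from typing import Callable, Iterable, List

def crawl_related_words(
    seeds: Iterable[str],
    fetch_related: Callable[[str], Iterable[str]],
    max_words: int = 100,
) -> List[str]:
    """Expand seed words breadth-first, returning words in discovery order.

    fetch_related is a caller-supplied function (hypothetical here) that
    returns the related words for one query word.
    """
    seen = set()
    ordered: List[str] = []
    queue = deque()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            ordered.append(seed)
            queue.append(seed)
    while queue and len(ordered) < max_words:
        word = queue.popleft()
        for related in fetch_related(word):
            if related in seen:
                continue  # skip words that were already discovered
            seen.add(related)
            ordered.append(related)
            queue.append(related)
            if len(ordered) >= max_words:
                break
    return ordered

def save_words(words: Iterable[str], path: str) -> None:
    """Write one word per line to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for word in words:
            handle.write(word + "\n")

# Stand-in data for illustration only; a real crawler would query the
# search engine here instead of reading a dictionary.
FAKE_RELATED = {
    "上海天气": ["上海天气预报", "上海一周天气"],
    "上海天气预报": ["上海天气预报查询", "上海天气"],
}

def fake_fetch_related(word: str) -> List[str]:
    return FAKE_RELATED.get(word, [])

if __name__ == "__main__":
    words = crawl_related_words(["上海天气"], fake_fetch_related)
    save_words(words, "related_words.txt")
```

Passing the lookup in as a function keeps the traversal independent of how related words are actually obtained, so the network code can be swapped in without touching the queue logic.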

Bayes-swarm - A research project to extract correlations between web sources

Bayes-Swarm is a research project whose aim is to spider web sources (news portals, blogs and online newspapers) and extract correlations between them. Important: Bayes-Swarm is no longer under active development, and http://www.bayes-swarm.com no longer hosts the spider frontend interface. Feel free to navigate the documentation and use the code for your own purposes, just don't expect many new features to be released... For any info, including access to spidered data (roughly 8Gb of tar g

Arana - Araña is a simple web testing library

Araña ("spider" in Spanish) is a simple web testing library written in C#. It can be used to integrate simple testing of web applications into unit testing, so that the parts of your web application can be tested both separately and in combination. Araña can follow links, post forms and, through simple CSS selectors, check that the content on the pages of your web application is what you expect, so it can be tested with unit test assert statements. Araña us

Aranya - Distributed web crawler (network spider)

Aranya is a spider with a distributed architecture. The project aims to build a safe, efficient, and configurable Internet information collection system. Through its configuration profile, it can provide useful data (pages, photos, etc.) to many kinds of search engines.
