Katta - Lucene and more in the cloud.
Katta is a scalable, failure tolerant, distributed, data storage for real time access. Katta serves large, replicated, indices as shards to serve high loads and very large data sets. These indices can be of different type. Currently implementations are available for Lucene and Hadoop mapfiles. Its features include:
- Makes serving large or high load indices easy
- Serves very large Lucene or Hadoop Mapfile indices as index shards on many servers
- Replicate shards on different servers for performance and fault-tolerance
- Supports pluggable network topologies
- Master fail-over
- Fast, lightweight, easy to integrate
- Plays well with Hadoop clusters
comments powered by Disqus
Snort is an open source network intrusion prevention and detection system (IDS/IPS) developed by Sourcefire. Snort can perform protocol analysis and content searching/matching. It can be used to detect a variety of attacks and probes, such as buffer overflows, stealth port scans, CGI attacks, SMB probes, OS fingerprinting attempts, and much more.
Kiwix enables you to have the whole Wikipedia at hand wherever you go. Kiwix gives you access to the whole human knowledge. You don't need Internet, everything is stored on your computer, USB flash drive or DVD. It is basically an offline reader for web content. It supports the ZIM format, a highly compressed open format with additional meta-data.
Jumper 2.0 is a collaborative community search platform that revolutionizes search by crowdsourcing knowledge management powered by a shared bookmarking engine. It is easily and quickly deployed into a community of practice that benefits users with complex and specialized search requirements. Jumper delivers universal search of any databases, flat files, fileshares, content systems, web pages, blogs and wikis, even people - through one simple search box.
MG4J (Managing Gigabytes for Java) is a free full-text search engine for large document collections written in Java. MG4J is a highly customisable, high-performance, full-fledged search engine providing state-of-the-art features (such as BM25/BM25F scoring) and new research algorithms. The main points of MG4J are Powerful indexing, Multi-index interval semantics, Virtual fields, Clustering and lot more.
Sphinix is free open-source SQL full-text search engine. How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.
Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.
Open Search Server is both a modern crawler and search engine and a suite of high-powered full text search algorithms. Built using the best open source technologies like lucene, zkoss, tomcat, poi, tagsoup. Open Search Server is a stable, high-performance piece of software.
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
Constellio Open Source Enterprise Search is based on Apache Solr and using Google Search Appliances connectors architecture, it allows, with a single click, to find all relevant content in your organization (Web, email, ECM, CRM etc.).