SMILA - Unified information access architecture

SMILA is an extensible framework for building search solutions to access unstructured information in the enterprise. Besides providing essential infrastructure components and services, SMILA also delivers ready-to-use add-on components, such as connectors to the most relevant data sources. Building on the framework lets developers concentrate on creating higher-value solutions, such as semantics-driven applications.

SMILA has crawlers and agents that push data into the framework. The data is then processed by a series of filters. BPEL workflows can be created using its WorkerManager, and this workflow engine manages the processing pipeline.
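The crawl-then-filter flow described above can be illustrated with a minimal sketch. This is not SMILA code and does not use its API: the `Record` type, the `crawl` stand-in, and both filters are hypothetical names chosen only to show data being pushed through an ordered chain of filters.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator


@dataclass
class Record:
    """Hypothetical unit of crawled data; not a SMILA type."""
    source: str
    text: str
    metadata: dict = field(default_factory=dict)


# A filter takes a record and returns the (possibly modified) record.
Filter = Callable[[Record], Record]


def crawl(documents: dict[str, str]) -> Iterator[Record]:
    """Stand-in crawler: pushes one record per in-memory document."""
    for source, text in documents.items():
        yield Record(source=source, text=text)


def strip_whitespace(record: Record) -> Record:
    """Collapse runs of whitespace into single spaces."""
    record.text = " ".join(record.text.split())
    return record


def count_words(record: Record) -> Record:
    """Store the number of whitespace-separated words as metadata."""
    record.metadata["word_count"] = len(record.text.split())
    return record


def run_pipeline(records: Iterable[Record], filters: list[Filter]) -> list[Record]:
    """Apply every filter, in order, to every record pushed by the crawler."""
    processed = []
    for record in records:
        for apply_filter in filters:
            record = apply_filter(record)
        processed.append(record)
    return processed


if __name__ == "__main__":
    docs = {"file:///tmp/a.txt": "  hello   enterprise  search "}
    for rec in run_pipeline(crawl(docs), [strip_whitespace, count_words]):
        print(rec.source, rec.text, rec.metadata)
```

In a real deployment the filter order would be defined by the workflow engine rather than by a Python list; the list here only stands in for that ordering.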




http://www.eclipse.org/smila/

Related Products

Snort - Network Intrusion Prevention and Detection System

Snort is an open source network intrusion prevention and detection system (IDS/IPS) developed by Sourcefire. Snort can perform protocol analysis and content searching/matching. It can be used to detect a variety of attacks and probes, such as buffer overflows, stealth port scans, CGI attacks, SMB probes, OS fingerprinting attempts, and much more.

UIMA - Unstructured information management architecture

UIMA analyzes large volumes of unstructured information in order to discover knowledge that is relevant to an end user. It is a framework built from a set of different components. The components include language identification, language-specific segmentation, sentence boundary detection, and entity detection (person and place names). The framework manages these components and the data flows between them.

OpenPipe - Document Pipeline

OpenPipe is an open source, scalable platform for manipulating a stream of documents. A pipeline is an ordered set of steps (operations) performed on a document to convert it from its raw form into something ready to be put into the index. The operations performed on documents include language detection, field manipulation, POS tagging, entity extraction, and submitting the document to a search engine.

Kiwix - Offline Reader For Wikipedia

Kiwix enables you to have the whole of Wikipedia at hand wherever you go. You don't need an Internet connection: everything is stored on your computer, USB flash drive or DVD. It is basically an offline reader for web content. It supports the ZIM format, a highly compressed open format with additional metadata.

Aperture - Java framework for getting data and metadata

Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems. It can crawl and extract information from file systems, websites, mailboxes and mail servers. It supports various file formats such as Office, PDF, Zip and many more. Metadata is also extracted from image files. Aperture has a strong focus on semantics: extracted metadata can be mapped to predefined properties.

Uima-connectors - uima connectors, solutions to build the bridge between some markup languages and t

UIMA-connectors aims mainly at offering solutions that bridge some markup languages and the UIMA data structure, namely the CAS. In comparison, the Tika project aims at detecting and extracting metadata and structured text content from documents of various MIME types. UIMA-connectors is more dedicated to performing mapping between text formats and the CAS, providing solutions for handling formats such as Extensible Markup Language (XML), Comma-Separated Values (CSV), whites

Google-enterprise-connector-manager - Google Search Appliance Connector Manager

Welcome to the Google Enterprise Connector Manager project! The Google Enterprise connector framework enables the Google Search Appliance to search and serve documents stored in non-Web repositories, such as enterprise content management systems. An enterprise content management (ECM) system provides a central repository for large numbers of documents. The Connector Manager is the central part of the connector framework for the Google Search Appliance. The Connector Manager itself manages creati

Google-enterprise-connector-otex - Google Search Appliance Connector for Livelink

Welcome to the Open Text Livelink Connector Project! The Google Open Text Livelink Connector enables the Google Search Appliance to search and serve documents and other content stored in Open Text Livelink. Update: Livelink connector release 2.6.12 is now available. This is a patch release with some enhancements. This release replaces version 2.6.10. See the release notes for details on the changes in both versions. To install this release, use the 2.6.8 installer, and then update the Livelink c

Gate - General Architecture for Text Engineering

GATE excels at text analysis of all shapes and sizes. It provides support for diverse language processing tasks, with components such as parsers, morphological analyzers, taggers, information retrieval tools, and information extraction components for various languages, among many others. It also provides support for measuring, evaluating, modeling and persisting the data structures involved. It can analyze text or speech. It has built-in support for machine learning and supports different machine learning implementations via plugins.

Google-enterprise-connector-afyd - Google Apps For Your Domain Connector for Enterprise Search Appli

Welcome to the Apps-For-Your-Domain Connector Project! The Apps-For-Your-Domain Connector is experimental, unsupported code intended to enable the Google Search Appliance to search and serve documents and other content stored in a Google Apps-For-Your-Domain, and made available through GData APIs. This code is under active development, is not yet usable, is not tested, and is unsupported! However, developers may find this code useful as example code for building other connectors. If you are inte
