Aperture - Java framework for getting data and metadata
Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems. It can crawl and extract information from file systems, websites, mailboxes, and mail servers. It supports many file formats, including Office documents, PDF, and Zip archives, and it also extracts metadata from image files. Aperture has a strong focus on semantics: extracted metadata can be mapped to predefined properties.
Gate - General Architecture for Text Engineering
GATE excels at text analysis of all shapes and sizes. It supports diverse language processing tasks, including parsing, morphology, tagging, information retrieval tools, and information extraction components for various languages, among many others. It also provides support for measuring, evaluating, modeling, and persisting the data structures involved. It can analyze text or speech. It has built-in support for machine learning and supports other machine learning implementations via plugins.
OpenPipe - Document Pipeline
OpenPipe is an open source, scalable platform for manipulating a stream of documents. A pipeline is an ordered set of steps (operations) performed on a document to convert it from its raw form into something ready to be put into the index.
The operations performed on documents include language detection, field manipulation, POS tagging, entity extraction, and submitting the document to a search engine.
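The idea of an ordered set of steps applied to each document can be sketched in a few lines of Java. This is only an illustration of the pipeline concept and does not use OpenPipe's actual API: the names PipelineSketch, Document, Step, Pipeline, and buildExample are hypothetical, the "language detection" step is a placeholder, and the "index" is just an in-memory list standing in for a search engine.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a document pipeline; not OpenPipe's real classes.
public class PipelineSketch {

    /** A document is modelled here as a mutable map of field names to values. */
    static class Document {
        final Map<String, String> fields = new LinkedHashMap<>();
    }

    /** One operation in the pipeline. */
    interface Step {
        void process(Document doc);
    }

    /** Runs its steps in the order in which they were added. */
    static class Pipeline {
        private final List<Step> steps = new ArrayList<>();

        Pipeline add(Step step) {
            steps.add(step);
            return this;
        }

        Document run(Document doc) {
            for (Step step : steps) {
                step.process(doc);
            }
            return doc;
        }
    }

    /** Builds a small example pipeline that ends by "submitting" to the given list. */
    static Pipeline buildExample(List<Document> index) {
        return new Pipeline()
            // Field manipulation: derive a cleaned "body" field from the raw text.
            .add(doc -> doc.fields.put("body", doc.fields.get("raw").trim()))
            // Placeholder for language detection (assumption: no real detector here).
            .add(doc -> doc.fields.put("language", "unknown"))
            // Field manipulation: drop the raw field once it is no longer needed.
            .add(doc -> doc.fields.remove("raw"))
            // Final step: hand the document to the stand-in index.
            .add(index::add);
    }

    public static void main(String[] args) {
        List<Document> index = new ArrayList<>();
        Document doc = new Document();
        doc.fields.put("raw", "  Hello pipeline  ");
        buildExample(index).run(doc);
        System.out.println(index.get(0).fields);
    }
}
```

Because each step only sees the document, steps can be reordered, removed, or replaced without touching the others, which is the main appeal of the pipeline arrangement described above.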
Oa4j - Java Client for OpenAmplify
Overview: OA4J is a Java client for version 2.1 of the OpenAmplify web service. It requires Java 1.6+ and an API key, which is free and takes a couple of minutes to obtain.
Installation: Add oa4j-x.x.x.jar to your classpath.
Usage:
import static java.lang.System.out;
import java.net.URL;
import com.linguamathematica.oa4j.Analysis;
import com.linguamathematica.oa4j.AnalysisService;
import com.linguamathematica.oa4j.DefaultAnalysisService;
public class Test{
	public sta
Spatiotemporal-zoning - Spatiotemporal zoning aims to analyze and partition text into segments that
We proposed a framework called spatiotemporal zoning to overcome the limitations of geo-temporal encoding in existing health surveillance systems. It classifies news content into predefined classes based on its spatial and temporal characteristics and recognizes the spatial and temporal attributes of each event. Specifically, spatiotemporal zoning is the task of partitioning text into segments that contain events, which occurred in the same location at th