Architecture

Truviso's underlying architecture integrates all the key components for delivering real-time data analysis and event processing applications. Based on ground-breaking research in data stream processing, the Truviso architecture has expanded well beyond merely a data stream management system to incorporate other important elements that are part of delivering complete solutions.

The Truviso architecture includes three main sub-systems:

  • TruCQ - The core engine in Truviso's architecture is TruCQ, which performs all data management functions, including queries over both data streams and tables. TruCQ builds on the open source PostgreSQL database. Starting from this proven, enterprise-class code base makes TruCQ much more feature-rich and robust than competing data stream management systems (DSMS), which were written from scratch. This approach also enables TruCQ to perform both data stream processing and traditional database functions within the same engine, making Truviso the only vendor to offer such a capability.
  • TruLink - TruLink provides the integration framework for getting data into and out of the TruCQ engine. Built in Java, TruLink provides a library of source connectors that enable rapid integration with data feeds, and action connectors that deliver processed streams and/or decisions to outbound interfaces and systems.
  • TruView - The third sub-system, TruView, provides a framework for rapidly delivering dynamic, visual front-ends that render query results directly in a standard web browser. TruView provides a set of pre-built graphing and charting templates, which can be plugged into one or more outputs.

What enables Truviso to analyze such huge data volumes in real time is an entirely new approach to data processing. Unlike a traditional database, which must first persist data to a table before running queries, TruCQ runs its queries continuously against live streams of data, producing updated results the instant new data arrives. A data stream is an unbounded, potentially infinite series of records traveling through a network, such as a stock ticker, a web click-stream, or sales transaction data. In addition, the system can create derived streams, which are filtered, aggregated, correlated, or otherwise processed on the fly and exposed as views for use by other queries.
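The difference between a continuous query and a store-then-query model can be illustrated with a minimal Python sketch. This is not TruCQ code: the record shape, the function names `large_trades` and `running_max`, and the sample ticker are all assumptions made up for illustration. A generator stands in for a derived stream, and a second generator stands in for a continuous query that emits an updated result for every arriving record.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record shape: (symbol, price).
Trade = Tuple[str, float]


def large_trades(trades: Iterable[Trade], min_price: float) -> Iterator[Trade]:
    """Derived stream: filters records on the fly without storing them."""
    for symbol, price in trades:
        if price >= min_price:
            yield symbol, price


def running_max(trades: Iterable[Trade]) -> Iterator[Dict[str, float]]:
    """Continuous query: emits an updated result for each arriving record."""
    best: Dict[str, float] = {}
    for symbol, price in trades:
        best[symbol] = max(price, best.get(symbol, price))
        yield dict(best)  # snapshot of the current per-symbol maximum


# Illustrative input; a real stream would be unbounded.
ticker = [("AAA", 9.0), ("BBB", 20.0), ("AAA", 12.0), ("BBB", 18.0)]

# The continuous query consumes the derived stream, not the raw one.
results = list(running_max(large_trades(ticker, 10.0)))
```

Each record flows through both stages as it arrives, so no table is written before a result is available. The first trade is dropped by the derived stream, and the remaining three each produce one updated result.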

To run queries against streams, TruCQ provides simple extensions to SQL that apply windows, which segment the stream into discrete data sets by time or by number of records. The window defines the set of data over which a given query produces results. TruCQ provides rich windowing semantics that support a variety of window definitions, including snapshot, chunking, sliding, landmark, and partitioning windows.
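As a rough illustration of two of these window types, the Python sketch below segments a stream by number of records. It does not reproduce TruCQ's SQL extensions; the function names `chunking_windows` and `sliding_windows` and the sample data are hypothetical. A chunking window is modeled here as non-overlapping groups of records, and a sliding window as the most recent records re-evaluated on every arrival.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunking_windows(stream: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield non-overlapping windows of `size` records each."""
    window: List[T] = []
    for record in stream:
        window.append(record)
        if len(window) == size:
            yield window
            window = []  # start a fresh chunk


def sliding_windows(stream: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield the latest `size` records each time a new record arrives."""
    window: Deque[T] = deque(maxlen=size)  # oldest record drops off automatically
    for record in stream:
        window.append(record)
        if len(window) == size:
            yield list(window)


# Illustrative input values.
prices = [10, 12, 11, 15, 14, 13]

# The same aggregate (SUM) evaluated over each kind of window.
chunk_sums = [sum(w) for w in chunking_windows(prices, 3)]
slide_sums = [sum(w) for w in sliding_windows(prices, 3)]
```

With six records and a window of three, the chunking variant produces two results while the sliding variant produces four, which shows how the window definition determines the set of data behind each query result.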

Unlike queries in a traditional database, which must be initiated and executed sequentially, TruCQ runs all queries in the engine continuously and concurrently, so they are always producing up-to-date results. Truviso's query semantics conform as closely as possible to ISO standard SQL. Consequently, database developers can become comfortable and productive with TruCQ with minimal training. While other stream processing solutions have adopted complex proprietary languages or mere subsets of SQL, TruCQ supports full SQL, including:

  • Views
  • Filters (e.g. basic operators such as >, <, and =)
  • User-defined functions (e.g. in C, C++, Java, or Perl)
  • Joins over both streams and relations
  • Aggregates (e.g. SUM, COUNT, MIN, MAX, AVG, etc.)
  • User-defined aggregates
  • Grouped aggregates
  • Arbitrary sub-selects
  • Other sub-queries

TruCQ can execute all of these as continuous queries directly against data streams without first persisting the data. Optionally, data streams and query results can be archived, in which case they are persisted directly to tables within TruCQ for replay, back-testing, drill-down, benchmarking, and other advanced analysis.

The breakthrough of Truviso's architectural design is its Adaptive Query Processor. Unlike a traditional query processor, which executes each individual query in a separate static dataflow, the Adaptive Query Processor "folds" the processing steps of multiple different queries, on the fly, into a shared global "uber-query" that effectively executes as a single query in the system. The result is super-linear query scalability that enables thousands of concurrent queries to be run continuously against incoming streams of data. The system has been benchmarked processing data streams in excess of 200,000 records/second on a single server, with support for clustering to handle even greater data volumes.
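The general idea of sharing work across queries can be sketched in a few lines of Python. This is only an illustration of shared processing under simple assumptions, not the Adaptive Query Processor's actual algorithm, which is not described here. The class `SharedFilterPlan`, its methods, and the string keys used to recognize identical predicates are all hypothetical. Queries that use the same filter register against a single shared predicate, so each record is tested once per distinct predicate rather than once per query.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Predicate = Callable[[dict], bool]


class SharedFilterPlan:
    """Hypothetical shared plan: one evaluation per distinct predicate."""

    def __init__(self) -> None:
        self.predicates: Dict[str, Predicate] = {}
        self.subscribers: Dict[str, List[str]] = defaultdict(list)
        self.evaluations = 0  # counts predicate evaluations, for illustration

    def register(self, query_id: str, pred_key: str, predicate: Predicate) -> None:
        # Queries with the same key reuse the predicate registered first.
        self.predicates.setdefault(pred_key, predicate)
        self.subscribers[pred_key].append(query_id)

    def process(self, record: dict) -> List[str]:
        """Return the ids of all queries whose filter matches the record."""
        matched: List[str] = []
        for key, predicate in self.predicates.items():
            self.evaluations += 1
            if predicate(record):
                matched.extend(self.subscribers[key])
        return matched


plan = SharedFilterPlan()
plan.register("q1", "price>100", lambda r: r["price"] > 100)
plan.register("q2", "price>100", lambda r: r["price"] > 100)  # shares q1's filter
plan.register("q3", "symbol=AAA", lambda r: r["symbol"] == "AAA")

matched = plan.process({"symbol": "AAA", "price": 150})
```

Three queries are registered, but only two predicate evaluations are performed for the record, because `q1` and `q2` share a filter. As more queries overlap, the per-query cost of each new record falls, which is the intuition behind sharing a dataflow across queries.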
