Innovations

Understand what sets DataJitsu apart from other big data processing solutions

Volume, variety and velocity of a big data system
Easy and cost-effective scaling of storage and processing
Reliability and performance of a data warehouse
- Transactional inserts, deletes, upserts and queries, i.e. reliable concurrent reads and writes
- Automatic indexing, compaction and caching of data in object stores (e.g. S3)
Increased speed and lower latency for data ingestion
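
To make "transactional upserts with reliable concurrency" concrete, here is a minimal, library-free sketch of optimistic concurrency control, one common way such guarantees are provided. It is an illustration only and not a description of DataJitsu's internals: the `ToyTable` class, its methods and the sample rows are all hypothetical.

```python
class ToyTable:
    """Hypothetical versioned table: a commit succeeds only if it was
    prepared against the latest version (optimistic concurrency)."""

    def __init__(self):
        self.version = 0
        self.rows = {}

    def snapshot(self):
        # Readers get a consistent copy plus the version it belongs to.
        return self.version, dict(self.rows)

    def commit(self, read_version, upserts=None, deletes=()):
        # Reject the commit if another writer got in first.
        if read_version != self.version:
            return False  # caller must re-read and retry
        new_rows = dict(self.rows)
        new_rows.update(upserts or {})      # insert new keys, update existing ones
        for key in deletes:
            new_rows.pop(key, None)
        # Swap in the new state in one step so readers never see a partial write.
        self.rows = new_rows
        self.version += 1
        return True


table = ToyTable()
v0, _ = table.snapshot()
first_ok = table.commit(v0, upserts={"a": 1, "b": 2})  # commits version 1
stale_ok = table.commit(v0, upserts={"a": 99})         # stale version, rejected

v1, rows = table.snapshot()                            # re-read, then retry
retry_ok = table.commit(v1, upserts={"a": rows["a"] + 10}, deletes=["b"])
```

The rejected writer loses no data; it re-reads the current snapshot and retries, which is what keeps concurrent upserts and deletes from silently overwriting each other.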

A fast, scalable, fault-tolerant, end-to-end, exactly-once stream processing API that simplifies streaming applications
The API handles incremental, continuous updates of the final result table
The Dataset/DataFrame API can be used in Scala, Java, Python, or R to express streaming aggregations, event-time windows, stream-to-batch joins, etc.
Computations are optimized via the Spark SQL engine
Guarantees end-to-end exactly-once fault tolerance through checkpointing and write-ahead logs (WALs)
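
The combination of incremental result updates, event-time windows, checkpointing and write-ahead logs can be sketched in plain Python. This is a toy model under stated assumptions, not the Spark or DataJitsu implementation: the `ToyStreamingCounter` class, the 60-second window and the sample events are hypothetical, and in-memory lists stand in for durable storage.

```python
from collections import Counter

WINDOW_SECONDS = 60  # hypothetical tumbling-window width


class ToyStreamingCounter:
    """Counts events per event-time window and survives replays."""

    def __init__(self, wal=None, checkpoint=None):
        # WAL: list of (offset, event_time) entries, standing in for durable storage.
        self.wal = wal if wal is not None else []
        # Checkpoint: (last applied offset, snapshot of the result table).
        last_offset, table = checkpoint if checkpoint else (-1, {})
        self.last_offset = last_offset
        self.table = Counter(table)
        # Recovery: re-apply WAL entries that are newer than the checkpoint.
        for offset, event_time in self.wal:
            self._apply(offset, event_time)

    def _apply(self, offset, event_time):
        if offset <= self.last_offset:
            return  # already reflected in the result table
        window_start = event_time - event_time % WINDOW_SECONDS
        self.table[window_start] += 1  # incremental update of the result
        self.last_offset = offset

    def process(self, offset, event_time):
        if offset <= self.last_offset:
            return  # duplicate delivery from the source, ignore
        self.wal.append((offset, event_time))  # log first, then apply
        self._apply(offset, event_time)

    def checkpoint(self):
        return self.last_offset, dict(self.table)


events = [(0, 5), (1, 42), (2, 61), (3, 130)]  # (offset, event time in seconds)

job = ToyStreamingCounter()
for offset, event_time in events[:2]:
    job.process(offset, event_time)
saved = job.checkpoint()
job.process(*events[2])  # logged and applied, then the job "crashes"

# Restart from the checkpoint plus the WAL; the source replays from offset 1.
recovered = ToyStreamingCounter(wal=list(job.wal), checkpoint=saved)
for offset, event_time in events[1:]:
    recovered.process(offset, event_time)

result = dict(recovered.table)
print(result)
```

Although the source redelivers events 1 and 2 after the restart, each event is counted once, because offsets at or below the last applied offset are skipped. A real engine applies the same idea with replayable sources and idempotent sinks.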