
Innovations
Understand what sets DataJitsu apart from other big data processing solutions
Cloud Data Warehouse
- Volume, variety and velocity of a big data system
- Easy and cost-effective scaling of storage and processing
- Reliability and performance of a data warehouse
- Transactional inserts, deletes, upserts and queries with reliable concurrency
- Automatic indexing, compaction and caching of data in object stores (e.g. S3)
- Increased speed and lower latency for data ingestion
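
To make "transactional upserts with reliable concurrency" concrete, here is a minimal sketch in plain Python. It is not DataJitsu's implementation: the `VersionedTable` class, its methods and the `id` key column are hypothetical names chosen for illustration, and the optimistic version check is only one common way to get this behavior.

```python
import threading


class VersionedTable:
    """Hypothetical in-memory table; every commit atomically creates a new version."""

    def __init__(self):
        self._lock = threading.Lock()
        self._version = 0
        self._rows = {}  # primary key -> row dict

    def snapshot(self):
        """Return (version, copy of rows) so a reader sees one consistent view."""
        with self._lock:
            return self._version, dict(self._rows)

    def commit_upsert(self, read_version, rows, key="id"):
        """Apply the upsert only if nobody else committed since read_version."""
        with self._lock:
            if read_version != self._version:
                return False  # conflict: the caller should re-read and retry
            for row in rows:
                self._rows[row[key]] = row  # insert a new key or overwrite an existing one
            self._version += 1
            return True


def upsert_with_retry(table, rows, max_attempts=5):
    """Read the current version, try to commit, and retry on conflict."""
    for _ in range(max_attempts):
        version, _ = table.snapshot()
        if table.commit_upsert(version, rows):
            return True
    return False


table = VersionedTable()
upsert_with_retry(table, [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}])
upsert_with_retry(table, [{"id": 2, "city": "Quito"}, {"id": 3, "city": "Accra"}])
version, rows = table.snapshot()
```

After the two commits the table is at version 2 and holds three rows, with row 2 overwritten by the second upsert. A writer that still holds an older version number is rejected instead of silently overwriting newer data.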
Streaming Pipeline
- A fast, scalable, fault-tolerant, end-to-end, exactly-once stream processing API that simplifies streaming applications
- The API handles incremental, continuous updates of the final result table
- The Dataset/DataFrame API can be used in Scala, Java, Python, or R to express streaming aggregations, event-time windows, stream-to-batch joins, etc.
- Computations are optimized by the Spark SQL engine
- End-to-end exactly-once fault tolerance is guaranteed through checkpointing and write-ahead logs (WALs)
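
The interplay of incremental result updates, event-time windows, checkpointing and write-ahead logs can be sketched with a toy model in plain Python. This is not the Spark API and not DataJitsu's code: the `WindowedCountJob` class, the 60-second window and the sample events are assumptions made for illustration. The idea it shows is that the planned offset range is logged before any work starts, and results are committed together with the checkpoint, so replaying after a crash counts each event once.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # tumbling event-time window size (illustrative value)


class WindowedCountJob:
    """Hypothetical toy job: counts events per (window start, key)."""

    def __init__(self):
        self.wal = []          # write-ahead log of planned offset ranges (start, end)
        self.checkpoint = -1   # highest source offset whose results are committed
        self.result = defaultdict(int)  # the incrementally updated result table

    def run_batch(self, source, end, fail_before_commit=False):
        start = self.checkpoint + 1
        self.wal.append((start, end))  # log the intent before doing any work
        self._execute(source, start, end, fail_before_commit)

    def _execute(self, source, start, end, fail_before_commit=False):
        updates = defaultdict(int)
        for event_time, key in source[start:end]:
            window_start = event_time - event_time % WINDOW_SECONDS
            updates[(window_start, key)] += 1
        if fail_before_commit:
            raise RuntimeError("simulated crash before commit")
        # Commit step: results and checkpoint advance together.
        for window_key, count in updates.items():
            self.result[window_key] += count
        self.checkpoint = end - 1

    def recover(self, source):
        """Replay the last logged range if it was never committed."""
        if self.wal and self.wal[-1][1] - 1 > self.checkpoint:
            start, end = self.wal[-1]
            self._execute(source, start, end)


# Replayable source: (event_time_seconds, key) pairs addressed by offset.
events = [(5, "a"), (42, "b"), (61, "a"), (65, "a"), (130, "b")]

job = WindowedCountJob()
job.run_batch(events, 2)  # first micro-batch commits offsets 0..1
try:
    job.run_batch(events, 5, fail_before_commit=True)  # crashes mid-batch
except RuntimeError:
    job.recover(events)  # replays offsets 2..4 from the logged range
```

The sketch relies on two assumptions that also matter in real systems: the source can be re-read by offset, and the commit step is atomic. With both in place, the failed batch leaves no partial results behind, and the replay yields the same counts as a run without failures.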