Innovations

Understand what sets DataJitsu apart from other big data processing solutions

Cloud Data Warehouse
  • Volume, variety and velocity of a big data system

  • Easy and cost-effective scaling of storage and processing

  • Reliability and performance of a data warehouse

    • Transactional inserts, deletes, upserts and queries, providing reliable concurrent access

    • Automatically indexes, compacts and caches data in object stores (e.g. S3)

  • Increased speed and lower latency for data ingestion

Streaming Pipeline
  • A fast, scalable, fault-tolerant, end-to-end, exactly-once stream processing API that simplifies streaming applications

  • The API handles incremental, continuous updates of the final result table

  • The Dataset/DataFrame API can be used in Scala, Java, Python, or R to express streaming aggregations, event-time windows, stream-to-batch joins, etc.

  • Computations are optimized via the Spark SQL engine

  • Guarantees end-to-end exactly-once fault tolerance through checkpointing and write-ahead logs (WALs)
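
To make the streaming ideas above concrete, here is a minimal pure-Python sketch of how an engine might keep an event-time windowed result table up to date incrementally while applying each batch exactly once. It is an illustration only, not the DataJitsu or Spark API: the class `WindowedCounter`, the 60-second window, the `fail_midway` flag and the in-memory "WAL" list are all hypothetical names chosen for this example.

```python
WINDOW_SECONDS = 60  # hypothetical tumbling-window size, in seconds


class WindowedCounter:
    """Toy stand-in for a streaming engine (not a real library API)."""

    def __init__(self):
        self.result = {}           # result table: (window_start, key) -> count
        self.committed_batch = -1  # checkpoint: id of the last committed batch
        self.wal = []              # write-ahead log: batch ids we intended to run

    def process_batch(self, batch_id, events, fail_midway=False):
        """Apply one batch of (event_time, key) events at most once."""
        if batch_id <= self.committed_batch:
            return  # batch already committed; a replay is a no-op
        if batch_id not in self.wal:
            self.wal.append(batch_id)  # record intent before doing any work

        # Stage changes on a copy so a crash never leaves partial updates.
        staged = dict(self.result)
        for n, (event_time, key) in enumerate(events):
            if fail_midway and n == 1:
                raise RuntimeError("simulated crash")
            window_start = event_time - event_time % WINDOW_SECONDS
            slot = (window_start, key)
            staged[slot] = staged.get(slot, 0) + 1

        # Commit the new table and the checkpoint together.
        self.result = staged
        self.committed_batch = batch_id


engine = WindowedCounter()
engine.process_batch(0, [(5, "click"), (42, "click"), (61, "view")])

batch_1 = [(70, "view"), (118, "click")]
try:
    engine.process_batch(1, batch_1, fail_midway=True)  # crashes mid-batch
except RuntimeError:
    pass
engine.process_batch(1, batch_1)  # replay after the crash
engine.process_batch(1, batch_1)  # duplicate delivery is ignored

print(engine.result)
```

Each batch only touches the windows its events fall into, so the result table is updated incrementally rather than recomputed. Because staged changes are committed together with the batch id, the crashed attempt leaves no trace and the later replay counts every event once. A real engine would persist the log and checkpoint to durable storage instead of keeping them in memory.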
