Stream processing is a well-studied topic. Over the last few decades, researchers and practitioners have devoted substantial effort to building fast, scalable, and reliable systems for real-time analytics over streaming data. Thanks to this work, open-source and commercial streaming systems now run smoothly in big tech’s data centers, powering thousands of applications in ad recommendation, fraud detection, IoT analytics, and many other areas.
Seeing this progress, more and more companies have started investigating modern streaming systems, hoping to learn how the technology can transform their businesses. Unfortunately, many of them get stuck along the way and complain about the high cost of adopting streaming systems, for at least two sets of reasons:
Difficult to learn. Streaming systems have never been easy to learn. Unlike conventional databases (e.g., MySQL and PostgreSQL) that provide SQL as the interactive interface, most, if not all, streaming systems require users to learn a set of platform-specific programming interfaces (most likely in Java) to manipulate streaming data. For non-technical users, mastering a streaming system is close to impossible. To make things worse, streaming systems represent data differently from databases, so users have to write complicated data extraction logic to move data between the two.
Expensive to operate. Many popular streaming systems are open-source, and comprehensive deployment scripts and Docker images are easy to get. But open-source never implies free or affordable. Streaming workloads can fluctuate abruptly with demand, so companies have to purchase a large cluster of machines to sustain worst-case scenarios. The cost of deploying and maintaining a streaming system can also go well beyond the machines themselves. In fact, assembling a team of engineers who are willing to burn the midnight oil to operate the system can be a real headache.
Democratizing stream processing
Stream processing should not be the privilege of big tech companies and those with deep pockets. It should not be treated as a monster whose power can only be harnessed by talented engineers. It should benefit everyone, from data scientists to decision makers, from large enterprises to small businesses. At Singularity Data, we invest all our efforts in democratizing stream processing. We are building RisingWave, a cloud-native streaming database that makes stream processing simple, affordable, and accessible to everyone.
Stream processing made simple
RisingWave is a distributed streaming database that provides standard SQL as the interactive interface. It speaks the PostgreSQL dialect and can be seamlessly integrated with the PostgreSQL ecosystem with no code change. RisingWave treats streams as tables and allows users to compose complex queries over streaming data and historical data declaratively and elegantly. With RisingWave, users can focus purely on their analytical query logic, without having to learn Java or system-specific low-level APIs.
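To make the idea of treating a stream as a table more concrete, here is a minimal sketch in Python. It is not RisingWave code: it uses the SQLite module from the Python standard library as a stand-in, and the table names (`users`, `clicks`) and the sample events are hypothetical. It only illustrates how one declarative SQL query can combine newly arrived events with historical data; the sketch simply reruns the query after each batch of events.

```python
import sqlite3

# Stand-in only: an in-memory SQLite database, not RisingWave.
# All table names, column names, and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")

# Historical data: a regular table of known users.
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, country TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "US"), (2, "DE")])

# The "stream": click events are appended to a table as they arrive.
conn.execute("CREATE TABLE clicks (user_id INTEGER, ts INTEGER)")

# One declarative query over streaming and historical data together.
CLICKS_PER_COUNTRY = """
    SELECT u.country, COUNT(*) AS clicks
    FROM clicks AS c
    JOIN users AS u ON u.user_id = c.user_id
    GROUP BY u.country
    ORDER BY u.country
"""


def ingest(events):
    """Append a batch of (user_id, ts) click events to the stream table."""
    conn.executemany("INSERT INTO clicks VALUES (?, ?)", events)


def clicks_per_country():
    """Run the query against everything ingested so far."""
    return conn.execute(CLICKS_PER_COUNTRY).fetchall()


ingest([(1, 100), (2, 101)])
first = clicks_per_country()

ingest([(1, 102)])
second = clicks_per_country()

print(first)
print(second)
```

The point of the sketch is that the analytical logic lives entirely in the SQL text; no platform-specific Java API is involved in expressing it.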
Stream processing made affordable
RisingWave is designed for the cloud. Its cloud-native architecture enables RisingWave to fully leverage the elastic resources provided by cloud platforms. As a fully managed service, RisingWave deploys, maintains, and scales itself in the cloud, without human intervention on the user side. Once users set their service-level agreement (SLA), RisingWave automatically assembles different tiers of compute and storage resources in the cloud to achieve the performance goal at minimal cost. RisingWave is serverless: users pay only for what they use, and pay nothing when they do not use the service. We keep optimizing the service to ensure that RisingWave is affordable even for small businesses.
Stream processing made accessible
We believe that a great product comes from the collective wisdom of a thriving and open community. Instead of developing RisingWave by relying on the experience of a small group of experts, we design and implement it together with the open-source community. We decided to open-source the RisingWave kernel under the Apache License 2.0, a permissive free software license. The RisingWave community is open: everyone can participate in shaping the RisingWave project roadmap; everyone can deploy the distributed streaming database in their own cloud service; everyone can contribute and send feedback to the community. The RisingWave community is collaborative: we are eager to build the modern real-time data infrastructure stack together with other communities. For example, we are actively working with the community of Redpanda, a real-time streaming platform, to unleash productivity in building mission-critical applications.
The next wave of stream processing is coming. Please join us to help define the future of stream processing, for everyone!