With the increasing number of enterprise ML models in production, the rising demand for localized models, and the focus on more robust model monitoring, scalability is more important than ever. Consequently, ML model monitoring platforms need a highly scalable architecture that can handle everything your organization demands of them at production grade.
An ML model monitoring architecture that isn’t scalable can result in a number of technical challenges, including lack of responsiveness, increased infrastructure cost, and platform inelasticity. If your model monitoring system isn’t built to scale, your models—and ultimately, your business—will suffer the consequences.
Unlike less resilient ML model monitoring and observability solutions, the Arthur platform is built for high performance and scalability from the ground up. Keep reading to find out why Arthur’s tech stack makes it the leading platform for enterprises that want to run high-performing ML models at scale.
1. Database Management
Arthur leverages the strengths of both ClickHouse and Postgres to handle different types of workloads. ClickHouse, used for OLAP workloads, is a horizontally scalable database management system that supports high insert rates and fast serving of complex queries against very large datasets for multi-dimensional analysis. Postgres, used for OLTP workloads, is a mature, reliable relational database; with its dependable transactional mechanism, it stores user and model records and supports the recording and querying of each model’s metrics. Together, ClickHouse and Postgres give the Arthur platform and its end users an optimal level of flexibility.
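To make this division of labor concrete, here is a minimal Go sketch of how a service might route transactional lookups to Postgres and heavy analytical aggregations to ClickHouse through Go’s standard database/sql interface. The connection strings, table names (models, inferences), and columns are hypothetical stand-ins for illustration, not Arthur’s actual schema.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/ClickHouse/clickhouse-go/v2" // registers the "clickhouse" driver
	_ "github.com/lib/pq"                      // registers the "postgres" driver
)

func main() {
	// OLTP: Postgres holds transactional metadata (users, models).
	pg, err := sql.Open("postgres", "postgres://user:pass@localhost:5432/arthur?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer pg.Close()

	// OLAP: ClickHouse serves analytical queries over large inference datasets.
	ch, err := sql.Open("clickhouse", "clickhouse://localhost:9000/metrics")
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// A transactional lookup goes to Postgres...
	var modelID string
	if err := pg.QueryRow(`SELECT id FROM models WHERE name = $1`, "fraud-v2").Scan(&modelID); err != nil {
		log.Fatal(err)
	}

	// ...while a multi-dimensional aggregation over inference data goes to ClickHouse.
	var avgScore float64
	if err := ch.QueryRow(
		`SELECT avg(score) FROM inferences WHERE model_id = ? AND ts > now() - INTERVAL 1 DAY`,
		modelID,
	).Scan(&avgScore); err != nil {
		log.Fatal(err)
	}
	log.Printf("model %s 24h avg score: %.4f", modelID, avgScore)
}
```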
2. Auto-Scaler Mechanism & Streaming-First Architecture
As previously mentioned, components in Arthur’s platform are independently and horizontally scalable. The platform’s auto-scaler mechanism self-manages and optimizes resource utilization on a Kubernetes cluster, automatically scaling up and down based on platform activity as well as the lag observed in the data pipeline queue.
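The core of any lag-based autoscaler is a simple ratio rule: provision enough replicas that each handles at most a target amount of queue lag, clamped to a configured range. The Go sketch below illustrates that rule; the function and its parameters are illustrative only, not Arthur’s actual controller logic.

```go
package main

import "fmt"

// desiredReplicas is a minimal sketch of lag-based autoscaling:
// scale the consumer deployment proportionally to total queue lag,
// clamped to the allowed replica range. Kubernetes' HPA applies a
// similar ratio rule when driven by an external lag metric.
func desiredReplicas(current, totalLag, targetLagPerReplica, min, max int) int {
	if targetLagPerReplica <= 0 {
		return current // misconfigured target: hold steady
	}
	// Ceiling division: enough replicas so each absorbs at most the target lag.
	want := (totalLag + targetLagPerReplica - 1) / targetLagPerReplica
	if want < min {
		want = min
	}
	if want > max {
		want = max
	}
	return want
}

func main() {
	// 120k messages of lag, each replica expected to absorb 10k: scale to 12.
	fmt.Println(desiredReplicas(4, 120_000, 10_000, 2, 20))
}
```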
Whether data arrives as a stream or in batches, Arthur’s streaming-first architecture allows very large volumes of data to be ingested reliably and efficiently in a non-blocking fashion. For queuing, Arthur uses Apache Kafka, which was built for streaming big data and is well suited to MLOps use cases such as high-throughput activity tracking, stream processing, event sourcing, and log aggregation.
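As an illustration of non-blocking ingestion, here is a short Go sketch using the open-source segmentio/kafka-go client in async mode: producing returns immediately, and delivery errors surface through a completion callback rather than blocking the caller. The topic name and payload shape are hypothetical, and this is not Arthur’s actual producer code.

```go
package main

import (
	"context"
	"encoding/json"
	"log"

	"github.com/segmentio/kafka-go"
)

// Inference is a hypothetical payload for a single model prediction.
type Inference struct {
	ModelID string  `json:"model_id"`
	Score   float64 `json:"score"`
}

func main() {
	w := &kafka.Writer{
		Addr:     kafka.TCP("localhost:9092"),
		Topic:    "inferences", // hypothetical topic name
		Balancer: &kafka.LeastBytes{},
		Async:    true, // fire-and-forget: WriteMessages returns without waiting for acks
		// With Async set, delivery errors arrive here instead of from WriteMessages.
		Completion: func(messages []kafka.Message, err error) {
			if err != nil {
				log.Printf("delivery failed for %d messages: %v", len(messages), err)
			}
		},
	}
	defer w.Close() // Close flushes any messages still buffered by the async writer

	payload, err := json.Marshal(Inference{ModelID: "fraud-v2", Score: 0.93})
	if err != nil {
		log.Fatal(err)
	}
	if err := w.WriteMessages(context.Background(),
		kafka.Message{Key: []byte("fraud-v2"), Value: payload},
	); err != nil {
		log.Fatal(err)
	}
}
```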
3. High-Performance Programming Language
Arthur’s platform core is written in the Go programming language, which was developed by Google and is used by leading enterprises like Uber and Dropbox. Go was chosen for a few reasons: it compiles to machine code, its runtime performance can be up to 30x faster than that of interpreted languages or languages that run on a virtual machine, and it was built for concurrency and parallelism.
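For a flavor of what “built for concurrency” means in practice, here is a self-contained Go example of the fan-out pattern a high-throughput pipeline stage might use: a fixed pool of goroutines draining a shared channel of events in parallel. The event type and worker count are purely illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	events := make(chan int)
	var wg sync.WaitGroup

	// Fan out: four workers consume from the same channel concurrently.
	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for ev := range events {
				fmt.Printf("worker %d processed event %d\n", id, ev)
			}
		}(w)
	}

	// The producer feeds events and closes the channel when done,
	// which cleanly terminates every worker's range loop.
	for i := 0; i < 16; i++ {
		events <- i
	}
	close(events)
	wg.Wait()
}
```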
At the end of the day, scalability is far more than just a “nice to have.” As your organization grows and ML projects outpace their original deployments, your model monitoring system must be resilient and flexible enough to adjust to ever-growing volumes of data. The Arthur platform was not only built with all of this in mind, but continues to be optimized so that monitoring scales proactively with your models, ultimately maximizing value for your business. Read more about the importance of scalability and performance in our whitepaper here.