MLOps Architectural Models: An Advanced Guide to MLOps in Practice

Editor's Note: The following is an article written for and published in DZone's 2024 Trend Report, Enterprise AI: The Emerging Landscape of Knowledge Engineering.

AI continues to transform businesses, but this leads to enterprises facing new challenges in terms of digital transformation and organizational changes. Based on a 2023 Forbes report, those challenges can be summarized as follows:

  • Companies whose analytical tech stacks are built around analytical/batch workloads need to start adapting to real-time data processing (Forbes). This change affects not only the way the data is collected, but it also leads to the need for new data processing and data analytics architectural models.
  • AI regulations need to be considered as part of AI/ML architectural models. According to Forbes, "Gartner predicts that by 2025, regulations will force companies to focus on AI ethics, transparency, and privacy." Hence, those platforms will need to comply with upcoming regulations.
  • Specialized AI teams must be built, and they should be capable of not only building and maintaining AI platforms but also collaborating with other teams to support models' lifecycles through those platforms.

The answer to these new challenges seems to be MLOps, or machine learning operations. MLOps builds on top of DevOps and DataOps as an attempt to facilitate machine learning (ML) applications and a way to better manage the complexity of ML systems. The goal of this article is to provide a systematic overview of MLOps architectural challenges and demonstrate ways to manage that complexity.

MLOps Application: Setting Up the Use Case

For this article, our example use case is a financial institution that has been conducting macroeconomic forecasting and investment risk management for years. Currently, the forecasting process is based on partially manual loading and postprocessing of external macroeconomic data, followed by statistical modeling using various tools and scripts based on personal preferences.

However, according to the institution's management, this process is not acceptable due to recently announced banking regulations and security requirements. In addition, the delivery of calculated results is too slow and too costly compared to competitors in the market. Investment in a new digital solution requires a good understanding of the complexity and the expected cost. It should start with gathering requirements and subsequently building a minimum viable product.

Requirements Gathering

For solution architects, the design process starts with a specification of problems that the new architecture needs to solve, for example:

  • Manual data collection is slow, error prone, and requires a lot of effort
  • Real-time data processing is not part of the current data loading approach
  • There is no data versioning and, hence, reproducibility is not supported over time
  • The model's code is triggered manually on local machines and constantly updated without versioning
  • Data and code sharing via a common platform is completely missing
  • The forecasting process is not represented as a business process; all the steps are distributed and unsynchronized, and most of them require manual effort
  • Experiments with the data and models are not reproducible and not auditable
  • Scalability is not supported in case of increased memory consumption or CPU-heavy operations
  • Monitoring and auditing of the whole process are currently not supported

The following diagram demonstrates the four main components of the new architecture: monitoring and auditing platform, model deployment platform, model development platform, and data management platform.

Figure 1. MLOps architecture diagram

Platform Design Decisions

The two main strategies to consider when designing an MLOps platform are:

  1. Developing from scratch vs. selecting a platform
  2. Choosing between a cloud-based, on-premises, or hybrid model

Developing From Scratch vs. Choosing a Fully Packaged MLOps Platform

Building an MLOps platform from scratch is the most flexible solution. It lets the company address future needs without depending on other companies and service providers. It would be a good choice if the company already has the required specialists and trained teams to design and build an ML platform.

A prepackaged solution would be a good option to model a standard ML process that does not need many customizations. One option would even be to buy a pretrained model (e.g., model as a service), if available on the market, and build only the data loading, monitoring, and tracking modules around it. The disadvantage of this type of solution is that if new features need to be added, it might be hard to deliver them on time.

Buying a platform as a black box often requires building additional components around it. An important criterion to consider when choosing a platform is the possibility to extend or customize it.

Cloud-Based, On-Premises, or Hybrid Deployment Model

Cloud-based solutions are already on the market, with popular options provided by AWS, Google, and Azure. When there are no strict data privacy requirements and regulations, cloud-based solutions are a good choice due to the virtually unlimited infrastructural resources for model training and model serving. An on-premises solution would be acceptable for very strict security requirements or if the infrastructure is already available within the company. The hybrid solution is an option for companies that already have part of the systems built but want to extend them with additional services, e.g., to buy a pretrained model and integrate it with the locally stored data or incorporate it into an existing business process model.

MLOps Architecture in Practice

The financial institution from our use case does not have enough specialists to build a professional MLOps platform from scratch, but it also does not want to invest in an end-to-end managed MLOps platform due to regulations and additional financial restrictions. The institution's architectural board has decided to adopt an open-source approach and buy tools only when needed. The architectural concept is built around the idea of developing minimalistic components and a composable system. The general idea is built around microservices covering nonfunctional requirements like scalability and availability. Striving for maximal simplicity of the system, the following decisions for the system components were made.

Data Management Platform

The data collection process will be fully automated. There will be a separate data loading component for each data source due to the heterogeneity of external data providers. The database choice is crucial when it comes to writing real-time data and reading a large amount of data. Due to the time-based nature of the macroeconomic data and the institution's already available relational database specialists, they chose to use the open-source database, TimescaleDB.
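
The one-loader-per-source idea can be sketched roughly as follows. This is a minimal illustration, not the institution's actual design: the two providers, their CSV and JSON payload shapes, and all class and field names are assumptions, and the step that writes the normalized rows to TimescaleDB is left out.

```python
import csv
import io
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    """One normalized data point, independent of the provider's format."""
    series: str
    observed_at: datetime
    value: float
    source: str


class DataLoader(ABC):
    """Common interface; each external provider gets its own subclass."""

    @abstractmethod
    def parse(self, payload: str) -> list[Observation]:
        ...


class CsvProviderLoader(DataLoader):
    # Hypothetical provider delivering "series,date,value" CSV rows.
    def parse(self, payload: str) -> list[Observation]:
        rows = csv.DictReader(io.StringIO(payload))
        return [
            Observation(
                series=row["series"],
                observed_at=datetime.strptime(row["date"], "%Y-%m-%d").replace(
                    tzinfo=timezone.utc
                ),
                value=float(row["value"]),
                source="csv_provider",
            )
            for row in rows
        ]


class JsonProviderLoader(DataLoader):
    # Hypothetical provider delivering a JSON list with epoch timestamps.
    def parse(self, payload: str) -> list[Observation]:
        return [
            Observation(
                series=item["indicator"],
                observed_at=datetime.fromtimestamp(item["ts"], tz=timezone.utc),
                value=float(item["val"]),
                source="json_provider",
            )
            for item in json.loads(payload)
        ]


def load_all(sources: list[tuple[DataLoader, str]]) -> list[Observation]:
    """Run every loader and return one time-ordered list of observations."""
    merged = [obs for loader, payload in sources for obs in loader.parse(payload)]
    return sorted(merged, key=lambda obs: obs.observed_at)
```

Keeping the provider-specific parsing behind one small interface means a new data source only adds a subclass; the rest of the platform sees the same normalized record type.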

The possibility to provide a standard SQL-based API, perform data analytics, and conduct data transformations using standard relational database GUI clients will decrease the time to deliver a first prototype of the platform. Data versions and transformations can be tracked and saved into separate data versions or tables.
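
One way to picture version tracking in plain tables is sketched below. It uses Python's built-in sqlite3 module purely as a stand-in so that the example runs anywhere; the table names, column names, and the content-hash version ID are assumptions for illustration rather than part of the described platform.

```python
import hashlib
import json
import sqlite3

SCHEMA = """
CREATE TABLE data_versions (
    version_id TEXT PRIMARY KEY,   -- content hash of the snapshot
    created_at TEXT NOT NULL,
    description TEXT NOT NULL
);
CREATE TABLE observations (
    version_id TEXT NOT NULL REFERENCES data_versions(version_id),
    series TEXT NOT NULL,
    observed_at TEXT NOT NULL,
    value REAL NOT NULL
);
"""


def save_version(conn, rows, description, created_at):
    """Store rows as an immutable snapshot and return its version id."""
    # Hashing the sorted content makes the id reproducible for identical data.
    digest = hashlib.sha256(json.dumps(sorted(rows)).encode("utf-8")).hexdigest()[:12]
    exists = conn.execute(
        "SELECT 1 FROM data_versions WHERE version_id = ?", (digest,)
    ).fetchone()
    if exists is None:
        conn.execute(
            "INSERT INTO data_versions VALUES (?, ?, ?)",
            (digest, created_at, description),
        )
        conn.executemany(
            "INSERT INTO observations VALUES (?, ?, ?, ?)",
            [(digest, s, t, v) for s, t, v in rows],
        )
        conn.commit()
    return digest


def read_version(conn, version_id):
    """Return the rows of one snapshot, ordered by series and time."""
    cursor = conn.execute(
        "SELECT series, observed_at, value FROM observations "
        "WHERE version_id = ? ORDER BY series, observed_at",
        (version_id,),
    )
    return cursor.fetchall()
```

Because every transformation result is written under its own version ID, a forecast can record which snapshot it was trained on and be reproduced later from exactly those rows.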

Model Development Platform

The model development process consists of four steps:

  1. Data reading and transformation
  2. Model training
  3. Model serialization
  4. Model packaging
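
The four steps above can be sketched as a single function. This is a toy illustration under stated assumptions: the straight-line TrendModel stands in for a real forecasting model, parameters are serialized as JSON rather than a binary format, and the file names inside the package are invented for the example.

```python
import io
import json
import zipfile
from dataclasses import asdict, dataclass


@dataclass
class TrendModel:
    """Toy stand-in for a forecasting model: a straight-line trend."""
    slope: float = 0.0
    intercept: float = 0.0

    def fit(self, xs, ys):
        # Ordinary least squares for a single feature.
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = cov_xy / var_x
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x):
        return self.intercept + self.slope * x


def read_and_transform(raw_rows):
    """Step 1: drop missing values and convert to numeric features."""
    clean = [(float(q), float(v)) for q, v in raw_rows if v is not None]
    return [q for q, _ in clean], [v for _, v in clean]


def build_package(raw_rows, version):
    """Run all four steps and return the packaged artifact as bytes."""
    xs, ys = read_and_transform(raw_rows)           # 1. read and transform
    model = TrendModel().fit(xs, ys)                # 2. train
    model_json = json.dumps(asdict(model))          # 3. serialize
    metadata = {"version": version, "training_rows": len(xs)}
    buffer = io.BytesIO()                           # 4. package
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("model.json", model_json)
        archive.writestr("metadata.json", json.dumps(metadata))
    return buffer.getvalue()
```

A CI pipeline job could call build_package and publish the returned bytes, so that training and packaging run on a worker machine instead of an analyst's laptop.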

Once the model is trained, the parametrized and trained instance is usually stored as a packaged artifact. The most common solution for code storage and versioning is Git. Furthermore, the financial institution is already equipped with a solution like GitHub, providing functionality to define pipelines for building, packaging, and publishing the code. The architecture of Git-based systems usually relies on a set of distributed worker machines executing the pipelines. That option will be used as part of the minimalistic MLOps architectural prototype to also train the model.

After training a model, the next step is to store it in a model repository as a released and versioned artifact. Storing the model in a database as a binary file, a shared file system, or even an artifacts repository are all acceptable options at that stage. Later, a model registry or a blob storage service could be incorporated into the pipeline. A model's API microservice will expose the model's functionality for macroeconomic projections.
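
As a rough sketch of the shared-file-system option, the functions below store each release under its own version directory and record a checksum next to it. The directory layout, file names, and function names are assumptions made for this example; a model registry or blob store would replace them later.

```python
import hashlib
import json
from pathlib import Path


def publish_model(repo_root: Path, name: str, version: str, artifact: bytes) -> Path:
    """Store an artifact under <repo>/<name>/<version>/ and refuse overwrites."""
    target = repo_root / name / version
    if target.exists():
        # Released versions are immutable; a change needs a new version.
        raise FileExistsError(f"{name} {version} is already released")
    target.mkdir(parents=True)
    (target / "model.bin").write_bytes(artifact)
    manifest = {
        "name": name,
        "version": version,
        "sha256": hashlib.sha256(artifact).hexdigest(),
    }
    (target / "manifest.json").write_text(json.dumps(manifest))
    return target


def fetch_model(repo_root: Path, name: str, version: str) -> bytes:
    """Load a released artifact and verify it against its recorded checksum."""
    target = repo_root / name / version
    artifact = (target / "model.bin").read_bytes()
    manifest = json.loads((target / "manifest.json").read_text())
    if hashlib.sha256(artifact).hexdigest() != manifest["sha256"]:
        raise ValueError(f"checksum mismatch for {name} {version}")
    return artifact
```

Refusing to overwrite an existing version keeps released artifacts auditable: a given name and version always refers to the same bytes.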

Model Deployment Platform

The decision to keep the MLOps prototype as simple as possible applies to the deployment phase as well. The deployment model is based on a microservices architecture. Each model can be deployed using a Docker container as a stateless service and be scaled on demand. That principle applies to the data loading components, too. Once that first deployment step is achieved and dependencies of all the microservices are clarified, a workflow engine might be needed for orchestrating the established business processes.
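
A stateless model service of this kind might look like the sketch below, written with Python's standard http.server module to stay dependency-free. The /predict path, the request shape, the port, and the hard-coded model parameters are all assumptions for illustration; a real service would load a released artifact at startup and would more likely use a dedicated web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical parameters; a real service would load a released artifact.
MODEL = {"intercept": 0.5, "slope": 2.0}


def predict(payload: dict) -> dict:
    """Pure function of the request: no state is kept between calls."""
    x = float(payload["x"])
    return {"forecast": MODEL["intercept"] + MODEL["slope"] * x}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            result = predict(json.loads(self.rfile.read(length)))
        except (KeyError, TypeError, ValueError):
            self.send_error(400, "expected a JSON object with a numeric x")
            return
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real service should log requests


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Bind on all interfaces so the process can run inside a container."""
    return ThreadingHTTPServer(("0.0.0.0", port), PredictHandler)
```

Calling make_server().serve_forever() starts the service. Because each response depends only on the request and the loaded model, any number of identical containers can run side by side behind a load balancer.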

Model Monitoring and Auditing Platform

Traditional microservices architectures are already equipped with tools for gathering, storing, and monitoring log data. Tools like Prometheus, Kibana, and Elasticsearch are flexible enough for producing specific auditing and performance reports.
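
On the service side, auditing mostly requires that each component emits structured log records that a log pipeline can collect and index. The sketch below shows one possible shape using Python's standard logging module; the field names and the "audit" key passed through `extra` are assumptions for this example, and shipping the lines to a log store is outside its scope.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"audit": {...}} become record attributes.
        entry.update(getattr(record, "audit", {}))
        return json.dumps(entry)


def get_audit_logger(service: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines (to stderr by default)."""
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:
        handler = logging.StreamHandler(stream)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger
```

A model service could then record each forecast with a call such as `get_audit_logger("forecast-api").info("forecast served", extra={"audit": {"model_version": "1.0.0", "latency_ms": 12}})`, which yields one searchable record per request.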

Open-Source MLOps Platforms

A minimalistic MLOps architecture is a good start for the initial digital transformation of a company. However, keeping track of available MLOps tools in parallel is crucial for the next design phase. The following table provides a summary of some of the most popular open-source tools.

Table 1. Open-source MLOps tools for initial digital transformations

Tool | Description | Functional Areas
Kubeflow | Makes deployments of ML workflows on Kubernetes simple, portable, and scalable | Tracking and versioning, pipeline orchestration, and model deployment
MLflow | Is an open-source platform for managing the end-to-end ML lifecycle | Tracking and versioning
BentoML | Is an open standard and SDK for AI apps and inference pipelines; provides features like auto-generation of API servers, REST APIs, gRPC, and long-running inference jobs; and offers auto-generation of Docker container images | Tracking and versioning, pipeline orchestration, model development, and model deployment
TensorFlow Extended (TFX) | Is a production-ready platform; is designed for deploying and managing ML pipelines; and includes components for data validation, transformation, model analysis, and serving | Model development, pipeline orchestration, and model deployment
Apache Airflow, Apache Beam | Is a flexible framework for defining and scheduling complex workflows, data workflows in particular, including ML | Pipeline orchestration


MLOps is often called DevOps for machine learning, and it is essentially a set of architectural patterns for ML applications. However, despite the similarities with many well-known architectures, the MLOps approach brings some new challenges for MLOps architects. On one side, the focus must be on the compatibility and composition of MLOps services. On the other side, AI regulations will force existing systems and services to constantly adapt to new regulatory rules and standards. I suspect that as the MLOps field continues to evolve, a new type of service providing AI ethical and regulatory analytics will soon become the focus of businesses in the ML domain.

This is an excerpt from DZone's 2024 Trend Report, Enterprise AI: The Emerging Landscape of Knowledge Engineering.
