Debunking 7 Myths of MLOps | VentureBeat

With the exponential growth of services that support machine learning (ML), the term MLOps has become a regular part of the conversation, and with good reason. Short for “machine learning operations,” MLOps refers to a broad set of tools, job functions and best practices that ensure machine learning models are deployed and maintained in production reliably and efficiently. The practice is the foundation of production-grade models, ensuring rapid deployment while facilitating experiments that improve performance and guarding against model bias or loss of prediction quality. Without it, ML at scale becomes impossible.

As with any emerging practice, it’s easy to get confused about what it actually involves. To help, we have laid out seven common myths about MLOps and the realities behind them, so you can move forward with successfully leveraging ML at scale.

Myth #1: MLOps ends at launch

Reality: Launching an ML model is just one step in an ongoing process.

ML is experimental by nature. Even after the initial launch, it is necessary to test new hypotheses while tuning signals and parameters. This allows the model to improve in accuracy and performance over time. MLOps helps engineers manage this experimentation process effectively.

For example, one of the core components of MLOps is version management. It allows teams to track key metrics across a wide range of model variants to ensure the best one is selected, while allowing for an easy rollback in the event of an error.
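
As a concrete illustration, the sketch below shows a minimal in-memory model registry that tracks versions alongside their metrics and supports rollback. The ModelRegistry class and its method names are hypothetical stand-ins, not any particular tool’s API; in practice, teams typically use a dedicated registry service.

```python
# Minimal, hypothetical sketch of model version management with rollback.
# Real teams would use a registry tool; all names here are illustrative.
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ModelVersion:
    version: int
    model: Any                  # the trained model object
    metrics: Dict[str, float]   # key metrics logged for this version


@dataclass
class ModelRegistry:
    versions: List[ModelVersion] = field(default_factory=list)
    production_version: int = -1

    def register(self, model: Any, metrics: Dict[str, float]) -> int:
        """Store a new model version along with its evaluation metrics."""
        version = len(self.versions)
        self.versions.append(ModelVersion(version, model, metrics))
        return version

    def promote(self, version: int) -> None:
        """Mark a version as the one serving production traffic."""
        self.production_version = version

    def rollback(self) -> None:
        """Fall back to the previous version if the current one misbehaves."""
        if self.production_version > 0:
            self.production_version -= 1


# Usage: register candidates, promote the best, roll back on trouble.
registry = ModelRegistry()
v0 = registry.register(model="baseline", metrics={"auc": 0.81})
v1 = registry.register(model="tuned", metrics={"auc": 0.84})
registry.promote(v1)
registry.rollback()  # back to v0 if the new version degrades in production
```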

It is also important to monitor the model’s performance over time due to the risk of data drift. Data drift occurs when the data the model sees in production diverges significantly from the data the model was originally trained on, resulting in poor-quality predictions. For example, many ML models trained on consumer behavior before COVID-19 degraded severely in quality after lockdowns changed the way we live. MLOps addresses these scenarios by establishing robust monitoring practices and building the infrastructure to adapt quickly in the event of a major change. It goes well beyond just launching a model.
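
To make this concrete, a basic drift monitor can compare a feature’s distribution in production against its distribution in the training data, for example with a two-sample Kolmogorov-Smirnov test. The sketch below assumes numpy and scipy are available; the significance threshold is an illustrative choice, not a standard.

```python
# Minimal sketch of a data-drift check using a two-sample KS test.
# The alpha threshold is an illustrative choice.
import numpy as np
from scipy.stats import ks_2samp


def detect_drift(train_feature: np.ndarray,
                 prod_feature: np.ndarray,
                 alpha: float = 0.05) -> bool:
    """Return True if production data looks distributionally different
    from training data, per a two-sample Kolmogorov-Smirnov test."""
    statistic, p_value = ks_2samp(train_feature, prod_feature)
    return p_value < alpha


# Example: training-era data vs. shifted production data.
rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=5_000)
prod = rng.normal(loc=0.7, scale=1.3, size=5_000)  # shifted distribution
if detect_drift(train, prod):
    print("Drift detected: consider retraining or investigating inputs.")
```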

Myth #2: MLOps is the same as model development

Reality: MLOps is the bridge between model development and the successful use of ML in production.

The process used to develop a model in a test environment is usually not the same one that will enable it to succeed in production. Models running in production require robust data pipelines for sourcing, processing and training, often spanning much larger datasets than those used in development.

Databases and compute will typically need to move to distributed environments to manage the increased load. Much of this process needs to be automated to ensure reliable deployments and the ability to iterate quickly at scale. Monitoring also needs to be more robust, because production environments will see data beyond what was available in testing, so the likelihood of the unexpected is much greater. MLOps comprises all of these practices for taking a model from development to launch.
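
As a sketch of what that automation might look like, the toy pipeline below chains ingestion, preprocessing, training and evaluation, and only deploys when the model clears a validation gate. Every stage and threshold is a hypothetical stand-in for real, distributed infrastructure, kept tiny so the control flow is runnable end to end.

```python
# Hypothetical sketch of an automated train-validate-deploy pipeline.
# In production these stages would run on an orchestrator against
# distributed storage; here each stage is a tiny stand-in.
import random
from typing import List, Tuple

def ingest() -> List[Tuple[float, int]]:
    """Stand-in for pulling labeled examples from production sources."""
    random.seed(0)
    return [(x, int(x > 0.5)) for x in (random.random() for _ in range(1000))]

def split(data):
    """Split into train and eval sets (80/20)."""
    cut = int(len(data) * 0.8)
    return data[:cut], data[cut:]

def train(train_set) -> float:
    """'Train' a threshold classifier: midpoint between the classes."""
    positives = [x for x, y in train_set if y == 1]
    negatives = [x for x, y in train_set if y == 0]
    return (min(positives) + max(negatives)) / 2  # decision threshold

def evaluate(threshold: float, eval_set) -> float:
    """Accuracy of the threshold classifier on held-out data."""
    correct = sum(int(x > threshold) == y for x, y in eval_set)
    return correct / len(eval_set)

def deploy(threshold: float) -> None:
    """Stand-in for pushing the model to serving infrastructure."""
    print(f"Deployed model with threshold={threshold:.3f}")

def run_pipeline(min_accuracy: float = 0.9) -> None:
    train_set, eval_set = split(ingest())
    model = train(train_set)
    accuracy = evaluate(model, eval_set)
    if accuracy >= min_accuracy:   # automated validation gate
        deploy(model)
    else:
        raise RuntimeError(f"Validation failed: accuracy={accuracy:.3f}")

run_pipeline()
```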

Myth #3: MLOps is the same as devops

Reality: MLOps works toward goals similar to those of devops, but its implementation differs in many ways.

While both MLOps and devops strive to make deployment scalable and efficient, achieving this goal for ML systems requires a new set of practices. MLOps places a stronger emphasis on experimentation than devops. In contrast to standard software deployments, ML models are often deployed with many variants simultaneously, creating a need for model monitoring to compare them and determine the optimal version. And for each redeployment, it’s not enough to just push the code: the models need to be retrained every time there is a change. This differs from standard devops deployments in that the pipeline must now include a retraining and validation phase.
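
For instance, one simple way to compare variants deployed side by side is to split traffic between them and log a quality metric per variant. The VariantRouter below is a hypothetical sketch of that pattern, not any particular serving product’s API.

```python
# Illustrative sketch of routing traffic across two model variants and
# tracking per-variant accuracy; all names here are hypothetical.
import random
from collections import defaultdict

class VariantRouter:
    def __init__(self, variants, split=0.5):
        self.variants = variants           # e.g., {"A": model_a, "B": model_b}
        self.split = split                 # fraction of traffic routed to "A"
        self.outcomes = defaultdict(list)  # per-variant correctness log

    def route(self, features):
        """Choose a variant for this request and return (name, prediction)."""
        name = "A" if random.random() < self.split else "B"
        return name, self.variants[name](features)

    def record(self, name, prediction, actual):
        """Log whether a variant's prediction matched the eventual outcome."""
        self.outcomes[name].append(prediction == actual)

    def accuracy(self, name):
        """Observed accuracy for one variant so far."""
        hits = self.outcomes[name]
        return sum(hits) / len(hits) if hits else float("nan")
```

Comparing the observed accuracy of each variant over enough traffic tells the team which version to promote, and the same log feeds the retraining and validation phase described above.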

MLOps takes many common devops practices and expands their scope to meet ML-specific needs. Continuous integration in MLOps goes beyond just testing code to also include data quality checks and model validation. Continuous deployment is no longer just a set of software packages; it now also includes a pipeline for retraining models or rolling back changes to them.
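
As a sketch, such CI checks can be written as ordinary test functions that assert on both the data and the retrained model. The expected schema, thresholds and function names below are illustrative assumptions, styled so they could run under a test runner such as pytest.

```python
# Hypothetical CI checks for an ML pipeline, written as plain asserts.
# The expected columns and thresholds are illustrative choices.
EXPECTED_COLUMNS = {"user_id", "age", "clicks"}
MAX_NULL_FRACTION = 0.01
MIN_ACCURACY = 0.90

def test_data_quality(rows):
    """Fail the build if incoming data is malformed or too sparse."""
    assert rows, "dataset is empty"
    assert set(rows[0]) == EXPECTED_COLUMNS, "schema mismatch"
    nulls = sum(1 for r in rows for v in r.values() if v is None)
    total = len(rows) * len(EXPECTED_COLUMNS)
    assert nulls / total <= MAX_NULL_FRACTION, "too many missing values"

def test_model_validation(model, eval_set):
    """Fail the build if the retrained model falls below the bar."""
    correct = sum(model(x) == y for x, y in eval_set)
    assert correct / len(eval_set) >= MIN_ACCURACY, "model under threshold"
```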

Myth #4: Fixing a bug is just changing lines of code

Reality: Fixing ML model errors in production requires advance planning and multiple precautions.

If a new deployment results in performance degradation or some other bug, MLOps teams need a range of options available to solve the problem. Often, simply reverting to the previous code is not enough, since models need to be retrained before release. Instead, teams must keep multiple versions of models on hand, ensuring that a production-ready version is always available in case something goes wrong.

Furthermore, in scenarios where there is a loss of data or a major shift in the distribution of production data, teams need simple heuristics to fall back on so that the system can at least maintain a baseline level of performance. All of this requires significant advance planning, which is an essential aspect of MLOps.
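
One common shape for such a precaution is a wrapper that serves a simple heuristic, such as a historical average, whenever inputs are missing or a drift alarm has fired. The GuardedPredictor below is a hypothetical sketch of that pattern.

```python
# Illustrative sketch of a heuristic fallback around a model, so that
# serving degrades gracefully when inputs are missing or drift is flagged.
from typing import Callable, Optional, Sequence

class GuardedPredictor:
    def __init__(self, model: Callable, historical_average: float):
        self.model = model
        # Simple heuristic fallback: predict the historical average.
        self.fallback_value = historical_average
        self.drift_flagged = False  # set by an external drift monitor

    def predict(self, features: Optional[Sequence[float]]) -> float:
        incomplete = features is None or any(f is None for f in features)
        if incomplete or self.drift_flagged:
            return self.fallback_value   # degrade gracefully
        return self.model(features)
```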

Myth #5: Governance is completely different from MLOps

Reality: While governance has objectives distinct from those of MLOps, many MLOps practices can help support governance goals.

Model governance manages the regulatory compliance and risks associated with using an ML system. This includes things like maintaining appropriate user data protection policies and avoiding bias or discriminatory outcomes in model predictions. While MLOps is usually seen as a way to ensure that models perform well, that is a narrow view of what it can deliver.

Tracking and monitoring of models in production can be supplemented with analyses that improve the explainability of models and surface bias in results. Transparency in model training and deployment pipelines can facilitate compliance goals for data handling. MLOps should be viewed as a practice for enabling ML at scale across all business objectives, including performance, governance and model risk management.
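
For example, a basic bias check might compare a model’s positive-prediction rates across user groups, a quantity often called the demographic parity gap. This plain-Python sketch is illustrative; the groups, data and alerting threshold are hypothetical, and real governance work goes much further.

```python
# Illustrative sketch of a simple fairness check: the demographic parity
# gap, i.e., the difference in positive-prediction rates between groups.
def positive_rate(predictions):
    """Fraction of binary predictions that are positive."""
    return sum(predictions) / len(predictions)

def demographic_parity_gap(preds_group_a, preds_group_b):
    """Absolute difference in positive-prediction rates across groups."""
    return abs(positive_rate(preds_group_a) - positive_rate(preds_group_b))

# Example: binary predictions logged for two user groups (hypothetical).
group_a = [1, 0, 1, 1, 0, 1]
group_b = [0, 0, 1, 0, 0, 0]
gap = demographic_parity_gap(group_a, group_b)
if gap > 0.2:  # illustrative alerting threshold
    print(f"Potential bias: parity gap of {gap:.2f} between groups")
```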

Myth #6: Managing machine learning systems can take place in silos

Reality: Successful MLOps requires collaborative teams with mixed skill sets.

Deploying an ML model spans many roles, including data scientists, data engineers, ML engineers and devops engineers. Without cooperation and an understanding of each other’s work, ML systems can become impractical at scale.

For example, a data scientist may develop models without significant external insight or input, which can then lead to deployment challenges due to performance and scaling issues. Or a devops team, without insight into key ML practices, might not build the monitoring needed to enable iterative model experimentation.

This is why it is important, across the board, for all team members to have a broad understanding of the model development pipeline and ML practices, with collaboration starting from day one.

Myth #7: Managing ML systems is daunting and untenable

Reality: Any team can leverage ML at scale with the right tools and practices.

Since MLOps is still a growing field, it may seem as though there is a significant amount of complexity involved. However, the ecosystem is rapidly maturing, and there is a wealth of resources and tools available to help teams succeed at every step of the MLOps lifecycle.

With the right processes in place, you can unleash the full potential of ML at scale.

Krishnaram Kenthapadi is Chief Scientist at Fiddler AI.
