If you’re a data scientist or you work with machine learning (ML) models, you have tools to label data, technology environments to train models, and a fundamental understanding of MLops and modelops. If you have ML models running in production, you probably use ML monitoring to identify data drift and other model risks.
Data science teams use these essential ML practices and platforms to collaborate on model development, to configure infrastructure, to deploy ML models to different environments, and to maintain models at scale. Others who are seeking to increase the number of models in production, improve the quality of predictions, and reduce the costs of ML model maintenance will likely need these ML life cycle management tools, too.
Unfortunately, explaining these practices and tools to business stakeholders and budget decision-makers isn’t easy. It’s all technical jargon to leaders who want to understand the return on investment and business impact of machine learning and artificial intelligence investments and would prefer to stay out of the technical and operational weeds.
Data scientists, developers, and technology leaders recognize that getting buy-in requires defining and simplifying the jargon so stakeholders understand the importance of key disciplines. Following up on a previous article about how to explain devops jargon to business executives, I thought I would write a similar one to clarify several critical ML practices that business leaders should understand.
What is the machine learning life cycle?
As a developer or data scientist, you have an engineering process for taking new ideas from concept to delivering business value. That process includes defining the problem statement, developing and testing models, deploying models to production environments, monitoring models in production, and enabling maintenance and improvements. We call this a life cycle process, knowing that deployment is the first step to realizing the business value and that once in production, models aren’t static and will require ongoing support.
Business leaders may not understand the term life cycle. Many still perceive software development and data science work as one-time investments, which is one reason why many organizations suffer from tech debt and data quality issues.
Explaining the life cycle with technical terms about model development, training, deployment, and monitoring will make a business executive’s eyes glaze over. Marcus Merrell, vice president of technology strategy at Sauce Labs, suggests providing leaders with a real-world analogy.
“Machine learning is somewhat analogous to farming: The crops we know today are the ideal outcome of previous generations noticing patterns, experimenting with combinations, and sharing information with other farmers to create better variations using accumulated knowledge,” he says. “Machine learning is much the same process of observation, cascading conclusions, and compounding knowledge as your algorithm gets trained.”
What I like about this analogy is that it illustrates generative learning from one crop year to the next but can also factor in real-time adjustments that might occur during a growing season because of weather, supply chain, or other factors. Where possible, it may be beneficial to find analogies in your industry or a domain your business leaders understand.
What is MLops?
Most developers and data scientists think of MLops as the equivalent of devops for machine learning. Automating infrastructure, deployment, and other engineering processes improves collaboration and helps teams focus more energy on business objectives instead of manually performing technical tasks.
But all this is in the weeds for business executives who need a simple definition of MLops, especially when teams need budget for tools or time to establish best practices.
“MLops, or machine learning operations, is the practice of collaboration and communication between data science, IT, and the business to help manage the end-to-end life cycle of machine learning projects,” says Alon Gubkin, CTO and cofounder of Aporia. “MLops is about bringing together different teams and departments within an organization to ensure that machine learning models are deployed and maintained effectively.”
Thibaut Gourdel, technical product marketing manager at Talend, suggests adding some detail for the more data-driven business leaders. He says, “MLops promotes the use of agile software principles applied to ML projects, such as version control of data and models as well as continuous data validation, testing, and ML deployment to improve repeatability and reliability of models, in addition to your teams’ productivity.”
What is data drift?
Whenever you can use words that convey a picture, it’s much easier to connect the term with an example or a story. An executive understands what drift is from examples such as a boat drifting off course because of the wind, but they may struggle to translate it to the world of data, statistical distributions, and model accuracy.
“Data drift occurs when the data the model sees in production no longer resembles the historical data it was trained on,” says Krishnaram Kenthapadi, chief AI officer and scientist at Fiddler AI. “It can be abrupt, like the shopping behavior changes brought on by the COVID-19 pandemic. Regardless of how the drift occurs, it’s critical to identify these shifts quickly to maintain model accuracy and reduce business impact.”
Gubkin provides a second example, in which data drift is a more gradual shift away from the data the model was trained on. “Data drift is like a company’s products becoming less popular over time because consumer preferences have changed.”
David Talby, CTO of John Snow Labs, shared a generalized analogy. “Model drift happens when accuracy degrades due to the changing production environment in which it operates,” he says. “Much like a new car’s value declines the instant you drive it off the lot, a model does the same, as the predictable research environment it was trained on behaves differently in production. Regardless of how well it’s operating, a model will always need maintenance as the world around it changes.”
The important message that data science leaders must convey is that because data isn’t static, models must be reviewed for accuracy and be retrained on more recent and relevant data.
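For the data science side of that conversation, a minimal sketch of one common way to quantify drift may help: the population stability index (PSI), which compares how a single numeric feature is distributed in training data versus recent production data. The function name, the bin count, the synthetic "order value" samples, and the 0.1 and 0.25 cutoffs mentioned afterward are illustrative assumptions, not something the experts quoted above prescribe.

```python
import math
import random


def population_stability_index(expected, actual, bins=10):
    """Compare two samples of one numeric feature; a larger PSI means more drift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all training values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the training range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data (an assumption for illustration): order values seen at
# training time, then two production samples, one similar and one shifted.
rng = random.Random(0)
training = [rng.gauss(50, 10) for _ in range(5000)]
same_population = [rng.gauss(50, 10) for _ in range(5000)]
shifted = [rng.gauss(65, 10) for _ in range(5000)]

psi_stable = population_stability_index(training, same_population)
psi_shifted = population_stability_index(training, shifted)
print(f"PSI, similar data: {psi_stable:.3f}")
print(f"PSI, shifted data: {psi_shifted:.3f}")
```

A frequently cited rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a signal to investigate or retrain, but teams should set thresholds that fit their own data and risk tolerance.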
What is ML monitoring?
How does a manufacturer measure quality before their products are boxed and shipped to retailers and customers? Manufacturers use different tools to identify defects, including when an assembly line is beginning to show deviations from acceptable output quality. If we think of an ML model as a small manufacturing plant producing forecasts, then it makes sense that data science teams need ML monitoring tools to check for performance and quality issues. Katie Roberts, data science solution architect at Neo4j, says, “ML monitoring is a set of techniques used during production to detect issues that may negatively impact model performance, resulting in poor-quality insights.”
Manufacturing and quality control is an easy analogy, and here is one recommendation that provides ML model monitoring specifics: “As companies accelerate investment in AI/ML initiatives, AI models will increase drastically from tens to thousands. Each needs to be stored securely and monitored continuously to ensure accuracy,” says Hillary Ashton, chief product officer at Teradata.
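To make the quality-control analogy concrete for technical teammates, here is a rough sketch of one simple monitoring technique: tracking a model's accuracy over a rolling window of recent predictions and flagging it when accuracy drops below an agreed floor. The `AccuracyMonitor` class, its window size, and its threshold are hypothetical choices for illustration; production monitoring platforms track many more signals than this.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions, like a line inspector."""

    def __init__(self, window=100, min_accuracy=0.9):
        self.outcomes = deque(maxlen=window)  # oldest results fall off automatically
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual):
        # Store True when the model was right, False when it was wrong.
        self.outcomes.append(prediction == actual)

    @property
    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        # Only raise a flag once the window is full, to avoid noisy early alerts.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy < self.min_accuracy


monitor = AccuracyMonitor(window=10, min_accuracy=0.8)

# Ten correct predictions: the "assembly line" is within tolerance.
for _ in range(10):
    monitor.record("approve", "approve")
healthy_flag = monitor.needs_attention()

# Five wrong predictions in a row push recent accuracy down to 50%.
for _ in range(5):
    monitor.record("approve", "decline")
degraded_flag = monitor.needs_attention()
```

The window keeps the check focused on recent behavior, which is the point of the analogy: the inspector cares about what is coming off the line now, not last quarter's output.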
What is modelops?
MLops focuses on multidisciplinary teams collaborating on developing, deploying, and maintaining models. But how should leaders decide what models to invest in, which ones require maintenance, and where to create transparency around the costs and benefits of artificial intelligence and machine learning?
These are governance concerns and part of what modelops practices and platforms aim to address. Business leaders want modelops but won’t fully understand the need and what it delivers until it’s partially implemented.
That’s a problem, especially for enterprises that seek investment in modelops platforms. Nitin Rakesh, CEO and managing director of Mphasis, suggests explaining modelops this way: “By focusing on modelops, organizations can ensure machine learning models are deployed and maintained to maximize value and ensure governance for different versions.”
Ashton suggests including one example practice. “Modelops allows data scientists to identify and remediate data quality risks, automatically detect when models degrade, and schedule model retraining,” she says.
There are still many new ML and AI capabilities, algorithms, and technologies with confusing jargon that will seep into a business leader’s vocabulary. When data specialists and technologists take time to explain the terminology in language business leaders understand, they are more likely to get collaborative support and buy-in for new investments.