The adoption of machine learning (ML) is increasing at a rapid pace across manufacturing companies.
As more ML applications move from pilot to production, ensuring a quick return on investment (ROI) is key. To achieve this, organizations are speeding up development with techniques such as prebuilt components, pretrained models, feature reuse, and the use of upstream model outputs. However, in their quest for quick go-to-market and faster realization of business value, organizations tend to overlook many critical aspects of the ML development cycle and infrastructure. If left untreated for long, these overlooked aspects can make ML applications brittle and unstable. Borrowing the terminology of software technical debt, these ML-specific issues are termed ML technical debt. Examples of ML technical debt include prediction bias, correction cascades, undeclared consumers, unstable data dependencies, legacy features, and configuration debt.
Continually evolving market conditions force manufacturing organizations to move faster and thereby accumulate ML technical debt, which elevates business risks and escalates costs.
For example, manufacturing organizations routinely use machine learning methods to proactively detect warranty fraud. However, prediction bias can make warranty fraud detection models ineffective when dealers and service providers change their behaviour. As the input data (warranty claims) drifts away from the profile of the original training data, the model can no longer effectively flag potential fraud.
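One way to make this kind of drift visible is to compare the distribution of an input feature at training time with its distribution in recent claims. The sketch below uses the population stability index (PSI), a technique not named in this article and chosen here only for illustration. The function name, the sample claim amounts, and the bin count are all assumptions. A commonly cited rule of thumb treats PSI values above roughly 0.25 as a significant shift, but any threshold should be tuned to the application.

```python
import math

def population_stability_index(expected, actual, bins=10):
    """PSI between a training-time sample and a recent sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all training values are equal

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Values outside the training range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical claim amounts: the recent claims are shifted upward.
training_claims = [100 + i for i in range(100)]
recent_claims = [amount + 80 for amount in training_claims]

psi_same = population_stability_index(training_claims, training_claims)
psi_shifted = population_stability_index(training_claims, recent_claims)
```

In this toy setup the PSI of the training sample against itself is zero, while the shifted sample produces a large value, which would be a signal to review or retrain the fraud model.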
Similarly, if a model developed to approve vehicle lease applications is trained on a dataset that underrepresents a certain demographic group, the model may show a prediction bias against lease applications received from that group. This can result in lost sales while also exposing the organization to litigation. Likewise, when pricing the service contract for a new vehicle, it is not uncommon to use the ML model of a similar existing vehicle with some corrections applied to its output. This creates a correction cascade: if the original model is updated, the service contract pricing of the new vehicle becomes sub-optimal, resulting in either lost sales or eroded margins.
Data-driven development implies that ML models are only as good as the initial data they were trained on.
As the actual data profile drifts away from the training data, models start to deteriorate. Sometimes the data profile does not change, but business realities change so subtly that models no longer accurately depict the current state of the world. Many ML algorithms are also stochastic, which means they may not produce the same model parameters every time they are retrained, and the model inferences may vary as well. These characteristics of data-driven development and stochastic algorithms make ML technical debt harder to spot than software technical debt.
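The effect of stochastic training can be seen in a very small example. The sketch below fits a single slope with stochastic gradient descent, where both the starting weight and the order in which samples are visited depend on a random seed. The function name, data, learning rate, and epoch count are illustrative assumptions, not taken from any particular library or from the applications described here.

```python
import random

def train_slope(data, seed, epochs=5, lr=0.01):
    """Fit y ~ w * x with stochastic gradient descent; results depend on the seed."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)  # random initial weight
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # random visiting order in every epoch
        for x, y in samples:
            w -= lr * 2 * (w * x - y) * x  # gradient step on the squared error
    return w

# Hypothetical noisy data generated around a true slope of 3.0.
data = [(x, 3.0 * x + noise)
        for x, noise in zip([1, 2, 3, 4, 5], [0.2, -0.1, 0.3, -0.2, 0.1])]

w_a = train_slope(data, seed=1)
w_b = train_slope(data, seed=2)        # different seed, slightly different weight
w_a_again = train_slope(data, seed=1)  # same seed reproduces the first run
```

Both runs land close to the true slope, but they do not agree exactly unless the seed is fixed. Recording seeds along with data and code versions is one simple step toward making retraining reproducible.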
Historically, it has been difficult to address ML technical debt.
However, with the recent advancements in AI engineering and machine learning operations (MLOps), it is now possible to control and service this debt to a great extent.
For example, correction cascades and undeclared consumers can be kept under control with the model versioning feature of MLOps. Model versioning makes it possible to update the base model without breaking dependent models and consumer applications, because these can continue to use the previous version of the base model as long as it remains available in the model registry. Although MLOps does not completely solve the problems of correction cascades and undeclared consumers, it does ensure that unwanted dependencies do not delay (or stop) new releases of the base model.
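The version-pinning idea can be illustrated with a toy in-memory registry. This is a minimal sketch and not the API of any real MLOps product: the `ModelRegistry` class, the model name, and the pricing formulas are all hypothetical. It shows a corrected model for a new vehicle that stays pinned to version 1 of a base pricing model while version 2 is released.

```python
class ModelRegistry:
    """Toy in-memory registry: every published version stays retrievable."""

    def __init__(self):
        self._versions = {}  # model name -> list of callables (index = version - 1)

    def publish(self, name, predict_fn):
        self._versions.setdefault(name, []).append(predict_fn)
        return len(self._versions[name])  # version number just assigned

    def load(self, name, version=None):
        versions = self._versions[name]
        # No version requested means "latest"; otherwise return the pinned one.
        return versions[-1] if version is None else versions[version - 1]

registry = ModelRegistry()

# Hypothetical base model: service contract price for an existing vehicle.
v1 = registry.publish("sedan_contract_price", lambda mileage: 500 + mileage // 50)

def new_vehicle_price(mileage, base_version=v1):
    """Correction applied on top of a pinned version of the base model."""
    base = registry.load("sedan_contract_price", version=base_version)
    return base(mileage) + 120  # flat uplift, purely illustrative

price_before = new_vehicle_price(10_000)

# The base model team releases an updated model without coordinating with consumers.
v2 = registry.publish("sedan_contract_price", lambda mileage: 550 + mileage // 40)

price_after = new_vehicle_price(10_000)                   # still uses version 1
price_if_unpinned = new_vehicle_price(10_000, base_version=v2)
```

The pinned consumer returns the same price before and after the new release, whereas an unpinned consumer would silently pick up the changed behaviour. The correction cascade itself is still there; pinning only makes the dependency explicit and lets the consumer migrate on its own schedule.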
Similarly, unstable data dependencies can be handled with the dataset versioning feature of MLOps. Furthermore, almost every MLOps framework can calculate feature importance, which helps in weeding out unnecessary legacy features. MLOps solutions also provide extensive capabilities to slice and dice prediction sets in order to detect prediction bias and ensure model fairness. Finally, MLOps promotes reproducible ML pipelines defined as code, which enables continuous integration and continuous delivery (CI/CD) of ML pipelines and can help resolve the problem of configuration mismatch.
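As an illustration of slicing a prediction set, the sketch below computes the approval rate per group and the ratio between the lowest and highest rates. The record layout, field names, and sample values are assumptions made for this example, and the ratio is only one of several possible fairness indicators; real MLOps tools typically offer richer metrics.

```python
from collections import defaultdict

def approval_rate_by_group(records, group_key="group", pred_key="approved"):
    """Share of positive predictions within each slice of the prediction set."""
    totals, approved = defaultdict(int), defaultdict(int)
    for record in records:
        totals[record[group_key]] += 1
        approved[record[group_key]] += int(record[pred_key])
    return {group: approved[group] / totals[group] for group in totals}

def disparity_ratio(rates):
    """Lowest group approval rate divided by the highest; 1.0 means parity."""
    highest = max(rates.values())
    return 1.0 if highest == 0 else min(rates.values()) / highest

# Hypothetical lease-approval predictions for two applicant groups.
predictions = (
    [{"group": "A", "approved": True}] * 4 + [{"group": "A", "approved": False}]
    + [{"group": "B", "approved": True}] * 2 + [{"group": "B", "approved": False}] * 3
)

rates = approval_rate_by_group(predictions)
ratio = disparity_ratio(rates)
```

A ratio well below 1.0, as in this made-up sample, does not prove unfairness on its own, but it flags a slice that deserves a closer look at the training data and features.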
Resolving ML technical debt is paramount for continued value creation using machine learning.
Applying machine learning without paying attention to ML technical debt can cause inaccurate and biased predictions that not only erode the organization's competitive edge but also put its reputation at risk. MLOps solutions are relatively new to the market, but they are maturing rapidly and new capabilities are being added continuously. An appropriate MLOps strategy can make it possible for manufacturing organizations to realize the true potential of machine learning.