COVID-19 is a black swan event, and it fundamentally changes the way we pursue Machine Learning in the real world. For businesses facing these rapid changes, the AI/ML models currently live in production were trained on data that looks very different from today’s pandemic situation.
A simple example: a huge reduction in commercial flights can mean a drop in the accuracy of weather forecasts. Large-scale weather forecasts are made using sophisticated computer programs that model a virtual world, and the accuracy of such a model’s output depends on the quality of its input.
One of the usual sources of input is aircraft feeding back real-time wind data. COVID-19 has considerably reduced the number of aircraft operating. As a result, the volume of incoming information has dropped, forecast winds at cruise height can no longer be verified, and the feedback loop into the global model is much weaker.
Similarly, finance models (used for credit analysis, asset pricing, demand forecasting, etc.) that take unemployment data into consideration will see extreme input values never seen previously. Consider the unemployment data released just this week: the reading of 17 million people filing for unemployment is four to five times the next highest reading, and a more than 25-sigma event.
Any model using unemployment data as input, and making decisions on that data, is consuming a feature that sits more than 25 sigma outside its expected value. This is an event that would be calculated to happen once every 100,000 years!
No model fed this kind of input can be expected to handle such an unprecedented event gracefully and return an accurate response.
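To make this concrete, here is a minimal sketch (in Python, with purely illustrative numbers) of how a team might measure how far a new input sits from its training-period distribution before trusting the model’s output:

```python
import numpy as np

def sigma_distance(value, train_mean, train_std):
    """How many standard deviations a new input sits from the training-period mean."""
    return abs(value - train_mean) / train_std

# Hypothetical weekly unemployment claims (in millions) seen during training -- illustrative only.
train_claims = np.array([0.20, 0.21, 0.22, 0.25, 0.23, 0.28, 0.30])
new_claim = 17.0  # the extreme reading discussed above

z = sigma_distance(new_claim, train_claims.mean(), train_claims.std())
if z > 4:  # the threshold is a judgment call, not a universal rule
    print(f"Input is {z:.0f} sigma from the training mean; treat the model output with caution")
```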
ML models are trained on previously seen observations to predict future scenarios. However, today these models are seeing events unlike anything they were ever trained on.
Many businesses (especially in credit and finance) have hundreds to thousands of live production models running in their organisation, and those models are now making incorrect decisions that affect business outcomes across sectors: health, business, finance, gig-economy, credit, commerce, auto-traffic and travel, to name a few.
The models that are likely to have problems in the coming days or months span credit, home pricing, asset pricing, demand forecasting, conversion/churn models, supply-demand for gig companies, and ad pricing, among several others.
With that in mind, it’s important to think about model observability, overall system resilience to these inputs, and the ability to troubleshoot as issues arise. The most important thing is for teams to have models that are observable; if you can’t observe, you can’t adapt. This means having instrumented detection and analysis on model decisions.
As a rule of thumb for model observability, it is important to look out for the following (a small sketch of these checks follows the list):
- Events that are outliers should be detectable and surfaced automatically
- Outlier events should be linked with analytics for troubleshooting the model’s response
- Distribution shifts in the input data caused by these events should be monitored
- Robust slicing and filtering capabilities should be available for model inputs
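As an illustration of the first, second, and fourth points, here is a small pandas sketch, with hypothetical feature names and thresholds, that flags incoming events whose features sit far from training-period statistics and slices them out for analysis:

```python
import pandas as pd

def flag_outlier_events(events: pd.DataFrame, train_stats: pd.DataFrame, k: float = 4.0) -> pd.DataFrame:
    """Flag events whose features sit more than k sigma from the training-period statistics."""
    z = (events - train_stats.loc["mean"]).abs() / train_stats.loc["std"]
    events = events.copy()
    events["is_outlier"] = (z > k).any(axis=1)
    return events

# Hypothetical per-feature statistics captured at training time.
train_stats = pd.DataFrame(
    {"unemployment_claims_m": [0.25, 0.05], "consumer_spend_idx": [100.0, 8.0]},
    index=["mean", "std"],
)

live_events = pd.DataFrame(
    {"unemployment_claims_m": [0.24, 17.0], "consumer_spend_idx": [98.0, 61.0]}
)

flagged = flag_outlier_events(live_events, train_stats)
outliers = flagged[flagged["is_outlier"]]  # slice down to the anomalous inputs
print(outliers)                            # surface them for troubleshooting and analytics
```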
Input Variables:
Key input-level monitors that an AI/ML model should have in production while factoring in black swan events (a drift-check sketch follows the list):
- Input checks to determine if values and distributions of features are drastically different from normal benchmark periods
- Checks on single events, or a small number of recent events, to detect out-of-distribution issues
- Detection of whether the features your model is most sensitive to have changed drastically, so that this can be factored in accordingly
- Statistics to determine how far off the features are from the training set
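One simple way to implement the distribution-level checks above is a two-sample Kolmogorov–Smirnov test per feature against a training benchmark; the sketch below uses scipy, and the feature name and data are synthetic, purely for illustration:

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_report(train_features: dict, live_features: dict, alpha: float = 0.01) -> dict:
    """Compare each live feature's distribution to its training benchmark with a two-sample KS test."""
    report = {}
    for name, train_values in train_features.items():
        stat, p_value = ks_2samp(train_values, live_features[name])
        report[name] = {"ks_stat": stat, "p_value": p_value, "drifted": p_value < alpha}
    return report

# Illustrative data only: pre-COVID benchmark vs the most recent scoring window.
rng = np.random.default_rng(0)
train = {"unemployment_rate": rng.normal(4.0, 0.3, 5000)}
live = {"unemployment_rate": rng.normal(11.0, 2.0, 200)}

print(drift_report(train, live))
```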
Model Response:
Once you know the input to a model has changed, the next thing to monitor is how the model is responding to those extreme inputs (a sliced-performance sketch follows the list):
- Check the model’s performance for specific subclasses of predictions. Certain sectors such as energy, airlines, or travel might carry significant risk, so it is best to have fast online checks against various groups of predictions
- Use prior periods to produce worst-case and base case scenarios to then compare against outcomes
- Monitor the predictions in real-time against every new truth event (real-world prediction feedback) you receive
- If real-world feedback is not available due to time lags, use proxy metrics: things you can predict and measure to determine the model’s performance
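As a sketch of the first and third points, the snippet below computes error per prediction segment as ground-truth feedback arrives; the prediction log, sector names, and numbers are hypothetical:

```python
import pandas as pd

def sliced_error(predictions: pd.DataFrame) -> pd.Series:
    """Mean absolute error per segment, so hard-hit sectors (travel, energy) are checked separately."""
    predictions = predictions.dropna(subset=["actual"])  # keep only rows where a truth event has arrived
    err = (predictions["predicted"] - predictions["actual"]).abs()
    return err.groupby(predictions["sector"]).mean()

# Hypothetical prediction log joined with ground-truth feedback as it arrives.
log = pd.DataFrame({
    "sector":    ["travel", "travel", "retail", "energy"],
    "predicted": [0.80, 0.75, 0.40, 0.55],
    "actual":    [0.20, 0.15, 0.38, None],   # energy outcome not yet observed
})

# A spike in a single sector is an early warning even if the global metric still looks fine.
print(sliced_error(log))
```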
Overall best practices:
The best practice for production ML models is not far off from best practice for production software: building observability tools to understand what is happening when models or software are live, so that issues are caught before they impact your customers. Some best practices for production ML models in these extreme environments are listed below.
Track and identify outlier events
Tracking input data and model performance on outlier events is key. Annotating these events, and being able to filter on them, helps when gathering training data for future extreme environments. It is also important to consider whether to include outlier events in data for future model training: the model will be better prepared for future extremes, but it might also learn to treat extremes as the new normal.
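One way to handle this trade-off is to keep outlier periods annotated and either drop or down-weight them when assembling the next training set; the sketch below assumes a hypothetical flagged history and an arbitrary weight:

```python
import pandas as pd

def build_training_set(history: pd.DataFrame, include_outliers: bool = False, outlier_weight: float = 0.2):
    """Keep outlier periods annotated; either drop them or down-weight them when retraining."""
    history = history.copy()
    history["weight"] = 1.0
    if include_outliers:
        history.loc[history["is_outlier"], "weight"] = outlier_weight  # keep, but don't let extremes dominate
    else:
        history = history[~history["is_outlier"]]
    return history

# Hypothetical labelled history with outlier periods already flagged by the monitors above.
history = pd.DataFrame({
    "feature":    [0.24, 0.26, 6.9, 17.0],
    "target":     [0.10, 0.12, 0.60, 0.90],
    "is_outlier": [False, False, True, True],
})

print(build_training_set(history, include_outliers=True))
```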
Decide on a Model Fallback Plan
Understanding how the model has performed during past extreme environments can help you judge how it is performing now. If the model is not performing well, it is best to set up naive forecasts based on the last N minutes or N days and compare the model’s performance to this naive baseline.
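A minimal sketch of such a fallback check, assuming a simple "repeat the last observation" baseline and illustrative daily figures:

```python
import numpy as np

def should_fall_back(actuals, model_preds, window: int = 30) -> bool:
    """Compare the live model against a naive last-value forecast over the recent window."""
    actuals = np.asarray(actuals, dtype=float)[-window:]
    model_preds = np.asarray(model_preds, dtype=float)[-window:]
    naive_preds = np.roll(actuals, 1)[1:]  # yesterday's value used as today's forecast
    model_mae = np.mean(np.abs(model_preds[1:] - actuals[1:]))
    naive_mae = np.mean(np.abs(naive_preds - actuals[1:]))
    return model_mae > naive_mae  # fall back if the naive baseline is winning

# Illustrative daily demand figures and the model's forecasts for the same days.
actuals     = [100, 102, 98, 60, 35, 30, 28]
model_preds = [101, 100, 99, 95, 90, 85, 80]  # the model hasn't caught up with the drop
print(should_fall_back(actuals, model_preds))  # True -> route decisions to the naive forecast
```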
Find look-alike events
It is important to have enough observability into past similar events to set up look-alike modeling for this current situation. For example, if your model took in unemployment data as input, you might be able to leverage unemployment data from similar economic downturns, such as the 2008 recession.
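One lightweight way to find look-alike periods is a nearest-neighbour search over standardised macro indicators; the period labels and indicator values below are hypothetical and only meant to show the shape of the approach:

```python
import numpy as np

def most_similar_periods(current: np.ndarray, history: dict, top_k: int = 2):
    """Rank past periods by Euclidean distance of (already standardised) macro indicators to today."""
    scored = sorted(history.items(), key=lambda kv: np.linalg.norm(np.asarray(kv[1]) - current))
    return scored[:top_k]

# Hypothetical standardised indicators: [unemployment change, consumer confidence, market volatility]
history = {
    "2008-Q4": [2.1, -1.8, 2.5],
    "2015-Q3": [0.1, 0.2, 0.6],
    "2001-Q4": [0.9, -0.7, 1.1],
}
today = np.array([3.0, -2.2, 2.9])

print(most_similar_periods(today, history))  # 2008-Q4 surfaces as the closest look-alike period
```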
Build a diverse portfolio of models and compare model performance
Real-time models that are reacting to the external world might be performing better today than batch predictions. Having a diverse portfolio of models enables teams to compare model performance and route traffic to models that are reacting better to extreme environments.
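A sketch of one possible routing rule, picking the champion model by recent rolling error (the model names and error numbers are hypothetical):

```python
import numpy as np

def pick_champion(recent_errors: dict) -> str:
    """Route traffic to whichever model has the lowest mean absolute error over the recent window."""
    return min(recent_errors, key=lambda name: np.mean(recent_errors[name]))

# Hypothetical rolling errors: a real-time model vs a nightly batch model during the shock period.
recent_errors = {
    "realtime_model": [4.0, 5.5, 6.0, 5.0],
    "batch_model":    [4.2, 9.0, 14.0, 18.0],  # the batch model degrades as inputs drift
}

champion = pick_champion(recent_errors)
print(f"Routing traffic to: {champion}")
```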
Combine various models by ensembling or stacking
Stacking or ensembling different models, such as tree-based and penalised regression models, can reduce the estimation error of the output and make forecasts more robust during extreme events.
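For teams using scikit-learn, a stacked ensemble of a tree-based learner and a penalised regression learner might look like the sketch below; the data is synthetic and purely illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge

# Synthetic tabular regression data standing in for whatever the production model predicts.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = X[:, 0] * 2.0 + X[:, 1] ** 2 + rng.normal(scale=0.5, size=500)

stack = StackingRegressor(
    estimators=[
        ("trees", RandomForestRegressor(n_estimators=100, random_state=0)),  # tree-based learner
        ("ridge", Ridge(alpha=1.0)),                                         # penalised regression learner
    ],
    final_estimator=Ridge(alpha=1.0),  # meta-model combines the base models' out-of-fold predictions
)
stack.fit(X, y)
print(stack.predict(X[:3]))
```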
Know the uncertainty of your model’s predictions
Real-life modelling scenarios can be marked by the absence of a good model altogether. In these cases, do you know how uncertain your model is? Do not rely on point estimates alone; instead, return the model’s predictions along with their confidence levels. In extreme periods, the uncertainty band will widen, providing valuable information about the unreliability of point estimates.
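One way to return an interval rather than a point estimate is quantile regression; the sketch below uses scikit-learn’s quantile gradient boosting on synthetic data, purely for illustration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic one-feature regression data standing in for the production problem.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)

# Fit one model per quantile: lower bound, median, upper bound.
models = {
    q: GradientBoostingRegressor(loss="quantile", alpha=q, random_state=0).fit(X, y)
    for q in (0.05, 0.5, 0.95)
}

x_new = np.array([[2.5]])
lower, median, upper = (models[q].predict(x_new)[0] for q in (0.05, 0.5, 0.95))
print(f"prediction: {median:.2f}, 90% interval: [{lower:.2f}, {upper:.2f}]")
# A widening interval in extreme periods signals that the point estimate is becoming unreliable.
```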