While machine-learning techniques can improve business processes, predict future outcomes, and save money, they also increase modeling risk because of their complex and opaque features. In this article, Milliman’s Jonathan Glowacki and Martin Reichhoff discuss how model validation techniques can mitigate the potential pitfalls of machine-learning algorithms.
Here is an excerpt:
An independent model validation carried out by knowledgeable professionals can mitigate the risks associated with new modeling techniques. In spite of the novelty of machine-learning techniques, there are several methods to safeguard against overfitting and other modeling flaws. The most important requirement for model validation is for the team performing the model validation to understand the algorithm. If the validator does not understand the theory and assumptions behind the model, then they are unlikely to perform an effective validation of the process. After demonstrating an understanding of the model theory, the following procedures are helpful in performing the validation.
Outcomes analysis refers to comparing modeled results to actual data. For advanced modeling techniques, outcomes analysis becomes a very simple yet useful approach to understanding model interactions and pitfalls. One way to understand model results is to simply plot the range of the independent variable against both the actual and predicted outcome along with the number of observations. This allows the user to visualize the univariate relationship within the model and understand if the model is overfitting to sparse data. To evaluate possible interactions, cross plots can also be created looking at results in two dimensions as opposed to a single dimension. Results in more than two dimensions become difficult to evaluate, but looking at simple interactions does provide a useful initial understanding of how the model behaves with independent variables….
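As a rough illustration of the univariate comparison described above, the sketch below buckets one independent variable and reports the observation count, mean actual outcome, and mean predicted outcome per bucket. The function name `outcomes_table`, the bucket edges, and the data are all made up for illustration; they do not come from the article.

```python
from bisect import bisect_right

def outcomes_table(x, actual, predicted, edges):
    """Summarize actual vs. predicted outcomes per bucket of one variable.

    `edges` are hypothetical bucket boundaries; bucket i covers
    [edges[i], edges[i + 1]). Values outside the edges are clamped
    into the first or last bucket.
    """
    n_buckets = len(edges) - 1
    counts = [0] * n_buckets
    sum_actual = [0.0] * n_buckets
    sum_predicted = [0.0] * n_buckets
    for xi, ai, pi in zip(x, actual, predicted):
        b = min(max(bisect_right(edges, xi) - 1, 0), n_buckets - 1)
        counts[b] += 1
        sum_actual[b] += ai
        sum_predicted[b] += pi
    rows = []
    for b in range(n_buckets):
        if counts[b] == 0:
            continue  # skip empty buckets rather than divide by zero
        rows.append({
            "bucket": (edges[b], edges[b + 1]),
            "n": counts[b],
            "mean_actual": sum_actual[b] / counts[b],
            "mean_predicted": sum_predicted[b] / counts[b],
        })
    return rows

# Illustrative (made-up) observations
x = [1, 2, 3, 11, 12, 25]
actual = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]
predicted = [0.2, 0.4, 0.3, 0.8, 0.6, 0.9]

table = outcomes_table(x, actual, predicted, edges=[0, 10, 20, 30])
for row in table:
    print(row)
```

A bucket with a very small `n` and a large gap between the two means is the kind of sparse region where overfitting is worth checking. The rows can be passed to any plotting tool to produce the chart the excerpt describes.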
…Cross-validation is a common strategy to help ensure that a model isn’t overfitting the sample data it’s being developed with. Cross-validation has been used to help ensure the integrity of other statistical methods in the past, and with the rising popularity of machine-learning techniques, it has become even more important. In cross-validation, a model is fitted using only a portion of the sample data. The model is then applied to the other portion of the data to test performance. Ideally, a model will perform equally well on both portions of the data. If it doesn’t, it’s likely that the model has been overfit.
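The fit-on-one-portion, test-on-the-other idea can be sketched in a few lines. The example below is a minimal k-fold version that uses a simple least-squares line as a stand-in model; the helper names, the choice of model, and the data are assumptions for illustration, not the authors' procedure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_squared_error(model, xs, ys):
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def k_fold_errors(xs, ys, k):
    """Return (training error, held-out error) for each of k folds."""
    n = len(xs)
    results = []
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th point is held out
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        test_x = [xs[i] for i in sorted(held_out)]
        test_y = [ys[i] for i in sorted(held_out)]
        model = fit_line(train_x, train_y)  # fit on one portion only
        results.append((
            mean_squared_error(model, train_x, train_y),
            mean_squared_error(model, test_x, test_y),
        ))
    return results

# Made-up data: a linear trend with a small alternating disturbance
xs = list(range(12))
ys = [2 * x + 1 + (0.5 if x % 2 == 0 else -0.5) for x in xs]

errors = k_fold_errors(xs, ys, k=3)
for train_error, test_error in errors:
    print(f"train MSE = {train_error:.3f}, held-out MSE = {test_error:.3f}")
```

When the held-out error is much larger than the training error across folds, that gap is the signal of overfitting the excerpt refers to.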
Advances in catastrophe models and new state insurance regulations have opened the door for an affordable, risk-based private insurance market in Florida. This reading list highlights articles focusing on various issues and implications related to the market. The articles feature Milliman consultants Nancy Watkins and Matt Chamberlain, whose knowledge and experience are helping insurers to understand and price flood risk more precisely.
• Forbes: “The private flood insurance market is stirring after more than 50 years of dormancy”
The reemergence of private flood insurance has piqued the interest of carriers seeking to enter the market. Some catastrophe (CAT) modeling companies are creating flood models to help insurers price policies. Here’s an excerpt:
Nancy Watkins, a principal consulting actuary for Milliman, likened the current level of interest from insurers to enter the private flood insurance market to popcorn.
“We are at that stage where you can hear the space between pops. You can hear one kernel at a time,” she said. “What I think is going to happen is, in one to two years, there’s going to be a lot more going on.”
• Bradenton Herald: “Important for homeowners to compare flood insurance options”
Florida homeowners must consider the issues related to the National Flood Insurance Program (NFIP) and private flood policies. Private insurers can use predictive modeling technology to determine a home’s distinct flood risk.
• Tampa Bay Times: “Remember the flood insurance scare of 2013? It’s creeping back into Tampa Bay and Florida”
Real estate and insurance experts comment on the possible effects that high flood insurance rates may have on homeowners. Insurers express interest in the granular modeling of flood-prone territories.
• Tampa Bay Business Journal: “Why some Tampa Bay property insurers are offering flood coverage and others are not” (subscription required)
Insurers need to weigh the risks and rewards associated with the underwriting of flood insurance. A few carriers have already decided to participate in Florida’s private flood insurance market.
Insurance companies have many responsibilities involving the submission and approval of rate filings, including providing support for any predictive models used to develop rates. Determining the level of predictive modeling support required can be complicated because it varies widely by state.
In this Insight article, Milliman’s Eric Krafcheck discusses the types of predictive modeling support that companies should include in a rate filing to minimize the objections and amount of time needed to receive a regulator’s approval.
Here is an excerpt:
While there are a variety of predictive modeling support documents that a filer can choose to include in a rate filing, some are more standard than others. At the very least, in addition to the indicated rating factors, an actuarial memo should be included with the filing that describes the data used, the adjustments that were made to the data, and the overall modeling process, including a description of the model validation techniques. Additionally, the memo should include a discussion of the predictive modeling method used and give any necessary specifications of the models. For instance, if a generalized linear model (GLM) was used, include model specifications such as the target variable being modeled (e.g., pure premium, loss ratio, frequency, severity, etc.), the error distribution used (e.g., Poisson, Tweedie, etc.), the link function used, and the predictor variables included in the final model. As always, if you are an actuary, consult the Actuarial Standard of Practice No. 41, Actuarial Communications, for other items that you should consider adding to the memo.
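To make the GLM specification items concrete, here is a minimal sketch of a claim-frequency model with a Poisson error distribution and a log link, fitted by a hand-rolled Newton iteration. Everything in it is assumed for illustration: the `model_spec` dictionary, the single predictor, the noise-free data, and the fitting routine itself. A real filing would rely on a tested statistical package rather than this code.

```python
import math

# Hypothetical record of the items an actuarial memo would describe
model_spec = {
    "target": "frequency",
    "error_distribution": "Poisson",
    "link": "log",
    "predictors": ["x"],
}

def fit_poisson_glm(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton's method (Poisson, log link)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix entries (negative Hessian)
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton update: solve the 2x2 system explicitly
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up, noise-free data generated from log(mean) = 0.5 + 0.3 * x
x = [0, 1, 2, 3, 4, 5]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]

b0, b1 = fit_poisson_glm(x, y)
print(model_spec)
print(f"intercept = {b0:.4f}, coefficient = {b1:.4f}")
```

Because the data are generated without noise, the fit should recover the generating coefficients, which makes the example easy to check by eye.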
Other types of support are useful to have on hand but might not be necessary to provide in every state. For instance, some states require the submission of various goodness-of-fit measures and other model validation statistics whereas other states may not. If you do not want to provide this information in every state unless necessary, you should at least have the information readily available in an exhibit format. This will save a lot of time down the road if goodness-of-fit information is requested in response to a filing.
While most daily fantasy sports (DFS) players usually swing and miss, big data management and predictive analytics have the capacity to increase a player’s chance of winning more consistently. In this article, Milliman’s Michael Henk and Nicholas Blaubach discuss the monetary success that some advanced modelers are having on DFS websites using predictive analytics. The following excerpt highlights the steps necessary to build a DFS predictive model.
There are some basic steps that serve as general “rules of thumb” when we set out to develop our predictive model to make us millions in DFS.
First, we need an objective. We want our model to optimize our roster, giving us the most potential points. In our DFS example, we’d want a predictive model that will help us identify the best players for the cost (in order to stay under the salary caps) for any given contest.
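The objective above, the most projected points without exceeding the salary cap, can be sketched as a small search problem. The example below simply tries every roster of a given size. The player names, salaries, projected points, roster size, and cap are all invented, and a brute-force search is only practical for a handful of players.

```python
from itertools import combinations

def best_roster(players, roster_size, salary_cap):
    """Return the roster with the most projected points under the cap.

    `players` is a list of (name, salary, projected_points) tuples.
    Returns (None, -inf) if no roster fits under the cap.
    """
    best, best_points = None, float("-inf")
    for combo in combinations(players, roster_size):
        cost = sum(salary for _, salary, _ in combo)
        if cost > salary_cap:
            continue  # over the cap, not a valid roster
        points = sum(projected for _, _, projected in combo)
        if points > best_points:
            best, best_points = combo, points
    return best, best_points

# Hypothetical player pool: (name, salary, projected points)
players = [
    ("A", 9000, 45.0),
    ("B", 7000, 38.0),
    ("C", 5000, 30.0),
    ("D", 4000, 22.0),
    ("E", 3000, 18.0),
]

roster, points = best_roster(players, roster_size=3, salary_cap=16000)
print([name for name, _, _ in roster], points)
```

In this sketch the projected points are taken as given. In practice they would be the output of the predictive model, so the quality of the roster depends on the quality of those projections.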
Next, we gather our data… Gathering the data and getting it into a proper format for our predictive model is another story, but historical sports data is easy to find online. One thing to consider here is the traditional actuarial concern of credibility. If the data isn’t credible, it’s highly unlikely that we’ll be able to build a successful model from it….
After we choose the data to use, we need to select and transform the specific variables in the data set. The structure of the predictive (or independent) variables in relation to the target (or dependent) variable determines how well a model works. We can transform variables (by taking logarithms, for example) or bucket variables to see what gives us the best fit. Sports data can have hundreds (or even thousands) of variables….
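The two transformations mentioned above, taking logarithms and bucketing, might look like the following. The helper names, the use of `log1p` (chosen here so that zero values are allowed), the bucket edges, and the sample values are assumptions for illustration only.

```python
import math
from bisect import bisect_right

def log_transform(values):
    """Apply log(1 + v) so that zero values stay defined."""
    return [math.log1p(v) for v in values]

def bucket(values, edges):
    """Map each value to a bucket index: 0 for values below edges[0],
    1 for values in [edges[0], edges[1]), and so on."""
    return [bisect_right(edges, v) for v in values]

# Made-up values for a hypothetical predictor
raw_values = [0, 5, 15, 25, 400]

logged = log_transform(raw_values)
buckets = bucket(raw_values, edges=[10, 20])
print(logged)
print(buckets)
```

Each transformed version of a variable can then be tried in the model in turn to see which gives the best fit.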
Next, we process and evaluate our model. The key to good model performance is obviously getting the best fit. If we’ve done the other steps up to this point well, this step should run smoothly. Here we identify the ideal number of variables and use performance metrics to evaluate the model fits….
Once all of that is done, it’s important to not merely implement the model and ignore it. It requires routine maintenance. As time goes by and data continues to emerge, we need to take time to reinvestigate the data, update the models, and challenge some of our initial assumptions. The best models are continually updated and recalibrated, audited on a regular basis, and replaced when they are no longer effective.
I recently had the pleasure of participating in a panel discussion on predictive modeling at the 2015 Casualty Actuarial Society (CAS) Ratemaking and Product Management seminar in Dallas. Prior to the meeting, the CAS conducted a predictive modeling survey, and the panelists were there to discuss both the results and the emerging role that actuaries play in predictive modeling. And it’s an important role! I always try to include an actuary on a predictive modeling project, teaming actuarial expertise with subject matter experts as well as data scientists. This kind of collaboration makes for stronger models. Actuaries bring a unique business knowledge to the mix, while the data scientists will challenge norms. The result of this collaborative tension: innovative and relevant business insights.
The CAS issued a press release earlier this week recapping the panel. You can read it here, or contact me for more information.
Some insurers and third-party administrators (TPAs) have recently begun to tackle rising medical costs by implementing claims predictive models. In their paper, “Making workers’ compensation medical costs more manageable,” Rong Yi and Steve DiCenso discuss how next-generation predictive models enable changes in claims strategies and culture.
The authors highlight a case study in their paper illustrating how one company was able to make changes in these areas and gain value from enhanced predictive models.
The key strategic changes outlined by the authors focus on:
• Identifying and estimating future high-cost claims earlier.
• Identifying and quantifying the cost (risk) drivers of these claims.
• Creating a focused, early-intervention medical management program to prevent adverse claim development.