AI Wisdom Ep. 18: The Art and Science of Machine Learning

Artificial Intelligence - September 30, 2020

On this episode of the “AI Wisdom – Talking Innovation in Insurance” podcast, host Ron Glozman speaks with Chris Laver, Chief Data Scientist at Chisel AI, about the art and science of machine learning and how it can be applied to core insurance processes. They also tackle a few of the common misconceptions about AI and ML. Click the play button to listen or read the full transcript below.

Full Transcript

Ron: Hello, and welcome to “AI Wisdom Talking Innovation in Insurance”. On this podcast, we talk to business and insurtech leaders about how artificial intelligence is transforming the way we buy and sell insurance. I'm your host, Ron Glozman, founder and CEO of Chisel AI, and a strong believer in the power of AI to help people work smart and enrich their lives. So, let's get into it.

There’s a lot of AI hype in the marketplace, which can result in misunderstandings and misconceptions. Machine learning and artificial intelligence are buzzwords that are often used as synonyms. However, machine learning is a technique at the core of AI, whereas AI is a broad field covering many areas, including natural language processing, speech-to-text, machine vision, etc. In its simplest form, machine learning is about learning patterns and predicting outcomes from large data sets. Like most technologies, machine learning is useful in particular areas to solve very specific business cases or problems.

On this episode, we’ll take a closer look at both the art and the science of machine learning and how it can be applied to some core insurance processes. I’m very pleased to have with me today Chris Laver, our very own Chief Data Scientist at Chisel AI, as we unpack the art and science of machine learning. Welcome, Chris. Before we jump in, can you please introduce yourself?

Chris: Yeah, for sure. My name’s Chris, I’m the Chief Data Scientist here at Chisel AI, as Ron mentioned. My background is software engineering and mathematics and I spent entirely too long in grad school, prepping myself for a career in machine learning.

Ron: Excellent. So, Chris, let’s jump right in. Machine learning is a subset of artificial intelligence, and a lot of enterprises are starting to think about how they can leverage machine learning to operate more efficiently and more accurately. Can you talk a little bit about machine learning, what it is and how it works?

Chris: Absolutely. On my very first day of grad school, my advisor asked me to define what AI and machine learning were and we actually spent about two years arguing back and forth about that topic. I promise my answer right now won’t take two years but bear with me. We’re talking about machine learning, which is a little bit more straightforward than the entire field of AI. Broadly speaking, it’s the science of teaching a machine to perform a task, to accomplish a goal by example, rather than by explicitly programming the machine. So, with traditional software development, a programmer would sit down, and they would write a program that processes a bunch of data and produces an answer. A machine learning algorithm flips that paradigm on its head. It takes data and it takes some known answers and it produces or writes the program for you.

Now I say program, but a machine learning system is fundamentally math. It draws heavily from statistics, from linear algebra, from calculus, and a program that’s written by a machine learning model is actually a complicated mathematical function that transforms that data into answers.
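To make that flipped paradigm concrete, here is a minimal sketch in Python. Instead of a programmer writing the rule y = 2x + 1, we hand the machine data and known answers and let ordinary least squares derive the function; the numbers are purely illustrative:

```python
# Known answers generated by the rule y = 2x + 1 -- the "program" we
# want the machine to discover from examples rather than be given.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

# "Training": ordinary least squares finds the line that best maps
# data to answers -- the algorithm writes the function for us.
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

def predict(x):
    """Apply the learned function to new data."""
    return slope * x + intercept

print(predict(5))  # 11.0 -- the machine recovered y = 2x + 1
```

The "complicated mathematical function" Chris mentions is, for real models, far richer than a line, but the shape of the process is the same: data plus known answers in, a function out.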

Ron: Love it! Chris, I can imagine that two years would have been just riveting for everybody to listen to.

Chris: It was a fun conversation.

Ron: So, would it be fair to say in maybe a slightly different word, a deterministic system versus a probabilistic system?

Chris: That’s a good distinction. It’s important to note that most machine learning algorithms are probabilistic or stochastic when they’re learning, when you’re training them. So they don’t end up the same every time, but when it comes time to use them, a phase we call inference, they’re mostly deterministic, which means you are going to repeatedly get the same answer, given the same data over and over.
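That training-versus-inference distinction can be sketched with a toy gradient descent (illustrative only, not any particular production system): training starts from a random point, so two runs follow different paths, but once the weights are fixed, inference always returns the same answer for the same input:

```python
import random

def train(xs, ys, seed):
    """Stochastic training: a random starting point means two runs
    can follow different paths to the answer."""
    random.seed(seed)
    w = random.uniform(-1.0, 1.0)      # random initialization
    for _ in range(1000):              # plain gradient descent on y = w * x
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= 0.01 * grad
    return w

def predict(w, x):
    """Inference: same weights + same input -> same answer, every time."""
    return w * x

xs, ys = [1, 2, 3], [2, 4, 6]          # underlying rule: y = 2x

w_a = train(xs, ys, seed=1)
w_b = train(xs, ys, seed=2)            # a second run from a different start

assert predict(w_a, 10) == predict(w_a, 10)   # deterministic at inference
print(round(w_a, 4), round(w_b, 4))           # both converge near 2.0
```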

Ron: That’s very important because we obviously don’t want to have the same input and different outputs. So, when we actually want to train a system, what does it mean? Is somebody writing code or how does it differ from traditional software development?

Chris: So, training a machine learning model is a combination of data, usually lots of data, computing power, often lots of computing power, and a lot of caffeine and sweat for a machine learning team.

Ron: Love that.

Chris: Usually, you’ll start with a problem you’re trying to solve, just like any other project, and then you’ll gather data that you believe contains enough information to solve that problem. Your team will select an appropriate model architecture, make some technical choices, and then they’ll let that computing power loose to analyze the data and produce that model, that mathematical framework we talked about. Then once it’s been produced, they’ll take a look at the model, they’ll examine it, they’ll measure it on new data, decide if it’s really effectively solving the problem they set out to solve and if they decide that it’s good to go ahead, they’ll deploy it and integrate it into whatever system or process that they’re training this model to help with.

Then they’ll usually observe that for a while in a live scenario to determine if any changes are necessary to improve performance or if it’s good enough. Then just like with any other software cycle, it’s then time to go back to the start, take a look at what you learned, start over and build the next model or make this model better.

Ron: Are there specific things you’re looking for when you talk about effectiveness? You know, maybe people have heard of the words - precision, recall and accuracy. Is that what you think about as effectiveness, or do you have a different measure for effectiveness? And is effectiveness just different across different models and different business problems?

Chris: Great question. Effectiveness you can measure in a lot of ways. There are quantitative technical measures that we look at with a model. You mentioned a couple, accuracy, precision, recall, and F score. These are measures of how well the model has learned to answer the very specific question on the very specific data that you’ve presented the model with. These are very, very useful to calibrate our expectations and understand on a technical level how the model’s performing. It doesn’t necessarily translate to whether the model is effective for solving your business problem. For that, you need to look at your technical business measures for, is this model accomplishing what you need it to accomplish? Or, alternatively, qualitative, ask yourself, ask your employees, ask your people, does this model help you do better day-to-day?
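For readers who haven’t met those quantitative measures, here is how accuracy, precision, recall, and the F score fall out of a confusion matrix; the labels and predictions below are hypothetical:

```python
# Hypothetical predictions vs. ground truth for a binary task
# (1 = positive class, e.g. "this document is a renewal").
truth = [1, 1, 1, 0, 0, 0, 0, 1]
preds = [1, 1, 0, 0, 0, 1, 0, 1]

tp = sum(1 for t, p in zip(truth, preds) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(truth, preds) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(truth, preds) if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in zip(truth, preds) if t == 0 and p == 0)  # true negatives

accuracy  = (tp + tn) / len(truth)    # fraction answered correctly overall
precision = tp / (tp + fp)            # of what it flagged, how much was right
recall    = tp / (tp + fn)            # of what was really there, how much it found
f1        = 2 * precision * recall / (precision + recall)  # harmonic mean

print(accuracy, precision, recall, round(f1, 2))  # 0.75 0.75 0.75 0.75
```

As Chris says, these numbers calibrate expectations on the technical side; whether 75% is "effective" is a business question.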

Ron: So, what are some of the most popular models or methods that you’ve seen and maybe some of the new ones that are coming out that you’re excited about?

Chris: There are literally hundreds of flavors of machine learning methods, models. It’s a very, very active field of research, and new models are coming out every day. I have to keep an eye on the research papers just to keep moderately up-to-date with new developments in the field. But if I had to break it down, I would say there are three broad areas of machine learning that you may come across or care about.

The first is supervised learning, and this is the most common family of models that we see. It’s usually what we mean when we just say the phrase machine learning. You’ll sometimes hear people say deep learning. They’re usually talking about supervised learning. This is exactly the case I talked about, where you have some data, you have some answers that you know, and you have the machine learn to produce the program or the model that makes answers by analyzing and learning from what we already know.

There’s another related area called reinforcement learning. Now, it’s, in some ways, similar to supervised learning. But rather than trying to answer a very specific question, reinforcement learning attempts to achieve the best outcome it can for you. A good example might be a car navigation system. A supervised learning system might be able to predict how busy all of the streets are going to be, might be able to predict given a certain route how long it will take to get there. The reinforcement learning system is goal-based. My goal would be to get where I’m going as fast as possible, and then its outcome would be to tell me the optimal route to get there. That might sound strictly better than supervised learning, but a reinforcement learning model is significantly more complicated and difficult to train. So, you want to choose the right approach based on your problem.
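The goal-seeking flavor Chris describes can be sketched with a toy epsilon-greedy bandit, a deliberately simplified stand-in for reinforcement learning (the route names and travel times are hypothetical): the system learns by acting and observing outcomes, not from labelled examples:

```python
import random

random.seed(0)

# Hypothetical routes with unknown average travel times (minutes).
true_times = {"highway": 22.0, "downtown": 30.0, "side_streets": 26.0}

def observe(route):
    """One noisy trip on a route -- the 'environment' the learner acts in."""
    return true_times[route] + random.uniform(-1.5, 1.5)

# Epsilon-greedy bandit: pursue the goal (fastest route) by acting,
# observing the outcome, and updating -- no labelled examples needed.
estimates = {r: 0.0 for r in true_times}   # optimistic start forces exploration
counts = {r: 0 for r in true_times}

for _ in range(300):
    if random.random() < 0.1:                       # explore occasionally
        route = random.choice(list(true_times))
    else:                                           # exploit the best guess
        route = min(estimates, key=estimates.get)
    trip = observe(route)
    counts[route] += 1
    estimates[route] += (trip - estimates[route]) / counts[route]  # running mean

print(min(estimates, key=estimates.get))  # highway
```

A real reinforcement learning system reasons over sequences of actions and states, which is exactly the extra complexity Chris warns about.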

Ron: I’m curious, I mean, from my perspective, and I don’t know if the machine knows this, but obviously the shortest path between two points is a straight line, and sometimes there’s not a road there. So, are there any sort of downsides when you think about how to train systems? Are there barriers or rules you can set up? Because obviously that’s not a feasible solution.

Chris: Depending on the type of algorithm you’re choosing, the type of data you have, it is possible to bake some limitations or constraints into a model. That tends to be a very problem specific sort of concept. So generally, we don’t have complete freedom to take any action or do anything to achieve a goal. We have to act within some kinds of guardrails, or we have to act within some kinds of strictures or constraints.

So, it’s important when we’re considering what we’re trying to accomplish, that we choose the right machine learning system or framework to help us accomplish it. And the more precise, the more prescriptive we can get about that problem we have to solve, the easier the task of building that system is going to be.

Ron: And it’s already hard work. So, we want to make it as easy as possible.

Chris: We are powered by caffeine. (laughs)

Ron: So, we’re going to take a quick 20-second break to tell you where you can find out some more information and insights about insurance innovation. We’ll be right back.

[If you liked this episode of AI Wisdom, subscribe to our blog, Writing the Future: AI in Commercial Insurance at www.chisel.ai/blog for feature articles, interviews, opinions, and more.]

We’re back with our featured guest, Chris Laver. Let’s jump right into the next question. When you hear these words - data mining, or we talked about machine learning, but there’s also deep learning - are there differences between these? Are they synonyms, and do people use them incorrectly?

Chris: So, there are a lot of conflicting definitions and terms in the field of machine learning even with a group of practitioners. You get a group of machine learning experts in the same room and we’ll spend an hour hashing out what we’re talking about before we even really get started. The differences between things like data mining, machine learning and deep learning usually come down to, like I mentioned before, the problem you’re trying to solve. One of the major ways that machine learning gets used, it’s often called data mining or descriptive machine learning. Just like it sounds, we’re trying to describe something. We wanna analyze our data, discover underlying patterns and trends, get some insight. This type of machine learning is used to provide more information to people, to help them make better, smarter decisions by finding those underlying patterns and distributions in your data.

Now, there’s another flavor of machine learning called predictive machine learning. And, again, this is sort of the most common one, the one that people usually mean when they talk about machine learning. It tries to predict something, it tries to use all of the complex features, information, and factors in your data to produce a prediction, a likelihood, a chance, a probability. For example, I might ask a system, "What is the likelihood it will rain today?" and I can feed it all of the environmental factors and all the local factors from sensors. And it will try and predict an answer for me to say, "There’s a 40% chance or 100% chance that it is going to rain."

There is a third type of machine learning that’s closely related to reinforcement learning that we talked about before, and that’s prescriptive machine learning. This, again, is not so much about predicting a quantity as it is about achieving a goal, achieving some desired outcome. It tries to prescribe actions or prescribe direction to either a human or another system for how we accomplish that goal.

Now, these shouldn’t be viewed as sort of a ladder of increasing desirability. It’s not that prescriptive is better than predictive, is better than descriptive. They all have their place in trying to solve different types of problems, get different kinds of answers. And a really robust machine learning practice will have all of those methods in their toolbox. If I can maybe bring it down to earth and provide a little bit of an example of what those three types might look like, let’s try and apply this to the concept of risk and loss, something I think that our listeners are probably very familiar with.

A descriptive technique, like data mining, it’ll analyze data to determine, you know, relationships between losses that you’ve already seen between risk factors, co-occurrences of those losses. Anything that we can determine should be in our dataset, and it will then try and bring up trends, highlight insights, anything that an analyst or a human might use to help the human make a better decision. Now, in some ways, that’s a rebranding of the quantitative analysis practice that’s been going on for decades. In some ways, this is statistics rebranded with some new tools and techniques.

Predictive techniques, they might be used to try and predict what is the likelihood or magnitude of a loss, given all that same data, all those known risk factors, whereas prescriptive techniques, like reinforcement learning, they might be used to help a human shape a portfolio, choose how they’re going to take decisions to maximize returns or minimize risk.
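A minimal descriptive example in the spirit of what Chris describes is computing the correlation between a risk factor and observed losses; all of the numbers below are hypothetical:

```python
# Hypothetical risk factor (building age, years) and observed losses ($k).
age    = [5, 12, 20, 28, 35, 41, 50]
losses = [2, 4, 7, 9, 14, 15, 21]

n = len(age)
mean_a, mean_l = sum(age) / n, sum(losses) / n

# Pearson correlation: covariance scaled by both standard deviations.
cov = sum((a - mean_a) * (l - mean_l) for a, l in zip(age, losses)) / n
std_a = (sum((a - mean_a) ** 2 for a in age) / n) ** 0.5
std_l = (sum((l - mean_l) ** 2 for l in losses) / n) ** 0.5

pearson = cov / (std_a * std_l)
print(round(pearson, 3))  # close to 1: a strong positive relationship
```

This is the "statistics rebranded" point in miniature: descriptive machine learning surfaces relationships like this one for a human analyst to act on.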

Ron: I love that. So, let’s go a little bit deeper even. When insurers or even brokers or reinsurers or anybody, in the industry might be thinking about deploying AI solutions, what are some of the factors that should be top of mind?

Chris: So, anybody trying to deploy or make use of a machine learning system should always remember that a system is only as good as the data that was used to build it. If your data is incomplete, if your data is biased, these are going to create challenges for that machine learning system.


An example that I’ve seen frequently is models that are trained on data drawn primarily from a specific geography - say, only taking data from locations in the City of Toronto. It’s an easy way to prepare a dataset. But that might not be representative of what conditions look like in different geographies. And then your model would underperform or give you incorrect or bizarre answers if you try to apply it to a new region. So, making sure that your data is really high quality and representative is critical.

It’s also important to remember that great machine learning systems are an iterative process. It’s not build a model once and we’re done and can move on. A great system needs to be maintained and improved over time. Teams get to know the data intimately and understand the techniques that work well on it, understand, like we talked about, the gaps and biases in the data, how to correct for it. And then also how to understand when the underlying data has patterns that are changing over time and maybe they need something new, the model needs to change with that data to adapt.

Ron: When the data changes, what are some of the steps you should be taking? Is there a way to actually determine that the data has changed mathematically?

Chris: So, there’s a topic that we call distributional drift. I promise I’m not going to go too deep. I’ll put everyone to sleep. But there are some technical factors that we can observe in the data over time to tell if that data is changing. We can also observe the model over time. I talked earlier about, there are some technical measures that we can apply and there are qualitative measures.

The people who are going to be most familiar with these systems, with these models are the ones who are using and interacting with them day-to-day. So, it’s important to really take both a technical and a human pulse of these models to watch for any degradation, any changes, any bizarre answers. That would be an indication that something’s going wrong.

Now, if something’s going wrong, the usual answer is your model needs to be retrained. Once a model has been defined, retraining it is usually an easier operation. It’s not starting from scratch, it’s gathering new data, making sure that data is really representative of the new conditions, and then feeding it into the model architecture you’ve already put together. The new model that’s produced will ideally both capture all of the advantages of the model you had before, as well as any of the new patterns, new trends that you’re seeing in data that’s coming in.
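The "technical factors" Chris mentions can start as simply as comparing the live data’s distribution against the training window’s. This is a crude sketch with hypothetical numbers; real practice uses formal tests such as Kolmogorov-Smirnov or the Population Stability Index:

```python
# Hypothetical feature values from the training window vs. live traffic.
train_window = [100, 102, 98, 101, 99, 100, 103, 97]
live_window  = [110, 113, 108, 112, 111, 109, 114, 110]

def mean_std(xs):
    """Mean and (population) standard deviation of a sample."""
    m = sum(xs) / len(xs)
    return m, (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

m_train, s_train = mean_std(train_window)
m_live, _ = mean_std(live_window)

# Crude drift check: has the live mean moved more than three training
# standard deviations away? If so, it may be time to retrain.
drift = abs(m_live - m_train) > 3 * s_train
print(drift)  # True
```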

Ron: Speaking of data, data is obviously very important. It contains sometimes PII or HIPAA information and anything under the sun, really, anything can be data. Are there corporate governance practices that should be followed both to make the process easier, as well as to maybe remove bias or any other types of identifiers in that data and the validation process?

Chris: Absolutely. When you’re training a model, the data quality and the provenance of that data are important. First and foremost, if you don’t trust your data, then you can’t trust any system that you create from that data. And at some point, someone’s going to ask you where your data came from, what data was included in the model, and having robust systems in place to trace your data, catalog your data, understand your data are important.

Hand in hand with that data catalog and data governance process usually comes personally identifiable information and how we handle it. Most machine learning models don’t need to make use of that personally identifiable information. Usually, the PII, whether it’s somebody’s name, their phone number, or their address, doesn’t contain useful information for making a decision. Or if it does, that information is often proscribed: we’re not allowed to use it in making a decision because of regulations.
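A minimal sketch of scrubbing PII out of records before training; the field names here are purely illustrative:

```python
# Fields that identify a person -- dropped before the data reaches a model.
PII_FIELDS = {"name", "phone", "email", "address"}

# Hypothetical submission record; the field names are illustrative.
records = [
    {"name": "A. Singh", "phone": "555-0100", "address": "12 King St",
     "industry": "retail", "revenue": 2.4, "claims_5yr": 1},
]

def scrub(record):
    """Drop fields that identify a person; keep the predictive signal."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}

training_rows = [scrub(r) for r in records]
print(training_rows[0])  # {'industry': 'retail', 'revenue': 2.4, 'claims_5yr': 1}
```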

Address and geography are often a special case. I would say that it is important to strike a balance between scrubbing and protecting your data, to make sure that you are really safeguarding your information or your customers’ or clients’ information. But if you want to build a really effective machine learning model, you’re balanced with the need to give your team all of the tools it needs to build the best model to make those decisions. So, it’s always a little bit of a delicate balance.

One of the things that you can do is put in place a really robust model validation practice and that model validation practice is responsible both for making sure that your models are free of bias, that they have reproducible results, that the results are human interpretable, but also that the data being fed to train those models is appropriate data, that it’s data you want to use and that you’re comfortable using, and it is properly scrubbed and has provenance associated with it.

Now, that last point I made about interpretability, that can be particularly challenging for machine learning models. There exists a tradeoff between how powerful a model is, the representational power, and how explainable it is.

Neural networks, deep learning, you’ve probably heard a lot about, these are very powerful models. They’re able to capture and make use of really complicated data that has a lot of interdependencies, but it’s not very explainable. It’s a little bit like the human brain. If you wanna understand what’s going on, it’s not very helpful to open it up and try and look at all the connections and neurons in there. You need other approaches, other techniques, and that’s where a model validation team would come in and be very helpful.

Ron: I love that example. I think people will really be able to connect to that. So, when we talk about data and you think about volume of data, I think that’s important and also source of data. You talked a little bit about protecting the data, also about, do you know where the data comes from? Do you trust it? And maybe touch on here a little bit about like generated data versus, real life data or simulated data.

Chris: So one of the most common questions that a machine learning professional will get asked is, "How much data do I need?" to which we will invariably provide the incredibly unhelpful answer, "Enough high-quality data to exhibit all the important patterns that solve your problem." And that’s an incredibly frustrating answer. It’s sometimes hard to come up with a rule like 1,000 examples of each category or some fixed number of data points. It’s really hard to nail down exactly how much you need. Usually the answer is “you need enough”. Now, the good news is that you can always collect some data, build a system, and then analyze the results: how well does it perform, where does it perform poorly, where do I need to collect additional data? And iterate.

Now, that data collection process is sometimes a difficult and time-consuming activity. Either the data just doesn’t exist, there’s not enough of it, or it’s challenging or expensive to collect, or collecting it is fraught with issues like personally identifiable information. So, it’s sometimes tempting to say, "We’ll simulate data. We’ll create some data for the system to learn from." That is a challenging activity. It’s valuable if you can get it right. It’s also an emerging area of research in the machine learning community. But one of the challenges is that these machine learning models are very, very good at picking up on trends and mathematical patterns. Simulated data usually comes from a mathematical model that will generate that data based on observed trends. So it’s very easy for a machine to sort of self-reinforce and focus in on the synthetic aspects of the data rather than creating data that really exhibits the properties that solve the problem you’re trying to solve. So, you can create more data with simulation. It’s not an easy thing to do.
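The self-reinforcement risk Chris describes comes from the fact that simulated data is drawn from a model of the real data, as in this hypothetical sketch; anything the generating model never captured is missing from the synthetic sample too:

```python
import random

random.seed(7)

# Hypothetical real observations of some feature.
real = [4.1, 5.0, 4.7, 5.3, 4.9, 5.1]

mean = sum(real) / len(real)
std = (sum((x - mean) ** 2 for x in real) / len(real)) ** 0.5

# Simulated data comes from a *model* of the real data (here, a Gaussian).
# Rare events and interactions the model never captured are absent, so a
# learner trained on this sample can latch onto the model's artifacts
# instead of the real-world patterns you actually care about.
synthetic = [random.gauss(mean, std) for _ in range(1000)]
```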

Ron: So many people have this misconception that AI is magic. And maybe, in some cases like deep learning and neural networks, it might be magic, who knows, we can’t explain it. But at the end of the day, what are some of the common misunderstandings that you hear a lot about when it comes to ML and AI and even insurance when it comes to those?

Chris: For sure. So, I have considered having the words "AI isn’t magic" printed on every shirt that I own because I get asked that question a lot. So, I’ve actually found that most insurance professionals and executives that I talk to, they’re quite well-informed about machine learning and AI, but there are a few themes that come up, that I’ve heard a few times. One of them is that the system will automatically learn from, and then correct its mistakes.

Now, machine learning models are certainly capable of incorporating new information, using human-assisted feedback loops. We call these human in the loop systems to improve the results over time to correct its mistakes, but this doesn’t come automatically or for free. Just like a human learning something, we have to take time, we have to take care, invest in these systems so that they will improve over time. This usually takes the form of something like a human review mechanism, having a person validating or verifying some of the results of the machine so that they can correct the system and help it get better over time.
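A human-in-the-loop review mechanism often reduces to confidence-based routing, as in this hypothetical sketch (the field names and threshold are illustrative):

```python
# Hypothetical (extracted_field, model_confidence) pairs.
results = [
    ("policy_number", 0.98),
    ("effective_date", 0.64),
    ("named_insured", 0.91),
    ("limit", 0.42),
]

CONFIDENCE_THRESHOLD = 0.80  # illustrative cut-off

# Human-in-the-loop routing: confident answers pass straight through,
# uncertain ones queue for a person to verify. The human's corrections
# become new training data, which is how the system improves over time.
auto_accepted = [(f, c) for f, c in results if c >= CONFIDENCE_THRESHOLD]
needs_review  = [(f, c) for f, c in results if c < CONFIDENCE_THRESHOLD]

print(len(auto_accepted), len(needs_review))  # 2 2
```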

A closely related topic is I often hear, "Oh, the results will get better if we just give it more data." That’s sort of a half-truth. Sometimes that is the case, but I’d rather say the results will get better with better data.

As with all things in life, quality trumps quantity here. It’s not just about how much data but having the right data. And then I’ve also seen people reluctant to use a model if the accuracy isn’t 100%. A colleague of mine is fond of saying that an AI system should be as good as an average human on their best day. Just as with a human, perfection is something that we should strive for, but we can’t or shouldn’t wait for it. We don’t expect every human to be perfect and the same goes for machine learning.

I want a very good, very effective machine learning model, and I want it to continue to improve and learn over time just like a human. If I don’t deploy a model, if I don’t make use of it, when it’s good enough to help it then learn and get better, I’ll be waiting forever. Perfect may never come.


Ron: That’s right. Perfect is the enemy of good. So, as we wrap up, is there one piece of wisdom that you’d love to share with our listeners?

Chris: Yes. If there’s one thing that I’ve learned in my years doing this, it’s that machine learning is a journey, not a destination. This field is evolving rapidly, and what is a best-in-class system today is a behind-the-curve system two years from now. It’s not about having that perfect model; it’s about setting up the conditions so that you constantly have great, improving models to solve your problems.

It’s not about arriving at the one perfect solution. It’s about having the capability in your company, in your team, in your group to always have great models, great AI systems, great machine learning solutions.


Ron: So well put. Chris, thank you so much for your time. If people want to find out more about you, can they find you on Twitter? Where’s the best place?

Chris: So you can find me on LinkedIn and I’d be happy to engage, solve your problem, help you with understanding a little bit more about the field of machine learning. I will warn you, once you get me talking about the subject, the hard part is getting me to shut up. (laughs)

Ron: My conversations with you are always pleasant! Thank you so much, Chris and as always, if you want to find out more about Chisel or some of our thought leadership in this space, you can check us out at www.chisel.ai, or on LinkedIn, and, of course, Spotify and anywhere where you listen to podcasts. We’ll see you soon.

Ron: That’s a wrap for this episode of “AI Wisdom” hosted by Chisel AI and me, Ron Glozman. Thanks for listening.

If you like our podcast and want to hear more, check us out at www.chisel.ai or tune in and subscribe wherever you get your podcasts: SoundCloud, Spotify, iTunes, Google Podcasts, or Stitcher.

Join us next time for more expert insights and straight talk on how AI and insurtech innovations are transforming the insurance value chain. See you on the next episode!
