Tag Archives: data science

The Data Science Game is back

Registration for the 2017 Data Science Game is officially open. The Data Science Game is a two-phase competition showcasing teams of data science students from universities around the world. An online qualifier will take place on April 15, with the final stage to follow in September.

Students from the Moscow Institute of Physics and Technology (MIPT) won last year’s competition. Will your university win this year? To register your team, visit www.datasciencegame.com. The deadline to register is April 9.

Milliman is a sponsor of the 2017 Data Science Game.

The 2016 Data Science Game winner is…

Russian Data Mafia from the Moscow Institute of Physics and Technology (MIPT) won the 2016 Data Science Game. The final phase of the game featured teams of students from 20 universities competing in a 30-hour hackathon challenge.


Microsoft supported the event by giving teams free access to its Azure computing clusters, while AXA set the final challenge. Each team worked on a data set containing requests for auto insurance quotes from different brokers and comparison websites. Students were asked to predict whether the person who requested a given quote bought the associated insurance policy. Teams were ranked according to their prediction scores.
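To make the task concrete, here is a minimal sketch of a quote-conversion classifier in Python. It is an illustration only, not any team’s entry: the features, data, and model choice are all placeholder assumptions.

```python
# Minimal sketch of a quote-conversion model. Illustration only: the
# features and data below are synthetic placeholders, not the AXA
# challenge data or any team's approach.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical quote features: quoted premium, driver age, channel id.
X = np.column_stack([
    rng.normal(600, 150, n),   # quoted annual premium
    rng.integers(18, 80, n),   # driver age
    rng.integers(0, 5, n),     # broker / comparison-site channel
])
# Synthetic target: cheaper quotes convert more often.
buy_prob = 1.0 / (1.0 + np.exp((X[:, 0] - 600) / 100))
y = (rng.random(n) < buy_prob).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Teams were ranked on prediction scores; AUC is one typical metric.
scores = model.predict_proba(X_te)[:, 1]
print(f"held-out AUC: {roc_auc_score(y_te, scores):.3f}")
```

A score-based ranking metric such as AUC rewards models that rank likely buyers above unlikely ones, which is why the sketch outputs probabilities rather than hard yes/no labels.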

The final phase took place September 10 and 11 at Capgemini’s Les Fontaines campus in France. This year, 143 teams participated from more than 50 universities and schools in 28 different countries.

To learn more about the Data Science Game, click here.

The Data Science Game’s qualification round is complete

In July, teams of data science students from more than 50 universities around the globe competed in the qualification phase of the 2016 Data Science Game. More than 140 teams of four students were asked to develop an algorithm that could recognize the orientation of a roof in a satellite photograph, building on more than 10,000 photographs of roofs categorized through crowdsourcing.
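For a sense of the task’s shape, the sketch below trains a multiclass classifier on synthetic stand-in images. Everything here is an illustrative assumption: the real competition data was far richer, and competitive entries would more plausibly use convolutional networks than pixel-level logistic regression.

```python
# Illustrative sketch only: classify roof orientation from small images.
# Synthetic images stand in for the crowdsourced satellite photographs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

def fake_roof(label, size=16):
    """Toy image whose ridge direction encodes the orientation class."""
    img = rng.normal(0.0, 0.3, (size, size))
    if label == 0:
        img[:, size // 2] += 2.0   # vertical ridge
    elif label == 1:
        img[size // 2, :] += 2.0   # horizontal ridge
    elif label == 2:
        img += 1.0                 # uniformly bright: flat roof
    # label 3: pure noise ("other")
    return img.ravel()

y = rng.integers(0, 4, 2_000)                  # four orientation classes
X = np.stack([fake_roof(label) for label in y])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
clf = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.3f}")
```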

Twenty-two teams have qualified for the final phase. The three top-ranked teams were Jonquille (University Pierre and Marie Curie), PolytechNique (Ecole Polytechnique), and The Nerd Herd (University of Amsterdam). The final is being held in Paris on September 10 and 11, where the teams will compete in a big data analysis challenge.

Data Science Game finalists

For more information on the Data Science Game, click here.

Milliman is a sponsor of the 2016 Data Science Game.

Milliman sponsoring data science competition

Milliman is a sponsor of the 2016 Data Science Game, a two-phase competition showcasing teams of data science students from universities around the world. After an online qualifying challenge, the top 20 teams will be invited to a two-day competition in Paris.

Last year, teams competed to solve a machine learning challenge created by Google. Students from Moscow State University won the competition. Who will win this year?

Teams can register at www.datasciencegame.com. The deadline to register is May 31. The online challenge will take place in June while the two-day competition is scheduled for September.

Enhanced processes of mining unstructured data

Innovative analytical tools and high-performance computing are giving insurers the means to analyze huge volumes of unstructured data. In this Risk.net article (subscription required), Milliman’s Neil Cantle discusses how these advances offer carriers a more sophisticated approach to analyzing inherent risks and developing best business practices.

Here is an excerpt:

Many of the new generation of tools for unstructured data were initially developed to enable search engines such as Yahoo and Google to tackle the vast resources of the web. Key among these is the Hadoop framework for the management and processing of large-scale disparate datasets on clusters of commodity hardware. Hadoop has a number of modules for such things as distributing data across groups of processors, filtering, sorting and summarizing information, and automatically handling the inevitable hardware failures that arise in large computing grids. All of the technologies mentioned are open source, which means they are free and readily available, and they are also supported by many proprietary commercial extensions and equivalents.

The breakthrough with new data sources and tools is the ability to query things for which the data has not been organized in advance. This can reveal new patterns, trends and correlations that can be helpful in managing risk and spotting opportunities, says Neil Cantle, principal and consulting actuary at Milliman, based in London.

… “[The new data capabilities] enable insurers to look more broadly and deeply into the world in which the policyholder lives without necessarily being specific about the person, and allow them to start making inferences about an individual and their behavior,” says Cantle.
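The filtering, sorting, and summarizing the excerpt attributes to Hadoop follow the map/shuffle/reduce pattern. Here is a minimal sketch of that pattern in plain Python; it shows the idea only, not Hadoop’s actual APIs or its distribution across a cluster.

```python
# The map/shuffle/reduce pattern that Hadoop parallelizes, in plain
# Python. Real Hadoop spreads each stage over a cluster of commodity
# machines and reruns only the pieces that fail.
from collections import defaultdict

records = [
    "motor claim hail",
    "home claim flood",
    "motor claim flood",
]

# Map: emit (key, value) pairs from each raw record.
mapped = [(word, 1) for line in records for word in line.split()]

# Shuffle: group values by key (Hadoop does this between nodes).
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce: summarize each key group independently, hence in parallel.
counts = {key: sum(values) for key, values in groups.items()}
print(counts)  # {'motor': 2, 'claim': 3, 'hail': 1, 'home': 1, 'flood': 2}
```

Because each reduce step touches only one key’s group, the work can be spread across many machines, which is what makes the framework suited to the large, disparate datasets described above.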

The article also focuses on the emergence of data scientists who are entrusted with mining new data sources. Milliman’s Peggy Brinkmann expounds on data science and the techniques data scientists use to extract value from large amounts of information in her paper “Why big data is a big deal.”

What do you want most from big data?

Big data is changing decision-making processes across many industries. In this new article, author Neil Cantle discusses how analyses of large data sets can be used to forecast business results or to learn about the interrelated factors that drive business.

Here is an excerpt:

So, what is big data all about? Well, it depends. To some, it is simply about applying new processing techniques that enable you to run queries over very large datasets. This can be a useful thing to do, but the key point is: “What questions does your analysis seek to answer?”

There are two main responses to this. The first is “prediction” – trying to find “reliable” similarities between the behaviors of some subset of factors in your “big” dataset and the outcome you want to “predict.” This can be useful if the relationships uncovered happen to make sense and persist over time. But it is always possible to find some variables somewhere which, for a period of time at least, behave similarly to the one you are interested in without any real relationship existing between them whatsoever. Eventually that apparent relationship will disappear and your “predictions” are suddenly not very good, but you don’t know why.

The second type of analysis is arguably a more satisfying one – seeking “explanation” not just “prediction.” Studying large sets of information to learn about the underlying mechanism driving the outputs you see brings insight and meaning, helping you to find out more about “why” things are related, not just that they move in apparently similar ways.
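The trap the excerpt warns about, apparently reliable relationships that later vanish, is easy to reproduce. In this small sketch, pairs of independent random walks frequently show strong in-sample correlation that fails to persist out of sample; the thresholds and sample sizes are arbitrary illustrative choices.

```python
# Independent random walks often correlate strongly over a window,
# then the apparent relationship disappears out of sample -- the
# "prediction" trap described above. Thresholds are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
strong, persists = 0, 0
for _ in range(1_000):
    a = rng.normal(size=400).cumsum()   # one random walk
    b = rng.normal(size=400).cumsum()   # an unrelated one
    r_in = np.corrcoef(a[:200], b[:200])[0, 1]
    r_out = np.corrcoef(a[200:], b[200:])[0, 1]
    if abs(r_in) > 0.7:                 # looks like a "reliable" signal
        strong += 1
        if abs(r_out) > 0.7 and np.sign(r_out) == np.sign(r_in):
            persists += 1

print(f"{strong} of 1000 unrelated pairs correlated strongly in-sample;")
print(f"only {persists} kept that relationship out of sample.")
```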

For more perspective on some approaches making it possible to manage and extract value from big data, read Peggy Brinkmann’s paper “Why big data is a big deal.”