For the second year running, the world’s most promising data scientists battled it out in a hackathon over the course of a weekend.
It’s 2pm on Friday, September 9, and the 20 teams that have made it to the final of the Data Science Game are hurrying into the headquarters of Microsoft France in Issy-Les-Moulineaux, on the outskirts of Paris. During this opening afternoon, suits, ties, shirts and dresses are the order of the day, but in just a few hours all that will change. Indeed, if the participants look somewhat tense and nervous, it’s because for more than 30 hours of non-stop work these 80 data science students, who have come from some of the world’s most prestigious universities, are going to face off against each other in a hackathon.
This continuously collaborative exercise is perfectly designed for data virtuosos. “To reach the final of the Data Science Game is, right now, our main goal,” explains the team from the Institute of Management in Calcutta, India. “The main difficulty concerns the amount of time given to process the data provided. You have to be quick and efficient.”
AXA, one of the event’s three main sponsors alongside Capgemini and Microsoft, was responsible for thinking up this challenge through its Data Innovation Lab. The goal: to use structured data to develop a model capable of predicting whether a potential client will sign up for an insurance policy that is offered to them.
Point of contact
To support contestants during this adventure, AXA brought in some of its in-house experts. Marcin Detyniecki, Head of Research at AXA’s Data Innovation Lab, is in his element here: his energy is contagious. While commenting on the early phases of the competition to three people at the same time – and in different languages – he cannot fail to mention that the French team from the Pierre and Marie Curie University (UPMC), who are in pole position following the qualification phase, are from the same lab where he worked at the start of his career. “They’re my favourites, that’s for sure,” he admits. “But you have to watch out for the Russians. They were last year’s winners and I’ll tell it like it is, they’re very, very good.”
But Marcin isn’t just there to make predictions. Antoine Ly, President of the Data Science Game Association, explains more: “The Data Science Game allows companies to gain access to the student world more easily, without having to do the rounds of all the universities. It centralises a point of contact. As for the students, they come into direct, physical contact with the profession.”
Between the different round tables and the welcome cocktail organised on this inaugural afternoon of the Microsoft-based event, the participants don’t let the opportunity to speak about research, financing, partnerships and recruitment pass them by.
32 hours on the clock
Saturday, 6.45am: Gone are the canapés and the formal attire. It’s against the stunning backdrop of Les Fontaines castle, just a one-hour car ride from Paris, that the hackathon is about to kick off. Eric Lebigot, Head Data Scientist for the AXA Group, gives the students a reminder: “You will have to show innovation and creativity.”
Each team has taken up its quarters. Glued to seats from which they will barely move during the following 32 hours, the students pay little attention to the photographers who move among them to immortalise the moment. “Things are always a little bit delicate on the first day,” Antoine explains. “Tomorrow, they will be a little more willing to speak about their progress and their first impressions.”
That’s exactly what the AXA mentors are there for. Carlos Dalla Stella is a Data Scientist at AXA’s Data Innovation Lab, and any of the groups can call on his aid. “I’m used to working with real data,” he explains. “My role mainly consists in sharing this experience with them. I can also give them a few tips now and then…”
"Our mentors are capable of answering certain technical questions that are linked more closely to our line of business," states Eric Lebigot. "They can help unblock certain situations. They also offer opportunities to explore. A tip given to one team can also help another. Mentors act as catalysts."
Being a catalyst is one of the main missions of the AXA Data Innovation Lab, and not only for its data scientists. "Generally speaking, the challenge for our data scientists is to identify useful information within a large amount of data," Eric Lebigot continues. "The data that we have processed is like an oil reserve that we can either decide to use or ignore."
But beyond the solutions that data scientists can recommend, it's also the questions they ask that are interesting. Which data can we use? Is it useful and, if so, why? "Data enables us to optimise the product and the service, thus benefiting both AXA and its clients," Eric Lebigot adds. "But to offer cheaper insurance we need to have a clear idea of the consequences, the benefits and the risks."
The hours pass, ticking away as the leader board is updated in real time to show the provisional ranking. Once a team's submission has been scored automatically, the closer its margin of error is to zero, the higher up the table the team moves. "You have to avoid looking at the leader board to make sure you don’t get too stressed out," says Rémi Cadène of UPMC. "We have seen teams find highly expressive variables displaying zero margin of error, but we don't know which model they have used or if it's an error or a bug. We shouldn't lose time looking at a leader board which might discourage us, or, if we're leading, make us feel too sure of ourselves."
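To make the ranking idea concrete, here is a minimal sketch of a leader board ordered by prediction error. The article does not say which metric the organisers used, so the plain error rate, the team names and the data below are all assumptions chosen for illustration.

```python
def error_rate(predictions, actuals):
    """Share of predictions that differ from the true outcome (0.0 is perfect)."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    wrong = sum(p != a for p, a in zip(predictions, actuals))
    return wrong / len(actuals)


def rank_teams(submissions, actuals):
    """Return (team, error) pairs sorted from lowest error to highest."""
    scores = {team: error_rate(preds, actuals) for team, preds in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical outcomes: 1 = the client signed up, 0 = the client did not.
actuals = [1, 0, 1, 1, 0]

# Hypothetical submissions from three made-up teams.
submissions = {
    "team_a": [1, 0, 1, 1, 0],
    "team_b": [1, 1, 1, 0, 0],
    "team_c": [1, 0, 1, 0, 0],
}

leader_board = rank_teams(submissions, actuals)
for position, (team, error) in enumerate(leader_board, start=1):
    print(position, team, error)
```

A real competition would more likely score predicted probabilities with a finer-grained metric, but the principle is the same: the lower the error, the higher the place in the table.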
Meet Rémi Cadène, Pierre and Marie Curie University
There are only a few hours of the hackathon left and Rémi, who is participating in his second Data Science Game, agrees to stop work for a few minutes to talk to us. During our conversation, he constantly looks in the direction of his computer, eager to plunge back into the lively atmosphere of the competition.
He is part of the team representing the Pierre and Marie Curie University (UPMC), where he discovered data science in the third year of his degree. So how did he learn about the Data Science Game? “It was one of my teachers who told me about it,” he recalls. “Last year we came eighth.” Today, Rémi has just completed a Master’s and is about to start a doctorate focusing on deep learning applied to imagery. This is a theme that fits perfectly with the requirements of the qualifying phase of the Data Science Game. “It allowed us to develop specialised models and consolidate a good first place,” he tells us. “During the final, the data will be a lot more structured. This other aspect of things interests me, precisely because it is not my speciality – or at least not as much as it is for the people who work for AXA!” he adds with a smile.
For the competition, his game plan is well developed, and he and his team went through many phases of reflection to come up with it. “Everything stems from an intuition,” he reveals. “The more we work with data, and the more we carry out statistical studies of the models we find, the more refined this intuition becomes. Now we’ll be able to begin working with even more statistical models and more machine learning, and we’ll see what all of that leads to.”
Just four hours away from the final whistle, it’s the team from Cambridge that is in the lead, followed by the Italian team from the University of Padua. The French team from UPMC is in third place. Costas, a member of the Cambridge team, remains wary, however. “Yes, we’re top at the moment,” he says, “but we need to keep a cool head, especially given that we have perhaps adjusted our algorithms a little too much for them to be accepted by the competition jury.”
A last-minute surprise, orchestrated by the organisers of the Data Science Game, will go some way towards settling their doubts. With only half an hour left for the contestants, a whole new batch of unforeseen data is handed to them.
“It’s now that we will find out whether the updated algorithms are up to the challenge and whether they work with new data or not,” Eric Lebigot reveals. “They will have to fine-tune them, and maybe make a few adjustments. It’s the last big moment of stress for the teams.”
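Why fresh data is such a revealing test can be shown with a small sketch. Everything in it is hypothetical: the customer ages, the outcomes and both toy models are invented for illustration and have nothing to do with the competition data. One model simply memorises the examples it has seen, the other applies a simple general rule.

```python
def accuracy(model, rows):
    """Fraction of (feature, outcome) rows that the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)


# Hypothetical data: (customer age, 1 if the customer signed up else 0).
seen = [(25, 0), (31, 0), (42, 1), (55, 1), (38, 0), (47, 1)]
fresh = [(29, 0), (44, 1), (52, 1), (35, 0)]

# An over-adjusted model: it memorises every example it was given
# and answers 0 for any age it has never encountered.
memory = dict(seen)


def memoriser(age):
    return memory.get(age, 0)


# A simpler model: one general rule that does not depend on exact ages.
def threshold_rule(age):
    return 1 if age >= 40 else 0


for name, model in [("memoriser", memoriser), ("threshold_rule", threshold_rule)]:
    print(name, accuracy(model, seen), accuracy(model, fresh))
```

On the data already seen, both models look flawless. Only the new batch separates the one that has learned a pattern from the one that has merely adjusted itself to the examples, which is exactly the concern Costas raises about the Cambridge team's algorithms.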
A last glance at the leader board brings with it a second twist in the plot. The Russian team from the Moscow Institute of Physics and Technology, who waited until the very last moment to share their provisional results, has gone straight in… into first place! Cambridge drops to second.
The jury is out
“Now we are going to evaluate the contestants on their methods,” Antoine Ly explains. “Their prediction score will be taken into account, but also the originality of their approach. Here, we reward teams that tried to do something that stands out from the rest, teams with perhaps not the best prediction but ones that have a practical application.” Each team has just three minutes to defend its solution before the jury, to try and move up the leader board.
This will notably be the case for the Russian team from Skoltech University in Moscow, which ultimately succeeded in stripping the Data Ninjas team from Singapore of third place, with the latter dropping just outside the top five. The French team from UPMC took fourth place, with the Italians from Padua pushed into fifth. There was no change, however, in the top two.
The Russians from the Moscow Institute of Physics and Technology won the competition for the second year running. “We were very confident about our chances of winning,” admits Stanislas during the award ceremony. “We are often perfectly placed during the competitions. This result at the Data Science Game proves it once more.”
As the sponsors rush to congratulate the winners, all of the finalists know that, wherever they finished in the ranking, they have highly valuable knowledge and expertise. Indeed, tomorrow’s best data scientists are among them.
3 questions for Eric Lebigot
Chief Data Scientist at AXA Group and Head of Research and Data Analysis at the Group’s Data Innovation Lab
Why did AXA decide to sponsor the Data Science Game?
From a recruitment and employer branding point of view, this competition has an undeniable advantage. We are faced with young Master’s and doctoral students, some of whom have just landed their first jobs. The Data Science Game allows us to create a dialogue with these young people, who not only interest us but who are also attracted to what we do here at AXA. This event really brings together the data science community of tomorrow, and sponsoring it highlights AXA’s desire to recruit from all over the world.
What does the job of an AXA data scientist consist of?
The data scientist is a person who needs to be capable of extracting specific, concrete, comprehensible and, above all, usable information from a database. They must also be able to anticipate the types of usage expected with regard to the data they handle; that is to say, they need to have an understanding of the business expectations of the company and to be able to help the company benefit from their scientific expertise. From this basis, they will be able to define models that enable them to generalise data. For example, as part of a marketing campaign they can help to determine whether a client will be tempted by a second AXA product after their initial subscription. Our data scientists are able to create models enabling them to answer this type of question. An AXA data scientist must also be responsive to the idea of actually producing the models that they conceive. They must know how to programme a turnkey solution if necessary, or an intuitive website, in order to ensure that third parties will be able to utilise and take ownership of the algorithms.
Is it necessary for young data scientists to have prior knowledge of the insurance world in order to work for AXA?
Data scientists can come to AXA without any prior business experience. They can learn “on the job”. That doesn’t mean, however, that some initial knowledge of actuarial insurance, for example, is not appreciated, but there is really a place here for young data science experts who are interested in its concrete applications. In any case, wherever they end up, young data scientists will have to learn and develop their own experience. This is especially true at AXA, which offers a highly attractive environment in this domain, whether in our Data Innovation Lab or in one of our branches in France or abroad. And it is above all the case when they are as skilled as those competing in the Data Science Game.