Igor Krashenyi and Oleg Panichev, Senior Research Engineers at Ciklum R&D, teamed up to take part in a competition hosted on the data science platform Kaggle. Read below how the team approached the task and built a highly accurate AI algorithm.
Drifting icebergs are dangerous to drilling platforms and pipelines off the east coast of Canada. Companies mitigate this risk by using satellite images to locate icebergs, but telling them apart from ships can be difficult.
Statoil is an international energy company partnered with C-CORE that has been using satellite data for 30+ years to build a computer-vision-based iceberg surveillance system. To keep operations safe and efficient, Statoil wants to spot icebergs early so that partner companies can tow them out of the way or move their equipment.
C-CORE and Statoil wanted to find better ways to locate icebergs before they drift near oil and gas infrastructure, and needed to approach the problem from a different perspective. The two companies launched a Kaggle competition in which participants had to develop a program that detects icebergs in satellite images.
The challenge was to build an algorithm that automatically identifies whether a remotely sensed target is an iceberg or not. The algorithm had to be extremely accurate because lives and billions of dollars in energy infrastructure are at stake.
Duration: 2 months
It is not always easy for the human eye to differentiate icebergs from ships. All of the images in the competition dataset were taken from a Sentinel-1 satellite 600 kilometers above Earth. The data was very specific and the competition organizers provided an incidence angle feature for each image.
To solve the task, the Ciklum R&D team combined multiple deep learning models into an ensemble. Ensemble techniques use two or more learning algorithms to achieve better predictive performance than any of the constituent algorithms could achieve alone.
The final ensemble comprised 40 models. Each of them was determined beforehand according to the voting principle on the basis of 5 submodels.
To make the final prediction, all the datasets were run through each of the 40 models.
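The post does not say how the votes were combined, so the following is only a minimal sketch of one common option, soft voting, in which the iceberg probabilities predicted by several models are averaged per image. The function name `soft_vote`, the sample probabilities, and the 0.5 decision threshold are illustrative assumptions, not the team's actual code.

```python
from statistics import mean


def soft_vote(model_probs):
    """Average per-image iceberg probabilities across models.

    model_probs[m][i] is the probability that image i shows an iceberg,
    as predicted by model m (hypothetical layout, for illustration only).
    """
    n_images = len(model_probs[0])
    if any(len(probs) != n_images for probs in model_probs):
        raise ValueError("every model must score the same set of images")
    return [mean(probs[i] for probs in model_probs) for i in range(n_images)]


# Made-up outputs of three submodels for four images.
submodel_probs = [
    [0.90, 0.20, 0.60, 0.10],
    [0.80, 0.30, 0.40, 0.05],
    [0.70, 0.10, 0.70, 0.15],
]

ensemble_probs = soft_vote(submodel_probs)

# Assumed threshold: call an image an iceberg when the averaged
# probability is at least 0.5.
is_iceberg = [p >= 0.5 for p in ensemble_probs]
```

The same averaging step could be applied twice under this assumption: once over the 5 submodels behind each model, and once more over the 40 resulting models.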
The final result was evaluated using the log loss metric. Log loss measures the quality of a classifier whose prediction is a probability between 0 and 1: it penalizes predictions more heavily the more confident and the more wrong they are. The goal of our machine learning models is to minimize this value. A perfect model would have a log loss of 0.
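As an illustration of the metric, here is a minimal sketch of binary log loss in plain Python. The labels and probabilities are made up, and the clipping constant `eps` is an assumption added only to avoid taking the logarithm of 0; it is not the competition's scoring code.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Binary log loss: the mean negative log-likelihood of the true labels.

    y_true holds 0/1 labels (1 = iceberg), y_prob the predicted
    probabilities of class 1.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip so that log(0) never occurs for predictions of exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


labels = [1, 0, 1, 0]

# Confident and correct predictions give a small loss ...
confident_loss = log_loss(labels, [0.95, 0.05, 0.90, 0.10])

# ... while always answering 0.5 gives ln(2), roughly 0.693.
uninformed_loss = log_loss(labels, [0.5, 0.5, 0.5, 0.5])
```

Under this definition, a score around 0.13 means that, on average, the model assigned a high probability to the correct class.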
The system developed by the Ciklum R&D team identified icebergs with a log loss of 0.1310.
To find out more about how your business can embrace deep learning and carry out its own AI projects, check out the deep learning projects we have already completed that helped improve our clients’ businesses.