Oleg Panichev, a researcher and member of Ciklum’s R&D Engineering Team, took 4th place in the DrivenData competition on predicting the number of penguins in the Antarctic. The goal of the competition was to create better models for estimating populations at hard-to-reach sites in the Antarctic, and thereby greatly improve the ability to use penguins to monitor the health of the Southern Ocean.


Oleg Panichev (in the middle) and Ciklum’s R&D Engineering team. Photo from the Garage48 IoT & Machine Learning Hackathon, where Ciklum’s team won first place by creating a cocktail mixer using IoT and machine learning tools.

The 93 competing teams had to submit their own models to predict penguin population counts and distribution in Antarctica. The DrivenData competition was initiated by Oceanites, Inc., Black Bawks Data Science Ltd., and Dr. Heather Lynch’s lab at Stony Brook University. Penguins help scientists gauge the general health of the Antarctic because they are important predators of krill and fish: changes, whether natural or anthropogenic, that influence prey abundance and environmental conditions will ultimately show up as changes in penguin distribution or population size. The five most accurate models were awarded cash prizes.

The data for this competition come from the hard work of scientists around the globe who are dedicated to collecting data on penguins. Like the other participants, Oleg Panichev had to take the dataset of all observations through the 2013 penguin season and build his own model to predict nest counts for 2014, 2015, 2016, and the upcoming 2017 season. The 2014-2016 penguin population data were used to check the accuracy of the predictions.

Ever wondered how penguins are counted? Check out this 360-degree video:


According to the competition initiators, data on penguin populations are limited because most monitored colonies are near permanent research stations, while other sites are surveyed only sporadically. Because the data are so patchy and the time series relatively short, it has been difficult to build statistical models that explain past dynamics or provide reliable predictions. This was the first competition devoted to ecological time series. The best prediction models are expected to be used further to predict penguin population dynamics in Antarctica and to monitor the health of the Southern Ocean.
