Data reliability, a key challenge for artificial intelligence

The volume of data stored on our planet, collected by a wide range of equipment and sensors, backed up in the "cloud" or shared on the Internet, has reached a mind-boggling 33 zettabytes and is expected to increase five-fold by 2025. This data is valuable because it can be used for modeling, prediction and decision-making.

A data point is a record of an observation of the world around us. As a representation of that observation, it can be encoded and stored, notably in digital form. Once data is accessible, it can be processed and analyzed. Statistics models data as the realization of a random variable, paving the way for its use by so-called machine learning algorithms, which are vital components of intelligent systems.
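As a minimal sketch of this statistical view (an illustrative example, not drawn from the presentation itself), one can generate observations as noisy realizations of a random variable and let a simple learning algorithm recover the underlying pattern. The true relationship and noise level below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each observation is modeled as the realization of a random
# variable: a deterministic signal (here y = 2x + 1) plus noise.
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=200)

# A least-squares fit acts as a basic learning algorithm,
# estimating the parameters that generated the observations.
slope, intercept = np.polyfit(x, y, deg=1)

print(f"estimated slope: {slope:.2f}, intercept: {intercept:.2f}")
```

Despite the noise in every individual data point, the estimates converge toward the true parameters as the number of observations grows, which is precisely what makes statistical modeling of data useful for prediction.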

This paper summarizes Florence d'Alché-Buc's presentation at the ENGIE "Réveil Digital" event on 5 June 2019.