The significant increase in scientific data over the past decade has changed the workflow of researchers and programmers: NASA's archive grew from a few hundred terabytes in 2000 to 32 petabytes of climate observation data, and ECMWF's climate archive holds 220 petabytes. Much of the data behind this development consists of multidimensional arrays (datacubes), which are foundational in the Earth, life, and space sciences as well as in industrial sectors such as agriculture and mineral resource exploitation.

The datacube paradigm has proven instrumental in making spatio-temporal Big Data analysis-ready, thereby easing access for experts and non-experts alike. The approach was pioneered by the rasdaman technology, and a range of prototypes has emerged since. Implementation techniques vary: while rasdaman is a full-stack C++ implementation, many tools add an extra layer on top of an existing library, often in Python.

Array databases aim to provide flexible, scalable services on exactly such massive datacubes. Of the currently available implementations, rasdaman is particularly relevant to Earth observation (EO) because it supports several open OGC datacube standards (WCS, WCPS, WMS) and is in operational use at research institutions like AWI and HZG, at smart farming startups like EOfarm and CropMaps, and at petascale data centers like the DIASs and CODE-DE.
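To make the kind of datacube service described above more concrete, the following minimal sketch shows three typical operations on a small in-memory cube: trimming, slicing, and aggregation. It uses plain NumPy rather than rasdaman or any OGC interface. The axis order (time, latitude, longitude), the synthetic values, and the function names `trim`, `slice_time`, `temporal_mean`, and `time_series` are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical toy datacube: 12 time steps on a 4 x 5 lat/lon grid.
# Values are synthetic; a real cube would be served by an archive
# or an array database rather than built in memory.
n_time, n_lat, n_lon = 12, 4, 5
cube = np.arange(n_time * n_lat * n_lon, dtype=np.float64).reshape(
    n_time, n_lat, n_lon
)


def trim(cube, t_range, lat_range, lon_range):
    """Trim: keep all three axes, restricting each to a half-open index range."""
    (t0, t1), (y0, y1), (x0, x1) = t_range, lat_range, lon_range
    return cube[t0:t1, y0:y1, x0:x1]


def slice_time(cube, t):
    """Slice: fix the time axis at one index, yielding a 2-D lat/lon map."""
    return cube[t, :, :]


def temporal_mean(cube):
    """Aggregate: average over the time axis, one value per grid cell."""
    return cube.mean(axis=0)


def time_series(cube, y, x):
    """Extract the 1-D time series at a single grid cell."""
    return cube[:, y, x]


subset = trim(cube, (0, 3), (1, 3), (0, 2))   # 3 time steps, 2 x 2 cells
first_map = slice_time(cube, 0)               # 2-D map for the first time step
mean_map = temporal_mean(cube)                # 2-D map of temporal averages
series = time_series(cube, 0, 0)              # 12 values for one cell
```

On a cube of this size the operations are trivial. The point of an array database is to offer the same kind of subsetting and aggregation on the server side, so that only the requested result leaves the archive instead of the full cube.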

The Internet of Things, cloud computing, big data tools for investigating climate, intelligent analytics platforms, and other technological advances have further emphasized the need for big data analytics support in climate science. In the context of combating climate change, existing research has applied big data analytics mainly to energy efficiency, intelligent agriculture, smart urban planning, weather forecasting, and natural disaster management.

Main stakeholders doing R&D: Google, Microsoft, Long Live the Kings, JJAIBOT, Dymaxion Labs, DHL, IBM, 50 Reefs, rasdaman

