Recent studies indicate that 90% of the world's data was created in the last two years, a period in which ten or more times as much information was captured, stored, processed, and delivered as in all previous years of human history. Current estimates put the amount of data generated in 2018 at 33 zettabytes (one zettabyte, or ZB, equals 1 billion terabytes), 16 times more than was generated in the previous 10 years. This Big Bang of data continues to accelerate, and the total is expected to exceed 175 zettabytes by 2025, more than five times the 2018 amount.
This new scenario of massive information production forces us to improve our capacity for data-driven decision-making, which is a relevant factor in public policy, the private sector, and the community in general. Good use and analysis of data makes it possible to gain insights into an event, understand its causes, and anticipate its repercussions. In this way, data allows decisions to be made on a solid and reliable basis. However, this scenario also brings the need to develop mechanisms for data capture, storage, processing, security, and availability, as well as the ability to use data in a simple and comprehensive way. The ability to acquire data, understand it, process it, extract value from it, visualize it, and communicate it will be an enormously important skill for decades to come.
Data Science can be defined as the use of data to achieve specific objectives by designing or applying computational methods for inference or prediction. It encompasses the study of data: where they come from, what they represent, and how they can be transformed into valuable contributions and resources for scientific, commercial, and social strategies. It is an interdisciplinary field that involves scientific methods, processes, and systems to extract knowledge or a better understanding from (large volumes of) data. Some of the characteristics of Data Science are:
The Data Observatory (DO) focuses on developing and promoting relevant national data sets in its areas of competence, as well as on developing innovative solutions that add value in science, the economy, and society through data management and solution innovation. In particular, the main actions in this area are the following.