It has been a few weeks since we released the Global Data Barometer (GDB) results and database, along with our blog posts on the GDB structure and a hands-on guide. We hope you are exploring the data and testing hypotheses across different thematic areas. Among the many topics the GDB covers, there is one we want to guide you through so you can continue your exploratory research: how does one track openness with the GDB?
Two different barometers
The Global Data Barometer builds upon the Open Data Barometer (ODB), which was a project from the World Wide Web Foundation and partners between 2013 and 2020 (with the final global edition completed in 2016, and the final regional edition, focusing on Latin America and the Caribbean, completed by ILDA in 2020).
The ODB aimed to understand the status and impact of open data initiatives around the world. Over the last decade, however, issues that were not originally at the core of the open data agenda could no longer be ignored, and new measurements were needed to capture these complexities in the data field. In response, the GDB was developed to measure the state of data for public good around the world. As a result, the GDB methodology and research process differ significantly from those of the ODB and do not allow a straightforward comparison, mainly because of differences in the thematic lenses, structured approach, evaluated variables, and data collection and calculation processes.
The differences are substantive and require a careful approach if you want to compare the two barometers:
- First, the GDB is organised around modules, which provide differentiated thematic views of data frameworks, capabilities, availability and use, instead of a general view of open data.
- Second, the variables evaluated inside indicators are not the same between the two barometers, even when indicators have the same name.
- Third, each barometer scores variables differently: the ODB scores them in an aggregated way, while the GDB scores each variable individually, which also means that the GDB offers more structured data.
- Fourth, the GDB introduces both weighting differences between variables and variables that act as multipliers inside the indicators, as explained in the appendix of the GDB Report, while the ODB applied equal weights to all its variables.
Given these structural differences, only rough and controlled comparisons can be made between the two barometers. We made a few of them in our Report to track changes in a few key aspects.
- On Capabilities, for example, we made an assumption in order to compare the open data initiative indicator of the two barometers: any score above 50 points (out of 100) was considered an active open data initiative, and any score above 70 points was considered a well-resourced open data initiative. Under this assumption, we compared the number of active and well-resourced open data initiatives from 2013 to 2021, understanding that the comparison had the limitations mentioned above.
- On Availability, the comparison took a different approach. The GDB contains groups of questions for each dataset related to its existence; kinds of data; data fields and quality; data openness, timing and structure; and data extent. The ODB, on the other hand, contained a group of sub-questions related only to openness, timing and structure. Thus, the comparison between the barometers was limited to the matching questions on openness, timing and structure. In the report, we compared the results of these sub-questions in the ODB and GDB to reflect on the proportion of datasets published as open data.
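The Capabilities assumption described above can be sketched in a few lines of Python. This is only an illustration of the two thresholds mentioned in the text (above 50 and above 70 points); the function names and the sample scores are hypothetical and are not taken from the GDB or ODB data.

```python
# Thresholds stated in the text: scores are out of 100.
ACTIVE_THRESHOLD = 50
WELL_RESOURCED_THRESHOLD = 70


def is_active(score: float) -> bool:
    """True when the score is strictly above 50 points."""
    return score > ACTIVE_THRESHOLD


def is_well_resourced(score: float) -> bool:
    """True when the score is strictly above 70 points."""
    return score > WELL_RESOURCED_THRESHOLD


# Hypothetical scores, for illustration only (not real barometer results).
sample_scores = {
    "Country A": 82.0,
    "Country B": 55.5,
    "Country C": 50.0,  # exactly 50 is not "above 50"
    "Country D": 12.0,
}

active_count = sum(is_active(s) for s in sample_scores.values())
well_resourced_count = sum(is_well_resourced(s) for s in sample_scores.values())

print(f"Active initiatives: {active_count}")
print(f"Well-resourced initiatives: {well_resourced_count}")
```

Note that, under this reading, every well-resourced initiative is also counted as active, since any score above 70 is also above 50.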
Despite the differences between the two barometers, we explored ways in which the GDB on its own could track data openness. Below we describe how we did this, so that you can conduct your own research and draw your own conclusions. Along the way, we share some key findings about the status of data openness around the world.
GDB indicators and openness
The best way to observe how the GDB evaluates a country on open data is to look at specific indicators within the pillars. An indicator provides comprehensive information: an Availability indicator, for example, covers not only the “openness” of a dataset but also the data fields it contains and the quality and coverage it offers. We recommend you look at the indicators in both a general and a granular way, that is, at the indicator score and at the sub-question responses and evidence it contains. To track openness with the GDB, we recommend you observe the following indicators.
Let’s first look at the Governance indicators that evaluate open data. The indicator “Open data policy” (G.GOVERNANCE.ODPOLICY in the database) answers the question: To what extent do relevant laws, regulations, policies, and guidance provide a comprehensive framework for generating and publishing open data? And the indicator “Data management” (G.GOVERNANCE.DATAMANAGE in the database) answers the question: To what extent do relevant laws, regulations, policies, and guidance provide a comprehensive framework for consistent data management and publication?
From the first indicator, we know that 30 countries have open data frameworks with force of law, that 44 countries have frameworks that lack force of law, and that 35 countries do not have open data frameworks at all. The fact that only 30 of the 109 countries have given their open data frameworks force of law after a decade of work shows that the agenda is alive but is not getting as strong as we might have expected, given the importance of these frameworks for open data initiatives.
From the same indicator, we also learn that existing open data frameworks have important weaknesses in the key elements they should contain. For example, only 24 countries have a framework that requires the use of data standards, and only 44 countries have a framework that promotes open licensing without any restrictions beyond attribution and share-alike.
On the Capabilities indicators, we suggest you observe: “Open data initiative” (C.CAPABILITIES.ODINIT in the database), “Government support for re-use” (C.CAPABILITIES.GOVSUPPORT), “Civil service” (C.CAPABILITIES.TRAIN), and “Sub-national” (C.CAPABILITIES.SUBNAT). Let’s focus on the open data initiative indicator to share some key findings. This indicator answers the question: To what extent is there a well-resourced open government data initiative in the country?
From this indicator we know there are 10 countries, out of 109, where there is no evidence of a government-led open data initiative; 27 countries where there has been a government-led open data initiative but with limited evidence of recent activity; and 72 countries where there is evidence of an active government-led open data initiative. The fact that 46 countries have active open data initiatives but either have no open data framework at all or have a framework that lacks force of law suggests that many of the existing open data initiatives (46 of the 72 active ones, or 64%) lack coverage and sustainability throughout the government.
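The 64% figure comes from crossing two indicators: the open data initiative indicator and the open data policy indicator. The sketch below first reproduces the arithmetic from the numbers reported above, and then shows one way such a cross-tabulation could be counted from per-country records. The record structure, field names and sample rows are hypothetical and serve only as illustration.

```python
from typing import Iterable, NamedTuple

# Figures reported in the text for the open data initiative indicator.
no_initiative = 10
limited_activity = 27
active = 72
assessed_countries = no_initiative + limited_activity + active

# Active initiatives with no framework, or with a framework lacking force of law.
active_without_binding_framework = 46
share = 100 * active_without_binding_framework / active

print(f"Assessed countries: {assessed_countries}")
print(f"Share of active initiatives without a binding framework: {share:.0f}%")


class CountryRecord(NamedTuple):
    """Hypothetical per-country record; not the real GDB schema."""
    country: str
    initiative_active: bool
    framework_has_force_of_law: bool


def count_active_without_binding_framework(records: Iterable[CountryRecord]) -> int:
    """Count countries with an active initiative but no framework with force of law."""
    return sum(
        1
        for r in records
        if r.initiative_active and not r.framework_has_force_of_law
    )


# Made-up rows, for illustration only.
sample_records = [
    CountryRecord("Country A", initiative_active=True, framework_has_force_of_law=True),
    CountryRecord("Country B", initiative_active=True, framework_has_force_of_law=False),
    CountryRecord("Country C", initiative_active=False, framework_has_force_of_law=False),
]
print(count_active_without_binding_framework(sample_records))
```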
On a more positive note, the open data initiative indicator also shows that 88 countries have a government team supporting open data initiatives; that 79 countries have a well-maintained or partially maintained open data portal; and that 76 countries offer some guidance and support to publish open government data.
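If you work with an export of the GDB database, the indicator codes quoted above can be used to pull out the Governance and Capabilities indicators related to openness. The following is a minimal sketch using Python's standard `csv` module. The column names (`country`, `indicator_code`, `score`) and the sample rows are assumptions made for illustration; check the actual column names in the file you download before adapting it.

```python
import csv
import io

# Made-up extract: column names and scores are assumptions, not the real GDB schema.
SAMPLE_CSV = """country,indicator_code,score
Country A,G.GOVERNANCE.ODPOLICY,80
Country A,C.CAPABILITIES.ODINIT,65
Country B,G.GOVERNANCE.ODPOLICY,20
Country B,C.CAPABILITIES.ODINIT,40
Country B,C.CAPABILITIES.SUBNAT,10
"""

# Indicator codes mentioned in the text.
OPENNESS_INDICATORS = {
    "G.GOVERNANCE.ODPOLICY",
    "G.GOVERNANCE.DATAMANAGE",
    "C.CAPABILITIES.ODINIT",
    "C.CAPABILITIES.GOVSUPPORT",
    "C.CAPABILITIES.TRAIN",
    "C.CAPABILITIES.SUBNAT",
}


def scores_by_indicator(csv_text: str, codes: set[str]) -> dict[str, dict[str, float]]:
    """Group scores as {indicator_code: {country: score}} for the selected codes."""
    result: dict[str, dict[str, float]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = row["indicator_code"]
        if code in codes:
            result.setdefault(code, {})[row["country"]] = float(row["score"])
    return result


openness_scores = scores_by_indicator(SAMPLE_CSV, OPENNESS_INDICATORS)
for code, by_country in sorted(openness_scores.items()):
    print(code, by_country)
```

Indicators that do not appear in the extract are simply absent from the result, so it is worth checking which codes were actually found.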
On the Availability pillar, we suggest you look at all GDB primary indicators, as they explore whether certain categories of data are available, shared, and of adequate quality to allow reuse for the public good. Considering these conditions, our results show that data on public procurement and on budget and spend are the most available in the world, and that data on lobbying, beneficial ownership and land tenure are the most restricted.
A deeper look inside the indicators tells us more about the global situation in terms of data openness, timing, and structure (the elements typically assessed for open data). The following figure shows the results, in percentages, of the sub-questions related to data openness across all 17 availability indicators in the 109 assessed countries (1853 different datasets). It shows, among other things, that only 29% of the assessed datasets are provided in machine-readable formats, that only 18% are available as a whole, and that only 25% are openly licensed.
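To get a sense of scale, the percentages above can be turned into approximate dataset counts. The sketch below checks that 17 indicators across 109 countries gives the 1853 datasets mentioned in the text, and then estimates how many datasets each percentage represents. Because the published percentages are rounded, the resulting counts are approximations and not official GDB figures; the variable names are illustrative.

```python
AVAILABILITY_INDICATORS = 17
ASSESSED_COUNTRIES = 109

# One dataset per indicator per country.
total_datasets = AVAILABILITY_INDICATORS * ASSESSED_COUNTRIES

# Percentages reported in the text (already rounded).
reported_shares = {
    "machine-readable": 29,
    "available as a whole": 18,
    "openly licensed": 25,
}

# Approximate counts, rounded to the nearest whole dataset.
approx_counts = {
    label: round(total_datasets * pct / 100)
    for label, pct in reported_shares.items()
}

print(f"Total datasets assessed: {total_datasets}")
for label, count in approx_counts.items():
    print(f"{label}: about {count} of {total_datasets}")
```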
While these figures give us a snapshot of data openness in terms of open data frameworks, capabilities and availability, we believe that the descriptions above also serve as an example for exploring other aspects of openness assessed by the GDB. We hope you continue your own exploratory research.