OGGO Committee Report

CHAPTER FOUR: USEABLE BY ALL

It is not enough to publish data for that data to serve as an excellent example of open data.… [O]pen data is about having more easily reusable formats.
Lyne Da Sylva, Associate Professor, School of Library and Information Science, Université de Montréal

The third open data principle – Useable by All – is centered on the release of as much data in as many open formats as possible. Throughout the Committee’s study, witnesses suggested ways to improve data formats and search functionality in order for open data to be truly useable by all, from researchers and academics to the general public. These comments have informed the Committee’s recommendations about potential ways to make open government data more accessible and easily useable.

A. Data formats

Before open data, the federal government published many sources of data. However, these were not necessarily presented in machine-readable formats and were often published under restricted licensing terms. According to the CIO of the Government of Canada, open data allows users to search and then download data in machine-useable formats, so that they can develop programs and information systems that can manipulate the data and produce other uses for it.

One barrier to utilizing open data is licensing restrictions. As such, the development of a common open license is an integral part of open government data. The federal government released its open government license in June 2013. As well, Mr. Deslauriers explained that the Government of Québec and the cities of Québec, Gatineau, Montréal and Sherbrooke released a common license in February 2014. Moreover, these four cities and the provincial government are merging their open data portals to provide users with a single point of access. Ms. Miller noted that licenses should be designed to allow governments to release data while still maintaining ownership over the data.

According to a TBS official, while federal departments and agencies are “owners” of the data, “[c]ertain legislation doesn’t allow for the data to be shared.” According to Ms. Ubaldi, there are also restrictions that concern the sharing of data within the public sector. She explained, “at times, for instance, linked data sets can support their data analytics, which can help identify trends to improve policy making and service delivery, but still some legal restrictions do not enable different parts of the administration to access the various data sets.”

Some witnesses even suggested legislative changes. For example, according to Mr. Eaves, Canada should consider core datasets whenever it passes new legislation, to answer the question: “What are the core datasets that allow for the transparency so that the public can assess whether the legislation is working?” In addition, he recommended that the federal government should update the Access to Information Act in order to require departments to respond to requests for data with datasets that are in a machine-readable format.

According to several witnesses, including Ms. Ubaldi, technical challenges that governments deal with in relation to open data include how to enable interoperability and integration of data and how to foster the linkage of datasets to be released in open formats. According to a representative from the Government of New Brunswick, “[o]ne issue is that today data is available in each government's format and very few are using international standards.” A representative from the City of Ottawa informed the Committee that “it's difficult to get common data formats for particular topics across the levels of government simply because in many cases you're working with different types of data.” With respect to this lack of standardization, Mr. Sharma commented that the federal government has a role to play to standardize formats and protocols so that applications created locally can serve in other jurisdictions inside or outside Canada.

A TBS official explained that the federal government has incorporated the use of an international openness scale on its open data portal to indicate the level of openness of its datasets. This scale indicates the extent to which the dataset is available in a well-structured format, and whether or not proprietary software is required in order to open the dataset. The U.S., the U.K. and a variety of other jurisdictions are also using this scale.

The Committee heard several suggestions on which formats are best for releasing open data. According to Mr. Chui, data formats should be machine-readable and although most formats are, “some [formats] are easier to use [and] easier to process, such as comma-delimited.”

According to some witnesses, the best format for open data is the Resource Description Framework (RDF) format. Ms. Da Sylva told the Committee that RDF format is the champion of “reusability.” She explained that RDF format is an extremely simple but highly-structured format, and while it is more challenging for a person to write and read the format, it can be easily manipulated by a computer. As explained by a representative from the Government of British Columbia, “RDF is a really interesting and powerful data format because it can create those interconnections among different datasets.”

Comma-separated values (CSV) is another widely used format. According to Ms. Da Sylva, producing data in CSV format is quite easy and there are no real technological barriers to doing so. In addition, she explained that CSV format can be readily manipulated by computer and can be converted into RDF format.
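To illustrate the point about converting CSV into RDF, the following is a minimal sketch in Python, not a method described by any witness. It turns each CSV row into RDF statements written in the N-Triples syntax. The base address http://example.org/, the function names, the column names and the sample rows are all hypothetical and chosen only for illustration.

```python
import csv
import io
from typing import List
from urllib.parse import quote


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text: str, base_iri: str, id_column: str) -> List[str]:
    """Convert CSV text into N-Triples lines, one triple per non-empty cell.

    Assumption: each row has a unique value in id_column, which is used to
    build the subject IRI. Column names become predicate IRIs under base_iri.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = []
    for row in reader:
        subject = f"<{base_iri}record/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # The identifier is already encoded in the subject; skip empty cells.
            if column == id_column or value == "":
                continue
            predicate = f"<{base_iri}field/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Hypothetical sample data, invented for this illustration only.
SAMPLE = "id,name,category\n1,Ottawa,city\n2,Gatineau,city\n"

if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE, "http://example.org/", "id"):
        print(line)
```

Every value is emitted as a plain text literal in this sketch. A fuller conversion would map columns to an agreed vocabulary and link records to other datasets, which is where the interconnections attributed to RDF come from.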

In terms of international best practices, Ms. Da Sylva told the Committee that the U.K. publishes a large number of its datasets in RDF format. In comparison, she added that some governments publish their documents as zipped PDF images, which is not a desirable format for open data. She also noted that some federal data is presented in zipped text files – also not a desirable format, since the data is unstructured and much more difficult to analyze directly with a computer. In terms of the federal government’s progress, the President of the Treasury Board noted that harmonizing data formats is a work in progress.

Some witnesses raised concerns around the usability of open government data by the general public. A representative from the Government of British Columbia commented that data has to be accessible and usable. According to Ms. Francoli, “[t]he raw format that the datasets are released in really does privilege data scientists; people who have high degrees of expertise in the use of raw data. Many others, non-governmental organizations for example, would benefit greatly from the datasets and the information, but they're not able to use them because they lack the resources, they lack the expertise.” A representative from the Government of British Columbia emphasized that “making sure that people can connect to the data in a way that is relevant to them and that serves their needs is really important.” According to Ms. Da Sylva, “[t]he [federal government] site is just so huge – there are so many things – that to figure out what might be of use to you might take a while.” With respect to usability, a representative from the Government of Ontario suggested that visualization tools would be one way to address this issue in order “to make it simpler to understand.”

B. Discoverability

Several witnesses suggested that the federal government should develop a federated search function, through a national search engine, which would include open data from federal, provincial and municipal governments. This single point of access would make it easier to search for open data on a specific subject across all levels of government in Canada. According to an official from the Government of Ontario, this “would help the adoption and use of open data by improving access.” Representatives from the Government of British Columbia, the City of Toronto and the City of Ottawa all agreed that there should be a federated search function for open government data in Canada.

A federated search function could also lead to economies of scale for individual governments. An official from the Government of Ontario suggested that governments could collaborate and jointly develop a common search engine, so wherever the data resides, users can search federal, provincial and municipal government data. A representative from the City of Ottawa agreed and noted that the federal government could act as the lead on this collaborative effort.
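The idea of a common search engine that works wherever the data resides can be sketched as follows. This is a simplified illustration under stated assumptions, not a design proposed to the Committee: the catalogues are small in-memory lists, whereas a real federated search would query each portal's own interface, and the dataset titles, the Dataset class and the federated_search function are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Dataset:
    """A catalogue entry; the fields are assumptions made for this sketch."""
    title: str
    level: str  # "federal", "provincial" or "municipal"


def federated_search(portals: Dict[str, List[Dataset]], term: str) -> List[Dataset]:
    """Search every portal's catalogue and return one merged result list.

    Matching is a case-insensitive substring test on the title. Portals are
    visited in alphabetical order of their name so results are reproducible.
    """
    needle = term.lower()
    results = []
    for name in sorted(portals):
        for dataset in portals[name]:
            if needle in dataset.title.lower():
                results.append(dataset)
    return results


# Hypothetical catalogues, invented for this illustration only.
PORTALS = {
    "federal": [Dataset("Transit funding by province", "federal"),
                Dataset("Weather stations", "federal")],
    "provincial": [Dataset("Provincial transit ridership", "provincial")],
    "municipal": [Dataset("Bus stop locations", "municipal"),
                  Dataset("Transit routes", "municipal")],
}

if __name__ == "__main__":
    for hit in federated_search(PORTALS, "transit"):
        print(f"{hit.level}: {hit.title}")
```

The user issues one query and receives results from all three levels of government, while each catalogue stays with the government that maintains it.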

The federal government also recognizes the value of a federated discovery approach. The CIO of the Government of Canada remarked that “[i]t would certainly be very useful to civil society to be able to tap data sets across Canada without any challenge to usability or licensing.” In particular, a TBS official shared that, “through our international collaborations and through our national cooperation with the provinces and municipalities, we have discovered that this is something that users want. […] As such, in the coming years, we will work to create as many links as possible between the different portals as well as between the different access points in order to permit relatively transparent navigation between these sites.”

In addition, several witnesses suggested that search capabilities need to be further expanded to allow users to search by geography, as well as by theme. A representative from the Government of British Columbia agreed that the open data portal should be improved in terms of the discoverability of data. The CIO of the Government of Canada indicated to the Committee that the department is working internationally with the U.K. on greater search capabilities.

C. Building awareness

In terms of building awareness of the open data portal, the CIO of the Government of Canada informed the Committee that there is no separate advertising budget for the federal government’s open data initiative. However, she explained to the Committee that the open data portal is being promoted online, through consultations and search engines. While several witnesses agreed that the federal government should advertise its open data portal, there were various suggestions as to which approach was best in terms of raising awareness.

Some witnesses suggested that the approach to advertising the open data portal should depend on which users the government is trying to reach. For example, Mr. Stirling remarked that rather than paying for advertising, the most effective way to increase the awareness of the open data portal could be to send letters to certain charities or civil society organizations, asking those bodies to spread the word to their members.

Meanwhile, several witnesses agreed that the focus should be placed on awareness, engagement and dialogue. According to them, with an engaged group of individuals, open data has the potential to become more valuable. On this theme, Donald Lenihan of the Public Policy Forum emphasized the importance of engaging the public, consulting and raising awareness around open data. Ms. Ubaldi also commented, “it is very important to know what's going on before advertising, and awareness raising is essential.” She added that “[i]t's about businesses as actors, but it's also about other groups in society, so taking active steps to advertise and let people know and engage is […] essential.”

With respect to data formats and licensing, the Committee recommends that:

RECOMMENDATION 13

The Government of Canada should assess whether there are restrictions in existing federal legislation which prevent the release of certain datasets on its open data portal, and consider making legislative changes where appropriate. In addition, the Government of Canada should consider open data requirements when introducing new legislation.

RECOMMENDATION 14

The Government of Canada should update the Access to Information Act in order to require federal departments and agencies to provide datasets that are in a machine-readable format when responding to access to information requests for data.

RECOMMENDATION 15

The Government of Canada should accelerate its efforts to harmonize data formats by consulting sectorial roundtables, involving provincial, territorial and municipal governments, and other stakeholders.

RECOMMENDATION 16

The Government of Canada should continue to prioritize the release of high value datasets and align the format of those datasets with its G8 partner countries.

RECOMMENDATION 17

The Government of Canada should update its procurement policies to require that information technology purchases support open data; and that these policies include a requirement in terms of data formats, such as RDF and CSV formats, in order to support the release of open data in machine-readable formats.

In order to have a single point of access for open government data in Canada, the Committee recommends that:

RECOMMENDATION 18

The Government of Canada, in collaboration with provincial, territorial and municipal governments, should develop a federated search function to enable users to access open data from all three levels of government through a single point of access.

In relation to building awareness of the federal government’s open data portal, the Committee recommends that:

RECOMMENDATION 19

The Government of Canada should continue to promote its open data portal and expand its promotion to the public.