:
Sorry about that. That's a regular problem with me.
I'd like to start by talking about the terms “open government” and “open data”, because I think there's a fair amount of confusion. I have noticed that this committee itself has been using those two terms almost interchangeably, and I think that's actually a mistake. I think these are actually two very different things and I want to break them apart.
The notion of open government is the overarching umbrella term. I won't dive in too deeply, because I think it means lots of different things to lots of different people, but I think there are three different pieces that make up open government, in our interest at least, and I want to talk about each one of those.
The first piece is open data. Open data refers to facts, the actual things that you would refer to as being numbers, or perhaps a map. They're generally numbers, and for people who are new to the open data and the open government world, I would really like to talk about open data as being something you would open in an Excel spreadsheet. So generally there are not a lot of words. Generally it's what we call “machine readable”, which means your Excel spreadsheet can open it up and you can look at it. So a budget, for example, would be a great example of open data. A map is a fantastic example of open data.
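To make "machine readable" concrete, here is a minimal sketch of a program reading a budget published as a CSV file, the kind of file a spreadsheet can open. The departments, programs, column names and amounts are all invented for illustration; they are assumptions, not real government figures.

```python
import csv
import io

# Hypothetical budget extract: every name and amount here is invented.
BUDGET_CSV = """department,program,amount_cad
Transport,Road maintenance,1200000
Transport,Bridge inspection,300000
Health,Community clinics,2500000
"""


def total_by_department(csv_text):
    """Sum the amount_cad column for each department in a CSV string."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        department = row["department"]
        totals[department] = totals.get(department, 0) + int(row["amount_cad"])
    return totals


totals = total_by_department(BUDGET_CSV)
for department, amount in totals.items():
    print(f"{department}: {amount}")
```

Because the data is structured into rows and columns, a few lines of code can total it. A printed report containing the same figures would have to be retyped first.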
I want to separate that very clearly from a second piece, which is open information. Open information would be a report that has been written about data or about the state of the country. It could be about anything. Open information is the type of thing you would normally want to ATIP, the type of thing you would read, such as a report written by the government. Open information and open data are actually quite different. Open information is what we call unstructured data. It's words, it's vocabulary. Open data is structured. It looks like numbers and it's usually confusing to most of us.
The final piece is open processes. Open processes are the decision-making tools of government. This could range from a referendum, to this committee, to a consultation, and it's how open you make the decision-making within government.
The reason I want to break these three things out is because I actually think there are very different options available to us for each one of these types of categories, and I would hate for this committee to confuse “open information” with “open data” and make decisions that apply to open information, because that's a much more restricted realm we're dealing with there. When it comes to open data, I think the doors are wide open, and that is the category where I really want to focus my comments.
In the last four or five years, there has been an explosion in the amount of data that governments are willing to share with their citizens. So in the United States you have the launch of something called Data.gov, which now I think has something over 200,000 different data sets that the U.S. government shares freely with its citizens. Anybody can go to the website. Anybody can download this information. The British government has launched data.gov.uk, which has several thousands if not hundreds of thousands of data sets now as well, where you can go and download any information the government has that has been put up on that website.
I advised the mayor of Vancouver.... Two years ago we passed an open motion and we launched an open data portal for the City of Vancouver. You'll now see that there are somewhere in the range of about 160 data sets the city shares with citizens that they can use to write reports, to make software applications, or to simply help rethink how city services are conducted.
The reason I think this matters is right now we're kind of stuck in a world where we deal with all information as if it is an ATIP request. So everything has to go through ATIP. And you have to understand that I come from a generation of people who are growing up using Google, and the average length of a Google search is somewhere in the realm of about 30 milliseconds. The average length of time it takes to complete an ATIP request is somewhere in the range of four months. If you ask anybody under the age of 30 what they think of access to their government, the simple response is it's broken. We have a system that has been in place for 20, 30, 40 years that may have looked good when it was first initiated, but today, to anybody who's used to living in a digital era, simply appears broken.
The reason I wanted to subdivide these into three different problems is that I recognize that the open process and the open information categories are tricky things to deal with, and there are a lot of competing interests, but I think in the realm of open data we can make a number of very quick and very significant wins.
We can architect a system that works literally for all citizens, and not just for a small group of journalists or for a small group of very interested people who are willing to hang out and wait for months in order to get a piece of government information. I think we can do a whole lot better.
I want to talk a little bit about the core principle about why we should be doing better. I think this gets lost in all the conversation about access to information, about privacy, about government secrecy. When it comes to raw data, the information the government collects about this country, about its citizens, is a public asset. It doesn't look like a road, it doesn't hang like a bridge, it's not a building in which public servants work, but it is as much a public asset as the building we're sitting in right now. With virtually every public asset that we have as a government, we go out of our way to make it as usable and accessible to Canadians as we conceivably can, because we know that those assets make for a stronger economy and they make for a stronger country.
Yet suddenly, when it comes to data, we actually go out of our way to not share this public asset. We choose not to let Canadians understand how their government works, and we choose to not let Canadians use that information to strengthen their companies, strengthen their families, to make their country better. It's a problem I don't understand.
The small number of times when we actually do decide to make data accessible, there are very few times when we make it free. Instead, we actually charge for it. We've taken economics and we've turned it completely on its head, because when you charge for data there's an economic argument why this makes no sense, and there's a moral argument why this makes no sense. The moral argument for why it makes no sense is what you have effectively done is you've taken an activity that all taxpayers have subsidized and you are now only allowing the wealthiest to gain access to that information. That might make sense if that asset were limited in its use: like there are only so many of us who can drive on a road, and every time we drive on that road we make that road worse, so I can understand you want to toll the road because you want to capture some of the revenue from the people who are actually using that asset so you can pay to keep it up to date.
The problem with data is that if Mrs. Freeman uses that data, it doesn't suddenly become less valuable to me. If Mr. Murphy uses that data, it doesn't become less valuable to me. It's just as valuable as it was before. So here in government economics we take all of the assets that actually erode, like our roads, and we very rarely toll them--we allow people to use them for free--and we take all the assets that never erode and could be reused infinitely, and we actually charge for them.
I think one of the biggest crimes we have in this country at the moment is that we charge for an enormous amount of StatsCan's data. Here is information about how communities function, about how healthy people are, about who they are as Canadians, and we make that information hard to access.
The other piece is we're creating barriers to entry to all sorts of new and interesting, potentially disruptive, companies. When you begin to look at the information that's getting released out there, it's starting to do some very interesting things. We're still very early on, but when you think of the companies that have emerged in the digital economy, most of those companies emerged because they have become profoundly good at organizing and leveraging and making use of data. You think of a company like Google. All Google does is organize, offer up, and make use of data. Who knows what company could emerge in Canada if the Canadian government made that information available, and what new services people might imagine, what efficiencies could be gained. I'm going to talk about that in a little bit.
What are some of the opportunities in front of us? What are some of the reasons why I think we should be thinking about open data and making the information the government has more accessible to the public, especially the data? I think for government there are three or four or five reasons that really come to mind.
The first is, we could reduce ATIP requests. Here we have an enormous cost where we have people rifling through documents trying to figure out what can be shared and what can't be shared. I think there's a whole bunch of data that we could frankly just share, and we would reduce the cost of having to fill that out. Here I understand that even MPs would find this useful.
My understanding is that most MPs have kind of a running access-to-information request where they want to know how much money the federal government spent in their ridings every quarter or every year or every month. One could imagine that rather than having to make that request over and over again, if the government simply made that data available on a website you could simply download it. Rather than having to wait days, possibly weeks, or even months to get information, you would have it as fast as your Internet connection. Not only would you have it, but everybody who lives in your riding would have it. Everybody who has a business in your riding would have it. Every citizen in the country could gain access to it.
The second reason I think open data is interesting for us to engage in is because it has another cost driver, or it kind of reduces cost in government. When you look at open data portals around the world, it is not uncommon to see that the biggest users of open data portals are government employees. Right now we have all these public servants all across Ottawa sitting on information that they'd actually love to share with one another, which would allow for more effective policy to be made, and they have no mechanism to easily do that.
When you create an open data portal where you share information with the public, you have actually also created a portal where you share that data with public servants. For example, when I talked to the guy who runs the open data portal in Washington, D.C., in the city, I asked him who the biggest user was. He said that was public servants, because for years they had wanted to gain access to what the crime rates were in a region, or what the budget was over there, or what pollution was doing over in a region, and they had to go through five different people in order to get that. Now they can simply download it.
The biggest opportunity is for finding cost savings. This is what the Tories have realized in the U.K. The Conservative government in the United Kingdom now shares actual spending data down to the 25,000-pound level with the public. In some ministries they share it down to the 500-pound level. They have literally invited the public to come and take a look at their books and to help them find where the waste is.
If you don't think that could matter here, I have one brief story I'd like to share with you. A couple of years ago a friend of mine was asked by a colleague to assess the charities in the greater Toronto area. They went to the Canada Revenue Agency, which eventually gave them a spreadsheet of all the information about charities in the Toronto area. They were working away on this information, and on a lark they decided one day to sort these charities by the number of tax receipts issued. When they did that, something astounding happened. The United Way is the single largest charity in the Toronto area. It generally raises about $100 million a year. Yet the United Way only placed third on this list. There were two larger charities: one had issued somewhere in the range of $160 million in tax receipts and the other somewhere in the range of $230 million. In fact, they had never heard of six of the top fifteen charities on their list.
Once you began to crunch the numbers and once you began to dive deeper to look at the charities, it became very obvious that these charities were actually not charities. Some of them were tax evasion schemes and some were engaged in fraud. And when you looked at what these charities had collected over a five-year period, the total amount of forgone tax revenues for the Canadian government and the cost to Canadian taxpayers was $3.2 billion. This is an enormous sum of money for us.
If the CRA's data had actually been made open and had been made available, I could imagine two things. One is that someone somewhere probably would have created a graph that would have shown different charities in the Toronto area and would have shown a charity that had gone from $60,000 in charitable receipts one year to $20 million in the next year, to $60 million in the year after that, to $120 million, then to $240 million. And someone somewhere would have said “I either need to hire that executive director, because they are running the most amazing charity in Canada, or something serious is going on”. I actually think if they had made that data public and someone had created that site, it might have prevented that charity from even emerging in the first place, because the simple scrutiny of the public being able to see them would have prevented that type of scam from emerging.
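The two checks in this story, sorting charities by receipts and spotting a charity that grows implausibly fast, can be sketched in a few lines. This is an illustration only: the charity names and most figures are invented, "Charity B" simply reuses the growth pattern described above, and the threshold of 3.0 is an arbitrary assumption rather than a rule anyone at the CRA uses.

```python
# Hypothetical tax receipt totals per charity, one value per year, in dollars.
RECEIPTS = {
    "Charity A": [95_000_000, 98_000_000, 100_000_000, 101_000_000, 100_000_000],
    "Charity B": [60_000, 20_000_000, 60_000_000, 120_000_000, 240_000_000],
    "Charity C": [400_000, 450_000, 430_000, 500_000, 520_000],
}


def rank_by_latest(receipts):
    """Return charity names sorted by most recent year's receipts, largest first."""
    return sorted(receipts, key=lambda name: receipts[name][-1], reverse=True)


def flag_rapid_growth(receipts, threshold=3.0):
    """Return charities whose receipts grew by at least `threshold` times in any single year."""
    flagged = []
    for name, series in receipts.items():
        # Compare each year with the one before it, skipping years with no receipts.
        ratios = [later / earlier for earlier, later in zip(series, series[1:]) if earlier > 0]
        if any(ratio >= threshold for ratio in ratios):
            flagged.append(name)
    return flagged


ranking = rank_by_latest(RECEIPTS)
suspicious = flag_rapid_growth(RECEIPTS)
print(ranking)
print(suspicious)
```

A flag like this does not prove fraud. It only tells someone where to look, which is the kind of scrutiny open data would make possible.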
For me, the opportunity around government is an enormous amount of cost saving and also an opportunity to kind of monitor what we're doing and to see the problems as they are emerging, and finally to take the government services we have and augment them.
Someone has tried to do that with just what you guys do here in the House of Commons. I think you interviewed Michael Mulley, who is a friend of mine in Montreal. He has created the openparliament.ca website. He's taking data that you create in your own Parliament, and he's put it on a website that is far more accessible than the parliamentary website. In fact, I know public servants in Ottawa now who use that website to track what their ministers are saying and doing so that they can stay on top of what the government's agenda is and what the debates are.
These are the types of services that we can make more efficient.
The other reason why I think we should be doing open data is that it actually strengthens our economy. When I think of the billions of dollars that were spent on strengthening the Canadian economy coming out of the recession, I am saddened to think about how little of it was spent on the type of infrastructure that is going to be so powerful in the 21st century, which is better data. Why didn't we simply make all of StatsCan's data publicly available?
If you think of some of the big examples of how data has transformed the Canadian economy.... I have a few quick examples for you. The first is weather. The Canadian government collects weather information and shares it. In the United States, the U.S. government does the same thing. It is estimated that in the United States the economy generated by open weather information is roughly in the range of about $2 billion.
Just think about it. Think of all the logistics firms that are now giving advice on when to move goods based on the weather information they have that's been made available to them freely by their government. Think of all the individual decisions made by commuters about whether or not they're going to take the bus or whether they're going to drive, and all of the oil that gets saved by people who feel “actually, I can ride my bike today” or “I can take the bus” or “I'm not going to bring that umbrella” or “I'm going to dress more appropriately”—all the productivity hours that are gained there.
The amount of wealth generated by weather data is almost incomprehensible. That is a single data set that this government creates and shares. If we begin to imagine what is possible with the hundreds of thousands of data sets that you have at your disposal, I can think of an economy that is much more resilient, much more vibrant than the one we have now.
Another example is GPS data. I think if we're really honest with ourselves, GPS data was created so that we could deploy nuclear warheads with enormous precision on people we don't like. Tim O'Reilly says that nobody sat around while developing GPS data saying “GPS data will be really interesting when people have cellphones so that they can tell everybody where they are or they can log into Google maps and figure out how not to get lost.” Think of the billions of barrels of oil every year that are not expended because people no longer get lost. They can simply figure out where they are because of a GPS device. This is the power of open data.
What I really want to challenge you guys to think about is as you're making recommendations and as you're thinking about what the future of government is, I want you to understand that there is a huge opportunity in the data that this government sits on. If you share it, there is the citizenry out there that wants to make use of it to better understand how the government works, to hold you to account—I'll be honest—and to build the economy of the next century.
What we're really trying to figure out is that if we are going to have a knowledge-based economy, we're going to need a knowledge-based government that is going to want to engage with a knowledge-based citizenry. They already exist. They're already trained and skilled. They are already thinking about the stuff. They're just sitting around and waiting for someone to give them some materials to make that economy a reality.
I've talked for long enough; I'll stop there. I'd love to hear your questions and to answer them as best I can.
:
I have a couple of thoughts. First, around open information and the division between the two, I think what the British have done is incredibly interesting. The British are now contemplating setting up a public data corporation that will house the regularly collected data that the government uses. It's a very interesting model. Basically, they're going to try to centralize the actual data they collect. I think it's a model this government should be looking at. It's certainly the model that's used by the city of Washington, D.C., and it's the reason they've been able to move so quickly.
That's where I would define “data”. It's the information that this government chooses to regularly collect about the country. There might be data that on the offhand, every once in a while, someone commissions--you know, a report, when they want to know something. I think we should share that as well. But I actually think that at the heart of it there's a core set of data that we regularly collect. That is a public asset. Frankly, our tax dollars paid for it, and I'd like to know why you're not sharing it with me.
On the second piece, around information, I want to be really clear with the committee. I recognize the importance of the government's need to have a certain degree of privacy when developing policies and ideas. I do not think that under all circumstances it is wise for every idea to be shared with the public as it's being formulated. There are ideas that are controversial, there are ideas that need to be explored, there are ideas that need to be nurtured, and they deserve to have the privacy of a government in order to do that. If I were going to make some recommendations, one recommendation I might make is that I would radically reduce the length of time between when a document is made versus when it's made public. The second is I would insist that any document now that is being released, where it exists in a digital form, be released in that digital form. So if you happen to have it in a Word document, please release it in a Word document. Don't print it out and send it to me.
One of the most powerful things about digital media is that they're searchable. When you dump 3,000 printed-out documents onto me, you are effectively not releasing those documents to me. Am I really going to go through 3,000 different pieces of paper and find the relevant piece of information? When a citizen asks for a piece of information and you send them printouts, you're effectively telling them, “We are denying your access to information”, and I think you are actually disrespecting them in a really profound way. So I would want to make that recommendation.
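As a rough illustration of what searchability buys you, here is a minimal sketch that finds every line mentioning a keyword across a set of digital documents. The file names and contents are invented; the point is only that the same search across 3,000 printed pages would have to be done by hand.

```python
def search_documents(documents, keyword):
    """Return (document name, line number, line) for each line containing keyword, ignoring case."""
    needle = keyword.lower()
    hits = []
    for name, text in documents.items():
        for line_number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((name, line_number, line.strip()))
    return hits


# Hypothetical released documents, held in memory to keep the sketch self-contained.
DOCUMENTS = {
    "memo_001.txt": "Quarterly summary.\nBridge repair costs rose this quarter.",
    "memo_002.txt": "Staffing update.\nNo changes to report.",
}

hits = search_documents(DOCUMENTS, "bridge")
for name, line_number, line in hits:
    print(f"{name}:{line_number}: {line}")
```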
I would also love for this committee to rethink the rules under which information is released, and even how parliamentary privilege works. Right now, for example, when the video of this committee is released, no one's going to be allowed to use that video to do anything they want. People can rebroadcast that video, but for example if somebody wants to make fun of me and take this video and match it up with a song, my understanding is that right now their rights are actually quite limited in doing that. They certainly can't do it with any of you. In the United States there's The Daily Show, and they regularly show the House of Representatives and the Senate and make fun of them, but it's a way of educating people. That's the satire that's so important. You can't do any of that in Canada. So there are these restrictions on how data can be used.
And then, finally, when it comes to processes, there I actually have less to say. I think a lot of the thinking around what open processes look like today is built around the current way we share information. If we shared a lot more information and a lot more data with the public, the types of processes we'd want would also change dramatically.
For example, if this government chose to make its budget open, and simply released the Excel spreadsheet of the budget and said “Everybody in the world, go and analyze it and you tell us where the problems are”, I think you'd have the people who came and talked to you much better informed. This committee would work in a very different way, because rather than re-educating the people who are coming to present to you, or having them tell you things that are incorrect because they didn't understand the 3,000 pieces of paper they had to go through, the system would be much faster and the way you'd want to engage people would begin to change.
So I'm hesitant to go into that place, because I think that world's going to evolve, depending on what we do in the other two places.