The proliferation of new licences continues to be a major challenge for open data. When licensors create custom licences instead of using standard open ones, they create a number of problems. Understanding the legal arrangements of new licences may be cumbersome for data users. Because of legal uncertainties and compatibility issues between many different licences, this proliferation can have chilling effects on the reuse of data. Standardised licences can smooth this process by clearly stating usage rights.
This report provides a snapshot of licence proliferation in government. It explains why new licence terms complicate the licensing ecosystem and fail to resolve deeper issues around copyright and existing licence incompatibilities. The report makes a case for the use of public domain dedication and existing reusable standard licences and argues that a core issue of licensing is a lack of centralised coordination within government.
To understand how governments can centralise and harmonise decision-making around open licences, the report discusses different phases of the licensing process, including copyright and copyright reform, policy development, and the design of individual licences. It offers an introduction to how licences are governed and outlines some persistent challenges and best practices.
We make the following recommendations for government agencies who wish to make their public sector information as reusable as possible.
When determining the legal context:
When designing policy and governance mechanisms:
When choosing an open licence, we make the following recommendations:
Once you have chosen an open licence, we recommend taking the following actions:
People invest time and effort in creating works. Intellectual property rights, including copyright or similar rights, regulate who can use and exploit the works. The default legal status may restrict uses we want to promote and allow. Without clear licensing, the legal grounds for permitting those uses can be unclear.
Open licences are legal arrangements that grant the general public rights to reuse, distribute, combine or modify works that would otherwise be restricted under intellectual property laws. Licences are open if they enable anyone to use works for any purpose, both commercially and non-commercially.
Without an open licence, users face legal grey areas. “May I publish environmental statistics in a health app? Am I allowed to combine a city map with location information of companies?” Open licences help answer these and similar questions.
Open licences may be unnecessary for works not restricted by copyright or similar laws. Such works are considered to be in the ‘public domain’1. Which rights are recognised for public domain works varies by country, but generally they cover all rights that open licences would also permit.
The Open Definition defines how IP-protected works can be made available openly. This provides the framework for publishers to turn their data into open data, and can similarly be applied to other types of information, such as official government documents, journalism, art, and so on. Central to the Open Definition are nine requirements for legal openness. According to these requirements, open licences must allow anyone to use data for any purpose. Restrictions may only concern provenance, such as attribution of contributors, rights holders, sponsors, and creators, and possibly a requirement that adapted works use licence terms similar to those of the works they originate from. Publishers must comply with these requirements in order to open up their information. Open licences only address barriers posed by copyright but naturally do not supersede civil law or allow for unlawful behaviour.
Open licences are described as either permissive or copyleft. If a work is licensed under a ‘copyleft’ licence, its terms and conditions must apply to works that will derive from and build on the original work. If a work is licensed under a permissive licence, the terms and conditions of a derivative work can be changed.
Open licences can apply to any copyrighted work, including content and/or databases. Depending on a country’s copyright law, content may include video, audio, written texts, and other works that are considered products of creative authorship. In the European Union and some other countries, additional protections for databases exist, so-called ‘database rights’2. Database rights do not stem from copyright, but are intended to protect investments made to build a database (sometimes referred to as ‘sweat of the brow’). The European Database Directive3 is possibly the most prominent example of database rights.
In the United States, databases are considered compiled works, similar to books or other works.4 Here the sweat of the brow principle does not apply. Either the database records or the database structure must be the result of creative authorship to be protected. An alphabetical list of factual data is usually not protected.5 While legally different, database rights and copyright have similar effects: both restrict the public use of information.
Several widely known licences address database rights. The Open Data Commons licences Open Database Licence 1.0 (ODbL) and Open Data Commons Attribution Licence 1.0 (ODC-BY) were the first to address database rights. The Creative Commons Zero public domain dedication (CC0), as well as version 4.0 of the Creative Commons Attribution (CC-BY) and Creative Commons Attribution Share-Alike (CC-BY-SA) licences, cover database rights as well as regular copyright. Beyond these reusable standard licences, governments develop their own licences to cover database rights, such as the Open Government Licence (OGL 3.0) or the Norwegian Licence for Open Government Data (NLOD). Recently, the Linux Foundation published its ‘Community Data License Agreement’ to address database rights.
Even if all licences are open, they may prevent users from mixing data under multiple licences because they are incompatible with each other. Licences are compatible if people can combine works and distribute them under one of these licences, or a third compatible licence. Licences must be at least ‘one-way compatible’. This means that it must be possible to provide a combined work at least under the terms of the more restrictive licence. The simplified graphic below shows an example of one-way compatibility, using databases licensed under CC BY and CC BY-SA.
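The one-way relationship between CC BY and CC BY-SA can also be sketched in a few lines of Python. This is a minimal illustration, not a legal tool: the table `REDISTRIBUTION_OPTIONS`, the function `licences_for_combined_work`, and the entry "Hypothetical Custom Licence" are all assumptions made for this example. Each licence is mapped to the licences under which a work containing it may be redistributed, and a combined work may only use a licence that every source allows.

```python
# Simplified, illustrative model (an assumption, not legal advice):
# each licence maps to the licences under which a work that includes
# material under it may be redistributed.
REDISTRIBUTION_OPTIONS = {
    "CC BY 4.0": {"CC BY 4.0", "CC BY-SA 4.0"},   # permissive
    "CC BY-SA 4.0": {"CC BY-SA 4.0"},             # copyleft
    # Hypothetical custom licence with no compatibility clause.
    "Hypothetical Custom Licence": {"Hypothetical Custom Licence"},
}


def licences_for_combined_work(*source_licences):
    """Return the licences acceptable to every source in a combined work."""
    options = [REDISTRIBUTION_OPTIONS[name] for name in source_licences]
    return set.intersection(*options) if options else set()


# CC BY material can flow into a CC BY-SA work, but not the other way round.
print(licences_for_combined_work("CC BY 4.0", "CC BY-SA 4.0"))

# An empty result signals a legal silo: the sources cannot be combined.
print(licences_for_combined_work("CC BY-SA 4.0", "Hypothetical Custom Licence"))
```

In this sketch the first combination can only be distributed under the more restrictive CC BY-SA licence, while the second has no common licence at all.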
Governments need to be very careful not to create data use silos. We speak of silos whenever a data system is not compatible or integrated with another data system. Use silos arise legally if it is not possible to combine data from different sources, due to incompatible licensing. This problem becomes even more important as governments turn towards supporting the free flow of public sector information such as with Europe’s Digital Single Market strategy.6 Licence compatibility and maximum simplicity of licences are paramount for creating effective data markets, economic growth through data, cross-border data sharing, and reuse of government data by civil society.
Open Knowledge International strongly discourages governments from creating new licences to avoid complicating the licence ecosystem further. Fundamental differences and incompatibilities exist both between permissive and copyleft licences, and between different copyleft licences. For example, OpenStreetMap contributors often deal with and discuss incompatibility between CC BY-SA and the ODbL.7
New licences do not address fundamental incompatibility problems across already existing and widely used open licences. Instead, they are likely to complicate the licensing ecosystem because they usually do not replace existing open licences. Furthermore, new licences cannot address the problem of international incompatibility of open licences. Open licences operate within specific environments of intellectual property protection. Depending on varying laws, open licences can have differing degrees of applicability across countries.
Licence proliferation occurs every time a new open licence is created. First acknowledged by the open source software movement8, licence proliferation presents a major problem for the reuse of data. The following five issues of licence proliferation are especially problematic for open government data:
To address licence proliferation it is important to understand the difference between reusable standard licences and custom licences. As explained in the following section, we strongly encourage governments to use standard legal solutions that are reusable, in particular CC0 and CC-BY 4.0.
Reusable standard licences lay out terms that can apply to any licensor and licensee. They are not the result of individual negotiations between a rightsholder and licensee. If rights holders publish their works under the same reusable standard terms, they allow users to combine, modify, and distribute these works together. This helps, for example, to add value by combining different databases.
Creative Commons and Open Data Commons9 have developed standard licences that are publicly available and may be reused by any licensor without any modifications. As a non-profit supporting others to legally share their works, Creative Commons developed several types of copyright licences with the goal to simplify and standardise the licence ecosystem. Open Data Commons is an initiative co-initiated by Open Knowledge International focusing on developing and maintaining legal tools to open data.
Creative Commons and Open Data Commons both store their licences at a permanent URI and have a community that helps update them. The Open Definition endorses four reusable standard licences, namely CC-BY, CC-BY-SA, ODbL, and ODC-BY, as well as two tools for public domain dedication, the Open Data Commons Public Domain Dedication Licence (PDDL) and Creative Commons Zero (CC0). It should be noted that CC0 and CC BY 4.0 are possibly the most commonly used legal tools10 and are best suited for public sector information.
Custom licences (also referred to as ‘bespoke’ licences) are specifically tailored to individual public sector bodies and their relevant copyright. Tailoring licences to certain government bodies can cause a situation where other public sector bodies cannot reuse this licence but need to adopt a different licence.11 Findings from the most recent Global Open Data Index show that governments continue to create custom licences. In a sample of twenty countries, we analysed data available from the Global Open Data Index and documented the terms of more than 50 unique licence texts.12
We extracted the names of open licences and grouped them under categories for the most common licences. This enabled us to find licence clusters, showing that the majority of governments use national open data licences.13
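As an illustration of this kind of grouping, the following minimal Python sketch assigns licence names to categories by keyword and counts the resulting clusters. The keyword rules, the category labels, and the sample list are assumptions made for this example; they are not the report's actual method or data.

```python
from collections import Counter

# Hypothetical keyword rules; the categories used in the actual
# analysis are not reproduced here.
CATEGORY_KEYWORDS = [
    ("Creative Commons", ("creative commons", "cc-by", "cc by", "cc0")),
    ("Open Data Commons", ("open data commons", "odbl", "odc-by", "pddl")),
    ("National open data licence", ("government", "licence ouverte", "nlod")),
]


def categorise(licence_name):
    """Return the first category whose keywords match the licence name."""
    lowered = licence_name.lower()
    for category, keywords in CATEGORY_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return category
    return "Other"


# Illustrative sample of licence names, not data from the Index.
sample = [
    "CC-BY 4.0",
    "Open Government Licence 3.0",
    "NLOD",
    "ODbL 1.0",
    "Licence Ouverte",
]

clusters = Counter(categorise(name) for name in sample)
for category, count in clusters.most_common():
    print(category, count)
```

Simple keyword matching like this is only a first pass; names that match no rule land in "Other" and would need manual review.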
In order to address licence proliferation, Open Knowledge International strongly recommends the following actions.
The licensing process within government includes rights clearance, policy development, and the application of licences.16 A key issue for open licensing is a lack of central coordination to harmonise licensing choices. The table below shows different elements of the licensing process and where central coordination is lacking. In the following sections we discuss these further and present some ways to harmonise licences.
| Legal context | Policy tools | Governance mechanisms | Individual licensing |
|---|---|---|---|
| Legal context defines what works fall under copyright or similar protections. | Policy tools outline commitments, rules and responsibilities to open up data. | Governance mechanisms are used to supervise and support the adoption of open licences. | The moment where data is published online under an open licence. |
| Protection of works may be ambivalent. | Tools range from law, policies and decrees to executive orders. | May include the appointment of a task force, a committee, or an agency to support governments with the adoption of open licences. | Happens on multiple government levels (federal, state, regions, municipalities). |
| Clarification of intellectual property rights, as well as cultural heritage rights, personal data, confidential data and third-party rights. | Legal power of tools varies depending on provisions and to whom a tool legally applies. | Support can include reviews as to why agencies apply specific open licences. Education and training programs are also applied. | Depending on the design of policy tools and governance mechanisms (see left), governments are free to choose where to upload data and under what conditions. |
Open licensing starts with the interpretation of copyright law and similar protections. Copyright law can apply differently depending on the administrative level of government and the type of public sector body. For example, in the United States federal government data is by default in the public domain18, whilst state-level government data is protected by copyright.19 Copyright may also apply differently depending on the type of public sector body. For example, in the United Kingdom, Crown Copyright applies to government agencies which have ‘Crown status’. Other public sector bodies that do not fall under Crown Copyright would possibly not be able to apply the same custom open licences that government bodies use. The Open Government Licence 2.0 and 3.0 address this issue by being applicable to rights beyond Crown Copyright.
Legal certainty: Jurisdictions can apply and clarify protections very differently. Sometimes the law clarifies whether public sector information is protected; in other cases an intellectual property office may be able to give answers. In other jurisdictions, however, the protection status of government information is unclear. A high degree of legal uncertainty prevents government and public sector bodies from putting their data in the open.20 Copyright reform can be the strongest mechanism to give legal certainty and to make government data legally ‘open by default’. For instance, central government may define that databases are exempt from copyright and similar rights. Copyright reform, ideally coordinated across countries, can help to overcome the complexity of differing copyright exceptions and limitations.21
Governments can develop different policy tools to mandate licensing, ranging from laws to decrees, executive orders, and policies. Through policy development, governments can exercise different degrees of control over the licences that get implemented. But not all policy tools ensure a consistent uptake of open licences.
This can partly be explained by vague terminology used in licence texts. We found provisions requiring data to be made re-usable by applying a ‘widely used licence’. Such vague language makes licences a discretionary choice of individual government bodies. Clear language should be used, requiring the use of a specific licence, possibly also covering its updated licence versions. Alternatively, policies can mandate a location to store data, such as data.gov repositories. This can be complemented with another policy clearly defining the terms of this repository. Licence standards and licensing frameworks may also help to support a harmonised uptake of open licences (see box below).
Licensing frameworks are a noteworthy approach to harmonising the use of open licences.22 These frameworks offer legal guidance material for government agencies as well as information material to support government agencies making better informed decisions about how to license data. Examples include New Zealand and Australia. The mission of New Zealand’s Government Open Access and Licensing Framework (NZGOAL) is
“to give guidance for agencies to follow when releasing copyright works and non-copyright material for re-use by others. It aims to standardise the licensing of government copyright works for reuse using Creative Commons licences and recommends statements for non-copyright material.”
NZGOAL does so 1) by setting out principles for open licensing and open access, 2) by actively promoting the use of standard (Creative Commons) licences and “no known rights” statements wherever copyright is unclear, and 3) by providing a review process for how to license works, including online training material. NZGOAL also clearly explains why New Zealand specifically endorses the Creative Commons Attribution 4.0 licence, thereby adding further clarification.
NZGOAL was born out of the problem that different agencies in New Zealand interpreted Crown Copyright differently. Additional legal guidance was needed to understand which copyrights these agencies hold, and how they can open up information. By promoting standard open licences and standard licensing statements, the framework aims to harmonise the use of common licences and reduce ambiguities within agencies.
Enforceability of policy tools may differ, too. For example, executive orders may be developed and implemented faster, but will only affect bodies associated with the issuing agency (such as a ministerial order). It is recommended to align policies and open licences across government branches and bodies. For example, an inter-ministerial committee can be appointed to support more coordinated policy development. In addition, central authorities may be appointed to review licensing decisions of other government bodies. This supervision process can steer licensing choices. Yet it does not entirely resolve the risk of developing and using incompatible licences.
Ultimately, every licensor decides which open licences to apply, as well as where to publish them and in what form. If governments see a need to create custom open licences, they should ensure compatibility with existing open licences. Their licences should be reader-friendly and provide legal certainty. Creating a custom licence requires a careful balance across these three aspects. If a licence is reader-friendly and understandable but conflicts with standard open licences, governments significantly hamper the usability of data. Likewise, if a licence is compatible with standard open licences but contains vague language, it may deter users.
Based on interviews and a close reading of licence texts, we identified a list of arguments brought forward by governments for creating a custom licence. Not all of these customisations are legally required to address copyright issues. Here we explain how these arguments can be addressed without harming compatibility with standard licences.
Branding and attribution: Custom licences can be intended to give confidence to government agencies if they carry the name of a country or are called ‘government’ licences. Yet, creating custom licences only for branding reasons is unnecessary because standard open licences allow governments to add their custom attribution clauses.
Ensuring clarity of a licence text: The Norwegian Licence for Open Government Data (NLOD) contains clarifying comments. Since these comments are part of the licence text, they can be interpreted as legally binding. This can cause compatibility issues with the terms of existing open licences. Clarifying comments should instead be added as notices, outside of a licence text, ideally stored at a permanent address.
A notice is a document that clarifies applicable rules. In contrast to a licence, it does not grant rights or permissions to data users. Notices have the purpose to inform about legal context. This makes them especially helpful when governments want to add explaining comments to their licence terms. Notices are also the best way to clarify the absence of protections on data (public domain). There are some readily available standard solutions, such as the Creative Commons Public Domain Mark.
Use restrictions to prevent illegal use of data: Governments may add restrictions on how data can be used, sometimes written in an overly broad and vague manner. An example is a licence that was in use until recently23 by the New Zealand Companies Office. The licence states that
These terms raise several questions. What does accurate mean? When is the use of data deceptive or misleading? Another lengthy example of a Belgian licence states that:
There are multiple arguments against using these kinds of restrictive clauses. Firstly, open licences are not meant to include references to civil law (direct ones or indirect ones as in the example above). An open licence must only clarify use rights of data, based on intellectual property rights. They never supersede civil law. Secondly, restrictions can conflict with the nine principles of openness outlined by the Open Definition, increase legal uncertainty, and make licences incompatible.
How can governments ensure and communicate compatibility between custom licences and standard open licences? First and foremost licence compatibility requires governments to ensure that their licence terms align with standard open licences. Even if governments create custom attribution licences, they can contain provisions that hinder the licensing of derivative works.
The LAPSI Licensing Interoperability report24 discusses “micro-elements” in licences, and how critical they are in order to ensure compatibility across licences. These include, among others, the parties of a licence, licence issuers, licence application, the rights granted, conditions, as well as disclaimers. The report discusses each of these elements and how governments may ensure that every one of these is compatible with reusable standard licences such as CC BY 4.0 or the public domain dedication CC0.
Compatibility clauses: Governments sometimes add compatibility clauses to custom licences. Some licences meticulously list different licences and licence versions, while others only include vague compatibility statements. For example, the French Licence Ouverte includes a compatibility statement saying that:
Taiwan’s Open Government Data License Taiwan 1.0 (OGDL-TW 1.0)25 provides excellent compatibility by using a ‘one-way transition clause’. This clause ensures that users are able to combine works licensed under OGDL-TW 1.0 with a CC-BY 4.0 licence and automatically comply with the terms of OGDL-TW 1.0. The licence text states:
Other licences leave more room for interpretation about what licences they are compatible with. Precision is necessary as to what licences are exactly compatible with a custom licence. Governments might state compatibility with multiple licences which are not compatible among each other.26
I especially thank Sander van der Waal (Open Knowledge International) for his useful advice throughout the writing process of this report, as well as Vitor Baptista (Open Knowledge International) for supporting the analysis of open licence texts. My special thanks go to Aaron Wolf for his support in editing the paper.
Furthermore, I would like to register my gratitude to the following people whom I interviewed, consulted or otherwise drew inspiration from in this project.
Nikesh Balami, Open Knowledge Nepal
Allison O’Beirne, Treasury Board of Canada Secretariat
Livar Bergheim, Agency for Public Management and eGovernment (Difi), Norway
Leigh Dodds, Open Data Institute
Stephen Gates, Open Data Institute Queensland
Bart Hanssens, FOD/SPF Bosa - DG Digital Transformation
Augusto Herrmann Batista, Departamento de Governo Digital, Brazil
Anne Kauhanen-Simanainen, Ministry of Finance, Finland
Suzanne McLaughlin, Ministry of Finance, Northern Ireland
Mike Linksvayer, Open Definition Advisory Council
Andrea Nelson Mauro, Dataninja.it
Walter Palmetshofer, Open Knowledge Foundation Germany
Diane Peters, Creative Commons
Teemu Ropponen, Open Knowledge Finland
Masahiko Shoji, Open Knowledge Japan
Audrey Tang, Digital Minister, Taiwan
Tarmo Toikkanen, Open Knowledge Finland
Martine Trznadel, Agence pour la simplification administrative, Belgium
Luis Villa, Open Definition Advisory Council
Tomoaki Watanabe, Open Knowledge Japan
Aaron Wolf, Open Definition Advisory Council
Enrique Zapata, Government of Mexico
Creative Commons, ‘Public domain guidelines’, http://bit.ly/2j8CJzF
Creative Commons, ‘State Department Publishes Open Licensing “Playbook” for Federal Agencies’, http://bit.ly/2jHbpGK
Dataverse community norms, http://bit.ly/2iDLfmb
European Commission, ‘Building a European data economy’, http://bit.ly/2hjwQuB
European Commission, ‘European legislation on re-use of public sector information’, http://bit.ly/2AYIyGK
European Parliament and European Council, ‘Directive 96/9/EC on the legal protection of databases’, http://bit.ly/1nHYqmK
Ilaria Buri, ‘Accessing and Licensing Government Data under Open Access Conditions’, Creative Commons Netherlands, http://bit.ly/2iFF1m5
Katherine Zimmerman, ‘U.S. copyright on a state level’, Office for Scholarly Communication, Harvard Library, http://bit.ly/1XyJWWd
Legal aspects of public sector information (LAPSI), ‘Licensing Guidelines’, http://bit.ly/2A7ab0G
Legal aspects of public sector information (LAPSI), ‘Licence interoperability report’, http://bit.ly/2A7ab0G
New Zealand Government Open Access and Licensing Framework, ‘Guidance and Resources’, http://bit.ly/1zZi8PQ
Open Data Commons, ‘Licenses’, http://bit.ly/2j9MopI
Open Knowledge International, Open Definition 2.1, http://bit.ly/1SknmNZ
Open Knowledge International, ‘Open Definition Licence Approval Process’, http://bit.ly/2y86wtL
Open Source Initiative, ‘Report of License Proliferation Committee’, http://bit.ly/2BXzLCK
OpenStreetMap Foundation, ‘Why CC BY-SA is unsuitable’, http://bit.ly/2kwK1O0
Po-yu Tseng, Mei-chun Lee, ‘Taiwan Open Government Data Report’, Open Culture Foundation, http://bit.ly/2A5i4Qg
The National Archives, ‘UK Crown Bodies’, http://bit.ly/2uyB85T
U.S. Copyright Law Revision, http://bit.ly/2BbY7vf
World Intellectual Property Organisation, ‘Scoping Study on Copyright and related rights and the public domain’, http://bit.ly/2iYygiX