Algorithms
Author: Aline Iramina
Illustration: Ilaria Urbinati
Algorithms play an increasing role in creative production and consumption. An algorithm is nothing more than a method for solving a specific problem or completing a task, which can include a basic set of rules/instructions as well as machine learning and more cutting-edge artificial intelligence (AI) code.
There are many examples of the production of creative goods and services involving the use of algorithms. AI applications that have been developed are capable of creating artistic, musical, and literary works (such as paintings, music, poems and news articles) with very limited or even no human intervention in the final product. Researchers increasingly rely on data mining algorithms to handle a large amount of data, which may include copyright works, and to be able to extract patterns, trends and associations and gain insightful knowledge. Creators of video games also employ a broad set of algorithms to respond to players’ inputs. They use algorithms not only to mimic simple human behaviours, for example, the ghosts’ movements in the Pac-Man game, but also to formulate more complex and sophisticated behaviours, such as the life simulation of the characters in The Sims. Moreover, music streaming services such as Spotify and Amazon Music employ algorithmic systems to recommend artists, bands, and new songs to their listeners. In a similar way, user-generated content platforms, such as YouTube and TikTok, use algorithmic systems to recommend as well as to moderate the content that circulates on their platforms (as further explained in the Copyright User UGC page).
Even though algorithms are not something new, given their increasing significance in the creative industries, it is important that users have a better understanding of how to deal with their influence on the production, distribution and consumption of creative works and the implications for copyright.
Algorithms and Copyright
Copyright law in the UK, as elsewhere, does not protect ideas, methods, or procedures. However, the expression of an idea can be protected if it is original and falls into one of the categories of protected works. Therefore, for an algorithm to be eligible for copyright protection, programmers need to express it as source code, which, as a fundamental component of a computer program (software), can be protected as a literary work by copyright law. Copyright protection prevents others from copying the code without the copyright owner’s permission, but not from writing different code that performs the same function.
One of the most common categories of algorithms used in computer programming is collaborative filtering, which is normally employed in the recommendation systems used by music streaming services. These algorithms provide personalised music experiences to users based on the opinions of other users who share similar characteristics and music preferences. They analyse past user behaviour in order to establish connections between users and songs and to predict which songs or playlists each user would like.
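To make the idea of collaborative filtering more concrete, the following is a minimal sketch in Python, not the method of any particular streaming service. The user names, song names and ratings are invented for illustration, and the choice of cosine similarity as the measure of how alike two listeners are is an assumption; real systems are far larger and more sophisticated.

```python
from math import sqrt

# Hypothetical listening data: user -> {song: rating from 1 to 5}.
ratings = {
    "ana":   {"song_a": 5, "song_b": 4, "song_c": 1},
    "ben":   {"song_a": 4, "song_b": 5, "song_d": 5},
    "chloe": {"song_c": 5, "song_e": 4},
}

def cosine_similarity(a, b):
    """Cosine similarity of two users' rating vectors (unrated songs count as zero)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[s] * b[s] for s in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=3):
    """Rank songs the user has not rated, weighting other users' ratings by similarity."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for song, rating in other_ratings.items():
            if song in ratings[user]:
                continue  # only suggest songs the user has not rated yet
            scores[song] = scores.get(song, 0.0) + sim * rating
            weights[song] = weights.get(song, 0.0) + sim
    predicted = {song: scores[song] / weights[song] for song in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

print(recommend("ana", ratings))
```

The copyright point made above applies to this sketch too: the particular code could be protected as a literary work, but the underlying method of comparing listeners and weighting their ratings could be re-implemented by anyone in different code.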
In some jurisdictions, programmers can also protect algorithms through patents. In this case, they need to connect the algorithms and their frameworks with some industrial or technical application. This means that, although mathematical algorithms per se are not patentable, inventions encompassing mathematical methods, along with computer programs, are patentable as long as they make a technical contribution. However, some producers might prefer to keep their algorithms secret, protecting them as trade secrets.
As an intellectual property asset, algorithms are usually proprietary, which means programmers or the companies they work for reserve the rights to use, modify or share these algorithms. Algorithms are rarely deployed alone, though: they are often incorporated in AI tools or software. These AI tools can also be licensed as open source software. In these cases, not only the algorithms but also their core statistical models are made open source. When developing software, for example, programmers usually incorporate the algorithms (the set of instructions) as actual code. If the software is released as open source software, where the underlying code is publicly available for anyone to access, use or modify without restrictions, the embedded algorithms are also available to be used freely by any programmer, data scientist, or anyone else who wishes to make use of them.
Licences
There are different types of software licences. The most common are public domain, permissive, copyleft, and proprietary licences. These licences make clear what people may or may not do with source code. The first three are free open source software (FOSS) licences, which are free of cost, but not necessarily free to use:
- Public domain licences allow anyone to modify, distribute and use the software without any copyright restrictions.
- Permissive licences have very limited restrictions on how the software can be modified and redistributed (for example, an obligation to acknowledge the creator’s name). Unlike copyleft licences, permissive licences allow proprietary derivative works, which means that programmers who used software under a permissive licence to develop a new program can distribute this new software through a paid licence. Some popular permissive licences are the Apache Licence, the BSD Licence, and the MIT Licence.
- Copyleft licences are reciprocal licences, which allow anyone to modify the licensed software and distribute new works based on it, provided that they distribute any derivative work (for example, an improved version of the software) under the same licence as the original. For software, the GNU General Public Licence (GPL) is the most popular copyleft licence.
If programmers wish to develop an open source project, they must choose a FOSS licence and state it clearly in their project. Some examples of open source programs are the Linux operating system, the Firefox browser, Android by Google, and the WordPress content management system.
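One common way of making the chosen licence clear, alongside a licence file in the project, is to add a short machine-readable header to each source file. The sketch below assumes the widely used SPDX convention, which the discussion above does not mention; the MIT identifier in the header, the helper function name and the example text are purely illustrative.

```python
# SPDX-License-Identifier: MIT
# (Hypothetical project file; the MIT licence is chosen only for illustration.)

import re

# Matches an SPDX header line and captures the licence identifier after it.
SPDX_PATTERN = re.compile(r"SPDX-License-Identifier:\s*([\w.+-]+)")

def declared_licence(source_text):
    """Return the licence identifier declared in a file's SPDX header, or None."""
    match = SPDX_PATTERN.search(source_text)
    return match.group(1) if match else None

# Illustrative input: the first lines of a file released under a copyleft licence.
example = "# SPDX-License-Identifier: GPL-3.0-or-later\nprint('hello')\n"
print(declared_licence(example))
```

A header like this only signals which licence applies; the rights and obligations themselves come from the full licence text distributed with the project.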
Proprietary licences, on the other hand, ensure that all rights are reserved to the programmer, to the company that developed the software or to anyone else who owns rights in the software. Rightsholders can prevent others from using, modifying or redistributing their software without their permission. Microsoft Windows, macOS, iTunes, and Adobe Photoshop are examples of proprietary software.
Input
Data is the fuel of AI algorithmic systems. Machine learning algorithms demand a large amount of data as input to their learning process. For example, in the ‘Next Rembrandt’ project, experts from different fields used deep learning and face recognition algorithms to produce a brand-new painting in the style of the famous Dutch painter, based on data scanned from 168,000 fragments of his works. All of Rembrandt’s paintings were in the public domain, so, from a legal perspective, there was no copyright attached to them and, in this case, they could be used without permission. However, in most cases the amount of data necessary to train the algorithms is enormous and may encompass thousands of protected works from many different owners. Requesting permission from so many rightsholders can be very challenging and, sometimes, even impracticable. How can users deal with this situation?
In the UK, users can rely on different copyright exceptions to use copyright material without previous authorisation. For example, there are fair dealing exceptions for quotation, criticism or review, and caricature, parody or pastiche, but their applicability to a large number of works is usually difficult to justify. Researchers can also rely on the copyright exception for text and data mining to analyse large amounts of copyright works they have lawful access to. However, this exception is limited to non-commercial research.
Still, big data held by corporations such as Netflix and Spotify is usually proprietary and gives them a considerable commercial advantage. For example, Netflix makes use of machine learning algorithms to provide personalised recommendations for its users. In order to feed and train these algorithms, Netflix collects a large amount of users’ data, such as data on subscribers’ actions, feedback and demographics. What many people do not realise is that all this data is not freely available for others to use. Big data and data analytics models are fundamental to the business models of Netflix and other streaming platforms. Therefore, besides having to comply with privacy and personal data protection rules, companies also often rely on copyright law, trade secrets law, and even patents or contractual agreements to manage and protect their data.
In the UK, databases can be protected by copyright law and the sui generis database right when the selection or arrangement of the material in a database is original or when there is a substantial investment in obtaining, verifying, or presenting the content of the database. While copyright protects the selection or arrangement of the data from copying, the sui generis right protects the data itself from substantial extraction or reutilisation. In the latter case, the copyright exception for text and data analysis does not even apply. All these multiple layers of data protection can be daunting for users and hamper research and innovation in many fields, including AI. This is one of the reasons why so many governments and academic and research institutions are encouraging open science initiatives.
The main objective of the open science movement is to maximise the dissemination of knowledge in a more transparent and collaborative way. It includes initiatives such as open source software and open access to content and information. Many governmental agencies and academic and research institutions adopt open access policies and practices and make their resources freely available to use. The UK Data Service, for example, is open access and some of its collections are open data. For some of these open data collections, the data service partnership uses the UK Open Government Licence or Creative Commons (CC) Licences. The open access initiative in the Netherlands aims to grant free and open online access to academic data and publications. Google Scholar, Internet Archive, and Wikimedia Commons are examples of open access databases.
Output
In addition to using copyright works as input, AI algorithms can also produce works that in principle may attract copyright protection, such as music, video games, literary and artistic works. One of the most successful works generated by AI is the Portrait of Edmond Belamy, which was sold for $432,500 in an auction at Christie’s in 2018. To create this painting, Obvious AI & Art, a collective of artists working with AI, employed what they call the ‘generative adversarial network’ (GAN) method and fed their algorithmic systems with a data set of 15,000 portraits painted between the 14th and 20th century. One of their main goals with this project was to prove that algorithms are able to emulate creativity. However, the creation of AI generated works has led to questions about the originality and creativity of this kind of work and whether it should be protected by copyright law. In the US, the Copyright Office twice denied Stephen Thaler’s request for copyright protection of an artwork generated by AI, arguing that the applicant ‘had provided no evidence of sufficient creative input or intervention by a human author in the work’. In its decision, besides rejecting the applicant’s request to apply the ‘work for hire doctrine’ to AI generated works, the Review Board of the United States Copyright Office concluded that ‘Office policy and practice makes human authorship a prerequisite for copyright protection’.
Most theories that justify copyright protection did not anticipate creation by a non-human entity. In many jurisdictions, it is not clear whether works created by machines can be protected by copyright, since the law often limits authorship to natural persons. UK copyright law, though, does provide protection for computer-generated works without a human author. In this case, the person who made the ‘arrangements necessary’ for the work to be created is considered the author. This person has economic rights over the work, but no moral rights, which means that, as the owner, they can control the use of the work for a certain period of time, but they do not have the rights of attribution and integrity in respect of the AI creation. Usually, the person who programmed and taught the computer to write, paint or compose music can claim authorship of the computer-generated work. However, this should be addressed on a case-by-case basis. Under certain circumstances, the user may be able to claim authorship.
In the English case Nova Productions v Mazooma Games (2006), which involved a potential copyright infringement in the graphics and frames generated and displayed on a screen by users when playing a computer game, the court held that the player could not be the author of the artistic works created in the successive frame images, since the player’s input ‘is not artistic in nature (…), and he has contributed no skill or labour of an artistic kind’. According to Justice Kitchin, the only thing the player did was to play the game. In this case, since the programmer was the person responsible for making the arrangements necessary for the creation of the work, the court decided that the programmer should be considered the author and, consequently, the copyright owner. However, if the court had addressed the same question in relation to more interactive games such as Minecraft, the outcome might have been different. As AI technologies progress, it will be more difficult to determine which person made the ‘arrangements necessary for the creation of the work’, since the direct role of human beings in the algorithmic creation will be less and less clear.
There are many legislative proposals under discussion in various jurisdictions to regulate the use of algorithms, especially by online platforms. In the coming years, new rules and obligations involving algorithms are expected, which will probably have an impact on the copyright system and, consequently, on copyright users.
Legal Language
These are quotes from a legal case, where the judge explains who should be considered the author of computer-generated works under UK copyright law.
Nova Productions v Mazooma Games [2006] EWHC 24 (Ch)
Justice Kitchin [105]: ‘In so far as each composite frame is a computer-generated work then the arrangements necessary for the creation of the work were undertaken by Mr Jones because he devised the appearance of the various elements of the game and the rules and logic by which each frame is generated and he wrote the relevant computer program. In these circumstances I am satisfied that Mr Jones is the person by whom the arrangements necessary for the creation of the works were undertaken and therefore is deemed to be the author by virtue of s.9(3)’
[106] ‘Before leaving this topic there is one further complexity I must consider and that is the effect of player input. The appearance of any particular screen depends to some extent on the way the game is being played. For example, when the rotary knob is turned the cue rotates around the cue ball. Similarly, the power of the shot is affected by the precise moment the player chooses to press the play button. The player is not, however, an author of any of the artistic works created in the successive frame images. His input is not artistic in nature and he has contributed no skill or labour of an artistic kind. Nor has he undertaken any of the arrangements necessary for the creation of the frame images. All he has done is to play the game.’
Legal References
EU Directives
UK Copyright, Designs and Patents Act 1988
Section 3: Literary, dramatic and musical works – Section 3(1)(b) provides for copyright protection for computer programs
Section 9: Authorship of a work – Section 9(3) provides for copyright protection for computer-generated works
Sections 28A to 31: Acts permitted in relation to copyright works (General) – Section 29A provides for the exception for “text and data analysis for non-commercial research”
Section 178: Minor definitions – Section 178 defines a computer-generated work as one that is “generated by computer in circumstances such that there is no human author of the work”
The Copyright and Rights in Databases Regulations 1997
Regulation 20: Exceptions to database rights
U.S. Copyright Review Board opinion on ‘Second Request for Reconsideration for Refusal to Register a Recent Entrance to Paradise (Correspondence ID 1-3ZPC6C3; SR # 1-7100387071)’: https://www.copyright.gov/rulings-filings/review-board/docs/a-recent-entrance-to-paradise.pdf
The Law of Data Scraping: A review of UK law on text and data mining (2021) by Sheona Burrow: https://zenodo.org/record/4635759#.YaOq0S3TV0t
Artificial Intelligence as Producer and Consumer of Copyright Works: Evaluating the Consequences of Algorithmic Creativity (2020) by Enrico Bonadio and Luke McDonagh: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3617197
Other references
Five ways that Open Source Software shapes AI policy by Alex Engler, Brookings: https://www.brookings.edu/blog/techtank/2021/08/18/five-ways-that-open-source-software-shapes-ai-policy/
Is artificial intelligence set to become art’s next medium? – https://www.christies.com/features/A-collaboration-between-two-artists-one-human-one-a-machine-9332-1.aspx
The ‘Next Rembrandt’ Project – https://www.nextrembrandt.com