A Call for Opt-In Data Usage

The Dataset Providers Alliance Champions Ethical AI Data Licensing

Mackenzie Ferguson

Edited By

Mackenzie Ferguson

AI Tools Researcher & Implementation Consultant

The Dataset Providers Alliance is advocating for a fairer AI industry by pushing for an opt-in system for data usage. Comprising seven AI licensing companies, the group holds that creators and rights holders should control how their material is used. Its efforts aim to standardize licensing and compensation structures and to promote ethical data sourcing.

The first wave of major generative AI tools was trained on publicly available data scraped from the internet. However, sources of training data are now increasingly restricting access and pushing for licensing agreements. As demand for new data sources grows, licensing startups have emerged to ensure a continuous supply of training material.

The Dataset Providers Alliance (DPA), a trade group formed in the summer, aims to standardize AI data licensing and make the industry fairer. It recently released a position paper outlining its stance on key AI-related issues. The DPA comprises seven AI licensing companies, including music-copyright-management firm Rightsify, Japanese stock-photo marketplace Pixta, and generative-AI copyright-licensing startup Calliope Networks. More companies are expected to join the alliance in the fall.

One of the DPA's central tenets is an opt-in system, under which data can be used only after creators and rights holders give explicit consent. This approach marks a significant departure from the current industry norm, where some companies have adopted opt-out systems and others offer no opt-outs at all. The DPA believes that an opt-in system is not only more ethical but also a pragmatic way to avoid potential lawsuits and maintain credibility.

Alex Bestall, CEO of Rightsify and the Global Copyright Exchange, emphasizes the importance of the opt-in approach. According to Bestall, artists and creators should be actively involved in the process. He argues that selling publicly available datasets without proper licensing could lead to legal issues and damage the reputation of AI companies.

Ed Newton-Rex, a former AI executive and current head of the ethical AI nonprofit Fairly Trained, supports the DPA’s push for opt-ins. He finds opt-out systems unfair to creators, who may not even be aware that such options exist.

Shayne Longpre, lead at the Data Provenance Initiative, a collective that audits AI datasets, finds the DPA’s ethical sourcing efforts admirable. However, he notes that implementing an opt-in standard could be challenging because of the vast amount of data modern AI models require. According to Longpre, this could lead to data scarcity or higher costs, leaving opt-in licensing affordable only for large tech companies.

In their position paper, the DPA argues against government-mandated licensing, favoring a free-market approach where data originators and AI companies negotiate directly. They also propose five compensation structures to ensure fair payment for creators and rights holders, including subscription-based models, usage-based licensing, and outcome-based licensing tied to profits. These models could apply to various media types, from music to images and films.

Technologist Bill Rosenblatt sees the standardization of compensation structures as beneficial and notes that the DPA is well-positioned to set industry terms. According to Rosenblatt, AI companies need incentives to adopt data licensing, with legal risk being a major motivator. He also stresses that the licensing process must be convenient to encourage widespread adoption, a transition that standardized payment models can ease.

The DPA also supports the use of synthetic data (that is, AI-generated data) in training AI models. They believe synthetic data will soon constitute the majority of training datasets. The alliance advocates for proper licensing of the initial data used to generate synthetic data and calls for transparency about how synthetic data is created. They also recommend regular evaluations of synthetic data models to address biases and ethical concerns.

Despite these initiatives, the DPA faces the challenge of getting key industry players to adopt their standards. Ed Newton-Rex points out that while ethical licensing standards are emerging, not enough AI companies are currently embracing them. Nonetheless, the formation of the DPA signals the end of the unregulated 'Wild West' era in AI data sourcing.

Alex Bestall highlights the rapid changes occurring in the AI industry. The establishment of the DPA and its efforts to promote ethical data licensing reflect a significant shift towards more regulated and fair practices in AI training data procurement.
