AI / Machine Learning / Big Data, Government activities | May 24, 2024

EU Report Highlights Persistent GDPR Non-Compliance Issues with ChatGPT

The European Data Protection Board (EDPB) recently published a progress report from its ChatGPT Taskforce assessing OpenAI’s ChatGPT, revealing significant challenges in complying with the European Union’s General Data Protection Regulation (GDPR). Despite efforts by OpenAI, the report underscores that the chatbot still fails to meet the data accuracy standard required by the GDPR.

The EDPB’s report highlights that although OpenAI has made some strides toward transparency, these efforts fall short of meeting the rigorous data accuracy requirements of the GDPR. The primary concern is that ChatGPT, due to its probabilistic nature, often produces outputs that are inaccurate, biased, or entirely fabricated. This poses a substantial risk as users might mistakenly accept these outputs as factually correct.

The report explicitly states: “Although the measures taken in order to comply with the transparency principle are beneficial to avoid misinterpretation of the output of ChatGPT, they are not sufficient to comply with the data accuracy principle.” This finding is particularly troubling given the scale involved: the model is trained on a vast dataset containing billions of data points and is reported to have around a trillion parameters, so manually verifying the accuracy of its outputs is not practical.

The EDPB’s taskforce was established in response to concerns raised by national regulators, including Italy’s data protection authority, which temporarily banned ChatGPT in March 2023. Although OpenAI addressed enough of these concerns for the ban to be lifted in April 2023, the chatbot remains under scrutiny. Various national privacy watchdogs, including those in Germany and Spain, continue to investigate potential GDPR breaches by ChatGPT, focusing in particular on the lawfulness of its data collection methods, transparency, and data accuracy.

The core compliance issue lies in how ChatGPT is trained. The EDPB report explains that the purpose of the training is not to make the system supply factually accurate information, and that its probabilistic nature yields a model that can generate biased or fabricated content. This is compounded by the likelihood that users will treat the AI’s outputs as accurate, regardless of their veracity.

This problem is exacerbated by the system’s reliance on web scraping for data collection. The process gathers vast amounts of publicly available data, including personal information, which may not always be accurate or up to date. The EDPB emphasizes that the technical difficulty of ensuring accuracy cannot excuse non-compliance with GDPR requirements.

In addition to data accuracy, the EDPB report raises concerns about fairness and transparency. The principle of fairness under the GDPR requires that personal data not be processed in a manner that is detrimental, discriminatory, or misleading to the data subject. OpenAI’s current measures shift the burden of compliance onto users by telling them not to input personal data, which runs counter to the principle that the data controller is responsible for ensuring GDPR compliance.

Moreover, transparency obligations under GDPR require informing users clearly about how their data will be used. While OpenAI has improved its privacy policies and implemented some measures to enhance transparency, the EDPB report indicates these efforts are still insufficient, particularly regarding informing users that their interactions with ChatGPT may be used for training purposes.

The EDPB outlines that each stage of data processing, from collection to output generation, must meet GDPR standards. This includes ensuring that personal data used in training does not violate individuals’ privacy rights and that adequate safeguards are in place to protect against undue impact. The board suggests technical measures to filter out sensitive data and ensure data minimization as potential solutions.
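To make the idea of filtering and data minimization more concrete, here is a minimal sketch in Python of what such a step might look like before scraped text enters a training corpus. It is not OpenAI’s method and is not taken from the EDPB report: the function names, the two regular expressions, and the record layout are all hypothetical, and real detection of personal or sensitive data would require far more than pattern matching.

```python
import re

# Hypothetical patterns, for illustration only. Two regular expressions
# cannot catch every email address or phone number, let alone names,
# addresses, or special categories of personal data.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_personal_data(text: str) -> str:
    """Replace matched email addresses and phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


def minimize_record(record: dict, allowed_fields: set) -> dict:
    """Keep only the fields needed for the stated purpose (data minimization)
    and redact string values in the fields that are kept."""
    return {
        key: redact_personal_data(value) if isinstance(value, str) else value
        for key, value in record.items()
        if key in allowed_fields
    }


if __name__ == "__main__":
    # Assumed shape of a scraped record; the field names are made up.
    scraped = {
        "url": "https://example.org/a",
        "body": "Contact jane.doe@example.com or +39 06 1234 5678 today",
        "author_name": "Jane Doe",
    }
    print(minimize_record(scraped, allowed_fields={"url", "body"}))
```

In this sketch the `author_name` field is dropped entirely because it is not in the allowed set, while the contact details inside `body` are replaced with placeholders. Dropping fields that are not needed is the minimization step; redaction inside retained fields is the filtering step.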

Additionally, the report highlights the importance of maintaining robust mechanisms for users to exercise their rights under GDPR, such as access to data, rectification, and erasure. OpenAI has taken steps to facilitate these rights, but further improvements are necessary to make these processes more accessible and effective.