France’s privacy watchdog, the Commission Nationale de l’Informatique et des Libertés (CNIL), is exploring ways to protect against data scraping in its new AI Action Plan. The plan focuses on understanding the impact of AI systems on people and on supporting innovative players in the AI ecosystem that follow the best practices the CNIL has outlined. Together with clear rules for protecting the personal data of European citizens, the CNIL hopes the plan will contribute to the development of privacy-friendly AI systems.
In the United States, tech leaders are calling for more AI regulation, but Europe has already taken steps in this direction. Over the past year, the bloc’s regulators have sanctioned several high-profile companies, including Clearview AI, and Italy’s Data Protection Authority (DPA) has taken action against Replika, an AI chatbot.
The EU is working to launch the AI Act, a risk-based framework for regulating applications of AI, before the end of 2023. The AI Act and existing regulations are prompting lawmakers and DPAs worldwide to build a better understanding of, and expertise in, AI.
The CNIL’s action plan includes a focus on protecting against web scraping of data for the purpose of designing AI tools. Web scraping, especially for large language models (LLMs), involves collecting publicly available data from the Internet to train the models. This poses a challenge under the General Data Protection Regulation (GDPR), which requires anyone processing personal data to have a legal basis for doing so, such as consent or legitimate interests.
OpenAI’s ChatGPT is currently under investigation by the Italian DPA over the lack of consent for collecting personal data from web users. In its action plan, the CNIL has also made the fairness and transparency of the data processing underlying AI tools a priority.
OpenAI’s CEO, Sam Altman, testified before the US Senate and suggested that a licensing and testing regime is necessary for AI technology. With the AI Act and other regulations, however, it is also essential for companies to protect the privacy of users’ data. OpenAI places strict limits on how it uses users’ information, but it has not addressed the question of the legality of the data used to train its models. As a result, OpenAI and similar companies face legal challenges, with enforcement coming from DPAs across Europe.