OpenAI has introduced new tools to give users more control over their data, though they may fall short in some respects. As ChatGPT's popularity has surged, so have concerns that the system is trained on personal data scraped from the web. Several data-protection regulators are now investigating how that data was gathered, how accurate it is, and what risks it poses.
Europe's GDPR requires companies and organizations to have a lawful basis for handling personal data and to let individuals access, correct, or request the deletion of false information stored about them. ChatGPT, however, has been shown to fabricate information in some cases, inventing newspaper articles that were never published and making incorrect statements about people's professions.
Such concerns have led companies including Samsung to bar employees from using these tools, for fear that company secrets could be disclosed to other users. OpenAI has responded to the scrutiny with new tools and processes that give people more control over their data, a critical step toward protecting privacy and improving accuracy. Among them is the ability to have one's data deleted from OpenAI's systems.
OpenAI's large language models are trained on three sources of information: data scraped from the web, data licensed from other companies, and data provided by people using its chatbot services. The models themselves are statistical: they generate text by predicting which word is likely to follow the previous ones, based on patterns learned from millions of human-written examples.
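As a rough illustration of that idea, here is a minimal sketch of next-word prediction using a toy bigram model. The corpus, function name, and word-count approach are all hypothetical simplifications; real models like GPT use neural networks over subword tokens rather than simple word counts.

```python
import random
from collections import Counter, defaultdict

# Toy training corpus (hypothetical); real models learn from web-scale text.
corpus = (
    "the model predicts the next word . "
    "the model learns from examples . "
    "the next word follows the previous word ."
).split()

# Count how often each word follows each other word (a bigram model).
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Sample the next word in proportion to how often it followed `word`."""
    words, weights = zip(*follow_counts[word].items())
    return random.choices(words, weights=weights, k=1)[0]

# Generate a short continuation, one predicted word at a time.
text = ["the"]
for _ in range(6):
    text.append(predict_next(text[-1]))
print(" ".join(text))
```

The point of the sketch is simply that generation is driven by learned statistics over examples, which is also why anything users type into such a service can, in principle, become training data.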
OpenAI is taking measures to reduce the amount of data it gathers, and while that is an important step for the company, more work remains before people's privacy is fully protected and the company is fully GDPR compliant. Nonetheless, the tools and processes it has introduced go a long way toward letting users control their data while allowing OpenAI to keep its generative text models intact.