AI chatbots may be old news in tech circles, but the recent arrivals of OpenAI's ChatGPT and Google's Bard have made them dramatically more capable, and not always for the better. The rapid pace of AI development has raised concerns about misinformation, fabricated data, plagiarism, and machine-generated malware. As for the privacy risks generative AI poses to the average internet user, experts say those depend largely on how the bots are trained and how much we choose to interact with them.
To mimic human conversation, AI chatbots are trained on enormous quantities of data, much of it drawn from repositories like Common Crawl, which, as the name suggests, has spent years crawling and scraping the open web. Megha Srivastava, a PhD student in computer science at Stanford and a former AI researcher at Microsoft Research, explained that these models are trained on large sets of publicly available data from the internet.
Whether through your own carelessness or the poor security practices of a third party, your sensitive information could be sitting in some far-flung corner of the internet right now. Even though the average user would have a hard time finding it, that data may have been scraped into a training set and could be regurgitated by a chatbot later on. And a bot spitting out someone's real contact information is not merely a theoretical concern: Bloomberg columnist Dave Lee tweeted that ChatGPT shared his exact phone number after someone asked it to chat over the secure messaging app Signal. This kind of interaction is probably an edge case, but it's worth considering what information these learning models have access to. OpenAI is unlikely to be deliberately collecting things like individual health data to train its models, according to David Hoelzer, a member of the SANS Institute. "Could it possibly have been included unintentionally? Certainly."
OpenAI, the company behind ChatGPT, did not respond to our questions about how it safeguards data privacy or how it handles personally identifiable information that may have been scraped into its training sets. So we did the next best thing and asked ChatGPT itself. It told us that it is programmed to follow ethical and legal standards that protect users' privacy and personal information, and that it does not have access to personal data unless it is provided to it. Google, for its part, said it has built similar guardrails into Bard to prevent the sharing of personally identifiable information during conversations.
ChatGPT's answer, helpfully, pointed to another significant way generative AI can put privacy at risk: misuse of the software itself, whether through information leaked in chat logs or through device and user data the software collects as it runs. OpenAI's privacy policy lists several categories of standard user information the company collects, some of which could identify users, and ChatGPT warns on launch that conversations may be reviewed by its AI trainers to improve the systems.
Google's Bard, by contrast, does not have a standalone privacy policy; it is covered by the broad privacy document shared by other Google products. Users can opt out of having their Bard conversations saved to their Google account and can delete those conversations via Google, the company told Engadget. Rishi Jaitly, a distinguished humanities fellow and professor at Virginia Tech, told Engadget that earning and keeping users' trust will require full, upfront disclosure about privacy policies and data protection measures.
Although OpenAI offers a "clear conversations" function, using it does not actually delete your data, according to the service's FAQ page, nor can OpenAI delete specific prompts. While ChatGPT discourages users from sharing sensitive information, the only way to remove personally identifying details you've given the platform is to delete your account, a step the company says permanently removes all associated data.
Hoelzer told Engadget he isn't worried about ChatGPT ingesting individual conversations in order to learn. The real concern is where that conversation data is stored and how well it is secured. As it happens, ChatGPT was taken offline briefly in March after a programming bug revealed information about users' chat histories. It's unclear, this early in the technology's broad deployment, whether chat logs from these AIs will become attractive targets for malicious actors.
For the foreseeable future, it's wise to treat these chatbots with the same caution you would any other technology product. Srivastava told Engadget that anyone interacting with these models should assume that any exchange they have is fair game for OpenAI or other companies to use to their benefit.