5 Reasons To Avoid Giving Online AI Tools Your Personal Information

You won't be surprised to learn that artificial intelligence has taken the world by storm. Although chatbots have existed since ELIZA's rule-based "chatterbot" made its debut in 1966, the 2020s have painted a completely different picture of what chatbots are capable of. Generative AI, powered by large language models, started to feature more prominently in the public consciousness in late 2022, with OpenAI's ChatGPT leading the charge.

With AI systems improving performance across a vast range of applications, it's only natural that users are starting to get comfortable with this technology. Recently, we've seen increased use of ChatGPT as a tool for self-therapy, with users telling the LLM intricate details of their daily lives, including personally identifiable information. Just as there are certain things you should never share on social media, chatting with AI systems should still be subject to "stranger danger" rules. Here is a non-exhaustive list of five reasons to avoid sharing your personal information with AI tools.

Leaks pose a threat to data security

There's a reason cybersecurity is one of the fastest-growing segments of the digital economy. According to the FBI, losses from suspected internet crime exceeded $16 billion in 2024, a 33% increase over 2023, with phishing and personal data breaches among the top three cybercrimes reported that year.

These risks, coupled with the fact that only 24% of generative AI initiatives are properly protected against cyberattacks, should sound alarm bells for users who share too much personal information with these chatbots. There are already high-profile examples: credentials for more than 100,000 ChatGPT accounts turned up on dark web marketplaces in 2023 after being harvested by infostealer malware, and a separate bug briefly took ChatGPT offline in March 2023 after exposing some users' chat titles and payment details.

Nor are these isolated incidents. In late 2023, researchers managed to extract the personal contact details of a group of New York Times reporters from an OpenAI model as part of an experiment. In our opinion, this history of leaks demonstrates the danger of sharing personally identifiable information with AI systems.

Deepfakes could facilitate imposter fraud

Sharing personal information with AI tools is dangerous, but not just because hackers could gain access to your credentials during a data leak. Users upload pictures of themselves to AI photo editing services all the time — to enhance the shot, anime-fy their profile picture, or even create a brand new image — all without employing the services of a professional.

While it might be convenient or even fun, this trend has exposed users to a whole new world of identity theft risks with the rise of deepfakes. Deepfakes are altered images or videos that display a subject saying or doing something they never did, and they are already advanced enough to fool casual observers. This technology makes it much easier, with the right sort of data, to commit imposter fraud, extortion, and other crimes that could tarnish a victim's public image.

Although records of deepfakes stemming directly from leaked personally identifiable information are few and far between, the technology's ability to conjure convincing images and voices poses a very real threat. Still, these models need plenty of audio and image data to generate new media of a subject, and for most of us who don't have an especially public online persona, that data is mainly going to come from leaked files.

Chatbots remember more than you think

One of the most enduring axioms of internet culture is that "the internet never forgets." Whether it's a simple Google search or an elaborate conversation with an AI-powered virtual assistant, users need to stay conscious of the fact that there will be a record of everything they do. Not everything stays online forever, as millions of dead torrents demonstrate, but chat histories linger longer than many users assume. Take ChatGPT as a case study: OpenAI claims that deleted chats are "hard deleted" from its systems within 30 days, but that only applies to users who want to delete their accounts from OpenAI's database entirely.

For users who keep their accounts, OpenAI states that specific prompts cannot be deleted from chat history. And according to a study from the University of North Carolina at Chapel Hill, deleting information from an LLM itself is possible, but verifying that the deletion actually happened is much more difficult.

With this in mind, the data retention capacity of AI tools should make users think twice before sharing personal information with them. If this information falls into the wrong hands — through company-wide leaks (which have happened in the past), targeted attacks from cybercriminals, or a simple change in company policy you disagree with — the consequences could be catastrophic.

Helping AIs train could damage your intellectual property

AI tools are powered by machine learning algorithms, which essentially cast a wide net across vast datasets to learn patterns and give relevant responses to user prompts. That learning process doesn't stop once users start interacting with a deployed model, either; at many companies, training machine learning algorithms on user information is considered fair game.
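To make the mechanics concrete, here is a minimal, purely hypothetical Python sketch of how stored conversations could be folded into a training set. The data layout, the allow_training opt-out flag, and the function names are illustrative assumptions, not any particular company's real pipeline.

```python
# Hypothetical sketch: turning stored user chats into fine-tuning examples.
# The "allow_training" flag and JSONL layout are illustrative assumptions.
import json

def build_training_examples(conversations, output_path="finetune_data.jsonl"):
    """Write prompt/completion pairs from stored chats to a JSONL file,
    skipping conversations whose owner opted out of training."""
    with open(output_path, "w", encoding="utf-8") as f:
        for convo in conversations:
            # Respect a per-user opt-out flag, where one exists.
            if not convo.get("allow_training", True):
                continue
            # Each user prompt and assistant reply becomes one example,
            # so anything typed into the chat can end up in this file.
            for turn in convo["turns"]:
                example = {"prompt": turn["user"], "completion": turn["assistant"]}
                f.write(json.dumps(example) + "\n")

# Example: a single stored conversation, including personal details the
# user volunteered, flows straight into the training file by default.
sample = [{
    "allow_training": True,
    "turns": [{"user": "My name is Jane Doe and I live at 12 Elm St. Help me write a lease dispute letter.",
               "assistant": "Here's a draft letter you can adapt..."}],
}]
build_training_examples(sample)
```

The point of the sketch is simply that, by default, whatever you type into a chat window can flow into files like this one unless an opt-out is offered and honored.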

LinkedIn, for instance, came under fire in late 2024 for training AI tools on user data. In response to the outrage, the company updated its privacy policy and now lets users opt out of data training with a toggle. That kind of opt-out has since become an industry-wide standard, but as with the data policies of most social media platforms, few users know what they're agreeing to or what they can actually opt out of.

LinkedIn is far from the only company that incorporates sensitive user data into its LLM training procedure; Meta's AI is partially trained on Facebook and Instagram posts, and Amazon uses data from customer interactions with Alexa to train its model. One ramification of this status quo is potential intellectual property infringement. Writers and artists with a unique creative style could suddenly find AI-generated material with striking similarities to their work being shared with millions of users. This has led to several class-action lawsuits, including one brought by artists against Stability AI, the maker of Stable Diffusion, early last year.

Biased AIs could use your personal information against you

Intellectual property infringement isn't the only new risk to contend with when sharing personal information with AI tools. The nature of the training process means that LLMs are prone to developing biases that can have a negative impact on their users and on others.

For example, Georgetown University's Center on Privacy and Technology estimated in 2016 that one in two American adults has their image stored in a law enforcement facial recognition network. According to prison statistics, African Americans make up nearly 40% of the incarcerated population. Because these skewed records are the data fed to such systems, it stands to reason that facial recognition algorithms will introduce some measure of bias when building the profile of a potential suspect, or when trying to match a textual description or a picture to a face.

Similar AI-driven screening is used to select candidates for jobs, approve mortgages, and more, and without extra care it is just as prone to bias. As such, sharing certain details about yourself could expose you to the negative effects of profiling. In 2015, Amazon had to pump the brakes on its automated hiring experiment after the tool was found to systematically discriminate against women. That said, sometimes there isn't much you can do to protect yourself against these biases, since models are trained on vast amounts of data that individual users can't hope to shape. Trying to protect your data is a good start, but that can be impossible when you're applying for a job or trying to rent a house.
