Preventing and Managing Accidental Sensitive Data Leaks to Generative AI
Generative AI (GenAI) platforms like ChatGPT offer powerful productivity benefits, but they also introduce new risks when employees accidentally feed sensitive or proprietary data into them. Recent incidents (such as employees unwittingly leaking confidential source code via ChatGPT) have spotlighted the potential fallout. For organisations in New Zealand, it’s imperative to understand these risks, have a response plan, and implement preventative measures. Effort should focus on risk mitigation, strong governance, and compliance with NZ privacy law.
Key Risks of Submitting Sensitive Data to GenAI Platforms
- Loss of Confidentiality & Privacy Breaches: Data entered into GenAI tools is often stored by the provider and could be accessed or disclosed in ways you don’t intend. The NZ Privacy Commissioner warns that personal or confidential information entered into a generative AI may be retained by the provider and even used to train the model. This creates a risk that sensitive customer data or business secrets could later surface in AI outputs to other users. In short, once you paste proprietary text or personal data into an external AI service, you effectively lose exclusive control over that information’s confidentiality. If that data includes personal information, it may constitute an unauthorised disclosure – a potential privacy breach under NZ’s Privacy Act 2020.
- Data Retention and Training Data Exposure: Most GenAI providers (by default) use user inputs as part of ongoing model training. For example, OpenAI has stated it uses ChatGPT queries as training data to improve its models. This means any sensitive data your staff input could become embedded in the AI’s knowledge base. Researchers have demonstrated that AI models can sometimes regurgitate pieces of their training data when prompted in certain ways. Thus, proprietary information accidentally submitted might later reappear in another user’s query results. Even if direct output leakage is mitigated, the AI provider’s employees or contractors may review stored prompts, or the data could be included in future versions of the model. This training data exposure risk was highlighted when Samsung discovered engineers had pasted secret source code and meeting notes into ChatGPT; the company swiftly banned GenAI use after realising that data could be absorbed into the public model.
- Platform Security & Data Breach Risks: Relying on external GenAI platforms means trusting that provider’s security. If the platform is compromised or bugs occur, your submitted data could leak. In one incident, a ChatGPT glitch allowed users to see parts of other users’ conversation histories (including titles containing sensitive info). The UK’s National Cyber Security Centre (NCSC) notes that even if GenAI outputs are not directly shared between users, all queries are stored on the provider’s servers, where they “will almost certainly be used for developing the LLM” – and those stored queries could be hacked, leaked, or accidentally made public.
- Regulatory and Compliance Implications: Any accidental disclosure of personal information to an unauthorised party (in this case, a GenAI provider and potentially its model) can trigger New Zealand’s privacy breach notification requirements. The Privacy Act 2020 is technology-neutral – if the breach is likely to cause serious harm to an individual, the organisation must notify the Office of the Privacy Commissioner and affected individuals as soon as practicable. Beyond privacy law, consider confidentiality agreements or industry-specific rules: e.g. client contracts, NDA obligations, or financial and health data regulations.
Response Strategies for Accidental GenAI Data Exposure
Despite best efforts, mistakes happen. How organisations respond in the first hours can greatly affect damage control and compliance outcomes. Treat an accidental GenAI disclosure as you would a serious security incident or data breach. Key response steps include:
- Contain and Halt Further Exposure: Immediately instruct the individual to stop using the GenAI platform for any sensitive material. If the platform allows deletion of the submitted prompt or data, have them delete it (though this may not fully remove it from the provider’s servers). Containment also means revoking any broader access if needed – e.g. temporarily blocking corporate access to the GenAI service until the incident is assessed. As with any data breach, quickly limiting additional data leakage is paramount. Identify exactly what information was disclosed and ensure no one else repeats the mistake while the incident is active.
- Preserve Information & Assess Impact: Record the details of what happened – which GenAI platform, what data was input, when, and by whom. This log will be useful for assessment and any notifications. Next, assess the sensitivity of the leaked data and potential impact. Was personal information involved (customer data, employee records)? How sensitive is the proprietary material (source code, financials, strategic plans)? Determine who could be harmed or what advantage could be gained if the data were exposed via the AI. Remember that information given to an AI might be incorporated into its training data and inform outputs to other users, so consider worst-case scenarios (e.g. a competitor querying the AI and uncovering hints of your intellectual property). Engage your privacy officer or data protection team to evaluate the seriousness. A simple illustration of how such an incident record might be structured appears after this list.
- Engage Legal and Notify Authorities if Required: Loop in your legal counsel early. They will help determine legal obligations and guide communications. If you determine that a notifiable privacy breach has occurred, New Zealand’s Privacy Act mandates notifying the Privacy Commissioner and the affected individuals “as soon as practicable” when a breach could cause serious harm. The Office of the Privacy Commissioner (OPC) expects notification within 72 hours of becoming aware of such a breach. Work with legal/privacy teams to prepare a breach notification via the OPC’s NotifyUs tool if needed, and draft clear communication for any affected parties explaining what happened and what is being done. Even if the leaked data is not personal (say it’s proprietary business info), legal counsel can advise on whether other disclosures are warranted – for instance, informing an impacted business partner, or in rare cases, making a public statement if the incident could materially affect shareholders or customers. A frank discussion with the GenAI provider may be worthwhile—while you can’t easily “un-train” the AI, some providers may delete submitted data from logs, especially for enterprise clients. In any case, ensure you comply with all regulatory reporting timelines and documentation.
- Communicate Internally and Manage Stakeholders: Inform key internal stakeholders and share the response plan. Transparency supports governance and helps allocate resources. Consider pausing GenAI use organisation-wide if a policy gap is revealed, and remind staff of data handling rules. For significant incidents, consider a formal investigation to identify the root cause (e.g. lack of awareness or deadline pressure). Focus on damage control and on reassuring leadership, and possibly clients, that the situation is contained.
- Post-Incident Review and Prevention Update: Once the immediate crisis is handled, conduct a post-incident review. Analyse how and why the lapse occurred and what controls failed. Was there a lack of awareness or training? Was the policy not clear or enforced? Use these findings to strengthen your safeguards. Update your GenAI usage policies or data classification rules if necessary. You might decide to tighten technical controls (for example, deploying stricter Data Loss Prevention rules to block copying of certain data to web applications). This is also the stage to revisit your incident response plan: was the response timely and effective? A robust, tested incident response plan is crucial; it can prevent an incident from “becoming serious because of an unprepared response”.
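To make the record-keeping step above more concrete, the sketch below shows one possible way to capture the key facts of an accidental GenAI disclosure in a structured form. It is illustrative only: the GenAIDisclosureIncident class, its field names, and the example values are assumptions, not a prescribed template or any particular tool’s format.

```python
# Illustrative only: one possible structure for logging an accidental GenAI
# disclosure. Field names and example values are assumptions, not a standard.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class GenAIDisclosureIncident:
    platform: str                  # which GenAI service was used, and via what account
    disclosed_by: str              # who entered the data
    disclosed_at: datetime         # when the data was submitted
    data_description: str          # what was entered (source code, client PII, ...)
    personal_information: bool     # does it include personal information (Privacy Act relevance)?
    classification: str            # e.g. "Confidential", "Restricted"
    containment_actions: list[str] = field(default_factory=list)
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# Example record for a hypothetical incident
incident = GenAIDisclosureIncident(
    platform="Public GenAI chatbot (personal account)",
    disclosed_by="Engineering staff member",
    disclosed_at=datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc),
    data_description="Proprietary source code snippet pasted in for debugging help",
    personal_information=False,
    classification="Confidential",
    containment_actions=["Prompt deleted in the tool", "Corporate access to the platform suspended"],
)
print(incident)
```

Even a simple record like this gives the privacy officer, legal counsel, and incident responders a single agreed account of what was disclosed, when, by whom, and what containment has already occurred.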
Preventative Measures to Mitigate GenAI Data Leakage Risk
The best way to manage GenAI data leaks is to prevent them. Establish a strong governance framework around the use of generative AI, combining policy, education, and technical controls. Key preventative measures include:
- Clear GenAI Usage Policies and Data Classification: Develop and enforce a GenAI Acceptable Use Policy. This policy should explicitly define what staff are not allowed to input into any AI service. For example, make it clear that any data classified as Confidential or Restricted (customer PII, financial records, intellectual property, etc.) must never be entered into public AI tools. Identify which business units (if any) are authorised to experiment with GenAI and under what conditions. Policies should also cover whether employees may use personal accounts or only corporate-approved AI accounts, and whether additional approvals are needed for certain use cases. Tie these rules into your existing data classification scheme – e.g., “Company Confidential” data cannot be shared with any third-party system without approval. By codifying this, you give employees a clear guideline and create a basis for enforcement or disciplinary action if violated.
- Staff Training and Awareness: Policy alone isn’t enough; staff need to understand why it matters. Conduct regular training on the risks of GenAI and the contents of your GenAI usage policy. Use real examples to make it concrete: for instance, explain how an engineer uploading source code to get coding help could leak trade secrets, or how inputting a client’s data could breach privacy laws. Emphasise that GenAI queries are not private – they are stored by the provider, may be reviewed by its staff, and are potentially used in training. Training should also cover social engineering risks (e.g. adversaries creating fake “AI assistant” websites to phish data) and reinforce basic cyber hygiene when using new tools. The goal is to build a culture where employees think before they paste. Encourage staff to treat anything entered into a public AI tool as if it were being posted on a public forum. Staff should also know the procedure to report any accidental disclosure immediately, without fear – a blameless reporting culture can lead to faster containment.
- Technical Controls and Monitoring: Back up your policies with technical measures. Many organisations deploy Data Loss Prevention (DLP) solutions and web filtering to curb risky behaviour. For example, a DLP system can detect patterns of sensitive data (like customer account numbers or code) being copied to web forms and block or alert on it (a minimal sketch of this kind of pattern check appears after this list). Similarly, you might configure network controls to restrict access to GenAI sites except via approved methods. Some companies have opted to ban external AI tools until they can implement secure alternatives. If outright bans are not feasible, consider providing a sanctioned, secure GenAI solution – for instance, using an enterprise version of a GenAI platform that offers data isolation (no training on your inputs) and contractual privacy assurances. Major cloud providers now offer such enterprise LLM services where your prompts aren’t shared in the global model. Additionally, implement monitoring for GenAI usage: your Security Information and Event Management (SIEM) or Cloud Access Security Broker (CASB) tools may detect unusual spikes in data going to AI services. Continuous monitoring with automated alerts can catch policy violations or suspicious usage early. Finally, ensure your incident response capability is up to date: prepare for the scenario of an AI-related data leak. Having a tested plan, as noted, will let you respond “quickly and effectively” if prevention fails.
- Governance and Compliance Measures: Treat GenAI use as high-risk and subject to oversight. The NZ Privacy Commissioner recommends senior leadership approval and thorough risk assessment before adoption. Executives should sign off on business use, balancing benefits with privacy and security risks. Conduct Privacy Impact and Algorithmic Impact Assessments to identify data exposure points and necessary controls. Align with frameworks like NIST’s AI Risk Management Framework and its 2024 Generative AI Profile, which help manage GenAI-specific risks. Follow standards such as ISO/IEC 27001, 27701, and the new ISO/IEC 42001 for AI governance and data protection. Ensure compliance with NZ’s Privacy Act and sector-specific rules: verify where data will be stored and how it will be used, and address these questions in vendor due diligence and contracts. Integrate GenAI oversight into your risk governance, update the board on AI risks, and report regularly on policy compliance.
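As one concrete illustration of the DLP-style checks mentioned in the list above, the sketch below flags outbound text that matches simple sensitive-data patterns before it reaches an external AI service. It is a minimal, assumed example: the pattern names, regular expressions, and the check_outbound_text and allow_submission functions are illustrative, and commercial DLP products use far richer detection (document fingerprinting, exact data matching, machine-learning classifiers).

```python
# Illustrative only: a minimal DLP-style pattern check on text bound for an
# external AI tool. The patterns below are simplified assumptions.
import re

SENSITIVE_PATTERNS = {
    "NZ IRD number": re.compile(r"\b\d{2,3}-\d{3}-\d{3}\b"),
    "Payment card number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "Internal classification label": re.compile(r"\b(?:COMPANY CONFIDENTIAL|RESTRICTED)\b", re.IGNORECASE),
}

def check_outbound_text(text: str) -> list[str]:
    """Return the names of any sensitive patterns found in the outbound text."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)]

def allow_submission(text: str) -> bool:
    """Block submissions containing sensitive markers; a real control would also raise an alert."""
    findings = check_outbound_text(text)
    if findings:
        print("Blocked before submission: matched " + ", ".join(findings))
        return False
    return True

# Example: this prompt would be blocked before reaching the AI service
allow_submission("COMPANY CONFIDENTIAL - Q3 draft results, card 4111 1111 1111 1111")
```

In practice, checks of this kind run inside a DLP agent, web proxy, or browser plug-in rather than in application code, and their alerts feed into the SIEM or CASB monitoring described above.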
Accidental leaks of sensitive data into generative AI platforms are a 21st-century twist on the classic insider mistake. They blend human error with cutting-edge technology – and the stakes can be high. For New Zealand organisations, the message is clear: preventing and managing GenAI-related data breaches must be part of your cybersecurity and privacy strategy. Organisations have a duty to establish prudent controls (policies, training, technical safeguards) so that innovation with AI does not outpace governance.
About the Bulletin:
The NZ Incident Response Bulletin is a monthly high-level executive summary containing some of the most important news articles that have been published on Forensic and Cyber Security matters during the last month. Also included are articles written by Incident Response Solutions, covering topical matters. Each article contains a brief summary and if possible, includes a linked reference on the web for detailed information. The purpose of this resource is to assist Executives in keeping up to date from a high-level perspective with a sample of the latest Forensic and Cyber Security news.
To subscribe or to submit a contribution for an upcoming Bulletin, please either visit https://incidentresponse.co.nz/bulletin or send an email to bulletin@incidentresponse.co.nz with the subject line “Subscribe”, “Unsubscribe”, or, if you think there is something worth reporting, “Contribution”, including the relevant webpage or URL in the body of the email. Access our Privacy Policy.
This Bulletin is prepared for general guidance and does not constitute formal advice. This information should not be relied on without obtaining specific formal advice. We do not make any representation as to the accuracy or completeness of the information contained within this Bulletin. Incident Response Solutions Limited does not accept any liability, responsibility or duty of care for any consequences of you or anyone else acting, or refraining to act, when relying on the information contained in this Bulletin or for any decision based on it.
