fabianumfalco/perception_br_devs_data_privacy_llm
Investigating Software Development Teams Members' Perceptions of Data Privacy in the Use of Large Language Models (LLMs)

DOI: 10.1145/3701625.3701675

Abstract

Context: Large Language Models (LLMs) have revolutionized natural language generation and understanding. However, they raise significant data privacy concerns, especially when sensitive data is processed and stored by third parties.

Goal: This paper investigates software development team members' perceptions of data privacy when using LLMs in their professional activities. Additionally, we examine the challenges these practitioners face and the practices they adopt.

Method: We conducted a survey with 78 ICT practitioners from the five regions of Brazil.

Results: Software development team members have basic knowledge of data privacy and the LGPD (Brazil's General Data Protection Law), but most have never received formal training on LLMs and possess only basic knowledge about them. Their main concerns are the leakage of sensitive data and the misuse of personal data. To mitigate risks, they avoid using sensitive data and apply anonymization techniques. The primary challenges they face are ensuring transparency in the use of LLMs and minimizing data collection. Team members consider current legislation inadequate for protecting data privacy in the context of LLM use.

Conclusions: The results reveal a need to improve knowledge and practices related to data privacy in the context of LLM use. According to the respondents, organizations need to invest in training, develop new tools, and adopt more robust policies to protect user data privacy. They advocate a multifaceted approach that combines education, technology, and regulation to ensure the safe and responsible use of LLMs.

Authors (ORCID)

Citation

If you use or discuss our survey in your work, please use the following citation:

@inproceedings{10.1145/3701625.3701675,
author = {Falc\~{a}o, Fabiano Damasceno Sousa and Canedo, Edna Dias},
title = {Investigating Software Development Teams Members' Perceptions of Data Privacy in the Use of Large Language Models (LLMs)},
year = {2024},
isbn = {9798400717772},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3701625.3701675},
doi = {10.1145/3701625.3701675},
abstract = {Context: Large Language Models (LLMs) have revolutionized natural language generation and understanding. However, they raise significant data privacy concerns, especially when sensitive data is processed and stored by third parties. Goal: This paper investigates the perception of software development teams members regarding data privacy when using LLMs in their professional activities. Additionally, we examine the challenges faced and the practices adopted by these practitioners. Method: We conducted a survey with 78 ICT practitioners from five regions of the country. Results: Software development teams members have basic knowledge about data privacy and LGPD, but most have never received formal training on LLMs and possess only basic knowledge about them. Their main concerns include the leakage of sensitive data and the misuse of personal data. To mitigate risks, they avoid using sensitive data and implement anonymization techniques. The primary challenges practitioners face are ensuring transparency in the use of LLMs and minimizing data collection. Software development teams members consider current legislation inadequate for protecting data privacy in the context of LLM use. Conclusions: The results reveal a need to improve knowledge and practices related to data privacy in the context of LLM use. According to software development teams members, organizations need to invest in training, develop new tools, and adopt more robust policies to protect user data privacy. They advocate for a multifaceted approach that combines education, technology, and regulation to ensure the safe and responsible use of LLMs.},
booktitle = {Proceedings of the XXIII Brazilian Symposium on Software Quality},
pages = {373–382},
numpages = {10},
keywords = {Large language models (LLM), Conversational agents, Chatbots, Data Privacy, Privacy risks},
location = {},
series = {SBQS '24}
}
