Please use this identifier to cite or link to this item:
https://hdl.handle.net/10316/93812
Title: | Using natural language processing to detect privacy violations in online contracts |
Authors: | Silva, Paulo; Gonçalves, Carolina; Godinho, Carolina; Antunes, Nuno Manuel dos Santos; Curado, Marília |
Issue Date: | Mar-2020 |
Publisher: | ACM |
Project: | info:eu-repo/grantAgreement/EC/H2020/786713/EU/Protection and control of Secured Information by means of a privacy enhanced Dashboard |
Serial title, monograph or event: | SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing |
Place of publication or event: | Proceedings of the 35th Annual ACM Symposium on Applied Computing |
Abstract: | As information systems deal with contracts and documents in essential services, there is a lack of mechanisms to help organizations in protecting the involved data subjects. In this paper, we evaluate the use of named entity recognition as a way to identify, monitor and validate personally identifiable information. In our experiments, we use three of the most well-known Natural Language Processing tools (NLTK, Stanford CoreNLP, and spaCy). First, the effectiveness of the tools is evaluated in a generic dataset. Then, the tools are applied in datasets built based on contracts that contain personally identifiable information. The results show that models' performance was highly positive in accurately classifying both the generic and the contracts' data. Furthermore, we discuss how our proposal can effectively act as a Privacy Enhancing Technology. |
URI: | https://hdl.handle.net/10316/93812 |
ISBN: | 9781450368667 |
DOI: | 10.1145/3341105.3375774 |
Rights: | openAccess |
Appears in Collections: | FCTUC Eng.Informática - Artigos em Revistas Internacionais |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ACM_SAC_Paper__POSTER_.pdf | | 386.08 kB | Adobe PDF | View/Open |