Retyping attacks attempt to interpret the content of a digital document and rewrite it as a new document, with the aim of removing any watermarks embedded in it. These attacks can be automated using Optical Character Recognition (OCR) software, making it possible to attack the large volumes of digital documents handled in organizational environments. Previous work has studied the robustness of watermarked text-based digital documents to retyping attacks using syntactic and semantic approaches. However, little attention has been paid to the watermarking of image-based digital documents, which have proven to be totally vulnerable to retyping attacks. This paper presents a study of the robustness of fingerprinted image-based digital documents against automated retyping attacks using a state-of-the-art fingerprinting system.
Robustness is achieved by inserting fingerprints/watermarks in the frequency components of the digital documents. This causes visual distortion in the fingerprinted documents, so that when a retyping attack is applied to them, the extracted text is useless. The experiments in this study show that the evaluated system resists automated retyping attacks, generating an information loss above 50% for most of the fingerprinted documents in both lossless and lossy formats. Additionally, a tradeoff was found between the robustness of fingerprinted image-based digital documents to retyping attacks and their robustness to another common attack on fingerprinted documents, the collusion attack.