FuzzyDoc project

Document integrity check based on fuzzy extractors
September 2021 - July 2023

Context

The security of documents, whether in paper or digital format, has become a major concern. To address this issue, many integrity verification approaches have been proposed in the scientific literature. The challenge is to find features that are robust to the printing and scanning processes. In the FuzzyDoc project, we want to establish a methodology for fraud detection and integrity verification of scanned documents. The forgeries considered can be a change or deletion of a character, or a modification of the font or its style. Thus, the signature used to represent the document must allow the detection of even minimal modifications while remaining robust to the printing and scanning processes. For this purpose, we investigate the use of fuzzy extractors to detect modifications in the scanned document.

Challenges

Our goal is to find a unique representation of a document, inspired by the approaches used in biometrics. In the context of document security, our project aims to establish a methodology for fraud detection and integrity verification of scanned documents. The forgeries considered can be a change or deletion of a character, or a modification of the font or its style. Thus, the fuzzy hash developed in the FuzzyDoc project, used as a document signature, should be robust to the printing and scanning processes. Its analysis will allow the detection of modifications, even minimal ones, made to the document.
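To make the idea of a noise-tolerant document signature more concrete, here is a minimal sketch of the textbook code-offset fuzzy extractor, using a simple repetition code. It is an illustration under stated assumptions, not the method developed in the FuzzyDoc project: we assume the document has already been reduced to a binary feature vector, and the names `REP`, `generate` and `verify` are hypothetical.

```python
import hashlib
import secrets

REP = 5  # hypothetical parameter: each key bit is repeated REP times


def _encode(key_bits):
    """Repetition code: repeat every key bit REP times."""
    return [b for b in key_bits for _ in range(REP)]


def _decode(code_bits):
    """Majority vote inside each block of REP bits."""
    return [
        int(sum(code_bits[i:i + REP]) > REP // 2)
        for i in range(0, len(code_bits), REP)
    ]


def _digest(key_bits):
    """Hash of the key, stored as the reference signature."""
    return hashlib.sha256(bytes(key_bits)).hexdigest()


def generate(features):
    """Enrolment: derive public helper data and a digest from binary features."""
    if len(features) % REP != 0:
        raise ValueError("feature length must be a multiple of REP")
    key = [secrets.randbits(1) for _ in range(len(features) // REP)]
    # Code-offset construction: helper = features XOR codeword
    helper = [f ^ c for f, c in zip(features, _encode(key))]
    return helper, _digest(key)


def verify(noisy_features, helper, digest):
    """Verification: accept if the noisy reading decodes to the same key."""
    if len(noisy_features) != len(helper):
        return False
    # noisy XOR helper = codeword XOR error pattern
    shifted = [f ^ h for f, h in zip(noisy_features, helper)]
    return _digest(_decode(shifted)) == digest
```

With `REP = 5`, up to two flipped bits per block (standing in for print-and-scan noise) still decode to the same key, whereas three or more flips in one block change the recovered key and the check fails. A practical design would need a stronger error-correcting code and features chosen so that forgeries, but not scanning noise, exceed the correction capacity.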

Two main scientific challenges we address in the FuzzyDoc project are:

Researchers involved

We are looking for an intern!

Master's internship position on "Deep fuzzy extractors for document integrity check", starting in January-February 2024, for 6 months. More details here.

Publications related to the project

  1. F. Yriarte, P. Puteaux and I. Tkachenko, “A Two-Step Method for Ensuring Printed Document Integrity using Crossing Number Distances”, IEEE WIFS 2022, December 2022, online (pdf) [code and dataset].
  2. P. Puteaux and I. Tkachenko, “Crossing number features: from biometrics to printed character matching”, Workshop IWCDF@ICDAR 2021, September 2021, Lausanne, Switzerland (pdf).