Document Integrity in an AI World

Traditionally, documents have been used as a means of preserving information over both space and time. Ideally, the information is honestly created, accurately recorded, and clearly represented, and, if properly protected, can serve as a trusted record to be consulted at a later time. With traditional documents, however, provenance can be questioned with respect to authorship, ownership, or accuracy of the information they contain, because these safeguards do not always exist. Nevertheless, we can safely assume that humans created an overwhelming majority of them and that, if they were modified, humans changed them. Forensic techniques also exist to help restore their integrity. But what happens in a world where one can no longer assume that an “overwhelming majority” of information has been honestly created, remains unmanipulated, or has even been authored by humans? This talk will explore the challenges of ensuring the integrity of documents created in a world where AI systems are able to author and manipulate content at will, and the issues that need to be addressed to ensure that the historical content we are creating now remains believable in the future.


Dr. David Doermann is a Professor of Empire Innovation at the University at Buffalo (UB) and the Director of the University at Buffalo Artificial Intelligence Institute. Prior to coming to UB, he was a program manager at the Defense Advanced Research Projects Agency (DARPA), where he developed, selected, and oversaw approximately $150 million in research and transition funding in the areas of computer vision, human language technologies, and voice analytics. He coordinated performers on all of the projects, orchestrating consensus, evaluating cross-team management, and overseeing fluid program objectives. From 1993 to 2018, David was a member of the research faculty at the University of Maryland, College Park. In his role in the Institute for Advanced Computer Studies, he served as Director of the Laboratory for Language and Media Processing and as an adjunct member of the graduate faculty for the Department of Computer Science and the Department of Electrical and Computer Engineering. He and his group of researchers focused on many innovative topics related to the analysis and processing of document images and video, including triage, visual indexing and retrieval, and enhancement and recognition of both textual and structural components of visual media. David has over 250 publications in conferences and journals, is a Fellow of the IEEE and IAPR, has received numerous awards, including an honorary doctorate from the University of Oulu, Finland, and is a founding Editor-in-Chief of the International Journal on Document Analysis and Recognition. David also successfully co-founded and managed Applied Media Analysis, Inc., building a team of 12 researchers and developers from 2001 to 2014. He recognized the need for a cross-platform implementation of computer vision algorithms on mobile devices and developed the architecture to port basic image processing and document analysis capabilities to various devices from a wide range of manufacturers.
The work, which was supported by Small Business Innovation Research grants, government contracts, Nokia, and Ricoh, resulted in early implementations of both barcode readers (1D and 2D) and optical character recognition technologies on many devices. David is a leading researcher and innovative thinker in the areas of document image analysis and recognition. He is interested in applying his skills in leadership, mentoring, and transition of research to help change the way we perceive and comprehend visual information. The impact and scale of David’s interests are global because documents range from containers for textual and visual infographics to dynamic, powerful resources that can seamlessly drive business processes in today’s evolving digital environment.

Date: January 15th, 2021 Time: 2:00-3:00 PM CET