Organization

Organizing Committee

(in alphabetical order)

Douglas Burdick is a Research Staff Member at IBM Research - Almaden currently working on the application of AI and machine learning to document understanding, which includes table extraction and understanding in addition to inferring document structure. His document understanding work is incorporated into the IBM Watson Compare & Comply and IBM Watson Discovery products. His other research focuses on the creation of financial knowledge graphs from unstructured data sources such as regulatory filings and analyst reports, which includes interpretation of tabular data from these documents. He has contributed to Apache SystemML and the OpenII data integration toolkit, and co-organizes the DSMM workshop series (co-located with SIGMOD). He received his PhD in Computer Science from the University of Wisconsin - Madison.

Benjamin Han is the Principal Science Manager leading the research and development of the natural language services on Microsoft Azure AI. His research interests include many aspects of NLGPU (natural language generation, processing and understanding) such as information extraction, summarization, conversational AI, question answering and knowledge graphs. During his time at Microsoft he has been a Principal Scientist in Satori (knowledge graph) and Bot Framework (conversational AI). Before that he was a Research Staff Member in the Multilingual NLP Technologies group at IBM TJ Watson Research Center working on all stages of information extraction technologies that power products such as IBM Watson Knowledge Studio and Watson NLU. He organized the Document Intelligence Workshop at KDD-2021 and the Knowledge Graph tutorial at KDD-2018, participated in government-organized projects/competitions such as TREC, RADAR, ACE, GALE and TAC KBP, and published in venues such as AAAI, ICME, ICoS, IJCAI, KDD, NAACL and SIGIR.

Dave Lewis is an Executive Vice President for AI Research, Development, and Ethics at Reveal-Brainspace. Prior to joining Brainspace, he was variously a freelance consultant, corporate researcher (Bell Labs, AT&T Labs), research professor, and software company co-founder. Dave has more than 40 peer-reviewed scientific publications and 9 patents. He was elected a Fellow of the American Association for the Advancement of Science in 2006 for foundational work in text categorization, and won a Test of Time Award from ACM SIGIR in 2017 for his paper with Gale introducing uncertainty sampling.

Sandeep Tata is a Software Engineer at Google Research and leads a research group on information extraction. Sandeep has published dozens of peer-reviewed research articles across a variety of disciplines including data management, data mining, natural language processing, and information extraction. Sandeep’s research work has impacted billions of people through research-focused enhancements to products like Google Drive, Gmail, and Google Assistant. He has served on the program committees for VLDB, ICDE, and CIKM, and as a senior program committee member for KDD. He served on the organizing committee for WSDM 2016. Prior to Google Research, Sandeep was a Research Staff Member at IBM’s Almaden Research Center. He has a PhD from the University of Michigan.

Dan Tecuci leads the US AI Lab at EY working on prototyping and productizing business document understanding, including unstructured, semi-structured (e.g. invoices) and fully structured (e.g. spreadsheets) documents. Before EY, Dan worked on applying NLP to drug discovery, generating recipes, and question answering at IBM, as well as diagnostic systems and natural language access to large bodies of enterprise data at Siemens Research. He authored several publications and holds 11 patents. He received his PhD from The University of Texas at Austin.

Program Committee Chair

Ani Nenkova is a Principal Scientist at Adobe Research, leading the Adobe-Maryland part of the Document Intelligence Lab. Ani’s work is broadly in the area of language technology, including text quality prediction, summarization and named entity recognition. She has co-organized several workshops on summarization and improving text readability. Ani was program co-chair of NAACL 2016 and currently serves as editor-in-chief for TACL.

Reviewers

The DI-2022 Organizing Committee wishes to express its sincere gratitude to our paper reviewers for their help. Without your thorough and timely reviewing, we could not have organized a successful workshop! THANK YOU!

(in alphabetical order of last names)

#	Full Name	Affiliation
1	Charles Beller	IBM
2	Tongfei Chen	Microsoft
3	Freddy Chua	Ernst & Young
4	John Corring	Microsoft
5	Daniel Campos	University of Illinois at Urbana-Champaign
6	Marina Danilevsky	IBM
7	Jonathan Degange	Ernst & Young
8	Yasuhisa Fujii	Google
9	Revanth Gangi Reddy	University of Illinois at Urbana-Champaign
11	Sean Goldberg	Microsoft
12	Jiuxiang Gu	Adobe Research
13	Beliz Gunel	Stanford University
14	Ruining He	Google
15	Hans Henseler	University of Applied Sciences Leiden
16	Mehrdad Jabbarzadeh Gangeh	Ernst & Young
17	Rajiv Jain	Adobe Research
18	Antonio Jose Jimeno Yepes	University of Melbourne
19	Amanda Jones	H5
20	Priyanka Kulkarni	Microsoft
21	Sameer Kulkarni	Google
22	Chen-Yu Lee	Google
23	Manling Li	University of Illinois at Urbana-Champaign
24	James Mayfield	Johns Hopkins University
25	Graham McDonald	University of Glasgow
26	Lesly Miculicich	Microsoft
27	Mark Noel	Hogan Lovells
28	Feifei Pan	Rensselaer Polytechnic Institute
29	Navneet Potti	Google
30	Brian Price	Adobe Research
31	Xiaoqi Ren	Google
32	Herbert Roitblat	Mimecast
33	Ying Sheng	Google
34	Baoguang Shi	Microsoft
35	Peter Staar	IBM
36	Baochen Sun	Microsoft
37	Chris Tensmeyer	Adobe Research
38	Jyothi Vinjumur	Walmart
39	Guoxin Wang	Microsoft
40	Sen Wu	Stanford University
41	Li Yang	Google
42	Qi Zeng	University of Illinois at Urbana-Champaign