Organization


Organizing Committee

(in alphabetical order)

Douglas Burdick is a Research Staff Member at IBM Research - Almaden currently working on the application of AI and machine learning to document understanding, which includes table extraction and understanding in addition to inferring document structure. His document understanding work is incorporated into the IBM Watson Compare & Comply and IBM Watson Discovery products. His other research focuses on the creation of financial knowledge graphs from unstructured data sources such as regulatory filings and analyst reports, which includes interpretation of tabular data from these documents. He has contributed to Apache SystemML and the OpenII data integration toolkit, and co-organizes the DSMM workshop series (co-located with SIGMOD). He received his PhD in Computer Science from the University of Wisconsin - Madison.


Benjamin Han is the Principal Science Manager leading the research and development of the natural language services on Microsoft Azure AI. His research interests include many aspects of NLGPU (natural language generation, processing and understanding) such as information extraction, summarization, conversational AI, question answering and knowledge graphs. During his time at Microsoft he has been a Principal Scientist in Satori (knowledge graph) and Bot Framework (conversational AI). Before that he was a Research Staff Member in the Multilingual NLP Technologies group at IBM TJ Watson Research Center, working on all stages of the information extraction technologies that power products such as IBM Watson Knowledge Studio and Watson NLU. He organized the Document Intelligence Workshop at KDD-2021 and the Knowledge Graph tutorial at KDD-2018, participated in government-organized projects and competitions such as TREC, RADAR, ACE, GALE and TAC KBP, and published in venues such as AAAI, ICME, ICoS, IJCAI, KDD, NAACL and SIGIR.


Dave Lewis is an Executive Vice President for AI Research, Development, and Ethics at Reveal-Brainspace. Prior to joining Brainspace, he was variously a freelance consultant, corporate researcher (Bell Labs, AT&T Labs), research professor, and software company co-founder. Dave has more than 40 peer-reviewed scientific publications and 9 patents. He was elected a Fellow of the American Association for the Advancement of Science in 2006 for foundational work in text categorization, and won a Test of Time Award from ACM SIGIR in 2017 for his paper with Gale introducing uncertainty sampling.


Sandeep Tata is a Software Engineer at Google Research and leads a research group on information extraction. Sandeep has published dozens of peer-reviewed research articles across a variety of disciplines including data management, data mining, natural language processing, and information extraction. Sandeep’s research work has impacted billions of people through research-focused enhancements to products like Google Drive, Gmail, and Google Assistant. He has served on the program committees for VLDB, ICDE, and CIKM, and as a senior program committee member for KDD. He served on the organizing committee for WSDM 2016. Prior to Google Research, Sandeep was a Research Staff Member at IBM’s Almaden Research Center. He has a PhD from the University of Michigan.


Dan Tecuci leads the US AI Lab at EY working on prototyping and productizing business document understanding, including unstructured, semi-structured (e.g. invoices) and fully structured (e.g. spreadsheets) documents. Before EY, Dan worked on applying NLP to drug discovery, generating recipes, and question answering at IBM, as well as diagnostic systems and natural language access to large bodies of enterprise data at Siemens Research. He authored several publications and holds 11 patents. He received his PhD from The University of Texas at Austin.


Program Committee Chair

Ani Nenkova is a Principal Scientist at Adobe Research, leading the Adobe-Maryland part of the Document Intelligence Lab. Ani’s work is broadly in the area of language technology, including text quality prediction, summarization and named entity recognition. She has co-organized several workshops on summarization and improving text readability. Ani was program co-chair of NAACL 2016 and currently serves as editor-in-chief for TACL.


Reviewers

The DI-2022 Organizing Committee wishes to express its sincere gratitude for the help of our paper reviewers. Without your thorough and timely reviewing, we could not have organized a successful workshop! THANK YOU!

(in alphabetical order of last names)

#   Full Name                   Affiliation
1   Charles Beller              IBM
2   Tongfei Chen                Microsoft
3   Freddy Chua                 Ernst & Young
4   John Corring                Microsoft
5   Daniel Campos               University of Illinois at Urbana-Champaign
6   Marina Danilevsky           IBM
7   Jonathan Degange            Ernst & Young
8   Yasuhisa Fujii              Google
9   Revanth Gangi Reddy         University of Illinois at Urbana-Champaign
11  Sean Goldberg               Microsoft
12  Jiuxiang Gu                 Adobe Research
13  Beliz Gunel                 Stanford University
14  Ruining He                  Google
15  Hans Henseler               University of Applied Sciences Leiden
16  Mehrdad Jabbarzadeh Gangeh  Ernst & Young
17  Rajiv Jain                  Adobe Research
18  Antonio Jose Jimeno Yepes   University of Melbourne
19  Amanda Jones                H5
20  Priyanka Kulkarni           Microsoft
21  Sameer Kulkarni             Google
22  Chen-Yu Lee                 Google
23  Manling Li                  University of Illinois at Urbana-Champaign
24  James Mayfield              Johns Hopkins University
25  Graham McDonald             University of Glasgow
26  Lesly Miculicich            Microsoft
27  Mark Noel                   Hogan Lovells
28  Feifei Pan                  Rensselaer Polytechnic Institute
29  Navneet Potti               Google
30  Brian Price                 Adobe Research
31  Xiaoqi Ren                  Google
32  Herbert Roitblat            Mimecast
33  Ying Sheng                  Google
34  Baoguang Shi                Microsoft
35  Peter Staar                 IBM
36  Baochen Sun                 Microsoft
37  Chris Tensmeyer             Adobe Research
38  Jyothi Vinjumur             Walmart
39  Guoxin Wang                 Microsoft
40  Sen Wu                      Stanford University
41  Li Yang                     Google
42  Qi Zeng                     University of Illinois at Urbana-Champaign