10th COMPUTATIONAL ARCHIVAL SCIENCE (CAS) WORKSHOP
Tuesday December 9, 2025 (Online)

Part of: 2025 IEEE Big Data Conference (IEEE BigData 2025)
https://conferences.cis.um.edu.mo/ieeebigdata2025/
Dec. 8-11, 2025

If you are planning on attending the workshop, please contact mark.hedges at kcl.ac.uk for registration details!

See: https://ai-collaboratory.net/cas/cas-workshops/ieee-big-data-2025-cas-10/ for the latest updates and schedule.

Keynote from National Archives of Singapore, 18 papers from 27 institutions in 8 countries spanning 5 continents:
Canada, USA (North America) / Brazil (South America) / Scotland, Spain, Switzerland (Europe) / South Africa (Africa) / Korea (Asia).


  • In Memoriam, Dec. 16, 2022… to our friend and CAS collaborator Michael Kurtz:
    “One of the pulls to the bright side is our CAS initiative. Not only is it intellectually compelling to me, but I feel I am part of an endeavor that will help others in the archival space and beyond. To be even more blunt, I am so curious to see what happens next as it makes me want to push the boundaries of the time that I have left!”
Photo taken on Friday, Dec. 16, 2022 — Annapolis, MD.

COMPUTATIONAL ARCHIVAL SCIENCE: digital records in the age of big data

INTRODUCTION TO WORKSHOP [also see our CAS Portal]:

The large-scale digitization of analogue archives, the emerging diverse forms of born-digital archive, and the new ways in which researchers across disciplines (as well as the public) wish to engage with archival material are disrupting traditional archival theories and practices. Increasing quantities of ‘big archival data’ present challenges for the practitioners and researchers who work with archival material, but also offer enhanced possibilities for scholarship, through the application both of computational methods and tools to the archival problem space and of archival methods and tools to computational problems such as trusted computing, as well as, more fundamentally, through the integration of computational thinking with archival thinking.


Our working definition of Computational Archival Science (CAS) is:

    • A transdisciplinary field grounded in archival, information, and computational science that is concerned with the application of computational methods and resources, design patterns, sociotechnical constructs, and human-technology interaction, to large-scale (big data) records/archives processing, analysis, storage, long-term preservation, and access problems, with the aim of improving and optimizing efficiency, authenticity, truthfulness, provenance, productivity, computation, information structure and design, precision, and human-technology interaction in support of acquisition, appraisal, arrangement and description, preservation, communication, transmission, analysis, and access decisions. [refined by Nathaniel Payne (2018)]

OBJECTIVES

This workshop will explore the conjunction (and its consequences) of emerging methods and technologies around big data with archival practice (including record keeping) and new forms of analysis and historical, social, scientific, and cultural research engagement with archives. We aim to identify and evaluate current trends, requirements, and potential in these areas, to examine the new questions that they can provoke, and to help determine possible research agendas for the evolution of computational archival science in the coming years. At the same time, we will address the questions and concerns scholarship is raising about the interpretation of ‘big data’ and the uses to which it is put, in particular appraising the challenges of producing quality–meaning, knowledge and value–from quantity, tracing data and analytic provenance across complex ‘big data’ platforms and knowledge production ecosystems, and addressing data privacy issues.

This will be the 10th workshop at IEEE Big Data addressing Computational Archival Science (CAS), following on from workshops in 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, and 2024. It also builds on three earlier workshops on Big Humanities Data organized by the same chairs at the 2013-2015 conferences, and more directly on a symposium held in April 2016 at the University of Maryland.

All papers accepted for the workshop will be included in the Conference Proceedings published by the IEEE Computer Society Press.


RESEARCH TOPICS COVERED:
Topics covered by the workshop include, but are not restricted to, the following:

    • Application of analytics to archival material, including AI, ML, text-mining, data-mining, sentiment analysis, network analysis.
    • Analytics in support of archival processing, including e-discovery, identification of personal information, appraisal, arrangement and description.
    • Scalable services for archives, including identification, preservation, metadata generation, integrity checking, normalization, reconciliation, linked data, entity extraction, anonymization and reduction.
    • New forms of archives, including Web, social media, audiovisual archives, and blockchain.
    • Cyber-infrastructures for archive-based research and for development and hosting of collections
    • Big data and archival theory and practice
    • Digital curation and preservation
    • Crowd-sourcing and archives
    • Big data and the construction of memory and identity
    • Specific big data technologies (e.g. NoSQL databases) and their applications
    • Corpora and reference collections of big archival data
    • Linked data and archives
    • Big data and provenance
    • Constructing big data research objects from archives
    • Legal and ethical issues in big data archives

PROGRAM CHAIRS:
Dr. Mark Hedges
Department of Digital Humanities (DDH)
King’s College London, UK

Prof. Victoria Lemieux
School of Information
University of British Columbia, CANADA

Prof. Richard Marciano
Advanced Information Collaboratory (AIC)
College of Information Studies
University of Maryland, USA


PROGRAM COMMITTEE MEMBERS:
Dr. Linde Brocato
Metadata Librarian
U. Arkansas Libraries
University of Arkansas, USA

Dr. Sarah Buchanan
Library and Information Science
iSchool
University of Missouri, USA

Mark Conrad
Advanced Information Collaboratory (AIC)
College of Information
University of Maryland, USA

Dr. Jane Greenberg
Alice B. Kroeger Professor and Director, Metadata Research Center
College of Computing & Informatics
Drexel University, USA

Dr. Kumar Gnanasekaran
Advanced Information Collaboratory (AIC)
College of Information
University of Maryland, USA

Dr. Bill Underwood
Advanced Information Collaboratory (AIC)
College of Information
University of Maryland, USA