The NDSA Individual Excellence Award honors individuals making significant contributions to the digital preservation community. In 2019, Tessa Walsh was one of two awardees in this category. Tessa has created an evolving suite of robust open source tools meeting many core needs of the stewardship community in appraising, processing, and reporting upon born-digital collections. At the time of the award, her projects included the Brunnhilde characterization tool; BulkReviewer, for identifying PII and other sensitive information; the METSFlask viewer for Archivematica METS files; SCOPE, an access interface for Archivematica dissemination information packages; and CCA Tools, for creating submission packages from a variety of folder and disk image sources. Taken together, these tools support a very wide gamut of both technical and curatorial activities. 

We recently caught up with Tessa to chat about the Excellence Awards. Read on to hear more about what Tessa has been working on recently! 

1) What have you been doing since receiving an NDSA Excellence Award?

I’ve been busy! Other than the whole global pandemic bit, I shifted from an archivist/librarian coding off the side of my desk to a professional software developer working on open source digital preservation tools, which has been a dream.

From March 2020 (the same week lockdown started here in Montreal) to September 2022, I worked as a Software Developer at Artefactual Systems, primarily on the Archivematica and Access to Memory (AtoM) projects. Getting a chance to grow leaps and bounds as a developer while working on open source software that the digital preservation and archival communities are heavily invested in was a dream come true. And as anyone who has had the chance to work with the folks at Artefactual will know, it’s a really supportive environment filled with kind, curious, multi-skilled people. I’m proud of some of the features I was able to work on there, including implementing a storage adapter for Archivematica to work with nearly any cloud storage provider, adding single sign-on to Archivematica and AtoM, helping users with their migration and theming projects, and working on some supplementary tools for things like reporting and audit logging.

In September 2022, I took a new role as Senior Applications and Tools Engineer at Webrecorder. Getting to work on a friendly and talented small team developing user-friendly open source solutions to challenging problems in web archiving has been fantastic. Since starting at Webrecorder, I’ve made contributions to pywb and Browsertrix Crawler, and have been heavily involved in the development of Browsertrix Cloud, a new open source cloud-native browser-based crawling service that unifies several Webrecorder tools into a single easy-to-use web application for creating, managing, curating, and sharing web archives. We’ve been hard at work developing both the software and a sustainable open source business model around it, and will be launching a hosted service as well as support models for the open source software in the coming months and year. It’s a really exciting time to be at Webrecorder, and I’m looking forward to us continuing to further Webrecorder’s mission of web archiving for all.

I’ve also kept developing and maintaining a small set of my own open source projects, including putting out several releases of Bulk Reviewer (https://github.com/bulk-reviewer/bulk-reviewer/), a desktop application, now included in the BitCurator Environment, that aids users in finding and managing private and sensitive information in digital archives.

Finally, I’ve had the pleasure of being involved in a few research projects that I hope are helping to push forward thinking on topics that are of special interest to me. With Keith Pendergrass, Walker Sampson, and Laura Alagna, I published the paper “Toward Environmentally Sustainable Digital Preservation” in American Archivist in late 2019, which explores the environmental impact of digital preservation practice and suggests ways for the field to move forward in a more sustainable fashion. With Aliza Leventhal and Julie Collins, I published “Of Grasshoppers and Rhinos: A Visual Literacy Approach to Born-Digital Design Records,” also in American Archivist, in 2021. The paper applies a visual literacy approach to notoriously difficult digital design records such as CAD/BIM and 3D models in architectural archives with the hopes of making these materials more approachable to those responsible for preserving and providing access to them. And finally, with Jess Whyte, I’ve also been conducting interviews with Canadian memory workers on the issues they face and strategies they use in managing private and sensitive information in digital collections. Our paper, titled “‘Carefully and Cautiously’: How Canadian Cultural Memory Workers Review Digital Materials for Private and Sensitive Information,” will be published later this year in the open access journal Partnership: The Canadian Journal of Library and Information Practice and Research.

2) What did receiving the NDSA award mean to you?

Receiving the NDSA award validated the work that I was doing in trying to develop and maintain open source software that makes digital archiving and digital preservation work easier for practitioners. It helped me get over a bit of imposter syndrome and find the confidence to pursue software development as a career rather than just an interest, which I’m deeply grateful for! I hope and suspect it also introduced some new folks to some of the tools that I’d been working on, which is always nice.

3) What efforts/advances/ideas of the last few years have you been impressed with or admired in the field of data stewardship and/or digital preservation?

I think the conversations around environmental sustainability that have been happening in the last few years are wonderful and needed, especially as we see the effects of climate change unfold in real time. Digital stewardship will need to respond to increasing risks of events like data center outages, and it behooves us to reduce our footprint where we can, through classic archival practices like careful selection and new techniques like threat modeling and using defined preservation tiers appropriate for the various types of content being stored.

In the web archiving space, I’ve been really excited about the possibilities afforded by client-side replay in the browser made possible by Webrecorder’s replayweb.page tool. By being able to render and rewrite web archives in the browser we remove the need to upload data to a server in order to replay web archives and open up new exciting possibilities for access such as embedding web archive viewers into preservation and access systems (for more on that, see: https://replayweb.page/docs/embedding). I’m a big proponent of putting the focus on access to content that we’re preserving and I think this is a big step forward for web archives on that front!

4) How has your work evolved since you won the Excellence Award?

Since winning the Excellence Award, I’ve been fortunate to receive a lot of mentoring and have grown into a senior developer, which is really exciting personally. I’ve also had the opportunity to deepen my thinking on the sustainability of open source projects that the digital stewardship and preservation fields rely on through firsthand experience as a solo maintainer and as a person working on larger open source projects with many contributors. It’s a difficult thing to get right but really important, as we don’t want the burden of maintaining these tools to fall on individuals who aren’t compensated for their labor or for projects to become abandoned after being widely adopted.

5) What do you currently see as some of the biggest challenges or opportunities in digital preservation?

One thing I see as both a challenge and an opportunity currently is beginning to shift the focus from preserving content to providing open, sophisticated, useful access. Ultimately the goal of preservation is (or should be!) for someone to come use what we’re preserving. As the field matures and gets more comfortable in our preservation practices, I think there are a lot of interesting opportunities to demonstrate our value by connecting preserved content to users in forms that are useful to them, whether that means providing computational access to data, making it easier to integrate preserved content with our access systems, or pushing content to where people already are.

I’d also love to see us continue to lower the technical barriers to entry for digital preservation practice. A lot of the tools we rely on assume a certain level of competence with command line interfaces and scripting languages. Those tools can be great for providing a lot of flexibility to practitioners, and the field has done a lot to make learning these skills easier. That said, requiring such skills can also make it difficult to hire and mentor the next generation of digital stewards. I’d love to see our common toolsets continue to get more approachable and easier to use so that we can continue to grow and diversify our field of practitioners.

6) Are you working on any new digital preservation related tools at the moment? If so, could you please share a bit about the tool(s).

I’ve mentioned a few tools already, but I’d like to talk a little bit more about Browsertrix Cloud, the focus of a lot of my activity at Webrecorder these days. In the early days of development, a lot of our focus was on supporting functionality that was already possible through tools like Browsertrix Crawler in a more user-friendly and modern user interface. Now we’re focusing on building features that are new to Webrecorder, such as building and publicly sharing curated collections of web content, and integrating Browsertrix Cloud with existing tools like the archiveweb.page Chrome extension for manually archiving websites in your browser. By the end of the year, we’ll be working on some features that are, I think, relatively new to the web archiving field as a whole. I’m particularly excited about starting to work on software-assisted quality assurance (QA) of crawls, where we will be analyzing the WACZ files created by our crawler and presenting information to the end user about the relative quality of capture for the pages that have been crawled. That’s really just a start and I’m sure we’ll continue to refine what assisted QA can entail, but it aligns super well with my personal mission of using software to make currently onerous tasks easier for digital stewards, freeing them to use their time on the tasks where their expertise is most valuable.

The post Catching up with past NDSA Excellence Awards Winners: Tessa Walsh appeared first on DLF.