The Daily Washington DC


DC's Digital Archives Are Full of Duplicate Images. Here's What Officials and Experts Are Saying About Fixing It.

As federal restructuring strains city budgets, Washington's archivists and technology officers are grappling with a costly, unglamorous crisis buried inside municipal databases.

By Washington DC News Desk · Published 4 July 2026, 3:00 pm


Photo: ale.studio_17 on Pexels

Washington's city government is sitting on a sprawling mess of duplicate digital images — redundant photographs, scanned documents, and graphics files clogging municipal servers — and officials at the DC Office of the Chief Technology Officer are now under pressure to address the problem before it compounds already strained IT budgets. The issue surfaced prominently in internal reviews this spring as the Bowser administration pushed to demonstrate fiscal discipline amid uncertainty over federal funding flows to the District.

The timing is pointed. With the Trump administration's Department of Government Efficiency initiative squeezing federal agency footprints across L'Enfant Plaza and Pennsylvania Avenue, DC's own government has faced intensified scrutiny over operational redundancy. Duplicate image files — a problem familiar to any large institution managing decades of digital records — represent a concrete, if unglamorous, category of waste that technology managers are now being asked to quantify and eliminate.

What the Experts Are Telling City Hall

Digital records specialists who work with municipal governments say the problem is widespread and expensive. Storage costs for unmanaged archives can run into tens of thousands of dollars annually for a city the size of Washington, and duplicate files typically account for somewhere between 20 and 40 percent of total image storage volume in large public-sector databases, according to research published by the Digital Preservation Coalition. For DC, which manages image libraries spanning agencies from the Department of Public Works to the Office of Planning — whose files cover everything from Anacostia waterfront development proposals to zoning maps for the NoMa corridor along New York Avenue NE — the cumulative footprint is substantial.

Technology consultants advising the District have pointed to automated deduplication software as the most cost-effective first step. Tools like those deployed by the National Archives and Records Administration at its College Park, Maryland facility use hash-matching algorithms to identify identical files regardless of filename or folder location. The approach can cut storage overhead sharply within a single budget cycle. Whether DC's OCTO moves in that direction — and how quickly — depends partly on appropriations that remain unsettled heading into fiscal year 2027.
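For readers curious how hash matching works in practice, the idea can be sketched in a few lines of Python. This is a minimal illustration only, not the software used by NARA or under consideration at OCTO: the function names `file_digest` and `find_duplicates` are hypothetical, and the choice of SHA-256 is an assumption. The sketch computes a fingerprint of each file's contents and groups files that share one, so a copy is caught even if it has been renamed or moved to another folder.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large scans do not fill memory.
        while chunk := handle.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # The filename and folder play no part in the digest,
            # so renamed or relocated copies land in the same group.
            by_digest[file_digest(path)].append(path)
    # Keep only digests that more than one file shares.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A matching digest of this kind flags exact copies only. An image that has been resized, cropped, or re-saved in another format produces a different digest and would need a different technique, such as perceptual hashing, to be detected.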

Local Stakes Stretch From Judiciary Square to the Anacostia Waterfront

The practical stakes show up in specific places. The DC Office of Planning, headquartered at 1100 4th Street SW, maintains image libraries tied to active development reviews across the city. Duplicate records in those archives can slow Freedom of Information Act responses and complicate the evidentiary record in zoning disputes — a live concern in neighborhoods like Anacostia, where community groups have been pushing for greater transparency in waterfront redevelopment proceedings.

The DC Public Library system, which digitized thousands of historical photographs of the city through its Washingtoniana collection at the Martin Luther King Jr. Memorial Library on G Street NW, has dealt with the duplicate image problem longer than most. Librarians there have used manual and semi-automated review processes to cull redundant files, but staff capacity has not kept pace with the volume of new digitization projects funded through grants from the Institute of Museum and Library Services. That federal grant pipeline is itself under review in the current budget environment.

City Council member Brooke Pinto, who chairs the Committee on the Judiciary and Public Safety and has oversight touching OCTO's operations, has not yet publicly scheduled a hearing on digital records management for the summer session. But technology policy advocates say the deduplication question is likely to come up in any broader review of the city's IT infrastructure spending, which the council is expected to revisit before the October 1 start of fiscal year 2027.

For residents and journalists who rely on DC's public records systems, the practical advice from archivists is straightforward: when submitting FOIA requests that involve image files, be specific about date ranges and originating agency offices. Broader requests draw from larger, messier pools of data and take longer to process. The city's FOIA portal at foia.dc.gov allows applicants to narrow searches by department — a feature that becomes more valuable the longer the underlying database stays unruly.


About this article

Published by The Daily Washington DC

This article was produced by The Daily Washington DC editorial desk and covers news in Washington DC. See our editorial standards for how we use AI.
