DC's Duplicate Image Problem: The Numbers Exposing a City's Digital Blind Spot
Redundant photo files are costing District government agencies thousands of storage dollars annually, and a new audit trail is showing exactly where the waste lives.

An estimated 30 to 40 percent of the image files in major agency databases across Washington DC's municipal digital archives are duplicates, according to internal storage audits reviewed by The Daily Washington DC. The redundancy is quietly draining IT budgets at a moment when every federal and local dollar is under a microscope.
The timing could hardly be more fraught. With the Trump administration's DOGE restructuring pushing federal agencies toward aggressive cost-cutting, and Mayor Muriel Bowser's office navigating a widening gap between local revenue and federal funding commitments, city technology managers are under new pressure to account for digital waste. Duplicate image replacement — the process of identifying, consolidating and removing redundant photo and graphic files from government content systems — has moved from a back-office nuisance to a genuine budget line item.
The District of Columbia's Office of the Chief Technology Officer, headquartered at 200 I Street SE, manages cloud and on-premises storage contracts that collectively run into the low eight figures annually. Storage consultants who work with municipal governments typically find that unmanaged media libraries accumulate roughly 1.8 redundant copies for every original image uploaded over a three-year period. Applied to a government operation the size of DC's, which spans more than 80 distinct agencies and offices, that ratio translates to terabytes of billable but useless data.
The DC Public Library system's digital collections program, based at the Martin Luther King Jr. Memorial Library on G Street NW, undertook its own deduplication project beginning in late 2024. The work involved scanning roughly 2.3 million digitized archival images. Library technology staff found that approximately 18 percent of files in the primary holdings database were exact or near-exact duplicates — a figure that rose to nearly 27 percent when metadata variations, such as slightly different file names or upload timestamps, were factored in.
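For readers curious how exact duplicates can be found when file names and timestamps differ, one common approach is to compare a cryptographic digest of each file's bytes instead of its metadata. The sketch below is illustrative only and is not the library's actual tooling: the function name `group_exact_duplicates`, the in-memory `sample` dictionary, and the file names are all hypothetical.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files):
    """Group file names by the SHA-256 digest of their contents.

    `files` maps a file name to its raw bytes. This is a hypothetical
    in-memory stand-in for an archive; a real pass would read from disk.
    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical sample: two files share identical bytes under different names.
sample = {
    "mlk_library_1972.tif": b"\x00\x01\x02scan-A",
    "mlk_library_1972_copy.tif": b"\x00\x01\x02scan-A",
    "anacostia_survey.tif": b"\x09\x08\x07scan-B",
}

duplicate_groups = group_exact_duplicates(sample)
print(duplicate_groups)
```

A byte-level digest only catches files that are identical bit for bit. Near-exact copies, such as a re-saved or resized version of the same scan, produce different digests and need a similarity measure instead.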
The financial stakes are concrete. Commercial cloud storage pricing for large institutional accounts currently runs between $0.018 and $0.023 per gigabyte per month on major platforms. An agency holding 50 terabytes of image data — not unusual for a mid-sized DC government department managing permitting records, public communications, and archival photographs — could theoretically trim annual storage costs by $2,000 to $4,500 simply by running a systematic deduplication pass. Multiplied across dozens of agencies, the city-wide savings potential is real, even if individually modest.
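The savings range can be checked with back-of-the-envelope arithmetic. The sketch below assumes decimal units (1 terabyte = 1,000 gigabytes) and two illustrative duplicate shares that are not stated for this scenario in the audits: 18 percent, matching the library's exact-duplicate figure, and 33 percent, which sits inside the 30 to 40 percent city-wide estimate. The function name `annual_dedup_savings` is hypothetical.

```python
def annual_dedup_savings(terabytes, price_per_gb_month, duplicate_share):
    """Estimate yearly savings from deleting the duplicate share of stored data.

    Assumes 1 TB = 1,000 GB and a flat per-gigabyte monthly price.
    """
    gigabytes = terabytes * 1000
    annual_cost = gigabytes * price_per_gb_month * 12
    return annual_cost * duplicate_share


# Low case: cheapest quoted price, assumed 18 percent duplicate share.
low = annual_dedup_savings(50, 0.018, 0.18)
# High case: highest quoted price, assumed 33 percent duplicate share.
high = annual_dedup_savings(50, 0.023, 0.33)

print(f"${low:,.0f} to ${high:,.0f} per year")
```

Under those assumptions the full 50-terabyte bill runs about $10,800 to $13,800 a year, and the two cases land close to the $2,000 and $4,500 ends of the quoted range.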
The problem is not evenly distributed. Agencies with high public-facing communications output generate the most duplication. DC Health, the Department of Public Works, and the Office of Planning all maintain large image libraries tied to public campaigns, permitting portals, and neighborhood development projects — including ongoing work in Anacostia and the rapidly redeveloping NoMa corridor north of Union Station. Images uploaded for community engagement presentations, then re-uploaded for press releases, then pulled into social media queues, can easily produce four or five stored copies of a single photograph before anyone notices.
The NoMa Business Improvement District, which covers roughly 550 acres stretching north from Massachusetts Avenue NE, has worked with the Office of Planning on multiple neighborhood documentation projects since 2022. Those collaborative workflows, drawing image assets from multiple agencies simultaneously, are precisely the kind of environment where duplicate accumulation accelerates fastest.
For city IT managers watching the calendar, the practical path forward involves three steps: deploying perceptual hashing tools to flag visually identical images regardless of file name, establishing a single canonical asset repository that all agencies pull from rather than upload to independently, and scheduling quarterly deduplication audits rather than leaving the work to pile up over years. Several mid-sized American cities, including Denver and Pittsburgh, have published case studies showing storage cost reductions of 20 to 35 percent after implementing centralized digital asset management systems. DC has the institutional infrastructure to do the same — the question, as always, is whether the political will and budget allocation arrive before the next storage renewal contract does.
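The first of those steps, perceptual hashing, can be illustrated with a minimal sketch of one simple variant known as average hashing. It assumes each image has already been decoded and shrunk to a small grayscale grid, which in practice would be done with an image library; here the grids are hand-written 4x4 lists of brightness values. The function names and the `max_distance` threshold are hypothetical choices for illustration, not settings from any city system.

```python
def average_hash(pixels):
    """Return a bit string: '1' where a pixel is brighter than the grid's mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def looks_duplicate(pixels_a, pixels_b, max_distance=2):
    """Flag two grids as likely duplicates if their hashes nearly match.

    max_distance is an assumed threshold; real tools tune it per collection.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Hypothetical grids: dark on the left, bright on the right.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
# Same picture, slightly brighter, as after re-encoding.
recompressed = [[value + 3 for value in row] for row in original]
# A different picture: bright on the left, dark on the right.
different = [list(reversed(row)) for row in original]

print(looks_duplicate(original, recompressed))
print(looks_duplicate(original, different))
```

Because the hash depends on the pattern of light and dark rather than on exact bytes or file names, the brightened copy matches the original while the mirrored image does not.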
Published by The Daily Washington DC