Washington's municipal agencies are sitting on hundreds of thousands of duplicate digital images (redundant photographs, scanned documents, and graphics files stored across overlapping servers), and the problem has quietly compounded for more than a decade. The Office of the Chief Technology Officer, headquartered at 441 4th Street NW, is now leading an audit to identify and purge the worst of the redundancy from city systems, a task that staff describe internally as far larger than originally anticipated.
The timing matters. Under the Trump administration's DOGE-driven federal restructuring, the line between what the District government manages independently and what falls under federal digital infrastructure has grown blurrier. Several DC agencies share server architecture with federal counterparts, and the freeze on federal IT spending this spring created a pileup: routine image-replacement and deduplication workflows stalled, and files that should have been flagged for deletion remained on live servers. The result is a storage and retrieval headache that is increasingly a budget problem as well.
A Problem Built Over Years, Not Months
The roots go back to at least 2014, when the DC government began an aggressive push to digitise records held by agencies including the Department of Consumer and Regulatory Affairs and the DC Public Library system, which operates branches from the Martin Luther King Jr. Memorial Library on G Street NW to neighborhood locations in Anacostia and Columbia Heights. Each digitisation project generated its own filing conventions, and almost none of them talked to each other. Images were saved in multiple resolutions, renamed inconsistently, and in many cases uploaded more than once when staff changed or project handoffs were mismanaged.
By 2020, internal assessments reportedly found that storage costs tied to duplicated assets were running into the hundreds of thousands of dollars annually across the District's portfolio of digital systems, although no official figure covering the full scope has been published. The DC Public Schools system, which manages its own digital asset library for communications and curriculum materials, ran a separate deduplication effort in 2022 covering roughly 1.2 million files, according to a presentation given at a 2023 digital records conference hosted by the National Archives in College Park, Maryland.
The situation sharpened this year. Mayor Muriel Bowser's fiscal year 2026 budget included a technology modernisation allocation for the Office of Unified Communications and related city IT arms, but the sequencing of those funds has been complicated by the federal funding uncertainty that has shadowed District governance since January. Some projects that require joint federal-District server access have seen approval delays of several months.
What the Cleanup Actually Involves
Deduplication at this scale is not a single software run. Technologists working on municipal systems distinguish between exact duplicates — identical files with the same hash value — and near-duplicates, which are visually identical images saved in different formats, resolutions, or with altered metadata. The latter category is far more expensive to resolve and requires human review or AI-assisted matching tools that the District is still in the process of procuring.
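To make the distinction concrete, the short Python sketch below illustrates both categories. It is an illustration only and does not represent the tools the District uses or is procuring. The function names are hypothetical, the "images" are plain grids of grayscale values rather than real files, and the near-duplicate check uses a simple average hash, one common perceptual-hashing technique among several.

```python
import hashlib
from collections import defaultdict


def exact_duplicate_groups(files):
    """Group file names whose bytes share the same SHA-256 digest.

    `files` is a hypothetical mapping of file name -> raw bytes.
    """
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def average_hash(pixels, size=8):
    """Return a size*size-bit fingerprint of a grayscale pixel grid.

    The grid is shrunk to size x size by block averaging, then each
    cell contributes one bit: 1 if brighter than the overall mean.
    Assumes the grid is at least `size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        for c in range(size):
            rows = range(r * height // size, (r + 1) * height // size)
            cols = range(c * width // size, (c + 1) * width // size)
            block = [pixels[y][x] for y in rows for x in cols]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# Exact duplicates: byte-identical content under different names.
files = {"scan_a.png": b"abc", "scan_b.png": b"abc", "scan_c.png": b"abd"}
exact_groups = exact_duplicate_groups(files)

# Near-duplicates: the same picture at two resolutions.
original = [[(x + y) * 8 for x in range(16)] for y in range(16)]
upscaled = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
inverted = [[240 - v for v in row] for row in original]

same_picture = hamming(average_hash(original), average_hash(upscaled))
different_picture = hamming(average_hash(original), average_hash(inverted))
```

Here the byte-level hash groups only the two identical files, while the average hash gives the original and its upscaled copy matching fingerprints and the inverted image a very different one. In practice a distance threshold decides what counts as a match, and borderline cases are where human review comes in.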
The Metropolitan Police Department and the DC Office of Planning both maintain large proprietary image libraries — aerial surveys, crime scene photograph archives, zoning documentation — that fall outside the scope of the current OCTO audit. Those agencies manage their deduplication internally, on separate timelines.
For residents and businesses interacting with DC systems, the practical impact shows up most visibly in the online permit and licensing portals managed by DCRA, where outdated or duplicated reference images have occasionally caused mismatches in property documentation. The agency acknowledged the issue in a March 2026 service update posted to its website, though it did not quantify the number of affected records.
The OCTO audit is expected to produce a phased remediation roadmap by the end of September 2026. Agencies will then be required to migrate to a unified digital asset management platform — a process likely to stretch into 2027. For anyone dealing with DC records requests or digital permitting between now and then, agency staff recommend submitting requests with as much identifying metadata as possible to reduce the chance of a duplicate file being returned in error.