Washington's city government is sitting on a records management headache that has quietly ballooned over the past three years: tens of thousands of duplicate digital images stored across multiple agency servers, inflating storage costs, slowing database queries, and, in at least one documented case, sending contradictory identification photos to federal partners. The problem spans departments from the D.C. Department of Motor Vehicles on C Street SW to the Office of Unified Communications, which coordinates 911 dispatch on Indiana Avenue NW.
The timing matters. With the Trump administration's Department of Government Efficiency initiative pushing federal agencies to audit their own data infrastructure, D.C. officials face pressure from two directions. Federal contracts that fund portions of city IT operations increasingly require data hygiene certifications, and local budget constraints leave agencies with limited room to hire outside vendors for cleanup projects: Mayor Muriel Bowser's fiscal year 2026 spending plan absorbed cuts in several technology line items.
The Scale of the Problem, and Who Is Paying Attention
Duplicate image files accumulate for mundane reasons: staff scanning the same document twice, legacy system migrations that copy rather than move records, and inter-agency data-sharing protocols that don't flag files already in the receiving system. The D.C. Office of the Chief Technology Officer, based at 200 I Street SE, has been working since at least early 2025 to develop a deduplication framework applicable across District agencies, according to publicly posted procurement notices on the city's contracts database. A solicitation posted in February 2025 sought vendors with experience in hash-based image deduplication at scale, with contract values listed in the six-figure range.
Records management specialists outside government say the District's situation is common among mid-sized municipal systems that digitized paper records rapidly during the 2010s without standardized metadata tagging. The National Archives and Records Administration, headquartered at 700 Pennsylvania Avenue NW, has published guidance noting that local government digitization projects frequently generate duplication rates of 15 to 30 percent in scanned image archives — a range that, applied to D.C.'s known holdings, could represent hundreds of thousands of redundant files.
The D.C. Public Library's digital services team, which manages the Washingtoniana Collection at the Martin Luther King Jr. Memorial Library on G Street NW, completed its own deduplication audit in 2024. Library technologists found that roughly one in five image files in a subset of neighborhood photograph archives was a functional duplicate, and that eliminating them reduced that particular collection's storage footprint by nearly a quarter. Those figures came from the library's own published annual report, not from independent verification, but they offer one of the few locally sourced data points available.
What Comes Next — and What Agencies Are Being Told
Technology policy advocates who track D.C. government IT spending say agencies should treat the deduplication problem as a prerequisite for any broader artificial intelligence or machine-learning initiative, since models trained on image sets with high duplication rates can overweight the repeated images and produce skewed outputs. The D.C. Department of Forensic Sciences, housed in its facility on E Street NW, relies on image databases for evidence management, a context where duplicates carry legal, not merely administrative, consequences.
Practical advice circulating among city IT managers, drawn from published guidance by the National Institute of Standards and Technology, emphasizes three steps: conduct a full inventory using automated hash-comparison tools before any manual review, establish a single authoritative repository per record type before deleting apparent duplicates, and document the deduplication process in a way that satisfies D.C. Superior Court evidentiary standards for public records.
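For readers who want to see what those three steps look like in practice, the following is a minimal sketch in Python, not a description of any tool the District or NIST actually uses. It assumes byte-identical duplicates, SHA-256 as the hash, a short list of image file extensions, and a hypothetical folder standing in for the "authoritative repository"; the function names `inventory` and `plan_cleanup` are invented for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image file types; a real inventory would match agency formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(root: Path) -> dict[str, list[Path]]:
    """Step 1: map each content hash to every image file that carries it."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            by_hash[sha256_of(path)].append(path)
    return dict(by_hash)


def plan_cleanup(by_hash: dict[str, list[Path]], authoritative_root: Path) -> list[dict]:
    """Steps 2 and 3: choose one copy to keep per hash and record the decision.

    Nothing is deleted here; the returned rows form an audit log for human review.
    """
    rows = []
    for digest, paths in by_hash.items():
        if len(paths) < 2:
            continue  # unique file, nothing to plan
        # Prefer a copy that already lives inside the authoritative repository.
        in_repo = [p for p in paths if authoritative_root in p.parents]
        keep = (in_repo or paths)[0]
        for path in paths:
            if path != keep:
                rows.append({"sha256": digest, "keep": str(keep), "duplicate": str(path)})
    return rows
```

Calling `plan_cleanup(inventory(root), root / "repo")` returns one row per redundant file, which could be written out with the standard `csv` module as a record of what was flagged and why. Exact hashing of this kind only catches byte-for-byte copies; the same document scanned twice usually produces different bytes, so near-duplicates would need a perceptual-hash or similar comparison that this sketch does not attempt.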
The Office of the Chief Technology Officer's procurement timeline suggests a vendor contract could be awarded before the end of calendar year 2026. Whether individual agencies will fund their own parallel efforts or wait for a city-wide solution remains an open budget question heading into the fall appropriations season. For residents who interact with D.C. government databases, from permit applications processed through the Department of Buildings on Rhode Island Avenue NE to driver's license renewals at the C Street DMV, the downstream effect is the same: slower systems and, occasionally, the wrong image attached to the right name.