DC's Fight Against Duplicate Images in Public Records Lags Behind London and Seoul
As federal restructuring scrambles city databases, Washington finds itself behind peer capitals in purging redundant visual data from government archives.

Washington DC's Office of the Chief Technology Officer quietly flagged a problem last fall that has grown harder to ignore: an estimated 4.2 million duplicate images are sitting inside the District's permitting, licensing, and property records systems, clogging servers, slowing clerks, and costing taxpayers roughly $1.8 million a year in unnecessary storage contracts. Six months later, the cleanup is barely a quarter done.
The timing is brutal. Mayor Muriel Bowser's administration is simultaneously absorbing the downstream chaos of DOGE-driven federal workforce cuts, which have pushed hundreds of furloughed federal employees toward DC's own social services and licensing queues since January. Every slowdown in the city's digital infrastructure hits harder when the caseload is already swollen.
The problem sounds mundane until you see the numbers. A single construction permit application filed with the DC Department of Buildings on Rhode Island Avenue NE can generate up to eleven copies of the same site photograph, one for each departmental review stage, and none of the city's legacy software automatically reconciles them. Multiply that across the roughly 38,000 permits processed annually, and the redundancy compounds fast. OCTO estimated in its November 2025 internal audit that duplicate image files account for 31 percent of total storage load on the District's GovCloud contract with Microsoft Azure, a deal renewed in March 2024 at $6.4 million per year.
London's Government Digital Service began tackling the equivalent problem in 2021 under its Local Digital programme, deploying a deduplication layer across 32 borough councils. By 2023, the GDS reported a 44 percent reduction in redundant visual assets across planning and licensing databases. Seoul's Smart City Division launched a similar automated hash-matching protocol in 2022 and cleared its backlog within eight months. Neither city faced the additional complication Washington does: a patchwork of overlapping federal and municipal data systems that often store copies of the same images independently, with different retention schedules and no shared reconciliation tool.
In DC, the responsibility falls between agencies in ways that have historically resisted resolution. The Office of Planning, the DC Department of Transportation, and the Department of Consumer and Regulatory Affairs all maintain separate image repositories. A photograph of a cracked sidewalk on Georgia Avenue NW can exist in all three simultaneously, each version owned by a different agency, none of them talking to the others.
OCTO launched a pilot deduplication program in the NoMa neighborhood in April 2026, targeting roughly 180,000 images tied to the construction boom along New York Avenue NE. The pilot uses open-source perceptual hashing software, the same category of tool Seoul adopted, and as of late June had cleared about 62,000 duplicates, freeing 1.1 terabytes of storage. If the results hold, OCTO has told the city council it wants a $2.3 million contract to scale the program District-wide by the first quarter of 2027.
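For readers unfamiliar with perceptual hashing, the idea can be sketched in a few lines. The example below is a minimal illustration of one common variant, an "average hash," and is not the software OCTO or Seoul actually uses, which the reporting does not name. It assumes images have already been decoded into grids of grayscale values at least 8 pixels on each side; the function names and the sample images are hypothetical.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale values, 0-255


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Shrink the image to size x size by block averaging, then set one bit
    per cell: 1 if the cell is brighter than the overall mean, else 0.
    Assumes the image is at least `size` pixels in each dimension."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for by in range(size):
        for bx in range(size):
            y0, y1 = by * height // size, (by + 1) * height // size
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for cell in cells:
        bits = (bits << 1) | (1 if cell > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Hypothetical sample data: a 16x16 gradient, a uniformly brighter copy,
# and a tonally inverted image standing in for a different photograph.
original = [[8 * x + 4 * y for x in range(16)] for y in range(16)]
brighter = [[v + 10 for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

dist_brighter = hamming(average_hash(original), average_hash(brighter))
dist_inverted = hamming(average_hash(original), average_hash(inverted))
```

Unlike a cryptographic checksum, which changes completely if a single byte differs, this kind of hash stays the same or nearly the same when a photograph is re-saved or uniformly brightened. Two images whose hashes differ by only a few bits are treated as likely duplicates; the cutoff is a tuning decision.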
Chicago ran a comparable pilot in its Department of Streets and Sanitation in 2023 and found that automated deduplication only works cleanly on about 70 percent of images; the remaining 30 percent require human review because the files contain metadata discrepancies that automated tools flag as potentially unique. DC's OCTO has budgeted for a review team of six contractors to handle that gray zone — a leaner setup than Chicago's 14-person task force, which may mean the timeline slips.
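The split between automatic cleanup and human review can be illustrated with a short sketch. This is a hypothetical triage rule, not Chicago's or OCTO's actual procedure: it assumes each record carries its decoded image content separately from its metadata, groups records whose content is byte-identical, and merges a group automatically only when the metadata also agrees. The `ImageRecord` type and `triage` function are invented for the example.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ImageRecord:
    record_id: str
    pixel_bytes: bytes                     # image content with metadata stripped
    metadata: Tuple[Tuple[str, str], ...]  # e.g. (("agency", "DDOT"),)


def triage(records: List[ImageRecord]) -> Tuple[List[List[str]], List[List[str]]]:
    """Return (auto_merge, needs_review) groups of record IDs.
    Records with identical content and identical metadata are merged
    automatically; identical content with conflicting metadata goes to a person."""
    groups: Dict[str, List[ImageRecord]] = defaultdict(list)
    for record in records:
        digest = hashlib.sha256(record.pixel_bytes).hexdigest()
        groups[digest].append(record)

    auto_merge, needs_review = [], []
    for members in groups.values():
        if len(members) < 2:
            continue  # unique image, nothing to deduplicate
        ids = sorted(m.record_id for m in members)
        if len({m.metadata for m in members}) == 1:
            auto_merge.append(ids)
        else:
            needs_review.append(ids)
    return auto_merge, needs_review


# Hypothetical records: a/b agree fully, c/d share content but not metadata.
sample = [
    ImageRecord("a", b"sidewalk", (("agency", "DDOT"),)),
    ImageRecord("b", b"sidewalk", (("agency", "DDOT"),)),
    ImageRecord("c", b"storefront", (("agency", "OP"),)),
    ImageRecord("d", b"storefront", (("agency", "DDOT"),)),
    ImageRecord("e", b"alley", (("agency", "OP"),)),
]
auto_groups, review_groups = triage(sample)
```

In this toy data, the two sidewalk records merge without intervention, while the storefront pair lands in the review queue because two agencies claim it. That second category is the gray zone the contractors would work through.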
For residents and small business owners filing applications at the permit center at 1100 4th Street SW, the practical payoff would be faster processing times. OCTO estimates that clearing the duplicate backlog would shave an average of three business days off permit approvals, a significant saving for contractors whose financing costs accrue daily. The NoMa pilot data, due to be presented to the Committee on Technology and Innovation in September 2026, will determine whether the council authorizes the broader rollout. If it doesn't, DC will keep paying for storage space to house millions of photographs that, by every measure, already exist somewhere else in its own systems.
Published by The Daily Washington DC