How Egypt's Archives Ended Up Full of Ghost Images — and What Comes Next
A decade of rushed digitisation, underfunded libraries, and pandemic-era shortcuts left Cairo's public record riddled with duplicate scans. Now officials are trying to fix it.

Egypt's national archiving infrastructure has a problem it has spent years quietly ignoring: tens of thousands of duplicate image files embedded across government databases, municipal portals, and the digital holdings of institutions from the Egyptian National Library and Archives on Corniche el-Nil to the Cairo Governorate's property records office in Abdeen. The scale of the redundancy, confirmed by a procurement tender published by the Ministry of Communications and Information Technology in late May 2026, is substantial enough that officials have now commissioned a dedicated deduplication project — the first of its kind run at a national level in Egypt.
The timing matters. Egypt is midway through a broader e-government push linked to conditions attached to its International Monetary Fund loan programme, which has required Cairo to demonstrate measurable improvements in public-sector efficiency. Cleaning up digital image repositories is unglamorous, but it sits inside a wider obligation to reduce wasteful infrastructure spending. That obligation carries extra weight at a moment when the Egyptian pound has lost significant purchasing power over the past three years and every gigabyte of redundant server storage represents real, avoidable cost.
The duplication problem has three overlapping origins, each tied to a specific period of Egyptian institutional life. The first came during the mass scanning drives of the mid-2010s, when the Egyptian National Library and Archives — based on the Corniche in Boulaq — pushed to digitise millions of documents, newspapers, and photographic plates. Scanners were operated by contracted staff paid per batch, not per unique image, creating a structural incentive to rescan rather than verify. Nobody was deliberately cheating the system; the workflow simply did not require a deduplication check before upload.
The second wave came after 2017, when the New Administrative Capital project generated enormous volumes of architectural drawings, planning images, and engineering photographs that needed to be filed across multiple ministries simultaneously. The Capital Administrative City Company, the state entity managing the project, distributed image files to the Housing Ministry, the Transport Ministry, and the Administrative Capital's own internal document system. Without a shared file-naming convention or a central image hash registry, the same scan of the same blueprint often ended up stored under three different filenames in three different databases.
The third and largest contributor was the period between 2020 and 2022. When government offices in Cairo shut during the pandemic and then reopened under hybrid arrangements, departments that had previously co-ordinated their uploads in person shifted to decentralised remote filing. The Cairo Governorate's Abdeen administrative complex, which handles property deeds, building permits, and civil registration images for much of central Cairo, saw upload volumes spike with almost no central quality control. An internal audit cited in the May 2026 tender document put the proportion of duplicate image entries in the property records portal at approximately 34 percent of all files uploaded during that two-year window.
The Ministry of Communications and Information Technology tender, published on 28 May 2026, set a contract ceiling of 47 million Egyptian pounds for a contractor to deploy automated perceptual hashing software across four primary government image repositories over an 18-month period. The work will begin with the National Library holdings, move to the New Administrative Capital document store, then address the Cairo Governorate property portal, and finally sweep the Ministry of Health's patient imaging archive, which is stored at a data centre in the Sixth of October City technology zone west of the capital.
Perceptual hashing works differently from simple file comparison. Rather than checking whether two files are byte-for-byte identical, the software generates a fingerprint from each image's visual content and treats images whose fingerprints are close as matches. A document that was scanned twice at slightly different resolutions, a common occurrence in the 2020–2022 period, will therefore still be flagged as a duplicate. The approach is now standard practice in commercial media archives, though its adoption in Middle Eastern public administration has lagged behind European counterparts by roughly six to eight years.
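For readers curious about the mechanics, the idea can be sketched in a few lines of Python. The tender does not say which algorithm the contractor will use, so this example assumes the simplest variant, often called an average hash: shrink the image to an 8x8 grid, mark each cell as brighter or darker than the grid's mean, and compare the resulting 64-bit fingerprints by counting differing bits. The function names, the synthetic test pages, and the threshold of 5 bits are all illustrative assumptions, not details from the project.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels (0-255), row-major


def downscale(img: Image, size: int = 8) -> List[List[float]]:
    """Shrink img to size x size by averaging rectangular blocks of pixels."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)
            block = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(img: Image, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    cells = [v for row in downscale(img, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def make_page(n: int, dark: int = 30, light: int = 220) -> Image:
    """Hypothetical test 'document': a dark header band and a dark lower-right box."""
    return [
        [dark if (y < n // 4 or (y >= n // 2 and x >= n // 2)) else light
         for x in range(n)]
        for y in range(n)
    ]


def make_other(n: int) -> Image:
    """A visually different page: dark left half, light right half."""
    return [[30 if x < n // 2 else 220 for x in range(n)] for y in range(n)]


# The same page "scanned" twice: different resolution, slightly different contrast.
scan_a = make_page(64)
scan_b = make_page(32, dark=40, light=210)
unrelated = make_other(64)

THRESHOLD = 5  # assumed cut-off; real systems tune this per collection
is_duplicate = hamming(average_hash(scan_a), average_hash(scan_b)) <= THRESHOLD
is_false_match = hamming(average_hash(scan_a), average_hash(unrelated)) <= THRESHOLD
```

The two scans share no identical bytes, so a plain file comparison would treat them as separate documents, yet their fingerprints match. Production tools generally use more robust variants than this one, but the comparison step works the same way.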
For ordinary Cairenes, the most immediate practical consequence will be faster load times on the e-government portal Misr Digital, which draws on the same image servers. Businesses applying for trade licences through the Greater Cairo Chamber of Commerce's online platform, which integrates with government document databases, have long complained that attached scanned documents sometimes appear duplicated or fail to upload cleanly. The deduplication project should resolve a share of those errors by late 2027, assuming the contract is awarded and work begins on schedule this autumn.
About this article
Published by The Daily Cairo