In 1996, fifteen-year-old John Greenewald Jr. began filing Freedom of Information Act requests and sharing the responses on a modest web host. The project was formalised as theblackvault.com, which presents itself as a permanent, open repository of original U.S. government material.
Scale and Architecture
The archive tracks 3.43 million OCR-indexed pages across intelligence, defense, science, and UAP topics.[1] WordPress renders article stubs, while a separate Apache directory tree (documents.theblackvault.com) serves bulk PDFs, some exceeding 400 MB, outside the CMS.[2] Elasticsearch powers the keyword search box advertised on the landing page; older image-only scans are gradually being re-OCR'd.
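Because some of the bulk PDFs exceed 400 MB, a client fetching them would normally stream each file to disk rather than hold it in memory. The following is a minimal sketch of that idea using only the Python standard library. The names `copy_stream` and `download` and the 1 MiB chunk size are assumptions made for illustration; nothing here is published by the site.

```python
import io
from urllib.request import urlopen

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def copy_stream(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst in fixed-size chunks and return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def download(url, dest_path):
    """Stream one remote file to disk without loading it fully into memory."""
    with urlopen(url) as response, open(dest_path, "wb") as out:
        return copy_stream(response, out)


if __name__ == "__main__":
    # Offline demonstration: an in-memory buffer stands in for a large PDF.
    fake_pdf = io.BytesIO(b"%PDF-1.4\n" + b"\0" * 3_000_000)
    sink = io.BytesIO()
    print(copy_stream(fake_pdf, sink), "bytes copied")
```

Keeping the copy loop separate from the network call lets the chunking logic be exercised without touching the live server.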
Raw File Repository
Researchers bypass the blog layer and pull source documents directly from the open file tree, which lists more than 6,000 sub-directories grouped by agency and year.
The front-end search queries the Elasticsearch index; advanced users scrape the directory listings or feed the file URLs to local scripts. No official API is offered.
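The directory-scraping approach can be sketched as follows, again with only the Python standard library: parse an Apache-style index page and keep the links that point to PDFs. The sample HTML, the `example-agency/1996/` path, the `https` scheme, and the names `ListingParser` and `pdf_links` are all assumptions made for illustration. Real listings on the site may be laid out differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ListingParser(HTMLParser):
    """Collect the href of every <a> tag found in a directory-listing page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def pdf_links(listing_html, base_url):
    """Return absolute URLs for the PDF entries in one listing page."""
    parser = ListingParser()
    parser.feed(listing_html)
    return [
        urljoin(base_url, href)
        for href in parser.hrefs
        if href.lower().endswith(".pdf")  # skips sub-directories and sort links
    ]


# Hypothetical listing: the path and file names are invented for illustration.
SAMPLE = """
<html><body><h1>Index of /example-agency/1996</h1>
<a href="../">Parent Directory</a>
<a href="report-a.pdf">report-a.pdf</a>
<a href="scans/">scans/</a>
<a href="REPORT-B.PDF">REPORT-B.PDF</a>
</body></html>
"""

BASE = "https://documents.theblackvault.com/example-agency/1996/"

if __name__ == "__main__":
    for url in pdf_links(SAMPLE, BASE):
        print(url)
```

Because no official API is offered, a script of this kind depends on the listing markup staying stable, and it should throttle its requests out of courtesy to the host.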
Columbia Journalism Review reported ten terabytes of monthly downloads in 2020; live counters now show roughly 65,000 daily visitors and 3–5 TB of egress.[3][1]
Publications and Media Profile
Greenewald's 2019 book Inside the Black Vault (Rowman & Littlefield) summarises notable document sets. Coverage has appeared in CJR, Popular Mechanics, Vice, Wired, and network television specials.
Notable FOIA Releases
The Black Vault secured and posted the CIA's entire UFO document cache in 2021, digitised twelve gigabytes of Project Blue Book microfilm, and released Air Force laser-propagation studies once classified for Strategic Defense Initiative planning.[4][5]
FOIA Strategy
Greenewald files targeted requests, publishes all correspondence, and appeals denials through administrative and judicial channels, creating templates that future requesters can reuse.[6]
Media and Academic Use
Major outlets, including the Los Angeles Times, Popular Mechanics, and Vice, cite Vault records on topics ranging from NSA satellite programmes to Cold War psychotronic research. University security-studies seminars assign document sets hosted on the site.[7]
Funding and Governance
Operating costs are met through advertising, podcast sponsorships, paid DVD compilations, and small patron subscriptions. Greenewald retains sole ownership and mirrors servers in multiple jurisdictions for continuity.
Legacy
By systematising public-records requests and releasing the results without paywalls, The Black Vault transformed individual curiosity into a durable public resource that shapes modern discourse on secrecy and oversight.