The library exists for two equally important reasons: Ora needs knowledge it can access freely, with no commercial pinch point on its inputs; and the public benefits from a continuously curated, provenance-weighted knowledge resource available to anyone. The library is designed from day one to be replicable, distributable, and verifiable without the Foundation's continued operation. If the Foundation disappears, the library persists in distributed form. That is the public-domain commitment made operational at the data layer.

Constitutional principles

The library operates under six principles that apply from day one and govern the algorithms whether or not anyone is watching:

  • Provenance-weighted, source-traceable output. Every claim is connected to its source. Source reliability is evaluated and weighted, not asserted.
  • Public-domain output. All published datasets remain freely available and in the public domain. The library does not become a leverage point because it cannot be enclosed.
  • No editorial bias by processing. The algorithms execute specifications. They do not make editorial judgments. Where the specification does not clearly address content, the content is flagged for human review rather than processed by guess.
  • Public processing specifications. The specifications that govern each domain are public documents, open to inspection by anyone. Anyone can see how a piece of content came to be in the library, against what criteria, with what provenance weight.
  • Automation serves specification, not the reverse. If an algorithm cannot faithfully execute a specification, the algorithm is fixed or the specification is amended by its committee. The algorithm does not silently approximate.
  • No monetization of user interaction data. User retrievals, queries, and behavior are not monetized, sold, or shared. The library is infrastructure, not surveillance.
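
The "flag rather than guess" rule in the third and fifth principles can be illustrated with a minimal sketch. Everything named here (Specification, Outcome, decide, the source-type labels) is hypothetical and chosen for illustration; the document does not define a specification format.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PROCESS = "process"
    REJECT = "reject"
    FLAG_FOR_REVIEW = "flag_for_review"


@dataclass(frozen=True)
class Specification:
    """Hypothetical, simplified stand-in for a committee-authored specification."""
    domain: str
    version: str
    accepted_source_types: frozenset
    rejected_source_types: frozenset


def decide(spec: Specification, source_type: str) -> Outcome:
    """Apply the specification literally; never guess on uncovered input."""
    if source_type in spec.accepted_source_types:
        return Outcome.PROCESS
    if source_type in spec.rejected_source_types:
        return Outcome.REJECT
    # The specification is silent on this source type, so a human decides.
    return Outcome.FLAG_FOR_REVIEW


# Illustrative values only; real criteria would come from the domain committee.
spec = Specification(
    domain="encyclopedia",
    version="0.1",
    accepted_source_types=frozenset({"peer_reviewed_journal", "primary_document"}),
    rejected_source_types=frozenset({"anonymous_forum_post"}),
)

print(decide(spec, "podcast_transcript").value)
```

The point of the sketch is the final branch: an input the specification does not cover yields an explicit review flag instead of a best-effort classification.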

The four-layer operating model

The operating model pairs domain-specific specifications, authored by subject-matter experts, with faithful execution by automated processing and judicial review for edge cases. The pattern is preserved from earlier design work because it is the right operating model for a knowledge library specifically.

  • Layer 1 — Constitutional principles (the six above) govern all library work.
  • Layer 2 — Expert committees author the specifications that govern each domain: what qualifies as a source, how provenance is weighted, how content is processed and atomized, and which edge cases require human review.
  • Layer 3 — Judicial review handles edge cases flagged by users or algorithms. Rulings become precedent for future specification amendments by the relevant committee.
  • Layer 4 — Algorithms execute specifications faithfully, deterministically, and auditably. Every algorithmic decision is traceable to the specification that authorized it.
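
One way to make Layer 4 decisions traceable is to emit a record that names the authorizing specification and clause alongside a digest of the input. The sketch below assumes a hypothetical record shape (spec_id, spec_version, clause); none of these field names come from the document.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Decision:
    spec_id: str        # hypothetical identifier of the governing specification
    spec_version: str   # version of the specification that was executed
    clause: str         # clause that authorized the decision
    input_digest: str   # SHA-256 of the content the decision applies to
    outcome: str


def record_decision(spec_id: str, spec_version: str, clause: str,
                    content: bytes, outcome: str) -> Decision:
    """Bind an outcome to the exact input and the clause that authorized it."""
    digest = hashlib.sha256(content).hexdigest()
    return Decision(spec_id, spec_version, clause, digest, outcome)


def to_audit_line(decision: Decision) -> str:
    """Serialize with sorted keys so identical decisions yield identical text."""
    return json.dumps(asdict(decision), sort_keys=True)


line = to_audit_line(
    record_decision("news-us", "0.1", "3.2", b"example article text", "process")
)
print(line)
```

Because the record contains no timestamps or random identifiers, re-running the same specification on the same input reproduces the same audit line, which is one simple check on determinism.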

At launch, the founder operates all human roles. Formal committee staffing is triggered when funding or volunteers arrive. The constitutional principles apply from day one regardless of organizational size.

The universal pipeline

Every knowledge domain follows the same processing pattern. The pattern is the infrastructure; only the specifications differ.

Source identification → provenance verification → processing (any document → atomic notes → structured output) → cross-referencing → indexing → publication. Each domain's specification document governs steps 1 through 3 (source identification, provenance verification, processing). Steps 4 through 6 (cross-referencing, indexing, publication) are infrastructure-level and domain-agnostic.

Decentralized infrastructure

The library is hosted across many independent nodes rather than concentrated on Foundation servers. The Foundation operates some nodes and coordinates the network; it does not host exclusively. The technical architecture combines content-addressed storage (IPFS as reference implementation), distributed hosting through volunteer nodes (partner organizations, university libraries, contemplative-tradition digital archives), and cryptographic provenance verification through the P1–P6 hierarchy implemented as canonical-document signing at each provenance level.

The Foundation's role in this architecture is signing authority and specification authorship rather than hosting infrastructure. Foundation as authority, network as infrastructure. This is what allows the library to persist regardless of the Foundation's continued operation while preserving the provenance verification that makes the library trustworthy. A distributed library cannot be enclosed by acquiring or compromising a single entity.
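
Content addressing is what lets a reader verify material without trusting the node that served it. The following is a simplified sketch: it uses a plain SHA-256 hex digest as the address, whereas real IPFS content identifiers use multihash and multibase encodings, and it omits the signing step of the P1–P6 hierarchy entirely. Node and fetch_verified are hypothetical names.

```python
import hashlib
from typing import Dict, Iterable


def content_address(data: bytes) -> str:
    """Simplified address: SHA-256 hex digest (not a real IPFS CID)."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """Hypothetical in-memory stand-in for one independent hosting node."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


def fetch_verified(nodes: Iterable[Node], address: str) -> bytes:
    """Return the first copy whose digest matches the requested address."""
    for node in nodes:
        try:
            data = node.get(address)
        except KeyError:
            continue  # this node does not hold the content
        if content_address(data) == address:
            return data  # verified without trusting the node
    raise LookupError(f"no node returned valid content for {address}")


foundation_node, volunteer_node = Node(), Node()
address = foundation_node.put(b"atomic note: example")
volunteer_node.put(b"atomic note: example")

# The content stays retrievable and verifiable from the volunteer node alone.
print(fetch_verified([volunteer_node], address) == b"atomic note: example")
```

Because the address is derived from the bytes, any surviving node can serve the content and any reader can check it, which is the property the persistence claim depends on.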

Phasing

Phase 1 — easy public-domain material. Encyclopedia (Wikipedia equivalent), source library (Wikisource equivalent), dictionary/lexicon (Wiktionary equivalent). The Level 3 provenance base that makes Ora useful on day one. Source material already digitized, already in the public domain, already in formats that can be processed without negotiation.

Phase 2 — news and current events. The library pipeline applied to current events, solving the AI training-cutoff problem with continuously updated provenance-verified news. Initial scope: US national news. Journalistic standards apply: source verification, multi-source corroboration, provenance tracking on all claims.
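
Multi-source corroboration could be checked mechanically along the following lines. This is a sketch under stated assumptions: the Report shape, the publisher labels, and the threshold of two distinct publishers are all illustrative, and the actual standard would be set by the news-domain specification.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Report:
    claim: str      # normalized claim text (normalization is out of scope here)
    publisher: str  # hypothetical publisher label
    url: str        # where the claim was reported, kept for provenance


def corroboration(reports: Iterable[Report],
                  min_publishers: int = 2) -> Dict[str, bool]:
    """Mark each claim as corroborated if enough distinct publishers report it."""
    publishers_by_claim: Dict[str, Set[str]] = {}
    for report in reports:
        publishers_by_claim.setdefault(report.claim, set()).add(report.publisher)
    return {
        claim: len(publishers) >= min_publishers
        for claim, publishers in publishers_by_claim.items()
    }


reports = [
    Report("bill passed the senate", "publisher_a", "https://example.org/a/1"),
    Report("bill passed the senate", "publisher_b", "https://example.org/b/7"),
    Report("bill passed the senate", "publisher_a", "https://example.org/a/2"),
    Report("minister will resign", "publisher_c", "https://example.org/c/3"),
]
status = corroboration(reports)
print(status)
```

Counting distinct publishers is only a proxy for independence: two outlets reprinting the same wire story would pass this check, so a real specification would need a stricter definition of an independent source.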

Phase 3 and beyond. Textbooks (Wikibooks equivalent), courses (Wikiversity equivalent), government data (Federal Reserve papers, census data, regulatory filings, congressional records, agency reports).

Deferred domains. Legal, medical, museum/cultural heritage, patents, and literature/film. Each presents specific challenges beyond the standard pipeline. These domains are deferred not because they are unimportant but because their processing frameworks have not yet been designed.

Programs operated under this component

  • Main Street Independent — the publication operating from the framework methodology on a daily news beat. The proof of concept for the news domain in Phase 2.