Overview

The Process Formalization Framework (PFF) is a meta-framework — a framework for creating frameworks. When a process is known well enough to run reliably across executions but has not been formalized, PFF takes the description of the process and produces a bespoke framework specification that runs the process the same way every time. PFF is the productivity multiplier of Ora’s framework ecosystem: design a process once with PFF, then run the resulting framework hundreds of times against new instances of the same task. Most of Ora’s runtime frameworks were produced via PFF — including some used by the Information Lifecycle System itself (CFF and OFF were both PFF-designed).

The framework runs in four modes:

  • F-Design creates a new framework from a task description through a structured 10-question requirements gathering protocol, followed by complexity assessment and architecture confirmation. The default output is one consolidated framework file (a single-pass rendering that runs in commercial AI without further packaging).

  • F-Convert modernizes an existing framework against the current standard, identifying gaps and producing one consolidated updated framework file conforming to the Framework Anatomy and Authoring Standards (Sections II and III of the canonical specification).

  • F-Render is the explicit opt-in path for producing additional execution variants from a consolidated framework: agent-mode rendering for swarm execution with stages and tools; reasoning-model rendering simplified for o3 / o4 / DeepSeek-R1; and an additional single-pass rendering when reconstructing from older split-artifact frameworks.

  • F-Audit evaluates an existing framework against the Quality Verification Checklist (Section VII) and produces a score per criterion plus specific remediation recommendations.

The framework’s load-bearing intellectual content is the Framework Anatomy, the Authoring Standards, the Quality Verification Checklist, and the Inline Properties Principle.

The Framework Anatomy specifies the structural standard every PFF-designed framework must conform to: Display Name and Description, Setup Questions, How to Use This File, Table of Contents, Purpose, Input Contract, Output Contract, Execution Tier, Milestones Delivered (with per-milestone inline properties), Evaluation Criteria (with 5-level rubrics per criterion; minimum passing score 3), Persona, numbered Layers (one per processing stage with Stage Focus / Input / Output / Processing Instructions), Self-Evaluation Layer, Error Correction and Output Formatting Layer, Named Failure Modes, Execution Commands, and User Input.

The Authoring Standards govern language and convention: explicit input-output contracts at every layer; concrete verbs; named actors; no implicit assumptions; specific evidence requirements in challenges. The Quality Verification Checklist is what F-Audit scores frameworks against.

The Inline Properties Principle says each milestone declares its own properties (mode, endpoint produced, verification criterion, layers covered, required prior milestones, gear, output format, drift check question) in the milestone block itself rather than in a separate properties registry, which keeps the framework self-contained and inspectable.
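As an illustration of the Inline Properties Principle, the sketch below models one milestone block as a record that carries all eight properties itself and reports which ones are undeclared. The `Milestone` class, its field names, and `missing_properties` are assumptions made for this example; the canonical specification may represent milestones differently.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Milestone:
    """Hypothetical model of one milestone block with its properties inline."""
    name: str
    mode: Optional[str] = None
    endpoint_produced: Optional[str] = None
    verification_criterion: Optional[str] = None
    layers_covered: Optional[List[int]] = None
    required_prior_milestones: Optional[List[str]] = None  # [] means "none required"
    gear: Optional[str] = None
    output_format: Optional[str] = None
    drift_check_question: Optional[str] = None


def missing_properties(milestone: Milestone) -> List[str]:
    """Return the inline properties the milestone block leaves undeclared (None)."""
    return [
        f.name
        for f in fields(milestone)
        if f.name != "name" and getattr(milestone, f.name) is None
    ]


# A deliberately partial milestone, so the check has something to report.
partial = Milestone(
    name="Single insight extracted",
    endpoint_produced="One atomic note OR a no-insight verdict",
    verification_criterion="One insight surfaced, not multiple",
    layers_covered=[1, 2, 3, 4],
    drift_check_question="Did the framework extract one insight rather than several?",
)
print(missing_properties(partial))
# ['mode', 'required_prior_milestones', 'gear', 'output_format']
```

Because every property lives on the milestone record, a reader (or a check like this one) never has to consult a separate registry to see what a milestone declares.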

The framework’s central design discipline is the multi-milestone requirement. A mode with more than five processing layers must be decomposed into multiple intermediate milestones; modes with five or fewer layers may remain single-milestone. The requirement exists because long single-milestone modes hide partial drafts that have not yet cleared verification: without intermediate milestones, the framework’s runtime supervision (Process Coherence) has no checkpoint to fire at, and partial work can drift before it reaches terminal verification. The discipline is enforced in two places: at F-Design Phase 1 Step E, where a correction triggers when a mode has more than five layers but only one milestone declared, and in F-Audit, where one of the Quality Verification Checklist criteria evaluates milestone decomposition.
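The rule above is simple enough to express as a check. The following is a minimal sketch, assuming each mode is summarized as a pair of (layer count, milestone count); the function names and the dictionary shape are hypothetical and are not part of PFF itself.

```python
from typing import Dict, List, Tuple

MAX_SINGLE_MILESTONE_LAYERS = 5  # threshold stated in the text


def needs_decomposition(layer_count: int, milestone_count: int) -> bool:
    """True when a mode has more than five layers but only one milestone declared."""
    return layer_count > MAX_SINGLE_MILESTONE_LAYERS and milestone_count <= 1


def modes_to_correct(modes: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return the names of modes that violate the multi-milestone requirement."""
    return [
        name
        for name, (layers, milestones) in modes.items()
        if needs_decomposition(layers, milestones)
    ]


# Hypothetical modes: (number of layers, number of milestones declared).
example_modes = {
    "short-mode": (4, 1),       # five or fewer layers: single milestone is fine
    "long-mode": (8, 1),        # more than five layers, one milestone: violation
    "decomposed-mode": (8, 3),  # more than five layers, already decomposed
}
print(modes_to_correct(example_modes))  # ['long-mode']
```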

The framework answers questions like: I have a task I do over and over with slight variations — can you turn it into a framework I can paste into any AI? My old framework feels stale and doesn’t match how I work now — can you modernize it? I want this framework to also run as an autonomous agent on my Mac Studio with tool access — can you produce that variant? I think this framework I’m using has quality issues but I can’t articulate what — can you audit it?

Systemic context

Process Formalization is the meta-framework of the Information Lifecycle System, which also includes Process Inference, Corpus Formalization, Output Formalization, Terrain Mapping, and Decision Clarity Analysis; the system view is in [[Paper — Information Lifecycle System]]. PFF is the productivity multiplier because it produces the runnable frameworks that Ora’s other systems depend on:

  • The Strategic Supervision System invokes them: PEF supervises matrices that may invoke PFF-designed frameworks at Active milestones, and MOM produces matrices whose milestones are delivered by PFF-designed frameworks.

  • The Coordination Layer enforces them: Process Coherence supervises framework outputs, and Oversight Configuration generates configurations that PFF-designed frameworks honor.

  • The Matrix Lifecycle System uses them: OM, IIF, MOM, and PEF. All the canonical Ora frameworks were either PFF-designed or PFF-converted from earlier versions.

  • The Knowledge Production System invokes them: KAC was PFF-converted in v6.0, and the MindSpec Interview Framework’s structure conforms to PFF’s Framework Anatomy.

During F-Convert, the framework reads but does not modify other frameworks’ substantive content: the conversion preserves intellectual content and refits it to the current Anatomy.

Ora’s output

Example question: I want a framework that helps me extract one core insight from each book I read, with a confidence rating and a connection to existing notes. Can you design this for me?

Ora’s response (running PFF in Mode F-Design):

The task is concrete enough to design from. I’ll work through Phase 1’s 10-question requirements gathering protocol, do the complexity assessment, and produce a consolidated framework file. Here is what F-Design produces.

Phase 1 — Requirements gathering (abbreviated for the demo):

Q1 — Final user-facing deliverable. One atomic note per book with: the core insight (single discrete claim), a confidence rating (high / medium / low) reflecting how well the insight is evidenced in the book, and typed relationships to existing engrams.

Q2 — Mode structure. Single-mode. The task does not have meaningfully different invocation contexts.

Q3 — Routing layer. None — single mode.

Q4 — Per-mode milestone breakpoints. The mode is short enough (estimated 4-5 layers) to remain single-milestone. No decomposition required.

Q5 — Input inventory. The book’s text or your reading notes; optional context (what you were reading the book to learn); your existing engram corpus (read-only, for relationship typing).

Q6 — Quality definition. The atomic passes KAC’s three grammar rules; the confidence rating is calibrated against the book’s actual evidence (not just the user’s enthusiasm); typed relationships use the 13-type taxonomy with high-confidence links only.

Q7 — Failure modes. (a) Multiple insights extracted instead of one (the user couldn’t pick); (b) confidence rating inflated by enthusiasm; (c) relationships drafted speculatively rather than from actual engram content.

Q8 — Pipeline position. Standalone or invoked by Document Processing on a book PDF.

Q9 — Execution environment. Commercial AI (interactive). Default consolidated single-file rendering.

Q10 — Domain expertise needs. The framework operator must be familiar with KAC’s three grammar rules and the 13-type relationship taxonomy.

Proactive gap assessment. I’ll surface things the user might not have named. (a) Should the framework support a “no insight” verdict when the book genuinely doesn’t yield one? (b) Should the confidence rating decompose into evidence-strength and reasoning-strength dimensions? (c) Should the relationship typing be limited to engrams matching the same nexus or include cross-nexus links? The user accepts (a) and (c); dismisses (b) as overengineering for a one-insight-per-book framework.

Complexity assessment. Single-mode, ~4 layers, no agent-mode requirement, no multi-framework pipeline. Architecture: simple. Default output: one consolidated framework file.

The produced framework file (abbreviated):

# Book Insight Extraction Framework (v1.0)

## Display Name
Book Insight Extraction (BIE)

## Display Description
Extracts one core insight per book as an atomic note with confidence rating and typed relationships to existing engrams.

## Setup Questions
### Source material
Required. The book's text, your reading notes, or both.
### Reading context
Optional. What you were reading the book to learn.
### Existing engram corpus reference
Optional. Read-only access for relationship typing.

## PURPOSE
Produce one atomic note per book with the book's core insight, a calibrated confidence rating, and typed relationships to existing engrams. Counter the failure mode where the operator extracts multiple insights (could not pick one) or inflates the confidence rating (book felt important).

## INPUT CONTRACT
Required: source material (book text or reading notes). Optional: reading context, engram corpus reference.

## OUTPUT CONTRACT
Primary: one atomic note (subtype matching the insight's claim type) with confidence rating in frontmatter and typed relationships to existing engrams. Or: a "no insight" verdict with explanation when the book genuinely does not yield one.

## MILESTONES DELIVERED
### Milestone 1: Single insight extracted
- **Endpoint produced:** One atomic note OR a no-insight verdict with explanation; confidence rating present (high / medium / low); typed relationships using 13-type taxonomy at high confidence only.
- **Verification criterion:** (a) one insight surfaced (not multiple); (b) confidence rating cites specific evidence from the book; (c) relationships use the 13-type taxonomy and link to actual engrams (no speculative typing); (d) atomic passes KAC's three grammar rules; (e) no-insight verdict, when issued, cites why.
- **Layers covered:** 1, 2, 3, 4
- **Drift check question:** Did the framework extract one insight rather than several, with confidence calibrated to evidence rather than enthusiasm?

## EVALUATION CRITERIA
[Five 5-level rubric criteria — abbreviated for demo]

## PERSONA
The Restraint Auditor — extracts the single insight that best survives scrutiny and rejects insights that don't survive. Adversarial against the operator's enthusiasm; cooperative with the operator's reasoning.

## LAYER 1: SURVEY THE INPUT
[Stage Focus / Input / Output / Processing Instructions]

## LAYER 2: CANDIDATE INSIGHTS
[Stage Focus / Input / Output / Processing Instructions]

## LAYER 3: CONFIDENCE CALIBRATION AND SELECTION
[Stage Focus / Input / Output / Processing Instructions]

## LAYER 4: RELATIONSHIP TYPING AND OUTPUT
[Stage Focus / Input / Output / Processing Instructions]

## NAMED FAILURE MODES
### 1. Insight Inflation Trap
[Description and correction]
### 2. Speculative Relationship Trap
[Description and correction]
### 3. Enthusiasm Calibration Failure
[Description and correction]

That is what F-Design produces. The user now has a runnable framework file, Framework — Book Insight Extraction.md, that they can paste into any AI session along with a book’s text or notes, and the framework will produce one atomic note (or a no-insight verdict) with the structure specified. The framework can be used hundreds of times across hundreds of books; each invocation runs the same process the same way; the output is consistent and inspectable.

The design took about 30 minutes of interactive elicitation. The resulting framework will be invoked by the user against every book they read going forward. The productivity multiplier comes from amortizing that one-time design cost (~30 minutes) over every later run: each book takes ~10 minutes with the framework versus ~30 minutes without it, and the results without it are inconsistent.
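The arithmetic behind that trade-off can be worked through directly. The sketch below uses only the approximate figures quoted above (30 minutes of design, 30 minutes per book without the framework, 10 minutes with it); the function names are hypothetical, and the numbers are the demo's estimates rather than measurements.

```python
import math


def break_even_runs(design_minutes: float, minutes_without: float,
                    minutes_with: float) -> int:
    """Smallest number of runs after which the design time has been recovered."""
    saving_per_run = minutes_without - minutes_with
    if saving_per_run <= 0:
        raise ValueError("the framework must save time per run to break even")
    return math.ceil(design_minutes / saving_per_run)


def net_minutes_saved(runs: int, design_minutes: float,
                      minutes_without: float, minutes_with: float) -> float:
    """Total time saved after `runs` executions, net of the one-time design cost."""
    return runs * (minutes_without - minutes_with) - design_minutes


print(break_even_runs(30, 30, 10))         # 2
print(net_minutes_saved(100, 30, 30, 10))  # 1970
```

With these estimates each book saves about 20 minutes, so the design cost is recovered by the second book, and everything after that is net gain.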

Commercial AI comparison

Comparison content auto-populates when the comparison-refresh framework runs against this question. Drafters do not author this section.

Brief comparison commentary

Auto-populates with the comparison content above.

How to use this framework

You can run the Process Formalization pattern with any AI of your choice. Each of the four modes is a single-pass composition: paste the framework specification, name the mode to run, and supply that mode’s input.

For F-Design (most common starting point):

[Paste the framework specification]

Run F-Design.

Task description: [Plain prose describing what you want a framework for. Be concrete — name the deliverable, the stakeholders, and the hard requirements.]

The AI runs Phase 1’s 10-question requirements gathering, surfaces the proactive gap assessment, runs the complexity assessment, and produces the consolidated framework file. Expect interactive turns for the elicitation; a typical F-Design takes 20–60 minutes depending on complexity.

For F-Convert (modernizing an old framework):

[Paste the framework specification]

Run F-Convert.

Existing framework: [Paste the old framework’s content.]

The AI analyzes the existing framework against the current Anatomy and Authoring Standards, identifies gaps (missing Input Contract sections; milestone properties scattered rather than inline; no Self-Evaluation layer; etc.), and produces the consolidated updated framework. The original intellectual content is preserved; the structure is refit to the current standard.

For F-Render (producing an additional execution variant):

[Paste the framework specification]

Run F-Render.

Existing framework: [Paste the framework]

Target execution tier: agent (for swarm execution with stages and tools) OR reasoning-model (for o3 / o4 / DeepSeek-R1).

The AI produces one additional rendering of the framework optimized for the target environment.

For F-Audit (quality review):

[Paste the framework specification]

Run F-Audit.

Existing framework: [Paste the framework]

The AI scores the framework against the Quality Verification Checklist criteria and produces a per-criterion score plus specific remediation recommendations.
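The four paste templates above share one shape, so they can be assembled programmatically. This is a minimal sketch, assuming the specification and the input are plain strings; `build_prompt`, `INPUT_LABELS`, and `RENDER_TIERS` are hypothetical names introduced for this example, not part of the framework specification.

```python
from typing import Optional

# Input label each mode expects, taken from the paste templates above.
INPUT_LABELS = {
    "F-Design": "Task description",
    "F-Convert": "Existing framework",
    "F-Render": "Existing framework",
    "F-Audit": "Existing framework",
}
RENDER_TIERS = ("agent", "reasoning-model")


def build_prompt(spec: str, mode: str, payload: str,
                 target_tier: Optional[str] = None) -> str:
    """Assemble the single paste for one PFF mode (hypothetical helper)."""
    if mode not in INPUT_LABELS:
        raise ValueError(f"unknown mode: {mode!r}")
    parts = [
        spec.strip(),
        f"Run {mode}.",
        f"{INPUT_LABELS[mode]}: {payload.strip()}",
    ]
    if mode == "F-Render":
        # F-Render is the only mode that names a target execution tier.
        if target_tier not in RENDER_TIERS:
            raise ValueError(f"F-Render needs a target tier from {RENDER_TIERS}")
        parts.append(f"Target execution tier: {target_tier}")
    elif target_tier is not None:
        raise ValueError("target_tier applies to F-Render only")
    return "\n\n".join(parts)


print(build_prompt("<framework specification>", "F-Audit", "<framework text>"))
```

The helper only concatenates text; the elicitation turns that F-Design needs still happen interactively after the paste.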

For best results, treat F-Design as a real elicitation rather than a one-shot prompt. The framework’s quality depends on the requirements being elicited honestly — short-circuiting Phase 1 by giving brief answers to all 10 questions tends to produce frameworks that fail in unanticipated ways. The 30 minutes spent in Phase 1 saves hours of re-design later.

Other examples

  • F-Convert on an old prompt collection. A user has a folder of effective prompts they’ve saved over a year — each prompt is two or three paragraphs of accumulated tweaks. They want to turn the collection into proper frameworks. F-Convert runs against each prompt; the analysis identifies which prompts are framework candidates (recurring task with stable output shape) and which are one-shots (single-use; not framework material). For the framework candidates, F-Convert produces consolidated frameworks conforming to the current Anatomy. The user ends up with five proper frameworks plus a reject pile for the one-shots. Demonstrates F-Convert against ad-hoc accumulated prompt material — turning informal patterns into formal frameworks.

  • F-Render producing an agent-mode variant. A user has a consolidated framework that runs cleanly in commercial AI, but they also want to run it on the Mac Studio swarm with tool access (browser, file system, scheduled execution). F-Render produces the agent-mode variant: explicit stages with state transitions; tool access declarations; gear assignments per stage; agent-execution metadata. The single-pass version remains the canonical version; the agent-mode version is a derived rendering for the swarm context. Demonstrates the explicit opt-in path for additional renderings.

  • F-Audit on a framework that looks good but performs poorly. A user has a framework that passes their informal sniff test but produces inconsistent results across runs. F-Audit scores the framework: Anatomy compliance 4/5; Authoring Standards 3/5; Inline Properties 2/5 (milestones declared without per-milestone properties); Multi-milestone Decomposition 1/5 (one mode has eight layers but only one milestone). Remediation recommendations: decompose the long mode into three intermediate milestones; pull milestone properties inline; sharpen the Authoring Standards on Layer 4’s Processing Instructions. The user applies the remediation; subsequent runs are consistent. Demonstrates F-Audit’s diagnostic value when the surface-level framework looks correct but the structural compliance is off.
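The scores in the F-Audit example above can be triaged mechanically. The sketch below assumes the minimum passing score of 3 on the 5-level rubric applies to each checklist criterion; the function name and the dictionary layout are hypothetical, and the scores are the ones quoted in the example.

```python
from typing import Dict, List, Tuple

MIN_PASSING_SCORE = 3  # minimum passing level on the 5-level rubric


def failing_criteria(scores: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (criterion, score) pairs below the passing score, worst first."""
    for name, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{name}: score {score} is outside the 1-5 rubric")
    failing = [(name, score) for name, score in scores.items()
               if score < MIN_PASSING_SCORE]
    return sorted(failing, key=lambda item: item[1])


# Scores from the F-Audit example above.
audit_scores = {
    "Anatomy compliance": 4,
    "Authoring Standards": 3,
    "Inline Properties": 2,
    "Multi-milestone Decomposition": 1,
}
print(failing_criteria(audit_scores))
# [('Multi-milestone Decomposition', 1), ('Inline Properties', 2)]
```

Ordering the failures worst first mirrors the example's remediation list, which leads with milestone decomposition and then pulls the milestone properties inline.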

Citations

The Process Formalization Framework draws on several traditions in software engineering and design methodology. The meta-framework pattern (a framework that produces other frameworks) draws on metaprogramming traditions in computer science — code that produces code, with the meta-level handling the patterns the object-level cannot capture cleanly. The Framework Anatomy as a structural standard is closer to interface contracts in object-oriented design (Liskov Substitution Principle’s discipline that conforming implementations are interchangeable) — every PFF-conforming framework can be invoked the same way, supervised the same way, and audited the same way because they share the structural contract.

The Inline Properties Principle (each milestone declares its own properties in the milestone block) draws on locality-of-reference principles in software design — a reader inspecting one milestone should not need to look up a separate registry to understand what the milestone declares. The multi-milestone requirement for long modes draws on test-driven-development practice’s discipline that a long single test often hides intermediate failures the test cannot diagnose; decomposing into multiple checkpoints surfaces the intermediate failures.

The Quality Verification Checklist’s structure (5-level rubrics per criterion; minimum passing score 3) draws on rubric-design literature in education and on adversarial-evaluation frameworks in machine learning (where evaluation criteria with explicit per-level descriptors produce more reliable scoring than free-text quality verdicts). The single-file consolidated default (v2.2) was a deliberate response to user feedback against file profusion — additional execution variants are now opt-in rather than default.

The framework is single-author and originated in early Ora; v2.0 (2026-04-08) added the Framework Anatomy and Authoring Standards in their canonical form; v2.1 incorporated multi-step prompting research; v2.2 (current) consolidated the single-file output as default.

Downloads

  • Framework specification (PDF) — link to ora-ai.org canonical artifact when published
  • Framework specification (plain text) — link to ora-ai.org canonical artifact when published
  • Full white paper (PDF) — link when published