Corpus Overview

The Sylvester Corpus was formally initiated at 11:47:51 PM EST on January 24, 2026, with the statement: “Thus, we TDD. Especially Ourselves.” The corpus name references the researcher’s surname and establishes a citational anchor for all subsequent research outputs.

4 Integrated Datasets
3 Perspective Filters
8 Storage Drives (D–K)
100K+ Lines of LLM Dialogue

Dataset 1: YouTube Behavioral Metadata (2015–2026)

YouTube consumption history provides timestamped, machine-readable evidence that resists retrospective bias, because each record was written at the time of viewing. When analyzed through the three integrated perspective filters (Ascetic-INFJ, LGBTQ+ Gender Expression, and Polymath Capacity), this dataset reveals learning patterns, skill development trajectories, cognitive states, and identity exploration that the researcher could not fabricate retroactively.

Triple Integrated Perspective Filters

Filter 1: Ascetic-INFJ

Maps consumption through INFJ cognitive functions (Ni-Fe-Ti-Se). Detects ascetic practice periods. Classifies 17 spiritual/cognitive states with privacy-aware interpretation.

Filter 2: LGBTQ+ Gender Expression

Covers bisexual orientation (public) and creative expression and costuming (contextual). Tracks consumption states across privacy gradients. Maps cultural touchstones, including Rocky Horror Picture Show participation.

Filter 3: Polymath Capacity

Cross-domain learning detection, skill cluster identification, Rocky Horror security context, and integration analysis across all identity dimensions.
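As a rough illustration of how three filters might be run over the same timestamped records, here is a minimal sketch. The source does not describe the filters' implementation, so everything here is an assumption for illustration: the `WatchEvent` schema, the keyword-matching approach, the filter keys, and every label and keyword.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass(frozen=True)
class WatchEvent:
    """One timestamped consumption record (hypothetical schema)."""
    watched_at: datetime
    title: str


# A filter maps one event to zero or more labels.
Filter = Callable[[WatchEvent], List[str]]


def keyword_filter(rules: Dict[str, List[str]]) -> Filter:
    """Build a filter that emits a label when any of its keywords appears in the title."""
    def apply(event: WatchEvent) -> List[str]:
        title = event.title.lower()
        return [label for label, words in rules.items()
                if any(word in title for word in words)]
    return apply


# Placeholder rules only; the real filters are presumably far richer than keyword matching.
FILTERS: Dict[str, Filter] = {
    "ascetic_infj": keyword_filter({"contemplative": ["meditation", "fasting"]}),
    "gender_expression": keyword_filter({
        "cultural_touchstone": ["rocky horror"],
        "costuming": ["costume"],
    }),
    "polymath": keyword_filter({"skill_building": ["tutorial", "lecture"]}),
}


def tag_event(event: WatchEvent) -> Dict[str, List[str]]:
    """Run every filter over one event and collect the labels per filter."""
    return {name: run(event) for name, run in FILTERS.items()}


event = WatchEvent(
    watched_at=datetime(2019, 10, 31, 23, 0, tzinfo=timezone.utc),
    title="Rocky Horror costume tutorial",
)
tags = tag_event(event)
```

Keeping each filter as an independent function over the same record means a single event can carry labels from several perspectives at once, which is one plausible reading of "integrated" here.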

Dataset 2: TDDFlow LLM Corpus

This dataset comprises hundreds of thousands of lines of LLM dialogue processed through TDDFlow v2, a hydrological cognitive scaffolding system with the following elements:

TDDFlow Element  Analog                Function
Lake             Major life domain     Ministry, QA, therapy, ranching
River            Work stream           Planning, research, implementation
Tributary        Detailed thread       Specific investigation within a stream
Puddle Packet    Actionable task       Smallest unit of tracked work
Cloud            Insight crystallized  Ready to execute
Rain             Implementation        Active execution of insight
Hurricane        Deep work sprint      Scope-locked intensive execution
Delta            Closure checkpoint    Archive and verify completion
Sediment         Version layer         Thought evolution tracked over time
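The structural elements above (Lake, River, Tributary, Puddle Packet) could be modeled as a small tree. The sketch below assumes a strict Lake > River > Tributary > Puddle Packet nesting. The source only states that a Tributary sits within a stream, so the rest of the ordering is an assumption, and the `Node` class and `count_packets` helper are hypothetical names, not part of TDDFlow itself.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Element(Enum):
    """Structural TDDFlow elements, valued by their analog from the table."""
    LAKE = "Major life domain"
    RIVER = "Work stream"
    TRIBUTARY = "Detailed thread"
    PUDDLE_PACKET = "Actionable task"


# Assumed containment order, outermost first (not confirmed by the source).
NESTING = [Element.LAKE, Element.RIVER, Element.TRIBUTARY, Element.PUDDLE_PACKET]


@dataclass
class Node:
    kind: Element
    name: str
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child that is exactly one nesting level below this node."""
        if NESTING.index(child.kind) != NESTING.index(self.kind) + 1:
            raise ValueError(f"{child.kind.name} cannot nest directly under {self.kind.name}")
        self.children.append(child)
        return child


def count_packets(node: Node) -> int:
    """Count the Puddle Packets (smallest tracked units) beneath a node."""
    if node.kind is Element.PUDDLE_PACKET:
        return 1
    return sum(count_packets(child) for child in node.children)


# Example tree using domain names from the table; the task names are invented.
lake = Node(Element.LAKE, "QA")
river = lake.add(Node(Element.RIVER, "Research"))
thread = river.add(Node(Element.TRIBUTARY, "Flaky test investigation"))
thread.add(Node(Element.PUDDLE_PACKET, "Reproduce the failure"))
thread.add(Node(Element.PUDDLE_PACKET, "Write a regression test"))
```

The state-like elements (Cloud, Rain, Hurricane, Delta, Sediment) are left out on purpose: they read more like lifecycle stages than containers, and would probably be modeled as a status field rather than as tree levels.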

Dataset 3: Professional Documentation

526+ competencies across 12 domains, extracted and verified through multiple AI-assisted analyses of professional resume and cover letter documents. The skill taxonomy includes technical skills (161), leadership/management (75), security/compliance (32), domain crossover (41), and methodological innovation (28), with the remainder in additional categories.
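The category counts can be checked against the stated total with a few lines of arithmetic. This sketch treats 526 as a lower bound, since the text says "526+". The variable names are illustrative.

```python
# Stated as "526+", so this is treated as a lower bound on the total.
TOTAL_COMPETENCIES = 526

# Category counts exactly as listed in the text.
listed = {
    "technical skills": 161,
    "leadership/management": 75,
    "security/compliance": 32,
    "domain crossover": 41,
    "methodological innovation": 28,
}

listed_total = sum(listed.values())               # 337
remaining = TOTAL_COMPETENCIES - listed_total     # 189, at minimum

print(f"Listed categories account for {listed_total} competencies.")
print(f"At least {remaining} fall under the additional categories.")
```

The five named categories therefore cover 337 of the 526+ competencies, leaving at least 189 for the categories not itemized here.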

Dataset 4: Distributed Filesystem (Drives D–K)

Eight storage drives across Windows and Ubuntu dual-boot systems, representing the physical architecture of polymath-scale knowledge organization. The filesystem itself is a research instrument: its structure reveals organizational patterns, domain prioritization, and a knowledge architecture that together support claims of cross-domain expertise.
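One simple way to read structure off a filesystem is to count files under each top-level directory of a drive. The sketch below is only an illustration of that idea. The `survey` function is a hypothetical helper, and the source does not say how the drives are actually analyzed.

```python
import os
from collections import Counter
from pathlib import Path
from typing import Dict


def survey(root: Path) -> Dict[str, int]:
    """Count files under each top-level directory of `root`.

    Files sitting directly in `root` are counted under the key ".".
    """
    counts: Counter = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        relative = Path(dirpath).relative_to(root)
        # The first path component names the top-level directory; empty means root itself.
        top_level = relative.parts[0] if relative.parts else "."
        counts[top_level] += len(filenames)
    return dict(counts)
```

On Windows this might be called as `survey(Path("D:/"))` for each drive letter from D through K. The corresponding Ubuntu mount points are not given in the source and would need to be supplied.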

Infrastructure

Privacy Through Ownership

The Sylvester Corpus employs self-hosted Ollama GPU processing (Docker image: mnccouk/ollama-gpu-rx580) on an AMD RX 580. This ensures complete data sovereignty: no cloud uploads, no surveillance capitalism, no third-party data processing. The researcher owns every byte of the analysis pipeline.
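To make the "no cloud uploads" property concrete, a client can refuse to send prompts anywhere except the local machine. The sketch below targets Ollama's `/api/generate` endpoint on its default port, 11434. The loopback check, the function names, and the assumption that the container publishes that port on localhost are illustrative, not details from the source.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Ollama's default port; adjust if the container maps a different one (assumption).
OLLAMA_URL = "http://localhost:11434/api/generate"
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def build_request(prompt: str, model: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request, rejecting any non-local endpoint."""
    host = urlparse(url).hostname
    if host not in LOOPBACK_HOSTS:
        raise ValueError(f"refusing to send corpus data to non-local host: {host!r}")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as response:
        return json.loads(response.read())["response"]
```

The model name is left as a required argument because the source does not say which models are run. A call would look like `generate("Summarize this entry.", model="<model-name>")`, with whatever model has been pulled into the container.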

This infrastructure choice is not merely technical preference but a methodological requirement: the corpus contains sacred spiritual content, therapeutic processing, and ministry confidentiality that must remain under researcher control.