This work documents the development of SeekEngine, an independent systems research project investigating whether grounded information retrieval can mitigate hallucinations in large language model (LLM)–assisted search. Rather than treating hallucination as a UI failure or a statistical quirk of transformer architectures, we treat it as a distributed systems reliability problem: ungrounded LLMs act as isolated inference nodes lacking access to verifiable external state, and therefore produce fluent but unverifiable claims.
To counter this, SeekEngine fuses two upstream providers with conflicting operational characteristics:
Google Custom Search Engine (CSE) — a retrieval node that provides sparse but verified snippets grounded to the contemporary web, and
OpenRouter — an inference node capable of synthesizing structured answers but prone to fabricating details when starved of context.
These nodes are orchestrated in parallel and merged through a fusion protocol that treats factual consistency as a first-class constraint. The outcome is a grounding-first search agent designed to expose its own uncertainty, cite its sources, and prefer silence to confident misinformation.
The research was conducted under strict real-world constraints: zero budget, untrusted providers, non-deterministic latency, rate limits, adversarial input surface, and no proprietary infrastructure. The system encountered partial failures, silent failures, and latency spikes—behavior similar to decentralized peer networks and bootstrapping layers in P2P protocols. These failures were not incidental; they shaped the architecture.
SeekEngine does not claim to solve hallucination. Instead, the research shows that hallucination can be reframed as a coordination problem between retrieval and inference, where truth incurs a cost (latency, bandwidth, failed queries, dropped snippets, lower fluency), and where accuracy is not free—it must be purchased through grounding and verification.
Most discussions of hallucination treat it as (a) a training dataset deficiency, (b) a prompt engineering problem, or (c) a model architecture limitation. Our early experiments showed these views are incomplete. Hallucination does not emerge merely from probability maximization; it emerges from isolation.
LLMs generate tokens conditioned on internal priors—but search requires external state. Without connectivity to the contemporary web, LLMs operate like offline distributed nodes attempting to answer time-sensitive, stateful queries using stale snapshots of reality. In this paradigm, hallucination is not an error; it is a fallback strategy. Fluency fills the epistemic gap where grounding is absent.
SeekEngine was initiated to answer a simple question:
Can retrieval serve as a “bootstrap node” for grounding, turning hallucination into a coordination problem rather than a probability problem?
This question framed the project less as AI UX and more as distributed systems research. The relevant phenomena resembled concepts from P2P systems:
| LLM Search Problem | Distributed Analogy |
|---|---|
| Hallucination | Unverified piece |
| Retrieval | Bootstrap node |
| Fusion | Swarm coordination |
| Source citation | Piece hashing |
| XSS + prompt injection | Peer poisoning |
| Latency | Consistency cost |
| Rate limits | Network congestion |
| Timeout | Silent peer drop |
| Provider mismatch | Protocol incompatibility |
| Truth penalty | Distributed coordination overhead |
Once reframed, the problem became tractable without proprietary data or large infrastructure.
#II. Independent Research Positioning
SeekEngine was built as a zero-budget, zero-infrastructure, open-web experiment by two independent researchers (Gaurav Yadav & Aditya Yadav) without privileged access to datasets, model weights, proprietary APIs, or academic compute. This constraint forced architectural decisions that are often avoided in institutional settings because they appear inelegant or “hacky,” yet they mirror constraints faced by real systems deployed outside research labs.
We found that constraints were not obstacles—they were signal generators.
Zero-budget forced reliance on free-tier APIs → revealed failure modes
No private infrastructure forced client/server separation → revealed credential surfaces
No vector DB forced dynamic RAG → revealed retrieval starvation behavior
No observability tooling forced terminal-level logging → revealed latency patterns
In short: removing resources made reality show up.
#III. Research Claim (Soft)
SeekEngine does not claim superiority over industrial RAG pipelines, nor does it claim to “fix” hallucination. Instead, it claims:
Hallucination is reducible to coordination.
Grounding is reducible to verification.
Verification is reducible to cost.
And cost—not creativity—is the limiting factor for truth.
Where typical chatbot UX hides uncertainty, SeekEngine surfaces it. Where typical inference pipelines suppress latency, SeekEngine exposes latency as proof-of-work for grounding. Where typical LLM outputs aim for eloquence, SeekEngine aims for inspectability.
#Phase 1 — Grounding the Problem
SeekEngine began with a deceptively simple observation: modern LLMs are exceptionally good at sounding correct yet structurally incapable of knowing whether their claims reflect reality. The problem is not malicious; it is architectural. Transformers predict tokens based on internal priors, not the contemporary web. When asked a stateful query (“AAPL price right now”), the model manufactures plausible numbers. This is not a hallucination defect — it is a fallback policy for lack of external state.
The initial research framing was naive: “We should attach a search API.” It quickly became clear that search was not merely an enrichment layer but a bootstrap node for grounding. Without retrieval, the model operates as a sealed container; with retrieval, it becomes a coordinated system of heterogeneous nodes that must merge partial and noisy information under latency constraints.
In this phase, the work shifted from AI speculation to systems thinking:
Truth is not a property of generation — it is a property of verification.
Verification is not free — it incurs cost.
Cost changes the architecture.
This realization created the first conceptual invariants of SeekEngine:
No “free” truth — grounding must be paid for in latency, bandwidth, or structure.
Inference is a node, not an oracle — it must negotiate with other nodes.
Grounding dominates creativity — creativity is a liability in search.
The UX must reflect uncertainty — opaque “confidence” is a failure mode.
#Phase 2 — Retrieval as Bootstrap
The BitTorrent analogy emerged unconsciously here.
In P2P networks, trackers and DHT nodes provide an entry point into a swarm. Without bootstrap nodes, peers have no swarm to join and no metadata to resolve. Retrieval fulfills the same role for grounded inference.
We treated Google CSE as our bootstrap node:
sparse
authoritative enough
rate-limited
non-deterministic
prone to silent drops
adversarial at input boundary
The signal from CSE resembled peer metadata:
titles → strong anchors
snippets → partial truths
urls → provenance
timestamps → freshness
keywords → weak alignment
ranking → heuristics, not truth
Retrieval was not the answer — it was the context substrate that made answers possible.
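To make the bootstrap role concrete, here is a minimal sketch of a retrieval call against the CSE JSON API, reduced to the metadata fields above; the `GroundingSnippet` shape, `fetchBootstrapContext` helper, and environment variable names are illustrative assumptions, not the production handler.

```ts
// Minimal sketch: query Google CSE and keep only the grounding signal.
// Illustrative only — not SeekEngine's actual implementation.
interface GroundingSnippet {
  title: string;   // strong anchor
  snippet: string; // partial truth
  url: string;     // provenance
}

async function fetchBootstrapContext(query: string): Promise<GroundingSnippet[]> {
  const params = new URLSearchParams({
    key: process.env.GOOGLE_CSE_KEY ?? "", // hypothetical env var names
    cx: process.env.GOOGLE_CSE_ID ?? "",
    q: query,
  });

  // Bootstrap nodes are rate-limited and non-deterministic: bound the wait.
  const res = await fetch(`https://www.googleapis.com/customsearch/v1?${params}`, {
    signal: AbortSignal.timeout(3000),
  });
  if (!res.ok) return []; // silent drop; starvation is handled downstream

  const data = await res.json();
  return (data.items ?? []).map((item: any) => ({
    title: item.title,
    snippet: item.snippet,
    url: item.link,
  }));
}
```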
At this point the architecture formalized into a bootstrap graph.
To observe behavior, we implemented a diagnostic terminal:
Fig. — Diagnostic Terminal Output
This terminal did more than demo output — it exposed the raw dynamics of a system negotiating with partial information, latency, and missing context.
Retrieval was now a protocol, not a feature.
#Phase 3 — Parallel Orchestration & Fusion
With bootstrap established, we introduced a second upstream node: OpenRouter, used as an inference relay. The orchestration problem immediately resembled swarm coordination:
retrieval produced grounded but brittle context
inference produced fluent but ungrounded synthesis
fusion required synchronizing mismatched temporal and semantic grain
We attempted sequential execution first:
Code Reference
CSE → LLM
This yielded correct facts but brittle structure: models produced citation noise, repetitive summarization, and low semantic coherence.
Parallel execution changed everything:
Code Reference
CSE || LLM → Fusion Layer
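A minimal sketch of this fan-out is shown below, assuming the `fetchBootstrapContext` helper sketched earlier plus hypothetical `fetchInference` and `fuse` functions; `Promise.allSettled` lets either node fail without collapsing the whole query.

```ts
// Sketch: launch retrieval and inference in parallel, then hand both
// outcomes (including failures) to the fusion layer.
async function orchestrate(query: string) {
  const [retrieval, inference] = await Promise.allSettled([
    fetchBootstrapContext(query), // grounded but brittle
    fetchInference(query),        // fluent but ungrounded (hypothetical helper)
  ]);

  return fuse({
    snippets: retrieval.status === "fulfilled" ? retrieval.value : [],
    draft: inference.status === "fulfilled" ? inference.value : null,
  });
}
```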
This made SeekEngine behave like a distributed system:
latency became a negotiation variable
timeouts became partial failures
rate limits became congestion
CSE starvation became a grounding deficit
LLM starvation became a synthesis deficit
Fusion was a protocol, not a merge function.
In practice, the fusion layer was forced to operate under three constraints:
Truthfulness Constraint
Fused answers must be grounded or fail silent.
Minimality Constraint
Synthesis must be brief; verbosity dilutes claims.
Inspectability Constraint
Sources must be traceable.
The UX design decision to use citations + snippet grounding was not aesthetic — it was a protocol-level requirement for epistemic transparency.
To visualize this fusion, we introduced:
Search Query → [ Google CSE (raw data) ‖ LLM RAG (synthesis) ] → Response Fusion Layer → Verified Answer

Fig. — Orchestration Diagram (live visualization of the parallel orchestration flow)
We then evaluated latency operationally:
| Pipeline | Response Latency (ms) |
|---|---|
| Google CSE | 300 |
| Direct LLM (OpenRouter) | 1200 |
| SeekEngine Hybrid | 1500 |

The “Truth Penalty”: SeekEngine trades additional latency for improved factual consistency.

Fig. — Latency Comparison Benchmark
Where the BitTorrent client paid bandwidth and time for piece verification, SeekEngine paid latency for truth verification.
This tradeoff is fundamental:
truth costs time and time costs UX.
Designers ignore this at their peril.
#Phase 4 — Verification as Protocol
Phase 4 formalized the insight that grounding must be explicit, not implicit. We defined verification as a protocol with four gates:
Existence Gate
Does the answer reference any retrieved sources?
Consistency Gate
Do claims align with retrieved snippets?
Temporal Gate
Are claims time-sensitive and stale?
Source Gate
Are sources adversarial or low-quality?
Only after verification do we allow synthesis.
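As a sketch, the four gates can be expressed as a single predicate that must pass before synthesis is allowed. The `Candidate` shape, freshness threshold, and blocklist are illustrative assumptions, and the Consistency Gate here is only a coarse citation-to-snippet check, not full claim alignment.

```ts
// Sketch of the verification gates. GroundingSnippet is as in the retrieval sketch.
interface Candidate {
  answer: string;
  citedUrls: string[];
  snippets: GroundingSnippet[];
  timeSensitive: boolean;
  newestSnippetAgeMs?: number; // age of the freshest snippet, if known
}

const LOW_QUALITY_HOSTS = new Set<string>(["example-content-farm.test"]); // illustrative

function verify(c: Candidate): boolean {
  const existence = c.citedUrls.length > 0;                        // Existence Gate
  const consistency = c.citedUrls.every((u) =>
    c.snippets.some((s) => s.url === u));                          // Consistency Gate (coarse proxy)
  const temporal = !c.timeSensitive ||
    (c.newestSnippetAgeMs !== undefined &&
     c.newestSnippetAgeMs < 24 * 60 * 60 * 1000);                  // Temporal Gate
  const source = c.citedUrls.every(
    (u) => !LOW_QUALITY_HOSTS.has(new URL(u).hostname));           // Source Gate
  return existence && consistency && temporal && source;           // fail closed
}
```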
To illustrate verification dynamics, we upgraded an earlier demo into a truth vs hallucination comparator:
Hallucination Detected: “The current stock price of Apple is $245.30, showing a strong 2% growth since this morning's opening...”
(Note: the LLM is using training data from 2024 to guess 2026 prices.)
Fig. — Grounded vs Ungrounded Output Comparison
In micro-benchmarks:
ungrounded inference → high fluency, low truth
grounded inference → lower fluency, higher truth
This revealed the truth penalty more starkly than latency:
grounding reduces eloquence
verification increases friction
citations expose uncertainty
silence becomes preferable to fabrication
In human UX terms:
truth does not always look pretty.
This phase reframed hallucination as:
“verification failure under isolation.”
#Phase 5 — Security & Adversarial Surface
Once retrieval and inference were fused, a new concern emerged: the system was now exposed to two adversaries at once:
External adversaries — the open web
Internal adversaries — the LLM itself
Unlike BitTorrent, SeekEngine does not have malicious peers, but it has malicious inputs. The web is adversarial by default — SEO poisoning, spam vectors, XSS payloads, tracker pixels, misleading snippets, prompt injection triggers, content farms, and outdated content masquerading as authoritative.
The inference pipeline is adversarial by construction — LLMs are capable of self-hallucination, overconfidence, and unbounded fabrication when starved of context.
We treated the LLM as a potentially adversarial subsystem capable of:
unsanctioned creativity
miscalibration
citation forgery
temporal guesswork
sentimental phrasing
source attribution fakery
These required protocol-level guardrails, not UX hints.
Boundary Security
Credential exposure emerged as an unexpected risk. Retrieval and inference both required API keys, but inference required higher privilege. Early prototypes leaked credentials through client bundles, forcing a redesign of the execution boundary and relocation to server-only handlers.
This surfaced the first formal trust boundary:
Code Reference
Client —(untrusted)→ Server —(trusted)→ Provider
To visualize this, we preserved and upgraded:
Environment Encapsulation

client_side.js
const API_KEY = "sk-..." // LEAK DETECTED

server_action.ts
process.env.OPENROUTER_KEY // ENCAPSULATED

Fig. — Trust Boundary & Credential Encapsulation
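A minimal sketch of the server-side half of this boundary, assuming a Next.js server action as the execution surface; the function name, model id, and payload shape are illustrative, but the key point stands: the credential resolves only from the server environment and never ships in a client bundle.

```ts
"use server";

// Sketch: server-only inference relay. The client calls this action;
// the OpenRouter key never leaves the server environment.
export async function relayInference(prompt: string): Promise<string> {
  const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "meta-llama/llama-3-8b-instruct", // illustrative model id
      messages: [{ role: "user", content: prompt }],
    }),
    signal: AbortSignal.timeout(10_000),
  });
  if (!res.ok) throw new Error(`upstream ${res.status}`); // surfaced, not swallowed
  const data = await res.json();
  return data.choices?.[0]?.message?.content ?? "";
}
```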
Threat Matrix
We consolidated threat classes into a matrix:
Threat Model & Mitigations

| Threat | Status | Mitigation |
|---|---|---|
| XSS injection | Mitigated | DOMPurify sanitization |
| API key leakage | Mitigated | Server-side encapsulation |
| Prompt injection | Partial | Input filtering (basic) |
| Data persistence | Mitigated | Request-scope only |
| Upstream compromise | Unaddressed | Outside control |
| Model-level exploits | Unaddressed | Future work |

Fig. — Adversarial Surface & Mitigation Matrix
This matrix resembled real-world threat models from cybersecurity research more than traditional IR/RAG pipelines.
#Phase 6 — Observability & Diagnostics
After securing the boundaries, the system hit a new bottleneck: non-observability. Distributed systems cannot be debugged through intuition. Failures were occurring inside the fusion layer without producing visible errors — silent, partial, or timing-based, much like those seen in P2P networks.
Symptoms included:
retrieval starvation
inference starvation
fusion race conditions
inconsistent snippet alignment
snippet truncation
stale web results
inference guesswork
non-deterministic formatting
latency variance spikes
To make the system observable, we implemented a diagnostic terminal UI that streamed the orchestration process. This did not look like research instrumentation — but it was exactly that.
Fig. — Diagnostic Terminal Output
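A sketch of the instrumentation behind it: every orchestration stage emits a timestamped event, which is what turns silence and ordering into observable structure. The event shape and helper names are illustrative assumptions.

```ts
// Sketch: timestamped trace events streamed to the diagnostic terminal.
type Stage = "retrieval" | "inference" | "fusion" | "verification";

interface TraceEvent {
  t: number;                                   // ms since query start
  stage: Stage;
  status: "start" | "ok" | "empty" | "timeout" | "error";
  detail?: string;
}

function makeTracer(onEvent: (e: TraceEvent) => void) {
  const t0 = Date.now();
  return (stage: Stage, status: TraceEvent["status"], detail?: string) =>
    onEvent({ t: Date.now() - t0, stage, status, detail });
}

// Usage (illustrative):
//   const trace = makeTracer((e) => console.log(JSON.stringify(e)));
//   trace("retrieval", "start");
//   trace("retrieval", "empty", "0 snippets returned");
```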
This feature revealed system truths that logs alone could not:
latency became visible as structure
silence became a detectable event
sequence became temporal order
errors took on shape
We discovered that the absence of failure was not success — it was a symptom of silent fallback. This lesson is familiar to P2P engineers but largely absent among AI tool builders.
Observability transformed SeekEngine from a black box to a negotiable protocol.
#Phase 7 — Partial Failures & Silent Errors
The hallmark of distributed systems is not crashing — it is partial failure. SeekEngine encountered partial failure behaviors identical to those seen in:
BitTorrent swarms
DHT peer tables
gossip networks
cloud orchestration
weakly-consistent caching systems
Failure modes included:
(a) Retrieval Starvation
CSE occasionally returned empty or stale results. The LLM compensated by fabricating plausible answers. Bootstrap failure → hallucination.
(b) Inference Starvation
OpenRouter occasionally dropped or rate-limited requests. Retrieval produced raw snippets with no synthesis. Bootstrap success → no swarm coordination.
(c) Timing Desynchronization
Parallel requests resolved in inconsistent orders. Fusion layer misaligned context and generated broken synthesis.
(d) Rate-Limit Oscillation
LLM response times oscillated under multi-query load, producing abrupt latency cliffs.
(e) Provider Mismatch
CSE timestamps mismatched OpenRouter’s training cutoff, producing temporal inconsistency (new vs stale knowledge).
(f) Trust Misalignment
High-ranking snippets were low-quality (SEO spam), while lower-ranked snippets were authoritative (primary sources). Retrieval ≠ trust.
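A sketch of the degradation policy these failure modes forced, reusing the hypothetical shapes from earlier sketches: grounded synthesis is preferred, raw snippets are acceptable, an explicit "Unknown" is acceptable, but ungrounded synthesis is never passed through.

```ts
// Sketch: degrade rather than fabricate. GroundingSnippet as sketched earlier.
type Degraded =
  | { kind: "grounded"; draft: string; snippets: GroundingSnippet[] }
  | { kind: "snippets-only"; snippets: GroundingSnippet[] }
  | { kind: "unknown" };

function degrade(snippets: GroundingSnippet[], draft: string | null): Degraded {
  if (snippets.length > 0 && draft !== null) {
    return { kind: "grounded", draft, snippets };
  }
  if (snippets.length > 0) {
    return { kind: "snippets-only", snippets };  // (b) inference starvation
  }
  // (a) retrieval starvation: refuse to pass an ungrounded draft through
  return { kind: "unknown" };
}
```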
These surfaced in the Limitations Matrix:
Known Limitations Matrix

| Limitation | Category | Detail | Impact |
|---|---|---|---|
| No standardized benchmarks | Evaluation | Internal testing only | High |
| Third-party dependency | Reliability | Google CSE, OpenRouter availability | Medium |
| Multilingual support | Coverage | English-primary implementation | Medium |
| Temporal consistency | Accuracy | Real-time data freshness varies | High |
| Rate limiting | Scale | Free-tier constraints | Low |

Honest assessment: these limitations are documented, not hidden.

Fig. — Known Limitations Assessment
SeekEngine never crashed — it degraded, silently.
This is the hallmark of real distributed systems.
#Phase 8 — Lessons from the System
By the time SeekEngine stabilized, it had ceased being an AI demo and had become a distributed coordination experiment operating across three domains:
(1) The Web as Information Substrate
→ sparse, adversarial, timestamped, unstructured
(2) The LLM as Synthesis Machine
→ structured, fluent, hallucination-prone, stochastic
(3) The UI as Epistemic Interface
→ mediates uncertainty, verification, and trust
The most surprising lessons came from working at the boundaries:
Lesson 1
Retrieval alone cannot answer.
Inference alone cannot know.
Truth emerges from negotiation.
Lesson 2
Hallucination is not a bug —
it is a failure of coordination under isolation.
Trust is a UI problem as much as an execution problem.
Lesson 5
The cheapest systems teach you the most —
because they cannot hide their failures.
#IV. System Architecture
By Phase 3, it became clear that SeekEngine needed a formal architecture—not to impress reviewers, but to reason about failure modes. Distributed systems without architecture are inscrutable; architecture is an instrument for understanding.
The final system decomposed into three macro-layers — retrieval (grounding), inference (synthesis), and fusion/verification (coordination) — each contributing its own share of the latency budget.
Latency breakdown:

| Stage | Cost |
|---|---|
| Retrieval | Network-bound |
| Inference | Compute-bound |
| Fusion | Synchronization-bound |
| Verification | Consistency-bound |
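One simple way to read these numbers, assuming the stages after the parallel fan-out run roughly sequentially:

T_total ≈ max(T_retrieval, T_inference) + T_fusion + T_verification

which is consistent with the observed hybrid latency (~1500 ms) sitting above the slower upstream call (~1200 ms) by roughly the fusion and verification overhead.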
The result was a measurable latency penalty of ~1.3–2.4× vs ungrounded inference.
But truth isn't free.
#VI. Threat Model & Adversarial Surface
Unlike BitTorrent, SeekEngine is not attacked by malicious peers—but it is attacked by malicious content and overconfident models.
Threat classes included:
| Threat Class | Source | Mitigation |
|---|---|---|
| XSS injection | Web | Sanitizer |
| SEO poisoning | Web | Source weighting |
| Prompt injection | User | Input filtering |
| Citation forgery | Model | Verification |
| Temporal drift | Web / Model | Timestamp check |
| Credential leakage | System | Server actions |
| Upstream collapse | Provider | Timeout + fallback |
| Poisoned snippets | Web | Snippet consistency |
These threats are rendered in the Adversarial Surface & Mitigation Matrix (Fig., Phase 5).
Zero-Trust Execution
We adopted zero-trust against: (1) Providers, (2) Models, (3) Users, and (4) The Web. This security stance is uncommon in RAG prototypes and more aligned with hardened web services.
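As one concrete instance of this stance, retrieved snippets are treated as hostile HTML before they can touch the DOM or the prompt. The sketch below assumes the dompurify package and an allowlist chosen purely for illustration.

```ts
// Sketch: zero-trust handling of web snippets (browser-side; a server-side
// variant would need a DOM shim such as jsdom).
import DOMPurify from "dompurify";

function sanitizeSnippet(rawHtml: string): string {
  // Strip scripts, event handlers, and tracker markup from retrieved content.
  return DOMPurify.sanitize(rawHtml, {
    ALLOWED_TAGS: ["b", "i", "em", "strong"], // illustrative allowlist
  });
}
```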
#VII. Limitations (Hard & Soft)
Hard Limitations
Cannot be fixed without architectural overhaul:
no formal factuality benchmarks
no multilingual grounding
temporal inconsistency (training cutoff vs now)
dependency on hostile providers
unbounded LLM miscalibration
snippet scarcity
rate-limited retrieval API
Soft Limitations
Fixable with future work:
query expansion
snippet ranking improvement
multi-provider fusion
uncertainty calibration
timestamp weighting
These limitations are rendered in the Known Limitations Matrix (Fig., Phase 7).
#VIII. Future Work
We outline research directions in increasing difficulty:
(1) Cryptographic Source Signing
Truth can be anchored cryptographically (web domains → signatures).
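A speculative sketch of what this could look like, assuming publishers expose an Ed25519 public key and sign snippet bytes; key discovery, formats, and revocation are open questions, and nothing here describes an existing protocol.

```ts
// Speculative sketch: verify a publisher signature over a snippet (Node.js crypto).
import { createPublicKey, verify } from "node:crypto";

function snippetIsAuthentic(
  snippet: string,
  signatureBase64: string,
  publisherPublicKeyPem: string, // hypothetical: discovered via the source domain
): boolean {
  const key = createPublicKey(publisherPublicKeyPem);
  return verify(
    null, // Ed25519 requires the algorithm parameter to be null/undefined
    Buffer.from(snippet, "utf8"),
    key,
    Buffer.from(signatureBase64, "base64"),
  );
}
```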
Truth is not binary; it is distributed. We need systems that arbitrate claims, not chatbots that answer them.
A research-grade SeekEngine would not generate answers—it would generate epistemic maps.
#IX. Conclusion: Independent Systems Research Perspective
SeekEngine demonstrates that hallucination can be reframed not as a model failure, but as a coordination problem under resource constraints. Retrieval and inference are complementary nodes—neither sufficient alone. Grounding requires verification; verification incurs cost; cost alters architecture and UX.
More importantly, SeekEngine shows that meaningful research can emerge from constraints: no funding, no institutional backing, no proprietary models, no GPU clusters. The system was not built in a lab—it was built in the open, where failure is visible and upstream reality cannot be abstracted away.
The project mirrors independent research traditions found in historical networking communities and BitTorrent hackers—curiosity-driven, empirical, adversarial, and deeply systems-aware.
SeekEngine’s value is not performance; it is the framing:
Hallucination is a distributed systems problem.
Grounding is a verification protocol.
Truth is expensive.
#X. Bibliographic Context & Inspirations
SeekEngine sits at the intersection of several research and engineering traditions. It draws implicitly from:
✔ Information Retrieval Research
snippet extraction
relevance ranking
query expansion
temporal freshness
semantic matching
✔ Distributed Systems & P2P
partial failure behavior
bootstrap mechanisms
adversarial assumptions
non-deterministic sequencing
swarm coordination
✔ Security Engineering
zero-trust boundaries
dominance of untrusted inputs
poisoning resistance
credential encapsulation
browser threat models
✔ LLM Research
hallucination
grounding
RAG pipelines
uncertainty calibration
prompt shaping
Unlike institutional RAG research—which assumes vector databases, stable compute, and proprietary evaluation—SeekEngine assumes none of these.
Instead, it inherits the tradition of independent experimental systems research, where validation comes from running the system against reality rather than benchmarks.
#XI. Acknowledgments & Contributions
SeekEngine was conceived, designed, and implemented as a collaborative independent research effort between Gaurav Yadav and Aditya Yadav, contributing equally across system architecture, implementation, debugging, and conceptual design.
Acknowledgments extend to:
OpenRouter → for accessible inference
Google CSE → for retrieval substrate
Next.js → for server action boundaries
Tailwind + React → for UI expressiveness
The open web → for its adversarial character
LLMs → for their confabulation tendencies (our experimental foil)
No institutional support, funding, or proprietary infrastructure was used.
#XIV. Appendix A — Prompting & RAG Protocol Notes (Spec-Level)
SeekEngine’s prompting layer enforces invariants:
no creativity
no speculation
no sentiment
no invented citations
brief claims
explicit sourcing
failure > confabulation
Example:
Code Reference
<< SYSTEM >>
You are a grounding-first search agent.
If no data is retrieved, say "Unknown."
Never invent facts. Cite snippets.
Minimize fluency and avoid speculation.
This interface treats LLM synthesis as a semantic reducer, not an author.
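A sketch of how those invariants can be compiled into the actual request, reusing the `GroundingSnippet` shape from earlier sketches; the helper name and message layout are illustrative, while the system text is the one quoted above.

```ts
// Sketch: reduce retrieved snippets + query into a grounded chat payload.
const SYSTEM_PROMPT = `You are a grounding-first search agent.
If no data is retrieved, say "Unknown."
Never invent facts. Cite snippets.
Minimize fluency and avoid speculation.`;

function buildGroundedPrompt(query: string, snippets: GroundingSnippet[]) {
  const context = snippets
    .map((s, i) => `[${i + 1}] ${s.title} — ${s.snippet} (${s.url})`)
    .join("\n");
  return [
    { role: "system", content: SYSTEM_PROMPT },
    {
      role: "user",
      content: `Question: ${query}\n\nRetrieved snippets:\n${context || "(none)"}`,
    },
  ];
}
```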
#XV. Appendix B — Failure Trace Catalog
Observed Failure Modes
Failure
Root Cause
Hallucination
retrieval starvation
Staleness
training cutoff mismatch
Misalignment
parallel fusion race
Speculation
inference fallback
Overconfidence
no calibration
Spam
SEO poisoning
Silence
rate limit + timeout
These traces shaped future work directions.
#XVI. Appendix C — Temporal Considerations
Temporal mismatch is a major source of epistemic error:
Code Reference
Web Time ≈ Now
Model Time ≈ Past
Query Time ≈ Future
Temporal alignment remains an open research frontier.
#XVII. Appendix D — Observability as Insight
We argue observability is not merely tooling; it is epistemology.
The diagnostic terminal (Fig. — Diagnostic Terminal Output) transforms orchestration into knowledge.
Observability is how systems speak.
#XVIII. Appendix E — Independent Research Context
SeekEngine joins a lineage of independent systems research driven not by grant funding or institutional hardware but by curiosity and constraint.
This lineage includes:
personal DHT implementations
hobby kernels
SDR radio stacks
Tor middleboxes
bare-metal type systems
BitTorrent clients built from scratch
Academic research tends to optimize for benchmarks.
Independent research optimizes for contact with reality.
SeekEngine belongs to the latter tradition.
#XIX. Final Statement
SeekEngine began as a hallucination patch and became a study in distributed grounding under constraint. It reveals that hallucination is not a statistical error—it is the absence of negotiated truth. Retrieval provides grounding; inference provides structure; verification provides validity; UI provides epistemic legibility.
This work suggests a reframing:
Truth is not produced; it is synchronized.
And synchronization—like all distributed coordination—is expensive, non-deterministic, and adversarial.
SeekEngine does not solve hallucination.
It demonstrates a way to reason about it.
#— End of Ultra Draft —
Citation
@article{yadav2026seekengine,
title={SeekEngine: Grounded Hybrid Retrieval for Truthful Search},
author={Yadav, Gaurav and Yadav, Aditya},
year={2026},
note={Independent Research},
url={https://seekengine.vercel.app},
}