SeekEngine started from a frustration: LLMs are great at sounding right, but they don't actually know anything about the present — they're frozen in their training data. Ask for today's stock price and you'll get a confident, fluent, completely fabricated number. We wanted to see if we could fix that by treating hallucination not as an AI problem, but as a distributed systems problem — the model is an isolated node cut off from external state, and it fills the gap with plausible nonsense.
Our fix was to wire together two very different upstream providers:
Google Custom Search Engine (CSE) — gives us sparse, real, timestamped snippets from the live web, and
OpenRouter — gives us structured synthesis, but will happily make things up when it doesn't have enough context.
We run them in parallel and merge the outputs through a fusion layer that treats factual grounding as a hard constraint. The result isn't a chatbot — it's a search agent that cites its sources, flags its own uncertainty, and stays quiet when it doesn't know.
We built this with zero budget, free-tier APIs, no proprietary infrastructure, and uncontrolled latency. The system failed constantly — rate limits, dropped requests, stale data, timing mismatches. But those failures weren't bugs to fix; they were the system telling us what the architecture needed to handle.
The core claim is modest: hallucination is a coordination problem. Ground the model with retrieval, verify the output against sources, and accept that accuracy costs something — latency, bandwidth, occasionally silence. Truth isn't free.
Most people talk about hallucination like it's a training data problem, or a prompting problem, or something you fix with a better model architecture. We tried all of that early on. None of it was the real issue. Hallucination doesn't just come from probability maximization — it comes from isolation.
LLMs generate tokens based on internal priors — but search requires external state. Without connectivity to the live web, LLMs are basically offline nodes trying to answer time-sensitive questions using stale snapshots of the world. Hallucination isn't a bug in this framing — it's a fallback policy. When the model doesn't have real data, fluency fills the void.
SeekEngine was initiated to answer a simple question:
Can retrieval serve as a “bootstrap node” for grounding, turning hallucination into a coordination problem rather than a probability problem?
This question framed the project less as AI UX and more as distributed systems research. The relevant phenomena resembled concepts from P2P systems:
| LLM Search Problem | Distributed Analogy |
| --- | --- |
| Hallucination | Unverified piece |
| Retrieval | Bootstrap node |
| Fusion | Swarm coordination |
| Source citation | Piece hashing |
| XSS + prompt injection | Peer poisoning |
| Latency | Consistency cost |
| Rate limits | Network congestion |
| Timeout | Silent peer drop |
| Provider mismatch | Protocol incompatibility |
| Truth penalty | Distributed coordination overhead |
Once reframed, the problem became tractable without proprietary data or large infrastructure.
#II. Independent Research Positioning
SeekEngine was built as a zero-budget, zero-infrastructure, open-web experiment by two independent researchers (Gaurav Yadav & Aditya Yadav) without privileged access to datasets, model weights, proprietary APIs, or academic compute. This constraint forced architectural decisions that are often avoided in institutional settings because they appear inelegant or “hacky,” yet they mirror constraints faced by real systems deployed outside research labs.
We found that constraints were not obstacles—they were signal generators.
Zero-budget forced reliance on free-tier APIs → revealed failure modes
No private infrastructure forced client/server separation → revealed credential surfaces
No vector DB forced dynamic RAG → revealed retrieval starvation behavior
No observability tooling forced terminal-level logging → revealed latency patterns
In short: removing resources made reality show up.
#III. Research Claim (Soft)
SeekEngine does not claim superiority over industrial RAG pipelines, nor does it claim to “fix” hallucination. Instead, it claims:
Hallucination is reducible to coordination.
Grounding is reducible to verification.
Verification is reducible to cost.
And cost—not creativity—is the limiting factor for truth.
Where typical chatbot UX hides uncertainty, SeekEngine surfaces it. Where typical inference pipelines suppress latency, SeekEngine exposes latency as proof-of-work for grounding. Where typical LLM outputs aim for eloquence, SeekEngine aims for inspectability.
#Phase 1 — Grounding the Problem
SeekEngine began with a deceptively simple observation: modern LLMs are exceptionally good at sounding correct yet structurally incapable of knowing whether their claims reflect reality. The problem is not malicious; it is architectural. Transformers predict tokens based on internal priors, not the contemporary web. When asked a stateful query (“AAPL price right now”), the model manufactures plausible numbers. This is not a hallucination defect — it is a fallback policy for lack of external state.
The initial research framing was naive: “We should attach a search API.” It quickly became clear that search was not merely an enrichment layer but a bootstrap node for grounding. Without retrieval, the model operates as a sealed container; with retrieval, it becomes a coordinated system of heterogeneous nodes that must merge partial and noisy information under latency constraints.
In this phase, the work shifted from AI speculation to systems thinking:
Truth is not a property of generation — it is a property of verification.
Verification is not free — it incurs cost.
Cost changes the architecture.
This realization created the first conceptual invariants of SeekEngine:
No “free” truth — grounding must be paid for in latency, bandwidth, or structure.
Inference is a node, not an oracle — it must negotiate with other nodes.
Grounding dominates creativity — creativity is a liability in search.
The UX must reflect uncertainty — opaque “confidence” is a failure mode.
#Phase 2 — Retrieval as Bootstrap
The BitTorrent analogy emerged unconsciously here.
In P2P networks, trackers and DHT nodes provide an entry point into a swarm. Without bootstrap nodes, peers have no swarm to join and no metadata to resolve. Retrieval fulfills the same role for grounded inference.
We treated Google CSE as our bootstrap node:
sparse
authoritative enough
rate-limited
non-deterministic
prone to silent drops
adversarial at input boundary
The signal from CSE resembled peer metadata:
titles → strong anchors
snippets → partial truths
urls → provenance
timestamps → freshness
keywords → weak alignment
ranking → heuristics, not truth
Retrieval was not the answer — it was the context substrate that made answers possible.
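The snippet-to-metadata mapping above can be sketched in code. The Google CSE JSON API returns an `items` array whose documented fields include `title`, `link`, and `snippet`; the `GroundingSnippet` record and `toGroundingSnippets` helper are our own naming, not part of any API — a minimal sketch of how raw results become "peer metadata" for the fusion layer.

```typescript
// One result item from the Google CSE JSON API
// (title, link, snippet are documented response fields).
interface CseItem {
  title: string;
  link: string;
  snippet: string;
}

// Internal "peer metadata" record consumed by the fusion layer (our naming).
interface GroundingSnippet {
  anchor: string;      // title → strong anchor
  claim: string;       // snippet → partial truth
  provenance: string;  // url → provenance
}

function toGroundingSnippets(items: CseItem[]): GroundingSnippet[] {
  return items.map((it) => ({
    anchor: it.title,
    claim: it.snippet,
    provenance: it.link,
  }));
}
```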
At this point the architecture formalized into a bootstrap graph:
To observe behavior, we implemented a diagnostic terminal:
System_Terminal
Fig. — Diagnostic Terminal Output
This terminal did more than demo output — it exposed the raw dynamics of a system negotiating with partial information, latency, and missing context.
Retrieval was now a protocol, not a feature.
#Phase 3 — Parallel Orchestration & Fusion
With bootstrap established, we introduced a second upstream node: OpenRouter, used as an inference relay. The orchestration problem immediately resembled swarm coordination:
retrieval produced grounded but brittle context
inference produced fluent but ungrounded synthesis
fusion required synchronizing mismatched temporal and semantic grain
We attempted sequential execution first:
Code Reference
CSE → LLM
This yielded correct facts but brittle structure: models produced citation noise, repetitive summarization, and low semantic coherence.
Parallel execution changed everything:
Code Reference
CSE || LLM → Fusion Layer
This made SeekEngine behave like a distributed system:
latency became a negotiation variable
timeouts became partial failures
rate limits became congestion
CSE starvation became a grounding deficit
LLM starvation became a synthesis deficit
Fusion was a protocol, not a merge function.
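A minimal sketch of the `CSE || LLM` leg, under stated assumptions: `fetchCse` and `fetchLlm` are placeholders for the real provider calls, and the timeout budget is an illustrative value. The point is the shape — each upstream can fail independently, and the fusion layer receives partial failure as data rather than as a crash.

```typescript
// Race a promise against a timeout, treating a slow upstream as a dropped peer.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((res, rej) => {
    const t = setTimeout(() => rej(new Error("timeout")), ms);
    p.then(
      (v) => { clearTimeout(t); res(v); },
      (e) => { clearTimeout(t); rej(e); }
    );
  });
}

interface FusionInput {
  snippets: string[] | null; // null = retrieval starvation
  draft: string | null;      // null = inference starvation
}

async function orchestrate(
  fetchCse: () => Promise<string[]>,
  fetchLlm: () => Promise<string>,
  budgetMs = 2000 // assumed latency budget for the sketch
): Promise<FusionInput> {
  const [cse, llm] = await Promise.allSettled([
    withTimeout(fetchCse(), budgetMs),
    withTimeout(fetchLlm(), budgetMs),
  ]);
  // Each leg settles independently; fusion sees partial failure, not a crash.
  return {
    snippets: cse.status === "fulfilled" ? cse.value : null,
    draft: llm.status === "fulfilled" ? llm.value : null,
  };
}
```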
In practice, the fusion layer was forced to operate under three constraints:
Truthfulness Constraint
Fused answers must be grounded or fail silent.
Minimality Constraint
Synthesis must be brief; verbosity dilutes claims.
Inspectability Constraint
Sources must be traceable.
The UX design decision to use citations + snippet grounding was not aesthetic — it was a protocol-level requirement for epistemic transparency.
To visualize this fusion, we introduced a live visualization of the parallel orchestration flow:

Search Query → ( Google CSE raw data ‖ LLM RAG synthesis ) → Response Fusion Layer → Verified Answer

Fig. — Orchestration Diagram
And operationally evaluated latency using:
| Path | Response Latency (ms) |
| --- | --- |
| Google CSE | ~300 |
| Direct LLM (OpenRouter) | ~1200 |
| SeekEngine Hybrid | ~1500 |
The "Truth Penalty": SeekEngine trades additional latency for improved factual consistency.
Fig. — Latency Comparison Benchmark
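The benchmark above reduces to a simple latency model. Since the two legs run in parallel, only the slower one counts, plus fusion overhead; the ~300ms fusion figure here is an assumed value chosen so the sketch reproduces the measured total.

```typescript
// Illustrative latency model for the truth penalty.
// Parallel legs cost max(), not sum(); fusion overhead is an assumed value.
function hybridLatency(cseMs: number, llmMs: number, fusionMs: number): number {
  return Math.max(cseMs, llmMs) + fusionMs;
}

const total = hybridLatency(300, 1200, 300); // 1500ms, matching the benchmark
const penalty = total / 1200;                // 1.25x vs direct LLM for these figures
```

For these particular figures the penalty is ~1.25×; under load, rate limits and retries push it higher.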
Where the BitTorrent client paid bandwidth and time for piece verification, SeekEngine paid latency for truth verification.
This tradeoff is fundamental:
truth costs time and time costs UX.
Designers ignore this at their peril.
#Phase 4 — Verification as Protocol
Phase 4 formalized the insight that grounding must be explicit, not implicit. We defined verification as a protocol with four gates:
Existence Gate
Does the answer reference any retrieved sources?
Consistency Gate
Do claims align with retrieved snippets?
Temporal Gate
If claims are time-sensitive, is the supporting evidence fresh enough?
Source Gate
Are sources adversarial or low-quality?
Only after verification do we allow synthesis.
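The four gates can be sketched as a single predicate that synthesis must pass. Everything here is a simplification under stated assumptions: the freshness budget and blocklist are placeholder values, the consistency check is a crude URL-membership test standing in for real claim/snippet alignment, and all names are ours.

```typescript
interface Snippet {
  url: string;
  text: string;
  fetchedAt: number; // epoch ms
}

interface Candidate {
  text: string;
  citedUrls: string[];
  snippets: Snippet[];
  timeSensitive: boolean;
}

const MAX_AGE_MS = 24 * 60 * 60 * 1000;  // assumed freshness budget
const BLOCKLIST = ["spam.example"];       // stand-in for source-quality scoring

function verify(c: Candidate, now: number): boolean {
  // Existence Gate: does the answer reference any retrieved sources?
  const existence = c.citedUrls.length > 0;
  // Consistency Gate (crude): every cited URL must come from retrieval.
  const consistency = c.citedUrls.every((u) =>
    c.snippets.some((s) => s.url === u));
  // Temporal Gate: time-sensitive claims need at least one fresh snippet.
  const temporal = !c.timeSensitive ||
    c.snippets.some((s) => now - s.fetchedAt < MAX_AGE_MS);
  // Source Gate: no cited URL may match a known low-quality source.
  const source = c.citedUrls.every((u) =>
    !BLOCKLIST.some((b) => u.includes(b)));
  return existence && consistency && temporal && source;
}

function synthesize(c: Candidate, now: number): string {
  return verify(c, now) ? c.text : "Unknown"; // silence over fabrication
}
```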
To illustrate verification dynamics, we upgraded an earlier demo into a truth vs hallucination comparator:
Hallucination Detected
"The current stock price of Apple is $245.30, showing a strong 2% growth since this morning's opening..."
(Note: LLM is using training data from 2024 to guess 2026 prices)
Fig. — Grounded vs Ungrounded Output Comparison
In micro-benchmarks:
ungrounded inference → high fluency, low truth
grounded inference → lower fluency, higher truth
This revealed the truth penalty more starkly than latency:
grounding reduces eloquence
verification increases friction
citations expose uncertainty
silence becomes preferable to fabrication
In human UX terms:
truth does not always look pretty.
This phase reframed hallucination as:
“verification failure under isolation.”
#Phase 5 — Security & Adversarial Surface
Once retrieval and inference were fused, a new concern emerged: the system was now exposed to two adversaries at once:
External adversaries — the open web
Internal adversaries — the LLM itself
Unlike BitTorrent, SeekEngine does not have malicious peers, but it has malicious inputs. The web is adversarial by default — SEO poisoning, spam vectors, XSS payloads, tracker pixels, misleading snippets, prompt injection triggers, content farms, and outdated content masquerading as authoritative.
The inference pipeline is adversarial by construction — LLMs are capable of self-hallucination, overconfidence, and unbounded fabrication when starved of context.
We treated the LLM as a potentially adversarial subsystem capable of:
unsanctioned creativity
miscalibration
citation forgery
temporal guesswork
sentimental phrasing
source attribution fakery
These required protocol-level guardrails, not UX hints.
Boundary Security
Credential exposure emerged as an unexpected risk. Retrieval and inference both required API keys, but inference required higher privilege. Early prototypes leaked credentials through client bundles, forcing a redesign of the execution boundary and relocation to server-only handlers.
This surfaced the first formal trust boundary:
Code Reference
Client —(untrusted)→ Server —(trusted)→ Provider
To visualize this, we preserved and upgraded:
Environment Encapsulation
client_side.js
const API_KEY = "sk-..." // LEAK DETECTED
server_action.ts
process.env.OPENROUTER_KEY // ENCAPSULATED
Auth Integrity: 100%
Fig. — Trust Boundary & Credential Encapsulation
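The trusted side of the boundary can be sketched as a server-only helper. This is written in the spirit of a Next.js server action but uses no Next.js API; `providerHeaders` is our name, and the only real contract is that the key lives in `process.env` on the server and never in a client bundle.

```typescript
// Server-side only: the key is read from the environment at call time
// and never serialized into any client payload.
function providerHeaders(): Record<string, string> {
  const key = process.env.OPENROUTER_KEY;
  if (!key) {
    // Fail loudly on the server rather than silently degrading
    // or shipping a fallback secret.
    throw new Error("OPENROUTER_KEY missing from server environment");
  }
  return { Authorization: `Bearer ${key}` };
}
```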
Threat Matrix
We consolidated threat classes into a matrix:
Threat Model & Mitigations

| Threat | Status | Mitigation |
| --- | --- | --- |
| XSS injection | mitigated | DOMPurify sanitization |
| API key leakage | mitigated | Server-side encapsulation |
| Prompt injection | partial | Input filtering (basic) |
| Data persistence | mitigated | Request-scope only |
| Upstream compromise | unaddressed | Outside our control |
| Model-level exploits | unaddressed | Future work |
Fig. — Adversarial Surface & Mitigation Matrix
This matrix resembled real-world threat models from cybersecurity research more than traditional IR/RAG pipelines.
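The XSS row deserves one concrete illustration. The real system uses DOMPurify; the escaping function below is only a minimal stand-in showing the principle — web snippets are treated as data, never as markup.

```typescript
// Minimal sketch of the XSS mitigation principle (the production system
// uses DOMPurify; this stand-in just neutralizes markup characters).
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")   // must run first so later entities survive
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```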
#Phase 6 — Observability & Diagnostics
After securing the boundaries, the system hit a new bottleneck: non-observability. Distributed systems cannot be debugged through intuition. Failures were occurring inside the fusion layer that produced no visible errors — silent, partial, or timing-based failures similar to P2P networks.
Symptoms included:
retrieval starvation
inference starvation
fusion race conditions
inconsistent snippet alignment
snippet truncation
stale web results
inference guesswork
non-deterministic formatting
latency variance spikes
To make the system observable, we implemented a diagnostic terminal UI that streamed the orchestration process. This did not look like research instrumentation — but it was exactly that.
Fig. — Diagnostic Terminal Output
This feature revealed system truths that logs alone could not:
latency became visible as structure
silence became a detectable event
sequence became temporal order
errors took on visible shape
We discovered that lack of failure was not success — it was a symptom of silent fallback. This lesson is familiar to P2P engineers but absent from most AI tool building.
Observability transformed SeekEngine from a black box to a negotiable protocol.
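The terminal's instrumentation amounts to wrapping every stage so that latency, silence, and ordering become visible structure. A sketch, with all names (`traced`, `StageTrace`) being ours: the key move is that an empty result is logged as its own outcome, distinct from success and from error.

```typescript
interface StageTrace {
  stage: string;
  ms: number;
  outcome: "ok" | "error" | "empty";
}

// Wrap a stage so its latency and outcome are recorded; silence ("empty")
// is a first-class, detectable event rather than an invisible fallback.
async function traced<T>(
  stage: string,
  fn: () => Promise<T>,
  log: StageTrace[]
): Promise<T | null> {
  const start = Date.now();
  try {
    const v = await fn();
    const empty = v == null || (Array.isArray(v) && v.length === 0);
    log.push({ stage, ms: Date.now() - start, outcome: empty ? "empty" : "ok" });
    return v;
  } catch {
    log.push({ stage, ms: Date.now() - start, outcome: "error" });
    return null;
  }
}
```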
#Phase 7 — Partial Failures & Silent Errors
The hallmark of distributed systems is not crashing — it is partial failure. SeekEngine encountered partial failure behaviors identical to those seen in:
BitTorrent swarms
DHT peer tables
gossip networks
cloud orchestration
weakly-consistent caching systems
Failure modes included:
(a) Retrieval Starvation
CSE occasionally returned empty or stale results. The LLM compensated by fabricating plausible answers. Bootstrap failure → hallucination.
(b) Inference Starvation
OpenRouter occasionally dropped or rate-limited requests. Retrieval produced raw snippets with no synthesis. Bootstrap success → no swarm coordination.
(c) Timing Desynchronization
Parallel requests resolved in inconsistent orders. Fusion layer misaligned context and generated broken synthesis.
(d) Rate-Limit Oscillation
LLM response times oscillated under multi-query load, creating abrupt latency cliffs.
(e) Provider Mismatch
CSE timestamps mismatched OpenRouter’s training cutoff, producing temporal inconsistency (new vs stale knowledge).
(f) Trust Misalignment
High-ranking snippets were low-quality (SEO spam), while lower-ranked snippets were authoritative (primary sources). Retrieval ≠ trust.
These surfaced in the Limitations Matrix:
Known Limitations Matrix

| Limitation | Category | Detail | Impact |
| --- | --- | --- | --- |
| No standardized benchmarks | Evaluation | Internal testing only | high |
| Third-party dependency | Reliability | Google CSE, OpenRouter availability | medium |
| Multilingual support | Coverage | English-primary implementation | medium |
| Temporal consistency | Accuracy | Real-time data freshness varies | high |
| Rate limiting | Scale | Free-tier constraints | low |
Honest assessment: These limitations are documented, not hidden.
Fig. — Known Limitations Assessment
SeekEngine never crashed — it degraded, silently.
This is the hallmark of real distributed systems.
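The failure modes (a)–(f) imply a degradation policy: each starvation combination maps to an explicit, user-visible state instead of a silent fallback. A sketch, with the mode names being ours:

```typescript
type Mode = "grounded_answer" | "raw_snippets" | "refuse" | "unknown";

// Map partial-failure combinations to explicit degraded states.
function degrade(hasSnippets: boolean, hasSynthesis: boolean): Mode {
  if (hasSnippets && hasSynthesis) return "grounded_answer";
  if (hasSnippets) return "raw_snippets"; // inference starvation: show evidence, no prose
  if (hasSynthesis) return "refuse";      // retrieval starvation: prose would hallucinate
  return "unknown";                       // both starved: stay silent
}
```

Note the asymmetry: losing synthesis degrades presentation, but losing retrieval forces refusal, because an ungrounded draft is exactly the hallucination the system exists to prevent.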
#Phase 8 — Lessons from the System
By the time SeekEngine stabilized, it had ceased being an AI demo and had become a distributed coordination experiment operating across three domains:
(1) The Web as Information Substrate
→ sparse, adversarial, timestamped, unstructured
(2) The LLM as Synthesis Machine
→ structured, fluent, hallucination-prone, stochastic
(3) The UI as Epistemic Interface
→ mediates uncertainty, verification, and trust
The most surprising lessons came from working at the boundaries:
Lesson 1
Retrieval alone cannot answer.
Inference alone cannot know.
Truth emerges from negotiation.
Lesson 2
Hallucination is not a bug —
it is a failure of coordination under isolation.
Trust is a UI problem as much as an execution problem.
Lesson 5
The cheapest systems teach you the most —
because they cannot hide their failures.
#IV. System Architecture
By Phase 3, it became clear that SeekEngine needed a formal architecture—not to impress reviewers, but to reason about failure modes. Distributed systems without architecture are inscrutable; architecture is an instrument for understanding.
The final system decomposed into three macro-layers: the web as information substrate, the LLM as synthesis machine, and the UI as epistemic interface (as detailed in Phase 8).
Latency breakdown:
| Stage | Cost |
| --- | --- |
| Retrieval | network-bound |
| Inference | compute-bound |
| Fusion | synchronization-bound |
| Verification | consistency-bound |
The result was a measurable latency penalty of ~1.3–2.4× vs ungrounded inference.
But truth isn't free.
#VI. Threat Model & Adversarial Surface
Unlike BitTorrent, SeekEngine is not attacked by malicious peers—but it is attacked by malicious content and overconfident models.
Threat classes included:
| Threat Class | Source | Mitigation |
| --- | --- | --- |
| XSS injection | Web | Sanitizer |
| SEO poisoning | Web | Source weighting |
| Prompt injection | User | Input filtering |
| Citation forgery | Model | Verification |
| Temporal drift | Web/Model | Timestamp check |
| Credential leakage | System | Server actions |
| Upstream collapse | Provider | Timeout + fallback |
| Poisoned snippets | Web | Snippet consistency |
Zero-Trust Execution
We adopted zero-trust against: (1) Providers, (2) Models, (3) Users, and (4) The Web. This security stance is uncommon in RAG prototypes and more aligned with hardened web services.
#VII. Limitations (Hard & Soft)
Hard Limitations
Cannot be fixed without architectural overhaul:
no formal factuality benchmarks
no multilingual grounding
temporal inconsistency (training cutoff vs now)
dependency on hostile providers
unbounded LLM miscalibration
snippet scarcity
rate-limited retrieval API
Soft Limitations
Fixable with future work:
query expansion
snippet ranking improvement
multi-provider fusion
uncertainty calibration
timestamp weighting
#VIII. Future Work
We outline research directions in increasing difficulty:
(1) Cryptographic Source Signing
Truth can be anchored cryptographically (web domains → signatures).
Truth is not binary; it is distributed. We need systems that arbitrate claims, not chatbots that answer them.
A research-grade SeekEngine would not generate answers—it would generate epistemic maps.
#IX. Conclusion: Independent Systems Research Perspective
SeekEngine showed us that hallucination isn't really a model failure — it's a coordination failure under resource constraints. Retrieval and inference are complementary nodes; neither is sufficient on its own. Grounding needs verification. Verification costs time. And that cost reshapes everything — the architecture, the UX, the expectations.
More importantly, this project showed that meaningful research doesn't require funding, institutional backing, or GPU clusters. We built this in the open, with free-tier APIs, where every failure was visible and reality couldn't be abstracted away.
The project mirrors independent research traditions found in historical networking communities and BitTorrent hackers—curiosity-driven, empirical, adversarial, and deeply systems-aware.
SeekEngine’s value is not performance; it is the framing:
Hallucination is a distributed systems problem.
Grounding is a verification protocol.
Truth is expensive.
#X. Bibliographic Context & Inspirations
SeekEngine sits at the intersection of several research and engineering traditions. It draws implicitly from:
✔ Information Retrieval Research
snippet extraction
relevance ranking
query expansion
temporal freshness
semantic matching
✔ Distributed Systems & P2P
partial failure behavior
bootstrap mechanisms
adversarial assumptions
non-deterministic sequencing
swarm coordination
✔ Security Engineering
zero-trust boundaries
dominance of untrusted inputs
poisoning resistance
credential encapsulation
browser threat models
✔ LLM Research
hallucination
grounding
RAG pipelines
uncertainty calibration
prompt shaping
Unlike institutional RAG research—which assumes vector databases, stable compute, and proprietary evaluation—SeekEngine assumes none of these.
Instead, it inherits the tradition of independent experimental systems research, where validation comes from running the system against reality rather than benchmarks.
#XI. Acknowledgments & Contributions
SeekEngine was conceived, designed, and built as a collaborative effort between Gaurav Yadav and Aditya Yadav, splitting the work equally across architecture, implementation, debugging, and the conceptual design documented here.
Acknowledgments extend to:
OpenRouter → for accessible inference
Google CSE → for retrieval substrate
Next.js → for server action boundaries
Tailwind + React → for UI expressiveness
The open web → for its adversarial character
LLMs → for their confabulation tendencies (our experimental foil)
No institutional support, funding, or proprietary infrastructure was used.
#XIV. Appendix A — Prompting & RAG Protocol Notes (Spec-Level)
SeekEngine’s prompting layer enforces invariants:
no creativity
no speculation
no sentiment
no invented citations
brief claims
explicit sourcing
failure > confabulation
Example:
Code Reference
<< SYSTEM >>
You are a grounding-first search agent.
If no data is retrieved, say "Unknown."
Never invent facts. Cite snippets.
Minimize fluency and avoid speculation.
This interface treats LLM synthesis as a semantic reducer, not an author.
#XV. Appendix B — Failure Trace Catalog
Observed Failure Modes
Failure
Root Cause
Hallucination
retrieval starvation
Staleness
training cutoff mismatch
Misalignment
parallel fusion race
Speculation
inference fallback
Overconfidence
no calibration
Spam
SEO poisoning
Silence
rate limit + timeout
These traces shaped future work directions.
#XVI. Appendix C — Temporal Considerations
Temporal mismatch is a major source of epistemic error:
Code Reference
Web Time ≈ Now
Model Time ≈ Past
Query Time ≈ Future
Temporal alignment remains an open research frontier.
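The three clocks above suggest a simple timestamp gate. A sketch under loud assumptions: the training cutoff and freshness budget below are placeholder values, and `temporallySafe` is our name, not part of any existing API.

```typescript
// Three clocks: web time (snippet fetch), model time (training cutoff),
// query time ("right now"). Both constants are assumed values.
const TRAINING_CUTOFF = Date.parse("2024-06-01"); // assumed model cutoff
const FRESHNESS_MS = 60 * 60 * 1000;              // assumed 1h budget for "now" queries

function temporallySafe(
  snippetFetchedAt: number,
  queryNeedsNow: boolean,
  now: number
): boolean {
  if (!queryNeedsNow) return true;
  // Anything the model "knows" predates the cutoff; only evidence fetched
  // after the cutoff and within the freshness budget may answer "now".
  return snippetFetchedAt > TRAINING_CUTOFF && now - snippetFetchedAt < FRESHNESS_MS;
}
```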
#XVII. Appendix D — Observability as Insight
We argue observability is not merely tooling; it is epistemology.
Diagnostic terminal:
Fig. — Diagnostic Terminal Output
Transforms orchestration into knowledge.
Observability is how systems speak.
#XVIII. Appendix E — Independent Research Context
SeekEngine joins a lineage of independent systems research driven not by grant funding or institutional hardware but by curiosity and constraint.
This lineage includes:
personal DHT implementations
hobby kernels
SDR radio stacks
Tor middleboxes
bare-metal type systems
BitTorrent clients built from scratch
Academic research tends to optimize for benchmarks.
Independent research optimizes for contact with reality.
SeekEngine belongs to the latter tradition.
#XIX. Final Statement
SeekEngine began as a hallucination patch and became a study in distributed grounding under constraint. It reveals that hallucination is not a statistical error—it is the absence of negotiated truth. Retrieval provides grounding; inference provides structure; verification provides validity; UI provides epistemic legibility.
This work suggests a reframing:
Truth is not produced; it is synchronized.
And synchronization—like all distributed coordination—is expensive, non-deterministic, and adversarial.
SeekEngine does not solve hallucination.
It demonstrates a way to reason about it.
#— End of Ultra Draft —
Citation
@article{yadav2026seekengine,
title={SeekEngine: Grounded Hybrid Retrieval for Truthful Search},
author={Yadav, Gaurav and Yadav, Aditya},
year={2026},
note={Independent Research},
url={https://seekengine.vercel.app},
}