Social Graph as Infrastructure: The Anthropological Case for Distributed Storage

Written by leonacosta | Published 2026/03/18
Tech Story Tags: web-of-trust | nostr | decentralization | social-graph | network-theory | distributed-social-network | censorship-resistance | social-graph-as-infrastructure

TL;DR: Dunbar proved your social layers are a resource allocation system. Granovetter proved your weak ties are a routing mechanism. Watts-Strogatz proved your trust network has epidemic delivery guarantees. Your social graph is already distributed infrastructure — reciprocal, redundant, and self-healing. Gozzip is the protocol that finally implements it.

Every decentralized protocol built so far has shipped the same architectural mistake. Here's what we borrowed from 150,000 years of human behavior to fix it.


There's a pattern in how anthropologists describe information flow in pre-literate communities. News doesn't broadcast — it cascades. A person hears something from someone they trust, judges the claim against what they know of that source, and decides how far to pass it along. The further information travels from its origin, the more social proof it accumulates. Weak signals die locally. Strong ones reach everyone who needs them — and no one needs a server to make this work.

We've known this for decades. Robin Dunbar quantified the cognitive architecture behind it. Mark Granovetter mapped its structural mechanics. Watts and Strogatz showed it produces provably efficient routing through small-world network theory. The human social graph is not a metaphor for distributed systems. It is one — and a remarkably well-engineered one.

So why did every decentralized social protocol ever shipped route around it entirely?


The Architectural Oversight

Over the last decade, a generation of decentralized social protocols emerged in direct response to platform capture — Twitter/X banning accounts, Facebook harvesting data, Reddit killing third-party clients. The goal was user sovereignty: your identity, your data, no platform intermediary.

Three protocols represent the serious attempts. Nostr separates identity from infrastructure entirely — your keypair is your account, no domain tied to it, portable across any relay. Mastodon (ActivityPub) federates across independently-run instances, breaking the single-platform monopoly. Bluesky (AT Protocol) takes a similar federated approach but with portable accounts and a centralized relay aggregating the firehose.

Each made real progress. Each reproduced the same structural failure.

In Nostr, a client talks to relay servers. The relays decide which events to keep and for how long. Relay operators can drop your content without notice, and your data disappears with them unless you manually replicate across multiple relays. In Mastodon, your account is bound to an instance operator who can suspend it, shut down the server, or change terms unilaterally — your posts don't survive the instance dying. In Bluesky, a centralized relay aggregates and serves the firehose, recreating a single point of control at the infrastructure layer.

The specific architecture varies. The dependency is the same: servers control what gets stored, served, and censored. The user's social graph lives on infrastructure they do not own or control.

What none of them did — what no decentralized protocol has done — is make the social graph itself the storage layer. In every implementation, the trust relationships you've built are a display layer. They render your feed. They shape your recommendations. They do not touch the infrastructure. Your follow list is a UI concern, not a load-bearing one.

The promise of decentralization is sovereignty. The delivered reality is a different set of platform operators with a better origin story. The server dependency was never eliminated — it was rebranded.

What no protocol has yet done is use the social graph for what it actually is: a load-bearing distributed system that humans have been running reliably since before writing existed.


What Evolution Actually Shipped

Dunbar's research established that human social networks exhibit concentric structure. You maintain roughly 5 intimate contacts, 15 close friends, 50 good friends, 150 casual acquaintances — each layer approximately three times larger than the last, with intimacy and obligation halving at each step.

This isn't arbitrary. It's a solution to a resource allocation problem. Maintaining relationships costs cognitive overhead. The Dunbar layers represent different obligation tiers: the inner circle (5–15) are the people you'd help move at 2am. The sympathy group (15–50) are people whose social updates you track reliably. The affinity group (50–150) are people you recognize and trust provisionally.

What's structurally interesting is that these layers map directly onto information propagation dynamics:

Reciprocity as mechanism. Anthropologists have documented that information flow in communities is heavily shaped by reciprocal relationships. You share news selectively with people who share news with you. One-sided information relationships decay — the same way one-sided friendships do. This isn't altruism. It's a stable equilibrium that emerged because it produces good outcomes for both parties.

Triadic closure as verification. Rapoport formalized what communities already knew: if you trust A and A trusts B, your prior on B is substantially higher than on a stranger. The probability that information from B is relevant to you scales with how many of your mutual contacts also follow B. This is not a heuristic — it's the structural mechanism that makes recommendations reliable.

Proportional redundancy. Important information reaches you through multiple independent paths. Less important information dies locally. The redundancy is not uniform — it's proportional to social relevance. Gossip about your immediate community saturates; gossip about distant events fades. This is a feature, not a bug. It's content-addressable replication, implemented socially.

The weak tie bridge. Granovetter's 1973 paper remains one of the most structurally important results in network science: your strong ties (mutual follows, reciprocal relationships) provide redundancy — you and your close contacts share most of the same information. Your weak ties (one-directional follows, acquaintances) provide reach — they bridge you to communities outside your cluster. Information diversity comes from weak ties; information reliability comes from strong ties.

Every one of these mechanisms is an engineering solution to a distributed systems problem: storage, routing, verification, redundancy, discovery. Evolution ran the experiment for 150,000 years and shipped a working protocol. We just haven't been implementing it.
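The triadic-closure mechanism above is simple enough to sketch directly. This is a minimal illustration, not the protocol's actual scoring function: the trust prior on an unknown account is the fraction of your direct contacts who already follow it.

```python
def triadic_trust_prior(my_follows: set, follows_of: dict, candidate: str) -> float:
    """Triadic closure as a trust prior: the more of my direct contacts
    follow the candidate, the higher my prior that they're relevant."""
    if not my_follows:
        return 0.0
    vouching = sum(1 for contact in my_follows
                   if candidate in follows_of.get(contact, set()))
    return vouching / len(my_follows)

# Toy graph: I follow A and C; both follow B; nobody follows D.
follows = {"me": {"A", "C"},
           "A": {"B"},
           "C": {"B"}}
print(triadic_trust_prior(follows["me"], follows, "B"))  # 1.0
print(triadic_trust_prior(follows["me"], follows, "D"))  # 0.0
```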


Making the Social Graph Load-Bearing

Gozzip is an open protocol that inherits Nostr's proven primitives — secp256k1 identity, signed events, relay transport — and adds the layer that makes the social graph structural rather than cosmetic.

The core mechanism is a storage pact: a bilateral agreement between two peers in your Web of Trust to hold each other's data. Volume-balanced (within 30% tolerance), cryptographically verified through periodic challenge-response, maintained in the background. Your pact partners receive a copy of your events alongside your relays. When someone reads your content, they check the pact network first.

This is not a clever engineering trick. It is a direct formalization of how human communities actually store information.
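The pact's two mechanical pieces (volume balance and challenge-response verification) can be sketched as follows. Function names and the exact proof construction are illustrative assumptions, not the whitepaper's wire format; the 30% tolerance is taken from the description above.

```python
import hashlib
import os

TOLERANCE = 0.30  # volume-balance tolerance from the pact description

def volumes_compatible(my_bytes: int, peer_bytes: int, tol: float = TOLERANCE) -> bool:
    """A pact is viable only if neither side stores disproportionately more."""
    bigger, smaller = max(my_bytes, peer_bytes), min(my_bytes, peer_bytes)
    return bigger <= smaller * (1 + tol)

def storage_challenge(stored_event: bytes):
    """Issue a random nonce; the expected proof binds the nonce to the
    actual stored bytes, so a peer that discarded the data cannot answer."""
    nonce = os.urandom(16)
    expected = hashlib.sha256(nonce + stored_event).digest()
    return nonce, expected

def prove_storage(nonce: bytes, stored_event: bytes) -> bytes:
    """The peer's side: recompute the proof over the bytes it still holds."""
    return hashlib.sha256(nonce + stored_event).digest()
```

A peer that silently dropped your events fails the next challenge, which is how the background verification stays cheap: no full data transfer, just a hash over a fresh nonce.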

The Dunbar Mapping Is Not Decorative

The protocol parameters map explicitly to Dunbar's social layers:

| Dunbar Layer | Size | Protocol Role |
| --- | --- | --- |
| Support clique | ~5 | Full-node pact partners (complete history) |
| Sympathy group | ~15 | Active pact partners (20 target) |
| Affinity group | ~50 | Direct WoT tier (mutual follows) |
| Dunbar number | ~150 | 2-hop WoT boundary |
| Acquaintances | ~500+ | Relay-discovered peers (no storage obligation) |

The 20-pact target sits in the sympathy group layer because storage pacts require ongoing reciprocal obligation — a level of commitment that corresponds to the ~15 relationships humans maintain with regular mutual contact. The 2-hop gossip boundary aligns with Dunbar's number because trust assessment beyond ~150 nodes becomes cognitively unreliable in humans — and it becomes structurally unreliable in the protocol for the same reason.

These aren't aesthetic choices. If you required 500 pact partners, you'd exceed the sympathy group's capacity and incentivize defection. If you relied on 5-hop WoT propagation, you'd extend trust beyond the Dunbar boundary into structurally unverifiable territory.

Reciprocity as Infrastructure, Literally

Volume matching in pact formation formalizes what unstable human relationships already demonstrate: asymmetric obligations incentivize defection. A prolific poster paired with a lurker creates an asymmetric storage burden. Asymmetric friendships decay in human communities for the same structural reason — the cost/benefit ratio breaks down.

The protocol prevents this by matching peers on compatible activity levels. This is not an optimization — it is the minimum viable condition for the reciprocal relationship to be stable.

Reliability scoring operates identically to how human reputation actually works: not as a global score, but as a private per-peer assessment using a 30-day rolling window of observed behavior. There is no central reputation authority. Each node computes its own assessment. Dropped pacts lose you a storage advocate — the protocol equivalent of being talked about less.
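A per-peer score over a 30-day rolling window might look like the sketch below. The class and method names are hypothetical; the point is that the state is private to each node and nothing global exists to be gamed.

```python
from collections import deque
import time

WINDOW_SECONDS = 30 * 24 * 3600  # 30-day rolling window from the description

class PeerReliability:
    """Private, per-peer reliability: the fraction of successful
    interactions (e.g. answered storage challenges) observed in the
    last 30 days. Every node computes its own; there is no authority."""

    def __init__(self):
        self.observations = deque()  # (timestamp, success: bool)

    def record(self, success, now=None):
        now = time.time() if now is None else now
        self.observations.append((now, success))
        self._prune(now)

    def _prune(self, now):
        # Drop observations that have aged out of the window.
        while self.observations and self.observations[0][0] < now - WINDOW_SECONDS:
            self.observations.popleft()

    def score(self, now=None) -> float:
        now = time.time() if now is None else now
        self._prune(now)
        if not self.observations:
            return 0.0
        return sum(ok for _, ok in self.observations) / len(self.observations)
```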

Gossip as Curated Propagation

When humans gossip, they don't broadcast. They share selectively based on trust and relevance. The protocol's WoT-filtered gossip layer works identically:

  • Active pact partners get immediate forwarding
  • 1-hop WoT peers (direct follows) get standard priority
  • 2-hop WoT peers (friends-of-friends) get forwarding when capacity permits
  • Unknown sources: never forwarded. Strangers' gossip stops at your door.
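The four forwarding rules above reduce to a small decision function. This is a sketch of the tiering logic, not the protocol's actual API:

```python
from enum import Enum

class Priority(Enum):
    IMMEDIATE = 0   # active pact partners: forward at once
    STANDARD = 1    # 1-hop WoT (direct follows)
    CAPACITY = 2    # 2-hop WoT (friends-of-friends), only when capacity permits
    DROP = 3        # unknown sources: never forwarded

def forward_priority(source: str, pact_partners: set,
                     follows: set, two_hop: set) -> Priority:
    """WoT-filtered gossip: priority falls off with social distance,
    and strangers' gossip stops at your door."""
    if source in pact_partners:
        return Priority.IMMEDIATE
    if source in follows:
        return Priority.STANDARD
    if source in two_hop:
        return Priority.CAPACITY
    return Priority.DROP
```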

Pastor-Satorras and Vespignani proved that epidemic spreading on scale-free networks has a vanishing threshold — any nonzero transmission rate produces network-wide propagation. This means gossip is efficient but also means spam and attacks propagate freely if unrestricted.

The 2-hop WoT boundary solves this by creating a finite epidemic threshold. Within the WoT community, gossip spreads with near-scale-free efficiency. Beyond the boundary, propagation requires relay assistance. This produces the same dual regime that human gossip exhibits: saturating within communities, fading across them.
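The threshold result is concrete enough to compute. For SIS-style spreading on an uncorrelated network, the epidemic threshold is lambda_c = ⟨k⟩/⟨k²⟩ (Pastor-Satorras and Vespignani); heavy-tailed hub degrees inflate ⟨k²⟩ and push the threshold toward zero, while a degree-bounded WoT keeps it finite. The degree sequences below are toy illustrations:

```python
def epidemic_threshold(degrees: list) -> float:
    """SIS epidemic threshold on an uncorrelated network:
    lambda_c = <k> / <k^2>. A few massive hubs drive <k^2> up,
    so the threshold vanishes; bounded degrees keep it finite."""
    n = len(degrees)
    k_mean = sum(degrees) / n
    k2_mean = sum(d * d for d in degrees) / n
    return k_mean / k2_mean

# Toy hub-heavy degree sequence vs. a WoT capped near mean degree 20.
scale_free = [1] * 900 + [10] * 90 + [1000] * 10
wot_capped = [20] * 1000

print(epidemic_threshold(scale_free))  # ~0.001: spam spreads at almost any rate
print(epidemic_threshold(wot_capped))  # 0.05: a finite threshold to clear
```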

Bootstrapping Through Guardianship

Every community has established members who vouch for newcomers. The protocol formalizes this as guardian pacts: an established user voluntarily stores data for one newcomer outside their WoT, accepting a small storage cost without reciprocal obligation.

The framing is deliberate. Today's newcomer is tomorrow's guardian. The pay-it-forward dynamic isn't enforced by the protocol — it's encouraged by client UX — because in human communities, the same dynamic operates through social norm rather than contractual obligation.


The Retrieval Cascade

Data requests flow through four tiers, each activated only when the previous fails:

Tier 1 — Local pact storage. You already have it. Zero network traffic.

Tier 2 — Cached endpoint. You've interacted with this author's pact partners before; you have their addresses cached. Direct connection, ~60ms.

Tier 3 — WoT gossip. Blinded request (daily-rotating pubkey hash) broadcast to your WoT peers with TTL=3. With mean degree 20 and a clustering coefficient of 0.25, three hops covers ~4,500 nodes — 90%+ of a 5,000-node network. Gossip reach follows the small-world property: path length scales logarithmically while clustering remains high.

Tier 4 — Relay fallback. Last resort, 30-second timeout, ~200ms. The relay is still there. It's just no longer the load-bearing wall.
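The Tier 3 reach figure is reproducible with a back-of-envelope branching estimate. The model below is an assumption on my part (each forwarding node reaches degree − 1 new neighbors, discounted by the clustering coefficient since that fraction of neighbors is already covered), not the whitepaper's derivation, but it lands near the stated ~4,500:

```python
def gossip_reach(mean_degree: int, clustering: float, ttl: int) -> int:
    """Rough reach of TTL-limited flooding: after the first hop each
    node forwards to (degree - 1) neighbors, discounted by clustering
    (that share of neighbors is assumed already reached)."""
    branching = (mean_degree - 1) * (1 - clustering)
    reached, frontier = 0, mean_degree
    for _ in range(ttl):
        reached += frontier
        frontier = int(frontier * branching)
    return reached

print(gossip_reach(mean_degree=20, clustering=0.25, ttl=3))  # 4366
```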

A critical emergent property: reads create replicas. When Bob fetches Alice's events from a pact partner, Bob now holds a local copy. When Carol later requests Alice's events via gossip, Bob can respond — without being a formal pact partner. Serving capacity therefore scales as O(followers), not O(pact_partners). This is exactly how popular information propagates in human social networks: the more people read it, the more redundantly it's stored.
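The whole cascade, including the read-replica effect, can be sketched in a few lines. All names here are illustrative (the real protocol speaks in signed events over transports, not Python objects):

```python
def retrieve(event_id, local_store, cached_endpoints, wot_gossip, relay):
    """Four-tier retrieval: each tier fires only if the previous fails."""
    # Tier 1: held locally via a pact (or a past read). Zero network traffic.
    if event_id in local_store:
        return local_store[event_id]
    # Tier 2: direct connection to a cached pact-partner endpoint (~60 ms).
    for endpoint in cached_endpoints:
        data = endpoint.fetch(event_id)
        if data is not None:
            local_store[event_id] = data  # reads create replicas
            return data
    # Tier 3: blinded request over WoT gossip, TTL = 3.
    data = wot_gossip.request(event_id, ttl=3)
    if data is not None:
        local_store[event_id] = data      # this copy can now serve others
        return data
    # Tier 4: relay fallback. Last resort, 30 s timeout (~200 ms typical).
    return relay.fetch(event_id, timeout=30)
```

Note that Tiers 2 and 3 both write into `local_store`: the cache is what turns every reader into an informal replica for later requests.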


What This Doesn't Solve (Be Honest)

The protocol is analytically validated and simulated. It is not production-validated. Nostr has ~1,000 relays and real users. This doesn't.

Three open questions that matter:

Will 25% full nodes emerge organically? The data availability guarantee (P(all-pacts-offline) ≈ 10⁻⁹) assumes 25% always-on participants. Whether social incentives alone sustain this ratio is empirical. If it falls significantly, the math breaks.

Cold-start graph sparsity. New users have no WoT. The bootstrap phase requires relay dependence by design — pact formation is impossible without mutual follows. The transition is gradual, not instant. Early adopters in thin networks see less benefit.

Full node economics. Full nodes bear disproportionate storage and bandwidth costs. Social reciprocity may be sufficient incentive; it may not. Micropayments or premium services might be necessary. Unknown until production data exists.

The architectural claims are sound. The deployment parameters require production to validate.


The Relay Doesn't Die

A relay going offline no longer destroys the data it hosted — because that data exists across 20+ pact partners who have bilateral obligations to preserve it. But relays don't disappear. They earn a more honest role:

Discovery layer — finding people outside your trust graph. The WoT doesn't help you find people you don't know yet. Relays do.

Curation layer — editorial judgment, spam filtering, topic organization. This is legitimate value that relay operators can provide.

Performance accelerator — CDN behavior for content beyond the 2-hop WoT boundary.

What changes is that relay operators become curators rather than gatekeepers. They decide what to surface, not what exists.


Why This Matters Beyond the Protocol

The structural insight here is not about Nostr or decentralization specifically. It's that human social architecture contains solutions to distributed systems problems that we keep reinventing badly.

Dunbar's layers are a resource allocation protocol. Reciprocity is a Byzantine fault-tolerance mechanism. Triadic closure is distributed trust verification. Proportional redundancy is content-weighted replication. The epidemic threshold is a spam filter. These are not metaphors — they are functional isomorphisms.

We've spent decades building decentralized infrastructure that fails because we modeled data as content to be stored and served, rather than as a social object embedded in a trust network. The social graph isn't the user experience layer sitting on top of the infrastructure. It is the infrastructure. We just designed around it.


The Gozzip protocol whitepaper is at github.com/gozzip-protocol/gozzip.

The author is on Nostr: npub1gxdhmu9swqduwhr6zptjy4ya693zp3ql28nemy4hd97kuufyrqdqwe5zfk


Written by leonacosta | courage is our own fountain of wisdom 🌱
Published by HackerNoon on 2026/03/18