Storage lifetime

Storage objects should be deleted only when the system can prove that nothing still references them.

Cotton does not treat a missing folder path or a loose backend object as enough evidence to delete data. Stored objects stay protected while live references exist, reclaim is delayed and rechecked, and ingest coordinates with garbage collection so delete and re-upload do not fight each other.

Live references · Cautious GC · Orphan detection · Ingest coordination · Reclaim safety

Database as source of truth

Cotton treats the database as the authority for whether a storage object is alive. A backend object by itself is not a durable product reference; the product needs explicit metadata that says why that object must survive.

  • File manifests and manifest chunks keep visible file content alive.
  • Snapshots, versions, shares, and trash can retain references after the current layout changes.
  • The cleanup job can reason about live content without trusting raw backend listing alone.
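
A minimal sketch of that idea, assuming a reference table keyed by chunk key. The function name, the key format, and the reason labels are hypothetical and are not taken from Cotton's code. It only shows that the backend listing is an input to classify, while the database decides what counts as live.

```python
from typing import Dict, Iterable, List, Set, Tuple


def classify_backend_listing(
    backend_keys: Iterable[str],
    live_refs: Dict[str, Set[str]],
) -> Tuple[List[str], List[str]]:
    """Split raw backend keys into (live, unreferenced).

    A key is live only if the reference table gives at least one reason
    to keep it. Being present on the backend is never enough.
    """
    live: List[str] = []
    unreferenced: List[str] = []
    for key in backend_keys:
        if live_refs.get(key):
            live.append(key)
        else:
            unreferenced.append(key)
    return live, unreferenced


# Hypothetical reference table: chunk key -> reasons recorded in the database.
live_refs = {
    "chunk-a": {"manifest"},
    "chunk-b": {"snapshot", "trash"},
    "chunk-d": set(),  # a row exists, but no reason to keep it remains
}

live, unreferenced = classify_backend_listing(
    ["chunk-a", "chunk-b", "chunk-c", "chunk-d"], live_refs
)
```

In this sketch, unreferenced keys are only candidates for scheduling. Nothing is deleted at classification time.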

Live references are bigger than files

A file cloud stores more than uploaded files. Preview objects, user avatars, protected database backup artifacts, and bootstrap artifacts such as the encrypted master-key sentinel also need retention paths.

  • Previews and avatars are storage-backed product surfaces.
  • Latest backup pointer, backup manifest, and backup chunks can protect restore artifacts.
  • The encrypted master-key sentinel must survive because it protects unlock continuity.

Not every row retains content

ChunkOwnership helps ingestion and concurrency, but it is not a durable retention reference. That distinction matters because an upload coordination guard should not accidentally keep physical bytes alive forever.
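
The distinction can be sketched as a retention check that ignores coordination rows. The row kinds below are illustrative assumptions; only ChunkOwnership is named in the text, and the real schema may look different.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumed kinds of rows that durably retain content.
# "chunk_ownership" is deliberately absent: it coordinates ingest only.
RETAINING_KINDS = frozenset(
    {"manifest_chunk", "snapshot", "version", "share", "trash"}
)


@dataclass(frozen=True)
class Row:
    chunk_key: str
    kind: str


def is_retained(chunk_key: str, rows: Iterable[Row]) -> bool:
    """True only if a durable row references the chunk."""
    return any(
        row.chunk_key == chunk_key and row.kind in RETAINING_KINDS
        for row in rows
    )


rows = [
    Row("chunk-a", "chunk_ownership"),  # upload guard only
    Row("chunk-b", "chunk_ownership"),
    Row("chunk-b", "manifest_chunk"),   # durable reference
]
```

Here `chunk-a` is not retained even though an ownership row exists, while `chunk-b` is retained by its manifest row.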

Orphans are scheduled, not instantly destroyed

Cotton can register raw backend objects as orphan chunk rows so garbage collection can schedule them. Reclaim is delayed and references are checked again before deletion, so objects that become live again are left alone.
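
A rough sketch of delayed reclaim with a final recheck. The class name, the callbacks, and the one-hour window are assumptions for illustration; the source does not state the actual retention window or job structure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class OrphanScheduler:
    retention_seconds: float
    is_live: Callable[[str], bool]   # reference check against the database
    delete: Callable[[str], None]    # physical delete on the backend
    scheduled: Dict[str, float] = field(default_factory=dict)  # key -> due time

    def register_orphan(self, key: str, now: float) -> None:
        # Keep the earliest due time if the same orphan is seen again.
        self.scheduled.setdefault(key, now + self.retention_seconds)

    def run(self, now: float) -> List[str]:
        reclaimed: List[str] = []
        for key, due in list(self.scheduled.items()):
            if self.is_live(key):
                # The object became live again: cancel the schedule.
                del self.scheduled[key]
                continue
            if now < due:
                continue  # still inside the retention window
            self.delete(key)
            del self.scheduled[key]
            reclaimed.append(key)
        return reclaimed


live_keys = set()
deleted = []
scheduler = OrphanScheduler(
    retention_seconds=3600,
    is_live=lambda key: key in live_keys,
    delete=deleted.append,
)
scheduler.register_orphan("chunk-x", now=0)
scheduler.register_orphan("chunk-y", now=0)

first_pass = scheduler.run(now=10)     # too early, nothing is reclaimed
live_keys.add("chunk-y")               # chunk-y gains a reference again
second_pass = scheduler.run(now=4000)  # only chunk-x is reclaimed
```

The liveness check runs on every pass, including the one that would delete, so an object that regains a reference at any point before deletion drops out of the schedule.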

Ingest and GC cooperate

If a chunk is being deleted, ingest does not blindly race a re-upload of the same chunk. The safer behavior is to hold or refuse the conflicting write until the delete completes, then reconcile from a known state.
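
One way to picture the refuse-then-reconcile behavior is a small state table. This is a single-threaded sketch with hypothetical names (`ChunkTable`, `ChunkBusy`, the two states); a real implementation would need the state change and the write to happen under a database transaction or lock.

```python
from enum import Enum
from typing import Dict


class ChunkState(Enum):
    PRESENT = "present"
    DELETING = "deleting"


class ChunkBusy(Exception):
    """Raised when ingest meets a chunk that GC is currently deleting."""


class ChunkTable:
    def __init__(self) -> None:
        self.states: Dict[str, ChunkState] = {}
        self.blobs: Dict[str, bytes] = {}

    def begin_delete(self, key: str) -> None:
        if self.states.get(key) is ChunkState.PRESENT:
            self.states[key] = ChunkState.DELETING

    def finish_delete(self, key: str) -> None:
        if self.states.get(key) is ChunkState.DELETING:
            self.blobs.pop(key, None)
            del self.states[key]

    def ingest(self, key: str, data: bytes) -> bool:
        """Return True if bytes were written, False if deduplicated."""
        state = self.states.get(key)
        if state is ChunkState.DELETING:
            # Refuse instead of racing the delete; the caller retries later.
            raise ChunkBusy(key)
        if state is ChunkState.PRESENT:
            return False
        self.blobs[key] = data
        self.states[key] = ChunkState.PRESENT
        return True


table = ChunkTable()
table.ingest("chunk-a", b"hello")
table.begin_delete("chunk-a")

try:
    table.ingest("chunk-a", b"hello")
    refused = False
except ChunkBusy:
    refused = True

table.finish_delete("chunk-a")
rewritten = table.ingest("chunk-a", b"hello")  # starts from a known state
```

Refusing is shown here because it is the simpler of the two options the text mentions; holding the write until the delete completes would follow the same state check.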

Feature registration rule

New storage-backed features need to write through the normal chunk and manifest flow, or add an explicit protection path that the chunk usage service understands. If GC cannot see the reference, the object is eligible for reclaim after the retention window.
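
As an illustration of an explicit protection path, the sketch below lets each feature register a provider that reports the keys it needs kept. The class and method names are hypothetical and do not describe the real chunk usage service API.

```python
from typing import Callable, Dict, Iterable, List, Set

# A provider returns the storage keys its feature needs to keep.
ProtectionProvider = Callable[[], Iterable[str]]


class ChunkUsageService:
    def __init__(self) -> None:
        self._providers: Dict[str, ProtectionProvider] = {}

    def register(self, feature: str, provider: ProtectionProvider) -> None:
        if feature in self._providers:
            raise ValueError(f"feature already registered: {feature}")
        self._providers[feature] = provider

    def protected_keys(self) -> Set[str]:
        keys: Set[str] = set()
        for provider in self._providers.values():
            keys.update(provider())
        return keys

    def unprotected(self, candidates: Iterable[str]) -> List[str]:
        """Candidates no registered feature claims; GC may schedule these."""
        protected = self.protected_keys()
        return [key for key in candidates if key not in protected]


usage = ChunkUsageService()
usage.register("avatars", lambda: ["avatar-1"])
usage.register("backups", lambda: ["backup-manifest", "backup-chunk-1"])

# "thumb-9" belongs to a feature that never registered a provider.
unprotected = usage.unprotected(["avatar-1", "backup-chunk-1", "thumb-9"])
```

The unregistered object is the only one left unprotected, which is the failure mode the rule is meant to prevent.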

Lifetime proof

The lifetime contract is concrete: live file chunks, preview objects, avatars, backup artifacts, the encrypted sentinel, orphan scheduling, retention windows, and final rechecks all feed the reclaim decision.

Reclaim should be boring

The storage win is not dramatic deletion. It is predictable deletion. Cotton is built so storage cleanup behaves like an operating discipline, not a dangerous background guess.

The cost of being careful

Cautious reclaim means more metadata discipline and delayed deletion. That is the right tradeoff for a file cloud where snapshots, previews, shares, backup artifacts, and integrity checks all depend on storage references staying honest.

Reclaim proof

The hard part is not storing bytes. It is knowing when bytes are really dead.

A content-addressed cloud needs a real death certificate for stored objects. Cotton makes deletion depend on visible references, protected artifacts, retention delay, and a final live check.

Storage lifetime contract diagram: live references, protected artifacts, orphan scheduling, retention delay, and final recheck decide whether bytes can die.

  01 write through chunk flow
  02 register a live reference
  03 schedule orphan cleanup
  04 wait through retention
  05 recheck before delete
  06 cancel if live again

Visible content

File manifests and manifest chunks keep user-visible bytes alive.

Recovery surfaces

Snapshots, versions, trash, and shares can retain content after layout changes.

Product artifacts

Previews, avatars, backup artifacts, and the unlock sentinel are storage-backed too.

Final recheck

Scheduled reclaim is cancelled when an object becomes live again before deletion.

GC boundary

ChunkOwnership coordinates ingest. It does not keep chunks alive forever.

The distinction is intentional: concurrency guards prevent races, while durable references decide retention. That keeps cleanup from becoming either reckless or permanently stuck.

  • Raw backend objects are not automatically trusted as live.
  • Orphan objects can be registered and scheduled for cleanup.
  • GC rechecks references before physical deletion.
  • Ingest coordinates with chunks that are being deleted.
  • New storage features need an explicit reference path.
  • Backup artifacts and sentinels are protected storage objects.

FAQ

Direct answers

Why not delete unused chunks immediately?

Because "unused" can be a temporary state while uploads, snapshots, versions, shares, previews, backup artifacts, and cleanup jobs are racing. Cotton schedules reclaim, waits through the retention window, and rechecks references before deleting.

Is a raw object in storage considered live?

No. The backend object is only bytes. Cotton needs a live reference in the database or an explicit protected artifact path before the object is treated as retained product data.

What happens if an orphan becomes live again?

Garbage collection clears the reclaim schedule when the object becomes live again before deletion. The final reference check is what keeps cleanup from fighting recovery or re-upload.

Does this replace storage consistency checks?

No. The lifetime contract defines what is allowed to be deleted. Storage consistency and integrity jobs still need to surface missing, corrupt, or unexpected backend objects.