Upload shape
The browser hashes chunks in a worker, uploads them in parallel, and after an interruption retries only the chunks that are still missing. Small and large files follow the same route, so the system needs neither a special, fragile path for big uploads nor a full-file memory buffer.
- Chunk identity is SHA-256 based.
- A file manifest assembles ordered chunks into visible content.
- Interrupted uploads can re-send only the missing pieces.
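A minimal sketch of the idea above, assuming fixed-size chunks and a server that can report which hashes it already holds. The function names and the tiny chunk size are illustrative and not taken from Cotton.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the payload into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def build_manifest(data: bytes, size: int = CHUNK_SIZE) -> list[str]:
    """Return the ordered SHA-256 hex digests that describe the file."""
    return [hashlib.sha256(chunk).hexdigest() for chunk in split_chunks(data, size)]


def missing_chunks(manifest: list[str], stored: set[str]) -> list[str]:
    """List each hash the server lacks once, in manifest order."""
    seen: set[str] = set()
    missing: list[str] = []
    for digest in manifest:
        if digest not in stored and digest not in seen:
            seen.add(digest)
            missing.append(digest)
    return missing
```

After an interrupted upload, the client would call `missing_chunks` with the set of hashes the server acknowledges and re-send only those pieces. Repeated chunks inside one file appear once in the result.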
Storage model
Cotton separates the layout graph from stored content. A visible file is a database-backed abstraction over manifests and chunks, not a single loose object sitting in a user folder. Folders, nodes, file versions, snapshots, manifests, and chunks each have clear jobs, which makes navigation, restore, deduplication, and cleanup predictable.
- Layouts describe where content appears.
- Manifests describe what content is.
- Chunks can be reused safely when multiple files reference the same bytes.
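The separation can be pictured with three small maps. This is a sketch under assumptions: the `Store` class, its field names, and the way a manifest id is derived are hypothetical, and the real system keeps this state in a database rather than in memory.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Store:
    chunks: dict[str, bytes] = field(default_factory=dict)         # chunk hash -> bytes
    manifests: dict[str, list[str]] = field(default_factory=dict)  # manifest id -> ordered chunk hashes
    layout: dict[str, str] = field(default_factory=dict)           # visible path -> manifest id

    def put_file(self, path: str, data: bytes, chunk_size: int = 4) -> None:
        hashes = []
        for i in range(0, len(data), chunk_size):
            piece = data[i:i + chunk_size]
            digest = hashlib.sha256(piece).hexdigest()
            self.chunks.setdefault(digest, piece)  # identical bytes are stored once
            hashes.append(digest)
        # Assumption: the manifest id is a hash over the ordered chunk hashes.
        manifest_id = hashlib.sha256("".join(hashes).encode()).hexdigest()
        self.manifests[manifest_id] = hashes
        self.layout[path] = manifest_id  # the layout only points at content

    def read_file(self, path: str) -> bytes:
        return b"".join(self.chunks[h] for h in self.manifests[self.layout[path]])
```

Writing the same bytes to two different paths adds two layout entries but no new chunks, which is the reuse the last bullet describes.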
Backend contract
The storage backend only needs a small contract: write data for a key and read it back. Filesystem storage can place chunk blobs under hash-derived segments, while S3-compatible storage can use the same logical keys. Listing and delete support add verification and cleanup capabilities, but the core model stays simple.
- The public chunk hash is the storage identity.
- Backend objects do not reveal the user's folder tree.
- The database remains the source of truth for live references.
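One way to picture the contract is a two-method interface plus a filesystem implementation. The names `Backend`, `FilesystemBackend`, and `key_segments`, and the two-character segment split, are assumptions made for this sketch. It also stores raw bytes for brevity, whereas the ingest path described in this page writes transformed data.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Protocol


class Backend(Protocol):
    """The minimal contract: write data for a key and read it back."""

    def write(self, key: str, data: bytes) -> None: ...
    def read(self, key: str) -> bytes: ...


def key_segments(key: str) -> tuple[str, str, str]:
    """Hypothetical layout: two short hash-derived directories, then the key."""
    return (key[:2], key[2:4], key)


class FilesystemBackend:
    def __init__(self, root: Path) -> None:
        self.root = root

    def write(self, key: str, data: bytes) -> None:
        path = self.root.joinpath(*key_segments(key))
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read(self, key: str) -> bytes:
        return self.root.joinpath(*key_segments(key)).read_bytes()


# Usage: the key is a chunk hash, so the path says nothing about folder names.
with tempfile.TemporaryDirectory() as tmp:
    backend: Backend = FilesystemBackend(Path(tmp))
    key = hashlib.sha256(b"chunk bytes").hexdigest()
    backend.write(key, b"chunk bytes")
    round_trip = backend.read(key)
    on_disk = Path(tmp).joinpath(*key_segments(key)).exists()
```

An S3-compatible implementation would satisfy the same `Backend` protocol and could use the same key string as the object name.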
Transform pipeline
Data is compressed before it is encrypted, because encrypted bytes no longer compress, and is then written to either filesystem or S3-backed storage. Compression, encryption, and backend persistence are not optional afterthoughts; they are the normal ingest path, built around streaming buffers instead of loading the full object.
- Inline Zstd keeps storage savings in the hot path.
- Streaming AES-GCM authenticates content per chunk.
- Filesystem and S3 backends share the same logical pipeline.
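The order of the stages matters, and a small experiment shows why. This sketch uses `zlib` from the standard library as a stand-in for Zstd, and random bytes as a stand-in for ciphertext, since well-encrypted output is statistically indistinguishable from random data. It does not perform real encryption.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Size after compression divided by size before (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive plaintext, as file contents often are.
plaintext = b"cotton chunk payload " * 4096

# Stand-in for what the same payload would look like after encryption.
ciphertext_like = os.urandom(len(plaintext))

plain_ratio = compressed_ratio(plaintext)          # a small fraction of the input
cipher_ratio = compressed_ratio(ciphertext_like)   # roughly 1.0, no savings
```

Compressing first keeps the storage savings; compressing after encryption would spend CPU time and save nothing.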
Serving model
Reads are assembled from chunks into a byte stream without rebuilding a whole file first. That is why downloads, previews, media seeking, range requests, share pages, and WebDAV can all ride the same storage design.
- Large media stays seekable.
- Preview generators can work against encrypted chunk streams.
- HTTP range responses do not require temporary full-file copies.
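Serving a byte range from chunks comes down to finding which chunks overlap the range and slicing only those. The sketch below assumes the manifest exposes each chunk's length and that a `fetch` callable returns one readable chunk by index. Both are hypothetical names for this example.

```python
from typing import Callable, Iterator


def read_range(
    lengths: list[int],
    fetch: Callable[[int], bytes],
    start: int,
    end: int,
) -> Iterator[bytes]:
    """Yield bytes start..end (inclusive), fetching only overlapping chunks."""
    offset = 0  # absolute position of the current chunk's first byte
    for index, length in enumerate(lengths):
        chunk_end = offset + length  # exclusive
        if offset > end:
            break  # everything from here on lies past the requested range
        if chunk_end > start:
            data = fetch(index)
            lo = max(start, offset) - offset
            hi = min(end + 1, chunk_end) - offset
            yield data[lo:hi]
        offset = chunk_end


# Usage with three in-memory chunks and a log of which ones were touched.
chunks = [b"abcd", b"efgh", b"ijkl"]
fetched: list[int] = []


def fetch_logged(index: int) -> bytes:
    fetched.append(index)
    return chunks[index]


middle = b"".join(read_range([len(c) for c in chunks], fetch_logged, 5, 6))
```

Because the function is a generator, a response can start streaming before later chunks are read, and a seek into large media touches only the chunks that cover the requested bytes.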
Recovery model
Snapshots record references rather than copying the whole tree. Restore can switch layout state without turning rollback into a giant background copy job, while garbage collection still has a clear retention contract.
- Snapshots are first-class layout operations.
- Versions and trash fit into the same lifecycle.
- Unreferenced content is rechecked before reclaim.
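A reference-based snapshot and a cautious reclaim step might look like the sketch below. The type aliases and function names are hypothetical, and the retention rule is the simplest possible one (keep anything a current layout or snapshot still references). The actual retention contract is not spelled out here.

```python
from typing import Callable

Layout = dict[str, str]            # visible path -> manifest id
Manifests = dict[str, list[str]]   # manifest id -> ordered chunk hashes


def take_snapshot(layout: Layout) -> Layout:
    """Copy the references only; no chunk bytes are duplicated."""
    return dict(layout)


def live_chunks(manifests: Manifests, layouts: list[Layout]) -> set[str]:
    """Every chunk hash reachable from any layout that must be retained."""
    live: set[str] = set()
    for layout in layouts:
        for manifest_id in layout.values():
            live.update(manifests[manifest_id])
    return live


def reclaim(chunks: dict[str, bytes], compute_live: Callable[[], set[str]]) -> set[str]:
    """Delete unreferenced chunks, rechecking liveness before removal."""
    candidates = set(chunks) - compute_live()
    confirmed = candidates - compute_live()  # second look before reclaim
    for digest in confirmed:
        del chunks[digest]
    return confirmed
```

Restoring is then a matter of making the snapshot's references the current layout again, for example `current = dict(snapshot)`, rather than copying stored bytes.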
Operational model
Background manifest hashing, storage consistency checks, preview work, token cleanup, and temp cleanup are part of the product path. Operators get explicit warnings instead of discovering silent storage drift later.
Storage model proof
The architecture page is not asking visitors to trust a diagram. The product proof is visible in the storage vocabulary and page surface: chunks, manifests, snapshots, versions, range reads, WebDAV, previews, and integrity checks all point back to the same model.
- SHA-256 chunk identity is the storage anchor.
- Manifests describe file bytes separately from folder layout.
- Snapshots and restore operate on references instead of copying every byte again.
Why the model sells
Cotton fits best when you want a file cloud whose internals explain the product instead of fighting it. The same architecture that makes large files practical also makes recovery, sharing, previews, and cleanup easier to reason about.
When a simpler stack wins
A chunk-first storage engine is more deliberate than mounting a folder and calling it a cloud. If you mainly need a broad app suite or direct filesystem semantics, a groupware stack or simple file server may fit better.