Components
Ref: https://gear.hermygong.com/p/seaweeds/
Blob storage
Other Blobstore operations
Write
Read
File Storage
Filer Architecture
Ref: https://www.a-programmer.top/2021/06/19/SeaweedFS%E5%88%9D%E6%8E%A2/
Filer Store Data Model
Volume Server
Volume
In SeaweedFS, a volume is one large file that packs many small files. When a master server starts, it sets the maximum volume file size, 30GB by default (see: -volumeSizeLimitMB). On the volume server side, the -max flag (default 8) caps how many volumes that server can hold.
Each volume has its own TTL and replication.
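Each small file inside a volume is addressed by a file id such as 3,01637037d6: the volume id, then a hex needle key followed by a fixed 4-byte cookie (the example fid is from the SeaweedFS README). A minimal parsing sketch, assuming that layout; parseFid is our own helper, not the weed source:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFid splits a SeaweedFS file id such as "3,01637037d6" into its
// volume id, needle key, and 32-bit cookie. The hex part after the comma
// is the key bytes followed by a fixed 8-hex-digit cookie.
func parseFid(fid string) (volumeID uint32, key uint64, cookie uint32, err error) {
	parts := strings.SplitN(fid, ",", 2)
	if len(parts) != 2 || len(parts[1]) <= 8 {
		return 0, 0, 0, fmt.Errorf("malformed fid: %q", fid)
	}
	vid, err := strconv.ParseUint(parts[0], 10, 32)
	if err != nil {
		return 0, 0, 0, err
	}
	hexPart := parts[1]
	keyHex, cookieHex := hexPart[:len(hexPart)-8], hexPart[len(hexPart)-8:]
	k, err := strconv.ParseUint(keyHex, 16, 64)
	if err != nil {
		return 0, 0, 0, err
	}
	c, err := strconv.ParseUint(cookieHex, 16, 32)
	if err != nil {
		return 0, 0, 0, err
	}
	return uint32(vid), k, uint32(c), nil
}

func main() {
	vid, key, cookie, _ := parseFid("3,01637037d6")
	fmt.Printf("volume=%d key=%d cookie=%x\n", vid, key, cookie)
	// prints: volume=3 key=1 cookie=637037d6
}
```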
Ref: https://github.com/seaweedfs/seaweedfs/wiki/Components
Volume Files Structure
Ref: https://github.com/seaweedfs/seaweedfs/wiki/Volume-Files-Structure
UI
Architecture
Design Philosophy
High Availability
How Raft Is Used in the Master
- Leader Election: Multiple master servers form a Raft cluster to elect a leader. Only the leader can assign new volume IDs.
- Volume ID Assignment: When a new volume needs to be created, the leader:
- Gets the current max volume ID
- Increments it
- Replicates this new max via Raft to ensure all masters agree
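The increment-then-replicate scheme above can be sketched as an in-memory toy. All names are ours, and the Raft AppendEntries round is collapsed into a direct apply call:

```go
package main

import (
	"fmt"
	"sync"
)

// master holds the one piece of Raft-replicated state we care about here:
// the current maximum volume id.
type master struct {
	mu    sync.Mutex
	maxID uint32
}

// apply is a greatly simplified stand-in for applying a Raft log entry.
func (m *master) apply(id uint32) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if id > m.maxID {
		m.maxID = id
	}
}

// nextVolumeID runs on the leader: bump the counter, then replicate the
// new maximum to every peer before handing the id to the caller, so all
// masters agree on the max even if the leader crashes afterwards.
func nextVolumeID(leader *master, followers []*master) uint32 {
	leader.mu.Lock()
	leader.maxID++
	id := leader.maxID
	leader.mu.Unlock()
	for _, f := range followers {
		f.apply(id) // stands in for a Raft replication round
	}
	return id
}

func main() {
	leader := &master{}
	followers := []*master{{}, {}}
	for i := 0; i < 3; i++ {
		fmt.Println(nextVolumeID(leader, followers))
	}
	fmt.Println(followers[0].maxID) // followers agree with the leader: 3
}
```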
The master also manages all of these:
- Assign file ID: leader only (proxied if needed)
- Volume creation: leader only
- NextVolumeId (Raft write): leader only + barrier
- Volume lookup: leader uses local topology; non-leaders can answer too
- Client connections: any master, but redirected to the leader
Replication
Erasure Coding
- SeaweedFS uses Reed-Solomon erasure coding with a default 10+4 scheme (10 data shards + 4 parity shards = 14 total shards).
- This allows you to lose up to 4 volume servers and still recover your data.
- Only volumes with a fullness ratio of 80% or higher are erasure coded (the threshold is configurable)
/*
Steps to apply erasure coding to .dat .idx files
0. ensure the volume is readonly
1. client call VolumeEcShardsGenerate to generate the .ecx and .ec00 ~ .ec13 files
2. client ask master for possible servers to hold the ec files
3. client call VolumeEcShardsCopy on above target servers to copy ec files from the source server
4. target servers report the new ec files to the master
5. master stores vid -> [14]*DataNode
6. client checks master. If all 14 slices are ready, delete the original .dat, .idx files
*/
Ref: seaweedfs/weed/server/volume_grpc_erasure_coding.go
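Steps 5 and 6 above can be sketched as a completeness check over the vid -> [14]*DataNode mapping. Types and names here are simplified stand-ins, not the weed source:

```go
package main

import "fmt"

// DataNode is a simplified stand-in for the master's per-server record.
type DataNode struct{ addr string }

const totalShards = 14 // 10 data + 4 parity

// allShardsReady reports whether every one of the 14 ec shard slots has a
// known holder; only then is it safe to delete the original .dat/.idx files.
func allShardsReady(locations [totalShards]*DataNode) bool {
	for _, dn := range locations {
		if dn == nil {
			return false
		}
	}
	return true
}

func main() {
	var locs [totalShards]*DataNode
	for i := 0; i < totalShards-1; i++ {
		locs[i] = &DataNode{addr: fmt.Sprintf("vol-server-%d:8080", i%5)}
	}
	fmt.Println(allShardsReady(locs)) // prints false: one shard still missing
	locs[totalShards-1] = &DataNode{addr: "vol-server-4:8080"}
	fmt.Println(allShardsReady(locs)) // prints true
}
```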
S3 changes
Ref: SeaweedFS S3 API in 2025: Enterprise‑grade security and control - Chris Lu, SeaweedFS KubeCon
This change lets the S3 data path skip the filer: https://github.com/seaweedfs/seaweedfs/pull/7481
- Check weed/s3api/s3api_object_handlers_put.go: previously the S3 API proxied uploads through the filer (proxyReq, err := http.NewRequest(http.MethodPut, uploadUrl, body)); now it talks to the volume server directly.
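A minimal sketch of that before/after difference. The helper names, URLs, and fid are hypothetical illustrations; only the http.NewRequest call pattern comes from the actual file:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// putViaFiler: the old path, where every PUT body was proxied through the
// filer's HTTP endpoint.
func putViaFiler(filerURL, path string, body []byte) (*http.Request, error) {
	return http.NewRequest(http.MethodPut, filerURL+path, bytes.NewReader(body))
}

// putDirect: the new path, where the S3 server asks the filer (via gRPC,
// not shown) only for a fid + volume URL, then sends the bytes straight
// to the volume server.
func putDirect(volumeURL, fid string, body []byte) (*http.Request, error) {
	return http.NewRequest(http.MethodPost, volumeURL+"/"+fid, bytes.NewReader(body))
}

func main() {
	r1, _ := putViaFiler("http://filer:8888", "/bucket/key", []byte("data"))
	r2, _ := putDirect("http://volume:8080", "3,01637037d6", []byte("data"))
	fmt.Println(r1.URL, r2.URL)
	// prints: http://filer:8888/bucket/key http://volume:8080/3,01637037d6
}
```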
Change
flowchart TB
subgraph OLD["Before v4.01"]
direction TB
C1[S3 Client] --> S1[S3 API Server]
S1 -->|"HTTP proxy<br/>ALL data + metadata"| F1[Filer]
F1 -->|"Read/Write data"| V1[Volume Server]
style F1 fill:#cc4444,color:#fff
style S1 fill:#4466aa,color:#fff
end
subgraph NEW["NEW Architecture (v4.01+)"]
direction TB
C2[S3 Client] --> S2[S3 API Server]
S2 -->|"gRPC<br/>metadata only"| F2[Filer]
S2 -->|"HTTP direct<br/>data streaming"| V2[Volume Server]
style F2 fill:#44aa66,color:#fff
style S2 fill:#4466aa,color:#fff
style V2 fill:#44aa66,color:#fff
end
Write Path
sequenceDiagram
participant Client as S3 Client
participant S3API as S3 API Server
participant Filer as Filer (gRPC)
participant Volume as Volume Server
Note over Client,Volume: PUT Object - Direct Volume Access
Client->>S3API: PUT /bucket/key (data)
rect rgba(144, 238, 144, 0.3)
Note right of S3API: Step 1: Get Volume Assignment
S3API->>Filer: AssignVolume (gRPC)
Filer-->>S3API: {volumeId, fileId, url, JWT}
end
rect rgba(173, 216, 230, 0.3)
Note right of S3API: Step 2: Upload Data DIRECTLY
loop For each 8MB chunk
S3API->>Volume: POST http://volume:8080/{fid} (chunk data + JWT)
Volume-->>S3API: {size, eTag, fid}
end
end
rect rgba(255, 182, 193, 0.3)
Note right of S3API: Step 3: Save Metadata Only
S3API->>Filer: CreateEntry (gRPC)
Note over Filer: Stores: chunks[], size,<br/>ETag, SSE metadata,<br/>user metadata, etc.
Filer-->>S3API: OK
end
S3API-->>Client: 200 OK + ETag
Read Path
sequenceDiagram
participant Client as S3 Client
participant S3API as S3 API Server
participant Filer as Filer (gRPC)
participant Volume as Volume Server
Note over Client,Volume: GET Object - Direct Volume Access
Client->>S3API: GET /bucket/key
rect rgba(255, 182, 193, 0.3)
Note right of S3API: Step 1: Fetch Metadata Only
S3API->>Filer: LookupDirectoryEntry (gRPC)
Filer-->>S3API: Entry{chunks[], size, attrs, extended}
end
rect rgba(144, 238, 144, 0.3)
Note right of S3API: Step 2: Resolve Volume URLs
Note over S3API: Uses FilerClient's<br/>cached vidMap<br/>(no gRPC per chunk!)
S3API->>S3API: lookupFileIdFn(volumeId)
end
rect rgba(173, 216, 230, 0.3)
Note right of S3API: Step 3: Stream Data DIRECTLY
S3API->>Volume: GET http://volume:8080/{fid} + JWT
Volume-->>S3API: chunk data (streaming)
S3API-->>Client: data (streaming passthrough)
end
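The cached-vidMap idea in step 2 can be sketched as a plain map with a stubbed fallback lookup. The real FilerClient cache is more involved (replicas, refresh), so this only shows the cache-miss accounting that avoids a gRPC call per chunk:

```go
package main

import "fmt"

// vidMap caches volume-id -> volume-server-URL resolutions; lookup is a
// stand-in for the gRPC filer/master round trip taken only on a miss.
type vidMap struct {
	cache  map[uint32]string
	lookup func(uint32) string
	misses int
}

func (m *vidMap) urlFor(vid uint32) string {
	if url, ok := m.cache[vid]; ok {
		return url // cache hit: no network round trip
	}
	m.misses++
	url := m.lookup(vid)
	m.cache[vid] = url
	return url
}

func main() {
	m := &vidMap{
		cache:  map[uint32]string{},
		lookup: func(vid uint32) string { return fmt.Sprintf("http://volume-%d:8080", vid%3) },
	}
	// Five chunk reads touching two distinct volumes.
	for _, vid := range []uint32{3, 3, 7, 3, 7} {
		_ = m.urlFor(vid)
	}
	fmt.Println(m.misses) // prints 2: one lookup per distinct volume
}
```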
How Is a Large File Written to S3?
sequenceDiagram
autonumber
participant C as Client
participant S3 as S3 API Server
participant F as Filer
participant M as Master
participant V1 as Volume Server 1
participant V2 as Volume Server 2
Note over V1,M: Periodically send heartbeats to Master
C->>S3: PUT /bucket/object (4GB stream)
Note over S3: Stream data, chunk in 8MB buffers<br/>(max 4 buffers = 32MB)
rect rgb(240, 248, 255)
Note over S3,V2: Repeat for each 8MB chunk (streaming)
S3->>F: AssignVolume (gRPC)
F->>M: Assign request
M-->>F: Return Fid + Volume URL
F-->>S3: Return Fid + Volume URL
S3->>V1: POST chunk data (HTTP)
V1->>V2: Replicate
Note over V1,V2: Strong consistency:<br/>response after replication completes
V2-->>V1: Ack
V1-->>S3: Return chunk size + ETag
end
Note over S3: All chunks uploaded,<br/>collect FileChunk metadata
S3->>F: CreateEntry (gRPC)<br/>(path, chunks[], attributes)
F-->>S3: Success
S3-->>C: 200 OK + ETag