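This document explains how CSI works, covering the full path from CSI driver development down to the underlying Linux mount operations.
1. CSI Overview
1.1 What Is CSI
CSI (Container Storage Interface) is a standardized storage interface specification. It defines the communication protocol between container orchestration systems (such as Kubernetes) and storage systems.
1.2 Problems CSI Solves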
┌─────────────────────────────────────────────────────────────┐
│ Problems before CSI                                         │
├─────────────────────────────────────────────────────────────┤
│ • Storage plugin code lived in Kubernetes core (in-tree)    │
│ • Adding a storage backend meant changing Kubernetes source │
│ • Storage vendors were tied to the Kubernetes release cycle │
│ • Each orchestrator needed its own copy of every plugin     │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ The CSI solution                                            │
├─────────────────────────────────────────────────────────────┤
│ • Standard interface; plugins live out-of-tree              │
│ • Vendors build, release, and maintain drivers on their own │
│ • One driver can serve multiple container orchestrators     │
│ • Communication is decoupled through gRPC                   │
└─────────────────────────────────────────────────────────────┘
2. CSI Architecture
2.1 Overall Architecture Diagram
graph TB
subgraph "Kubernetes Control Plane"
API[API Server]
PVC[PVC Controller]
ADC[AttachDetach Controller]
end
subgraph "CSI Controller Plugin"
CP[Controller Plugin Pod]
EP[External Provisioner]
EA[External Attacher]
ES[External Snapshotter]
ER[External Resizer]
CD[CSI Driver<br/>Controller Service]
end
subgraph "Each Node"
NP[Node Plugin DaemonSet]
NR[Node Driver Registrar]
ND[CSI Driver<br/>Node Service]
KV[Kubelet<br/>VolumeManager]
end
subgraph "Storage Backend"
ST[(Storage System)]
end
API --> PVC
API --> ADC
PVC --> EP
ADC --> EA
EP --> CD
EA --> CD
ES --> CD
ER --> CD
CD --> ST
KV --> ND
NR --> KV
ND --> ST
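2.2 Components
| Component | Deployed as | Responsibility |
| --- | --- | --- |
| CSI Controller Plugin | Deployment | Handles control-plane volume operations such as create, delete, attach, and snapshot |
| CSI Node Plugin | DaemonSet | Mounts and unmounts volumes on each node |
| External Provisioner | Sidecar | Watches PVCs and calls CreateVolume/DeleteVolume |
| External Attacher | Sidecar | Watches VolumeAttachment objects and calls ControllerPublishVolume/ControllerUnpublishVolume |
| External Snapshotter | Sidecar | Handles snapshot operations |
| External Resizer | Sidecar | Handles volume expansion |
| Node Driver Registrar | Sidecar | Registers the CSI driver with the kubelet |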
3. The Three CSI Service Interfaces
The CSI specification defines three core gRPC services:
graph LR
subgraph "CSI Services"
IS[Identity Service]
CS[Controller Service]
NS[Node Service]
end
IS --- |"GetPluginInfo<br/>GetPluginCapabilities<br/>Probe"| ID[Driver identity]
CS --- |"CreateVolume<br/>DeleteVolume<br/>ControllerPublishVolume<br/>..."| CO[Control-plane operations]
NS --- |"NodeStageVolume<br/>NodePublishVolume<br/>..."| NO[Node operations]
3.1 Identity Service
service Identity {
  // Returns the driver name and version
  rpc GetPluginInfo(GetPluginInfoRequest) returns (GetPluginInfoResponse) {}
  // Returns the capabilities the driver supports
  rpc GetPluginCapabilities(GetPluginCapabilitiesRequest) returns (GetPluginCapabilitiesResponse) {}
  // Health check
  rpc Probe(ProbeRequest) returns (ProbeResponse) {}
}
3.2 Controller Service
service Controller {
  // Create a volume
  rpc CreateVolume(CreateVolumeRequest) returns (CreateVolumeResponse) {}
  // Delete a volume
  rpc DeleteVolume(DeleteVolumeRequest) returns (DeleteVolumeResponse) {}
  // Attach a volume to a node (control-plane operation)
  rpc ControllerPublishVolume(ControllerPublishVolumeRequest) returns (ControllerPublishVolumeResponse) {}
  // Detach a volume from a node
  rpc ControllerUnpublishVolume(ControllerUnpublishVolumeRequest) returns (ControllerUnpublishVolumeResponse) {}
  // Validate volume capabilities
  rpc ValidateVolumeCapabilities(ValidateVolumeCapabilitiesRequest) returns (ValidateVolumeCapabilitiesResponse) {}
  // List volumes
  rpc ListVolumes(ListVolumesRequest) returns (ListVolumesResponse) {}
  // Report available storage capacity
  rpc GetCapacity(GetCapacityRequest) returns (GetCapacityResponse) {}
  // Report controller capabilities
  rpc ControllerGetCapabilities(ControllerGetCapabilitiesRequest) returns (ControllerGetCapabilitiesResponse) {}
  // Create a snapshot
  rpc CreateSnapshot(CreateSnapshotRequest) returns (CreateSnapshotResponse) {}
  // Delete a snapshot
  rpc DeleteSnapshot(DeleteSnapshotRequest) returns (DeleteSnapshotResponse) {}
  // Expand a volume
  rpc ControllerExpandVolume(ControllerExpandVolumeRequest) returns (ControllerExpandVolumeResponse) {}
}
3.3 Node Service
service Node {
  // Mount the volume to a node-global staging directory (Stage)
  rpc NodeStageVolume(NodeStageVolumeRequest) returns (NodeStageVolumeResponse) {}
  // Unmount the volume from the staging directory
  rpc NodeUnstageVolume(NodeUnstageVolumeRequest) returns (NodeUnstageVolumeResponse) {}
  // Mount the volume into the Pod directory (Publish)
  rpc NodePublishVolume(NodePublishVolumeRequest) returns (NodePublishVolumeResponse) {}
  // Unmount the volume from the Pod directory
  rpc NodeUnpublishVolume(NodeUnpublishVolumeRequest) returns (NodeUnpublishVolumeResponse) {}
  // Report volume usage statistics
  rpc NodeGetVolumeStats(NodeGetVolumeStatsRequest) returns (NodeGetVolumeStatsResponse) {}
  // Expand the volume on the node (for example, grow the file system)
  rpc NodeExpandVolume(NodeExpandVolumeRequest) returns (NodeExpandVolumeResponse) {}
  // Report node capabilities
  rpc NodeGetCapabilities(NodeGetCapabilitiesRequest) returns (NodeGetCapabilitiesResponse) {}
  // Report node information
  rpc NodeGetInfo(NodeGetInfoRequest) returns (NodeGetInfoResponse) {}
}
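The Controller and Node RPCs are meant to be called in a fixed order for a given volume. The sketch below models that ordering as a small state machine using only the standard library. The state names (CREATED, NODE_READY, VOL_READY, PUBLISHED) follow the volume lifecycle figure in the CSI specification; the `State` type, the `transitions` table, and `Next` are illustrative names, not part of any CSI library.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a point in the volume lifecycle for a driver that supports
// both controller publish and node staging.
type State string

const (
	Created   State = "CREATED"
	NodeReady State = "NODE_READY"
	VolReady  State = "VOL_READY"
	Published State = "PUBLISHED"
)

// transitions maps (current state, RPC name) to the next state.
var transitions = map[State]map[string]State{
	Created:   {"ControllerPublishVolume": NodeReady},
	NodeReady: {"NodeStageVolume": VolReady, "ControllerUnpublishVolume": Created},
	VolReady:  {"NodePublishVolume": Published, "NodeUnstageVolume": NodeReady},
	Published: {"NodeUnpublishVolume": VolReady},
}

// ErrOutOfOrder is returned when an RPC is not valid in the current state.
var ErrOutOfOrder = errors.New("RPC not valid in current state")

// Next returns the state reached by calling rpc in state s.
func Next(s State, rpc string) (State, error) {
	if next, ok := transitions[s][rpc]; ok {
		return next, nil
	}
	return s, fmt.Errorf("%w: %s in %s", ErrOutOfOrder, rpc, s)
}

func main() {
	s := Created
	for _, rpc := range []string{"ControllerPublishVolume", "NodeStageVolume", "NodePublishVolume"} {
		var err error
		if s, err = Next(s, rpc); err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(rpc, "->", s)
	}
}
```

A driver that does not advertise staging or controller publish would skip the corresponding states; this table covers only the full path.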
4. CSI Driver Development
4.1 Driver Architecture
graph TB
subgraph "CSI Driver Code Structure"
Main[main.go]
IS[IdentityServer]
CS[ControllerServer]
NS[NodeServer]
Main --> IS
Main --> CS
Main --> NS
end
subgraph "gRPC Server"
GS[gRPC Server]
Socket[Unix Socket]
end
IS --> GS
CS --> GS
NS --> GS
GS --> Socket
4.2 Example Driver Code
// Abridged sketch: NewDriver, StorageClient, formatDevice, and the
// driverName/nodeID/endpoint variables are placeholders defined elsewhere.
package main

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
	mount "k8s.io/mount-utils"
)

// Driver entry point
func main() {
	// Create the driver instance
	driver := NewDriver(
		driverName, // e.g. "csi.example.com"
		nodeID,     // node identifier
		endpoint,   // Unix socket path
	)
	// Start the gRPC server
	driver.Run()
}

// Identity Server implementation
type IdentityServer struct {
	name    string
	version string
}

func (is *IdentityServer) GetPluginInfo(ctx context.Context, req *csi.GetPluginInfoRequest) (*csi.GetPluginInfoResponse, error) {
	return &csi.GetPluginInfoResponse{
		Name:          is.name,
		VendorVersion: is.version,
	}, nil
}

func (is *IdentityServer) GetPluginCapabilities(ctx context.Context, req *csi.GetPluginCapabilitiesRequest) (*csi.GetPluginCapabilitiesResponse, error) {
	return &csi.GetPluginCapabilitiesResponse{
		Capabilities: []*csi.PluginCapability{
			{
				Type: &csi.PluginCapability_Service_{
					Service: &csi.PluginCapability_Service{
						Type: csi.PluginCapability_Service_CONTROLLER_SERVICE,
					},
				},
			},
		},
	}, nil
}

// Controller Server implementation
type ControllerServer struct {
	// Client for the storage backend
	client StorageClient
}

func (cs *ControllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error) {
	// 1. Validate request parameters
	if req.GetName() == "" {
		return nil, status.Error(codes.InvalidArgument, "Volume name required")
	}
	// 2. Ask the storage backend to create the volume.
	//    The call should be idempotent: a repeated request with the same name
	//    should return the existing volume.
	vol, err := cs.client.CreateVolume(req.GetName(), req.GetCapacityRange().GetRequiredBytes())
	if err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	// 3. Return the volume information
	return &csi.CreateVolumeResponse{
		Volume: &csi.Volume{
			VolumeId:      vol.ID,
			CapacityBytes: vol.Size,
			VolumeContext: map[string]string{
				"storage-pool": vol.Pool,
			},
		},
	}, nil
}

// Node Server implementation
type NodeServer struct {
	nodeID  string
	mounter mount.Interface
}

func (ns *NodeServer) NodeStageVolume(ctx context.Context, req *csi.NodeStageVolumeRequest) (*csi.NodeStageVolumeResponse, error) {
	// 1. Resolve the device path ("devicePath" is a driver-specific key
	//    that ControllerPublishVolume placed in the publish context)
	devicePath := req.GetPublishContext()["devicePath"]
	stagingPath := req.GetStagingTargetPath()
	fsType := req.GetVolumeCapability().GetMount().GetFsType()
	// 2. Format the device if needed; formatDevice must be a no-op when the
	//    device already carries a file system, otherwise data would be lost
	if err := ns.formatDevice(devicePath, fsType); err != nil {
		return nil, status.Errorf(codes.Internal, "format %s: %v", devicePath, err)
	}
	// 3. Mount onto the staging directory
	if err := ns.mounter.Mount(devicePath, stagingPath, fsType, nil); err != nil {
		return nil, status.Errorf(codes.Internal, "mount %s: %v", devicePath, err)
	}
	return &csi.NodeStageVolumeResponse{}, nil
}

func (ns *NodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
	stagingPath := req.GetStagingTargetPath()
	targetPath := req.GetTargetPath()
	// Bind mount the staging directory into the Pod directory
	mountOptions := []string{"bind"}
	if req.GetReadonly() {
		mountOptions = append(mountOptions, "ro")
	}
	if err := ns.mounter.Mount(stagingPath, targetPath, "", mountOptions); err != nil {
		return nil, status.Errorf(codes.Internal, "bind mount %s: %v", targetPath, err)
	}
	return &csi.NodePublishVolumeResponse{}, nil
}
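The driver receives its endpoint as a URL-style string such as `unix:///csi/csi.sock`, which has to be split into the network and address that `net.Listen` expects before the gRPC server can start. A minimal sketch of that parsing step, using only the standard library; `parseEndpoint` is a hypothetical helper name, and accepting a `tcp://` form is an assumption that is mainly useful for local testing.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseEndpoint splits a CSI endpoint such as "unix:///csi/csi.sock" into
// the network and address arguments expected by net.Listen.
func parseEndpoint(endpoint string) (network, address string, err error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", "", fmt.Errorf("invalid endpoint %q: %w", endpoint, err)
	}
	switch u.Scheme {
	case "unix":
		// unix:///csi/csi.sock parses to an empty host and path "/csi/csi.sock".
		if u.Path == "" {
			return "", "", fmt.Errorf("endpoint %q has no socket path", endpoint)
		}
		return "unix", u.Path, nil
	case "tcp":
		if u.Host == "" {
			return "", "", fmt.Errorf("endpoint %q has no host:port", endpoint)
		}
		return "tcp", u.Host, nil
	default:
		return "", "", fmt.Errorf("unsupported scheme %q in endpoint %q", u.Scheme, endpoint)
	}
}

func main() {
	for _, ep := range []string{"unix:///csi/csi.sock", "tcp://127.0.0.1:10000", "/csi/csi.sock"} {
		network, address, err := parseEndpoint(ep)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> network=%s address=%s\n", ep, network, address)
	}
}
```

A real driver would typically remove a stale socket file at the address before listening, then hand the listener to its gRPC server.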
4.3 Sidecar Container Deployment
# CSI Controller Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: csi-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app: csi-controller
  template:
    metadata:
      labels:
        app: csi-controller
    spec:
      containers:
        # CSI driver container
        - name: csi-driver
          image: example/csi-driver:v1.0
          args:
            - "--endpoint=unix:///csi/csi.sock"
            - "--nodeid=$(NODE_ID)"
          env:
            - name: NODE_ID
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
        # External Provisioner sidecar
        - name: csi-provisioner
          image: registry.k8s.io/sig-storage/csi-provisioner:v5.0.1
          args:
            - "--csi-address=/csi/csi.sock"
            - "--feature-gates=Topology=true"
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
        # External Attacher sidecar
        - name: csi-attacher
          image: registry.k8s.io/sig-storage/csi-attacher:v4.6.1
          args:
            - "--csi-address=/csi/csi.sock"
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
      volumes:
        - name: socket-dir
          emptyDir: {}
---
# CSI Node DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: csi-node
spec:
  selector:
    matchLabels:
      app: csi-node
  template:
    metadata:
      labels:
        app: csi-node
    spec:
      containers:
        # CSI driver container
        - name: csi-driver
          image: example/csi-driver:v1.0
          args:
            - "--endpoint=unix:///csi/csi.sock"
            - "--nodeid=$(NODE_ID)"
          env:
            - name: NODE_ID
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          securityContext:
            privileged: true
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
            # Staging (globalmount) directories live under the kubelet plugins directory
            - name: plugins-dir
              mountPath: /var/lib/kubelet/plugins
              mountPropagation: Bidirectional
            - name: pods-mount-dir
              mountPath: /var/lib/kubelet/pods
              mountPropagation: Bidirectional
            - name: device-dir
              mountPath: /dev
        # Node Driver Registrar sidecar
        - name: node-driver-registrar
          image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.13.0
          args:
            - "--csi-address=/csi/csi.sock"
            - "--kubelet-registration-path=/var/lib/kubelet/plugins/csi.example.com/csi.sock"
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
      volumes:
        - name: socket-dir
          hostPath:
            path: /var/lib/kubelet/plugins/csi.example.com
            type: DirectoryOrCreate
        - name: registration-dir
          hostPath:
            path: /var/lib/kubelet/plugins_registry
            type: Directory
        - name: plugins-dir
          hostPath:
            path: /var/lib/kubelet/plugins
            type: Directory
        - name: pods-mount-dir
          hostPath:
            path: /var/lib/kubelet/pods
            type: Directory
        - name: device-dir
          hostPath:
            path: /dev
            type: Directory
5. Kubernetes CSI Integration
5.1 CSI Driver Registration Flow
sequenceDiagram
participant NR as Node Driver Registrar
participant PW as Kubelet Plugin Watcher
participant CSI as CSI Plugin
participant Driver as CSI Driver
participant API as API Server
NR->>PW: 1. Create the registration socket
PW->>CSI: 2. ValidatePlugin(name, endpoint, versions)
CSI->>Driver: 3. GetPluginInfo()
Driver-->>CSI: 4. Return driver info
CSI-->>PW: 5. Validation passed
PW->>CSI: 6. RegisterPlugin(name, endpoint, versions)
CSI->>Driver: 7. NodeGetInfo()
Driver-->>CSI: 8. Return node info (nodeID, maxVolumes, topology)
CSI->>API: 9. Create/update the CSINode object
API-->>CSI: 10. CSINode updated
CSI-->>PW: 11. Registration complete
Key code path (pkg/volume/csi/csi_plugin.go), abridged; variables such as ctx, nim, and highestSupportedVersion are set up in the omitted code:
// RegistrationHandler handles CSI driver registration
type RegistrationHandler struct {
	csiPlugin *csiPlugin
}

// RegisterPlugin is called when a driver registers
func (h *RegistrationHandler) RegisterPlugin(pluginName string, endpoint string, versions []string, pluginClientTimeout *time.Duration) error {
	// 1. Store the driver information
	csiDrivers.Set(pluginName, Driver{
		endpoint:                endpoint,
		highestSupportedVersion: highestSupportedVersion,
	})
	// 2. Fetch node information from the driver
	csi, err := newCsiDriverClient(csiDriverName(pluginName))
	if err != nil {
		return err
	}
	nodeID, maxVolumePerNode, accessibleTopology, err := csi.NodeGetInfo(ctx)
	if err != nil {
		return err
	}
	// 3. Update the CSINode object
	if err := nim.InstallCSIDriver(pluginName, nodeID, maxVolumePerNode, accessibleTopology); err != nil {
		return err
	}
	return nil
}
5.2 Core Controllers
graph TB
subgraph "Control Plane Controllers"
PVC_C[PV/PVC Controller<br/>pv_controller.go]
AD_C[AttachDetach Controller<br/>attach_detach_controller.go]
end
subgraph "Kubelet"
VM[VolumeManager<br/>volume_manager.go]
Pop[Populator<br/>desired_state_of_world_populator.go]
Rec[Reconciler<br/>reconciler.go]
end
PVC_C -->|Binds PV and PVC| API[API Server]
AD_C -->|Creates VolumeAttachment| API
API --> VM
VM --> Pop
Pop --> Rec
Rec -->|Calls CSI| CSI[CSI Driver]
5.3 Key API Resources
CSIDriver, CSINode, and VolumeAttachment are built-in storage.k8s.io/v1 API types rather than CRDs.
# CSIDriver: describes the capabilities of a CSI driver
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: csi.example.com
spec:
  attachRequired: true      # whether an Attach step is required
  podInfoOnMount: true      # whether Pod information is passed on mount
  fsGroupPolicy: File       # FSGroup policy
  volumeLifecycleModes:
    - Persistent            # supports persistent volumes
    - Ephemeral             # supports ephemeral volumes
  tokenRequests:            # service account token requests
    - audience: "api"
  requiresRepublish: false  # whether periodic republish is required
  seLinuxMount: false       # SELinux mount support
---
# CSINode: describes the CSI drivers installed on a node
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  name: node-1
spec:
  drivers:
    - name: csi.example.com
      nodeID: node-1-id
      topologyKeys:
        - topology.kubernetes.io/zone
      allocatable:
        count: 100          # maximum number of volumes
---
# VolumeAttachment: represents the attachment state of a volume
apiVersion: storage.k8s.io/v1
kind: VolumeAttachment
metadata:
  name: csi-xxx
spec:
  attacher: csi.example.com
  nodeName: node-1
  source:
    persistentVolumeName: pv-xxx
status:
  attached: true
  attachmentMetadata:
    devicePath: /dev/sdb
6. The Complete Mount Flow
6.1 End-to-End Flow Diagram
sequenceDiagram
autonumber
participant User
participant API as API Server
participant PVC_C as PVC Controller
participant EP as External Provisioner
participant CSI_C as CSI Controller
participant Storage as Storage Backend
participant AD_C as AttachDetach Controller
participant EA as External Attacher
participant KV as Kubelet VolumeManager
participant CSI_N as CSI Node
participant Linux as Linux Kernel
rect rgb(200, 230, 200)
Note over User,Storage: Phase 1: Dynamic Provisioning
User->>API: Create PVC
API->>PVC_C: PVC event
PVC_C->>API: Annotate PVC with the provisioner name
API->>EP: PVC event (unbound PVC)
EP->>CSI_C: CreateVolume()
CSI_C->>Storage: Create the storage volume
Storage-->>CSI_C: Return VolumeID
CSI_C-->>EP: CreateVolumeResponse
EP->>API: Create PV (claimRef points to the PVC)
PVC_C->>API: Bind PVC to PV
end
rect rgb(200, 200, 230)
Note over User,Linux: Phase 2: Volume Attachment
User->>API: Create Pod (references the PVC)
API->>AD_C: Pod scheduled to a node
AD_C->>API: Create VolumeAttachment
API->>EA: VolumeAttachment event
EA->>CSI_C: ControllerPublishVolume()
CSI_C->>Storage: Attach the volume to the node
Storage-->>CSI_C: Return devicePath
CSI_C-->>EA: PublishContext {devicePath}
EA->>API: Update VolumeAttachment.status
end
rect rgb(230, 200, 200)
Note over User,Linux: Phase 3: Node Mount
KV->>API: Watch Pods
KV->>KV: Detect volumes that need mounting
KV->>CSI_N: NodeStageVolume()
CSI_N->>Linux: Format the device (mkfs) if needed
CSI_N->>Linux: Mount to the staging directory
CSI_N-->>KV: NodeStageVolumeResponse
KV->>CSI_N: NodePublishVolume()
CSI_N->>Linux: Bind mount into the Pod directory
CSI_N-->>KV: NodePublishVolumeResponse
KV->>API: Update Pod status
end
6.2 Phase-by-Phase Walkthrough
Phase 1: Dynamic Provisioning
flowchart TB
subgraph "1. PVC Creation"
A[User creates PVC] --> B[API Server stores it]
B --> C[PV/PVC Controller observes it]
end
subgraph "2. StorageClass Lookup"
C --> D{PVC specifies a<br/>StorageClass?}
D -->|Yes| E[Fetch the specified StorageClass]
D -->|No| F[Use the default StorageClass]
E --> G[Determine the provisioner]
F --> G
end
subgraph "3. External Provisioner"
G --> H[Provisioner watches the PVC]
H --> I[Calls the CreateVolume RPC]
I --> J[CSI driver creates the volume]
J --> K[Returns volume information]
end
subgraph "4. PV Creation and Binding"
K --> L[Provisioner creates the PV object]
L --> M[Sets PV.Spec.CSI and claimRef]
M --> N[PV controller sets PVC.Spec.VolumeName = PV.Name]
N --> O[Binding complete]
end
Key code (pkg/controller/volume/persistentvolume/pv_controller.go), abridged:
func (ctrl *PersistentVolumeController) syncUnboundClaim(ctx context.Context, claim *v1.PersistentVolumeClaim) error {
	// 1. Look for an existing PV that matches the claim
	volume, err := ctrl.volumes.findBestMatchForClaim(claim, delayBinding)
	if err != nil {
		return err
	}
	if volume == nil {
		// 2. No matching PV: check whether dynamic provisioning applies
		if ctrl.shouldProvision(claim) {
			// The External Provisioner creates the PV; a later sync binds it
			return nil
		}
		// Nothing to bind yet; the claim stays Pending
		return nil
	}
	// 3. Bind the PV and the PVC
	return ctrl.bind(ctx, volume, claim)
}
Phase 2: Volume Attachment
flowchart TB
subgraph "1. Pod Scheduling"
A[Pod created] --> B[Scheduler schedules it]
B --> C[Pod assigned to a Node]
end
subgraph "2. AttachDetach Controller"
C --> D[AD Controller observes the Pod]
D --> E{Check CSIDriver<br/>attachRequired}
E -->|true| F[Create VolumeAttachment]
E -->|false| G[Skip Attach]
end
subgraph "3. External Attacher"
F --> H[Attacher watches the VA]
H --> I[Calls ControllerPublishVolume]
I --> J[Storage system attaches the volume]
J --> K[Returns publishContext]
end
subgraph "4. Status Update"
K --> L[Set VA.status.attached=true]
L --> M[Write publishContext<br/>e.g. devicePath=/dev/sdb]
M --> N[Attachment complete]
end
Key code (pkg/volume/csi/csi_attacher.go), abridged; attachID, pvSrc, pvName, ctx, and timeout are set up in the omitted code:
func (c *csiAttacher) Attach(spec *volume.Spec, nodeName types.NodeName) (string, error) {
	// 1. Create the VolumeAttachment object
	attachment := &storage.VolumeAttachment{
		ObjectMeta: metav1.ObjectMeta{
			Name: attachID,
		},
		Spec: storage.VolumeAttachmentSpec{
			Attacher: pvSrc.Driver,
			NodeName: string(nodeName),
			Source: storage.VolumeAttachmentSource{
				PersistentVolumeName: &pvName,
			},
		},
	}
	_, err := c.k8s.StorageV1().VolumeAttachments().Create(ctx, attachment, metav1.CreateOptions{})
	// An AlreadyExists error means the object is already there and is not a failure
	if err != nil && !apierrors.IsAlreadyExists(err) {
		return "", err
	}
	// 2. Wait for the VolumeAttachment to report attached
	return c.waitForVolumeAttachment(ctx, attachID, timeout)
}
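The VolumeAttachment name (the `attachID` above, shown as `csi-xxx` in the examples) is deterministic, so the controller and the kubelet can both compute it without coordination. The sketch below shows the scheme as I understand it from the in-tree CSI plugin: `csi-` followed by the SHA-256 of the volume handle, driver name, and node name concatenated. Treat the exact input order as an assumption and check `csi_attacher.go` in your Kubernetes version; `attachmentName` is an illustrative function name.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// attachmentName derives a stable VolumeAttachment name from the volume
// handle, the CSI driver name, and the node name.
func attachmentName(volumeHandle, driverName, nodeName string) string {
	sum := sha256.Sum256([]byte(volumeHandle + driverName + nodeName))
	// %x on a byte array prints it as lowercase hex.
	return fmt.Sprintf("csi-%x", sum)
}

func main() {
	fmt.Println(attachmentName("vol-0123", "csi.example.com", "node-1"))
	fmt.Println(attachmentName("vol-0123", "csi.example.com", "node-2"))
}
```

Because the name depends on the node, attaching the same volume to two nodes yields two distinct VolumeAttachment objects.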
Phase 3: Node Mount
flowchart TB
subgraph "1. Kubelet Detection"
A[Pod scheduled to the node] --> B[VolumeManager observes the Pod]
B --> C[Collect the volume list]
end
subgraph "2. Wait for Attach"
C --> D{Attach required?}
D -->|Yes| E[Wait for VolumeAttachment<br/>status.attached=true]
D -->|No| F[Mount directly]
E --> F
end
subgraph "3. NodeStageVolume"
F --> G[Call NodeStageVolume]
G --> H[Get the device path<br/>publishContext.devicePath]
H --> I[Format the device if it has no file system<br/>mkfs.ext4 /dev/sdb]
I --> J[Mount to the staging directory<br/>/var/lib/kubelet/plugins/.../globalmount]
end
subgraph "4. NodePublishVolume"
J --> K[Call NodePublishVolume]
K --> L[Bind mount into the Pod directory<br/>/var/lib/kubelet/pods/.../volumes/...]
L --> M[Set permissions and FSGroup]
M --> N[Mount complete]
end
Key code (pkg/volume/csi/csi_attacher.go and pkg/volume/csi/csi_mounter.go), abridged; error handling and variable setup are omitted:
// csi_attacher.go: MountDevice issues NodeStageVolume when the driver
// advertises the STAGE_UNSTAGE_VOLUME capability
func (c *csiAttacher) MountDevice(spec *volume.Spec, devicePath string, deviceMountPath string, deviceMounterArgs volume.DeviceMounterArgs) error {
	// 1. Read publishContext (which carries the device path)
	// 2. Call NodeStageVolume (if supported); deviceMountPath is the staging path
	if stageUnstageSet {
		err = csi.NodeStageVolume(ctx,
			volumeID,
			publishContext,
			deviceMountPath,
			fsType,
			accessMode,
			nodeStageSecrets,
			volumeContext,
			mountOptions,
			fsGroup,
		)
	}
	return err
}

// csi_mounter.go: SetUpAt issues NodePublishVolume
func (c *csiMountMgr) SetUpAt(dir string, mounterArgs volume.MounterArgs) error {
	// 3. Call NodePublishVolume
	err = csi.NodePublishVolume(ctx,
		c.volumeID,
		c.readOnly,
		stagingTargetPath,
		dir, // Pod mount directory
		accessMode,
		publishContext,
		volumeContext,
		nodePublishSecrets,
		fsType,
		mountOptions,
		fsGroup,
	)
	return err
}
6.3 Mount Paths
/var/lib/kubelet/
├── plugins/
│   └── kubernetes.io/
│       └── csi/
│           └── <driver-name>/
│               └── <volume-hash>/
│                   └── globalmount/          # NodeStageVolume target
│                       └── <mounted-fs>      # the formatted file system
│
├── pods/
│   └── <pod-uid>/
│       └── volumes/
│           └── kubernetes.io~csi/
│               └── <volume-name>/
│                   └── mount/                # NodePublishVolume target
│                       └── <bind-mount>      # bind mount of globalmount
│
└── plugins_registry/                         # CSI driver registration directory
    └── <driver-name>-reg.sock                # registration socket
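The two target paths can be computed from a handful of inputs. The sketch below builds both with the standard library. It assumes the layout used by recent Kubernetes releases, where the staging directory sits under `plugins/kubernetes.io/csi/<driver-name>/` and `<volume-hash>` is the SHA-256 of the volume handle, while the per-Pod directory uses the escaped plugin name `kubernetes.io~csi`. The function names and the fixed kubelet root are illustrative.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"path"
)

// kubeletRoot is the default kubelet root directory; clusters may override it.
const kubeletRoot = "/var/lib/kubelet"

// stagingPath returns the node-global directory passed to NodeStageVolume.
func stagingPath(driverName, volumeHandle string) string {
	volSha := fmt.Sprintf("%x", sha256.Sum256([]byte(volumeHandle)))
	return path.Join(kubeletRoot, "plugins", "kubernetes.io", "csi", driverName, volSha, "globalmount")
}

// publishPath returns the per-Pod directory passed to NodePublishVolume.
func publishPath(podUID, volumeName string) string {
	return path.Join(kubeletRoot, "pods", podUID, "volumes", "kubernetes.io~csi", volumeName, "mount")
}

func main() {
	fmt.Println(stagingPath("csi.example.com", "vol-0123"))
	fmt.Println(publishPath("pod-uid-1", "pvc-xxx"))
}
```

Several Pods on one node that use the same volume share a single staging path and each get their own publish path.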
7. The Linux Storage Stack
7.1 Storage Layer Architecture
graph TB
subgraph "User Space"
App[Application]
FS_API[File System API<br/>open/read/write]
end
subgraph "Kernel Space"
VFS[VFS Virtual File System Layer]
FS[Concrete File System<br/>ext4/xfs/btrfs]
BC[Page Cache]
BL[Block Layer]
IO[I/O Scheduler]
SCSI[SCSI Subsystem]
Driver[Device Driver]
end
subgraph "Hardware"
HBA[HBA/NVMe Controller]
Disk[(Storage Device)]
end
App --> FS_API
FS_API --> VFS
VFS --> FS
FS --> BC
BC --> BL
BL --> IO
IO --> SCSI
SCSI --> Driver
Driver --> HBA
HBA --> Disk
7.2 From CSI to the Linux Mount
sequenceDiagram
participant CSI as CSI Node Driver
participant Mount as mount system call
participant VFS as Linux VFS
participant FS as File system (ext4)
participant Block as Block device layer
participant Device as /dev/sdb
Note over CSI,Device: NodeStageVolume phase
CSI->>Device: 1. Discover device /dev/sdb
CSI->>CSI: 2. Check whether formatting is needed
alt New device
CSI->>FS: 3. mkfs.ext4 /dev/sdb
FS->>Block: Write file system metadata
Block->>Device: Persist to disk
end
CSI->>Mount: 4. mount /dev/sdb /staging/path
Mount->>VFS: 5. sys_mount()
VFS->>FS: 6. ext4_fill_super()
FS->>Block: 7. Read the superblock
Block->>Device: 8. Read device data
Device-->>Block: Return data
Block-->>FS: Return the superblock
FS-->>VFS: Register the mount point
VFS-->>Mount: Mount succeeded
Mount-->>CSI: Return success
Note over CSI,Device: NodePublishVolume phase
CSI->>Mount: 9. mount --bind /staging /pod/path
Mount->>VFS: 10. bind mount
VFS->>VFS: 11. Create a new mount referencing the same file system
VFS-->>Mount: Mount succeeded
Mount-->>CSI: Return success
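Step 2 of the staging phase, deciding whether to format, is where data loss can happen if the check is wrong. The sketch below captures one conservative policy as a pure function: format only when the device has no file system, mount as-is when the existing file system matches, and refuse when it differs. This is a simplified model under my own naming (`decide`, `Action`, `ErrFSMismatch`); a real driver would detect `existingFS` with a tool such as `blkid` and would usually delegate the whole step to a helper library.

```go
package main

import (
	"errors"
	"fmt"
)

// Action is what the node plugin should do with the device before mounting.
type Action string

const (
	FormatThenMount Action = "format-then-mount"
	MountOnly       Action = "mount-only"
)

// ErrFSMismatch signals that the device holds a different file system than
// the one requested; reformatting would destroy data, so the call fails.
var ErrFSMismatch = errors.New("existing file system does not match requested type")

// decide picks the action from the detected file system (empty if none)
// and the requested file system type (empty if the caller has no preference).
func decide(existingFS, requestedFS string) (Action, error) {
	if existingFS == "" {
		return FormatThenMount, nil
	}
	if requestedFS == "" || requestedFS == existingFS {
		return MountOnly, nil
	}
	return "", fmt.Errorf("%w: have %s, want %s", ErrFSMismatch, existingFS, requestedFS)
}

func main() {
	for _, c := range [][2]string{{"", "ext4"}, {"ext4", "ext4"}, {"xfs", "ext4"}} {
		action, err := decide(c[0], c[1])
		fmt.Printf("existing=%q requested=%q -> action=%q err=%v\n", c[0], c[1], action, err)
	}
}
```

The second case is what makes a repeated NodeStageVolume call safe after a node restart: the volume is already formatted, so it is only mounted again.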
7.3 Key Linux Commands
# 1. List block devices
lsblk
# NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
# sda      8:0    0  100G  0 disk
# └─sda1   8:1    0  100G  0 part /
# sdb      8:16   0   50G  0 disk /var/lib/kubelet/plugins/...
# 2. Format the device (destroys any existing data)
mkfs.ext4 /dev/sdb
# 3. Mount the device
mount /dev/sdb /var/lib/kubelet/plugins/kubernetes.io/csi/driver/vol-xxx/globalmount
# 4. Bind mount
mount --bind /source /target
# 5. Inspect a mount point
findmnt /var/lib/kubelet/pods/xxx/volumes/kubernetes.io~csi/pvc-xxx/mount
# TARGET                          SOURCE   FSTYPE OPTIONS
# /var/lib/kubelet/pods/.../mount /dev/sdb ext4   rw,relatime
# 6. Inspect mount propagation
grep kubelet /proc/self/mountinfo
# The optional fields show the propagation type: shared:N (shared), master:N (slave);
# a mount with neither is private
7.4 Mount Propagation
graph TB
subgraph "Propagation Types"
P[Private<br/>no propagation]
S[Shared<br/>propagates both ways]
SL[Slave<br/>propagates one way]
U[Unbindable<br/>cannot be bind mounted]
end
subgraph "CSI Scenarios"
HP[Host Path<br/>Bidirectional]
CP[Container Path<br/>HostToContainer]
end
HP -->|requires| S
CP -->|requires| SL
The CSI Node container needs Bidirectional mount propagation so that mount points created inside the container are visible on the host:
volumeMounts:
  - name: pods-mount-dir
    mountPath: /var/lib/kubelet/pods
    mountPropagation: Bidirectional # the key setting
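To check which propagation type a mount actually has, you can read the optional fields of its line in `/proc/self/mountinfo`. The sketch below parses one such line with the standard library. It relies on the documented mountinfo layout: six fixed fields, zero or more optional fields such as `shared:N` or `master:N`, then a `-` separator. The function name `propagation` and the returned labels are illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// propagation returns the mount point and propagation type of one
// /proc/self/mountinfo line.
func propagation(line string) (mountPoint, mode string, err error) {
	fields := strings.Fields(line)
	// Optional fields start at index 6 and end at the "-" separator.
	sep := -1
	for i := 6; i < len(fields); i++ {
		if fields[i] == "-" {
			sep = i
			break
		}
	}
	if sep == -1 {
		return "", "", fmt.Errorf("malformed mountinfo line: %q", line)
	}
	shared, slave, unbindable := false, false, false
	for _, f := range fields[6:sep] {
		switch {
		case strings.HasPrefix(f, "shared:"):
			shared = true
		case strings.HasPrefix(f, "master:"):
			slave = true
		case f == "unbindable":
			unbindable = true
		}
	}
	switch {
	case shared && slave:
		mode = "shared,slave"
	case shared:
		mode = "shared"
	case slave:
		mode = "slave"
	case unbindable:
		mode = "unbindable"
	default:
		mode = "private"
	}
	return fields[4], mode, nil
}

func main() {
	line := "36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue"
	mp, mode, err := propagation(line)
	fmt.Println(mp, mode, err)
}
```

With `mountPropagation: Bidirectional`, the kubelet pods directory inside the CSI Node container should report `shared`; with `HostToContainer` it should report `slave`.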
8. Key Code Paths
8.1 Source File Index
| Module | File path | Purpose |
| --- | --- | --- |
| CSI plugin | pkg/volume/csi/csi_plugin.go | CSI plugin entry point, driver registration |
| | pkg/volume/csi/csi_attacher.go | Volume attach/detach logic |
| | pkg/volume/csi/csi_mounter.go | Volume mount/unmount logic |
| | pkg/volume/csi/csi_client.go | CSI gRPC client |
| | pkg/volume/csi/csi_drivers_store.go | Driver registry |
| Controllers | pkg/controller/volume/persistentvolume/pv_controller.go | PV/PVC controller |
| | pkg/controller/volume/attachdetach/attach_detach_controller.go | AttachDetach controller |
| | pkg/controller/volume/attachdetach/reconciler/reconciler.go | Reconciler |
| Kubelet | pkg/kubelet/volumemanager/volume_manager.go | Kubelet volume manager |
| | pkg/kubelet/volumemanager/reconciler/reconciler.go | Kubelet reconciler |
| | pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go | Desired-state populator |
| Operation execution | pkg/volume/util/operationexecutor/operation_executor.go | Operation executor |
| | pkg/volume/util/operationexecutor/operation_generator.go | Operation generator |
| CSI Spec | vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go | CSI protocol definitions |
8.2 Key Function Call Chain
Pod created → Kubelet
    │
    ▼
VolumeManager.Run()
    │
    ▼
Populator.Run() ──────────────────────► DesiredStateOfWorld.AddPodToVolume()
    │
    ▼
Reconciler.reconcile()
    │
    ├─► unmountVolumes()
    │
    ├─► mountOrAttachVolumes()
    │       │
    │       ▼
    │   waitForVolumeAttach() ────────► wait for VolumeAttachment.status.attached=true
    │       │
    │       ▼
    │   OperationExecutor.MountVolume()
    │       │
    │       ▼
    │   OperationGenerator.GenerateMountVolumeFunc()
    │       │
    │       ├─► csiAttacher.MountDevice() ──► csiClient.NodeStageVolume()
    │       │
    │       └─► csiMountMgr.SetUpAt() ──────► csiClient.NodePublishVolume()
    │
    └─► unmountDetachDevices()
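The reconciler at the centre of this chain compares a desired state (what Pods on the node need) with an actual state (what is mounted) and acts on the difference. The sketch below shows that diff with plain maps. It is a conceptual model only: `mountKey` and `reconcile` are made-up names, and the real kubelet tracks far more state per volume.

```go
package main

import (
	"fmt"
	"sort"
)

// mountKey identifies one volume mounted for one Pod.
type mountKey struct {
	PodUID string
	Volume string
}

// reconcile returns the mounts to remove (present but no longer desired)
// and the mounts to create (desired but not present), each in sorted order.
func reconcile(desired, actual map[mountKey]bool) (toUnmount, toMount []mountKey) {
	for k := range actual {
		if !desired[k] {
			toUnmount = append(toUnmount, k)
		}
	}
	for k := range desired {
		if !actual[k] {
			toMount = append(toMount, k)
		}
	}
	sortKeys(toUnmount)
	sortKeys(toMount)
	return toUnmount, toMount
}

// sortKeys orders keys by Pod UID, then by volume name.
func sortKeys(keys []mountKey) {
	sort.Slice(keys, func(i, j int) bool {
		if keys[i].PodUID != keys[j].PodUID {
			return keys[i].PodUID < keys[j].PodUID
		}
		return keys[i].Volume < keys[j].Volume
	})
}

func main() {
	desired := map[mountKey]bool{{"pod-a", "pvc-1"}: true, {"pod-b", "pvc-2"}: true}
	actual := map[mountKey]bool{{"pod-b", "pvc-2"}: true, {"pod-c", "pvc-3"}: true}
	toUnmount, toMount := reconcile(desired, actual)
	fmt.Println("unmount:", toUnmount)
	fmt.Println("mount:  ", toMount)
}
```

As in the call chain above, unmounts are handled before mounts, so a volume released by one Pod can be picked up by another in the same pass.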
8.3 Core Data Structures
// CSI client interface (csi_client.go)
type csiClient interface {
	NodeGetInfo(ctx context.Context) (nodeID string, maxVolumePerNode int64, accessibleTopology map[string]string, err error)
	NodePublishVolume(ctx context.Context, volumeid string, readOnly bool, stagingTargetPath string, targetPath string, accessMode api.PersistentVolumeAccessMode, publishContext map[string]string, volumeContext map[string]string, secrets map[string]string, fsType string, mountOptions []string, fsGroup *int64) error
	NodeUnpublishVolume(ctx context.Context, volID string, targetPath string) error
	NodeStageVolume(ctx context.Context, volID string, publishVolumeInfo map[string]string, stagingTargetPath string, fsType string, accessMode api.PersistentVolumeAccessMode, secrets map[string]string, volumeContext map[string]string, mountOptions []string, fsGroup *int64) error
	NodeUnstageVolume(ctx context.Context, volID, stagingTargetPath string) error
	NodeSupportsStageUnstage(ctx context.Context) (bool, error)
	// ... more methods
}

// Volume mounter (csi_mounter.go)
type csiMountMgr struct {
	csiClientGetter
	k8s                 kubernetes.Interface
	plugin              *csiPlugin
	driverName          csiDriverName
	volumeLifecycleMode storage.VolumeLifecycleMode
	volumeID            string
	specVolumeID        string
	readOnly            bool
	spec                *volume.Spec
	pod                 *api.Pod
	podUID              types.UID
	publishContext      map[string]string
	kubeVolHost         volume.KubeletVolumeHost
}

// Volume attacher (csi_attacher.go)
type csiAttacher struct {
	plugin       *csiPlugin
	k8s          kubernetes.Interface
	watchTimeout time.Duration
	csiClient    csiClient
}
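CSI decouples storage systems from Kubernetes through a standardized gRPC interface. The complete mount flow consists of:
- Dynamic provisioning: PV/PVC Controller + External Provisioner → CreateVolume
- Volume attachment: AttachDetach Controller + External Attacher → ControllerPublishVolume
- Node mount: Kubelet VolumeManager → NodeStageVolume + NodePublishVolume
- Linux mount: mkfs + mount + bind mount
Understanding this flow is essential both for developing CSI drivers and for troubleshooting storage problems.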