Environment

To learn Ceph you need a lab environment, and Docker is an easy way to build one. This tutorial sets up a Ceph cluster with 3 OSDs inside a single virtual machine running Ubuntu 16.04; the VM's IP is 10.0.2.15.

It is assumed that Docker is already installed on the system.
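A quick sanity check (the output simply reflects whatever Docker version happens to be installed):

docker --version
docker info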

Pull the image

docker pull ceph/daemon:tag-build-master-jewel-ubuntu-16.04

Install ceph-common

apt-get install ceph-common

Edit /etc/ceph/ceph.conf

Note the settings osd max object name len = 256 and osd max object namespace len = 64; without them (for example when the OSD data directory sits on ext4, whose xattr size is limited) you may hit the error "Your backend filesystem appears to not support attrs large enough to handle the configured max rados name size".

[global]
fsid = 943a03ff-d798-4f36-ac6c-eca1a80a71fd
mon initial members = fankang
mon host = 10.0.2.15
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
public network = 10.0.2.0/24
cluster network = 10.0.2.0/24
osd journal size = 100
osd max object name len = 256
osd max object namespace len = 64

Start the monitor

Fill in MON_IP and CEPH_PUBLIC_NETWORK according to your own network setup.

docker run -d --privileged --net=host -v /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -e MON_IP=10.0.2.15 -e CEPH_PUBLIC_NETWORK=10.0.2.0/24 ceph/daemon:tag-build-master-jewel-ubuntu-16.04 mon

Verify:

root@fankang:/var/lib/ceph# ceph -s
cluster 943a03ff-d798-4f36-ac6c-eca1a80a71fd
health HEALTH_ERR
no osds
monmap e1: 1 mons at {fankang=10.0.2.15:6789/0}
election epoch 3, quorum 0 fankang
osdmap e1: 0 osds: 0 up, 0 in
flags sortbitwise,require_jewel_osds
pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
0 kB used, 0 kB / 0 kB avail
64 creating

Create the OSD directories

root@fankang:/var/lib/ceph/osd# mkdir ceph-0
root@fankang:/var/lib/ceph/osd# mkdir ceph-1
root@fankang:/var/lib/ceph/osd# mkdir ceph-2

Create the OSDs

Run docker exec <mon-container-id> ceph osd create three times:

root@fankang:/var/lib/ceph/osd# docker exec aab4b005e16f ceph osd create
0
root@fankang:/var/lib/ceph/osd# docker exec aab4b005e16f ceph osd create
1
root@fankang:/var/lib/ceph/osd# docker exec aab4b005e16f ceph osd create
2

Verify:

root@fankang:/var/lib/ceph/osd# ceph -s
cluster 943a03ff-d798-4f36-ac6c-eca1a80a71fd
health HEALTH_OK
monmap e1: 1 mons at {fankang=10.0.2.15:6789/0}
election epoch 3, quorum 0 fankang
osdmap e4: 3 osds: 0 up, 0 in
flags sortbitwise,require_jewel_osds
pgmap v5: 64 pgs, 1 pools, 0 bytes data, 0 objects
0 kB used, 0 kB / 0 kB avail
64 creating

Start the OSDs

docker run -d --privileged --net=host -v /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -e MON_IP=10.0.2.15 -e CEPH_PUBLIC_NETWORK=10.0.2.0/24 ceph/daemon:tag-build-master-jewel-ubuntu-16.04 osd_directory

Verify:

root@fankang:/var/lib/ceph/osd# ceph -s
cluster 943a03ff-d798-4f36-ac6c-eca1a80a71fd
health HEALTH_ERR
64 pgs are stuck inactive for more than 300 seconds
64 pgs degraded
64 pgs stuck inactive
64 pgs stuck unclean
64 pgs undersized
too few PGs per OSD (21 < min 30)
monmap e1: 1 mons at {fankang=10.0.2.15:6789/0}
election epoch 3, quorum 0 fankang
osdmap e9: 3 osds: 3 up, 3 in
flags sortbitwise,require_jewel_osds
pgmap v12: 64 pgs, 1 pools, 0 bytes data, 0 objects
27857 MB used, 408 GB / 458 GB avail
64 undersized+degraded+peered

As expected, the "too few PGs per OSD" error is reported.

Fixing the errors

Since only the default rbd pool exists, we need to raise its PG count. With 3 OSDs and a replica size of 3, the usual rule of thumb of roughly 100 PGs per OSD gives 3 * 100 / 3 = 100, which rounds up to the next power of 2: 128.
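The same arithmetic as a small shell sketch (the target of ~100 PGs per OSD is just the common rule of thumb, not something the cluster enforces):

# Compute pg_num = (OSDs * 100 / replica size), rounded up to a power of 2.
osds=3; size=3
target=$(( osds * 100 / size ))   # 100
pg_num=1
while [ "$pg_num" -lt "$target" ]; do pg_num=$(( pg_num * 2 )); done
echo "$pg_num"                    # 128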

root@fankang:/var/lib/ceph/osd# ceph osd pool set rbd pg_num 128
set pool 0 pg_num to 128
root@fankang:/var/lib/ceph/osd# ceph osd pool set rbd pgp_num 128
set pool 0 pgp_num to 128

Verify:

root@fankang:/var/lib/ceph/osd# ceph -s
cluster 943a03ff-d798-4f36-ac6c-eca1a80a71fd
health HEALTH_ERR
128 pgs are stuck inactive for more than 300 seconds
128 pgs degraded
128 pgs stuck inactive
128 pgs stuck unclean
128 pgs undersized
monmap e1: 1 mons at {fankang=10.0.2.15:6789/0}
election epoch 3, quorum 0 fankang
osdmap e14: 3 osds: 3 up, 3 in
flags sortbitwise,require_jewel_osds
pgmap v107: 128 pgs, 1 pools, 0 bytes data, 0 objects
27872 MB used, 408 GB / 458 GB avail
128 undersized+degraded+peered

The cluster still reports errors. That is because, by default, replicas are placed on different hosts, and this is a single-machine setup, so we reduce the rbd pool's replica count to 1.

root@fankang:/var/lib/ceph/osd# ceph osd pool set rbd size 1
set pool 0 size to 1

I don't know yet how to change the placement policy; later it can be switched to placing replicas per OSD, in which case the replica count would not need to be reduced.
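As a sketch only (not something done in this tutorial), the failure domain can be switched from host to OSD either via ceph.conf before the cluster is created, or at runtime with a CRUSH rule; the rule id 1 below is an assumption, check it with ceph osd crush rule dump.

# Option A: in /etc/ceph/ceph.conf [global], before creating the cluster:
#   osd crush chooseleaf type = 0    # 0 = osd, 1 = host (the default)
# Option B: on the running cluster, add a rule whose failure domain is "osd"
# and point the rbd pool at it (rule id assumed to be 1):
ceph osd crush rule create-simple replicated_osd default osd
ceph osd pool set rbd crush_ruleset 1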
Verify:

root@fankang:/var/lib/ceph/osd# ceph -s
cluster 943a03ff-d798-4f36-ac6c-eca1a80a71fd
health HEALTH_OK
monmap e1: 1 mons at {fankang=10.0.2.15:6789/0}
election epoch 3, quorum 0 fankang
osdmap e17: 3 osds: 3 up, 3 in
flags sortbitwise,require_jewel_osds
pgmap v197: 128 pgs, 1 pools, 0 bytes data, 0 objects
27883 MB used, 408 GB / 458 GB avail
128 active+clean

The cluster reports 458 GB of total capacity, but that is just the 3 OSDs each counting the same 153 GB filesystem (roughly 3 * 153 GB); the real capacity is far smaller, so use it sparingly.
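To cross-check from Ceph's side, the standard ceph df and ceph osd df commands show that all three OSDs are counting the same backing filesystem:

ceph df        # global raw capacity and per-pool usage
ceph osd df    # per-OSD utilization; all three report the same disk here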

root@fankang:/var/lib/ceph/osd# df -h
Filesystem Size Used Avail Use% Mounted on
udev 2.0G 0 2.0G 0% /dev
tmpfs 396M 5.8M 390M 2% /run
/dev/mapper/fankang--vg-root 153G 9.1G 137G 7% /
tmpfs 2.0G 220K 2.0G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/sda1 472M 103M 346M 23% /boot
tmpfs 100K 0 100K 0% /run/lxcfs/controllers
cgmfs 100K 0 100K 0% /run/cgmanager/fs
tmpfs 396M 0 396M 0% /run/user/1000