Corosync+Pacemaker+Ldirectord+Lvs+Httpd
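I. Hardware environment
4 virtual machines on the same network segment
Operating system: CentOS 6.3
Script to disable unnecessary system services: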
#!/bin/bash
services=`chkconfig --list|cut -f1|cut -d" " -f1`
for ser in $services
do
    if [ "$ser" == "network" ] || [ "$ser" == "rsyslog" ] || [ "$ser" == "sshd" ] || [ "$ser" == "crond" ] || [ "$ser" == "atd" ]; then
        chkconfig "$ser" on
    else
        chkconfig "$ser" off
    fi
done
reboot
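Because the script above changes every service and then reboots, it can help to preview what it would do first. The following is a minimal dry-run sketch, not part of the original procedure: the function names desired_state and plan_services are hypothetical, and the three-service input list is only an illustration.

```shell
#!/bin/bash
# Services to leave enabled, taken from the script above.
KEEP="network rsyslog sshd crond atd"

# Print "on" if the service is in the keep list, otherwise "off".
desired_state() {
    local ser=$1
    local k
    for k in $KEEP; do
        if [ "$ser" = "$k" ]; then
            echo on
            return
        fi
    done
    echo off
}

# Dry run: print the chkconfig commands instead of executing them.
# Reads service names on stdin, one per line.
plan_services() {
    local ser
    while read -r ser; do
        if [ -n "$ser" ]; then
            echo "chkconfig $ser $(desired_state "$ser")"
        fi
    done
}

# Example with a hypothetical service list.
printf '%s\n' sshd cups network | plan_services
```

On a real host the input could come from chkconfig --list | awk '{print $1}' instead of the sample list.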
II. IP address plan
master 172.30.82.45
slave 172.30.82.58
node1 172.30.82.3
node2 172.30.82.11
VIP 172.30.82.61
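III. Prerequisites:
1. Synchronize the time on all nodes
ntpdate 172.30.82.254 &>/dev/null
2. Make the nodes reachable from each other by hostname through the hosts file; edit /etc/hosts on every node
3. The output of uname -n must match the node's hostname
4. Make sure ldirectord is not started at boot, because the cluster will start and stop it as a resource
chkconfig ldirectord off
5. Disable SELinux
setenforce 0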
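Prerequisites 2 and 3 above can be sketched as a small script. This is only an illustration built from the IP plan in section II; the function names hosts_entries and in_plan are hypothetical, and nothing is written to /etc/hosts by the sketch itself.

```shell
#!/bin/bash
# Print /etc/hosts entries for the cluster, using the addresses from
# the IP plan in this article.
hosts_entries() {
    cat <<'EOF'
172.30.82.45 master
172.30.82.58 slave
172.30.82.3  node1
172.30.82.11 node2
EOF
}

# Succeed if the given name appears as a hostname in the entries.
in_plan() {
    hosts_entries | awk -v h="$1" '$2 == h { found = 1 } END { exit !found }'
}

hosts_entries

# Check that this machine's node name is one of the planned names.
if in_plan "$(uname -n)"; then
    echo "uname -n matches the plan"
else
    echo "uname -n ($(uname -n)) is not in the plan" >&2
fi
```

To apply the entries, the output of hosts_entries could be appended to /etc/hosts on each node after checking that the names are not already present.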
IV. Downloading and installing the software
Starting with pacemaker 1.1.8, crm was split out into an independent project called crmsh. After installing pacemaker there is therefore no crm command; to manage cluster resources, crmsh must be installed separately.
pssh-2.3.1-4.1.x86_64.rpm
crmsh-2.1-1.6.x86_64.rpm
python-pssh-2.3.1-4.1.x86_64.rpm
Download: http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/x86_64/
libdnet
Download: http://dl.fedoraproject.org/pub/epel/6/x86_64/repoview/letter_l.group.html
ldirectord-3.9.6-0rc1.1.1.x86_64.rpm
Download: http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/x86_64/
Install:
yum install corosync pacemaker libesmtp -y
yum install -y python-dateutil python-lxml redhat-rpm-config cluster-glue cluster-glue-libs resource-agents
yum --nogpgcheck localinstall pssh-2.3.1-4.1.x86_64.rpm crmsh-2.1-1.6.x86_64.rpm python-pssh-2.3.1-4.1.x86_64.rpm ldirectord-3.9.6-0rc1.1.1.x86_64.rpm
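V. Configuring high availability on the director nodes
1. Copy the configuration files (the first command is run in /etc/corosync)
cp corosync.conf.example corosync.conf
cp /usr/share/doc/ldirectord-3.9.6/ldirectord.cf /etc/ha.d/
2. Generate the authkeys file
corosync-keygen
3. Edit corosync.conf
totem {
    version: 2
    # whether to enable key authentication
    secauth: off
    # number of threads used to send authenticated cluster messages
    threads: 0
    interface {
        # ring number of this interface, used to tell rings apart in a redundant-ring setup
        ringnumber: 0
        # network the cluster runs on
        bindnetaddr: 172.30.82.0
        # multicast address for cluster messages
        mcastaddr: 239.238.16.1
        # service port
        mcastport: 5405
        ttl: 1
    }
}
logging {
    # whether to print file and line in log messages
    fileline: off
    # whether to log to standard error (the screen)
    to_stderr: no
    # log to a file
    to_logfile: yes
    logfile: /var/log/corosync.log
    # whether to log to syslog
    to_syslog: no
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
# start pacemaker when corosync starts
service {
    ver: 0
    name: pacemaker
}
4. Edit the ldirectord configuration file ldirectord.cf. The options that belong to a virtual service must be indented under its virtual= line.
# health check timeout
checktimeout=3
# health check interval
checkinterval=1
# reload the configuration automatically when the file changes
autoreload=yes
# log file path
logfile="/var/log/ldirectord.log"
#logfile="local0"
# no: a failed real server is removed from the LVS table and added back automatically when it recovers
quiescent=no
# listen on port 80 of the VIP
virtual=172.30.82.61:80
    # real server IP and port; gate selects direct routing
    real=172.30.82.3:80 gate
    real=172.30.82.11:80 gate
    # if all real servers are down, fall back to the loopback address
    fallback=127.0.0.1:80 gate
    # the service is http
    service=http
    # file stored in the web root of each real server; fetching it decides whether the server is alive
    request=".text.html"
    # expected content of that file
    receive="OK"
    # scheduling algorithm
    scheduler=rr
    # protocol
    protocol=tcp
    # check type
    checktype=negotiate
    # check port
    checkport=80
5. Copy the configuration files to the standby node. The -p option preserves the file modes; corosync reads its files from /etc/corosync/ and ldirectord from /etc/ha.d/.
scp -p authkeys corosync.conf slave:/etc/corosync/
scp -p /etc/ha.d/ldirectord.cf slave:/etc/ha.d/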
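Before handing ldirectord.cf to the cluster, it can be useful to confirm which real servers the file actually declares. The sketch below is a hypothetical helper (list_reals and sample_cf are names invented here) that parses the configuration text with awk; it is not how ldirectord itself reads the file.

```shell
#!/bin/bash
# List the real servers declared in an ldirectord configuration.
# Reads the configuration on stdin and prints one "ip:port" per line.
list_reals() {
    awk '
        /^[ \t]*#/ { next }                         # skip comment lines
        /^[ \t]*real[ \t]*=/ {
            sub(/^[ \t]*real[ \t]*=[ \t]*/, "")     # drop the real= prefix
            print $1                                # first field is ip:port
        }
    '
}

# Sample fragment mirroring the configuration in this article.
sample_cf() {
    cat <<'EOF'
virtual=172.30.82.61:80
    real=172.30.82.3:80 gate
    real=172.30.82.11:80 gate
    fallback=127.0.0.1:80 gate
    service=http
EOF
}

sample_cf | list_reals
```

On a director the same function could be fed the real file, for example list_reals < /etc/ha.d/ldirectord.cf.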
VI. Real server configuration script for the DR model:
#!/bin/bash
VIP=172.30.82.61
host=`/bin/hostname`
case "$1" in
start)
    # Start LVS-DR real server on this machine.
    /sbin/ifconfig lo down
    /sbin/ifconfig lo up
    echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
    echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
    echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
    /sbin/ifconfig lo:0 $VIP netmask 255.255.255.255 up
    /sbin/route add -host $VIP dev lo:0
;;
stop)
    # Stop LVS-DR real server loopback device(s).
    /sbin/ifconfig lo:0 down
    echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
    echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
    echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
;;
status)
    # Status of LVS-DR real server.
    islothere=`/sbin/ifconfig lo:0 | grep $VIP`
    isrothere=`netstat -rn | grep "lo" | grep $VIP`
    if [ ! "$islothere" -o ! "$isrothere" ];then
        # Either the route or the lo:0 device
        # not found.
        echo "LVS-DR real server is stopped."
    else
        echo "LVS-DR real server is running."
    fi
;;
*)
    # Invalid entry.
    echo "$0: Usage: $0 {start|status|stop}"
    exit 1
;;
esac
VII. Installing httpd on the real servers and adding test pages
1. node1
yum install -y httpd
echo "Welcome to realserver 1" >/var/www/html/index.html
echo "OK" >/var/www/html/.text.html
service httpd start
2. node2
yum install -y httpd
echo "Welcome to realserver 2" >/var/www/html/index.html
echo "OK" >/var/www/html/.text.html
service httpd start
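The .text.html file created above is what the director fetches to decide whether a real server is alive. A rough way to reproduce that check by hand is sketched below; body_ok and probe are hypothetical helper names, the sketch assumes curl is installed, and it only approximates a request/receive comparison rather than reproducing ldirectord's own logic.

```shell
#!/bin/bash
# Succeed if the response body (stdin) contains the expected string.
body_ok() {
    grep -q -- "$1"
}

# Probe one real server given as ip:port.
# Needs curl and network access to the server.
probe() {
    local server=$1
    if curl -s --max-time 3 "http://$server/.text.html" | body_ok "OK"; then
        echo "$server up"
    else
        echo "$server down"
    fi
}

# Offline demonstration of the matching logic:
echo "OK" | body_ok "OK" && echo "match"

# On the director, one might run:
#   probe 172.30.82.3:80
#   probe 172.30.82.11:80
```

If a probe reports a server as down while httpd is running, the usual suspects are a missing .text.html file or a firewall between director and real server.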
VIII. Starting, configuring, and testing the HA cluster service
1. On master, run:
service corosync start
ssh slave 'service corosync start'
Note: start corosync on slave from master with the command above; do not start it directly on slave.
Check that the corosync engine started correctly:
[root@master corosync]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/corosync.log
May 19 23:11:05 corosync [MAIN ] Corosync Cluster Engine exiting with status 0 at main.c:2055.
May 19 23:11:46 corosync [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
May 19 23:11:46 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'
Check that the initial membership notifications were sent:
[root@master corosync]# grep TOTEM /var/log/corosync.log
May 19 19:59:44 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
May 19 19:59:44 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 19 19:59:44 corosync [TOTEM ] The network interface [172.30.82.45] is now up.
May 19 19:59:44 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Check whether any errors occurred during startup:
[root@master corosync]# grep ERROR: /var/log/corosync.log
Check that pacemaker started correctly (pcmk_startup entries in /var/log/corosync.log):
May 19 23:11:46 corosync [pcmk ] info: pcmk_startup: CRM: Initialized
May 19 23:11:46 corosync [pcmk ] Logging: Initialized pcmk_startup
May 19 23:11:46 corosync [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
May 19 23:11:46 corosync [pcmk ] info: pcmk_startup: Service: 9
May 19 23:11:46 corosync [pcmk ] info: pcmk_startup: Local hostname: master
Check the status of the cluster nodes:
[root@master corosync]# crm status
Last updated: Wed May 20 00:10:38 2015
Last change: Tue May 19 22:49:50 2015
Stack: classic openais (with plugin)
Current DC: slave - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured
Online: [ master slave ]
2. Configure the cluster resources. Two primitive resources and one group resource are needed.
a. Configure the VIP
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=172.30.82.61 nic=eth0 cidr_netmask=24
b. Configure the ldirectord service resource
crm(live)configure# primitive ldir lsb:ldirectord
c. Configure the group resource. A group keeps its member resources on the same server; by default, cluster resources are spread evenly across the nodes.
crm(live)configure# group lvsserver vip ldir
d. Instead of defining a group, resource stickiness and constraints can also express where resources should run. The following are examples only.
Order constraint (the order in which resources start):
crm(live)configure# order vip_before_ldir mandatory: vip ldir
Colocation constraint (which resources run together):
crm(live)configure# colocation ldir_with_vip inf: vip ldir
Location constraint (which node a resource prefers):
crm(live)configure# location vip_on_master vip rule 100: #uname eq master
e. Other settings
Disable STONITH devices:
crm(live)configure# property stonith-enabled=false
Set the policy for a cluster without quorum to ignore. With only two servers, this is the only workable option.
crm(live)configure# property no-quorum-policy=ignore
The architecture, internals, and command reference of corosync are left to the reader; this article focuses on building and testing the environment.
Show the cluster configuration:
crm(live)configure# show
node master
node slave
primitive ldir lsb:ldirectord
primitive vip IPaddr \
    params ip=172.30.82.61 nic=eth0 cidr_netmask=24
group lvsserver vip ldir
property cib-bootstrap-options: \
    dc-version=1.1.11-97629de \
    cluster-infrastructure="classic openais (with plugin)" \
    expected-quorum-votes=2 \
    stonith-enabled=false \
    no-quorum-policy=ignore
Verify the configuration syntax:
crm(live)configure# verify
If no errors are reported, commit the configuration:
crm(live)configure# commit
3. Test the cluster service by accessing 172.30.82.61 from a client
a. On master:
[root@master corosync]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.30.82.61:80 rr
  -> 172.30.82.3:80 Route 1 0 13
  -> 172.30.82.11:80 Route 1 0 14
b. On slave:
[root@slave ha.d]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
This shows that the cluster resources are running only on master.
4. Resource failover test
a. On master, run service corosync stop
[root@master log]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
b. On slave:
[root@slave log]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.30.82.61:80 rr
  -> 172.30.82.3:80 Route 1 1 17
  -> 172.30.82.11:80 Route 1 0 18
This shows that the cluster resources were moved successfully.
c. Back-end failure detection
On node1, run service httpd stop, then check the cluster service on master:
[root@master log]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.30.82.61:80 rr
  -> 172.30.82.11:80 Route 1 0 0
Restore the service on node1 with service httpd start:
[root@master log]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.30.82.61:80 rr
  -> 172.30.82.11:80 Route 1 0
  -> 172.30.82.3:80 Route 1
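The failover and failure-detection tests above come down to reading how many real servers appear under the VIP in the ipvsadm -Ln output. A small sketch that automates that reading follows; count_reals and sample_output are hypothetical names, and the sample text simply mirrors the output shown in this section.

```shell
#!/bin/bash
# Count the real servers listed under one virtual service in the
# output of "ipvsadm -Ln" (read from stdin).
# $1 is the virtual service as ip:port.
count_reals() {
    awk -v vs="$1" '
        $1 == "TCP" || $1 == "UDP" { in_vs = ($2 == vs); next }
        in_vs && $1 == "->" && $2 != "RemoteAddress:Port" { n++ }
        END { print n + 0 }
    '
}

# Sample output mirroring the listing shown on the active director.
sample_output() {
    cat <<'EOF'
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.30.82.61:80 rr
  -> 172.30.82.3:80               Route   1      0          13
  -> 172.30.82.11:80              Route   1      0          14
EOF
}

sample_output | count_reals 172.30.82.61:80
```

On the active director this could be run as ipvsadm -Ln | count_reals 172.30.82.61:80; a result of 0 suggests the node does not currently hold the cluster resources or that every real server failed its check.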