cman fails to start
cman fails to start, and no specific reason is reported. Can anyone tell me why?
The relevant information is as follows:
# service cman restart
Stopping cluster:
Stopping fencing... done
Stopping cman... done
Stopping ccsd... done
Unmounting configfs... done
[ OK ]
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... failed
The messages log is as follows:
Oct 24 16:24:57 cms2 openais: heartbeat_failures_allowed (0)
Oct 24 16:24:57 cms2 openais: max_network_delay (50 ms)
Oct 24 16:24:57 cms2 openais: HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Oct 24 16:24:57 cms2 openais: Receive multicast socket recv buffer size (262142 bytes).
Oct 24 16:24:57 cms2 openais: Transmit multicast socket send buffer size (262142 bytes).
Oct 24 16:24:57 cms2 openais: The network interface is now up.
Oct 24 16:24:57 cms2 openais: Created or loaded sequence id 0.192.168.201.2 for this ring.
Oct 24 16:24:57 cms2 openais: entering GATHER state from 15.
Oct 24 16:24:57 cms2 openais: Initialising service handler 'openais extended virtual synchrony service'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais cluster membership service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais availability management framework B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais checkpoint service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais event service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais distributed locking service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais message service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais configuration service'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais cluster closed process group service v1.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais CMAN membership service 2.01'
Oct 24 16:24:58 cms2 openais: CMAN 2.0.60 (built Jan 23 2007 12:42:29) started
Oct 24 16:24:58 cms2 openais: Not using a virtual synchrony filter.
Oct 24 16:24:58 cms2 openais: ERROR: Could not bind AF_UNIX: Address already in use.
Oct 24 16:24:58 cms2 openais: AIS Executive exiting (-7).
Oct 24 16:24:59 cms2 ccsd: Unable to connect to cluster infrastructure after 30 seconds.
《Solution》
Oct 24 16:24:58 cms2 openais: ERROR: Could not bind AF_UNIX: Address already in use.
《Solution》
Originally posted by hmqq on 2008-10-25 01:50
Oct 24 16:24:58 cms2 openais: ERROR: Could not bind AF_UNIX: Address already in use.
Which address does this refer to?
《Solution》
Reply to oioilu's post #3
"Which address does this refer to?" It refers to the port number(s).
cman uses UDP ports 5404 and 5405.
iptables -L
iptables -F
or
iptables -A INPUT -i eth0 -m multiport -m state --state NEW -p udp \
  -s 10.10.10.0/24 -d 10.10.10.0/24 --dports 5404,5405 -j ACCEPT
eth0 stands for the cluster interface (-i takes an interface name, not the server IP such as 10.10.10.200)
10.10.10.0/24 is your network
If you do not want to get involved with iptables, just run:
service iptables stop
chkconfig iptables off
Another way to see which ports are in use right now:
nmap -sS -O localhost
or
netstat -an
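To narrow the `netstat -an` output down to the two cluster ports mentioned above, a small filter can help. This is only a sketch: the function name `cluster_udp_listeners` is made up for illustration, and it assumes the usual net-tools column layout (protocol in column 1, local address in column 4).

```shell
# Hypothetical helper: read `netstat -an` output on stdin and print the local
# address of every UDP socket bound to port 5404 or 5405.
cluster_udp_listeners() {
    awk '$1 ~ /^udp/ && $4 ~ /:(5404|5405)$/ { print $4 }'
}

# Usage on the affected node:
#   netstat -an | cluster_udp_listeners
```

If this prints anything while cman is supposed to be stopped, some process is still holding the ports.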
It would be nice to know your OS: RHEL 5, RHEL 4, or something else.
[ Last edited by gl00ad on 2008-10-25 09:51 ]
《Solution》
Reply to gl00ad's post #4
Thanks a lot!
I am on RHEL 5:
# uname -an
Linux cms2 2.6.18-8.el5 #1 SMP Fri Jan 26 14:15:21 EST 2007 i686 i686 i386 GNU/Linux
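UDP 5404/5405 are already open in my iptables rules, but the error persists, and disabling iptables makes no difference. Is there anything else I should check?
The configuration is as follows: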
# more /etc/sysconfig/iptables
# Generated by iptables-save v1.3.5 on Sat Oct 25 18:48:52 2008
*filter
:INPUT ACCEPT
:FORWARD ACCEPT
:OUTPUT ACCEPT
:RH-Firewall-1-INPUT -
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p esp -j ACCEPT
-A RH-Firewall-1-INPUT -p ah -j ACCEPT
-A RH-Firewall-1-INPUT -d 224.0.0.251 -p udp -m udp --dport 5353 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 21 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 5404 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 5405 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 16851 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 21064 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41966 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41967 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41968 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41969 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 50006 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 50008 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 50009 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 50007 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 161 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 8070 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 873 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Sat Oct 25 18:48:52 2008
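A rule file of this size is easy to misread, so a mechanical check that both cluster ports are accepted may be useful. The sketch below is an assumption-laden helper, not part of iptables: `rules_accept_udp_ports` is a hypothetical name, and it only recognises single-port `--dport N` rules written as in the listing above (a `--dports 5404,5405` multiport rule would not be detected).

```shell
# Hypothetical helper: read iptables-save style rules on stdin and succeed only
# if there is a UDP ACCEPT rule for every port number passed as an argument.
rules_accept_udp_ports() {
    rules="$(cat)"
    for port in "$@"; do
        printf '%s\n' "$rules" |
            grep -E -- "-p udp .*--dport ${port}( |$).*-j ACCEPT" >/dev/null || return 1
    done
    return 0
}

# Usage:
#   rules_accept_udp_ports 5404 5405 < /etc/sysconfig/iptables && echo "cluster ports accepted"
```

A successful check only says the firewall is not blocking these ports; it does not explain a bind error.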
《Solution》
Reply to oioilu's post #5
I cannot believe this is taking so long.
Let us do this:
chkconfig iptables off
service iptables stop
netstat -an|grep udp
cat /etc/cluster/cluster.conf
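Another thing to look at is multicast, since openais uses multicast. I am not certain about this, and I still think your problem should be simpler than that, but you never know.
《Solution》
You need to check whether any addresses in your cluster's network environment could conflict.
Such conflicts can involve broadcast addresses and floating IP addresses.
Please provide /etc/hosts, /etc/cluster/cluster.conf, and the output of ifconfig.
The simplest test is to connect the heartbeat link directly and see whether cman starts; if it does, there is certainly a conflicting IP on the network.
《Solution》
I tested this earlier: whenever the machine is rebooted (with iptables enabled and its configuration unchanged), cman runs. But this is now a production environment, so rebooting is no longer an option.
The heartbeat link between the two nodes is already a direct connection.
Judging by the unicast IPs, there should be no duplicates. I suspect the multicast address that RHCS uses for internal communication may be the problem. cms1 runs normally, but this node, cms2, does not work.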
# clustat
CMAN is not running.
# netstat -an|grep udp
udp 0 0 0.0.0.0:32769 0.0.0.0:*
udp 0 0 0.0.0.0:514 0.0.0.0:*
udp 0 0 127.0.0.1:5405 0.0.0.0:*
udp 0 0 127.0.0.1:5149 0.0.0.0:*
udp 0 0 226.94.1.1:5405 0.0.0.0:*
udp 0 0 0.0.0.0:161 0.0.0.0:*
udp 0 0 0.0.0.0:825 0.0.0.0:*
udp 0 0 0.0.0.0:828 0.0.0.0:*
udp 0 0 0.0.0.0:5353 0.0.0.0:*
udp 0 0 0.0.0.0:111 0.0.0.0:*
udp 0 0 0.0.0.0:631 0.0.0.0:*
udp 0 0 :::32770 :::*
udp 0 0 :::32771 :::*
udp 0 0 :::32772 :::*
udp 0 0 :::2463 :::*
udp 0 0 :::50007 :::*
udp 0 0 :::5353 :::*
# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="cms" config_version="4" name="cms">
<fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>
<clusternodes>
<clusternode name="cms1" nodeid="1" votes="1">
<fence>
<method name="1">
<device lanplus="1" name="cms1-fence"/>
</method>
</fence>
</clusternode>
<clusternode name="cms2" nodeid="2" votes="1">
<fence>
<method name="1">
<device lanplus="1" name="cms2-fence"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman expected_votes="1" two_node="1"/>
<fencedevices>
<fencedevice agent="fence_ipmilan" auth="password" ipaddr="192.168.201.11" login="a" name="cms1-fence" passwd="a"/>
<fencedevice agent="fence_ipmilan" auth="password" ipaddr="192.168.201.12" login="a" name="cms2-fence" passwd="a"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>
# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost cms2.Guangdong
::1 localhost6.localdomain6 localhost cms2.Guangdong6
192.168.201.1 cms1
192.168.201.2 cms2
# ifconfig
bond0 Link encap:Ethernet HWaddr 00:1E:4F:39:91:94
inet addr:IPA Bcast:IPB Mask:255.255.255.240
inet6 addr: fe80::21e:4fff:fe39:9194/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:4177457574 errors:0 dropped:0 overruns:0 frame:0
TX packets:608846683 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2005133999 (1.8 GiB) TX bytes:1653861240 (1.5 GiB)
eth0 Link encap:Ethernet HWaddr 00:1E:4F:39:91:92
inet addr:192.168.201.2 Bcast:192.168.201.255 Mask:255.255.255.0
inet6 addr: fe80::21e:4fff:fe39:9192/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:34221328 errors:0 dropped:0 overruns:0 frame:0
TX packets:24361128 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1013490086 (966.5 MiB) TX bytes:3409997433 (3.1 GiB)
Interrupt:169 Memory:da000000-da012100
eth1 Link encap:Ethernet HWaddr 00:1E:4F:39:91:94
inet6 addr: fe80::21e:4fff:fe39:9194/64 Scope:Link
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:4177451384 errors:0 dropped:0 overruns:0 frame:0
TX packets:608846666 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2003644680 (1.8 GiB) TX bytes:1653856872 (1.5 GiB)
Interrupt:169 Memory:d6000000-d6012100
eth2 Link encap:Ethernet HWaddr 00:1E:4F:39:91:94
inet6 addr: fe80::21e:4fff:fe39:9194/64 Scope:Link
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:6190 errors:0 dropped:0 overruns:0 frame:0
TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1489319 (1.4 MiB) TX bytes:4368 (4.2 KiB)
Base address:0xdce0 Memory:d5ee0000-d5f00000
[ Last edited by oioilu on 2008-10-26 09:56 ]
《Solution》
Reply to oioilu's post #8
udp 0 0 127.0.0.1:5405 0.0.0.0:*
udp 0 0 127.0.0.1:5149 0.0.0.0:*
udp 0 0 226.94.1.1:5405 0.0.0.0:*
Port 5405 is already in use: both 127.0.0.1:5405 and the multicast address 226.94.1.1:5405 are bound. You need to look into this.
Run these and post the output:
lsof -i @127.0.0.1:5405
lsof -i @226.94.1.1:5405
A better way:
lsof -i UDP:5405
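When several sockets belong to the same process, the `lsof` listing repeats it once per socket. A small filter can reduce the listing to one line per owner. This is a sketch only: `lsof_owners` is a hypothetical name, and it assumes the default `lsof` layout with a header line and COMMAND and PID in the first two columns.

```shell
# Hypothetical helper: read default `lsof -i` output on stdin, skip the header
# line, and print each distinct "COMMAND PID" pair once.
lsof_owners() {
    awk 'NR > 1 && !seen[$1 " " $2]++ { print $1, $2 }'
}

# Usage:
#   lsof -i UDP:5405 | lsof_owners
```

If only the PIDs are needed, `lsof -t -i UDP:5405` prints them directly.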
[ Last edited by gl00ad on 2008-10-26 11:35 ]
《Solution》
Reply to gl00ad's post #9
Here is the output
# lsof -i UDP:5405
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
aisexec 9884 root 3u IPv4 37348117 UDP 226.94.1.1:netsupport
aisexec 9884 root 5u IPv4 37348119 UDP localhost.localdomain:netsupport
Strange. Doesn't aisexec get restarted automatically by service cman restart?
# ps -ef | grep aisexec
root 9884 1 5 Jul31 ? 4-13:17:07 aisexec
root 13708 13649 0 18:44 pts/1 00:00:00 grep aisexec
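The `ps` output above shows an aisexec process (PID 9884) that was started on Jul 31, which suggests the old executive survived `service cman restart` and kept its sockets. A check for such leftovers after stopping the service could look like the sketch below. `pids_named` is a hypothetical helper, not part of the cluster suite; it only filters `ps` output and does not kill anything.

```shell
# Hypothetical helper: read `ps -e -o pid= -o comm=` output on stdin and print
# the PID of every process whose command name equals the first argument.
pids_named() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Usage, after `service cman stop` has returned:
#   ps -e -o pid= -o comm= | pids_named aisexec
```

Matching on the exact command name avoids the false hit on `grep aisexec` seen in the listing above.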
I killed process 9884, and now fencing fails instead.
# service cman restart
Stopping cluster:
Stopping fencing... done
Stopping cman... done
Stopping ccsd... done
Unmounting configfs... done
[ OK ]
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... done
Starting daemons... done
Starting fencing... failed
The relevant syslog entries have changed.
Here is the log on node 2:
Oct 26 18:45:05 cms2 ccsd: Starting ccsd 2.0.60:
Oct 26 18:45:05 cms2 ccsd: Built: Jan 23 2007 12:42:25
Oct 26 18:45:05 cms2 ccsd: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Oct 26 18:45:05 cms2 ccsd: cluster.conf (cluster name = cms, version = 4) found.
Oct 26 18:45:06 cms2 openais: AIS Executive Service RELEASE 'subrev 1324 version 0.80.2'
Oct 26 18:45:06 cms2 openais: Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Oct 26 18:45:06 cms2 openais: Copyright (C) 2006 Red Hat, Inc.
Oct 26 18:45:06 cms2 openais: AIS Executive Service: started and ready to provide service.
Oct 26 18:45:06 cms2 openais: Using default multicast address of 239.192.2.219
Oct 26 18:45:06 cms2 openais: openais component openais_cpg loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais cluster closed process group service v1.01'
Oct 26 18:45:06 cms2 openais: openais component openais_cfg loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais configuration service'
Oct 26 18:45:06 cms2 openais: openais component openais_msg loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais message service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_lck loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais distributed locking service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_evt loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais event service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_ckpt loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais checkpoint service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_amf loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais availability management framework B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_clm loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais cluster membership service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_evs loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais extended virtual synchrony service'
Oct 26 18:45:06 cms2 openais: openais component openais_cman loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais CMAN membership service 2.01'
Oct 26 18:45:07 cms2 openais: Token Timeout (10000 ms) retransmit timeout (495 ms)
Oct 26 18:45:07 cms2 openais: token hold (386 ms) retransmits before loss (20 retrans)
Oct 26 18:45:07 cms2 openais: join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Oct 26 18:45:07 cms2 openais: downcheck (1000 ms) fail to recv const (50 msgs)
Oct 26 18:45:07 cms2 openais: seqno unchanged const (30 rotations) Maximum network MTU 1500
Oct 26 18:45:07 cms2 openais: window size per rotation (50 messages) maximum messages per rotation (17 messages)
Oct 26 18:45:07 cms2 openais: send threads (0 threads)
Oct 26 18:45:07 cms2 openais: RRP token expired timeout (495 ms)
Oct 26 18:45:07 cms2 openais: RRP token problem counter (2000 ms)
Oct 26 18:45:07 cms2 openais: RRP threshold (10 problem count)
Oct 26 18:45:07 cms2 openais: RRP mode set to none.
Oct 26 18:45:07 cms2 openais: heartbeat_failures_allowed (0)
Oct 26 18:45:07 cms2 openais: max_network_delay (50 ms)
Oct 26 18:45:07 cms2 openais: HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Oct 26 18:45:07 cms2 openais: Receive multicast socket recv buffer size (262142 bytes).
Oct 26 18:45:07 cms2 openais: Transmit multicast socket send buffer size (262142 bytes).
Oct 26 18:45:07 cms2 openais: The network interface is now up.
Oct 26 18:45:07 cms2 openais: Created or loaded sequence id 0.192.168.201.2 for this ring.
Oct 26 18:45:07 cms2 openais: entering GATHER state from 15.
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais extended virtual synchrony service'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais cluster membership service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais availability management framework B.01.01'
Oct 26 18:45:07 cms2 ccsd: Initial status:: Quorate
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais checkpoint service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais event service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais distributed locking service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais message service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais configuration service'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais cluster closed process group service v1.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais CMAN membership service 2.01'
Oct 26 18:45:07 cms2 openais: CMAN 2.0.60 (built Jan 23 2007 12:42:29) started
Oct 26 18:45:07 cms2 openais: Not using a virtual synchrony filter.
Oct 26 18:45:07 cms2 openais: Creating commit token because I am the rep.
Oct 26 18:45:07 cms2 openais: Saving state aru 0 high seq received 0
Oct 26 18:45:07 cms2 openais: entering COMMIT state.
Oct 26 18:45:07 cms2 openais: entering RECOVERY state.
Oct 26 18:45:07 cms2 openais: position member 192.168.201.2:
Oct 26 18:45:07 cms2 openais: previous ring seq 0 rep 192.168.201.2
Oct 26 18:45:07 cms2 openais: aru 0 high delivered 0 received flag 0
Oct 26 18:45:07 cms2 openais: Did not need to originate any messages in recovery.
Oct 26 18:45:07 cms2 openais: Storing new sequence id for ring 4
Oct 26 18:45:07 cms2 openais: Sending initial ORF token
Oct 26 18:45:07 cms2 openais: CLM CONFIGURATION CHANGE
Oct 26 18:45:07 cms2 openais: New Configuration:
Oct 26 18:45:07 cms2 openais: Members Left:
Oct 26 18:45:07 cms2 openais: Members Joined:
Oct 26 18:45:07 cms2 openais: This node is within the primary component and will provide service.
Oct 26 18:45:07 cms2 openais: CLM CONFIGURATION CHANGE
Oct 26 18:45:07 cms2 openais: New Configuration:
Oct 26 18:45:07 cms2 openais: r(0) ip(192.168.201.2)
Oct 26 18:45:07 cms2 openais: Members Left:
Oct 26 18:45:07 cms2 openais: Members Joined:
Oct 26 18:45:07 cms2 openais: r(0) ip(192.168.201.2)
Oct 26 18:45:07 cms2 openais: This node is within the primary component and will provide service.
Oct 26 18:45:07 cms2 openais: entering OPERATIONAL state.
Oct 26 18:45:07 cms2 openais: quorum regained, resuming activity
Oct 26 18:45:07 cms2 openais: got nodejoin message 192.168.201.2
Oct 26 18:45:07 cms2 openais: entering GATHER state from 11.
Oct 26 18:45:07 cms2 openais: Saving state aru 9 high seq received 9
Oct 26 18:45:07 cms2 openais: entering COMMIT state.
Oct 26 18:45:07 cms2 openais: entering RECOVERY state.
Oct 26 18:45:07 cms2 openais: position member 192.168.201.1:
Oct 26 18:45:07 cms2 openais: previous ring seq 116 rep 192.168.201.1
Oct 26 18:45:07 cms2 openais: aru c high delivered c received flag 0
Oct 26 18:45:07 cms2 openais: position member 192.168.201.2:
Oct 26 18:45:07 cms2 openais: previous ring seq 4 rep 192.168.201.2
Oct 26 18:45:07 cms2 openais: aru 9 high delivered 9 received flag 0
Oct 26 18:45:08 cms2 openais: Did not need to originate any messages in recovery.
Oct 26 18:45:08 cms2 openais: Storing new sequence id for ring 78
Oct 26 18:45:08 cms2 groupd: found uncontrolled kernel object rgmanager in /sys/kernel/dlm
Oct 26 18:45:08 cms2 openais: CLM CONFIGURATION CHANGE
Oct 26 18:45:08 cms2 groupd: found uncontrolled kernel object clvmd in /sys/kernel/dlm
Oct 26 18:45:08 cms2 openais: New Configuration:
Oct 26 18:45:08 cms2 groupd: local node must be reset to clear 2 uncontrolled instances of gfs and/or dlm
Oct 26 18:45:08 cms2 openais: r(0) ip(192.168.201.2)
Oct 26 18:45:08 cms2 openais: Members Left:
Oct 26 18:45:08 cms2 openais: Members Joined:
Oct 26 18:45:08 cms2 openais: This node is within the primary component and will provide service.
Oct 26 18:45:08 cms2 openais: CLM CONFIGURATION CHANGE
Oct 26 18:45:08 cms2 openais: New Configuration:
Oct 26 18:45:08 cms2 openais: r(0) ip(192.168.201.1)
Oct 26 18:45:08 cms2 openais: r(0) ip(192.168.201.2)
Oct 26 18:45:08 cms2 openais: Members Left:
Oct 26 18:45:08 cms2 openais: Members Joined:
Oct 26 18:45:08 cms2 openais: r(0) ip(192.168.201.1)
Oct 26 18:45:08 cms2 openais: This node is within the primary component and will provide service.
Oct 26 18:45:08 cms2 openais: entering OPERATIONAL state.
Oct 26 18:45:08 cms2 openais: got nodejoin message 192.168.201.1
Oct 26 18:45:08 cms2 openais: got nodejoin message 192.168.201.2
Oct 26 18:45:08 cms2 openais: got joinlist message from node 1
Oct 26 18:45:08 cms2 openais: cman killed by node 2 for reason 2
Oct 26 18:45:08 cms2 dlm_controld: cluster is down, exiting
Oct 26 18:45:08 cms2 gfs_controld: cluster is down, exiting
Oct 26 18:45:08 cms2 fenced: cluster is down, exiting
Oct 26 18:45:08 cms2 kernel: dlm: closing connection to node 2
Oct 26 18:45:08 cms2 kernel: dlm: closing connection to node 1
Oct 26 18:45:35 cms2 ccsd: Unable to connect to cluster infrastructure after 30 seconds.
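In the node 2 log above, the groupd lines ("found uncontrolled kernel object ..." and "local node must be reset to clear 2 uncontrolled instances of gfs and/or dlm") appear just before cman is killed, and they are easy to miss among the openais messages. The sketch below pulls the object names out of a syslog stream. `uncontrolled_objects` is a made-up name, and the pattern is matched only against the message wording seen in this thread.

```shell
# Hypothetical helper: read syslog lines on stdin and print the name of each
# kernel object that groupd reported as uncontrolled.
uncontrolled_objects() {
    sed -n 's/.*groupd: found uncontrolled kernel object \([^ ]*\) in .*/\1/p'
}

# Usage:
#   uncontrolled_objects < /var/log/messages | sort -u
```

For the log in this thread the names are rgmanager and clvmd, both under /sys/kernel/dlm.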
Here is the log on node 1:
Oct 26 18:56:03 cms1 openais: entering GATHER state from 11.
Oct 26 18:56:03 cms1 openais: Creating commit token because I am the rep.
Oct 26 18:56:03 cms1 openais: Saving state aru c high seq received c
Oct 26 18:56:03 cms1 openais: entering COMMIT state.
Oct 26 18:56:03 cms1 openais: entering RECOVERY state.
Oct 26 18:56:03 cms1 openais: position member 192.168.201.1:
Oct 26 18:56:03 cms1 openais: previous ring seq 124 rep 192.168.201.1
Oct 26 18:56:03 cms1 openais: aru c high delivered c received flag 0
Oct 26 18:56:03 cms1 openais: position member 192.168.201.2:
Oct 26 18:56:03 cms1 openais: previous ring seq 4 rep 192.168.201.2
Oct 26 18:56:03 cms1 openais: aru 9 high delivered 9 received flag 0
Oct 26 18:56:03 cms1 openais: Did not need to originate any messages in recovery.
Oct 26 18:56:03 cms1 openais: Storing new sequence id for ring 80
Oct 26 18:56:03 cms1 openais: Sending initial ORF token
Oct 26 18:56:03 cms1 openais: CLM CONFIGURATION CHANGE
Oct 26 18:56:03 cms1 openais: New Configuration:
Oct 26 18:56:03 cms1 openais: r(0) ip(192.168.201.1)
Oct 26 18:56:03 cms1 openais: Members Left:
Oct 26 18:56:03 cms1 openais: Members Joined:
Oct 26 18:56:03 cms1 openais: This node is within the primary component and will provide service.
Oct 26 18:56:03 cms1 openais: CLM CONFIGURATION CHANGE
Oct 26 18:56:03 cms1 openais: New Configuration:
Oct 26 18:56:03 cms1 openais: r(0) ip(192.168.201.1)
Oct 26 18:56:03 cms1 openais: r(0) ip(192.168.201.2)
Oct 26 18:56:03 cms1 openais: Members Left:
Oct 26 18:56:03 cms1 openais: Members Joined:
Oct 26 18:56:03 cms1 openais: r(0) ip(192.168.201.2)
Oct 26 18:56:03 cms1 openais: This node is within the primary component and will provide service.
Oct 26 18:56:03 cms1 openais: entering OPERATIONAL state.
Oct 26 18:56:03 cms1 openais: got nodejoin message 192.168.201.1
Oct 26 18:56:03 cms1 openais: got nodejoin message 192.168.201.2
Oct 26 18:56:03 cms1 openais: got joinlist message from node 1
Oct 26 18:56:14 cms1 openais: The token was lost in the OPERATIONAL state.
Oct 26 18:56:14 cms1 openais: Receive multicast socket recv buffer size (288000 bytes).
Oct 26 18:56:14 cms1 openais: Transmit multicast socket send buffer size (288000 bytes).
Oct 26 18:56:14 cms1 openais: entering GATHER state from 2.
Oct 26 18:56:19 cms1 openais: entering GATHER state from 0.
Oct 26 18:56:19 cms1 openais: Creating commit token because I am the rep.
Oct 26 18:56:19 cms1 openais: Saving state aru 17 high seq received 17
Oct 26 18:56:19 cms1 openais: entering COMMIT state.
Oct 26 18:56:19 cms1 openais: entering RECOVERY state.
Oct 26 18:56:19 cms1 openais: position member 192.168.201.1:
Oct 26 18:56:19 cms1 openais: previous ring seq 128 rep 192.168.201.1
Oct 26 18:56:19 cms1 openais: aru 17 high delivered 17 received flag 0
Oct 26 18:56:19 cms1 openais: Did not need to originate any messages in recovery.
Oct 26 18:56:19 cms1 openais: Storing new sequence id for ring 84
Oct 26 18:56:19 cms1 openais: Sending initial ORF token
Oct 26 18:56:19 cms1 openais: CLM CONFIGURATION CHANGE
Oct 26 18:56:19 cms1 openais: New Configuration:
Oct 26 18:56:19 cms1 kernel: dlm: closing connection to node 2
Oct 26 18:56:19 cms1 openais: r(0) ip(192.168.201.1)
Oct 26 18:56:19 cms1 openais: Members Left:
Oct 26 18:56:19 cms1 openais: r(0) ip(192.168.201.2)
Oct 26 18:56:19 cms1 openais: Members Joined:
Oct 26 18:56:19 cms1 openais: This node is within the primary component and will provide service.
Oct 26 18:56:19 cms1 openais: CLM CONFIGURATION CHANGE
Oct 26 18:56:19 cms1 openais: New Configuration:
Oct 26 18:56:19 cms1 openais: r(0) ip(192.168.201.1)
Oct 26 18:56:19 cms1 openais: Members Left:
Oct 26 18:56:19 cms1 openais: Members Joined:
Oct 26 18:56:19 cms1 openais: This node is within the primary component and will provide service.
Oct 26 18:56:19 cms1 openais: entering OPERATIONAL state.
Oct 26 18:56:19 cms1 openais: got nodejoin message 192.168.201.1
Oct 26 18:56:19 cms1 openais: got joinlist message from node 1
[ Last edited by oioilu on 2008-10-26 19:09 ]