
cman fails to start

火星人 @ 2014-03-04, reply:0

cman fails to start, and the output gives no specific reason. Can anyone tell me why?

The relevant information is as follows:

# service cman restart
Stopping cluster:
   Stopping fencing... done
   Stopping cman... done
   Stopping ccsd... done
   Unmounting configfs... done
[  OK  ]
Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Starting ccsd... done
   Starting cman... failed




The messages log shows the following:
Oct 24 16:24:57 cms2 openais: heartbeat_failures_allowed (0)
Oct 24 16:24:57 cms2 openais: max_network_delay (50 ms)
Oct 24 16:24:57 cms2 openais: HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Oct 24 16:24:57 cms2 openais: Receive multicast socket recv buffer size (262142 bytes).
Oct 24 16:24:57 cms2 openais: Transmit multicast socket send buffer size (262142 bytes).
Oct 24 16:24:57 cms2 openais: The network interface is now up.
Oct 24 16:24:57 cms2 openais: Created or loaded sequence id 0.192.168.201.2 for this ring.
Oct 24 16:24:57 cms2 openais: entering GATHER state from 15.
Oct 24 16:24:57 cms2 openais: Initialising service handler 'openais extended virtual synchrony service'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais cluster membership service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais availability management framework B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais checkpoint service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais event service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais distributed locking service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais message service B.01.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais configuration service'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais cluster closed process group service v1.01'
Oct 24 16:24:58 cms2 openais: Initialising service handler 'openais CMAN membership service 2.01'
Oct 24 16:24:58 cms2 openais: CMAN 2.0.60 (built Jan 23 2007 12:42:29) started
Oct 24 16:24:58 cms2 openais: Not using a virtual synchrony filter.
Oct 24 16:24:58 cms2 openais: ERROR: Could not bind AF_UNIX: Address already in use.
Oct 24 16:24:58 cms2 openais: AIS Executive exiting (-7).
Oct 24 16:24:59 cms2 ccsd: Unable to connect to cluster infrastructure after 30 seconds.
《Solution》

Oct 24 16:24:58 cms2 openais: ERROR: Could not bind AF_UNIX: Address already in use.
《Solution》

Originally posted by hmqq on 2008-10-25 01:50
Oct 24 16:24:58 cms2 openais:  ERROR: Could not bind AF_UNIX: Address already in use.

Which address does this refer to?
《Solution》

Reply to post #3 by oioilu

"Which address does this refer to?" It means port number(s):
cman uses UDP ports 5404 and 5405.

iptables -L

iptables -F
or
iptables -A INPUT -i eth0 -p udp -s 10.10.10.0/24 -d 10.10.10.0/24 \
  -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT

eth0 stands for the interface that carries cluster traffic; substitute your own (-i takes an interface name, not an IP address such as 10.10.10.200, your server IP).

10.10.10.0/24 is your network.
If you do not want to deal with iptables, just run:
service iptables stop
chkconfig iptables off

Another way to see which ports are in use:

nmap -sS -O localhost
or
netstat -an

It would also help to know your OS: RHEL 5, RHEL 4, or something else.

[ Last edited by gl00ad on 2008-10-25 09:51 ]
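
The netstat check suggested above can be narrowed to a single port. This is only a minimal sketch: the function name udp_bound_on_port is made up for illustration, and it assumes the column layout that `netstat -an` prints on Linux, with the local address in the fourth column.

```shell
#!/bin/sh
# Hypothetical helper: print the local addresses of UDP sockets bound to
# a given port. It reads `netstat -an`-style text on stdin, so it can be
# tried on saved output as well as on a live system.
udp_bound_on_port() {
    port="$1"
    awk -v port="$port" '
        $1 ~ /^udp/ {
            # Column 4 is the local address; the port follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) print $4
        }'
}

# Typical use on the node where cman fails to start:
#   netstat -an | udp_bound_on_port 5405
```

If this prints anything while cman is supposed to be stopped, some process is still holding the port.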
《Solution》

Reply to post #4 by gl00ad

Thanks a lot!
I am running RHEL 5.

# uname -an
Linux cms2 2.6.18-8.el5 #1 SMP Fri Jan 26 14:15:21 EST 2007 i686 i686 i386 GNU/Linux

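UDP 5404/5405 are already open in my iptables rules, but the error persists, and it is the same with iptables disabled. Is there anything else I should check?

The configuration is as follows: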

# more /etc/sysconfig/iptables
# Generated by iptables-save v1.3.5 on Sat Oct 25 18:48:52 2008
*filter
:INPUT ACCEPT
:FORWARD ACCEPT
:OUTPUT ACCEPT
:RH-Firewall-1-INPUT -
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p esp -j ACCEPT
-A RH-Firewall-1-INPUT -p ah -j ACCEPT
-A RH-Firewall-1-INPUT -d 224.0.0.251 -p udp -m udp --dport 5353 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 21 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 5404 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 5405 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 16851 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 21064 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41966 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41967 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41968 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 41969 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 50006 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 50008 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 50009 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 50007 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 161 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 8070 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 873 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
# Completed on Sat Oct 25 18:48:52 2008
《Solution》

Reply to post #5 by oioilu

I can't believe this is taking so long.
Let's try this:
chkconfig iptables off
service iptables stop
netstat -an|grep udp
cat /etc/cluster/cluster.conf

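Another thing to look at is multicast, since openais uses multicast. I am not sure about this, though; I still think your problem should be something simpler, but you never know.
《Solution》

You need to make sure that no address in the cluster's network environment can conflict.
Such conflicts may involve broadcast addresses and floating IP addresses.

Please provide /etc/hosts, /etc/cluster/cluster.conf, and the ifconfig output.
Also, the simplest test is to connect the heartbeat link directly and see whether cman can start; if it can, that confirms there is a conflicting IP on the network.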
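
One quick thing to look at in /etc/hosts is whether a cluster node name maps to more than one address. The following is a minimal sketch, not a full resolver check: the function name hosts_addrs_for is made up for illustration, it matches whole names only, and it does not handle inline comments.

```shell
#!/bin/sh
# Hypothetical helper: print every address that a hosts-format file
# (read on stdin) assigns to the given name.
hosts_addrs_for() {
    name="$1"
    awk -v name="$name" '
        /^[ \t]*#/ { next }    # skip comment lines
        {
            # Field 1 is the address; the remaining fields are names.
            for (i = 2; i <= NF; i++)
                if ($i == name) print $1
        }'
}

# Typical use, once per node name listed in cluster.conf:
#   hosts_addrs_for cms1 < /etc/hosts
#   hosts_addrs_for cms2 < /etc/hosts
```

More than one line of output for a node name would be worth a closer look before blaming the network.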
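《Solution》

I tested this earlier: as long as the machine is rebooted (with iptables enabled and its configuration unchanged), cman runs. But this is now a production environment, so rebooting is no longer an option.

The heartbeat link between the two machines is currently a direct connection.

Judging by the unicast IPs, there should be no duplicates. My feeling is that the multicast address RHCS uses for internal communication may be the problem. cms1 runs normally, but this machine, cms2, does not work.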

# clustat
CMAN is not running.
# netstat -an|grep udp
udp        0      0 0.0.0.0:32769               0.0.0.0:*                              
udp        0      0 0.0.0.0:514                 0.0.0.0:*                              
udp        0      0 127.0.0.1:5405              0.0.0.0:*                              
udp        0      0 127.0.0.1:5149              0.0.0.0:*                              
udp        0      0 226.94.1.1:5405             0.0.0.0:*                              
udp        0      0 0.0.0.0:161                 0.0.0.0:*                              
udp        0      0 0.0.0.0:825                 0.0.0.0:*                              
udp        0      0 0.0.0.0:828                 0.0.0.0:*                              
udp        0      0 0.0.0.0:5353                0.0.0.0:*                              
udp        0      0 0.0.0.0:111                 0.0.0.0:*                              
udp        0      0 0.0.0.0:631                 0.0.0.0:*                              
udp        0      0 :::32770                    :::*                                    
udp        0      0 :::32771                    :::*                                    
udp        0      0 :::32772                    :::*                                    
udp        0      0 :::2463                     :::*                                    
udp        0      0 :::50007                    :::*                                    
udp        0      0 :::5353                     :::*                             

# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="cms" config_version="4" name="cms">
        <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="cms1" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device lanplus="1" name="cms1-fence"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="cms2" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device lanplus="1" name="cms2-fence"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_ipmilan" auth="password" ipaddr="192.168.201.11" login="a" name="cms1-fence" passwd="a"/>
                <fencedevice agent="fence_ipmilan" auth="password" ipaddr="192.168.201.12" login="a" name="cms2-fence" passwd="a"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
</cluster>



# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       localhost.localdomain   localhost       cms2.Guangdong
::1     localhost6.localdomain6 localhost       cms2.Guangdong6
192.168.201.1 cms1
192.168.201.2 cms2




# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:1E:4F:39:91:94  
          inet addr:IPA  Bcast:IPB  Mask:255.255.255.240
          inet6 addr: fe80::21e:4fff:fe39:9194/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:4177457574 errors:0 dropped:0 overruns:0 frame:0
          TX packets:608846683 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2005133999 (1.8 GiB)  TX bytes:1653861240 (1.5 GiB)

eth0      Link encap:Ethernet  HWaddr 00:1E:4F:39:91:92  
          inet addr:192.168.201.2  Bcast:192.168.201.255  Mask:255.255.255.0
          inet6 addr: fe80::21e:4fff:fe39:9192/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:34221328 errors:0 dropped:0 overruns:0 frame:0
          TX packets:24361128 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1013490086 (966.5 MiB)  TX bytes:3409997433 (3.1 GiB)
          Interrupt:169 Memory:da000000-da012100

eth1      Link encap:Ethernet  HWaddr 00:1E:4F:39:91:94  
          inet6 addr: fe80::21e:4fff:fe39:9194/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:4177451384 errors:0 dropped:0 overruns:0 frame:0
          TX packets:608846666 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2003644680 (1.8 GiB)  TX bytes:1653856872 (1.5 GiB)
          Interrupt:169 Memory:d6000000-d6012100

eth2      Link encap:Ethernet  HWaddr 00:1E:4F:39:91:94  
          inet6 addr: fe80::21e:4fff:fe39:9194/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:6190 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1489319 (1.4 MiB)  TX bytes:4368 (4.2 KiB)
          Base address:0xdce0 Memory:d5ee0000-d5f00000

[ Last edited by oioilu on 2008-10-26 09:56 ]
《Solution》

Reply to post #8 by oioilu

udp        0      0 127.0.0.1:5405              0.0.0.0:*                              
udp        0      0 127.0.0.1:5149              0.0.0.0:*                              
udp        0      0 226.94.1.1:5405             0.0.0.0:*

Port 5405 is already in use, on 127.0.0.1:5405 and on the multicast address 226.94.1.1:5405. You need to look into this.

Run these and tell us the output:


lsof -i @127.0.0.1:5405
lsof -i @226.94.1.1:5405

Or, better:

lsof -i UDP:5405


[ Last edited by gl00ad on 2008-10-26 11:35 ]
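
To reduce the lsof output to just the owning processes, something like the following could be used. It is a minimal sketch: the function name lsof_owners is made up for illustration, and it assumes the default lsof column layout, with COMMAND first and PID second under a single header line.

```shell
#!/bin/sh
# Hypothetical helper: read `lsof -i` output on stdin and print one
# "PID COMMAND" line per distinct process that holds a matching socket.
lsof_owners() {
    # Skip the header line, keep PID and command name, drop duplicates.
    awk 'NR > 1 { print $2, $1 }' | sort -u
}

# Typical use:
#   lsof -i UDP:5405 | lsof_owners
```

A single process holding both sockets then shows up once, which makes it easier to compare against `ps`.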
《Solution》

Reply to post #9 by gl00ad

Here is the output:



# lsof -i UDP:5405
COMMAND  PID USER   FD   TYPE   DEVICE SIZE NODE NAME
aisexec 9884 root    3u  IPv4 37348117       UDP 226.94.1.1:netsupport
aisexec 9884 root    5u  IPv4 37348119       UDP localhost.localdomain:netsupport



Strange. Doesn't aisexec get restarted automatically along with service cman restart?


# ps -ef | grep aisexec
root      9884     1  5 Jul31 ?        4-13:17:07 aisexec
root     13708 13649  0 18:44 pts/1    00:00:00 grep aisexec


I killed process 9884; after that, fencing fails instead.


# service cman restart
Stopping cluster:
   Stopping fencing... done
   Stopping cman... done
   Stopping ccsd... done
   Unmounting configfs... done
[  OK  ]
Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Starting ccsd... done
   Starting cman... done
   Starting daemons... done
   Starting fencing... failed
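
The ps output earlier in this post shows aisexec (PID 9884) running since Jul 31, so it had survived the earlier service cman restart. A check along the following lines, run after stopping cman and before starting it again, would reveal such a leftover. It is only a sketch: the function name pids_by_command is made up for illustration, and it assumes `ps -eo pid,comm` prints one header line followed by PID and command name.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -eo pid,comm` output on stdin and print
# the PIDs of processes whose command name equals the given name.
pids_by_command() {
    name="$1"
    awk -v name="$name" 'NR > 1 && $2 == name { print $1 }'
}

# Typical use after `service cman stop`:
#   ps -eo pid,comm | pids_by_command aisexec
# Any PID printed here belongs to an aisexec that the init script did
# not stop.
```

Whether killing such a process is safe on a production node is a separate question; the sketch only reports what is still running.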





The relevant syslog output has changed.
Below is the log on node2:

Oct 26 18:45:05 cms2 ccsd: Starting ccsd 2.0.60:
Oct 26 18:45:05 cms2 ccsd:  Built: Jan 23 2007 12:42:25
Oct 26 18:45:05 cms2 ccsd:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Oct 26 18:45:05 cms2 ccsd: cluster.conf (cluster name = cms, version = 4) found.
Oct 26 18:45:06 cms2 openais: AIS Executive Service RELEASE 'subrev 1324 version 0.80.2'
Oct 26 18:45:06 cms2 openais: Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Oct 26 18:45:06 cms2 openais: Copyright (C) 2006 Red Hat, Inc.
Oct 26 18:45:06 cms2 openais: AIS Executive Service: started and ready to provide service.
Oct 26 18:45:06 cms2 openais: Using default multicast address of 239.192.2.219
Oct 26 18:45:06 cms2 openais: openais component openais_cpg loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais cluster closed process group service v1.01'
Oct 26 18:45:06 cms2 openais: openais component openais_cfg loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais configuration service'
Oct 26 18:45:06 cms2 openais: openais component openais_msg loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais message service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_lck loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais distributed locking service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_evt loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais event service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_ckpt loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais checkpoint service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_amf loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais availability management framework B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_clm loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais cluster membership service B.01.01'
Oct 26 18:45:06 cms2 openais: openais component openais_evs loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais extended virtual synchrony service'
Oct 26 18:45:06 cms2 openais: openais component openais_cman loaded.
Oct 26 18:45:06 cms2 openais: Registering service handler 'openais CMAN membership service 2.01'
Oct 26 18:45:07 cms2 openais: Token Timeout (10000 ms) retransmit timeout (495 ms)
Oct 26 18:45:07 cms2 openais: token hold (386 ms) retransmits before loss (20 retrans)
Oct 26 18:45:07 cms2 openais: join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Oct 26 18:45:07 cms2 openais: downcheck (1000 ms) fail to recv const (50 msgs)
Oct 26 18:45:07 cms2 openais: seqno unchanged const (30 rotations) Maximum network MTU 1500
Oct 26 18:45:07 cms2 openais: window size per rotation (50 messages) maximum messages per rotation (17 messages)
Oct 26 18:45:07 cms2 openais: send threads (0 threads)
Oct 26 18:45:07 cms2 openais: RRP token expired timeout (495 ms)
Oct 26 18:45:07 cms2 openais: RRP token problem counter (2000 ms)
Oct 26 18:45:07 cms2 openais: RRP threshold (10 problem count)
Oct 26 18:45:07 cms2 openais: RRP mode set to none.
Oct 26 18:45:07 cms2 openais: heartbeat_failures_allowed (0)
Oct 26 18:45:07 cms2 openais: max_network_delay (50 ms)
Oct 26 18:45:07 cms2 openais: HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Oct 26 18:45:07 cms2 openais: Receive multicast socket recv buffer size (262142 bytes).
Oct 26 18:45:07 cms2 openais: Transmit multicast socket send buffer size (262142 bytes).
Oct 26 18:45:07 cms2 openais: The network interface is now up.
Oct 26 18:45:07 cms2 openais: Created or loaded sequence id 0.192.168.201.2 for this ring.
Oct 26 18:45:07 cms2 openais: entering GATHER state from 15.
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais extended virtual synchrony service'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais cluster membership service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais availability management framework B.01.01'
Oct 26 18:45:07 cms2 ccsd: Initial status:: Quorate
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais checkpoint service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais event service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais distributed locking service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais message service B.01.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais configuration service'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais cluster closed process group service v1.01'
Oct 26 18:45:07 cms2 openais: Initialising service handler 'openais CMAN membership service 2.01'
Oct 26 18:45:07 cms2 openais: CMAN 2.0.60 (built Jan 23 2007 12:42:29) started
Oct 26 18:45:07 cms2 openais: Not using a virtual synchrony filter.
Oct 26 18:45:07 cms2 openais: Creating commit token because I am the rep.
Oct 26 18:45:07 cms2 openais: Saving state aru 0 high seq received 0
Oct 26 18:45:07 cms2 openais: entering COMMIT state.
Oct 26 18:45:07 cms2 openais: entering RECOVERY state.
Oct 26 18:45:07 cms2 openais: position member 192.168.201.2:
Oct 26 18:45:07 cms2 openais: previous ring seq 0 rep 192.168.201.2
Oct 26 18:45:07 cms2 openais: aru 0 high delivered 0 received flag 0
Oct 26 18:45:07 cms2 openais: Did not need to originate any messages in recovery.
Oct 26 18:45:07 cms2 openais: Storing new sequence id for ring 4
Oct 26 18:45:07 cms2 openais: Sending initial ORF token
Oct 26 18:45:07 cms2 openais: CLM CONFIGURATION CHANGE
Oct 26 18:45:07 cms2 openais: New Configuration:
Oct 26 18:45:07 cms2 openais: Members Left:
Oct 26 18:45:07 cms2 openais: Members Joined:
Oct 26 18:45:07 cms2 openais: This node is within the primary component and will provide service.
Oct 26 18:45:07 cms2 openais: CLM CONFIGURATION CHANGE
Oct 26 18:45:07 cms2 openais: New Configuration:
Oct 26 18:45:07 cms2 openais:     r(0) ip(192.168.201.2)  
Oct 26 18:45:07 cms2 openais: Members Left:
Oct 26 18:45:07 cms2 openais: Members Joined:
Oct 26 18:45:07 cms2 openais:     r(0) ip(192.168.201.2)  
Oct 26 18:45:07 cms2 openais: This node is within the primary component and will provide service.
Oct 26 18:45:07 cms2 openais: entering OPERATIONAL state.
Oct 26 18:45:07 cms2 openais: quorum regained, resuming activity
Oct 26 18:45:07 cms2 openais: got nodejoin message 192.168.201.2
Oct 26 18:45:07 cms2 openais: entering GATHER state from 11.
Oct 26 18:45:07 cms2 openais: Saving state aru 9 high seq received 9
Oct 26 18:45:07 cms2 openais: entering COMMIT state.
Oct 26 18:45:07 cms2 openais: entering RECOVERY state.
Oct 26 18:45:07 cms2 openais: position member 192.168.201.1:
Oct 26 18:45:07 cms2 openais: previous ring seq 116 rep 192.168.201.1
Oct 26 18:45:07 cms2 openais: aru c high delivered c received flag 0
Oct 26 18:45:07 cms2 openais: position member 192.168.201.2:
Oct 26 18:45:07 cms2 openais: previous ring seq 4 rep 192.168.201.2
Oct 26 18:45:07 cms2 openais: aru 9 high delivered 9 received flag 0
Oct 26 18:45:08 cms2 openais: Did not need to originate any messages in recovery.
Oct 26 18:45:08 cms2 openais: Storing new sequence id for ring 78
Oct 26 18:45:08 cms2 groupd: found uncontrolled kernel object rgmanager in /sys/kernel/dlm
Oct 26 18:45:08 cms2 openais: CLM CONFIGURATION CHANGE
Oct 26 18:45:08 cms2 groupd: found uncontrolled kernel object clvmd in /sys/kernel/dlm
Oct 26 18:45:08 cms2 openais: New Configuration:
Oct 26 18:45:08 cms2 groupd: local node must be reset to clear 2 uncontrolled instances of gfs and/or dlm
Oct 26 18:45:08 cms2 openais:     r(0) ip(192.168.201.2)  
Oct 26 18:45:08 cms2 openais: Members Left:
Oct 26 18:45:08 cms2 openais: Members Joined:
Oct 26 18:45:08 cms2 openais: This node is within the primary component and will provide service.
Oct 26 18:45:08 cms2 openais: CLM CONFIGURATION CHANGE
Oct 26 18:45:08 cms2 openais: New Configuration:
Oct 26 18:45:08 cms2 openais:     r(0) ip(192.168.201.1)  
Oct 26 18:45:08 cms2 openais:     r(0) ip(192.168.201.2)  
Oct 26 18:45:08 cms2 openais: Members Left:
Oct 26 18:45:08 cms2 openais: Members Joined:
Oct 26 18:45:08 cms2 openais:     r(0) ip(192.168.201.1)  
Oct 26 18:45:08 cms2 openais: This node is within the primary component and will provide service.
Oct 26 18:45:08 cms2 openais: entering OPERATIONAL state.
Oct 26 18:45:08 cms2 openais: got nodejoin message 192.168.201.1
Oct 26 18:45:08 cms2 openais: got nodejoin message 192.168.201.2
Oct 26 18:45:08 cms2 openais: got joinlist message from node 1
Oct 26 18:45:08 cms2 openais: cman killed by node 2 for reason 2
Oct 26 18:45:08 cms2 dlm_controld: cluster is down, exiting
Oct 26 18:45:08 cms2 gfs_controld: cluster is down, exiting
Oct 26 18:45:08 cms2 fenced: cluster is down, exiting
Oct 26 18:45:08 cms2 kernel: dlm: closing connection to node 2
Oct 26 18:45:08 cms2 kernel: dlm: closing connection to node 1
Oct 26 18:45:35 cms2 ccsd: Unable to connect to cluster infrastructure after 30 seconds.
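
In a log this long the decisive lines are easy to miss. Here groupd reports uncontrolled kernel objects (rgmanager and clvmd in /sys/kernel/dlm) and says the local node must be reset, and right after that cman is killed. A filter such as the following pulls those lines out. It is a minimal sketch: the function name cluster_log_highlights and the pattern list are my own choice based on the messages in this thread, and /var/log/messages is assumed to be where syslog writes them.

```shell
#!/bin/sh
# Hypothetical helper: keep only the log lines (read on stdin) that
# contain one of the failure-related phrases seen in this thread.
cluster_log_highlights() {
    grep -E 'ERROR:|killed by node|uncontrolled|must be reset|cluster is down|Unable to connect'
}

# Typical use (the log path is an assumption):
#   cluster_log_highlights < /var/log/messages
```

The pattern list is deliberately short; extend it with whatever messages turn out to matter on your own nodes.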



Below is the log on node1:


Oct 26 18:56:03 cms1 openais: entering GATHER state from 11.
Oct 26 18:56:03 cms1 openais: Creating commit token because I am the rep.
Oct 26 18:56:03 cms1 openais: Saving state aru c high seq received c
Oct 26 18:56:03 cms1 openais: entering COMMIT state.
Oct 26 18:56:03 cms1 openais: entering RECOVERY state.
Oct 26 18:56:03 cms1 openais: position member 192.168.201.1:
Oct 26 18:56:03 cms1 openais: previous ring seq 124 rep 192.168.201.1
Oct 26 18:56:03 cms1 openais: aru c high delivered c received flag 0
Oct 26 18:56:03 cms1 openais: position member 192.168.201.2:
Oct 26 18:56:03 cms1 openais: previous ring seq 4 rep 192.168.201.2
Oct 26 18:56:03 cms1 openais: aru 9 high delivered 9 received flag 0
Oct 26 18:56:03 cms1 openais: Did not need to originate any messages in recovery.
Oct 26 18:56:03 cms1 openais: Storing new sequence id for ring 80
Oct 26 18:56:03 cms1 openais: Sending initial ORF token
Oct 26 18:56:03 cms1 openais: CLM CONFIGURATION CHANGE
Oct 26 18:56:03 cms1 openais: New Configuration:
Oct 26 18:56:03 cms1 openais:      r(0) ip(192.168.201.1)  
Oct 26 18:56:03 cms1 openais: Members Left:
Oct 26 18:56:03 cms1 openais: Members Joined:
Oct 26 18:56:03 cms1 openais: This node is within the primary component and will provide service.
Oct 26 18:56:03 cms1 openais: CLM CONFIGURATION CHANGE
Oct 26 18:56:03 cms1 openais: New Configuration:
Oct 26 18:56:03 cms1 openais:      r(0) ip(192.168.201.1)  
Oct 26 18:56:03 cms1 openais:      r(0) ip(192.168.201.2)  
Oct 26 18:56:03 cms1 openais: Members Left:
Oct 26 18:56:03 cms1 openais: Members Joined:
Oct 26 18:56:03 cms1 openais:      r(0) ip(192.168.201.2)  
Oct 26 18:56:03 cms1 openais: This node is within the primary component and will provide service.
Oct 26 18:56:03 cms1 openais: entering OPERATIONAL state.
Oct 26 18:56:03 cms1 openais: got nodejoin message 192.168.201.1
Oct 26 18:56:03 cms1 openais: got nodejoin message 192.168.201.2
Oct 26 18:56:03 cms1 openais: got joinlist message from node 1
Oct 26 18:56:14 cms1 openais: The token was lost in the OPERATIONAL state.
Oct 26 18:56:14 cms1 openais: Receive multicast socket recv buffer size (288000 bytes).
Oct 26 18:56:14 cms1 openais: Transmit multicast socket send buffer size (288000 bytes).
Oct 26 18:56:14 cms1 openais: entering GATHER state from 2.
Oct 26 18:56:19 cms1 openais: entering GATHER state from 0.
Oct 26 18:56:19 cms1 openais: Creating commit token because I am the rep.
Oct 26 18:56:19 cms1 openais: Saving state aru 17 high seq received 17
Oct 26 18:56:19 cms1 openais: entering COMMIT state.
Oct 26 18:56:19 cms1 openais: entering RECOVERY state.
Oct 26 18:56:19 cms1 openais: position member 192.168.201.1:
Oct 26 18:56:19 cms1 openais: previous ring seq 128 rep 192.168.201.1
Oct 26 18:56:19 cms1 openais: aru 17 high delivered 17 received flag 0
Oct 26 18:56:19 cms1 openais: Did not need to originate any messages in recovery.
Oct 26 18:56:19 cms1 openais: Storing new sequence id for ring 84
Oct 26 18:56:19 cms1 openais: Sending initial ORF token
Oct 26 18:56:19 cms1 openais: CLM CONFIGURATION CHANGE
Oct 26 18:56:19 cms1 openais: New Configuration:
Oct 26 18:56:19 cms1 kernel: dlm: closing connection to node 2
Oct 26 18:56:19 cms1 openais:      r(0) ip(192.168.201.1)  
Oct 26 18:56:19 cms1 openais: Members Left:
Oct 26 18:56:19 cms1 openais:      r(0) ip(192.168.201.2)  
Oct 26 18:56:19 cms1 openais: Members Joined:
Oct 26 18:56:19 cms1 openais: This node is within the primary component and will provide service.
Oct 26 18:56:19 cms1 openais: CLM CONFIGURATION CHANGE
Oct 26 18:56:19 cms1 openais: New Configuration:
Oct 26 18:56:19 cms1 openais:      r(0) ip(192.168.201.1)  
Oct 26 18:56:19 cms1 openais: Members Left:
Oct 26 18:56:19 cms1 openais: Members Joined:
Oct 26 18:56:19 cms1 openais: This node is within the primary component and will provide service.
Oct 26 18:56:19 cms1 openais: entering OPERATIONAL state.
Oct 26 18:56:19 cms1 openais: got nodejoin message 192.168.201.1
Oct 26 18:56:19 cms1 openais: got joinlist message from node 1


[ Last edited by oioilu on 2008-10-26 19:09 ]
