
My project is stuck on RHCS... asking jerrywjl for guidance.

火星人 @ 2014-03-04, reply: 0


The cluster nodes come online, but clustat shows no resources, and the resources won't start.
The details are as follows:
# uname -a
Linux udbapp1 2.6.18-53.el5xen #1 SMP Wed Oct 10 16:48:44 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
# more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain   localhost
::1     localhost6.localdomain6 localhost6
119.87.244.70   udbapp1.local udbapp1
119.87.244.69   udbapp2.local udbapp2
#119.87.244.70    udbapp1
#119.87.244.69    udbapp2
#


The above is the /etc/hosts file. Fencing uses HP iLO. The hardware is connected as follows:
each HP server's two NICs are bonded into bond0, with IPs 119.87.244.70 and .69; the iLO addresses are .71 and .72. The NICs and the iLO ports are all connected to the switch and can all ping each other.
# fence_ilo -a 119.87.244.71 -l redhat -p redhat123456 -o status
power is ON
success
-------------------
# fence_ilo -a 119.87.244.72 -l redhat -p redhat123456 -o status
power is ON
success

The /etc/cluster/cluster.conf file is as follows:
<?xml version="1.0" ?>
<cluster config_version="1" name="cluster_2">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="udbapp1" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence_1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="udbapp2" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="fence_2"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
<cman expected_votes="1" two_node="1">
<multicast addr="224.0.0.1"/>
</cman>
        <fencedevices>
                <fencedevice agent="fence_ilo" hostname="119.87.244.71" login="redhat" name="fence_1" passwd="redhat123456"/>
                <fencedevice agent="fence_ilo" hostname="119.87.244.72" login="redhat" name="fence_2" passwd="redhat123456"/>
        </fencedevices>
<rm>
                <failoverdomains>
                        <failoverdomain name="udbapp" ordered="1" restricted="0"/>
                        <failoverdomainnode name="udbapp1" priority="1"/>
                </failoverdomains>
                <resources>
                <ip address="119.87.244.73" monitor_link="1"/>
                </resources>
                <service autostart="1" domain="udbapp" name="apache" recovery="relocate">
                                <ip ref="119.87.244.73"/>
                                <script file="/etc/init.d/httpd" name="httpd"/>
                </service>
        </rm>
</cluster>

Output when running service cman start on both nodes at the same time:
# service cman start
Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Starting ccsd... done
   Starting cman... done
   Starting daemons... done
   Starting fencing... done
[  OK  ]

Here is what clustat shows:
# clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  udbapp1                               1 Online, Local
  udbapp2                               2 Online
It shows that the resources are not started.
# clusvcadm  -e apache -m udbapp1
Member udbapp1 trying to enable service:apache...Success
service:apache is now running on udbapp1
# ps -ef|grep httpd
root     32690 30413  0 09:08 pts/1    00:00:00 grep httpd
The resource still hasn't actually started...

# service rgmanager start
Starting Cluster Service Manager: [  OK  ]
# service rgmanager status
clurgmgrd dead but pid file exists
Starting rgmanager reports OK, but the process turns out to be dead...

Below are the /var/log/messages entries from the startup:
# tail -f /var/log/messages
Nov 30 09:11:36 udbapp2 ccsd: Starting ccsd 2.0.60:
Nov 30 09:11:36 udbapp2 ccsd:  Built: Jan 23 2007 12:42:13
Nov 30 09:11:36 udbapp2 ccsd:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Nov 30 09:11:36 udbapp2 ccsd: cluster.conf (cluster name = cluster_2, version = 1) found.
Nov 30 09:11:39 udbapp2 openais: AIS Executive Service RELEASE 'subrev 1324 version 0.80.2'
Nov 30 09:11:39 udbapp2 openais: Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Nov 30 09:11:39 udbapp2 openais: Copyright (C) 2006 Red Hat, Inc.
Nov 30 09:11:39 udbapp2 openais: AIS Executive Service: started and ready to provide service.
Nov 30 09:11:39 udbapp2 openais: openais component openais_cpg loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais cluster closed process group service v1.01'
Nov 30 09:11:39 udbapp2 openais: openais component openais_cfg loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais configuration service'
Nov 30 09:11:39 udbapp2 openais: openais component openais_msg loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais message service B.01.01'
Nov 30 09:11:39 udbapp2 openais: openais component openais_lck loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais distributed locking service B.01.01'
Nov 30 09:11:39 udbapp2 openais: openais component openais_evt loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais event service B.01.01'
Nov 30 09:11:39 udbapp2 openais: openais component openais_ckpt loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais checkpoint service B.01.01'
Nov 30 09:11:39 udbapp2 openais: openais component openais_amf loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais availability management framework B.01.01'
Nov 30 09:11:39 udbapp2 openais: openais component openais_clm loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais cluster membership service B.01.01'
Nov 30 09:11:39 udbapp2 openais: openais component openais_evs loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais extended virtual synchrony service'
Nov 30 09:11:39 udbapp2 openais: openais component openais_cman loaded.
Nov 30 09:11:39 udbapp2 openais: Registering service handler 'openais CMAN membership service 2.01'
Nov 30 09:11:39 udbapp2 openais: Token Timeout (10000 ms) retransmit timeout (495 ms)
Nov 30 09:11:40 udbapp2 openais: token hold (386 ms) retransmits before loss (20 retrans)
Nov 30 09:11:40 udbapp2 openais: join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Nov 30 09:11:40 udbapp2 openais: downcheck (1000 ms) fail to recv const (50 msgs)
Nov 30 09:11:40 udbapp2 openais: seqno unchanged const (30 rotations) Maximum network MTU 1500
Nov 30 09:11:40 udbapp2 openais: window size per rotation (50 messages) maximum messages per rotation (17 messages)
Nov 30 09:11:40 udbapp2 openais: send threads (0 threads)
Nov 30 09:11:40 udbapp2 openais: RRP token expired timeout (495 ms)
Nov 30 09:11:40 udbapp2 openais: RRP token problem counter (2000 ms)
Nov 30 09:11:40 udbapp2 openais: RRP threshold (10 problem count)
Nov 30 09:11:40 udbapp2 openais: RRP mode set to none.
Nov 30 09:11:40 udbapp2 openais: heartbeat_failures_allowed (0)
Nov 30 09:11:40 udbapp2 openais: max_network_delay (50 ms)
Nov 30 09:11:40 udbapp2 openais: HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Nov 30 09:11:40 udbapp2 openais: Receive multicast socket recv buffer size (262142 bytes).
Nov 30 09:11:40 udbapp2 openais: Transmit multicast socket send buffer size (262142 bytes).
Nov 30 09:11:40 udbapp2 openais: The network interface is now up.
Nov 30 09:11:40 udbapp2 openais: Created or loaded sequence id 0.119.87.244.69 for this ring.
Nov 30 09:11:40 udbapp2 openais: entering GATHER state from 15.
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais extended virtual synchrony service'
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais cluster membership service B.01.01'
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais availability management framework B.01.01'
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais checkpoint service B.01.01'
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais event service B.01.01'
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais distributed locking service B.01.01'
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais message service B.01.01'
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais configuration service'
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais cluster closed process group service v1.01'
Nov 30 09:11:40 udbapp2 ccsd: Initial status:: Quorate
Nov 30 09:11:40 udbapp2 openais: Initialising service handler 'openais CMAN membership service 2.01'
Nov 30 09:11:40 udbapp2 openais: CMAN 2.0.60 (built Jan 23 2007 12:42:16) started
Nov 30 09:11:40 udbapp2 openais: Not using a virtual synchrony filter.
Nov 30 09:11:40 udbapp2 openais: Creating commit token because I am the rep.
Nov 30 09:11:40 udbapp2 openais: Saving state aru 0 high seq received 0
Nov 30 09:11:40 udbapp2 openais: entering COMMIT state.
Nov 30 09:11:41 udbapp2 openais: entering RECOVERY state.
Nov 30 09:11:41 udbapp2 openais: position member 119.87.244.69:
Nov 30 09:11:41 udbapp2 openais: previous ring seq 0 rep 119.87.244.69
Nov 30 09:11:41 udbapp2 openais: aru 0 high delivered 0 received flag 0
Nov 30 09:11:41 udbapp2 openais: Did not need to originate any messages in recovery.
Nov 30 09:11:41 udbapp2 openais: Storing new sequence id for ring 4
Nov 30 09:11:41 udbapp2 openais: Sending initial ORF token
Nov 30 09:11:41 udbapp2 openais: CLM CONFIGURATION CHANGE
Nov 30 09:11:41 udbapp2 openais: New Configuration:
Nov 30 09:11:41 udbapp2 openais: Members Left:
Nov 30 09:11:41 udbapp2 openais: Members Joined:
Nov 30 09:11:41 udbapp2 openais: This node is within the primary component and will provide service.
Nov 30 09:11:41 udbapp2 openais: CLM CONFIGURATION CHANGE
Nov 30 09:11:41 udbapp2 openais: New Configuration:
Nov 30 09:11:41 udbapp2 openais:          r(0) ip(119.87.244.69)  
Nov 30 09:11:41 udbapp2 openais: Members Left:
Nov 30 09:11:41 udbapp2 openais: Members Joined:
Nov 30 09:11:41 udbapp2 openais:          r(0) ip(119.87.244.69)  
Nov 30 09:11:41 udbapp2 openais: This node is within the primary component and will provide service.
Nov 30 09:11:41 udbapp2 openais: entering OPERATIONAL state.
Nov 30 09:11:41 udbapp2 openais: quorum regained, resuming activity
Nov 30 09:11:41 udbapp2 openais: got nodejoin message 119.87.244.69
Nov 30 09:11:41 udbapp2 openais: entering GATHER state from 11.
Nov 30 09:11:41 udbapp2 openais: Creating commit token because I am the rep.
Nov 30 09:11:41 udbapp2 openais: Saving state aru 9 high seq received 9
Nov 30 09:11:41 udbapp2 openais: entering COMMIT state.
Nov 30 09:11:41 udbapp2 openais: entering RECOVERY state.
Nov 30 09:11:41 udbapp2 openais: position member 119.87.244.69:
Nov 30 09:11:41 udbapp2 openais: previous ring seq 4 rep 119.87.244.69
Nov 30 09:11:41 udbapp2 openais: aru 9 high delivered 9 received flag 0
Nov 30 09:11:41 udbapp2 openais: position member 119.87.244.70:
Nov 30 09:11:42 udbapp2 openais: previous ring seq 4 rep 119.87.244.70
Nov 30 09:11:42 udbapp2 openais: aru 9 high delivered 9 received flag 0
Nov 30 09:11:42 udbapp2 openais: Did not need to originate any messages in recovery.
Nov 30 09:11:42 udbapp2 openais: Storing new sequence id for ring 8
Nov 30 09:11:42 udbapp2 openais: Sending initial ORF token
Nov 30 09:11:42 udbapp2 openais: CLM CONFIGURATION CHANGE
Nov 30 09:11:42 udbapp2 openais: New Configuration:
Nov 30 09:11:42 udbapp2 openais:          r(0) ip(119.87.244.69)  
Nov 30 09:11:42 udbapp2 openais: Members Left:
Nov 30 09:11:42 udbapp2 openais: Members Joined:
Nov 30 09:11:42 udbapp2 openais: This node is within the primary component and will provide service.
Nov 30 09:11:42 udbapp2 openais: CLM CONFIGURATION CHANGE
Nov 30 09:11:42 udbapp2 openais: New Configuration:
Nov 30 09:11:42 udbapp2 openais:          r(0) ip(119.87.244.69)  
Nov 30 09:11:42 udbapp2 openais:          r(0) ip(119.87.244.70)  
Nov 30 09:11:42 udbapp2 openais: Members Left:
Nov 30 09:11:42 udbapp2 openais: Members Joined:
Nov 30 09:11:42 udbapp2 openais:          r(0) ip(119.87.244.70)  
Nov 30 09:11:42 udbapp2 openais: This node is within the primary component and will provide service.
Nov 30 09:11:42 udbapp2 openais: entering OPERATIONAL state.
Nov 30 09:11:42 udbapp2 openais: got nodejoin message 119.87.244.69
Nov 30 09:11:42 udbapp2 openais: got nodejoin message 119.87.244.70


Could someone please check whether there are any errors in the messages above? Other people get this up and running right away; mine just won't work.
1. The documentation doesn't explicitly require a dedicated heartbeat network, and the heartbeat can run directly over the data interfaces. But is there a configuration file that sets which network the heartbeat uses?
[Solution]

First, the config file is basically fine, but why does the failoverdomain contain only one node?
<failoverdomains>
                        <failoverdomain name="udbapp" ordered="1" restricted="0"/>
                        <failoverdomainnode name="udbapp1" priority="1"/>
                </failoverdomains>
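To spell this out: in the posted cluster.conf the failoverdomain element is self-closed, so the failoverdomainnode ends up as its sibling rather than its child, and the domain actually has no members at all. A corrected fragment might look like the following (a sketch only; listing udbapp2 as a second, lower-priority member is my assumption, since the OP may have intended a one-node domain):

```xml
<failoverdomains>
        <failoverdomain name="udbapp" ordered="1" restricted="0">
                <failoverdomainnode name="udbapp1" priority="1"/>
                <!-- assumption: add the second node so the service can fail over -->
                <failoverdomainnode name="udbapp2" priority="2"/>
        </failoverdomain>
</failoverdomains>
```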
Second, rgmanager must be running before you use clusvcadm, so set rgmanager to start on boot and reboot both machines, or start rgmanager manually and let the cluster decide which machine to start apache on.
Third, avoid the xen kernel unless you really have to use it.
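A quick way to catch this kind of structural slip before deploying is to check the element nesting programmatically. A minimal sketch (a hypothetical helper, not part of RHCS), using Python's standard xml.etree on a trimmed stand-in for the posted file:

```python
# Minimal sketch: report <failoverdomainnode> elements that are not nested
# inside a <failoverdomain>, as happens in the cluster.conf quoted above.
# The XML string is a trimmed, hypothetical stand-in for the real file.
import xml.etree.ElementTree as ET

conf = """
<cluster config_version="1" name="cluster_2">
  <rm>
    <failoverdomains>
      <failoverdomain name="udbapp" ordered="1" restricted="0"/>
      <failoverdomainnode name="udbapp1" priority="1"/>
    </failoverdomains>
  </rm>
</cluster>
"""

def misplaced_nodes(xml_text):
    """Return names of failoverdomainnode elements whose parent is not
    a failoverdomain element."""
    root = ET.fromstring(xml_text)
    bad = []
    for parent in root.iter():
        for child in parent:
            if child.tag == "failoverdomainnode" and parent.tag != "failoverdomain":
                bad.append(child.get("name"))
    return bad

print(misplaced_nodes(conf))  # prints ['udbapp1'] - udbapp1 is misnested
```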
[Solution]

If you (the OP) are in Shanghai, feel free to contact me; I can put you in touch with an RHCA to help solve the problem.
[Solution]

Configure your resources.
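On the OP's heartbeat question: as far as I know, RHEL 5 cman has no separate heartbeat-interface option. openais binds to the interface whose address the clusternode name in cluster.conf resolves to, so the heartbeat network is effectively chosen through name resolution. A sketch of /etc/hosts, assuming a hypothetical private 10.0.0.0/24 subnet on a second interface:

```
# /etc/hosts (sketch): make the cluster node names resolve to a private
# heartbeat subnet, and keep the public addresses under different names.
10.0.0.1        udbapp1
10.0.0.2        udbapp2
119.87.244.70   udbapp1-pub
119.87.244.69   udbapp2-pub
```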
