Strange RHCS problem, asking for help!
The log is as follows:
Feb 20 21:48:32 host217 ccsd: Starting ccsd 1.0.2:
Feb 20 21:48:32 host217 ccsd: Built: Aug 1 2005 14:39:51
Feb 20 21:48:32 host217 ccsd: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Feb 20 21:48:33 host217 ccsd: startup succeeded
Feb 20 21:48:33 host217 kernel: CMAN 2.6.9-39.5 (built Sep 20 2005 16:06:26) installed
Feb 20 21:48:33 host217 kernel: NET: Registered protocol family 30
Feb 20 21:48:33 host217 ccsd: cluster.conf (cluster name = oracle_cluster, version = 9) found.
Feb 20 21:48:34 host217 kernel: CMAN: Waiting to join or form a Linux-cluster
Feb 20 21:48:34 host217 ccsd: Connected to cluster infrastruture via: CMAN/SM Plugin v1.1.2
Feb 20 21:48:34 host217 ccsd: Initial status:: Inquorate
Feb 20 21:49:01 host217 kernel: CMAN: sending membership request
Feb 20 21:49:01 host217 kernel: CMAN: got node host216
Feb 20 21:49:01 host217 kernel: CMAN: quorum regained, resuming activity
Feb 20 21:49:01 host217 ccsd: Cluster is quorate. Allowing connections.
Feb 20 21:49:01 host217 cman: startup succeeded
Feb 20 21:49:01 host217 kernel: DLM 2.6.9-37.7 (built Sep 20 2005 16:09:43) installed
Feb 20 21:49:02 host217 fenced: startup succeeded
Feb 20 21:50:34 host217 clurgmgrd: <info> Loading Service Data
Feb 20 21:50:34 host217 rgmanager: clurgmgrd startup succeeded
Feb 20 21:50:35 host217 ccsd: Update of cluster.conf complete (version 9 -> 10).
Feb 20 21:50:35 host217 clurgmgrd: <info> Initializing Services
Feb 20 21:50:35 host217 httpd: httpd shutdown failed
Feb 20 21:50:35 host217 clurgmgrd: <notice> stop on script "httpd" returned 1 (generic error)
Feb 20 21:50:35 host217 clurgmgrd: <info> Services Initialized
Feb 20 21:50:35 host217 clurgmgrd: <info> Logged in SG "usrm::manager"
Feb 20 21:50:35 host217 clurgmgrd: <info> Magma Event: Membership Change
Feb 20 21:50:35 host217 clurgmgrd: <info> State change: Local UP
Feb 20 21:50:36 host217 clurgmgrd: <info> State change: host216 UP
Feb 20 21:52:18 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 21:52:18 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 21:52:18 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 21:52:18 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 21:52:20 host217 httpd: httpd startup succeeded
Feb 20 21:52:20 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 21:54:29 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 21:54:29 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 21:54:29 host217 httpd: httpd shutdown succeeded
Feb 20 21:54:30 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 21:54:32 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 21:56:35 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 21:56:35 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 21:56:35 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 21:56:35 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 21:56:36 host217 httpd: httpd startup succeeded
Feb 20 21:56:36 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 21:58:45 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 21:58:45 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 21:58:45 host217 httpd: httpd shutdown succeeded
Feb 20 21:58:45 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 21:58:47 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 22:00:51 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 22:00:51 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 22:00:51 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 22:00:51 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 22:00:52 host217 httpd: httpd startup succeeded
Feb 20 22:00:52 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 22:03:00 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 22:03:01 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 22:03:01 host217 httpd: httpd shutdown succeeded
Feb 20 22:03:01 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 22:03:03 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 22:05:05 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 22:05:05 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 22:05:05 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 22:05:05 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 22:05:07 host217 httpd: httpd startup succeeded
Feb 20 22:05:07 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 22:07:06 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 22:07:06 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 22:07:06 host217 httpd: httpd shutdown succeeded
Feb 20 22:07:06 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 22:07:08 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 22:09:15 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 22:09:15 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 22:09:15 host217 kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Feb 20 22:09:15 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 22:09:15 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 22:09:17 host217 httpd: httpd startup succeeded
Feb 20 22:09:17 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 22:11:22 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 22:11:22 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 22:11:22 host217 httpd: httpd shutdown succeeded
Feb 20 22:11:22 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 22:11:24 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
httpservice keeps switching back and forth between member 1 and member 2. What is going on? Could everyone take a look?
《Solution》
Log from node 2:
Feb 20 21:48:32 host217 ccsd: Starting ccsd 1.0.2:
Feb 20 21:48:32 host217 ccsd: Built: Aug 1 2005 14:39:51
Feb 20 21:48:32 host217 ccsd: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Feb 20 21:48:33 host217 ccsd: startup succeeded
Feb 20 21:48:33 host217 kernel: CMAN 2.6.9-39.5 (built Sep 20 2005 16:06:26) installed
Feb 20 21:48:33 host217 kernel: NET: Registered protocol family 30
Feb 20 21:48:33 host217 ccsd: cluster.conf (cluster name = oracle_cluster, version = 9) found.
Feb 20 21:48:34 host217 kernel: CMAN: Waiting to join or form a Linux-cluster
Feb 20 21:48:34 host217 ccsd: Connected to cluster infrastruture via: CMAN/SM Plugin v1.1.2
Feb 20 21:48:34 host217 ccsd: Initial status:: Inquorate
Feb 20 21:49:01 host217 kernel: CMAN: sending membership request
Feb 20 21:49:01 host217 kernel: CMAN: got node host216
Feb 20 21:49:01 host217 kernel: CMAN: quorum regained, resuming activity
Feb 20 21:49:01 host217 ccsd: Cluster is quorate. Allowing connections.
Feb 20 21:49:01 host217 cman: startup succeeded
Feb 20 21:49:01 host217 kernel: DLM 2.6.9-37.7 (built Sep 20 2005 16:09:43) installed
Feb 20 21:49:02 host217 fenced: startup succeeded
Feb 20 21:50:34 host217 clurgmgrd: <info> Loading Service Data
Feb 20 21:50:34 host217 rgmanager: clurgmgrd startup succeeded
Feb 20 21:50:35 host217 ccsd: Update of cluster.conf complete (version 9 -> 10).
Feb 20 21:50:35 host217 clurgmgrd: <info> Initializing Services
Feb 20 21:50:35 host217 httpd: httpd shutdown failed
Feb 20 21:50:35 host217 clurgmgrd: <notice> stop on script "httpd" returned 1 (generic error)
Feb 20 21:50:35 host217 clurgmgrd: <info> Services Initialized
Feb 20 21:50:35 host217 clurgmgrd: <info> Logged in SG "usrm::manager"
Feb 20 21:50:35 host217 clurgmgrd: <info> Magma Event: Membership Change
Feb 20 21:50:35 host217 clurgmgrd: <info> State change: Local UP
Feb 20 21:50:36 host217 clurgmgrd: <info> State change: host216 UP
Feb 20 21:52:18 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 21:52:18 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 21:52:18 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 21:52:18 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 21:52:20 host217 httpd: httpd startup succeeded
Feb 20 21:52:20 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 21:54:29 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 21:54:29 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 21:54:29 host217 httpd: httpd shutdown succeeded
Feb 20 21:54:30 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 21:54:32 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 21:56:35 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 21:56:35 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 21:56:35 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 21:56:35 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 21:56:36 host217 httpd: httpd startup succeeded
Feb 20 21:56:36 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 21:58:45 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 21:58:45 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 21:58:45 host217 httpd: httpd shutdown succeeded
Feb 20 21:58:45 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 21:58:47 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 22:00:51 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 22:00:51 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 22:00:51 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 22:00:51 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 22:00:52 host217 httpd: httpd startup succeeded
Feb 20 22:00:52 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 22:03:00 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 22:03:01 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 22:03:01 host217 httpd: httpd shutdown succeeded
Feb 20 22:03:01 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 22:03:03 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 22:05:05 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 22:05:05 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 22:05:05 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 22:05:05 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 22:05:07 host217 httpd: httpd startup succeeded
Feb 20 22:05:07 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 22:07:06 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 22:07:06 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 22:07:06 host217 httpd: httpd shutdown succeeded
Feb 20 22:07:06 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 22:07:08 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 22:09:15 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 22:09:15 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 22:09:15 host217 kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Feb 20 22:09:15 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 22:09:15 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 22:09:17 host217 httpd: httpd startup succeeded
Feb 20 22:09:17 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 22:11:22 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 22:11:22 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 22:11:22 host217 httpd: httpd shutdown succeeded
Feb 20 22:11:22 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 22:11:24 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 22:13:28 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 22:13:28 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 22:13:28 host217 kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Feb 20 22:13:28 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 22:13:28 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 22:13:29 host217 httpd: httpd startup succeeded
Feb 20 22:13:29 host217 clurgmgrd: <notice> Service httpservice started
Feb 20 22:15:37 host217 clurgmgrd: <notice> status on ip "192.168.8.219" returned 1 (generic error)
Feb 20 22:15:37 host217 clurgmgrd: <notice> Stopping service httpservice
Feb 20 22:15:38 host217 httpd: httpd shutdown succeeded
Feb 20 22:15:38 host217 clurgmgrd: <notice> Service httpservice is recovering
Feb 20 22:15:40 host217 clurgmgrd: <notice> Service httpservice is now running on member 1
Feb 20 22:17:43 host217 clurgmgrd: <notice> Recovering failed service httpservice
Feb 20 22:17:43 host217 kernel: kjournald starting. Commit interval 5 seconds
Feb 20 22:17:43 host217 kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Feb 20 22:17:43 host217 kernel: EXT3 FS on sdb7, internal journal
Feb 20 22:17:43 host217 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Feb 20 22:17:44 host217 httpd: httpd startup succeeded
Feb 20 22:17:44 host217 clurgmgrd: <notice> Service httpservice started
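《Solution》
Make sure Apache's httpd service is removed from, or disabled in, runlevels 3 and 5.
Make sure the shared public IP, shared disk, and script resources are configured correctly.
Make sure the httpd.conf configuration file is correct.
The default configuration should be enough to get normal failover working; once it does, add the shared disk to the configuration.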
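A minimal sketch for checking the first point, assuming a RHEL 4 era node where `chkconfig` manages init scripts. The helper name `levels_on` is made up for illustration; it only parses `chkconfig --list` output and changes nothing.

```shell
# Print the runlevels in which a service is enabled, reading the output of
# `chkconfig --list <service>` on stdin, e.g.
#   httpd   0:off 1:off 2:off 3:on 4:off 5:on 6:off
# (levels_on is a hypothetical helper name, not part of RHCS.)
levels_on() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i ~ /:on$/) {
                split($i, parts, ":")      # "3:on" -> parts[1] = "3"
                printf "%s%s", sep, parts[1]
                sep = " "
            }
        }
    } END { print "" }'
}

# Typical use on each cluster node (commented out so the sketch has no side effects):
#   chkconfig --list httpd | levels_on     # empty output means init never starts httpd
#   chkconfig --level 35 httpd off         # leave starting httpd to rgmanager
```

If the helper prints 3 or 5, init starts httpd on its own at boot, outside rgmanager's control.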
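《Solution》
The IP resource check is failing. See whether it is a network card problem.
《Solution》
I have now removed the Cluster Suite network (IP) configuration and bring up the floating IP from a script instead, and the service no longer switches over on its own.
What is the reason for this? Why does it switch automatically when I use the Cluster Suite network configuration?
《Solution》
While configuring RHCS recently, I ran into the same problem as the original poster: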
clurgmgrd: <notice> status on ip "X.X.X.X" returned 1 (generic error)
After that, the service keeps being recovered and restarted.
Could this be related to the "Monitor Link" option that is set on the IP address resource?
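To confirm that a node shows the same pattern, a rough sketch like the following counts the failed IP status checks and the recoveries in a syslog file. The function name `summarize_ip_failures` and the default path `/var/log/messages` are assumptions; the match strings are taken from the log lines quoted in this thread.

```shell
# Count failed IP status checks and service recoveries logged by clurgmgrd.
# (summarize_ip_failures is a hypothetical helper name.)
summarize_ip_failures() {
    log="${1:-/var/log/messages}"   # assumed default syslog location
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    fails=$(grep -c 'status on ip ".*" returned 1' "$log") || true
    recov=$(grep -c 'Recovering failed service' "$log") || true
    echo "ip_status_failures=$fails recoveries=$recov"
}

# Usage on a cluster node:
#   summarize_ip_failures /var/log/messages
```

A failure count that grows by one every few minutes, each followed by a recovery, matches the loop in the original post.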
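《Solution》
"I have now removed the Cluster Suite network (IP) configuration and bring up the floating IP from a script instead, and the service no longer switches over on its own.
What is the reason for this? Why does it switch automatically when I use the Cluster Suite network configuration?"
Did the original poster delete the IP resource? If so, how do you make the script bring up the floating IP?
I hope the original poster can advise.
[ Last edited by redkops on 2007-3-6 21:50 ]
《Solution》
Monitor Link uses ethtool to check the network card. See whether your card supports the ethtool command: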
ethtool eth0
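As a sketch of what to look for in that output: if the driver supports link reporting, `ethtool` prints a `Link detected: yes` or `Link detected: no` line. The helper below (the name `link_state` is hypothetical) extracts that value and prints `unknown` when the line is missing, which would suggest the card or driver cannot report link state. Whether the IP resource agent treats a missing line as a failure is an assumption here, not something verified against the rgmanager scripts.

```shell
# Read `ethtool <iface>` output on stdin and print the link state:
# "yes", "no", or "unknown" if no "Link detected" line is present.
# (link_state is a hypothetical helper name.)
link_state() {
    awk -F': *' '
        /Link detected/ { print $2; found = 1 }
        END             { if (!found) print "unknown" }
    '
}

# Usage on a cluster node (eth0 as in the reply above):
#   ethtool eth0 2>/dev/null | link_state
```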
《Solution》
Same problem here.