Problem configuring Ganglia, could an expert help?
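Over the past few days I installed ganglia 3.0.2. After installation the web page opens fine remotely and shows some of the resources collected from the machines, but there is one problem I have not been able to solve. My server is on two IP subnets, eth0: 192.168.0.100 and eth1: 10.10.1.106, and the two subnets are connected to two separate switches. After installing Ganglia, it only shows the machines on the 10.10.1.X subnet; the machines on the 192.168.0.X subnet do not show up. I have installed the gmond client on the machines in both subnets, and it starts and runs normally on all of them. Could someone point out whether something in my gmetad.conf or gmond.conf is set incorrectly? Many thanks!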
Gmetad.conf:
data_source "rcmm1" localhost
gridname "RCMM1"
all_trusted on
setuid_username "ren"
Everything else is left at the defaults.
Gmond.conf:
globals {
setuid = yes
user = root
cleanup_threshold = 300 /*secs */
}
cluster {
name = "rcmm1"
}
host {
location = "unspecified"
}
udp_send_channel {
mcast_join = 239.2.11.71
port = 8649
}
udp_recv_channel {
mcast_join = 239.2.11.71
port = 8649
bind = 239.2.11.71
}
tcp_accept_channel {
port = 8649
}
collection_group {
collect_once = yes
time_threshold = 20
metric {
name = "heartbeat"
}
}
collection_group {
collect_once = yes
time_threshold = 1200
metric {
name = "cpu_num"
}
metric {
name = "cpu_speed"
}
metric {
name = "mem_total"
}
/* Should this be here? Swap can be added/removed between reboots. */
metric {
name = "swap_total"
}
metric {
name = "boottime"
}
metric {
name = "machine_type"
}
metric {
name = "os_name"
}
metric {
name = "os_release"
}
metric {
name = "location"
}
}
collection_group {
collect_once = yes
time_threshold = 300
metric {
name = "gexec"
}
}
collection_group {
collect_every = 20
time_threshold = 90
/* CPU status */
metric {
name = "cpu_user"
value_threshold = "1.0"
}
metric {
name = "cpu_system"
value_threshold = "1.0"
}
metric {
name = "cpu_idle"
value_threshold = "5.0"
}
metric {
name = "cpu_nice"
value_threshold = "1.0"
}
metric {
name = "cpu_aidle"
value_threshold = "5.0"
}
metric {
name = "cpu_wio"
value_threshold = "1.0"
}
/* Optional, more detailed interrupt metrics (left disabled)
metric {
name = "cpu_intr"
value_threshold = "1.0"
}
metric {
name = "cpu_sintr"
value_threshold = "1.0"
}
*/
}
collection_group {
collect_every = 20
time_threshold = 90
/* Load Averages */
metric {
name = "load_one"
value_threshold = "1.0"
}
metric {
name = "load_five"
value_threshold = "1.0"
}
metric {
name = "load_fifteen"
value_threshold = "1.0"
}
}
collection_group {
collect_every = 80
time_threshold = 950
metric {
name = "proc_run"
value_threshold = "1.0"
}
metric {
name = "proc_total"
value_threshold = "1.0"
}
}
collection_group {
collect_every = 40
time_threshold = 180
metric {
name = "mem_free"
value_threshold = "1024.0"
}
metric {
name = "mem_shared"
value_threshold = "1024.0"
}
metric {
name = "mem_buffers"
value_threshold = "1024.0"
}
metric {
name = "mem_cached"
value_threshold = "1024.0"
}
metric {
name = "swap_free"
value_threshold = "1024.0"
}
}
《Solution》
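Use telnet to check which IP Ganglia has discovered.
Ganglia seems to follow its own rules when picking the IP to report; it does not simply show whichever IP you want. (I may be misunderstanding this.)
On the server where gmetad is installed, you can run
telnet localhost 8651 | grep "ip"
where ip is the address of a machine running gmond. If a machine has several addresses, try each of them
to see which IP gmetad recognizes.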
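The telnet check above can also be scripted. The following is only a minimal sketch: it assumes gmetad is serving its XML dump on localhost port 8651 (as in the telnet command) and that the dump contains CLUSTER elements with HOST children carrying NAME and IP attributes. The function names read_xml_dump and list_hosts are made up for this example.

```python
import socket
import xml.etree.ElementTree as ET


def read_xml_dump(host="localhost", port=8651, timeout=5.0):
    """Read everything the daemon writes until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def list_hosts(xml_bytes):
    """Return (cluster name, host name, IP) for every HOST element found."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for cluster in root.iter("CLUSTER"):
        for host in cluster.iter("HOST"):
            rows.append((cluster.get("NAME"), host.get("NAME"), host.get("IP")))
    return rows


if __name__ == "__main__":
    try:
        for cluster, name, ip in list_hosts(read_xml_dump()):
            print(cluster, name, ip)
    except OSError as exc:
        # Assumption: gmetad runs on this machine; otherwise nothing answers.
        print("could not reach gmetad:", exc)
```

If no 192.168.0.X address appears in the output, gmetad has not received anything from that subnet.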
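Your problem is this line:
data_source "rcmm1" localhost
localhost is the local machine; you need to set this according to the machines where gmond is installed,
for example: data_source "rcmm1" 15 10.10.1.2 10.10.1.3……10.10.1.X
data_source "rcmm" 15 192.168.0.1……192.168.0.X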
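To make the layout of these lines explicit, here is a small sketch that formats a data_source line from its parts: the quoted cluster name, the polling interval in seconds (the 15 above), and one or more gmond addresses with an optional :port suffix. The helper name data_source_line and the addresses are hypothetical, for illustration only. As far as I understand, gmetad treats the addresses on one line as alternates for the same cluster, so a line only needs a few hosts that can answer for it.

```python
def data_source_line(name, hosts, interval=15, port=None):
    """Format one gmetad.conf data_source line.

    name: cluster label shown by gmetad
    hosts: addresses of machines running gmond
    interval: polling interval in seconds
    port: gmond tcp_accept_channel port; left off the line when None
    """
    if not hosts:
        raise ValueError("at least one gmond address is required")
    targets = [f"{h}:{port}" if port is not None else h for h in hosts]
    return f'data_source "{name}" {interval} ' + " ".join(targets)


if __name__ == "__main__":
    # Hypothetical addresses, one data_source line per subnet.
    print(data_source_line("rcmm1", ["10.10.1.2", "10.10.1.3"]))
    print(data_source_line("rcmm", ["192.168.0.1"], port=8648))
```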
They all use port 8649 by default; you can change that too, in gmond.conf:
udp_send_channel {
mcast_join = 239.2.11.71
port = 8648
}
udp_recv_channel {
mcast_join = 239.2.11.71
port = 8648
bind = 239.2.11.71
}
tcp_accept_channel {
port = 8648
}
and in gmetad.conf:
data_source "rcmm" 15 192.168.0.1:8648
This is only a simple example, for your reference.
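Before editing data_source, it may help to confirm from the gmetad server that the gmond TCP port on each subnet actually answers. This is a rough sketch, assuming gmond listens on the tcp_accept_channel port (8649 by default, 8648 in the example above). The function name port_is_open is made up here, and the addresses are whatever you pass on the command line.

```python
import socket
import sys


def port_is_open(host, port=8649, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable names.
        return False


if __name__ == "__main__":
    # Addresses to probe come from the command line, e.g. one host per subnet.
    for host in sys.argv[1:]:
        state = "open" if port_is_open(host) else "unreachable"
        print(host, state)
```

A host that reports unreachable here cannot be polled by gmetad either, whatever the data_source line says.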