Tommi Virtanen 19da747700 Add current node to mon_hosts, if it has role ceph-mon.
This short-circuits the search logic, so ceph-mon --mkfs will always
see itself in mon_hosts, and thus won't end up with an empty monmap.
An empty monmap would prevent it from initializing properly, and the
peer hints aren't enough to help.

Search is not reliable for this, as the search index updates lazily,
and e.g. the "roles:" field, holding the expanded run_list, is only
set at the end of a successful chef-client run.

Crowbar creates a special-purpose role, one per node, and assigns the
actual roles to that role. We work around this by doing the search in
two phases, when running under Crowbar.
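
The two-phase search can be sketched as follows. This is a minimal illustration rather than the cookbook's code: RoleStub and build_mon_node_query are hypothetical names, the role names are made up, and the Chef search calls are left out so the snippet runs on its own. Phase one is assumed to have already returned the per-node crowbar-* roles; the function only builds the phase-two node query from them.

```ruby
# Hypothetical stand-in for the role objects returned by the phase-one
# role search; only the name attribute is needed here.
RoleStub = Struct.new(:name)

# Build the phase-two node query from the per-node Crowbar roles.
# Returns nil when phase one found no roles, so the caller can skip
# the node search entirely.
def build_mon_node_query(mon_roles, environment)
  return nil if mon_roles.empty?

  role_clause = mon_roles.map { |role| "role:#{role.name}" }.join(' OR ')
  "(#{role_clause}) AND ceph_config_environment:#{environment}"
end

# Made-up role names and environment, for illustration only.
roles = [RoleStub.new('crowbar-node1'), RoleStub.new('crowbar-node2')]
puts build_mon_node_query(roles, 'default')
```

Returning nil for an empty role list mirrors the guard in the library, which skips the node search when no matching Crowbar role exists.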

Thanks to Tyler Brekke.
2012-07-20 14:32:18 -07:00

52 lines
1.8 KiB
Ruby

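# JSON.parse is used by have_quorum? below.
require 'json'

# True when the Chef::Recipe::Barclamp constant is defined, which is
# how this library detects that it is running under Crowbar.
def is_crowbar?
  !defined?(Chef::Recipe::Barclamp).nil?
end
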
def get_mon_addresses
  mons = []

  # make sure if this node runs ceph-mon, it's always included even if
  # search is laggy; put it first in the hopes that clients will talk
  # primarily to local node
  mons << node if node['roles'].include?('ceph-mon')

  if is_crowbar?
    # Phase one: find the per-node crowbar-* roles that carry ceph-mon.
    mon_roles = search(:role, 'name:crowbar-* AND run_list:role\[ceph-mon\]')
    unless mon_roles.empty?
      # Phase two: find the nodes that have any of those roles.
      search_string = mon_roles.map { |role_object| "role:#{role_object.name}" }.join(' OR ')
      mons += search(:node, "(#{search_string}) AND ceph_config_environment:#{node['ceph']['config']['environment']}")
    end
  else
    mons += search(:node, "role:ceph-mon AND chef_environment:#{node.chef_environment}")
  end

  # The block parameter is named mon so it does not shadow the outer node.
  mon_addresses =
    if is_crowbar?
      mons.map { |mon| Chef::Recipe::Barclamp::Inventory.get_network_by_type(mon, 'admin').address }
    else
      mons.map { |mon| mon['ipaddress'] }
    end

  # Append the monitor port; uniq drops the local node if search also found it.
  mon_addresses.map { |ip| ip + ':6789' }.uniq
end

QUORUM_STATES = ['leader', 'peon']

def have_quorum?
  # "ceph auth get-or-create-key" would hang if the monitor wasn't
  # in quorum yet, which is highly likely on the first run. This
  # helper lets us delay the key generation into the next
  # chef-client run, instead of hanging.
  #
  # Also, as the UNIX domain socket connection has no timeout logic
  # in the ceph tool, this exits immediately if the ceph-mon is not
  # running for any reason; trying to connect via TCP/IP would wait
  # for a relatively long timeout.
  mon_status = %x[ceph --admin-daemon /var/run/ceph/ceph-mon.#{node['hostname']}.asok mon_status]
  raise 'getting monitor state failed' unless $?.exitstatus == 0
  state = JSON.parse(mon_status)['state']
  QUORUM_STATES.include?(state)
end
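
The quorum check in have_quorum? comes down to parsing the mon_status JSON and testing its state field. The sketch below isolates that step so it can be run without a Ceph monitor. It is an illustration under assumptions: quorum_state? and QUORUM_STATES_SKETCH are hypothetical names, and the JSON strings are hand-written payloads trimmed to the single field the helper reads, not captured output from the admin socket.

```ruby
require 'json'

# Same two states the library treats as "in quorum"; named differently
# here so the sketch does not collide with the library's constant.
QUORUM_STATES_SKETCH = %w[leader peon].freeze

# Returns true when the mon_status JSON reports a state that counts
# as being in quorum, false otherwise (including a missing state).
def quorum_state?(mon_status_json)
  state = JSON.parse(mon_status_json)['state']
  QUORUM_STATES_SKETCH.include?(state)
end

# Hand-written sample payloads, for illustration only.
puts quorum_state?('{"state": "leader"}')   # prints true
puts quorum_state?('{"state": "probing"}')  # prints false
```

Keeping the parsing separate from the shell call makes the decision logic easy to exercise in isolation; a caller could use the result to postpone key generation to a later chef-client run, as the comment in have_quorum? describes.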