Commit ebc70a5

Merge pull request #1486 from remibergsma/reimplement-vrrp-setting-47
Reimplement router.redundant.vrrp.interval setting

The global setting `router.redundant.vrrp.interval` is no longer used; the advertisement interval is now hardcoded to 1. This results in a failover from master to backup when the backup doesn't hear from the master for ~3.6 seconds. This is a bit too tight, as we've seen failovers during live migrations. We could reproduce it in about half of the cases. Setting the interval to 2 (tested by hardcoding it in the systemvms) gives roughly twice as much time, and we didn't see issues any more.

Instead of updating the hardcoded value from 1 to 2, I reimplemented the global setting by sending it to the router with the cmd_line, as the non-VPC router also does.

Background: Why is the maximum failover time in the example 3.6 seconds?

This comes from the advertisement interval and the skew time. The default advertisement interval is 1 second (configurable in keepalived.conf). The skew time helps to keep everyone from trying to transition at once. It is a number between 0 and 1, based on the formula (256 - priority) / 256.

As defined in the RFC, the backup must receive an advertisement from the master every (3 * advert_int) + skew_time seconds. If it doesn't hear anything from the master, it takes over. With a backup router priority of 100 (as in the example), the failover will happen at most 3.6 seconds after the master goes down.

Source: http://www.hollenback.net/KeepalivedForNetworkReliability

* pr/1486:
  Configure rVPC for router.redundant.vrrp.interval advert_int setting
  Have rVPCs use the router.redundant.vrrp.interval setting

Signed-off-by: Will Stevens <williamstevens@gmail.com>
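The failover bound quoted in the commit message can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the commit; the function names `skew_time` and `max_failover_seconds` are made up for illustration, and the formulas are taken directly from the message above.

```python
def skew_time(priority: int) -> float:
    """Skew time in seconds, per the formula (256 - priority) / 256."""
    return (256 - priority) / 256


def max_failover_seconds(advert_int: int, priority: int) -> float:
    """Longest time a backup waits before taking over:
    (3 * advert_int) + skew_time."""
    return 3 * advert_int + skew_time(priority)


if __name__ == "__main__":
    # Backup priority 100 is the value used in the commit message's example.
    for interval in (1, 2):
        bound = max_failover_seconds(interval, priority=100)
        print("advert_int=%d -> failover after at most %.2f s" % (interval, bound))
```

With a priority of 100 the skew time is 156/256, so an interval of 1 gives a bound of about 3.6 seconds and an interval of 2 gives about 6.6 seconds.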
2 parents 9a20ab8 + 9c0eee4 commit ebc70a5

4 files changed: +11 -0 lines changed

server/src/com/cloud/network/router/VirtualNetworkApplianceManagerImpl.java

Lines changed: 3 additions & 0 deletions
@@ -1598,6 +1598,9 @@ protected StringBuilder createRedundantRouterArgs(final NicProfile nic, final Do
         if (isRedundant) {
             buf.append(" redundant_router=1");
 
+            final int advertInt = NumbersUtil.parseInt(_configDao.getValue(Config.RedundantRouterVrrpInterval.key()), 1);
+            buf.append(" advert_int=").append(advertInt);
+
             final Long vpcId = router.getVpcId();
             final List<DomainRouterVO> routers;
             if (vpcId != null) {

systemvm/patches/debian/config/opt/cloud/bin/cs/CsDatabag.py

Lines changed: 4 additions & 0 deletions
@@ -154,3 +154,7 @@ def get_use_ext_dns(self):
             return self.idata()['useextdns']
         return False
 
+    def get_advert_int(self):
+        if 'advert_int' in self.idata():
+            return self.idata()['advert_int']
+        return 1
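The fallback behaviour of the new getter can be tried in isolation. This is a rough sketch under stated assumptions: `FakeCmdLine` is a hypothetical stand-in for the real databag class, and its `idata()` simply returns a plain dict. Only the body of `get_advert_int` is copied from the diff.

```python
class FakeCmdLine:
    """Hypothetical stand-in for the databag class; not the CloudStack class."""

    def __init__(self, idata):
        self._idata = idata

    def idata(self):
        # Assumption: the real idata() returns a dict-like object.
        return self._idata

    def get_advert_int(self):
        # Same logic as the method added in this commit.
        if 'advert_int' in self.idata():
            return self.idata()['advert_int']
        return 1


# Routers started without the new cmd_line argument fall back to 1.
print(FakeCmdLine({}).get_advert_int())
# When the key is present, the stored value is returned unchanged.
print(FakeCmdLine({'advert_int': '2'}).get_advert_int())
```

The getter returns whatever type is stored under the key, so a caller that needs an integer would have to convert it. The caller in this commit only formats the value with `%s`, where the type does not matter.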

systemvm/patches/debian/config/opt/cloud/bin/cs/CsFile.py

Lines changed: 1 addition & 0 deletions
@@ -113,6 +113,7 @@ def section(self, start, end, content):
         self.new_config[sind:eind] = content
 
     def greplace(self, search, replace):
+        logging.debug("Searching for %s and replacing with %s" % (search, replace))
         self.new_config = [w.replace(search, replace) for w in self.new_config]
 
     def search(self, search, replace):

systemvm/patches/debian/config/opt/cloud/bin/cs/CsRedundant.py

Lines changed: 3 additions & 0 deletions
@@ -138,6 +138,9 @@ def _redundant_on(self):
             " router_id ", " router_id %s" % self.cl.get_name())
         keepalived_conf.search(
             " interface ", " interface %s" % guest.get_device())
+        keepalived_conf.search(
+            " advert_int ", " advert_int %s" % self.cl.get_advert_int())
+
         keepalived_conf.greplace("[RROUTER_BIN_PATH]", self.CS_ROUTER_DIR)
         keepalived_conf.section("authentication {", "}", [
             " auth_type AH \n", " auth_pass %s\n" % self.cl.get_router_password()])
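The effect of the added `keepalived_conf.search(...)` call on the configuration can be illustrated as follows. This is only a sketch: the body of `CsFile.search` is not shown in this diff, so the helper `replace_line_containing` is hypothetical and assumes that any line containing the search text is replaced as a whole, with the replacement appended when nothing matches. The template lines are also invented for the example.

```python
def replace_line_containing(lines, search, replace):
    """Replace every line containing `search` with `replace`.

    Assumed behaviour, not the actual CsFile.search implementation:
    if no line matches, the replacement is appended at the end.
    """
    found = False
    result = []
    for line in lines:
        if search in line:
            result.append(replace + "\n")
            found = True
        else:
            result.append(line)
    if not found:
        result.append(replace + "\n")
    return result


# Hypothetical fragment of a keepalived.conf template.
template = [
    "vrrp_instance inside_network {\n",
    "    state BACKUP\n",
    "    advert_int 1\n",
    "}\n",
]

advert_int = 2  # value that would come from get_advert_int()
updated = replace_line_containing(
    template, " advert_int ", "    advert_int %s" % advert_int)
print("".join(updated))
```

Under these assumptions, the hardcoded `advert_int 1` line in the template is rewritten with the value passed on the cmd_line, and the other lines are left untouched.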
