az aks nodepool upgrade silently drops --max-unavailable flag #33195

@mmulvanny

Describe the bug

az aks nodepool upgrade --max-unavailable accepts the flag, validates it, and documents it in help text, but never applies it to the agent pool before sending the PUT request. The value is silently dropped.

Related command

(from https://learn.microsoft.com/en-us/azure/aks/stateful-workload-upgrades#step-3-upgrade-node1-former-primary)

az aks nodepool upgrade \
    --resource-group myRG \
    --cluster-name myCluster \
    --name myPool \
    --kubernetes-version 1.29.0 \
    --max-surge 0 \
    --max-unavailable 1

Errors

None. The command succeeds and prints no error or warning; the value is simply never sent.

Issue script & Debug output

Not applicable. This issue was found by reading the code on the dev branch, not by running the command.

Expected behavior

az aks nodepool upgrade should pass --max-unavailable through to the PUT request.

Environment Summary

Not applicable. This issue was found by reading the code on the dev branch.

Additional context

Line numbers below are as of adb78953b2b0c0debe5be51aa02262baacba0626.

Root cause

--max-unavailable was added in #31510 (commit cb14a97). That PR correctly wired up max_unavailable in:

  • The function signature (custom.py, line 3087)
  • The --node-image-only mutual-exclusivity check (custom.py, line 3110)
  • Help text (_help.py, line 2195)
  • Parameter registration (_params.py, line 1134)
  • The nodepool add and nodepool update decorators (agentpool_decorator.py, lines 2036–2038 and 2518–2520)

But aks_agentpool_upgrade() in custom.py does not use the decorator pattern — it directly assigns upgrade settings to the instance (lines 3159–3169). The assignment for max_unavailable was never added:

# custom.py lines 3159-3169
if not instance.upgrade_settings:
    instance.upgrade_settings = AgentPoolUpgradeSettings()

if max_surge:
    instance.upgrade_settings.max_surge = max_surge
if drain_timeout:
    instance.upgrade_settings.drain_timeout_in_minutes = drain_timeout
if isinstance(node_soak_duration, int) and node_soak_duration >= 0:
    instance.upgrade_settings.node_soak_duration_in_minutes = node_soak_duration
if undrainable_node_behavior:
    instance.upgrade_settings.undrainable_node_behavior = undrainable_node_behavior
# max_unavailable: missing
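
The effect of the missing branch can be shown in isolation. The sketch below is not the CLI code: `UpgradeSettings`, `AgentPool`, and `apply_upgrade_settings` are hypothetical stand-ins that only mirror the shape of the block above, reduced to two settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UpgradeSettings:
    """Hypothetical stand-in for AgentPoolUpgradeSettings (illustration only)."""
    max_surge: Optional[str] = None
    max_unavailable: Optional[str] = None


@dataclass
class AgentPool:
    """Hypothetical stand-in for the agent pool instance."""
    upgrade_settings: Optional[UpgradeSettings] = None


def apply_upgrade_settings(instance, max_surge=None, max_unavailable=None):
    # Same shape as the block above: max_unavailable is accepted as an
    # argument but no branch ever copies it onto the instance.
    if not instance.upgrade_settings:
        instance.upgrade_settings = UpgradeSettings()
    if max_surge:
        instance.upgrade_settings.max_surge = max_surge
    return instance


pool = apply_upgrade_settings(AgentPool(), max_surge="0", max_unavailable="1")
print(pool.upgrade_settings)
```

In this sketch `max_surge` reaches the instance while `max_unavailable` stays `None`, which is what the request body would then carry.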

The test added in #31510 covers nodepool add and nodepool update but explicitly skips nodepool upgrade (# actually running an upgrade is too expensive for these tests.).
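
Coverage for the upgrade path does not necessarily need a live upgrade. A minimal sketch, assuming the payload can be inspected on a mocked client: `upgrade_nodepool` is a hypothetical stand-in for `aks_agentpool_upgrade()`, and the `get` / `begin_create_or_update` call shape is an assumption, not taken from the CLI source.

```python
from types import SimpleNamespace
from unittest import mock


def upgrade_nodepool(client, resource_group, cluster, pool,
                     max_surge=None, max_unavailable=None):
    """Hypothetical stand-in for aks_agentpool_upgrade(); not the real CLI code."""
    instance = client.get(resource_group, cluster, pool)
    if not instance.upgrade_settings:
        instance.upgrade_settings = SimpleNamespace(max_surge=None, max_unavailable=None)
    if max_surge:
        instance.upgrade_settings.max_surge = max_surge
    if max_unavailable:
        # The assignment this issue reports as missing.
        instance.upgrade_settings.max_unavailable = max_unavailable
    return client.begin_create_or_update(resource_group, cluster, pool, instance)


def sent_upgrade_settings(**kwargs):
    """Run the stand-in against a mocked client and return the settings it sent."""
    client = mock.MagicMock()
    client.get.return_value = SimpleNamespace(upgrade_settings=None)
    upgrade_nodepool(client, "myRG", "myCluster", "myPool", **kwargs)
    # The agent pool payload is the fourth positional argument of the PUT call.
    payload = client.begin_create_or_update.call_args.args[3]
    return payload.upgrade_settings


settings = sent_upgrade_settings(max_surge="0", max_unavailable="1")
assert settings.max_unavailable == "1"
```

A test of this shape would fail against code that omits the `max_unavailable` branch, without ever starting an upgrade.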

Fix

Add the missing assignment after line 3169:

if max_unavailable:
    instance.upgrade_settings.max_unavailable = max_unavailable

Metadata

Labels

  • AKS (az aks/acs/openshift)
  • Auto-Assign (Auto assign by bot)
  • Language
  • Service Attention (This issue is responsible by Azure service team.)
  • Similar-Issue
  • act-observability-squad
  • bug (This issue requires a change to an existing behavior in the product in order to be resolved.)
  • customer-reported (Issues that are reported by GitHub users external to the Azure organization.)
