
feat: Integrate Managed Lustre Dynamic Tier #5791

Open
parulbajaj01 wants to merge 2 commits into GoogleCloudPlatform:develop from parulbajaj01:cmek


@parulbajaj01

Contributor

This PR integrates and documents support for the Managed Lustre Dynamic Tier in Cluster Toolkit.

Additionally, the module documentation has been expanded to address the specific platform-level requirements needed to deploy this tier successfully.

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

parulbajaj01 requested a review from vikramvs-gg on June 16, 2026 09:13
parulbajaj01 requested a review from a team as a code owner on June 16, 2026 09:13
parulbajaj01 added the enhancement label (New feature or request) on Jun 16, 2026
@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces support for the Managed Lustre Dynamic Tier within the Cluster Toolkit. It provides the necessary Terraform configuration to enable this feature and includes comprehensive documentation to guide users through the specific platform-level requirements, such as VPC-scoped quota management and minimum capacity constraints.

Highlights

  • Dynamic Tier Integration: Added support for Managed Lustre Dynamic Tier, including a new enable_dynamic_tier variable and dynamic configuration options in the Terraform module.
  • Documentation Updates: Expanded the README with a dedicated Dynamic Tier example, deployment prerequisites, and important constraints regarding VPC-scoped quotas.
  • Dependency Upgrade: Updated the required Google provider version to >= 7.27.0 to support the new functionality.
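
The highlights above can be pictured with a minimal Terraform sketch. The provider constraint `>= 7.27.0` and the variable name `enable_dynamic_tier` come from this summary; the `bool` type, the `false` default, and the description wording are assumptions made for illustration and may not match the module's actual declaration.

```hcl
# Provider constraint as stated in the PR summary.
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 7.27.0"
    }
  }
}

# Assumed shape of the new toggle: the PR names the variable,
# but the type, default, and description here are illustrative.
variable "enable_dynamic_tier" {
  description = "Whether to deploy the Managed Lustre instance on the Dynamic Tier."
  type        = bool
  default     = false
}
```
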

gemini-code-assist (bot) left a comment

Contributor

Code Review

This pull request adds support for the Dynamic Tier feature in the Managed Lustre module, which includes updating the Google provider version requirement to >= 7.27.0, adding the enable_dynamic_tier variable, and documenting its usage and prerequisites in the README. The review feedback suggests avoiding silently overriding per_unit_storage_throughput when enable_dynamic_tier is enabled, recommending instead the use of a lifecycle precondition to explicitly validate and fail on conflicting configurations.

Comment thread modules/file-system/managed-lustre/main.tf Outdated
parulbajaj01 added the release-key-new-features label (Added to release notes under the "Key New Features" heading) and removed the enhancement label (New feature or request) on Jun 16, 2026
> ```
>
> 2. **Minimum Size:** The minimum capacity is **472,000 GiB** (and must be in multiples of 472,000 GiB).

Contributor

We can add a warning that recreating a Lustre instance (which rotates the IP) breaks existing mount setups. Maybe something like this:

Warning

Static IP Constraint & GKE CSI Support:
Managed Lustre instances do not guarantee static IP allocations. If the Lustre instance is recreated (due to scaling, location moves, etc.), its IP will change. Because GKE CSI PersistentVolume configurations are immutable, an IP change will cause GKE mount failures. You must manually delete and recreate GKE PersistentVolume configurations in such events.

   type    = number
-  default = 500
+  default = null
 }

Contributor

I think we can add a validation here to ensure valid inputs are provided:

  validation {
    condition     = var.per_unit_storage_throughput == null || contains([125, 250, 500, 1000], var.per_unit_storage_throughput)
    error_message = "Throughput must be null or one of: 125, 250, 500, 1000."
  }

Please check if this goes well with the dynamic tier validation.
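
One way the two checks could coexist is sketched below, under stated assumptions. The variable names, the allowed values, and the 500 default for the non-dynamic case come from this thread. The `terraform_data` resource, the `effective_throughput` local, and the declared type and default of `enable_dynamic_tier` are hypothetical stand-ins so that the fragment is self-contained; in the module the precondition sits on the Lustre instance resource instead. The sketch also writes the validation as a conditional expression rather than `||`, a defensive choice so that `contains` is only relied on for non-null values.

```hcl
variable "enable_dynamic_tier" {
  type    = bool
  default = false
}

variable "per_unit_storage_throughput" {
  description = "Throughput of the instance in MB/s/TiB. Valid values are 125, 250, 500, 1000."
  type        = number
  default     = null

  # Checks the value on its own: null is accepted, anything else must be allowed.
  validation {
    condition = (
      var.per_unit_storage_throughput == null
      ? true
      : contains([125, 250, 500, 1000], var.per_unit_storage_throughput)
    )
    error_message = "Throughput must be null or one of: 125, 250, 500, 1000."
  }
}

locals {
  # Hypothetical helper: null for Dynamic Tier, otherwise the user's value or 500.
  effective_throughput = (
    var.enable_dynamic_tier
    ? null
    : coalesce(var.per_unit_storage_throughput, 500)
  )
}

# Stand-in resource so the cross-variable check has somewhere to live.
resource "terraform_data" "throughput_guard" {
  input = local.effective_throughput

  lifecycle {
    # Checks the combination: an explicit throughput conflicts with Dynamic Tier.
    precondition {
      condition     = !(var.enable_dynamic_tier && var.per_unit_storage_throughput != null)
      error_message = "per_unit_storage_throughput must not be set when enable_dynamic_tier is enabled."
    }
  }
}
```

In this arrangement the variable validation rejects values such as 300 regardless of tier, while the precondition rejects a valid value such as 250 only when `enable_dynamic_tier` is true, so the two rules do not overlap.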

condition = !(var.enable_dynamic_tier && var.per_unit_storage_throughput != null)
error_message = "per_unit_storage_throughput must not be set when enable_dynamic_tier is enabled."
}
}

Contributor

I see that there is a restriction on the minimum size_gib: 472000 in the readme. Is it also supposed to be a multiple of 472000? Maybe we can add this precondition:

    precondition {
      condition     = !var.enable_dynamic_tier || (var.size_gib >= 472000 && var.size_gib % 472000 == 0)
      error_message = "For Managed Lustre Dynamic Tier, size_gib must be at least 472000 (GiB) and must be in multiples of 472000."
    }
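
To make the two clauses of the suggested condition concrete, the sketch below evaluates the same expression over a few sample sizes. The sample values, local names, and output name are illustrative only; the 472000 GiB figure comes from the README text quoted in this thread.

```hcl
locals {
  dynamic_tier_unit_gib = 472000

  # Hypothetical sample inputs chosen to exercise each clause.
  candidate_sizes_gib = [0, 236000, 472000, 500000, 944000]

  # Same shape as the suggested precondition, with Dynamic Tier assumed enabled.
  size_is_valid = {
    for size in local.candidate_sizes_gib :
    tostring(size) => (
      size >= local.dynamic_tier_unit_gib &&
      size % local.dynamic_tier_unit_gib == 0
    )
  }
}

output "size_is_valid" {
  value = local.size_is_valid
}
```

Under this expression, 472000 and 944000 pass, while 0, 236000, and 500000 fail. The value 0 is a multiple of 472000 but falls below the minimum, which is why the `>=` clause is needed alongside the modulo check, and 500000 leaves a remainder of 28000.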

@SwarnaBharathiMantena

Contributor

/gemini review

gemini-code-assist (bot) left a comment

Contributor

Code Review

This pull request introduces support for the Dynamic Tier in the Managed Lustre module, updating the Google provider requirement to version 7.27.0 or higher, and adding relevant documentation, variables, and preconditions. The review feedback suggests enhancing validation by adding a precondition to enforce the minimum capacity and multiples of 472,000 GiB for the Dynamic Tier, as well as adding a validation block to restrict the allowed values for the storage throughput variable.

Comment on lines +99 to +102
precondition {
condition = !(var.enable_dynamic_tier && var.per_unit_storage_throughput != null)
error_message = "per_unit_storage_throughput must not be set when enable_dynamic_tier is enabled."
}

Contributor

medium

When enable_dynamic_tier is enabled, the minimum capacity is 472,000 GiB and must be in multiples of 472,000 GiB. Adding a lifecycle precondition to validate this will prevent deployment failures at the GCP API level and fail fast with a clear error message.

    precondition {
      condition     = !(var.enable_dynamic_tier && var.per_unit_storage_throughput != null)
      error_message = "per_unit_storage_throughput must not be set when enable_dynamic_tier is enabled."
    }
    precondition {
      condition     = !var.enable_dynamic_tier || (var.size_gib >= 472000 && var.size_gib % 472000 == 0)
      error_message = "For Dynamic Tier, size_gib must be at least 472,000 GiB and a multiple of 472,000 GiB."
    }
References
  1. Avoid automatically and silently overriding user-provided settings in Terraform configurations. Instead, validate conflicting configurations (e.g., using lifecycle preconditions) to explicitly fail with a clear error message.

Comment on lines 80 to 84
 variable "per_unit_storage_throughput" {
-  description = "Throughput of the instance in MB/s/TiB. Valid values are 125, 250, 500, 1000."
+  description = "Throughput of the instance in MB/s/TiB. Valid values are 125, 250, 500, 1000. If enable_dynamic_tier is false, this defaults to 500."
   type        = number
-  default     = 500
+  default     = null
 }

Contributor

medium

Adding a validation block to per_unit_storage_throughput ensures that any user-provided value is validated at plan time against the allowed values (125, 250, 500, 1000), preventing invalid configurations from being sent to the GCP API. Using the contains function is preferred for validating if a variable's value is within a list of allowed values.

variable "per_unit_storage_throughput" {
  description = "Throughput of the instance in MB/s/TiB. Valid values are 125, 250, 500, 1000. If enable_dynamic_tier is false, this defaults to 500."
  type        = number
  default     = null

  validation {
    condition     = var.per_unit_storage_throughput == null || contains([125, 250, 500, 1000], var.per_unit_storage_throughput)
    error_message = "The per_unit_storage_throughput must be one of 125, 250, 500, or 1000."
  }
}
References
  1. In Terraform, use the contains function for validating if a variable's value is within a list of allowed values to improve readability and maintainability.


Labels

release-key-new-features Added to release notes under the "Key New Features" heading.
