feat: Integrate Managed Lustre Dynamic Tier#5791
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces support for the Managed Lustre Dynamic Tier within the Cluster Toolkit. It provides the necessary Terraform configuration to enable this feature and includes comprehensive documentation to guide users through the specific platform-level requirements, such as VPC-scoped quota management and minimum capacity constraints. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request adds support for the Dynamic Tier feature in the Managed Lustre module, which includes updating the Google provider version requirement to >= 7.27.0, adding the enable_dynamic_tier variable, and documenting its usage and prerequisites in the README. The review feedback suggests avoiding silently overriding per_unit_storage_throughput when enable_dynamic_tier is enabled, recommending instead the use of a lifecycle precondition to explicitly validate and fail on conflicting configurations.
| > ``` | ||
| > | ||
| > 2. **Minimum Size:** The minimum capacity is **472,000 GiB** (and must be in multiples of 472,000 GiB). | ||
|
|
There was a problem hiding this comment.
We can add a warning that recreating a Lustre instance (which rotates the IP) breaks existing mount setups. Maybe something like this:
Warning
Static IP Constraint & GKE CSI Support:
Managed Lustre instances do not guarantee static IP allocations. If the Lustre instance is recreated (due to scaling, location moves, etc.), its IP will change. Because GKE CSI PersistentVolume configurations are immutable, an IP change will cause GKE mount failures. You must manually delete and recreate GKE PersistentVolume configurations in such events.
| type = number | ||
| default = 500 | ||
| default = null | ||
| } |
There was a problem hiding this comment.
I think we can add a validation here to ensure valid inputs are provided:
validation {
condition = var.per_unit_storage_throughput == null || contains([125, 250, 500, 1000], var.per_unit_storage_throughput)
error_message = "Throughput must be null or one of: 125, 250, 500, 1000."
}
Please check if this goes well with the dynamic tier validation.
| condition = !(var.enable_dynamic_tier && var.per_unit_storage_throughput != null) | ||
| error_message = "per_unit_storage_throughput must not be set when enable_dynamic_tier is enabled." | ||
| } | ||
| } |
There was a problem hiding this comment.
I see that there is a restriction on the minimum size_gib: 472000 in the readme. Is it also supposed to be a multiple of 472000? Maybe we can add this precondition:
precondition {
condition = !var.enable_dynamic_tier || (var.size_gib >= 472000 && var.size_gib % 472000 == 0)
error_message = "For Managed Lustre Dynamic Tier, size_gib must be at least 472000 (GiB) and must be in multiples of 472000."
}
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for the Dynamic Tier in the Managed Lustre module, updating the Google provider requirement to version 7.27.0 or higher, and adding relevant documentation, variables, and preconditions. The review feedback suggests enhancing validation by adding a precondition to enforce the minimum capacity and multiples of 472,000 GiB for the Dynamic Tier, as well as adding a validation block to restrict the allowed values for the storage throughput variable.
| precondition { | ||
| condition = !(var.enable_dynamic_tier && var.per_unit_storage_throughput != null) | ||
| error_message = "per_unit_storage_throughput must not be set when enable_dynamic_tier is enabled." | ||
| } |
There was a problem hiding this comment.
When enable_dynamic_tier is enabled, the minimum capacity is 472,000 GiB and must be in multiples of 472,000 GiB. Adding a lifecycle precondition to validate this will prevent deployment failures at the GCP API level and fail fast with a clear error message.
precondition {
condition = !(var.enable_dynamic_tier && var.per_unit_storage_throughput != null)
error_message = "per_unit_storage_throughput must not be set when enable_dynamic_tier is enabled."
}
precondition {
condition = !var.enable_dynamic_tier || (var.size_gib >= 472000 && var.size_gib % 472000 == 0)
error_message = "For Dynamic Tier, size_gib must be at least 472,000 GiB and a multiple of 472,000 GiB."
}
References
- Avoid automatically and silently overriding user-provided settings in Terraform configurations. Instead, validate conflicting configurations (e.g., using lifecycle preconditions) to explicitly fail with a clear error message.
| variable "per_unit_storage_throughput" { | ||
| description = "Throughput of the instance in MB/s/TiB. Valid values are 125, 250, 500, 1000." | ||
| description = "Throughput of the instance in MB/s/TiB. Valid values are 125, 250, 500, 1000. If enable_dynamic_tier is false, this defaults to 500." | ||
| type = number | ||
| default = 500 | ||
| default = null | ||
| } |
There was a problem hiding this comment.
Adding a validation block to per_unit_storage_throughput ensures that any user-provided value is validated at plan time against the allowed values (125, 250, 500, 1000), preventing invalid configurations from being sent to the GCP API. Using the contains function is preferred for validating if a variable's value is within a list of allowed values.
variable "per_unit_storage_throughput" {
description = "Throughput of the instance in MB/s/TiB. Valid values are 125, 250, 500, 1000. If enable_dynamic_tier is false, this defaults to 500."
type = number
default = null
validation {
condition = var.per_unit_storage_throughput == null || contains([125, 250, 500, 1000], var.per_unit_storage_throughput)
error_message = "The per_unit_storage_throughput must be one of 125, 250, 500, or 1000."
}
}
References
- In Terraform, use the
containsfunction for validating if a variable's value is within a list of allowed values to improve readability and maintainability.
This PR integrates and documents support for the Managed Lustre Dynamic Tier in Cluster Toolkit.
Additionally, the module documentation has been expanded to address the specific platform-level requirements needed to deploy this tier successfully.
Submission Checklist
NOTE: Community submissions can take up to 2 weeks to be reviewed.
Please take the following actions before submitting this pull request.