Commit 765c86c

CLOUDP-305848 & CLOUDP-338152 - release on master merges auxiliary images (#626)
# Summary

- replacing ambiguous flags with clearer version selection, removing legacy/manual release variants, and ensuring proper argument passing throughout the pipeline

**Pipeline and Release Process Improvements:**

* Replaces the `release_agent` task/variant with a new `release_on_merge` task/variant that runs automatically on every merge, detecting changes to `release.json` and releasing images based on evg anchors. Manual/manual-patch-only variants for agent and OM 6.0 images are removed.
* Updates the agent image build logic to use an explicit `agent_version` parameter (`all`, `current`, or a specific version), removing the previous use of `--all-agents` and `--current-agents` flags. Tools version is now required for specific agent builds.

```mermaid
flowchart TD
    A[Merge to master] --> B[release_on_merge task]
    B --> C[Load config<br/>release.json + .evergreen.yml]
    C --> D[Release via pipeline.py cloud_manager agent]
    C --> E{For each OM version<br/>6.0, 7.0, 8.0}
    E --> F[Release via pipeline.py ops-manager image]
    E --> G[Release via pipeline.py matching agent]
    D & F & G --> H[atomic_pipeline.py skip_if_exists handles duplicates]
```

related pct pr: 10gen/mms#149833

## Proof of Work

- manual run: [Patch](https://spruce.mongodb.com/task/mongodb_kubernetes_release_on_merge_release_on_merge_patch_0b18efc62ffc05d9eb7047d206f43b185a2913b3_69369ba4b3de00000781dcff_25_12_08_09_34_30/logs?execution=0)
- relevant logs

```
[2025/12/05 16:18:51.394] 2025-12-05 15:18:51,394 - INFO - Found OM 60: 6.0.27
[2025/12/05 16:18:51.394] 2025-12-05 15:18:51,394 - INFO - Found OM 70: 7.0.19
[2025/12/05 16:18:51.394] 2025-12-05 15:18:51,394 - INFO - Found OM 80: 8.0.16
[2025/12/05 16:18:51.394] 2025-12-05 15:18:51,394 - INFO - === Releasing cloud_manager agent: 13.43.0.9995-1 ===
...
...
All specified image tags already exist. Skipping build.
```

- new tool dry-run

```
[2025/12/08 10:39:51.814] ============================================================
[2025/12/08 10:39:51.814] RELEASE SUMMARY:
[2025/12/08 10:39:51.814] ============================================================
[2025/12/08 10:39:51.814] Agents:
[2025/12/08 10:39:51.814] ✓ 13.43.0.9995-1 (cloud_manager)
[2025/12/08 10:39:51.814] ✓ 12.0.35.7911-1 (OM 6.0.27)
[2025/12/08 10:39:51.814] ✓ 107.0.19.8805-1 (OM 7.0.19)
[2025/12/08 10:39:51.814] ✓ 108.0.16.8895-1 (OM 8.0.16)
[2025/12/08 10:39:51.814] Ops Manager:
[2025/12/08 10:39:51.814] ✓ 6.0.27
[2025/12/08 10:39:51.814] ✓ 7.0.19
[2025/12/08 10:39:51.814] ✓ 8.0.16
[2025/12/08 10:39:51.814] Total: 7 releases
[2025/12/08 10:39:51.814] ============================================================
```

## Checklist

- [x] Have you linked a jira ticket and/or is the ticket in the title?
- [x] Have you checked whether your jira ticket required DOCSP changes?
- [x] Have you added changelog file?
    - use `skip-changelog` label if not needed
    - refer to [Changelog files and Release Notes](https://github.com/mongodb/mongodb-kubernetes/blob/master/CONTRIBUTING.md#changelog-files-and-release-notes) section in CONTRIBUTING.md for more details
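The release plan shown in the flowchart and the dry-run summary can be sketched roughly as follows. This is a minimal illustration, not the contents of `scripts/release/release_om_and_agents.py`, which this diff does not show. The function name `plan_agent_releases` and the sample data are hypothetical; the `opsManagerMapping` layout is assumed from the keys read by the removed `get_changed_agents` helper. The sketch lists every mapped OM version, whereas the task described above picks one version per OM series.

```python
from typing import Dict, List, Tuple


def plan_agent_releases(release_data: Dict) -> List[Tuple[str, str, str]]:
    """Return (label, agent_version, tools_version) for each agent to release.

    Hypothetical helper: assumes release.json keeps the layout
    supportedImages -> mongodb-agent -> opsManagerMapping.
    """
    mapping = (
        release_data.get("supportedImages", {})
        .get("mongodb-agent", {})
        .get("opsManagerMapping", {})
    )
    plan: List[Tuple[str, str, str]] = []

    # The cloud_manager agent is tracked by two top-level keys.
    cm_agent = mapping.get("cloud_manager")
    cm_tools = mapping.get("cloud_manager_tools")
    if cm_agent and cm_tools:
        plan.append(("cloud_manager", cm_agent, cm_tools))

    # Each OM version maps to the agent and tools versions it ships with.
    for om_version, versions in mapping.get("ops_manager", {}).items():
        plan.append((f"OM {om_version}", versions["agent_version"], versions["tools_version"]))

    return plan


# Sample data for illustration only; the tools versions are placeholders.
sample = {
    "supportedImages": {
        "mongodb-agent": {
            "opsManagerMapping": {
                "cloud_manager": "13.43.0.9995-1",
                "cloud_manager_tools": "0.0.0-example",
                "ops_manager": {
                    "6.0.27": {"agent_version": "12.0.35.7911-1", "tools_version": "0.0.0-example"},
                    "7.0.19": {"agent_version": "107.0.19.8805-1", "tools_version": "0.0.0-example"},
                    "8.0.16": {"agent_version": "108.0.16.8895-1", "tools_version": "0.0.0-example"},
                },
            }
        }
    }
}

for label, agent, tools in plan_agent_releases(sample):
    print(f"{agent} ({label}), tools {tools}")
```

Because the build step skips image tags that already exist, a plan like this can be replayed on every merge without first diffing `release.json` against `origin/master`.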
1 parent 3faf0fd commit 765c86c

10 files changed, with 431 additions and 432 deletions

.evergreen-functions.yml

Lines changed: 1 addition & 0 deletions
@@ -496,6 +496,7 @@ functions:
           IMAGE_NAME: ${image_name}
           BUILD_SCENARIO_OVERRIDE: ${build_scenario}
           FLAGS: ${flags}
+          AGENT_VERSION_OVERRIDE: ${agent_version}
 
   teardown_cloud_qa_all:
     - command: shell.exec

.evergreen.yml

Lines changed: 27 additions & 77 deletions
@@ -287,19 +287,21 @@ tasks:
     commands:
       - func: lint_repo
 
-  # pct only triggers this variant once a new agent image is out
-  - name: release_agent
-    # this enables us to run this variant either manually (patch) which pct does or during an OM bump (github_pr)
-    allowed_requesters: [ "patch", "github_pr" ]
+  # Runs on every merge - detects release.json changes and releases appropriate images
+  - name: release_om_and_agents
+    allowed_requesters: [ "patch" , "commit"]
     commands:
       - func: clone
+      - func: python_venv
       - func: setup_building_host
       - func: quay_login
       - func: setup_docker_sbom
-      - func: pipeline
-        vars:
-          image_name: agent
-          build_scenario: release
+      - command: subprocess.exec
+        params:
+          working_dir: src/github.com/mongodb/mongodb-kubernetes
+          binary: scripts/dev/run_python.sh
+          args:
+            - scripts/release/release_om_and_agents.py
 
   - name: migrate_all_agents
     # this enables us to run this variant manually to build all the agents for the new agent registry
@@ -343,8 +345,9 @@ tasks:
       - func: quay_login
       - func: pipeline
         vars:
+          agent_version: all
           image_name: agent
-          flags: "--parallel --all-agents --skip-if-exists=false"
+          flags: "--parallel --skip-if-exists=false"
 
   - name: rebuild_currently_used_agents
     # this enables us to run this manually (patch) and rebuild current agent versions to verify
@@ -356,8 +359,9 @@ tasks:
       - func: quay_login
       - func: pipeline
         vars:
+          agent_version: current
           image_name: agent
-          flags: "--parallel --current-agents --skip-if-exists=false"
+          flags: "--parallel --skip-if-exists=false"
 
   - name: build_kubectl_mongodb_plugin
     commands:
@@ -463,7 +467,8 @@ tasks:
       - func: pipeline
         vars:
           image_name: agent
-          flags: "--parallel --all-agents"
+          agent_version: all
+          flags: "--parallel"
 
   - name: build_init_database_image_ubi
     commands:
@@ -749,7 +754,7 @@ task_groups:
       - e2e_sharded_cluster_scram_sha_256_switch_project
       - e2e_replica_set_scram_sha_1_switch_project
       - e2e_sharded_cluster_scram_sha_1_switch_project
-      # TODO CLOUDP-349093 - Disabled these tests as they don't use the password secret, and project migrations aren't fully supported yet.
+      # TODO CLOUDP-349093 - Disabled these tests as they don't use the password secret, and project migrations aren't fully supported yet.
       # e2e_sharded_cluster_x509_switch_project
       # e2e_replica_set_x509_switch_project
       # e2e_replica_set_ldap_switch_project
@@ -1893,26 +1898,6 @@ buildvariants:
     tasks:
       - name: build_om_images
 
-  # It will be called by pct while bumping the agent cloud manager image
-  - name: release_agent
-    display_name: release_agent
-    tags: [ "manual_patch", "release_agent" ]
-    run_on:
-      - release-ubuntu2404-small # This is required for CISA attestation https://jira.mongodb.org/browse/DEVPROD-17780
-    depends_on:
-      - variant: init_test_run
-        name: build_agent_images_ubi # this ensures the agent gets released to ECR as well
-      - variant: e2e_multi_cluster_kind
-        name: '*'
-      - variant: e2e_static_multi_cluster_2_clusters
-        name: '*'
-      - variant: e2e_mdb_kind_ubi_cloudqa
-        name: '*'
-      - variant: e2e_static_mdb_kind_ubi_cloudqa
-        name: '*'
-    tasks:
-      - name: release_agent
-
   # Only called manually, It's used for testing the task release_agents in case the release.json
   # has not changed, and you still want to push the images to registry.
   - name: manual_release_all_agents
@@ -1946,51 +1931,6 @@ buildvariants:
       - name: backup_csv_images_limit_3
       - name: backup_csv_images_all
 
-  - name: publish_om60_images
-    display_name: publish_om60_images
-    tags: [ "manual_patch" ]
-    allowed_requesters: [ "patch", "github_pr" ]
-    run_on:
-      - release-ubuntu2404-small # This is required for CISA attestation https://jira.mongodb.org/browse/DEVPROD-17780
-    depends_on:
-      - variant: e2e_om60_kind_ubi
-        name: '*'
-      - variant: e2e_static_om60_kind_ubi
-        name: '*'
-    tasks:
-      - name: publish_ops_manager
-      - name: release_agent
-
-  - name: publish_om70_images
-    display_name: publish_om70_images
-    tags: [ "manual_patch" ]
-    allowed_requesters: [ "patch", "github_pr" ]
-    run_on:
-      - release-ubuntu2404-small # This is required for CISA attestation https://jira.mongodb.org/browse/DEVPROD-17780
-    depends_on:
-      - variant: e2e_om70_kind_ubi
-        name: '*'
-      - variant: e2e_static_om70_kind_ubi
-        name: '*'
-    tasks:
-      - name: publish_ops_manager
-      - name: release_agent
-
-  - name: publish_om80_images
-    display_name: publish_om80_images
-    tags: [ "manual_patch" ]
-    allowed_requesters: [ "patch", "github_pr" ]
-    run_on:
-      - release-ubuntu2404-small # This is required for CISA attestation https://jira.mongodb.org/browse/DEVPROD-17780
-    depends_on:
-      - variant: e2e_om80_kind_ubi
-        name: '*'
-      - variant: e2e_static_om80_kind_ubi
-        name: '*'
-    tasks:
-      - name: publish_ops_manager
-      - name: release_agent
-
   - name: migrate_all_agents
     display_name: migrate_all_agents
     tags: [ "manual_patch" ]
@@ -1999,3 +1939,13 @@ buildvariants:
       - ubuntu2404-large
     tasks:
       - name: migrate_all_agents
+
+  # Runs on every merge to master and releases images auxiliary to the operator release like OM and the Agent
+  - name: release_om_and_agents
+    display_name: release_om_and_agents
+    allowed_requesters: [ "patch" , "commit"]
+    run_on:
+      - release-ubuntu2404-small
+    patchable: false # Only run on commit builds
+    tasks:
+      - name: release_om_and_agents

scripts/release/agent/detect_ops_manager_changes.py renamed to scripts/release/agent/agents_to_rebuild.py

Lines changed: 0 additions & 71 deletions
@@ -19,32 +19,6 @@
 logger = logging.getLogger(__name__)
 
 
-def get_content_from_git(commit: str, file_path: str) -> Optional[str]:
-    try:
-        result = subprocess.run(
-            ["git", "show", f"{commit}:{file_path}"], capture_output=True, text=True, check=True, timeout=30
-        )
-        return result.stdout
-    except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as e:
-        logger.error(f"Failed to get {file_path} from git commit {commit}: {e}")
-        return None
-
-
-def load_release_json_from_master() -> Optional[Dict]:
-    base_revision = "origin/master"
-
-    content = get_content_from_git(base_revision, "release.json")
-    if not content:
-        logger.error(f"Could not retrieve release.json from {base_revision}")
-        return None
-
-    try:
-        return json.loads(content)
-    except json.JSONDecodeError as e:
-        logger.error(f"Invalid JSON in base release.json: {e}")
-        return None
-
-
 def load_current_release_json() -> Optional[Dict]:
     try:
         with open("release.json", "r") as f:
@@ -60,28 +34,6 @@ def extract_ops_manager_mapping(release_data: Dict) -> Dict:
     return release_data.get("supportedImages", {}).get("mongodb-agent", {}).get("opsManagerMapping", {})
 
 
-def get_changed_agents(current_mapping: Dict, base_mapping: Dict) -> List[Tuple[str, str]]:
-    """Returns list of (agent_version, tools_version) tuples for added/changed agents"""
-    added_agents = []
-
-    current_om_mapping = current_mapping.get("ops_manager", {})
-    master_om_mapping = base_mapping.get("ops_manager", {})
-
-    for om_version, agent_tools_version in current_om_mapping.items():
-        if om_version not in master_om_mapping or master_om_mapping[om_version] != agent_tools_version:
-            added_agents.append((agent_tools_version["agent_version"], agent_tools_version["tools_version"]))
-
-    current_cm = current_mapping.get("cloud_manager")
-    master_cm = base_mapping.get("cloud_manager")
-    current_cm_tools = current_mapping.get("cloud_manager_tools")
-    master_cm_tools = base_mapping.get("cloud_manager_tools")
-
-    if current_cm != master_cm or current_cm_tools != master_cm_tools:
-        added_agents.append((current_cm, current_cm_tools))
-
-    return list(set(added_agents))
-
-
 def get_tools_version_for_agent(agent_version: str) -> str:
     """Get tools version for a given agent version from release.json"""
     release_data = load_current_release_json()
@@ -206,26 +158,3 @@ def get_currently_used_agents() -> List[Tuple[str, str]]:
     except Exception as e:
         logger.error(f"Error getting currently used agents: {e}")
         return []
-
-
-def detect_ops_manager_changes() -> List[Tuple[str, str]]:
-    """Returns (has_changes, changed_agents_list)"""
-    logger.info("=== Detecting OM Mapping Changes (Local vs Base) ===")
-
-    current_release = load_current_release_json()
-    if not current_release:
-        logger.error("Could not load current local release.json")
-        return []
-
-    master_release = load_release_json_from_master()
-    if not master_release:
-        logger.warning("Could not load base release.json, assuming changes exist")
-        return []
-
-    current_mapping = extract_ops_manager_mapping(current_release)
-    base_mapping = extract_ops_manager_mapping(master_release)
-
-    if current_mapping != base_mapping:
-        return get_changed_agents(current_mapping, base_mapping)
-    else:
-        return []

scripts/release/atomic_pipeline.py

Lines changed: 11 additions & 11 deletions
@@ -15,8 +15,7 @@
 from opentelemetry import trace
 
 from lib.base_logger import logger
-from scripts.release.agent.detect_ops_manager_changes import (
-    detect_ops_manager_changes,
+from scripts.release.agent.agents_to_rebuild import (
     get_all_agents_for_rebuild,
     get_currently_used_agents,
 )
@@ -330,22 +329,23 @@ def build_upgrade_hook_image(build_configuration: ImageBuildConfiguration):
 
 
 def build_agent(build_configuration: ImageBuildConfiguration):
-    """
-    Build the agent only for the latest operator for patches and operator releases.
+    """Build the agent image(s). Validation happens in pipeline.py."""
+    version = build_configuration.version
 
-    """
-    if build_configuration.all_agents:
+    if version == "all":
         agent_versions_to_build = get_all_agents_for_rebuild()
         logger.info("building all agents")
-    elif build_configuration.currently_used_agents:
+    elif version == "current":
         agent_versions_to_build = get_currently_used_agents()
-        logger.info("building current used agents")
+        logger.info("building currently used agents")
+    elif version and build_configuration.agent_tools_version:
+        agent_versions_to_build = [(version, build_configuration.agent_tools_version)]
+        logger.info(f"building agent {version} with tools {build_configuration.agent_tools_version}")
     else:
-        agent_versions_to_build = detect_ops_manager_changes()
-        logger.info("building agents for changed OM versions")
+        raise ValueError("No agent selection provided - this should be caught by pipeline.py validation")
 
     if not agent_versions_to_build:
-        logger.info("No changes detected, skipping agent build")
+        logger.warning("No agent versions found to build")
         return
 
     logger.info(f"Building Agent versions: {agent_versions_to_build}")
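The selection logic in the new `build_agent` can be exercised in isolation with a sketch like the one below. It is an illustration under assumptions, not code from the repository: `select_agents` is a hypothetical name, and the two lookup functions are passed in as callables so the example runs without `release.json`.

```python
from typing import Callable, List, Optional, Tuple

AgentList = List[Tuple[str, str]]  # (agent_version, tools_version) pairs


def select_agents(
    version: Optional[str],
    tools_version: Optional[str],
    all_agents: Callable[[], AgentList],
    current_agents: Callable[[], AgentList],
) -> AgentList:
    """Resolve the version selector to a list of agents to build.

    Hypothetical stand-in for the branch structure in build_agent:
    "all" and "current" delegate to lookups, anything else is treated
    as an explicit version and needs a tools version alongside it.
    """
    if version == "all":
        return all_agents()
    if version == "current":
        return current_agents()
    if version and tools_version:
        return [(version, tools_version)]
    raise ValueError("No agent selection provided")


# Stub lookups with placeholder versions, for illustration only.
def stub_all() -> AgentList:
    return [("1.0.0", "0.0.1"), ("2.0.0", "0.0.2")]


def stub_current() -> AgentList:
    return [("2.0.0", "0.0.2")]


print(select_agents("all", None, stub_all, stub_current))
print(select_agents("current", None, stub_all, stub_current))
print(select_agents("3.0.0", "0.0.3", stub_all, stub_current))
```

One consequence of this shape is that the implicit "detect changes against master" fallback is gone: a missing selector is an error instead of a silent no-op.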

scripts/release/build/image_build_configuration.py

Lines changed: 2 additions & 4 deletions
@@ -4,8 +4,7 @@
 from scripts.release.build.build_scenario import BuildScenario
 from scripts.release.build.image_build_process import ImageBuilder
 
-SUPPORTED_PLATFORMS = ["darwin/amd64", "darwin/arm64", "linux/amd64", "linux/arm64", "linux/s390x",
-                       "linux/ppc64le"]
+SUPPORTED_PLATFORMS = ["darwin/amd64", "darwin/arm64", "linux/amd64", "linux/arm64", "linux/s390x", "linux/ppc64le"]
 
 
 @dataclass
@@ -24,9 +23,8 @@ class ImageBuildConfiguration:
     # Agent specific
     parallel: bool = False
     parallel_factor: int = 0
-    all_agents: bool = False
-    currently_used_agents: bool = False
     architecture_suffix: bool = False
+    agent_tools_version: Optional[str] = None # Explicit tools version for agent builds
 
     def is_release_scenario(self) -> bool:
         return self.scenario == BuildScenario.RELEASE

scripts/release/pipeline.py

Lines changed: 28 additions & 13 deletions
@@ -62,6 +62,9 @@
 )
 from scripts.release.build.image_build_process import PodmanImageBuilder
 
+CURRENT_AGENTS = "current"
+ALL_AGENTS = "all"
+
 """
 The goal of main.py, image_build_configuration.py and build_context.py is to provide a single source of truth for the build
 configuration. All parameters that depend on the the build environment (local dev, evg, etc) should be resolved here and
@@ -142,9 +145,25 @@ def image_build_config_from_args(args) -> ImageBuildConfiguration:
     if type(builder) is PodmanImageBuilder and len(platforms) > 1:
         raise ValueError("Cannot use Podman builder with multi-platform builds")
 
-    # Validate version - only agent can have None version as the versions are managed by the agent
-    # which are externally retrieved from release.json
-    if version is None and image != "agent":
+    # Get agent_tools_version for agent builds (from --agent-tools-version arg)
+    agent_tools_version = getattr(args, "agent_tools_version", None)
+
+    # Validate version requirements
+    if image == "agent":
+        # Agent builds: version can be "all", "current", or explicit version (requires agent_tools_version)
+        if version is None:
+            raise ValueError(
+                "Agent build requires --version. Use one of:\n"
+                " --version all (for all agents in release.json)\n"
+                " --version current (for currently used agents)\n"
+                " --version <ver> --agent-tools-version <tools_ver> (for specific agent)"
+            )
+        is_special_version = version in (ALL_AGENTS, CURRENT_AGENTS)
+        if not is_special_version and agent_tools_version is None:
+            raise ValueError(
+                f"For agent builds with explicit version '{version}', --agent-tools-version must also be provided."
+            )
+    elif version is None:
         raise ValueError(f"Version cannot be empty for {image}.")
 
     return ImageBuildConfiguration(
@@ -160,9 +179,8 @@ def image_build_config_from_args(args) -> ImageBuildConfiguration:
         skip_if_exists=skip_if_exists,
         parallel=args.parallel,
         parallel_factor=args.parallel_factor,
-        all_agents=args.all_agents,
-        currently_used_agents=args.current_agents,
         architecture_suffix=architecture_suffix,
+        agent_tools_version=agent_tools_version,
     )
 
 
@@ -277,14 +295,11 @@ def main():
         help="Number of agent builds to run in parallel, defaults to number of cores",
     )
     parser.add_argument(
-        "--all-agents",
-        action="store_true",
-        help="Build all agent images.",
-    )
-    parser.add_argument(
-        "--current-agents",
-        action="store_true",
-        help="Build all currently used agent images.",
+        "--agent-tools-version",
+        metavar="",
+        action="store",
+        type=str,
+        help="Tools version to use when building agent image. Required when --version is an explicit version (not 'all' or 'current').",
     )
     parser.add_argument(
         "--architecture-suffix",

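For readers who want to try the new `--version` / `--agent-tools-version` contract from the command line, here is a small self-contained sketch. It assumes a positional `image` argument and reports problems through `argparse`'s `parser.error` rather than raising `ValueError` as `image_build_config_from_args` does; both choices are assumptions made for this example, and `parse_agent_args` is a hypothetical name.

```python
import argparse
from typing import List

SPECIAL_VERSIONS = ("all", "current")


def parse_agent_args(argv: List[str]) -> argparse.Namespace:
    """Parse and validate a reduced, hypothetical subset of the pipeline CLI."""
    parser = argparse.ArgumentParser(prog="pipeline-sketch")
    parser.add_argument("image", help="Image to build, e.g. 'agent'")
    parser.add_argument("--version", type=str, default=None)
    parser.add_argument("--agent-tools-version", type=str, default=None)
    args = parser.parse_args(argv)

    if args.image == "agent":
        # Agent builds accept "all", "current", or an explicit version.
        if args.version is None:
            parser.error("agent builds require --version (all, current, or an explicit version)")
        # An explicit version is only buildable together with a tools version.
        if args.version not in SPECIAL_VERSIONS and args.agent_tools_version is None:
            parser.error(f"--agent-tools-version is required for explicit version '{args.version}'")
    elif args.version is None:
        parser.error(f"version cannot be empty for {args.image}")

    return args


# Placeholder versions, for illustration only.
ok = parse_agent_args(["agent", "--version", "1.2.3", "--agent-tools-version", "0.0.1"])
print(ok.version, ok.agent_tools_version)
```

`parser.error` prints the usage line and exits with status 2, which suits a CLI entry point; raising an exception, as the diff does, is easier to unit-test from Python.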