| name | skill-koji-triage |
|---|---|
| description | [Skill] Examine Koji builds, fetch task info and logs from the Koji Web UI, identify failures, and provide root cause analysis. Use when triaging Koji build failures, investigating failed tasks, downloading build logs, or searching for broken packages. Triggers: koji failure, koji build failed, koji task, koji log, koji triage, build failure analysis. |
- Base URL: Get from the user, from the env variable `KOJI_BASE_URL`, or pre-configured in the MCP server.
  - If the env variable is not set and the MCP is not pre-configured, prompt the user to input the base URL of the Koji Web UI (e.g., `https://koji.example.com`).
- MCP tools (preferred): Use the `koji` MCP server tools to fetch pages and logs. These write output to temp files under `base/build/work/scratch/koji/` to avoid bloating context. Use `read_file` and `grep_search` on the resulting files to inspect content.

  Available tools: `koji_status`, `set_koji_url`, `koji_fetch`, `koji_allow_insecure`, `koji_cleanup`. All tools return a dict with the result plus server state (base URL, output dir, etc.).
  - Call `koji_status` to check if a base URL is already configured (via env or a prior call).
  - Either call `set_koji_url` to set a default, or pass `override_base_url` directly to `koji_fetch` per-call. Setting a default is convenient when all requests target the same instance, and `override_base_url` is useful for ad-hoc queries against other instances.
  - Call `koji_fetch` with the path (e.g., `/koji/taskinfo?taskID=3307`). It returns the file path where content was saved.
    - If the fetch fails with an SSL certificate error (e.g., self-signed cert), ask the user if they want to skip SSL verification, then call `koji_allow_insecure` (optionally with `override_base_url`) to proceed. Do NOT proceed without explicit user approval for the specific URL.
  - Use `read_file` or `grep_search` on the saved file to extract the information you need.
  - Call `koji_cleanup` when done to remove fetched temp files and reclaim disk space (NOTE: This will clean up ALL scratch files, so only call it when you're completely done with the current files).
- MCP is required. If MCP tools are not working, ask the user to fix their MCP configuration rather than falling back to CLI tools.
- Network issues: Koji is typically only accessible via VPN or corporate network. If you encounter connection errors or timeouts, assume it is a network permissions issue and ask the user to confirm they are connected to the appropriate network before retrying. Do NOT keep retrying on your own.

When the user provides a package name instead of a task ID, use both discovery paths below to find all recent failures. Neither path alone is sufficient.

Why two paths? The package info page only lists builds where Koji created a build record. If a task fails early (e.g., `buildSRPMFromSCM` fails), no build record exists and the failure is invisible on the package page. The failed tasks list catches these, but doesn't filter by package, so you must grep for the name. Always run both.

1. Search for the package: `/koji/search?match=glob&type=package&terms=<PACKAGE_NAME>`
   - If the exact name doesn't match, try a glob: `terms=<NAME>*`
   - Extract the `packageID` from the results (look for `packageinfo?packageID=` links)
   - If the search returns only one match, Koji redirects directly to the package info page (skip step 2).
2. Get recent builds from the package info page: `/koji/packageinfo?packageID=<PACKAGE_ID>`
   - Look for `buildinfo?buildID=` links with state `failed`
   - Note the build NVRs (Name-Version-Release) for context
3. Get the build's parent task ID: `/koji/buildinfo?buildID=<BUILD_ID>`
   - Extract the `taskinfo?taskID=` link; this is the parent `build` task
   - The parent task's Descendants section lists the actual failed child tasks (`buildArch`, `buildSRPMFromSCM`)

Search the global failed tasks list for the package name: `/koji/tasks?owner=&state=failed&view=flat&method=build&order=-id`
- This lists all recent failed `build` tasks, newest first
- Grep the results for the package name to find relevant task IDs
- This catches early failures (e.g., `buildSRPMFromSCM` errors, plugin/callback errors) that never create a build record and would be invisible on the package info page

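
The grep step on the failed tasks list can be sketched in Python as follows. This is a minimal illustration, not part of the MCP server: the helper name `failed_tasks_for` and the sample rows are hypothetical, and it assumes each task row in the saved page keeps its `taskinfo?taskID=` link and the task label (which carries the package name) on the same line.

```python
import re

# Assumption: a task row's link and its label sit on the same line of the
# saved page, so a line-by-line scan is enough to pair them up.
TASK_LINK = re.compile(r"taskinfo\?taskID=(\d+)")


def failed_tasks_for(package, page_text):
    """Return task IDs from lines that mention the package, in page order."""
    found = []
    for line in page_text.splitlines():
        if package not in line:
            continue
        for match in TASK_LINK.finditer(line):
            task_id = int(match.group(1))
            if task_id not in found:  # keep the first occurrence only
                found.append(task_id)
    return found


# Hypothetical rows standing in for the saved failed-tasks page.
sample = (
    '<a href="taskinfo?taskID=87474">build (tag, kernel.git)</a>\n'
    '<a href="taskinfo?taskID=87470">build (tag, bash.git)</a>\n'
)
print(failed_tasks_for("kernel", sample))
```

The match is a plain substring test, so a search for `kernel` would also pick up rows for packages such as `kernel-headers`; review the hits before treating them as relevant.
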
Collect the parent task IDs from both paths, deduplicate, then for each parent task:
- Fetch `taskinfo?taskID=<PARENT_ID>` to see the Descendants tree
- Identify which child task(s) failed (`buildSRPMFromSCM` or `buildArch`)
- Proceed to the Investigation Workflow below for each failed child

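
As a rough sketch of the collect-and-deduplicate step, the snippet below pulls numeric IDs out of saved page text and merges the results of both discovery paths with a set union. The helper `link_ids` and the two sample strings are hypothetical; only the link formats (`taskinfo?taskID=` and friends) come from the text above.

```python
import re


def link_ids(page_text, page, param):
    """Collect the numeric IDs from links such as taskinfo?taskID=123."""
    pattern = re.compile(re.escape(f"{page}?{param}=") + r"(\d+)")
    return {int(m.group(1)) for m in pattern.finditer(page_text)}


# Hypothetical snippets standing in for saved pages from each path.
buildinfo_page = '<a href="taskinfo?taskID=32608">build</a>'
failed_list_hits = (
    '<a href="taskinfo?taskID=87474">build</a> '
    '<a href="taskinfo?taskID=32608">build</a>'
)

path_a = link_ids(buildinfo_page, "taskinfo", "taskID")
path_b = link_ids(failed_list_hits, "taskinfo", "taskID")

# A set union deduplicates tasks that both paths found.
parents = sorted(path_a | path_b)
print(parents)  # [32608, 87474]
```

The same helper works for `packageinfo?packageID=` and `buildinfo?buildID=` links by changing the `page` and `param` arguments.
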
| Goal | URL Path |
|---|---|
| Search packages by name | /koji/search?match=glob&type=package&terms=<NAME> |
| Package info (recent builds) | /koji/packageinfo?packageID=<ID> |
| Build details → parent task | /koji/buildinfo?buildID=<ID> |
| All recent failed build tasks | /koji/tasks?owner=&state=failed&view=flat&method=build&order=-id |
| Search builds by NVR | /koji/search?match=glob&type=build&terms=<NAME>* |

```python
# Path A: Package info page
# 1. Search for the kernel package
koji_fetch(path="/koji/search?match=glob&type=package&terms=kernel")
# → grep for 'packageinfo?packageID=' → e.g., packageID=6
# → (if single match, Koji redirects directly to package info page)

# 2. Get package info to see recent builds
koji_fetch(path="/koji/packageinfo?packageID=6")
# → grep for 'failed' and 'buildinfo?buildID=' → e.g., buildID=1482

# 3. Get the parent task ID from the build info page
koji_fetch(path="/koji/buildinfo?buildID=1482")
# → grep for 'taskinfo?taskID=' → e.g., parent task 32608

# Path B: Failed tasks list (run in parallel with Path A)
# 4. Search all recent failed build tasks for "kernel"
koji_fetch(path="/koji/tasks?owner=&state=failed&view=flat&method=build&order=-id")
# → grep for 'kernel' → may find additional tasks (e.g., task 87474)
#   that failed before creating a build record

# Combine: parent tasks = {32608, 87474}
# 5. For each parent, fetch task info and find the failed child
koji_fetch(path="/koji/taskinfo?taskID=32608")
# → Descendants: buildArch x86_64 (task 39696) = failed
koji_fetch(path="/koji/taskinfo?taskID=87474")
# → Descendants: buildSRPMFromSCM (task 87475) = failed

# 6. Investigate each failed child using the workflow below
```

Branch mismatch warning: The branch checked out locally may not correspond to the code that Koji built. If the local component definition, spec, or overlays don't seem to match what the build logs show, check which branch/commit Koji used (visible in the task info page or `checkout.log`) and compare with the current local branch (`git branch --show-current` / `git log --oneline -1`). Use non-destructive git commands to inspect the Koji branch without modifying the working tree, e.g., `git show <koji-ref>:<path>` to read files, `git diff HEAD..<koji-ref> -- <path>` to compare. Avoid `git checkout` to switch branches, as the user may have uncommitted work.

Fetch the task page and extract key details (state, method, children).
If you don't know whether a base URL is already configured, call `koji_status` first.
Decide whether the default base URL should be changed (is this a one-off query or a long-term investigation?).
Call `koji_fetch` with path `/koji/taskinfo?taskID=<TASK_ID>`, either using the default base URL or passing an `override_base_url`. Use either built-in tools or command-line tools on the saved file to extract state, method, and child task IDs.
Parse the HTML to find:
- State: `failed`, `closed` (success), `canceled`, `free`, `open`
- Method: `build` (parent), `buildSRPMFromSCM`, `buildArch`
- Child task IDs: Look for `taskinfo?taskID=` links in the Descendants section

A build task always has children: one buildSRPMFromSCM (SRPM creation) and one or more buildArch tasks (per architecture). The parent fails if any child fails.
Extract all `taskinfo?taskID=<ID>` links from the parent task page. Then fetch each child task page and check its state. Typically:
- `buildSRPMFromSCM` succeeds
- One of the `buildArch` tasks fails (look for `state.*failed` in the HTML)

Important: Also check the Result field in the task info HTML. For some failures (especially buildSRPMFromSCM), the error message is embedded directly in the page under "Result" and is NOT in any downloadable log file.
Download logs from the failed child task.
Call koji_fetch with the path /koji/getfile?taskID=<CHILD_TASK_ID>&volume=DEFAULT&name=<FILENAME>&offset=0. The content is saved to a temp file — use grep_search or read_file to inspect it. For large logs, fetch with &offset=-4000 first to get the tail.
Available log files (listed on the child task page):

| File | Contains |
|---|---|
| `root.log` | Mock chroot setup, dependency resolution, RPM build commands |
| `build.log` | RPM build output (%prep, %build, %install, %check) |
| `mock_output.log` | Mock orchestration output, systemd-nspawn errors |
| `checkout.log` | Git clone/fetch output (buildSRPMFromSCM tasks only) |
| `dnf5.log` | Package manager operations |
| `state.log` | Mock state transitions |
| `hw_info.log` | Builder hardware info |
| `mock_config.log` | Mock configuration |

Use offset=-4000 instead of offset=0 to get just the tail (last 4KB) for a quick peek.
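
To make the log URL layout concrete, here is a small sketch that builds the `getfile` path passed to `koji_fetch`. The helper name `getfile_path` is hypothetical; the query parameters (`taskID`, `volume=DEFAULT`, `name`, `offset`) are the ones described above.

```python
from urllib.parse import urlencode


def getfile_path(task_id, name, offset=0):
    """Build the getfile path for a task log; a negative offset reads from the end."""
    query = urlencode(
        {"taskID": task_id, "volume": "DEFAULT", "name": name, "offset": offset}
    )
    return f"/koji/getfile?{query}"


# Full log, then just the last 4KB for a quick look.
print(getfile_path(6059, "build.log"))
print(getfile_path(6059, "mock_output.log", offset=-4000))
```

Using `urlencode` keeps the parameter order stable and escapes any unusual characters in the file name.
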
There are four main failure categories. Check them in this order:

1. Plugin/callback errors (task info Result field)
   - `CallbackError: Error running postSCMCheckout callback`: The `azldev` builder plugin failed during source preparation
   - `azldev failed with return code 1`: The Azure Linux source prep tool could not process the package
   - These errors appear in the task page Result field, NOT in any log file
   - Cause: Package spec/sources are malformed, or the azldev tool doesn't support the package layout
2. Infrastructure errors (`mock_output.log`)
   - `Failed to register machine: The name org.freedesktop.machine1 was not provided`: systemd-nspawn/machined not available on builder (infrastructure issue, retry on different host)
   - `could not init mock buildroot`: Mock chroot setup failed before builds started
   - These are builder infrastructure issues, not package issues. Retrying usually fixes them.
3. Dependency resolution errors (`root.log`)
   - `No match for argument:`: A `BuildRequires` package is not available in the build tag
   - `Failed to resolve the transaction`: dnf5 cannot satisfy dependencies
   - Cause: Missing packages in the build tag, build order issues, or circular dependencies
4. Build/test errors (`build.log`)
   - `error:`: Compiler or build tool errors
   - `FAIL` / `ERROR`: Test suite failures in `%check` phase
   - `Bad exit status from`: RPM scriptlet failure (shows which phase: `%build`, `%check`, `%install`)
   - Look for the Testsuite summary block to see PASS/FAIL/ERROR/SKIP counts

Common root causes:

| Pattern | Cause | Log / Location |
|---|---|---|
| `CallbackError...azldev` | Source prep plugin failure | Task info HTML Result field |
| `azldev failed with return code 1` | Package spec/source issue | Task info HTML Result field |
| `Failed to register machine` | Builder missing systemd-machined | mock_output.log |
| `could not init mock buildroot` | Mock infrastructure failure | mock_output.log |
| `No match for argument: <pkg>` | Missing BuildRequires in build tag | root.log |
| `Failed to resolve the transaction` | Dependency resolution failure | root.log |
| `FAIL tests/...` with test summary | Test failure in %check | build.log |
| `Bad exit status from (%check)` | Test suite failed | build.log |
| `Bad exit status from (%build)` | Compilation failed | build.log |
| `error: unpacking of archive failed` | Corrupt source/SRPM | build.log |
| `no inotify_add_watch` | Container inotify limitation | build.log |
| `mock exited with status 30` | Dependency OR infrastructure | root.log or mock_output.log |
| `mock exited with status 1` | Build or test failure | build.log |

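
The pattern table lends itself to a simple first-match classifier. The sketch below is illustrative only: the `RULES` list and `classify` helper are hypothetical names, the rules cover a subset of the table, and the two `Bad exit status` regexes assume rpmbuild prints a script path between `from` and the phase name.

```python
import re

# Ordered to mirror the check order: Result field, then mock_output.log,
# root.log, and build.log. Patterns and causes are copied from the table.
RULES = [
    (r"CallbackError", "Source prep plugin failure"),
    (r"azldev failed with return code 1", "Package spec/source issue"),
    (r"Failed to register machine", "Builder missing systemd-machined"),
    (r"could not init mock buildroot", "Mock infrastructure failure"),
    (r"No match for argument:", "Missing BuildRequires in build tag"),
    (r"Failed to resolve the transaction", "Dependency resolution failure"),
    (r"Bad exit status from .*\(%check\)", "Test suite failed"),
    (r"Bad exit status from .*\(%build\)", "Compilation failed"),
    (r"error: unpacking of archive failed", "Corrupt source/SRPM"),
]


def classify(text):
    """Return the cause for the first rule that matches, or None."""
    for pattern, cause in RULES:
        if re.search(pattern, text):
            return cause
    return None


print(classify("error: Bad exit status from /var/tmp/rpm-tmp.abc123 (%check)"))
print(classify("No match for argument: foo-devel"))
```

A `None` result only means none of the listed patterns matched; it is a prompt to read the log directly, not a sign that the build is healthy.
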
Upstream Fedora's Koji (https://koji.fedoraproject.org/koji/) and the Fedora dist-git are useful references for understanding package issues. If a package is failing in Azure Linux, check whether it fails in Fedora too, and whether the issue has an upstream fix or is already fixed in newer Fedora releases. Never assume that a package is "just broken". If the same package builds successfully in Fedora, it is critical to understand what difference between the Fedora and Azure Linux build environments causes the failure. This often leads to insights about missing dependencies, unsupported configurations, or other issues in Azure Linux that need to be addressed.

For older upstream logs, you may have to use https://kojipkgs.fedoraproject.org, which archives old build logs that are no longer on the main Koji server. This server is usually protected by a bot challenge. You can repurpose the `koji_fetch` tool with `override_base_url` to fetch from it as well; it will often pass the bot challenge.

```python
# 1. Set the Koji base URL (once per session), or pass override_base_url to each call
set_koji_url(base_url="https://koji.example.com")

# 2. Fetch parent task info → saved to temp file
koji_fetch(path="/koji/taskinfo?taskID=3307")
# → {"output": "Wrote 12345 bytes (200 lines) to .../koji/koji_abc123", "default_base_url": "...", ...}

# 3. Use grep_search on the saved file to find child task IDs and state
#    grep for 'taskinfo?taskID=' to find children
#    grep for 'failed' to find which child failed

# 4. Fetch the failed child's task info (check Result field for plugin errors)
koji_fetch(path="/koji/taskinfo?taskID=6059")

# 5. Fetch logs in order of priority:
# 5a. mock_output.log (infrastructure errors)
koji_fetch(path="/koji/getfile?taskID=6059&volume=DEFAULT&name=mock_output.log&offset=-4000")
# → grep the file from the "output" field for: ERROR, Failed to register, could not init

# 5b. root.log (dependency resolution errors)
koji_fetch(path="/koji/getfile?taskID=6059&volume=DEFAULT&name=root.log")
# → grep for: No match for, Failed to resolve

# 5c. build.log (build/test errors)
koji_fetch(path="/koji/getfile?taskID=6059&volume=DEFAULT&name=build.log")
# → grep for: FAIL, ERROR, Bad exit status, error:

# 5d. For exceptionally large files, consider using 'fold' on the saved file.
```

The MCP server is the only supported method for fetching Koji information: it handles URL construction, SSL issues, output file creation and cleanup, etc.
If MCP tools are not working, help the user fix their MCP configuration. A non-exhaustive list of things to check:
- If they are in VSCode or Copilot CLI, ensure they are running in a reasonable workspace (i.e., not a parent folder that lacks the right `mcp.json` file, or ad hoc files that don't have a root dir)
- If they REALLY don't want to be in a workspace, guide them through adding a global MCP configuration
  - Copilot CLI: `~/.copilot/mcp-config.json` (https://docs.github.com/en/copilot/how-tos/copilot-cli/use-copilot-cli#add-an-mcp-server)
  - VSCode: user profile (https://code.visualstudio.com/docs/copilot/customization/mcp-servers#_add-an-mcp-server)
- Check that the MCP server can start; it might be missing the `mcp` Python package dependency
- Something else...