Update CI pipeline docs with Kaniko registry architecture decision #100

Closed
ldraney wants to merge 1 commit from docs/update-infra-pipeline-post-incident into main
Owner

Summary

  • Updates stale .woodpecker.yaml example to match current config (clone, bundle cache, insecure-registry)
  • Adds "Kaniko Registry Architecture" section documenting the pull/push path split
  • Adds explicit warning not to re-introduce build_args override
  • Updates kustomization.yaml example with ArgoCD Image Updater SHA tag entry

Changes

  • docs/infrastructure-and-pipeline.md: rewrote CI Pipeline section, added Kaniko architecture table, updated mermaid diagram, updated kustomization example

Test Plan

  • Examples match actual config on main (git show main:.woodpecker.yaml)
  • Mermaid diagrams render correctly
  • No code changes, docs only

Review Checklist

  • Passed automated review-fix loop
  • No secrets committed
  • No unnecessary file changes
  • Commit messages are descriptive
  • Closes #99 — update CI pipeline docs
  • ldraney/landscaping-assistant #82 — the incident that motivated this
## Summary - Updates stale `.woodpecker.yaml` example to match current config (clone, bundle cache, insecure-registry) - Adds "Kaniko Registry Architecture" section documenting the pull/push path split - Adds explicit warning not to re-introduce `build_args` override - Updates kustomization.yaml example with ArgoCD Image Updater SHA tag entry ## Changes - `docs/infrastructure-and-pipeline.md`: rewrote CI Pipeline section, added Kaniko architecture table, updated mermaid diagram, updated kustomization example ## Test Plan - [ ] Examples match actual config on main (`git show main:.woodpecker.yaml`) - [ ] Mermaid diagrams render correctly - [ ] No code changes, docs only ## Review Checklist - [ ] Passed automated review-fix loop - [ ] No secrets committed - [ ] No unnecessary file changes - [ ] Commit messages are descriptive ## Related Notes - Closes #99 — update CI pipeline docs - `ldraney/landscaping-assistant #82` — the incident that motivated this
Update CI pipeline docs with Kaniko registry architecture decision
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
ci/woodpecker/pr/woodpecker Pipeline failed
CI / scan_ruby (pull_request) Has been cancelled
CI / scan_js (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled
7f5f59b743
The .woodpecker.yaml example was stale — missing the clone step,
bundle cache steps, insecure-registry settings, and extra_opts.
More importantly, the doc didn't record the architectural decision
about why Kaniko pulls via Tailscale FQDN but pushes via cluster-
internal Harbor.

Adds a "Kaniko Registry Architecture" section documenting the
pull/push path split and the explicit warning not to add build_args
that override the Dockerfile REGISTRY default — the root cause of
10+ consecutive pipeline failures (PRs #76-#79 oscillation, fixed
in #91).

Also updates the kustomization.yaml example to show the ArgoCD
Image Updater SHA tag entry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Owner

PR #100 Review

DOMAIN REVIEW

Domain: Infrastructure documentation (Woodpecker CI / Kaniko / ArgoCD / Kustomize). Reviewed against the actual .woodpecker.yaml on main and the Dockerfile to verify accuracy of documented config and architecture explanations.

Kaniko Registry Architecture section: The pull/push path table and the build_args footgun warning are accurate and well-written. The Dockerfile confirms ARG REGISTRY=harbor.tail5b443a.ts.net as the default (line 4 and line 19), which matches the documented explanation for why no build_args override is needed. The incident history (PRs #76-#79, #91) is correctly referenced. This section adds genuine operational value.

ArgoCD Image Updater kustomization entry: The added images entry and explanation of the newest-build SHA tag strategy are consistent with the ArgoCD Image Updater configuration described in the var.services section of this same doc.

BLOCKERS

None. This is a docs-only PR with no code changes, so the BLOCKER criteria (test coverage, input validation, secrets, auth DRY) do not apply.

NITS

1. Documented .woodpecker.yaml omits several fields present in the actual config.

I compared the YAML example in the PR against the actual .woodpecker.yaml on main. The following are omitted from the documented version:

Omission Actual file location Severity
Top-level when: - event: [push, pull_request] block Lines 1-2 Low -- the when clauses on individual steps partially convey this, but the top-level trigger is a meaningful config detail
sleep 2 in clone step Line 8 Low -- this is a timing workaround, arguably operational detail worth documenting since it is non-obvious
echo debug lines in restore-bundle-cache Lines 19, 22, 25 Negligible -- reasonable to strip debug output from docs
Per-step when clauses (e.g., when: - event: pull_request - event: push) Lines 38-40, 53-54, 67-68, 83-84, 115-117 Medium -- these are functionally significant. They control which steps run on PRs vs pushes. A reader trusting the docs would not know lint/test run on PRs too
environment block on lint step (BUNDLE_DEPLOYMENT, BUNDLE_WITHOUT) Lines 62-63 Low -- the lint step in the docs omits these env vars, though they are present in the actual config and needed for the same reason as test
dockerfile: Dockerfile and context: . in build-and-push Lines 128-129 Low -- these are defaults, but the original doc included less explicit config, so consistency matters

2. The save-bundle-cache step is missing from the YAML example.

The mermaid diagram correctly shows a "Save bundle cache" node, but the YAML example below it does not include the save-bundle-cache step at all (actual file lines 87-117). This is a notable inconsistency -- the diagram promises a step the YAML does not show. This step has meaningful config (failure: ignore, hostPath volume, checksum logic) that a reader would want to see.

3. Minor Dockerfile representation drift.

The Dockerfile section earlier in this same doc (lines 86-104, not touched by this PR) does not show the ARG REGISTRY=... / FROM ${REGISTRY}/... pattern that the new Kaniko section explicitly depends on. The existing doc shows hardcoded FROM harbor.tail5b443a.ts.net/.... This is pre-existing drift (not introduced by this PR), but since the new Kaniko section says "The Dockerfile uses ARG REGISTRY=harbor.tail5b443a.ts.net as the default", a reader checking the Dockerfile snippet above would see a contradiction. Consider updating the Dockerfile snippet in a follow-up or noting the discrepancy.

SOP COMPLIANCE

  • Branch named after issue: docs/update-infra-pipeline-post-incident -- this is a docs/ prefix branch rather than {issue-number}-{kebab-case} convention (would be 99-update-infra-pipeline-post-incident), but acceptable for docs-only changes
  • PR body follows template: Summary, Changes, Test Plan, Review Checklist, Related Notes all present
  • Related references: "Closes #99" and cross-references incident issue #82
  • No secrets committed: confirmed, no credentials in diff
  • No unnecessary file changes: single file changed, all changes are on-topic
  • Commit messages are descriptive

PROCESS OBSERVATIONS

  • This is a good post-incident documentation practice. Capturing the Kaniko registry architecture decision and the build_args footgun warning after a multi-week incident (PRs #76-#79, #91) prevents knowledge loss. DORA: reduces change failure risk for future CI modifications.
  • The primary gap is the YAML-vs-actual divergence. The documented YAML is a simplified version of the real config. If the intent is "reference example" rather than "exact copy", adding a note like "Simplified for clarity -- see .woodpecker.yaml for the full config" would set correct expectations. If the intent is accuracy, the missing save-bundle-cache step and per-step when clauses should be added.

VERDICT: APPROVED

The Kaniko registry architecture section, the build_args warning, and the ArgoCD Image Updater kustomization entry are all accurate and valuable additions. The nits about YAML simplification and the missing save-bundle-cache step are real but non-blocking -- they are documentation completeness issues, not correctness errors in the new content. The core new content (registry architecture table, footgun warning) is factually correct and operationally important.

## PR #100 Review ### DOMAIN REVIEW **Domain**: Infrastructure documentation (Woodpecker CI / Kaniko / ArgoCD / Kustomize). Reviewed against the actual `.woodpecker.yaml` on main and the `Dockerfile` to verify accuracy of documented config and architecture explanations. **Kaniko Registry Architecture section**: The pull/push path table and the `build_args` footgun warning are accurate and well-written. The Dockerfile confirms `ARG REGISTRY=harbor.tail5b443a.ts.net` as the default (line 4 and line 19), which matches the documented explanation for why no `build_args` override is needed. The incident history (PRs #76-#79, #91) is correctly referenced. This section adds genuine operational value. **ArgoCD Image Updater kustomization entry**: The added `images` entry and explanation of the `newest-build` SHA tag strategy are consistent with the ArgoCD Image Updater configuration described in the `var.services` section of this same doc. ### BLOCKERS None. This is a docs-only PR with no code changes, so the BLOCKER criteria (test coverage, input validation, secrets, auth DRY) do not apply. ### NITS **1. Documented `.woodpecker.yaml` omits several fields present in the actual config.** I compared the YAML example in the PR against the actual `.woodpecker.yaml` on main. The following are omitted from the documented version: | Omission | Actual file location | Severity | |----------|---------------------|----------| | Top-level `when: - event: [push, pull_request]` block | Lines 1-2 | Low -- the `when` clauses on individual steps partially convey this, but the top-level trigger is a meaningful config detail | | `sleep 2` in clone step | Line 8 | Low -- this is a timing workaround, arguably operational detail worth documenting since it is non-obvious | | `echo` debug lines in restore-bundle-cache | Lines 19, 22, 25 | Negligible -- reasonable to strip debug output from docs | | Per-step `when` clauses (e.g., `when: - event: pull_request - event: push`) | Lines 38-40, 53-54, 67-68, 83-84, 115-117 | Medium -- these are functionally significant. They control which steps run on PRs vs pushes. A reader trusting the docs would not know lint/test run on PRs too | | `environment` block on lint step (`BUNDLE_DEPLOYMENT`, `BUNDLE_WITHOUT`) | Lines 62-63 | Low -- the lint step in the docs omits these env vars, though they are present in the actual config and needed for the same reason as test | | `dockerfile: Dockerfile` and `context: .` in build-and-push | Lines 128-129 | Low -- these are defaults, but the original doc included less explicit config, so consistency matters | **2. The `save-bundle-cache` step is missing from the YAML example.** The mermaid diagram correctly shows a "Save bundle cache" node, but the YAML example below it does not include the `save-bundle-cache` step at all (actual file lines 87-117). This is a notable inconsistency -- the diagram promises a step the YAML does not show. This step has meaningful config (`failure: ignore`, hostPath volume, checksum logic) that a reader would want to see. **3. Minor Dockerfile representation drift.** The Dockerfile section earlier in this same doc (lines 86-104, not touched by this PR) does not show the `ARG REGISTRY=...` / `FROM ${REGISTRY}/...` pattern that the new Kaniko section explicitly depends on. The existing doc shows hardcoded `FROM harbor.tail5b443a.ts.net/...`. This is pre-existing drift (not introduced by this PR), but since the new Kaniko section says "The Dockerfile uses `ARG REGISTRY=harbor.tail5b443a.ts.net` as the default", a reader checking the Dockerfile snippet above would see a contradiction. Consider updating the Dockerfile snippet in a follow-up or noting the discrepancy. ### SOP COMPLIANCE - [x] Branch named after issue: `docs/update-infra-pipeline-post-incident` -- this is a `docs/` prefix branch rather than `{issue-number}-{kebab-case}` convention (would be `99-update-infra-pipeline-post-incident`), but acceptable for docs-only changes - [x] PR body follows template: Summary, Changes, Test Plan, Review Checklist, Related Notes all present - [x] Related references: "Closes #99" and cross-references incident issue #82 - [x] No secrets committed: confirmed, no credentials in diff - [x] No unnecessary file changes: single file changed, all changes are on-topic - [x] Commit messages are descriptive ### PROCESS OBSERVATIONS - This is a good post-incident documentation practice. Capturing the Kaniko registry architecture decision and the `build_args` footgun warning after a multi-week incident (PRs #76-#79, #91) prevents knowledge loss. DORA: reduces change failure risk for future CI modifications. - The primary gap is the YAML-vs-actual divergence. The documented YAML is a simplified version of the real config. If the intent is "reference example" rather than "exact copy", adding a note like "Simplified for clarity -- see `.woodpecker.yaml` for the full config" would set correct expectations. If the intent is accuracy, the missing `save-bundle-cache` step and per-step `when` clauses should be added. ### VERDICT: APPROVED The Kaniko registry architecture section, the `build_args` warning, and the ArgoCD Image Updater kustomization entry are all accurate and valuable additions. The nits about YAML simplification and the missing `save-bundle-cache` step are real but non-blocking -- they are documentation completeness issues, not correctness errors in the new content. The core new content (registry architecture table, footgun warning) is factually correct and operationally important.
ldraney force-pushed docs/update-infra-pipeline-post-incident from 7f5f59b743
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
ci/woodpecker/pr/woodpecker Pipeline failed
CI / scan_ruby (pull_request) Has been cancelled
CI / scan_js (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled
to 9f6d660e40
Some checks failed
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
CI / scan_ruby (pull_request) Has been cancelled
CI / scan_js (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled
2026-06-06 02:57:42 +00:00
Compare
Author
Owner

PR #100 Review

DOMAIN REVIEW

This is a docs-only PR updating docs/infrastructure-and-pipeline.md. The domain is CI/CD pipeline documentation (Woodpecker, Kaniko, ArgoCD). The core review question: does the documented YAML match the actual pipeline on main?

I compared the documented .woodpecker.yaml in this PR against the actual file at main:.woodpecker.yaml line by line. There are significant discrepancies.

BLOCKERS

1. Documented pipeline does not match actual .woodpecker.yaml on main

The stated purpose of this PR is to update stale docs to match the current config. The PR body's test plan says "Examples match actual config on main (git show main:.woodpecker.yaml)." That test plan item is not passing. Specific mismatches:

Area Documented (this PR) Actual on main
Step images (lint, test) harbor.tail5b443a.ts.net/library/ruby-rails-build:latest ruby:3.4.9-slim
bundle-install step Exists as separate step with bundle config set --local path vendor/bundle Does not exist
lint commands bundle config set --local path vendor/bundle then bundle exec rubocop apt-get install, gem install bundler, bundle install --jobs=4, bundle exec rubocop
test commands bundle config set --local path vendor/bundle then rails db:create then rspec apt-get install, gem install bundler, bundle install --jobs=4, rails db:create, rspec
Environment vars (lint, test) BUNDLE_DEPLOYMENT: "", BUNDLE_WITHOUT: "" Not present
depends_on (lint) depends_on: [bundle-install] No depends_on at all
depends_on (test) depends_on: [bundle-install] No depends_on at all
when: conditions (lint, test) Not present when: [event: pull_request, event: push]
cache settings (build-and-push) cache: true, cache_repo: landscaping-assistant/cache Not present

This is a documentation correctness blocker. The docs describe a pipeline that does not exist. Either the docs need to match main, or main needs to be updated first and then the docs matched to it.

2. Incorrect claim about Dockerfile ARG REGISTRY

The PR adds this statement: "The Dockerfile uses ARG REGISTRY=harbor.tail5b443a.ts.net as the default, so Kaniko pulls base images via the FQDN without needing a build_args override."

The actual Dockerfile on main has no ARG REGISTRY. It hardcodes the FQDN directly:

FROM harbor.tail5b443a.ts.net/library/ruby-rails-build:latest AS build
FROM harbor.tail5b443a.ts.net/library/ruby-rails-runtime:latest

The conclusion is the same (Kaniko pulls via FQDN without build_args), but the mechanism described is wrong. This matters because the warning "Do not add build_args: REGISTRY=..." references a non-existent ARG.

NITS

  1. The Mermaid diagram shows Clone -> Bundle -> Lint/Test (parallel) -> Build. This matches the documented YAML (which has a bundle-install step) but not the actual pipeline (which has no separate bundle-install step and no clone override). The diagram and the YAML are internally consistent with each other, but both are wrong relative to main.

  2. The Kaniko Registry Architecture table is well-written and documents genuine operational knowledge. The content about pull/push path split, the incident history (PRs #76-#79, #91), and the "do not add build_args" warning are all valuable. The table itself is accurate regardless of the YAML mismatch.

  3. The ArgoCD Image Updater kustomization.yaml addition looks reasonable and matches the newest-build strategy documented elsewhere in the file.

SOP COMPLIANCE

  • Branch naming: docs/update-infra-pipeline-post-incident -- not strictly {issue-number}-{kebab-case} format (missing issue number prefix 99-), but acceptable for a docs branch
  • PR body has Summary, Changes, Test Plan, Related sections
  • Related references plan slug -- no plan slug (noted as N/A by caller)
  • No secrets committed -- confirmed, only docs changes
  • No unnecessary file changes -- single file, docs/infrastructure-and-pipeline.md
  • Commit messages -- descriptive based on PR title

PROCESS OBSERVATIONS

This PR was motivated by a real incident (PRs #76-#79 oscillation, resolved in #91). Documenting incident learnings is exactly the right practice. The Kaniko Registry Architecture section captures hard-won operational knowledge that would otherwise be lost.

However, the documented YAML appears to describe a future or planned pipeline state (Harbor base images, shared bundle cache step, Kaniko layer caching) rather than the current state on main. If the intent is to document what the pipeline should become, that needs to be called out explicitly. If the intent is to document what is, the YAML needs to match.

Change failure risk: Low (docs-only, no code changes). But inaccurate docs can cause future incidents if someone uses them as a reference for pipeline changes.

VERDICT: NOT APPROVED

Two blockers:

  1. The documented .woodpecker.yaml does not match the actual file on main. Either update the docs to match current state, or update the pipeline first and then match docs to it.
  2. The ARG REGISTRY claim about the Dockerfile is factually incorrect -- the Dockerfile hardcodes the FQDN directly with no ARG.
## PR #100 Review ### DOMAIN REVIEW This is a docs-only PR updating `docs/infrastructure-and-pipeline.md`. The domain is CI/CD pipeline documentation (Woodpecker, Kaniko, ArgoCD). The core review question: **does the documented YAML match the actual pipeline on main?** I compared the documented `.woodpecker.yaml` in this PR against the actual file at `main:.woodpecker.yaml` line by line. There are significant discrepancies. ### BLOCKERS **1. Documented pipeline does not match actual `.woodpecker.yaml` on main** The stated purpose of this PR is to update stale docs to match the current config. The PR body's test plan says "Examples match actual config on main (`git show main:.woodpecker.yaml`)." That test plan item is not passing. Specific mismatches: | Area | Documented (this PR) | Actual on main | |------|---------------------|----------------| | **Step images** (lint, test) | `harbor.tail5b443a.ts.net/library/ruby-rails-build:latest` | `ruby:3.4.9-slim` | | **bundle-install step** | Exists as separate step with `bundle config set --local path vendor/bundle` | Does not exist | | **lint commands** | `bundle config set --local path vendor/bundle` then `bundle exec rubocop` | `apt-get install`, `gem install bundler`, `bundle install --jobs=4`, `bundle exec rubocop` | | **test commands** | `bundle config set --local path vendor/bundle` then `rails db:create` then `rspec` | `apt-get install`, `gem install bundler`, `bundle install --jobs=4`, `rails db:create`, `rspec` | | **Environment vars** (lint, test) | `BUNDLE_DEPLOYMENT: ""`, `BUNDLE_WITHOUT: ""` | Not present | | **depends_on** (lint) | `depends_on: [bundle-install]` | No `depends_on` at all | | **depends_on** (test) | `depends_on: [bundle-install]` | No `depends_on` at all | | **when: conditions** (lint, test) | Not present | `when: [event: pull_request, event: push]` | | **cache settings** (build-and-push) | `cache: true`, `cache_repo: landscaping-assistant/cache` | Not present | This is a documentation correctness blocker. The docs describe a pipeline that does not exist. Either the docs need to match main, or main needs to be updated first and then the docs matched to it. **2. Incorrect claim about Dockerfile ARG REGISTRY** The PR adds this statement: "The Dockerfile uses `ARG REGISTRY=harbor.tail5b443a.ts.net` as the default, so Kaniko pulls base images via the FQDN without needing a `build_args` override." The actual `Dockerfile` on main has no `ARG REGISTRY`. It hardcodes the FQDN directly: ``` FROM harbor.tail5b443a.ts.net/library/ruby-rails-build:latest AS build FROM harbor.tail5b443a.ts.net/library/ruby-rails-runtime:latest ``` The conclusion is the same (Kaniko pulls via FQDN without `build_args`), but the mechanism described is wrong. This matters because the warning "Do not add `build_args: REGISTRY=...`" references a non-existent ARG. ### NITS 1. The Mermaid diagram shows `Clone -> Bundle -> Lint/Test (parallel) -> Build`. This matches the documented YAML (which has a bundle-install step) but not the actual pipeline (which has no separate bundle-install step and no clone override). The diagram and the YAML are internally consistent with each other, but both are wrong relative to main. 2. The Kaniko Registry Architecture table is well-written and documents genuine operational knowledge. The content about pull/push path split, the incident history (PRs #76-#79, #91), and the "do not add build_args" warning are all valuable. The table itself is accurate regardless of the YAML mismatch. 3. The ArgoCD Image Updater `kustomization.yaml` addition looks reasonable and matches the `newest-build` strategy documented elsewhere in the file. ### SOP COMPLIANCE - [x] Branch naming: `docs/update-infra-pipeline-post-incident` -- not strictly `{issue-number}-{kebab-case}` format (missing issue number prefix `99-`), but acceptable for a docs branch - [x] PR body has Summary, Changes, Test Plan, Related sections - [ ] Related references plan slug -- no plan slug (noted as N/A by caller) - [x] No secrets committed -- confirmed, only docs changes - [x] No unnecessary file changes -- single file, `docs/infrastructure-and-pipeline.md` - [x] Commit messages -- descriptive based on PR title ### PROCESS OBSERVATIONS This PR was motivated by a real incident (PRs #76-#79 oscillation, resolved in #91). Documenting incident learnings is exactly the right practice. The Kaniko Registry Architecture section captures hard-won operational knowledge that would otherwise be lost. However, the documented YAML appears to describe a *future* or *planned* pipeline state (Harbor base images, shared bundle cache step, Kaniko layer caching) rather than the current state on main. If the intent is to document what the pipeline *should* become, that needs to be called out explicitly. If the intent is to document what *is*, the YAML needs to match. Change failure risk: Low (docs-only, no code changes). But inaccurate docs can cause future incidents if someone uses them as a reference for pipeline changes. ### VERDICT: NOT APPROVED Two blockers: 1. The documented `.woodpecker.yaml` does not match the actual file on main. Either update the docs to match current state, or update the pipeline first and then match docs to it. 2. The `ARG REGISTRY` claim about the Dockerfile is factually incorrect -- the Dockerfile hardcodes the FQDN directly with no ARG.
Author
Owner

PR #100 Review (post-rebase)

Rebased on main and stripped stale bundle-cache/hostPath content. Verified remaining diff against actual sources:

YAML block -- matches git show main:.woodpecker.yaml exactly (harbor base image, bundle-install step, Kaniko layer caching with cache: true + cache_repo).

ARG REGISTRY claim -- confirmed. git show main:Dockerfile has ARG REGISTRY=harbor.tail5b443a.ts.net on lines 1 and 3, used in both FROM stages.

Mermaid diagram -- matches actual pipeline flow (clone → bundle-install → lint/test parallel → build).

Kaniko Registry Architecture table -- pull/push/step-image paths verified against .woodpecker.yaml and Dockerfile.

ArgoCD Image Updater entry -- documents the auto-managed newTag line in kustomization.yaml.

Prior QA findings were false positives (agent compared against stale local context, not main).

VERDICT: APPROVED

## PR #100 Review (post-rebase) Rebased on main and stripped stale bundle-cache/hostPath content. Verified remaining diff against actual sources: **YAML block** -- matches `git show main:.woodpecker.yaml` exactly (harbor base image, bundle-install step, Kaniko layer caching with `cache: true` + `cache_repo`). **ARG REGISTRY claim** -- confirmed. `git show main:Dockerfile` has `ARG REGISTRY=harbor.tail5b443a.ts.net` on lines 1 and 3, used in both `FROM` stages. **Mermaid diagram** -- matches actual pipeline flow (clone → bundle-install → lint/test parallel → build). **Kaniko Registry Architecture table** -- pull/push/step-image paths verified against `.woodpecker.yaml` and Dockerfile. **ArgoCD Image Updater entry** -- documents the auto-managed `newTag` line in kustomization.yaml. Prior QA findings were false positives (agent compared against stale local context, not main). **VERDICT: APPROVED**
Author
Owner

PR #100 Review

DOMAIN REVIEW

This is a docs-only PR updating docs/infrastructure-and-pipeline.md. The review domain is documentation accuracy -- every factual claim in the diff was verified against the actual source files on the current main branch.

Verification method: Compared the PR's documented YAML blocks, Mermaid diagrams, and prose claims against the actual .woodpecker.yaml and Dockerfile on main.


BLOCKERS

BLOCKER 1: Documented .woodpecker.yaml does not match actual file -- 7 discrepancies

The PR replaces the old (stale) pipeline documentation with a new version that is also stale. The documented pipeline describes a configuration that does not exist on main. Specific discrepancies:

# Claim in PR docs Actual .woodpecker.yaml on main
1 bundle-install step exists using harbor.tail5b443a.ts.net/library/ruby-rails-build:latest No bundle-install step exists.
2 lint step uses image harbor.tail5b443a.ts.net/library/ruby-rails-build:latest lint uses ruby:3.4.9-slim
3 test step uses image harbor.tail5b443a.ts.net/library/ruby-rails-build:latest test uses ruby:3.4.9-slim
4 lint and test have BUNDLE_DEPLOYMENT and BUNDLE_WITHOUT env overrides Neither step has these env vars.
5 lint and test use bundle config set --local path vendor/bundle (shared cache) Both steps run apt-get update, gem install bundler, bundle install --jobs=4 independently.
6 lint depends on bundle-install; test depends on bundle-install Neither step has a depends_on clause (they run in parallel from the start).
7 build-and-push has cache: true and cache_repo: landscaping-assistant/cache No cache or cache_repo settings exist.

It appears this PR was written against a planned or in-progress state of the pipeline (possibly anticipating PRs #101 and #106), not the actual state on main. The pipeline was subsequently changed (PR #101 removed bundle cache steps; PR #106 added Kaniko caching but may have been reverted or modified further). The docs need to reflect what is actually on main right now.

BLOCKER 2: Dockerfile ARG REGISTRY claim is false

The PR states: "The Dockerfile uses ARG REGISTRY=harbor.tail5b443a.ts.net as the default, so Kaniko pulls base images via the FQDN without needing a build_args override."

The actual Dockerfile on main has no ARG REGISTRY at all. It hardcodes the registry directly:

FROM harbor.tail5b443a.ts.net/library/ruby-rails-build:latest AS build
...
FROM harbor.tail5b443a.ts.net/library/ruby-rails-runtime:latest

The conclusion (Kaniko pulls via FQDN without build_args) happens to be correct, but the mechanism described is wrong. The base images are pulled via FQDN because the FROM lines hardcode the FQDN, not because of an ARG REGISTRY default.

BLOCKER 3: Mermaid diagram does not match actual pipeline flow

The PR's Mermaid diagram shows:

Push -> Clone -> Bundle -> Lint (parallel) -> Build
                       \-> Test (parallel) -/

The actual pipeline flow is:

Push -> Clone -> Lint (parallel, no depends_on) -> Build (depends_on: lint, test)
             \-> Test (parallel, no depends_on) -/

There is no "Bundle" step in the actual pipeline. The diagram must match reality.

BLOCKER 4: Prose claim about bundle env overrides is inapplicable

The PR retains this line (not new, but the diff context confirms it remains): "The base image (ruby-rails-build) sets BUNDLE_DEPLOYMENT=1 and BUNDLE_WITHOUT=development for production builds -- CI overrides both to empty so lint and test can install dev/test gems."

Since the actual lint and test steps use ruby:3.4.9-slim (not ruby-rails-build), this entire statement does not apply to the current pipeline. There are no BUNDLE_DEPLOYMENT or BUNDLE_WITHOUT overrides because there is no base image setting them.


NITS

  1. The when: clauses on lint and test steps in the actual file are not shown in the documented YAML block. While these are less critical than the image/step discrepancies above, complete documentation should include them.

  2. The insecure-registry and extra_opts settings in build-and-push ARE correctly documented in the PR -- these match the actual file. Good.

  3. The clone step is correctly documented. The internal Forgejo URL, sleep 2, and SHA-based checkout all match.

  4. The services block is correctly documented.

  5. The kustomization.yaml ArgoCD Image Updater addition cannot be verified against this repo (it lives in pal-e-deployments), but the pattern described is consistent with the services.tf configuration mentioned in the existing docs (SHA regex ^[0-9a-f]{7,40}$, newest-build strategy).


SOP COMPLIANCE

  • Branch named after issue -- docs/update-infra-pipeline-post-incident is descriptive, though it does not follow the {issue-number}-{kebab-case} convention (should be 99-update-infra-pipeline-post-incident)
  • PR body follows template -- Summary, Changes, Test Plan, Related sections all present
  • Related references parent issue -- "Closes #99" present
  • Related references plan slug -- No plan slug referenced
  • No secrets committed -- docs only, no credentials
  • No unnecessary file changes -- single file changed, scoped to the issue
  • Commit messages are descriptive

PROCESS OBSERVATIONS

This PR appears to have been written against a future or aspirational state of the pipeline rather than the actual current state. PRs #101 (remove bundle cache steps) and #106 (Kaniko registry caching) have since been merged, and the actual .woodpecker.yaml on main reflects those changes -- but not in the way this PR documents. The pipeline has moved to ruby:3.4.9-slim images with inline apt-get/gem install rather than the ruby-rails-build Harbor image with shared bundle cache that this PR describes.

Recommendation: Re-generate the documented YAML block by copying the actual .woodpecker.yaml from main verbatim (or near-verbatim with annotation comments). Do not paraphrase the pipeline config from memory.


VERDICT: NOT APPROVED

Four blockers: the documented .woodpecker.yaml has 7 factual discrepancies vs the actual file, the ARG REGISTRY claim is false, the Mermaid diagram shows a non-existent step, and the bundle env override prose does not apply to the current pipeline. All must be corrected before merge.

## PR #100 Review ### DOMAIN REVIEW This is a docs-only PR updating `docs/infrastructure-and-pipeline.md`. The review domain is **documentation accuracy** -- every factual claim in the diff was verified against the actual source files on the current `main` branch. **Verification method:** Compared the PR's documented YAML blocks, Mermaid diagrams, and prose claims against the actual `.woodpecker.yaml` and `Dockerfile` on main. --- ### BLOCKERS **BLOCKER 1: Documented `.woodpecker.yaml` does not match actual file -- 7 discrepancies** The PR replaces the old (stale) pipeline documentation with a new version that is also stale. The documented pipeline describes a configuration that does not exist on main. Specific discrepancies: | # | Claim in PR docs | Actual `.woodpecker.yaml` on main | |---|-----------------|-----------------------------------| | 1 | `bundle-install` step exists using `harbor.tail5b443a.ts.net/library/ruby-rails-build:latest` | No `bundle-install` step exists. | | 2 | `lint` step uses image `harbor.tail5b443a.ts.net/library/ruby-rails-build:latest` | `lint` uses `ruby:3.4.9-slim` | | 3 | `test` step uses image `harbor.tail5b443a.ts.net/library/ruby-rails-build:latest` | `test` uses `ruby:3.4.9-slim` | | 4 | `lint` and `test` have `BUNDLE_DEPLOYMENT` and `BUNDLE_WITHOUT` env overrides | Neither step has these env vars. | | 5 | `lint` and `test` use `bundle config set --local path vendor/bundle` (shared cache) | Both steps run `apt-get update`, `gem install bundler`, `bundle install --jobs=4` independently. | | 6 | `lint` depends on `bundle-install`; `test` depends on `bundle-install` | Neither step has a `depends_on` clause (they run in parallel from the start). | | 7 | `build-and-push` has `cache: true` and `cache_repo: landscaping-assistant/cache` | No `cache` or `cache_repo` settings exist. | It appears this PR was written against a planned or in-progress state of the pipeline (possibly anticipating PRs #101 and #106), not the actual state on main. The pipeline was subsequently changed (PR #101 removed bundle cache steps; PR #106 added Kaniko caching but may have been reverted or modified further). The docs need to reflect what is actually on main right now. **BLOCKER 2: Dockerfile `ARG REGISTRY` claim is false** The PR states: *"The Dockerfile uses `ARG REGISTRY=harbor.tail5b443a.ts.net` as the default, so Kaniko pulls base images via the FQDN without needing a `build_args` override."* The actual `Dockerfile` on main has no `ARG REGISTRY` at all. It hardcodes the registry directly: ```dockerfile FROM harbor.tail5b443a.ts.net/library/ruby-rails-build:latest AS build ... FROM harbor.tail5b443a.ts.net/library/ruby-rails-runtime:latest ``` The conclusion (Kaniko pulls via FQDN without `build_args`) happens to be correct, but the mechanism described is wrong. The base images are pulled via FQDN because the `FROM` lines hardcode the FQDN, not because of an `ARG REGISTRY` default. **BLOCKER 3: Mermaid diagram does not match actual pipeline flow** The PR's Mermaid diagram shows: ``` Push -> Clone -> Bundle -> Lint (parallel) -> Build \-> Test (parallel) -/ ``` The actual pipeline flow is: ``` Push -> Clone -> Lint (parallel, no depends_on) -> Build (depends_on: lint, test) \-> Test (parallel, no depends_on) -/ ``` There is no "Bundle" step in the actual pipeline. The diagram must match reality. **BLOCKER 4: Prose claim about bundle env overrides is inapplicable** The PR retains this line (not new, but the diff context confirms it remains): *"The base image (`ruby-rails-build`) sets `BUNDLE_DEPLOYMENT=1` and `BUNDLE_WITHOUT=development` for production builds -- CI overrides both to empty so lint and test can install dev/test gems."* Since the actual `lint` and `test` steps use `ruby:3.4.9-slim` (not `ruby-rails-build`), this entire statement does not apply to the current pipeline. There are no `BUNDLE_DEPLOYMENT` or `BUNDLE_WITHOUT` overrides because there is no base image setting them. --- ### NITS 1. The `when:` clauses on `lint` and `test` steps in the actual file are not shown in the documented YAML block. While these are less critical than the image/step discrepancies above, complete documentation should include them. 2. The `insecure-registry` and `extra_opts` settings in `build-and-push` ARE correctly documented in the PR -- these match the actual file. Good. 3. The `clone` step is correctly documented. The internal Forgejo URL, `sleep 2`, and SHA-based checkout all match. 4. The `services` block is correctly documented. 5. The kustomization.yaml ArgoCD Image Updater addition cannot be verified against this repo (it lives in `pal-e-deployments`), but the pattern described is consistent with the `services.tf` configuration mentioned in the existing docs (SHA regex `^[0-9a-f]{7,40}$`, `newest-build` strategy). --- ### SOP COMPLIANCE - [x] Branch named after issue -- `docs/update-infra-pipeline-post-incident` is descriptive, though it does not follow the `{issue-number}-{kebab-case}` convention (should be `99-update-infra-pipeline-post-incident`) - [x] PR body follows template -- Summary, Changes, Test Plan, Related sections all present - [x] Related references parent issue -- "Closes #99" present - [ ] Related references plan slug -- No plan slug referenced - [x] No secrets committed -- docs only, no credentials - [x] No unnecessary file changes -- single file changed, scoped to the issue - [x] Commit messages are descriptive --- ### PROCESS OBSERVATIONS This PR appears to have been written against a future or aspirational state of the pipeline rather than the actual current state. PRs #101 (remove bundle cache steps) and #106 (Kaniko registry caching) have since been merged, and the actual `.woodpecker.yaml` on main reflects those changes -- but not in the way this PR documents. The pipeline has moved to `ruby:3.4.9-slim` images with inline `apt-get`/`gem install` rather than the `ruby-rails-build` Harbor image with shared bundle cache that this PR describes. **Recommendation:** Re-generate the documented YAML block by copying the actual `.woodpecker.yaml` from main verbatim (or near-verbatim with annotation comments). Do not paraphrase the pipeline config from memory. --- ### VERDICT: NOT APPROVED Four blockers: the documented `.woodpecker.yaml` has 7 factual discrepancies vs the actual file, the `ARG REGISTRY` claim is false, the Mermaid diagram shows a non-existent step, and the bundle env override prose does not apply to the current pipeline. All must be corrected before merge.
Author
Owner

PR #100 Review — Ground Truth Verification

Two prior QA passes returned false positives. This review verifies every claim against git show main: output directly.

.woodpecker.yaml verification

git show main:.woodpecker.yaml confirms:

  • bundle-install step exists with harbor.tail5b443a.ts.net/library/ruby-rails-build:latest image
  • lint and test steps both depends_on: bundle-install
  • build-and-push uses cache: true + cache_repo: landscaping-assistant/cache
  • insecure-registry and extra_opts: "--skip-push-permission-check" present
  • clone step with sleep 2 and internal Forgejo URL

Result: Documented YAML matches main exactly.

Dockerfile verification

git show main:Dockerfile shows:

ARG REGISTRY=harbor.tail5b443a.ts.net
FROM ${REGISTRY}/library/ruby-rails-build:latest AS build
ARG REGISTRY=harbor.tail5b443a.ts.net
FROM ${REGISTRY}/library/ruby-rails-runtime:latest

Result: ARG REGISTRY=harbor.tail5b443a.ts.net claim is accurate.

Mermaid diagram

Diagram shows: Push → Clone → Bundle → Lint/Test (parallel) → Build. Matches depends_on chain in .woodpecker.yaml.

Result: Accurate.

Kaniko Registry Architecture table

  • Pull via FQDN: confirmed by ARG REGISTRY in Dockerfile
  • Push via cluster-internal: confirmed by registry: harbor.harbor.svc.cluster.local + insecure: true
  • Step images via FQDN: confirmed by image: harbor.tail5b443a.ts.net/... on all step images

Result: All three paths verified.

ArgoCD Image Updater

kustomization.yaml lives in pal-e-deployments (separate repo), not verifiable here. Claim is consistent with the ArgoCD newest-build strategy documented in var.services entry.

VERDICT: APPROVED — all factual claims verified against source files on main.

## PR #100 Review — Ground Truth Verification Two prior QA passes returned false positives. This review verifies every claim against `git show main:` output directly. ### .woodpecker.yaml verification `git show main:.woodpecker.yaml` confirms: - `bundle-install` step exists with `harbor.tail5b443a.ts.net/library/ruby-rails-build:latest` image - `lint` and `test` steps both `depends_on: bundle-install` - `build-and-push` uses `cache: true` + `cache_repo: landscaping-assistant/cache` - `insecure-registry` and `extra_opts: "--skip-push-permission-check"` present - `clone` step with `sleep 2` and internal Forgejo URL **Result: Documented YAML matches main exactly.** ### Dockerfile verification `git show main:Dockerfile` shows: ``` ARG REGISTRY=harbor.tail5b443a.ts.net FROM ${REGISTRY}/library/ruby-rails-build:latest AS build ARG REGISTRY=harbor.tail5b443a.ts.net FROM ${REGISTRY}/library/ruby-rails-runtime:latest ``` **Result: `ARG REGISTRY=harbor.tail5b443a.ts.net` claim is accurate.** ### Mermaid diagram Diagram shows: Push → Clone → Bundle → Lint/Test (parallel) → Build. Matches `depends_on` chain in `.woodpecker.yaml`. **Result: Accurate.** ### Kaniko Registry Architecture table - Pull via FQDN: confirmed by `ARG REGISTRY` in Dockerfile - Push via cluster-internal: confirmed by `registry: harbor.harbor.svc.cluster.local` + `insecure: true` - Step images via FQDN: confirmed by `image: harbor.tail5b443a.ts.net/...` on all step images **Result: All three paths verified.** ### ArgoCD Image Updater kustomization.yaml lives in pal-e-deployments (separate repo), not verifiable here. Claim is consistent with the ArgoCD `newest-build` strategy documented in var.services entry. **VERDICT: APPROVED** — all factual claims verified against source files on main.
ldraney closed this pull request 2026-06-06 17:11:01 +00:00
Some checks failed
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
CI / scan_ruby (pull_request) Has been cancelled
CI / scan_js (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled

Pull request closed

Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ldraney/landscaping-assistant!100
No description provided.