implemented observability

2026-05-12 20:32:30 +02:00
parent c7df53708c
commit e360f3697e
11 changed files with 478 additions and 22 deletions
+41 -2
@@ -16,7 +16,7 @@ Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
- Environment cards: live health status via HTTP health check polling
- Repo page: error rate and deployment frequency sparklines
### Planned — Phase 3F (Federation)
### Planned — Phase 3F (Federation, next)
- ActivityPub inbox/outbox HTTP handlers
- HTTP signature verification middleware
- WebFinger `/.well-known/webfinger` endpoint
@@ -33,6 +33,44 @@ Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
---
## [0.8.0] — 2026-05-12
Phase 3E complete. Prometheus metrics, structured health checks, and per-repo operational health summaries are now available.
### Added — Prometheus Metrics (`internal/observability/`)
- `GET /metrics` — Prometheus text format endpoint (standard root-level path for k8s/Prometheus scraping)
- `GET /health` — upgraded from static `{"status":"ok"}` to a structured liveness response:
`{"status":"healthy","checks":{"database":"ok","nats":"ok"},"version":"0.8.0"}`
Returns HTTP 503 when any dependency is degraded
- `internal/observability/metrics.go` — metric definitions:
- `forgebucket_http_requests_total{method,path,status}` — counter
- `forgebucket_http_request_duration_seconds{method,path}` — histogram (Prometheus default buckets)
- `forgebucket_pipeline_runs_total{status}` — counter (succeeded/failed/cancelled), pre-initialized to 0
- `forgebucket_deployments_total{status}` — counter (pending/success/failure/cancelled), pre-initialized to 0
- `forgebucket_active_pipeline_runs` — gauge (in-flight runs)
- `internal/observability/health.go`: `Check(db, bus)` pings PostgreSQL and calls `bus.Healthy()`
- HTTP instrumentation middleware inserted after `Recoverer`, before `CORS` — records every request
- Path normalization prevents label cardinality explosion: `/repos/alice/myrepo/runs/42` is recorded as `/api/v1/repos/:owner/:repo/runs/:id`
- NATS metric watcher subscribes to `pipeline.>` and `deployment.>` and increments counters
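The path normalization described above could look roughly like the following. This is a minimal sketch, not the code in `internal/observability/`: the `normalizePath` name, the `"other"` fallback bucket, and the hard-coded route shapes are assumptions, and it expects the full `/api/v1/...` request path. An implementation could equally read the matched route pattern from the router instead of parsing the URL.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePath is a hypothetical helper that replaces the variable
// segments of known repo routes with placeholders, so the `path`
// label takes a small, fixed set of values.
func normalizePath(p string) string {
	// Fixed root-level endpoints are already low-cardinality.
	switch p {
	case "/health", "/metrics":
		return p
	}

	segs := strings.Split(strings.Trim(p, "/"), "/")
	if len(segs) < 5 || segs[0] != "api" || segs[1] != "v1" || segs[2] != "repos" {
		// Assumed fallback: unknown shapes share one bucket.
		return "other"
	}

	base := "/api/v1/repos/:owner/:repo"
	rest := segs[5:]
	switch {
	case len(rest) == 0:
		return base
	case len(rest) == 1 && rest[0] == "health":
		return base + "/health"
	case len(rest) == 2 && rest[0] == "runs":
		return base + "/runs/:id"
	}
	return "other"
}

func main() {
	fmt.Println(normalizePath("/api/v1/repos/alice/myrepo/runs/42"))
}
```

Bucketing unrecognized paths under a single value keeps scanners and typos from creating new label values, at the cost of losing detail for routes the helper does not know about.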
### Added — Per-Repo Operational Health (`GET /api/v1/repos/{owner}/{repo}/health`)
- Returns a JSON summary for the repo page operational header:
- `ciPassRate7d` — fraction of pipeline runs that succeeded in the last 7 days
- `totalRuns7d` — total run count in the last 7 days
- `latestRun` — most recent `PipelineRun` record
- `latestDeployments` — one entry per environment with its latest deployment (envName, status, sha, finishedAt)
- `openDriftCount` — GitOpsConfigs in `drifted` state
- `openPRCount` — open pull request count
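To illustrate how the 7-day fields relate to each other, here is a small sketch of the pass-rate calculation. The `run` and `repoHealth` types, the Go field types, and the choice to window on the finish time are assumptions made for the example; only the JSON field names come from the list above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// run is a hypothetical, trimmed-down stand-in for a PipelineRun record.
type run struct {
	Status     string
	FinishedAt time.Time
}

// repoHealth carries a subset of the documented JSON fields.
// The Go types are assumed.
type repoHealth struct {
	CIPassRate7d   float64 `json:"ciPassRate7d"`
	TotalRuns7d    int     `json:"totalRuns7d"`
	OpenDriftCount int     `json:"openDriftCount"`
	OpenPRCount    int     `json:"openPRCount"`
}

// summarize counts runs that finished within the 7 days before now
// and computes the fraction of them that succeeded.
func summarize(runs []run, now time.Time) repoHealth {
	cutoff := now.Add(-7 * 24 * time.Hour)
	var total, succeeded int
	for _, r := range runs {
		if r.FinishedAt.Before(cutoff) {
			continue
		}
		total++
		if r.Status == "succeeded" {
			succeeded++
		}
	}
	h := repoHealth{TotalRuns7d: total}
	if total > 0 { // avoid dividing by zero for repos with no recent runs
		h.CIPassRate7d = float64(succeeded) / float64(total)
	}
	return h
}

// demoNow is a fixed reference time so the example is deterministic.
var demoNow = time.Date(2026, 5, 12, 12, 0, 0, 0, time.UTC)

// demoSummary builds sample data: three runs inside the window, one outside.
func demoSummary() repoHealth {
	day := 24 * time.Hour
	return summarize([]run{
		{Status: "succeeded", FinishedAt: demoNow.Add(-1 * day)},
		{Status: "failed", FinishedAt: demoNow.Add(-2 * day)},
		{Status: "succeeded", FinishedAt: demoNow.Add(-3 * day)},
		{Status: "succeeded", FinishedAt: demoNow.Add(-10 * day)},
	}, demoNow)
}

func main() {
	out, _ := json.Marshal(demoSummary())
	fmt.Println(string(out))
}
```

In the sample data two of the three in-window runs succeeded, so `ciPassRate7d` is 2/3 and `totalRuns7d` is 3. A repo with no recent runs reports a rate of 0 here, which is one possible convention.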
### Added — EventBus `Healthy() bool`
- Added to the `EventBus` interface; `NATSBus` returns `nc.IsConnected()`; `NoOpBus` returns `true`
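The sketch below shows one way `Healthy()` could feed the structured `/health` response. It is an illustration under assumptions, not the project's handler: the interface is reduced to the one method named here, `pingDB` stands in for the PostgreSQL ping, `downBus` is a test stub, and the `"degraded"` and `"error"` strings are invented for the failure case (the changelog only shows the healthy payload).

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// EventBus is reduced to the method this entry adds; any other
// methods of the real interface are omitted.
type EventBus interface {
	Healthy() bool
}

// NoOpBus always reports healthy, as described above.
type NoOpBus struct{}

func (NoOpBus) Healthy() bool { return true }

// downBus is a hypothetical stub simulating a disconnected bus.
type downBus struct{}

func (downBus) Healthy() bool { return false }

type healthResponse struct {
	Status  string            `json:"status"`
	Checks  map[string]string `json:"checks"`
	Version string            `json:"version"`
}

// healthHandler returns 200 when every check passes and 503 otherwise.
// The failure strings are assumptions.
func healthHandler(pingDB func() error, bus EventBus, version string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		resp := healthResponse{
			Status:  "healthy",
			Checks:  map[string]string{"database": "ok", "nats": "ok"},
			Version: version,
		}
		code := http.StatusOK
		if err := pingDB(); err != nil {
			resp.Checks["database"] = "error"
			resp.Status, code = "degraded", http.StatusServiceUnavailable
		}
		if !bus.Healthy() {
			resp.Checks["nats"] = "error"
			resp.Status, code = "degraded", http.StatusServiceUnavailable
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(code)
		_ = json.NewEncoder(w).Encode(resp)
	}
}

// probe calls the handler in-process and returns status code and body.
func probe(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	dbOK := func() error { return nil }
	code, body := probe(healthHandler(dbOK, NoOpBus{}, "0.8.0"))
	fmt.Print(code, " ", body)
	code, body = probe(healthHandler(dbOK, downBus{}, "0.8.0"))
	fmt.Print(code, " ", body)
}
```

Because `NoOpBus` reports `true`, a deployment without NATS still passes the check; whether that is the desired signal depends on how the no-op bus is used.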
### Changed — Middleware chain
- `observability.Middleware()` added between `Recoverer` and `CORS` (applies to all requests including `/health` and `/metrics`)
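As a rough illustration of what an instrumentation middleware in this position does, the sketch below wraps a handler, captures the response status, and counts requests by method, path, and status. It uses only the standard library: `requestCounts` is an in-memory stand-in for `forgebucket_http_requests_total`, and `instrument`, `statusRecorder`, and the demo routes are hypothetical names, not the project's `observability.Middleware()`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// statusRecorder captures the status code written by downstream handlers.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// requestCounts is an in-memory stand-in for a labelled counter.
type requestCounts struct {
	mu sync.Mutex
	n  map[string]int
}

func (c *requestCounts) inc(method, path string, status int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n[fmt.Sprintf("%s %s %d", method, path, status)]++
}

func (c *requestCounts) get(key string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n[key]
}

// instrument records every request after the wrapped handler returns.
// The status defaults to 200 for handlers that never call WriteHeader.
func instrument(c *requestCounts, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rec, r)
		c.inc(r.Method, r.URL.Path, rec.status)
	})
}

// demoCounts sends three in-process requests through the middleware.
func demoCounts() *requestCounts {
	c := &requestCounts{n: map[string]int{}}
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusServiceUnavailable)
	})
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# metrics placeholder")
	})
	h := instrument(c, mux)
	for _, p := range []string{"/health", "/metrics", "/metrics"} {
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, p, nil))
	}
	return c
}

func main() {
	c := demoCounts()
	fmt.Println(c.get("GET /health 503"), c.get("GET /metrics 200"))
}
```

The sketch labels by raw URL path for brevity; with real traffic the path would be normalized first so that per-repo URLs do not each create their own series.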
---
## [0.7.0] — 2026-05-12
Phase 3D complete. Git is now the source of truth for environment deployment state.
@@ -274,7 +312,8 @@ Initial development milestone. Core Git hosting, collaboration, and frontend SPA
---
[Unreleased]: https://github.com/forgeo/forgebucket/compare/v0.7.0...HEAD
[Unreleased]: https://github.com/forgeo/forgebucket/compare/v0.8.0...HEAD
[0.8.0]: https://github.com/forgeo/forgebucket/compare/v0.7.0...v0.8.0
[0.7.0]: https://github.com/forgeo/forgebucket/compare/v0.6.0...v0.7.0
[0.6.0]: https://github.com/forgeo/forgebucket/compare/v0.5.0...v0.6.0
[0.5.0]: https://github.com/forgeo/forgebucket/compare/v0.4.0...v0.5.0