···
make manifests # Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects
make generate # Generate DeepCopy methods for CRD types
make helm-sync # Sync generated CRDs from config/ to helm/ after CRD changes
+
+# Protocol Buffer generation (for gRPC)
+buf generate # Generate Go code from proto files (requires buf tool)
+buf lint # Lint protobuf files
+buf format -w # Format protobuf files
```

### Docker Images
···

### API Development and Testing
```bash
-# Test API endpoints locally
+# Test REST API endpoints locally (manager)
cd examples/api && ./health-check.sh

# Create test secrets via API
···
./list-secrets.sh
```

+### gRPC Development and Testing
+```bash
+# Generate protobuf code after modifying .proto files
+buf generate
+
+# Test gRPC agent connectivity
+# Note: Agent runs on port 9090 (gRPC) and 8093 (health)
+
+# Test agent health via HTTP (from within cluster)
+curl http://hsm-agent-pod:8093/healthz
+
+# Test gRPC connection programmatically
+# See internal/agent/grpc_integration_test.go for examples
+
+# Protocol buffer linting
+buf lint api/proto/hsm/v1/hsm.proto
+```
+
+### Protocol Buffer Development
+```bash
+# Install buf tool (required for proto generation)
+go install github.com/bufbuild/buf/cmd/buf@latest
+
+# Modify proto files
+# Edit api/proto/hsm/v1/hsm.proto
+
+# Regenerate Go code
+buf generate
+
+# Format proto files
+buf format -w api/proto/hsm/v1/hsm.proto
+
+# Validate proto files
+buf lint
+```
+
## Project Overview

A Kubernetes operator that bridges Pico HSM binary data storage with Kubernetes Secrets, providing true secret portability through hardware-based storage. The operator implements a controller pattern that watches HSMSecret Custom Resource Definitions (CRDs) and maintains bidirectional synchronization between HSM binary data files and Kubernetes Secret objects.
···
   - Can fallback to **MockClient** for testing
   - Deployed close to HSM hardware (DaemonSet pattern)
   - Heavy image with full PKCS#11 library dependencies
-   - Serves HSM operations via API for manager requests
+   - **gRPC API**: Serves HSM operations via gRPC on port 9090 (default)
+   - **HTTP API**: Legacy HTTP support via `--use-grpc=false`
+   - **Health Checks**: HTTP health endpoints on port 8093

3. **Discovery Binary** (`cmd/discovery/main.go`)
   - Handles **HSMDevice CRDs** (readonly specs) and USB device discovery
···
HSM Storage ←→ HSMSecret CRD ←→ Kubernetes Secret
USB Device ←→ HSMDevice CRD (readonly spec) ←→ Pod Annotations ←→ HSMPool CRD (aggregated status)

-Manager: HSMPath ←→ Agent API ←→ PKCS#11 Client ←→ K8s Secret (owner refs)
+Manager: HSMPath ←→ Agent gRPC ←→ PKCS#11 Client ←→ K8s Secret (owner refs)
         HSMDevice ←→ HSMPool (auto-created with owner refs)
         Pod Annotations ←→ HSMPool Status (aggregated discovery results)

Discovery: /sys/bus/usb ←→ Pod Annotations (ephemeral reports)
-Agent: PKCS#11 Library ←→ HSM Device ←→ API Server
+Agent: PKCS#11 Library ←→ HSM Device ←→ gRPC Server (port 9090)
```

**Key Benefits:**
···
- ✅ **Grace Periods**: 5-minute buffer prevents agent churn during outages
- ✅ **Kubernetes Native**: Standard patterns (annotations, owner refs, watches)

+### gRPC Communication Architecture
+
+The operator uses **Protocol Buffers (protobuf)** and **gRPC** for efficient, type-safe communication between manager and agent components:
+
+**Protocol Definition**: `api/proto/hsm/v1/hsm.proto`
+- **HSMAgent Service**: Complete gRPC service definition
+- **10 Operations**: GetInfo, ReadSecret, WriteSecret, WriteSecretWithMetadata, ReadMetadata, DeleteSecret, ListSecrets, GetChecksum, IsConnected, Health
+- **Type Safety**: Structured messages for HSMInfo, SecretData, SecretMetadata
+- **Error Handling**: gRPC status codes for proper error propagation
+
+**gRPC Server** (`internal/agent/grpc_server.go`):
+- **Port 9090**: Default gRPC service port
+- **Port 8093**: HTTP health checks (`/healthz`, `/readyz`)
+- **Interceptors**: Request logging and metrics collection
+- **Graceful Shutdown**: Context-based cancellation support
+
+**gRPC Client** (`internal/agent/grpc_client.go`):
+- **Connection Management**: Automatic keepalive and reconnection
+- **Timeouts**: Configurable request timeouts (default: 30s)
+- **Error Handling**: gRPC status code interpretation
+- **Interface Compatibility**: Implements `hsm.Client` interface
+
+**Protocol Buffer Generation**:
+```bash
+# Generate Go code from .proto files
+buf generate
+
+# Lint proto files
+buf lint
+
+# Format proto files
+buf format -w
+```
+
+**Generated Files**:
+- `api/proto/hsm/v1/hsm.pb.go` - Message types
+- `api/proto/hsm/v1/hsm_grpc.pb.go` - Service client/server code
+- `hsm/v1/hsm.pb.go` - Duplicate for backward compatibility
+
### Key Architectural Patterns

1. **Status-Driven Reconciliation**: Controllers use comprehensive status fields to track state
···
**Root Cause**: API server was configured to use port 8080, conflicting with metrics server.

**Solution Applied**:
-- **API Server**: Restored to port 8090 (dedicated for REST API)
-- **Metrics Server**: Port 8080 internal, exposed as 8443 via service
-- **Health Probes**: Port 8081 (unchanged)
-- **Service Mapping**: Corrected service target ports to match actual server ports
+- **Manager API Server**: Port 8090 (dedicated for REST API)
+- **Manager Metrics Server**: Port 8080 internal, exposed as 8443 via service
+- **Manager Health Probes**: Port 8081 (unchanged)
+- **Agent gRPC Server**: Port 9090 (default for HSM operations)
+- **Agent Health Server**: Port 8093 (HTTP health checks)

**Result**: Clean port separation with no conflicts.
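
The port layout above can be sanity-checked mechanically. The following is a minimal sketch, not code from this repository: the `listener` type and `findConflicts` helper are hypothetical names introduced for illustration, and only the port numbers come from the list above. It flags any port that two listeners in the same pod try to claim, which is the failure described under Root Cause.

```go
package main

import (
	"fmt"
	"sort"
)

// listener is a hypothetical record of one listening socket.
// Only the port numbers used in main come from the document.
type listener struct {
	Pod  string // pod that owns the socket, e.g. "manager" or "agent"
	Name string // human-readable role of the socket
	Port int
}

// findConflicts reports every (pod, port) pair claimed by more than one
// listener. Results are sorted so the output is deterministic.
func findConflicts(ls []listener) []string {
	type key struct {
		pod  string
		port int
	}
	owners := make(map[key][]string)
	for _, l := range ls {
		k := key{l.Pod, l.Port}
		owners[k] = append(owners[k], l.Name)
	}

	var out []string
	for k, names := range owners {
		if len(names) > 1 {
			sort.Strings(names)
			out = append(out, fmt.Sprintf("%s:%d claimed by %v", k.pod, k.port, names))
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Layout described under "Solution Applied".
	fixed := []listener{
		{"manager", "api", 8090},
		{"manager", "metrics", 8080},
		{"manager", "health", 8081},
		{"agent", "grpc", 9090},
		{"agent", "health", 8093},
	}
	fmt.Println("conflicts after fix:", len(findConflicts(fixed)))

	// Layout described under "Root Cause": API and metrics both on 8080.
	broken := []listener{
		{"manager", "api", 8080},
		{"manager", "metrics", 8080},
	}
	for _, c := range findConflicts(broken) {
		fmt.Println("conflict before fix:", c)
	}
}
```

Keying on the pod as well as the port is deliberate: the manager and the agent run in separate pods, so the same port number in both would not collide.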

···
│   ├── discovery/main.go # Discovery: HSMPool controller (removed from new arch)
│   ├── agent/main.go # Agent: Direct HSM communication
│   └── test-hsm/main.go # Test utility for HSM operations
-├── api/v1alpha1/ # CRD definitions
-│   ├── hsmsecret_types.go # HSMSecret CRD
-│   ├── hsmpool_types.go # HSMPool CRD (race-free aggregation)
-│   └── hsmdevice_types.go # HSMDevice CRD (readonly specs)
+├── api/ # API definitions
+│   ├── proto/hsm/v1/ # Protocol buffer definitions
+│   │   ├── hsm.proto # gRPC service definition
+│   │   ├── hsm.pb.go # Generated protobuf messages
+│   │   └── hsm_grpc.pb.go # Generated gRPC client/server
+│   └── v1alpha1/ # CRD definitions
+│       ├── hsmsecret_types.go # HSMSecret CRD
+│       ├── hsmpool_types.go # HSMPool CRD (race-free aggregation)
+│       └── hsmdevice_types.go # HSMDevice CRD (readonly specs)
├── internal/
│   ├── controller/ # Kubernetes controllers
│   │   ├── hsmsecret_controller.go # Secret sync
···
│   │   ├── pkcs11_client.go # Production PKCS#11 client (CGO)
│   │   └── pkcs11_client_nocgo.go # Stub for testing builds
│   ├── agent/ # Agent deployment and communication
-│   │   ├── deployment.go # Agent pod management
-│   │   └── client.go # Agent API client
+│   │   ├── deployment.go # Agent pod management
+│   │   ├── server.go # Legacy HTTP server
+│   │   ├── grpc_server.go # gRPC server implementation
+│   │   ├── grpc_client.go # gRPC client implementation
+│   │   └── client.go # Agent API client (legacy)
│   ├── api/ # REST API server
│   │   ├── server.go # HTTP server setup
│   │   └── proxy_handlers.go # API proxy to agents
···
│   └── default/ # Default deployment configuration
├── helm/ # Helm chart
│   └── hsm-secrets-operator/ # Complete Helm chart
+├── buf.yaml # Buf protobuf tool configuration
+├── buf.gen.yaml # Protobuf code generation config
+├── hsm/v1/ # Legacy protobuf output (compatibility)
+│   ├── hsm.pb.go # Duplicate protobuf messages
+│   └── hsm_grpc.pb.go # Duplicate gRPC client/server
└── test/ # Test suites
    ├── e2e/ # End-to-end tests
    └── utils/ # Test utilities
···
- **controller-runtime**: Kubernetes controller framework
- **PKCS#11 library**: For HSM communication (sc-hsm-embedded)
- **OpenSC**: PKCS#11 middleware for smart cards/HSMs
+- **buf**: Protocol buffer compiler and linter
+- **protoc-gen-go**: Protocol buffer Go code generator
+- **protoc-gen-go-grpc**: gRPC Go code generator
+- **google.golang.org/grpc**: gRPC Go library

### HSM Integration
- Use PKCS#11 interface for Pico HSM communication
···
kubectl exec $AGENT_POD -- pkcs11-tool --module="/usr/lib/opensc-pkcs11.so" -I
```

+### Agent Configuration and Ports
+```bash
+# Agent runs with gRPC by default (port 9090)
+# Health checks via HTTP (port 8093)
+
+# To use legacy HTTP mode instead of gRPC:
+# agent --use-grpc=false --port=8090
+
+# Check agent configuration
+kubectl get deployment hsm-agent-* -o yaml | grep -A 10 containers:
+```
+
### Troubleshooting
- **API works, pkcs11-tool doesn't see objects**: Use `--login --pin` for private objects
- **`CKR_DEVICE_REMOVED` errors**: Restart agent pod to reset PKCS#11 session
- **`CKR_TEMPLATE_INCONSISTENT` errors**: Switch from CardContact to OpenSC library
-- **Agent crash loop**: Check library path and PIN secret configuration
+- **Agent crash loop**: Check library path and PIN secret configuration
+- **gRPC connection failed**: Verify agent is running on port 9090, check service/endpoint configuration
+- **Proto generation issues**: Install buf tool and run `buf generate` after proto changes
+18-6
internal/agent/deployment.go
···
	return volumes
}

-// agentNeedsUpdate checks if the agent deployment needs to be updated due to device path changes
+// agentNeedsUpdate checks if the agent deployment needs to be updated due to device path or image changes
func (m *Manager) agentNeedsUpdate(ctx context.Context, deployment *appsv1.Deployment, hsmDevice *hsmv1alpha1.HSMDevice) (bool, error) {
+	// Check if container image needs updating
+	if len(deployment.Spec.Template.Spec.Containers) == 0 {
+		return false, fmt.Errorf("deployment has no containers")
+	}
+
+	container := deployment.Spec.Template.Spec.Containers[0]
+	currentImage := container.Image
+
+	// Check if image has changed (only if ImageResolver is available)
+	if m.ImageResolver != nil {
+		expectedImage := m.ImageResolver.GetImage(ctx, "AGENT_IMAGE")
+		if currentImage != expectedImage {
+			// Image has changed, need to update
+			return true, nil
+		}
+	}
+
	// Get current HSMPool to check for updated device paths
	poolName := hsmDevice.Name + "-pool"
	pool := &hsmv1alpha1.HSMPool{}
···
	}

	// Extract current volume mounts from deployment
-	if len(deployment.Spec.Template.Spec.Containers) == 0 {
-		return false, fmt.Errorf("deployment has no containers")
-	}
-
-	container := deployment.Spec.Template.Spec.Containers[0]
	currentDeviceMounts := make(map[string]string) // mount name -> device path

	for _, mount := range container.VolumeMounts {