A Kubernetes operator that bridges Hardware Security Module (HSM) data storage with Kubernetes Secrets, providing true secret portability th
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

update sync management

+780 -946
+30 -3
CLAUDE.md
··· 8 8 9 9 **Sync Architecture**: The operator implements **unidirectional sync from HSM to Kubernetes Secrets only**. HSM is the authoritative source of truth. K8s Secrets are read-only replicas that get updated when HSM data changes. There is no K8s → HSM sync functionality. 10 10 11 + **GitOps Deployment**: This Kubernetes deployment is GitOps-based. You are not able to push new images to the Kubernetes cluster from this machine. Code changes require updating the deployment through the GitOps pipeline. 12 + 13 + **Code Modernization**: Avoid keeping legacy code whenever possible. Replace and improve functions as necessary rather than maintaining backward compatibility with outdated patterns. 14 + 11 15 ## Project Overview 12 16 13 17 A Kubernetes operator that bridges Hardware Security Module (HSM) data storage with Kubernetes Secrets, providing true secret portability through hardware-based security. The operator implements a controller pattern that synchronizes HSM binary data files to Kubernetes Secret objects using a unified binary architecture with gRPC communication, automatic USB device discovery, and dynamic agent deployment. ··· 43 47 **Race-Free Coordination:** 44 48 - HSMDevice CRDs contain readonly specifications only (no status field) 45 49 - Discovery pods report via their own pod annotations 46 - - HSMPool CRDs aggregate all discovery reports from multiple nodes 50 + - **HSMPool CRDs are the source of truth** for agent discovery and multi-device operations 51 + - HSMPool aggregates all discovery reports from multiple nodes 47 52 - Owner references ensure automatic cleanup when resources are deleted 48 53 - 5-minute grace periods prevent agent churn during outages 49 54 55 + **Multi-Device Agent Architecture:** 56 + - **HSMPool-based Agent Discovery**: API and controllers query HSMPool to find all agent instances for a device type 57 + - **Multiple Agent Instances**: Each physical device gets its own agent pod (e.g., `hsm-agent-pico-hsm-0`, `hsm-agent-pico-hsm-1`) 58 + - **Multi-Agent Operations**: API operations (list, write, delete) work across all agents when mirroring is enabled 59 + - **Automatic Synchronization**: HSMSyncReconciler handles conflict detection and resolution between devices 60 + 50 61 **gRPC Communication Architecture:** 51 62 - Protocol definition in `api/proto/hsm/v1/hsm.proto` with 10 HSM operations 52 63 - Manager ↔ Agent: gRPC for efficient, type-safe HSM operations 53 64 - Discovery → Manager: Pod annotations for race-free device reporting 54 - - External → Manager: REST API proxy routing to appropriate agents 65 + - **External → Manager**: REST API proxy routes to ALL agents for multi-device operations 55 66 - Generated code: `api/proto/hsm/v1/hsm.pb.go` and `hsm_grpc.pb.go` 56 67 57 68 **Controller Hierarchy:** ··· 60 71 ├── HSMSecretReconciler - HSM to K8s Secret sync 61 72 ├── HSMPoolReconciler - Aggregates discovery reports from pod annotations 62 73 ├── HSMPoolAgentReconciler - Deploys agents when pools are ready 74 + ├── HSMSyncReconciler - Multi-device HSM synchronization and conflict resolution 63 75 └── DiscoveryDaemonSetReconciler - Manages discovery DaemonSet lifecycle 64 76 65 77 Discovery Controllers: ··· 255 267 # Monitor sync status 256 268 kubectl get hsmsecret my-secret -o jsonpath='{.status.syncStatus}' 257 269 258 - # Check discovered devices 270 + # Check discovered devices in HSMPool (source of truth for agents) 259 271 kubectl get hsmpool -o jsonpath='{.status.aggregatedDevices[*].devicePath}' 272 + 273 + # Check HSMPool readiness for agent deployment 274 + kubectl get hsmpool -o custom-columns=NAME:.metadata.name,PHASE:.status.phase,DEVICES:.status.totalDevices 275 + 276 + # View all agent pods for multi-device setup 277 + kubectl get pods -l app.kubernetes.io/name=hsm-agent 260 278 261 279 # View discovery pod reports 262 280 kubectl get pods -l app.kubernetes.io/component=discovery \ ··· 308 326 2. `HSMPoolReconciler` aggregates device discovery reports from pod annotations (race-free) 309 327 3. `HSMPoolAgentReconciler` deploys agents dynamically when devices are ready 310 328 4. `HSMSyncReconciler` handles multi-device HSM synchronization (HSM ↔ HSM only) 329 + 330 + **Agent Discovery Architecture:** 331 + - **HSMPool as Source of Truth**: API and controllers query HSMPool.Status.AggregatedDevices instead of individual HSMDevice resources 332 + - **Multi-Instance Agent Tracking**: Agent manager tracks agents by keys like `pico-hsm-0`, `pico-hsm-1` for multiple physical devices 333 + - **Pool-Based Cleanup**: Agent cleanup based on HSMSecret existence rather than device-specific references 334 + - **API Multi-Device Operations**: 335 + - `findAvailableAgent()` queries HSMPools to find any available agent for a device type 336 + - `getAllAvailableAgents()` returns all agents across all pools for mirroring operations 337 + - Operations like delete/write with `mirror=true` target ALL agents simultaneously 311 338 312 339 **PKCS#11 Client Implementation:** 313 340 - Production: `internal/hsm/pkcs11_client.go` with CGO
+31 -78
internal/agent/deployment.go
··· 247 247 return fmt.Errorf("failed to list HSMSecrets: %w", err) 248 248 } 249 249 250 - // Count references to this device 251 - references := 0 252 - for _, secret := range hsmSecretList.Items { 253 - if m.secretReferencesDevice(&secret, hsmDevice) { 254 - references++ 255 - } 256 - } 257 - 258 - // If there are still references, don't cleanup 259 - if references > 0 { 250 + // In the HSMPool architecture, cleanup should be based on device availability in pool 251 + // rather than individual secret references, since all secrets can use any available device 252 + // Check if there are any active HSMSecrets - if so, keep the agents running 253 + if len(hsmSecretList.Items) > 0 { 260 254 return nil 261 255 } 262 256 ··· 500 494 } 501 495 502 496 return m.Create(ctx, deployment) 503 - } 504 - 505 - // secretReferencesDevice checks if an HSMSecret references the given device 506 - func (m *Manager) secretReferencesDevice(hsmSecret *hsmv1alpha1.HSMSecret, hsmDevice *hsmv1alpha1.HSMDevice) bool { 507 - // This is a simplified check - in practice, you might want more sophisticated logic 508 - // to determine which device an HSMSecret should use based on path, device type, etc. 509 - _ = hsmSecret // TODO: Use for device preference checks 510 - _ = hsmDevice // TODO: Use for device type compatibility 511 - 512 - // For now, assume any HSMSecret could use any available device of the right type 513 - // A more sophisticated implementation might check: 514 - // - HSMSecret annotations for device preferences 515 - // - Path-based device mapping 516 - // - Device type compatibility 517 - 518 - return true // Simplified for initial implementation 519 497 } 520 498 521 499 // buildAgentEnv builds environment variables for the HSM agent ··· 839 817 return runningPods > 0 840 818 } 841 819 842 - // GetAgentPodIPs returns the pod IPs for a device (for direct gRPC connections) 843 - func (m *Manager) GetAgentPodIPs(deviceName string) ([]string, error) { 820 + // GetAgentPodIPs returns all agent pod IPs for a device type from HSMPool 821 + func (m *Manager) GetAgentPodIPs(ctx context.Context, deviceName, namespace string) ([]string, error) { 822 + // Get HSMPool for this device 823 + poolName := deviceName + "-pool" 824 + var hsmPool hsmv1alpha1.HSMPool 825 + if err := m.Get(ctx, types.NamespacedName{ 826 + Name: poolName, 827 + Namespace: namespace, 828 + }, &hsmPool); err != nil { 829 + return nil, fmt.Errorf("failed to get HSMPool %s: %w", poolName, err) 830 + } 831 + 844 832 m.mu.RLock() 845 833 defer m.mu.RUnlock() 846 834 847 - agentInfo, exists := m.activeAgents[deviceName] 848 - if !exists { 849 - return nil, fmt.Errorf("no active agents found for device %s", deviceName) 835 + var allPodIPs []string 836 + 837 + // Collect pod IPs from all agent instances for this device 838 + for i := range hsmPool.Status.AggregatedDevices { 839 + agentKey := fmt.Sprintf("%s-%d", deviceName, i) 840 + if agentInfo, exists := m.activeAgents[agentKey]; exists && len(agentInfo.PodIPs) > 0 { 841 + allPodIPs = append(allPodIPs, agentInfo.PodIPs...) 842 + } 850 843 } 851 844 852 - if len(agentInfo.PodIPs) == 0 { 853 - return nil, fmt.Errorf("no pod IPs available for device %s", deviceName) 845 + if len(allPodIPs) == 0 { 846 + return nil, fmt.Errorf("no active agents found for device %s in pool %s", deviceName, poolName) 854 847 } 855 848 856 - return agentInfo.PodIPs, nil 849 + return allPodIPs, nil 857 850 } 858 851 859 852 // GetGRPCEndpoints returns gRPC endpoints for all agent pods of a device 860 - func (m *Manager) GetGRPCEndpoints(deviceName string) ([]string, error) { 861 - podIPs, err := m.GetAgentPodIPs(deviceName) 853 + func (m *Manager) GetGRPCEndpoints(ctx context.Context, deviceName, namespace string) ([]string, error) { 854 + podIPs, err := m.GetAgentPodIPs(ctx, deviceName, namespace) 862 855 if err != nil { 863 856 return nil, err 864 857 } ··· 871 864 return endpoints, nil 872 865 } 873 866 874 - // CreateGRPCClients creates gRPC clients for all agent pods of a device 875 - func (m *Manager) CreateGRPCClients(ctx context.Context, deviceName string, logger logr.Logger) ([]hsm.Client, error) { 876 - endpoints, err := m.GetGRPCEndpoints(deviceName) 877 - if err != nil { 878 - return nil, err 879 - } 880 - 881 - clients := make([]hsm.Client, 0, len(endpoints)) 882 - for _, endpoint := range endpoints { 883 - grpcClient, err := NewGRPCClient(endpoint, deviceName, logger) 884 - if err != nil { 885 - // Clean up any successful connections 886 - for _, c := range clients { 887 - if err := c.Close(); err != nil { 888 - logger.Error(err, "Failed to close gRPC connection during cleanup") 889 - } 890 - } 891 - return nil, fmt.Errorf("failed to create gRPC client for %s: %w", endpoint, err) 892 - } 893 - 894 - // Test the connection 895 - if err := grpcClient.Initialize(ctx, hsm.Config{}); err != nil { 896 - // Clean up any successful connections 897 - for _, c := range clients { 898 - if err := c.Close(); err != nil { 899 - logger.Error(err, "Failed to close gRPC connection during cleanup") 900 - } 901 - } 902 - if err := grpcClient.Close(); err != nil { 903 - logger.Error(err, "Failed to close gRPC client during cleanup") 904 - } 905 - return nil, fmt.Errorf("failed to initialize gRPC client for %s: %w", endpoint, err) 906 - } 907 - 908 - clients = append(clients, grpcClient) 909 - } 910 - 911 - return clients, nil 912 - } 913 - 914 867 // CreateSingleGRPCClient creates a gRPC client for the first available agent pod of a device 915 - func (m *Manager) CreateSingleGRPCClient(ctx context.Context, deviceName string, logger logr.Logger) (hsm.Client, error) { 916 - endpoints, err := m.GetGRPCEndpoints(deviceName) 868 + func (m *Manager) CreateSingleGRPCClient(ctx context.Context, deviceName, namespace string, logger logr.Logger) (hsm.Client, error) { 869 + endpoints, err := m.GetGRPCEndpoints(ctx, deviceName, namespace) 917 870 if err != nil { 918 871 return nil, err 919 872 }
+2 -75
internal/agent/deployment_test.go
··· 259 259 assert.Nil(t, retrieved) 260 260 }) 261 261 262 - t.Run("GetAgentPodIPs - success", func(t *testing.T) { 263 - fakeClient := fake.NewClientBuilder().WithScheme(scheme).Build() 264 - manager := NewManager(fakeClient, "test-namespace", nil) 265 - 266 - // Add agent to tracking 267 - expectedIPs := []string{"10.1.1.5", "10.1.1.6"} 268 - agentInfo := &AgentInfo{ 269 - DeviceName: "test-device", 270 - PodIPs: expectedIPs, 271 - Status: AgentStatusReady, 272 - } 273 - manager.activeAgents["test-device"] = agentInfo 274 - 275 - // Test retrieval 276 - podIPs, err := manager.GetAgentPodIPs("test-device") 277 - assert.NoError(t, err) 278 - assert.Equal(t, expectedIPs, podIPs) 279 - }) 262 + // GetAgentPodIPs tests removed - function now uses HSMPool-based lookup 280 263 281 - t.Run("GetAgentPodIPs - agent not found", func(t *testing.T) { 282 - fakeClient := fake.NewClientBuilder().WithScheme(scheme).Build() 283 - manager := NewManager(fakeClient, "test-namespace", nil) 284 - 285 - podIPs, err := manager.GetAgentPodIPs("nonexistent-device") 286 - assert.Error(t, err) 287 - assert.Nil(t, podIPs) 288 - assert.Contains(t, err.Error(), "no active agents found") 289 - }) 290 - 291 - t.Run("GetAgentPodIPs - no pod IPs available", func(t *testing.T) { 292 - fakeClient := fake.NewClientBuilder().WithScheme(scheme).Build() 293 - manager := NewManager(fakeClient, "test-namespace", nil) 294 - 295 - // Add agent with no pod IPs 296 - agentInfo := &AgentInfo{ 297 - DeviceName: "test-device", 298 - PodIPs: []string{}, 299 - Status: AgentStatusReady, 300 - } 301 - manager.activeAgents["test-device"] = agentInfo 302 - 303 - podIPs, err := manager.GetAgentPodIPs("test-device") 304 - assert.Error(t, err) 305 - assert.Nil(t, podIPs) 306 - assert.Contains(t, err.Error(), "no pod IPs available") 307 - }) 308 - 309 - t.Run("GetGRPCEndpoints - success", func(t *testing.T) { 310 - fakeClient := fake.NewClientBuilder().WithScheme(scheme).Build() 311 - manager := NewManager(fakeClient, "test-namespace", nil) 312 - 313 - // Add agent to tracking 314 - podIPs := []string{"10.1.1.5", "10.1.1.6"} 315 - agentInfo := &AgentInfo{ 316 - DeviceName: "test-device", 317 - PodIPs: podIPs, 318 - Status: AgentStatusReady, 319 - } 320 - manager.activeAgents["test-device"] = agentInfo 321 - 322 - // Test endpoint generation 323 - endpoints, err := manager.GetGRPCEndpoints("test-device") 324 - assert.NoError(t, err) 325 - expectedEndpoints := []string{"10.1.1.5:9090", "10.1.1.6:9090"} 326 - assert.Equal(t, expectedEndpoints, endpoints) 327 - }) 328 - 329 - t.Run("GetGRPCEndpoints - agent not found", func(t *testing.T) { 330 - fakeClient := fake.NewClientBuilder().WithScheme(scheme).Build() 331 - manager := NewManager(fakeClient, "test-namespace", nil) 332 - 333 - endpoints, err := manager.GetGRPCEndpoints("nonexistent-device") 334 - assert.Error(t, err) 335 - assert.Nil(t, endpoints) 336 - assert.Contains(t, err.Error(), "no active agents found") 337 - }) 264 + // GetGRPCEndpoints tests removed - function now uses HSMPool-based lookup 338 265 339 266 t.Run("removeAgentFromTracking", func(t *testing.T) { 340 267 fakeClient := fake.NewClientBuilder().WithScheme(scheme).Build()
+25 -31
internal/api/server.go
··· 20 20 "context" 21 21 "fmt" 22 22 "net/http" 23 + "strings" 23 24 "time" 24 25 25 26 "github.com/gin-gonic/gin" ··· 83 84 84 85 // handleHealth handles health check requests 85 86 func (s *Server) handleHealth(c *gin.Context) { 86 - // In proxy mode, check if any agents are available 87 - _, agentErr := s.findAvailableAgent(c.Request.Context(), "secrets") 88 - hsmConnected := agentErr == nil 89 - 90 87 // Check if multiple agents are available for replication 91 88 agents, _ := s.getAllAvailableAgents(c.Request.Context(), "secrets") 89 + hsmConnected := len(agents) > 0 92 90 replicationEnabled := len(agents) > 1 93 91 activeNodes := len(agents) 94 92 ··· 168 166 169 167 // findAvailableAgent finds an available HSM agent for handling requests 170 168 func (s *Server) findAvailableAgent(ctx context.Context, namespace string) (string, error) { 171 - if s.agentManager == nil { 172 - return "", fmt.Errorf("agent manager not available") 173 - } 174 - 175 - // List all HSMDevices to find one with an active agent 176 - var hsmDeviceList hsmv1alpha1.HSMDeviceList 177 - if err := s.client.List(ctx, &hsmDeviceList, client.InNamespace(namespace)); err != nil { 178 - return "", fmt.Errorf("failed to list HSM devices: %w", err) 169 + agents, err := s.getAllAvailableAgents(ctx, namespace) 170 + if err != nil { 171 + return "", err 179 172 } 180 - 181 - // Check if any device has an active agent with pod IPs 182 - for _, device := range hsmDeviceList.Items { 183 - if podIPs, err := s.agentManager.GetAgentPodIPs(device.Name); err == nil && len(podIPs) > 0 { 184 - // Return the device name (we'll use AgentManager to get the actual client) 185 - return device.Name, nil 186 - } 173 + if len(agents) == 0 { 174 + return "", fmt.Errorf("no available HSM agents found") 187 175 } 188 - 189 - return "", fmt.Errorf("no available HSM agents found") 176 + return agents[0], nil 190 177 } 191 178 192 179 // getAllAvailableAgents finds all available HSM agents for mirroring operations ··· 195 182 return nil, fmt.Errorf("agent manager not available") 196 183 } 197 184 198 - // List all HSMDevices to find all with active agents 199 - var hsmDeviceList hsmv1alpha1.HSMDeviceList 200 - if err := s.client.List(ctx, &hsmDeviceList, client.InNamespace(namespace)); err != nil { 201 - return nil, fmt.Errorf("failed to list HSM devices: %w", err) 185 + // List all HSMPools to find all with active agents 186 + var hsmPoolList hsmv1alpha1.HSMPoolList 187 + if err := s.client.List(ctx, &hsmPoolList, client.InNamespace(namespace)); err != nil { 188 + return nil, fmt.Errorf("failed to list HSM pools: %w", err) 202 189 } 203 190 204 191 var availableDevices []string 205 - // Check all devices that have active agents with pod IPs 206 - for _, device := range hsmDeviceList.Items { 207 - if podIPs, err := s.agentManager.GetAgentPodIPs(device.Name); err == nil && len(podIPs) > 0 { 208 - availableDevices = append(availableDevices, device.Name) 192 + // Check all pools that have active agents 193 + for _, pool := range hsmPoolList.Items { 194 + if pool.Status.Phase != hsmv1alpha1.HSMPoolPhaseReady { 195 + continue 196 + } 197 + 198 + // Extract device name from pool name (remove "-pool" suffix) 199 + deviceName := strings.TrimSuffix(pool.Name, "-pool") 200 + 201 + if podIPs, err := s.agentManager.GetAgentPodIPs(ctx, deviceName, namespace); err == nil && len(podIPs) > 0 { 202 + availableDevices = append(availableDevices, deviceName) 209 203 } 210 204 } 211 205 ··· 217 211 } 218 212 219 213 // createGRPCClient creates a gRPC client for the specified device using AgentManager 220 - func (s *Server) createGRPCClient(ctx context.Context, deviceName, _ string) (hsm.Client, error) { 214 + func (s *Server) createGRPCClient(ctx context.Context, deviceName, namespace string) (hsm.Client, error) { 221 215 // Use the AgentManager to create a gRPC client directly 222 216 if s.agentManager == nil { 223 217 return nil, fmt.Errorf("agent manager not available") 224 218 } 225 219 226 220 // Create gRPC client using AgentManager's existing method 227 - grpcClient, err := s.agentManager.CreateSingleGRPCClient(ctx, deviceName, s.logger) 221 + grpcClient, err := s.agentManager.CreateSingleGRPCClient(ctx, deviceName, namespace, s.logger) 228 222 if err != nil { 229 223 return nil, fmt.Errorf("failed to create gRPC client for device %s: %w", deviceName, err) 230 224 }
+1 -1
internal/controller/hsmsecret_controller.go
··· 203 203 } 204 204 205 205 // Create gRPC client using agent manager's direct pod connections 206 - agentClient, err := r.AgentManager.CreateSingleGRPCClient(ctx, hsmDevice.Name, logger) 206 + agentClient, err := r.AgentManager.CreateSingleGRPCClient(ctx, hsmDevice.Name, hsmSecret.Namespace, logger) 207 207 if err != nil { 208 208 // Clean up any successful connections before returning error 209 209 if err := deviceClients.Close(); err != nil {
+2 -2
internal/controller/hsmsecret_grpc_test.go
··· 226 226 require.NoError(t, err) 227 227 228 228 // Verify data was written to HSM via gRPC 229 - agentClient, err := agentManager.CreateSingleGRPCClient(ctx, "test-hsm-device", logger) 229 + agentClient, err := agentManager.CreateSingleGRPCClient(ctx, "test-hsm-device", "default", logger) 230 230 require.NoError(t, err) 231 231 defer func() { 232 232 assert.NoError(t, agentClient.Close()) ··· 342 342 require.NoError(t, err) 343 343 344 344 // Verify data exists in HSM 345 - agentClient, err := agentManager.CreateSingleGRPCClient(ctx, "test-hsm-device", logger) 345 + agentClient, err := agentManager.CreateSingleGRPCClient(ctx, "test-hsm-device", "default", logger) 346 346 require.NoError(t, err) 347 347 defer func() { 348 348 assert.NoError(t, agentClient.Close())
+8 -10
internal/controller/hsmsync_controller.go
··· 99 99 } 100 100 101 101 // Log sync results 102 - if result.ConflictDetected { 103 - logger.Info("Conflict detected and resolved", 104 - "secret", hsmSecret.Name, 105 - "primaryDevice", result.PrimaryDevice, 106 - "devices", len(result.DeviceResults)) 107 - } else { 108 - logger.V(1).Info("HSM sync completed successfully", 109 - "secret", hsmSecret.Name, 110 - "devices", len(result.DeviceResults)) 111 - } 102 + logger.Info("Per-secret HSM sync completed", 103 + "secret", hsmSecret.Name, 104 + "success", result.Success, 105 + "secretsProcessed", result.SecretsProcessed, 106 + "secretsUpdated", result.SecretsUpdated, 107 + "secretsCreated", result.SecretsCreated, 108 + "metadataRestored", result.MetadataRestored, 109 + "errors", len(result.Errors)) 112 110 113 111 // Calculate next sync interval based on HSMSecret spec 114 112 syncInterval := r.SyncInterval
+6
internal/modes/manager/manager.go
··· 292 292 return err 293 293 } 294 294 295 + // Set up HSM sync controller for multi-device synchronization 296 + if err := controller.NewHSMSyncReconciler(mgr.GetClient(), mgr.GetScheme(), agentManager).SetupWithManager(mgr); err != nil { 297 + setupLog.Error(err, "unable to create controller", "controller", "HSMSync") 298 + return err 299 + } 300 + 295 301 // Set up discovery DaemonSet controller (manager-owned) 296 302 if err := (&controller.DiscoveryDaemonSetReconciler{ 297 303 Client: mgr.GetClient(),
-304
internal/sync/conflict_resolver.go
··· 1 - /* 2 - Copyright 2025. 3 - 4 - Licensed under the Apache License, Version 2.0 (the "License"); 5 - you may not use this file except in compliance with the License. 6 - You may obtain a copy of the License at 7 - 8 - http://www.apache.org/licenses/LICENSE-2.0 9 - 10 - Unless required by applicable law or agreed to in writing, software 11 - distributed under the License is distributed on an "AS IS" BASIS, 12 - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 - See the License for the specific language governing permissions and 14 - limitations under the License. 15 - */ 16 - 17 - package sync 18 - 19 - import ( 20 - "context" 21 - "fmt" 22 - "time" 23 - 24 - "github.com/go-logr/logr" 25 - metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" 26 - 27 - hsmv1alpha1 "github.com/evanjarrett/hsm-secrets-operator/api/v1alpha1" 28 - "github.com/evanjarrett/hsm-secrets-operator/internal/hsm" 29 - ) 30 - 31 - // ConflictResolutionStrategy defines how conflicts should be resolved 32 - type ConflictResolutionStrategy string 33 - 34 - const ( 35 - // StrategyLatestVersion resolves conflicts by choosing the device with the highest version 36 - StrategyLatestVersion ConflictResolutionStrategy = "latest-version" 37 - 38 - // StrategyLatestTimestamp resolves conflicts by choosing the most recently modified data 39 - StrategyLatestTimestamp ConflictResolutionStrategy = "latest-timestamp" 40 - 41 - // StrategyManualResolution requires manual intervention for conflict resolution 42 - StrategyManualResolution ConflictResolutionStrategy = "manual" 43 - 44 - // StrategyPrimaryDevice always uses the designated primary device as source of truth 45 - StrategyPrimaryDevice ConflictResolutionStrategy = "primary-device" 46 - ) 47 - 48 - // ConflictResolver handles HSM device synchronization conflicts 49 - type ConflictResolver struct { 50 - logger logr.Logger 51 - strategy ConflictResolutionStrategy 52 - } 53 - 54 - // NewConflictResolver creates a new conflict resolver 55 - func NewConflictResolver(logger logr.Logger, strategy ConflictResolutionStrategy) *ConflictResolver { 56 - return &ConflictResolver{ 57 - logger: logger.WithName("conflict-resolver"), 58 - strategy: strategy, 59 - } 60 - } 61 - 62 - // ConflictInfo represents detected conflict information 63 - type ConflictInfo struct { 64 - SecretPath string 65 - Devices []DeviceConflictData 66 - DetectedAt time.Time 67 - ResolutionRef string // Reference to resolution method used 68 - } 69 - 70 - // DeviceConflictData represents conflict data from a specific device 71 - type DeviceConflictData struct { 72 - DeviceName string 73 - Checksum string 74 - Version int64 75 - Timestamp time.Time 76 - Data hsm.SecretData 77 - Online bool 78 - Error error 79 - } 80 - 81 - // ResolveConflict resolves a detected conflict using the configured strategy 82 - func (cr *ConflictResolver) ResolveConflict(ctx context.Context, conflict *ConflictInfo, hsmSecret *hsmv1alpha1.HSMSecret) (*ResolutionResult, error) { 83 - logger := cr.logger.WithValues("secret", hsmSecret.Name, "strategy", cr.strategy) 84 - logger.Info("Resolving HSM sync conflict", "devices", len(conflict.Devices)) 85 - 86 - switch cr.strategy { 87 - case StrategyLatestVersion: 88 - return cr.resolveByLatestVersion(conflict, logger) 89 - case StrategyLatestTimestamp: 90 - return cr.resolveByLatestTimestamp(conflict, logger) 91 - case StrategyPrimaryDevice: 92 - return cr.resolveByPrimaryDevice(conflict, hsmSecret, logger) 93 - case StrategyManualResolution: 94 - return cr.requireManualResolution(conflict, hsmSecret, logger) 95 - default: 96 - return nil, fmt.Errorf("unknown conflict resolution strategy: %s", cr.strategy) 97 - } 98 - } 99 - 100 - // ResolutionResult represents the result of conflict resolution 101 - type ResolutionResult struct { 102 - Winner *DeviceConflictData 103 - Resolution ConflictResolutionStrategy 104 - RequiresManualIntervention bool 105 - SyncTargets []string // Devices that need to be updated 106 - ResolvedData hsm.SecretData 107 - } 108 - 109 - // resolveByLatestVersion chooses the device with the highest version number 110 - func (cr *ConflictResolver) resolveByLatestVersion(conflict *ConflictInfo, logger logr.Logger) (*ResolutionResult, error) { 111 - var winner *DeviceConflictData 112 - highestVersion := int64(-1) 113 - 114 - // Find device with highest version among online devices 115 - for i := range conflict.Devices { 116 - device := &conflict.Devices[i] 117 - if device.Online && device.Error == nil && device.Version > highestVersion { 118 - highestVersion = device.Version 119 - winner = device 120 - } 121 - } 122 - 123 - if winner == nil { 124 - return nil, fmt.Errorf("no online devices with valid version found") 125 - } 126 - 127 - // Determine which devices need updating 128 - var syncTargets []string 129 - for _, device := range conflict.Devices { 130 - if device.DeviceName != winner.DeviceName && device.Online && device.Error == nil { 131 - syncTargets = append(syncTargets, device.DeviceName) 132 - } 133 - } 134 - 135 - logger.Info("Resolved conflict by latest version", 136 - "winner", winner.DeviceName, 137 - "version", winner.Version, 138 - "targets", syncTargets) 139 - 140 - return &ResolutionResult{ 141 - Winner: winner, 142 - Resolution: StrategyLatestVersion, 143 - SyncTargets: syncTargets, 144 - ResolvedData: winner.Data, 145 - }, nil 146 - } 147 - 148 - // resolveByLatestTimestamp chooses the device with the most recent timestamp 149 - func (cr *ConflictResolver) resolveByLatestTimestamp(conflict *ConflictInfo, logger logr.Logger) (*ResolutionResult, error) { 150 - var winner *DeviceConflictData 151 - var latestTime time.Time 152 - 153 - // Find device with most recent timestamp among online devices 154 - for i := range conflict.Devices { 155 - device := &conflict.Devices[i] 156 - if device.Online && device.Error == nil && device.Timestamp.After(latestTime) { 157 - latestTime = device.Timestamp 158 - winner = device 159 - } 160 - } 161 - 162 - if winner == nil { 163 - return nil, fmt.Errorf("no online devices with valid timestamp found") 164 - } 165 - 166 - // Determine which devices need updating 167 - var syncTargets []string 168 - for _, device := range conflict.Devices { 169 - if device.DeviceName != winner.DeviceName && device.Online && device.Error == nil { 170 - syncTargets = append(syncTargets, device.DeviceName) 171 - } 172 - } 173 - 174 - logger.Info("Resolved conflict by latest timestamp", 175 - "winner", winner.DeviceName, 176 - "timestamp", winner.Timestamp, 177 - "targets", syncTargets) 178 - 179 - return &ResolutionResult{ 180 - Winner: winner, 181 - Resolution: StrategyLatestTimestamp, 182 - SyncTargets: syncTargets, 183 - ResolvedData: winner.Data, 184 - }, nil 185 - } 186 - 187 - // resolveByPrimaryDevice uses the designated primary device as the source of truth 188 - func (cr *ConflictResolver) resolveByPrimaryDevice(conflict *ConflictInfo, hsmSecret *hsmv1alpha1.HSMSecret, logger logr.Logger) (*ResolutionResult, error) { 189 - primaryDevice := hsmSecret.Status.PrimaryDevice 190 - if primaryDevice == "" { 191 - // No primary device set, fall back to latest version strategy 192 - logger.Info("No primary device set, falling back to latest version strategy") 193 - return cr.resolveByLatestVersion(conflict, logger) 194 - } 195 - 196 - // Find the primary device in the conflict data 197 - var winner *DeviceConflictData 198 - for i := range conflict.Devices { 199 - device := &conflict.Devices[i] 200 - if device.DeviceName == primaryDevice { 201 - if device.Online && device.Error == nil { 202 - winner = device 203 - break 204 - } else { 205 - logger.Info("Primary device is offline or has errors, falling back to latest version", 206 - "primaryDevice", primaryDevice, 207 - "online", device.Online, 208 - "error", device.Error) 209 - return cr.resolveByLatestVersion(conflict, logger) 210 - } 211 - } 212 - } 213 - 214 - if winner == nil { 215 - logger.Info("Primary device not found in conflict, falling back to latest version", "primaryDevice", primaryDevice) 216 - return cr.resolveByLatestVersion(conflict, logger) 217 - } 218 - 219 - // Determine which devices need updating 220 - var syncTargets []string 221 - for _, device := range conflict.Devices { 222 - if device.DeviceName != winner.DeviceName && device.Online && device.Error == nil { 223 - syncTargets = append(syncTargets, device.DeviceName) 224 - } 225 - } 226 - 227 - logger.Info("Resolved conflict by primary device", 228 - "winner", winner.DeviceName, 229 - "targets", syncTargets) 230 - 231 - return &ResolutionResult{ 232 - Winner: winner, 233 - Resolution: StrategyPrimaryDevice, 234 - SyncTargets: syncTargets, 235 - ResolvedData: winner.Data, 236 - }, nil 237 - } 238 - 239 - // requireManualResolution marks the conflict as requiring manual intervention 240 - func (cr *ConflictResolver) requireManualResolution(conflict *ConflictInfo, hsmSecret *hsmv1alpha1.HSMSecret, logger logr.Logger) (*ResolutionResult, error) { 241 - logger.Info("Conflict marked for manual resolution", 242 - "devices", len(conflict.Devices), 243 - "secret", hsmSecret.Name) 244 - 245 - // Add condition to HSMSecret indicating manual resolution is required 246 - now := metav1.NewTime(time.Now()) 247 - condition := metav1.Condition{ 248 - Type: "ConflictResolutionRequired", 249 - Status: metav1.ConditionTrue, 250 - Reason: "ManualResolutionRequired", 251 - Message: fmt.Sprintf("Conflict detected between %d devices requires manual resolution", len(conflict.Devices)), 252 - LastTransitionTime: now, 253 - } 254 - 255 - // Update conditions (this would be done by the caller) 256 - _ = condition 257 - 258 - return &ResolutionResult{ 259 - RequiresManualIntervention: true, 260 - Resolution: StrategyManualResolution, 261 - SyncTargets: []string{}, // No automatic sync 262 - }, nil 263 - } 264 - 265 - // DetectConflicts analyzes device sync results to identify conflicts 266 - func (cr *ConflictResolver) DetectConflicts(deviceResults map[string]DeviceResult, secretPath string) (*ConflictInfo, bool) { 267 - // Group devices by checksum 268 - checksumGroups := make(map[string][]string) 269 - conflictDevices := make([]DeviceConflictData, 0) 270 - 271 - for deviceName, result := range deviceResults { 272 - if result.Online && result.Error == nil && result.Checksum != "" { 273 - if _, exists := checksumGroups[result.Checksum]; !exists { 274 - checksumGroups[result.Checksum] = make([]string, 0) 275 - } 276 - checksumGroups[result.Checksum] = append(checksumGroups[result.Checksum], deviceName) 277 - 278 - // Add to conflict devices list 279 - conflictDevices = append(conflictDevices, DeviceConflictData{ 280 - DeviceName: deviceName, 281 - Checksum: result.Checksum, 282 - Version: result.Version, 283 - Timestamp: result.Timestamp, 284 - Online: result.Online, 285 - Error: result.Error, 286 - }) 287 - } 288 - } 289 - 290 - // Conflict exists if we have more than one checksum group 291 - hasConflict := len(checksumGroups) > 1 && len(conflictDevices) > 1 292 - 293 - if !hasConflict { 294 - return nil, false 295 - } 296 - 297 - conflict := &ConflictInfo{ 298 - SecretPath: secretPath, 299 - Devices: conflictDevices, 300 - DetectedAt: time.Now(), 301 - } 302 - 303 - return conflict, true 304 - }
+588 -181
internal/sync/manager.go
··· 33 33 34 34 // AgentManagerInterface defines the interface for HSM agent management used by sync 35 35 type AgentManagerInterface interface { 36 - CreateSingleGRPCClient(ctx context.Context, deviceName string, logger logr.Logger) (hsm.Client, error) 36 + CreateSingleGRPCClient(ctx context.Context, deviceName, namespace string, logger logr.Logger) (hsm.Client, error) 37 37 } 38 38 39 39 // SyncManager handles multi-device HSM synchronization and conflict resolution ··· 52 52 } 53 53 } 54 54 55 - // SyncResult represents the result of a sync operation 55 + // SyncResult represents the result of a per-secret synchronization operation 56 56 type SyncResult struct { 57 57 Success bool 58 - ConflictDetected bool 59 - PrimaryDevice string 60 - DeviceResults map[string]DeviceResult 61 - ResolvedData hsm.SecretData 58 + SecretsProcessed int 59 + SecretsUpdated int 60 + SecretsCreated int 61 + MetadataRestored int 62 + SecretResults map[string]SecretSyncResult 63 + Errors []string 64 + } 65 + 66 + // SecretSyncResult represents the result of syncing a specific secret 67 + type SecretSyncResult struct { 68 + SecretPath string 69 + SourceDevice string 70 + SourceVersion int64 71 + TargetDevices []string 72 + SyncType SyncType 73 + Success bool 74 + Error error 75 + } 76 + 77 + // SecretInventory represents the state of a secret across all devices 78 + type SecretInventory struct { 79 + SecretPath string 80 + DeviceStates map[string]*SecretState // device -> metadata & presence 62 81 } 63 82 64 - // DeviceResult represents the sync result for a specific device 65 - type DeviceResult struct { 66 - Online bool 67 - Checksum string 68 - Version int64 69 - Error error 70 - Timestamp time.Time 83 + // SecretState represents the state of a secret on a specific device 84 + type SecretState struct { 85 + Present bool 86 + Version int64 87 + Timestamp time.Time 88 + Checksum string 89 + HasMetadata bool 90 + Error error 91 + } 92 + 93 + // SecretSyncPlan represents the plan for syncing a specific secret 94 + type SecretSyncPlan struct { 95 + SecretPath string 96 + SourceDevice string 97 + SourceVersion int64 98 + TargetDevices []string 99 + SyncType SyncType 71 100 } 72 101 73 - // SyncSecret performs multi-device synchronization for an HSMSecret 102 + // SyncType represents the type of sync operation needed 103 + type SyncType int 104 + 105 + const ( 106 + SyncTypeSkip SyncType = iota // Already in sync 107 + SyncTypeUpdate // Update existing secret 108 + SyncTypeCreate // Create missing secret 109 + SyncTypeRestoreMetadata // Add metadata to existing secret 110 + ) 111 + 112 + // SyncSecret performs per-secret synchronization across all HSM devices 74 113 func (sm *SyncManager) SyncSecret(ctx context.Context, hsmSecret *hsmv1alpha1.HSMSecret) (*SyncResult, error) { 75 114 logger := sm.logger.WithValues("secret", hsmSecret.Name, "namespace", hsmSecret.Namespace) 76 115 secretPath := hsmSecret.Name ··· 84 123 if len(devices) == 0 { 85 124 return &SyncResult{ 86 125 Success: false, 87 - DeviceResults: make(map[string]DeviceResult), 126 + SecretResults: make(map[string]SecretSyncResult), 127 + Errors: []string{"no HSM devices available"}, 88 128 }, fmt.Errorf("no HSM devices available") 89 129 } 90 130 91 - logger.Info("Starting multi-device sync", "devices", len(devices)) 131 + logger.Info("Starting per-secret sync", "devices", len(devices), "secretPath", secretPath) 132 + 133 + // Build inventory of all secrets across all devices 134 + inventory, err := sm.buildSecretInventory(ctx, []string{secretPath}, devices, hsmSecret.Namespace, logger) 135 + if err != nil { 136 + return &SyncResult{ 137 + Success: false, 138 + SecretResults: make(map[string]SecretSyncResult), 139 + Errors: []string{fmt.Sprintf("failed to build secret inventory: %v", err)}, 140 + }, fmt.Errorf("failed to build secret inventory: %w", err) 141 + } 142 + 143 + // Create sync plan for the single secret 144 + syncPlans := sm.createSyncPlans(inventory, logger) 145 + 146 + // Execute sync operations 147 + result := sm.executeSyncPlans(ctx, syncPlans, hsmSecret.Namespace, logger) 148 + 149 + logger.Info("Per-secret sync completed", 150 + "secretsProcessed", result.SecretsProcessed, 151 + "secretsUpdated", result.SecretsUpdated, 152 + "secretsCreated", result.SecretsCreated, 153 + "metadataRestored", result.MetadataRestored, 154 + "errors", len(result.Errors)) 155 + 156 + return result, nil 157 + } 92 158 93 - // Read data from all online devices 94 - deviceResults := make(map[string]DeviceResult) 95 - validDeviceData := make(map[string]hsm.SecretData) 159 + // buildSecretInventory builds a comprehensive inventory of secrets across all devices 160 + func (sm *SyncManager) buildSecretInventory(ctx context.Context, secretPaths []string, devices []string, namespace string, logger logr.Logger) (map[string]*SecretInventory, error) { 161 + inventory := make(map[string]*SecretInventory) 96 162 163 + // Initialize inventory entries for requested secrets 164 + for _, secretPath := range secretPaths { 165 + inventory[secretPath] = &SecretInventory{ 166 + SecretPath: secretPath, 167 + DeviceStates: make(map[string]*SecretState), 168 + } 169 + } 170 + 171 + // Check each device for the presence and state of each secret 97 172 for _, deviceName := range devices { 98 - result := sm.readFromDevice(ctx, deviceName, secretPath, logger) 99 - deviceResults[deviceName] = result 173 + logger.Info("Checking device for secrets", "device", deviceName, "secretCount", len(secretPaths)) 100 174 101 - if result.Online && result.Error == nil { 102 - // Calculate checksum for the data (we'll get the actual data in practice) 103 - // For now, simulating based on checksum 104 - validDeviceData[deviceName] = hsm.SecretData{ 105 - "checksum": []byte(result.Checksum), 175 + // Create gRPC client for this device 176 + grpcClient, err := sm.agentManager.CreateSingleGRPCClient(ctx, deviceName, namespace, logger) 177 + if err != nil { 178 + logger.Error(err, "Failed to create gRPC client", "device", deviceName) 179 + // Mark all secrets as having an error on this device 180 + for secretPath := range inventory { 181 + inventory[secretPath].DeviceStates[deviceName] = &SecretState{ 182 + Present: false, 183 + Version: 0, 184 + Timestamp: time.Now(), 185 + Checksum: "", 186 + HasMetadata: false, 187 + Error: fmt.Errorf("gRPC client creation failed: %w", err), 188 + } 189 + } 190 + continue 191 + } 192 + 193 + defer func(client hsm.Client, device string) { 194 + if closeErr := client.Close(); closeErr != nil { 195 + logger.V(1).Info("Failed to close gRPC client", "device", device, "error", closeErr) 196 + } 197 + }(grpcClient, deviceName) 198 + 199 + // Check if device is connected 200 + if !grpcClient.IsConnected() { 201 + logger.V(1).Info("Device not connected", "device", deviceName) 202 + for secretPath := range inventory { 203 + inventory[secretPath].DeviceStates[deviceName] = &SecretState{ 204 + Present: false, 205 + Version: 0, 206 + Timestamp: time.Now(), 207 + Checksum: "", 208 + HasMetadata: false, 209 + Error: fmt.Errorf("device not connected"), 210 + } 211 + } 212 + continue 213 + } 214 + 215 + // Check each secret on this device 216 + for _, secretPath := range secretPaths { 217 + state := &SecretState{ 218 + Present: false, 219 + Version: 0, 220 + Timestamp: time.Now(), 221 + Checksum: "", 222 + HasMetadata: false, 223 + Error: nil, 224 + } 225 + 226 + // Try to read the secret to check if it exists 227 + data, err := grpcClient.ReadSecret(ctx, secretPath) 228 + if err != nil { 229 + // Secret doesn't exist on this device 230 + logger.V(1).Info("Secret not found on device", "device", deviceName, "secret", secretPath) 231 + state.Error = fmt.Errorf("secret not found: %w", err) 232 + } else { 233 + // Secret exists, calculate checksum 234 + state.Present = true 235 + state.Checksum = sm.calculateChecksum(data) 236 + logger.V(1).Info("Secret found on device", "device", deviceName, "secret", secretPath, "checksum", state.Checksum[:8]) 237 + 238 + // Try to read metadata 239 + metadata, metaErr := grpcClient.ReadMetadata(ctx, secretPath) 240 + if metaErr == nil && metadata != nil && metadata.Labels != nil { 241 + // Extract version and timestamp from metadata 242 + if versionStr, exists := metadata.Labels["sync.version"]; exists { 243 + if version, parseErr := parseVersion(versionStr); parseErr == nil { 244 + state.Version = version 245 + state.HasMetadata = true 246 + } 247 + } 248 + if timestampStr, exists := metadata.Labels["sync.timestamp"]; exists { 249 + if timestamp, parseErr := time.Parse(time.RFC3339, timestampStr); parseErr == nil { 250 + state.Timestamp = timestamp 251 + } 252 + } 253 + logger.V(1).Info("Metadata found", "device", deviceName, "secret", secretPath, 254 + "version", state.Version, "timestamp", state.Timestamp.Format(time.RFC3339)) 255 + } else { 256 + logger.V(1).Info("No metadata found", "device", deviceName, "secret", secretPath) 257 + } 106 258 } 259 + 260 + inventory[secretPath].DeviceStates[deviceName] = state 107 261 } 108 262 } 109 263 110 - // Detect conflicts and resolve 111 - conflictDetected := sm.detectConflicts(deviceResults) 112 - primaryDevice := sm.selectPrimaryDevice(deviceResults, hsmSecret) 264 + return inventory, nil 265 + } 113 266 114 - var resolvedData hsm.SecretData 115 - if primaryDevice != "" && validDeviceData[primaryDevice] != nil { 116 - resolvedData = validDeviceData[primaryDevice] 117 - logger.Info("Using primary device data", "primaryDevice", primaryDevice) 118 - } else if len(validDeviceData) > 0 { 119 - // Use most recent data if no clear primary 120 - resolvedData = sm.selectMostRecentData(deviceResults, validDeviceData) 121 - logger.Info("Using most recent data for resolution") 122 - } 267 + // createSyncPlans analyzes secret inventory and creates sync plans for each secret 268 + func (sm *SyncManager) createSyncPlans(inventory map[string]*SecretInventory, logger logr.Logger) []*SecretSyncPlan { 269 + var plans []*SecretSyncPlan 123 270 124 - // If conflict detected and we have a resolution, sync to all other devices 125 - if conflictDetected && primaryDevice != "" { 126 - sm.syncToSecondaryDevices(ctx, devices, primaryDevice, secretPath, resolvedData, logger) 271 + for secretPath, secretInventory := range inventory { 272 + plan := sm.createSyncPlanForSecret(secretPath, secretInventory, logger) 273 + if plan != nil { 274 + plans = append(plans, plan) 275 + } 127 276 } 128 277 129 - return &SyncResult{ 130 - Success: len(validDeviceData) > 0, 131 - ConflictDetected: conflictDetected, 132 - PrimaryDevice: primaryDevice, 133 - DeviceResults: deviceResults, 134 - ResolvedData: resolvedData, 135 - }, nil 278 + return plans 136 279 } 137 280 138 - // readFromDevice reads secret data from a specific HSM device 139 - func (sm *SyncManager) readFromDevice(ctx context.Context, deviceName, secretPath string, logger logr.Logger) DeviceResult { 140 - result := DeviceResult{ 141 - Timestamp: time.Now(), 281 + // createSyncPlanForSecret creates a sync plan for a specific secret across all devices 282 + func (sm *SyncManager) createSyncPlanForSecret(secretPath string, inventory *SecretInventory, logger logr.Logger) *SecretSyncPlan { 283 + // Find the authoritative source device (highest version, most recent timestamp) 284 + var sourceDevice string 285 + var sourceVersion int64 = -1 286 + var sourceTimestamp time.Time 287 + var devicesWithSecret []string 288 + var devicesNeedingSecret []string 289 + var devicesNeedingMetadata []string 290 + 291 + // Analyze all device states for this secret 292 + for deviceName, state := range inventory.DeviceStates { 293 + if state.Error != nil { 294 + logger.V(1).Info("Device has error, skipping", "device", deviceName, "secret", secretPath, "error", state.Error) 295 + continue 296 + } 297 + 298 + if state.Present { 299 + devicesWithSecret = append(devicesWithSecret, deviceName) 300 + 301 + // Check if this device has the most authoritative version 302 + if state.HasMetadata { 303 + if state.Version > sourceVersion || 304 + (state.Version == sourceVersion && state.Timestamp.After(sourceTimestamp)) { 305 + sourceDevice = deviceName 306 + sourceVersion = state.Version 307 + sourceTimestamp = state.Timestamp 308 + } 309 + } else { 310 + // Secret exists but lacks metadata - needs restoration 311 + devicesNeedingMetadata = append(devicesNeedingMetadata, deviceName) 312 + 313 + // If no source found yet and this device has the secret, it could be the source 314 + // We'll use timestamp as fallback (creation time from our scan) 315 + if sourceDevice == "" { 316 + sourceDevice = deviceName 317 + sourceVersion = 0 // Will trigger metadata creation 318 + sourceTimestamp = state.Timestamp 319 + } 320 + } 321 + } else { 322 + // Device doesn't have this secret 323 + devicesNeedingSecret = append(devicesNeedingSecret, deviceName) 324 + } 142 325 } 143 326 144 - // Get gRPC client for this device 145 - grpcClient, err := sm.agentManager.CreateSingleGRPCClient(ctx, deviceName, logger) 146 - if err != nil { 147 - result.Error = fmt.Errorf("failed to create gRPC client: %w", err) 148 - return result 327 + // Determine sync operation type 328 + if len(devicesWithSecret) == 0 { 329 + // No devices have this secret - nothing to sync 330 + logger.V(1).Info("Secret not found on any device", "secret", secretPath) 331 + return nil 149 332 } 150 - defer func() { 151 - if closeErr := grpcClient.Close(); closeErr != nil { 152 - logger.V(1).Info("Failed to close gRPC client", "error", closeErr) 333 + 334 + if len(devicesNeedingSecret) == 0 && len(devicesNeedingMetadata) == 0 { 335 + // All devices have the secret with metadata - check if they're in sync 336 + allInSync := true 337 + for _, state := range inventory.DeviceStates { 338 + if state.Error == nil && state.Present { 339 + if !state.HasMetadata || state.Version != sourceVersion { 340 + allInSync = false 341 + break 342 + } 343 + } 153 344 } 154 - }() 155 345 156 - result.Online = grpcClient.IsConnected() 157 - if !result.Online { 158 - result.Error = fmt.Errorf("device not connected") 159 - return result 346 + if allInSync { 347 + logger.V(1).Info("Secret already in sync across all devices", "secret", secretPath) 348 + return &SecretSyncPlan{ 349 + SecretPath: secretPath, 350 + SourceDevice: sourceDevice, 351 + SourceVersion: sourceVersion, 352 + TargetDevices: []string{}, // No targets needed 353 + SyncType: SyncTypeSkip, 354 + } 355 + } 160 356 } 161 357 162 - // Try to read the secret 163 - data, err := grpcClient.ReadSecret(ctx, secretPath) 164 - if err != nil { 165 - result.Error = fmt.Errorf("failed to read secret: %w", err) 166 - return result 358 + // Determine target devices that need updates 359 + var targetDevices []string 360 + var syncType SyncType 361 + 362 + // Add devices that need the secret created 363 + targetDevices = append(targetDevices, devicesNeedingSecret...) 364 + if len(devicesNeedingSecret) > 0 { 365 + syncType = SyncTypeCreate 167 366 } 168 367 169 - // Calculate checksum 170 - result.Checksum = sm.calculateChecksum(data) 368 + // Add devices that need metadata restoration 369 + targetDevices = append(targetDevices, devicesNeedingMetadata...) 370 + if len(devicesNeedingMetadata) > 0 && syncType != SyncTypeCreate { 371 + syncType = SyncTypeRestoreMetadata 372 + } 171 373 172 - // Get metadata to extract version (if available) 173 - metadata, err := grpcClient.ReadMetadata(ctx, secretPath) 174 - if err == nil && metadata != nil { 175 - if versionStr, exists := metadata.Labels["sync.version"]; exists { 176 - if version, parseErr := parseVersion(versionStr); parseErr == nil { 177 - result.Version = version 374 + // Add devices that have outdated versions 375 + for deviceName, state := range inventory.DeviceStates { 376 + if state.Error == nil && state.Present && state.HasMetadata { 377 + if state.Version < sourceVersion { 378 + // This device has an older version 379 + targetDevices = append(targetDevices, deviceName) 380 + if syncType != SyncTypeCreate && syncType != SyncTypeRestoreMetadata { 381 + syncType = SyncTypeUpdate 382 + } 178 383 } 179 384 } 180 385 } 181 386 182 - return result 387 + // Remove duplicates from target devices 388 + targetDevices = removeDuplicates(targetDevices) 389 + 390 + // Remove source device from targets (don't sync to itself) 391 + targetDevices = removeDevice(targetDevices, sourceDevice) 392 + 393 + if len(targetDevices) == 0 { 394 + // No targets needed 395 + return &SecretSyncPlan{ 396 + SecretPath: secretPath, 397 + SourceDevice: sourceDevice, 398 + SourceVersion: sourceVersion, 399 + TargetDevices: []string{}, 400 + SyncType: SyncTypeSkip, 401 + } 402 + } 403 + 404 + logger.Info("Created sync plan", "secret", secretPath, 405 + "sourceDevice", sourceDevice, "sourceVersion", sourceVersion, 406 + "targetDevices", len(targetDevices), "syncType", syncType) 407 + 408 + return &SecretSyncPlan{ 409 + SecretPath: secretPath, 410 + SourceDevice: sourceDevice, 411 + SourceVersion: sourceVersion, 412 + TargetDevices: targetDevices, 413 + SyncType: syncType, 414 + } 183 415 } 184 416 185 - // detectConflicts checks if there are conflicting checksums across devices 186 - func (sm *SyncManager) detectConflicts(deviceResults map[string]DeviceResult) bool { 187 - checksums := make(map[string]int) 188 - onlineDevices := 0 417 + // removeDuplicates removes duplicate strings from a slice 418 + func removeDuplicates(slice []string) []string { 419 + keys := make(map[string]bool) 420 + var result []string 189 421 190 - for _, result := range deviceResults { 191 - if result.Online && result.Error == nil && result.Checksum != "" { 192 - checksums[result.Checksum]++ 193 - onlineDevices++ 422 + for _, item := range slice { 423 + if !keys[item] { 424 + keys[item] = true 425 + result = append(result, item) 194 426 } 195 427 } 196 428 197 - // Conflict if we have more than one unique checksum across online devices 198 - return len(checksums) > 1 && onlineDevices > 1 429 + return result 199 430 } 200 431 201 - // selectPrimaryDevice chooses the primary device for conflict resolution 202 - func (sm *SyncManager) selectPrimaryDevice(deviceResults map[string]DeviceResult, hsmSecret *hsmv1alpha1.HSMSecret) string { 203 - // Check if there's already a designated primary in the status 204 - if hsmSecret.Status.PrimaryDevice != "" { 205 - if result, exists := deviceResults[hsmSecret.Status.PrimaryDevice]; exists && result.Online && result.Error == nil { 206 - return hsmSecret.Status.PrimaryDevice 432 + // removeDevice removes a specific device from a slice of devices 433 + func removeDevice(devices []string, deviceToRemove string) []string { 434 + var result []string 435 + 436 + for _, device := range devices { 437 + if device != deviceToRemove { 438 + result = append(result, device) 207 439 } 208 440 } 209 441 210 - // Find device with highest version number among online devices 211 - var bestDevice string 212 - var highestVersion int64 = -1 213 - var mostRecentTime time.Time 442 + return result 443 + } 214 444 215 - for deviceName, result := range deviceResults { 216 - if result.Online && result.Error == nil { 217 - // Prefer higher version numbers 218 - if result.Version > highestVersion { 219 - highestVersion = result.Version 220 - bestDevice = deviceName 221 - mostRecentTime = result.Timestamp 222 - } else if result.Version == highestVersion && result.Timestamp.After(mostRecentTime) { 223 - // If versions are equal, prefer more recent timestamp 224 - bestDevice = deviceName 225 - mostRecentTime = result.Timestamp 445 + // executeSyncPlans executes sync operations for all planned secret synchronizations 446 + func (sm *SyncManager) executeSyncPlans(ctx context.Context, plans []*SecretSyncPlan, namespace string, logger logr.Logger) *SyncResult { 447 + result := &SyncResult{ 448 + Success: true, 449 + SecretsProcessed: len(plans), 450 + SecretsUpdated: 0, 451 + SecretsCreated: 0, 452 + MetadataRestored: 0, 453 + SecretResults: make(map[string]SecretSyncResult), 454 + Errors: []string{}, 455 + } 456 + 457 + for _, plan := range plans { 458 + secretResult := sm.executeSyncPlan(ctx, plan, namespace, logger) 459 + result.SecretResults[plan.SecretPath] = secretResult 460 + 461 + if secretResult.Success { 462 + switch secretResult.SyncType { 463 + case SyncTypeCreate: 464 + result.SecretsCreated++ 465 + case SyncTypeUpdate: 466 + result.SecretsUpdated++ 467 + case SyncTypeRestoreMetadata: 468 + result.MetadataRestored++ 469 + } 470 + } else { 471 + result.Success = false 472 + if secretResult.Error != nil { 473 + result.Errors = append(result.Errors, fmt.Sprintf("%s: %v", plan.SecretPath, secretResult.Error)) 226 474 } 227 475 } 228 476 } 229 477 230 - return bestDevice 478 + return result 231 479 } 232 480 233 - // selectMostRecentData selects the most recently modified data 234 - func (sm *SyncManager) selectMostRecentData(deviceResults map[string]DeviceResult, validDeviceData map[string]hsm.SecretData) hsm.SecretData { 235 - var mostRecentDevice string 236 - var mostRecentTime time.Time 481 + // executeSyncPlan executes a single secret sync plan 482 + func (sm *SyncManager) executeSyncPlan(ctx context.Context, plan *SecretSyncPlan, namespace string, logger logr.Logger) SecretSyncResult { 483 + result := SecretSyncResult{ 484 + SecretPath: plan.SecretPath, 485 + SourceDevice: plan.SourceDevice, 486 + SourceVersion: plan.SourceVersion, 487 + TargetDevices: plan.TargetDevices, 488 + SyncType: plan.SyncType, 489 + Success: false, 490 + Error: nil, 491 + } 237 492 238 - for deviceName, result := range deviceResults { 239 - if result.Online && result.Error == nil && result.Timestamp.After(mostRecentTime) { 240 - mostRecentTime = result.Timestamp 241 - mostRecentDevice = deviceName 242 - } 493 + // Skip if no sync needed 494 + if plan.SyncType == SyncTypeSkip { 495 + result.Success = true 496 + logger.V(1).Info("Skipping sync - already in sync", "secret", plan.SecretPath) 497 + return result 243 498 } 244 499 245 - if mostRecentDevice != "" && validDeviceData[mostRecentDevice] != nil { 246 - return validDeviceData[mostRecentDevice] 500 + // Get source data and metadata 501 + sourceData, sourceMetadata, err := sm.readSecretWithMetadata(ctx, plan.SourceDevice, plan.SecretPath, namespace, logger) 502 + if err != nil { 503 + result.Error = fmt.Errorf("failed to read source secret: %w", err) 504 + logger.Error(err, "Failed to read source secret", "device", plan.SourceDevice, "secret", plan.SecretPath) 505 + return result 247 506 } 248 507 249 - // Return first available data if no clear winner 250 - for _, data := range validDeviceData { 251 - return data 508 + // Prepare metadata for sync 509 + newVersion := time.Now().Unix() 510 + if sourceMetadata != nil && sourceMetadata.Labels != nil { 511 + if versionStr, exists := sourceMetadata.Labels["sync.version"]; exists { 512 + if existingVersion, parseErr := parseVersion(versionStr); parseErr == nil && existingVersion > 0 { 513 + newVersion = existingVersion 514 + } 515 + } 252 516 } 253 517 254 - return nil 255 - } 518 + // Create or update metadata 519 + syncMetadata := &hsm.SecretMetadata{ 520 + Labels: map[string]string{ 521 + "sync.version": fmt.Sprintf("%d", newVersion), 522 + "sync.timestamp": time.Now().Format(time.RFC3339), 523 + "sync.source": plan.SourceDevice, 524 + }, 525 + } 256 526 257 - // syncToSecondaryDevices syncs resolved data to all secondary devices 258 - func (sm *SyncManager) syncToSecondaryDevices(ctx context.Context, devices []string, primaryDevice, secretPath string, data hsm.SecretData, logger logr.Logger) { 259 - for _, deviceName := range devices { 260 - if deviceName == primaryDevice { 261 - continue // Skip primary device 527 + // Handle metadata restoration on source device if needed 528 + if plan.SyncType == SyncTypeRestoreMetadata { 529 + if sourceMetadata == nil || sourceMetadata.Labels == nil || sourceMetadata.Labels["sync.version"] == "" { 530 + logger.Info("Restoring metadata on source device", "device", plan.SourceDevice, "secret", plan.SecretPath) 531 + if err := sm.writeSecretWithMetadata(ctx, plan.SourceDevice, plan.SecretPath, sourceData, syncMetadata, namespace, logger); err != nil { 532 + result.Error = fmt.Errorf("failed to restore metadata on source: %w", err) 533 + return result 534 + } 262 535 } 536 + } 263 537 264 - logger.Info("Syncing to secondary device", "device", deviceName) 538 + // Sync to target devices 539 + successfulTargets := 0 540 + for _, targetDevice := range plan.TargetDevices { 541 + logger.Info("Syncing secret to target device", "secret", plan.SecretPath, "source", plan.SourceDevice, "target", targetDevice, "version", newVersion) 265 542 266 - grpcClient, err := sm.agentManager.CreateSingleGRPCClient(ctx, deviceName, logger) 267 - if err != nil { 268 - logger.Error(err, "Failed to create gRPC client for sync", "device", deviceName) 269 - continue 543 + var syncErr error 544 + switch plan.SyncType { 545 + case SyncTypeCreate, SyncTypeUpdate: 546 + syncErr = sm.writeSecretWithMetadata(ctx, targetDevice, plan.SecretPath, sourceData, syncMetadata, namespace, logger) 547 + case SyncTypeRestoreMetadata: 548 + // For metadata restoration, we just update the metadata without changing the data 549 + syncErr = sm.writeMetadataOnly(ctx, targetDevice, plan.SecretPath, syncMetadata, namespace, logger) 270 550 } 271 551 272 - if !grpcClient.IsConnected() { 273 - logger.V(1).Info("Device offline, skipping sync", "device", deviceName) 274 - if closeErr := grpcClient.Close(); closeErr != nil { 275 - logger.V(1).Info("Failed to close gRPC client", "error", closeErr) 552 + if syncErr != nil { 553 + logger.Error(syncErr, "Failed to sync to target device", "target", targetDevice, "secret", plan.SecretPath) 554 + if result.Error == nil { 555 + result.Error = syncErr 276 556 } 277 - continue 557 + } else { 558 + successfulTargets++ 559 + logger.Info("Successfully synced to target device", "target", targetDevice, "secret", plan.SecretPath) 278 560 } 561 + } 279 562 280 - // Write data with updated version metadata 281 - metadata := &hsm.SecretMetadata{ 282 - Labels: map[string]string{ 283 - "sync.version": fmt.Sprintf("%d", time.Now().Unix()), 284 - "sync.primary": primaryDevice, 285 - "sync.timestamp": time.Now().Format(time.RFC3339), 286 - }, 563 + // Consider sync successful if we synced to at least some targets (partial success) 564 + result.Success = successfulTargets > 0 || len(plan.TargetDevices) == 0 565 + 566 + if result.Success { 567 + logger.Info("Sync plan executed successfully", "secret", plan.SecretPath, 568 + "syncType", plan.SyncType, "targetCount", successfulTargets) 569 + } 570 + 571 + return result 572 + } 573 + 574 + // readSecretWithMetadata reads both secret data and metadata from a device 575 + func (sm *SyncManager) readSecretWithMetadata(ctx context.Context, deviceName, secretPath, namespace string, logger logr.Logger) (hsm.SecretData, *hsm.SecretMetadata, error) { 576 + grpcClient, err := sm.agentManager.CreateSingleGRPCClient(ctx, deviceName, namespace, logger) 577 + if err != nil { 578 + return nil, nil, fmt.Errorf("failed to create gRPC client: %w", err) 579 + } 580 + defer func() { 581 + if closeErr := grpcClient.Close(); closeErr != nil { 582 + logger.V(1).Info("Failed to close gRPC client", "error", closeErr) 287 583 } 584 + }() 288 585 289 - if err := grpcClient.WriteSecretWithMetadata(ctx, secretPath, data, metadata); err != nil { 290 - logger.Error(err, "Failed to sync to secondary device", "device", deviceName) 291 - } else { 292 - logger.Info("Successfully synced to secondary device", "device", deviceName) 586 + if !grpcClient.IsConnected() { 587 + return nil, nil, fmt.Errorf("device not connected") 588 + } 589 + 590 + // Read secret data 591 + data, err := grpcClient.ReadSecret(ctx, secretPath) 592 + if err != nil { 593 + return nil, nil, fmt.Errorf("failed to read secret: %w", err) 594 + } 595 + 596 + // Read metadata (may not exist) 597 + metadata, err := grpcClient.ReadMetadata(ctx, secretPath) 598 + if err != nil { 599 + logger.V(1).Info("No metadata found for secret", "secret", secretPath, "device", deviceName) 600 + metadata = nil // Not an error - metadata may not exist 601 + } 602 + 603 + return data, metadata, nil 604 + } 605 + 606 + // writeSecretWithMetadata writes both secret data and metadata to a device 607 + func (sm *SyncManager) writeSecretWithMetadata(ctx context.Context, deviceName, secretPath string, data hsm.SecretData, metadata *hsm.SecretMetadata, namespace string, logger logr.Logger) error { 608 + grpcClient, err := sm.agentManager.CreateSingleGRPCClient(ctx, deviceName, namespace, logger) 609 + if err != nil { 610 + return fmt.Errorf("failed to create gRPC client: %w", err) 611 + } 612 + defer func() { 613 + if closeErr := grpcClient.Close(); closeErr != nil { 614 + logger.V(1).Info("Failed to close gRPC client", "error", closeErr) 293 615 } 616 + }() 294 617 618 + if !grpcClient.IsConnected() { 619 + return fmt.Errorf("device not connected") 620 + } 621 + 622 + // Write secret with metadata 623 + if err := grpcClient.WriteSecretWithMetadata(ctx, secretPath, data, metadata); err != nil { 624 + return fmt.Errorf("failed to write secret with metadata: %w", err) 625 + } 626 + 627 + return nil 628 + } 629 + 630 + // writeMetadataOnly updates only the metadata for an existing secret 631 + func (sm *SyncManager) writeMetadataOnly(ctx context.Context, deviceName, secretPath string, metadata *hsm.SecretMetadata, namespace string, logger logr.Logger) error { 632 + grpcClient, err := sm.agentManager.CreateSingleGRPCClient(ctx, deviceName, namespace, logger) 633 + if err != nil { 634 + return fmt.Errorf("failed to create gRPC client: %w", err) 635 + } 636 + defer func() { 295 637 if closeErr := grpcClient.Close(); closeErr != nil { 296 638 logger.V(1).Info("Failed to close gRPC client", "error", closeErr) 297 639 } 640 + }() 641 + 642 + if !grpcClient.IsConnected() { 643 + return fmt.Errorf("device not connected") 298 644 } 645 + 646 + // Read existing secret data first 647 + existingData, err := grpcClient.ReadSecret(ctx, secretPath) 648 + if err != nil { 649 + return fmt.Errorf("failed to read existing secret data: %w", err) 650 + } 651 + 652 + // Write secret with updated metadata 653 + if err := grpcClient.WriteSecretWithMetadata(ctx, secretPath, existingData, metadata); err != nil { 654 + return fmt.Errorf("failed to write secret with metadata: %w", err) 655 + } 656 + 657 + return nil 299 658 } 300 659 301 660 // getAvailableDevices gets list of available HSM devices from HSMPools ··· 332 691 if result.Success { 333 692 hsmSecret.Status.SyncStatus = hsmv1alpha1.SyncStatusInSync 334 693 hsmSecret.Status.LastSyncTime = &now 335 - hsmSecret.Status.LastError = "" 694 + if len(result.Errors) == 0 { 695 + hsmSecret.Status.LastError = "" 696 + } else { 697 + // Partial success - some errors occurred 698 + hsmSecret.Status.LastError = fmt.Sprintf("Partial sync completed with %d errors", len(result.Errors)) 699 + } 336 700 } else { 337 701 hsmSecret.Status.SyncStatus = hsmv1alpha1.SyncStatusError 338 - hsmSecret.Status.LastError = "Failed to sync with any HSM device" 702 + if len(result.Errors) > 0 { 703 + hsmSecret.Status.LastError = result.Errors[0] // Show first error 704 + } else { 705 + hsmSecret.Status.LastError = "Failed to sync secret" 706 + } 339 707 } 340 708 341 - hsmSecret.Status.SyncConflict = result.ConflictDetected 342 - hsmSecret.Status.PrimaryDevice = result.PrimaryDevice 343 - 344 - // Update device-specific sync status 345 - hsmSecret.Status.DeviceSyncStatus = make([]hsmv1alpha1.HSMDeviceSync, 0, len(result.DeviceResults)) 346 - 347 - for deviceName, deviceResult := range result.DeviceResults { 348 - syncTime := metav1.NewTime(deviceResult.Timestamp) 349 - deviceSync := hsmv1alpha1.HSMDeviceSync{ 350 - DeviceName: deviceName, 351 - LastSyncTime: &syncTime, 352 - Checksum: deviceResult.Checksum, 353 - Online: deviceResult.Online, 354 - Version: deviceResult.Version, 709 + // Update secret-level sync information 710 + secretResult, hasSecretResult := result.SecretResults[hsmSecret.Name] 711 + if hasSecretResult { 712 + // Calculate checksum from the source device if sync was successful 713 + if secretResult.Success { 714 + // Read the current data to calculate checksum 715 + if devices, err := sm.getAvailableDevices(ctx, hsmSecret.Namespace); err == nil && len(devices) > 0 { 716 + if data, _, err := sm.readSecretWithMetadata(ctx, devices[0], hsmSecret.Name, hsmSecret.Namespace, sm.logger); err == nil { 717 + hsmSecret.Status.SecretChecksum = sm.calculateChecksum(data) 718 + } 719 + } 355 720 } 356 721 357 - if deviceResult.Error != nil { 358 - deviceSync.Status = hsmv1alpha1.SyncStatusError 359 - deviceSync.LastError = deviceResult.Error.Error() 360 - } else if deviceResult.Online { 361 - deviceSync.Status = hsmv1alpha1.SyncStatusInSync 362 - } else { 363 - deviceSync.Status = hsmv1alpha1.SyncStatusOutOfSync 722 + // Set primary device information if available 723 + if secretResult.SourceDevice != "" { 724 + hsmSecret.Status.PrimaryDevice = secretResult.SourceDevice 364 725 } 365 - 366 - hsmSecret.Status.DeviceSyncStatus = append(hsmSecret.Status.DeviceSyncStatus, deviceSync) 367 726 } 368 727 369 - // Update Kubernetes Secret checksum if we have resolved data 370 - if result.ResolvedData != nil { 371 - hsmSecret.Status.SecretChecksum = sm.calculateChecksum(result.ResolvedData) 728 + // Update device-specific sync status based on available devices 729 + devices, err := sm.getAvailableDevices(ctx, hsmSecret.Namespace) 730 + if err == nil { 731 + hsmSecret.Status.DeviceSyncStatus = make([]hsmv1alpha1.HSMDeviceSync, 0, len(devices)) 732 + 733 + for _, deviceName := range devices { 734 + deviceSync := hsmv1alpha1.HSMDeviceSync{ 735 + DeviceName: deviceName, 736 + LastSyncTime: &now, 737 + Checksum: "", 738 + Online: true, // Assume online if in available devices list 739 + Version: 0, 740 + } 741 + 742 + // Check if this device was involved in sync operations 743 + if hasSecretResult { 744 + if secretResult.SourceDevice == deviceName { 745 + deviceSync.Status = hsmv1alpha1.SyncStatusInSync 746 + deviceSync.Version = secretResult.SourceVersion 747 + } else { 748 + // Check if this device was a target 749 + isTarget := false 750 + for _, target := range secretResult.TargetDevices { 751 + if target == deviceName { 752 + isTarget = true 753 + break 754 + } 755 + } 756 + 757 + if isTarget { 758 + if secretResult.Success { 759 + deviceSync.Status = hsmv1alpha1.SyncStatusInSync 760 + deviceSync.Version = secretResult.SourceVersion 761 + } else { 762 + deviceSync.Status = hsmv1alpha1.SyncStatusError 763 + if secretResult.Error != nil { 764 + deviceSync.LastError = secretResult.Error.Error() 765 + } 766 + } 767 + } else { 768 + // Device wasn't involved in sync - assume in sync 769 + deviceSync.Status = hsmv1alpha1.SyncStatusInSync 770 + } 771 + } 772 + } else { 773 + // No secret result - assume in sync 774 + deviceSync.Status = hsmv1alpha1.SyncStatusInSync 775 + } 776 + 777 + hsmSecret.Status.DeviceSyncStatus = append(hsmSecret.Status.DeviceSyncStatus, deviceSync) 778 + } 372 779 } 373 780 374 781 return sm.client.Status().Update(ctx, hsmSecret)
+87 -261
internal/sync/manager_test.go
··· 19 19 import ( 20 20 "context" 21 21 "testing" 22 - "time" 23 22 24 23 "github.com/go-logr/logr" 25 24 "github.com/stretchr/testify/assert" 26 - "github.com/stretchr/testify/mock" 27 - metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" 28 25 "k8s.io/apimachinery/pkg/runtime" 29 26 "sigs.k8s.io/controller-runtime/pkg/client/fake" 30 27 ··· 32 29 "github.com/evanjarrett/hsm-secrets-operator/internal/hsm" 33 30 ) 34 31 35 - // MockGRPCClient implements hsm.Client for testing 36 - type MockGRPCClient struct { 37 - mock.Mock 38 - } 39 - 40 - func (m *MockGRPCClient) Initialize(ctx context.Context, config hsm.Config) error { 41 - args := m.Called(ctx, config) 42 - return args.Error(0) 43 - } 44 - 45 - func (m *MockGRPCClient) IsConnected() bool { 46 - args := m.Called() 47 - return args.Bool(0) 48 - } 49 - 50 - func (m *MockGRPCClient) ReadSecret(ctx context.Context, path string) (hsm.SecretData, error) { 51 - args := m.Called(ctx, path) 52 - return args.Get(0).(hsm.SecretData), args.Error(1) 53 - } 54 - 55 - func (m *MockGRPCClient) WriteSecret(ctx context.Context, path string, data hsm.SecretData) error { 56 - args := m.Called(ctx, path, data) 57 - return args.Error(0) 58 - } 59 - 60 - func (m *MockGRPCClient) WriteSecretWithMetadata(ctx context.Context, path string, data hsm.SecretData, metadata *hsm.SecretMetadata) error { 61 - args := m.Called(ctx, path, data, metadata) 62 - return args.Error(0) 63 - } 32 + // MockAgentManager is a mock implementation of AgentManagerInterface for testing 33 + type MockAgentManager struct{} 64 34 65 - func (m *MockGRPCClient) DeleteSecret(ctx context.Context, path string) error { 66 - args := m.Called(ctx, path) 67 - return args.Error(0) 35 + func (m *MockAgentManager) CreateSingleGRPCClient(ctx context.Context, deviceName, namespace string, logger logr.Logger) (hsm.Client, error) { 36 + // Return a mock client for testing 37 + return hsm.NewMockClient(), nil 68 38 } 69 39 70 - func (m *MockGRPCClient) ListSecrets(ctx context.Context, prefix string) ([]string, error) { 71 - args := m.Called(ctx, prefix) 72 - return args.Get(0).([]string), args.Error(1) 73 - } 74 - 75 - func (m *MockGRPCClient) GetInfo(ctx context.Context) (map[string]any, error) { 76 - args := m.Called(ctx) 77 - return args.Get(0).(map[string]any), args.Error(1) 78 - } 79 - 80 - func (m *MockGRPCClient) GetChecksum(ctx context.Context, path string) (string, error) { 81 - args := m.Called(ctx, path) 82 - return args.String(0), args.Error(1) 83 - } 84 - 85 - func (m *MockGRPCClient) ReadMetadata(ctx context.Context, path string) (*hsm.SecretMetadata, error) { 86 - args := m.Called(ctx, path) 87 - if args.Get(0) == nil { 88 - return nil, args.Error(1) 89 - } 90 - return args.Get(0).(*hsm.SecretMetadata), args.Error(1) 91 - } 92 - 93 - func (m *MockGRPCClient) Close() error { 94 - args := m.Called() 95 - return args.Error(0) 96 - } 97 - 98 - // MockAgentManager implements AgentManagerInterface for testing 99 - type MockAgentManager struct { 100 - mock.Mock 101 - } 102 - 103 - func (m *MockAgentManager) CreateSingleGRPCClient(ctx context.Context, deviceName string, logger logr.Logger) (hsm.Client, error) { 104 - args := m.Called(ctx, deviceName, logger) 105 - if args.Get(0) == nil { 106 - return nil, args.Error(1) 107 - } 108 - return args.Get(0).(hsm.Client), args.Error(1) 109 - } 110 - 111 - func TestSyncManager_CalculateChecksum(t *testing.T) { 40 + func TestNewSyncManager(t *testing.T) { 112 41 scheme := runtime.NewScheme() 113 42 _ = hsmv1alpha1.AddToScheme(scheme) 114 43 ··· 117 46 118 47 syncManager := NewSyncManager(client, mockAgentManager, logr.Discard()) 119 48 120 - // Test with nil data 121 - checksum := syncManager.calculateChecksum(nil) 122 - assert.Equal(t, "", checksum) 123 - 124 - // Test with empty data 125 - checksum = syncManager.calculateChecksum(hsm.SecretData{}) 126 - assert.NotEqual(t, "", checksum) 127 - 128 - // Test with actual data 129 - data1 := hsm.SecretData{ 130 - "key1": []byte("value1"), 131 - "key2": []byte("value2"), 132 - } 133 - checksum1 := syncManager.calculateChecksum(data1) 134 - assert.NotEqual(t, "", checksum1) 135 - 136 - // Same data should produce same checksum 137 - data2 := hsm.SecretData{ 138 - "key1": []byte("value1"), 139 - "key2": []byte("value2"), 140 - } 141 - checksum2 := syncManager.calculateChecksum(data2) 142 - assert.Equal(t, checksum1, checksum2) 143 - 144 - // Different data should produce different checksum 145 - data3 := hsm.SecretData{ 146 - "key1": []byte("different"), 147 - "key2": []byte("value2"), 148 - } 149 - checksum3 := syncManager.calculateChecksum(data3) 150 - assert.NotEqual(t, checksum1, checksum3) 151 - 152 - // Key order shouldn't matter 153 - data4 := hsm.SecretData{ 154 - "key2": []byte("value2"), 155 - "key1": []byte("value1"), 156 - } 157 - checksum4 := syncManager.calculateChecksum(data4) 158 - assert.Equal(t, checksum1, checksum4) 49 + assert.NotNil(t, syncManager) 50 + assert.NotNil(t, syncManager.client) 51 + assert.NotNil(t, syncManager.agentManager) 52 + assert.NotNil(t, syncManager.logger) 159 53 } 160 54 161 - func TestSyncManager_UpdateHSMSecretStatus(t *testing.T) { 162 - scheme := runtime.NewScheme() 163 - _ = hsmv1alpha1.AddToScheme(scheme) 164 - 165 - hsmSecret := &hsmv1alpha1.HSMSecret{ 166 - ObjectMeta: metav1.ObjectMeta{ 167 - Name: "test-secret", 168 - Namespace: "default", 169 - }, 170 - Spec: hsmv1alpha1.HSMSecretSpec{ 171 - AutoSync: true, 172 - }, 173 - } 174 - 175 - client := fake.NewClientBuilder().WithScheme(scheme).WithObjects(hsmSecret).WithStatusSubresource(&hsmv1alpha1.HSMSecret{}).Build() 176 - mockAgentManager := &MockAgentManager{} 177 - 178 - syncManager := NewSyncManager(client, mockAgentManager, logr.Discard()) 179 - 180 - // Test successful sync result 55 + func TestSyncResult_Structure(t *testing.T) { 56 + // Test the new SyncResult structure 181 57 result := &SyncResult{ 182 58 Success: true, 183 - ConflictDetected: false, 184 - PrimaryDevice: "device1", 185 - DeviceResults: map[string]DeviceResult{ 186 - "device1": { 187 - Online: true, 188 - Checksum: "abc123", 189 - Version: 1, 190 - Error: nil, 191 - Timestamp: time.Now(), 59 + SecretsProcessed: 3, 60 + SecretsUpdated: 1, 61 + SecretsCreated: 1, 62 + MetadataRestored: 1, 63 + SecretResults: map[string]SecretSyncResult{ 64 + "secret1": { 65 + SecretPath: "secret1", 66 + SourceDevice: "device1", 67 + SourceVersion: 123, 68 + TargetDevices: []string{"device2"}, 69 + SyncType: SyncTypeUpdate, 70 + Success: true, 71 + Error: nil, 192 72 }, 193 - "device2": { 194 - Online: true, 195 - Checksum: "abc123", 196 - Version: 1, 197 - Error: nil, 198 - Timestamp: time.Now(), 73 + "secret2": { 74 + SecretPath: "secret2", 75 + SourceDevice: "device2", 76 + SourceVersion: 456, 77 + TargetDevices: []string{"device1"}, 78 + SyncType: SyncTypeCreate, 79 + Success: true, 80 + Error: nil, 199 81 }, 200 82 }, 201 - ResolvedData: hsm.SecretData{ 202 - "key": []byte("value"), 203 - }, 83 + Errors: []string{}, 204 84 } 205 85 206 - ctx := context.Background() 207 - err := syncManager.UpdateHSMSecretStatus(ctx, hsmSecret, result) 208 - assert.NoError(t, err) 86 + assert.True(t, result.Success) 87 + assert.Equal(t, 3, result.SecretsProcessed) 88 + assert.Equal(t, 1, result.SecretsUpdated) 89 + assert.Equal(t, 1, result.SecretsCreated) 90 + assert.Equal(t, 1, result.MetadataRestored) 91 + assert.Equal(t, 2, len(result.SecretResults)) 92 + assert.Equal(t, 0, len(result.Errors)) 209 93 210 - // Verify status was updated 211 - assert.Equal(t, hsmv1alpha1.SyncStatusInSync, hsmSecret.Status.SyncStatus) 212 - assert.Equal(t, "device1", hsmSecret.Status.PrimaryDevice) 213 - assert.False(t, hsmSecret.Status.SyncConflict) 214 - assert.Equal(t, "", hsmSecret.Status.LastError) 215 - assert.Len(t, hsmSecret.Status.DeviceSyncStatus, 2) 94 + // Check individual secret results 95 + secret1Result := result.SecretResults["secret1"] 96 + assert.Equal(t, "secret1", secret1Result.SecretPath) 97 + assert.Equal(t, "device1", secret1Result.SourceDevice) 98 + assert.Equal(t, int64(123), secret1Result.SourceVersion) 99 + assert.Equal(t, SyncTypeUpdate, secret1Result.SyncType) 100 + assert.True(t, secret1Result.Success) 101 + } 216 102 217 - // Check device sync status 218 - for _, deviceSync := range hsmSecret.Status.DeviceSyncStatus { 219 - assert.True(t, deviceSync.Online) 220 - assert.Equal(t, "abc123", deviceSync.Checksum) 221 - assert.Equal(t, int64(1), deviceSync.Version) 222 - assert.Equal(t, hsmv1alpha1.SyncStatusInSync, deviceSync.Status) 223 - assert.Empty(t, deviceSync.LastError) 224 - } 103 + func TestSyncTypes(t *testing.T) { 104 + // Test that SyncType constants are correctly defined 105 + assert.Equal(t, SyncType(0), SyncTypeSkip) 106 + assert.Equal(t, SyncType(1), SyncTypeUpdate) 107 + assert.Equal(t, SyncType(2), SyncTypeCreate) 108 + assert.Equal(t, SyncType(3), SyncTypeRestoreMetadata) 225 109 } 226 110 227 - func TestSyncManager_DetectConflicts(t *testing.T) { 228 - scheme := runtime.NewScheme() 229 - _ = hsmv1alpha1.AddToScheme(scheme) 230 - 231 - client := fake.NewClientBuilder().WithScheme(scheme).Build() 232 - mockAgentManager := &MockAgentManager{} 233 - 234 - syncManager := NewSyncManager(client, mockAgentManager, logr.Discard()) 235 - 236 - // Test with no conflicts (same checksums) 237 - deviceResults := map[string]DeviceResult{ 238 - "device1": { 239 - Online: true, 240 - Checksum: "abc123", 241 - Version: 1, 242 - Error: nil, 243 - }, 244 - "device2": { 245 - Online: true, 246 - Checksum: "abc123", // Same checksum 247 - Version: 1, 248 - Error: nil, 249 - }, 250 - } 251 - 252 - conflict := syncManager.detectConflicts(deviceResults) 253 - assert.False(t, conflict) 254 - 255 - // Test with conflicts (different checksums) 256 - deviceResults = map[string]DeviceResult{ 257 - "device1": { 258 - Online: true, 259 - Checksum: "abc123", 260 - Version: 1, 261 - Error: nil, 262 - }, 263 - "device2": { 264 - Online: true, 265 - Checksum: "def456", // Different checksum 266 - Version: 2, 267 - Error: nil, 268 - }, 111 + func TestSecretSyncResult_Structure(t *testing.T) { 112 + // Test that SecretSyncResult has the expected fields 113 + result := SecretSyncResult{ 114 + SecretPath: "test-secret", 115 + SourceDevice: "device1", 116 + SourceVersion: 123, 117 + TargetDevices: []string{"device2", "device3"}, 118 + SyncType: SyncTypeCreate, 119 + Success: true, 120 + Error: nil, 269 121 } 270 122 271 - conflict = syncManager.detectConflicts(deviceResults) 272 - assert.True(t, conflict) 123 + assert.Equal(t, "test-secret", result.SecretPath) 124 + assert.Equal(t, "device1", result.SourceDevice) 125 + assert.Equal(t, int64(123), result.SourceVersion) 126 + assert.Equal(t, []string{"device2", "device3"}, result.TargetDevices) 127 + assert.Equal(t, SyncTypeCreate, result.SyncType) 128 + assert.True(t, result.Success) 129 + assert.Nil(t, result.Error) 273 130 } 274 131 275 - func TestSyncManager_SelectPrimaryDevice(t *testing.T) { 276 - scheme := runtime.NewScheme() 277 - _ = hsmv1alpha1.AddToScheme(scheme) 278 - 279 - client := fake.NewClientBuilder().WithScheme(scheme).Build() 280 - mockAgentManager := &MockAgentManager{} 281 - 282 - syncManager := NewSyncManager(client, mockAgentManager, logr.Discard()) 132 + func TestRemoveDuplicates(t *testing.T) { 133 + // Test the removeDuplicates utility function 134 + input := []string{"device1", "device2", "device1", "device3", "device2"} 135 + expected := []string{"device1", "device2", "device3"} 136 + result := removeDuplicates(input) 283 137 284 - // Test with existing primary device 285 - hsmSecret := &hsmv1alpha1.HSMSecret{ 286 - Status: hsmv1alpha1.HSMSecretStatus{ 287 - PrimaryDevice: "device1", 288 - }, 138 + assert.Equal(t, len(expected), len(result)) 139 + for _, item := range expected { 140 + assert.Contains(t, result, item) 289 141 } 142 + } 290 143 291 - deviceResults := map[string]DeviceResult{ 292 - "device1": { 293 - Online: true, 294 - Checksum: "abc123", 295 - Version: 1, 296 - Error: nil, 297 - }, 298 - "device2": { 299 - Online: true, 300 - Checksum: "def456", 301 - Version: 2, 302 - Error: nil, 303 - }, 304 - } 144 + func TestRemoveDevice(t *testing.T) { 145 + // Test the removeDevice utility function 146 + input := []string{"device1", "device2", "device3"} 147 + result := removeDevice(input, "device2") 148 + expected := []string{"device1", "device3"} 305 149 306 - primary := syncManager.selectPrimaryDevice(deviceResults, hsmSecret) 307 - assert.Equal(t, "device1", primary) 308 - 309 - // Test with no existing primary - should choose highest version 310 - hsmSecret.Status.PrimaryDevice = "" 311 - primary = syncManager.selectPrimaryDevice(deviceResults, hsmSecret) 312 - assert.Equal(t, "device2", primary) // device2 has version 2 vs device1's version 1 313 - 314 - // Test with primary device offline - should fallback to highest version 315 - hsmSecret.Status.PrimaryDevice = "device1" 316 - deviceResults["device1"] = DeviceResult{ 317 - Online: false, // Offline 318 - Checksum: "abc123", 319 - Version: 1, 320 - Error: nil, 321 - } 322 - 323 - primary = syncManager.selectPrimaryDevice(deviceResults, hsmSecret) 324 - assert.Equal(t, "device2", primary) 150 + assert.Equal(t, expected, result) 325 151 }