The Windows Machine Config Operator (WMCO) configures Windows Server instances as worker nodes in OpenShift/OKD clusters, enabling Windows container workloads.
Architecture:
┌─────────────────────────────────────────────────────────────┐
│ OpenShift Cluster                                           │
├────────────────────────┬────────────────────────────────────┤
│ Linux Control Plane    │ Windows Worker Nodes               │
│ ┌──────────────────┐   │  ┌──────────────────────────────┐  │
│ │ WMCO Operator    │───┼──│ WICD (daemon)                │  │
│ │ - Controllers    │SSH│  │ - Service reconciliation     │  │
│ │ - CSR Approval   │   │  │ - Certificate rotation       │  │
│ └──────────────────┘   │  │ - Environment management     │  │
│                        │  └──────────────────────────────┘  │
│                        │  ┌──────────────────────────────┐  │
│                        │  │ Windows Components           │  │
│                        │  │ kubelet, containerd,         │  │
│                        │  │ kube-proxy, hybrid-overlay   │  │
│                        │  └──────────────────────────────┘  │
└────────────────────────┴────────────────────────────────────┘
Provisioning Modes:
- Machine API: Create Windows VMs via MachineSets, auto-configured by WMCO
- BYOH: Define existing instances in the `windows-instances` ConfigMap
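As a rough illustration of the BYOH mode, the sketch below parses ConfigMap-style data into instance records. It assumes each entry maps an instance address to a value of the form `username=<name>`; that format, the `Instance` type, and `parseInstances` are assumptions made for this example and are not the operator's own parser (which lives in `pkg/wiparser/`).

```go
package main

import (
	"fmt"
	"strings"
)

// Instance is a hypothetical, simplified view of one BYOH entry.
type Instance struct {
	Address  string
	Username string
}

// parseInstances converts ConfigMap-style data into instances.
// Assumed value format (not confirmed by this document): "username=<name>".
func parseInstances(data map[string]string) ([]Instance, error) {
	instances := make([]Instance, 0, len(data))
	for address, value := range data {
		key, username, found := strings.Cut(strings.TrimSpace(value), "=")
		if !found || key != "username" || username == "" {
			return nil, fmt.Errorf("invalid entry for %q: %q", address, value)
		}
		instances = append(instances, Instance{Address: address, Username: username})
	}
	return instances, nil
}

func main() {
	instances, err := parseInstances(map[string]string{
		"10.1.42.1": "username=Administrator",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(instances[0].Address, instances[0].Username)
}
```

Rejecting malformed entries up front keeps a typo in the ConfigMap from turning into a failed SSH attempt later.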
Tech Stack: Go 1.24+, Kubernetes Operator (controller-runtime), OpenShift APIs, Windows Server 2019/2022
Red Hat OpenShift support for Windows Containers is a layered component of OpenShift that allows the integration of Windows Nodes for running Windows Containers on an OpenShift 4 Cluster.
This is achieved by installing the Windows Machine Config Operator (WMCO), which runs on Linux-based control-plane nodes. The WMCO bootstraps Windows instances so that they join the cluster as Windows worker nodes.
Windows Containers requires a specific subscription, in addition to an OpenShift Container Platform subscription:
- Worker nodes only - Control plane and infrastructure nodes don't need a paid subscription for Windows Containers
- When working cases, verify entitlement mapping to ensure compliance
Look for the "for Windows" phrase in the entitlement name.
Example Entitlements:
- MW01465: Red Hat OpenShift Container Platform, Standard (2 Cores or 4 vCPUs, for Windows)
- MW01615: Red Hat OpenShift Container Platform, Premium (2 Cores or 4 vCPUs, for Windows)
| Project | Purpose | Link |
|---|---|---|
| OpenShift Bugs (OCPBUGS) | Bug reports | Red Hat Issue Router → "Windows Containers" component |
| Windows Containers (WINC) | Engineering tracking | issues.redhat.com/projects/WINC |
| RFE | Feature requests | issues.redhat.com/project/RFE → "Windows Containers" component |
| Portfolio Backlog | Roadmap/planning | Portfolio Plan View |
Search for the "Windows Containers" component on the Red Hat Issue Router, then create a bug with:
- Detailed explanation of the issue, including:
  - Troubleshooting steps already taken
  - Any recent changes to the cluster
  - Relevant contextual information
- Link to customer case (if one exists)
- Must-gather archive recently generated on the cluster
- Command outputs (not present in older must-gather archives):
  - `oc get network.operator cluster -o yaml`
  - `oc logs -f deployment/windows-machine-config-operator -n openshift-windows-machine-config-operator`
- MachineSet object describing Windows instances (if using Machine API/IPI)
- `windows-instances` ConfigMap (if using BYOH method)
Use this template when filing bugs in OpenShift Bugs (OCPBUGS) with Component: Windows Containers:
Description of problem:
{code:none}
{code}
Version-Release number of selected component (if applicable):
{code:none}
{code}
How reproducible:
{code:none}
{code}
Steps to Reproduce:
{code:none}
1.
2.
3.
{code}
Actual results:
{code:none}
{code}
Expected results:
{code:none}
{code}
Additional info:
{code:none}
{code}
Slack is a supplement to bug reports in Jira. Only reach out if:
- A bug is already filed
- The issue hasn't received a prompt response
Channel: #forum-ocp-winc (CoreOS Slack)
Tag: Use only @winc-watcher to get the team's attention
Do NOT use @here or @everyone
- `make build` - Build operator binary
- `GOOS=windows make build-daemon` - Build Windows daemon (WICD)
- `make build-all` - Build everything
- `make unit` - Run all unit tests
- `go test -v ./pkg/nodeconfig/...` - Test specific package
- `make lint` - Run linter (golangci-lint)
- `make verify` - All checks (lint, vet, unit, build)
- `make generate` - Generate RBAC manifests, mocks
- `make vendor` - Update vendored dependencies
- `make manifests` - Generate CRD/RBAC YAML
- `//go:build windows` - Windows-only code (daemon, services)
- `//go:build !windows` - Linux-only code (operator)
- Cannot cross-compile without the correct GOOS; always check tags
| Controller | File | Watches |
|---|---|---|
| ConfigMap | `configmap_controller.go` | `windows-instances` ConfigMap |
| Machine | `machine_controller.go` | Machine objects with Windows OS label |
| Node | `node_controller.go` | Windows Node objects |
| Secret | `secret_controller.go` | `cloud-private-key` Secret |
| Package | Purpose |
|---|---|
| `controller/` | WICD main reconciliation loop |
| `manager/` | Windows Service Control Manager interface |
| `certs/` | Certificate import/management |
| `cleanup/` | Node deconfiguration logic |
| `envvar/` | Environment variable management |
| `fake/` | Test mocks (platform-independent) |
| Package | Purpose | Platform |
|---|---|---|
| `pkg/nodeconfig/` | Orchestrates node configuration via SSH | Linux |
| `pkg/windows/` | SSH connectivity, SFTP, remote commands | Linux |
| `pkg/csr/` | CSR validation and approval | Linux |
| `pkg/services/` | Windows service definitions | Linux |
| `pkg/servicescm/` | Services ConfigMap schema and parsing | Any |
| `pkg/cluster/` | Cluster network config (OVN) | Linux |
| `pkg/wiparser/` | Parse `windows-instances` ConfigMap | Linux |
All remote operations on Windows instances.
type Windows interface {
GetIPv4Address() string
GetHostname() (string, error)
Run(cmd string, psCmd bool) (string, error)
EnsureFile(*payload.CompressedFileInfo, string) error
EnsureFileContent([]byte, string, string) error
Bootstrap(ctx context.Context, version, namespace, kubeconfig string) error
ConfigureWICD(namespace, kubeconfig string) error
RebootAndReinitialize(context.Context) error
RunWICDCleanup(namespace, kubeconfig string) error
}
Windows service management (Windows-only build tag).
type Manager interface {
CreateService(name, exepath string, config mgr.Config, args ...string) (Service, error)
GetServices() (map[string]struct{}, error)
OpenService(name string) (Service, error)
DeleteService(name string) error
EnsureServiceState(Service, svc.State) error
Disconnect() error
}
Service definition in ConfigMap.
type Service struct {
Name string `json:"name"`
Command string `json:"path"`
NodeVariablesInCommand []NodeCmdArg `json:"nodeVariablesInCommand,omitempty"`
PowershellPreScripts []PowershellPreScript `json:"powershellPreScripts,omitempty"`
Dependencies []string `json:"dependencies,omitempty"`
Bootstrap bool `json:"bootstrap"`
Priority uint `json:"priority"`
}
All controllers implement reconcile.Reconciler:
func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error)
- Define the service in `pkg/services/services.go`:
func myServiceConfiguration() servicescm.Service {
return servicescm.Service{
Name: "my-service",
Command: "C:\\k\\my-service.exe --flag=value",
Dependencies: []string{"containerd"},
Bootstrap: false, // true = runs during node join
Priority: 2, // 0 = first, higher = later
}
}
- Add to the `GenerateManifest()` return slice
- Add unit tests in `pkg/services/services_test.go`
- Service auto-reconciled by WICD on Windows nodes
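To make the `Priority` and `Dependencies` fields more concrete, here is a minimal sketch of how a start order could be derived and sanity-checked. The `service` type is a simplified stand-in for `servicescm.Service`, and `startOrder` and `checkDependencies` are hypothetical helpers written for this example; the rule that a dependency must not have a higher priority number than its dependent is one reading of the ordering requirement, not the operator's confirmed validation.

```go
package main

import (
	"fmt"
	"sort"
)

// service is a simplified, hypothetical stand-in for servicescm.Service.
type service struct {
	Name         string
	Dependencies []string
	Priority     uint
}

// startOrder returns service names sorted by ascending priority (0 starts first).
func startOrder(services []service) []string {
	sorted := append([]service(nil), services...) // copy so the input is not reordered
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority < sorted[j].Priority
	})
	names := make([]string, 0, len(sorted))
	for _, s := range sorted {
		names = append(names, s.Name)
	}
	return names
}

// checkDependencies reports an error if a dependency is unknown or would
// start later than the service that depends on it (assumed rule).
func checkDependencies(services []service) error {
	priorities := make(map[string]uint, len(services))
	for _, s := range services {
		priorities[s.Name] = s.Priority
	}
	for _, s := range services {
		for _, dep := range s.Dependencies {
			p, ok := priorities[dep]
			if !ok {
				return fmt.Errorf("service %q depends on unknown service %q", s.Name, dep)
			}
			if p > s.Priority {
				return fmt.Errorf("service %q (priority %d) depends on %q (priority %d)",
					s.Name, s.Priority, dep, p)
			}
		}
	}
	return nil
}

func main() {
	services := []service{
		{Name: "my-service", Dependencies: []string{"containerd"}, Priority: 2},
		{Name: "containerd", Priority: 0},
	}
	fmt.Println(startOrder(services), checkDependencies(services))
}
```

A check like this could sit in a unit test next to the service definitions so that a mis-ordered priority is caught before it reaches a node.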
MachineSet Created
↓
Machine Created → WMCO detects (Windows OS label)
↓
SSH to Windows instance
↓
Transfer files (kubelet, containerd, CNI, etc.)
↓
Bootstrap WICD with kubeconfig
↓
WICD starts services, generates kubelet CSR
↓
WMCO approves CSR → Node joins cluster
↓
WICD reconciles services continuously
IMPORTANT: When making code changes, prioritize files that are already modified (dirty) in the git working tree before changing any clean files.
- First check `git status` to identify modified files
- When implementing changes, prefer modifying already-dirty files over clean ones
- Only modify clean files when absolutely necessary for the task
- This helps keep changesets focused and easier to review
# Node annotations
windowsmachineconfig.openshift.io/version: "10.0.0" # Current
windowsmachineconfig.openshift.io/desired-version: "10.1.0" # Target
- Mismatch triggers: drain → reconfigure → uncordon
- Only one node upgraded at a time (sequential)
- Reboot annotation triggers instance restart
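The version comparison described above can be sketched as follows. The annotation keys come from this document; `needsUpgrade` and `nextToUpgrade` are hypothetical helpers, and treating a missing `desired-version` as "nothing to do" is an assumption made for the example rather than documented operator behavior.

```go
package main

import "fmt"

const (
	versionAnnotation        = "windowsmachineconfig.openshift.io/version"
	desiredVersionAnnotation = "windowsmachineconfig.openshift.io/desired-version"
)

// needsUpgrade reports whether the current and desired versions differ.
// Assumption: a node without a desired-version annotation needs no upgrade.
func needsUpgrade(annotations map[string]string) bool {
	desired, ok := annotations[desiredVersionAnnotation]
	if !ok || desired == "" {
		return false
	}
	return annotations[versionAnnotation] != desired
}

// nextToUpgrade returns at most one node name, modeling the
// one-node-at-a-time rule. order fixes the iteration sequence.
func nextToUpgrade(order []string, nodes map[string]map[string]string) (string, bool) {
	for _, name := range order {
		if needsUpgrade(nodes[name]) {
			return name, true
		}
	}
	return "", false
}

func main() {
	nodes := map[string]map[string]string{
		"win-a": {versionAnnotation: "10.1.0", desiredVersionAnnotation: "10.1.0"},
		"win-b": {versionAnnotation: "10.0.0", desiredVersionAnnotation: "10.1.0"},
	}
	name, ok := nextToUpgrade([]string{"win-a", "win-b"}, nodes)
	fmt.Println(name, ok)
}
```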
- Kubelet generates CSR during bootstrap
- WMCO CSR controller validates:
  - Node name matches ConfigMap/Machine
  - Certificate type (client vs serving)
  - Key usages correct
- Approved → certificate issued to kubelet
- WICD manages rotation via certificate manager
- Trust bundle updates require node reboot
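The key-usage part of CSR validation could look roughly like the sketch below. The usage strings follow the upstream Kubernetes `certificates.k8s.io` naming, but the required and optional sets chosen here are assumptions modeled on the upstream kubelet signers, and `validateUsages` is a hypothetical helper; the rules WMCO actually applies are in `pkg/csr/`.

```go
package main

import "fmt"

type certType string

const (
	clientCert  certType = "client"
	servingCert certType = "serving"
)

// requiredUsages is an assumed rule set, not copied from WMCO.
var requiredUsages = map[certType][]string{
	clientCert:  {"digital signature", "client auth"},
	servingCert: {"digital signature", "server auth"},
}

// validateUsages checks that every required usage is present and that
// nothing outside the required set plus "key encipherment" is requested.
func validateUsages(t certType, usages []string) error {
	required, ok := requiredUsages[t]
	if !ok {
		return fmt.Errorf("unknown certificate type %q", t)
	}
	seen := make(map[string]bool, len(usages))
	for _, u := range usages {
		seen[u] = true
	}
	allowed := map[string]bool{"key encipherment": true} // optional usage (assumption)
	for _, r := range required {
		if !seen[r] {
			return fmt.Errorf("missing required usage %q", r)
		}
		allowed[r] = true
	}
	for u := range seen {
		if !allowed[u] {
			return fmt.Errorf("unexpected usage %q", u)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateUsages(clientCert, []string{"digital signature", "client auth"}))
	fmt.Println(validateUsages(servingCert, []string{"digital signature", "client auth"}))
}
```

Keeping the rules in a table per certificate type makes it easy to see at a glance why a client CSR that asks for `server auth` would be rejected.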
- Test files: `*_test.go` alongside source
- Table-driven tests preferred
- Use `testify/assert` and `testify/require`
pkg/daemon/fake/ provides test doubles:
// Create fake service manager with existing services
existingSvcs := map[string]*fake.FakeService{
"kubelet": fake.NewFakeService("kubelet", mgr.Config{}, svc.Status{State: svc.Running}),
}
fakeMgr := fake.NewTestMgr(existingSvcs)
// Use in tests
err := removeServices(fakeMgr, configMapServices, false)
require.NoError(t, err)
# All unit tests
make unit
# Specific package
go test -v ./pkg/nodeconfig/...
# Specific test
go test -v ./pkg/csr/... -run TestApprove
# Race detector
go test -race ./pkg/...
# Coverage
go test -cover ./pkg/... -coverprofile=coverage.out
go tool cover -html=coverage.out
- E2E tests: located in `test/e2e/`
- Require running cluster with Windows nodes
- See `docs/HACKING.md` for setup

- `gofmt` for formatting
- `make lint` before committing (golangci-lint)
Logging - Use logr.Logger:
log.Info("configuring node", "name", nodeName, "address", addr)
log.Error(err, "failed to configure node", "name", nodeName)
log.V(1).Info("debug message") // Verbose
Context - Always first parameter:
func (nc *NodeConfig) Configure(ctx context.Context) error
Errors - Wrap with context:
return fmt.Errorf("failed to configure node %s: %w", nodeName, err)
Kubernetes Client - Prefer client.Client:
// Good
func NewController(c client.Client) *Controller
// Avoid (unless needed for specific APIs)
func NewController(clientset *kubernetes.Clientset) *Controller
RBAC - Kubebuilder markers:
//+kubebuilder:rbac:groups=core,resources=nodes,verbs=get;list;watch;patch;update
//+kubebuilder:rbac:groups=certificates.k8s.io,resources=certificatesigningrequests,verbs=get;list;watch
func (r *NodeReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error)
Private Keys
- Stored in `cloud-private-key` Secret
- Never log key content
- Used for SSH authentication only
CSR Approval
- Validates node identity before approval
- Checks against windows-instances ConfigMap or Machine
- Validates certificate type and key usages
- See `pkg/csr/validation/` for rules
Credentials in Annotations
- Username encrypted with PGP (`pkg/crypto/`)
- Public key hash stored for verification
- Never store plaintext secrets
Certificates
- Auto-rotate before expiry (80% lifetime)
- Trust bundle changes require reboot
- WICD certificate separate from kubelet
- Reboots required for environment variables and certificates
- Service dependencies must be in Priority order
- File paths: backslash in PowerShell (`C:\k\`), forward in Go paths
- SSH sessions must be closed or connections leak
- PowerShell default shell: commands auto-wrapped if needed
- `version` vs `desired-version` are different annotations with different purposes
- ConfigMap changes trigger immediate reconciliation
- Only one node upgraded at a time (prevents cluster disruption)
- Node drain before reconfiguration (workloads rescheduled)
- Cannot build Windows daemon on Linux without `GOOS=windows`
- Run `make vendor` after any go.mod changes
- Run `make generate` after kubebuilder marker changes
- Vendored cloud providers in separate directories
- vSphere: Machine name max 15 chars, MachineSet name max 9
- AWS: Requires EC2LaunchV2 v2.0.1643+ for disconnected
- Azure: cloud-node-manager service required
- GCP: Custom hostname script in `pkg/internal/`
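The vSphere naming limits lend themselves to a quick pre-flight check. The sketch below encodes the two limits stated above (15 characters for a Machine name, 9 for a MachineSet name); the helper names are hypothetical, and counting bytes with `len` assumes ASCII names.

```go
package main

import "fmt"

const (
	maxVSphereMachineNameLen    = 15
	maxVSphereMachineSetNameLen = 9
)

// checkNameLength returns an error when name exceeds limit characters.
func checkNameLength(kind, name string, limit int) error {
	if len(name) > limit {
		return fmt.Errorf("%s name %q has %d characters, max is %d", kind, name, len(name), limit)
	}
	return nil
}

// validateVSphereNames applies both limits; the first violation is returned.
func validateVSphereNames(machineSetName, machineName string) error {
	if err := checkNameLength("MachineSet", machineSetName, maxVSphereMachineSetNameLen); err != nil {
		return err
	}
	return checkNameLength("Machine", machineName, maxVSphereMachineNameLen)
}

func main() {
	fmt.Println(validateVSphereNames("winworker", "winworker-abcde"))
	fmt.Println(validateVSphereNames("windows-workers", "windows-workers-abcde"))
}
```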
make build # Operator binary
GOOS=windows make build-daemon # Windows daemon
make unit # Unit tests
make lint # Linting
make verify # All checks
make vendor # Update deps
make generate # RBAC, code gen
# Generate MachineSet for platform
./hack/machineset.sh
./hack/machineset.sh apply
# Watch Windows nodes
oc get nodes -l kubernetes.io/os=windows -w
# Operator logs
oc logs -n openshift-windows-machine-config-operator \
deployment/windows-machine-config-operator -f
# Check services ConfigMap
oc get cm -n openshift-windows-machine-config-operator \
-l windowsmachineconfig.openshift.io/services
# CSR debugging
oc get csr | grep system:node
oc describe csr <name>
oc adm certificate approve <name> # Manual approval if needed
# WICD status and logs
Get-Service windows-instance-config-daemon
Get-Content C:\k\logs\wicd.log -Tail 50 -Wait
# Kubelet status and logs
Get-Service kubelet
Get-Content C:\k\logs\kubelet.log -Tail 50
# All WMCO-managed services
Get-Service | Where-Object {$_.Description -like "*OpenShift Managed*"}
# Container runtime
Get-Service containerd
ctr -n k8s.io containers list
# Network
Get-HnsNetwork
Get-HnsEndpoint
| Path | Purpose |
|---|---|
| `C:\k\` | Main directory (binaries, configs) |
| `C:\k\logs\` | Service logs (kubelet, wicd, containerd) |
| `C:\k\cni\` | CNI plugins and configs |
| `C:\k\cni\config\` | CNI configuration files |
| `C:\k\containerd\` | Containerd config and state |
| `C:\k\kubeconfig` | Kubelet kubeconfig |
| `C:\k\wicd-kubeconfig` | WICD service account token |
| `C:\k\kubelet.conf` | Kubelet configuration |
| `C:\k\ca.crt` | Cluster CA certificate |
| `C:\var\lib\kubelet\` | Kubelet working directory |
| `C:\var\lib\kubelet\pki\` | Kubelet certificates |
Unsupported Features:
- DeploymentConfigs (use Deployments instead)
- Vertical Pod Autoscaling for Windows workloads
- OpenShift Builds, Pipelines, Service Mesh
- Trunk port networking (access port only)
Requires Manual Setup:
- Windows CSI drivers (not deployed by WMCO)
- Custom storage classes for Windows
Network:
- OVN-Kubernetes hybrid networking only
- OpenShiftSDN not supported
See CONTRIBUTION.md for full details.
[subsystem] <what changed>
<BLANK LINE>
<why this change was made>
<BLANK LINE>
<Footer>
Example:
[nodeconfig] Add custom DNS configuration support
The node configuration did not support custom DNS settings for Windows
nodes. This adds the ability to specify DNS servers during bootstrap.
Follow-up to Id5e7cbb1.
- Subject: max 50 characters
- Body: max 80 characters per line
- Subsystem examples: `docs`, `nodeconfig`, `csr`, `daemon/controller`, `services`
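The commit message rules above are mechanical enough to check automatically. The following is a minimal sketch of such a check, not a tool that exists in the repository: `checkCommitMessage` and the subject pattern are hypothetical, and the pattern assumes a subsystem tag without spaces.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

const (
	maxSubjectLen  = 50
	maxBodyLineLen = 80
)

// subjectPattern matches "[subsystem] <what changed>" (assumed shape).
var subjectPattern = regexp.MustCompile(`^\[[^\]\s]+\] \S.*$`)

// checkCommitMessage returns a list of rule violations; empty means OK.
func checkCommitMessage(msg string) []string {
	var problems []string
	lines := strings.Split(strings.TrimRight(msg, "\n"), "\n")
	subject := lines[0]
	if !subjectPattern.MatchString(subject) {
		problems = append(problems, `subject must look like "[subsystem] <what changed>"`)
	}
	if n := len([]rune(subject)); n > maxSubjectLen {
		problems = append(problems, fmt.Sprintf("subject has %d characters, max is %d", n, maxSubjectLen))
	}
	if len(lines) > 1 && lines[1] != "" {
		problems = append(problems, "subject must be followed by a blank line")
	}
	for i, line := range lines[1:] {
		if n := len([]rune(line)); n > maxBodyLineLen {
			// i is an offset into lines[1:], so the 1-based line number is i+2.
			problems = append(problems, fmt.Sprintf("line %d has %d characters, max is %d", i+2, n, maxBodyLineLen))
		}
	}
	return problems
}

func main() {
	msg := "[nodeconfig] Add custom DNS configuration support\n\n" +
		"The node configuration did not support custom DNS settings for Windows\n" +
		"nodes. This adds the ability to specify DNS servers during bootstrap.\n"
	fmt.Println(len(checkCommitMessage(msg)), "problem(s)")
}
```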
WINC|OCPBUGS-<number>: [<subsystem>] <title>
Examples:
- `WINC-959: [docs] reorganizes readme`
- `OCPBUGS-1234: [csr] Fix validation for serving certificates`
- `[nodeconfig] Add custom DNS support` (if no Jira issue)
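The PR title convention can be expressed as a regular expression. The sketch below is one possible encoding, assuming the Jira prefix is optional (as the last example suggests) and that subsystem tags contain no spaces; `parsePRTitle` and the `prTitle` type are hypothetical names used only here.

```go
package main

import (
	"fmt"
	"regexp"
)

// prTitlePattern captures: optional Jira key, subsystem, title.
var prTitlePattern = regexp.MustCompile(`^(?:((?:WINC|OCPBUGS)-\d+): )?\[([^\]\s]+)\] (\S.*)$`)

// prTitle holds the parsed parts of a PR title. Issue is empty when the
// title has no Jira prefix.
type prTitle struct {
	Issue     string
	Subsystem string
	Title     string
}

// parsePRTitle splits a title into its parts; ok is false if it does not
// follow the convention.
func parsePRTitle(s string) (prTitle, bool) {
	m := prTitlePattern.FindStringSubmatch(s)
	if m == nil {
		return prTitle{}, false
	}
	return prTitle{Issue: m[1], Subsystem: m[2], Title: m[3]}, true
}

func main() {
	for _, title := range []string{
		"WINC-959: [docs] reorganizes readme",
		"[nodeconfig] Add custom DNS support",
		"fix stuff",
	} {
		parsed, ok := parsePRTitle(title)
		fmt.Printf("%q ok=%v issue=%q subsystem=%q\n", title, ok, parsed.Issue, parsed.Subsystem)
	}
}
```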
# Required checks
make lint # Lint code
make imports # Fix import issues
make verify # All checks (lint + vet + unit + build)
Checklist:
- Fetched and rebased against upstream master
- Tests pass locally (`make verify`)
- Linted with `make lint`
- Fixed imports with `make imports`
- Error messages are single line
- Documentation updated if user-facing change
- `make vendor` if dependencies changed
- `make generate` if RBAC markers changed
- Open as Draft - Always open PRs as drafts first to prevent tests from running immediately
- Get Reviews - Need at least one `/lgtm` and one `/approve`
- Mark Ready - Click "Ready for review" to trigger CI tests
- Auto-Merge - PR merges automatically when tests pass
If a test fails due to a flake, retest with this format:
/retest-required
<explanation of the error>
<reason for retest>
<prow.ci.openshift.org link>
<log snippet of the failure>
Use the cherry-pick robot for backports to supported versions:
/cherry-pick <release>
For multiple versions, chain the backports (master → 4.11 → 4.10 → 4.9) to preserve Jira associations.
If the cherry-pick bot fails, create a manual PR and run:
/jira cherry-pick OCPBUGS-<number>
Open a GitHub issue for bugs or documentation problems.