Replication between primary and additional domain controllers has failed: how should AD, DNS, time, and SYSVOL be checked?
Domain-controller replication failures affect accounts, passwords, GPOs, and sign-ins. Start with replication summaries and error codes, then verify DNS, sites, RPC, time, and DFSR.
1. Conclusion and scope
Prepare the client and server versions, domain membership, DNS and gateway settings, network location, full error text, event timestamps, and recent changes. The reserved example domain corp.example is used throughout; no customer domain, IP address, account, or device identifier is included.
This issue falls under Active Directory and Group Policy. Logs and configuration can often be collected remotely first. Bulk permission changes, switch-path work, production cutovers, and recovery drills should use a controlled implementation window.
2. Symptoms and environment
- Capture the complete error text, event-log timestamp, and failed action rather than relying on a verbal description.
- Record the affected scope, first occurrence, reproducibility, and whether the result changes on another subnet.
- Domain members and computers being joined must use internal DNS; public resolvers do not provide the AD SRV records required for discovery.
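The SRV-record requirement above can be made concrete with a minimal sketch that lists the locator records an internal DNS server is expected to answer for corp.example. The function name and the site name used in the demo are illustrative assumptions, and the script only prints nslookup commands to run manually; it does not query DNS itself.

```python
def dc_locator_srv_names(domain, site=None):
    """Return SRV record names that AD clients query to find domain controllers."""
    domain = domain.strip(".").lower()
    names = [
        f"_ldap._tcp.dc._msdcs.{domain}",      # any domain controller
        f"_kerberos._tcp.dc._msdcs.{domain}",  # any KDC in the domain
        f"_ldap._tcp.pdc._msdcs.{domain}",     # PDC emulator
        f"_kerberos._tcp.{domain}",            # generic Kerberos record
    ]
    if site:
        # Site-specific record; the site name is environment-dependent.
        names.append(f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    return names


if __name__ == "__main__":
    # "Default-First-Site-Name" is only a placeholder site for this demo.
    for name in dc_locator_srv_names("corp.example", site="Default-First-Site-Name"):
        print(f"nslookup -type=SRV {name}")
```

If a public resolver is configured on the client, these lookups typically return no records, which is a quick way to confirm the DNS misconfiguration described above.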
3. Troubleshooting sequence
- Use repadmin to review the replication summary, failing source, error code, site, connection object, and naming context.
- Run dcdiag and focus on DNS, Advertising, Services, SysVolCheck, and NetLogons results.
- Confirm that every domain controller points to DNS servers that can resolve the AD zone, that site-to-subnet mappings are correct, and that cross-site routing and the dynamic RPC range are permitted.
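- Verify that clients, domain controllers, and hypervisors use a consistent time source and that their clocks agree within the Kerberos tolerance (five minutes by default). A wrong time zone usually matters because it hides a wrong underlying clock; Kerberos itself compares UTC time.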
- If replication problems coincide with inconsistent GPOs, verify SYSVOL and NETLOGON shares, DFSR events, and policy versions on each domain controller.
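- Across subnets, validate DNS (53), Kerberos (88), LDAP (389), SMB (445), the RPC endpoint mapper (135), and the dynamic RPC range (49152-65535 by default on current Windows Server) instead of testing a single port.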
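The multi-port check in the last step can be sketched as a small read-only probe. This is a minimal example under stated assumptions: the port list covers only the well-known TCP listeners, the function names are illustrative, and the host name in the usage comment is a placeholder. A successful connection to port 135 does not prove that the dynamic RPC range is open; that still has to be confirmed on the firewall or with a replication test.

```python
import socket

# Well-known TCP ports on a domain controller. The dynamic RPC range is
# negotiated through port 135 and is not covered by this probe.
DC_TCP_PORTS = {
    "DNS": 53,
    "Kerberos": 88,
    "RPC endpoint mapper": 135,
    "LDAP": 389,
    "SMB": 445,
}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


def check_dc_ports(host, probe=tcp_port_open):
    """Probe every well-known port and return {service: reachable}."""
    return {service: probe(host, port) for service, port in DC_TCP_PORTS.items()}


# Usage (host name is a placeholder for a domain controller you have verified):
#   for service, ok in check_dc_ports("dc01.corp.example").items():
#       print(f"{service:22} {'open' if ok else 'BLOCKED or closed'}")
```

Running the probe from each affected subnet, rather than only from the server VLAN, shows whether the result changes with network location.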
Read-only commands for a first pass:
repadmin /replsummary
repadmin /showrepl * /errorsonly
dcdiag /test:dns /test:advertising /v
Replace server names, domains, and paths with values verified for your environment. Do not copy real IP addresses, domains, or accounts from an unrelated environment.
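When there are many replication links, the output of repadmin /showrepl * /csv can be filtered with a short script. The sketch below is illustrative only: the column names are assumed to match the CSV header that repadmin emits and should be checked against your own export, and the sample rows (DC01, DC02, sites HQ and Branch) are invented for the demonstration.

```python
import csv
import io


def failing_links(csv_text):
    """Return replication links whose 'Number of Failures' value is above zero."""
    failures = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            count = int(row.get("Number of Failures") or 0)
        except ValueError:
            continue  # skip rows that do not carry a numeric failure count
        if count > 0:
            failures.append({
                "destination": row.get("Destination DSA"),
                "source": row.get("Source DSA"),
                "naming_context": row.get("Naming Context"),
                "failures": count,
                "status": row.get("Last Failure Status"),
            })
    return failures


# Invented sample in the assumed column layout; not real output.
SAMPLE = """showrepl_COLUMNS,Destination DSA Site,Destination DSA,Naming Context,Source DSA Site,Source DSA,Transport Type,Number of Failures,Last Failure Time,Last Success Time,Last Failure Status
showrepl_INFO,HQ,DC01,"DC=corp,DC=example",Branch,DC02,RPC,0,0,2024-01-01 10:00:00,0
showrepl_INFO,Branch,DC02,"DC=corp,DC=example",HQ,DC01,RPC,5,2024-01-01 10:05:00,2023-12-31 09:00:00,1722
"""

if __name__ == "__main__":
    for link in failing_links(SAMPLE):
        print(link)
```

Grouping the result by source and status code helps separate a single unreachable controller (for example status 1722, RPC server unavailable) from a naming-context or DNS problem that affects every partner.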
4. Safe remediation and rollout
Start with read-only queries, configuration exports, and one-system validation. Once the root cause is confirmed, define the target scope, change window, and rollback method. For multiple computers, use a test OU and a small pilot group, export policy results, and roll out in stages only after side effects are excluded.
5. Validation, rollback and common mistakes
Do not stop when the service works once. Revalidate with the user workflow, logs, a restart or fresh sign-in, another network location where relevant, and the next policy or backup cycle.
Validation and rollback checks
- Change one variable at a time and export the current configuration before making changes.
- After the change, recheck that clocks on clients, domain controllers, and hypervisors are still synchronized to the same time source so that Kerberos does not fail on clock skew.
- Confirm AD and SYSVOL replication between domain controllers and verify that the contacted controller has the expected policy version.
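The policy-version check can be sketched by comparing the Version value in each controller's copy of a GPO's GPT.INI file under SYSVOL. This is a minimal example under stated assumptions: the function names are illustrative, the file contents are passed in as text that you have already read from each controller, and the controller names in the usage comment are placeholders. The version number stores the user revision in the upper 16 bits and the computer revision in the lower 16 bits.

```python
def parse_gpt_ini_version(text):
    """Extract the integer after 'Version=' from GPT.INI content."""
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "version":
            return int(value.strip())
    raise ValueError("no Version= line found")


def split_gpo_version(version):
    """Split a GPO version number into user and computer revisions."""
    return {"user": version >> 16, "computer": version & 0xFFFF}


def compare_gpo_versions(gpt_ini_by_dc):
    """Return (versions per DC, True if every DC reports the same version)."""
    versions = {dc: parse_gpt_ini_version(text) for dc, text in gpt_ini_by_dc.items()}
    return versions, len(set(versions.values())) == 1


# Usage with placeholder controller names and file contents read from
# \\<dc>\SYSVOL\corp.example\Policies\{GUID}\GPT.INI on each controller:
#   versions, consistent = compare_gpo_versions({
#       "dc01": "[General]\nVersion=65539\n",
#       "dc02": "[General]\nVersion=65538\n",
#   })
#   # consistent is False here, which points at SYSVOL (DFSR) replication.
```

A mismatch between controllers suggests that SYSVOL has not converged, while matching files that differ from the version stored in the directory point at AD replication instead.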
Common mistakes to avoid
- Using public DNS or permanent hosts-file entries for domain controllers.
- Removing a computer from the domain before confirming local administrator access.
- Linking a new GPO to the entire domain without a pilot group.
Need an assessment based on your actual environment?
Send the exact error, screenshots, operating system and application versions, a high-level network diagram, the affected scope, and the steps already attempted. We will first determine whether the issue is suitable for remote troubleshooting or requires an on-site change window, then confirm scope and pricing.
