In late 2020, investigators tracing the SUNBURST backdoor found that after compromising SolarWinds' build pipeline, the threat actor spent weeks inside victim networks performing what researchers at FireEye called "deliberate, methodical" lateral movement. The actor queued enumeration tasks, mapped Active Directory trust relationships, and selectively escalated privileges only in environments with high-value targets, avoiding noisy broad scans that would trigger alerts. The patience and precision resembled automated decision logic more than manual keyboard operation.
Lateral movement reconnaissance is the process of mapping an internal network from a compromised host (identifying reachable systems, enumerating services, inferring trust relationships, and prioritizing targets) before the attacker actually moves to those systems. In classical red team operations this phase is manual and time-consuming. AI-assisted tooling compresses it dramatically.
AI systems are particularly effective here because the task is fundamentally a graph traversal and ranking problem: given a set of observed hosts, services, user sessions, and credentials, determine the optimal traversal path toward a defined objective. Large language models can also synthesize raw tool output (Nmap XML, BloodHound JSON, SMB share listings) into prioritized attack surface summaries in seconds.
The 2017 NotPetya worm used an automated lateral movement engine combining EternalBlue (MS17-010), Mimikatz credential harvesting, and PsExec/WMIC propagation. Though not AI-driven, it demonstrated what happens when reconnaissance and movement logic are fully automated: reportedly, 80% of Maersk's 45,000-machine global network was encrypted within roughly 90 minutes. AI-assisted lateral movement planning can achieve similar speed with far greater target selectivity, a more dangerous combination for defenders.
Raw enumeration typically surfaces hundreds of reachable hosts. Deciding which to move to first is where AI provides unique leverage. An LLM given a full host inventory can rank targets by: credential overlap likelihood, service criticality (backup servers, domain controllers, SIEM appliances), detection risk profile (endpoint agents present, EDR coverage), and proximity to the declared objective.
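The ranking described above can be sketched as a simple weighted score. This is a minimal illustration rather than a real prioritizer: the weights, field names, and sample hosts are all assumptions introduced here, and in practice the per-host estimates would come from an analyst or an LLM reading the enumeration output.

```python
from dataclasses import dataclass

# Hypothetical weights. Each criterion is an estimate in [0.0, 1.0].
# Detection risk carries a negative weight: riskier hosts rank lower.
WEIGHTS = {
    "credential_overlap": 0.30,
    "service_criticality": 0.30,
    "objective_proximity": 0.25,
    "detection_risk": -0.15,
}

@dataclass
class Host:
    name: str
    credential_overlap: float
    service_criticality: float
    objective_proximity: float
    detection_risk: float

def score(host: Host) -> float:
    """Weighted sum of the four ranking criteria."""
    return sum(weight * getattr(host, field) for field, weight in WEIGHTS.items())

def rank(hosts):
    """Return hosts ordered from highest to lowest score."""
    return sorted(hosts, key=score, reverse=True)

# Illustrative inventory; host names and values are made up.
inventory = [
    Host("BACKUP-01", 0.6, 0.9, 0.7, 0.3),
    Host("WS-HR-12", 0.4, 0.2, 0.2, 0.2),
    Host("SIEM-01", 0.2, 0.8, 0.5, 0.9),
]
ranked = rank(inventory)
```

With these sample values the backup server ranks first, and the SIEM appliance drops below it despite its criticality because of its high detection risk. The same scoring, run by defenders, is a quick way to see which hosts an attacker would likely value most.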
This mirrors how defenders think about crown jewel analysis; AI applies the same logic offensively. Penetration testers using this approach in authorized engagements report significantly faster path-to-objective times and more realistic emulation of advanced threat actors who spend weeks in pre-movement reconnaissance.
All lateral movement reconnaissance in a real engagement requires explicit written authorization covering internal network segments. Many statements of work (SOWs) authorize perimeter testing only. Unauthorized lateral movement, even with legitimate external access, constitutes unauthorized computer access in most jurisdictions. Confirm scope in writing before any internal enumeration activity.
You have completed initial access to a workstation (WS-FINANCE-04) in a financial services company's internal network during an authorized red team engagement. You have harvested a set of AD enumeration outputs including logged-on sessions, group memberships, and trust relationships. Your task is to use AI assistance to analyze this data and build a prioritized lateral movement plan toward the Domain Controller (DC-CORP-01).
In January 2022, the LAPSUS$ group gained access to the workstation of a Sitel support engineer who had access to Okta's internal customer support tooling. Okta put the window of access at five days (January 16 to 21). When Okta disclosed the breach in March 2022, analysis showed the actor had viewed customer tenant data through that support tooling: a classic privilege escalation via a third-party contractor's credentials rather than direct exploitation of a vulnerability. LAPSUS$ operated through social engineering, SIM swapping, and credential purchase. Public reporting does not attribute AI tooling to the group, but the underlying task (finding the maximum-value path from a limited initial foothold) is exactly what AI-assisted enumeration now automates.
In a compromised Windows environment, credentials exist in multiple forms simultaneously: NTLM hashes in memory via LSASS, Kerberos tickets (TGTs and service tickets) in the ticket cache, cleartext passwords in WDigest on legacy systems, DPAPI-encrypted browser and application credentials, and LSA secrets containing service account passwords. Each form requires different extraction techniques, and defenders monitor for all of them.
AI assists at the analysis stage: given a dump of extracted credential material, an LLM can identify which accounts map to privileged AD groups, which service accounts are Kerberoastable (weak password candidates), and which credentials are likely shared across systems, all without additional network queries that might trigger detection.
Kerberoasting requests Kerberos service tickets for accounts with SPNs (Service Principal Names) and attempts offline cracking. The attack surface varies enormously: some environments have dozens of Kerberoastable accounts, most with strong passwords. Indiscriminate cracking wastes time and may generate anomalous Kerberos traffic. AI can prioritize which SPNs to target by analyzing account naming conventions (service accounts often follow predictable patterns), account age (older accounts may have weaker password policies), and privilege level, focusing cracking resources on accounts with the highest access.
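The three signals named above (naming convention, account age, privilege level) can be combined into a rough triage score. The sketch below is illustrative only: the record fields, the naming regex, the weights, and the sample accounts are assumptions, not output from any real tool. Defenders can use the same ordering to decide which service-account passwords to rotate first.

```python
import re
from datetime import date

# Assumed naming convention for service accounts; adjust per environment.
SERVICE_NAME = re.compile(r"^(svc|sql|app)[-_]", re.IGNORECASE)

def triage_score(account, today):
    """Higher score = review this account sooner."""
    age_years = (today - account["created"]).days / 365.25
    score = min(age_years, 10.0) / 10.0      # account age, capped at 10 years
    if account["privileged"]:                # member of a privileged AD group
        score += 1.0
    if SERVICE_NAME.match(account["name"]):  # follows the assumed naming pattern
        score += 0.25
    return score

def prioritize(accounts, today):
    """Return accounts ordered from highest to lowest triage score."""
    return sorted(accounts, key=lambda a: triage_score(a, today), reverse=True)

# Hypothetical sample records.
accounts = [
    {"name": "sql_report", "created": date(2022, 6, 15), "privileged": False},
    {"name": "svc-backup", "created": date(2014, 3, 1), "privileged": True},
    {"name": "jdoe-spn", "created": date(2019, 1, 10), "privileged": True},
]
ordered = [a["name"] for a in prioritize(accounts, date(2024, 1, 1))]
```

Here the old, privileged backup service account sorts to the top, and the recent unprivileged reporting account sorts last.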
The DarkSide ransomware group that attacked Colonial Pipeline in May 2021 gained initial access via a legacy VPN account with a compromised password found in a dark web credential dump. A single reused password for an account with no MFA was sufficient initial access. From there, the actor moved through Colonial's IT network and deployed ransomware; Colonial shut down pipeline operations as a precaution, and there is no public evidence that the operational technology network itself was compromised. This illustrates how credential analysis (mapping reused passwords across systems) remains one of the highest-value lateral movement enablers, and why AI tools that automate credential correlation are operationally significant.
Privilege escalation planning is not about finding a single vulnerability; it is about constructing a chain of incremental access gains. A typical chain might proceed: local admin on workstation → credential dump via LSASS → service account hash → Kerberoast offline → domain user with IT-HelpDesk membership → AdminTo rights on server → active Domain Admin session → token impersonation → Domain Admin.
AI systems can enumerate and validate these chains by processing BloodHound JSON export data offline. The model identifies the shortest path, flags detection chokepoints at each step (e.g., LSASS memory access is a common EDR alert, and Credential Guard, where enabled, isolates the secrets being sought), and suggests alternative paths that trade efficiency for stealth.
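The efficiency-versus-stealth trade-off can be illustrated with a weighted shortest-path search over a toy graph. This is a sketch under stated assumptions: the edge list, the per-edge detection-risk values, and the `stealth_weight` parameter are hypothetical, and the code does not parse real BloodHound exports. Only WS-FINANCE-04 and DC-CORP-01 come from the scenario; the intermediate hosts are invented.

```python
import heapq

# Hypothetical edges: (source, target, detection_risk in [0, 1]).
EDGES = [
    ("WS-FINANCE-04", "SRV-FILE-02", 0.9),
    ("SRV-FILE-02", "DC-CORP-01", 0.8),
    ("WS-FINANCE-04", "SRV-PRINT-01", 0.1),
    ("SRV-PRINT-01", "SRV-APP-03", 0.2),
    ("SRV-APP-03", "DC-CORP-01", 0.1),
]

def best_path(edges, start, goal, stealth_weight):
    """Dijkstra search where each hop costs 1 + stealth_weight * detection_risk.

    stealth_weight = 0 gives the fewest hops; larger values prefer quiet edges.
    """
    graph = {}
    for src, dst, risk in edges:
        graph.setdefault(src, []).append((dst, 1.0 + stealth_weight * risk))
    queue = [(0.0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, step in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return None  # goal not reachable from start

shortest = best_path(EDGES, "WS-FINANCE-04", "DC-CORP-01", stealth_weight=0.0)
quietest = best_path(EDGES, "WS-FINANCE-04", "DC-CORP-01", stealth_weight=5.0)
```

With no stealth weighting the search returns the two-hop route through the file server; with a high weighting it returns the longer, lower-risk route. For defenders, the high-risk edges on the short path are the chokepoints worth monitoring or removing.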
The same AI-assisted path analysis used offensively can be deployed defensively. Purple team exercises where defenders run BloodHound-to-LLM analysis on their own environments regularly surface misconfigured delegation, forgotten service accounts with excessive privileges, and implicit trust paths that were never intentional. Running this analysis proactively is one of the highest-ROI defensive activities available to enterprise security teams.
You have executed SharpHound in the target environment and obtained a credential dump from the compromised workstation. You have a list of 23 Kerberoastable service accounts and need to prioritize which to crack. You also have a partial BloodHound output showing ACL paths. Your goal is to identify the shortest privilege escalation chain to Domain Admin within the authorized engagement scope.
The DOJ indictment of APT41 members in September 2020 and subsequent CISA advisories described a threat actor that maintained access to compromised networks across multiple incident response cycles. After defenders discovered and remediated one implant, APT41 was observed re-establishing access within days, sometimes hours. Analysis attributed this to layered persistence: the group deployed multiple independent backdoors using different command-and-control channels and different persistence techniques, so that removal of any single one left the others intact. Investigators found scheduled tasks, modified registry run keys, WMI subscriptions, and a COM object hijack all operating in parallel on the same compromised host.
Persistence is not a single technique; it is a design problem. The attacker's goal is to survive password resets, system reboots, AV updates, partial remediation, and incident response investigation. Each persistence mechanism has a different detection probability, a different survivability profile against specific IR actions, and a different operational footprint. AI assists in selecting the right combination for the target environment's defensive posture.
This selection problem maps cleanly to the MITRE ATT&CK Persistence tactic (TA0003), which documents roughly 20 techniques and dozens of sub-techniques; the exact counts change between ATT&CK releases. An LLM given context about an environment's EDR solution, Windows version, and AD configuration can reason about which persistence techniques are most likely to survive that specific defensive stack, mirroring how a human analyst would think about it, but faster.
Registry persistence: Run/RunOnce keys, Image File Execution Options, AppInit_DLLs. Widely detected by modern EDRs but effective against endpoints with gaps in registry monitoring.
Scheduled tasks and services: Windows Task Scheduler and service installation. AI can generate task XML that mimics legitimate system tasks in naming and trigger patterns to reduce detection probability.
WMI event subscriptions: Fileless persistence via WMI permanent subscriptions. Survives disk forensics. APT41 and other China-linked threat actors used this heavily in 2019-2021 campaigns.
Active Directory persistence: AdminSDHolder abuse, domain object ACL modification, DSRM account activation. Highly durable: survives endpoint reimaging because persistence lives in AD, not on disk.
Golden Tickets: Forged Kerberos tickets valid for attacker-defined periods (often 10 years). Requires the KRBTGT hash. Survives user password resets; invalidated only when the KRBTGT password is reset twice.
Firmware and boot-level persistence: Bootkits, UEFI implants, and MBR modification. Extremely durable but high-complexity. The LoJax UEFI implant (attributed to Sednit/APT28) was observed surviving OS reinstallation, and the FinSpy spyware has shipped with a UEFI bootkit.
The Hafnium threat actor's exploitation of Microsoft Exchange Server (ProxyLogon, CVE-2021-26855) in early 2021 involved web shell deployment for persistence, specifically ASPX web shells written to publicly accessible directories. Microsoft's MSTIC analysis identified over 30 distinct web shell variants deployed across thousands of compromised Exchange servers worldwide. AI-assisted web shell generation can produce variations that evade static signature detection while maintaining functionality, a clear application of LLM code generation to persistence planning in web-accessible environments.
When an LLM is given a description of a target environment (OS version, EDR product, logging configuration, AD functional level, and incident response capability), it can reason about persistence technique selection using the following logic framework:
Detection Probability: Which techniques does the environment's specific EDR vendor detect reliably? (This is researchable from public EDR bypass research, vendor test reports, and engagement experience.)
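Survivability Against IR Actions: Which techniques survive a reimaging of the compromised host? A password reset? A KRBTGT double-reset? Only persistence that lives outside the host's disk (AD-resident mechanisms, forged Kerberos tickets, and firmware-level implants) survives endpoint reimaging.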
Operational Noise: Which techniques generate anomalous telemetry at deployment time versus at execution time? Scheduled tasks are noisier at creation; Golden Tickets are noisy only if the KRBTGT hash extraction was detected.
Layering Strategy: APT41's documented approach (multiple independent persistence mechanisms using different C2 channels) is the operationally robust model. AI can recommend an optimal layering combination given the constraints above.
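The survivability criterion in this framework lends itself to a simple lookup that an IR team can use to check whether a remediation plan evicts every layer. The mapping below is a deliberately simplified assumption written for illustration; the technique and action names are invented labels, and real survivability depends on the specific implementation.

```python
# Simplified assumption: which IR action removes which persistence category.
REMOVED_BY = {
    "registry_run_key": {"host_reimage"},
    "scheduled_task": {"host_reimage"},
    "wmi_subscription": {"host_reimage"},
    "ad_acl_modification": {"ad_acl_review"},
    "golden_ticket": {"krbtgt_double_reset"},
}

def surviving(deployed, ir_actions):
    """Return the deployed techniques that none of the given IR actions remove."""
    taken = set(ir_actions)
    return sorted(t for t in deployed if not (REMOVED_BY[t] & taken))

# A layered set in the style described above.
layered = [
    "registry_run_key",
    "wmi_subscription",
    "ad_acl_modification",
    "golden_ticket",
]

after_reimage = surviving(layered, ["host_reimage"])
after_full = surviving(
    layered, ["host_reimage", "ad_acl_review", "krbtgt_double_reset"]
)
```

Reimaging alone leaves the AD-resident and ticket-based layers in place, which is the partial-remediation failure described in the APT41 case. Only the combined plan returns an empty list.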
In authorized engagements, every persistence mechanism deployed must be documented with: the technique name and MITRE ATT&CK ID, the specific implementation (registry key path, task name, WMI subscription query), the system it was deployed on, and the cleanup procedure. Failure to fully remove persistence mechanisms post-engagement has caused real incidents where red team implants were discovered by defenders months later or, worse, by actual threat actors who leveraged existing red team access.
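One way to enforce this documentation requirement is a structured record per mechanism, plus a check run before the engagement closes. The sketch below is a minimal example: the class, field names, hosts, and task path are hypothetical, while the two ATT&CK IDs shown are the published sub-technique IDs for Scheduled Task and WMI Event Subscription.

```python
from dataclasses import dataclass, asdict

@dataclass
class PersistenceRecord:
    technique: str       # technique name
    attack_id: str       # MITRE ATT&CK ID
    implementation: str  # e.g. registry key path, task name, WMI query
    host: str            # system the mechanism was deployed on
    cleanup: str         # documented removal procedure
    removed: bool = False

def missing_fields(record):
    """Names of required text fields that were left empty."""
    return [
        key for key, value in asdict(record).items()
        if isinstance(value, str) and not value.strip()
    ]

def outstanding(records):
    """Records that are incompletely documented or not yet removed."""
    return [r for r in records if missing_fields(r) or not r.removed]

# Hypothetical engagement log.
records = [
    PersistenceRecord(
        "Scheduled Task", "T1053.005",
        r"\Microsoft\Windows\Demo\RT-Task-01", "SRV-APP-03",
        "Delete the task and confirm it is gone from Task Scheduler",
        removed=True,
    ),
    PersistenceRecord(
        "WMI Event Subscription", "T1546.003",
        "", "WS-FINANCE-04",
        "Remove the filter, consumer, and binding objects",
    ),
]
open_items = outstanding(records)
```

The engagement should not be closed until `outstanding` returns an empty list; here the WMI record is flagged both for its missing implementation detail and because it has not been removed.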
You have achieved Domain Admin in an authorized red team engagement. The client has asked you to demonstrate what a sophisticated threat actor's persistence would look like so their IR team can practice detection and remediation. You need to design a layered persistence strategy that mimics APT-level tradecraft, using multiple techniques with different survivability profiles, and document exactly how each would be detected and removed.
The SUNBURST implant deployed by APT29 communicated via a C2 channel that researchers described as remarkably patient. After initial activation, the malware stayed dormant for 12 to 14 days before contacting its command server, specifically to outlast sandbox environments that time out. It then resolved a subdomain of avsvmcloud[.]com, generated by an algorithm tied to victim network identifiers, and moved to HTTP communication designed to mimic legitimate Orion Improvement Program telemetry. The avsvmcloud[.]com domain was itself designed to appear as a legitimate SolarWinds service. The sophistication of this C2 design (mimicry, timing control, traffic blending) is precisely what AI-assisted C2 planning can now automate for red team operators.
Command and control infrastructure is the most consistently detectable component of a long-term intrusion. Defenders monitor for: unusual outbound DNS patterns, beaconing behavior (regular interval connections to external hosts), low-reputation domain connections, anomalous protocol use (HTTP on non-standard ports, non-browser TLS fingerprints), and geographic anomalies (connections to countries inconsistent with business operations).
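Of the detection vectors listed above, beaconing is the easiest to illustrate. A common heuristic is the coefficient of variation of the gaps between connections: values near zero indicate machine-like regularity. The sketch below assumes timestamps in seconds for a single host and destination pair; the sample data and any threshold applied to the result are illustrative assumptions.

```python
from statistics import mean, pstdev

def beacon_score(timestamps):
    """Coefficient of variation of inter-connection gaps.

    Lower values mean more regular timing. Returns None when there are
    too few connections or all connections share one timestamp.
    """
    if len(timestamps) < 3:
        return None
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(gaps)
    if average == 0:
        return None
    return pstdev(gaps) / average

# Made-up samples: a roughly 60-second beacon versus bursty human browsing.
regular = [0, 60, 121, 180, 241, 300]
bursty = [0, 5, 200, 230, 900, 905]

regular_score = beacon_score(regular)
bursty_score = beacon_score(bursty)
```

The regular series scores close to zero while the bursty series scores above one. Heavy jitter or long sleep intervals will weaken this signal, which is why it is best treated as one input alongside domain reputation and protocol fingerprinting rather than as a standalone detector.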
Each of these detection vectors corresponds to a C2 design decision. AI systems can reason about C2 infrastructure choices by modeling the detection environment (specifically, what a defender with knowledge of the organization's normal traffic baseline would flag) and generating C2 channel configurations that minimize each detection vector simultaneously.
In May 2023, the FBI and CISA announced the disruption of Turla's Snake malware network, infrastructure that had operated for nearly 20 years. Snake used a peer-to-peer C2 architecture where compromised hosts relayed traffic through each other, masking the true command server location. CISA's advisory noted that Snake's HTTP-based C2 protocol was designed to mimic legitimate HTTP traffic and used custom obfuscation that changed between versions. The longevity of this infrastructure illustrates the value of sophisticated C2 design and the challenge defenders face when C2 is built to blend with legitimate traffic patterns.
Red team operators must apply OPSEC to their own infrastructure to prevent premature detection that contaminates the engagement results. Key principles include: using dedicated infrastructure per engagement (not re-using C2 servers across clients), registering domains with categorical consistency (a "financial services" client should use domains consistent with financial sector traffic), rotating C2 infrastructure at engagement phases, and ensuring that all red team tooling is configured to mimic the specific threat actor the engagement is emulating. If the client wants an APT29 simulation, the C2 profile should match APT29's documented tradecraft.
AI assists red teams in constructing engagement-specific C2 profiles by analyzing public reporting on the threat actor being emulated and generating Cobalt Strike Malleable C2 or Havoc Framework profile configurations that match documented C2 characteristics. This increases engagement realism and prepares defenders for the actual adversary rather than a generic red team profile.
The most effective C2 detection is not signature-based; it is behavioral. Establishing a traffic baseline and detecting deviations (new external domains, unusual data volume patterns, off-hours connections from service accounts) catches sophisticated C2 that evades signature detection. Network detection tools like Zeek, with ML-augmented anomaly detection, consistently outperform signature-based IDS against custom C2 frameworks. This is where defenders' investment should flow.
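Two of the deviations named above (new external domains and off-hours connections from service accounts) can be checked with very little code once a baseline exists. The sketch below is a simplified illustration: the event dictionary keys, the business-hours window, and the sample data are assumptions and do not correspond to any specific tool's log schema.

```python
def deviations(baseline_domains, events, business_hours=(8, 18)):
    """Flag events that deviate from the baseline.

    An event is flagged if its domain is absent from the baseline, or if a
    service account connects outside the assumed business-hours window.
    """
    start, end = business_hours
    findings = []
    for event in events:
        if event["domain"] not in baseline_domains:
            findings.append(
                (event["account"], event["domain"], "new external domain")
            )
        if event["service_account"] and not (start <= event["hour"] < end):
            findings.append(
                (event["account"], event["domain"], "off-hours service account")
            )
    return findings

# Hypothetical baseline and observations.
baseline = {"updates.example.com", "mail.example.com"}
events = [
    {"account": "jsmith", "domain": "mail.example.com",
     "hour": 10, "service_account": False},
    {"account": "svc-backup", "domain": "updates.example.com",
     "hour": 3, "service_account": True},
    {"account": "jsmith", "domain": "cdn-metrics.example.net",
     "hour": 14, "service_account": False},
]
findings = deviations(baseline, events)
```

A production deployment would build the baseline from weeks of traffic and add volume-based checks, but even this minimal version surfaces the two events a signature-based rule would miss.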
You are preparing a threat emulation engagement where your client wants to test their detection capabilities against APT29-style tradecraft. You need to design a C2 infrastructure and communication profile that mimics documented APT29 characteristics, including traffic blending, appropriate beacon intervals, and a category-consistent domain strategy. The client uses Palo Alto Cortex XDR with network analytics and has Zeek deployed at the perimeter.