What Is Penetration Testing?
Penetration testing is an authorized, simulated attack on an organization's systems, networks, or applications, conducted to identify exploitable vulnerabilities before malicious actors can use them. This page covers the formal definition and regulatory scope of penetration testing, the phased methodology that structures engagements, the operational scenarios in which testing is applied, and the decision criteria that distinguish one testing approach from another. The sector spans contracted professional services, compliance-mandated assessments, and internal security programs across every major US industry vertical.
Definition and scope
Penetration testing occupies a legally and technically distinct position in the US cybersecurity landscape: unlike scanning-based assessments, it requires demonstrated exploitation of discovered weaknesses, not merely their enumeration. NIST SP 800-115, Technical Guide to Information Security Testing and Assessment, defines penetration testing as security testing in which assessors mimic real-world attacks to identify methods for circumventing the security features of an application, system, or network. That definition draws a hard line between penetration testing and automated vulnerability scanning, which identifies potential weaknesses without confirming exploitability through active attack chains.
The legal boundary is equally definitive. Authorization documentation separates a legitimate penetration test from criminal unauthorized computer access under 18 U.S.C. § 1030, the Computer Fraud and Abuse Act (CFAA). Signed rules of engagement, scope-of-work agreements, and written authorization from the system owner are prerequisites for any lawful engagement, not procedural formalities. The pages on penetration testing legal considerations and authorization agreements cover this framework in depth.
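One way a testing team might enforce a written scope in its own tooling is a guard that checks every target address before any traffic is sent. The sketch below is a minimal illustration, not a standard implementation: the `IN_SCOPE` and `OUT_OF_SCOPE` lists and the function name `is_authorized_target` are hypothetical, and the addresses come from the reserved documentation ranges.

```python
import ipaddress

# Hypothetical scope, as it might be transcribed from signed rules of engagement.
IN_SCOPE = ["203.0.113.0/24", "198.51.100.16/28"]
OUT_OF_SCOPE = ["203.0.113.50/32"]  # e.g. a system the client does not own


def is_authorized_target(address: str) -> bool:
    """Return True only if the address is in scope and not explicitly excluded."""
    ip = ipaddress.ip_address(address)
    included = any(ip in ipaddress.ip_network(net) for net in IN_SCOPE)
    excluded = any(ip in ipaddress.ip_network(net) for net in OUT_OF_SCOPE)
    return included and not excluded


if __name__ == "__main__":
    for target in ["203.0.113.10", "203.0.113.50", "192.0.2.1"]:
        print(target, is_authorized_target(target))
```

Exclusions take precedence over inclusions here, so an address carved out of an otherwise in-scope range is refused. A check like this supports the signed authorization but does not replace it.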
Regulatory drivers have made penetration testing a compliance-mandated control across multiple US sectors:
- PCI DSS v4.0, Requirement 11.4: mandates external and internal penetration testing at least once every 12 months and after any significant infrastructure or application change for entities handling cardholder data.
- HIPAA Security Rule (45 CFR § 164.308(a)(8)): requires covered entities to perform periodic technical and nontechnical evaluations of their security safeguards. The rule does not name penetration testing, but regulators and auditors commonly interpret the evaluation standard to include it.
- FedRAMP: the Federal Risk and Authorization Management Program requires annual penetration testing of cloud service offerings seeking or maintaining a federal authorization (FedRAMP Penetration Test Guidance).
- CMMC 2.0: the Cybersecurity Maturity Model Certification framework, administered by the Department of Defense, names penetration testing among the enhanced Level 3 requirements drawn from NIST SP 800-172. Level 2, based on NIST SP 800-171, requires periodic security control assessments, which defense contractors often support with penetration testing (DoD CMMC Overview).
For a detailed breakdown of compliance-specific requirements, see penetration testing compliance requirements.
How it works
A penetration test follows a structured, phased methodology. The Penetration Testing Execution Standard (PTES) and NIST SP 800-115 each define discrete operational phases, and industry practice has converged on the following sequence:
- Pre-engagement — Scope definition, rules of engagement, authorization documentation, and legal agreements are finalized. The target environment, assessment type, and out-of-scope systems are specified in writing.
- Reconnaissance — Passive and active information gathering about the target: DNS records, exposed services, employee information, technology stack identification, and network topology mapping. Passive reconnaissance does not interact with target systems directly; active reconnaissance does.
- Vulnerability identification — Systematic discovery of potential weaknesses using both automated scanning tools (such as Nmap for network enumeration or Burp Suite for web applications) and manual analysis. This phase produces a candidate list, not a confirmed finding set.
- Exploitation — Human-driven attempts to confirm and weaponize identified vulnerabilities. This phase distinguishes penetration testing from vulnerability assessment: the tester actively attempts to gain unauthorized access, escalate privileges, or chain vulnerabilities to achieve a defined objective.
- Post-exploitation — Assessment of access depth, lateral movement potential, data reachability, and persistence mechanisms. This phase maps the realistic impact of a successful breach.
- Reporting — Documented findings with severity ratings, exploitation evidence, risk context, and remediation guidance. A professional penetration test report separates technical findings (for engineering teams) from executive summaries (for leadership and compliance officers).
The OWASP Web Security Testing Guide (WSTG) provides a parallel methodology specifically for web application engagements, with test case catalogs organized by vulnerability class. Detailed breakdowns of individual phases appear under penetration testing phases and penetration testing methodology.
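The six phases listed above can be modeled as an ordered workflow in which no testing phase starts until authorization is on file. The following is a minimal sketch under that assumption; the `Phase` and `Engagement` names are illustrative and do not come from PTES or NIST SP 800-115.

```python
from dataclasses import dataclass
from enum import IntEnum


class Phase(IntEnum):
    PRE_ENGAGEMENT = 1
    RECONNAISSANCE = 2
    VULNERABILITY_IDENTIFICATION = 3
    EXPLOITATION = 4
    POST_EXPLOITATION = 5
    REPORTING = 6


@dataclass
class Engagement:
    name: str
    authorization_signed: bool = False
    phase: Phase = Phase.PRE_ENGAGEMENT

    def advance(self) -> Phase:
        """Move to the next phase; refuse to leave pre-engagement unauthorized."""
        if self.phase is Phase.PRE_ENGAGEMENT and not self.authorization_signed:
            raise PermissionError("written authorization is required before testing")
        if self.phase is Phase.REPORTING:
            raise ValueError("engagement is already in its final phase")
        self.phase = Phase(self.phase + 1)
        return self.phase


if __name__ == "__main__":
    engagement = Engagement("example-engagement", authorization_signed=True)
    while engagement.phase is not Phase.REPORTING:
        print(engagement.advance().name)
```

Real engagements loop back between phases, for example returning to reconnaissance after gaining a foothold, so a strictly linear model like this is a simplification.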
Common scenarios
Penetration testing is applied across a range of operational contexts, each with distinct scope, depth, and regulatory implications.
External network testing targets internet-facing infrastructure, including firewalls, VPNs, web servers, and exposed services, and simulates an attacker with no prior access. It is a common first engagement and, alongside internal testing, a baseline requirement under PCI DSS. See network penetration testing for technical scope.
Web application testing focuses on HTTP/HTTPS attack surfaces: authentication flaws, injection vulnerabilities (SQL injection, XSS), server-side request forgery (SSRF), broken access controls, and weaknesses in API endpoints. The OWASP Top 10 provides the industry-standard vulnerability classification framework for this engagement type. Related coverage appears under web application penetration testing.
Internal network testing simulates a threat actor who has already obtained a foothold inside the network perimeter — modeling insider threats, compromised endpoints, or post-phishing scenarios. This engagement type evaluates lateral movement paths, segmentation controls, and privilege escalation opportunities.
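Lateral movement analysis is often reasoned about as a path search over a graph of which hosts can reach which. The sketch below assumes a hypothetical `REACHABLE` map with made-up host names and uses breadth-first search to find the shortest hop sequence from a foothold to a target. It illustrates the idea only and is not how any particular tool works.

```python
from collections import deque

# Hypothetical reachability map: host -> hosts it can open connections to.
REACHABLE = {
    "workstation-17": ["file-server", "print-server"],
    "file-server": ["backup-server"],
    "print-server": [],
    "backup-server": ["domain-controller"],
    "domain-controller": [],
    "payment-db": [],  # segmented: no host above can reach it
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest list of hops, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    print(shortest_path(REACHABLE, "workstation-17", "domain-controller"))
    print(shortest_path(REACHABLE, "workstation-17", "payment-db"))
```

A `None` result for a sensitive host suggests the segmentation control holds in this model. A returned path is a candidate route that the tester would still have to demonstrate in practice.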
Red team operations extend beyond single-domain testing to simulate full adversarial campaigns, combining social engineering, physical access attempts, and technical exploitation over an extended engagement window. Red team engagements typically operate against an unaware blue team (defensive security staff) to test detection and response capability, not just technical controls. See red team operations for the operational distinction.
Cloud and infrastructure-as-code testing addresses attack surfaces in AWS, Azure, and GCP environments, including misconfigured storage buckets, identity and access management (IAM) privilege paths, and container escape vulnerabilities. Cloud penetration testing covers provider-specific rules and scoping constraints.
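As one concrete example of an IAM privilege path, a reviewer might flag policy statements that allow every action on every resource. The sketch below assumes the AWS-style JSON policy layout (`Statement`, `Effect`, `Action`, `Resource`). The sample policy and the function name `overly_permissive_statements` are hypothetical, and the check covers only the plainest wildcard case.

```python
import json

# Hypothetical policy document in the AWS-style JSON layout.
POLICY_JSON = """
{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow", "Action": "s3:GetObject",
     "Resource": "arn:aws:s3:::example-bucket/*"},
    {"Effect": "Allow", "Action": "*", "Resource": "*"}
  ]
}
"""


def as_list(value):
    """Policy fields may hold a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def overly_permissive_statements(policy: dict) -> list:
    """Return Allow statements granting all actions on all resources."""
    flagged = []
    for statement in as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = as_list(statement.get("Action", []))
        resources = as_list(statement.get("Resource", []))
        if "*" in actions and "*" in resources:
            flagged.append(statement)
    return flagged


if __name__ == "__main__":
    for statement in overly_permissive_statements(json.loads(POLICY_JSON)):
        print("overly permissive:", statement)
```

Narrower wildcards such as `iam:*`, condition keys, and permissions inherited through roles are out of scope for this sketch, although they often matter more in a real engagement.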
Compliance-driven assessments are structured specifically to satisfy a named regulatory requirement — PCI DSS Requirement 11.4, FedRAMP annual testing, or SOC 2 Type II audit support — and are scoped and reported to match auditor expectations rather than attacker simulation objectives.
Decision boundaries
Selecting the appropriate assessment type requires matching scope, budget, timeline, and compliance obligation to the correct engagement structure. Four distinctions structure these decisions.
Penetration testing vs. vulnerability assessment — A vulnerability assessment enumerates potential weaknesses using automated tools and produces a list of findings ranked by severity. A penetration test confirms exploitability through active attack. Vulnerability assessments are faster and lower cost; penetration tests produce higher-confidence findings. The penetration testing vs. vulnerability assessment page maps this boundary in detail.
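The difference in output can be made concrete with a small data model. The sketch below is deliberately simplified: the `Finding` fields, the severity labels, and the sample findings are all hypothetical. It treats a vulnerability assessment as the full ranked candidate list and a penetration test as the subset confirmed by exploitation with recorded evidence.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Finding:
    title: str
    severity: str
    exploited: bool = False  # True only after a successful exploitation attempt
    evidence: str = ""       # e.g. a note describing what the tester obtained


# Hypothetical findings for illustration only.
FINDINGS = [
    Finding("Outdated TLS configuration", "medium"),
    Finding("SQL injection in login form", "critical", True,
            "read row count from users table"),
    Finding("Default admin credentials", "high", True,
            "logged in to management console"),
]


def vulnerability_assessment_view(findings):
    """Every candidate weakness, ranked by severity."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])


def penetration_test_view(findings):
    """Only weaknesses confirmed by exploitation and backed by evidence."""
    confirmed = [f for f in findings if f.exploited and f.evidence]
    return sorted(confirmed, key=lambda f: SEVERITY_ORDER[f.severity])


if __name__ == "__main__":
    for finding in penetration_test_view(FINDINGS):
        print(finding.severity, finding.title, "->", finding.evidence)
```

In practice a penetration test report usually also lists unconfirmed observations. The filter here isolates the confirmed subset only to show what exploitation adds.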
Black box vs. white box vs. gray box — Testing methodology varies by the information provided to the tester before engagement:
- Black box: No prior knowledge of the target environment — simulates an external attacker.
- White box: Full access to source code, architecture diagrams, and credentials — maximizes coverage efficiency.
- Gray box: Partial knowledge (e.g., user-level credentials, network diagrams) — the most common commercial configuration, balancing realism with thoroughness.
Full classification detail appears under black box, white box, and gray box testing.
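The three categories above amount to a classification by how much the tester is told beforehand. The sketch below encodes that rule of thumb, with a hypothetical `TesterKnowledge` record whose fields mirror the examples in the list. Where white box ends and gray box begins is a judgment call in practice, so the threshold used here is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TesterKnowledge:
    source_code: bool = False
    architecture_diagrams: bool = False
    credentials: bool = False
    network_diagrams: bool = False


def classify(knowledge: TesterKnowledge) -> str:
    """Map the information handed to the tester onto a testing approach."""
    provided = [
        knowledge.source_code,
        knowledge.architecture_diagrams,
        knowledge.credentials,
        knowledge.network_diagrams,
    ]
    if not any(provided):
        return "black box"
    # Assumption: white box means source code, architecture, and credentials.
    if (knowledge.source_code and knowledge.architecture_diagrams
            and knowledge.credentials):
        return "white box"
    return "gray box"


if __name__ == "__main__":
    print(classify(TesterKnowledge()))
    print(classify(TesterKnowledge(credentials=True, network_diagrams=True)))
```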
Point-in-time vs. continuous testing — Traditional penetration tests are conducted annually or at defined intervals. Continuous penetration testing models and penetration testing as a service platforms provide ongoing assessment coverage aligned to development cycle velocity. Organizations operating under active threat environments or rapid release cycles increasingly adopt continuous models.
Manual vs. automated: Automated platforms reduce cost and increase frequency but cannot replicate the chained exploitation and creative attack paths that human testers generate. Compliance frameworks such as PCI DSS and FedRAMP require testing performed by qualified assessors, so automated scanning alone does not earn compliance credit. The automated vs. manual penetration testing comparison covers the tradeoff in operational terms.
For organizations entering procurement, hiring a penetration testing firm and cost of penetration testing cover the service-sector structure and qualification standards applicable to vendor selection.
References
- NIST SP 800-115, Technical Guide to Information Security Testing and Assessment — National Institute of Standards and Technology
- PCI DSS v4.0 — Requirement 11.4 — PCI Security Standards Council
- [FedRAMP Penetration Test Guidance](https://www.fedramp.gov/assets/resources/documents/CS