Certified Web Application Pentester

Reconnaissance and Information Gathering

Task 1
Passive Reconnaissance Fundamentals

1. What is Passive Reconnaissance

Passive reconnaissance gathers information about a target without directly interacting with it. No packets are sent to the target's infrastructure, so the activity is effectively invisible to the target (though the third-party sources you query may log your requests).

Active Recon:   Attacker --> [packets] --> Target (detectable)
Passive Recon:  Attacker --> [queries] --> Third-party sources (undetectable by target)

2. WHOIS Enumeration

# Domain WHOIS
whois target.com

# Key information to extract:
# - Registrant name, email, organization
# - Name servers (hosting provider clues)
# - Registration/expiration dates
# - Registrar information

# IP WHOIS
whois 93.184.216.34

# Key information:
# - IP range/CIDR block owned
# - Organization name
# - Abuse contact
# - Network name (NetName)

# Reverse WHOIS (find domains by registrant)
# amass intel -whois -d target.com
# reversewhois.io
# whoxy.com API
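
The key fields listed above can be pulled out of saved WHOIS output with a quick filter. A minimal sketch (the helper name is made up, and the sample lines piped in stand in for real `whois` output; field names vary by registry, so extend the pattern as needed):

```shell
# whois_fields: keep only registrant/registrar/name-server/date lines
whois_fields() {
  grep -iE '^(Registrant|Registrar:|Name Server|Creation Date|Registry Expiry)' | sort -u
}

# Canned sample standing in for `whois target.com` output:
printf 'Registrar: Example Registrar LLC\nName Server: NS1.TARGET.COM\nName Server: NS2.TARGET.COM\n' | whois_fields
```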

3. DNS Reconnaissance

# All DNS records (note: many servers now refuse or minimize ANY queries per RFC 8482)
dig target.com ANY

# Specific record types
dig target.com A        # IPv4 address
dig target.com AAAA     # IPv6 address
dig target.com MX       # Mail servers
dig target.com NS       # Name servers
dig target.com TXT      # TXT records (SPF, DKIM, verification tokens)
dig target.com CNAME    # Canonical names
dig target.com SOA      # Start of Authority
dig target.com SRV      # Service records

# Using specific DNS server
dig @8.8.8.8 target.com A
dig @1.1.1.1 target.com A

# Reverse DNS
dig -x 93.184.216.34

# Zone transfer attempt
dig axfr @ns1.target.com target.com

# DNS trace (follow delegation chain)
dig +trace target.com

# Short output
dig +short target.com A
dig +short target.com MX

DNS Record Security Implications

Record Type   Security Relevance
A/AAAA        Direct IP, hosting provider identification
MX            Email infrastructure, phishing targets
NS            DNS provider, potential takeover
TXT           SPF/DKIM/DMARC config, domain verification tokens
CNAME         Subdomain takeover candidates
SRV           Internal services exposed
SOA           Zone admin email, serial number

4. Certificate Transparency Logs

# crt.sh - query CT logs
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u

# Filter for unique subdomains
curl -s "https://crt.sh/?q=%25.target.com&output=json" | \
  jq -r '.[].name_value' | \
  sed 's/\*\.//g' | \
  sort -u > ct_subdomains.txt

# censys.io certificates API
# https://search.censys.io/certificates?q=target.com

# Google Certificate Transparency
# https://transparencyreport.google.com/https/certificates

# Facebook CT monitoring
# https://developers.facebook.com/tools/ct/

5. Search Engine Dorking

Google Dorks

# Find login pages
site:target.com inurl:login
site:target.com inurl:admin
site:target.com intitle:"login" OR intitle:"sign in"

# Find sensitive files
site:target.com filetype:pdf
site:target.com filetype:xlsx OR filetype:csv
site:target.com filetype:sql
site:target.com filetype:env
site:target.com filetype:log
site:target.com filetype:bak
site:target.com filetype:conf OR filetype:cfg
site:target.com filetype:xml

# Find exposed directories
site:target.com intitle:"index of"
site:target.com intitle:"directory listing"

# Find error messages
site:target.com "php error" OR "sql syntax" OR "undefined index"
site:target.com "stack trace" OR "traceback"

# Find API documentation
site:target.com inurl:api
site:target.com inurl:swagger OR inurl:api-docs
site:target.com filetype:json inurl:openapi

# Find sensitive information
site:target.com "password" filetype:txt
site:target.com "api_key" OR "apikey" OR "api-key"
site:target.com "BEGIN RSA PRIVATE KEY"

# Find subdomains
site:*.target.com -www

# Find WordPress specific
site:target.com inurl:wp-content
site:target.com inurl:wp-admin
site:target.com filetype:xml inurl:sitemap

# Cached/old versions (Google has retired the cache: operator;
# use the Wayback Machine for historical copies instead)
cache:target.com
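
A small helper (hypothetical) can print a starter set of the dorks above for a given domain, ready to paste into a search engine:

```shell
# gen_dorks: emit common dork strings for one domain
gen_dorks() {
  d=$1
  printf 'site:%s inurl:login\n' "$d"
  printf 'site:%s intitle:"index of"\n' "$d"
  for ft in sql env log bak conf; do
    printf 'site:%s filetype:%s\n' "$d" "$ft"
  done
}

gen_dorks target.com
```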

Other Search Engines

# Bing
# site:target.com

# DuckDuckGo
# site:target.com

# Yandex (good for .ru domains)
# site:target.com

# Shodan
# hostname:target.com
# org:"Target Organization"
# ssl.cert.subject.cn:target.com

# Censys
# services.tls.certificates.leaf.names:target.com

# ZoomEye
# site:target.com

# FOFA
# domain="target.com"

6. Web Archive Analysis

# Wayback Machine URLs
waybackurls target.com > wayback_urls.txt

# gau (GetAllUrls) - multiple sources
gau target.com > gau_urls.txt
gau --subs target.com > gau_with_subs.txt

# Combine and deduplicate
cat wayback_urls.txt gau_urls.txt | sort -u > all_historical_urls.txt

# Filter for interesting endpoints
cat all_historical_urls.txt | grep -iE "\.(php|asp|aspx|jsp|json|xml|config|env|sql|bak|old|backup)" > interesting_urls.txt

# Filter for parameters
cat all_historical_urls.txt | grep "?" | sort -u > parameterized_urls.txt

# Filter for API endpoints
cat all_historical_urls.txt | grep -iE "(api|graphql|rest|v1|v2|v3)" > api_urls.txt

# Check which URLs are still alive
cat interesting_urls.txt | httpx -silent -status-code -title > alive_urls.txt

# Wayback Machine snapshots via API
curl -s "https://web.archive.org/cdx/search/cdx?url=target.com/*&output=json&fl=timestamp,original,statuscode,mimetype" | jq .

7. Technology Fingerprinting

# Wappalyzer (browser extension or CLI)
# Identifies: CMS, frameworks, programming languages, servers, analytics

# BuiltWith
# https://builtwith.com/target.com

# Netcraft
# https://sitereport.netcraft.com/?url=target.com

# WhatRuns (browser extension)

# Check HTTP headers for technology hints
curl -sI https://target.com | grep -iE "(server|x-powered|x-aspnet|x-generator|x-drupal|x-framework)"

# robots.txt analysis
curl -s https://target.com/robots.txt

# Common technology indicators in HTML
curl -s https://target.com | grep -ioE "(wp-content|drupal|joomla|laravel|django|rails|angular|react|vue|next|nuxt)"

# favicon hash for identification
# Note: Shodan indexes the MurmurHash3 of the base64-encoded favicon,
# not its MD5 (see the mmh3 snippet under scope expansion)
curl -s https://target.com/favicon.ico | md5sum   # quick manual comparison only
# Shodan search: http.favicon.hash:<mmh3_hash_value>

8. Email Harvesting

# theHarvester
theHarvester -d target.com -b google,bing,linkedin,twitter -l 500

# hunter.io
# https://hunter.io/domain-search (API available)

# Phonebook.cz
# https://phonebook.cz

# Clearbit Connect
# Browser extension for email discovery

# Verify emails
# emailhippo.com
# verify-email.org

# LinkedIn OSINT
# Search for employees: site:linkedin.com "target company"
# Extract names → generate email patterns

# Common email patterns to try
# first.last@target.com
# firstlast@target.com
# flast@target.com
# first@target.com
# first_last@target.com
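
Harvested "First Last" names can be expanded into candidate addresses mechanically. A sketch (the helper name, sample name, and domain are illustrative):

```shell
# gen_emails: expand "First Last" lines on stdin into common address patterns
gen_emails() {
  domain=$1
  while read -r first last; do
    first=$(printf '%s' "$first" | tr 'A-Z' 'a-z')
    last=$(printf '%s' "$last" | tr 'A-Z' 'a-z')
    f=$(printf '%s' "$first" | cut -c1)
    printf '%s.%s@%s\n' "$first" "$last" "$domain"   # first.last@
    printf '%s%s@%s\n'  "$first" "$last" "$domain"   # firstlast@
    printf '%s%s@%s\n'  "$f" "$last" "$domain"       # flast@
    printf '%s_%s@%s\n' "$first" "$last" "$domain"   # first_last@
  done
}

printf 'John Smith\n' | gen_emails target.com
```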

9. Social Media OSINT

# GitHub/GitLab reconnaissance
# Search for organization repos
# https://github.com/target-org

# Search code for secrets
# GitHub: org:target-org password
# GitHub: org:target-org api_key
# GitHub: org:target-org secret
# GitHub: org:target-org AWS_ACCESS_KEY

# GitDorker - automated GitHub dorking
# python3 GitDorker.py -t <github_token> -org target-org

# truffleHog - find secrets in git history
trufflehog github --org=target-org

# LinkedIn
# Company page → employees list
# Technology stack from job postings

# Twitter/X
# @target_company tweets for technology mentions
# Employees discussing internal tools

# Pastebin/paste sites
# Search for target.com on paste sites
# Dehashed, IntelligenceX for leaked data

10. Shodan and Internet-Wide Scan Data

# Shodan CLI
shodan search "hostname:target.com"
shodan search "org:\"Target Organization\""
shodan search "ssl.cert.subject.cn:target.com"
shodan host 93.184.216.34

# Shodan filters
# port:    Specific port
# country: Country code
# city:    City name
# os:      Operating system
# product: Software name
# version: Software version
# vuln:    CVE number

# Censys
# https://search.censys.io
# services.tls.certificates.leaf.names:target.com

# BinaryEdge
# https://app.binaryedge.io

# GreyNoise
# https://viz.greynoise.io

# Shodan Dorks for specific services
# "Apache" hostname:target.com
# "nginx" hostname:target.com
# "Microsoft-IIS" hostname:target.com
# "X-Powered-By: PHP" hostname:target.com
# "Set-Cookie: JSESSIONID" hostname:target.com

11. Metadata Analysis

# Extract metadata from documents
exiftool document.pdf
exiftool -a -u -g1 document.pdf

# FOCA (Windows tool for metadata extraction)
# Batch download and analyze documents

# Download all PDFs from target
wget -r -l1 -A pdf https://target.com/
# Extract metadata from all
find . -name "*.pdf" -exec exiftool {} \; > metadata_results.txt

# Key metadata to look for:
# - Author names (internal usernames)
# - Software versions
# - Internal paths (C:\Users\john\...)
# - Printer names
# - GPS coordinates
# - Email addresses
# - Creation/modification dates
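
After running exiftool over a document tree, the identity-related fields above can be pulled out with a quick filter. A sketch (the helper name is made up; the piped-in lines are canned exiftool-style output):

```shell
# extract_authors: keep author/creator fields, collapse padding spaces
extract_authors() {
  grep -iE '^ *(Author|Creator|Last Modified By)' | sed 's/  */ /g' | sort -u
}

printf 'Author                          : jsmith\nCreator                         : Microsoft Word\n' | extract_authors
```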

12. Passive Recon Automation Script

#!/bin/bash
# passive_recon.sh - Automated passive reconnaissance
TARGET=$1

echo "[*] Starting passive recon for: $TARGET"
mkdir -p recon/$TARGET

# WHOIS
echo "[*] WHOIS lookup..."
whois $TARGET > recon/$TARGET/whois.txt

# DNS records
echo "[*] DNS enumeration..."
for type in A AAAA MX NS TXT SOA CNAME; do
    dig +short $TARGET $type >> recon/$TARGET/dns_records.txt
done

# Certificate Transparency
echo "[*] Certificate transparency..."
curl -s "https://crt.sh/?q=%25.$TARGET&output=json" | \
  jq -r '.[].name_value' 2>/dev/null | \
  sed 's/\*\.//g' | sort -u > recon/$TARGET/ct_subdomains.txt

# Wayback URLs
echo "[*] Wayback Machine URLs..."
waybackurls $TARGET 2>/dev/null | sort -u > recon/$TARGET/wayback_urls.txt

# gau
echo "[*] GAU URLs..."
gau $TARGET 2>/dev/null | sort -u > recon/$TARGET/gau_urls.txt

# Combine URLs
cat recon/$TARGET/wayback_urls.txt recon/$TARGET/gau_urls.txt | \
  sort -u > recon/$TARGET/all_urls.txt

echo "[*] Results saved to recon/$TARGET/"
echo "[*] Subdomains found: $(wc -l < recon/$TARGET/ct_subdomains.txt)"
echo "[*] URLs collected: $(wc -l < recon/$TARGET/all_urls.txt)"

Task 2
DNS Enumeration and Zone Transfers

1. DNS Architecture for Pentesters

Client Query → Recursive Resolver → Root NS → TLD NS → Authoritative NS
Client       ← Recursive Resolver ←---------- DNS Response -----------┘

DNS Record Types Deep Dive

# A Record - IPv4 address mapping
dig target.com A +short
# 93.184.216.34

# AAAA Record - IPv6 address mapping
dig target.com AAAA +short
# 2606:2800:220:1:248:1893:25c8:1946

# MX Record - Mail servers (priority:server)
dig target.com MX +short
# 10 mail1.target.com.
# 20 mail2.target.com.

# NS Record - Authoritative name servers
dig target.com NS +short
# ns1.target.com.
# ns2.target.com.

# SOA Record - Zone authority
dig target.com SOA +short
# ns1.target.com. admin.target.com. 2024010101 3600 900 604800 86400

# TXT Record - Text records (SPF, DKIM, verification)
dig target.com TXT +short
# "v=spf1 include:_spf.google.com ~all"
# "google-site-verification=..."

# CNAME Record - Alias/canonical name
dig www.target.com CNAME +short
# target.com.

# SRV Record - Service location
dig _sip._tcp.target.com SRV +short
# 10 60 5060 sip.target.com.

# PTR Record - Reverse DNS
dig -x 93.184.216.34 +short
# target.com.

# CAA Record - Certificate Authority Authorization
dig target.com CAA +short
# 0 issue "letsencrypt.org"

2. Zone Transfer Attacks (AXFR)

# Identify name servers first
dig target.com NS +short

# Attempt zone transfer from each NS
dig axfr target.com @ns1.target.com
dig axfr target.com @ns2.target.com

# Using host command
host -t axfr target.com ns1.target.com

# Using nmap
nmap --script dns-zone-transfer -p 53 ns1.target.com

# IXFR (Incremental Zone Transfer)
dig ixfr target.com @ns1.target.com

# What a successful zone transfer reveals:
# - ALL subdomains and their IP addresses
# - Internal hostnames
# - Mail servers
# - Service records
# - Network architecture

Zone Transfer Automation

#!/bin/bash
# zone_transfer.sh
TARGET=$1

echo "[*] Attempting zone transfers for $TARGET"

# Get name servers
NS_SERVERS=$(dig +short NS $TARGET)

for ns in $NS_SERVERS; do
    echo "[*] Trying zone transfer from: $ns"
    result=$(dig axfr $TARGET @$ns 2>&1)
    if ! echo "$result" | grep -qE "Transfer failed|connection timed out|communications error"; then
        echo "[+] Zone transfer successful from $ns!"
        echo "$result" > "${TARGET}_zone_transfer_${ns}.txt"
    else
        echo "[-] Zone transfer failed from $ns"
    fi
done

3. Subdomain Enumeration

Passive Subdomain Discovery

# Subfinder - fast passive subdomain enumeration
subfinder -d target.com -o subfinder_subs.txt
subfinder -d target.com -all -o subfinder_all.txt

# Amass - comprehensive OSINT
amass enum -passive -d target.com -o amass_passive.txt
amass enum -d target.com -o amass_active.txt  # includes active

# Assetfinder
assetfinder target.com > assetfinder_subs.txt
assetfinder --subs-only target.com > assetfinder_subs_only.txt

# Findomain
findomain -t target.com -q > findomain_subs.txt

# Combine all results
cat subfinder_subs.txt amass_passive.txt assetfinder_subs.txt findomain_subs.txt | \
  sort -u > all_subdomains.txt

Active Subdomain Bruteforcing

# DNS bruteforce with common wordlist
gobuster dns -d target.com -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt -t 50

# Shuffledns - massdns wrapper
shuffledns -d target.com -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-110000.txt -r resolvers.txt

# Puredns - fast DNS bruteforcing
puredns bruteforce /usr/share/seclists/Discovery/DNS/subdomains-top1million-110000.txt target.com -r resolvers.txt

# DNSx - DNS toolkit
cat all_subdomains.txt | dnsx -silent -a -resp > resolved_subs.txt

# Altdns - subdomain alteration
altdns -i all_subdomains.txt -o altered_subs.txt -w words.txt
cat altered_subs.txt | dnsx -silent > resolved_altered.txt

# dnsgen - generate permutations
cat all_subdomains.txt | dnsgen - | dnsx -silent > dnsgen_results.txt

# Common subdomain patterns to check
# dev, staging, test, uat, qa, preprod, beta
# api, api-v2, api-internal
# admin, portal, dashboard, panel
# mail, webmail, smtp, pop, imap
# vpn, remote, gateway
# git, gitlab, bitbucket, jenkins, ci, cd
# jira, confluence, wiki, docs
# db, database, mysql, postgres, mongo, redis
# backup, old, legacy, archive
# cdn, static, assets, media, images
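
Crossing the service names above with environment prefixes yields a quick target-specific wordlist. A sketch (both lists are truncated for illustration):

```shell
# gen_patterns: cross service names with environment prefixes
gen_patterns() {
  for svc in api admin mail vpn git backup; do
    for env in "" dev- staging- test- uat-; do
      echo "${env}${svc}"
    done
  done | sort -u
}

gen_patterns
```

Feed the output to a resolver such as puredns or dnsx.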

4. DNS Security Analysis

SPF Record Analysis

# Check SPF
dig target.com TXT | grep "v=spf1"

# SPF Mechanisms:
# ip4:x.x.x.x    - Allow specific IPv4
# ip6:...         - Allow specific IPv6
# include:domain  - Include another domain's SPF
# a               - Allow domain's A record IPs
# mx              - Allow domain's MX IPs
# all             - Match all (with qualifier)

# Qualifiers:
# +all  PASS (allow all - very weak)
# -all  FAIL (reject non-listed - strong)
# ~all  SOFTFAIL (mark but accept - moderate)
# ?all  NEUTRAL (no policy - weak)

# Weak SPF examples:
# v=spf1 +all                    → allows anyone to send
# v=spf1 ?all                    → neutral policy
# (no SPF record)                → no email auth
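
The qualifier triage above can be scripted against a fetched SPF string. A minimal sketch (the helper name is made up; the record shown is a canned example):

```shell
# spf_check: classify the "all" mechanism qualifier in an SPF record
spf_check() {
  case $1 in
    *"+all"*) echo "WEAK: +all allows any sender" ;;
    *"?all"*) echo "WEAK: neutral policy" ;;
    *"~all"*) echo "MODERATE: softfail" ;;
    *"-all"*) echo "STRONG: hardfail" ;;
    *)        echo "NO all mechanism" ;;
  esac
}

spf_check "v=spf1 include:_spf.google.com ~all"
```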

DMARC Record Analysis

# Check DMARC
dig _dmarc.target.com TXT

# DMARC tags:
# v=DMARC1       - Version
# p=none         - No action (monitoring only)
# p=quarantine   - Quarantine failures
# p=reject       - Reject failures
# rua=           - Aggregate report URI
# ruf=           - Forensic report URI
# pct=           - Percentage of messages to filter
# sp=            - Subdomain policy

# Weak DMARC:
# p=none (monitoring only, no enforcement)
# No DMARC record at all

DKIM Record Analysis

# DKIM selector discovery
dig google._domainkey.target.com TXT
dig default._domainkey.target.com TXT
dig selector1._domainkey.target.com TXT
dig k1._domainkey.target.com TXT

# Common selectors to check:
# google, default, selector1, selector2, s1, s2, k1, dkim, mail

5. DNS Cache Snooping

# Non-recursive query to check if domain is cached
dig @ns1.target.com example.com A +norecurse

# If response received → domain was recently resolved
# Can reveal what sites employees visit

# Automated cache snooping
nmap --script dns-cache-snoop --script-args 'dns-cache-snoop.domains={facebook.com,gmail.com,slack.com}' -p 53 ns1.target.com

6. DNS Rebinding Detection

# DNS Rebinding Attack Flow:
1. Victim visits attacker.com
2. attacker.com resolves to attacker's IP (first resolution)
3. JavaScript loads from attacker's server
4. DNS TTL expires
5. attacker.com resolves to internal IP (127.0.0.1 or 192.168.x.x)
6. JavaScript can now access internal services via same-origin

# Detection: check for very low TTL values
dig target.com A | grep -i "ttl"

# TTL < 60 seconds may indicate DNS rebinding potential
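
The TTL check can be done directly on `dig +noall +answer` output (columns: name, TTL, class, type, data). A sketch with a canned answer line:

```shell
# low_ttl: print A-record answer lines whose TTL is below 60 seconds
low_ttl() {
  awk '$3 == "IN" && $4 == "A" && $2 < 60 { print $1 " TTL=" $2 }'
}

printf 'app.target.com.\t30\tIN\tA\t10.0.0.5\n' | low_ttl
```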

7. Subdomain Takeover via DNS

# Check for dangling CNAME records
cat all_subdomains.txt | while read sub; do
    cname=$(dig +short CNAME $sub)
    if [ -n "$cname" ]; then
        echo "$sub -> $cname"
    fi
done > cname_records.txt

# Check if CNAME targets are claimable
# Common takeover targets:
# *.s3.amazonaws.com (404 NoSuchBucket)
# *.herokuapp.com (No such app)
# *.ghost.io (404)
# *.github.io (404)
# *.azurewebsites.net (404)
# *.cloudfront.net (Bad Request)
# *.pantheon.io
# *.shopify.com
# *.tumblr.com
# *.wordpress.com
# *.zendesk.com

# Automated check
subjack -w all_subdomains.txt -t 100 -timeout 30 -ssl -c fingerprints.json -v

# Nuclei subdomain takeover templates
nuclei -l all_subdomains.txt -t takeovers/

8. DNS Tunneling Detection

# Signs of DNS tunneling:
# - Unusually long subdomain labels
# - High volume of TXT record queries
# - Base64/hex encoded subdomain labels
# - Queries to suspicious domains with many subdomains

# Monitor DNS traffic
tcpdump -i eth0 port 53 -w dns_capture.pcap

# Analyze with tshark
tshark -r dns_capture.pcap -Y "dns" -T fields -e dns.qry.name | \
  awk '{print length, $0}' | sort -rn | head -20

# DNS tunneling tools (for authorized testing):
# iodine, dnscat2, dns2tcp
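
The "unusually long label" sign above is easy to automate on a list of extracted query names. A sketch (the helper name is made up; sample queries are canned):

```shell
# flag_long_labels: print queries containing any dot-separated label
# longer than 40 characters
flag_long_labels() {
  awk -F. '{ for (i = 1; i <= NF; i++) if (length($i) > 40) { print $0; break } }'
}

printf '%s\n' \
  'dGhpcyBsb29rcyBsaWtlIGJhc2U2NCBlbmNvZGVkIGV4ZmlsIGRhdGE.tunnel.example' \
  'www.target.com' | flag_long_labels
```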

9. Reverse DNS Enumeration

# Reverse DNS on IP range
# If target owns 93.184.216.0/24
for ip in $(seq 1 254); do
    result=$(dig -x 93.184.216.$ip +short 2>/dev/null)
    if [ -n "$result" ]; then
        echo "93.184.216.$ip -> $result"
    fi
done

# Using dnsrecon
dnsrecon -r 93.184.216.0/24 -n 8.8.8.8

# Using nmap
nmap -sL 93.184.216.0/24 | grep "(" > reverse_dns.txt

# Fierce - DNS reconnaissance tool
fierce --domain target.com

10. Comprehensive DNS Recon Script

#!/bin/bash
# dns_recon.sh - Complete DNS reconnaissance
TARGET=$1
OUTDIR="recon/${TARGET}/dns"
mkdir -p $OUTDIR

echo "=== DNS Reconnaissance: $TARGET ==="

# Standard records
echo "[*] Querying standard DNS records..."
for type in A AAAA MX NS TXT SOA CNAME CAA SRV; do
    echo "--- $type Records ---" >> $OUTDIR/records.txt
    dig $TARGET $type +noall +answer >> $OUTDIR/records.txt
    echo "" >> $OUTDIR/records.txt
done

# Name servers
echo "[*] Identifying name servers..."
dig +short NS $TARGET > $OUTDIR/nameservers.txt

# Zone transfer
echo "[*] Attempting zone transfers..."
while read ns; do
    echo "[*] Trying AXFR from $ns..."
    dig axfr $TARGET @$ns >> $OUTDIR/zone_transfer.txt 2>&1
done < $OUTDIR/nameservers.txt

# SPF/DMARC/DKIM
echo "[*] Checking email security records..."
echo "=== SPF ===" > $OUTDIR/email_security.txt
dig $TARGET TXT | grep "v=spf1" >> $OUTDIR/email_security.txt
echo "=== DMARC ===" >> $OUTDIR/email_security.txt
dig _dmarc.$TARGET TXT >> $OUTDIR/email_security.txt
echo "=== DKIM (common selectors) ===" >> $OUTDIR/email_security.txt
for sel in google default selector1 selector2 k1 dkim mail; do
    dig ${sel}._domainkey.$TARGET TXT +short >> $OUTDIR/email_security.txt 2>/dev/null
done

echo "[*] DNS recon complete. Results in $OUTDIR/"

Task 3
Subdomain Discovery Techniques

1. Why Subdomain Discovery Matters

Main domain: target.com (hardened, WAF, monitored)
                |
                +-- dev.target.com (debug mode, weak auth)
                +-- staging.target.com (outdated code, test data)
                +-- api-internal.target.com (no authentication)
                +-- old.target.com (vulnerable legacy app)
                +-- jenkins.target.com (default credentials)
                +-- backup.target.com (exposed database dumps)

Subdomains often have weaker security than the main domain, making them high-value targets.

2. Passive Subdomain Enumeration

2.1 Certificate Transparency

# crt.sh
curl -s "https://crt.sh/?q=%25.target.com&output=json" | \
  jq -r '.[].name_value' | sed 's/\*\.//g' | sort -u

# certspotter
curl -s "https://api.certspotter.com/v1/issuances?domain=target.com&include_subdomains=true&expand=dns_names" | \
  jq -r '.[].dns_names[]' | sort -u

# Censys certificates
# censys search "parsed.names: target.com" --index-type certificates

2.2 Passive DNS Databases

# SecurityTrails API
curl -s --header "APIKEY: $SECURITYTRAILS_KEY" \
  "https://api.securitytrails.com/v1/domain/target.com/subdomains" | jq '.subdomains[]'

# VirusTotal
curl -s "https://www.virustotal.com/vtapi/v2/domain/report?apikey=$VT_KEY&domain=target.com" | \
  jq '.subdomains[]'

# AlienVault OTX
curl -s "https://otx.alienvault.com/api/v1/indicators/domain/target.com/passive_dns" | \
  jq '.passive_dns[].hostname' | sort -u

# RapidDNS
curl -s "https://rapiddns.io/subdomain/target.com?full=1" | \
  grep -oP '_blank">\K[^<]*' | sort -u

# Hackertarget
curl -s "https://api.hackertarget.com/hostsearch/?q=target.com" | cut -d, -f1

2.3 Multi-Source Tools

# Subfinder (30+ passive sources)
subfinder -d target.com -all -o subfinder.txt
subfinder -d target.com -all -cs -o subfinder_sources.txt  # show sources
subfinder -dL domains.txt -o subfinder_multi.txt  # multiple domains

# Amass passive
amass enum -passive -d target.com -o amass_passive.txt
amass enum -passive -d target.com -src -o amass_sources.txt  # show sources

# Assetfinder
assetfinder --subs-only target.com > assetfinder.txt

# Findomain
findomain -t target.com -u findomain.txt

# Chaos (ProjectDiscovery)
chaos -d target.com -key $CHAOS_KEY -o chaos.txt

# Combine all
cat subfinder.txt amass_passive.txt assetfinder.txt findomain.txt chaos.txt | \
  sort -u > all_passive_subs.txt
echo "[*] Total unique subdomains: $(wc -l < all_passive_subs.txt)"

3. Active Subdomain Bruteforcing

3.1 DNS Bruteforcing

# Gobuster DNS
gobuster dns -d target.com \
  -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-20000.txt \
  -t 50 -o gobuster_dns.txt

# Massdns (fastest; its input must be full hostnames, e.g. generated with
# sed 's/$/.target.com/' prefixes.txt > subdomains_wordlist.txt)
massdns -r resolvers.txt -t A -o S -w massdns_results.txt subdomains_wordlist.txt

# Shuffledns (massdns wrapper with wildcard filtering)
shuffledns -d target.com \
  -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-110000.txt \
  -r resolvers.txt -o shuffledns.txt

# Puredns (wildcard filtering + bruteforce)
puredns bruteforce /usr/share/seclists/Discovery/DNS/subdomains-top1million-110000.txt \
  target.com -r resolvers.txt -w puredns.txt

# Create resolvers list
# Public DNS resolvers for mass DNS resolution
cat > resolvers.txt << 'EOF'
8.8.8.8
8.8.4.4
1.1.1.1
1.0.0.1
9.9.9.9
208.67.222.222
208.67.220.220
EOF

# DNSx - resolve and filter
cat all_subs.txt | dnsx -silent -a -cname -resp > dnsx_resolved.txt

3.2 Wordlists for DNS Bruteforcing

# Best wordlists:
# /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt     (quick)
# /usr/share/seclists/Discovery/DNS/subdomains-top1million-20000.txt    (medium)
# /usr/share/seclists/Discovery/DNS/subdomains-top1million-110000.txt   (thorough)
# /usr/share/seclists/Discovery/DNS/dns-Jhaddix.txt                     (comprehensive)
# /usr/share/seclists/Discovery/DNS/fierce-hostlist.txt
# /usr/share/seclists/Discovery/DNS/namelist.txt
# /usr/share/seclists/Discovery/DNS/deepmagic.com-prefixes-top50000.txt

# Custom wordlist based on discovered subdomains
# Extract patterns from known subdomains
cat known_subs.txt | sed 's/\.target\.com$//' | tr '-' '\n' | tr '.' '\n' | sort -u > custom_words.txt

4. Subdomain Permutation and Alteration

# Altdns - subdomain alteration and permutation
altdns -i known_subs.txt -o altered.txt -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt
cat altered.txt | puredns resolve -r resolvers.txt > resolved_altered.txt

# dnsgen - intelligent permutation
cat known_subs.txt | dnsgen - > dnsgen_perms.txt
cat dnsgen_perms.txt | puredns resolve -r resolvers.txt > resolved_dnsgen.txt

# gotator - generate permutations
gotator -sub known_subs.txt -perm permutation_words.txt -depth 1 -numbers 3 -md > gotator.txt

# Common permutation patterns
# dev-api, api-dev, api-staging, staging-api
# v2-api, api-v2, apiv2
# internal-api, api-internal
# test-app, app-test, app1, app2
# us-east-1, eu-west-1 (regional)
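
The hyphen/concatenation patterns above can be generated mechanically from a seed word. A sketch (the helper name and terms are illustrative):

```shell
# perms: emit prefix-, suffix-, and concatenated permutations of a seed
perms() {
  sub=$1; shift
  for w in "$@"; do
    echo "$w-$sub"
    echo "$sub-$w"
    echo "$sub$w"
  done
}

perms api dev staging v2
```

Resolve the output with puredns or dnsx, as with dnsgen.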

5. Virtual Host Discovery

# Virtual hosts share the same IP but serve different content based on Host header

# ffuf vhost discovery
ffuf -u http://TARGET_IP -H "Host: FUZZ.target.com" \
  -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt \
  -fs 0 -mc all

# gobuster vhost
gobuster vhost -u http://TARGET_IP -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt \
  --append-domain -t 50

# Manual vhost testing
curl -s -H "Host: dev.target.com" http://TARGET_IP
curl -s -H "Host: staging.target.com" http://TARGET_IP
curl -s -H "Host: admin.target.com" http://TARGET_IP

# Filter by response size (exclude default/wildcard responses)
# First, get default response size:
DEFAULT_SIZE=$(curl -s -H "Host: nonexistent12345.target.com" http://TARGET_IP | wc -c)
echo "Default response size: $DEFAULT_SIZE"

# Then filter
ffuf -u http://TARGET_IP -H "Host: FUZZ.target.com" \
  -w wordlist.txt -fs $DEFAULT_SIZE

6. Wildcard DNS Detection and Handling

# Test for wildcard DNS
dig randomnonexistent12345.target.com A +short
# If returns an IP → wildcard DNS is configured

# Wildcard detection script
TARGET=target.com
RANDOM_SUB=$(tr -dc 'a-z' < /dev/urandom | fold -w 20 | head -n 1)
WILDCARD_IP=$(dig +short $RANDOM_SUB.$TARGET A)
if [ -n "$WILDCARD_IP" ]; then
    echo "[!] Wildcard DNS detected: *.$TARGET -> $WILDCARD_IP"
    echo "[*] Filter results by this IP"
fi

# Tools that handle wildcards automatically:
# puredns (built-in wildcard detection)
# shuffledns (built-in wildcard detection)
# massdns + wildcard filtering post-processing
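
The post-processing step can be as simple as dropping result lines that resolve to the known wildcard IP. A sketch against massdns `-o S` style output (sample lines are canned):

```shell
# filter_wildcard: drop "name. A ip" lines whose IP equals the wildcard IP
filter_wildcard() {
  awk -v w="$1" '$3 != w'
}

printf '%s\n' \
  'real.target.com. A 198.51.100.7' \
  'junk.target.com. A 203.0.113.10' | filter_wildcard 203.0.113.10
```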

7. Subdomain Validation and Probing

# Resolve all subdomains
cat all_subs.txt | dnsx -silent -a -resp -o resolved.txt

# HTTP probing - find live web services
cat all_subs.txt | httpx -silent -status-code -title -tech-detect -web-server \
  -content-length -follow-redirects -o httpx_results.txt

# Filter interesting results
cat httpx_results.txt | grep -E "200|301|302|401|403" > live_web.txt
cat httpx_results.txt | grep "401\|403" > auth_required.txt
cat httpx_results.txt | grep -i "admin\|panel\|dashboard\|portal" > admin_panels.txt

# Screenshot all live hosts
gowitness file -f live_subs.txt -P screenshots/
# or
aquatone < live_subs.txt

# Nuclei scan on all subdomains
nuclei -l live_subs.txt -t technologies/ -o tech_detection.txt

8. Subdomain Takeover

# What makes a subdomain vulnerable to takeover:
# 1. CNAME points to external service
# 2. External service account is deleted/unconfigured
# 3. Attacker can claim the service and serve content on the subdomain

# Check for CNAME records
cat all_subs.txt | dnsx -cname -resp -o cnames.txt

# Automated takeover checking
subjack -w all_subs.txt -t 100 -timeout 30 -ssl -c /path/to/fingerprints.json -v -o subjack_results.txt

# Nuclei takeover templates
nuclei -l all_subs.txt -t http/takeovers/ -o takeover_results.txt

# can-i-take-over-xyz reference
# https://github.com/EdOverflow/can-i-take-over-xyz

# Manual verification for common services:
# AWS S3: "NoSuchBucket" error
# GitHub Pages: 404 with GitHub branding
# Heroku: "No such app"
# Azure: "NXDOMAIN" for *.azurewebsites.net
# Shopify: "Sorry, this shop is currently unavailable"
# Fastly: "Fastly error: unknown domain"
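
Manual verification amounts to matching the fetched body against the fingerprints above. A sketch (hypothetical helper; the response bodies are canned):

```shell
# takeover_hint: map an error-page body to a likely unclaimed service
takeover_hint() {
  case $1 in
    *NoSuchBucket*)                    echo "AWS S3 bucket unclaimed" ;;
    *"No such app"*)                   echo "Heroku app unclaimed" ;;
    *"Fastly error: unknown domain"*)  echo "Fastly service unclaimed" ;;
    *"shop is currently unavailable"*) echo "Shopify store unclaimed" ;;
    *)                                 echo "no fingerprint match" ;;
  esac
}

takeover_hint "<Error><Code>NoSuchBucket</Code></Error>"
```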

9. Scope Expansion Techniques

# Find related domains via reverse IP
# All domains on same IP
curl -s "https://api.hackertarget.com/reverseiplookup/?q=93.184.216.34"

# Find domains in same IP range
# whois the IP → find CIDR block → reverse DNS on range

# Find related organizations via ASN
# Lookup ASN
whois -h whois.radb.net -- '-i origin AS12345'
# or
curl -s "https://api.bgpview.io/asn/12345/prefixes"

# Google Analytics / AdSense ID tracking
# If target uses UA-12345678 in Google Analytics
# Search for other sites with same UA ID
# builtwith.com → Relationship Profile

# Find domains by same registrant
# whoxy.com reverse WHOIS

# Favicon hash matching on Shodan (requires the mmh3 Python package:
# pip install mmh3)
curl -s https://target.com/favicon.ico | python3 -c "
import mmh3, sys, codecs
favicon = codecs.encode(sys.stdin.buffer.read(), 'base64')
print(f'http.favicon.hash:{mmh3.hash(favicon)}')"

10. Complete Subdomain Discovery Pipeline

#!/bin/bash
# subdomain_discovery.sh - Complete pipeline
TARGET=$1
OUTDIR="recon/${TARGET}/subdomains"
RESOLVERS="resolvers.txt"
mkdir -p $OUTDIR

echo "=== Subdomain Discovery Pipeline: $TARGET ==="

# Phase 1: Passive collection
echo "[1/6] Passive enumeration..."
subfinder -d $TARGET -all -silent > $OUTDIR/subfinder.txt 2>/dev/null
amass enum -passive -d $TARGET -o $OUTDIR/amass.txt 2>/dev/null
assetfinder --subs-only $TARGET > $OUTDIR/assetfinder.txt 2>/dev/null
findomain -t $TARGET -q > $OUTDIR/findomain.txt 2>/dev/null

# CT logs
curl -s "https://crt.sh/?q=%25.$TARGET&output=json" | \
  jq -r '.[].name_value' 2>/dev/null | sed 's/\*\.//g' | sort -u > $OUTDIR/crt.txt

# Combine passive results
cat $OUTDIR/*.txt | sort -u > $OUTDIR/passive_all.txt
echo "[*] Passive subdomains: $(wc -l < $OUTDIR/passive_all.txt)"

# Phase 2: Active bruteforcing
echo "[2/6] DNS bruteforcing..."
puredns bruteforce /usr/share/seclists/Discovery/DNS/subdomains-top1million-20000.txt \
  $TARGET -r $RESOLVERS -w $OUTDIR/bruteforce.txt 2>/dev/null

# Phase 3: Permutations
echo "[3/6] Generating permutations..."
cat $OUTDIR/passive_all.txt | dnsgen - 2>/dev/null | \
  puredns resolve -r $RESOLVERS 2>/dev/null > $OUTDIR/permutations.txt

# Phase 4: Combine all
echo "[4/6] Combining results..."
cat $OUTDIR/passive_all.txt $OUTDIR/bruteforce.txt $OUTDIR/permutations.txt | \
  sort -u > $OUTDIR/all_subdomains.txt
echo "[*] Total unique subdomains: $(wc -l < $OUTDIR/all_subdomains.txt)"

# Phase 5: Resolve and validate
echo "[5/6] Resolving subdomains..."
cat $OUTDIR/all_subdomains.txt | dnsx -silent -a -resp > $OUTDIR/resolved.txt

# Phase 6: HTTP probing
echo "[6/6] HTTP probing..."
cat $OUTDIR/all_subdomains.txt | httpx -silent -status-code -title -tech-detect \
  -web-server -o $OUTDIR/httpx_results.txt

echo "=== Discovery Complete ==="
echo "[*] Total subdomains: $(wc -l < $OUTDIR/all_subdomains.txt)"
echo "[*] Live web services: $(wc -l < $OUTDIR/httpx_results.txt)"
echo "[*] Results saved to $OUTDIR/"

Task 4
Technology Fingerprinting

1. Why Technology Fingerprinting Matters

Identify Stack → Find Known CVEs → Map Attack Surface → Target Exploits

Example:
Server: Apache 2.4.49 → CVE-2021-41773 (Path Traversal/RCE)
PHP: 8.1.0-dev → Backdoor RCE
WordPress: 5.7.0 → Known plugin vulnerabilities
jQuery: 1.6.1 → XSS via jQuery.html()

2. HTTP Header Analysis

# Get all response headers
curl -sI https://target.com

# Key headers for fingerprinting:
# Server: Apache/2.4.51 (Ubuntu)
# X-Powered-By: PHP/8.1.2
# X-AspNet-Version: 4.0.30319
# X-AspNetMvc-Version: 5.2
# X-Generator: Drupal 9
# X-Drupal-Cache: HIT
# X-Drupal-Dynamic-Cache: MISS
# X-WordPress: true
# X-Pingback: xmlrpc.php (WordPress indicator)
# X-Shopify-Stage: production
# X-Wix-Request-Id: ... (Wix)
# X-GitHub-Request-Id: ... (GitHub Pages)

# Verbose curl output
curl -v https://target.com 2>&1 | grep -iE "(< server|< x-powered|< x-asp|< x-gen|< set-cookie|< x-frame)"

# Check multiple pages
for path in / /login /admin /api /robots.txt; do
    echo "=== $path ===" 
    curl -sI "https://target.com$path" | grep -iE "(server|x-powered|x-asp|x-gen)"
done

# Cookie-based fingerprinting
# PHPSESSID → PHP
# JSESSIONID → Java (Tomcat/JBoss/etc.)
# ASP.NET_SessionId → ASP.NET
# laravel_session → Laravel
# _<appname>_session → Ruby on Rails (the cookie name includes the app name)
# connect.sid → Node.js Express
# csrftoken + sessionid → Django
# ci_session → CodeIgniter
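
The cookie-name mapping above can be turned into a quick lookup when triaging `Set-Cookie` headers. A sketch (the helper name is made up):

```shell
# cookie_tech: map a session cookie name to its likely backend technology
cookie_tech() {
  case $1 in
    PHPSESSID*)         echo "PHP" ;;
    JSESSIONID*)        echo "Java" ;;
    ASP.NET_SessionId*) echo "ASP.NET" ;;
    laravel_session*)   echo "Laravel" ;;
    connect.sid*)       echo "Node.js Express" ;;
    ci_session*)        echo "CodeIgniter" ;;
    *)                  echo "unknown" ;;
  esac
}

cookie_tech "JSESSIONID=1A2B3C"
```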

3. Automated Fingerprinting Tools

# WhatWeb - comprehensive web fingerprinting
whatweb https://target.com
whatweb -a 3 https://target.com  # aggressive mode
whatweb -v https://target.com    # verbose
whatweb --input-file=urls.txt -o results.json --log-json  # batch

# Wappalyzer CLI
# npm install -g wappalyzer
wappalyzer https://target.com

# httpx with tech detection
echo target.com | httpx -tech-detect -status-code -title -web-server -silent

# Batch tech detection
cat live_hosts.txt | httpx -tech-detect -status-code -title -web-server -silent -o tech_results.txt

# webanalyze (Go-based Wappalyzer)
webanalyze -host https://target.com -crawl 2

# Nuclei technology detection
nuclei -u https://target.com -t technologies/
nuclei -l live_hosts.txt -t technologies/ -o tech_nuclei.txt

# Retire.js - find vulnerable JavaScript libraries
retire --js --jspath /path/to/js/files
retire --node --path /path/to/node/project

4. CMS Detection

# WordPress detection
# Indicators:
# /wp-content/, /wp-includes/, /wp-admin/
# /xmlrpc.php, /wp-login.php
# <meta name="generator" content="WordPress 6.x">
# /wp-json/wp/v2/

curl -s https://target.com | grep -i "wp-content\|wordpress"
curl -s https://target.com/wp-json/wp/v2/ | head -5
curl -s https://target.com/readme.html | grep -i version

# WPScan - comprehensive WordPress scanner
wpscan --url https://target.com --enumerate ap,at,u,dbe
wpscan --url https://target.com --api-token $WPSCAN_TOKEN -e vp,vt

# Drupal detection
# /core/, /modules/, /themes/, /sites/default/
# CHANGELOG.txt, /core/CHANGELOG.txt
# X-Generator: Drupal
curl -s https://target.com/CHANGELOG.txt | head -5
curl -s https://target.com/core/CHANGELOG.txt | head -5

# Droopescan
droopescan scan drupal -u https://target.com

# Joomla detection
# /administrator/, /components/, /modules/
# /configuration.php~, /README.txt
# <meta name="generator" content="Joomla!">
curl -s https://target.com | grep -i "joomla"
curl -s https://target.com/administrator/manifests/files/joomla.xml | grep version

# JoomScan
joomscan -u https://target.com

# Magento
# /skin/, /js/, /app/, /var/
# /downloader/, /admin/
magescan scan:all https://target.com

# SharePoint
# /_layouts/, /_vti_bin/
# /_api/web
curl -s https://target.com/_api/web | head -5

5. JavaScript Framework Detection

# Check page source for framework indicators
curl -s https://target.com | grep -oiE "(react|angular|vue|next|nuxt|svelte|ember|backbone)"

# React indicators
# data-reactroot, data-reactid
# __NEXT_DATA__ (Next.js)
# react-app (class names)
# /_next/ paths (Next.js)

# Angular indicators
# ng-version, ng-app, ng-controller
# <app-root>, angular.json
# /assets/ structure

# Vue.js indicators
# data-v-xxxx attributes
# __vue__, v-bind, v-if, v-for
# /static/js/chunk-vendors (Vue CLI)
# /_nuxt/ (Nuxt.js)
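
These markers can be checked in a single pass over a fetched page body. A rough classifier (sketch; patterns taken from the indicator lists above, with the more specific meta-frameworks checked before their bases):

```shell
# Classify a page body by front-end framework markers.
detect_frontend() {
  html="$1"
  case "$html" in *__NEXT_DATA__*|*"/_next/"*)         echo "Next.js (React)"; return ;; esac
  case "$html" in *data-reactroot*|*data-reactid*)     echo "React"; return ;; esac
  case "$html" in *"/_nuxt/"*)                         echo "Nuxt.js (Vue)"; return ;; esac
  case "$html" in *data-v-*|*__vue__*)                 echo "Vue.js"; return ;; esac
  case "$html" in *ng-version*|*ng-app*|*"<app-root"*) echo "Angular"; return ;; esac
  echo "unknown"
}

# detect_frontend "$(curl -s https://target.com)"
detect_frontend '<div id="app" data-v-7ba5bd90></div>'
```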

# Check JavaScript files
curl -s https://target.com | grep -oP 'src="[^"]*\.js"' | head -20

# Analyze bundled JavaScript
curl -s https://target.com/static/js/main.xxxxx.js | grep -oE "(React|Angular|Vue|jQuery)" | sort -u

# Source map discovery (development builds)
# The reference is usually a trailing comment in the JS body,
# sometimes a SourceMap/X-SourceMap response header
curl -s https://target.com/static/js/main.js | grep -o "sourceMappingURL=[^ ]*"
curl -sI https://target.com/static/js/main.js | grep -iE "(x-)?sourcemap"
curl -s https://target.com/static/js/main.js.map

# Webpack bundle analyzer
# Look for /static/js/*.chunk.js patterns
# __webpack_require__ in JS files

6. Server-Side Technology Detection

# PHP detection
curl -sI https://target.com/index.php
curl -s https://target.com/phpinfo.php  # common misconfiguration
# X-Powered-By: PHP/x.x
# PHPSESSID cookie

# Java/Spring detection
# JSESSIONID cookie
# /actuator endpoints (Spring Boot)
curl -s https://target.com/actuator/health
curl -s https://target.com/actuator/env
curl -s https://target.com/actuator/info

# ASP.NET detection
# .aspx, .ashx, .asmx extensions
# ASP.NET_SessionId cookie
# X-AspNet-Version header
# __VIEWSTATE in form

# Python/Django detection
# csrftoken + sessionid cookies
# /admin/ (Django admin)
# "csrfmiddlewaretoken" in forms

# Python/Flask detection
# "session" cookie (signed)
# Werkzeug debugger: /console
curl -s https://target.com/console

# Ruby on Rails detection
# _rails_session cookie
# X-Request-Id header
# /assets/ pipeline
# "authenticity_token" in forms

# Node.js/Express detection
# connect.sid cookie
# X-Powered-By: Express
# ETag format differences

# Go detection
# No session by default
# Specific error page formats
# Chi/Gin/Echo framework headers

7. Web Server Fingerprinting

# Apache version and modules
curl -sI https://target.com | grep Server
# Apache specifics: /server-status, /server-info
curl -s https://target.com/server-status
curl -s https://target.com/server-info

# Nginx
# Server: nginx/1.x
# /nginx_status
curl -s https://target.com/nginx_status

# IIS
# Server: Microsoft-IIS/10.0
# /_vti_bin/, /_vti_inf.html
# aspnet_client folder

# LiteSpeed
# Server: LiteSpeed

# Tomcat
# Server: Apache-Coyote/1.1
# /manager/html (admin)
# /host-manager/html
# /status (server status)

# HTTP methods testing
curl -s -X OPTIONS -D - -o /dev/null https://target.com
nmap --script http-methods -p 80,443 target.com

# 404 page analysis (different servers return different formats)
curl -s https://target.com/nonexistent_page_12345 | head -20

8. WAF/CDN Detection

# wafw00f - WAF detection
wafw00f https://target.com
wafw00f -a https://target.com  # test all WAFs

# Common WAF indicators:

# Cloudflare
# Server: cloudflare
# cf-ray header, __cf_bm / cf_clearance cookies (__cfduid is retired)
# Error page: "Attention Required! | Cloudflare"

# AWS WAF
# x-amzn-requestid header
# 403 with "Request blocked" message

# Akamai
# Server: AkamaiGHost
# X-Akamai-Transformed header

# Imperva/Incapsula
# X-CDN: Imperva
# visid_incap cookie
# incap_ses cookie

# Sucuri
# Server: Sucuri/Cloudproxy
# X-Sucuri-ID header

# ModSecurity
# Server: Apache with mod_security
# 403 Forbidden with ModSecurity message

# F5 BIG-IP
# Server: BigIP
# BIGipServer cookie
# Persistence cookie: BIGipServer~pool_name

# Manual WAF detection
# Manual WAF detection (payloads URL-encoded so curl sends a valid request)
curl -s "https://target.com/?id=1%27%20OR%201%3D1--" -o /dev/null -w "%{http_code}\n"
curl -s "https://target.com/?q=%3Cscript%3Ealert(1)%3C%2Fscript%3E" -o /dev/null -w "%{http_code}\n"
# 403/406/429 → WAF likely present
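
The status-code comparison can be wrapped in a tiny verdict helper (a sketch; the thresholds are the 403/406/429 heuristics above — compare a benign baseline request against an attack-shaped one):

```shell
# A jump from the baseline status to 403/406/429 suggests a WAF is filtering.
waf_verdict() {
  benign="$1"; attack="$2"
  if [ "$benign" = "$attack" ]; then
    echo "no filtering observed"
  elif [ "$attack" = "403" ] || [ "$attack" = "406" ] || [ "$attack" = "429" ]; then
    echo "WAF likely (benign=$benign, attack=$attack)"
  else
    echo "inconclusive (benign=$benign, attack=$attack)"
  fi
}

# b=$(curl -s -o /dev/null -w '%{http_code}' "https://target.com/?id=1")
# a=$(curl -s -o /dev/null -w '%{http_code}' "https://target.com/?id=1%27%20OR%201%3D1--")
# waf_verdict "$b" "$a"
waf_verdict 200 403
```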

9. Version-Specific Vulnerability Mapping

# Once technologies are identified, search for CVEs

# searchsploit (ExploitDB local)
searchsploit apache 2.4.49
searchsploit wordpress 5.7
searchsploit jquery 1.6

# Vulners API
curl -s "https://vulners.com/api/v3/burp/software/?software=apache&version=2.4.49&type=httpd"

# NVD search
# https://nvd.nist.gov/vuln/search?query=apache+2.4.49

# GitHub Advisory Database
# https://github.com/advisories?query=apache+2.4.49

# Nuclei CVE scanning
nuclei -u https://target.com -t cves/
nuclei -u https://target.com -t cves/2023/ -t cves/2024/

# Map identified technologies to known vulns
# Create a table:
# | Technology    | Version | CVEs          | Severity |
# |--------------|---------|---------------|----------|
# | Apache       | 2.4.49  | CVE-2021-41773| Critical |
# | PHP          | 7.4.3   | CVE-2024-...  | High     |
# | jQuery       | 1.6.1   | CVE-2020-...  | Medium   |
# | WordPress    | 5.7.0   | CVE-2021-...  | High     |
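
Such an inventory feeds directly into searchsploit. A sketch — the `tech.txt` format (name and version per line) is an assumption for illustration, and the real lookup is left commented for boxes with exploitdb installed:

```shell
# Turn an identified technology inventory into searchsploit lookups.
printf '%s\n' "apache 2.4.49" "wordpress 5.7" "jquery 1.6" > tech.txt

while read -r name version; do
  echo "searchsploit $name $version"
  # searchsploit "$name" "$version"   # run for real where exploitdb is installed
done < tech.txt
```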

10. Complete Technology Fingerprinting Script

#!/bin/bash
# tech_fingerprint.sh
TARGET=$1
OUTDIR="recon/${TARGET}/tech"
mkdir -p $OUTDIR

echo "=== Technology Fingerprinting: $TARGET ==="

# HTTP Headers
echo "[*] Analyzing HTTP headers..."
curl -sI "https://$TARGET" > $OUTDIR/headers.txt
curl -sI "http://$TARGET" >> $OUTDIR/headers.txt 2>/dev/null

# Extract key headers
grep -iE "(server|x-powered|x-asp|x-generator|x-drupal|x-frame|x-xss|set-cookie|content-security)" \
  $OUTDIR/headers.txt > $OUTDIR/key_headers.txt

# WhatWeb scan
echo "[*] Running WhatWeb..."
whatweb -a 3 "https://$TARGET" > $OUTDIR/whatweb.txt 2>/dev/null

# Check common CMS paths
echo "[*] Checking CMS indicators..."
for path in /wp-login.php /wp-json/wp/v2/ /administrator/ /user/login /wp-content/ \
            /xmlrpc.php /api /graphql /actuator/health /console /server-status; do
    code=$(curl -s -o /dev/null -w "%{http_code}" "https://$TARGET$path" 2>/dev/null)
    if [ "$code" != "404" ] && [ "$code" != "000" ]; then
        echo "[+] $path [$code]" >> $OUTDIR/cms_paths.txt
    fi
done

# Check robots.txt and sitemap
echo "[*] Checking robots.txt and sitemap..."
curl -s "https://$TARGET/robots.txt" > $OUTDIR/robots.txt 2>/dev/null
curl -s "https://$TARGET/sitemap.xml" > $OUTDIR/sitemap.xml 2>/dev/null

# JavaScript library detection
echo "[*] Analyzing JavaScript libraries..."
curl -s "https://$TARGET" | grep -oP 'src="[^"]*\.js[^"]*"' > $OUTDIR/js_files.txt

# WAF detection
echo "[*] Checking for WAF..."
wafw00f "https://$TARGET" > $OUTDIR/waf.txt 2>/dev/null

echo "[*] Fingerprinting complete. Results in $OUTDIR/"
Task 5
Google Dorking and OSINT

1. Google Dorking Operators

| Operator   | Description          | Example                   |
|------------|----------------------|---------------------------|
| site:      | Restrict to domain   | site:target.com           |
| inurl:     | URL contains string  | inurl:admin               |
| intitle:   | Page title contains  | intitle:"login page"      |
| intext:    | Page body contains   | intext:"password"         |
| filetype:  | Specific file type   | filetype:pdf              |
| ext:       | File extension       | ext:sql                   |
| cache:     | Cached version       | cache:target.com          |
| link:      | Pages linking to URL | link:target.com           |
| related:   | Related sites        | related:target.com        |
| info:      | Site information     | info:target.com           |
| define:    | Definitions          | define:sql injection      |
| numrange:  | Number range         | numrange:1000-2000        |
| daterange: | Date range           | daterange:2457388-2457491 |
| OR / \|    | Either term          | admin OR login            |
| AND        | Both terms           | admin AND password        |
| -          | Exclude              | site:target.com -www      |
| " "        | Exact match          | "index of"                |
| *          | Wildcard             | admin*.target.com         |
| ..         | Number range         | 1..100                    |
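
Dorks can be turned into clickable search URLs in bulk. A minimal builder — only spaces and double quotes are percent-encoded here, which covers the dorks in this section; a fuller encoder would handle more characters:

```shell
# Build Google search URLs from raw dorks.
dork_url() {
  q=$(printf '%s' "$1" | sed -e 's/ /+/g' -e 's/"/%22/g')
  echo "https://www.google.com/search?q=$q"
}

dork_url 'site:target.com ext:sql'
dork_url 'site:target.com intitle:"index of"'
```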

2. Sensitive File Discovery

# Configuration files
site:target.com ext:xml | ext:conf | ext:cfg | ext:ini | ext:env | ext:yml | ext:yaml | ext:toml
site:target.com filetype:env
site:target.com filetype:yml "password"

# Database files
site:target.com ext:sql | ext:db | ext:sqlite | ext:mdb
site:target.com filetype:sql "INSERT INTO"
site:target.com filetype:sql "CREATE TABLE"

# Backup files
site:target.com ext:bak | ext:backup | ext:old | ext:temp | ext:swp | ext:save
site:target.com filetype:zip | filetype:tar | filetype:gz | filetype:rar
site:target.com inurl:backup

# Log files
site:target.com ext:log
site:target.com filetype:log "error" | "warning" | "fatal"
site:target.com filetype:log "password" | "user"

# Source code
site:target.com ext:php | ext:asp | ext:aspx | ext:py | ext:rb | ext:java
site:target.com ext:git
site:target.com inurl:.git

# Credentials and secrets
site:target.com "password" | "passwd" | "pwd" filetype:txt
site:target.com "api_key" | "apikey" | "api-key" | "secret_key"
site:target.com "aws_access_key_id" | "aws_secret_access_key"
site:target.com "BEGIN RSA PRIVATE KEY" | "BEGIN OPENSSH PRIVATE KEY"
site:target.com "jdbc:" | "mongodb://" | "mysql://" | "postgresql://"
site:target.com filetype:pem | filetype:key | filetype:ppk

3. Admin Panel and Login Page Discovery

# Admin panels
site:target.com inurl:admin
site:target.com inurl:administrator
site:target.com inurl:admin/login
site:target.com inurl:cpanel
site:target.com inurl:webmin
site:target.com intitle:"admin" intitle:"login"
site:target.com intitle:"dashboard" inurl:admin
site:target.com inurl:wp-admin
site:target.com inurl:administrator/index.php

# Login pages
site:target.com inurl:login | inurl:signin | inurl:auth
site:target.com intitle:"login" | intitle:"sign in"
site:target.com inurl:user/login
site:target.com inurl:account/login

# Registration pages
site:target.com inurl:register | inurl:signup | inurl:join
site:target.com intitle:"register" | intitle:"sign up" | intitle:"create account"

# Password reset
site:target.com inurl:forgot | inurl:reset | inurl:recover
site:target.com intitle:"forgot password" | intitle:"reset password"

4. Vulnerability Discovery Dorks

# SQL Injection indicators
site:target.com inurl:id= | inurl:pid= | inurl:category= | inurl:item=
site:target.com inurl:".php?id="
site:target.com "sql syntax" | "mysql_fetch" | "unclosed quotation"
site:target.com "ORA-" | "Oracle error"
site:target.com "Microsoft OLE DB Provider"
site:target.com "PostgreSQL query failed"
site:target.com "Warning: mysql_" | "Warning: pg_"

# XSS indicators
site:target.com inurl:q= | inurl:search= | inurl:query= | inurl:keyword=
site:target.com inurl:redirect= | inurl:url= | inurl:return= | inurl:next=

# Directory listing
site:target.com intitle:"index of"
site:target.com intitle:"directory listing"
site:target.com intitle:"parent directory"

# Error messages
site:target.com "Fatal error" | "Parse error" | "Syntax error"
site:target.com "stack trace" | "traceback" | "debugging"
site:target.com "Warning:" "on line"
site:target.com intext:"Exception in thread"

# Exposed services
site:target.com intitle:"phpMyAdmin"
site:target.com intitle:"Adminer"
site:target.com intitle:"pgAdmin"
site:target.com inurl:phpmyadmin
site:target.com intitle:"Kibana"
site:target.com intitle:"Grafana"
site:target.com intitle:"Jenkins"
site:target.com intitle:"GitLab"

5. API and Documentation Discovery

# API documentation
site:target.com inurl:swagger | inurl:api-docs | inurl:openapi
site:target.com filetype:json "openapi" | "swagger"
site:target.com intitle:"Swagger UI"
site:target.com inurl:graphql | inurl:graphiql
site:target.com inurl:api/v1 | inurl:api/v2 | inurl:api/v3
site:target.com filetype:yaml "paths:" "info:"
site:target.com inurl:apidocs | inurl:api-reference

# Development/staging environments
site:target.com inurl:dev | inurl:staging | inurl:test | inurl:uat
site:target.com inurl:beta | inurl:alpha | inurl:sandbox
site:*.dev.target.com
site:*.staging.target.com
site:*.test.target.com

# Internal documentation
site:target.com filetype:pdf "internal" | "confidential" | "draft"
site:target.com filetype:doc | filetype:docx "internal use only"
site:target.com filetype:xlsx "employee" | "salary" | "password"

6. GitHub OSINT

# GitHub Dorking - search for secrets in code
# Organization-wide search
# org:target-org password
# org:target-org secret
# org:target-org api_key
# org:target-org token
# org:target-org AWS_ACCESS_KEY
# org:target-org private_key
# org:target-org credentials

# Specific file searches
# org:target-org filename:.env
# org:target-org filename:wp-config.php
# org:target-org filename:configuration.php
# org:target-org filename:config.py
# org:target-org filename:.htpasswd
# org:target-org filename:id_rsa
# org:target-org filename:shadow
# org:target-org filename:credentials
# org:target-org filename:docker-compose.yml

# Extension searches
# org:target-org extension:pem
# org:target-org extension:key
# org:target-org extension:env
# org:target-org extension:sql

# Automated GitHub dorking tools
# GitDorker
python3 GitDorker.py -t $GITHUB_TOKEN -org target-org -d dorks/medium_dorks.txt

# truffleHog - find secrets in git history
trufflehog github --org=target-org --json > trufflehog_results.json

# gitleaks
gitleaks detect --source=/path/to/repo --report-path=gitleaks_report.json

# git-secrets
git secrets --scan

# shhgit - find secrets in real-time
shhgit --search-query "target.com"

7. Leaked Credentials and Data

# Check breach databases (authorized use only)
# Have I Been Pwned API
curl -s "https://haveibeenpwned.com/api/v3/breachedaccount/user@target.com" \
  -H "hibp-api-key: $HIBP_KEY"

# DeHashed
# https://dehashed.com - search by domain, email, username

# IntelligenceX
# https://intelx.io - comprehensive search

# Leaked paste search
# Pastebin, GitHub Gists, other paste sites
# Search: "target.com" on IntelligenceX phonebook

# Credential stuffing wordlists
# Use discovered emails with common password patterns

# Check for password patterns
# company123, Company2024!, Target@123
# Season+Year: Summer2024!, Winter2023
# Month+Year: January2024!
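
These patterns expand mechanically into a spray list. A sketch ("Target" stands in for the company name; authorized engagements only):

```shell
# Generate season/month/company + year candidates from the patterns above.
company="Target"
for year in 2023 2024; do
  for word in Spring Summer Autumn Winter January "$company"; do
    printf '%s\n' "${word}${year}" "${word}${year}!" "${word}@${year}"
  done
done | sort -u > season_passwords.txt

wc -l < season_passwords.txt
```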

8. Social Media OSINT

# LinkedIn reconnaissance
# Search: "target company" employees
# Extract: names, roles, technologies mentioned
# Job postings reveal: tech stack, internal tools, security measures

# Name-to-email generation (common corporate formats)
# first.last@target.com
# flast@target.com
# f.last@target.com
# first@target.com
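
A generator for these address formats (bash sketch; the sample name and target.com are placeholders):

```shell
# Expand a "First Last" name into common corporate address formats.
emailify() {
  first=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  last=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  initial=$(printf '%.1s' "$first")    # first initial, e.g. j
  printf '%s\n' \
    "${first}.${last}@target.com" \
    "${initial}${last}@target.com" \
    "${first}${last}@target.com" \
    "${first}@target.com"
}

emailify John Smith
```

Feed a LinkedIn-scraped name list through this to build a username file for verification.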

# Tools:
# linkedin2username
python3 linkedin2username.py -c "Target Company" -d target.com

# CrossLinked
crosslinked -f '{first}.{last}@target.com' -t 'Target Company' -j 2

# Email verification
# emailhippo.com
# hunter.io/email-verifier

# Twitter/X OSINT
# Search for @target_company
# Monitor employee tweets about technology
# Search for leaked info: "target.com" "password"

# Instagram, Facebook (company pages)
# Employee posts revealing office setup, badge designs, etc.

9. OSINT Frameworks and Tools

# theHarvester - multi-source OSINT
theHarvester -d target.com -b all -l 500

# Sources: google, bing, linkedin, twitter, virustotal,
# certspotter, crtsh, rapiddns, sublist3r, etc.

# Recon-ng - OSINT framework
recon-ng
# [recon-ng][default] > marketplace install all
# [recon-ng][default] > workspaces create target
# [recon-ng][target] > modules search
# [recon-ng][target] > modules load recon/domains-hosts/google_site_web
# [recon-ng][target] > options set SOURCE target.com
# [recon-ng][target] > run

# SpiderFoot - automated OSINT
spiderfoot -s target.com -u all -o json > spiderfoot.json
# Web UI: spiderfoot -l 127.0.0.1:5001

# Maltego - visual link analysis
# Community edition available
# Transform-based OSINT

# OSINT Framework
# https://osintframework.com - comprehensive tool directory

# Shodan
shodan search "hostname:target.com"
shodan search "org:\"Target Organization\""
shodan search "ssl.cert.subject.cn:target.com"

# Censys
# https://search.censys.io

10. Automated OSINT Pipeline

#!/bin/bash
# osint_pipeline.sh
TARGET=$1
OUTDIR="recon/${TARGET}/osint"
mkdir -p $OUTDIR

echo "=== OSINT Pipeline: $TARGET ==="

# theHarvester
echo "[*] Running theHarvester..."
theHarvester -d $TARGET -b google,bing,crtsh,virustotal,rapiddns -l 500 \
  -f $OUTDIR/harvester 2>/dev/null

# Check for exposed git repos
echo "[*] Checking for exposed .git..."
for sub in $(cat recon/$TARGET/subdomains/all_subdomains.txt 2>/dev/null); do
    code=$(curl -s -o /dev/null -w "%{http_code}" "https://$sub/.git/config" 2>/dev/null)
    if [ "$code" = "200" ]; then
        echo "[+] Exposed .git found: https://$sub/.git/" >> $OUTDIR/exposed_git.txt
    fi
done

# Check for sensitive files
echo "[*] Checking for sensitive files..."
SENSITIVE_PATHS=".env .env.bak wp-config.php.bak .htpasswd .DS_Store
 config.php.bak web.config.bak phpinfo.php info.php server-status
 elmah.axd trace.axd .svn/entries crossdomain.xml"
for path in $SENSITIVE_PATHS; do
    code=$(curl -s -o /dev/null -w "%{http_code}" "https://$TARGET/$path" 2>/dev/null)
    if [ "$code" = "200" ]; then
        echo "[+] Found: https://$TARGET/$path" >> $OUTDIR/sensitive_files.txt
    fi
done

# robots.txt and sitemap analysis
echo "[*] Analyzing robots.txt and sitemap..."
curl -s "https://$TARGET/robots.txt" > $OUTDIR/robots.txt 2>/dev/null
curl -s "https://$TARGET/sitemap.xml" > $OUTDIR/sitemap.xml 2>/dev/null

# Extract disallowed paths from robots.txt
grep "Disallow:" $OUTDIR/robots.txt 2>/dev/null | awk '{print $2}' > $OUTDIR/disallowed_paths.txt

echo "[*] OSINT pipeline complete. Results in $OUTDIR/"
Task 6
Wayback Machine and Historical Analysis

1. Web Archive Sources

# Wayback Machine (archive.org)
# The largest web archive with billions of snapshots

# waybackurls - fetch URLs from Wayback Machine
waybackurls target.com > wayback_urls.txt
waybackurls target.com | sort -u | tee wayback_unique.txt

# gau (GetAllUrls) - multiple sources
# Sources: Wayback Machine, Common Crawl, AlienVault OTX, URLScan
gau target.com > gau_urls.txt
gau --subs target.com > gau_with_subs.txt
gau --providers wayback,commoncrawl,otx,urlscan target.com > gau_all.txt

# waymore - comprehensive URL collection
waymore -i target.com -mode U -oU waymore_urls.txt

# Combine all sources
cat wayback_urls.txt gau_urls.txt waymore_urls.txt | sort -u > all_historical_urls.txt
echo "[*] Total unique historical URLs: $(wc -l < all_historical_urls.txt)"

2. URL Filtering and Analysis

# Filter by file extension
cat all_historical_urls.txt | grep -iE "\.(php|asp|aspx|jsp|jspx|do|action)(\?|$)" > dynamic_pages.txt
cat all_historical_urls.txt | grep -iE "\.(js|json|xml|yaml|yml|config|env|bak|sql|log)" > sensitive_files.txt
cat all_historical_urls.txt | grep -iE "\.(pdf|doc|docx|xls|xlsx|ppt|pptx|csv)" > documents.txt
cat all_historical_urls.txt | grep -iE "\.(zip|tar|gz|rar|7z|bak|backup|old)" > archives.txt

# Filter by patterns
cat all_historical_urls.txt | grep -iE "(admin|login|dashboard|panel|portal|manage)" > admin_urls.txt
cat all_historical_urls.txt | grep -iE "(api|graphql|rest|v1|v2|v3|endpoint)" > api_urls.txt
cat all_historical_urls.txt | grep -iE "(upload|file|download|export|import)" > file_handling.txt
cat all_historical_urls.txt | grep -iE "(debug|test|dev|staging|internal)" > dev_urls.txt
cat all_historical_urls.txt | grep -iE "(config|setup|install|phpinfo|server-status)" > config_urls.txt

# Extract parameters
cat all_historical_urls.txt | grep "?" | sort -u > parameterized.txt
cat parameterized.txt | grep -oP '[?&]\K[^=&]+' | sort | uniq -c | sort -rn > param_names.txt

# uro - URL deduplication and optimization
cat all_historical_urls.txt | uro > optimized_urls.txt

# unfurl - parse and extract URL components
cat all_historical_urls.txt | unfurl --unique domains > domains.txt
cat all_historical_urls.txt | unfurl --unique paths > paths.txt
cat all_historical_urls.txt | unfurl --unique keys > parameter_keys.txt

3. Discovering Removed/Hidden Content

# Wayback Machine CDX API
# Fetch all snapshots for a URL
curl -s "https://web.archive.org/cdx/search/cdx?url=target.com/*&output=json&fl=timestamp,original,statuscode,mimetype&collapse=urlkey" | \
  jq -r '.[] | @tsv' > cdx_results.tsv

# Find pages that existed before but return 404 now
while read url; do
    current_code=$(curl -s -o /dev/null -w "%{http_code}" "$url" 2>/dev/null)
    if [ "$current_code" = "404" ]; then
        echo "[REMOVED] $url" >> removed_pages.txt
    fi
done < dynamic_pages.txt

# View historical snapshots
# https://web.archive.org/web/2020*/target.com
# https://web.archive.org/web/20200101000000*/target.com/admin

# Download historical version
curl -sL "https://web.archive.org/web/2020/https://target.com/admin" > historical_admin.html

# Find old API endpoints
cat api_urls.txt | while read url; do
    current_code=$(curl -s -o /dev/null -w "%{http_code}" "$url" 2>/dev/null)
    echo "$current_code $url"
done > api_status_check.txt

# Look for removed functionality
# Old registration pages, admin panels, debug endpoints
# These may still be accessible but hidden from navigation

4. JavaScript File Analysis from Archives

# Extract JavaScript URLs
cat all_historical_urls.txt | grep -iE "\.js(\?|$)" | sort -u > js_files.txt

# Download current versions
mkdir -p js_current
while read js_url; do
    filename=$(echo "$js_url" | md5sum | cut -d' ' -f1).js
    curl -s "$js_url" -o "js_current/$filename" 2>/dev/null
done < js_files.txt

# Compare with archived versions
# Look for:
# - Removed API endpoints
# - Changed authentication logic
# - Removed debug code
# - Hidden admin functionality
# - API keys/tokens in old versions
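
The comparison itself reduces to a set difference over extracted endpoint strings. A sketch on sample data — the Wayback fetch for the old copy is left commented (note `-L` to follow the redirect to the nearest snapshot):

```shell
# Report /api/ endpoints present only in the old bundle: functionality that
# was removed from the UI but may still be routable server-side.
removed_endpoints() {
  grep -oE '/api/[A-Za-z0-9_/.-]+' "$1" | sort -u > old_eps.txt
  grep -oE '/api/[A-Za-z0-9_/.-]+' "$2" | sort -u > new_eps.txt
  comm -23 old_eps.txt new_eps.txt
}

# curl -sL "https://web.archive.org/web/2022/https://target.com/static/js/app.js" -o old.js
# curl -s  "https://target.com/static/js/app.js" -o new.js
printf 'fetch("/api/v1/debug")\nfetch("/api/v1/users")\n' > old.js
printf 'fetch("/api/v1/users")\n' > new.js
removed_endpoints old.js new.js
```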

# LinkFinder - extract endpoints from JS
linkfinder -i https://target.com -o cli > linkfinder_results.txt
linkfinder -i https://target.com -d -o linkfinder_output.html

# SecretFinder - find secrets in JS
SecretFinder -i https://target.com -o cli > secrets_in_js.txt

# Analyze JS for sensitive data
cat js_current/*.js | grep -oiE "(api[_-]?key|api[_-]?secret|token|password|secret|credential|auth)['\"][:\s]*['\"][^'\"]{8,}" > js_secrets.txt

# Find API endpoints in JS
cat js_current/*.js | grep -oiE "(https?://[^\s'\"]+|/api/[^\s'\"]+|/v[0-9]/[^\s'\"]+)" | sort -u > js_endpoints.txt

5. Source Code Recovery from Archives

# Common source code leaks in archives (-L follows the Wayback redirect
# to the nearest snapshot)
# .git directory exposure
curl -sL "https://web.archive.org/web/2020/https://target.com/.git/config"
curl -sL "https://web.archive.org/web/2020/https://target.com/.git/HEAD"

# .svn directory
curl -sL "https://web.archive.org/web/2020/https://target.com/.svn/entries"

# .env file
curl -sL "https://web.archive.org/web/2020/https://target.com/.env"

# Configuration files
curl -sL "https://web.archive.org/web/2020/https://target.com/web.config"
curl -sL "https://web.archive.org/web/2020/https://target.com/wp-config.php"

# Check multiple timestamps for each sensitive file
SENSITIVE_FILES=".env .git/config wp-config.php web.config .htaccess 
 config.php database.yml settings.py"

for file in $SENSITIVE_FILES; do
    echo "=== $file ===" >> archived_configs.txt
    curl -s "https://web.archive.org/cdx/search/cdx?url=target.com/$file&output=json" >> archived_configs.txt
done

6. Technology Change Tracking

# Track technology changes over time
# Useful for understanding the evolution of the target

# Check old versions of the site
# https://web.archive.org/web/20200101/target.com
# https://web.archive.org/web/20210101/target.com
# https://web.archive.org/web/20220101/target.com

# Extract technology indicators from each snapshot
for year in 2019 2020 2021 2022 2023 2024; do
    echo "=== $year ===" >> tech_timeline.txt
    content=$(curl -sL "https://web.archive.org/web/${year}0601/https://target.com" 2>/dev/null)
    echo "$content" | grep -oiE "(wordpress|drupal|joomla|react|angular|vue|jquery|bootstrap|laravel|django|rails|express|next|nuxt)" | sort -u >> tech_timeline.txt
done

# Track subdomain changes
# Compare historical subdomain lists with current
# New subdomains = potential new attack surface
# Removed subdomains = potential takeover candidates
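
The delta is a two-way `comm` over sorted lists. A sketch on sample data (swap in your own recon output files):

```shell
# Gone names = takeover candidates; new names = fresh attack surface.
printf 'api.target.com\nold.target.com\n' | sort > subs_2022.txt
printf 'api.target.com\nnew.target.com\n' | sort > subs_now.txt

echo "[gone]"; comm -23 subs_2022.txt subs_now.txt
echo "[new]";  comm -13 subs_2022.txt subs_now.txt
```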

7. Sitemap and Robots.txt History

# Historical robots.txt (-L follows the redirect to the nearest snapshot)
curl -sL "https://web.archive.org/web/2020/https://target.com/robots.txt"
curl -sL "https://web.archive.org/web/2021/https://target.com/robots.txt"
curl -sL "https://web.archive.org/web/2022/https://target.com/robots.txt"

# Compare robots.txt versions
# Removed Disallow entries might reveal hidden paths
# that were previously blocked
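
Concretely, the version comparison is another set difference over Disallow paths. A sketch using sample data in place of the archived files fetched above:

```shell
# Paths dropped from robots.txt may still exist, just no longer advertised.
printf 'Disallow: /admin\nDisallow: /old-api\n' > robots_2020.txt
printf 'Disallow: /admin\n' > robots_now.txt

awk '/^Disallow:/ {print $2}' robots_2020.txt | sort > d_old.txt
awk '/^Disallow:/ {print $2}' robots_now.txt  | sort > d_new.txt
comm -23 d_old.txt d_new.txt   # probe these removed paths
```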

# Historical sitemaps
curl -sL "https://web.archive.org/web/2020/https://target.com/sitemap.xml"

# Extract all paths from historical sitemaps
curl -sL "https://web.archive.org/web/2020/https://target.com/sitemap.xml" | \
  grep -oP '<loc>\K[^<]+' | sort -u > sitemap_2020_paths.txt

curl -s "https://target.com/sitemap.xml" | \
  grep -oP '<loc>\K[^<]+' | sort -u > sitemap_current_paths.txt

# Find pages removed from sitemap
comm -23 sitemap_2020_paths.txt sitemap_current_paths.txt > removed_from_sitemap.txt

8. Common Crawl Analysis

# Common Crawl - free web crawl data
# https://commoncrawl.org

# Search Common Crawl index
curl -s "https://index.commoncrawl.org/CC-MAIN-2024-10-index?url=*.target.com&output=json" | \
  jq -r '.url' | sort -u > commoncrawl_urls.txt

# Download WARC records for specific URLs
# Useful for getting full HTTP responses including headers

# cdx_toolkit (Python)
# pip install cdx_toolkit
python3 << 'PYEOF'
import cdx_toolkit

cdx = cdx_toolkit.CDXFetcher(source='cc')
for obj in cdx.iter('target.com/*', limit=1000):
    print(obj['url'])
PYEOF

# URLScan.io
curl -s "https://urlscan.io/api/v1/search/?q=domain:target.com" | \
  jq -r '.results[].page.url' | sort -u > urlscan_urls.txt

9. Parameter Mining from Historical Data

# Extract all unique parameters
cat all_historical_urls.txt | grep "?" | \
  grep -oP '[?&]\K[^=]+' | sort | uniq -c | sort -rn > all_params.txt

# High-value parameters for injection testing
# id, user_id, item_id → IDOR
# url, redirect, next, return, goto → Open Redirect / SSRF
# search, q, query, keyword → XSS / SQLi
# file, path, page, template → LFI/RFI
# cmd, exec, command → Command Injection
# email, username, user → Account Enumeration
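
A triage helper implementing exactly these buckets (names outside the lists fall through to "low"):

```shell
# Bucket a mined parameter name by the vulnerability class it usually maps to.
classify_param() {
  case "$1" in
    id|*_id)                       echo "IDOR" ;;
    url|redirect|next|return|goto) echo "open redirect / SSRF" ;;
    search|q|query|keyword)        echo "XSS / SQLi" ;;
    file|path|page|template)       echo "LFI/RFI" ;;
    cmd|exec|command)              echo "command injection" ;;
    email|username|user)           echo "account enumeration" ;;
    *)                             echo "low" ;;
  esac
}

# Run it over mined names, e.g.:
# awk '{print $2}' all_params.txt | while read -r p; do echo "$p $(classify_param "$p")"; done
for p in id redirect template foo; do echo "$p -> $(classify_param "$p")"; done
```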

# Create parameter-based test URLs
cat all_historical_urls.txt | grep "?" | \
  qsreplace "FUZZ" | sort -u > fuzzable_urls.txt

# Arjun - parameter discovery
arjun -u https://target.com/page -oJ arjun_params.json
arjun -i parameterized.txt -oJ arjun_batch.json

# x8 - hidden parameter discovery
x8 -u https://target.com/page -w /usr/share/seclists/Discovery/Web-Content/burp-parameter-names.txt

10. Complete Historical Analysis Script

#!/bin/bash
# historical_analysis.sh
TARGET=$1
OUTDIR="recon/${TARGET}/historical"
mkdir -p $OUTDIR/{urls,js,configs,params}

echo "=== Historical Analysis: $TARGET ==="

# Collect URLs from all sources
echo "[1/7] Collecting historical URLs..."
waybackurls $TARGET 2>/dev/null > $OUTDIR/urls/wayback.txt
gau $TARGET 2>/dev/null > $OUTDIR/urls/gau.txt
cat $OUTDIR/urls/*.txt | sort -u > $OUTDIR/urls/all.txt
echo "[*] Total URLs: $(wc -l < $OUTDIR/urls/all.txt)"

# Filter and categorize
echo "[2/7] Categorizing URLs..."
cat $OUTDIR/urls/all.txt | grep -iE "\.js(\?|$)" | sort -u > $OUTDIR/js/js_files.txt
cat $OUTDIR/urls/all.txt | grep -iE "\.(php|asp|aspx|jsp)(\?|$)" | sort -u > $OUTDIR/urls/dynamic.txt
cat $OUTDIR/urls/all.txt | grep -iE "(api|graphql|v[0-9])" | sort -u > $OUTDIR/urls/api.txt
cat $OUTDIR/urls/all.txt | grep "?" | sort -u > $OUTDIR/urls/parameterized.txt

# Extract parameters
echo "[3/7] Extracting parameters..."
cat $OUTDIR/urls/parameterized.txt | grep -oP '[?&]\K[^=]+' | \
  sort | uniq -c | sort -rn > $OUTDIR/params/all_params.txt

# Check for sensitive files in archives
echo "[4/7] Checking archived sensitive files..."
for file in .env .git/config .git/HEAD wp-config.php web.config .htaccess \
            config.php .svn/entries phpinfo.php server-status; do
    result=$(curl -sL -o /dev/null -w "%{http_code}" \
      "https://web.archive.org/web/2023/https://$TARGET/$file" 2>/dev/null)
    if [ "$result" = "200" ]; then
        echo "[+] Archived: $file" >> $OUTDIR/configs/archived_sensitive.txt
    fi
done

# Historical robots.txt
echo "[5/7] Checking historical robots.txt..."
for year in 2019 2020 2021 2022 2023 2024; do
    curl -sL "https://web.archive.org/web/${year}0601/https://$TARGET/robots.txt" \
      > $OUTDIR/configs/robots_${year}.txt 2>/dev/null
done

# Extract disallowed paths from all versions
cat $OUTDIR/configs/robots_*.txt 2>/dev/null | grep "Disallow:" | \
  awk '{print $2}' | sort -u > $OUTDIR/configs/all_disallowed.txt

# Alive check on interesting URLs
echo "[6/7] Checking alive status..."
cat $OUTDIR/urls/api.txt | httpx -silent -status-code > $OUTDIR/urls/alive_api.txt 2>/dev/null

# Generate report
echo "[7/7] Generating summary..."
echo "=== Historical Analysis Summary ===" > $OUTDIR/summary.txt
echo "Total URLs: $(wc -l < $OUTDIR/urls/all.txt)" >> $OUTDIR/summary.txt
echo "JS Files: $(wc -l < $OUTDIR/js/js_files.txt)" >> $OUTDIR/summary.txt
echo "Dynamic Pages: $(wc -l < $OUTDIR/urls/dynamic.txt)" >> $OUTDIR/summary.txt
echo "API Endpoints: $(wc -l < $OUTDIR/urls/api.txt)" >> $OUTDIR/summary.txt
echo "Unique Parameters: $(wc -l < $OUTDIR/params/all_params.txt)" >> $OUTDIR/summary.txt

echo "[*] Historical analysis complete. Results in $OUTDIR/"
cat $OUTDIR/summary.txt
Task 7
JavaScript File Analysis

1. Why Analyze JavaScript Files

JavaScript files contain:
├── API endpoints and routes
├── Authentication logic (client-side)
├── API keys, tokens, secrets
├── Hidden admin functionality
├── Business logic implementation
├── WebSocket endpoints
├── Internal domain references
├── Debug/development code
├── Third-party integrations
└── Source maps (full source code)

2. Finding JavaScript Files

# Extract from HTML source
curl -s https://target.com | grep -oP 'src="[^"]*\.js[^"]*"' | sed 's/src="//;s/"//'
curl -s https://target.com | grep -oP "src='[^']*\.js[^']*'" | sed "s/src='//;s/'//"

# Find JS in multiple pages
for page in / /login /dashboard /api /about; do
    curl -s "https://target.com$page" | grep -oP 'src="[^"]*\.js[^"]*"'
done | sort -u > js_urls.txt

# Historical JS files
cat wayback_urls.txt | grep -iE "\.js(\?|$)" | sort -u > historical_js.txt

# getJS - extract JS URLs
getJS --url https://target.com --complete > getjs_results.txt

# gospider - web spider for JS discovery
gospider -s https://target.com -c 10 -d 2 --js > gospider_js.txt

# hakrawler - fast web crawler
echo https://target.com | hakrawler -js > hakrawler_js.txt

# Resolve relative paths to absolute
while read js; do
    case "$js" in
        http*) echo "$js" ;;
        //*) echo "https:$js" ;;
        /*) echo "https://target.com$js" ;;
        *) echo "https://target.com/$js" ;;
    esac
done < js_urls.txt > js_absolute.txt
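The resolver above can be wrapped in a reusable function and sanity-checked offline; `BASE` and the sample inputs below are placeholders, not real targets.

```shell
# Hypothetical base URL; substitute the host under test.
BASE="https://target.com"

# resolve_js: map a src attribute value to an absolute URL,
# mirroring the case statement above.
resolve_js() {
    case "$1" in
        http*) echo "$1" ;;          # already absolute
        //*)   echo "https:$1" ;;    # protocol-relative
        /*)    echo "${BASE}$1" ;;   # root-relative
        *)     echo "${BASE}/$1" ;;  # document-relative
    esac
}

resolve_js "//cdn.example.com/lib.js"   # https://cdn.example.com/lib.js
resolve_js "/static/app.js"             # https://target.com/static/app.js
resolve_js "main.js"                    # https://target.com/main.js
```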

3. Endpoint Extraction

# LinkFinder - extract endpoints from JS
python3 linkfinder.py -i https://target.com -o cli
python3 linkfinder.py -i https://target.com -d -o results.html  # full domain
python3 linkfinder.py -i /path/to/file.js -o cli  # local file

# Manual regex extraction
curl -s https://target.com/app.js | grep -oP '["'"'"'](/[a-zA-Z0-9_/\-\.]+)["'"'"']' | sort -u

# Extract API paths
curl -s https://target.com/app.js | grep -oiE '["'"'"'](\/api\/[^"'"'"']+)["'"'"']' | sort -u

# Extract full URLs
curl -s https://target.com/app.js | grep -oiE 'https?://[^\s"'"'"'<>]+' | sort -u

# Extract fetch/axios/XMLHttpRequest calls
curl -s https://target.com/app.js | grep -oP 'fetch\(["\x27][^"\x27]+["\x27]' | sort -u
curl -s https://target.com/app.js | grep -oP 'axios\.(get|post|put|delete|patch)\(["\x27][^"\x27]+' | sort -u
curl -s https://target.com/app.js | grep -oP '\.open\(["\x27](GET|POST|PUT|DELETE)["\x27],\s*["\x27][^"\x27]+' | sort -u

# Extract route definitions (React Router, Vue Router, Angular)
curl -s https://target.com/app.js | grep -oP 'path:\s*["\x27][^"\x27]+["\x27]' | sort -u
curl -s https://target.com/app.js | grep -oP 'route\(["\x27][^"\x27]+["\x27]' | sort -u
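The quoted-path extraction can be tested offline against a synthetic snippet (the endpoints in it are invented for the demo):

```shell
# Made-up JS one-liner standing in for a downloaded bundle.
js='fetch("/api/v1/users");axios.get("/api/v1/orders");var logo="/assets/logo.png";'

# Same idea as the quoted-path regex above, ERE flavor for portability.
printf '%s\n' "$js" | grep -oE '"/[a-zA-Z0-9_/.-]+"' | tr -d '"' | sort -u
```

Prints the three quoted paths, one per line; static assets like the `.png` show why results still need manual triage.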

4. Secret and Credential Discovery

# SecretFinder
python3 SecretFinder.py -i https://target.com/app.js -o cli
python3 SecretFinder.py -i https://target.com -e -o results.html  # crawl entire domain

# Manual regex patterns for secrets
JS_FILE="https://target.com/app.js"

# API Keys
curl -s $JS_FILE | grep -oiE '(api[_-]?key|apikey)["\x27:\s]*["\x27][a-zA-Z0-9_\-]{16,}["\x27]'

# AWS Keys
curl -s $JS_FILE | grep -oiE '(AKIA[0-9A-Z]{16})'
curl -s $JS_FILE | grep -oiE '(aws[_-]?secret[_-]?access[_-]?key)["\x27:\s]*["\x27][^\x27"]{20,}["\x27]'

# Google API Key
curl -s $JS_FILE | grep -oiE 'AIza[0-9A-Za-z_\-]{35}'

# Firebase
curl -s $JS_FILE | grep -oiE '(firebase[a-zA-Z]*\.com|firebaseio\.com)'

# JWT tokens
curl -s $JS_FILE | grep -oiE 'eyJ[a-zA-Z0-9_-]*\.eyJ[a-zA-Z0-9_-]*\.[a-zA-Z0-9_-]*'

# Generic tokens/passwords
curl -s $JS_FILE | grep -oiE '(token|password|secret|credential|auth)["\x27:\s]*["\x27][^\x27"]{8,}["\x27]'

# Private keys
curl -s $JS_FILE | grep -oiE 'BEGIN (RSA |EC |DSA )?PRIVATE KEY'

# Slack tokens
curl -s $JS_FILE | grep -oiE 'xox[baprs]-[0-9a-zA-Z\-]+'

# GitHub tokens
curl -s $JS_FILE | grep -oiE 'gh[pousr]_[A-Za-z0-9_]{36,}'

# nuclei JS secret detection
nuclei -u https://target.com -t exposures/tokens/
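When the JWT regex above produces a hit, the first two segments are base64url-encoded JSON and can be decoded locally. A minimal sketch with a synthetic token (not a real credential):

```shell
# Synthetic token: {"alg":"HS256","typ":"JWT"} . {"sub":"1234"} . fake sig
jwt='eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0In0.sig'

# base64url -> base64: swap -_ for +/ and re-pad to a multiple of 4
b64url_decode() {
    local s=${1//-/+}
    s=${s//_//}
    while [ $(( ${#s} % 4 )) -ne 0 ]; do s="${s}="; done
    printf '%s' "$s" | base64 -d
}

b64url_decode "$(echo "$jwt" | cut -d. -f1)"; echo   # header JSON
b64url_decode "$(echo "$jwt" | cut -d. -f2)"; echo   # payload JSON
```

The payload often leaks usernames, roles, and internal IDs even when the signature cannot be forged.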

5. Source Map Analysis

# Source maps contain the original, unminified source code
# Usually at: file.js.map or referenced in the JS file

# Check for source map reference in JS
curl -s https://target.com/app.js | tail -5
# Look for: //# sourceMappingURL=app.js.map

# Check source map header
curl -sI https://target.com/app.js | grep -i "sourcemap"
# SourceMap: /app.js.map

# Download source map
curl -s https://target.com/app.js.map -o app.js.map

# Common source map paths
# /static/js/main.xxxxx.js.map
# /assets/app.js.map
# /dist/bundle.js.map
# /build/static/js/main.chunk.js.map

# Extract original source from source map
# unwebpack-sourcemap
python3 unwebpack_sourcemap.py --make-directory app.js.map output_dir/

# sourcemapper
sourcemapper -url https://target.com/app.js.map -output source_code/

# smap - source map extractor
smap https://target.com/app.js.map -o source_output/

# After extraction, analyze the full source code
# Look for: API endpoints, auth logic, admin routes, hardcoded secrets
grep -rn "api" source_output/
grep -rn "password\|secret\|token\|key" source_output/
grep -rn "admin\|isAdmin\|role" source_output/
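Source maps are plain JSON, so jq alone recovers file names and (when present) full original source from `sourcesContent`. Demonstrated on a stub map with invented content:

```shell
# Minimal stand-in for a downloaded app.js.map (content invented).
cat > demo.js.map <<'EOF'
{"version":3,
 "sources":["webpack:///src/api.js","webpack:///src/auth.js"],
 "sourcesContent":["const API_BASE='/api/v2';","function login(u,p){}"],
 "mappings":"AAAA"}
EOF

jq -r '.sources[]' demo.js.map           # original file names
jq -r '.sourcesContent[0]' demo.js.map   # recovered source text
rm demo.js.map
```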

6. Webpack Bundle Analysis

# Identify Webpack bundles
# Look for: __webpack_require__, webpackJsonp, webpackChunkapp

# Extract chunk names/IDs
curl -s https://target.com/app.js | grep -oP 'webpackChunk[a-zA-Z_]*'

# Find all chunks
curl -s https://target.com | grep -oP '/static/js/[^"]+\.js' | sort -u

# Download all chunks
for chunk in $(curl -s https://target.com | grep -oP '/static/js/[^"]+\.js'); do
    wget -q "https://target.com$chunk" -P webpack_chunks/
done

# Analyze chunks for interesting content
for f in webpack_chunks/*.js; do
    echo "=== $f ===" >> webpack_analysis.txt
    # Routes
    grep -oP 'path:\s*["'"'"'][^"'"'"']+' "$f" >> webpack_analysis.txt 2>/dev/null
    # API calls
    grep -oP '["'"'"']/api/[^"'"'"']+' "$f" >> webpack_analysis.txt 2>/dev/null
    # Secrets
    grep -oiP '(api_key|secret|token|password)\s*[:=]\s*["'"'"'][^"'"'"']+' "$f" >> webpack_analysis.txt 2>/dev/null
done

# webpack-exploder - analyze webpack bundles
# Deobfuscate and separate modules

7. Minified/Obfuscated JS Analysis

# Beautify minified JavaScript
# js-beautify
js-beautify -f minified.js -o beautified.js

# Online tools:
# https://beautifier.io
# https://prettier.io

# de4js - JavaScript deobfuscator
# Handles: eval, packed, obfuscator.io, JSFuck

# Common obfuscation patterns:

# eval-based
# eval(function(p,a,c,k,e,d){...})

# String array obfuscation
# var _0x1234=['api','key','secret']; function _0x5678(_0x1234,_0x9abc){...}

# JSFuck
# [][(![]+[])[+[]]+(![]+[])[!+[]+!+[]]...

# JJencode / AAencode
# $=~[];$={___:++$...

# Deobfuscation approaches:
# 1. Use browser console to execute and capture output
# 2. Replace eval() with console.log() to see decoded code
# 3. Use AST-based deobfuscation tools
# 4. Synchrony deobfuscator (AST-based, handles obfuscator.io output)
# 5. Manual analysis with breakpoints
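Approach 2 as a one-liner, shown on a synthetic packed sample; the rewrite means the decoded payload is printed rather than executed (still review output before ever running it):

```shell
# Swapping eval( for console.log( turns execution into disclosure.
printf '%s\n' 'eval(function(p,a,c,k,e,d){return p}("payload",0,0,[]))' |
  sed 's/^eval(/console.log(/'
```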

8. DOM Sink Analysis

# Identify potential DOM XSS sinks in JavaScript

# Dangerous sinks to search for:
SINKS="innerHTML|outerHTML|document\.write|document\.writeln|eval|setTimeout|setInterval|Function|execScript|\.html\(|\.append\(|\.prepend\(|\.after\(|\.before\(|\.replaceWith\(|location\.href|location\.assign|location\.replace|window\.open|\.src\s*=|\.href\s*="

# Search JavaScript files for sinks
curl -s https://target.com/app.js | grep -oiE "$SINKS" | sort | uniq -c | sort -rn

# Dangerous sources (user-controllable input):
SOURCES="location\.hash|location\.search|location\.href|document\.URL|document\.documentURI|document\.referrer|window\.name|document\.cookie|postMessage|\.value"

# Search for sources
curl -s https://target.com/app.js | grep -oiE "$SOURCES" | sort | uniq -c | sort -rn

# Look for direct source-to-sink flows
# Example: document.getElementById('output').innerHTML = location.hash
# This is a DOM XSS vulnerability
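A crude but fast first pass is flagging lines where a source and a sink appear together. Offline demo on an invented sample file:

```shell
# Invented sample: one safe line, two source-to-sink lines.
cat > sample.js <<'EOF'
el.innerHTML = location.hash.slice(1);
console.log("safe line");
document.write(document.referrer);
EOF

# Lines matching a source AND a sink are DOM XSS candidates.
grep -nE 'location\.hash|location\.search|document\.referrer|window\.name' sample.js |
  grep -E 'innerHTML|outerHTML|document\.write|eval\('
rm sample.js
```

Lines 1 and 3 are flagged; real flows usually span multiple lines, so treat this as triage input for manual review or a taint-tracking tool.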

9. Third-Party Library Analysis

# Identify third-party libraries and versions

# Common CDN patterns
curl -s https://target.com | grep -oP 'https://cdn[^"'"'"']+' | sort -u
curl -s https://target.com | grep -oP 'https://cdnjs[^"'"'"']+' | sort -u
curl -s https://target.com | grep -oP 'https://unpkg[^"'"'"']+' | sort -u

# Extract library versions from JS files
# jQuery
curl -s https://target.com/jquery.min.js | head -5 | grep -oP 'v\d+\.\d+\.\d+'

# Check for known vulnerable versions
# retire.js - identify vulnerable JS libraries
# (scans local files, so download the bundles first)
retire --path webpack_chunks/

# Snyk vulnerability database
# https://snyk.io/vuln

# Known vulnerable versions:
# jQuery < 3.5.0 → XSS vulnerabilities
# Angular.js 1.x → template injection, sandbox bypass
# Lodash < 4.17.12 → prototype pollution
# Moment.js → ReDoS
# DOMPurify < 2.0.17 → mXSS bypass
# handlebars < 4.7.7 → prototype pollution

# Check all libraries with nuclei
nuclei -u https://target.com -t technologies/ -t exposures/
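The version-banner grep, demonstrated on a synthetic header in the style of a minified jQuery bundle:

```shell
# Synthetic banner; real bundles carry this in their first lines.
printf '/*! jQuery v3.4.1 | (c) JS Foundation */\n' |
  grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+'
# A version below 3.5.0 falls in the XSS-affected range noted above.
```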

10. Complete JS Analysis Pipeline

#!/bin/bash
# js_analysis.sh
TARGET=$1
OUTDIR="recon/${TARGET}/js_analysis"
mkdir -p $OUTDIR/{files,endpoints,secrets,sourcemaps,beautified}

echo "=== JavaScript Analysis: $TARGET ==="

# Collect JS files
echo "[1/6] Collecting JavaScript files..."
curl -s "https://$TARGET" | grep -oP 'src="[^"]*\.js[^"]*"' | \
  sed 's/src="//' | sed 's/"//' | while read js; do
    case "$js" in
        http*) echo "$js" ;;
        //*) echo "https:$js" ;;
        /*) echo "https://$TARGET$js" ;;
        *) echo "https://$TARGET/$js" ;;
    esac
done | sort -u > $OUTDIR/js_urls.txt

echo "[*] Found $(wc -l < $OUTDIR/js_urls.txt) JS files"

# Download JS files
echo "[2/6] Downloading JS files..."
while read url; do
    filename=$(echo "$url" | md5sum | cut -d' ' -f1).js
    curl -s "$url" -o "$OUTDIR/files/$filename" 2>/dev/null
    echo "$url → $filename" >> $OUTDIR/files/url_map.txt
done < $OUTDIR/js_urls.txt

# Extract endpoints
echo "[3/6] Extracting endpoints..."
for f in $OUTDIR/files/*.js; do
    grep -oiE '["'"'"'](\/[a-zA-Z0-9_/\-\.]+)["'"'"']' "$f" 2>/dev/null
done | sort -u > $OUTDIR/endpoints/all_endpoints.txt

# Search for secrets
echo "[4/6] Searching for secrets..."
for f in $OUTDIR/files/*.js; do
    # API keys
    grep -oiE '(api[_-]?key|apikey|api_secret|secret_key|auth_token)["\x27:\s=]*["\x27][a-zA-Z0-9_\-]{16,}["\x27]' "$f" >> $OUTDIR/secrets/api_keys.txt 2>/dev/null
    # AWS keys
    grep -oiE 'AKIA[0-9A-Z]{16}' "$f" >> $OUTDIR/secrets/aws_keys.txt 2>/dev/null
    # JWT
    grep -oiE 'eyJ[a-zA-Z0-9_-]*\.eyJ[a-zA-Z0-9_-]*\.[a-zA-Z0-9_-]*' "$f" >> $OUTDIR/secrets/jwt_tokens.txt 2>/dev/null
    # URLs
    grep -oiE 'https?://[^\s"'"'"'<>]+' "$f" >> $OUTDIR/secrets/urls.txt 2>/dev/null
done

# Check for source maps
echo "[5/6] Checking for source maps..."
while read url; do
    map_url="${url}.map"
    code=$(curl -s -o /dev/null -w "%{http_code}" "$map_url" 2>/dev/null)
    if [ "$code" = "200" ]; then
        echo "[+] Source map found: $map_url" >> $OUTDIR/sourcemaps/found.txt
        curl -s "$map_url" -o "$OUTDIR/sourcemaps/$(basename $map_url)" 2>/dev/null
    fi
done < $OUTDIR/js_urls.txt

# Summary
echo "[6/6] Generating summary..."
echo "=== JS Analysis Summary ===" > $OUTDIR/summary.txt
echo "JS Files: $(wc -l < $OUTDIR/js_urls.txt)" >> $OUTDIR/summary.txt
echo "Endpoints: $(wc -l < $OUTDIR/endpoints/all_endpoints.txt)" >> $OUTDIR/summary.txt
echo "Potential Secrets: $(cat $OUTDIR/secrets/*.txt 2>/dev/null | wc -l)" >> $OUTDIR/summary.txt
echo "Source Maps: $(cat $OUTDIR/sourcemaps/found.txt 2>/dev/null | wc -l)" >> $OUTDIR/summary.txt

cat $OUTDIR/summary.txt
Task 8
API Endpoint Discovery

1. API Discovery Methodology

API Discovery Flow:
1. Documentation → Swagger/OpenAPI, GraphQL introspection
2. JavaScript analysis → fetch/axios calls, route definitions
3. Historical data → Wayback Machine, cached pages
4. Traffic analysis → Proxy interception (Burp Suite)
5. Bruteforcing → wordlist-based endpoint discovery
6. Mobile app analysis → Decompile APK/IPA
7. Error messages → Stack traces revealing routes

2. API Documentation Discovery

# Common API documentation paths
PATHS="/swagger /swagger-ui /swagger-ui.html /swagger/index.html
 /api-docs /api-docs/swagger.json /api/swagger.json
 /openapi.json /openapi.yaml /openapi/v3/api-docs
 /v1/api-docs /v2/api-docs /v3/api-docs
 /docs /docs/api /redoc /api/docs
 /graphql /graphiql /playground /graphql/playground
 /_api /api /api/v1 /api/v2 /api/v3
 /swagger/v1/swagger.json /swagger/v2/swagger.json
 /api-documentation /developer /developer/docs
 /swagger-resources /api/swagger-resources
 /.well-known/openapi.json /.well-known/openapi.yaml
 /api/schema /api/openapi /api/spec
 /documentation /api/documentation
 /api-explorer /api/explorer"

for path in $PATHS; do
    code=$(curl -s -o /dev/null -w "%{http_code}" "https://target.com$path" 2>/dev/null)
    if [ "$code" != "404" ] && [ "$code" != "000" ]; then
        echo "[+] $path → $code"
    fi
done

# Download and parse Swagger/OpenAPI spec
curl -s https://target.com/swagger.json | jq '.paths | keys[]' | sort
curl -s https://target.com/openapi.json | jq '.paths | keys[]' | sort

# Extract all endpoints from OpenAPI spec
curl -s https://target.com/swagger.json | jq -r '.paths | to_entries[] | .key as $path | .value | to_entries[] | "\(.key | ascii_upcase) \($path)"'
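The same jq program, run against an inline stub spec (paths and methods invented) to show the METHOD-plus-path output it produces:

```shell
# Stub OpenAPI fragment; paths and methods are made up.
spec='{"paths":{"/users":{"get":{},"post":{}},"/orders/{id}":{"delete":{}}}}'

printf '%s' "$spec" |
  jq -r '.paths | to_entries[] | .key as $path
         | .value | to_entries[] | "\(.key | ascii_upcase) \($path)"'
```

Output is one `METHOD /path` pair per line (GET /users, POST /users, DELETE /orders/{id}), ready to feed into method testing.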

3. GraphQL Endpoint Discovery

# Common GraphQL paths
GRAPHQL_PATHS="/graphql /graphiql /v1/graphql /v2/graphql
 /api/graphql /query /gql /graphql/console
 /playground /graphql/playground /altair
 /api/graphiql /graphql-explorer"

for path in $GRAPHQL_PATHS; do
    code=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
      -H "Content-Type: application/json" \
      -d '{"query":"{__typename}"}' \
      "https://target.com$path" 2>/dev/null)
    if [ "$code" = "200" ]; then
        echo "[+] GraphQL endpoint: $path"
    fi
done

# GraphQL introspection query
curl -s -X POST -H "Content-Type: application/json" \
  -d '{"query":"{ __schema { types { name fields { name type { name kind ofType { name } } } } } }"}' \
  https://target.com/graphql | jq .

# Full introspection
curl -s -X POST -H "Content-Type: application/json" \
  -d '{"query":"query IntrospectionQuery { __schema { queryType { name } mutationType { name } subscriptionType { name } types { ...FullType } directives { name description locations args { ...InputValue } } } } fragment FullType on __Type { kind name description fields(includeDeprecated: true) { name description args { ...InputValue } type { ...TypeRef } isDeprecated deprecationReason } inputFields { ...InputValue } interfaces { ...TypeRef } enumValues(includeDeprecated: true) { name description isDeprecated deprecationReason } possibleTypes { ...TypeRef } } fragment InputValue on __InputValue { name description type { ...TypeRef } defaultValue } fragment TypeRef on __Type { kind name ofType { kind name ofType { kind name ofType { kind name ofType { kind name ofType { kind name ofType { kind name ofType { kind name } } } } } } } }"}' \
  https://target.com/graphql > introspection.json

# GraphQL Voyager - visualize schema
# https://graphql-kit.com/graphql-voyager/

# InQL - Burp Suite extension for GraphQL
# Automatically discovers queries, mutations, subscriptions

# clairvoyance - GraphQL field bruteforcing (when introspection is disabled)
python3 -m clairvoyance -w wordlist.txt -o schema.json https://target.com/graphql
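Once introspection.json is saved, jq pulls out the callable attack surface. Shown here on a stub with the same shape as a real response (type and field names invented):

```shell
# Stub introspection response; names are made up for the demo.
cat > introspection.json <<'EOF'
{"data":{"__schema":{"types":[
 {"name":"User","fields":[{"name":"id"},{"name":"email"}]},
 {"name":"Query","fields":[{"name":"user"},{"name":"allUsers"}]}
]}}}
EOF

# Root query fields = the operations you can call
jq -r '.data.__schema.types[] | select(.name=="Query") | .fields[].name' introspection.json
rm introspection.json
```

The same pattern with `.name=="Mutation"` lists state-changing operations.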

4. REST API Endpoint Bruteforcing

# Wordlist-based API discovery
ffuf -u https://target.com/api/FUZZ -w /usr/share/seclists/Discovery/Web-Content/api/api-endpoints.txt -mc 200,201,204,301,302,401,403

# Common API versioning patterns
for ver in v1 v2 v3 v4; do
    ffuf -u "https://target.com/api/$ver/FUZZ" \
      -w /usr/share/seclists/Discovery/Web-Content/common.txt \
      -mc 200,201,204,301,302,401,403 -o "api_${ver}.json"
done

# API-specific wordlists
# /usr/share/seclists/Discovery/Web-Content/api/
# /usr/share/seclists/Discovery/Web-Content/api/api-endpoints.txt
# /usr/share/seclists/Discovery/Web-Content/api/api-seen-in-wild.txt
# /usr/share/seclists/Discovery/Web-Content/api/objects.txt
# /usr/share/seclists/Discovery/Web-Content/api/actions.txt

# Kiterunner - API-aware content discovery
kr scan https://target.com -w routes-large.kite -x 20
kr scan https://target.com -A=apiroutes-210228:20000

# Try different HTTP methods
for method in GET POST PUT DELETE PATCH OPTIONS HEAD; do
    code=$(curl -s -o /dev/null -w "%{http_code}" -X $method https://target.com/api/users 2>/dev/null)
    echo "$method /api/users → $code"
done

# Content-Type variations
curl -s -X POST -H "Content-Type: application/json" -d '{}' https://target.com/api/users
curl -s -X POST -H "Content-Type: application/xml" -d '<user/>' https://target.com/api/users
curl -s -X POST -H "Content-Type: application/x-www-form-urlencoded" -d 'test=1' https://target.com/api/users

5. WADL and WSDL Discovery

# WADL (Web Application Description Language) - REST
WADL_PATHS="/application.wadl /api/application.wadl /services/application.wadl
 /rest/application.wadl"
for path in $WADL_PATHS; do
    code=$(curl -s -o /dev/null -w "%{http_code}" "https://target.com$path")
    if [ "$code" = "200" ]; then
        echo "[+] WADL found: $path"
        curl -s "https://target.com$path" > wadl.xml
    fi
done

# WSDL (Web Services Description Language) - SOAP
WSDL_PATHS="?wsdl ?WSDL /services?wsdl /ws?wsdl /service?wsdl
 /api?wsdl /webservice?wsdl"
for suffix in $WSDL_PATHS; do
    code=$(curl -s -o /dev/null -w "%{http_code}" "https://target.com$suffix")
    if [ "$code" = "200" ]; then
        echo "[+] WSDL found: $suffix"
        curl -s "https://target.com$suffix" > wsdl.xml
    fi
done

# Parse WSDL for operations
curl -s "https://target.com?wsdl" | grep -oP 'name="[^"]*"' | sort -u
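The operation grep, demonstrated offline on a stub WSDL fragment (operation names invented):

```shell
# Stub WSDL fragment standing in for a downloaded wsdl.xml.
printf '<operation name="GetUser"/><operation name="DeleteUser"/>\n' |
  grep -oE 'name="[^"]*"' | sort -u
```

Each `name="..."` hit is a SOAP operation to probe; admin- or delete-style names are the high-value targets.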

6. Mobile App API Extraction

# Android APK analysis
# Decompile APK
apktool d target-app.apk -o decompiled/
jadx target-app.apk -d jadx_output/

# Search for API endpoints in decompiled code
grep -rn "http://" jadx_output/ | grep -v ".png\|.jpg\|.gif"
grep -rn "https://" jadx_output/ | grep -v ".png\|.jpg\|.gif"
grep -rn "api" jadx_output/ --include="*.java" --include="*.xml" --include="*.json"

# Find API base URLs
grep -rn "BASE_URL\|base_url\|API_URL\|api_url\|SERVER_URL" jadx_output/

# iOS IPA analysis
# Unzip IPA
unzip target-app.ipa -d ipa_contents/
# Use class-dump, Hopper, or IDA for binary analysis

# strings extraction (classes.dex sits inside the APK itself)
unzip -p target-app.apk classes.dex | strings | grep -iE "https?://" | sort -u

# Search for API keys
grep -rn "api_key\|apikey\|api-key\|secret\|token" jadx_output/ --include="*.java" --include="*.xml"

# Network Security Config (Android)
cat decompiled/res/xml/network_security_config.xml
# Check for cleartext traffic allowed, custom trust anchors

# MobSF - Mobile Security Framework (automated analysis)
# docker run -it --rm -p 8000:8000 opensecurity/mobile-security-framework-mobsf

7. Proxy-Based API Discovery

# Burp Suite approach:
# 1. Configure proxy
# 2. Browse the application thoroughly
# 3. Check Proxy → HTTP History → filter by API paths
# 4. Use Target → Site map to see all discovered endpoints
# 5. Export endpoints from site map

# Mitmproxy
mitmproxy --mode regular -p 8080
# Navigate the application, then analyze captured traffic
# mitmproxy: press 'z' to clear, 'f' to filter

# mitmproxy dump
mitmdump -w api_traffic.flow
# Later analysis
mitmproxy -r api_traffic.flow

# ZAP Spider + AJAX Spider
# Automated crawling discovers API endpoints
# ZAP → Tools → Spider → Target URL → Start
# ZAP → Tools → AJAX Spider → Target URL → Start

# Extract unique API paths from proxy history
# Burp: Extensions → Logger++ → Export
# Filter: URL matches regex /api/

8. Error-Based API Discovery

# Trigger error messages that reveal API routes

# Force 405 Method Not Allowed
curl -s -X DELETE https://target.com/api/ | head -20
# Response may list allowed methods

# Request non-existent endpoint
curl -s https://target.com/api/nonexistent12345
# Framework may reveal route patterns in error

# Debug mode endpoints
curl -s https://target.com/api/debug
curl -s https://target.com/debug/routes
curl -s https://target.com/_debug
curl -s https://target.com/routes

# Framework-specific route listing
# Laravel: /api/routes (if debug enabled)
# Django: / (with DEBUG=True shows all URL patterns)
# Express: custom debug middleware
# Spring Boot: /actuator/mappings
curl -s https://target.com/actuator/mappings | jq '.contexts[].mappings.dispatcherServlets[][].details.requestMappingConditions.patterns[]'

# Flask: /__debugger__
curl -s https://target.com/__debugger__

# Rails: /rails/info/routes (development mode)
curl -s https://target.com/rails/info/routes

# Send malformed requests
curl -s -X POST https://target.com/api/ -H "Content-Type: application/json" -d '{invalid}'
# Error may reveal framework and routing info

9. API Versioning and Legacy Endpoints

# Check multiple API versions
for ver in v0 v1 v2 v3 v4 v5; do
    for endpoint in users accounts products orders transactions; do
        code=$(curl -s -o /dev/null -w "%{http_code}" "https://target.com/api/$ver/$endpoint" 2>/dev/null)
        if [ "$code" != "404" ] && [ "$code" != "000" ]; then
            echo "[+] /api/$ver/$endpoint → $code"
        fi
    done
done

# Header-based versioning
curl -s -H "Accept: application/vnd.api.v1+json" https://target.com/api/users
curl -s -H "Accept: application/vnd.api.v2+json" https://target.com/api/users
curl -s -H "Api-Version: 1" https://target.com/api/users
curl -s -H "Api-Version: 2" https://target.com/api/users
curl -s -H "X-API-Version: 2024-01-01" https://target.com/api/users

# Query parameter versioning
curl -s "https://target.com/api/users?version=1"
curl -s "https://target.com/api/users?api_version=2"

# Legacy/deprecated endpoints often lack security controls
# Try: /api/v0/, /api/beta/, /api/alpha/, /api/internal/
# Try: /api/legacy/, /api/old/, /api/deprecated/

10. Complete API Discovery Script

#!/bin/bash
# api_discovery.sh
TARGET=$1
OUTDIR="recon/${TARGET}/api"
mkdir -p $OUTDIR

echo "=== API Discovery: $TARGET ==="

# Check common documentation paths
echo "[1/5] Checking API documentation paths..."
DOC_PATHS="/swagger /swagger-ui /swagger-ui.html /swagger.json /swagger/v1/swagger.json
 /api-docs /openapi.json /openapi.yaml /v1/api-docs /v2/api-docs
 /graphql /graphiql /playground /docs /api/docs /redoc /documentation
 /api-explorer /api/schema /.well-known/openapi.json
 /actuator/mappings /debug/routes /rails/info/routes"

for path in $DOC_PATHS; do
    code=$(curl -s -o /dev/null -w "%{http_code}" "https://$TARGET$path" 2>/dev/null)
    if [ "$code" != "404" ] && [ "$code" != "000" ] && [ "$code" != "" ]; then
        echo "[+] $path → $code" >> $OUTDIR/docs_found.txt
    fi
done
echo "[*] Documentation results: $(wc -l < $OUTDIR/docs_found.txt 2>/dev/null || echo 0)"

# GraphQL detection
echo "[2/5] Testing GraphQL endpoints..."
for path in /graphql /api/graphql /v1/graphql /query /gql; do
    result=$(curl -s -X POST -H "Content-Type: application/json" \
      -d '{"query":"{__typename}"}' "https://$TARGET$path" 2>/dev/null)
    if echo "$result" | grep -q "data"; then
        echo "[+] GraphQL endpoint: $path" >> $OUTDIR/graphql_endpoints.txt
    fi
done

# API versioning check
echo "[3/5] Checking API versions..."
for ver in v0 v1 v2 v3; do
    code=$(curl -s -o /dev/null -w "%{http_code}" "https://$TARGET/api/$ver/" 2>/dev/null)
    if [ "$code" != "404" ] && [ "$code" != "000" ]; then
        echo "[+] /api/$ver/ → $code" >> $OUTDIR/api_versions.txt
    fi
done

# Endpoint bruteforcing
echo "[4/5] Bruteforcing API endpoints..."
for base in /api /api/v1 /api/v2; do
    ffuf -u "https://$TARGET${base}/FUZZ" \
      -w /usr/share/seclists/Discovery/Web-Content/api/api-endpoints.txt \
      -mc 200,201,204,301,302,401,403,405 \
      -o $OUTDIR/ffuf_${base//\//_}.json 2>/dev/null
done

# Method testing on discovered endpoints
echo "[5/5] Testing HTTP methods..."
if [ -f $OUTDIR/docs_found.txt ]; then
    while read line; do
        path=$(echo "$line" | awk '{print $2}')
        for method in GET POST PUT DELETE PATCH; do
            code=$(curl -s -o /dev/null -w "%{http_code}" -X $method "https://$TARGET$path" 2>/dev/null)
            echo "$method $path → $code" >> $OUTDIR/method_testing.txt
        done
    done < $OUTDIR/docs_found.txt
fi

echo "[*] API discovery complete. Results in $OUTDIR/"