Preface

Cloudflare gives every website administrator access to the same solid set of security protections. In this article, we’ll explore several practical ways to use Cloudflare WAF (Web Application Firewall) rules to secure your sites and services. For more detail, consult the official Cloudflare documentation, which remains the authoritative source.

Update Notice

  • March 17, 2025: Cloudflare no longer supports the Threat Score field cf.threat_score in rules. According to Cloudflare, this protection is now applied automatically through Automated Botnet Protection. Read the official blog post for details.

WAF Introduction

A Web Application Firewall (WAF) is a defensive technology that inspects and filters malicious requests in HTTP traffic to protect web applications.

Cloudflare WAF protects sites from common attacks such as SQL injection, XSS, and malicious file uploads. The Free plan covers the basics and supports up to 5 custom rules per domain.

Cloudflare WAF offers 5 basic actions: Allow, Bypass, Managed Challenge, JS Challenge, and Block. Each suits different use cases:

  • Allow: The most permissive action; matching requests pass straight through without further inspection (whitelist).
  • Bypass: Skips certain filters or restrictions for specific trusted requests.
  • Managed Challenge: Presents an interactive verification page to confirm the visitor is human.
  • JS Challenge: Runs JavaScript in the background to verify the visitor’s browser.
  • Block: Rejects matching requests outright; use with caution to avoid locking out legitimate clients.

The priority order isn’t fixed; you can reorder rules by dragging them in the dashboard (5 rules max on the Free plan).

The image below shows the order of the example rules covered here. Let’s go through them one by one:

[Image: Order Priority]

Note: You don’t necessarily need all of them; it depends heavily on your site.

Creating Rules

Follow the steps in the dashboard to create a WAF rule:

[Image: Create Rule]

Allow

Placing the Allow rule at the top gives it the highest priority, which suits fine-grained whitelisting.

[Image: Edit Rule]

As you edit in the visual builder, the Rule Expression preview updates instantly. Expressions are better suited to complex logic.

Don’t worry if you’re new to this; finished expressions are easy to copy between domains or share with others.

In this example, we check whether Known Bots is true.

Known Bots indicates whether the request comes from a crawler that Cloudflare has verified; it is a Boolean field.

(cf.client.bot)

This lets verified crawlers through without WAF interference. It is highly recommended so that search engines can keep indexing your site, which protects your SEO standing.

Bypass

Bypass fits well as the second priority:

(cf.tls_client_auth.cert_verified)

In this example, when a request’s TLS client certificate passes validation (usually an Origin certificate), the request skips certain WAF restrictions.

[Image: Bypass Rule]

Note: Bypass rules don’t skip Bot Fight Mode or IP Access Rules.

Managed Challenge

Managed Challenge shows a Cloudflare verification page that tests whether the visitor is human. It has little impact on legitimate users and is effective at stopping malicious crawlers.

[Image: Managed Challenge]

Above is a simple Managed Challenge rule:

(not cf.client.bot and cf.threat_score gt 3) or (ip.geoip.country in {"RU" "UA" "T1"}) or (http.x_forwarded_for eq ".")

First, a few definitions:

  • Threat Score: A metric reflecting IP security risks; higher scores denote increased malicious potential (the higher, the worse).
  • GeoIP: A method for determining IP geolocation (country/region/city, etc.).
  • X-Forwarded-For: An HTTP header listing proxy hops, commonly used to obtain the real client IP. A missing or malformed value can hint at a proxy or a suspicious origin.
  • And / Or: In logical expressions, And means both conditions must be true; Or means either condition can be true.

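Let’s break down what the rule matches:

  1. The request is NOT from a known bot AND its threat score is greater than 3.
  2. The country of origin is Russia, Ukraine, or the Tor network (T1).
  3. The X-Forwarded-For header matches the value in the expression (as written, the literal string ".", not an empty string).
  4. The conditions are joined with OR, so any single match triggers the challenge.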
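The same logic can be sketched in Python to make the AND/OR grouping easier to see. This is only an illustration: the `Request` class and its field names are hypothetical stand-ins for the Cloudflare fields, and the comparison against "." simply mirrors the expression as written.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical stand-ins for the Cloudflare rule fields.
    known_bot: bool        # cf.client.bot
    threat_score: int      # cf.threat_score
    country: str           # ip.geoip.country
    x_forwarded_for: str   # http.x_forwarded_for


CHALLENGED_COUNTRIES = {"RU", "UA", "T1"}


def needs_managed_challenge(req: Request) -> bool:
    """Return True when any of the three OR-ed conditions matches."""
    risky_non_bot = (not req.known_bot) and req.threat_score > 3
    listed_country = req.country in CHALLENGED_COUNTRIES
    odd_forwarded_for = req.x_forwarded_for == "."
    return risky_non_bot or listed_country or odd_forwarded_for
```

Note that the country condition stands alone: a known bot coming from a listed country still matches this rule when it is evaluated in isolation.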

Question: Will Yandex crawlers in Russia be hit by the managed challenge under this rule?

Answer: No. We set a top-priority Allow rule that lets Cloudflare-verified crawlers pass through WAF rules.
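A minimal sketch of why the order matters, assuming a simplified model in which rules are checked top to bottom and the first matching action ends evaluation. The rule list and the request dictionary keys are hypothetical, and real Cloudflare behavior has more nuances than this.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each rule is (action, predicate); order in the list is the priority order.
Rule = Tuple[str, Callable[[Dict], bool]]

RULES: List[Rule] = [
    ("allow", lambda r: r["known_bot"]),
    ("managed_challenge", lambda r: r["country"] in {"RU", "UA", "T1"}),
]


def first_action(request: Dict) -> Optional[str]:
    """Return the action of the first rule that matches, or None."""
    for action, matches in RULES:
        if matches(request):
            return action
    return None
```

Under this model a verified crawler from Russia is allowed before the country check is ever reached, while an ordinary visitor from the same country falls through to the challenge.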

JS Challenge

The JavaScript Challenge policy is stricter than Managed Challenge (bots hate it). The sample rule is long, so paste it into your WAF rule editor to view.

(not cf.client.bot and cf.threat_score gt 10) or (http.user_agent eq "") or (http.user_agent contains "fuck") or (http.user_agent contains "lient" and http.user_agent contains "ttp") or (http.user_agent contains "java") or (http.user_agent contains "Joomla") or (http.user_agent contains "libweb") or (http.user_agent contains "libwww") or (http.user_agent contains "PHPCrawl") or (http.user_agent contains "PyCurl") or (http.user_agent contains "python") or (http.user_agent contains "wrk") or (http.user_agent contains "hey/") or (http.user_agent contains "Acunetix") or (http.user_agent contains "apache") or (http.user_agent contains "BackDoorBot") or (http.user_agent contains "cobion") or (http.user_agent contains "masscan") or (http.user_agent contains "FHscan") or (http.user_agent contains "scanbot") or (http.user_agent contains "Gscan") or (http.user_agent contains "Researchscan") or (http.user_agent contains "WPScan") or (http.user_agent contains "ScanAlert") or (http.user_agent contains "Wprecon") or (http.user_agent contains "virusdie") or (http.user_agent contains "VoidEYE") or (http.user_agent contains "WebShag") or (http.user_agent contains "Zeus") or (http.user_agent contains "zgrab") or (http.user_agent contains "zmap") or (http.user_agent contains "nmap") or (http.user_agent contains "fimap") or (http.user_agent contains "ZmEu") or (http.user_agent contains "ZumBot") or (http.user_agent contains "Zyborg") or (http.user_agent contains "attachment") or (http.user_agent eq "undefined")

The rule relies on the UA (User-Agent) header, which identifies the client’s browser type and version.

Requests from disreputable IPs or with a matching UA trigger a background JS check.
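One detail worth keeping in mind is that the `contains` operator is case-sensitive, so "python" does not match a UA that starts with "Python". The sketch below imitates that behavior in Python; the fragment list is a small hypothetical excerpt and `matched_fragment` is an illustrative helper, not part of any Cloudflare tooling.

```python
from typing import Optional

# Small excerpt of UA fragments, for illustration only.
SUSPICIOUS_FRAGMENTS = ["python", "java", "libwww", "zgrab", "masscan"]


def matched_fragment(user_agent: str, normalize: bool = False) -> Optional[str]:
    """Return the first fragment found in the UA, or None.

    With normalize=False the comparison is case-sensitive, like `contains`.
    With normalize=True both sides are lowercased first.
    """
    haystack = user_agent.lower() if normalize else user_agent
    for fragment in SUSPICIOUS_FRAGMENTS:
        needle = fragment.lower() if normalize else fragment
        if needle in haystack:
            return fragment
    return None
```

In a real rule, the equivalent of `normalize=True` would be wrapping the field in the rules language’s `lower()` function, for example `lower(http.user_agent) contains "python"`; check that the function is available on your plan before relying on it.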

Block

Block rules rightly sit at the bottom of the priority order. Part of the rule is shown below:

[Image: Block Rule]

ASN: An Autonomous System Number identifies the network that an IP address belongs to; here it is used to block specific networks.

(not cf.client.bot and cf.threat_score gt 15) or (ip.geoip.asnum in {59055 59054 59053 59052 59051 59028 45104 45103 45102 37963 34947 211914 134963 63727 63655 61348 55990 269939 265443 206798 206204 200756 149167 141180 140723 139144 139124 136907 131444 45090 137876 133478 132591 132203}) or (http.user_agent contains "80legs") or (http.user_agent contains "Abonti") or (http.user_agent contains "admantx") or (http.user_agent contains "aipbot") or (http.user_agent contains "AllSubmitter") or (http.user_agent contains "Backlink") or (http.user_agent contains "backlink") or (http.user_agent contains "Badass") or (http.user_agent contains "Bigfoot") or (http.user_agent contains "blexbot") or (http.user_agent contains "Buddy") or (http.user_agent contains "CherryPicker") or (http.user_agent contains "cloudsystemnetwork") or (http.user_agent contains "cognitiveseo") or (http.user_agent contains "Collector") or (http.user_agent contains "cosmos") or (http.user_agent contains "CrazyWebCrawler") or (http.user_agent contains "Crescent") or (http.user_agent contains "Devil") or (http.user_agent contains "spider") or (http.user_agent contains "stat") or (http.user_agent contains "Appender") or (http.user_agent contains "Crawler") or (http.user_agent contains "DittoSpyder") or (http.user_agent contains "Konqueror") or (http.user_agent contains "Easou") or (http.user_agent contains "Yisou") or (http.user_agent contains "Etao") or (http.user_agent contains "mail" and http.user_agent contains "olf") or (http.user_agent contains "exabot.com") or (http.user_agent contains "getintent") or (http.user_agent contains "Grabber") or (http.user_agent contains "GrabNet") or (http.user_agent contains "HEADMasterSEO") or (http.user_agent contains "heritrix") or (http.user_agent contains "htmlparser") or (http.user_agent contains "hubspot") or (http.user_agent contains "Jyxobot") or (http.user_agent contains "kraken") or (http.user_agent contains "larbin") or (http.user_agent contains 
"ltx71") or (http.user_agent contains "leiki") or (http.user_agent contains "LinkScan") or (http.user_agent contains "Magnet") or (http.user_agent contains "Mag-Net") or (http.user_agent contains "Mechanize") or (http.user_agent contains "MegaIndex") or (http.user_agent contains "Metasearch") or (http.user_agent contains "MJ12bot") or (http.user_agent contains "moz.com") or (http.user_agent contains "Navroad") or (http.user_agent contains "Netcraft") or (http.user_agent contains "niki-bot") or (http.user_agent contains "NimbleCrawler") or (http.user_agent contains "Nimbostratus") or (http.user_agent contains "Ninja") or (http.user_agent contains "Openfind") or (http.user_agent contains "Analyzer") or (http.user_agent contains "Pixray") or (http.user_agent contains "probethenet") or (http.user_agent contains "proximic") or (http.user_agent contains "psbot") or (http.user_agent contains "RankActive") or (http.user_agent contains "RankingBot") or (http.user_agent contains "RankurBot") or (http.user_agent contains "Reaper") or (http.user_agent contains "SalesIntelligent") or (http.user_agent contains "Semrush") or (http.user_agent contains "SEOkicks") or (http.user_agent contains "spbot") or (http.user_agent contains "SEOstats") or (http.user_agent contains "Snapbot") or (http.user_agent contains "Stripper") or (http.user_agent contains "Siteimprove") or (http.user_agent contains "sitesell") or (http.user_agent contains "Siphon") or (http.user_agent contains "Sucker") or (http.user_agent contains "TenFourFox") or (http.user_agent contains "TurnitinBot") or (http.user_agent contains "trendiction") or (http.user_agent contains "twingly") or (http.user_agent contains "VidibleScraper") or (http.user_agent contains "WebLeacher") or (http.user_agent contains "WebmasterWorldForum") or (http.user_agent contains "webmeup") or (http.user_agent contains "Webster") or (http.user_agent contains "Widow") or (http.user_agent contains "Xaldon") or (http.user_agent contains "Xenu") or 
(http.user_agent contains "xtractor") or (http.user_agent contains "Zermelo")

Paste it into the expression editor for an easier visual breakdown:

  1. The request is NOT from a known bot AND its threat score is greater than 15: block.
  2. The ASN belongs to one of the listed cloud hosting networks: block.
  3. The UA matches one of the disreputable patterns: block.

Limitations

Cloudflare WAF strictly limits expression length depending on your subscription plan:

  • Free Plan: 2,000 chars (that’s us regular users 😭)
  • Pro Plan: 4,000 chars
  • Business Plan: 8,000 chars
  • Enterprise Plan: 16,000 chars

If your rule expression exceeds the limit, shorten it or split it into multiple rules to keep it usable.
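Splitting by hand is tedious, so here is a minimal sketch of a helper that builds UA clauses from a list and packs them into as few expressions as fit under a given character limit. The function names are hypothetical, and fragments are assumed not to contain double quotes.

```python
from typing import Iterable, List


def ua_clause(fragment: str) -> str:
    """Build one `contains` clause for a User-Agent fragment."""
    return f'(http.user_agent contains "{fragment}")'


def split_into_expressions(fragments: Iterable[str], max_chars: int) -> List[str]:
    """Join clauses with ' or ', starting a new expression when the limit is hit."""
    expressions: List[str] = []
    current = ""
    for fragment in fragments:
        clause = ua_clause(fragment)
        if len(clause) > max_chars:
            raise ValueError(f"clause for {fragment!r} exceeds the limit")
        candidate = clause if not current else f"{current} or {clause}"
        if len(candidate) > max_chars:
            expressions.append(current)
            current = clause
        else:
            current = candidate
    if current:
        expressions.append(current)
    return expressions
```

Keep in mind that every extra expression consumes one of the 5 rule slots on the Free plan, so trimming the list is often the better first step.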

Maintaining Rules

Now that you know the basic WAF behaviors, ongoing maintenance comes down to reviewing the firewall event log:

[Image: WAF Event]

A WAF should filter out bad traffic without locking out legitimate clients through false positives. Here is an example of a false positive:

[Image: Wrong Block]

What happened? Let’s look at the WAF events:

[Image: Lookup Event]

Clearly, an RSS subscription request from a service hosted on a cloud provider triggered the ASN restriction.

I don’t want to whitelist entire cloud providers, but I do want RSS subscribers to fetch the blog without restrictions. The solution?

It lies in our top-priority Allow whitelist! First, some terms:

  • Hostname: The name of a device or service on the network, usually mapped to a domain.
  • URI: Uniform Resource Identifier; a string that identifies a resource and can be part of a URL.
  • URI Path: The path portion of the request URI.

[Image: Allow RSS]

In the Allow rule from earlier, I add a custom path:

(cf.client.bot) or (http.host eq "www.ooo.vg" and http.request.uri.path eq "/index.xml")

Requests for /index.xml on that hostname now pass through the WAF, because the Allow rule takes precedence over everything below it.
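Both comparisons in that rule are exact matches, and the URI path does not include the query string. A small Python sketch of the same check, using a hypothetical `is_allowed_feed` helper, shows which URLs would and would not qualify:

```python
from urllib.parse import urlsplit


def is_allowed_feed(url: str) -> bool:
    """Mimic: http.host eq "www.ooo.vg" and http.request.uri.path eq "/index.xml"."""
    parts = urlsplit(url)
    return parts.hostname == "www.ooo.vg" and parts.path == "/index.xml"
```

A feed URL with a query string such as `?utm=rss` still matches, while a trailing slash, a different subdomain, or a nested path does not. If your feed lives at several paths, each one needs its own condition.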

Voilà! A robust defense that doesn’t break legitimate access.

CSR (Challenge Solve Rate)

CSR (Challenge Solve Rate) is the percentage of issued challenges that visitors actually solved.

Lower is better: a CSR of 1% means that 1 out of every 100 issued challenges was solved, so the rule is mostly stopping bots rather than interrupting real people. This metric helps evaluate WAF rule effectiveness and whether to adjust thresholds or actions.

A healthy value is roughly 0%-3%; beyond that range, consider adjusting the rule.
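As a worked example, here is a minimal sketch of the arithmetic, assuming CSR is computed as solved challenges divided by issued challenges. The function names and the 3% ceiling parameter are illustrative, taken from the rough range above rather than from any official threshold.

```python
def challenge_solve_rate(issued: int, solved: int) -> float:
    """Return the percentage of issued challenges that were solved."""
    if issued < 0 or solved < 0 or solved > issued:
        raise ValueError("expected 0 <= solved <= issued")
    if issued == 0:
        return 0.0
    return 100.0 * solved / issued


def needs_review(csr_percent: float, ceiling: float = 3.0) -> bool:
    """Flag a rule whose CSR is above the rough healthy range."""
    return csr_percent > ceiling
```

For instance, 2 solved challenges out of 200 issued gives a CSR of 1%, which sits inside the healthy range, while 10 out of 200 gives 5% and suggests the rule is catching too many real visitors.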

Other Rules

The Free plan also offers a limited set of options in a few other rule types, listed briefly below:

Rate limiting rules

Rate limiting rules cap the number of requests a client can make within a time window, protecting the server from overload. For example:

(not cf.bot_management.verified_bot and http.request.uri.path eq "/")

You can further customize rate limit rules as needed, including response codes (see note 1), time windows, validity periods, and more.

Note: Rate limiting rules can also restrict legitimate requests, so configure carefully.
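To make the idea of a request ceiling within a time window concrete, here is a minimal sliding-window counter in Python. It is a teaching sketch only and says nothing about how Cloudflare implements rate limiting internally; the class name, the per-IP keying, and the explicit `now` argument are all assumptions made for the example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, now: float) -> bool:
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the ceiling: this request would be limited
        hits.append(now)
        return True
```

With a ceiling of 3 requests per 10 seconds, a client’s fourth request inside the window is refused while other clients are unaffected, which also shows why a ceiling set too low can hurt legitimate visitors.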

DDoS Protection

DDoS protection policy: configure as shown below. Cloudflare’s rule is quite precise, with minimal false positives.

[Image: DDoS Protect]

That covers what the Free plan offers; it is plenty to get started!


  1. On the Free plan, some parameters cannot be modified.