Zentinel is built on Cloudflare’s Pingora, a battle-tested HTTP proxy framework written in Rust. This page explains what Pingora provides and how Zentinel extends it.
What is Pingora?
Pingora is an open-source proxy framework that Cloudflare uses to handle over 1 trillion requests per day. It provides:
- High-performance async HTTP handling using Tokio
- Connection pooling to upstream servers
- TLS termination with modern cipher suites
- HTTP/1.1 and HTTP/2 support
- Zero-copy buffer management for efficiency
- Graceful shutdown and upgrades
Zentinel uses Pingora as its foundation, adding routing, load balancing, agent coordination, and configuration management on top.
Why Pingora?
| Requirement | Pingora Solution |
|---|---|
| Performance | Handles millions of requests/sec with low latency |
| Safety | Written in Rust with memory safety guarantees |
| Production-proven | Powers Cloudflare’s global edge network |
| Extensibility | Clean trait-based architecture for customization |
| Operational | Built-in graceful restart and upgrade support |
Compared to Alternatives
| Framework | Language | Trade-offs |
|---|---|---|
| Pingora | Rust | Strong performance with compile-time memory safety; smaller ecosystem |
| Envoy | C++ | Feature-rich but complex to operate; no compile-time memory safety |
| HAProxy | C | Mature but harder to extend; no compile-time memory safety |
| Nginx | C | Ubiquitous, but writing modules is difficult |
Core Pingora Concepts
Server and Services
Pingora applications start with a Server that manages one or more services:
// Create Pingora server with options
let mut server = Server::new(Some(pingora_opt))?;
server.bootstrap();
// Create HTTP proxy service
let mut proxy_service = http_proxy_service(&server.configuration, proxy);
// Add listeners
proxy_service.add_tcp("0.0.0.0:8080");
// Register service and run
server.add_service(proxy_service);
server.run_forever();
The server handles:
- Worker thread management
- Signal handling (SIGHUP, SIGTERM)
- Graceful restarts and upgrades
- Daemonization
Session
A Session represents a single HTTP request/response cycle. It provides access to:
// Request information
session.req_header() // HTTP request headers
session.req_header_mut() // Mutable access for modifications
session.client_addr() // Client socket address, if known
// Response information
session.response_written() // Response header already sent to the client, if any
// Body handling
session.read_request_body() // Read request body chunks
session.write_response_body() // Write response body
HttpPeer
An HttpPeer represents an upstream server connection target:
let mut peer = HttpPeer::new(
("backend.example.com", 8080), // Address
false, // TLS enabled
"backend.example.com".into() // SNI hostname
);
// Connection options
peer.options.connection_timeout = Some(Duration::from_secs(5));
peer.options.read_timeout = Some(Duration::from_secs(30));
Pingora maintains connection pools to peers for efficiency.
The ProxyHttp Trait
The ProxyHttp trait is the heart of Pingora’s extensibility. Zentinel implements this trait to inject custom logic at each stage of request processing:
#[async_trait]
impl ProxyHttp for ZentinelProxy {
type CTX = RequestContext;
// Create per-request context
fn new_ctx(&self) -> Self::CTX {
RequestContext::new()
}
// Select upstream server
async fn upstream_peer(
&self,
session: &mut Session,
ctx: &mut Self::CTX,
) -> Result<Box<HttpPeer>, Box<Error>>;
// Process request before forwarding
async fn request_filter(
&self,
session: &mut Session,
ctx: &mut Self::CTX,
) -> Result<bool, Box<Error>>;
// Process response before returning
async fn response_filter(
&self,
session: &mut Session,
upstream_response: &mut ResponseHeader,
ctx: &mut Self::CTX,
) -> Result<(), Box<Error>>;
// Final logging after request completes
async fn logging(
&self,
session: &mut Session,
error: Option<&Error>,
ctx: &mut Self::CTX,
);
}
Request Context
Each request gets its own context that persists throughout the lifecycle:
pub struct RequestContext {
pub trace_id: String,
pub start_time: Instant,
pub route_id: Option<String>,
pub upstream: Option<String>,
pub client_ip: String,
pub method: String,
pub path: String,
pub upstream_attempts: u32,
// ... more fields
}
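The following is a minimal, standard-library-only sketch of how such a context might be constructed. The counter-based trace ID, the `NEXT_TRACE_ID` static, and the empty initial field values are assumptions made for illustration; this page does not show how Zentinel actually generates trace IDs or implements `elapsed`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, Instant};

// Hypothetical process-wide counter used to derive trace IDs in this sketch.
static NEXT_TRACE_ID: AtomicU64 = AtomicU64::new(1);

pub struct RequestContext {
    pub trace_id: String,
    pub start_time: Instant,
    pub route_id: Option<String>,
    pub upstream: Option<String>,
    pub client_ip: String,
    pub method: String,
    pub path: String,
    pub upstream_attempts: u32,
}

impl RequestContext {
    /// Creates an empty context; later stages fill in the remaining fields.
    pub fn new() -> Self {
        let id = NEXT_TRACE_ID.fetch_add(1, Ordering::Relaxed);
        Self {
            trace_id: format!("req-{id:016x}"),
            start_time: Instant::now(),
            route_id: None,
            upstream: None,
            client_ip: String::new(),
            method: String::new(),
            path: String::new(),
            upstream_attempts: 0,
        }
    }

    /// Time elapsed since the context was created.
    pub fn elapsed(&self) -> Duration {
        self.start_time.elapsed()
    }
}
```

Because the context is created once per request and passed by mutable reference to every hook, later stages such as logging can read what earlier stages recorded without any shared global state.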
How Zentinel Uses Pingora
1. Route Matching (upstream_peer)
When a request arrives, Zentinel matches it to a route and selects an upstream:
  Request arrives
         │
         ▼
┌────────────────────┐
│ Parse request info │
│ (method, path,     │
│  host, headers)    │
└────────┬───────────┘
         │
         ▼
┌────────────────────┐
│ Match against      │
│ compiled routes    │
└────────┬───────────┘
         │
         ▼
┌────────────────────┐
│ Select peer from   │
│ upstream pool      │
└────────┬───────────┘
         │
         ▼
  Return HttpPeer
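The match-then-select flow can be sketched with the standard library alone. The `Route` type, longest-prefix matching, and round-robin selection below are assumptions chosen for illustration; Zentinel's real route compiler and load-balancing policies are not described on this page and may differ.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical compiled route: a path prefix mapped to a pool of upstream addresses.
pub struct Route {
    pub id: String,
    pub path_prefix: String,
    pub upstreams: Vec<String>,
    next: AtomicUsize, // round-robin cursor
}

impl Route {
    pub fn new(id: &str, path_prefix: &str, upstreams: &[&str]) -> Self {
        Self {
            id: id.to_string(),
            path_prefix: path_prefix.to_string(),
            upstreams: upstreams.iter().map(|s| s.to_string()).collect(),
            next: AtomicUsize::new(0),
        }
    }

    /// Picks the next upstream in round-robin order, or None if the pool is empty.
    pub fn select_peer(&self) -> Option<&str> {
        if self.upstreams.is_empty() {
            return None;
        }
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.upstreams.len();
        Some(self.upstreams[i].as_str())
    }
}

/// Returns the route with the longest prefix that matches `path`.
/// This is a plain string-prefix match, so "/api" also matches "/apix".
pub fn match_route<'a>(routes: &'a [Route], path: &str) -> Option<&'a Route> {
    routes
        .iter()
        .filter(|r| path.starts_with(r.path_prefix.as_str()))
        .max_by_key(|r| r.path_prefix.len())
}
```

In a real `upstream_peer` implementation, the selected address would then be wrapped in an `HttpPeer` and returned to Pingora.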
2. Request Processing (request_filter)
Before forwarding, Zentinel applies filters and calls agents:
async fn request_filter(&self, session: &mut Session, ctx: &mut Self::CTX)
-> Result<bool, Box<Error>>
{
// Handle static files and builtins
if route.service_type == ServiceType::Static {
return self.handle_static_route(session, ctx).await;
}
// Enforce limits
if headers.len() > config.limits.max_header_count {
return Err(Error::explain(ErrorType::InvalidHTTPHeader, "Too many headers"));
}
// Add tracing headers
req_header.insert_header("X-Correlation-Id", &ctx.trace_id)?;
req_header.insert_header("X-Forwarded-By", "Zentinel")?;
// Call external agents
self.process_agents(session, ctx).await?;
Ok(false) // Continue to upstream
}
Returning Ok(true) tells Pingora that the filter has already written a response, so the request is not forwarded.
Returning Ok(false) continues processing and forwards the request to the upstream.
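The return-value convention can be seen in isolation in the sketch below, which replaces Pingora's `Session` with a hypothetical `Request` struct and uses `String` as the error type. The `/healthz` builtin path and the function parameters are assumptions made for the example, not part of Zentinel's API.

```rust
/// Hypothetical stand-in for the request data a filter inspects.
pub struct Request {
    pub path: String,
    pub headers: Vec<(String, String)>,
}

/// Mirrors the convention used by `request_filter`:
/// - `Ok(true)`: a response was produced locally, skip the upstream
/// - `Ok(false)`: continue to the upstream
/// - `Err(..)`: reject the request
pub fn request_filter(
    req: &mut Request,
    max_header_count: usize,
    trace_id: &str,
) -> Result<bool, String> {
    // Enforce limits before doing any other work.
    if req.headers.len() > max_header_count {
        return Err(format!(
            "too many headers: {} > {}",
            req.headers.len(),
            max_header_count
        ));
    }
    // Hypothetical builtin endpoint answered by the proxy itself.
    if req.path == "/healthz" {
        return Ok(true);
    }
    // Add tracing headers for the upstream.
    req.headers
        .push(("X-Correlation-Id".to_string(), trace_id.to_string()));
    req.headers
        .push(("X-Forwarded-By".to_string(), "Zentinel".to_string()));
    Ok(false)
}
```

Checking limits first keeps rejected requests cheap: no headers are added and no agents would be called for a request that is going to fail anyway.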
3. Response Processing (response_filter)
After receiving the upstream response:
async fn response_filter(
&self,
session: &mut Session,
upstream_response: &mut ResponseHeader,
ctx: &mut Self::CTX,
) -> Result<(), Box<Error>> {
// Add security headers
upstream_response.insert_header("X-Content-Type-Options", "nosniff")?;
upstream_response.insert_header("X-Frame-Options", "DENY")?;
// Add correlation ID
upstream_response.insert_header("X-Correlation-Id", &ctx.trace_id)?;
// Record metrics
self.metrics.record_request(
ctx.route_id.as_deref().unwrap_or("unknown"),
&ctx.method,
upstream_response.status.as_u16(),
ctx.elapsed(),
);
// Update health status
self.passive_health.record_outcome(&upstream, success).await;
Ok(())
}
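The `record_outcome` call above implies some form of passive health tracking. One possible shape, shown here as a synchronous standard-library sketch, marks an upstream unhealthy after a number of consecutive failures. The `PassiveHealth` type, the threshold policy, and the `is_success` rule are all assumptions; Zentinel's actual tracker is asynchronous and its policy is not documented on this page.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical passive health tracker: an upstream is considered unhealthy
/// after `threshold` consecutive failures and healthy again after one success.
pub struct PassiveHealth {
    threshold: u32,
    failures: Mutex<HashMap<String, u32>>,
}

impl PassiveHealth {
    pub fn new(threshold: u32) -> Self {
        Self {
            threshold,
            failures: Mutex::new(HashMap::new()),
        }
    }

    /// Records the outcome of one proxied request.
    pub fn record_outcome(&self, upstream: &str, success: bool) {
        let mut failures = self.failures.lock().unwrap();
        if success {
            // A success resets the consecutive-failure count.
            failures.remove(upstream);
        } else {
            *failures.entry(upstream.to_string()).or_insert(0) += 1;
        }
    }

    /// True while the consecutive-failure count is below the threshold.
    pub fn is_healthy(&self, upstream: &str) -> bool {
        let failures = self.failures.lock().unwrap();
        failures.get(upstream).copied().unwrap_or(0) < self.threshold
    }
}

/// One possible definition of "success": any status other than 5xx.
pub fn is_success(status: u16) -> bool {
    status < 500
}
```

Passive checks of this kind use real traffic as the signal, so they add no probe load on the backends, at the cost of only noticing a failure after some client requests have already hit it.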
4. Logging (logging)
After the response is sent to the client:
async fn logging(&self, session: &mut Session, error: Option<&Error>, ctx: &mut Self::CTX) {
// Decrement active request counter
self.reload_coordinator.dec_requests();
// Write structured access log
let entry = AccessLogEntry {
timestamp: Utc::now().to_rfc3339(),
trace_id: ctx.trace_id.clone(),
method: ctx.method.clone(),
path: ctx.path.clone(),
status: session.response_written().map(|r| r.status.as_u16()),
duration_ms: ctx.elapsed().as_millis(),
// ...
};
self.log_manager.log_access(&entry);
}
Connection Pooling
Pingora automatically pools connections to upstream servers:
┌─────────────────────────────────────────────────────────┐
│                     Zentinel Proxy                      │
│  ┌───────────────────────────────────────────────────┐  │
│  │              Connection Pool Manager              │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌───────────┐  │  │
│  │  │  backend-1  │  │  backend-2  │  │ backend-3 │  │  │
│  │  │  ┌─┐┌─┐┌─┐  │  │  ┌─┐┌─┐┌─┐  │  │  ┌─┐┌─┐   │  │  │
│  │  │  │C││C││C│  │  │  │C││C││C│  │  │  │C││C│   │  │  │
│  │  │  └─┘└─┘└─┘  │  │  └─┘└─┘└─┘  │  │  └─┘└─┘   │  │  │
│  │  └─────────────┘  └─────────────┘  └───────────┘  │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘
             │                │               │
             ▼                ▼               ▼
        ┌─────────┐      ┌─────────┐     ┌─────────┐
        │ Backend │      │ Backend │     │ Backend │
        │    1    │      │    2    │     │    3    │
        └─────────┘      └─────────┘     └─────────┘
Benefits:
- Reduced latency - Reuses existing TCP connections
- Lower resource usage - Fewer connections to manage
- Connection limits - Prevents overwhelming backends
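To make the reuse idea concrete, here is a small keyed pool written against the standard library only. It is not Pingora's implementation: the `ConnectionPool` type, its method names, and the per-peer idle limit are assumptions for illustration, and the sketch caps idle connections rather than total connections per backend.

```rust
use std::collections::HashMap;

/// Hypothetical keyed pool of idle connections, generic over the connection type.
pub struct ConnectionPool<C> {
    max_idle_per_peer: usize,
    idle: HashMap<String, Vec<C>>,
}

impl<C> ConnectionPool<C> {
    pub fn new(max_idle_per_peer: usize) -> Self {
        Self {
            max_idle_per_peer,
            idle: HashMap::new(),
        }
    }

    /// Reuses an idle connection to `peer` if one exists, otherwise opens a new one.
    /// Returns the connection and whether it was reused.
    pub fn acquire(&mut self, peer: &str, connect: impl FnOnce() -> C) -> (C, bool) {
        match self.idle.get_mut(peer).and_then(|conns| conns.pop()) {
            Some(conn) => (conn, true),
            None => (connect(), false),
        }
    }

    /// Returns a connection to the pool; it is dropped if the peer's idle list is full.
    pub fn release(&mut self, peer: &str, conn: C) {
        let conns = self.idle.entry(peer.to_string()).or_default();
        if conns.len() < self.max_idle_per_peer {
            conns.push(conn);
        }
    }

    /// Number of idle connections currently held for `peer`.
    pub fn idle_count(&self, peer: &str) -> usize {
        self.idle.get(peer).map_or(0, |c| c.len())
    }
}
```

The latency benefit comes from the `Some(conn)` branch of `acquire`: a reused connection skips the TCP handshake, and the TLS handshake where applicable, that `connect()` would otherwise pay.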
Graceful Operations
Hot Restart
Pingora supports zero-downtime restarts:
┌──────────────┐      SIGUSR2     ┌──────────────┐
│  Old Worker  │ ───────────────▶ │  New Worker  │
│  (draining)  │                  │  (starting)  │
└──────┬───────┘                  └──────┬───────┘
       │                                 │
       │ Existing connections            │ New connections
       │ finish gracefully               │ accepted
       ▼                                 ▼
[exit when done]                  [fully operational]
Graceful Shutdown
On SIGTERM/SIGINT:
- Stop accepting new connections
- Wait for in-flight requests (with timeout)
- Close connection pools
- Exit cleanly
Zentinel extends this with reload coordination:
pub struct GracefulReloadCoordinator {
active_requests: AtomicUsize,
max_drain_time: Duration,
}
impl GracefulReloadCoordinator {
pub fn inc_requests(&self) { /* ... */ }
pub fn dec_requests(&self) { /* ... */ }
pub async fn wait_for_drain(&self) { /* ... */ }
}
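The elided method bodies could be filled in along the following lines. This is a sketch under stated assumptions, not Zentinel's code: it adds a hypothetical `new` constructor and `active` accessor, and its `wait_for_drain` is a blocking, polling version that returns a `bool`, whereas the real method is `async` and its return value is not shown here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::{Duration, Instant};

pub struct GracefulReloadCoordinator {
    active_requests: AtomicUsize,
    max_drain_time: Duration,
}

impl GracefulReloadCoordinator {
    pub fn new(max_drain_time: Duration) -> Self {
        Self {
            active_requests: AtomicUsize::new(0),
            max_drain_time,
        }
    }

    /// Called when a request starts.
    pub fn inc_requests(&self) {
        self.active_requests.fetch_add(1, Ordering::SeqCst);
    }

    /// Called when a request finishes; must be paired with `inc_requests`.
    pub fn dec_requests(&self) {
        self.active_requests.fetch_sub(1, Ordering::SeqCst);
    }

    /// Number of requests currently in flight.
    pub fn active(&self) -> usize {
        self.active_requests.load(Ordering::SeqCst)
    }

    /// Blocks until no requests are in flight or `max_drain_time` has passed.
    /// Returns true if the drain completed before the deadline.
    pub fn wait_for_drain(&self) -> bool {
        let deadline = Instant::now() + self.max_drain_time;
        while self.active() > 0 {
            if Instant::now() >= deadline {
                return false;
            }
            thread::sleep(Duration::from_millis(5));
        }
        true
    }
}
```

Bounding the wait with `max_drain_time` matters because a single stuck request would otherwise block a reload indefinitely; the caller can decide what to do when the drain times out.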
Error Handling
Pingora uses a typed error system:
pub enum ErrorType {
InvalidHTTPHeader,
ConnectTimedout,
ConnectRefused,
ConnectNoRoute,
ReadError,
WriteError,
// ... many more
}
Zentinel maps these to appropriate HTTP responses:
| Error Type | HTTP Status | Response |
|---|---|---|
| ConnectTimedout | 504 | Gateway Timeout |
| ConnectRefused | 502 | Bad Gateway |
| ReadError | 502 | Bad Gateway |
| InvalidHTTPHeader | 400 | Bad Request |
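Expressed as code, the mapping is a single `match`. The sketch below defines its own local `ErrorType` enum containing only the variants named on this page, so that it compiles without Pingora; the `status_for` function name and the 502 fallback for variants the mapping does not list are assumptions.

```rust
/// Local mirror of the error types named on this page, used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorType {
    InvalidHTTPHeader,
    ConnectTimedout,
    ConnectRefused,
    ConnectNoRoute,
    ReadError,
    WriteError,
}

/// Maps an error type to an HTTP status code and reason phrase.
pub fn status_for(error: ErrorType) -> (u16, &'static str) {
    match error {
        ErrorType::ConnectTimedout => (504, "Gateway Timeout"),
        ErrorType::ConnectRefused | ErrorType::ReadError => (502, "Bad Gateway"),
        ErrorType::InvalidHTTPHeader => (400, "Bad Request"),
        // Assumption: variants without a documented mapping fall back to 502.
        ErrorType::ConnectNoRoute | ErrorType::WriteError => (502, "Bad Gateway"),
    }
}
```

Writing the match exhaustively, without a `_` arm, means the compiler flags any newly added variant until someone decides which status it should produce.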
Performance Characteristics
Pingora’s architecture enables:
| Metric | Typical Value |
|---|---|
| Requests/sec (per core) | 100,000+ |
| P99 latency overhead | < 1ms |
| Memory per connection | ~10KB |
| Connection reuse rate | > 95% |
Dependencies
Zentinel uses these Pingora crates:
[dependencies]
pingora = { version = "0.6", features = ["proxy", "lb"] }
pingora-core = "0.6"
pingora-http = "0.6"
pingora-proxy = "0.6"
pingora-load-balancing = "0.6"
pingora-timeout = "0.6"
Next Steps
- Architecture Overview - High-level design
- Component Design - Zentinel’s crate structure
- Request Flow - Detailed request lifecycle