Guide to Optimizing an NGINX + PHP-FPM Setup in Kubernetes

This guide covers the Kubernetes deployment configuration and the NGINX + PHP-FPM interaction tuning needed for a performant, scalable, and reliable setup.


Deployment Optimization

1. Separate NGINX and PHP-FPM into Independent Deployments

# NGINX Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:stable
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi

---
# PHP-FPM Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-fpm
spec:
  replicas: 2
  selector:
    matchLabels:
      app: php-fpm
  template:
    metadata:
      labels:
        app: php-fpm
    spec:
      containers:
      - name: php-fpm
        image: php:fpm
        ports:
        - containerPort: 9000
        resources:
          requests:
            cpu: 500m
            memory: 256Mi
          limits:
            cpu: 1
            memory: 512Mi


2. Scale Independently with HPA

Use a Horizontal Pod Autoscaler (HPA) for each component so that it scales on its own resource or custom metrics.

General Guidelines:

1. NGINX Scaling Metric: Request Rate or Connections

metrics:
- type: Pods
  pods:
    metric:
      name: nginx_ingress_controller_requests
    target:
      type: AverageValue
      averageValue: "100"

Note: Pods-type custom metrics use the autoscaling/v2 API and require a custom-metrics adapter (for example, the Prometheus adapter) to be installed in the cluster.

2. PHP-FPM Scaling Metric: CPU or Memory Utilization

metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 70
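
For reference, a complete HorizontalPodAutoscaler wrapping the PHP-FPM metric above might look like the following; the replica bounds are illustrative assumptions:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-fpm
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-fpm
  minReplicas: 2   # illustrative floor
  maxReplicas: 10  # illustrative ceiling
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70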

Understanding Contradictions and Particular Needs

While the general guidelines work in most cases, specific application workloads and traffic patterns can necessitate alternative scaling strategies. Here’s how to assess and adapt:


NGINX Scaling Considerations


PHP-FPM Scaling Considerations


How to Identify Particular Needs

  1. Analyze Application Behavior:

    • Use tools like Prometheus + Grafana or NGINX Amplify to gather metrics:
      • Response time
      • Error rates
      • Resource utilization
    • Compare metrics to identify bottlenecks (see the recording-rule sketch after this list).
  2. Profile Traffic Patterns:

    • Is the workload spiky (e.g., event-driven traffic)?
    • Is there a high proportion of static vs. dynamic content?
    • Does your application use long-lived connections (e.g., WebSockets)?
  3. Perform Load Testing:

    • Simulate real-world traffic using tools like k6 or wrk to understand scaling needs under pressure.
  4. Iterative Tuning:

    • Start with general scaling rules and gradually refine targets based on observed behavior.
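
To make the metrics-gathering step concrete, here is a minimal Prometheus recording-rule sketch. It assumes the NGINX ingress controller and a PHP-FPM exporter (such as hipages/php-fpm_exporter) are already being scraped, so the metric names are assumptions about your environment:

groups:
- name: nginx-php-fpm
  rules:
  # Cluster-wide NGINX request rate over the last 5 minutes
  - record: job:nginx_requests:rate5m
    expr: sum(rate(nginx_ingress_controller_requests[5m]))
  # Fraction of PHP-FPM workers that are busy (1.0 means the pool is saturated)
  - record: job:phpfpm_active_ratio
    expr: sum(phpfpm_active_processes) / sum(phpfpm_total_processes)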

Examples of Contradictions in Practice


By understanding these nuances, you can tailor scaling strategies to your workload while optimizing costs and performance.


3. Service Communication

apiVersion: v1
kind: Service
metadata:
  name: php-fpm
spec:
  selector:
    app: php-fpm
  ports:
  - protocol: TCP
    port: 9000 # PHP-FPM default
    targetPort: 9000

NGINX Configuration

1. Enable Keep-Alive Connections

http {
    upstream php-fpm {
        server php-fpm.default.svc.cluster.local:9000;
        keepalive 32; # Number of persistent connections
    }

    server {
        location ~ \.php$ {
            fastcgi_pass php-fpm;
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_keep_conn on;
        }
    }
}

Note: Pros and Cons of Keep-Alive Connections

Pros:

  • Reduced Latency: Eliminates the overhead of establishing a new TCP connection for each request, speeding up interactions between client and server.
  • Lower Resource Usage: Saves CPU and memory by avoiding repeated handshakes (e.g., TLS/SSL negotiation).
  • Improved Throughput: Multiple requests can share a single connection, leading to better utilization of network resources.
  • Better User Experience: Faster response times and smoother browsing for users.

Cons:

  • Higher Resource Consumption on the Server: Long-lived connections may occupy server memory and connection slots, reducing capacity for new clients.
  • Potential for Overloading: If keep-alive timeouts are too long, idle connections can overwhelm the server, especially under high traffic.
  • Requires Fine-Tuning: Needs optimal timeout and connection limit configurations to avoid inefficiencies.
  • Compatibility Issues: Older or non-standard clients may not fully support keep-alive, leading to inconsistencies.

When to Use:
Keep-alive is ideal for scenarios with frequent and repeated client-server interactions (e.g., APIs or websites with many assets). Avoid or limit keep-alive for low-traffic, infrequent interactions where connections are likely to sit idle.
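
The "requires fine-tuning" point above mostly comes down to two upstream directives. A sketch assuming NGINX 1.15.3 or newer (where these directives are valid in the upstream block); the values are illustrative:

upstream php-fpm {
    server php-fpm.default.svc.cluster.local:9000;
    keepalive 32;
    keepalive_requests 1000; # recycle each connection after 1000 requests
    keepalive_timeout 60s;   # close idle upstream connections after 60 seconds
}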

2. Buffering

fastcgi_buffer_size 16k;
fastcgi_buffers 4 16k;
fastcgi_busy_buffers_size 32k;
fastcgi_temp_file_write_size 32k;

Note: Pros and Cons of FastCGI Buffering

Pros:

  • Improved Performance: Buffers responses in memory before sending them to the client, reducing I/O overhead and improving throughput.
  • Better Handling of Slow Clients: Allows NGINX to quickly fetch responses from the backend and hold them in memory, freeing up the FastCGI backend for other requests.
  • Reduced Backend Load: Limits the time FastCGI processes spend serving responses directly, enhancing scalability under heavy traffic.
  • Optimized Resource Usage: Useful for handling large, static-like dynamic content (e.g., large HTML pages).

Cons:

  • Increased Memory Usage: Buffering requires memory allocation, which can grow significantly with high traffic or large responses.
  • Latency for Streaming Content: Adds latency when serving real-time or streaming data, as the response is buffered before being sent to the client.
  • Complex Tuning Required: Requires careful adjustment of buffer sizes (fastcgi_buffers, fastcgi_buffer_size) to avoid wasted resources or unintended performance issues.
  • Not Always Needed: For small, quick responses, buffering can introduce unnecessary overhead.

When to Use:
Enable FastCGI buffering for most scenarios, especially when handling dynamic content or high traffic. Consider disabling buffering for low-latency, streaming, or real-time applications where immediate delivery of data is critical.
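
For the streaming and real-time case mentioned above, buffering can be disabled per location. A minimal sketch; /stream is a hypothetical endpoint name:

location /stream {
    fastcgi_pass php-fpm;
    include fastcgi_params;
    fastcgi_buffering off; # forward backend output to the client immediately
}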

3. Static File Serving

location / {
    try_files $uri $uri/ /index.php?$query_string;
}

location ~* \.(jpg|jpeg|png|css|js|ico|gif|svg|woff2|woff|ttf|eot|otf)$ {
    expires 1y;
    add_header Cache-Control "public, no-transform";
    access_log off;
}

Note: Pros and Cons of Static File Serving

Pros:

  • Improved Performance: Serving static files directly from NGINX is much faster than processing them through PHP or other backend services, reducing response times.
  • Low Resource Usage: Static files do not require server-side processing, freeing up resources for dynamic content handling.
  • Efficient Caching: Static files can be easily cached at multiple levels (e.g., in browser cache, CDN, or NGINX), reducing load on the server and improving delivery speed.
  • Better Scalability: Serving static content through NGINX reduces the need for scaling backend services, as static content doesn't put additional load on application servers.

Cons:

  • Limited Flexibility: Static file serving doesn’t allow dynamic behavior (e.g., user personalization), and it can’t handle content changes without updating the file itself.
  • Potential for Stale Content: Without proper cache control, clients may receive outdated content if the cache is not refreshed or invalidated.
  • File Management Complexity: Managing static assets (e.g., versioning, minification, or compression) requires additional setup and maintenance.

When to Use:
Serve static files through NGINX whenever possible, especially for assets like images, CSS, and JavaScript. Use cache headers and CDNs to further optimize delivery. Avoid serving static content through PHP or other dynamic backends to improve performance and reduce server load.
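
As a complement to the cache headers above, text-based assets can also be compressed at the NGINX layer; a minimal sketch with illustrative values:

gzip on;
# text/html is always compressed when gzip is on; list other text types explicitly
gzip_types text/css application/javascript image/svg+xml;
gzip_min_length 1024; # skip tiny files where compression overhead outweighs savings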

4. Rate Limiting

# http context: a 10 MB zone keyed by client IP, sustained rate of 10 requests/second
limit_req_zone $binary_remote_addr zone=php_limit:10m rate=10r/s;

# server context
location ~ \.php$ {
    limit_req zone=php_limit;
}

Note: Pros and Cons of Rate Limiting

Pros:

  • Prevents Abuse: Protects your server from excessive requests, which could overwhelm the system, by limiting the number of requests from a client within a specific time frame.
  • Improved Reliability: Ensures fair usage of resources, preventing any single user from consuming too many resources and negatively affecting others.
  • Mitigates DDoS Attacks: Helps mitigate simple Denial of Service (DoS) or Distributed Denial of Service (DDoS) attacks by controlling the rate of incoming traffic.

Cons:

  • Can Block Legitimate Users: If set too aggressively, rate limits might block legitimate traffic, especially from users with varying request patterns.
  • Requires Fine-Tuning: Needs proper configuration to balance between protecting resources and ensuring a smooth user experience.
  • May Introduce Latency: For APIs or services with strict rate limits, users might experience delays if they hit the limit too quickly.

When to Use:
Implement rate limiting for services that are exposed to the internet, such as APIs or login pages, to prevent abuse and maintain performance under high load.
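
To avoid throttling legitimate users during brief spikes, a burst allowance is commonly added; a minimal sketch with illustrative values:

location ~ \.php$ {
    # Absorb short bursts: queue up to 20 excess requests and serve them
    # without pacing delay; beyond that, NGINX rejects with 503 by default.
    limit_req zone=php_limit burst=20 nodelay;
}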


PHP-FPM Configuration

1. Pool Configuration

pm = dynamic
; Maximum number of worker processes (caps concurrent requests)
pm.max_children = 50
; Number of workers created on startup
pm.start_servers = 10
; Minimum number of idle workers
pm.min_spare_servers = 5
; Maximum number of idle workers
pm.max_spare_servers = 15
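
A common way to pick pm.max_children is to divide the memory available to the container by the average per-worker footprint. The numbers below are assumptions purely for illustration:

; Illustrative sizing for a 512Mi container limit, assuming ~60Mi for the
; master process plus OPcache and ~30Mi average per worker:
;   pm.max_children ≈ (512 - 60) / 30 ≈ 15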

2. Process Management

; Restart each worker after serving 500 requests (guards against memory leaks)
pm.max_requests = 500
pm.status_path = /status

3. Enable OPcache

opcache.enable=1
opcache.memory_consumption=128
opcache.interned_strings_buffer=8
opcache.max_accelerated_files=4000
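
In an immutable container image the PHP code never changes at runtime, so timestamp revalidation can optionally be disabled; this assumes every release ships as a new image:

; Safe only because code changes arrive as new container images
opcache.validate_timestamps=0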

Monitoring and Debugging

1. NGINX Metrics

location /nginx_status {
    stub_status;
    allow 127.0.0.1; # Allowlist
    deny all;
}

2. PHP-FPM Metrics
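
With pm.status_path enabled in the pool configuration above, the status page still needs to be exposed through NGINX. A minimal sketch, allowlisted like /nginx_status; the SCRIPT_NAME value is what PHP-FPM matches against pm.status_path:

location = /status {
    access_log off;
    allow 127.0.0.1; # allowlist, as with /nginx_status
    deny all;
    include fastcgi_params;
    fastcgi_param SCRIPT_NAME /status;
    fastcgi_param SCRIPT_FILENAME /status;
    fastcgi_pass php-fpm;
}

A sidecar exporter (for example, hipages/php-fpm_exporter) can expose the same pool statistics as Prometheus metrics.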

3. Log Optimization

; Dump a PHP backtrace for any request that runs longer than 5 seconds
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm-slow.log

Testing and Iterative Improvement

1. Load Testing: Simulate production-like traffic with tools such as k6 or wrk and watch how both tiers behave as load grows.

2. Observability: Track response times, error rates, and resource utilization (e.g., with Prometheus and Grafana) to locate bottlenecks in NGINX or PHP-FPM.

3. Iterative Tuning: Start from the general settings in this guide and refine pool sizes, buffers, and HPA targets based on observed behavior.


Summary of Alternatives to NGINX + PHP-FPM

  • Apache with mod_php. Best for: simpler setups, legacy environments. Pros: easy to configure, no need for FPM, .htaccess support. Cons: lower performance and scalability compared to NGINX + FPM. Reference: Apache.
  • Caddy with PHP-FPM. Best for: simple configuration and automatic HTTPS. Pros: easy to set up, secure by default, automatic SSL/TLS. Cons: less flexible than NGINX, smaller community. Reference: Caddy.
  • LiteSpeed with LSAPI. Best for: high-performance PHP apps requiring scalability. Pros: high performance, supports HTTP/2, out-of-the-box optimizations. Cons: proprietary software, licensing costs. Reference: LiteSpeed.
  • OpenLiteSpeed. Best for: open-source, high-performance PHP apps. Pros: free, fast, optimized for PHP, supports HTTP/2 and caching. Cons: smaller community than NGINX/Apache. Reference: OpenLiteSpeed.
  • PHP Built-In Server. Best for: development or lightweight local setups. Pros: simple to use, no server setup required. Cons: not suitable for production, limited scalability. Reference: PHP Manual.
  • Node.js with PHP-FPM. Best for: integrating PHP into a Node.js ecosystem. Pros: flexibility for real-time capabilities, can combine multiple technologies. Cons: additional complexity, potential latency from proxying. Reference: Node.js.
  • Docker + PHP-FPM. Best for: microservices or containerized environments. Pros: scalability, isolation, works well with cloud environments and Kubernetes. Cons: Docker overhead, additional complexity. Reference: Docker.
  • Serverless PHP. Best for: event-driven or highly dynamic applications. Pros: no infrastructure management, auto-scaling, pay-per-use. Cons: cold-start latency, limited runtime, limited control over environment. Reference: AWS Lambda.

Summary of Monitoring and Telemetry Options

  • Prometheus. Best for: time-series data, monitoring and alerting. Pros: open-source, highly customizable, integrates with Kubernetes, large community support. Cons: requires configuration and management, storage overhead, may require additional tools for full-stack observability.
  • Grafana. Best for: visualizing metrics from various sources. Pros: powerful visualization and dashboarding, integrates with multiple data sources, open-source. Cons: may require integration with other tools (e.g., Prometheus) for data collection and alerting.
  • Datadog. Best for: full-stack monitoring and APM (Application Performance Monitoring). Pros: comprehensive monitoring, real-time alerting, cloud-based, easy to set up. Cons: paid service, pricing can scale with usage, cloud vendor lock-in.
  • New Relic. Best for: APM, monitoring for web apps and microservices. Pros: detailed application performance insights, real-time monitoring and alerting. Cons: paid service, can become expensive, requires agent installation.
  • Elastic Stack (ELK Stack). Best for: log management, monitoring, and analytics. Pros: centralized logging, advanced search and analysis, integrates with Kubernetes. Cons: setup and management complexity; storage and performance can be an issue for large-scale environments.
  • Zabbix. Best for: traditional monitoring for infrastructure. Pros: open-source, supports a wide range of monitoring types, can be highly customized. Cons: may have a steeper learning curve, not as modern in terms of UI and integrations.
  • InfluxDB. Best for: a time-series database for monitoring and metrics. Pros: optimized for high-write loads, easy integration with monitoring tools like Grafana. Cons: requires additional tools for full-stack observability, focused mainly on time-series data.
  • Azure Monitor. Best for: Microsoft Azure services, cloud infrastructure monitoring. Pros: native integration with Azure, real-time monitoring, scalable. Cons: limited to Azure, not as flexible as open-source solutions.
  • Sysdig. Best for: Kubernetes, container, and cloud-native monitoring. Pros: deep monitoring for containers and Kubernetes, integrates with Prometheus and Grafana. Cons: paid service for advanced features, limited free tier.
  • Splunk. Best for: logs and machine data analysis. Pros: powerful log aggregation, real-time analytics, APM capabilities. Cons: paid service, may become expensive at scale, setup complexity.
  • Nagios. Best for: IT infrastructure monitoring. Pros: widely used, open-source, highly customizable, alerting capabilities. Cons: older tool, can be complex to configure, UI is less modern.

Glossary of Terms

  • PHP-FPM: PHP FastCGI Process Manager, a PHP implementation designed to provide faster execution by handling multiple PHP processes concurrently.
  • NGINX: A high-performance web server that can also be used as a reverse proxy, load balancer, and HTTP cache.
  • Keep-Alive: A mechanism that keeps connections open for reuse, reducing the overhead of creating new connections for each request.
  • FastCGI: A protocol for interfacing interactive programs with a web server, used to communicate between NGINX and PHP-FPM.
  • HPA (Horizontal Pod Autoscaler): A Kubernetes resource that automatically scales the number of pods in a deployment based on observed CPU utilization or custom metrics.
  • Request (in Kubernetes): The minimum amount of CPU or memory a container is guaranteed to have available.
  • Limit (in Kubernetes): The maximum amount of CPU or memory a container can use.
  • Pod: A group of one or more containers in Kubernetes that share storage and network resources, and a specification for how to run the containers.
  • Service (in Kubernetes): A Kubernetes abstraction that defines a logical set of Pods and a policy to access them, commonly used for load balancing.
  • Deployment (in Kubernetes): A resource that provides declarative updates to Pods and ReplicaSets, ensuring the desired state of applications is maintained.
  • Autoscaling: The process of automatically adjusting the number of active instances (pods) based on load or other performance metrics.
  • Prometheus: An open-source monitoring and alerting toolkit designed for reliability and scalability, often used with Kubernetes.
  • Grafana: An open-source analytics and monitoring platform that integrates with Prometheus to visualize and query metrics.
  • Docker: A platform used to automate the deployment, scaling, and management of applications using containerization technology.
  • Node.js: A JavaScript runtime built on Chrome's V8 JavaScript engine, often used for building scalable network applications.
  • Serverless: A cloud computing execution model where the cloud provider automatically manages the infrastructure, often used for event-driven apps.
  • TLS (Transport Layer Security): A cryptographic protocol designed to provide secure communication over a computer network, commonly used in web servers for HTTPS.
  • TLS Termination: The process of decrypting secure HTTPS traffic before it reaches the backend application, often performed by reverse proxies like NGINX.
  • Rate Limiting: A technique used to limit the number of requests a client can make to a service in a given period, preventing overload.
  • Caching: Storing copies of files or data in temporary storage locations to reduce latency and improve access speed.
  • Elastic Stack (ELK Stack): A suite of tools (Elasticsearch, Logstash, Kibana) used for search, logging, and visualization, commonly used in observability setups.
  • Prometheus Metrics: Time-series data collected by Prometheus from configured targets (like Kubernetes pods or applications) that represent performance metrics.
  • Time-Series Data: Data points indexed by time, commonly used for performance monitoring, such as metrics on CPU usage, memory consumption, etc.
  • APM (Application Performance Monitoring): A suite of tools for monitoring and analyzing the performance of software applications in real time.
  • Node Exporter: A Prometheus exporter that collects hardware and OS metrics from Linux nodes, often used with Prometheus for infrastructure monitoring.
  • Horizontal Scaling: Increasing the number of resources or instances (e.g., pods) to distribute load and enhance system performance.
  • Vertical Scaling: Increasing the resources (CPU or memory) available to a single instance, such as a pod, to handle more load.
  • FPM Pool: A set of PHP processes managed by PHP-FPM that handle incoming requests, with the number of processes configurable for performance.
  • Elasticsearch: A distributed search and analytics engine that is commonly used in observability and logging systems.
  • ReplicaSet: A Kubernetes controller that ensures a specified number of pod replicas are running at any given time, often part of a deployment.
  • Pod Disruption Budget (PDB): A policy in Kubernetes that specifies the minimum number of pods that should remain available during voluntary disruptions like updates or scaling.
  • API Gateway: A server that acts as an API frontend, receiving API requests, routing them to the appropriate service, and aggregating the results.