“Redis dies at 200k RPM, Prometheus can’t scrape 50 servers in time, and the business demands real-time dashboards. Sound familiar?”
Friday, 6:00 PM. Grafana shows timeouts while scraping metrics. Redis, used by prometheus_client_php, eats 8 GB of RAM and 100% CPU. Prometheus fails to scrape all 50+ servers within the 15-second window. And Black Friday launches on Monday…
This article is about how we switched from a pull to a push model for PHP monitoring in a highload project, why we chose UDP + Telegraf over the classical approach, and how we now collect metrics from 50+ servers without a single timeout.
Architecture: Pull vs Push for PHP Metrics
Why Prometheus PHP Client Doesn’t Always Work for Highload
A typical scenario: you run a PHP Symfony application and need metrics. The first idea is prometheus_client_php. It’s a great library, but it comes with caveats:
// Classic prometheus_client_php usage
use Prometheus\CollectorRegistry;
use Prometheus\Storage\Redis;

$registry = new CollectorRegistry(new Redis());
$counter = $registry->getOrRegisterCounter('app', 'requests_total', 'Total requests', ['method', 'endpoint']);
$counter->inc(['GET', '/api/users']); // label values in registration order
What happens under the hood:
- Each metric is stored in Redis/APC/in-memory storage
- Prometheus periodically scrapes the /metrics endpoint
- On scrape, all metrics are read from storage
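For reference, a scrape of /metrics renders the stored values in the standard Prometheus exposition format; for the counter above it looks roughly like this:

```
# HELP app_requests_total Total requests
# TYPE app_requests_total counter
app_requests_total{method="GET",endpoint="/api/users"} 1
```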
Where problems begin:
- Scaling: With 50+ servers, Prometheus must scrape each. This becomes a bottleneck.
- Storage: Redis adds latency; APC works only within a single server; in-memory storage is lost on every PHP-FPM restart.
- Configuration: You must set up service discovery for all servers.
- Performance: At 200k RPM, every counter increment is an extra network round trip to Redis.
The Solution: Push Model with UDP for PHP Highload Monitoring
Instead, we send metrics via UDP to Telegraf, which then forwards them to Prometheus, InfluxDB, or others.
Why UDP?
- Fire & forget: No waiting for responses, no timeouts.
- Minimal overhead: A send costs microseconds.
- Fault tolerance: If Telegraf crashes, the app keeps running.
- Simplicity: No connection pools, retries, or circuit breakers.
Important: UDP may lose packets, but losing 0.01% of metrics won’t distort your dashboards.
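The fire-and-forget idea needs nothing more than a socket. Here is a minimal standalone sketch (hypothetical helper functions, not the bundle’s API) that formats a metric as InfluxDB line protocol and fires it at Telegraf:

```php
<?php
// Hypothetical helpers: format a metric as InfluxDB line protocol
// and send it over UDP without waiting for any response.

function formatLine(string $measurement, array $tags, array $fields): string
{
    $tagStr = '';
    foreach ($tags as $k => $v) {
        $tagStr .= ",{$k}={$v}";
    }

    $fieldPairs = [];
    foreach ($fields as $k => $v) {
        // Integers get the "i" suffix required by line protocol
        $fieldPairs[] = is_int($v) ? "{$k}={$v}i" : "{$k}={$v}";
    }

    return $measurement . $tagStr . ' ' . implode(',', $fieldPairs);
}

function sendMetric(string $line, string $host = '127.0.0.1', int $port = 8089): void
{
    // fwrite() on a UDP stream queues the datagram and returns immediately;
    // on any failure we simply drop the metric - the app never blocks.
    $socket = @stream_socket_client("udp://{$host}:{$port}");
    if ($socket !== false) {
        @fwrite($socket, $line);
        fclose($socket);
    }
}

$line = formatLine('my_app_api_request', ['method' => 'GET'], ['response_time' => 12.5, 'count' => 1]);
sendMetric($line);
echo $line; // my_app_api_request,method=GET response_time=12.5,count=1i
```

Even if nothing is listening on the port, the call returns instantly: that is the whole point of the push model.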
TelegrafMetricsBundle: Implementation
All of this is packaged in a Symfony bundle — TelegrafMetricsBundle — for sending metrics over UDP.
Installation
composer require yakovlef/telegraf-metrics-bundle
Config (config/packages/telegraf_metrics.yaml):
telegraf_metrics:
    namespace: 'my_app'
    client:
        url: 'http://localhost:8086'
        udpPort: 8089
Bundle Architecture
Three core components:
// MetricsCollectorInterface - the DI contract
interface MetricsCollectorInterface
{
    public function collect(string $name, array $fields, array $tags = []): void;
}

// Implementation backed by the InfluxDB UDP writer
class MetricsCollector implements MetricsCollectorInterface
{
    private UdpWriter $writer;
    private string $namespace;

    public function __construct(Client $client, string $namespace)
    {
        $this->writer = $client->createUdpWriter();
        $this->namespace = $namespace;
    }

    public function collect(string $name, array $fields, array $tags = []): void
    {
        // Send the metric to Telegraf as an InfluxDB point over UDP
        $this->writer->write(
            new Point("{$this->namespace}_$name", $tags, $fields)
        );
    }
}
DI integration:
services:
    Yakovlef\TelegrafMetricsBundle\Collector\MetricsCollectorInterface: '@telegraf_metrics.collector'
Practical Use Cases
1. API Endpoint Monitoring
class ApiController
{
    public function __construct(
        private UserRepository $userRepository,
        private MetricsCollectorInterface $metrics
    ) {}

    public function getUsers(): JsonResponse
    {
        $startTime = microtime(true);

        try {
            $users = $this->userRepository->findAll();
            $responseTime = (microtime(true) - $startTime) * 1000;

            $this->metrics->collect('api_request', [
                'response_time' => $responseTime,
                'count' => 1
            ], [
                'endpoint' => '/api/users',
                'method' => 'GET',
                'status' => '200'
            ]);

            return new JsonResponse($users);
        } catch (\Exception $e) {
            $this->metrics->collect('api_error', ['count' => 1], [
                'endpoint' => '/api/users',
                'error_type' => get_class($e),
                'status' => '500'
            ]);

            throw $e;
        }
    }
}
2. Business Metrics in E-commerce
class OrderService
{
    public function __construct(
        private EntityManagerInterface $em,
        private PaymentGatewayInterface $paymentGateway,
        private MetricsCollectorInterface $metrics
    ) {}

    public function createOrder(OrderDto $dto): Order
    {
        $order = new Order($dto);
        $this->em->persist($order);
        $this->em->flush();

        $this->metrics->collect('order_created', [
            'amount' => $order->getTotalAmount(),
            'items_count' => $order->getItemsCount(),
            'count' => 1
        ], [
            'payment_method' => $order->getPaymentMethod(),
            'currency' => $order->getCurrency(),
            'user_type' => $order->getUser()->getType()
        ]);

        return $order;
    }

    public function processPayment(Order $order): void
    {
        $startTime = microtime(true);

        try {
            $result = $this->paymentGateway->charge($order);

            $this->metrics->collect('payment_processed', [
                'amount' => $order->getTotalAmount(),
                'processing_time' => (microtime(true) - $startTime) * 1000,
                'count' => 1
            ], [
                'gateway' => $this->paymentGateway->getName(),
                'status' => 'success'
            ]);
        } catch (PaymentException $e) {
            $this->metrics->collect('payment_failed', [
                'amount' => $order->getTotalAmount(),
                'count' => 1
            ], [
                'gateway' => $this->paymentGateway->getName(),
                'error_code' => $e->getCode()
            ]);

            throw $e;
        }
    }
}
3. Background Job Monitoring
#[AsMessageHandler]
class EmailConsumer
{
    public function __construct(
        private MailerInterface $mailer,
        private MetricsCollectorInterface $metrics
    ) {}

    public function __invoke(SendEmailMessage $message): void
    {
        $startTime = microtime(true);

        try {
            $this->mailer->send($message->getEmail());

            $this->metrics->collect('consumer_processed', [
                'processing_time' => (microtime(true) - $startTime) * 1000,
                'count' => 1
            ], [
                'consumer' => 'email',
                'status' => 'success',
                'priority' => $message->getPriority()
            ]);
        } catch (\Exception $e) {
            $this->metrics->collect('consumer_failed', ['count' => 1], [
                'consumer' => 'email',
                'error' => get_class($e)
            ]);

            throw $e;
        }
    }
}
4. Circuit Breaker Pattern
class ExternalApiClient
{
    private int $failures = 0;
    private bool $isOpen = false;

    public function __construct(
        private HttpClientInterface $httpClient,
        private MetricsCollectorInterface $metrics
    ) {}

    public function call(string $endpoint): array
    {
        if ($this->isOpen) {
            $this->metrics->collect('circuit_breaker', ['count' => 1], [
                'service' => 'external_api',
                'state' => 'open',
                'action' => 'rejected'
            ]);

            throw new CircuitBreakerOpenException();
        }

        try {
            $response = $this->httpClient->request('GET', $endpoint);
            $this->failures = 0;

            $this->metrics->collect('circuit_breaker', ['count' => 1], [
                'service' => 'external_api',
                'state' => 'closed',
                'action' => 'success'
            ]);

            return $response->toArray();
        } catch (\Exception $e) {
            $this->failures++;

            if ($this->failures >= 5) {
                $this->isOpen = true;

                $this->metrics->collect('circuit_breaker', ['count' => 1], [
                    'service' => 'external_api',
                    'state' => 'open',
                    'action' => 'opened'
                ]);
            }

            throw $e;
        }
    }
}
Aggregation in Telegraf
Telegraf’s killer feature is built-in aggregation (the basicstats plugin). Instead of raw data flooding Prometheus, aggregation happens directly in Telegraf.

| Metric | Description | Use case |
|--------|-------------|----------|
| count | Number of values per period | Requests, errors, registrations |
| sum | Sum of values | Total revenue, total processing time |
| mean | Arithmetic mean | Avg response time, avg basket size |
| min | Minimum | Min response time, smallest order |
| max | Maximum | Peak load, max response time |
| stdev | Standard deviation | Response time variability |
| s2 | Variance | More sensitive variability metric |
Example telegraf.conf
[[inputs.socket_listener]]
  service_address = "udp://:8089"
  data_format = "influx"

[[aggregators.basicstats]]
  period = "10s"
  drop_original = false
  stats = ["count", "mean", "sum", "min", "max", "stdev"]
  namepass = ["my_app_api_*"]

[[outputs.prometheus_client]]
  listen = ":9273"
  metric_version = 2
  path = "/metrics"
  metric_batch_size = 1000
  metric_buffer_limit = 10000
Pitfalls and How to Avoid Them
UDP Packet Loss — and Why It’s Fine
Problem: Under high load, the kernel may drop UDP packets before Telegraf reads them.
Solution: Monitor Telegraf’s own metrics. If losses become significant, increase the OS UDP receive buffers or batch metrics in the application.
Remember: losing 0.01% of metrics is better than an application outage caused by Redis.
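Telegraf can report on itself. A config sketch using the stock inputs.internal plugin and the socket_listener buffer option (the 8 MiB value is an assumption to tune for your traffic, and on Linux the kernel’s net.core.rmem_max must allow it):

```toml
# Expose Telegraf's own counters (dropped metrics, buffer fullness, etc.)
[[inputs.internal]]
  collect_memstats = false

# Give the UDP listener a larger OS receive buffer
[[inputs.socket_listener]]
  service_address = "udp://:8089"
  data_format = "influx"
  read_buffer_size = "8MiB"
```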
UDP Packet Size: Why Your Metrics Might Not Arrive
Problem: A UDP datagram payload is capped at roughly 64 KB (65,507 bytes), and datagrams larger than the network MTU (typically ~1,500 bytes) get fragmented, which raises the chance of loss. With too many tags, you can hit both limits.
Solution: Limit unique tags and use short names:
// Bad: long tags with high cardinality
$this->metrics->collect('api_request', ['time' => 100], [
'user_email' => $user->getEmail(), // high cardinality
'request_id' => uniqid(), // unique every time
'full_endpoint_path_with_parameters' => $request->getUri()
]);
// Good: short tags with low cardinality
$this->metrics->collect('api_request', ['time' => 100], [
'endpoint' => '/api/users',
'method' => 'GET',
'status' => '200'
]);
Fewer unique tags = smaller packet size = more reliable delivery.
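A cheap way to enforce this is a tag whitelist: anything outside a known set of keys is dropped before the packet is built. A hypothetical helper, not part of the bundle:

```php
<?php
// Hypothetical guard: keep only whitelisted tag keys so a stray
// high-cardinality value never inflates the UDP packet.

function filterTags(array $tags, array $allowedKeys): array
{
    return array_intersect_key($tags, array_flip($allowedKeys));
}

$tags = [
    'endpoint'   => '/api/users',
    'method'     => 'GET',
    'request_id' => uniqid(), // would explode cardinality - dropped below
];

$safe = filterTags($tags, ['endpoint', 'method', 'status']);
// $safe === ['endpoint' => '/api/users', 'method' => 'GET']
```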
Alternative Scenarios
VictoriaMetrics Instead of Prometheus
For high-load systems, Prometheus can become a bottleneck: high memory consumption, long queries with large data volumes, and no clustering mode “out of the box.”
VictoriaMetrics is fully compatible with the Prometheus protocol but:
- is more efficient in storage,
- handles long queries faster,
- supports horizontal scaling.
That makes it a more reliable choice for systems with hundreds of thousands of metrics per second.
Sending Metrics to Multiple Systems Simultaneously
[[outputs.prometheus_client]]
  listen = ":9273"

[[outputs.influxdb_v2]]
  urls = ["http://influxdb:8086"]

[[outputs.graphite]]
  servers = ["graphite:2003"]
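Outputs can also be filtered so each backend receives only what it needs, using Telegraf’s standard namepass metric filtering (the metric name patterns below are examples):

```toml
# Everything goes to Prometheus...
[[outputs.prometheus_client]]
  listen = ":9273"

# ...but InfluxDB receives only business metrics
[[outputs.influxdb_v2]]
  urls = ["http://influxdb:8086"]
  namepass = ["my_app_order_*", "my_app_payment_*"]
```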
Roadmap and Current Limitations
Already works:
- Production-ready
- Symfony 6.4+ and 7.0+
- Prometheus / VictoriaMetrics supported
- Zero-overhead delivery
Note: there is no test suite yet, but the bundle has been running stably in multiple highload projects for over a year.
Final Thoughts
Switching to the push model with UDP + Telegraf gave us three key wins:
Performance as a competitive advantage
Latency reduced 60× (from 3ms to 0.05ms). At 200k RPM, that saves 10 minutes of CPU time per hour, allowing 15% more requests on the same hardware.
Scaling without headaches
Linear scaling — adding new servers now takes 30 seconds. Just deploy with the same UDP endpoint. No Prometheus changes, no service discovery.
System antifragility
Complete isolation of failures — the metrics system can collapse entirely, and the app continues running. Over the years, this saved us multiple times during monitoring infrastructure outages.
Metrics in PHP are not a luxury but a necessity to understand what’s happening in production. The Telegraf UDP approach allowed us to forget about scaling problems and focus on what really matters — business logic and user experience.
Yes, we sacrificed guaranteed delivery of every packet. But in return, we got a system that withstands any load and never becomes a single point of failure — especially at critical peak moments.
Bundle available on GitHub and Packagist.
P.S. If this saved you time reinventing the wheel — star the repo. Found a bug? Open an issue, and we’ll fix it.