Postfix Performance Tuning
Purpose of Postfix performance tuning
The hints and tips in this document help you improve the performance of Postfix systems that already work. If your Postfix system is unable to receive or deliver mail, then you need to solve those problems first, using the DEBUG_README document as guidance.
For tuning external content filter performance, first read the respective information in the FILTER_README and SMTPD_PROXY_README documents. Then make sure to avoid latency in the content filter code. As much as possible avoid performing queries against external data sources with a high or highly variable delay. Your content filter will run with a small concurrency to avoid CPU/memory starvation, and if any latency creeps in, content filter throughput will suffer. High volume environments should avoid RBL lookups, complex database queries and so on.
The following tools can be used to measure mail system performance under artificial loads. They are normally not installed with Postfix.
When Postfix responds slowly to SMTP clients:
With Postfix versions 2.0 and earlier, the smtpd(8) server pauses before reporting an error to an SMTP client. The idea is called tar pitting. However, these delays also slow down Postfix. When the smtpd(8) server replies slowly, sessions take more time, so that more smtpd(8) server processes are needed to handle the load. When your Postfix smtpd(8) server process limit is reached, new clients must wait until a server process becomes available. This means that all clients experience poor performance.
You can speed up the handling of smtpd(8) server error replies by turning off the delay:
/usr/local/etc/postfix/main.cf:
    # Not needed with Postfix 2.1
    smtpd_error_sleep_time = 0
With the above setting, Postfix 2.0 and earlier can serve more SMTP clients with the same number of SMTP server processes. The next section describes how Postfix deals with clients that make a large number of errors.
The Postfix smtpd(8) server maintains a per-session error count. The error count is reset when a message is transferred successfully, and is incremented when a client request is unrecognized or unimplemented, when a client request violates access restrictions, or when some other error happens.
As the per-session error count increases, the smtpd(8) server changes behavior and begins to insert delays into the responses. The idea is to slow down a run-away client in order to limit resource usage. The behavior is Postfix version dependent.
IMPORTANT: These delays slow down Postfix, too. When too much delay is configured, the number of simultaneous SMTP sessions will increase until it reaches the smtpd(8) server process limit, and new SMTP clients must wait until an smtpd(8) server process becomes available.
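As a rough sketch, the thresholds and the delay discussed above are set in main.cf. The names smtpd_soft_error_limit and smtpd_hard_error_limit are not mentioned in the text above, and the values shown are assumed to be the Postfix 2.1 defaults; verify them with "postconf -d" on your own system.

```
/usr/local/etc/postfix/main.cf:
    # Assumed defaults; check "postconf -d" before relying on them.
    # Error count at which the server starts to delay its responses.
    smtpd_soft_error_limit = 10
    # Error count at which the server disconnects the client.
    smtpd_hard_error_limit = 20
    # Length of the delay. Larger values tie up smtpd(8) processes longer.
    smtpd_error_sleep_time = 1s
```

Lowering smtpd_hard_error_limit makes the server drop run-away clients sooner, which frees smtpd(8) processes for other clients.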
Postfix version 2.1 and later:
Postfix version 2.0 and earlier:
Note: this feature is not included with Postfix version 2.1.
The Postfix smtpd(8) server can limit the number of simultaneous connections from the same SMTP client, as well as the number of connections that a client is allowed to make per unit time. These statistics are maintained by the anvil(8) server (translation: if anvil(8) breaks, then connection limits stop working).
IMPORTANT: These limits are designed to protect the smtpd(8) server against flagrant abuse. Do not use these limits to regulate legitimate traffic: mail will suffer grotesque delays if you do so.
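A minimal sketch of such limits, for Postfix versions that include this feature. The parameter names are not given in the text above, and the values are illustrative rather than recommendations.

```
/usr/local/etc/postfix/main.cf:
    # Maximum number of simultaneous connections from one SMTP client.
    smtpd_client_connection_count_limit = 50
    # Maximum number of new connections per client per time unit
    # (0 disables the rate limit).
    smtpd_client_connection_rate_limit = 30
    # The time unit over which anvil(8) computes rates.
    anvil_rate_time_unit = 60s
```

Keep these values well above what a legitimate client would ever need, so that only flagrant abuse is affected.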
Although Postfix can be configured to run 1000 SMTP client processes at the same time, it is rarely desirable that it makes 1000 simultaneous connections to the same remote system. For this reason, Postfix has safety mechanisms in place to avoid this so-called "thundering herd" problem.
The Postfix queue manager implements the analog of the TCP slow start flow control strategy: when delivering to a site, send a small number of messages first, then increase the concurrency as long as all goes well; reduce concurrency in the face of congestion.
Examples of transport specific concurrency limits are:
The above default values of the concurrency limits work well in a broad range of situations. Knee-jerk changes to these parameters in the face of congestion can actually make problems worse. Specifically, large destination concurrencies should never be the default. They should be used only for transports that deliver mail to a small number of high volume domains.
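For orientation, the slow start behavior and the concurrency limits might be expressed in main.cf as in the sketch below. The initial_destination_concurrency value is illustrative only, because its default differs between Postfix versions; check "postconf -d" for the values on your system.

```
/usr/local/etc/postfix/main.cf:
    # Number of parallel deliveries to start with for a destination
    # (illustrative value; the default is version dependent).
    initial_destination_concurrency = 2
    # Upper bound for parallel deliveries to the same destination.
    default_destination_concurrency_limit = 20
    # Transport-specific override for the local delivery agent.
    local_destination_concurrency_limit = 2
```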
A common situation where high concurrency is called for is on gateways relaying a high volume of mail between the Internet and an intranet mail environment. Approximately half the mail (assuming equal volumes inbound and outbound) will be destined for the internal mail hubs. Since the internal mail hubs will be receiving all external mail exclusively from the gateway, it is reasonable to configure the gateway to make greater demands on the capacity of the internal SMTP servers.
The tuning of the inbound concurrency limits need not be trial and error. A high volume capable mailhub should be able to easily handle 50 or 100 (rather than the default 20) simultaneous connections, especially if the gateway forwards to multiple MX hosts. When all MX hosts are up and accepting connections in a timely fashion, throughput will be high. If any MX host is down and completely unresponsive, the average connection latency rises to at least 1/N * $smtp_connect_timeout, if there are N MX hosts. This limits throughput to at most the destination concurrency * N / $smtp_connect_timeout.
For example, with a destination concurrency of 100 and 2 MX hosts, each host will handle up to 50 simultaneous connections. If one MX host is down and the default SMTP connection timeout is 30s, the throughput limit is 100 * 2 / 30 ~= 6.7 messages per second. This suggests that high volume destinations with good connectivity and multiple MX hosts need a lower connection timeout. Values as low as 5s or even 1s can be used to prevent congestion when one or more, but not all, MX hosts are down.
If necessary, set a higher transport_destination_concurrency_limit (in main.cf, since this is a queue manager parameter; "transport" stands for the name of the delivery transport) and a lower smtp_connect_timeout (with a "-o" override in master.cf, since this parameter has no per-transport name) for the relay transport and any transports dedicated to specific high volume destinations.
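The advice above might translate into something like the following sketch. It assumes that the internal hubs are reached through the relay transport, that the destination has two MX hosts, and that a 5s timeout suits the network; all values are illustrative. smtp_connect_timeout is the name of the SMTP client connection timeout parameter.

```
/usr/local/etc/postfix/main.cf:
    # Queue manager parameter, so it is set in main.cf.
    relay_destination_concurrency_limit = 100

/usr/local/etc/postfix/master.cf:
    # service type  private unpriv  chroot  wakeup  maxproc command + args
    relay     unix  -       -       -       -       -       smtp
        -o smtp_connect_timeout=5s
```

With these settings and one of the two MX hosts down, the throughput bound from the formula above becomes 100 * 2 / 5 = 40 messages per second, instead of fewer than 7 with the 30s default.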
The default_destination_recipient_limit parameter (default: 50) controls how many recipients a Postfix delivery agent will send with each copy of an email message. You can override this setting for specific Postfix delivery agents. For example, "uucp_destination_recipient_limit = 100" would limit the number of recipients per UUCP delivery to 100.
If an email message exceeds the recipient limit for some destination, the Postfix queue manager breaks up the list of recipients into smaller lists. Postfix will attempt to send multiple copies of the message in parallel.
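As an illustration, the recipient limits could be set as follows. The smtp_destination_recipient_limit value is an arbitrary example of a per-transport override, not a recommendation.

```
/usr/local/etc/postfix/main.cf:
    # Default for all delivery agents.
    default_destination_recipient_limit = 50
    # Override for the smtp delivery agent (illustrative value).
    smtp_destination_recipient_limit = 25
    # Example: with a limit of 50, a message with 120 recipients at one
    # destination is delivered as several copies, each with at most
    # 50 recipients (such as 50 + 50 + 20).
```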
IMPORTANT: Be careful when increasing the recipient limit per message delivery; some smtpd(8) servers abort the connection when they run out of memory or when a hard recipient limit is reached, so that the message will never be delivered.
The smtpd_recipient_limit parameter (default: 1000) controls how many recipients the Postfix smtpd(8) server will take per delivery. The default limit is more than any reasonable SMTP client would send. The limit exists to protect the local mail system against a run-away client.
The scheduling of deferred mail delivery attempts is governed by a handful of configuration parameters.
IMPORTANT: If you increase the frequency of deferred mail delivery attempts, or if you flush the deferred mail queue frequently, then you may find that Postfix mail delivery performance actually becomes worse. The symptoms are as follows:
When mail is being deferred frequently, fixing the problem is always better than increasing the frequency of delivery attempts. However, if you can control only the delivery attempt frequency, consider using a dedicated fallback_relay "graveyard" machine for bad destinations so that they do not ruin the performance of normal mail deliveries.
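The sketch below shows the kind of main.cf settings involved. The parameter names other than fallback_relay are not given in the text above, the values are illustrative only (defaults differ between Postfix versions, so check "postconf -d"), and graveyard.example.com is a hypothetical host name.

```
/usr/local/etc/postfix/main.cf:
    # How often the queue manager scans the deferred queue.
    queue_run_delay = 1000s
    # Shortest and longest time between delivery attempts
    # for a deferred message.
    minimal_backoff_time = 1000s
    maximal_backoff_time = 4000s
    # How long a message may stay queued before it is returned
    # as undeliverable.
    maximal_queue_lifetime = 5d
    # Hypothetical "graveyard" host that takes over mail for bad
    # destinations. Postfix 2.3 and later name this parameter
    # smtp_fallback_relay.
    fallback_relay = [graveyard.example.com]
```

Shortening the backoff times increases the load from repeated delivery attempts, which is why the text above recommends offloading bad destinations rather than retrying them more often.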
The default_process_limit configuration parameter gives direct control over how many daemon processes Postfix will run. As of Postfix 2.0 the default limit is 100 smtp client processes, 100 smtp server processes, and so on. This may overwhelm systems with little memory, as well as networks with low bandwidth.
You can change the global process limit by specifying a non-default default_process_limit in the main.cf file. For example, to run up to 10 smtp client processes, 10 smtp server processes, and so on:
/usr/local/etc/postfix/main.cf:
    default_process_limit = 10
You need to execute "postfix reload" to make the change effective. The limits are enforced by the Postfix master(8) daemon, which does not automatically reread main.cf when it changes.
You can override the process limit for specific Postfix daemons by editing the master.cf file. For example, if you do not wish to receive 100 SMTP messages at the same time, but do not want to change the process limits for local mail deliveries, you could specify:
/usr/local/etc/postfix/master.cf:
    # ====================================================================
    # service type  private unpriv  chroot  wakeup  maxproc command + args
    #               (yes)   (yes)   (yes)   (never) (100)
    # ====================================================================
    . . .
    smtp      inet  n       -       -       -       10      smtpd
    . . .
When Postfix opens too many files or sockets, processes will abort with fatal errors, and the system may log "file table full" errors.