Monday 1 August 2011

Unix Made Easy: Tutorial for Squid


Tutorial for Squid


Introduction

    Squid is a high-performance proxy caching server for web clients, supporting FTP, gopher, and HTTP data objects. Unlike traditional caching software, Squid handles all requests in a single, non-blocking, I/O-driven process.
    Squid keeps metadata and especially hot objects cached in RAM, caches DNS lookups, supports non-blocking DNS lookups, and implements negative caching of failed requests. It supports SSL, extensive access controls, and full request logging. By using the lightweight Internet Cache Protocol, Squid caches can be arranged in a hierarchy or mesh for additional bandwidth savings.
    Squid consists of a main server program squid, a Domain Name System lookup program dnsserver, some optional programs for rewriting requests and performing authentication, and some management and client tools. When squid starts up, it spawns a configurable number of dnsserver processes, each of which can perform a single, blocking Domain Name System (DNS) lookup. This reduces the amount of time the cache waits for DNS lookups.
    This web caching software works on a variety of platforms including Linux, FreeBSD, and Windows. Squid was created by Duane Wessels.



Operating Systems Supported by Squid 

  • Linux
  • FreeBSD
  • NetBSD
  • OpenBSD
  • BSDI
  • Mac OS/X
  • OSF/Digital Unix/Tru64
  • IRIX
  • SunOS/Solaris
  • NeXTStep
  • SCO Unix
  • AIX
  • HP-UX
  • OS/2
  • Cygwin
Installing Squid
Downloading Squid 

Squid can be downloaded as a source archive in gzipped tarball form (e.g. squid-*-src.tar.gz) from http://www.squid-cache.org/ or from ftp://www.squid-cache.org/pub 

Squid can also be downloaded as a binary from http://www.squid-cache.org/binaries.html

Installing Squid from Source

1. Extract the source
    tar xzf squid-*-src.tar.gz


2. Change the current directory to squid-*
    cd squid-*


3. Compile and install Squid
    ./configure
    make
    make install


Note:
By default, Squid gets installed in "/usr/local/squid".
To see the compile-time options available in Squid:
./configure --help


Creating Squid Swap Directories

The Squid swap directories can be created with the following command

#/usr/local/squid/sbin/squid -z 

Starting, Stopping & Restarting Squid

Start Squid
#/usr/local/squid/sbin/squid

Stop Squid
#/usr/local/squid/sbin/squid -k shutdown

Restart Squid
#/usr/local/squid/sbin/squid -k shutdown
#/usr/local/squid/sbin/squid

Options Available
-k reconfigure|rotate|shutdown|interrupt|kill|debug|check|parse
                 Parse configuration file, then send signal to
                 running copy (except -k parse) and exit.
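For example, after editing squid.conf the running Squid can be told to re-read it without a full restart; a typical sequence (paths as installed above) is:

# Check the configuration file for syntax errors
/usr/local/squid/sbin/squid -k parse

# Tell the running Squid to re-read squid.conf
/usr/local/squid/sbin/squid -k reconfigure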


Running Squid in the Foreground

Squid runs as a daemon (background process) by default. To keep it in the foreground instead, which is useful for debugging or for running it under a process supervisor, start it with the -N option:

#/usr/local/squid/sbin/squid -N

Starting Squid in Debugging Mode

Squid can be started in debugging mode by running it as shown below:

#/usr/local/squid/sbin/squid -NCd1

which keeps Squid in the foreground and prints debugging output to the terminal.
If the test is successful it will print "Ready to serve requests."

Check Squid Status

To check whether Squid is running, the following command can be used.

#/usr/local/squid/sbin/squid -k check
 
Basic Configuration
 
Squid Listening to a Particular Port
 
The http_port option specifies the port number on which Squid listens for HTTP client requests. If this option is set to port 80, the client has the illusion of being connected to the actual web server. Squid listens on port 3128 by default.
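For example, to move Squid to another port, change http_port in squid.conf (8080 here is just an illustrative choice):

# squid.conf
http_port 8080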
 
Different modes of Squid Configuration
 
Squid can be configured in three different modes: direct proxy, transparent proxy and reverse proxy. 
 
Direct Proxy Cache

A direct proxy cache is used to cache static web content (HTML and images) on a Squid machine. When a page is requested a second time, the data is returned from the proxy instead of the origin web server. The browser is explicitly configured to direct all HTTP requests to the proxy cache, rather than the target web server. The cache then either satisfies the request itself or passes the request on to the target server.

Configuring as Direct Proxy
By default, Squid is configured in proxy mode. In order to cache web traffic and use the Squid system as a proxy, you have to configure your browser, which needs at least two pieces of information: 
  • the proxy server's host name 
  • the port on which the proxy server accepts requests 

Transparent Cache

Transparent cache achieves the same goal as a standard proxy cache, but operates transparently to the browser. The browser does not need to be explicitly configured to access the cache. Instead, the transparent cache intercepts network traffic, filters HTTP traffic (on port 80) and handles the request if the object is in the cache. If the object is not in the cache, the packets are forwarded to the origin web server.


Configuring as Transparent Proxy

Using Squid transparently is a two-part process: first Squid must be configured to accept non-proxy requests (done in squid.conf), and second web traffic must be redirected to the Squid port (achieved in one of three ways: policy-based routing, smart switching, or setting the Squid box up as a gateway).
 
Getting transparent caching to work requires the following steps
 
On some operating systems, you have to configure and build a version of Squid that can recognize the hijacked connections and discern the destination addresses. On Linux this works automatically. On BSD-based systems, you probably have to configure Squid with the --enable-ipf-transparent option. You then have to configure squid.conf as
 
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on

 
You have to configure your cache host to accept the redirected packets (any IP address, on port 80) and deliver them to your cache application. This is typically done with the IP filtering/forwarding features built into the kernel. On Linux this is iptables/netfilter (kernel 2.4.x), ipchains (2.2.x) or ipfwadm (2.0.x). On FreeBSD and other BSD systems it is called ipfilter or ipnat; on many systems, it may require rebuilding the kernel or adding a new loadable kernel module.
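As a sketch, on a Linux 2.4.x kernel the redirection can be done with a single iptables REDIRECT rule on the cache host (the interface name eth0 is an assumption for your LAN-facing interface):

# Redirect port-80 traffic arriving from clients to Squid's port 3128
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3128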
 

Reverse Proxy Cache

A reverse proxy cache differs from direct and transparent caches, in that it reduces load on the origin web server, rather than reducing upstream network bandwidth on the client side. Reverse Proxy Caches offload client requests for static content from the web server, preventing unforeseen traffic surges from overloading the origin server. The proxy server sits between the Internet and the Web site and handles all traffic before it can reach the Web server. A reverse proxy server intercepts requests to the Web server and instead responds to the request out of a store of cached pages. This method improves the performance by reducing the amount of pages actually created "fresh" by the Web server.

 
Configuring as Reverse Proxy 
 
To set Squid up to run as an accelerator, you probably want it to listen on port 80, and you have to define the machine you are accelerating for. This is done in squid.conf:
http_port 80
httpd_accel_host visolve.com
httpd_accel_port 81
httpd_accel_single_host on
httpd_accel_with_proxy on

If you are using Squid as an accelerator for a virtual host system, then instead of a 'hostname' here you have to use the word virtual as:
 
http_port 80
httpd_accel_host virtual
httpd_accel_port 81
httpd_accel_with_proxy on
 
Different Methods of Intercepting HTTP Traffic 
 
The methods can be found in detail at the following link.
 
http://www.visolve.com/squid/whitepapers/trans_caching.php   



WCCP configuration

Does Squid support WCCP?
 
Yes, Squid supports WCCP. Routers that support WCCP can be configured to direct traffic to one or more web caches using an efficient load-balancing mechanism. WCCP also provides for automatic bypassing of an unavailable cache in the event of a failure. 
 
Configuring Squid for WCCP Support
 
Patches to be applied to the Linux kernel:

The Linux kernel on the Squid machine should be patched with ip_wccp, as ip_gre is somewhat broken. Recompile the kernel with ip_gre and ip_wccp enabled.

Now install Squid from source and configure squid.conf to point to the WCCP router.

Squid machine configuration:
Add the following iptables rule to redirect all HTTP traffic to Squid's port 3128.
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 3128

Cache Inside the Router's Network

If the cache is inside the router's network, packets coming from the caches must be prevented from being redirected back to the caches again. So the following firewall rule has to be inserted first on the router machine.

iptables -t mangle -I PREROUTING 1 -p tcp --dport 80 -s <ip-squid> -j ACCEPT

 
SNMP Configuration
 
Enabling SNMP support to Squid
 
To use SNMP with Squid, it must be enabled with the configure script and Squid rebuilt. To enable SNMP, go to the Squid source directory and follow the steps given below:

./configure --enable-snmp [ ... other configure options ]

make all
make install

Then edit the following tags in the squid.conf file:

acl aclname snmp_community public
snmp_access aclname

Once you have configured Squid and the SNMP server, start both. 
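To verify that the SNMP agent is answering, you can walk Squid's MIB with net-snmp's snmpwalk (Squid listens for SNMP on port 3401 by default, and 1.3.6.1.4.1.3495 is Squid's enterprise OID; the community string 'public' matches the acl above):

snmpwalk -v 1 -c public localhost:3401 .1.3.6.1.4.1.3495.1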

 
Why should I use SNMP?
 
SNMP in Squid is useful for a longer-term overview of how the proxy is doing, and it can also be used as a problem solver. For example: how is your file descriptor usage going? Or how much does your LRU vary over a day? This information cannot easily be monitored otherwise.
 
Monitoring Squid 
 
There are a number of tools for monitoring Squid via SNMP, of which MRTG is the most widely used. The Multi Router Traffic Grapher (MRTG) monitors Squid information and generates a real-time graphical status view by sampling data every five minutes (the interval may vary according to your needs). MRTG shows activity over the last 24 hours as well as weekly, monthly and yearly graphs. 
 
Parameters Monitored
 
Squid runtime information like CPU usage, Memory usage, Cache Hit, Miss etc., can be monitored using SNMP. 
 
Delay Pools Configuration


Limiting Bandwidth
 
Delay Classes are generally used in places where bandwidth is expensive. They let you slow down access to specific sites (so that other downloads can happen at a reasonable rate), and they allow you to stop a small number of users from using all your bandwidth (at the expense of those just trying to use the Internet for work).
To ensure that some bandwidth is available for work-related downloads, you can use delay-pools. By classifying downloads into segments, and then allocating these segments a certain amount of bandwidth (in kilobytes per second), your link can remain uncongested for useful traffic.
To use delay pools you must have compiled Squid with the --enable-delay-pools option when running the configure program.

An acl-operator (delay_access) is used to split requests into pools. Since we are using acls, you can split up requests by source address, destination url or more.

 
Configuring Squid with Delay Pools
 
To enable the delay pools option, compile Squid with --enable-delay-pools.

Example:
acl tech src 192.168.0.1-192.168.0.20/32
acl no_hotmail url_regex -i hotmail
acl all src 0.0.0.0/0.0.0.0
delay_pools 1 #Number of delay_pool 1
delay_class 1 1 #pool 1 is a delay_class 1
delay_parameters 1 100/100
delay_access 1 allow no_hotmail !tech

In the above example, Hotmail users are limited to the speed specified by the delay pool. IPs in the tech ACL get normal bandwidth. You can see bandwidth usage through cachemgr.cgi.   
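For reference, each delay_parameters pair is restore-rate/maximum in bytes, so 100/100 above refills the shared bucket at 100 bytes per second. A hypothetical class-2 pool (illustrative numbers) adds a per-client limit on top of the aggregate:

# Class 2: one aggregate bucket plus one bucket per client IP
delay_pools 1
delay_class 1 2
# aggregate 64 KB/s, and 16 KB/s for each individual client
delay_parameters 1 65536/65536 16384/16384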



Caching

Can Squid cache FTP content?
 
Squid is an HTTP proxy with FTP support, not a real FTP proxy. It can download from FTP, and it can also upload to some FTP servers, but it cannot delete or rename files on remote FTP servers; even with ports 20 and 21 blocked, this does not change. It speaks FTP on the server side, but not on the client side.

Can Squid cache dynamic pages?
 
Squid does not cache dynamically generated pages; it caches only static pages.
 
Deleting Objects from the Cache
Objects can be deleted from the cache using the "purge" method.

Squid does not allow you to purge objects unless it is configured with access controls in squid.conf. First you must edit the following tag in squid.conf as

acl PURGE method PURGE
acl localhost src 127.0.0.1
http_access allow PURGE localhost
http_access deny PURGE

The above allows purge requests which come from the local host and denies all other purge requests.

/usr/local/squid/bin/client -m PURGE <URL>

 
Specifying Cache Size 
Cache size is specified using the cache_dir directive in squid.conf:
 
cache_dir ufs /usr/local/squid/cache 100 16 256 

 
Here ufs is the Squid storage scheme, /usr/local/squid/cache is the default cache directory, 100 is the cache size in MB, and 16 and 256 are the numbers of first- and second-level subdirectories in the cache directory. 
 
Squid Swap Formats
 
The storage schemes available for the Squid swap are:

ufs, aufs, diskd and coss

Authentication 
 
Configuring Squid for authenticating users
 
Squid allows you to configure user authentication using the auth_param directive. This is used to define parameters for the various authentication schemes supported by Squid.
 
Proxy authentication in transparent mode
 
       Authentication cannot be used in a transparently intercepting proxy, as the client then thinks it is talking to an origin server and not the proxy. This is a limitation of bending the TCP/IP protocol to transparently intercept port 80, not a limitation in Squid.
 
Authentication schemes available for squid
 
The Squid source code comes with a few authentication helpers for Basic authentication. These include:
  • LDAP: uses the Lightweight Directory Access Protocol.
  • NCSA: uses an NCSA-style username and password file.
  • MSNT: uses a Windows NT authentication domain.
  • PAM: uses the Linux Pluggable Authentication Modules scheme.
  • SMB: uses an SMB server such as Windows NT or Samba.
  • getpwnam: uses the old-fashioned Unix password file.
  • sasl: uses SASL libraries.
  • winbind: uses Samba's winbind to authenticate against a Windows NT domain.

In addition Squid also supports the NTLM and Digest authentication schemes which both provide more secure authentication methods where the password is not exchanged in plain text.

 
Configuring squid for LDAP authentication

Compile Squid with LDAP support:
./configure --enable-basic-auth-helpers="LDAP"

Then edit the following in the squid.conf file.

For Example
auth_param basic program /usr/local/squid/libexec/squid_ldap_auth -b dc=visolve,dc=com -f uid=%s -h visolve.com
acl password proxy_auth REQUIRED
http_access allow password
http_access deny all
  
 
Check Squid Working with LDAP Auth
 
To check whether the Squid machine can communicate with the LDAP server, use the following on the command line:

Example:
# /usr/local/squid/libexec/squid_ldap_auth -b dc=visolve,dc=com -f uid=%s -h visolve.com

The helper then waits for input: enter the username, a space, and the password. If it is able to authenticate against the LDAP server it returns "OK".

 
 LDAP Group Authentication
 
Compile Squid with LDAP support:
./configure --enable-basic-auth-helpers="LDAP" --enable-external-acl-helpers=ldap_group

In the configuration file (squid.conf):

external_acl_type group_auth %LOGIN /usr/local/squid/libexec/squid_ldap_group -b "dc=visolve,dc=com" -f " (&(objectclass=groupOfUniqueNames)(cn=%a)(uniqueMember=uid=%v,cn=accounts,dc=visolve,dc=com))" -h visolve.com

acl gsrc external group_auth accounts
http_access allow gsrc

 
Configuring Squid for NCSA
 
NCSA Authentication

This is the easiest to implement and probably the preferred choice for many environments. This type of authentication uses an Apache-style htpasswd file, which is checked whenever anyone logs in. This is the best-supported option, and a web-based password-changing program is provided to make it easy for users to maintain their own passwords.

To turn on NCSA authentication, edit some directives in squid.conf

authenticate_program /usr/local/squid/bin/ncsa_auth /usr/local/squid/etc/passwd

This tells Squid where to find the authenticator. Next we have to create an ACL.

Acl configuration for ncsa_auth :

acl auth_users proxy_auth REQUIRED
http_access allow auth_users
http_access deny all
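The htpasswd file referenced above can be created with Apache's htpasswd utility (the path and usernames here are examples):

# Create the password file with a first user (-c creates the file)
htpasswd -c /usr/local/squid/etc/passwd alice
# Add more users without -c, which would overwrite the file
htpasswd /usr/local/squid/etc/passwd bob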

Configuring Squid for SMB

SMB Auth Module :

smb_auth is a proxy authentication module. With smb_auth we can authenticate proxy users against an SMB server like Windows NT or Samba.

Adding smb_auth in Squid.conf :

Squid Configuration :

To turn on SMB authentication, edit some directives in squid.conf.

authenticate_program /usr/local/squid/bin/smb_auth -W domain -S /share/path/to/proxyauth

This tells Squid where to find the authenticator. Next we have to create an ACL .

Acl configuration for smb_auth :

acl domainusers proxy_auth REQUIRED
http_access allow domainusers
http_access deny all

 
Configuring squid for MSNT

MSNT Auth Module :

MSNT is a Squid web proxy authentication module. It allows a Unix web proxy to authenticate users with their Windows NT domain credentials.

Adding msnt_auth in Squid.conf :

Squid Configuration :

To turn on MSNT authentication, edit some directives in squid.conf

auth_param basic program /usr/local/squid/libexec/msnt_auth
auth_param basic children 5
auth_param basic realm Squid proxy-caching web server
auth_param basic credentialsttl 2 hours

This tells Squid where to find the authenticator. Next we have to create an ACL

Acl configuration for msnt_auth :

acl auth_users proxy_auth REQUIRED
http_access allow auth_users
http_access deny all

Configure squid for PAM
PAM Auth Module :

This program authenticates users against a PAM configured authentication service "squid". This allows us to authenticate Squid users to any authentication source for which we have a PAM module.

Adding pam_auth in Squid.conf

Squid Configuration

To turn on PAM authentication, edit some directives in squid.conf.

authenticate_program /usr/local/squid/bin/pam_auth

 This tells Squid where to find the authenticator. Next we have to create an ACL .

Acl configuration for pam_auth :

 acl auth_users proxy_auth REQUIRED
 http_access allow auth_users
 http_access deny all

 
Configure squid for NTLM 

NTLM authentication is a challenge-response authentication type. NTLM is a bit different and does not obey the standard rules of HTTP connection management. The authentication is a three-step (five-message) handshake per TCP connection, not per request.

1a. Client sends unauthenticated request to the proxy / server.

1b. Proxy / server responds with "Authentication required" of type NTLM.

2a. The client responds with a request for NTLM negotiation

2b. The server responds with a NTLM challenge

3a. The client responds with a NTLM response

3b. If successful, the connection is authenticated for this request and onwards. No further authentication exchanges take place on THIS TCP connection. 

Adding ntlm_auth and passwd file in Squid.conf

Squid Configuration:

To turn on NTLM authentication, edit some directives in squid.conf.

auth_param ntlm program /usr/local/squid/libexec/ntlm_auth (domainname)/(pdc name)
auth_param ntlm children 5
auth_param ntlm max_challenge_reuses 0
auth_param ntlm max_challenge_lifetime 2 minutes

This tells Squid where to find the authenticator. Next we have to create an ACL.

Acl configuration for ntlm_auth :

acl auth_users proxy_auth REQUIRED
http_access allow auth_users
http_access deny all

Filtering

Filtering a website
 
Websites can be filtered using ACLs (Access Control Lists). Here is an example of denying a group of IP addresses access to a specific domain.

acl block_ips src <ipaddr1-ipaddr2>
acl block_domain dstdomain <domainname>

http_access deny block_ips block_domain
http_access allow all
  
 
Denying a User Access to a Particular Site
 
A user can be denied access to a particular site using the 'dstdomain' ACL type.

For example:

acl sites dstdomain .gap.com .realplayer.com .yahoo.com

http_access deny sites 

 
Filtering a Particular Port
 
Filtering a particular port can be done with an ACL as follows.

acl block_port port 3456
http_access deny block_port
http_access allow all

 
Denying or Allowing Users by Time
 
Access to websites can be denied during particular hours.

To restrict a client source IP from accessing a particular domain during 9am-5pm on Mondays:

acl names src <ipaddr>
acl site dstdomain <domainname>
acl acltime time M 9:00-17:00

http_access deny names site acltime
http_access allow all
  
 
What can't Squid filter?
 
Squid cannot filter viruses or web pages based on content.
 
Filtering a Particular MAC address

To use ARP (MAC) access controls, you first need to compile in the optional code. Do this with the --enable-arp-acl configure option.

Example:

acl M1 arp 01:02:03:04:05:06
acl M2 arp 11:12:13:14:15:16
http_access allow M1
http_access allow M2
http_access deny all
Performance
 
Monitoring Squid Performance
 
Squid performance can be monitored using the cache manager and SNMP.
Cache Manager:
This provides access to certain information needed by the cache administrator. A companion program, cachemgr.cgi, can be used to make this information available via a web browser. Cache manager requests to Squid are made with a special URL of the form 
 
        cache_object://hostname/operation
 
The cache manager provides essentially "read-only" access to information. It does not provide a method for configuring Squid while it is running. 
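For example, the general runtime statistics page can be fetched from the command line with the bundled client tool (localhost assumed):

/usr/local/squid/bin/client cache_object://localhost/info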
 
SNMP: 
 
SNMP could be used for monitoring squid runtime information like CPU usage, Memory usage, Cache Hit, Miss etc. The Multi Router Traffic Grapher (MRTG) is a tool to monitor squid information which generates a real-time status (graphical representation), in dynamic view by sampling data every five minutes.
 
Improving Squid Performance
 
Squid performance can be improved by gathering performance data for the particular environment and tuning hardware and kernel parameters for peak performance.
 
Does the cache directory filesystem impact performance?
 
The cache_dir directive defaults to the ufs storage scheme. When it is changed to
 
cache_dir aufs /usr/local/squid/cache 100 16 256
 
the aufs storage scheme improves Squid's disk I/O response time by using a number of threads for disk I/O operations. The aufs code requires a pthreads library, the standard threads interface defined by POSIX. To use aufs, Squid must be compiled with the --enable-storeio option.
 
Note:
 
If disk caching is not used, it can be disabled by setting 'cache_dir null /tmp'.
This eliminates the memory Squid would otherwise use for the cache index metadata.
 
 Log files
 
Log files produced by Squid
 
Squid produces the following log files:
 
squid.out, cache.log, useragent.log, store.log, hierarchy.log and access.log.
 
Monitoring User Access
 
The access information gets stored in the access.log file.
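As a quick sketch of summarizing user access, the client address is the third field of the native access.log format, so per-client request counts can be extracted with awk (the sample log lines below are made up for illustration):

```shell
#!/bin/sh
# Fabricated sample in the native access.log field order:
# time elapsed client action/code size method URL ident hierarchy type
cat > /tmp/sample_access.log <<'EOF'
968.404 110 192.168.0.5 TCP_MISS/200 1024 GET http://example.com/a - DIRECT/10.0.0.1 text/html
969.101 90 192.168.0.5 TCP_HIT/200 512 GET http://example.com/b - NONE/- text/html
970.222 300 192.168.0.9 TCP_MISS/200 2048 GET http://example.org/c - DIRECT/10.0.0.2 text/html
EOF

# Count requests per client IP, busiest client first
awk '{ count[$3]++ } END { for (ip in count) print count[ip], ip }' \
    /tmp/sample_access.log | sort -rn
```

Pointed at the real /usr/local/squid/var/logs/access.log instead of the sample file, this gives a rough per-client request count.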
 
Rotating Logs
Large log files can be handled by rotating them. This can be done with the following command:

squid -k rotate

To specify the number of log file rotations made when you run 'squid -k rotate', set the logfile_rotate directive in squid.conf.

This can be scheduled with a cron entry that rotates the logs at midnight:

0 0 * * * /usr/local/squid/bin/squid -k rotate

 
 Can Squid support logs larger than 2 GB?
 
Squid by default does not support log files larger than 2 GB. To support files larger than 2 GB, compile Squid with the --with-large-files option.
 
Disabling Squid Log Files
 
Individual log files can be disabled as follows.
To disable access.log
        cache_access_log none
To disable store.log
        cache_store_log none
To disable cache.log
        cache_log /dev/null 
Tools
 
Cache Manager (cachemgr.cgi)
  
The cache manager (cachemgr.cgi) is a CGI utility for displaying statistics about the squid process as it runs. The cache manager is a convenient way to manage the cache and view statistics without logging into the server. 


Tools For Configuring Squid
 
There are many tools available for configuring Squid, such as Webmin.

You can get these tools from

http://www.squid-cache.org/related-software.html
 
  
Log Analysers

Calamaris 

Calamaris is a commonly used tool for analyzing Squid's access.log. It supports many features: status reports on incoming UDP and TCP requests, in total as well as per host; reports on requested second-level and top-level domains; and reports on requested content types, file extensions and protocols. It generates ASCII or HTML reports. For a full list of features, please visit the Calamaris home page. 
  
Weblog 
  
  WebLog is a group of Python modules containing several class definitions that are useful for parsing and manipulating common Web and Web proxy logfile formats. 
  
The Webalizer 
  
The Webalizer is a fast, free web server log file analysis program. It is written in C to be extremely fast and highly portable. The results are presented in both columnar and graphical format. Yearly, monthly, daily and hourly usage statistics are presented, along with the ability to display usage by site, URL, referrer, user agent, search string, entry/exit page, username and country. Processed data may also be exported into most database and spreadsheet programs that support tab-delimited data formats. In addition, wu-ftpd xferlog-formatted logs and Squid proxy logs are supported. 
  
SARG 

   Sarg is a Squid Analysis Report Generator that allows you to see "where" your users are going on the Internet. Sarg generates reports in HTML with many fields, such as users, IP addresses, bytes, sites and times. 

  
Tools to generate user web access report
Webmin is a web-based tool that can also generate web access reports. Using any browser that supports tables and forms (and Java for the File Manager module), you can set up user accounts, Apache, DNS, file sharing and so on. 
  
Webmin consists of a simple web server and a number of CGI programs which directly update system files like /etc/inetd.conf and /etc/passwd. The web server and all CGI programs are written in Perl version 5 and use no non-standard Perl modules.   
 

Miscellaneous


Controlling Uploads
  
Uploads can be controlled using a req_header ACL. The example below denies requests whose Content-Length header has seven or more digits (roughly 1 MB and larger).
  
acl upload_control req_header Content-Length [1-9][0-9][0-9][0-9][0-9]{3,}
http_access deny upload_control
http_access allow all
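The regular expression in the ACL above matches any Content-Length value of seven or more digits, i.e. roughly 1 MB and up. This can be sanity-checked with grep (anchors added here just for the demonstration):

```shell
#!/bin/sh
regex='^[1-9][0-9][0-9][0-9][0-9]{3,}$'

# 1000000 (7 digits, ~1 MB) matches the pattern
echo 1000000 | grep -Eq "$regex" && echo "1000000 matches"

# 999999 (6 digits) does not
echo 999999 | grep -Eq "$regex" || echo "999999 does not match"
```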

Controlling Downloads
  
Downloads can be limited in size using the following directive.

reply_body_max_size     bytes allow|deny acl
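As a sketch, using the syntax above to cap all downloads at 10 MB (10485760 bytes; the limit is an arbitrary example):

acl all src 0.0.0.0/0.0.0.0
reply_body_max_size 10485760 allow all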


Saturday 30 July 2011

Unix Made Easy: DomainKeys, DKIM and SPF with Postfix


DomainKeys, DKIM and SPF with Postfix

SPAM and phishing have been a growing problem for a long time, and more recently the battle to stamp them out has been getting more aggressive, resulting in a lot of legitimate mail being discarded as SPAM/phishing.
For less technical users and all but the best system administrators, it is often near impossible to jump through all the hurdles to ensure mail always gets where it's intended. In many cases misconfigurations, or at least sub-optimal configurations (e.g. reverse DNS mismatches), play a part. Adding to the problem is that many anti-SPAM mechanisms discard (or quarantine and then expire) mail after it has been accepted by the server, thus defeating the design of SMTP that no mail should be lost (it should either be delivered or bounced back to the sender). In many cases the mail losses go unnoticed, or are just accepted as normal, and people have to re-send mail that does not get through.
Almost all SPAM is forged, often using legitimate addresses or domains as the fake source addresses. DomainKeys (originally proposed by Yahoo!) provides a means of verifying that the mail has in fact come from where it claims, which all sounds good. If widely implemented this could largely stamp out many phishing mails and much more.
Additionally, SPF (Sender Policy Framework) can be added to verify the source of the email is legitimate.
These all add credibility to your mail and reduce the risks of having your domain blacklisted or your mail silently discarded by others systems.
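As a sketch, an SPF policy is published as a DNS TXT record on the sending domain. The record below (for a hypothetical example.com) authorises the domain's MX hosts to send its mail and tells receivers to fail everything else:

; BIND zone file syntax; example.com is a placeholder
example.com.  IN  TXT  "v=spf1 mx -all"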
There is however plenty to consider....
DomainKeys Warts and all
The first snag we hit is that there are in fact two standards which are confusingly named: DomainKeys (DK) and DomainKeys Identified Mail (DKIM). Although plain DK is historic and should not be used for verification (a point missed by many), there are still systems out there (including, reportedly, some major web mail providers) that verify against the legacy standard. That wouldn't be so bad were it not that the same DNS records are used by both standards, so if you don't sign outgoing mail with the legacy standard, your mail looks forged to systems that verify against it. This means that (for now anyway) we have to support both standards on sending to ensure that mail gets through.
Secondly, almost all the information I have found on setting up DK/DKIM seems to say how to configure things in testing mode (where mail gets treated the same as unsigned mail) and then stops there. Even many of the world's leading tech companies are running their DK/DKIM in testing mode, and I'm sure the ones reading this are thinking that it can't be them! That's fine if they are testing, but few seem to be brave enough to bite the bullet and switch off test mode. This effectively means that although they have DK/DKIM, they are requesting that everyone ignores the DK/DKIM signatures and treats their mail as unsigned.
And finally, it's not without its vulnerabilities. The public key (for verification) is distributed by DNS in a TXT record. This is back to the classic crypto key exchange problem. If your DNS can be compromised or just faked to the receiving server or upstream DNS caches, then anyone can pretend to be sending mail from you, or even DoS your mail (e.g. corrupt your public key so verification always fails and your mail gets discarded). This DoS could in fact be used even if the sender doesn't support DK/DKIM, as fake DNS records would tell everyone that they do. Nothing to be alarmed by - it is still possible to disrupt mail flow without DK/DKIM if DNS gets compromised.
Not really completely bullet proof, but provides more integrity than no verification at all. Personally I think the problem is that SMTP was designed at a time where there was no reason to worry about what would come through email - times have changed and the underlying SMTP protocol isn't hardened against the abuse that happens now. All the add-ons will only have impact if they are widely deployed, but how many admins out there even have a clue they exist? So long as basic SMTP is alive and kicking the problems will continue. The only certain way to stop it is to replace SMTP with a modern protocol where verification is mandatory at every stage, but that's not going to happen any time soon.
DK/DKIM only validates part of the mail and to get the full benefit of authenticating mail all the way really needs to be combined with other technologies like SPF (see later) and ADSP (see later).
Should you use it?
There are a number of aspects to DK/DKIM and SPF that undermine their value:
  • Almost everyone operates DKIM in test mode, effectively requesting that peers treat mail as unsigned
  • There is no formal requirement for setting a domain policy, so mail can easily be forged
  • DKIM ADSP (previously known as ASP) provides the beginnings of a policy mechanism for DKIM, but at this time is not formalised, and recommendations include running it in a mode where it is acceptable not to sign messages. This again defeats the effectiveness of the system.
  • DKIM doesn't authenticate the envelope, but rather selected aspects of the mail. This means that if those aspects are replicated exactly, other aspects of the mail (including the envelope, and hence the recipient) may be changed.
  • Neither DK/DKIM nor SPF is widely used beyond a few major mail providers. SPF does seem to be more widely deployed in organisations that are heavily phished, but the many small providers, corporates etc. aren't paying any attention, and I doubt many even know these technologies exist.
  • Neither protects against account break-ins (eg. via a trojaned machine) and other mail that would appear to authenticate properly.
  • If misconfigured (on either the sending or receiving side), it could make a real disaster of your mail
The big thing in favour of DKIM and SPF is that they add credibility to mail from your domain. If the mail checks out then odds are it's legit, and by making it more difficult for spammers and fraudsters to use your domain, you reduce the chance of it being blacklisted.
If you are running a heavily phished domain then they can also be useful in discouraging abuse of it. That said, I did a quick check of a few high street banks which I see an enormous amount of phishing of, and only one had SPF configured - that's all! It's such an easy way to protect their customers from phishing, yet few can be bothered.
How DK/DKIM works
What DK/DKIM does is relatively simple, though not without its warts. The concept is that key parts of the mail get cryptographically signed with a private key on the sending server, and then verified on the receiving server.
Each server doing signing can have a unique "selector" with a matching key, making it easier to have multiple independent machines without having to keep keys in sync across them all. It also provides a degree of isolation if a key or server gets compromised.
It's unclear just how effective it currently is with almost everyone running in test mode, or whether some systems are even ignoring that and using DK/DKIM for spam filtering anyway.
A note on chroot in Postfix
Postfix is often run with at least some services chrooted (the default in Debian Lenny), but some older installs do not do this. There are security benefits to chroot, though it does make setup a bit more tricky, as sockets for the milters have to be placed within the chroot rather than in /var/run as they would normally be.
Typically Postfix will chroot to its spool directory of /var/spool/postfix.
There are three approaches to dealing with chroot: create a directory in the chroot area and configure the milters to put their sockets there; create directories and bind mount the relevant directories from /var/run into the chroot; or run the milters over the network so that there are no sockets at all.
Personally I favour the first option and create the directory /var/spool/postfix/milter where I configure all the milter sockets to be. This means that the Postfix config will see all the sockets under /milter, while the milter configs will have them under /var/spool/postfix/milter.
# mkdir /var/spool/postfix/milter
# mkdir /var/spool/postfix/milter/dk-filter
# chown dk-filter.dk-filter /var/spool/postfix/milter/dk-filter
# chmod 2755 /var/spool/postfix/milter/dk-filter
# mkdir /var/spool/postfix/milter/dkim-filter
# chown dkim-filter.dkim-filter /var/spool/postfix/milter/dkim-filter
# chmod 0755 /var/spool/postfix/milter/dkim-filter
If you are concerned about the possible risks of having a world writable directory then you could just make subdirectories with appropriate permissions for each milter.
The advantage of sticking to Unix sockets is that the permissions can be controlled making them more secure.
Keep this in mind with the config that follows as you may need to adapt it to the location and configuration of your Postfix.
DKIM preparation
Start off by installing dkim-filter. This is a milter which can be used in Postfix to do signing and verification.
Next, we need to generate keys. I am going to base this on handling multiple domains (eg. virtual hosted) on the same box so we are going to create a key per domain on each server.
# mkdir -p /etc/mail/dkim/keys/domain1
# cd /etc/mail/dkim/keys/domain1
# dkim-genkey -r -d domain1
At this point we have our key pair for domain1. The selector (the identifier that says which key we are using) will be the filename that dkim-filter pulls the key from. We can either rename the key, or, as I prefer, just symlink it. So for example, if we are on a server mail2.domain1, we probably just want to call the selector mail2 to keep things simple:
# ln -s default.private mail2
Likewise, you can do the same for domain2, domain3, and so on for all the domains that your server handles.
Next, we are going to tell dkim-filter what key to use for what mail. Create a file /etc/dkim-keys.conf and put the following in it:
*@domain1:domain1:/etc/mail/dkim/keys/domain1/mail2
*@domain2:domain2:/etc/mail/dkim/keys/domain2/mail2
*@domain3:domain3:/etc/mail/dkim/keys/domain3/mail2
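If you handle a lot of domains, this file is easy to generate. A throwaway sketch - the domain names and the mail2 selector are just the examples from above:

```shell
# Print dkim-filter KeyList lines (pattern:domain:keyfile) for each domain.
selector=mail2
keylist=$(for d in domain1 domain2 domain3; do
    printf '*@%s:%s:/etc/mail/dkim/keys/%s/%s\n' "$d" "$d" "$d" "$selector"
done)
printf '%s\n' "$keylist"
```

Redirect the output into /etc/dkim-keys.conf once you are happy with it.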
Now, you need to take some time to look at how your network is configured. In many cases other machines may be allowed to use the server as an outbound relay. Any machines that do this need to be explicitly listed, else dkim-filter will not sign mail from them. If you do need to tell dkim-filter about these then create a file /etc/dkim-internalhosts.conf and put in it the machines that can use this server as a relay:
127.0.0.1
::1
localhost
server2.domain1
server1.domain2
All that remains is the final config in /etc/dkim-filter.conf. Starting from the default, the only thing you will probably need to do is uncomment the line:
KeyList        /etc/dkim-keys.conf
And, add the InternalHosts (if needed):
InternalHosts    /etc/dkim-internalhosts.conf
You may want to take more control over how dkim-filter behaves under different circumstances. See the man page and look at the On-* options, which may also be added to the config to tell it how to handle mail.
If you are running Postfix chroot (see above) then add/change the line in /etc/default/dkim-filter to put the socket within the chroot:
SOCKET="local:/var/spool/postfix/milter/dkim-filter/dkim-filter.sock"
Now restart dkim-filter and hopefully everything will work as expected:
# /etc/init.d/dkim-filter restart
Restarting DKIM Filter: dkim-filter.
The only other thing you need to do is add the postfix user to the dkim-filter group, else it will not be able to connect to the socket to talk to dkim-filter.
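On Debian, one way to do that (assuming the dkim-filter package created a dkim-filter group, as it does on Lenny) is:

```shell
# Add the postfix user to the dkim-filter group so Postfix can open the
# milter's Unix socket, then restart Postfix to pick up the membership.
adduser postfix dkim-filter
/etc/init.d/postfix restart
```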
We will handle the Postfix side of things later once all the parts are in place.
DK preparation
This is the historic method and is much less tidy than dkim-filter. Install dk-filter.
We will be using the same keys, but DK normally puts them in a different location. For convenience I just symlink them:
# mkdir /etc/mail/domainkeys
# cd /etc/mail/domainkeys
# ln -s /etc/mail/dkim/keys
The only thing to watch is permissions, as dk-filter tries to read the keys as its own user rather than root. To solve this I suggest changing permissions on the keys:
# cd keys/domain1
# chgrp dk-filter *
# chmod g+r *
And repeat this for all the other domains.
Next we need to create some lists of domains, keys etc. for dk-filter. These are similar to what we did for dkim-filter but, beware, they are not all the same.
Easy one first - internal hosts is the same so I just symlink it:
# cd /etc/mail/domainkeys
# ln -s /etc/dkim-internalhosts.conf internalhosts
We also need a list of domains that we should sign. Create /etc/mail/domainkeys/domains containing:
domain1
domain2
domain3
... as needed.
The list of keys to use is also a different format. Create /etc/mail/domainkeys/keylist containing:
*@domain1:/etc/mail/domainkeys/keys/domain1/mail2
*@domain2:/etc/mail/domainkeys/keys/domain2/mail2
*@domain3:/etc/mail/domainkeys/keys/domain3/mail2
... and more as needed.
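As with the dkim-filter KeyList, this can be generated if you have many domains - note that the DK format drops the middle domain field:

```shell
# Print dk-filter keylist lines (pattern:keyfile) for each example domain.
dk_keylist=$(for d in domain1 domain2 domain3; do
    printf '*@%s:/etc/mail/domainkeys/keys/%s/mail2\n' "$d" "$d"
done)
printf '%s\n' "$dk_keylist"
```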
The config for dk-filter is all done with command line arguments. Typically these would be added in to /etc/default/dk-filter. I have added the following to the bottom of the file:
DAEMON_OPTS="$DAEMON_OPTS -i /etc/mail/domainkeys/internalhosts"
DAEMON_OPTS="$DAEMON_OPTS -d /etc/mail/domainkeys/domains"
DAEMON_OPTS="$DAEMON_OPTS -k -s /etc/mail/domainkeys/keylist"
DAEMON_OPTS="$DAEMON_OPTS -b s"
The last line is important because it causes dk-filter to sign only. If we also did verification with DK then we would just become part of the problem of legacy systems still running it.
If you are running Postfix chroot (see above) then also add/change the line in /etc/default/dk-filter to put the socket within the chroot:
SOCKET="/var/spool/postfix/milter/dk-filter/dk-filter.sock"
Now restart dk-filter and hopefully everything will work as expected:
# /etc/init.d/dk-filter restart
Restarting DomainKeys Filter: dk-filter.
The only other thing you need to do is add the postfix user to the dk-filter group, else it will not be able to connect to the socket to talk to dk-filter.
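As before, on Debian this can be done with adduser (group name assumed from the dk-filter package):

```shell
# Grant Postfix access to the dk-filter socket via group membership.
adduser postfix dk-filter
/etc/init.d/postfix restart
```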
Now that we have the filters working we can get Postfix hooked up.
DK/DKIM Postfix configuration
Edit your /etc/postfix/main.cf file and add the following lines (or extend them if you already have milters configured):
smtpd_milters =
    unix:/var/run/dkim-filter/dkim-filter.sock
    unix:/var/run/dk-filter/dk-filter.sock
non_smtpd_milters =
    unix:/var/run/dkim-filter/dkim-filter.sock
    unix:/var/run/dk-filter/dk-filter.sock
milter_default_action = accept
Or for a chroot configuration of Postfix (see above):
smtpd_milters =
    unix:/milter/dkim-filter/dkim-filter.sock
    unix:/milter/dk-filter/dk-filter.sock
non_smtpd_milters =
    unix:/milter/dkim-filter/dkim-filter.sock
    unix:/milter/dk-filter/dk-filter.sock
milter_default_action = accept
These tell Postfix where to find the sockets for talking to the filters. Restart Postfix and hopefully now mail will be getting signed:
# /etc/init.d/postfix restart
Now we need to publish the public keys to make sure that people can verify the mail.
DK/DKIM DNS configuration
One catch here is that you will need to be able to add TXT records with underscores in the name, and some DNS providers have problems with this.
In the directory where you created the keys for each domain there will be a default.txt file which contains the DNS record that has to be added to that domain. For now I also suggest you add a t=y flag to it to indicate that it should be in test mode (ie. don't treat mail any differently from unsigned mail even if it fails verification):
default._domainkey IN TXT "v=DKIM1; g=*; k=rsa; t=y; p=MIGf........."
In your DNS record change default to whatever the selector is for this server (ie. mail2 in our example):
mail2._domainkey IN TXT "v=DKIM1; g=*; k=rsa; t=y; p=MIGf........."
This is what goes in your DNS zone. Beware that with many web interfaces you will have to put in mail2._domainkey as the name and the part in quotes (not including the quotes) as the value, ensuring that you are creating a TXT record.
For DK, a default policy for a domain is probably worth setting to discourage mail from being rejected by legacy systems verifying DK:
_domainkey IN TXT "t=y;o=~"
This says that it is in testing mode (ie. treat it the same as unsigned mail), and that not all mail will be signed. This should give DK verifiers no reason to reject any mail any more than an unsigned mail.
You can test your DNS config with the policy tester at http://domainkeys.sourceforge.net/policycheck.html and the selector tester at http://domainkeys.sourceforge.net/selectorcheck.html
Testing
It can take some time for DNS to propagate, so ensure that the DNS records you added have become available before trying to test.
It is worth having access to some mail accounts with major mail providers who use DK/DKIM so that you can test. Yahoo! and Google are good places to start, but other providers are also worth testing.
Send mails to your test accounts from all your domains, and from all your test accounts to all your domains, and examine the message headers at the other side.
You should see that signatures are being added (a DKIM-Signature header and, for DK, a DomainKey-Signature header). You should also see a line indicating that verification has succeeded, typically an Authentication-Results header added by the receiving server. If you run into trouble then check whether those headers are making it into the mail at all.
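For illustration, a signed and verified message might carry headers along these lines (all of the values here are invented, and the exact Authentication-Results format varies between providers):

```
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=domain1; s=mail2;
        h=from:to:subject:date; bh=...; b=...
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=mail2; d=domain1;
        h=Received:From:To:Subject:Date; b=...
Authentication-Results: mx.example.net; dkim=pass header.i=@domain1
```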
Also check the DNS:
$ host -t txt _domainkey.domain1
$ host -t txt mail2._domainkey.domain1
It's worth checking against other DNS servers. Google's public DNS is useful for this:
$ host -t txt _domainkey.domain1 8.8.8.8
$ host -t txt _domainkey.domain1 8.8.4.4
$ host -t txt mail2._domainkey.domain1 8.8.8.8
$ host -t txt mail2._domainkey.domain1 8.8.4.4
There are also test reflectors listed at: http://testing.dkim.org/reflector.html
Going Live
Once you have tested sufficiently and run the system in test mode for long enough to be confident that everything is working then you may like to switch off test mode.
To turn off test mode remove the t=y fields from selector DNS records to indicate that all mail for the domain is signed. At this point other systems should start rejecting mail from your domain that does not verify.
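For example, the selector record from the DNS section earlier, with the t=y flag removed:

```
mail2._domainkey IN TXT "v=DKIM1; g=*; k=rsa; p=MIGf........."
```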
This means that spammers / fraudsters attempting to fake mail from your domain will hit problems and ultimately people using DKIM should have less reason to block mail from your domain due to it being used as a fake source of SPAM / phishing.
Keep in mind however that if something goes wrong (eg. someone mangles the DNS records, messes up the dkim-filter configuration or something) then this could also end up disrupting your mail.
I would recommend having some form of monitoring in place to ensure that everything is working as designed and to be able to detect breakages quickly.
ADSP (was ASP)
DKIM ADSP is at this time not a formal standard, but nonetheless it takes care of the policy for your domain, and hence you may like to put some thought into using it at this stage. It is once again a TXT DNS record, and for now a good place to start is:
_adsp._domainkey IN TXT "dkim=unknown"
This simply states that not all mail from the domain will be signed, hence servers should still accept unsigned mail. This is the recommended state. If you want to start enforcing it (eg. your domain is being faked by phishers) then you can tell the world that all mail should be signed - it's worth verifying that all your mail actually is being signed first:
_adsp._domainkey IN TXT "dkim=all"
You can go a step further and explicitly tell the world to discard mail that is unsigned:
_adsp._domainkey IN TXT "dkim=discardable"
There is still much debate about this, and whether discarding mail (rather than rejecting it at the edge servers) is actually a good idea at all. My opinion is that, so far as possible, mail should never be discarded, because if there is a fault upstream the sender doesn't know about it and can't rectify the problem. If mail is rejected then an increase in failures will be noticed on well-run systems which are being monitored, and the admins can investigate and correct the problems.
The other problem with discarding mail is that it appears to spammers that they are being successful. Really, I would like to demonstrate as clearly as possible to spammers that they are failing, and discourage them, by rejecting the mail. If it is clearly a waste of time spamming then fewer people will try it.
There are arguments about the backscatter / blowback problem with rejecting mail, but again, if systems reject mail then it's a problem for those running systems that relay mail and they should harden their systems. If they are creating backscatter then they deserve to have their servers blacklisted.
Adding SPF
You will often see SPF (Sender Policy Framework) related lines in the headers of mail verified with DKIM, and they work nicely together. SPF is simply a way of publishing policies about what sources of mail should be trusted - kind of like an MX record for the sending servers of a domain.
With Postfix I use postfix-policyd-spf-perl to validate SPF. The man page gives you most of what you need to know.
The first thing you need to be aware of is that if you have any secondary servers that forward mail to this one, SPF checks will have to be skipped for them as they will be seen as a source of forged mail. To fix this you will need to edit the actual Perl and add the source addresses of these relays - line 86 in the version in Debian/Lenny:
use constant relay_addresses => map(
    NetAddr::IP->new($_),
    qw(
        1.2.3.4
        5.6.7.8
    )
); # add addresses to the qw( ) above, separated by whitespace, using CIDR notation
Be aware that if the package is upgraded then these will be overwritten.
Add postfix-policyd-spf-perl to the /etc/postfix/master.cf so that it is started when needed:
spfcheck  unix  -       n       n       -       0       spawn
    user=policyd-spf argv=/usr/sbin/postfix-policyd-spf-perl
Put in your smtpd_recipient_restrictions the policy check:
smtpd_recipient_restrictions =
    reject_invalid_hostname,
    reject_non_fqdn_sender,
    reject_non_fqdn_recipient,
    reject_unknown_sender_domain,
    reject_unknown_recipient_domain,
    permit_mynetworks,
    reject_non_fqdn_hostname,
    permit_sasl_authenticated,
    reject_unauth_destination,
    check_recipient_access pcre:/etc/postfix/toaccess_pcre,
    check_recipient_access hash:/etc/postfix/toaccess,
    check_policy_service unix:private/spfcheck,
    check_policy_service inet:127.0.0.1:60000,
    reject_rbl_client bl.spamcop.net,
    reject_rbl_client dnsbl.sorbs.net,
    reject_rbl_client zen.spamhaus.org,
    permit
Make sure that the line is added after reject_unauth_destination else you could end up approving mail to any destination (open relay). At that point you should be ready to go - restart Postfix and see what happens.
All going to plan you should see things like this logged occasionally:
postfix/policy-spf[*****]: : SPF Pass (Mechanism 'ip4:***.***.***.***/**' matched): Envelope-from: someone@somedomain
postfix/policy-spf[*****]: handler sender_policy_framework: is decisive.
Lastly, you need to set up your DNS so that others can verify your mail sources. Although there is a specific SPF record type in DNS, for now you will almost certainly have to use a TXT record:
@ IN TXT "v=spf1 mx a:mail.domain include:senders.domain ~all"
@ means the default for the domain (ie. when you look up the base domain), but you can just as easily specify the record for subdomains.
v=spf1 identifies it as an SPF record and gives the version.
mx says that mail could come from a machine matching the MX records for your domain. For smaller domains this is often all that is needed.
a specifies an A or AAAA record where mail may come from. This may be an outbound-only mail relay, a security appliance, a webserver that mails customers directly, or perhaps a marketing company's systems that send out mail blasts on your behalf.
include specifies another TXT record to include, which is useful if you run a large outfit and need to break up your records into manageable chunks.
There are various other mechanisms (eg. ip4, ip6, which specify address ranges) that can be added, but most will probably only be of use to people with large amounts of mail infrastructure to worry about, and they can easily be looked up.
Lastly, ~all says that all other sources should soft fail (a retryable failure, useful for testing). This can also be -all, meaning to fail (reject/bounce) other sources; ?all, meaning to ignore the policy (again useful for testing); and +all, meaning to accept all others, which is probably not a good idea. With a, mx, etc. the + is implied - ie. saying mx really means +mx.
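To make the qualifiers concrete, here is a throwaway sketch (not part of any SPF tooling) that classifies the trailing all mechanism of the example record above:

```shell
# Classify the "all" qualifier at the end of an SPF record string.
spf='v=spf1 mx a:mail.domain include:senders.domain ~all'
case "$spf" in
    *'-all') verdict='soft fail? no - hard fail for unmatched sources' ;;
    *'~all') verdict='soft fail for unmatched sources' ;;
    *'?all') verdict='neutral on unmatched sources' ;;
    *'+all'|*' all') verdict='accept all sources (unwise)' ;;
    *)       verdict='no all mechanism' ;;
esac
echo "$verdict"
```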
You can find much more on this syntax at: http://www.openspf.org/SPF_Record_Syntax
As with DKIM, this needs testing, and accounts at major web mail providers will often show a verification header. Test thoroughly before setting -all, after which other mail sources will not be able to send mail as your domain. If you have forgotten to include one of your legitimate outbound mail sources then it too will be blocked from sending mail.
Record keeping
When deploying technologies like this it is very easy to lose track of all the places where configuration is hiding that need to be changed if, for example, you add another server or just change the address of an existing one.
With small setups it's generally all left in the head of whoever set it up. As they are not administering it on a continuous basis, they often forget, and then mistakes happen. Likewise if they leave, their replacement will have no familiarity with what configuration is where.
In larger organisations there is far more infrastructure and it can be hard work keeping track of it all. Administration is done by many people and unless they communicate effectively it is a recipe for disaster.
In any size of organisation, keeping good records of your configuration, work notes of who did what configuration, and checklists/work instructions (eg. for deploying new servers) are vital to ensuring that everything remains under control.
Cacti
I am updating my Postfix templates for Cacti for monitoring DKIM and SPF and these will be available shortly.
Monitoring is vital to smooth running of mail as well as long term planning so get yours configured.