2012-02-27

Don't forget about /etc/shells

So I'm attempting, over and over, to authenticate to a new service I've configured on a host.  And it keeps on telling me "Login incorrect".  Grr... Argh...

Then I look in the service's PAM configuration [/etc/pam.d/{serviceName}].  And there I see that in addition to the typical include of the host's default password-auth stack it has added a requirement for pam_shells.  Ah ha!  I remember something about that from approximately a hundred or so years ago.

pam_shells restricts authentication to accounts whose UNIX shell attribute is one of the shells listed in the text file /etc/shells.
Aside:  The UNIX shell of an account is the seventh, and last, column in the traditional /etc/passwd file, or the loginShell attribute of the account's entry if you are using the RFC2307 LDAP schema.

Since my account was set to use a weirdo shell ["/bin/ksh"] that had not been added to /etc/shells my authentication was failing to satisfy this PAM stack.  So -
echo "/bin/ksh" >> /etc/shells
- and away I go.
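A quick way to check an account's shell against /etc/shells before you end up here again - a sketch, with "someuser" as a placeholder account name:
getent passwd someuser | cut -d: -f7
grep -x "$(getent passwd someuser | cut -d: -f7)" /etc/shells || echo "shell NOT listed in /etc/shells"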

2012-02-20

Sequestering E-Mail

When testing applications one of the concerns is always that their actions don't affect the real world. One aspect of this is sending e-mail; the last thing you want is for the application you are testing to send a paid-in-full customer a flurry of e-mails saying he owes you a zillion dollars. A simple, and reliable, method to avoid this is to adjust the Postfix server on the host used for testing to bury all mail in a shared folder.  This way:
  • You don't need to make any changes to the application between production and testing.
  • You can see the message content exactly as it would ordinarily have been delivered.
To accomplish this you can use Postfix's generic address rewriting feature; generic address rewriting processes the addresses of messages sent by the service [vs. received, as is the more typical case for address rewriting].  For this example we'll rewrite every address to shared+myfolder@example.com using a regular expression.

Step#1

Create the regular expression map.  Maps are how Postfix handles all rewriting; a match for the input address is looked for in the left hand [key] column and rewritten in the form specified by the right hand [value] column.
echo "/(.)/           shared+myfolder@example.com" > /etc/postfix/generic.regexp
Text 1: Create the new regular expression map.
Step#2

Configure Postfix to use the new map for generic address rewriting.
postconf -e smtp_generic_maps=regexp:/etc/postfix/generic.regexp
Text 2: Enable generic rewriting using the regular expression map.

Step#3

Tell Postfix to reload its configuration.
postfix reload
Text 3: Signal the running Postfix service to reload its configuration.

Now any mail, to any address, sent via the host's Postfix service will be delivered not to the original address but to the shared "myfolder" folder.
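To verify, send a test message to any outside address and then check the shared folder; a quick sketch, assuming the mailx package is installed [the recipient address is just a placeholder]:
echo "test body" | mail -s "sequester test" customer@example.net
The message should turn up in the "myfolder" shared folder rather than in the customer's mailbox.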

2012-02-15

Complex Queries With SQLAlchemy (Example#1)

There are lots of examples of how to use SQLAlchemy to provide your Python application with a first-rate ORM. But most of these examples tend to model very trivial queries.  The real power of SQLAlchemy, unlike many ORM solutions, is that it doesn't hide / bury the power of the RDBMS - and if you aren't going to use that power why bother with an RDBMS at all [aren't NoSQL solutions the IT fad of the year? You could be so hip!].  So in this post I'll provide a useful non-trivial query and show how to perform the same complex query - only better - using SQLAlchemy.
So first, a query:
SELECT process.process_id, op1.value_int, op2.value_string, op3.value_string
FROM process
  INNER JOIN route ON ( route.route_id = process.route_id )
  LEFT OUTER JOIN obj_property op1
    ON ( op1.obj_id = route.route_id AND
         op1.namespace_prefix = 'http://www.opengroupware.us/oie' AND
         op1.value_key = 'expireDays' )
  LEFT OUTER JOIN obj_property op2
    ON ( op2.obj_id = route.route_id AND
         op2.namespace_prefix = 'http://www.opengroupware.us/oie' AND
         op2.value_key = 'preserveAfterCompletion' )
  LEFT OUTER JOIN obj_property op3
    ON ( op3.obj_id = route.route_id AND
         op3.namespace_prefix = 'http://www.opengroupware.us/oie' AND
         op3.value_key = 'archiveAfterExpiration' )
WHERE process.db_status != 'archived'
  AND process.state IN ( 'C', 'F', 'Z' )
  AND process.status != 'archived';
Text 1: A sophisticated query that exploits the power of the PostgreSQL database engine.

This query returns the process_id value from the process table and then some values from multiple records from the table obj_property - if those records exist - correlated via an intermediate table route.  Complex, but also fast!  
Aside: With the correct DDL when creating the database, PostgreSQL also enforces the integrity of the relations, maintains usage statistics, and provides multiple levels of atomicity.  I certainly wouldn't want to have to do all that myself.
So how to model such a query in SQLAlchemy?  First realize that the result is actually going to be better than what straight SQL / DBAPI would give us.  Instead of bare values the query will return real objects; the advantage of this is the amount of code eliminated by not having to do things based on ids or primary keys.  This example assumes that the ORM entities Process, Route, and ObjectProperty have already been described - there is lots of documentation about how to declare the relations between your tables and your objects using the declarative style.
from sqlalchemy.orm import aliased
from sqlalchemy import and_, or_

db = ctx.db_session()

op1 = aliased(ObjectProperty)
op2 = aliased(ObjectProperty)
op3 = aliased(ObjectProperty)

q = db.query( Process, op1, op2, op3 ).\
       join( Route, Route.object_id == Process.route_id ).\
       outerjoin( op1, and_( op1.parent_id == Route.object_id,
                             op1.namespace=='http://www.opengroupware.us/oie',
                             op1.name=='expireDays' ), ).\
       outerjoin( op2, and_( op2.parent_id == Route.object_id,
                             op2.namespace=='http://www.opengroupware.us/oie',
                             op2.name=='preserveAfterCompletion' ), ).\
       outerjoin( op3, and_( op3.parent_id == Route.object_id,
                             op3.namespace=='http://www.opengroupware.us/oie',
                             op3.name=='archiveAfterExpiration' ), ).\
       filter( and_( Process.state.in_( [ 'C', 'F', 'Z' ] ),
                     Process.status != 'archived' ) )
Text 2: The same query as above, only expressed via the SQLAlchemy ORM.  But instead of returning values it returns live objects.
The SQLAlchemy aliased() helper declares multiple independent references to ObjectProperty: op1, op2, and op3.  The other advanced technique is the use of the outerjoin method to express a LEFT OUTER JOIN.
The results of this query will be tuples of four elements: the first a Process object; the second, third, and fourth each either an ObjectProperty object, if the concomitant outer join identified a record, or None if no record matched the join. The lovely upside of this is that the query results can be processed using a straightforward for-each construct:
for process, expire_days, preserve_after, archive_after in q.all():
    if expire_days:
        ....
Text 3: Iterate over the query results; the first step depends on whether op1 is an object (a record matched the first outer join).
Personally I find the ORM code easier to visually parse than the native SQL, especially if you need to build the query dynamically or modify it based on the application's needs - since q is an object, additional filter and join conditions can continue to be added.  Imagine trying to do that with straight SQL?
q = q.filter(Process.owner_id == 10100)
q = q.limit(150)
Text 4: Add one more filter expression to the query's WHERE clause and limit the query to 150 results.
Another advantage of this method is that SQLAlchemy can adapt its dialect to the specific back-end if, for example, you are stuck using a database other than PostgreSQL.  Without such an adaptive layer, using anything other than the most trivial queries becomes daunting due to slight but important differences in how various engines express joins and nested queries.

2012-02-09

Configuring Postfix As An SMTP Client

Every host needs to send mail, not just users.  Hosts send mail for a variety of reasons - cron jobs, log watchers, error and exception reports, lots and lots of reasons.  But mail sent by hosts should be as secure as mail sent by users, at least to the degree you trust the facilities generating the mail.  To achieve that the host's Postfix instance should encrypt its traffic to the central SMTP relay and it should authenticate itself - just like a user.  Fortunately doing so is pretty straightforward.  For this example I'm assuming the central SMTP server is smtp.example.com and that you have a username/password with which to authenticate.

A note about SASL mechs

Whether authentication is performed using DIGEST, PLAIN, CRAM, etc. doesn't really matter.  The only caveat is that you need to have the appropriate SASL library installed; so for PLAIN you need to "yum install cyrus-sasl-plain".  If you get to the end and don't have an appropriate SASL library installed for an authentication type the central SMTP server supports you'll see messages like "SASL authentication failure: No worthy mechs found".  It is the central SMTP server that determines what authentication methods are acceptable - your client has to be able to match at least one of the methods it supports.

Step #1 : Point to the central SMTP server

Configure the Postfix instance to only listen to the local interface and to send all mail, regardless of destination, to the central relay.
postconf -e inet_interfaces=localhost
postconf -e relayhost='[smtp.example.com]'
Text 1: Setting the central SMTP server (relayhost)
Step #2 : Enable authentication & encryption

Of course you'll want to encrypt the traffic and the relay host will probably only permit authentication over an encrypted connection anyway.
postconf -e smtp_sasl_auth_enable=yes
postconf -e smtp_use_tls=yes
postconf -e smtp_tls_note_starttls_offer=yes
Text 2: Enable TLS & authentication
If the site is using their own CA certificate to create SSL certificates then that CA certification must be available on the host in order to verify the host certificate of the SMTP relay.
postconf -e smtp_tls_CAfile=/path/to/the/cacert.pem
Text 3: Set the path to the CA certificate
Step #3 : Establish the authentication credentials

Now the SMTP server needs some credentials.  These are written to a file and then a Postfix map is generated from that file. The format of the file is the host name of the remote, whitespace, and then the username and password delimited by a colon.  Note that the hostname must match the actual hostname of the remote or the local Postfix instance won't attempt to login - it will think it doesn't have credentials. The permissions on the sasl_passwd and sasl_passwd.db files should be secured so that only user root & group mail have access.

echo "smtp.example.com username:password" > sasl_passwd
postmap hash:/etc/postfix/sasl_passwd
postconf -e smtp_sasl_password_maps=hash:/etc/postfix/sasl_passwd
chown root:mail sasl_passwd sasl_passwd.db
chmod 740 sasl_passwd sasl_passwd.db
Text 4: Creating the sasl_passwd map
Optional Extra Paranoia

My personal preference, for a bit of added paranoia, is to also set the immutable flag of the three security sensitive files.
chattr +i cacert.pem sasl_passwd sasl_passwd.db
Text 5: Making the sensitive file immutable.

A file set as immutable cannot be modified, deleted, renamed, or linked to.  Not even by root - at least not until the immutability flag is explicitly removed [chattr -i files].  This protects the files from being modified or deleted unintentionally as well as making them that much more difficult to modify maliciously.

Step #4: Test

Now you should be able to send some mail; this is most easily accomplished with the mail command [which is provided in the mailx package].  Watch the /var/log/maillog file to see your message go, or to see any errors.  If you see messages like "certificate verification failed for ..." then Postfix doesn't accept the validity of the central SMTP relay's certificate; either the CA cert specified in Step #2 is invalid or the permissions are incorrect and Postfix cannot access the file.
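For example, assuming someuser@example.com is a mailbox you can check:
echo "relay test" | mail -s "SMTP client test" someuser@example.com
tail -f /var/log/maillog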
When you receive the mail you've sent in your INBOX you can look at the headers and you should see something very much like:
Received: from client.example.com (client.example.com [192.168.1.70]) \
  (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No \
  client certificate requested) (Authenticated sender: smtpclient) by \
  smtp.example.com (Postfix) with ESMTP id 5FD712000C for \
  ; Wed,  8 Feb 2012 12:05:19 -0500 (EST)
Text 6: Example header from a secure and authenticated e-mail.
Your server is now securely sending messages.

2012-02-08

Installing PDO_INFORMIX on CentOS6

Step#1 : Install the Informix SDK / Client

This is as simple as copying the files to /opt/informix or using one of the various install methods provided.  But beyond that it is necessary to initialize the required environment variables.  The simplest way to set the environment variables is to create an informix.sh profile script in /etc/profile.d - these scripts are executed by /etc/profile whenever a session is created [such as when a user logs in].  Additionally you'll need to set these same variables in /etc/sysconfig/httpd so that they are set in Apache's environment when it is started by the system start-up scripts.
$ (cat << EOF
export INFORMIXDIR=/opt/informix
export DBDATE='Y4MD-'
export INFORMIXSERVER=YOURINSTANCENAME
EOF
) > /etc/profile.d/informix.sh
Text 1: Creating /etc/profile.d/informix.sh
YOURINSTANCENAME needs to be defined in /opt/informix/etc/sqlhosts.  Your method of installing the SDK may or may not have set that up for you.
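For reference, an sqlhosts entry is four whitespace-separated columns - server name, protocol, host, and service/port.  A hypothetical line:
YOURINSTANCENAME    onsoctcp    dbhost.example.com    9088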

The system library path must also be extended to include the directories containing the SDK's libraries.
$ ( cat << EOF
/opt/informix/lib
/opt/informix/lib/cli
/opt/informix/lib/esql
/opt/informix/lib/client
/opt/informix/lib/csm
/opt/informix/lib/dmi
EOF
) > /etc/ld.so.conf.d/informix.conf
Text 2: Extending the system's library path
If the library path is not configured correctly applications, including httpd, will not be able to load the Informix libraries.  At this point the library cache can be refreshed by executing the /sbin/ldconfig command.  Once that has been performed either log out and back into the server, or just reboot the server, to verify that upon logging in you have the INFORMIXDIR, INFORMIXSERVER, and DBDATE variables in your environment.

Step#2 : Build the Informix PDO driver.

In order to build PHP PECL modules you must have php-devel, make, and gcc installed on the server.
$ pecl download PDO_INFORMIX-1.2.6
$ tar xzf PDO_INFORMIX-1.2.6.tgz
$ cd PDO_INFORMIX-1.2.6
$ phpize
$ ./configure
$ make
Text 3: Building PDO_INFORMIX
If your Informix SDK is installed correctly and you've properly initialized the environment everything should be found automatically and build without complaint.  Now move the PDO driver into place and inform the PHP interpreter that it needs to load the library.  Here we perform a dirty trick of first loading the base pdo.so library.  This shouldn't be necessary, and PHP will grumble about it upon initialization, but it works around some wackiness regarding PDO versions.  Without this line pdo_informix.so will refuse to load because the PDO extension it depends on isn't loaded first - the dependency isn't discovered automatically.
$ cp /tmp/PDO_INFORMIX-1.2.6/modules/pdo_informix.so /usr/lib64/php/modules/
$ ( cat << EOF
extension=pdo.so
extension=pdo_informix.so
EOF
) > /etc/php.d/informix.ini
Text 4:  Install and register PDO_INFORMIX
Now we can try to start/restart the Apache service and see if our PDO module is available: service httpd restart.  But it won't work. The loading of the Informix SDK by Apache will be blocked by SELinux's security policy.

Step#3 : Provision SELinux


PHP Warning:  PHP Startup: Unable to load dynamic library '/usr/lib64/php/modules/pdo_informix.so' - libifcli.so: failed to map segment from shared object: Permission denied in Unknown on line 0
The message in /var/log/httpd/error_log indicates that loading the library failed with "permission denied" - regardless of what you set the permissions to.
Text 5: SELinux blocking the loading of libifcli.so
The solution is not to disable SELinux; SELinux is your over-protective big brother.  Maybe annoying to have around sometimes, but worth it for those times when you need to take a short cut through a dark musty alley.  The correct solution is just to label the required library as a known and approved shared library.
$ chcon -t lib_t /opt/informix/lib/cli/libifcli.so
Text 6: Applying the appropriate label to libifcli.so
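Note that chcon labels the file directly and that label may not survive a full filesystem relabel; to record the context persistently the semanage utility [from the policycoreutils-python package] can be used - a sketch, assuming the SDK lives under /opt/informix:
$ semanage fcontext -a -t lib_t '/opt/informix/lib(/.*)?'
$ restorecon -R -v /opt/informix/lib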
Step#4 : Get coffee

Restart Apache now and you should see the pdo_informix driver available in phpinfo() output.  Also double check that INFORMIXDIR, INFORMIXSERVER, and DBDATE appear in the "Environment" section of phpinfo; without these variables the PDO driver will not be able to find your Informix instance.
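You can also sanity-check from the shell, assuming the PHP CLI reads the same /etc/php.d configuration; pdo_informix should appear in the module list:
$ php -m | grep -i pdo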

From here on out it is pretty much the web developer's problem.

2012-02-07

Renaming The Volume Group Containing /

Almost every server I work with is a virtual machine; accordingly I like to do one small install with all the packages that I always want [like nss-pam-ldapd, snmp-utils, dstat, etc...] but don't get installed by default.  Then I make sure VMware tools is installed and operational.  From that point forward I can just clone that one VM and add to it when I want a new instance of something.
The only downside to this is that all the machines end up with the same volume group name: typically VolGroup0 or some such thing.  That's ugly and limits the ability to move volume groups around - these are virtual disks, after all.  Renaming a volume group is straightforward:
vgrename /dev/VolGroup0  /dev/Carazon
Text 1: Rename the volume group VolGroup0 to be named Carazon.
Hold it!  Now the machine won't boot anymore!  You get the pleasure of rebooting into "Kernel Panic: blah blah blah".  If you rename the volume group containing the root filesystem and swap space you also have to update that name in two places -
  1. In /etc/fstab.  This one is obvious and I usually remember.
  2. In /etc/grub.conf.  Otherwise the kernel tries to mount the root file-system using the old volume group name.
Done this so many times... always forget.
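For what it's worth, both files can be updated in one pass with sed - a sketch using the names from the example above; review both files before rebooting:
sed -i 's/VolGroup0/Carazon/g' /etc/fstab
sed -i --follow-symlinks 's/VolGroup0/Carazon/g' /etc/grub.conf
The --follow-symlinks option matters because /etc/grub.conf is typically a symlink into /boot/grub, and a plain sed -i would replace the symlink with a regular file.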

Identifying The Hottest Tables (Informix)

Yesterday I posted about how to identify the hottest table in a PostgreSQL database.  Pretty much the same functionality is available for administrators of Informix databases as well; this kind of information is found in the "sysmaster" database which is the database engine's own database.
SELECT
        TRIM(dbsname) || ':' || TRIM(tabname) AS relation_name,
        isreads AS records_read,
        pagreads AS page_reads,
        iswrites AS records_inserted,
        bufwrites AS buffered_writes
FROM sysmaster:sysptprof
ORDER BY isreads DESC;
Text 1: List the basic read and write statistics for objects in the engine's databases.
This will list a record for every object in the database, including indexes; the ratio of ISAM operations vs. buffer page operations can give you a hint as to the effectiveness of your server's configuration. If the ratio is very low for a busy object your buffer pool is possibly too small.
If you are interested in the counts of various query operations the sysptprof table also provides the following values:
  • isrwrite - The number of records updated.
  • isrdelete - The number of records deleted.
These counters will reset whenever the database server is restarted. In Informix versions 11.7 and later persistent values are also available from the sysmaster database.
Many more sysmaster queries can be found in the Informix Wiki.

2012-02-06

Identifying The Hottest Tables (PostgreSQL)

In recent versions of PostgreSQL there is a magical view called pg_stat_user_tables which provides per-table information on usage; there is one row per table carrying a collection of usage counters.
Fields
relname - The name of the table.
seq_scan - The number of sequential scans that have been performed on the table. A sequential scan is a read of the table from beginning to end, either because the table is very small or because no indexes were available that could satisfy the filter criteria in an efficient way. Sequential scans are probably the most expensive operation the database server performs; some, however, are unavoidable. If proper indexing cannot resolve the need to sequentially scan a table it is imperative that the PostgreSQL configuration provide enough resources to maintain a high cache rate.
seq_tup_read - The number of rows processed through sequential scans. This is not the number of records returned to the applications as results but the number of records processed in order to create the result set - the result set is probably a small subset of this number. For example, if a query returns ten records but requires a sequential scan of the table then this value will increase by the number of records in the table, not by ten.
idx_scan - The number of index scans performed on the table.
idx_tup_fetch - The number of rows processed through index scans. As with seq_tup_read this is not the count of records returned as the results of queries but the count of those evaluated for queries due to index entries.
n_tup_ins - The number of rows inserted into the table.
n_tup_upd - The number of rows updated.
n_tup_del - The number of rows deleted.
Using this view an administrator can isolate the busiest tables in the database.
SELECT relname AS table_name,
       seq_tup_read, idx_tup_fetch
FROM pg_stat_user_tables
WHERE (seq_tup_read + idx_tup_fetch) > 0
ORDER BY (seq_tup_read + idx_tup_fetch) DESC LIMIT 10;
Text 1: Query to return the ten hottest tables with their sequential and index tuple fetch values.
These results will reveal both table usage and the effectiveness of your indexes.  If you have lots of sequential scans occurring then the query engine isn't finding indexes that match the queries being performed.
table_name                    seq_tup_read    idx_tup_fetch
doc                      1,423,407,729,074  349,028,985,971
job_history                     71,378,301    4,213,364,118
job_history_info                74,454,363    4,207,594,850
date_company_assignment         31,059,671    1,305,469,897
enterprise                   3,551,311,871    1,083,015,878
date_x                          12,884,498      982,418,723
object_acl                  15,942,621,939      137,179,721
job                         39,956,712,914       46,912,825
project_info                 1,709,329,011               23
team                         1,141,035,688                0
Text 2: Example results.
In these example results it is apparent that the table doc is one of the hottest objects, and while many records are being identified using index entries there is also a very large amount of sequential processing occurring.  This may be because the indexes do not match the queries being performed or because the cardinality of the indexed values is too low. Now we know where to look.
So don't grope about speculating on how to improve database performance or scalability - ask where to look; PostgreSQL wants to help you.  Much more information can be found in the PostgreSQL stats monitoring documentation.
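One related note: these counters accumulate from the last statistics reset, so to measure activity over a known window you can clear them first - a sketch with a placeholder database name [pg_stat_reset() requires superuser privileges]:
psql -d yourdb -c 'SELECT pg_stat_reset();'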

Integrating Postfix And CLAMAV

The clamav-milter, packaged by most distributions in their "clamav" package, can be used in conjunction with Postfix to protect your network from malware embedded in SMTP traffic. Integration of CLAMAV and Postfix involves four steps:

  1. Configuration and enabling of the clamd service.
  2. Updating the CLAMAV malware database and enabling the freshclam service.
  3. Configuration and enabling of the clamav-milter service. Current versions of the clamav-milter require connectivity to the clamd daemon through either a local UNIX socket or a TCP/IP socket.
  4. Configuration of Postfix to utilize the available clamav-milter service.
Step #1 : Enabling the clamd service

LocalSocket /var/lib/clamav/clamd-socket
LogFacility LOG_MAIL
LogSyslog yes
PidFile /var/lib/clamav/clamd.pid
TCPAddr 127.0.0.1
TCPSocket 3310
User vscan
Text 1: Typical settings overridden from defaults in /etc/clamd.conf

The clamd daemon typically reads its configuration from the /etc/clamd.conf file. Most importantly this file specifies, via the TCPSocket and TCPAddr directives, on what IP port and address the service listens for connections. These directives should be set to values appropriate for the host and which will be reachable by the clamav-milter. If the clamav-milter and the clamd daemon will be running on the same host the clamd service can be configured to listen to the localhost address [127.0.0.1] to avoid any potential network firewall and traffic filtering issues.
The clamd.conf file also provides many other tunable values but almost all of these should be appropriate at the distribution's defaults.
Once configured the clamd service must be started and enabled for automatic start following the system's boot-up sequence; on RPM based systems this is typically achieved using the service and chkconfig commands.
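On an RPM based system that typically looks something like the following [service names can vary between distributions and packagings]:
service clamd start
chkconfig clamd on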

Step #2 : Enabling the freshclam service

The freshclam service is an instance of the freshclam command line tool started with the “-d” option, which runs the command in daemon mode. Whether started from the command-line or running in daemon mode freshclam reads its configuration from the /etc/freshclam.conf file. When running as a daemon freshclam will periodically check the CLAMAV project mirrors for new malware signatures and update the local database used by the clamd scanning service. The freshclam daemon should run in the same user context as the clamd service; the typical way to ensure this is to synchronize the values of DatabaseOwner in /etc/freshclam.conf and User in /etc/clamd.conf. The frequency with which freshclam checks for new patterns is controlled by the Checks directive – the default is 12 [times a day], which should be sufficient in most cases. When a database update succeeds the freshclam service will notify the clamd service that newer patterns are available [for this to work the NotifyClamd directive must indicate the correct path to the current clamd configuration file].
DatabaseMirror database.clamav.net
DatabaseOwner vscan
HTTPProxyPort 3128
HTTPProxyServer proxy.example.com
LogFacility LOG_MAIL
LogSyslog yes
NotifyClamd /etc/clamd.conf
OnErrorExecute /usr/local/bin/malware_update_fail.sh
OnUpdateExecute /usr/local/bin/malware_update_ok.sh
PidFile /var/lib/clamav/freshclam.pid
UpdateLogFile /var/log/freshclam.log
Text 2: Example /etc/freshclam.conf file (comments removed)

The most important considerations in the configuration of freshclam are whether your network requires the use of an HTTP proxy server in order to access the CLAMAV mirrors for updates and whether you need to configure some form of notification concerning the success or failure of the pattern updates – a security focused service like a malware milter doesn't help anyone if it is silently failing in the background.
The HTTPProxyPort and HTTPProxyServer directives allow an HTTP proxy to be specified; freshclam will use this proxy for all mirror requests whether running as a command-line utility or in daemon mode. Should your proxy require a username/password for authentication these can be provided using the additional HTTPProxyUsername and HTTPProxyPassword directives. However it is simpler and more reliable to simply approve the “database.clamav.net” domain and sub-domains on your HTTP proxy service; all mirror requests will be made to those domains.
For notification of successful or failed updates the OnUpdateExecute and OnErrorExecute directives are used respectively. Whatever command is specified here will execute in the security context of the DatabaseOwner. A useful approach is to enable the log file via the UpdateLogFile directive and have the tail-end of that file mailed to a responsible party such as a help-desk or system-administrator for periodic verification that the service is operational.
#!/bin/sh

tail -25  /var/log/freshclam.log \
 | mail -s "[NOTICE] Malware Database Update Successful" \
    -r milter@example.com helpdesk@example.com
Text 3: A simple example script that might be used for OnUpdateExecute
The proper operation of freshclam can be tested by simply executing the freshclam utility on the command-line; it should check the mirrors and download any new patterns without an error message. Once configured and tested the freshclam service must be started and enabled for automatic start following the system's boot-up sequence.
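On an RPM based system the test and the enablement might look like this [again, the service name can vary by packaging]:
freshclam
service freshclam start
chkconfig freshclam on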

Step #3 : Enabling the clamav-milter service
ClamdSocket tcp:127.0.0.1
LogFacility LOG_MAIL
LogSyslog yes
MilterSocket inet:32767@192.168.1.66
OnInfected Reject
PidFile /var/lib/clamav/clamav-milter.pid
ReportHostname mail.example.com
User vscan
VirusAction /usr/local/bin/virus_notify.sh
Text 4: Example clamav-milter.conf file (with comments removed)

Once the clamd scanning service is running and the freshclam service is maintaining the malware signatures, the clamav-milter must be configured and started in order to connect the scanning service into Postfix's SMTP processing. The milter service typically loads its configuration from the /etc/clamav-milter.conf file.
The service must be informed via the ClamdSocket directive where to find the clamd scanning service and via MilterSocket where to listen for connections from Postfix. The format of the MilterSocket directive is “inet:port@ip-address”. The VirusAction and OnInfected directives can be used to control the behavior of the service when malware is identified; an OnInfected value of Quarantine will cause Postfix to hold the infected message in its hold queue while a value of Reject will bounce the message with an SMTP error. Especially when used in Reject mode, defining an appropriate VirusAction to notify the intended recipient that a message has been discarded is important. The script named by VirusAction is executed in the security context of the scanning service and is provided seven parameters:
  1. Virus name
  2. Message queue id
  3. The sender's e-mail address
  4. The e-mail address of the intended recipient
  5. The subject of the message
  6. The message's Message-ID
  7. The date of the message
Once configured the clamav-milter service must be started and set to automatically restart upon completion of system boot-up.
#!/bin/bash

# Parameters:
#   virus name, queue id, sender, destination, subject, message id, message date

(
 echo "";
 echo "   A message containing malware has been discarded.";
 echo "";
 echo "   Malware:     $1";
 echo "   Sender:      $3";
 echo "   Destination: $4";
 echo "   Subject:     $5";
 echo "   Message-ID:  $6";
 echo "   Date:        $7";
 echo "   Queue-ID:    $2";
 echo "";
) | \
 mail -s '[ALERT] Infected Messages Discarded' \
  -r milter@example.com -c helpdesk@example.com $4
Text 5: A sample script for use as the VirusAction. This script notifies the intended recipient and help-desk that a message was identified as malware and discarded.
Connecting the Postfix service to clamav-milter

In order to integrate the scanning into Postfix the milter is configured in the main.cf file as an smtpd_milter. The default action of the milter should be set to “accept” so that if for any reason the milter is unresponsive messages will still be delivered. As when connecting the other components, it is important to verify that the Postfix service can reach the specified service [traffic is permitted by firewalls, etc...].
smtpd_milters = inet:milter.example.com:32767
milter_default_action = accept
Text 6: Configuration directives from Postfix's main.cf

Upon modification of the main.cf file the Postfix service should be restarted.
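Using postconf, as elsewhere in these posts, the change can be applied and activated like so; a reload is sufficient for main.cf changes:
postconf -e smtpd_milters=inet:milter.example.com:32767
postconf -e milter_default_action=accept
postfix reload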
Once configured the malware filtering service should be tested; this can be accomplished by acquiring a copy of the EICAR diagnostic virus and verifying that messages with this content attached are rejected and that the end-users are notified of the rejection [according to the clamav-milter's defined VirusAction].
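A sketch of such a test, assuming a mailx that supports attachments via -a and that eicar.txt contains the EICAR test string downloaded from eicar.org:
echo "malware filter test" | mail -s "EICAR test" -a eicar.txt someuser@example.com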

clamd[11973]: instream(127.0.0.1@60469): Eicar-Test-Signature FOUND
Text 7: Example clamd log message for identified malware.

When malware is detected a message will be logged by clamd via syslog regarding the event; this will typically be logged under the “mail” service. Depending on the distribution messages logged as mail will be written to either /var/log/mail or /var/log/maillog [at least with the default syslog configuration].