From owner-postfix-users@postfix.org Mon Mar  1 19:04:41 1999
Delivered-To: wietse@porcupine.org
Received: from russian-caravan.cloud9.net (russian-caravan.cloud9.net [168.100.1.4])
	by spike.porcupine.org (Postfix) with ESMTP id 5785A45A72
	for <wietse@porcupine.org>; Mon,  1 Mar 1999 19:04:36 -0500 (EST)
Received: by russian-caravan.cloud9.net (Postfix)
	id 5F5D37638F; Mon,  1 Mar 1999 19:04:08 -0500 (EST)
Delivered-To: postfix-users-outgoing@cloud9.net
Received: by russian-caravan.cloud9.net (Postfix, from userid 54)
	id 322AF76398; Mon,  1 Mar 1999 19:04:08 -0500 (EST)
Delivered-To: postfix-users@cloud9.net
Received: from melang.off.connect.com.au (melang.off.connect.com.au [202.21.9.1])
	by russian-caravan.cloud9.net (Postfix) with ESMTP id 6E87C7638F
	for <postfix-users@postfix.org>; Mon,  1 Mar 1999 19:04:04 -0500 (EST)
Received: from connect.com.au (localhost [127.0.0.1])
	by melang.off.connect.com.au (Postfix) with ESMTP
	id 074C7ED7D; Tue,  2 Mar 1999 11:04:02 +1100 (EST)
To: wietse@porcupine.org (Wietse Venema)
Cc: postfix-users@postfix.org (Postfix users)
Subject: regexp map patch
In-reply-to: Your message of "Thu, 25 Feb 1999 19:51:25 CDT."
             <19990226005125.69B3C4596E@spike.porcupine.org> 
Date: Tue, 02 Mar 1999 11:04:02 +1100
From: Andrew McNamara <andrewm@connect.com.au>
Message-Id: <19990302000403.074C7ED7D@melang.off.connect.com.au>
Sender: owner-postfix-users@postfix.org
Precedence: bulk
Return-Path: <owner-postfix-users@postfix.org>
Status: RO

I've written a patch to add a regexp map type. It utilises the PCRE
library (Perl Compatible Regular Expressions), which can be obtained
from:

   ftp://ftp.cus.cam.ac.uk/pub/software/programs/pcre/

You will need to add -DHAS_PCRE and a -I for the PCRE header to CCARGS,
and add the path to the PCRE library to AUXLIBS, for example:

   make -f Makefile.init makefiles 'CCARGS=-DHAS_PCRE -I../../pcre-2.04' \
      'AUXLIBS=../../pcre-2.04/libpcre.a'

One possible use is to add a line to main.cf:

   smtpd_recipient_restrictions = regexp:/opt/postfix/etc/smtprecipient
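
To make the lookup behaviour concrete, here is a rough sketch of how
such a map answers a query: rules are tried in file order, the first
match wins, and a rule with no second field yields "REJECT". This is not
the patch's code. It assumes POSIX regcomp()/regexec() as a stand-in for
PCRE, it recompiles on every lookup for brevity (the patch compiles each
expression once, when the map is opened), and struct rule and
first_match() are made-up names.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule type: one line of the map file. */
struct rule {
    const char *pattern;	/* POSIX extended regexp, standing in for PCRE */
    const char *reply;		/* NULL means "no second field given" */
};

/*
 * Return the reply of the first rule that matches key, "REJECT" when
 * that rule has no reply, or NULL when no rule matches at all.
 */
static const char *first_match(const struct rule *rules, size_t nrules,
			       const char *key)
{
    size_t  i;

    assert(rules != NULL && key != NULL);
    for (i = 0; i < nrules; i++) {
	regex_t re;
	int     rc;

	/* Case-insensitive by default, like the map described here. */
	if (regcomp(&re, rules[i].pattern,
		    REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
	    continue;			/* skip patterns that do not compile */
	rc = regexec(&re, key, 0, NULL, 0);
	regfree(&re);
	if (rc == 0)
	    return rules[i].reply ? rules[i].reply : "REJECT";
    }
    return NULL;
}
```

A caller would treat a NULL result as "no opinion" and move on to the
next restriction.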

The regular expressions are read from the specified file and compiled
when the map is opened. A sample regexp file for this usage is included
in the patch.
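
The line format used by the map file can be sketched roughly as
follows: the first character is the delimiter, the expression runs to
the next unescaped delimiter, flag letters follow immediately, and
whatever comes after the whitespace is the replacement. This is a
simplified illustration, not the code in the patch. parse_map_line(),
struct map_entry and the OPT_* values are hypothetical names, and only
the 'i' and 'U' flags are modelled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option bits, standing in for the PCRE_* constants. */
enum { OPT_CASELESS = 1, OPT_UNGREEDY = 2 };

struct map_entry {
    char   *pattern;		/* points into the caller's line buffer */
    char   *replace;		/* NULL when there is no second field */
    int     options;
};

/*
 * Split "/regexp/flags replacement" in place. Returns 0 on success,
 * -1 when the line is empty or the closing delimiter is missing.
 */
static int parse_map_line(char *line, struct map_entry *out)
{
    char    delim;
    char   *p;

    assert(line != NULL && out != NULL);
    if (*line == '\0')
	return -1;
    delim = *line;
    p = line + 1;
    out->pattern = p;

    /* A delimiter preceded by a backslash does not end the expression. */
    while (*p != '\0' && !(*p == delim && p[-1] != '\\'))
	p++;
    if (*p == '\0')
	return -1;
    *p++ = '\0';			/* terminate the expression */

    /* Matching is caseless by default; each flag letter toggles a bit. */
    out->options = OPT_CASELESS;
    for (; *p != '\0' && !isspace((unsigned char) *p); p++) {
	if (*p == 'i')
	    out->options ^= OPT_CASELESS;
	else if (*p == 'U')
	    out->options ^= OPT_UNGREEDY;
	/* other flag letters are ignored in this sketch */
    }
    while (isspace((unsigned char) *p))
	p++;
    out->replace = (*p != '\0') ? p : NULL;
    return 0;
}
```

Because the flags are toggled rather than set, a trailing 'i' turns the
default case-insensitive matching off again.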

Any feedback is appreciated (from Wietse in particular :-). Have fun.

diff -u --recursive orig/postfix-beta-19990122-pl01/conf/sample-regexp postfix-beta-19990122-pl01/conf/sample-regexp
--- orig/postfix-beta-19990122-pl01/conf/sample-regexp	Tue Mar  2 10:42:43 1999
+++ postfix-beta-19990122-pl01/conf/sample-regexp	Tue Mar  2 10:51:49 1999
@@ -0,0 +1,51 @@
+# 
+#	Sample regexp map source file
+#
+#	The first field is a Perl-like regular expression. The expression
+#	delimiter can be any character except whitespace, or characters
+#	that have special meaning to the regexp library (traditionally
+#	the forward slash is used). The expression can contain
+#	whitespace.
+#
+#	By default, matching is case-INsensitive, although following
+#	the second slash with an 'i' will reverse this. Other flags are
+#	supported, but the only other useful one is 'U', which makes
+#	matching ungreedy (see PCRE documentation and source for more
+#	info).
+#
+#	The second field is the "replacement" string - the text
+#	returned by the match. When used for smtpd checks, this would
+#	be a helpful message to misguided users (or an offensive
+#	message to spammers), although it could also be a domain name
+#	or other data for use as a transport, virtual, or other map.
+#
+#	Substitution of sub-strings from the matched expression is
+#	possible using the conventional perl syntax. The macros in the
+#	replacement string may need to be protected with curly braces
+#	if they aren't followed by whitespace (see the examples
+#	below).
+#
+#	If no second field is given, the text "REJECT" is returned -
+#	this string is magic to the check functions in smtpd, and
+#	results in an "administratively denied relay" message.
+#
+#	Lines starting with whitespace are continuation lines - they are
+#	appended to the previous line (there should be no whitespace
+#	before your regular expression!)
+
+
+# Protect your outgoing majordomo exploders
+#
+/^(.*)-outgoing@(connect\.com\.au)$/	550 Use ${1}@${2} instead
+
+
+# Bounce friend@whatever, except when whatever is our domain (you would
+# be better just bouncing all friend@ mail - this is just an example).
+#
+/^friend@(?!connect\.com\.au).*$/		550 Stick this in your pipe $0
+
+# A multi-line response
+#
+/^noddy@connect\.com\.au$/
+ 550 This user is a funny one. You really don't want to send mail to them
+ as it only makes their head spin. 
diff -u --recursive orig/postfix-beta-19990122-pl01/util/Makefile.in postfix-beta-19990122-pl01/util/Makefile.in
--- orig/postfix-beta-19990122-pl01/util/Makefile.in	Sun Jan 31 15:16:15 1999
+++ postfix-beta-19990122-pl01/util/Makefile.in	Fri Feb 26 15:57:24 1999
@@ -18,7 +18,7 @@
 	timed_wait.c translit.c trimblanks.c unix_connect.c unix_listen.c \
 	unix_trigger.c unsafe.c username.c valid_hostname.c vbuf.c \
 	vbuf_print.c vstream.c vstream_popen.c vstring.c vstring_vstream.c \
-	writable.c write_buf.c write_wait.c doze.c
+	writable.c write_buf.c write_wait.c doze.c dict_pcre.c
 OBJS	= argv.o argv_split.o attr.o basename.o binhash.o chroot_uid.o \
 	close_on_exec.o concatenate.o dict.o dict_db.o dict_dbm.o \
 	dict_env.o dict_ht.o dict_ldap.o dict_ni.o dict_nis.o \
@@ -38,7 +38,7 @@
 	timed_wait.o translit.o trimblanks.o unix_connect.o unix_listen.o \
 	unix_trigger.o unsafe.o username.o valid_hostname.o vbuf.o \
 	vbuf_print.o vstream.o vstream_popen.o vstring.o vstring_vstream.o \
-	writable.o write_buf.o write_wait.o doze.o
+	writable.o write_buf.o write_wait.o doze.o dict_pcre.o
 HDRS	= argv.h attr.h binhash.h chroot_uid.h connect.h dict.h dict_db.h \
 	dict_dbm.h dict_env.h dict_ht.h dict_ldap.h dict_ni.h dict_nis.h \
 	dict_nisplus.h dir_forest.h events.h exec_command.h find_inet.h \
@@ -51,7 +51,7 @@
 	ring.h safe.h safe_open.h sane_accept.h scan_dir.h set_eugid.h \
 	set_ugid.h sigdelay.h split_at.h stat_as.h stringops.h sys_defs.h \
 	timed_connect.h timed_wait.h trigger.h username.h valid_hostname.h \
-	vbuf.h vbuf_print.h vstream.h vstring.h vstring_vstream.h
+	vbuf.h vbuf_print.h vstream.h vstring.h vstring_vstream.h dict_pcre.h
 TESTSRC	= fifo_open.c fifo_rdwr_bug.c fifo_rdonly_bug.c
 WARN	= -W -Wformat -Wimplicit -Wmissing-prototypes \
 	-Wparentheses -Wstrict-prototypes -Wswitch -Wuninitialized \
diff -u --recursive orig/postfix-beta-19990122-pl01/util/dict_open.c postfix-beta-19990122-pl01/util/dict_open.c
--- orig/postfix-beta-19990122-pl01/util/dict_open.c	Sat Dec 12 05:55:34 1998
+++ postfix-beta-19990122-pl01/util/dict_open.c	Fri Feb 26 15:07:51 1999
@@ -100,6 +100,7 @@
 #include <dict_nisplus.h>
 #include <dict_ni.h>
 #include <dict_ldap.h>
+#include <dict_pcre.h>
 #include <stringops.h>
 #include <split_at.h>
 
@@ -131,6 +132,9 @@
 #endif
 #ifdef HAS_LDAP
     "ldap", dict_ldap_open,
+#endif
+#ifdef HAS_PCRE
+    "regexp", dict_pcre_open,
 #endif
     0,
 };
diff -u --recursive orig/postfix-beta-19990122-pl01/util/dict_pcre.c postfix-beta-19990122-pl01/util/dict_pcre.c
--- orig/postfix-beta-19990122-pl01/util/dict_pcre.c	Tue Mar  2 10:42:30 1999
+++ postfix-beta-19990122-pl01/util/dict_pcre.c	Tue Mar  2 10:39:37 1999
@@ -0,0 +1,349 @@
+/*++
+/* NAME
+/*	dict_pcre 3
+/* SUMMARY
+/*	dictionary manager interface to PCRE regular expression library
+/* SYNOPSIS
+/*	#include <dict_pcre.h>
+/*
+/*	DICT	*dict_pcre_open(name, flags)
+/*	const char *name;
+/*	int	flags;
+/* DESCRIPTION
+/*	dict_pcre_open() opens the named file and compiles the contained
+/*	regular expressions.
+/* SEE ALSO
+/*	dict(3) generic dictionary manager
+/* LICENSE
+/* .ad
+/* .fi
+/*	The Secure Mailer license must be distributed with this software.
+/* AUTHOR(S)
+/*	Wietse Venema
+/*	IBM T.J. Watson Research
+/*	P.O. Box 704
+/*	Yorktown Heights, NY 10598, USA
+/*
+/*	Andrew McNamara
+/*	andrewm@connect.com.au
+/*	connect.com.au Pty. Ltd.
+/*	Level 3, 213 Miller St
+/*	North Sydney, NSW, Australia
+/*--*/
+
+#include "sys_defs.h"
+
+#ifdef HAS_PCRE
+
+/* System library. */
+
+#include <stdio.h>			/* sprintf() prototype */
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <ctype.h>
+
+/* Utility library. */
+
+#include "mymalloc.h"
+#include "msg.h"
+#include "safe.h"
+#include "vstream.h"
+#include "vstring.h"
+#include "stringops.h"
+#include "readline.h"
+#include "dict.h"
+#include "dict_pcre.h"
+#include "mac_parse.h"
+
+/* PCRE library */
+
+#include "pcre.h"
+
+#define PCRE_MAX_CAPTURE	99	/* Max strings captured by regexp - */
+					/* essentially the max number of (..) */
+
+struct dict_pcre_list {
+    pcre        *pattern;		/* The compiled pattern */
+    pcre_extra  *hints;			/* Hints to speed pattern execution */
+    char	*replace;		/* Replacement string */
+    int		lineno;			/* Source file line number */
+    struct dict_pcre_list  *next;	/* Next regexp in dict */
+};
+
+typedef struct {
+    DICT        dict;			/* generic members */
+    char       *map;			/* map name */
+    int         flags;			/* unused at the moment */
+    struct dict_pcre_list  *head;
+} DICT_PCRE;
+
+static int dict_pcre_init = 0;		/* flag need to init pcre library */
+
+/* 
+ *  dict_pcre_update - not supported
+ */
+static void dict_pcre_update(DICT *dict, const char *unused_name, 
+			const char *unused_value)
+{
+    DICT_PCRE *dict_pcre = (DICT_PCRE *) dict;
+
+    msg_fatal("dict_pcre_update: attempt to update regexp map %s", 
+	dict_pcre->map);
+}
+
+/*
+ * Context for macro expansion callback.
+ */
+struct dict_pcre_context {
+    const char	*dict_name;			/* source dict name */
+    int		lineno;				/* source file line number */
+    VSTRING	*buf;				/* target string buffer */
+    const char	*subject;			/* str against which we match */
+    int		offsets[ PCRE_MAX_CAPTURE * 3 ];/* Cut substrings */
+    int         matches;			/* Count of cuts */
+};
+
+/*
+ * Macro expansion callback - replace $0-${99} with strings cut from
+ * matched string.
+ */
+static void dict_pcre_action( int type, VSTRING *buf, char *ptr )
+{
+    struct dict_pcre_context *ctxt = (struct dict_pcre_context *) ptr;
+    const char	*pp;
+    int		n, ret;
+
+    if( type == MAC_PARSE_VARNAME ) {
+	n = atoi( vstring_str( buf ));
+	ret = pcre_get_substring( ctxt->subject, ctxt->offsets, ctxt->matches,
+		n, &pp );
+	if( ret < 0 ) {
+	    if( ret == PCRE_ERROR_NOSUBSTRING )
+		msg_warn( "regexp %s, line %d: replace index out of range",
+		    ctxt->dict_name, ctxt->lineno );
+	    else
+	    	msg_warn( "regexp %s, line %d: pcre_get_substring error: %d",
+		    ctxt->dict_name, ctxt->lineno, ret );
+	    return;
+	}
+	vstring_strcat( ctxt->buf, pp );
+    } else
+	/* Straight text - duplicate with no substitution */
+    	vstring_strcat( ctxt->buf, vstring_str(buf));
+}
+
+/*
+ * Look up regexp dict and perform string substitution on matched
+ * strings.
+ */
+static const char *dict_pcre_lookup(DICT *dict, const char *name)
+{
+    DICT_PCRE *dict_pcre = (DICT_PCRE *) dict;
+    struct dict_pcre_list *pcre_list;
+    int		name_len = strlen( name );
+    struct dict_pcre_context	ctxt;
+    static VSTRING		*buf;
+
+/*    msg_info("dict_pcre_lookup: %s: %s", dict_pcre->map, name );*/
+
+    /* Search for a matching expression */
+    for( pcre_list = dict_pcre->head; pcre_list; pcre_list = pcre_list->next ) {
+	if( pcre_list->pattern ) {
+	    ctxt.matches = pcre_exec( pcre_list->pattern, pcre_list->hints, 
+		    name, name_len, 0, ctxt.offsets, PCRE_MAX_CAPTURE * 3 );
+	    if( ctxt.matches != PCRE_ERROR_NOMATCH ) {
+		if( ctxt.matches > 0 )
+		    break;			/* Got a match! */
+		else {
+		    /* An error */
+		    switch( ctxt.matches ) {
+		    case 0:
+		    	msg_warn( "regexp map %s, line %d: too many (...)",
+				dict_pcre->map, pcre_list->lineno );
+			break;
+		    case PCRE_ERROR_NULL:
+		    case PCRE_ERROR_BADOPTION:
+		    	msg_fatal( "regexp map %s, line %d: bad args to pcre_exec",
+				dict_pcre->map, pcre_list->lineno );
+			break;
+		    case PCRE_ERROR_BADMAGIC:
+		    case PCRE_ERROR_UNKNOWN_NODE:
+		    	msg_fatal( "regexp map %s, line %d: corrupt compiled regexp",
+				dict_pcre->map, pcre_list->lineno );
+			break;
+		    default:
+		    	msg_fatal( "regexp map %s, line %d: unknown pcre_exec error: %d",
+				dict_pcre->map, pcre_list->lineno, ctxt.matches );
+			break;
+		    }
+		    return( (char *)0 );
+		}
+	    }
+	}
+    }
+
+    /* If we've got a match, */
+    if ( pcre_list != 0 ) {		/* NULL when no rule matched */
+	/* And we've got a replacement string, */
+    	if ( pcre_list->replace ) {
+	    /* Then perform substitution on replacement string */
+	    if( buf == 0 )
+		buf = vstring_alloc(10);
+	    VSTRING_RESET(buf);
+	    ctxt.buf = buf;
+	    ctxt.subject = name;
+	    ctxt.dict_name = dict_pcre->map;
+	    ctxt.lineno = pcre_list->lineno;
+
+	    mac_parse( pcre_list->replace, dict_pcre_action, (char *)&ctxt );
+
+	    VSTRING_TERMINATE(buf);
+	    return( vstring_str( buf ));
+	} else
+	    /* No replacement string, so just return dummy */
+	    return( "REJECT" );
+    }
+
+    return ( (char *)0 );
+}
+
+/* dict_pcre_close - close pcre dictionary */
+
+static void dict_pcre_close(DICT *dict)
+{
+    DICT_PCRE *dict_pcre = (DICT_PCRE *) dict;
+    struct dict_pcre_list *pcre_list;
+
+    for( pcre_list = dict_pcre->head; pcre_list; pcre_list = pcre_list->next ) {
+	if( pcre_list->pattern )
+	    myfree((char *) pcre_list->pattern);
+	if( pcre_list->hints )
+	    myfree((char *) pcre_list->hints);
+	if( pcre_list->replace )
+	    myfree((char *) pcre_list->replace);
+    }
+    myfree((char *) dict_pcre);
+}
+
+/*
+ * dict_pcre_open - load and compile a file containing regular expressions
+ */
+DICT   *dict_pcre_open(const char *map, int unused_flags)
+{
+    DICT_PCRE   *dict_pcre;
+    VSTREAM	*map_fp;
+    VSTRING	*line_buffer;
+    struct dict_pcre_list *pcre_list = NULL, *pl;
+    int		lineno = 0;
+    char	*regexp, *p, re_delimiter;
+    int		re_options;
+    pcre	*pattern;
+    pcre_extra	*hints;
+    const char	*error;
+    int		errptr;
+
+    line_buffer = vstring_alloc(100);
+
+    dict_pcre = (DICT_PCRE *) mymalloc(sizeof(*dict_pcre));
+    dict_pcre->dict.lookup = dict_pcre_lookup;
+    dict_pcre->dict.update = dict_pcre_update;
+    dict_pcre->dict.close = dict_pcre_close;
+    dict_pcre->dict.fd = -1;
+    dict_pcre->map = mystrdup(map);
+    dict_pcre->flags = 0;
+    dict_pcre->head = NULL;
+
+    if (dict_pcre_init == 0) {
+	pcre_malloc = (void *)mymalloc;
+	pcre_free = (void *)myfree;
+	dict_pcre_init = 1;
+    }
+
+    if(( map_fp = vstream_fopen( map, O_RDONLY, 0 )) == 0 ) {
+    	msg_fatal("open %s: %m", map );
+    }
+    while (readline(line_buffer, map_fp, &lineno)) {
+
+    	if (*vstring_str(line_buffer) == '#')		/* Skip comments */
+	    continue;
+
+	p = vstring_str(line_buffer);
+	re_delimiter = *p++;
+	regexp = p;
+
+	/* Search for second delimiter, handling backslash escape */
+	while( *p ) {
+	    if( *p == re_delimiter && 
+		    ( p > vstring_str(line_buffer) && *(p - 1) != '\\' ))
+	    	break;
+	    ++p;
+	}
+
+	if (!*p) {
+	    msg_warn("%s, line %d: no closing regexp delimiter: %c",
+	    	VSTREAM_PATH(map_fp), lineno, re_delimiter );
+	    continue;
+	}
+	*p++ = '\0';				/* Null term the regexp */
+
+	/* Now parse any regexp options */
+	re_options = PCRE_CASELESS;
+	while( *p && !ISSPACE( *p )) {
+	    switch( *p ) {
+		case 'i':	re_options ^= PCRE_CASELESS; break;
+		case 'm':	re_options ^= PCRE_MULTILINE; break;
+		case 's':	re_options ^= PCRE_DOTALL; break;
+		case 'x':	re_options ^= PCRE_EXTENDED; break;
+		case 'A':	re_options ^= PCRE_ANCHORED; break;
+		case 'E':	re_options ^= PCRE_DOLLAR_ENDONLY; break;
+		case 'U':	re_options ^= PCRE_UNGREEDY; break;
+		case 'X':	re_options ^= PCRE_EXTRA; break;
+		default:
+		    msg_warn("%s, line %d: unknown regexp option '%c'",
+			VSTREAM_PATH(map_fp), lineno, *p );
+	    }
+	    ++p;
+	}
+
+	while( *p && ISSPACE( *p ))
+	    ++p;
+	
+	/* Compile the pattern */
+	pattern = pcre_compile( regexp, re_options, &error, &errptr, NULL );
+	if( pattern == NULL ) {
+	    msg_warn("%s, line %d: error in regex at offset %d: %s",
+	    	VSTREAM_PATH(map_fp), lineno, errptr, error );
+	    continue;
+	}
+	hints = pcre_study( pattern, 0, &error );
+	if( error != NULL ) {
+	    msg_warn("%s, line %d: error while studying regex: %s",
+	    	VSTREAM_PATH(map_fp), lineno, error );
+	    myfree( (char *)pattern );
+	    continue;
+	}
+
+	/* Add it to the list */
+	pl = (struct dict_pcre_list *)mymalloc( sizeof( struct dict_pcre_list ));
+
+	/* Save the replacement string (if any) */
+	pl->replace = ( *p ? mystrdup( p ) : NULL );
+	pl->pattern = pattern;
+	pl->hints = hints;
+	pl->next = NULL;
+	pl->lineno = lineno;
+
+	if( pcre_list == NULL )
+	    dict_pcre->head = pl;
+	else
+	    pcre_list->next = pl;
+	pcre_list = pl;
+    }
+
+    vstring_free(line_buffer);
+    vstream_fclose(map_fp);
+
+    return (&dict_pcre->dict);
+}
+#endif /*HAS_PCRE*/
diff -u --recursive orig/postfix-beta-19990122-pl01/util/dict_pcre.h postfix-beta-19990122-pl01/util/dict_pcre.h
--- orig/postfix-beta-19990122-pl01/util/dict_pcre.h	Tue Mar  2 10:42:32 1999
+++ postfix-beta-19990122-pl01/util/dict_pcre.h	Mon Mar  1 18:17:23 1999
@@ -0,0 +1,41 @@
+#ifndef _DICT_PCRE_H_INCLUDED_
+#define _DICT_PCRE_H_INCLUDED_
+
+/*++
+/* NAME
+/*	dict_pcre 3h
+/* SUMMARY
+/*	dictionary manager interface to PCRE regular expression library
+/* SYNOPSIS
+/*	#include <dict_pcre.h>
+/* DESCRIPTION
+/* .nf
+
+ /*
+  * Utility library.
+  */
+#include <dict.h>
+
+ /*
+  * External interface.
+  */
+extern DICT *dict_pcre_open(const char *, int);
+
+/* LICENSE
+/* .ad
+/* .fi
+/*	The Secure Mailer license must be distributed with this software.
+/* AUTHOR(S)
+/*	Wietse Venema
+/*	IBM T.J. Watson Research
+/*	P.O. Box 704
+/*	Yorktown Heights, NY 10598, USA
+/*
+/*	Andrew McNamara
+/*	andrewm@connect.com.au
+/*	connect.com.au Pty. Ltd.
+/*	Level 3, 213 Miller St
+/*	North Sydney, NSW, Australia
+/*--*/
+
+#endif
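
For anyone wondering how the $0 / ${1} substitution in the replacement
string is meant to behave, the following is a minimal sketch. It is not
the mac_parse() based code from the patch. expand_replacement() is a
hypothetical helper, it understands only numeric $N and ${N}, and it
fails on an out-of-range index where the patch logs a warning and
carries on.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand $N and ${N} in tmpl using caps[0..ncaps-1], where caps[0] is
 * the whole match. Writes a NUL-terminated result into out. Returns 0
 * on success, -1 on a bad index, a missing '}' or lack of space.
 */
static int expand_replacement(const char *tmpl, const char *const *caps,
			      int ncaps, char *out, size_t outsz)
{
    size_t  used = 0;

    assert(tmpl != NULL && caps != NULL && out != NULL && outsz > 0);
    while (*tmpl != '\0') {
	int     braced = (tmpl[0] == '$' && tmpl[1] == '{');
	const char *digits = tmpl + (braced ? 2 : 1);

	if (*tmpl == '$' && isdigit((unsigned char) *digits)) {
	    int     n = 0;
	    size_t  len;

	    tmpl = digits;
	    /* The n < 1000 guard keeps the index from overflowing. */
	    while (isdigit((unsigned char) *tmpl) && n < 1000)
		n = n * 10 + (*tmpl++ - '0');
	    if (braced) {
		if (*tmpl != '}')
		    return -1;
		tmpl++;
	    }
	    if (n >= ncaps)
		return -1;
	    len = strlen(caps[n]);
	    if (used + len >= outsz)
		return -1;
	    memcpy(out + used, caps[n], len);
	    used += len;
	} else {
	    /* Ordinary text is copied through unchanged. */
	    if (used + 1 >= outsz)
		return -1;
	    out[used++] = *tmpl++;
	}
    }
    out[used] = '\0';
    return 0;
}
```

The braces matter when the macro is followed directly by more text, as
in "${1}@${2}" in the sample file.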

 ---
Andrew McNamara (Senior System Administrator)

connect.com.au Pty Ltd
Lvl 3, 213 Miller St, North Sydney, NSW 2060, Australia
Phone: +61 2 9959 5959, Fax: +61 2 9966 1960



