Using Regular Expressions for email processing in SecureTide

Using Regular Expressions (REGEX, REGEXP)

 

A regular expression is a text string that describes a search pattern. You can think of regular expressions as a far more powerful form of wildcard search. Programmers and technical email administrators with programming experience in your organization can use regular expressions to rapidly search through your emails for items of interest. The examples below, ranging from basic to advanced, show how to use regular expressions to perform specific filtered email searches:

 

Example 1:   Find any email with the word cat in it.

 

(?i)(cat)

 

Example 2:   Find any email with any of the words cat or dog in it.

 

(?i)(cat|dog)

 

Example 3:   Find any email with the word cat or dog immediately followed by bird (optionally separated by spaces) in it.

 

(?i)(cat|dog) *(bird)
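As a rough illustration of Examples 1 through 3, the sketch below runs the three patterns with Python's re module. Python is used here only for demonstration; the SecureTide engine may use a different regular expression flavor, and the sample message bodies are made up. Note that none of the patterns is tied to word boundaries, so cat also matches inside a longer word such as catalog.

```python
import re

# Patterns copied from Examples 1-3; (?i) makes them case-insensitive.
CAT = re.compile(r"(?i)(cat)")
CAT_OR_DOG = re.compile(r"(?i)(cat|dog)")
THEN_BIRD = re.compile(r"(?i)(cat|dog) *(bird)")

# Hypothetical message bodies, for illustration only.
bodies = [
    "My CAT is asleep.",
    "Please walk the dog.",
    "See the attached catalog.",   # "cat" appears inside "catalog"
    "A dog  bird hybrid?",         # "dog", two spaces, then "bird"
    "The cat chased a bird.",      # other words sit between "cat" and "bird"
]

def hits(pattern):
    """Return, for each sample body, whether the pattern is found anywhere in it."""
    return [bool(pattern.search(body)) for body in bodies]

cat_hits = hits(CAT)
either_hits = hits(CAT_OR_DOG)
bird_hits = hits(THEN_BIRD)

print(cat_hits)     # [True, False, True, False, True]
print(either_hits)  # [True, True, True, True, True]
print(bird_hits)    # [False, False, False, True, False]
```

If whole-word matching is wanted, wrapping the word in \b anchors, for example (?i)\b(cat)\b, is a common approach in engines that support word boundaries.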

 

Example 4:   Find any email with words or combinations that begin with th and end with s in it.

 

(?i)(th.*s)
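Because .* is greedy, the Example 4 pattern can span many words, running from the first th to the last s on the line. The sketch below, written in Python's re module purely for illustration (the sample sentence and the two alternative patterns are assumptions, not part of the SecureTide product), compares the greedy form with a lazy form and with a form restricted to single words.

```python
import re

# Hypothetical sample text.
text = "This is the house that has three doors, thanks."

# Example 4 as written: greedy, so it runs to the LAST "s" it can reach.
greedy = re.search(r"(?i)(th.*s)", text).group(1)

# Lazy variant: stops at the FIRST "s" after "th".
lazy = re.search(r"(?i)(th.*?s)", text).group(1)

# Word-limited variant: only single words that start with "th" and end with "s".
words = re.findall(r"(?i)\bth\w*s\b", text)

print(greedy)  # This is the house that has three doors, thanks
print(lazy)    # This
print(words)   # ['This', 'thanks']
```

All three forms flag the same messages when the only question is whether a match exists; the difference matters when the matched text itself is inspected or reported.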

 

Example 5: Find any SSN in the format of XXX-XX-XXXX in it.

 

([0-9]{3}-[0-9]{2}-[0-9]{4})
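The Example 5 pattern also matches when the digits are part of a longer number. The following sketch uses Python's re module as a stand-in for the filtering engine (an assumption made for illustration) and a made-up message body to show the difference that \b word-boundary anchors make.

```python
import re

# Example 5 as written.
SSN = re.compile(r"([0-9]{3}-[0-9]{2}-[0-9]{4})")

# Assumed stricter variant: \b rejects matches embedded in a longer digit run.
SSN_BOUNDED = re.compile(r"\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b")

# Hypothetical body: one SSN-shaped value and one longer reference number.
body = "SSN 123-45-6789 on file; order ref 9123-45-67890."

loose = SSN.findall(body)
bounded = SSN_BOUNDED.findall(body)

print(loose)    # ['123-45-6789', '123-45-6789']
print(bounded)  # ['123-45-6789']
```

The second hit in the loose result comes from inside the order reference, which is the kind of false positive the bounded variant is meant to avoid.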

 

Example 6: Find any 5- or 9-digit ZIP code in it.

 

\d{5}(-?\d{4})?
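To see which strings the Example 6 pattern accepts, the sketch below checks a few made-up candidates with Python's re module (used only for illustration; the SecureTide engine may behave differently). fullmatch tests a whole string, while search shows that the unanchored pattern is also satisfied by the first five digits of any longer number.

```python
import re

# Example 6 as written: five digits, optionally followed by "-" and four more.
ZIP = re.compile(r"\d{5}(-?\d{4})?")

# Hypothetical candidate strings.
candidates = ["32501", "32501-1234", "325011234", "3250", "32501-12"]

# Keep only the candidates that match in their entirety.
valid = [c for c in candidates if ZIP.fullmatch(c)]

# Unanchored search: a 7-digit invoice number still yields a 5-digit "ZIP".
partial = ZIP.search("Invoice 1234567").group(0)

print(valid)    # ['32501', '32501-1234', '325011234']
print(partial)  # 12345
```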

 

Example 7:   Find any simple formatted phone number in the format of (XXX)XXX-XXXX in it.

 

\(\d{3}\)(\d{3}-\d{4})

 

Example 8:   Find any normally formatted phone number within the United States in it, including long-distance prefixes and differently formatted area codes.

 

1?\s*-?\s*(\d{3}|\(\s*\d{3}\s*\))\s*-?\s*\d{3}\s*-?\s*\d{4}
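The difference between Examples 7 and 8 can be seen by running both against the same list of numbers. This is a minimal sketch in Python's re module, chosen only for illustration; the phone numbers are invented samples.

```python
import re

# Example 7: only the strict (XXX)XXX-XXXX layout.
SIMPLE = re.compile(r"\(\d{3}\)(\d{3}-\d{4})")

# Example 8: optional leading 1, optional parentheses, spaces, and hyphens.
US_PHONE = re.compile(
    r"1?\s*-?\s*(\d{3}|\(\s*\d{3}\s*\))\s*-?\s*\d{3}\s*-?\s*\d{4}"
)

# Hypothetical sample numbers.
samples = [
    "(850)555-0100",
    "850-555-0100",
    "1-850-555-0100",
    "1 (850) 555 0100",
    "555-0100",          # no area code: neither pattern accepts it
]

simple_ok = [s for s in samples if SIMPLE.fullmatch(s)]
us_ok = [s for s in samples if US_PHONE.fullmatch(s)]

print(simple_ok)  # ['(850)555-0100']
print(us_ok)      # the first four samples
```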

 

Example 9:  Find any normally formatted email address contained in it.

 

([a-zA-Z0-9_\-])([a-zA-Z0-9_\-\.]*)@(\[((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}|((([a-zA-Z0-9\-]+)\.)+))([a-zA-Z]{2,}|(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\])
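Long patterns such as Example 9 are easier to review when split into pieces. The sketch below rebuilds the same pattern from adjacent string literals in Python's re module and pulls the addresses out of a made-up message body. Python is an assumption here, used only for illustration; the second address is hypothetical.

```python
import re

# Example 9, split across lines for readability; the joined text is unchanged.
EMAIL = re.compile(
    r"([a-zA-Z0-9_\-])([a-zA-Z0-9_\-\.]*)@"                           # local part
    r"(\[((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}"   # [IP literal
    r"|((([a-zA-Z0-9\-]+)\.)+))"                                      # or domain labels
    r"([a-zA-Z]{2,}|(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\])"
)

# Hypothetical message body.
body = "Copy phoenix@xyzcompany.com and john.smith@mail.xyzcompany.com today"

# The pattern contains groups, so take group(0) for the full address.
addresses = [m.group(0) for m in EMAIL.finditer(body)]

print(addresses)
# ['phoenix@xyzcompany.com', 'john.smith@mail.xyzcompany.com']
```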

 

Example 10:  Find any email that has a URL embedded in it.

 

(?i)\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%?=~_|]
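The sketch below applies the Example 10 pattern, written with \b placed directly before the scheme group (no space between them), to a made-up message body. It uses Python's re module only for illustration, and the ftp address is a hypothetical placeholder. The final character class in the pattern leaves out punctuation such as commas and periods, so sentence punctuation after a URL is not captured.

```python
import re

# Example 10, with \b immediately followed by the scheme alternatives.
URL = re.compile(
    r"(?i)\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%?=~_|]"
)

# Hypothetical message body; each URL is followed by punctuation.
body = (
    "Docs: http://www.regular-expressions.info/tutorial.html, "
    "mirror at ftp://files.example.com/pub."
)

# The pattern contains a group, so take group(0) for the full URL.
urls = [m.group(0) for m in URL.finditer(body)]

print(urls)
# ['http://www.regular-expressions.info/tutorial.html', 'ftp://files.example.com/pub']
```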

 

Example 11:  Find any email that has a date in the format MM/DD/YYYY or MM-DD-YYYY in it.

 

\d{1,2}[/-]\d{1,2}[/-]\d{4}
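When adapting a date pattern, keep in mind that | has the lowest precedence in a regular expression. An unparenthesized /|- does not choose between two separators; it divides the entire pattern into separate alternatives. The sketch below, in Python's re module with a made-up message body (both assumptions for illustration), contrasts that form with a character class [/-], which keeps the choice local to one position.

```python
import re

# Bare alternation: read as  \d{1,2}/   OR   -\d{1,2}/   OR   -\d{4}
SPLIT = re.compile(r"\d{1,2}/|-\d{1,2}/|-\d{4}")

# Character class: each separator position accepts "/" or "-".
DATE = re.compile(r"\d{1,2}[/-]\d{1,2}[/-]\d{4}")

# Hypothetical body with two dates, a fraction, and a phone extension.
body = "Due 12/31/2024 or 1-5-2025; ratio 3/4; ext -1234."

split_hits = SPLIT.findall(body)
date_hits = DATE.findall(body)

print(split_hits)  # ['12/', '31/', '-2025', '3/', '-1234']
print(date_hits)   # ['12/31/2024', '1-5-2025']
```

The bare-alternation form fires on fragments such as 3/ and -1234, so a rule built on it would act on many messages that contain no date at all.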

 

Example 12:  Find any email that has a valid IP address in it.

 

((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])
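The Example 12 pattern limits each octet to 0 through 255, but because it is unanchored it can still match part of an out-of-range address. The sketch below rebuilds the same pattern from a single octet definition in Python's re module (an assumption, used only for illustration) and compares it with a variant wrapped in \b word boundaries. The addresses are made-up samples.

```python
import re

# One octet, exactly as in Example 12: 0-255.
OCTET = "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])"

# Example 12: three "octet." groups followed by a final octet.
IP = re.compile(rf"({OCTET}\.){{3}}{OCTET}")

# Assumed stricter variant: \b stops matches that begin or end mid-number.
IP_BOUNDED = re.compile(rf"\b({OCTET}\.){{3}}{OCTET}\b")

# Hypothetical body: one valid address and one with an out-of-range octet.
body = "Relay 192.168.1.10 rejected 256.1.1.1"

loose = [m.group(0) for m in IP.finditer(body)]
bounded = [m.group(0) for m in IP_BOUNDED.finditer(body)]

print(loose)    # ['192.168.1.10', '56.1.1.1']
print(bounded)  # ['192.168.1.10']
```

The loose result shows the pattern skipping the leading 2 of 256 and matching 56.1.1.1, which the bounded variant rejects.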

 

Practical Application:

Using the SecureTide® REGEX Email Filtering engine to perform email redirection operations.

 

AppRiver’s regular expression engine lets programmers and technical administrators with programming ability enter search patterns and have the system perform specific actions when a pattern is found. The sample below demonstrates one of these regular expressions and its use in routing email.

 

Scenario:  John Smith, a project manager at XYZ Company, Inc., manages multiple high-level projects for the company.  John has been assigned a high-priority project called PROJECT PHOENIX.  Because of the sensitive nature of this project and its vital communications requirements, John needs any inbound email that contains the keyword PHOENIX to be immediately copied to a special project oversight email address.  This email address is:  phoenix@xyzcompany.com

To accomplish this task, John logs on to his company's SecureTide™ Spam & Virus interface and performs the following actions:

1. From the summary screen, John selects Tools > Final Processing/mail Rules.

2. John then adds a rule that contains the following information.

  •  Condition = BODY

  •  Equals = MATCHES REGEX

  •  Parameter = (?i)(phoenix)

  •  Action = copy

  •  Action Value = phoenix@xyzcompany.com

The tested and validated rule, when completed, would look like the one pictured below:

 

John would then click the Add a Rule button to activate the email processing rule. With this rule active, every inbound email whose body contains the keyword PHOENIX is copied to the project email account for centralized storage.
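The logic of the rule above can be sketched in a few lines of Python. This is an illustrative model only, not AppRiver's implementation: the Rule class, the apply_rule function, and the dictionary message format are all hypothetical names introduced for this example. It shows that a rule with Condition = BODY inspects only the message body, so a message that mentions PHOENIX in the subject alone is not copied.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    """Hypothetical model of one mail rule, mirroring the fields listed above."""
    condition: str      # message part to inspect, e.g. "BODY"
    pattern: str        # regular expression from the Parameter field
    action: str         # e.g. "copy"
    action_value: str   # e.g. the address to copy the message to

def apply_rule(rule, message):
    """Return the extra recipients this rule would add for the message."""
    part = message.get(rule.condition.lower(), "")
    if rule.action == "copy" and re.search(rule.pattern, part):
        return [rule.action_value]
    return []

phoenix_rule = Rule("BODY", r"(?i)(phoenix)", "copy", "phoenix@xyzcompany.com")

# Made-up messages represented as plain dictionaries.
in_body = {"subject": "Status", "body": "Project Phoenix budget is approved."}
no_match = {"subject": "Lunch", "body": "See you at noon."}
subject_only = {"subject": "PHOENIX update", "body": "Details to follow."}

copies_in_body = apply_rule(phoenix_rule, in_body)
copies_no_match = apply_rule(phoenix_rule, no_match)
copies_subject_only = apply_rule(phoenix_rule, subject_only)

print(copies_in_body)       # ['phoenix@xyzcompany.com']
print(copies_no_match)      # []
print(copies_subject_only)  # []
```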

For more information, you can find a regular expression tutorial at the following link:

  http://www.regular-expressions.info/tutorial.html

It is also strongly recommended that you use the link provided below to fully test your regular expressions before applying them in the production portion of your SecureTide anti-spam interface.

  http://www.regexlib.com/RETester.aspx

 

Regular Expression Syntax Reference Card

BASIC METACHARACTERS

.            Match any single character
|            Or
[]           Match one of a set of characters
[^]          Negate a set of characters
-            Define a range of characters, e.g. [0-9]
\            Escape the next character

QUANTIFIERS

*            Match zero or more of the previous character
*?           Lazy version of *
+            Match one or more of the previous character
+?           Lazy version of +
?            Match zero or one of the previous character
{n}          Match exact number of instances
{m,n}        Match a range of instances
{n,}         Match n or more instances
{n,}?        Lazy version of {n,}

ANCHORS

^            Match start of string
\A           Match start of string
$            Match end of string
\Z           Match end of string
\<           Match start of word
\>           Match end of word
\b           Match a word boundary
\B           Opposite of \b

SPECIFIC CHARACTERS

[\b]         Backspace
\c           Match a control character
\d           Match any digit
\D           Opposite of \d
\f           Form feed
\n           Line feed
\r           Carriage return
\s           Match any white space character
\S           Match anything but white space character
\t           Tab
\v           Vertical tab
\w           Match any alphanumeric character, digit or underscore
\W           Opposite of \w
\x           Match a hexadecimal number
\0           Match octal number

BACKREFERENCES & LOOKAROUND

()           Define subexpression
\n           Match nth subexpression
?=           Lookahead
?!           Negative lookahead

CASE CONVERSION

\E           Terminate \L or \U
\l           Convert next character to lowercase
\L           Convert all characters up to \E to lowercase
\u           Convert next character to uppercase
\U           Convert all characters up to \E to uppercase

MODIFIERS

(?m)         Multiline mode

POSIX

[:alnum:]    Any letter or digit
[:alpha:]    Any letter
[:blank:]    Space or tab
[:cntrl:]    ASCII control
[:digit:]    Any digit
[:print:]    Any printable character
[:graph:]    Same as [:print:] but excludes space
[:lower:]    Any lowercase character
[:punct:]    Any character that is not in [:alnum:] or [:cntrl:]
[:space:]    Any whitespace character including space
[:upper:]    Any uppercase character
[:xdigit:]   Any hexadecimal digit

 

 

               
 

 

 
