Finishing Up the Bash Mail Merge Script

Finally, I'm going to finish the mail merge script, just in time for Replicant Day.

Remember the mail merge script I started writing a while back? Yeah, that was quite some time ago. I got sidetracked with the Linux Journal Anniversary special issue (see my article "Back in the Day: UNIX, Minix and Linux"), and then I spun off on a completely different tangent for my last article ("Breaking Up Apache Log Files for Analysis"). I blame it on...

SQUIRREL!

Oh, sorry, back to topic here. I was developing a shell script that would let you specify a text document with embedded field names that could be substituted iteratively across a file containing lots of field values.

Each field was denoted by #fieldname#, and I identified two categories of fieldnames: fixed and dynamic. A fixed value might be #name#, which would come directly out of the data file, while a dynamic value could be #date#, which would be the current date.

More interestingly, I also proposed calculated values: #suggested#, which would be computed from #donation#, and #date#, which would be replaced by the current date. The super-fancy version would have a simple language where you could define the relationship between variables, but let's get real. Mail merge. It's just mail merge.

Reading and Assigning Values

It turns out that the additions needed for this script aren't too difficult. The basic data file has comma-separated field names on its first line, and each subsequent line has the values associated with those fields.

Here's that core code:


if [ $lines -eq 1 ] ; then # field names
  # grab variable names
  declare -a varname=($f1 $f2 $f3 $f4 $f5 $f6 $f7)
else # process fields

  # grab values for this line (can contain spaces)
  declare -a value=("$f1" "$f2" "$f3" "$f4" "$f5" "$f6" "$f7")

The declare built-in turns out to be ideal for this, allowing you to create an array varname based on the contents of the first line, then keep replacing the contents of the array value for each subsequent line, so that value[1] holds the value for the field named in varname[1], and so on.
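To make the pairing concrete, here's a minimal bash sketch that fills both arrays from a single header line and a single record, then walks them by index. The header and record strings are illustration data, and only three fields are used to keep it short; note that bash arrays start at index 0.

```shell
#!/usr/bin/env bash
# Illustration data only: one header line and one record.
header="name,first,amount"
record="Eldon Tyrell,Eldon,500"

# Split each line on commas into three fields.
IFS=',' read -r h1 h2 h3 <<< "$header"
IFS=',' read -r f1 f2 f3 <<< "$record"

declare -a varname=("$h1" "$h2" "$h3")
declare -a value=("$f1" "$f2" "$f3")

# The same index selects a field name and its value.
for i in "${!varname[@]}"; do
  echo "#${varname[$i]}# -> ${value[$i]}"
done
```

Because the record fields are quoted, "Eldon Tyrell" stays a single array element despite the space.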

To add the additional variables #date# and #suggested#, you simply can append them to the varname and value arrays. The first one is easy, but it did highlight a weakness in the original code that I had to fix by adding quotes as shown:


declare -a varname=("$f1" "$f2" "$f3" "$f4" "$f5"
  "$f6" "$f7" "date" "suggested")

The f1–f7 values needed to be quoted to ensure that there always are the same number of values in the varname array regardless of actual value (if any).
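The difference is easy to see in a small bash experiment. This sketch reads a five-field header into seven variables, so f6 and f7 end up empty, and then builds the array both ways; the array names unquoted and quoted are just for illustration.

```shell
#!/usr/bin/env bash
# Five fields read into seven variables: f6 and f7 are left empty.
IFS=',' read -r f1 f2 f3 f4 f5 f6 f7 <<< "name,first,amount,month,state"

# Unquoted, the empty variables vanish during word splitting.
unquoted=($f1 $f2 $f3 $f4 $f5 $f6 $f7 "date" "suggested")
# Quoted, every variable contributes exactly one element, empty or not.
quoted=("$f1" "$f2" "$f3" "$f4" "$f5" "$f6" "$f7" "date" "suggested")

echo "unquoted: ${#unquoted[@]} elements, index 5 is '${unquoted[5]}'"
echo "quoted:   ${#quoted[@]} elements, index 7 is '${quoted[7]}'"
```

In the unquoted version "date" slides down to index 5, so it would no longer line up with a value array that always has nine slots.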

Adding the values to the value array is a smidge more tricky because you actually need to calculate values. Date is easy; it can be calculated once:


thedate=$(date "+%b %d, %Y")

Calculating the suggested value—donation/2—is also fairly easy to accomplish, but must be done within the main loop so that it changes for each letter being sent. The original donation amount in the demo is field 3, so the necessary code is:


# amount=f3, so suggested=(f3/2)
suggested="$(( $f3 / 2 ))"
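One thing to keep in mind is that shell arithmetic is integer-only, so odd amounts are rounded down. Here's a small sketch; the half function is a hypothetical helper, not part of the script.

```shell
#!/usr/bin/env bash
# half: hypothetical helper that halves a whole-dollar amount.
half() {
  echo "$(( $1 / 2 ))"
}

even=$(half 500)   # divides evenly
odd=$(half 25)     # integer division drops the remainder

echo "half of 500 is $even"
echo "half of 25 is $odd"
```

A donation recorded with cents, such as 12.50, would make the arithmetic expansion fail with a syntax error, so this approach assumes the data file holds whole numbers.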

The main block of code doesn't require any changes at all, fortunately, so with just those few tweaks, you now can use the mail merge script to generate, yes, a fully customized email message:


$ subs.sh
------------------------
Apr 13, 2019

Dear Eldon Tyrell, I wanted to start by again thanking you
for your generous donation of $500 in July. We couldn't do
our work without support from humans like you, Eldon.
This year we're looking at some unexpected expenses,
particularly in Sector 5, which encompasses California, as
you know. I'm hoping you can start the year with an
additional contribution? Even $250 would be tremendously
helpful.
Thanks for your ongoing support.
Rick Deckard
Society for the Prevention of Cruelty to Replicants

Notice that date and suggested are both replaced with logical values, the former showing the current date in a pleasant format (the date format string, above), and the suggested value as 50% of the donation.
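The code that builds $SUBS isn't repeated in this article, so what follows is only a sketch of one way such a substitution could be assembled from the two arrays, not the script's actual implementation. The names subs, template and merged are hypothetical, and the values are hard-coded illustration data.

```shell
#!/usr/bin/env bash
# Illustration data: in the real script these come from the data file.
varname=("name" "amount" "date" "suggested")
value=("Eldon Tyrell" "500" "Apr 13, 2019" "250")

# Build one sed expression per field: s/#field#/value/g
subs=()
for i in "${!varname[@]}"; do
  subs+=(-e "s/#${varname[$i]}#/${value[$i]}/g")
done

# A one-line stand-in for the letter template.
template='#date#: Dear #name#, thanks for $#amount#. Even $#suggested# helps.'
merged=$(printf '%s\n' "$template" | sed "${subs[@]}")
echo "$merged"
```

This simple version assumes no value contains a slash or an ampersand, since either would need escaping before being placed in a sed replacement.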

Looping More Than Once

The biggest bug that's still in the script at this point is that although the donors source list has more than one donor listed, the script actually only ever shows results for that first donor and then quits.

To debug this part, let's look at just the key lines in the main loop:


while IFS=',' read -r f1 f2 f3 f4 f5 f6 f7
do
  if [ $lines -eq 1 ] ; then # field names
    # grab variable names
    declare -a varname=("$f1" "$f2" "$f3" "$f4" "$f5"
        "$f6" "$f7" "date" "suggested")
  else # process fields
    . . .
    echo "------------------------"
    exec $sed "$SUBS" $inputfile

  fi
done < "$datafile"

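Can you see the problem here? In a burst of enthusiasm for efficient coding and fast execution, the script actually commits a sort of digital seppuku with an exec call instead of just running the sed and continuing the loop. Because exec replaces the running shell with sed, there's no script left to come back to once sed finishes the first letter.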

Oops. My bad!
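The effect is easy to reproduce in isolation. In this sketch each loop runs inside a command substitution, so exec ends only that subshell rather than the shell running the demonstration; the variable names are just for illustration.

```shell
#!/usr/bin/env bash
# With exec, the subshell is replaced by echo on the first pass,
# so the loop never reaches its second iteration.
with_exec=$(for n in 1 2 3; do exec echo "letter $n"; done)

# Without exec, echo runs and control returns to the loop each time.
without_exec=$(for n in 1 2 3; do echo "letter $n"; done)

echo "with exec:"
echo "$with_exec"
echo "without exec:"
echo "$without_exec"
```

The first loop produces a single line of output, while the second produces all three.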

The solution is simply to remove the word exec from the loop, and it suddenly works exactly as desired. The problem then is how do you split out all the individual letters? Having it all stream out as one long sequence of text is rather useless.

Creating Separate Output Files

There are a number of possible solutions, but I'm going to create individual files based on the donor's name. Since that value is $f1 once the data has been parsed, this is easy:


# quote the expansions so names with spaces survive intact
outfile="$(echo "$f1" | sed 's/ /-/g')-letter.txt"
echo "Letter for $f1. Output = $outfile"
$sed "$SUBS" "$inputfile" > "$outfile"

You can see that the outfile value is composed by replacing all spaces with dashes, and the subsequent echo statement offers a status output. Finally, the actual sed invocation now eschews the evil exec call (okay, it's not evil) and adds an output redirect.
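As an aside, bash can do the same space-to-dash replacement without calling sed, using parameter expansion. Here's a minimal sketch; letter_name is a hypothetical helper, not something from the script.

```shell
#!/usr/bin/env bash
# letter_name: hypothetical helper that turns a donor name into a filename.
letter_name() {
  local donor=$1
  # ${donor// /-} replaces every space in $donor with a dash.
  echo "${donor// /-}-letter.txt"
}

first=$(letter_name "Eldon Tyrell")
second=$(letter_name "Rachel")

echo "$first"
echo "$second"
```

This saves a subprocess per letter, though it only works in bash and similar shells, not in a strictly POSIX sh.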

Here's the source donor file:


$ cat donors.txt
name,first,amount,month,state
Eldon Tyrell,Eldon,500,July,California
Rachel,Rachel,100,March,New York
Roy Batty,Roy,50,January,Washington

And, here's what happens when the script is run:


$ bash bulkmail-subs.sh
Letter for Eldon Tyrell. Output = Eldon-Tyrell-letter.txt
Letter for Rachel. Output = Rachel-letter.txt
Letter for Roy Batty. Output = Roy-Batty-letter.txt

Great. Now, what about one of those letters? Let's see what you'd be sending that rich head of industry, Eldon Tyrell:


$ cat Eldon-Tyrell-letter.txt
Apr 13, 2019
Dear Eldon Tyrell, I wanted to start by again thanking you
for your generous donation of $500 in July. We couldn't do
our work without support from humans like you, Eldon.
This year we're looking at some unexpected expenses,
particularly in Sector 5, which encompasses California, as
you know. I'm hoping you can start the year with an
additional contribution? Even $250 would be tremendously
helpful.
Thanks for your ongoing support.
Rick Deckard
Society for the Prevention of Cruelty to Replicants

Solved—and neatly too. Now, what would you do differently or add to make this script more useful? Without vast overkill, of course.

In my next article, I plan to take an entirely different direction. I'm not sure what, but I'll come up with something.

Dave Taylor has been hacking shell scripts on UNIX and Linux systems for a really long time. He's the author of Learning Unix for Mac OS X and Wicked Cool Shell Scripts. You can find him on Twitter as @DaveTaylor, and you can reach him through his tech Q&A site: Ask Dave Taylor.
