BestPrac.Org
Stop Spam : Best Practice in Email
Spam Prevention and Eradication
Spam Bots - and how to avoid them : Part 1
(Released - January, 2003. Updated - May 2008.)
Spam bots, also known as email harvesters, email address
harvesters, crawlers or spiders, are quite probably the single most
insidious manner by which spammers collect the email addresses of
innocent victims.
In brief, most websites have at least one email address
viewable to the public, even if just as a contact address for
information or enquiries. Many websites show many email addresses - not
just their own, but also those of visitors to their sites. Examples of
these include message boards, guest books, archives of e-zine
discussion lists, and so forth.
It is not uncommon for a website to display no email address
publicly, yet for one to be easily found in the HTML source code of a
web page. Many "form mailing" types of CGI scripts, for example, put
the reply address in the (not so) "hidden" fields in the HTML coding on
a web page which calls the CGI script.
Spam bots are software used by spammers to automatically
"crawl" the web (and newsgroups, chat rooms, IRC, instant messenger and
other contact databases) and locate any and every email address they
can find. They then record all these harvested email addresses into a
database to be used for plying the evil trade of the spammer.
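To see what such a crawler can pick up from your own pages, it helps to run the same kind of pattern match over your HTML source yourself. The following is only a rough sketch for auditing a page: the function name `harvestAddresses`, the simplified address pattern and the sample form markup are our own assumptions, not taken from any particular spam bot.

```javascript
// Simplified address pattern (an assumption for illustration, not a full RFC 5322 parser).
const ADDRESS_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return every unique address-like string found anywhere in the raw HTML source,
// including attribute values that are never shown to a human visitor.
function harvestAddresses(html) {
  const matches = html.match(ADDRESS_PATTERN) || [];
  return [...new Set(matches)];
}

// Hypothetical page fragment: a "hidden" form field and a mailto: link.
const page = `
<form action="/cgi-bin/formmail.cgi" method="post">
  <input type="hidden" name="recipient" value="example@example.com">
  <input type="text" name="message">
</form>
<a href="mailto:webmaster@example.com">Contact us</a>`;

console.log(harvestAddresses(page));
```

Both addresses are found, even though only one of them is visible in the browser. Any address that turns up when you run a check like this against your own source is an address a spam bot can collect.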
How to stop spam bots finding your email address
- First of all, and sometimes easier said than done, do
not put any email addresses on your website. Let us consider a few
practical solutions, and what to do where this is not entirely
possible:
HTML Forms & CGI: For contact
purposes, do not use "mailto:"
tags. Use an HTML form (such as we have on this website at http://www.bestprac.org/contact.shtml),
being sure that the return email address used to send you the form
contents does not appear in the
HTML code, whether or not it is "publicly viewable". Your CGI script
should contain the response email address - not the HTML. There are
many such scripts available either free or for a fee online. One such
very simple freeware script can be found at: http://scripts.cgi101.com/
and is called "mail-form.cgi" (not to be confused with
"generiform.cgi", located on the same page, and which does not offer
the same protection against spam bots that "mail-form.cgi" does.)
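The principle of keeping the response address in the script rather than in the HTML can be sketched as follows. This is an illustration only, not the code of "mail-form.cgi": the names `RECIPIENT` and `buildContactMessage` and the form field names are hypothetical, and the sketch only assembles the message, leaving the actual sending to whatever mail facility your server provides.

```javascript
// Assumption: the recipient address lives only in server-side code or configuration,
// so it never appears in any page sent to the browser.
const RECIPIENT = "example@example.com";

// Build the outgoing message from the submitted form fields.
// Any "recipient" value supplied by the form is deliberately ignored.
function buildContactMessage(fields) {
  // Collapse line breaks in values that end up in mail headers.
  const clean = (value) => String(value || "").replace(/[\r\n]+/g, " ").trim();
  return {
    to: RECIPIENT,
    replyTo: clean(fields.email),
    subject: "Website enquiry from " + clean(fields.name),
    body: String(fields.message || ""),
  };
}

// Hypothetical submission, including a field a spam bot or abuser might add.
const message = buildContactMessage({
  name: "A Visitor",
  email: "visitor@example.org",
  message: "Please send me your price list.",
  recipient: "someone-else@example.net",
});
console.log(message.to);
```

Because the destination is fixed in the script, the HTML form needs no address at all, hidden or otherwise.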
Graphic Image: Many web-marketing
experts believe that adding a clearly visible actual email address is
preferable to an HTML Contact Form & CGI, because supposedly it
gives greater comfort to the visitor to the website that the website is
legitimate, and easily contactable. While disagreeing with this
unproven and naive assumption, we uphold the right of people to follow
such advice if they wish. (Furthermore, a number of 3rd Party Credit
Card processing services make it a condition of use that an email address
for support purposes be prominently displayed on your page.) It can
still be done without exposing any email address in spam-bot readable
format. Simply make a small graphics file in GIF format. Size may
only need to be about 6cm by 0.5cm. Give your GIF a transparent
background. Then, make a text-image of your email address, underlined
and in the colour of a hyperlink. Place the finished GIF on your page
where you wish your email address to appear. Now, your email address is
clearly visible to the human eye, but because it is a graphic image it
cannot be read by spam bots.
The following is an example:
[Image: an email address at example.com shown as a graphic]
(Throughout this article, we are
using the domain "example.com", as it has been reserved by the Internet
Assigned Numbers Authority for the purpose of providing examples, as
described in RFC 2606 Section 3.)
(If you wish, you can then hyperlink this graphic
depiction of your email address to your HTML contact form page, as
discussed in the previous paragraph.)
ASCII or JavaScript Encoding:
Commonly recommended, these methods are beginning to lose their
effectiveness. The unfortunately misused creative genius of spam bot
software programmers is beginning to show in some of the newest spam
bots on the market, which can defeat these protection methods. Still, ASCII or
JavaScript encoding of email addresses continues to provide protection
against the majority of spam bots in use today.
One of the weaknesses of JavaScript encoding is that not
all of the visitors to your website will have JavaScript enabled in
their web browsers. Therefore, this small minority may not be able to
see your email address or even email you at all. This is not a problem
with ASCII encoding, though ASCII encoding is easier for the most
modern and sophisticated spam bots to decode. The JavaScript encoders
offer the stronger level of protection.
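Using ASCII encoding (HTML numeric character references), an email
address such as example@example.com will look like this in the HTML source:
&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109;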
Visitors' web browsers will automatically interpret that
encoding and present it on the web page in the normal
"example@example.com" format, though few spam bots are intelligent
enough to recognise and convert it.
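If you would rather not rely on an online tool, the conversion is simple enough to do yourself. The following is a minimal sketch, assuming decimal character references are wanted; the function name `toNumericEntities` is our own.

```javascript
// Convert every character of the input into a decimal HTML character reference,
// e.g. "@" (character code 64) becomes "&#64;".
function toNumericEntities(text) {
  return Array.from(text, (ch) => "&#" + ch.codePointAt(0) + ";").join("");
}

// Paste the printed string into your HTML source in place of the plain address.
console.log(toNumericEntities("example@example.com"));
```

The same function can be applied to a whole "mailto:" link target, since it encodes any text it is given.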
There are a number of free online
tools for ASCII encoding. Some examples are
provided by:
(We recommend converting not just the email address
itself, but also the "mailto:"
tag, or even the full "<A
HREF="mailto:example@example.com">send email</A>"
anchor syntax, for just a little bit of extra safety.)
A small word of warning about using ASCII encoding if
you use a graphical HTML editor to design your webpages: many of these
design packages will see the ASCII encoding, and automatically decipher
it and save it in "standard" format. Check whether your HTML design
software does this prior to uploading the pages to your server. If it
does automatically make this conversion against your wishes, just prior
to uploading the page to your server you should open the page in a
plain text editor such as Notepad or Wordpad or similar. Remove the
'standard' format email address and manually replace it with your ASCII
encoding. Save it, then upload it to your server straight away, without
opening it again in your HTML design software.
A number of free online tools (or
downloadable software) for JavaScript encoding use
slightly different techniques to achieve the same purpose. The best of
them combine both JavaScript and ASCII encoding. Some examples are
provided by:
An example of what a spam bot sees when it crawls an
email address such as example@example.com
encoded by the latter tool on this list, by Tim Williams (quite
probably the strongest email address encoder available), is:
<script type="text/javascript" language="javascript">
<!--
// eMail Obfuscator Script 2.1 by Tim Williams - freeware
{
  coded = "MYQEBFM@MYQEBFM.OCE"
  cipher = "aZbYcXdWeVfUgThSiRjQkPlOmNnMoLpKqJrIsHtGuFvEwDxCyBzA1234567890"
  shift = coded.length
  link = ""
  for (i = 0; i < coded.length; i++) {
    if (cipher.indexOf(coded.charAt(i)) == -1) {
      // Characters not in the cipher ("@" and ".") are copied unchanged
      ltr = coded.charAt(i)
      link += (ltr)
    }
    else {
      // Shift each cipher character back by the length of the coded string
      ltr = (cipher.indexOf(coded.charAt(i)) - shift + cipher.length) % cipher.length
      link += (cipher.charAt(ltr))
    }
  }
  document.write("<a href='mailto:" + link + "'>Email me!</a>")
}
//-->
</script>
<noscript>
<p>Sorry, but a Javascript-enabled browser is required to email me.</p>
</noscript>
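The script above decodes by shifting each character back through the cipher string by the length of the coded text, so the coded text must have been produced by shifting forward by the same amount. The following sketch of that encoding step is our own reconstruction from the decoding loop, not code taken from the tool itself; the function name `encodeAddress` is hypothetical.

```javascript
// Same cipher alphabet as in the decoding script above.
const CIPHER = "aZbYcXdWeVfUgThSiRjQkPlOmNnMoLpKqJrIsHtGuFvEwDxCyBzA1234567890";

// Shift every character found in the cipher forward by the length of the address.
// Characters not in the cipher (such as "@" and ".") are left unchanged.
function encodeAddress(address) {
  const shift = address.length;
  let coded = "";
  for (const ch of address) {
    const index = CIPHER.indexOf(ch);
    coded += index === -1 ? ch : CIPHER.charAt((index + shift) % CIPHER.length);
  }
  return coded;
}

console.log(encodeAddress("example@example.com")); // MYQEBFM@MYQEBFM.OCE
```

Because the shift depends on the length of the address, two addresses of different lengths are scrambled differently, even though the cipher string stays the same.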
It bears repeating, though, that although ASCII or
JavaScript encoding increases the level of protection an email address
on your website has against spam bots, neither type can guarantee
complete immunity, as the most modern spam bots are increasingly
programmed to decode these techniques.
Continue
on to Part 2.....