
Re: Anti-Spambot: what algorithm should be used?



[email protected] (Igor Chudov @ home) writes:

> Hi,
>
> As we all know, there exist certain programs, called "spambots", whose
> task is to post various messages to many newsgroups simultaneously.
>
> Besides their posting functionality, certain spambot programs take
> special care to make their spams undetectable by anti-spambots. In
> particular, they can be programmed to modify certain fields or the
> message text itself in such a way that these messages would not look
> unique, but would still carry the same content.
>
> We can generalize the things that spambots might do and suggest that
> a general spambot would do the following to avoid spam detection:
>
> 1) modify all header fields, for example From: Subject:, etc, with
> each spam posting.
> 2) Follow up to other articles posted to newsgroups so that the
> spams would look like genuine unique messages to the readers, and
> defeat spam detectors

Right - if you're following up in a newsgroup, you can just re-use
the subject. You can also use the "From:" and other headers from
one of the regular posters.

> 3) Randomly altering the spam message proper such that blindly comparing

If you're following up, then rather than being random, you can tailor
your response based on the message you're following up on - kind of
like Eliza or better. :-)

> them would be futile. Such alterations may include interchanging certain
> synonymous words, adding spaces or punctuation, or simply changing line
> wrapping length.
> 4) Swapping paragraphs and phrases.
> 5) Add random headers, footers & fillings (like ASCII art)
>
> I am sure that the readers can come up with more examples.
>
> The task (or the problem) is:
>
> a) come up with a reasonable set of assumptions of what such a
> spambot would or could do
> b) Create an algorithm which would print Message-IDs of messages that
> have identical content, so that most if not all of the judgments of
> this algorithm would be correct, assuming that the spambot operates
> within the limits of a).
>
> A message can be thought of as a sequence of words, phrases and
> paragraphs, as well as a set of header lines.
>
> Path: header field may be specially treated.

I don't think it's possible.
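
To make the difficulty concrete, here is a minimal sketch of the obvious
approach: compare message bodies by overlapping word triples ("shingles")
and report pairs of Message-IDs whose sets overlap heavily (Jaccard
similarity). This is a standard near-duplicate technique brought in for
illustration, not something proposed in the original post. The function
names, the shingle size k=3 and the 0.5 threshold are all assumptions
picked for the example.

```python
import re
from itertools import combinations

def shingles(text, k=3):
    """Set of k-word shingles, ignoring case, punctuation and line wrapping."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as dissimilar."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def likely_duplicates(messages, threshold=0.5, k=3):
    """messages maps Message-ID -> body text (hypothetical input format).
    Returns pairs of Message-IDs whose bodies look like the same text."""
    sets = {mid: shingles(body, k) for mid, body in messages.items()}
    return [(m1, m2) for m1, m2 in combinations(sorted(sets), 2)
            if jaccard(sets[m1], sets[m2]) >= threshold]
```

Something like this shrugs off changed line wrapping and added spaces or
punctuation, and swapped paragraphs only cost the few shingles that span
a paragraph boundary. Each substituted synonym, though, breaks up to k
shingles, random filler text drags the score down, and a follow-up
tailored to the message it answers shares almost nothing with its
siblings, so no threshold separates it from a genuine reply.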

---

Dr. Dimitri Vulis KOTM
Brighton Beach Boardwalk BBS, Forest Hills, N.Y.: +1-718-261-2013, 14.4Kbps