From: Anthony Appleyard <XPUM04@prime-a.central-services.umist.ac.uk>
To: DAVIDF@cs.heriot-watt.ac.uk
Date:         Tue, 19 Jun 90 09:41:28 BST 
Message-Id:   <$TGWFCWKBBCVB at UMPA>
Subject:      Here is Virus-L vol 0 #0914


Virus-L Digest Wed, 14 Sep 88, Volume 0 : Issue #0914

Today's Topics

another amusing re-print from RISKS on virus existance
CRCs
Re: CRCs
(Mac) HyperCard Virus Warning!
Re: CRCs
RE: Re: CRCs
Virus in bad sectors
crc polynomial bit twiddling
Re: (Mac) HyperCard Virus Warning!
Student paper
crc polynomial bit twiddling - reply to J. Ogata
crcs
RE: crc polynomial bit twiddling

------------------------------

Date:         Wed, 14 Sep 88 09:24:29 EDT
From:         "Kenneth R. van Wyk" <LUKEN@LEHIIBM1>
Subject:      another amusing re-print from RISKS on virus existance

Here's a followup on a message which I posted a couple days ago;  it,  too,
is a re-print from the RISKS forum. Ken

Date: 12 Sep 88 22:35:54 GMT
From: ddb%ns%bungia@umn-cs.cs.umn.edu (David Dyer-Bennet)
Subject: ``MS-DOS "virus" programs do not exist.'' (Re: RISKS-7.49)

In RISKS-7.49, Mark Moore writes about  a  public-domain  software  catalog
containing an article claiming that MS-DOS "virus" programs do not exist. I
view  this  with  a  certain  glee,  because  for  several  years I've been
attempting to follow up each story about viruses I hear; so far, the  story
has  either faded into the distance, or I have been told that they have the
virus isolated, but won't show it to me. While I accept that people running
academic computer centers,  in  particular,  have  some  justification  for
taking a paranoid attitude (though I wasn't approaching them from within as
a  student),  I've  been  telling  people for some time that by covering up
viruses the way they do, they are going to lead people to believe it's  all
a  myth, which in the long run is bad. So let me just say, "I told you so."
to those who've been concealing the evidence.
-- David Dyer-Bennet, Terrabit Software

    ...!{rutgers!dayton | amdahl!ems | uunet!rosevax}!umn-cs!ns!ddb
    ddb@Lynx.MN.Org, ...{amdahl,hpda}!bungia!viper!ddb
    Fidonet 1:282/341.0, (612) 721-8967 hst/2400/1200/300

Kenneth R. van Wyk                   Calvin: Ever consider the end of the
User Services Senior Consultant        world as we know it?
Lehigh University Computing Center   Hobbes: You mean nuclear war?
Internet: <luken@Spot.CC.Lehigh.EDU> Calvin: I think Mom was referring to if I
BITNET:   <LUKEN@LEHIIBM1>             let the air out of the car tires again.

--------------------

Date:         Wed, 14 Sep 88 09:46:13 EDT
From:         "David M. Chess" <CHESS@YKTVMV>
Subject:      CRCs

I *think* that, since a CRC is linear, you only need to be able to  twiddle
something like 2n bits to fox an n-bit CRC, if you know what the polynomial
in  use  is.  Figuring out exactly what the proper twiddling is is the hard
part. And foxing two n-bit CRCs requires just twice  as  many  bits,  since
using  two  n-bit  CRCs is Just Like using one 2n-bit CRC (for most intents
and purposes). Real mathematicians can correct me if I'm wrong  (in  either
direction)!
DC

--------------------

Date:         Wed, 14 Sep 88 10:20:35 CDT
From:         Len Levine <len@EVAX.MILW.WISC.EDU>
Subject:      Re: CRCs
In-Reply-To:  Message from "David M. Chess" of Sep 14, 88 at 9:46 am

>I *think* that, since a CRC is linear, you only need to be able
>to twiddle something like 2n bits to fox an n-bit CRC, if you
>know what the polynomial in use is.  Figuring out exactly what
>the proper twiddling is is the hard part.  And foxing two n-bit
>CRCs requires just twice as many bits, since using two n-bit CRCs
>is Just Like using one 2n-bit CRC (for most intents and purposes).
>Real mathematicians can correct me if I'm wrong (in either direction)!
>DC

I agree with David C. It might take an extra byte or two to make  it  work,
but  the agreement with n CRCs can be done fairly economically. What if the
distributor were to distribute the software through one channel,  and  then
through  another  channel,  at a later date, distribute the CRC equation or
equations used. Might that help?

+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
| Leonard P. Levine               e-mail len@evax.milw.wisc.edu |
| Professor, Computer Science             Office (414) 229-5170 |
| University of Wisconsin-Milwaukee       Home   (414) 962-4719 |
| Milwaukee, WI 53201 U.S.A.              Modem  (414) 962-6228 |
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +

--------------------

Date:         Wed, 14 Sep 88 09:59:52 EDT
From:         Joe McMahon <XRJDM@SCFVM>
Subject:      (Mac) HyperCard Virus Warning!

All HyperCard user, beware! A misguided  (though  well-meaning,  I'm  sure)
individual  on  Delphi  has  *** PUBLISHED THE SOURCE *** for the "Dukakis"
HyperCard virus! Not only that, but  it  was  distributed  to  in  the  Mac
Digest?!?!?

Prepare for an onslaught of HyperTalk viruses, folks. It's  just  too  darn
simple  and (worse yet) well commented. I will be publishing the source for
a "set" handler you can install in your Home stack that will help trap this
particular virus. How to make sure that you don't get infected by a  stack?

- Refuse to use any stack which does not allow you to change the userLevel
  or is password-protected.
- Read the handlers in any new stack, watching out for "set the script of..."

I will see if I can cook up a stack which will analyze a new stack and show
you any scripts which attempt to modify other scripts. It will most  likely
get a number of false alarms, but it will be better than nothing.
I'm not going to mention a couple of ways I can think of to get round this...
- - Joe  M.

--------------------

Date:         Wed, 14 Sep 88 13:07:02 EDT
From:         ENGNBSC@BUACCA
Subject:      Re: CRCs
In-Reply-To:  Message of Wed, 14 Sep 88 10:20:35 CDT

I don't think that published CRC will be the answer -  I've  seen  software
appear  on  bulletin boards within a few days of it's release; as soon as a
new 'clean' CRC is published, anyone who really wanted to infect  it  could
have a new mutation set up.
Bruce Howells, engnbsc@buacca.bu.edu, engnbsc@buacca.BITNet

--------------------

Date:         Wed, 14 Sep 88 14:46:00 EST
From:         "Jerry Leichter (LEICHTER-JERRY@CS.YALE.EDU)" <LEICHTER@YALEVMS>
Subject:      RE: Re: CRCs

The mathematics of CRC's is very simple. The basic idea can  be  seen  more
easily  by using a much simpler algorithm: Consider an input file as a very
long binary number B. Choose  a  some  fixed  number  N,  and  compute  the
remainder  of  B divided by N. The result is between 0 and N-1, and is your
checksum.

Suppose I have a modified file which, viewed as a long binary  number,  has
value  B'.  If  remdr(B',N) = remdr(B,N), my modified program will have the
same checksum as the original, and will fool you.  I  can  ensure  that  by
computing  B''  =  B'  -  remdr(B',N)  +  remdr(B,N);  then remdr(B'',N) is
"correct". Of course, in the process I may have  screwed  up  the  program:
When  viewed  as  a  set  of  bits, B'' may not work correctly. However, in
general B' and B'' differ only in the last couple of bits,  so  I  may  get
away with this. In addition, B'' + k*N has the same remainder mod N as B'',
for  any  integer k, so I can try to fiddle things around to find a version
that DOES work.

Now, no one actually uses remainders of programs viewed as  large  integers
for  checksums because they are very expensive to compute. CRC instead uses
a related idea: Instead of viewing the bits in the program as specifying  a
binary  number,  think  of  them  as  giving  the  coefficients  of  a huge
polynomial P: The first bit is the coefficient constant  term,  the  second
the  x  term,  the  third  the  x~2 term, and so on. Instead of a number N,
choose a polynomial Q. As before, compute the remainer of P divided  by  Q.
The  result  is  the  check- sum. The checksum is a polynomial of degree at
most deg(Q)-1.

Now, you COULD view the polynomial as having real numbers as coefficients -
where the input coefficients happen all to be 0 or 1 - but then  it's  just
as  hard  to  compute  the  result  as to do the integer long division, and
besides you don't get some advantages I'll mention in  a  moment.  Instead,
you  view  the coefficients of P as being in the integers mod 2, which is a
perfectly good field to do arithmetic in. You must then chose  Q  to  be  a
polynomial  over  the same field - so all its coefficients are 0 or 1 - and
the same with the remainder (so the remainder is  just  a  bunch  of  bits,
too).

Arithmetic for integers mod 2 is very simple: Addition is the same as  XOR,
and  multiplication is the same as AND. It turns out that you can implement
this very simply as a feedback shift register. What this comes down  to  is
imagining   your   high-school  long-division  algorithm  for  polynomials,
simplified in two ways: The arithmetic is XOR's and AND's, and the quotient
doesn't matter - all you need is the remainder. The first of  these  allows
for  very  simple  circuitry. The second means you can toss away high-order
terms as you finish with them: You only need to work on a segments as  long
as  Q  is.  Further, the process just proceeds right to left, never looking
ahead or back; so you can checksum a stream of bits "on the  fly",  as  you
send or receive them.

These properties make CRC's very easy to compute. In addition, they have  a
nice  property  which  is  easy  to  see  from  just thinking about the way
polynomials work: If you flip some bits of P, but no two bits you flip  are
further  apart  than  the  degree  of Q (plus 1), then the remainder ALWAYS
changes. What this means is that CRC's are very good at  detecting  "burst"
errors:  Groups  of  clobbered bits close to each other. The errors seen on
communications lines are usually the results of short "noise  events",  and
are thus burst errors; so CRC's are an excellent choice for detecting them.
Also,  errors  on disks are usually the result of physical damage to a tiny
piece of the disk surface, so they, too, are burst errors. Same for  tapes.
That's why CRC's are so widely used.

The  burst  error  detection  capability  is  quite  irrelevent  for   many
applications.  For  example,  I've seen people use CRC as hash functions in
hash tables. This is a waste of effort; there are much simpler,  easier  to
compute  functions  that  will  work  just  as  well  in  this application.

Similarly, CRC's as  such  have  little  to  recommend  them  as  signature
functions.  If  you  think  about  how they work, you'll see that fooling a
known CRC is very easy - even easier than in the case of the "remainder mod
N" example, since you can produce any CRC you want by  modifying  just  the
trailing  deg(Q)+1  bits  of the input. What Rabin pointed out, however, is
that if you DON'T know the polynomial, then your chance of making just  the
right  modification  is  essentially  the  same  as  that  of  guessing the
polynomial, which can be made  small  by  chosing  the  polynomial  from  a
suitably large set of candidates.
                                                        -- Jerry

--------------------

Date:         Wed, 14 Sep 88 13:52:20 CDT
From:         Kevin Trojanowski <troj@UMAXC.WEEG.UIOWA.EDU>
Subject:      Virus in bad sectors

Something I thought of on the way to work today, concerning the idea  of  a
virus  hiding in sectors marked as 'bad' in the FAT (or VTOC, or whatever).
Would it not be possible to write a device driver of sorts which checks all
sector reads against sectors marked as 'bad'  in  the  FAT?  Then,  if  the
sector  is  indeed  a  'bad'  sector,  return an error without allowing the
sector to be read at all. One hole I can think of in this is if the  author
of  the  virus  circumvents  it  by  doing  direct reads with the hardware,
avoiding DOS altogethor. But that seems a little more  in  depth  than  the
average (but not all) virus author would be willing to go.

-Kevin Trojanowski
 troj@umaxc.weeg.uiowa.edu
 fstkkvpg@uiamvs.bitnet (if Internet access isn't an option)

--------------------

Date:         Wed, 14 Sep 88 14:02:08 EDT
From:         me! Jefferson Ogata <OGATA@UMDD>
Subject:      crc polynomial bit twiddling

It all depends on the nature  of  the  polynomial  involved.  Consider  the
scheme:  s(i+1)  =  s(i)  + (x[i] ** 3). That is, the CRC is the sum of all
cubes of bytes in the file (lower 16 bits). In most cases, it'll take a lot
more than 32 bits of twiddle factor to  hit  the  same  CRC.  Approach  the
problem  in  the  following  fashion: the original file has some CRC m; the
virus code has the CRC n. In this CRC scheme, the CRC of the combined file,
i.e. after infection, is (m + n) mod (2 ** 16). There are  two  cases:  one
where  m + n >= 2 ** 16, and one where m + n < 2 ** 16. Consider the latter
case. It is desired that (m + n) = m. For this to happen,  the  virus  must
have a CRC of zero. Consider the other case. Here we want (m + n) mod (2 **
16)  = m. As in the first case, the virus must have a CRC of zero. So it is
clear that the virus must have a CRC of zero in order for it to escape  CRC
detection.  In  order  to  get a zero CRC, we must pick bytes to add to the
virus that drive all the one-bits out of the CRC sum. This can be  done  as
follows (C algorithm):

   while (crc () mod (2 ** 3)) addbyte (1 ** 3);
   while (crc () mod (4 ** 3)) addbyte (2 ** 3);
   while (crc () mod (8 ** 3)) addbyte (4 ** 3);     etc.

Note that we cannot simply compute the sum of these  bytes  and  add  that,
because for any x and y, ((x + y) ** 3) <> (x ** 3) + (y ** 3). If we could
find a sequence of bytes that has the same CRC as the sequence generated by
the above algorithm, we could use that sequence, but doing so would be VERY
difficult.

In order to arrive at a CRC of zero with only 32 bits of twiddling, the CRC
n of the untwiddled virus must be such that (n + a + b + c + d) mod  (2  **
16)  =  0, where a, b, c, and d are four perfect cubes of numbers less that
256. While any positive integer can be expressed as the sum of four perfect
squares, it will usually require more than four perfect cubes.

Of course, this is a case of a cubic CRC polynomial.  Consider  the  linear
scheme:  s(i+1)  =  s(i) + x[i]. This is merely the sum of all bytes in the
file (lower 16 bits). Even this scheme requires that you add about 2  **  8
bytes  in some cases. Suppose the untwiddled virus has a CRC of one. To hit
zero, you must add bytes whose sum is 65535. You'll need 257  bytes  to  do
this. (65535 / 255 = 257)

- Jeff Ogata

--------------------

Date:         Wed, 14 Sep 88 15:25:11 EDT
From:         ENGNBSC@BUACCA
Subject:      Re: (Mac) HyperCard Virus Warning!
In-Reply-To:  Message of Wed, 14 Sep 88 09:59:52 EDT

Well, I guess we finally have "proof" of the existence of a virus...
Bruce Howells, engnbsc@buacca.bu.edu / engnbsc@buacca.BITNet

--------------------

Date:         Wed, 14 Sep 88 15:42:33 EDT
From:         portal!cup.portal.com!Dan-Hankins@SUN.COM
Subject:      Student paper

The paper has one flaw in it that I can see. It claims that  Trojan  horses
carry  viruses  in  terms that imply that that's *all* Trojans do. This can
clearly not be the case, as I personally know of many Trojans that  do  not
carry  self-replicating programs. Additionally, the term Trojan Horse as it
is used in the computer world  was  coined  long  before  the  term  virus.

Dan Hankins

"These opinions are mine, and mine alone.  They are not for sale.  They may,
 however, be rented or leased at reasonable rates."

--------------------

Date:         Wed, 14 Sep 88 16:52:35 EDT
From:         "David M. Chess" <CHESS@YKTVMV>
Subject:      crc polynomial bit twiddling - reply to J. Ogata

By "CRC", I meant the  usual  thing  that's  called  a  "cyclic  redundancy
check",  described  so  well  just  now  in  Jerry  Leichter's posting: the
remainder when you treat both the data and the  polynomial  as  polynomials
over  the field [0,1], and determine the remainder when dividing the one by
the other. I don't think the examples that you give are CRCs in that sense?
The examples you give are, I think, nonlinear in places that a real CRC  is
linear,  and  therefore may be harder to fox with control of a small number
of bits. I intended my posting only to apply to real CRCs, not to signature
algorithms in general.
DC

--------------------

Date:         Wed, 14 Sep 88 17:36:55 EDT
From:         me! Jefferson Ogata <OGATA@UMDD>
Subject:      crcs

It is true that the schemes I was discussing are not true CRCs. They  would
probably be termed 'checksums'. - Jeff Ogata

--------------------

Date:         Wed, 14 Sep 88 18:04:00 EST
From:         "Jerry Leichter (LEICHTER-JERRY@CS.YALE.EDU)" <LEICHTER@YALEVMS>
Subject:      RE: crc polynomial bit twiddling

Jefferson Ogata discusses checksums of the form s(i+1) = s(i) +  p(b[i+1]),
where  b[i]  is  the  i'th  byte  of  the text to be checksummed and p is a
polynomial (actually, he only suggest p(x) = x~k for some k).

a) This is a checksum based on polynomials, but it is NOT what "CRC" refers
to. "CRC" has an established meaning  in  the  computer  science/electrical
engineering/coding  theory  fields.  It  means  a  checksum computed as the
remainder mod a polynomial over Z mod 2.

b) Ogata suggests that even in the simple case p(x) = x, with a 16-bit  sum
of  8-bit  unsigned  bytes as the checksum, producing a particular checksum
might require adding 65535 bytes to the file  (e.g.,  if  the  file  had  a
checksum  of  1  and  the  checksum  you  needed to fake happened to be 0).

This is true but again illustrates the dangers  of  not  understanding  the
model  you  are  working  with.  If  the  goal  is to prevent the wholesale
replacement of the original file  by  another  distinct  but  pre-specified
file,  such a checksum might be worthwhile. However, your opponent normally
CHANGES the original file, he doesn't replace it. If your opponent needs to
change k bytes of your file, he can always hide  the  changes  by  changing
some  other  k bytes: For each byte he added d to, he subtracts d from some
other byte in the file. Note that this doesn't change  the  length  of  the
file at all.

The same basic attack works for any polynomial p with the given scheme.

By the way, any such scheme has a much more profound weakness:  It  is  not
sensitive  to re-ordering of the bytes in the file. I can probably find all
the bytes I need to construct my virus SOMEWHERE in your  program.  I  need
only  move  them  to the place I want them. (Of course, if it's a data file
like a checking account record you are talking about, changing $109 to $901
is also rather effective. :-) )

Sure,  I'll  break  something  in  your  program  using  either  of   these
techniques, but by the time you happen to use what I broke it will probably
be way too late to help you.

Changing Ogata's algorithm to:        s(i+1) = p(s(i)) + b[i]

eliminates the rearrangement hole, and generally makes the previous "modify
bytes in pairs" attack impossible, but it has other vulnerabilities.  (BTW,
with  p(x)  =  kx  for  a  suitable  k,  this  is my favorite algorithm for
generating a hash code in hash table algorithms.  I  ignore  overflows  and
chose  k so that r(i+1) = kr(i) is a good linear congruential random number
generator mod 2~n, assuming the arithmetic uses n bits. This also  makes  a
fairly  good  checksum  for  detecting transmission errors - not as good as
CRC, but very, very easy to compute quickly  in  a  tiny  piece  of  easily
portable code.)

Remember, the security of a technique is not based on how well  it  resists
the  attacks YOU thought of, it is based on how well it resists the attacks
your OPPONENT thinks of! -- Jerry

--------------------

*** end of Virus-L issue ***