|
							- This is a patched version of zlib, modified to use
 
- Pentium-Pro-optimized assembly code in the deflation algorithm. The
 
- files changed/added by this patch are:
 
- README.686
 
- match.S
 
- The speedup that this patch provides varies, depending on whether the
 
- compiler used to build the original version of zlib falls afoul of the
 
- PPro's speed traps. My own tests show a speedup of around 10-20% at
 
- the default compression level, and 20-30% using -9, against a version
 
- compiled using gcc 2.7.2.3. Your mileage may vary.
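
One rough way to check such figures on your own machine is to time compression at the default level and at level 9, once against a stock zlib build and once against a patched one. The sketch below is only an illustration, not the method used for the numbers above. It uses Python's standard zlib module, which measures whichever zlib library the interpreter is linked against; the helper names time_compress and make_sample and the sample data are hypothetical.

```python
import time
import zlib


def time_compress(data: bytes, level: int, repeats: int = 3) -> float:
    """Return the best wall-clock time (seconds) to compress data at level."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        zlib.compress(data, level)
        best = min(best, time.perf_counter() - start)
    return best


def make_sample(size: int = 1 << 20) -> bytes:
    """Build a repetitive sample buffer (hypothetical test data)."""
    chunk = b"".join(b"line %d of sample text\n" % i for i in range(1000))
    return (chunk * (size // len(chunk) + 1))[:size]


if __name__ == "__main__":
    sample = make_sample()
    # Z_DEFAULT_COMPRESSION (-1) selects the library's default level.
    for level in (zlib.Z_DEFAULT_COMPRESSION, 9):
        print("level %2d: %.4f s" % (level, time_compress(sample, level)))
```

Real input files will give more representative timings than the synthetic buffer, and results will vary with the processor and compiler, as noted above.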
 
- Note that this code has been tailored for the PPro/PII in particular,
 
- and will not perform particularly well on a Pentium.
 
- If you are using an assembler other than GNU as, you will have to
 
- translate match.S to use your assembler's syntax. (Have fun.)
 
- Brian Raiter
 
- breadbox@muppetlabs.com
 
- April, 1998
 
- Added for zlib 1.1.3:
 
- The patches come from
 
- http://www.muppetlabs.com/~breadbox/software/assembly.html
 
- To compile zlib with this asm file, copy match.S to the zlib directory,
 
- then do:
 
- CFLAGS="-O3 -DASMV" ./configure
 
- make OBJA=match.o
 
- Update:
 
- I've been ignoring these assembly routines for years, believing that
 
- gcc's generated code had caught up with them sometime around gcc 2.95
 
- and the major rearchitecting of the Pentium 4. However, I recently
 
- learned that, despite what I believed, this code still has some life
 
- in it. On the Pentium 4 and AMD64 chips, it continues to run about 8%
 
- faster than the code produced by gcc 4.1.
 
- In acknowledgement of its continuing usefulness, I've altered the
 
- license to match that of the rest of zlib. Share and Enjoy!
 
- Brian Raiter
 
- breadbox@muppetlabs.com
 
- April, 2007
 
 
  |