|
Compression methods fall into two
broad categories: lossless and lossy.
Lossy compression shrinks a file by discarding
parts of the data whose loss does not
noticeably change the quality of the
information stored, at least as perceived
by a person. Lossy compression is most
commonly used for images, such as JPEG
photos, and for audio, such as MP3 files.
Data owners may choose to trade the integrity
of their original data and information for
more storage space, but storage infrastructure
operators rarely have the liberty of trading
quality for disk space, because the data given
to them is usually entrusted on the condition
that it is kept intact. For this reason, customers
often require lossless compression.
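
The difference between the two categories can be
illustrated with a minimal Python sketch. The sample
values are made up, and the "lossy" step is a toy
quantization chosen only for illustration; it is not
how JPEG or MP3 work. The lossless step uses the
standard zlib module.

```python
import zlib

# Hypothetical sample data: 8-bit readings from an imaginary sensor.
samples = bytes([12, 13, 13, 14, 200, 201, 203, 202] * 64)

# Lossless: decompression returns exactly the bytes that went in.
packed = zlib.compress(samples)
restored = zlib.decompress(packed)
lossless_ok = restored == samples

# Lossy (toy quantization): keep only the top 4 bits of each sample.
# The low 4 bits are discarded and cannot be recovered afterwards.
quantized = bytes(s & 0xF0 for s in samples)
lossy_ok = quantized == samples
max_error = max(abs(a - b) for a, b in zip(samples, quantized))

print("lossless round trip exact:", lossless_ok)
print("lossy round trip exact:", lossy_ok, "max error:", max_error)
```

The lossless path reproduces the input byte for byte,
while the quantized copy differs from the original by
a bounded error that can never be undone.
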
The oldest form of storage compression
is called traditional compression. It
works best on text files and other similar
sources. The compression engine analyses
a small segment of data at a time, searching
for repeated patterns that can be stored in
less space. The engine can be implemented
in hardware or software.
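
As a rough illustration of pattern-based compression,
the sketch below uses Python's zlib module, whose
DEFLATE algorithm searches a sliding window of recent
data for repeated strings. zlib is used here as a
stand-in for a traditional compression engine, not as
the engine any particular storage product uses, and
both inputs are invented for the example.

```python
import hashlib
import zlib

# Repetitive, text-like input: the same made-up log line many times.
text = b"2024-01-01 INFO backup job completed successfully\n" * 200

# Pattern-free input of the same length, built by chaining SHA-256
# digests so that the example is deterministic.
noise = b""
digest = b"seed"
while len(noise) < len(text):
    digest = hashlib.sha256(digest).digest()
    noise += digest
noise = noise[:len(text)]

# Ratio of compressed size to original size (smaller is better).
text_ratio = len(zlib.compress(text)) / len(text)
noise_ratio = len(zlib.compress(noise)) / len(noise)

print(f"text-like data: {text_ratio:.3f} of original size")
print(f"pattern-free data: {noise_ratio:.3f} of original size")
```

The text-like input shrinks to a small fraction of its
size because the engine keeps finding the same pattern,
while the pattern-free input stays at roughly its
original size.
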
When choosing between the two, almost
all cases favor hardware compression over
software compression. Using both at once,
however, is strongly advised against. Data
that has already been compressed has little
redundancy left, so a second pass will have
little to no effect on its size and will only
slow everything down.
Data deduplication is similar to
traditional compression, with the exception
that it operates on larger datasets,
eliminating duplicate chunks of data under its
control so that only one copy of each is
stored. The deduplicated data is often
then compressed using a traditional compression
technique. The space required to store
deduplicated data depends on the redundancy
of the data. Until recently, data deduplication
was available only at the file level. With
more processing power, smaller pieces
are being deduplicated, down to the byte level.
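
A minimal Python sketch of chunk-level deduplication
followed by traditional compression is shown here. The
fixed 4096-byte chunk size, the use of SHA-256 to
identify chunks, and the function names deduplicate and
restore are all assumptions made for illustration; real
products differ in how they split and identify chunks.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen for illustration

def deduplicate(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and store each unique chunk once."""
    store = {}   # chunk digest -> compressed chunk bytes
    recipe = []  # ordered digests needed to rebuild the data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        if digest not in store:
            # New chunk: compress it with a traditional technique and keep it.
            store[digest] = zlib.compress(chunk)
        recipe.append(digest)
    return store, recipe

def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(zlib.decompress(store[d]) for d in recipe)

# Made-up input: six chunks, of which only two are distinct.
block_a = bytes(range(256)) * 16
block_b = b"\x00" * 4096
data = block_a * 3 + block_b * 2 + block_a

store, recipe = deduplicate(data)
stored_bytes = sum(len(c) for c in store.values())

print("chunks in data:", len(recipe))
print("unique chunks stored:", len(store))
print("original bytes:", len(data), "stored bytes:", stored_bytes)
```

The savings come entirely from how often chunks repeat:
input with no duplicate chunks would store every chunk
and gain nothing from the deduplication step itself.
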
Data deduplication is gaining
momentum in modern computing. The
swell of data and information pooling
in every datacenter is making deduplication
the most important software development
to reach users of storage technology in a
decade. As more companies scramble to take
a corner of the market and capitalize on
the deduplication frenzy, more host software,
intelligent fabrics, and disk arrays
designed to implement new and sophisticated
data deduplication processes and compression
mechanisms are sure to emerge.
|