VirtualBox

Ignore:
Timestamp:
Jun 9, 2022 9:09:44 AM (3 years ago)
Author:
vboxsync
svn:sync-xref-src-repo-rev:
151769
Message:

libs/zlib: Upgrade to 1.2.12, bugref:8515

Location:
trunk/src/libs/zlib-1.2.12
Files:
6 edited

Legend:

Unmodified
Added
Removed
  • trunk/src/libs/zlib-1.2.12

    • Property svn:mergeinfo
      •  

        old new  
        2020/branches/dsen/gui3/src/libs/zlib-1.2.11:79645-79692
        2121/trunk/src/src/libs/zlib-1.2.11:92342
         22/vendor/zlib/1.2.12:151751
         23/vendor/zlib/current:150724-151750
  • trunk/src/libs/zlib-1.2.12/examples/README.examples

    r76163 r95239  
    3535    - illustrates use of a gzip header extra field
    3636
     37gznorm.c
     38    normalize a gzip file by combining members into a single member
     39    - demonstrates how to concatenate deflate streams using Z_BLOCK
     40
    3741zlib_how.html
    3842    painfully comprehensive description of zpipe.c (see below)
     
    4549
    4650zran.c
     51zran.h
    4752    index a zlib or gzip stream and randomly access it
    4853    - illustrates the use of Z_BLOCK, inflatePrime(), and
  • trunk/src/libs/zlib-1.2.12/examples/enough.c

    r76163 r95239  
    11/* enough.c -- determine the maximum size of inflate's Huffman code tables over
    2  * all possible valid and complete Huffman codes, subject to a length limit.
    3  * Copyright (C) 2007, 2008, 2012 Mark Adler
    4  * Version 1.4  18 August 2012  Mark Adler
     2 * all possible valid and complete prefix codes, subject to a length limit.
     3 * Copyright (C) 2007, 2008, 2012, 2018 Mark Adler
     4 * Version 1.5  5 August 2018  Mark Adler
    55 */
    66
     
    1818                     Clean up comparisons of different types
    1919                     Clean up code indentation
     20   1.5   5 Aug 2018  Clean up code style, formatting, and comments
     21                     Show all the codes for the maximum, and only the maximum
    2022 */
    2123
    2224/*
    23    Examine all possible Huffman codes for a given number of symbols and a
    24    maximum code length in bits to determine the maximum table size for zilb's
    25    inflate.  Only complete Huffman codes are counted.
     25   Examine all possible prefix codes for a given number of symbols and a
     26   maximum code length in bits to determine the maximum table size for zlib's
     27   inflate. Only complete prefix codes are counted.
    2628
    2729   Two codes are considered distinct if the vectors of the number of codes per
    28    length are not identical.  So permutations of the symbol assignments result
     30   length are not identical. So permutations of the symbol assignments result
    2931   in the same code for the counting, as do permutations of the assignments of
    3032   the bit values to the codes (i.e. only canonical codes are counted).
    3133
    3234   We build a code from shorter to longer lengths, determining how many symbols
    33    are coded at each length.  At each step, we have how many symbols remain to
     35   are coded at each length. At each step, we have how many symbols remain to
    3436   be coded, what the last code length used was, and how many bit patterns of
    3537   that length remain unused. Then we add one to the code length and double the
    36    number of unused patterns to graduate to the next code length.  We then
     38   number of unused patterns to graduate to the next code length. We then
    3739   assign all portions of the remaining symbols to that code length that
    38    preserve the properties of a correct and eventually complete code.  Those
     40   preserve the properties of a correct and eventually complete code. Those
    3941   properties are: we cannot use more bit patterns than are available; and when
    40    all the symbols are used, there are exactly zero possible bit patterns
    41    remaining.
     42   all the symbols are used, there are exactly zero possible bit patterns left
     43   unused.
    4244
    4345   The inflate Huffman decoding algorithm uses two-level lookup tables for
    44    speed.  There is a single first-level table to decode codes up to root bits
    45    in length (root == 9 in the current inflate implementation).  The table
    46    has 1 << root entries and is indexed by the next root bits of input.  Codes
    47    shorter than root bits have replicated table entries, so that the correct
    48    entry is pointed to regardless of the bits that follow the short code.  If
    49    the code is longer than root bits, then the table entry points to a second-
    50    level table.  The size of that table is determined by the longest code with
    51    that root-bit prefix.  If that longest code has length len, then the table
    52    has size 1 << (len - root), to index the remaining bits in that set of
    53    codes.  Each subsequent root-bit prefix then has its own sub-table.  The
    54    total number of table entries required by the code is calculated
    55    incrementally as the number of codes at each bit length is populated.  When
    56    all of the codes are shorter than root bits, then root is reduced to the
    57    longest code length, resulting in a single, smaller, one-level table.
     46   speed. There is a single first-level table to decode codes up to root bits
     47   in length (root == 9 for literal/length codes and root == 6 for distance
     48   codes, in the current inflate implementation). The base table has 1 << root
     49   entries and is indexed by the next root bits of input. Codes shorter than
     50   root bits have replicated table entries, so that the correct entry is
     51   pointed to regardless of the bits that follow the short code. If the code is
     52   longer than root bits, then the table entry points to a second-level table.
     53   The size of that table is determined by the longest code with that root-bit
     54   prefix. If that longest code has length len, then the table has size 1 <<
     55   (len - root), to index the remaining bits in that set of codes. Each
     56   subsequent root-bit prefix then has its own sub-table. The total number of
     57   table entries required by the code is calculated incrementally as the number
     58   of codes at each bit length is populated. When all of the codes are shorter
     59   than root bits, then root is reduced to the longest code length, resulting
     60   in a single, smaller, one-level table.
    5861
    5962   The inflate algorithm also provides for small values of root (relative to
    6063   the log2 of the number of symbols), where the shortest code has more bits
    61    than root.  In that case, root is increased to the length of the shortest
    62    code.  This program, by design, does not handle that case, so it is verified
    63    that the number of symbols is less than 2^(root + 1).
     64   than root. In that case, root is increased to the length of the shortest
     65   code. This program, by design, does not handle that case, so it is verified
     66   that the number of symbols is less than 1 << (root + 1).
    6467
    6568   In order to speed up the examination (by about ten orders of magnitude for
    6669   the default arguments), the intermediate states in the build-up of a code
    67    are remembered and previously visited branches are pruned.  The memory
     70   are remembered and previously visited branches are pruned. The memory
    6871   required for this will increase rapidly with the total number of symbols and
    69    the maximum code length in bits.  However this is a very small price to pay
     72   the maximum code length in bits. However this is a very small price to pay
    7073   for the vast speedup.
    7174
    72    First, all of the possible Huffman codes are counted, and reachable
     75   First, all of the possible prefix codes are counted, and reachable
    7376   intermediate states are noted by a non-zero count in a saved-results array.
    7477   Second, the intermediate states that lead to (root + 1) bit or longer codes
    7578   are used to look at all sub-codes from those junctures for their inflate
    76    memory usage.  (The amount of memory used is not affected by the number of
     79   memory usage. (The amount of memory used is not affected by the number of
    7780   codes of root bits or less in length.)  Third, the visited states in the
    7881   construction of those sub-codes and the associated calculation of the table
     
    8083   Beginning the code examination at (root + 1) bit codes, which is enabled by
    8184   identifying the reachable nodes, accounts for about six of the orders of
    82    magnitude of improvement for the default arguments.  About another four
    83    orders of magnitude come from not revisiting previous states.  Out of
    84    approximately 2x10^16 possible Huffman codes, only about 2x10^6 sub-codes
     85   magnitude of improvement for the default arguments. About another four
     86   orders of magnitude come from not revisiting previous states. Out of
     87   approximately 2x10^16 possible prefix codes, only about 2x10^6 sub-codes
    8588   need to be examined to cover all of the possible table memory usage cases
    8689   for the default arguments of 286 symbols limited to 15-bit codes.
    8790
    88    Note that an unsigned long long type is used for counting.  It is quite easy
    89    to exceed the capacity of an eight-byte integer with a large number of
    90    symbols and a large maximum code length, so multiple-precision arithmetic
    91    would need to replace the unsigned long long arithmetic in that case.  This
    92    program will abort if an overflow occurs.  The big_t type identifies where
    93    the counting takes place.
    94 
    95    An unsigned long long type is also used for calculating the number of
    96    possible codes remaining at the maximum length.  This limits the maximum
    97    code length to the number of bits in a long long minus the number of bits
    98    needed to represent the symbols in a flat code.  The code_t type identifies
    99    where the bit pattern counting takes place.
     91   Note that the uintmax_t type is used for counting. It is quite easy to
     92   exceed the capacity of an eight-byte integer with a large number of symbols
     93   and a large maximum code length, so multiple-precision arithmetic would need
     94   to replace the integer arithmetic in that case. This program will abort if
     95   an overflow occurs. The big_t type identifies where the counting takes
     96   place.
     97
     98   The uintmax_t type is also used for calculating the number of possible codes
     99   remaining at the maximum length. This limits the maximum code length to the
     100   number of bits in a long long minus the number of bits needed to represent
     101   the symbols in a flat code. The code_t type identifies where the bit-pattern
     102   counting takes place.
    100103 */
    101104
     
    103106#include <stdlib.h>
    104107#include <string.h>
     108#include <stdarg.h>
     109#include <stdint.h>
    105110#include <assert.h>
    106111
    107112#define local static
    108113
    109 /* special data types */
    110 typedef unsigned long long big_t;   /* type for code counting */
    111 typedef unsigned long long code_t;  /* type for bit pattern counting */
    112 struct tab {                        /* type for been here check */
    113     size_t len;         /* length of bit vector in char's */
    114     char *vec;          /* allocated bit vector */
     114// Special data types.
     115typedef uintmax_t big_t;    // type for code counting
     116#define PRIbig "ju"         // printf format for big_t
     117typedef uintmax_t code_t;   // type for bit pattern counting
     118struct tab {                // type for been-here check
     119    size_t len;             // allocated length of bit vector in octets
     120    char *vec;              // allocated bit vector
    115121};
    116122
     
    127133      len: 1..max - 1 (max == maximum code length in bits)
    128134
    129    syms == 2 is not saved since that immediately leads to a single code.  left
     135   syms == 2 is not saved since that immediately leads to a single code. left
    130136   must be even, since it represents the number of available bit patterns at
    131    the current length, which is double the number at the previous length.
    132    left ends at syms-1 since left == syms immediately results in a single code.
     137   the current length, which is double the number at the previous length. left
     138   ends at syms-1 since left == syms immediately results in a single code.
    133139   (left > sym is not allowed since that would result in an incomplete code.)
    134140   len is less than max, since the code completes immediately when len == max.
    135141
    136    The offset into the array is calculated for the three indices with the
    137    first one (syms) being outermost, and the last one (len) being innermost.
    138    We build the array with length max-1 lists for the len index, with syms-3
    139    of those for each symbol.  There are totsym-2 of those, with each one
    140    varying in length as a function of sym.  See the calculation of index in
    141    count() for the index, and the calculation of size in main() for the size
    142    of the array.
     142   The offset into the array is calculated for the three indices with the first
     143   one (syms) being outermost, and the last one (len) being innermost. We build
     144   the array with length max-1 lists for the len index, with syms-3 of those
     145   for each symbol. There are totsym-2 of those, with each one varying in
     146   length as a function of sym. See the calculation of index in map() for the
     147   index, and the calculation of size in main() for the size of the array.
    143148
    144149   For the deflate example of 286 symbols limited to 15-bit codes, the array
    145    has 284,284 entries, taking up 2.17 MB for an 8-byte big_t.  More than
    146    half of the space allocated for saved results is actually used -- not all
    147    possible triplets are reached in the generation of valid Huffman codes.
     150   has 284,284 entries, taking up 2.17 MB for an 8-byte big_t. More than half
     151   of the space allocated for saved results is actually used -- not all
     152   possible triplets are reached in the generation of valid prefix codes.
    148153 */
    149154
     
    152157   Each element in the array is further indexed by the (mem, rem) doublet,
    153158   where mem is the amount of inflate table space used so far, and rem is the
    154    remaining unused entries in the current inflate sub-table.  Each indexed
     159   remaining unused entries in the current inflate sub-table. Each indexed
    155160   element is simply one bit indicating whether the state has been visited or
    156    not.  Since the ranges for mem and rem are not known a priori, each bit
     161   not. Since the ranges for mem and rem are not known a priori, each bit
    157162   vector is of a variable size, and grows as needed to accommodate the visited
    158    states.  mem and rem are used to calculate a single index in a triangular
    159    array.  Since the range of mem is expected in the default case to be about
     163   states. mem and rem are used to calculate a single index in a triangular
     164   array. Since the range of mem is expected in the default case to be about
    160165   ten times larger than the range of rem, the array is skewed to reduce the
    161    memory usage, with eight times the range for mem than for rem.  See the
    162    calculations for offset and bit in beenhere() for the details.
     166   memory usage, with eight times the range for mem than for rem. See the
     167   calculations for offset and bit in been_here() for the details.
    163168
    164169   For the deflate example of 286 symbols limited to 15-bit codes, the bit
    165    vectors grow to total approximately 21 MB, in addition to the 4.3 MB done[]
    166    array itself.
     170   vectors grow to total 5.5 MB, in addition to the 4.3 MB done array itself.
    167171 */
    168172
    169 /* Globals to avoid propagating constants or constant pointers recursively */
    170 local int max;          /* maximum allowed bit length for the codes */
    171 local int root;         /* size of base code table in bits */
    172 local int large;        /* largest code table so far */
    173 local size_t size;      /* number of elements in num and done */
    174 local int *code;        /* number of symbols assigned to each bit length */
    175 local big_t *num;       /* saved results array for code counting */
    176 local struct tab *done; /* states already evaluated array */
    177 
    178 /* Index function for num[] and done[] */
    179 #define INDEX(i,j,k) (((size_t)((i-1)>>1)*((i-2)>>1)+(j>>1)-1)*(max-1)+k-1)
    180 
    181 /* Free allocated space.  Uses globals code, num, and done. */
    182 local void cleanup(void)
    183 {
    184     size_t n;
    185 
    186     if (done != NULL) {
    187         for (n = 0; n < size; n++)
    188             if (done[n].len)
    189                 free(done[n].vec);
    190         free(done);
    191     }
    192     if (num != NULL)
    193         free(num);
    194     if (code != NULL)
    195         free(code);
    196 }
    197 
    198 /* Return the number of possible Huffman codes using bit patterns of lengths
    199    len through max inclusive, coding syms symbols, with left bit patterns of
    200    length len unused -- return -1 if there is an overflow in the counting.
    201    Keep a record of previous results in num to prevent repeating the same
    202    calculation.  Uses the globals max and num. */
    203 local big_t count(int syms, int len, int left)
    204 {
    205     big_t sum;          /* number of possible codes from this juncture */
    206     big_t got;          /* value returned from count() */
    207     int least;          /* least number of syms to use at this juncture */
    208     int most;           /* most number of syms to use at this juncture */
    209     int use;            /* number of bit patterns to use in next call */
    210     size_t index;       /* index of this case in *num */
    211 
    212     /* see if only one possible code */
     173// Type for a variable-length, allocated string.
     174typedef struct {
     175    char *str;          // pointer to allocated string
     176    size_t size;        // size of allocation
     177    size_t len;         // length of string, not including terminating zero
     178} string_t;
     179
     180// Clear a string_t.
     181local void string_clear(string_t *s) {
     182    s->str[0] = 0;
     183    s->len = 0;
     184}
     185
     186// Initialize a string_t.
     187local void string_init(string_t *s) {
     188    s->size = 16;
     189    s->str = malloc(s->size);
     190    assert(s->str != NULL && "out of memory");
     191    string_clear(s);
     192}
     193
     194// Release the allocation of a string_t.
     195local void string_free(string_t *s) {
     196    free(s->str);
     197    s->str = NULL;
     198    s->size = 0;
     199    s->len = 0;
     200}
     201
     202// Save the results of printf with fmt and the subsequent argument list to s.
     203// Each call appends to s. The allocated space for s is increased as needed.
     204local void string_printf(string_t *s, char *fmt, ...) {
     205    va_list ap;
     206    va_start(ap, fmt);
     207    size_t len = s->len;
     208    int ret = vsnprintf(s->str + len, s->size - len, fmt, ap);
     209    assert(ret >= 0 && "out of memory");
     210    s->len += ret;
     211    if (s->size < s->len + 1) {
     212        do {
     213            s->size <<= 1;
     214            assert(s->size != 0 && "overflow");
     215        } while (s->size < s->len + 1);
     216        s->str = realloc(s->str, s->size);
     217        assert(s->str != NULL && "out of memory");
     218        vsnprintf(s->str + len, s->size - len, fmt, ap);
     219    }
     220    va_end(ap);
     221}
     222
     223// Globals to avoid propagating constants or constant pointers recursively.
     224struct {
     225    int max;            // maximum allowed bit length for the codes
     226    int root;           // size of base code table in bits
     227    int large;          // largest code table so far
     228    size_t size;        // number of elements in num and done
     229    big_t tot;          // total number of codes with maximum tables size
     230    string_t out;       // display of subcodes for maximum tables size
     231    int *code;          // number of symbols assigned to each bit length
     232    big_t *num;         // saved results array for code counting
     233    struct tab *done;   // states already evaluated array
     234} g;
     235
     236// Index function for num[] and done[].
     237local inline size_t map(int syms, int left, int len) {
     238    return ((size_t)((syms - 1) >> 1) * ((syms - 2) >> 1) +
     239            (left >> 1) - 1) * (g.max - 1) +
     240           len - 1;
     241}
     242
     243// Free allocated space in globals.
     244local void cleanup(void) {
     245    if (g.done != NULL) {
     246        for (size_t n = 0; n < g.size; n++)
     247            if (g.done[n].len)
     248                free(g.done[n].vec);
     249        g.size = 0;
     250        free(g.done);   g.done = NULL;
     251    }
     252    free(g.num);    g.num = NULL;
     253    free(g.code);   g.code = NULL;
     254    string_free(&g.out);
     255}
     256
     257// Return the number of possible prefix codes using bit patterns of lengths len
     258// through max inclusive, coding syms symbols, with left bit patterns of length
     259// len unused -- return -1 if there is an overflow in the counting. Keep a
     260// record of previous results in num to prevent repeating the same calculation.
     261local big_t count(int syms, int left, int len) {
     262    // see if only one possible code
    213263    if (syms == left)
    214264        return 1;
    215265
    216     /* note and verify the expected state */
    217     assert(syms > left && left > 0 && len < max);
    218 
    219     /* see if we've done this one already */
    220     index = INDEX(syms, left, len);
    221     got = num[index];
     266    // note and verify the expected state
     267    assert(syms > left && left > 0 && len < g.max);
     268
     269    // see if we've done this one already
     270    size_t index = map(syms, left, len);
     271    big_t got = g.num[index];
    222272    if (got)
    223         return got;         /* we have -- return the saved result */
    224 
    225     /* we need to use at least this many bit patterns so that the code won't be
    226        incomplete at the next length (more bit patterns than symbols) */
    227     least = (left << 1) - syms;
     273        return got;         // we have -- return the saved result
     274
     275    // we need to use at least this many bit patterns so that the code won't be
     276    // incomplete at the next length (more bit patterns than symbols)
     277    int least = (left << 1) - syms;
    228278    if (least < 0)
    229279        least = 0;
    230280
    231     /* we can use at most this many bit patterns, lest there not be enough
    232       available for the remaining symbols at the maximum length (if there were
    233        no limit to the code length, this would become: most = left - 1) */
    234     most = (((code_t)left << (max - len)) - syms) /
    235             (((code_t)1 << (max - len)) - 1);
    236 
    237     /* count all possible codes from this juncture and add them up */
    238     sum = 0;
    239     for (use = least; use <= most; use++) {
    240         got = count(syms - use, len + 1, (left - use) << 1);
     281    // we can use at most this many bit patterns, lest there not be enough
     282    // available for the remaining symbols at the maximum length (if there were
     283    // no limit to the code length, this would become: most = left - 1)
     284    int most = (((code_t)left << (g.max - len)) - syms) /
     285               (((code_t)1 << (g.max - len)) - 1);
     286
     287    // count all possible codes from this juncture and add them up
     288    big_t sum = 0;
     289    for (int use = least; use <= most; use++) {
     290        got = count(syms - use, (left - use) << 1, len + 1);
    241291        sum += got;
    242         if (got == (big_t)0 - 1 || sum < got)   /* overflow */
    243             return (big_t)0 - 1;
    244     }
    245 
    246     /* verify that all recursive calls are productive */
     292        if (got == (big_t)-1 || sum < got)      // overflow
     293            return (big_t)-1;
     294    }
     295
     296    // verify that all recursive calls are productive
    247297    assert(sum != 0);
    248298
    249     /* save the result and return it */
    250     num[index] = sum;
     299    // save the result and return it
     300    g.num[index] = sum;
    251301    return sum;
    252302}
    253303
    254 /* Return true if we've been here before, set to true if not.  Set a bit in a
    255    bit vector to indicate visiting this state.  Each (syms,len,left) state
    256    has a variable size bit vector indexed by (mem,rem).  The bit vector is
    257    lengthened if needed to allow setting the (mem,rem) bit. */
    258 local int beenhere(int syms, int len, int left, int mem, int rem)
    259 {
    260     size_t index;       /* index for this state's bit vector */
    261     size_t offset;      /* offset in this state's bit vector */
    262     int bit;            /* mask for this state's bit */
    263     size_t length;      /* length of the bit vector in bytes */
    264     char *vector;       /* new or enlarged bit vector */
    265 
    266     /* point to vector for (syms,left,len), bit in vector for (mem,rem) */
    267     index = INDEX(syms, left, len);
    268     mem -= 1 << root;
    269     offset = (mem >> 3) + rem;
     304// Return true if we've been here before, set to true if not. Set a bit in a
     305// bit vector to indicate visiting this state. Each (syms,len,left) state has a
     306// variable size bit vector indexed by (mem,rem). The bit vector is lengthened
     307// as needed to allow setting the (mem,rem) bit.
     308local int been_here(int syms, int left, int len, int mem, int rem) {
     309    // point to vector for (syms,left,len), bit in vector for (mem,rem)
     310    size_t index = map(syms, left, len);
     311    mem -= 1 << g.root;             // mem always includes the root table
     312    mem >>= 1;                      // mem and rem are always even
     313    rem >>= 1;
     314    size_t offset = (mem >> 3) + rem;
    270315    offset = ((offset * (offset + 1)) >> 1) + rem;
    271     bit = 1 << (mem & 7);
    272 
    273     /* see if we've been here */
    274     length = done[index].len;
    275     if (offset < length && (done[index].vec[offset] & bit) != 0)
    276         return 1;       /* done this! */
    277 
    278     /* we haven't been here before -- set the bit to show we have now */
    279 
    280     /* see if we need to lengthen the vector in order to set the bit */
     316    int bit = 1 << (mem & 7);
     317
     318    // see if we've been here
     319    size_t length = g.done[index].len;
     320    if (offset < length && (g.done[index].vec[offset] & bit) != 0)
     321        return 1;       // done this!
     322
     323    // we haven't been here before -- set the bit to show we have now
     324
     325    // see if we need to lengthen the vector in order to set the bit
    281326    if (length <= offset) {
    282         /* if we have one already, enlarge it, zero out the appended space */
     327        // if we have one already, enlarge it, zero out the appended space
     328        char *vector;
    283329        if (length) {
    284330            do {
    285331                length <<= 1;
    286332            } while (length <= offset);
    287             vector = realloc(done[index].vec, length);
    288             if (vector != NULL)
    289                 memset(vector + done[index].len, 0, length - done[index].len);
     333            vector = realloc(g.done[index].vec, length);
     334            assert(vector != NULL && "out of memory");
     335            memset(vector + g.done[index].len, 0, length - g.done[index].len);
    290336        }
    291337
    292         /* otherwise we need to make a new vector and zero it out */
     338        // otherwise we need to make a new vector and zero it out
    293339        else {
    294             length = 1 << (len - root);
     340            length = 16;
    295341            while (length <= offset)
    296342                length <<= 1;
    297             vector = calloc(length, sizeof(char));
     343            vector = calloc(length, 1);
     344            assert(vector != NULL && "out of memory");
    298345        }
    299346
    300         /* in either case, bail if we can't get the memory */
    301         if (vector == NULL) {
    302             fputs("abort: unable to allocate enough memory\n", stderr);
    303             cleanup();
    304             exit(1);
    305         }
    306 
    307         /* install the new vector */
    308         done[index].len = length;
    309         done[index].vec = vector;
    310     }
    311 
    312     /* set the bit */
    313     done[index].vec[offset] |= bit;
     347        // install the new vector
     348        g.done[index].len = length;
     349        g.done[index].vec = vector;
     350    }
     351
     352    // set the bit
     353    g.done[index].vec[offset] |= bit;
    314354    return 0;
    315355}
    316356
    317 /* Examine all possible codes from the given node (syms, len, left).  Compute
    318    the amount of memory required to build inflate's decoding tables, where the
    319    number of code structures used so far is mem, and the number remaining in
    320    the current sub-table is rem.  Uses the globals max, code, root, large, and
    321    done. */
    322 local void examine(int syms, int len, int left, int mem, int rem)
    323 {
    324     int least;          /* least number of syms to use at this juncture */
    325     int most;           /* most number of syms to use at this juncture */
    326     int use;            /* number of bit patterns to use in next call */
    327 
    328     /* see if we have a complete code */
     357// Examine all possible codes from the given node (syms, len, left). Compute
     358// the amount of memory required to build inflate's decoding tables, where the
     359// number of code structures used so far is mem, and the number remaining in
     360// the current sub-table is rem.
     361local void examine(int syms, int left, int len, int mem, int rem) {
     362    // see if we have a complete code
    329363    if (syms == left) {
    330         /* set the last code entry */
    331         code[len] = left;
    332 
    333         /* complete computation of memory used by this code */
     364        // set the last code entry
     365        g.code[len] = left;
     366
     367        // complete computation of memory used by this code
    334368        while (rem < left) {
    335369            left -= rem;
    336             rem = 1 << (len - root);
     370            rem = 1 << (len - g.root);
    337371            mem += rem;
    338372        }
    339373        assert(rem == left);
    340374
    341         /* if this is a new maximum, show the entries used and the sub-code */
    342         if (mem > large) {
    343             large = mem;
    344             printf("max %d: ", mem);
    345             for (use = root + 1; use <= max; use++)
    346                 if (code[use])
    347                     printf("%d[%d] ", code[use], use);
    348             putchar('\n');
    349             fflush(stdout);
     375        // if this is at the maximum, show the sub-code
     376        if (mem >= g.large) {
     377            // if this is a new maximum, update the maximum and clear out the
     378            // printed sub-codes from the previous maximum
     379            if (mem > g.large) {
     380                g.large = mem;
     381                string_clear(&g.out);
     382            }
     383
     384            // compute the starting state for this sub-code
     385            syms = 0;
     386            left = 1 << g.max;
     387            for (int bits = g.max; bits > g.root; bits--) {
     388                syms += g.code[bits];
     389                left -= g.code[bits];
     390                assert((left & 1) == 0);
     391                left >>= 1;
     392            }
     393
     394            // print the starting state and the resulting sub-code to g.out
     395            string_printf(&g.out, "<%u, %u, %u>:",
     396                          syms, g.root + 1, ((1 << g.root) - left) << 1);
     397            for (int bits = g.root + 1; bits <= g.max; bits++)
     398                if (g.code[bits])
     399                    string_printf(&g.out, " %d[%d]", g.code[bits], bits);
     400            string_printf(&g.out, "\n");
    350401        }
    351402
    352         /* remove entries as we drop back down in the recursion */
    353         code[len] = 0;
     403        // remove entries as we drop back down in the recursion
     404        g.code[len] = 0;
    354405        return;
    355406    }
    356407
    357     /* prune the tree if we can */
    358     if (beenhere(syms, len, left, mem, rem))
     408    // prune the tree if we can
     409    if (been_here(syms, left, len, mem, rem))
    359410        return;
    360411
    361     /* we need to use at least this many bit patterns so that the code won't be
    362        incomplete at the next length (more bit patterns than symbols) */
    363     least = (left << 1) - syms;
     412    // we need to use at least this many bit patterns so that the code won't be
     413    // incomplete at the next length (more bit patterns than symbols)
     414    int least = (left << 1) - syms;
    364415    if (least < 0)
    365416        least = 0;
    366417
    367     /* we can use at most this many bit patterns, lest there not be enough
    368       available for the remaining symbols at the maximum length (if there were
    369        no limit to the code length, this would become: most = left - 1) */
    370     most = (((code_t)left << (max - len)) - syms) /
    371             (((code_t)1 << (max - len)) - 1);
    372 
    373     /* occupy least table spaces, creating new sub-tables as needed */
    374     use = least;
     418    // we can use at most this many bit patterns, lest there not be enough
     419    // available for the remaining symbols at the maximum length (if there were
     420    // no limit to the code length, this would become: most = left - 1)
     421    int most = (((code_t)left << (g.max - len)) - syms) /
     422               (((code_t)1 << (g.max - len)) - 1);
     423
     424    // occupy least table spaces, creating new sub-tables as needed
     425    int use = least;
    375426    while (rem < use) {
    376427        use -= rem;
    377         rem = 1 << (len - root);
     428        rem = 1 << (len - g.root);
    378429        mem += rem;
    379430    }
    380431    rem -= use;
    381432
    382     /* examine codes from here, updating table space as we go */
     433    // examine codes from here, updating table space as we go
    383434    for (use = least; use <= most; use++) {
    384         code[len] = use;
    385         examine(syms - use, len + 1, (left - use) << 1,
    386                 mem + (rem ? 1 << (len - root) : 0), rem << 1);
     435        g.code[len] = use;
     436        examine(syms - use, (left - use) << 1, len + 1,
     437                mem + (rem ? 1 << (len - g.root) : 0), rem << 1);
    387438        if (rem == 0) {
    388             rem = 1 << (len - root);
     439            rem = 1 << (len - g.root);
    389440            mem += rem;
    390441        }
     
    392443    }
    393444
    394     /* remove entries as we drop back down in the recursion */
    395     code[len] = 0;
    396 }
    397 
    398 /* Look at all sub-codes starting with root + 1 bits.  Look at only the valid
    399    intermediate code states (syms, left, len).  For each completed code,
    400    calculate the amount of memory required by inflate to build the decoding
    401    tables. Find the maximum amount of memory required and show the code that
    402    requires that maximum.  Uses the globals max, root, and num. */
    403 local void enough(int syms)
    404 {
    405     int n;              /* number of remaing symbols for this node */
    406     int left;           /* number of unused bit patterns at this length */
    407     size_t index;       /* index of this case in *num */
    408 
    409     /* clear code */
    410     for (n = 0; n <= max; n++)
    411         code[n] = 0;
    412 
    413     /* look at all (root + 1) bit and longer codes */
    414     large = 1 << root;              /* base table */
    415     if (root < max)                 /* otherwise, there's only a base table */
    416         for (n = 3; n <= syms; n++)
    417             for (left = 2; left < n; left += 2)
    418             {
    419                 /* look at all reachable (root + 1) bit nodes, and the
    420                    resulting codes (complete at root + 2 or more) */
    421                 index = INDEX(n, left, root + 1);
    422                 if (root + 1 < max && num[index])       /* reachable node */
    423                     examine(n, root + 1, left, 1 << root, 0);
    424 
    425                 /* also look at root bit codes with completions at root + 1
    426                    bits (not saved in num, since complete), just in case */
    427                 if (num[index - 1] && n <= left << 1)
    428                     examine((n - left) << 1, root + 1, (n - left) << 1,
    429                             1 << root, 0);
     445    // remove entries as we drop back down in the recursion
     446    g.code[len] = 0;
     447}
     448
     449// Look at all sub-codes starting with root + 1 bits. Look at only the valid
     450// intermediate code states (syms, left, len). For each completed code,
     451// calculate the amount of memory required by inflate to build the decoding
     452// tables. Find the maximum amount of memory required and show the codes that
     453// require that maximum.
     454local void enough(int syms) {
     455    // clear code
     456    for (int n = 0; n <= g.max; n++)
     457        g.code[n] = 0;
     458
     459    // look at all (root + 1) bit and longer codes
     460    string_clear(&g.out);           // empty saved results
     461    g.large = 1 << g.root;          // base table
     462    if (g.root < g.max)             // otherwise, there's only a base table
     463        for (int n = 3; n <= syms; n++)
     464            for (int left = 2; left < n; left += 2) {
     465                // look at all reachable (root + 1) bit nodes, and the
     466                // resulting codes (complete at root + 2 or more)
     467                size_t index = map(n, left, g.root + 1);
     468                if (g.root + 1 < g.max && g.num[index]) // reachable node
     469                    examine(n, left, g.root + 1, 1 << g.root, 0);
     470
     471                // also look at root bit codes with completions at root + 1
     472                // bits (not saved in num, since complete), just in case
     473                if (g.num[index - 1] && n <= left << 1)
     474                    examine((n - left) << 1, (n - left) << 1, g.root + 1,
     475                            1 << g.root, 0);
    430476            }
    431477
    432     /* done */
    433     printf("done: maximum of %d table entries\n", large);
    434 }
    435 
    436 /*
    437    Examine and show the total number of possible Huffman codes for a given
    438    maximum number of symbols, initial root table size, and maximum code length
    439    in bits -- those are the command arguments in that order.  The default
    440    values are 286, 9, and 15 respectively, for the deflate literal/length code.
    441    The possible codes are counted for each number of coded symbols from two to
    442    the maximum.  The counts for each of those and the total number of codes are
    443    shown.  The maximum number of inflate table entires is then calculated
    444    across all possible codes.  Each new maximum number of table entries and the
    445    associated sub-code (starting at root + 1 == 10 bits) is shown.
    446 
    447    To count and examine Huffman codes that are not length-limited, provide a
    448    maximum length equal to the number of symbols minus one.
    449 
    450    For the deflate literal/length code, use "enough".  For the deflate distance
    451    code, use "enough 30 6".
    452 
    453    This uses the %llu printf format to print big_t numbers, which assumes that
    454    big_t is an unsigned long long.  If the big_t type is changed (for example
    455    to a multiple precision type), the method of printing will also need to be
    456    updated.
    457  */
    458 int main(int argc, char **argv)
    459 {
    460     int syms;           /* total number of symbols to code */
    461     int n;              /* number of symbols to code for this run */
    462     big_t got;          /* return value of count() */
    463     big_t sum;          /* accumulated number of codes over n */
    464     code_t word;        /* for counting bits in code_t */
    465 
    466     /* set up globals for cleanup() */
    467     code = NULL;
    468     num = NULL;
    469     done = NULL;
    470 
    471     /* get arguments -- default to the deflate literal/length code */
    472     syms = 286;
    473     root = 9;
    474     max = 15;
     478    // done
     479    printf("maximum of %d table entries for root = %d\n", g.large, g.root);
     480    fputs(g.out.str, stdout);
     481}
     482
     483// Examine and show the total number of possible prefix codes for a given
     484// maximum number of symbols, initial root table size, and maximum code length
     485// in bits -- those are the command arguments in that order. The default values
     486// are 286, 9, and 15 respectively, for the deflate literal/length code. The
     487// possible codes are counted for each number of coded symbols from two to the
     488// maximum. The counts for each of those and the total number of codes are
     489// shown. The maximum number of inflate table entires is then calculated across
     490// all possible codes. Each new maximum number of table entries and the
     491// associated sub-code (starting at root + 1 == 10 bits) is shown.
     492//
     493// To count and examine prefix codes that are not length-limited, provide a
     494// maximum length equal to the number of symbols minus one.
     495//
     496// For the deflate literal/length code, use "enough". For the deflate distance
     497// code, use "enough 30 6".
     498int main(int argc, char **argv) {
     499    // set up globals for cleanup()
     500    g.code = NULL;
     501    g.num = NULL;
     502    g.done = NULL;
     503    string_init(&g.out);
     504
     505    // get arguments -- default to the deflate literal/length code
     506    int syms = 286;
     507    g.root = 9;
     508    g.max = 15;
    475509    if (argc > 1) {
    476510        syms = atoi(argv[1]);
    477511        if (argc > 2) {
    478             root = atoi(argv[2]);
     512            g.root = atoi(argv[2]);
    479513            if (argc > 3)
    480                 max = atoi(argv[3]);
     514                g.max = atoi(argv[3]);
    481515        }
    482516    }
    483     if (argc > 4 || syms < 2 || root < 1 || max < 1) {
     517    if (argc > 4 || syms < 2 || g.root < 1 || g.max < 1) {
    484518        fputs("invalid arguments, need: [sym >= 2 [root >= 1 [max >= 1]]]\n",
    485519              stderr);
     
    487521    }
    488522
    489     /* if not restricting the code length, the longest is syms - 1 */
    490     if (max > syms - 1)
    491         max = syms - 1;
    492 
    493     /* determine the number of bits in a code_t */
    494     for (n = 0, word = 1; word; n++, word <<= 1)
    495         ;
    496 
    497     /* make sure that the calculation of most will not overflow */
    498     if (max > n || (code_t)(syms - 2) >= (((code_t)0 - 1) >> (max - 1))) {
     523    // if not restricting the code length, the longest is syms - 1
     524    if (g.max > syms - 1)
     525        g.max = syms - 1;
     526
     527    // determine the number of bits in a code_t
     528    int bits = 0;
     529    for (code_t word = 1; word; word <<= 1)
     530        bits++;
     531
     532    // make sure that the calculation of most will not overflow
     533    if (g.max > bits || (code_t)(syms - 2) >= ((code_t)-1 >> (g.max - 1))) {
    499534        fputs("abort: code length too long for internal types\n", stderr);
    500535        return 1;
    501536    }
    502537
    503     /* reject impossible code requests */
    504     if ((code_t)(syms - 1) > ((code_t)1 << max) - 1) {
     538    // reject impossible code requests
     539    if ((code_t)(syms - 1) > ((code_t)1 << g.max) - 1) {
    505540        fprintf(stderr, "%d symbols cannot be coded in %d bits\n",
    506                 syms, max);
     541                syms, g.max);
    507542        return 1;
    508543    }
    509544
    510     /* allocate code vector */
    511     code = calloc(max + 1, sizeof(int));
    512     if (code == NULL) {
    513         fputs("abort: unable to allocate enough memory\n", stderr);
    514         return 1;
    515     }
    516 
    517     /* determine size of saved results array, checking for overflows,
    518        allocate and clear the array (set all to zero with calloc()) */
    519     if (syms == 2)              /* iff max == 1 */
    520         num = NULL;             /* won't be saving any results */
     545    // allocate code vector
     546    g.code = calloc(g.max + 1, sizeof(int));
     547    assert(g.code != NULL && "out of memory");
     548
     549    // determine size of saved results array, checking for overflows,
     550    // allocate and clear the array (set all to zero with calloc())
     551    if (syms == 2)              // iff max == 1
     552        g.num = NULL;           // won't be saving any results
    521553    else {
    522         size = syms >> 1;
    523         if (size > ((size_t)0 - 1) / (n = (syms - 1) >> 1) ||
    524                 (size *= n, size > ((size_t)0 - 1) / (n = max - 1)) ||
    525                 (size *= n, size > ((size_t)0 - 1) / sizeof(big_t)) ||
    526                 (num = calloc(size, sizeof(big_t))) == NULL) {
    527             fputs("abort: unable to allocate enough memory\n", stderr);
    528             cleanup();
    529             return 1;
    530         }
    531     }
    532 
    533     /* count possible codes for all numbers of symbols, add up counts */
    534     sum = 0;
    535     for (n = 2; n <= syms; n++) {
    536         got = count(n, 1, 2);
     554        g.size = syms >> 1;
     555        int n = (syms - 1) >> 1;
     556        assert(g.size <= (size_t)-1 / n && "overflow");
     557        g.size *= n;
     558        n = g.max - 1;
     559        assert(g.size <= (size_t)-1 / n && "overflow");
     560        g.size *= n;
     561        g.num = calloc(g.size, sizeof(big_t));
     562        assert(g.num != NULL && "out of memory");
     563    }
     564
     565    // count possible codes for all numbers of symbols, add up counts
     566    big_t sum = 0;
     567    for (int n = 2; n <= syms; n++) {
     568        big_t got = count(n, 2, 1);
    537569        sum += got;
    538         if (got == (big_t)0 - 1 || sum < got) {     /* overflow */
    539             fputs("abort: can't count that high!\n", stderr);
    540             cleanup();
    541             return 1;
    542         }
    543         printf("%llu %d-codes\n", got, n);
    544     }
    545     printf("%llu total codes for 2 to %d symbols", sum, syms);
    546     if (max < syms - 1)
    547         printf(" (%d-bit length limit)\n", max);
     570        assert(got != (big_t)-1 && sum >= got && "overflow");
     571    }
     572    printf("%"PRIbig" total codes for 2 to %d symbols", sum, syms);
     573    if (g.max < syms - 1)
     574        printf(" (%d-bit length limit)\n", g.max);
    548575    else
    549576        puts(" (no length limit)");
    550577
    551     /* allocate and clear done array for beenhere() */
     578    // allocate and clear done array for been_here()
    552579    if (syms == 2)
    553         done = NULL;
    554     else if (size > ((size_t)0 - 1) / sizeof(struct tab) ||
    555              (done = calloc(size, sizeof(struct tab))) == NULL) {
    556         fputs("abort: unable to allocate enough memory\n", stderr);
    557         cleanup();
    558         return 1;
    559     }
    560 
    561     /* find and show maximum inflate table usage */
    562     if (root > max)                 /* reduce root to max length */
    563         root = max;
    564     if ((code_t)syms < ((code_t)1 << (root + 1)))
     580        g.done = NULL;
     581    else {
     582        g.done = calloc(g.size, sizeof(struct tab));
     583        assert(g.done != NULL && "out of memory");
     584    }
     585
     586    // find and show maximum inflate table usage
     587    if (g.root > g.max)             // reduce root to max length
     588        g.root = g.max;
     589    if ((code_t)syms < ((code_t)1 << (g.root + 1)))
    565590        enough(syms);
    566591    else
    567         puts("cannot handle minimum code lengths > root");
    568 
    569     /* done */
     592        fputs("cannot handle minimum code lengths > root", stderr);
     593
     594    // done
    570595    cleanup();
    571596    return 0;
  • trunk/src/libs/zlib-1.2.12/examples/gzappend.c

    r76163 r95239  
    138138    if (rot == 1) {
    139139        tmp = *list;
    140         memcpy(list, list + 1, len - 1);
     140        memmove(list, list + 1, len - 1);
    141141        *last = tmp;
    142142        return;
  • trunk/src/libs/zlib-1.2.12/examples/gzlog.c

    r76163 r95239  
    11/*
    22 * gzlog.c
    3  * Copyright (C) 2004, 2008, 2012, 2016 Mark Adler, all rights reserved
     3 * Copyright (C) 2004, 2008, 2012, 2016, 2019 Mark Adler, all rights reserved
    44 * For conditions of distribution and use, see copyright notice in gzlog.h
    5  * version 2.2, 14 Aug 2012
     5 * version 2.3, 25 May 2019
    66 */
    77
     
    757757            }
    758758            if ((fd = open(log->path, O_RDONLY, 0)) < 0) {
     759                free(data);
    759760                log_log(log, op, ".add file read failure");
    760761                return -1;
     
    763764            close(fd);
    764765            if (ret) {
     766                free(data);
    765767                log_log(log, op, ".add file read failure");
    766768                return -1;
  • trunk/src/libs/zlib-1.2.12/examples/zran.c

    r76163 r95239  
    11/* zran.c -- example of zlib/gzip stream indexing and random access
    2  * Copyright (C) 2005, 2012 Mark Adler
     2 * Copyright (C) 2005, 2012, 2018 Mark Adler
    33 * For conditions of distribution and use, see copyright notice in zlib.h
    4    Version 1.1  29 Sep 2012  Mark Adler */
     4 * Version 1.2  14 Oct 2018  Mark Adler */
    55
    66/* Version History:
    77 1.0  29 May 2005  First version
    88 1.1  29 Sep 2012  Fix memory reallocation error
     9 1.2  14 Oct 2018  Handle gzip streams with multiple members
     10                   Add a header file to facilitate usage in applications
    911 */
    1012
     
    2123   uncompressed data that precede that block.  Also the uncompressed offset of
    2224   that block is saved to provide a referece for locating a desired starting
    23    point in the uncompressed stream.  build_index() works by decompressing the
    24    input zlib or gzip stream a block at a time, and at the end of each block
    25    deciding if enough uncompressed data has gone by to justify the creation of
    26    a new access point.  If so, that point is saved in a data structure that
    27    grows as needed to accommodate the points.
     25   point in the uncompressed stream.  deflate_index_build() works by
     26   decompressing the input zlib or gzip stream a block at a time, and at the
     27   end of each block deciding if enough uncompressed data has gone by to
     28   justify the creation of a new access point.  If so, that point is saved in a
     29   data structure that grows as needed to accommodate the points.
    2830
    2931   To use the index, an offset in the uncompressed data is provided, for which
     
    4446   access, mainly copying the 32K byte dictionary.  So if small pieces of the
    4547   file are being accessed, it would make sense to implement a cache to hold
    46    some lookahead and avoid many calls to extract() for small lengths.
     48   some lookahead and avoid many calls to deflate_index_extract() for small
     49   lengths.
    4750
    4851   Another way to build an index would be to use inflateCopy().  That would
     
    5760#include <string.h>
    5861#include "zlib.h"
    59 
    60 #define local static
    61 
    62 #define SPAN 1048576L       /* desired distance between access points */
     62#include "zran.h"
     63
    6364#define WINSIZE 32768U      /* sliding window size */
    6465#define CHUNK 16384         /* file input buffer size */
    6566
    66 /* access point entry */
     67/* Access point entry. */
    6768struct point {
    6869    off_t out;          /* corresponding offset in uncompressed data */
    6970    off_t in;           /* offset in input file of first full byte */
    70     int bits;           /* number of bits (1-7) from byte at in - 1, or 0 */
     71    int bits;           /* number of bits (1-7) from byte at in-1, or 0 */
    7172    unsigned char window[WINSIZE];  /* preceding 32K of uncompressed data */
    7273};
    7374
    74 /* access point list */
    75 struct access {
    76     int have;           /* number of list entries filled in */
    77     int size;           /* number of list entries allocated */
    78     struct point *list; /* allocated list */
    79 };
    80 
    81 /* Deallocate an index built by build_index() */
    82 local void free_index(struct access *index)
     75/* See comments in zran.h. */
     76void deflate_index_free(struct deflate_index *index)
    8377{
    8478    if (index != NULL) {
     
    8882}
    8983
    90 /* Add an entry to the access point list.  If out of memory, deallocate the
    91    existing list and return NULL. */
    92 local struct access *addpoint(struct access *index, int bits,
    93     off_t in, off_t out, unsigned left, unsigned char *window)
     84/* Add an entry to the access point list. If out of memory, deallocate the
     85   existing list and return NULL. index->gzip is the allocated size of the
     86   index in point entries, until it is time for deflate_index_build() to
     87   return, at which point gzip is set to indicate a gzip file or not.
     88 */
     89static struct deflate_index *addpoint(struct deflate_index *index, int bits,
     90                                      off_t in, off_t out, unsigned left,
     91                                      unsigned char *window)
    9492{
    9593    struct point *next;
     
    9795    /* if list is empty, create it (start with eight points) */
    9896    if (index == NULL) {
    99         index = malloc(sizeof(struct access));
     97        index = malloc(sizeof(struct deflate_index));
    10098        if (index == NULL) return NULL;
    10199        index->list = malloc(sizeof(struct point) << 3);
     
    104102            return NULL;
    105103        }
    106         index->size = 8;
     104        index->gzip = 8;
    107105        index->have = 0;
    108106    }
    109107
    110108    /* if list is full, make it bigger */
    111     else if (index->have == index->size) {
    112         index->size <<= 1;
    113         next = realloc(index->list, sizeof(struct point) * index->size);
     109    else if (index->have == index->gzip) {
     110        index->gzip <<= 1;
     111        next = realloc(index->list, sizeof(struct point) * index->gzip);
    114112        if (next == NULL) {
    115             free_index(index);
     113            deflate_index_free(index);
    116114            return NULL;
    117115        }
     
    120118
    121119    /* fill in entry and increment how many we have */
    122     next = index->list + index->have;
     120    next = (struct point *)(index->list) + index->have;
    123121    next->bits = bits;
    124122    next->in = in;
     
    134132}
    135133
    136 /* Make one entire pass through the compressed stream and build an index, with
    137    access points about every span bytes of uncompressed output -- span is
    138    chosen to balance the speed of random access against the memory requirements
    139    of the list, about 32K bytes per access point.  Note that data after the end
    140    of the first zlib or gzip stream in the file is ignored.  build_index()
    141    returns the number of access points on success (>= 1), Z_MEM_ERROR for out
    142    of memory, Z_DATA_ERROR for an error in the input file, or Z_ERRNO for a
    143    file read error.  On success, *built points to the resulting index. */
    144 local int build_index(FILE *in, off_t span, struct access **built)
     134/* See comments in zran.h. */
     135int deflate_index_build(FILE *in, off_t span, struct deflate_index **built)
    145136{
    146137    int ret;
     138    int gzip = 0;               /* true if reading a gzip file */
    147139    off_t totin, totout;        /* our own total counters to avoid 4GB limit */
    148140    off_t last;                 /* totout value of last access point */
    149     struct access *index;       /* access points being generated */
     141    struct deflate_index *index;    /* access points being generated */
    150142    z_stream strm;
    151143    unsigned char input[CHUNK];
     
    164156    /* inflate the input, maintain a sliding window, and build an index -- this
    165157       also validates the integrity of the compressed data using the check
    166        information at the end of the gzip or zlib stream */
     158       information in the gzip or zlib stream */
    167159    totin = totout = last = 0;
    168160    index = NULL;               /* will be allocated by first addpoint() */
     
    173165        if (ferror(in)) {
    174166            ret = Z_ERRNO;
    175             goto build_index_error;
     167            goto deflate_index_build_error;
    176168        }
    177169        if (strm.avail_in == 0) {
    178170            ret = Z_DATA_ERROR;
    179             goto build_index_error;
     171            goto deflate_index_build_error;
    180172        }
    181173        strm.next_in = input;
     174
     175        /* check for a gzip stream */
     176        if (totin == 0 && strm.avail_in >= 3 &&
     177            input[0] == 31 && input[1] == 139 && input[2] == 8)
     178            gzip = 1;
    182179
    183180        /* process all of that, or until end of stream */
     
    199196                ret = Z_DATA_ERROR;
    200197            if (ret == Z_MEM_ERROR || ret == Z_DATA_ERROR)
    201                 goto build_index_error;
    202             if (ret == Z_STREAM_END)
     198                goto deflate_index_build_error;
     199            if (ret == Z_STREAM_END) {
     200                if (gzip &&
     201                    (strm.avail_in || ungetc(getc(in), in) != EOF)) {
     202                    ret = inflateReset(&strm);
     203                    if (ret != Z_OK)
     204                        goto deflate_index_build_error;
     205                    continue;
     206                }
    203207                break;
     208            }
    204209
    205210            /* if at end of block, consider adding an index entry (note that if
     
    218223                if (index == NULL) {
    219224                    ret = Z_MEM_ERROR;
    220                     goto build_index_error;
     225                    goto deflate_index_build_error;
    221226                }
    222227                last = totout;
     
    228233    (void)inflateEnd(&strm);
    229234    index->list = realloc(index->list, sizeof(struct point) * index->have);
    230     index->size = index->have;
     235    index->gzip = gzip;
     236    index->length = totout;
    231237    *built = index;
    232     return index->size;
     238    return index->have;
    233239
    234240    /* return error */
    235   build_index_error:
     241  deflate_index_build_error:
    236242    (void)inflateEnd(&strm);
    237     if (index != NULL)
    238         free_index(index);
     243    deflate_index_free(index);
    239244    return ret;
    240245}
    241246
    242 /* Use the index to read len bytes from offset into buf, return bytes read or
    243    negative for error (Z_DATA_ERROR or Z_MEM_ERROR).  If data is requested past
    244    the end of the uncompressed data, then extract() will return a value less
    245    than len, indicating how much as actually read into buf.  This function
    246    should not return a data error unless the file was modified since the index
    247    was generated.  extract() may also return Z_ERRNO if there is an error on
    248    reading or seeking the input file. */
    249 local int extract(FILE *in, struct access *index, off_t offset,
    250                   unsigned char *buf, int len)
     247/* See comments in zran.h. */
     248int deflate_index_extract(FILE *in, struct deflate_index *index, off_t offset,
     249                          unsigned char *buf, int len)
    251250{
    252251    int ret, skip;
     
    277276    ret = fseeko(in, here->in - (here->bits ? 1 : 0), SEEK_SET);
    278277    if (ret == -1)
    279         goto extract_ret;
     278        goto deflate_index_extract_ret;
    280279    if (here->bits) {
    281280        ret = getc(in);
    282281        if (ret == -1) {
    283282            ret = ferror(in) ? Z_ERRNO : Z_DATA_ERROR;
    284             goto extract_ret;
     283            goto deflate_index_extract_ret;
    285284        }
    286285        (void)inflatePrime(&strm, here->bits, ret >> (8 - here->bits));
     
    294293    do {
    295294        /* define where to put uncompressed data, and how much */
    296         if (offset == 0 && skip) {          /* at offset now */
    297             strm.avail_out = len;
    298             strm.next_out = buf;
    299             skip = 0;                       /* only do this once */
    300         }
    301295        if (offset > WINSIZE) {             /* skip WINSIZE bytes */
    302296            strm.avail_out = WINSIZE;
     
    304298            offset -= WINSIZE;
    305299        }
    306         else if (offset != 0) {             /* last skip */
     300        else if (offset > 0) {              /* last skip */
    307301            strm.avail_out = (unsigned)offset;
    308302            strm.next_out = discard;
    309303            offset = 0;
     304        }
     305        else if (skip) {                    /* at offset now */
     306            strm.avail_out = len;
     307            strm.next_out = buf;
     308            skip = 0;                       /* only do this once */
    310309        }
    311310
     
    316315                if (ferror(in)) {
    317316                    ret = Z_ERRNO;
    318                     goto extract_ret;
     317                    goto deflate_index_extract_ret;
    319318                }
    320319                if (strm.avail_in == 0) {
    321320                    ret = Z_DATA_ERROR;
    322                     goto extract_ret;
     321                    goto deflate_index_extract_ret;
    323322                }
    324323                strm.next_in = input;
     
    328327                ret = Z_DATA_ERROR;
    329328            if (ret == Z_MEM_ERROR || ret == Z_DATA_ERROR)
    330                 goto extract_ret;
    331             if (ret == Z_STREAM_END)
    332                 break;
     329                goto deflate_index_extract_ret;
     330            if (ret == Z_STREAM_END) {
     331                /* the raw deflate stream has ended */
     332                if (index->gzip == 0)
     333                    /* this is a zlib stream that has ended -- done */
     334                    break;
     335
     336                /* near the end of a gzip member, which might be followed by
     337                   another gzip member -- skip the gzip trailer and see if
     338                   there is more input after it */
     339                if (strm.avail_in < 8) {
     340                    fseeko(in, 8 - strm.avail_in, SEEK_CUR);
     341                    strm.avail_in = 0;
     342                }
     343                else {
     344                    strm.avail_in -= 8;
     345                    strm.next_in += 8;
     346                }
     347                if (strm.avail_in == 0 && ungetc(getc(in), in) == EOF)
     348                    /* the input ended after the gzip trailer -- done */
     349                    break;
     350
     351                /* there is more input, so another gzip member should follow --
     352                   validate and skip the gzip header */
     353                ret = inflateReset2(&strm, 31);
     354                if (ret != Z_OK)
     355                    goto deflate_index_extract_ret;
     356                do {
     357                    if (strm.avail_in == 0) {
     358                        strm.avail_in = fread(input, 1, CHUNK, in);
     359                        if (ferror(in)) {
     360                            ret = Z_ERRNO;
     361                            goto deflate_index_extract_ret;
     362                        }
     363                        if (strm.avail_in == 0) {
     364                            ret = Z_DATA_ERROR;
     365                            goto deflate_index_extract_ret;
     366                        }
     367                        strm.next_in = input;
     368                    }
     369                    ret = inflate(&strm, Z_BLOCK);
     370                    if (ret == Z_MEM_ERROR || ret == Z_DATA_ERROR)
     371                        goto deflate_index_extract_ret;
     372                } while ((strm.data_type & 128) == 0);
     373
     374                /* set up to continue decompression of the raw deflate stream
     375                   that follows the gzip header */
     376                ret = inflateReset2(&strm, -15);
     377                if (ret != Z_OK)
     378                    goto deflate_index_extract_ret;
     379            }
     380
     381            /* continue to process the available input before reading more */
    333382        } while (strm.avail_out != 0);
    334383
    335         /* if reach end of stream, then don't keep trying to get more */
    336384        if (ret == Z_STREAM_END)
     385            /* reached the end of the compressed data -- return the data that
     386               was available, possibly less than requested */
    337387            break;
    338388
    339         /* do until offset reached and requested data read, or stream ends */
     389        /* do until offset reached and requested data read */
    340390    } while (skip);
    341391
    342     /* compute number of uncompressed bytes read after offset */
     392    /* compute the number of uncompressed bytes read after the offset */
    343393    ret = skip ? 0 : len - strm.avail_out;
    344394
    345     /* clean up and return bytes read or error */
    346   extract_ret:
     395    /* clean up and return the bytes read, or the negative error */
     396  deflate_index_extract_ret:
    347397    (void)inflateEnd(&strm);
    348398    return ret;
    349399}
    350400
    351 /* Demonstrate the use of build_index() and extract() by processing the file
    352    provided on the command line, and the extracting 16K from about 2/3rds of
    353    the way through the uncompressed output, and writing that to stdout. */
     401#ifdef TEST
     402
     403#define SPAN 1048576L       /* desired distance between access points */
     404#define LEN 16384           /* number of bytes to extract */
     405
     406/* Demonstrate the use of deflate_index_build() and deflate_index_extract() by
     407   processing the file provided on the command line, and extracting LEN bytes
     408   from 2/3rds of the way through the uncompressed output, writing that to
     409   stdout. An offset can be provided as the second argument, in which case the
     410   data is extracted from there instead. */
    354411int main(int argc, char **argv)
    355412{
    356413    int len;
    357     off_t offset;
     414    off_t offset = -1;
    358415    FILE *in;
    359     struct access *index = NULL;
    360     unsigned char buf[CHUNK];
     416    struct deflate_index *index = NULL;
     417    unsigned char buf[LEN];
    361418
    362419    /* open input file */
    363     if (argc != 2) {
    364         fprintf(stderr, "usage: zran file.gz\n");
     420    if (argc < 2 || argc > 3) {
     421        fprintf(stderr, "usage: zran file.gz [offset]\n");
    365422        return 1;
    366423    }
     
    371428    }
    372429
     430    /* get optional offset */
     431    if (argc == 3) {
     432        char *end;
     433        offset = strtoll(argv[2], &end, 10);
     434        if (*end || offset < 0) {
     435            fprintf(stderr, "zran: %s is not a valid offset\n", argv[2]);
     436            return 1;
     437        }
     438    }
     439
    373440    /* build index */
    374     len = build_index(in, SPAN, &index);
     441    len = deflate_index_build(in, SPAN, &index);
    375442    if (len < 0) {
    376443        fclose(in);
     
    393460
    394461    /* use index by reading some bytes from an arbitrary offset */
    395     offset = (index->list[index->have - 1].out << 1) / 3;
    396     len = extract(in, index, offset, buf, CHUNK);
     462    if (offset == -1)
     463        offset = (index->length << 1) / 3;
     464    len = deflate_index_extract(in, index, offset, buf, LEN);
    397465    if (len < 0)
    398466        fprintf(stderr, "zran: extraction failed: %s error\n",
     
    404472
    405473    /* clean up and exit */
    406     free_index(index);
     474    deflate_index_free(index);
    407475    fclose(in);
    408476    return 0;
    409477}
     478
     479#endif
Note: See TracChangeset for help on using the changeset viewer.

© 2025 Oracle Support Privacy / Do Not Sell My Info Terms of Use Trademark Policy Automated Access Etiquette