GNU grep NEWS -*- outline -*-

* Noteworthy changes in release 3.7 (2021-08-14) [stable]

** Changes in behavior

  Use of the --unix-byte-offsets (-u) option now evokes a warning.
  Since 3.1, this Windows-only option has had no effect.

** Bug fixes

  Preprocessing N patterns would take at least O(N^2) time when too many
  patterns hashed to too few buckets. This now takes seconds, not days:
    : | grep -Ff <(seq 6400000 | tr 0-9 A-J)
  [Bug#44754 introduced in grep 3.5]


* Noteworthy changes in release 3.6 (2020-11-08) [stable]

** Changes in behavior

  The GREP_OPTIONS environment variable no longer affects grep's behavior.
  The variable was declared obsolescent in grep 2.21 (2014), and since
  then any use had caused grep to issue a diagnostic.

** Bug fixes

  grep's DFA matcher performed an invalid regex transformation
  that would convert an ERE like a+a+a+ to a+a+, which would make
  grep a+a+a+ mistakenly match "aa".
  [Bug#44351 introduced in grep 3.2]
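
  A regression check for the a+a+a+ fix could be sketched as follows.
  The helper name "matches" and the sample strings are illustrative
  assumptions, not part of grep's own test suite:

```shell
# Return 0 if the sample line (second argument) matches the ERE
# (first argument), and nonzero otherwise.
matches() {
  printf '%s\n' "$2" | grep -E -q -- "$1"
}

# a+a+a+ requires at least three 'a' characters, so "aa" must not match.
if matches 'a+a+a+' 'aa'; then two_a=match; else two_a=no-match; fi
if matches 'a+a+a+' 'aaa'; then three_a=match; else three_a=no-match; fi
echo "aa: $two_a, aaa: $three_a"
```

  On an affected release (3.2 through 3.5, per the entry above) the
  first check would be expected to report a match.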

  grep -P now reports the troublesome input filename upon PCRE execution
  failure. Before, searching many files for something rare might fail with
  just "exceeded PCRE's backtracking limit". Now, it also reports which file
  triggered the failure.


* Noteworthy changes in release 3.5 (2020-09-27) [stable]

** Changes in behavior

  The message that a binary file matches is now sent to standard error
  and the message has been reworded from "Binary file FOO matches" to
  "grep: FOO: binary file matches", to avoid confusion with ordinary
  output or when file names contain spaces and the like, and to be
  more consistent with other diagnostics. For example, commands
  like 'grep PATTERN FILE | wc' no longer add 1 to the count of
  matching text lines due to the presence of the message. Like other
  stderr messages, the message is now omitted if the --no-messages
  (-s) option is given.
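
  To see which stream carries the message on a given installation,
  the two streams can be captured separately. This is only a sketch:
  the variable names and the sample input are assumptions made for
  illustration.

```shell
# Feed grep a line containing a NUL byte so that it treats the input as
# binary. Standard output and standard error are captured separately.
err_file=$(mktemp)
out=$(printf 'a\0b\n' | LC_ALL=C grep a 2>"$err_file")
err=$(cat "$err_file")
rm -f "$err_file"

echo "stdout: [$out]"
echo "stderr: [$err]"
```

  With grep 3.5 or later the message should appear under "stderr";
  with earlier releases it should appear under "stdout".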

  Two other stderr messages now use the typical form too. They are
  now "grep: FOO: warning: recursive directory loop" and "grep: FOO:
  input file is also the output".

  The --files-without-match (-L) option has reverted to its behavior
  in grep 3.1 and earlier. That is, grep -L again succeeds when a
  line is selected, not when a file is listed. The behavior in grep
  3.2 through 3.4 was causing compatibility problems.
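
  The difference between "a file is listed" and "a line is selected"
  can be observed with two small files. The file names and variable
  names below are hypothetical. Going by the entry above, grep 3.5
  and later should report statuses 0 and 1 here, while grep 3.2
  through 3.4 would report 1 and 0.

```shell
dir=$(mktemp -d)
printf 'needle\n' > "$dir/has.txt"
printf 'hay\n' > "$dir/none.txt"

# -L lists only the files in which no line was selected.
listed=$(cd "$dir" && grep -L needle has.txt none.txt)

# Only file contains a match: nothing is listed, but a line is selected.
status_selected=0
ignored=$(cd "$dir" && grep -L needle has.txt) || status_selected=$?

# Only file has no match: the file is listed, but no line is selected.
status_listed=0
ignored=$(cd "$dir" && grep -L needle none.txt) || status_listed=$?

rm -rf "$dir"
echo "listed: $listed; statuses: $status_selected $status_listed"
```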

** Bug fixes

  grep -I no longer issues a spurious "Binary file FOO matches" line.
  [Bug#33552 introduced in grep 2.23]

  In UTF-8 locales, grep -w no longer ignores a multibyte word
  constituent just before what would otherwise be a word match.
  [Bug#43225 introduced in grep 2.28]

  grep -i no longer mishandles ASCII characters that match multibyte
  characters. For example, 'LC_ALL=tr_TR.utf8 grep -i i' no longer
  dumps core merely because 'i' matches 'İ' (U+0130 LATIN CAPITAL
  LETTER I WITH DOT ABOVE) in Turkish when ignoring case.
  [Bug#43577 introduced partly in grep 2.28 and partly in grep 3.4]

  A performance regression with -E and many patterns has been mostly fixed.
  "Mostly" as there is a performance tradeoff between Bug#22357 and Bug#40634.
  [Bug#40634 introduced in grep 2.28]

  A performance regression with many duplicate patterns has been fixed.
  [Bug#43040 introduced in grep 3.4]

  An N^2 RSS performance regression with many patterns has been fixed
  in common cases (no backref, and no use of -o or --color).
  With only 80,000 lines of /usr/share/dict/linux.words, the following
  would use 100GB of RSS and take 3 minutes. With the fix, it used less
  than 400MB and took less than one second:
    head -80000 /usr/share/dict/linux.words > w; grep -vf w w
  [Bug#43527 introduced in grep 3.4]

** Build-related

  "make dist" builds .tar.gz files again, as they are still used in
  some barebones builds.


* Noteworthy changes in release 3.4 (2020-01-02) [stable]

** New features

  The new --no-ignore-case option causes grep to observe case
  distinctions, overriding any previous -i (--ignore-case) option.
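
  One use for the option is undoing an -i that a wrapper always
  supplies. In this sketch the wrapper name "ci_grep" is hypothetical,
  and the option is probed first because grep older than 3.4 lacks it:

```shell
# Stand-in for an alias or wrapper that always passes -i.
ci_grep() {
  grep -i "$@"
}

# Probe for the option before relying on it.
if printf 'x\n' | grep --no-ignore-case -q x 2>/dev/null; then
  # The later --no-ignore-case overrides the earlier -i.
  nic_out=$(printf 'Foo\nfoo\n' | ci_grep --no-ignore-case foo)
else
  nic_out=unsupported
fi
echo "$nic_out"
```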

** Bug fixes

  '.' no longer matches some invalid byte sequences in UTF-8 locales.
  [bug introduced in grep 2.7]

  grep -Fw can no longer false match in non-UTF-8 multibyte locales.
  For example, this command would erroneously print its input line:
    echo ab | LC_CTYPE=ja_JP.eucjp grep -Fw b
  [Bug#38223 introduced in grep 2.28]

  The exit status of 'grep -L' is no longer incorrect when standard
  output is /dev/null.
  [Bug#37716 introduced in grep 3.2]

  A performance bug has been fixed when grep is given many patterns,
  each with no back-reference.
  [Bug#33249 introduced in grep 2.5]

  A performance bug has been fixed for patterns like '01.2' that
  cause grep to reorder tokens internally.
  [Bug#34951 introduced in grep 3.2]

** Build-related

  The build procedure no longer relies on any already-built src/grep
  that might be absent or broken. Instead, it uses the system 'grep'
  to bootstrap, and uses src/grep only to test the build. On Solaris
  /usr/bin/grep is broken, but you can install GNU or XPG4 'grep' from
  the standard Solaris distribution before building GNU Grep yourself.
  [bug introduced in grep 2.8]


* Noteworthy changes in release 3.3 (2018-12-20) [stable]

** Bug fixes

  Some uses of \b in the C locale and with the DFA matcher would fail, e.g.,
  the following would print nothing (it should print the input line):
    echo 123-x|LC_ALL=C grep '.\bx'
  Using a multibyte locale, using certain regexp constructs (some ranges,
  back-references), or forcing use of the PCRE matcher via --perl-regexp (-P)
  would avoid the bug.
  [bug introduced in grep 3.2]


* Noteworthy changes in release 3.2 (2018-12-20) [stable]

** Changes in behavior

  The --files-without-match (-L) option now causes grep to succeed
  when a file is listed, instead of when a line is selected. This
  resembles what git-grep does.

** Bug fixes

  The --recursive (-r) option no longer fails on MS-Windows.
  [bug introduced in grep 2.11]

** Improvements

  An over-30x performance improvement when many 'or'd expressions
  share a common prefix, thanks to improvements in gnulib's dfa.c,
  by Norihiro Tanaka. See gnulib commits v0.1-2110-ge648401be,
  v0.1-2111-g4299106ce, v0.1-2117-g617a60974

  An additional 3-23% speed-up when searching large files, via
  increased initial buffer size.

  grep now diagnoses stack overflow. Before grep-2.6, the included
  regexp code would detect it. Since 2.6, grep defaulted to using
  glibc's regexp, which lost that capability.


* Noteworthy changes in release 3.1 (2017-07-02) [stable]

** Improvements

  grep '[0-9]' is now just as fast as grep '[[:digit:]]' when run
  in a multi-byte locale. Before, it was several times slower.

** Changes in behavior

  Context no longer excludes selected lines omitted because of -m.
  For example, 'grep "^" -m1 -A1' now outputs the first two input
  lines, not just the first line. This fixes a glitch that has been
  present since -m was added in grep 2.5.
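
  The -m and -A interaction can be tried on a three-line input. This
  is a sketch that assumes grep 3.1 or later; the sample text and the
  variable name are illustrative only.

```shell
# '^' selects every line, -m1 stops after the first selected line, and
# -A1 asks for one line of trailing context after it.
ctx_out=$(printf 'one\ntwo\nthree\n' | grep -m1 -A1 '^')
printf '%s\n' "$ctx_out"
```

  Here "two" is itself a selectable line that -m1 left unselected, and
  it is still shown as trailing context for "one".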

  The following changes affect only MS-Windows platforms. First, the
  --binary (-U) option now governs whether binary I/O is used, instead
  of a heuristic that was sometimes incorrect. Second, the
  --unix-byte-offsets (-u) option now has no effect on MS-Windows too.


* Noteworthy changes in release 3.0 (2017-02-09) [stable]

** Bug fixes

  grep without -F no longer goes awry when given two or more patterns
  that contain no special characters other than '\' and also contain a
  subpattern like '\.' that escapes a character to make it ordinary.
  [bug introduced in grep 2.28]

  grep no longer fails to build on PCRE versions before 8.20.
  [bug introduced in grep 2.28]


* Noteworthy changes in release 2.28 (2017-02-06) [stable]

** Bug fixes

  When grep -Fo finds matches of differing length, it could
  mistakenly print a shorter one. Now it prints a longest one.
  [bug introduced in grep-2.26]

  When standard output is /dev/null, grep no longer fails when
  standard input is a file in the Linux /proc file system, or when
  standard input is a pipe and standard output is in append mode.
  [bugs introduced in grep-2.27]

  Fix performance regression with multiple patterns, e.g., for -Fi in
  a multi-byte locale, or for -Fw in a single-byte locale.
  [bugs introduced in grep-2.19, grep-2.22 and grep-2.26]

** Improvements

  Improve performance for -E or -G pattern lists that are easily
  converted to -F format.


* Noteworthy changes in release 2.27 (2016-12-06) [stable]

** Bug fixes

  grep no longer reports a false match in a multibyte, non-UTF8 locale
  like zh_CN.gb18030, with a regular expression like ".*7" that just
  happens to match the 4-byte representation of gb18030's \uC9, the
  final byte of which is the digit "7".
  [bug introduced in grep-2.19]

  Unless an early-exit option like -q, -l, -L, -m, or -f /dev/null is
  specified, grep now reads all of a non-seekable standard input,
  even if this cannot affect grep's output or exit status. This works
  better with nonportable scripts that run "PROGRAM | grep PATTERN
  >/dev/null" where PROGRAM dies when writing into a broken pipe.
  [bug introduced in grep-2.26]

  grep no longer mishandles ranges in nontrivial unibyte locales.
  [bug introduced in grep-2.26]

  grep -P no longer attempts multiline matches. This works more
  intuitively with unusual patterns, and means that grep -Pz no longer
  rejects patterns containing ^ and $ and works when combined with -x.
  [bugs introduced in grep-2.23] A downside is that grep -P is now
  significantly slower, albeit typically still faster than pcregrep.

  grep -m0 -L PAT FILE now outputs "FILE". [bug introduced in grep-2.5]

  To output ':' and tab-align the following character C, grep -T no
  longer outputs tab-backspace-':'-C, an approach that has problems if
  run inside an Emacs shell window. [bug introduced in grep-2.5.2]

  grep -T now uses worst-case widths of line numbers and byte offsets
  instead of guessing widths that might not work with larger files.
  [bug introduced in grep-2.5.2]

  grep's use of getprogname no longer causes a build failure on HP-UX.

** Improvements

  grep no longer reads the input in a few more cases when it is easy
  to see that matching cannot succeed, e.g., 'grep -f /dev/null'.


* Noteworthy changes in release 2.26 (2016-10-02) [stable]

** Bug fixes

  Grep no longer omits output merely because it follows an output line
  suppressed due to encoding errors. [bug introduced in grep-2.21]

  In the Shift_JIS locale, grep no longer mistakenly matches in the
  middle of a multibyte character. [bug present since "the beginning"]

** Improvements

  grep can be much faster now when standard output is /dev/null.

  grep -F is now typically much faster when many patterns are given,
  as it now uses the Aho-Corasick algorithm instead of the
  Commentz-Walter algorithm in that case.

  grep -iF is typically much faster in a multibyte locale, if the
  pattern and its case counterparts contain only single byte characters.

  grep with complicated expressions (e.g., back-references) and without
  -i now uses the regex fastmap for better performance.

  In multibyte locales, grep now handles leading "." in patterns more
  efficiently.

  grep now prints a "FILENAME:LINENO: " prefix when diagnosing an
  invalid regular expression that was read from an '-f'-specified file.


* Noteworthy changes in release 2.25 (2016-04-21) [stable]

** Bug fixes

  In the C or POSIX locale, grep now treats all bytes as valid
  characters even if the C runtime library says otherwise. The
  revised behavior is more compatible with the original intent of
  POSIX, and the next release of POSIX will likely make this official.
  [bug introduced in grep-2.23]

  grep -Pz no longer mistakenly diagnoses patterns like [^a] that use
  negated character classes. [bug introduced in grep-2.24]

  grep -oz now uses null bytes, not newlines, to terminate output lines.
  [bug introduced in grep-2.5]
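
  The NUL-terminated output of -oz can be made visible by translating
  the NULs to newlines. This is a sketch; the sample records and the
  variable name are assumptions made for illustration.

```shell
# Two NUL-terminated records. -o prints each digit match, and -z makes
# grep terminate each output item with NUL rather than newline.
# tr converts the NULs to newlines so that the shell can capture them.
oz_out=$(printf 'a1\0b2\0' | LC_ALL=C grep -oz '[0-9]' | tr '\0' '\n')
printf '%s\n' "$oz_out"
```

  Keeping the NUL terminators matters when the output is passed to a
  consumer such as 'xargs -0' and the matches may contain newlines.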

** Improvements

  grep now outputs details more consistently when reporting a write error.
  E.g., "grep: write error: No space left on device" rather than just
  "grep: write error".


* Noteworthy changes in release 2.24 (2016-03-10) [stable]

** Bug fixes

  grep -z would match strings it should not. To trigger the bug, you'd
  have to use a regular expression including an anchor (^ or $) and a
  feature like a range or a back-reference, causing grep to forego its DFA
  matcher and resort to using re_search. With a multibyte locale, that
  matcher could mistakenly match a string containing a newline.
  For example, this command:
    printf 'a\nb\0' | LC_ALL=en_US.utf-8 grep -z '^[a-b]*b'
  would mistakenly match and print all four input bytes. After the fix,
  there is no match, as expected.
  [bug introduced in grep-2.7]

  grep -Pz now diagnoses attempts to use patterns containing ^ and $,
  instead of mishandling these patterns. This problem seems to be
  inherent to the PCRE API; removing this limitation is on PCRE's
  maint/README wish list. Patterns can continue to match literal ^
  and $ by escaping them with \ (now needed even inside [...]).
  [bug introduced in grep-2.5]


* Noteworthy changes in release 2.23 (2016-02-04) [stable]

** Bug fixes

  Binary files are now less likely to generate diagnostics and more
  likely to yield text matches. grep now reports "Binary file FOO
  matches" and suppresses further output instead of outputting a line
  containing an encoding error; hence grep can now report matching text
  before a later binary match. Formerly, grep reported FOO to be
  binary when it found an encoding error in FOO before generating
  output for FOO, which meant it never reported both matching text and
  matching binary data; this was less useful for searching text
  containing encoding errors in non-matching lines.
  [bug introduced in grep-2.21]

  grep -c no longer stops counting when finding binary data.
  [bug introduced in grep-2.21]

  grep no longer outputs encoding errors in unibyte locales.
  For example, if the byte '\x81' is not a valid character in a
  unibyte locale, grep treats the byte as binary data.
  [bug introduced in grep-2.21]

  grep -oP is no longer susceptible to an infinite loop when processing
  invalid UTF8 just before a match.
  [bug introduced in grep-2.22]

  --exclude and related options are now matched against trailing
  parts of command-line arguments, not against the entire arguments.
  This partly reverts the --exclude-related change in 2.22.
  [bug introduced in grep-2.22]

  --line-buffer is no longer ineffective when combined with -l.
  [bug introduced in grep-2.5]

  -xw is now equivalent to -x more consistently, with -P and with backrefs.
  [bug only partially fixed in grep-2.19]


* Noteworthy changes in release 2.22 (2015-11-01) [stable]

** Improvements

  Performance has improved for patterns containing very long strings,
  reducing preprocessing time for an N-byte regexp from O(N^2) to
  only slightly superlinear for most patterns. Before, a command like
  the following would take over a minute, but now, it takes less than
  a second:
    : | grep -f <(seq -s '' 99999)

  When building grep, 'configure' now uses PCRE's pkg-config module for
  configuration information, rather than attempting to guess it by hand.

** Bug fixes

  A DFA matcher bug made this command mistakenly print its input line:
    echo axb | grep -E '^x|x$'
  Likewise for this equivalent command:
    echo axb | grep -e '^x' -e 'x$'
  [bug introduced in grep-2.19]

  grep no longer reads from uninitialized memory or from beyond the end
  of the heap-allocated input buffer. This fix addressed CVE-2015-1345.
  [bug introduced in grep-2.19]

  With -z, '.' and '[^x]' in a pattern now consistently match newline.
  Previously, they sometimes matched newline, and sometimes did not.
  [bug introduced in grep-2.4]
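
  The -z behavior can be checked with a single NUL-terminated record
  that contains a newline. This sketch assumes grep 2.22 or later; the
  sample input and the variable name are illustrative only.

```shell
# One NUL-terminated record with an embedded newline between 'a' and 'b'.
# With -z the newline is ordinary data, so '.' is expected to match it.
if printf 'a\nb\0' | LC_ALL=C grep -qz 'a.b'; then
  dot_nl=match
else
  dot_nl=no-match
fi
echo "$dot_nl"
```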

  When the JIT stack is exhausted, grep -P now grows the stack rather
  than reporting an internal PCRE error.

  'grep -D skip PATTERN FILE' no longer hangs if FILE is a fifo.
  [bug introduced in grep-2.12]

  --exclude and related options are now matched against entire
  command-line arguments, not against command-line components.
  [bug introduced in grep-2.6]

  Fix performance degradation of grep -Fw in unibyte locales.
  [bug introduced in grep-2.19]


* Noteworthy changes in release 2.21 (2014-11-23) [stable]

** Improvements

  Performance has been greatly improved for searching files containing
  holes, on platforms where lseek's SEEK_DATA flag works efficiently.

  Performance has improved for rejecting data that cannot match even
  the first part of a nontrivial pattern.

  Performance has improved for very long strings in patterns.

  If a file contains data improperly encoded for the current locale,
  and this is discovered before any of the file's contents are output,
  grep now treats the file as binary.

  grep -P no longer reports an error and exits when given invalid UTF-8 data.
  Instead, it considers the data to be non-matching.

** Bug fixes

  grep no longer mishandles patterns that contain \w or \W in multibyte
  locales.

  grep would fail to count newlines internally when operating in non-UTF8
  multibyte locales, leading it to print potentially many lines that did
  not match. E.g., the command, "seq 10 | env LC_ALL=zh_CN src/grep -n .."
  would print this:
    1:1
    2
    3
    4
    5
    6
    7
    8
    9
    10
  implying that the match, "10" was on line 1.
  [bug introduced in grep-2.19]

  grep -F -x -o no longer prints an extra newline for each match.
  [bug introduced in grep-2.19]

  grep in a non-UTF8 multibyte locale could mistakenly match in the middle
  of a multibyte character when using a '^'-anchored alternate in a pattern,
  leading it to print non-matching lines. [bug present since "the beginning"]

  grep -F Y no longer fails to match in non-UTF8 multibyte locales like
  Shift-JIS, when the input contains a 2-byte character, XY, followed by
  the single-byte search pattern, Y. grep would find the first, middle-
  of-multibyte matching "Y", and then mistakenly advance an internal
  pointer one byte too far, skipping over the target "Y" just after that.
  [bug introduced in grep-2.19]

  grep -E rejected unmatched ')', instead of treating it like '\)'.
  [bug present since "the beginning"]

  On NetBSD, grep -r no longer reports "Inappropriate file type or format"
  when refusing to follow a symbolic link.
  [bug introduced in grep-2.12]

** Changes in behavior

  The GREP_OPTIONS environment variable is now obsolescent, and grep
  now warns if it is used. Please use an alias or script instead.
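
  A replacement for a GREP_OPTIONS setting could be as small as a
  shell function. This is a sketch: the function name "igrep" and the
  choice of -i as the default option are assumptions for illustration.

```shell
# Formerly:
#   GREP_OPTIONS='-i'; export GREP_OPTIONS

# A wrapper function that carries the default option instead.
# "command" bypasses any alias or function that is itself named grep.
igrep() {
  command grep -i "$@"
}

igrep_out=$(printf 'Hello\nworld\n' | igrep hello)
echo "$igrep_out"
```

  Unlike the environment variable, a wrapper of this kind affects only
  the callers that name it, so scripts that invoke plain grep keep the
  default behavior.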

  In locales with multibyte character encodings other than UTF-8,
  grep -P now reports an error and exits instead of misbehaving.

  When searching binary data, grep now may treat non-text bytes as
  line terminators. This can boost performance significantly.

  grep -z no longer automatically treats the byte '\200' as binary data.

* Noteworthy changes in release 2.20 (2014-06-03) [stable]

** Bug fixes

  grep --max-count=N FILE would no longer stop reading after the Nth match.
  I.e., while grep would still print the correct output, it would continue
  reading until end of input, and hence, potentially forever.
  [bug introduced in grep-2.19]

  A command like echo aa|grep -E 'a(b$|c$)' would mistakenly
  report the input as a matched line.
  [bug introduced in grep-2.19]

** Changes in behavior

  grep --exclude-dir='FOO/' now excludes the directory FOO.
  Previously, the trailing slash meant the option was ineffective.
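
  The trailing-slash form can be tried on a scratch directory tree.
  This sketch assumes grep 2.20 or later; the directory layout, file
  names, and variable names are hypothetical.

```shell
top=$(mktemp -d)
mkdir "$top/FOO" "$top/keep"
printf 'needle\n' > "$top/FOO/a.txt"
printf 'needle\n' > "$top/keep/b.txt"

# Recursive search from inside the tree, listing matching file names.
# The exclusion pattern is written with a trailing slash.
found=$(cd "$top" && grep -rl --exclude-dir='FOO/' needle .)

rm -rf "$top"
echo "$found"
```

  Only the file under "keep" should be listed, since the FOO directory
  is skipped despite the trailing slash in the pattern.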


* Noteworthy changes in release 2.19 (2014-05-22) [stable]

** Improvements

  Performance has improved, typically by 10% and in some cases by a
  factor of 200. However, performance of grep -P in UTF-8 locales has
  gotten worse as part of the fix for the crashes mentioned below.

** Bug fixes

  grep no longer mishandles patterns like [a-[.z.]], and no longer
  mishandles patterns like [^a] in locales that have multicharacter
  collating sequences so that [^a] can match a string of two characters.

  grep no longer mishandles an empty pattern at the end of a pattern list.
  [bug introduced in grep-2.5]

  grep -C NUM now outputs separators consistently even when NUM is zero,
  and similarly for grep -A NUM and grep -B NUM.
  [bug present since "the beginning"]

  grep -f no longer mishandles patterns containing NUL bytes.
  [bug introduced in grep-2.11]

  Plain grep, grep -E, and grep -F now treat encoding errors in patterns
  the same way the GNU regular expression matcher treats them, with respect
  to whether the errors can match parts of multibyte characters in data.
  [bug present since "the beginning"]

  grep -w no longer mishandles a potential match adjacent to a letter that
  takes up two or more bytes in a multibyte encoding.
  Similarly, the patterns '\<', '\>', '\b', and '\B' no longer
  mishandle word-boundary matches in multibyte locales.
  [bug present since "the beginning"]

  grep -P now reports an error and exits when given invalid UTF-8 data.
  Previously it was unreliable, and sometimes crashed or looped.
  [bug introduced in grep-2.16]

  grep -P now works with -w and -x and back-references. Before,
    echo aa|grep -Pw '(.)\1' would fail to match, yet
    echo aa|grep -Pw '(.)\2' would match.

  grep -Pw now works like grep -w in that the matched string has to be
  preceded and followed by non-word components or the beginning and end
  of the line (as opposed to word boundaries before). Before, this
    echo a@@a| grep -Pw @@ would match, yet this
    echo a@@a| grep -w @@ would not. Now, they both fail to match,
  per the documentation on how grep's -w works.

  grep -i no longer mishandles patterns containing titlecase characters.
  For example, in a locale containing the titlecase character
  'Lj' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J),
  'grep -i Lj' now matches both 'LJ' (U+01C7 LATIN CAPITAL LETTER LJ)
  and 'lj' (U+01C9 LATIN SMALL LETTER LJ).


* Noteworthy changes in release 2.18 (2014-02-20) [stable]

** Bug fixes

  grep no longer mishandles patterns like [^^-~] in unibyte locales.
  [bug introduced in grep-2.8]

  grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower
  than in 2.16. [bug introduced in grep-2.17]


* Noteworthy changes in release 2.17 (2014-02-17) [stable]

** Improvements

  grep -i in a multibyte locale is now typically 10 times faster
  for patterns that do not contain \ or [.

  grep (without -i) in a multibyte locale is now up to 7 times faster
  when processing many matched lines.

** Maintenance

  grep's --mmap option was disabled in March of 2010, and began to
  elicit a warning in January of 2012. Now it is completely gone.


* Noteworthy changes in release 2.16 (2014-01-01) [stable]

** Bug fixes

  Fix gnulib-provided maint.mk so that the release procedure described
  in README-release actually does what we want. Before that fix, that
  procedure resulted in a grep-2.15 tarball that would lead to a grep
  binary whose --version-reported version number was 2.14.51...

  The fix to make \s and \S work with multi-byte white space broke
  the use of each shortcut whenever followed by a repetition operator.
  For example, \s*, \s+, \s? and \s{3} would all malfunction in a
  multi-byte locale. [bug introduced in grep-2.15]

  The fix to make grep -P work better with UTF-8 made it possible for
  grep to evoke a larger set of PCRE errors, some of which could trigger
  an abort. E.g., this would abort:
    printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y
  Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]

  Handle very long lines (2GiB and longer) on systems with a deficient
  read system call.

639 | * Noteworthy changes in release 2.15 (2013-10-26) [stable]
|
---|
640 |
|
---|
641 | ** Bug fixes
|
---|
642 |
|
---|
643 | grep's \s and \S failed to work with multi-byte white space characters.
|
---|
644 | For example, \s would fail to match a non-breaking space, and this
|
---|
645 | would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s'
|
---|
646 | A related bug is that \S would mistakenly match an invalid multibyte
|
---|
647 | character. For example, the following would match:
|
---|
648 | printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$'
|
---|
649 | [bug present since grep-2.6]
|
---|
650 |
|
---|
651 | grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin)
|
---|
652 | when converting an input string containing certain 4-byte UTF-8
|
---|
653 | sequences to lower case. The conversions to wchar_t and back to
|
---|
654 | a UTF-8 multibyte string did not take surrogate pairs into account.
|
---|
655 | [bug present since at least grep-2.6, though the segfault is new with 2.13]
|
---|
656 |
|
---|
657 | grep -E would segfault when given a regexp like '([^.]*[M]){1,2}'
|
---|
658 | for any multibyte character M. [bug introduced in grep-2.6, which would
|
---|
659 | segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would
|
---|
660 | hit a failed assertion. ]
|
---|
661 |
|
---|
662 | grep -F would get stuck in an infinite loop when given a search string
|
---|
663 | that is an invalid byte sequence in the current locale and that matches
|
---|
664 | the bytes of the input twice on a line. Now grep fails with exit status 1.
|
---|
665 |
|
---|
666 | grep -P could misbehave. While multi-byte mode is only supported by PCRE
|
---|
667 | with UTF-8 locales, grep did not activate it. This would cause failures
|
---|
668 | to match multibyte characters against some regular expressions, especially
|
---|
669 | those including the '.' or '\p' metacharacters.
|
---|
670 |
|
---|
671 | ** New features
|
---|
672 |
|
---|
673 | grep -P can now use a just-in-time compiler to greatly speed up matches.
|
---|
674 | This feature is transparent to the user; no flag is required to enable
|
---|
675 | it. It is only available if the corresponding support in the PCRE
|
---|
676 | library is detected when grep is compiled.
|
---|
677 |
|
---|
678 |
|
---|
679 | * Noteworthy changes in release 2.14 (2012-08-20) [stable]
|
---|
680 |
|
---|
681 | ** Bug fixes
|
---|
682 |
|
---|
683 | grep -i '^$' could exit 0 (i.e., report a match) in a multi-byte locale,
|
---|
684 | even though there was no match, and the command generated no output.
|
---|
685 | E.g., seq 2 | LC_ALL=en_US.utf8 grep -il '^$' would mistakenly print
|
---|
686 | "(standard input)". Related, seq 9 | LC_ALL=en_US.utf8 grep -in '^$'
|
---|
687 | would print "2:4:6:8:10:12:14:16" and exit 0. Now it prints nothing
|
---|
688 | and exits with status of 1. [bug introduced in grep-2.6]
|
---|
689 |
|
---|
690 | 'grep' no longer falsely reports text files as being binary on file
|
---|
691 | systems that compress contents or that store tiny contents in metadata.
|
---|
692 |
|
---|
693 |
|
---|
694 | * Noteworthy changes in release 2.13 (2012-07-04) [stable]
|
---|
695 |
|
---|
696 | ** Bug fixes
|
---|
697 |
|
---|
698 | grep -i, in a multi-byte locale, when matching a line containing a character
|
---|
699 | like the UTF-8 Turkish I-with-dot (U+0130) (whose lower-case representation
|
---|
700 | occupies fewer bytes), would print an incomplete output line.
|
---|
701 | Similarly, with a matched line containing a character (e.g., the Latin
|
---|
702 | capital I in a Turkish UTF-8 locale), where the lower-case representation
|
---|
703 | occupies more bytes, grep could print garbage.
|
---|
704 | [bug introduced in grep-2.6]
|
---|
705 |
|
---|
706 | --include and --exclude can again be combined, and again apply to
|
---|
707 | the command line, e.g., "grep --include='*.[ch]' --exclude='system.h'
|
---|
708 | PATTERN *" again reads all *.c and *.h files except for system.h.
|
---|
709 | [bug introduced in grep-2.6]
|
---|
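The combination above can be sketched as follows. This is only an illustration, assuming GNU grep and a scratch directory; every file name other than system.h is hypothetical.

```shell
# Scratch directory with hypothetical file names; every file contains PATTERN.
tmp=$(mktemp -d)
for f in main.c util.h system.h notes.txt; do
  printf 'PATTERN\n' > "$tmp/$f"
done

# --include keeps only the *.c and *.h operands; --exclude then drops system.h.
found=$(cd "$tmp" && grep -l --include='*.[ch]' --exclude='system.h' PATTERN * | sort | paste -sd, -)
rm -rf "$tmp"
printf '%s\n' "$found"
```

The -l option is used here only to list the files that were actually searched and matched.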
710 |
|
---|
711 | ** New features
|
---|
712 |
|
---|
713 | 'grep' without -z now treats a sparse file as binary, if it can
|
---|
714 | easily determine that the file is sparse.
|
---|
715 |
|
---|
716 | ** Dropped features
|
---|
717 |
|
---|
718 | Bootstrapping with Makefile.boot has been broken since grep 2.6,
|
---|
719 | and was removed.
|
---|
720 |
|
---|
721 |
|
---|
722 | * Noteworthy changes in release 2.12 (2012-04-23) [stable]
|
---|
723 |
|
---|
724 | ** Bug fixes
|
---|
725 |
|
---|
726 | "echo P|grep --devices=skip P" once again prints P, as it did in 2.10
|
---|
727 | [bug introduced in grep-2.11]
|
---|
728 |
|
---|
729 | grep no longer segfaults with -r --exclude-dir and no file operand.
|
---|
730 | I.e., ":|grep -r --exclude-dir=D PAT" would segfault.
|
---|
731 | [bug introduced in grep-2.11]
|
---|
732 |
|
---|
733 | Recursive grep now uses fts for directory traversal, so it can
|
---|
734 | handle much-larger directories without reporting things like "File
|
---|
735 | name too long", and it can run much faster when dealing with large
|
---|
736 | directory hierarchies. [bug present since the beginning]
|
---|
737 |
|
---|
738 | grep -E 'a{1000000000}' now reports an overflow error rather than
|
---|
739 | silently acting like grep -E 'a\{1000000000}'.
|
---|
740 |
|
---|
741 | grep -E 'a{,10}' was not treated equivalently to grep -E 'a{0,10}'.
|
---|
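A small check of the equivalence described above, assuming GNU grep; the sample lines are made up for illustration.

```shell
# In an ERE, an interval with an omitted lower bound starts at zero,
# so both spellings should select the same lines.
sample=$(printf 'b\naab\n')
omitted=$(printf '%s\n' "$sample" | grep -cE '^a{,10}b$')
explicit=$(printf '%s\n' "$sample" | grep -cE '^a{0,10}b$')
printf '%s %s\n' "$omitted" "$explicit"
```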
742 |
|
---|
743 | ** New features
|
---|
744 |
|
---|
745 | The -R option now has a long-option alias --dereference-recursive.
|
---|
746 |
|
---|
747 | ** Changes in behavior
|
---|
748 |
|
---|
749 | The -r (--recursive) option now follows only command-line symlinks.
|
---|
750 | Also, by default -r now reads a device only if it is named on the command
|
---|
751 | line; this can be overridden with --devices. -R acts as before, so
|
---|
752 | use -R if you prefer the old behavior of following all symlinks and
|
---|
753 | defaulting to reading all devices.
|
---|
754 |
|
---|
755 |
|
---|
756 | * Noteworthy changes in release 2.11 (2012-03-02) [stable]
|
---|
757 |
|
---|
758 | ** Bug fixes
|
---|
759 |
|
---|
760 | grep no longer dumps core on lines whose lengths do not fit in 'int'
|
---|
761 | (e.g., lines longer than 2 GiB on a typical 64-bit host).
|
---|
762 | Instead, grep either works as expected, or reports an error.
|
---|
763 | An error can occur if not enough main memory is available, or if the
|
---|
764 | GNU C library's regular expression functions cannot handle such long lines.
|
---|
765 | [bug present since "the beginning"]
|
---|
766 |
|
---|
767 | The -m, -A, -B, and -C options no longer mishandle context line
|
---|
768 | counts that do not fit in 'int'. Also, grep -c's counts are now
|
---|
769 | limited by the type 'intmax_t' (typically less than 2**63) rather
|
---|
770 | than 'int' (typically less than 2**31).
|
---|
771 |
|
---|
772 | grep no longer silently suppresses errors when reading a directory
|
---|
773 | as if it were a text file. For example, "grep x ." now reports a
|
---|
774 | read error on most systems; formerly, it ignored the error.
|
---|
775 | [bug introduced in grep-2.5]
|
---|
776 |
|
---|
777 | grep now exits with status 2 if a directory loop is found,
|
---|
778 | instead of possibly exiting with status 0 or 1.
|
---|
779 | [bug introduced in grep-2.3]
|
---|
780 |
|
---|
781 | The -s option now suppresses certain input error diagnostics that it
|
---|
782 | formerly failed to suppress. These include errors when closing the
|
---|
783 | input, when lseeking the input, and when the input is also the output.
|
---|
784 | [bug introduced in grep-2.4]
|
---|
785 |
|
---|
786 | On POSIX systems, commands like "grep PAT < FILE >> FILE"
|
---|
787 | now report an error instead of looping.
|
---|
788 | [bug present since "the beginning"]
|
---|
789 |
|
---|
790 | The --include, --exclude, and --exclude-dir options now handle
|
---|
791 | command-line arguments more consistently. --include and --exclude
|
---|
792 | apply only to non-directories and --exclude-dir applies only to
|
---|
793 | directories. "-" (standard input) is never excluded, since it is
|
---|
794 | not a file name.
|
---|
795 | [bug introduced in grep-2.5]
|
---|
796 |
|
---|
797 | grep no longer rejects "grep -qr . > out", i.e., when run with -q
|
---|
798 | and an input file is the same as the output file, since with -q
|
---|
799 | grep generates no output, so there is no risk of infinite loop or
|
---|
800 | of an output-affecting race condition. Thus, the use of the following
|
---|
801 | options also disables the input-equals-output failure:
|
---|
802 | --max-count=N (-m) (for N >= 2)
|
---|
803 | --files-with-matches (-l)
|
---|
804 | --files-without-match (-L)
|
---|
805 | [bug introduced in grep-2.10]
|
---|
806 |
|
---|
807 | grep no longer emits an error message and quits on MS-Windows when
|
---|
808 | invoked with the -r option.
|
---|
809 |
|
---|
810 | grep no longer misinterprets some alternations involving anchors
|
---|
811 | (^, $, \<, \>, \B, \b). For example, grep -E "(^|\B)a" no
|
---|
812 | longer reports a match for the string "x a".
|
---|
813 | [bug present since "the beginning"]
|
---|
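The example above can be tried as a short sketch, assuming GNU grep (\B is a GNU extension); the input lines other than "x a" are chosen here for contrast.

```shell
# In "x a" the 'a' follows a space, so neither ^ nor \B (not a word boundary)
# can match just before it. In "xa" the 'a' sits inside a word, and in "a" it
# is at the start of the line, so those two lines are selected.
hits=$(printf 'x a\nxa\na\n' | grep -E '(^|\B)a' | paste -sd, -)
printf '%s\n' "$hits"
```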
814 |
|
---|
815 | ** New features
|
---|
816 |
|
---|
817 | If no file operand is given, and a command-line -r or equivalent
|
---|
818 | option is given, grep now searches the working directory. Formerly
|
---|
819 | grep ignored the -r and searched standard input nonrecursively.
|
---|
820 | An -r found in GREP_OPTIONS does not have this new effect.
|
---|
821 |
|
---|
822 | grep now supports color highlighting of matches on MS-Windows.
|
---|
823 |
|
---|
824 | ** Changes in behavior
|
---|
825 |
|
---|
826 | Use of the --mmap option now elicits a warning. It has been a no-op
|
---|
827 | since March of 2010.
|
---|
828 |
|
---|
829 | grep no longer diagnoses write errors repeatedly; it exits after
|
---|
830 | diagnosing the first write error. This is better behavior when
|
---|
831 | writing to a dangling pipe.
|
---|
832 |
|
---|
833 | Syntax errors in GREP_COLORS are now ignored, instead of sometimes
|
---|
834 | eliciting warnings. This is more consistent with programs that
|
---|
835 | (e.g.) ignore errors in termcap entries.
|
---|
836 |
|
---|
837 | * Noteworthy changes in release 2.10 (2011-11-16) [stable]
|
---|
838 |
|
---|
839 | ** Bug fixes
|
---|
840 |
|
---|
841 | grep no longer mishandles high-bit-set pattern bytes on systems
|
---|
842 | where "char" is a signed type. [bug appears to affect only MS-Windows]
|
---|
843 |
|
---|
844 | On POSIX systems, grep now rejects a command like "grep -r pattern . > out",
|
---|
845 | in which the output file is also one of the inputs,
|
---|
846 | because it can result in an "infinite" disk-filling loop.
|
---|
847 | [bug present since "the beginning"]
|
---|
848 |
|
---|
849 | ** Build-related
|
---|
850 |
|
---|
851 | "make dist" no longer builds .tar.gz files.
|
---|
852 | xz is portable enough and in wide-enough use that distributing
|
---|
853 | only .tar.xz files is enough.
|
---|
854 |
|
---|
855 |
|
---|
856 | * Noteworthy changes in release 2.9 (2011-06-21) [stable]
|
---|
857 |
|
---|
858 | ** Bug fixes
|
---|
859 |
|
---|
860 | grep no longer clobbers heap for an ERE like '(^| )*( |$)'
|
---|
861 | [bug introduced in grep-2.6]
|
---|
862 |
|
---|
863 | grep is faster on regular expressions that match multibyte characters
|
---|
864 | in brackets (such as '[áéíóú]').
|
---|
865 |
|
---|
866 | echo c|grep '[c]' would fail for any c in 0x80..0xff, with a uni-byte
|
---|
867 | encoding for which the byte-to-wide-char mapping is nontrivial. For
|
---|
868 | example, the ISO-8859-1 locales are not affected, but ru_RU.KOI8-R is.
|
---|
869 | [bug introduced in grep-2.6]
|
---|
870 |
|
---|
871 | grep -P no longer aborts when PCRE's backtracking limit is exceeded.
|
---|
872 | Before, echo aaaaaaaaaaaaaab |grep -P '((a+)*)+$' would abort. Now,
|
---|
873 | it diagnoses the problem and exits with status 2.
|
---|
874 |
|
---|
875 |
|
---|
876 | * Noteworthy changes in release 2.8 (2011-05-13) [stable]
|
---|
877 |
|
---|
878 | ** Bug fixes
|
---|
879 |
|
---|
880 | echo c|grep '[c]' would fail for any c in 0x80..0xff, and in many locales.
|
---|
881 | E.g., printf '\xff\n'|grep "$(printf '[\xff]')" || echo FAIL
|
---|
882 | would print FAIL rather than the required matching line.
|
---|
883 | [bug introduced in grep-2.6]
|
---|
884 |
|
---|
885 | grep's interpretation of range expressions is now more consistent with
|
---|
886 | that of other tools. [bug present since multi-byte character set
|
---|
887 | support was introduced in 2.5.2, though the steps needed to reproduce
|
---|
888 | it changed in grep-2.6]
|
---|
889 |
|
---|
890 | grep erroneously returned with exit status 1 on some memory allocation
|
---|
891 | failure. [bug present since "the beginning"]
|
---|
892 |
|
---|
893 |
|
---|
894 | * Noteworthy changes in release 2.7 (2010-09-16) [stable]
|
---|
895 |
|
---|
896 | ** Bug fixes
|
---|
897 |
|
---|
898 | grep --include=FILE works once again, rather than working like --exclude=FILE
|
---|
899 | [bug introduced in grep-2.6]
|
---|
900 |
|
---|
901 | Searching with grep -Fw for an empty string would not match an
|
---|
902 | empty line. [bug present since "the beginning"]
|
---|
903 |
|
---|
904 | X{0,0} is implemented correctly. It used to be a synonym of X{0,1}.
|
---|
905 | [bug present since "the beginning"]
|
---|
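To illustrate the difference between the two intervals, here is a minimal sketch assuming GNU grep; the sample input is hypothetical.

```shell
# 'a{0,0}' means exactly zero occurrences of 'a', so only the bare "b" line
# fits between the anchors; 'a{0,1}' additionally accepts "ab".
zero=$(printf 'b\nab\n' | grep -E '^a{0,0}b$' | paste -sd, -)
zero_or_one=$(printf 'b\nab\n' | grep -E '^a{0,1}b$' | paste -sd, -)
printf '%s\n%s\n' "$zero" "$zero_or_one"
```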
906 |
|
---|
907 | In multibyte locales, regular expressions including back-references
|
---|
908 | no longer exhibit quadratic complexity (i.e., they are orders
|
---|
909 | of magnitude faster). [bug present since multi-byte character set
|
---|
910 | support was introduced in 2.5.2]
|
---|
911 |
|
---|
912 | In UTF-8 locales, regular expressions including "." can be orders
|
---|
913 | of magnitude faster. For example, "grep ." is now twice as fast
|
---|
914 | as "grep -v ^$", instead of being immensely slower. It remains
|
---|
915 | slow in other multibyte locales. [bug present since multi-byte
|
---|
916 | character set support was introduced in 2.5.2]
|
---|
917 |
|
---|
918 | --mmap was meant to be ignored in 2.6.x, but it was instead
|
---|
919 | removed by mistake. [bug introduced in 2.6]
|
---|
920 |
|
---|
921 | ** New features
|
---|
922 |
|
---|
923 | grep now diagnoses (and fails with exit status 2) commonly mistyped
|
---|
924 | regular expressions like [:space:], [:digit:], etc. Before, those were
|
---|
925 | silently interpreted as [ac:eps] and [dgit:] respectively. Virtually
|
---|
926 | all who make that class of mistake should have used [[:space:]] or
|
---|
927 | [[:digit:]]. This new behavior is disabled when the POSIXLY_CORRECT
|
---|
928 | environment variable is set.
|
---|
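A short sketch of the intended spelling, assuming GNU grep; the sample inputs are made up. The last command spells out the plain character set that the mistyped [:space:] was formerly taken to mean.

```shell
# Correct syntax: the class name goes inside a bracket expression.
with_space=$(printf 'a b\nab\n' | grep '[[:space:]]')
digit_lines=$(printf 'x1\nxy\n' | grep -c '[[:digit:]]')

# The old silent interpretation of '[:space:]': any one of a, c, e, p, s or ':'.
# "cat" contains such a character; the line holding only a space does not.
old_meaning=$(printf 'cat\n \n' | grep -c '[ac:eps]')
printf '%s\n%s\n%s\n' "$with_space" "$digit_lines" "$old_meaning"
```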
929 |
|
---|
930 | On systems using glibc, grep can support equivalence classes. However,
|
---|
931 | whether they actually work depends on glibc's locale definitions.
|
---|
932 |
|
---|
933 | * Noteworthy changes in release 2.6.3 (2010-04-02) [stable]
|
---|
934 |
|
---|
935 | ** Bug fixes
|
---|
936 |
|
---|
937 | Searching with grep -F for an empty string in a multibyte locale
|
---|
938 | would hang grep. [bug introduced in 2.6.2]
|
---|
939 |
|
---|
940 | PCRE support is once again detected on systems with <pcre/pcre.h>
|
---|
941 | [bug introduced in 2.6.2]
|
---|
942 |
|
---|
943 |
|
---|
944 | * Noteworthy changes in release 2.6.2 (2010-03-29) [stable]
|
---|
945 |
|
---|
946 | ** Bug fixes
|
---|
947 |
|
---|
948 | grep -F no longer mistakenly reports a match when searching
|
---|
949 | for an incomplete prefix of a multibyte character.
|
---|
950 | [bug present since "the beginning"]
|
---|
951 |
|
---|
952 | grep -F no longer goes into an infinite loop when it finds a match for an
|
---|
953 | incomplete (non-prefix of a) multibyte character. [bug introduced in 2.6]
|
---|
954 |
|
---|
955 | Using any of the --include or --exclude* options would cause a NULL
|
---|
956 | dereference. [bugs introduced in 2.6]
|
---|
957 |
|
---|
958 | ** Build-related
|
---|
959 |
|
---|
960 | configure no longer relies on pkg-config to detect PCRE support.
|
---|
961 |
|
---|
962 |
|
---|
963 | * Noteworthy changes in release 2.6.1 (2010-03-25) [stable]
|
---|
964 |
|
---|
965 | ** Bug fixes
|
---|
966 |
|
---|
967 | Character classes could cause a segmentation fault if they included a
|
---|
968 | multibyte character. [bug introduced in 2.6]
|
---|
969 |
|
---|
970 | Character ranges would not work in single-byte character sets other
|
---|
971 | than C (for example, ISO-8859-1 or KOI8-R) and some multi-byte locales.
|
---|
972 | For example, this should print "1", but would find no match:
|
---|
973 | $ echo 1 | env -i LC_COLLATE=en_US.UTF-8 grep '[0-9]'
|
---|
974 | [bug introduced in 2.6]
|
---|
975 |
|
---|
976 | The output of grep was incorrect for whole-word (-w) matches if the
|
---|
977 | patterns included a back-reference. [bug introduced in grep-2.5.2]
|
---|
978 |
|
---|
979 | ** Portability
|
---|
980 |
|
---|
981 | Avoid a link failure on Solaris 8.
|
---|
982 |
|
---|
983 |
|
---|
984 | * Noteworthy changes in release 2.6 (2010-03-23) [stable]
|
---|
985 |
|
---|
986 | ** Speed improvements
|
---|
987 |
|
---|
988 | grep is much faster on multibyte character sets, especially (but not
|
---|
989 | limited to) UTF-8 character sets. The speed improvement is also very
|
---|
990 | pronounced with case-insensitive matches.
|
---|
991 |
|
---|
992 | ** Bug fixes
|
---|
993 |
|
---|
994 | Character classes would malfunction in multi-byte locales when using grep -i.
|
---|
995 | Examples which would print nothing for LC_ALL=en_US.UTF-8 include:
|
---|
996 | - for ranges, echo Z | grep -i '[a-z]'
|
---|
997 | - for single characters, echo Y | grep -i '[y]'
|
---|
998 | - for character types, echo Y | grep -i '[[:lower:]]'
|
---|
999 |
|
---|
1000 | grep -i -o would fail to report some matches; grep -i --color, while not
|
---|
1001 | missing any line containing a match, would fail to color some matches.
|
---|
1002 |
|
---|
1003 | grep would fail to report a match in a multibyte character set other than
|
---|
1004 | UTF-8, if another match occurred earlier in the line but started in the
|
---|
1005 | middle of a multibyte character.
|
---|
1006 |
|
---|
1007 | Various bugs in grep -P, caused by expressions such as [^b] or \S matching
|
---|
1008 | newlines, were fixed. grep -P also supports the special sequences \Z and
|
---|
1009 | \z, and can be combined with the command-line option -z to perform searches
|
---|
1010 | on NUL-separated records.
|
---|
1011 |
|
---|
1012 | grep would mistakenly exit with status 1 upon error, rather than 2,
|
---|
1013 | as it is documented to do.
|
---|
1014 |
|
---|
1015 | Using options like -1 -2 or -1 -v -2 results in two lines of
|
---|
1016 | context (the last value that appears on the command line) instead
|
---|
1017 | of twelve (the concatenation of all the values). This is consistent
|
---|
1018 | with the behavior of options -A/-B/-C.
|
---|
1019 |
|
---|
1020 | Two new command-line options, --group-separator=ARGUMENT and
|
---|
1021 | --no-group-separator, enable further customization of the output
|
---|
1022 | when -A, -B or -C is being used.
|
---|
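A minimal sketch of the two options, assuming GNU grep; the input and the '====' separator are arbitrary choices for illustration.

```shell
# Two matches ("m1" and "m2"), each followed by one context line and
# separated from each other by unrelated lines.
input=$(printf 'm1\nc1\nx\nx\nm2\nc2\n')

# Replace the default "--" line between context groups with a custom string.
custom=$(printf '%s\n' "$input" | grep -A1 --group-separator='====' '^m' | paste -sd, -)

# Or print the groups back to back with no separator line at all.
none=$(printf '%s\n' "$input" | grep -A1 --no-group-separator '^m' | paste -sd, -)
printf '%s\n%s\n' "$custom" "$none"
```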
1023 |
|
---|
1024 | ** Other changes
|
---|
1025 |
|
---|
1026 | egrep accepts the -E option and fgrep accepts the -F option. If egrep
|
---|
1027 | and fgrep are given another of the -E/-F/-G options, they print a more
|
---|
1028 | meaningful error message.
|
---|
1029 |
|
---|
1030 | * Noteworthy changes in release 2.5.4 (2009-02-10) [stable]
|
---|
1031 |
|
---|
1032 | - This is a bugfix release. No new features.
|
---|
1033 |
|
---|
1034 | Version 2.5.3
|
---|
1035 | - The new option --exclude-dir lets you specify a directory pattern that
|
---|
1036 | will be excluded from recursive grep.
|
---|
1037 | - Numerous bug fixes
|
---|
1038 |
|
---|
1039 | Version 2.5.1
|
---|
1040 | - This is a bugfix release. No new features.
|
---|
1041 |
|
---|
1042 | Version 2.5
|
---|
1043 | - The new option --label lets you specify a different name for input
|
---|
1044 | from stdin. See the man or info pages for details.
|
---|
1045 |
|
---|
1046 | - The internal lib/getopt* files are no longer used on systems providing
|
---|
1047 | getopt functionality in their libc (e.g. glibc 2.2.x).
|
---|
1048 | If you need the old getopt files, use --with-included-getopt.
|
---|
1049 |
|
---|
1050 | - The new option --only-matching (-o) will print only the part of matching
|
---|
1051 | lines that matches the pattern. This is useful, for example, to extract
|
---|
1052 | IP addresses from log files.
|
---|
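As a rough sketch of the log-file use mentioned above, assuming GNU grep: the log lines are hypothetical, and the pattern is a deliberately loose approximation of a dotted-quad address (it does not check that each number is at most 255).

```shell
# Hypothetical log excerpt.
log='10.0.0.1 GET /index.html
client 192.168.1.20 connected, peer 172.16.0.5'

# -o prints each matched part on its own line instead of the whole line;
# -E enables the interval syntax used in the pattern.
ips=$(printf '%s\n' "$log" | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | paste -sd, -)
printf '%s\n' "$ips"
```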
1053 |
|
---|
1054 | - i18n bug fixed ([A-Z0-9] wouldn't match A in locales other than C on
|
---|
1055 | systems using recent glibc builds).
|
---|
1056 |
|
---|
1057 | - GNU grep can now be built with autoconf 2.52.
|
---|
1058 |
|
---|
1059 | - The new option --devices controls how grep handles device files. Its usage
|
---|
1060 | is analogous to --directories.
|
---|
1061 |
|
---|
1062 | - The new option --line-buffered flushes the output after every line. There is a noticeable
|
---|
1063 | slowdown when forcing line buffering.
|
---|
1064 |
|
---|
1065 | - Back-references are now local to the regex.
|
---|
1066 | grep -e '\(a\)\1' -e '\(b\)\1'
|
---|
1067 | The last backref \1 in the second expression refers to \(b\)
|
---|
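The command above can be exercised with a small sketch, assuming GNU grep; the four input lines are made up.

```shell
# Each -e pattern numbers its own groups: \1 in the first pattern refers to
# \(a\), and \1 in the second refers to \(b\). Only doubled letters match.
doubled=$(printf 'aa\nbb\nab\nba\n' | grep -e '\(a\)\1' -e '\(b\)\1' | paste -sd, -)
printf '%s\n' "$doubled"
```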
1068 |
|
---|
1069 | - The new option --include=PATTERN will search only matching files
|
---|
1070 | when recursing in directories.
|
---|
1071 |
|
---|
1072 | - The new option --exclude=PATTERN will skip matching files when
|
---|
1073 | recursing in directories.
|
---|
1074 |
|
---|
1075 | - The new option --color will use the environment variable GREP_COLOR
|
---|
1076 | (default is red) to highlight the matching string.
|
---|
1077 | --color takes an optional argument specifying when to colorize a line:
|
---|
1078 | --color=always, --color=tty, --color=never
|
---|
1079 |
|
---|
1080 | - The following changes are for POSIX conformance:
|
---|
1081 |
|
---|
1082 | . The -q or --quiet or --silent option now causes grep to exit
|
---|
1083 | with zero status when an input line is selected, even if an error
|
---|
1084 | also occurs.
|
---|
1085 |
|
---|
1086 | . The -s or --no-messages option no longer affects the exit status.
|
---|
1087 |
|
---|
1088 | . Bracket regular expressions like [a-z] are now locale-dependent.
|
---|
1089 | For example, many locales sort characters in dictionary order,
|
---|
1090 | and in these locales the regular expression [a-d] is not
|
---|
1091 | equivalent to [abcd]; it might be equivalent to [aBbCcDd], for
|
---|
1092 | example. To obtain the traditional interpretation of bracket
|
---|
1093 | expressions, you can use the C locale by setting the LC_ALL
|
---|
1094 | environment variable to the value "C".
|
---|
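A minimal sketch of the C-locale workaround described above; the input letters are chosen only for illustration. What the same range matches in other locales depends on the system's locale definitions, so only the C-locale result is shown.

```shell
# With LC_ALL=C, [a-d] is the traditional range of the four lower-case
# letters a, b, c, d, so the upper-case lines are not selected.
c_matches=$(printf 'a\nB\nd\nD\n' | LC_ALL=C grep '^[a-d]$' | paste -sd, -)
printf '%s\n' "$c_matches"
```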
1095 |
|
---|
1096 | - The -C or --context option now requires an argument, partly for
|
---|
1097 | consistency, and partly because POSIX recommends against
|
---|
1098 | optional arguments.
|
---|
1099 |
|
---|
1100 | - The new -P or --perl-regexp option tells grep to interpret the pattern as
|
---|
1101 | a Perl regular expression.
|
---|
1102 |
|
---|
1103 | - The new option --max-count=num makes grep stop reading a file after num
|
---|
1104 | matching lines.
|
---|
1105 | New option -m; equivalent to --max-count.
|
---|
1106 |
|
---|
1107 | - Translations for bg, ca, da, nb and tr have been added.
|
---|
1108 |
|
---|
1109 | Version 2.4.2
|
---|
1110 |
|
---|
1111 | - Added more checks in configure to default to the grep-${version}/src/regex.c
|
---|
1112 | instead of the one in GNU Lib C.
|
---|
1113 |
|
---|
1114 | Version 2.4.1
|
---|
1115 |
|
---|
1116 | - If the final byte of an input file is not a newline, grep now silently
|
---|
1117 | supplies one.
|
---|
1118 |
|
---|
1119 | - The new option --binary-files=TYPE makes grep assume that a binary input
|
---|
1120 | file is of type TYPE.
|
---|
1121 | --binary-files='binary' (the default) outputs a 1-line summary of matches.
|
---|
1122 | --binary-files='without-match' assumes binary files do not match.
|
---|
1123 | --binary-files='text' treats binary files as text
|
---|
1124 | (equivalent to the -a or --text option).
|
---|
1125 |
|
---|
1126 | - New option -I; equivalent to --binary-files='without-match'.
|
---|
1127 |
|
---|
1128 | Version 2.4:
|
---|
1129 |
|
---|
1130 | - egrep is now equivalent to 'grep -E' as required by POSIX,
|
---|
1131 | removing a longstanding source of confusion and incompatibility.
|
---|
1132 | 'grep' is now more forgiving about stray '{'s, for backward
|
---|
1133 | compatibility with traditional egrep.
|
---|
1134 |
|
---|
1135 | - The lower bound of an interval is not optional.
|
---|
1136 | You must use an explicit zero, e.g. 'x{0,10}' instead of 'x{,10}'.
|
---|
1137 | (The old documentation incorrectly claimed that it was optional.)
|
---|
1138 |
|
---|
1139 | - The --revert-match option has been renamed to --invert-match.
|
---|
1140 |
|
---|
1141 | - The --fixed-regexp option has been renamed to --fixed-strings.
|
---|
1142 |
|
---|
1143 | - New option -H or --with-filename.
|
---|
1144 |
|
---|
1145 | - New option --mmap. By default, GNU grep now uses read instead of mmap.
|
---|
1146 | This is faster on some hosts, and is safer on all.
|
---|
1147 |
|
---|
1148 | - The new option -z or --null-data causes 'grep' to treat a zero byte
|
---|
1149 | (the ASCII NUL character) as a line terminator in input data, and
|
---|
1150 | to treat newlines as ordinary data.
|
---|
1151 |
|
---|
1152 | - The new option -Z or --null causes 'grep' to output a zero byte
|
---|
1153 | instead of the normal separator after a file name.
|
---|
1154 |
|
---|
1155 | - These two options can be used with commands like 'find -print0',
|
---|
1156 | 'perl -0', 'sort -z', and 'xargs -0' to process arbitrary file names,
|
---|
1157 | even those that contain newlines.
|
---|
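The pairing of -Z and -z with NUL-aware tools can be sketched like this, assuming GNU grep, find with -print0, and xargs with -0; the scratch directory and file names are hypothetical.

```shell
tmp=$(mktemp -d)
newline='
'
printf 'needle\n' > "$tmp/plain.txt"
printf 'needle\n' > "$tmp/odd${newline}name.txt"   # file name containing a newline
printf 'hay\n' > "$tmp/other.txt"

# -l lists matching files; -Z ends each name with a NUL byte instead of a
# newline, so xargs -0 passes exactly one argument per file.
matched=$(grep -lZ needle "$tmp"/*.txt | xargs -0 sh -c 'echo $#' sh)

# -z reads NUL-terminated records, so each name printed by find -print0 is
# one record even when the name itself contains a newline.
records=$(find "$tmp" -type f -print0 | grep -cz .)
rm -rf "$tmp"
printf '%s %s\n' "$matched" "$records"
```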
1158 |
|
---|
1159 | - The environment variable GREP_OPTIONS specifies default options;
|
---|
1160 | e.g. GREP_OPTIONS='--directories=skip' reestablishes grep 2.1's
|
---|
1161 | behavior of silently skipping directories.
|
---|
1162 |
|
---|
1163 | - You can specify a matcher multiple times without error, e.g.
|
---|
1164 | 'grep -E -E' or 'fgrep -F'. It is still an error to specify
|
---|
1165 | conflicting matchers.
|
---|
1166 |
|
---|
1167 | - -u and -U are now allowed on non-DOS hosts, and have no effect.
|
---|
1168 |
|
---|
1169 | - Modifications of the test scripts to work around the "Broken Pipe"
|
---|
1170 | errors from bash. See Bash FAQ.
|
---|
1171 |
|
---|
1172 | - New option -r or --recursive or --directories=recurse.
|
---|
1173 | (This option was also in grep 2.3, but wasn't announced here.)
|
---|
1174 |
|
---|
1175 | - --without-included-regex is disabled; it was causing bogus reports, i.e.,
|
---|
1176 | doing more harm than good.
|
---|
1177 |
|
---|
1178 | Version 2.3:
|
---|
1179 |
|
---|
1180 | - When searching a binary file FOO, grep now just reports
|
---|
1181 | "Binary file FOO matches" instead of outputting binary data.
|
---|
1182 | This is typically more useful than the old behavior,
|
---|
1183 | and it is also more consistent with other utilities like 'diff'.
|
---|
1184 | A file is considered to be binary if it contains a NUL (i.e. zero) byte.
|
---|
1185 |
|
---|
1186 | The new -a or --text option causes 'grep' to assume that all
|
---|
1187 | input is text. (This option has the same meaning as with 'diff'.)
|
---|
1188 | Use it if you want binary data in your output.
|
---|
1189 |
|
---|
1190 | - 'grep' now searches directories just like ordinary files; it no longer
|
---|
1191 | silently skips directories. This is the traditional behavior of
|
---|
1192 | Unix text utilities (in particular, of traditional 'grep').
|
---|
1193 | Hence 'grep PATTERN DIRECTORY' should report
|
---|
1194 | "grep: DIRECTORY: Is a directory" on hosts where the operating system
|
---|
1195 | does not permit programs to read directories directly, and
|
---|
1196 | "grep: DIRECTORY: Binary file matches" (or nothing) otherwise.
|
---|
1197 |
|
---|
1198 | The new -d ACTION or --directories=ACTION option affects directory handling.
|
---|
1199 | '-d skip' causes 'grep' to silently skip directories, as in grep 2.1;
|
---|
1200 | '-d read' (the default) causes 'grep' to read directories if possible,
|
---|
1201 | as in earlier versions of grep.
|
---|
1202 |
|
---|
1203 | - The MS-DOS and Microsoft Windows ports now behave identically to the
|
---|
1204 | GNU and Unix ports with respect to binary files and directories.
|
---|
1205 |
|
---|
1206 | Version 2.2:
|
---|
1207 |
|
---|
1208 | Bug fix release.
|
---|
1209 |
|
---|
1210 | - Status error number fix.
|
---|
1211 | - Skipping directories removed.
|
---|
1212 | - Many typos fixed.
|
---|
1213 | - -f /dev/null fix (not to be considered an empty pattern).
|
---|
1214 | - Checks for wctype/wchar.
|
---|
1215 | - -E was using the wrong matcher fix.
|
---|
1216 | - bug in regex char class fix
|
---|
1217 | - Fixes for DJGPP
|
---|
1218 |
|
---|
1219 | Version 2.1:
|
---|
1220 |
|
---|
1221 | This is a bug fix release (see Changelog), i.e., no new features.
|
---|
1222 |
|
---|
1223 | - More compliance to GNU standard.
|
---|
1224 | - Long options.
|
---|
1225 | - Internationalization.
|
---|
1226 | - Use automake/autoconf.
|
---|
1227 | - Directory hierarchy change.
|
---|
1228 | - Sigvec with -e on Linux corrected.
|
---|
1229 | - Sigvec with -f on Linux corrected.
|
---|
1230 | - Sigvec with the mmap() corrected.
|
---|
1231 | - Bug in kwset corrected.
|
---|
1232 | - -q, -L and -l stop on first match.
|
---|
1233 | - New and improved regex.[ch] from Ulrich Drepper.
|
---|
1234 | - New and improved dfa.[ch] from Arnold Robbins.
|
---|
1235 | - Prototypes for overzealous C compilers.
|
---|
1236 | - Not scanning a file, if it's a directory
|
---|
1237 | (causes problems on Sun).
|
---|
1238 | - Ported to MS-DOS/MS-Windows with DJGPP tools.
|
---|
1239 |
|
---|
1240 | See Changelog for the full story and proper credits.
|
---|
1241 |
|
---|
1242 | Version 2.0:
|
---|
1243 |
|
---|
1244 | The most important user visible change is that egrep and fgrep have
|
---|
1245 | disappeared as separate programs into the single grep program mandated
|
---|
1246 | by POSIX 1003.2. New options -G, -E, and -F have been added,
|
---|
1247 | selecting grep, egrep, and fgrep behavior respectively. For
|
---|
1248 | compatibility with historical practice, hard links named egrep and
|
---|
1249 | fgrep are also provided. See the manual page for details.
|
---|
1250 |
|
---|
1251 | In addition, the regular expression facilities described in Posix
|
---|
1252 | draft 11.2 are now supported, except for internationalization features
|
---|
1253 | related to locale-dependent collating sequence information.
|
---|
1254 |
|
---|
1255 | There is a new option, -L, which is like -l except it lists
|
---|
1256 | files which don't contain matches. This option was added
|
---|
1257 | because '-l -v' doesn't do what you expect.
|
---|
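The difference between -L and '-l -v' can be seen in a small sketch, assuming GNU grep; the file names and contents are hypothetical.

```shell
tmp=$(mktemp -d)
printf 'needle\nhay\n' > "$tmp/mixed.txt"   # one matching and one non-matching line
printf 'hay\n' > "$tmp/none.txt"            # no matching line at all

# -L lists files that contain no match.
without=$(cd "$tmp" && grep -L needle mixed.txt none.txt | paste -sd, -)

# -l -v lists files that contain at least one NON-matching line, which
# also includes mixed.txt.
inverted=$(cd "$tmp" && grep -lv needle mixed.txt none.txt | paste -sd, -)
rm -rf "$tmp"
printf '%s\n%s\n' "$without" "$inverted"
```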
1258 |
|
---|
1259 | Performance has been improved; the amount of improvement is platform
|
---|
1260 | dependent, but (for example) grep 2.0 typically runs at least 30% faster
|
---|
1261 | than grep 1.6 on a DECstation using the MIPS compiler. Where possible,
|
---|
1262 | grep now uses mmap() for file input; on a Sun 4 running SunOS 4.1 this
|
---|
1263 | may cut system time by as much as half, for a total reduction in running
|
---|
1264 | time by nearly 50%. On machines that don't use mmap(), the buffering
|
---|
1265 | code has been rewritten to choose more favorable alignments and buffer
|
---|
1266 | sizes for read().
|
---|
1267 |
|
---|
1268 | Portability has been substantially cleaned up, and an automatic
|
---|
1269 | configure script is now provided.
|
---|
1270 |
|
---|
1271 | The internals have changed in ways too numerous to mention.
|
---|
1272 | People brave enough to reuse the DFA matcher in other programs
|
---|
1273 | will now have their bravery amply "rewarded", for the interface
|
---|
1274 | to that file has been completely changed. Some changes were
|
---|
1275 | necessary to track the evolution of the regex package, and since
|
---|
1276 | I was changing it anyway I decided to do a general cleanup.
|
---|
1277 |
|
---|
1278 | ========================================================================
|
---|
1279 | Copyright (C) 1992, 1997-2002, 2004-2021 Free Software Foundation, Inc.
|
---|
1280 |
|
---|
1281 | Copying and distribution of this file, with or without modification,
|
---|
1282 | are permitted in any medium without royalty provided the copyright
|
---|
1283 | notice and this notice are preserved.
|
---|
1284 |
|
---|
1285 | Permission is granted to copy, distribute and/or modify this document
|
---|
1286 | under the terms of the GNU Free Documentation License, Version 1.3 or
|
---|
1287 | any later version published by the Free Software Foundation; with no
|
---|
1288 | Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
|
---|
1289 | Texts. A copy of the license is included in the "GNU Free
|
---|
1290 | Documentation License" file as part of this distribution.
|
---|