grep.texi@ 3529

Last change on this file since 3529 was 3529, checked in by bird, 3 years ago
Imported grep 3.7 from grep-3.7.tar.gz (sha256: c22b0cf2d4f6bbe599c902387e8058990e1eee99aef333a203829e5fd3dbb342), applying minimal auto-props.

1	\input texinfo @c -*-texinfo-*-
2	@c %**start of header
3	@setfilename grep.info
4	@include version.texi
5	@settitle GNU Grep @value{VERSION}
6
7	@c Combine indices.
8	@syncodeindex ky cp
9	@syncodeindex pg cp
10	@syncodeindex tp cp
11	@defcodeindex op
12	@syncodeindex op cp
13	@syncodeindex vr cp
14	@c %**end of header
15
16	@documentencoding UTF-8
17	@c These two require Texinfo 5.0 or later, so use the older
18	@c equivalent @set variables supported in 4.11 and later.
19	@ignore
20	@codequotebacktick on
21	@codequoteundirected on
22	@end ignore
23	@set txicodequoteundirected
24	@set txicodequotebacktick
25	@iftex
26	@c TeX sometimes fails to hyphenate, so help it here.
27	@hyphenation{spec-i-fied}
28	@end iftex
29
30	@copying
31	This manual is for @command{grep}, a pattern matching engine.
32
33	Copyright @copyright{} 1999--2002, 2005, 2008--2021 Free Software Foundation,
34	Inc.
35
36	@quotation
37	Permission is granted to copy, distribute and/or modify this document
38	under the terms of the GNU Free Documentation License, Version 1.3 or
39	any later version published by the Free Software Foundation; with no
40	Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
41	Texts. A copy of the license is included in the section entitled
42	``GNU Free Documentation License''.
43	@end quotation
44	@end copying
45
46	@dircategory Text creation and manipulation
47	@direntry
48	* grep: (grep). Print lines that match patterns.
49	@end direntry
50
51	@titlepage
52	@title GNU Grep: Print lines that match patterns
53	@subtitle version @value{VERSION}, @value{UPDATED}
54	@author Alain Magloire et al.
55	@page
56	@vskip 0pt plus 1filll
57	@insertcopying
58	@end titlepage
59
60	@contents
61
62
63	@ifnottex
64	@node Top
65	@top grep
66
67	@command{grep} prints lines that contain a match for one or more patterns.
68
69	This manual is for version @value{VERSION} of GNU Grep.
70
71	@insertcopying
72	@end ifnottex
73
74	@menu
75	* Introduction:: Introduction.
76	* Invoking:: Command-line options, environment, exit status.
77	* Regular Expressions:: Regular Expressions.
78	* Usage:: Examples.
79	* Performance:: Performance tuning.
80	* Reporting Bugs:: Reporting Bugs.
81	* Copying:: License terms for this manual.
82	* Index:: Combined index.
83	@end menu
84
85
86	@node Introduction
87	@chapter Introduction
88
89	@cindex searching for patterns
90
91	Given one or more patterns, @command{grep} searches input files
92	for matches to the patterns.
93	When it finds a match in a line,
94	it copies the line to standard output (by default),
95	or produces whatever other sort of output you have requested with options.
96
97	Though @command{grep} expects to do the matching on text,
98	it has no limits on input line length other than available memory,
99	and it can match arbitrary characters within a line.
100	If the final byte of an input file is not a newline,
101	@command{grep} silently supplies one.
102	Since newline is also a separator for the list of patterns,
103	there is no way to match newline characters in a text.
104
105
106	@node Invoking
107	@chapter Invoking @command{grep}
108
109	The general synopsis of the @command{grep} command line is
110
111	@example
112	grep [@var{option}...] [@var{patterns}] [@var{file}...]
113	@end example
114
115	@noindent
116	There can be zero or more @var{option} arguments, and zero or more
117	@var{file} arguments. The @var{patterns} argument contains one or
118	more patterns separated by newlines, and is omitted when patterns are
119	given via the @samp{-e@ @var{patterns}} or @samp{-f@ @var{file}}
120	options. Typically @var{patterns} should be quoted when
121	@command{grep} is used in a shell command.
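
As an illustration of the synopsis above, the following shell sketch
supplies the same two patterns twice: once as a single quoted
@var{patterns} argument with an embedded newline, and once through
repeated @option{-e} options. The file @file{fruit.txt} and its
contents are hypothetical, created only for this example.

```shell
# Hypothetical sample data, created only for this illustration.
tmp=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmp/fruit.txt"

# One quoted PATTERNS argument holding two newline-separated patterns.
by_newline=$(grep 'apple
cherry' "$tmp/fruit.txt")

# The same two patterns, each supplied through its own -e option.
by_option=$(grep -e apple -e cherry "$tmp/fruit.txt")

printf '%s\n' "$by_newline" "$by_option"
rm -r "$tmp"
```

Both invocations should select the same lines; the quoting keeps the
shell from splitting the newline-separated argument.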
122
123	@menu
124	* Command-line Options:: Short and long names, grouped by category.
125	* Environment Variables:: POSIX, GNU generic, and GNU grep specific.
126	* Exit Status:: Exit status returned by @command{grep}.
127	* grep Programs:: @command{grep} programs.
128	@end menu
129
130	@node Command-line Options
131	@section Command-line Options
132
133	@command{grep} comes with a rich set of options:
134	some from POSIX and some being GNU extensions.
135	Long option names are always a GNU extension,
136	even for options that are from POSIX specifications.
137	Options that are specified by POSIX,
138	under their short names,
139	are explicitly marked as such
140	to facilitate POSIX-portable programming.
141	A few option names are provided
142	for compatibility with older or more exotic implementations.
143
144	@menu
145	* Generic Program Information::
146	* Matching Control::
147	* General Output Control::
148	* Output Line Prefix Control::
149	* Context Line Control::
150	* File and Directory Selection::
151	* Other Options::
152	@end menu
153
154	Several additional options control
155	which variant of the @command{grep} matching engine is used.
156	@xref{grep Programs}.
157
158	@node Generic Program Information
159	@subsection Generic Program Information
160
161	@table @option
162
163	@item --help
164	@opindex --help
165	@cindex usage summary, printing
166	Print a usage message briefly summarizing the command-line options
167	and the bug-reporting address, then exit.
168
169	@item -V
170	@itemx --version
171	@opindex -V
172	@opindex --version
173	@cindex version, printing
174	Print the version number of @command{grep} to the standard output stream.
175	This version number should be included in all bug reports.
176
177	@end table
178
179	@node Matching Control
180	@subsection Matching Control
181
182	@table @option
183
184	@item -e @var{patterns}
185	@itemx --regexp=@var{patterns}
186	@opindex -e
187	@opindex --regexp=@var{patterns}
188	@cindex patterns option
189	Use @var{patterns} as one or more patterns; newlines within
190	@var{patterns} separate each pattern from the next.
191	If this option is used multiple times or is combined with the
192	@option{-f} (@option{--file}) option, search for all patterns given.
193	Typically @var{patterns} should be quoted when @command{grep} is used
194	in a shell command.
195	(@option{-e} is specified by POSIX.)
196
197	@item -f @var{file}
198	@itemx --file=@var{file}
199	@opindex -f
200	@opindex --file
201	@cindex patterns from file
202	Obtain patterns from @var{file}, one per line.
203	If this option is used multiple times or is combined with the
204	@option{-e} (@option{--regexp}) option, search for all patterns given.
205	The empty file contains zero patterns, and therefore matches nothing.
206	(@option{-f} is specified by POSIX.)
207
208	@item -i
209	@itemx -y
210	@itemx --ignore-case
211	@opindex -i
212	@opindex -y
213	@opindex --ignore-case
214	@cindex case insensitive search
215	Ignore case distinctions in patterns and input data,
216	so that characters that differ only in case
217	match each other. Although this is straightforward when letters
218	differ in case only via lowercase-uppercase pairs, the behavior is
219	unspecified in other situations. For example, uppercase ``S'' has an
220	unusual lowercase counterpart ``ſ'' (Unicode character U+017F, LATIN
221	SMALL LETTER LONG S) in many locales, and it is unspecified whether
222	this unusual character matches ``S'' or ``s'' even though uppercasing
223	it yields ``S''. Another example: the lowercase German letter ``ß''
224	(U+00DF, LATIN SMALL LETTER SHARP S) is normally capitalized as the
225	two-character string ``SS'' but it does not match ``SS'', and it might
226	not match the uppercase letter ``ẞ'' (U+1E9E, LATIN CAPITAL LETTER
227	SHARP S) even though lowercasing the latter yields the former.
228
229	@option{-y} is an obsolete synonym that is provided for compatibility.
230	(@option{-i} is specified by POSIX.)
231
232	@item --no-ignore-case
233	@opindex --no-ignore-case
234	Do not ignore case distinctions in patterns and input data. This is
235	the default. This option is useful for passing to shell scripts that
236	already use @option{-i}, in order to cancel its effects because the
237	two options override each other.
238
239	@item -v
240	@itemx --invert-match
241	@opindex -v
242	@opindex --invert-match
243	@cindex invert matching
244	@cindex print non-matching lines
245	Invert the sense of matching, to select non-matching lines.
246	(@option{-v} is specified by POSIX.)
247
248	@item -w
249	@itemx --word-regexp
250	@opindex -w
251	@opindex --word-regexp
252	@cindex matching whole words
253	Select only those lines containing matches that form whole words.
254	The test is that the matching substring must either
255	be at the beginning of the line,
256	or preceded by a non-word constituent character.
257	Similarly,
258	it must be either at the end of the line
259	or followed by a non-word constituent character.
260	Word constituent characters are letters, digits, and the underscore.
261	This option has no effect if @option{-x} is also specified.
262
263	Because the @option{-w} option can match a substring that does not
264	begin and end with word constituents, it differs from surrounding a
265	regular expression with @samp{\<} and @samp{\>}. For example, although
266	@samp{grep -w @@} matches a line containing only @samp{@@}, @samp{grep
267	'\<@@\>'} cannot match any line because @samp{@@} is not a
268	word constituent. @xref{The Backslash Character and Special
269	Expressions}.
270
271	@item -x
272	@itemx --line-regexp
273	@opindex -x
274	@opindex --line-regexp
275	@cindex match the whole line
276	Select only those matches that exactly match the whole line.
277	For regular expression patterns, this is like parenthesizing each
278	pattern and then surrounding it with @samp{^} and @samp{$}.
279	(@option{-x} is specified by POSIX.)
280
281	@end table
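
The difference between @option{-w}, the @samp{\<}...@samp{\>}
operators, and @option{-x} described in the table above can be
observed with a short shell sketch. The file @file{words.txt} and its
four lines are assumptions made for this example, not part of
@command{grep}.

```shell
# Hypothetical sample data, created only for this illustration.
tmp=$(mktemp -d)
printf '@\nfoo@bar\ncat\nconcat\n' > "$tmp/words.txt"

# -w: the match must be bounded by line edges or non-word characters,
# so the line holding only "@" is selected, but "foo@bar" is not.
w_at=$(grep -c -w '@' "$tmp/words.txt")

# \< and \> need a word constituent at the edge of the match; "@" is
# not one, so no line is selected and grep -c prints 0.
angle_at=$(grep -c '\<@\>' "$tmp/words.txt")

# -x: the whole line must match, so "concat" is rejected.
x_cat=$(grep -x 'cat' "$tmp/words.txt")

printf '%s %s %s\n' "$w_at" "$angle_at" "$x_cat"
rm -r "$tmp"
```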
282
283	@node General Output Control
284	@subsection General Output Control
285
286	@table @option
287
288	@item -c
289	@itemx --count
290	@opindex -c
291	@opindex --count
292	@cindex counting lines
293	Suppress normal output;
294	instead print a count of matching lines for each input file.
295	With the @option{-v} (@option{--invert-match}) option,
296	count non-matching lines.
297	(@option{-c} is specified by POSIX.)
298
299	@item --color[=@var{WHEN}]
300	@itemx --colour[=@var{WHEN}]
301	@opindex --color
302	@opindex --colour
303	@cindex highlight, color, colour
304	Surround the matched (non-empty) strings, matching lines, context lines,
305	file names, line numbers, byte offsets, and separators (for fields and
306	groups of context lines) with escape sequences to display them in color
307	on the terminal.
308	The colors are defined by the environment variable @env{GREP_COLORS}
309	and default to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
310	for bold red matched text, magenta file names, green line numbers,
311	green byte offsets, cyan separators, and default terminal colors otherwise.
312	The deprecated environment variable @env{GREP_COLOR} is still supported,
313	but its setting does not have priority;
314	it defaults to @samp{01;31} (bold red)
315	which only covers the color for matched text.
316	@var{WHEN} is @samp{never}, @samp{always}, or @samp{auto}.
317
318	@item -L
319	@itemx --files-without-match
320	@opindex -L
321	@opindex --files-without-match
322	@cindex files which don't match
323	Suppress normal output;
324	instead print the name of each input file from which
325	no output would normally have been printed.
326
327	@item -l
328	@itemx --files-with-matches
329	@opindex -l
330	@opindex --files-with-matches
331	@cindex names of matching files
332	Suppress normal output;
333	instead print the name of each input file from which
334	output would normally have been printed.
335	Scanning of each input file stops upon the first match.
336	(@option{-l} is specified by POSIX.)
337
338	@item -m @var{num}
339	@itemx --max-count=@var{num}
340	@opindex -m
341	@opindex --max-count
342	@cindex max-count
343	Stop after the first @var{num} selected lines.
344	If the input is standard input from a regular file,
345	and @var{num} selected lines are output,
346	@command{grep} ensures that the standard input is positioned
347	just after the last selected line before exiting,
348	regardless of the presence of trailing context lines.
349	This enables a calling process to resume a search.
350	For example, the following shell script makes use of it:
351
352	@example
353	while grep -m 1 'PATTERN'
354	do
355	  echo xxxx
356	done < FILE
357	@end example
358
359	But the following probably will not work because a pipe is not a regular
360	file:
361
362	@example
363	# This probably will not work.
364	cat FILE |
365	while grep -m 1 'PATTERN'
366	do
367	  echo xxxx
368	done
369	@end example
370
371	@cindex context lines
372	When @command{grep} stops after @var{num} selected lines,
373	it outputs any trailing context lines.
374	When the @option{-c} or @option{--count} option is also used,
375	@command{grep} does not output a count greater than @var{num}.
376	When the @option{-v} or @option{--invert-match} option is also used,
377	@command{grep} stops after outputting @var{num} non-matching lines.
378
379	@item -o
380	@itemx --only-matching
381	@opindex -o
382	@opindex --only-matching
383	@cindex only matching
384	Print only the matched (non-empty) parts of matching lines,
385	with each such part on a separate output line.
386	Output lines use the same delimiters as input, and delimiters are null
387	bytes if @option{-z} (@option{--null-data}) is also used (@pxref{Other
388	Options}).
389
390	@item -q
391	@itemx --quiet
392	@itemx --silent
393	@opindex -q
394	@opindex --quiet
395	@opindex --silent
396	@cindex quiet, silent
397	Quiet; do not write anything to standard output.
398	Exit immediately with zero status if any match is found,
399	even if an error was detected.
400	Also see the @option{-s} or @option{--no-messages} option.
401	(@option{-q} is specified by POSIX.)
402
403	@item -s
404	@itemx --no-messages
405	@opindex -s
406	@opindex --no-messages
407	@cindex suppress error messages
408	Suppress error messages about nonexistent or unreadable files.
409	Portability note:
410	unlike GNU @command{grep},
411	7th Edition Unix @command{grep} did not conform to POSIX,
412	because it lacked @option{-q}
413	and its @option{-s} option behaved like
414	GNU @command{grep}'s @option{-q} option.@footnote{Of course, 7th Edition
415	Unix predated POSIX by several years!}
416	USG-style @command{grep} also lacked @option{-q}
417	but its @option{-s} option behaved like GNU @command{grep}'s.
418	Portable shell scripts should avoid both
419	@option{-q} and @option{-s} and should redirect
420	standard and error output to @file{/dev/null} instead.
421	(@option{-s} is specified by POSIX.)
422
423	@end table
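
The following shell sketch contrasts several of the output-control
options from the table above on two small log files. The file names
@file{a.log} and @file{b.log} and their contents are hypothetical;
the sketch only shows one way the options might be combined.

```shell
# Hypothetical sample data, created only for this illustration.
tmp=$(mktemp -d)
printf 'error: disk\nok\nerror: net\n' > "$tmp/a.log"
printf 'ok\nok\n' > "$tmp/b.log"

# -c: count of matching lines in one file.
count=$(grep -c 'error' "$tmp/a.log")
# -l: names of files that contain a match.
with=$(grep -l 'error' "$tmp/a.log" "$tmp/b.log")
# -L: names of files that contain no match.
without=$(grep -L 'error' "$tmp/a.log" "$tmp/b.log")
# -o: only the matched parts, one per output line.
parts=$(grep -o 'error: [a-z]*' "$tmp/a.log")

# -q prints nothing; only the exit status reports whether a match exists.
if grep -q 'error' "$tmp/b.log"; then found=yes; else found=no; fi

printf '%s\n' "$count" "$with" "$without" "$parts" "$found"
rm -r "$tmp"
```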
424
425	@node Output Line Prefix Control
426	@subsection Output Line Prefix Control
427
428	When several prefix fields are to be output,
429	the order is always file name, line number, and byte offset,
430	regardless of the order in which these options were specified.
431
432	@table @option
433
434	@item -b
435	@itemx --byte-offset
436	@opindex -b
437	@opindex --byte-offset
438	@cindex byte offset
439	Print the 0-based byte offset within the input file
440	before each line of output.
441	If @option{-o} (@option{--only-matching}) is specified,
442	print the offset of the matching part itself.
443
444	@item -H
445	@itemx --with-filename
446	@opindex -H
447	@opindex --with-filename
448	@cindex with filename prefix
449	Print the file name for each match.
450	This is the default when there is more than one file to search.
451
452	@item -h
453	@itemx --no-filename
454	@opindex -h
455	@opindex --no-filename
456	@cindex no filename prefix
457	Suppress the prefixing of file names on output.
458	This is the default when there is only one file
459	(or only standard input) to search.
460
461	@item --label=@var{LABEL}
462	@opindex --label
463	@cindex changing name of standard input
464	Display input actually coming from standard input
465	as input coming from file @var{LABEL}.
466	This can be useful for commands that transform a file's contents
467	before searching; e.g.:
468
469	@example
470	gzip -cd foo.gz | grep --label=foo -H 'some pattern'
471	@end example
472
473	@item -n
474	@itemx --line-number
475	@opindex -n
476	@opindex --line-number
477	@cindex line numbering
478	Prefix each line of output with the 1-based line number within its input file.
479	(@option{-n} is specified by POSIX.)
480
481	@item -T
482	@itemx --initial-tab
483	@opindex -T
484	@opindex --initial-tab
485	@cindex tab-aligned content lines
486	Make sure that the first character of actual line content lies on a tab stop,
487	so that the alignment of tabs looks normal.
488	This is useful with options that prefix their output to the actual content:
489	@option{-H}, @option{-n}, and @option{-b}.
490	This may also prepend spaces to output line numbers and byte offsets
491	so that lines from a single file all start at the same column.
492
493	@item -Z
494	@itemx --null
495	@opindex -Z
496	@opindex --null
497	@cindex zero-terminated file names
498	Output a zero byte (the ASCII NUL character)
499	instead of the character that normally follows a file name.
500	For example,
501	@samp{grep -lZ} outputs a zero byte after each file name
502	instead of the usual newline.
503	This option makes the output unambiguous,
504	even in the presence of file names containing unusual characters like newlines.
505	This option can be used with commands like
506	@samp{find -print0}, @samp{perl -0}, @samp{sort -z}, and @samp{xargs -0}
507	to process arbitrary file names,
508	even those that contain newline characters.
509
510	@end table
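
The fixed order of the prefix fields, and the use of @option{-Z} with
@samp{xargs -0}, can be tried with the shell sketch below. The label
@samp{sample}, the input text, and the file names are assumptions made
for this example; the options used are GNU extensions except
@option{-n} and @option{-l}.

```shell
# --label names standard input; -H, -n and -b add the prefix fields.
# The options are given in a different order than the fields appear.
prefixed=$(printf 'alpha\nbeta\ngamma\n' |
  grep --label=sample -b -n -H 'gamma')

# With -o, -b reports where the matched text itself starts.
offset=$(printf 'alpha\nbeta\ngamma\n' | grep -o -b 'mm')

# -Z ends each file name with a zero byte, so a name containing a
# space reaches xargs -0 as a single argument.
tmp=$(mktemp -d)
printf 'needle\n' > "$tmp/with space.txt"
printf 'hay\n' > "$tmp/plain.txt"
listed=$(grep -lZ 'needle' "$tmp"/*.txt | xargs -0 -n 1 basename)

printf '%s\n' "$prefixed" "$offset" "$listed"
rm -r "$tmp"
```

Here @samp{gamma} begins at byte 11 because the two preceding lines
occupy 6 and 5 bytes, newlines included.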
511
512	@node Context Line Control
513	@subsection Context Line Control
514
515	@cindex context lines
516	@dfn{Context lines} are non-matching lines that are near a matching line.
517	They are output only if one of the following options is used.
518	Regardless of how these options are set,
519	@command{grep} never outputs any given line more than once.
520	If the @option{-o} (@option{--only-matching}) option is specified,
521	these options have no effect and a warning is given upon their use.
522
523	@table @option
524
525	@item -A @var{num}
526	@itemx --after-context=@var{num}
527	@opindex -A
528	@opindex --after-context
529	@cindex after context
530	@cindex context lines, after match
531	Print @var{num} lines of trailing context after matching lines.
532
533	@item -B @var{num}
534	@itemx --before-context=@var{num}
535	@opindex -B
536	@opindex --before-context
537	@cindex before context
538	@cindex context lines, before match
539	Print @var{num} lines of leading context before matching lines.
540
541	@item -C @var{num}
542	@itemx -@var{num}
543	@itemx --context=@var{num}
544	@opindex -C
545	@opindex --context
546	@opindex -@var{num}
547	@cindex context lines
548	Print @var{num} lines of leading and trailing output context.
549
550	@item --group-separator=@var{string}
551	@opindex --group-separator
552	@cindex group separator
553	When @option{-A}, @option{-B} or @option{-C} are in use,
554	print @var{string} instead of @option{--} between groups of lines.
555
556	@item --no-group-separator
557	@opindex --no-group-separator
558	@cindex group separator
559	When @option{-A}, @option{-B} or @option{-C} are in use,
560	do not print a separator between groups of lines.
561
562	@end table
563
564	Here are some points about how @command{grep} chooses
565	the separator to print between prefix fields and line content:
566
567	@itemize @bullet
568	@item
569	Matching lines normally use @samp{:} as a separator
570	between prefix fields and actual line content.
571
572	@item
573	Context (i.e., non-matching) lines use @samp{-} instead.
574
575	@item
576	When context is not specified,
577	matching lines are simply output one right after another.
578
579	@item
580	When context is specified,
581	lines that are adjacent in the input form a group
582	and are output one right after another, while
583	by default a separator appears between non-adjacent groups.
584
585	@item
586	The default separator
587	is a @samp{--} line; its presence and appearance
588	can be changed with the options above.
589
590	@item
591	Each group may contain
592	several matching lines when they are close enough to each other
593	that two adjacent groups connect and can merge into a single
594	contiguous one.
595	@end itemize
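
The separator rules listed above can be seen in a small shell sketch.
The seven-line input and the helper function @code{sample} are
hypothetical, chosen so that two matches lie far enough apart to form
separate groups.

```shell
# Hypothetical seven-line input used by both searches below.
sample() { printf 'one\ntwo\nthree\nfour\nfive\nsix\nseven\n'; }

# Lines 2 and 6 match; -A 1 adds one trailing context line to each.
# Matching lines use ":" after the line number, context lines use "-",
# and a "--" line separates the two non-adjacent groups.
grouped=$(sample | grep -n -A 1 -e 'two' -e 'six')

# The same search with the group separator suppressed.
plain=$(sample | grep -n -A 1 --no-group-separator -e 'two' -e 'six')

printf '%s\n' "$grouped"
```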
596
597	@node File and Directory Selection
598	@subsection File and Directory Selection
599
600	@table @option
601
602	@item -a
603	@itemx --text
604	@opindex -a
605	@opindex --text
606	@cindex suppress binary data
607	@cindex binary files
608	Process a binary file as if it were text;
609	this is equivalent to the @samp{--binary-files=text} option.
610
611	@item --binary-files=@var{type}
612	@opindex --binary-files
613	@cindex binary files
614	If a file's data or metadata
615	indicate that the file contains binary data,
616	assume that the file is of type @var{type}.
617	Non-text bytes indicate binary data; these are either output bytes that are
618	improperly encoded for the current locale (@pxref{Environment
619	Variables}), or null input bytes when the
620	@option{-z} (@option{--null-data}) option is not given (@pxref{Other
621	Options}).
622
623	By default, @var{type} is @samp{binary}, and @command{grep}
624	suppresses output after null input binary data is discovered,
625	and suppresses output lines that contain improperly encoded data.
626	When some output is suppressed, @command{grep} follows any output
627	with a one-line message saying that a binary file matches.
628
629	If @var{type} is @samp{without-match},
630	when @command{grep} discovers null input binary data
631	it assumes that the rest of the file does not match;
632	this is equivalent to the @option{-I} option.
633
634	If @var{type} is @samp{text},
635	@command{grep} processes binary data as if it were text;
636	this is equivalent to the @option{-a} option.
637
638	When @var{type} is @samp{binary}, @command{grep} may treat non-text
639	bytes as line terminators even without the @option{-z}
640	(@option{--null-data}) option. This means choosing @samp{binary}
641	versus @samp{text} can affect whether a pattern matches a file. For
642	example, when @var{type} is @samp{binary} the pattern @samp{q$} might
643	match @samp{q} immediately followed by a null byte, even though this
644	is not matched when @var{type} is @samp{text}. Conversely, when
645	@var{type} is @samp{binary} the pattern @samp{.} (period) might not
646	match a null byte.
647
648	@emph{Warning:} The @option{-a} (@option{--binary-files=text}) option
649	might output binary garbage, which can have nasty side effects if the
650	output is a terminal and if the terminal driver interprets some of it
651	as commands. On the other hand, when reading files whose text
652	encodings are unknown, it can be helpful to use @option{-a} or to set
653	@samp{LC_ALL='C'} in the environment, in order to find more matches
654	even if the matches are unsafe for direct display.
655
656	@item -D @var{action}
657	@itemx --devices=@var{action}
658	@opindex -D
659	@opindex --devices
660	@cindex device search
661	If an input file is a device, FIFO, or socket, use @var{action} to process it.
662	If @var{action} is @samp{read},
663	all devices are read just as if they were ordinary files.
664	If @var{action} is @samp{skip},
665	devices, FIFOs, and sockets are silently skipped.
666	By default, devices are read if they are on the command line or if the
667	@option{-R} (@option{--dereference-recursive}) option is used, and are
668	skipped if they are encountered recursively and the @option{-r}
669	(@option{--recursive}) option is used.
670	This option has no effect on a file that is read via standard input.
671
672	@item -d @var{action}
673	@itemx --directories=@var{action}
674	@opindex -d
675	@opindex --directories
676	@cindex directory search
677	@cindex symbolic links
678	If an input file is a directory, use @var{action} to process it.
679	By default, @var{action} is @samp{read},
680	which means that directories are read just as if they were ordinary files
681	(some operating systems and file systems disallow this,
682	and will cause @command{grep}
683	to print error messages for every directory or silently skip them).
684	If @var{action} is @samp{skip}, directories are silently skipped.
685	If @var{action} is @samp{recurse},
686	@command{grep} reads all files under each directory, recursively,
687	following command-line symbolic links and skipping other symlinks;
688	this is equivalent to the @option{-r} option.
689
690	@item --exclude=@var{glob}
691	@opindex --exclude
692	@cindex exclude files
693	@cindex searching directory trees
694	Skip any command-line file with a name suffix that matches the pattern
695	@var{glob}, using wildcard matching; a name suffix is either the whole
696	name, or a trailing part that starts with a non-slash character
697	immediately after a slash (@samp{/}) in the name.
698	When searching recursively, skip any subfile whose base
699	name matches @var{glob}; the base name is the part after the last
700	slash. A pattern can use
701	@samp{*}, @samp{?}, and @samp{[}...@samp{]} as wildcards,
702	and @code{\} to quote a wildcard or backslash character literally.
703
704	@item --exclude-from=@var{file}
705	@opindex --exclude-from
706	@cindex exclude files
707	@cindex searching directory trees
708	Skip files whose name matches any of the patterns
709	read from @var{file} (using wildcard matching as described
710	under @option{--exclude}).
711
712	@item --exclude-dir=@var{glob}
713	@opindex --exclude-dir
714	@cindex exclude directories
715	Skip any command-line directory with a name suffix that matches the
716	pattern @var{glob}. When searching recursively, skip any subdirectory
717	whose base name matches @var{glob}. Ignore any redundant trailing
718	slashes in @var{glob}.
719
720	@item -I
721	Process a binary file as if it did not contain matching data;
722	this is equivalent to the @samp{--binary-files=without-match} option.
723
724	@item --include=@var{glob}
725	@opindex --include
726	@cindex include files
727	@cindex searching directory trees
728	Search only files whose name matches @var{glob},
729	using wildcard matching as described under @option{--exclude}.
730	If contradictory @option{--include} and @option{--exclude} options are
731	given, the last matching one wins. If no @option{--include} or
732	@option{--exclude} options match, a file is included unless the first
733	such option is @option{--include}.
734
735	@item -r
736	@itemx --recursive
737	@opindex -r
738	@opindex --recursive
739	@cindex recursive search
740	@cindex searching directory trees
741	@cindex symbolic links
742	For each directory operand,
743	read and process all files in that directory, recursively.
744	Follow symbolic links on the command line, but skip symlinks
745	that are encountered recursively.
746	If no file operand is given, @command{grep} searches the working directory.
747	This is the same as the @samp{--directories=recurse} option.
748
749	@item -R
750	@itemx --dereference-recursive
751	@opindex -R
752	@opindex --dereference-recursive
753	@cindex recursive search
754	@cindex searching directory trees
755	@cindex symbolic links
756	For each directory operand, read and process all files in that
757	directory, recursively, following all symbolic links.
758
759	@end table
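
As a sketch of how the selection options above might be combined in a
recursive search, the following shell fragment builds a small
directory tree and searches only its C source files while skipping a
build directory. The directory layout and file contents are
hypothetical.

```shell
# Hypothetical directory tree, created only for this illustration.
tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/build"
printf 'TODO fix\n' > "$tmp/src/main.c"
printf 'TODO write\n' > "$tmp/src/notes.txt"
printf 'TODO generated\n' > "$tmp/build/gen.c"

# Recurse from $tmp, consider only files whose name matches *.c,
# and skip any subdirectory whose base name is "build".
hits=$(grep -r -l --include='*.c' --exclude-dir='build' 'TODO' "$tmp")

printf '%s\n' "$hits"
rm -r "$tmp"
```

Quoting the glob keeps the shell from expanding @samp{*.c} before
@command{grep} sees it.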
760
761	@node Other Options
762	@subsection Other Options
763
764	@table @option
765
766	@item --
767	@opindex --
768	@cindex option delimiter
769	Delimit the option list. Later arguments, if any, are treated as
770	operands even if they begin with @samp{-}. For example, @samp{grep PAT --
771	-file1 file2} searches for the pattern PAT in the files named @file{-file1}
772	and @file{file2}.
773
774	@item --line-buffered
775	@opindex --line-buffered
776	@cindex line buffering
777	Use line buffering for standard output, regardless of output device.
778	By default, standard output is line buffered for interactive devices,
779	and is fully buffered otherwise. With full buffering, the output
780	buffer is flushed when full; with line buffering, the buffer is also
781	flushed after every output line. The buffer size is system dependent.
782
783	@item -U
784	@itemx --binary
785	@opindex -U
786	@opindex --binary
787	@cindex MS-Windows binary I/O
788	@cindex binary I/O
789	On platforms that distinguish between text and binary I/O,
790	use the latter when reading and writing files other
791	than the user's terminal, so that all input bytes are read and written
792	as-is. This overrides the default behavior where @command{grep}
793	follows the operating system's advice whether to use text or binary
794	I/O@. On MS-Windows when @command{grep} uses text I/O it reads a
795	carriage return--newline pair as a newline and a Control-Z as
796	end-of-file, and it writes a newline as a carriage return--newline
797	pair.
798
799	When using text I/O @option{--byte-offset} (@option{-b}) counts and
800	@option{--binary-files} heuristics apply to input data after text-I/O
801	processing. Also, the @option{--binary-files} heuristics need not agree
802	with the @option{--binary} option; that is, they may treat the data as
803	text even if @option{--binary} is given, or vice versa.
804	@xref{File and Directory Selection}.
805
806	This option has no effect on GNU and other POSIX-compatible platforms,
807	which do not distinguish text from binary I/O.
808
809	@item -z
810	@itemx --null-data
811	@opindex -z
812	@opindex --null-data
813	@cindex zero-terminated lines
814	Treat input and output data as sequences of lines, each terminated by
815	a zero byte (the ASCII NUL character) instead of a newline.
816	Like the @option{-Z} or @option{--null} option,
817	this option can be used with commands like
818	@samp{sort -z} to process arbitrary file names.
819
820	@end table
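
Two of the options above lend themselves to a brief shell sketch: the
@option{--} delimiter, used here to search for a pattern that begins
with @samp{-}, and @option{-z}, which makes zero bytes the record
terminators. The input strings are assumptions made for this example.

```shell
# After "--", "-v" is an operand (here the pattern), not an option.
dash=$(printf 'use -v to invert\nplain\n' | grep -- '-v')

# With -z, records end at zero bytes, so "one\ntwo" is a single
# record and the input holds two records that contain "t".
records=$(printf 'one\ntwo\0three\0' | grep -z -c 't')

printf '%s\n' "$dash" "$records"
```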
821
822	@node Environment Variables
823	@section Environment Variables
824
825	The behavior of @command{grep} is affected
826	by the following environment variables.
827
828	@vindex LANGUAGE @r{environment variable}
829	@vindex LC_ALL @r{environment variable}
830	@vindex LC_MESSAGES @r{environment variable}
831	@vindex LANG @r{environment variable}
832	The locale for category @w{@code{LC_@var{foo}}}
833	is specified by examining the three environment variables
834	@env{LC_ALL}, @w{@env{LC_@var{foo}}}, and @env{LANG},
835	in that order.
836	The first of these variables that is set specifies the locale.
837	For example, if @env{LC_ALL} is not set,
838	but @env{LC_COLLATE} is set to @samp{pt_BR},
839	then the Brazilian Portuguese locale is used
840	for the @env{LC_COLLATE} category.
841	As a special case for @env{LC_MESSAGES} only, the environment variable
842	@env{LANGUAGE} can contain a colon-separated list of languages that
843	overrides the three environment variables that ordinarily specify
844	the @env{LC_MESSAGES} category.
845	The @samp{C} locale is used if none of these environment variables are set,
846	if the locale catalog is not installed,
847	or if @command{grep} was not compiled
848	with national language support (NLS).
849	The shell command @code{locale -a} lists locales that are currently available.
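
The precedence described above can be sketched in the shell as
follows. The example assumes that the file @file{missing.txt} does not
exist and that the C library reports this condition with its usual
English message in the @samp{C} locale; the @env{LANG} value is
arbitrary and need not be installed.

```shell
# An empty temporary directory guarantees that missing.txt is absent.
tmp=$(mktemp -d)

# LC_ALL is examined before LANG, so the C locale is used for every
# category, including LC_MESSAGES, whatever LANG says.
msg=$(LANG=pt_BR.UTF-8 LC_ALL=C grep 'pattern' "$tmp/missing.txt" 2>&1)

printf '%s\n' "$msg"
rm -r "$tmp"
```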
850
851	Many of the environment variables in the following list let you
852	control highlighting using
853	Select Graphic Rendition (SGR)
854	commands interpreted by the terminal or terminal emulator.
855	(See the
856	Select Graphic Rendition (SGR) section
857	in the documentation of your text terminal
858	for permitted values and their meanings as character attributes.)
859	These substring values are integers in decimal representation
860	and can be concatenated with semicolons.
861	@command{grep} takes care of assembling the result
862	into a complete SGR sequence (@samp{\33[}...@samp{m}).
863	Common values to concatenate include
864	@samp{1} for bold,
865	@samp{4} for underline,
866	@samp{5} for blink,
867	@samp{7} for inverse,
868	@samp{39} for default foreground color,
869	@samp{30} to @samp{37} for foreground colors,
870	@samp{90} to @samp{97} for 16-color mode foreground colors,
871	@samp{38;5;0} to @samp{38;5;255}
872	for 88-color and 256-color modes foreground colors,
873	@samp{49} for default background color,
874	@samp{40} to @samp{47} for background colors,
875	@samp{100} to @samp{107} for 16-color mode background colors,
876	and @samp{48;5;0} to @samp{48;5;255}
877	for 88-color and 256-color modes background colors.
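
To make the shape of an SGR sequence concrete, here is a minimal sketch
that assembles one by hand with @command{printf}, the way
@command{grep} is described as doing above.  The value @samp{1;4;31}
(bold, underline, red foreground) and the word being printed are
arbitrary choices for illustration.

```shell
# Assemble ESC [ <values> m ... ESC [ m by hand.
# '1;4;31' concatenates bold (1), underline (4), and a foreground color (31).
sgr='1;4;31'
out=$(printf '\33[%sm%s\33[m' "$sgr" 'warning')
printf '%s\n' "$out"
```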
878
879	The two-letter names used in the @env{GREP_COLORS} environment variable
880	(and some of the others) refer to terminal ``capabilities,'' the ability
881	of a terminal to highlight text, or change its color, and so on.
882	These capabilities are stored in an online database and accessed by
883	the @code{terminfo} library.
884
885	@cindex environment variables
886
887	@table @env
888
889	@item GREP_COLOR
890	@vindex GREP_COLOR @r{environment variable}
891	@cindex highlight markers
892	This variable specifies the color used to highlight matched (non-empty) text.
893	It is deprecated in favor of @env{GREP_COLORS}, but still supported.
894	The @samp{mt}, @samp{ms}, and @samp{mc} capabilities of @env{GREP_COLORS}
895	have priority over it.
896	It can only specify the color used to highlight
897	the matching non-empty text in any matching line
898	(a selected line when the @option{-v} command-line option is omitted,
899	or a context line when @option{-v} is specified).
900	The default is @samp{01;31},
901	which means a bold red foreground text on the terminal's default background.
902
903	@item GREP_COLORS
904	@vindex GREP_COLORS @r{environment variable}
905	@cindex highlight markers
906	This variable specifies the colors and other attributes
907	used to highlight various parts of the output.
908	Its value is a colon-separated list of @code{terminfo} capabilities
909	that defaults to @samp{ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36}
910	with the @samp{rv} and @samp{ne} boolean capabilities omitted (i.e., false).
911	Supported capabilities are as follows.
912
913	@table @code
914	@item sl=
915	@vindex sl GREP_COLORS @r{capability}
916	SGR substring for whole selected lines
917	(i.e.,
918	matching lines when the @option{-v} command-line option is omitted,
919	or non-matching lines when @option{-v} is specified).
920	If however the boolean @samp{rv} capability
921	and the @option{-v} command-line option are both specified,
922	it applies to context matching lines instead.
923	The default is empty (i.e., the terminal's default color pair).
924
925	@item cx=
926	@vindex cx GREP_COLORS @r{capability}
927	SGR substring for whole context lines
928	(i.e.,
929	non-matching lines when the @option{-v} command-line option is omitted,
930	or matching lines when @option{-v} is specified).
931	If however the boolean @samp{rv} capability
932	and the @option{-v} command-line option are both specified,
933	it applies to selected non-matching lines instead.
934	The default is empty (i.e., the terminal's default color pair).
935
936	@item rv
937	@vindex rv GREP_COLORS @r{capability}
938	Boolean value that reverses (swaps) the meanings of
939	the @samp{sl=} and @samp{cx=} capabilities
940	when the @option{-v} command-line option is specified.
941	The default is false (i.e., the capability is omitted).
942
943	@item mt=01;31
944	@vindex mt GREP_COLORS @r{capability}
945	SGR substring for matching non-empty text in any matching line
946	(i.e.,
947	a selected line when the @option{-v} command-line option is omitted,
948	or a context line when @option{-v} is specified).
949	Setting this is equivalent to setting both @samp{ms=} and @samp{mc=}
950	at once to the same value.
951	The default is a bold red text foreground over the current line background.
952
953	@item ms=01;31
954	@vindex ms GREP_COLORS @r{capability}
955	SGR substring for matching non-empty text in a selected line.
956	(This is used only when the @option{-v} command-line option is omitted.)
957	The effect of the @samp{sl=} (or @samp{cx=} if @samp{rv}) capability
958	remains active when this takes effect.
959	The default is a bold red text foreground over the current line background.
960
961	@item mc=01;31
962	@vindex mc GREP_COLORS @r{capability}
963	SGR substring for matching non-empty text in a context line.
964	(This is used only when the @option{-v} command-line option is specified.)
965	The effect of the @samp{cx=} (or @samp{sl=} if @samp{rv}) capability
966	remains active when this takes effect.
967	The default is a bold red text foreground over the current line background.
968
969	@item fn=35
970	@vindex fn GREP_COLORS @r{capability}
971	SGR substring for file names prefixing any content line.
972	The default is a magenta text foreground over the terminal's default background.
973
974	@item ln=32
975	@vindex ln GREP_COLORS @r{capability}
976	SGR substring for line numbers prefixing any content line.
977	The default is a green text foreground over the terminal's default background.
978
979	@item bn=32
980	@vindex bn GREP_COLORS @r{capability}
981	SGR substring for byte offsets prefixing any content line.
982	The default is a green text foreground over the terminal's default background.
983
984	@item se=36
985	@vindex se GREP_COLORS @r{capability}
986	SGR substring for separators that are inserted
987	between selected line fields (@samp{:}),
988	between context line fields (@samp{-}),
989	and between groups of adjacent lines
990	when nonzero context is specified (@samp{--}).
991	The default is a cyan text foreground over the terminal's default background.
992
993	@item ne
994	@vindex ne GREP_COLORS @r{capability}
995	Boolean value that prevents clearing to the end of line
996	using Erase in Line (EL) to Right (@samp{\33[K})
997	each time a colorized item ends.
998	This is needed on terminals on which EL is not supported.
999	It is otherwise useful on terminals
1000	for which the @code{back_color_erase}
1001	(@code{bce}) boolean @code{terminfo} capability does not apply,
1002	when the chosen highlight colors do not affect the background,
1003	or when EL is too slow or causes too much flicker.
1004	The default is false (i.e., the capability is omitted).
1005	@end table
1006
1007	Note that boolean capabilities have no @samp{=}... part.
1008	They are omitted (i.e., false) by default and become true when specified.
1009
1010
1011	@item LC_ALL
1012	@itemx LC_COLLATE
1013	@itemx LANG
1014	@vindex LC_ALL @r{environment variable}
1015	@vindex LC_COLLATE @r{environment variable}
1016	@vindex LANG @r{environment variable}
1017	@cindex character type
1018	@cindex national language support
1019	@cindex NLS
1020	These variables specify the locale for the @env{LC_COLLATE} category,
1021	which might affect how range expressions like @samp{[a-z]} are
1022	interpreted.
1023
1024	@item LC_ALL
1025	@itemx LC_CTYPE
1026	@itemx LANG
1027	@vindex LC_ALL @r{environment variable}
1028	@vindex LC_CTYPE @r{environment variable}
1029	@vindex LANG @r{environment variable}
1030	@cindex encoding error
1031	@cindex null character
1032	These variables specify the locale for the @env{LC_CTYPE} category,
1033	which determines the type of characters,
1034	e.g., which characters are whitespace.
1035	This category also determines the character encoding.
1036	@xref{Character Encoding}.
1037
1038	@item LANGUAGE
1039	@itemx LC_ALL
1040	@itemx LC_MESSAGES
1041	@itemx LANG
1042	@vindex LANGUAGE @r{environment variable}
1043	@vindex LC_ALL @r{environment variable}
1044	@vindex LC_MESSAGES @r{environment variable}
1045	@vindex LANG @r{environment variable}
1046	@cindex language of messages
1047	@cindex message language
1048	@cindex national language support
1049	@cindex translation of message language
1050	These variables specify the locale for the @env{LC_MESSAGES} category,
1051	which determines the language that @command{grep} uses for messages.
1052	The default @samp{C} locale uses American English messages.
1053
1054	@item POSIXLY_CORRECT
1055	@vindex POSIXLY_CORRECT @r{environment variable}
1056	If set, @command{grep} behaves as POSIX requires; otherwise,
1057	@command{grep} behaves more like other GNU programs.
1058	POSIX
1059	requires that options that
1060	follow file names must be treated as file names;
1061	by default,
1062	such options are permuted to the front of the operand list
1063	and are treated as options.
1064	Also, @env{POSIXLY_CORRECT} disables special handling of an
1065	invalid bracket expression. @xref{invalid-bracket-expr}.
1066
1067	@item _@var{N}_GNU_nonoption_argv_flags_
1068	@vindex _@var{N}_GNU_nonoption_argv_flags_ @r{environment variable}
1069	(Here @code{@var{N}} is @command{grep}'s numeric process ID.)
1070	If the @var{i}th character of this environment variable's value is @samp{1},
1071	do not consider the @var{i}th operand of @command{grep} to be an option,
1072	even if it appears to be one.
1073	A shell can put this variable in the environment for each command it runs,
1074	specifying which operands are the results of file name wildcard expansion
1075	and therefore should not be treated as options.
1076	This behavior is available only with the GNU C library,
1077	and only when @env{POSIXLY_CORRECT} is not set.
1078
1079	@end table
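
As an illustration of the @env{GREP_COLORS} syntax described above, the
following sketch sets the @samp{mt=} capability to bold green and adds
the boolean @samp{ne} capability.  It assumes GNU @command{grep} with
@option{--color} support; the sample input is made up.

```shell
# Highlight matched text in bold green (01;32) instead of the default bold red,
# and specify the boolean 'ne' capability so that no EL (\33[K) is emitted.
out=$(printf 'hello world\n' | GREP_COLORS='mt=01;32:ne' grep --color=always 'world')
printf '%s\n' "$out"
```

@noindent
Using @option{--color=always} here is only so that the escape sequences
appear even when the output is captured rather than sent to a terminal.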
1080
1081	The @env{GREP_OPTIONS} environment variable of @command{grep} 2.20 and
1082	earlier is no longer supported, as it caused problems when writing
1083	portable scripts. To make arbitrary changes to how @command{grep}
1084	works, you can use an alias or script instead. For example, if
1085	@command{grep} is in the directory @samp{/usr/bin} you can prepend
1086	@file{$HOME/bin} to your @env{PATH} and create an executable script
1087	@file{$HOME/bin/grep} containing the following:
1088
1089	@example
1090	#! /bin/sh
1091	export PATH=/usr/bin
1092	exec grep --color=auto --devices=skip "$@@"
1093	@end example
1094
1095
1096	@node Exit Status
1097	@section Exit Status
1098	@cindex exit status
1099	@cindex return status
1100
1101	Normally the exit status is 0 if a line is selected, 1 if no lines
1102	were selected, and 2 if an error occurred. However, if the
1103	@option{-q} or @option{--quiet} or @option{--silent} option is used
1104	and a line is selected, the exit status is 0 even if an error
1105	occurred. Other @command{grep} implementations may exit with status
1106	greater than 2 on error.
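
The three normal exit statuses can be observed with a short sketch like
the following.  The scratch directory, the file name
@file{sample.txt}, and the variable names are assumptions made for the
example.

```shell
tmpdir=$(mktemp -d)                       # scratch directory for this sketch
printf 'alpha\nbeta\n' > "$tmpdir/sample.txt"

found=0;   grep -q 'alpha' "$tmpdir/sample.txt" || found=$?     # a line is selected
nomatch=0; grep -q 'gamma' "$tmpdir/sample.txt" || nomatch=$?   # no line is selected
trouble=0; grep 'alpha' "$tmpdir/no-such-file" 2>/dev/null || trouble=$?  # error

echo "$found $nomatch $trouble"
rm -rf "$tmpdir"
```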
1107
1108	@node grep Programs
1109	@section @command{grep} Programs
1110	@cindex @command{grep} programs
1111	@cindex variants of @command{grep}
1112
1113	@command{grep} searches the named input files
1114	for lines containing a match to the given patterns.
1115	By default, @command{grep} prints the matching lines.
1116	A file named @file{-} stands for standard input.
1117	If no input is specified, @command{grep} searches the working
1118	directory @file{.} if given a command-line option specifying
1119	recursion; otherwise, @command{grep} searches standard input.
1120	There are four major variants of @command{grep},
1121	controlled by the following options.
1122
1123	@table @option
1124
1125	@item -G
1126	@itemx --basic-regexp
1127	@opindex -G
1128	@opindex --basic-regexp
1129	@cindex matching basic regular expressions
1130	Interpret patterns as basic regular expressions (BREs).
1131	This is the default.
1132
1133	@item -E
1134	@itemx --extended-regexp
1135	@opindex -E
1136	@opindex --extended-regexp
1137	@cindex matching extended regular expressions
1138	Interpret patterns as extended regular expressions (EREs).
1139	(@option{-E} is specified by POSIX.)
1140
1141	@item -F
1142	@itemx --fixed-strings
1143	@opindex -F
1144	@opindex --fixed-strings
1145	@cindex matching fixed strings
1146	Interpret patterns as fixed strings, not regular expressions.
1147	(@option{-F} is specified by POSIX.)
1148
1149	@item -P
1150	@itemx --perl-regexp
1151	@opindex -P
1152	@opindex --perl-regexp
1153	@cindex matching Perl-compatible regular expressions
1154	Interpret patterns as Perl-compatible regular expressions (PCREs).
1155	PCRE support is here to stay, but consider this option experimental when
1156	combined with the @option{-z} (@option{--null-data}) option, and note that
1157	@samp{grep@ -P} may warn of unimplemented features.
1158	@xref{Other Options}.
1159
1160	@end table
1161
1162	In addition,
1163	two variant programs @command{egrep} and @command{fgrep} are available.
1164	@command{egrep} is the same as @samp{grep@ -E}.
1165	@command{fgrep} is the same as @samp{grep@ -F}.
1166	Direct invocation as either
1167	@command{egrep} or @command{fgrep} is deprecated,
1168	but is provided to allow historical applications
1169	that rely on them to run unmodified.
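
The difference between the variants can be seen by giving the same
pattern to each.  The following is a sketch with made-up input; it
leaves out @option{-P}, since PCRE support may not be compiled in.

```shell
input=$(printf 'aab\na+b\n')

# -G (default): in a basic regular expression an unescaped '+' is ordinary.
basic=$(printf '%s\n' "$input" | grep -G 'a+b')
# -E: '+' is a repetition operator, so this is one or more 'a' followed by 'b'.
extended=$(printf '%s\n' "$input" | grep -E 'a+b')
# -F: the pattern is a fixed string, not a regular expression.
fixed=$(printf '%s\n' "$input" | grep -F 'a+b')

printf 'basic=%s extended=%s fixed=%s\n' "$basic" "$extended" "$fixed"
```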
1170
1171
1172	@node Regular Expressions
1173	@chapter Regular Expressions
1174	@cindex regular expressions
1175
1176	A @dfn{regular expression} is a pattern that describes a set of strings.
1177	Regular expressions are constructed analogously to arithmetic expressions,
1178	by using various operators to combine smaller expressions.
1179	@command{grep} understands
1180	three different versions of regular expression syntax:
1181	basic (BRE), extended (ERE), and Perl-compatible (PCRE).
1182	In GNU @command{grep},
1183	there is no difference in available functionality between the basic and
1184	extended syntaxes.
1185	In other implementations, basic regular expressions are less powerful.
1186	The following description applies to extended regular expressions;
1187	differences for basic regular expressions are summarized afterwards.
1188	Perl-compatible regular expressions give additional functionality, and
1189	are documented in the @i{pcresyntax}(3) and @i{pcrepattern}(3) manual
1190	pages, but work only if PCRE is available in the system.
1191
1192	@menu
1193	* Fundamental Structure::
1194	* Character Classes and Bracket Expressions::
1195	* The Backslash Character and Special Expressions::
1196	* Anchoring::
1197	* Back-references and Subexpressions::
1198	* Basic vs Extended::
1199	* Character Encoding::
1200	* Matching Non-ASCII::
1201	@end menu
1202
1203	@node Fundamental Structure
1204	@section Fundamental Structure
1205
1206	@cindex ordinary characters
1207	@cindex special characters
1208	In regular expressions, the characters @samp{.?*+@{|()[\^$} are
1209	@dfn{special characters} and have uses described below. All other
1210	characters are @dfn{ordinary characters}, and each ordinary character
1211	is a regular expression that matches itself.
1212
1213	@opindex .
1214	@cindex dot
1215	@cindex period
1216	The period @samp{.} matches any single character.
1217	It is unspecified whether @samp{.} matches an encoding error.
1218
1219	@cindex interval expressions
1220	A regular expression may be followed by one of several
1221	repetition operators; the operators beginning with @samp{@{}
1222	are called @dfn{interval expressions}.
1223
1224	@table @samp
1225
1226	@item ?
1227	@opindex ?
1228	@cindex question mark
1229	@cindex match expression at most once
1230	The preceding item is optional and is matched at most once.
1231
1232	@item *
1233	@opindex *
1234	@cindex asterisk
1235	@cindex match expression zero or more times
1236	The preceding item is matched zero or more times.
1237
1238	@item +
1239	@opindex +
1240	@cindex plus sign
1241	@cindex match expression one or more times
1242	The preceding item is matched one or more times.
1243
1244	@item @{@var{n}@}
1245	@opindex @{@var{n}@}
1246	@cindex braces, one argument
1247	@cindex match expression @var{n} times
1248	The preceding item is matched exactly @var{n} times.
1249
1250	@item @{@var{n},@}
1251	@opindex @{@var{n},@}
1252	@cindex braces, second argument omitted
1253	@cindex match expression @var{n} or more times
1254	The preceding item is matched @var{n} or more times.
1255
1256	@item @{,@var{m}@}
1257	@opindex @{,@var{m}@}
1258	@cindex braces, first argument omitted
1259	@cindex match expression at most @var{m} times
1260	The preceding item is matched at most @var{m} times.
1261	This is a GNU extension.
1262
1263	@item @{@var{n},@var{m}@}
1264	@opindex @{@var{n},@var{m}@}
1265	@cindex braces, two arguments
1266	@cindex match expression from @var{n} to @var{m} times
1267	The preceding item is matched at least @var{n} times, but not more than
1268	@var{m} times.
1269
1270	@end table
1271
1272	The empty regular expression matches the empty string.
1273	Two regular expressions may be concatenated;
1274	the resulting regular expression
1275	matches any string formed by concatenating two substrings
1276	that respectively match the concatenated expressions.
1277
1278	Two regular expressions may be joined by the infix operator @samp{|};
1279	the resulting regular expression
1280	matches any string matching either alternate expression.
1281
1282	Repetition takes precedence over concatenation,
1283	which in turn takes precedence over alternation.
1284	A whole expression may be enclosed in parentheses
1285	to override these precedence rules and form a subexpression.
1286	An unmatched @samp{)} matches just itself.
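
The precedence rules can be illustrated with a short sketch using
extended regular expressions.  The input lines are invented, and
@option{-x} is used only so that each pattern must match a whole line.

```shell
input=$(printf 'abb\nabab\ncd\nabcd\n')

# Repetition binds more tightly than concatenation:
# 'ab{2}' is 'a' followed by two 'b's, not 'ab' twice.
rep=$(printf '%s\n' "$input" | grep -E -x 'ab{2}')

# Parentheses form a subexpression that the interval applies to;
# alternation has the lowest precedence, so this is "abab" or "cd".
alt=$(printf '%s\n' "$input" | grep -E -x '(ab){2}|cd')

printf '%s\n' "$rep" "$alt"
```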
1287
1288	@node Character Classes and Bracket Expressions
1289	@section Character Classes and Bracket Expressions
1290
1291	@cindex bracket expression
1292	@cindex character class
1293	A @dfn{bracket expression} is a list of characters enclosed by @samp{[} and
1294	@samp{]}.
1295	It matches any single character in that list.
1296	If the first character of the list is the caret @samp{^},
1297	then it matches any character @strong{not} in the list,
1298	and it is unspecified whether it matches an encoding error.
1299	For example, the regular expression
1300	@samp{[0123456789]} matches any single digit,
1301	whereas @samp{[^()]} matches any single character that is not
1302	an opening or closing parenthesis, and might or might not match an
1303	encoding error.
1304
1305	@cindex range expression
1306	Within a bracket expression, a @dfn{range expression} consists of two
1307	characters separated by a hyphen.
1308	It matches any single character that
1309	sorts between the two characters, inclusive.
1310	In the default C locale, the sorting sequence is the native character
1311	order; for example, @samp{[a-d]} is equivalent to @samp{[abcd]}.
1312	In other locales, the sorting sequence is not specified, and
1313	@samp{[a-d]} might be equivalent to @samp{[abcd]} or to
1314	@samp{[aBbCcDd]}, or it might fail to match any character, or the set of
1315	characters that it matches might even be erratic.
1316	To obtain the traditional interpretation
1317	of bracket expressions, you can use the @samp{C} locale by setting the
1318	@env{LC_ALL} environment variable to the value @samp{C}.
1319
1320	Finally, certain named classes of characters are predefined within
1321	bracket expressions, as follows.
1322	Their interpretation depends on the @env{LC_CTYPE} locale;
1323	for example, @samp{[[:alnum:]]} means the character class of numbers and letters
1324	in the current locale.
1325
1326	@cindex classes of characters
1327	@cindex character classes
1328	@table @samp
1329
1330	@item [:alnum:]
1331	@opindex alnum @r{character class}
1332	@cindex alphanumeric characters
1333	Alphanumeric characters:
1334	@samp{[:alpha:]} and @samp{[:digit:]}; in the @samp{C} locale and ASCII
1335	character encoding, this is the same as @samp{[0-9A-Za-z]}.
1336
1337	@item [:alpha:]
1338	@opindex alpha @r{character class}
1339	@cindex alphabetic characters
1340	Alphabetic characters:
1341	@samp{[:lower:]} and @samp{[:upper:]}; in the @samp{C} locale and ASCII
1342	character encoding, this is the same as @samp{[A-Za-z]}.
1343
1344	@item [:blank:]
1345	@opindex blank @r{character class}
1346	@cindex blank characters
1347	Blank characters:
1348	space and tab.
1349
1350	@item [:cntrl:]
1351	@opindex cntrl @r{character class}
1352	@cindex control characters
1353	Control characters.
1354	In ASCII, these characters have octal codes 000
1355	through 037, and 177 (DEL).
1356	In other character sets, these are
1357	the equivalent characters, if any.
1358
1359	@item [:digit:]
1360	@opindex digit @r{character class}
1361	@cindex digit characters
1362	@cindex numeric characters
1363	Digits: @code{0 1 2 3 4 5 6 7 8 9}.
1364
1365	@item [:graph:]
1366	@opindex graph @r{character class}
1367	@cindex graphic characters
1368	Graphical characters:
1369	@samp{[:alnum:]} and @samp{[:punct:]}.
1370
1371	@item [:lower:]
1372	@opindex lower @r{character class}
1373	@cindex lower-case letters
1374	Lower-case letters; in the @samp{C} locale and ASCII character
1375	encoding, this is
1376	@code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.
1377
1378	@item [:print:]
1379	@opindex print @r{character class}
1380	@cindex printable characters
1381	Printable characters:
1382	@samp{[:alnum:]}, @samp{[:punct:]}, and space.
1383
1384	@item [:punct:]
1385	@opindex punct @r{character class}
1386	@cindex punctuation characters
1387	Punctuation characters; in the @samp{C} locale and ASCII character
1388	encoding, this is
1389	@code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}.
1390
1391	@item [:space:]
1392	@opindex space @r{character class}
1393	@cindex space characters
1394	@cindex whitespace characters
1395	Space characters: in the @samp{C} locale, this is
1396	tab, newline, vertical tab, form feed, carriage return, and space.
1397	@xref{Usage}, for more discussion of matching newlines.
1398
1399	@item [:upper:]
1400	@opindex upper @r{character class}
1401	@cindex upper-case letters
1402	Upper-case letters: in the @samp{C} locale and ASCII character
1403	encoding, this is
1404	@code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
1405
1406	@item [:xdigit:]
1407	@opindex xdigit @r{character class}
1408	@cindex xdigit class
1409	@cindex hexadecimal digits
1410	Hexadecimal digits:
1411	@code{0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f}.
1412
1413	@end table
1414	Note that the brackets in these class names are
1415	part of the symbolic names, and must be included in addition to
1416	the brackets delimiting the bracket expression.
1417
1418	@anchor{invalid-bracket-expr}
1419	If you mistakenly omit the outer brackets and search for, say, @samp{[:upper:]},
1420	GNU @command{grep} prints a diagnostic and exits with status 2, on
1421	the assumption that you did not intend to search for the nominally
1422	equivalent regular expression: @samp{[:epru]}.
1423	Set the @env{POSIXLY_CORRECT} environment variable to disable this feature.
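
The following sketch contrasts correctly bracketed class names with the
mistaken form.  It assumes GNU @command{grep}; the input and variable
names are invented, and @env{POSIXLY_CORRECT} is unset in a subshell so
that the diagnostic is not disabled.

```shell
input=$(printf 'Alpha\nbeta\n42\n')

# The class name needs its own brackets inside the bracket expression.
upper=$(printf '%s\n' "$input" | LC_ALL=C grep '[[:upper:]]')
digits=$(printf '%s\n' "$input" | LC_ALL=C grep -x '[[:digit:]]*')

# Omitting the outer brackets: GNU grep diagnoses this and exits with status 2.
status=0
(unset POSIXLY_CORRECT; printf '%s\n' "$input" | grep '[:upper:]') >/dev/null 2>&1 || status=$?

printf '%s %s %s\n' "$upper" "$digits" "$status"
```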
1424
1425	Special characters lose their special meaning inside bracket expressions.
1426
1427	@table @samp
1428	@item ]
1429	ends the bracket expression if it's not the first list item.
1430	So, if you want to make the @samp{]} character a list item,
1431	you must put it first.
1432
1433	@item [.
1434	represents the open collating symbol.
1435
1436	@item .]
1437	represents the close collating symbol.
1438
1439	@item [=
1440	represents the open equivalence class.
1441
1442	@item =]
1443	represents the close equivalence class.
1444
1445	@item [:
1446	represents the open character class symbol, and should be followed by a
1447	valid character class name.
1448
1449	@item :]
1450	represents the close character class symbol.
1451
1452	@item -
1453	represents the range if it's not first or last in a list or the ending point
1454	of a range.
1455
1456	@item ^
1457	represents the characters not in the list.
1458	If you want to make the @samp{^}
1459	character a list item, place it anywhere but first.
1460
1461	@end table
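
As a sketch of the placement rules in the table above, the bracket
expression below contains @samp{]}, @samp{^}, and @samp{-} as literal
list items.  The input lines are invented for illustration.

```shell
# ']' comes first, '^' is not first, and '-' is last,
# so all three are ordinary list items here.
out=$(printf 'a]b\nc^d\ne-f\nghi\n' | grep '[]^-]')
printf '%s\n' "$out"
```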
1462
1463	@node The Backslash Character and Special Expressions
1464	@section The Backslash Character and Special Expressions
1465	@cindex backslash
1466
1467	The @samp{\} character followed by a special character is a regular
1468	expression that matches the special character.
1469	The @samp{\} character,
1470	when followed by certain ordinary characters,
1471	takes a special meaning:
1472
1473	@table @samp
1474
1475	@item \b
1476	Match the empty string at the edge of a word.
1477
1478	@item \B
1479	Match the empty string provided it's not at the edge of a word.
1480
1481	@item \<
1482	Match the empty string at the beginning of a word.
1483
1484	@item \>
1485	Match the empty string at the end of a word.
1486
1487	@item \w
1488	Match a word constituent; it is a synonym for @samp{[_[:alnum:]]}.
1489
1490	@item \W
1491	Match a non-word constituent; it is a synonym for @samp{[^_[:alnum:]]}.
1492
1493	@item \s
1494	Match whitespace; it is a synonym for @samp{[[:space:]]}.
1495
1496	@item \S
1497	Match non-whitespace; it is a synonym for @samp{[^[:space:]]}.
1498
1499	@end table
1500
1501	For example, @samp{\brat\b} matches the separate word @samp{rat},
1502	whereas @samp{\Brat\B} matches @samp{crate} but not @samp{furry rat}.
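
The word-edge operators can be tried on a few sample lines.  This is a
sketch that assumes GNU @command{grep}, since these backslash
expressions are extensions; the input is made up.

```shell
input=$(printf 'furry rat\ncrate\nrats\n')

# \b on both sides: 'rat' must stand alone as a word.
word=$(printf '%s\n' "$input" | grep '\brat\b')
# \B on both sides: 'rat' must be strictly inside a word.
inner=$(printf '%s\n' "$input" | grep '\Brat\B')
# \< only: 'rat' must begin a word, but the word may continue.
start=$(printf '%s\n' "$input" | grep '\<rat')

printf '%s\n' "$word" "$inner" "$start"
```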
1503
1504	@node Anchoring
1505	@section Anchoring
1506	@cindex anchoring
1507
1508	The caret @samp{^} and the dollar sign @samp{$} are special characters that
1509	respectively match the empty string at the beginning and end of a line.
1510	They are termed @dfn{anchors}, since they force the match to be ``anchored''
1511	to beginning or end of a line, respectively.
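
A brief sketch of both anchors, using invented input that includes one
empty line:

```shell
input=$(printf 'foo bar\nbar foo\n\nfoo\n')

begins=$(printf '%s\n' "$input" | grep '^foo')     # lines that start with foo
ends=$(printf '%s\n' "$input" | grep 'foo$')       # lines that end with foo
whole=$(printf '%s\n' "$input" | grep '^foo$')     # lines that are exactly foo
empty=$(printf '%s\n' "$input" | grep -c '^$')     # count of empty lines

printf '%s\n' "$begins" "$ends" "$whole" "$empty"
```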
1512
1513	@node Back-references and Subexpressions
1514	@section Back-references and Subexpressions
1515	@cindex subexpression
1516	@cindex back-reference
1517
1518	The back-reference @samp{\@var{n}},
1519	where @var{n} is a single nonzero digit, matches
1520	the substring previously matched by the @var{n}th parenthesized subexpression
1521	of the regular expression.
1522	For example, @samp{(a)\1} matches @samp{aa}.
1523	If the parenthesized subexpression does not participate in the match,
1524	the back-reference makes the whole match fail;
1525	for example, @samp{(a)*\1} fails to match @samp{a}.
1526	If the parenthesized subexpression matches more than one substring,
1527	the back-reference refers to the last matched substring;
1528	for example, @samp{^(ab*)*\1$} matches @samp{ababbabb} but not @samp{ababbab}.
1529	When multiple regular expressions are given with
1530	@option{-e} or from a file (@samp{-f @var{file}}),
1531	back-references are local to each expression.
1532
1533	@xref{Known Bugs}, for some known problems with back-references.
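
A common use of a back-reference is to find a repeated character.  The
sketch below shows the same search in extended and in basic syntax; the
input words are arbitrary.

```shell
input=$(printf 'hello\nworld\nbookkeeper\n')

# Extended syntax: (.) captures one character and \1 requires it again.
ere=$(printf '%s\n' "$input" | grep -E '(.)\1')
# Basic syntax: the parentheses are backslashed, the back-reference is not.
bre=$(printf '%s\n' "$input" | grep '\(.\)\1')

printf '%s\n' "$ere"
```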
1534
1535	@node Basic vs Extended
1536	@section Basic vs Extended Regular Expressions
1537	@cindex basic regular expressions
1538
1539	In basic regular expressions the characters @samp{?}, @samp{+},
1540	@samp{@{}, @samp{|}, @samp{(}, and @samp{)} lose their special meaning;
1541	instead use the backslashed versions @samp{\?}, @samp{\+}, @samp{\@{},
1542	@samp{\|}, @samp{\(}, and @samp{\)}.  Also, a backslash is needed
1543	before an interval expression's closing @samp{@}}, and an unmatched
1544	@code{\)} is invalid.
1545
1546	Portable scripts should avoid the following constructs, as
1547	POSIX says they produce undefined results:
1548
1549	@itemize @bullet
1550	@item
1551	Extended regular expressions that use back-references.
1552	@item
1553	Basic regular expressions that use @samp{\?}, @samp{\+}, or @samp{\|}.
1554	@item
1555	Empty parenthesized regular expressions like @samp{()}.
1556	@item
1557	Empty alternatives (as in, e.g., @samp{a|}).
1558	@item
1559	Repetition operators that immediately follow empty expressions,
1560	unescaped @samp{$}, or other repetition operators.
1561	@item
1562	A backslash escaping an ordinary character (e.g., @samp{\S}),
1563	unless it is a back-reference.
1564	@item
1565	An unescaped @samp{[} that is not part of a bracket expression.
1566	@item
1567	In extended regular expressions, an unescaped @samp{@{} that is not
1568	part of an interval expression.
1569	@end itemize
1570
1571	@cindex interval expressions
1572	Traditional @command{egrep} did not support interval expressions and
1573	some @command{egrep} implementations use @samp{\@{} and @samp{\@}} instead, so
1574	portable scripts should avoid interval expressions in @samp{grep@ -E} patterns
1575	and should use @samp{[@{]} to match a literal @samp{@{}.
1576
1577	GNU @command{grep@ -E} attempts to support traditional usage by
1578	assuming that @samp{@{} is not special if it would be the start of an
1579	invalid interval expression.
1580	For example, the command
1581	@samp{grep@ -E@ '@{1'} searches for the two-character string @samp{@{1}
1582	instead of reporting a syntax error in the regular expression.
1583	POSIX allows this behavior as an extension, but portable scripts
1584	should avoid it.
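
The following sketch shows an interval expression in both syntaxes,
along with the bracketed form suggested above for matching a literal
@samp{@{} in an extended regular expression.  The input lines are
invented for illustration.

```shell
input=$(printf 'aa\nab\na{2}\n')

# Basic syntax: the braces of an interval expression are backslashed.
bre=$(printf '%s\n' "$input" | grep 'a\{2\}')
# Extended syntax: the braces are written without backslashes.
ere=$(printf '%s\n' "$input" | grep -E 'a{2}')
# Basic syntax with unescaped braces: they are ordinary characters.
lit=$(printf '%s\n' "$input" | grep 'a{2}')
# Extended syntax, literal brace written as a bracket expression.
brace=$(printf '%s\n' "$input" | grep -E '[{]2')

printf '%s\n' "$bre" "$ere" "$lit" "$brace"
```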
1585
1586	@node Character Encoding
1587	@section Character Encoding
1588	@cindex character encoding
1589
1590	The @env{LC_CTYPE} locale specifies the encoding of characters in
1591	patterns and data, that is, whether text is encoded in UTF-8, ASCII,
1592	or some other encoding. @xref{Environment Variables}.
1593
1594	In the @samp{C} or @samp{POSIX} locale, every character is encoded as
1595	a single byte and every byte is a valid character. In more-complex
1596	encodings such as UTF-8, a sequence of multiple bytes may be needed to
1597	represent a character, and some bytes may be encoding errors that do
1598	not contribute to the representation of any character. POSIX does not
1599	specify the behavior of @command{grep} when patterns or input data
1600	contain encoding errors or null characters, so portable scripts should
1601	avoid such usage. As an extension to POSIX, GNU @command{grep} treats
1602	null characters like any other character. However, unless the
1603	@option{-a} (@option{--binary-files=text}) option is used, the
1604	presence of null characters in input or of encoding errors in output
1605	causes GNU @command{grep} to treat the file as binary and suppress
1606	details about matches. @xref{File and Directory Selection}.
1607
1608	Regardless of locale, the 103 characters in the POSIX Portable
1609	Character Set (a subset of ASCII) are always encoded as a single byte,
1610	and the 128 ASCII characters have their usual single-byte encodings on
1611	all but oddball platforms.
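
As a sketch of the binary-file behavior described above, the input
below contains a null character.  With @option{-a} the data is treated
as text, so the line is counted like any other.  The input is made up,
and the example assumes GNU @command{grep}.

```shell
# The input line contains a NUL byte (\000) between 'a' and 'b'.
# With -a (--binary-files=text) the input is processed as text.
count=$(printf 'a\000b\n' | grep -a -c 'a')
echo "$count"
```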
1612
1613	@node Matching Non-ASCII
1614	@section Matching Non-ASCII and Non-printable Characters
1615	@cindex non-ASCII matching
1616	@cindex non-printable matching
1617
1618	In a regular expression, non-ASCII and non-printable characters other
1619	than newline are not special, and represent themselves. For example,
1620	in a locale using UTF-8 the command @samp{grep 'Λ@tie{}ω'} (where the
1621	white space between @samp{Λ} and the @samp{ω} is a tab character)
1622	searches for @samp{Λ} (Unicode character U+039B GREEK CAPITAL LETTER
1623	LAMBDA), followed by a tab (U+0009 TAB), followed by @samp{ω} (U+03C9
1624	GREEK SMALL LETTER OMEGA).
1625
1626	Suppose you want to limit your pattern to only printable characters
1627	(or even only printable ASCII characters) to keep your script readable
1628	or portable, but you also want to match specific non-ASCII or non-null
1629	non-printable characters. If you are using the @option{-P}
1630	(@option{--perl-regexp}) option, PCREs give you several ways to do
1631	this. Otherwise, if you are using Bash, the GNU project's shell, you
1632	can represent these characters via ANSI-C quoting. For example, the
1633	Bash commands @samp{grep $'Λ\tω'} and @samp{grep $'\u039B\t\u03C9'}
1634	both search for the same three-character string @samp{Λ@tie{}ω}
1635	mentioned earlier. However, because Bash translates ANSI-C quoting
1636	before @command{grep} sees the pattern, this technique should not be
1637	used to match printable ASCII characters; for example, @samp{grep
1638	$'\u005E'} is equivalent to @samp{grep '^'} and matches any line, not
1639	just lines containing the character @samp{^} (U+005E CIRCUMFLEX
1640	ACCENT).
1641
1642	Since PCREs and ANSI-C quoting are GNU extensions to POSIX, portable
1643	shell scripts written in ASCII should use other methods to match
1644	specific non-ASCII characters. For example, in a UTF-8 locale the
1645	command @samp{grep "$(printf '\316\233\t\317\211\n')"} is a portable
1646	albeit hard-to-read alternative to Bash's @samp{grep $'Λ\tω'}.
1647	However, none of these techniques will let you put a null character
1648	directly into a command-line pattern; null characters can appear only
1649	in a pattern specified via the @option{-f} (@option{--file}) option.
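
As a rough sketch of the portable technique, the following commands
build the pattern with @command{printf} octal escapes and count the
matching lines in some sample input. The variable names and the
sample text are invented for this illustration, and the sketch assumes
a UTF-8 locale.

@example
# The octal escapes are the UTF-8 encodings of U+039B, U+0009
# and U+03C9, as in the command above.
pattern=$(printf '\316\233\t\317\211')
# Sample input: one line containing the string, one without it.
hits=$(printf 'x \316\233\t\317\211 y\nplain line\n' |
  grep -c -- "$pattern")
echo "$hits"
@end example

@noindent
Command substitution strips trailing newlines, so the pattern needs
no final @samp{\n} here.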
1650
1651	@node Usage
1652	@chapter Usage
1653
1654	@cindex usage, examples
1655	Here is an example command that invokes GNU @command{grep}:
1656
1657	@example
1658	grep -i 'hello.*world' menu.h main.c
1659	@end example
1660
1661	@noindent
1662	This lists all lines in the files @file{menu.h} and @file{main.c} that
1663	contain the string @samp{hello} followed by the string @samp{world};
1664	this is because @samp{.*} matches zero or more characters within a line.
1665	@xref{Regular Expressions}.
1666	The @option{-i} option causes @command{grep}
1667	to ignore case, causing it to match the line @samp{Hello, world!}, which
1668	it would not otherwise match.
1669
1670	Here is a more complex example,
1671	showing the location and contents of any line
1672	containing @samp{f} and ending in @samp{.c},
1673	within all files in the current directory whose names
1674	start with non-@samp{.}, contain @samp{g}, and end in @samp{.h}.
1675	The @option{-n} option outputs line numbers, the @option{--} argument
1676	treats any later arguments as file names not options even if
@code{*g*.h} expands to a file name that starts with @samp{-},
and the empty file @file{/dev/null} causes file names to be output
even if only one file name happens to be of the form @samp{*g*.h}.

@example
grep -n -- 'f.*\.c$' *g*.h /dev/null
1683	@end example
1684
1685	@noindent
1686	Note that the regular expression syntax used in the pattern differs
1687	from the globbing syntax that the shell uses to match file names.
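
The difference can be seen by testing one file name, here the made-up
name @file{log.h}, against a shell pattern and against a roughly
equivalent regular expression. This is only a sketch: a @code{case}
pattern is used so that no files need to exist, and unlike file name
expansion it does not treat a leading @samp{.} specially.

@example
# In a shell pattern, * by itself matches any run of characters.
case 'log.h' in
  *g*.h) glob=yes ;;
  *)     glob=no ;;
esac
# In a regular expression, * repeats the preceding item, so the
# counterpart of the shell pattern is g.*\.h anchored with $.
regex=$(printf 'log.h\n' | grep -c 'g.*\.h$')
echo "$glob $regex"
@end example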
1688
1689	@xref{Invoking}, for more details about
1690	how to invoke @command{grep}.
1691
1692	@cindex using @command{grep}, Q&A
1693	@cindex FAQ about @command{grep} usage
1694	Here are some common questions and answers about @command{grep} usage.
1695
1696	@enumerate
1697
1698	@item
1699	How can I list just the names of matching files?
1700
1701	@example
1702	grep -l 'main' test-*.c
1703	@end example
1704
1705	@noindent
1706	lists names of @samp{test-*.c} files in the current directory whose contents
1707	mention @samp{main}.
1708
1709	@item
1710	How do I search directories recursively?
1711
1712	@example
1713	grep -r 'hello' /home/gigi
1714	@end example
1715
1716	@noindent
1717	searches for @samp{hello} in all files
1718	under the @file{/home/gigi} directory.
1719	For more control over which files are searched,
1720	use @command{find} and @command{grep}.
1721	For example, the following command searches only C files:
1722
1723	@example
1724	find /home/gigi -name '*.c' ! -type d \
1725	-exec grep -H 'hello' '@{@}' +
1726	@end example
1727
1728	This differs from the command:
1729
1730	@example
1731	grep -H 'hello' /home/gigi/*.c
1732	@end example
1733
1734	which merely looks for @samp{hello} in non-hidden C files in
1735	@file{/home/gigi} whose names end in @samp{.c}.
1736	The @command{find} command line above is more similar to the command:
1737
1738	@example
1739	grep -r --include='*.c' 'hello' /home/gigi
1740	@end example
1741
1742	@item
1743	What if a pattern or file has a leading @samp{-}?
1744
1745	@example
1746	grep -- '--cut here--' *
1747	@end example
1748
1749	@noindent
1750	searches for all lines matching @samp{--cut here--}.
1751	Without @option{--},
1752	@command{grep} would attempt to parse @samp{--cut here--} as a list of
1753	options, and there would be similar problems with any file names
1754	beginning with @samp{-}.
1755
1756	Alternatively, you can prevent misinterpretation of leading @samp{-}
1757	by using @option{-e} for patterns and leading @samp{./} for files:
1758
1759	@example
1760	grep -e '--cut here--' ./*
1761	@end example
1762
1763	@item
1764	Suppose I want to search for a whole word, not a part of a word?
1765
1766	@example
1767	grep -w 'hello' test*.log
1768	@end example
1769
1770	@noindent
1771	searches only for instances of @samp{hello} that are entire words;
1772	it does not match @samp{Othello}.
1773	For more control, use @samp{\<} and
1774	@samp{\>} to match the start and end of words.
1775	For example:
1776
1777	@example
1778	grep 'hello\>' test*.log
1779	@end example
1780
1781	@noindent
1782	searches only for words ending in @samp{hello}, so it matches the word
1783	@samp{Othello}.
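
The following sketch contrasts the three forms on made-up input; the
variable names are hypothetical.

@example
words=$(printf 'hello\nOthello\nhellos\n')
# -w: the match must be a whole word.
whole=$(printf '%s\n' "$words" | grep -w 'hello')
# \> anchors only the end of a word, \< only the start.
ending=$(printf '%s\n' "$words" | grep 'hello\>')
starting=$(printf '%s\n' "$words" | grep '\<hello')
printf '%s\n' "$whole" "$ending" "$starting"
@end example

@noindent
Here @code{whole} holds only @samp{hello}, @code{ending} holds
@samp{hello} and @samp{Othello}, and @code{starting} holds
@samp{hello} and @samp{hellos}.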
1784
1785	@item
1786	How do I output context around the matching lines?
1787
1788	@example
1789	grep -C 2 'hello' test*.log
1790	@end example
1791
1792	@noindent
1793	prints two lines of context around each matching line.
1794
1795	@item
1796	How do I force @command{grep} to print the name of the file?
1797
1798	Append @file{/dev/null}:
1799
1800	@example
1801	grep 'eli' /etc/passwd /dev/null
1802	@end example
1803
1804	gets you:
1805
1806	@example
1807	/etc/passwd:eli:x:2098:1000:Eli Smith:/home/eli:/bin/bash
1808	@end example
1809
1810	Alternatively, use @option{-H}, which is a GNU extension:
1811
1812	@example
1813	grep -H 'eli' /etc/passwd
1814	@end example
1815
1816	@item
1817	Why do people use strange regular expressions on @command{ps} output?
1818
1819	@example
ps -ef | grep '[c]ron'
1821	@end example
1822
1823	If the pattern had been written without the square brackets, it would
1824	have matched not only the @command{ps} output line for @command{cron},
1825	but also the @command{ps} output line for @command{grep}.
1826	Note that on some platforms,
1827	@command{ps} limits the output to the width of the screen;
1828	@command{grep} does not have any limit on the length of a line
1829	except the available memory.
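
The effect can be reproduced without running @command{ps}. In this
sketch the two input lines are invented stand-ins for @command{ps}
output: one for the @command{cron} daemon and one for the
@command{grep} process itself.

@example
listing=$(printf 'root 101 cron\nme 202 grep [c]ron\n')
# The bracket expression [c] matches the single letter c, so the
# pattern matches the text "cron" but not the text "[c]ron".
found=$(printf '%s\n' "$listing" | grep '[c]ron')
echo "$found"
@end example

@noindent
Only the first line is selected, because the second line contains
@samp{c]ron} rather than @samp{cron}.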
1830
1831	@item
1832	Why does @command{grep} report ``Binary file matches''?
1833
1834	If @command{grep} listed all matching ``lines'' from a binary file, it
1835	would probably generate output that is not useful, and it might even
1836	muck up your display.
1837	So GNU @command{grep} suppresses output from
1838	files that appear to be binary files.
1839	To force GNU @command{grep}
1840	to output lines even from files that appear to be binary, use the
1841	@option{-a} or @samp{--binary-files=text} option.
1842	To eliminate the
1843	``Binary file matches'' messages, use the @option{-I} or
1844	@samp{--binary-files=without-match} option,
1845	or the @option{-s} or @option{--no-messages} option.
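
A short sketch of two of these options follows. The temporary
directory, the file name and the file contents are invented for the
illustration, and the sketch assumes that @command{mktemp} is
available.

@example
dir=$(mktemp -d)
# A NUL byte makes the sample file look binary to grep.
printf 'abc\000def\n' > "$dir/data.bin"
# -a treats the file as text, so the matching line is counted.
as_text=$(grep -a -c 'abc' "$dir/data.bin")
# -I treats the file as non-matching, so -l lists nothing;
# "|| true" absorbs the resulting exit status of 1.
listed=$(grep -I -l 'abc' "$dir/data.bin" || true)
rm -r "$dir"
echo "text count: $as_text, listed with -I: '$listed'"
@end example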
1846
1847	@item
1848	Why doesn't @samp{grep -lv} print non-matching file names?
1849
1850	@samp{grep -lv} lists the names of all files containing one or more
1851	lines that do not match.
1852	To list the names of all files that contain no
1853	matching lines, use the @option{-L} or @option{--files-without-match}
1854	option.
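
The difference shows up with three small files. The directory and
file names below are made up for this sketch, which assumes that
@command{mktemp} is available.

@example
dir=$(mktemp -d)
printf 'apple\n' > "$dir/all.txt"            # every line matches
printf 'apple\nbanana\n' > "$dir/mixed.txt"  # one line matches
printf 'cherry\n' > "$dir/none.txt"          # no line matches
# -lv: files with at least one line that does not match.
some_other=$(cd "$dir" &&
  grep -lv 'apple' all.txt mixed.txt none.txt)
# -L: files with no matching line at all.
no_match=$(cd "$dir" &&
  grep -L 'apple' all.txt mixed.txt none.txt)
rm -r "$dir"
printf '%s\n' "-lv:" "$some_other" "-L:" "$no_match"
@end example

@noindent
Here @option{-lv} lists @file{mixed.txt} and @file{none.txt}, whereas
@option{-L} lists only @file{none.txt}.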
1855
1856	@item
1857	I can do ``OR'' with @samp{\|}, but what about ``AND''?
1858
1859	@example
grep 'paul' /etc/motd | grep 'franc,ois'
1861	@end example
1862
1863	@noindent
1864	finds all lines that contain both @samp{paul} and @samp{franc,ois}.
1865
1866	@item
1867	Why does the empty pattern match every input line?
1868
1869	The @command{grep} command searches for lines that contain strings
1870	that match a pattern. Every line contains the empty string, so an
1871	empty pattern causes @command{grep} to find a match on each line. It
1872	is not the only such pattern: @samp{^}, @samp{$}, and many
1873	other patterns cause @command{grep} to match every line.
1874
1875	To match empty lines, use the pattern @samp{^$}. To match blank
1876	lines, use the pattern @samp{^[[:blank:]]*$}. To match no lines at
1877	all, use the command @samp{grep -f /dev/null}.
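
These patterns can be compared on a small sample. In the sketch
below the input is invented: a text line, an empty line, a line
holding only a space and a tab, and a final text line.

@example
sample=$(printf 'one\n\n \t\ntwo')
every=$(printf '%s\n' "$sample" | grep -c '')
empty=$(printf '%s\n' "$sample" | grep -c '^$')
blank=$(printf '%s\n' "$sample" | grep -c '^[[:blank:]]*$')
# grep exits with status 1 when no line is selected; "|| true"
# keeps that from stopping a script run with "set -e".
none=$(printf '%s\n' "$sample" | grep -c -f /dev/null || true)
echo "$every $empty $blank $none"
@end example

@noindent
The four counts are 4, 1, 2 and 0: the blank-line pattern also
matches the empty line.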
1878
1879	@item
1880	How can I search in both standard input and in files?
1881
1882	Use the special file name @samp{-}:
1883
1884	@example
cat /etc/passwd | grep 'alain' - /etc/motd
1886	@end example
1887
1888	@item
1889	Why is this back-reference failing?
1890
1891	@example
echo 'ba' | grep -E '(a)\1|b\1'
1893	@end example
1894
This outputs an error message, because the second @samp{\1} is in an
alternative that has no group to refer back to, so it could never
match anything.
1897
1898	@item
1899	How can I match across lines?
1900
1901	Standard grep cannot do this, as it is fundamentally line-based.
1902	Therefore, merely using the @code{[:space:]} character class does not
1903	match newlines in the way you might expect.
1904
1905	With the GNU @command{grep} option @option{-z} (@option{--null-data}), each
1906	input and output ``line'' is null-terminated; @pxref{Other Options}. Thus,
1907	you can match newlines in the input, but typically if there is a match
1908	the entire input is output, so this usage is often combined with
1909	output-suppressing options like @option{-q}, e.g.:
1910
1911	@example
printf 'foo\nbar\n' | grep -z -q 'foo[[:space:]]\+bar'
1913	@end example
1914
1915	If this does not suffice, you can transform the input
1916	before giving it to @command{grep}, or turn to @command{awk},
1917	@command{sed}, @command{perl}, or many other utilities that are
1918	designed to operate across lines.
1919
1920	@item
1921	What do @command{grep}, @command{fgrep}, and @command{egrep} stand for?
1922
1923	The name @command{grep} comes from the way line editing was done on Unix.
1924	For example,
1925	@command{ed} uses the following syntax
1926	to print a list of matching lines on the screen:
1927
1928	@example
1929	global/regular expression/print
1930	g/re/p
1931	@end example
1932
1933	@command{fgrep} stands for Fixed @command{grep};
1934	@command{egrep} stands for Extended @command{grep}.
1935
1936	@end enumerate
1937
1938
1939	@node Performance
1940	@chapter Performance
1941
1942	@cindex performance
1943	Typically @command{grep} is an efficient way to search text. However,
1944	it can be quite slow in some cases, and it can search large files
1945	where even minor performance tweaking can help significantly.
1946	Although the algorithm used by @command{grep} is an implementation
1947	detail that can change from release to release, understanding its
1948	basic strengths and weaknesses can help you improve its performance.
1949
1950	The @command{grep} command operates partly via a set of automata that
1951	are designed for efficiency, and partly via a slower matcher that
1952	takes over when the fast matchers run into unusual features like
1953	back-references. When feasible, the Boyer--Moore fast string
1954	searching algorithm is used to match a single fixed pattern, and the
1955	Aho--Corasick algorithm is used to match multiple fixed patterns.
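
Fixed-string searches are requested with the @option{-F} option, and
several patterns can be given with repeated @option{-e} options. The
sketch below, on invented input, shows such a search; which algorithm
@command{grep} actually picks for it is an implementation detail.

@example
# With -F each pattern is a fixed string, so the dot in a.c
# matches only a literal dot and the line abc is not selected.
fixed=$(printf 'a.c\nabc\nxyz\n' | grep -F -e 'a.c' -e 'xyz')
echo "$fixed"
@end example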
1956
1957	@cindex locales
1958	Generally speaking @command{grep} operates more efficiently in
1959	single-byte locales, since it can avoid the special processing needed
1960	for multi-byte characters. If your patterns will work just as well
1961	that way, setting @env{LC_ALL} to a single-byte locale can help
1962	performance considerably. Setting @samp{LC_ALL='C'} can be
1963	particularly efficient, as @command{grep} is tuned for that locale.
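
For instance, the locale can be set for a single @command{grep}
invocation without affecting the rest of a script. This is a sketch
on invented input; whether it helps in practice depends on the data
and the pattern, so it is worth timing both ways.

@example
# The assignment before the command applies to that command only.
count=$(printf 'alpha\nbeta\nalphabet\n' | LC_ALL=C grep -c 'alpha')
echo "$count"
@end example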
1964
1965	@cindex case insensitive search
1966	Outside the @samp{C} locale, case-insensitive search, and search for
1967	bracket expressions like @samp{[a-z]} and @samp{[[=a=]b]}, can be
1968	surprisingly inefficient due to difficulties in fast portable access to
1969	concepts like multi-character collating elements.
1970
1971	@cindex back-references
1972	A back-reference such as @samp{\1} can hurt performance significantly
1973	in some cases, since back-references cannot in general be implemented
1974	via a finite state automaton, and instead trigger a backtracking
1975	algorithm that can be quite inefficient. For example, although the
pattern @samp{^(.*)\1@{14@}(.*)\2@{13@}$} matches only lines whose
1977	lengths can be written as a sum @math{15x + 14y} for nonnegative
1978	integers @math{x} and @math{y}, the pattern matcher does not perform
1979	linear Diophantine analysis and instead backtracks through all
1980	possible matching strings, using an algorithm that is exponential in
1981	the worst case.
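
The behavior can be tried safely with a scaled-down pattern of the
same shape. In this sketch, which uses invented input and assumes GNU
@command{grep}, the pattern @samp{^(.*)\1\1(.*)\2$} matches only lines
whose lengths can be written as @math{3x + 2y}, that is, every length
except 1.

@example
pattern='^(.*)\1\1(.*)\2$'
# Lengths 5 (3 + 2) and 7 (3 + 4) can be written as 3x + 2y.
long=$(printf 'aaaaa\naaaaaaa\n' | grep -c -E "$pattern")
# Length 1 cannot, so nothing is selected and grep exits with 1.
short=$(printf 'a\n' | grep -c -E "$pattern" || true)
echo "$long $short"
@end example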
1982
1983	@cindex holes in files
1984	On some operating systems that support files with holes---large
1985	regions of zeros that are not physically present on secondary
1986	storage---@command{grep} can skip over the holes efficiently without
1987	needing to read the zeros. This optimization is not available if the
1988	@option{-a} (@option{--binary-files=text}) option is used (@pxref{File and
1989	Directory Selection}), unless the @option{-z} (@option{--null-data})
1990	option is also used (@pxref{Other Options}).
1991
1992	For more about the algorithms used by @command{grep} and about
1993	related string matching algorithms, see:
1994
1995	@frenchspacing on
1996	@itemize @bullet
1997	@item
1998	Aho AV. Algorithms for finding patterns in strings.
1999	In: van Leeuwen J. @emph{Handbook of Theoretical Computer Science}, vol. A.
2000	New York: Elsevier; 1990. p. 255--300.
2001	This surveys classic string matching algorithms, some of which are
2002	used by @command{grep}.
2003
2004	@item
2005	Aho AV, Corasick MJ. Efficient string matching: an aid to bibliographic search.
2006	@emph{CACM}. 1975;18(6):333--40.
2007	@url{https://dx.doi.org/10.1145/360825.360855}.
2008	This introduces the Aho--Corasick algorithm.
2009
2010	@item
2011	Boyer RS, Moore JS. A fast string searching algorithm.
2012	@emph{CACM}. 1977;20(10):762--72.
2013	@url{https://dx.doi.org/10.1145/359842.359859}.
2014	This introduces the Boyer--Moore algorithm.
2015
2016	@item
2017	Faro S, Lecroq T. The exact online string matching problem: a review
2018	of the most recent results.
2019	@emph{ACM Comput Surv}. 2013;45(2):13.
2020	@url{https://dx.doi.org/10.1145/2431211.2431212}.
2021	This surveys string matching algorithms that might help improve the
2022	performance of @command{grep} in the future.
2023	@end itemize
2024	@frenchspacing off
2025
2026	@node Reporting Bugs
2027	@chapter Reporting bugs
2028
2029	@cindex bugs, reporting
2030	Bug reports can be found at the
2031	@url{https://debbugs.gnu.org/cgi/pkgreport.cgi?package=grep,
2032	GNU bug report logs for @command{grep}}.
2033	If you find a bug not listed there, please email it to
2034	@email{bug-grep@@gnu.org} to create a new bug report.
2035
2036	@menu
2037	* Known Bugs::
2038	@end menu
2039
2040	@node Known Bugs
2041	@section Known Bugs
2042	@cindex Bugs, known
2043
2044	Large repetition counts in the @samp{@{n,m@}} construct may cause
2045	@command{grep} to use lots of memory.
2046	In addition, certain other
2047	obscure regular expressions require exponential time and
2048	space, and may cause @command{grep} to run out of memory.
2049
2050	Back-references can greatly slow down matching, as they can generate
2051	exponentially many matching possibilities that can consume both time
2052	and memory to explore. Also, the POSIX specification for
2053	back-references is at times unclear. Furthermore, many regular
2054	expression implementations have back-reference bugs that can cause
2055	programs to return incorrect answers or even crash, and fixing these
2056	bugs has often been low-priority: for example, as of 2021 the
2057	@url{https://sourceware.org/bugzilla/,GNU C library bug database}
2058	contained back-reference bugs
2059	@url{https://sourceware.org/bugzilla/show_bug.cgi?id=52,,52},
2060	@url{https://sourceware.org/bugzilla/show_bug.cgi?id=10844,,10844},
2061	@url{https://sourceware.org/bugzilla/show_bug.cgi?id=11053,,11053},
2062	@url{https://sourceware.org/bugzilla/show_bug.cgi?id=24269,,24269}
2063	and @url{https://sourceware.org/bugzilla/show_bug.cgi?id=25322,,25322},
2064	with little sign of forthcoming fixes. Luckily,
2065	back-references are rarely useful and it should be little trouble to
2066	avoid them in practical applications.
2067
2068
2069	@node Copying
2070	@chapter Copying
2071	@cindex copying
2072
2073	GNU @command{grep} is licensed under the GNU GPL, which makes it @dfn{free
2074	software}.
2075
2076	The ``free'' in ``free software'' refers to liberty, not price. As
2077	some GNU project advocates like to point out, think of ``free speech''
2078	rather than ``free beer''. In short, you have the right (freedom) to
2079	run and change @command{grep} and distribute it to other people, and---if you
2080	want---charge money for doing either. The important restriction is
2081	that you have to grant your recipients the same rights and impose the
2082	same restrictions.
2083
2084	This general method of licensing software is sometimes called
2085	@dfn{open source}. The GNU project prefers the term ``free software''
2086	for reasons outlined at
2087	@url{https://www.gnu.org/philosophy/open-source-misses-the-point.html}.
2088
2089	This manual is free documentation in the same sense. The
2090	documentation license is included below. The license for the program
2091	is available with the source code, or at
2092	@url{https://www.gnu.org/licenses/gpl.html}.
2093
2094	@menu
2095	* GNU Free Documentation License::
2096	@end menu
2097
2098	@node GNU Free Documentation License
2099	@section GNU Free Documentation License
2100
2101	@include fdl.texi
2102
2103
2104	@node Index
2105	@unnumbered Index
2106
2107	@printindex cp
2108
2109	@bye
