6 ##############################################################################
10 ## Jeffrey Friedl (jfriedl@omron.co.jp), Dec 1994.
11 ## Copyright 19.... ah hell, just take it.
12 ## Ported to Win95 by Dan Schmidt (dfan@alum.mit.edu)
15 ## A combo of find and grep -- more or less do a 'grep' on a whole
16 ## directory tree. Fast, with lots of options. Much more powerful than
17 ## the simple "find ... | xargs grep ....". Has a full man page.
18 ## Powerfully customizable. Emacs interface included.
20 ## This file is big, but mostly comments and man page.
22 ## See man page for usage info.
23 ## Return value: 2=error, 1=nothing found, 0=something found.
26 $version = "960908.11";
28 ## Added -depth=0, 'cause I needed it.
31 ## Added -F/-R, and made it a bit more smart about $ and @ in arguments.
34 ## Sigh -- fixed a perl4 backward-compatibility problem.
38 ## Added the "filter" directive to the startup file so I could have
39 ## search automatically look inside compressed files.
41 ## Implementation: slightly changed semantics with how EVALs are
42 ## removed from the inlined program.
44 ## From Lionel Cons a few bug fixes and cool emacs interface:
46 ## ``Also, I integrated search in Emacs so that I can use "M-x search"
47 ## and then "C-x `" to browse through the occurrences (like grep).
48 ## Here is the code that I use:
50 ## (defun search (dir what) "Run search with all grep goodies."
51 ## (interactive "DSearch under: \nsSearch for: ")
52 ## (setq default-directory
53 ## (if (string-match "/$" dir) dir (concat dir "/")))
54 ## (compile-internal (concat "search -n " what)
55 ## "No more search hits" "grep" nil grep-regexp-alist))
57 ## Add color stuff (-bold, -red, etc...)
58 ## Fix up nroff stuff to work with groff
60 ## Changed all 'sysread' to 'read' because Linux perl's don't seem
65 ## Added -nice (due to Lionel Cons <Lionel.Cons@cern.ch>)
66 ## Removed any leading "./" from name.
67 ## Added default flags for ~/.search, including TTY, -nice, -list, etc.
68 ## Program name now has path removed when printed in diagnostics.
69 ## Added simple tilde-expansion to -dir arg.
70 ## Added -dskip, etc. Fixed -iregex bug.
71 ## Changed -dir to be additive, adding -ddir.
72 ## Now screen out devices, pipes, and sockets.
73 ## More tidying and lots of expanding of the man page
80 $rc_file = join('/', $ENV{'HOME'}, ".search");
84 ## Make sure we've got a regex.
85 ## Don't need one if -find or -showrc was specified.
86 $!=2, die "expecting regex arguments.\n"
87 if $FIND_ONLY == 0 && $showrc == 0 && @ARGV == 0;
89 &prepare_to_search($rc_file);
91 &import_program if !defined &dodir; ## BIG key to speed.
93 ## do search while there are directories to be done.
94 &dodir(shift(@todo)) while @todo;
96 &clear_message if $VERBOSE && $STDERR_IS_TTY;
98 ###############################################################################
102 ## initialize variables that might be reset by command-line args
103 $DOREP=0; ## set true by -dorep (redo multi-hardlink files)
104 $DO_SORT=0; ## set by -sort (sort files in a dir before checking)
105 $FIND_ONLY=0; ## set by -find (don't search files)
106 $LIST_ONLY=0; ## set true by -l (list filenames only)
107 $NEWER=0; ## set by -newer, "-mtime -###"
108 $NICE=0; ## set by -nice (print human-readable output)
109 $NOLINKS=0; ## set true by -nolinks (don't follow symlinks)
110 $OLDER=0; ## set by -older, "-mtime ###"
111 $PREPEND_FILENAME=1; ## set false by -h (don't prefix lines with filename)
112 $REPORT_LINENUM=0; ## set true by -n (show line numbers)
113 $VERBOSE=0; ## set to a value by -v, -vv, etc. (verbose messages)
114 $WHY=0; ## set true by -why, -vvv+ (report why skipped)
115 $XDEV=0; ## set true by -xdev (stay on one filesystem)
116 $all=0; ## set true by -all (don't skip many kinds of files)
117 $iflag = ''; ## set to 'i' by -i (ignore case);
118 $norc=0; ## set by -norc (don't load rc file)
119 $showrc=0; ## set by -showrc (show what happens with rc file)
120 $underlineOK=0; ## set true by -u (watch for underline stuff)
121 $words=0; ## set true by -w (match whole-words only)
122 $MARK=''; ## set by -bold (-red, etc.). Does ANSI markup.
123 $literal=0; ## set by -F -- "regex" args taken as literal strings
124 $retval=1; ## will set to 0 if we find anything.
125 $DESCEND_SUBDIRECTORIES=1;## set to false by -depth=0
127 $WINDOWS=0; ## under windows?
128 $USE_INODES=1; ## are inodes valid?
130 ## various elements of stat() that we might access
135 $VV_PRINT_COUNT = 50; ## with -vv, print every VV_PRINT_COUNT files, or...
136 $VV_SIZE = 1024*1024; ## ...every VV_SIZE bytes searched
137 $vv_print = $vv_size = 0; ## running totals.
139 ## set default options, in case the rc file wants them
140 $opt{'TTY'}= 1 if -t STDOUT;
142 ## want to know this for debugging message stuff
143 $STDERR_IS_TTY = -t STDERR ? 1 : 0;
144 $STDERR_SCREWS_STDOUT = ($STDERR_IS_TTY && -t STDOUT) ? 1 : 0;
146 $0 =~ s,.*/,,; ## clean up $0 for any diagnostics we'll be printing.
154 while (@ARGV && $ARGV[0] =~ m/^-/)
158 if ($arg eq '-version' || ($VERBOSE && $arg eq '-help')) {
159 print qq/Jeffrey\'s file search, version "$version".\n/;
160 exit(0) unless $arg eq '-help';
162 if ($arg eq '-help') {
163 print <<INLINE_LITERAL_TEXT;
164 usage: $0 [options] [-e] [PerlRegex ....]
165 OPTIONS TELLING *WHERE* TO SEARCH:
166 -dir DIR start search at the named directory (default is current dir).
167 -xdev stay on starting file system.
168 -sort sort the files in each directory before processing.
169 -nolinks don\'t follow symbolic links.
170 -depth=0 don\'t descend into subdirectories
171 OPTIONS TELLING WHICH FILES TO EVEN CONSIDER:
172 -mtime # consider files modified > # days ago (-# for < # days old)
173 -newer FILE consider files modified more recently than FILE (also -older)
174 -name GLOB consider files whose name matches pattern (also -regex).
175 -skip GLOB opposite of -name: identifies files to not consider.
176 -path GLOB like -name, but for files whose whole path is described.
177 -dpath/-dregex/-dskip versions for selecting or pruning directories.
178 -all don\'t skip any files marked to be skipped by the startup file.
179 -x<SPECIAL> (see manual, and/or try -showrc).
180 -why report why a file isn\'t checked (also implied by -vvvv).
181 OPTIONS TELLING WHAT TO DO WITH FILES THAT WILL BE CONSIDERED:
182 -f | -find just list files (PerlRegex ignored). Default is to grep them.
183 -ff | -ffind Does a faster -find (implies -find -all -dorep)
184 OPTIONS CONTROLLING HOW THE SEARCH IS DONE (AND WHAT IS PRINTED):
185 -F | -lit "regex" args taken as literal strings (like fgrep)
186 -R | -regex undoes -F -- regex args really are perl regexes
187 -l | -list only list files with matches, not the lines themselves.
188 -nice | -nnice print more "human readable" output.
189 -bold | -red mark found items (various colors supported)
190 -n prefix each output line with its line number in the file.
191 -h don\'t prefix output lines with file name.
192 -u also look "inside" manpage-style underlined text
193 -i do case-insensitive searching.
194 -w match words only (as defined by perl\'s \\b).
196 -v, -vv, -vvv various levels of message verbosity.
197 -e end of options (in case a regex looks like an option).
198 -showrc show what the rc file sets, then exit.
199 -norc don\'t load the rc file.
200 -dorep check files with multiple hard links multiple times.
201 -win necessary if running under Windows 95.
203 print "Use -v -help for more verbose help.\n" unless $VERBOSE;
204 print "This script file is also a man page.\n" unless $stripped;
205 print <<INLINE_LITERAL_TEXT if $VERBOSE;
207 If -f (or -find) given, PerlRegex is optional and ignored.
208 Otherwise, will search for files with lines matching any of the given regexes.
210 Combining things like -name and -mtime implies boolean AND.
211 However, duplicating things (such as -name '*.c' -name '*.txt') implies OR.
213 -mtime may be given floating point (i.e. 1.5 is a day and a half).
214 -iskip/-idskip/-ipath/... etc are case-insensitive versions.
216 If any letter in -newer/-older is upper case, "or equal" is
217 inserted into the test.
219 You can always find the latest version on the World Wide Web in
220 http://www.wg.omron.co.jp/~jfriedl/perl/
224 $DOREP=1, next if $arg eq '-dorep'; ## do repeats
225 $DO_SORT=1, next if $arg eq '-sort'; ## sort files
226 $NOLINKS=1, next if $arg eq '-nolinks'; ## no sym. links
227 $PREPEND_FILENAME=0, next if $arg eq '-h'; ## no filename prefix
228 $REPORT_LINENUM=1, next if $arg eq '-n'; ## show line numbers
229 $WHY=1, next if $arg eq '-why'; ## tell why skipped
230 $XDEV=1, next if $arg eq '-xdev'; ## don't leave F.S.
231 $all=1,$opt{'-all'}=1,next if $arg eq '-all'; ## don't skip *.Z, etc
232 $iflag='i', next if $arg eq '-i'; ## ignore case
233 $iflag='', next if $arg eq '-noi'; ## don't ignore case
234 $norc=1, next if $arg eq '-norc'; ## don't load rc file
235 $showrc=1, next if $arg eq '-showrc'; ## show rc file
236 $underlineOK=1, next if $arg eq '-u'; ## look through underln.
237 $words=1, next if $arg eq '-w'; ## match "words" only
238 $literal=1, next if $arg eq '-F'; ## args are literal
239 $literal=1, next if $arg eq '-lit'; ## args are literal
240 $literal=0, next if $arg eq '-R'; ## args are regexes
241 $literal=0, next if $arg eq '-regex'; ## args are regexes
242 &strip if $arg eq '-strip'; ## dump this program
243 last if $arg eq '-e';
245 $mark = '\e[7m', next if $arg eq '-bold'; ## mark found items (reverse video)]
246 $mark = '\e[30m', next if $arg eq '-black'; ## mark found items in black]
247 $mark = '\e[31m', next if $arg eq '-red'; ## mark found items in red]
248 $mark = '\e[32m', next if $arg eq '-green'; ## mark found items in green]
249 $mark = '\e[33m', next if $arg eq '-yellow'; ## mark found items in yellow]
250 $mark = '\e[34m', next if $arg eq '-blue'; ## mark found items in blue]
251 $mark = '\e[36m', next if $arg eq '-cyan'; ## mark found items in cyan]
252 $mark = '\e[37m', next if $arg eq '-white'; ## mark found items in white]
254 $FIND_ONLY=1, next if $arg =~/^-f(ind)?$/;## do "find" only
256 $FIND_ONLY=1, $DOREP=1, $all=1,
257 next if $arg =~/^-ff(ind)?$/;## fast -find
258 $LIST_ONLY=1,$opt{'-list'}=1,
259 next if $arg =~/^-l(ist)?$/;## only list files
261 $WINDOWS=1, $USE_INODES=0, $DOREP=1,
262 next if $arg eq '-win'; ## running under Windows
265 if ($arg =~ m/^-depth=(\d+)$/) {
266 die qq/$0: only -depth of 0 currently supported\n/ if $1 != 0;
267 $DESCEND_SUBDIRECTORIES=0;
271 if ($arg =~ m/^-(v+)$/) { ## verbosity
272 $VERBOSE =length($1);
273 foreach $len (1..$VERBOSE) { $opt{'-'.('v' x $len)}=1 }
276 if ($arg =~ m/^-(n+)ice$/) { ## "nice" output
278 foreach $len (1..$NICE) { $opt{'-'.('n' x $len).'ice'}=1 }
282 if ($arg =~ m/^-(i?)(d?)skip$/) {
283 local($i) = $1 eq 'i';
284 local($d) = $2 eq 'd';
285 $! = 2, die qq/$0: expecting glob arg to -$arg\n/ unless @ARGV;
286 foreach (split(/\s+/, shift @ARGV)) {
299 if ($arg =~ m/^-(i?)(d?)(regex|path|name)$/) {
300 local($i) = $1 eq 'i';
301 $! = 2, die qq/$0: expecting arg to -$arg\n/ unless @ARGV;
302 foreach (split(/\s+/, shift @ARGV)) {
303 $iname{join(',', $arg, $_)}=1 if $i;
304 $name{join(',', $arg, $_)}=1;
309 if ($arg =~ m/^-d?dir$/) {
311 $! = 2, die qq/$0: expecting filename arg to -$arg\n/ unless @ARGV;
312 $start = shift(@ARGV);
313 $start =~ s#^~(/+|$)#$ENV{'HOME'}$1# if defined $ENV{'HOME'};
314 $! = 2, die qq/$0: can\'t find ${arg}\'s "$start"\n/ unless -e $start;
315 $! = 2, die qq/$0: ${arg}\'s "$start" not a directory.\n/ unless -d _;
316 undef(@todo), $opt{'-ddir'}=1 if $arg eq '-ddir';
321 if ($arg =~ m/^-(new|old)er$/i) {
322 $! = 2, die "$0: expecting filename arg to -$arg\n" unless @ARGV;
323 local($file, $time) = shift(@ARGV);
324 $! = 2, die qq/$0: can\'t stat -${arg}\'s "$file"./
325 unless $time = (stat($file))[$STAT_MTIME];
326 local($upper) = $arg =~ tr/A-Z//;
327 if ($arg =~ m/new/i) {
328 $time++ unless $upper;
329 $NEWER = $time if $NEWER < $time;
331 $time-- unless $upper;
332 $OLDER = $time if $OLDER == 0 || $OLDER > $time;
337 if ($arg =~ m/-mtime/) {
338 $! = 2, die "$0: expecting numerical arg to -$arg\n" unless @ARGV;
339 local($days) = shift(@ARGV);
340 $! = 2, die qq/$0: inappropriate arg ($days) to $arg\n/ if $days==0;
343 local($time) = $^T + $days;
344 $NEWER = $time if $NEWER < $time;
346 local($time) = $^T - $days;
347 $OLDER = $time if $OLDER == 0 || $OLDER > $time;
352 ## special user options
353 if ($arg =~ m/^-x(.+)/) {
354 foreach (split(/[\s,]+/, $1)) { $user_opt{$_} = $opt{$_}= 1; }
358 $! = 2, die "$0: unknown arg [$arg]\n";
360 $DOMARK = defined($mark) ? 1 : 0;
364 ## Given a filename glob, return a regex.
365 ## If the glob has no globbing chars (no * ? or [..]), then
366 ## prepend an effective '*' to it.
371 local(@parts) = $glob =~ m/\\.|[*?]|\[]?[^]]*]|[^[\\*?]+/g;
374 if ($_ eq '*' || $_ eq '?') {
376 $trueglob=1; ## * and ? are a real glob
377 } elsif (substr($_, 0, 1) eq '[') {
378 $trueglob=1; ## [..] is a real glob
380 s/^\\//; ## remove any leading backslash;
381 s/(\W)/\\$1/g; ## now quote anything dangerous;
384 unshift(@parts, '.*') unless $trueglob;
385 join('', '^', @parts, q/$/);
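## Illustration only (not part of the original program): a minimal,
## self-contained sketch of the glob-to-regex conversion described above.
## The name example_glob_to_regex is hypothetical, and the handling of
## '*' and '?' (mapped to '.*' and '.') is an assumption, since that step
## is not shown in this listing. Nothing in the program calls this sub.
sub example_glob_to_regex
{
    local($glob) = @_;
    local($trueglob) = 0;
    ## split into: backslash-escaped chars, '*' or '?', [..] classes, plain text.
    local(@parts) = $glob =~ m/\\.|[*?]|\[]?[^]]*]|[^[\\*?]+/g;
    foreach (@parts) {
        if ($_ eq '*') {
            $_ = '.*';       ## assumed mapping: any run of characters
            $trueglob = 1;
        } elsif ($_ eq '?') {
            $_ = '.';        ## assumed mapping: any single character
            $trueglob = 1;
        } elsif (substr($_, 0, 1) eq '[') {
            $trueglob = 1;   ## a [..] class passes through unchanged
        } else {
            s/^\\//;         ## remove any leading backslash
            s/(\W)/\\$1/g;   ## quote regex metacharacters
        }
    }
    unshift(@parts, '.*') unless $trueglob;  ## plain text acts like '*text'
    join('', '^', @parts, q/$/);
}
## Under those assumptions, '.c' and '*.c' both give '^.*\.c$'.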
388 sub prepare_to_search
390 local($rc_file) = @_;
392 $HEADER_BYTES=0; ## Might be set nonzero in &read_rc;
393 $last_message_length = 0; ## For &message and &clear_message.
395 &read_rc($rc_file, $showrc) unless $norc;
398 $NEXT_DIR_ENTRY = $DO_SORT ? 'shift @files' : 'readdir(DIR)';
399 $WHY = 1 if $VERBOSE > 3; ## Arg -vvvv or above implies -why.
400 @todo = ('.') if @todo == 0; ## Where we'll start looking
402 ## see if any user options were specified that weren't accounted for
403 foreach $opt (keys %user_opt) {
404 next if defined $seen_opt{$opt};
405 warn "warning: -x$opt never considered.\n";
408 die "$0: multiple time constraints exclude all possible files.\n"
409 if ($NEWER && $OLDER) && ($NEWER > $OLDER);
412 ## Process any -skip/-iskip args that had been given
415 foreach $glob (keys %skip) {
416 $i = defined($iskip{$glob}) ? 'i': '';
417 push(@skip_test, '$name =~ m/'. &glob_to_regex($glob). "/$i");
420 $SKIP_TEST = join('||',@skip_test);
423 $DO_SKIP_TEST = $SKIP_TEST = 0;
427 ## Process any -dskip/-idskip args that had been given
430 foreach $glob (keys %dskip) {
431 $i = defined($idskip{$glob}) ? 'i': '';
432 push(@dskip_test, '$name =~ m/'. &glob_to_regex($glob). "/$i");
435 $DSKIP_TEST = join('||',@dskip_test);
438 $DO_DSKIP_TEST = $DSKIP_TEST = 0;
443 ## Process any -name, -path, -regex, etc. args that had been given.
447 foreach $key (keys %name) {
448 local($type, $pat) = split(/,/, $key, 2);
449 local($i) = defined($iname{$key}) ? 'i' : '';
450 if ($type =~ /regex/) {
452 $test = "\$name =~ m!^$pat\$!$i";
454 local($var) = $type eq 'name' ? '$name' : '$file';
455 $test = "$var =~ m/". &glob_to_regex($pat). "/$i";
457 if ($type =~ m/^-i?d/) {
458 push(@dname_test, $test);
460 push(@name_test, $test);
464 $GLOB_TESTS = join('||', @name_test);
468 $GLOB_TESTS = $DO_GLOB_TESTS = 0;
471 $DGLOB_TESTS = join('||', @dname_test);
474 $DGLOB_TESTS = $DO_DGLOB_TESTS = 0;
478 ## Process any 'magic' things from the startup file.
480 if (@magic_tests && $HEADER_BYTES) {
481 ## the $magic' one is for when &dodir is not inlined
482 $tests = join('||',@magic_tests);
483 $MAGIC_TESTS = "{ package magic; \$val = ($tests) }";
491 ## Prepare regular expressions.
495 local(@mark_commands);
499 ## need to have $* set, but perl5 just won't shut up about it.
508 ## Until I figure out a better way to deal with it,
509 ## We have to worry about a regex like [^xyz] when doing $LIST_ONLY.
510 ## Such a regex *will* match \n, and if I'm pulling in multiple
511 ## lines, it can allow lines to match that would otherwise not match.
513 ## Therefore, if there is a '[^' in a regex, we can NOT take a chance
514 ## and use the fast listonly.
516 $CAN_USE_FAST_LISTONLY = $LIST_ONLY;
518 local(@extra, $orig);
519 local($underline_glue) = ($] >= 5) ? '(?:_\cH)?' : '(_\cH)?';
521 $regex = shift(@ARGV);
524 $orig = $regex if $literal;
525 $regex =~ s/(\W)/\\$1/g; ## quote everything
527 ## try to be smart about a $ in the regex. If it looks
528 ## like an end-of-line metacharacter, we'll leave it.
529 ## Otherwise, escape it so that it has no variable
531 $regex =~ s/(.?)\$([^)|])/
532 $1 eq '\\' ? $& : "$1\\\$$2"/eg; ## quote some $
534 $regex =~ s/\@/\\@/g; ## quote @
538 local($tmp) = $regex;
540 $tmp = join($tmp, '\b(', ')\b') if $words;
541 push(@mark_commands, "s/($tmp)/$mark\$1\e[m/g$iflag");
545 ## If watching for underlined things too, add another regex.
548 if ($regex =~ m/[?*+{}()\\.|^\$[]/) {
549 warn "$0: warning, can't underline-safe ``$regex''.\n";
551 $regex = join($underline_glue, split(//, $regex));
555 ## If nothing special in the regex, just use index...
556 ## is quite a bit faster.
557 if (($iflag eq '') && ($words == 0) &&
558 ($literal || $regex !~ m/[?*+{}()\\.|^\$[]/))
560 $regex = $orig if $literal;
561 push(@regex_tests, "(index(\$_, q\001$regex\001)>=0)");
564 #$regex =~ s!([\$\@\/]\w)!\\$1!g;
566 if ($regex =~ m/\|/) {
567 ## could be dangerous -- see if we can wrap in parens.
568 if ($regex =~ m/\\\d/) {
569 warn "warning: -w and a | in a regex is dangerous.\n"
571 $regex = join($regex, '(', ')');
574 $regex = join($regex, '\b', '\b');
576 $CAN_USE_FAST_LISTONLY = 0 if index($regex, "[^") >= 0;
577 push(@regex_tests, "m'$regex'$iflag$mflag");
580 ## If we're done, but still have @extra to do, get set for that.
581 if (@ARGV == 0 && @extra) {
582 @ARGV = @extra; ## now deal with the extra stuff.
583 $underlineOK = 0; ## but no more of this.
584 undef @extra; ## or this.
588 $REGEX_TEST = join('||', @regex_tests);
589 ## print STDERR $REGEX_TEST, "\n"; exit;
591 ## must be doing -find -- just give something syntactically correct.
596 $MARK = join(';', @mark_commands);
601 ## Make sure we can read the first item(s).
603 foreach $start (@todo) {
604 $! = 2, die qq/$0: can\'t stat "$start"\n/
605 unless ($dev,$inode) = (stat($start))[$STAT_DEV,$STAT_INODE];
607 if (defined $dir_done{"$dev,$inode"}) {
608 ## ignore the repeat.
609 warn(qq/ignoring "$start" (same as "$dir_done{"$dev,$inode"}").\n/)
614 ## if -xdev was given, remember the device.
615 $xdev{$dev} = 1 if $XDEV;
617 ## Note that we won't want to do it again
618 $dir_done{"$dev,$inode"} = $start;
624 ## See the comment above the __END__ above the 'sub dodir' below.
629 print STDERR "$0: internal error (@_)\n";
633 ## Read from data, up to next __END__. This will be &dodir.
634 local($/) = "\n__END__";
638 ## Inline uppercase $-variables by their current values, removing
639 ## any preceding eval if applicable
642 $prog =~ s/(?:\beval\s*)?\$([A-Z][A-Z0-9_]{2,}\b)/
643 &bad($1) if !defined ${$main::{$1}}; ${$main::{$1}};/eg;
646 $prog =~ s/(\beval\s*)?\$([A-Z][A-Z0-9_]{2,}\b)/local(*VAR) = $_main{$2};
647 &bad($2) if !defined $VAR; $VAR;/eg;
650 eval $prog; ## now do it. This will define &dodir;
651 $!=2, die "$0 internal error: $@\n" if $@;
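## Illustration only (not part of the original program): a minimal perl5
## sketch of the inlining step above. The name example_inline_vars is
## hypothetical. It replaces each all-uppercase $NAME (three or more
## characters) in the text with that variable's current value from package
## main, dropping any 'eval' directly in front of it. Unlike the real code,
## it does not complain about undefined variables, and it assumes that
## 'strict refs' is not in effect (this program never enables it).
sub example_inline_vars
{
    local($prog) = @_;
    ## look each variable up by name in package main (symbolic reference).
    $prog =~ s/(?:\beval\s*)?\$([A-Z][A-Z0-9_]{2,})\b/${"main::$1"}/eg;
    $prog;
}
## With $VERBOSE set to 0, 'if ($VERBOSE) { ... }' becomes 'if (0) { ... }',
## which the perl compiler can then discard as dead code.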
654 ###########################################################################
657 ## Read the .search file:
658 ## Blank lines and lines that are only #-comments ignored.
659 ## Newlines may be escaped to create long lines
660 ## Other lines are directives.
662 ## A directive may begin with an optional tag in the form <...>
663 ## Things inside the <...> are evaluated as with:
664 ## <(this || that) && must>
666 ## -xmust -xthis or -xmust -xthat
667 ## were specified on the command line (order doesn't matter, though)
668 ## A directive is not done if there is a tag and it's false.
669 ## Any characters but whitespace and &|()>,! may appear after an -x
670 ## (although "-xdev" is special). -xmust,this is the same as -xmust -xthis.
671 ## Something like -x~ would make <~> true, and <!~> false.
673 ## Directives are in the form:
674 ## filter: EXPR : "command"
676 ## magic : NUMBYTES : EXPR
679 ## The STRING is parsed like a Bourne shell command line, and the
680 ## options are used as if given on the command line.
681 ## No comments are allowed on 'option' lines.
683 ## # skip objects and libraries
684 ## option: -skip '.o .a'
685 ## # skip emacs *~ and *# files, unless -x~ given:
686 ## <!~> option: -skip '~ #'
689 ## EXPR can be pretty much any perl (comments allowed!).
690 ## If it evaluates to true for any particular file, it is skipped.
691 ## The only info you'll have about a file is the variable $H, which
692 ## will have at least the first NUMBYTES of the file (less if the file
693 ## is shorter than that, of course, and maybe more). You'll also have
694 ## any variables you set in previous 'magic' lines.
696 ## magic: 6 : ($x6 = substr($H, 0, 6)) eq 'GIF87a'
697 ## magic: 6 : $x6 eq 'GIF89a'
699 ## magic: 6 : (($x6 = substr($H, 0, 6)) eq 'GIF87a' ## old gif \
700 ## || $x6 eq 'GIF89a' ## new gif
701 ## (the above two sets are the same)
702 ## ## Check the first 32 bytes for "binarish" looking bytes.
703 ## ## Don't blindly dump on any high-bit set, as non-ASCII text
704 ## ## often has them set. \x80 and \xff seem to be special, though.
705 ## ## Require two in a row to not get things like perl's $^T.
706 ## ## This is known to get *.Z, *.gz, pkzip, *.elc and about any
707 ## ## executable you'll find.
708 ## magic: 32 : $H =~ m/[\x00-\x06\x10-\x1a\x1c-\x1f\x80\xff]{2}/
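## Illustration only (not part of the original program): a minimal sketch
## of how a <...> tag such as <(this || that) && must> can be evaluated.
## The name example_tag_is_true and its calling convention (the tag text
## followed by the names given via -x) are hypothetical.
sub example_tag_is_true
{
    local($tag, @given) = @_;
    local(%on);
    foreach (@given) { $on{$_} = 1; }
    ## turn each option name into a test of %on, then let perl evaluate it.
    $tag =~ s/([^\s&|(!)]+)/defined(\$on{q>$1>})/g;
    eval($tag) ? 1 : 0;
}
## &example_tag_is_true('(this || that) && must', 'must', 'that') is true;
## &example_tag_is_true('!~', '~') is false.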
712 local($file, $show) = @_;
713 local($line_num, $ln, $tag) = 0;
714 local($use_default, @default) = 0;
716 { package magic; $^W = 0; } ## turn off warnings for when we run EXPR's
718 unless (open(RC, "$file")) {
720 $file = "<internal default startup file>";
721 ## no RC file -- use this default.
722 @default = split(/\n/,<<'--------INLINE_LITERAL_TEXT');
723 magic: 32 : $H =~ m/[\x00-\x06\x10-\x1a\x1c-\x1f\x80\xff]{2}/
724 filter: $N =~ m/\.(gz|Z)$/ : "zcat %"
725 option: -skip '.a .COM .elc .EXE .o .pbm .xbm .dvi'
726 option: -iskip '.tarz .zip .lzh .jpg .jpeg .gif .uu'
727 <!~> option: -skip '~ #'
728 --------INLINE_LITERAL_TEXT
732 ## Make an eval error pretty.
734 sub clean_eval_error {
736 s/ in file \(eval\) at line \d+,//g; ## perl4-style error
737 s/ at \(eval \d+\) line \d+,//g; ## perl5-style error
738 s/\n[\x00-\xff]*//; ## remove all but first line
742 print "reading RC file: $file\n" if $show;
744 while ($_ = ($use_default ? shift(@default) : <RC>)) {
745 $ln = ++$line_num; ## note starting line num.
746 $_ .= <RC>, $line_num++ while s/\\\n?$/\n/; ## allow continuations
747 next if /^\s*(#.*)?$/; ## skip blank or comment-only lines.
750 ## look for an initial <...> tag.
751 if (s/^\s*<([^>]*)>//) {
752 ## This simple s// will make the tag ready to eval.
753 ($tag = $msg = $1) =~
755 $seen_opt{$1}=1; ## note seen option
756 "defined(\$opt{q>$1>})" ## (q>> is safe quoting here)
759 ## see if the tag is true or not, abort this line if not.
760 $dothis = (eval $tag);
761 $!=2, die "$file $ln <$msg>: $_".&clean_eval_error($@) if $@;
764 $msg =~ s/([^\s&|(!)]+)/-x$1/g;
765 $msg =~ s/\s*!\s*/ no /g;
766 $msg =~ s/\s*&&\s*/ and /g;
767 $msg =~ s/\s*\|\|\s*/ or /g;
768 $msg =~ s/^\s+//; $msg =~ s/\s+$//;
769 $do = $dothis ? "(doing because $msg)" :
776 if (m/^\s*filter\s*:(.*):\s*"(.*)\s*"\s*$/) {
777 local($expr, $cmd) = ($1, $2);
778 eval "local(\$^W) = 0; $expr; 1";
779 die "$file $ln: ".&clean_eval_error($@) if $@;
780 $filter_cmd{$expr} = $cmd;
784 if (m/^\s*option\s*:\s*(.*)/) {
785 next if $all && !$show; ## -all turns off these checks;
789 print " $do option: $_\n" if $show;
790 local($0) = "$0 ($file)"; ## for any error message.
794 ## Parse $_ as a Bourne shell line -- fill @ARGV
798 push(@ARGV, $this) if defined $this;
802 $this = '' if !defined $this;
803 $this .= $1 while s/^\'([^\']*)\'// ||
805 s/^([^\'\"\s\\]+)//||
807 die "$file $ln: error parsing $orig at $_\n" if m/^\S/;
809 push(@ARGV, $this) if defined $this;
811 die qq/$file $ln: unused arg "@ARGV".\n/ if @ARGV;
815 if (m/^\s*magic\s*:\s*(\d+)\s*:\s*(.*)/) {
816 next if $all && !$show; ## -all turns off these checks;
817 local($bytes, $check) = ($1, $2);
820 $check =~ s/\n?$/\n/;
821 print " $do contents: $check";
823 ## Check to make sure the thing at least compiles.
824 eval "package magic; (\$H = '1'x \$main'bytes) && (\n$check\n)\n";
825 $! = 2, die "$file $ln: ".&clean_eval_error($@) if $@;
827 $HEADER_BYTES = $bytes if $bytes > $HEADER_BYTES;
828 push(@magic_tests, "(\n$check\n)");
831 $! = 2, die "$file $ln: unknown command\n";
838 if (!$STDERR_IS_TTY) {
839 print STDERR $_[0], "\n";
842 $thislength = length($text);
843 if ($thislength >= $last_message_length) {
844 print STDERR $text, "\r";
846 print STDERR $text, ' 'x ($last_message_length-$thislength),"\r";
848 $last_message_length = $thislength;
854 print STDERR ' ' x $last_message_length, "\r" if $last_message_length;
855 $vv_print = $vv_size = $last_message_length = 0;
859 ## Output a copy of this program with comments, extra whitespace, and
860 ## the trailing man page removed. On an ultra slow machine, such a copy
861 ## might load faster (but I can't tell any difference on my machine).
864 seek(DATA, 0, 0) || die "$0: can't reset internal pointer.\n";
866 print, next if /INLINE_LITERAL_TEXT/.../INLINE_LITERAL_TEXT/;
867 ## must mention INLINE_LITERAL_TEXT on this line!
868 s/\#\#.*|^\s+|\s+$//; ## remove cruft
870 next if ($_ eq '') || ($_ eq "'di'") || ($_ eq "'ig00'");
871 s/\$stripped=0;/\$stripped=1;/;
872 s/\s\s+/ /; ## squish multiple whitespaces down to one.
879 ## Just to shut up -w. Never executed.
883 1 || &dummy || &dir_done || &bad || &message || $NEXT_DIR_ENTRY ||
884 $VV_SIZE || $VV_PRINT_COUNT || $STDERR_SCREWS_STDOUT || @files ||
885 @files || $magic'H || $magic'H || $magic'val || $magic'val ||
886 $filter_cmd{1} || $xdev{''} || $MARK;
891 ## If the following __END__ is in place, what follows will be
892 ## inlined when the program first starts up. Any $ variable name
893 ## all in upper case, specifically, any string matching
894 ## \$([A-Z][A-Z0-9_]{2,}\b)
895 ## will have the true value for that variable inlined. Also, any 'eval'
896 ## immediately preceding one of the inlined variables is removed.
899 ## The idea is that when the whole thing is then eval'ed to define &dodir,
900 ## the perl optimizer will make all the decisions that are based upon
901 ## command-line options (such as $VERBOSE), since they'll be inlined as
904 ## Also, and here's the big win, the tests for matching the regex, and a
905 ## few others, are all inlined. Should be blinding speed here.
907 ## See the read from <DATA> above for where all this takes place.
908 ## But all-in-all, you *want* the __END__ here. Comment it out only for
915 ## Given a directory, check all "appropriate" files in it.
916 ## Shove any subdirectories into the global @todo, so they'll be done
919 ## Be careful about adding any upper-case variables, as they are subject
920 ## to being inlined. See comments above the __END__ above.
925 $dir =~ s,/+$,,; ## remove any trailing slash.
926 unless (opendir(DIR, "$dir/.")) {
927 &clear_message if $VERBOSE && $STDERR_SCREWS_STDOUT;
928 warn qq($0: can\'t opendir "$dir/".\n);
934 $vv_print = $vv_size = 0;
937 @files = sort readdir(DIR) if $DO_SORT;
939 while (defined($name = eval $NEXT_DIR_ENTRY))
941 next if $name eq '.' || $name eq '..'; ## never follow these.
943 ## create full relative pathname.
944 $file = $dir eq '.' ? $name : "$dir/$name";
946 ## if link and skipping them, do so.
947 if ($NOLINKS && -l $file) {
948 warn qq/skip (symlink): $file\n/ if $WHY;
952 ## skip things unless files or directories
953 unless (-f $file || -d _) {
955 $why = (-S _ && "socket") ||
957 (-b _ && "block special")||
958 (-c _ && "char special") || "somekinda special";
959 warn qq/skip ($why): $file\n/;
964 ## skip things we can't read
967 $why = (-l $file) ? "follow" : "read";
968 warn qq/skip (can\'t $why): $file\n/;
973 ## skip things that are empty
974 if (!$WINDOWS) { # -s fails for all dirs under Windows
975 # or should we just put the -d test before this?
977 warn qq/skip (empty): $file\n/ if $WHY;
982 ## Note file device & inode. If -xdev, skip if appropriate.
983 ($dev, $inode) = (stat(_))[$STAT_DEV, $STAT_INODE];
984 if ($XDEV && !defined $xdev{$dev}) {
985 warn qq/skip (other device): $file\n/ if $WHY;
988 $id = "$dev,$inode" if $USE_INODES;
990 ## special work for a directory
992 if ($DESCEND_SUBDIRECTORIES == 0) {
993 warn qq/skip (-depth): $file\n/ if $WHY;
997 ## Do checks for directory file endings.
998 if ($DO_DSKIP_TEST && (eval $DSKIP_TEST)) {
999 warn qq/skip (-dskip): $file\n/ if $WHY;
1002 ## do checks for -name/-regex/-path tests
1003 if ($DO_DGLOB_TESTS && !(eval $DGLOB_TESTS)) {
1004 warn qq/skip (dirname): $file\n/ if $WHY;
1010 ## _never_ redo a directory
1011 if (defined $dir_done{$id}) {
1012 warn qq/skip (did as "$dir_done{$id}"): $file\n/ if $WHY;
1015 $dir_done{$id} = $file; ## mark it done.
1017 unshift(@todo, $file); ## add to the list to do.
1020 if ($WHY == 0 && $VERBOSE > 1) {
1021 if ($VERBOSE>2||$vv_print++>$VV_PRINT_COUNT||($vv_size+=-s _)>$VV_SIZE){
1023 $vv_print = $vv_size = 0;
1027 ## do time-related tests
1028 if ($NEWER || $OLDER) {
1029 $_ = (stat(_))[$STAT_MTIME];
1030 if ($NEWER && $_ < $NEWER) {
1031 warn qq/skip (too old): $file\n/ if $WHY;
1034 if ($OLDER && $_ > $OLDER) {
1035 warn qq/skip (too new): $file\n/ if $WHY;
1040 ## do checks for file endings
1041 if ($DO_SKIP_TEST && (eval $SKIP_TEST)) {
1042 warn qq/skip (-skip): $file\n/ if $WHY;
1046 ## do checks for -name/-regex/-path tests
1047 if ($DO_GLOB_TESTS && !(eval $GLOB_TESTS)) {
1048 warn qq/skip (filename): $file\n/ if $WHY;
1053 ## If we're not repeating files,
1054 ## skip this one if we've done it, or note we're doing it.
1056 if (defined $file_done{$id}) {
1057 warn qq/skip (did as "$file_done{$id}"): $file\n/ if $WHY;
1060 $file_done{$id} = $file;
1064 foreach $expr (keys %filter_cmd) {
1065 next unless eval "{ package filter; \$N = \$main'file; { $expr }}";
1066 $filter = $filter_cmd{$expr};
1067 $filter .= " $file" unless $filter =~ s/%/$file/g;
1071 if ($DO_MAGIC_TESTS && !$filter) {
1072 if (!open(FILE_IN, $file)) {
1073 &clear_message if $VERBOSE && $STDERR_SCREWS_STDOUT;
1074 warn qq/$0: can\'t open: $file\n/;
1077 unless (read(FILE_IN, $magic'H, $HEADER_BYTES)) {#'
1078 &clear_message if $VERBOSE && $STDERR_SCREWS_STDOUT;
1079 warn qq/$0: can\'t read from "$file"\n/;
1087 warn qq/skip (magic): $file\n/ if $WHY;
1090 seek(FILE_IN, 0, 0); ## reset for later <FILE_IN>
1093 if ($WHY != 0 && $VERBOSE > 1) {
1094 if ($VERBOSE>2||$vv_print++>$VV_PRINT_COUNT||($vv_size+=-s _)>$VV_SIZE){
1096 $vv_print = $vv_size = 0;
1102 &clear_message if $VERBOSE && $STDERR_SCREWS_STDOUT;
1104 $retval=0; ## we've found something
1105 close(FILE_IN) if $DO_MAGIC_TESTS;
1108 ## if we weren't doing magic tests, file won't be open yet...
1110 if (!open(FILE_IN, "$filter|")) {
1111 &clear_message if $VERBOSE && $STDERR_SCREWS_STDOUT;
1112 warn qq/$0: can\'t open filter: $filter\n/;
1115 }elsif (!$DO_MAGIC_TESTS && !open(FILE_IN, $file)) {
1116 &clear_message if $VERBOSE && $STDERR_SCREWS_STDOUT;
1117 warn qq/$0: can\'t open: $file\n/;
1120 if ($LIST_ONLY && $CAN_USE_FAST_LISTONLY) {
1122 ## This is rather complex, but buys us a LOT when we're just
1123 ## listing files and not the individual internal lines.
1125 local($size) = 4096; ## block-size in which to do reads
1126 local($nl); ## will point to $_'s ending newline.
1127 local($read); ## will be how many bytes read.
1128 local($_) = ''; ## Starts out empty
1129 local($hold); ## (see below)
1131 while (($read = read(FILE_IN,$_,$size,length($_)))||length($_))
1134 ## if read a full block, but no newline, need to read more.
1135 while ($read == $size && ($nl = rindex($_, "\n")) < 0) {
1136 push(@parts, $_); ## save that part
1137 $read = read(FILE_IN, $_, $size); ## keep trying
1141 ## If we had to save parts, must now combine them together.
1142 ## adjusting $nl to reflect the now-larger $_. This should
1143 ## be a lot more efficient than using any kind of .= in the
1147 local($lastlen) = length($_); #only need if $nl >= 0
1148 $_ = join('', @parts, $_);
1149 $nl = length($_) - ($lastlen - $nl) if $nl >= 0;
1153 ## If we're at the end of the file, then we can use $_ as
1154 ## is. Otherwise, we need to remove the final partial-line
1155 ## and save it so that it'll be at the beginning of the
1156 ## next read (where the rest of the line will be laid in
1157 ## right after it). $hold will be what we should save
1160 if ($read != $size || $nl < 0) {
1163 $hold = substr($_, $nl + 1);
1164 substr($_, $nl + 1) = '';
1168 ## Now have a bunch of full lines in $_. Use it.
1170 if (eval $REGEX_TEST) {
1171 &clear_message if $VERBOSE && $STDERR_SCREWS_STDOUT;
1173 $retval=0; ## we've found something
1178 ## Prepare for next read....
1182 } else { ## else not using faster block scanning.....
1184 $lines_printed = 0 if $NICE;
1187 next unless (eval $REGEX_TEST);
1190 ## We found a matching line.
1193 &clear_message if $VERBOSE && $STDERR_SCREWS_STDOUT;
1198 ## prepare to print line.
1199 if ($NICE && $lines_printed++ == 0) {
1200 print '-' x 70, "\n" if $NICE > 1;
1205 ## Print all the prelim stuff. This looks less efficient
1206 ## than it needs to be, but that's so that when the eval
1207 ## is compiled (and the tests are optimized away), the
1208 ## result will be less actual PRINTs than the more natural
1209 ## way of doing these tests....
1212 if ($REPORT_LINENUM) {
1217 } elsif ($REPORT_LINENUM && $PREPEND_FILENAME) {
1219 } elsif ($PREPEND_FILENAME) {
1221 } elsif ($REPORT_LINENUM) {
1224 if ($DOMARK) { eval $MARK; }
1226 print "\n" unless m/\n$/;
1229 print "\n" if ($NICE > 1) && $lines_printed;
1241 .nr nl 0-1 \" fake up transition to first page again
1242 .nr % 0 \" start at page 1
1243 .\"__________________NORMAL_MAN_PAGE_BELOW_________________
1245 .TH search 1 "Dec 17, 1994"
1247 search \- search files (a'la grep) in a whole directory tree.
1249 search [ grep-like and find-like options] [regex ....]
1252 is more or less a combo of 'find' and 'grep' (although the regular
1253 expression flavor is that of the perl being used, which is closer to
1254 egrep's than grep's).
1257 does generally the same kind of thing that
1259 find <blah blah> | xargs egrep <blah blah>
1263 more powerful and efficient (and intuitive, I think).
1265 This manual describes
1267 as of version "960325.7".
1268 You can always find the latest version at
1270 http://www.wg.omron.co.jp/~jfriedl/perl/index.html
1274 Basic use is simple:
1278 will search files in the current directory, and all sub directories, for
1279 files that have "jeff" in them. The lines will be listed with the
1280 containing file's name prepended.
1282 If you list more than one regex, such as with
1284 % search jeff Larry Randal+ 'Stoc?k' 'C.*son'
1286 then a line containing any of the regexes will be listed.
1287 This makes it effectively the same as
1289 % search 'jeff|Larry|Randal+|Stoc?k|C.*son'
1291 However, listing them separately is much more efficient (and is easier
1294 Note that in the case of these examples, the
1296 (list whole-words only) option would be useful.
1297 And if your terminal supports ANSI escape sequences, you can use
1299 to highlight the items found. Furthermore, if your display supports
1300 color as well, you can use
1304 etc. instead to have the searched items marked with the given color.
1306 Normally, various kinds of files are automatically removed from consideration.
1307 If it has a certain ending (such as ".tar", ".Z", ".o", etc.), or if
1308 the beginning of the file looks like a binary, it'll be excluded.
1309 You can control exactly how this works -- see below. One quick way to
1310 override this is to use the
1312 option, which means to consider all the files that would normally be
1313 automatically excluded.
1314 Or, if you're curious, you can use
1316 to have notes about what files are skipped (and why) printed to stderr.
1318 .SH "BASIC OVERVIEW"
1319 Normally, the search starts in the current directory, considering files in
1324 file to control ways to automatically exclude files.
1325 If you don't have this file, a default one will kick in, which automatically
1330 (among others) to exclude those kinds of files (which you probably want to
1331 skip when searching for text, as is normal).
1332 Files that look to be binary will also be excluded.
1334 Files ending with "#" and "~" will also be excluded unless the
1340 to show what kinds of files will normally be skipped.
1341 See the section on the startup file
1346 option to indicate you want to consider all files that would otherwise be
1347 skipped by the startup file.
1349 Based upon various other flags (see "WHICH FILES TO CONSIDER" below),
1350 more files might be removed from consideration. For example
1354 will exclude files that aren't at least three days old (change the 3 to -3
1355 to exclude files that are more than three days old), while
1359 would exclude any file beginning with a dot (of course, '.' and '..' are
1360 special and always excluded).
1362 If you'd like to see what files are being excluded, and why, you can get the
1367 If a file makes it past all the checks, it is then "considered".
1368 This usually means it is grepped for the regular expressions you gave
1369 on the command line.
1371 If any of the regexes match a line, the line is printed.
1374 is given, just the filename is printed. Or, if
1376 is given, a somewhat more (human-)readable output is generated.
1378 If you're searching a huge tree and want to keep informed about how
1379 the search is progressing,
1381 will print (to stderr) the current directory being searched.
1384 will also print the current file "every so often", which could be useful
1385 if a directory is huge. Using
1387 will print the update with every file.
1389 Below is the full listing of options.
1391 .SH "OPTIONS TELLING *WHERE* TO SEARCH"
1394 Start searching at the named directory instead of the current directory.
1397 arguments are given, multiple trees will be searched.
1402 except it flushes any previous
1404 directories (i.e. "-dir A -dir B -dir C" will search A, B, and C, while
1405 "-dir A -ddir B -dir C" will search only B and C). This might be of use
1406 in the startup file (see that section below).
1409 Stay on the same filesystem as the starting directory/directories.
1412 Sort the items in a directory before processing them.
1413 Normally they are processed in whatever order they happen to be read from
1417 Don't follow symbolic links. Normally they're followed.
1420 Don't descend into subdirectories. Only a depth of 0 is currently supported.
1421 .SH "OPTIONS CONTROLLING WHICH FILES TO CONSIDER AND EXCLUDE"
1424 Only consider files that were last changed more than
1431 has '-' prepended, i.e. "-mtime -2.5" means to consider files that
1432 have been changed in the last two and a half days).
1435 Only consider files that have not changed since
1438 If there is any upper case in the "-older", "or equal" is added to the sense
1439 of the test. Therefore, "search -older ./file regex" will never consider
1440 "./file", while "search -Older ./file regex" will.
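The difference between "-older" and "-Older" can be sketched in Perl as follows. This is only an illustration, assuming modification times are compared as plain numbers (seconds); passes_older is a hypothetical helper and not a routine from search itself.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether a file passes an -older style test.
# $flag       - the option as typed, e.g. "-older" or "-Older"
# $file_mtime - modification time of the candidate file
# $ref_mtime  - modification time of the reference file
sub passes_older {
    my ($flag, $file_mtime, $ref_mtime) = @_;

    # Any upper case in the option name adds "or equal" to the test.
    my $or_equal = $flag =~ /[A-Z]/;

    return $or_equal ? $file_mtime <= $ref_mtime
                     : $file_mtime <  $ref_mtime;
}

# A file compared against itself has equal times on both sides, so the
# strict test rejects it and the "or equal" test accepts it.
my $t = 800_000_000;
print "-older considers ./file: ", (passes_older('-older', $t, $t) ? "yes" : "no"), "\n";
print "-Older considers ./file: ", (passes_older('-Older', $t, $t) ? "yes" : "no"), "\n";
```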
1442 If a file is a symbolic link, the time used is that of the file and not the
1450 Only consider files that match the shell filename pattern
1452 The check is only done on a file's name (use
1454 to check the whole path, and use
1456 to check directory names).
1458 Multiple specifications can be given by separating them with spaces, a'la
1462 to consider C source and header files.
1465 doesn't contain any special pattern characters, a '*' is prepended.
1466 This last example could have been given as
1470 It could also be given as
1476 -name '*.c' -name '*.h'
1483 but in this last case, you have to be sure to supply the leading '*'.
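As a rough sketch of how a space-separated list of name patterns could be turned into a single test, the Perl below builds one anchored regex. It is an assumption-laden illustration: name_patterns_to_regex is a hypothetical helper, only '*' and '?' are treated as special pattern characters, and search itself may handle more.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a specification such as ".c .h" or
# "*.c *.h" into one regex that is matched against a file's name.
sub name_patterns_to_regex {
    my ($spec) = @_;
    my @alternatives;

    for my $word (split ' ', $spec) {
        # With no special pattern characters, a '*' is prepended.
        my $pattern = $word =~ /[*?]/ ? $word : "*$word";

        # Escape everything, then turn the escaped wildcards back on.
        my $re = quotemeta $pattern;
        $re =~ s/\\\*/.*/g;    # shell '*' -> regex '.*'
        $re =~ s/\\\?/./g;     # shell '?' -> regex '.'
        push @alternatives, $re;
    }

    my $joined = join '|', @alternatives;
    return qr/^(?:$joined)$/;
}

my $c_files = name_patterns_to_regex('.c .h');
for my $name (qw(main.c defs.h main.cc README)) {
    printf "%-8s %s\n", $name, ($name =~ $c_files ? "considered" : "skipped");
}
```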
1488 except the entire path is checked against the pattern.
1491 Considers files whose names (not paths) match the given perl regex
1495 Case-insensitive version of
1499 Case-insensitive version of
1502 .BI -iregex " REGEX"
1503 Case-insensitive version of
1508 Only search down directories whose path matches the given pattern (this
1509 doesn't apply to the initial directory given by
1514 -dir /usr/man -dpath /usr/man/man*
1516 would completely skip
1517 "/usr/man/cat1", "/usr/man/cat2", etc.
1520 Skips directories whose name (not path) matches the given pattern.
1523 -dir /usr/man -dskip cat*
1525 would completely skip any directory in the tree whose name begins with "cat"
1526 (including "/usr/man/cat1", "/usr/man/cat2", etc.).
1528 .BI -dregex " REGEX"
1531 but the pattern is a full perl regex. Note that this is quite different
1534 which considers only file names (not paths). This option considers
1535 full directory paths (not just names). It's much more useful this way.
1536 Sorry if it's confusing.
1539 This option exists, but is probably not very useful. It probably wants to
1540 be like the '-below' or something I mention in the "TODO" section.
1543 Case-insensitive version of
1547 Case-insensitive version of
1550 .BI -idregex " REGEX"
1551 Case-insensitive version of
1555 Ignore any 'magic' or 'option' lines in the startup file.
1556 The effect is that all files that would otherwise be automatically
1557 excluded are considered.
1560 Arguments starting with
1564 explained elsewhere) do special interaction with the
1566 startup file. Something like
1570 will turn on "flag1" and "flag2" in the startup file (and is
1571 the same as "-xflag1,flag2"). You can use this to write your own
1572 rules for what kinds of files are to be considered.
1574 For example, the internal-default startup file contains the line
1576 <!~> option: -skip '~ #'
1578 This means that if the
1587 The effect is that emacs temp and backup files are not normally
1588 considered, but you can include them with the -x~ flag.
1590 You can write your own rules to customize
1592 in powerful ways. See the STARTUP FILE section below.
1595 Print a message (to stderr) when and why a file is not considered.
1597 .SH "OPTIONS TELLING WHAT TO DO WITH FILES THAT WILL BE CONSIDERED"
1599 \fB-find\fP or \fB-b\fP
1600 This option changes the basic action of
1603 Normally, if a file is considered, it is searched
1604 for the regular expressions as described earlier. However, if this option
1605 is given, the filename is printed and no searching takes place. This turns
1607 into a 'find' of sorts.
1609 In this case, no regular expressions are needed on the command line
1610 (any that are there are silently ignored).
1612 This is not intended to be a replacement for the 'find' program,
1614 you in understanding just what files are getting past the exclusion checks.
1615 If you really want to use it as a sort of replacement for the 'find' program,
1616 you might want to use
1618 so that it doesn't waste time checking to see if the file is binary, etc
1619 (unless you really want that, of course).
1623 none of the "GREP-LIKE OPTIONS" (below) matter.
1625 As a replacement for 'find',
1627 is probably a bit slower (or in the case of GNU find, a lot slower --
1631 However, "search -ffind"
1632 might be more useful than 'find' when options such as
1634 are used (at least until 'find' gets such functionality).
1636 \fB-ffind\fP or \fB-ff\fP
1637 A faster more 'find'-like find. Does
1641 .SH "GREP-LIKE OPTIONS"
1642 These options control how a searched file is accessed,
1643 and how things are printed.
1645 \fB-F\fP or \fB-lit\fP
1646 Causes arguments to be taken as literal text rather than as perl regular
1649 \fB-R\fP or \fB-regex\fP
1652 Regex arguments are indeed taken as perl regular expressions.
1655 Ignore letter case when matching.
1658 Don't ignore letter case when matching (useful for overriding a
1660 in the startup file)
1663 Consider only whole-word matches ("whole word" as defined by perl's "\\b"
1667 If the regex(es) is/are simple, try to modify them so that they'll work
1668 in manpage-like underlined text (i.e. like _^Ht_^Hh_^Hi_^Hs).
1669 This is very rudimentary at the moment.
1671 \fB-list\fP or \fB-l\fP
1673 Don't print matching lines, but the names of files that contain matching
1674 lines. This will likely be *much* faster, as special optimizations are
1675 made -- particularly with large files.
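The kind of optimization meant here can be sketched as block-wise scanning: read the file in large blocks, test each run of complete lines with one regex match, and stop at the first hit. The Perl below is a simplified illustration and not the code search uses; file_matches and its 4096-byte default are assumptions, and it assumes the regex never needs to match across a newline.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 as soon as any line read from $fh
# matches $regex, reading $size bytes at a time.
sub file_matches {
    my ($fh, $regex, $size) = @_;
    $size ||= 4096;
    my $buf = '';

    while (1) {
        # Append the next block after any partial line held over.
        my $read = read($fh, $buf, $size, length $buf);
        die "read failed: $!" unless defined $read;

        if ($read == 0) {
            # End of file: whatever is left is a final unterminated line.
            return length($buf) && $buf =~ $regex ? 1 : 0;
        }

        my $nl = rindex($buf, "\n");
        next if $nl < 0;                    # no complete line yet, read more

        my $hold = substr($buf, $nl + 1);   # trailing partial line
        substr($buf, $nl + 1) = '';         # keep only the complete lines
        return 1 if $buf =~ $regex;         # one test for the whole block
        $buf = $hold;                       # partial line starts the next block
    }
}

# Usage with an in-memory file and a tiny block size to force several reads.
my $text = "first line\nsecond line with jeff in it\nthird\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
print file_matches($fh, qr/jeff/, 8) ? "match\n" : "no match\n";
close($fh);
```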
1678 Prefix each line with its line number.
1681 Not a grep-like option, but similar to
1685 will have the output be a bit more human-readable, with matching lines printed
1686 slightly indented after the filename, a'la
1690 somedir/somefile: line with foo in it
1691 somedir/somefile: some food for thought
1692 anotherdir/x: don't be a buffoon!
1702 some food for thought
1708 This option due to Lionel Cons.
1713 Prefix each file's output by a rule line, and follow with an extra blank line.
1716 Don't prepend each output line with the name of the file
1723 .SH "OPTIONS WHICH INDICATE HOW TO DISPLAY"
1728 from just above, you can use the following if your display supports
1729 ANSI escape sequences (most systems seem to).
1732 Show the found items in reverse video.
1735 Show the found items in red.
1738 Show the found items in green.
1741 Show the found items in yellow.
1744 Show the found items in blue.
1747 Show the found items in cyan.
1750 Show the found items in white.
1753 Show the found items in black.
1758 Print the usage information.
1761 Print the version information and quit.
1764 Set the level of message verbosity.
1766 will print a note whenever a new directory is entered.
1768 will also print a note "every so often". This can be useful to see
1769 what's happening when searching huge directories.
1771 will print a note with every file.
1779 This ends the options, and can be useful if the regex begins with '-'.
1782 Shows what is being considered in the startup file, then exits.
1785 Normally, an identical file won't be checked twice (even with multiple
1786 hard or symbolic links). If you're just trying to do a fast
1788 the bookkeeping to remember which files have been seen is not desirable,
1789 so you can eliminate the bookkeeping with this flag.
1794 starts up, it processes the directives in
1796 If no such file exists, a default
1797 internal version is used.
1799 The internal version looks like:
1802 magic: 32 : $H =~ m/[\ex00-\ex06\ex10-\ex1a\ex1c-\ex1f\ex80\exff]{2}/
1803 filter: $N =~ m/\.(gz|Z)$/ : "zcat %"
1804 option: -skip '.a .COM .elc .EXE .o .pbm .xbm .dvi'
1805 option: -iskip '.tarz .zip .lzh .jpg .jpeg .gif .uu'
1806 <!~> option: -skip '~ #'
1809 If you wish to create your own "~/.search",
1810 you might consider copying the above, and then working from there.
1812 There are three kinds of directives in a startup file: "filter", "magic"
1818 Option lines will automatically do the command-line options given.
1819 For example, the line
1823 in your startup file will turn on -v every time, without needing to type it
1824 on the command line.
1826 The text on the line after the "option:" directive is processed
1827 like the Bourne shell, so make sure to pay attention to quoting.
1829 option: -skip .exe .com
1831 will give an error (".com" by itself isn't a valid option), while
1833 option: -skip ".exe .com"
1835 will properly include it as part of -skip's argument.
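The effect of the quoting can be seen with Perl's standard Text::ParseWords module, which splits a string into words in a Bourne-shell-like way. This is only an illustration of the splitting rule; it is an assumption that it behaves the same as the parser search applies to option lines.

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Unquoted: ".com" becomes a separate word, which is then taken
# to be an (invalid) option of its own.
my @unquoted = shellwords('-skip .exe .com');

# Quoted: both endings stay together as the single argument to -skip.
my @quoted = shellwords('-skip ".exe .com"');

print scalar(@unquoted), " words: ", join(' | ', @unquoted), "\n";
print scalar(@quoted),   " words: ", join(' | ', @quoted),   "\n";
```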
1839 Magic lines are used to determine if a file should be considered a binary
1840 or not (the term "magic" refers to checking a file's magic number). These
1841 are described in more detail below.
1845 Filter lines are used to apply a command to a file to get the text to search.
1850 filter : EXPRESSION: "command...."
1854 is a perl expression used to determine if the filter should be applied to a
1855 given file (the file's name will be in the variable $N, but remember that
1858 etc., won't even be considered for a filter). If true, the
1860 will be executed and its standard-output will be checked. ``\fB%\fP'' in the
1861 command string will be replaced by the filename.
1863 The most common example would be to uncompress a file on the fly, i.e.
1865 filter: $N =~ m/\.(gz|Z)$/ : "zcat %"
1867 Note that had the ``\fBzcat\fP'' been ``\fBgunzip\fP'' instead, you'd
1868 uncompress your files in place instead of searching them, so take care when
1869 specifying a filter! If you're worried about mixing up GNU's zcat with
1870 an old one, you might use separate ones as with:
1872 filter: $N =~ m/\.gz$/ : "/my/GNU/binaries/zcat %"
1873 filter: $N =~ m/\.Z$/ : "/the/non-GNU/binaries/zcat %"
1876 Also note that when a filter is applied, the
1878 section is ignored for the file (this can be considered a bug, so it might
1879 change in the future).
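A minimal Perl sketch of how filter lines might be applied: each filter pairs a test on the file's name with a command template, and the first test that succeeds yields the command with '%' replaced by the name. The table layout and the filter_command helper are hypothetical; the sketch only builds the command string, and it assumes the name needs no shell quoting.

```perl
use strict;
use warnings;

# Each entry: [ test on the file name, command template ].
# This mirrors the single default filter line shown above.
my @filters = (
    [ sub { $_[0] =~ m/\.(gz|Z)$/ }, 'zcat %' ],
);

# Hypothetical helper: return the command to read $name through,
# or nothing if no filter applies.
sub filter_command {
    my ($name) = @_;
    for my $filter (@filters) {
        my ($test, $template) = @$filter;
        next unless $test->($name);
        (my $command = $template) =~ s/%/$name/g;   # '%' becomes the filename
        return $command;
    }
    return;
}

for my $name ('notes.txt.gz', 'old.tar.Z', 'notes.txt') {
    my $command = filter_command($name);
    print "$name: ", (defined $command ? "read from \"$command\"" : "read directly"), "\n";
}
```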
1883 Blank lines and comments (lines beginning with '#') are allowed.
1885 If a line begins with <...>, then it's a check to see if the
1886 directive on the line should be done or not. The stuff inside the <...>
1887 can contain perl's && (and), || (or), ! (not), and parens for grouping,
1888 along with "flags" that might be indicated by the user with
1892 For example, using "-xfoo" will cause "foo" to be true inside the <...>
1893 blocks. Therefore, a line beginning with "<foo>" would be done only when
1894 "-xfoo" had been specified, while a line beginning with "<!foo>" would be
1895 done only when "-xfoo" is not specified (of course, a line without any <...>
1896 is done in either case).
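One way to picture the <...> check is to replace every flag name by 1 or 0 and evaluate what remains as an ordinary perl expression. The sketch below does that; condition_true is a hypothetical helper, and treating any run of characters other than whitespace, parens, '!', '&' and '|' as a flag name is an assumption of the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: is the text inside <...> true, given the flags
# that are currently set?
sub condition_true {
    my ($condition, %flags) = @_;

    # Replace every flag name by 1 (set) or 0 (not set).
    (my $expr = $condition) =~ s/([^\s()!&|]+)/$flags{$1} ? 1 : 0/ge;

    # Only digits, operators, parens and spaces may remain before eval.
    die "unexpected text in condition: $condition\n"
        unless $expr =~ /^[01\s()!&|]*$/;

    my $result = eval $expr;
    die "bad condition: $condition\n" if $@;
    return $result ? 1 : 0;
}

# "-xfoo" on the command line would set the "foo" flag.
print "<foo>  with -xfoo:    ", condition_true('foo',  foo => 1), "\n";
print "<!foo> with -xfoo:    ", condition_true('!foo', foo => 1), "\n";
print "<!foo> without -xfoo: ", condition_true('!foo'),           "\n";
```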
1898 A realistic example might be
1902 This will cause -vv messages to be the default, but allow "-xv" to override.
1904 There are a few flags that are set automatically:
1908 true if the output is to the screen (as opposed to being redirected to a file).
1909 You can force this (as with all the other automatic flags) with -xTTY.
1912 True if -v was specified. If -vv was specified, both
1916 flags are true (and so on).
1919 True if -nice was specified. Same thing about -nnice as for -vv.
1923 true if -list (or -l) was given.
1926 true if -dir was given.
1929 Using this info, you might change the last example to
1932 <!v && !-v> option: -vv
1935 The added "&& !-v" means "and if the '-v' option is not given".
1936 This will allow you to use "-v" alone on the command line, and not
1937 have this directive add the more verbose "-vv" automatically.
1940 Some other examples:
1942 <!-dir && !here> option: -dir ~/
1943 Effectively make the default directory your home directory (instead of the
1944 current directory). Using -dir or -xhere will undo this.
1946 <tex> option: -name .tex -dir ~/pub
1947 Create '-xtex' to search only "*.tex" files in your ~/pub directory tree.
1948 Actually, this could be made a bit better. If you combine '-xtex' and '-dir'
1949 on the command line, this directive will add ~/pub to the list, when you
1950 probably want to use the -dir directory only. You could do
1953 <tex> option: -name .tex
1954 <tex && !-dir> option: -dir ~/pub
1957 to allow '-xtex' to work as before, but allow a command-line "-dir"
1958 to take precedence with respect to ~/pub.
1960 <fluff> option: -nnice -sort -i -vvv
1961 Combine a few user-friendly options into one '-xfluff' option.
1963 <man> option: -ddir /usr/man -v -w
1964 When the '-xman' option is given, search "/usr/man" for whole-words
1965 (of whatever regex or regexes are given on the command line), with -v.
1968 The lines in the startup file are executed from top to bottom, so something
1972 <both> option: -xflag1 -xflag2
1973 <flag1> option: ...whatever...
1974 <flag2> option: ...whatever...
1977 will allow '-xboth' to be the same as '-xflag1 -xflag2' (or '-xflag1,flag2'
1978 for that matter). However, if you put the "<both>" line below the others,
1979 they will not be true when encountered, so the result would be different
1980 (and probably undesired).
1982 The "magic" directives are used to determine if a file looks to be binary
1983 or not. The form of a magic line is
1985 magic: \fISIZE\fP : \fIPERLCODE\fP
1989 is the number of bytes of the file you need to check, and
1991 is the code to do the check. Within
1993 the variable $H will hold at least the first
1995 bytes of the file (unless the file is shorter than that, of course).
1996 It might hold more bytes. The perl should evaluate to true if the file
1997 should be considered a binary.
2001 magic: 6 : substr($H, 0, 6) eq 'GIF87a'
2003 to test for a GIF ("-iskip .gif" is better, but this might be useful
2004 if you have images in files without the ".gif" extension).
2006 Since the startup file is checked from top to bottom, you can be a bit
2009 magic: 6 : ($x6 = substr($H, 0, 6)) eq 'GIF87a'
2010 magic: 6 : $x6 eq 'GIF89a'
2012 You could also write the same thing as
2014 magic: 6 : (($x6 = substr($H, 0, 6)) eq 'GIF87a') || ## an old gif, or.. \e
2015 $x6 eq 'GIF89a' ## .. a new one.
2017 since newlines may be escaped.
2019 The default internal startup file includes
2021 magic: 32 : $H =~ m/[\ex00-\ex06\ex10-\ex1a\ex1c-\ex1f\ex80\exff]{2}/
2023 which checks for certain non-printable characters, and catches a large
2024 number of binary files, including most systems' executables, linkable
2025 objects, compressed, tarred, and otherwise folded, spindled, and mutilated
2028 Another example might be
2030 ## an archive library
2031 magic: 17 : substr($H, 0, 17) eq "!<arch>\en__.SYMDEF"
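To make the magic tests concrete, here is a small Perl sketch that applies a few of the tests shown above to the leading bytes of a file. The table layout and the looks_binary helper are hypothetical, and the sketch takes the header bytes as a string rather than reading a real file.

```perl
use strict;
use warnings;

# Each entry: [ SIZE, test on the header bytes ].
# The tests are the GIF examples and the default check shown above.
my @magic = (
    [  6, sub { substr($_[0], 0, 6) eq 'GIF87a' } ],
    [  6, sub { substr($_[0], 0, 6) eq 'GIF89a' } ],
    [ 32, sub { $_[0] =~ m/[\x00-\x06\x10-\x1a\x1c-\x1f\x80\xff]{2}/ } ],
);

# Hypothetical helper: true if any magic test says the header is binary.
# $H stands for the first bytes of the file, as in the magic lines.
sub looks_binary {
    my ($H) = @_;
    for my $entry (@magic) {
        my ($size, $test) = @$entry;
        return 1 if $test->($H);
    }
    return 0;
}

print "GIF header:  ", looks_binary("GIF89a\x10\x00"),        "\n";
print "plain text:  ", looks_binary("just some text\n"),      "\n";
print "NUL bytes:   ", looks_binary("\x00\x01 object code"),  "\n";
```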
2036 returns zero if lines (or files, if appropriate) were found,
2037 or if no work was requested (such as with
2039 Returns 1 if no lines (or files) were found.
2043 Things I'd like to add some day:
2045 + show surrounding lines (context).
2046 + highlight matched portions of lines.
2047 + add '-and', which can go between regexes to override
2048 the default logical or of the regexes.
2049 + add something like
2051 which will examine a tree and only consider files that
2052 lie in a directory deeper than one named by the pattern.
2053 + add 'warning' and 'error' directives.
2054 + add 'help' directive.
2057 If -xdev and multiple -dir arguments are given, any file in any of the
2058 target filesystems is allowed. It would be better to allow each filesystem
2059 for each separate tree.
2061 Multiple -dir args might also cause some confusing effects. Doing
2063 -dir some/dir -dir other
2065 will search "some/dir" completely, then search "other" completely. This
2066 is good. However, something like
2068 -dir some/dir -dir some/dir/more/specific
2070 will search "some/dir" completely *except for* "some/dir/more/specific",
2071 after which it will return and be searched. Not really a bug, but just sort
2074 File times (for -newer, etc.) of symbolic links are for the file, not the
2075 link. This could cause some misunderstandings.
2077 Probably more. Please let me know.
2079 Jeffrey Friedl, Omron Corp (jfriedl@omron.co.jp)
2081 http://www.wg.omron.co.jp/cgi-bin/j-e/jfriedl.html
2084 See http://www.wg.omron.co.jp/~jfriedl/perl/index.html