Command Line Syntax
The generic syntax is:
webbot [ options ] [ docaddress [ keywords ]]
The order of the options is not important, and options can in fact be specified
on either side of any docaddress. The currently
available options are:
Getting Help
-
-v [ a | b | c | g | p | s | t | u ]
-
Verbose mode: gives a running commentary on the program's attempts to read
data in various ways. As the amount of verbose output is substantial, the
-v option can be followed by zero, one or more of the following
flags (without space) in order to differentiate the verbose output generated:
-
a: Anchor relevant information
-
b: Bindings to local file system
-
c: Cache trace
-
g: SGML trace
-
p: Protocol module information
-
s: SGML/HTML relevant information
-
t: Thread trace
-
u: URI relevant information
The -v option without any appended flags shows all trace messages.
For example, -vpt shows thread and protocol trace messages.
-
-version
-
Prints out the version number of the software, and the version number of
the WWW library, and exits.
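As a minimal sketch of how the -v flag letters combine, the following Python helper builds a -v argument from the trace categories listed above. The helper name verbose_option and the table TRACE_FLAGS are hypothetical and are not part of webbot; only the flag letters and the "no space" rule come from the text.

```python
# Trace categories from the -v description, keyed by flag letter.
TRACE_FLAGS = {
    "a": "Anchor relevant information",
    "b": "Bindings to local file system",
    "c": "Cache trace",
    "g": "SGML trace",
    "p": "Protocol module information",
    "s": "SGML/HTML relevant information",
    "t": "Thread trace",
    "u": "URI relevant information",
}


def verbose_option(letters=""):
    """Return a -v argument with the flag letters appended without space.

    Hypothetical helper for illustration only. An empty string yields
    plain "-v", which shows all trace messages.
    """
    unknown = [c for c in letters if c not in TRACE_FLAGS]
    if unknown:
        raise ValueError("unknown trace flag(s): " + "".join(unknown))
    # Keep the caller's order but drop repeated letters.
    unique = []
    for c in letters:
        if c not in unique:
            unique.append(c)
    return "-v" + "".join(unique)


print(verbose_option("pt"))  # -vpt
print(verbose_option())      # -v
```

Rejecting unknown letters in the helper is a design choice of this sketch; how webbot itself treats an unlisted letter is not stated here.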
Configuration Options
-
-img
-
Also test inlined images, using a HEAD request.
-
-saveimg
-
Save the inlined images to local disk, or pump them to a black hole. This
is primarily for testing true client behavior in the robot.
-
-cache
-
Enable the libwww persistent cache.
-
-cacheroot [ dir ]
-
Location of the cache. The default is /tmp/w3c-cache.
-
-validate
-
Force validation using either the etag or the last-modified date
provided by the server.
-
-endvalidate
-
Force end-to-end validation by adding a max-age=0 cache control directive.
-
-l [ file ]
-
Specifies a log file with a list of visited documents. The default value
is "www-log"
-
-link [ n ]
-
Fetch all links from this document. By giving an integer "n" as the parameter
you can specify the depth to which the search should go. The default value
is 0, which means that only the start page is searched. Level 1 means
that the start page and all pages directly linked from the start page
are searched.
-
-n
-
Non-interactive mode. Outputs the formatted document to the standard output,
then exits. Pages are delimited with form feed (FF) characters.
-
-o [ file ]
-
Redirects output to the specified file. The default value is "www-out". This
option forces non-interactive mode.
-
-q
-
Quiet mode. Do not print anything at all.
-
-nopipe
-
Do not use HTTP/1.1 pipelining. The default for this option can be
set using the configure script during installation.
-
-delay [ n ]
-
Specify the write delay in ms, that is, how long to wait before flushing the
output buffer when using pipelining. The default value is 50 ms. The longer
the delay, the larger the TCP packets but also the longer the response time.
-
-r <file>
-
Rule file, a.k.a. configuration file. If this is specified, a rule file may
be used to map URLs, and to set up other aspects of the behavior of the browser.
Many rule files may be given with successive -r options, and a default rule
file name may be given using the WWW_CONFIG environment variable.
-
-ss
-
Print the date and time at which the job starts and stops.
-
-single
-
Single-threaded mode. If this flag is set, the browser uses blocking,
non-interruptible I/O in interactive mode. Non-interactive mode always uses
blocking I/O.
-
-timeout <n>
-
Timeout in seconds on sockets
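The depth rule given for -link can be illustrated with a small Python sketch over a toy link graph. The graph LINKS, the page names, and the function pages_searched are all hypothetical; the sketch only shows which pages fall within a given depth, and the breadth-first order is an assumption rather than a statement about how webbot traverses links.

```python
from collections import deque

# Hypothetical toy link graph: page -> pages it links to.
LINKS = {
    "start": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": ["d"],
    "d": [],
}


def pages_searched(start, depth):
    """Return the pages within `depth` link hops of `start`."""
    seen = [start]
    queue = deque([(start, 0)])
    while queue:
        page, level = queue.popleft()
        if level == depth:
            continue  # depth limit reached: do not follow this page's links
        for target in LINKS.get(page, []):
            if target not in seen:
                seen.append(target)
                queue.append((target, level + 1))
    return seen


print(pages_searched("start", 0))  # ['start']
print(pages_searched("start", 1))  # ['start', 'a', 'b']
```

With depth 0 only the start page is returned, and with depth 1 the start page plus its direct links, matching the description of the default and of level 1.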
If present, the next argument (docaddress) is the
hypertext address of the
document at which you want to start browsing. You may want to define an alias
for www followed by the name of your favorite index.
Any further command line arguments are taken as keywords. The first argument
must refer to an index in this case. The index is searched for entries matching
the keywords, and a list of matching entries is displayed.
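The generic syntax, webbot [ options ] [ docaddress [ keywords ]], can be sketched as a small Python function that assembles an argument list. The function name webbot_argv and the example address are assumptions made for illustration, as is passing the value of -link and -timeout as a separate argument; the sketch only builds and prints the command line and does not run webbot.

```python
import shlex


def webbot_argv(options=(), docaddress=None, keywords=()):
    """Assemble an argument list following the generic syntax.

    Hypothetical helper: keywords are only allowed after a docaddress,
    mirroring the nesting [ docaddress [ keywords ]].
    """
    if keywords and docaddress is None:
        raise ValueError("keywords require a docaddress")
    argv = ["webbot", *options]
    if docaddress is not None:
        argv.append(docaddress)
        argv.extend(keywords)
    return argv


# Non-interactive run, following links one level deep, 30 second timeout.
argv = webbot_argv(["-n", "-link", "1", "-timeout", "30"], "http://www.w3.org/")
print(shlex.join(argv))
# webbot -n -link 1 -timeout 30 http://www.w3.org/
```

Since option order is not important, the options are placed before the docaddress here purely for readability.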
Henrik Frystyk,
libwww@w3.org,
@(#) $Id: CommandLine.html,v 1.11 1997/02/06 16:33:53 frystyk Exp $