<!-- -*- mode: sgml; mode: fold -*- -->
<!doctype debiandoc  PUBLIC  "-//DebianDoc//DTD DebianDoc//EN">
<book>
<title>APT Method Interface </title>

<author>Jason Gunthorpe <email>jgg@debian.org</email></author>
<version>$Id: method.sgml,v 1.9 2002/09/15 23:14:34 jgg Exp $</version>

<abstract>
This document describes the interface that APT uses to the archive
access methods.
</abstract>

<copyright>
Copyright &copy; Jason Gunthorpe, 1998.
<p>
"APT" and this document are free software; you can redistribute them and/or
modify them under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 2 of the License, or (at your
option) any later version.

<p>
For more details, on Debian GNU/Linux systems, see the file
/usr/doc/copyright/GPL for the full license.
</copyright>

<toc sect>

<chapt>Introduction
<!-- General		                                               {{{ -->
<!-- ===================================================================== -->
<sect>General

<p>
The APT method interface allows APT to acquire archive files (.deb), index
files (Packages, Release, Mirrors) and source files (.tar.gz, .diff). It
is a general, extensible system designed to satisfy all of these
requirements:

<enumlist>
<item>Remote methods that download files from a distant site
<item>Resume of aborted downloads
<item>Progress reporting
<item>If-Modified-Since (IMS) checking for index files
<item>In-Line MD5 generation
<item>No-copy in-filesystem methods
<item>Multi-media methods (like CD's)
<item>Dynamic source selection for failure recovery
<item>User interaction for user/password requests and media swaps
<item>Global configuration
</enumlist>

Initial releases of APT (0.1.x) used a completely different method
interface that only supported the first 6 items. This new interface
deals with the remainder.
</sect>
                                                                  <!-- }}} -->
<!-- Terms		                                               {{{ -->
<!-- ===================================================================== -->
<sect>Terms

<p>
Several terms are used through out the document, they have specific
meanings which may not be immediately evident. To clarify they are summarized
here.

<taglist>
<tag>source<item>
Refers to an item in source list. More specifically it is the broken down
item, that is each source maps to exactly one index file. Archive sources
map to Package files and Source Code sources map to Source files.

<tag>archive file<item>
Refers to a binary package archive (.deb, .rpm, etc). 

<tag>source file<item>
Refers to one of the files making up the source code of a package. In
debian it is one of .diff.gz, .dsc. or .tar.gz.

<tag>URI<item>
Universal Resource Identifier (URI) is a super-set of the familiar URL
syntax used by web browsers. It consists of an access specification
followed by a specific location in that access space. The form is
&lt;access&gt;:&lt;location&gt;. Network addresses are given with the form 
&lt;access&gt;://[&lt;user&gt;[:&lt;pas&gt;]@]hostname[:port]/&lt;location&gt;. 
Some examples:
<example>
file:/var/mirrors/debian/
ftp://ftp.debian.org/debian
ftp://jgg:MooCow@localhost:21/debian
nfs://bigred/var/mirrors/debian
rsync://debian.midco.net/debian
cdrom:Debian 2.0r1 Disk 1/
</example>

<tag>method<item>
There is a one to one mapping of URI access specifiers to methods. A method
is a program that knows how to handle a URI access type and operates according
to the specifications in this file.

<tag>method instance<item>
A specific running method. There can be more than one instance of each method
as APT is capable of concurrent method handling.

<tag>message<item>
A series of lines terminated by a blank line sent down one of the
communication lines. The first line should have the form xxx TAG
where xxx are digits forming the status code and TAG is an informational
string

<tag>acquire<item>
The act of bring a URI into the local pathname space. This may simply
be verifying the existence of the URI or actually downloading it from
a remote site.

</taglist>

</sect>
                                                                  <!-- }}} -->
<chapt>Specification
<!-- Overview		                                               {{{ -->
<!-- ===================================================================== -->
<sect>Overview

<p>
All methods operate as a sub process of a main controlling parent. 3 FD's
are opened for use by the method allowing two way communication and
emergency error reporting. The FD's correspond to the well known unix FD's, 
stdin, stdout and stderr.

<p>
Through operation of the method communication is done via http 
style plain text. Specifically RFC-822 (like the Package file) fields
are used to describe items and a numeric-like header is used to indicate
what is happening. Each of these distinct communication messages should be
sent quickly and without pause.

<p> 
In some instances APT may pre-invoke a method to allow things like file
URI's to determine how many files are available locally.
</sect>
                                                                  <!-- }}} -->
<!-- Message Overview	                                               {{{ -->
<!-- ===================================================================== -->
<sect>Message Overview

<p>
The first line of each message is called the message header. The first
3 digits (called the Status Code) have the usual meaning found in the 
http protocol. 1xx is informational, 2xx is successful and 4xx is failure. 
The 6xx series is used to specify things sent to the method. After the 
status code is an informational string provided for visual debugging.

<list>
<item>100 Capabilities - Method capabilities
<item>101 Log - General Logging
<item>102 Status - Inter-URI status reporting (login progress)
<item>200 URI Start - URI is starting acquire
<item>201 URI Done - URI is finished acquire
<item>400 URI Failure - URI has failed to acquire
<item>401 General Failure - Method did not like something sent to it
<item>402 Authorization Required - Method requires authorization
        to access the URI. Authorization is User/Pass
<item>403 Media Failure - Method requires a media change	
<item>600 URI Acquire - Request a URI be acquired
<item>601 Configuration - Sends the configuration space
<item>602 Authorization Credentials - Response to the 402 message
<item>603 Media Changed - Response to the 403 message
</list>

Only the 6xx series of status codes is sent TO the method. Furthermore
the method may not emit status codes in the 6xx range. The Codes 402
and 403 require that the method continue reading all other 6xx codes
until the proper 602/603 code is received. This means the method must be
capable of handling an unlimited number of 600 messages.

<p>
The flow of messages starts with the method sending out a 
<em>100 Capabilities</> and APT sending out a <em>601 Configuration</>.
After that APT begins sending <em>600 URI Acquire</> and the method
sends out <em>200 URI Start</>, <em>201 URI Done</> or 
<em>400 URI Failure</>. No synchronization is performed, it is expected
that APT will send <em>600 URI Acquire</> messages at -any- time and
that the method should queue the messages. This allows methods like http
to pipeline requests to the remote server. It should be noted however
that APT will buffer messages so it is not necessary for the method
to be constantly ready to receive them.
</sect>
                                                                  <!-- }}} -->
<!-- Header Fields	                                               {{{ -->
<!-- ===================================================================== -->
<sect>Header Fields

<p> 
The following is a short index of the header fields that are supported

<taglist>
<tag>URI<item>URI being described by the message
<tag>Filename<item>Location in the filesystem
<tag>Last-Modified<item>A time stamp in RFC1123 notation for use by IMS checks
<tag>IMS-Hit<item>The already existing item is valid
<tag>Size<item>Size of the file in bytes
<tag>Resume-Point<item>Location that transfer was started
<tag>MD5-Hash<item>Computed MD5 hash for the file
<tag>Message<item>String indicating some displayable message
<tag>Media<item>String indicating the media name required
<tag>Site<item>String indicating the site authorization is required for
<tag>User<item>Username for authorization
<tag>Password<item>Password for authorization
<tag>Fail<item>Operation failed
<tag>Drive<item>Drive the media should be placed in
<tag>Config-Item<item>
A string of the form <var>item</>=<var>value</> derived from the APT 
configuration space. These may include method specific values and general
values not related to the method. It is up to the method to filter out
the ones it wants.
<tag>Single-Instance<item>Requires that only one instance of the method be run
                          This is a yes/no value.
<tag>Pipeline<item>The method is capable of pipelining.
<tag>Local<item>The method only returns Filename: fields.
<tag>Send-Config<item>Send configuration to the method.
<tag>Needs-Cleanup<item>The process is kept around while the files it returned
are being used. This is primarily intended for CDROM and File URIs that need
to unmount filesystems.
<tag>Version<item>Version string for the method
</taglist>

This is a list of which headers each status code can use

<taglist>
<tag>100 Capabilities<item>
Displays the capabilities of the method. Methods should set the
pipeline bit if their underlying protocol supports pipelining. The
only known method that does support pipelining is http.
Fields: Version, Single-Instance, Pre-Scan, Pipeline, Send-Config, 
Needs-Cleanup

<tag>101 Log<item>
A log message may be printed to the screen if debugging is enabled. This
is only for debugging the method.
Fields: Message

<tag>102 Status<item>
Message gives a progress indication for the method. It can be used to show
pre-transfer status for Internet type methods.
Fields: Message

<tag>200 URI Start<item>
Indicates the URI is starting to be transfered. The URI is specified
along with stats about the file itself.
Fields: URI, Size, Last-Modified, Resume-Point

<tag>201 URI Done<item>
Indicates that a URI has completed being transfered. It is possible
to specify a <em>201 URI Done</> without a <em>URI Start</> which would
mean no data was transfered but the file is now available. A Filename
field is specified when the URI is directly available in the local 
pathname space. APT will either directly use that file or copy it into 
another location. It is possible to return Alt-* fields to indicate that
another possibility for the URI has been found in the local pathname space.
This is done if a decompressed version of a .gz file is found.
Fields: URI, Size, Last-Modified, Filename, MD5-Hash

<tag>400 URI Failure<item>
Indicates a fatal URI failure. The URI is not retrievable from this source.
As with <em>201 URI Done</> <em>200 URI Start</> is not required to precede
this message
Fields: URI, Message

<tag>401 General Failure<item>
Indicates that some unspecific failure has occurred and the method is unable
to continue. The method should terminate after sending this message. It 
is intended to check for invalid configuration options or other severe
conditions.
Fields: Message

<tag>402 Authorization Required<item>
The method requires a Username and Password pair to continue. After sending
this message the method will expect APT to send a <em>602 Authorization 
Credentials</> message with the required information. It is possible for
a method to send this multiple times.
Fields: Site

<tag>403 Media Failure<item>
A method that deals with multiple media requires that a new media be inserted.
The Media field contains the name of the media to be inserted.
Fields: Media, Drive

<tag>600 URI Acquire<item>
APT is requesting that a new URI be added to the acquire list. Last-Modified
has the time stamp of the currently cache file if applicable. Filename
is the name of the file that the acquired URI should be written to.
Fields: URI, Filename Last-Modified

<tag>601 Configuration<item>
APT is sending the configuration space to the method. A series of
Config-Item fields will be part of this message, each containing an entry
from the configuration space.
Fields: Config-Item.

<tag>602 Authorization Credentials<item>
This is sent in response to a <em>402 Authorization Required</> message. 
It contains the entered username and password.
Fields: Site, User, Password

<tag>603 Media Changed<item>
This is sent in response to a <em>403 Media Failure</> message. It
indicates that the user has changed media and it is safe to proceed.
Fields: Media, Fail
</taglist>

</sect>
                                                                  <!-- }}} -->
<!-- Method Notes		                                       {{{ -->
<!-- ===================================================================== -->
<sect>Notes

<p>
The methods supplied by the stock apt are:
<enumlist>
<item>cdrom - For Multi-Disc CDROMs
<item>copy - (internal) For copying files around the filesystem
<item>file - For local files
<item>gzip - (internal) For decompression
<item>http - For HTTP servers
</enumlist>

<p>
The two internal methods, copy and gzip, are used by the acquire code to
parallize and simplify the automatic decompression of package files as well 
as copying package files around the file system. Both methods can be seen to 
act the same except that one decompresses on the fly. APT uses them by
generating a copy URI that is formed identically to a file URI. The destination
file is send as normal. The method then takes the file specified by the 
URI and writes it to the destination file. A typical set of operations may
be:
<example>
http://foo.com/Packages.gz -> /bar/Packages.gz
gzip:/bar/Packages.gz -> /bar/Packages.decomp
rename Packages.decomp to /final/Packages
</example>

<p>
The http method implements a fully featured HTTP/1.1 client that supports
deep pipelining and reget. It works best when coupled with an apache 1.3
server. The file method simply generates failures or success responses with
the filename field set to the proper location. The cdrom method acts the same
except that it checks that the mount point has a valid cdrom in it. It does 
this by (effectively) computing a md5 hash of 'ls -l' on the mountpoint.

</sect>
                                                                  <!-- }}} -->

</book>