<!-- -*- mode: sgml; mode: fold -*- -->
<!doctype debiandoc  PUBLIC  "-//DebianDoc//DTD DebianDoc//EN">
<book>
<title>APT Files</title>

<author>Jason Gunthorpe <email>jgg@debian.org</email></author>
<version>$Id: files.sgml,v 1.12 2003/04/26 23:26:13 doogie Exp $</version>

<abstract>
This document describes the complete implementation and format of the
installed APT directory structure. It also serves as a guide to how APT
views the Debian archive.
</abstract>

<copyright>
Copyright &copy; Jason Gunthorpe, 1998-1999.
<p>
"APT" and this document are free software; you can redistribute them and/or
modify them under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 2 of the License, or (at your
option) any later version.

<p>
For more details, on Debian GNU/Linux systems, see the file
/usr/share/common-licenses/GPL for the full license.
</copyright>

<toc sect>

<chapt>Introduction
<!-- General		                                               {{{ -->
<!-- ===================================================================== -->
<sect>General

<p>
This document serves two purposes. The first is to document the installed
directory structure and the format and purpose of each file. The second
purpose is to document how APT views the Debian archive and deals with 
multiple package files.

<p>
The var directory structure is as follows:
<example>
  /var/lib/apt/
		lists/
		       partial/
		periodic/
		extended_states
		cdroms.list
  /var/cache/apt/
		  archives/
		          partial/
		  pkgcache.bin
		  srcpkgcache.bin
  /etc/apt/
	    sources.list.d/
	    apt.conf.d/
	    preferences.d/
	    trusted.gpg.d/
	    sources.list
	    apt.conf
	    preferences
	    trusted.gpg
  /usr/lib/apt/
	        methods/
			 bzip2
			 cdrom
			 copy
			 file
			 ftp
			 gpgv
			 gzip
			 http
			 https
			 lzma
			 rred
			 rsh
			 ssh
</example>

<p>
As specified in the FHS 2.1, /var/lib/apt is used for application
data that is not expected to be modified by the user. /var/cache/apt is used
for data that can be regenerated and is where the package cache and
downloaded .debs go. /etc/apt is where configuration takes place, and
/usr/lib/apt is where apt and other packages can place binaries for use
by the acquire system of APT.
</sect>
                                                                  <!-- }}} -->

<chapt>Files
<!-- Distribution Source List					       {{{ -->
<!-- ===================================================================== -->
<sect>Files and fragment directories in /etc/apt

<p>
Each file in /etc/apt modifies a specific aspect of APT. So that other
packages can ship the configuration they need, each of these files has
a fragment directory in which packages can place their own files instead of
modifying the main file. The main files are therefore considered to be
reserved for the user and are not touched by packages. This documentation
usually omits the fragment directories for readability, so every reference
to a main file means that file together with its fragment directory.

</sect>

<sect>Distribution Source list (sources.list)

<p>
The distribution source list is used to locate archives of the Debian
distribution. It is designed to support any number of active sources and
a mix of source media. The file lists one source per line, with the
fastest source listed first. The format of each line is:

<p>
<var>type uri args</var>

<p>
The first item, <var>type</var>, indicates the format of the remainder
of the line and the structure of the distribution the line refers to.
Currently the only defined value is <em>deb</em>,
which indicates a standard Debian archive with a dists directory.

<sect1>The deb Type
   <p>
   The <em>deb</em> type describes a typical two-level Debian archive,
   dists/<var>distribution</var>/<var>component</var>. Typically distribution
   is one of stable, unstable or testing, while component is one of main,
   contrib, non-free or non-us. The format of the deb line is as follows:

   <p>
   deb <var>uri</var> <var>distribution</var> <var>component</var> 
   [<var>component</var> ...]

   <p>
   <var>uri</var> for the <em>deb</em> type must specify the base of the
   Debian distribution. APT will automatically generate the longer
   URIs it needs to fetch information. <var>distribution</var> can also
   specify an exact path; in this case the components must be omitted and
   <var>distribution</var> must end in a slash.
   
   <p>
   Since only one distribution can be specified per deb line, it may be
   necessary to list several deb lines for the same URI. APT will
   sort the URI list after it has generated a complete set to allow
   connection reuse. It is important to order the entries in the source list
   from most preferred to least preferred (fastest to slowest).
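
   <p>
   The line format described above can be sketched as a small parser. This
   is an illustration only and not APT's implementation: the function name
   parse_deb_line and the layout of the returned dictionary are assumptions
   made for the example, and URIs that contain spaces (such as cdrom names)
   are deliberately not handled.
<example>
```python
def parse_deb_line(line: str) -> dict:
    """Split one 'deb' line into its fields (illustrative sketch only)."""
    # Whitespace splitting is a simplification; it breaks URIs with spaces.
    fields = line.split()
    if len(fields) in (0, 1, 2) or fields[0] != "deb":
        raise ValueError("expected: deb uri distribution [component ...]")
    uri, distribution, components = fields[1], fields[2], fields[3:]
    if distribution.endswith("/"):
        # Exact path form: the components must be omitted.
        if components:
            raise ValueError("an exact path takes no components")
    elif not components:
        raise ValueError("at least one component is required")
    return {"uri": uri, "distribution": distribution,
            "components": components}


entry = parse_deb_line("deb http://www.debian.org/archive stable main contrib")
print(entry["distribution"], entry["components"])
```
</example>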
</sect1>

<sect1>URI specification
<p> 
URIs in the source list support a large number of access schemes, which
are listed in the sources.list manpage and can be extended further by
transport binaries placed in /usr/lib/apt/methods. The most important
built-in schemes are:

<taglist>
<tag>cdrom<item>
   The cdrom scheme is special in that If-Modified-Since queries are never
   performed and that APT knows how to match a cdrom to the name it
   was given when first inserted. APT also knows all of the possible
   mount points of the cdrom drives and that the user should be prompted
   to insert a CD if it cannot be found. The path is relative to an
   arbitrary mount point (of APT's choosing) and must not start with a
   slash. The first pathname component is the given name; it is purely
   descriptive and of the user's choice. However, if the cdrom contains a
   file called '.disk/info' relative to its root, its contents will be used
   instead of prompting. The name serves as a tag for the cdrom and should
   be unique.
   <example>
   cdrom:Debian 1.3/debian
   </example>

<tag>http<item>
   This scheme specifies an HTTP server for the Debian archive. HTTP is
   preferred over FTP because If-Modified-Since queries against the Packages
   file are possible, as are deep pipelining and resuming of downloads.
   <example>
   http://www.debian.org/archive
   </example>

<tag>ftp<item>
   This scheme specifies an FTP connection to the server. FTP is limited
   because it has no support for If-Modified-Since (IMS) queries and is hard
   to proxy through firewalls.
   <example>
   ftp://ftp.debian.org/debian
   </example>

<tag>file<item>
   The file scheme allows an arbitrary directory in the file system to be
   treated as a Debian archive. This is useful for NFS mounts and
   local mirrors/archives.
   <example>
   file:/var/debian
   </example>
</taglist>
</sect1>

<sect1>Hashing the URI
<p>
All permanent information acquired from any of the sources is stored in the
lists directory. Thus, there must be a way to relate a filename in the
lists directory to a line in the source list. To keep things simple this is
done by dropping the access scheme, quoting the URI with _ treated as a
character that needs quoting, and then converting each / to _. As the URI
specification describes, quoting converts a sensitive character into %xx,
where xx is the hexadecimal value of the character in the ASCII character
set. Examples:

<example>
http://www.debian.org/archive/dists/stable/binary-i386/Packages 
/var/lib/apt/lists/www.debian.org_archive_dists_stable_binary-i386_Packages

cdrom:Debian 1.3/debian/Packages
/var/lib/apt/lists/Debian%201.3_debian_Packages
</example>

<p> 
The alternative that was considered was a deep directory
structure, but this poses two problems: it makes it very difficult to prune
directories when sources are no longer used, and it complicates the handling
of the partial directory. The flat naming scheme gives a very simple way to
deal with all of the situations that can arise. Also note that the rules
described in the <em>Downloads Directory</> section regarding the partial
subdirectory apply here as well.
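
<p>
The quoting rule can be sketched in a few lines of Python. This is a
minimal illustration that reproduces the two examples above, not APT's own
routine: the function name uri_to_list_name and the exact set of characters
left unquoted are assumptions made for the sketch.
<example>
```python
import re
import string

# Characters kept as-is; this set is an assumption made for the sketch.
SAFE = set(string.ascii_letters + string.digits + ".-+~")


def uri_to_list_name(uri: str) -> str:
    """Flatten a source URI into a file name for the lists directory."""
    # Drop the access scheme ("http://", "cdrom:", ...), as in the examples.
    rest = re.sub(r"^[A-Za-z][A-Za-z0-9+.-]*:(//)?", "", uri)
    out = []
    for byte in rest.encode("utf-8"):
        ch = chr(byte)
        if ch == "/":
            out.append("_")              # path separators become underscores
        elif ch in SAFE:
            out.append(ch)
        else:
            out.append("%%%02x" % byte)  # quote everything else, including "_"
    return "".join(out)


print(uri_to_list_name("cdrom:Debian 1.3/debian/Packages"))
# prints: Debian%201.3_debian_Packages
```
</example>

<p>
Because a literal _ is itself quoted (as %5f in this sketch), an underscore
in the resulting name always stands for a path separator, so the mapping
stays unambiguous.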
</sect1>

</sect>
                                                                  <!-- }}} -->
<!-- Extended Status						       {{{ -->
<!-- ===================================================================== -->
<sect>Extended States File (extended_states)

<p>
The extended_states file serves the same purpose as the normal dpkg status file
(/var/lib/dpkg/status) except that it stores information unique to apt.
Currently this is only the Auto-Installed flag, but the file is open to
storing further apt-specific data as the need arises. It duplicates nothing
from the normal dpkg status file. Please see other APT documentation for a
discussion of the exact internal behaviour of these fields. The Package and
Architecture fields are placed directly before the new fields to indicate
which package they apply to. The new fields are as follows:

<taglist>
<tag>Auto-Installed<item>
   The Auto-Installed flag can be 1 (Yes) or 0 (No) and records whether the
   package was automatically installed to satisfy a dependency or whether
   the user requested the installation.
</taglist>
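
<p>
As an illustration, the following Python sketch reads such a file and
collects the automatically installed packages. It assumes that stanzas are
separated by blank lines, as in the dpkg status file; the package names in
the sample and the function name auto_installed are hypothetical.
<example>
```python
def auto_installed(text: str) -> set:
    """Return (package, architecture) pairs whose Auto-Installed field is 1."""
    result = set()
    # Assumption: stanzas are separated by blank lines.
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields.get("Auto-Installed") == "1":
            result.add((fields.get("Package"), fields.get("Architecture")))
    return result


# Hypothetical sample content; the package names are made up.
SAMPLE = """\
Package: libexample1
Architecture: i386
Auto-Installed: 1

Package: example-tool
Architecture: i386
Auto-Installed: 0
"""

print(sorted(auto_installed(SAMPLE)))
```
</example>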
</sect>
                                                                  <!-- }}} -->
<!-- Binary Package Cache					       {{{ -->
<!-- ===================================================================== -->
<sect>Binary Package Cache (srcpkgcache.bin and pkgcache.bin)

<p>
Please see cache.sgml for a complete description of the cache file format.
The cache file is updated whenever the contents of the lists directory change.
If the cache is erased, corrupted or of a non-matching version it will
be automatically rebuilt by any tool that needs it.
<em>srcpkgcache.bin</> contains a cache of all of the package files in the
source list. When only the status files change, the cache can be
regenerated from this prebuilt version for greater speed.
</sect>
                                                                  <!-- }}} -->
<!-- Downloads Directory					       {{{ -->
<!-- ===================================================================== -->
<sect>Downloads Directory (archives)

<p>
The archives directory is where all downloaded .deb archives go. When the
file transfer is initiated the deb is placed in partial. Once the file
is fully downloaded and its MD5 hash and size are verified it is moved
from partial into archives/. Any files found in archives/ can be assumed 
to be verified.
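
<p>
The verify-then-move step can be sketched as follows. This is a simplified
illustration under stated assumptions, not APT's acquire code: the function
name accept_download and its parameters are hypothetical, and the expected
hash and size are assumed to come from the package lists.
<example>
```python
import hashlib
import os


def accept_download(partial_path: str, archives_dir: str,
                    expected_md5: str, expected_size: int) -> str:
    """Move a finished download out of partial/ once size and MD5 match."""
    if os.path.getsize(partial_path) != expected_size:
        raise ValueError("size mismatch")
    digest = hashlib.md5()
    with open(partial_path, "rb") as handle:
        # Hash in chunks so large .debs are not read into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_md5.lower():
        raise ValueError("MD5 mismatch")
    target = os.path.join(archives_dir, os.path.basename(partial_path))
    # A rename is atomic when both directories are on the same filesystem.
    os.replace(partial_path, target)
    return target
```
</example>

<p>
Moving the file only after both checks succeed is what allows anything
found in archives/ to be treated as verified.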

<p>
No directory structure is transferred from the remote site and all .deb
file names conform to Debian conventions. No short (msdos) filenames should
be placed in archives. If the need arises, .debs should be unpacked, scanned
and renamed to their correct internal names. This is mostly to prevent
file name conflicts, but other programs may rely on it where convenient.
A conforming .deb name has the form name_version_arch.deb. Our archive
scripts do not handle epochs, but they are necessary and should be re-inserted.
If necessary, _'s and :'s in the fields should be quoted using the % convention.
It must be possible to extract all 3 fields by examining the file name.
Downloaded .debs must be found in one of the package lists with an exact
name + version match.
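
<p>
The naming convention can be sketched as a pair of helper functions. The
names quote_field, deb_file_name and split_deb_file_name are hypothetical,
as is the version string used below; quoting the % character itself is an
extra assumption made here so that names round-trip cleanly.
<example>
```python
from urllib.parse import unquote


def quote_field(value: str) -> str:
    """Quote '%', '_' and ':' so that '_' can safely separate the fields."""
    out = value.replace("%", "%25")   # assumption: '%' itself is quoted too
    return out.replace("_", "%5f").replace(":", "%3a")


def deb_file_name(name: str, version: str, arch: str) -> str:
    """Build a name of the form name_version_arch.deb."""
    parts = (name, version, arch)
    return "_".join(quote_field(part) for part in parts) + ".deb"


def split_deb_file_name(file_name: str) -> tuple:
    """Recover (name, version, arch) from a conforming file name."""
    if not file_name.endswith(".deb"):
        raise ValueError("not a .deb file name")
    parts = file_name[:-len(".deb")].split("_")
    if len(parts) != 3:
        raise ValueError("expected name_version_arch.deb")
    return tuple(unquote(part) for part in parts)


print(deb_file_name("apt", "1:0.3.2", "i386"))
# prints: apt_1%3a0.3.2_i386.deb
```
</example>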
</sect>
                                                                  <!-- }}} -->
<!-- The Methods Directory					       {{{ -->
<!-- ===================================================================== -->
<sect> The Methods Directory (/usr/lib/apt/methods)

<p>
The Methods directory is more fully described in the APT Methods interface
document.
</sect>
                                                                  <!-- }}} -->
<!-- The Configuration File					       {{{ -->
<!-- ===================================================================== -->
<sect> The Configuration File (/etc/apt/apt.conf)

<p>
The configuration file (and the associated fragments directory
/etc/apt/apt.conf.d/) is described in the apt.conf manpage.
</sect>
                                                                  <!-- }}} -->
<!-- The trusted.gpg File					       {{{ -->
<!-- ===================================================================== -->
<sect> The trusted.gpg File (/etc/apt/trusted.gpg)

<p>
The trusted.gpg file (together with the files in the associated fragments
directory /etc/apt/trusted.gpg.d/) is a binary file containing the keyring
used by apt to validate that the information it downloads (e.g. the Release
file) really comes from the distributor it claims to come from and is
unmodified. It is therefore the last step in the chain of trust between
the archive and the end user. This security system is described in the
apt-secure manpage.
</sect>
                                                                  <!-- }}} -->
<!-- The Release File						       {{{ -->
<!-- ===================================================================== -->
<sect> The Release File

<p>
This file plays an important role in how APT presents the archive to the
user. Its main purpose is to present a descriptive name for the source
of each version of each package. It is also used to detect when new versions
of Debian are released. It augments the Packages file it is associated with
by providing meta information about the entire archive which the Packages
file describes.

<p>
The full name of the distribution for presentation to the user is formed
as 'label version archive', with a possible extended name being 
'label version archive component'.

<p>
The file uses the same format as the Packages file (RFC-822), with the
following tags defined:

<taglist>
<tag>Archive<item>
This is the common name we give our archives, such as <em>stable</> or
<em>unstable</>.

<tag>Component<item>
Refers to the sub-component of the archive, <em>main</>, <em>contrib</>
etc. Component may be omitted if there are no components for this archive.

<tag>Version<item>
This is a version string with the same properties as in the Packages file.
It represents the release level of the archive.

<tag>Origin<item>
This specifies who is providing this archive. In the case of Debian the
string will read 'Debian'. Other providers may use their own string.

<tag>Label<item>
This carries the encompassing name of the distribution. For Debian proper
this field reads 'Debian'. For derived distributions it should contain their 
proper name.

<tag>Architecture<item>
When the archive has packages for a single architecture then the Architecture
is listed here. If a mixed set of systems are represented then this should
contain the keyword <em>mixed</em>.

<tag>NotAutomatic<item>
A Yes/No flag indicating that the archive is extremely unstable and its
versions should never be automatically selected. This is intended for use by
experimental.

<tag>Description<item>
Description is used to describe the release. For instance experimental would
contain a warning that the packages have problems.
</taglist>

<p>
The location of the Release file in the archive is very important: it must
be placed in the same location as the Packages file so that it can be
found in all situations. The following is an example for the current stable
release, 1.3.1r6:

<example>
Archive: stable
Component: main
Version: 1.3.1r6
Origin: Debian
Label: Debian
Architecture: i386
</example>

This is an example for experimental:
<example>
Archive: experimental
Version: 0
Origin: Debian
Label: Debian
Architecture: mixed
NotAutomatic: Yes
</example>

And for unstable:
<example>
Archive: unstable
Component: main
Version: 2.1
Origin: Debian
Label: Debian
Architecture: i386
</example>
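
<p>
To illustrate how the presentation name described earlier is derived from
these fields, here is a minimal Python sketch. The function names
parse_release and display_name are hypothetical, and the parser handles
only simple one-line fields, not continuation lines.
<example>
```python
def parse_release(text: str) -> dict:
    """Parse simple 'Tag: value' lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def display_name(fields: dict, extended: bool = False) -> str:
    """Form 'label version archive', optionally followed by the component."""
    keys = ["Label", "Version", "Archive"]
    if extended:
        keys.append("Component")
    # Tags that are absent (e.g. Component) are simply skipped.
    return " ".join(fields[key] for key in keys if key in fields)


STABLE = """\
Archive: stable
Component: main
Version: 1.3.1r6
Origin: Debian
Label: Debian
Architecture: i386
"""

print(display_name(parse_release(STABLE), extended=True))
# prints: Debian 1.3.1r6 stable main
```
</example>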

</sect>
                                                                  <!-- }}} -->

</book>