<!doctype debiandoc system>
<debiandoc>
<book>
<titlepag>
<title>The APT project design document</title>
<author>
<name>Manoj Srivastava</name>
<email>srivasta@debian.org</email>
</author>
<version>$Id: design.sgml,v 1.2 2001/04/04 05:00:15 jgg Exp $</version>
<abstract>
This document is an overview of the specifications and design goals of the APT project. It also attempts to give a broad description of the implementation.
</abstract>
<copyright>
<copyrightsummary>Copyright © 1997 Manoj Srivastava</copyrightsummary>
<p>APT, including this document, is free software; you may redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.</p>
<p>This is distributed in the hope that it will be useful, but <em>without any warranty</em>; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.</p>
<p>You should have received a copy of the GNU General Public License with your Debian GNU/Linux system, in <tt>/usr/doc/copyright/GPL</tt>, or with the <prgn/debiandoc-sgml/ source package as the file <tt>COPYING</tt>. If not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.</p>
</copyright>
</titlepag>
<chapt id="introduction">
<heading>Introduction</heading>
<p>APT is supposed to be a replacement for dselect, and not a replacement for dpkg. However, since additional functionality has been required for APT, and given that this functionality is very closely related to dpkg, it is not unreasonable to expect that additional functionality in the underlying dpkg would also be requested.</p>
<p>Deity/dselect are the first introduction that people have to Debian, and unfortunately this first impression contributes greatly to the public perception of the distribution.
It is imperative that this be a showcase for Debian, rather than something that frightens novices away (an accusation often levelled at the current system).</p>
</chapt>
<chapt>
<heading>Requirements</heading>
<p>
<enumlist compact="compact">
<item>
<p>APT should be a replacement for dselect. Therefore it should have all the functionality that dselect currently has. This is the primary means of interaction between the user and the package management system, and it should be able to handle all tasks involved in installing, upgrading, and routine management without having the users take recourse to the underlying management system.</p>
</item>
<item>
<p>It should be easier to use and less confusing for novice users. The primary stimulus for the creation of APT was the perceived intractability, complexity, and non-intuitive behavior of the existing user interface, and as such, human factors must be a primary mandate of APT.</p>
</item>
<item>
<p>It should be able to group packages more flexibly, and possibly allow operations based on a group. One should be able to select, or deselect, a coherent group of related packages simultaneously, allowing one to add, remove, or upgrade functionality on a machine in one step.</p>
</item>
<item>
<p>This would allow APT to handle <em>standard installations</em>, namely, one could then install a set of packages to enable a machine to fulfill specific tasks. Define a few standard installations, and which packages are included therein. The packages should be internally consistent.</p>
</item>
<item>
<p>Make use of a keywords field in package headers; provide a standard list of keywords for people to use. This could be the underpinning that allows the previous two requirements to work (though the developers are not constrained to implement the previous requirements using keywords).</p>
</item>
<item>
<p>Use dependencies, conflicts, and reverse dependencies to properly order packages for installation and removal.
It has been a complaint in the past that the installation methods do not really understand dependencies, causing the upgrade process to break, or allowing the removal of packages in a way that left the system in an untenable state by breaking the dependencies of packages that depended on the package being removed. A special emphasis is placed on handling pre-dependencies correctly; the target of a pre-dependency has to be fully configured before attempting to install the pre-dependent package. Also, <em>configure immediately</em> requests mentioned below should be handled.</p>
</item>
<item>
<p>Handle replacement of a package providing a virtual package with another (for example, it has been very difficult replacing <prgn>sendmail</prgn> with <prgn>smail</prgn>, or vice versa), making sure that the dependencies are still satisfied.</p>
</item>
<item>
<p>Handle source lists for updates from multiple sources. APT should also be able to handle diverse methods of acquiring new packages; the local filesystem, mountable CD-ROM drives, and FTP-accessible repositories are some of the methods that come to mind. Also, the source lists can be separated into categories, such as main, contrib, non-us, non-local, non-free, my-very-own, etc. APT should be set up to retrieve the Packages files from these multiple source lists, as well as retrieving the packages themselves.</p>
</item>
<item>
<p>Handle a base source location and acquire all Packages files underneath it (possibly selecting based on architecture). This should be a simple extension of the previous requirement.</p>
</item>
<item>
<p>Handle remote installation (possibly to be implemented in a future version; it still needs to be designed). This would ease the burden of maintaining multiple Debian machines on a site. In the author's opinion this is a killer difference for the distribution, though it may be too hard a problem to be implemented with the initial version of APT.
However, some thought must be given to this to enable APT to retain hooks for future functionality, or at least to refrain from methods that may preclude remote activity. It is desirable that adding remote installation not require a redesign of APT from the ground up.</p>
</item>
<item>
<p>Be scalable. Dselect worked a lot better with 400 packages, but at last count the number of packages was around twelve hundred and climbing. This also requires APT to pay attention to the needs of small machines which are low on memory (though this requirement shall diminish as we move towards bigger machines, it would still be nice if Debian worked on all old machines where Linux itself would work).</p>
</item>
<item>
<p>Handle install immediately requests. Some packages, like watchdog, are required to be working for the stability of the machine itself. There are others which may be required for the correct functioning of a production machine, or which are mission critical applications. APT should, in these cases, upgrade the packages with minimal downtime; allowing these packages to be one of potentially hundreds of packages being upgraded concurrently may not satisfy the requirements of the package or the site. (Watchdog, for example, if not restarted quickly, may cause the machine to reboot in the midst of installation, which may cause havoc on the machine.)</p>
</item>
</enumlist>
</p>
</chapt>
<chapt>
<heading>Procedural description</heading>
<p><taglist>
<tag>Set Options</tag>
<item>
<p>This process handles setting of user or site options, and configuration of all aspects of APT. It allows the user to set the location and order of package sources, and to set up source list details such as FTP site locations, passwords, etc. Display options may also be set.</p>
</item>
<tag>Updates</tag>
<item>
<p>Build a list of available packages, using source lists or a base location and trawling for Packages files (needs to be aware of architecture).
This may involve finding and retrieving Packages files, storing them locally for efficiency, and parsing the data for later use. This would entail contacting various underlying access modules (FTP, CD-ROM mounts, etc.). Use a backing store for speed. This may also require downloading the actual package files locally for speed.</p>
</item>
<tag>Local status</tag>
<item>
<p>Build up a list of packages already installed. This requires reading and writing the local?? status file. For remote installation, this should probably use mechanisms similar to those used for Packages file retrieval. Use the backing store for speed. One should consider multiple backing stores, one for each machine.</p>
</item>
<tag>Relationship determination</tag>
<item>
<p>Determine forward and reverse dependencies. All known dependency fields should be acted upon, since it is fairly cheap to do so. Update the backing store with this information.</p>
</item>
<tag>Selection</tag>
<item>
<p>Present the data to the user. Look at Behan Webster's documentation for the user interface procedures. (Note: In the author's opinion deletions and reverse dependencies should also be presented to the user, in a strictly symmetric fashion; this may make it easier to prevent the removal of a package when that removal would break dependencies.)</p>
</item>
<tag>Ordering of package installations and configuration</tag>
<item>
<p>Build a list of events. A simple topological sort gives the packages in dependency order. At certain points in this ordering, pre-dependencies and immediate configure directives cause a break in the normal ordering. We need to insert the uninstall/purge directive in the stream (default: as early as possible).</p>
</item>
<tag>Action</tag>
<item>
<p>Take the order of installations and removals and build up a stream of events to send to the packaging system (dpkg). Execute the list of events if successful. Do not partially install packages and leave the system in a broken state.
Go to the Selection step as needed.</p>
</item>
</taglist>
</p>
</chapt>
<chapt>
<heading>Modules and interfaces</heading>
<p><taglist>
<tag>The user interface module</tag>
<item>
<p>Look at Behan Webster's documentation.</p>
</item>
<tag>Widget set</tag>
<item>
<p>Closely related to the above. Could someone present the design decisions of the widget set here?</p>
</item>
<tag>Update Module</tag>
<item>
<p>Distinct versions of the same package are recorded separately, but if multiple Packages files contain the same version of a package, then only the first one is recorded. For this reason, the least expensive update source should be listed first (the local file system is better than a remote FTP site).</p>
<p>This module should interact with the user interface module to set and change configuration parameters for the modules listed below. It needs to record that information in an on-disk data file, to be read on future invocations.</p>
<p><enumlist>
<item>
<p>FTP methods</p>
</item>
<item>
<p>mount and file traversal module(s)?</p>
</item>
<item>
<p>Other methods ???</p>
</item>
</enumlist>
</p>
</item>
<tag>Status file parser/generator</tag>
<item>
<p>The status file records the current state of the system, listing the packages installed, etc. The status file is also one method of communicating with dpkg, since it is perfectly permissible for the user to use APT to request that packages be updated, put others on hold, mark others for removal, etc., and then run <tt>dpkg -BORGiE</tt> on a file system.</p>
</item>
<tag>Package file parser/generator</tag>
<item>
<p>Related to the above. Handle multiple Packages files, from different sources. Each package contains a link back to the Packages file structure that contains details about the origin of the data.</p>
</item>
<tag>Dependency module</tag>
<item>
<p><list>
<item>
<p>dependency/conflict determination and linking</p>
</item>
<item>
<p>reverse dependency generator.
May be merged with the above.</p>
</item>
</list>
</p>
</item>
<tag>Package ordering Module</tag>
<item>
<p>Create an ordering of the actions to be taken.</p>
</item>
<tag>Event generator</tag>
<item>
<p>Module to interact with dpkg.</p>
</item>
</taglist>
</p>
</chapt>
<chapt>
<heading>Data flow and conversions analysis</heading>
<p>
<example>
                                                          ____________
                                                      __\|ftp modules|
                                                     /  /|___________|
                                    _ ____________  /     ________________
                                   |    update   | /     |mount/local file|
       |==========================>|   module    |/_____\|  traversals    |
       |                           |_____________|      /|________________|
       |                            ^        ^
       |                            |        |               ______________
 ______|_______     _ _____ ______  |   _____v________      \|            |
|Configuration |   |configuration|  |  |Packages Files|   ===|Status file |
|  module      |<=>|    data     |  |  |______________|  /  /|____________|
|______________|   |_____________|  |        ^          /
       ^                            |        |         /
       |                            | _______v_______|/_
       |                            | |              |    ________________
       |                            | |              |/_\|   Dependency  |
       |                            | |backing store |\ /|     Module    |
       |                            | |______________|  _|_______________|
       |                             \       ^          /|        ^
       |                              \      |         /          |
       |                              _\|____v_______|/__     ____v_______
       |_____________________________\| User interaction|    |    dpkg   |
                                     /|_________________|<==>|  Invoker  |
                                                             |___________|
</example>
</p>
<p>dpkg also interacts with the status and available files.</p>
<p>The backing store and the associated data structures are the core of APT. All modules essentially revolve around the backing store, feeding it data, adding and manipulating links and relationships between data in the backing store, allowing the user to interact with and modify the data in the backing store, and finally writing it out as the status file and possibly issuing directives to dpkg.</p>
<p>The other focal point for APT is the user interface.</p>
</chapt>
</book>
</debiandoc>