diff options
Diffstat (limited to 'doc/acquire-additional-files.md')
-rw-r--r-- | doc/acquire-additional-files.md | 325 |
1 files changed, 325 insertions, 0 deletions
diff --git a/doc/acquire-additional-files.md b/doc/acquire-additional-files.md new file mode 100644 index 000000000..30c83ef25 --- /dev/null +++ b/doc/acquire-additional-files.md @@ -0,0 +1,325 @@ +# Acquire additional files in 'update' operations + +The download and verification of data from multiple sources in different +compression formats, with partial downloads and patches is an involved +process which is hard to implement correctly and securely. + +APT front-ends share the code and binaries to make this happen in libapt +with the Acquire system, supported by helpers shipped in the apt package +itself and additional transports in individual packages like +`apt-transport-https`. + +For its own operation libapt needs or can make use of *Packages*, *Sources* +and *Translation-* files, which it will acquire by default, but +a repository might contain more data files (e.g. `Contents`) a front-end +(e.g. `apt-file`) might want to use and would therefore need to be +downloaded as well. + +This file describes the configuration scheme such a front-end can use to +instruct the Acquire system to download those additional files. + +# The Configuration Stanza + +The Acquire system uses the same configuration settings to implement the +files it downloads by default. These settings are the default, but if +they would be written in a configuration file the configuration +instructing the Acquire system to download the *Packages* files would look +like this (see also `apt.conf(5)` manpage for configuration file syntax): + + Acquire::IndexTargets::deb::Packages { + MetaKey "$(COMPONENT)/binary-$(ARCHITECTURE)/Packages"; + ShortDescription "Packages"; + Description "$(RELEASE)/$(COMPONENT) $(ARCHITECTURE) Packages"; + + flatMetaKey "Packages"; + flatDescription "$(RELEASE) Packages"; + + Optional "no"; + }; + +All files which should be downloaded (nicknamed *Targets*) are mentioned +below the `Acquire::IndexTargets` scope. `deb` is here the type of the +`sources.list` entry the file should be acquired for. The only other +supported value is hence `deb-src`. Beware: You can't specify multiple +types here and you can't download the same (evaluated) `MetaKey` from +multiple types! + +After the type you can pick any valid and unique string which preferable +refers to the file it downloads (In the example we picked *Packages*). +This string is used as identifier (if not explicitly set otherwise) for +the target class and accessible as `Identifier` and `Created-By` e.g. +in the `apt-get indextargets` output as detailed below. The identifier +is also used to allow user to enable/disable targets per sources.list +entry. + +All targets have three main properties you can define: + +* `MetaKey`: The identifier of the file to be downloaded as used in the + Release file. It is also the relative location of the file from the + Release file. You can neither download from a different server + entirely (absolute URI) nor access directories above the Release file + (e.g. "../../"). +* `ShortDescription`: Very short string intended to be displayed to the + user e.g. while reporting progress. apt will e.g. use this string in + the last line to indicate progress of e.g. the download of a specific + item. +* `Description`: A preferable human understandable and readable identifier + of which file is acquired exactly. Mainly used for progress reporting + and error messages. apt will e.g. use this string in the Get/Hit/Err + progress lines. + An identifier of the site accessed as seen in the sources.list (e.g. + `http://example.org/debian` or `file:/path/to/a/repository`) is + automatically prefixed for this property. + + +Additional optional properties: + +* `Identifier`: The default value is the unique string identifying this + file (in the example above it was *Packages*) also accessible as + `Created-By`. The difference is that using this property multiple files + can be subsumed under one identifier e.g. if you configure multiple + possible locations for the files (with `Fallback-Of`), but the front-end + doesn't need to handle files from the different locations differently. +* `DefaultEnabled`: The default value is `yes` which means that apt will + try to acquire this target from all sources. If set to `no` the user + has to explicitly enable this target in the sources.list file with the + `Targets` option(s) – or override this value in a config file. +* `Optional`: The default value is `yes` and should be kept at this value. + If enabled the acquire system will skip the download if the file isn't + mentioned in the `Release` file. Otherwise this is treated as a hard + error and the update process fails. Note that failures while + downloading (e.g. 404 or hash verification errors) are failures, + regardless of this setting. +* `KeepCompressed`: The default is the value of `Acquire::GzipIndexes`, + which defaults to `false`. If `true`, the acquire system will keep the + file compressed on disk rather than extract it. If your front-end can't + deal with compressed files transparently you have to explicitly set + this option to `false` to avoid problems with users setting the option + globally. On the other hand, if you set it to `true` or don't set it you + have to ensure your front-end can deal with all compressed fileformats + supported by apt (libapt users can e.g. use `FileFd`, others can use + the `cat-file` command of `/usr/lib/apt/apt-helper`). +* `Fallback-Of`: Is by default not set. If it is set and specifies another + target name (see `Created-By`) which was found in the *Release* file the + download of this target will be skipped. This can be used to implement + fallback(chain)s to allow transitions like the rename of target files. + The behavior if cycles are formed with Fallback-Of is undefined! +* `flatMetaKey`, `flatDescription`: APT supports two types of repositories: + dists-style repositories which are the default and by far the most + common which are named after the fact that the files are in an + elaborated directory structure. In contrast a flat-style repository + lumps all files together in one directory. Support for these flat + repositories exists mainly for legacy purposes only. It is therefore + recommend to not set these values. + + +The acquire system will automatically choose to download a compressed +file if it is available and uncompress it for you, just as it will also +use PDiff patching if provided by the repository and enabled by the +user. You only have to ensure that the Release file contains the +information about the compressed files/PDiffs to make this happen. +**NO** properties have to be set to enable this! + + +More properties exist, but these should **NOT** be set by front-ends +requesting files. They exist for internal and end-user usage only. +Some of these are – which are documented here only to ensure that they +aren't accidentally used by front-ends: + +* `PDiffs`: controls if apt will try to use PDiffs for this target. + Defaults to the value of `Acquire::PDiffs` which is *true* by default. + Can be overridden per-source by the sources.list option of the same + name. See the documentation for both of these for details. +* `By-Hash`: controls if apt will try to use an URI constructed from + a hashsum of the file to download. See the documentation for config + option `Acquire::By-Hash` and sources.list option `By-Hash` for details. +* `CompressionTypes`: The default value is a space separated list of + compression types supported by apt (see `Acquire::CompressionTypes`). + You can set this option to prevent apt from downloading a compression + type a front-end can't open transparently. This should always be + a temporary workaround through and a bug should be reported against + the front-end in question. +* `KeepCompressedAs`: The default value is a space separated list of + compression types supported by apt (see previous option) which is + sorted by the cost-value of the compression in ascending order, + except that cost=0 "compressions" (like uncompressed) are listed last. + + +# More examples + +The stanzas for `Translation-*` files as well as for `Sources` files would +look like this: + +Acquire::IndexTargets { + deb::Translations { + MetaKey "$(COMPONENT)/i18n/Translation-$(LANGUAGE)"; + ShortDescription "Translation-$(LANGUAGE)"; + Description "$(RELEASE)/$(COMPONENT) Translation-$(LANGUAGE)"; + + flatMetaKey "$(LANGUAGE)"; + flatDescription "$(RELEASE) Translation-$(LANGUAGE)"; + }; + + deb-src::Sources { + MetaKey "$(COMPONENT)/source/Sources"; + ShortDescription "Sources"; + Description "$(RELEASE)/$(COMPONENT) Sources"; + + flatMetaKey "Sources"; + flatDescription "$(RELEASE) Sources"; + + Optional "no"; + }; +}; + +# Substitution variables + +As seen in the examples, properties can contain placeholders filled in +by the acquire system. The following variables are known; note that +unknown variables have no default value nor are they touched: They are +printed as-is. + +* `$(RELEASE)`: This is usually an archive- or codename, e.g. *stable* or + *stretch*. Note that flat-style repositories do not have an archive- + or codename per-se, so the value might very well be just "/" or so. +* `$(COMPONENT)`: as given in the sources.list, e.g. *main*, *non-free* or + *universe*. Note that flat-style repositories again do not really + have a meaningful value here. +* `$(LANGUAGE)`: Values are all entries (expect *none*) of configuration + option `Acquire::Languages`, e.g. *en*, *de* or *de_AT*. +* `$(ARCHITECTURE)`: Values are all entries of configuration option + `APT::Architectures` (potentially modified by sources.list options), + e.g. *amd64*, *i386* or *armel* for the *deb* type. In type *deb-src* + this variable has the value *source*. +* `$(NATIVE_ARCHITECTURE)`: The architecture apt treats as the native + architecture for this system configured as `APT::Architecture` + defaulting to the architecture apt itself was built for. + +Note that while more variables might exist in the implementation, these +are to be considered undefined and their usage strongly discouraged. If +you have a need for other variables contact us. + +# Accessing files + +Do **NOT** hardcode specific file locations, names or compression types in +your application! You will notice that the configuration options give +you no choice over where the downloaded files will be stored. This is by +design so multiple applications can download and use the same file +rather than each and every one of them potentially downloads and uses +its own copy somewhere on disk. + +`apt-get indextargets` can be used to get the location as well as other +information about all files downloaded (aka: you will see *Packages*, +*Sources* and *Translation-* files here as well). Provide a line of the +default output format as parameter to filter out all entries which do +not have such a line. With `--format`, you can further more define your +own output style. The variables are what you see in the output, just all +uppercase and wrapped in `$()`, as in the configuration file. + +To get all the filenames of all *Translation-en* files you can e.g. call: + + apt-get indextargets --format '$(FILENAME)' "Identifier: Translations" "Language: en" + +The line-based filtering and the formatting is rather crude and feature- +less by design: The default format is Debian's standard format `deb822` +(in particular: Field names are case-insensitive and the order of fields +in the stanza is undefined), so instead of apt reimplementing powerful +filters and formatting for this command, it is recommend to use piping +and dedicated tools like `grep-dctrl` if you need more than the basics +provided. + +Accessing this information via libapt is done by reading the +sources.lists (`pkgSourceList`), iterating over the metaIndex objects this +creates and calling `GetIndexTargets()` on them. See the source code of +`apt-get indextargets` for a complete example. + +Note that by default targets are not listed if they weren't downloaded. +If you want to see all targets, you can use the `--no-release-info`, which +also removes the *Codename*, *Suite*, *Version*, *Origin*, *Label* and *Trusted* +fields from the output as these also display data which needs to be +downloaded first and could hence be inaccurate [on the pro-side: This +mode is faster as it doesn't require a valid binary cache to operate]. +The most notable difference perhaps is in the *Filename* field through: By +default it indicates an existing file, potentially compressed (Hint: +libapt users can use `FileFd` to open compressed files transparently). In +the `--no-release-info` mode the indicated file doesn't need to exist and +it will always refer to an uncompressed file, even if the index would be +(or is) stored compressed. + +Remarks on fields only available in (default) `--release-info mode`: + +* `Trusted`: Denotes with a *yes* or *no* if the data in this file is + authenticated by a trust chain rooted in a trusted gpg key. You should + be careful with untrusted data and warn the user if you use it. +* `Codename`, `Suite`, `Version`, `Origin` and `Label` are fields from the + *Release* file, are only present if they are present in the *Release* file + and contain the same data. + +Remarks on other available fields: + +* `MetaKey`, `ShortDesc`, `Description`, `Site`, `Release`: as defined + by the configuration and described further above. +* `Identifier`: Defaults to the value of `Created-By`, but can be set + explicitly in the configuration (see above). Prefer this field over + `Created-By` to subsume multiple file(location)s (see `Fallback-Of`). +* `Created-By`: configuration entity responsible for this target +* `Target-Of`: type of the sources.list entry +* `URI`, `Repo-URI`: avoid using. Contains potentially username/password. + Prefer `Site`, especially for display. +* `Optional`, `DefaultEnabled`, `KeepCompressed`: Decode the options of the + same name from the configuration. +* `Language`, `Architecture`, `Component`: as defined further above, but with + the catch that they might be missing if they don't effect the target + (aka: They weren't used while evaluating the `MetaKey` template). + +Again, additional fields might be visible in certain implementations, +but you should avoid using them and instead talk to us about a portable +implementation. + +# Multiple applications requiring the same files + +It is highly encouraged that applications talk to each other and to us +about which files they require. It is usually best to have a common +package ship the configuration needed to get the files, but specific +needs might require specific solutions. Again: **talk to us**. + +Bad things will happen if multiple front-ends request the same file(s) +via different targets, which is another reason why coordination is very +important! + +# Acquiring files not mentioned in the Release file + +You can't. This is by design as these files couldn't be verified to not +be modified in transit, corrupted by the download process or simple if +they are present at all on the server, which would require apt to probe +for them. APT did this in the past for legacy reasons, we do not intend +to go back to these dark times. + +This is also why you can't request files from a different server. It +would have the additional problem that this server might not even be +accessible (e.g. proxy settings) or that local sources (file:/, cdrom:/) +start requesting online files… + +In other words: We would be opening Pandora's box. + +# Acquiring files to a specific location on disk + +You can't by design to avoid multiple front-ends requesting the same file +to be downloaded to multiple different places on (different) disks +(among other reasons). See the next point for a solution if you really +have to force a specific location by creating symlinks. + +# Post processing the acquired files + +You can't modify the files apt has downloaded as apt keeps state with +e.g. the modification times of the files and advanced features like +PDiffs break. + +You can however install an `APT::Update::Post-Invoke{-Success,}` hook +script and use them to copy (modified) files to a different location. +Use `apt-get indextargets` (or similar) to get the filenames – do not +look into `/var/lib/apt/lists` directly! + +Please avoid time consuming calculations in the scripts and instead just +trigger a background task as there is little to no feedback for the user +while hook scripts run. |