Age | Commit message (Collapse) | Author |
|
Downloading and storing are two different operations were different
compression types can be preferred. For downloading we provide the
choice via Acquire::CompressionTypes::Order as there is a choice to
be made between download size and speed – and limited by whats available
in the repository.
Storage on the other hand has all compressions currently supported by
apt available and to reduce runtime of tools accessing these files the
compression type should be a low-cost format in terms of decompression.
apt traditionally stores its indexes uncompressed on disk, but has
options to keep them compressed. Now that apt downloads additional files
we also deal with files which simply can't be stored uncompressed as
they are just too big (like Contents for apt-file). Traditionally they
are downloaded in a low-cost format (gz) as repositories do not provide
other formats, but there might be even lower-cost formats and for
download we could introduce higher-cost in the repositories.
Downloading an entire index potentially requires recompression to
another format, so an update takes potentially longer – but big files
are usually updated via pdiffs which has to de- and re-compress anyhow
and does it on the fly anyhow, so there is no extra time needed and in
general it seems to be benefitial to invest the time in update to save
time later on file access.
|
|
There is no reason to enforce that the file we start the bootstrap with
is compressed with a compressor which is available online. This allows
us to change the on-disk format as well as deals with repositories
adding/removing support for a specific compressor.
|
|
This doesn't allow all tests to run cleanly, but it at least allows to
write tests which could run successfully in such environments.
Git-Dch: Ignore
|
|
The wrapping will fail in the best case and actually end up deleting
/dev/null in the worst case. Given that there is no point in trying to
write atomically to /dev/null as you can't read from it again just
ignore these flags if higher level code ends up trying to use them on
/dev/null.
Git-Dch: Ignore
|
|
Based on a discussion with Niels Thykier who asked for Contents-all this
implements apt trying for all architecture dependent files to get a file
for the architecture all, which is treated internally now as an official
architecture which is always around (like native). This way arch:all
data can be shared instead of duplicated for each architecture requiring
the user to download the same information again and again.
There is one problem however: In Debian there is already a binary-all/
Packages file, but the binary-any files still include arch:all packages,
so that downloading this file now would be a waste of time, bandwidth
and diskspace. We therefore need a way to decide if it makes sense to
download the all file for Packages in Debian or not. The obvious answer
would be a special flag in the Release file indicating this, which would
need to default to 'no' and every reasonable repository would override
it to 'yes' in a few years time, but the flag would be there "forever".
Looking closer at a Release file we see the field "Architectures", which
doesn't include 'all' at the moment. With the idea outlined above that
'all' is a "proper" architecture now, we interpret this field as being
authoritative in declaring which architectures are supported by this
repository. If it says 'all', apt will try to get all, if not it will be
skipped. This gives us another interesting feature: If I configure a
source to download armel and mips, but it declares it supports only
armel apt will now print a notice saying as much. Previously this was a
very cryptic failure. If on the other hand the repository supports mips,
too, but for some reason doesn't ship mips packages at the moment, this
'missing' file is silently ignored (= that is the same as the repository
including an empty file).
The Architectures field isn't mandatory through, so if it isn't there,
we assume that every architecture is supported by this repository, which
skips the arch:all if not listed in the release file.
|
|
This allows running tests in parallel.
Git-Dch: Ignore
|
|
xz has pretty much won "the compressor war" and e.g. the Debian archive
doesn't even distribute bz2 anymore in favor of 'xz' and 'gz', so by
changing the default order we have a more realistic --print-uris
behavior as it will always show the first compressor.
In practice this effects repositories without a Release file (very bad,
we don't want to support them anymore anyhow) as xz will be tried before
bz2 now [which is probably not available, but so might be bz2…] AND
repositories which provide both, bz2 and xz (which isn't too common) in
sofar as apt will now download xz instead of bz2.
Users with special needs can stick with bz2 as first compressor tried
with Acquire::CompressionTypes::Order:: "bz2"; (see man apt.conf) – but
users with special needs usually prefer "gz" anyhow, so the realworld
change is expected to be very low.
|
|
'files' is a bit too generic as a name for a command usually only used
programmatically (if at all) by developers, so instead of "wasting" this
generic name for this we use "indextargets" which is actually the name
of the datastructure the displayed data is stored in.
Along with this rename the config options are renamed accordingly.
|
|
There is an option to keep all targets (Packages, Sources, …) compressed
for a while now, but the all-or-nothing approach is a bit limited for
our purposes with additional targets as some of them are very big
(Contents) and rarely used in comparison, so keeping them compressed by
default can make sense, while others are still unpacked.
Most interesting is the copy-change maybe: Copy is used by the acquire
system as an uncompressor and it is hence expected that it returns the
hashes for the "output", not the input. Now, in the case of keeping a
file compressed, the output is never written to disk, but generated in
memory and we should still validated it, so for compressed files copy is
expected to return the hashes of the uncompressed file. We used to use
the config option to enable on-the-fly decompress in the method, but in
reality copy is never used in a way where it shouldn't decompress a
compressed file to get its hashes, so we can save us the trouble of
sending this information to the method and just do it always.
|
|
Again, consistency is the main sellingpoint here, but this way it is now
also easier to explain that some files move through different stages and
lines are printed for them hence multiple times: That is a bit hard to
believe if the number is changing all the time, but now that it keeps
consistent.
|
|
Selecting targets based on the Release they belong to isn't to
unrealistic. In fact, it is assumed to be the most used case so it is
made the default especially as this allows to bundle another thing we
have to be careful with: Filenames and only showing targets we have
acquired.
Closes: 752702
|
|
Downloading additional files is only half the job. We still need a way
to allow external tools to know where the files are they requested for
download given that we don't want them to choose their own location.
'apt-get files' is our answer to this showing by default in a deb822
format information about each IndexTarget with the potential to filter
the records based on lines and an option to change the output format.
The command serves also as an example on how to get to this information
via libapt.
|
|
We still need an API for the targets, so slowly prepare the IndexTargets
to let them take this job.
Git-Dch: Ignore
|
|
We have two places in the code which need to iterate over targets and do
certain things with it. The first one is actually creating these targets
for download and the second instance pepares certain targets for
reading.
Git-Dch: Ignore
|
|
First pass at making the acquire system capable of downloading files
based on configuration rather than hardcoded entries. It is now possible
to instruct 'deb' and 'deb-src' sources.list lines to download more than
just Packages/Translation-* and Sources files. Details on how to do that
can be found in the included documentation file.
|