summaryrefslogtreecommitdiff
path: root/apt-pkg/acquire-item.cc
AgeCommit message (Collapse)Author
2016-08-26Make directory paths configurableJulian Andres Klode
This allows other vendors to use different paths, or to build your own APT in /opt for testing. Note that this uses + 1 in some places, as the paths we receive are absolute, but we need to strip of the initial /.
2016-08-26Use C locale instead of C.UTF-8 for protocol stringsJulian Andres Klode
The C.UTF-8 locale is not portable, so we need to use C, otherwise we crash on other systems. We can use std::locale::classic() for that, which might also be a bit cheaper than using locale("C").
2016-08-17support compression and by-hash for .diff/Index filesDavid Kalnischkies
In af81ab9030229b4ce6cbe28f0f0831d4896fda01 by-hash got implemented as a special compression type for our usual index files like Packages. Missing in this scheme was the special .diff/Index index file containing the info about individual patches for this index file. Deriving from the index file class directly we inherent the compression handling infrastructure and in this way also by-hash nearly for free. Closes: #824926
2016-08-17support getting only-uncompressed files via by-hashDavid Kalnischkies
The URI we later want to modify to get the file via by-hash was unset in case a file was only available uncompressed (which is usually not the case) causing an acquire error.
2016-08-17set the correct item FileSize in by-hash caseDavid Kalnischkies
In af81ab9030229b4ce6cbe28f0f0831d4896fda01 we implement by-hash as a special compression type, which breaks this filesize setting as the code is looking for a foobar.by-hash file then. Dealing this slightly gets us the intended value. Note that this has no direct effect as this value will be set in other ways, too, and could only effect progress reporting. Gbp-Dch: Ignore
2016-08-17retry without same redirection mirror on 404 errorsDavid Kalnischkies
If 9b8034a9fd40b4d05075fda719e61f6eb4c45678 serves the Release files from a partial mirror we will end up getting 404 for some of the indexes. Instead of giving up, we will instead ignore our same redirection mirror constrain and ask the redirection service as a potential hashsum mismatch is better than keeping the certain 404 error.
2016-08-17check internal redirections for loops, tooDavid Kalnischkies
Now that we have the redirections loopchecker centrally in our items we can use it also to prevent internal redirections to loop caused by bugs as in a few instances we get into the business of rewriting the URI we will query by ourself as we predict we would see such a redirect anyway. Our code has no bugs of course, hence no practical difference. ;) Gbp-Dch: Ignore
2016-08-10detect redirection loops in acquire instead of workersDavid Kalnischkies
Having the detection handled in specific (http) workers means that a redirection loop over different hostnames isn't detected. Its also not a good idea have this implement in each method independently even if it would work
2016-07-30prevent C++ locale number formatting in text APIs (try 2)David Kalnischkies
Followup of b58e2c7c56b1416a343e81f9f80cb1f02c128e25. Still a regression of sorts of 8b79c94af7f7cf2e5e5342294bc6e5a908cacabf. Closes: 832044
2016-07-27rred: truncate result file before writing to itDavid Kalnischkies
If another file in the transaction fails and hence dooms the transaction we can end in a situation in which a -patched file (= rred writes the result of the patching to it) remains in the partial/ directory. The next apt call will perform the rred patching again and write its result again to the -patched file, but instead of starting with an empty file as intended it will override the content previously in the file which has the same result if the new content happens to be longer than the old content, but if it isn't parts of the old content remain in the file which will pass verification as the new content written to it matches the hashes and if the entire transaction passes the file will be moved the lists/ directory where it might or might not trigger errors depending on if the old content which remained forms a valid file together with the new content. This has no real security implications as no untrusted data is involved: The old content consists of a base file which passed verification and a bunch of patches which all passed multiple verifications as well, so the old content isn't controllable by an attacker and the new one isn't either (as the new content alone passes verification). So the best an attacker can do is letting the user run into the same issue as in the report. Closes: #831762
2016-07-26verify hash of input file in rredDavid Kalnischkies
We read the entire input file we want to patch anyhow, so we can also calculate the hash for that file and compare it with what he had expected it to be. Note that this isn't really a security improvement as a) the file we patch is trusted & b) if the input is incorrect, the result will hardly be matching, so this is just for failing slightly earlier with a more relevant error message (althrough, in terms of rred its ignored and complete download attempt instead).
2016-07-22allow arch=all to override No-Support-for-Architecture-allDavid Kalnischkies
If a user explicitly requests the download of arch:all apt shouldn't get in the way and perform its detection dance if arch:all packages are (also) in arch:any files or not. This e.g. allows setting arch=all on a source with such a field (or one which doesn't support all at all, but has the arch:all files like Debian itself ATM) to get only the arch:all packages from there instead of behaving like a no-op. Reported-By: Helmut Grohne on IRC
2016-07-02use +0000 instead of UTC by default as timezone in outputDavid Kalnischkies
All apt versions support numeric as well as 3-character timezones just fine and its actually hard to write code which doesn't "accidently" accepts it. So why change? Documenting the Date/Valid-Until fields in the Release file is easy to do in terms of referencing the datetime format used e.g. in the Debian changelogs (policy §4.4). This format specifies only the numeric timezones through, not the nowadays obsolete 3-character ones, so in the interest of least surprise we should use the same format even through it carries a small risk of regression in other clients (which encounter repositories created with apt-ftparchive). In case it is really regressing in practice, the hidden option -o APT::FTPArchive::Release::NumericTimezone=0 can be used to go back to good old UTC as timezone. The EDSP and EIPP protocols use this 'new' format, the text interface used to communicate with the acquire methods does not for compatibility reasons even if none of our methods would be effected and I doubt any other would (in these instances the timezone is 'GMT' as that is what HTTP/1.1 requires). Note that this is only true for apt talking to methods, (libapt-based) methods talking to apt will respond with the 'new' format. It is therefore strongly adviced to support both also in method input.
2016-06-27imbue .diff/Index parsing with C.UTF-8 as wellDavid Kalnischkies
In 3bdff17c894d0c3d0f813d358fc45d7a263f3552 we did it for the datetime parsing, but we use the same style in the parsing for pdiff (where the size of the file is in the middle of the three fields) so imbueing here as well is a good idea.
2016-06-22add insecure (and weak) allow-options for sources.listDavid Kalnischkies
Weak had no dedicated option before and Insecure and Downgrade were both global options, which given the effect they all have on security is rather bad. Setting them for individual repositories only isn't great but at least slightly better and also more consistent with other settings for repositories.
2016-06-22add [weak] tag to hash errors to indicate insufficiencyDavid Kalnischkies
For "Hash Sum mismatch" that info doesn't make a whole lot of difference, but for the new insufficient info message an indicator that while this hashes are there and even match, they aren't enough from a security standpoint.
2016-06-22better error message for insufficient hashsumsDavid Kalnischkies
Downloading and saying "Hash Sum mismatch" isn't very friendly from a user POV, so with this change we try to detect such cases early on and report it, preferably before download even started. Closes: 827758
2016-06-22generalize secure->insecure downgrade protectionDavid Kalnischkies
Handling the extra check (and force requirement) for downgrades in security in our AllowInsecureRepositories checker helps in having this check everywhere instead of just in the most common place and requiring a little extra force in such cases is always good.
2016-06-22handle weak-security repositories as unauthenticatedDavid Kalnischkies
APT can be forced to deal with repositories which have no security features whatsoever, so just giving up on repositories which "just" fail our current criteria of good security features is the wrong incentive. Of course, repositories are better of fixing their setup to provide the minimum of security features, but sometimes this isn't possible: Historic repositories for example which do not change (anymore). That also fixes problem with repositories which are marked as trusted, but are providing only weak security features which would fail the parsing of the Release file. Closes: 827364
2016-05-24fix two typos in untranslated errors of libapt-pkgDavid Kalnischkies
Reported-By: lintian: spelling-error-in-binary Git-Dch: Ignore
2016-05-08implement Fallback-Of for IndexTargetsDavid Kalnischkies
Sometimes index files are in different locations in a repository as it is currently the case for Contents files which are per-component in Debian, but aren't in Ubuntu. This has historic reasons and is perhaps changed soon, but such cases of transitions can always happen in the future again, so we should prepare: Introduced is a new field declaring that the current item should only be downloaded if the mentioned item wasn't allowing for transitions without a flagday in clients and archives. This isn't implemented 'simpler' with multiple MetaKeys as items (could) change their descriptions and perhaps also other configuration bits with their location.
2016-05-07don't construct MetaIndex acquire items with IndexTargetsDavid Kalnischkies
We don't have to initialize the Release files with a set of IndexTargets to acquire, but instead wait for the Release file to be acquired and only then ask which IndexTargets to get. Git-Dch: Ignore
2016-05-07delay progress until Release files are downloadedDavid Kalnischkies
Progress reporting used an "upper bound" on files we might get, expect that this wasn't correct in case pdiff entered the picture. So instead of calculating a value which is perhaps incorrect, we just accept that we can't tell how many files we are going to download and just keep at 0% until we know. Additionally, if we have pdiffs we wait until we got these (sub)index files, too. That could all be done better by downloading all Release files first and planing with them in hand accordingly, but one step at a time.
2016-05-07TransactionManager can never be a nullptrDavid Kalnischkies
The code naturally evolved from a TransactionManager optional to a required setup which resulted in various places doing unneeded checks suggesting a more complicated setup than is actually needed. Git-Dch: Ignore
2016-05-07fix same-mirror redirection for Release{,.gpg} pairDavid Kalnischkies
Commit 9b8034a9fd40b4d05075fda719e61f6eb4c45678 just deals with InRelease properly and generates broken URIs in case the mirror (or the achieve really) has no InRelease file. [As this was in no released version no need to clutter changelog with a fix notice.] Git-Dch: Ignore
2016-05-01don't show NO_PUBKEY warning if repo is signed by another keyDavid Kalnischkies
Daniel Kahn Gillmor highlights in the bugreport that security isn't improving by having the user import additional keys – especially as importing keys securely is hard. The bugreport was initially about dropping the warning to a notice, but in given the previously mentioned observation and the fact that we weren't printing a warning (or a notice) for expired or revoked keys providing a signature we drop it completely as the code to display a message if this was the only key is in another path – and is considered critical. Closes: 618445
2016-04-25use the same redirection mirror for all index filesDavid Kalnischkies
Redirection services like httpredir.debian.org tend to use a set of mirrors from which they pick a mirror at "random" for each requested file, which is usually benefitial for the download of debs, but for the index files this can quickly cause problems (aka hashsum mismatches) if the two (or more) mirrors involved are only slightly out-of-sync. This commit "resolves" this issue by using the mirror we ended up using to get the (signed) Release file directly to get the index files belonging to this Release file instead of asking the redirection service which eliminates the risk of hitting out-of-sync mirrors. As an obvious downside the redirection service can't serve partial mirrors anymore for indexes and the download of indexes indexed in the same Release file can't be done in parallel (from different mirrors). This does not effect the download of non-index files like deb-files as out-of-sync mirrors aren't a huge problem there, so the parallel download outweights a potentially 404 error (also because this causes no errenous downloads while hashsum mismatches download the entire file before finding out that it was pointless). The rational for this is that indexes are relative to the Release file. If we would be talking about a HTML page including images, such a behaviour is obvious and intended – not doing it means in the best case a bunch of "useless" requests which will all be answered with a redirect.
2016-04-25show more details for "Writing more data" errors, tooDavid Kalnischkies
They are the small brothers of the hashsum mismatch, so they deserve a similar treatment even through we have for architectual reasons not a much to display as for hashsum mismatches for now.
2016-04-25show more details for "Hash Sum mismatch" errorsDavid Kalnischkies
Users tend to report these errors with just this error message… not very actionable and hard to figure out if this is a temporary or 'permanent' mirror-sync issue or even the occasional apt bug. Showing the involved hashsums and modification times should help in triaging these kind of bugs – and eventually we will have less of them via by-hash. The subheaders aren't marked for translation for now as they are technical glibberish and probably easier to deal with if not translated. After all, our iconic "Hash Sum mismatch" is translated at least. These additions were proposed in #817240 by Peter Palfrader.
2016-04-25sanify unused ReportMirrorFailure a tiny bitDavid Kalnischkies
Calling the (non-existent) reporter multiple times for the same error with different codes for the same error (e.g. hashsum) is a bit strange. It also doesn't need to be a public API. Ideally that would all look and behave slightly different, but we will worry about that at the time this is actually (planed to be) used somewhere… Git-Dch: Ignore
2016-04-14ensure outdated files are dropped without lists-cleanupDavid Kalnischkies
Tested via (newly) empty index files, but effects also files dropped from the repository or an otherwise changed repository config.
2016-04-14silently skip acquire of empty index filesDavid Kalnischkies
There is just no point in taking the time to acquire empty files – especially as it will be tiny non-empty compressed files usually.
2016-04-14fix Alt-Filename handling of file methodDavid Kalnischkies
A silly of-by-one error in the stripping of the extension to check for the uncompressed filename broken in an attempt to support all compressions in commit a09f6eb8fc67cd2d836019f448f18580396185e5. Fixing this highlights also mistakes in the handling of the Alt-Filename in libapt which would cause apt to remove the file from the repository (if root has the needed rights – aka the disk isn't readonly or similar)
2016-04-07stop handling items in doomed transactionsDavid Kalnischkies
With the previous commit we track the state of transactions, so we can now use our knowledge to avoid processing data for a transaction which was already closed (via an abort in this case). This is needed as multiple independent processes are interacting in the process, so there isn't a simple immediate full-engine stop and it would also be bad to teach each and every item how to check if its manager has failed subordinate and what to do in that case. In the pdiff case, which deals (potentially) with many items during its lifetime e.g. a hashsum mismatch in another file can abort the transaction the file we try to patch via pdiff belongs to. This causes some of the items (which are already done) to be aborted with it, but items still in the process of acquisition continue in the processing and will later try to use all the items together failing in strange ways as cleanup already happened. The chosen solution is to dry up the communication channels instead by ignoring new requests for data acquisition, canceling requests which are not assigned to a queue and not calling Done/Failed on items anymore. This means that e.g. already started or pending (e.g. pipelined) downloads aren't stopped and continue as normal for now, but they remain in partial/ and aren't processed further so the next update command will pick them up and put them to good use while the current process fails updating (for this transaction group) in an orderly fashion. Closes: 817240 Thanks: Barr Detwix & Vincent Lefevre for log files
2016-04-07ensure transaction states are changed only onceDavid Kalnischkies
We want to keep track of the state of a transaction overall to base future decisions on it, but as a pre-requirement we have to make sure that a transaction isn't commited twice (which happened if the download of InRelease failed and Release takes over). It also happened to create empty commits after a transaction was already aborted in cases in which the Release files were rejected. This isn't effecting security at the moment, but to ensure this isn't happening again and can never be bad a bunch of fatal error messages are added to make regressions on this front visible.
2016-03-19refactor loading of previous release fileDavid Kalnischkies
There is really no need to have the same code three times. Git-Dch: Ignore
2016-03-16Get accurate progress reporting in apt update againMichael Vogt
For the non-pdiff case, we have can have accurate progress reporting because after fetching the {,In}Release files we know how many IndexFiles will be fetched and what size they have. Therefore init the filesize early (in pkgAcqIndex::Init) and ensure that in Acquire::Pulse() looks at already downloaded bits when calculating the progress in Acquire::Pulse. Also improve debug output of Debug::acquire::progress
2016-03-14don't use Desc.URI to calculate .diff/Index filenamesDavid Kalnischkies
The URI descibing an item can change via mirrors/redirectors which causes the .diff/Index files to get the wrong names in storage. Git-Dch: Ignore
2016-03-14require $(HASH)-Download field in .diff/Index filesDavid Kalnischkies
Now that we ignore SHA1-only files it makes sense to require also the provision of hashes for the compressed patches as this was introduced in the same patchset as support for non-SHA1 hashes in the file itself in dak and adding support in other archive creators (if they support pdiffs at all) will likely be in the same batch. The reason for the change itself is simple: If you are 'scared' enough about the security of SHA1, you shouldn't uncompress a file you haven't verified at all – after all, it could be exploiting a bug or a zip bomb.
2016-03-07Fix several typosVeres Lajos
This effectively merges branch 'typofixes-vlajos-20150807' of github.com:vlajos/apt with the following commit: commit 13cacb3e2e2352ba701e769fc889e3344fabbf7e Author: Veres Lajos <vlajos@gmail.com> Date: Sun Aug 9 00:12:53 2015 +0100 typofix - https://github.com/vlajos/misspell_fixer It has been rebased for a better commit message.
2016-03-06do not move not-failed pdiff-patches into CWD on failureDavid Kalnischkies
If a single pdiff fails, we have to fail the entire patching endeavour and fall back to getting the complete file instead. That is easy in serverside merged pdiffs as we get them one by one. For clientside we get them all at once through, which means that a failure in one has to stop the entire pipeline, which works as expected (as proven by the bugreporters as they don't even notice it happening). The problem is just that the first failing pdiff will do the cleanup, so another pdiff which happens to be successfully acquired after we processed the failure doesn't find the file it is supposed to use as a basename anymore, so the patch is renamed to what should be the unique extension and moved into the current working directory. Processing is then stopped as the patch realizes that it isn't the last one which completed downloading. On the plus side this means this is neither us using a bad temporary location nor a security problem. It "just" overrides unconditionally files in your current working directory (if you happen to have them named like a pdiff patch – a bit unlikely perhaps) and so drops files there which are never used again. I guess this was introduced in 4e3c5633b1e74b4f58b95f339cfbbf4cbf21ab3e for real as I made the need for the existence of the base file rather explicit, but the potential lingers in the code for far longer. Closes: #816837
2016-03-06deal with partially downloaded changelogsDavid Kalnischkies
Changelogs are relatively small and we have no hashes for them, but we had partial support for them before, so lets stick to it. This also deletes the (partial) file before moving the downloaded file into its place – rename(2) should be doing this by itself, but testing on semaphoreci suggests that this isn't always the case (error is "Stale file handle") and we don't need an atomic replace here, so be explicit. Git-Dch: Ignore
2016-02-26Add missing numeric includes in files using std::accumulate()Julian Andres Klode
Reported-By: Helmut Grohne on IRC
2016-02-11always download changelogs into /tmp firstDavid Kalnischkies
pkgAcqChangelog has the default behaviour of downloading a changelog to a temporary directory (inside /tmp, not /tmp directly), which is cleaned up on shutdown, but this can be overridden to store the changelog more permanently – but that caries a permission problem. For changelog we can 'easily' solve this by always downloading to a temporary directory and only move it out of there on done.
2016-02-11use local changelog from /usr/share/doc if possibleDavid Kalnischkies
If pkgAcqChangelog is told to acquire the changelog for a version it will check first if this version is installed on the disk and if so will use the local changelog in /usr/share/doc (possibily/likely gz compressed) instead of downloading the file from the web. An option is provided to disable this, which is enabled by default for the Ubuntu vendor as they truncate the local changelogs – and for apts --print-uris action.
2016-01-08remove uncompressed leftover partial file before pdiff bootstrapDavid Kalnischkies
The code already deals with compressed leftovers, but forgot the uncompressed files. The opertunity is picked to reorder this code and add debug messages about the actions taken as well as produce such a leftover file in the associated testcase.
2016-01-08use filesize of compressed pdiffs for the limit if possibleDavid Kalnischkies
With the addition of the $HASH-Download field in the .diff/Index we got the size of the compressed patches for 'free', so if that information is available we can use it for a more fitting calculation of the size requirements of the patches vs. the complete file. Note that this predicts a too small size in the transition case in which the information isn't available for all patches, but figuring this out would be a lot of code for practically nothing as only one update can ever be in such a transition phase.
2016-01-08keep compressed indexes in a low-cost formatDavid Kalnischkies
Downloading and storing are two different operations were different compression types can be preferred. For downloading we provide the choice via Acquire::CompressionTypes::Order as there is a choice to be made between download size and speed – and limited by whats available in the repository. Storage on the other hand has all compressions currently supported by apt available and to reduce runtime of tools accessing these files the compression type should be a low-cost format in terms of decompression. apt traditionally stores its indexes uncompressed on disk, but has options to keep them compressed. Now that apt downloads additional files we also deal with files which simply can't be stored uncompressed as they are just too big (like Contents for apt-file). Traditionally they are downloaded in a low-cost format (gz) as repositories do not provide other formats, but there might be even lower-cost formats and for download we could introduce higher-cost in the repositories. Downloading an entire index potentially requires recompression to another format, so an update takes potentially longer – but big files are usually updated via pdiffs which has to de- and re-compress anyhow and does it on the fly anyhow, so there is no extra time needed and in general it seems to be benefitial to invest the time in update to save time later on file access.
2016-01-08allow pdiff bootstrap from all supported compressorsDavid Kalnischkies
There is no reason to enforce that the file we start the bootstrap with is compressed with a compressor which is available online. This allows us to change the on-disk format as well as deals with repositories adding/removing support for a specific compressor.
2016-01-08ensure compression cleanup even without lists-cleanupDavid Kalnischkies
If we store files compressed in lists/ and the file switched compression formats we happened to retain the "old" format, but by default the cleanup process catched this oversight and removed the file. [The initial situation described doesn't arise as we store no files by default compressed and even with apt-file configuring Contents files, we don't really have that problem as there is just .gz files for those.] We solve this by just removing any uncompressed as well as compressed (we support) file just before we move the 'new' version of the file in.