path: root/apt-pkg/deb/debmetaindex.cc
Age  Commit message  (Author)
2016-01-27  only warn about missing/invalid Date field for now  (David Kalnischkies)
The Date field in the Release file is useful to avoid allowing an attacker to 'downgrade' a user to earlier Release files (and hence to older states of the archive with open security bugs). It is also needed to allow a user to define min/max values for the validation of a Release file (with or without the Release file providing a Valid-Until field). APT wasn't formally requiring this field before though, and the (arguably not binding and still incomplete) online documentation declared it optional (until now), so we downgrade the error to a warning for now to give repository creators a bit more time to adapt – the bigger ones should have had a Date field for years already, so the affected group should be small in any case. It should be noted that earlier apt versions had this as an error already, but only showed it if a Valid-Until field was present (or the user tried to use the configuration items for min/max valid-until). Closes: 809329
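To make the role of the Date field concrete, here is a minimal sketch of how an expiry time could be derived from Date, Valid-Until and user-supplied min/max values. The struct, the function name and the exact precedence (min only extends an existing Valid-Until, max caps it or supplies one) are assumptions for illustration, not apt's actual code.

```cpp
#include <algorithm>
#include <ctime>

// Hypothetical container for the two timestamps; 0 means "field missing".
struct ReleaseTimes {
    std::time_t date;        // parsed Date field
    std::time_t validUntil;  // parsed Valid-Until field
};

// Returns the effective expiry time, or 0 if none can be derived.
// minValid/maxValid are offsets in seconds relative to Date; 0 disables them.
std::time_t effectiveValidUntil(const ReleaseTimes &r,
                                std::time_t minValid, std::time_t maxValid)
{
    std::time_t expiry = r.validUntil;
    // Both offsets are relative to Date, so without Date they cannot apply.
    if (r.date == 0)
        return expiry;
    // Assumption: a minimum only extends an existing Valid-Until.
    if (minValid != 0 && expiry != 0)
        expiry = std::max(expiry, r.date + minValid);
    // Assumption: a maximum caps Valid-Until, or supplies one if absent.
    if (maxValid != 0)
        expiry = (expiry == 0) ? r.date + maxValid
                               : std::min(expiry, r.date + maxValid);
    return expiry;
}
```

The sketch shows why a missing Date is a problem: every min/max rule silently becomes a no-op.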
2016-01-08  keep compressed indexes in a low-cost format  (David Kalnischkies)
Downloading and storing are two different operations where different compression types can be preferred. For downloading we provide the choice via Acquire::CompressionTypes::Order as there is a choice to be made between download size and speed – and limited by what's available in the repository. Storage on the other hand has all compressions currently supported by apt available, and to reduce the runtime of tools accessing these files the compression type should be a low-cost format in terms of decompression. apt traditionally stores its indexes uncompressed on disk, but has options to keep them compressed. Now that apt downloads additional files we also deal with files which simply can't be stored uncompressed as they are just too big (like Contents for apt-file). Traditionally they are downloaded in a low-cost format (gz) as repositories do not provide other formats, but there might be even lower-cost formats, and for download we could introduce higher-cost formats in the repositories. Downloading an entire index potentially requires recompression to another format, so an update potentially takes longer – but big files are usually updated via pdiffs, which have to de- and re-compress anyhow and do so on the fly, so there is no extra time needed, and in general it seems to be beneficial to invest the time in update to save time later on file access.
2015-12-27  allow repositories to forbid arch:all for specific index targets  (David Kalnischkies)
Debian has a Packages file for arch:all already, but the arch:any files contain arch:all packages as well, so downloading it would be a total waste of resources. Getting this solved is on the list of things to do, but it is also the hardest part – for index targets like Contents the situation is much easier and fewer server/client implementations are involved, so we might not want to stall them. A repository can now declare via: No-Support-for-Architecture-all: Packages that even if an arch:all Packages exists, it shouldn't be downloaded, so that support for Contents files can be added now. See also 1dd20368486820efb6ef4476ad739e967174bec4 for the implementation of downloading arch:all index targets, which this is limiting. The field uses the name of the target from the apt configuration for simplicity and is negative by design as this field is intended to be supported/needed only for a "short" time (one or two Debian releases). While this commit theoretically supports any target, it's expected to only see "Packages" as a value in reality.
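As an illustration of how a client might consult such a field, the following sketch checks whether a target name appears in the field's value. The function name is hypothetical, and treating the value as a whitespace-separated list of target names is an assumption, not something the commit message states.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if 'target' is listed in the value of
// a No-Support-for-Architecture-all field, meaning its arch:all variant
// should not be downloaded.
// Assumption: the value is a whitespace-separated list of target names.
bool archAllDisabledFor(const std::string &fieldValue, const std::string &target)
{
    std::istringstream words(fieldValue);
    std::string word;
    while (words >> word)
        if (word == target)
            return true;
    return false;
}
```

With a field value of "Packages", the check returns true for the Packages target and false for Contents, which matches the intent described above.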
2015-12-14  show a more descriptive error for weak Release files  (David Kalnischkies)
If we can't work with the hashes we parsed from the Release file, we now display an error message if the Release file includes only weak hashes, instead of downloading the indexes and failing to verify them with "Hash Sum mismatch" even though the hashes didn't mismatch (they were just weak). If for some (unlikely) reason we have got weak hashes only for individual targets, we will show a warning to this effect (again, before downloading and failing the index itself). Closes: 806459
2015-11-21  review of new/changed translatable program strings  (Justin B Rye)
Reference mail: https://lists.debian.org/debian-l10n-english/2015/11/msg00006.html
2015-11-04  support arch:all data e.g. in separate Packages file  (David Kalnischkies)
Based on a discussion with Niels Thykier, who asked for Contents-all, this implements apt trying for all architecture-dependent files to get a file for the architecture all, which is treated internally now as an official architecture which is always around (like native). This way arch:all data can be shared instead of duplicated for each architecture, which would require the user to download the same information again and again. There is one problem however: In Debian there is already a binary-all/ Packages file, but the binary-any files still include arch:all packages, so that downloading this file now would be a waste of time, bandwidth and disk space. We therefore need a way to decide if it makes sense to download the all file for Packages in Debian or not. The obvious answer would be a special flag in the Release file indicating this, which would need to default to 'no' and every reasonable repository would override it to 'yes' in a few years' time, but the flag would be there "forever". Looking closer at a Release file we see the field "Architectures", which doesn't include 'all' at the moment. With the idea outlined above that 'all' is a "proper" architecture now, we interpret this field as being authoritative in declaring which architectures are supported by this repository. If it says 'all', apt will try to get all; if not, it will be skipped. This gives us another interesting feature: If I configure a source to download armel and mips, but it declares it supports only armel, apt will now print a notice saying as much. Previously this was a very cryptic failure. If on the other hand the repository supports mips, too, but for some reason doesn't ship mips packages at the moment, this 'missing' file is silently ignored (= that is the same as the repository including an empty file). The Architectures field isn't mandatory though, so if it isn't there, we assume that every architecture is supported by this repository, which skips the arch:all if not listed in the release file.
2015-09-14  add by-hash sources.list option and document all of by-hash  (David Kalnischkies)
This changes the semantics of the option (which is renamed too) to be a yes/no value with the special additional value "force" as this allows by-hash to be disabled even if the repository indicates it would be supported and is more in line with our other yes/no options like pdiff which disable themselves if no support can be detected. The feature wasn't documented so far and hasn't reached a (un)stable release yet, so changing it without trying too hard to keep compatibility seems okay.
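The three-valued semantics described above can be sketched as a small decision function. The enum, the function names and the set of accepted strings are hypothetical; only the yes/no/force behaviour is taken from the commit message.

```cpp
#include <string>

// Hypothetical tri-state for the by-hash option.
enum class ByHash { No, Yes, Force };

// Hypothetical parser. Assumption: anything other than "no" or "force"
// is treated as "yes" in this sketch.
ByHash parseByHash(const std::string &value)
{
    if (value == "force") return ByHash::Force;
    if (value == "no") return ByHash::No;
    return ByHash::Yes;
}

// "yes" disables itself if the repository shows no support,
// "no" disables by-hash even if the repository supports it,
// "force" uses by-hash regardless of what the repository indicates.
bool useByHash(ByHash option, bool repositorySupportsIt)
{
    switch (option) {
    case ByHash::Force: return true;
    case ByHash::No:    return false;
    case ByHash::Yes:   return repositorySupportsIt;
    }
    return false;
}
```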
2015-09-14  avoid using global PendingError to avoid failing too often too soon  (David Kalnischkies)
Our error reporting has historically grown into some kind of mess. A while ago I implemented stacking for the global error, which is used in this commit now to wrap calls to functions which do not report (all) errors via return, so that only failures in those calls cause a failure to propagate down the chain rather than failing if anything (potentially totally unrelated) has failed at some point in the past. This way we can avoid stopping the entire acquire process just because a single source produced an error, for example. It also means that after the acquire process the cache is generated – even if the acquire process had failures – as we still have the old good data around we can and should generate a cache for (again). There are probably more instances of this hiding, but all these looked like the easiest to work with and fix with reasonable (aka net-positive) effects.
2015-08-30  detect and deal with indextarget duplicates  (David Kalnischkies)
Multiple targets downloading the same file is bad™ as it leads us to all sorts of problems like the acquire system breaking or simply a problem of which settings to use for them. Besides that, this is most likely a mistake and silently ignoring it doesn't help the user realize the mistake… On the other hand, we have 'duplicates' which are 'created' by how we create indextargets, so we have to prevent those from being created, too, but do not emit a warning for them as this is an implementation detail. And then, there is the absolute and most likely user mistake: Having the same target(s) activated in multiple entries.
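One simple way to detect targets that would download the same file is to remember every URI seen so far. The struct and function below are hypothetical stand-ins for illustration and do not reflect how apt stores its index targets.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified index target: a name and the URI it downloads.
struct Target {
    std::string name;
    std::string uri;
};

// Returns the names of all targets whose URI was already claimed by an
// earlier target in the list; the first claimant is kept as the original.
std::vector<std::string> findDuplicateTargets(const std::vector<Target> &targets)
{
    std::set<std::string> seen;
    std::vector<std::string> duplicates;
    for (const Target &t : targets)
        if (!seen.insert(t.uri).second)  // insert fails if the URI is known
            duplicates.push_back(t.name);
    return duplicates;
}
```

A caller could then decide per duplicate whether to warn (user mistake) or drop it silently (implementation detail), as the message distinguishes.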
2015-08-30  implement $(NATIVE_ARCHITECTURE) substvar for indextargets  (David Kalnischkies)
2015-08-29  implement indextargets option 'DefaultEnabled'  (David Kalnischkies)
Some targets like Contents-udeb are special-needs targets. Shipping the configuration snippet for them is okay, but they shouldn't be downloaded by default. Forcing the user to enable targets by uncommenting targets is wrong and this would still not really solve the problem completely as even if you want to download some -udebs it will probably not be for all sources you have enabled, so having the possibility of disabling a target by default, but giving the user the option to enable it on a per-source entry basis is better.
2015-08-29  use c++11 algorithms to avoid strange compiler warnings  (David Kalnischkies)
Nobody knows what makes the 'unable to optimize loop' warning appear in the sourceslist minus-options parsing, especially if we use a foreach loop, but we can replace it with some nice c++11 algorithm+lambda usage, which also helps in making even clearer what happens here. And as this would be a lonely change, let's do it for a few more loops as well where I might or might not have seen the warning at some point in time, too. Git-Dch: Ignore
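The kind of rewrite meant here can be sketched as follows: a hand-written loop that drops values named by a "minus" option becomes a std::remove_if call with a lambda. The function name and the interpretation of a minus option as "remove these values from the list" are assumptions for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns 'values' without any entry listed in 'minus',
// using the erase/remove_if idiom with a lambda instead of a manual loop.
std::vector<std::string> withoutValues(std::vector<std::string> values,
                                       const std::vector<std::string> &minus)
{
    values.erase(std::remove_if(values.begin(), values.end(),
                                [&minus](const std::string &v) {
                                    return std::find(minus.begin(), minus.end(), v) != minus.end();
                                }),
                 values.end());
    return values;
}
```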
2015-08-28  implement PDiff patching for compressed files  (David Kalnischkies)
Some additional files like 'Contents' are very big and should therefore be kept compressed on the disk, which apt-file did in the past. It also implemented pdiff patching of these files by un- and recompressing these files on-the-fly. With this commit we can do the same – but we can do this in both pdiff patching styles (client and server merging) and secured by hashes. Hashes are slightly complicated insofar as we can't compare the hashes of the compressed files, as we might compress them differently than the server would (different compressor versions, options, …), so we must compare the hashes of the uncompressed content. While this commit has changes in public headers, the classes it changes are marked as hidden, so nobody can use them directly, which means the ABI break is internal only.
2015-08-27  sources.list and indextargets option for pdiffs  (David Kalnischkies)
Disabling pdiffs can be useful occasionally, like if you have a fast local mirror where the download doesn't matter, but still want to use them for non-local mirrors. Also, some users might prefer to use them only for very big indextargets like Contents.
2015-08-27  allow explicit dis/enable of IndexTargets in sources options  (David Kalnischkies)
While Target{,-Add,-Remove} is available for configuring IndexTargets already, allow Targets to be mentioned explicitly as yes/no options as well, so that the Target 'Contents' can be disabled via 'Contents: no' as well as 'Target-Remove: Contents'.
2015-08-27  not all targets are deb-src targets  (David Kalnischkies)
Sometimes too much refactoring can have bad effects. Thanks: Niels Thykier for reporting on IRC Git-Dch: Ignore
2015-08-17  Cleanup includes after running iwyu  (Michael Vogt)
2015-08-10  add volatile sources support in libapt-pkg  (David Kalnischkies)
Sources are usually defined in sources.list (and co) and are pretty stable, but once in a while a frontend might want to add an additional "source" like a local .deb file to install this package (No support for 'real' sources being added this way as this is a multistep process). We had a hack in place to allow apt-get and apt to pull this off for a short while now, but other frontends are either left out in the cold by this and/or the code for it looks dirty with FIXMEs plastering it and has on top of this also some problems (like including these 'volatile' sources in the srcpkgcache.bin file). So the biggest part in this commit is actually the rewrite of the cache generation as it is now potentially a three step process. The biggest problem with adding support now though is that this makes a bunch of classes public which were previously mostly unusable by externs and therefore hidden, so a bit of further tuning on this now public API is in order…
2015-08-10  rename 'apt-get files' to 'apt-get indextargets'  (David Kalnischkies)
'files' is a bit too generic as a name for a command usually only used programmatically (if at all) by developers, so instead of "wasting" this generic name for this we use "indextargets" which is actually the name of the datastructure the displayed data is stored in. Along with this rename the config options are renamed accordingly.
2015-08-10  add c++11 override marker to overridden methods  (David Kalnischkies)
C++11 adds the 'override' specifier to mark that a method is overriding a base class method and error out if not. We hide it in the APT_OVERRIDE macro to ensure that we keep compiling in pre-c++11 standards. Reported-By: clang-modernize -add-override -override-macros Git-Dch: Ignore
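A macro of this kind could look like the sketch below. The macro name APT_OVERRIDE comes from the commit message; the exact preprocessor condition and the two example classes are assumptions for illustration.

```cpp
#include <string>

// Expand to 'override' only when compiling as C++11 or later, so the same
// headers still compile under older standards (assumed condition).
#if __cplusplus >= 201103L
#define APT_OVERRIDE override
#else
#define APT_OVERRIDE /* pre-C++11: expands to nothing */
#endif

// Hypothetical base class and derivation, only to show the macro in use.
struct IndexBase {
    virtual ~IndexBase() {}
    virtual std::string Describe() const { return "base"; }
};

struct DebIndex : IndexBase {
    // With C++11 the compiler rejects this line if no base method matches.
    std::string Describe() const APT_OVERRIDE { return "deb"; }
};
```

If the signature in DebIndex ever drifts from the base class (say, by dropping const), a C++11 compiler reports an error instead of silently creating a new, unrelated method.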
2015-08-10  allow individual targets to be kept compressed  (David Kalnischkies)
There is an option to keep all targets (Packages, Sources, …) compressed for a while now, but the all-or-nothing approach is a bit limited for our purposes with additional targets as some of them are very big (Contents) and rarely used in comparison, so keeping them compressed by default can make sense, while others are still unpacked. Most interesting is the copy-change maybe: Copy is used by the acquire system as an uncompressor and it is hence expected that it returns the hashes for the "output", not the input. Now, in the case of keeping a file compressed, the output is never written to disk, but generated in memory and we should still validate it, so for compressed files copy is expected to return the hashes of the uncompressed file. We used to use the config option to enable on-the-fly decompress in the method, but in reality copy is never used in a way where it shouldn't decompress a compressed file to get its hashes, so we can save ourselves the trouble of sending this information to the method and just do it always.
2015-08-10  implement Signed-By option for sources.list  (David Kalnischkies)
Limits which key(s) can be used to sign a repository. Not immensely useful from a security perspective all by itself, but if the user has additional measures in place to confine a repository (like pinning), an attacker who gets the key for such a repository is limited to that repository's potential and can't use the key to sign attacks for another (maybe less limited) repository… (yes, this is as weak as it sounds, but having the capability might come in handy for implementing other stuff later).
2015-08-10  add sources.list Check-Valid-Until and Valid-Until-{Max,Min} options  (David Kalnischkies)
These options could be set via configuration before, but the connection to the actual sources is so strong that they should really be set in the sources.list instead – especially as this can be done a lot more specifically rather than e.g. disabling Valid-Until for all sources at once. Valid-Until-* names are chosen instead of the Min/Max-ValidTime as this seems like a better name and their use in the wild is probably low enough that this isn't going to confuse anyone if we have two names for the same thing in different areas. In the long run, the config options should be removed, but for now documentation hinting at the new options is good enough as these are the kind of options you set once across many systems with different apt versions, so the new way should work everywhere first before we deprecate the old way.
2015-08-10  merge indexRecords into metaIndex  (David Kalnischkies)
indexRecords was used to parse the Release file – mostly the hashes – while metaIndex deals with downloading the Release file, storing all indexes coming from this release and … parsing the Release file, but this time mostly for the other fields. That wasn't a problem in metaIndex as this was done in the type specific subclass, but indexRecords while allowing to override the parsing method did expect by default a specific format. APT isn't really supporting different types at the moment, but this is a violation of the abstraction we have everywhere else and, which is the actual reason for this merge: Options e.g. coming from the sources.list come to metaIndex naturally, which needs to wrap them up and bring them into indexRecords, so the acquire system is told about it as they don't get to see the metaIndex, but they don't really belong in indexRecords as this is just for storing data loaded from the Release file… the result is a complete mess. I am not saying it is a lot prettier after the merge, but at least adding new options is now slightly easier and there is just one place responsible for parsing the Release file. That can't hurt.
2015-08-10  detect and error out on conflicting Trusted settings  (David Kalnischkies)
A specific trust state can be enforced via a sources.list option, but it affects all entries handled by the same Release file, not just the entry it was given on, so we enforce acknowledgement of this by requiring the same value to be (not) set on all such entries.
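The consistency rule can be sketched as a check over the trust state of every entry that shares one Release file. The enum and function are hypothetical; the only behaviour taken from the message is that all such entries must carry the same value, including "not set".

```cpp
#include <vector>

// Hypothetical tri-state: the option may be absent, or set to yes or no.
enum class Trusted { Unset, Yes, No };

// Returns true if every entry sharing a Release file has the same trust
// state; a mix such as {Yes, Unset} counts as a conflict.
bool trustedSettingsConsistent(const std::vector<Trusted> &entries)
{
    for (const Trusted t : entries)
        if (t != entries.front())  // only evaluated for non-empty vectors
            return false;
    return true;
}
```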
2015-08-10  support lang= and target= sources.list options  (David Kalnischkies)
We have supported arch= for a while, now we finally add lang= as well and, as a first simple way of controlling which targets to acquire, also target=. This asked for a redesign of the internal API of parsing and storing information about 'deb' and 'deb-src' lines. As this API isn't visible to the outside, no damage done though. Besides being a nice cleanup (= it actually does more in fewer lines) it also provides us with a predictable order of architectures as provided in the configuration rather than based on string sorting-order, so that now the native architecture is parsed/displayed first. Observable e.g. in apt-get output.
2015-08-10  fix memory leaks reported by -fsanitize  (David Kalnischkies)
Various small leaks here and there. Nothing particularly big, but still good to fix. Found by the sanitizers while running our testcases. Reported-By: gcc -fsanitize Git-Dch: Ignore
2015-08-10  make all d-pointer * const pointers  (David Kalnischkies)
Doing this disables the implicit copy assignment operator (among others) which would cause havoc if used on the classes as it would just copy the pointer, not the data the d-pointer points to. For most of the classes we don't need a copy assignment operator anyway and in many classes it was broken before as many contain a pointer of some sort. Only for our Cacheset Container interfaces we define an explicit copy assignment operator which could later be implemented to copy the data from one d-pointer to the other if we need it. Git-Dch: Ignore
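The effect of a const d-pointer can be demonstrated with a small, hypothetical class. Declaring the member as `Private * const` makes the implicit copy assignment operator deleted, which the static_assert confirms at compile time. The explicitly deleted copy constructor is an addition of this sketch (the const member alone does not remove it).

```cpp
#include <type_traits>

// Hypothetical class using the d-pointer idiom.
class Widget {
    struct Private { int value; };
    Private * const d;  // the pointer itself is const, the data is not
public:
    Widget() : d(new Private{42}) {}
    ~Widget() { delete d; }
    // Added for this sketch: copying would otherwise duplicate the pointer.
    Widget(const Widget &) = delete;
    int value() const { return d->value; }
};

// A const data member cannot be assigned to, so the compiler-generated
// copy assignment operator is implicitly deleted.
static_assert(!std::is_copy_assignable<Widget>::value,
              "const d-pointer disables copy assignment");
```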
2015-08-10  apply various style suggestions by cppcheck  (David Kalnischkies)
Some of them modify the ABI, but given that we prepare a big one already, these few hardly count for much. Git-Dch: Ignore
2015-06-16  add d-pointer, virtual destructors and de-inline de/constructors  (David Kalnischkies)
To have a chance to keep the ABI for a while we need all three to team up. One of them missing and we might lose, so ensuring that they are available is a very tedious but needed task once in a while. Git-Dch: Ignore
2015-06-15  implement default apt-get file --release-info mode  (David Kalnischkies)
Selecting targets based on the Release they belong to isn't too unrealistic. In fact, it is assumed to be the most used case, so it is made the default, especially as this allows us to bundle another thing we have to be careful with: Filenames and only showing targets we have acquired. Closes: 752702
2015-06-12  store Release files data in the Cache  (David Kalnischkies)
We used to read the Release file for each Packages file and store the data in the PackageFile struct even though potentially many Packages (and Translation-*) files could use the same data. The point of the exercise isn't the duplicated data though. Having the Release files as first-class citizens in the Cache allows us to properly track their state as well as allows us to use the information also for files which aren't in the cache, but where we know to which Release file they belong (Sources are an example for this). This modifies the pkgCache structs, especially the PackageFile struct, which depending on how libapt users access the data in these structs can mean huge breakage or no visible change. As a single data point: aptitude seems to be fine with this. Even if there is breakage it is trivial to fix in a backportable way while avoiding breakage for everyone would be a huge pain for us. Note that not all PackageFile structs have a corresponding ReleaseFile. In particular the dpkg/status file as well as *.deb files have not. As these only need an Archive property, the Component property takes over this duty and the ReleaseFile remains zero. This is also the reason why it isn't needed nor particularly recommended to change from PackageFile to ReleaseFile blindly. Sticking with the former is usually the better option.
2015-06-11  implement 'apt-get files' to access index targets  (David Kalnischkies)
Downloading additional files is only half the job. We still need a way to allow external tools to know where the files they requested for download are, given that we don't want them to choose their own location. 'apt-get files' is our answer to this, showing by default in a deb822 format information about each IndexTarget with the potential to filter the records based on lines and an option to change the output format. The command also serves as an example of how to get to this information via libapt.
2015-06-11  use an enum instead of strings as IndexTarget::Option interface  (David Kalnischkies)
Strings are easy to typo and we can keep the extensibility we require here with a simple enum we can append to without endangering ABI. Git-Dch: Ignore
2015-06-11  use IndexTarget to get to IndexFile  (David Kalnischkies)
Removes a bunch of duplicated code in the deb-specific parts. Especially the Description part is now handled centrally by IndexTarget instead of being duplicated to the derivations of IndexFile. Git-Dch: Ignore
2015-06-11  show URI.Path in all acquire item descriptions  (David Kalnischkies)
It is a rather strange sight that index items use SiteOnly which strips the Path, while e.g. deb files are downloaded with NoUserPassword which does not. Important to note here is that for the file transport Path is pretty important as there is no Host which would be displayed by Site, which always resulted in "interesting" unspecific errors for "file:". Adding a 'middle' ground between the two which does show the Path but potentially modifies it (it strips a trailing / at the end if existing) solves this "file:" issue, syncs the output and in the end helps to identify which file is meant exactly in progress output and co as a single site can have multiple repositories in different paths.
2015-06-10  rename Calculate- to GetIndexTargets and use it as official API  (David Kalnischkies)
We need a general way to get from a sources.list entry to IndexTargets and with this change we can move from pkgSourceList over the list of metaIndexes it includes to the IndexTargets each metaIndex can have. Git-Dch: Ignore
2015-06-10  stop using IndexTarget pointers which are never freed  (David Kalnischkies)
Creating and passing around a bunch of pointers of IndexTargets (and of a vector of pointers of IndexTargets) is probably done to avoid the 'costly' copy of the container, but we are really not in a time-critical operation here and move semantics will help us even further in the future. On the other hand we never do a proper cleanup of these pointers, which is very dirty, even if the structures aren't that big… The changes, while affecting many items, only affect our own hidden class, so we can do that without fearing breaking interfaces or anything. Git-Dch: Ignore
2015-06-10  store all targets data in IndexTarget struct  (David Kalnischkies)
We still need an API for the targets, so slowly prepare the IndexTargets to let them take this job. Git-Dch: Ignore
2015-06-10  abstract the code to iterate over all targets a bit  (David Kalnischkies)
We have two places in the code which need to iterate over targets and do certain things with them. The first one is actually creating these targets for download and the second instance prepares certain targets for reading. Git-Dch: Ignore
2015-06-09  configureable acquire targets to download additional files  (David Kalnischkies)
First pass at making the acquire system capable of downloading files based on configuration rather than hardcoded entries. It is now possible to instruct 'deb' and 'deb-src' sources.list lines to download more than just Packages/Translation-* and Sources files. Details on how to do that can be found in the included documentation file.
2015-06-09  rework hashsum verification in the acquire system  (David Kalnischkies)
Having every item carry its own code to verify the file(s) it handles is an error-prone process and easy to break, especially if items move through various stages (download, uncompress, patching, …). With a giant rework we centralize (most of) the verification to have a better enforcement rate and (hopefully) less chance for bugs, but it breaks the ABI big time in exchange – and as we break it anyway, it is broken even harder. It shouldn't affect most frontends as they don't deal with the acquire system at all or implement their own items, but some do and will need to be patched (might be an opportunity to use apt on-board material). The theory is simple: Items implement methods to decide if hashes need to be checked (in this stage) and to return the expected hashes for this item (in this stage). The verification itself is done in worker message passing which has the benefit that a hashsum error is now a proper error for the acquire system rather than a Done() which is later revised to a Failed().
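The division of labour described above (items answer two questions, one central place does the comparison) can be sketched as follows. All class and function names here are hypothetical stand-ins, and a single string replaces whatever hash list type the real code uses.

```cpp
#include <string>

// Hypothetical stand-in for an acquire item: it only states whether hashes
// must be checked in its current stage and what hash it expects.
struct Item {
    virtual ~Item() {}
    virtual bool hashesRequired() const { return true; }
    virtual std::string expectedHash() const = 0;
};

struct IndexItem : Item {
    std::string expected;
    explicit IndexItem(const std::string &e) : expected(e) {}
    std::string expectedHash() const override { return expected; }
};

// An item in a stage where no hash check applies (illustrative only).
struct UncheckedItem : Item {
    bool hashesRequired() const override { return false; }
    std::string expectedHash() const override { return std::string(); }
};

enum class Result { Done, Failed };

// Central verification: a mismatch is reported as Failed right away
// instead of being reported as Done and revised later.
Result verifyDownload(const Item &item, const std::string &actualHash)
{
    if (!item.hashesRequired())
        return Result::Done;
    return item.expectedHash() == actualHash ? Result::Done : Result::Failed;
}
```

Because the comparison lives in one function, a new item type cannot forget to verify; it can only declare (visibly) that no hashes are required.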
2015-03-16  fix some new compiler warnings reported by gcc-5  (David Kalnischkies)
Git-Dch: Ignore
2014-11-08  mark internal interfaces as hidden  (David Kalnischkies)
We have a bunch of classes which are of no use for the outside world, but were still exported and so needed to preserve ABI/API. Marking them as hidden to not export them any longer is a big API break in theory, but in practice nobody is using them – and if they are, it's a bug.
2014-11-08  better non-virtual metaIndex.LocalFileName() implementation  (David Kalnischkies)
We can't add a new virtual method without breaking the ABI, but we can freely add new methods, so for older ABIs we just implement this method with a dynamic_cast, so that clients can be more ignorant about the API here and especially don't need to pull a very dirty trick by assuming internal knowledge (like apt-get did here).
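The trick of adding a non-virtual method that dispatches via dynamic_cast can be sketched like this. The class names and the returned string are hypothetical placeholders, not apt's real classes or paths.

```cpp
#include <string>

// Hypothetical base class. Adding a new *virtual* method would change the
// vtable layout (an ABI break), so LocalFileName() is non-virtual.
class MetaIndex {
public:
    virtual ~MetaIndex() {}
    std::string LocalFileName() const;
};

// Hypothetical derived class that actually knows its file name.
class DebReleaseIndex : public MetaIndex {
public:
    std::string LocalFileName() const { return "example_Release"; }
};

// The base implementation checks the dynamic type itself, so callers can
// stay ignorant of which concrete class they hold.
std::string MetaIndex::LocalFileName() const
{
    if (const DebReleaseIndex *deb = dynamic_cast<const DebReleaseIndex *>(this))
        return deb->LocalFileName();
    return std::string();  // unknown type: no local file name
}
```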
2014-11-08  use a abi version check similar to the gcc check  (David Kalnischkies)
Git-Dch: Ignore
2014-10-13  trusted=yes sources are secure, we just don't know why  (David Kalnischkies)
Do not require a special flag to be present to update trusted=yes sources as this flag in the sources.list is obviously special enough. Note that this is just disabling the error message, the user will still be warned about all the (possible) failures the repository generated, it is just triggering the acceptance of the warnings on a source-by-source level. Similarly, the trusted=no flag doesn't require the user to pass additional flags to update, if the repository looks fine in the view of apt it will update just fine. The unauthenticated warnings will "just" be presented when the data is used. In case you wonder: Both were the behavior in previous versions, too.
2014-09-17  use pkgAcqMetaBase as the transactionManager  (Michael Vogt)
2014-07-31  Rework TransactionID stuff  (Michael Vogt)
2014-07-21  Download Release first, then Release.gpg  (Michael Vogt)
The old way of handling this was that pkgAcqMetaIndex was responsible for checking/moving both Release and Release.gpg into place. This breaks the assumption of the transaction that each pkgAcquire::Item has a single File that it is responsible for.