summaryrefslogtreecommitdiff
path: root/apt-pkg/pkgcachegen.cc
AgeCommit message (Collapse)Author
2021-01-13pkgcachegen: Avoid write to old cache for Version::ExtraJulian Andres Klode
Assigning the result of AllocateInMap directly to Ver->d caused Ver->d to be resolved first, and hence if Ver was remapped during the AllocateInMap, we were trying to assign to the old value. Closes: #980037
2021-01-08Add support for Phased-Update-PercentageJulian Andres Klode
This adds support for Phased-Update-Percentage by pinning upgrades that are not to be installed down to 1. The output of policy has been changed to add the level of phasing, and documentation has been improved to document how phased updates work. The patch detects if it is running in a chroot, and if so, always includes phased updates, restoring classic apt behavior to avoid behavioral changes on buildd chroots. Various options are added to control this all: * APT::Get::{Always,Never}-Include-Phased-Updates and their legacy update-manager equivalents to always or never include phased updates * APT::Machine-ID can be set to a UUID string to have all machines in a fleet phase the same * Dir::Etc::Machine-ID is weird in that it's default is sort of like ../machine-id, but not really, as ../machine-id would look up $PWD/../machine-id and not relative to Dir::Etc; but it allows you to override the path to machine-id (as opposed to the value) * Dir::Bin::ischroot is the path to the ischroot(1) binary which is used to detect whether we are running in a chroot.
2020-02-26apt-pkg: default visibility to hiddenJulian Andres Klode
2020-02-25Silence narrow conversion warnings, add error checksJulian Andres Klode
When converting a long offset to a uint32_t to be stored in the map, check that this is safe to do. If the offset is negative, or we lose data in the conversion, we lost.
2020-02-24Make map_pointer<T> typesafeJulian Andres Klode
Instead of just using uint32_t, which would allow you to assign e.g. a map_pointer<Version> to a map_pointer<Package>, use our own smarter struct that has strict type checking. We allow creating a map_pointer from a nullptr, and we allow comparing map_pointer to nullptr, which also deals with comparisons against 0 which are often used, as 0 will be implictly converted to nullptr.
2020-02-24Wrap AllocateInMap with a templated versionJulian Andres Klode
2020-02-24Replace map_pointer_t with map_pointer<T>Julian Andres Klode
This is a first step to a type safe cache, adding typing information everywhere. Next, we'll replace map_pointer<T> implementation with a type safe one.
2020-02-18Use a 32-bit djb VersionHash instead of CRC-16Julian Andres Klode
2020-01-27NewGroup: Create GrpIterator after allocation (fix segfault)Julian Andres Klode
NewGroup created a GrpIterator and then called WriteStringInMap() which might remap the cache, causing the iterator to go invalid. Avoid this simply by creating the iterator later on.
2020-01-16NewProvidesAllArch: Check if group is empty before using itJulian Andres Klode
APT 1.9.6 introduced empty groups by making use of groups to deduplicate package names. This is not normally a problem, but here we assumed that every group has at least one package. This caused a problem because automake was providing automake-1.16 while having the source package automake-1.16. So we found the automake-1.16 group, iterated over its empty package list, trying to store the provides (which hence never happened). LP: #1859952
2020-01-14Remove includes of (md5|sha1|sha2).h headersJulian Andres Klode
Remove it everywhere, except where it is still needed.
2020-01-08Avoid extra out-of-cache hash table deduplication for package namesJulian Andres Klode
We were de-duplicating package name strings in StoreString, but also deduplicating most of them by them being in groups, so we had extra hash table lookups that could be avoided in NewGroup(). To continue deduplicating names across binary packages and source packages, insert groups for source packages as well. This is also a good first step in allowing efficient lookup of packages by source package - we can extend Group later by a list of SourceVersion objects, or alternatively, simply add a by-source chain into pkgCache::Version. This change improves performance by about 10% (913 to 814 ms), while having no significant overhead on the cache size: --- before +++ after @@ -1,7 +1,7 @@ -Total package names: 109536 (2.191 k) -Total package structures: 118689 (4.748 k) +Total package names: 119642 (2.393 k) +Total package structures: 118687 (4.747 k) Normal packages: 83309 - Pure virtual packages: 3365 + Pure virtual packages: 3363 Single virtual packages: 17811 Mixed virtual packages: 1973 Missing: 12231 @@ -10,21 +10,21 @@ Total distinct descriptions: 149291 (3.583 k) Total dependencies: 484135/156650 (12,2 M) Total ver/file relations: 57421 (1.378 k) Total Desc/File relations: 18219 (437 k) -Total Provides mappings: 29963 (719 k) +Total Provides mappings: 29959 (719 k) Total globbed strings: 226993 (5.332 k) Total slack space: 26,8 k -Total space accounted for: 38,1 M +Total space accounted for: 38,3 M Total buckets in PkgHashTable: 50503 - Unused: 5727 - Used: 44776 - Utilization: 88.6601% - Average entries: 2.65073 + Unused: 5728 + Used: 44775 + Utilization: 88.6581% + Average entries: 2.65074 Longest: 60 Shortest: 1 Total buckets in GrpHashTable: 50503 - Unused: 5727 - Used: 44776 - Utilization: 88.6601% - Average entries: 2.44631 - Longest: 10 + Unused: 4649 + Used: 45854 + Utilization: 90.7946% + Average entries: 2.60919 + Longest: 11 Shortest: 1
2019-02-26pkgcachegen: Remove deprecated functionsJulian Andres Klode
2019-02-26pkgcache: Remove deprecated bitsJulian Andres Klode
2018-11-25Fix typo reported by codespell in code commentsDavid Kalnischkies
No user visible change expect for some years old changelog entries, so we don't really need to add a new one for this… Reported-By: codespell Gbp-Dch: Ignore
2018-05-07Remove obsolete RCS keywordsGuillem Jover
Prompted-by: Jakub Wilk <jwilk@debian.org>
2018-05-05Fix various typos reported by spellcheckersDavid Kalnischkies
Reported-By: codespell & spellintian Gbp-Dch: Ignore
2017-12-24do not remap current files if nullptrs in cache generationDavid Kalnischkies
If the cache needs to grow to make room to insert volatile files like deb files into the cache we were remapping null-pointers making them non-null-pointers in the process causing trouble later on. Only the current Releasefile pointer can currently legally be a nullpointer as volatile files have no release file they belong to, but for safety the pointer to the current Packages file is equally guarded. The option APT::Cache-Start can be used to workaround this problem. Reported-By: Mattia Rizzolo on IRC
2017-12-13convert various c-style casts to C++-styleDavid Kalnischkies
gcc was warning about ignored type qualifiers for all of them due to the last 'const', so dropping that and converting to static_cast in the process removes the here harmless warning to avoid hidden real issues in them later on. Reported-By: gcc Gbp-Dch: Ignore
2017-12-13deprecate the single-line deprecation ignoring macroDavid Kalnischkies
gcc has problems understanding this construct and additionally thinks it would produce multiple lines and stuff, so to keep using it isn't really worth it for the few instances we have: We can just write the long form there which works better. Reported-By: gcc Gbp-Dch: Ignore
2017-07-12Reformat and sort all includes with clang-formatJulian Andres Klode
This makes it easier to see which headers includes what. The changes were done by running git grep -l '#\s*include' \ | grep -E '.(cc|h)$' \ | xargs sed -i -E 's/(^\s*)#(\s*)include/\1#\2 include/' To modify all include lines by adding a space, and then running ./git-clang-format.sh.
2017-07-12Drop cacheiterators.h includeJulian Andres Klode
Including cacheiterators.h before pkgcache.h fails because pkgcache.h depends on cacheiterators.h.
2017-01-19avoid validate/delete/load race in cache generationDavid Kalnischkies
Keeping the Fd of the cache file we have validated around to later load it into the mmap ensures not only that we load the same file (which wouldn't really be a problem in practice), but that this file also still exists and wasn't deleted e.g. by a 'apt clean' call run in parallel.
2016-11-22Do not use MD5SumValue for Description_md5()Julian Andres Klode
Our profile says we spend about 5% of the time transforming the hex digits into the binary format used by HashsumValue, all for comparing them against the other strings. That makes no sense at all. According to callgrind, this reduces the overall instruction count from 5,3 billion to 5 billion in my example, which roughly matches the 5%.
2016-11-22Compare size before data when ordering cache bucket entriesJulian Andres Klode
This has the effect of significantly reducing actual string comparisons, and should improve the performance of FindGrp a bit, although it's hardly measureable (callgrind says it uses 10% instructions less now).
2016-07-01do not treat same-version local debs as downgradeDavid Kalnischkies
As the volatile sources are parsed last they were sorted behind the dpkg/status file and hence are treated as a downgrade, which isn't really what you want to happen as from a user POV its an upgrade.
2016-03-06Prevent double remapping of iterators and string viewsJulian Andres Klode
If an iterator or a stringview has multiple dynamic objects registered with it, it may be remapped twice. Prevent that by noting which iterators/views we have seen and not remapping one if we have already seen it. We most likely do not have any instance of multiple dynamics on a single object, but let's play safe - the overhead is not high.
2016-01-27deal better with (very) small apt::cache-start valuesDavid Kalnischkies
It is a bit academic to support values which aren't big enough to fit even the hashtables without resizing, but cleaning up ensures that we do the right thing (aka not segfaulting) even if something goes wrong in these deep layers. You still can't have very very small values through… Git-Dch: Ignore
2016-01-26convert Version() and Architecture() to APT::StringViewDavid Kalnischkies
Part of hidden classes, so conversion is abi-free. Git-Dch: Ignore
2016-01-25reimplement build-dep via apts normal resolverDavid Kalnischkies
build-dep was implemented by parsing the build-dependencies of a package and figuring out which packages to install/remove based on this. That means that for the first level of dependencies build-dep was implementing its very own resolver with all the benefits (aka: bugs) this gives us for not using the existing resolver for all levels. Making this work involves generating a dummy binary package with fitting Depends and Conflicts and as we can't create them out of thin air the cache generation needs to be involved so we end up writing a Packages file which we want to parse – after we have parsed the other Packages files already. With .dsc/.deb files we could add them before we started parsing anything. With a bit of care we can avoid generating too much data we have to throw away again (as many parts assume that e.g. the count of packages doesn't change midair), so that on a speed front there shouldn't be much of a difference, but output can be slightly confusing as if we have a completely valid cache on disk the "Reading package lists... Done" is printed two times – but apt is pretty quick about it in that case. Closes: #137560, #444930, #489911, #583914, #728317, #812173
2016-01-25use consistently the last : as name:arch separatorDavid Kalnischkies
Proper debian packages do not contain ':' in the package name, so for real packages this is a non-issue, but apt itself frequently makes use of packages with such an illegal name for internal proposes. Git-Dch: Ignore
2016-01-25always create pkg at the time pkg:arch is createdDavid Kalnischkies
To resolve dependencies like "pkg:arch" we create a package with the name "pkg:arch" and the architecture "any". We create these packages only if a dependency needs it as these kind of dependencies aren't that common. This commit ensured that in the even this architecture specific dependency is the only relation this package has we still create the underlying package to have them available in provides resolution.
2016-01-23Remap another (non-parameter) StringViewJulian Andres Klode
I only looked at parameters in the previous commit, which was not enough: One place also generated local string views. In this case, we only need to make ArchA dynamic, as NameA is not used after the FindPkg() call. Gbp-Dch: ignore
2016-01-23Remap StringView instances pointing into the cacheJulian Andres Klode
It turns out that StringViews might need to be remapped in some places because they come from the cache. For example, some sites pass a Ver.VerStr() to NewProvides(). Such a StringView would become invalid during the duration of the call if the cache is remapped, causing the program to die with a segmentation fault. We can take care of those issues by remapping string views in the same way we remap all the iterators. String views are only remapped if they point into the cache though, this allows us to write more generic code on the callee site without having to check whether the view points into the cache or not. That's not as efficient as possible, but the overhead does not appear to be measurable. Closes: #812251
2016-01-23Pass the old map size to ReMap()Julian Andres Klode
This allows us to check if a value to be remapped was inside the cache or not, which will become useful at a later point. Gbp-Dch: ignore
2016-01-14fix M-A:foreign provides creation for unknown archsDavid Kalnischkies
Architectures for packages which do not belong to the native nor a foreign architecture (dubbed barbarian for now) which are marked M-A:foreign still provide in their own architecture even if not for others. Also, other M-A:foreign (and allowed) packages provide in these barbarian architectures.
2016-01-08Store the size of strings in the cacheJulian Andres Klode
By storing the size of the string in the cache, we can make use of it when comparing the names in the hashtable in pkgCache::FindGrp.
2016-01-08pkgCacheGenerator: CurMd5.Value() cannot be emptyJulian Andres Klode
It makes no sense to check if the value is empty, as it cannot be. It will always be a hexstring of exactly 32 bytes.
2016-01-08pkgCacheGenerator::StoreString: Get rid of std::stringJulian Andres Klode
Instead of storing a string -> map_stringitem_t mapping, create our own data type that can point to either a normal string or a string inside the cache. This avoids the creation of any string and improves performance slightly (about 4%).
2016-01-08Replace compare() == 0 checks with this == other checksJulian Andres Klode
This improves performance, as we now can ignore unequal strings based on their length already. Gbp-Dch: ignore
2016-01-08pkgCacheGenerator: Use StringView for toStringJulian Andres Klode
This removes some minor overhead. Gbp-Dch: ignore
2016-01-08pkgCacheGenerator::StoreString: Move the string into the mapJulian Andres Klode
Moving the string is likely faster than copying it. We could probably avoid strings alltogether in the future using some more crazy code, but I have not looked at that yet. Gbp-Dch: ignore
2016-01-07Switch performance critical code to use APT::StringViewJulian Andres Klode
This improves performance of the cache generation on my ARM platform (4x Cortex A15) by about 10% to 20% from 2.35-2.50 to 2.1 seconds.
2015-12-29Do not sync the cache fileJulian Andres Klode
Integrity is taken care of by the checksum now.
2015-12-29Add support for calculating hashes over the entire cacheJulian Andres Klode
2015-12-29pkgCacheGenerator: Allow passing down an already created cacheJulian Andres Klode
If we already have opened a cache, there is no point in having to open it again.
2015-12-27pkgcachegen: Use std::unordered_map instead of std::mapJulian Andres Klode
std::unordered_map is faster than std::map in our use case, reducing cache generation time by about 10% in my benchmark.
2015-11-27add messages to our deprecation warnings in libaptDavid Kalnischkies
Git-Dch: Ignore
2015-11-20do not segfault in cache generation on mmap failureDavid Kalnischkies
Out of memory and similar circumstanzas could cause MMap::Map to fail and especially the mmap/malloc calls in it. With some additional checking we can avoid segfaults and similar in such situations – at least in theory as if this is a real out of memory everything we do to handle the error could just as well run into a memory problem as well… But at least in theory (if MMap::Map is made to fail always) we can deal with it so good that a user actually never sees a failure (as the cache it tries to load with it fails and is discarded, so that DynamicMMap takes over and a new one is build) instead of segfaulting. Closes: 803417
2015-11-05apply various suggestions made by cppcheckDavid Kalnischkies
Reported-By: cppcheck Git-Dch: Ignore