summaryrefslogtreecommitdiff
path: root/apt-pkg/pkgcache.cc
AgeCommit message (Collapse)Author
2020-12-15Unroll pkgCache::sHash 8 time, break up dependencyJulian Andres Klode
Unroll pkgCache::sHash 8 times and break up the dependency between the iterations by expanding the calculation H(n) = 33 * H(n-1) + c 8 times rather than performing it 8 times. This seems to yield about a 0.4% performance improvement. I tried unrolling 4 and 2 bytes as well, those only having 3 ifs at the end rather than 1 small loop; but that was actually slower - potentially the code got to large and the cache went bonkers. I also tried unrolling 4 times instead of 8, thinking that smaller code might yield better results overall then, but that was slower as well.
2020-12-15Use XXH3 for cache, hash table hashingJulian Andres Klode
XXH3 is faster than both our CRC32c implementation as well as DJB hash for hash table hashing, so meh, let's switch to it.
2020-12-10Raise APT::Cache-HashtableSize to 196613Julian Andres Klode
We now have over 100k package names, my Ubuntu system has 125k arleady, so increase the hash table size to match, this will cost us about a MB in cache size, but give a very nice speed up somewhere around 3%-4% or so.
2020-05-25Allow FMV SSE4.2 detection to succeed on clangDavid Kalnischkies
As the builtins were used in the feature test also in the default branch clang fails to compile the test helpfully complaining that you need to compile with sse4.2 to use that while on gcc it is optimized out as unused code and produces only a warning for that… removing the code from the default branch fixes this problem, but we adapt the code some more to avoid compilers optimizing it out in the future just in case.
2020-02-24Make map_pointer<T> typesafeJulian Andres Klode
Instead of just using uint32_t, which would allow you to assign e.g. a map_pointer<Version> to a map_pointer<Package>, use our own smarter struct that has strict type checking. We allow creating a map_pointer from a nullptr, and we allow comparing map_pointer to nullptr, which also deals with comparisons against 0 which are often used, as 0 will be implictly converted to nullptr.
2020-01-16pkgcache.cc: Mix PACKAGE_VERSION into the cache hashJulian Andres Klode
This ensures that caches build with one version can't be opened with another, which makes sense. It's a temporary approach until we can replace major:minor fields with a version string. For example, this would have prevented 1.9.7 from using broken caches from 1.9.6.
2020-01-07Rename _count() macro to APT_ARRAY_SIZE()Julian Andres Klode
2019-11-25Search in all available description translationsАлексей Шилин
When multiple translations of package descriptions are available, perform search in all of them. It allows using search patterns in any of the configured languages. Previously, only the first available translation was searched. As the result, patterns in e.g. English never matched packages which had their descriptions translated into local language. Closes: #490000
2019-07-10Fix typos reported by codespell in code commentsDavid Kalnischkies
Also in old changelogs, but nothing really user visible like error messages or alike so barely noteworthy. Reported-By: codespell Gbp-Dch: Ignore
2019-06-12Bump cache MajorVersion to 16Julian Andres Klode
1.6 was 13, so 1.7 has 14 reserved, and 1.8 has 15 reserved, so let's use 16 for 1.9 for now.
2019-06-11Make APT::StringView publicJulian Andres Klode
2019-02-26cacheiterators: Cleanup deprecated codeJulian Andres Klode
2019-02-26pkgcache: Remove deprecated bitsJulian Andres Klode
2019-02-04Detect function multiversioning and sse4.2/crc32, enables i386Julian Andres Klode
This fixes the build on kfreebsd-amd64, and due to the detection of sse4.2, should also enable the sse4.2 on i386.
2019-01-05hash32: Tighten to multiversion to x86-64 ELF and use uint32_tJulian Andres Klode
2018-12-26cache hash: Use sse4.2 CRC32c on x86-64 where availableJulian Andres Klode
This is more than twice as fast as adler32, but could be made another 50% faster by calculating crcs for 8 byte blocks in "parallel" (without data dependency) and then combining them. But that's complicated code. Reference measurements for hashing the cache 100 times: adler32=2.46s xxhash64=0.64 xxhash32=1.12 crc32c(this)=1.10 crc32c(opt)=0.44s
2018-11-25Fix typo reported by codespell in code commentsDavid Kalnischkies
No user visible change expect for some years old changelog entries, so we don't really need to add a new one for this… Reported-By: codespell Gbp-Dch: Ignore
2018-05-11Deprectate buggy/incorrect Rls/PkgFile::IsOk methodsDavid Kalnischkies
With the advent of compressed files and especially with in-memory post-processed files the simple assumptions made in IsOk are no longer true. Worse, they are at best duplicates of checks performed by the cache generation (and validation) earlier and isn't used in too many places. It is hence best to simply get right of these calls instead of trying to fix them.
2018-05-07Remove obsolete RCS keywordsGuillem Jover
Prompted-by: Jakub Wilk <jwilk@debian.org>
2018-03-19Bump cache major version to allow different 1.5 and 1.6 updatesJulian Andres Klode
Shipping 1.6 with major 12 would not allow us to update 1.5.y in a different way than 1.6.y if we have to without resorting to minor version hacks. Let's just bump the major instead.
2017-07-12Reformat and sort all includes with clang-formatJulian Andres Klode
This makes it easier to see which headers includes what. The changes were done by running git grep -l '#\s*include' \ | grep -E '.(cc|h)$' \ | xargs sed -i -E 's/(^\s*)#(\s*)include/\1#\2 include/' To modify all include lines by adding a space, and then running ./git-clang-format.sh.
2017-06-26pkgcache: Bump major version to 12Julian Andres Klode
We need to be able to update 1.4.y in different ways than later apt versions, and thus need to bump the major version so there is no collision in the minor version at some point.
2017-01-19fix various typos reported by spellintianDavid Kalnischkies
Most of them in (old) code comments. The two instances of user visible string changes the po files of the manpages are fixed up as well. Gbp-Dch: Ignore Reported-By: spellintian
2016-11-22Compare size before data when ordering cache bucket entriesJulian Andres Klode
This has the effect of significantly reducing actual string comparisons, and should improve the performance of FindGrp a bit, although it's hardly measureable (callgrind says it uses 10% instructions less now).
2016-11-22Introduce tolower_ascii_unsafe() and use it for hashingJulian Andres Klode
This one has some obvious collisions for non-alphabetical characters, like some control characters also hashing to numbers, but we don't really have those, and these are hash functions which are not collision free to begin with.
2016-11-22Bump the cache major version for non-backportable changesJulian Andres Klode
We already have two stable series with major version 10, and the next commits will introduce non-backportable performance changes that affect the cache algorithms, so we need to bump the major version now to prevent future problems.
2016-09-18VersionHash: Do not skip too long dependency linesJulian Andres Klode
If the dependency line does not contain spaces in the repository but does in the dpkg status file (because dpkg normalized the dependency list), the dpkg line might be longer than the line in the repository. If it now happens to be longer than 1024 characters, it would be skipped, causing the hashes to be out of date. Note that we have to bump the minor cache version again as this changes the format slightly, and we might get mismatches with an older src cache otherwise. Fixes Debian/apt#23
2016-06-28cache: Bump minor version to 6Julian Andres Klode
Needed for the previous change
2016-04-28factor out Pkg/DepIterator prettyprinters into own headerDavid Kalnischkies
The old prettyprinters have only access to the struct they pretty print, which isn't enough usually as we want to know for a package also a bit of state information like which version is the candidate. We therefore need to pull the DepCache into context and hence use a temporary struct which is printed instead of the iterator itself.
2016-03-19cachefile: Only set members that were initialized successfullyJulian Andres Klode
Otherwise, things will just start failing later down the stack, because (a) the lazy getters do not check if building was successful and (b) any further getter call would return the invalid object anyway. Also initialize VS in pkgCache to nullptr by default. Closes: #818628
2016-03-07Fix several typosVeres Lajos
This effectively merges branch 'typofixes-vlajos-20150807' of github.com:vlajos/apt with the following commit: commit 13cacb3e2e2352ba701e769fc889e3344fabbf7e Author: Veres Lajos <vlajos@gmail.com> Date: Sun Aug 9 00:12:53 2015 +0100 typofix - https://github.com/vlajos/misspell_fixer It has been rebased for a better commit message.
2016-02-25Fix crash with empty architecture listJulian Andres Klode
If the architecture list is empty somehow, fail normally. LP: #1549819
2016-01-26parse version correctly from binary Source fieldDavid Kalnischkies
In commit a221efc331693f8905da870141756c892911c433 I promoted the source package name and version to the binary cache for faster access by e.g. EDSP, but due to changing the interpretation length to soon we always ignored the version part of the Source field, so that packages ended up having the binary version as source version – which while usually just fine it is wrong for binary rebuilds. Closes: 812492
2016-01-25use consistently the last : as name:arch separatorDavid Kalnischkies
Proper debian packages do not contain ':' in the package name, so for real packages this is a non-issue, but apt itself frequently makes use of packages with such an illegal name for internal proposes. Git-Dch: Ignore
2016-01-08Store the size of strings in the cacheJulian Andres Klode
By storing the size of the string in the cache, we can make use of it when comparing the names in the hashtable in pkgCache::FindGrp.
2016-01-07Switch performance critical code to use APT::StringViewJulian Andres Klode
This improves performance of the cache generation on my ARM platform (4x Cortex A15) by about 10% to 20% from 2.35-2.50 to 2.1 seconds.
2016-01-03Increase APT::Cache-HashTableSize default to 50503Julian Andres Klode
This drop the hash table utilization from a high 98% to acceptable 74% on unstable, and the average bucket length from 4.6 to 1.8. This improves performance by about 5%, while increasing the size of the cache by 0.2 out of 38MB, that is 0.5%. 48481 is a nice number
2016-01-03apt-cache: stats: Show a table utilization as percentageJulian Andres Klode
Gbp-Dch: ignore
2015-12-29Add support for calculating hashes over the entire cacheJulian Andres Klode
2015-12-29Switch to DJB hashing and use prime number as table sizeJulian Andres Klode
On my testing system, consisting of unstable and experimental, this reduces the average chain from 6.5 to 4.5, and the longest chain from 17 to 15.
2015-12-14pkgcache: Make hash arch-independent using fixed size integerJulian Andres Klode
This helps writing test cases. Also adapt the test case that expected 64-bit. Nothing changes performance wise, the distribution of the hash values remains intact.
2015-12-11Bump cache minor version to 2 to trigger rebuildsJulian Andres Klode
With the package names now normalized to lower case, the caches of affected systems need to be rebuild. Adjust the minor version to trigger such a rebuild. Gbp-Dch: ignore
2015-12-10Do not swap required and important in pkgCache::Priority()Julian Andres Klode
required and important were swapped, leading to wrong output. Closes: #807523 Thanks: Manuel A. Fernandez Montecelo for discovering this
2015-09-14remove incorrect optimization branchesDavid Kalnischkies
These assumptions were once true, but they aren't anymore, so what is supposed to be a speed up is effectively a slowdown [not that it would be noticible]. Usage of SingleArchFindPkg was nuked in a stable update already as the included assumption was actually harmful btw, which is why we should get right of other 'non-harmful' but still untrue assumptions while we can. Git-Dch: Ignore
2015-09-14implement dpkgs vision of interpreting pkg:<arch> dependenciesDavid Kalnischkies
How the Multi-Arch field and pkg:<arch> dependencies interact was discussed at DebConf15 in the "MultiArch BoF". dpkg and apt (among other tools like dose) had a different interpretation in certain scenarios which we resolved by agreeing on dpkg view – and this commit realizes this agreement in code. As was the case so far libapt sticks to the idea of trying to hide MultiArch as much as possible from individual frontends and instead translates it to good old SingleArch. There are certainly situations which can be improved in frontends if they know that MultiArch is upon them, but these are improvements – not necessary changes needed to unbreak a frontend. The implementation idea is simple: If we parse a dependency on foo:amd64 the dependency is formed on a package 'foo:amd64' of arch 'any'. This package is provided by package 'foo' of arch 'amd64', but not by 'foo' of arch 'i386'. Both of those foo packages provide each other through (assuming foo is M-A:foreign) to allow a dependency on 'foo' to be satisfied by either foo of amd64 or i386. Packages can also declare to provide 'foo:amd64' which is translated to providing 'foo:amd64:any' as well. This indirection over provides was chosen as the alternative would be to teach dependency resolvers how to deal with architecture specific dependencies – which violates the design idea of avoiding resolver changes, especially as architecture-specific dependencies are a cornercase with quite a few subtil rules. Handling it all over versioned provides as we already did for M-A in general seems much simpler as it just works for them. This switch to :any has actually a "surprising" benefit as well: Even frontends showing a package name via .Name() [which doesn't show the architecture] will display the "architecture" for dependencies in which it was explicitely requested, while we will not show the 'strange' :any arch in FullName(true) [= pretty-print] either. Before you had to specialcase these and by default you wouldn't get these details shown. The only identifiable disadvantage is that this complicates error reporting and handling. apt-get's ShowBroken has existing problems with virtual packages [it just shows the name without any reason], so that has to be worked on eventually. The other case is that detecting if a package is completely unknown or if it was at least referenced somewhere needs to acount for this "split" – not that it makes a practical difference which error is shown… but its one of the improvements possible.
2015-09-14store ':any' pseudo-packages with 'any' as architectureDavid Kalnischkies
Previously we had python:any:amd64, python:any:i386, … in the cache and the dependencies of an amd64 package would be on python:any:amd64, of an i386 on python:any:i386 and so on. That seems like a relatively pointless endeavor given that they will all be provided by the same packages and therefore also a waste of space. Git-Dch: Ignore
2015-08-17Cleanup includes after running iwyuMichael Vogt
2015-08-10parse packages from all architectures into the cacheDavid Kalnischkies
Now that we can dynamically create dependencies and provides as needed rather than requiring to know with which architectures we will deal before running we can allow the listparser to parse all records rather than skipping records of "unknown" architectures. This can e.g. happen if a user has foreign architecture packages in his status file without dpkg knowing about this architecture (or apt configured in this way). A sideeffect is that now arch:all packages are (correctly) recorded as available from any Packages file, not just from the native one – which has its downsides for the resolver as mixed-arch source packages can appear in different architectures at different times, but that is the problem of the resolver and dealing with it in the parser is at best a hack (and also depends on a helpful repository). Another sideeffect is that his allows :none packages to appear in Packages files again as we don't do any kind of checks now, but given that they aren't really supported (anymore) by anyone we can live with that.
2015-08-10hide implicit deps in apt-cache again by defaultDavid Kalnischkies
Before MultiArch implicits weren't a thing, so they were hidden by default by definition. Adding them for MultiArch solved many problems, but having no reliable way of detecting which dependency (and provides) is implicit or not causes problems everytime we want to output dependencies without confusing our observers with unneeded implementation details. The really notworthy point here is actually that we keep now a better record of how a dependency came to be so that we can later reason about it more easily, but that is hidden so deep down in the library internals that change is more the problems it solves than the change itself.
2015-08-10use a smaller type for flags storage in the cacheDavid Kalnischkies
We store very few flags in the cache, so keeping storage space for 8 is enough for all of them and still leaves a few unused bits remaining for future extensions without wasting bytes for nothing. Git-Dch: Ignore