author | Julian Andres Klode <julian.klode@canonical.com> | 2020-01-08 11:03:28 +0100 |
---|---|---|
committer | Julian Andres Klode <julian.klode@canonical.com> | 2020-01-08 11:13:27 +0100 |
commit | 6902792898a9fcc3bdff605e2097e6a5cd2d6bbc (patch) | |
tree | 6206b42552cdabdf0c373a921edd46cef2e3dbc7 /ftparchive/byhash.h | |
parent | d3636f2666b77eb17b261300cb91eb912e2789c6 (diff) |
Avoid extra out-of-cache hash table deduplication for package names

We were deduplicating package name strings in StoreString, even though
most of them are already deduplicated by belonging to a group, so
NewGroup() did extra hash table lookups that could be avoided.

To continue deduplicating names across binary packages and source
packages, insert groups for source packages as well. This is also
a good first step toward efficient lookup of packages by source
package - we can later extend Group with a list of SourceVersion
objects or, alternatively, simply add a by-source chain to
pkgCache::Version.
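
The idea above can be sketched in a few lines. This is a toy model, not apt's cache generator: ToyCache, FindOrCreateGroup, NameOffsetForBinary and NameOffsetForSource are hypothetical names, and the std::unordered_map stands in for the group hash table. It only illustrates that when every name, binary or source, is resolved through a group, the group lookup alone keeps each name stored once and no second string-deduplication table is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical group record: remembers where its name starts in the pool.
struct Group {
    std::size_t name_offset;
};

// Toy cache (assumption, not apt's data layout): one string pool plus a
// name-to-group index. The real cache would hash into the pool itself
// instead of keeping a second copy of the name as a map key.
class ToyCache {
public:
    // Returns the group id for `name`, storing the name only on first sight.
    std::size_t FindOrCreateGroup(const std::string &name) {
        auto found = group_index_.find(name);
        if (found != group_index_.end())
            return found->second;          // name already stored via its group
        const std::size_t offset = pool_.size();
        pool_ += name;
        pool_ += '\0';                     // NUL-terminate the pooled string
        groups_.push_back(Group{offset});
        const std::size_t id = groups_.size() - 1;
        group_index_.emplace(name, id);
        return id;
    }

    // Binary and source package names both go through the group lookup,
    // so a name shared by both is stored exactly once.
    std::size_t NameOffsetForBinary(const std::string &name) {
        return groups_[FindOrCreateGroup(name)].name_offset;
    }
    std::size_t NameOffsetForSource(const std::string &name) {
        return groups_[FindOrCreateGroup(name)].name_offset;
    }

    std::size_t GroupCount() const { return groups_.size(); }
    std::size_t PoolSize() const { return pool_.size(); }

private:
    std::string pool_;
    std::vector<Group> groups_;
    std::unordered_map<std::string, std::size_t> group_index_;
};
```

Registering "apt" once as a binary package and once as a source package yields the same pool offset and a single group. A source-only name creates a new group, which matches the growth in "Total package names" in the statistics.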

This change improves performance by about 10% (913 ms to 814 ms) while
adding no significant overhead to the cache size:
--- before
+++ after
@@ -1,7 +1,7 @@
-Total package names: 109536 (2.191 k)
-Total package structures: 118689 (4.748 k)
+Total package names: 119642 (2.393 k)
+Total package structures: 118687 (4.747 k)
Normal packages: 83309
- Pure virtual packages: 3365
+ Pure virtual packages: 3363
Single virtual packages: 17811
Mixed virtual packages: 1973
Missing: 12231
@@ -10,21 +10,21 @@ Total distinct descriptions: 149291 (3.583 k)
Total dependencies: 484135/156650 (12,2 M)
Total ver/file relations: 57421 (1.378 k)
Total Desc/File relations: 18219 (437 k)
-Total Provides mappings: 29963 (719 k)
+Total Provides mappings: 29959 (719 k)
Total globbed strings: 226993 (5.332 k)
Total slack space: 26,8 k
-Total space accounted for: 38,1 M
+Total space accounted for: 38,3 M
Total buckets in PkgHashTable: 50503
- Unused: 5727
- Used: 44776
- Utilization: 88.6601%
- Average entries: 2.65073
+ Unused: 5728
+ Used: 44775
+ Utilization: 88.6581%
+ Average entries: 2.65074
Longest: 60
Shortest: 1
Total buckets in GrpHashTable: 50503
- Unused: 5727
- Used: 44776
- Utilization: 88.6601%
- Average entries: 2.44631
- Longest: 10
+ Unused: 4649
+ Used: 45854
+ Utilization: 90.7946%
+ Average entries: 2.60919
+ Longest: 11
Shortest: 1
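
The derived figures in the statistics can be cross-checked with a little arithmetic. The formulas below are inferred from the numbers shown, not taken from apt's source: utilization matches used buckets divided by total buckets, and the GrpHashTable "Average entries" matches total package names divided by used buckets. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Percentage of hash table buckets that hold at least one entry.
double UtilizationPercent(double used_buckets, double total_buckets) {
    return 100.0 * used_buckets / total_buckets;
}

// Mean chain length, averaged over the used buckets only.
double AverageEntries(double entries, double used_buckets) {
    return entries / used_buckets;
}

// Relative reduction in run time, in percent.
double TimeSavedPercent(double before_ms, double after_ms) {
    return 100.0 * (before_ms - after_ms) / before_ms;
}
```

With the values above, UtilizationPercent(45854, 50503) gives about 90.7946, AverageEntries(119642, 45854) gives about 2.60919, and TimeSavedPercent(913, 814) gives roughly 10.8, in line with the "about 10%" quoted in the message.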