Age | Commit message (Collapse) | Author |
|
I only looked at parameters in the previous commit, which was
not enough: One place also generated local string views. In this
case, we only need to make ArchA dynamic, as NameA is not used
after the FindPkg() call.
Gbp-Dch: ignore
|
|
It turns out that StringViews might need to be remapped in some
places because they come from the cache. For example, some sites
pass a Ver.VerStr() to NewProvides().
Such a StringView would become invalid during the duration of
the call if the cache is remapped, causing the program to die
with a segmentation fault.
We can take care of those issues by remapping string views in
the same way we remap all the iterators. String views are only
remapped if they point into the cache though, this allows us
to write more generic code on the callee site without having
to check whether the view points into the cache or not.
That's not as efficient as possible, but the overhead does not
appear to be measurable.
Closes: #812251
|
|
This allows us to check if a value to be remapped was inside
the cache or not, which will become useful at a later point.
Gbp-Dch: ignore
|
|
APT::StringView is supposed to be a temporary measure, until support
for the standardized string_view is widely available. Introducing
additional unstandardized features just makes porting to the
standard version harder.
The constexpr constructor also won't have any real effect on most
systems, as the compiler will happily optimise the strlen() call
away for constant strings.
Gbp-Dch: ignore
|
|
Git-Dch: Ignore
|
|
The commit also adds a few trivial tests
Git-Dch: Ignore
|
|
The position returned is supposed to be the position of the character
counted from the start of the string, but if we used the substr calling
overloads the skipped over prefix wasn't considered. The pos parameter
of rfind had also the wrong semantic.
|
|
Introduced in 9d2a8a7388cf3b0bbbe92f6b0b30a533e1167f40 apt tries to
merge actions like downloading the same (as judged by hashes) file
into doing it once. The implementation was very simple in that it isn't
planing at all. Turns out that it works 90% of the time just fine, but
has issues in more complicated situations in which items can be in
different stages downloading different files emitting potentially the
"wrong" hash – like while pdiffs are worked on we might end up copying
the patch instead of the result file giving us very strange errors in
return. Reverting the change until we can implement a better planing
solution seems to be the best course of action even if its sad.
Closes: 810046
|
|
Architectures for packages which do not belong to the native nor a
foreign architecture (dubbed barbarian for now) which are marked
M-A:foreign still provide in their own architecture even if not for
others. Also, other M-A:foreign (and allowed) packages provide in these
barbarian architectures.
|
|
Fix reproducibility issue due to readdir() order by sorting
the list of sources to be built and linked.
[jak@debian.org: Added summary and fixed typo]
Closes: #810509
|
|
By storing the size of the string in the cache, we can make use of
it when comparing the names in the hashtable in pkgCache::FindGrp.
|
|
Hide the std::string overload instead of providing a
const char * one, the old variant was stupid.
Gbp-Dch: ignore
|
|
I overlooked this
Gbp-Dch: ignore
|
|
The code already deals with compressed leftovers, but forgot the
uncompressed files. The opertunity is picked to reorder this code and
add debug messages about the actions taken as well as produce such a
leftover file in the associated testcase.
|
|
With the addition of the $HASH-Download field in the .diff/Index we got
the size of the compressed patches for 'free', so if that information is
available we can use it for a more fitting calculation of the size
requirements of the patches vs. the complete file.
Note that this predicts a too small size in the transition case in which
the information isn't available for all patches, but figuring this out
would be a lot of code for practically nothing as only one update can
ever be in such a transition phase.
|
|
Downloading and storing are two different operations were different
compression types can be preferred. For downloading we provide the
choice via Acquire::CompressionTypes::Order as there is a choice to
be made between download size and speed – and limited by whats available
in the repository.
Storage on the other hand has all compressions currently supported by
apt available and to reduce runtime of tools accessing these files the
compression type should be a low-cost format in terms of decompression.
apt traditionally stores its indexes uncompressed on disk, but has
options to keep them compressed. Now that apt downloads additional files
we also deal with files which simply can't be stored uncompressed as
they are just too big (like Contents for apt-file). Traditionally they
are downloaded in a low-cost format (gz) as repositories do not provide
other formats, but there might be even lower-cost formats and for
download we could introduce higher-cost in the repositories.
Downloading an entire index potentially requires recompression to
another format, so an update takes potentially longer – but big files
are usually updated via pdiffs which has to de- and re-compress anyhow
and does it on the fly anyhow, so there is no extra time needed and in
general it seems to be benefitial to invest the time in update to save
time later on file access.
|
|
There is no reason to enforce that the file we start the bootstrap with
is compressed with a compressor which is available online. This allows
us to change the on-disk format as well as deals with repositories
adding/removing support for a specific compressor.
|
|
If we store files compressed in lists/ and the file switched compression
formats we happened to retain the "old" format, but by default the
cleanup process catched this oversight and removed the file.
[The initial situation described doesn't arise as we store no files by
default compressed and even with apt-file configuring Contents files, we
don't really have that problem as there is just .gz files for those.]
We solve this by just removing any uncompressed as well as compressed
(we support) file just before we move the 'new' version of the file in.
|
|
Adding a new compressor method meant adding a new method as well – even
if that boilt down to just linking to our generalized decompressor with
a new name. That is unneeded busywork if we can instead just call the
generalized decompressor and let it figure out which compressor to use
based on the filenames rather than by program name.
For compatibility we ship still 'gzip', 'bzip2' and co, but they are
just links to our "new" 'store' method.
|
|
Do not create strings within the loop, that creates one string
per language and does more work than needed. Instead, reserve
enough space at the beginning and assign the prefix, and then
resize and append inside the loop.
Also call exists with the string itself instead of the c_str(),
this means that the lookup uses the size information in the
string now and does not have to call strlen() on it.
|
|
It makes no sense to check if the value is empty, as it cannot
be. It will always be a hexstring of exactly 32 bytes.
|
|
Use the same path for both comparisons, as the operator== path
is faster than just calling compare() - it avoids any comparison
if the size differs.
Gbp-Dch: ignore
|
|
Gbp-Dch: ignore
|
|
Instead of storing a string -> map_stringitem_t mapping, create
our own data type that can point to either a normal string or
a string inside the cache.
This avoids the creation of any string and improves performance
slightly (about 4%).
|
|
This improves performance, as we now can ignore unequal strings
based on their length already.
Gbp-Dch: ignore
|
|
This removes some minor overhead.
Gbp-Dch: ignore
|
|
Moving the string is likely faster than copying it. We could probably
avoid strings alltogether in the future using some more crazy code,
but I have not looked at that yet.
Gbp-Dch: ignore
|
|
Gbp-Dch: ignore
|
|
Gbp-Dch: ignore
|
|
Thanks: Niels Thykier for reporting on IRC
Gbp-Dch: ignore
|
|
This improves performance of the cache generation on my
ARM platform (4x Cortex A15) by about 10% to 20% from
2.35-2.50 to 2.1 seconds.
|
|
The class APT::StringView implements a drop-in replacement
for a subset of C++17 std::string_view() features. It will
be dropped at a later point and may not be used in public
interfaces.
|
|
The maximum parallelization soft limit is the number of CPU
cores * 2 on systems defining _SC_NPROCESSORS_ONLN. The hard
limit in all cases is Acquire::QueueHost::Limit.
|
|
This is a multiple of the page size and thus results in less
page faults, speeding up copying.
Also, while we're at at, unify all uses of that size in a
constant variable APT_BUFFER_SIZE.
|
|
Implement native support for LZ4 compression, using the official
lz4 library.
|
|
This drop the hash table utilization from a high 98%
to acceptable 74% on unstable, and the average bucket length
from 4.6 to 1.8.
This improves performance by about 5%, while increasing
the size of the cache by 0.2 out of 38MB, that is 0.5%.
48481 is a nice number
|
|
Gbp-Dch: ignore
|
|
This will give us the freedom to insert more compressors at
positions in between.
Also change the cost of uncompressed to 0, as that really has
no overhead, and the values do not really mean much.
|
|
This makes code easier to read, and somewhat more correct.
Gbp-Dch: ignore
|
|
Gbp-Dch: ignore
|
|
apt_preferences and deb822-style sources used the specialized class
pkgUserTagSection to deal with comments before/after a given stanza, but
it couldn't deal with comments in the stanza at all.
codesearch suggests that nobody else does and a vastely superior way of
working with potentially commented files is implemented now, so we can
officially discourage the use of the old incomplete hack class.
|
|
Now (55153bf94ff28a23318e79aa48242244c4d82b3c) that pkgTagFile can be
told to deal with all sorts of comments we can use this mode to parse
dsc (as by catch) and debian/control files properly even in the wake of
multiline fields spliced with comments like Build-Depends.
Closes: 806775
|
|
APT usually deals with perfectly formatted files generated automatically
be other programs – and as it has to parse multiple MBs of such files it
tries to be fast rather than forgiving.
This was always a problem if we reused this parser for files with a
deb822 syntax which are mostly written by hand however, like
apt_preferences or the deb822-style sources as these can include stray
newlines and more importantly comments all over the place.
As a stopgap we had pkgUserTagSection which deals at least with comments
before and after a given stanza, but comments in between weren't really
supported and now that we support parsing debian/control for e.g.
build-dep we face the full comment problem e.g. with comments inbetween
multi-line fields (like Build-Depends).
We can't easily deal with this on the pkgTagSection level as the interface
gives access to 'raw' char-pointers for performance reasons so we would
need to optionally add a buffer here on which we could remove comments
to hand out pointers into this buffer instead. The interface is quite
large already and supports writing stanzas as well, which does not
support comments at all either. So while in future it might make sense
to have a parser setup which deals with and keeps comments in this
commit we opt for the simpler solution for now: We officially declare
that pkgTagSection does not support comments and instead expect the
caller to deal with them, which in our case is pkgTagFile:
pkgTagFile is extended with an additional mode which can deal with
comments by dropping them from the buffer which will later form the
input of pkgTagSection. The actual implementation is slightly more
complex than this sentence suggests at first on one hand to have good
performance and on the other to allow jumping directly to stanzas with
offsets collected in a previous run (like our cache generation does it
for example).
|
|
Integrity is taken care of by the checksum now.
|
|
|
|
If we already have opened a cache, there is no point in having
to open it again.
|
|
We ignored the boundary of the buffer we were reading in
while scanning for spaces.
|
|
This shuts up gcc
Gbp-Dch: ignore
|
|
To preserve compatibility, the new inline functions have _inline
as a suffix, and a macro defines the old names to refer to the
inline variants.
The old functions are still preserved for binary compatibility.
Also simplify the implementation of both functions.
|
|
On my testing system, consisting of unstable and experimental,
this reduces the average chain from 6.5 to 4.5, and the longest
chain from 17 to 15.
|