Age | Commit message | Author |
|
|
|
|
|
|
|
Our profile says we spend about 5% of the time transforming the
hex digits into the binary format used by HashsumValue, all for
comparing them against the other strings. That makes no sense
at all.
According to callgrind, this reduces the overall instruction
count from 5.3 billion to 5 billion in my example, which
roughly matches the 5%.
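A minimal sketch of the idea, assuming the stored value and the candidate are both available as hex text. The helper name SameHexDigest is hypothetical and not part of libapt; it only shows that two digests can be compared character by character without decoding either into bytes first.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: compare two hex digests directly as text,
// ignoring case, without converting them to a binary form first.
bool SameHexDigest(std::string_view a, std::string_view b)
{
   if (a.size() != b.size())
      return false;
   for (std::size_t i = 0; i < a.size(); ++i)
   {
      auto const x = static_cast<unsigned char>(a[i]);
      auto const y = static_cast<unsigned char>(b[i]);
      if (std::tolower(x) != std::tolower(y))
         return false;
   }
   return true;
}
```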
|
|
Generating a string for each version we see is somewhat inefficient.
The problem here is that the Description tag names are longer than
15 bytes, and thus require an allocation on the heap, which we should
avoid.
It seems reasonable that 20 characters work for all language codes
used for archive descriptions, but if not, there's a warning, so
we'll catch that.
This should improve performance by about 2%.
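A rough sketch of building such a tag name in a fixed-size stack buffer instead of a heap-allocated string. All names here (BuildDescriptionTag, MaxLangLen, TagBufSize) are hypothetical; only the 20-character limit and the "warn if it does not fit" behaviour come from the message above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

constexpr char Prefix[] = "Description-";
constexpr std::size_t PrefixLen = sizeof(Prefix) - 1;
constexpr std::size_t MaxLangLen = 20;                        // assumed limit from the text
constexpr std::size_t TagBufSize = PrefixLen + MaxLangLen + 1; // +1 for the NUL terminator

// Formats "Description-<lang>" into a caller-provided stack buffer.
// Returns false if the language code is too long; the caller would
// emit the warning in that case.
bool BuildDescriptionTag(char (&buf)[TagBufSize], std::string_view lang)
{
   if (lang.size() > MaxLangLen)
      return false;
   char *out = std::copy(Prefix, Prefix + PrefixLen, buf);
   out = std::copy(lang.begin(), lang.end(), out);
   *out = '\0';
   return true;
}
```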
|
|
Stop copying stuff, and just pass the bytes one by one to the
newly created AddCRC16Byte. This improves the instruction count
for an update run from 720,850,121 to 455,801,749 according to
callgrind.
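To illustrate the byte-at-a-time approach, here is a small sketch of a CRC-16 step function plus a buffer variant built on top of it. The name AddCRC16Byte comes from the message above, but the signature, the bitwise implementation and the polynomial (0x8408, reflected) are assumptions chosen for illustration and may differ from what libapt does.

```cpp
#include <cstddef>
#include <cstdint>

// One step of a bitwise, reflected CRC-16. The polynomial 0x8408 is an
// assumption for this sketch, not necessarily the one apt uses.
inline std::uint16_t AddCRC16Byte(std::uint16_t crc, unsigned char byte)
{
   crc = static_cast<std::uint16_t>(crc ^ byte);
   for (int bit = 0; bit < 8; ++bit)
      crc = (crc & 1) ? static_cast<std::uint16_t>((crc >> 1) ^ 0x8408)
                      : static_cast<std::uint16_t>(crc >> 1);
   return crc;
}

// Buffer variant: a caller that already walks its input byte by byte can
// call AddCRC16Byte directly and skip copying into a temporary buffer.
inline std::uint16_t AddCRC16(std::uint16_t crc, void const *data, std::size_t len)
{
   auto const *p = static_cast<unsigned char const *>(data);
   while (len-- != 0)
      crc = AddCRC16Byte(crc, *p++);
   return crc;
}
```

Feeding the bytes one at a time and feeding the whole buffer yield the same checksum, so a parser can update the CRC while it scans.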
|
|
This one has some obvious collisions for non-alphabetical characters,
like some control characters also hashing to numbers, but we don't
really have those, and these are hash functions which are not
collision free to begin with.
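One way such collisions can arise, shown as a sketch: if case folding is done by setting bit 0x20 on every byte (an assumption, the message does not say how it is done), upper-case letters fold onto lower-case ones for free, but control characters 0x10 to 0x19 also land on the digits. HashIgnoreCase and the djb2-style mixing are hypothetical.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical case-insensitive hash: OR-ing 0x20 into each byte folds
// 'A'..'Z' onto 'a'..'z' without a branch or a tolower() call.
// Side effect: bytes 0x10..0x19 hash like the digits '0'..'9'.
inline std::uint32_t HashIgnoreCase(std::string_view s)
{
   std::uint32_t h = 5381; // djb2-style seed, chosen for illustration
   for (char c : s)
      h = h * 33 + (static_cast<unsigned char>(c) | 0x20u);
   return h;
}
```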
|
|
This basically gets rid of 40-50% of the hash table lookups,
making things a bit faster that way, and the profiles look
far cleaner.
|
|
You can pretty much achieve the same with a local dummy package if you
want to, but libapt has an inbuilt setting for essential: "apt" which
can be overridden with this option as well – it could be helpful in
quick tests and what not so adding this alternative shouldn't really
hurt much.
We aren't going to document them much though, as care must be taken with
regard to the binary caches as they aren't invalidated by config
options alone, so the effects of old settings could still be in them,
similar to the other already existing pkgCacheGen option(s).
Closes: 767891
Thanks: Anthony Towns for initial patch
|
|
If the dependency line does not contain spaces in the repository
but does in the dpkg status file (because dpkg normalized the
dependency list), the dpkg line might be longer than the line
in the repository. If it now happens to be longer than 1024
characters, it would be skipped, causing the hashes to be
out of date.
Note that we have to bump the minor cache version again as
this changes the format slightly, and we might get mismatches
with an older src cache otherwise.
Fixes Debian/apt#23
|
|
If we have a (e.g. locally built) deb file installed and try to
install it again, apt complained about this being a downgrade, but it
wasn't, as it is the very same version… it was just confused into not
merging the versions together, which then looks like a downgrade.
The same size assumption is usually good, but given that volatile files
are parsed last (even after the status file) the base assumption no
longer holds, but is easy to adapt without actually changing anything in
practice.
|
|
If a package file is formatted in a way that no space
follows a deprecated "<", we would reformat it to "<=" and
increase the length of the output by 1, which can break.
Under normal circumstances with "<=" this should not be an
issue.
Closes: #828812
|
|
Mysteriously segfaults only on i386 for me, but at least one reporter
had the same behavior and it makes sense that this is the problem as the
parsing of Source: was fixed in 1.2.2 – before the not remapped group
was not used.
We don't use our usual Dynamic<> trick here as we don't have it in the
parser. It's a bit of a layer violation to do this parsing here, but it's
how it always was…
Until next time with this lovely kind of problem.
Closes: 812251
Thanks: Francesco Poli and Marc Haber for testdata.
|
|
Part of hidden classes, so conversion is abi-free.
Git-Dch: Ignore
|
|
These virtual methods are implemented in hidden classes, so we can drop
them without breaking the ABI.
Git-Dch: Ignore
|
|
In commit a221efc331693f8905da870141756c892911c433 I promoted the source
package name and version to the binary cache for faster access by e.g.
EDSP, but due to changing the interpretation length too soon we always
ignored the version part of the Source field, so that packages ended up
having the binary version as source version – which, while usually just
fine, is wrong for binary rebuilds.
Closes: 812492
|
|
Git-Dch: Ignore
|
|
Git-Dch: Ignore
|
|
Architectures for packages which do not belong to the native nor a
foreign architecture (dubbed barbarian for now) which are marked
M-A:foreign still provide in their own architecture even if not for
others. Also, other M-A:foreign (and allowed) packages provide in these
barbarian architectures.
|
|
I overlooked this
Gbp-Dch: ignore
|
|
Do not create strings within the loop, that creates one string
per language and does more work than needed. Instead, reserve
enough space at the beginning and assign the prefix, and then
resize and append inside the loop.
Also call exists with the string itself instead of the c_str();
this means that the lookup uses the size information in the
string now and does not have to call strlen() on it.
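The pattern described above can be sketched as follows. The function name, the "Description-" prefix and the exists callback are assumptions for illustration; the point is the reserve / assign / resize / append sequence on a single reused string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: check one tag per language while reusing a single
// std::string instead of constructing a new one in every iteration.
template <typename Exists>
std::size_t CountAvailableDescriptions(std::vector<std::string> const &langs, Exists exists)
{
   static constexpr char prefix[] = "Description-";
   constexpr std::size_t prefixLen = sizeof(prefix) - 1;

   std::string tag;
   tag.reserve(prefixLen + 20);   // room for typical language codes
   tag.assign(prefix, prefixLen);

   std::size_t found = 0;
   for (auto const &lang : langs)
   {
      tag.resize(prefixLen);      // drop the previous language, keep the prefix
      tag.append(lang);
      if (exists(tag))            // pass the string itself, not tag.c_str()
         ++found;
   }
   return found;
}
```

Passing the std::string keeps its length available to the callee, so no strlen() is needed on the lookup side.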
|
|
This improves performance, as we now can ignore unequal strings
based on their length already.
Gbp-Dch: ignore
|
|
This improves performance of the cache generation on my
ARM platform (4x Cortex A15) by about 10% to 20% from
2.35-2.50 to 2.1 seconds.
|
|
We do not see those branches at all during normal mode of
operation (that is, during cache generation), so tell the
compiler about it.
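As an illustration of telling the compiler about a cold branch, here is a sketch using the GCC/Clang builtin __builtin_expect. The macro name SKETCH_UNLIKELY and the ParseDigit function are made up for this example and do not reflect which branches the commit actually annotates.

```cpp
// Branch hint: expands to __builtin_expect on GCC and Clang, and to the
// plain condition elsewhere. The macro name is hypothetical.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

// Hypothetical parser step: the error branch is marked as rarely taken so
// the compiler can lay out the common path as the fall-through.
inline int ParseDigit(char c)
{
   if (SKETCH_UNLIKELY(c < '0' || c > '9'))
      return -1;   // error path, expected to be cold
   return c - '0';
}
```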
|
|
The Set() method returns false if the input is no hex number,
so simply use that.
|
|
This makes the code parsing architecture lists slower, but on
the other hand, improves the more generic case of reading
dependencies from Packages files.
|
|
This converts all callers that read machine-generated data;
callers that might work with user input are not converted.
|
|
dpkg does that when reading package files, so we should do
the same. This only deals with parsing names from binary
package paragraphs, it does not look at source package names
and/or the list of binaries in a dsc file.
Closes: #807012
|
|
More safety, less writeable memory.
|
|
How the Multi-Arch field and pkg:<arch> dependencies interact was
discussed at DebConf15 in the "MultiArch BoF". dpkg and apt (among other
tools like dose) had a different interpretation in certain scenarios
which we resolved by agreeing on dpkg view – and this commit realizes
this agreement in code.
As was the case so far libapt sticks to the idea of trying to hide
MultiArch as much as possible from individual frontends and instead
translates it to good old SingleArch. There are certainly situations
which can be improved in frontends if they know that MultiArch is upon
them, but these are improvements – not necessary changes needed
to unbreak a frontend.
The implementation idea is simple: If we parse a dependency on foo:amd64
the dependency is formed on a package 'foo:amd64' of arch 'any'. This
package is provided by package 'foo' of arch 'amd64', but not by 'foo'
of arch 'i386'. Both of those foo packages provide each other though
(assuming foo is M-A:foreign) to allow a dependency on 'foo' to be
satisfied by either foo of amd64 or i386. Packages can also declare to
provide 'foo:amd64' which is translated to providing 'foo:amd64:any' as
well.
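A toy model of the provides described above, to make the indirection concrete. This is not how libapt stores its cache; the function ImplicitProvides, the NameArch pair and the list of known architectures are all hypothetical, and explicit "Provides: foo:amd64" declarations are left out.

```cpp
#include <string>
#include <utility>
#include <vector>

using NameArch = std::pair<std::string, std::string>; // (package name, architecture)

// Hypothetical model: which (name, arch) pairs a real package implicitly
// provides under the scheme described in the text.
std::vector<NameArch> ImplicitProvides(std::string const &name, std::string const &arch,
      bool multiArchForeign, std::vector<std::string> const &allArchs)
{
   std::vector<NameArch> provides;
   // 'foo' of arch 'amd64' provides the package 'foo:amd64' of arch 'any'.
   provides.emplace_back(name + ':' + arch, "any");
   // If M-A:foreign, it also satisfies a plain 'foo' in the other archs.
   if (multiArchForeign)
      for (auto const &other : allArchs)
         if (other != arch)
            provides.emplace_back(name, other);
   return provides;
}
```

With architectures amd64 and i386, foo of arch amd64 provides 'foo:amd64' of arch 'any' and, if M-A:foreign, 'foo' of arch i386, but never 'foo:i386'.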
This indirection over provides was chosen as the alternative would be to
teach dependency resolvers how to deal with architecture specific
dependencies – which violates the design idea of avoiding resolver
changes, especially as architecture-specific dependencies are a
corner case with quite a few subtle rules. Handling it all over versioned
provides as we already did for M-A in general seems much simpler as it
just works for them.
This switch to :any has actually a "surprising" benefit as well: Even
frontends showing a package name via .Name() [which doesn't show the
architecture] will display the "architecture" for dependencies in which
it was explicitly requested, while we will not show the 'strange' :any
arch in FullName(true) [= pretty-print] either. Before you had to
specialcase these and by default you wouldn't get these details shown.
The only identifiable disadvantage is that this complicates error
reporting and handling. apt-get's ShowBroken has existing problems with
virtual packages [it just shows the name without any reason], so that
has to be worked on eventually. The other case is that detecting if a
package is completely unknown or if it was at least referenced somewhere
needs to account for this "split" – not that it makes a practical
difference which error is shown… but it's one of the improvements
possible.
|
|
We recently started parsing all architectures we encounter, which means we also
parse packages from architectures which are neither native nor foreign,
but still came onto the system somehow (usually via heavy force).
|
|
Previously we had python:any:amd64, python:any:i386, … in the cache and
the dependencies of an amd64 package would be on python:any:amd64, of an
i386 on python:any:i386 and so on. That seems like a relatively
pointless endeavor given that they will all be provided by the same
packages and therefore also a waste of space.
Git-Dch: Ignore
|
|
Reported-By: gcc
Git-Dch: Ignore
|
|
This could allow an attacker to mark a package as installed in a
remote package index, as long as the package was not listed in
the dpkg status file.
This way, an attacker could force the installation of a package
during a dist-upgrade, by providing two packages in an index,
an older marked as installed, and a newer - apt would "upgrade"
to the newer version.
|
|
|
|
Git-Dch: Ignore
|
|
We achieve the same without the special handling now, so drop this code.
It also makes supporting this abomination bearable a little longer.
Git-Dch: Ignore
|
|
Now that we can dynamically create dependencies and provides as needed
rather than requiring to know with which architectures we will deal
before running we can allow the listparser to parse all records rather
than skipping records of "unknown" architectures.
This can e.g. happen if a user has foreign architecture packages in his
status file without dpkg knowing about this architecture (or apt
configured in this way).
A side effect is that now arch:all packages are (correctly) recorded as
available from any Packages file, not just from the native one – which
has its downsides for the resolver as mixed-arch source packages can
appear in different architectures at different times, but that is the
problem of the resolver and dealing with it in the parser is at best a
hack (and also depends on a helpful repository).
Another side effect is that this allows :none packages to appear in
Packages files again as we don't do any kind of checks now, but given
that they aren't really supported (anymore) by anyone we can live with
that.
|
|
Trade deduplication of code for a bunch of new virtuals, so it is
actually visible how the different indexes behave, cleaning up the
interface at large in the process.
Git-Dch: Ignore
|
|
Sources are usually defined in sources.list (and co) and are pretty
stable, but once in a while a frontend might want to add an additional
"source" like a local .deb file to install this package (No support for
'real' sources being added this way as this is a multistep process).
We had a hack in place to allow apt-get and apt to pull this off for a
short while now, but other frontends are either left in the cold by this
and/or the code for it looks dirty with FIXMEs plastering it and has on
top of this also some problems (like including these 'volatile' sources
in the srcpkgcache.bin file).
So the biggest part in this commit is actually the rewrite of the cache
generation as it is now potentially a three step process. The biggest
problem with adding support now though is that this makes a bunch of
previously mostly unusable by externs and therefore hidden classes
public, so a bit of further tuning on this now public API is in order…
|
|
Now that we deal with provides in a more dynamic fashion the last
remaining problem is explicit dependencies like 'Conflicts: foo' which
have to apply to all architectures, but creating them all at the same
time requires us to know all architectures ending up in the cache which
need not be the same set as all foreign architectures.
The effect is visible already now though as this prevents the creation
of a bunch of virtual packages for arch:all packages and as such also
many dependencies, just not very visible if you don't look at the stats…
Git-Dch: Ignore
|
|
Expecting the worst is easy to code, but has its disadvantages e.g.
by creating package structures which otherwise would have never
existed. By creating the provides instead at the time a package
structure is added we are well prepared for the introduction of partial
architectures, massive amounts of M-A:foreign (and :allowed) and co as
far as provides are concerned at least. We have something relatively
similar for dependencies already.
Many tests are added for both M-A states and the code cleaned to
properly support implicit provides for foreign architectures and
architectures we 'just' happen to parse.
Git-Dch: Ignore
|
|
Before MultiArch implicits weren't a thing, so they were hidden by
default by definition. Adding them for MultiArch solved many problems,
but having no reliable way of detecting which dependency (and provides)
is implicit or not causes problems every time we want to output
dependencies without confusing our observers with unneeded
implementation details.
The really noteworthy point here is actually that we now keep a better
record of how a dependency came to be so that we can later reason about
it more easily, but that is hidden so deep down in the library internals
that the change is more about the problems it solves than the change itself.
|
|
We aren't and we will not be really compatible again with the previous
stable abi, so let's drop these markers (which never made it into a
released version) for good as they have outlived their intent already.
Git-Dch: Ignore
|
|
DepCache functions are called a lot, so if we can squeeze some drops out
of them for free we should do so. Also takes the opportunity to remove
some whitespace errors from these functions.
Git-Dch: Ignore
|
|
Doing this disables the implicit copy assignment operator (among others)
which would cause havoc if used on the classes as it would just copy the
pointer, not the data the d-pointer points to. For most of the classes
we don't need a copy assignment operator anyway and in many classes it
was broken before as many contain a pointer of some sort.
Only for our Cacheset Container interfaces we define an explicit copy
assignment operator which could later be implemented to copy the data
from one d-pointer to the other if we need it.
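A sketch of the effect, assuming "this" refers to declaring the d-pointer as a const pointer (the commit subject is not shown here, so that is a guess). The Widget class is hypothetical; the static_assert shows that a const pointer member removes the implicit copy assignment operator.

```cpp
#include <type_traits>

// Hypothetical class with a d-pointer declared as a const pointer.
class Widget
{
   struct Private { int value = 0; };
   Private * const d;   // const pointer: cannot be reseated after construction

public:
   Widget() : d(new Private) {}
   ~Widget() { delete d; }
   // Copy construction is still possible if it deep-copies the data.
   Widget(Widget const &other) : d(new Private(*other.d)) {}
   int Value() const { return d->value; }
};

// A const non-static data member makes the implicit copy assignment
// operator deleted, so "a = b" no longer compiles for this class.
static_assert(!std::is_copy_assignable<Widget>::value,
              "implicit copy assignment is disabled by the const d-pointer");
```

A class that really needs assignment would have to define operator= explicitly and copy the pointed-to data rather than the pointer.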
Git-Dch: Ignore
|
|
We used to read the Release file for each Packages file and store the
data in the PackageFile struct even though potentially many Packages
(and Translation-*) files could use the same data. The point of the
exercise isn't the duplicated data though. Having the Release files as
first-class citizens in the Cache allows us to properly track their
state as well as allows us to use the information also for files which
aren't in the cache, but where we know to which Release file they
belong (Sources are an example for this).
This modifies the pkgCache structs, especially the PackageFile struct
which depending on how libapt users access the data in these structs can
mean huge breakage or no visible change. As a single data point:
aptitude seems to be fine with this. Even if there is breakage it is
trivial to fix in a backportable way while avoiding breakage for
everyone would be a huge pain for us.
Note that not all PackageFile structs have a corresponding ReleaseFile.
In particular the dpkg/status file as well as *.deb files have not. As
these only need an Archive property, the Component property takes
over this duty and the ReleaseFile remains zero. This is also the reason
why it isn't needed nor particularly recommended to change from
PackageFile to ReleaseFile blindly. Sticking with the former is
usually the better option.
|
|
Conflicts:
apt-pkg/acquire-item.cc
cmdline/apt-key.in
methods/https.cc
test/integration/test-apt-key
test/integration/test-multiarch-foreign
|
|
On single-arch the parsing was creating groupnames like 'apt:amd64' even
though it should be 'apt' and a package in it belonging to architecture
amd64. The result for foreign architectures was as expected: The
dependency isn't satisfiable, but for native architecture it means the
wrong package (à la apt:amd64:amd64) is linked so this is also not
satisfiable, which is very much not expected.
No longer excluding single-arch from this codepath allows the generation
of the correct links, which still link to non-existing packages for
foreign dependencies, but natives link to the expected native package
just as if no architecture was given.
For negative arch-specific dependencies à la Conflicts this matter was
worse as apt will believe there isn't a Conflict to resolve, tricking it
into calculating a solution dpkg will refuse.
Architecture specific positive dependencies are rare in jessie – the
only one in amd64 main is foreign –, negative dependencies do not even
exist. Neither class has a native specimen, so no package in jessie is
affected by this bug, but it might be interesting for stretch upgrades.
This also means the regression potential is very low.
Closes: 777760
|
|
The underlying problem is that libapt-pkg does not correctly parse these
provides. Internally, it creates a version named "baz:i386" with
architecture amd64. Of course, such a package name is invalid and thus
this version is completely inaccessible. Thus, this bug should not cause
apt to accept a broken situation as valid. Nevertheless, it prevents
using architecture qualified depends.
Closes: 777071
|