New FreeBSD package system (a.k.a. Daemon Package System (dps))
Bert JW Regeer
xistence at 0x58.com
Tue May 15 06:34:55 UTC 2007
On May 14, 2007, at 10:03 PM, Garrett Cooper wrote:
> Bert JW Regeer wrote:
>> On May 12, 2007, at 5:14 AM, Philippe Laquet wrote:
>>> Stanislav Sedov a écrit :
>>>> On Fri, 11 May 2007 02:10:05 +0200
>>>> Ivan Voras <ivoras at fer.hr> mentioned:
>>>>
>>>>
>>>>> - I think it's time to give up on using BDB+directory tree full of
>>>>> text files for storing the installed packages database, and I
>>>>> propose all of this be replaced by a single SQLite database.
>>>>> SQLite is public domain (can be slurped into base system),
>>>>> embeddable, stores all data in a single file, lightweight, fast,
>>>>> and can be used to do fancy things such as reporting.
>>>>>
>>>>
>>>> What is the reason to use an SQL-based database? Will you perform
>>>> direct queries to the database? The packaging system is for
>>>> ordinary users, not SQL geeks, so they should not have to use SQL
>>>> for managing packages. So a simple set of hashes will suffice for
>>>> our needs. I agree with Julian that we should have a backup of the
>>>> packaging database in plain text format, and a utility to rebuild
>>>> it. This way we can always restore the database if something goes
>>>> wrong. Furthermore, that should not make a great impact on
>>>> performance, since we don't have to rebuild it every day.
>>>>
>>> I agree with Stan ;)
>>>
>>> "fast and improved" package utilities mainly use some indexed
>>> Berkeley DB combined with flat files, don't they? I, and maybe
>>> many other FreeBSD users, use light systems for efficiency and
>>> easier management. If we use some database system it will require
>>> disk space, resources for the DB to run, dependencies and so
>>> on... And we may also be exposed to a "that DB is better" war ;)
>>>
>> SQLite is compiled inside a program, and as such does not require
>> any resources other than one file handle and some CPU time when
>> querying. The file is stored on disk, and requires no separate
>> process to be running to query. Maybe I misunderstood what you
>> were trying to say. SQLite will require fewer resources than flat
>> text files, since SQLite opens one file once and then processes
>> it, instead of what is currently happening: having to open and
>> close hundreds of files depending on how many ports are installed.
>> In this regard SQLite is like BDB, except that SQLite uses
>> standards-compliant SQL statements to get data.
>
> Correct. From what I was reading shared memory read access and
> locking are two available features of BDB databases.
>
> The only thing is that I do agree that there should be a dumping
> and importing mechanism of some kind for semi-formatted text files,
> for backup, debugging, and modification purposes. That's just my
> personal idea on the topic though :).
>
>>>> --
>>>> Stanislav Sedov
>>>> ST4096-RIPE
>>>>
>>>
>> I am able to understand many of the gripes with using a database,
>> and having to import yet another code base into the FreeBSD base,
>> however as one of the young ones, and knowing sed/awk/grep and
>> SQL, I prefer SQL over having to process hundreds of text files
>> using text processing tools. It saddens me each time I run one of
>> the pkg_* tools that needs to parse the flat file structure since
>> it takes so long. I have friends running Ubuntu and their apt-get
>> returns results much faster.
>> In a world where hard drives are becoming more reliable, and are
>> automatically relocating sectors that go bad, do we really have to
>> worry about database corruption as much? I feel that many of the
>> fears that are being put forward will do harm to a text based
>> "storage" system as well. If one block drops out, it can cause
>> tools to not be able to parse the files. Create a backup copy of
>> the database after each successful transaction? There are ways to
>> battle data corruption.
>
> True. I was thinking of backup, and recreation from scratch,
> considering that the database wouldn't be more than a few megs. In
> place replacement just seems like a hairy situation sometimes..
>
>> Using BDB is not a real option either. I cannot even count the
>> number of times that the BDB database that portupgrade created has
>> become corrupt because I accidentally ran two portupgrades at the
>> same time, or even remembered that I did not want to upgrade
>> something and hit Ctrl+C.
>
> I'm sorry but nothing's completely solid in that respect, AFAIK. In
> terms of the first problem you mentioned, Wade is working on the
> locking <http://wiki.freebsd.org/WadeWesolowsky>.
>
> In terms of transactions, maybe we should take a look at Subversion
> for inspiration:
> <http://svn.haxx.se/dev/archive-2005-03/0301.shtml>. I'm a firm
> believer that it's easier to incorporate code than it is to remove
> it.
I am unable to see any references to transaction support for BDB
databases; maybe I am missing something. Subversion in that thread is
suggesting SQL for a totally different reason. fsfs is what most
people are using as a Subversion backend to help avoid BDB
corruption. Many of the people I have talked to who used to use
Subversion with BDB have had major issues, whereas fsfs has not had
any issues at all. That matches what I have experienced myself as a
Subversion repository administrator.
>
>> Given the experience I got from running SVN with BDB as the back-end
>> database to store my data, I say no thanks. In that case I would
>> much rather stick with the flat text files than go with a database.
>
> Well, a few comments:
>
> -Text files are bloated. Although many people are for XML, it takes
> much longer to parse than binary databases.
The files under /var/db/pkg/ are all plain flat text files. I am not
a supporter of XML at all.
> -Custom text files require custom format capable parsers, no matter
> what the format, and the less coverage a parser has, the more
> probable the likelihood of bugs IMO.
We already have these in the pkg_* functions, so I'd hope they are
fairly solid!
> -In the event that features changed or were added, some required
> modifications to the parser could be trivial to major. With
> databases you can get away from that mentality to some degree IMHO.
Changing an SQL query versus re-writing a parser for text files is a
huge difference.
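
To make that difference concrete, here is a minimal sketch using
Python's standard sqlite3 module. The schema (tables "packages" and
"depends"), the helper names, and the package names are all made up
for illustration; nothing here is an existing or proposed pkg_*
format. It only shows that adding a field is one ALTER TABLE and that
the existing query keeps working unchanged.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a hypothetical installed-packages database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packages ("
        " name TEXT PRIMARY KEY,"
        " version TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS depends ("
        " package TEXT NOT NULL,"
        " requires TEXT NOT NULL)"
    )
    return conn


def add_package(conn, name, version, requires=()):
    """Record a package and its dependencies in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO packages (name, version) VALUES (?, ?)",
            (name, version),
        )
        conn.executemany(
            "INSERT INTO depends (package, requires) VALUES (?, ?)",
            [(name, dep) for dep in requires],
        )


def required_by(conn, name):
    """Return the names of packages that depend on `name`."""
    rows = conn.execute(
        "SELECT package FROM depends WHERE requires = ? ORDER BY package",
        (name,),
    )
    return [row[0] for row in rows]


def add_origin_column(conn):
    """Extend the schema with a new field; no parser has to change."""
    with conn:
        conn.execute("ALTER TABLE packages ADD COLUMN origin TEXT")
```

After calling add_origin_column(), rows that already exist simply
have a NULL origin until something fills it in, and required_by()
does not need to be touched.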
>
> -Garrett
I am not opposed to text files, other than that they can be slow. I
am against BDB because, in my experience over the years, BDB
databases have proven to be extremely unreliable and easily
corrupted. If we are going to be making changes to the way the
ports/packages store the information about what exists, it should be
done in such a way that it is scalable and at the same time
extensible (is this a word?).
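
For those who want to keep a plain text copy around, a rough sketch
of a dump-and-rebuild pair, again with Python's standard sqlite3
module. The function names are mine and purely illustrative; the dump
is ordinary SQL text, so it can be read, diffed, or edited by hand
before being loaded back.

```python
import sqlite3


def dump_to_text(conn):
    """Return the whole database as plain SQL text."""
    # iterdump() yields the CREATE and INSERT statements one per item.
    return "\n".join(conn.iterdump()) + "\n"


def rebuild_from_text(sql_text, path=":memory:"):
    """Recreate a database from a text dump made by dump_to_text()."""
    conn = sqlite3.connect(path)
    conn.executescript(sql_text)
    return conn
```

Writing the text out after each successful transaction would give
the backup copy that was asked for earlier in the thread, at the cost
of rewriting a file that should be no more than a few megs.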
Bert JW Regeer