Discussion:
BTree elements not sorted?
(too old to reply)
c***@gmail.com
2007-01-25 20:49:14 UTC
Permalink
Hi!

I use a BTree as an indexing structure for my berkeleydb database. The
keys are all 4 bytes long, unsigned integers. And i use cursors to
traverse the database in both directions.

I thought that since i use a Btree, the elements are sorted, and the
cursors traverse a sorted tree. I thought that if i position the cursor
at the first element, i get the smallest key, and the last element is
the largest key. But i noticed that this is not the case - the cursors
rather return the elements in the order of insertion (the oldest
element is in DB_FIRST, the youngest in DB_LAST).

Is there a way how i can get "real" sorting?

Why does bdb use this sorting "by age"? Is there a special reason why
they decided to do so?

Regards & big thanks
Chris
c***@gmail.com
2007-01-27 10:29:46 UTC
Permalink
No ideas, anybody?
Florian Weimer
2007-01-27 12:04:01 UTC
Permalink
Post by c***@gmail.com
No ideas, anybody?
You need to show example code, or put up one of your database files on
the web so that we can have a look at it.
c***@gmail.com
2007-01-27 19:36:21 UTC
Permalink
Hi Florian and others,

i have attached a small test program. it should compile on all gcc
platforms:

gcc -Wall dbtest.c -ldb

When starting, it inserts 1000 random integer values into a BTREE
index. Then it creates a cursor, sets it to DB_FIRST (*), prints the
key, gets the next element, prints it etc. The keys are not sorted.

(*) actually the cursor is set to DB_NEXT, but setting a newly created
cursor to DB_NEXT is the same as setting it to DB_FIRST, according to
the documentation.

I *thought* that i've found the problem. My integers are little
endian, but the sort function obviously uses memcmp or something like
that. So i wrote my own comparison function, and the results were much
better. However, it's still not sorted correctly; the first 20 keys
are these:

key: 0
key: 256
key: 512
key: 768
key: 887077888
key: 498777856
key: 1338299904
key: 1
key: 257
key: 513
key: 769
key: 619054081
key: 1215828993
key: 478034945
key: 524305153
key: 2001100545
key: 2032894977
key: 941804289
key: 2

it's weird - there's *some* sorting, but not much...

i have berkeley db 4.2.52, according to the version string in the db.h
header file.

here's the code. thanks for every idea.

regards
chris

#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <db.h>

void
check(int ret, const char *msg)
{
if (ret) {
printf("%s failed: status %d\n", msg, ret);
exit(-1);
}
}

int
compare(DB *db, const DBT *dbt1, const DBT *dbt2)
{
int l=*(unsigned *)dbt1->data;
int r=*(unsigned *)dbt2->data;
if (l<r)
return (-1);
if (r<l)
return (+1);
return (0);
}

int
main(int argc, char **argv)
{
DB *db;
DBC *c;
int ret, i, r;
DBT key, rec;

(void)argc;
(void)argv;

/* create a BTREE database */
ret=db_create(&db, 0, 0);
check(ret, "db_create");
ret=db->set_bt_compare(db, compare);
check(ret, "db->set_bt_compare");
ret=db->open(db, 0, "test.db", 0, DB_BTREE, DB_CREATE, 0);
check(ret, "db->open");

/* insert 1000 random values */
for (i=0; i<1000; i++) {
memset(&key, 0, sizeof(key));
memset(&rec, 0, sizeof(rec));

r=rand();
key.size=sizeof(r);
key.data=&r;
rec.size=sizeof(r);
rec.data=&r;

ret=db->put(db, 0, &key, &rec, 0);
check(ret, "db->put");
}

/* create a cursor */
ret=db->cursor(db, 0, &c, 0);
check(ret, "db->cursor");

/* print all keys */
while (1) {
memset(&key, 0, sizeof(key));
memset(&rec, 0, sizeof(rec));

ret=c->c_get(c, &key, &rec, DB_NEXT);
if (ret==DB_NOTFOUND)
break;
check(ret, "c->c_get");

r=*(unsigned *)key.data;
printf("key: %u\n", r);
}

c->c_close(c);
db->close(db, 0);

return (0);
}
c***@gmail.com
2007-01-29 09:27:42 UTC
Permalink
still no ideas, anybody?

i'd just like to know if this is a normal behaviour, and if there's a
workaround... i'm not going so far to say that it's a berkeleydb
bug...
Florian Weimer
2007-01-29 10:01:42 UTC
Permalink
Post by c***@gmail.com
here's the code. thanks for every idea.
Works for me (with Berkeley DB 4.4).
c***@gmail.com
2007-01-29 10:34:06 UTC
Permalink
Hi Florian,

thanks for trying. I'll try to update my berkeley DB as soon as i'm at
home.

Chris
c***@gmail.com
2007-01-30 23:04:59 UTC
Permalink
Post by Florian Weimer
Post by c***@gmail.com
here's the code. thanks for every idea.
Works for me (with Berkeley DB 4.4).
I updated to Berkeley DB 4.5.20, and now it works, too.
Maybe a bug in 4.2.52...

Thanks again for the help.

Loading...