|
DMAC's
Btree++ Index Algorithm Implementation
Explained
A complete explanation of the Unibase by DMAC
btree++ index algorithm approach
Unibase by DMAC, Release 7.47i dated
March 1, 2000 and later, has some new sort
features. Three goals drove these
features:
- sort
speed
- sort
predictability
- sort
customization
Background:
Unibase by DMAC uses the sort verb to
create Btree++ indexes. These indexes are
used in the AID language to rapidly retrieve
data.
Speed:
Unibase by DMAC clients want both speed
of retrieval and speed of building from an
index. Over the years both speeds have
improved. Sort speed (index creation) as
measured in records per second varies
depending upon how many total records are
going to be sorted in a single pass. Some
sort speciality vendors study the file to be
sorted before sorting so that an optimum
approach can be used. Today sort ( indexing
speeds) of over 10000 records per second are
common in Unibase by DMAC.
Sort Predictability:
DMAC regularly gets asked how long it
will take to sort (build on index on) a given
number of records. In the past the variables
were so many that the equation to predict the
answer was beyond most clients and DMAC
people. Today a simple means of predicting
the performance on a machine of Unibase by
DMAC in possible with obtaining one or two
sort speeds on that particular machine
configuration.
Sort Customization:
The next question to the above is, "What
does one do the get better sort speeds?" The
answer to this question has been expanded to
four answers -- 1) no network traffic
(standalone pc), 2) more RAM, 3) faster CPU,
and 4) no screen displays while
sorting.
What Happened:
In the Microsoft Windows (95, 98, NT,
2000, and XP) versions and the UNIX/LINUX
version, basic changes allow the user to take
advantage of adding memory to the client in
the case of Microsoft and the server in the
case of UNIX/LINUX. While the 16 bit
Microsoft version can benefit from adding
memory up to a point, the 16 megabyte maximum
means this 16 bit Microsoft version is
obsolete for sorting.
Some of the current reccommendatsions
come from pratical experience. All the sort
times come from reading and studying the
book, "A Practical Introduction to Data
Structures and Algorithm Analysis" by
Clifford A. Shaffer of Virginia Polytechnic
Institute and State University.
Network Server / Standalone PC
Changes:
As network servers strive to protect
users, and standalone computer become faster,
and graphics more elaborate, the sort speed
on a standalone PC with no screen interface
has surpassed the sort speed using a
server/client workstation configuration. Thus
the code changes to Unibase by DMAC described
below are enhanced by using a standalone
computer.
Code Changes to Unibase by
DMAC:
One feature coded into Unibase was to set
up the index so that it could reference up to
4 terabytes (4000 gigabytes) of files stored
in individual files of less than 2
gigabytes.
Another, more subtle, feature was to set
up so that as more RAM memory was available
bigger files could be sorted efficiently in
RAM memory - automatically.
Improving
Predictability:
The codes changes improved the
predictability of Unibase by DMAC. While the
sort verb can handle all sizes of files,
there is a tremendous speed decrease when
anything outside of Random Access Memory
(RAM) is accessed.
Users can use the following equation to
determine how much RAM is need for various
sizes of files:
RAM needed = A + 2 (n * (36 + key
length))
("A" is an empirical number (use 4,000,000
for example)
Some RAM needed numbers:
| Key Length 10: |
|
n
|
2 * N * (36+10)
|
RAM needed
|
|
1,000 |
92,000 |
4,092,000 |
|
5,000 |
460,000 |
4,460,000 |
|
10,000 |
920,000 |
4,920,000 |
|
50,000 |
4,600,000 |
8,600,000 |
|
100,000 |
9,200,000 |
13,200,000 |
|
500,000 |
46,000,000 |
50,000,000 |
|
1,000,000 |
92,000,000 |
96,000,000 |
|
5,000,000 |
460,000,000
|
464,000,000
|
|
10,000,000 |
920,000,000
|
964,000,000
|
|
Key Length 80:
|
|
n
|
2 * N * (36+10)
|
RAM needed
|
|
1,000 |
232,000 |
4,232,000 |
|
5,000 |
1,160,000 |
5,160,000 |
|
10,000 |
2,320,000 |
6,320,000 |
|
50,000 |
11,600,000 |
15,600,000 |
|
100,000 |
23,200,000 |
27,200,000 |
|
500,000 |
116,000,000 |
120,000,000 |
|
1,000,000 |
232,000,000 |
236,000,000 |
|
5,000,000 |
1,160,000,000
|
1,164,000,000
|
|
10,000,000 |
2,320,000,000
|
2,324,000,000
|
The Unibase code breaks down at 2 gigabytes
of RAM (everyone today does). So above that
number, we need to sort in strings and merge
the strings up to the 4 terabytes we can
index.
Sort
Time:
If we ignore sorting when other storage
media are used, we can predict time required
quite accurately. This means that users can
build themselves a sort (index generating)
machine which responds according to an
asymptotic algorithm analysis of the sort
time being computable by the following
equation:
time
= A + B(n * log n)
where:
A is an empirical constant
B is an empirical constant
n is the number of keys to be
indexed
Actually, if we are measuring time in
seconds, A is normally about one or two
seconds, so most of the time you see the
above equation written:
time
= B (n * log n)
where:
B is an empirical constant
n is the number of keys to be
indexed
What I am saying is that if sufficient
RAM (up to 2 gigabytes) is available for
files up to 10,000,000 records, the above
equation works for Unibase by DMAC sort
verb.
Another feature developed in Unibase by
DMAC will allow merging of indexes so that up
to 4 terabyte indexes can be created for
billions and billions of records. The Merge
process works; but no times for the merge
process have been collected.
Some Log Examples:
We can make things very easy by
constructing a table of the value of n * log
n. Most users really only want a range of
values so they can pick the following
values:
| n |
log
n |
n
time log n |
| 1,000 |
3.00 |
3,000 |
| 5,000 |
3.70 |
18,500 |
| 10,000 |
4.00 |
40,000 |
| 50,000 |
4.70 |
235,000 |
| 100,000 |
5.00 |
500,000 |
| 500,000 |
5.70 |
2,850,000 |
Let us look at some sample data from a
200 mhz pentium running Microsoft Windows 95
for building an index with a key size of 80
characters:
| n |
log
n |
build time (secs) |
records/
second |
| 4,250 |
3.63 |
13 |
346 |
| 21,250 |
4.33 |
72 |
295 |
| 51,000 |
4.71 |
173 |
295 |
| 60,700 |
4.78 |
211 |
288 |
| 108,100 |
5.03 |
420 |
258 |
From this data
A is about 1.2 seconds
And B is .00077
So we can predict for whatever number
of records we want. A million records will
take about 4,600 seconds (about 76 minutes)
and 5,000,000 records will take about
25,650 seconds (about 7 hours).
Some Faster Computers and Operating
Systems:
Never one to pick on a particular
operating system, I offer the following
data to be used to evaluate two other
computers:
|
HP 9000 (smallest box) running
UNIX
|
| n |
log
n |
n *
log n |
B |
Ttl. Time |
| 41,200 |
4.6149 |
190,133 |
.000447 |
85 secs |
| 123,300 |
5.0906 |
627,597 |
.000413 |
259 secs |
|
LINUX on Pentium II
|
| n |
log
n |
n *
log n |
B |
Ttl. Time |
| 43,086 |
4.6343 |
199,673 |
.000055 |
11 secs |
| 129,258 |
5.113 |
660,676 |
.000052 |
34 secs |
|