|
Color and Grayscale Come to Unibase Imaging
|
|
Unibase Imaging release 7.49i now supports color and grayscale
JPEG images. JPEG, or Joint Photographic Experts Group, is the
Department of Defense's standard image format and is supported
throughout the Unibase by DMAC environment. JPEG images store
the image in “true color” format; that is, a palette is not
used. When grayscale is used with JPEG images, eight bits per
pixel (256 shades of gray) are stored.
Strictly speaking, JPEG (pronounced “jay-peg”) is an image
compression algorithm. JPEG is a non-proprietary international
standard (ISO/IEC 10918). JPEG stores the colors in a format
different from the RGB format most computer users know. The
format is similar to that used by televisions. Of the three
components saved, the first is Y, the luminance, which
represents the intensity of the image. The other two are the
chrominance components: Cb specifies the blueness of the image
and Cr gives the redness. Thus YCbCr specifies the color, and
Y alone provides the grayscale.
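For readers who like to see the arithmetic, here is a minimal sketch of the RGB to YCbCr conversion using the standard JFIF full-range equations. The function name is ours for illustration; this is not the DMAC or Independent JPEG Group code.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (JFIF full-range equations)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(value):
        # Round to the nearest integer and keep the result inside 0..255.
        return max(0, min(255, int(round(value))))

    return clamp(y), clamp(cb), clamp(cr)


# A gray pixel has neutral chrominance (128, 128); only Y carries information.
gray_pixel = rgb_to_ycbcr(100, 100, 100)
```

Keeping only the first value of each result is all that a grayscale image needs.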
Almost all JPEG images are stored and transferred in the JFIF
file format. The JFIF format adds “markers” to the compressed
file (TIFF, by comparison, has tags). DMAC products now support
the popular types of JFIF files.
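To give a feel for what markers look like, the sketch below walks the header segments of a JPEG byte stream. Each marker is 0xFF followed by a code byte, and most segments carry a two-byte big-endian length that counts itself. The function names and the tiny hand-built sample are ours for illustration only; a real reader must also handle the standalone markers and the compressed data that follows the start-of-scan marker.

```python
import struct

SOI = 0xFFD8   # start of image
APP0 = 0xFFE0  # application segment that carries the "JFIF" identifier
SOS = 0xFFDA   # start of scan; entropy-coded data follows


def list_markers(data):
    """Return (marker, payload_length) pairs for the header segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker; not a JPEG stream")
    markers = [(SOI, 0)]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = 0xFF00 | data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        markers.append((marker, length - 2))  # length field counts itself
        if marker == SOS:
            break
        pos += 2 + length
    return markers


def is_jfif(data):
    """JFIF requires an APP0 segment right after SOI, tagged 'JFIF' + NUL."""
    return data[2:4] == b"\xff\xe0" and data[6:11] == b"JFIF\x00"


# Hand-built sample: SOI, a 16-byte JFIF APP0 segment, then EOI.
sample = (b"\xff\xd8" + b"\xff\xe0" + struct.pack(">H", 16) + b"JFIF\x00"
          + b"\x01\x01" + b"\x00" + struct.pack(">HH", 1, 1) + b"\x00\x00"
          + b"\xff\xd9")
```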
File compression is achieved in three steps: 1) Discrete Cosine
Transform, 2) Coefficient Quantization and 3) Lossless
Compression. The hairiest is the first step (call this the DCT
step). The second step can be either lossy or lossless. Almost
all JPEG files today use the lossy approach, and it is here that
inaccuracies are introduced. The last step simply compresses the
quantized data without further loss, similar to what happens in
a G3 TIFF file. A technical term for the quality of the image
exists, but almost all JPEG users talk about the compression of
the file instead of the quality. The more compressed the file,
the poorer the quality (and, on the scale used here, the larger
the quality number).
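The first two steps can be sketched in a few lines. The code below is a direct, unoptimized 8 x 8 forward DCT followed by quantization with a single step size. That single step is a simplifying assumption of ours: real JPEG files use a table with a separate step for each of the 64 coefficients. The function names are hypothetical and this is not the Independent JPEG Group code.

```python
import math

N = 8  # JPEG works on 8 x 8 blocks of samples


def dct_2d(block):
    """Forward 8 x 8 DCT of a block of level-shifted samples (value - 128)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


def quantize(coeffs, step):
    """Divide by the step and round; this rounding is where loss occurs."""
    return [[int(round(value / step)) for value in row] for row in coeffs]


# A flat gray block (every sample 100) after the level shift of 128.
flat = [[100 - 128] * N for _ in range(N)]
quantized = quantize(dct_2d(flat), step=16)
nonzero = sum(1 for row in quantized for value in row if value != 0)
```

For the flat block only the first (DC) coefficient survives, which is why the lossless third step has so little left to store. A larger step throws away more detail and gives a smaller file.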
Results vary by image, but the following example gives the
flavor of the results:
| Quality | Size  | Final Size | Compression |
| 1       | 64000 | 9265       | 86%         |
| 3       | 64000 | 6403       | 90%         |
| 5       | 64000 | 5651       | 92%         |
| 10      | 64000 | 4917       | 93%         |
| 25      | 64000 | 4337       | 94%         |
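The compression column is the fraction of the original size that was saved. As a check on the arithmetic, the sketch below recomputes it from the two size columns. Rounding up to the next whole percent is our inference, since that is what reproduces the published figures; the function name is ours.

```python
def compression_percent(original_size, final_size):
    """Percent of the original size saved, rounded up to a whole percent."""
    saved = original_size - final_size
    return -(-saved * 100 // original_size)  # integer ceiling division


# (quality, size, final size) rows taken from the table above.
rows = [(1, 64000, 9265), (3, 64000, 6403), (5, 64000, 5651),
        (10, 64000, 4917), (25, 64000, 4337)]
percents = [compression_percent(size, final) for _, size, final in rows]
```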
So now Unibase Imaging users can go after jobs that have color
images. You might as well think this way, because all the newer
image storage systems store images in color. By varying the
quality of the storage, JPEG can be made to meet almost any
user's demands. Setting the quality when converting to and from
JPEG lets you trade off size against accuracy.
Of course, more disk, RAM and CPU cycles are needed compared to
TIFF files. But if machines are going from 40 megahertz to 1200
megahertz, some of the benefit ought to come to the user, not
just to the operating system vendor.
JPEG support was added with help found by searching the World
Wide Web. The code is based upon “The Independent JPEG Group's
JPEG software,” release 6b, of March 27, 1998.
Two books were also used as references: “Compressed Image File
Formats” by John Miano and “The Data Compression Book,
Second Edition” by Mark Nelson and Jean-Loup Gailly. These
provided the descriptions needed to understand the code's
objectives more fully.
|
|
Works Well and Here Comes XP
|
|
The latest release, 7.48i, of DMAC products in distribution
works well with Windows 2000. The crazy responses to the mouse
are gone and general protection faults (GPFs) are few and far
between. Obviously it is time for Microsoft to bring out a new
operating system version.
DMAC products need the latest service pack (level 2) for Windows
2000 to reduce GPFs. Several people have complained about how
slow Windows 2000 is on their current machines. From what we
hear, Windows XP will be more of the same. If you do not buy the
fastest machine out there, do not expect to process at the same
speeds as before when upgrading to XP.
|
|
Rfmouse Continues To Move In DMAC Direction
|
|
Rfmouse is written using the ZINC code for platform
independence. For the past few years DMAC has been trying to
achieve the desired independence. ZINC had trouble; now DMAC has
trouble. Since the last report, “Mouse Bites Developer,” in our
Spring 2001 issue, Rfmouse has continued to cause problems on
the three platforms of current interest: LINUX, Microsoft 16 bit
and Microsoft 32 bit.
DMAC should probably give up and try another approach. DMAC
raised the error reporting level on the ZINC code to the highest
level (W4) possible with its compiler; over 6000 errors were
reported, and all of them were fixed. The code still has too
many unexpected general protection faults and ugly screens.
|
 |
|
Why even
mention this? Well, DMAC is still plugging away at the problems
while all the time looking for a better approach. Sometimes you
just buckle down and work harder and longer and longer and
longer.
|
|
Too Many Novell Servers No Longer Spoil the Broth
|
|
Too many chefs may spoil the broth, but not too many Novell
servers. Clients were finding that, try as they might, people
were still logging into DMAC with their current working
directories on systems unknown to the Unibase server. Now,
hopefully, that no longer happens with DMAC products in release
7.48i and higher. Novell has fixed this in 6.0 (no special
client required), but DMAC has what it hopes is a better
solution now for Unibase in all the Novell environments.
Not sure if you have ever seen this problem? If you ever had a
user's work statistics commingled with a visitor's statistics
from another server environment, then you had the problem.
Novell would send messages based upon connection numbers, and
these numbers would sometimes become non-unique in the Unibase
environment.
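To see why a bare connection number is not enough, consider the sketch below. It is only an illustration of the collision and of one generic remedy, keying statistics by the server name together with the connection number. The article does not describe DMAC's actual fix, and the class, method and server names here are hypothetical.

```python
class WorkStatistics:
    """Keystroke totals keyed by (server, connection), not connection alone."""

    def __init__(self):
        self._totals = {}

    def record(self, server, connection, keystrokes):
        key = (server, connection)  # connection numbers repeat across servers
        self._totals[key] = self._totals.get(key, 0) + keystrokes

    def total(self, server, connection):
        return self._totals.get((server, connection), 0)


stats = WorkStatistics()
stats.record("SERVER_A", 12, 500)  # a keyer on the Unibase server
stats.record("SERVER_B", 12, 40)   # a visitor with the same connection number
```

With the connection number alone as the key, the two entries above would have been added into a single total of 540.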
|
|
Pan and Zoom Snippets In Ballet Brings Power
|
|
Users want the craziest things. First, chop up a form into as
many as ten pieces. Then rearrange these pieces and display a
few of them on the screen at the same time. Now pan one of them
AND expect the others to move accordingly. Or ZOOM in or ZOOM
out on one of them and expect them all to do the same!
OK, if you have that in your mind, that is what DMAC has given
users in the newest release, 7.49i, now going through quality
assurance testing. All of the sneaky things will trip you up as
you think about this. You cannot just zoom in or zoom out, or
focus is lost. And when you rotate, you cannot just rotate; you
have to do more.
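The “focus is lost” trap is easy to show. In the sketch below every snippet maps an image point to the screen as offset + scale * point. Zooming must adjust the offset as well as the scale so that the point under the focus stays put. This is a minimal illustration under that assumed model, with hypothetical class names; it is not the Unibase Imaging code, and rotation is left out.

```python
class Snippet:
    """One piece of a form; screen position = offset + scale * image point."""

    def __init__(self, name):
        self.name = name
        self.offset_x = 0.0
        self.offset_y = 0.0
        self.scale = 1.0

    def to_screen(self, image_x, image_y):
        return (self.offset_x + self.scale * image_x,
                self.offset_y + self.scale * image_y)


class LinkedView:
    """Applies every pan and zoom to all snippets so they move in step."""

    def __init__(self, snippets):
        self.snippets = list(snippets)

    def pan(self, dx, dy):
        for snippet in self.snippets:
            snippet.offset_x += dx
            snippet.offset_y += dy

    def zoom(self, factor, focus_x, focus_y):
        for snippet in self.snippets:
            # Move the offset so the image point under the focus stays there.
            snippet.offset_x = focus_x - (focus_x - snippet.offset_x) * factor
            snippet.offset_y = focus_y - (focus_y - snippet.offset_y) * factor
            snippet.scale *= factor
```

Changing only the scale would slide the point of interest off toward the corner of the screen, which is the lost focus described above.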
At DMAC we think we have it right. Using Tina Kay's eagle eye
and the clients' desires, we have tweaked this so that you can
pick out a column from the left side of a form and set a column
from the right side next to it. The middle stuff could be shown,
or not, to the right of the second column.
Can you think of anything missing? A user did. See if you can
find the companion feature to this elsewhere in this newsletter.
|
|
ParaPort OCR/ICR Scanning Hits Expected Accuracy
|
|
Everyone is wary of accuracy claims for OCR/ICR engines. At DMAC
we try to measure this accuracy and report to our users what we
find. Before we built the ParaPort connection we had hoped for
forty percent accuracy on run-of-the-mill handwritten material.
The ParaPort trial tests show that this is possible.
ParaScript can be table driven, i.e., a result either matches an
item in the table or is rejected, and the accuracy threshold
affects the resulting accuracies. We have played with thresholds
from 30 to 90 percent. We know that we can set the threshold
very high and obtain high accuracies for numbers and machine
print. Jobs such as bubble boxes, check boxes, etc. are also
very high in accuracy.
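The interplay of table and threshold can be pictured with a short sketch. It assumes the engine returns candidate readings with a confidence from 0 to 100, best first. That data shape and the function name are our assumptions for illustration; this is not the ParaScript interface.

```python
def accept_field(candidates, table, threshold):
    """Return the first candidate that is in the table and meets the threshold.

    candidates: list of (text, confidence) pairs, best candidate first.
    Returns None when the field is rejected and must be keyed by hand.
    """
    for text, confidence in candidates:
        if confidence >= threshold and text in table:
            return text
    return None


surnames = {"SMITH", "JONES"}
reading = [("SMIHT", 80), ("SMITH", 60)]
loose = accept_field(reading, surnames, threshold=30)   # accepted via table
strict = accept_field(reading, surnames, threshold=90)  # rejected
```

A high threshold rejects more fields but makes fewer mistakes on the ones it accepts, which is the tradeoff being measured.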
Currently, DMAC clients are trying to determine how many fields
on a document need to be read, and at what accuracy, to justify
the overhead associated with spinning off the data through the
ParaPort, receiving the data back, and repopulating the Unibase
batches with the data. We will report on these numbers as soon
as they are available.
If you do not remember what the ParaPort can do for your data
capture, read last newsletter's report. The article, “ParaPort
XML to ParaScript OCR/ICR Offers Easy Way To Use OCR/ICR,” in
the summer issue of the Technical Review explains the potential
benefits of ParaScript.
|
Red Line Across Screen Gives Class
|
|
Once you
can zoom and pan snippets in ballet step, as discussed elsewhere
in this quarter's newsletter, you need a not-so-imaginary line
across all these snippets so that you can see what goes with
what. At least that is what one user wanted. Now once you have
this red line you need to be able to move it up and down just
the right increment each time. But, of course, the requirements
of each job vary, so the increment is adjustable for the needs
of each job.
Ever wonder how it is that Unibase by DMAC, Unibase Imaging and
WebBase have so many features? It is because users ask for them;
DMAC listens and adds them to the product. Unibase Imaging
release 7.49i has this red line. What can DMAC do to speed up
your production?
|
|
TIFF Decode Failures Become Fewer and Fewer
|
|
Every once
in a while someone sends us a TIFF file which they can read in
either showi or dei but not both of them. These brainteasers are
fun to solve. Showi and dei use two different methods for
decoding G3 and G4 TIFF files. Usually, between them, we can
find out why something does not work. You would think that after
ten years there would be nothing out there that we cannot read.
Not so.
Where are the failures? Usually the failure to read is related
to the end of a line of pixels or the end of the document. We
have seen a dozen different ways of ending a TIFF scan. If you
read the specification for TIFF files, you wonder how the
endings could have been created. But, we just figure out what
the creator of the TIFF file did, and do likewise.
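As an illustration of where the trouble hides, the sketch below splits a G3 bit stream at its end-of-line codes. The G3 EOL code is eleven zero bits followed by a one, and writers may pad with extra zero fill bits in front of it, so the pattern accepts eleven or more zeros. The function name is ours and the payload bits in the example are arbitrary placeholders, not real run-length codes; this is not how showi or dei are written.

```python
import re

EOL = re.compile(r"0{11,}1")  # eleven or more zeros, then a one
EOL_BITS = 12                 # length of the EOL code itself


def split_scan_lines(bits):
    """Split a string of '0'/'1' characters at each EOL code.

    Returns (lines, leftover). Trailing zeros on a line may be fill bits.
    Empty lines at the end mean consecutive EOLs, one common way to end
    a scan; leftover bits mean the scan ended without a final EOL.
    """
    lines, start = [], 0
    for match in EOL.finditer(bits):
        lines.append(bits[start:match.end() - EOL_BITS])
        start = match.end()
    return lines, bits[start:]


eol = "0" * 11 + "1"
stream = "1011" + eol + "110" + eol + eol
lines, leftover = split_scan_lines(stream)
```

Looking at how many empty lines come last, and at what is left over, is one way to tell which of the many endings a particular file uses.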
Why mention this? Well, we are starting to support JPEG files in
JFIF file format. There are lots of holes in its specification,
so we know people will find JPEG files we cannot read at first.
Find them and send them to us. Discrete Cosine Transforms are
similar to Fast Fourier Transforms and, as a mechanical
engineer, I grew up on FFTs.
We ought to repeat credit here for the references used to get
JPEG going. The main code for decoding the JPEGs comes from “The
Independent JPEG Group's JPEG software,” release 6b, of March
27, 1998. They can be reached at jpeg-info@uunet.uu.net. “The
Data Compression Book, Second Edition” by Mark Nelson and
Jean-Loup Gailly provided a lot of information. “Compressed
Image File Formats” by John Miano was also used.
|
|
Video Cards and VESA Give Ugly Screens Less Often Now
|
|

|
DMAC clients live in two environments. Sometimes it is only the
32-bit Microsoft environment. In this environment, the user's
keyer sets the screen mode and density to what is desired and
leaves it alone. Shops tend to standardize on 640 x 480 or 800 x
600 and 256 colors. This is great for ninety percent of the
images from which data is to be captured.
Along comes a strange job, and away goes the standard
environment. A greater density is needed; let's say 1280 x 1024.
Someone goes around and resets all the Windows environments
under “settings - display.”
|
|
Do this several times a month and you might be tempted to use a
different approach. DMAC provides a setting in the environment
that allows the 16-bit Microsoft and the WebBase products to
change dynamically by job. This comes from the “DMACI”
environment variable.
Not that there aren't problems. First and foremost is the
problem of getting a driver loaded for the video card that
supports VESA (Video Electronics Standards Association)
standards. With this driver loaded, it is possible to determine
whether the video board will perform a certain function or not.
If it is not loaded, some trial and error in software is
necessary. This is not always successful.
Every few months DMAC tweaks its detection of video cards so
that more can be detected in the 16-bit Microsoft mode. Users
want to be freed of this pain, and we are working on it. DMAC
will be testing dynamic response on the 32-bit Microsoft version
in the near future. We need a beta site with mostly 32-bit
Microsoft operating system workstations for this. We are waiting
for the XP OS so that we can be up to date. We will still tweak
the 16-bit and WebBase versions.
|
|