CAPBAK/X, Ver. 5.2
RELEASE NOTES

© Copyright 1999-2006 by Software Research, Inc.

SUMMARY

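This Release Note describes changes and additions in CAPBAK/X Ver. 5.2 for UNIX platforms.
Changes to the CAPBAK/X product that are described here include:

	Multi-Platform OCR Font Training
	RGB Specification Option
	Additional API Functions
	Special Characteristics of the RS/6000 AIX 4.2 Version of CAPBAK/X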

If you already have the CAPBAK/X 5.1 software, please contact SR using the Information Request Form to obtain your copy of CAPBAK/X 5.2.


Multi-Platform OCR Font Training

Multi-Platform OCR Font Training -- Summary

In client/server applications tested with CAPBAK/X, connections between UNIX and Windows machines are increasingly common. For example, a UNIX-based client may be viewing material generated on a Windows NT based server. One product that displays both kinds of fonts on the UNIX side is "CoSession" by UniPress.

The font training enhancement described below provides the option of using supplied font training files that handle the most common PC fonts, the most common UNIX fonts, or both together.

PC Fonts Recognized

PC-based font character recognition support is provided for the following standard Microsoft TrueType fonts:

	Times Roman
	Courier
	MS-Line
	MS-Sans Serif

Currently, training is optimized to recognize these fonts at 8, 10, and 12 point sizes, including bold type. Our experience is that once PC-based text is emulated in a UNIX X Window environment, the most legible size is a 12 point PC font, because of the difference in screen resolution between PC and UNIX hardware. MS-Windows emulations that use the standard UNIX color map are supported.

UNIX Only Fonts Recognized

The set of UNIX fonts recognized varies from platform to platform. In all cases, however, the UNIX training supports the 95% most commonly used typefaces.

Combined UNIX & PC Fonts Recognized

In addition to the standard set of UNIX-specific fonts supported on the platforms in our product line, optical character recognition for PC-based fonts is available at the same time. This is helpful when the majority of applications, or the mission-critical ones, are developed in a UNIX environment yet the Application Under Test (AUT) resides in an MS-Windows environment.

Another scenario arises when the managing software resides on UNIX, and has not been and will not be ported to Windows, yet the AUT is MS-Windows based. Supported PC-based fonts are listed in the preceding section.

How to Choose the Right Training File, When to Use Which File

Optical character recognition training is based on a neural-network-like structure. Although in theory an OCR training file can accept as many fonts as can be trained, the laws of statistics take over and the recognition success rate suffers as more fonts are added. This is why we chose not to train for every conceivable typeface on the market.

We offer PC-UNIX font recognition without significant compromise to the recognition success rate. On this basis, we encourage users to select the most appropriate training file when testing their AUT.

If the AUT is MS-Windows based exclusively, use the PC-only OCR training file. On the other hand, if testing requires the monitoring of both UNIX and PC activity, use the PC-UNIX based OCR training file.


RGB Specification Option

OCR COLOR PROCESSING ENHANCEMENT -- SUMMARY

OCR effectiveness can be reduced by CAPBAK/X's built-in algorithms for processing closely matched colors. In realistic scenarios that involve such images, the OCR engine becomes confused by the colors and the recognition percentage drops off badly.

The enhancement described below gives the user additional options for selecting pixel values or RGB values. In turn, this gives the user much more effective OCR recognition.

BACKGROUND AND TECHNICAL APPROACH

How CAPBAK/X Currently Works; Why the New Facilities are Needed

Without the options described below, CAPBAK/X captures an image and dithers it into a 1-bit-plane equivalent, the derived image, with foreground and background set according to the current definitions for white and black.

The problem is that the OCR engine fails if there is text embedded in complex images, e.g. red letters against a pink background. What counts as foreground and background determines which 1-bit-plane derived image is sent to the OCR engine.

The difficulty is that the built-in dithering misleads the OCR engine; it then tries to extract ASCII from a sloppy image.

The Technical Approach

The user supplies either rgb.rc, a set of RGB color values and a threshold for applying them, or pixel.rc, a set of screen pixel values to be cast as foreground or background. With this new capability you can override the built-ins if you want and fully specify the rules by which the 1-bit-plane derived image is generated. This gives you nearly complete control over the OCR recognition process.

SPECIAL pixel.rc and/or rgb.rc FILES

Where You Put These Files

You put either the rgb.rc or the pixel.rc file at $SR/ocr.

If there is no $SR/ocr directory, or if $SR/ocr exists and does not contain either rgb.rc or pixel.rc, then CAPBAK/X acts normally using its own built-ins.

If $SR/ocr/pixel.rc exists, pixel.rc data is used even if $SR/ocr/rgb.rc exists.

If $SR/ocr/rgb.rc exists but $SR/ocr/pixel.rc does NOT exist, CAPBAK/X uses $SR/ocr/rgb.rc data.

Precedence Between rgb.rc and pixel.rc

A pixel.rc file, if present, takes precedence over an rgb.rc file, if present. If there is no pixel.rc file present, then CAPBAK/X tries to find an rgb.rc file and, if it finds one, applies the values found there. If neither a pixel.rc file nor an rgb.rc file is found, CAPBAK/X behaves normally.
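The precedence rule can be summarized in a few lines of C. This is only an illustrative sketch, not CAPBAK/X source code: the names ocr_mode, choose_ocr_mode, file_readable, and detect_ocr_mode are hypothetical, and the sketch checks only whether each file can be opened, not whether its contents are valid.

```c
#include <stdio.h>
#include <stdlib.h>

/* Which rule source governs the derived 1-bit image (hypothetical names). */
typedef enum { MODE_BUILTIN, MODE_RGB, MODE_PIXEL } ocr_mode;

/* Pure decision: pixel.rc wins, then rgb.rc, then the built-ins. */
ocr_mode choose_ocr_mode(int have_pixel_rc, int have_rgb_rc)
{
    if (have_pixel_rc) return MODE_PIXEL;
    if (have_rgb_rc)   return MODE_RGB;
    return MODE_BUILTIN;
}

/* Returns 1 if dir/name can be opened for reading, else 0. */
int file_readable(const char *dir, const char *name)
{
    char path[1024];
    FILE *fp;
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);

    if (n < 0 || (size_t)n >= sizeof path) return 0;  /* path too long */
    fp = fopen(path, "r");
    if (fp == NULL) return 0;
    fclose(fp);
    return 1;
}

/* Looks in $SR/ocr; an unset $SR is treated as "use the built-ins". */
ocr_mode detect_ocr_mode(void)
{
    const char *sr = getenv("SR");
    char dir[1024];
    int n;

    if (sr == NULL) return MODE_BUILTIN;
    n = snprintf(dir, sizeof dir, "%s/ocr", sr);
    if (n < 0 || (size_t)n >= sizeof dir) return MODE_BUILTIN;
    return choose_ocr_mode(file_readable(dir, "pixel.rc"),
                           file_readable(dir, "rgb.rc"));
}
```

Keeping the decision in choose_ocr_mode, separate from the file checks, makes the rule easy to verify on its own: when both files are present the result is MODE_PIXEL.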

These two capabilities give a user the ability to customize the CAPBAK/X OCR recognition in a way that maximizes the ability to extract useful ASCII information from a captured screen image.

rgb.rc FILE FORMAT

The user specifies a set of up to 255 RGB values, represented as follows, in the rgb.rc file that CAPBAK/X reads at OCR activation time:

Foreground RGB Specification Format

You specify the RGB color values you want treated as the foreground in the rgb.rc file, using one line for each color value, as shown below:
	foreground
	000 000 000
	120 120 120
	122 147 210
	064 064 064
	...
	(EOF)

Reverse RGB Specification Format

If the rgb.rc file instead begins with the line background, as shown below, the file specifies the colors that become the background (rather than the foreground):
	background
	255 255 255
	120 120 120
	255 200 195
	...
	(EOF)

The process is then the same, except that the specified colors are those which are mapped into the background, all others being mapped into the foreground.

Tolerance Specification

You can, in the rgb.rc file and with either of the above two formats, specify the tolerance at which RGB values are matched between the current screen image and the baselined image. You do this by adding a tolerance phrase to the first line of the file:
	background tolerance N
	255 255 255
	120 120 120
	255 200 195
	...
	(EOF)

where N is a decimal number from 0 to 100 that specifies the percentage of closeness required for two RGB values to match.

Formatting Rules, rgb.rc File

This input file format follows that commonly used in the rgb.txt file (see below). Note that any information on any line in the file from the 12th character onward is ignored. Any line that does not match the above format produces an error message and is ignored.

Note that you can choose any 24-bit (3 x 8-bit) RGB value that is possible on the machine in this fashion. In a foreground file, every color listed becomes a foreground color in the derived image that is generated from the captured image and sent to the OCR engine; in a background file, every color listed becomes a background color.

The file must contain at least one valid color or it is ignored.

The file must begin with either foreground or background (see above) or the file is ignored and operation reverts to normal mode.
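The formatting rules above can be illustrated with a small parser. This is a minimal sketch under stated assumptions, not the parser CAPBAK/X uses: the function names parse_rgb_header and parse_rgb_line are hypothetical, a missing tolerance phrase is reported as -1 (a convention chosen for this sketch), and anything other than an optional "tolerance N" after the first word is rejected.

```c
#include <stdio.h>
#include <string.h>

/* Parses the first line of rgb.rc: "foreground" or "background",
   optionally followed by "tolerance N" with N in 0..100.
   Returns 1 on success, 0 if the line is not a valid header.
   *tolerance is set to -1 when no tolerance phrase is given. */
int parse_rgb_header(const char *line, int *is_foreground, int *tolerance)
{
    char word[16], key[16];
    int value = -1;
    int n = sscanf(line, "%15s %15s %d", word, key, &value);

    if (n < 1) return 0;
    if (strcmp(word, "foreground") == 0)      *is_foreground = 1;
    else if (strcmp(word, "background") == 0) *is_foreground = 0;
    else return 0;

    *tolerance = -1;
    if (n == 1) return 1;
    if (n == 3 && strcmp(key, "tolerance") == 0 &&
        value >= 0 && value <= 100) {
        *tolerance = value;
        return 1;
    }
    return 0;
}

/* Parses one color line. Only the first 11 characters are examined,
   so text from the 12th character onward is ignored.
   Returns 1 and stores r, g, b (each 0..255) on success, else 0. */
int parse_rgb_line(const char *line, int *r, int *g, int *b)
{
    char field[12];
    int rr, gg, bb;

    strncpy(field, line, 11);
    field[11] = '\0';
    if (sscanf(field, "%d %d %d", &rr, &gg, &bb) != 3) return 0;
    if (rr < 0 || rr > 255 || gg < 0 || gg > 255 || bb < 0 || bb > 255)
        return 0;
    *r = rr; *g = gg; *b = bb;
    return 1;
}
```

Note that %d reads a value such as 064 as decimal 64, which matches the zero-padded style of the examples above. A caller would report an error and skip any line for which parse_rgb_line returns 0, and would ignore the whole file if the header is rejected or no valid color line is found.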

RGB File Option

This extension to CAPBAK/X provides a user with the capability of specifying a list of Red/Green/Blue (RGB) values that make up the foreground color [the background color], with any other colors being assigned to the background color [the foreground color]. RGB values are standardly specified as three 8-bit quantities.

Pixel File Option

In addition, the user is given the option of using a pixel.rc file that specifies the pixel values that make up the foreground color [the background color], with any other colors being assigned to the background color [the foreground color]. Pixel values are normally specified as a single 8-bit pixel, except for monochrome displays (in which case these options are unnecessary).

rgb.rc Operation Mode

The new RGB specification feature is active only in case there is a rgb.rc file present in the directory $SR/ocr. If this file is absent then CAPBAK/X behaves normally. Note that $SR is an environment variable that points to the directory at which TestWorks is installed.

rgb.rc File Absent

If the rgb.rc file is NOT PRESENT in the $SR/ocr directory, then the OCR software behaves as it presently does.

Normal Operation

CAPBAK/X, when presented with a color image and asked to extract ASCII text from it using the built-in OCR engine, converts the screen image through a process called dithering into a derived monochrome image that is one bit plane deep.

Normally this image contains derived pixels (actually, single bits) that are selected on the basis of known machine characteristics, properties of the color map settings, and several other factors. CAPBAK/X settings are pre-set to give effective use of the OCR capability on most machines, with most displays, and on most platforms.

OPERATION WITH rgb.rc FILE PRESENT

The algorithm, described here in pseudocode, works in the following way. The candidate image is the one the user has selected on the screen. The derived image pixels, with foreground and background as 1 and 0, respectively, are found by looking up the RGB values of each actual pixel to see if they are in the user-supplied rgb.rc file:

	for (all pixels in the candidate image)
		{
		Assume no match will be found, so:
		derived-image-pixel = background = 0;
		for (each defined color in the list in rgb.rc)
			if (color of pixel = color in list)
			{
				derived-image-pixel = foreground = 1;
				stop the scan;
			}
		}
	Activate the OCR engine using the derived image;

Note: Because the innermost loop stops as soon as a color match is found between some color specified in the rgb.rc file and the current pixel, it is best to put the color values in the rgb.rc file in the order that is most likely to match. For short lists of RGB values overall performance is not expected to be a problem. This gives the user the flexibility to specify a set of colors, all of which are aggregated into the foreground color.
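The loop above can be written as a short, self-contained C routine. This is a sketch for illustration only, not the CAPBAK/X implementation: the type rgb8 and the functions in_list and derive_image are hypothetical names, colors are compared for exact equality (no tolerance), and the image is assumed to be already available as an array of 8-bit RGB triples rather than being read from the X server.

```c
#include <stddef.h>

/* One 8-bit-per-channel color (hypothetical type for this sketch). */
typedef struct { unsigned char r, g, b; } rgb8;

/* Returns 1 if color c appears in list[0..n-1], else 0.
   The scan stops at the first match, as in the pseudocode. */
int in_list(rgb8 c, const rgb8 *list, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (list[i].r == c.r && list[i].g == c.g && list[i].b == c.b)
            return 1;
    return 0;
}

/* Fills derived[i] with 1 (foreground) or 0 (background).
   If list_is_foreground is nonzero, listed colors map to 1 and all
   others to 0; otherwise listed colors map to 0 and all others to 1. */
void derive_image(const rgb8 *image, size_t npixels,
                  const rgb8 *list, size_t nlist,
                  int list_is_foreground, unsigned char *derived)
{
    size_t i;
    for (i = 0; i < npixels; i++) {
        int match = in_list(image[i], list, nlist);
        derived[i] = (unsigned char)(list_is_foreground ? match : !match);
    }
}
```

The list_is_foreground flag covers both file formats with one routine: a file that begins with background simply inverts the result of the lookup.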

How This Method Works

To avoid confusion, but without requiring the reader to understand all of the intricate details of the UNIX X Window system, it is useful to point out how the above algorithm is actually implemented.

The X Window system renders each colored pixel from a machine-dependent set of three 16-bit color intensity values. The X Window call XQueryColors(...) fills in a C structure with the three 16-bit quantities that are actually being used on your system to generate your display.

The three 8-bit RGB values in each entry of the user-supplied table are converted into the same X Window structure with the function call XParseColor.

Hence, the color comparison in the pseudocode above is done at this level, i.e. by comparing the outputs of these two function calls for the particular pixel in question.
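To make the 8-bit to 16-bit step concrete, the sketch below shows one way an 8-bit intensity can be widened to 16 bits and then compared. It is an assumption-laden illustration, not CAPBAK/X code and not a call into Xlib: the names rgb16, scale8to16, rgb16_from_8, and rgb16_equal are hypothetical, and the byte-replication scaling shown is the one Xlib documents for the "rgb:RR/GG/BB" color string form. The older "#RRGGBB" form instead places the byte in the high-order bits (0xFF becomes 0xFF00), and the source does not say which form CAPBAK/X passes to XParseColor.

```c
/* 16-bit-per-channel color, mirroring the red/green/blue members of
   the XColor structure (hypothetical stand-in type for this sketch). */
typedef struct { unsigned short red, green, blue; } rgb16;

/* Widen an 8-bit intensity to 16 bits by repeating the byte,
   so 0x00 -> 0x0000, 0x7A -> 0x7A7A, and 0xFF -> 0xFFFF. */
unsigned short scale8to16(unsigned char v)
{
    return (unsigned short)((v << 8) | v);
}

/* Build a 16-bit color from three 8-bit table values. */
rgb16 rgb16_from_8(unsigned char r, unsigned char g, unsigned char b)
{
    rgb16 c;
    c.red   = scale8to16(r);
    c.green = scale8to16(g);
    c.blue  = scale8to16(b);
    return c;
}

/* Exact comparison at the 16-bit level. */
int rgb16_equal(rgb16 a, rgb16 b)
{
    return a.red == b.red && a.green == b.green && a.blue == b.blue;
}
```

Because the values the server reports for a pixel are machine dependent, an exact 16-bit comparison may fail for colors that look identical on the screen. This is one reason a tolerance setting can be useful.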

Note that you specify the RGB values in your Xdefaults file, or in dozens of different ways within your application. The most common source for this information is the named colors in the file rgb.txt, which is normally found under /usr/lib/X11/.... The specific location will vary from system to system.

OPERATION WITH pixel.rc FILE PRESENT

The new pixel value specification feature is active only in case there is a pixel.rc file present in the directory $SR/ocr. If this file is absent, and if there also is no rgb.rc file in $SR/ocr, then CAPBAK/X behaves normally.

Note that $SR is an environment variable that points to the directory at which TestWorks is installed.

pixel.rc File Format

The user specifies a set of up to 255 pixel values, represented as follows, in the pixel.rc file that CAPBAK/X reads at OCR activation time:
	foreground
	00
	12
	12
	06
	FF
	...
	(EOF)

Reverse Pixel Specification Format

If the pixel.rc file instead begins with the line background, as follows:
	background
	FF
	36
	25
	...
	(EOF)

then the process is the same, except that the specified pixel values are those which are mapped into the background, all others being mapped into the foreground.

Formatting Rules, pixel.rc File

This input file format follows that commonly used in the rgb.txt file.

Note that any information on any line in the file from the 4th character and beyond is ignored.

Any lines not matching the above format generate an error message to standard output and are ignored.

Note that you can choose any 8-bit pixel value that is possible on the machine in this fashion. Every color that is given is put in as a foreground color in the derived image that is generated from the captured image and sent to the OCR engine.

You have to have at least one valid pixel specified or the file is ignored.

The file must begin with either foreground or background or the file is ignored and operation reverts to normal mode.
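A value line in pixel.rc can be checked with a routine like the one below. This is a sketch under assumptions, not CAPBAK/X code: the function name parse_pixel_line is hypothetical, and values are assumed to be hexadecimal because the examples above include FF, although the source does not state the number base explicitly.

```c
#include <stdlib.h>
#include <string.h>

/* Parses one pixel.rc value line. Only the first 3 characters are
   examined, so text from the 4th character onward is ignored.
   Assumes hexadecimal values in the range 00..FF.
   Returns 1 and stores the value on success, else 0. */
int parse_pixel_line(const char *line, unsigned char *pixel)
{
    char field[4];
    char *end;
    long value;

    strncpy(field, line, 3);
    field[3] = '\0';

    value = strtol(field, &end, 16);
    if (end == field) return 0;              /* no hex digits found */

    /* Allow only trailing white space inside the 3-character field. */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r')
        end++;
    if (*end != '\0') return 0;

    if (value < 0 || value > 0xFF) return 0; /* not an 8-bit pixel */
    *pixel = (unsigned char)value;
    return 1;
}
```

A caller would print an error and skip any line for which this routine returns 0, and would ignore the whole file if no line yields a valid pixel value.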

Operation with the pixel.rc File Present

Here is the algorithm for pixel.rc processing:
	for (all pixels in the candidate image)
		{
		Assume no match will be found, so:
		derived-image-pixel = background = 0;
		for (each pixel in the list from pixel.rc)
			if (actual pixel = pixel in list)
			{
				derived-image-pixel = foreground = 1;
				stop the scan;
			}
		}
	Activate the OCR engine using the derived image;

OBTAINING DATA FOR pixel.rc OR rgb.rc

Where do you get the data for these files?

Obtaining pixel.rc Data: Use the utility xmag (supplied), which comes from the X11 samples and is freely distributable, to examine actual pixel values. xmag also shows 16-bit RGB values, but not the 8-bit RGB values used in rgb.rc.

Obtaining rgb.rc Data: Use the values from rgb.txt at /usr/lib/X11.... Alternatively, get color values from xmag (supplied with X11) or some equivalent utility.


ADDITIONAL API FUNCTIONS

The following are the details for two new calls in the CAPBAK/X 5.2 API.

Color Match


	int cb_location_RGB8_color(x, y, r, g, b, tolerance)
	int x, y, tolerance;
	short r, g, b;

This function checks whether the RGB value of the pixel at location x,y matches the red, green and blue values specified by the r, g, and b parameters, within the tolerance specified, and returns 1 if so. Otherwise, 0 is returned.

The r, g, and b parameters are the values for red, green and blue, and each must be between 0 and 255.

The tolerance parameter indicates a +/- percentage threshold by which the actual RGB value at the specified pixel can vary and still be considered a match. It should have a value between -1 and 99, where -1 means use the default (+/- 2%) and 0 means an exact match. A higher value means that more colors on the screen will match the specified RGB value.
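The tolerance rule can be pictured with the sketch below. It is not the CAPBAK/X implementation: the source does not say how the percentage is applied, so this sketch assumes it is a percentage of the full 8-bit range (255) checked independently on each channel. The function name rgb8_within_tolerance and the macro DEFAULT_TOLERANCE are hypothetical.

```c
#include <stdlib.h>

#define DEFAULT_TOLERANCE 2   /* percent, used when tolerance is -1 */

/* Returns 1 if each channel of the actual color lies within
   +/- tolerance percent of full scale (255) of the requested color,
   else 0. A tolerance of 0 demands an exact match. */
int rgb8_within_tolerance(int actual_r, int actual_g, int actual_b,
                          int r, int g, int b, int tolerance)
{
    int pct = (tolerance < 0) ? DEFAULT_TOLERANCE : tolerance;

    /* Compare 100*|difference| with pct*255 to stay in integers. */
    return 100 * abs(actual_r - r) <= pct * 255 &&
           100 * abs(actual_g - g) <= pct * 255 &&
           100 * abs(actual_b - b) <= pct * 255;
}
```

Under this reading, the default of 2% allows each channel to differ by up to 5 steps out of 255. A call to the real API takes the form cb_location_RGB8_color(x, y, r, g, b, tolerance); for example, passing 255, 0, 0 and -1 as the last four arguments asks whether the pixel at x,y is pure red within the default tolerance.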

Expanded cb_location Function


	int cb_location_RGB16_color(x, y, r, g, b)
	int x, y;
	unsigned short r, g, b;

cb_location_RGB16_color is similar to cb_location_RGB8_color but allows the RGB values to be specified as 16-bit values. Since the X server stores colors as 16-bit RGB values, this allows an exact match to be made. There is no tolerance factor for this function call: a 1 is returned only if the RGB value at pixel location x,y exactly matches the specified r, g, and b.

The r, g, and b parameters are the values for red, green and blue, and each must be between 0x0 and 0xFFFF.


SPECIAL CHARACTERISTICS OF THE RS/6000 AIX 4.2 VERSION OF CAPBAK/X

Minor Instabilities Noted

For users of CAPBAK/X Ver. 5.2 on AIX 4.n platforms, we have found some minor instabilities that do not appear on other platforms and thus are due to AIX 4.n specific architectural features.

In normal operation you should not experience any difficulty in playing back already existing recordings. The instabilities arise when you are making an Object Mode (OM) recording, which requires that CAPBAK/X 5.2 be linked to your application using the special libSRXt.a library.

When you provide the CAPBAK/X-linked application with "too much" input, the synchronization between CAPBAK/X, your application, and the underlying windowing system can cause the application to abort. An example is when you vigorously move a slide bar back and forth as fast as you can for ten seconds; the linked application may abort early in this interval, i.e. after one or two seconds, or it may get all the way through the ten-second interval successfully.

The workaround in this case is to re-record the session, if possible being reasonably realistic about the speed at which you make events happen.

Important note: The phenomenon described above has NOT been reported on other platforms.

Static Linking Required on CAPBAK/X Ver. 5.2

To use object mode on an AIX 4.n system you must link all of the SR-supplied versions of libXta and all the X and Motif libraries statically. That is, you must use the:
-b static
flag when linking these libraries.

This restriction applies whether the ld command or compiler commands are used to link your application.

Object Mode operation of CAPBAK/X Ver. 5.2 is not supported on releases of AIX older than Ver. 4.n. Testing and checkout of CAPBAK/X Ver. 5.2 has been performed only on AIX Ver. 4.2; however, no sites using the earlier versions of AIX have reported any difficulties.
