EasyOCR - 78

EasyOCR is a Scorpion wrapper for the Euresys eVision EasyOCR ActiveX control.

The control is licensed separately (see http://www.euresys.com/) and requires a Euresys USB dongle to run.

When installing Scorpion, the eVision module must be selected and installed in the Scorpion Setup. The tool was originally based on eVision Version 6.6. A compatible and complete set of files is installed when installing Scorpion.

EasyOCR is a library dedicated to automatically locating and decoding characters. Before use, the OCR engine must be presented with a sample set of characters to recognize. Character location and segmentation are performed automatically; the user simply identifies each sample character. The library has the following features:

  • Completely integrated training
  • Very fast
  • Character scaling support
  • Automatic compensation of illumination changes
  • Direct contrast (black on white) and inverse contrast (white on black) support

The tool presently detects and reports all matched characters in the ROI as a single long string, together with details (including position and scores) for all matched and unmatched characters in the image.

Background information


Setup

Reference - Reference system selection

ROI (Region of interest)

  • Use whole picture - only possible if the reference is trivial, i.e., with no calibrator, perspective, rotation or scaling. Position/size is ignored if this is checked.
  • Center-X - x
  • Center-Y - y
  • dX - height
  • dY - width
  • Angle - rotation of image
  • Include angle in ROI paste - if unchecked, pasted rectangles will be at zero angle (faster processing)
  • Paste integer values only - if checked, pasted rectangles are forced at pixel centers (faster processing)
  • Size - resampling of the incoming image to shrink/grow.

The ROI can be managed with these buttons:

  • Paste - paste the ROI from the image to the Scorpion clipboard
  • Copy - copy the ROI to the image from the Scorpion clipboard

Point & Click Clipboard Support

The rectangular ROI is defined by four points. A single point changes the center point only.

More on Image Operations.

File for storing OCR training data

The OCR training data is always stored in an external (*.OCR) file. This file is compatible with any program using the Euresys OCR library.

  • Read only - training is disabled; name an external OCR file, e.g., one created with the Euresys EasyAccess program.

Recognition parameters

These parameters are kept separate from the training parameters. For a detailed description, refer to the Euresys documentation. Where applicable, the settings should always be identical during training and recognition. The training system takes care of this automatically, so you will normally never need to change any settings here, with the possible exception of "Compare aspect ratio".

  • Remove narrow or flat chars - by default, narrow AND flat characters are removed (ignored). Check this to ignore characters that are either narrow or flat.
  • Cut large characters - if checked, an attempt is made to cut oversized characters into smaller ones (may help with e.g. ink bloating). If unchecked, large characters are ignored.
  • Remove at border - if checked, characters at ROI edges are ignored.
  • Compare aspect ratio - if checked, stretched or flattened characters get a lower score.
  • Segmentation mode - "repaste objects" means that e.g. the dot over "i" is connected to the stem prior to recognition.
  • Threshold - Euresys recommends using "Min residue".
  • Matching mode - Euresys recommends using "RMS".
  • Relative spacing - Used only when Cut large characters is checked; a number larger than 0 forces a space between the split parts.

Character classes

When training, characters can be classified as Digits, Uppercase, Lowercase or Special characters. Only those selected are included in the matching process.
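The class names also appear in the tool's results (the "Class" key described under Results), so the same kind of selection can be applied afterwards to result data. The following is a minimal post-processing sketch, not a description of how the tool applies the selection internally; `filter_by_class` and the sample data are hypothetical.

```python
# Hypothetical sample data using the keys described under Results.
characters = (
    {"Code": "B", "Class": "Upper", "OK": 1},
    {"Code": "d", "Class": "Lower", "OK": 1},
    {"Code": "7", "Class": "Digit", "OK": 1},
)

def filter_by_class(characters, selected_classes):
    """Keep only characters whose class is in selected_classes (hypothetical helper)."""
    return [c for c in characters if c["Class"] in selected_classes]

# Keep digits and uppercase letters, mirroring a Digits + Uppercase selection.
selected = filter_by_class(characters, {"Digit", "Upper"})
print("".join(c["Code"] for c in selected))  # prints "B7"
```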

Distance and separation

Filters out found characters with a poor match:

  • Max distance to training - normalised distance to training data. Use a higher value (0 <= v <= 1) to accept a poorer match.
  • Minimum separation - match distance ratio between the best two matches. Use a lower value (0 <= v <= 1) to accept similar characters.
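The two limits can be read as a pair of threshold tests on each candidate character. The sketch below illustrates that reading only; the actual filtering happens inside the tool. The function name `is_accepted`, the inclusive comparisons, and all numeric values are assumptions.

```python
def is_accepted(dist, sep, max_distance, min_separation):
    """Return True when a match passes both filters (inclusive limits assumed)."""
    return dist <= max_distance and sep >= min_separation

# Illustrative values only, not tool defaults.
candidates = [
    {"Code": "B", "Dist": 0.05, "Sep": 0.90},
    {"Code": "8", "Dist": 0.40, "Sep": 0.90},  # too far from the training data
    {"Code": "O", "Dist": 0.05, "Sep": 0.10},  # too close to the second-best match
]

accepted = [
    c["Code"]
    for c in candidates
    if is_accepted(c["Dist"], c["Sep"], max_distance=0.25, min_separation=0.5)
]
print(accepted)  # prints ['B']
```

Raising `max_distance` to 0.5 would also accept "8", and lowering `min_separation` to 0.1 would also accept "O", matching the guidance in the two bullets.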


Advanced

By default, the OCR library will automatically segment potential characters in the image. This can be bypassed here.

Character positions

  • Use manual character positioning (bypass segmentation) - check this to enable manual positioning
  • Shift tolerance (up/down) - maximum movement of the manually set position for best match
  • Shifting mode -
    • Characters - each character is moved separately
    • Text - all characters are moved as a whole
  • Positions - list of manually added positions (shown as pixel coordinates within the ROI)
    • Add - add new position from rectangle clicked in the main image
    • Delete - delete selected position
    • Copy - copy selected position back to the main image
    • Paste - modify selected position from rectangle clicked in the main image
  • Highlight - positions can be temporarily highlighted in the main image
    • None - highlight nothing
    • Selected - highlight only the selected position
    • Active - highlight the checked (active) positions
    • All - highlight all positions
    • Refresh - highlights disappear under a number of circumstances; click Refresh to update
  • List right-click menu
    • Add - same as button
    • Delete - same as button
    • Copy - same as button
    • Paste - same as button
    • Delete all - delete all positions in the list


Training

You can include any number of sample images for training the characters. These can be read from file or copied from e.g. the Scorpion main image.

  • Add room for a new image
  • Delete selected image and all its training data
  • Paste image from the clipboard
  • Load image from file
  • Perform interactive training on the selected image (see OCR training below)

Image right-click menu

  • Copy shown image without graphics - copy shown part of image (possibly zoomed) to the clipboard
  • Save shown image without graphics - save shown part of image (possibly zoomed) to file


OCR training

The image is automatically segmented based on these parameters:

  • Width min/max - minimum and maximum width of a single character
  • Height min/max - minimum and maximum height of a single character
  • Noise area - smallest area to be considered
  • Spacing - minimum space between adjacent characters
  • Remove narrow or flat characters - by default, narrow AND flat characters are removed (ignored). Check this to ignore characters that are either narrow or flat.
  • Cut large characters - if checked, an attempt is made to cut oversized characters into smaller ones (may help with e.g. ink bloating). If unchecked, large characters are ignored.
  • Remove at border - if checked, characters at ROI edges are ignored.
  • Text color - "Light on dark" or "Dark on light" are meant to be used with the threshold and matching mode settings (below)
  • Segmentation mode - "repaste objects" means that e.g. the dot over "i" is connected to the stem prior to recognition.
  • Threshold - Euresys recommends using "Min residue".
  • Matching mode - Euresys recommends using "RMS".
  • Relative spacing - Used only when Cut large characters is checked; a number larger than 0 forces a space between the split parts.

After the segmentation is done, any previously assigned character codes/classes are applied. If the segmentation parameters are changed, the assigned codes are kept as far as possible.

WARNING: when the segmentation parameters are changed, this applies to all training images. You should revisit and check all images for consistency after making any changes.

The found characters are displayed in red. Clicking a character highlights the corresponding item in the list on the right. Double-clicking a list item (or pressing RETURN when the item has focus) brings up the learning dialog (below). The item selected in the list is also highlighted in blue in the image. When a code and class have been assigned to a character, they are shown in green in the image.

List right-click menu items

  • Edit - same as double-click
  • Activate - shortcut to (re)activate a previously deselected item
  • Deactivate - remove character from recognition process

In the "Selected pattern" dialog you teach the OCR which character it has found.

  • Active - the character is used in the recognition only if checked
  • Code - Single-character code
  • Digit/Uppercase/Lowercase/Special - Pattern class, used for recognition selection/classification


Visualisation

BadSegment   Found but not accepted character rectangle
Character    Found character code
ReadString   All characters found, in sequence
ROI          Search area
Segment      Found character rectangle


Results

Whole picture            1: whole picture was searched; 0: specified ROI was used
Trivial refsys           1: reference system is trivial, whole picture may be used; 0: not trivial, whole picture not available
Read string              All character codes, in sequence
Number of accepted       Number of recognised characters
Number of not accepted   Number of refused characters
Characters               All characters, as a tuple of Python dictionaries
Accepted                 All accepted characters, as a tuple of Python dictionaries
Not accepted             All refused characters, as a tuple of Python dictionaries

Each Python dictionary string holds a tuple of dictionaries, one per character, with these keys:

  • OK - 0 or 1
  • Code - character code as a single character
  • Class - "Digit", "Upper", "Lower" or "Special"
  • Pos - Object coordinates of top left character corner
  • Dist - distance to training data
  • Sep - ratio of separation to next possible match

Example of a Characters string for two found characters, "B" and "d":

({'Code': 'B', 'Dist': 0.0, 'Sep': 1.0, 'Pos': (224.0, 243.0), 'OK': 1, 'Class': 'Upper'},{'Code': 'd', 'Dist': 0.0, 'Sep': 1.0, 'Pos': (224.0, 262.0), 'OK': 1, 'Class': 'Lower'})
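Because the string is a valid Python literal, it can be turned back into Python objects with the standard library. The sketch below parses the example string above and rebuilds the read string from the accepted characters. How the string is retrieved from the tool is not shown here, and `parse_characters` is a hypothetical helper name.

```python
import ast

# Example "Characters" result string, copied from the example above.
characters_text = (
    "({'Code': 'B', 'Dist': 0.0, 'Sep': 1.0, 'Pos': (224.0, 243.0), 'OK': 1, 'Class': 'Upper'},"
    "{'Code': 'd', 'Dist': 0.0, 'Sep': 1.0, 'Pos': (224.0, 262.0), 'OK': 1, 'Class': 'Lower'})"
)

def parse_characters(text):
    """Convert a result string into a tuple of dictionaries (hypothetical helper)."""
    value = ast.literal_eval(text)  # evaluates Python literals only, never code
    # Defensive: a lone dictionary would evaluate to a dict, not a tuple.
    if isinstance(value, dict):
        value = (value,)
    return value

characters = parse_characters(characters_text)

# Rebuild the read string from the accepted characters (OK == 1).
read_string = "".join(c["Code"] for c in characters if c["OK"] == 1)
print(read_string)  # prints "Bd"

# Each entry exposes the documented keys, e.g. the top-left corner position.
for c in characters:
    print(c["Code"], c["Class"], c["Pos"], c["Dist"], c["Sep"])
```

`ast.literal_eval` is used rather than `eval` so that a malformed or unexpected string cannot execute code.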


ExecuteCmd support
(see also executeCmd)

Command   Parameters                         Return values   Comments
Set       Object=ROI;Value=<point/polygon>   ok,res          Sets the tool's ROI. See Copy/paste ROIs for details.
Get       Object=ROI                         ok,<polygon>    Current ROI (angled rectangle).

 

 

Scorpion Vision Version XII : Build 646 - Date: 20170225
Scorpion Vision Software® is a registered trademark of Tordivel AS.
Copyright © 2000 - 2017 Tordivel AS.