ocr_utils package
Submodules
ocr_utils.alto_to_svg module
class ocr_utils.alto_to_svg.FontSize(default: int, guess: bool, max_value: int)
Bases: object
- default: int
- guess: bool
- max_value: int
class ocr_utils.alto_to_svg.ForeignObject(text: str, x: int, y: int, width: int, height: int, **extra)
Bases: svgwrite.base.BaseElement
- Parameters
extra – extra SVG attributes (keyword arguments):
add a trailing '_' to reserved keywords: 'class_', 'from_'
replace inner '-' with '_': 'stroke_width'
SVG attribute names are checked if debug is True.
Workaround for the attribs parameter removed in version 0.2.2:
# replace
element = BaseElement(attribs=adict)
# by
element = BaseElement()
element.update(adict)
- elementname = 'foreignObject'
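The keyword-argument naming rule described for ForeignObject can be illustrated without svgwrite itself. The following is a minimal sketch, assuming the rule amounts to "drop trailing underscores, then turn the remaining underscores into hyphens". The helper names to_svg_attribute_name and normalize_attributes are hypothetical and are not part of ocr_utils or svgwrite.

```python
from typing import Any, Dict


def to_svg_attribute_name(keyword: str) -> str:
    """Map a Python keyword-argument name to an SVG attribute name.

    Assumed rule: 'class_' -> 'class', 'stroke_width' -> 'stroke-width'.
    """
    return keyword.rstrip("_").replace("_", "-")


def normalize_attributes(**extra: Any) -> Dict[str, Any]:
    """Return the extra keyword arguments keyed by SVG attribute name."""
    return {to_svg_attribute_name(key): value for key, value in extra.items()}


# Reserved words get a trailing underscore, hyphens become underscores.
attributes = normalize_attributes(class_="cell", from_="0", stroke_width=2)
print(attributes)
```

Keeping the conversion in one small function makes the two cases (reserved keyword, hyphenated name) easy to check in isolation.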
class ocr_utils.alto_to_svg.Text(content: str, hpos: float, vpos: float)
Bases: object
- content: str
- hpos: float
- vpos: float
ocr_utils.alto_to_svg.alto_pages_and_cells_to_svg(alto_xml_strings: List[str], pages_cells: List[List[ocr_utils.table.DetectedCell]], default_font_size: int = 40, guess_font_size: bool = True, max_font_size: int = 50) → svgwrite.drawing.Drawing
Generates an SVG image made of the concatenated pages of ALTO XML files and the detected table cells.
- Parameters
alto_xml_strings – ALTO XML strings
pages_cells – detected cells on each page
default_font_size (int) – font size in the output SVG
guess_font_size (bool) – if True, the font size is deduced automatically from the block width when possible (to handle varying font sizes)
max_font_size (int) – when guess_font_size is True, upper bound on the deduced font size (to avoid huge font sizes in edge cases)
- Returns
the SVG drawing, which can be written to a file with its saveas method
- Return type
svgwrite.Drawing
ocr_utils.alto_to_svg.alto_to_svg(input_filename: str, output_filename: str, default_font_size: int = 40, guess_font_size: bool = True, max_font_size: int = 50) → None
Loads an ALTO XML file and generates an SVG image made of its concatenated pages.
- Parameters
input_filename (str) – path of the ALTO XML file
output_filename (str) – path of the output SVG image
default_font_size (int) – font size in the output SVG
guess_font_size (bool) – if True, the font size is deduced automatically from the block width when possible (to handle varying font sizes)
max_font_size (int) – when guess_font_size is True, upper bound on the deduced font size (to avoid huge font sizes in edge cases)
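The interplay of default_font_size, guess_font_size and max_font_size can be sketched as follows. This page does not say how the font size is deduced from the block width, so the estimate below is only an assumption: it supposes an average glyph is half as wide as the font size. CHAR_WIDTH_RATIO and estimate_font_size are hypothetical names, and FontSize is re-declared here with the three documented fields so the example runs on its own.

```python
from dataclasses import dataclass


@dataclass
class FontSize:
    default: int
    guess: bool
    max_value: int


# Assumption: an average glyph is about half as wide as the font size.
CHAR_WIDTH_RATIO = 0.5


def estimate_font_size(text: str, block_width: float, font_size: FontSize) -> int:
    """Guess a font size from the block width, falling back to the default."""
    if not font_size.guess or not text or block_width <= 0:
        # Guessing disabled or impossible: use the default size.
        return font_size.default
    guessed = int(block_width / (CHAR_WIDTH_RATIO * len(text)))
    # Clamp to [1, max_value] to avoid degenerate or huge sizes.
    return max(1, min(guessed, font_size.max_value))


settings = FontSize(default=40, guess=True, max_value=50)
print(estimate_font_size("0123456789", 150.0, settings))   # ordinary block
print(estimate_font_size("0123456789", 1000.0, settings))  # very wide block, clamped
```

The clamp is the part that corresponds to max_font_size: a short string in a very wide block would otherwise produce an oversized font.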
ocr_utils.commons module
ocr_utils.pdf_to_svg module
ocr_utils.table module
class ocr_utils.table.Cell(content: ~T, colspan: int = 1, rowspan: int = 1)
Bases: Generic[ocr_utils.table.T]
- colspan: int = 1
- content: T
- classmethod from_dict(dict_: Dict, factory: Optional[Callable[[Dict], T]] = None) → ocr_utils.table.Cell
- rowspan: int = 1
class ocr_utils.table.Contour(x_0: int, x_1: int, y_0: int, y_1: int)
Bases: object
- x_0: int
- x_1: int
- y_0: int
- y_1: int
class ocr_utils.table.DetectedCell(text: str, contour: ocr_utils.table.Contour, lines: List[alto.TextLine] = <factory>)
Bases: object
- contour: ocr_utils.table.Contour
- lines: List[alto.TextLine]
- text: str
class ocr_utils.table.LocatedTable(table: ocr_utils.table.Table[~T], h_pos: int, v_pos: int, height: int, width: int)
Bases: Generic[ocr_utils.table.T]
- classmethod from_dict(dict_: Dict[str, Any], factory: Optional[Callable[[Dict], T]] = None) → ocr_utils.table.LocatedTable
- h_pos: int
- height: int
- table: ocr_utils.table.Table[T]
- v_pos: int
- width: int
class ocr_utils.table.Row(cells: List[ocr_utils.table.Cell[~T]])
Bases: Generic[ocr_utils.table.T]
- cells: List[ocr_utils.table.Cell[T]]
- classmethod from_dict(dict_: Dict, factory: Optional[Callable[[Dict], T]] = None) → ocr_utils.table.Row
class ocr_utils.table.Table(headers: List[ocr_utils.table.Row[~T]], rows: List[ocr_utils.table.Row[~T]])
Bases: Generic[ocr_utils.table.T]
- classmethod from_dict(dict_: Dict, factory: Optional[Callable[[Dict], T]] = None) → ocr_utils.table.Table
- headers: List[ocr_utils.table.Row[T]]
- rows: List[ocr_utils.table.Row[T]]
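Cell, Row and Table each expose a from_dict classmethod with an optional factory. The sketch below shows one way such generic containers could be built from nested dictionaries. It is not the library's implementation: the dictionary keys ("content", "colspan", "rowspan", "cells", "headers", "rows") are assumed to match the field names, and the factory is assumed to be applied to each cell's content.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")
Factory = Optional[Callable[[Dict], T]]


@dataclass
class Cell(Generic[T]):
    content: T
    colspan: int = 1
    rowspan: int = 1

    @classmethod
    def from_dict(cls, dict_: Dict[str, Any], factory: Factory = None) -> "Cell[T]":
        raw = dict_["content"]  # assumed key name
        content = factory(raw) if factory is not None else raw
        return cls(content, dict_.get("colspan", 1), dict_.get("rowspan", 1))


@dataclass
class Row(Generic[T]):
    cells: List[Cell[T]]

    @classmethod
    def from_dict(cls, dict_: Dict[str, Any], factory: Factory = None) -> "Row[T]":
        return cls([Cell.from_dict(cell, factory) for cell in dict_["cells"]])


@dataclass
class Table(Generic[T]):
    headers: List[Row[T]]
    rows: List[Row[T]]

    @classmethod
    def from_dict(cls, dict_: Dict[str, Any], factory: Factory = None) -> "Table[T]":
        return cls(
            [Row.from_dict(row, factory) for row in dict_["headers"]],
            [Row.from_dict(row, factory) for row in dict_["rows"]],
        )


payload = {
    "headers": [{"cells": [{"content": {"text": "Name"}, "colspan": 2}]}],
    "rows": [{"cells": [{"content": {"text": "Ada"}}, {"content": {"text": "Lovelace"}}]}],
}
# The factory turns each raw content dictionary into the type T (here: str).
table = Table.from_dict(payload, factory=lambda raw: raw["text"].upper())
print(table.headers[0].cells[0])
print([cell.content for cell in table.rows[0].cells])
```

Passing the factory down through every level keeps the containers generic: the same code can yield plain strings, dictionaries, or richer objects as cell content.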
ocr_utils.table.extract_and_hide_cells(image_filename: str, output_filename: str, lang: str) → List[ocr_utils.table.DetectedCell]
Detects cells and returns them with their parsed content. Saves a copy of the image with the detected cells covered by a blank rectangle. Uses OpenCV for structure detection and pytesseract for cell content detection.
- Parameters
image_filename (str) – Path of the input image.
output_filename (str) – Location of the output image (input image with detected tables covered by a blank rectangle).
lang (str) – Language to use when performing OCR.
- Returns
cells – List of detected cells
- Return type
List[DetectedCell]
ocr_utils.table.extract_and_hide_tables(image_filename: str, output_filename: str, lang: str) → List[ocr_utils.table.LocatedTable]
Detects and returns the tables in an image. Saves a copy of the image with the detected tables covered by a blank rectangle. Uses OpenCV for structure detection and pytesseract for cell content detection.
- Parameters
image_filename (str) – Path of the input image.
output_filename (str) – Location of the output image (input image with detected tables covered by a blank rectangle).
lang (str) – Language to use when performing OCR.
- Returns
tables – List of tables with their position in the original image
- Return type
List[LocatedTable]
ocr_utils.table.extract_and_hide_tables_from_image(image: numpy.ndarray, lang: str) → Tuple[numpy.ndarray, List[ocr_utils.table.LocatedTable]]
Detects and returns the tables in an image, using OpenCV for structure detection and pytesseract for cell content detection, then hides the detected tables in the original image.
- Parameters
image (np.ndarray) – Input image as an array of pixels (output of cv2.imread(image_filename, 0))
lang (str) – Language to use when performing OCR
- Returns
image (np.ndarray) – Output image as an array of pixels, with a blank rectangle over each detected table
tables (List[LocatedTable]) – List of tables with their position in the original image
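The "blank rectangle" step used by the extract_and_hide_* functions can be pictured with a small, dependency-free sketch. The library itself works on numpy arrays produced by OpenCV; here a grayscale image is a plain list of pixel rows. The function hide_regions is a hypothetical name, and the sketch assumes that x_0/y_0 is the inclusive top-left corner and x_1/y_1 the exclusive bottom-right corner of a Contour, which this page does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Contour:
    x_0: int
    x_1: int
    y_0: int
    y_1: int


WHITE = 255  # blank pixel value in an 8-bit grayscale image


def hide_regions(image: List[List[int]], contours: List[Contour]) -> List[List[int]]:
    """Return a copy of the image with every contour filled with white."""
    hidden = [row[:] for row in image]  # leave the input untouched
    for contour in contours:
        # Clip the rectangle to the image bounds before filling it.
        for y in range(max(contour.y_0, 0), min(contour.y_1, len(hidden))):
            row = hidden[y]
            for x in range(max(contour.x_0, 0), min(contour.x_1, len(row))):
                row[x] = WHITE
    return hidden


page = [[0] * 4 for _ in range(4)]  # 4x4 black image
masked = hide_regions(page, [Contour(x_0=1, x_1=3, y_0=1, y_1=3)])
for row in masked:
    print(row)
```

Hiding the detected regions before running OCR on the rest of the page avoids reading the same table text twice.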
ocr_utils.table.extract_tables(image_filename: str, lang: str) → List[ocr_utils.table.LocatedTable]
Detects and returns the tables in an image, using OpenCV for structure detection and pytesseract for cell content detection.
- Parameters
image_filename (str) – Path of the input image.
lang (str) – Language to use when performing OCR.
- Returns
tables – List of tables with their position in the original image
- Return type
List[LocatedTable]
ocr_utils.table.extract_tables_from_image(image: numpy.ndarray, lang: str) → List[ocr_utils.table.LocatedTable]
Detects and returns the tables in an image, using OpenCV for structure detection and pytesseract for cell content detection.
- Parameters
image (np.ndarray) – Input image as an array of pixels (output of cv2.imread(image_filename, 0))
lang (str) – Language to use when performing OCR
- Returns
tables – List of tables with their position in the original image
- Return type
List[LocatedTable]