ReadingOrderProcessor

class burdoc.processors.reading_order_processor.ReadingOrderProcessor(log_level: int = 20)

Infers the correct reading order for all elements on a page.

The ReadingOrderProcessor analyses each section of a page independently, using a combination of heuristics to order the elements within it, and then creates an overall ordering of the sections.

Requires: ["page_bounds", "elements", "image_elements"]

Optional: ["tables"]

Generates: ["elements"]
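The two-stage idea described above (order elements within each section, then order the sections) can be illustrated with a minimal sketch. This is not burdoc's implementation: the `Box` type, the `column_gap` threshold and the simple column-then-row heuristic are all assumptions made for illustration, and the sketch assumes that y coordinates increase down the page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Hypothetical bounding box for illustration; not a burdoc type."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float


def order_section(boxes: List[Box], column_gap: float = 20.0) -> List[Box]:
    """Order one section: columns left to right, then top to bottom in each column."""
    columns: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b.x0):
        # Join the current column if this box starts close to that column's left edge,
        # otherwise start a new column further to the right.
        if columns and box.x0 - columns[-1][0].x0 < column_gap:
            columns[-1].append(box)
        else:
            columns.append([box])
    ordered: List[Box] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y0))
    return ordered


def order_page(sections: List[List[Box]]) -> List[Box]:
    """Order each section independently, then order the sections top to bottom."""
    ordered_sections = [order_section(s) for s in sections if s]
    ordered_sections.sort(key=lambda s: min(b.y0 for b in s))
    return [box for section in ordered_sections for box in section]
```

For a page with a title section above a two-column body, `order_page` returns the title first, then the left column from top to bottom, then the right column. A real processor would need further heuristics for images, tables and irregular layouts.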

add_generated_items_to_fig(page_number: int, fig: Figure, data: Dict[str, Any])

Draw any items generated by this processor onto a page image.

generates() -> List[str]

Return the list of fields added by this processor.

requirements() -> Tuple[List[str], List[str]]

Return a list of required data fields and a list of optional data fields.
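As a rough illustration of how the requirements() and generates() contract might be consumed, the sketch below uses a stand-in class that returns the field lists documented for this processor. `StandInProcessor` and `missing_fields` are hypothetical names introduced here; they are not part of burdoc, and the real pipeline may validate its data differently.

```python
from typing import Any, Dict, List, Tuple


class StandInProcessor:
    """Hypothetical stand-in mirroring the documented field lists; not burdoc code."""

    def requirements(self) -> Tuple[List[str], List[str]]:
        # (required fields, optional fields), as listed in the class description
        return (["page_bounds", "elements", "image_elements"], ["tables"])

    def generates(self) -> List[str]:
        # Fields this processor adds to, or overwrites in, the shared data
        return ["elements"]


def missing_fields(processor: StandInProcessor, data: Dict[str, Any]) -> List[str]:
    """Return required fields absent from data; optional fields are never reported."""
    required, _optional = processor.requirements()
    return [field for field in required if field not in data]
```

A caller could run such a check before invoking a processor, so that a missing required field is reported up front while an absent optional field such as "tables" is tolerated.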