ReadingOrderProcessor
- class burdoc.processors.reading_order_processor.ReadingOrderProcessor(log_level: int = 20)
Infers the correct reading order for all elements on a page.
The ReadingOrderProcessor analyses each section of a page independently and uses a combination of heuristics to order the elements within each section, before creating an overall ordering of the sections.
Requires: ["page_bounds", "elements", "image_elements"]
Optional: ["tables"]
Generates: ["elements"]
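The two-stage approach described above (order within each section, then order the sections) can be illustrated with a minimal sketch. This is not Burdoc's implementation: the actual heuristics are not documented here, so a simple top-to-bottom, left-to-right rule stands in for them. The `Box` type, `order_section`, `order_page`, and the `line_tolerance` parameter are all hypothetical names introduced for illustration.

```python
from typing import List, Tuple

# Hypothetical bounding box: (x0, y0, x1, y1), with y increasing down the page.
Box = Tuple[float, float, float, float]


def order_section(boxes: List[Box], line_tolerance: float = 5.0) -> List[Box]:
    """Order boxes top-to-bottom, then left-to-right within a row.

    Assumed stand-in heuristic: boxes whose top edges lie within
    line_tolerance of the first box in a row are treated as the same row.
    """
    rows: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(rows[-1][0][1] - box[1]) <= line_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [box for row in rows for box in sorted(row, key=lambda b: b[0])]


def order_page(sections: List[List[Box]]) -> List[Box]:
    """Order each section independently, then order sections by top edge."""
    ordered_sections = [order_section(s) for s in sections if s]
    ordered_sections.sort(key=lambda s: min(b[1] for b in s))
    return [box for section in ordered_sections for box in section]
```

Ordering within a section before ordering the sections keeps a local layout quirk, such as two columns inside one section, from disturbing the order of unrelated content elsewhere on the page.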
- add_generated_items_to_fig(page_number: int, fig: Figure, data: Dict[str, Any])
Draw any items generated by this processor to a page image
- generates() -> List[str]
Return list of fields added by this processor
- requirements() -> Tuple[List[str], List[str]]
Return a tuple of two lists: the required data fields, then the optional data fields
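The contract expressed by requirements() and generates() can be sketched without importing Burdoc. The class below is a hypothetical stand-in that returns the field names listed in this entry; `StandInProcessor` and `missing_fields` are illustrative names, not part of the library, and the real class may compute these values differently.

```python
from typing import Any, Dict, List, Tuple


class StandInProcessor:
    """Hypothetical stand-in mirroring the documented interface; not Burdoc code."""

    def requirements(self) -> Tuple[List[str], List[str]]:
        # (required fields, optional fields), as listed in this entry
        return (["page_bounds", "elements", "image_elements"], ["tables"])

    def generates(self) -> List[str]:
        # Fields this processor adds to or overwrites in the data
        return ["elements"]


def missing_fields(processor: StandInProcessor, data: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent from data."""
    required, _optional = processor.requirements()
    return [name for name in required if name not in data]
```

A caller could run such a check before invoking a processor, so that a missing required field is reported by name and a missing optional field such as "tables" is simply ignored.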