Processor
- class burdoc.processors.processor.Processor(name: str, log_level: int = 20, max_threads: int | None = None)
Abstract base class for a general Processor. Processors receive data in a single blob, extract any needed data, then write new or updated fields back to the data store.
- abstract add_generated_items_to_fig(page_number: int, fig: Figure, data: Dict[str, Any])
Draws any items generated by this processor onto a page image.
- check_requirements(data: Any) → bool
Checks that required data fields are present in the data.
- Parameters:
data (Any) – Primary data store
- Returns:
True if all required fields are present, False otherwise
- Return type:
bool
- abstract generates() → List[str]
Returns the list of fields added by this processor.
- get_data(data: Any) → List[Dict[int, Any]]
Returns the data for each of the processor's fields as a list, one entry per field. An optional field that is not present in the data is returned as None.
- Parameters:
data (Any) – Primary data store
- Returns:
List of fields
- Return type:
List[Dict[int, Any]]
- get_page_data(data: Dict[str, Dict[int, Any]], page_number: int | None = None) → Iterator[List[Any]]
Returns an iterator over the passed data, segmented by page number. An optional field that is not present in the data is returned as None.
- Parameters:
data (Dict[str, Dict[int, Any]]) – Primary data store
page_number (Optional[int], optional) – Return a specific page’s data. Defaults to None.
- Yields:
Iterator[List[Any]] – An iterator over the page-grouped fields
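The page-wise grouping described above can be sketched as a standalone function. This is an illustration of the idea only, not burdoc's implementation: the helper name `iter_page_data`, the field names (`page_bounds`, `text_elements`, `tables`), and the ordering of each yielded list (required fields first, then optional ones) are all assumptions.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_page_data(
    data: Dict[str, Dict[int, Any]],
    required: List[str],
    optional: List[str],
    page_number: Optional[int] = None,
) -> Iterator[List[Any]]:
    """Yield one list of field values per page (hypothetical helper)."""
    if page_number is not None:
        pages = [page_number]
    else:
        # Assumption: the page set is taken from the required fields.
        pages = sorted({p for name in required for p in data[name]})

    for page in pages:
        row = [data[name][page] for name in required]
        # Optional fields missing from the store come back as None.
        row += [data[name].get(page) if name in data else None for name in optional]
        yield row


# Hypothetical data store: field name -> {page number -> value}
store = {
    "page_bounds": {0: (0, 0, 612, 792), 1: (0, 0, 612, 792)},
    "text_elements": {0: ["Title"], 1: ["Body"]},
}

rows = list(iter_page_data(store, ["page_bounds", "text_elements"], ["tables"]))
single = list(iter_page_data(store, ["text_elements"], [], page_number=1))
```

Here `rows` holds one list per page with None in the slot for the absent `tables` field, and `single` holds only the entry for page 1.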
- initialise()
Performs any expensive operations required to set up the processor.
- process(data: Any) → Any
Transforms the data passed to the processor.
- abstract requirements() → Tuple[List[str], List[str]]
Returns a tuple of two lists: the required data fields and the optional data fields.
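The contract formed by requirements(), generates(), and check_requirements() can be illustrated with a small stand-in. The sketch below does not import burdoc: `ProcessorSketch` is a hypothetical base class that mirrors only part of the documented interface, and `WordCountProcessor` and its field names (`text`, `tables`, `word_counts`) are invented for the example. The reference above does not say which method a subclass overrides to do its work, so the sketch assumes process() directly and omits the drawing method.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple


class ProcessorSketch(ABC):
    """Stand-in mirroring the documented interface; NOT the burdoc class."""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def requirements(self) -> Tuple[List[str], List[str]]:
        """Return (required field names, optional field names)."""

    @abstractmethod
    def generates(self) -> List[str]:
        """Return the field names this processor writes."""

    def check_requirements(self, data: Dict[str, Any]) -> bool:
        # Only the required fields have to be present.
        required, _optional = self.requirements()
        return all(field in data for field in required)


class WordCountProcessor(ProcessorSketch):
    """Hypothetical processor that counts the words on each page."""

    def __init__(self) -> None:
        super().__init__("word-count")

    def requirements(self) -> Tuple[List[str], List[str]]:
        return (["text"], ["tables"])

    def generates(self) -> List[str]:
        return ["word_counts"]

    def process(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Write the generated field back into the same data store.
        data["word_counts"] = {
            page: len(text.split()) for page, text in data["text"].items()
        }
        return data


store = {"text": {0: "hello world", 1: "one two three"}}
proc = WordCountProcessor()
if proc.check_requirements(store):
    proc.process(store)
```

After the call, `store` contains every field named by generates() alongside the original `text` field. A store without `text` fails check_requirements(), while a missing `tables` field does not, because it is only optional.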