How Rossum can help you extract data from tables
Tables in documents range from simple layouts to complex structures, and extracting data from them can be challenging. Aurora for Complex Tables simplifies the process and lets you extract table data quickly. It handles both straightforward, well-structured tables and complex ones. Importantly, no additional add-ons, such as the Magic Items extension, are needed.
How to switch between ‘Grid Design’ and ‘Complex Tables’ when annotating line items
Aurora for Complex Tables offers flexibility in table extraction. Choose between ‘Complex Tables’ and ‘Grid Design’ by opening the Sidebar settings and selecting the relevant option in the Line Items section. To extract line items using ‘Grid Design’, refer to our detailed article on that approach.
You can switch between the ‘Grid Design’ and ‘Complex Tables’ approaches while annotating without losing data. If one method doesn’t quite fit a specific table, simply make the switch; your annotated values stay intact. Keep in mind that the screen refreshes when you switch, and unconfirmed suggestions will disappear.
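The effect of a switch on your data can be pictured with a small Python sketch. This is only an illustration of the behaviour described above, not Rossum's implementation: the `Value` class and the `switch_mode` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Value:
    """Hypothetical model of one table value on the document."""
    text: str
    confirmed: bool  # True once the user has accepted the value


def switch_mode(values: List[Value]) -> List[Value]:
    """Return what is still present after switching approaches.

    Confirmed (annotated) values are kept; unconfirmed
    suggestions are dropped when the screen refreshes.
    """
    return [v for v in values if v.confirmed]


values = [
    Value("Widget A", confirmed=True),
    Value("12.50", confirmed=True),
    Value("Widget B", confirmed=False),  # still only a suggestion
]
remaining = switch_mode(values)
print([v.text for v in remaining])
```

In practice this means it is worth accepting any correct suggestions before you switch approaches.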
How to annotate line items using Complex Tables
Annotating the value
To begin, draw a bounding box around the value in the first row of the table that you wish to extract from the document. Once you select the value, a small window with options to either delete (bin icon) or accept the value (green checkmark icon) will appear.
Important note: Always start annotating from the first row of the table. Starting in the middle will result in inaccurate suggestions.
Accepting or Rejecting Suggestions
Based on your input, Rossum will automatically suggest values for the next rows, which you can then accept. If the suggestions are correct, it is important to accept them so that they are actually extracted.
Tip: Rossum learns from your actions. When you adjust the bounding box of the selected value, the bounding boxes in the suggestions are adjusted accordingly.
Once you accept the data, the bounding boxes will visually adapt, changing from purple dashed bounding boxes to yellow-filled bounding boxes.
To decline the suggestions, open the three-dot menu and select “Reject suggestions”.
Customising Annotations
Here, you can also choose the label for the value, that is, the column where the value should go. Click on the label and open the down arrow to access the drop-down list of all available columns.
Managing Suggestions
After annotating the first page, move to the next page and click the ‘Suggest’ button to receive suggestions for that page. Alternatively, open the three-dot menu and choose ‘Suggest next pages’, and the engine will generate suggestions for all remaining pages.
If you do not require automatic suggestions, you can disable them. Open the Sidebar settings and switch ‘Show suggestions’ from ‘Automatically’ to ‘On demand’. With ‘On demand’ selected, you need to click the ‘Suggest’ button in the left panel manually to get suggestions.
To remove unnecessary rows, you can select the rows you want to delete and click on the bin icon in the menu.
Recommended Workflow
The recommended workflow for achieving the best results is as follows:
First, click on the first cell in the footer to begin annotating the relevant value.
Next, proceed to annotate the remaining values in the first row.
Once the first row is annotated, carefully review suggestions in the subsequent rows.
Check thoroughly that all values are annotated and that no information is missing.
If necessary, adjust the annotated values in the first row. Lastly, accept the data.
If the data is predicted automatically, you can still change it according to your requirements. Start by fixing the value that needs adjustment, then verify the suggestions based on it. It’s advisable to always begin by adjusting the value in the first row.
Exploring Other Available Features
To make annotating easier, we have added some extra features to assist you in your daily tasks:
You have the option to select the column order based on either the predefined schema (Queue settings) or the order observed and extracted in the document (Automatically). You can also choose whether to display line items as a single table or split them across specific pages.
You can validate or delete data for the entire table on a page or specific rows or columns by clicking on their labels like ‘Page 1,’ ‘1,’ or ‘Quantity.’ From there, you can delete the data (bin icon) or validate it (green checkmark). Validated data will be marked with green-filled bounding boxes.
You can quickly add a new row below the selected row if you need to include more information. To do so, select the row in the footer or click on its number label in the document, then click on insert line.
Tip: Use the “Add line” option in the bottom right corner of the footer table, or the keyboard shortcut (CMD + SHIFT + A), to add a manual row.
You can select all extracted table data at once by clicking the down arrow next to the checkbox in the column header row. ‘Select all’ enables you to choose from all pages, while ‘Select next pages’ lets you pick data solely from the current page and the subsequent ones.
To choose multiple bounding boxes, you can also hold down the “Shift” key on your keyboard and draw a selection box by clicking the left mouse button and dragging it.
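The difference between ‘Select all’ and ‘Select next pages’ described above can be illustrated with a short Python sketch. It is a simplified model for illustration only: the page-to-rows dictionary and both function names are assumptions made for this example and do not correspond to any Rossum API.

```python
from typing import Dict, List

# Hypothetical table data: page number -> extracted row labels
table: Dict[int, List[str]] = {
    1: ["row 1", "row 2"],
    2: ["row 3"],
    3: ["row 4", "row 5"],
}


def select_all(data: Dict[int, List[str]]) -> List[str]:
    """'Select all': rows from every page of the document."""
    return [row for page in sorted(data) for row in data[page]]


def select_next_pages(data: Dict[int, List[str]], current: int) -> List[str]:
    """'Select next pages': rows from the current page and later ones."""
    return [row for page in sorted(data) if page >= current
            for row in data[page]]


all_rows = select_all(table)
from_page_two = select_next_pages(table, current=2)
print(all_rows)
print(from_page_two)
```

With page 2 open, ‘Select next pages’ in this model leaves the rows on page 1 untouched, which is useful when the earlier pages are already validated.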