Group "Robin AI", subgroup "Classifier (Robin)"
The action determines the class to which a text belongs using a trained classification model: for each class (rubric), it returns the probability that the text belongs to that class.
The purpose of the action is to find the rubric the text is closest to, i.e. the rubric with the highest probability, so that a decision can be made about how to process the text further.
Property | Description | Type | Filling example | Mandatory field |
---|---|---|---|---|
Parameters | ||||
Text for classification | The text whose class needs to be determined. | Robin.String | | Yes |
Trained model | The path to the folder that contains the trained classification model. | Robin.FolderPath | C:\doc\img | Yes |
Results | ||||
Result | A dictionary in which each key is a class name and each value is the percentage probability that the text belongs to that class. The dictionary is sorted by this percentage in descending order. | Robin.Dictionary | | |
1. The folder with the trained model must contain two files, which together make up the packaged machine learning model. The files are provided to the customer upon request.
2. If either file is missing or has a different name, the action fails with an error when it runs.
3. The robot will generate an error if:
4. The robot will not generate an error if:
5. An existing trained model cannot be "re-trained"; if classes need to be added, the model training action must be run again.
6. For reference, a model trained on 20,000 records classifies a text in 2-3 minutes.
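Because a missing or renamed model file only surfaces as an error at run time, it can help to check the folder before starting the robot. The Python sketch below is an illustration only, not part of Robin: the function `missing_model_files` and the file names in `EXPECTED_FILES` are hypothetical, since this documentation does not state the real names of the two model files.

```python
from pathlib import Path

# Hypothetical file names: the real names come with the model files
# supplied on request and are not stated in this documentation.
EXPECTED_FILES = ("model.bin", "labels.json")


def missing_model_files(folder, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from the folder."""
    root = Path(folder)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_model_files(r"C:\doc\img")` returns an empty list when both files are present; any names it does return are the files to restore or rename before running the action.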
More information about text classification theory:
https://vas3k.blog/blog/machine_learning/
https://www.edureka.co/blog/classification-in-machine-learning/
Classify text based on a trained model.
Use the "Classify text" action.
3. Specify the path to the folder that contains the trained model.
4. Click on the "Start" button in the top panel.
The software robot completed successfully. The result is a dictionary in which each key is a rubric (category) name and each value is the percentage probability that the text belongs to that rubric. The dictionary is sorted by this percentage.
To get the rubric to which the text most likely belongs, use the "Get key collection" action: the rubrics are stored in the dictionary keys and the percentages in the values. The element at index 0 of the resulting key collection is the most likely rubric; retrieve it with the "Get value by index" action.
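The same two steps can be mirrored outside Robin to see the logic. The following Python sketch is an analogy, not Robin code: the rubric names and percentages are made up, and a plain Python dictionary (which preserves insertion order since Python 3.7) stands in for Robin.Dictionary.

```python
# Sample result with made-up rubrics and percentages, already sorted
# from highest to lowest as the action's output is described to be.
result = {"Sports": 71.4, "Politics": 19.8, "Economy": 8.8}

# Counterpart of "Get key collection": the rubrics are the keys.
keys = list(result.keys())

# Counterpart of "Get value by index" with index 0: the most likely rubric.
top_rubric = keys[0]

# Cross-check that does not rely on the dictionary being sorted.
assert top_rubric == max(result, key=result.get)

print(top_rubric, result[top_rubric])
```

Taking the maximum over the values, as in the cross-check, gives the same rubric even if the dictionary order were ever changed by an intermediate step.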