Classify text Version 1 (python)

Group "Robin AI", subgroup "Classifier (Preferentum)"

...

Description

The action classifies text against the given classifier indexes and determines its class.

Action icon


Parameters

Input parameters

  1. Context - uses the result of the "Open classifier" action, which includes the path to the folder with the classifier.

  2. Text - string value to be classified.

  3. Multiclass classification - determines whether a single class or several classes are returned.

    If the value is "false", only the class with the highest probability percentage is determined for the text. If "true", several classes to which the text may belong are determined.

  4. Confidence threshold - sets the difference between the first two headings (topics) at which the system can confidently attribute the text to a single heading. The parameter is taken into account only if "Multiclass classification" = false; otherwise it is ignored.

    • If the percentage of occurrences ≥ confidence threshold, then "Confident result" = true

    • If the percentage of occurrences < confidence threshold, then "Confident result" = false

  5. Number of classes - the maximum number of classes output to the resulting dictionary.

    The parameter is taken into account only if "Multiclass classification" = true; otherwise it is ignored.

    • If the resulting sample contains more classes than "Number of classes", only the number of classes specified in the parameter is output.

    • If the resulting sample contains fewer classes than "Number of classes", all the classes obtained are output.
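The "Number of classes" rule can be illustrated with a minimal Python sketch. The function name `limit_classes` is hypothetical, and the sketch assumes that the most probable classes are the ones kept; the description above only states the maximum count.

```python
def limit_classes(classes, number_of_classes):
    """Keep at most `number_of_classes` entries of a {class: percentage} dict.

    Assumption: classes are kept in descending order of probability.
    """
    ordered = sorted(classes.items(), key=lambda item: item[1], reverse=True)
    # Slicing past the end is safe: fewer classes than the limit are returned as is.
    return dict(ordered[:number_of_classes])


sample = {"Sport": 10, "Weather": 50, "Science": 30}  # hypothetical sample
top_two = limit_classes(sample, 2)    # more classes than the limit: truncated
all_three = limit_classes(sample, 5)  # fewer classes than the limit: unchanged
```

With the hypothetical sample, `top_two` holds only "Weather" and "Science", while `all_three` holds all three classes.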

Output parameters

  1. Classes - a dictionary with the resulting sample of classes, where the key is the class and the value is the probability percentage, i.e. the rank of the class (displayed in the same form as in the classifier).

  2. Confident result:

    • If "Multiclass classification" = false and "Confidence threshold" is not filled, then "Confident result" = false

    • If "Multiclass classification" = true, then "Confident result" = false
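The rules for "Confident result" listed here and under "Confidence threshold" can be combined into one small sketch. The function name and the `score` argument are hypothetical: `score` stands for whatever value the action compares with the threshold (the "percentage of occurrences"), and how the action computes that value is not specified here.

```python
def confident_result(multiclass, threshold, score):
    """Return the "Confident result" flag under the documented rules.

    multiclass -- value of "Multiclass classification"
    threshold  -- value of "Confidence threshold", or None if not filled
    score      -- value compared with the threshold (assumed to be given)
    """
    # The flag is always false for multiclass runs or when no threshold is set.
    if multiclass or threshold is None:
        return False
    return score >= threshold


single_class_ok = confident_result(False, 80, 85)
single_class_low = confident_result(False, 80, 79)
multiclass_run = confident_result(True, 80, 95)
no_threshold = confident_result(False, None, 95)
```

Only the first call yields true; the other three yield false.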

Settings

Parameters

  • Context
    Description: Classifier context for the operation of the action.
    Type: Context
    Filling example: Open classifier.Classifier
    Mandatory field: Yes

  • Text
    Description: Text that needs to be classified.
    Type: String
    Filling example: When Wehner and colleagues performed a historical data analysis of hurricanes between 1980 and 2021, they found five storms that would fit into a Category 6 that have all occurred in the last nine years. It includes 2015’s Hurricane Patricia, which was the most powerful tropical cyclone that lashed Mexico with winds up to 215 mph. The other storms include Typhoon Haiyan in 2013, Typhoon Meranti in 2016, Typhoon Goni in 2020, and Typhoon Surigae in 2021.
    Mandatory field: Yes

  • Multiclass classification
    Description: If "false", the class with the highest probability percentage is determined for the text. If "true", several classes to which the text can belong are determined.
    Type: Boolean
    Filling example: True
    Mandatory field: No

  • Confidence threshold
    Description: A number from 1 to 100 that determines whether the classification result is accurate enough. It is used if you need to define only one class. The higher the specified number, the greater the difference between the two most likely classes must be. The parameter is taken into account if "Multiclass classification" = false.
    Type: Numeric
    Filling example: 80
    Mandatory field: No

  • Number of classes
    Description: The maximum number of classes the action can return. If more classes were determined for the text during classification, the action returns only the specified number of classes.
    Type: Numeric
    Filling example: 5
    Mandatory field: No

Results

  • Classes
    Description: A dictionary with the classes to which the specified text can belong. The key is the class; the value is the probability percentage that the text belongs to the class.
    Type: Dictionary

  • Confident result
    Description: If "true", the classification result is sufficiently accurate. If "false", the classification result may be inaccurate.
    Type: Boolean

Classifier operation description

Guidelines for the use of the Preferentum classification system - https://preferentum.ru/wp-content/uploads/2022/04/preferentumclass_manual.pdf.

...

The system classifies the text into possible headings and calculates a rank for each heading. Each pair of neighboring headings is compared using the formula X/Y, where X is the rank of the first heading in the pair and Y is the rank of the next one. The highest number obtained in these comparisons determines which headings are excluded from the resulting dictionary. The action returns a dictionary with the headings that are higher in the list than the heading with the highest comparison number; the heading with the highest comparison number is also included.
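The comparison of neighboring ranks can be sketched in Python as follows. This is an illustration under stated assumptions, not the Preferentum implementation: the function name is hypothetical, ranks are assumed to be positive numbers, and the comparison number X/Y is assumed to belong to the first heading of each pair.

```python
def cut_at_largest_ratio(ranks):
    """Keep the headings up to and including the largest rank drop.

    ranks -- dict {heading: rank}, ranks assumed to be positive.
    """
    ordered = sorted(ranks.items(), key=lambda item: item[1], reverse=True)
    if len(ordered) < 2:
        return dict(ordered)  # nothing to compare
    # X/Y for every pair of neighboring headings in the sorted list.
    ratios = [ordered[i][1] / ordered[i + 1][1] for i in range(len(ordered) - 1)]
    # Position of the highest comparison number (first one wins on equal ratios).
    cut = max(range(len(ratios)), key=ratios.__getitem__)
    # The heading at the cut position is included, everything below is dropped.
    return dict(ordered[: cut + 1])


# Hypothetical ranks; the ratios are 80/70, 70/20 and 20/15.
example_ranks = {"Weather": 80, "Science": 70, "Sport": 20, "Politics": 15}
selected = cut_at_largest_ratio(example_ranks)
```

For the hypothetical ranks, the largest ratio is 70/20 = 3.5, so "Weather" and "Science" are kept and the two headings below the drop are excluded.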

Special conditions of use

  1. If "Multiclass classification" = false and the text was classified into classes with the same percentage of probability, the action will fail.

  2. If "Multiclass classification" = true, "Number of classes" is set to more than one class, and the text was classified into classes with the same probability percentage, the action will fail.
    (Example: "Number of classes" = 2. The text was classified into three classes: two with the same probability percentage of 50 and a third with a probability percentage of 80. The action ends with an error.)

  3. If the text has not been classified into any class or the classifier has no classes, the action will fail.

  4. Currently, only Russian-language text can be classified.
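Conditions 1 to 3 can be read as a single check, sketched below. The sketch rests on an assumption that the text above does not state explicitly: equal probabilities cause a failure when they sit on the boundary between the classes that would be returned and those that would be dropped, which is consistent with the example in condition 2. The function name and the use of `ValueError` are hypothetical.

```python
def check_classification(probabilities, multiclass, number_of_classes=None):
    """Raise ValueError where the action is described as failing.

    probabilities -- dict {class: probability percentage}
    """
    if not probabilities:
        # Condition 3: no classes at all.
        raise ValueError("the text was not classified into any class")
    # Single-class mode keeps one class; multiclass keeps "Number of classes".
    limit = number_of_classes if multiclass else 1
    ordered = sorted(probabilities.values(), reverse=True)
    if limit is not None and 1 <= limit < len(ordered):
        # Assumption: a tie across the cut-off makes the choice ambiguous.
        if ordered[limit - 1] == ordered[limit]:
            raise ValueError("classes with the same probability percentage")


def fails(*args):
    """Helper for the examples: True if the check raises."""
    try:
        check_classification(*args)
    except ValueError:
        return True
    return False


tie_single = fails({"A": 50, "B": 50}, False)            # condition 1
tie_multi = fails({"A": 80, "B": 50, "C": 50}, True, 2)  # condition 2 example
no_classes = fails({}, False)                            # condition 3
clean_run = fails({"A": 80, "B": 50, "C": 20}, True, 2)
```

Under this reading, the first three cases fail and the last one passes.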

Example of use

Task 1

Classify text based on the trained model, identifying the class with the highest percentage of probability.

Solution

Use the "Classify text" action. 

Implementation

Preface

The "Open classifier" action requires a trained classifier model. 
Training is performed using the "Create index" action.

...

4. Click on the "Start" button in the top panel.  

Result

The program robot completed successfully.

...

and confirmation that the classification result is accurate enough ("Confident result" parameter is True).

Task 2

Classify text based on a trained model to determine the classes to which the text can belong.

Solution

Use the "Classify text" action. 

Implementation

Preface

The "Open classifier" action requires a trained classifier model. 
Training is performed using the "Create index" action.

...

4. Click on the "Start" button in the top panel.  

Result

The program robot completed successfully.

A dictionary with the classes to which the specified text may belong was obtained, and the "Confident result" parameter is False.

Task 3

Get the results of the "Classify text" action.

Solution

Use the "Get keys", "Get value by index" and "Get value" actions.

Implementation

  1. Repeat steps 1-3 from Task 2.
  2. Transfer the "Get keys" action to the workspace. 

  3. Fill in the "Dictionary" parameter of the "Get keys" action.


  4. Transfer the "Get value by index" action to the workspace. 


  5. Set the parameters of the "Get value by index" action:
    1. Set the result of the "Get keys" action in the "Collection" field.
    2. Set the collection index.


  6. Transfer the "Get value" action to the workspace. 


  7. Set the parameters of the "Get value" action:
    1. Set the result of the "Classify text" action in the "Dictionary" field.
    2. Set the key obtained from the "Get value by index" action.


  8. Click on the "Start" button in the top panel.
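For readers who think in code, the three actions used in these steps correspond roughly to the following Python dictionary operations. The dictionary contents are hypothetical and only stand in for the "Classes" result of the "Classify text" action.

```python
# Hypothetical "Classes" result: class name -> probability percentage.
classes = {"Weather": 72.5, "Science": 18.0}

keys = list(classes.keys())  # "Get keys": collection of dictionary keys
first_key = keys[0]          # "Get value by index": element at index 0
first_value = classes[first_key]  # "Get value": value stored under that key
```

Here `first_key` is the class name and `first_value` is its probability percentage; changing the index selects a different class from the collection.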

Result

The program robot completed successfully.

...