
General information 

Regular expressions (also called regexp or regex) are a mechanism for finding and replacing text.

A search is driven by a pattern, a string of ordinary characters and metacharacters that defines the search rule. For replacement, a replacement string is defined in addition; it may also contain special characters.

Some text editors and utilities use regular expressions to find and substitute text. For example, a regular expression can specify a pattern that lets you:

  • Find all occurrences of the character sequence "cat" in any context, such as "cat", "catalog", "caterpillar".
  • Find the standalone word "cat" and replace it with "kitten".
  • Find the word "cat" preceded by the word "Persian" or "Cheshire".
  • Remove from the text all sentences that mention the word "cat" or "kitten".
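The four tasks in the list can be sketched in Python with the standard `re` module. This is only an illustration: the sample text and variable names are made up, and other regex engines may differ in detail.

```python
import re

# Hypothetical sample text, chosen only for this illustration.
text = ("The Persian cat sat near a catalog. A caterpillar crawled by. "
        "My cat purred. The dog slept.")

# 1. Every occurrence of the character sequence "cat", in any context.
all_hits = re.findall(r"cat", text)

# 2. Replace only the standalone word "cat" (\b marks a word boundary).
replaced = re.sub(r"\bcat\b", "kitten", text)

# 3. The word "cat" preceded by "Persian" or "Cheshire".
preceded = re.findall(r"\b(?:Persian|Cheshire)\s+cat\b", text)

# 4. Drop every sentence that mentions the word "cat" or "kitten".
#    Sentences are split naively after ., ! or ? followed by whitespace.
sentences = re.split(r"(?<=[.!?])\s+", text)
kept = " ".join(
    s for s in sentences if not re.search(r"\b(?:cat|kitten)\b", s)
)

print(len(all_hits))  # 4: "cat", "catalog", "caterpillar", "cat"
print(replaced)
print(preceded)       # ['Persian cat']
print(kept)           # A caterpillar crawled by. The dog slept.
```

Note that task 1 deliberately omits word boundaries, while tasks 2 to 4 use `\b` so that "catalog" and "caterpillar" are not treated as the word "cat".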

Regular expressions also allow you to specify much more complex search or replacement patterns.

Applying a regular expression to a text can produce:

  • a check for whether the pattern occurs in the given text;
  • the substring of the text that matches the pattern;
  • the groups of characters that correspond to separate parts of the pattern.
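As a rough illustration, all three kinds of result are available from a single match object in Python's `re` module. The date pattern and the sample text below are assumptions made for the example.

```python
import re

# Hypothetical pattern: a date written as YYYY-MM-DD, one group per part.
pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")
text = "Invoice issued on 2021-03-15 in Moscow."

match = pattern.search(text)

# 1. Presence check: search() returns None when nothing matches.
found = match is not None

# 2. The substring that matches the whole pattern, and where it sits.
substring = match.group(0)
start, end = match.span()

# 3. The groups that correspond to the parenthesized parts of the pattern.
year, month, day = match.groups()

print(found)              # True
print(substring)          # 2021-03-15
print(year, month, day)   # 2021 03 15
```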

When a regular expression is used to replace text, the result is a new string: the source text with each substring that matches the pattern removed and the replacement string substituted in its place. The replacement string can refer to the character groups captured from the source text during matching. Deleting all occurrences of the pattern is a special case of replacement in which the replacement string is empty.
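A minimal Python sketch of both cases, replacement that reuses captured groups and deletion with an empty replacement string, might look as follows. The sample strings are invented, and the `\1`, `\2` backreference syntax is the one used by Python's `re.sub`; other tools often write `$1`, `$2` instead.

```python
import re

# Replacement that reuses captured groups: "Surname, Name" -> "Name Surname".
names = "Smith, John; Doe, Jane"
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", names)
print(swapped)  # John Smith; Jane Doe

# Deletion: an empty replacement string removes every match.
label = "Room 101, floor 3"
no_digits = re.sub(r"\d+", "", label)
print(repr(no_digits))  # 'Room , floor '
```

The source strings are left unchanged; `re.sub` returns a new string each time.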

1. Regex Basics article - https://habr.com/ru/articles/545150/

2. Service No. 1 for checking regular expressions - https://regex101.com/

3. Service No. 2 for checking regular expressions - https://regexr.com/ 

4. Regular expression bank - https://regex101.com/library

5. Examples of regular expressions - https://support.google.com/a/answer/1371417?hl=en

6. Bank of ready-made expressions - https://regexlib.com/. For common but complex tasks such as matching email addresses, where a correct pattern is very hard to write by hand, it is better to use a tested expression.

7. Guide to regular expression elements - https://docs.microsoft.com/ru-ru/dotnet/standard/base-types/regular-expression-language-quick-reference

8. Another guide to regex - http://website-lab.ru/article/regexp/shpargalka_po_regulyarnyim_vyirajeniyam/ 

9. Formats for working with dates ("Date conversion" action group in Studio) - https://learn.microsoft.com/en-us/dotnet/standard/base-types/custom-date-and-time-format-strings

10. Templates for working with the file system ("File system" action group in Studio) - https://en.wikipedia.org/wiki/Glob_(programming)

Examples of regular expressions

1. Extract the text between two markers on a single line - (?<=start_text\s).+(?=\b\s+finish_text) where start_text and finish_text are the markers that delimit the text to extract; the text must be separated from each marker by whitespace

2. Extract the text between two markers across several lines - (?s)(?<=start_text).+?(?=finish_text) where start_text and finish_text are the markers that delimit the text to extract; (?s) lets . match line breaks, and .+? stops at the first finish_text

3. Extract the text together with its markers - start_text\s+.+\s+finish_text where start_text is the marker the extracted text starts with and finish_text is the marker it ends with

4. Extract the year from the text - [2][0][0-2][0-9] matches a four-digit number whose thousands digit is 2, hundreds digit is 0, tens digit is 0-2, and units digit is 0-9, that is, a year from 2000 to 2029

5. Extract a sequence of exactly n digits from the text - \d{n} where n is the number of digits in the sequence

6. Extract a digit sequence of any length from the text - (\d+)

7. Extract only the number from cell data that contains both text and a numeric value; the simplest form is [0-9]+

8. Extract an email address from the text - (\S*@\S*\.\w+?\b)

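9. Keep only the first line of multi-line text - ^\S.* matches from the start of the text to the end of the first line, provided the text begins with a non-whitespace character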
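The patterns in this list can be tried out in Python, as in the sketch below. It assumes Python's `re` engine, which may behave differently from the engine used in Studio. The helper names (`between_single_line`, `between_multiline`), the marker strings, and all sample texts are hypothetical and exist only for this illustration.

```python
import re


def between_single_line(text, start, finish):
    """Example 1: text between two markers, separated from them by whitespace."""
    pattern = rf"(?<={re.escape(start)}\s).+(?=\b\s+{re.escape(finish)})"
    m = re.search(pattern, text)
    return m.group(0) if m else None


def between_multiline(text, start, finish):
    """Example 2: text between two markers; (?s) lets . match line breaks."""
    pattern = rf"(?s)(?<={re.escape(start)}).+?(?={re.escape(finish)})"
    m = re.search(pattern, text)
    return m.group(0) if m else None


# Examples 1 and 2: the markers here are assumptions, not fixed names.
single = between_single_line(
    "Customer: Ivan Petrov Address: Moscow", "Customer:", "Address:")
multi = between_multiline("BEGIN\nline one\nline two\nEND", "BEGIN", "END")

# Example 4: years from 2000 to 2029 only, so 2035 is not matched.
years = re.findall(r"[2][0][0-2][0-9]",
                   "Signed in 2019, renewed in 2024, expires 2035")

# Examples 5 and 6: exactly five digits versus digit runs of any length.
five_digits = re.findall(r"\d{5}", "ref 12345 and 678")
any_digits = re.findall(r"(\d+)", "ref 12345 and 678")

# Example 8: a simple email pattern, not a full address validator.
emails = re.findall(r"(\S*@\S*\.\w+?\b)",
                    "write to ivan.petrov@example.com today")

# Example 9: without multiline mode, ^ anchors at the start of the text
# and . stops at the first line break, so only the first line matches.
first_line = re.search(r"^\S.*", "first line\nsecond line").group(0)

print(single, years, five_digits, any_digits, emails, first_line, sep="\n")
```

The helpers pass the markers through `re.escape` so that characters such as `.` or `:` in a marker are matched literally rather than interpreted as metacharacters.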
