File Watcher Features

The Watcher module has been designed with the typical use cases of the Banking and Telecommunications industry in mind for IT Batch Processing.

If you know of a use case that is not covered by watcher, please tell us about it in the GitHub Discussions Section .

Currently Watcher comprises the following features: Single File & Folders, Multiples File Groups, File Patterns, Non-Bloking Execution, Blocking Execution, Bulk File Processing, Advanced File Deletion, Advanced File Creation, Advanced File Alteration, Watcher for Any Alteration, Watcher for Specific Alteration, Decoupled Execution, Novelty Detection, Qualitative Response, Check File Stability, Big Amounts of Files, Atomic Function Injection, Folder Recursion, Selective Path Level, Watcher Monitoring

Note

The lines of code used to exemplify each feature of watcher assume the following:

1fwa = require('watcher').file   --for file-watcher
2mon = require('watcher').monit  --for watcher monitoring

Single File & Folders

Detection of creation, deletion and alteration of single files or single folders in the file system.

1fwa.creation({'/path/to/single_file'})    --watching file creation
2fwa.creation({'/path/to/single_folder/'}) --watching folder creation

Multiples File Groups

Multiple groups of different files can be watched at the same time. The input list of watchable files is a Lua table type parameter.

1fwa.deletion(
2    {
3        '/path_1/to/group_file_a/*',  --folder
4        '/path_2/to/group_file_b/*'   --another
5     }
6 )

File Patterns

fwa.creation({'/path/to/files_*.txt'})

Note

The watch-list is constructed with a single flag that controls the behavior of the function: GLOB_NOESCAPE. For details type man 3 glob.

Non-Bloking Execution

By default the Watcher run is executed in non-blocking mode through tarantool fibers. Fibers are a unique Tarantool feature “green threads” or coroutines that run independently of operating system threads.

Blocking Execution

The waitfor function blocks the code and waits for a watcher to finish.

waitfor(fwa.creation({'/path/to/file'}).wid) --wait for watcher

Bulk File Processing

Watcher has an internal mechanism to allocate fibers for every certain amount of files in the watcher list. This amount is determined by the BULK_CAPACITY configuration value in order to optimize performance.

Advanced File Deletion

Inputs

File Watcher Deletion Parameters

Param

Type

Description

wlist

table, required

Watch List

maxwait

number, otional, default-value: 60

Maximum wait time in seconds

interval

number, otional, default-value: 0.5

Verification interval for watcher in seconds

options

table, optional, default-value: {'NS', 0, 0}

List of search options

recursion

table, optional, default-value: nil or {false, {0}, false}

Recursion paramaters

wlist

It is the list of files, directories or file patterns to be observed. The data type is a Lua table and the size of tables is already limited to 2.147.483.647 elements.

An example definition is the following:

wlist = {'path/file', 'path', 'pattern*', ...} --arbitrary code

maxwait

Maxwait is a numeric value that represents the maximum time to wait for the watcher. Watcher will terminate as soon as possible and as long as the search conditions are met. The default value is 60 seconds.

interval

Interval is a numerical value that determines how often the watcher checks the search conditions. This value must be less than the maxwait value. The default value is 0.5 seconds.

options

The options parameter is a Lua table containing 3 elements: sort, cases and match.

  • The first one sort contains the ordering method of the wlist.

  • The second element cases contains the number of cases to observe from the wlist.

  • and the third element match indicates the number of cases expected to satisfy the search.

By default, the value of the option table is {sort = 'NS', cases = 0, match = 0}.

The list of possible values for sort

Value

Description

'NS'

No sort

'AA'

Sorted alphabetically ascending

'AD'

Sorted alphabetically descending

'MA'

Sorted by date of modification ascending

'MD'

Sorted for date of modification descending

Note

The value 'NS' treats the list in the same order in which the elements are passed to the list wlist.

recursion

To enable directory recursion you must define the recursion parameter. The recursion works only for an observable of type directory.

The recursion value is a Lua table type composed of the following elements {recursive_mode, {deep_levels}, hidden_files}:

  • recursive_mode: Boolean indicating whether or not to activate the recursive mode on the root directory. The default value is false.

  • deep_levels: Numerical table indicating the levels of depth to be evaluated in the directory structure. The default value is {0}

  • hidden_files: Boolean indicating whether hidden files will be evaluated in the recursion. The default value is false.

How do the recursion levels work?

To understand how levels work in recursion, let’s look at the following example.

Imagine you have the following directory structure and you want to observe the deletion of files from the path ‘/folder_A/folder_B/’.

Recursive levels

The levels are determined from the object path or root path that will be used as input in the watcher expression. In this case the path ‘/folder_A/folder_B/’ has level zero and, for each folder node a level will be added according to its depth. The result is shown in the following summary table, which contains the list of files for each level.

Identification of the levels of recursion

[Input] Level 0 {0}

Level 1 {1}

Level 2 {2}

Level 3 {3}

Level 4 {4}

folder

'/folder_A/folder_B/'

'folder_C'

'folder_D'

'folder_E'

'folder_N'

files

{A1} {B1, B2, .B3}

{C1, C2}

{.D1}

{E1, E2, .E3}

{N1, N2}

Note

The files, .B3, .D1 and .E3 are hidden files.

Now that we know how to set the recursion level, let’s see an example of the observable files depending on different values of the recursion parameter for the above mentioned example.

Observable files depending on the recursion level

recursion value

Composition of the list of observable files wlist

{true, {0}, false}

{A1, B1, B2}

{true, {0}, true}

{A1, B1, B2, .B3}

{true, {0, 1}, false}

{A1, B1, B2, C1, C2}

{true, {0, 1}, true}

{A1, B1, B2, .B3, C1, C2}

{true, {2}, false}

nil

{true, {2}, true}

{.D1}

{true, {0, 1, 2, 3, 4}, false}

{A1, B1, B2, C1, C2, E1, E2, N1, N2}

{true, {0, 1, 2, 3, 4}, true}

{A1, B1, B2, .B3, C1, C2, .D1, E1, E2, .E3, N1, N2}

Output

Advanced File Creation

Inputs

File Watcher Creation Parameters

Param

Type

Description

wlist

table, required

Watch List

maxwait

number, otional, default-value: 60

Maximum wait time in seconds

interval

number, otional, default-value: 0.5

Verification interval for watcher in seconds

minsize

number, optional, default-value: 0

Value of the minimum expected file size

stability

table, optional, default-value: {1, 15}

Minimum criteria for measuring file stability

novelty

table, optional, default-value: {0, 0}

Time interval that determines the validity of the file’s novelty

nmatch

number, optional, default-value: 0

Number of expected files as a search sufficiency condition

wlist

It is the list of files, directories or file patterns to be observed. The data type is a Lua table and the size of tables is already limited to 2.147.483.647 elements.

An example definition is the following:

wlist = {'path/file', 'path', 'pattern*', ...} --arbitrary code

maxwait

Maxwait is a numeric value that represents the maximum time to wait for the watcher. Watcher will terminate as soon as possible and as long as the search conditions are met. The default value is 60 seconds.

interval

Interval is a numerical value that determines how often the watcher checks the search conditions. This value must be less than the maxwait value. The default value is 0.5 seconds.

minsize

Minsize is a numerical value representing the minimum expected file size. The default value is 0, which means that it is sufficient to just generate the file when the minimum size is unknown.

Important

Regardless of whether the expected file size is 0 Bytes, watcher will not terminate until the file arrives in its entirety, avoiding edge cases where a file is consumed before the data transfer is complete.

stability

The stability parameter contains the elements that allow to evaluate the stability of a file. It is a Lua table containing two elements:

  • The interval that defines the frequency of checking the file once it has arrived.

  • The number of iterations used to determine the stability of the file.

The default value is: {1, 15}.

novelty

The novelty parameter is a two-element Lua table that contains the time interval that determines the validity of the file’s novelty. The default value is {0, 0} which indicates that the novelty of the file will not be evaluated.

nmatch

nmatch is a number of expected files as a search sufficiency condition.

Advanced File Alteration

Inputs

File Watcher Alteration Parameters

Param

Type

Description

wlist

table, required

Watch List

maxwait

numeric, otional, default-value: 60

Maximum wait time in seconds

interval

numeric, otional, default-value: 0.5

Verification interval for watcher in seconds

awhat

string, optional, default-value: '1'

Type of file alteration to be observed

nmatch

number, optional, default-value: 0

Number of expected files as a search sufficiency condition

wlist

It is the list of files, directories or file patterns to be observed. The data type is a Lua table and the size of tables is already limited to 2.147.483.647 elements.

An example definition is the following:

wlist = {'path/file', 'path', 'pattern*', ...} --arbitrary code

maxwait

Maxwait is a numeric value that represents the maximum time to wait for the watcher. Watcher will terminate as soon as possible and as long as the search conditions are met. The default value is 60 seconds.

interval

Interval is a numerical value that determines how often the watcher checks the search conditions. This value must be less than the maxwait value. The default value is 0.5 seconds.

awhat

Type of file alteration to be observed. See File Watcher Alteration Parameters.

File Watcher Alteration Parameters

Type

Value

Description

ANY_ALTERATION

'1'

Search for any alteration

CONTENT_ALTERATION

'2'

Search for content file alteration

SIZE_ALTERATION

'3'

Search for file size alteration

CHANGE_TIME_ALTERATION

'4'

Search for file ctime alteration

MODIFICATION_TIME_ALTERATION

'5'

Search for file mtime alteration

INODE_ALTERATION

'6'

Search for file inode alteration

OWNER_ALTERATION

'7'

Search for file owner alteration

GROUP_ALTERATION

'8'

Search for file group alteration

nmatch

nmatch is a number of expected files as a search sufficiency condition.

Watcher for Any Alteration

fwa.alteration({'/path/to/file'}, nil, nil, '1')

Watcher for Specific Alteration

1fwa.alteration({'/path/to/file'}, nil, nil, '2') --Watcher for content file alteration
2fwa.alteration({'/path/to/file'}, nil, nil, '3') --Watcher for content file size alteration
3fwa.alteration({'/path/to/file'}, nil, nil, '4') --Watcher for content file ctime alteration
4--explore other options for 'awhat' values

See table File Watcher Alteration Parameters for more options.

Decoupled Execution

The create, run function and the monit options have been decoupled for better behavior, overhead relief and versatility of use.

Novelty Detection

Watcher implements the detection of the newness of a file based on the mtime modification date. This is useful to know if file system items have been created in an expected time window.

Warning

Note that the creation of the files may have been done preserving the attributes of the original file. In that case you should consider the novelty rank accordingly.

1 date_from  = os.time() - 24*60*60 --One day before the current date
2 date_to    = os.time() + 24*60*60 --One day after the current date
3 os.execute('touch /tmp/novelty_file.txt') --The file is created on the current date
4 fwt.creation({'/tmp/novelty_file.txt'}, 10, nil, 0, nil, {date_from, date_to})

Note

For known dates you can use the Lua function os.time() as follows:

 1  date_from = os.time(
 2      {
 3          year = 2020,
 4          month = 6,
 5          day = 4,
 6          hour = 23,
 7          min = 48,
 8          sec = 10
 9      }
10  )

Qualitative Response

Watcher leaves a record for each watchable file where it provides qualitative nformation about the search result for each of them. To explore this information see the Watcher Monitoring match and nomatch functions.

 1 NOT_YET_CREATED = '_'               --The file has not yet been created
 2 FILE_PATTERN = 'P'                  --This is a file pattern
 3 HAS_BEEN_CREATED = 'C'              --The file has been created
 4 IS_NOT_NOVELTY = 'N'                --The file is not an expected novelty
 5 UNSTABLE_SIZE = 'U'                 --The file has an unstable file size
 6 UNEXPECTED_SIZE = 'S'               --The file size is unexpected
 7 DISAPPEARED_UNEXPECTEDLY = 'D'      --The file has disappeared unexpectedly
 8 DELETED = 'X'                       --The file has been deleted
 9 NOT_EXISTS = 'T'                    --The file does not exist
10 NOT_YET_DELETED = 'E'               --The file has not been deleted yet
11 NO_ALTERATION = '0'                 --The file has not been modified
12 ANY_ALTERATION = '1'                --The file has been modified
13 CONTENT_ALTERATION = '2'            --The content of the file has been altered
14 SIZE_ALTERATION = '3'               --The file size has been altered
15 CHANGE_TIME_ALTERATION = '4'        --The ctime of the file has been altered
16 MODIFICATION_TIME_ALTERATION = '5'  --The mtime of the file has been altered
17 INODE_ALTERATION = '6'              --The number of inodes has been altered
18 OWNER_ALTERATION = '7'              --The owner of the file has changed
19 GROUP_ALTERATION = '8'              --The group of the file has changed

Check File Stability

Enabled only for file creation. This feature ensures that the watcher terminates once the file creation is completely finished. This criterion is independent of the file size.

See usage for parameter stability

Big Amounts of Files

In the following example, watching the file deletion from the path “/” recursively down to depth level 3 (levels={0,1,2,3}) yields a total of 163,170 watchable files. Note that the execution takes 85 seconds (on a typical desktop machine) but the maximum timeout of the watcher has been specified as low as 10 seconds. This means that 88% of the time is consumed in creating the watcher due to recursion.

 1  tarantool> test=function() local ini=os.time() local fwa=fw.deletion({'/'}, 10, nil, {'NS', nil, 2}, {true, {0,1,2,3}, false}) print(os.time()-ini) print(fwa.wid) end
 2  tarantool> test()
 3  85
 4  1620701962375155ULL
 5  ---
 6  tarantool> mon.info(1620701962375155ULL)
 7  ---
 8  - ans: true
 9  match: 72
10  what: '{"/"}'
11  wid: 1620701962375155
12  type: FWD
13  nomatch: 163098
14  status: completed
15  ...

Atomic Function Injection

Atomic function injection allows you to perform specific tasks on each element of the watchable list separately. In the example, the atomic function afu creates a backup copy for each element of the watchlist.

1afu = function(file) os.execute('cp '..file..' '..file..'_backup') end --Atomic Funcion
2cor = require('watcher').core
3wat = cor.create({'/tmp/original.txt'}, 'FWD', afu) --afu is passed as parameter
4res = run_watcher(wat)

Folder Recursion

You can enable recursion on directories to detect changes in the file system. Recursion is enabled based on a directory entry as a parameter that is considered as a root directory. Starting from this root directory, considered as level zero, you can selectively activate the observation of successive directory levels.

 1 fwa.deletion(
 2     {'/tmp/folder_1'}, --Observed directory is considered a zero level root directory
 3     nil,               --Maxwait, nil to take the value by omission
 4     nil,               --Interval, nil to take the value by omission
 5     nil,               --Options, nil to take the value by omission
 6     {
 7         true,          --Activate recursion
 8         {0, 1, 2},     --Levels of directories to be observed (root and levels 1 & 2)
 9         false          --Includes hidden files
10     }
11 )

For more info see How do the recursion levels work?.

Selective Path Level

The recursion levels is a list of numerical values so you can specify (selectively) the directory level you want to observe and ignore others. This is useful in situations where the full path to the file is unknown but the depth or level of the file is known.

 1 fwa.deletion(
 2     {'/bac/invoices'},
 3     nil,
 4     nil,
 5     nil,
 6     {
 7         true,          --Activate recursion
 8         {3},           --Selective level 3
 9         false          --Includes hidden files
10     }
11 )

See use case …

Watcher Monitoring

monit for Watcher monitoring allows you to monitor and explore the running status of a watcher.

info

The output is a Lua table containing the following elements:

  • ans is a boolean value containing the response of the watcher. true means that the watcher has detected the expected changes that are defined in the parameters.

  • match is the number of cases that match the true value of ans.

  • nomatch is the number of cases that do not belong to the set of true ans.

  • what is a string containing the obserbables parameter.

  • wid is the unique identifier of the watcher.

  • type is the type of the watcher

  • status is the execution status of the watcher.

 1   mon.info(1620701962375155ULL)
 2
 3     {
 4         ans: true
 5         match: 72
 6         what: '{"/"}'
 7         wid: 1620701962375155
 8         type: 'FWD'
 9         nomatch: 163098
10         status: 'completed'
11     }

match

nomatch