File Watcher Features
The Watcher module is designed around the typical IT batch-processing use cases of the banking and telecommunications industries.
If you know of a use case that Watcher does not cover, please tell us about it in the GitHub Discussions section.
Currently Watcher comprises the following features: Single File & Folders, Multiple File Groups, File Patterns, Non-Blocking Execution, Blocking Execution, Bulk File Processing, Advanced File Deletion, Advanced File Creation, Advanced File Alteration, Watcher for Any Alteration, Watcher for Specific Alteration, Decoupled Execution, Novelty Detection, Qualitative Response, Check File Stability, Big Amounts of Files, Atomic Function Injection, Folder Recursion, Selective Path Level, Watcher Monitoring
Note
The code examples for each Watcher feature assume the following:
fwa = require('watcher').file --for file-watcher
mon = require('watcher').monit --for watcher monitoring
Single File & Folders
Detection of creation, deletion and alteration of single files or single folders in the file system.
fwa.creation({'/path/to/single_file'}) --watching file creation
fwa.creation({'/path/to/single_folder/'}) --watching folder creation
Multiple File Groups
Multiple groups of different files can be watched at the same time. The watch list is passed as a Lua table.
fwa.deletion(
  {
    '/path_1/to/group_file_a/*', --folder
    '/path_2/to/group_file_b/*' --another
  }
)
File Patterns
fwa.creation({'/path/to/files_*.txt'})
Note
The watch list is built with the glob function, using a single flag that controls its behavior: GLOB_NOESCAPE.
For details, type man 3 glob.
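To make the matching rule concrete, here is a minimal pure-Lua sketch of how a pattern such as files_*.txt selects file names. It is only an illustration: Watcher relies on glob for the real expansion, and glob_to_pattern is a hypothetical helper that is not part of the module. It handles only * and ?, does not support bracket expressions, and ignores the special treatment of leading dots.

```lua
-- Hypothetical helper (not part of watcher): translate a simple glob into an
-- anchored Lua pattern. '*' and '?' never cross a '/' boundary; every other
-- punctuation character is escaped so that it matches literally.
local function glob_to_pattern(glob)
  local body = glob:gsub('%p', function(ch)
    if ch == '*' then return '[^/]*' end
    if ch == '?' then return '[^/]' end
    return '%' .. ch
  end)
  return '^' .. body .. '$'
end

local pattern = glob_to_pattern('/path/to/files_*.txt')
print(('/path/to/files_01.txt'):match(pattern) ~= nil)  -- matches
print(('/path/to/files_01.csv'):match(pattern) ~= nil)  -- wrong extension
```

Anchoring the pattern with ^ and $ keeps a match from succeeding on a mere substring of the path.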
Non-Blocking Execution
By default, a watcher runs in non-blocking mode using Tarantool fibers. Fibers are Tarantool's "green threads", or coroutines, which are scheduled cooperatively and run independently of operating system threads.
Blocking Execution
The waitfor function blocks execution until the watcher finishes.
waitfor(fwa.creation({'/path/to/file'}).wid) --wait for watcher
Bulk File Processing
Watcher has an internal mechanism that allocates one fiber per fixed-size batch of files in the watch list. To optimize performance, the batch size is set by the BULK_CAPACITY configuration value.
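The batching idea can be sketched in plain Lua as follows. This is an assumption about the general approach, not the module's actual code: split_in_bulks is a hypothetical helper, and the capacity of 2 is chosen only to keep the example small.

```lua
-- Split a watch list into consecutive batches of at most `capacity` items.
-- `capacity` stands in for BULK_CAPACITY; its real value is not shown here.
local function split_in_bulks(wlist, capacity)
  assert(capacity > 0, 'capacity must be positive')
  local bulks = {}
  for i, item in ipairs(wlist) do
    local idx = math.floor((i - 1) / capacity) + 1
    bulks[idx] = bulks[idx] or {}
    table.insert(bulks[idx], item)
  end
  return bulks
end

local bulks = split_in_bulks({'a', 'b', 'c', 'd', 'e'}, 2)
print(#bulks)  -- number of batches; each one would be handled by its own fiber
```

With five files and a capacity of 2, the last batch holds a single file, so the number of fibers grows with the list size rather than with the number of files.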
Advanced File Deletion
Inputs
Param | Type | Description |
---|---|---|
wlist | table | Watch list |
maxwait | number | Maximum wait time in seconds |
interval | number | Verification interval for the watcher in seconds |
options | table | List of search options |
recursion | table | Recursion parameters |
wlist
The list of files, directories or file patterns to be observed. The data type is a Lua table, whose size is limited to 2,147,483,647 elements.
An example definition is the following:
wlist = {'path/file', 'path', 'pattern*', ...} --arbitrary code
maxwait
maxwait is a numeric value that sets the maximum time, in seconds, to wait for the watcher.
The watcher terminates as soon as the search conditions are met, without waiting for the full maxwait.
The default value is 60 seconds.
interval
interval is a numeric value that sets how often, in seconds, the watcher checks the search conditions.
This value must be less than the maxwait value.
The default value is 0.5 seconds.
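The interplay between maxwait and interval can be pictured as a polling loop. The sketch below is an assumption about the general technique, not Watcher's implementation: wait_until is a hypothetical helper, and the clock and sleep functions are injected so that it runs in plain Lua (inside Tarantool, fiber.time and fiber.sleep could play those roles).

```lua
-- Poll `check` every `interval` seconds until it returns true or `maxwait`
-- seconds have elapsed. Defaults mirror the documented ones (60 and 0.5).
local function wait_until(check, maxwait, interval, clock, sleep)
  maxwait = maxwait or 60
  interval = interval or 0.5
  assert(interval < maxwait, 'interval must be less than maxwait')
  local deadline = clock() + maxwait
  while true do
    if check() then return true end                          -- conditions met
    if clock() + interval > deadline then return false end   -- out of time
    sleep(interval)
  end
end

-- Simulated clock so the example finishes instantly.
local now = 0
local function clock() return now end
local function sleep(seconds) now = now + seconds end

local polls = 0
local found = wait_until(function()
  polls = polls + 1
  return polls == 3          -- pretend the condition holds on the third check
end, 10, 0.5, clock, sleep)
print(found, now)
```

The loop returns on the first successful check, which is why a watcher can finish long before maxwait expires.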
options
The options parameter is a Lua table containing three elements: sort, cases and match.
sort: the ordering method applied to the wlist.
cases: the number of cases to observe from the wlist.
match: the number of cases expected to satisfy the search.
By default, the value of the options table is {sort = 'NS', cases = 0, match = 0}.
Value | Description |
---|---|
'NS' | No sort |
 | Sorted alphabetically ascending |
 | Sorted alphabetically descending |
 | Sorted by date of modification ascending |
 | Sorted by date of modification descending |
Note
The value 'NS' keeps the elements in the same order in which they are passed in wlist.
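As a rough illustration of how cases and match could act on a list, consider the sketch below. Both helpers are hypothetical, and the reading of 0 as "all entries" is an assumption made for the example; the source does not state how the module interprets the default values.

```lua
-- Take the first `cases` entries of a list.
-- ASSUMPTION: 0 means "observe every entry".
local function take_cases(list, cases)
  if cases == 0 or cases >= #list then return list end
  local out = {}
  for i = 1, cases do out[i] = list[i] end
  return out
end

-- Decide whether enough observed entries satisfied the search.
-- ASSUMPTION: 0 means "every observed entry must match".
local function is_sufficient(matched, observed, match)
  local needed = (match == 0) and observed or match
  return matched >= needed
end

local observed = take_cases({'a.txt', 'b.txt', 'c.txt'}, 2)
print(#observed, is_sufficient(1, #observed, 1))
```

Under these assumptions, a match value lower than the number of observed cases lets the watcher succeed on a partial result.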
recursion
To enable directory recursion you must define the recursion parameter. Recursion works only for an observable of type directory.
The recursion value is a Lua table composed of the following elements: {recursive_mode, {deep_levels}, hidden_files}
recursive_mode: Boolean indicating whether to activate recursive mode on the root directory. The default value is false.
deep_levels: Numerical table indicating the depth levels to be evaluated in the directory structure. The default value is {0}.
hidden_files: Boolean indicating whether hidden files are evaluated in the recursion. The default value is false.
How do the recursion levels work?
To understand how levels work in recursion, let’s look at the following example.
Imagine you have the following directory structure and you want to observe the deletion of files from the path ‘/folder_A/folder_B/’.
Levels are counted from the root path, that is, the path passed as input to the watcher expression. In this case the path ‘/folder_A/folder_B/’ is level zero, and each nested folder adds one level according to its depth. The result is shown in the following summary table, which contains the list of files for each level.
 | [Input] Level 0 | Level 1 | Level 2 | Level 3 | Level 4 |
---|---|---|---|---|---|
folder | | | | | |
files | | | | | |
Note
The files .B3, .D1 and .E3 are hidden files.
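The level rule can also be expressed as a small pure-Lua sketch. It is a hedged illustration rather than the module's code: level_of, is_hidden and observable are hypothetical helpers, the paths are made up, and two behaviors are assumptions (non-recursive mode observes only level 0, and the hidden-file flag is applied in both modes).

```lua
-- Depth of `path` relative to `root`: entries directly inside root are level 0.
local function level_of(root, path)
  root = root:gsub('/+$', '')                     -- drop trailing slashes
  if path:sub(1, #root + 1) ~= root .. '/' then return nil end
  local relative = path:sub(#root + 2)
  local _, slashes = relative:gsub('/', '')       -- count nested folders
  return slashes
end

-- A file is hidden when its last path component starts with a dot.
local function is_hidden(path)
  local name = path:match('([^/]+)/*$')
  return (name or ''):sub(1, 1) == '.'
end

-- `recursion` follows the documented shape:
-- {recursive_mode, {deep_levels}, hidden_files}
local function observable(root, path, recursion)
  local recursive, levels, hidden = recursion[1], recursion[2], recursion[3]
  local level = level_of(root, path)
  if level == nil then return false end
  if is_hidden(path) and not hidden then return false end
  if not recursive then return level == 0 end     -- ASSUMPTION
  for _, wanted in ipairs(levels) do
    if wanted == level then return true end
  end
  return false
end

print(level_of('/data/in/', '/data/in/x/y/b.txt'))
print(observable('/data/in', '/data/in/x/y/b.txt', {true, {2}, false}))
```

Because deep_levels is a list rather than a maximum depth, a value such as {2} selects one level and skips the levels above and below it.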
Now that we know how to set the recursion level, let’s see an example of the observable files depending on different values of the recursion parameter for the above mentioned example.
 | Composition of the list of observable files |
---|---|
 | |
 | |
 | |
 | |
 | |
 | |
 | |
 | |
Output
Advanced File Creation
Inputs
Param | Type | Description |
---|---|---|
wlist | table | Watch list |
maxwait | number | Maximum wait time in seconds |
interval | number | Verification interval for the watcher in seconds |
minsize | number | Minimum expected file size |
stability | table | Minimum criteria for measuring file stability |
novelty | table | Time interval that determines the validity of the file’s novelty |
nmatch | number | Number of expected files as a search sufficiency condition |
wlist
The list of files, directories or file patterns to be observed. The data type is a Lua table, whose size is limited to 2,147,483,647 elements.
An example definition is the following:
wlist = {'path/file', 'path', 'pattern*', ...} --arbitrary code
maxwait
maxwait is a numeric value that sets the maximum time, in seconds, to wait for the watcher.
The watcher terminates as soon as the search conditions are met, without waiting for the full maxwait.
The default value is 60 seconds.
interval
interval is a numeric value that sets how often, in seconds, the watcher checks the search conditions.
This value must be less than the maxwait value.
The default value is 0.5 seconds.
minsize
minsize is a numeric value representing the minimum expected file size.
The default value is 0, which means that creating the file is sufficient; use it when the minimum size is unknown.
Important
Even when the expected file size is 0 bytes, the watcher does not terminate until the file has arrived in its entirety, which avoids edge cases where a file is consumed before the data transfer is complete.
stability
The stability parameter contains the elements used to evaluate the stability of a file.
It is a Lua table containing two elements:
The interval that defines how often the file is checked once it has arrived.
The number of iterations used to determine the stability of the file.
The default value is {1, 15}.
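One plausible reading of these two elements is "the file size must stay unchanged for a number of consecutive checks". The sketch below illustrates that reading only; is_stable is a hypothetical helper, the exact criterion used by Watcher is not stated in the source, and the size reader and sleep function are injected so that the example runs without touching the file system.

```lua
-- Sample the size `iterations` times, `interval` seconds apart, and report
-- the file as stable only if the size never changes between samples.
local function is_stable(get_size, stability, sleep)
  local interval, iterations = stability[1], stability[2]
  local last = get_size()
  for _ = 1, iterations do
    sleep(interval)
    local size = get_size()
    if size ~= last then return false, size end   -- still growing or shrinking
    last = size
  end
  return true, last
end

-- Simulated file that grows once and then settles at 20 bytes.
local sizes = {10, 20, 20, 20}
local cursor = 0
local function get_size()
  cursor = math.min(cursor + 1, #sizes)
  return sizes[cursor]
end
local function no_sleep() end

local first_ok, first_size = is_stable(get_size, {1, 2}, no_sleep)
local second_ok, second_size = is_stable(get_size, {1, 2}, no_sleep)
print(first_ok, first_size, second_ok, second_size)
```

A check of this kind does not need to know the final size in advance, which fits the statement that stability is independent of the file size.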
novelty
The novelty parameter is a two-element Lua table that contains the time interval that determines the validity of the file’s novelty.
The default value is {0, 0}, which indicates that the novelty of the file will not be evaluated.
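The novelty test amounts to checking a timestamp against an interval. Here is a minimal sketch under stated assumptions: is_novel is a hypothetical helper, the bounds are treated as inclusive, and the timestamps are arbitrary epoch seconds chosen for the example.

```lua
-- Return true when `mtime` falls inside the novelty interval {from, to}.
-- {0, 0} disables the check, as described for the default value.
-- ASSUMPTION: both bounds are inclusive.
local function is_novel(mtime, novelty)
  local from, to = novelty[1], novelty[2]
  if from == 0 and to == 0 then return true end
  return mtime >= from and mtime <= to
end

local day = 24 * 60 * 60
local reference = 1600000000                      -- arbitrary epoch seconds
local window = {reference - day, reference + day}

print(is_novel(reference, window))                -- inside the window
print(is_novel(reference - 2 * day, window))      -- older than the window
print(is_novel(reference - 2 * day, {0, 0}))      -- check disabled
```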
nmatch
nmatch is the number of files expected to match, used as a sufficiency condition for the search.
Advanced File Alteration
Inputs
Param | Type | Description |
---|---|---|
wlist | table | Watch list |
maxwait | number | Maximum wait time in seconds |
interval | number | Verification interval for the watcher in seconds |
awhat | string | Type of file alteration to be observed |
nmatch | number | Number of expected files as a search sufficiency condition |
wlist
The list of files, directories or file patterns to be observed. The data type is a Lua table, whose size is limited to 2,147,483,647 elements.
An example definition is the following:
wlist = {'path/file', 'path', 'pattern*', ...} --arbitrary code
maxwait
maxwait is a numeric value that sets the maximum time, in seconds, to wait for the watcher.
The watcher terminates as soon as the search conditions are met, without waiting for the full maxwait.
The default value is 60 seconds.
interval
interval is a numeric value that sets how often, in seconds, the watcher checks the search conditions.
This value must be less than the maxwait value.
The default value is 0.5 seconds.
awhat
Type of file alteration to be observed. See File Watcher Alteration Parameters.
Type | Value | Description |
---|---|---|
 | '1' | Search for any alteration |
 | '2' | Search for file content alteration |
 | '3' | Search for file size alteration |
 | '4' | Search for file ctime alteration |
 | '5' | Search for file mtime alteration |
 | '6' | Search for file inode alteration |
 | '7' | Search for file owner alteration |
 | '8' | Search for file group alteration |
nmatch
nmatch is the number of files expected to match, used as a sufficiency condition for the search.
Watcher for Any Alteration
fwa.alteration({'/path/to/file'}, nil, nil, '1')
Watcher for Specific Alteration
fwa.alteration({'/path/to/file'}, nil, nil, '2') --Watcher for file content alteration
fwa.alteration({'/path/to/file'}, nil, nil, '3') --Watcher for file size alteration
fwa.alteration({'/path/to/file'}, nil, nil, '4') --Watcher for file ctime alteration
--explore other options for 'awhat' values
See table File Watcher Alteration Parameters for more options.
Decoupled Execution
The create and run functions and the monit options are decoupled from one another, which reduces overhead and makes them more versatile to use.
Novelty Detection
Watcher detects the novelty of a file based on its mtime (modification date).
This is useful to know if file system items have been created in an expected time window.
Warning
Note that files may have been created in a way that preserves the attributes of the original file, including its mtime. In that case, set the novelty interval accordingly.
date_from = os.time() - 24*60*60 --One day before the current date
date_to = os.time() + 24*60*60 --One day after the current date
os.execute('touch /tmp/novelty_file.txt') --The file is created on the current date
fwa.creation({'/tmp/novelty_file.txt'}, 10, nil, 0, nil, {date_from, date_to})
Note
For known dates you can use the Lua function os.time() as follows:
date_from = os.time(
  {
    year = 2020,
    month = 6,
    day = 4,
    hour = 23,
    min = 48,
    sec = 10
  }
)
Qualitative Response
Watcher keeps a record for each watchable file, with qualitative information about the search result for that file.
To explore this information, see the Watcher Monitoring match and nomatch functions.
NOT_YET_CREATED = '_' --The file has not yet been created
FILE_PATTERN = 'P' --This is a file pattern
HAS_BEEN_CREATED = 'C' --The file has been created
IS_NOT_NOVELTY = 'N' --The file is not an expected novelty
UNSTABLE_SIZE = 'U' --The file has an unstable file size
UNEXPECTED_SIZE = 'S' --The file size is unexpected
DISAPPEARED_UNEXPECTEDLY = 'D' --The file has disappeared unexpectedly
DELETED = 'X' --The file has been deleted
NOT_EXISTS = 'T' --The file does not exist
NOT_YET_DELETED = 'E' --The file has not been deleted yet
NO_ALTERATION = '0' --The file has not been modified
ANY_ALTERATION = '1' --The file has been modified
CONTENT_ALTERATION = '2' --The content of the file has been altered
SIZE_ALTERATION = '3' --The file size has been altered
CHANGE_TIME_ALTERATION = '4' --The ctime of the file has been altered
MODIFICATION_TIME_ALTERATION = '5' --The mtime of the file has been altered
INODE_ALTERATION = '6' --The number of inodes has been altered
OWNER_ALTERATION = '7' --The owner of the file has changed
GROUP_ALTERATION = '8' --The group of the file has changed
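When reading these codes in logs, a reverse lookup from code to name can be convenient. The following is a small sketch with hypothetical names (CODES, NAMES, describe); it copies only a subset of the codes listed above and does not reflect how Watcher stores them internally.

```lua
-- Subset of the codes listed above, keyed by constant name.
local CODES = {
  NOT_YET_CREATED = '_',
  HAS_BEEN_CREATED = 'C',
  UNSTABLE_SIZE = 'U',
  DELETED = 'X',
}

-- Reverse index: code -> constant name.
local NAMES = {}
for name, code in pairs(CODES) do NAMES[code] = name end

-- Translate a one-character code into a readable name.
local function describe(code)
  return NAMES[code] or ('UNKNOWN(' .. tostring(code) .. ')')
end

print(describe('C'))
print(describe('?'))
```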
Check File Stability
Enabled only for file creation. This feature ensures that the watcher terminates once the file creation is completely finished. This criterion is independent of the file size.
See the stability parameter for usage.
Big Amounts of Files
In the following example, watching for file deletion from the path “/” recursively down to depth level 3 (levels={0,1,2,3}) yields a total of 163,170 watchable files.
Note that the execution takes 85 seconds (on a typical desktop machine), although the maximum timeout of the watcher is set as low as 10 seconds.
This means that at least 88% of the time, (85 - 10) / 85, is spent creating the watcher because of the recursion.
tarantool> test=function() local ini=os.time() local fwa=fw.deletion({'/'}, 10, nil, {'NS', nil, 2}, {true, {0,1,2,3}, false}) print(os.time()-ini) print(fwa.wid) end
tarantool> test()
85
1620701962375155ULL
---
tarantool> mon.info(1620701962375155ULL)
---
- ans: true
  match: 72
  what: '{"/"}'
  wid: 1620701962375155
  type: FWD
  nomatch: 163098
  status: completed
...
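The share of time attributed to watcher creation follows directly from the two figures in the example. As a quick check in Lua, using only the numbers quoted above (the variable names are illustrative):

```lua
local elapsed = 85   -- total seconds reported by the test function
local maxwait = 10   -- maximum seconds the watcher itself may run

-- Lower bound on the fraction of time spent building the watch list.
local setup_share = (elapsed - maxwait) / elapsed
print(string.format('%.0f%%', setup_share * 100))  -- prints 88%
```

This is a lower bound because the watcher may have finished before its 10-second timeout.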
Atomic Function Injection
Atomic function injection allows you to perform specific tasks on each element of the watchable list separately. In the example, the atomic function afu creates a backup copy for each element of the watchlist.
afu = function(file) os.execute('cp '..file..' '..file..'_backup') end --Atomic function
cor = require('watcher').core
wat = cor.create({'/tmp/original.txt'}, 'FWD', afu) --afu is passed as parameter
res = run_watcher(wat)
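Conceptually, an atomic function is applied to each element of the watch list on its own. The sketch below illustrates that idea in plain Lua; apply_each is a hypothetical helper, and wrapping each call in pcall is a design choice made for this example so that one failing element does not stop the others, not something the source says Watcher does.

```lua
-- Call `afu` once per element and record the outcome for each one.
local function apply_each(wlist, afu)
  local results = {}
  for _, item in ipairs(wlist) do
    local ok, err = pcall(afu, item)
    results[item] = ok and true or tostring(err)
  end
  return results
end

-- Example function that rejects one of the elements.
local function afu(file)
  if file == '/tmp/bad.txt' then error('cannot process ' .. file) end
end

local results = apply_each({'/tmp/original.txt', '/tmp/bad.txt'}, afu)
print(results['/tmp/original.txt'])
print(results['/tmp/bad.txt'])
```

When the injected function builds a shell command from a file name, as in the backup example, file names containing spaces or shell metacharacters need quoting.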
Folder Recursion
You can enable recursion on directories to detect changes in the file system. Recursion is enabled based on a directory entry as a parameter that is considered as a root directory. Starting from this root directory, considered as level zero, you can selectively activate the observation of successive directory levels.
fwa.deletion(
  {'/tmp/folder_1'}, --Observed directory, treated as the level-zero root directory
  nil, --Maxwait, nil to use the default value
  nil, --Interval, nil to use the default value
  nil, --Options, nil to use the default value
  {
    true, --Activate recursion
    {0, 1, 2}, --Directory levels to observe (root and levels 1 & 2)
    false --Do not include hidden files
  }
)
For more info see How do the recursion levels work?.
Selective Path Level
The recursion levels are a list of numerical values, so you can selectively specify the directory levels you want to observe and ignore the others. This is useful when the full path to the file is unknown but its depth or level is known.
fwa.deletion(
  {'/bac/invoices'},
  nil,
  nil,
  nil,
  {
    true, --Activate recursion
    {3}, --Selective level 3
    false --Do not include hidden files
  }
)
See use case …
Watcher Monitoring
monit allows you to monitor and explore the running status of a watcher.
info
The output is a Lua table containing the following elements:
ans is a boolean value containing the response of the watcher. true means that the watcher has detected the expected changes defined in the parameters.
match is the number of cases that match the true value of ans.
nomatch is the number of cases that do not match the true value of ans.
what is a string containing the observables parameter.
wid is the unique identifier of the watcher.
type is the type of the watcher.
status is the execution status of the watcher.
mon.info(1620701962375155ULL)

{
  ans: true
  match: 72
  what: '{"/"}'
  wid: 1620701962375155
  type: 'FWD'
  nomatch: 163098
  status: 'completed'
}
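The match and nomatch counters together account for every watchable file. The short sketch below derives the total and the match ratio from a table shaped like the output above; summarize is a hypothetical helper, and the table is written by hand for the example rather than obtained from mon.info.

```lua
-- Hand-written table mirroring the fields of the output shown above.
local info = { ans = true, match = 72, nomatch = 163098, status = 'completed' }

-- Return the number of watchable files and the fraction that matched.
local function summarize(result)
  local total = result.match + result.nomatch
  local ratio = total > 0 and result.match / total or 0
  return total, ratio
end

local total, ratio = summarize(info)
print(total, string.format('%.4f', ratio))
```

For these figures the total is 163170, the same number of watchable files reported for the recursive deletion example.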