Finder
- class Finder(root, pattern, use_regex=False, scan_everything=False)
Bases: object
Find files using a filename pattern.
The Finder object is the main entry point to this library. Given a root directory and a filename pattern, it can search for all corresponding files.
- Parameters:
root (str) – The root directory of the filetree where all files can be found.
pattern (str) – The filename pattern. See Pattern for details.
use_regex (bool) – If True, characters outside of groups are treated as regular-expression syntax and are not escaped. Default is False.
scan_everything (bool) – If True, look into all sub-directories up to a depth of max_scan_depth. This is appropriate if the pattern contains optional sub-directories. If False (default), check that every sub-directory matches its part of the regular expression, thus avoiding some work.
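To illustrate the effect of use_regex, here is a minimal sketch, not the library's implementation. It assumes groups are written as %(name) without nested parentheses; the helper pattern_to_regex and the group_regexes mapping are hypothetical names introduced for this example only.

```python
import re

# Simplified group finder: "%(" ... ")" with no nested parentheses (assumption).
SIMPLE_GROUP = re.compile(r"%\(([^()]*)\)")


def pattern_to_regex(pattern, group_regexes, use_regex=False):
    """Build a regex from a pattern (hypothetical helper).

    Text outside groups is escaped unless use_regex is True.
    group_regexes maps a group name to the regex that replaces it.
    """
    def outside(text):
        return text if use_regex else re.escape(text)

    parts = []
    position = 0
    for match in SIMPLE_GROUP.finditer(pattern):
        parts.append(outside(pattern[position:match.start()]))
        parts.append("(" + group_regexes[match.group(1)] + ")")
        position = match.end()
    parts.append(outside(pattern[position:]))
    return "".join(parts)


group_regexes = {"year": r"\d{4}"}  # hypothetical group definition
escaped = pattern_to_regex("sst_%(year).nc", group_regexes)
unescaped = pattern_to_regex("sst_%(year).nc", group_regexes, use_regex=True)

print(bool(re.fullmatch(escaped, "sst_2020.nc")))    # True
print(bool(re.fullmatch(escaped, "sst_2020xnc")))    # False: "." is literal
print(bool(re.fullmatch(unescaped, "sst_2020xnc")))  # True: "." is a wildcard
```

With escaping, the dot before the extension only matches a literal dot; without it, the same character acts as a regex wildcard.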
- _add_file(filename, pattern)
Add file to cache if it matches the pattern and passes the filters.
- _find_files_scan_everything()
Find files in all sub-directories.
Because checking whether a sub-directory matches the pattern is difficult, this method allows for more exotic patterns in which a folder separator can appear in a capturing group, for example for optional sub-directories.
This will scan the whole filetree under root and check every file found, which can be significant work in some cases.
- Return type:
None
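The strategy described above can be sketched with os.walk. This is an illustration under assumptions, not the library's code: the function scan_everything below is hypothetical, and it assumes max_scan_depth counts the number of sub-directory levels descended below root.

```python
import os
import re
import tempfile


def scan_everything(root, regex, max_scan_depth=32):
    """Walk the whole tree under root and match every relative path."""
    matcher = re.compile(regex)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        depth = 0 if rel_dir == os.curdir else rel_dir.count(os.sep) + 1
        if depth >= max_scan_depth:
            dirnames[:] = []  # prune: os.walk will not descend further
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            if matcher.fullmatch(rel):
                found.append(rel)
    return sorted(found)


with tempfile.TemporaryDirectory() as root:
    for rel in ["a.txt", os.path.join("sub", "b.txt"), os.path.join("sub", "c.log")]:
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    sep = re.escape(os.sep)
    # The optional group contains a folder separator.
    regex = rf"(sub{sep})?[a-z]\.txt"
    found = scan_everything(root, regex)
    shallow = scan_everything(root, regex, max_scan_depth=0)

print(found)
print(shallow)
```

Every file is tested against the full regular expression, so a group spanning a folder separator poses no problem, at the cost of visiting every directory.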
- _find_files_subdirectories()
Find files checking sub-directories along the way.
Each sub-directory must match against its corresponding part of the generated regular expression. This is ill-suited if any group contains a folder separator, but it limits the number of sub-directories to explore and thus the number of files to check.
- Return type:
None
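As a rough sketch of this approach (not the library's code), the regular expression can be split into one part per path component, and a directory is only entered if its name matches its part. The function find_by_subdirectories and the regex_parts argument are hypothetical.

```python
import os
import re
import tempfile


def find_by_subdirectories(root, regex_parts):
    """regex_parts holds one regex per path component; the last is for filenames."""
    found = []

    def explore(directory, level, rel):
        matcher = re.compile(regex_parts[level])
        is_last = level == len(regex_parts) - 1
        for entry in sorted(os.listdir(directory)):
            if not matcher.fullmatch(entry):
                continue  # non-matching directories are never entered
            full = os.path.join(directory, entry)
            if is_last:
                if os.path.isfile(full):
                    found.append(os.path.join(rel, entry))
            elif os.path.isdir(full):
                explore(full, level + 1, os.path.join(rel, entry))

    explore(root, 0, "")
    return found


with tempfile.TemporaryDirectory() as root:
    for rel in [("2020", "a.txt"), ("2021", "b.txt"), ("other", "c.txt")]:
        os.makedirs(os.path.join(root, rel[0]), exist_ok=True)
        open(os.path.join(root, *rel), "w").close()

    matched = find_by_subdirectories(root, [r"\d{4}", r"[a-z]\.txt"])

print(matched)
```

The directory named other is skipped without being listed, which is where the saving comes from.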
- _find_groups(pattern)
Find the groups within the pattern and their corresponding string indices.
The returned indices should be sorted in order of appearance in the pattern.
The indices should correspond to the first and last character of the group, including the delimiter characters.
In contrast, the string specification of the group should not include them.
This implementation finds the matching pair defined by the attribute _group_delimiters. A group start that has no matching end raises an exception.
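The following sketch shows one way to find balanced groups as described. It is an illustration, not the library's implementation: the function name, the returned tuple layout, and the choice of ValueError are assumptions, and it supposes single-character start and end delimiters.

```python
def find_groups(pattern, delimiters=("%", "(", ")")):
    """Return (specification, first index, last index) for each group."""
    prefix, start, end = delimiters
    opener = prefix + start
    groups = []
    position = 0
    while True:
        begin = pattern.find(opener, position)
        if begin == -1:
            break
        depth = 0
        for i in range(begin + len(prefix), len(pattern)):
            if pattern[i] == start:
                depth += 1
            elif pattern[i] == end:
                depth -= 1
                if depth == 0:
                    break
        else:
            # The loop ended without finding the balancing end delimiter.
            raise ValueError(f"No matching {end!r} for group starting at index {begin}")
        specification = pattern[begin + len(opener):i]  # delimiters excluded
        groups.append((specification, begin, i))        # indices include delimiters
        position = i + 1
    return groups


print(find_groups("f_%(a)_%(b(c)).txt"))
# [('a', 2, 5), ('b(c)', 7, 13)]
```

Nested parentheses inside a group are handled by counting depth, which is why the second group in the example keeps its inner (c).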
- _void_cache()
Clear the cache.
- Return type:
None
- add_filter(func, **kwargs)
Add a filter with which to select scanned files.
The filter will be applied to files already in the cache.
See Filtering for details.
- clear_filters()
Remove all filters.
- Return type:
None
- find_files()
Find files to scan and store them in cache.
This is called automatically when accessing files or get_files(). It applies all filters and sorts files alphabetically.
- Return type:
None
- find_matches(filename, relative=True)
Alias for get_matches().
- fix_by_filter(key, func, fix_discard=False, default_date=None, pass_unparsed=False, **kwargs)
Fix a group value by using a filter function.
When a scanned file matches the pattern, it is only kept if func returns True when called with the parsed value of the group. If the group cannot parse the value and pass_unparsed is True, the unparsed string is passed to the predicate function anyway; otherwise the file is not kept (default).
This adds a filter (see add_filter()) with a name consisting of the key and a unique id (this allows multiple filters for a single group).
- Parameters:
key (int | str) – Can be the index of a group in the pattern (starts at 0), or the name of a group. If multiple groups share the same name, they are all fixed.
func (Callable[[...], bool]) – A function that takes the parsed value of the group and returns True if the corresponding file should be kept, or False otherwise. If multiple groups correspond to the key, all values will be tested successively.
fix_discard (bool) – If True, also use the values of groups with the discard flag. Default is False.
pass_unparsed (bool) – If the group cannot parse the string and this is True, pass the unparsed string to the predicate function func anyway. If False (default), the file will not be kept.
default_date (datetime | Mapping[str, int] | None) – Passed to library.get_date() if key is “date”.
kwargs – Will be passed to the function.
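The decision logic described for func and pass_unparsed can be sketched as follows. This is a simplified stand-in, not the library's code: keep_file and its parse argument are hypothetical, and a parsing failure is assumed to surface as a ValueError.

```python
def keep_file(raw_value, parse, func, pass_unparsed=False, **kwargs):
    """Decide whether a file is kept, given the raw string matched by a group."""
    try:
        value = parse(raw_value)
    except ValueError:
        if not pass_unparsed:
            return False   # default: unparsable values discard the file
        value = raw_value  # hand the unparsed string to the predicate
    return bool(func(value, **kwargs))


def at_least(value, minimum):
    """Example predicate; 'minimum' arrives through **kwargs."""
    return value >= minimum


print(keep_file("2020", int, at_least, minimum=2015))  # True
print(keep_file("2010", int, at_least, minimum=2015))  # False
print(keep_file("abcd", int, at_least, minimum=2015))  # False
print(keep_file("abcd", int, lambda v: isinstance(v, str), pass_unparsed=True))  # True
```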
- fix_group(key, value, fix_discard=False)
Fix a group to a string.
This will void the cache.
- Parameters:
key (int | str) – Can be the index of a group in the pattern (starts at 0), or the name of a group. If multiple groups share the same name, they are all fixed to the same value.
value (str | Any) – Can be a string, or a value that will be formatted using the group format string. A string will be interpreted as a regular expression, so all special characters should be properly escaped. A list of values will be joined by the regex ‘|’ OR.
fix_discard (bool) – If True, groups with the ‘discard’ option will still be fixed. Default is False.
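The handling of value described above (strings taken as regular expressions, other values formatted, lists joined with ‘|’) might look like the sketch below. The helper fixed_value_regex and its fmt argument are hypothetical, and escaping the formatted values is an assumption of this example rather than documented behaviour.

```python
import re


def fixed_value_regex(value, fmt="d"):
    """Turn a fixed value, or a list of them, into a regex fragment."""
    values = value if isinstance(value, (list, tuple)) else [value]
    parts = []
    for item in values:
        if isinstance(item, str):
            parts.append(item)  # used as-is: the caller must escape specials
        else:
            parts.append(re.escape(format(item, fmt)))
    return "|".join(parts)


print(fixed_value_regex([1, 12], fmt="02d"))  # 01|12
print(fixed_value_regex("v1|v2"))             # v1|v2
print(fixed_value_regex(re.escape("v1.0")))   # v1\.0
```

Since strings are interpreted as regular expressions, passing a literal such as "v1.0" through re.escape first avoids the dot matching any character.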
- fix_groups(fixes=None, fix_discard=False, **fixes_kw)
Fix multiple groups at once.
- get_absolute(filename)
Concatenate the finder root directory and a filename.
- get_files(relative=False, nested=None)
Return files that match the regex.
Lazily scan files: if files were already scanned, just return the stored list of files.
- Parameters:
relative (bool) – If True, filenames are returned relative to the finder root directory. If not, paths are absolute (default).
nested (Sequence[str | Sequence[str]] | None) – If not None, return a nested list of filenames, with each level corresponding to a group or set of groups. The last set in the list is at the innermost level.
- Raises:
KeyError – A group name in nested is not found in the pattern.
- Return type:
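One possible reading of the nested argument is sketched below: files are grouped by the values of each set of groups, the first set forming the outermost level. This is an interpretation under assumptions, not the library's implementation; the function nest and the (filename, values) input layout are hypothetical.

```python
from itertools import groupby


def nest(files, nested):
    """files: list of (filename, {group name: value}); nested: group names or sets of names."""
    def build(items, levels):
        if not levels:
            return [name for name, _ in items]
        names = [levels[0]] if isinstance(levels[0], str) else list(levels[0])

        def key(item):
            # Raises KeyError if a group name is unknown.
            return tuple(item[1][name] for name in names)

        ordered = sorted(items, key=key)
        return [build(list(chunk), levels[1:]) for _, chunk in groupby(ordered, key=key)]

    return build(list(files), list(nested))


files = [
    ("sst_2020-01.nc", {"Y": "2020", "m": "01"}),
    ("sst_2020-02.nc", {"Y": "2020", "m": "02"}),
    ("sst_2021-01.nc", {"Y": "2021", "m": "01"}),
]

print(nest(files, ["Y"]))
# [['sst_2020-01.nc', 'sst_2020-02.nc'], ['sst_2021-01.nc']]
print(nest(files, ["Y", "m"]))
# [[['sst_2020-01.nc'], ['sst_2020-02.nc']], [['sst_2021-01.nc']]]
```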
- get_group_names(fixed=None)
Get the names of groups in the pattern.
- get_groups(key)
Return list of groups corresponding to key.
If date_is_first_class is True, for the key ‘date’ return all time related groups.
- get_matches(filename, relative=True)
Find matches for a given filename.
Apply regex to filename and return the results as a Matches object. Fixed values are applied as normal.
- get_relative(filename)
Get filename path relative to root.
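The relationship between get_absolute() and get_relative() can be pictured with the standard os.path functions. The two functions below are stand-ins written for this example (they take root explicitly); whether the library uses these exact calls is an assumption.

```python
import os


def get_absolute(root, filename):
    """Concatenate the root directory and a filename."""
    return os.path.join(root, filename)


def get_relative(root, filename):
    """Return the path of filename relative to root."""
    return os.path.relpath(filename, root)


root = os.path.join(os.sep, "data", "sst")  # hypothetical root directory
relative = os.path.join("2020", "sst_2020-01.nc")
absolute = get_absolute(root, relative)

print(absolute)
print(get_relative(root, absolute))
```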
- make_filename(fixes=None, relative=False, **kw_fixes)
Return a filename.
Replace groups with provided values. All groups must be fixed beforehand or through the fixes argument.
Only works if use_regex is set to False (default).
- Parameters:
fixes (dict | None) – Dictionary of fixes (group name or index: value). For details, see fix_group(). Temporarily supersedes any value previously fixed for a group. If a prior fix is a list, its first item will be used.
relative (bool) – If the filename should be relative to the finder root directory. Default is False.
kw_fixes (Any) – Same as fixes. Takes precedence.
- Raises:
ValueError – use_regex is activated.
- Return type:
- set_scan_everything(scan_everything, /)
Set value for attribute scan_everything.
Void cache if necessary.
- Parameters:
scan_everything (bool)
- Return type:
None
- set_use_regex(use_regex, /)
Set value for attribute use_regex.
- Parameters:
use_regex (bool)
- Return type:
None
- unfix_groups(*keys)
Unfix groups, and remove group related filters.
This will void the cache.
- Parameters:
keys (int | str) – Keys to find groups to unfix. See get_groups(). If no key is provided, all groups will be unfixed.
- _group_delimiters: tuple[str, str, str] = ('%', '(', ')')
Delimiter characters of groups in the pattern.
Tuple of (prefix, start characters, end characters). Start and end characters must be balanced within the group. Prefix can be empty.
- _segments: list[str]
Segments of the pattern. Used to replace specific groups. [‘text before group 1’, ‘group 1’, ‘text before group 2’, ‘group 2’, …, ‘text after last group’]
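The alternating layout of the segments makes it straightforward to substitute group values, as in this minimal sketch. The function fill_segments is hypothetical, and so are the exact content of the group segments and the exception raised for a group left without a value.

```python
def fill_segments(segments, values):
    """Replace each group segment (odd indices) by its value and join everything.

    segments alternates text and groups: [text, group, text, group, ..., text].
    values maps a group index (starting at 0) to its replacement string.
    """
    filled = list(segments)
    for group_index in range(len(segments) // 2):
        if group_index not in values:
            # Choice of exception is an assumption of this sketch.
            raise ValueError(f"Group {group_index} has no value")
        filled[2 * group_index + 1] = values[group_index]
    return "".join(filled)


segments = ["sst_", "Y", "-", "m", ".nc"]  # hypothetical segments
print(fill_segments(segments, {0: "2020", 1: "07"}))  # sst_2020-07.nc
```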
- property files: list[tuple[str, Matches]]
List of filenames and their matches.
Will scan files when accessed and cache the result, if it has not already been done.
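The scan-on-first-access behaviour can be sketched with a cached property. The class LazyScanner below is a hypothetical stand-in that only mirrors the caching idea; it is not the Finder class and makes no claim about its internals.

```python
class LazyScanner:
    """Scan on first access to 'files', then reuse the cached result."""

    def __init__(self, scan):
        self._scan = scan      # callable returning an iterable of filenames
        self._files = []
        self._scanned = False

    @property
    def files(self):
        if not self._scanned:
            self._files = sorted(self._scan())
            self._scanned = True
        return self._files

    def void_cache(self):
        """Forget the cached files so the next access scans again."""
        self._files = []
        self._scanned = False


calls = []


def fake_scan():
    calls.append(1)  # count how many scans actually happen
    return ["b.nc", "a.nc"]


scanner = LazyScanner(fake_scan)
first = scanner.files
second = scanner.files   # served from the cache, no new scan
scans_before_void = len(calls)
scanner.void_cache()
third = scanner.files    # triggers a second scan
print(first, scans_before_void, len(calls))
```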
- filters: FilterList
List of filters to apply to found files.
- max_scan_depth: int = 32
Maximum sub-directory depth to scan when scan_everything is True.