Compare commits

...

48 Commits

Author SHA1 Message Date
3e85185b1a Migrating to a devcontainer
All checks were successful
renamebycsv/pipeline/head This commit looks good
2025-07-12 20:51:21 +00:00
eacb730961 Added a comment
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-09-11 21:55:30 +00:00
844cf4b2de Changed steps to use native credential manager
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-09-11 20:52:34 +00:00
434f969556 Improved logging and reduced exceptions.
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
Also bumped version.
2023-09-11 07:59:56 +00:00
d98801dd66 Updated pipeline to use latest build container image features
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-05-03 08:37:35 -05:00
83639a10e2 Added info to the 'README.md' regarding escape character
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-26 13:59:58 -05:00
34e5b107ff Updated 'README.md' with more help
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-26 13:52:01 -05:00
70af81ed84 Preparing for 0.0.7 release 2023-04-26 13:47:34 -05:00
b745915e49 Added some more logging to the INFO level 2023-04-26 13:47:09 -05:00
c6d79c9eb1 Fixed bug where CLI wouldn't run
Caused by wrong argument name parameter for run function
2023-04-26 13:44:27 -05:00
682503a24a Minor change to CLI help menu
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-26 03:12:33 -05:00
4bf334c9d5 Restructured code structure 2023-04-26 03:07:09 -05:00
90a1db4f0c Began work on 0.0.6 2023-04-26 03:06:25 -05:00
a3bb168c14 Publish stage will not fail on publish failure
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-25 10:47:00 -05:00
f4fe30ce9f Added regex101.com to 'README.md'
Some checks reported errors
ydeng/renamebycsv/pipeline/head Something is wrong with the build of this commit
2023-04-25 10:40:00 -05:00
c84e0d8c4c Changed publish stage to use single quotes in sh step
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-25 09:55:09 -05:00
5c3431428f Added -k to 'launch.json'
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-25 09:51:00 -05:00
028c93eb80 Fixed code formatting
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-25 09:48:35 -05:00
352af7da14 Added ability to define extension to append
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-25 09:48:07 -05:00
f4d9c37687 Updated help menu to be more consistent 2023-04-25 09:30:57 -05:00
7bab8a9436 Added rudimentary VSCode launch for development
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-25 09:17:27 -05:00
f31b1b2705 Added check for non-existing keys in CSV dictionary 2023-04-25 09:15:26 -05:00
7b7f6438d4 Test installation now forces reinstall
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-24 12:25:55 -05:00
4bea0eb15e Downgrading 'python-build' package
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-24 12:22:01 -05:00
1f1f2567ae Fixed 'environment.yml' package name specification
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-24 12:17:16 -05:00
2e38fe83f1 Merge branch 'master' into develop
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-24 12:14:15 -05:00
83c384e55c Upgraded package 'build' and downgraded 'cryptography'
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-24 12:13:32 -05:00
26cbac64e8 Lock openssl version in build environment
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-24 12:04:45 -05:00
2f5b1c7be6 Bumped version
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-24 11:58:14 -05:00
03fa2b3d8b Merge branch 'develop'
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-24 11:52:52 -05:00
286ca0b5a5 Added step to test installation
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-22 16:37:03 -05:00
64204b561d Added some information about usage in README.md
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-21 15:20:59 -05:00
4f2eeb6a54 Fixed 'Jenkinsfile' branch check when publishing
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-21 11:34:21 -05:00
8662972fe5 Bump version number
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-21 11:30:33 -05:00
a566813c56 Restructured code slightly
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-21 11:27:27 -05:00
7bb56ac14d Reduced build 'environment.yml' version specificity
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-21 11:25:22 -05:00
e95f5b5ac9 Added test installation and archiving stages to 'Jenkinsfile'
All checks were successful
ydeng/renamebycsv/pipeline/head This commit looks good
2023-04-21 11:23:56 -05:00
efe4855297 Merge branch 'develop' of https://git.reslate.systems/ydeng/renamebycsv into develop 2023-04-21 11:22:23 -05:00
9ed70b317d Jenkins pipeline now only publishes when on main branch
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-08 13:14:45 -05:00
739ac17b02 Fixed pipeline publishing stage
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-08 12:51:38 -05:00
cc6c9bd0db Downgraded environment cryptography package
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-08 05:55:09 -05:00
edec2cb929 Fixed Jenkinsfile
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-08 05:45:31 -05:00
1fea13d053 Change image to debian
Some checks failed
ydeng/renamebycsv/pipeline/head There was a failure building this commit
2023-04-08 05:38:47 -05:00
b867f333a1 Changed to using mamba environment 2023-04-08 05:36:11 -05:00
1eebfd9717 Change image to debian
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2023-04-06 09:51:29 -05:00
1e508eeace Added package configurations and Woodpecker CI pipeline
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
2023-04-06 00:52:05 -05:00
7ce680b112 Added option to retain original file extensions 2023-04-06 00:40:28 -05:00
81f4bd8d41 Added a license and restructured folders 2023-04-05 16:19:32 -05:00
20 changed files with 333 additions and 74 deletions


@@ -0,0 +1,22 @@
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/python
{
    "name": "Python 3",
    // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
    "image": "mcr.microsoft.com/devcontainers/python:1-3.12-bullseye"
    // Features to add to the dev container. More info: https://containers.dev/features.
    // "features": {},
    // Use 'forwardPorts' to make a list of ports inside the container available locally.
    // "forwardPorts": [],
    // Use 'postCreateCommand' to run commands after the container is created.
    // "postCreateCommand": "pip3 install --user -r requirements.txt",
    // Configure tool-specific properties.
    // "customizations": {},
    // Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
    // "remoteUser": "root"
}

26
.vscode/launch.json vendored Normal file

@@ -0,0 +1,26 @@
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Module",
            "type": "python",
            "request": "launch",
            "module": "renamebycsv.cli",
            "args": [
                "${workspaceFolder}/tests/resources/files",
                "group\\d+-\\w-(\\d+)\\.txt",
                "${workspaceFolder}/tests/resources/groups.csv",
                "target",
                "replaced",
                "-d",
                "-e",
                "abc",
                "-k"
            ],
            "justMyCode": true
        }
    ]
}

44
Jenkinsfile vendored Normal file

@@ -0,0 +1,44 @@
pipeline {
    agent {
        kubernetes {
            cloud 'rsys-devel'
            defaultContainer 'pip'
            inheritFrom 'pip'
        }
    }
    stages {
        stage("install") {
            steps {
                sh 'pip install -r requirements.txt'
            }
        }
        stage("build") {
            steps {
                sh 'rm -rf ./dist/*'
                sh "python -m build"
            }
        }
        stage("test") {
            steps {
                sh "pip install dist/*.whl --force-reinstall"
                sh "renamebycsv -h"
            }
        }
        stage("archive") {
            steps {
                archiveArtifacts artifacts: 'dist/*.tar.gz, dist/*.whl', fingerprint: true, followSymlinks: false, onlyIfSuccessful: true
            }
        }
        stage("publish package") {
            environment {
                CREDS = credentials('rs-git-package-registry-ydeng')
            }
            when {
                branch '**/master'
            }
            steps {
                sh returnStatus: true, script: 'python -m twine upload --repository-url https://git.reslate.systems/api/packages/${CREDS_USR}/pypi -u ${CREDS_USR} -p ${CREDS_PSW} --non-interactive --disable-progress-bar --verbose dist/*'
            }
        }
    }
}

21
LICENSE Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) [year] [fullname]
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


@@ -5,5 +5,38 @@ A simple program that renames files by using a spreadsheet in CSV format as a di
## Features
- Rename files recursively within a directory to a desired string
- Replace only the REGEX match portion
- Desired string is set by a CSV where one column is the original string, and another column is the string to replace the original string with
- Uses a REGEX capture group to select file and the portion of the filename to rename
- Ability to define file extension
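
The "CSV as a dictionary" idea from the feature list can be sketched as follows. This is a minimal illustration, not the program's actual code: the function name `load_mapping` and the sample CSV contents are assumptions made for the example.

```python
import csv
import io

# Hypothetical CSV contents; the column names "target" and "replaced" are examples only.
CSV_TEXT = "target,replaced\n9,a\n10,b\n11,c\n"


def load_mapping(csv_file, current, become):
    """Build {value in `current` column: value in `become` column} from a CSV with a header row."""
    mapping = {}
    for row in csv.DictReader(csv_file):
        key, value = row[current], row[become]
        # The same key pointing at two different values is ambiguous, so refuse it.
        if key in mapping and mapping[key] != value:
            raise ValueError(f"Conflicting entries for {key!r}")
        mapping[key] = value
    return mapping


mapping = load_mapping(io.StringIO(CSV_TEXT), "target", "replaced")
print(mapping)
```

Each captured portion of a filename would then be looked up in `mapping` to find its replacement.
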
## Installing using `pip`
1. Run `pip install --index-url https://git.reslate.systems/api/packages/ydeng/pypi/simple renamebycsv` in any `python3` enabled terminal.
2. Run `renamebycsv -h` to see the help and confirm installation was successful.
## Advanced Usage: What is REGEX?
This program relies heavily on REGEX (regular expressions) to let users choose which portion of a filename is looked up in the CSV. It is therefore important for users of this script to understand how REGEX works. Here are some key pointers to get you started:
- REGEX works by matching one string against another, much like any in-text search function.
- Where it differs is the ability to use one REGEX string to match many strings.
- e.g., the REGEX "`abc\d+`" will match "`abc1`", "`abc2`", and "`abc12`", but not "`ac12`" or "`abc`".
- Many characters can be used as normal and will match a string literally (character for character), but some are treated as special characters (such as the previously used `\`, which indicates that the character after it should be treated specially, for example as a token).
- Common tokens to be aware of: `.` for any character, `\d` for a single digit, `\w` for a word character, `\s` for a whitespace character (tabs, spaces, line breaks, etc.). Tokens can be repeated by using `+`, indicating "one or more", or `*`, indicating "zero or more". If you want to match something that is read as a token by default, such as `.` or `+`, putting `\` in front of it will cause it to match literally, e.g., `1\.3` matches `1.3`, but not `123`, `1a3`, etc.
- A capture group is a way of "selecting" a part of a text and is formed by using `(` and `)` around the REGEX that should be selected.
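
The pointers above can be tried out with Python's built-in `re` module. The snippet below is only an illustrative sketch; the sample strings are made up for the example.

```python
import re

# "abc" followed by one or more digits.
digits = re.compile(r"abc\d+")
digit_results = {s: bool(digits.fullmatch(s)) for s in ["abc1", "abc2", "abc12", "ac12", "abc"]}
print(digit_results)

# An escaped dot matches only a literal ".", while a bare dot matches any character.
escaped = re.compile(r"1\.3")
bare = re.compile(r"1.3")
for s in ["1.3", "123", "1a3"]:
    print(s, bool(escaped.fullmatch(s)), bool(bare.fullmatch(s)))

# A capture group "selects" part of the matched text.
captured = re.fullmatch(r"abc(\d+)", "abc12").group(1)
print(captured)
```
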
Now for a few examples:
Let's say we have files `run325-a-1.vcf`, `run326-b-2.vcf`, and `run327-b-3.vcf`. If we know that all that matters is the final digit after `run[numbers]-[character]-` (e.g., the `1` in the first file), we can write `run\d+-\w-(\d)\.vcf`, which matches all 3 of the above examples and selects the last digit. The program can then use a given CSV to look up the selected digit and replace the name with what is given by the CSV.
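
The example above can be sketched in Python. This is a rough illustration under stated assumptions: the `lookup` dictionary stands in for the CSV, its values are invented, the helper name `new_name` is hypothetical, and the matched portion of the filename is what gets replaced (as the feature list describes).

```python
import re

pattern = re.compile(r"run\d+-\w-(\d)\.vcf")

# Hypothetical stand-in for the CSV: captured digit -> replacement text.
lookup = {"1": "sample_one", "2": "sample_two", "3": "sample_three"}


def new_name(filename):
    """Return the renamed filename, or None if the pattern or the lookup does not apply."""
    match = pattern.search(filename)
    if match is None or match.group(1) not in lookup:
        return None
    # Replace the matched portion with the value found for the captured digit.
    return pattern.sub(lambda m: lookup[m.group(1)], filename)


for name in ["run325-a-1.vcf", "run326-b-2.vcf", "run327-b-3.vcf"]:
    print(name, "->", new_name(name))
```

Because the pattern here covers the whole filename, including `.vcf`, the replacement drops the extension; the `-e` and `-k` options exist to add one back.
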
For learning and testing your own REGEX, check out [regex101.com](https://regex101.com/), which lets you enter both the REGEX and the strings you're trying to match. It shows live which parts of the strings match, if any.
## Not Working?
If the program is not working the way you would like, try running it with `-V DEBUG`, which increases verbosity. Typically, files not being renamed can be attributed to one of two problems:
1. It's looking in the wrong directory. The solution would be to double check that the directory it's looking in (printed by the program each run) is correct. If not, try adding quotes around the path in the command line.
2. The provided REGEX pattern isn't matching to any of the files. In this case, test one or two of the files at [regex101.com](https://regex101.com/) with your pattern.
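
As an offline alternative for the second problem, a few filenames can be checked against a pattern directly in Python. This is a small sketch, not part of the program; `report_matches` is a hypothetical helper, the filenames are made up, and the pattern is assumed to contain at least one capture group.

```python
import re


def report_matches(pattern, filenames):
    """Map each filename to the text of its first capture group, or None if it does not match."""
    compiled = re.compile(pattern)
    report = {}
    for name in filenames:
        match = compiled.search(name)
        report[name] = match.group(1) if match else None
    return report


names = ["group1-a-9.txt", "group2-b-10.txt", "notes.md"]
report = report_matches(r"group\d+-\w-(\d+)\.txt", names)
for name, captured in report.items():
    print(f"{name!r}: {captured!r}")
```

A filename reported as `None` is one the pattern does not match, so it would not be collected for renaming.
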

12
environment.yml Normal file

@@ -0,0 +1,12 @@
name: renamebycsv
channels:
- anaconda
- conda-forge
dependencies:
- python-build=0.7
- pytest=7.2
- python=3.11
- setuptools=67.6
- twine=4.0
- cryptography=38.0.4
prefix: ./env

3
pyproject.toml Normal file

@@ -0,0 +1,3 @@
[build-system]
build-backend = "setuptools.build_meta"
requires = ["setuptools", "wheel"]

103
renamebycsv/cli.py Executable file

@@ -0,0 +1,103 @@
#!/usr/bin/env python3
import argparse
import logging

from renamebycsv.renamer import find_all_candidates, rename_by_csv


def run(args):
    candidates = find_all_candidates(args.input_dir, args.pattern, args.recursive)
    if len(candidates):
        rename_by_csv(
            args.csv,
            candidates,
            args.current,
            args.become,
            args.dry,
            args.extension,
            args.keep_extension,
        )


def main():
    program_name = "renamebycsv"
    argparser = argparse.ArgumentParser(
        program_name, description="Rename all files by using a CSV as a dictionary."
    )
    argparser.add_argument(
        "input_dir",
        help="The directory containing the items that is to be renamed.",
        metavar="I",
    )
    argparser.add_argument(
        "pattern",
        help="The regex to apply to each file name. The first capture group is used to "
        "perform the replacement.",
        metavar="R",
    )
    argparser.add_argument(
        "csv",
        help="The CSV to use as the dictionary for the substitutions in file name.",
        metavar="C",
    )
    argparser.add_argument(
        "current",
        help="The column header to look for the text matched by the regex.",
        metavar="F",
    )
    argparser.add_argument(
        "become", help="The column header to replace the regex match.", metavar="T"
    )
    argparser.add_argument(
        "-r",
        "--recursive",
        help="Perform renaming action recursively.",
        action="store_true",
    )
    argparser.add_argument(
        "-f",
        "--force",
        help="Overwrite files if file already exists.",
        action="store_true",
    )
    argparser.add_argument(
        "-d", "--dry", help="Do not perform any renames", action="store_true"
    )
    argparser.add_argument(
        "-V",
        "--verbosity",
        help="Set the logging verbosity.",
        required=False,
        type=str,
        default="INFO",
    )
    argparser.add_argument(
        "-e",
        "--extension",
        help='Sets the new file extension after the renaming. Use empty string ("") '
        "to not add extension. Will use empty string by default.",
        type=str,
        default="",
        required=False,
    )
    argparser.add_argument(
        "-k",
        "--keep-extension",
        help="Keeps the OS recognized extension from the original filename. Will "
        'append to end of argument given by "-e" or "--extension".',
        action="store_true",
        default=False,
        required=False,
    )
    args = argparser.parse_args()
    logging.basicConfig(
        format="[%(filename)s %(asctime)s - %(levelname)s] %(message)s",
        level=args.verbosity.upper(),
    )
    run(args)


if __name__ == "__main__":
    main()
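
How `-e`/`--extension` and `-k`/`--keep-extension` combine can be sketched as below. The logic follows the extension handling added to `rename_by_csv` in this diff, but the standalone helper `apply_extensions` and its argument names are hypothetical, introduced only for illustration.

```python
import os


def apply_extensions(new_name, original_path, extension="", keep_extension=False):
    """Append the requested extension, then (optionally) the original file's extension."""
    result = new_name
    if extension:
        # Add the leading dot only when the user did not supply one.
        result += ("" if extension.startswith(".") else ".") + extension
    if keep_extension:
        # os.path.splitext returns (root, ext); ext includes the leading dot.
        result += os.path.splitext(original_path)[1]
    return result


print(apply_extensions("a", "group1-a-9.txt"))
print(apply_extensions("a", "group1-a-9.txt", extension="abc"))
print(apply_extensions("a", "group1-a-9.txt", extension="abc", keep_extension=True))
```

With both options set, the value given to `-e` comes first and the original extension is appended after it.
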

110
renamebycsv.py → renamebycsv/renamer.py Executable file → Normal file

@@ -1,14 +1,17 @@
#!/usr/bin/env python3
import argparse
import csv
import logging
import os
import re
from typing import Iterable
import logging
def find_all_candidates(input_dir: str, regex: str, recursive: bool):
logging.info(
'Searching "%s" for files that match "%s" %s',
input_dir,
regex,
"recursively" if recursive else "non-recursively",
)
results = []
for subitem in os.listdir(input_dir):
subitem_path = os.path.join(input_dir, subitem)
@@ -22,15 +25,26 @@ def find_all_candidates(input_dir: str, regex: str, recursive: bool):
continue
results.append((subitem_path, subitem, match))
logging.debug(f'Collecting "{subitem}"...')
if len(results) < 1:
logging.info(
'No results found matching "%s" in "%s". Please double check your REGEX '
"pattern and directory being searched.",
regex,
input_dir,
)
else:
logging.info("Collected %d files to rename.", len(results))
return results
def rename(
def rename_by_csv(
csv_path: str,
candidates: Iterable[tuple[str, str, re.Match]],
current: str,
become: str,
dry: bool,
extension: str,
keep_extension: bool,
):
replacement_dict = {}
with open(csv_path, "r") as csv_fd:
@@ -39,6 +53,10 @@ def rename(
become_col_ind = None
for row in reader:
if current_col_ind is None and become_col_ind is None:
if current not in row:
logging.error("\"%s\" not in header %s.", current, list(row))
if become not in row:
logging.error("\"%s\" not in header %s.", become, list(row))
current_col_ind = row.index(current)
become_col_ind = row.index(become)
continue
@@ -46,18 +64,30 @@ def rename(
row[current_col_ind] in replacement_dict
and replacement_dict[row[current_col_ind]] != row[become_col_ind]
):
# Check if there's a duplicate key for different values.
raise Exception("Duplicate current key.")
replacement_dict[row[current_col_ind]] = row[become_col_ind]
for subitem_path, subitem, match in candidates:
if match.group(1) not in replacement_dict:
logging.warning(
'Group "%s" was not matched to any row in the provided CSV. '
"Skipping...",
match.group(1),
)
continue
original = subitem_path
objective = os.path.join(
os.path.dirname(subitem_path),
re.sub(match.re, replacement_dict[match.group(1)], subitem),
re.sub(match.re, replacement_dict[match.group(1)], subitem.strip()),
)
if extension:
objective += ("." if not extension.startswith(".") else "") + extension
if keep_extension:
objective += os.path.splitext(subitem_path)[1]
logging.info(f'Will rename "{original}" to "{os.path.basename(objective)}"')
if os.path.exists(objective):
logging.error(
f'Path at "{objective}" exists, not continuing. '
f'Path at "{objective}" already exists, not continuing. '
"Use -f to overwrite instead of stopping."
)
exit(1)
@@ -65,69 +95,3 @@ def rename(
os.rename(original, objective)
if dry:
logging.info("No file names were modified.")
def main():
program_name = "renamebycsv"
argparser = argparse.ArgumentParser(
program_name, "Rename all files by using a CSV as a dictionary."
)
argparser.add_argument(
"input_dir",
help="The directory containing the items that is to be renamed.",
metavar="I",
)
argparser.add_argument(
"regex",
help="The regex to apply to each file name. The first capture group is used to "
"perform the replacement.",
metavar="R",
)
argparser.add_argument(
"csv",
help="The CSV to use as the dictionary for the substitutions in file name.",
metavar="C",
)
argparser.add_argument(
"current",
help="The column header to look for the text matched by the regex.",
metavar="F",
)
argparser.add_argument(
"become", help="The column header to replace the regex match.", metavar="T"
)
argparser.add_argument(
"-r",
"--recursive",
help="Perform renaming action recursively",
action="store_true",
)
argparser.add_argument(
"-f",
"--force",
help="Overwrite files if file already exists",
action="store_true",
)
argparser.add_argument(
"-d", "--dry", help="Do not perform any renames", action="store_true"
)
argparser.add_argument(
"-V",
"--verbosity",
help="Set the logging verbosity",
required=False,
type=str,
default="INFO",
)
args = argparser.parse_args()
logging.basicConfig(
format="[%(filename)s %(asctime)s - %(levelname)s] %(message)s",
level=args.verbosity.upper(),
)
candidates = find_all_candidates(args.input_dir, args.regex, args.recursive)
rename(args.csv, candidates, args.current, args.become, args.dry)
if __name__ == "__main__":
main()

5
requirements.txt Normal file

@@ -0,0 +1,5 @@
build
pytest
setuptools
twine
cryptography

10
setup.cfg Normal file

@@ -0,0 +1,10 @@
[metadata]
name = renamebycsv
version = 0.0.8
[options]
packages = renamebycsv
[options.entry_points]
console_scripts =
    renamebycsv = renamebycsv.cli:main

2
setup.py Normal file

@@ -0,0 +1,2 @@
from setuptools import setup
setup()


@@ -0,0 +1 @@
Text


@@ -0,0 +1,2 @@
Text


@@ -0,0 +1 @@
Text


@@ -0,0 +1 @@
Text


@@ -0,0 +1 @@
Text


@@ -0,0 +1 @@
Text


@@ -0,0 +1 @@
Text


@@ -0,0 +1,6 @@
target,replaced
9,a
10,b
11,c
12,d
13,e