Added a comment

Changed steps to use native credential manager
Improved logging and reduced exceptions.
15 changed files with 192 additions and 93 deletions
--- a/.vscode/launch.json
+++ b/.vscode/launch.json
@@ -0,0 +1,26 @@
+{
+    // Use IntelliSense to learn about possible attributes.
+    // Hover to view descriptions of existing attributes.
+    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
+    "version": "0.2.0",
+    "configurations": [
+        {
+            "name": "Python: Module",
+            "type": "python",
+            "request": "launch",
+            "module": "renamebycsv.cli",
+            "args": [
+                "${workspaceFolder}/tests/resources/files",
+                "group\\d+-\\w-(\\d+)\\.txt",
+                "${workspaceFolder}/tests/resources/groups.csv",
+                "target",
+                "replaced",
+                "-d",
+                "-e",
+                "abc",
+                "-k"
+            ],
+            "justMyCode": true
+        }
+    ]
+}
--- a/20
+++ b/20
@@ -1,41 +1,37 @@
 pipeline {
    agent any
    stages {
-        stage("clean") {
-            steps {
-                sh 'rm -rf ./dist/*'
-            }
-        }
        stage("install") {
            steps {
-                sh 'mamba env update --file environment.yml'
-                sh 'echo "mamba activate renamebycsv" >> ~/.bashrc'
+                sh 'mamba env update --file environment.yml --prefix ./env || mamba env create --force --file environment.yml --prefix ./env'
            }
        }
        stage("build") {
            steps {
+                sh 'rm -rf ./dist/*'
                sh "python -m build"
            }
        }
        stage("test") {
            steps {
-                sh "pip install dist/*.whl"
+                sh "pip install dist/*.whl --force-reinstall"
                sh "renamebycsv -h"
            }
        }
        stage("archive") {
            steps {
-                archiveArtifacts artifacts: 'dist/*.tar.gz, dist/*.whl'
+                archiveArtifacts artifacts: 'dist/*.tar.gz, dist/*.whl', fingerprint: true, followSymlinks: false, onlyIfSuccessful: true
            }
        }
        stage("publish package") {
+            environment {
+                CREDS = credentials('rs-git-package-registry-ydeng')
+            }
            when {
                branch '**/master'
            }
            steps {
-                withCredentials([usernamePassword(credentialsId: 'rs-git-package-registry-ydeng', passwordVariable: 'PASS', usernameVariable: 'USER')]) {
-                    sh "python -m twine upload --repository-url https://git.reslate.systems/api/packages/${USER}/pypi -u ${USER} -p ${PASS} --non-interactive --disable-progress-bar --verbose dist/*"
-                }
+                sh returnStatus: true, script: 'python -m twine upload --repository-url https://git.reslate.systems/api/packages/${CREDS_USR}/pypi -u ${CREDS_USR} -p ${CREDS_PSW} --non-interactive --disable-progress-bar --verbose dist/*'
            }
        }
    }
--- a/README.md
+++ b/README.md
@@ -5,8 +5,10 @@ A simple program that renames files by using a spreadsheet in CSV format as a di
 ## Features

 - Rename files recursively within a directory to a desired string
+ - Replace only the REGEX match portion
 - Desired string is set by a CSV where one column is the original string, and another column is the string to replace the original string with
 - Uses a REGEX capture group to select file and the portion of the filename to rename
+ - Ability to define file extension

 ## Installing using `pip`

@@ -14,7 +16,6 @@ A simple program that renames files by using a spreadsheet in CSV format as a di

 2. Run `renamebycsv -h` to see the help and confirm installation was successful.

-
 ## Advanced Usage: What is REGEX?

 This program makes heavy use of REGEX, also known as Regular Expression to give users the flexibility to choose which portion of any given filename should be the portion used by the program to look up in the CSV. It is therefore critical for users of this script to understand how REGEX works. Here are some key pointers to get you started:
@@ -23,9 +24,19 @@ This program makes heavy use of REGEX, also known as Regular Expression to give
 - Where it differs is the ability to use one REGEX string to match many strings.
   - i.e, the REGEX "`abc\d+`" will match with "`abc1`", "`abc2`", "`abc12`", but not "`ac12`" or "`abc`".
 - Many characters can be used as normal and will match a string literally (character for character), but some will be treated as special characters (such as the previously used `\`, which indicates that the letter afterwards should be treated specially, such as a token)
-   - Common tokens to be aware of: `.` for any character, `\d` for single digits, `\w` for word characters, `\s` for space characters (tabs, spaces, linebreaks, etc.). Tokens can be repeated by using `+`, indicating "one or more", `*` indicating "none or more".
+   - Common tokens to be aware of: `.` for any character, `\d` for single digits, `\w` for word characters, `\s` for space characters (tabs, spaces, linebreaks, etc.). Tokens can be repeated by using `+`, indicating "one or more", `*` indicating "none or more". If you want to match something that is read as a token by default, such as `.` or `+`, putting a `\` in front of it will cause it to be matched literally, i.e., `1\.3` matches `1.3`, but not `123`, `1a3`, etc.
 - A capture group is a way of "selecting" a part of a text and is formed by using `(` and `)` around the REGEX that should be selected.

 Now for a few examples:

-Let's say we have files `run325-a-1.vcf`, `run326-b-2.vcf`, and `run327-b-3.vcf`. If we know that all that matters is the `1` after the `run[numbers]-[character]-`, we can write `run\d+-\w-(\d).vcf` which will match with all 3 of the above examples, and select the last digit. The program can then use a given CSV to look up the selected digits and replace the name with what is given by the CSV.
+Let's say we have files `run325-a-1.vcf`, `run326-b-2.vcf`, and `run327-b-3.vcf`. If we know that all that matters is the `1` after the `run[numbers]-[character]-`, we can write `run\d+-\w-(\d)\.vcf` which will match with all 3 of the above examples, and select the last digit. The program can then use a given CSV to look up the selected digits and replace the name with what is given by the CSV.
+
+For learning and testing your own REGEX, check out [regex101.com](https://regex101.com/), which lets you enter both the REGEX and the strings you're trying to match. It shows you live which parts of each string match, if any.
+
+## Not Working?
+
+If the program is not working the way you would like, try running it with `-V DEBUG`, which increases verbosity. Typically, files not being renamed can be attributed to one of two problems:
+
+1. It's looking in the wrong directory. The solution would be to double check that the directory it's looking in (printed by the program each run) is correct. If not, try adding quotes around the path in the command line.
+
+2. The provided REGEX pattern isn't matching to any of the files. In this case, test one or two of the files at [regex101.com](https://regex101.com/) with your pattern.
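
The README's REGEX example can also be checked directly in Python. The following is a minimal sketch, not part of the repository: the extra `notes.txt` entry and the variable names are assumptions added for illustration.

```python
import re

# Pattern from the README example: the parentheses capture the final digit,
# and "\." matches a literal dot rather than "any character".
pattern = re.compile(r"run\d+-\w-(\d)\.vcf")

# "notes.txt" is an illustrative non-matching name, not from the README.
filenames = ["run325-a-1.vcf", "run326-b-2.vcf", "run327-b-3.vcf", "notes.txt"]

captured = {}
for name in filenames:
    match = pattern.match(name)
    if match:
        # group(1) is the text selected by the first capture group.
        captured[name] = match.group(1)

print(captured)

# Without the backslash, "." is a token that matches any character.
unescaped_matches = re.match(r"1.3", "1a3") is not None
# With the backslash, only a literal dot is accepted.
escaped_matches = re.match(r"1\.3", "1a3") is not None
print(unescaped_matches, escaped_matches)
```

Only the three `.vcf` names end up in `captured`, each mapped to its last digit, which is the key the program would look up in the CSV.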
--- a/environment.yml
+++ b/environment.yml
@@ -8,4 +8,5 @@ dependencies:
  - python=3.11
  - setuptools=67.6
  - twine=4.0
-  - cryptography=38.0.4
+  - cryptography=38.0.4
+prefix: ./env
--- a/renamebycsv/cli.py
+++ b/renamebycsv/cli.py
@@ -1,86 +1,29 @@
 #!/usr/bin/env python3

 import argparse
-import csv
-import os
-import re
-from typing import Iterable
 import logging

-
-def find_all_candidates(input_dir: str, regex: str, recursive: bool):
-    results = []
-    for subitem in os.listdir(input_dir):
-        subitem_path = os.path.join(input_dir, subitem)
-        match = re.match(regex, subitem)
-        if os.path.isdir(subitem_path) and recursive:
-            logging.debug(f'Checking directory "{subitem}"...')
-            results.extend(find_all_candidates(subitem_path, regex, recursive))
-        else:
-            if not match:
-                logging.debug(f'Ignoring "{subitem}"...')
-                continue
-            results.append((subitem_path, subitem, match))
-            logging.debug(f'Collecting "{subitem}"...')
-    return results
-
-
-def rename(
-    csv_path: str,
-    candidates: Iterable[tuple[str, str, re.Match]],
-    current: str,
-    become: str,
-    dry: bool,
-    keep_extension: bool,
-):
-    replacement_dict = {}
-    with open(csv_path, "r") as csv_fd:
-        reader = csv.reader(csv_fd)
-        current_col_ind = None
-        become_col_ind = None
-        for row in reader:
-            if current_col_ind is None and become_col_ind is None:
-                current_col_ind = row.index(current)
-                become_col_ind = row.index(become)
-                continue
-            if (
-                row[current_col_ind] in replacement_dict
-                and replacement_dict[row[current_col_ind]] != row[become_col_ind]
-            ):
-                raise Exception("Duplicate current key.")
-            replacement_dict[row[current_col_ind]] = row[become_col_ind]
-    for subitem_path, subitem, match in candidates:
-        original = subitem_path
-        objective = os.path.join(
-            os.path.dirname(subitem_path),
-            re.sub(match.re, replacement_dict[match.group(1)], subitem),
-        )
-        if keep_extension:
-            objective += os.path.splitext(subitem_path)[1]
-        logging.info(f'Will rename "{original}" to "{os.path.basename(objective)}"')
-        if os.path.exists(objective):
-            logging.error(
-                f'Path at "{objective}" exists, not continuing. '
-                "Use -f to overwrite instead of stopping."
-            )
-            exit(1)
-        if not dry:
-            os.rename(original, objective)
-    if dry:
-        logging.info("No file names were modified.")
+from renamebycsv.renamer import find_all_candidates, rename_by_csv


 def run(args):
-    candidates = find_all_candidates(args.input_dir, args.regex, args.recursive)
-    rename(
-        args.csv, candidates, args.current, args.become, args.dry, args.keep_extension
-    )
+    candidates = find_all_candidates(args.input_dir, args.pattern, args.recursive)
+    if len(candidates):
+        rename_by_csv(
+            args.csv,
+            candidates,
+            args.current,
+            args.become,
+            args.dry,
+            args.extension,
+            args.keep_extension,
+        )


 def main():
    program_name = "renamebycsv"
    argparser = argparse.ArgumentParser(
-        program_name, "Rename all files by using a CSV as a dictionary."
+        program_name, description="Rename all files by using a CSV as a dictionary."
    )
    argparser.add_argument(
        "input_dir",
@@ -88,7 +31,7 @@ def main():
        metavar="I",
    )
    argparser.add_argument(
-        "regex",
+        "pattern",
        help="The regex to apply to each file name. The first capture group is used to "
        "perform the replacement.",
        metavar="R",
@@ -109,13 +52,13 @@ def main():
    argparser.add_argument(
        "-r",
        "--recursive",
-        help="Perform renaming action recursively",
+        help="Perform renaming action recursively.",
        action="store_true",
    )
    argparser.add_argument(
        "-f",
        "--force",
-        help="Overwrite files if file already exists",
+        help="Overwrite files if file already exists.",
        action="store_true",
    )
    argparser.add_argument(
@@ -124,17 +67,28 @@ def main():
    argparser.add_argument(
        "-V",
        "--verbosity",
-        help="Set the logging verbosity",
+        help="Set the logging verbosity.",
        required=False,
        type=str,
        default="INFO",
    )
+    argparser.add_argument(
+        "-e",
+        "--extension",
+        help='Sets the new file extension after the renaming. Use empty string ("") '
+        "to not add extension. Will use empty string by default.",
+        type=str,
+        default="",
+        required=False,
+    )
    argparser.add_argument(
        "-k",
        "--keep-extension",
-        help="Keeps the original file's extension by appending it to the end of the "
-        "name defined by the CSV.",
+        help="Keeps the OS recognized extension from the original filename. Will "
+        'append to end of argument given by "-e" or "--extension".',
        action="store_true",
+        default=False,
+        required=False,
    )

    args = argparser.parse_args()
--- a/renamebycsv/renamer.py
+++ b/renamebycsv/renamer.py
@@ -0,0 +1,97 @@
+import csv
+import logging
+import os
+import re
+from typing import Iterable
+
+
+def find_all_candidates(input_dir: str, regex: str, recursive: bool):
+    logging.info(
+        'Searching "%s" for files that match "%s" %s',
+        input_dir,
+        regex,
+        "recursively" if recursive else "non-recursively",
+    )
+    results = []
+    for subitem in os.listdir(input_dir):
+        subitem_path = os.path.join(input_dir, subitem)
+        match = re.match(regex, subitem)
+        if os.path.isdir(subitem_path) and recursive:
+            logging.debug(f'Checking directory "{subitem}"...')
+            results.extend(find_all_candidates(subitem_path, regex, recursive))
+        else:
+            if not match:
+                logging.debug(f'Ignoring "{subitem}"...')
+                continue
+            results.append((subitem_path, subitem, match))
+            logging.debug(f'Collecting "{subitem}"...')
+    if len(results) < 1:
+        logging.info(
+            'No results found matching "%s" in "%s". Please double check your REGEX '
+            "pattern and directory being searched.",
+            regex,
+            input_dir,
+        )
+    else:
+        logging.info("Collected %d files to rename.", len(results))
+    return results
+
+
+def rename_by_csv(
+    csv_path: str,
+    candidates: Iterable[tuple[str, str, re.Match]],
+    current: str,
+    become: str,
+    dry: bool,
+    extension: str,
+    keep_extension: bool,
+):
+    replacement_dict = {}
+    with open(csv_path, "r") as csv_fd:
+        reader = csv.reader(csv_fd)
+        current_col_ind = None
+        become_col_ind = None
+        for row in reader:
+            if current_col_ind is None and become_col_ind is None:
+                if current not in row:
+                    logging.error("\"%s\" not in header %s.", current, list(row))
+                if become not in row:
+                    logging.error("\"%s\" not in header %s.", become, list(row))
+                current_col_ind = row.index(current)
+                become_col_ind = row.index(become)
+                continue
+            if (
+                row[current_col_ind] in replacement_dict
+                and replacement_dict[row[current_col_ind]] != row[become_col_ind]
+            ):
+                # Check if there's a duplicate key for different values.
+                raise Exception("Duplicate current key.")
+            replacement_dict[row[current_col_ind]] = row[become_col_ind]
+    for subitem_path, subitem, match in candidates:
+        if match.group(1) not in replacement_dict:
+            logging.warning(
+                'Group "%s" was not matched to any row in the provided CSV. '
+                "Skipping...",
+                match.group(1),
+            )
+            continue
+        original = subitem_path
+        objective = os.path.join(
+            os.path.dirname(subitem_path),
+            re.sub(match.re, replacement_dict[match.group(1)], subitem.strip()),
+        )
+        if extension:
+            objective += ("." if not extension.startswith(".") else "") + extension
+        if keep_extension:
+            objective += os.path.splitext(subitem_path)[1]
+        logging.info(f'Will rename "{original}" to "{os.path.basename(objective)}"')
+        if os.path.exists(objective):
+            logging.error(
+                f'Path at "{objective}" already exists, not continuing. '
+                "Use -f to overwrite instead of stopping."
+            )
+            exit(1)
+        if not dry:
+            os.rename(original, objective)
+    if dry:
+        logging.info("No file names were modified.")
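
In the header check of `rename_by_csv` above, a missing column is logged, but the following `row.index(...)` call still raises `ValueError`. One possible way to fail with a single clear message is sketched below. `load_replacements` is a hypothetical helper, not a function in this repository, and raising `ValueError` is an assumption about the desired behaviour.

```python
import csv
import io


def load_replacements(csv_file, current: str, become: str) -> dict[str, str]:
    """Hypothetical helper: build the lookup table from an open CSV file."""
    reader = csv.reader(csv_file)
    header = next(reader, None)
    if header is None:
        raise ValueError("CSV is empty.")
    # Report every missing column at once instead of failing on the first.
    missing = [col for col in (current, become) if col not in header]
    if missing:
        raise ValueError(f"Columns {missing} not in header {header}.")
    current_ind, become_ind = header.index(current), header.index(become)
    replacements: dict[str, str] = {}
    for row in reader:
        if not row:
            continue  # csv.reader yields [] for blank lines
        key, value = row[current_ind], row[become_ind]
        # Same rule as the diff: a repeated key is fine unless its value differs.
        if replacements.get(key, value) != value:
            raise ValueError(f'Duplicate key "{key}" with different values.')
        replacements[key] = value
    return replacements


# In-memory stand-in for a CSV file, using the column names from groups.csv.
sample = io.StringIO("target,replaced\n9,a\n10,b\n")
table = load_replacements(sample, "target", "replaced")
print(table)
```

Separating the CSV parsing from the renaming loop would also make the lookup table testable without touching the filesystem.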
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,6 +1,6 @@
 [metadata]
 name = renamebycsv
-version = 0.0.3
+version = 0.0.8

 [options]
 packages = renamebycsv
--- a/tests/resources/files/foo.txt
+++ b/tests/resources/files/foo.txt
@@ -0,0 +1 @@
+Text
--- a/tests/resources/files/group1-a-12.txt
+++ b/tests/resources/files/group1-a-12.txt
@@ -0,0 +1,2 @@
+
+Text
--- a/tests/resources/files/group1-a-13.txt
+++ b/tests/resources/files/group1-a-13.txt
@@ -0,0 +1 @@
+Text
--- a/tests/resources/files/group1-b-10.txt
+++ b/tests/resources/files/group1-b-10.txt
@@ -0,0 +1 @@
+Text
--- a/tests/resources/files/group1-b-11.txt
+++ b/tests/resources/files/group1-b-11.txt
@@ -0,0 +1 @@
+Text
--- a/tests/resources/files/group1-b-14.txt
+++ b/tests/resources/files/group1-b-14.txt
@@ -0,0 +1 @@
+Text
--- a/tests/resources/files/group1-b-9.txt
+++ b/tests/resources/files/group1-b-9.txt
@@ -0,0 +1 @@
+Text
--- a/tests/resources/groups.csv
+++ b/tests/resources/groups.csv
@@ -0,0 +1,6 @@
+target,replaced
+9,a
+10,b
+11,c
+12,d
+13,e
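
To see what the new test resources would produce with the arguments in `launch.json` (`-e abc -k`), the name-building steps of `rename_by_csv` can be mirrored in memory. This is a sketch only: `plan_new_name` is a hypothetical helper, and the dictionary is hand-copied from `groups.csv` rather than read from disk.

```python
import os
import re

# Pattern from launch.json after JSON unescaping.
pattern = re.compile(r"group\d+-\w-(\d+)\.txt")
# Hand-copied from tests/resources/groups.csv (target -> replaced).
replacements = {"9": "a", "10": "b", "11": "c", "12": "d", "13": "e"}


def plan_new_name(filename, extension="", keep_extension=False):
    """Hypothetical helper: return the planned name, or None if skipped."""
    match = pattern.match(filename)
    if match is None or match.group(1) not in replacements:
        return None
    # The whole matched portion is replaced, not only the capture group.
    new_name = pattern.sub(replacements[match.group(1)], filename.strip())
    if extension:
        new_name += ("" if extension.startswith(".") else ".") + extension
    if keep_extension:
        new_name += os.path.splitext(filename)[1]
    return new_name


for name in ["foo.txt", "group1-b-9.txt", "group1-a-12.txt", "group1-b-14.txt"]:
    print(name, "->", plan_new_name(name, extension="abc", keep_extension=True))
```

Under these assumptions `group1-b-9.txt` becomes `a.abc.txt`, while `foo.txt` (no pattern match) and `group1-b-14.txt` (no `14` row in the CSV) are skipped.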
Author	SHA1	Message	Date
Harrison Deng	eacb730961	Added a comment	2023-09-11 21:55:30 +00:00
Harrison Deng	844cf4b2de	Changed steps to use native credential manager	2023-09-11 20:52:34 +00:00
Harrison Deng	434f969556	Improved logging and reduced exceptions. Also bumped version.	2023-09-11 07:59:56 +00:00
Harrison	d98801dd66	Updated pipeline to use latest build container image features	2023-05-03 08:37:35 -05:00
Harrison	83639a10e2	Added info to the 'README.md' regarding escape character	2023-04-26 13:59:58 -05:00
Harrison	34e5b107ff	Updated 'README.md' with more help	2023-04-26 13:52:01 -05:00
Harrison	70af81ed84	Preparing for 0.0.7 release	2023-04-26 13:47:34 -05:00
Harrison	b745915e49	Added some more logging to the INFO level	2023-04-26 13:47:09 -05:00
Harrison	c6d79c9eb1	Fixed bug where CLI wouldn't run. Caused by wrong argument name parameter for run function	2023-04-26 13:44:27 -05:00
Harrison	682503a24a	Minor change to CLI help menu	2023-04-26 03:12:33 -05:00
Harrison	4bf334c9d5	Restructured code structure	2023-04-26 03:07:09 -05:00
Harrison	90a1db4f0c	Began work on 0.0.6	2023-04-26 03:06:25 -05:00
Harrison	a3bb168c14	Publish stage will not fail on publish failure	2023-04-25 10:47:00 -05:00
Harrison	f4fe30ce9f	Added regex101.com to 'README.md'	2023-04-25 10:40:00 -05:00
Harrison	c84e0d8c4c	Changed publish stage to use single quotes in sh step	2023-04-25 09:55:09 -05:00
Harrison	5c3431428f	Added -k to 'launch.json'	2023-04-25 09:51:00 -05:00
Harrison	028c93eb80	Fixed code formatting	2023-04-25 09:48:35 -05:00
Harrison	352af7da14	Added ability to define extension to append	2023-04-25 09:48:07 -05:00
Harrison	f4d9c37687	Updated help menu to be more consistent	2023-04-25 09:30:57 -05:00
Harrison	7bab8a9436	Added rudimentary VSCode launch for development	2023-04-25 09:17:27 -05:00
Harrison	f31b1b2705	Added check for non-existing keys in CSV dictionary	2023-04-25 09:15:26 -05:00
Harrison	7b7f6438d4	Test installation now forces reinstall	2023-04-24 12:25:55 -05:00