Compare commits: 0.4.3...af9c8c70b8 (3 commits)

Commits: af9c8c70b8, 319edf36af, 43a17d698b

README.md (15 changed lines)
@@ -12,6 +12,7 @@ This CLI is capable of exactly what [autoBIGS.engine](https://pypi.org/project/a
 - Fetch the available BIGSdb database schemas for a given MLST database
 - Retrieve exact/non-exact MLST allele variant IDs based off a sequence
 - Retrieve MLST sequence type IDs based off a sequence
+  - Inexact matches are annotated with an asterisk (\*)
 - Output all results to a single CSV
 
 ## Planned Features for CLI
@@ -40,6 +41,18 @@ Let's say you have a fasta called `seq.fasta` which contains several sequences.
 
 3. Then, run `autobigs st -h` and familiarize yourself with the parameters needed for sequence typing.
 
-4. Namely, you should find that you will need to run `autobigs st seq.fasta pubmlst_bordetella_seqdef 3 output.csv`. You can optionally include multiple `FASTA` files, and/or `--exact` to only retrieve exact sequence types, and/or `--stop-on-fail` to stop typing if one of your sequences fails to retrieve any type.
+4. Namely, you should find that you will need to run `autobigs st seq.fasta pubmlst_bordetella_seqdef 3 output.csv`. You can optionally include multiple `FASTA` files, and `--stop-on-fail` to stop typing if one of your sequences fails to retrieve any type.
 
 5. Sit tight, and wait. The `output.csv` will contain your results once completed.
 
+## Versioning
+
+The autoBIGS project follows [semantic versioning](https://semver.org/), where the three numbers may be interpreted as MAJOR.MINOR.PATCH.
+
+Note that, regarding major version 0 ([spec item 4](https://semver.org/#spec-item-4)), the following adaptation of the semantic versioning definition applies:
+
+1. Given x.Y.z, Y is only incremented when a backwards incompatible change is made.
+
+2. Given x.y.Z, Z is only incremented when a backwards compatible change is made.
+
+Versions of autoBIGS items with a major version number of 0 will introduce numerous changes and patches. As such, changes between such versions should be considered highly variable.
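The usage steps in the README hunk above amount to one CLI invocation followed by reading `output.csv`. Here is a minimal sketch of that workflow driven from Python; the `autobigs st` arguments and the `--stop-on-fail` flag come straight from the README text, while the choice to run it via `subprocess` and to treat the CSV generically (no assumed column names) is mine.

```python
import csv
import subprocess

# Step 4 of the README: type the sequences in seq.fasta against schema 3 of
# pubmlst_bordetella_seqdef, writing results to output.csv.
subprocess.run(
    ["autobigs", "st", "seq.fasta", "pubmlst_bordetella_seqdef", "3", "output.csv", "--stop-on-fail"],
    check=True,
)

# Step 5: once the command finishes, the results are in output.csv.
with open("output.csv", newline="") as handle:
    for row in csv.reader(handle):
        print(row)  # each row is a list of column values; the exact columns are defined by the CLI
```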
@@ -1,5 +1,7 @@
 from argparse import ArgumentParser, Namespace
 import asyncio
+import csv
+from os import path
 from autobigs.engine.analysis.bigsdb import BIGSdbIndex
 
 def setup_parser(parser: ArgumentParser):
@@ -24,6 +26,14 @@ def setup_parser(parser: ArgumentParser):
         help="Lists the known schema IDs for a given BIGSdb sequence definition database name. The name, and then the ID of the schema is given."
     )
 
+    parser.add_argument(
+        "--csv-prefix", "-o",
+        dest="csv_output",
+        required=False,
+        default=None,
+        help="Output list as CSV at a given path. A suffix is added depending on the action taken."
+    )
+
     parser.set_defaults(run=run_asynchronously)
     return parser
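The new `--csv-prefix`/`-o` option's help text says a suffix is added depending on the action taken; the hunk below derives those paths with `os.path.splitext`. A small sketch of that naming rule (the helper name here is mine, not part of the CLI):

```python
from os import path

def derive_csv_path(csv_prefix: str, suffix: str) -> str:
    # Mirrors the pattern used in the diff: strip the extension, then append "_<suffix>.csv".
    return path.splitext(csv_prefix)[0] + "_" + suffix + ".csv"

print(derive_csv_path("results.csv", "dbs"))      # results_dbs.csv (database listing)
print(derive_csv_path("results.csv", "schemas"))  # results_schemas.csv (schema listing)
```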
@@ -31,15 +41,29 @@ async def run(args: Namespace):
     async with BIGSdbIndex() as bigsdb_index:
         if args.list_dbs:
             known_seqdef_dbs = await bigsdb_index.get_known_seqdef_dbs(force=False)
-            print("The following are all known BIGS database names (sorted alphabetically):")
-            print("\n".join(sorted(known_seqdef_dbs.keys())))
+            sorted_seqdef_dbs = [(name, source) for name, source in sorted(known_seqdef_dbs.items())]
+            print("The following are all known BIGS database names, and their source (sorted alphabetically):")
+            print("\n".join(["{0}: {1}".format(name, source) for name, source in sorted_seqdef_dbs]))
+            if args.csv_output:
+                dbs_csv_path = path.splitext(args.csv_output)[0] + "_" + "dbs.csv"
+                with open(dbs_csv_path, "w") as csv_out_handle:
+                    writer = csv.writer(csv_out_handle)
+                    writer.writerow(("BIGSdb Names", "Source"))
+                    writer.writerows(sorted_seqdef_dbs)
+                print("\nDatabase output written to {0}".format(dbs_csv_path))
 
         for bigsdb_schema_name in args.list_bigsdb_schemas:
             schemas = await bigsdb_index.get_schemas_for_seqdefdb(bigsdb_schema_name)
+            sorted_schemas = [(name, id) for name, id in sorted(schemas.items())]
             print("The following are the known schemas for \"{0}\", and their associated IDs:".format(bigsdb_schema_name))
-            for schema_desc, schema_id in schemas.items():
-                print(f"{schema_desc}: {schema_id}")
+            print("\n".join(["{0}: {1}".format(name, id) for name, id in sorted_schemas]))
+            if args.csv_output:
+                schema_csv_path = path.splitext(args.csv_output)[0] + "_" + "schemas.csv"
+                with open(schema_csv_path, "w") as csv_out_handle:
+                    writer = csv.writer(csv_out_handle)
+                    writer.writerow(("Name", "ID"))
+                    writer.writerows(sorted_schemas)
+                print("\nSchema list output written to {0}".format(schema_csv_path))
 
         if not (args.list_dbs or len(args.list_bigsdb_schemas) > 0):
             print("Nothing to do. Try specifying \"-l\" for a list of known databases, or \"-h\" for more information.")
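The database listing above is written with a fixed two-column header, `BIGSdb Names` and `Source`, so it can be loaded back with `csv.DictReader`. A minimal sketch, assuming the list-databases action was run with `-o results.csv` (so the file is `results_dbs.csv`):

```python
import csv

# Header columns taken from writer.writerow(("BIGSdb Names", "Source")) in the diff above.
with open("results_dbs.csv", newline="") as handle:
    for record in csv.DictReader(handle):
        print(record["BIGSdb Names"], "->", record["Source"])
```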
@@ -50,7 +50,7 @@ async def run(args: Namespace):
     async with BIGSdbIndex() as bigsdb_index:
         gen_strings = read_multiple_fastas(args.fastas)
         async with await bigsdb_index.build_profiler_from_seqdefdb(False, args.seqdefdb, args.schema) as mlst_profiler:
-            mlst_profiles = mlst_profiler.profile_multiple_strings(gen_strings)
+            mlst_profiles = mlst_profiler.profile_multiple_strings(gen_strings, args.stop_on_fail)
             failed = await write_mlst_profiles_as_csv(mlst_profiles, args.out)
             if len(failed) > 0:
                 print(f"A total of {len(failed)} IDs failed (no profile found):\n{"\n".join(failed)}")
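For completeness, a hedged sketch of calling this `run` coroutine directly with an `argparse.Namespace`: the attribute names mirror those referenced in the hunk (`fastas`, `seqdefdb`, `schema`, `out`, `stop_on_fail`), but the import path and the argument types are assumptions on my part, not something the diff confirms.

```python
import asyncio
from argparse import Namespace

# Assumption: the st command's run() lives at this path; adjust to the actual module.
from autobigs.cli.st import run

args = Namespace(
    fastas=["seq.fasta"],                  # one or more FASTA inputs
    seqdefdb="pubmlst_bordetella_seqdef",  # BIGSdb sequence definition database
    schema=3,                              # schema ID (int assumed; may be a string)
    out="output.csv",                      # destination CSV
    stop_on_fail=False,                    # matches the new profile_multiple_strings argument
)

asyncio.run(run(args))
```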