Updated README.md to explain versioning

Writing now uses named MLST profile
Multiple string profiling now respects grouped queries (for non-WGS)
7 changed files with 49 additions and 28 deletions
--- a/README.md
+++ b/README.md
@@ -1,7 +1,8 @@
-# autoBIGS.Engine
+# autoBIGS.engine
 A python library implementing common BIGSdb MLST schemes and databases accesses for the purpose of typing sequences automatically. Implementation follows the RESTful API outlined by the official [BIGSdb documentation](https://bigsdb.readthedocs.io/en/latest/rest.html) up to `V1.50.0`.
 ## Features
 Briefly, this library can:
@@ -22,4 +23,16 @@ Then, it's as easy as running `pip install autobigs-engine` in any terminal that
 ### CLI usage
-This is a independent python library and thus does not have any form of direct user interface. One way of using it could be to create your own Python script that makes calls to this libraries functions. Alternatively, you may use `autobigs-cli`, a `Python` package that implements a CLI for calling this library.
+This is a independent python library and thus does not have any form of direct user interface. One way of using it could be to create your own Python script that makes calls to this libraries functions. Alternatively, you may use `autobigs-cli`, a `Python` package that implements a CLI for calling this library.
+## Versioning
+the autoBIGS project follows [semantic versioning](https://semver.org/) where the three numbers may be interpreted as MAJOR.MINOR.PATCH.
+Note regarding major version 0 ([spec item 4](https://semver.org/#spec-item-4)), the following adaptation of semantic versioning definition is as follows:
+1. Given x.Y.z, Y is only incremented when a backwards incompatible change is made.
+2. Given x.y.Z, Z is only incremented when a backwards compatible change is made.
+Versions of autoBIGS items with a major version number of 0 will introduce numerous changes and patches. As such, changes between such versions should be considered highly variable.
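Reviewer note: the major-version-0 rule added to the README above can be read as a small compatibility check. The sketch below is illustrative only. `parse_version` and `is_backwards_compatible` are hypothetical helpers, not part of autoBIGS, and the code assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffixes. Treating downgrades as incompatible is also an assumption of this sketch.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a plain MAJOR.MINOR.PATCH string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_backwards_compatible(installed: str, candidate: str) -> bool:
    """Return True if moving from `installed` to `candidate` should not break callers.

    Hypothetical helper following the README's adaptation:
    for 0.Y.z, a change in Y signals a backwards incompatible release,
    while a change in z alone signals a backwards compatible one.
    """
    old = parse_version(installed)
    new = parse_version(candidate)
    if new < old:
        return False  # assumption: downgrades are treated as incompatible
    if old[0] != new[0]:
        return False  # a different major version is always breaking
    if old[0] == 0:
        return old[1] == new[1]  # under major 0, Y plays the "breaking" role
    return True  # standard semver: same major, newer minor or patch


if __name__ == "__main__":
    print(is_backwards_compatible("0.9.1", "0.9.4"))
    print(is_backwards_compatible("0.9.4", "0.10.0"))
```

In practice this corresponds to pinning a dependency on a 0.x release to a single Y value.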
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -13,11 +13,12 @@ dependencies = [
 ]
 requires-python = ">=3.12"
 description = "A library to rapidly fetch fetch MLST profiles given sequences for various diseases."
+license = {text = "GPL-3.0-or-later"}
 [project.urls]
-Homepage = "https://github.com/RealYHD/autoBIGS.engine"
+Homepage = "https://github.com/Syph-and-VPD-Lab/autoBIGS.engine"
-Source = "https://github.com/RealYHD/autoBIGS.engine"
+Source = "https://github.com/Syph-and-VPD-Lab/autoBIGS.engine"
-Issues = "https://github.com/RealYHD/autoBIGS.engine/issues"
+Issues = "https://github.com/Syph-and-VPD-Lab/autoBIGS.engine/issues"
 [tool.setuptools_scm]
--- a/src/autobigs/engine/analysis/bigsdb.py
+++ b/src/autobigs/engine/analysis/bigsdb.py
@@ -124,13 +124,17 @@ class RemoteBIGSdbMLSTProfiler(BIGSdbMLSTProfiler):
    async def profile_multiple_strings(self, query_named_string_groups: AsyncIterable[Iterable[NamedString]], stop_on_fail: bool = False) -> AsyncGenerator[NamedMLSTProfile, Any]:
        async for named_strings in query_named_string_groups:
+            names: list[str] = list()
+            sequences: list[str] = list()
            for named_string in named_strings:
-                try:
-                    yield NamedMLSTProfile(named_string.name, (await self.profile_string([named_string.sequence])))
-                except NoBIGSdbMatchesException as e:
-                    if stop_on_fail:
-                        raise e
-                    yield NamedMLSTProfile(named_string.name, None)
+                names.append(named_string.name)
+                sequences.append(named_string.sequence)
+            try:
+                yield NamedMLSTProfile("-".join(names), (await self.profile_string(sequences)))
+            except NoBIGSdbMatchesException as e:
+                if stop_on_fail:
+                    raise e
+                yield NamedMLSTProfile("-".join(names), None)
    async def close(self):
        await self._http_client.close()
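Reviewer note: the behavioural change in `profile_multiple_strings` is that each group now produces a single profile, named by joining the member names with `-`, instead of one profile per string. The following self-contained sketch mirrors that control flow without touching a BIGSdb server. Every name in it (`NamedProfile`, `NoMatch`, `fake_profile`, `profile_groups`, `to_async`) is a stand-in invented for illustration, and `fake_profile` is an assumed placeholder for the remote lookup.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterable, AsyncIterator, Iterable, Optional


@dataclass(frozen=True)
class NamedString:
    """Stand-in for the library's NamedString (name plus sequence)."""
    name: str
    sequence: str


@dataclass(frozen=True)
class NamedProfile:
    """Hypothetical stand-in for NamedMLSTProfile."""
    name: str
    profile: Optional[str]


class NoMatch(Exception):
    """Stand-in for NoBIGSdbMatchesException."""


async def fake_profile(sequences: list[str]) -> str:
    # Placeholder for the remote lookup: fails if any sequence is empty.
    if not all(sequences):
        raise NoMatch()
    return f"profile-of-{len(sequences)}-sequences"


async def profile_groups(
    groups: AsyncIterable[Iterable[NamedString]], stop_on_fail: bool = False
) -> AsyncIterator[NamedProfile]:
    async for group in groups:
        names: list[str] = []
        sequences: list[str] = []
        for named in group:  # pool every member of the group first
            names.append(named.name)
            sequences.append(named.sequence)
        joined = "-".join(names)
        try:
            yield NamedProfile(joined, await fake_profile(sequences))
        except NoMatch:
            if stop_on_fail:
                raise
            yield NamedProfile(joined, None)  # record the failure and continue


async def to_async(items):
    for item in items:
        yield item


async def demo() -> list[NamedProfile]:
    groups = [
        [NamedString("geneA", "ACGT"), NamedString("geneB", "TTGA")],
        [NamedString("broken", "")],
    ]
    return [profile async for profile in profile_groups(to_async(groups))]


if __name__ == "__main__":
    for result in asyncio.run(demo()):
        print(result)
```

With `stop_on_fail=False`, a failed group is reported with a `None` profile under its joined name, which is the shape the CSV writer later uses to build its list of failures.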
--- a/src/autobigs/engine/reading.py
+++ b/src/autobigs/engine/reading.py
@@ -5,12 +5,13 @@ from Bio import SeqIO
 from autobigs.engine.structures.genomics import NamedString
-async def read_fasta(handle: Union[str, TextIOWrapper]) -> AsyncGenerator[NamedString, Any]:
+async def read_fasta(handle: Union[str, TextIOWrapper]) -> Iterable[NamedString]:
    fasta_sequences = asyncio.to_thread(SeqIO.parse, handle=handle, format="fasta")
+    results = []
    for fasta_sequence in await fasta_sequences:
-        yield NamedString(fasta_sequence.id, str(fasta_sequence.seq))
+        results.append(NamedString(fasta_sequence.id, str(fasta_sequence.seq)))
+    return results
-async def read_multiple_fastas(handles: Iterable[Union[str, TextIOWrapper]]) -> AsyncGenerator[NamedString, Any]:
+async def read_multiple_fastas(handles: Iterable[Union[str, TextIOWrapper]]) -> AsyncGenerator[Iterable[NamedString], Any]:
    for handle in handles:
-        async for named_seq in read_fasta(handle):
-            yield named_seq
+        yield await read_fasta(handle)
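Reviewer note: after this change `read_multiple_fastas` yields one collection of records per file, so records from the same FASTA stay grouped. The sketch below shows that shape using in-memory handles. It does not use Biopython: `parse_fasta` is a deliberately minimal stand-in for `SeqIO.parse`, written only for this illustration, and it builds the whole list inside the worker thread.

```python
import asyncio
from io import StringIO
from typing import AsyncIterator, Iterable, NamedTuple, TextIO


class NamedString(NamedTuple):
    """Stand-in for the library's NamedString."""
    name: str
    sequence: str


def parse_fasta(handle: TextIO) -> list[NamedString]:
    """Minimal FASTA parser (illustrative stand-in for Bio.SeqIO.parse)."""
    records: list[NamedString] = []
    name = None
    chunks: list[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append(NamedString(name, "".join(chunks)))
            header = line[1:].split()
            name = header[0] if header else ""  # record id is the first token
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append(NamedString(name, "".join(chunks)))
    return records


async def read_fasta(handle: TextIO) -> list[NamedString]:
    # Parse in a worker thread so the event loop is not blocked.
    return await asyncio.to_thread(parse_fasta, handle)


async def read_multiple_fastas(
    handles: Iterable[TextIO],
) -> AsyncIterator[list[NamedString]]:
    for handle in handles:
        yield await read_fasta(handle)  # one group per file


async def demo() -> list[list[NamedString]]:
    files = [
        StringIO(">abcZ\nACGT\nAC\n>adk\nTTTT\n"),
        StringIO(">wgs_contig some description\nGGCC\n"),
    ]
    return [group async for group in read_multiple_fastas(files)]


if __name__ == "__main__":
    for group in asyncio.run(demo()):
        print([record.name for record in group])
```

Each yielded list can be passed on as one query group, which is how partial targets from a single file end up merged into one profile.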
--- a/src/autobigs/engine/writing.py
+++ b/src/autobigs/engine/writing.py
@@ -3,7 +3,7 @@ import csv
 from os import PathLike
 from typing import AsyncIterable, Collection, Mapping, Sequence, Union
-from autobigs.engine.structures.mlst import Allele, MLSTProfile
+from autobigs.engine.structures.mlst import Allele, MLSTProfile, NamedMLSTProfile
 def alleles_to_text_map(alleles: Collection[Allele]) -> Mapping[str, Union[Sequence[str], str]]:
@@ -17,12 +17,14 @@ def alleles_to_text_map(alleles: Collection[Allele]) -> Mapping[str, Union[Seque
            result[locus] = tuple(result[locus]) # type: ignore
    return dict(result)
-async def write_mlst_profiles_as_csv(mlst_profiles_iterable: AsyncIterable[tuple[str, Union[MLSTProfile, None]]], handle: Union[str, bytes, PathLike[str], PathLike[bytes]]) -> Sequence[str]:
+async def write_mlst_profiles_as_csv(mlst_profiles_iterable: AsyncIterable[NamedMLSTProfile], handle: Union[str, bytes, PathLike[str], PathLike[bytes]]) -> Sequence[str]:
    failed = list()
    with open(handle, "w", newline='') as filehandle:
        header = None
        writer: Union[csv.DictWriter, None] = None
-        async for name, mlst_profile in mlst_profiles_iterable:
+        async for named_mlst_profile in mlst_profiles_iterable:
+            name = named_mlst_profile.name
+            mlst_profile = named_mlst_profile.mlst_profile
            if mlst_profile is None:
                failed.append(name)
                continue
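Reviewer note: the writer now consumes objects with `name` and `mlst_profile` attributes rather than `(name, profile)` tuples, and it still returns the names whose profile is `None`. A reduced sketch of that loop is shown below. `NamedProfile`, `write_profiles`, and the dictionary-based profile are simplifications assumed for illustration. The real `MLSTProfile` carries alleles and more fields, and this sketch assumes all profiles share the same loci.

```python
import asyncio
import csv
import io
from dataclasses import dataclass
from typing import AsyncIterable, Optional, TextIO


@dataclass(frozen=True)
class NamedProfile:
    """Hypothetical, simplified stand-in for NamedMLSTProfile."""
    name: str
    mlst_profile: Optional[dict[str, str]]  # locus -> allele id


async def write_profiles(profiles: AsyncIterable[NamedProfile], out: TextIO) -> list[str]:
    failed: list[str] = []
    writer = None
    async for named in profiles:
        if named.mlst_profile is None:
            failed.append(named.name)  # remember failures, write nothing
            continue
        if writer is None:
            # Header is fixed by the first successful profile (sorted loci).
            fieldnames = ["id", *sorted(named.mlst_profile)]
            writer = csv.DictWriter(out, fieldnames=fieldnames)
            writer.writeheader()
        writer.writerow({"id": named.name, **named.mlst_profile})
    return failed


async def to_async(items):
    for item in items:
        yield item


async def demo() -> tuple[list[str], str]:
    out = io.StringIO(newline="")
    profiles = [
        NamedProfile("s1", {"adk": "1", "abcZ": "2"}),
        NamedProfile("s2", None),
    ]
    failed = await write_profiles(to_async(profiles), out)
    return failed, out.getvalue()


if __name__ == "__main__":
    failed_names, text = asyncio.run(demo())
    print(failed_names)
    print(text)
```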
--- a/tests/autobigs/engine/test_reading.py
+++ b/tests/autobigs/engine/test_reading.py
@@ -2,6 +2,6 @@ from autobigs.engine.reading import read_fasta
 async def test_fasta_reader_not_none():
-    named_strings = read_fasta("tests/resources/tohama_I_bpertussis.fasta")
+    named_strings = await read_fasta("tests/resources/tohama_I_bpertussis.fasta")
-    async for named_string in named_strings:
+    for named_string in named_strings:
        assert named_string.name == "BX470248.1"
--- a/tests/autobigs/engine/test_writing.py
+++ b/tests/autobigs/engine/test_writing.py
@@ -3,7 +3,7 @@ from typing import AsyncIterable, Iterable
 import pytest
 from autobigs.engine.structures.alignment import AlignmentStats
 from autobigs.engine.writing import alleles_to_text_map, write_mlst_profiles_as_csv
-from autobigs.engine.structures.mlst import Allele, MLSTProfile
+from autobigs.engine.structures.mlst import Allele, MLSTProfile, NamedMLSTProfile
 import tempfile
 from csv import reader
 from os import path
@@ -11,20 +11,20 @@ from os import path
@pytest.fixture
 def dummy_alphabet_mlst_profile():
-    return MLSTProfile((
+    return NamedMLSTProfile("name", MLSTProfile((
        Allele("A", "1", None),
        Allele("D", "1", None),
        Allele("B", "1", None),
        Allele("C", "1", None),
        Allele("C", "2", AlignmentStats(90, 10, 0, 90))
-    ), "mysterious", "very mysterious")
+    ), "mysterious", "very mysterious"))
 async def iterable_to_asynciterable(iterable: Iterable):
    for iterated in iterable:
        yield iterated
 async def test_column_order_is_same_as_expected_file(dummy_alphabet_mlst_profile: MLSTProfile):
-    dummy_profiles = [("test_1", dummy_alphabet_mlst_profile)]
+    dummy_profiles = [dummy_alphabet_mlst_profile]
    with tempfile.TemporaryDirectory() as temp_dir:
        output_path = path.join(temp_dir, "out.csv")
        await write_mlst_profiles_as_csv(iterable_to_asynciterable(dummy_profiles), output_path)
@@ -34,8 +34,8 @@ async def test_column_order_is_same_as_expected_file(dummy_alphabet_mlst_profile
            target_columns = lines[4:]
            assert target_columns == sorted(target_columns)
-async def test_alleles_to_text_map_mapping_is_correct(dummy_alphabet_mlst_profile: MLSTProfile):
+async def test_alleles_to_text_map_mapping_is_correct(dummy_alphabet_mlst_profile: NamedMLSTProfile):
-    mapping = alleles_to_text_map(dummy_alphabet_mlst_profile.alleles)
+    mapping = alleles_to_text_map(dummy_alphabet_mlst_profile.mlst_profile.alleles) # type: ignore
    expected_mapping = {
        "A": "1",
        "B": "1",
@@ -44,4 +44,4 @@ async def test_alleles_to_text_map_mapping_is_correct(dummy_alphabet_mlst_profil
    }
    for allele_name, allele_ids in mapping.items():
        assert allele_name in expected_mapping
-        assert allele_ids == expected_mapping[allele_name]
+        assert allele_ids == expected_mapping[allele_name]
Author	SHA1	Message	Date
Harrison Deng	62ce1c9b2f	Updated README.md to explain versioning	2025-02-18 16:32:02 +00:00
Harrison Deng	7384895578	Writing now uses named MLST profile	2025-02-18 16:03:17 +00:00
Harrison Deng	5a03c7e8d8	Multiple string profiling now respects grouped queries (for non-WGS)	2025-02-18 15:34:18 +00:00
Harrison Deng	ddf9cde175	Added a license text to pyproject.toml	2025-02-14 20:47:06 +00:00
Harrison Deng	2e8cdd8da9	Updated URL links	2025-02-14 20:37:13 +00:00
Harrison Deng	d0318536b2	Changed FASTA reading to group based on file for merging partial targets	2025-02-14 14:35:53 +00:00