Advanced / Optional Dependencies
The dependencies described here are not required to run Snekmer. They provide optional performance improvements for specific use cases.
Blazing Signature Filter (BSF): Faster Clustering
The Blazing Signature Filter is an optional dependency used by snekmer cluster to compute pairwise Jaccard
distance matrices more efficiently. It applies to the following clustering methods
(set via the cluster.method config parameter):
density-jaccard
hdensity-jaccard
agglomerative-jaccard (default)
If BSF is not installed, Snekmer automatically falls back to
scipy.spatial.distance.pdist for Jaccard distance computation. Clustering
will produce identical results; BSF simply runs faster on large datasets.
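The fallback described above can be illustrated with a minimal sketch. This is not Snekmer's own code. The module name bsf is an assumption taken from the "#egg=bsf" fragment of the install URL, and the toy presence/absence matrix is hypothetical. The sketch only checks whether BSF is importable and then computes the Jaccard distance matrix through the SciPy path.

```python
import importlib.util

import numpy as np
from scipy.spatial.distance import pdist, squareform

# Assumption: BSF installs an importable module named "bsf".
bsf_available = importlib.util.find_spec("bsf") is not None

# Hypothetical toy input: 3 sequences x 4 k-mers, True = k-mer present.
kmers = np.array(
    [
        [1, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 0, 1, 1],
    ],
    dtype=bool,
)

# SciPy path: pdist returns the condensed (upper-triangle) distances,
# and squareform expands them into a symmetric square matrix.
dist = squareform(pdist(kmers, metric="jaccard"))

print("BSF importable:", bsf_available)
print(dist)
```

Rows 0 and 2 share no k-mers, so their Jaccard distance is 1.0; rows 0 and 1 share one of three distinct k-mers, giving a distance of 2/3.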
BSF is not compatible with Apple silicon (M1/M2/M3) systems. See the known Apple silicon issues for details.
Install GCC (required before installing BSF)
BSF requires GCC 4.9 or later. Install it for your operating system:
macOS
brew install gcc llvm libomp
After installing llvm, Homebrew may print a “Caveats” message with additional
flags that need to be set. Follow those instructions to ensure GCC is correctly
resolved. A typical caveats message looks like:
If you need to have llvm first in your PATH, run:
echo 'export PATH="/usr/local/opt/llvm/bin:$PATH"' >> ~/.zshrc
For compilers to find llvm you may need to set:
export LDFLAGS="-L/usr/local/opt/llvm/lib"
export CPPFLAGS="-I/usr/local/opt/llvm/include"
Windows / Linux / Unix
See the BSF documentation for platform-specific GCC installation instructions.
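Before installing BSF, it may help to confirm that the compiler on your PATH meets the GCC 4.9 minimum stated above. The following is a rough sketch, not part of Snekmer or BSF: the helper names parse_version and gcc_meets_minimum are hypothetical, and the check relies on GCC's -dumpversion flag.

```python
import shutil
import subprocess

MIN_GCC = (4, 9)  # minimum version stated in this section


def parse_version(text):
    """Turn a version string such as '11.4.0' or '13' into a tuple of ints."""
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component
        parts.append(int(piece))
    return tuple(parts)


def gcc_meets_minimum(executable="gcc"):
    """Return True if `executable` is on PATH and reports a version >= MIN_GCC."""
    path = shutil.which(executable)
    if path is None:
        return False
    result = subprocess.run(
        [path, "-dumpversion"], capture_output=True, text=True, check=False
    )
    version = parse_version(result.stdout)
    return bool(version) and version >= MIN_GCC


if __name__ == "__main__":
    print("GCC >= 4.9 found:", gcc_meets_minimum())
```

On macOS, the plain gcc command usually points to Apple clang, while Homebrew installs GCC under a versioned name such as gcc-14, so you may need to pass that name as the executable argument.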
Install BSF
With GCC installed and the Snekmer virtual environment active:
pip install git+https://github.com/PNNL-CompBio/bsf-jaccard-py#egg=bsf