Software

Descriptions of software that we have developed for data analysis and metabolic modeling. These are free to use and can be adapted to other strains or projects.

RiboPipe

Processing and analyzing RNA-Seq and Ribo-Seq data

High-throughput sequencing data generated from transcriptomes (RNA-seq) and translatomes (ribosome profiling or Ribo-seq) need filtering, genome mapping, and analysis before biological conclusions can be drawn. RiboPipe is a pipeline that filters RNA-seq and ribosome profiling nucleotide sequencing data (FASTQ files), maps extracted reads to reference genomes, and provides basic quality control and analysis. The repository also includes R scripts for principal component analysis and plotting of read density in genomic regions. This tool meets our general RNA-seq and Ribo-seq needs and has been used in several studies. See also publication.

Authors: Johannes Asplund-Samuelsson, Jan Karlsen

Repository: https://github.com/Asplund-Samuelsson/ribopipe

POPPY

Prospecting Optimal Pathways with PYthon

Engineering bio-based production of chemicals requires optimal pathways strung together from a specific selection of thousands of possible biochemical reactions. POPPY generates pathways that can produce a target chemical from host-endogenous metabolites. The pathways are constructed from known and plausible biochemical reactions in large databases. A unique network-embedded max-min driving force analysis is included to ensure compatibility with host metabolism and simultaneously rank the candidate pathways. See also publication.

Author: Johannes Asplund-Samuelsson

Repository: https://github.com/Asplund-Samuelsson/POPPY
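
To illustrate the idea behind max-min driving force (MDF) ranking, here is a minimal sketch that is not taken from the POPPY repository. It assumes a hypothetical two-step pathway A -> B -> C with made-up standard Gibbs energies, fixes the concentrations of A and C, and uses a crude grid search over the concentration of B in place of the linear program normally used for MDF. The network-embedded aspect of POPPY is not reproduced. All names (`TOY_PATHWAY`, `grid_search_mdf`, and so on) are our own for this example.

```python
import math

R = 8.314e-3   # gas constant in kJ/(mol*K)
T = 298.15     # temperature in K (assumed)


def reaction_dg(dg0, stoich, conc):
    """Reaction Gibbs energy: dG' = dG'0 + RT * sum(nu_i * ln c_i)."""
    return dg0 + R * T * sum(nu * math.log(conc[m]) for m, nu in stoich.items())


def driving_force(reactions, conc):
    """Smallest driving force (-dG') among the pathway reactions."""
    return min(-reaction_dg(dg0, stoich, conc) for dg0, stoich in reactions)


def grid_search_mdf(reactions, fixed, free, lo, hi, steps=400):
    """Maximize the minimum driving force over one free concentration.

    A log-spaced grid search stands in for the usual linear program.
    """
    best_df, best_c = -math.inf, None
    for i in range(steps + 1):
        c = lo * (hi / lo) ** (i / steps)
        df = driving_force(reactions, {**fixed, free: c})
        if df > best_df:
            best_df, best_c = df, c
    return best_df, best_c


# Hypothetical pathway A -> B -> C; standard Gibbs energies in kJ/mol are made up
TOY_PATHWAY = [
    (5.0, {"A": -1, "B": 1}),
    (-15.0, {"B": -1, "C": 1}),
]
FIXED = {"A": 1e-3, "C": 1e-3}  # molar concentrations (assumed)

mdf, conc_b = grid_search_mdf(TOY_PATHWAY, FIXED, "B", 1e-6, 1e-2)
print(f"MDF = {mdf:.2f} kJ/mol at [B] = {conc_b:.2e} M")
```

With all three metabolites at 1 mM, the first reaction is unfavorable (driving force of -5 kJ/mol). Lowering the concentration of B balances the two steps, and the search returns a value close to 5 kJ/mol, which is the analytical optimum for this toy case.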

RedMAGPIE

Knowledge of how microbial genomes adapt to acquiring the Calvin cycle of carbon dioxide fixation could drive future efforts in improving such organisms as well as generating them from scratch. RedMAGPIE is an analysis that aims to identify the most important genetic adaptations that differentiate microbes with and without the Calvin cycle. To that end, closely related genomes with and without the Calvin cycle were identified and subjected to gene enrichment analysis, ancestral character estimation, and random forest machine learning. In this way, hundreds of important genetic features were identified that together make up a “recipe” of an autotrophic organism.

Author: Johannes Asplund-Samuelsson

Repository: https://github.com/Asplund-Samuelsson/redmagpie

FUREE

As shown through overexpression experiments and kinetic modeling, fructose-1,6-bisphosphatase/sedoheptulose-1,7-bisphosphatase (F/SBPase) limits the capability of the Calvin cycle of carbon dioxide fixation to support growth in autotrophs. FUREE uses the UniRep artificial intelligence framework to teach a deep learning model to extract representative features from F/SBPase sequences. By fitting a linear regression top model to such F/SBPase representations and accompanying experimental data, it is possible to guide mutations in the computer, so-called in silico evolution. FUREE contains scripts for training a top model on custom F/SBPase data, making predictions, and performing in silico evolution to improve F/SBPase activity, and thereby improve growth of autotrophs that use the Calvin cycle.

Author: Johannes Asplund-Samuelsson

Repository: https://github.com/Asplund-Samuelsson/furee
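
The loop of "represent, predict, mutate" can be sketched as follows. This is an illustration only and not the FUREE code: amino acid frequencies stand in for the learned UniRep representation, the linear top model weights are hypothetical rather than fitted to data, and a simple greedy acceptance rule is assumed for the in silico evolution step.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def represent(seq):
    """Toy stand-in for a learned representation: amino acid frequencies."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def predict(seq, weights, intercept=0.0):
    """Linear top model applied to the sequence representation."""
    return intercept + sum(w * x for w, x in zip(weights, represent(seq)))


def evolve(seq, weights, steps=200, seed=0):
    """Greedy in silico evolution: keep a point mutation only if the
    predicted score improves (acceptance rule assumed for this sketch)."""
    rng = random.Random(seed)
    best, best_score = seq, predict(seq, weights)
    for _ in range(steps):
        pos = rng.randrange(len(best))
        candidate = best[:pos] + rng.choice(AMINO_ACIDS) + best[pos + 1:]
        score = predict(candidate, weights)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Hypothetical weights: reward alanine, penalize tryptophan
WEIGHTS = [0.0] * len(AMINO_ACIDS)
WEIGHTS[AMINO_ACIDS.index("A")] = 1.0
WEIGHTS[AMINO_ACIDS.index("W")] = -1.0

START = "MKWWLLGG"  # made-up starting sequence
evolved, evolved_score = evolve(START, WEIGHTS)
print(START, "->", evolved, round(evolved_score, 3))
```

In a real setting the representation would come from the trained deep learning model and the weights from regression against measured F/SBPase data; only the overall structure of the loop carries over.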

K1

Kinetic model of the Calvin cycle

K1 is a kinetic model of the core carbon fixation metabolism of Synechocystis 6803. K1 uses several layers of random sampling to supply metabolite concentrations and kinetic enzyme parameters to the model. This sampling overcomes the problem of missing values or uncertainties in these parameters from experimental datasets. The sampling approach allows for the analysis of the stability of the metabolic system for a set of steady-state metabolite concentrations. Stability analysis can guide metabolic engineering, as modifications could move metabolite concentrations into a range where the system is less stable and therefore will not sustain a steady state. A parameterized kinetic model such as that provided by K1 also allows for the calculation of control coefficients. See also publication.

Author: Markus Janasch

Repository: https://github.com/MJanasch/CBB_Kinetics
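
The sampling-based stability analysis can be illustrated with a toy example that is unrelated to the actual K1 model. We assume a hypothetical autocatalytic cycle with two metabolites and three Michaelis-Menten reactions (X1 -> X2, X2 -> 2 X1, and a drain X1 -> out), all carrying the same steady-state flux. Concentrations and Km values are sampled log-uniformly, the Jacobian is evaluated at the steady state, and the state is called stable when both eigenvalues have negative real parts (for a 2x2 matrix: negative trace and positive determinant). The sampling ranges and function names are assumptions made for this sketch.

```python
import math
import random


def mm_slope(v, x, km):
    """Derivative dv/dx of v = Vmax*x/(Km + x), written in terms of the
    steady-state flux v so that Vmax does not need to be sampled."""
    return (v / x) * km / (km + x)


def jacobian(x1, x2, km1, km2, km3, v=1.0):
    """Jacobian of dx1/dt = 2*v2 - v1 - v3 and dx2/dt = v1 - v2."""
    a = mm_slope(v, x1, km1)  # X1 -> X2
    b = mm_slope(v, x2, km2)  # X2 -> 2 X1 (autocatalytic step)
    c = mm_slope(v, x1, km3)  # X1 -> out (drain)
    return [[-a - c, 2 * b], [a, -b]]


def is_stable(j):
    """A 2x2 system is stable if the trace is negative and the determinant positive."""
    trace = j[0][0] + j[1][1]
    det = j[0][0] * j[1][1] - j[0][1] * j[1][0]
    return trace < 0 and det > 0


def log_uniform(rng, lo, hi):
    return math.exp(rng.uniform(math.log(lo), math.log(hi)))


def stable_fraction(n=1000, seed=0):
    """Fraction of sampled parameter sets that give a stable steady state."""
    rng = random.Random(seed)
    stable = 0
    for _ in range(n):
        x1, x2, km1, km2, km3 = (log_uniform(rng, 0.01, 10.0) for _ in range(5))
        stable += is_stable(jacobian(x1, x2, km1, km2, km3))
    return stable / n


print(f"Stable parameter sets: {stable_fraction():.1%}")
```

In this toy system the determinant reduces to b*(c - a), so stability depends on whether the drain responds more strongly to X1 than the cycle reaction does. A full model has many more metabolites and needs a numerical eigenvalue calculation, but the sample-then-test logic is the same.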

sgRNA library designer

Automated design of sgRNAs for CRISPRi

Genome-wide mutagenesis libraries are a powerful tool for assessing the metabolic capabilities of a given strain. We recently created a genome-wide CRISPRi repression library for the cyanobacterium Synechocystis 6803, and used it to discover gene knockdowns that could improve specific growth rate, lactate tolerance, and productivity. We developed a custom script to automatically generate the guide RNA sequences (sgRNAs) needed to target each gene in the genome. The software finds up to five sgRNA targeting regions for each gene in a supplied genome file. The sgRNAs are designed to maximize repression strength while minimizing off-target binding. See also publication.
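
A heavily simplified sketch of such a search is shown below. It is not the script described above, and every design rule in it is an assumption: candidate sites are 20-nt sequences followed by an NGG PAM, only the given strand is scanned, sites closest to the gene start are preferred as a proxy for repression strength, and a guide is discarded when its 12-nt PAM-proximal seed occurs more than once in the genome as a proxy for off-target binding.

```python
import re


def candidate_sgrnas(gene_seq, genome_seq, max_guides=5, guide_len=20, seed_len=12):
    """Return up to max_guides (position, guide) pairs for one gene.

    Simplifications (assumed, not from the original script): one strand
    only, NGG PAM, seed uniqueness as the sole off-target criterion.
    """
    gene_seq = gene_seq.upper()
    genome_seq = genome_seq.upper()
    # Lookahead pattern so that overlapping candidate sites are also found
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len)
    guides = []
    for match in pattern.finditer(gene_seq):
        guide = match.group(1)
        seed = guide[-seed_len:]
        # Skip guides whose seed is not unique in the genome
        if genome_seq.count(seed) > 1:
            continue
        guides.append((match.start(), guide))
        if len(guides) == max_guides:
            break
    return guides


# Made-up example: a single 20-nt target followed by an AGG PAM
GUIDE = "ACGTTGCAATCGGATCCTAG"
GENE = GUIDE + "AGG"
GENOME = "TTTT" + GENE + "TTTT"

print(candidate_sgrnas(GENE, GENOME))
```

A production tool would also scan the reverse complement, allow mismatches when scoring off-targets, and take the position relative to the promoter and start codon into account.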

Ralstonia GEM

Genome-scale model and resource balance analysis for Ralstonia eutropha

This is a genome-scale metabolic model for the litho-autotrophic bacterium Cupriavidus necator, formerly Ralstonia eutropha H16. The model is extensively curated to fix errors and missing annotations in previous models. It can be imported with the COBRA toolbox or the cobrapy package for Python. Such models can predict metabolic fluxes (e.g. using flux balance analysis, FBA) or inform genetic engineering strategies by simulating the effect of knockouts.

Author: Michael Jahn

Repository: https://github.com/m-jahn/genome-scale-models