diff --git a/analysis/FSEOF.m b/analysis/FSEOF.m index e60740ce..169c9573 100755 --- a/analysis/FSEOF.m +++ b/analysis/FSEOF.m @@ -1,33 +1,42 @@ function targets=FSEOF(model,biomassRxn,targetRxn,iterations,coefficient,outputFile) -% FSEOF -% Implements the Flux Scanning based on Enforced Objective Flux algorithm. +% FSEOF Flux Scanning based on Enforced Objective Flux. % -% Input: -% model a model structure -% biomassRxn string with reaction ID of the biomass formation or -% growth reaction -% targetRxn string with reaction ID of target reaction -% iterations numeric indicating number of iterations (optional, -% default 10) -% coefficient numeric indicating ratio of optimal target reaction -% flux, must be less than 1 (optional, default 0.9) -% outputFile string with output filename (optional, default prints -% to command window) +% Implements the Flux Scanning based on Enforced Objective Flux algorithm. +% This function writes a tab-delimited file or prints to the command +% window. If an output has been specified (targets), it will also generate +% a structure indicating for each model reaction whether it is identified +% by FSEOF as a target and the slope of the reaction when switching from +% biomass formation to product formation. % -% Output: -% targets structure with information for identified targets -% logical logical array indicating whether a model reaction was -% identified as target by FSEOF -% slope numeric array with FSEOF slopes for target reactions +% Parameters +% ---------- +% model : struct +% a model structure. +% biomassRxn : char +% reaction ID of the biomass formation or growth reaction. +% targetRxn : char +% reaction ID of the target reaction. +% iterations : double, optional +% number of iterations (default 10). +% coefficient : double, optional +% ratio of optimal target reaction flux, must be less than 1 +% (default 0.9). +% outputFile : char, optional +% output filename (default prints to command window). 
% -% This function writes an tab-delimited file or prints to command window. -% If an output has been specified (targets), it will also generate a -% structure indicating for each model reaction whether it is identified by -% FSEOF as a target and the slope of the reaction when switching from -% biomass formation to product formation. +% Returns +% ------- +% targets : struct +% structure with information for identified targets, with fields: +% +% - logical : logical array indicating whether a model reaction was +% identified as target by FSEOF +% - slope : numeric array with FSEOF slopes for target reactions % -% Usage: targets = FSEOF(model, biomassRxn, targetRxn, iterations,... -% coefficient, outputFile) +% Examples +% -------- +% targets = FSEOF(model, biomassRxn, targetRxn, iterations, ... +% coefficient, outputFile); biomassRxn=char(biomassRxn); targetRxn=char(targetRxn); diff --git a/analysis/analyzeSampling.m b/analysis/analyzeSampling.m index 8bcbf4d7..570925e1 100755 --- a/analysis/analyzeSampling.m +++ b/analysis/analyzeSampling.m @@ -1,33 +1,47 @@ function scores=analyzeSampling(Tex, df, solutionsA, solutionsB, printResults) -% analyzeSampling -% Compares the significance of change in flux between two conditions with -% the significance of change in gene expression +% analyzeSampling Compare flux change significance with expression change. % -% Tex a vector of t-scores for the change in gene expression -% for each reaction. This score could be the Student t -% between the two conditions, or you can calculate it from -% a p-value (by computing the inverse of the so called error -% function). 
If you choose the second alternative you should -% be aware that the transcripts that increased in expression -% level should have positive values and those who decreased -% in expression level should have negative values (the -% p-values only tell you if the fluxes changed or not but -% not in which direction) -% df the degrees of freedom in the t-test -% solutionsA random solutions for the reference condition (as -% generated by randomSampling) -% solutionsB random solutions for the test condition (as generated -% by randomSampling) -% printResults prints the most significant reactions in each category -% (optional, default false) +% Compares the significance of change in flux between two conditions with +% the significance of change in gene expression. % -% scores a Nx3 column matrix with the probabilities of a reaction: -% 1) changing both in flux and expression in the same direction -% 2) changing in expression but not in flux -% 3) changing in flux but not in expression or changing -% in opposed directions in flux and expression. +% Parameters +% ---------- +% Tex : double +% a vector of t-scores for the change in gene expression for each +% reaction. This score could be the Student t between the two +% conditions, or you can calculate it from a p-value (by computing the +% inverse of the so-called error function). If you choose the second +% alternative you should be aware that the transcripts that increased +% in expression level should have positive values and those that +% decreased in expression level should have negative values (the +% p-values only tell you if the fluxes changed or not but not in which +% direction). +% df : double +% the degrees of freedom in the t-test. +% solutionsA : double +% random solutions for the reference condition (as generated by +% randomSampling). +% solutionsB : double +% random solutions for the test condition (as generated by +% randomSampling). 
+% printResults : logical, optional +% prints the most significant reactions in each category +% (default false). % -% Usage: scores=analyzeSampling(Tex, df, solutionsA, solutionsB, printResults) +% Returns +% ------- +% scores : double +% a Nx3 column matrix with the probabilities of a reaction: +% +% 1. changing both in flux and expression in the same direction +% 2. changing in expression but not in flux +% 3. changing in flux but not in expression or changing in opposed +% directions in flux and expression +% +% Examples +% -------- +% scores = analyzeSampling(Tex, df, solutionsA, solutionsB, ... +% printResults); if nargin<5 printResults=false; diff --git a/analysis/findGeneDeletions.m b/analysis/findGeneDeletions.m index 7420d298..f6986bbd 100755 --- a/analysis/findGeneDeletions.m +++ b/analysis/findGeneDeletions.m @@ -1,54 +1,69 @@ function [genes, fluxes, originalGenes, details, grRatioMuts]=findGeneDeletions(model,testType,analysisType,refModel,oeFactor) -% findGeneDeletions -% Deletes genes, optimizes the model, and keeps track of the resulting -% fluxes. This is used for identifying gene deletion targets. +% findGeneDeletions Delete genes and track the resulting fluxes. % -% model a model structure -% testType single/double gene deletions/over expressions. Over -% expression only available if using MOMA -% 'sgd' single gene deletion -% 'dgd' double gene deletion -% 'sgo' single gene over expression -% 'dgo' double gene over expression -% (optional, default 'sgd') -% analysisType determines whether to use FBA ('fba') or MOMA ('moma') -% in the optimization. (optional, default 'fba') -% refModel MOMA works by fitting the flux distributions of two -% models to be as similar as possible. The most common -% application is where you have a reference model where -% some of the fluxes are constrained from experimental -% data. 
This model is required when using MOMA -% oeFactor a factor by which the fluxes should be increased if a -% gene is overexpressed (optional, default 10) +% Deletes genes, optimizes the model, and keeps track of the resulting +% fluxes. This is used for identifying gene deletion targets. % -% genes a matrix with the genes that were deleted in each -% optimization (the gene indexes in originalGenes). Each -% row corresponds to a column in fluxes -% fluxes a matrix with the resulting fluxes. Double deletions -% that result in an unsolvable problem have all zero -% flux. Single deletions that result in an unsolvable -% problem are indicated in details instead -% originalGenes simply the genes in the input model. Included for -% simple presentation of the output -% details not all genes will be deleted in all analyses. It is -% for example not necessary to delete genes for dead end -% reactions. This is a vector with details about -% each gene in originalGenes and why or why not it was -% deleted -% 1: Was deleted/overexpressed -% 2: Proved lethal in sgd (single gene deletion) -% 3: - redundant, no longer used - -% 4: Involved in dead-end reaction -% grRatioMuts growth rate ratio between mutated strain and wild type, -% matches the originalGenes(genes) mutants. Note that -% this does not directly map to model.genes, as is the case -% for COBRA getEssentialGenes. However, this can be -% obtained by afterwards running: -% grRatio=zeros(1,numel(model.genes)); -% grRatio(genes)=grRatioMuts; +% Parameters +% ---------- +% model : struct +% a model structure. +% testType : char, optional +% single/double gene deletions/over expressions. Over expression is +% only available if using MOMA (default 'sgd'): % -% Usage: [genes, fluxes, originalGenes, details, grRatioMuts]=findGeneDeletions(model,testType,analysisType,... 
-% refModel,oeFactor) +% - 'sgd' : single gene deletion +% - 'dgd' : double gene deletion +% - 'sgo' : single gene over expression +% - 'dgo' : double gene over expression +% analysisType : char, optional +% determines whether to use FBA ('fba') or MOMA ('moma') in the +% optimization (default 'fba'). +% refModel : struct, optional +% MOMA works by fitting the flux distributions of two models to be as +% similar as possible. The most common application is where there is a +% reference model with some fluxes constrained from experimental data. +% This model is required when using MOMA. +% oeFactor : double, optional +% a factor by which the fluxes should be increased if a gene is +% overexpressed (default 10). +% +% Returns +% ------- +% genes : double +% a matrix with the genes that were deleted in each optimization (the +% gene indexes in originalGenes). Each row corresponds to a column in +% fluxes. +% fluxes : double +% a matrix with the resulting fluxes. Double deletions that result in +% an unsolvable problem have all zero flux. Single deletions that +% result in an unsolvable problem are indicated in details instead. +% originalGenes : cell +% simply the genes in the input model. Included for simple +% presentation of the output. +% details : double +% not all genes will be deleted in all analyses. It is for example not +% necessary to delete genes for dead end reactions. This is a vector +% with details about each gene in originalGenes and why or why not it +% was deleted: +% +% - 1 : was deleted/overexpressed +% - 2 : proved lethal in sgd (single gene deletion) +% - 3 : redundant, no longer used +% - 4 : involved in dead-end reaction +% grRatioMuts : double +% growth rate ratio between mutated strain and wild type, matching the +% originalGenes(genes) mutants. Note that this does not directly map +% to model.genes, as is the case for COBRA getEssentialGenes. 
However, +% this can be obtained by afterwards running: +% +% grRatio=zeros(1,numel(model.genes)); +% grRatio(genes)=grRatioMuts; +% +% Examples +% -------- +% [genes, fluxes, originalGenes, details, grRatioMuts]=... +% findGeneDeletions(model,testType,analysisType,refModel,oeFactor); originalModel=model; if nargin<5 diff --git a/analysis/followChanged.m b/analysis/followChanged.m index 95fd0a2c..7a217b27 100755 --- a/analysis/followChanged.m +++ b/analysis/followChanged.m @@ -1,24 +1,34 @@ function followChanged(model,fluxesA,fluxesB, cutOffChange, cutOffFlux, cutOffDiff, metaboliteList) -% followChanged -% Prints fluxes and reactions for each of the reactions that results in -% different fluxes compared to the reference case. +% followChanged Print reactions whose fluxes differ from a reference case. % -% model a model structure -% fluxesA flux vector for the test case -% fluxesB flux vector for the reference test -% cutOffChange reactions where the fluxes differ by less than -% this many percent won't be printed (optional, default 10^-8) -% cutOffFlux reactions where the absolute value of both fluxes -% are below this value won't be printed (optional, -% default 10^-8) -% cutOffDiff reactions where the fluxes differ by less than -% cutOffDiff won't be printed (optional, default 10^-8) -% metaboliteList cell array of metabolite names. Only reactions -% involving any of these metabolites will be -% printed (optional) +% Prints fluxes and reactions for each of the reactions that result in +% different fluxes compared to the reference case. % -% Usage: followChanged(model,fluxesA,fluxesB, cutOffChange, cutOffFlux, -% cutOffDiff, metaboliteList) +% Parameters +% ---------- +% model : struct +% a model structure. +% fluxesA : double +% flux vector for the test case. +% fluxesB : double +% flux vector for the reference test. +% cutOffChange : double, optional +% reactions where the fluxes differ by less than this many percent +% won't be printed (default 10^-8). 
+% cutOffFlux : double, optional +% reactions where the absolute value of both fluxes are below this +% value won't be printed (default 10^-8). +% cutOffDiff : double, optional +% reactions where the fluxes differ by less than cutOffDiff won't be +% printed (default 10^-8). +% metaboliteList : cell, optional +% cell array of metabolite names. Only reactions involving any of +% these metabolites will be printed. +% +% Examples +% -------- +% followChanged(model,fluxesA,fluxesB,cutOffChange,cutOffFlux,... +% cutOffDiff,metaboliteList); %Checks if a cut off flux has been set if nargin<4 diff --git a/analysis/followFluxes.m b/analysis/followFluxes.m index 7f00921c..33470d7b 100755 --- a/analysis/followFluxes.m +++ b/analysis/followFluxes.m @@ -1,18 +1,31 @@ function errorFlag=followFluxes(model, fluxesA, lowerFlux, upperFlux, fluxesB) -% followFluxes -% Prints fluxes and reactions for each of the reactions that results in -% fluxes in the specified interval. +% followFluxes Print reactions with fluxes in a specified interval. % -% model a model structure -% fluxesA flux vector for the test case -% lowerFlux only reactions with fluxes above this cutoff -% value are displayed -% upperFlux only reactions with fluxes below this cutoff -% value are displayed (optional, default Inf) -% fluxesB flux vector for the reference case(optional) +% Prints fluxes and reactions for each of the reactions that result in +% fluxes within the specified interval. % -% Usage: errorFlag=followFluxes(model, fluxesA, lowerFlux, upperFlux, -% fluxesB) +% Parameters +% ---------- +% model : struct +% a model structure. +% fluxesA : double +% flux vector for the test case. +% lowerFlux : double +% only reactions with fluxes above this cutoff value are displayed. +% upperFlux : double, optional +% only reactions with fluxes below this cutoff value are displayed +% (default Inf). +% fluxesB : double, optional +% flux vector for the reference case. 
+% +% Returns +% ------- +% errorFlag : double +% set to 1 if upperFlux is not larger than lowerFlux, otherwise empty. +% +% Examples +% -------- +% errorFlag=followFluxes(model,fluxesA,lowerFlux,upperFlux,fluxesB); %Checks that the upper flux is larger than the lower flux if nargin>3 diff --git a/analysis/getAllSubGraphs.m b/analysis/getAllSubGraphs.m index de9c858a..e8a869ef 100755 --- a/analysis/getAllSubGraphs.m +++ b/analysis/getAllSubGraphs.m @@ -1,17 +1,23 @@ function subGraphs=getAllSubGraphs(model) -% getAllSubGraphs -% Get all metabolic subgraphs in a model. Two metabolites -% are connected if they share a reaction. +% getAllSubGraphs Get all metabolic subgraphs in a model. % -% Input: -% model a model structure +% Two metabolites are connected if they share a reaction. % -% Output: -% subGraphs a boolean matrix where the rows correspond to the metabolites -% and the columns to which subgraph they are assigned to. The -% columns are ordered so that larger subgraphs come first +% Parameters +% ---------- +% model : struct +% a model structure. % -% Usage: subGraphs=getAllSubGraphs(model) +% Returns +% ------- +% subGraphs : logical +% a boolean matrix where the rows correspond to the metabolites and +% the columns to which subgraph they are assigned to. The columns are +% ordered so that larger subgraphs come first. +% +% Examples +% -------- +% subGraphs = getAllSubGraphs(model); %Generate the connectivity graph. Metabolites are connected through %reactions. This is not a bipartite graph with the reactions. diff --git a/analysis/getAllowedBounds.m b/analysis/getAllowedBounds.m index ed580fd0..1b31a8de 100755 --- a/analysis/getAllowedBounds.m +++ b/analysis/getAllowedBounds.m @@ -1,30 +1,39 @@ function [minFluxes, maxFluxes, exitFlags]=getAllowedBounds(model,rxns,runParallel) -% getAllowedBounds -% Returns the minimal and maximal fluxes through each reaction. +% getAllowedBounds Return the minimal and maximal fluxes through reactions. 
% -% Input: -% model a model structure -% rxns either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the -% model, or a vector of reaction indexes (optional, default -% model.rxns) -% runParallel speed up calculations by parallel processing. This is -% not beneficial if allowed bounds are calculated for -% only a few reactions, as the overhead of parallel -% processing will take longer. It requires MATLAB -% Parallel Computing Toolbox. If this is not installed, -% the calculations will not be parallelized, regardless -% what is indicated as runParallel. (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of +% reaction indexes (default model.rxns). +% runParallel : logical, optional +% speed up calculations by parallel processing. This is not beneficial +% if allowed bounds are calculated for only a few reactions, as the +% overhead of parallel processing will take longer. It requires MATLAB +% Parallel Computing Toolbox. If this is not installed, the +% calculations will not be parallelized, regardless of what is +% indicated as runParallel (default true). % -% Output: -% minFluxes minimal allowed fluxes -% maxFluxes maximal allowed fluxes -% exitFlags exit flags for min/max for each of the reactions. True -% if it was possible to calculate a flux +% Returns +% ------- +% minFluxes : double +% minimal allowed fluxes. +% maxFluxes : double +% maximal allowed fluxes. +% exitFlags : double +% exit flags for min/max for each of the reactions. True if it was +% possible to calculate a flux. % +% Notes +% ----- % In cases where no solution can be calculated, NaN is returned. 
% -% Usage: [minFluxes, maxFluxes, exitFlags] = getAllowedBounds(model, rxns, runParallel) +% Examples +% -------- +% [minFluxes, maxFluxes, exitFlags] = getAllowedBounds(model, rxns, runParallel); if nargin<2 || isempty(rxns) rxns = 1:numel(model.rxns); diff --git a/analysis/getEssentialRxns.m b/analysis/getEssentialRxns.m index d9106c35..ec616dbb 100755 --- a/analysis/getEssentialRxns.m +++ b/analysis/getEssentialRxns.m @@ -1,18 +1,29 @@ function [essentialRxns, essentialRxnsIndexes]=getEssentialRxns(model,ignoreRxns) -% getEssentialRxns -% Calculate the essential reactions for a model to be solvable +% getEssentialRxns Calculate the essential reactions for a solvable model. % -% model a model structure -% ignoreRxns cell array of reaction IDs which should not be -% checked (optional, default {}) +% Parameters +% ---------- +% model : struct +% a model structure. +% ignoreRxns : cell, optional +% cell array of reaction IDs which should not be checked +% (default {}). % -% essentialRxns cell array with the IDs of the essential reactions -% essentialRxnsIndexes vector with the indexes of the essential reactions +% Returns +% ------- +% essentialRxns : cell +% cell array with the IDs of the essential reactions. +% essentialRxnsIndexes : double +% vector with the indexes of the essential reactions. % -% Essential reactions are those which, when constrained to 0, result in an -% infeasible problem. +% Notes +% ----- +% Essential reactions are those which, when constrained to 0, result in an +% infeasible problem. 
% -% Usage: [essentialRxns, essentialRxnsIndexes]=getEssentialRxns(model,ignoreRxns) +% Examples +% -------- +% [essentialRxns, essentialRxnsIndexes] = getEssentialRxns(model, ignoreRxns); if nargin<2 ignoreRxns={}; diff --git a/analysis/getFluxZ.m b/analysis/getFluxZ.m index 0e6ba6ef..3a2fda2c 100755 --- a/analysis/getFluxZ.m +++ b/analysis/getFluxZ.m @@ -1,18 +1,29 @@ function Z=getFluxZ(solutionsA, solutionsB) -% getFluxZ -% Calculates the Z scores between two sets of random flux distributions. +% getFluxZ Calculate Z scores between two sets of random flux distributions. % -% solutionsA random solutions for the reference condition (as -% generated by randomSampling) -% solutionsB random solutions for the test condition (as generated -% by randomSampling) +% Parameters +% ---------- +% solutionsA : double +% random solutions for the reference condition (as generated by +% randomSampling). +% solutionsB : double +% random solutions for the test condition (as generated by +% randomSampling). % -% Z a vector with Z-scores that tells you for each reaction -% how likely it is for its flux to have increased (positive sign) -% or decreased (negative sign) in the second condition with -% respect to the first. +% Returns +% ------- +% Z : double +% a vector with Z-scores that tells you for each reaction how likely it +% is for its flux to have increased (positive sign) or decreased +% (negative sign) in the second condition with respect to the first. 
% -% Usage: Z=getFluxZ(solutionsA, solutionsB) +% Examples +% -------- +% Z = getFluxZ(solutionsA, solutionsB); +% +% See also +% -------- +% randomSampling nRxns=size(solutionsA,1); diff --git a/analysis/getMinNrFluxes.m b/analysis/getMinNrFluxes.m index 6c9ca5e2..0c4e4858 100755 --- a/analysis/getMinNrFluxes.m +++ b/analysis/getMinNrFluxes.m @@ -1,31 +1,48 @@ function [x,I,exitFlag]=getMinNrFluxes(model, toMinimize, params,scores) -% getMinNrFluxes -% Returns the minimal set of fluxes that satisfy the model using -% mixed integer linear programming. +% getMinNrFluxes Find the minimal set of fluxes that satisfy the model. % -% model a model structure -% toMinimize either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the model, -% of a vector of indexes for the reactions that should be -% minimized (optional, default model.rxns) -% params *obsolete option* -% scores vector of weights for the reactions. Negative scores -% should not have flux. Positive scores are not possible in this -% implementation, and they are changed to max(scores(scores<0)). -% Must have the same dimension as toMinimize (find(toMinimize) -% if it is a logical vector) (optional, default -1 for all reactions) +% Uses mixed integer linear programming to find the minimal set of fluxes +% that satisfy the model. % -% x the corresponding fluxes for the full model -% I the indexes of the reactions in toMinimize that were used -% in the solution -% exitFlag 1: optimal solution found -% -1: no feasible solution found -% -2: optimization time out +% Parameters +% ---------- +% model : struct +% a model structure. +% toMinimize : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of indexes +% for the reactions that should be minimized (default model.rxns). +% params : struct, optional +% *obsolete option*. 
+% scores : double, optional +% vector of weights for the reactions. Negative scores should not have +% flux. Positive scores are not possible in this implementation, and +% they are changed to max(scores(scores<0)). Must have the same +% dimension as toMinimize (find(toMinimize) if it is a logical vector) +% (default -1 for all reactions). % -% NOTE: Uses 1000 mmol/gDW/h as an arbitary large flux. Could possibly -% cause problems if the fluxes in the model are larger than that. +% Returns +% ------- +% x : double +% the corresponding fluxes for the full model. +% I : double +% the indexes of the reactions in toMinimize that were used in the +% solution. +% exitFlag : double +% exit status: % -% Usage: [x,I,exitFlag]=getMinNrFluxes(model, toMinimize, params, scores) +% - 1 : optimal solution found +% - -1 : no feasible solution found +% - -2 : optimization time out +% +% Examples +% -------- +% [x, I, exitFlag] = getMinNrFluxes(model, toMinimize, params, scores); +% +% Notes +% ----- +% Uses 1000 mmol/gDW/h as an arbitrarily large flux. Could possibly cause +% problems if the fluxes in the model are larger than that. exitFlag=1; diff --git a/analysis/haveFlux.m b/analysis/haveFlux.m index 01215959..f926e2c1 100755 --- a/analysis/haveFlux.m +++ b/analysis/haveFlux.m @@ -1,25 +1,36 @@ function I=haveFlux(model,cutOff,rxns) -% haveFlux -% Checks which reactions can carry a (positive or negative) flux. Is used -% as a faster version of getAllowedBounds if it is only interesting -% whether the reactions can carry a flux or not +% haveFlux Check which reactions can carry a flux. % -% Input: -% model a model structure -% cutOff the flux value that a reaction has to carry to be -% identified as positive (optional, default 10^-8) -% rxns either a cell array of IDs, a logical vector with the -% same number of elements as metabolites in the model, or a -% vector of indexes (optional, default model.rxns) +% Checks which reactions can carry a (positive or negative) flux. 
Is used as +% a faster version of getAllowedBounds if it is only interesting whether the +% reactions can carry a flux or not. % -% Output: -% I logical array with true if the corresponding reaction can -% carry a flux +% Parameters +% ---------- +% model : struct +% a model structure. +% cutOff : double, optional +% the flux value that a reaction has to carry to be identified as +% positive (default 10^-6). +% rxns : cell or logical or double, optional +% either a cell array of IDs, a logical vector with the same number of +% elements as reactions in the model, or a vector of indexes (default +% model.rxns). % -% If a model has +/- Inf bounds then those are replaced with an arbitary -% large value of +/- 10000 prior to solving +% Returns +% ------- +% I : logical +% logical array with true if the corresponding reaction can carry a +% flux. +% +% Examples +% -------- +% I = haveFlux(model, cutOff, rxns); % -% Usage: I = haveFlux(model, cutOff, rxns) +% Notes +% ----- +% If a model has +/- Inf bounds then those are replaced with an arbitrarily +% large value of +/- 10000 prior to solving. if nargin<2 cutOff=10^-6; diff --git a/analysis/randomSampling.m b/analysis/randomSampling.m index 9c48431c..3f96faa4 100755 --- a/analysis/randomSampling.m +++ b/analysis/randomSampling.m @@ -1,61 +1,63 @@ function [solutions, goodRxns]=randomSampling(model,nSamples,replaceBoundsWithInf,supressErrors,runParallel,goodRxns,minFlux) -% randomSampling -% Performs random sampling of the solution space, as described in Bordel -% et al. (2010) PLOS Compt Biol (doi:10.1371/journal.pcbi.1000859). +% randomSampling Perform random sampling of the solution space. % -% Input: -% model a model structure -% nSamples the number of solutions to return -% (optional, default 1000) -% replaceBoundsWithInf replace the largest upper bounds with Inf and -% the smallest lower bounds with -Inf. 
This is -% needed in order to get solutions without loops -% if your model has for example 1000/-1000 as -% arbitary large bounds. If your model only has -% "biologically relevant" bounds, then set this -% to false (optional, default true) -% supressErrors the program will halt if it has problems -% finding non-zero solutions which are not -% involved in loops. This could be because the -% constraints on the model are too relaxed (such -% as unlimited glucose uptake) or too strict -% (such as too many and too narrow constraints) -% (optional, default false) -% runParallel speed up calculations by parallel processing. -% Requires MATLAB Parallel Computing Toolbox. If -% this is not installed, the calculations will -% not be parallelized, regardless what is -% indicated as runParallel. (optional, default -% true) -% goodRxns double vector of indexes of those reactions -% that are not involved in loops and can be used -% as random objective functions, as generated by -% a previous run of randomSampling on the same -% model (optional, default empty) -% minFlux determines if a second optimization should be -% performed for each random sample, to minimize -% the number of fluxes and thereby preventing -% loops. Typically, loops are averaged out when a -% large number of samples are taken, but this is -% not always the case (optional, default false) +% Performs random sampling of the solution space, as described in Bordel et +% al. (2010) PLoS Comput Biol (doi:10.1371/journal.pcbi.1000859). % -% Output: -% solutions matrix with the solutions -% goodRxns double vector of indexes of those reactions -% that are not involved in loops or always carry -% zero flux and can be used as random objective -% functions +% Parameters +% ---------- +% model : struct +% a model structure. +% nSamples : double, optional +% the number of solutions to return (default 1000). 
+% replaceBoundsWithInf : logical, optional +% replace the largest upper bounds with Inf and the smallest lower +% bounds with -Inf. This is needed in order to get solutions without +% loops if your model has, for example, 1000/-1000 as arbitrarily large +% bounds. If your model only has "biologically relevant" bounds, then +% set this to false (default true). +% supressErrors : logical, optional +% the program will halt if it has problems finding non-zero solutions +% which are not involved in loops. This could be because the constraints +% on the model are too relaxed (such as unlimited glucose uptake) or too +% strict (such as too many and too narrow constraints) (default false). +% runParallel : logical, optional +% speed up calculations by parallel processing. Requires the MATLAB +% Parallel Computing Toolbox. If this is not installed, the calculations +% will not be parallelized, regardless of what is indicated as +% runParallel (default true). +% goodRxns : double, optional +% vector of indexes of those reactions that are not involved in loops +% and can be used as random objective functions, as generated by a +% previous run of randomSampling on the same model (default empty). +% minFlux : logical, optional +% determines if a second optimization should be performed for each +% random sample, to minimize the number of fluxes and thereby prevent +% loops. Typically, loops are averaged out when a large number of +% samples are taken, but this is not always the case (default false). % -% Note: The solutions are generated by maximizing (with random weights) for -% a random set of three reactions. For reversible reactions it randomly +% Returns +% ------- +% solutions : double +% matrix with the solutions. +% goodRxns : double +% vector of indexes of those reactions that are not involved in loops or +% always carry zero flux and can be used as random objective functions. 
+% +% Notes +% ----- +% The solutions are generated by maximizing (with random weights) for a +% random set of three reactions. For reversible reactions it randomly % chooses between maximizing and minimizing. % % If the model is a GECKO v3+ ecModel, then usage_prot reactions are not % selected for sampling, instead focusing on sampling from the metabolic % aspects that form the solution space. % -% Usage: solutions = randomSampling(model, nSamples, replaceBoundsWithInf,... -% supressErrors, runParallel, goodRxns, minFlux) +% Examples +% -------- +% solutions = randomSampling(model, nSamples, replaceBoundsWithInf, ... +% supressErrors, runParallel, goodRxns, minFlux); if nargin<2 | isempty(nSamples) nSamples=1000; diff --git a/analysis/reporterMetabolites.m b/analysis/reporterMetabolites.m index 022acec1..666244c2 100755 --- a/analysis/reporterMetabolites.m +++ b/analysis/reporterMetabolites.m @@ -1,43 +1,59 @@ function repMets=reporterMetabolites(model,genes,genePValues,printResults,outputFile,geneFoldChanges) -% reporterMetabolites -% The Reporter Metabolites algorithm for identifying metabolites around -% which transcriptional changes occur +% reporterMetabolites Identify metabolites around which transcriptional changes occur. % -% model a model structure -% genes a cell array of gene names (should match with -% model.genes) -% genePValues P-values for differential expression of the genes -% printResults true if the top 20 Reporter Metabolites should be -% printed to the screen (optional, default false) -% outputFile the results are printed to this file (optional) -% geneFoldChanges log-fold changes for the genes. If supplied, then -% Reporter Metabolites are calculated for only up/down- -% regulated genes in addition to the full test (optional) +% The Reporter Metabolites algorithm for identifying metabolites around +% which transcriptional changes occur. % -% repMets an array of structures with the following fields. 
-% test a string the describes the genes that were used to -% calculate the Reporter Metabolites ('all', 'only up', -% or 'only down'). The two latter structures are -% only calculated if geneFoldChanges are supplied. -% mets a cell array of metabolite IDs for the metabolites for -% which a score could be calculated -% metZScores Z-scores for differential expression around each -% metabolite in "mets" -% metPValues P-values for differential expression around each -% metabolite in "mets" -% metNGenes number of neighbouring genes for each metabolite in -% "mets" -% meanZ average Z-scores for the genes around each metabolite -% in "mets" -% stdZ standard deviations of the Z-scores around each -% metabolite in "mets" +% Parameters +% ---------- +% model : struct +% a model structure. +% genes : cell +% a cell array of gene names (should match with model.genes). +% genePValues : double +% P-values for differential expression of the genes. +% printResults : logical, optional +% true if the top 20 Reporter Metabolites should be printed to the +% screen (default false). +% outputFile : char, optional +% the results are printed to this file (default none). +% geneFoldChanges : double, optional +% log-fold changes for the genes. If supplied, then Reporter +% Metabolites are calculated for only up/down-regulated genes in +% addition to the full test (default none). % -% NOTE: For details about the algorithm, see Patil KR, Nielsen J, -% Uncovering transcriptional regulation of metabolism by using metabolic -% network topology. Proc. Natl Acad. Sci. USA 2005;102:2685-2689. +% Returns +% ------- +% repMets : struct +% an array of structures with the following fields: % -% Usage: repMets=reporterMetabolites(model,genes,genePValues,printResults,... -% outputFile,geneFoldChanges) +% - test : a string that describes the genes used to calculate the +% Reporter Metabolites ('all', 'only up', or 'only down'). 
The two +% latter structures are only calculated if geneFoldChanges are +% supplied. +% - mets : a cell array of metabolite IDs for the metabolites for which +% a score could be calculated. +% - metZScores : Z-scores for differential expression around each +% metabolite in "mets". +% - metPValues : P-values for differential expression around each +% metabolite in "mets". +% - metNGenes : number of neighbouring genes for each metabolite in +% "mets". +% - meanZ : average Z-scores for the genes around each metabolite in +% "mets". +% - stdZ : standard deviations of the Z-scores around each metabolite +% in "mets". +% +% Examples +% -------- +% repMets = reporterMetabolites(model, genes, genePValues, ... +% printResults, outputFile, geneFoldChanges); +% +% Notes +% ----- +% For details about the algorithm, see Patil KR, Nielsen J, Uncovering +% transcriptional regulation of metabolism by using metabolic network +% topology. Proc. Natl Acad. Sci. USA 2005;102:2685-2689. genes=convertCharArray(genes); if nargin<4 diff --git a/analysis/runDynamicFBA.m b/analysis/runDynamicFBA.m index a556c71b..4c338fbd 100755 --- a/analysis/runDynamicFBA.m +++ b/analysis/runDynamicFBA.m @@ -1,43 +1,58 @@ function [concentrationMatrix, excRxnNames, timeVec, biomassVec] = runDynamicFBA(model, substrateRxns, initConcentrations, initBiomass, timeStep, nSteps, plotRxns, exclUptakeRxns) -% runDynamicFBA -% Performs dynamic FBA simulation using the static optimization approach +% runDynamicFBA Perform dynamic FBA using the static optimization approach. % -% Input: -% model a model structure -% substrateRxns cell array with exchange reaction identifiers for -% substrates that are initially in the media, whose -% concentration may change (e.g. 
not h2o or co2) -% initConcentrations numeric initial concentrations of substrates -% (matching substrateRxns) -% initBiomass numeric initial biomass (must be non-zero) -% timeStep numeric time step size -% nSteps numeric maximum number of time steps -% plotRxns cell array with exchange reaction identifiers for -% substrates whose concentration should be plotted -% exclUptakeRxns cell array with exchange reaction identifiers for -% substrates whose concentration does not change -% (e.g. co2, o2, h2o, h) +% Parameters +% ---------- +% model : struct +% a model structure. +% substrateRxns : cell +% cell array with exchange reaction identifiers for substrates that are +% initially in the media, whose concentration may change (e.g. not h2o +% or co2). +% initConcentrations : double +% initial concentrations of substrates (matching substrateRxns). +% initBiomass : double +% initial biomass (must be non-zero). +% timeStep : double +% time step size. +% nSteps : double +% maximum number of time steps. +% plotRxns : cell +% cell array with exchange reaction identifiers for substrates whose +% concentration should be plotted. +% exclUptakeRxns : cell +% cell array with exchange reaction identifiers for substrates whose +% concentration does not change (e.g. co2, o2, h2o, h). % -% Output: -% concentrationMatrix numeric matrix with extracellular metabolite -% concentrations -% excRxnNames cell array with exchange reaction identifiers that -% match the metabolites included in the -% concentrationMatrix -% timeVec numeric vector of time points -% biomassVec numeric vector with biomass concentrations +% Returns +% ------- +% concentrationMatrix : double +% matrix with extracellular metabolite concentrations. +% excRxnNames : cell +% cell array with exchange reaction identifiers that match the +% metabolites included in the concentrationMatrix. +% timeVec : double +% vector of time points. +% biomassVec : double +% vector with biomass concentrations. 
% +% Examples +% -------- +% [concentrationMatrix, excRxnNames, timeVec, biomassVec] = ... +% runDynamicFBA(model, substrateRxns, initConcentrations, ... +% initBiomass, timeStep, nSteps, plotRxns, exclUptakeRxns); +% +% Notes +% ----- % If no initial concentration is given for a substrate that has an open -% uptake in the model (i.e. `model.lb < 0`) the concentration is assumed to +% uptake in the model (i.e. model.lb < 0) the concentration is assumed to % be high enough to not be limiting. If the uptake rate for a nutrient is % calculated to exceed the maximum uptake rate for that nutrient specified % in the model and the max uptake rate specified is > 0, the maximum uptake % rate specified in the model is used instead of the calculated uptake % rate. % -% Modified from COBRA Toolbox dynamicFBA.m -% -% Usage: [concentrationMatrix, excRxnNames, timeVec, biomassVec] = runDynamicFBA(model, substrateRxns, initConcentrations, initBiomass, timeStep, nSteps, plotRxns, exclUptakeRxns) +% Modified from COBRA Toolbox dynamicFBA.m. % Find exchange rxns excRxnNames = getExchangeRxns(model); diff --git a/analysis/runPhenotypePhasePlane.m b/analysis/runPhenotypePhasePlane.m index 6fd206d1..e694ee4a 100755 --- a/analysis/runPhenotypePhasePlane.m +++ b/analysis/runPhenotypePhasePlane.m @@ -1,30 +1,44 @@ function [growthRates, shadowPrices1, shadowPrices2] = runPhenotypePhasePlane(model, controlRxn1, controlRxn2, nPts, range1, range2) -% runPhenotypePhasePlane -% Runs phenotype phase plane analysis and plots the results. The first -% plot is a 3D surface plot showing the phenotype phase plane, the other -% two plots show the shadow prices of the metabolites from the two -% control reactions, which define the phases. Modified from the COBRA -% phenotypePhasePlane function. +% runPhenotypePhasePlane Run phenotype phase plane analysis and plot the results. 
% -% Input: -% model a model structure -% controlRxn1 reaction identifier of the first reaction to be plotted -% controlRxn2 reaction identifier of the second reaction to be plotted -% nPts the number of points to plot in each dimension (optional, -% default 50) -% range1 the range [from 0 to range1] of reaction 1 to plot -% (optional, default 20) -% range2 the range [from 0 to range2] of reaction 2 to plot -% (optional, default 20) +% Runs phenotype phase plane analysis and plots the results. The first plot +% is a 3D surface plot showing the phenotype phase plane; the other two +% plots show the shadow prices of the metabolites from the two control +% reactions, which define the phases. % -% Output: -% growthRates1 a matrix of maximum growth rates -% shadowPrices1 a matrix with shadow prices for reaction 1 -% shadowPrices2 a matrix with shadow prices for reaction 2 +% Parameters +% ---------- +% model : struct +% a model structure. +% controlRxn1 : char +% reaction identifier of the first reaction to be plotted. +% controlRxn2 : char +% reaction identifier of the second reaction to be plotted. +% nPts : double, optional +% the number of points to plot in each dimension (default 50). +% range1 : double, optional +% the range [from 0 to range1] of reaction 1 to plot (default 20). +% range2 : double, optional +% the range [from 0 to range2] of reaction 2 to plot (default 20). % -% Modified from COBRA Toolbox phenotypePhasePlane.m +% Returns +% ------- +% growthRates : double +% a matrix of maximum growth rates. +% shadowPrices1 : double +% a matrix with shadow prices for reaction 1. +% shadowPrices2 : double +% a matrix with shadow prices for reaction 2. % -% Usage: [growthRates, shadowPrices1, shadowPrices2] = runPhenotypePhasePlane(model, controlRxn1, controlRxn2, nPts, range1, range2) +% Examples +% -------- +% [growthRates, shadowPrices1, shadowPrices2] = ... +% runPhenotypePhasePlane(model, controlRxn1, controlRxn2, nPts, ...
+% range1, range2); +% +% Notes +% ----- +% Modified from COBRA Toolbox phenotypePhasePlane.m. close all force % Close all existing figure windows (if open) if nargin < 4 nPts = 50; diff --git a/analysis/runProductionEnvelope.m b/analysis/runProductionEnvelope.m index b9329bf7..302b0149 100755 --- a/analysis/runProductionEnvelope.m +++ b/analysis/runProductionEnvelope.m @@ -1,21 +1,32 @@ function [biomassValues, targetValues] = runProductionEnvelope(model, targetRxn, biomassRxn, nPts) -% runProductionEnvelope -% Calculates the byproduct secretion envelope +% runProductionEnvelope Calculate the byproduct secretion envelope. % -% Input: -% model a model structure -% targetRxn identifier of target metabolite production reaction -% biomassRxn identifier of biomass reaction -% nPts number of points in the plot (optional, default 20) +% Parameters +% ---------- +% model : struct +% a model structure. +% targetRxn : char +% identifier of target metabolite production reaction. +% biomassRxn : char +% identifier of biomass reaction. +% nPts : double, optional +% number of points in the plot (default 20). % -% Output: -% biomassValues Biomass values for plotting -% targetValues Target upper and lower bounds for plotting +% Returns +% ------- +% biomassValues : double +% biomass values for plotting. +% targetValues : double +% target upper and lower bounds for plotting. % -% Modified from COBRA Toolbox productionEnvelope.m +% Examples +% -------- +% [biomassValues, targetValues] = runProductionEnvelope(model, ... +% targetRxn, biomassRxn, nPts); % -% Usage: [biomassValues, targetValues] = runProductionEnvelope(model,... -% targetRxn, biomassRxn, nPts) +% Notes +% ----- +% Modified from COBRA Toolbox productionEnvelope.m. 
if nargin < 4 nPts = 20; diff --git a/analysis/runRobustnessAnalysis.m b/analysis/runRobustnessAnalysis.m index 2d085103..6146b1f5 100755 --- a/analysis/runRobustnessAnalysis.m +++ b/analysis/runRobustnessAnalysis.m @@ -1,26 +1,40 @@ function [controlFlux, objFlux] = runRobustnessAnalysis(model, controlRxn, nPoints, objRxn, plotRedCost) -% runRobustnessAnalysis -% Performs robustness analysis for a reaction of interest and an objective -% of interest. Modified from the COBRA robustnessAnalysis function. +% runRobustnessAnalysis Perform robustness analysis for a reaction and objective. % -% Input: -% model a model structure -% controlRxn reaction of interest whose value is to be controlled -% nPoints number of points to show on plot (optional, default 20) -% objRxn reaction identifier of objective to be maximized (optional, -% default it uses the objective defined in the model) -% plotRedCost logical whether reduced cost should also be plotted -% (optional, default false) +% Performs robustness analysis for a reaction of interest and an objective +% of interest. % -% Output: -% controlFlux flux values of the reaction of interest, ranging from -% its minimum to its maximum value -% objFlux optimal values of objective reaction at each control -% reaction flux value +% Parameters +% ---------- +% model : struct +% a model structure. +% controlRxn : char +% reaction of interest whose value is to be controlled. +% nPoints : double, optional +% number of points to show on plot (default 20). +% objRxn : char, optional +% reaction identifier of objective to be maximized (default uses the +% objective defined in the model). +% plotRedCost : logical, optional +% whether reduced cost should also be plotted (default false). % -% Modified from COBRA Toolbox robustnessAnalysis.m +% Returns +% ------- +% controlFlux : double +% flux values of the reaction of interest, ranging from its minimum to +% its maximum value. 
+% objFlux : double +% optimal values of objective reaction at each control reaction flux +% value. % -% Usage: runRobustnessAnalysis(model, controlRxn, nPoints, objRxn) +% Examples +% -------- +% [controlFlux, objFlux] = runRobustnessAnalysis(model, controlRxn, ... +% nPoints, objRxn); +% +% Notes +% ----- +% Modified from COBRA Toolbox robustnessAnalysis.m. if nargin < 3 nPoints = 20; diff --git a/analysis/runSimpleOptKnock.m b/analysis/runSimpleOptKnock.m index a36e0456..69cbfb26 100755 --- a/analysis/runSimpleOptKnock.m +++ b/analysis/runSimpleOptKnock.m @@ -1,33 +1,44 @@ function out = runSimpleOptKnock(model, targetRxn, biomassRxn, deletions, genesOrRxns, maxNumKO, minGrowth) -% runSimpleOptKnock -% Simple OptKnock algorithm that checks all gene or reaction deletions -% for growth-coupled metabolite production, by testing all possible -% combinations. This is not defined as MILP, and is therefore slow (but -% simple). +% runSimpleOptKnock Simple OptKnock for growth-coupled production. % -% Input: -% model a model structure -% targetRxn identifier of target reaction -% biomassRxn identifier of biomass reaction -% deletions cell array with gene or reaction identifiers that -% should be considered for knockout -% (optional, default = model.rxns) -% genesOrRxns string indicating whether deletions parameter is given -% with 'genes' or 'rxns' identifiers (optional, default -% 'rxns') -% maxNumKO numeric with maximum number of simulatenous knockout -% (optional, default 1) -% minGrowth numeric of minimum growth rate (optional, default 0.05) +% Simple OptKnock algorithm that checks all gene or reaction deletions for +% growth-coupled metabolite production, by testing all possible +% combinations. This is not defined as MILP, and is therefore slow (but +% simple). 
% -% Output: -% out structure with deletions strategies that result in -% growth-coupled production -% KO cell array with gene(s) or reaction(s) to be deleted -% growthRate vector with growth rates after deletion -% prodRate vector with production rates after deletion +% Parameters +% ---------- +% model : struct +% a model structure. +% targetRxn : char +% identifier of target reaction. +% biomassRxn : char +% identifier of biomass reaction. +% deletions : cell, optional +% cell array with gene or reaction identifiers that should be +% considered for knockout (default model.rxns). +% genesOrRxns : char, optional +% string indicating whether deletions parameter is given with 'genes' +% or 'rxns' identifiers (default 'rxns'). +% maxNumKO : double, optional +% maximum number of simultaneous knockouts (default 1). +% minGrowth : double, optional +% minimum growth rate (default 0.05). % -% Usage: out = runSimpleOptKnock(model, targetRxn, biomassRxn, deletions,... -% genesOrRxns, maxNumKO, minGrowth) +% Returns +% ------- +% out : struct +% structure with deletion strategies that result in growth-coupled +% production, with fields: +% +% - KO : cell array with gene(s) or reaction(s) to be deleted. +% - growthRate : vector with growth rates after deletion. +% - prodRate : vector with production rates after deletion. +% +% Examples +% -------- +% out = runSimpleOptKnock(model, targetRxn, biomassRxn, deletions, ... +% genesOrRxns, maxNumKO, minGrowth); if nargin < 4 params.deletions = model.rxns; diff --git a/annotation/assignSBOterms.m b/annotation/assignSBOterms.m index 0c0e1755..4e4093d0 100644 --- a/annotation/assignSBOterms.m +++ b/annotation/assignSBOterms.m @@ -1,62 +1,67 @@ function model = assignSBOterms(model, opts) -% assignSBOterms -% Assign SBO terms to metabolites and reactions following a generic -% rule set. Mirrors raven_python.annotation.add_sbo_terms; -% organism-agnostic, parameterised entirely by `opts`. 
The -% yeast-GEM port of this function is the legacy addSBOterms.m, -% which becomes a thin shim here. +% assignSBOterms Assign SBO terms to metabolites and reactions. % -% Rules -% ----- -% Metabolites: -% SBO:0000649 (Biomass) when met.name is in opts.biomassMetNames, -% or ends with any of opts.biomassMetSuffixes. Otherwise -% SBO:0000247 (Simple chemical). +% Assign SBO terms to metabolites and reactions following a generic rule +% set. Mirrors raven_python.annotation.add_sbo_terms; organism-agnostic, +% parameterised entirely by opts. The yeast-GEM port of this function is +% the legacy addSBOterms.m, which becomes a thin shim here. % -% Reactions (default → override → pseudoreaction override): -% SBO:0000176 (Metabolic reaction) default. -% Single-reactant reactions become: -% SBO:0000627 (exchange) if the lone metabolite is -% extracellular (compartment 'e' or compartment name -% containing 'extracellular'), -% SBO:0000632 (sink) if coef < 0, -% SBO:0000628 (demand) otherwise. -% Transport reactions (detected by opts.transportDetector or -% the default heuristic: same metName in ≥ 2 compartments -% in a single reaction) → SBO:0000655. -% Reactions whose name matches opts.biomassRxnName → SBO:0000629. -% Reactions whose name matches opts.ngamRxnName → SBO:0000630. -% Reactions whose name contains any of -% opts.pseudoreactionSubstrings → SBO:0000395. +% SBO is written via editMiriam(..., 'fill') so pre-existing SBO +% annotations are preserved. % -% "fill" semantic — SBO is written via editMiriam(..., 'fill') so -% pre-existing SBO annotations are preserved. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% opts : struct, optional +% Struct with any of the following fields. Missing fields take the +% defaults shown: % -% Inputs: -% model RAVEN model struct. -% opts (opt) struct with any of the following fields. 
Missing -% fields take the defaults shown: -% biomassMetNames {'biomass','DNA','RNA','protein', -% 'carbohydrate','lipid','cofactor','ion'} -% biomassMetSuffixes {' backbone',' chain'} -% biomassRxnName 'biomass pseudoreaction' -% ngamRxnName 'non-growth associated maintenance reaction' -% pseudoreactionSubstrings {'pseudoreaction','SLIME rxn'} -% onlyLastReactionForPseudo false. yeast-GEM bug-compat -% flag — replicates the -% legacy `for i=numel(...)` -% typo (pseudoreaction -% SBOs applied only to the -% last reaction). Off by -% default; turn ON for -% byte-equivalent yeast-GEM -% output. +% - biomassMetNames : {'biomass','DNA','RNA','protein','carbohydrate', +% 'lipid','cofactor','ion'} +% - biomassMetSuffixes : {' backbone',' chain'} +% - biomassRxnName : 'biomass pseudoreaction' +% - ngamRxnName : 'non-growth associated maintenance reaction' +% - pseudoreactionSubstrings : {'pseudoreaction','SLIME rxn'} +% - onlyLastReactionForPseudo : false. yeast-GEM bug-compat flag that +% replicates the legacy `for i=numel(...)` typo (pseudoreaction SBOs +% applied only to the last reaction). Off by default; turn ON for +% byte-equivalent yeast-GEM output. % -% Output: -% model Modified model. +% Returns +% ------- +% model : struct +% Modified model. % -% Usage: model = assignSBOterms(model) -% model = assignSBOterms(model, struct('onlyLastReactionForPseudo', true)) +% Examples +% -------- +% model = assignSBOterms(model); +% model = assignSBOterms(model, struct('onlyLastReactionForPseudo', true)); +% +% Notes +% ----- +% Metabolites: +% +% SBO:0000649 (Biomass) when met.name is in opts.biomassMetNames, or +% ends with any of opts.biomassMetSuffixes. Otherwise SBO:0000247 +% (Simple chemical). +% +% Reactions (default → override → pseudoreaction override): +% +% SBO:0000176 (Metabolic reaction) default. 
+% Single-reactant reactions become: +% SBO:0000627 (exchange) if the lone metabolite is extracellular +% (compartment 'e' or compartment name containing +% 'extracellular'), +% SBO:0000632 (sink) if coef < 0, +% SBO:0000628 (demand) otherwise. +% Transport reactions (detected by opts.transportDetector or the +% default heuristic: same metName in ≥ 2 compartments in a single +% reaction) → SBO:0000655. +% Reactions whose name matches opts.biomassRxnName → SBO:0000629. +% Reactions whose name matches opts.ngamRxnName → SBO:0000630. +% Reactions whose name contains any of opts.pseudoreactionSubstrings +% → SBO:0000395. if nargin < 2 || isempty(opts) opts = struct(); diff --git a/annotation/editMiriam.m b/annotation/editMiriam.m index 35cafdd5..8cb26138 100755 --- a/annotation/editMiriam.m +++ b/annotation/editMiriam.m @@ -1,40 +1,48 @@ function model=editMiriam(model,type,object,miriamName,miriams,keep) -% editMiriam -% Change MIRIAM annotation fields, one annotation type at the same time. +% editMiriam Change MIRIAM annotation fields, one type at a time. % -% Input: -% model model structure -% type 'met', 'rxn', 'gene' or 'comp' dependent on which -% objects the annotations should be assigned to -% object either a cell array of IDs, a logical vector with the -% same number of elements as the type (see above) in the -% model, a vector of indexes, or 'all' -% miriamName string specifying the namespace of the identifier, for -% instance 'bigg.metabolite'. Should be a valid prefix -% from identifiers.org (e.g. -% https://registry.identifiers.org/registry/bigg.metabolite) -% miriam string or cell array of strings with annotation -% identifiers, e.g. '12dgr161' -% keep one of the following strings, specifying what should be -% done if an object already has an existing MIRIAM -% annotations with the same miriamName: -% 'replace' discard all existing annotations, all will -% be overwritten, even if the new annotation -% is an empty field. 
Should only be used if -% you do not want to keep any of the old -% annotation with the same miriamName -% 'fill' only add annotations to those objects that -% did not yet have an annotation with that -% miriamName. Otherwise, the existing -% annotation is kept, even if it is different -% from the suggested new annotation -% 'add' keep all existing annotations, and add any -% new annotations, after removing duplicates -% -% Ouput: -% model model structure with updated MIRIAM annotation field -% -% Usage: model=editMiriam(model,type,object,miriamName,miriams,keep) +% Parameters +% ---------- +% model : struct +% model structure. +% type : char +% 'met', 'rxn', 'gene' or 'comp' dependent on which objects the +% annotations should be assigned to. +% object : cell or logical or double or char +% either a cell array of IDs, a logical vector with the same number of +% elements as the type (see above) in the model, a vector of indexes, +% or 'all'. +% miriamName : char +% string specifying the namespace of the identifier, for instance +% 'bigg.metabolite'. Should be a valid prefix from identifiers.org +% (e.g. https://registry.identifiers.org/registry/bigg.metabolite). +% miriams : char or cell +% string or cell array of strings with annotation identifiers, e.g. +% '12dgr161'. +% keep : char +% one of the following strings, specifying what should be done if an +% object already has existing MIRIAM annotations with the same +% miriamName: +% +% - 'replace' : discard all existing annotations; all will be +% overwritten, even if the new annotation is an empty field. Should +% only be used if you do not want to keep any of the old annotations +% with the same miriamName. +% - 'fill' : only add annotations to those objects that did not yet +% have an annotation with that miriamName. Otherwise, the existing +% annotation is kept, even if it is different from the suggested new +% annotation.
+% - 'add' : keep all existing annotations, and add any new +% annotations, after removing duplicates. +% +% Returns +% ------- +% model : struct +% model structure with updated MIRIAM annotation field. +% +% Examples +% -------- +% model = editMiriam(model, type, object, miriamName, miriams, keep); miriamName=char(miriamName); miriams=convertCharArray(miriams); diff --git a/annotation/extractMiriam.m b/annotation/extractMiriam.m index 6a88edc9..5bb582aa 100755 --- a/annotation/extractMiriam.m +++ b/annotation/extractMiriam.m @@ -1,30 +1,37 @@ function [miriams,extractedMiriamNames]=extractMiriam(modelMiriams,miriamNames) -% extractMiriam -% This function unpacks the information kept in metMiriams, rxnMiriams, -% geneMiriams or compMiriams to make the annotation more -% human-readable. The obtained cell array looks the same like in Excel -% format, just the columns are split to have particular miriam name in -% corresponding column +% extractMiriam Unpack MIRIAM annotations into a human-readable table. % -% modelMiriams a miriam structure (e.g. model.metMiriams) -% for one or multiple metabolites -% miriamNames cell array with miriam names to be -% extracted (optional, default 'all', meaning -% that annotation for all miriam names found -% in modelMiriams will be extracted) +% This function unpacks the information kept in metMiriams, rxnMiriams, +% geneMiriams or compMiriams to make the annotation more human-readable. +% The obtained cell array looks the same as in Excel format, just the +% columns are split to have a particular miriam name in the corresponding +% column. % -% miriams a cell array with extracted miriams. if -% several miriam names are requested, the -% corresponding information is saved in -% different columns. if there are several ids -% available for the same entity (metabolite, -% gene, reaction or compartment), they are -% concatenated into one column. 
the total -% number of column represent the number of -% unique miriam names per entity -% extractedMiriamNames cell array with extracted miriam names +% Parameters +% ---------- +% modelMiriams : cell +% a miriam structure (e.g. model.metMiriams) for one or multiple +% metabolites. +% miriamNames : cell or char, optional +% cell array with miriam names to be extracted (default 'all', meaning +% that annotation for all miriam names found in modelMiriams will be +% extracted). % -% Usage: miriam=extractMiriam(modelMiriams,miriamName) +% Returns +% ------- +% miriams : cell +% a cell array with extracted miriams. If several miriam names are +% requested, the corresponding information is saved in different +% columns. If there are several ids available for the same entity +% (metabolite, gene, reaction or compartment), they are concatenated +% into one column. The total number of columns represents the number +% of unique miriam names per entity. +% extractedMiriamNames : cell +% cell array with extracted miriam names. +% +% Examples +% -------- +% [miriams, extractedMiriamNames] = extractMiriam(modelMiriams, miriamNames); if nargin<2 || (ischar(miriamNames) && strcmp(miriamNames,'all')) extractAllTypes=true; diff --git a/annotation/loadDeltaGfromCSV.m b/annotation/loadDeltaGfromCSV.m index da56e766..c20f1082 100644 --- a/annotation/loadDeltaGfromCSV.m +++ b/annotation/loadDeltaGfromCSV.m @@ -1,24 +1,33 @@ function model = loadDeltaGfromCSV(model, metCsv, rxnCsv) -% loadDeltaGfromCSV -% Populate model.metDeltaG and model.rxnDeltaG from project CSV -% files. Mirrors raven_python.annotation.load_delta_g_csv and is -% the upstream version of yeast-GEM's loadDeltaG. +% loadDeltaGfromCSV Populate metDeltaG and rxnDeltaG from CSV files. % -% Each CSV is a two-column table: identifier, deltaG. Rows whose -% identifier doesn't appear in the model are silently skipped. -% Pass an empty string ('') for either argument to skip that side. 
+% Populate model.metDeltaG and model.rxnDeltaG from project CSV files. +% Mirrors raven_python.annotation.load_delta_g_csv and is the upstream +% version of yeast-GEM's loadDeltaG. % -% Inputs: -% model RAVEN model struct. -% metCsv Path to metabolite ΔG CSV (id, ΔG), or '' to skip. -% rxnCsv Path to reaction ΔG CSV (id, ΔG), or '' to skip. +% Each CSV is a two-column table: identifier, deltaG. Rows whose +% identifier doesn't appear in the model are silently skipped. Pass an +% empty string ('') for either argument to skip that side. % -% Output: -% model Model with metDeltaG and/or rxnDeltaG fields added. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% metCsv : char, optional +% Path to metabolite ΔG CSV (id, ΔG), or '' to skip. +% rxnCsv : char, optional +% Path to reaction ΔG CSV (id, ΔG), or '' to skip. % -% Usage: model = loadDeltaGfromCSV(model, ... -% 'data/databases/model_metDeltaG.csv', ... -% 'data/databases/model_rxnDeltaG.csv') +% Returns +% ------- +% model : struct +% Model with metDeltaG and/or rxnDeltaG fields added. +% +% Examples +% -------- +% model = loadDeltaGfromCSV(model, ... +% 'data/databases/model_metDeltaG.csv', ... +% 'data/databases/model_rxnDeltaG.csv'); if nargin < 3 rxnCsv = ''; diff --git a/annotation/saveDeltaGtoCSV.m b/annotation/saveDeltaGtoCSV.m index ee682940..491710c5 100644 --- a/annotation/saveDeltaGtoCSV.m +++ b/annotation/saveDeltaGtoCSV.m @@ -1,23 +1,30 @@ function saveDeltaGtoCSV(model, metCsv, rxnCsv, verbose) -% saveDeltaGtoCSV -% Persist model.metDeltaG and model.rxnDeltaG to project CSV files. -% Counterpart of loadDeltaGfromCSV and the upstream version of -% yeast-GEM's saveDeltaG. Mirrors raven_python.annotation.save_delta_g_csv. +% saveDeltaGtoCSV Persist metDeltaG and rxnDeltaG to CSV files. % -% Each CSV gets two columns: identifier, deltaG. Rows are written -% in model order (one row per entity); identifiers without a -% matching field get NaN. 
Pass an empty string for metCsv or rxnCsv -% to skip that side. +% Persist model.metDeltaG and model.rxnDeltaG to project CSV files. +% Counterpart of loadDeltaGfromCSV and the upstream version of yeast-GEM's +% saveDeltaG. Mirrors raven_python.annotation.save_delta_g_csv. % -% Inputs: -% model RAVEN model struct. -% metCsv Output path for the metabolite ΔG CSV, or '' to skip. -% rxnCsv Output path for the reaction ΔG CSV, or '' to skip. -% verbose (opt, default false) Print "wrote ..." per file. +% Each CSV gets two columns: identifier, deltaG. Rows are written in model +% order (one row per entity); identifiers without a matching field get +% NaN. Pass an empty string for metCsv or rxnCsv to skip that side. % -% Usage: saveDeltaGtoCSV(model, ... -% 'data/databases/model_metDeltaG.csv', ... -% 'data/databases/model_rxnDeltaG.csv') +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% metCsv : char, optional +% Output path for the metabolite ΔG CSV, or '' to skip. +% rxnCsv : char, optional +% Output path for the reaction ΔG CSV, or '' to skip. +% verbose : logical, optional +% Print "wrote ..." per file (default false). +% +% Examples +% -------- +% saveDeltaGtoCSV(model, ... +% 'data/databases/model_metDeltaG.csv', ... +% 'data/databases/model_rxnDeltaG.csv'); if nargin < 4 verbose = false; diff --git a/biomass/fitParameters.m b/biomass/fitParameters.m index 44c15ecf..b3c949ba 100755 --- a/biomass/fitParameters.m +++ b/biomass/fitParameters.m @@ -1,39 +1,55 @@ function [parameters, fitnessScore, exitFlag, newModel]=fitParameters(model,xRxns,xValues,rxnsToFit,valuesToFit,parameterPositions,fitToRatio,initialGuess,plotFitting) -% fitParameters -% Fits parameters such as maintenance ATP by quadratic programming +% fitParameters Fit parameters such as maintenance ATP by quadratic programming. 
% -% model a model structure -% xRxns cell array with the IDs of the reactions that will be -% fixed for each data point -% xValues matrix with the corresponding values for each -% xRxns (columns are reactions) -% rxnsToFit cell array with the IDs of reactions that will be fitted to -% valuesToFit matrix with the corresponding values for each -% rxnsToFit (columns are reactions) -% parameterPositions stucture that determines where the parameters are in the -% stoichiometric matrix. Contains the fields: -% position cell array of vectors where each element contains -% the positions in the S-matrix for that parameter -% isNegative cell array of vectors where the elements are true -% if that position should be the negative of the -% fitted value (to differentiate between -% production/consumption) -% fitToRatio if the ratio of simulated to measured values should -% be fitted instead of the absolute value. Used to prevent -% large fluxes from having too large impact (optional, -% default true) -% initialGuess initial guess of the parameters (optional) -% plotFitting true if the resulting fitting should be plotted -% (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% xRxns : cell +% cell array with the IDs of the reactions that will be fixed for each +% data point. +% xValues : double +% matrix with the corresponding values for each xRxns (columns are +% reactions). +% rxnsToFit : cell +% cell array with the IDs of reactions that will be fitted to. +% valuesToFit : double +% matrix with the corresponding values for each rxnsToFit (columns are +% reactions). 
+% parameterPositions : struct +% structure that determines where the parameters are in the +% stoichiometric matrix, with fields: % -% parameters fitted parameters in the same order as in -% parameterPositions -% fitnessScore the correponding residual sum of squares -% newModel updated model structure with the fitted parameters +% - position : cell array of vectors where each element contains the +% positions in the S-matrix for that parameter +% - isNegative : cell array of vectors where the elements are true if +% that position should be the negative of the fitted value (to +% differentiate between production/consumption) +% fitToRatio : logical, optional +% if the ratio of simulated to measured values should be fitted +% instead of the absolute value. Used to prevent large fluxes from +% having too large an impact (default true). +% initialGuess : double, optional +% initial guess of the parameters (default ones). +% plotFitting : logical, optional +% true if the resulting fitting should be plotted (default false). % -% Usage: [parameters, fitnessScore, exitFlag, newModel]=fitParameters(model,... -% xRxns,xValues,rxnsToFit,valuesToFit,parameterPositions,fitToRatio,... -% initialGuess,plotFitting) +% Returns +% ------- +% parameters : double +% fitted parameters in the same order as in parameterPositions. +% fitnessScore : double +% the corresponding residual sum of squares. +% exitFlag : double +% exit status returned by fminsearch. +% newModel : struct +% updated model structure with the fitted parameters. +% +% Examples +% -------- +% [parameters, fitnessScore, exitFlag, newModel]=fitParameters(model,... +% xRxns,xValues,rxnsToFit,valuesToFit,parameterPositions,fitToRatio,... 
+% initialGuess,plotFitting); if nargin<7 fitToRatio=true; diff --git a/biomass/getBiomassFractions.m b/biomass/getBiomassFractions.m index 592e7830..d42fa4d8 100644 --- a/biomass/getBiomassFractions.m +++ b/biomass/getBiomassFractions.m @@ -1,47 +1,52 @@ function fractions = getBiomassFractions(model, biomassConfig) -% getBiomassFractions -% Compute the mass fraction (g/gDW) per biomass component plus the -% total. Mirrors raven_python.biomass.sum_biomass; the MATLAB -% counterpart of yeast-GEM's legacy sumBioMass. +% getBiomassFractions Compute mass fraction per biomass component. % -% The biomassConfig struct describes the per-organism biomass -% layout — see "Inputs" below. Components whose pseudoreaction is -% missing from the model contribute 0. +% Compute the mass fraction (g/gDW) per biomass component plus the total. +% Mirrors raven_python.biomass.sum_biomass; the MATLAB counterpart of +% yeast-GEM's legacy sumBioMass. % -% Inputs: -% model RAVEN model struct. -% biomassConfig struct with fields: -% biomass_rxn rxn id of the top-level -% biomass pseudoreaction. -% proton_met met id of cytosolic H+ (used -% only by rescalePseudoreaction; -% may be unused here). -% components cell array of component -% structs with fields: -% .name component name -% (e.g. 'protein'). -% .pseudoreaction_name model.rxnNames -% entry to identify -% the pseudoreaction. -% .mass_strategy 'mw' | 'mw_minus_2h' -% | 'mw_minus_water' -% | 'grams' — see -% NOTES below. +% The biomassConfig struct describes the per-organism biomass layout — see +% Parameters below. Components whose pseudoreaction is missing from the +% model contribute 0. % -% Output: -% fractions struct keyed by component name plus 'total': -% fractions.protein, fractions.RNA, ... etc. -% All values are in g/gDW. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. 
+% biomassConfig : struct +% Struct describing the biomass layout, with fields: % -% NOTES on mass_strategy: -% 'mw' MW from chemical formula -% 'mw_minus_2h' MW − 2.016 g/mol (two protons released per -% charged tRNA — protein-pseudoreaction substrates) -% 'mw_minus_water' MW − 18.015 g/mol (water released per -% polymerisation step — RNA / DNA) -% 'grams' stoichiometry already in g/gDW (lipid backbone) +% - biomass_rxn : rxn id of the top-level biomass pseudoreaction. +% - proton_met : met id of cytosolic H+ (used only by +% rescalePseudoreaction; may be unused here). +% - components : cell array of component structs, each with fields: % -% Usage: fractions = getBiomassFractions(model, biomassConfig) +% - name : component name (e.g. 'protein'). +% - pseudoreaction_name : model.rxnNames entry to identify the +% pseudoreaction. +% - mass_strategy : 'mw' | 'mw_minus_2h' | 'mw_minus_water' | +% 'grams' — see Notes below. +% +% Returns +% ------- +% fractions : struct +% Struct keyed by component name plus 'total': fractions.protein, +% fractions.RNA, ... etc. All values are in g/gDW. 
+% +% Examples +% -------- +% fractions = getBiomassFractions(model, biomassConfig); +% +% Notes +% ----- +% mass_strategy values: +% +% 'mw' MW from chemical formula +% 'mw_minus_2h' MW − 2.016 g/mol (two protons released per charged +% tRNA — protein-pseudoreaction substrates) +% 'mw_minus_water' MW − 18.015 g/mol (water released per +% polymerisation step — RNA / DNA) +% 'grams' stoichiometry already in g/gDW (lipid backbone) fractions = struct(); total = 0; diff --git a/biomass/scaleBiomassFraction.m b/biomass/scaleBiomassFraction.m index 0e4165f2..714bafa8 100644 --- a/biomass/scaleBiomassFraction.m +++ b/biomass/scaleBiomassFraction.m @@ -1,23 +1,37 @@ function model = scaleBiomassFraction(model, biomassConfig, componentName, newValue, balanceOut) -% scaleBiomassFraction -% Rescale a biomass component to a target g/gDW value, optionally -% balancing a second component so the total biomass mass stays at -% 1 g/gDW. Mirrors raven_python.biomass.scale_biomass and yeast-GEM's -% legacy scaleBioMass. +% scaleBiomassFraction Rescale a biomass component to a target value. % -% Inputs: -% model RAVEN model struct. -% biomassConfig struct (see getBiomassFractions). -% componentName Component to rescale. -% newValue Target fraction in g/gDW. -% balanceOut (opt) Second component name to adjust so the -% biomass total remains 1 g/gDW. Empty / omit -% to skip balancing. +% Rescale a biomass component to a target g/gDW value, optionally +% balancing a second component so the total biomass mass stays at 1 g/gDW. +% Mirrors raven_python.biomass.scale_biomass and yeast-GEM's legacy +% scaleBioMass. % -% Output: -% model Modified model. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% biomassConfig : struct +% Struct (see getBiomassFractions). +% componentName : char +% Component to rescale. +% newValue : double +% Target fraction in g/gDW. +% balanceOut : char, optional +% Second component name to adjust so the biomass total remains 1 +% g/gDW. 
Empty / omit to skip balancing. % -% Usage: model = scaleBiomassFraction(model, biomassConfig, 'protein', 0.5, 'carbohydrate') +% Returns +% ------- +% model : struct +% Modified model. +% +% Examples +% -------- +% model = scaleBiomassFraction(model, biomassConfig, 'protein', 0.5, 'carbohydrate'); +% +% See also +% -------- +% getBiomassFractions if nargin < 5 balanceOut = ''; diff --git a/biomass/scaleBiomassPseudoreaction.m b/biomass/scaleBiomassPseudoreaction.m index 4636297c..f565800b 100644 --- a/biomass/scaleBiomassPseudoreaction.m +++ b/biomass/scaleBiomassPseudoreaction.m @@ -1,29 +1,43 @@ function model = scaleBiomassPseudoreaction(model, biomassConfig, componentName, factor) -% scaleBiomassPseudoreaction -% Multiply the substrate coefficients of one biomass component -% pseudoreaction by `factor` and rebalance H+ to preserve charge -% neutrality. Mirrors raven_python.biomass.rescale_pseudoreaction -% and yeast-GEM's legacy rescalePseudoReaction. +% scaleBiomassPseudoreaction Rescale a biomass component pseudoreaction. % -% "Substrate" means every metabolite in the pseudoreaction whose -% metabolite name does NOT match the component name (the -% component's product is left untouched). After rescaling, the -% coefficient of biomassConfig.proton_met is recomputed so the -% pseudoreaction's total ionic charge sums to zero. +% Multiply the substrate coefficients of one biomass component +% pseudoreaction by factor and rebalance H+ to preserve charge neutrality. +% Mirrors raven_python.biomass.rescale_pseudoreaction and yeast-GEM's +% legacy rescalePseudoReaction. % -% Inputs: -% model RAVEN model struct. -% biomassConfig struct (see getBiomassFractions). -% componentName Name of the component to rescale (must match -% biomassConfig.components{i}.name for some i, -% AND be the model.metNames of the produced -% metabolite in the matching pseudoreaction). -% factor Multiplicative factor. 
+% "Substrate" means every metabolite in the pseudoreaction whose +% metabolite name does NOT match the component name (the component's +% product is left untouched). After rescaling, the coefficient of +% biomassConfig.proton_met is recomputed so the pseudoreaction's total +% ionic charge sums to zero. % -% Output: -% model Modified model. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% biomassConfig : struct +% Struct (see getBiomassFractions). +% componentName : char +% Name of the component to rescale (must match +% biomassConfig.components{i}.name for some i, AND be the +% model.metNames of the produced metabolite in the matching +% pseudoreaction). +% factor : double +% Multiplicative factor. % -% Usage: model = scaleBiomassPseudoreaction(model, biomassConfig, 'protein', 0.9) +% Returns +% ------- +% model : struct +% Modified model. +% +% Examples +% -------- +% model = scaleBiomassPseudoreaction(model, biomassConfig, 'protein', 0.9); +% +% See also +% -------- +% getBiomassFractions comp = findComponent(biomassConfig, componentName); rxnPos = find(strcmp(model.rxnNames, comp.pseudoreaction_name)); diff --git a/biomass/setGAM.m b/biomass/setGAM.m index 319c04df..e521139c 100644 --- a/biomass/setGAM.m +++ b/biomass/setGAM.m @@ -1,33 +1,42 @@ function model = setGAM(model, value, biomassRxn, cofactorMetNames, ngamRxn, ngamValue) -% setGAM -% Set the growth-associated maintenance (GAM) coefficient in the -% biomass pseudoreaction, and optionally fix the non-growth -% maintenance (NGAM) reaction's bounds. Mirrors -% raven_python.biomass.set_gam and yeast-GEM's legacy changeGAM. +% setGAM Set the growth-associated maintenance (GAM) coefficient. % -% For every metabolite in the biomass pseudoreaction whose -% `model.metNames` entry is in `cofactorMetNames`, the -% stoichiometric coefficient is set to ±`value` preserving the sign -% of the current coefficient. 
Yeast-GEM scales ATP, ADP, H2O, H+ -% and phosphate (with ATP and H2O on the substrate side, ADP / H+ / -% phosphate on the product side). +% Set the growth-associated maintenance (GAM) coefficient in the biomass +% pseudoreaction, and optionally fix the non-growth maintenance (NGAM) +% reaction's bounds. Mirrors raven_python.biomass.set_gam and yeast-GEM's +% legacy changeGAM. % -% Inputs: -% model RAVEN model struct. -% value New GAM value (mmol ATP / gDW per growth unit). -% biomassRxn Reaction id of the biomass pseudoreaction. -% cofactorMetNames Cell array of metabolite NAMES (not IDs) -% to rescale, e.g. {'ATP','ADP','H2O','H+', -% 'phosphate'}. -% ngamRxn (opt) NGAM reaction id. Required when -% ngamValue is supplied. -% ngamValue (opt) NGAM flux to fix. Sets the NGAM -% reaction's bounds to (ngamValue, ngamValue). +% For every metabolite in the biomass pseudoreaction whose model.metNames +% entry is in cofactorMetNames, the stoichiometric coefficient is set to +% ±value preserving the sign of the current coefficient. Yeast-GEM scales +% ATP, ADP, H2O, H+ and phosphate (with ATP and H2O on the substrate side, +% ADP / H+ / phosphate on the product side). % -% Output: -% model Modified model. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% value : double +% New GAM value (mmol ATP / gDW per growth unit). +% biomassRxn : char +% Reaction id of the biomass pseudoreaction. +% cofactorMetNames : cell +% Cell array of metabolite NAMES (not IDs) to rescale, e.g. +% {'ATP','ADP','H2O','H+','phosphate'}. +% ngamRxn : char, optional +% NGAM reaction id. Required when ngamValue is supplied. +% ngamValue : double, optional +% NGAM flux to fix. Sets the NGAM reaction's bounds to (ngamValue, +% ngamValue). % -% Usage: model = setGAM(model, 80, 'r_4041', {'ATP','ADP','H2O','H+','phosphate'}) +% Returns +% ------- +% model : struct +% Modified model. 
+% +% Examples +% -------- +% model = setGAM(model, 80, 'r_4041', {'ATP','ADP','H2O','H+','phosphate'}); if nargin < 4 error('setGAM:missingArgs', ... diff --git a/comparison/compareMultipleModels.m b/comparison/compareMultipleModels.m index 596d0957..4bd62aed 100755 --- a/comparison/compareMultipleModels.m +++ b/comparison/compareMultipleModels.m @@ -1,45 +1,52 @@ function compStruct = compareMultipleModels(models,printResults,plotResults,groupVector,funcCompare,taskFile) -% compareMultipleModels -% Compares two or more condition-specific models generated from the same -% base model using high-dimensional comparisons in the reaction-space. +% compareMultipleModels Compare two or more condition-specific models. % -% models cell array of two or more models -% printResults true if the results should be printed on the screen -% (optional, default false) -% plotResults true if the results should be plotted -% (optional, default false) -% groupVector numeric vector or cell array for grouping similar -% models, i.e. by tissue (optional, default, all models -% ungrouped) -% funcCompare logical, should a functional comparison be run -% (optional,default, false) -% taskFile string containing the name of the task file to use -% for the functional comparison (should be an .xls or -% .xlsx file, required for functional comparison) +% Compares two or more condition-specific models generated from the same +% base model using high-dimensional comparisons in the reaction-space. % -% compStruct structure that contains the comparison results -% modelIDs cell array of model ids -% reactions substructure containing reaction information -% matrix binary matrix composed of reactions (rows) in -% each model (column). This matrix is used as the -% input for the model comparisons. -% IDs list of the reactions contained in the reaction -% matrix. 
-% subsystems substructure containing subsystem information -% matrix matrix with comparison of number of rxns per -% subsystem -% ID vector consisting of names of all subsystems -% structComp matrix with pairwise comparisons of model structure -% based on (1-Hamming distance) between models -% structCompMap matrix with 3D tSNE (or MDS) mapping of model -% structures based on Hamming distances -% funcComp substructure containing function comparison results -% matrix matrix with PASS / FAIL (1 / 0) values for each -% task -% tasks vector containing names of all tasks +% Parameters +% ---------- +% models : cell +% cell array of two or more models. +% printResults : logical, optional +% true if the results should be printed on the screen (default false). +% plotResults : logical, optional +% true if the results should be plotted (default false). +% groupVector : double or cell, optional +% numeric vector or cell array for grouping similar models, i.e. by +% tissue (default all models ungrouped). +% funcCompare : logical, optional +% should a functional comparison be run (default false). +% taskFile : char, optional +% string containing the name of the task file to use for the functional +% comparison (should be an .xls or .xlsx file, required for functional +% comparison). % -% Usage: compStruct=compareMultipleModels(models,printResults,... 
-% plotResults,groupVector,funcCompare,taskFile); +% Returns +% ------- +% compStruct : struct +% structure that contains the comparison results, with fields: +% +% - modelIDs : cell array of model ids +% - reactions : substructure containing reaction information, with +% fields matrix (binary matrix composed of reactions (rows) in each +% model (column), used as the input for the model comparisons) and IDs +% (list of the reactions contained in the reaction matrix) +% - subsystems : substructure containing subsystem information, with +% fields matrix (matrix with comparison of number of rxns per +% subsystem) and ID (vector consisting of names of all subsystems) +% - structComp : matrix with pairwise comparisons of model structure +% based on (1-Hamming distance) between models +% - structCompMap : matrix with 3D tSNE (or MDS) mapping of model +% structures based on Hamming distances +% - funcComp : substructure containing function comparison results, with +% fields matrix (matrix with PASS / FAIL (1 / 0) values for each task) +% and tasks (vector containing names of all tasks) +% +% Examples +% -------- +% compStruct = compareMultipleModels(models, printResults, ... +% plotResults, groupVector, funcCompare, taskFile); %% Stats toolbox required if ~(exist('mdscale.m','file') && exist('pdist.m','file') && exist('squareform.m','file') && exist('tsne.m','file')) diff --git a/comparison/compareRxnsGenesMetsComps.m b/comparison/compareRxnsGenesMetsComps.m index 29f40dcf..6712bee8 100755 --- a/comparison/compareRxnsGenesMetsComps.m +++ b/comparison/compareRxnsGenesMetsComps.m @@ -1,29 +1,35 @@ function compStruct=compareRxnsGenesMetsComps(models,printResults) -% compareRxnsGenesMetsComps -% Compares two or more models with respect to overlap in terms of genes, -% reactions, metabolites and compartments. +% compareRxnsGenesMetsComps Compare overlap of genes, reactions, metabolites and compartments. 
% -% models cell array of two or more models -% printResults true if the results should be printed on the screen -% (optional, default false) +% Compares two or more models with respect to overlap in terms of genes, +% reactions, metabolites and compartments. % -% compStruct structure that contains the comparison -% modelIDs cell array of model ids -% rxns These contain the comparison for each field. 'equ' are -% the equations after sorting and 'uEqu' are the -% equations when not taking compartmentalization into acount -% mets -% genes -% eccodes -% metNames -% equ -% uEqu -% comparison binary matrix where each row indicate which models are -% included in the comparison -% nElements vector with the number of elements for each -% comparison +% Parameters +% ---------- +% models : cell +% cell array of two or more models. +% printResults : logical, optional +% true if the results should be printed on the screen (default true). % -% Usage: compStruct=compareRxnsGenesMetsComps(models,printResults) +% Returns +% ------- +% compStruct : struct +% structure that contains the comparison, with fields: +% +% - modelIDs : cell array of model ids +% - rxns, mets, genes, eccodes, metNames, equ, uEqu : the comparison for +% each field. 'equ' are the equations after sorting and 'uEqu' are the +% equations when not taking compartmentalization into account. 
Each of +% these contains the sub-fields: +% +% - comparison : binary matrix where each row indicates which models +% are included in the comparison +% - nElements : vector with the number of elements for each +% comparison +% +% Examples +% -------- +% compStruct = compareRxnsGenesMetsComps(models, printResults); if nargin<2 printResults=true; diff --git a/conditions/applyCondition.m b/conditions/applyCondition.m index 8494485f..940c5d2f 100644 --- a/conditions/applyCondition.m +++ b/conditions/applyCondition.m @@ -1,52 +1,63 @@ function model = applyCondition(model, condition) -% applyCondition -% Apply a deterministic "condition" to a model: a prelude that resets -% exchange bounds, optional metabolite removals + automatic charge -% rebalancing of a pseudoreaction, optional biomass-stoichiometry -% delta, and a per-reaction bounds diff. The schema is intentionally -% narrow so a condition can be reviewed as data. +% applyCondition Apply a deterministic condition to a model. % -% Yeast-GEM was the first consumer; the same schema works for any -% GEM that keeps its condition presets as data rather than as code. -% Project-specific extensions (e.g. yeast-GEM's amino_acid_ratio -% step that rewrites a protein pseudoreaction's stoichiometry from a -% side-car TSV) are handled by the *caller* before / after this -% function — kept upstream-narrow on purpose. +% Apply a deterministic "condition" to a model: a prelude that resets +% exchange bounds, optional metabolite removals + automatic charge +% rebalancing of a pseudoreaction, optional biomass-stoichiometry delta, +% and a per-reaction bounds diff. The schema is intentionally narrow so a +% condition can be reviewed as data. % -% Inputs: -% model RAVEN model struct. -% condition Either a path to a YAML condition file or a struct -% already produced by parseYAML. 
The expected schema -% (all keys optional): +% Yeast-GEM was the first consumer; the same schema works for any GEM that +% keeps its condition presets as data rather than as code. +% Project-specific extensions (e.g. yeast-GEM's amino_acid_ratio step that +% rewrites a protein pseudoreaction's stoichiometry from a side-car TSV) +% are handled by the caller before / after this function — kept +% upstream-narrow on purpose. % -% prelude: -% reset_exchanges: out % truthy -> reset all +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% condition : char or struct +% Either a path to a YAML condition file or a struct already produced +% by parseYAML. The expected schema (all keys optional): % -% cofactor_pseudoreaction: -% rxn_id: r_4598 -% remove_mets: -% - { met: s_3714 } -% charge_balance_met: s_0794 +% prelude: +% reset_exchanges: out % truthy -> reset all % -% biomass_stoichiometry_delta: -% rxn_id: r_4041 -% add: -% - { met: s_0689, coef: 0.08 } -% - { met: s_0687, coef: -0.08 } -% - { met: s_0794, coef: -0.16 } +% cofactor_pseudoreaction: +% rxn_id: r_4598 +% remove_mets: +% - { met: s_3714 } +% charge_balance_met: s_0794 % -% bounds: -% - { rxn: r_1654, lb: -1000 } -% - { rxn: r_1992, lb: 0 } -% - { rxn: r_1663, lb: 0, ub: 0 } +% biomass_stoichiometry_delta: +% rxn_id: r_4041 +% add: +% - { met: s_0689, coef: 0.08 } +% - { met: s_0687, coef: -0.08 } +% - { met: s_0794, coef: -0.16 } % -% expected_uptake_count: 15 +% bounds: +% - { rxn: r_1654, lb: -1000 } +% - { rxn: r_1992, lb: 0 } +% - { rxn: r_1663, lb: 0, ub: 0 } % -% Output: -% model Modified model. +% expected_uptake_count: 15 % -% Usage: model = applyCondition(model, 'data/conditions/anaerobic.yml') -% model = applyCondition(model, parseYAML('data/conditions/anaerobic.yml')) +% Returns +% ------- +% model : struct +% Modified model. 
+% +% Examples +% -------- +% model = applyCondition(model, 'data/conditions/anaerobic.yml'); +% model = applyCondition(model, parseYAML('data/conditions/anaerobic.yml')); +% +% See also +% -------- +% parseYAML if ischar(condition) || isstring(condition) cond = parseYAML(char(condition)); diff --git a/conversion/addIdentifierPrefix.m b/conversion/addIdentifierPrefix.m index 74672ff6..e4873a41 100644 --- a/conversion/addIdentifierPrefix.m +++ b/conversion/addIdentifierPrefix.m @@ -1,26 +1,35 @@ function [model, hasChanged]=addIdentifierPrefix(model,fields) -% addIdentifierPrefix -% If reaction, metabolite, compartment, gene or model identifiers do not -% start with a letter or _, which conflicts with SBML specifications, -% prefixes are added for all identifiers in the respective model field. -% The prefixes are: -% "R_" for model.rxns, -% "M_" for model.mets, -% "C_" for model.comps; -% "G_" for model.genes (and also represented in model.grRules) +% addIdentifierPrefix Add identifier prefixes required by SBML. % -% Input: -% model model whose identifiers should be modified -% fields cell array with model field names that should be -% checked if prefixes should be added, possible values: -% 'rxns', 'mets', 'comps', 'genes', 'id'. (optional, by -% default all listed model fields will be checked). +% If reaction, metabolite, compartment, gene or model identifiers do not +% start with a letter or _, which conflicts with SBML specifications, +% prefixes are added for all identifiers in the respective model field. +% The prefixes are: % -% Output: -% model modified model -% hasChanged cell array with fields and prefixes that are added +% "R_" for model.rxns, +% "M_" for model.mets, +% "C_" for model.comps; +% "G_" for model.genes (and also represented in model.grRules) % -% Usage: [model, hasChanged]=addIdentifierPrefix(model,fields) +% Parameters +% ---------- +% model : struct +% model whose identifiers should be modified. 
+% fields : cell, optional +% cell array with model field names that should be checked if prefixes +% should be added, possible values: 'rxns', 'mets', 'comps', 'genes', +% 'id' (default all listed model fields will be checked). +% +% Returns +% ------- +% model : struct +% modified model. +% hasChanged : cell +% cell array with fields and prefixes that are added. +% +% Examples +% -------- +% [model, hasChanged] = addIdentifierPrefix(model, fields); if nargin<2 || isempty(fields) fields = {'rxns','mets','comps','genes','id'}; diff --git a/conversion/ravenCobraWrapper.m b/conversion/ravenCobraWrapper.m index cb82866f..4c692e93 100755 --- a/conversion/ravenCobraWrapper.m +++ b/conversion/ravenCobraWrapper.m @@ -1,33 +1,42 @@ function newModel=ravenCobraWrapper(model) -% ravenCobraWrapper -% Converts between RAVEN and COBRA structures +% ravenCobraWrapper Convert between RAVEN and COBRA structures. % -% Input: model a RAVEN/COBRA-compatible model structure +% This function is a bidirectional tool to convert between RAVEN and COBRA +% structures. It recognises a COBRA structure by checking the existence of +% the field 'rules', which is only found in a COBRA Toolbox structure. If +% the COBRA model also has a grRules field, then this will be used instead +% of parsing the rules field. % -% Ouput: newModel a COBRA/RAVEN-compatible model structure -% -% This function is a bidirectional tool to convert between RAVEN and -% COBRA structures. It recognises COBRA structure by checking field -% 'rules' existense, which is only found in COBRA Toolbox structure. If -% the COBRA model also has a grRules field, then this will be used -% instead of parsing the rules field. +% Parameters +% ---------- +% model : struct +% a RAVEN/COBRA-compatible model structure. % -% NOTE: During RAVEN -> COBRA -> RAVEN conversion cycle the following -% fields are lost: annotation, compOutside, compMiriams, rxnComps, -% geneComps, unconstrained. 
Boundary metabolites are lost, because COBRA -% structure does not involve boundary metabolites, so they are removed -% using simplifyModel before RAVEN -> COBRA conversion. The field 'rev' -% is also partially lost, but during COBRA -> RAVEN conversion it's -% reconstructed based on lower bound reaction values +% Returns +% ------- +% newModel : struct +% a COBRA/RAVEN-compatible model structure. % -% NOTE: During COBRA -> RAVEN -> COBRA conversion cycle the following -% fields are lost: geneEntrezID, modelVersion, proteins +% Examples +% -------- +% newModel = ravenCobraWrapper(model); % -% NOTE: The information about mandatory RAVEN fields was taken from -% checkModelStruct function, whereas the corresponding information about -% COBRA fields was fetched from verifyModel function +% Notes +% ----- +% During the RAVEN -> COBRA -> RAVEN conversion cycle the following fields +% are lost: annotation, compOutside, compMiriams, rxnComps, geneComps, +% unconstrained. Boundary metabolites are lost, because the COBRA +% structure does not involve boundary metabolites, so they are removed +% using simplifyModel before RAVEN -> COBRA conversion. The field 'rev' is +% also partially lost, but during COBRA -> RAVEN conversion it is +% reconstructed based on lower bound reaction values. % -% Usage: newModel=ravenCobraWrapper(model) +% During the COBRA -> RAVEN -> COBRA conversion cycle the following fields +% are lost: geneEntrezID, modelVersion, proteins. +% +% The information about mandatory RAVEN fields was taken from the +% checkModelStruct function, whereas the corresponding information about +% COBRA fields was fetched from the verifyModel function. 
if isfield(model,'rules') isRaven=false; diff --git a/conversion/removeIdentifierPrefix.m b/conversion/removeIdentifierPrefix.m index 0a960108..fe509bc4 100644 --- a/conversion/removeIdentifierPrefix.m +++ b/conversion/removeIdentifierPrefix.m @@ -1,31 +1,41 @@ function [model, hasChanged]=removeIdentifierPrefix(model,fields,forceRemove) -% removeIdentifierPrefix -% This function removes identifier prefixes: -% "R_" for model.rxns, model.rxnNames and model.id, -% "M_" for model.mets and model.metNames, -% "C_" for model.comps; -% "G_" for model.genes (and also represented in model.grRules). -% By default, the prefixes are only removed if all entries in a -% particular field has the prefix. The prefixes might have been present -% because one or more identifiers do not start with a letter or _, which -% conflicts with SBML specifications. +% removeIdentifierPrefix Remove SBML-required identifier prefixes. % -% Input: -% model model whose identifiers should be modified -% fields cell array with model field names from which the -% identifiers should be removed, possible values: -% 'rxns', 'mets', 'comps', 'genes', 'metNames', -% 'rxnNames', 'id'. (optional, by default all listed -% model fields will be checked). -% forceRemove if prefixes should be removed even if not all entries -% in a model field have the prefix (optional, default -% false) +% This function removes identifier prefixes: % -% Output: -% model modified model -% hasChanged cell array with fields and prefixes that are removed +% "R_" for model.rxns, model.rxnNames and model.id, +% "M_" for model.mets and model.metNames, +% "C_" for model.comps; +% "G_" for model.genes (and also represented in model.grRules). % -% Usage: model=removeIdentifierPrefix(model,fields,forceRemove) +% By default, the prefixes are only removed if all entries in a particular +% field have the prefix. 
The prefixes might have been present because one +% or more identifiers do not start with a letter or _, which conflicts +% with SBML specifications. +% +% Parameters +% ---------- +% model : struct +% model whose identifiers should be modified. +% fields : cell, optional +% cell array with model field names from which the identifiers should +% be removed, possible values: 'rxns', 'mets', 'comps', 'genes', +% 'metNames', 'rxnNames', 'id' (default all listed model fields will +% be checked). +% forceRemove : logical, optional +% if prefixes should be removed even if not all entries in a model +% field have the prefix (default false). +% +% Returns +% ------- +% model : struct +% modified model. +% hasChanged : cell +% cell array with fields and prefixes that are removed. +% +% Examples +% -------- +% model = removeIdentifierPrefix(model, fields, forceRemove); if nargin<2 || isempty(fields) fields = {'rxns','mets','comps','genes','metNames','rxnNames','id'}; diff --git a/conversion/standardizeModelFieldOrder.m b/conversion/standardizeModelFieldOrder.m index ffe79e11..7c4f5878 100755 --- a/conversion/standardizeModelFieldOrder.m +++ b/conversion/standardizeModelFieldOrder.m @@ -1,17 +1,30 @@ function orderedModel=standardizeModelFieldOrder(model) -% standardizeModelFieldOrder -% Orders fields of RAVEN model structure as specified at -% https://github.com/SysBioChalmers/RAVEN/wiki/RAVEN-Model-Structure +% standardizeModelFieldOrder Order RAVEN model structure fields. % -% Input: model model structure, either RAVEN or COBRA format +% Orders fields of a RAVEN model structure as specified at +% https://github.com/SysBioChalmers/RAVEN/wiki/RAVEN-Model-Structure % -% Output: orderedModel model structure with ordered fields +% The model fields themselves are not changed, only the order is modified. +% For changing model fields between RAVEN and COBRA format, use +% ravenCobraWrapper(). % -% The model fields themselves are not changed, only the order is -% modified. 
For changing model fields between RAVEN and COBRA format, use -% ravenCobraWrapper(). +% Parameters +% ---------- +% model : struct +% model structure, either RAVEN or COBRA format. % -% Usage: orderedModel=standardizeModelFieldOrder(model) +% Returns +% ------- +% orderedModel : struct +% model structure with ordered fields. +% +% Examples +% -------- +% orderedModel = standardizeModelFieldOrder(model); +% +% See also +% -------- +% ravenCobraWrapper ravenPath=findRAVENroot(); diff --git a/curation/curateModelFromTables.m b/curation/curateModelFromTables.m index 742b50bd..8af1d203 100644 --- a/curation/curateModelFromTables.m +++ b/curation/curateModelFromTables.m @@ -1,49 +1,60 @@ function newModel=curateModelFromTables(model,metsInfo,genesInfo,rxnsCoeffs,rxnsInfo,metPrefix,rxnPrefix) -% curateModelFromTables -% Curate existing and/or add new metabolites, reactions and genes -% from tabular data files. Originally extracted from yeast-GEM's -% curateMetsRxnsGenes; generalised here so any GEM project can drive -% batch curation from the same set of *.tsv files. +% curateModelFromTables Curate or add mets, rxns and genes from tables. % -% If the *.tsv files contain metabolites, reactions and/or genes that are -% already present in the model, then information in the model will be -% overwritten. Note that this includes empty annotations in the *.tsv -% files! Metabolites are matched by metaboliteName[comp]; reactions by -% the stoichiometry of its reactants and products; genes by their gene -% name. This function can therefore be used to add new entities in the -% model, or curate those already existing in the model. +% Curate existing and/or add new metabolites, reactions and genes from +% tabular data files. Originally extracted from yeast-GEM's +% curateMetsRxnsGenes; generalised here so any GEM project can drive batch +% curation from the same set of *.tsv files. % -% Input: -% model RAVEN model structure to be curated. 
-% metsInfo path to a *.tsv file with metabolite information, or -% 'none' to skip metabolite curation. Columns: -% metNames, comps, formula, charge, inchi, metNotes, -% then any number of MIRIAM-namespace columns. -% genesInfo path to a *.tsv file with gene information, or -% 'none'. Columns: genes, geneShortNames, then MIRIAM. -% rxnsCoeffs path to a *.tsv file with reaction stoichiometric -% coefficients, or 'none'. Columns: rxnIdx, rxnNames, -% metNames, comps, coefficient. One row per -% (reaction, metabolite) pair. -% rxnsInfo path to a *.tsv file with reaction information, or -% 'none'. Columns: rxnIdx, rxnNames, grRules, lb, ub, -% rev, subSystems, eccodes, rxnNotes, rxnReferences, -% rxnConfidenceScores, then MIRIAM. -% metPrefix prefix used to mint fresh metabolite ids (e.g. 's_' -% for yeast-GEM, 'M_' for the cobrapy/BiGG default). -% Default: 'M_'. -% rxnPrefix prefix used to mint fresh reaction ids. Default: 'R_'. +% If the *.tsv files contain metabolites, reactions and/or genes that are +% already present in the model, then information in the model will be +% overwritten. Note that this includes empty annotations in the *.tsv +% files! Metabolites are matched by metaboliteName[comp]; reactions by the +% stoichiometry of its reactants and products; genes by their gene name. +% This function can therefore be used to add new entities in the model, or +% curate those already existing in the model. % -% Output: -% newModel curated RAVEN model structure. +% Parameters +% ---------- +% model : struct +% RAVEN model structure to be curated. +% metsInfo : char +% Path to a *.tsv file with metabolite information, or 'none' to skip +% metabolite curation. Columns: metNames, comps, formula, charge, +% inchi, metNotes, then any number of MIRIAM-namespace columns. +% genesInfo : char +% Path to a *.tsv file with gene information, or 'none'. Columns: +% genes, geneShortNames, then MIRIAM. 
+% rxnsCoeffs : char +% Path to a *.tsv file with reaction stoichiometric coefficients, or +% 'none'. Columns: rxnIdx, rxnNames, metNames, comps, coefficient. One +% row per (reaction, metabolite) pair. +% rxnsInfo : char +% Path to a *.tsv file with reaction information, or 'none'. Columns: +% rxnIdx, rxnNames, grRules, lb, ub, rev, subSystems, eccodes, +% rxnNotes, rxnReferences, rxnConfidenceScores, then MIRIAM. +% metPrefix : char, optional +% Prefix used to mint fresh metabolite ids (e.g. 's_' for yeast-GEM, +% 'M_' for the cobrapy/BiGG default) (default 'M_'). +% rxnPrefix : char, optional +% Prefix used to mint fresh reaction ids (default 'R_'). % -% The 'everything after the core columns is MIRIAM' convention applies -% to all three info tables: any column whose header is not one of the -% listed core fields is treated as a MIRIAM annotation namespace and -% stored on the matching entity. +% Returns +% ------- +% newModel : struct +% Curated RAVEN model structure. % -% Usage: newModel = curateModelFromTables(model, metsInfo, genesInfo, ... -% rxnsCoeffs, rxnsInfo, metPrefix, rxnPrefix) +% Examples +% -------- +% newModel = curateModelFromTables(model, metsInfo, genesInfo, ... +% rxnsCoeffs, rxnsInfo, metPrefix, rxnPrefix); +% +% Notes +% ----- +% The 'everything after the core columns is MIRIAM' convention applies to +% all three info tables: any column whose header is not one of the listed +% core fields is treated as a MIRIAM annotation namespace and stored on +% the matching entity. if nargin==4 error('Provide both a ''rxnsInfo'' and a ''rxnsCoeffs'' file') diff --git a/gapfilling/canConsume.m b/gapfilling/canConsume.m index b0256d03..1345bd82 100755 --- a/gapfilling/canConsume.m +++ b/gapfilling/canConsume.m @@ -1,17 +1,26 @@ function consumed=canConsume(model,mets) -% canConsume -% Checks which metabolites that can be consumed by a model using the -% specified constraints +% canConsume Check which metabolites can be consumed by a model. 
% -% model a model structure -% mets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% or a vector of indexes to check for (optional, default model.mets) +% Checks which metabolites can be consumed by a model using the specified +% constraints. % -% consumed vector with true if the corresponding metabolite could be -% produced +% Parameters +% ---------- +% model : struct +% a model structure. +% mets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of +% indexes to check for (default model.mets). % -% Usage: consumed=canConsume(model,mets) +% Returns +% ------- +% consumed : logical +% vector with true if the corresponding metabolite could be consumed. +% +% Examples +% -------- +% consumed = canConsume(model, mets); if nargin<2 mets=model.mets; diff --git a/gapfilling/canProduce.m b/gapfilling/canProduce.m index e8bca220..e9ff2ce5 100755 --- a/gapfilling/canProduce.m +++ b/gapfilling/canProduce.m @@ -1,18 +1,31 @@ function produced=canProduce(model,mets) -% canProduce -% Checks which metabolites that can be produced from a model using the -% specified constraints. This is a less advanced but faster version of -% checkProduction. +% canProduce Check which metabolites can be produced from a model. % -% model a model structure -% mets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% or a vector of indexes to check for (optional, default model.mets) +% Checks which metabolites can be produced from a model using the +% specified constraints. This is a less advanced but faster version of +% checkProduction. % -% produced vector with true if the corresponding metabolite could be -% produced +% Parameters +% ---------- +% model : struct +% a model structure.
+% mets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of +% indexes to check for (default model.mets). % -% Usage: produced=canProduce(model,mets) +% Returns +% ------- +% produced : logical +% vector with true if the corresponding metabolite could be produced. +% +% Examples +% -------- +% produced = canProduce(model, mets); +% +% See also +% -------- +% checkProduction if nargin<2 mets=model.mets; diff --git a/gapfilling/checkProduction.m b/gapfilling/checkProduction.m index 13837dd5..b1c53f01 100755 --- a/gapfilling/checkProduction.m +++ b/gapfilling/checkProduction.m @@ -1,46 +1,52 @@ function [notProduced, notProducedNames, neededForProductionMat,minToConnect,model]=checkProduction(model,checkNeededForProduction,excretionFromCompartments,printDetails) -% checkProduction -% Checks which metabolites that can be produced from a model using the -% specified constraints. +% checkProduction Check which metabolites can be produced from a model. % -% model a model structure -% checkNeededForProduction for each of the metabolites that could not -% be produced, include an artificial -% production reaction and calculate which new -% metabolites that could be produced as en -% effect of this (optional, default false) -% excretionFromCompartments cell array with compartment ids from which -% metabolites can be excreted (optional, default -% model.comps) -% printDetails print details to the screen (optional, default -% true) +% Checks which metabolites can be produced from a model using the +% specified constraints.
% -% notProduced cell array with metabolites that could not -% be produced -% notProducedNames cell array with names and compartments for -% metabolites that could not be produced -% neededForProductionMat matrix where n x m is true if metabolite n -% allows for production of metabolite m -% minToConnect structure with the minimal number of -% metabolites that need to be connected in -% order to be able to produce all other -% metabolites and which metabolites each of -% them connects -% model updated model structure with excretion -% reactions added +% The function is intended to be used to identify which metabolites must be +% connected in order to have a fully connected network. It does so by first +% identifying which metabolites could have a net production in the network. +% Then it calculates which other metabolites must be able to have net +% production in order to have production of all metabolites in the network. +% So, if a network contains the equations A[external]->B, C->D, and D->E it +% will identify that production of C will connect the metabolites D and E. % -% The function is intended to be used to identify which metabolites must -% be connected in order to have a fully connected network. It does so by -% first identifying which metabolites could have a net production in the -% network. Then it calculates which other metabolites must be able to -% have net production in order to have production of all metabolites in -% the network. So, if a network contains the equations A[external]->B, -% C->D, and D->E it will identify that production of C will connect -% the metabolites D and E. +% Parameters +% ---------- +% model : struct +% a model structure. +% checkNeededForProduction : logical, optional +% for each of the metabolites that could not be produced, include an +% artificial production reaction and calculate which new metabolites +% could be produced as an effect of this (default false).
+% excretionFromCompartments : cell, optional +% cell array with compartment ids from which metabolites can be excreted +% (default model.comps). +% printDetails : logical, optional +% print details to the screen (default true). % -% Usage: [notProduced, notProducedNames,neededForProductionMat,minToConnect,model]=... -% checkProduction(model,checkNeededForProduction,... -% excretionFromCompartments,printDetails) +% Returns +% ------- +% notProduced : double +% indices of metabolites that could not be produced. +% notProducedNames : cell +% cell array with names and compartments for metabolites that could not +% be produced. +% neededForProductionMat : logical +% matrix where n x m is true if metabolite n allows for production of +% metabolite m. +% minToConnect : cell +% the minimal number of metabolites that need to be connected in order to +% be able to produce all other metabolites, and which metabolites each of +% them connects. +% model : struct +% updated model structure with excretion reactions added. +% +% Examples +% -------- +% [notProduced, notProducedNames, neededForProductionMat, minToConnect, model] = ... +% checkProduction(model, checkNeededForProduction, excretionFromCompartments, printDetails); if nargin<2 checkNeededForProduction=false; diff --git a/gapfilling/checkRxn.m b/gapfilling/checkRxn.m index 1334514c..9874d7f4 100755 --- a/gapfilling/checkRxn.m +++ b/gapfilling/checkRxn.m @@ -1,26 +1,38 @@ function report=checkRxn(model,rxn,cutoff,revDir,printReport) -% checkRxn -% Checks which reactants in a reaction that can be synthesized and which -% products that can be consumed. This is primarily for debugging -% reactions which cannot have flux +% checkRxn Check which reactants can be synthesized and products consumed. 
% -% model a model structure -% rxn the id of one reaction to check -% cutoff minimal flux for successful production/consumption (optional, -% default 10^-7) -% revDir true if the reaction should be reversed (optional, default -% false) -% printReport print a report (optional, default true) +% Checks which reactants in a reaction can be synthesized and which +% products can be consumed. This is primarily for debugging reactions +% which cannot carry flux. % -% report -% reactants array with reactant indexes -% canMake boolean array, true if the corresponding reactant can -% be synthesized by the rest of the metabolic network -% products array with product indexes -% canConsume boolean array, true if the corresponding product can -% be consumed by the rest of the metabolic network +% Parameters +% ---------- +% model : struct +% a model structure. +% rxn : char +% the id of one reaction to check. +% cutoff : double, optional +% minimal flux for successful production/consumption (default 10^-7). +% revDir : logical, optional +% true if the reaction should be reversed (default false). +% printReport : logical, optional +% print a report (default true).
% -% Usage: report=checkRxn(model,rxn,cutoff,revDir,printReport) +% Returns +% ------- +% report : struct +% report with fields: +% +% - reactants : array with reactant indexes +% - canMake : boolean array, true if the corresponding reactant can be +% synthesized by the rest of the metabolic network +% - products : array with product indexes +% - canConsume : boolean array, true if the corresponding product can be +% consumed by the rest of the metabolic network +% +% Examples +% -------- +% report = checkRxn(model, rxn, cutoff, revDir, printReport); rxn=char(rxn); if nargin<3 diff --git a/gapfilling/consumeSomething.m b/gapfilling/consumeSomething.m index 5ec896cb..d0b87616 100755 --- a/gapfilling/consumeSomething.m +++ b/gapfilling/consumeSomething.m @@ -1,44 +1,55 @@ function [solution, metabolite]=consumeSomething(model,ignoreMets,isNames,minNrFluxes,params,ignoreIntBounds) -% consumeSomething -% Tries to consume any metabolite using as few reactions as possible. -% The intended use is when you want to make sure that you model cannot -% consume anything without producing something. It is intended to be used -% with no active exchange reactions. +% consumeSomething Try to consume any metabolite using as few reactions as possible. % -% model a model structure -% ignoreMets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% of a vector of indexes for metabolites to exclude from -% this analysis (optional, default []) -% isNames true if the supplied mets represent metabolite names -% (as opposed to IDs). This is a way to delete -% metabolites in several compartments at once without -% knowing the exact IDs. This only works if ignoreMets -% is a cell array (optional, default false) -% minNrFluxes solves the MILP problem of minimizing the number of -% fluxes instead of the sum. 
Slower, but can be -% used if the sum gives too many fluxes (optional, default -% false) -% params *obsolete option* -% ignoreIntBounds true if internal bounds (including reversibility) -% should be ignored. Exchange reactions are not affected. -% This can be used to find unbalanced solutions which are -% not possible using the default constraints (optional, -% default false) +% The intended use is when you want to make sure that your model cannot +% consume anything without producing something. It is intended to be used +% with no active exchange reactions. % -% solution flux vector for the solution -% metabolite the index of the metabolite(s) which was consumed. If -% possible only one metabolite is reported, but there are -% situations where metabolites can only be consumed in -% pairs (or more) +% Parameters +% ---------- +% model : struct +% a model structure. +% ignoreMets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of indexes +% for metabolites to exclude from this analysis (default []). +% isNames : logical, optional +% true if the supplied mets represent metabolite names (as opposed to +% IDs). This is a way to delete metabolites in several compartments at +% once without knowing the exact IDs. This only works if ignoreMets is a +% cell array (default false). +% minNrFluxes : logical, optional +% solves the MILP problem of minimizing the number of fluxes instead of +% the sum. Slower, but can be used if the sum gives too many fluxes +% (default false). +% params : struct, optional +% *obsolete option*. +% ignoreIntBounds : logical, optional +% true if internal bounds (including reversibility) should be ignored. +% Exchange reactions are not affected. This can be used to find +% unbalanced solutions which are not possible using the default +% constraints (default false). 
% -% NOTE: This works by forcing at least 1 unit of "any metabolites" to be -% consumed and then minimize for the sum of fluxes. If more than one -% metabolite is consumed, it picks one of them to be consumed and then -% minimizes for the sum of fluxes. +% Returns +% ------- +% solution : double +% flux vector for the solution. +% metabolite : double +% the index of the metabolite(s) which was consumed. If possible only one +% metabolite is reported, but there are situations where metabolites can +% only be consumed in pairs (or more). % -% Usage: [solution, metabolite]=consumeSomething(model,ignoreMets,isNames,... -% minNrFluxes,params,ignoreIntBounds) +% Examples +% -------- +% [solution, metabolite] = consumeSomething(model, ignoreMets, isNames, ... +% minNrFluxes, params, ignoreIntBounds); +% +% Notes +% ----- +% This works by forcing at least 1 unit of "any metabolites" to be consumed +% and then minimizing the sum of fluxes. If more than one metabolite is +% consumed, it picks one of them to be consumed and then minimizes the +% sum of fluxes. if nargin<2 ignoreMets=[]; diff --git a/gapfilling/fillGaps.m b/gapfilling/fillGaps.m index da0d81c9..38bed192 100755 --- a/gapfilling/fillGaps.m +++ b/gapfilling/fillGaps.m @@ -1,66 +1,76 @@ function [newConnected, cannotConnect, addedRxns, newModel, exitFlag]=fillGaps(model,models,allowNetProduction,useModelConstraints,supressWarnings,rxnScores) -% fillGaps -% Uses template model(s) to fill gaps in a model +% fillGaps Use template model(s) to fill gaps in a model. % -% model a model structure that may contains gaps to be filled -% models a cell array of reference models or a model structure. -% The gaps will be filled using reactions from these models -% allowNetProduction true if net production of all metabolites is -% allowed. A reaction can be unable to carry flux because one of -% the reactants is unavailable or because one of the -% products can't be further processed.
If this -% parameter is true, only the first type of -% unconnectivity is considered (optional, default false) -% useModelConstraints true if the constraints specified in the model -% structure should be used. If false then reactions -% included from the template model(s) so that as many -% reactions as possible in model can carry flux -% (optional, default false) -% supressWarnings false if warnings should be displayed (optional, default -% false) -% rxnScores array with scores for each of the reactions in the -% reference model(s). If more than one model is supplied, -% then rxnScores should be a cell array of vectors. -% The solver will try to maximize the sum of the -% scores for the included reactions (optional, default -% is -1 for all reactions) +% This method works by merging the model with the reference model(s) and +% checking which reactions can carry flux. All reactions that can't carry +% flux are removed (cannotConnect). If useModelConstraints is false it +% then solves the MILP problem of minimizing the number of active +% reactions from the reference models that are required to have flux in +% all the reactions in model. This requires that the input model has +% exchange reactions present for the nutrients that are needed for its +% metabolism. If useModelConstraints is true then the problem is to +% include as few reactions as possible from the reference models in order +% to satisfy the model constraints. % -% newConnected cell array with the reactions that could be -% connected. This is not calulated if -% useModelConstraints is true -% cannotConnect cell array with reactions that could not be -% connected. 
This is not calculated if -% useModelConstraints is true -% addedRxns cell array with the reactions that were added from -% "models" -% newModel the model with reactions added to fill gaps -% exitFlag 1: optimal solution found -% -1: no feasible solution found -% -2: optimization time out +% The intended use is that the user can attempt a general gap-filling +% using useModelConstraints=false, or a more targeted gap-filling by +% setting constraints in the model structure and then using +% useModelConstraints=true. For example, to include reactions so that all +% biomass components can be synthesized, the user could define a biomass +% equation and set its lower bound to >0. Running this function with +% useModelConstraints=true would then give the smallest set of reactions +% that have to be included in order for the model to produce biomass. % -% This method works by merging the model to the reference model(s) and -% checking which reactions can carry flux. All reactions that can't -% carry flux are removed (cannotConnect). -% If useModelConstraints is false it then solves the MILP problem of -% minimizing the number of active reactions from the reference models -% that are required to have flux in all the reactions in model. This -% requires that the input model has exchange reactions present for the -% nutrients that are needed for its metabolism. If useModelConstraints is -% true then the problem is to include as few reactions as possible from -% the reference models in order to satisfy the model constraints. -% The intended use is that the user can attempt a general gap-filling using -% useModelConstraint=false or a more targeted gap-filling by setting -% constraints in the model structure and then use -% useModelConstraints=true. Say that the user want to include reactions -% so that all biomass components can be synthesized. He/she could then -% define a biomass equation and set the lower bound to >0. 
Running this -% function with useModelConstraints=true would then give the smallest set -% of reactions that have to be included in order for the model to produce -% biomass. +% Parameters +% ---------- +% model : struct +% a model structure that may contain gaps to be filled. +% models : cell or struct +% a cell array of reference models or a model structure. The gaps will +% be filled using reactions from these models. +% allowNetProduction : logical, optional +% true if net production of all metabolites is allowed. A reaction can +% be unable to carry flux because one of the reactants is unavailable +% or because one of the products can't be further processed. If true, +% only the first type of unconnectivity is considered (default false). +% useModelConstraints : logical, optional +% true if the constraints specified in the model structure should be +% used. If false then reactions are included from the template +% model(s) so that as many reactions as possible in model can carry +% flux (default false). +% supressWarnings : logical, optional +% false if warnings should be displayed (default false). +% rxnScores : double or cell, optional +% array with scores for each of the reactions in the reference +% model(s). If more than one model is supplied, then rxnScores should +% be a cell array of vectors. The solver will try to maximize the sum +% of the scores for the included reactions (default is -1 for all +% reactions). % -% Usage: [newConnected, cannotConnect, addedRxns, newModel, exitFlag]=... -% fillGaps(model,models,allowNetProduction,useModelConstraints,... -% supressWarnings,rxnScores,params) +% Returns +% ------- +% newConnected : cell +% cell array with the reactions that could be connected. This is not +% calculated if useModelConstraints is true. +% cannotConnect : cell +% cell array with reactions that could not be connected. This is not +% calculated if useModelConstraints is true. 
+% addedRxns : cell +% cell array with the reactions that were added from "models". +% newModel : struct +% the model with reactions added to fill gaps. +% exitFlag : double +% exit status: +% +% - 1 : optimal solution found +% - -1 : no feasible solution found +% - -2 : optimization time out +% +% Examples +% -------- +% [newConnected, cannotConnect, addedRxns, newModel, exitFlag]=... +% fillGaps(model,models,allowNetProduction,useModelConstraints,... +% supressWarnings,rxnScores); %If the user only supplied a single template model if ~iscell(models) diff --git a/gapfilling/fitTasks.m b/gapfilling/fitTasks.m index f823d459..e4095ce9 100755 --- a/gapfilling/fitTasks.m +++ b/gapfilling/fitTasks.m @@ -1,40 +1,46 @@ function [outModel, addedRxns]=fitTasks(model,refModel,inputFile,printOutput,rxnScores,taskStructure) -% fitTasks -% Fills gaps in a model by including reactions from a reference model, -% so that the resulting model can perform all the tasks in a task list +% fitTasks Fill gaps in a model so it can perform a list of tasks. % -% Input: -% model model structure -% refModel reference model from which to include reactions -% inputFile a task list in Excel format. See the function -% parseTaskList for details (optional if taskStructure is -% supplied) -% printOutput true if the results of the test should be displayed -% (optional, default true) -% rxnScores scores for each of the reactions in the reference -% model. Only negative scores are allowed. The solver will -% try to maximize the sum of the scores for the included -% reactions (optional, default is -1 for all reactions) -% taskStructure structure with the tasks, as from parseTaskList. If -% this is supplied then inputFile is ignored (optional) +% Fills gaps in a model by including reactions from a reference model, so +% that the resulting model can perform all the tasks in a task list. The +% gap-filling is done in a task-by-task manner, rather than solving for +% all tasks at once.
This means that the order of the tasks could +% influence the result. % +% Parameters +% ---------- +% model : struct +% a model structure. +% refModel : struct +% reference model from which to include reactions. +% inputFile : char, optional +% a task list in Excel format. See the function parseTaskList for +% details (optional if taskStructure is supplied). +% printOutput : logical, optional +% true if the results of the test should be displayed (default true). +% rxnScores : double, optional +% scores for each of the reactions in the reference model. Only +% negative scores are allowed. The solver will try to maximize the sum +% of the scores for the included reactions (default is -1 for all +% reactions). +% taskStructure : struct, optional +% structure with the tasks, as from parseTaskList. If supplied then +% inputFile is ignored. % -% Output: -% outModel model structure with reactions added to perform the -% tasks -% addedRxns MxN matrix with the added reactions (M) from refModel -% for each task (N). An element is true if the corresponding -% reaction is added in the corresponding task. -% Failed tasks and SHOULD FAIL tasks are ignored +% Returns +% ------- +% outModel : struct +% model structure with reactions added to perform the tasks. +% addedRxns : logical +% MxN matrix with the added reactions (M) from refModel for each task +% (N). An element is true if the corresponding reaction is added in +% the corresponding task. Failed tasks and SHOULD FAIL tasks are +% ignored. % -% This function fills gaps in a model by using a reference model, so -% that the resulting model can perform a list of metabolic tasks. The -% gap-filling is done in a task-by-task manner, rather than solving for -% all tasks at once. This means that the order of the tasks could influence -% the result. -% -% Usage: [outModel, addedRxns]=fitTasks(model,refModel,inputFile,printOutput,... 
-% rxnScores,taskStructure) +% Examples +% -------- +% [outModel, addedRxns]=fitTasks(model,refModel,inputFile,printOutput,... +% rxnScores,taskStructure); if nargin<4 printOutput=true; diff --git a/gapfilling/gapReport.m b/gapfilling/gapReport.m index e4bccec3..b42eda99 100755 --- a/gapfilling/gapReport.m +++ b/gapfilling/gapReport.m @@ -1,45 +1,51 @@ function [noFluxRxns, noFluxRxnsRelaxed, subGraphs, notProducedMets, minToConnect,... neededForProductionMat, canProduceWithoutInput, canConsumeWithoutOutput, ... connectedFromTemplates, addedFromTemplates]=gapReport(model, templateModels) -% gapReport -% Performs a gap analysis and summarizes the results +% gapReport Perform a gap analysis and summarize the results. % -% model a model structure -% templateModels a cell array of template models to use for -% gap filling (optional) +% Parameters +% ---------- +% model : struct +% a model structure. +% templateModels : cell, optional +% a cell array of template models to use for gap filling. 
% -% noFluxRxns cell array with reactions that cannot carry -% flux -% noFluxRxnsRelaxed cell array with reactions that cannot carry -% flux even if the mass balance constraint is -% relaxed so that it is allowed to have -% net production of all metabolites -% subGraphs structure with the metabolites in each of -% the isolated sub networks -% notProducedMets cell array with the metabolites that -% couldn't have net production -% minToConnect structure with the minimal number of -% metabolites that need to be connected in -% order to be able to produce all other -% metabolites and which metabolites each of -% them connects -% neededForProductionMat matrix where n x m is true if metabolite n -% allows for production of metabolite m -% canProduceWithoutInput cell array with metabolites that could be -% produced even when there is no input to the -% model -% canConsumeWithoutOutput cell array with metabolites that could be -% consumed even when there is no output from -% the model -% connectedFromTemplates cell array with the reactions that could be -% connected using the template models -% addedFromTemplates structure with the reactions that were -% added from the template models and which -% model they were added from +% Returns +% ------- +% noFluxRxns : cell +% reactions that cannot carry flux. +% noFluxRxnsRelaxed : cell +% reactions that cannot carry flux even if the mass balance +% constraint is relaxed so that net production of all metabolites is +% allowed. +% subGraphs : struct +% the metabolites in each of the isolated sub-networks. +% notProducedMets : cell +% the metabolites that could not have net production. +% minToConnect : struct +% the minimal number of metabolites that need to be connected in +% order to be able to produce all other metabolites, and which +% metabolites each of them connects. +% neededForProductionMat : double +% matrix where n x m is true if metabolite n allows for production of +% metabolite m. 
+% canProduceWithoutInput : cell +% metabolites that could be produced even when there is no input to +% the model. +% canConsumeWithoutOutput : cell +% metabolites that could be consumed even when there is no output from +% the model. +% connectedFromTemplates : cell +% the reactions that could be connected using the template models. +% addedFromTemplates : struct +% the reactions that were added from the template models and which +% model they were added from. % -% Usage: [noFluxRxns, noFluxRxnsRelaxed, subGraphs, notProducedMets, minToConnect,... -% neededForProductionMat, connectedFromTemplates, addedFromTemplates]=... -% gapReport(model, templateModels) +% Examples +% -------- +% [noFluxRxns, noFluxRxnsRelaxed, subGraphs, notProducedMets, ... +% minToConnect, neededForProductionMat, connectedFromTemplates, ... +% addedFromTemplates] = gapReport(model, templateModels); if nargin<2 templateModels=[]; diff --git a/gapfilling/makeSomething.m b/gapfilling/makeSomething.m index 1b9b2ab8..11fa58ed 100755 --- a/gapfilling/makeSomething.m +++ b/gapfilling/makeSomething.m @@ -1,46 +1,57 @@ function [solution, metabolite]=makeSomething(model,ignoreMets,isNames,minNrFluxes,allowExcretion,params,ignoreIntBounds) -% makeSomething -% Tries to excrete any metabolite using as few reactions as possible. -% The intended use is when you want to make sure that you model cannot -% synthesize anything from nothing. It is then a faster way than to use -% checkProduction or canProduce +% makeSomething Excrete any metabolite using as few reactions as possible. % -% model a model structure -% ignoreMets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% of a vector of indexes for metabolites to exclude from -% this analysis (optional, default []) -% isNames true if the supplied mets represent metabolite names -% (as opposed to IDs). 
This is a way to delete -% metabolites in several compartments at once without -% knowing the exact IDs. This only works if ignoreMets -% is a cell array (optional, default false) -% minNrFluxes solves the MILP problem of minimizing the number of -% fluxes instead of the sum. Slower, but can be -% used if the sum gives too many fluxes (optional, default -% false) -% allowExcretion allow for excretion of all other metabolites (optional, -% default true) -% params *obsolete option* -% ignoreIntBounds true if internal bounds (including reversibility) -% should be ignored. Exchange reactions are not affected. -% This can be used to find unbalanced solutions which are -% not possible using the default constraints (optional, -% default false) +% Tries to excrete any metabolite using as few reactions as possible. The +% intended use is when you want to make sure that your model cannot +% synthesize anything from nothing. It is then a faster way than to use +% checkProduction or canProduce. % -% solution flux vector for the solution -% metabolite the index of the metabolite(s) which was excreted. If -% possible only one metabolite is reported, but there are -% situations where metabolites can only be excreted in -% pairs (or more) +% Parameters +% ---------- +% model : struct +% a model structure. +% ignoreMets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of indexes +% for metabolites to exclude from this analysis (default []). +% isNames : logical, optional +% true if the supplied mets represent metabolite names (as opposed to +% IDs). This is a way to delete metabolites in several compartments at +% once without knowing the exact IDs. This only works if ignoreMets is a +% cell array (default false). +% minNrFluxes : logical, optional +% solves the MILP problem of minimizing the number of fluxes instead of +% the sum.
Slower, but can be used if the sum gives too many fluxes +% (default false). +% allowExcretion : logical, optional +% allow for excretion of all other metabolites (default true). +% params : struct, optional +% *obsolete option*. +% ignoreIntBounds : logical, optional +% true if internal bounds (including reversibility) should be ignored. +% Exchange reactions are not affected. This can be used to find +% unbalanced solutions which are not possible using the default +% constraints (default false). % -% NOTE: This works by forcing at least 1 unit of "any metabolites" to be -% produced and then minimize for the sum of fluxes. If more than one -% metabolite is produced, it picks one of them to be produced and then -% minimizes for the sum of fluxes. +% Returns +% ------- +% solution : double +% flux vector for the solution. +% metabolite : double +% the index of the metabolite(s) that were excreted. If possible only +% one metabolite is reported, but there are situations where metabolites +% can only be excreted in pairs (or more). % -% Usage: [solution, metabolite]=makeSomething(model,ignoreMets,isNames,... -% minNrFluxes,allowExcretion,params,ignoreIntBounds) +% Examples +% -------- +% [solution, metabolite] = makeSomething(model, ignoreMets); +% +% Notes +% ----- +% This works by forcing at least 1 unit of "any metabolites" to be produced +% and then minimizing the sum of fluxes. If more than one metabolite is +% produced, it picks one of them to be produced and then minimizes the +% sum of fluxes. if nargin<2 ignoreMets=[]; diff --git a/io/SBMLFromExcel.m b/io/SBMLFromExcel.m index b88889cf..115d7148 100755 --- a/io/SBMLFromExcel.m +++ b/io/SBMLFromExcel.m @@ -1,21 +1,29 @@ function SBMLFromExcel(fileName, outputFileName,toCOBRA,printWarnings) -% SBMLFromExcel -% Converts a model in the Excel format to SBML +% SBMLFromExcel Convert a model in the Excel format to SBML.
% -% fileName the Excel file -% outputFileName the SBML file -% toCOBRA true if the model should be saved in COBRA Toolbox -% format. Only limited support at the moment (optional, -% default false) -% printWarnings true if warnings about model issues should be reported -% (optional, default true) +% For a detailed description of the file format, see the supplied manual. % -% For a detailed description of the file format, see the supplied manual. +% Parameters +% ---------- +% fileName : char +% the Excel file. +% outputFileName : char +% the SBML file. +% toCOBRA : logical, optional +% true if the model should be saved in COBRA Toolbox format. Only +% limited support at the moment (default false). +% printWarnings : logical, optional +% true if warnings about model issues should be reported (default +% true). % -% Usage: SBMLFromExcel(fileName,outputFileName,toCOBRA,printWarnings) +% Examples +% -------- +% SBMLFromExcel(fileName, outputFileName, toCOBRA, printWarnings); % -% NOTE: This is just a wrapper function for importExcelModel, printModelStats -% and exportModel. Use those functions directly for greater control. +% Notes +% ----- +% This is just a wrapper function for importExcelModel, printModelStats and +% exportModel. Use those functions directly for greater control. fileName=char(fileName); outputFileName=char(outputFileName); if nargin<3 diff --git a/io/addJavaPaths.m b/io/addJavaPaths.m index f70ee6ac..87f765d2 100755 --- a/io/addJavaPaths.m +++ b/io/addJavaPaths.m @@ -1,8 +1,8 @@ -% addJavaPaths -% Adds the Apache POI classes to the static Java paths +% addJavaPaths Add the Apache POI classes to the static Java paths. 
% -% Usage: addJavaPaths() - +% Examples +% -------- +% addJavaPaths(); function addJavaPaths() %Get the path to Apache POI ravenPath=findRAVENroot(); diff --git a/io/checkFileExistence.m b/io/checkFileExistence.m index 576674e9..7d50ef2f 100755 --- a/io/checkFileExistence.m +++ b/io/checkFileExistence.m @@ -1,30 +1,37 @@ function files=checkFileExistence(files,fullOrTemp,allowSpace,checkExist) -% checkFileExistence -% Check whether files exist. If no full path is given a file should be -% located in the current folder, which by default is appended to the -% filename. +% checkFileExistence Check whether files exist. % -% Input: -% files string or cell array of strings with path to file(s) or -% path or filename(s) -% fullOrTemp 0: do not change path to file(s) -% 1: return full path to file(s) -% 2: copy file(s) to system default temporary folder and -% return full path -% (optional, default 0) -% allowSpace logical, whether 'space' character is allowed in the -% path (optional, default true) -% checkExist logical, whether file existence should really be -% checked, as this function can also be used to return -% the full path to a new file (optional, default true). Can -% only be set to false if fullOrTemp is set to 1. +% If no full path is given a file should be located in the current folder, +% which by default is appended to the filename. % -% Output: -% files string or cell array of strings with updated paths if -% fullOrTemp was set as 1 or 2, otherwise original paths -% are returned -% -% Usage: files=checkFileExistence(files,fullOrTemp,allowSpace,checkExist) +% Parameters +% ---------- +% files : char or cell +% string or cell array of strings with path to file(s) or path or +% filename(s). 
+% fullOrTemp : double, optional +% controls path handling (default 0): +% +% - 0 : do not change path to file(s) +% - 1 : return full path to file(s) +% - 2 : copy file(s) to system default temporary folder and return +% full path +% allowSpace : logical, optional +% whether the 'space' character is allowed in the path (default true). +% checkExist : logical, optional +% whether file existence should really be checked, as this function can +% also be used to return the full path to a new file (default true). +% Can only be set to false if fullOrTemp is set to 1. +% +% Returns +% ------- +% files : char or cell +% string or cell array of strings with updated paths if fullOrTemp was +% set as 1 or 2, otherwise original paths are returned. +% +% Examples +% -------- +% files = checkFileExistence(files, fullOrTemp, allowSpace, checkExist); if nargin<2 fullOrTemp = 0; diff --git a/io/cleanSheet.m b/io/cleanSheet.m index c2184884..2346d7f6 100755 --- a/io/cleanSheet.m +++ b/io/cleanSheet.m @@ -1,22 +1,34 @@ -% cleanSheet -% Cleans up an Excel sheet by removing empty rows/colums (and some other -% checks) +% cleanSheet Clean up an Excel sheet. % -% raw cell array with the data in the sheet -% removeComments true if commented lines (non-empty first cell in each -% row) should be removed (optional, default true) -% removeOnlyCap remove columns with captions but no other values (optional, -% default false) -% removeNoCap remove columns without captions (optional, default true) -% removeEmptyRows remove rows with no non-empty cells (optional, default true) -% -% raw cleaned version -% keptRows indexes of the kept rows in the original structure -% keptCols indexes of the kept columns in the original structure +% Removes empty rows/columns (and performs some other checks). % -% Usage: [raw,keptRows,keptCols]=cleanSheet(raw,removeComments,removeOnlyCap,... -% removeNoCap,removeEmptyRows) - +% Parameters +% ---------- +% raw : cell +% cell array with the data in the sheet. 
+% removeComments : logical, optional +% true if commented lines (non-empty first cell in each row) should be +% removed (default true). +% removeOnlyCap : logical, optional +% remove columns with captions but no other values (default false). +% removeNoCap : logical, optional +% remove columns without captions (default true). +% removeEmptyRows : logical, optional +% remove rows with no non-empty cells (default true). +% +% Returns +% ------- +% raw : cell +% cleaned version. +% keptRows : double +% indices of the kept rows in the original structure. +% keptCols : double +% indices of the kept columns in the original structure. +% +% Examples +% -------- +% [raw, keptRows, keptCols] = cleanSheet(raw, removeComments, ... +% removeOnlyCap, removeNoCap, removeEmptyRows); function [raw,keptRows,keptCols]=cleanSheet(raw,removeComments,removeOnlyCap,removeNoCap,removeEmptyRows) if nargin<2 removeComments=true; diff --git a/io/exportForGit.m b/io/exportForGit.m index 48448744..bea56a59 100755 --- a/io/exportForGit.m +++ b/io/exportForGit.m @@ -1,38 +1,45 @@ function out=exportForGit(model,prefix,path,formats,mainBranchFlag,subDirs,COBRAtext,neverPrefixIDs) -% exportForGit -% Generates a directory structure and populates this with model files, ready -% to be commited to a Git(Hub) maintained model repository. Writes the model -% as SBML L3V1 FBCv2 (both XML and YAML), COBRA text, Matlab MAT-file -% orthologies in KEGG +% exportForGit Export a model for a Git-maintained model repository. 
% -% model model structure in RAVEN format that should be -% exported -% prefix prefix for all filenames (optional, default 'model') -% path path where the directory structure should be -% generated and populated with all files (optional, -% default to current working directory) -% formats cell array of strings specifying in what file -% formats the model should be exported (optional, -% default to all formats as {'mat', 'txt', 'xlsx', -% 'xml', 'yml'}) -% mainBranchFlag logical, if true, function will error if RAVEN (and -% COBRA if detected) is/are not on the main branch. -% (optional, default false) -% subDirs logical, whether model files for each file format -% should be written in its own subdirectory, with -% 'model' as parent directory, in accordance to the -% standard-GEM repository format. If false, all files -% are stored in the same folder. (optional, default -% true) -% COBRAtext logical, whether the txt file should be in COBRA -% Toolbox format using metabolite IDs, instead of -% metabolite names and compartments. (optional, -% default false) -% neverPrefixIDs true if prefixes are never added to identifiers, -% even if start with e.g. digits. This might result -% in invalid SBML files (optional, default false) +% Generates a directory structure and populates it with model files, ready +% to be committed to a Git(Hub) maintained model repository. Writes the +% model as SBML L3V1 FBCv2 (both XML and YAML), COBRA text, Matlab MAT-file +% and Microsoft Excel formats. % -% Usage: exportForGit(model,prefix,path,formats,mainBranchFlag,subDirs,COBRAtext,COBRAstyle) +% Parameters +% ---------- +% model : struct +% model structure in RAVEN format that should be exported. +% prefix : char, optional +% prefix for all filenames (default 'model'). +% path : char, optional +% path where the directory structure should be generated and populated +% with all files (default current working directory). 
+% formats : cell, optional +% cell array of strings specifying in what file formats the model +% should be exported (default all formats as {'mat', 'txt', 'xlsx', +% 'xml', 'yml'}). +% mainBranchFlag : logical, optional +% if true, the function will error if RAVEN (and COBRA if detected) +% is/are not on the main branch (default false). +% subDirs : logical, optional +% whether model files for each file format should be written in their +% own subdirectory, with 'model' as parent directory, in accordance with +% the standard-GEM repository format. If false, all files are stored in +% the same folder (default true). +% COBRAtext : logical, optional +% whether the txt file should be in COBRA Toolbox format using +% metabolite IDs, instead of metabolite names and compartments +% (default false). +% neverPrefixIDs : logical, optional +% true if prefixes are never added to identifiers, even if they start +% with e.g. digits. This might result in invalid SBML files (default +% false). +% +% Examples +% -------- +% exportForGit(model, prefix, path, formats, mainBranchFlag, subDirs, ... +% COBRAtext, neverPrefixIDs); if nargin<8 neverPrefixIDs=false; end diff --git a/io/exportModel.m b/io/exportModel.m index 27c9bb6f..7ef80bce 100755 --- a/io/exportModel.m +++ b/io/exportModel.m @@ -1,22 +1,27 @@ function exportModel(model,fileName,neverPrefix,supressWarnings,sortIds) -% exportModel -% Exports a constraint-based model to an SBML file (L3V1 FBCv2) +% exportModel Export a constraint-based model to an SBML file (L3V1 FBCv2). % -% Input: -% model a model structure -% fileName filename to export the model to. A dialog window -% will open if no file name is specified. -% neverPrefix true if prefixes are never added to identifiers, -% even if start with e.g. digits. This might result -% in invalid SBML files (optional, default false) -% supressWarnings true if warnings should be supressed.
This might -% results in invalid SBML files, as no checks are -% performed (optional, default false) -% sortIds logical whether metabolites, reactions and genes -% should be sorted alphabetically by their -% identifiers (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% fileName : char +% filename to export the model to. A dialog window will open if no file +% name is specified. +% neverPrefix : logical, optional +% true if prefixes are never added to identifiers, even if they start +% with e.g. digits. This might result in invalid SBML files (default +% false). +% supressWarnings : logical, optional +% true if warnings should be suppressed. This might result in invalid +% SBML files, as no checks are performed (default false). +% sortIds : logical, optional +% whether metabolites, reactions and genes should be sorted +% alphabetically by their identifiers (default false). % -% Usage: exportModel(model,fileName,neverPrefix,supressWarnings,sortIds) +% Examples +% -------- +% exportModel(model, fileName, neverPrefix, supressWarnings, sortIds); if nargin<2 || isempty(fileName) [fileName, pathName] = uiputfile({'*.xml;*.sbml'}, 'Select file for model export',[model.id '.xml']); diff --git a/io/exportModelToSIF.m b/io/exportModelToSIF.m index fe312f12..b72c4ff6 100755 --- a/io/exportModelToSIF.m +++ b/io/exportModelToSIF.m @@ -1,19 +1,26 @@ function exportModelToSIF(model,fileName,graphType,rxnLabels,metLabels) -% exportModelToSIF -% Exports a constraint-based model to a SIF file +% exportModelToSIF Export a constraint-based model to a SIF file. 
% -% model a model structure -% fileName the filename to export the model to -% graphType the type of graph to export to (optional, default 'rc') -% 'rc' reaction-compound -% 'rr' reaction-reaction -% 'cc' compound-compound -% rxnLabels cell array with labels for reactions (optional, default -% model.rxns) -% metLabels cell array with labels for metabolites (optional, default -% model.mets) +% Parameters +% ---------- +% model : struct +% a model structure. +% fileName : char +% the filename to export the model to. +% graphType : char, optional +% the type of graph to export to (default 'rc'): % -% Usage: exportModelToSIF(model,fileName,graphType,rxnLabels,metLabels) +% - 'rc' : reaction-compound +% - 'rr' : reaction-reaction +% - 'cc' : compound-compound +% rxnLabels : cell, optional +% cell array with labels for reactions (default model.rxns). +% metLabels : cell, optional +% cell array with labels for metabolites (default model.mets). +% +% Examples +% -------- +% exportModelToSIF(model, fileName, graphType, rxnLabels, metLabels); fileName=char(fileName); if nargin<3 graphType='rc'; diff --git a/io/exportToExcelFormat.m b/io/exportToExcelFormat.m index 9337e76e..f1bf9fc6 100755 --- a/io/exportToExcelFormat.m +++ b/io/exportToExcelFormat.m @@ -1,19 +1,23 @@ function exportToExcelFormat(model,fileName,sortIds) -% exportToExcelFormat -% Exports a model structure to the Microsoft Excel model format +% exportToExcelFormat Export a model to the Microsoft Excel model format. % -% Input: -% model a model structure -% fileName file name of the Excel file. Only xlsx format is supported. -% In order to preserve backward compatibility this could also -% be only a path, in which case the model is exported to a -% set of tab-delimited text files via exportToTabDelimited. -% A dialog window will open if fileName is empty. 
-% sortIds logical whether metabolites, reactions and genes should be -% sorted alphabetically by their identifiers (optional, -% default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% fileName : char +% file name of the Excel file. Only xlsx format is supported. In order +% to preserve backward compatibility this could also be only a path, in +% which case the model is exported to a set of tab-delimited text files +% via exportToTabDelimited. A dialog window will open if fileName is +% empty. +% sortIds : logical, optional +% whether metabolites, reactions and genes should be sorted +% alphabetically by their identifiers (default false). % -% Usage: exportToExcelFormat(model, fileName, sortIds) +% Examples +% -------- +% exportToExcelFormat(model, fileName, sortIds); if nargin<2 || isempty(fileName) [fileName, pathName] = uiputfile('*.xlsx', 'Select file for model export',[model.id '.xlsx']); diff --git a/io/exportToTabDelimited.m b/io/exportToTabDelimited.m index 98ad6bb7..c8609863 100755 --- a/io/exportToTabDelimited.m +++ b/io/exportToTabDelimited.m @@ -1,22 +1,29 @@ function exportToTabDelimited(model,path,sortIds) -% exportToTabDelimited -% Exports a model structure to a set of tab-delimited text files +% exportToTabDelimited Export a model to tab-delimited text files. % -% model a model structure -% path the path to export to. The resulting text files will be saved -% under the names excelRxns.txt, excelMets.txt, excelGenes.txt, -% excelModel.txt, and excelComps.txt -% sortIds logical whether metabolites, reactions and genes should be -% sorted alphabetically by their identifiers (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% path : char, optional +% the path to export to. The resulting text files will be saved under +% the names excelRxns.txt, excelMets.txt, excelGenes.txt, +% excelModel.txt, and excelComps.txt (default './'). 
+% sortIds : logical, optional +% whether metabolites, reactions and genes should be sorted +% alphabetically by their identifiers (default false). % -% NOTE: This functionality was previously a part of exportToExcelFormat. -% The naming of the resulting text files is to preserve backward -% compatibility +% Examples +% -------- +% exportToTabDelimited(model, path, sortIds); % -% NOTE: No checks are made regarding the correctness of the model. Use -% checkModelStruct to identify problems in the model structure +% Notes +% ----- +% This functionality was previously a part of exportToExcelFormat. The +% naming of the resulting text files is to preserve backward compatibility. % -% Usage: exportToTabDelimited(model,path,sortIds) +% No checks are made regarding the correctness of the model. Use +% checkModelStruct to identify problems in the model structure. if nargin<2 path='./'; diff --git a/io/getFullPath.m b/io/getFullPath.m index a766c49e..0b3bfee6 100755 --- a/io/getFullPath.m +++ b/io/getFullPath.m @@ -1,57 +1,66 @@ function File = getFullPath(File, Style) -% getFullPath - Get absolute canonical path of a file or folder -% Absolute path names are safer than relative paths, when e.g. a GUI or TIMER -% callback changes the current directory. Only canonical paths without "." and -% ".." can be recognized uniquely. -% Long path names (>259 characters) require a magic initial key "\\?\" to be -% handled by Windows API functions, e.g. for Matlab's FOPEN, DIR and EXIST. +% getFullPath Get the absolute canonical path of a file or folder. % -% FullName = getFullPath(Name, Style) -% INPUT: -% Name: String or cell string, absolute or relative name of a file or -% folder. The path need not exist. Unicode strings, UNC paths and long -% names are supported. -% Style: Style of the output as string, optional, default: 'auto'. -% 'auto': Add '\\?\' or '\\?\UNC\' for long names on demand. -% 'lean': Magic string is not added. -% 'fat': Magic string is added for short names also. 
-% The Style is ignored when not running under Windows. +% Absolute path names are safer than relative paths, when e.g. a GUI or +% TIMER callback changes the current directory. Only canonical paths +% without "." and ".." can be recognized uniquely. Long path names (>259 +% characters) require a magic initial key "\\?\" to be handled by Windows +% API functions, e.g. for Matlab's FOPEN, DIR and EXIST. % -% OUTPUT: -% FullName: Absolute canonical path name as string or cell string. -% For empty strings the current directory is replied. -% '\\?\' or '\\?\UNC' is added on demand. +% Parameters +% ---------- +% File : char or cell +% absolute or relative name of a file or folder, as a string or cell +% string. The path need not exist. Unicode strings, UNC paths and long +% names are supported. +% Style : char, optional +% style of the output (default 'auto'). Ignored when not running under +% Windows. One of: % -% NOTE: The M- and the MEX-version create the same results, the faster MEX -% function works under Windows only. -% Some functions of the Windows-API still do not support long file names. -% E.g. the Recycler and the Windows Explorer fail even with the magic '\\?\' -% prefix. Some functions of Matlab accept 260 characters (value of MAX_PATH), -% some at 259 already. Don't blame me. -% The 'fat' style is useful e.g. when Matlab's DIR command is called for a -% folder with les than 260 characters, but together with the file name this -% limit is exceeded. Then "dir(getFullPath([folder, '\*.*], 'fat'))" helps. +% - 'auto' : add '\\?\' or '\\?\UNC\' for long names on demand. +% - 'lean' : magic string is not added. +% - 'fat' : magic string is added for short names also. 
% -% EXAMPLES: -% cd(tempdir); % Assumed as 'C:\Temp' here -% getFullPath('File.Ext') % 'C:\Temp\File.Ext' -% getFullPath('..\File.Ext') % 'C:\File.Ext' -% getFullPath('..\..\File.Ext') % 'C:\File.Ext' -% getFullPath('.\File.Ext') % 'C:\Temp\File.Ext' -% getFullPath('*.txt') % 'C:\Temp\*.txt' -% getFullPath('..') % 'C:\' -% getFullPath('..\..\..') % 'C:\' -% getFullPath('Folder\') % 'C:\Temp\Folder\' -% getFullPath('D:\A\..\B') % 'D:\B' -% getFullPath('\\Server\Folder\Sub\..\File.ext') -% % '\\Server\Folder\File.ext' -% getFullPath({'..', 'new'}) % {'C:\', 'C:\Temp\new'} -% getFullPath('.', 'fat') % '\\?\C:\Temp\File.Ext' +% Returns +% ------- +% File : char or cell +% absolute canonical path name as a string or cell string. For empty +% strings the current directory is returned. '\\?\' or '\\?\UNC' is +% added on demand. +% +% Examples +% -------- +% cd(tempdir); % Assumed as 'C:\Temp' here +% getFullPath('File.Ext') % 'C:\Temp\File.Ext' +% getFullPath('..\File.Ext') % 'C:\File.Ext' +% getFullPath('..\..\File.Ext') % 'C:\File.Ext' +% getFullPath('.\File.Ext') % 'C:\Temp\File.Ext' +% getFullPath('*.txt') % 'C:\Temp\*.txt' +% getFullPath('..') % 'C:\' +% getFullPath('..\..\..') % 'C:\' +% getFullPath('Folder\') % 'C:\Temp\Folder\' +% getFullPath('D:\A\..\B') % 'D:\B' +% getFullPath('\\Server\Folder\Sub\..\File.ext') +% % '\\Server\Folder\File.ext' +% getFullPath({'..', 'new'}) % {'C:\', 'C:\Temp\new'} +% getFullPath('.', 'fat') % '\\?\C:\Temp\File.Ext' +% +% Notes +% ----- +% The M- and the MEX-version create the same results; the faster MEX +% function works under Windows only. Some functions of the Windows-API +% still do not support long file names. E.g. the Recycler and the Windows +% Explorer fail even with the magic '\\?\' prefix. Some functions of Matlab +% accept 260 characters (value of MAX_PATH), some at 259 already. Don't +% blame me. The 'fat' style is useful e.g.
when Matlab's DIR command is +% called for a folder with fewer than 260 characters, but together with the +% file name this limit is exceeded. Then "dir(getFullPath([folder, +% '\*.*'], 'fat'))" helps. % % COMPILE: -% Automatic: InstallMex getFullPath.c uTest_getFullPath -% Manual: mex -O getFullPath.c -% Download: http://www.n-simon.de/mex +% Automatic: InstallMex getFullPath.c uTest_getFullPath +% Manual: mex -O getFullPath.c +% Download: http://www.n-simon.de/mex % Run the unit-test uTest_getFullPath after compiling. % % Tested: Matlab 6.5, 7.7, 7.8, 7.13, WinXP/32, Win7/64 @@ -59,7 +68,9 @@ % Assumed Compatibility: higher Matlab versions % Author: Jan Simon, Heidelberg, (C) 2009-2013 matlab.THISYEAR(a)nMINUSsimon.de % -% See also: CD, FULLFILE, FILEPARTS. +% See also +% -------- +% cd, fullfile, fileparts % $JRev: R-G V:032 Sum:7Xd/JS0+yfax Date:15-Jan-2013 01:06:12 $ % $License: BSD (use/copy/change/redistribute on own risk, mention the author) $ diff --git a/io/getMD5Hash.m b/io/getMD5Hash.m index 36c9bb35..caafcc1c 100755 --- a/io/getMD5Hash.m +++ b/io/getMD5Hash.m @@ -1,20 +1,25 @@ function md5Hash=getMD5Hash(inputFile,binEnd) -% getMD5Hash -% Calculates MD5 hash for a file +% getMD5Hash Calculate the MD5 hash for a file. % -% Input: -% inputFile string with the path to file for which MD5 hash should -% be calculated -% binEnd string that shows the operating system running in the -% client's computer. Use ".exe" for Windows, ".mac" for -% macOS or leave it blank for Linux (""). (optional, by -% default the function automatically detects the client's -% operating system) +% Parameters +% ---------- +% inputFile : char +% string with the path to the file for which the MD5 hash should be +% calculated. +% binEnd : char, optional +% string that indicates the operating system running on the client's +% computer. Use ".exe" for Windows, ".mac" for macOS or leave it blank +% for Linux ("").
(default: the function automatically detects the +% client's operating system). % -% Output: -% md5Hash string containing an MD5 hash for inputFile -% -% Usage: md5Hash=getMD5Hash(inputFile,binEnd) +% Returns +% ------- +% md5Hash : char +% string containing an MD5 hash for inputFile. +% +% Examples +% -------- +% md5Hash = getMD5Hash(inputFile, binEnd); inputFile=char(inputFile); if nargin<2 diff --git a/io/getToolboxVersion.m b/io/getToolboxVersion.m index 964e0834..7de954de 100755 --- a/io/getToolboxVersion.m +++ b/io/getToolboxVersion.m @@ -1,18 +1,29 @@ function version = getToolboxVersion(toolbox,fileID,mainBranchFlag) -% getToolboxVersion -% Returns the version of a given toolbox, or if not available the latest -% commit hash (7 characters). +% getToolboxVersion Return the version of a given toolbox. % -% toolbox string with the toolbox name (e.g. "RAVEN") -% fileID string with the name of a file that is only found in -% the corresponding toolbox (e.g. "ravenCobraWrapper.m"). -% mainBranchFlag logical, if true, function will error if the toolbox is -% not on the main branch (optional, default false). +% Returns the version of a given toolbox, or if not available the latest +% commit hash (7 characters). % -% version string containing either the toolbox version or latest -% commit hash (7 characters). +% Parameters +% ---------- +% toolbox : char +% string with the toolbox name (e.g. "RAVEN"). +% fileID : char +% string with the name of a file that is only found in the +% corresponding toolbox (e.g. "ravenCobraWrapper.m"). +% mainBranchFlag : logical, optional +% if true, the function will error if the toolbox is not on the main +% branch (default false). % -% Usage: version = getToolboxVersion(toolbox,fileID,mainBranchFlag) +% Returns +% ------- +% version : char +% string containing either the toolbox version or latest commit hash +% (7 characters). 
+% +% Examples +% -------- +% version = getToolboxVersion(toolbox, fileID, mainBranchFlag); toolbox=char(toolbox); fileID=char(fileID); diff --git a/io/importExcelModel.m b/io/importExcelModel.m index 0c4c0784..8edfd188 100755 --- a/io/importExcelModel.m +++ b/io/importExcelModel.m @@ -1,79 +1,98 @@ function model=importExcelModel(fileName,removeExcMets,printWarnings,ignoreErrors) -% importExcelModel -% Imports a constraint-based model from a Excel file +% importExcelModel Import a constraint-based model from an Excel file. % -% fileName a Microsoft Excel file to import -% removeExcMets true if exchange metabolites should be removed. This is -% needed to be able to run simulations, but it could also -% be done using simplifyModel at a later stage (optional, -% default true) -% printWarnings true if warnings should be printed (optional, default true) -% ignoreErrors true if errors should be ignored. See below for details -% (optional, default false) +% Loads models in the RAVEN Toolbox Excel format. 
% -% model -% annotation -% taxonomy String with the NCBI Taxonomy ID, as valid -% identifiers.org annotation -% defaultLB Double with the default lower bound values for reactions -% defaultUB Double with the default upper bound values for reactions -% givenName String with the name of the main model author -% familyName String with the surname of the main model author -% email String with the e-mail address of the main model author -% organization String with the organization of the main model author -% note String with additional comments about the model -% name name of model -% id model ID -% rxns reaction ids -% mets metabolite ids -% S stoichiometric matrix -% lb lower bounds -% ub upper bounds -% rev reversibility vector -% c objective coefficients -% b equality constraints for the metabolite equations -% comps compartment ids -% compNames compartment names -% compOutside the id (as in comps) for the compartment -% surrounding each of the compartments -% compMiriams structure with MIRIAM information about the -% compartments -% rxnNames reaction name -% rxnComps compartments for reactions -% grRules reaction to gene rules in text form -% rxnGeneMat reaction-to-gene mapping in sparse matrix form -% subSystems subsystem name for each reaction -% eccodes EC-codes for the reactions -% rxnMiriams structure with MIRIAM information about the reactions -% rxnNotes reaction notes -% rxnReferences reaction references -% rxnConfidenceScores reaction confidence scores -% genes list of all genes -% geneComps compartments for genes -% geneMiriams structure with MIRIAM information about the genes -% geneShortNames gene alternative names (e.g. 
ERG10) -% metNames metabolite name -% metComps compartments for metabolites -% inchis InChI-codes for metabolites -% metFormulas metabolite chemical formula -% metMiriams structure with MIRIAM information about the metabolites -% metCharges metabolite charge -% unconstrained true if the metabolite is an exchange metabolite +% Parameters +% ---------- +% fileName : char +% a Microsoft Excel file to import. +% removeExcMets : logical, optional +% true if exchange metabolites should be removed. This is needed to be +% able to run simulations, but it could also be done using +% simplifyModel at a later stage (default true). +% printWarnings : logical, optional +% true if warnings should be printed (default true). +% ignoreErrors : logical, optional +% true if errors should be ignored. See Notes for details (default +% false). % -% Loads models in the RAVEN Toolbox Excel format. A number of consistency -% checks are performed in order to ensure that the model is valid. These -% can be ignored by putting ignoreErrors to true. However, this is highly -% advised against, as it can result in errors in simulations or other -% functionalities. The RAVEN Toolbox is made to function only on consistent -% models, and the only checks performed are when the model is imported. +% Returns +% ------- +% model : struct +% imported model structure with fields: % -% NOTE: Most errors are checked for by checkModelStruct, but some -% are checked for in this function as well. Those are ones which relate -% to missing model elements and so on, and which would make it impossible -% to construct the model structure. Those errors cannot be ignored by -% setting ignoreErrors to true. 
+% - annotation : structure with model metadata, with fields: % -% Usage: model=importExcelModel(fileName,removeExcMets,printWarnings,ignoreErrors) +% - taxonomy : String with the NCBI Taxonomy ID, as valid +% identifiers.org annotation +% - defaultLB : Double with the default lower bound values for +% reactions +% - defaultUB : Double with the default upper bound values for +% reactions +% - givenName : String with the name of the main model author +% - familyName : String with the surname of the main model author +% - email : String with the e-mail address of the main model author +% - organization : String with the organization of the main model +% author +% - note : String with additional comments about the model +% +% - name : name of model +% - id : model ID +% - rxns : reaction ids +% - mets : metabolite ids +% - S : stoichiometric matrix +% - lb : lower bounds +% - ub : upper bounds +% - rev : reversibility vector +% - c : objective coefficients +% - b : equality constraints for the metabolite equations +% - comps : compartment ids +% - compNames : compartment names +% - compOutside : the id (as in comps) for the compartment surrounding +% each of the compartments +% - compMiriams : structure with MIRIAM information about the +% compartments +% - rxnNames : reaction name +% - rxnComps : compartments for reactions +% - grRules : reaction to gene rules in text form +% - rxnGeneMat : reaction-to-gene mapping in sparse matrix form +% - subSystems : subsystem name for each reaction +% - eccodes : EC-codes for the reactions +% - rxnMiriams : structure with MIRIAM information about the reactions +% - rxnNotes : reaction notes +% - rxnReferences : reaction references +% - rxnConfidenceScores : reaction confidence scores +% - genes : list of all genes +% - geneComps : compartments for genes +% - geneMiriams : structure with MIRIAM information about the genes +% - geneShortNames : gene alternative names (e.g. 
ERG10) +% - metNames : metabolite name +% - metComps : compartments for metabolites +% - inchis : InChI-codes for metabolites +% - metFormulas : metabolite chemical formula +% - metMiriams : structure with MIRIAM information about the metabolites +% - metCharges : metabolite charge +% - unconstrained : true if the metabolite is an exchange metabolite +% +% Examples +% -------- +% model = importExcelModel(fileName, removeExcMets, printWarnings, ignoreErrors); +% +% Notes +% ----- +% A number of consistency checks are performed in order to ensure that the +% model is valid. These can be ignored by setting ignoreErrors to true. +% However, this is highly advised against, as it can result in errors in +% simulations or other functionalities. The RAVEN Toolbox is made to +% function only on consistent models, and the only checks performed are +% when the model is imported. +% +% Most errors are checked for by checkModelStruct, but some are checked for +% in this function as well. These relate to missing model elements and +% similar problems that would make it impossible to construct the model +% structure. Those errors cannot be ignored by setting ignoreErrors to +% true. fileName=char(fileName); if nargin<2 diff --git a/io/importModel.m b/io/importModel.m index dae96063..df4c487b 100755 --- a/io/importModel.m +++ b/io/importModel.m @@ -1,69 +1,78 @@ function model=importModel(fileName,removeExcMets,removePrefix,supressWarnings) -% importModel -% Import a constraint-based model from an SBML file. +% importModel Import a constraint-based model from an SBML file. % -% Input: -% fileName a SBML file to import. A dialog window will open if -% no file name is specified. -% removeExcMets true if exchange metabolites should be removed.
This is -% needed to be able to run simulations, but it could also -% be done using simplifyModel at a later stage (optional, -% default true) -% removePrefix true if identifier prefixes should be removed when -% loading the model: G_ for genes, R_ for reactions, -% M_ for metabolites, and C_ for compartments. These are -% only removed if all identifiers of a certain type -% contain the prefix. (optional, default true) -% supressWarnings true if warnings regarding the model structure should -% be supressed (optional, default false) +% Parameters +% ---------- +% fileName : char +% an SBML file to import. A dialog window will open if no file name is +% specified. +% removeExcMets : logical, optional +% true if exchange metabolites should be removed. This is needed to be +% able to run simulations, but it could also be done using +% simplifyModel at a later stage (default true). +% removePrefix : logical, optional +% true if identifier prefixes should be removed when loading the model: +% G_ for genes, R_ for reactions, M_ for metabolites, and C_ for +% compartments. These are only removed if all identifiers of a certain +% type contain the prefix (default true). +% supressWarnings : logical, optional +% true if warnings regarding the model structure should be suppressed +% (default false).
% -% Output: -% model -% id model ID -% name name of model contents -% annotation additional information about model -% rxns reaction ids -% mets metabolite ids -% S stoichiometric matrix -% lb lower bounds -% ub upper bounds -% rev reversibility vector -% c objective coefficients -% b equality constraints for the metabolite equations -% comps compartment ids -% compNames compartment names -% compOutside the id (as in comps) for the compartment -% surrounding each of the compartments -% compMiriams structure with MIRIAM information about the -% compartments -% rxnNames reaction description -% rxnComps compartments for reactions -% grRules reaction to gene rules in text form -% rxnGeneMat reaction-to-gene mapping in sparse matrix form -% subSystems subsystem name for each reaction -% eccodes EC-codes for the reactions -% rxnMiriams structure with MIRIAM information about the reactions -% rxnNotes reaction notes -% rxnReferences reaction references -% rxnConfidenceScores reaction confidence scores -% genes list of all genes -% geneComps compartments for genes -% geneMiriams structure with MIRIAM information about the genes -% geneShortNames gene alternative names (e.g. 
ERG10) -% proteins protein associated to each gene -% metNames metabolite description -% metComps compartments for metabolites -% inchis InChI-codes for metabolites -% metFormulas metabolite chemical formula -% metMiriams structure with MIRIAM information about the metabolites -% metCharges metabolite charge -% unconstrained true if the metabolite is an exchange metabolite +% Returns +% ------- +% model : struct +% imported model structure with fields: % -% Note: A number of consistency checks are performed in order to ensure that the +% - id : model ID +% - name : name of model contents +% - annotation : additional information about model +% - rxns : reaction ids +% - mets : metabolite ids +% - S : stoichiometric matrix +% - lb : lower bounds +% - ub : upper bounds +% - rev : reversibility vector +% - c : objective coefficients +% - b : equality constraints for the metabolite equations +% - comps : compartment ids +% - compNames : compartment names +% - compOutside : the id (as in comps) for the compartment surrounding +% each of the compartments +% - compMiriams : structure with MIRIAM information about the +% compartments +% - rxnNames : reaction description +% - rxnComps : compartments for reactions +% - grRules : reaction to gene rules in text form +% - rxnGeneMat : reaction-to-gene mapping in sparse matrix form +% - subSystems : subsystem name for each reaction +% - eccodes : EC-codes for the reactions +% - rxnMiriams : structure with MIRIAM information about the reactions +% - rxnNotes : reaction notes +% - rxnReferences : reaction references +% - rxnConfidenceScores : reaction confidence scores +% - genes : list of all genes +% - geneComps : compartments for genes +% - geneMiriams : structure with MIRIAM information about the genes +% - geneShortNames : gene alternative names (e.g. 
ERG10) +% - proteins : protein associated to each gene +% - metNames : metabolite description +% - metComps : compartments for metabolites +% - inchis : InChI-codes for metabolites +% - metFormulas : metabolite chemical formula +% - metMiriams : structure with MIRIAM information about the metabolites +% - metCharges : metabolite charge +% - unconstrained : true if the metabolite is an exchange metabolite +% +% Examples +% -------- +% model = importModel(fileName, removeExcMets, removePrefix, supressWarnings); +% +% Notes +% ----- +% A number of consistency checks are performed in order to ensure that the % model is valid. Take these warnings seriously and modify the model % structure to solve them. -% -% Usage: model = importModel(fileName, removeExcMets, removePrefix, supressWarnings) if nargin<1 || isempty(fileName) [fileName, pathName] = uigetfile({'*.xml;*.sbml'}, 'Please select the model file'); diff --git a/io/loadSheet.m b/io/loadSheet.m index 3324964e..211a30c4 100755 --- a/io/loadSheet.m +++ b/io/loadSheet.m @@ -1,14 +1,25 @@ -% loadSheet -% Loads an Excel sheet into a cell matrix using the Java library Apache POI +% loadSheet Load an Excel sheet into a cell matrix. % -% workbook Workbook object representing the Excel file -% sheet name of the sheet (optional, default first sheet) +% Loads an Excel sheet into a cell matrix using the Java library Apache +% POI. % -% raw cell array with the data in the sheet -% flag 0 if everything worked, -1 if it didn't +% Parameters +% ---------- +% workbook : Workbook +% Workbook object representing the Excel file. +% sheet : char, optional +% name of the sheet (default first sheet). % -% Usage: [raw, flag]=loadSheet(workbook, sheet) - +% Returns +% ------- +% raw : cell +% cell array with the data in the sheet. +% flag : double +% 0 if everything worked, -1 if it didn't. 
+% +% Examples +% -------- +% [raw, flag] = loadSheet(workbook, sheet); function [raw, flag]=loadSheet(workbook, sheet) if nargin<2 sheet=[]; diff --git a/io/loadWorkbook.m b/io/loadWorkbook.m index c1444533..038b82d3 100755 --- a/io/loadWorkbook.m +++ b/io/loadWorkbook.m @@ -1,15 +1,25 @@ function workbook=loadWorkbook(fileName,createEmpty) -% loadWorkbook -% Loads an Excel file into a Workbook object using the Java library Apache POI +% loadWorkbook Load an Excel file into a Workbook object. % -% fileName name of the Excel file. If it doesn't exist it will be -% created -% createEmpty true if an empty workbook should be created if the file -% didn't exist (optional, default false) +% Loads an Excel file into a Workbook object using the Java library Apache +% POI. % -% workbook Workbook object representing the Excel file +% Parameters +% ---------- +% fileName : char +% name of the Excel file. If it doesn't exist it will be created. +% createEmpty : logical, optional +% true if an empty workbook should be created if the file didn't exist +% (default false). % -% Usage: workbook=loadWorkbook(fileName,createEmpty) +% Returns +% ------- +% workbook : Workbook +% Workbook object representing the Excel file. +% +% Examples +% -------- +% workbook = loadWorkbook(fileName, createEmpty); if nargin<2 createEmpty=false; diff --git a/io/parseYAML.m b/io/parseYAML.m index 5c93b76a..72355714 100644 --- a/io/parseYAML.m +++ b/io/parseYAML.m @@ -1,33 +1,43 @@ function out = parseYAML(filename) -% parseYAML -% Read an arbitrary YAML file into a MATLAB struct / cell tree. +% parseYAML Read an arbitrary YAML file into a MATLAB struct/cell tree. % -% Use this for parsing arbitrary YAML configuration / data files -% (e.g. yeast-GEM's data/conditions/*.yml). For loading a cobra-format -% model YAML, use readYAMLmodel instead — that function knows the -% model schema and returns a populated RAVEN model struct. +% Use this for parsing arbitrary YAML configuration / data files (e.g. 
+% yeast-GEM's data/conditions/*.yml). For loading a cobra-format model +% YAML, use readYAMLmodel instead — that function knows the model schema +% and returns a populated RAVEN model struct. % -% Implementation: delegates to Python's yaml.safe_load, then -% recursively converts the py.dict / py.list tree to native MATLAB -% struct / cell. Requires a working MATLAB-Python bridge and the -% pyyaml package in the linked Python environment: +% Implementation: delegates to Python's yaml.safe_load, then recursively +% converts the py.dict / py.list tree to native MATLAB struct / cell. +% Requires a working MATLAB-Python bridge and the pyyaml package in the +% linked Python environment: % -% pip install pyyaml % from the MATLAB-linked Python env +% pip install pyyaml % from the MATLAB-linked Python env % -% Input: -% filename path to the YAML file. +% Parameters +% ---------- +% filename : char +% Path to the YAML file. % -% Output: -% out MATLAB representation of the document: -% py.dict -> struct -% py.list -> cell column vector -% py.str -> char -% py.int -> double -% py.float -> double -% py.bool -> logical -% py.None -> [] +% Returns +% ------- +% out : struct or cell or char or double or logical +% MATLAB representation of the document: % -% Usage: cfg = parseYAML('data/conditions/anaerobic.yml') +% - py.dict -> struct +% - py.list -> cell column vector +% - py.str -> char +% - py.int -> double +% - py.float -> double +% - py.bool -> logical +% - py.None -> [] +% +% Examples +% -------- +% cfg = parseYAML('data/conditions/anaerobic.yml'); +% +% See also +% -------- +% readYAMLmodel if ~isfile(filename) error('parseYAML:fileNotFound', 'File not found: %s', filename); diff --git a/io/readYAMLmodel.m b/io/readYAMLmodel.m index adc74f58..e71b4206 100755 --- a/io/readYAMLmodel.m +++ b/io/readYAMLmodel.m @@ -1,16 +1,24 @@ function model=readYAMLmodel(fileName, verbose) -% readYAMLmodel -% Reads a yaml file matching (roughly) the cobrapy yaml structure +% readYAMLmodel 
Read a model structure from a YAML file. % -% Input: -% fileName a model file in yaml file format. A dialog window will open -% if no file name is specified. -% verbose set as true to monitor progress (optional, default false) +% Reads a yaml file matching (roughly) the cobrapy yaml structure. % -% Output: -% model a model structure +% Parameters +% ---------- +% fileName : char +% a model file in yaml file format. A dialog window will open if no +% file name is specified. +% verbose : logical, optional +% set as true to monitor progress (default false). % -% Usage: model = readYAMLmodel(fileName, verbose) +% Returns +% ------- +% model : struct +% a model structure. +% +% Examples +% -------- +% model = readYAMLmodel(fileName, verbose); if nargin<1 || isempty(fileName) [fileName, pathName] = uigetfile({'*.yml;*.yaml'}, 'Please select the model file'); if fileName == 0 diff --git a/io/writeSheet.m b/io/writeSheet.m index 1ca76d31..31ed185a 100755 --- a/io/writeSheet.m +++ b/io/writeSheet.m @@ -1,17 +1,33 @@ function wb=writeSheet(wb,sheetName,sheetPosition,captions,units,raw,isIntegers) -% writeSheet -% Writes a cell matrix to an Excel sheet into using the Java library Apache POI +% writeSheet Write a cell matrix to an Excel sheet. % -% workbook Workbook object representing the Excel file -% sheetName name of the sheet -% sheetPosition 0-based position of the sheet -% captions cell array of captions (optional) -% units WRITE INFO -% raw cell array with the data in the sheet -% isIntegers true if numeric values should be integers (optional, default -% true) +% Writes a cell matrix to an Excel sheet using the Java library Apache POI. % -% Usage: wb=writeSheet(wb,sheetName,sheetPosition,captions,units,raw) +% Parameters +% ---------- +% wb : Workbook +% Workbook object representing the Excel file. +% sheetName : char +% name of the sheet. +% sheetPosition : double +% 0-based position of the sheet. +% captions : cell, optional +% cell array of captions. 
+% units : cell +% cell array of units for the columns. +% raw : cell +% cell array with the data in the sheet. +% isIntegers : logical, optional +% true if numeric values should be integers (default true). +% +% Returns +% ------- +% wb : Workbook +% Workbook object updated with the written sheet. +% +% Examples +% -------- +% wb = writeSheet(wb, sheetName, sheetPosition, captions, units, raw); if nargin<7 isIntegers=true; diff --git a/io/writeYAMLmodel.m b/io/writeYAMLmodel.m index fc19d9e5..b1f59263 100755 --- a/io/writeYAMLmodel.m +++ b/io/writeYAMLmodel.m @@ -1,28 +1,33 @@ function writeYAMLmodel(model,fileName,preserveQuotes,sortIds) -% writeYAMLmodel -% Writes a yaml file matching cobrapy's YAML structure. The format is -% cobrapy's native !!omap layout, extended with RAVEN-only top-level -% per-entry keys (inchis, deltaG, metFrom, rxnFrom, references, -% confidence_score, protein) and the GECKO ec-rxns / ec-enzymes -% sections. Reaction EC numbers are written inside the `annotation` -% block as `ec-code` (the cobrapy/geckopy convention), not as a -% top-level reaction key. Output is byte-stable with raven_python's -% io.yaml.write_yaml_model when called with the same model. +% writeYAMLmodel Write a model to a yaml file matching cobrapy's structure. % -% model a model structure -% fileName name that the file will have. A dialog window will -% open if no file name is specified. -% preserveQuotes if all string values should be wrapped in double -% quotes. cobrapy emits quotes only where YAML -% requires them, so the default is false (matches -% cobrapy / raven-python). 
-% (logical, default=false) -% sortIds if metabolites, reactions, genes and compartments -% should be sorted alphabetically by their identifier, -% otherwise they are kept in their original order -% (logical, default=false) +% The format is cobrapy's native !!omap layout, extended with RAVEN-only +% top-level per-entry keys (inchis, deltaG, metFrom, rxnFrom, references, +% confidence_score, protein) and the GECKO ec-rxns / ec-enzymes sections. +% Reaction EC numbers are written inside the `annotation` block as +% `ec-code` (the cobrapy/geckopy convention), not as a top-level reaction +% key. Output is byte-stable with raven_python's io.yaml.write_yaml_model +% when called with the same model. % -% Usage: writeYAMLmodel(model,fileName,preserveQuotes,sortIds) +% Parameters +% ---------- +% model : struct +% a model structure. +% fileName : char +% name that the file will have. A dialog window will open if no file +% name is specified. +% preserveQuotes : logical, optional +% true if all string values should be wrapped in double quotes. cobrapy +% emits quotes only where YAML requires them, so false matches cobrapy / +% raven-python (default false). +% sortIds : logical, optional +% true if metabolites, reactions, genes and compartments should be +% sorted alphabetically by their identifier; otherwise they are kept in +% their original order (default false).
+% +% Examples +% -------- +% writeYAMLmodel(model,fileName,preserveQuotes,sortIds); if nargin<2|| isempty(fileName) [fileName, pathName] = uiputfile({'*.yml;*.yaml'}, 'Select file for model export',[model.id '.yml']); if fileName == 0 diff --git a/localization/getExpressionStructure.m b/localization/getExpressionStructure.m index f49899d7..aa4d275a 100755 --- a/localization/getExpressionStructure.m +++ b/localization/getExpressionStructure.m @@ -1,50 +1,61 @@ function experiment=getExpressionStructure(fileName) -% getExpressionStructure -% Loads a representation of an experiment from an Excel file (see -% comments further down) +% getExpressionStructure Load a representation of an experiment from Excel. % -% fileName an Excel representation on an experiment +% Loads a representation of an experiment from an Excel file (see notes +% further down). % -% experiment an experiment structure -% data matrix with expression values -% orfs the corresponding ORFs -% experiments the titles of the experiments -% boundNames reaction names for the bounds -% upperBoundaries matrix with the upper bound values -% fitNames reaction names for the measured fluxes -% fitTo matrix with the measured fluxes +% Parameters +% ---------- +% fileName : char +% an Excel representation of an experiment. % -% A very common data set when working with genome-scale metabolic models -% is that you have measured fermentation data, gene expression data, -% and some different 'bounds' (for example different carbon sources -% or genes that are knocked out) in a number of conditions. This function -% reads an Excel representation of such an experiment. -% The Excel file must contain three sheets, 'EXPRESSION', 'BOUNDS', -% 'FITTING'. 
Below are some examples to show how they should be -% formatted: +% Returns +% ------- +% experiment : struct +% an experiment structure with fields: % -% -EXPRESSION -% ORF dsm_paa wisc_paa -% Pc00e00030 79.80942723 78.14755338 -% Shows the expression of the gene Pc00e00030 under two different -% conditions (in this case a DSM strain and a Wisconsin strain of P. -% chrysogenum with PSS in the media) +% - data : matrix with expression values +% - orfs : the corresponding ORFs +% - experiments : the titles of the experiments +% - boundNames : reaction names for the bounds +% - upperBoundaries : matrix with the upper bound values +% - fitNames : reaction names for the measured fluxes +% - fitTo : matrix with the measured fluxes % -% -BOUNDS -% Fixed Upper dsm_paa wisc_paa -% paaIN 0.1 0.2 -% The upper bound for the reaction paaIN should be 0.1 for the first -% condition and 0.2 for the second +% Examples +% -------- +% experiment = getExpressionStructure(fileName); % -% -FITTING -% Fit to dsm_paa wisc_paa -% co2OUT 2.85 3.05 -% glcIN 1.2 0.9 -% The measured fluxes for CO2 production and glucose uptake for the two -% conditions. The model(s) can later be fitted to match these values as -% good as possible. +% Notes +% ----- +% A very common data set when working with genome-scale metabolic models +% is that you have measured fermentation data, gene expression data, and +% some different 'bounds' (for example different carbon sources or genes +% that are knocked out) in a number of conditions. This function reads an +% Excel representation of such an experiment. The Excel file must contain +% three sheets, 'EXPRESSION', 'BOUNDS', 'FITTING'. Below are some examples +% to show how they should be formatted: % -% Usage: experiment=getExpressionStructure(fileName) +% -EXPRESSION +% ORF dsm_paa wisc_paa +% Pc00e00030 79.80942723 78.14755338 +% Shows the expression of the gene Pc00e00030 under two different +% conditions (in this case a DSM strain and a Wisconsin strain of P. 
+% chrysogenum with PSS in the media). +% +% -BOUNDS +% Fixed Upper dsm_paa wisc_paa +% paaIN 0.1 0.2 +% The upper bound for the reaction paaIN should be 0.1 for the first +% condition and 0.2 for the second. +% +% -FITTING +% Fit to dsm_paa wisc_paa +% co2OUT 2.85 3.05 +% glcIN 1.2 0.9 +% The measured fluxes for CO2 production and glucose uptake for the two +% conditions. The model(s) can later be fitted to match these values as +% well as possible. [type, sheets]=xlsfinfo(fileName); diff --git a/localization/getWoLFScores.m b/localization/getWoLFScores.m index b0ad9b60..aafe6855 100755 --- a/localization/getWoLFScores.m +++ b/localization/getWoLFScores.m @@ -1,20 +1,30 @@ function GSS = getWoLFScores(inputFile, kingdom) -% getWoLFScores -% Call WoLF PSort to predict the sub-cellular localization of proteins. -% The output can be used as input to predictLocalization. This function -% is currently only available for Linux and requires Perl to be -% installed. If one wants to use another predictor, see parseScores. The -% function normalizes the scores so that the best score for each gene is -% 1.0. +% getWoLFScores Predict protein sub-cellular localization with WoLF PSORT. % -% Input: -% inputFile a FASTA file with protein sequences -% kingdom the kingdom of the organism, 'animal', 'fungi' or 'plant' +% The output can be used as input to predictLocalization. This function is +% currently only available for Linux and requires Perl to be installed. If +% one wants to use another predictor, see parseScores. The function +% normalizes the scores so that the best score for each gene is 1.0. % -% Output: -% GSS a gene scoring structure to be used in predictLocalization +% Parameters +% ---------- +% inputFile : char +% a FASTA file with protein sequences. +% kingdom : char +% the kingdom of the organism, 'animal', 'fungi' or 'plant'. 
% -% Usage: GSS = getWoLFScores(inputFile, kingdom) +% Returns +% ------- +% GSS : struct +% a gene scoring structure to be used in predictLocalization. +% +% Examples +% -------- +% GSS = getWoLFScores(inputFile, kingdom); +% +% See also +% -------- +% parseScores, predictLocalization if ~isfile(inputFile) error('FASTA file %s cannot be found',string(inputFile)); diff --git a/localization/mapCompartments.m b/localization/mapCompartments.m index e47c744d..43b78eac 100755 --- a/localization/mapCompartments.m +++ b/localization/mapCompartments.m @@ -1,12 +1,53 @@ function geneScoreStructure=mapCompartments(geneScoreStructure,varargin) -% mapCompartments -% Maps compartments in the geneScoreStructure. This is used if you do not -% want a models that uses all of the compartment from the predictor. This -% function will then let you define rules on how the compartments should -% be merged. +% mapCompartments Map compartments in the geneScoreStructure. % -% Any number of rules could be defined as consecutive strings or in a cell array. -% 'comp1' comp1 should be kept in the structure +% Maps compartments in the geneScoreStructure. This is used if you do not +% want a model that uses all of the compartments from the predictor. This +% function will then let you define rules on how the compartments should be +% merged. +% +% Parameters +% ---------- +% geneScoreStructure : struct +% a structure to be used in predictLocalization. +% varargin : char or cell +% any number of rules, defined as consecutive strings or in a cell +% array: +% +% - 'comp1' : comp1 should be kept in the structure. +% - 'comp1=comp2' : The scores in comp2 are merged to comp1 and comp2 is +% removed from the structure. This automatically keeps comp1 in the +% structure. +% - 'comp1=comp2 comp3' : The scores in comp2 and comp3 are merged to +% comp1 and comp2 & comp3 are removed from the structure. This +% automatically keeps comp1 in the structure. 
+% - 'comp1 comp2=comp3' : The scores in comp3 are split between comp1 and +% comp2. This automatically keeps comp1 and comp2 in the structure. +% - 'comp1=other' : The scores in any compartment not included are merged +% to comp1. This is applied after all other rules. +% +% Returns +% ------- +% geneScoreStructure : struct +% a structure to be used in predictLocalization. +% +% Examples +% -------- +% The predictor you use gives predictions for Extracellular, Cytosol, +% Nucleus, Peroxisome, Mitochondria, ER, and Lysosome. You want to have a +% model with Extracellular, Cytosol, Mitochondria, and Peroxisome where +% Lysosome is merged with Peroxisome and all other compartments are merged +% to the Cytosol: +% +% GSS = mapCompartments(GSS, 'Extracellular', 'Mitochondria', ... +% 'Peroxisome=Lysosome', 'Cytosol=other'); +% +% Notes +% ----- +% When one compartment is merged to another, the resulting scores will be +% the best for each gene in either of the compartments. In the case where +% one compartment is split among several, the scores for the compartment to +% be merged are weighted by the number of compartments to split to. % 'comp1=comp2' The scores in comp2 are merged to comp1 and comp2 is % removed from the structure. This automatically diff --git a/localization/parseScores.m b/localization/parseScores.m index ae865b26..5de1c53b 100755 --- a/localization/parseScores.m +++ b/localization/parseScores.m @@ -1,19 +1,29 @@ function GSS = parseScores(inputFile, predictor) -% parseScores -% Parse the output from a predictor to generate the GSS +% parseScores Parse the output from a predictor to generate the GSS. % -% Input: -% inputFile a file with the output from the predictor -% predictor the predictor that was used. 'wolf' for WoLF PSORT, 'cello' -% for CELLO, 'deeploc' for DeepLoc (optional, default 'wolf') +% The function normalizes the scores so that the best score for each gene +% is 1.0.
% -% Output: -% GSS a gene scoring structure to be used in predictLocalization +% Parameters +% ---------- +% inputFile : char +% a file with the output from the predictor. +% predictor : char, optional +% the predictor that was used. 'wolf' for WoLF PSORT, 'cello' for +% CELLO, 'deeploc' for DeepLoc (default 'wolf'). % -% The function normalizes the scores so that the best score for each gene -% is 1.0. +% Returns +% ------- +% GSS : struct +% a gene scoring structure to be used in predictLocalization. % -% Usage: GSS = parseScores(inputFile, predictor) +% Examples +% -------- +% GSS = parseScores(inputFile, predictor); +% +% See also +% -------- +% predictLocalization, getWoLFScores if nargin<2 predictor='wolf'; diff --git a/localization/predictLocalization.m b/localization/predictLocalization.m index 1908caa6..d9e1e334 100755 --- a/localization/predictLocalization.m +++ b/localization/predictLocalization.m @@ -1,66 +1,77 @@ function [outModel, geneLocalization, transportStruct, scores,... removedRxns] = predictLocalization(model, GSS,... defaultCompartment, transportCost, maxTime, plotResults) -% predictLocalization -% Tries to assign reactions to compartments in a manner that is in -% agreement with localization predictors while at the same time -% maintaining connectivity. +% predictLocalization Assign reactions to compartments using localization predictors. % -% Input: -% model a model structure. If the model contains -% several compartments they will be merged -% GSS a gene scoring structure as from parseScores -% defaultCompartment transport reactions are expressed as diffusion -% between the defaultCompartment and the others. -% This is usually the cytosol. The default -% compartment must have a match in GSS -% transportCost the cost for including a transport reaction. If -% this a scalar then the same cost is used for -% all metabolites. It can also be a vector of -% costs with the same dimension as model.mets. 
-% Note that negative costs will result in that -% transport of the metabolite is encouraged (optional, -% default 0.5) -% maxTime maximum optimization time in minutes (optional, -% default 15) -% plotResults true if the results should be plotted during the -% optimization (optional, default false) +% Tries to assign reactions to compartments in a manner that is in +% agreement with localization predictors while at the same time maintaining +% connectivity. % -% Output: -% outModel the resulting model structure -% geneLocalization structure with the genes and their resulting -% localization -% transportStruct structure with the transport reactions that had -% to be inferred and between which compartments -% scores structure that contains the total score history -% together with the score based on gene -% localization and the score based on included -% transport reactions -% removedRxns cell array with the reaction ids that had to be -% removed in order to have a connected input -% model +% Parameters +% ---------- +% model : struct +% a model structure. If the model contains several compartments they +% will be merged. +% GSS : struct +% a gene scoring structure as from parseScores. +% defaultCompartment : char +% transport reactions are expressed as diffusion between the +% defaultCompartment and the others. This is usually the cytosol. The +% default compartment must have a match in GSS. +% transportCost : double, optional +% the cost for including a transport reaction. If this is a scalar then +% the same cost is used for all metabolites. It can also be a vector of +% costs with the same dimension as model.mets. Note that negative costs +% will result in transport of the metabolite being encouraged (default +% 0.5). +% maxTime : double, optional +% maximum optimization time in minutes (default 15). +% plotResults : logical, optional +% true if the results should be plotted during the optimization +% (default false). 
% -% This function requires that the starting network is connected when it -% is in one compartment. Reactions that are unconnected are removed and -% saved in removedRxns. Try running fillGaps to have a more connected -% input model if there are many such reactions. The input model should -% also not include any exchange, demand or sink reactions, otherwise this -% function would not provide any results. +% Returns +% ------- +% outModel : struct +% the resulting model structure. +% geneLocalization : struct +% structure with the genes and their resulting localization. +% transportStruct : struct +% structure with the transport reactions that had to be inferred and +% between which compartments. +% scores : struct +% structure that contains the total score history together with the +% score based on gene localization and the score based on included +% transport reactions. +% removedRxns : cell +% cell array with the reaction ids that had to be removed in order to +% have a connected input model. % -% In the final model all metabolites are produced in at least one -% reaction. This does not guarantee a fully functional model since there -% can be internal loops. Transport reactions are only included as passive -% diffusion (A <=> B). +% Notes +% ----- +% This function requires that the starting network is connected when it is +% in one compartment. Reactions that are unconnected are removed and saved +% in removedRxns. Try running fillGaps to have a more connected input model +% if there are many such reactions. The input model should also not include +% any exchange, demand or sink reactions, otherwise this function would not +% provide any results. % -% The score of a model is the sum of scores for all genes in their -% assigned compartment minus the cost of all transport reactions that had -% to be included. A gene can only be assigned to one compartment. This is -% a simplification to keep the problem size down. 
The problem is solved -% using simulated annealing. +% In the final model all metabolites are produced in at least one reaction. +% This does not guarantee a fully functional model since there can be +% internal loops. Transport reactions are only included as passive diffusion +% (A <=> B). % -% Usage: [outModel, geneLocalization, transportStruct, scores,... -% removedRxns] = predictLocalization(model, GSS,... -% defaultCompartment, transportCost, maxTime, plotResults) +% The score of a model is the sum of scores for all genes in their assigned +% compartment minus the cost of all transport reactions that had to be +% included. A gene can only be assigned to one compartment. This is a +% simplification to keep the problem size down. The problem is solved using +% simulated annealing. +% +% Examples +% -------- +% [outModel, geneLocalization, transportStruct, scores, removedRxns] = ... +% predictLocalization(model, GSS, defaultCompartment, ... +% transportCost, maxTime, plotResults); if nargin<4 transportCost=ones(numel(model.mets),1)*0.5; diff --git a/manipulation/addExchangeRxns.m b/manipulation/addExchangeRxns.m index 831e888b..573240e8 100755 --- a/manipulation/addExchangeRxns.m +++ b/manipulation/addExchangeRxns.m @@ -1,25 +1,36 @@ function [model, addedRxns]=addExchangeRxns(model,reactionType,mets) -% addExchangeRxns -% Adds exchange reactions for some metabolites +% addExchangeRxns Add exchange reactions for some metabolites. % -% model a model structure -% reactionType the type of reactions to add -% 'in' input reactions -% 'out' output reactions -% 'both' reversible input/output reactions. Positive -% direction corresponds to output -% mets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% or a vector of indexes to add for (optional, default model.mets) +% This is a faster version than addRxns when adding exchange reactions. 
+% New reactions are named "metName exchange (OUT/IN/BOTH)" while reaction +% ids are formatted as "EXC_OUT/IN/BOTH_METID". % -% model updated model structure -% addedRxns ids of the added reactions +% Parameters +% ---------- +% model : struct +% a model structure. +% reactionType : char +% the type of reactions to add: % -% This is a faster version than addRxns when adding exchange reactions. -% New reactions are named "metName exchange (OUT/IN/BOTH)" while reaction -% ids are formatted as "EXC_OUT/IN/BOTH_METID". +% - 'in' : input reactions +% - 'out' : output reactions +% - 'both' : reversible input/output reactions. Positive direction +% corresponds to output +% mets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of +% indexes to add for (default model.mets). % -% Usage: [model, addedRxns]=addExchangeRxns(model,reactionType,mets) +% Returns +% ------- +% model : struct +% updated model structure. +% addedRxns : cell +% ids of the added reactions. +% +% Examples +% -------- +% [model, addedRxns] = addExchangeRxns(model, reactionType, mets); if nargin<3 mets=model.mets; diff --git a/manipulation/addGenesRaven.m b/manipulation/addGenesRaven.m index cc44f153..ac482dca 100755 --- a/manipulation/addGenesRaven.m +++ b/manipulation/addGenesRaven.m @@ -1,28 +1,35 @@ function newModel=addGenesRaven(model,genesToAdd) -% addGenesRaven -% Adds genes to a model +% addGenesRaven Add genes to a model. % -% model a model structure -% genesToAdd the genes genesToAdd can have the following fields: -% genes cell array with unique strings that -% identifies each gene. Only character which are -% allowed in SBML ids are allowed (mainly a-z, -% 0-9 and '_'). 
However, there is no check -% for this performed, as it only matters if -% the model should be exported to SBML -% geneShortNames cell array of gene abbreviations (optional, -% default '') -% geneMiriams cell array with MIRIAM structures (optional, -% default []) -% proteins cell array of protein names associated to -% each gene (optional, default '') +% This function does not make extensive checks about MIRIAM formats, +% forbidden characters or such. % -% newModel an updated model structure +% Parameters +% ---------- +% model : struct +% a model structure. +% genesToAdd : struct +% the genes to add, which can have the following fields: % -% NOTE: This function does not make extensive checks about MIRIAM formats, -% forbidden characters or such. +% - genes : cell array with unique strings that identify each gene. +% Only characters allowed in SBML ids are permitted (mainly +% a-z, 0-9 and '_'). However, no check for this is performed, +% as it only matters if the model should be exported to SBML +% - geneShortNames : cell array of gene abbreviations (optional, +% default '') +% - geneMiriams : cell array with MIRIAM structures (optional, +% default []) +% - proteins : cell array of protein names associated with each gene +% (optional, default '') % -% Usage: newModel=addGenesRaven(model,genesToAdd) +% Returns +% ------- +% newModel : struct +% an updated model structure. 
+% +% Examples +% -------- +% newModel = addGenesRaven(model, genesToAdd); newModel=model; diff --git a/manipulation/addMets.m b/manipulation/addMets.m index 3b4668a6..3e1c14f0 100755 --- a/manipulation/addMets.m +++ b/manipulation/addMets.m @@ -1,57 +1,65 @@ function newModel=addMets(model,metsToAdd,copyInfo,prefix) -% addMets -% Adds metabolites to a model -% -% Input: -% model a model structure -% metsToAdd the metabolite structure can have the following fields: -% mets cell array with unique strings that identifies each -% metabolite (optional, default IDs of new -% metabolites are numbered with the prefix defined -% below) -% metNames cell array with the names of each metabolite -% compartments cell array with the compartment of each -% metabolite. Should match model.comps. If this is a -% string rather than a cell array it is assumed that -% all mets are in that compartment -% b Nx1 or Nx2 matrix with equality constraints for -% each metabolite (optional, default 0) -% unconstrained vector describing if each metabolite is an exchange -% metabolite (1) or not (0) (optional, default 0) -% inchis cell array with InChI strings (optional, default '') -% metSmiles cell array with SMILES strings (optional, default '') -% metFormulas cell array with the formulas (optional, default '') -% metMiriams cell array with MIRIAM structures (optional, default []) -% metCharges metabolite charge (optional, default NaN) -% metDeltaG Gibbs free energy of formation at biochemical -% standard condition in kJ/mole (optional, default NaN) -% metNotes cell array with metabolite notes as strings -% (optional, default '') -% copyInfo when adding metabolites to a compartment where it -% previously did not exist, the function will copy any -% available annotation from the metabolite in another -% compartment (optional, default true) -% prefix when metsToAdd.mets is not specified, new metabolite IDs -% are generated with the prefix specified here. 
If IDs with -% the prefix are already used in the model then the -% numbering will start from the highest existing integer+1 -% (optional, default 'm_') -% -% Output: -% newModel an updated model structure +% addMets Add metabolites to a model. % % This function does not make extensive checks about MIRIAM formats, % forbidden characters or such. % +% Parameters +% ---------- +% model : struct +% a model structure. +% metsToAdd : struct +% the metabolite structure, which can have the following fields: +% +% - mets : cell array with unique strings that identify each +% metabolite (optional, default IDs of new metabolites are numbered +% with the prefix defined below) +% - metNames : cell array with the names of each metabolite +% - compartments : cell array with the compartment of each metabolite. +% Should match model.comps. If this is a string rather than a cell +% array it is assumed that all mets are in that compartment +% - b : Nx1 or Nx2 matrix with equality constraints for each +% metabolite (optional, default 0) +% - unconstrained : vector describing if each metabolite is an exchange +% metabolite (1) or not (0) (optional, default 0) +% - inchis : cell array with InChI strings (optional, default '') +% - metSmiles : cell array with SMILES strings (optional, default '') +% - metFormulas : cell array with the formulas (optional, default '') +% - metMiriams : cell array with MIRIAM structures (optional, +% default []) +% - metCharges : metabolite charge (optional, default NaN) +% - metDeltaG : Gibbs free energy of formation at biochemical standard +% condition in kJ/mole (optional, default NaN) +% - metNotes : cell array with metabolite notes as strings (optional, +% default '') +% copyInfo : logical, optional +% when adding a metabolite to a compartment where it did not previously +% exist, the function will copy any available annotation from the +% metabolite in another compartment (default true). 
+% prefix : char, optional +% when metsToAdd.mets is not specified, new metabolite IDs are +% generated with the prefix specified here. If IDs with the prefix are +% already used in the model then the numbering will start from the +% highest existing integer+1 (default 'm_'). +% +% Returns +% ------- +% newModel : struct +% an updated model structure. +% +% Examples +% -------- +% newModel = addMets(model, metsToAdd, copyInfo, prefix); +% +% Notes +% ----- % If multiple metabolites are added at once, the metMiriams cell array % should be defined as (example with ChEBI and KEGG): % -% metsToAdd.metMiriams{1} = struct('name',{{'chebi';'kegg.compound'}},... -% 'value',{{'CHEBI:18072';'C11821'}}); -% metsToAdd.metMiriams{2} = struct('name',{{'chebi';'kegg.compound'}},... -% 'value',{{'CHEBI:31132';'C12248'}}); -% -% Usage: newModel = addMets(model, metsToAdd, copyInfo, prefix) +% metsToAdd.metMiriams{1} = struct('name',{{'chebi';'kegg.compound'}},... +% 'value',{{'CHEBI:18072';'C11821'}}); +% metsToAdd.metMiriams{2} = struct('name',{{'chebi';'kegg.compound'}},... +% 'value',{{'CHEBI:31132';'C12248'}}); if nargin<3 copyInfo=true; diff --git a/manipulation/addRxns.m b/manipulation/addRxns.m index 3ce890c6..a046b9bd 100755 --- a/manipulation/addRxns.m +++ b/manipulation/addRxns.m @@ -1,105 +1,102 @@ function newModel=addRxns(model,rxnsToAdd,eqnType,compartment,allowNewMets,allowNewGenes) -% addRxns -% Adds reactions to a model +% addRxns Add reactions to a model. % -% Input: -% model a model structure -% rxnsToAdd the reaction structure can have the following fields: -% rxns cell array with unique strings that identifies -% each reaction -% equations cell array with equation strings. Decimal -% coefficients are expressed as "1.2". -% Reversibility is indicated by "<=>" or "=>" -% mets (alternative to equations) cell array with the -% metabolites involved in each reaction as nested -% arrays. 
E.g.: {{'met1','met2'},{'met1','met3','met4'}} -% In the case of one single reaction added, it -% can be a string array: {'met1','met2'} -% stoichCoeffs (alternative to equations) cell array with the -% corresponding stoichiometries as nested vectors -% E.g.: {[-1,+2],[-1,-1,+1]}. In the case of one -% single reaction added, it can be a vector: [-1,+2] -% rxnNames cell array with the names of each reaction -% (optional, default '') -% lb vector with the lower bounds (optional, default -% model.annotations.defaultLB or -inf for -% reversible reactions and 0 for irreversible -% when "equations" is used. When "mets" and -% "stoichCoeffs" are ,used it defaults for all -% to model.annotations.defaultLB or -inf) -% ub vector with the upper bounds (optional, default -% model.annotations.defaultUB or inf) -% c vector with the objective function coefficients -% (optional, default 0) -% eccodes cell array with the EC-numbers for each -% reactions. Delimit several EC-numbers with ";" -% (optional, default '') -% subSystems cell array with the subsystems for each -% reaction (optional, default '') -% grRules cell array with the gene-reaction relationship -% for each reaction. E.g. "(A and B) or (C)" -% means that the reaction could be catalyzed by a -% complex between A & B or by C on its own. All -% the genes have to be present in model.genes. 
-% Add genes with addGenesRaven before calling -% this function if needed (optional, default '') -% rxnMiriams cell array with Miriam structures (optional, -% default []) -% rxnComps cell array with compartments (as in -% model.comps) (optional, default {}) -% rxnNotes cell array with reaction notes (optional, -% default '') -% rxnDeltaG Gibbs free energy at biochemical standard -% condition in kJ/mole (optional, default NaN) -% rxnReferences cell array with reaction references (optional, -% default '') -% rxnConfidenceScores vector with reaction confidence scores -% (optional, default NaN) -% eqnType double describing how the equation string should be -% interpreted -% 1 - The metabolites are matched to model.mets. New -% metabolites (if allowed) are added to -% "compartment" (default) -% 2 - The metabolites are matched to model.metNames and -% all metabolites are assigned to "compartment". Any -% new metabolites that are added will be assigned -% IDs "m1", "m2"... If IDs on the same form are -% already used in the model then the numbering will -% start from the highest used integer+1 -% 3 - The metabolites are written as -% "metNames[comps]". Only compartments in -% model.comps are allowed. Any -% new metabolites that are added will be assigned -% IDs "m1", "m2"... If IDs on the same form are -% already used in the model then the numbering will -% start from the highest used integer+1 -% compartment a string with the compartment the metabolites should -% be placed in when using eqnType=2. Must match -% model.comps (optional when eqnType=1 or eqnType=3) -% allowNewMets true if the function is allowed to add new -% metabolites. Can also be a string, which will be used -% as prefix for the new metabolite IDs. It is highly -% recommended to first add any new metabolites with -% addMets rather than automatically through this -% function. 
addMets supports more annotation of -% metabolites, allows for the use of exchange -% metabolites, and using it reduces the risk of parsing -% errors (optional, default false) -% allowNewGenes true if the functions is allowed to add new genes -% (optional, default false) +% This function does not make extensive checks about formatting of +% gene-reaction rules. % -% Output: -% newModel an updated model structure +% When adding metabolites to a compartment where they did not previously +% exist, the function will copy any available information from the +% metabolite in another compartment. % -% This function does not make extensive checks about formatting of -% gene-reaction rules. +% Parameters +% ---------- +% model : struct +% a model structure. +% rxnsToAdd : struct +% the reaction structure, which can have the following fields: % -% When adding metabolites to a compartment where they previously do not -% the function will copy any available information from the metabolite in -% another compartment. +% - rxns : cell array with unique strings that identify each reaction +% - equations : cell array with equation strings. Decimal coefficients +% are expressed as "1.2". Reversible reactions are written with "<=>" and irreversible ones with "=>" +% - mets : (alternative to equations) cell array with the metabolites +% involved in each reaction as nested arrays. E.g.: +% {{'met1','met2'},{'met1','met3','met4'}}. If only a single +% reaction is added, it can be a string array: {'met1','met2'} +% - stoichCoeffs : (alternative to equations) cell array with the +% corresponding stoichiometries as nested vectors. E.g.: +% {[-1,+2],[-1,-1,+1]}. If only a single reaction is added, it +% can be a vector: [-1,+2] +% - rxnNames : cell array with the names of each reaction (optional, +% default '') +% - lb : vector with the lower bounds (optional, default +% model.annotations.defaultLB or -inf for reversible reactions and 0 +% for irreversible when "equations" is used. 
When "mets" and +% "stoichCoeffs" are used it defaults for all to +% model.annotations.defaultLB or -inf) +% - ub : vector with the upper bounds (optional, default +% model.annotations.defaultUB or inf) +% - c : vector with the objective function coefficients (optional, +% default 0) +% - eccodes : cell array with the EC-numbers for each reaction. Delimit +% several EC-numbers with ";" (optional, default '') +% - subSystems : cell array with the subsystems for each reaction +% (optional, default '') +% - grRules : cell array with the gene-reaction relationship for each +% reaction. E.g. "(A and B) or (C)" means that the reaction could be +% catalyzed by a complex between A & B or by C on its own. All the +% genes have to be present in model.genes. Add genes with +% addGenesRaven before calling this function if needed (optional, +% default '') +% - rxnMiriams : cell array with Miriam structures (optional, +% default []) +% - rxnComps : cell array with compartments (as in model.comps) +% (optional, default {}) +% - rxnNotes : cell array with reaction notes (optional, default '') +% - rxnDeltaG : Gibbs free energy at biochemical standard condition in +% kJ/mole (optional, default NaN) +% - rxnReferences : cell array with reaction references (optional, +% default '') +% - rxnConfidenceScores : vector with reaction confidence scores +% (optional, default NaN) +% eqnType : double, optional +% describes how the equation string should be interpreted (default 1): % -% Usage: newModel = addRxns(model, rxnsToAdd, eqnType, compartment,... -% allowNewMets, allowNewGenes) +% - 1 : the metabolites are matched to model.mets. New metabolites (if +% allowed) are added to "compartment" +% - 2 : the metabolites are matched to model.metNames and all +% metabolites are assigned to "compartment". Any new metabolites that +% are added will be assigned IDs "m1", "m2"... 
If IDs of the same +% form are already used in the model then the numbering will start +% from the highest used integer+1 +% - 3 : the metabolites are written as "metNames[comps]". Only +% compartments in model.comps are allowed. Any new metabolites that +% are added will be assigned IDs "m1", "m2"... If IDs of the same +% form are already used in the model then the numbering will start +% from the highest used integer+1 +% compartment : char, optional +% the compartment the metabolites should be placed in when using +% eqnType=2. Must match model.comps (optional when eqnType=1 or +% eqnType=3). +% allowNewMets : logical or char, optional +% true if the function is allowed to add new metabolites. Can also be a +% string, which will be used as prefix for the new metabolite IDs. It +% is highly recommended to first add any new metabolites with addMets +% rather than automatically through this function. addMets supports +% more annotation of metabolites, allows for the use of exchange +% metabolites, and using it reduces the risk of parsing errors +% (default false). +% allowNewGenes : logical, optional +% true if the function is allowed to add new genes (default false). +% +% Returns +% ------- +% newModel : struct +% an updated model structure. +% +% Examples +% -------- +% newModel = addRxns(model, rxnsToAdd, eqnType, compartment, ... +% allowNewMets, allowNewGenes); if nargin<3 eqnType=1; diff --git a/manipulation/addRxnsGenesMets.m b/manipulation/addRxnsGenesMets.m index 7e856f0c..5b550216 100755 --- a/manipulation/addRxnsGenesMets.m +++ b/manipulation/addRxnsGenesMets.m @@ -1,48 +1,58 @@ function model=addRxnsGenesMets(model,sourceModel,rxns,addGene,rxnNote,confidence) -% addRxnsGenesMets -% Copies reactions from a source model to a new model, including -% (new) metabolites and genes +% addRxnsGenesMets Copy reactions from a source model into another model. 
% -% model draft model where reactions should be copied to -% sourceModel model where reactions and metabolites are sourced from -% rxns cell array with reaction IDs (from source model). Can also -% be string if only one reaction is added -% addGene three options: -% false no genes are annotated to the new reactions -% true grRules ared copied from the sourceModel and -% new genes are added when required -% string or cell array -% new grRules are specified as string or cell -% array, and any new genes are added when -% required -% (optional, default false) -% rxnNote cell array with strings explaining why reactions were copied -% to the model, to be included as newModel.rxnNotes. Can also -% be string if same rxnNotes should be added for each new -% reaction, or only one reaction is to be added (optional, default -% 'Added via addRxnsAndMets()') -% confidence integer specifying confidence score for all reactions. -% 4: biochemical data: direct evidence from enzymes -% assays -% 3: genetic data: knockout/-in or overexpression -% analysis -% 2: physiological data: indirect evidence, e.g. -% secretion products or defined medium requirement -% sequence data: genome annotation -% 1: modeling data: required for functional model, -% hypothetical reaction -% 0: no evidence -% following doi:10.1038/nprot.2009.203 (optional, default 0) +% Copies reactions from a source model to a new model, including (new) +% metabolites and genes. % -% newModel an updated model structure +% This function only works if the draft model and source model follow the +% same metabolite and compartment naming convention. Metabolites are only +% matched by metaboliteName[compartment]. Useful if one wants to copy +% additional reactions from source to draft after getModelFromHomology was +% used involving the same models. % -% This function only works if the draft model and source model follow -% the same metabolite and compartment naming convention. 
Metabolites are -% only matched by metaboliteName[compartment]. Useful if one wants to copy -% additional reactions from source to draft after getModelFromHomology was -% used involving the same models. +% Parameters +% ---------- +% model : struct +% draft model where reactions should be copied to. +% sourceModel : struct +% model where reactions and metabolites are sourced from. +% rxns : cell or char +% reaction IDs (from source model). Can also be a string if only one +% reaction is added. +% addGene : logical or char or cell, optional +% three options (default false): % -% Usage: newModel=addRxnsGenesMets(model,sourceModel,rxns,addGene,rxnNote,confidence) +% +% - false : no genes are annotated to the new reactions +% - true : grRules are copied from the sourceModel and new genes are +% added when required +% - string or cell array : new grRules are specified as string or cell +% array, and any new genes are added when required +% rxnNote : cell or char, optional +% strings explaining why reactions were copied to the model, to be +% included as model.rxnNotes. Can also be a string if the same +% rxnNotes should be added for each new reaction, or only one reaction +% is to be added (default 'Added via addRxnsAndMets()'). +% confidence : double, optional +% integer specifying confidence score for all reactions, following +% doi:10.1038/nprot.2009.203 (default 0): +% +% - 4 : biochemical data: direct evidence from enzyme assays +% - 3 : genetic data: knockout/-in or overexpression analysis +% - 2 : physiological data: indirect evidence, e.g. secretion products +% or defined medium requirement; sequence data: genome annotation +% - 1 : modeling data: required for functional model, hypothetical +% reaction +% - 0 : no evidence +% +% Returns +% ------- +% model : struct +% an updated model structure. +% +% Examples +% -------- +% model = addRxnsGenesMets(model, sourceModel, rxns, addGene, ... 
+% rxnNote, confidence); if nargin<6 confidence=0; diff --git a/manipulation/addTransport.m b/manipulation/addTransport.m index 4b88f022..3d960912 100755 --- a/manipulation/addTransport.m +++ b/manipulation/addTransport.m @@ -1,29 +1,43 @@ function [model, addedRxns]=addTransport(model,fromComp,toComps,metNames,isRev,onlyToExisting,prefix) -% addTransport -% Adds transport reactions between compartments +% addTransport Add transport reactions between compartments. % -% model a model structure -% fromComp the id of the compartment to transport from (should -% match model.comps) -% toComps a cell array of compartment names to transport to (should -% match model.comps) -% metNames the metabolite names to add transport for (optional, all -% metabolites in fromComp) -% isRev true if the transport reactions should be reversible -% (optional, default true) -% onlyToExisting true if transport of a metabolite should only be added -% if it already exists in toComp. If false, then new metabolites -% are added with addMets first (optional, default true) -% prefix string specifying prefix to reaction IDs (optional, default -% 'tr_') +% This is a faster version than addRxns when adding transport reactions. +% New reaction names are formatted as "metaboliteName, fromComp-toComp", +% while new reaction IDs are sequentially counted with a tr_ prefix: +% e.g. tr_0001, tr_0002, etc. % -% This is a faster version than addRxns when adding transport reactions. -% New reaction names are formatted as "metaboliteName, fromComp-toComp", -% while new reaction IDs are sequentially counted with a tr_ prefix: -% e.g. tr_0001, tr_0002, etc. +% Parameters +% ---------- +% model : struct +% a model structure. +% fromComp : char +% the id of the compartment to transport from (should match +% model.comps). +% toComps : cell +% compartment names to transport to (should match model.comps). 
+% metNames : cell, optional +% the metabolite names to add transport for (default all metabolites +% in fromComp). +% isRev : logical, optional +% true if the transport reactions should be reversible (default true). +% onlyToExisting : logical, optional +% true if transport of a metabolite should only be added if it already +% exists in toComp. If false, then new metabolites are added with +% addMets first (default true). +% prefix : char, optional +% prefix to reaction IDs (default 'tr_'). % -% Usage: [model, addedRxns]=addTransport(model,fromComp,toComps,metNames,... -% isRev,onlyToExisting,prefix) +% Returns +% ------- +% model : struct +% updated model structure. +% addedRxns : cell +% ids of the added reactions. +% +% Examples +% -------- +% [model, addedRxns] = addTransport(model, fromComp, toComps, ... +% metNames, isRev, onlyToExisting, prefix); fromComp=char(fromComp); [I, fromID]=ismember(model.comps,fromComp); diff --git a/manipulation/changeGrRules.m b/manipulation/changeGrRules.m index e8418c3e..a2ad5de3 100755 --- a/manipulation/changeGrRules.m +++ b/manipulation/changeGrRules.m @@ -1,20 +1,29 @@ function model = changeGrRules(model,rxns,grRules,replace) -% changeGrRules -% Changes multiple grRules at the same time. +% changeGrRules Change multiple grRules at the same time. % -% model a model structure to change the gene association -% rxns string or cell array of reaction IDs -% grRules string of additional or replacement gene association. -% Should be written with ' and ' to indicate subunits, ' or ' -% to indicate isoenzymes, and brackets '()' to separate -% different instances -% replace true if old gene association should be replaced with new -% association. False if new gene association should be -% concatenated to the old association (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure to change the gene association. +% rxns : char or cell +% reaction IDs. 
+% grRules : char or cell +% additional or replacement gene association. Should be written with +% ' and ' to indicate subunits, ' or ' to indicate isoenzymes, and +% brackets '()' to separate different instances. +% replace : logical, optional +% true if old gene association should be replaced with new association. +% False if new gene association should be concatenated to the old +% association (default true). % -% model an updated model structure +% Returns +% ------- +% model : struct +% an updated model structure. % -% Usage: changeGrRules(model,rxns,grRules,replace) +% Examples +% -------- +% model = changeGrRules(model, rxns, grRules, replace); if nargin==3 replace=true; diff --git a/manipulation/changeRxns.m b/manipulation/changeRxns.m index 8527e95f..7fc82bac 100755 --- a/manipulation/changeRxns.m +++ b/manipulation/changeRxns.m @@ -1,57 +1,66 @@ function model=changeRxns(model,rxns,equations,eqnType,compartment,allowNewMets) -% changeRxns -% Modifies the equations of reactions +% changeRxns Modify the equations of reactions in a model. % -% model a model structure -% rxns cell array with reaction ids -% equations cell array with equations. Alternatively, it can be a -% structure with the fields "mets" and "stoichCoeffs", -% in the same fashion as addRxns. E.g.: -% equations.mets = {{'met1','met2'},{'met1','met3'}} -% equations.stoichCoeffs = {[-1,+2],[-1,+1]} -% eqnType double describing how the equation string should be -% interpreted -% 1 - The metabolites are matched to model.mets. New -% metabolites (if allowed) are added to -% "compartment" (default) -% 2 - The metabolites are matched to model.metNames and -% all metabolites are assigned to "compartment". Any -% new metabolites that are added will be assigned -% IDs "m1", "m2"... If IDs on the same form are -% already used in the model then the numbering will -% start from the highest used integer+1 -% 3 - The metabolites are written as -% "metNames[compNames]". 
Only compartments in -% model.compNames are allowed. Any -% new metabolites that are added will be assigned -% IDs "m1", "m2"... If IDs on the same form are -% already used in the model then the numbering will -% start from the highest used integer+1 -% compartment a string with the compartment the metabolites should -% be placed in when using eqnType=2. Must match -% model.compNames (optional when eqnType=1 or eqnType=3) -% allowNewMets true if the function is allowed to add new -% metabolites. It is highly recommended to first add -% any new metabolites with addMets rather than -% automatically through this function. addMets supports -% more annotation of metabolites, allows for the use of -% exchange metabolites, and using it reduces the risk -% of parsing errors (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell +% cell array with reaction ids. +% equations : cell or struct +% cell array with equations. Alternatively, it can be a structure with +% the fields "mets" and "stoichCoeffs", in the same fashion as addRxns. +% E.g.: % -% model an updated model structure +% - equations.mets = {{'met1','met2'},{'met1','met3'}} +% - equations.stoichCoeffs = {[-1,+2],[-1,+1]} +% eqnType : double, optional +% describes how the equation string should be interpreted (default 1): % -% NOTE: This function should be used with some care, since it doesn't -% care about bounds on the reactions. Changing a irreversible reaction to -% a reversible one (or the other way around) will only change the -% model.rev field and not the model.lb/model.ub fields. The reaction will -% therefore still be having the same reversibility because of the -% bounds. Use setParams to change the bounds. +% - 1 : the metabolites are matched to model.mets. New metabolites (if +% allowed) are added to "compartment". +% - 2 : the metabolites are matched to model.metNames and all +% metabolites are assigned to "compartment". 
Any new metabolites that +% are added will be assigned IDs "m1", "m2"... If IDs of the same form +% are already used in the model then the numbering will start from the +% highest used integer+1. +% - 3 : the metabolites are written as "metNames[compNames]". Only +% compartments in model.compNames are allowed. Any new metabolites +% that are added will be assigned IDs "m1", "m2"... If IDs of the same +% form are already used in the model then the numbering will start +% from the highest used integer+1. +% compartment : char, optional +% a string with the compartment the metabolites should be placed in when +% using eqnType=2. Must match model.compNames (optional when eqnType=1 or +% eqnType=3). +% allowNewMets : logical, optional +% true if the function is allowed to add new metabolites. It is highly +% recommended to first add any new metabolites with addMets rather than +% automatically through this function. addMets supports more annotation +% of metabolites, allows for the use of exchange metabolites, and using +% it reduces the risk of parsing errors (default false). % -% NOTE: When adding metabolites to a compartment where it previously -% doesn't exist, the function will copy any available information from -% the metabolite in another compartment. +% Returns +% ------- +% model : struct +% an updated model structure. % -% Usage: model=changeRxns(model,rxns,equations,eqnType,compartment,allowNewMets) +% Examples +% -------- +% model = changeRxns(model, rxns, equations, eqnType, compartment, allowNewMets); +% +% Notes +% ----- +% This function should be used with some care, since it doesn't care about +% bounds on the reactions. Changing an irreversible reaction to a reversible +% one (or the other way around) will only change the model.rev field and not +% the model.lb/model.ub fields. The reaction will therefore still have +% the same reversibility because of the bounds. Use setParams to change the +% bounds.
+% +% When adding a metabolite to a compartment where it did not previously +% exist, the function will copy any available information from the metabolite +% in another compartment. if nargin<4 eqnType=1; diff --git a/manipulation/closeModel.m b/manipulation/closeModel.m index b91d69f0..9941029f 100755 --- a/manipulation/closeModel.m +++ b/manipulation/closeModel.m @@ -1,13 +1,21 @@ function closedModel=closeModel(model) -% closeModel -% Adds boundary metabolites and their participation in exchange -% reactions. +% closeModel Add boundary metabolites and their exchange reactions. % -% model a model structure +% Adds boundary metabolites and their participation in exchange reactions. % -% closedModel an updated closedModel structure +% Parameters +% ---------- +% model : struct +% a model structure. % -% Usage: closedModel=closeModel(model) +% Returns +% ------- +% closedModel : struct +% an updated model structure with boundary metabolites added. +% +% Examples +% -------- +% closedModel = closeModel(model); closedModel=model; diff --git a/manipulation/contractModel.m b/manipulation/contractModel.m index 0d6af575..35d62fc0 100755 --- a/manipulation/contractModel.m +++ b/manipulation/contractModel.m @@ -1,31 +1,41 @@ function [reducedModel, removedRxns, indexedDuplicateRxns]=contractModel(model,distReverse,mets) -% contractModel -% Contracts a model by grouping all identical reactions. Similar to the -% deleteDuplicates part in simplifyModel but more care is taken here -% when it comes to gene associations. If the duplicated reactions have -% '_EXP_*' suffixes (where * is a digit), then the model is assumed to -% have been passed through expandModel, and these suffixes are removed -% here. +% contractModel Contract a model by grouping all identical reactions.
% -% model a model structure -% distReverse distinguish reactions with same metabolites -% but different reversibility as different -% reactions (optional, default true) -% mets string or cell array of strings with metabolite -% identifiers, whose involved reactions should be -% checked for duplication (optional, by default all -% reactions are considered) (option is used by -% replaceMets) +% Similar to the deleteDuplicates part in simplifyModel but more care is +% taken here when it comes to gene associations. If the duplicated reactions +% have '_EXP_*' suffixes (where * is a digit), then the model is assumed to +% have been passed through expandModel, and these suffixes are removed here. % -% reducedModel a model structure without duplicate reactions -% removedRxns cell array for the removed duplicate reactions -% indexedDuplicateRxns indexed cell array for the removed duplicate -% reactions (multiple valuess separated by semicolon) +% Parameters +% ---------- +% model : struct +% a model structure. +% distReverse : logical, optional +% distinguish reactions with same metabolites but different reversibility +% as different reactions (default true). +% mets : char or cell, optional +% string or cell array of strings with metabolite identifiers, whose +% involved reactions should be checked for duplication (by default all +% reactions are considered). This option is used by replaceMets. % -% NOTE: This code might not work for advanced grRules strings -% that involve nested expressions of 'and' and 'or'. +% Returns +% ------- +% reducedModel : struct +% a model structure without duplicate reactions. +% removedRxns : cell +% cell array for the removed duplicate reactions. +% indexedDuplicateRxns : cell +% indexed cell array for the removed duplicate reactions (multiple values +% separated by semicolon). 
% -% Usage: [reducedModel, removedRxns, indexedDuplicateRxns]=contractModel(model,distReverse,mets) +% Examples +% -------- +% [reducedModel, removedRxns, indexedDuplicateRxns] = contractModel(model, distReverse, mets); +% +% Notes +% ----- +% This code might not work for advanced grRules strings that involve nested +% expressions of 'and' and 'or'. if nargin<2 distReverse=true; diff --git a/manipulation/convertToIrrev.m b/manipulation/convertToIrrev.m index 5dc06f54..5697716d 100755 --- a/manipulation/convertToIrrev.m +++ b/manipulation/convertToIrrev.m @@ -1,23 +1,33 @@ function [irrevModel,matchRev,rev2irrev,irrev2rev]=convertToIrrev(model,rxns) -% convertToIrrev -% Converts a model to irreversible form +% convertToIrrev Convert a model to irreversible form. % -% Input: -% model a model structure -% rxns cell array with the reactions so split (if reversible) -% (optional, default model.rxns) +% Reversible reactions are split into one forward and one reverse +% reaction. The reverse reactions are saved as 'rxnID_REV'. A warning is +% shown if some reaction identifiers already end with '_REV'. % -% Output: -% irrevModel a model structure where reversible reactions have -% been split into one forward and one reverse reaction -% matchRev matching forward reaction to its backward reaction -% rev2irrev forward and backward reactions for reversible reactions -% irrev2rev matching all reactions back to original model +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell, optional +% cell array with the reactions to split, if reversible (default +% model.rxns). % -% The reverse reactions are saved as 'rxnID_REV'. A warning is shown if -% some reaction identifiers already end with '_REV'. +% Returns +% ------- +% irrevModel : struct +% a model structure where reversible reactions have been split into +% one forward and one reverse reaction. +% matchRev : double +% matching forward reaction to its backward reaction. 
+% rev2irrev : cell +% forward and backward reactions for reversible reactions. +% irrev2rev : double +% matching all reactions back to original model. % -% Usage: [irrevModel,matchRev,rev2irrev,irrev2rev]=convertToIrrev(model,rxns) +% Examples +% -------- +% [irrevModel,matchRev,rev2irrev,irrev2rev]=convertToIrrev(model,rxns); if nargin<2 I=true(numel(model.rxns),1); diff --git a/manipulation/copyToComps.m b/manipulation/copyToComps.m index 042b4c0b..2eed4f66 100755 --- a/manipulation/copyToComps.m +++ b/manipulation/copyToComps.m @@ -1,29 +1,41 @@ function model=copyToComps(model,toComps,rxns,deleteOriginal,compNames,compOutside) -% copyToComps -% Copies reactions to new compartment(s) +% copyToComps Copy reactions to new compartment(s). % -% model a model structure -% toComps cell array of compartment ids. If there is no match -% to model.comps then it is added as a new compartment -% (see below for details) -% rxns either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the model, -% or a vector of indexes to remove (optional, default -% model.rxns) -% deleteOriginal true if the original reactions should be removed -% (making it move the reactions instead) (optional, default -% false) -% compNames cell array of compartment names. This is used if new -% compartments should be added (optional, default toComps) -% compOutside cell array of the id (as in comps) for the compartment -% surrounding each of the compartments. This is used if -% new compartments should be added (optional, default all {''}) +% Parameters +% ---------- +% model : struct +% a model structure. +% toComps : cell +% cell array of compartment ids. If there is no match to model.comps +% then it is added as a new compartment (see compNames and +% compOutside). 
+% rxns : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of indexes +% to copy (default model.rxns). +% deleteOriginal : logical, optional +% true if the original reactions should be removed, making it move the +% reactions instead (default false). +% compNames : cell, optional +% cell array of compartment names. Used if new compartments should be +% added (default toComps). +% compOutside : cell, optional +% cell array of the id (as in comps) for the compartment surrounding +% each of the compartments. Used if new compartments should be added +% (default all {''}). % -% model an updated model structure +% Returns +% ------- +% model : struct +% an updated model structure. % -% NOTE: New reactions and metabolites will be named as "id_toComps(i)". +% Examples +% -------- +% model=copyToComps(model,toComps,rxns,deleteOriginal,compNames,compOutside); % -% Usage: model=copyToComps(model,toComps,rxns,deleteOriginal,compNames,compOutside) +% Notes +% ----- +% New reactions and metabolites will be named as "id_toComps(i)". arguments model (1,1) struct diff --git a/manipulation/deleteUnusedGenes.m b/manipulation/deleteUnusedGenes.m index 7d0427f9..07843c16 100755 --- a/manipulation/deleteUnusedGenes.m +++ b/manipulation/deleteUnusedGenes.m @@ -1,14 +1,22 @@ function reducedModel=deleteUnusedGenes(model,verbose) -% deleteUnusedGenes -% Deletes all genes that are not associated to any reaction +% deleteUnusedGenes Delete all genes not associated to any reaction. % -% model a model structure -% verbose 0 for silent; 1 for printing number of deleted genes; -% 2 for printing the list of deleted genes (optional, default 1) +% Parameters +% ---------- +% model : struct +% a model structure. +% verbose : double, optional +% 0 for silent; 1 for printing the number of deleted genes; 2 for +% printing the list of deleted genes (default 1). 
% -% reducedModel an updated model structure +% Returns +% ------- +% reducedModel : struct +% an updated model structure. % -% Usage: reducedModel=deleteUnusedGenes(model) +% Examples +% -------- +% reducedModel=deleteUnusedGenes(model); if nargin<2 verbose=1; diff --git a/manipulation/expandModel.m b/manipulation/expandModel.m index e81eeafa..288d2844 100755 --- a/manipulation/expandModel.m +++ b/manipulation/expandModel.m @@ -1,25 +1,34 @@ function [newModel, rxnToCheck]=expandModel(model) -% expandModel -% Expands a model which uses several gene associations for one reaction. -% Each such reaction is split into several reactions, each under the control -% of only one gene. -% -% Input: -% model model structure -% -% Output: -% newModel model structure with separate reactions for iso-enzymes, where -% the reaction ids are renamed as to id_EXP_1, id_EXP_2, etc. -% rxnToCheck cell array with original reaction identifiers for those -% that contained nested and/or-relationships in grRules. +% expandModel Expand reactions that use several gene associations. % -% NOTE: grRules strings that involve nested expressions of 'and' and 'or' -% might not be parsed correctly if they are not standardized (if the -% standardizeGrRules functions was not first run on the model). For -% those reactions, it is therefore advisable to inspect the reactions in -% rxnToCheck to confirm correct model expansion. +% Each reaction that uses several gene associations is split into several +% reactions, each under the control of only one gene. % -% Usage: [newModel, rxnToCheck]=expandModel(model) +% Parameters +% ---------- +% model : struct +% a model structure. +% +% Returns +% ------- +% newModel : struct +% model structure with separate reactions for iso-enzymes, where the +% reaction ids are renamed as id_EXP_1, id_EXP_2, etc. +% rxnToCheck : cell +% cell array with original reaction identifiers for those that +% contained nested and/or-relationships in grRules. 
+% +% Examples +% -------- +% [newModel, rxnToCheck]=expandModel(model); +% +% Notes +% ----- +% grRules strings that involve nested expressions of 'and' and 'or' might +% not be parsed correctly if they are not standardized (if the +% standardizeGrRules function was not first run on the model). For those +% reactions, it is therefore advisable to inspect the reactions in +% rxnToCheck to confirm correct model expansion. %Check how many reactions we will create (the number of or:s in the GPRs). %This way, we can preallocate all fields and save much computation time diff --git a/manipulation/findDuplicateRxns.m b/manipulation/findDuplicateRxns.m index 44d3011b..469ef8fe 100644 --- a/manipulation/findDuplicateRxns.m +++ b/manipulation/findDuplicateRxns.m @@ -1,27 +1,31 @@ function pairs = findDuplicateRxns(model, ignoreDirection) -% findDuplicateRxns -% Find reactions that share identical stoichiometry. Counterpart of -% raven_python.manipulation.find_duplicate_reactions, and the -% upstream version of yeast-GEM's findDuplicatedRxns. +% findDuplicateRxns Find reactions that share identical stoichiometry. % -% Only stoichiometry is compared — bounds, GPRs, and annotations -% are ignored. The default treats A→B and B→A as duplicates -% (typical curation use case: "find reactions that could be -% merged"). +% Counterpart of raven_python.manipulation.find_duplicate_reactions, and +% the upstream version of yeast-GEM's findDuplicatedRxns. % -% Inputs: -% model RAVEN model struct. -% ignoreDirection (opt, default true) Treat A→B and B→A as -% duplicates. +% Only stoichiometry is compared — bounds, GPRs, and annotations are +% ignored. The default treats A→B and B→A as duplicates (typical curation +% use case: "find reactions that could be merged"). % -% Output: -% pairs Nx2 numeric array of reaction-index pairs -% (i, j) where reactions i and j share the -% same (possibly negated) stoichiometry, with -% i < j. Empty if the model has no duplicates. 
+% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% ignoreDirection : logical, optional +% treat A→B and B→A as duplicates (default true). % -% Usage: pairs = findDuplicateRxns(model) -% pairs = findDuplicateRxns(model, false) +% Returns +% ------- +% pairs : double +% Nx2 numeric array of reaction-index pairs (i, j) where reactions i +% and j share the same (possibly negated) stoichiometry, with i < j. +% Empty if the model has no duplicates. +% +% Examples +% -------- +% pairs = findDuplicateRxns(model); +% pairs = findDuplicateRxns(model, false); if nargin < 2 ignoreDirection = true; diff --git a/manipulation/generateNewIds.m b/manipulation/generateNewIds.m index f66c6aa5..ade50c06 100755 --- a/manipulation/generateNewIds.m +++ b/manipulation/generateNewIds.m @@ -1,21 +1,33 @@ function newIds=generateNewIds(model,type,prefix,quantity,numLength) -% generateNewIds -% Generates a list of new metabolite or reaction ids, sequentially -% numbered with a defined prefix. The model is queried for the highest -% existing number of that type of id. +% generateNewIds Generate a list of new metabolite or reaction ids. % -% model model structure -% type string specifying type of id, 'rxns' or 'mets' -% prefix string specifying prefix to be used in all ids. E.g. 's_' -% or 'r_'. -% quantity number of new ids that should be generated (optional, default 1) -% numLength length of numerical part of id. E.g. 4 gives ids like -% r_0001 and 6 gives ids like r_000001. If the prefix is -% already used in the model, then the model-defined length -% will be used instead. (optional, default 4) +% The ids are sequentially numbered with a defined prefix. The model is +% queried for the highest existing number of that type of id. % -% Usage: newIds=generateNewIds(model,type,prefix,quantity,numLength) -% +% Parameters +% ---------- +% model : struct +% model structure. +% type : char +% type of id, 'rxns' or 'mets'. +% prefix : char +% prefix to be used in all ids, e.g. 
's_' or 'r_'. +% quantity : double, optional +% number of new ids that should be generated (default 1). +% numLength : double, optional +% length of the numerical part of the id. E.g. 4 gives ids like r_0001 +% and 6 gives ids like r_000001. If the prefix is already used in the +% model, then the model-defined length will be used instead +% (default 4). +% +% Returns +% ------- +% newIds : cell +% cell array with the generated ids. +% +% Examples +% -------- +% newIds = generateNewIds(model, type, prefix, quantity, numLength); type=char(type); prefix=char(prefix); diff --git a/manipulation/mergeCompartments.m b/manipulation/mergeCompartments.m index 1c8d9b52..7732d7f3 100755 --- a/manipulation/mergeCompartments.m +++ b/manipulation/mergeCompartments.m @@ -1,36 +1,46 @@ function [model, deletedRxns, duplicateRxns]=mergeCompartments(model,keepUnconstrained,deleteRxnsWithOneMet,distReverse) -% mergeCompartments -% Merge all compartments in a model +% mergeCompartments Merge all compartments in a model. % -% model a model structure -% keepUnconstrained keep metabolites that are unconstrained in a -% 'unconstrained' compartment. If these are merged the -% exchange reactions will most often be deleted (optional, -% default false) -% deleteRxnsWithOneMet delete reactions with only one metabolite. These -% reactions come from reactions such as A[c] + B[c] -% => A[m]. In some models hydrogen is balanced around -% each membrane with reactions like this (optional, -% default false) -% distReverse distinguish reactions with same metabolites but -% different reversibility as different reactions -% (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% keepUnconstrained : logical, optional +% keep metabolites that are unconstrained in an 'unconstrained' +% compartment. If these are merged, the exchange reactions will most often +% be deleted (default false).
+% deleteRxnsWithOneMet : logical, optional +% delete reactions with only one metabolite. These reactions come from +% reactions such as A[c] + B[c] => A[m]. In some models hydrogen is +% balanced around each membrane with reactions like this (default +% false). +% distReverse : logical, optional +% distinguish reactions with the same metabolites but different reversibility +% as different reactions (default true). % -% model a model with all reactions located to one compartment -% deletedRxns reactions that were deleted because of only -% having one metabolite after merging -% duplicateRxns identical reactions that occurred in different -% compartments and were deleted because they turned -% to be duplicated after merging +% Returns +% ------- +% model : struct +% a model with all reactions located in one compartment. +% deletedRxns : cell +% reactions that were deleted because of only having one metabolite +% after merging. +% duplicateRxns : cell +% identical reactions that occurred in different compartments and were +% deleted because they turned out to be duplicates after merging. % -% Merges all compartments into one 's' compartment (for 'System'). This can -% be useful for example to ensure that there are metabolic capabilities to -% synthesize all metabolites. +% Examples +% -------- +% [model, deletedRxns, duplicateRxns] = mergeCompartments(model); % -% NOTE: If the metabolite IDs reflect the compartment that they are in -% the IDs may no longer be representative. +% Notes +% ----- +% Merges all compartments into one 's' compartment (for 'System'). This can +% be useful, for example, to ensure that there are metabolic capabilities to +% synthesize all metabolites. % -% Usage: [model, deletedRxns, duplicateRxns]=mergeCompartments(model,keepUnconstrained,deleteRxnsWithOneMet,distReverse) +% If the metabolite IDs reflect the compartment that they are in, the IDs may +% no longer be representative.
if nargin<2 keepUnconstrained=false; diff --git a/manipulation/mergeModels.m b/manipulation/mergeModels.m index fe07a4a0..98c2414a 100755 --- a/manipulation/mergeModels.m +++ b/manipulation/mergeModels.m @@ -1,28 +1,35 @@ function model=mergeModels(models,metParam,supressWarnings,copyToComps) -% mergeModels -% Merges models into one model structure. Reactions are added without any -% checks, so duplicate reactions might appear. Metabolites are matched by -% their name and compartment (metaboliteName[comp]), while genes are -% matched by their name. +% mergeModels Merge models into one model structure. % -% Input: -% models a cell array with model structures -% metParam string metabolite name ('metNames') or ID ('mets') are -% used for matching (optional, default 'metNames') -% supressWarnings logical whether warnings should be supressed (optional, -% default false) -% copyToComps logical whether mergeModels is run via copyToComps -% (optional, default false) +% Merges models into one model structure. Reactions are added without any +% checks, so duplicate reactions might appear. Metabolites are matched by +% their name and compartment (metaboliteName[comp]), while genes are matched +% by their name. % -% Output: -% model a model structure with the merged model. Follows the -% structure of normal models but also has 'rxnFrom/ -% metFrom/geneFrom' fields to indicate from which model -% each reaction/metabolite/gene was taken. If the model -% already has 'rxnFrom/metFrom/geneFrom' fields, then -% these fields are not modified. +% Parameters +% ---------- +% models : cell +% a cell array with model structures. +% metParam : char, optional +% whether metabolite name ('metNames') or ID ('mets') is used for +% matching (default 'metNames'). +% supressWarnings : logical, optional +% whether warnings should be suppressed (default false). +% copyToComps : logical, optional +% whether mergeModels is run via copyToComps (default false).
% -% Usage: model=mergeModels(models) +% Returns +% ------- +% model : struct +% a model structure with the merged model. Follows the structure of +% normal models but also has 'rxnFrom/metFrom/geneFrom' fields to +% indicate from which model each reaction/metabolite/gene was taken. If +% the model already has 'rxnFrom/metFrom/geneFrom' fields, then these +% fields are not modified. +% +% Examples +% -------- +% model = mergeModels(models); arguments models; diff --git a/manipulation/permuteModel.m b/manipulation/permuteModel.m index 4fad67c2..d71f33d6 100755 --- a/manipulation/permuteModel.m +++ b/manipulation/permuteModel.m @@ -1,18 +1,25 @@ function newModel=permuteModel(model, indexes, type) -% permuteModel -% Changes the order of the reactions or metabolites in a model +% permuteModel Change the order of the reactions or metabolites in a model. % -% Input: -% model a model structure -% indexes a vector with the same length as the number of items in the -% model, which gives the new order of items -% type 'rxns' for reactions, 'mets' for metabolites, 'genes' for -% genes, 'comps' for compartments +% Parameters +% ---------- +% model : struct +% a model structure. +% indexes : double +% a vector with the same length as the number of items in the model, +% which gives the new order of items. +% type : char +% 'rxns' for reactions, 'mets' for metabolites, 'genes' for genes, +% 'comps' for compartments. % -% Output: -% newModel an updated model structure +% Returns +% ------- +% newModel : struct +% an updated model structure. 
% -% Usage: newModel=permuteModel(model, indexes, type) +% Examples +% -------- +% newModel = permuteModel(model, indexes, type); newModel=model; type=char(type); diff --git a/manipulation/removeBadRxns.m b/manipulation/removeBadRxns.m index 9b6328dd..20625c0e 100755 --- a/manipulation/removeBadRxns.m +++ b/manipulation/removeBadRxns.m @@ -1,73 +1,85 @@ function [newModel, removedRxns]=removeBadRxns(model,rxnRules,ignoreMets,isNames,balanceElements,refModel,ignoreIntBounds,printReport) -% removeBadRxns -% Iteratively removes reactions which enable production/consumption of some -% metabolite without any uptake/excretion +% removeBadRxns Remove reactions that enable production/consumption from nothing. % -% model a model structure. For the intented function, -% the model shouldn't allow for any uptake/excretion. -% The easiest way to achieve this is to import the -% model using importModel('filename',false) -% rxnRules 1: only remove reactions which are unbalanced -% 2: also remove reactions which couldn't be checked for -% mass balancing -% 3: all reactions can be removed -% (optional, default 1) -% ignoreMets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% of a vector of indexes for metabolites to exclude from -% this analysis (optional, default []) -% isNames true if the supplied mets represent metabolite names -% (as opposed to IDs). This is a way to delete -% metabolites in several compartments at once without -% knowing the exact IDs. This only works if ignoreMets -% is a cell array (optional, default false) -% balanceElements a cell array with the elements for which to -% balance the reactions. May contain any -% combination of the elements defined in parseFormulas -% (optional, default {'C';'P';'S';'N';'O'}) -% refModel a reference model which can be used to ensure -% that the resulting model is still functional. 
-% The intended use is that the reference model is -% a copy of model, but with uptake/excretion allowed and -% some objectives (such as production of biomass) -% constrained to a non-zero flux. Before a -% reaction is removed from "model" the function first -% checks that the same deletion in "refModel" -% doesn't render the problem unfeasible (optional) -% ignoreIntBounds true if internal bounds (including reversibility) -% should be ignored. Exchange reactions are not affected. -% This can be used to find unbalanced solutions which are -% not possible using the default constraints (optional, -% default false) -% printReport true if a report should be printed (optional, -% default false) +% Iteratively removes reactions which enable production/consumption of some +% metabolite without any uptake/excretion. % -% newModel a model structure after the problematic -% reactions have been deleted -% removedRxns a cell array with the reactions that were -% removed +% Parameters +% ---------- +% model : struct +% a model structure. For the intended function, the model shouldn't +% allow for any uptake/excretion. The easiest way to achieve this is to +% import the model using importModel('filename', false). +% rxnRules : double, optional +% which reactions may be removed (default 1): % -% The purpose of this function is to remove reactions which enable -% production/consumption of metabolites even when exchange reactions aren't used. -% Many models, especially if they are automatically inferred from -% databases, will have unbalanced reactions which allow for -% net-production/consumption of metabolites without any consumption/excretion. -% A common reason for this is when general compounds have different meaning -% in different reactions (as DNA has in these two reactions). 
-% dATP + dGTP + dCTP + dTTP <=> DNA + 4 PPi -% 0.25 dATP + 0.25 dGTP + 0.25 dCTP + 0.25 dTTP <=> DNA + PPi -% Reactions that are problematic like this are always elementally -% unbalanced, but it is not always that you would like to exclude all -% unbalanced reactions from your model. -% This function tries to remove as few problematic reactions as possible -% so that the model cannot produce/consume anything from nothing. This is done by -% repeatedly calling makeSomething/consumeSomething, checking if any of -% the involved reactions are elementally unbalanced, remove one of them, -% and then iterating until no metabolites can be produced/consumed. -% makeSomething is called before consumeSomething. +% - 1 : only remove reactions which are unbalanced +% - 2 : also remove reactions which couldn't be checked for mass +% balancing +% - 3 : all reactions can be removed +% ignoreMets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of indexes +% for metabolites to exclude from this analysis (default []). +% isNames : logical, optional +% true if the supplied mets represent metabolite names (as opposed to +% IDs). This is a way to delete metabolites in several compartments at +% once without knowing the exact IDs. This only works if ignoreMets is a +% cell array (default false). +% balanceElements : cell, optional +% a cell array with the elements for which to balance the reactions. May +% contain any combination of the elements defined in parseFormulas +% (default {'C';'P';'S';'N';'O'}). +% refModel : struct, optional +% a reference model which can be used to ensure that the resulting model +% is still functional. The intended use is that the reference model is a +% copy of model, but with uptake/excretion allowed and some objectives +% (such as production of biomass) constrained to a non-zero flux. 
Before +% a reaction is removed from "model" the function first checks that the +% same deletion in "refModel" doesn't render the problem unfeasible. +% ignoreIntBounds : logical, optional +% true if internal bounds (including reversibility) should be ignored. +% Exchange reactions are not affected. This can be used to find +% unbalanced solutions which are not possible using the default +% constraints (default false). +% printReport : logical, optional +% true if a report should be printed (default false). % -% Usage: [newModel, removedRxns]=removeBadRxns(model,rxnRules,... -% ignoreMets,isNames,refModel,ignoreIntBounds,printReport) +% Returns +% ------- +% newModel : struct +% a model structure after the problematic reactions have been deleted. +% removedRxns : cell +% a cell array with the reactions that were removed. +% +% Notes +% ----- +% The purpose of this function is to remove reactions which enable +% production/consumption of metabolites even when exchange reactions aren't +% used. Many models, especially if they are automatically inferred from +% databases, will have unbalanced reactions which allow for +% net-production/consumption of metabolites without any consumption or +% excretion. A common reason for this is when general compounds have +% different meaning in different reactions (as DNA has in these two +% reactions): +% +% dATP + dGTP + dCTP + dTTP <=> DNA + 4 PPi +% 0.25 dATP + 0.25 dGTP + 0.25 dCTP + 0.25 dTTP <=> DNA + PPi +% +% Reactions that are problematic like this are always elementally +% unbalanced, but it is not always the case that you would like to exclude +% all unbalanced reactions from your model. This function tries to remove as +% few problematic reactions as possible so that the model cannot +% produce/consume anything from nothing. 
This is done by repeatedly calling +% makeSomething/consumeSomething, checking if any of the involved reactions +% are elementally unbalanced, removing one of them, and then iterating until +% no metabolites can be produced/consumed. makeSomething is called before +% consumeSomething. +% +% Examples +% -------- +% [newModel, removedRxns] = removeBadRxns(model, rxnRules, ignoreMets, ... +% isNames, balanceElements, refModel, ignoreIntBounds, printReport); if nargin<2 rxnRules=1; diff --git a/manipulation/removeGenes.m b/manipulation/removeGenes.m index ac2b5cf5..cd4980ad 100755 --- a/manipulation/removeGenes.m +++ b/manipulation/removeGenes.m @@ -1,21 +1,31 @@ function reducedModel = removeGenes(model,genesToRemove,removeUnusedMets,removeBlockedRxns,standardizeRules) -% removeGenes -% Deletes a set of genes from a model +% removeGenes Delete a set of genes from a model. % -% model a model structure -% genesToRemove either a cell array of gene IDs, a logical vector -% with the same number of elements as genes in the model, -% or a vector of indexes to remove -% removeUnusedMets remove metabolites that are no longer in use (optional, default -% false) -% removeBlockedRxns remove reactions that get blocked after deleting the genes -% (optional, default false) -% standardizeRules format gene rules to be compliant with standard format -% (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% genesToRemove : cell or logical or double +% either a cell array of gene IDs, a logical vector with the same number +% of elements as genes in the model, or a vector of indexes to remove. +% removeUnusedMets : logical, optional +% remove metabolites that are no longer in use (default false). +% removeBlockedRxns : logical, optional +% remove reactions that get blocked after deleting the genes (default +% false). +% standardizeRules : logical, optional +% format gene rules to be compliant with the standard format (default +% true). 
% -% reducedModel an updated model structure +% Returns +% ------- +% reducedModel : struct +% an updated model structure. % -% Usage: reducedModel = removeGenes(model,genesToRemove,removeUnusedMets,removeBlockedRxns) +% Examples +% -------- +% reducedModel = removeGenes(model, genesToRemove, removeUnusedMets, ... +% removeBlockedRxns, standardizeRules); if nargin<3 removeUnusedMets = false; diff --git a/manipulation/removeMets.m b/manipulation/removeMets.m index 0f4a5f26..ded6eda5 100755 --- a/manipulation/removeMets.m +++ b/manipulation/removeMets.m @@ -1,27 +1,35 @@ function reducedModel=removeMets(model,metsToRemove,isNames,removeUnusedRxns,removeUnusedGenes,removeUnusedComps) -% removeMets -% Deletes a set of metabolites from a model +% removeMets Delete a set of metabolites from a model. % -% model a model structure -% metsToRemove either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% of a vector of indexes to remove -% isNames true if the supplied mets represent metabolite names -% (as opposed to IDs). This is a way to delete -% metabolites in several compartments at once without -% knowing the exact IDs. This only works if metsToRemove -% is a cell array (optional, default false) -% removeUnusedRxns remove reactions that are no longer in use (optional, -% default false) -% removeUnusedGenes remove genes that are no longer in use (optional, -% default false) -% removeUnusedComps remove compartments that are no longer in use (optional, -% default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% metsToRemove : cell or logical or double +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of +% indexes to remove. +% isNames : logical, optional +% true if the supplied mets represent metabolite names (as opposed to +% IDs). 
This is a way to delete metabolites in several compartments at +% once without knowing the exact IDs. This only works if metsToRemove +% is a cell array (default false). +% removeUnusedRxns : logical, optional +% remove reactions that are no longer in use (default false). +% removeUnusedGenes : logical, optional +% remove genes that are no longer in use (default false). +% removeUnusedComps : logical, optional +% remove compartments that are no longer in use (default false). % -% reducedModel an updated model structure +% Returns +% ------- +% reducedModel : struct +% an updated model structure. % -% Usage: reducedModel=removeMets(model,metsToRemove,isNames,... -% removeUnusedRxns,removeUnusedGenes,removeUnusedComps) +% Examples +% -------- +% reducedModel = removeMets(model, metsToRemove, isNames, ... +% removeUnusedRxns, removeUnusedGenes, removeUnusedComps); if ~islogical(metsToRemove) && ~isnumeric(metsToRemove) metsToRemove=convertCharArray(metsToRemove); end diff --git a/manipulation/removeReactions.m b/manipulation/removeReactions.m index 4255c6e2..a009f788 100755 --- a/manipulation/removeReactions.m +++ b/manipulation/removeReactions.m @@ -1,24 +1,30 @@ function reducedModel=removeReactions(model,rxnsToRemove,removeUnusedMets,removeUnusedGenes,removeUnusedComps) -% removeReactions -% Deletes a set of reactions from a model +% removeReactions Delete a set of reactions from a model. % -% Input: -% model a model structure -% rxnsToRemove either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the -% model, or a vector of indexes to remove -% removeUnusedMets remove metabolites that are no longer in use -% (optional, default false) -% removeUnusedGenes remove genes that are no longer in use (optional, -% default false) -% removeUnusedComps remove compartments that are no longer in use -% (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. 
+% rxnsToRemove : cell or logical or double +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of indexes +% to remove. +% removeUnusedMets : logical, optional +% remove metabolites that are no longer in use (default false). +% removeUnusedGenes : logical, optional +% remove genes that are no longer in use (default false). +% removeUnusedComps : logical, optional +% remove compartments that are no longer in use (default false). % -% Output: -% reducedModel an updated model structure +% Returns +% ------- +% reducedModel : struct +% an updated model structure. % -% Usage: reducedModel = removeReactions(model, rxnsToRemove, removeUnusedMets,... -% removeUnusedGenes, removeUnusedComps) +% Examples +% -------- +% reducedModel = removeReactions(model, rxnsToRemove, removeUnusedMets, ... +% removeUnusedGenes, removeUnusedComps); if nargin<3 removeUnusedMets=false; diff --git a/manipulation/replaceMets.m b/manipulation/replaceMets.m index 9301d862..7ecaa9cc 100755 --- a/manipulation/replaceMets.m +++ b/manipulation/replaceMets.m @@ -1,32 +1,46 @@ function [model, removedRxns, idxDuplRxns]=replaceMets(model,metabolite,replacement,verbose,identifiers) -% replaceMets -% Replaces metabolite names and annotation with replacement metabolite -% that is already in the model. If this results in duplicate metabolites, -% the replacement metabolite will be kept, while the S matrix is updated -% to use the replacement metabolite instead. At the end, contractModel is -% run to remove any duplicate reactions that might have occured. +% replaceMets Replace a metabolite with another already in the model. 
% -% Input: -% model a model structure -% metabolite string with name of metabolite to be replace -% replacement string with name of replacement metabolite -% verbose logical whether to print the ids of reactions that -% involve the replaced metabolite (optional, default -% false) -% identifiers true if 'metabolite' and 'replacement' refer to -% metabolite identifiers instead of metabolite names -% (optional, default false) -% -% Output: -% model model structure with selected metabolites replaced -% removedRxns identifiers of duplicate reactions that were removed -% idxDuplRxns index of removedRxns in original model +% Replaces metabolite names and annotation with a replacement metabolite +% that is already in the model. If this results in duplicate metabolites, +% the replacement metabolite will be kept, while the S matrix is updated to +% use the replacement metabolite instead. At the end, contractModel is run +% to remove any duplicate reactions that might have occurred. % -% Note: This function is useful when the model contains both 'oxygen' and -% 'o2' as metabolite names. If 'oxygen' and 'o2' are identifiers instead, -% then the 'identifiers' flag should be set to true. +% Parameters +% ---------- +% model : struct +% a model structure. +% metabolite : char +% string with name of metabolite to be replaced. +% replacement : char +% string with name of replacement metabolite. +% verbose : logical, optional +% whether to print the ids of reactions that involve the replaced +% metabolite (default false). +% identifiers : logical, optional +% true if 'metabolite' and 'replacement' refer to metabolite +% identifiers instead of metabolite names (default false). % -% Usage: [model, removedRxns, idxDuplRxns] = replaceMets(model, metabolite, replacement, verbose) +% Returns +% ------- +% model : struct +% model structure with selected metabolites replaced. +% removedRxns : cell +% identifiers of duplicate reactions that were removed. 
+% idxDuplRxns : double +% index of removedRxns in original model. +% +% Examples +% -------- +% [model, removedRxns, idxDuplRxns] = replaceMets(model, metabolite, ... +% replacement, verbose, identifiers); +% +% Notes +% ----- +% This function is useful when the model contains both 'oxygen' and 'o2' as +% metabolite names. If 'oxygen' and 'o2' are identifiers instead, then the +% 'identifiers' flag should be set to true. metabolite=char(metabolite); replacement=char(replacement); diff --git a/manipulation/setExchangeBounds.m b/manipulation/setExchangeBounds.m index 89aa4ac7..74b16c19 100755 --- a/manipulation/setExchangeBounds.m +++ b/manipulation/setExchangeBounds.m @@ -1,50 +1,55 @@ function [exchModel,unusedMets] = setExchangeBounds(model,mets,lb,ub,closeOthers,mediaOnly) -% setExchangeBounds -% Define the exchange flux bounds for a given set of metabolites. +% setExchangeBounds Define exchange flux bounds for a set of metabolites. % -% Input: -% model a model structure -% mets a cell array of metabolite names (case insensitive) or -% metabolite IDs, or a vector of metabolite indices -% (optional, default all exchanged metabolites) -% lb lower bound of exchange flux. Can be either a vector of -% bounds corresponding to each of the provided metabolites, -% or a single value that will be applied to all. -% (optional, default to model.annotation.defaultLB if it exists, -% otherwise -1000) -% ub upper bound of exchange flux. Can be either a vector of -% bounds corresponding to each of the provided metabolites, -% or a single value that will be applied to all. -% (optional, default to model.annotation.defaultUB if it exists, -% otherwise 1000) -% closeOthers close exchange reactions for all other exchanged -% metabolites not present in the provided list. This will -% prevent IMPORT of the metabolites, but their EXPORT will -% not be modified.
-% (optional, default true) -% mediaOnly only consider exchange reactions involving exchange to or -% from the extracellular (media) compartment. Reactions -% such as "sink" reactions that exchange metabolites -% directly with an intracellular compartment will therefore -% be ignored even though "getExchangeRxns" identifies such -% such reactions as exchange reactions. -% Note: The function will attempt to identify the -% extracellular compartment by the "compNames" field, and -% also requires the "metComps" field to be present, -% otherwise the mediaOnly flag will be ignored. -% (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% mets : cell or double, optional +% a cell array of metabolite names (case insensitive) or metabolite +% IDs, or a vector of metabolite indices (default all exchanged +% metabolites). +% lb : double, optional +% lower bound of exchange flux. Can be either a vector of bounds +% corresponding to each of the provided metabolites, or a single value +% that will be applied to all (default model.annotation.defaultLB if it +% exists, otherwise -1000). +% ub : double, optional +% upper bound of exchange flux. Can be either a vector of bounds +% corresponding to each of the provided metabolites, or a single value +% that will be applied to all (default model.annotation.defaultUB if it +% exists, otherwise 1000). +% closeOthers : logical, optional +% close exchange reactions for all other exchanged metabolites not +% present in the provided list. This will prevent IMPORT of the +% metabolites, but their EXPORT will not be modified (default true). +% mediaOnly : logical, optional +% only consider exchange reactions involving exchange to or from the +% extracellular (media) compartment. Reactions such as "sink" reactions +% that exchange metabolites directly with an intracellular compartment +% will therefore be ignored even though "getExchangeRxns" identifies +% such reactions as exchange reactions. 
The function will attempt to +% identify the extracellular compartment by the "compNames" field, and +% also requires the "metComps" field to be present, otherwise the +% mediaOnly flag will be ignored (default false). % -% Output: -% exchModel a model structure with updated exchange flux bounds for -% the provided set of metabolites -% unusedMets metabolites provided by the user that were not used -% because they are not involved in any exchange reactions -% in the model +% Returns +% ------- +% exchModel : struct +% a model structure with updated exchange flux bounds for the provided +% set of metabolites. +% unusedMets : cell +% metabolites provided by the user that were not used because they are +% not involved in any exchange reactions in the model. % -% NOTE: Exchange reactions involving more than one metabolite will be -% ignored. +% Examples +% -------- +% [exchModel, unusedMets] = setExchangeBounds(model, mets, lb, ub, ... +% closeOthers, mediaOnly); % -% Usage: exchModel = setExchangeBounds(model,mets,lb,ub,closeOthers,mediaOnly); +% Notes +% ----- +% Exchange reactions involving more than one metabolite will be ignored. % handle input arguments diff --git a/manipulation/setParam.m b/manipulation/setParam.m index 70705f3f..2b4e971c 100755 --- a/manipulation/setParam.m +++ b/manipulation/setParam.m @@ -1,33 +1,42 @@ function model=setParam(model, paramType, rxnList, params, var) -% setParam -% Sets parameters for reactions +% setParam Set parameters for reactions.
% -% Input: -% model a model structure -% paramType the type of parameter to set: -% 'lb' lower bound -% 'ub' upper bound -% 'eq' both upper and lower bound (equality constraint) -% 'obj' objective coefficient -% 'rev' reversibility (only changes the model.rev fields, -% does not affect model.lb and model.ub) -% 'var' variance around measured bound -% 'unc' unconstrained, set lower and upper bound to the -% default values (-1000 and 1000, or any other values -% that are defined in model.annotation.defaultLB and -% .defaultUB) -% rxnList a cell array of reaction IDs or a vector with their -% corresponding indexes -% params a vector of the corresponding values -% var percentage of variance around measured value, if 'var' is -% set as paramType. Defining 'var' as 5 results in lb and ub -% at 97.5% and 102.5% of the provide params value (if params -% value is negative, then lb and ub are 102.5% and 97.5%). +% Parameters +% ---------- +% model : struct +% a model structure. +% paramType : char +% the type of parameter to set: % -% Output: -% model an updated model structure +% - 'lb' : lower bound. +% - 'ub' : upper bound. +% - 'eq' : both upper and lower bound (equality constraint). +% - 'obj' : objective coefficient. +% - 'rev' : reversibility (only changes the model.rev fields, does not +% affect model.lb and model.ub). +% - 'var' : variance around measured bound. +% - 'unc' : unconstrained, set lower and upper bound to the default +% values (-1000 and 1000, or any other values that are defined in +% model.annotation.defaultLB and .defaultUB). +% rxnList : cell or double +% a cell array of reaction IDs or a vector with their corresponding +% indexes. +% params : double +% a vector of the corresponding values. +% var : double, optional +% percentage of variance around measured value, if 'var' is set as +% paramType. 
Defining 'var' as 5 results in lb and ub at 97.5% and +% 102.5% of the provided params value (if params value is negative, +% then lb and ub are 102.5% and 97.5%). % -% Usage: model = setParam(model, paramType, rxnList, params, var) +% Returns +% ------- +% model : struct +% an updated model structure. +% +% Examples +% -------- +% model = setParam(model, paramType, rxnList, params, var); paramType=convertCharArray(paramType); if ~any(strcmpi(paramType,{'lb','ub','eq','obj','rev','var','unc'})) diff --git a/manipulation/simplifyModel.m b/manipulation/simplifyModel.m index d99f7ae6..1c862579 100755 --- a/manipulation/simplifyModel.m +++ b/manipulation/simplifyModel.m @@ -1,40 +1,55 @@ function [reducedModel, deletedReactions, deletedMetabolites]=simplifyModel(model,... deleteUnconstrained, deleteDuplicates, deleteZeroInterval, deleteInaccessible, deleteMinMax, groupLinear, constrainReversible, reservedRxns, suppressWarnings) -% simplifyModel -% Simplifies a model by deleting reactions/metabolites +% simplifyModel Simplify a model by deleting reactions and metabolites. % -% model a model structure -% deleteUnconstrained delete metabolites marked as unconstrained (optional, default true) -% deleteDuplicates delete all but one of duplicate reactions (optional, default false) -% deleteZeroInterval delete reactions that are constrained to zero flux (optional, default false) -% deleteInaccessible delete dead end reactions (optional, default false) -% deleteMinMax delete reactions that cannot carry a flux by trying -% to minimize/maximize the flux through that -% reaction. May be time consuming (optional, default false) -% groupLinear group linearly dependent pathways (optional, default false) -% constrainReversible check if there are reversible reactions which can -% only carry flux in one direction, and if so -% constrain them to be irreversible. 
This tends to -% allow for more reactions grouped when using -% groupLinear (optional, default false) -% reservedRxns cell array with reaction IDs that are not allowed to be -% removed (optional) -% suppressWarnings true if warnings should be suppressed (optional, -% default false) +% This function is for reducing the model size by removing reactions and +% associated metabolites that cannot carry flux. It can also be used for +% identifying different types of gaps. % -% reducedModel an updated model structure -% deletedReactions a cell array with the IDs of all deleted reactions -% deletedMetabolites a cell array with the IDs of all deleted -% metabolites +% Parameters +% ---------- +% model : struct +% a model structure. +% deleteUnconstrained : logical, optional +% delete metabolites marked as unconstrained (default true). +% deleteDuplicates : logical, optional +% delete all but one of duplicate reactions (default false). +% deleteZeroInterval : logical, optional +% delete reactions that are constrained to zero flux (default false). +% deleteInaccessible : logical, optional +% delete dead end reactions (default false). +% deleteMinMax : logical, optional +% delete reactions that cannot carry a flux by trying to +% minimize/maximize the flux through that reaction. May be time +% consuming (default false). +% groupLinear : logical, optional +% group linearly dependent pathways (default false). +% constrainReversible : logical, optional +% check if there are reversible reactions which can only carry flux in +% one direction, and if so constrain them to be irreversible. This +% tends to allow for more reactions grouped when using groupLinear +% (default false). +% reservedRxns : cell, optional +% cell array with reaction IDs that are not allowed to be removed +% (default none). +% suppressWarnings : logical, optional +% true if warnings should be suppressed (default false). 
% -% This function is for reducing the model size by removing -% reactions and associated metabolites that cannot carry flux. It can also -% be used for identifying different types of gaps. +% Returns +% ------- +% reducedModel : struct +% an updated model structure. +% deletedReactions : cell +% a cell array with the IDs of all deleted reactions. +% deletedMetabolites : cell +% a cell array with the IDs of all deleted metabolites. % -% Usage: [reducedModel, deletedReactions, deletedMetabolites]=simplifyModel(model,... -% deleteUnconstrained, deleteDuplicates, deleteZeroInterval,... -% deleteInaccessible, deleteMinMax, groupLinear,... -% constrainReversible, reservedRxns, suppressWarnings) +% Examples +% -------- +% [reducedModel, deletedReactions, deletedMetabolites] = ... +% simplifyModel(model, deleteUnconstrained, deleteDuplicates, ... +% deleteZeroInterval, deleteInaccessible, deleteMinMax, ... +% groupLinear, constrainReversible, reservedRxns, suppressWarnings); if nargin<2 deleteUnconstrained=true; diff --git a/manipulation/sortIdentifiers.m b/manipulation/sortIdentifiers.m index 98738de5..5a062a1e 100755 --- a/manipulation/sortIdentifiers.m +++ b/manipulation/sortIdentifiers.m @@ -1,16 +1,22 @@ function newModel = sortIdentifiers(model) -% exportModel -% Sort reactions, metabolites, genes and compartments alphabetically by -% their identifier. +% sortIdentifiers Sort model identifiers alphabetically. % -% Input: -% model a model structure +% Sort reactions, metabolites, genes and compartments alphabetically by +% their identifier. % -% Output: -% newModel an updated model structure with alphabetically sorted -% identifiers +% Parameters +% ---------- +% model : struct +% a model structure. % -% Usage: newModel=sortIdentifiers(model) +% Returns +% ------- +% newModel : struct +% an updated model structure with alphabetically sorted identifiers. 
+% +% Examples +% -------- +% newModel = sortIdentifiers(model); [~,I]=sort(model.rxns); newModel=permuteModel(model,I,'rxns'); diff --git a/manipulation/sortModel.m b/manipulation/sortModel.m index 7dc00d7e..0ff3d53d 100755 --- a/manipulation/sortModel.m +++ b/manipulation/sortModel.m @@ -1,23 +1,31 @@ function model=sortModel(model,sortReversible,sortMetName,sortReactionOrder) -% sortModel -% Sorts a model based on metabolite names and compartments +% sortModel Sort a model based on metabolite names and compartments. % -% model a model structure -% sortReversible sorts the reversible reactions so the the metabolite -% that is first in lexiographical order is a reactant -% (optional, default true) -% sortMetName sort the metabolite names in the equation, also uses -% compartment abbreviation (optional, default false) -% sortReactionOrder sorts the reaction order within each subsystem so that -% reactions consuming some metabolite comes efter -% reactions producing it. This overrides the -% sortReversible option and reactions are sorted so that -% the production direction matches the consumption -% direction (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% sortReversible : logical, optional +% sorts the reversible reactions so that the metabolite that is first +% in lexicographical order is a reactant (default true). +% sortMetName : logical, optional +% sort the metabolite names in the equation, also uses compartment +% abbreviation (default false). +% sortReactionOrder : logical, optional +% sorts the reaction order within each subsystem so that reactions +% consuming some metabolite come after reactions producing it. This +% overrides the sortReversible option and reactions are sorted so that +% the production direction matches the consumption direction (default +% false). % -% model an updated model structure +% Returns +% ------- +% model : struct +% an updated model structure. 
% -% Usage: model=sortModel(model,sortReversible,sortMetName,sortReactionOrder) +% Examples +% -------- +% model = sortModel(model, sortReversible, sortMetName, sortReactionOrder); if nargin<2 sortReversible=true; diff --git a/manipulation/standardizeGrRules.m b/manipulation/standardizeGrRules.m index 7d079043..6d6c560c 100755 --- a/manipulation/standardizeGrRules.m +++ b/manipulation/standardizeGrRules.m @@ -1,25 +1,42 @@ function [grRules,rxnGeneMat,indexes2check] = standardizeGrRules(model,embedded) -% standardizeGrRules -% Standardizes gene-rxn rules in a model according to the following -% - No overall containing brackets -% - Just enzyme complexes are enclosed into brackets -% - ' and ' & ' or ' strings are strictly set to lowercases +% standardizeGrRules Standardize gene-reaction rules in a model. % -% A rxnGeneMat matrix consistent with the standardized grRules is created. +% The grRules are standardized according to the following: % -% model a model structure -% embedded true if this function is called inside of another -% RAVEN function (optional, default false) +% - No overall containing brackets +% - Just enzyme complexes are enclosed into brackets +% - ' and ' and ' or ' strings are strictly set to lowercase % -% grRules [nRxns x 1] cell array with the standardized grRules -% rxnGeneMat [nRxns x nGenes]Sparse matrix consitent with the -% standardized grRules -% -% If this function is going to be used in a model reconstruction or -% modification pipeline it is recommended to run this function just -% at the beginning of the process. +% A rxnGeneMat matrix consistent with the standardized grRules is created. % -% Usage: [grRules,rxnGeneMat,indexes2check]=standardizeGrRules(model,embedded) +% Parameters +% ---------- +% model : struct +% a model structure. +% embedded : logical, optional +% true if this function is called inside of another RAVEN function +% (default false). 
+% +% Returns +% ------- +% grRules : cell +% [nRxns x 1] cell array with the standardized grRules. +% rxnGeneMat : double +% [nRxns x nGenes] sparse matrix consistent with the standardized +% grRules. +% indexes2check : double +% indices of the grRules with potentially problematic relationships +% that should be checked manually. +% +% Examples +% -------- +% [grRules, rxnGeneMat, indexes2check] = standardizeGrRules(model, embedded); +% +% Notes +% ----- +% If this function is going to be used in a model reconstruction or +% modification pipeline it is recommended to run this function just at the +% beginning of the process. %Preallocate fields n = length(model.rxns); diff --git a/omics/parseHPA.m b/omics/parseHPA.m index 5b7d4ca3..f9ea2d64 100755 --- a/omics/parseHPA.m +++ b/omics/parseHPA.m @@ -1,40 +1,42 @@ function hpaData=parseHPA(fileName, version) -% parseHPA -% Parses a database dump of the Human Protein Atlas (HPA) +% parseHPA Parse a database dump of the Human Protein Atlas (HPA). % -% Input: -% fileName comma- or tab-separated database dump of HPA. For details -% regarding the format, see -% http://www.proteinatlas.org/about/download. -% version version of HPA [optional, default=19] +% Parameters +% ---------- +% fileName : char +% comma- or tab-separated database dump of HPA. For details regarding +% the format, see http://www.proteinatlas.org/about/download. +% version : double, optional +% version of HPA (default 19). % +% Returns +% ------- +% hpaData : struct +% parsed HPA data with fields: % -% Output: -% hpaData -% genes cell array with the unique gene names. In -% version >=18 this is the ensemble name, see -% geneNames below for the names in ver >=18 -% geneNames cell array with the gene names, indexed the -% same way as genes. -% tissues cell array with the tissue names. 
The list may not be -% unique, as there can be multiple cell types per tissue -% celltypes cell array with the cell type names for each tissue -% levels cell array with the unique expression levels -% types cell array with the unique evidence types -% reliabilities cell array with the unique reliability levels +% - genes : cell array with the unique gene names. In version >=18 this +% is the Ensembl ID, see geneNames below for the names in ver >=18 +% - geneNames : cell array with the gene names, indexed the same way as +% genes +% - tissues : cell array with the tissue names. The list may not be +% unique, as there can be multiple cell types per tissue +% - celltypes : cell array with the cell type names for each tissue +% - levels : cell array with the unique expression levels +% - types : cell array with the unique evidence types +% - reliabilities : cell array with the unique reliability levels +% - gene2Level : gene-to-expression level mapping in sparse matrix form. +% The value for element i,j is the index in hpaData.levels of gene i +% in cell type j +% - gene2Type : gene-to-evidence type mapping in sparse matrix form. The +% value for element i,j is the index in hpaData.types of gene i in +% cell type j. Doesn't exist in version >=18. +% - gene2Reliability : gene-to-reliability level mapping in sparse +% matrix form. The value for element i,j is the index in +% hpaData.reliabilities of gene i in cell type j % -% gene2Level gene-to-expression level mapping in sparse matrix form. -% The value for element i,j is the index in -% hpaData.levels of gene i in cell type j -% gene2Type gene-to-evidence type mapping in sparse matrix form. -% The value for element i,j is the index in -% hpaData.types of gene i in cell type j. Doesn't -% exist in version >=18. -% gene2Reliability gene-to-reliability level mapping in sparse matrix form.
-% The value for element i,j is the index in -% hpaData.reliabilities of gene i in cell type j -% -% Usage: hpaData=parseHPA(fileName,version) +% Examples +% -------- +% hpaData = parseHPA(fileName, version); if nargin<2 version=19; %Change this and add code for more versions when the current HPA version is increased and the format is changed diff --git a/omics/parseHPArna.m b/omics/parseHPArna.m index e58b3d69..8abf3c40 100755 --- a/omics/parseHPArna.m +++ b/omics/parseHPArna.m @@ -1,24 +1,28 @@ function arrayData=parseHPArna(fileName, version) -% parseHPA -% Parses a database dump of the Human Protein Atlas (HPA) RNA-Seq data. +% parseHPArna Parse a dump of Human Protein Atlas (HPA) RNA-Seq data. % -% Input: -% fileName tab-separated database dump of HPA RNA data. For -% details regarding the format, see -% http://www.proteinatlas.org/about/download. -% version version of HPA [optional, default=19] +% Parameters +% ---------- +% fileName : char +% tab-separated database dump of HPA RNA data. For details regarding the +% format, see http://www.proteinatlas.org/about/download. +% version : double, optional +% version of HPA (default 19). Only versions 18 and 19 are supported. 
% +% Returns +% ------- +% arrayData : struct +% parsed HPA RNA data with fields: % -% Output: -% arrayData -% genes cell array with the unique ensemble gene IDs -% geneNames cell array with the gene names (gene abbrevs) -% tissues cell array with the tissue names -% levels matrix of gene expression levels (TPM), where -% rows correspond to genes, and columns -% correspond to tissues +% - genes : cell array with the unique Ensembl gene IDs +% - geneNames : cell array with the gene names (gene abbrevs) +% - tissues : cell array with the tissue names +% - levels : matrix of gene expression levels (TPM), where rows +% correspond to genes, and columns correspond to tissues % -% Usage: arrayData=parseHPArna(fileName,version) +% Examples +% -------- +% arrayData = parseHPArna(fileName, version); if nargin<2 %Change this and add code for more versions when the current HPA diff --git a/omics/scoreModel.m b/omics/scoreModel.m index 896381e2..a1047104 100755 --- a/omics/scoreModel.m +++ b/omics/scoreModel.m @@ -1,60 +1,72 @@ function [rxnScores, geneScores, hpaScores, arrayScores]=scoreModel(model,hpaData,arrayData,tissue,celltype,noGeneScore,multipleGeneScoring,multipleCellScoring,hpaLevelScores) -% scoreRxns -% Scores the reactions and genes in a model based on expression data -% from HPA and/or gene arrays +% scoreModel Score model reactions and genes from HPA and/or array data. % -% Input: -% model a model structure -% hpaData HPA data structure from parseHPA (optional if arrayData is -% supplied, default []) -% arrayData gene expression data structure (optional if hpaData is -% supplied, default []) -% genes cell array with the unique gene names -% tissues cell array with the tissue names. The list may not be -% unique, as there can be multiple cell types per tissue -% celltypes cell array with the cell type names for each tissue -% levels GENESxTISSUES array with the expression level for -% each gene in each tissue/celltype.
NaN should be -% used when no measurement was performed -% threshold a single value or a vector of gene expression -% thresholds, above which genes are considered to be -% "expressed". (optional, by default, the mean expression -% levels of each gene across all tissues in arrayData -% will be used as the threshold values) -% tissue tissue to score for. Should exist in either -% hpaData.tissues or arrayData.tissues -% celltype cell type to score for. Should exist in either -% hpaData.celltypes or arrayData.celltypes for this -% tissue (optional, default is to use the best values -% among all the cell types for the tissue. Use [] if -% you want to supply more arguments) -% noGeneScore score for reactions without genes (optional, default -2) -% multipleGeneScoring determines how scores are calculated for reactions -% with several genes ('best' or 'average') -% (optional, default 'best') -% multipleCellScoring determines how scores are calculated when several -% cell types are used ('best' or 'average') -% (optional, default 'best') -% hpaLevelScores structure with numerical scores for the expression -% level categories from HPA. The structure should have a -% "names" and a "scores" field (optional, see code for -% default scores) +% Scores the reactions and genes in a model based on expression data from +% HPA and/or gene arrays. % +% Parameters +% ---------- +% model : struct +% a model structure. +% hpaData : struct, optional +% HPA data structure from parseHPA (optional if arrayData is supplied, +% default []). +% arrayData : struct, optional +% gene expression data structure (optional if hpaData is supplied, +% default []) with fields: % -% Output: -% rxnScores scores for each of the reactions in model -% geneScores scores for each of the genes in model. Genes which are -% not in the dataset(s) have -Inf as scores -% hpaScores scores for each of the genes in model if only taking hpaData -% into account. 
Genes which are not in the dataset(s) -% have -Inf as scores -% arrayScores scores for each of the genes in model if only taking arrayData -% into account. Genes which are not in the dataset(s) -% have -Inf as scores +% - genes : cell array with the unique gene names +% - tissues : cell array with the tissue names. The list may not be +% unique, as there can be multiple cell types per tissue +% - celltypes : cell array with the cell type names for each tissue +% - levels : GENESxTISSUES array with the expression level for each gene +% in each tissue/celltype. NaN should be used when no measurement was +% performed +% - threshold : a single value or a vector of gene expression +% thresholds, above which genes are considered to be "expressed". +% (optional, by default, the mean expression levels of each gene +% across all tissues in arrayData will be used as the threshold +% values) +% tissue : char +% tissue to score for. Should exist in either hpaData.tissues or +% arrayData.tissues. +% celltype : char, optional +% cell type to score for. Should exist in either hpaData.celltypes or +% arrayData.celltypes for this tissue (default is to use the best values +% among all the cell types for the tissue). Use [] if you want to supply +% more arguments. +% noGeneScore : double, optional +% score for reactions without genes (default -2). +% multipleGeneScoring : char, optional +% determines how scores are calculated for reactions with several genes, +% 'best' or 'average' (default 'best'). +% multipleCellScoring : char, optional +% determines how scores are calculated when several cell types are used, +% 'best' or 'average' (default 'best'). +% hpaLevelScores : struct, optional +% structure with numerical scores for the expression level categories +% from HPA. The structure should have a "names" and a "scores" field +% (default see code for default scores). % -% Usage: [rxnScores, geneScores, hpaScores, arrayScores]=scoreModel(model,... 
-% hpaData,arrayData,tissue,celltype,noGeneScore,multipleGeneScoring,... -% multipleCellScoring,hpaLevelScores) +% Returns +% ------- +% rxnScores : double +% scores for each of the reactions in model. +% geneScores : double +% scores for each of the genes in model. Genes which are not in the +% dataset(s) have -Inf as scores. +% hpaScores : double +% scores for each of the genes in model if only taking hpaData into +% account. Genes which are not in the dataset(s) have -Inf as scores. +% arrayScores : double +% scores for each of the genes in model if only taking arrayData into +% account. Genes which are not in the dataset(s) have -Inf as scores. +% +% Examples +% -------- +% [rxnScores, geneScores, hpaScores, arrayScores] = scoreModel(model, ... +% hpaData, arrayData, tissue, celltype, noGeneScore, ... +% multipleGeneScoring, multipleCellScoring, hpaLevelScores); if nargin<3 arrayData=[]; diff --git a/queries/buildEquation.m b/queries/buildEquation.m index a4a20563..bcfe77ec 100755 --- a/queries/buildEquation.m +++ b/queries/buildEquation.m @@ -1,15 +1,23 @@ function equationString=buildEquation(mets,stoichCoeffs,isrev) -% buildEquation -% Construct single equation string for a given reaction +% buildEquation Construct single equation string for a given reaction. % -% mets cell array with metabolites involved in the reaction. -% stoichCoeffs vector with corresponding stoichiometric coeffs. -% isrev logical indicating if the reaction is or not -% reversible. +% Parameters +% ---------- +% mets : cell +% metabolites involved in the reaction. +% stoichCoeffs : double +% vector with corresponding stoichiometric coeffs. +% isrev : logical +% indicates if the reaction is reversible or not. % -% equationString equation as a string +% Returns +% ------- +% equationString : char +% equation as a string. 
% -% Usage: equationString=buildEquation(mets,stoichCoeffs,isrev) +% Examples +% -------- +% equationString = buildEquation(mets, stoichCoeffs, isrev); mets=convertCharArray(mets); if ~isnumeric(stoichCoeffs) diff --git a/queries/checkModelStruct.m b/queries/checkModelStruct.m index eb9f8e6d..140b7c64 100755 --- a/queries/checkModelStruct.m +++ b/queries/checkModelStruct.m @@ -1,18 +1,26 @@ function checkModelStruct(model,throwErrors,trimWarnings) -% checkModelStruct -% Performs a number of checks to ensure that a model structure is ok +% checkModelStruct Perform a number of checks to ensure a model structure is ok. % -% model a model structure -% throwErrors true if the function should throw errors if -% inconsistencies are found. The alternative is to -% print warnings for all types of issues (optional, default true) -% trimWarnings true if only a maximal of 10 items should be displayed in -% a given error/warning (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% throwErrors : logical, optional +% true if the function should throw errors if inconsistencies are found. +% The alternative is to print warnings for all types of issues +% (default true). +% trimWarnings : logical, optional +% true if only a maximum of 10 items should be displayed in a given +% error/warning (default true). % -% NOTE: This is performed after importing a model from Excel or before -% attempting to export a model to SBML format. +% Notes +% ----- +% This is performed after importing a model from Excel or before attempting +% to export a model to SBML format. 
% -% Usage: checkModelStruct(model,throwErrors,trimWarnings) +% Examples +% -------- +% checkModelStruct(model, throwErrors, trimWarnings); if nargin<2 throwErrors=true; diff --git a/queries/constructEquations.m b/queries/constructEquations.m index c2f38189..06f5018f 100755 --- a/queries/constructEquations.m +++ b/queries/constructEquations.m @@ -1,38 +1,46 @@ function equationStrings=constructEquations(model,rxns,useComps,sortRevRxns,sortMetNames,useMetID,useFormula,useRevField) -% constructEquations -% Construct equation strings for reactions +% constructEquations Construct equation strings for reactions. % -% Input: -% model a model structure -% rxns either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the -% model, or a vector of reaction indexes (optional, default -% model.rxns) -% useComps include the compartment of each metabolite (optional, -% default true) -% sortRevRxns sort reversible reactions so that the metabolite that -% is first in the lexiographic order is a reactant -% (optional, default false) -% sortMetNames sort the metabolite names in the equation. Uses -% compartment even if useComps is false (optional, default -% false) -% useMetID use metabolite ID in generated equations (optional, -% default false) -% useFormula use metabolite formula in generated equations (optional, -% default false) -% useRevField use the model.rev field to indicate reaction -% reversibility, alternatively this is determined from -% the model.ub and model.lb fields (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of reaction +% indexes (default model.rxns). +% useComps : logical, optional +% include the compartment of each metabolite (default true). 
+% sortRevRxns : logical, optional +% sort reversible reactions so that the metabolite that is first in the +% lexicographic order is a reactant (default false). +% sortMetNames : logical, optional +% sort the metabolite names in the equation. Uses compartment even if +% useComps is false (default false). +% useMetID : logical, optional +% use metabolite ID in generated equations (default false). +% useFormula : logical, optional +% use metabolite formula in generated equations (default false). +% useRevField : logical, optional +% use the model.rev field to indicate reaction reversibility, +% alternatively this is determined from the model.ub and model.lb fields +% (default true). % -% Output: -% equationStrings a cell array with equations +% Returns +% ------- +% equationStrings : cell +% a cell array with equations. % -% If useRevField is false, then reactions should be organized in their -% forward direction (e.g. ub = 1000 and lb = -1000/0) for the -% reversibility to be correctly determined. +% Examples +% -------- +% equationStrings = constructEquations(model, rxns, useComps, ... +% sortRevRxns, sortMetNames, useMetID, useFormula, useRevField); % -% Usage: equationStrings = constructEquations(model, rxns, useComps,... -% sortRevRxns, sortMetNames, useMetID, useFormula, useRevField) +% Notes +% ----- +% If useRevField is false, then reactions should be organized in their +% forward direction (e.g. ub = 1000 and lb = -1000/0) for the reversibility +% to be correctly determined. if nargin<2 || isempty(rxns) rxns=model.rxns; diff --git a/queries/constructS.m b/queries/constructS.m index 5412cf24..034e8b82 100755 --- a/queries/constructS.m +++ b/queries/constructS.m @@ -1,26 +1,36 @@ function [S, mets, badRxns, reversible]=constructS(equations,mets,rxns) -% constructS -% Constructs a stoichiometric matrix from a cell array of equations +% constructS Construct a stoichiometric matrix from a cell array of equations. 
% -% equations cell array of equations on the form 'A + 2 B <=> 3 C', -% where <=> indicates reversible and => irreversible reactions -% mets cell array of metabolites. All metabolites in the equations -% must be present in the list (optional, default generated from -% the equations) -% rxns cell array of reaction ids. This is only used for printing -% reaction ids instead of equations in warnings/errors (optional, -% default []) +% Parameters +% ---------- +% equations : cell +% cell array of equations of the form 'A + 2 B <=> 3 C', where <=> +% indicates reversible and => irreversible reactions. +% mets : cell, optional +% cell array of metabolites. All metabolites in the equations must be +% present in the list (default generated from the equations). +% rxns : cell, optional +% cell array of reaction ids. This is only used for printing reaction ids +% instead of equations in warnings/errors (default []). % -% S the resulting stoichiometric matrix mets cell array with -% metabolites that corresponds to the order in the S matrix -% badRxns boolean vector with the reactions that have one or more -% metabolites as both substrate and product. An example would -% be the phosphotransferase ATP + ADP <=> ADP + ATP. In the -% stoichiometric matrix this equals to an empty reaction -% which can be problematic -% reversible boolean vector with true if the equation is reversible +% Returns +% ------- +% S : double +% the resulting stoichiometric matrix. +% mets : cell +% cell array with metabolites that correspond to the order in the S +% matrix. +% badRxns : logical +% boolean vector with the reactions that have one or more metabolites as +% both substrate and product. An example would be the phosphotransferase +% ATP + ADP <=> ADP + ATP. In the stoichiometric matrix this equals an +% empty reaction, which can be problematic. +% reversible : logical +% boolean vector with true if the equation is reversible.
% -% Usage: [S, mets, badRxns, reversible]=constructS(equations,mets) +% Examples +% -------- +% [S, mets, badRxns, reversible] = constructS(equations, mets); equations=convertCharArray(equations); switch nargin diff --git a/queries/getAllRxnsFromGenes.m b/queries/getAllRxnsFromGenes.m index 36e66cde..a1710c92 100755 --- a/queries/getAllRxnsFromGenes.m +++ b/queries/getAllRxnsFromGenes.m @@ -1,19 +1,27 @@ function allRxns=getAllRxnsFromGenes(model,rxns) -% getAllRxnsFromGenes -% Given a list of reactions, this function finds the associated genes in -% the template model and gives all reactions that are annotated by these -% genes. +% getAllRxnsFromGenes Find all reactions annotated by the genes of a set. % -% model a model structure -% rxns either a cell array of IDs, a logical vector with the -% same number of elements as reactions in the model, or a -% vector of indexes +% Given a list of reactions, this function finds the associated genes in +% the model and returns all reactions that are annotated by these genes. % -% allRxns either a cell array of IDs, a logical vector with the -% same number of elements as reactions in the model, or a -% vector of indexes, dependent on the format of rxns +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell or logical or double +% either a cell array of IDs, a logical vector with the same number of +% elements as reactions in the model, or a vector of indexes. % -% Usage: allRxns=getAllRxnsFromGenes(model,rxns) +% Returns +% ------- +% allRxns : cell or logical or double +% either a cell array of IDs, a logical vector with the same number of +% elements as reactions in the model, or a vector of indexes, +% dependent on the format of rxns. 
+% +% Examples +% -------- +% allRxns = getAllRxnsFromGenes(model, rxns); if ~islogical(rxns) && ~isnumeric(rxns) rxns=convertCharArray(rxns); diff --git a/queries/getElementalBalance.m b/queries/getElementalBalance.m index 8bc76fa7..24c03a4d 100755 --- a/queries/getElementalBalance.m +++ b/queries/getElementalBalance.m @@ -1,30 +1,41 @@ function balanceStructure=getElementalBalance(model,rxns,printUnbalanced,printUnparsable) -% getElementalBalance -% Checks a model to see if the reactions are elementally balanced +% getElementalBalance Check whether the reactions of a model are balanced. % -% model a model structure -% rxns either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the model, -% of a vector of indexes. Only these reactions will be -% checked (optional, default model.rxns) -% printUnbalanced print warnings about the reactions that were -% unbalanced (optional, default false) -% printUnparsable print warnings about the reactions that cannot be -% parsed (optional, default false) +% Checks a model to see if the reactions are elementally balanced. % -% balanceStructure -% balanceStatus 1 if the reaction is balanced, 0 if it's unbalanced, -% -1 if it couldn't be balanced due to missing information, -% -2 if it couldn't be balanced due to an error -% elements -% abbrevs cell array with abbreviations for all used elements -% names cell array with the names for all used elements -% leftComp MxN matrix with the sum of coefficients for each of -% the elements (N) for the left side of the -% reactions (M) -% rightComp the corresponding matrix for the right side +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of +% indexes. Only these reactions will be checked (default model.rxns). 
+% printUnbalanced : logical, optional +% print warnings about the reactions that were unbalanced +% (default false). +% printUnparsable : logical, optional +% print warnings about the reactions that cannot be parsed +% (default false). % -% Usage: balanceStructure=getElementalBalance(model,rxns,printUnbalanced,printUnparsable) +% Returns +% ------- +% balanceStructure : struct +% elemental balance structure with fields: +% +% - balanceStatus : 1 if the reaction is balanced, 0 if it is +% unbalanced, -1 if it could not be balanced due to missing +% information, -2 if it could not be balanced due to an error +% - elements : struct with fields abbrevs (cell array with +% abbreviations for all used elements) and names (cell array with +% the names for all used elements) +% - leftComp : MxN matrix with the sum of coefficients for each of the +% elements (N) for the left side of the reactions (M) +% - rightComp : the corresponding matrix for the right side +% +% Examples +% -------- +% balanceStructure = getElementalBalance(model, rxns, printUnbalanced, printUnparsable); if nargin<2 rxns=[]; diff --git a/queries/getExchangeRxns.m b/queries/getExchangeRxns.m index d058f689..a7de80bc 100755 --- a/queries/getExchangeRxns.m +++ b/queries/getExchangeRxns.m @@ -1,44 +1,49 @@ function [exchangeRxns, exchangeRxnsIndexes]=getExchangeRxns(model,reactionType) -% getExchangeRxns -% Retrieves the exchange reactions from a model. Exchange reactions are -% identified by having either no substrates or products. +% getExchangeRxns Retrieve the exchange reactions from a model. % -% Input: -% model a model structure -% reactionType which exchange reactions should be returned -% 'all' all reactions, irrespective of reaction -% bounds -% 'uptake' reactions with bounds that imply that -% only uptake are allowed. Reaction -% direction, upper and lower bounds are -% all considered -% 'excrete' reactions with bounds that imply that -% only excretion are allowed. 
Reaction -% direction, upper and lower bounds are -% all considered -% 'reverse' reactions with non-zero upper and lower -% bounds that imply that both uptake and -% excretion are allowed -% 'blocked' reactions that have zero upper and lower -% bounds, not allowing any flux -% 'in' reactions where the boundary metabolite -% is the substrate of the reaction, a -% positive flux value would imply uptake, -% but reaction bounds are not considered -% 'out' reactions where the boundary metabolite -% is the product of the reaction, a -% negative flux value would imply uptake, -% but reaction bounds are not considered. +% Exchange reactions are identified by having either no substrates or no +% products. % -% Output: -% exchangeRxns cell array with the IDs of the exchange reactions -% exchangeRxnsIndexes vector with the indexes of the exchange reactions +% Parameters +% ---------- +% model : struct +% a model structure. +% reactionType : char, optional +% which exchange reactions should be returned (default 'all'): % -% Note: -% The union of 'in' and 'out' equals 'all'. Also, the union of 'uptake', -% 'excrete', 'reverse' and 'blocked' equals all. +% - 'all' : all reactions, irrespective of reaction bounds +% - 'uptake' : reactions with bounds that imply that only uptake is +% allowed. Reaction direction, upper and lower bounds are all +% considered +% - 'excrete' : reactions with bounds that imply that only excretion is +% allowed. 
Reaction direction, upper and lower bounds are all +% considered +% - 'reverse' : reactions with non-zero upper and lower bounds that +% imply that both uptake and excretion are allowed +% - 'blocked' : reactions that have zero upper and lower bounds, not +% allowing any flux +% - 'in' : reactions where the boundary metabolite is the substrate of +% the reaction; a positive flux value would imply uptake, but +% reaction bounds are not considered +% - 'out' : reactions where the boundary metabolite is the product of +% the reaction; a negative flux value would imply uptake, but +% reaction bounds are not considered % -% Usage: [exchangeRxns,exchangeRxnsIndexes]=getExchangeRxns(model,reactionType) +% Returns +% ------- +% exchangeRxns : cell +% cell array with the IDs of the exchange reactions. +% exchangeRxnsIndexes : double +% vector with the indexes of the exchange reactions. +% +% Notes +% ----- +% The union of 'in' and 'out' equals 'all'. Also, the union of 'uptake', +% 'excrete', 'reverse' and 'blocked' equals 'all'. +% +% Examples +% -------- +% [exchangeRxns, exchangeRxnsIndexes] = getExchangeRxns(model, reactionType); if nargin<2 reactionType='all'; diff --git a/queries/getGenesFromGrRules.m b/queries/getGenesFromGrRules.m index a251d6bd..74e7620c 100755 --- a/queries/getGenesFromGrRules.m +++ b/queries/getGenesFromGrRules.m @@ -1,28 +1,28 @@ function [genes,rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes) -%getGenesFromGrRules Extract gene list and rxnGeneMat from grRules array. +% getGenesFromGrRules Extract gene list and rxnGeneMat from grRules array. % -% USAGE: +% Parameters +% ---------- +% grRules : cell +% a cell array of model grRules, from which a list of genes is to be +% extracted. NOTE: Boolean operators can be text ("and", "or") or +% symbolic ("&", "|"), but there must be a space between operators and +% gene names/IDs. +% originalGenes : cell, optional +% the original gene list from the model as reference. 
% -% [genes,rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes); -% -% INPUTS: -% -% grRules A cell array of model grRules, from which a list of genes -% are to be extracted. -% NOTE: Boolean operators can be text ("and", "or") or -% symbolic ("&", "|"), but there must be a space -% between operators and gene names/IDs. -% originalGenes The original gene list from the model as reference -% -% OUTPUTS: -% -% genes A unique list of all gene IDs that appear in grRules. -% -% rxnGeneMat (Optional) A binary matrix indicating which genes -% participate in each reaction, where rows correspond to -% reactions (entries in grRules) and columns correspond to -% genes. +% Returns +% ------- +% genes : cell +% a unique list of all gene IDs that appear in grRules. +% rxnGeneMat : double +% (optional) a binary matrix indicating which genes participate in each +% reaction, where rows correspond to reactions (entries in grRules) and +% columns correspond to genes. % +% Examples +% -------- +% [genes, rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes); % handle input arguments diff --git a/queries/getIndexes.m b/queries/getIndexes.m index 72ec4cef..62357b23 100755 --- a/queries/getIndexes.m +++ b/queries/getIndexes.m @@ -1,30 +1,38 @@ function indexes=getIndexes(model, objects, type, returnLogical) -% getIndexes -% Retrieves the indexes for a list of reactions or metabolites +% getIndexes Retrieve the indexes for a list of reactions or metabolites. % -% Input: -% model a model structure -% objects either a cell array of IDs, a logical vector with the -% same number of elements as metabolites in the model, -% of a vector of indexes -% type 'rxns', 'mets', or 'genes' depending on what to retrieve -% 'metnames' queries metabolite names, while 'metcomps' -% allows to provide specific metabolites and their -% compartments in the format metaboliteName[comp]. 
If a -% model.ec structure exists (GECKO 3), then also -% 'ecenzymes', 'ecrxns' and 'ecgenes' are allowed -% returnLogical Sets whether to return a logical array or an array with -% the indexes (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% objects : cell or logical or double +% either a cell array of IDs, a logical vector with the same number of +% elements as metabolites in the model, or a vector of indexes. +% type : char +% 'rxns', 'mets', or 'genes' depending on what to retrieve. 'metnames' +% queries metabolite names, while 'metcomps' allows providing specific +% metabolites and their compartments in the format metaboliteName[comp]. +% If a model.ec structure exists (GECKO 3), then also 'ecenzymes', +% 'ecrxns' and 'ecgenes' are allowed. +% returnLogical : logical, optional +% sets whether to return a logical array or an array with the indexes +% (default false). % -% Output: -% indexes can be a logical array or a double array depending on -% the value of returnLogical +% Returns +% ------- +% indexes : logical or double +% can be a logical array or a double array depending on the value of +% returnLogical. % -% Note: If 'ecenzymes', 'ecrxns' or 'ecgenes' are used with a GECKO 3 -% model, then the indexes are from the model.ec.enzymes, model.ec.rxns or -% model.ec.genes fields, respectively. -% -% Usage: indexes=getIndexes(model, objects, type, returnLogical) +% Notes +% ----- +% If 'ecenzymes', 'ecrxns' or 'ecgenes' are used with a GECKO 3 model, then +% the indexes are from the model.ec.enzymes, model.ec.rxns or model.ec.genes +% fields, respectively. 
+% +% Examples +% -------- +% indexes = getIndexes(model, objects, type, returnLogical); if nargin<4 returnLogical=false; diff --git a/queries/getMetsInComp.m b/queries/getMetsInComp.m index 63f20300..2c6df05b 100755 --- a/queries/getMetsInComp.m +++ b/queries/getMetsInComp.m @@ -1,14 +1,23 @@ function [I, metNames]=getMetsInComp(model,comp) -% getMetsInComp -% Gets the metabolites in a specified compartment +% getMetsInComp Get the metabolites in a specified compartment. % -% model a model structure -% comp string with the compartment id +% Parameters +% ---------- +% model : struct +% a model structure. +% comp : char +% string with the compartment id. % -% I boolean vector of the metabolites -% metNames the names of the metabolites +% Returns +% ------- +% I : logical +% boolean vector of the metabolites. +% metNames : cell +% the names of the metabolites. % -% Usage: [I, metNames]=getMetsInComp(model,comp) +% Examples +% -------- +% [I, metNames] = getMetsInComp(model, comp); comp=char(comp); diff --git a/queries/getRxnsInComp.m b/queries/getRxnsInComp.m index b9daa850..c70b2c0e 100755 --- a/queries/getRxnsInComp.m +++ b/queries/getRxnsInComp.m @@ -1,17 +1,26 @@ function [I, rxnNames]=getRxnsInComp(model,comp,includePartial) -% getRxnsInComp -% Gets the reactions in a specified compartment +% getRxnsInComp Get the reactions in a specified compartment. % -% model a model structure -% comp string with the compartment id -% includePartial true if reactions with metabolites in several -% compartments (normally transport reactions) should -% be included (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% comp : char +% string with the compartment id. +% includePartial : logical, optional +% true if reactions with metabolites in several compartments (normally +% transport reactions) should be included (default false). 
% -% I boolean vector of the reactions -% rxnNames the names of the reactions +% Returns +% ------- +% I : logical +% boolean vector of the reactions. +% rxnNames : cell +% the names of the reactions. % -% Usage: [I, rxnNames]=getRxnsInComp(model,comp,includePartial) +% Examples +% -------- +% [I, rxnNames] = getRxnsInComp(model, comp, includePartial); comp=char(comp); if nargin<3 diff --git a/queries/getTransportRxns.m b/queries/getTransportRxns.m index 7f4f2f82..122bea38 100755 --- a/queries/getTransportRxns.m +++ b/queries/getTransportRxns.m @@ -1,16 +1,25 @@ function transportRxns=getTransportRxns(model) -% getTransportRxns -% Retrieves the transport reactions from a model +% getTransportRxns Retrieve the transport reactions from a model. % -% model a model structure +% Parameters +% ---------- +% model : struct +% a model structure. % -% transportRxns logical array with true if the corresponding -% reaction is a transport reaction +% Returns +% ------- +% transportRxns : logical +% logical array with true if the corresponding reaction is a transport +% reaction. % -% Transport reactions are defined as reactions involving (at least) one -% metabolite name in more than one compartment. +% Examples +% -------- +% transportRxns = getTransportRxns(model); % -% Usage: transportRxns=getTransportRxns(model) +% Notes +% ----- +% Transport reactions are defined as reactions involving (at least) one +% metabolite name in more than one compartment. transportRxns=false(numel(model.rxns),1); diff --git a/queries/parseFormulas.m b/queries/parseFormulas.m index b54ce40c..d4f29cfd 100755 --- a/queries/parseFormulas.m +++ b/queries/parseFormulas.m @@ -1,34 +1,47 @@ function [elements, useMat, exitFlag, MW]=parseFormulas(formulas, noPolymers,isInchi,ignoreRX) -% parseFormulas -% Gets the elemental composition from formulas +% parseFormulas Get the elemental composition from formulas.
% -% formulas a cell array with formulas -% noPolymers assume that all polymers consist of one element. -% Corresponds to counting everything between (...)n as -% n being equal to one. Only one set of parentheses -% is allowed. If this is false then polymers are returned as -% "Could not parse formula" (optional, default false) -% isInchi true if the formulas are in the InChI format (optional, -% default false) -% ignoreRX ignore R-groups and bound protein. This can be useful since they -% are often used only as intermediates (optional, default false) +% Parameters +% ---------- +% formulas : cell +% a cell array with formulas. +% noPolymers : logical, optional +% assume that all polymers consist of one element. Corresponds to +% counting everything between (...)n as n being equal to one. Only one +% set of parentheses is allowed. If this is false then polymers are +% returned as "Could not parse formula" (default false). +% isInchi : logical, optional +% true if the formulas are in the InChI format (default false). +% ignoreRX : logical, optional +% ignore R-groups and bound protein. This can be useful since they are +% often used only as intermediates (default false). % -% elements -% abbrevs cell array with abbreviations for all used elements -% names cell array with the names for all used elements -% useMat MxN matrix with the number of atoms for each formula (M) and each -% element (N) -% exitFlag array with the exit flags: -% 1= Sucessful parsing -% 0= No formula found -% -1= Could not parse formula -% MW predicted molecular weight (g/mol). This is only returned -% for formulas which can be sucessfully parsed, and its -% calculation doesn't affect the exitFlag variable. 
NaN is -% returned if the weight couldn't be calculated -% -% Usage: [elements, useMat, exitFlag, MW]= -% parseFormulas(formulas, noPolymers,isInchi,ignoreRX) +% Returns +% ------- +% elements : struct +% struct with fields: +% +% - abbrevs : cell array with abbreviations for all used elements +% - names : cell array with the names for all used elements +% useMat : double +% MxN matrix with the number of atoms for each formula (M) and each +% element (N). +% exitFlag : double +% array with the exit flags: +% +% - 1 : successful parsing +% - 0 : no formula found +% - -1 : could not parse formula +% MW : double +% predicted molecular weight (g/mol). This is only returned for +% formulas which can be successfully parsed, and its calculation +% doesn't affect the exitFlag variable. NaN is returned if the weight +% couldn't be calculated. +% +% Examples +% -------- +% [elements, useMat, exitFlag, MW] = ... +% parseFormulas(formulas, noPolymers, isInchi, ignoreRX); if nargin<2 noPolymers=false; diff --git a/queries/parseRxnEqu.m b/queries/parseRxnEqu.m index 08bfdbe5..1a3ab6e9 100755 --- a/queries/parseRxnEqu.m +++ b/queries/parseRxnEqu.m @@ -1,20 +1,28 @@ function metabolites=parseRxnEqu(equations) -% parseRxnEqu -% Gets all metabolite names from a cell array of equations +% parseRxnEqu Get all metabolite names from a cell array of equations. % -% metabolites=parseRxnEqu(equations) +% Parameters +% ---------- +% equations : cell +% a cell array with equation strings. % -% equations A cell array with equation strings +% Returns +% ------- +% metabolites : cell +% a cell array with the involved metabolites. % -% metabolites A cell array with the involved metabolites +% Examples +% -------- +% metabolites = parseRxnEqu(equations); % -% The equations should be written like: -% 1 A + 3 B (=> or <=>) 5C + 2 D +% Notes +% ----- +% The equations should be written like: % -% If the equation is expressed as for example '... 
+ (n-1) starch' then -% '(n-1) starch' will be interpreted as one metabolite +% 1 A + 3 B (=> or <=>) 5C + 2 D % -% Usage: metabolites=parseRxnEqu(equations) +% If the equation is expressed as for example '... + (n-1) starch' then +% '(n-1) starch' will be interpreted as one metabolite. if ~iscell(equations) equations={equations}; diff --git a/queries/printFluxes.m b/queries/printFluxes.m index 8f2d65db..b6099d3d 100755 --- a/queries/printFluxes.m +++ b/queries/printFluxes.m @@ -1,39 +1,49 @@ function printFluxes(model, fluxes, onlyExchange, cutOffFlux, outputFile,outputString,metaboliteList) -% printFluxes -% Prints reactions and fluxes to the screen or to a file +% printFluxes Print reactions and fluxes to the screen or to a file. % -% Input: -% model a model structure -% fluxes a vector with fluxes -% onlyExchange only print exchange fluxes (optional, default true) -% cutOffFlux only print fluxes with absolute values above or equal -% to this value (optional, default 10^-8) -% outputFile a file to save the print-out to (optional, default is -% output to the command window) -% outputString a string that specifies the output of each reaction -% (optional, default '%rxnID\t(%rxnName):\t%flux\n') -% metaboliteList cell array of metabolite names. Only reactions -% involving any of these metabolites will be -% printed (optional) +% Parameters +% ---------- +% model : struct +% a model structure. +% fluxes : double +% a vector with fluxes. +% onlyExchange : logical, optional +% only print exchange fluxes (default true). +% cutOffFlux : double, optional +% only print fluxes with absolute values above or equal to this value +% (default 10^-8). +% outputFile : char, optional +% a file to save the print-out to (default is output to the command +% window). +% outputString : char, optional +% a string that specifies the output of each reaction (default +% '%rxnID\t(%rxnName):\t%flux\n'). +% metaboliteList : cell, optional +% cell array of metabolite names. 
Only reactions involving any of these +% metabolites will be printed. % +% Notes +% ----- % The following codes are available for user-defined output strings: % -% %rxnID reaction ID -% %rxnName reaction name -% %lower lower bound -% %upper upper bound -% %obj objective coefficient -% %eqn equation -% %flux flux -% %element equation using the metabolite formulas rather than -% metabolite names -% %unbalanced "(*)" if the reaction is unbalanced and "(-)" if it could -% not be parsed -% %lumped equation where the elemental compositions for the left/right -% hand sides are lumped +% - %rxnID : reaction ID +% - %rxnName : reaction name +% - %lower : lower bound +% - %upper : upper bound +% - %obj : objective coefficient +% - %eqn : equation +% - %flux : flux +% - %element : equation using the metabolite formulas rather than metabolite +% names +% - %unbalanced : "(*)" if the reaction is unbalanced and "(-)" if it could +% not be parsed +% - %lumped : equation where the elemental compositions for the left/right +% hand sides are lumped % -% Usage: printFluxes(model, fluxes, onlyExchange, cutOffFlux, outputFile,... -% outputString, metaboliteList) +% Examples +% -------- +% printFluxes(model, fluxes, onlyExchange, cutOffFlux, outputFile, ... +% outputString, metaboliteList); if nargin<3 onlyExchange=true; diff --git a/queries/printModel.m b/queries/printModel.m index baf81e25..d1285d95 100755 --- a/queries/printModel.m +++ b/queries/printModel.m @@ -1,39 +1,47 @@ function printModel(model,rxnList,outputString,outputFile,metaboliteList) -% printModel -% Prints reactions to the screen or to a file +% printModel Print reactions to the screen or to a file. 
% -% model a model structure -% rxnList either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the model, -% or a vector of indexes to remove (optional, default -% model.rxns) -% outputString a string that specifies the output of each reaction (optional, -% default '%rxnID (%rxnName)\n\t%eqn [%lower %upper]\n') -% outputFile a file to save the print-out to (optional, default is output to -% the command window) -% metaboliteList cell array of metabolite names. Only reactions -% involving any of these metabolites will be -% printed (optional) +% This is a wrapper around printFluxes, intended for use when there is no +% flux distribution. % -% The following codes are available for user-defined output strings: +% Parameters +% ---------- +% model : struct +% a model structure. +% rxnList : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of indexes +% to print (default model.rxns). +% outputString : char, optional +% a string that specifies the output of each reaction (default +% '%rxnID (%rxnName)\n\t%eqn [%lower %upper]\n'). +% outputFile : char, optional +% a file to save the print-out to (default is output to the command +% window). +% metaboliteList : cell, optional +% cell array of metabolite names. Only reactions involving any of these +% metabolites will be printed. 
% -% %rxnID reaction ID -% %rxnName reaction name -% %lower lower bound -% %upper upper bound -% %obj objective coefficient -% %eqn equation -% %element equation using the metabolite formulas rather than -% metabolite names -% %unbalanced "(*)" if the reaction is unbalanced and "(-)" if it could not -% be parsed -% %lumped equation where the elemental compositions for the left/right -% hand sides are lumped +% Notes +% ----- +% The following codes are available for user-defined output strings: % -% NOTE: This is just a wrapper function around printFluxes. It is -% intended to be used when there is no flux distribution. +% - %rxnID : reaction ID +% - %rxnName : reaction name +% - %lower : lower bound +% - %upper : upper bound +% - %obj : objective coefficient +% - %eqn : equation +% - %element : equation using the metabolite formulas rather than metabolite +% names +% - %unbalanced : "(*)" if the reaction is unbalanced and "(-)" if it could +% not be parsed +% - %lumped : equation where the elemental compositions for the left/right +% hand sides are lumped % -% Usage: printModel(model,rxnList,outputString,outputFile,metaboliteList) +% Examples +% -------- +% printModel(model, rxnList, outputString, outputFile, metaboliteList); if nargin<2 || isempty(rxnList) rxnList=model.rxns; diff --git a/queries/printModelStats.m b/queries/printModelStats.m index 6018cf09..fcef6c8d 100755 --- a/queries/printModelStats.m +++ b/queries/printModelStats.m @@ -1,16 +1,20 @@ function printModelStats(model, printModelIssues, printDetails) -% printModelStats -% prints some statistics about a model to the screen +% printModelStats Print some statistics about a model to the screen. % -% model a model structure -% printModelIssues true if information about unconnected -% reactions/metabolites and elemental balancing -% should be printed (optional, default false) -% printDetails true if detailed information should be printed -% about model issues. 
Only used if printModelIssues -% is true (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% printModelIssues : logical, optional +% true if information about unconnected reactions/metabolites and +% elemental balancing should be printed (default false). +% printDetails : logical, optional +% true if detailed information should be printed about model issues. +% Only used if printModelIssues is true (default true). % -% Usage: printModelStats(model,printModelIssues, printDetails) +% Examples +% -------- +% printModelStats(model, printModelIssues, printDetails); if nargin<2 printModelIssues=false; diff --git a/reconstruction/combineMetaCycKEGGModels.m b/reconstruction/combineMetaCycKEGGModels.m index d52638cb..ba14f925 100755 --- a/reconstruction/combineMetaCycKEGGModels.m +++ b/reconstruction/combineMetaCycKEGGModels.m @@ -1,18 +1,24 @@ function model=combineMetaCycKEGGModels(metacycModel,keggModel) -% combineMetaCycKEGGModels -% Combine MetaCyc and KEGG draft models into one model structure. +% combineMetaCycKEGGModels Combine MetaCyc and KEGG draft models into one. % -% Input: -% metacycModel the reconstructed model from MetaCyc -% keggModel the reconstructed model from KEGG +% Parameters +% ---------- +% metacycModel : struct +% the reconstructed model from MetaCyc. +% keggModel : struct +% the reconstructed model from KEGG. % -% Output: -% model a model structure generated by integrating information -% from draft models reconstructed using MetaCyc and KEGG -% databases. The 'rxnFrom/metFrom/geneFrom' fields are -% included to indicate the source. +% Returns +% ------- +% model : struct +% a model structure generated by integrating information from draft +% models reconstructed using MetaCyc and KEGG databases. The +% 'rxnFrom/metFrom/geneFrom' fields are included to indicate the +% source. 
% -% Usage: model=combineMetaCycKEGGModels(metacycModel,keggModel) +% Examples +% -------- +% model = combineMetaCycKEGGModels(metacycModel,keggModel); %Just return the model if nargin<2 diff --git a/reconstruction/guessComposition.m b/reconstruction/guessComposition.m index ab204b85..75056f4e 100755 --- a/reconstruction/guessComposition.m +++ b/reconstruction/guessComposition.m @@ -1,37 +1,50 @@ function [model, guessedFor, couldNotGuess]=guessComposition(model, printResults) -% guessComposition -% Attempts to guess the composition of metabolites without information -% about elemental composition +% guessComposition Guess the composition of metabolites without one. % -% model a model structure -% printResults true if the output should be printed (optional, default true) +% Attempts to guess the composition of metabolites without information about +% elemental composition. % -% model a model structure with information about elemental -% composition added -% guessedFor indexes for the metabolites for which a composition -% could be guessed -% couldNotGuess indexes for the metabolites for which no -% composition could be assigned +% Parameters +% ---------- +% model : struct +% a model structure. +% printResults : logical, optional +% true if the output should be printed (default true). % -% This function works in a rather straight forward manner: +% Returns +% ------- +% model : struct +% a model structure with information about elemental composition added. +% guessedFor : double +% indexes for the metabolites for which a composition could be guessed. +% couldNotGuess : double +% indexes for the metabolites for which no composition could be +% assigned. % -% 1. Get the metabolites which lack composition and participates in -% at least one reaction where all other metabolites have composition information -% 2. Loop through them and calculate their composition based on the rest -% of the involved metabolites. 
If there are any inconsistencies, so that -% a given metabolite should have different composition in different -% equations, then throw an error -% 3. Go to 1 +% Examples +% -------- +% [model, guessedFor, couldNotGuess] = guessComposition(model, printResults); % -% This simple approach requires that the rest of the metabolites have -% correct composition information, and that the involved reactions are -% correct. The function will exit with an error on any inconsistencies, -% which means that it could also be used as a way of checking the model -% for errors. Note that just because this exits sucessfully, the -% calculated compositions could still be wrong (in case that the existing -% compositions were wrong) +% Notes +% ----- +% This function works in a rather straightforward manner: % -% Usage: [newModel, guessedFor, couldNotGuess]=guessComposition(model, printResults) +% 1. Get the metabolites which lack composition and participate in at +% least one reaction where all other metabolites have composition +% information. +% 2. Loop through them and calculate their composition based on the rest of +% the involved metabolites. If there are any inconsistencies, so that a +% given metabolite should have different composition in different +% equations, then throw an error. +% 3. Go to 1. +% +% This simple approach requires that the rest of the metabolites have +% correct composition information, and that the involved reactions are +% correct. The function will exit with an error on any inconsistencies, +% which means that it could also be used as a way of checking the model for +% errors. Note that just because this exits successfully, the calculated +% compositions could still be wrong (in case that the existing compositions +% were wrong).
if nargin<2 printResults=true; diff --git a/reconstruction/homology/getBlast.m b/reconstruction/homology/getBlast.m index 6dda0e46..e567ef73 100755 --- a/reconstruction/homology/getBlast.m +++ b/reconstruction/homology/getBlast.m @@ -1,40 +1,51 @@ function [blastStructure,blastReport]=getBlast(organismID,fastaFile,... modelIDs,refFastaFiles,developMode,hideVerbose) -% getBlast -% Performs a bidirectional BLAST between the organism of interest and a -% set of template organisms +% getBlast Bidirectional BLAST between an organism and template organisms. % -% Input: -% organismID the id of the organism of interest. This should also -% match with the id supplied to getModelFromHomology -% fastaFile a FASTA file with the protein sequences for the -% organism of interest -% modelIDs a cell array of model ids. These must match the -% "model.id" fields in the "models" structure if the -% output is to be used with getModelFromHomology -% refFastaFiles a cell array with the paths to the corresponding FASTA -% files -% developMode true if blastReport should be generated that is used -% in the unit testing function for BLAST+ (optional, default -% false) -% hideVerbose true if no status messages should be printed (optional, -% default false) +% Parameters +% ---------- +% organismID : char +% the id of the organism of interest. This should also match with the +% id supplied to getModelFromHomology. +% fastaFile : char +% a FASTA file with the protein sequences for the organism of interest. +% modelIDs : cell +% a cell array of model ids. These must match the "model.id" fields in +% the "models" structure if the output is to be used with +% getModelFromHomology. +% refFastaFiles : cell +% a cell array with the paths to the corresponding FASTA files. +% developMode : logical, optional +% true if blastReport should be generated that is used in the unit +% testing function for BLAST+ (default false). 
+% hideVerbose : logical, optional +% true if no status messages should be printed (default false). % -% Output: -% blastStructure structure containing the bidirectional homology -% measurements that can be used by getModelFromHomology -% blastReport structure containing MD5 hashes for FASTA database -% files and non-parsed BLAST output data. Will be blank -% if developMode is false. +% Returns +% ------- +% blastStructure : struct +% structure containing the bidirectional homology measurements that +% can be used by getModelFromHomology. +% blastReport : struct +% structure containing MD5 hashes for FASTA database files and +% non-parsed BLAST output data. Will be blank if developMode is false. % -% NOTE: This function calls BLAST+ to perform a bidirectional homology -% test between the organism of interest and a set of other organisms -% using standard settings. The only filtering this function does is the -% removal of hits with an E-value higher than 10e-5. The other homology -% measurements can be implemented using getBlastFromExcel. +% Notes +% ----- +% This function calls BLAST+ to perform a bidirectional homology test +% between the organism of interest and a set of other organisms using +% standard settings. The only filtering this function does is the removal +% of hits with an E-value higher than 10e-5. The other homology +% measurements can be implemented using getBlastFromExcel. % -% Usage: [blastStructure,blastReport]=getBlast(organismID,fastaFile,... -% modelIDs,refFastaFiles,developMode,hideVerbose) +% Examples +% -------- +% [blastStructure,blastReport] = getBlast(organismID,fastaFile,... 
+% modelIDs,refFastaFiles,developMode,hideVerbose); +% +% See also +% -------- +% getModelFromHomology, getBlastFromExcel, getDiamond if nargin<5 developMode = false; diff --git a/reconstruction/homology/getBlastFromExcel.m b/reconstruction/homology/getBlastFromExcel.m index f1e98774..d193b89d 100755 --- a/reconstruction/homology/getBlastFromExcel.m +++ b/reconstruction/homology/getBlastFromExcel.m @@ -1,29 +1,41 @@ function blastStructure=getBlastFromExcel(models,blastFile,organismId) -% getBlastFromExcel -% Retrieves gene homology information from Excel files. Used as -% input to getModelFromHomology. +% getBlastFromExcel Retrieve gene homology information from Excel files. % -% Input: -% models a cell array of model structures -% blastFile Excel file with homology information -% organismId the id of the organism of interest (as described in the -% Excel file) +% Used as input to getModelFromHomology. % -% Output: -% blastStructure structure containing the information in the Excel -% sheets. +% Parameters +% ---------- +% models : cell +% a cell array of model structures. +% blastFile : char +% Excel file with homology information. +% organismId : char +% the id of the organism of interest (as described in the Excel file). % -% The Excel file should contain a number of spreadsheets which in turn -% contain the bidirectional homology measurements between the genes in the -% organisms. The first and second column headers in each sheet is the -% "to" and "from" model ids (as defined in models or for the new organism). -% The entries should correspond to the gene names in those models. The third, -% fourth, fifth, sixth and seventh columns represent the E-value, alignment -% length, identity, bitscore and percentage of positive-scoring matches for -% each measurement (captions should be "E-value", "Alignment length", -% "Identity", "Bitscore" and "PPOS"). +% Returns +% ------- +% blastStructure : struct +% structure containing the information in the Excel sheets. 
% -% Usage: blastStructure=getBlastFromExcel(models,blastFile,organismId) +% Notes +% ----- +% The Excel file should contain a number of spreadsheets which in turn +% contain the bidirectional homology measurements between the genes in the +% organisms. The first and second column headers in each sheet are the "to" +% and "from" model ids (as defined in models or for the new organism). The +% entries should correspond to the gene names in those models. The third, +% fourth, fifth, sixth and seventh columns represent the E-value, alignment +% length, identity, bitscore and percentage of positive-scoring matches for +% each measurement (captions should be "E-value", "Alignment length", +% "Identity", "Bitscore" and "PPOS"). +% +% Examples +% -------- +% blastStructure = getBlastFromExcel(models,blastFile,organismId); +% +% See also +% -------- +% getModelFromHomology, getBlast if ~isfile(blastFile) error('BLAST result file %s cannot be found',string(blastFile)); diff --git a/reconstruction/homology/getDiamond.m b/reconstruction/homology/getDiamond.m index 2200b97c..0a0d14ac 100755 --- a/reconstruction/homology/getDiamond.m +++ b/reconstruction/homology/getDiamond.m @@ -1,41 +1,52 @@ function [blastStructure,diamondReport]=getDiamond(organismID,fastaFile,... modelIDs,refFastaFiles,developMode,hideVerbose) -% getDiamond -% Uses DIAMOND to perform a bidirectional BLAST between the organism -% of interest and a set of template organisms +% getDiamond Bidirectional BLAST with DIAMOND against template organisms. % -% Input: -% organismID the id of the organism of interest. This should also -% match with the id supplied to getModelFromHomology -% fastaFile a FASTA file with the protein sequences for the -% organism of interest -% modelIDs a cell array of model ids.
These must match the -% "model.id" fields in the "models" structure if the -% output is to be used with getModelFromHomology -% refFastaFiles a cell array with the paths to the corresponding FASTA -% files -% developMode true if blastReport should be generated that is used -% in the unit testing function for DIAMOND (optional, default -% false) -% hideVerbose true if no status messages should be printed (optional, -% default false) +% Parameters +% ---------- +% organismID : char +% the id of the organism of interest. This should also match with the +% id supplied to getModelFromHomology. +% fastaFile : char +% a FASTA file with the protein sequences for the organism of interest. +% modelIDs : cell +% a cell array of model ids. These must match the "model.id" fields in +% the "models" structure if the output is to be used with +% getModelFromHomology. +% refFastaFiles : cell +% a cell array with the paths to the corresponding FASTA files. +% developMode : logical, optional +% true if blastReport should be generated that is used in the unit +% testing function for DIAMOND (default false). +% hideVerbose : logical, optional +% true if no status messages should be printed (default false). % -% Output: -% blastStructure structure containing the bidirectional homology -% measurements which are used by getModelFromHomology -% diamondReport structure containing MD5 hashes for FASTA database -% files and non-parsed BLAST output data. Will be blank -% if developMode is false. +% Returns +% ------- +% blastStructure : struct +% structure containing the bidirectional homology measurements which +% are used by getModelFromHomology. +% diamondReport : struct +% structure containing MD5 hashes for FASTA database files and +% non-parsed BLAST output data. Will be blank if developMode is false. 
% -% NOTE: This function calls DIAMOND to perform a bidirectional homology -% search between the organism of interest and a set of other organisms -% using the '--more-sensitive' setting from DIAMOND. For the most -% sensitive results, the use of getBlast() is adviced, however, -% getDiamond() is a fast alternative (>15x faster). The blastStructure -% generated is in the same format as those obtained from getBlast(). +% Notes +% ----- +% This function calls DIAMOND to perform a bidirectional homology search +% between the organism of interest and a set of other organisms using the +% '--more-sensitive' setting from DIAMOND. For the most sensitive results, +% the use of getBlast() is advised; however, getDiamond() is a fast +% alternative (>15x faster). The blastStructure generated is in the same +% format as those obtained from getBlast(). % -% Usage: [blastStructure,diamondReport]=getDiamond(organismID,fastaFile,... -% modelIDs,refFastaFiles,developMode,hideVerbose) +% Examples +% -------- +% [blastStructure,diamondReport] = getDiamond(organismID,fastaFile,... +% modelIDs,refFastaFiles,developMode,hideVerbose); +% +% See also +% -------- +% getModelFromHomology, getBlast if nargin<5 developMode = false; diff --git a/reconstruction/homology/getModelFromHomology.m b/reconstruction/homology/getModelFromHomology.m index aa356b03..66bfa134 100755 --- a/reconstruction/homology/getModelFromHomology.m +++ b/reconstruction/homology/getModelFromHomology.m @@ -1,70 +1,79 @@ function [draftModel, hitGenes]=getModelFromHomology(models,blastStructure,... getModelFor,preferredOrder,strictness,onlyGenesInModels,maxE,... minLen,minIde,mapNewGenesToOld) -% getModelFromHomology -% Constructs a new model from a set of existing models and gene homology -% information. +% getModelFromHomology Construct a new model from existing models and homology. % -% models a cell array of model structures to build the model -% from.
These models must be sorted by importance in -% decreasing order -% blastStructure a blastStructure as produced by getBlast or -% getBlastFromExcel -% getModelFor a three-four letter abbreviation of the organism to -% build a model for. Must have BLASTP hits in both -% directions to the organisms in 'models' -% preferredOrder the order in which reactions should be added from the -% models. If not supplied, reactions will be included -% from all models, otherwise one gene will only result -% in reactions from one model (optional, default {}) -% strictness integer that specifies which reactions should be -% included: -% 1: Map new genes to old for all pairs, which have -% acceptable BLASTP results in both directions -% 2: Map new genes to old for all pairs, which have -% acceptable BLASTP results in correspondent direction -% (mapping can be done in the opposite direction, see -% mapNewGenesToOld below) -% 3: Check all BLASTP results and retain only the best -% results by E-value for all gene pairs in each -% direction separately. Then map new genes to old for -% all pairs, which have acceptable BLASTP results in -% both directions (optional, default 1). -% onlyGenesInModels consider BLASTP results only for genes that exist in -% the models. This tends to import a larger fraction -% from the existing models but may give less reliable -% results. Has effect only if strictness=3 (optional, -% default false) -% maxE only look at genes with E-values <= this value (optional, -% default 10^-30) -% minLen only look at genes with alignment length >= this -% value (optional, default 200) -% minIde only look at genes with identity >= this value -% (optional, default 40 (%)) -% mapNewGenesToOld determines how to match genes if not looking at only -% 1-1 orthologs. Either map the new genes to the old or -% old genes to new. The default is to map the new genes -% (optional, default true) +% Constructs a new model from a set of existing models and gene homology +% information. 
% -% draftModel a model structure for the new organism -% hitGenes collect the old and new genes +% Parameters +% ---------- +% models : cell +% a cell array of model structures to build the model from. These +% models must be sorted by importance in decreasing order. +% blastStructure : struct +% a blastStructure as produced by getBlast or getBlastFromExcel. +% getModelFor : char +% a three-four letter abbreviation of the organism to build a model +% for. Must have BLASTP hits in both directions to the organisms in +% 'models'. +% preferredOrder : cell, optional +% the order in which reactions should be added from the models. If not +% supplied, reactions will be included from all models, otherwise one +% gene will only result in reactions from one model (default {}). +% strictness : double, optional +% integer that specifies which reactions should be included (default 1): % -% The models in the 'models' structure should have named the metabolites -% in the same manner, have their reversible reactions in the same -% direction (run sortModel), and use the same compartment names. To avoid -% keeping unneccesary old genes, the models should not have -% 'or'-relations in their grRules (use expandModel). +% - 1 : Map new genes to old for all pairs, which have acceptable BLASTP +% results in both directions. +% - 2 : Map new genes to old for all pairs, which have acceptable BLASTP +% results in correspondent direction (mapping can be done in the +% opposite direction, see mapNewGenesToOld below). +% - 3 : Check all BLASTP results and retain only the best results by +% E-value for all gene pairs in each direction separately. Then map +% new genes to old for all pairs, which have acceptable BLASTP results +% in both directions. +% onlyGenesInModels : logical, optional +% consider BLASTP results only for genes that exist in the models. This +% tends to import a larger fraction from the existing models but may +% give less reliable results. 
Has effect only if strictness=3 (default +% false). +% maxE : double, optional +% only look at genes with E-values <= this value (default 10^-30). +% minLen : double, optional +% only look at genes with alignment length >= this value (default 200). +% minIde : double, optional +% only look at genes with identity >= this value (default 40 (%)). +% mapNewGenesToOld : logical, optional +% determines how to match genes if not looking at only 1-1 orthologs. +% Either map the new genes to the old or old genes to new. The default +% is to map the new genes (default true). % -% The resulting draft model contains only reactions associated with -% orthologous genes. The old (original) genes involved in 'and' -% relations in grRules without any orthologs are still included in -% the draft model as OLD_MODELID_geneName. +% Returns +% ------- +% draftModel : struct +% a model structure for the new organism. +% hitGenes : struct +% collect the old and new genes. % -% NOTE: "to" and "from" means relative to the new organism +% Examples +% -------- +% draftModel = getModelFromHomology(models, blastStructure, getModelFor); % -% Usage: draftModel=getModelFromHomology(models,blastStructure,... -% getModelFor,preferredOrder,strictness,onlyGenesInModels,maxE,... -% minLen,minIde,mapNewGenesToOld) +% Notes +% ----- +% The models in the 'models' structure should have named the metabolites in +% the same manner, have their reversible reactions in the same direction +% (run sortModel), and use the same compartment names. To avoid keeping +% unnecessary old genes, the models should not have 'or'-relations in their +% grRules (use expandModel). +% +% The resulting draft model contains only reactions associated with +% orthologous genes. The old (original) genes involved in 'and' relations +% in grRules without any orthologs are still included in the draft model as +% OLD_MODELID_geneName. +% +% "to" and "from" mean relative to the new organism.
hitGenes.oldGenes = []; % collect the old genes from the template model (organism) hitGenes.newGenes = []; % collect the new genes of the draft model (target organism) diff --git a/reconstruction/homology/makeFakeBlastStructure.m b/reconstruction/homology/makeFakeBlastStructure.m index 43e25231..e7724cd9 100755 --- a/reconstruction/homology/makeFakeBlastStructure.m +++ b/reconstruction/homology/makeFakeBlastStructure.m @@ -1,28 +1,40 @@ function blastStructure=makeFakeBlastStructure(orthologList,sourceModelID,getModelFor) -% makeFakeBlastStructure -% Makes a fake blastStructure, that would normally be generated by -% getBlast. This allows to feed a predefined list of orthologs to -% getModelFromHomology while retaining the further use of that function. -% For this function to work, it is crucial that the orthologList is a -% cell array where the first column contains the genes from the source -% organism, and the second column contains the genes from the target -% organism -% -% orthologList cell array of orthologous genes, where the first -% column contains the genes from the source organism, -% while the second column contains the genes from the -% target organism -% sourceModelID ID of the model that will be used as template, that -% contains the genes in the first column of -% orthologList -% getModelFor the name of the organism to build a model for, -% identical to the getModelFor parameter in the -% getModelFromHomology function +% makeFakeBlastStructure Make a fake blastStructure from an ortholog list. % -% blastStructure a fake blastStructure, where the evalue, identity -% and aligLen are set at extreme values, such that -% all orthologous pairs will pass the filter when -% running getModelFromHomology +% This is a structure that would normally be generated by getBlast. It +% allows to feed a predefined list of orthologs to getModelFromHomology +% while retaining the further use of that function. 
For this function to +% work, it is crucial that the orthologList is a cell array where the first +% column contains the genes from the source organism, and the second column +% contains the genes from the target organism. +% +% Parameters +% ---------- +% orthologList : cell +% cell array of orthologous genes, where the first column contains the +% genes from the source organism, while the second column contains the +% genes from the target organism. +% sourceModelID : char +% ID of the model that will be used as template, that contains the +% genes in the first column of orthologList. +% getModelFor : char +% the name of the organism to build a model for, identical to the +% getModelFor parameter in the getModelFromHomology function. +% +% Returns +% ------- +% blastStructure : struct +% a fake blastStructure, where the evalue, identity and aligLen are +% set at extreme values, such that all orthologous pairs will pass the +% filter when running getModelFromHomology. +% +% Examples +% -------- +% blastStructure = makeFakeBlastStructure(orthologList,sourceModelID,getModelFor); +% +% See also +% -------- +% getModelFromHomology, getBlast if nargin<3 error('All three parameters should be set'); diff --git a/reconstruction/kegg/constructMultiFasta.m b/reconstruction/kegg/constructMultiFasta.m index de16d8ed..f8594565 100755 --- a/reconstruction/kegg/constructMultiFasta.m +++ b/reconstruction/kegg/constructMultiFasta.m @@ -1,18 +1,25 @@ function constructMultiFasta(model,sourceFile,outputDir) -% constructMultiFasta -% Saves one file in FASTA format for each reaction in the model that has genes +% constructMultiFasta Save a FASTA file per reaction in the model with genes. % -% Input: -% model a model structure -% sourceFile a file with sequences in FASTA format -% outputDir the directory to save the resulting FASTA files in +% Parameters +% ---------- +% model : struct +% a model structure. +% sourceFile : char +% a file with sequences in FASTA format. 
+% outputDir : char +% the directory to save the resulting FASTA files in. % -% The source file is assumed to have the format '>gene identifier -% additional info'. Only the gene identifier is used for matching. This is -% to be compatible with the rest of the code that retrieves information -% from KEGG. +% Notes +% ----- +% The source file is assumed to have the format '>gene identifier +% additional info'. Only the gene identifier is used for matching. This is +% to be compatible with the rest of the code that retrieves information +% from KEGG. % -% Usage: constructMultiFasta(model,sourceFile,outputDir) +% Examples +% -------- +% constructMultiFasta(model,sourceFile,outputDir); sourceFile=char(sourceFile); outputDir=char(outputDir); diff --git a/reconstruction/kegg/getGenesFromKEGG.m b/reconstruction/kegg/getGenesFromKEGG.m index 90886ee9..66c67ae4 100755 --- a/reconstruction/kegg/getGenesFromKEGG.m +++ b/reconstruction/kegg/getGenesFromKEGG.m @@ -1,39 +1,47 @@ function model=getGenesFromKEGG(keggPath,koList) -% getGenesFromKEGG -% Retrieves information on all genes stored in KEGG database +% getGenesFromKEGG Retrieve information on all genes stored in KEGG. % -% Input: -% keggPath if keggGenes.mat is not in the RAVEN\external\kegg -% directory, this function will attempt to read data from a -% local FTP dump of the KEGG database. keggPath is the path -% to the root of this database -% koList the number of genes in KEGG is very large. koList can be a -% cell array with KO identifiers, in which case only genes -% belonging to one of those KEGG orthologies are retrieved -% (optional, default all KOs with associated reactions) +% Parameters +% ---------- +% keggPath : char, optional +% if keggGenes.mat is not in the RAVEN\external\kegg directory, this +% function will attempt to read data from a local FTP dump of the KEGG +% database. keggPath is the path to the root of this database (default +% 'RAVEN/external/kegg'). 
+% koList : cell, optional +% the number of genes in KEGG is very large. koList can be a cell array +% with KO identifiers, in which case only genes belonging to one of +% those KEGG orthologies are retrieved (default all KOs with associated +% reactions). % -% Output: -% model a model structure generated from the database. The -% following fields are filled -% id 'KEGG' -% name 'Automatically generated from KEGG database' -% rxns KO ids -% rxnNames Name for each entry -% genes IDs for all the genes. Genes are saved as organism -% abbreviation:id (same as in KEGG). 'HSA:124' for -% example is alcohol dehydrogenase in Homo sapiens -% rxnGeneMat A binary matrix that indicates whether a specific -% gene is present in a KO id +% Returns +% ------- +% model : struct +% a model structure generated from the database, with fields: % -% NOTE: If the file keggGenes.mat is in the RAVEN\external\kegg directory -% it will be loaded instead of parsing of the KEGG files. If it does not -% exist it will be saved after parsing of the KEGG files. In general, you -% should remove the keggGenes.mat file if you want to rebuild the model -% structure from a newer version of KEGG. +% - id : 'KEGG' +% - name : 'Automatically generated from KEGG database' +% - rxns : KO ids +% - rxnNames : name for each entry +% - genes : IDs for all the genes. Genes are saved as organism +% abbreviation:id (same as in KEGG). 'HSA:124' for example is alcohol +% dehydrogenase in Homo sapiens +% - rxnGeneMat : a binary matrix that indicates whether a specific gene +% is present in a KO id % -% Usage: model=getGenesFromKEGG(keggPath,koList) +% Examples +% -------- +% model = getGenesFromKEGG(keggPath, koList); % -% NOTE: This is how one entry looks in the file +% Notes +% ----- +% If the file keggGenes.mat is in the RAVEN\external\kegg directory it will +% be loaded instead of parsing of the KEGG files. If it does not exist it +% will be saved after parsing of the KEGG files. 
In general, you should +% remove the keggGenes.mat file if you want to rebuild the model structure +% from a newer version of KEGG. +% +% This is how one entry looks in the file: % % ENTRY K11440 KO % NAME gbsB @@ -59,9 +67,6 @@ % The file is not tab-delimited. Instead each label is 12 characters % (except for '///'). % -% Check if the genes have been parsed before and saved. If so, load the -% model. -% if nargin<1 keggPath='RAVEN/external/kegg'; diff --git a/reconstruction/kegg/getKEGGModelForOrganism.m b/reconstruction/kegg/getKEGGModelForOrganism.m index f789b138..a773c09f 100755 --- a/reconstruction/kegg/getKEGGModelForOrganism.m +++ b/reconstruction/kegg/getKEGGModelForOrganism.m @@ -2,120 +2,116 @@ outDir,keepSpontaneous,keepUndefinedStoich,keepIncomplete,... keepGeneral,cutOff,minScoreRatioKO,minScoreRatioG,maxPhylDist,... nSequences,seqIdentity,globalModel) -% getKEGGModelForOrganism -% Reconstructs a genome-scale metabolic model based on protein homology -% to the orthologies in KEGG. If the target species is not available in -% KEGG, the user must select a closely related species. It is also -% possible to circumvent protein homology search (see fastaFile parameter -% for more details) +% getKEGGModelForOrganism Reconstruct a model from KEGG protein homology. % -% Input: -% organismID three or four letter abbreviation of the organism -% (as used in KEGG). If not available, use a closely -% related species. This is used for determing the -% phylogenetic distance. Use 'eukaryotes' or -% 'prokaryotes' to get a model for the whole domain. -% Only applicable if fastaFile is empty, i.e. no -% homology search should be performed -% fastaFile a FASTA file that contains the protein sequences of -% the organism for which to reconstruct a model (optional, -% if no FASTA file is supplied then a model is -% reconstructed based only on the organism -% abbreviation. 
This option ignores all settings -% except for keepSpontaneous, keepUndefinedStoich, -% keepIncomplete and keepGeneral) -% dataDir directory for which to retrieve the input data. -% Should contain a combination of these sub-folders: -% -dataDir\keggdb -% The KEGG database files used in 1a (see below) -% -dataDir\fasta -% The multi-FASTA files generated in 1b (see -% below) -% -dataDir\aligned -% The aligned FASTA files as generated in 2a (see -% below) -% -dataDir\hmms -% The hidden Markov models as generated in 2b or -% downloaded from BioMet Toolbox (see below) -% The final directory in dataDir should be styled as -% prok90_kegg116 or euk90_kegg116, indicating whether -% the HMMs were trained on pro- or eukaryotic -% sequences; using which sequence similarity treshold -% (first set of digits); using which KEGG version -% (second set of digits). (this parameter should -% ALWAYS be provided) -% outDir directory to save the results from the quering of -% the hidden Markov models. The output is specific -% for the input sequences and the settings used. It -% is stored in this manner so that the function can -% continue if interrupted or if it should run in -% parallel. Be careful not to leave output files from -% different organisms or runs with different settings -% in the same folder. They will not be overwritten -% (optional, default is a temporary dir where all *.out -% files are deleted before and after doing the -% reconstruction) -% keepSpontaneous include reactions labeled as "spontaneous". (optional, -% default true) -% keepUndefinedStoich include reactions in the form n A <=> n+1 A. These -% will be dealt with as two separate metabolites -% (optional, default true) -% keepIncomplete include reactions which have been labelled as -% "incomplete", "erroneous" or "unclear" (optional, -% default true) -% keepGeneral include reactions which have been labelled as -% "general reaction". 
These are reactions on the form -% "an aldehyde <=> an alcohol", and are therefore -% unsuited for modelling purposes. Note that not all -% reactions have this type of annotation, and the -% script will therefore not be able to remove all -% such reactions (optional, default false) -% cutOff significance score from HMMer needed to assign -% genes to a KO (optional, default 10^-50) -% minScoreRatioG a gene is only assigned to KOs for which the score -% is >=log(score)/log(best score) for that gene. This -% is to prevent that a gene which clearly belongs to -% one KO is assigned also to KOs with much lower -% scores (optional, default 0.8 (lower is less strict)) -% minScoreRatioKO ignore genes in a KO if their score is -%