From 5d959db9aeafd5338cac8ee5da144bbb8ad4f029 Mon Sep 17 00:00:00 2001 From: Eduard Kerkhoven Date: Wed, 10 Jun 2026 07:05:43 +0200 Subject: [PATCH] Reformat function help blocks to NumPy-style docstrings Rewrite the leading help comment block of every documented function (across the analysis, annotation, biomass, comparison, conditions, conversion, curation, gapfilling, io, localization, manipulation, omics, queries, reconstruction, solver, tasks, utils and visualization modules) to NumPy-style docstrings: a one-line summary, followed by Parameters, Returns, Examples and (where relevant) Notes / See also sections, with typed arguments. This renders as structured argument/return tables in the documentation site (via mkdocstrings-matlab) while remaining valid, readable MATLAB help text. Only comment lines are changed; function signatures and code are untouched. --- analysis/FSEOF.m | 59 +++-- analysis/analyzeSampling.m | 66 +++-- analysis/findGeneDeletions.m | 109 +++++---- analysis/followChanged.m | 46 ++-- analysis/followFluxes.m | 37 ++- analysis/getAllSubGraphs.m | 26 +- analysis/getAllowedBounds.m | 51 ++-- analysis/getEssentialRxns.m | 31 ++- analysis/getFluxZ.m | 33 ++- analysis/getMinNrFluxes.m | 63 +++-- analysis/haveFlux.m | 45 ++-- analysis/randomSampling.m | 98 ++++---- analysis/reporterMetabolites.m | 86 ++++--- analysis/runDynamicFBA.m | 73 +++--- analysis/runPhenotypePhasePlane.m | 58 +++-- analysis/runProductionEnvelope.m | 37 ++- analysis/runRobustnessAnalysis.m | 50 ++-- analysis/runSimpleOptKnock.m | 63 +++-- annotation/assignSBOterms.m | 109 +++++---- annotation/editMiriam.m | 78 +++--- annotation/extractMiriam.m | 53 ++-- annotation/loadDeltaGfromCSV.m | 41 ++-- annotation/saveDeltaGtoCSV.m | 39 +-- biomass/fitParameters.m | 80 +++--- biomass/getBiomassFractions.m | 81 ++++--- biomass/scaleBiomassFraction.m | 46 ++-- biomass/scaleBiomassPseudoreaction.m | 56 +++-- biomass/setGAM.m | 59 +++-- comparison/compareMultipleModels.m | 83 ++++--- 
comparison/compareRxnsGenesMetsComps.m | 50 ++-- conditions/applyCondition.m | 89 ++++--- conversion/addIdentifierPrefix.m | 47 ++-- conversion/ravenCobraWrapper.m | 55 +++-- conversion/removeIdentifierPrefix.m | 58 +++-- conversion/standardizeModelFieldOrder.m | 31 ++- curation/curateModelFromTables.m | 91 ++++--- gapfilling/canConsume.m | 29 ++- gapfilling/canProduce.m | 35 ++- gapfilling/checkProduction.m | 82 ++++--- gapfilling/checkRxn.m | 50 ++-- gapfilling/consumeSomething.m | 83 ++++--- gapfilling/fillGaps.m | 126 +++++----- gapfilling/fitTasks.m | 70 +++--- gapfilling/gapReport.m | 78 +++--- gapfilling/makeSomething.m | 87 ++++--- io/SBMLFromExcel.m | 34 ++- io/addJavaPaths.m | 8 +- io/checkFileExistence.m | 55 +++-- io/cleanSheet.m | 46 ++-- io/exportForGit.m | 71 +++--- io/exportModel.m | 37 +-- io/exportModelToSIF.m | 33 ++- io/exportToExcelFormat.m | 30 ++- io/exportToTabDelimited.m | 35 +-- io/getFullPath.m | 107 ++++---- io/getMD5Hash.m | 33 +-- io/getToolboxVersion.m | 33 ++- io/importExcelModel.m | 159 ++++++------ io/importModel.m | 129 +++++----- io/loadSheet.m | 27 ++- io/loadWorkbook.m | 26 +- io/parseYAML.m | 56 +++-- io/readYAMLmodel.m | 26 +- io/writeSheet.m | 38 ++- io/writeYAMLmodel.m | 49 ++-- localization/getExpressionStructure.m | 89 ++++--- localization/getWoLFScores.m | 36 ++- localization/mapCompartments.m | 55 ++++- localization/parseScores.m | 32 ++- localization/predictLocalization.m | 119 ++++----- manipulation/addExchangeRxns.m | 45 ++-- manipulation/addGenesRaven.m | 47 ++-- manipulation/addMets.m | 100 ++++---- manipulation/addRxns.m | 189 +++++++-------- manipulation/addRxnsGenesMets.m | 90 ++++--- manipulation/addTransport.m | 58 +++-- manipulation/changeGrRules.m | 35 ++- manipulation/changeRxns.m | 105 ++++---- manipulation/closeModel.m | 20 +- manipulation/contractModel.m | 56 +++-- manipulation/convertToIrrev.m | 40 +-- manipulation/copyToComps.m | 54 +++-- manipulation/deleteUnusedGenes.m | 22 +- 
manipulation/expandModel.m | 47 ++-- manipulation/findDuplicateRxns.m | 42 ++-- manipulation/generateNewIds.m | 42 ++-- manipulation/mergeCompartments.m | 64 ++--- manipulation/mergeModels.m | 49 ++-- manipulation/permuteModel.m | 29 ++- manipulation/removeBadRxns.m | 142 ++++++----- manipulation/removeGenes.m | 38 +-- manipulation/removeMets.m | 48 ++-- manipulation/removeReactions.m | 40 +-- manipulation/replaceMets.m | 64 +++-- manipulation/setExchangeBounds.m | 89 +++---- manipulation/setParam.m | 61 +++-- manipulation/simplifyModel.m | 77 +++--- manipulation/sortIdentifiers.m | 24 +- manipulation/sortModel.m | 40 +-- manipulation/standardizeGrRules.m | 51 ++-- omics/parseHPA.m | 66 ++--- omics/parseHPArna.m | 36 +-- omics/scoreModel.m | 116 +++++---- queries/buildEquation.m | 24 +- queries/checkModelStruct.m | 30 ++- queries/constructEquations.m | 68 +++--- queries/constructS.m | 48 ++-- queries/getAllRxnsFromGenes.m | 32 ++- queries/getElementalBalance.m | 57 +++-- queries/getExchangeRxns.m | 77 +++--- queries/getGenesFromGrRules.m | 42 ++-- queries/getIndexes.m | 54 +++-- queries/getMetsInComp.m | 23 +- queries/getRxnsInComp.m | 29 ++- queries/getTransportRxns.m | 25 +- queries/parseFormulas.m | 69 +++--- queries/parseRxnEqu.m | 28 ++- queries/printFluxes.m | 70 +++--- queries/printModel.m | 68 +++--- queries/printModelStats.m | 24 +- reconstruction/combineMetaCycKEGGModels.m | 28 ++- reconstruction/guessComposition.m | 67 ++--- reconstruction/homology/getBlast.m | 73 +++--- reconstruction/homology/getBlastFromExcel.m | 54 +++-- reconstruction/homology/getDiamond.m | 75 +++--- .../homology/getModelFromHomology.m | 125 +++++----- .../homology/makeFakeBlastStructure.m | 58 +++-- reconstruction/kegg/constructMultiFasta.m | 29 ++- reconstruction/kegg/getGenesFromKEGG.m | 71 +++--- reconstruction/kegg/getKEGGModelForOrganism.m | 228 +++++++++--------- reconstruction/kegg/getMetsFromKEGG.m | 63 ++--- reconstruction/kegg/getModelFromKEGG.m | 85 ++++--- 
reconstruction/kegg/getPhylDist.m | 42 ++-- reconstruction/kegg/getRxnsFromKEGG.m | 96 ++++---- reconstruction/kegg/getWSLpath.m | 31 ++- reconstruction/metacyc/addSpontaneousRxns.m | 30 ++- .../metacyc/getEnzymesFromMetaCyc.m | 64 ++--- .../metacyc/getMetaCycModelForOrganism.m | 78 +++--- reconstruction/metacyc/getMetsFromMetaCyc.m | 60 ++--- reconstruction/metacyc/getModelFromMetaCyc.m | 67 ++--- reconstruction/metacyc/getRxnsFromMetaCyc.m | 91 +++---- reconstruction/metacyc/linkMetaCycKEGGRxns.m | 15 +- solver/checkSolution.m | 23 +- solver/optimizeProb.m | 36 ++- solver/qMOMA.m | 40 +-- solver/setRavenSolver.m | 27 ++- solver/solveLP.m | 89 +++---- solver/solveQP.m | 42 ++-- solver/splitProbForConditioning.m | 84 ++++--- tasks/checkTasks.m | 85 ++++--- tasks/parseTaskList.m | 202 ++++++++-------- utils/convertCharArray.m | 25 +- utils/dispEM.m | 33 ++- utils/emptyOrLogicalScalar.m | 9 + utils/emptyOrTextOrCellOfText.m | 12 +- utils/emptyOrTextScalar.m | 10 + utils/parallelPoolRAVEN.m | 55 +++-- utils/printOrange.m | 23 +- utils/runRAVENtests.m | 16 +- visualization/colorPathway.m | 62 +++-- visualization/colorSubsystem.m | 36 +-- visualization/drawMap.m | 63 +++-- visualization/drawPathway.m | 33 ++- visualization/getColorCodes.m | 60 +++-- visualization/getObjectiveString.m | 29 ++- visualization/getPathwayDimensions.m | 24 +- visualization/mapPathwayRxnNames.m | 33 ++- visualization/markPathwayWithExpression.m | 28 ++- visualization/markPathwayWithFluxes.m | 41 ++-- visualization/plotAdditionalInfo.m | 59 +++-- visualization/plotLabels.m | 21 +- visualization/setColorToMapRxns.m | 77 +++--- visualization/setOmicDataToRxns.m | 24 +- visualization/setTitle.m | 18 +- visualization/trimPathway.m | 27 ++- 175 files changed, 5798 insertions(+), 4152 deletions(-) diff --git a/analysis/FSEOF.m b/analysis/FSEOF.m index e60740ce..169c9573 100755 --- a/analysis/FSEOF.m +++ b/analysis/FSEOF.m @@ -1,33 +1,42 @@ function 
targets=FSEOF(model,biomassRxn,targetRxn,iterations,coefficient,outputFile) -% FSEOF -% Implements the Flux Scanning based on Enforced Objective Flux algorithm. +% FSEOF Flux Scanning based on Enforced Objective Flux. % -% Input: -% model a model structure -% biomassRxn string with reaction ID of the biomass formation or -% growth reaction -% targetRxn string with reaction ID of target reaction -% iterations numeric indicating number of iterations (optional, -% default 10) -% coefficient numeric indicating ratio of optimal target reaction -% flux, must be less than 1 (optional, default 0.9) -% outputFile string with output filename (optional, default prints -% to command window) +% Implements the Flux Scanning based on Enforced Objective Flux algorithm. +% This function writes a tab-delimited file or prints to the command +% window. If an output has been specified (targets), it will also generate +% a structure indicating for each model reaction whether it is identified +% by FSEOF as a target and the slope of the reaction when switching from +% biomass formation to product formation. % -% Output: -% targets structure with information for identified targets -% logical logical array indicating whether a model reaction was -% identified as target by FSEOF -% slope numeric array with FSEOF slopes for target reactions +% Parameters +% ---------- +% model : struct +% a model structure. +% biomassRxn : char +% reaction ID of the biomass formation or growth reaction. +% targetRxn : char +% reaction ID of the target reaction. +% iterations : double, optional +% number of iterations (default 10). +% coefficient : double, optional +% ratio of optimal target reaction flux, must be less than 1 +% (default 0.9). +% outputFile : char, optional +% output filename (default prints to command window). % -% This function writes an tab-delimited file or prints to command window. 
-% If an output has been specified (targets), it will also generate a -% structure indicating for each model reaction whether it is identified by -% FSEOF as a target and the slope of the reaction when switching from -% biomass formation to product formation. +% Returns +% ------- +% targets : struct +% structure with information for identified targets, with fields: +% +% - logical : logical array indicating whether a model reaction was +% identified as target by FSEOF +% - slope : numeric array with FSEOF slopes for target reactions % -% Usage: targets = FSEOF(model, biomassRxn, targetRxn, iterations,... -% coefficient, outputFile) +% Examples +% -------- +% targets = FSEOF(model, biomassRxn, targetRxn, iterations, ... +% coefficient, outputFile); biomassRxn=char(biomassRxn); targetRxn=char(targetRxn); diff --git a/analysis/analyzeSampling.m b/analysis/analyzeSampling.m index 8bcbf4d7..570925e1 100755 --- a/analysis/analyzeSampling.m +++ b/analysis/analyzeSampling.m @@ -1,33 +1,47 @@ function scores=analyzeSampling(Tex, df, solutionsA, solutionsB, printResults) -% analyzeSampling -% Compares the significance of change in flux between two conditions with -% the significance of change in gene expression +% analyzeSampling Compare flux change significance with expression change. % -% Tex a vector of t-scores for the change in gene expression -% for each reaction. This score could be the Student t -% between the two conditions, or you can calculate it from -% a p-value (by computing the inverse of the so called error -% function). 
If you choose the second alternative you should -% be aware that the transcripts that increased in expression -% level should have positive values and those who decreased -% in expression level should have negative values (the -% p-values only tell you if the fluxes changed or not but -% not in which direction) -% df the degrees of freedom in the t-test -% solutionsA random solutions for the reference condition (as -% generated by randomSampling) -% solutionsB random solutions for the test condition (as generated -% by randomSampling) -% printResults prints the most significant reactions in each category -% (optional, default false) +% Compares the significance of change in flux between two conditions with +% the significance of change in gene expression. % -% scores a Nx3 column matrix with the probabilities of a reaction: -% 1) changing both in flux and expression in the same direction -% 2) changing in expression but not in flux -% 3) changing in flux but not in expression or changing -% in opposed directions in flux and expression. +% Parameters +% ---------- +% Tex : double +% a vector of t-scores for the change in gene expression for each +% reaction. This score could be the Student t between the two +% conditions, or you can calculate it from a p-value (by computing the +% inverse of the so called error function). If you choose the second +% alternative you should be aware that the transcripts that increased +% in expression level should have positive values and those who +% decreased in expression level should have negative values (the +% p-values only tell you if the fluxes changed or not but not in which +% direction). +% df : double +% the degrees of freedom in the t-test. +% solutionsA : double +% random solutions for the reference condition (as generated by +% randomSampling). +% solutionsB : double +% random solutions for the test condition (as generated by +% randomSampling). 
+% printResults : logical, optional +% prints the most significant reactions in each category +% (default false). % -% Usage: scores=analyzeSampling(Tex, df, solutionsA, solutionsB, printResults) +% Returns +% ------- +% scores : double +% a Nx3 column matrix with the probabilities of a reaction: +% +% 1. changing both in flux and expression in the same direction +% 2. changing in expression but not in flux +% 3. changing in flux but not in expression or changing in opposed +% directions in flux and expression +% +% Examples +% -------- +% scores = analyzeSampling(Tex, df, solutionsA, solutionsB, ... +% printResults); if nargin<5 printResults=false; diff --git a/analysis/findGeneDeletions.m b/analysis/findGeneDeletions.m index 7420d298..f6986bbd 100755 --- a/analysis/findGeneDeletions.m +++ b/analysis/findGeneDeletions.m @@ -1,54 +1,69 @@ function [genes, fluxes, originalGenes, details, grRatioMuts]=findGeneDeletions(model,testType,analysisType,refModel,oeFactor) -% findGeneDeletions -% Deletes genes, optimizes the model, and keeps track of the resulting -% fluxes. This is used for identifying gene deletion targets. +% findGeneDeletions Delete genes and track the resulting fluxes. % -% model a model structure -% testType single/double gene deletions/over expressions. Over -% expression only available if using MOMA -% 'sgd' single gene deletion -% 'dgd' double gene deletion -% 'sgo' single gene over expression -% 'dgo' double gene over expression -% (optional, default 'sgd') -% analysisType determines whether to use FBA ('fba') or MOMA ('moma') -% in the optimization. (optional, default 'fba') -% refModel MOMA works by fitting the flux distributions of two -% models to be as similar as possible. The most common -% application is where you have a reference model where -% some of the fluxes are constrained from experimental -% data. 
This model is required when using MOMA -% oeFactor a factor by which the fluxes should be increased if a -% gene is overexpressed (optional, default 10) +% Deletes genes, optimizes the model, and keeps track of the resulting +% fluxes. This is used for identifying gene deletion targets. % -% genes a matrix with the genes that were deleted in each -% optimization (the gene indexes in originalGenes). Each -% row corresponds to a column in fluxes -% fluxes a matrix with the resulting fluxes. Double deletions -% that result in an unsolvable problem have all zero -% flux. Single deletions that result in an unsolvable -% problem are indicated in details instead -% originalGenes simply the genes in the input model. Included for -% simple presentation of the output -% details not all genes will be deleted in all analyses. It is -% for example not necessary to delete genes for dead end -% reactions. This is a vector with details about -% each gene in originalGenes and why or why not it was -% deleted -% 1: Was deleted/overexpressed -% 2: Proved lethal in sgd (single gene deletion) -% 3: - redundant, no longer used - -% 4: Involved in dead-end reaction -% grRatioMuts growth rate ratio between mutated strain and wild type, -% matches the originalGenes(genes) mutants. Note that -% this does not directly map to model.genes, as is the case -% for COBRA getEssentialGenes. However, this can be -% obtained by afterwards running: -% grRatio=zeros(1,numel(model.genes)); -% grRatio(genes)=grRatioMuts; +% Parameters +% ---------- +% model : struct +% a model structure. +% testType : char, optional +% single/double gene deletions/over expressions. Over expression is +% only available if using MOMA (default 'sgd'): % -% Usage: [genes, fluxes, originalGenes, details, grRatioMuts]=findGeneDeletions(model,testType,analysisType,... 
-% refModel,oeFactor) +% - 'sgd' : single gene deletion +% - 'dgd' : double gene deletion +% - 'sgo' : single gene over expression +% - 'dgo' : double gene over expression +% analysisType : char, optional +% determines whether to use FBA ('fba') or MOMA ('moma') in the +% optimization (default 'fba'). +% refModel : struct, optional +% MOMA works by fitting the flux distributions of two models to be as +% similar as possible. The most common application is where there is a +% reference model with some fluxes constrained from experimental data. +% This model is required when using MOMA. +% oeFactor : double, optional +% a factor by which the fluxes should be increased if a gene is +% overexpressed (default 10). +% +% Returns +% ------- +% genes : double +% a matrix with the genes that were deleted in each optimization (the +% gene indexes in originalGenes). Each row corresponds to a column in +% fluxes. +% fluxes : double +% a matrix with the resulting fluxes. Double deletions that result in +% an unsolvable problem have all zero flux. Single deletions that +% result in an unsolvable problem are indicated in details instead. +% originalGenes : cell +% simply the genes in the input model. Included for simple +% presentation of the output. +% details : double +% not all genes will be deleted in all analyses. It is for example not +% necessary to delete genes for dead end reactions. This is a vector +% with details about each gene in originalGenes and why or why not it +% was deleted: +% +% - 1 : was deleted/overexpressed +% - 2 : proved lethal in sgd (single gene deletion) +% - 3 : redundant, no longer used +% - 4 : involved in dead-end reaction +% grRatioMuts : double +% growth rate ratio between mutated strain and wild type, matching the +% originalGenes(genes) mutants. Note that this does not directly map +% to model.genes, as is the case for COBRA getEssentialGenes. 
However, +% this can be obtained by afterwards running: +% +% grRatio=zeros(1,numel(model.genes)); +% grRatio(genes)=grRatioMuts; +% +% Examples +% -------- +% [genes, fluxes, originalGenes, details, grRatioMuts]=... +% findGeneDeletions(model,testType,analysisType,refModel,oeFactor); originalModel=model; if nargin<5 diff --git a/analysis/followChanged.m b/analysis/followChanged.m index 95fd0a2c..7a217b27 100755 --- a/analysis/followChanged.m +++ b/analysis/followChanged.m @@ -1,24 +1,34 @@ function followChanged(model,fluxesA,fluxesB, cutOffChange, cutOffFlux, cutOffDiff, metaboliteList) -% followChanged -% Prints fluxes and reactions for each of the reactions that results in -% different fluxes compared to the reference case. +% followChanged Print reactions whose fluxes differ from a reference case. % -% model a model structure -% fluxesA flux vector for the test case -% fluxesB flux vector for the reference test -% cutOffChange reactions where the fluxes differ by less than -% this many percent won't be printed (optional, default 10^-8) -% cutOffFlux reactions where the absolute value of both fluxes -% are below this value won't be printed (optional, -% default 10^-8) -% cutOffDiff reactions where the fluxes differ by less than -% cutOffDiff won't be printed (optional, default 10^-8) -% metaboliteList cell array of metabolite names. Only reactions -% involving any of these metabolites will be -% printed (optional) +% Prints fluxes and reactions for each of the reactions that result in +% different fluxes compared to the reference case. % -% Usage: followChanged(model,fluxesA,fluxesB, cutOffChange, cutOffFlux, -% cutOffDiff, metaboliteList) +% Parameters +% ---------- +% model : struct +% a model structure. +% fluxesA : double +% flux vector for the test case. +% fluxesB : double +% flux vector for the reference test. +% cutOffChange : double, optional +% reactions where the fluxes differ by less than this many percent +% won't be printed (default 10^-8). 
+% cutOffFlux : double, optional +% reactions where the absolute value of both fluxes are below this +% value won't be printed (default 10^-8). +% cutOffDiff : double, optional +% reactions where the fluxes differ by less than cutOffDiff won't be +% printed (default 10^-8). +% metaboliteList : cell, optional +% cell array of metabolite names. Only reactions involving any of +% these metabolites will be printed. +% +% Examples +% -------- +% followChanged(model,fluxesA,fluxesB,cutOffChange,cutOffFlux,... +% cutOffDiff,metaboliteList); %Checks if a cut off flux has been set if nargin<4 diff --git a/analysis/followFluxes.m b/analysis/followFluxes.m index 7f00921c..33470d7b 100755 --- a/analysis/followFluxes.m +++ b/analysis/followFluxes.m @@ -1,18 +1,31 @@ function errorFlag=followFluxes(model, fluxesA, lowerFlux, upperFlux, fluxesB) -% followFluxes -% Prints fluxes and reactions for each of the reactions that results in -% fluxes in the specified interval. +% followFluxes Print reactions with fluxes in a specified interval. % -% model a model structure -% fluxesA flux vector for the test case -% lowerFlux only reactions with fluxes above this cutoff -% value are displayed -% upperFlux only reactions with fluxes below this cutoff -% value are displayed (optional, default Inf) -% fluxesB flux vector for the reference case(optional) +% Prints fluxes and reactions for each of the reactions that result in +% fluxes within the specified interval. % -% Usage: errorFlag=followFluxes(model, fluxesA, lowerFlux, upperFlux, -% fluxesB) +% Parameters +% ---------- +% model : struct +% a model structure. +% fluxesA : double +% flux vector for the test case. +% lowerFlux : double +% only reactions with fluxes above this cutoff value are displayed. +% upperFlux : double, optional +% only reactions with fluxes below this cutoff value are displayed +% (default Inf). +% fluxesB : double, optional +% flux vector for the reference case. 
+% +% Returns +% ------- +% errorFlag : double +% set to 1 if upperFlux is not larger than lowerFlux, otherwise empty. +% +% Examples +% -------- +% errorFlag=followFluxes(model,fluxesA,lowerFlux,upperFlux,fluxesB); %Checks that the upper flux is larger than the lower flux if nargin>3 diff --git a/analysis/getAllSubGraphs.m b/analysis/getAllSubGraphs.m index de9c858a..e8a869ef 100755 --- a/analysis/getAllSubGraphs.m +++ b/analysis/getAllSubGraphs.m @@ -1,17 +1,23 @@ function subGraphs=getAllSubGraphs(model) -% getAllSubGraphs -% Get all metabolic subgraphs in a model. Two metabolites -% are connected if they share a reaction. +% getAllSubGraphs Get all metabolic subgraphs in a model. % -% Input: -% model a model structure +% Two metabolites are connected if they share a reaction. % -% Output: -% subGraphs a boolean matrix where the rows correspond to the metabolites -% and the columns to which subgraph they are assigned to. The -% columns are ordered so that larger subgraphs come first +% Parameters +% ---------- +% model : struct +% a model structure. % -% Usage: subGraphs=getAllSubGraphs(model) +% Returns +% ------- +% subGraphs : logical +% a boolean matrix where the rows correspond to the metabolites and +% the columns to which subgraph they are assigned to. The columns are +% ordered so that larger subgraphs come first. +% +% Examples +% -------- +% subGraphs = getAllSubGraphs(model); %Generate the connectivity graph. Metabolites are connected through %reactions. This is not a bipartite graph with the reactions. diff --git a/analysis/getAllowedBounds.m b/analysis/getAllowedBounds.m index ed580fd0..1b31a8de 100755 --- a/analysis/getAllowedBounds.m +++ b/analysis/getAllowedBounds.m @@ -1,30 +1,39 @@ function [minFluxes, maxFluxes, exitFlags]=getAllowedBounds(model,rxns,runParallel) -% getAllowedBounds -% Returns the minimal and maximal fluxes through each reaction. +% getAllowedBounds Return the minimal and maximal fluxes through reactions. 
% -% Input: -% model a model structure -% rxns either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the -% model, or a vector of reaction indexes (optional, default -% model.rxns) -% runParallel speed up calculations by parallel processing. This is -% not beneficial if allowed bounds are calculated for -% only a few reactions, as the overhead of parallel -% processing will take longer. It requires MATLAB -% Parallel Computing Toolbox. If this is not installed, -% the calculations will not be parallelized, regardless -% what is indicated as runParallel. (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of +% reaction indexes (default model.rxns). +% runParallel : logical, optional +% speed up calculations by parallel processing. This is not beneficial +% if allowed bounds are calculated for only a few reactions, as the +% overhead of parallel processing will take longer. It requires MATLAB +% Parallel Computing Toolbox. If this is not installed, the +% calculations will not be parallelized, regardless of what is +% indicated as runParallel (default true). % -% Output: -% minFluxes minimal allowed fluxes -% maxFluxes maximal allowed fluxes -% exitFlags exit flags for min/max for each of the reactions. True -% if it was possible to calculate a flux +% Returns +% ------- +% minFluxes : double +% minimal allowed fluxes. +% maxFluxes : double +% maximal allowed fluxes. +% exitFlags : double +% exit flags for min/max for each of the reactions. True if it was +% possible to calculate a flux. % +% Notes +% ----- % In cases where no solution can be calculated, NaN is returned. 
% -% Usage: [minFluxes, maxFluxes, exitFlags] = getAllowedBounds(model, rxns, runParallel) +% Examples +% -------- +% [minFluxes, maxFluxes, exitFlags] = getAllowedBounds(model, rxns, runParallel); if nargin<2 || isempty(rxns) rxns = 1:numel(model.rxns); diff --git a/analysis/getEssentialRxns.m b/analysis/getEssentialRxns.m index d9106c35..ec616dbb 100755 --- a/analysis/getEssentialRxns.m +++ b/analysis/getEssentialRxns.m @@ -1,18 +1,29 @@ function [essentialRxns, essentialRxnsIndexes]=getEssentialRxns(model,ignoreRxns) -% getEssentialRxns -% Calculate the essential reactions for a model to be solvable +% getEssentialRxns Calculate the essential reactions for a solvable model. % -% model a model structure -% ignoreRxns cell array of reaction IDs which should not be -% checked (optional, default {}) +% Parameters +% ---------- +% model : struct +% a model structure. +% ignoreRxns : cell, optional +% cell array of reaction IDs which should not be checked +% (default {}). % -% essentialRxns cell array with the IDs of the essential reactions -% essentialRxnsIndexes vector with the indexes of the essential reactions +% Returns +% ------- +% essentialRxns : cell +% cell array with the IDs of the essential reactions. +% essentialRxnsIndexes : double +% vector with the indexes of the essential reactions. % -% Essential reactions are those which, when constrained to 0, result in an -% infeasible problem. +% Notes +% ----- +% Essential reactions are those which, when constrained to 0, result in an +% infeasible problem. 
% -% Usage: [essentialRxns, essentialRxnsIndexes]=getEssentialRxns(model,ignoreRxns) +% Examples +% -------- +% [essentialRxns, essentialRxnsIndexes] = getEssentialRxns(model, ignoreRxns); if nargin<2 ignoreRxns={}; diff --git a/analysis/getFluxZ.m b/analysis/getFluxZ.m index 0e6ba6ef..3a2fda2c 100755 --- a/analysis/getFluxZ.m +++ b/analysis/getFluxZ.m @@ -1,18 +1,29 @@ function Z=getFluxZ(solutionsA, solutionsB) -% getFluxZ -% Calculates the Z scores between two sets of random flux distributions. +% getFluxZ Calculate Z scores between two sets of random flux distributions. % -% solutionsA random solutions for the reference condition (as -% generated by randomSampling) -% solutionsB random solutions for the test condition (as generated -% by randomSampling) +% Parameters +% ---------- +% solutionsA : double +% random solutions for the reference condition (as generated by +% randomSampling). +% solutionsB : double +% random solutions for the test condition (as generated by +% randomSampling). % -% Z a vector with Z-scores that tells you for each reaction -% how likely it is for its flux to have increased (positive sign) -% or decreased (negative sign) in the second condition with -% respect to the first. +% Returns +% ------- +% Z : double +% a vector with Z-scores that tells you for each reaction how likely it +% is for its flux to have increased (positive sign) or decreased +% (negative sign) in the second condition with respect to the first. 
% -% Usage: Z=getFluxZ(solutionsA, solutionsB) +% Examples +% -------- +% Z = getFluxZ(solutionsA, solutionsB); +% +% See also +% -------- +% randomSampling nRxns=size(solutionsA,1); diff --git a/analysis/getMinNrFluxes.m b/analysis/getMinNrFluxes.m index 6c9ca5e2..0c4e4858 100755 --- a/analysis/getMinNrFluxes.m +++ b/analysis/getMinNrFluxes.m @@ -1,31 +1,48 @@ function [x,I,exitFlag]=getMinNrFluxes(model, toMinimize, params,scores) -% getMinNrFluxes -% Returns the minimal set of fluxes that satisfy the model using -% mixed integer linear programming. +% getMinNrFluxes Find the minimal set of fluxes that satisfy the model. % -% model a model structure -% toMinimize either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the model, -% of a vector of indexes for the reactions that should be -% minimized (optional, default model.rxns) -% params *obsolete option* -% scores vector of weights for the reactions. Negative scores -% should not have flux. Positive scores are not possible in this -% implementation, and they are changed to max(scores(scores<0)). -% Must have the same dimension as toMinimize (find(toMinimize) -% if it is a logical vector) (optional, default -1 for all reactions) +% Uses mixed integer linear programming to find the minimal set of fluxes +% that satisfy the model. % -% x the corresponding fluxes for the full model -% I the indexes of the reactions in toMinimize that were used -% in the solution -% exitFlag 1: optimal solution found -% -1: no feasible solution found -% -2: optimization time out +% Parameters +% ---------- +% model : struct +% a model structure. +% toMinimize : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of indexes +% for the reactions that should be minimized (default model.rxns). +% params : struct, optional +% *obsolete option*. 
+% scores : double, optional +% vector of weights for the reactions. Negative scores should not have +% flux. Positive scores are not possible in this implementation, and +% they are changed to max(scores(scores<0)). Must have the same +% dimension as toMinimize (find(toMinimize) if it is a logical vector) +% (default -1 for all reactions). % -% NOTE: Uses 1000 mmol/gDW/h as an arbitary large flux. Could possibly -% cause problems if the fluxes in the model are larger than that. +% Returns +% ------- +% x : double +% the corresponding fluxes for the full model. +% I : double +% the indexes of the reactions in toMinimize that were used in the +% solution. +% exitFlag : double +% exit status: % -% Usage: [x,I,exitFlag]=getMinNrFluxes(model, toMinimize, params, scores) +% - 1 : optimal solution found +% - -1 : no feasible solution found +% - -2 : optimization time out +% +% Examples +% -------- +% [x, I, exitFlag] = getMinNrFluxes(model, toMinimize, params, scores); +% +% Notes +% ----- +% Uses 1000 mmol/gDW/h as an arbitrary large flux. Could possibly cause +% problems if the fluxes in the model are larger than that. exitFlag=1; diff --git a/analysis/haveFlux.m b/analysis/haveFlux.m index 01215959..f926e2c1 100755 --- a/analysis/haveFlux.m +++ b/analysis/haveFlux.m @@ -1,25 +1,36 @@ function I=haveFlux(model,cutOff,rxns) -% haveFlux -% Checks which reactions can carry a (positive or negative) flux. Is used -% as a faster version of getAllowedBounds if it is only interesting -% whether the reactions can carry a flux or not +% haveFlux Check which reactions can carry a flux. % -% Input: -% model a model structure -% cutOff the flux value that a reaction has to carry to be -% identified as positive (optional, default 10^-8) -% rxns either a cell array of IDs, a logical vector with the -% same number of elements as metabolites in the model, or a -% vector of indexes (optional, default model.rxns) +% Checks which reactions can carry a (positive or negative) flux.
Is used as +% a faster version of getAllowedBounds if it is only interesting whether the +% reactions can carry a flux or not. % -% Output: -% I logical array with true if the corresponding reaction can -% carry a flux +% Parameters +% ---------- +% model : struct +% a model structure. +% cutOff : double, optional +% the flux value that a reaction has to carry to be identified as +% positive (default 10^-8). +% rxns : cell or logical or double, optional +% either a cell array of IDs, a logical vector with the same number of +% elements as metabolites in the model, or a vector of indexes (default +% model.rxns). % -% If a model has +/- Inf bounds then those are replaced with an arbitary -% large value of +/- 10000 prior to solving +% Returns +% ------- +% I : logical +% logical array with true if the corresponding reaction can carry a +% flux. +% +% Examples +% -------- +% I = haveFlux(model, cutOff, rxns); % -% Usage: I = haveFlux(model, cutOff, rxns) +% Notes +% ----- +% If a model has +/- Inf bounds then those are replaced with an arbitary +% large value of +/- 10000 prior to solving. if nargin<2 cutOff=10^-6; diff --git a/analysis/randomSampling.m b/analysis/randomSampling.m index 9c48431c..3f96faa4 100755 --- a/analysis/randomSampling.m +++ b/analysis/randomSampling.m @@ -1,61 +1,63 @@ function [solutions, goodRxns]=randomSampling(model,nSamples,replaceBoundsWithInf,supressErrors,runParallel,goodRxns,minFlux) -% randomSampling -% Performs random sampling of the solution space, as described in Bordel -% et al. (2010) PLOS Compt Biol (doi:10.1371/journal.pcbi.1000859). +% randomSampling Perform random sampling of the solution space. % -% Input: -% model a model structure -% nSamples the number of solutions to return -% (optional, default 1000) -% replaceBoundsWithInf replace the largest upper bounds with Inf and -% the smallest lower bounds with -Inf. 
This is -% needed in order to get solutions without loops -% if your model has for example 1000/-1000 as -% arbitary large bounds. If your model only has -% "biologically relevant" bounds, then set this -% to false (optional, default true) -% supressErrors the program will halt if it has problems -% finding non-zero solutions which are not -% involved in loops. This could be because the -% constraints on the model are too relaxed (such -% as unlimited glucose uptake) or too strict -% (such as too many and too narrow constraints) -% (optional, default false) -% runParallel speed up calculations by parallel processing. -% Requires MATLAB Parallel Computing Toolbox. If -% this is not installed, the calculations will -% not be parallelized, regardless what is -% indicated as runParallel. (optional, default -% true) -% goodRxns double vector of indexes of those reactions -% that are not involved in loops and can be used -% as random objective functions, as generated by -% a previous run of randomSampling on the same -% model (optional, default empty) -% minFlux determines if a second optimization should be -% performed for each random sample, to minimize -% the number of fluxes and thereby preventing -% loops. Typically, loops are averaged out when a -% large number of samples are taken, but this is -% not always the case (optional, default false) +% Performs random sampling of the solution space, as described in Bordel et +% al. (2010) PLoS Comput Biol (doi:10.1371/journal.pcbi.1000859). % -% Output: -% solutions matrix with the solutions -% goodRxns double vector of indexes of those reactions -% that are not involved in loops or always carry -% zero flux and can be used as random objective -% functions +% Parameters +% ---------- +% model : struct +% a model structure. +% nSamples : double, optional +% the number of solutions to return (default 1000). 
+% replaceBoundsWithInf : logical, optional +% replace the largest upper bounds with Inf and the smallest lower +% bounds with -Inf. This is needed in order to get solutions without +% loops if your model has, for example, 1000/-1000 as arbitrarily large +% bounds. If your model only has "biologically relevant" bounds, then +% set this to false (default true). +% supressErrors : logical, optional +% the program will halt if it has problems finding non-zero solutions +% which are not involved in loops. This could be because the constraints +% on the model are too relaxed (such as unlimited glucose uptake) or too +% strict (such as too many and too narrow constraints) (default false). +% runParallel : logical, optional +% speed up calculations by parallel processing. Requires the MATLAB +% Parallel Computing Toolbox. If this is not installed, the calculations +% will not be parallelized, regardless of what is indicated as +% runParallel (default true). +% goodRxns : double, optional +% vector of indexes of those reactions that are not involved in loops +% and can be used as random objective functions, as generated by a +% previous run of randomSampling on the same model (default empty). +% minFlux : logical, optional +% determines if a second optimization should be performed for each +% random sample, to minimize the number of fluxes and thereby prevent +% loops. Typically, loops are averaged out when a large number of +% samples are taken, but this is not always the case (default false). % -% Note: The solutions are generated by maximizing (with random weights) for -% a random set of three reactions. For reversible reactions it randomly +% Returns +% ------- +% solutions : double +% matrix with the solutions. +% goodRxns : double +% vector of indexes of those reactions that are not involved in loops or +% always carry zero flux and can be used as random objective functions. 
+% +% Notes +% ----- +% The solutions are generated by maximizing (with random weights) for a +% random set of three reactions. For reversible reactions it randomly % chooses between maximizing and minimizing. % % If the model is a GECKO v3+ ecModel, then usage_prot reactions are not % selected for sampling, instead focusing on sampling from the metabolic % aspects that form the solution space. % -% Usage: solutions = randomSampling(model, nSamples, replaceBoundsWithInf,... -% supressErrors, runParallel, goodRxns, minFlux) +% Examples +% -------- +% solutions = randomSampling(model, nSamples, replaceBoundsWithInf, ... +% supressErrors, runParallel, goodRxns, minFlux); if nargin<2 | isempty(nSamples) nSamples=1000; diff --git a/analysis/reporterMetabolites.m b/analysis/reporterMetabolites.m index 022acec1..666244c2 100755 --- a/analysis/reporterMetabolites.m +++ b/analysis/reporterMetabolites.m @@ -1,43 +1,59 @@ function repMets=reporterMetabolites(model,genes,genePValues,printResults,outputFile,geneFoldChanges) -% reporterMetabolites -% The Reporter Metabolites algorithm for identifying metabolites around -% which transcriptional changes occur +% reporterMetabolites Identify metabolites around which transcriptional changes occur. % -% model a model structure -% genes a cell array of gene names (should match with -% model.genes) -% genePValues P-values for differential expression of the genes -% printResults true if the top 20 Reporter Metabolites should be -% printed to the screen (optional, default false) -% outputFile the results are printed to this file (optional) -% geneFoldChanges log-fold changes for the genes. If supplied, then -% Reporter Metabolites are calculated for only up/down- -% regulated genes in addition to the full test (optional) +% The Reporter Metabolites algorithm for identifying metabolites around +% which transcriptional changes occur. % -% repMets an array of structures with the following fields. 
-% test a string the describes the genes that were used to -% calculate the Reporter Metabolites ('all', 'only up', -% or 'only down'). The two latter structures are -% only calculated if geneFoldChanges are supplied. -% mets a cell array of metabolite IDs for the metabolites for -% which a score could be calculated -% metZScores Z-scores for differential expression around each -% metabolite in "mets" -% metPValues P-values for differential expression around each -% metabolite in "mets" -% metNGenes number of neighbouring genes for each metabolite in -% "mets" -% meanZ average Z-scores for the genes around each metabolite -% in "mets" -% stdZ standard deviations of the Z-scores around each -% metabolite in "mets" +% Parameters +% ---------- +% model : struct +% a model structure. +% genes : cell +% a cell array of gene names (should match with model.genes). +% genePValues : double +% P-values for differential expression of the genes. +% printResults : logical, optional +% true if the top 20 Reporter Metabolites should be printed to the +% screen (default false). +% outputFile : char, optional +% the results are printed to this file (default none). +% geneFoldChanges : double, optional +% log-fold changes for the genes. If supplied, then Reporter +% Metabolites are calculated for only up/down-regulated genes in +% addition to the full test (default none). % -% NOTE: For details about the algorithm, see Patil KR, Nielsen J, -% Uncovering transcriptional regulation of metabolism by using metabolic -% network topology. Proc. Natl Acad. Sci. USA 2005;102:2685-2689. +% Returns +% ------- +% repMets : struct +% an array of structures with the following fields: % -% Usage: repMets=reporterMetabolites(model,genes,genePValues,printResults,... -% outputFile,geneFoldChanges) +% - test : a string that describes the genes used to calculate the +% Reporter Metabolites ('all', 'only up', or 'only down'). 
The two +% latter structures are only calculated if geneFoldChanges are +% supplied. +% - mets : a cell array of metabolite IDs for the metabolites for which +% a score could be calculated. +% - metZScores : Z-scores for differential expression around each +% metabolite in "mets". +% - metPValues : P-values for differential expression around each +% metabolite in "mets". +% - metNGenes : number of neighbouring genes for each metabolite in +% "mets". +% - meanZ : average Z-scores for the genes around each metabolite in +% "mets". +% - stdZ : standard deviations of the Z-scores around each metabolite +% in "mets". +% +% Examples +% -------- +% repMets = reporterMetabolites(model, genes, genePValues, ... +% printResults, outputFile, geneFoldChanges); +% +% Notes +% ----- +% For details about the algorithm, see Patil KR, Nielsen J, Uncovering +% transcriptional regulation of metabolism by using metabolic network +% topology. Proc. Natl Acad. Sci. USA 2005;102:2685-2689. genes=convertCharArray(genes); if nargin<4 diff --git a/analysis/runDynamicFBA.m b/analysis/runDynamicFBA.m index a556c71b..4c338fbd 100755 --- a/analysis/runDynamicFBA.m +++ b/analysis/runDynamicFBA.m @@ -1,43 +1,58 @@ function [concentrationMatrix, excRxnNames, timeVec, biomassVec] = runDynamicFBA(model, substrateRxns, initConcentrations, initBiomass, timeStep, nSteps, plotRxns, exclUptakeRxns) -% runDynamicFBA -% Performs dynamic FBA simulation using the static optimization approach +% runDynamicFBA Perform dynamic FBA using the static optimization approach. % -% Input: -% model a model structure -% substrateRxns cell array with exchange reaction identifiers for -% substrates that are initially in the media, whose -% concentration may change (e.g. 
not h2o or co2) -% initConcentrations numeric initial concentrations of substrates -% (matching substrateRxns) -% initBiomass numeric initial biomass (must be non-zero) -% timeStep numeric time step size -% nSteps numeric maximum number of time steps -% plotRxns cell array with exchange reaction identifiers for -% substrates whose concentration should be plotted -% exclUptakeRxns cell array with exchange reaction identifiers for -% substrates whose concentration does not change -% (e.g. co2, o2, h2o, h) +% Parameters +% ---------- +% model : struct +% a model structure. +% substrateRxns : cell +% cell array with exchange reaction identifiers for substrates that are +% initially in the media, whose concentration may change (e.g. not h2o +% or co2). +% initConcentrations : double +% initial concentrations of substrates (matching substrateRxns). +% initBiomass : double +% initial biomass (must be non-zero). +% timeStep : double +% time step size. +% nSteps : double +% maximum number of time steps. +% plotRxns : cell +% cell array with exchange reaction identifiers for substrates whose +% concentration should be plotted. +% exclUptakeRxns : cell +% cell array with exchange reaction identifiers for substrates whose +% concentration does not change (e.g. co2, o2, h2o, h). % -% Output: -% concentrationMatrix numeric matrix with extracellular metabolite -% concentrations -% excRxnNames cell array with exchange reaction identifiers that -% match the metabolites included in the -% concentrationMatrix -% timeVec numeric vector of time points -% biomassVec numeric vector with biomass concentrations +% Returns +% ------- +% concentrationMatrix : double +% matrix with extracellular metabolite concentrations. +% excRxnNames : cell +% cell array with exchange reaction identifiers that match the +% metabolites included in the concentrationMatrix. +% timeVec : double +% vector of time points. +% biomassVec : double +% vector with biomass concentrations. 
% +% Examples +% -------- +% [concentrationMatrix, excRxnNames, timeVec, biomassVec] = ... +% runDynamicFBA(model, substrateRxns, initConcentrations, ... +% initBiomass, timeStep, nSteps, plotRxns, exclUptakeRxns); +% +% Notes +% ----- % If no initial concentration is given for a substrate that has an open -% uptake in the model (i.e. `model.lb < 0`) the concentration is assumed to +% uptake in the model (i.e. model.lb < 0) the concentration is assumed to % be high enough to not be limiting. If the uptake rate for a nutrient is % calculated to exceed the maximum uptake rate for that nutrient specified % in the model and the max uptake rate specified is > 0, the maximum uptake % rate specified in the model is used instead of the calculated uptake % rate. % -% Modified from COBRA Toolbox dynamicFBA.m -% -% Usage: [concentrationMatrix, excRxnNames, timeVec, biomassVec] = runDynamicFBA(model, substrateRxns, initConcentrations, initBiomass, timeStep, nSteps, plotRxns, exclUptakeRxns) +% Modified from COBRA Toolbox dynamicFBA.m. % Find exchange rxns excRxnNames = getExchangeRxns(model); diff --git a/analysis/runPhenotypePhasePlane.m b/analysis/runPhenotypePhasePlane.m index 6fd206d1..e694ee4a 100755 --- a/analysis/runPhenotypePhasePlane.m +++ b/analysis/runPhenotypePhasePlane.m @@ -1,30 +1,44 @@ function [growthRates, shadowPrices1, shadowPrices2] = runPhenotypePhasePlane(model, controlRxn1, controlRxn2, nPts, range1, range2) -% runPhenotypePhasePlane -% Runs phenotype phase plane analysis and plots the results. The first -% plot is a 3D surface plot showing the phenotype phase plane, the other -% two plots show the shadow prices of the metabolites from the two -% control reactions, which define the phases. Modified from the COBRA -% phenotypePhasePlane function. +% runPhenotypePhasePlane Run phenotype phase plane analysis and plot the results. 
% -% Input: -% model a model structure -% controlRxn1 reaction identifier of the first reaction to be plotted -% controlRxn2 reaction identifier of the second reaction to be plotted -% nPts the number of points to plot in each dimension (optional, -% default 50) -% range1 the range [from 0 to range1] of reaction 1 to plot -% (optional, default 20) -% range2 the range [from 0 to range2] of reaction 2 to plot -% (optional, default 20) +% Runs phenotype phase plane analysis and plots the results. The first plot +% is a 3D surface plot showing the phenotype phase plane, the other two +% plots show the shadow prices of the metabolites from the two control +% reactions, which define the phases. % -% Output: -% growthRates1 a matrix of maximum growth rates -% shadowPrices1 a matrix with shadow prices for reaction 1 -% shadowPrices2 a matrix with shadow prices for reaction 2 +% Parameters +% ---------- +% model : struct +% a model structure. +% controlRxn1 : char +% reaction identifier of the first reaction to be plotted. +% controlRxn2 : char +% reaction identifier of the second reaction to be plotted. +% nPts : double, optional +% the number of points to plot in each dimension (default 50). +% range1 : double, optional +% the range [from 0 to range1] of reaction 1 to plot (default 20). +% range2 : double, optional +% the range [from 0 to range2] of reaction 2 to plot (default 20). % -% Modified from COBRA Toolbox phenotypePhasePlane.m +% Returns +% ------- +% growthRates : double +% a matrix of maximum growth rates. +% shadowPrices1 : double +% a matrix with shadow prices for reaction 1. +% shadowPrices2 : double +% a matrix with shadow prices for reaction 2. % -% Usage: [growthRates, shadowPrices1, shadowPrices2] = runPhenotypePhasePlane(model, controlRxn1, controlRxn2, nPts, range1, range2) +% Examples +% -------- +% [growthRates, shadowPrices1, shadowPrices2] = ... +% runPhenotypePhasePlane(model, controlRxn1, controlRxn2, nPts, ... 
+% range1, range2); +% +% Notes +% ----- +% Modified from COBRA Toolbox phenotypePhasePlane.m. close all force % Close all existing figure windows (if open) if nargin < 4 nPts = 50; diff --git a/analysis/runProductionEnvelope.m b/analysis/runProductionEnvelope.m index b9329bf7..302b0149 100755 --- a/analysis/runProductionEnvelope.m +++ b/analysis/runProductionEnvelope.m @@ -1,21 +1,32 @@ function [biomassValues, targetValues] = runProductionEnvelope(model, targetRxn, biomassRxn, nPts) -% runProductionEnvelope -% Calculates the byproduct secretion envelope +% runProductionEnvelope Calculate the byproduct secretion envelope. % -% Input: -% model a model structure -% targetRxn identifier of target metabolite production reaction -% biomassRxn identifier of biomass reaction -% nPts number of points in the plot (optional, default 20) +% Parameters +% ---------- +% model : struct +% a model structure. +% targetRxn : char +% identifier of target metabolite production reaction. +% biomassRxn : char +% identifier of biomass reaction. +% nPts : double, optional +% number of points in the plot (default 20). % -% Output: -% biomassValues Biomass values for plotting -% targetValues Target upper and lower bounds for plotting +% Returns +% ------- +% biomassValues : double +% biomass values for plotting. +% targetValues : double +% target upper and lower bounds for plotting. % -% Modified from COBRA Toolbox productionEnvelope.m +% Examples +% -------- +% [biomassValues, targetValues] = runProductionEnvelope(model, ... +% targetRxn, biomassRxn, nPts); % -% Usage: [biomassValues, targetValues] = runProductionEnvelope(model,... -% targetRxn, biomassRxn, nPts) +% Notes +% ----- +% Modified from COBRA Toolbox productionEnvelope.m. 
if nargin < 4 nPts = 20; diff --git a/analysis/runRobustnessAnalysis.m b/analysis/runRobustnessAnalysis.m index 2d085103..6146b1f5 100755 --- a/analysis/runRobustnessAnalysis.m +++ b/analysis/runRobustnessAnalysis.m @@ -1,26 +1,40 @@ function [controlFlux, objFlux] = runRobustnessAnalysis(model, controlRxn, nPoints, objRxn, plotRedCost) -% runRobustnessAnalysis -% Performs robustness analysis for a reaction of interest and an objective -% of interest. Modified from the COBRA robustnessAnalysis function. +% runRobustnessAnalysis Perform robustness analysis for a reaction and objective. % -% Input: -% model a model structure -% controlRxn reaction of interest whose value is to be controlled -% nPoints number of points to show on plot (optional, default 20) -% objRxn reaction identifier of objective to be maximized (optional, -% default it uses the objective defined in the model) -% plotRedCost logical whether reduced cost should also be plotted -% (optional, default false) +% Performs robustness analysis for a reaction of interest and an objective +% of interest. % -% Output: -% controlFlux flux values of the reaction of interest, ranging from -% its minimum to its maximum value -% objFlux optimal values of objective reaction at each control -% reaction flux value +% Parameters +% ---------- +% model : struct +% a model structure. +% controlRxn : char +% reaction of interest whose value is to be controlled. +% nPoints : double, optional +% number of points to show on plot (default 20). +% objRxn : char, optional +% reaction identifier of objective to be maximized (default uses the +% objective defined in the model). +% plotRedCost : logical, optional +% whether reduced cost should also be plotted (default false). % -% Modified from COBRA Toolbox robustnessAnalysis.m +% Returns +% ------- +% controlFlux : double +% flux values of the reaction of interest, ranging from its minimum to +% its maximum value. 
+% objFlux : double +% optimal values of objective reaction at each control reaction flux +% value. % -% Usage: runRobustnessAnalysis(model, controlRxn, nPoints, objRxn) +% Examples +% -------- +% [controlFlux, objFlux] = runRobustnessAnalysis(model, controlRxn, ... +% nPoints, objRxn); +% +% Notes +% ----- +% Modified from COBRA Toolbox robustnessAnalysis.m. if nargin < 3 nPoints = 20; diff --git a/analysis/runSimpleOptKnock.m b/analysis/runSimpleOptKnock.m index a36e0456..69cbfb26 100755 --- a/analysis/runSimpleOptKnock.m +++ b/analysis/runSimpleOptKnock.m @@ -1,33 +1,44 @@ function out = runSimpleOptKnock(model, targetRxn, biomassRxn, deletions, genesOrRxns, maxNumKO, minGrowth) -% runSimpleOptKnock -% Simple OptKnock algorithm that checks all gene or reaction deletions -% for growth-coupled metabolite production, by testing all possible -% combinations. This is not defined as MILP, and is therefore slow (but -% simple). +% runSimpleOptKnock Simple OptKnock for growth-coupled production. % -% Input: -% model a model structure -% targetRxn identifier of target reaction -% biomassRxn identifier of biomass reaction -% deletions cell array with gene or reaction identifiers that -% should be considered for knockout -% (optional, default = model.rxns) -% genesOrRxns string indicating whether deletions parameter is given -% with 'genes' or 'rxns' identifiers (optional, default -% 'rxns') -% maxNumKO numeric with maximum number of simulatenous knockout -% (optional, default 1) -% minGrowth numeric of minimum growth rate (optional, default 0.05) +% Simple OptKnock algorithm that checks all gene or reaction deletions for +% growth-coupled metabolite production, by testing all possible +% combinations. This is not defined as MILP, and is therefore slow (but +% simple). 
% -% Output: -% out structure with deletions strategies that result in -% growth-coupled production -% KO cell array with gene(s) or reaction(s) to be deleted -% growthRate vector with growth rates after deletion -% prodRate vector with production rates after deletion +% Parameters +% ---------- +% model : struct +% a model structure. +% targetRxn : char +% identifier of target reaction. +% biomassRxn : char +% identifier of biomass reaction. +% deletions : cell, optional +% cell array with gene or reaction identifiers that should be +% considered for knockout (default model.rxns). +% genesOrRxns : char, optional +% string indicating whether deletions parameter is given with 'genes' +% or 'rxns' identifiers (default 'rxns'). +% maxNumKO : double, optional +% maximum number of simultaneous knockouts (default 1). +% minGrowth : double, optional +% minimum growth rate (default 0.05). % -% Usage: out = runSimpleOptKnock(model, targetRxn, biomassRxn, deletions,... -% genesOrRxns, maxNumKO, minGrowth) +% Returns +% ------- +% out : struct +% structure with deletion strategies that result in growth-coupled +% production, with fields: +% +% - KO : cell array with gene(s) or reaction(s) to be deleted. +% - growthRate : vector with growth rates after deletion. +% - prodRate : vector with production rates after deletion. +% +% Examples +% -------- +% out = runSimpleOptKnock(model, targetRxn, biomassRxn, deletions, ... +% genesOrRxns, maxNumKO, minGrowth); if nargin < 4 params.deletions = model.rxns; diff --git a/annotation/assignSBOterms.m b/annotation/assignSBOterms.m index 0c0e1755..4e4093d0 100644 --- a/annotation/assignSBOterms.m +++ b/annotation/assignSBOterms.m @@ -1,62 +1,67 @@ function model = assignSBOterms(model, opts) -% assignSBOterms -% Assign SBO terms to metabolites and reactions following a generic -% rule set. Mirrors raven_python.annotation.add_sbo_terms; -% organism-agnostic, parameterised entirely by `opts`. 
The -% yeast-GEM port of this function is the legacy addSBOterms.m, -% which becomes a thin shim here. +% assignSBOterms Assign SBO terms to metabolites and reactions. % -% Rules -% ----- -% Metabolites: -% SBO:0000649 (Biomass) when met.name is in opts.biomassMetNames, -% or ends with any of opts.biomassMetSuffixes. Otherwise -% SBO:0000247 (Simple chemical). +% Assign SBO terms to metabolites and reactions following a generic rule +% set. Mirrors raven_python.annotation.add_sbo_terms; organism-agnostic, +% parameterised entirely by opts. The yeast-GEM port of this function is +% the legacy addSBOterms.m, which becomes a thin shim here. % -% Reactions (default → override → pseudoreaction override): -% SBO:0000176 (Metabolic reaction) default. -% Single-reactant reactions become: -% SBO:0000627 (exchange) if the lone metabolite is -% extracellular (compartment 'e' or compartment name -% containing 'extracellular'), -% SBO:0000632 (sink) if coef < 0, -% SBO:0000628 (demand) otherwise. -% Transport reactions (detected by opts.transportDetector or -% the default heuristic: same metName in ≥ 2 compartments -% in a single reaction) → SBO:0000655. -% Reactions whose name matches opts.biomassRxnName → SBO:0000629. -% Reactions whose name matches opts.ngamRxnName → SBO:0000630. -% Reactions whose name contains any of -% opts.pseudoreactionSubstrings → SBO:0000395. +% SBO is written via editMiriam(..., 'fill') so pre-existing SBO +% annotations are preserved. % -% "fill" semantic — SBO is written via editMiriam(..., 'fill') so -% pre-existing SBO annotations are preserved. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% opts : struct, optional +% Struct with any of the following fields. Missing fields take the +% defaults shown: % -% Inputs: -% model RAVEN model struct. -% opts (opt) struct with any of the following fields. 
Missing -% fields take the defaults shown: -% biomassMetNames {'biomass','DNA','RNA','protein', -% 'carbohydrate','lipid','cofactor','ion'} -% biomassMetSuffixes {' backbone',' chain'} -% biomassRxnName 'biomass pseudoreaction' -% ngamRxnName 'non-growth associated maintenance reaction' -% pseudoreactionSubstrings {'pseudoreaction','SLIME rxn'} -% onlyLastReactionForPseudo false. yeast-GEM bug-compat -% flag — replicates the -% legacy `for i=numel(...)` -% typo (pseudoreaction -% SBOs applied only to the -% last reaction). Off by -% default; turn ON for -% byte-equivalent yeast-GEM -% output. +% - biomassMetNames : {'biomass','DNA','RNA','protein','carbohydrate', +% 'lipid','cofactor','ion'} +% - biomassMetSuffixes : {' backbone',' chain'} +% - biomassRxnName : 'biomass pseudoreaction' +% - ngamRxnName : 'non-growth associated maintenance reaction' +% - pseudoreactionSubstrings : {'pseudoreaction','SLIME rxn'} +% - onlyLastReactionForPseudo : false. yeast-GEM bug-compat flag that +% replicates the legacy `for i=numel(...)` typo (pseudoreaction SBOs +% applied only to the last reaction). Off by default; turn ON for +% byte-equivalent yeast-GEM output. % -% Output: -% model Modified model. +% Returns +% ------- +% model : struct +% Modified model. % -% Usage: model = assignSBOterms(model) -% model = assignSBOterms(model, struct('onlyLastReactionForPseudo', true)) +% Examples +% -------- +% model = assignSBOterms(model); +% model = assignSBOterms(model, struct('onlyLastReactionForPseudo', true)); +% +% Notes +% ----- +% Metabolites: +% +% SBO:0000649 (Biomass) when met.name is in opts.biomassMetNames, or +% ends with any of opts.biomassMetSuffixes. Otherwise SBO:0000247 +% (Simple chemical). +% +% Reactions (default → override → pseudoreaction override): +% +% SBO:0000176 (Metabolic reaction) default. 
+% Single-reactant reactions become: +% SBO:0000627 (exchange) if the lone metabolite is extracellular +% (compartment 'e' or compartment name containing +% 'extracellular'), +% SBO:0000632 (sink) if coef < 0, +% SBO:0000628 (demand) otherwise. +% Transport reactions (detected by opts.transportDetector or the +% default heuristic: same metName in ≥ 2 compartments in a single +% reaction) → SBO:0000655. +% Reactions whose name matches opts.biomassRxnName → SBO:0000629. +% Reactions whose name matches opts.ngamRxnName → SBO:0000630. +% Reactions whose name contains any of opts.pseudoreactionSubstrings +% → SBO:0000395. if nargin < 2 || isempty(opts) opts = struct(); diff --git a/annotation/editMiriam.m b/annotation/editMiriam.m index 35cafdd5..8cb26138 100755 --- a/annotation/editMiriam.m +++ b/annotation/editMiriam.m @@ -1,40 +1,48 @@ function model=editMiriam(model,type,object,miriamName,miriams,keep) -% editMiriam -% Change MIRIAM annotation fields, one annotation type at the same time. +% editMiriam Change MIRIAM annotation fields, one type at a time. % -% Input: -% model model structure -% type 'met', 'rxn', 'gene' or 'comp' dependent on which -% objects the annotations should be assigned to -% object either a cell array of IDs, a logical vector with the -% same number of elements as the type (see above) in the -% model, a vector of indexes, or 'all' -% miriamName string specifying the namespace of the identifier, for -% instance 'bigg.metabolite'. Should be a valid prefix -% from identifiers.org (e.g. -% https://registry.identifiers.org/registry/bigg.metabolite) -% miriam string or cell array of strings with annotation -% identifiers, e.g. '12dgr161' -% keep one of the following strings, specifying what should be -% done if an object already has an existing MIRIAM -% annotations with the same miriamName: -% 'replace' discard all existing annotations, all will -% be overwritten, even if the new annotation -% is an empty field. 
Should only be used if -% you do not want to keep any of the old -% annotation with the same miriamName -% 'fill' only add annotations to those objects that -% did not yet have an annotation with that -% miriamName. Otherwise, the existing -% annotation is kept, even if it is different -% from the suggested new annotation -% 'add' keep all existing annotations, and add any -% new annotations, after removing duplicates -% -% Ouput: -% model model structure with updated MIRIAM annotation field -% -% Usage: model=editMiriam(model,type,object,miriamName,miriams,keep) +% Parameters +% ---------- +% model : struct +% model structure. +% type : char +% 'met', 'rxn', 'gene' or 'comp' dependent on which objects the +% annotations should be assigned to. +% object : cell or logical or double or char +% either a cell array of IDs, a logical vector with the same number of +% elements as the type (see above) in the model, a vector of indexes, +% or 'all'. +% miriamName : char +% string specifying the namespace of the identifier, for instance +% 'bigg.metabolite'. Should be a valid prefix from identifiers.org +% (e.g. https://registry.identifiers.org/registry/bigg.metabolite). +% miriams : char or cell +% string or cell array of strings with annotation identifiers, e.g. +% '12dgr161'. +% keep : char +% one of the following strings, specifying what should be done if an +% object already has existing MIRIAM annotations with the same +% miriamName: +% +% - 'replace' : discard all existing annotations, all will be +% overwritten, even if the new annotation is an empty field. Should +% only be used if you do not want to keep any of the old annotation +% with the same miriamName. +% - 'fill' : only add annotations to those objects that did not yet +% have an annotation with that miriamName. Otherwise, the existing +% annotation is kept, even if it is different from the suggested new +% annotation. 
+% - 'add' : keep all existing annotations, and add any new +% annotations, after removing duplicates. +% +% Returns +% ------- +% model : struct +% model structure with updated MIRIAM annotation field. +% +% Examples +% -------- +% model = editMiriam(model, type, object, miriamName, miriams, keep); miriamName=char(miriamName); miriams=convertCharArray(miriams); diff --git a/annotation/extractMiriam.m b/annotation/extractMiriam.m index 6a88edc9..5bb582aa 100755 --- a/annotation/extractMiriam.m +++ b/annotation/extractMiriam.m @@ -1,30 +1,37 @@ function [miriams,extractedMiriamNames]=extractMiriam(modelMiriams,miriamNames) -% extractMiriam -% This function unpacks the information kept in metMiriams, rxnMiriams, -% geneMiriams or compMiriams to make the annotation more -% human-readable. The obtained cell array looks the same like in Excel -% format, just the columns are split to have particular miriam name in -% corresponding column +% extractMiriam Unpack MIRIAM annotations into a human-readable table. % -% modelMiriams a miriam structure (e.g. model.metMiriams) -% for one or multiple metabolites -% miriamNames cell array with miriam names to be -% extracted (optional, default 'all', meaning -% that annotation for all miriam names found -% in modelMiriams will be extracted) +% This function unpacks the information kept in metMiriams, rxnMiriams, +% geneMiriams or compMiriams to make the annotation more human-readable. +% The obtained cell array looks the same as in Excel format, just the +% columns are split to have a particular miriam name in the corresponding +% column. % -% miriams a cell array with extracted miriams. if -% several miriam names are requested, the -% corresponding information is saved in -% different columns. if there are several ids -% available for the same entity (metabolite, -% gene, reaction or compartment), they are -% concatenated into one column. 
the total -% number of column represent the number of -% unique miriam names per entity -% extractedMiriamNames cell array with extracted miriam names +% Parameters +% ---------- +% modelMiriams : cell +% a miriam structure (e.g. model.metMiriams) for one or multiple +% metabolites. +% miriamNames : cell or char, optional +% cell array with miriam names to be extracted (default 'all', meaning +% that annotation for all miriam names found in modelMiriams will be +% extracted). % -% Usage: miriam=extractMiriam(modelMiriams,miriamName) +% Returns +% ------- +% miriams : cell +% a cell array with extracted miriams. If several miriam names are +% requested, the corresponding information is saved in different +% columns. If there are several ids available for the same entity +% (metabolite, gene, reaction or compartment), they are concatenated +% into one column. The total number of columns represents the number +% of unique miriam names per entity. +% extractedMiriamNames : cell +% cell array with extracted miriam names. +% +% Examples +% -------- +% [miriams, extractedMiriamNames] = extractMiriam(modelMiriams, miriamNames); if nargin<2 || (ischar(miriamNames) && strcmp(miriamNames,'all')) extractAllTypes=true; diff --git a/annotation/loadDeltaGfromCSV.m b/annotation/loadDeltaGfromCSV.m index da56e766..c20f1082 100644 --- a/annotation/loadDeltaGfromCSV.m +++ b/annotation/loadDeltaGfromCSV.m @@ -1,24 +1,33 @@ function model = loadDeltaGfromCSV(model, metCsv, rxnCsv) -% loadDeltaGfromCSV -% Populate model.metDeltaG and model.rxnDeltaG from project CSV -% files. Mirrors raven_python.annotation.load_delta_g_csv and is -% the upstream version of yeast-GEM's loadDeltaG. +% loadDeltaGfromCSV Populate metDeltaG and rxnDeltaG from CSV files. % -% Each CSV is a two-column table: identifier, deltaG. Rows whose -% identifier doesn't appear in the model are silently skipped. -% Pass an empty string ('') for either argument to skip that side. 
+% Populate model.metDeltaG and model.rxnDeltaG from project CSV files. +% Mirrors raven_python.annotation.load_delta_g_csv and is the upstream +% version of yeast-GEM's loadDeltaG. % -% Inputs: -% model RAVEN model struct. -% metCsv Path to metabolite ΔG CSV (id, ΔG), or '' to skip. -% rxnCsv Path to reaction ΔG CSV (id, ΔG), or '' to skip. +% Each CSV is a two-column table: identifier, deltaG. Rows whose +% identifier doesn't appear in the model are silently skipped. Pass an +% empty string ('') for either argument to skip that side. % -% Output: -% model Model with metDeltaG and/or rxnDeltaG fields added. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% metCsv : char, optional +% Path to metabolite ΔG CSV (id, ΔG), or '' to skip. +% rxnCsv : char, optional +% Path to reaction ΔG CSV (id, ΔG), or '' to skip. % -% Usage: model = loadDeltaGfromCSV(model, ... -% 'data/databases/model_metDeltaG.csv', ... -% 'data/databases/model_rxnDeltaG.csv') +% Returns +% ------- +% model : struct +% Model with metDeltaG and/or rxnDeltaG fields added. +% +% Examples +% -------- +% model = loadDeltaGfromCSV(model, ... +% 'data/databases/model_metDeltaG.csv', ... +% 'data/databases/model_rxnDeltaG.csv'); if nargin < 3 rxnCsv = ''; diff --git a/annotation/saveDeltaGtoCSV.m b/annotation/saveDeltaGtoCSV.m index ee682940..491710c5 100644 --- a/annotation/saveDeltaGtoCSV.m +++ b/annotation/saveDeltaGtoCSV.m @@ -1,23 +1,30 @@ function saveDeltaGtoCSV(model, metCsv, rxnCsv, verbose) -% saveDeltaGtoCSV -% Persist model.metDeltaG and model.rxnDeltaG to project CSV files. -% Counterpart of loadDeltaGfromCSV and the upstream version of -% yeast-GEM's saveDeltaG. Mirrors raven_python.annotation.save_delta_g_csv. +% saveDeltaGtoCSV Persist metDeltaG and rxnDeltaG to CSV files. % -% Each CSV gets two columns: identifier, deltaG. Rows are written -% in model order (one row per entity); identifiers without a -% matching field get NaN. 
Pass an empty string for metCsv or rxnCsv -% to skip that side. +% Persist model.metDeltaG and model.rxnDeltaG to project CSV files. +% Counterpart of loadDeltaGfromCSV and the upstream version of yeast-GEM's +% saveDeltaG. Mirrors raven_python.annotation.save_delta_g_csv. % -% Inputs: -% model RAVEN model struct. -% metCsv Output path for the metabolite ΔG CSV, or '' to skip. -% rxnCsv Output path for the reaction ΔG CSV, or '' to skip. -% verbose (opt, default false) Print "wrote ..." per file. +% Each CSV gets two columns: identifier, deltaG. Rows are written in model +% order (one row per entity); identifiers without a matching field get +% NaN. Pass an empty string for metCsv or rxnCsv to skip that side. % -% Usage: saveDeltaGtoCSV(model, ... -% 'data/databases/model_metDeltaG.csv', ... -% 'data/databases/model_rxnDeltaG.csv') +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% metCsv : char, optional +% Output path for the metabolite ΔG CSV, or '' to skip. +% rxnCsv : char, optional +% Output path for the reaction ΔG CSV, or '' to skip. +% verbose : logical, optional +% Print "wrote ..." per file (default false). +% +% Examples +% -------- +% saveDeltaGtoCSV(model, ... +% 'data/databases/model_metDeltaG.csv', ... +% 'data/databases/model_rxnDeltaG.csv'); if nargin < 4 verbose = false; diff --git a/biomass/fitParameters.m b/biomass/fitParameters.m index 44c15ecf..b3c949ba 100755 --- a/biomass/fitParameters.m +++ b/biomass/fitParameters.m @@ -1,39 +1,55 @@ function [parameters, fitnessScore, exitFlag, newModel]=fitParameters(model,xRxns,xValues,rxnsToFit,valuesToFit,parameterPositions,fitToRatio,initialGuess,plotFitting) -% fitParameters -% Fits parameters such as maintenance ATP by quadratic programming +% fitParameters Fit parameters such as maintenance ATP by quadratic programming. 
% -% model a model structure -% xRxns cell array with the IDs of the reactions that will be -% fixed for each data point -% xValues matrix with the corresponding values for each -% xRxns (columns are reactions) -% rxnsToFit cell array with the IDs of reactions that will be fitted to -% valuesToFit matrix with the corresponding values for each -% rxnsToFit (columns are reactions) -% parameterPositions stucture that determines where the parameters are in the -% stoichiometric matrix. Contains the fields: -% position cell array of vectors where each element contains -% the positions in the S-matrix for that parameter -% isNegative cell array of vectors where the elements are true -% if that position should be the negative of the -% fitted value (to differentiate between -% production/consumption) -% fitToRatio if the ratio of simulated to measured values should -% be fitted instead of the absolute value. Used to prevent -% large fluxes from having too large impact (optional, -% default true) -% initialGuess initial guess of the parameters (optional) -% plotFitting true if the resulting fitting should be plotted -% (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% xRxns : cell +% cell array with the IDs of the reactions that will be fixed for each +% data point. +% xValues : double +% matrix with the corresponding values for each xRxns (columns are +% reactions). +% rxnsToFit : cell +% cell array with the IDs of reactions that will be fitted to. +% valuesToFit : double +% matrix with the corresponding values for each rxnsToFit (columns are +% reactions). 
+% parameterPositions : struct +% structure that determines where the parameters are in the +% stoichiometric matrix, with fields: % -% parameters fitted parameters in the same order as in -% parameterPositions -% fitnessScore the correponding residual sum of squares -% newModel updated model structure with the fitted parameters +% - position : cell array of vectors where each element contains the +% positions in the S-matrix for that parameter +% - isNegative : cell array of vectors where the elements are true if +% that position should be the negative of the fitted value (to +% differentiate between production/consumption) +% fitToRatio : logical, optional +% if the ratio of simulated to measured values should be fitted +% instead of the absolute value. Used to prevent large fluxes from +% having too large an impact (default true). +% initialGuess : double, optional +% initial guess of the parameters (default ones). +% plotFitting : logical, optional +% true if the resulting fitting should be plotted (default false). % -% Usage: [parameters, fitnessScore, exitFlag, newModel]=fitParameters(model,... -% xRxns,xValues,rxnsToFit,valuesToFit,parameterPositions,fitToRatio,... -% initialGuess,plotFitting) +% Returns +% ------- +% parameters : double +% fitted parameters in the same order as in parameterPositions. +% fitnessScore : double +% the corresponding residual sum of squares. +% exitFlag : double +% exit status returned by fminsearch. +% newModel : struct +% updated model structure with the fitted parameters. +% +% Examples +% -------- +% [parameters, fitnessScore, exitFlag, newModel]=fitParameters(model,... +% xRxns,xValues,rxnsToFit,valuesToFit,parameterPositions,fitToRatio,... 
+% initialGuess,plotFitting); if nargin<7 fitToRatio=true; diff --git a/biomass/getBiomassFractions.m b/biomass/getBiomassFractions.m index 592e7830..d42fa4d8 100644 --- a/biomass/getBiomassFractions.m +++ b/biomass/getBiomassFractions.m @@ -1,47 +1,52 @@ function fractions = getBiomassFractions(model, biomassConfig) -% getBiomassFractions -% Compute the mass fraction (g/gDW) per biomass component plus the -% total. Mirrors raven_python.biomass.sum_biomass; the MATLAB -% counterpart of yeast-GEM's legacy sumBioMass. +% getBiomassFractions Compute mass fraction per biomass component. % -% The biomassConfig struct describes the per-organism biomass -% layout — see "Inputs" below. Components whose pseudoreaction is -% missing from the model contribute 0. +% Compute the mass fraction (g/gDW) per biomass component plus the total. +% Mirrors raven_python.biomass.sum_biomass; the MATLAB counterpart of +% yeast-GEM's legacy sumBioMass. % -% Inputs: -% model RAVEN model struct. -% biomassConfig struct with fields: -% biomass_rxn rxn id of the top-level -% biomass pseudoreaction. -% proton_met met id of cytosolic H+ (used -% only by rescalePseudoreaction; -% may be unused here). -% components cell array of component -% structs with fields: -% .name component name -% (e.g. 'protein'). -% .pseudoreaction_name model.rxnNames -% entry to identify -% the pseudoreaction. -% .mass_strategy 'mw' | 'mw_minus_2h' -% | 'mw_minus_water' -% | 'grams' — see -% NOTES below. +% The biomassConfig struct describes the per-organism biomass layout — see +% Parameters below. Components whose pseudoreaction is missing from the +% model contribute 0. % -% Output: -% fractions struct keyed by component name plus 'total': -% fractions.protein, fractions.RNA, ... etc. -% All values are in g/gDW. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. 
+% biomassConfig : struct +% Struct describing the biomass layout, with fields: % -% NOTES on mass_strategy: -% 'mw' MW from chemical formula -% 'mw_minus_2h' MW − 2.016 g/mol (two protons released per -% charged tRNA — protein-pseudoreaction substrates) -% 'mw_minus_water' MW − 18.015 g/mol (water released per -% polymerisation step — RNA / DNA) -% 'grams' stoichiometry already in g/gDW (lipid backbone) +% - biomass_rxn : rxn id of the top-level biomass pseudoreaction. +% - proton_met : met id of cytosolic H+ (used only by +% rescalePseudoreaction; may be unused here). +% - components : cell array of component structs, each with fields: % -% Usage: fractions = getBiomassFractions(model, biomassConfig) +% - name : component name (e.g. 'protein'). +% - pseudoreaction_name : model.rxnNames entry to identify the +% pseudoreaction. +% - mass_strategy : 'mw' | 'mw_minus_2h' | 'mw_minus_water' | +% 'grams' — see Notes below. +% +% Returns +% ------- +% fractions : struct +% Struct keyed by component name plus 'total': fractions.protein, +% fractions.RNA, ... etc. All values are in g/gDW. 
+% +% Examples +% -------- +% fractions = getBiomassFractions(model, biomassConfig); +% +% Notes +% ----- +% mass_strategy values: +% +% 'mw' MW from chemical formula +% 'mw_minus_2h' MW − 2.016 g/mol (two protons released per charged +% tRNA — protein-pseudoreaction substrates) +% 'mw_minus_water' MW − 18.015 g/mol (water released per +% polymerisation step — RNA / DNA) +% 'grams' stoichiometry already in g/gDW (lipid backbone) fractions = struct(); total = 0; diff --git a/biomass/scaleBiomassFraction.m b/biomass/scaleBiomassFraction.m index 0e4165f2..714bafa8 100644 --- a/biomass/scaleBiomassFraction.m +++ b/biomass/scaleBiomassFraction.m @@ -1,23 +1,37 @@ function model = scaleBiomassFraction(model, biomassConfig, componentName, newValue, balanceOut) -% scaleBiomassFraction -% Rescale a biomass component to a target g/gDW value, optionally -% balancing a second component so the total biomass mass stays at -% 1 g/gDW. Mirrors raven_python.biomass.scale_biomass and yeast-GEM's -% legacy scaleBioMass. +% scaleBiomassFraction Rescale a biomass component to a target value. % -% Inputs: -% model RAVEN model struct. -% biomassConfig struct (see getBiomassFractions). -% componentName Component to rescale. -% newValue Target fraction in g/gDW. -% balanceOut (opt) Second component name to adjust so the -% biomass total remains 1 g/gDW. Empty / omit -% to skip balancing. +% Rescale a biomass component to a target g/gDW value, optionally +% balancing a second component so the total biomass mass stays at 1 g/gDW. +% Mirrors raven_python.biomass.scale_biomass and yeast-GEM's legacy +% scaleBioMass. % -% Output: -% model Modified model. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% biomassConfig : struct +% Struct (see getBiomassFractions). +% componentName : char +% Component to rescale. +% newValue : double +% Target fraction in g/gDW. +% balanceOut : char, optional +% Second component name to adjust so the biomass total remains 1 +% g/gDW. 
Empty / omit to skip balancing. % -% Usage: model = scaleBiomassFraction(model, biomassConfig, 'protein', 0.5, 'carbohydrate') +% Returns +% ------- +% model : struct +% Modified model. +% +% Examples +% -------- +% model = scaleBiomassFraction(model, biomassConfig, 'protein', 0.5, 'carbohydrate'); +% +% See also +% -------- +% getBiomassFractions if nargin < 5 balanceOut = ''; diff --git a/biomass/scaleBiomassPseudoreaction.m b/biomass/scaleBiomassPseudoreaction.m index 4636297c..f565800b 100644 --- a/biomass/scaleBiomassPseudoreaction.m +++ b/biomass/scaleBiomassPseudoreaction.m @@ -1,29 +1,43 @@ function model = scaleBiomassPseudoreaction(model, biomassConfig, componentName, factor) -% scaleBiomassPseudoreaction -% Multiply the substrate coefficients of one biomass component -% pseudoreaction by `factor` and rebalance H+ to preserve charge -% neutrality. Mirrors raven_python.biomass.rescale_pseudoreaction -% and yeast-GEM's legacy rescalePseudoReaction. +% scaleBiomassPseudoreaction Rescale a biomass component pseudoreaction. % -% "Substrate" means every metabolite in the pseudoreaction whose -% metabolite name does NOT match the component name (the -% component's product is left untouched). After rescaling, the -% coefficient of biomassConfig.proton_met is recomputed so the -% pseudoreaction's total ionic charge sums to zero. +% Multiply the substrate coefficients of one biomass component +% pseudoreaction by factor and rebalance H+ to preserve charge neutrality. +% Mirrors raven_python.biomass.rescale_pseudoreaction and yeast-GEM's +% legacy rescalePseudoReaction. % -% Inputs: -% model RAVEN model struct. -% biomassConfig struct (see getBiomassFractions). -% componentName Name of the component to rescale (must match -% biomassConfig.components{i}.name for some i, -% AND be the model.metNames of the produced -% metabolite in the matching pseudoreaction). -% factor Multiplicative factor. 
+% "Substrate" means every metabolite in the pseudoreaction whose +% metabolite name does NOT match the component name (the component's +% product is left untouched). After rescaling, the coefficient of +% biomassConfig.proton_met is recomputed so the pseudoreaction's total +% ionic charge sums to zero. % -% Output: -% model Modified model. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% biomassConfig : struct +% Struct (see getBiomassFractions). +% componentName : char +% Name of the component to rescale (must match +% biomassConfig.components{i}.name for some i, AND be the +% model.metNames of the produced metabolite in the matching +% pseudoreaction). +% factor : double +% Multiplicative factor. % -% Usage: model = scaleBiomassPseudoreaction(model, biomassConfig, 'protein', 0.9) +% Returns +% ------- +% model : struct +% Modified model. +% +% Examples +% -------- +% model = scaleBiomassPseudoreaction(model, biomassConfig, 'protein', 0.9); +% +% See also +% -------- +% getBiomassFractions comp = findComponent(biomassConfig, componentName); rxnPos = find(strcmp(model.rxnNames, comp.pseudoreaction_name)); diff --git a/biomass/setGAM.m b/biomass/setGAM.m index 319c04df..e521139c 100644 --- a/biomass/setGAM.m +++ b/biomass/setGAM.m @@ -1,33 +1,42 @@ function model = setGAM(model, value, biomassRxn, cofactorMetNames, ngamRxn, ngamValue) -% setGAM -% Set the growth-associated maintenance (GAM) coefficient in the -% biomass pseudoreaction, and optionally fix the non-growth -% maintenance (NGAM) reaction's bounds. Mirrors -% raven_python.biomass.set_gam and yeast-GEM's legacy changeGAM. +% setGAM Set the growth-associated maintenance (GAM) coefficient. % -% For every metabolite in the biomass pseudoreaction whose -% `model.metNames` entry is in `cofactorMetNames`, the -% stoichiometric coefficient is set to ±`value` preserving the sign -% of the current coefficient. 
Yeast-GEM scales ATP, ADP, H2O, H+ -% and phosphate (with ATP and H2O on the substrate side, ADP / H+ / -% phosphate on the product side). +% Set the growth-associated maintenance (GAM) coefficient in the biomass +% pseudoreaction, and optionally fix the non-growth maintenance (NGAM) +% reaction's bounds. Mirrors raven_python.biomass.set_gam and yeast-GEM's +% legacy changeGAM. % -% Inputs: -% model RAVEN model struct. -% value New GAM value (mmol ATP / gDW per growth unit). -% biomassRxn Reaction id of the biomass pseudoreaction. -% cofactorMetNames Cell array of metabolite NAMES (not IDs) -% to rescale, e.g. {'ATP','ADP','H2O','H+', -% 'phosphate'}. -% ngamRxn (opt) NGAM reaction id. Required when -% ngamValue is supplied. -% ngamValue (opt) NGAM flux to fix. Sets the NGAM -% reaction's bounds to (ngamValue, ngamValue). +% For every metabolite in the biomass pseudoreaction whose model.metNames +% entry is in cofactorMetNames, the stoichiometric coefficient is set to +% ±value preserving the sign of the current coefficient. Yeast-GEM scales +% ATP, ADP, H2O, H+ and phosphate (with ATP and H2O on the substrate side, +% ADP / H+ / phosphate on the product side). % -% Output: -% model Modified model. +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% value : double +% New GAM value (mmol ATP / gDW per growth unit). +% biomassRxn : char +% Reaction id of the biomass pseudoreaction. +% cofactorMetNames : cell +% Cell array of metabolite NAMES (not IDs) to rescale, e.g. +% {'ATP','ADP','H2O','H+','phosphate'}. +% ngamRxn : char, optional +% NGAM reaction id. Required when ngamValue is supplied. +% ngamValue : double, optional +% NGAM flux to fix. Sets the NGAM reaction's bounds to (ngamValue, +% ngamValue). % -% Usage: model = setGAM(model, 80, 'r_4041', {'ATP','ADP','H2O','H+','phosphate'}) +% Returns +% ------- +% model : struct +% Modified model. 
+% +% Examples +% -------- +% model = setGAM(model, 80, 'r_4041', {'ATP','ADP','H2O','H+','phosphate'}); if nargin < 4 error('setGAM:missingArgs', ... diff --git a/comparison/compareMultipleModels.m b/comparison/compareMultipleModels.m index 596d0957..4bd62aed 100755 --- a/comparison/compareMultipleModels.m +++ b/comparison/compareMultipleModels.m @@ -1,45 +1,52 @@ function compStruct = compareMultipleModels(models,printResults,plotResults,groupVector,funcCompare,taskFile) -% compareMultipleModels -% Compares two or more condition-specific models generated from the same -% base model using high-dimensional comparisons in the reaction-space. +% compareMultipleModels Compare two or more condition-specific models. % -% models cell array of two or more models -% printResults true if the results should be printed on the screen -% (optional, default false) -% plotResults true if the results should be plotted -% (optional, default false) -% groupVector numeric vector or cell array for grouping similar -% models, i.e. by tissue (optional, default, all models -% ungrouped) -% funcCompare logical, should a functional comparison be run -% (optional,default, false) -% taskFile string containing the name of the task file to use -% for the functional comparison (should be an .xls or -% .xlsx file, required for functional comparison) +% Compares two or more condition-specific models generated from the same +% base model using high-dimensional comparisons in the reaction-space. % -% compStruct structure that contains the comparison results -% modelIDs cell array of model ids -% reactions substructure containing reaction information -% matrix binary matrix composed of reactions (rows) in -% each model (column). This matrix is used as the -% input for the model comparisons. -% IDs list of the reactions contained in the reaction -% matrix. 
-% subsystems substructure containing subsystem information -% matrix matrix with comparison of number of rxns per -% subsystem -% ID vector consisting of names of all subsystems -% structComp matrix with pairwise comparisons of model structure -% based on (1-Hamming distance) between models -% structCompMap matrix with 3D tSNE (or MDS) mapping of model -% structures based on Hamming distances -% funcComp substructure containing function comparison results -% matrix matrix with PASS / FAIL (1 / 0) values for each -% task -% tasks vector containing names of all tasks +% Parameters +% ---------- +% models : cell +% cell array of two or more models. +% printResults : logical, optional +% true if the results should be printed on the screen (default false). +% plotResults : logical, optional +% true if the results should be plotted (default false). +% groupVector : double or cell, optional +% numeric vector or cell array for grouping similar models, e.g. by +% tissue (default all models ungrouped). +% funcCompare : logical, optional +% true if a functional comparison should be run (default false). +% taskFile : char, optional +% string containing the name of the task file to use for the functional +% comparison (should be an .xls or .xlsx file, required for functional +% comparison). % -% Usage: compStruct=compareMultipleModels(models,printResults,...
-% plotResults,groupVector,funcCompare,taskFile); +% Returns +% ------- +% compStruct : struct +% structure that contains the comparison results, with fields: +% +% - modelIDs : cell array of model ids +% - reactions : substructure containing reaction information, with +% fields matrix (binary matrix composed of reactions (rows) in each +% model (column), used as the input for the model comparisons) and IDs +% (list of the reactions contained in the reaction matrix) +% - subsystems : substructure containing subsystem information, with +% fields matrix (matrix with comparison of number of rxns per +% subsystem) and ID (vector consisting of names of all subsystems) +% - structComp : matrix with pairwise comparisons of model structure +% based on (1-Hamming distance) between models +% - structCompMap : matrix with 3D tSNE (or MDS) mapping of model +% structures based on Hamming distances +% - funcComp : substructure containing function comparison results, with +% fields matrix (matrix with PASS / FAIL (1 / 0) values for each task) +% and tasks (vector containing names of all tasks) +% +% Examples +% -------- +% compStruct = compareMultipleModels(models, printResults, ... +% plotResults, groupVector, funcCompare, taskFile); %% Stats toolbox required if ~(exist('mdscale.m','file') && exist('pdist.m','file') && exist('squareform.m','file') && exist('tsne.m','file')) diff --git a/comparison/compareRxnsGenesMetsComps.m b/comparison/compareRxnsGenesMetsComps.m index 29f40dcf..6712bee8 100755 --- a/comparison/compareRxnsGenesMetsComps.m +++ b/comparison/compareRxnsGenesMetsComps.m @@ -1,29 +1,35 @@ function compStruct=compareRxnsGenesMetsComps(models,printResults) -% compareRxnsGenesMetsComps -% Compares two or more models with respect to overlap in terms of genes, -% reactions, metabolites and compartments. +% compareRxnsGenesMetsComps Compare overlap of genes, reactions, metabolites and compartments. 
% -% models cell array of two or more models -% printResults true if the results should be printed on the screen -% (optional, default false) +% Compares two or more models with respect to overlap in terms of genes, +% reactions, metabolites and compartments. % -% compStruct structure that contains the comparison -% modelIDs cell array of model ids -% rxns These contain the comparison for each field. 'equ' are -% the equations after sorting and 'uEqu' are the -% equations when not taking compartmentalization into acount -% mets -% genes -% eccodes -% metNames -% equ -% uEqu -% comparison binary matrix where each row indicate which models are -% included in the comparison -% nElements vector with the number of elements for each -% comparison +% Parameters +% ---------- +% models : cell +% cell array of two or more models. +% printResults : logical, optional +% true if the results should be printed on the screen (default true). % -% Usage: compStruct=compareRxnsGenesMetsComps(models,printResults) +% Returns +% ------- +% compStruct : struct +% structure that contains the comparison, with fields: +% +% - modelIDs : cell array of model ids +% - rxns, mets, genes, eccodes, metNames, equ, uEqu : the comparison for +% each field. 'equ' are the equations after sorting and 'uEqu' are the +% equations when not taking compartmentalization into account.
Each of +% these contains the sub-fields: +% +% - comparison : binary matrix where each row indicates which models +% are included in the comparison +% - nElements : vector with the number of elements for each +% comparison +% +% Examples +% -------- +% compStruct = compareRxnsGenesMetsComps(models, printResults); if nargin<2 printResults=true; diff --git a/conditions/applyCondition.m b/conditions/applyCondition.m index 8494485f..940c5d2f 100644 --- a/conditions/applyCondition.m +++ b/conditions/applyCondition.m @@ -1,52 +1,63 @@ function model = applyCondition(model, condition) -% applyCondition -% Apply a deterministic "condition" to a model: a prelude that resets -% exchange bounds, optional metabolite removals + automatic charge -% rebalancing of a pseudoreaction, optional biomass-stoichiometry -% delta, and a per-reaction bounds diff. The schema is intentionally -% narrow so a condition can be reviewed as data. +% applyCondition Apply a deterministic condition to a model. % -% Yeast-GEM was the first consumer; the same schema works for any -% GEM that keeps its condition presets as data rather than as code. -% Project-specific extensions (e.g. yeast-GEM's amino_acid_ratio -% step that rewrites a protein pseudoreaction's stoichiometry from a -% side-car TSV) are handled by the *caller* before / after this -% function — kept upstream-narrow on purpose. +% Apply a deterministic "condition" to a model: a prelude that resets +% exchange bounds, optional metabolite removals + automatic charge +% rebalancing of a pseudoreaction, optional biomass-stoichiometry delta, +% and a per-reaction bounds diff. The schema is intentionally narrow so a +% condition can be reviewed as data. % -% Inputs: -% model RAVEN model struct. -% condition Either a path to a YAML condition file or a struct -% already produced by parseYAML. 
The expected schema -% (all keys optional): +% Yeast-GEM was the first consumer; the same schema works for any GEM that +% keeps its condition presets as data rather than as code. +% Project-specific extensions (e.g. yeast-GEM's amino_acid_ratio step that +% rewrites a protein pseudoreaction's stoichiometry from a side-car TSV) +% are handled by the caller before / after this function — kept +% upstream-narrow on purpose. % -% prelude: -% reset_exchanges: out % truthy -> reset all +% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% condition : char or struct +% Either a path to a YAML condition file or a struct already produced +% by parseYAML. The expected schema (all keys optional): % -% cofactor_pseudoreaction: -% rxn_id: r_4598 -% remove_mets: -% - { met: s_3714 } -% charge_balance_met: s_0794 +% prelude: +% reset_exchanges: out % truthy -> reset all % -% biomass_stoichiometry_delta: -% rxn_id: r_4041 -% add: -% - { met: s_0689, coef: 0.08 } -% - { met: s_0687, coef: -0.08 } -% - { met: s_0794, coef: -0.16 } +% cofactor_pseudoreaction: +% rxn_id: r_4598 +% remove_mets: +% - { met: s_3714 } +% charge_balance_met: s_0794 % -% bounds: -% - { rxn: r_1654, lb: -1000 } -% - { rxn: r_1992, lb: 0 } -% - { rxn: r_1663, lb: 0, ub: 0 } +% biomass_stoichiometry_delta: +% rxn_id: r_4041 +% add: +% - { met: s_0689, coef: 0.08 } +% - { met: s_0687, coef: -0.08 } +% - { met: s_0794, coef: -0.16 } % -% expected_uptake_count: 15 +% bounds: +% - { rxn: r_1654, lb: -1000 } +% - { rxn: r_1992, lb: 0 } +% - { rxn: r_1663, lb: 0, ub: 0 } % -% Output: -% model Modified model. +% expected_uptake_count: 15 % -% Usage: model = applyCondition(model, 'data/conditions/anaerobic.yml') -% model = applyCondition(model, parseYAML('data/conditions/anaerobic.yml')) +% Returns +% ------- +% model : struct +% Modified model. 
+% +% Examples +% -------- +% model = applyCondition(model, 'data/conditions/anaerobic.yml'); +% model = applyCondition(model, parseYAML('data/conditions/anaerobic.yml')); +% +% See also +% -------- +% parseYAML if ischar(condition) || isstring(condition) cond = parseYAML(char(condition)); diff --git a/conversion/addIdentifierPrefix.m b/conversion/addIdentifierPrefix.m index 74672ff6..e4873a41 100644 --- a/conversion/addIdentifierPrefix.m +++ b/conversion/addIdentifierPrefix.m @@ -1,26 +1,35 @@ function [model, hasChanged]=addIdentifierPrefix(model,fields) -% addIdentifierPrefix -% If reaction, metabolite, compartment, gene or model identifiers do not -% start with a letter or _, which conflicts with SBML specifications, -% prefixes are added for all identifiers in the respective model field. -% The prefixes are: -% "R_" for model.rxns, -% "M_" for model.mets, -% "C_" for model.comps; -% "G_" for model.genes (and also represented in model.grRules) +% addIdentifierPrefix Add identifier prefixes required by SBML. % -% Input: -% model model whose identifiers should be modified -% fields cell array with model field names that should be -% checked if prefixes should be added, possible values: -% 'rxns', 'mets', 'comps', 'genes', 'id'. (optional, by -% default all listed model fields will be checked). +% If reaction, metabolite, compartment, gene or model identifiers do not +% start with a letter or _, which conflicts with SBML specifications, +% prefixes are added for all identifiers in the respective model field. +% The prefixes are: % -% Output: -% model modified model -% hasChanged cell array with fields and prefixes that are added +% "R_" for model.rxns, +% "M_" for model.mets, +% "C_" for model.comps, +% "G_" for model.genes (and also represented in model.grRules) % -% Usage: [model, hasChanged]=addIdentifierPrefix(model,fields) +% Parameters +% ---------- +% model : struct +% model whose identifiers should be modified.
+% fields : cell, optional +% cell array with model field names that should be checked if prefixes +% should be added, possible values: 'rxns', 'mets', 'comps', 'genes', +% 'id' (default all listed model fields will be checked). +% +% Returns +% ------- +% model : struct +% modified model. +% hasChanged : cell +% cell array with fields and prefixes that are added. +% +% Examples +% -------- +% [model, hasChanged] = addIdentifierPrefix(model, fields); if nargin<2 || isempty(fields) fields = {'rxns','mets','comps','genes','id'}; diff --git a/conversion/ravenCobraWrapper.m b/conversion/ravenCobraWrapper.m index cb82866f..4c692e93 100755 --- a/conversion/ravenCobraWrapper.m +++ b/conversion/ravenCobraWrapper.m @@ -1,33 +1,42 @@ function newModel=ravenCobraWrapper(model) -% ravenCobraWrapper -% Converts between RAVEN and COBRA structures +% ravenCobraWrapper Convert between RAVEN and COBRA structures. % -% Input: model a RAVEN/COBRA-compatible model structure +% This function is a bidirectional tool to convert between RAVEN and COBRA +% structures. It recognises a COBRA structure by checking the existence of +% the field 'rules', which is only found in a COBRA Toolbox structure. If +% the COBRA model also has a grRules field, then this will be used instead +% of parsing the rules field. % -% Ouput: newModel a COBRA/RAVEN-compatible model structure -% -% This function is a bidirectional tool to convert between RAVEN and -% COBRA structures. It recognises COBRA structure by checking field -% 'rules' existense, which is only found in COBRA Toolbox structure. If -% the COBRA model also has a grRules field, then this will be used -% instead of parsing the rules field. +% Parameters +% ---------- +% model : struct +% a RAVEN/COBRA-compatible model structure. % -% NOTE: During RAVEN -> COBRA -> RAVEN conversion cycle the following -% fields are lost: annotation, compOutside, compMiriams, rxnComps, -% geneComps, unconstrained. 
Boundary metabolites are lost, because COBRA -% structure does not involve boundary metabolites, so they are removed -% using simplifyModel before RAVEN -> COBRA conversion. The field 'rev' -% is also partially lost, but during COBRA -> RAVEN conversion it's -% reconstructed based on lower bound reaction values +% Returns +% ------- +% newModel : struct +% a COBRA/RAVEN-compatible model structure. % -% NOTE: During COBRA -> RAVEN -> COBRA conversion cycle the following -% fields are lost: geneEntrezID, modelVersion, proteins +% Examples +% -------- +% newModel = ravenCobraWrapper(model); % -% NOTE: The information about mandatory RAVEN fields was taken from -% checkModelStruct function, whereas the corresponding information about -% COBRA fields was fetched from verifyModel function +% Notes +% ----- +% During the RAVEN -> COBRA -> RAVEN conversion cycle the following fields +% are lost: annotation, compOutside, compMiriams, rxnComps, geneComps, +% unconstrained. Boundary metabolites are lost, because the COBRA +% structure does not involve boundary metabolites, so they are removed +% using simplifyModel before RAVEN -> COBRA conversion. The field 'rev' is +% also partially lost, but during COBRA -> RAVEN conversion it is +% reconstructed based on lower bound reaction values. % -% Usage: newModel=ravenCobraWrapper(model) +% During the COBRA -> RAVEN -> COBRA conversion cycle the following fields +% are lost: geneEntrezID, modelVersion, proteins. +% +% The information about mandatory RAVEN fields was taken from the +% checkModelStruct function, whereas the corresponding information about +% COBRA fields was fetched from the verifyModel function. 
if isfield(model,'rules') isRaven=false; diff --git a/conversion/removeIdentifierPrefix.m b/conversion/removeIdentifierPrefix.m index 0a960108..fe509bc4 100644 --- a/conversion/removeIdentifierPrefix.m +++ b/conversion/removeIdentifierPrefix.m @@ -1,31 +1,41 @@ function [model, hasChanged]=removeIdentifierPrefix(model,fields,forceRemove) -% removeIdentifierPrefix -% This function removes identifier prefixes: -% "R_" for model.rxns, model.rxnNames and model.id, -% "M_" for model.mets and model.metNames, -% "C_" for model.comps; -% "G_" for model.genes (and also represented in model.grRules). -% By default, the prefixes are only removed if all entries in a -% particular field has the prefix. The prefixes might have been present -% because one or more identifiers do not start with a letter or _, which -% conflicts with SBML specifications. +% removeIdentifierPrefix Remove SBML-required identifier prefixes. % -% Input: -% model model whose identifiers should be modified -% fields cell array with model field names from which the -% identifiers should be removed, possible values: -% 'rxns', 'mets', 'comps', 'genes', 'metNames', -% 'rxnNames', 'id'. (optional, by default all listed -% model fields will be checked). -% forceRemove if prefixes should be removed even if not all entries -% in a model field have the prefix (optional, default -% false) +% This function removes identifier prefixes: % -% Output: -% model modified model -% hasChanged cell array with fields and prefixes that are removed +% "R_" for model.rxns, model.rxnNames and model.id, +% "M_" for model.mets and model.metNames, +% "C_" for model.comps; +% "G_" for model.genes (and also represented in model.grRules). % -% Usage: model=removeIdentifierPrefix(model,fields,forceRemove) +% By default, the prefixes are only removed if all entries in a particular +% field have the prefix. 
The prefixes might have been present because one +% or more identifiers do not start with a letter or _, which conflicts +% with SBML specifications. +% +% Parameters +% ---------- +% model : struct +% model whose identifiers should be modified. +% fields : cell, optional +% cell array with model field names from which the identifiers should +% be removed, possible values: 'rxns', 'mets', 'comps', 'genes', +% 'metNames', 'rxnNames', 'id' (default all listed model fields will +% be checked). +% forceRemove : logical, optional +% if prefixes should be removed even if not all entries in a model +% field have the prefix (default false). +% +% Returns +% ------- +% model : struct +% modified model. +% hasChanged : cell +% cell array with fields and prefixes that are removed. +% +% Examples +% -------- +% model = removeIdentifierPrefix(model, fields, forceRemove); if nargin<2 || isempty(fields) fields = {'rxns','mets','comps','genes','metNames','rxnNames','id'}; diff --git a/conversion/standardizeModelFieldOrder.m b/conversion/standardizeModelFieldOrder.m index ffe79e11..7c4f5878 100755 --- a/conversion/standardizeModelFieldOrder.m +++ b/conversion/standardizeModelFieldOrder.m @@ -1,17 +1,30 @@ function orderedModel=standardizeModelFieldOrder(model) -% standardizeModelFieldOrder -% Orders fields of RAVEN model structure as specified at -% https://github.com/SysBioChalmers/RAVEN/wiki/RAVEN-Model-Structure +% standardizeModelFieldOrder Order RAVEN model structure fields. % -% Input: model model structure, either RAVEN or COBRA format +% Orders fields of a RAVEN model structure as specified at +% https://github.com/SysBioChalmers/RAVEN/wiki/RAVEN-Model-Structure % -% Output: orderedModel model structure with ordered fields +% The model fields themselves are not changed, only the order is modified. +% For changing model fields between RAVEN and COBRA format, use +% ravenCobraWrapper(). % -% The model fields themselves are not changed, only the order is -% modified. 
For changing model fields between RAVEN and COBRA format, use -% ravenCobraWrapper(). +% Parameters +% ---------- +% model : struct +% model structure, either RAVEN or COBRA format. % -% Usage: orderedModel=standardizeModelFieldOrder(model) +% Returns +% ------- +% orderedModel : struct +% model structure with ordered fields. +% +% Examples +% -------- +% orderedModel = standardizeModelFieldOrder(model); +% +% See also +% -------- +% ravenCobraWrapper ravenPath=findRAVENroot(); diff --git a/curation/curateModelFromTables.m b/curation/curateModelFromTables.m index 742b50bd..8af1d203 100644 --- a/curation/curateModelFromTables.m +++ b/curation/curateModelFromTables.m @@ -1,49 +1,60 @@ function newModel=curateModelFromTables(model,metsInfo,genesInfo,rxnsCoeffs,rxnsInfo,metPrefix,rxnPrefix) -% curateModelFromTables -% Curate existing and/or add new metabolites, reactions and genes -% from tabular data files. Originally extracted from yeast-GEM's -% curateMetsRxnsGenes; generalised here so any GEM project can drive -% batch curation from the same set of *.tsv files. +% curateModelFromTables Curate or add mets, rxns and genes from tables. % -% If the *.tsv files contain metabolites, reactions and/or genes that are -% already present in the model, then information in the model will be -% overwritten. Note that this includes empty annotations in the *.tsv -% files! Metabolites are matched by metaboliteName[comp]; reactions by -% the stoichiometry of its reactants and products; genes by their gene -% name. This function can therefore be used to add new entities in the -% model, or curate those already existing in the model. +% Curate existing and/or add new metabolites, reactions and genes from +% tabular data files. Originally extracted from yeast-GEM's +% curateMetsRxnsGenes; generalised here so any GEM project can drive batch +% curation from the same set of *.tsv files. % -% Input: -% model RAVEN model structure to be curated. 
-% metsInfo path to a *.tsv file with metabolite information, or -% 'none' to skip metabolite curation. Columns: -% metNames, comps, formula, charge, inchi, metNotes, -% then any number of MIRIAM-namespace columns. -% genesInfo path to a *.tsv file with gene information, or -% 'none'. Columns: genes, geneShortNames, then MIRIAM. -% rxnsCoeffs path to a *.tsv file with reaction stoichiometric -% coefficients, or 'none'. Columns: rxnIdx, rxnNames, -% metNames, comps, coefficient. One row per -% (reaction, metabolite) pair. -% rxnsInfo path to a *.tsv file with reaction information, or -% 'none'. Columns: rxnIdx, rxnNames, grRules, lb, ub, -% rev, subSystems, eccodes, rxnNotes, rxnReferences, -% rxnConfidenceScores, then MIRIAM. -% metPrefix prefix used to mint fresh metabolite ids (e.g. 's_' -% for yeast-GEM, 'M_' for the cobrapy/BiGG default). -% Default: 'M_'. -% rxnPrefix prefix used to mint fresh reaction ids. Default: 'R_'. +% If the *.tsv files contain metabolites, reactions and/or genes that are +% already present in the model, then information in the model will be +% overwritten. Note that this includes empty annotations in the *.tsv +% files! Metabolites are matched by metaboliteName[comp]; reactions by the +% stoichiometry of its reactants and products; genes by their gene name. +% This function can therefore be used to add new entities in the model, or +% curate those already existing in the model. % -% Output: -% newModel curated RAVEN model structure. +% Parameters +% ---------- +% model : struct +% RAVEN model structure to be curated. +% metsInfo : char +% Path to a *.tsv file with metabolite information, or 'none' to skip +% metabolite curation. Columns: metNames, comps, formula, charge, +% inchi, metNotes, then any number of MIRIAM-namespace columns. +% genesInfo : char +% Path to a *.tsv file with gene information, or 'none'. Columns: +% genes, geneShortNames, then MIRIAM. 
+% rxnsCoeffs : char +% Path to a *.tsv file with reaction stoichiometric coefficients, or +% 'none'. Columns: rxnIdx, rxnNames, metNames, comps, coefficient. One +% row per (reaction, metabolite) pair. +% rxnsInfo : char +% Path to a *.tsv file with reaction information, or 'none'. Columns: +% rxnIdx, rxnNames, grRules, lb, ub, rev, subSystems, eccodes, +% rxnNotes, rxnReferences, rxnConfidenceScores, then MIRIAM. +% metPrefix : char, optional +% Prefix used to mint fresh metabolite ids (e.g. 's_' for yeast-GEM, +% 'M_' for the cobrapy/BiGG default) (default 'M_'). +% rxnPrefix : char, optional +% Prefix used to mint fresh reaction ids (default 'R_'). % -% The 'everything after the core columns is MIRIAM' convention applies -% to all three info tables: any column whose header is not one of the -% listed core fields is treated as a MIRIAM annotation namespace and -% stored on the matching entity. +% Returns +% ------- +% newModel : struct +% Curated RAVEN model structure. % -% Usage: newModel = curateModelFromTables(model, metsInfo, genesInfo, ... -% rxnsCoeffs, rxnsInfo, metPrefix, rxnPrefix) +% Examples +% -------- +% newModel = curateModelFromTables(model, metsInfo, genesInfo, ... +% rxnsCoeffs, rxnsInfo, metPrefix, rxnPrefix); +% +% Notes +% ----- +% The 'everything after the core columns is MIRIAM' convention applies to +% all three info tables: any column whose header is not one of the listed +% core fields is treated as a MIRIAM annotation namespace and stored on +% the matching entity. if nargin==4 error('Provide both a ''rxnsInfo'' and a ''rxnsCoeffs'' file') diff --git a/gapfilling/canConsume.m b/gapfilling/canConsume.m index b0256d03..1345bd82 100755 --- a/gapfilling/canConsume.m +++ b/gapfilling/canConsume.m @@ -1,17 +1,26 @@ function consumed=canConsume(model,mets) -% canConsume -% Checks which metabolites that can be consumed by a model using the -% specified constraints +% canConsume Check which metabolites can be consumed by a model. 
% -% model a model structure -% mets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% or a vector of indexes to check for (optional, default model.mets) +% Checks which metabolites can be consumed by a model using the specified +% constraints. % -% consumed vector with true if the corresponding metabolite could be -% produced +% Parameters +% ---------- +% model : struct +% a model structure. +% mets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of +% indexes to check for (default model.mets). % -% Usage: consumed=canConsume(model,mets) +% Returns +% ------- +% consumed : logical +% vector with true if the corresponding metabolite could be consumed. +% +% Examples +% -------- +% consumed = canConsume(model, mets); if nargin<2 mets=model.mets; diff --git a/gapfilling/canProduce.m b/gapfilling/canProduce.m index e8bca220..e9ff2ce5 100755 --- a/gapfilling/canProduce.m +++ b/gapfilling/canProduce.m @@ -1,18 +1,31 @@ function produced=canProduce(model,mets) -% canProduce -% Checks which metabolites that can be produced from a model using the -% specified constraints. This is a less advanced but faster version of -% checkProduction. +% canProduce Check which metabolites can be produced from a model. % -% model a model structure -% mets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% or a vector of indexes to check for (optional, default model.mets) +% Checks which metabolites can be produced from a model using the +% specified constraints. This is a less advanced but faster version of +% checkProduction. % -% produced vector with true if the corresponding metabolite could be -% produced +% Parameters +% ---------- +% model : struct +% a model structure.
+% mets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of +% indexes to check for (default model.mets). % -% Usage: produced=canProduce(model,mets) +% Returns +% ------- +% produced : logical +% vector with true if the corresponding metabolite could be produced. +% +% Examples +% -------- +% produced = canProduce(model, mets); +% +% See also +% -------- +% checkProduction if nargin<2 mets=model.mets; diff --git a/gapfilling/checkProduction.m b/gapfilling/checkProduction.m index 13837dd5..b1c53f01 100755 --- a/gapfilling/checkProduction.m +++ b/gapfilling/checkProduction.m @@ -1,46 +1,52 @@ function [notProduced, notProducedNames, neededForProductionMat,minToConnect,model]=checkProduction(model,checkNeededForProduction,excretionFromCompartments,printDetails) -% checkProduction -% Checks which metabolites that can be produced from a model using the -% specified constraints. +% checkProduction Check which metabolites can be produced from a model. % -% model a model structure -% checkNeededForProduction for each of the metabolites that could not -% be produced, include an artificial -% production reaction and calculate which new -% metabolites that could be produced as en -% effect of this (optional, default false) -% excretionFromCompartments cell array with compartment ids from which -% metabolites can be excreted (optional, default -% model.comps) -% printDetails print details to the screen (optional, default -% true) +% Checks which metabolites can be produced from a model using the +% specified constraints.
% -% notProduced cell array with metabolites that could not -% be produced -% notProducedNames cell array with names and compartments for -% metabolites that could not be produced -% neededForProductionMat matrix where n x m is true if metabolite n -% allows for production of metabolite m -% minToConnect structure with the minimal number of -% metabolites that need to be connected in -% order to be able to produce all other -% metabolites and which metabolites each of -% them connects -% model updated model structure with excretion -% reactions added +% The function is intended to be used to identify which metabolites must be +% connected in order to have a fully connected network. It does so by first +% identifying which metabolites could have a net production in the network. +% Then it calculates which other metabolites must be able to have net +% production in order to have production of all metabolites in the network. +% So, if a network contains the equations A[external]->B, C->D, and D->E it +% will identify that production of C will connect the metabolites D and E. % -% The function is intended to be used to identify which metabolites must -% be connected in order to have a fully connected network. It does so by -% first identifying which metabolites could have a net production in the -% network. Then it calculates which other metabolites must be able to -% have net production in order to have production of all metabolites in -% the network. So, if a network contains the equations A[external]->B, -% C->D, and D->E it will identify that production of C will connect -% the metabolites D and E. +% Parameters +% ---------- +% model : struct +% a model structure. +% checkNeededForProduction : logical, optional +% for each of the metabolites that could not be produced, include an +% artificial production reaction and calculate which new metabolites +% could be produced as an effect of this (default false).
+% excretionFromCompartments : cell, optional +% cell array with compartment ids from which metabolites can be excreted +% (default model.comps). +% printDetails : logical, optional +% print details to the screen (default true). % -% Usage: [notProduced, notProducedNames,neededForProductionMat,minToConnect,model]=... -% checkProduction(model,checkNeededForProduction,... -% excretionFromCompartments,printDetails) +% Returns +% ------- +% notProduced : double +% indices of metabolites that could not be produced. +% notProducedNames : cell +% cell array with names and compartments for metabolites that could not +% be produced. +% neededForProductionMat : logical +% matrix where n x m is true if metabolite n allows for production of +% metabolite m. +% minToConnect : cell +% the minimal number of metabolites that need to be connected in order to +% be able to produce all other metabolites, and which metabolites each of +% them connects. +% model : struct +% updated model structure with excretion reactions added. +% +% Examples +% -------- +% [notProduced, notProducedNames, neededForProductionMat, minToConnect, model] = ... +% checkProduction(model, checkNeededForProduction, excretionFromCompartments, printDetails); if nargin<2 checkNeededForProduction=false; diff --git a/gapfilling/checkRxn.m b/gapfilling/checkRxn.m index 1334514c..9874d7f4 100755 --- a/gapfilling/checkRxn.m +++ b/gapfilling/checkRxn.m @@ -1,26 +1,38 @@ function report=checkRxn(model,rxn,cutoff,revDir,printReport) -% checkRxn -% Checks which reactants in a reaction that can be synthesized and which -% products that can be consumed. This is primarily for debugging -% reactions which cannot have flux +% checkRxn Check which reactants can be synthesized and products consumed. 
% -% model a model structure -% rxn the id of one reaction to check -% cutoff minimal flux for successful production/consumption (optional, -% default 10^-7) -% revDir true if the reaction should be reversed (optional, default -% false) -% printReport print a report (optional, default true) +% Checks which reactants in a reaction can be synthesized and which +% products can be consumed. This is primarily for debugging reactions +% which cannot have flux. % -% report -% reactants array with reactant indexes -% canMake boolean array, true if the corresponding reactant can -% be synthesized by the rest of the metabolic network -% products array with product indexes -% canConsume boolean array, true if the corresponding product can -% be consumed by the rest of the metabolic network +% Parameters +% ---------- +% model : struct +% a model structure. +% rxn : char +% the id of one reaction to check. +% cutoff : double, optional +% minimal flux for successful production/consumption (default 10^-7). +% revDir : logical, optional +% true if the reaction should be reversed (default false). +% printReport : logical, optional +% print a report (default true).
% -% Usage: report=checkRxn(model,rxn,cutoff,revDir,printReport) +% Returns +% ------- +% report : struct +% report with fields: +% +% - reactants : array with reactant indexes +% - canMake : boolean array, true if the corresponding reactant can be +% synthesized by the rest of the metabolic network +% - products : array with product indexes +% - canConsume : boolean array, true if the corresponding product can be +% consumed by the rest of the metabolic network +% +% Examples +% -------- +% report = checkRxn(model, rxn, cutoff, revDir, printReport); rxn=char(rxn); if nargin<3 diff --git a/gapfilling/consumeSomething.m b/gapfilling/consumeSomething.m index 5ec896cb..d0b87616 100755 --- a/gapfilling/consumeSomething.m +++ b/gapfilling/consumeSomething.m @@ -1,44 +1,55 @@ function [solution, metabolite]=consumeSomething(model,ignoreMets,isNames,minNrFluxes,params,ignoreIntBounds) -% consumeSomething -% Tries to consume any metabolite using as few reactions as possible. -% The intended use is when you want to make sure that you model cannot -% consume anything without producing something. It is intended to be used -% with no active exchange reactions. +% consumeSomething Try to consume any metabolite using as few reactions as possible. % -% model a model structure -% ignoreMets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% of a vector of indexes for metabolites to exclude from -% this analysis (optional, default []) -% isNames true if the supplied mets represent metabolite names -% (as opposed to IDs). This is a way to delete -% metabolites in several compartments at once without -% knowing the exact IDs. This only works if ignoreMets -% is a cell array (optional, default false) -% minNrFluxes solves the MILP problem of minimizing the number of -% fluxes instead of the sum. 
Slower, but can be -% used if the sum gives too many fluxes (optional, default -% false) -% params *obsolete option* -% ignoreIntBounds true if internal bounds (including reversibility) -% should be ignored. Exchange reactions are not affected. -% This can be used to find unbalanced solutions which are -% not possible using the default constraints (optional, -% default false) +% The intended use is when you want to make sure that your model cannot +% consume anything without producing something. It is intended to be used +% with no active exchange reactions. % -% solution flux vector for the solution -% metabolite the index of the metabolite(s) which was consumed. If -% possible only one metabolite is reported, but there are -% situations where metabolites can only be consumed in -% pairs (or more) +% Parameters +% ---------- +% model : struct +% a model structure. +% ignoreMets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of indexes +% for metabolites to exclude from this analysis (default []). +% isNames : logical, optional +% true if the supplied mets represent metabolite names (as opposed to +% IDs). This is a way to delete metabolites in several compartments at +% once without knowing the exact IDs. This only works if ignoreMets is a +% cell array (default false). +% minNrFluxes : logical, optional +% solves the MILP problem of minimizing the number of fluxes instead of +% the sum. Slower, but can be used if the sum gives too many fluxes +% (default false). +% params : struct, optional +% *obsolete option*. +% ignoreIntBounds : logical, optional +% true if internal bounds (including reversibility) should be ignored. +% Exchange reactions are not affected. This can be used to find +% unbalanced solutions which are not possible using the default +% constraints (default false). 
% -% NOTE: This works by forcing at least 1 unit of "any metabolites" to be -% consumed and then minimize for the sum of fluxes. If more than one -% metabolite is consumed, it picks one of them to be consumed and then -% minimizes for the sum of fluxes. +% Returns +% ------- +% solution : double +% flux vector for the solution. +% metabolite : double +% the index of the metabolite(s) that were consumed. If possible only one +% metabolite is reported, but there are situations where metabolites can +% only be consumed in pairs (or more). % -% Usage: [solution, metabolite]=consumeSomething(model,ignoreMets,isNames,... -% minNrFluxes,params,ignoreIntBounds) +% Examples +% -------- +% [solution, metabolite] = consumeSomething(model, ignoreMets, isNames, ... +% minNrFluxes, params, ignoreIntBounds); +% +% Notes +% ----- +% This works by forcing at least 1 unit of "any metabolites" to be consumed +% and then minimizing the sum of fluxes. If more than one metabolite is +% consumed, it picks one of them to be consumed and then minimizes the +% sum of fluxes. if nargin<2 ignoreMets=[]; diff --git a/gapfilling/fillGaps.m b/gapfilling/fillGaps.m index da0d81c9..38bed192 100755 --- a/gapfilling/fillGaps.m +++ b/gapfilling/fillGaps.m @@ -1,66 +1,76 @@ function [newConnected, cannotConnect, addedRxns, newModel, exitFlag]=fillGaps(model,models,allowNetProduction,useModelConstraints,supressWarnings,rxnScores) -% fillGaps -% Uses template model(s) to fill gaps in a model +% fillGaps Use template model(s) to fill gaps in a model. % -% model a model structure that may contains gaps to be filled -% models a cell array of reference models or a model structure. -% The gaps will be filled using reactions from these models -% allowNetProduction true if net production of all metabolites is -% allowed. A reaction can be unable to carry flux because one of -% the reactants is unavailable or because one of the -% products can't be further processed.
If this -% parameter is true, only the first type of -% unconnectivity is considered (optional, default false) -% useModelConstraints true if the constraints specified in the model -% structure should be used. If false then reactions -% included from the template model(s) so that as many -% reactions as possible in model can carry flux -% (optional, default false) -% supressWarnings false if warnings should be displayed (optional, default -% false) -% rxnScores array with scores for each of the reactions in the -% reference model(s). If more than one model is supplied, -% then rxnScores should be a cell array of vectors. -% The solver will try to maximize the sum of the -% scores for the included reactions (optional, default -% is -1 for all reactions) +% This method works by merging the model with the reference model(s) and +% checking which reactions can carry flux. All reactions that can't carry +% flux are removed (cannotConnect). If useModelConstraints is false it +% then solves the MILP problem of minimizing the number of active +% reactions from the reference models that are required to have flux in +% all the reactions in model. This requires that the input model has +% exchange reactions present for the nutrients that are needed for its +% metabolism. If useModelConstraints is true then the problem is to +% include as few reactions as possible from the reference models in order +% to satisfy the model constraints. % -% newConnected cell array with the reactions that could be -% connected. This is not calulated if -% useModelConstraints is true -% cannotConnect cell array with reactions that could not be -% connected. 
This is not calculated if -% useModelConstraints is true -% addedRxns cell array with the reactions that were added from -% "models" -% newModel the model with reactions added to fill gaps -% exitFlag 1: optimal solution found -% -1: no feasible solution found -% -2: optimization time out +% The intended use is that the user can attempt a general gap-filling +% using useModelConstraints=false, or a more targeted gap-filling by +% setting constraints in the model structure and then using +% useModelConstraints=true. For example, to include reactions so that all +% biomass components can be synthesized, the user could define a biomass +% equation and set its lower bound to >0. Running this function with +% useModelConstraints=true would then give the smallest set of reactions +% that have to be included in order for the model to produce biomass. % -% This method works by merging the model to the reference model(s) and -% checking which reactions can carry flux. All reactions that can't -% carry flux are removed (cannotConnect). -% If useModelConstraints is false it then solves the MILP problem of -% minimizing the number of active reactions from the reference models -% that are required to have flux in all the reactions in model. This -% requires that the input model has exchange reactions present for the -% nutrients that are needed for its metabolism. If useModelConstraints is -% true then the problem is to include as few reactions as possible from -% the reference models in order to satisfy the model constraints. -% The intended use is that the user can attempt a general gap-filling using -% useModelConstraint=false or a more targeted gap-filling by setting -% constraints in the model structure and then use -% useModelConstraints=true. Say that the user want to include reactions -% so that all biomass components can be synthesized. He/she could then -% define a biomass equation and set the lower bound to >0. 
Running this -% function with useModelConstraints=true would then give the smallest set -% of reactions that have to be included in order for the model to produce -% biomass. +% Parameters +% ---------- +% model : struct +% a model structure that may contain gaps to be filled. +% models : cell or struct +% a cell array of reference models or a model structure. The gaps will +% be filled using reactions from these models. +% allowNetProduction : logical, optional +% true if net production of all metabolites is allowed. A reaction can +% be unable to carry flux because one of the reactants is unavailable +% or because one of the products can't be further processed. If true, +% only the first type of unconnectivity is considered (default false). +% useModelConstraints : logical, optional +% true if the constraints specified in the model structure should be +% used. If false then reactions are included from the template +% model(s) so that as many reactions as possible in model can carry +% flux (default false). +% supressWarnings : logical, optional +% false if warnings should be displayed (default false). +% rxnScores : double or cell, optional +% array with scores for each of the reactions in the reference +% model(s). If more than one model is supplied, then rxnScores should +% be a cell array of vectors. The solver will try to maximize the sum +% of the scores for the included reactions (default is -1 for all +% reactions). % -% Usage: [newConnected, cannotConnect, addedRxns, newModel, exitFlag]=... -% fillGaps(model,models,allowNetProduction,useModelConstraints,... -% supressWarnings,rxnScores,params) +% Returns +% ------- +% newConnected : cell +% cell array with the reactions that could be connected. This is not +% calculated if useModelConstraints is true. +% cannotConnect : cell +% cell array with reactions that could not be connected. This is not +% calculated if useModelConstraints is true. 
+% addedRxns : cell +% cell array with the reactions that were added from "models". +% newModel : struct +% the model with reactions added to fill gaps. +% exitFlag : double +% exit status: +% +% - 1 : optimal solution found +% - -1 : no feasible solution found +% - -2 : optimization time out +% +% Examples +% -------- +% [newConnected, cannotConnect, addedRxns, newModel, exitFlag]=... +% fillGaps(model,models,allowNetProduction,useModelConstraints,... +% supressWarnings,rxnScores,params); %If the user only supplied a single template model if ~iscell(models) diff --git a/gapfilling/fitTasks.m b/gapfilling/fitTasks.m index f823d459..e4095ce9 100755 --- a/gapfilling/fitTasks.m +++ b/gapfilling/fitTasks.m @@ -1,40 +1,46 @@ function [outModel, addedRxns]=fitTasks(model,refModel,inputFile,printOutput,rxnScores,taskStructure) -% fitTasks -% Fills gaps in a model by including reactions from a reference model, -% so that the resulting model can perform all the tasks in a task list +% fitTasks Fill gaps in a model so it can perform a list of tasks. % -% Input: -% model model structure -% refModel reference model from which to include reactions -% inputFile a task list in Excel format. See the function -% parseTaskList for details (optional if taskStructure is -% supplied) -% printOutput true if the results of the test should be displayed -% (optional, default true) -% rxnScores scores for each of the reactions in the reference -% model. Only negative scores are allowed. The solver will -% try to maximize the sum of the scores for the included -% reactions (optional, default is -1 for all reactions) -% taskStructure structure with the tasks, as from parseTaskList. If -% this is supplied then inputFile is ignored (optional) +% Fills gaps in a model by including reactions from a reference model, so +% that the resulting model can perform all the tasks in a task list. The +% gap-filling is done in a task-by-task manner, rather than solving for +% all tasks at once. 
This means that the order of the tasks could +% influence the result. % +% Parameters +% ---------- +% model : struct +% a model structure. +% refModel : struct +% reference model from which to include reactions. +% inputFile : char, optional +% a task list in Excel format. See the function parseTaskList for +% details (optional if taskStructure is supplied). +% printOutput : logical, optional +% true if the results of the test should be displayed (default true). +% rxnScores : double, optional +% scores for each of the reactions in the reference model. Only +% negative scores are allowed. The solver will try to maximize the sum +% of the scores for the included reactions (default is -1 for all +% reactions). +% taskStructure : struct, optional +% structure with the tasks, as from parseTaskList. If supplied then +% inputFile is ignored. % -% Output: -% outModel model structure with reactions added to perform the -% tasks -% addedRxns MxN matrix with the added reactions (M) from refModel -% for each task (N). An element is true if the corresponding -% reaction is added in the corresponding task. -% Failed tasks and SHOULD FAIL tasks are ignored +% Returns +% ------- +% outModel : struct +% model structure with reactions added to perform the tasks. +% addedRxns : logical +% MxN matrix with the added reactions (M) from refModel for each task +% (N). An element is true if the corresponding reaction is added in +% the corresponding task. Failed tasks and SHOULD FAIL tasks are +% ignored. % -% This function fills gaps in a model by using a reference model, so -% that the resulting model can perform a list of metabolic tasks. The -% gap-filling is done in a task-by-task manner, rather than solving for -% all tasks at once. This means that the order of the tasks could influence -% the result. -% -% Usage: [outModel, addedRxns]=fitTasks(model,refModel,inputFile,printOutput,... 
-% rxnScores,taskStructure) +% Examples +% -------- +% [outModel, addedRxns]=fitTasks(model,refModel,inputFile,printOutput,... +% rxnScores,taskStructure); if nargin<4 printOutput=true; diff --git a/gapfilling/gapReport.m b/gapfilling/gapReport.m index e4bccec3..b42eda99 100755 --- a/gapfilling/gapReport.m +++ b/gapfilling/gapReport.m @@ -1,45 +1,51 @@ function [noFluxRxns, noFluxRxnsRelaxed, subGraphs, notProducedMets, minToConnect,... neededForProductionMat, canProduceWithoutInput, canConsumeWithoutOutput, ... connectedFromTemplates, addedFromTemplates]=gapReport(model, templateModels) -% gapReport -% Performs a gap analysis and summarizes the results +% gapReport Perform a gap analysis and summarize the results. % -% model a model structure -% templateModels a cell array of template models to use for -% gap filling (optional) +% Parameters +% ---------- +% model : struct +% a model structure. +% templateModels : cell, optional +% a cell array of template models to use for gap filling. 
% -% noFluxRxns cell array with reactions that cannot carry -% flux -% noFluxRxnsRelaxed cell array with reactions that cannot carry -% flux even if the mass balance constraint is -% relaxed so that it is allowed to have -% net production of all metabolites -% subGraphs structure with the metabolites in each of -% the isolated sub networks -% notProducedMets cell array with the metabolites that -% couldn't have net production -% minToConnect structure with the minimal number of -% metabolites that need to be connected in -% order to be able to produce all other -% metabolites and which metabolites each of -% them connects -% neededForProductionMat matrix where n x m is true if metabolite n -% allows for production of metabolite m -% canProduceWithoutInput cell array with metabolites that could be -% produced even when there is no input to the -% model -% canConsumeWithoutOutput cell array with metabolites that could be -% consumed even when there is no output from -% the model -% connectedFromTemplates cell array with the reactions that could be -% connected using the template models -% addedFromTemplates structure with the reactions that were -% added from the template models and which -% model they were added from +% Returns +% ------- +% noFluxRxns : cell +% reactions that cannot carry flux. +% noFluxRxnsRelaxed : cell +% reactions that cannot carry flux even if the mass balance +% constraint is relaxed so that net production of all metabolites is +% allowed. +% subGraphs : struct +% the metabolites in each of the isolated sub-networks. +% notProducedMets : cell +% the metabolites that could not have net production. +% minToConnect : struct +% the minimal number of metabolites that need to be connected in +% order to be able to produce all other metabolites, and which +% metabolites each of them connects. +% neededForProductionMat : double +% matrix where n x m is true if metabolite n allows for production of +% metabolite m. 
+% canProduceWithoutInput : cell +% metabolites that could be produced even when there is no input to +% the model. +% canConsumeWithoutOutput : cell +% metabolites that could be consumed even when there is no output from +% the model. +% connectedFromTemplates : cell +% the reactions that could be connected using the template models. +% addedFromTemplates : struct +% the reactions that were added from the template models and which +% model they were added from. % -% Usage: [noFluxRxns, noFluxRxnsRelaxed, subGraphs, notProducedMets, minToConnect,... -% neededForProductionMat, connectedFromTemplates, addedFromTemplates]=... -% gapReport(model, templateModels) +% Examples +% -------- +% [noFluxRxns, noFluxRxnsRelaxed, subGraphs, notProducedMets, minToConnect, ... +% neededForProductionMat, canProduceWithoutInput, canConsumeWithoutOutput, ... +% connectedFromTemplates, addedFromTemplates] = gapReport(model, templateModels); if nargin<2 templateModels=[]; diff --git a/gapfilling/makeSomething.m b/gapfilling/makeSomething.m index 1b9b2ab8..11fa58ed 100755 --- a/gapfilling/makeSomething.m +++ b/gapfilling/makeSomething.m @@ -1,46 +1,57 @@ function [solution, metabolite]=makeSomething(model,ignoreMets,isNames,minNrFluxes,allowExcretion,params,ignoreIntBounds) -% makeSomething -% Tries to excrete any metabolite using as few reactions as possible. -% The intended use is when you want to make sure that you model cannot -% synthesize anything from nothing. It is then a faster way than to use -% checkProduction or canProduce +% makeSomething Excrete any metabolite using as few reactions as possible. % -% model a model structure -% ignoreMets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% of a vector of indexes for metabolites to exclude from -% this analysis (optional, default []) -% isNames true if the supplied mets represent metabolite names -% (as opposed to IDs).
This is a way to delete -% metabolites in several compartments at once without -% knowing the exact IDs. This only works if ignoreMets -% is a cell array (optional, default false) -% minNrFluxes solves the MILP problem of minimizing the number of -% fluxes instead of the sum. Slower, but can be -% used if the sum gives too many fluxes (optional, default -% false) -% allowExcretion allow for excretion of all other metabolites (optional, -% default true) -% params *obsolete option* -% ignoreIntBounds true if internal bounds (including reversibility) -% should be ignored. Exchange reactions are not affected. -% This can be used to find unbalanced solutions which are -% not possible using the default constraints (optional, -% default false) +% Tries to excrete any metabolite using as few reactions as possible. The +% intended use is when you want to make sure that your model cannot +% synthesize anything from nothing. It is then a faster way than to use +% checkProduction or canProduce. % -% solution flux vector for the solution -% metabolite the index of the metabolite(s) which was excreted. If -% possible only one metabolite is reported, but there are -% situations where metabolites can only be excreted in -% pairs (or more) +% Parameters +% ---------- +% model : struct +% a model structure. +% ignoreMets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of indexes +% for metabolites to exclude from this analysis (default []). +% isNames : logical, optional +% true if the supplied mets represent metabolite names (as opposed to +% IDs). This is a way to delete metabolites in several compartments at +% once without knowing the exact IDs. This only works if ignoreMets is a +% cell array (default false). +% minNrFluxes : logical, optional +% solves the MILP problem of minimizing the number of fluxes instead of +% the sum.
Slower, but can be used if the sum gives too many fluxes +% (default false). +% allowExcretion : logical, optional +% allow for excretion of all other metabolites (default true). +% params : struct, optional +% *obsolete option*. +% ignoreIntBounds : logical, optional +% true if internal bounds (including reversibility) should be ignored. +% Exchange reactions are not affected. This can be used to find +% unbalanced solutions which are not possible using the default +% constraints (default false). % -% NOTE: This works by forcing at least 1 unit of "any metabolites" to be -% produced and then minimize for the sum of fluxes. If more than one -% metabolite is produced, it picks one of them to be produced and then -% minimizes for the sum of fluxes. +% Returns +% ------- +% solution : double +% flux vector for the solution. +% metabolite : double +% the index of the metabolite(s) which was excreted. If possible only +% one metabolite is reported, but there are situations where metabolites +% can only be excreted in pairs (or more). % -% Usage: [solution, metabolite]=makeSomething(model,ignoreMets,isNames,... -% minNrFluxes,allowExcretion,params,ignoreIntBounds) +% Examples +% -------- +% [solution, metabolite] = makeSomething(model, ignoreMets); +% +% Notes +% ----- +% This works by forcing at least 1 unit of "any metabolites" to be produced +% and then minimize for the sum of fluxes. If more than one metabolite is +% produced, it picks one of them to be produced and then minimizes for the +% sum of fluxes. if nargin<2 ignoreMets=[]; diff --git a/io/SBMLFromExcel.m b/io/SBMLFromExcel.m index b88889cf..115d7148 100755 --- a/io/SBMLFromExcel.m +++ b/io/SBMLFromExcel.m @@ -1,21 +1,29 @@ function SBMLFromExcel(fileName, outputFileName,toCOBRA,printWarnings) -% SBMLFromExcel -% Converts a model in the Excel format to SBML +% SBMLFromExcel Convert a model in the Excel format to SBML. 
% -% fileName the Excel file -% outputFileName the SBML file -% toCOBRA true if the model should be saved in COBRA Toolbox -% format. Only limited support at the moment (optional, -% default false) -% printWarnings true if warnings about model issues should be reported -% (optional, default true) +% For a detailed description of the file format, see the supplied manual. % -% For a detailed description of the file format, see the supplied manual. +% Parameters +% ---------- +% fileName : char +% the Excel file. +% outputFileName : char +% the SBML file. +% toCOBRA : logical, optional +% true if the model should be saved in COBRA Toolbox format. Only +% limited support at the moment (default false). +% printWarnings : logical, optional +% true if warnings about model issues should be reported (default +% true). % -% Usage: SBMLFromExcel(fileName,outputFileName,toCOBRA,printWarnings) +% Examples +% -------- +% SBMLFromExcel(fileName, outputFileName, toCOBRA, printWarnings); % -% NOTE: This is just a wrapper function for importExcelModel, printModelStats -% and exportModel. Use those functions directly for greater control. +% Notes +% ----- +% This is just a wrapper function for importExcelModel, printModelStats and +% exportModel. Use those functions directly for greater control. fileName=char(fileName); outputFileName=char(outputFileName); if nargin<3 diff --git a/io/addJavaPaths.m b/io/addJavaPaths.m index f70ee6ac..87f765d2 100755 --- a/io/addJavaPaths.m +++ b/io/addJavaPaths.m @@ -1,8 +1,8 @@ -% addJavaPaths -% Adds the Apache POI classes to the static Java paths +% addJavaPaths Add the Apache POI classes to the static Java paths. 
% -% Usage: addJavaPaths() - +% Examples +% -------- +% addJavaPaths(); function addJavaPaths() %Get the path to Apache POI ravenPath=findRAVENroot(); diff --git a/io/checkFileExistence.m b/io/checkFileExistence.m index 576674e9..7d50ef2f 100755 --- a/io/checkFileExistence.m +++ b/io/checkFileExistence.m @@ -1,30 +1,37 @@ function files=checkFileExistence(files,fullOrTemp,allowSpace,checkExist) -% checkFileExistence -% Check whether files exist. If no full path is given a file should be -% located in the current folder, which by default is appended to the -% filename. +% checkFileExistence Check whether files exist. % -% Input: -% files string or cell array of strings with path to file(s) or -% path or filename(s) -% fullOrTemp 0: do not change path to file(s) -% 1: return full path to file(s) -% 2: copy file(s) to system default temporary folder and -% return full path -% (optional, default 0) -% allowSpace logical, whether 'space' character is allowed in the -% path (optional, default true) -% checkExist logical, whether file existence should really be -% checked, as this function can also be used to return -% the full path to a new file (optional, default true). Can -% only be set to false if fullOrTemp is set to 1. +% If no full path is given a file should be located in the current folder, +% which by default is appended to the filename. % -% Output: -% files string or cell array of strings with updated paths if -% fullOrTemp was set as 1 or 2, otherwise original paths -% are returned -% -% Usage: files=checkFileExistence(files,fullOrTemp,allowSpace,checkExist) +% Parameters +% ---------- +% files : char or cell +% string or cell array of strings with path to file(s) or path or +% filename(s). 
+% fullOrTemp : double, optional +% controls path handling (default 0): +% +% - 0 : do not change path to file(s) +% - 1 : return full path to file(s) +% - 2 : copy file(s) to system default temporary folder and return +% full path +% allowSpace : logical, optional +% whether the 'space' character is allowed in the path (default true). +% checkExist : logical, optional +% whether file existence should really be checked, as this function can +% also be used to return the full path to a new file (default true). +% Can only be set to false if fullOrTemp is set to 1. +% +% Returns +% ------- +% files : char or cell +% string or cell array of strings with updated paths if fullOrTemp was +% set as 1 or 2, otherwise original paths are returned. +% +% Examples +% -------- +% files = checkFileExistence(files, fullOrTemp, allowSpace, checkExist); if nargin<2 fullOrTemp = 0; diff --git a/io/cleanSheet.m b/io/cleanSheet.m index c2184884..2346d7f6 100755 --- a/io/cleanSheet.m +++ b/io/cleanSheet.m @@ -1,22 +1,34 @@ -% cleanSheet -% Cleans up an Excel sheet by removing empty rows/colums (and some other -% checks) +% cleanSheet Clean up an Excel sheet. % -% raw cell array with the data in the sheet -% removeComments true if commented lines (non-empty first cell in each -% row) should be removed (optional, default true) -% removeOnlyCap remove columns with captions but no other values (optional, -% default false) -% removeNoCap remove columns without captions (optional, default true) -% removeEmptyRows remove rows with no non-empty cells (optional, default true) -% -% raw cleaned version -% keptRows indexes of the kept rows in the original structure -% keptCols indexes of the kept columns in the original structure +% Removes empty rows/columns (and performs some other checks). % -% Usage: [raw,keptRows,keptCols]=cleanSheet(raw,removeComments,removeOnlyCap,... -% removeNoCap,removeEmptyRows) - +% Parameters +% ---------- +% raw : cell +% cell array with the data in the sheet. 
+% removeComments : logical, optional +% true if commented lines (non-empty first cell in each row) should be +% removed (default true). +% removeOnlyCap : logical, optional +% remove columns with captions but no other values (default false). +% removeNoCap : logical, optional +% remove columns without captions (default true). +% removeEmptyRows : logical, optional +% remove rows with no non-empty cells (default true). +% +% Returns +% ------- +% raw : cell +% cleaned version. +% keptRows : double +% indices of the kept rows in the original structure. +% keptCols : double +% indices of the kept columns in the original structure. +% +% Examples +% -------- +% [raw, keptRows, keptCols] = cleanSheet(raw, removeComments, ... +% removeOnlyCap, removeNoCap, removeEmptyRows); function [raw,keptRows,keptCols]=cleanSheet(raw,removeComments,removeOnlyCap,removeNoCap,removeEmptyRows) if nargin<2 removeComments=true; diff --git a/io/exportForGit.m b/io/exportForGit.m index 48448744..bea56a59 100755 --- a/io/exportForGit.m +++ b/io/exportForGit.m @@ -1,38 +1,45 @@ function out=exportForGit(model,prefix,path,formats,mainBranchFlag,subDirs,COBRAtext,neverPrefixIDs) -% exportForGit -% Generates a directory structure and populates this with model files, ready -% to be commited to a Git(Hub) maintained model repository. Writes the model -% as SBML L3V1 FBCv2 (both XML and YAML), COBRA text, Matlab MAT-file -% orthologies in KEGG +% exportForGit Export a model for a Git-maintained model repository. 
% -% model model structure in RAVEN format that should be -% exported -% prefix prefix for all filenames (optional, default 'model') -% path path where the directory structure should be -% generated and populated with all files (optional, -% default to current working directory) -% formats cell array of strings specifying in what file -% formats the model should be exported (optional, -% default to all formats as {'mat', 'txt', 'xlsx', -% 'xml', 'yml'}) -% mainBranchFlag logical, if true, function will error if RAVEN (and -% COBRA if detected) is/are not on the main branch. -% (optional, default false) -% subDirs logical, whether model files for each file format -% should be written in its own subdirectory, with -% 'model' as parent directory, in accordance to the -% standard-GEM repository format. If false, all files -% are stored in the same folder. (optional, default -% true) -% COBRAtext logical, whether the txt file should be in COBRA -% Toolbox format using metabolite IDs, instead of -% metabolite names and compartments. (optional, -% default false) -% neverPrefixIDs true if prefixes are never added to identifiers, -% even if start with e.g. digits. This might result -% in invalid SBML files (optional, default false) +% Generates a directory structure and populates it with model files, ready +% to be committed to a Git(Hub) maintained model repository. Writes the +% model as SBML L3V1 FBCv2 (both XML and YAML), COBRA text, Matlab MAT-file +% and Microsoft Excel formats. % -% Usage: exportForGit(model,prefix,path,formats,mainBranchFlag,subDirs,COBRAtext,COBRAstyle) +% Parameters +% ---------- +% model : struct +% model structure in RAVEN format that should be exported. +% prefix : char, optional +% prefix for all filenames (default 'model'). +% path : char, optional +% path where the directory structure should be generated and populated +% with all files (default current working directory). 
+% formats : cell, optional +% cell array of strings specifying in what file formats the model +% should be exported (default all formats as {'mat', 'txt', 'xlsx', +% 'xml', 'yml'}). +% mainBranchFlag : logical, optional +% if true, function will error if RAVEN (and COBRA if detected) is/are +% not on the main branch (default false). +% subDirs : logical, optional +% whether model files for each file format should be written in their +% own subdirectory, with 'model' as parent directory, in accordance to +% the standard-GEM repository format. If false, all files are stored in +% the same folder (default true). +% COBRAtext : logical, optional +% whether the txt file should be in COBRA Toolbox format using +% metabolite IDs, instead of metabolite names and compartments +% (default false). +% neverPrefixIDs : logical, optional +% true if prefixes are never added to identifiers, even if they start +% with e.g. digits. This might result in invalid SBML files (default +% false). +% +% Examples +% -------- +% exportForGit(model, prefix, path, formats, mainBranchFlag, subDirs, ... +% COBRAtext, neverPrefixIDs); if nargin<8 neverPrefixIDs=false; end diff --git a/io/exportModel.m b/io/exportModel.m index 27c9bb6f..7ef80bce 100755 --- a/io/exportModel.m +++ b/io/exportModel.m @@ -1,22 +1,27 @@ function exportModel(model,fileName,neverPrefix,supressWarnings,sortIds) -% exportModel -% Exports a constraint-based model to an SBML file (L3V1 FBCv2) +% exportModel Export a constraint-based model to an SBML file (L3V1 FBCv2). % -% Input: -% model a model structure -% fileName filename to export the model to. A dialog window -% will open if no file name is specified. -% neverPrefix true if prefixes are never added to identifiers, -% even if start with e.g. digits. This might result -% in invalid SBML files (optional, default false) -% supressWarnings true if warnings should be supressed. 
This might -% results in invalid SBML files, as no checks are -% performed (optional, default false) -% sortIds logical whether metabolites, reactions and genes -% should be sorted alphabetically by their -% identifiers (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% fileName : char +% filename to export the model to. A dialog window will open if no file +% name is specified. +% neverPrefix : logical, optional +% true if prefixes are never added to identifiers, even if they start +% with e.g. digits. This might result in invalid SBML files (default +% false). +% supressWarnings : logical, optional +% true if warnings should be suppressed. This might result in invalid +% SBML files, as no checks are performed (default false). +% sortIds : logical, optional +% whether metabolites, reactions and genes should be sorted +% alphabetically by their identifiers (default false). % -% Usage: exportModel(model,fileName,neverPrefix,supressWarnings,sortIds) +% Examples +% -------- +% exportModel(model, fileName, neverPrefix, supressWarnings, sortIds); if nargin<2 || isempty(fileName) [fileName, pathName] = uiputfile({'*.xml;*.sbml'}, 'Select file for model export',[model.id '.xml']); diff --git a/io/exportModelToSIF.m b/io/exportModelToSIF.m index fe312f12..b72c4ff6 100755 --- a/io/exportModelToSIF.m +++ b/io/exportModelToSIF.m @@ -1,19 +1,26 @@ function exportModelToSIF(model,fileName,graphType,rxnLabels,metLabels) -% exportModelToSIF -% Exports a constraint-based model to a SIF file +% exportModelToSIF Export a constraint-based model to a SIF file. 
% -% model a model structure -% fileName the filename to export the model to -% graphType the type of graph to export to (optional, default 'rc') -% 'rc' reaction-compound -% 'rr' reaction-reaction -% 'cc' compound-compound -% rxnLabels cell array with labels for reactions (optional, default -% model.rxns) -% metLabels cell array with labels for metabolites (optional, default -% model.mets) +% Parameters +% ---------- +% model : struct +% a model structure. +% fileName : char +% the filename to export the model to. +% graphType : char, optional +% the type of graph to export to (default 'rc'): % -% Usage: exportModelToSIF(model,fileName,graphType,rxnLabels,metLabels) +% - 'rc' : reaction-compound +% - 'rr' : reaction-reaction +% - 'cc' : compound-compound +% rxnLabels : cell, optional +% cell array with labels for reactions (default model.rxns). +% metLabels : cell, optional +% cell array with labels for metabolites (default model.mets). +% +% Examples +% -------- +% exportModelToSIF(model, fileName, graphType, rxnLabels, metLabels); fileName=char(fileName); if nargin<3 graphType='rc'; diff --git a/io/exportToExcelFormat.m b/io/exportToExcelFormat.m index 9337e76e..f1bf9fc6 100755 --- a/io/exportToExcelFormat.m +++ b/io/exportToExcelFormat.m @@ -1,19 +1,23 @@ function exportToExcelFormat(model,fileName,sortIds) -% exportToExcelFormat -% Exports a model structure to the Microsoft Excel model format +% exportToExcelFormat Export a model to the Microsoft Excel model format. % -% Input: -% model a model structure -% fileName file name of the Excel file. Only xlsx format is supported. -% In order to preserve backward compatibility this could also -% be only a path, in which case the model is exported to a -% set of tab-delimited text files via exportToTabDelimited. -% A dialog window will open if fileName is empty. 
-% sortIds logical whether metabolites, reactions and genes should be -% sorted alphabetically by their identifiers (optional, -% default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% fileName : char +% file name of the Excel file. Only xlsx format is supported. In order +% to preserve backward compatibility this could also be only a path, in +% which case the model is exported to a set of tab-delimited text files +% via exportToTabDelimited. A dialog window will open if fileName is +% empty. +% sortIds : logical, optional +% whether metabolites, reactions and genes should be sorted +% alphabetically by their identifiers (default false). % -% Usage: exportToExcelFormat(model, fileName, sortIds) +% Examples +% -------- +% exportToExcelFormat(model, fileName, sortIds); if nargin<2 || isempty(fileName) [fileName, pathName] = uiputfile('*.xlsx', 'Select file for model export',[model.id '.xlsx']); diff --git a/io/exportToTabDelimited.m b/io/exportToTabDelimited.m index 98ad6bb7..c8609863 100755 --- a/io/exportToTabDelimited.m +++ b/io/exportToTabDelimited.m @@ -1,22 +1,29 @@ function exportToTabDelimited(model,path,sortIds) -% exportToTabDelimited -% Exports a model structure to a set of tab-delimited text files +% exportToTabDelimited Export a model to tab-delimited text files. % -% model a model structure -% path the path to export to. The resulting text files will be saved -% under the names excelRxns.txt, excelMets.txt, excelGenes.txt, -% excelModel.txt, and excelComps.txt -% sortIds logical whether metabolites, reactions and genes should be -% sorted alphabetically by their identifiers (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% path : char, optional +% the path to export to. The resulting text files will be saved under +% the names excelRxns.txt, excelMets.txt, excelGenes.txt, +% excelModel.txt, and excelComps.txt (default './'). 
+% sortIds : logical, optional +% whether metabolites, reactions and genes should be sorted +% alphabetically by their identifiers (default false). % -% NOTE: This functionality was previously a part of exportToExcelFormat. -% The naming of the resulting text files is to preserve backward -% compatibility +% Examples +% -------- +% exportToTabDelimited(model, path, sortIds); % -% NOTE: No checks are made regarding the correctness of the model. Use -% checkModelStruct to identify problems in the model structure +% Notes +% ----- +% This functionality was previously a part of exportToExcelFormat. The +% naming of the resulting text files is to preserve backward compatibility. % -% Usage: exportToTabDelimited(model,path,sortIds) +% No checks are made regarding the correctness of the model. Use +% checkModelStruct to identify problems in the model structure. if nargin<2 path='./'; diff --git a/io/getFullPath.m b/io/getFullPath.m index a766c49e..0b3bfee6 100755 --- a/io/getFullPath.m +++ b/io/getFullPath.m @@ -1,57 +1,66 @@ function File = getFullPath(File, Style) -% getFullPath - Get absolute canonical path of a file or folder -% Absolute path names are safer than relative paths, when e.g. a GUI or TIMER -% callback changes the current directory. Only canonical paths without "." and -% ".." can be recognized uniquely. -% Long path names (>259 characters) require a magic initial key "\\?\" to be -% handled by Windows API functions, e.g. for Matlab's FOPEN, DIR and EXIST. +% getFullPath Get the absolute canonical path of a file or folder. % -% FullName = getFullPath(Name, Style) -% INPUT: -% Name: String or cell string, absolute or relative name of a file or -% folder. The path need not exist. Unicode strings, UNC paths and long -% names are supported. -% Style: Style of the output as string, optional, default: 'auto'. -% 'auto': Add '\\?\' or '\\?\UNC\' for long names on demand. -% 'lean': Magic string is not added. -% 'fat': Magic string is added for short names also. 
-% The Style is ignored when not running under Windows. +% Absolute path names are safer than relative paths, when e.g. a GUI or +% TIMER callback changes the current directory. Only canonical paths +% without "." and ".." can be recognized uniquely. Long path names (>259 +% characters) require a magic initial key "\\?\" to be handled by Windows +% API functions, e.g. for Matlab's FOPEN, DIR and EXIST. % -% OUTPUT: -% FullName: Absolute canonical path name as string or cell string. -% For empty strings the current directory is replied. -% '\\?\' or '\\?\UNC' is added on demand. +% Parameters +% ---------- +% File : char or cell +% absolute or relative name of a file or folder, as a string or cell +% string. The path need not exist. Unicode strings, UNC paths and long +% names are supported. +% Style : char, optional +% style of the output (default 'auto'). Ignored when not running under +% Windows. One of: % -% NOTE: The M- and the MEX-version create the same results, the faster MEX -% function works under Windows only. -% Some functions of the Windows-API still do not support long file names. -% E.g. the Recycler and the Windows Explorer fail even with the magic '\\?\' -% prefix. Some functions of Matlab accept 260 characters (value of MAX_PATH), -% some at 259 already. Don't blame me. -% The 'fat' style is useful e.g. when Matlab's DIR command is called for a -% folder with les than 260 characters, but together with the file name this -% limit is exceeded. Then "dir(getFullPath([folder, '\*.*], 'fat'))" helps. +% - 'auto' : add '\\?\' or '\\?\UNC\' for long names on demand. +% - 'lean' : magic string is not added. +% - 'fat' : magic string is added for short names also. 
% -% EXAMPLES: -% cd(tempdir); % Assumed as 'C:\Temp' here -% getFullPath('File.Ext') % 'C:\Temp\File.Ext' -% getFullPath('..\File.Ext') % 'C:\File.Ext' -% getFullPath('..\..\File.Ext') % 'C:\File.Ext' -% getFullPath('.\File.Ext') % 'C:\Temp\File.Ext' -% getFullPath('*.txt') % 'C:\Temp\*.txt' -% getFullPath('..') % 'C:\' -% getFullPath('..\..\..') % 'C:\' -% getFullPath('Folder\') % 'C:\Temp\Folder\' -% getFullPath('D:\A\..\B') % 'D:\B' -% getFullPath('\\Server\Folder\Sub\..\File.ext') -% % '\\Server\Folder\File.ext' -% getFullPath({'..', 'new'}) % {'C:\', 'C:\Temp\new'} -% getFullPath('.', 'fat') % '\\?\C:\Temp\File.Ext' +% Returns +% ------- +% File : char or cell +% absolute canonical path name as a string or cell string. For empty +% strings the current directory is replied. '\\?\' or '\\?\UNC' is +% added on demand. +% +% Examples +% -------- +% cd(tempdir); % Assumed as 'C:\Temp' here +% getFullPath('File.Ext') % 'C:\Temp\File.Ext' +% getFullPath('..\File.Ext') % 'C:\File.Ext' +% getFullPath('..\..\File.Ext') % 'C:\File.Ext' +% getFullPath('.\File.Ext') % 'C:\Temp\File.Ext' +% getFullPath('*.txt') % 'C:\Temp\*.txt' +% getFullPath('..') % 'C:\' +% getFullPath('..\..\..') % 'C:\' +% getFullPath('Folder\') % 'C:\Temp\Folder\' +% getFullPath('D:\A\..\B') % 'D:\B' +% getFullPath('\\Server\Folder\Sub\..\File.ext') +% % '\\Server\Folder\File.ext' +% getFullPath({'..', 'new'}) % {'C:\', 'C:\Temp\new'} +% getFullPath('.', 'fat') % '\\?\C:\Temp\File.Ext' +% +% Notes +% ----- +% The M- and the MEX-version create the same results, the faster MEX +% function works under Windows only. Some functions of the Windows-API +% still do not support long file names. E.g. the Recycler and the Windows +% Explorer fail even with the magic '\\?\' prefix. Some functions of Matlab +% accept 260 characters (value of MAX_PATH), some at 259 already. Don't +% blame me. The 'fat' style is useful e.g. 
when Matlab's DIR command is +% called for a folder with less than 260 characters, but together with the +% file name this limit is exceeded. Then "dir(getFullPath([folder, +% '\*.*], 'fat'))" helps. % % COMPILE: -% Automatic: InstallMex getFullPath.c uTest_getFullPath -% Manual: mex -O getFullPath.c -% Download: http://www.n-simon.de/mex +% Automatic: InstallMex getFullPath.c uTest_getFullPath +% Manual: mex -O getFullPath.c +% Download: http://www.n-simon.de/mex % Run the unit-test uTest_getFullPath after compiling. % % Tested: Matlab 6.5, 7.7, 7.8, 7.13, WinXP/32, Win7/64 @@ -59,7 +68,9 @@ % Assumed Compatibility: higher Matlab versions % Author: Jan Simon, Heidelberg, (C) 2009-2013 matlab.THISYEAR(a)nMINUSsimon.de % -% See also: CD, FULLFILE, FILEPARTS. +% See also +% -------- +% cd, fullfile, fileparts % $JRev: R-G V:032 Sum:7Xd/JS0+yfax Date:15-Jan-2013 01:06:12 $ % $License: BSD (use/copy/change/redistribute on own risk, mention the author) $ diff --git a/io/getMD5Hash.m b/io/getMD5Hash.m index 36c9bb35..caafcc1c 100755 --- a/io/getMD5Hash.m +++ b/io/getMD5Hash.m @@ -1,20 +1,25 @@ function md5Hash=getMD5Hash(inputFile,binEnd) -% getMD5Hash -% Calculates MD5 hash for a file +% getMD5Hash Calculate the MD5 hash for a file. % -% Input: -% inputFile string with the path to file for which MD5 hash should -% be calculated -% binEnd string that shows the operating system running in the -% client's computer. Use ".exe" for Windows, ".mac" for -% macOS or leave it blank for Linux (""). (optional, by -% default the function automatically detects the client's -% operating system) +% Parameters +% ---------- +% inputFile : char +% string with the path to the file for which the MD5 hash should be +% calculated. +% binEnd : char, optional +% string that indicates the operating system running on the client's +% computer. Use ".exe" for Windows, ".mac" for macOS or leave it blank +% for Linux (""). 
(default: the function automatically detects the +% client's operating system). % -% Output: -% md5Hash string containing an MD5 hash for inputFile -% -% Usage: md5Hash=getMD5Hash(inputFile,binEnd) +% Returns +% ------- +% md5Hash : char +% string containing an MD5 hash for inputFile. +% +% Examples +% -------- +% md5Hash = getMD5Hash(inputFile, binEnd); inputFile=char(inputFile); if nargin<2 diff --git a/io/getToolboxVersion.m b/io/getToolboxVersion.m index 964e0834..7de954de 100755 --- a/io/getToolboxVersion.m +++ b/io/getToolboxVersion.m @@ -1,18 +1,29 @@ function version = getToolboxVersion(toolbox,fileID,mainBranchFlag) -% getToolboxVersion -% Returns the version of a given toolbox, or if not available the latest -% commit hash (7 characters). +% getToolboxVersion Return the version of a given toolbox. % -% toolbox string with the toolbox name (e.g. "RAVEN") -% fileID string with the name of a file that is only found in -% the corresponding toolbox (e.g. "ravenCobraWrapper.m"). -% mainBranchFlag logical, if true, function will error if the toolbox is -% not on the main branch (optional, default false). +% Returns the version of a given toolbox, or if not available the latest +% commit hash (7 characters). % -% version string containing either the toolbox version or latest -% commit hash (7 characters). +% Parameters +% ---------- +% toolbox : char +% string with the toolbox name (e.g. "RAVEN"). +% fileID : char +% string with the name of a file that is only found in the +% corresponding toolbox (e.g. "ravenCobraWrapper.m"). +% mainBranchFlag : logical, optional +% if true, the function will error if the toolbox is not on the main +% branch (default false). % -% Usage: version = getToolboxVersion(toolbox,fileID,mainBranchFlag) +% Returns +% ------- +% version : char +% string containing either the toolbox version or latest commit hash +% (7 characters). 
+% +% Examples +% -------- +% version = getToolboxVersion(toolbox, fileID, mainBranchFlag); toolbox=char(toolbox); fileID=char(fileID); diff --git a/io/importExcelModel.m b/io/importExcelModel.m index 0c4c0784..8edfd188 100755 --- a/io/importExcelModel.m +++ b/io/importExcelModel.m @@ -1,79 +1,98 @@ function model=importExcelModel(fileName,removeExcMets,printWarnings,ignoreErrors) -% importExcelModel -% Imports a constraint-based model from a Excel file +% importExcelModel Import a constraint-based model from an Excel file. % -% fileName a Microsoft Excel file to import -% removeExcMets true if exchange metabolites should be removed. This is -% needed to be able to run simulations, but it could also -% be done using simplifyModel at a later stage (optional, -% default true) -% printWarnings true if warnings should be printed (optional, default true) -% ignoreErrors true if errors should be ignored. See below for details -% (optional, default false) +% Loads models in the RAVEN Toolbox Excel format. 
% -% model -% annotation -% taxonomy String with the NCBI Taxonomy ID, as valid -% identifiers.org annotation -% defaultLB Double with the default lower bound values for reactions -% defaultUB Double with the default upper bound values for reactions -% givenName String with the name of the main model author -% familyName String with the surname of the main model author -% email String with the e-mail address of the main model author -% organization String with the organization of the main model author -% note String with additional comments about the model -% name name of model -% id model ID -% rxns reaction ids -% mets metabolite ids -% S stoichiometric matrix -% lb lower bounds -% ub upper bounds -% rev reversibility vector -% c objective coefficients -% b equality constraints for the metabolite equations -% comps compartment ids -% compNames compartment names -% compOutside the id (as in comps) for the compartment -% surrounding each of the compartments -% compMiriams structure with MIRIAM information about the -% compartments -% rxnNames reaction name -% rxnComps compartments for reactions -% grRules reaction to gene rules in text form -% rxnGeneMat reaction-to-gene mapping in sparse matrix form -% subSystems subsystem name for each reaction -% eccodes EC-codes for the reactions -% rxnMiriams structure with MIRIAM information about the reactions -% rxnNotes reaction notes -% rxnReferences reaction references -% rxnConfidenceScores reaction confidence scores -% genes list of all genes -% geneComps compartments for genes -% geneMiriams structure with MIRIAM information about the genes -% geneShortNames gene alternative names (e.g. 
ERG10) -% metNames metabolite name -% metComps compartments for metabolites -% inchis InChI-codes for metabolites -% metFormulas metabolite chemical formula -% metMiriams structure with MIRIAM information about the metabolites -% metCharges metabolite charge -% unconstrained true if the metabolite is an exchange metabolite +% Parameters +% ---------- +% fileName : char +% a Microsoft Excel file to import. +% removeExcMets : logical, optional +% true if exchange metabolites should be removed. This is needed to be +% able to run simulations, but it could also be done using +% simplifyModel at a later stage (default true). +% printWarnings : logical, optional +% true if warnings should be printed (default true). +% ignoreErrors : logical, optional +% true if errors should be ignored. See Notes for details (default +% false). % -% Loads models in the RAVEN Toolbox Excel format. A number of consistency -% checks are performed in order to ensure that the model is valid. These -% can be ignored by putting ignoreErrors to true. However, this is highly -% advised against, as it can result in errors in simulations or other -% functionalities. The RAVEN Toolbox is made to function only on consistent -% models, and the only checks performed are when the model is imported. +% Returns +% ------- +% model : struct +% imported model structure with fields: % -% NOTE: Most errors are checked for by checkModelStruct, but some -% are checked for in this function as well. Those are ones which relate -% to missing model elements and so on, and which would make it impossible -% to construct the model structure. Those errors cannot be ignored by -% setting ignoreErrors to true. 
+% - annotation : structure with model metadata, with fields: % -% Usage: model=importExcelModel(fileName,removeExcMets,printWarnings,ignoreErrors) +% - taxonomy : String with the NCBI Taxonomy ID, as valid +% identifiers.org annotation +% - defaultLB : Double with the default lower bound values for +% reactions +% - defaultUB : Double with the default upper bound values for +% reactions +% - givenName : String with the name of the main model author +% - familyName : String with the surname of the main model author +% - email : String with the e-mail address of the main model author +% - organization : String with the organization of the main model +% author +% - note : String with additional comments about the model +% +% - name : name of model +% - id : model ID +% - rxns : reaction ids +% - mets : metabolite ids +% - S : stoichiometric matrix +% - lb : lower bounds +% - ub : upper bounds +% - rev : reversibility vector +% - c : objective coefficients +% - b : equality constraints for the metabolite equations +% - comps : compartment ids +% - compNames : compartment names +% - compOutside : the id (as in comps) for the compartment surrounding +% each of the compartments +% - compMiriams : structure with MIRIAM information about the +% compartments +% - rxnNames : reaction name +% - rxnComps : compartments for reactions +% - grRules : reaction to gene rules in text form +% - rxnGeneMat : reaction-to-gene mapping in sparse matrix form +% - subSystems : subsystem name for each reaction +% - eccodes : EC-codes for the reactions +% - rxnMiriams : structure with MIRIAM information about the reactions +% - rxnNotes : reaction notes +% - rxnReferences : reaction references +% - rxnConfidenceScores : reaction confidence scores +% - genes : list of all genes +% - geneComps : compartments for genes +% - geneMiriams : structure with MIRIAM information about the genes +% - geneShortNames : gene alternative names (e.g. 
ERG10) +% - metNames : metabolite name +% - metComps : compartments for metabolites +% - inchis : InChI-codes for metabolites +% - metFormulas : metabolite chemical formula +% - metMiriams : structure with MIRIAM information about the metabolites +% - metCharges : metabolite charge +% - unconstrained : true if the metabolite is an exchange metabolite +% +% Examples +% -------- +% model = importExcelModel(fileName, removeExcMets, printWarnings, ignoreErrors); +% +% Notes +% ----- +% A number of consistency checks are performed in order to ensure that the +% model is valid. These can be ignored by setting ignoreErrors to true. +% However, this is strongly discouraged, as it can result in errors in +% simulations or other functionality. The RAVEN Toolbox is made to +% function only on consistent models, and the only checks performed are +% when the model is imported. +% +% Most errors are checked for by checkModelStruct, but some are checked for +% in this function as well. Those are ones which relate to missing model +% elements and so on, and which would make it impossible to construct the +% model structure. Those errors cannot be ignored by setting ignoreErrors +% to true. fileName=char(fileName); if nargin<2 diff --git a/io/importModel.m b/io/importModel.m index dae96063..df4c487b 100755 --- a/io/importModel.m +++ b/io/importModel.m @@ -1,69 +1,78 @@ function model=importModel(fileName,removeExcMets,removePrefix,supressWarnings) -% importModel -% Import a constraint-based model from an SBML file. +% importModel Import a constraint-based model from an SBML file. % -% Input: -% fileName a SBML file to import. A dialog window will open if -% no file name is specified. -% removeExcMets true if exchange metabolites should be removed. 
This is -% needed to be able to run simulations, but it could also -% be done using simplifyModel at a later stage (optional, -% default true) -% removePrefix true if identifier prefixes should be removed when -% loading the model: G_ for genes, R_ for reactions, -% M_ for metabolites, and C_ for compartments. These are -% only removed if all identifiers of a certain type -% contain the prefix. (optional, default true) -% supressWarnings true if warnings regarding the model structure should -% be supressed (optional, default false) +% Parameters +% ---------- +% fileName : char +% an SBML file to import. A dialog window will open if no file name is +% specified. +% removeExcMets : logical, optional +% true if exchange metabolites should be removed. This is needed to be +% able to run simulations, but it could also be done using +% simplifyModel at a later stage (default true). +% removePrefix : logical, optional +% true if identifier prefixes should be removed when loading the model: +% G_ for genes, R_ for reactions, M_ for metabolites, and C_ for +% compartments. These are only removed if all identifiers of a certain +% type contain the prefix (default true). +% supressWarnings : logical, optional +% true if warnings regarding the model structure should be suppressed +% (default false). 
% -% Output: -% model -% id model ID -% name name of model contents -% annotation additional information about model -% rxns reaction ids -% mets metabolite ids -% S stoichiometric matrix -% lb lower bounds -% ub upper bounds -% rev reversibility vector -% c objective coefficients -% b equality constraints for the metabolite equations -% comps compartment ids -% compNames compartment names -% compOutside the id (as in comps) for the compartment -% surrounding each of the compartments -% compMiriams structure with MIRIAM information about the -% compartments -% rxnNames reaction description -% rxnComps compartments for reactions -% grRules reaction to gene rules in text form -% rxnGeneMat reaction-to-gene mapping in sparse matrix form -% subSystems subsystem name for each reaction -% eccodes EC-codes for the reactions -% rxnMiriams structure with MIRIAM information about the reactions -% rxnNotes reaction notes -% rxnReferences reaction references -% rxnConfidenceScores reaction confidence scores -% genes list of all genes -% geneComps compartments for genes -% geneMiriams structure with MIRIAM information about the genes -% geneShortNames gene alternative names (e.g. 
ERG10) -% proteins protein associated to each gene -% metNames metabolite description -% metComps compartments for metabolites -% inchis InChI-codes for metabolites -% metFormulas metabolite chemical formula -% metMiriams structure with MIRIAM information about the metabolites -% metCharges metabolite charge -% unconstrained true if the metabolite is an exchange metabolite +% Returns +% ------- +% model : struct +% imported model structure with fields: % -% Note: A number of consistency checks are performed in order to ensure that the +% - id : model ID +% - name : name of model contents +% - annotation : additional information about model +% - rxns : reaction ids +% - mets : metabolite ids +% - S : stoichiometric matrix +% - lb : lower bounds +% - ub : upper bounds +% - rev : reversibility vector +% - c : objective coefficients +% - b : equality constraints for the metabolite equations +% - comps : compartment ids +% - compNames : compartment names +% - compOutside : the id (as in comps) for the compartment surrounding +% each of the compartments +% - compMiriams : structure with MIRIAM information about the +% compartments +% - rxnNames : reaction description +% - rxnComps : compartments for reactions +% - grRules : reaction to gene rules in text form +% - rxnGeneMat : reaction-to-gene mapping in sparse matrix form +% - subSystems : subsystem name for each reaction +% - eccodes : EC-codes for the reactions +% - rxnMiriams : structure with MIRIAM information about the reactions +% - rxnNotes : reaction notes +% - rxnReferences : reaction references +% - rxnConfidenceScores : reaction confidence scores +% - genes : list of all genes +% - geneComps : compartments for genes +% - geneMiriams : structure with MIRIAM information about the genes +% - geneShortNames : gene alternative names (e.g. 
ERG10) +% - proteins : protein associated to each gene +% - metNames : metabolite description +% - metComps : compartments for metabolites +% - inchis : InChI-codes for metabolites +% - metFormulas : metabolite chemical formula +% - metMiriams : structure with MIRIAM information about the metabolites +% - metCharges : metabolite charge +% - unconstrained : true if the metabolite is an exchange metabolite +% +% Examples +% -------- +% model = importModel(fileName, removeExcMets, removePrefix, supressWarnings); +% +% Notes +% ----- +% A number of consistency checks are performed in order to ensure that the % model is valid. Take these warnings seriously and modify the model % structure to solve them. -% -% Usage: model = importModel(fileName, removeExcMets, removePrefix, supressWarnings) if nargin<1 || isempty(fileName) [fileName, pathName] = uigetfile({'*.xml;*.sbml'}, 'Please select the model file'); diff --git a/io/loadSheet.m b/io/loadSheet.m index 3324964e..211a30c4 100755 --- a/io/loadSheet.m +++ b/io/loadSheet.m @@ -1,14 +1,25 @@ -% loadSheet -% Loads an Excel sheet into a cell matrix using the Java library Apache POI +% loadSheet Load an Excel sheet into a cell matrix. % -% workbook Workbook object representing the Excel file -% sheet name of the sheet (optional, default first sheet) +% Loads an Excel sheet into a cell matrix using the Java library Apache +% POI. % -% raw cell array with the data in the sheet -% flag 0 if everything worked, -1 if it didn't +% Parameters +% ---------- +% workbook : Workbook +% Workbook object representing the Excel file. +% sheet : char, optional +% name of the sheet (default first sheet). % -% Usage: [raw, flag]=loadSheet(workbook, sheet) - +% Returns +% ------- +% raw : cell +% cell array with the data in the sheet. +% flag : double +% 0 if everything worked, -1 if it didn't. 
+% +% Examples +% -------- +% [raw, flag] = loadSheet(workbook, sheet); function [raw, flag]=loadSheet(workbook, sheet) if nargin<2 sheet=[]; diff --git a/io/loadWorkbook.m b/io/loadWorkbook.m index c1444533..038b82d3 100755 --- a/io/loadWorkbook.m +++ b/io/loadWorkbook.m @@ -1,15 +1,25 @@ function workbook=loadWorkbook(fileName,createEmpty) -% loadWorkbook -% Loads an Excel file into a Workbook object using the Java library Apache POI +% loadWorkbook Load an Excel file into a Workbook object. % -% fileName name of the Excel file. If it doesn't exist it will be -% created -% createEmpty true if an empty workbook should be created if the file -% didn't exist (optional, default false) +% Loads an Excel file into a Workbook object using the Java library Apache +% POI. % -% workbook Workbook object representing the Excel file +% Parameters +% ---------- +% fileName : char +% name of the Excel file. If it doesn't exist it will be created. +% createEmpty : logical, optional +% true if an empty workbook should be created if the file didn't exist +% (default false). % -% Usage: workbook=loadWorkbook(fileName,createEmpty) +% Returns +% ------- +% workbook : Workbook +% Workbook object representing the Excel file. +% +% Examples +% -------- +% workbook = loadWorkbook(fileName, createEmpty); if nargin<2 createEmpty=false; diff --git a/io/parseYAML.m b/io/parseYAML.m index 5c93b76a..72355714 100644 --- a/io/parseYAML.m +++ b/io/parseYAML.m @@ -1,33 +1,43 @@ function out = parseYAML(filename) -% parseYAML -% Read an arbitrary YAML file into a MATLAB struct / cell tree. +% parseYAML Read an arbitrary YAML file into a MATLAB struct/cell tree. % -% Use this for parsing arbitrary YAML configuration / data files -% (e.g. yeast-GEM's data/conditions/*.yml). For loading a cobra-format -% model YAML, use readYAMLmodel instead — that function knows the -% model schema and returns a populated RAVEN model struct. +% Use this for parsing arbitrary YAML configuration / data files (e.g. 
+% yeast-GEM's data/conditions/*.yml). For loading a cobra-format model +% YAML, use readYAMLmodel instead — that function knows the model schema +% and returns a populated RAVEN model struct. % -% Implementation: delegates to Python's yaml.safe_load, then -% recursively converts the py.dict / py.list tree to native MATLAB -% struct / cell. Requires a working MATLAB-Python bridge and the -% pyyaml package in the linked Python environment: +% Implementation: delegates to Python's yaml.safe_load, then recursively +% converts the py.dict / py.list tree to native MATLAB struct / cell. +% Requires a working MATLAB-Python bridge and the pyyaml package in the +% linked Python environment: % -% pip install pyyaml % from the MATLAB-linked Python env +% pip install pyyaml % from the MATLAB-linked Python env % -% Input: -% filename path to the YAML file. +% Parameters +% ---------- +% filename : char +% Path to the YAML file. % -% Output: -% out MATLAB representation of the document: -% py.dict -> struct -% py.list -> cell column vector -% py.str -> char -% py.int -> double -% py.float -> double -% py.bool -> logical -% py.None -> [] +% Returns +% ------- +% out : struct or cell or char or double or logical +% MATLAB representation of the document: % -% Usage: cfg = parseYAML('data/conditions/anaerobic.yml') +% - py.dict -> struct +% - py.list -> cell column vector +% - py.str -> char +% - py.int -> double +% - py.float -> double +% - py.bool -> logical +% - py.None -> [] +% +% Examples +% -------- +% cfg = parseYAML('data/conditions/anaerobic.yml'); +% +% See also +% -------- +% readYAMLmodel if ~isfile(filename) error('parseYAML:fileNotFound', 'File not found: %s', filename); diff --git a/io/readYAMLmodel.m b/io/readYAMLmodel.m index adc74f58..e71b4206 100755 --- a/io/readYAMLmodel.m +++ b/io/readYAMLmodel.m @@ -1,16 +1,24 @@ function model=readYAMLmodel(fileName, verbose) -% readYAMLmodel -% Reads a yaml file matching (roughly) the cobrapy yaml structure +% readYAMLmodel 
Read a model structure from a YAML file. % -% Input: -% fileName a model file in yaml file format. A dialog window will open -% if no file name is specified. -% verbose set as true to monitor progress (optional, default false) +% Reads a yaml file matching (roughly) the cobrapy yaml structure. % -% Output: -% model a model structure +% Parameters +% ---------- +% fileName : char +% a model file in yaml file format. A dialog window will open if no +% file name is specified. +% verbose : logical, optional +% set as true to monitor progress (default false). % -% Usage: model = readYAMLmodel(fileName, verbose) +% Returns +% ------- +% model : struct +% a model structure. +% +% Examples +% -------- +% model = readYAMLmodel(fileName, verbose); if nargin<1 || isempty(fileName) [fileName, pathName] = uigetfile({'*.yml;*.yaml'}, 'Please select the model file'); if fileName == 0 diff --git a/io/writeSheet.m b/io/writeSheet.m index 1ca76d31..31ed185a 100755 --- a/io/writeSheet.m +++ b/io/writeSheet.m @@ -1,17 +1,33 @@ function wb=writeSheet(wb,sheetName,sheetPosition,captions,units,raw,isIntegers) -% writeSheet -% Writes a cell matrix to an Excel sheet into using the Java library Apache POI +% writeSheet Write a cell matrix to an Excel sheet. % -% workbook Workbook object representing the Excel file -% sheetName name of the sheet -% sheetPosition 0-based position of the sheet -% captions cell array of captions (optional) -% units WRITE INFO -% raw cell array with the data in the sheet -% isIntegers true if numeric values should be integers (optional, default -% true) +% Writes a cell matrix to an Excel sheet using the Java library Apache POI. % -% Usage: wb=writeSheet(wb,sheetName,sheetPosition,captions,units,raw) +% Parameters +% ---------- +% wb : Workbook +% Workbook object representing the Excel file. +% sheetName : char +% name of the sheet. +% sheetPosition : double +% 0-based position of the sheet. +% captions : cell, optional +% cell array of captions. 
+% units : cell +% cell array of units for the columns. +% raw : cell +% cell array with the data in the sheet. +% isIntegers : logical, optional +% true if numeric values should be integers (default true). +% +% Returns +% ------- +% wb : Workbook +% Workbook object updated with the written sheet. +% +% Examples +% -------- +% wb = writeSheet(wb, sheetName, sheetPosition, captions, units, raw); if nargin<7 isIntegers=true; diff --git a/io/writeYAMLmodel.m b/io/writeYAMLmodel.m index fc19d9e5..b1f59263 100755 --- a/io/writeYAMLmodel.m +++ b/io/writeYAMLmodel.m @@ -1,28 +1,33 @@ function writeYAMLmodel(model,fileName,preserveQuotes,sortIds) -% writeYAMLmodel -% Writes a yaml file matching cobrapy's YAML structure. The format is -% cobrapy's native !!omap layout, extended with RAVEN-only top-level -% per-entry keys (inchis, deltaG, metFrom, rxnFrom, references, -% confidence_score, protein) and the GECKO ec-rxns / ec-enzymes -% sections. Reaction EC numbers are written inside the `annotation` -% block as `ec-code` (the cobrapy/geckopy convention), not as a -% top-level reaction key. Output is byte-stable with raven_python's -% io.yaml.write_yaml_model when called with the same model. +% writeYAMLmodel Write a model to a yaml file matching cobrapy's structure. % -% model a model structure -% fileName name that the file will have. A dialog window will -% open if no file name is specified. -% preserveQuotes if all string values should be wrapped in double -% quotes. cobrapy emits quotes only where YAML -% requires them, so the default is false (matches -% cobrapy / raven-python). 
-% (logical, default=false) -% sortIds if metabolites, reactions, genes and compartments -% should be sorted alphabetically by their identifier, -% otherwise they are kept in their original order -% (logical, default=false) +% The format is cobrapy's native !!omap layout, extended with RAVEN-only +% top-level per-entry keys (inchis, deltaG, metFrom, rxnFrom, references, +% confidence_score, protein) and the GECKO ec-rxns / ec-enzymes sections. +% Reaction EC numbers are written inside the `annotation` block as +% `ec-code` (the cobrapy/geckopy convention), not as a top-level reaction +% key. Output is byte-stable with raven_python's io.yaml.write_yaml_model +% when called with the same model. % -% Usage: writeYAMLmodel(model,fileName,preserveQuotes,sortIds) +% Parameters +% ---------- +% model : struct +% a model structure. +% fileName : char +% name that the file will have. A dialog window will open if no file +% name is specified. +% preserveQuotes : logical, optional +% if all string values should be wrapped in double quotes. cobrapy +% emits quotes only where YAML requires them, so leaving this false +% matches cobrapy / raven-python (default false). +% sortIds : logical, optional +% if metabolites, reactions, genes and compartments should be sorted +% alphabetically by their identifier, otherwise they are kept in their +% original order (default false). 
+% +% Examples +% -------- +% writeYAMLmodel(model,fileName,preserveQuotes,sortIds); if nargin<2|| isempty(fileName) [fileName, pathName] = uiputfile({'*.yml;*.yaml'}, 'Select file for model export',[model.id '.yml']); if fileName == 0 diff --git a/localization/getExpressionStructure.m b/localization/getExpressionStructure.m index f49899d7..aa4d275a 100755 --- a/localization/getExpressionStructure.m +++ b/localization/getExpressionStructure.m @@ -1,50 +1,61 @@ function experiment=getExpressionStructure(fileName) -% getExpressionStructure -% Loads a representation of an experiment from an Excel file (see -% comments further down) +% getExpressionStructure Load a representation of an experiment from Excel. % -% fileName an Excel representation on an experiment +% Loads a representation of an experiment from an Excel file (see notes +% further down). % -% experiment an experiment structure -% data matrix with expression values -% orfs the corresponding ORFs -% experiments the titles of the experiments -% boundNames reaction names for the bounds -% upperBoundaries matrix with the upper bound values -% fitNames reaction names for the measured fluxes -% fitTo matrix with the measured fluxes +% Parameters +% ---------- +% fileName : char +% an Excel representation of an experiment. % -% A very common data set when working with genome-scale metabolic models -% is that you have measured fermentation data, gene expression data, -% and some different 'bounds' (for example different carbon sources -% or genes that are knocked out) in a number of conditions. This function -% reads an Excel representation of such an experiment. -% The Excel file must contain three sheets, 'EXPRESSION', 'BOUNDS', -% 'FITTING'. 
Below are some examples to show how they should be -% formatted: +% Returns +% ------- +% experiment : struct +% an experiment structure with fields: % -% -EXPRESSION -% ORF dsm_paa wisc_paa -% Pc00e00030 79.80942723 78.14755338 -% Shows the expression of the gene Pc00e00030 under two different -% conditions (in this case a DSM strain and a Wisconsin strain of P. -% chrysogenum with PSS in the media) +% - data : matrix with expression values +% - orfs : the corresponding ORFs +% - experiments : the titles of the experiments +% - boundNames : reaction names for the bounds +% - upperBoundaries : matrix with the upper bound values +% - fitNames : reaction names for the measured fluxes +% - fitTo : matrix with the measured fluxes % -% -BOUNDS -% Fixed Upper dsm_paa wisc_paa -% paaIN 0.1 0.2 -% The upper bound for the reaction paaIN should be 0.1 for the first -% condition and 0.2 for the second +% Examples +% -------- +% experiment = getExpressionStructure(fileName); % -% -FITTING -% Fit to dsm_paa wisc_paa -% co2OUT 2.85 3.05 -% glcIN 1.2 0.9 -% The measured fluxes for CO2 production and glucose uptake for the two -% conditions. The model(s) can later be fitted to match these values as -% good as possible. +% Notes +% ----- +% A very common data set when working with genome-scale metabolic models +% is that you have measured fermentation data, gene expression data, and +% some different 'bounds' (for example different carbon sources or genes +% that are knocked out) in a number of conditions. This function reads an +% Excel representation of such an experiment. The Excel file must contain +% three sheets, 'EXPRESSION', 'BOUNDS', 'FITTING'. Below are some examples +% to show how they should be formatted: % -% Usage: experiment=getExpressionStructure(fileName) +% -EXPRESSION +% ORF dsm_paa wisc_paa +% Pc00e00030 79.80942723 78.14755338 +% Shows the expression of the gene Pc00e00030 under two different +% conditions (in this case a DSM strain and a Wisconsin strain of P. 
+% chrysogenum with PSS in the media). +% +% -BOUNDS +% Fixed Upper dsm_paa wisc_paa +% paaIN 0.1 0.2 +% The upper bound for the reaction paaIN should be 0.1 for the first +% condition and 0.2 for the second. +% +% -FITTING +% Fit to dsm_paa wisc_paa +% co2OUT 2.85 3.05 +% glcIN 1.2 0.9 +% The measured fluxes for CO2 production and glucose uptake for the two +% conditions. The model(s) can later be fitted to match these values as +% well as possible. [type, sheets]=xlsfinfo(fileName); diff --git a/localization/getWoLFScores.m b/localization/getWoLFScores.m index b0ad9b60..aafe6855 100755 --- a/localization/getWoLFScores.m +++ b/localization/getWoLFScores.m @@ -1,20 +1,30 @@ function GSS = getWoLFScores(inputFile, kingdom) -% getWoLFScores -% Call WoLF PSort to predict the sub-cellular localization of proteins. -% The output can be used as input to predictLocalization. This function -% is currently only available for Linux and requires Perl to be -% installed. If one wants to use another predictor, see parseScores. The -% function normalizes the scores so that the best score for each gene is -% 1.0. +% getWoLFScores Predict protein sub-cellular localization with WoLF PSORT. % -% Input: -% inputFile a FASTA file with protein sequences -% kingdom the kingdom of the organism, 'animal', 'fungi' or 'plant' +% The output can be used as input to predictLocalization. This function is +% currently only available for Linux and requires Perl to be installed. If +% one wants to use another predictor, see parseScores. The function +% normalizes the scores so that the best score for each gene is 1.0. % -% Output: -% GSS a gene scoring structure to be used in predictLocalization +% Parameters +% ---------- +% inputFile : char +% a FASTA file with protein sequences. +% kingdom : char +% the kingdom of the organism, 'animal', 'fungi' or 'plant'. 
% -% Usage: GSS = getWoLFScores(inputFile, kingdom) +% Returns +% ------- +% GSS : struct +% a gene scoring structure to be used in predictLocalization. +% +% Examples +% -------- +% GSS = getWoLFScores(inputFile, kingdom); +% +% See also +% -------- +% parseScores, predictLocalization if ~isfile(inputFile) error('FASTA file %s cannot be found',string(inputFile)); diff --git a/localization/mapCompartments.m b/localization/mapCompartments.m index e47c744d..43b78eac 100755 --- a/localization/mapCompartments.m +++ b/localization/mapCompartments.m @@ -1,12 +1,53 @@ function geneScoreStructure=mapCompartments(geneScoreStructure,varargin) -% mapCompartments -% Maps compartments in the geneScoreStructure. This is used if you do not -% want a models that uses all of the compartment from the predictor. This -% function will then let you define rules on how the compartments should -% be merged. +% mapCompartments Map compartments in the geneScoreStructure. % -% Any number of rules could be defined as consecutive strings or in a cell array. -% 'comp1' comp1 should be kept in the structure +% Maps compartments in the geneScoreStructure. This is used if you do not +% want a model that uses all of the compartments from the predictor. This +% function will then let you define rules on how the compartments should be +% merged. +% +% Parameters +% ---------- +% geneScoreStructure : struct +% a structure to be used in predictLocalization. +% varargin : char or cell +% any number of rules, defined as consecutive strings or in a cell +% array: +% +% - 'comp1' : comp1 should be kept in the structure. +% - 'comp1=comp2' : The scores in comp2 are merged to comp1 and comp2 is +% removed from the structure. This automatically keeps comp1 in the +% structure. +% - 'comp1=comp2 comp3' : The scores in comp2 and comp3 are merged to +% comp1 and comp2 & comp3 are removed from the structure. This +% automatically keeps comp1 in the structure. 
+% - 'comp1 comp2=comp3' : The scores in comp3 are split between comp1 and +% comp2. This automatically keeps comp1 and comp2 in the structure. +% - 'comp1=other' : The scores in any compartment not included are merged +% to comp1. This is applied after all other rules. +% +% Returns +% ------- +% geneScoreStructure : struct +% a structure to be used in predictLocalization. +% +% Examples +% -------- +% The predictor you use gives predictions for Extracellular, Cytosol, +% Nucleus, Peroxisome, Mitochondria, ER, and Lysosome. You want to have a +% model with Extracellular, Cytosol, Mitochondria, and Peroxisome where +% Lysosome is merged with Peroxisome and all other compartments are merged +% to the Cytosol: +% +% GSS = mapCompartments(GSS, 'Extracellular', 'Mitochondria', ... +% 'Peroxisome=Lysosome', 'Cytosol=other'); +% +% Notes +% ----- +% When one compartment is merged to another the resulting scores will be the +% best for each gene in either of the compartments. In the case where one +% compartment is split among several, the scores for the compartment to be +% merged are weighted with the number of compartments to split to. % 'comp1=comp2' The scores in comp2 are merged to comp1 and comp2 is % removed from the structure. This automatically diff --git a/localization/parseScores.m b/localization/parseScores.m index ae865b26..5de1c53b 100755 --- a/localization/parseScores.m +++ b/localization/parseScores.m @@ -1,19 +1,29 @@ function GSS = parseScores(inputFile, predictor) -% parseScores -% Parse the output from a predictor to generate the GSS +% parseScores Parse the output from a predictor to generate the GSS. % -% Input: -% inputFile a file with the output from the predictor -% predictor the predictor that was used. 'wolf' for WoLF PSORT, 'cello' -% for CELLO, 'deeploc' for DeepLoc (optional, default 'wolf') +% The function normalizes the scores so that the best score for each gene +% is 1.0.
% -% Output: -% GSS a gene scoring structure to be used in predictLocalization +% Parameters +% ---------- +% inputFile : char +% a file with the output from the predictor. +% predictor : char, optional +% the predictor that was used. 'wolf' for WoLF PSORT, 'cello' for +% CELLO, 'deeploc' for DeepLoc (default 'wolf'). % -% The function normalizes the scores so that the best score for each gene -% is 1.0. +% Returns +% ------- +% GSS : struct +% a gene scoring structure to be used in predictLocalization. % -% Usage: GSS = parseScores(inputFile, predictor) +% Examples +% -------- +% GSS = parseScores(inputFile, predictor); +% +% See also +% -------- +% predictLocalization, getWoLFScores if nargin<2 predictor='wolf'; diff --git a/localization/predictLocalization.m b/localization/predictLocalization.m index 1908caa6..d9e1e334 100755 --- a/localization/predictLocalization.m +++ b/localization/predictLocalization.m @@ -1,66 +1,77 @@ function [outModel, geneLocalization, transportStruct, scores,... removedRxns] = predictLocalization(model, GSS,... defaultCompartment, transportCost, maxTime, plotResults) -% predictLocalization -% Tries to assign reactions to compartments in a manner that is in -% agreement with localization predictors while at the same time -% maintaining connectivity. +% predictLocalization Assign reactions to compartments using localization predictors. % -% Input: -% model a model structure. If the model contains -% several compartments they will be merged -% GSS a gene scoring structure as from parseScores -% defaultCompartment transport reactions are expressed as diffusion -% between the defaultCompartment and the others. -% This is usually the cytosol. The default -% compartment must have a match in GSS -% transportCost the cost for including a transport reaction. If -% this a scalar then the same cost is used for -% all metabolites. It can also be a vector of -% costs with the same dimension as model.mets. 
-% Note that negative costs will result in that -% transport of the metabolite is encouraged (optional, -% default 0.5) -% maxTime maximum optimization time in minutes (optional, -% default 15) -% plotResults true if the results should be plotted during the -% optimization (optional, default false) +% Tries to assign reactions to compartments in a manner that is in +% agreement with localization predictors while at the same time maintaining +% connectivity. % -% Output: -% outModel the resulting model structure -% geneLocalization structure with the genes and their resulting -% localization -% transportStruct structure with the transport reactions that had -% to be inferred and between which compartments -% scores structure that contains the total score history -% together with the score based on gene -% localization and the score based on included -% transport reactions -% removedRxns cell array with the reaction ids that had to be -% removed in order to have a connected input -% model +% Parameters +% ---------- +% model : struct +% a model structure. If the model contains several compartments they +% will be merged. +% GSS : struct +% a gene scoring structure as from parseScores. +% defaultCompartment : char +% transport reactions are expressed as diffusion between the +% defaultCompartment and the others. This is usually the cytosol. The +% default compartment must have a match in GSS. +% transportCost : double, optional +% the cost for including a transport reaction. If this is a scalar then +% the same cost is used for all metabolites. It can also be a vector of +% costs with the same dimension as model.mets. Note that negative costs +% will result in transport of the metabolite being encouraged (default +% 0.5). +% maxTime : double, optional +% maximum optimization time in minutes (default 15). +% plotResults : logical, optional +% true if the results should be plotted during the optimization +% (default false). 
% -% This function requires that the starting network is connected when it -% is in one compartment. Reactions that are unconnected are removed and -% saved in removedRxns. Try running fillGaps to have a more connected -% input model if there are many such reactions. The input model should -% also not include any exchange, demand or sink reactions, otherwise this -% function would not provide any results. +% Returns +% ------- +% outModel : struct +% the resulting model structure. +% geneLocalization : struct +% structure with the genes and their resulting localization. +% transportStruct : struct +% structure with the transport reactions that had to be inferred and +% between which compartments. +% scores : struct +% structure that contains the total score history together with the +% score based on gene localization and the score based on included +% transport reactions. +% removedRxns : cell +% cell array with the reaction ids that had to be removed in order to +% have a connected input model. % -% In the final model all metabolites are produced in at least one -% reaction. This does not guarantee a fully functional model since there -% can be internal loops. Transport reactions are only included as passive -% diffusion (A <=> B). +% Notes +% ----- +% This function requires that the starting network is connected when it is +% in one compartment. Reactions that are unconnected are removed and saved +% in removedRxns. Try running fillGaps to have a more connected input model +% if there are many such reactions. The input model should also not include +% any exchange, demand or sink reactions, otherwise this function would not +% provide any results. % -% The score of a model is the sum of scores for all genes in their -% assigned compartment minus the cost of all transport reactions that had -% to be included. A gene can only be assigned to one compartment. This is -% a simplification to keep the problem size down. 
The problem is solved -% using simulated annealing. +% In the final model all metabolites are produced in at least one reaction. +% This does not guarantee a fully functional model since there can be +% internal loops. Transport reactions are only included as passive diffusion +% (A <=> B). % -% Usage: [outModel, geneLocalization, transportStruct, scores,... -% removedRxns] = predictLocalization(model, GSS,... -% defaultCompartment, transportCost, maxTime, plotResults) +% The score of a model is the sum of scores for all genes in their assigned +% compartment minus the cost of all transport reactions that had to be +% included. A gene can only be assigned to one compartment. This is a +% simplification to keep the problem size down. The problem is solved using +% simulated annealing. +% +% Examples +% -------- +% [outModel, geneLocalization, transportStruct, scores, removedRxns] = ... +% predictLocalization(model, GSS, defaultCompartment, ... +% transportCost, maxTime, plotResults); if nargin<4 transportCost=ones(numel(model.mets),1)*0.5; diff --git a/manipulation/addExchangeRxns.m b/manipulation/addExchangeRxns.m index 831e888b..573240e8 100755 --- a/manipulation/addExchangeRxns.m +++ b/manipulation/addExchangeRxns.m @@ -1,25 +1,36 @@ function [model, addedRxns]=addExchangeRxns(model,reactionType,mets) -% addExchangeRxns -% Adds exchange reactions for some metabolites +% addExchangeRxns Add exchange reactions for some metabolites. % -% model a model structure -% reactionType the type of reactions to add -% 'in' input reactions -% 'out' output reactions -% 'both' reversible input/output reactions. Positive -% direction corresponds to output -% mets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% or a vector of indexes to add for (optional, default model.mets) +% This is a faster version than addRxns when adding exchange reactions. 
+% New reactions are named "metName exchange (OUT/IN/BOTH)" while reaction +% ids are formatted as "EXC_OUT/IN/BOTH_METID". % -% model updated model structure -% addedRxns ids of the added reactions +% Parameters +% ---------- +% model : struct +% a model structure. +% reactionType : char +% the type of reactions to add: % -% This is a faster version than addRxns when adding exchange reactions. -% New reactions are named "metName exchange (OUT/IN/BOTH)" while reaction -% ids are formatted as "EXC_OUT/IN/BOTH_METID". +% - 'in' : input reactions +% - 'out' : output reactions +% - 'both' : reversible input/output reactions. Positive direction +% corresponds to output +% mets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of +% indexes to add for (default model.mets). % -% Usage: [model, addedRxns]=addExchangeRxns(model,reactionType,mets) +% Returns +% ------- +% model : struct +% updated model structure. +% addedRxns : cell +% ids of the added reactions. +% +% Examples +% -------- +% [model, addedRxns] = addExchangeRxns(model, reactionType, mets); if nargin<3 mets=model.mets; diff --git a/manipulation/addGenesRaven.m b/manipulation/addGenesRaven.m index cc44f153..ac482dca 100755 --- a/manipulation/addGenesRaven.m +++ b/manipulation/addGenesRaven.m @@ -1,28 +1,35 @@ function newModel=addGenesRaven(model,genesToAdd) -% addGenesRaven -% Adds genes to a model +% addGenesRaven Add genes to a model. % -% model a model structure -% genesToAdd the genes genesToAdd can have the following fields: -% genes cell array with unique strings that -% identifies each gene. Only character which are -% allowed in SBML ids are allowed (mainly a-z, -% 0-9 and '_'). 
However, there is no check -% for this performed, as it only matters if -% the model should be exported to SBML -% geneShortNames cell array of gene abbreviations (optional, -% default '') -% geneMiriams cell array with MIRIAM structures (optional, -% default []) -% proteins cell array of protein names associated to -% each gene (optional, default '') +% This function does not make extensive checks about MIRIAM formats, +% forbidden characters or such. % -% newModel an updated model structure +% Parameters +% ---------- +% model : struct +% a model structure. +% genesToAdd : struct +% the genes to add, which can have the following fields: % -% NOTE: This function does not make extensive checks about MIRIAM formats, -% forbidden characters or such. +% - genes : cell array with unique strings that identifies each gene. +% Only characters which are allowed in SBML ids are allowed (mainly +% a-z, 0-9 and '_'). However, there is no check for this performed, +% as it only matters if the model should be exported to SBML +% - geneShortNames : cell array of gene abbreviations (optional, +% default '') +% - geneMiriams : cell array with MIRIAM structures (optional, +% default []) +% - proteins : cell array of protein names associated to each gene +% (optional, default '') % -% Usage: newModel=addGenesRaven(model,genesToAdd) +% Returns +% ------- +% newModel : struct +% an updated model structure. 
+% +% Examples +% -------- +% newModel = addGenesRaven(model, genesToAdd); newModel=model; diff --git a/manipulation/addMets.m b/manipulation/addMets.m index 3b4668a6..3e1c14f0 100755 --- a/manipulation/addMets.m +++ b/manipulation/addMets.m @@ -1,57 +1,65 @@ function newModel=addMets(model,metsToAdd,copyInfo,prefix) -% addMets -% Adds metabolites to a model -% -% Input: -% model a model structure -% metsToAdd the metabolite structure can have the following fields: -% mets cell array with unique strings that identifies each -% metabolite (optional, default IDs of new -% metabolites are numbered with the prefix defined -% below) -% metNames cell array with the names of each metabolite -% compartments cell array with the compartment of each -% metabolite. Should match model.comps. If this is a -% string rather than a cell array it is assumed that -% all mets are in that compartment -% b Nx1 or Nx2 matrix with equality constraints for -% each metabolite (optional, default 0) -% unconstrained vector describing if each metabolite is an exchange -% metabolite (1) or not (0) (optional, default 0) -% inchis cell array with InChI strings (optional, default '') -% metSmiles cell array with SMILES strings (optional, default '') -% metFormulas cell array with the formulas (optional, default '') -% metMiriams cell array with MIRIAM structures (optional, default []) -% metCharges metabolite charge (optional, default NaN) -% metDeltaG Gibbs free energy of formation at biochemical -% standard condition in kJ/mole (optional, default NaN) -% metNotes cell array with metabolite notes as strings -% (optional, default '') -% copyInfo when adding metabolites to a compartment where it -% previously did not exist, the function will copy any -% available annotation from the metabolite in another -% compartment (optional, default true) -% prefix when metsToAdd.mets is not specified, new metabolite IDs -% are generated with the prefix specified here. 
If IDs with -% the prefix are already used in the model then the -% numbering will start from the highest existing integer+1 -% (optional, default 'm_') -% -% Output: -% newModel an updated model structure +% addMets Add metabolites to a model. % % This function does not make extensive checks about MIRIAM formats, % forbidden characters or such. % +% Parameters +% ---------- +% model : struct +% a model structure. +% metsToAdd : struct +% the metabolite structure, which can have the following fields: +% +% - mets : cell array with unique strings that identifies each +% metabolite (optional, default IDs of new metabolites are numbered +% with the prefix defined below) +% - metNames : cell array with the names of each metabolite +% - compartments : cell array with the compartment of each metabolite. +% Should match model.comps. If this is a string rather than a cell +% array it is assumed that all mets are in that compartment +% - b : Nx1 or Nx2 matrix with equality constraints for each +% metabolite (optional, default 0) +% - unconstrained : vector describing if each metabolite is an exchange +% metabolite (1) or not (0) (optional, default 0) +% - inchis : cell array with InChI strings (optional, default '') +% - metSmiles : cell array with SMILES strings (optional, default '') +% - metFormulas : cell array with the formulas (optional, default '') +% - metMiriams : cell array with MIRIAM structures (optional, +% default []) +% - metCharges : metabolite charge (optional, default NaN) +% - metDeltaG : Gibbs free energy of formation at biochemical standard +% condition in kJ/mole (optional, default NaN) +% - metNotes : cell array with metabolite notes as strings (optional, +% default '') +% copyInfo : logical, optional +% when adding metabolites to a compartment where it previously did not +% exist, the function will copy any available annotation from the +% metabolite in another compartment (default true). 
+% prefix : char, optional +% when metsToAdd.mets is not specified, new metabolite IDs are +% generated with the prefix specified here. If IDs with the prefix are +% already used in the model then the numbering will start from the +% highest existing integer+1 (default 'm_'). +% +% Returns +% ------- +% newModel : struct +% an updated model structure. +% +% Examples +% -------- +% newModel = addMets(model, metsToAdd, copyInfo, prefix); +% +% Notes +% ----- % If multiple metabolites are added at once, the metMiriams cell array % should be defined as (example with ChEBI and KEGG): % -% metsToAdd.metMiriams{1} = struct('name',{{'chebi';'kegg.compound'}},... -% 'value',{{'CHEBI:18072';'C11821'}}); -% metsToAdd.metMiriams{2} = struct('name',{{'chebi';'kegg.compound'}},... -% 'value',{{'CHEBI:31132';'C12248'}}); -% -% Usage: newModel = addMets(model, metsToAdd, copyInfo, prefix) +% metsToAdd.metMiriams{1} = struct('name',{{'chebi';'kegg.compound'}},... +% 'value',{{'CHEBI:18072';'C11821'}}); +% metsToAdd.metMiriams{2} = struct('name',{{'chebi';'kegg.compound'}},... +% 'value',{{'CHEBI:31132';'C12248'}}); if nargin<3 copyInfo=true; diff --git a/manipulation/addRxns.m b/manipulation/addRxns.m index 3ce890c6..a046b9bd 100755 --- a/manipulation/addRxns.m +++ b/manipulation/addRxns.m @@ -1,105 +1,102 @@ function newModel=addRxns(model,rxnsToAdd,eqnType,compartment,allowNewMets,allowNewGenes) -% addRxns -% Adds reactions to a model +% addRxns Add reactions to a model. % -% Input: -% model a model structure -% rxnsToAdd the reaction structure can have the following fields: -% rxns cell array with unique strings that identifies -% each reaction -% equations cell array with equation strings. Decimal -% coefficients are expressed as "1.2". -% Reversibility is indicated by "<=>" or "=>" -% mets (alternative to equations) cell array with the -% metabolites involved in each reaction as nested -% arrays. 
E.g.: {{'met1','met2'},{'met1','met3','met4'}} -% In the case of one single reaction added, it -% can be a string array: {'met1','met2'} -% stoichCoeffs (alternative to equations) cell array with the -% corresponding stoichiometries as nested vectors -% E.g.: {[-1,+2],[-1,-1,+1]}. In the case of one -% single reaction added, it can be a vector: [-1,+2] -% rxnNames cell array with the names of each reaction -% (optional, default '') -% lb vector with the lower bounds (optional, default -% model.annotations.defaultLB or -inf for -% reversible reactions and 0 for irreversible -% when "equations" is used. When "mets" and -% "stoichCoeffs" are ,used it defaults for all -% to model.annotations.defaultLB or -inf) -% ub vector with the upper bounds (optional, default -% model.annotations.defaultUB or inf) -% c vector with the objective function coefficients -% (optional, default 0) -% eccodes cell array with the EC-numbers for each -% reactions. Delimit several EC-numbers with ";" -% (optional, default '') -% subSystems cell array with the subsystems for each -% reaction (optional, default '') -% grRules cell array with the gene-reaction relationship -% for each reaction. E.g. "(A and B) or (C)" -% means that the reaction could be catalyzed by a -% complex between A & B or by C on its own. All -% the genes have to be present in model.genes. 
-% Add genes with addGenesRaven before calling -% this function if needed (optional, default '') -% rxnMiriams cell array with Miriam structures (optional, -% default []) -% rxnComps cell array with compartments (as in -% model.comps) (optional, default {}) -% rxnNotes cell array with reaction notes (optional, -% default '') -% rxnDeltaG Gibbs free energy at biochemical standard -% condition in kJ/mole (optional, default NaN) -% rxnReferences cell array with reaction references (optional, -% default '') -% rxnConfidenceScores vector with reaction confidence scores -% (optional, default NaN) -% eqnType double describing how the equation string should be -% interpreted -% 1 - The metabolites are matched to model.mets. New -% metabolites (if allowed) are added to -% "compartment" (default) -% 2 - The metabolites are matched to model.metNames and -% all metabolites are assigned to "compartment". Any -% new metabolites that are added will be assigned -% IDs "m1", "m2"... If IDs on the same form are -% already used in the model then the numbering will -% start from the highest used integer+1 -% 3 - The metabolites are written as -% "metNames[comps]". Only compartments in -% model.comps are allowed. Any -% new metabolites that are added will be assigned -% IDs "m1", "m2"... If IDs on the same form are -% already used in the model then the numbering will -% start from the highest used integer+1 -% compartment a string with the compartment the metabolites should -% be placed in when using eqnType=2. Must match -% model.comps (optional when eqnType=1 or eqnType=3) -% allowNewMets true if the function is allowed to add new -% metabolites. Can also be a string, which will be used -% as prefix for the new metabolite IDs. It is highly -% recommended to first add any new metabolites with -% addMets rather than automatically through this -% function. 
addMets supports more annotation of -% metabolites, allows for the use of exchange -% metabolites, and using it reduces the risk of parsing -% errors (optional, default false) -% allowNewGenes true if the functions is allowed to add new genes -% (optional, default false) +% This function does not make extensive checks about formatting of +% gene-reaction rules. % -% Output: -% newModel an updated model structure +% When adding metabolites to a compartment where they previously do not +% exist, the function will copy any available information from the +% metabolite in another compartment. % -% This function does not make extensive checks about formatting of -% gene-reaction rules. +% Parameters +% ---------- +% model : struct +% a model structure. +% rxnsToAdd : struct +% the reaction structure, which can have the following fields: % -% When adding metabolites to a compartment where they previously do not -% the function will copy any available information from the metabolite in -% another compartment. +% - rxns : cell array with unique strings that identifies each reaction +% - equations : cell array with equation strings. Decimal coefficients +% are expressed as "1.2". Reversibility is indicated by "<=>" or "=>" +% - mets : (alternative to equations) cell array with the metabolites +% involved in each reaction as nested arrays. E.g.: +% {{'met1','met2'},{'met1','met3','met4'}}. In the case of one single +% reaction added, it can be a string array: {'met1','met2'} +% - stoichCoeffs : (alternative to equations) cell array with the +% corresponding stoichiometries as nested vectors. E.g.: +% {[-1,+2],[-1,-1,+1]}. In the case of one single reaction added, it +% can be a vector: [-1,+2] +% - rxnNames : cell array with the names of each reaction (optional, +% default '') +% - lb : vector with the lower bounds (optional, default +% model.annotations.defaultLB or -inf for reversible reactions and 0 +% for irreversible when "equations" is used. 
When "mets" and +% "stoichCoeffs" are used it defaults for all to +% model.annotations.defaultLB or -inf) +% - ub : vector with the upper bounds (optional, default +% model.annotations.defaultUB or inf) +% - c : vector with the objective function coefficients (optional, +% default 0) +% - eccodes : cell array with the EC-numbers for each reaction. Delimit +% several EC-numbers with ";" (optional, default '') +% - subSystems : cell array with the subsystems for each reaction +% (optional, default '') +% - grRules : cell array with the gene-reaction relationship for each +% reaction. E.g. "(A and B) or (C)" means that the reaction could be +% catalyzed by a complex between A & B or by C on its own. All the +% genes have to be present in model.genes. Add genes with +% addGenesRaven before calling this function if needed (optional, +% default '') +% - rxnMiriams : cell array with Miriam structures (optional, +% default []) +% - rxnComps : cell array with compartments (as in model.comps) +% (optional, default {}) +% - rxnNotes : cell array with reaction notes (optional, default '') +% - rxnDeltaG : Gibbs free energy at biochemical standard condition in +% kJ/mole (optional, default NaN) +% - rxnReferences : cell array with reaction references (optional, +% default '') +% - rxnConfidenceScores : vector with reaction confidence scores +% (optional, default NaN) +% eqnType : double, optional +% describes how the equation string should be interpreted (default 1): % -% Usage: newModel = addRxns(model, rxnsToAdd, eqnType, compartment,... -% allowNewMets, allowNewGenes) +% - 1 : the metabolites are matched to model.mets. New metabolites (if +% allowed) are added to "compartment" +% - 2 : the metabolites are matched to model.metNames and all +% metabolites are assigned to "compartment". Any new metabolites that +% are added will be assigned IDs "m1", "m2"... 
If IDs on the same +% form are already used in the model then the numbering will start +% from the highest used integer+1 +% - 3 : the metabolites are written as "metNames[comps]". Only +% compartments in model.comps are allowed. Any new metabolites that +% are added will be assigned IDs "m1", "m2"... If IDs on the same +% form are already used in the model then the numbering will start +% from the highest used integer+1 +% compartment : char, optional +% the compartment the metabolites should be placed in when using +% eqnType=2. Must match model.comps (optional when eqnType=1 or +% eqnType=3). +% allowNewMets : logical or char, optional +% true if the function is allowed to add new metabolites. Can also be a +% string, which will be used as prefix for the new metabolite IDs. It +% is highly recommended to first add any new metabolites with addMets +% rather than automatically through this function. addMets supports +% more annotation of metabolites, allows for the use of exchange +% metabolites, and using it reduces the risk of parsing errors +% (default false). +% allowNewGenes : logical, optional +% true if the function is allowed to add new genes (default false). +% +% Returns +% ------- +% newModel : struct +% an updated model structure. +% +% Examples +% -------- +% newModel = addRxns(model, rxnsToAdd, eqnType, compartment, ... +% allowNewMets, allowNewGenes); if nargin<3 eqnType=1; diff --git a/manipulation/addRxnsGenesMets.m b/manipulation/addRxnsGenesMets.m index 7e856f0c..5b550216 100755 --- a/manipulation/addRxnsGenesMets.m +++ b/manipulation/addRxnsGenesMets.m @@ -1,48 +1,58 @@ function model=addRxnsGenesMets(model,sourceModel,rxns,addGene,rxnNote,confidence) -% addRxnsGenesMets -% Copies reactions from a source model to a new model, including -% (new) metabolites and genes +% addRxnsGenesMets Copy reactions from a source model into another model. 
% -% model draft model where reactions should be copied to -% sourceModel model where reactions and metabolites are sourced from -% rxns cell array with reaction IDs (from source model). Can also -% be string if only one reaction is added -% addGene three options: -% false no genes are annotated to the new reactions -% true grRules ared copied from the sourceModel and -% new genes are added when required -% string or cell array -% new grRules are specified as string or cell -% array, and any new genes are added when -% required -% (optional, default false) -% rxnNote cell array with strings explaining why reactions were copied -% to the model, to be included as newModel.rxnNotes. Can also -% be string if same rxnNotes should be added for each new -% reaction, or only one reaction is to be added (optional, default -% 'Added via addRxnsAndMets()') -% confidence integer specifying confidence score for all reactions. -% 4: biochemical data: direct evidence from enzymes -% assays -% 3: genetic data: knockout/-in or overexpression -% analysis -% 2: physiological data: indirect evidence, e.g. -% secretion products or defined medium requirement -% sequence data: genome annotation -% 1: modeling data: required for functional model, -% hypothetical reaction -% 0: no evidence -% following doi:10.1038/nprot.2009.203 (optional, default 0) +% Copies reactions from a source model to a new model, including (new) +% metabolites and genes. % -% newModel an updated model structure +% This function only works if the draft model and source model follow the +% same metabolite and compartment naming convention. Metabolites are only +% matched by metaboliteName[compartment]. Useful if one wants to copy +% additional reactions from source to draft after getModelFromHomology was +% used involving the same models. % -% This function only works if the draft model and source model follow -% the same metabolite and compartment naming convention. 
Metabolites are -% only matched by metaboliteName[compartment]. Useful if one wants to copy -% additional reactions from source to draft after getModelFromHomology was -% used involving the same models. +% Parameters +% ---------- +% model : struct +% draft model where reactions should be copied to. +% sourceModel : struct +% model where reactions and metabolites are sourced from. +% rxns : cell or char +% reaction IDs (from source model). Can also be a string if only one +% reaction is added. +% addGene : logical or char or cell, optional +% three options (default false): % -% Usage: newModel=addRxnsGenesMets(model,sourceModel,rxns,addGene,rxnNote,confidence) +% - false : no genes are annotated to the new reactions +% - true : grRules are copied from the sourceModel and new genes are +% added when required +% - string or cell array : new grRules are specified as string or cell +% array, and any new genes are added when required +% rxnNote : cell or char, optional +% strings explaining why reactions were copied to the model, to be +% included as newModel.rxnNotes. Can also be a string if the same +% rxnNotes should be added for each new reaction, or only one reaction +% is to be added (default 'Added via addRxnsAndMets()'). +% confidence : double, optional +% integer specifying confidence score for all reactions, following +% doi:10.1038/nprot.2009.203 (default 0): +% +% - 4 : biochemical data: direct evidence from enzyme assays +% - 3 : genetic data: knockout/-in or overexpression analysis +% - 2 : physiological data: indirect evidence, e.g. secretion products +% or defined medium requirement; sequence data: genome annotation +% - 1 : modeling data: required for functional model, hypothetical +% reaction +% - 0 : no evidence +% +% Returns +% ------- +% model : struct +% an updated model structure. +% +% Examples +% -------- +% newModel = addRxnsGenesMets(model, sourceModel, rxns, addGene, ... 
+% rxnNote, confidence); if nargin<6 confidence=0; diff --git a/manipulation/addTransport.m b/manipulation/addTransport.m index 4b88f022..3d960912 100755 --- a/manipulation/addTransport.m +++ b/manipulation/addTransport.m @@ -1,29 +1,43 @@ function [model, addedRxns]=addTransport(model,fromComp,toComps,metNames,isRev,onlyToExisting,prefix) -% addTransport -% Adds transport reactions between compartments +% addTransport Add transport reactions between compartments. % -% model a model structure -% fromComp the id of the compartment to transport from (should -% match model.comps) -% toComps a cell array of compartment names to transport to (should -% match model.comps) -% metNames the metabolite names to add transport for (optional, all -% metabolites in fromComp) -% isRev true if the transport reactions should be reversible -% (optional, default true) -% onlyToExisting true if transport of a metabolite should only be added -% if it already exists in toComp. If false, then new metabolites -% are added with addMets first (optional, default true) -% prefix string specifying prefix to reaction IDs (optional, default -% 'tr_') +% This is a faster version than addRxns when adding transport reactions. +% New reaction names are formatted as "metaboliteName, fromComp-toComp", +% while new reaction IDs are sequentially counted with a tr_ prefix: +% e.g. tr_0001, tr_0002, etc. % -% This is a faster version than addRxns when adding transport reactions. -% New reaction names are formatted as "metaboliteName, fromComp-toComp", -% while new reaction IDs are sequentially counted with a tr_ prefix: -% e.g. tr_0001, tr_0002, etc. +% Parameters +% ---------- +% model : struct +% a model structure. +% fromComp : char +% the id of the compartment to transport from (should match +% model.comps). +% toComps : cell +% compartment names to transport to (should match model.comps). 
+% metNames : cell, optional +% the metabolite names to add transport for (default all metabolites +% in fromComp). +% isRev : logical, optional +% true if the transport reactions should be reversible (default true). +% onlyToExisting : logical, optional +% true if transport of a metabolite should only be added if it already +% exists in toComp. If false, then new metabolites are added with +% addMets first (default true). +% prefix : char, optional +% prefix to reaction IDs (default 'tr_'). % -% Usage: [model, addedRxns]=addTransport(model,fromComp,toComps,metNames,... -% isRev,onlyToExisting,prefix) +% Returns +% ------- +% model : struct +% updated model structure. +% addedRxns : cell +% ids of the added reactions. +% +% Examples +% -------- +% [model, addedRxns] = addTransport(model, fromComp, toComps, ... +% metNames, isRev, onlyToExisting, prefix); fromComp=char(fromComp); [I, fromID]=ismember(model.comps,fromComp); diff --git a/manipulation/changeGrRules.m b/manipulation/changeGrRules.m index e8418c3e..a2ad5de3 100755 --- a/manipulation/changeGrRules.m +++ b/manipulation/changeGrRules.m @@ -1,20 +1,29 @@ function model = changeGrRules(model,rxns,grRules,replace) -% changeGrRules -% Changes multiple grRules at the same time. +% changeGrRules Change multiple grRules at the same time. % -% model a model structure to change the gene association -% rxns string or cell array of reaction IDs -% grRules string of additional or replacement gene association. -% Should be written with ' and ' to indicate subunits, ' or ' -% to indicate isoenzymes, and brackets '()' to separate -% different instances -% replace true if old gene association should be replaced with new -% association. False if new gene association should be -% concatenated to the old association (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure to change the gene association. +% rxns : char or cell +% reaction IDs. 
+% grRules : char or cell +% additional or replacement gene association. Should be written with +% ' and ' to indicate subunits, ' or ' to indicate isoenzymes, and +% brackets '()' to separate different instances. +% replace : logical, optional +% true if old gene association should be replaced with new association. +% False if new gene association should be concatenated to the old +% association (default true). % -% model an updated model structure +% Returns +% ------- +% model : struct +% an updated model structure. % -% Usage: changeGrRules(model,rxns,grRules,replace) +% Examples +% -------- +% model = changeGrRules(model, rxns, grRules, replace); if nargin==3 replace=true; diff --git a/manipulation/changeRxns.m b/manipulation/changeRxns.m index 8527e95f..7fc82bac 100755 --- a/manipulation/changeRxns.m +++ b/manipulation/changeRxns.m @@ -1,57 +1,66 @@ function model=changeRxns(model,rxns,equations,eqnType,compartment,allowNewMets) -% changeRxns -% Modifies the equations of reactions +% changeRxns Modify the equations of reactions in a model. % -% model a model structure -% rxns cell array with reaction ids -% equations cell array with equations. Alternatively, it can be a -% structure with the fields "mets" and "stoichCoeffs", -% in the same fashion as addRxns. E.g.: -% equations.mets = {{'met1','met2'},{'met1','met3'}} -% equations.stoichCoeffs = {[-1,+2],[-1,+1]} -% eqnType double describing how the equation string should be -% interpreted -% 1 - The metabolites are matched to model.mets. New -% metabolites (if allowed) are added to -% "compartment" (default) -% 2 - The metabolites are matched to model.metNames and -% all metabolites are assigned to "compartment". Any -% new metabolites that are added will be assigned -% IDs "m1", "m2"... If IDs on the same form are -% already used in the model then the numbering will -% start from the highest used integer+1 -% 3 - The metabolites are written as -% "metNames[compNames]". 
Only compartments in -% model.compNames are allowed. Any -% new metabolites that are added will be assigned -% IDs "m1", "m2"... If IDs on the same form are -% already used in the model then the numbering will -% start from the highest used integer+1 -% compartment a string with the compartment the metabolites should -% be placed in when using eqnType=2. Must match -% model.compNames (optional when eqnType=1 or eqnType=3) -% allowNewMets true if the function is allowed to add new -% metabolites. It is highly recommended to first add -% any new metabolites with addMets rather than -% automatically through this function. addMets supports -% more annotation of metabolites, allows for the use of -% exchange metabolites, and using it reduces the risk -% of parsing errors (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell +% cell array with reaction ids. +% equations : cell or struct +% cell array with equations. Alternatively, it can be a structure with +% the fields "mets" and "stoichCoeffs", in the same fashion as addRxns. +% E.g.: % -% model an updated model structure +% - equations.mets = {{'met1','met2'},{'met1','met3'}} +% - equations.stoichCoeffs = {[-1,+2],[-1,+1]} +% eqnType : double, optional +% describes how the equation string should be interpreted (default 1): % -% NOTE: This function should be used with some care, since it doesn't -% care about bounds on the reactions. Changing a irreversible reaction to -% a reversible one (or the other way around) will only change the -% model.rev field and not the model.lb/model.ub fields. The reaction will -% therefore still be having the same reversibility because of the -% bounds. Use setParams to change the bounds. +% - 1 : the metabolites are matched to model.mets. New metabolites (if +% allowed) are added to "compartment". +% - 2 : the metabolites are matched to model.metNames and all +% metabolites are assigned to "compartment". 
Any new metabolites that +% are added will be assigned IDs "m1", "m2"... If IDs on the same form +% are already used in the model then the numbering will start from the +% highest used integer+1. +% - 3 : the metabolites are written as "metNames[compNames]". Only +% compartments in model.compNames are allowed. Any new metabolites +% that are added will be assigned IDs "m1", "m2"... If IDs on the same +% form are already used in the model then the numbering will start +% from the highest used integer+1. +% compartment : char, optional +% a string with the compartment the metabolites should be placed in when +% using eqnType=2. Must match model.compNames (optional when eqnType=1 or +% eqnType=3). +% allowNewMets : logical, optional +% true if the function is allowed to add new metabolites. It is highly +% recommended to first add any new metabolites with addMets rather than +% automatically through this function. addMets supports more annotation +% of metabolites, allows for the use of exchange metabolites, and using +% it reduces the risk of parsing errors (default false). % -% NOTE: When adding metabolites to a compartment where it previously -% doesn't exist, the function will copy any available information from -% the metabolite in another compartment. +% Returns +% ------- +% model : struct +% an updated model structure. % -% Usage: model=changeRxns(model,rxns,equations,eqnType,compartment,allowNewMets) +% Examples +% -------- +% model = changeRxns(model, rxns, equations, eqnType, compartment, allowNewMets); +% +% Notes +% ----- +% This function should be used with some care, since it doesn't care about +% bounds on the reactions. Changing an irreversible reaction to a reversible +% one (or the other way around) will only change the model.rev field and not +% the model.lb/model.ub fields. The reaction will therefore still have +% the same reversibility because of the bounds. Use setParams to change the +% bounds.
+% +% When adding metabolites to a compartment where it previously doesn't +% exist, the function will copy any available information from the metabolite +% in another compartment. if nargin<4 eqnType=1; diff --git a/manipulation/closeModel.m b/manipulation/closeModel.m index b91d69f0..9941029f 100755 --- a/manipulation/closeModel.m +++ b/manipulation/closeModel.m @@ -1,13 +1,21 @@ function closedModel=closeModel(model) -% closeModel -% Adds boundary metabolites and their participation in exchange -% reactions. +% closeModel Add boundary metabolites and their exchange reactions. % -% model a model structure +% Adds boundary metabolites and their participation in exchange reactions. % -% closedModel an updated closedModel structure +% Parameters +% ---------- +% model : struct +% a model structure. % -% Usage: closedModel=closeModel(model) +% Returns +% ------- +% closedModel : struct +% an updated model structure with boundary metabolites added. +% +% Examples +% -------- +% closedModel = closeModel(model); closedModel=model; diff --git a/manipulation/contractModel.m b/manipulation/contractModel.m index 0d6af575..35d62fc0 100755 --- a/manipulation/contractModel.m +++ b/manipulation/contractModel.m @@ -1,31 +1,41 @@ function [reducedModel, removedRxns, indexedDuplicateRxns]=contractModel(model,distReverse,mets) -% contractModel -% Contracts a model by grouping all identical reactions. Similar to the -% deleteDuplicates part in simplifyModel but more care is taken here -% when it comes to gene associations. If the duplicated reactions have -% '_EXP_*' suffixes (where * is a digit), then the model is assumed to -% have been passed through expandModel, and these suffixes are removed -% here. +% contractModel Contract a model by grouping all identical reactions. 
% -% model a model structure -% distReverse distinguish reactions with same metabolites -% but different reversibility as different -% reactions (optional, default true) -% mets string or cell array of strings with metabolite -% identifiers, whose involved reactions should be -% checked for duplication (optional, by default all -% reactions are considered) (option is used by -% replaceMets) +% Similar to the deleteDuplicates part in simplifyModel but more care is +% taken here when it comes to gene associations. If the duplicated reactions +% have '_EXP_*' suffixes (where * is a digit), then the model is assumed to +% have been passed through expandModel, and these suffixes are removed here. % -% reducedModel a model structure without duplicate reactions -% removedRxns cell array for the removed duplicate reactions -% indexedDuplicateRxns indexed cell array for the removed duplicate -% reactions (multiple valuess separated by semicolon) +% Parameters +% ---------- +% model : struct +% a model structure. +% distReverse : logical, optional +% distinguish reactions with same metabolites but different reversibility +% as different reactions (default true). +% mets : char or cell, optional +% string or cell array of strings with metabolite identifiers, whose +% involved reactions should be checked for duplication (by default all +% reactions are considered). This option is used by replaceMets. % -% NOTE: This code might not work for advanced grRules strings -% that involve nested expressions of 'and' and 'or'. +% Returns +% ------- +% reducedModel : struct +% a model structure without duplicate reactions. +% removedRxns : cell +% cell array for the removed duplicate reactions. +% indexedDuplicateRxns : cell +% indexed cell array for the removed duplicate reactions (multiple values +% separated by semicolon). 
% -% Usage: [reducedModel, removedRxns, indexedDuplicateRxns]=contractModel(model,distReverse,mets) +% Examples +% -------- +% [reducedModel, removedRxns, indexedDuplicateRxns] = contractModel(model, distReverse, mets); +% +% Notes +% ----- +% This code might not work for advanced grRules strings that involve nested +% expressions of 'and' and 'or'. if nargin<2 distReverse=true; diff --git a/manipulation/convertToIrrev.m b/manipulation/convertToIrrev.m index 5dc06f54..5697716d 100755 --- a/manipulation/convertToIrrev.m +++ b/manipulation/convertToIrrev.m @@ -1,23 +1,33 @@ function [irrevModel,matchRev,rev2irrev,irrev2rev]=convertToIrrev(model,rxns) -% convertToIrrev -% Converts a model to irreversible form +% convertToIrrev Convert a model to irreversible form. % -% Input: -% model a model structure -% rxns cell array with the reactions so split (if reversible) -% (optional, default model.rxns) +% Reversible reactions are split into one forward and one reverse +% reaction. The reverse reactions are saved as 'rxnID_REV'. A warning is +% shown if some reaction identifiers already end with '_REV'. % -% Output: -% irrevModel a model structure where reversible reactions have -% been split into one forward and one reverse reaction -% matchRev matching forward reaction to its backward reaction -% rev2irrev forward and backward reactions for reversible reactions -% irrev2rev matching all reactions back to original model +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell, optional +% cell array with the reactions to split, if reversible (default +% model.rxns). % -% The reverse reactions are saved as 'rxnID_REV'. A warning is shown if -% some reaction identifiers already end with '_REV'. +% Returns +% ------- +% irrevModel : struct +% a model structure where reversible reactions have been split into +% one forward and one reverse reaction. +% matchRev : double +% matching forward reaction to its backward reaction. 
+% rev2irrev : cell +% forward and backward reactions for reversible reactions. +% irrev2rev : double +% matching all reactions back to original model. % -% Usage: [irrevModel,matchRev,rev2irrev,irrev2rev]=convertToIrrev(model,rxns) +% Examples +% -------- +% [irrevModel,matchRev,rev2irrev,irrev2rev]=convertToIrrev(model,rxns); if nargin<2 I=true(numel(model.rxns),1); diff --git a/manipulation/copyToComps.m b/manipulation/copyToComps.m index 042b4c0b..2eed4f66 100755 --- a/manipulation/copyToComps.m +++ b/manipulation/copyToComps.m @@ -1,29 +1,41 @@ function model=copyToComps(model,toComps,rxns,deleteOriginal,compNames,compOutside) -% copyToComps -% Copies reactions to new compartment(s) +% copyToComps Copy reactions to new compartment(s). % -% model a model structure -% toComps cell array of compartment ids. If there is no match -% to model.comps then it is added as a new compartment -% (see below for details) -% rxns either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the model, -% or a vector of indexes to remove (optional, default -% model.rxns) -% deleteOriginal true if the original reactions should be removed -% (making it move the reactions instead) (optional, default -% false) -% compNames cell array of compartment names. This is used if new -% compartments should be added (optional, default toComps) -% compOutside cell array of the id (as in comps) for the compartment -% surrounding each of the compartments. This is used if -% new compartments should be added (optional, default all {''}) +% Parameters +% ---------- +% model : struct +% a model structure. +% toComps : cell +% cell array of compartment ids. If there is no match to model.comps +% then it is added as a new compartment (see compNames and +% compOutside). 
+% rxns : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of indexes +% to copy (default model.rxns). +% deleteOriginal : logical, optional +% true if the original reactions should be removed, making it move the +% reactions instead (default false). +% compNames : cell, optional +% cell array of compartment names. Used if new compartments should be +% added (default toComps). +% compOutside : cell, optional +% cell array of the id (as in comps) for the compartment surrounding +% each of the compartments. Used if new compartments should be added +% (default all {''}). % -% model an updated model structure +% Returns +% ------- +% model : struct +% an updated model structure. % -% NOTE: New reactions and metabolites will be named as "id_toComps(i)". +% Examples +% -------- +% model=copyToComps(model,toComps,rxns,deleteOriginal,compNames,compOutside); % -% Usage: model=copyToComps(model,toComps,rxns,deleteOriginal,compNames,compOutside) +% Notes +% ----- +% New reactions and metabolites will be named as "id_toComps(i)". arguments model (1,1) struct diff --git a/manipulation/deleteUnusedGenes.m b/manipulation/deleteUnusedGenes.m index 7d0427f9..07843c16 100755 --- a/manipulation/deleteUnusedGenes.m +++ b/manipulation/deleteUnusedGenes.m @@ -1,14 +1,22 @@ function reducedModel=deleteUnusedGenes(model,verbose) -% deleteUnusedGenes -% Deletes all genes that are not associated to any reaction +% deleteUnusedGenes Delete all genes not associated to any reaction. % -% model a model structure -% verbose 0 for silent; 1 for printing number of deleted genes; -% 2 for printing the list of deleted genes (optional, default 1) +% Parameters +% ---------- +% model : struct +% a model structure. +% verbose : double, optional +% 0 for silent; 1 for printing the number of deleted genes; 2 for +% printing the list of deleted genes (default 1). 
% -% reducedModel an updated model structure +% Returns +% ------- +% reducedModel : struct +% an updated model structure. % -% Usage: reducedModel=deleteUnusedGenes(model) +% Examples +% -------- +% reducedModel=deleteUnusedGenes(model); if nargin<2 verbose=1; diff --git a/manipulation/expandModel.m b/manipulation/expandModel.m index e81eeafa..288d2844 100755 --- a/manipulation/expandModel.m +++ b/manipulation/expandModel.m @@ -1,25 +1,34 @@ function [newModel, rxnToCheck]=expandModel(model) -% expandModel -% Expands a model which uses several gene associations for one reaction. -% Each such reaction is split into several reactions, each under the control -% of only one gene. -% -% Input: -% model model structure -% -% Output: -% newModel model structure with separate reactions for iso-enzymes, where -% the reaction ids are renamed as to id_EXP_1, id_EXP_2, etc. -% rxnToCheck cell array with original reaction identifiers for those -% that contained nested and/or-relationships in grRules. +% expandModel Expand reactions that use several gene associations. % -% NOTE: grRules strings that involve nested expressions of 'and' and 'or' -% might not be parsed correctly if they are not standardized (if the -% standardizeGrRules functions was not first run on the model). For -% those reactions, it is therefore advisable to inspect the reactions in -% rxnToCheck to confirm correct model expansion. +% Each reaction that uses several gene associations is split into several +% reactions, each under the control of only one gene. % -% Usage: [newModel, rxnToCheck]=expandModel(model) +% Parameters +% ---------- +% model : struct +% a model structure. +% +% Returns +% ------- +% newModel : struct +% model structure with separate reactions for iso-enzymes, where the +% reaction ids are renamed as id_EXP_1, id_EXP_2, etc. +% rxnToCheck : cell +% cell array with original reaction identifiers for those that +% contained nested and/or-relationships in grRules. 
+% +% Examples +% -------- +% [newModel, rxnToCheck]=expandModel(model); +% +% Notes +% ----- +% grRules strings that involve nested expressions of 'and' and 'or' might +% not be parsed correctly if they are not standardized (if the +% standardizeGrRules function was not first run on the model). For those +% reactions, it is therefore advisable to inspect the reactions in +% rxnToCheck to confirm correct model expansion. %Check how many reactions we will create (the number of or:s in the GPRs). %This way, we can preallocate all fields and save much computation time diff --git a/manipulation/findDuplicateRxns.m b/manipulation/findDuplicateRxns.m index 44d3011b..469ef8fe 100644 --- a/manipulation/findDuplicateRxns.m +++ b/manipulation/findDuplicateRxns.m @@ -1,27 +1,31 @@ function pairs = findDuplicateRxns(model, ignoreDirection) -% findDuplicateRxns -% Find reactions that share identical stoichiometry. Counterpart of -% raven_python.manipulation.find_duplicate_reactions, and the -% upstream version of yeast-GEM's findDuplicatedRxns. +% findDuplicateRxns Find reactions that share identical stoichiometry. % -% Only stoichiometry is compared — bounds, GPRs, and annotations -% are ignored. The default treats A→B and B→A as duplicates -% (typical curation use case: "find reactions that could be -% merged"). +% Counterpart of raven_python.manipulation.find_duplicate_reactions, and +% the upstream version of yeast-GEM's findDuplicatedRxns. % -% Inputs: -% model RAVEN model struct. -% ignoreDirection (opt, default true) Treat A→B and B→A as -% duplicates. +% Only stoichiometry is compared — bounds, GPRs, and annotations are +% ignored. The default treats A→B and B→A as duplicates (typical curation +% use case: "find reactions that could be merged"). % -% Output: -% pairs Nx2 numeric array of reaction-index pairs -% (i, j) where reactions i and j share the -% same (possibly negated) stoichiometry, with -% i < j. Empty if the model has no duplicates. 
+% Parameters +% ---------- +% model : struct +% RAVEN model struct. +% ignoreDirection : logical, optional +% treat A→B and B→A as duplicates (default true). % -% Usage: pairs = findDuplicateRxns(model) -% pairs = findDuplicateRxns(model, false) +% Returns +% ------- +% pairs : double +% Nx2 numeric array of reaction-index pairs (i, j) where reactions i +% and j share the same (possibly negated) stoichiometry, with i < j. +% Empty if the model has no duplicates. +% +% Examples +% -------- +% pairs = findDuplicateRxns(model); +% pairs = findDuplicateRxns(model, false); if nargin < 2 ignoreDirection = true; diff --git a/manipulation/generateNewIds.m b/manipulation/generateNewIds.m index f66c6aa5..ade50c06 100755 --- a/manipulation/generateNewIds.m +++ b/manipulation/generateNewIds.m @@ -1,21 +1,33 @@ function newIds=generateNewIds(model,type,prefix,quantity,numLength) -% generateNewIds -% Generates a list of new metabolite or reaction ids, sequentially -% numbered with a defined prefix. The model is queried for the highest -% existing number of that type of id. +% generateNewIds Generate a list of new metabolite or reaction ids. % -% model model structure -% type string specifying type of id, 'rxns' or 'mets' -% prefix string specifying prefix to be used in all ids. E.g. 's_' -% or 'r_'. -% quantity number of new ids that should be generated (optional, default 1) -% numLength length of numerical part of id. E.g. 4 gives ids like -% r_0001 and 6 gives ids like r_000001. If the prefix is -% already used in the model, then the model-defined length -% will be used instead. (optional, default 4) +% The ids are sequentially numbered with a defined prefix. The model is +% queried for the highest existing number of that type of id. % -% Usage: newIds=generateNewIds(model,type,prefix,quantity,numLength) -% +% Parameters +% ---------- +% model : struct +% model structure. +% type : char +% type of id, 'rxns' or 'mets'. +% prefix : char +% prefix to be used in all ids, e.g. 
's_' or 'r_'. +% quantity : double, optional +% number of new ids that should be generated (default 1). +% numLength : double, optional +% length of the numerical part of the id. E.g. 4 gives ids like r_0001 +% and 6 gives ids like r_000001. If the prefix is already used in the +% model, then the model-defined length will be used instead +% (default 4). +% +% Returns +% ------- +% newIds : cell +% cell array with the generated ids. +% +% Examples +% -------- +% newIds = generateNewIds(model, type, prefix, quantity, numLength); type=char(type); prefix=char(prefix); diff --git a/manipulation/mergeCompartments.m b/manipulation/mergeCompartments.m index 1c8d9b52..7732d7f3 100755 --- a/manipulation/mergeCompartments.m +++ b/manipulation/mergeCompartments.m @@ -1,36 +1,46 @@ function [model, deletedRxns, duplicateRxns]=mergeCompartments(model,keepUnconstrained,deleteRxnsWithOneMet,distReverse) -% mergeCompartments -% Merge all compartments in a model +% mergeCompartments Merge all compartments in a model. % -% model a model structure -% keepUnconstrained keep metabolites that are unconstrained in a -% 'unconstrained' compartment. If these are merged the -% exchange reactions will most often be deleted (optional, -% default false) -% deleteRxnsWithOneMet delete reactions with only one metabolite. These -% reactions come from reactions such as A[c] + B[c] -% => A[m]. In some models hydrogen is balanced around -% each membrane with reactions like this (optional, -% default false) -% distReverse distinguish reactions with same metabolites but -% different reversibility as different reactions -% (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% keepUnconstrained : logical, optional +% keep metabolites that are unconstrained in a 'unconstrained' +% compartment. If these are merged the exchange reactions will most often +% be deleted (default false). 
+% deleteRxnsWithOneMet : logical, optional +% delete reactions with only one metabolite. These reactions come from +% reactions such as A[c] + B[c] => A[m]. In some models hydrogen is +% balanced around each membrane with reactions like this (default +% false). +% distReverse : logical, optional +% distinguish reactions with same metabolites but different reversibility +% as different reactions (default true). % -% model a model with all reactions located to one compartment -% deletedRxns reactions that were deleted because of only -% having one metabolite after merging -% duplicateRxns identical reactions that occurred in different -% compartments and were deleted because they turned -% to be duplicated after merging +% Returns +% ------- +% model : struct +% a model with all reactions located to one compartment. +% deletedRxns : cell +% reactions that were deleted because of only having one metabolite +% after merging. +% duplicateRxns : cell +% identical reactions that occurred in different compartments and were +% deleted because they turned out to be duplicated after merging. % -% Merges all compartments into one 's' compartment (for 'System'). This can -% be useful for example to ensure that there are metabolic capabilities to -% synthesize all metabolites. +% Examples +% -------- +% [model, deletedRxns, duplicateRxns] = mergeCompartments(model); % -% NOTE: If the metabolite IDs reflect the compartment that they are in -% the IDs may no longer be representative. +% Notes +% ----- +% Merges all compartments into one 's' compartment (for 'System'). This can +% be useful for example to ensure that there are metabolic capabilities to +% synthesize all metabolites. % -% Usage: [model, deletedRxns, duplicateRxns]=mergeCompartments(model,keepUnconstrained,deleteRxnsWithOneMet,distReverse) +% If the metabolite IDs reflect the compartment that they are in the IDs may +% no longer be representative.
if nargin<2 keepUnconstrained=false; diff --git a/manipulation/mergeModels.m b/manipulation/mergeModels.m index fe07a4a0..98c2414a 100755 --- a/manipulation/mergeModels.m +++ b/manipulation/mergeModels.m @@ -1,28 +1,35 @@ function model=mergeModels(models,metParam,supressWarnings,copyToComps) -% mergeModels -% Merges models into one model structure. Reactions are added without any -% checks, so duplicate reactions might appear. Metabolites are matched by -% their name and compartment (metaboliteName[comp]), while genes are -% matched by their name. +% mergeModels Merge models into one model structure. % -% Input: -% models a cell array with model structures -% metParam string metabolite name ('metNames') or ID ('mets') are -% used for matching (optional, default 'metNames') -% supressWarnings logical whether warnings should be supressed (optional, -% default false) -% copyToComps logical whether mergeModels is run via copyToComps -% (optional, default false) +% Merges models into one model structure. Reactions are added without any +% checks, so duplicate reactions might appear. Metabolites are matched by +% their name and compartment (metaboliteName[comp]), while genes are matched +% by their name. % -% Output: -% model a model structure with the merged model. Follows the -% structure of normal models but also has 'rxnFrom/ -% metFrom/geneFrom' fields to indicate from which model -% each reaction/metabolite/gene was taken. If the model -% already has 'rxnFrom/metFrom/geneFrom' fields, then -% these fields are not modified. +% Parameters +% ---------- +% models : cell +% a cell array with model structures. +% metParam : char, optional +% string, metabolite name ('metNames') or ID ('mets') are used for +% matching (default 'metNames'). +% supressWarnings : logical, optional +% whether warnings should be suppressed (default false). +% copyToComps : logical, optional +% whether mergeModels is run via copyToComps (default false).
% -% Usage: model=mergeModels(models) +% Returns +% ------- +% model : struct +% a model structure with the merged model. Follows the structure of +% normal models but also has 'rxnFrom/metFrom/geneFrom' fields to +% indicate from which model each reaction/metabolite/gene was taken. If +% the model already has 'rxnFrom/metFrom/geneFrom' fields, then these +% fields are not modified. +% +% Examples +% -------- +% model = mergeModels(models); arguments models; diff --git a/manipulation/permuteModel.m b/manipulation/permuteModel.m index 4fad67c2..d71f33d6 100755 --- a/manipulation/permuteModel.m +++ b/manipulation/permuteModel.m @@ -1,18 +1,25 @@ function newModel=permuteModel(model, indexes, type) -% permuteModel -% Changes the order of the reactions or metabolites in a model +% permuteModel Change the order of the reactions or metabolites in a model. % -% Input: -% model a model structure -% indexes a vector with the same length as the number of items in the -% model, which gives the new order of items -% type 'rxns' for reactions, 'mets' for metabolites, 'genes' for -% genes, 'comps' for compartments +% Parameters +% ---------- +% model : struct +% a model structure. +% indexes : double +% a vector with the same length as the number of items in the model, +% which gives the new order of items. +% type : char +% 'rxns' for reactions, 'mets' for metabolites, 'genes' for genes, +% 'comps' for compartments. % -% Output: -% newModel an updated model structure +% Returns +% ------- +% newModel : struct +% an updated model structure. 
% -% Usage: newModel=permuteModel(model, indexes, type) +% Examples +% -------- +% newModel = permuteModel(model, indexes, type); newModel=model; type=char(type); diff --git a/manipulation/removeBadRxns.m b/manipulation/removeBadRxns.m index 9b6328dd..20625c0e 100755 --- a/manipulation/removeBadRxns.m +++ b/manipulation/removeBadRxns.m @@ -1,73 +1,85 @@ function [newModel, removedRxns]=removeBadRxns(model,rxnRules,ignoreMets,isNames,balanceElements,refModel,ignoreIntBounds,printReport) -% removeBadRxns -% Iteratively removes reactions which enable production/consumption of some -% metabolite without any uptake/excretion +% removeBadRxns Remove reactions that enable production/consumption from nothing. % -% model a model structure. For the intented function, -% the model shouldn't allow for any uptake/excretion. -% The easiest way to achieve this is to import the -% model using importModel('filename',false) -% rxnRules 1: only remove reactions which are unbalanced -% 2: also remove reactions which couldn't be checked for -% mass balancing -% 3: all reactions can be removed -% (optional, default 1) -% ignoreMets either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% of a vector of indexes for metabolites to exclude from -% this analysis (optional, default []) -% isNames true if the supplied mets represent metabolite names -% (as opposed to IDs). This is a way to delete -% metabolites in several compartments at once without -% knowing the exact IDs. This only works if ignoreMets -% is a cell array (optional, default false) -% balanceElements a cell array with the elements for which to -% balance the reactions. May contain any -% combination of the elements defined in parseFormulas -% (optional, default {'C';'P';'S';'N';'O'}) -% refModel a reference model which can be used to ensure -% that the resulting model is still functional. 
-% The intended use is that the reference model is -% a copy of model, but with uptake/excretion allowed and -% some objectives (such as production of biomass) -% constrained to a non-zero flux. Before a -% reaction is removed from "model" the function first -% checks that the same deletion in "refModel" -% doesn't render the problem unfeasible (optional) -% ignoreIntBounds true if internal bounds (including reversibility) -% should be ignored. Exchange reactions are not affected. -% This can be used to find unbalanced solutions which are -% not possible using the default constraints (optional, -% default false) -% printReport true if a report should be printed (optional, -% default false) +% Iteratively removes reactions which enable production/consumption of some +% metabolite without any uptake/excretion. % -% newModel a model structure after the problematic -% reactions have been deleted -% removedRxns a cell array with the reactions that were -% removed +% Parameters +% ---------- +% model : struct +% a model structure. For the intended function, the model shouldn't +% allow for any uptake/excretion. The easiest way to achieve this is to +% import the model using importModel('filename', false). +% rxnRules : double, optional +% which reactions may be removed (default 1): % -% The purpose of this function is to remove reactions which enable -% production/consumption of metabolites even when exchange reactions aren't used. -% Many models, especially if they are automatically inferred from -% databases, will have unbalanced reactions which allow for -% net-production/consumption of metabolites without any consumption/excretion. -% A common reason for this is when general compounds have different meaning -% in different reactions (as DNA has in these two reactions). 
-% dATP + dGTP + dCTP + dTTP <=> DNA + 4 PPi -% 0.25 dATP + 0.25 dGTP + 0.25 dCTP + 0.25 dTTP <=> DNA + PPi -% Reactions that are problematic like this are always elementally -% unbalanced, but it is not always that you would like to exclude all -% unbalanced reactions from your model. -% This function tries to remove as few problematic reactions as possible -% so that the model cannot produce/consume anything from nothing. This is done by -% repeatedly calling makeSomething/consumeSomething, checking if any of -% the involved reactions are elementally unbalanced, remove one of them, -% and then iterating until no metabolites can be produced/consumed. -% makeSomething is called before consumeSomething. +% - 1 : only remove reactions which are unbalanced +% - 2 : also remove reactions which couldn't be checked for mass +% balancing +% - 3 : all reactions can be removed +% ignoreMets : cell or logical or double, optional +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of indexes +% for metabolites to exclude from this analysis (default []). +% isNames : logical, optional +% true if the supplied mets represent metabolite names (as opposed to +% IDs). This is a way to delete metabolites in several compartments at +% once without knowing the exact IDs. This only works if ignoreMets is a +% cell array (default false). +% balanceElements : cell, optional +% a cell array with the elements for which to balance the reactions. May +% contain any combination of the elements defined in parseFormulas +% (default {'C';'P';'S';'N';'O'}). +% refModel : struct, optional +% a reference model which can be used to ensure that the resulting model +% is still functional. The intended use is that the reference model is a +% copy of model, but with uptake/excretion allowed and some objectives +% (such as production of biomass) constrained to a non-zero flux. 
Before +% a reaction is removed from "model" the function first checks that the +% same deletion in "refModel" doesn't render the problem unfeasible. +% ignoreIntBounds : logical, optional +% true if internal bounds (including reversibility) should be ignored. +% Exchange reactions are not affected. This can be used to find +% unbalanced solutions which are not possible using the default +% constraints (default false). +% printReport : logical, optional +% true if a report should be printed (default false). % -% Usage: [newModel, removedRxns]=removeBadRxns(model,rxnRules,... -% ignoreMets,isNames,refModel,ignoreIntBounds,printReport) +% Returns +% ------- +% newModel : struct +% a model structure after the problematic reactions have been deleted. +% removedRxns : cell +% a cell array with the reactions that were removed. +% +% Notes +% ----- +% The purpose of this function is to remove reactions which enable +% production/consumption of metabolites even when exchange reactions aren't +% used. Many models, especially if they are automatically inferred from +% databases, will have unbalanced reactions which allow for +% net-production/consumption of metabolites without any consumption or +% excretion. A common reason for this is when general compounds have +% different meaning in different reactions (as DNA has in these two +% reactions): +% +% dATP + dGTP + dCTP + dTTP <=> DNA + 4 PPi +% 0.25 dATP + 0.25 dGTP + 0.25 dCTP + 0.25 dTTP <=> DNA + PPi +% +% Reactions that are problematic like this are always elementally +% unbalanced, but it is not always the case that you would like to exclude +% all unbalanced reactions from your model. This function tries to remove as +% few problematic reactions as possible so that the model cannot +% produce/consume anything from nothing. 
This is done by repeatedly calling +% makeSomething/consumeSomething, checking if any of the involved reactions +% are elementally unbalanced, removing one of them, and then iterating until +% no metabolites can be produced/consumed. makeSomething is called before +% consumeSomething. +% +% Examples +% -------- +% [newModel, removedRxns] = removeBadRxns(model, rxnRules, ignoreMets, ... +% isNames, balanceElements, refModel, ignoreIntBounds, printReport); if nargin<2 rxnRules=1; diff --git a/manipulation/removeGenes.m b/manipulation/removeGenes.m index ac2b5cf5..cd4980ad 100755 --- a/manipulation/removeGenes.m +++ b/manipulation/removeGenes.m @@ -1,21 +1,31 @@ function reducedModel = removeGenes(model,genesToRemove,removeUnusedMets,removeBlockedRxns,standardizeRules) -% removeGenes -% Deletes a set of genes from a model +% removeGenes Delete a set of genes from a model. % -% model a model structure -% genesToRemove either a cell array of gene IDs, a logical vector -% with the same number of elements as genes in the model, -% or a vector of indexes to remove -% removeUnusedMets remove metabolites that are no longer in use (optional, default -% false) -% removeBlockedRxns remove reactions that get blocked after deleting the genes -% (optional, default false) -% standardizeRules format gene rules to be compliant with standard format -% (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% genesToRemove : cell or logical or double +% either a cell array of gene IDs, a logical vector with the same number +% of elements as genes in the model, or a vector of indexes to remove. +% removeUnusedMets : logical, optional +% remove metabolites that are no longer in use (default false). +% removeBlockedRxns : logical, optional +% remove reactions that get blocked after deleting the genes (default +% false). +% standardizeRules : logical, optional +% format gene rules to be compliant with the standard format (default +% true). 
% -% reducedModel an updated model structure +% Returns +% ------- +% reducedModel : struct +% an updated model structure. % -% Usage: reducedModel = removeGenes(model,genesToRemove,removeUnusedMets,removeBlockedRxns) +% Examples +% -------- +% reducedModel = removeGenes(model, genesToRemove, removeUnusedMets, ... +% removeBlockedRxns, standardizeRules); if nargin<3 removeUnusedMets = false; diff --git a/manipulation/removeMets.m b/manipulation/removeMets.m index 0f4a5f26..ded6eda5 100755 --- a/manipulation/removeMets.m +++ b/manipulation/removeMets.m @@ -1,27 +1,35 @@ function reducedModel=removeMets(model,metsToRemove,isNames,removeUnusedRxns,removeUnusedGenes,removeUnusedComps) -% removeMets -% Deletes a set of metabolites from a model +% removeMets Delete a set of metabolites from a model. % -% model a model structure -% metsToRemove either a cell array of metabolite IDs, a logical vector -% with the same number of elements as metabolites in the model, -% of a vector of indexes to remove -% isNames true if the supplied mets represent metabolite names -% (as opposed to IDs). This is a way to delete -% metabolites in several compartments at once without -% knowing the exact IDs. This only works if metsToRemove -% is a cell array (optional, default false) -% removeUnusedRxns remove reactions that are no longer in use (optional, -% default false) -% removeUnusedGenes remove genes that are no longer in use (optional, -% default false) -% removeUnusedComps remove compartments that are no longer in use (optional, -% default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% metsToRemove : cell or logical or double +% either a cell array of metabolite IDs, a logical vector with the same +% number of elements as metabolites in the model, or a vector of +% indexes to remove. +% isNames : logical, optional +% true if the supplied mets represent metabolite names (as opposed to +% IDs). 
This is a way to delete metabolites in several compartments at +% once without knowing the exact IDs. This only works if metsToRemove +% is a cell array (default false). +% removeUnusedRxns : logical, optional +% remove reactions that are no longer in use (default false). +% removeUnusedGenes : logical, optional +% remove genes that are no longer in use (default false). +% removeUnusedComps : logical, optional +% remove compartments that are no longer in use (default false). % -% reducedModel an updated model structure +% Returns +% ------- +% reducedModel : struct +% an updated model structure. % -% Usage: reducedModel=removeMets(model,metsToRemove,isNames,... -% removeUnusedRxns,removeUnusedGenes,removeUnusedComps) +% Examples +% -------- +% reducedModel = removeMets(model, metsToRemove, isNames, ... +% removeUnusedRxns, removeUnusedGenes, removeUnusedComps); if ~islogical(metsToRemove) && ~isnumeric(metsToRemove) metsToRemove=convertCharArray(metsToRemove); end diff --git a/manipulation/removeReactions.m b/manipulation/removeReactions.m index 4255c6e2..a009f788 100755 --- a/manipulation/removeReactions.m +++ b/manipulation/removeReactions.m @@ -1,24 +1,30 @@ function reducedModel=removeReactions(model,rxnsToRemove,removeUnusedMets,removeUnusedGenes,removeUnusedComps) -% removeReactions -% Deletes a set of reactions from a model +% removeReactions Delete a set of reactions from a model. % -% Input: -% model a model structure -% rxnsToRemove either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the -% model, or a vector of indexes to remove -% removeUnusedMets remove metabolites that are no longer in use -% (optional, default false) -% removeUnusedGenes remove genes that are no longer in use (optional, -% default false) -% removeUnusedComps remove compartments that are no longer in use -% (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. 
+% rxnsToRemove : cell or logical or double +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of indexes +% to remove. +% removeUnusedMets : logical, optional +% remove metabolites that are no longer in use (default false). +% removeUnusedGenes : logical, optional +% remove genes that are no longer in use (default false). +% removeUnusedComps : logical, optional +% remove compartments that are no longer in use (default false). % -% Output: -% reducedModel an updated model structure +% Returns +% ------- +% reducedModel : struct +% an updated model structure. % -% Usage: reducedModel = removeReactions(model, rxnsToRemove, removeUnusedMets,... -% removeUnusedGenes, removeUnusedComps) +% Examples +% -------- +% reducedModel = removeReactions(model, rxnsToRemove, removeUnusedMets, ... +% removeUnusedGenes, removeUnusedComps); if nargin<3 removeUnusedMets=false; diff --git a/manipulation/replaceMets.m b/manipulation/replaceMets.m index 9301d862..7ecaa9cc 100755 --- a/manipulation/replaceMets.m +++ b/manipulation/replaceMets.m @@ -1,32 +1,46 @@ function [model, removedRxns, idxDuplRxns]=replaceMets(model,metabolite,replacement,verbose,identifiers) -% replaceMets -% Replaces metabolite names and annotation with replacement metabolite -% that is already in the model. If this results in duplicate metabolites, -% the replacement metabolite will be kept, while the S matrix is updated -% to use the replacement metabolite instead. At the end, contractModel is -% run to remove any duplicate reactions that might have occured. +% replaceMets Replace a metabolite with another already in the model. 
% -% Input: -% model a model structure -% metabolite string with name of metabolite to be replace -% replacement string with name of replacement metabolite -% verbose logical whether to print the ids of reactions that -% involve the replaced metabolite (optional, default -% false) -% identifiers true if 'metabolite' and 'replacement' refer to -% metabolite identifiers instead of metabolite names -% (optional, default false) -% -% Output: -% model model structure with selected metabolites replaced -% removedRxns identifiers of duplicate reactions that were removed -% idxDuplRxns index of removedRxns in original model +% Replaces metabolite names and annotation with a replacement metabolite +% that is already in the model. If this results in duplicate metabolites, +% the replacement metabolite will be kept, while the S matrix is updated to +% use the replacement metabolite instead. At the end, contractModel is run +% to remove any duplicate reactions that might have occurred. % -% Note: This function is useful when the model contains both 'oxygen' and -% 'o2' as metabolite names. If 'oxygen' and 'o2' are identifiers instead, -% then the 'identifiers' flag should be set to true. +% Parameters +% ---------- +% model : struct +% a model structure. +% metabolite : char +% string with name of metabolite to be replaced. +% replacement : char +% string with name of replacement metabolite. +% verbose : logical, optional +% whether to print the ids of reactions that involve the replaced +% metabolite (default false). +% identifiers : logical, optional +% true if 'metabolite' and 'replacement' refer to metabolite +% identifiers instead of metabolite names (default false). % -% Usage: [model, removedRxns, idxDuplRxns] = replaceMets(model, metabolite, replacement, verbose) +% Returns +% ------- +% model : struct +% model structure with selected metabolites replaced. +% removedRxns : cell +% identifiers of duplicate reactions that were removed. 
+% idxDuplRxns : double +% index of removedRxns in original model. +% +% Examples +% -------- +% [model, removedRxns, idxDuplRxns] = replaceMets(model, metabolite, ... +% replacement, verbose); +% +% Notes +% ----- +% This function is useful when the model contains both 'oxygen' and 'o2' as +% metabolite names. If 'oxygen' and 'o2' are identifiers instead, then the +% 'identifiers' flag should be set to true. metabolite=char(metabolite); replacement=char(replacement); diff --git a/manipulation/setExchangeBounds.m b/manipulation/setExchangeBounds.m index 89aa4ac7..74b16c19 100755 --- a/manipulation/setExchangeBounds.m +++ b/manipulation/setExchangeBounds.m @@ -1,50 +1,55 @@ function [exchModel,unusedMets] = setExchangeBounds(model,mets,lb,ub,closeOthers,mediaOnly) -% setExchangeBounds -% Define the exchange flux bounds for a given set of metabolites. +% setExchangeBounds Define exchange flux bounds for a set of metabolites. % -% Input: -% model a model structure -% mets a cell array of metabolite names (case insensitive) or -% metabolite IDs, or a vector of metabolite indices -% (optional, default all exchanged metabolites) -% lb lower bound of exchange flux. Can be either a vector of -% bounds corresponding to each of the provided metabolites, -% or a single value that will be applied to all. -% (optional, default to model.annotation.defaultLB if it exists, -% otherwise -1000) -% ub upper bound of exchange flux. Can be either a vector of -% bounds corresponding to each of the provided metabolites, -% or a single value that will be applied to all. -% (optional, default to model.annotation.defaultUB if it exists, -% otherwise 1000) -% closeOthers close exchange reactions for all other exchanged -% metabolites not present in the provided list. This will -% prevent IMPORT of the metabolites, but their EXPORT will -% not be modified. 
-% (optional, default true) -% mediaOnly only consider exchange reactions involving exchange to or -% from the extracellular (media) compartment. Reactions -% such as "sink" reactions that exchange metabolites -% directly with an intracellular compartment will therefore -% be ignored even though "getExchangeRxns" identifies such -% such reactions as exchange reactions. -% Note: The function will attempt to identify the -% extracellular compartment by the "compNames" field, and -% also requires the "metComps" field to be present, -% otherwise the mediaOnly flag will be ignored. -% (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% mets : cell or double, optional +% a cell array of metabolite names (case insensitive) or metabolite +% IDs, or a vector of metabolite indices (default all exchanged +% metabolites). +% lb : double, optional +% lower bound of exchange flux. Can be either a vector of bounds +% corresponding to each of the provided metabolites, or a single value +% that will be applied to all (default model.annotation.defaultLB if it +% exists, otherwise -1000). +% ub : double, optional +% upper bound of exchange flux. Can be either a vector of bounds +% corresponding to each of the provided metabolites, or a single value +% that will be applied to all (default model.annotation.defaultUB if it +% exists, otherwise 1000). +% closeOthers : logical, optional +% close exchange reactions for all other exchanged metabolites not +% present in the provided list. This will prevent IMPORT of the +% metabolites, but their EXPORT will not be modified (default true). +% mediaOnly : logical, optional +% only consider exchange reactions involving exchange to or from the +% extracellular (media) compartment. Reactions such as "sink" reactions +% that exchange metabolites directly with an intracellular compartment +% will therefore be ignored even though "getExchangeRxns" identifies +% such reactions as exchange reactions. 
The function will attempt to +% identify the extracellular compartment by the "compNames" field, and +% also requires the "metComps" field to be present, otherwise the +% mediaOnly flag will be ignored (default false). % -% Output: -% exchModel a model structure with updated exchange flux bounds for -% the provided set of metabolites -% unusedMets metabolites provided by the user that were not used -% because they are not involved in any exchange reactions -% in the model +% Returns +% ------- +% exchModel : struct +% a model structure with updated exchange flux bounds for the provided +% set of metabolites. +% unusedMets : cell +% metabolites provided by the user that were not used because they are +% not involved in any exchange reactions in the model. % -% NOTE: Exchange reactions involving more than one metabolite will be -% ignored. +% Examples +% -------- +% exchModel = setExchangeBounds(model, mets, lb, ub, closeOthers, ... +% mediaOnly); % -% Usage: exchModel = setExchangeBounds(model,mets,lb,ub,closeOthers,mediaOnly); +% Notes +% ----- +% Exchange reactions involving more than one metabolite will be ignored. % handle input arguments diff --git a/manipulation/setParam.m b/manipulation/setParam.m index 70705f3f..2b4e971c 100755 --- a/manipulation/setParam.m +++ b/manipulation/setParam.m @@ -1,33 +1,42 @@ function model=setParam(model, paramType, rxnList, params, var) -% setParam -% Sets parameters for reactions +% setParam Set parameters for reactions. 
% -% Input: -% model a model structure -% paramType the type of parameter to set: -% 'lb' lower bound -% 'ub' upper bound -% 'eq' both upper and lower bound (equality constraint) -% 'obj' objective coefficient -% 'rev' reversibility (only changes the model.rev fields, -% does not affect model.lb and model.ub) -% 'var' variance around measured bound -% 'unc' unconstrained, set lower and upper bound to the -% default values (-1000 and 1000, or any other values -% that are defined in model.annotation.defaultLB and -% .defaultUB) -% rxnList a cell array of reaction IDs or a vector with their -% corresponding indexes -% params a vector of the corresponding values -% var percentage of variance around measured value, if 'var' is -% set as paramType. Defining 'var' as 5 results in lb and ub -% at 97.5% and 102.5% of the provide params value (if params -% value is negative, then lb and ub are 102.5% and 97.5%). +% Parameters +% ---------- +% model : struct +% a model structure. +% paramType : char +% the type of parameter to set: % -% Output: -% model an updated model structure +% - 'lb' : lower bound. +% - 'ub' : upper bound. +% - 'eq' : both upper and lower bound (equality constraint). +% - 'obj' : objective coefficient. +% - 'rev' : reversibility (only changes the model.rev fields, does not +% affect model.lb and model.ub). +% - 'var' : variance around measured bound. +% - 'unc' : unconstrained, set lower and upper bound to the default +% values (-1000 and 1000, or any other values that are defined in +% model.annotation.defaultLB and .defaultUB). +% rxnList : cell or double +% a cell array of reaction IDs or a vector with their corresponding +% indexes. +% params : double +% a vector of the corresponding values. +% var : double, optional +% percentage of variance around measured value, if 'var' is set as +% paramType. 
Defining 'var' as 5 results in lb and ub at 97.5% and +% 102.5% of the provided params value (if params value is negative, +% then lb and ub are 102.5% and 97.5%). % -% Usage: model = setParam(model, paramType, rxnList, params, var) +% Returns +% ------- +% model : struct +% an updated model structure. +% +% Examples +% -------- +% model = setParam(model, paramType, rxnList, params, var); paramType=convertCharArray(paramType); if ~any(strcmpi(paramType,{'lb','ub','eq','obj','rev','var','unc'})) diff --git a/manipulation/simplifyModel.m b/manipulation/simplifyModel.m index d99f7ae6..1c862579 100755 --- a/manipulation/simplifyModel.m +++ b/manipulation/simplifyModel.m @@ -1,40 +1,55 @@ function [reducedModel, deletedReactions, deletedMetabolites]=simplifyModel(model,... deleteUnconstrained, deleteDuplicates, deleteZeroInterval, deleteInaccessible, deleteMinMax, groupLinear, constrainReversible, reservedRxns, suppressWarnings) -% simplifyModel -% Simplifies a model by deleting reactions/metabolites +% simplifyModel Simplify a model by deleting reactions and metabolites. % -% model a model structure -% deleteUnconstrained delete metabolites marked as unconstrained (optional, default true) -% deleteDuplicates delete all but one of duplicate reactions (optional, default false) -% deleteZeroInterval delete reactions that are constrained to zero flux (optional, default false) -% deleteInaccessible delete dead end reactions (optional, default false) -% deleteMinMax delete reactions that cannot carry a flux by trying -% to minimize/maximize the flux through that -% reaction. May be time consuming (optional, default false) -% groupLinear group linearly dependent pathways (optional, default false) -% constrainReversible check if there are reversible reactions which can -% only carry flux in one direction, and if so -% constrain them to be irreversible. 
This tends to -% allow for more reactions grouped when using -% groupLinear (optional, default false) -% reservedRxns cell array with reaction IDs that are not allowed to be -% removed (optional) -% suppressWarnings true if warnings should be suppressed (optional, -% default false) +% This function is for reducing the model size by removing reactions and +% associated metabolites that cannot carry flux. It can also be used for +% identifying different types of gaps. % -% reducedModel an updated model structure -% deletedReactions a cell array with the IDs of all deleted reactions -% deletedMetabolites a cell array with the IDs of all deleted -% metabolites +% Parameters +% ---------- +% model : struct +% a model structure. +% deleteUnconstrained : logical, optional +% delete metabolites marked as unconstrained (default true). +% deleteDuplicates : logical, optional +% delete all but one of duplicate reactions (default false). +% deleteZeroInterval : logical, optional +% delete reactions that are constrained to zero flux (default false). +% deleteInaccessible : logical, optional +% delete dead end reactions (default false). +% deleteMinMax : logical, optional +% delete reactions that cannot carry a flux by trying to +% minimize/maximize the flux through that reaction. May be time +% consuming (default false). +% groupLinear : logical, optional +% group linearly dependent pathways (default false). +% constrainReversible : logical, optional +% check if there are reversible reactions which can only carry flux in +% one direction, and if so constrain them to be irreversible. This +% tends to allow more reactions to be grouped when using groupLinear +% (default false). +% reservedRxns : cell, optional +% cell array with reaction IDs that are not allowed to be removed +% (default none). +% suppressWarnings : logical, optional +% true if warnings should be suppressed (default false).
% -% This function is for reducing the model size by removing -% reactions and associated metabolites that cannot carry flux. It can also -% be used for identifying different types of gaps. +% Returns +% ------- +% reducedModel : struct +% an updated model structure. +% deletedReactions : cell +% a cell array with the IDs of all deleted reactions. +% deletedMetabolites : cell +% a cell array with the IDs of all deleted metabolites. % -% Usage: [reducedModel, deletedReactions, deletedMetabolites]=simplifyModel(model,... -% deleteUnconstrained, deleteDuplicates, deleteZeroInterval,... -% deleteInaccessible, deleteMinMax, groupLinear,... -% constrainReversible, reservedRxns, suppressWarnings) +% Examples +% -------- +% [reducedModel, deletedReactions, deletedMetabolites] = ... +% simplifyModel(model, deleteUnconstrained, deleteDuplicates, ... +% deleteZeroInterval, deleteInaccessible, deleteMinMax, ... +% groupLinear, constrainReversible, reservedRxns, suppressWarnings); if nargin<2 deleteUnconstrained=true; diff --git a/manipulation/sortIdentifiers.m b/manipulation/sortIdentifiers.m index 98738de5..5a062a1e 100755 --- a/manipulation/sortIdentifiers.m +++ b/manipulation/sortIdentifiers.m @@ -1,16 +1,22 @@ function newModel = sortIdentifiers(model) -% exportModel -% Sort reactions, metabolites, genes and compartments alphabetically by -% their identifier. +% sortIdentifiers Sort model identifiers alphabetically. % -% Input: -% model a model structure +% Sort reactions, metabolites, genes and compartments alphabetically by +% their identifier. % -% Output: -% newModel an updated model structure with alphabetically sorted -% identifiers +% Parameters +% ---------- +% model : struct +% a model structure. % -% Usage: newModel=sortIdentifiers(model) +% Returns +% ------- +% newModel : struct +% an updated model structure with alphabetically sorted identifiers. 
+% +% Examples +% -------- +% newModel = sortIdentifiers(model); [~,I]=sort(model.rxns); newModel=permuteModel(model,I,'rxns'); diff --git a/manipulation/sortModel.m b/manipulation/sortModel.m index 7dc00d7e..0ff3d53d 100755 --- a/manipulation/sortModel.m +++ b/manipulation/sortModel.m @@ -1,23 +1,31 @@ function model=sortModel(model,sortReversible,sortMetName,sortReactionOrder) -% sortModel -% Sorts a model based on metabolite names and compartments +% sortModel Sort a model based on metabolite names and compartments. % -% model a model structure -% sortReversible sorts the reversible reactions so the the metabolite -% that is first in lexiographical order is a reactant -% (optional, default true) -% sortMetName sort the metabolite names in the equation, also uses -% compartment abbreviation (optional, default false) -% sortReactionOrder sorts the reaction order within each subsystem so that -% reactions consuming some metabolite comes efter -% reactions producing it. This overrides the -% sortReversible option and reactions are sorted so that -% the production direction matches the consumption -% direction (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% sortReversible : logical, optional +% sorts the reversible reactions so that the metabolite that is first +% in lexicographical order is a reactant (default true). +% sortMetName : logical, optional +% sort the metabolite names in the equation, also uses compartment +% abbreviation (default false). +% sortReactionOrder : logical, optional +% sorts the reaction order within each subsystem so that reactions +% consuming some metabolite come after reactions producing it. This +% overrides the sortReversible option and reactions are sorted so that +% the production direction matches the consumption direction (default +% false). % -% model an updated model structure +% Returns +% ------- +% model : struct +% an updated model structure. 
% -% Usage: model=sortModel(model,sortReversible,sortMetName,sortReactionOrder) +% Examples +% -------- +% model = sortModel(model, sortReversible, sortMetName, sortReactionOrder); if nargin<2 sortReversible=true; diff --git a/manipulation/standardizeGrRules.m b/manipulation/standardizeGrRules.m index 7d079043..6d6c560c 100755 --- a/manipulation/standardizeGrRules.m +++ b/manipulation/standardizeGrRules.m @@ -1,25 +1,42 @@ function [grRules,rxnGeneMat,indexes2check] = standardizeGrRules(model,embedded) -% standardizeGrRules -% Standardizes gene-rxn rules in a model according to the following -% - No overall containing brackets -% - Just enzyme complexes are enclosed into brackets -% - ' and ' & ' or ' strings are strictly set to lowercases +% standardizeGrRules Standardize gene-reaction rules in a model. % -% A rxnGeneMat matrix consistent with the standardized grRules is created. +% The grRules are standardized according to the following: % -% model a model structure -% embedded true if this function is called inside of another -% RAVEN function (optional, default false) +% - No overall containing brackets +% - Just enzyme complexes are enclosed into brackets +% - ' and ' and ' or ' strings are strictly set to lowercase % -% grRules [nRxns x 1] cell array with the standardized grRules -% rxnGeneMat [nRxns x nGenes]Sparse matrix consitent with the -% standardized grRules -% -% If this function is going to be used in a model reconstruction or -% modification pipeline it is recommended to run this function just -% at the beginning of the process. +% A rxnGeneMat matrix consistent with the standardized grRules is created. % -% Usage: [grRules,rxnGeneMat,indexes2check]=standardizeGrRules(model,embedded) +% Parameters +% ---------- +% model : struct +% a model structure. +% embedded : logical, optional +% true if this function is called inside of another RAVEN function +% (default false). 
+% +% Returns +% ------- +% grRules : cell +% [nRxns x 1] cell array with the standardized grRules. +% rxnGeneMat : double +% [nRxns x nGenes] sparse matrix consistent with the standardized +% grRules. +% indexes2check : double +% indices of the grRules with potentially problematic relationships +% that should be checked manually. +% +% Examples +% -------- +% [grRules, rxnGeneMat, indexes2check] = standardizeGrRules(model, embedded); +% +% Notes +% ----- +% If this function is going to be used in a model reconstruction or +% modification pipeline it is recommended to run this function just at the +% beginning of the process. %Preallocate fields n = length(model.rxns); diff --git a/omics/parseHPA.m b/omics/parseHPA.m index 5b7d4ca3..f9ea2d64 100755 --- a/omics/parseHPA.m +++ b/omics/parseHPA.m @@ -1,40 +1,42 @@ function hpaData=parseHPA(fileName, version) -% parseHPA -% Parses a database dump of the Human Protein Atlas (HPA) +% parseHPA Parse a database dump of the Human Protein Atlas (HPA). % -% Input: -% fileName comma- or tab-separated database dump of HPA. For details -% regarding the format, see -% http://www.proteinatlas.org/about/download. -% version version of HPA [optional, default=19] +% Parameters +% ---------- +% fileName : char +% comma- or tab-separated database dump of HPA. For details regarding +% the format, see http://www.proteinatlas.org/about/download. +% version : double, optional +% version of HPA (default 19). % +% Returns +% ------- +% hpaData : struct +% parsed HPA data with fields: % -% Output: -% hpaData -% genes cell array with the unique gene names. In -% version >=18 this is the ensemble name, see -% geneNames below for the names in ver >=18 -% geneNames cell array with the gene names, indexed the -% same way as genes. -% tissues cell array with the tissue names. 
The list may not be -% unique, as there can be multiple cell types per tissue -% celltypes cell array with the cell type names for each tissue -% levels cell array with the unique expression levels -% types cell array with the unique evidence types -% reliabilities cell array with the unique reliability levels +% - genes : cell array with the unique gene names. In version >=18 this +% is the Ensembl ID, see geneNames below for the names in ver >=18 +% - geneNames : cell array with the gene names, indexed the same way as +% genes +% - tissues : cell array with the tissue names. The list may not be +% unique, as there can be multiple cell types per tissue +% - celltypes : cell array with the cell type names for each tissue +% - levels : cell array with the unique expression levels +% - types : cell array with the unique evidence types +% - reliabilities : cell array with the unique reliability levels +% - gene2Level : gene-to-expression level mapping in sparse matrix form. +% The value for element i,j is the index in hpaData.levels of gene i +% in cell type j +% - gene2Type : gene-to-evidence type mapping in sparse matrix form. The +% value for element i,j is the index in hpaData.types of gene i in +% cell type j. Doesn't exist in version >=18. +% - gene2Reliability : gene-to-reliability level mapping in sparse +% matrix form. The value for element i,j is the index in +% hpaData.reliabilities of gene i in cell type j % -% gene2Level gene-to-expression level mapping in sparse matrix form. -% The value for element i,j is the index in -% hpaData.levels of gene i in cell type j -% gene2Type gene-to-evidence type mapping in sparse matrix form. -% The value for element i,j is the index in -% hpaData.types of gene i in cell type j. Doesn't -% exist in version >=18. -% gene2Reliability gene-to-reliability level mapping in sparse matrix form. 
-% The value for element i,j is the index in -% hpaData.reliabilities of gene i in cell type j -% -% Usage: hpaData=parseHPA(fileName,version) +% Examples +% -------- +% hpaData = parseHPA(fileName, version); if nargin<2 version=19; %Change this and add code for more versions when the current HPA version is increased and the format is changed diff --git a/omics/parseHPArna.m b/omics/parseHPArna.m index e58b3d69..8abf3c40 100755 --- a/omics/parseHPArna.m +++ b/omics/parseHPArna.m @@ -1,24 +1,28 @@ function arrayData=parseHPArna(fileName, version) -% parseHPA -% Parses a database dump of the Human Protein Atlas (HPA) RNA-Seq data. +% parseHPArna Parse a dump of Human Protein Atlas (HPA) RNA-Seq data. % -% Input: -% fileName tab-separated database dump of HPA RNA data. For -% details regarding the format, see -% http://www.proteinatlas.org/about/download. -% version version of HPA [optional, default=19] +% Parameters +% ---------- +% fileName : char +% tab-separated database dump of HPA RNA data. For details regarding the +% format, see http://www.proteinatlas.org/about/download. +% version : double, optional +% version of HPA (default 19). Only versions 18 and 19 are supported. 
% +% Returns +% ------- +% arrayData : struct +% parsed HPA RNA data with fields: % -% Output: -% arrayData -% genes cell array with the unique ensemble gene IDs -% geneNames cell array with the gene names (gene abbrevs) -% tissues cell array with the tissue names -% levels matrix of gene expression levels (TPM), where -% rows correspond to genes, and columns -% correspond to tissues +% - genes : cell array with the unique Ensembl gene IDs +% - geneNames : cell array with the gene names (gene abbrevs) +% - tissues : cell array with the tissue names +% - levels : matrix of gene expression levels (TPM), where rows +% correspond to genes, and columns correspond to tissues % -% Usage: arrayData=parseHPArna(fileName,version) +% Examples +% -------- +% arrayData = parseHPArna(fileName, version); if nargin<2 %Change this and add code for more versions when the current HPA diff --git a/omics/scoreModel.m b/omics/scoreModel.m index 896381e2..a1047104 100755 --- a/omics/scoreModel.m +++ b/omics/scoreModel.m @@ -1,60 +1,72 @@ function [rxnScores, geneScores, hpaScores, arrayScores]=scoreModel(model,hpaData,arrayData,tissue,celltype,noGeneScore,multipleGeneScoring,multipleCellScoring,hpaLevelScores) -% scoreRxns -% Scores the reactions and genes in a model based on expression data -% from HPA and/or gene arrays +% scoreModel Score model reactions and genes from HPA and/or array data. % -% Input: -% model a model structure -% hpaData HPA data structure from parseHPA (optional if arrayData is -% supplied, default []) -% arrayData gene expression data structure (optional if hpaData is -% supplied, default []) -% genes cell array with the unique gene names -% tissues cell array with the tissue names. The list may not be -% unique, as there can be multiple cell types per tissue -% celltypes cell array with the cell type names for each tissue -% levels GENESxTISSUES array with the expression level for -% each gene in each tissue/celltype. 
NaN should be -% used when no measurement was performed -% threshold a single value or a vector of gene expression -% thresholds, above which genes are considered to be -% "expressed". (optional, by default, the mean expression -% levels of each gene across all tissues in arrayData -% will be used as the threshold values) -% tissue tissue to score for. Should exist in either -% hpaData.tissues or arrayData.tissues -% celltype cell type to score for. Should exist in either -% hpaData.celltypes or arrayData.celltypes for this -% tissue (optional, default is to use the best values -% among all the cell types for the tissue. Use [] if -% you want to supply more arguments) -% noGeneScore score for reactions without genes (optional, default -2) -% multipleGeneScoring determines how scores are calculated for reactions -% with several genes ('best' or 'average') -% (optional, default 'best') -% multipleCellScoring determines how scores are calculated when several -% cell types are used ('best' or 'average') -% (optional, default 'best') -% hpaLevelScores structure with numerical scores for the expression -% level categories from HPA. The structure should have a -% "names" and a "scores" field (optional, see code for -% default scores) +% Scores the reactions and genes in a model based on expression data from +% HPA and/or gene arrays. % +% Parameters +% ---------- +% model : struct +% a model structure. +% hpaData : struct, optional +% HPA data structure from parseHPA (optional if arrayData is supplied, +% default []). +% arrayData : struct, optional +% gene expression data structure (optional if hpaData is supplied, +% default []) with fields: % -% Output: -% rxnScores scores for each of the reactions in model -% geneScores scores for each of the genes in model. Genes which are -% not in the dataset(s) have -Inf as scores -% hpaScores scores for each of the genes in model if only taking hpaData -% into account. 
Genes which are not in the dataset(s) -% have -Inf as scores -% arrayScores scores for each of the genes in model if only taking arrayData -% into account. Genes which are not in the dataset(s) -% have -Inf as scores +% - genes : cell array with the unique gene names +% - tissues : cell array with the tissue names. The list may not be +% unique, as there can be multiple cell types per tissue +% - celltypes : cell array with the cell type names for each tissue +% - levels : GENESxTISSUES array with the expression level for each gene +% in each tissue/celltype. NaN should be used when no measurement was +% performed +% - threshold : a single value or a vector of gene expression +% thresholds, above which genes are considered to be "expressed". +% (optional, by default, the mean expression levels of each gene +% across all tissues in arrayData will be used as the threshold +% values) +% tissue : char +% tissue to score for. Should exist in either hpaData.tissues or +% arrayData.tissues. +% celltype : char, optional +% cell type to score for. Should exist in either hpaData.celltypes or +% arrayData.celltypes for this tissue (default is to use the best values +% among all the cell types for the tissue). Use [] if you want to supply +% more arguments. +% noGeneScore : double, optional +% score for reactions without genes (default -2). +% multipleGeneScoring : char, optional +% determines how scores are calculated for reactions with several genes, +% 'best' or 'average' (default 'best'). +% multipleCellScoring : char, optional +% determines how scores are calculated when several cell types are used, +% 'best' or 'average' (default 'best'). +% hpaLevelScores : struct, optional +% structure with numerical scores for the expression level categories +% from HPA. The structure should have a "names" and a "scores" field +% (see code for default scores). % -% Usage: [rxnScores, geneScores, hpaScores, arrayScores]=scoreModel(model,... 
-% hpaData,arrayData,tissue,celltype,noGeneScore,multipleGeneScoring,... -% multipleCellScoring,hpaLevelScores) +% Returns +% ------- +% rxnScores : double +% scores for each of the reactions in model. +% geneScores : double +% scores for each of the genes in model. Genes which are not in the +% dataset(s) have -Inf as scores. +% hpaScores : double +% scores for each of the genes in model if only taking hpaData into +% account. Genes which are not in the dataset(s) have -Inf as scores. +% arrayScores : double +% scores for each of the genes in model if only taking arrayData into +% account. Genes which are not in the dataset(s) have -Inf as scores. +% +% Examples +% -------- +% [rxnScores, geneScores, hpaScores, arrayScores] = scoreModel(model, ... +% hpaData, arrayData, tissue, celltype, noGeneScore, ... +% multipleGeneScoring, multipleCellScoring, hpaLevelScores); if nargin<3 arrayData=[]; diff --git a/queries/buildEquation.m b/queries/buildEquation.m index a4a20563..bcfe77ec 100755 --- a/queries/buildEquation.m +++ b/queries/buildEquation.m @@ -1,15 +1,23 @@ function equationString=buildEquation(mets,stoichCoeffs,isrev) -% buildEquation -% Construct single equation string for a given reaction +% buildEquation Construct single equation string for a given reaction. % -% mets cell array with metabolites involved in the reaction. -% stoichCoeffs vector with corresponding stoichiometric coeffs. -% isrev logical indicating if the reaction is or not -% reversible. +% Parameters +% ---------- +% mets : cell +% metabolites involved in the reaction. +% stoichCoeffs : double +% vector with corresponding stoichiometric coeffs. +% isrev : logical +% indicates if the reaction is reversible or not. % -% equationString equation as a string +% Returns +% ------- +% equationString : char +% equation as a string. 
% -% Usage: equationString=buildEquation(mets,stoichCoeffs,isrev) +% Examples +% -------- +% equationString = buildEquation(mets, stoichCoeffs, isrev); mets=convertCharArray(mets); if ~isnumeric(stoichCoeffs) diff --git a/queries/checkModelStruct.m b/queries/checkModelStruct.m index eb9f8e6d..140b7c64 100755 --- a/queries/checkModelStruct.m +++ b/queries/checkModelStruct.m @@ -1,18 +1,26 @@ function checkModelStruct(model,throwErrors,trimWarnings) -% checkModelStruct -% Performs a number of checks to ensure that a model structure is ok +% checkModelStruct Perform a number of checks to ensure a model structure is ok. % -% model a model structure -% throwErrors true if the function should throw errors if -% inconsistencies are found. The alternative is to -% print warnings for all types of issues (optional, default true) -% trimWarnings true if only a maximal of 10 items should be displayed in -% a given error/warning (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% throwErrors : logical, optional +% true if the function should throw errors if inconsistencies are found. +% The alternative is to print warnings for all types of issues +% (default true). +% trimWarnings : logical, optional +% true if only a maximum of 10 items should be displayed in a given +% error/warning (default true). % -% NOTE: This is performed after importing a model from Excel or before -% attempting to export a model to SBML format. +% Notes +% ----- +% This is performed after importing a model from Excel or before attempting +% to export a model to SBML format. 
% -% Usage: checkModelStruct(model,throwErrors,trimWarnings) +% Examples +% -------- +% checkModelStruct(model, throwErrors, trimWarnings); if nargin<2 throwErrors=true; diff --git a/queries/constructEquations.m b/queries/constructEquations.m index c2f38189..06f5018f 100755 --- a/queries/constructEquations.m +++ b/queries/constructEquations.m @@ -1,38 +1,46 @@ function equationStrings=constructEquations(model,rxns,useComps,sortRevRxns,sortMetNames,useMetID,useFormula,useRevField) -% constructEquations -% Construct equation strings for reactions +% constructEquations Construct equation strings for reactions. % -% Input: -% model a model structure -% rxns either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the -% model, or a vector of reaction indexes (optional, default -% model.rxns) -% useComps include the compartment of each metabolite (optional, -% default true) -% sortRevRxns sort reversible reactions so that the metabolite that -% is first in the lexiographic order is a reactant -% (optional, default false) -% sortMetNames sort the metabolite names in the equation. Uses -% compartment even if useComps is false (optional, default -% false) -% useMetID use metabolite ID in generated equations (optional, -% default false) -% useFormula use metabolite formula in generated equations (optional, -% default false) -% useRevField use the model.rev field to indicate reaction -% reversibility, alternatively this is determined from -% the model.ub and model.lb fields (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of reaction +% indexes (default model.rxns). +% useComps : logical, optional +% include the compartment of each metabolite (default true). 
+% sortRevRxns : logical, optional +% sort reversible reactions so that the metabolite that is first in the +% lexicographic order is a reactant (default false). +% sortMetNames : logical, optional +% sort the metabolite names in the equation. Uses compartment even if +% useComps is false (default false). +% useMetID : logical, optional +% use metabolite ID in generated equations (default false). +% useFormula : logical, optional +% use metabolite formula in generated equations (default false). +% useRevField : logical, optional +% use the model.rev field to indicate reaction reversibility, +% alternatively this is determined from the model.ub and model.lb fields +% (default true). % -% Output: -% equationStrings a cell array with equations +% Returns +% ------- +% equationStrings : cell +% a cell array with equations. % -% If useRevField is false, then reactions should be organized in their -% forward direction (e.g. ub = 1000 and lb = -1000/0) for the -% reversibility to be correctly determined. +% Examples +% -------- +% equationStrings = constructEquations(model, rxns, useComps, ... +% sortRevRxns, sortMetNames, useMetID, useFormula, useRevField); % -% Usage: equationStrings = constructEquations(model, rxns, useComps,... -% sortRevRxns, sortMetNames, useMetID, useFormula, useRevField) +% Notes +% ----- +% If useRevField is false, then reactions should be organized in their +% forward direction (e.g. ub = 1000 and lb = -1000/0) for the reversibility +% to be correctly determined. if nargin<2 || isempty(rxns) rxns=model.rxns; diff --git a/queries/constructS.m b/queries/constructS.m index 5412cf24..034e8b82 100755 --- a/queries/constructS.m +++ b/queries/constructS.m @@ -1,26 +1,36 @@ function [S, mets, badRxns, reversible]=constructS(equations,mets,rxns) -% constructS -% Constructs a stoichiometric matrix from a cell array of equations +% constructS Construct a stoichiometric matrix from a cell array of equations. 
% -% equations cell array of equations on the form 'A + 2 B <=> 3 C', -% where <=> indicates reversible and => irreversible reactions -% mets cell array of metabolites. All metabolites in the equations -% must be present in the list (optional, default generated from -% the equations) -% rxns cell array of reaction ids. This is only used for printing -% reaction ids instead of equations in warnings/errors (optional, -% default []) +% Parameters +% ---------- +% equations : cell +% cell array of equations of the form 'A + 2 B <=> 3 C', where <=> +% indicates reversible and => irreversible reactions. +% mets : cell, optional +% cell array of metabolites. All metabolites in the equations must be +% present in the list (default generated from the equations). +% rxns : cell, optional +% cell array of reaction ids. This is only used for printing reaction ids +% instead of equations in warnings/errors (default []). % -% S the resulting stoichiometric matrix mets cell array with -% metabolites that corresponds to the order in the S matrix -% badRxns boolean vector with the reactions that have one or more -% metabolites as both substrate and product. An example would -% be the phosphotransferase ATP + ADP <=> ADP + ATP. In the -% stoichiometric matrix this equals to an empty reaction -% which can be problematic -% reversible boolean vector with true if the equation is reversible +% Returns +% ------- +% S : double +% the resulting stoichiometric matrix. +% mets : cell +% cell array with metabolites that correspond to the order in the S +% matrix. +% badRxns : logical +% boolean vector with the reactions that have one or more metabolites as +% both substrate and product. An example would be the phosphotransferase +% ATP + ADP <=> ADP + ATP. In the stoichiometric matrix this equals an +% empty reaction which can be problematic. +% reversible : logical +% boolean vector with true if the equation is reversible. 
% -% Usage: [S, mets, badRxns, reversible]=constructS(equations,mets) +% Examples +% -------- +% [S, mets, badRxns, reversible] = constructS(equations, mets); equations=convertCharArray(equations); switch nargin diff --git a/queries/getAllRxnsFromGenes.m b/queries/getAllRxnsFromGenes.m index 36e66cde..a1710c92 100755 --- a/queries/getAllRxnsFromGenes.m +++ b/queries/getAllRxnsFromGenes.m @@ -1,19 +1,27 @@ function allRxns=getAllRxnsFromGenes(model,rxns) -% getAllRxnsFromGenes -% Given a list of reactions, this function finds the associated genes in -% the template model and gives all reactions that are annotated by these -% genes. +% getAllRxnsFromGenes Find all reactions annotated by the genes of a set. % -% model a model structure -% rxns either a cell array of IDs, a logical vector with the -% same number of elements as reactions in the model, or a -% vector of indexes +% Given a list of reactions, this function finds the associated genes in +% the model and returns all reactions that are annotated by these genes. % -% allRxns either a cell array of IDs, a logical vector with the -% same number of elements as reactions in the model, or a -% vector of indexes, dependent on the format of rxns +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell or logical or double +% either a cell array of IDs, a logical vector with the same number of +% elements as reactions in the model, or a vector of indexes. % -% Usage: allRxns=getAllRxnsFromGenes(model,rxns) +% Returns +% ------- +% allRxns : cell or logical or double +% either a cell array of IDs, a logical vector with the same number of +% elements as reactions in the model, or a vector of indexes, +% dependent on the format of rxns. 
+% +% Examples +% -------- +% allRxns = getAllRxnsFromGenes(model, rxns); if ~islogical(rxns) && ~isnumeric(rxns) rxns=convertCharArray(rxns); diff --git a/queries/getElementalBalance.m b/queries/getElementalBalance.m index 8bc76fa7..24c03a4d 100755 --- a/queries/getElementalBalance.m +++ b/queries/getElementalBalance.m @@ -1,30 +1,41 @@ function balanceStructure=getElementalBalance(model,rxns,printUnbalanced,printUnparsable) -% getElementalBalance -% Checks a model to see if the reactions are elementally balanced +% getElementalBalance Check whether the reactions of a model are balanced. % -% model a model structure -% rxns either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the model, -% of a vector of indexes. Only these reactions will be -% checked (optional, default model.rxns) -% printUnbalanced print warnings about the reactions that were -% unbalanced (optional, default false) -% printUnparsable print warnings about the reactions that cannot be -% parsed (optional, default false) +% Checks a model to see if the reactions are elementally balanced. % -% balanceStructure -% balanceStatus 1 if the reaction is balanced, 0 if it's unbalanced, -% -1 if it couldn't be balanced due to missing information, -% -2 if it couldn't be balanced due to an error -% elements -% abbrevs cell array with abbreviations for all used elements -% names cell array with the names for all used elements -% leftComp MxN matrix with the sum of coefficients for each of -% the elements (N) for the left side of the -% reactions (M) -% rightComp the corresponding matrix for the right side +% Parameters +% ---------- +% model : struct +% a model structure. +% rxns : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of +% indexes. Only these reactions will be checked (default model.rxns). 
+% printUnbalanced : logical, optional +% print warnings about the reactions that were unbalanced +% (default false). +% printUnparsable : logical, optional +% print warnings about the reactions that cannot be parsed +% (default false). % -% Usage: balanceStructure=getElementalBalance(model,rxns,printUnbalanced,printUnparsable) +% Returns +% ------- +% balanceStructure : struct +% elemental balance structure with fields: +% +% - balanceStatus : 1 if the reaction is balanced, 0 if it is +% unbalanced, -1 if it could not be balanced due to missing +% information, -2 if it could not be balanced due to an error +% - elements : struct with fields abbrevs (cell array with +% abbreviations for all used elements) and names (cell array with +% the names for all used elements) +% - leftComp : MxN matrix with the sum of coefficients for each of the +% elements (N) for the left side of the reactions (M) +% - rightComp : the corresponding matrix for the right side +% +% Examples +% -------- +% balanceStructure = getElementalBalance(model, rxns, printUnbalanced, printUnparsable); if nargin<2 rxns=[]; diff --git a/queries/getExchangeRxns.m b/queries/getExchangeRxns.m index d058f689..a7de80bc 100755 --- a/queries/getExchangeRxns.m +++ b/queries/getExchangeRxns.m @@ -1,44 +1,49 @@ function [exchangeRxns, exchangeRxnsIndexes]=getExchangeRxns(model,reactionType) -% getExchangeRxns -% Retrieves the exchange reactions from a model. Exchange reactions are -% identified by having either no substrates or products. +% getExchangeRxns Retrieve the exchange reactions from a model. % -% Input: -% model a model structure -% reactionType which exchange reactions should be returned -% 'all' all reactions, irrespective of reaction -% bounds -% 'uptake' reactions with bounds that imply that -% only uptake are allowed. Reaction -% direction, upper and lower bounds are -% all considered -% 'excrete' reactions with bounds that imply that -% only excretion are allowed. 
Reaction -% direction, upper and lower bounds are -% all considered -% 'reverse' reactions with non-zero upper and lower -% bounds that imply that both uptake and -% excretion are allowed -% 'blocked' reactions that have zero upper and lower -% bounds, not allowing any flux -% 'in' reactions where the boundary metabolite -% is the substrate of the reaction, a -% positive flux value would imply uptake, -% but reaction bounds are not considered -% 'out' reactions where the boundary metabolite -% is the product of the reaction, a -% negative flux value would imply uptake, -% but reaction bounds are not considered. +% Exchange reactions are identified by having either no substrates or no +% products. % -% Output: -% exchangeRxns cell array with the IDs of the exchange reactions -% exchangeRxnsIndexes vector with the indexes of the exchange reactions +% Parameters +% ---------- +% model : struct +% a model structure. +% reactionType : char, optional +% which exchange reactions should be returned (default 'all'): % -% Note: -% The union of 'in' and 'out' equals 'all'. Also, the union of 'uptake', -% 'excrete', 'reverse' and 'blocked' equals all. +% - 'all' : all reactions, irrespective of reaction bounds +% - 'uptake' : reactions with bounds that imply that only uptake is +% allowed. Reaction direction, upper and lower bounds are all +% considered +% - 'excrete' : reactions with bounds that imply that only excretion is +% allowed. 
Reaction direction, upper and lower bounds are all +% considered +% - 'reverse' : reactions with non-zero upper and lower bounds that +% imply that both uptake and excretion are allowed +% - 'blocked' : reactions that have zero upper and lower bounds, not +% allowing any flux +% - 'in' : reactions where the boundary metabolite is the substrate of +% the reaction; a positive flux value would imply uptake, but +% reaction bounds are not considered +% - 'out' : reactions where the boundary metabolite is the product of +% the reaction; a negative flux value would imply uptake, but +% reaction bounds are not considered % -% Usage: [exchangeRxns,exchangeRxnsIndexes]=getExchangeRxns(model,reactionType) +% Returns +% ------- +% exchangeRxns : cell +% cell array with the IDs of the exchange reactions. +% exchangeRxnsIndexes : double +% vector with the indexes of the exchange reactions. +% +% Notes +% ----- +% The union of 'in' and 'out' equals 'all'. Also, the union of 'uptake', +% 'excrete', 'reverse' and 'blocked' equals 'all'. +% +% Examples +% -------- +% [exchangeRxns, exchangeRxnsIndexes] = getExchangeRxns(model, reactionType); if nargin<2 reactionType='all'; diff --git a/queries/getGenesFromGrRules.m b/queries/getGenesFromGrRules.m index a251d6bd..74e7620c 100755 --- a/queries/getGenesFromGrRules.m +++ b/queries/getGenesFromGrRules.m @@ -1,28 +1,28 @@ function [genes,rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes) -%getGenesFromGrRules Extract gene list and rxnGeneMat from grRules array. +% getGenesFromGrRules Extract gene list and rxnGeneMat from grRules array. % -% USAGE: +% Parameters +% ---------- +% grRules : cell +% a cell array of model grRules, from which a list of genes is to be +% extracted. NOTE: Boolean operators can be text ("and", "or") or +% symbolic ("&", "|"), but there must be a space between operators and +% gene names/IDs. +% originalGenes : cell, optional +% the original gene list from the model as reference. 
% -% [genes,rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes); -% -% INPUTS: -% -% grRules A cell array of model grRules, from which a list of genes -% are to be extracted. -% NOTE: Boolean operators can be text ("and", "or") or -% symbolic ("&", "|"), but there must be a space -% between operators and gene names/IDs. -% originalGenes The original gene list from the model as reference -% -% OUTPUTS: -% -% genes A unique list of all gene IDs that appear in grRules. -% -% rxnGeneMat (Optional) A binary matrix indicating which genes -% participate in each reaction, where rows correspond to -% reactions (entries in grRules) and columns correspond to -% genes. +% Returns +% ------- +% genes : cell +% a unique list of all gene IDs that appear in grRules. +% rxnGeneMat : double +% (optional) a binary matrix indicating which genes participate in each +% reaction, where rows correspond to reactions (entries in grRules) and +% columns correspond to genes. % +% Examples +% -------- +% [genes, rxnGeneMat] = getGenesFromGrRules(grRules, originalGenes); % handle input arguments diff --git a/queries/getIndexes.m b/queries/getIndexes.m index 72ec4cef..62357b23 100755 --- a/queries/getIndexes.m +++ b/queries/getIndexes.m @@ -1,30 +1,38 @@ function indexes=getIndexes(model, objects, type, returnLogical) -% getIndexes -% Retrieves the indexes for a list of reactions or metabolites +% getIndexes Retrieve the indexes for a list of reactions or metabolites. % -% Input: -% model a model structure -% objects either a cell array of IDs, a logical vector with the -% same number of elements as metabolites in the model, -% of a vector of indexes -% type 'rxns', 'mets', or 'genes' depending on what to retrieve -% 'metnames' queries metabolite names, while 'metcomps' -% allows to provide specific metabolites and their -% compartments in the format metaboliteName[comp]. 
If a -% model.ec structure exists (GECKO 3), then also -% 'ecenzymes', 'ecrxns' and 'ecgenes' are allowed -% returnLogical Sets whether to return a logical array or an array with -% the indexes (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% objects : cell or logical or double +% either a cell array of IDs, a logical vector with the same number of +% elements as metabolites in the model, or a vector of indexes. +% type : char +% 'rxns', 'mets', or 'genes' depending on what to retrieve. 'metnames' +% queries metabolite names, while 'metcomps' allows providing specific +% metabolites and their compartments in the format metaboliteName[comp]. +% If a model.ec structure exists (GECKO 3), then also 'ecenzymes', +% 'ecrxns' and 'ecgenes' are allowed. +% returnLogical : logical, optional +% sets whether to return a logical array or an array with the indexes +% (default false). % -% Output: -% indexes can be a logical array or a double array depending on -% the value of returnLogical +% Returns +% ------- +% indexes : logical or double +% can be a logical array or a double array depending on the value of +% returnLogical. % -% Note: If 'ecenzymes', 'ecrxns' or 'ecgenes' are used with a GECKO 3 -% model, then the indexes are from the model.ec.enzymes, model.ec.rxns or -% model.ec.genes fields, respectively. -% -% Usage: indexes=getIndexes(model, objects, type, returnLogical) +% Notes +% ----- +% If 'ecenzymes', 'ecrxns' or 'ecgenes' are used with a GECKO 3 model, then +% the indexes are from the model.ec.enzymes, model.ec.rxns or model.ec.genes +% fields, respectively. 
+% +% Examples +% -------- +% indexes = getIndexes(model, objects, type, returnLogical); if nargin<4 returnLogical=false; diff --git a/queries/getMetsInComp.m b/queries/getMetsInComp.m index 63f20300..2c6df05b 100755 --- a/queries/getMetsInComp.m +++ b/queries/getMetsInComp.m @@ -1,14 +1,23 @@ function [I, metNames]=getMetsInComp(model,comp) -% getMetsInComp -% Gets the metabolites in a specified compartment +% getMetsInComp Get the metabolites in a specified compartment. % -% model a model structure -% comp string with the compartment id +% Parameters +% ---------- +% model : struct +% a model structure. +% comp : char +% string with the compartment id. % -% I boolean vector of the metabolites -% metNames the names of the metabolites +% Returns +% ------- +% I : logical +% boolean vector of the metabolites. +% metNames : cell +% the names of the metabolites. % -% Usage: [I, metNames]=getMetsInComp(model,comp) +% Examples +% -------- +% [I, metNames] = getMetsInComp(model, comp); comp=char(comp); diff --git a/queries/getRxnsInComp.m b/queries/getRxnsInComp.m index b9daa850..c70b2c0e 100755 --- a/queries/getRxnsInComp.m +++ b/queries/getRxnsInComp.m @@ -1,17 +1,26 @@ function [I, rxnNames]=getRxnsInComp(model,comp,includePartial) -% getRxnsInComp -% Gets the reactions in a specified compartment +% getRxnsInComp Get the reactions in a specified compartment. % -% model a model structure -% comp string with the compartment id -% includePartial true if reactions with metabolites in several -% compartments (normally transport reactions) should -% be included (optional, default false) +% Parameters +% ---------- +% model : struct +% a model structure. +% comp : char +% string with the compartment id. +% includePartial : logical, optional +% true if reactions with metabolites in several compartments (normally +% transport reactions) should be included (default false). 
% -% I boolean vector of the reactions -% rxnNames the names of the reactions +% Returns +% ------- +% I : double +% boolean vector of the reactions. +% rxnNames : cell +% the names of the reactions. % -% Usage: [I, rxnNames]=getRxnsInComp(model,comp,includePartial) +% Examples +% -------- +% [I, rxnNames] = getRxnsInComp(model, comp, includePartial); comp=char(comp); if nargin<3 diff --git a/queries/getTransportRxns.m b/queries/getTransportRxns.m index 7f4f2f82..122bea38 100755 --- a/queries/getTransportRxns.m +++ b/queries/getTransportRxns.m @@ -1,16 +1,25 @@ function transportRxns=getTransportRxns(model) -% getTransportRxns -% Retrieves the transport reactions from a model +% getTransportRxns Retrieve the transport reactions from a model. % -% model a model structure +% Parameters +% ---------- +% model : struct +% a model structure. % -% transportRxns logical array with true if the corresponding -% reaction is a transport reaction +% Returns +% ------- +% transportRxns : logical +% logical array with true if the corresponding reaction is a transport +% reaction. % -% Transport reactions are defined as reactions involving (at least) one -% metabolite name in more than one compartment. +% Examples +% -------- +% transportRxns = getTransportRxns(model); % -% Usage: transportRxns=getTransportRxns(model) +% Notes +% ----- +% Transport reactions are defined as reactions involving (at least) one +% metabolite name in more than one compartment. transportRxns=false(numel(model.rxns),1); diff --git a/queries/parseFormulas.m b/queries/parseFormulas.m index b54ce40c..d4f29cfd 100755 --- a/queries/parseFormulas.m +++ b/queries/parseFormulas.m @@ -1,34 +1,47 @@ function [elements, useMat, exitFlag, MW]=parseFormulas(formulas, noPolymers,isInchi,ignoreRX) -% parseFormulas -% Gets the elemental composition from formulas +% parseFormulas Get the elemental composition from formulas. 
% -% formulas a cell array with formulas -% noPolymers assume that all polymers consist of one element. -% Corresponds to counting everything between (...)n as -% n being equal to one. Only one set of parentheses -% is allowed. If this is false then polymers are returned as -% "Could not parse formula" (optional, default false) -% isInchi true if the formulas are in the InChI format (optional, -% default false) -% ignoreRX ignore R-groups and bound protein. This can be useful since they -% are often used only as intermediates (optional, default false) +% Parameters +% ---------- +% formulas : cell +% a cell array with formulas. +% noPolymers : logical, optional +% assume that all polymers consist of one element. Corresponds to +% counting everything between (...)n as n being equal to one. Only one +% set of parentheses is allowed. If this is false then polymers are +% returned as "Could not parse formula" (default false). +% isInchi : logical, optional +% true if the formulas are in the InChI format (default false). +% ignoreRX : logical, optional +% ignore R-groups and bound protein. This can be useful since they are +% often used only as intermediates (default false). % -% elements -% abbrevs cell array with abbreviations for all used elements -% names cell array with the names for all used elements -% useMat MxN matrix with the number of atoms for each formula (M) and each -% element (N) -% exitFlag array with the exit flags: -% 1= Sucessful parsing -% 0= No formula found -% -1= Could not parse formula -% MW predicted molecular weight (g/mol). This is only returned -% for formulas which can be sucessfully parsed, and its -% calculation doesn't affect the exitFlag variable. 
NaN is -% returned if the weight couldn't be calculated -% -% Usage: [elements, useMat, exitFlag, MW]= -% parseFormulas(formulas, noPolymers,isInchi,ignoreRX) +% Returns +% ------- +% elements : struct +% struct with fields: +% +% - abbrevs : cell array with abbreviations for all used elements +% - names : cell array with the names for all used elements +% useMat : double +% MxN matrix with the number of atoms for each formula (M) and each +% element (N). +% exitFlag : double +% array with the exit flags: +% +% - 1 : successful parsing +% - 0 : no formula found +% - -1 : could not parse formula +% MW : double +% predicted molecular weight (g/mol). This is only returned for +% formulas which can be successfully parsed, and its calculation +% doesn't affect the exitFlag variable. NaN is returned if the weight +% couldn't be calculated. +% +% Examples +% -------- +% [elements, useMat, exitFlag, MW] = ... +% parseFormulas(formulas, noPolymers, isInchi, ignoreRX); if nargin<2 noPolymers=false; diff --git a/queries/parseRxnEqu.m b/queries/parseRxnEqu.m index 08bfdbe5..1a3ab6e9 100755 --- a/queries/parseRxnEqu.m +++ b/queries/parseRxnEqu.m @@ -1,20 +1,28 @@ function metabolites=parseRxnEqu(equations) -% parseRxnEqu -% Gets all metabolite names from a cell array of equations +% parseRxnEqu Get all metabolite names from a cell array of equations. % -% metabolites=parseRxnEqu(equations) +% Parameters +% ---------- +% equations : cell +% a cell array with equation strings. % -% equations A cell array with equation strings +% Returns +% ------- +% metabolites : cell +% a cell array with the involved metabolites. % -% metabolites A cell array with the involved metabolites +% Examples +% -------- +% metabolites = parseRxnEqu(equations); % -% The equations should be written like: -% 1 A + 3 B (=> or <=>) 5C + 2 D +% Notes +% ----- +% The equations should be written like: % -% If the equation is expressed as for example '... 
+ (n-1) starch' then -% '(n-1) starch' will be interpreted as one metabolite +% 1 A + 3 B (=> or <=>) 5C + 2 D % -% Usage: metabolites=parseRxnEqu(equations) +% If the equation is expressed as for example '... + (n-1) starch' then +% '(n-1) starch' will be interpreted as one metabolite. if ~iscell(equations) equations={equations}; diff --git a/queries/printFluxes.m b/queries/printFluxes.m index 8f2d65db..b6099d3d 100755 --- a/queries/printFluxes.m +++ b/queries/printFluxes.m @@ -1,39 +1,49 @@ function printFluxes(model, fluxes, onlyExchange, cutOffFlux, outputFile,outputString,metaboliteList) -% printFluxes -% Prints reactions and fluxes to the screen or to a file +% printFluxes Print reactions and fluxes to the screen or to a file. % -% Input: -% model a model structure -% fluxes a vector with fluxes -% onlyExchange only print exchange fluxes (optional, default true) -% cutOffFlux only print fluxes with absolute values above or equal -% to this value (optional, default 10^-8) -% outputFile a file to save the print-out to (optional, default is -% output to the command window) -% outputString a string that specifies the output of each reaction -% (optional, default '%rxnID\t(%rxnName):\t%flux\n') -% metaboliteList cell array of metabolite names. Only reactions -% involving any of these metabolites will be -% printed (optional) +% Parameters +% ---------- +% model : struct +% a model structure. +% fluxes : double +% a vector with fluxes. +% onlyExchange : logical, optional +% only print exchange fluxes (default true). +% cutOffFlux : double, optional +% only print fluxes with absolute values above or equal to this value +% (default 10^-8). +% outputFile : char, optional +% a file to save the print-out to (default is output to the command +% window). +% outputString : char, optional +% a string that specifies the output of each reaction (default +% '%rxnID\t(%rxnName):\t%flux\n'). +% metaboliteList : cell, optional +% cell array of metabolite names. 
Only reactions involving any of these +% metabolites will be printed. % +% Notes +% ----- % The following codes are available for user-defined output strings: % -% %rxnID reaction ID -% %rxnName reaction name -% %lower lower bound -% %upper upper bound -% %obj objective coefficient -% %eqn equation -% %flux flux -% %element equation using the metabolite formulas rather than -% metabolite names -% %unbalanced "(*)" if the reaction is unbalanced and "(-)" if it could -% not be parsed -% %lumped equation where the elemental compositions for the left/right -% hand sides are lumped +% - %rxnID : reaction ID +% - %rxnName : reaction name +% - %lower : lower bound +% - %upper : upper bound +% - %obj : objective coefficient +% - %eqn : equation +% - %flux : flux +% - %element : equation using the metabolite formulas rather than metabolite +% names +% - %unbalanced : "(*)" if the reaction is unbalanced and "(-)" if it could +% not be parsed +% - %lumped : equation where the elemental compositions for the left/right +% hand sides are lumped % -% Usage: printFluxes(model, fluxes, onlyExchange, cutOffFlux, outputFile,... -% outputString, metaboliteList) +% Examples +% -------- +% printFluxes(model, fluxes, onlyExchange, cutOffFlux, outputFile, ... +% outputString, metaboliteList); if nargin<3 onlyExchange=true; diff --git a/queries/printModel.m b/queries/printModel.m index baf81e25..d1285d95 100755 --- a/queries/printModel.m +++ b/queries/printModel.m @@ -1,39 +1,47 @@ function printModel(model,rxnList,outputString,outputFile,metaboliteList) -% printModel -% Prints reactions to the screen or to a file +% printModel Print reactions to the screen or to a file. 
% -% model a model structure -% rxnList either a cell array of reaction IDs, a logical vector -% with the same number of elements as reactions in the model, -% or a vector of indexes to remove (optional, default -% model.rxns) -% outputString a string that specifies the output of each reaction (optional, -% default '%rxnID (%rxnName)\n\t%eqn [%lower %upper]\n') -% outputFile a file to save the print-out to (optional, default is output to -% the command window) -% metaboliteList cell array of metabolite names. Only reactions -% involving any of these metabolites will be -% printed (optional) +% This is a wrapper around printFluxes, intended for use when there is no +% flux distribution. % -% The following codes are available for user-defined output strings: +% Parameters +% ---------- +% model : struct +% a model structure. +% rxnList : cell or logical or double, optional +% either a cell array of reaction IDs, a logical vector with the same +% number of elements as reactions in the model, or a vector of indexes +% to print (default model.rxns). +% outputString : char, optional +% a string that specifies the output of each reaction (default +% '%rxnID (%rxnName)\n\t%eqn [%lower %upper]\n'). +% outputFile : char, optional +% a file to save the print-out to (default is output to the command +% window). +% metaboliteList : cell, optional +% cell array of metabolite names. Only reactions involving any of these +% metabolites will be printed. 
% -% %rxnID reaction ID -% %rxnName reaction name -% %lower lower bound -% %upper upper bound -% %obj objective coefficient -% %eqn equation -% %element equation using the metabolite formulas rather than -% metabolite names -% %unbalanced "(*)" if the reaction is unbalanced and "(-)" if it could not -% be parsed -% %lumped equation where the elemental compositions for the left/right -% hand sides are lumped +% Notes +% ----- +% The following codes are available for user-defined output strings: % -% NOTE: This is just a wrapper function around printFluxes. It is -% intended to be used when there is no flux distribution. +% - %rxnID : reaction ID +% - %rxnName : reaction name +% - %lower : lower bound +% - %upper : upper bound +% - %obj : objective coefficient +% - %eqn : equation +% - %element : equation using the metabolite formulas rather than metabolite +% names +% - %unbalanced : "(*)" if the reaction is unbalanced and "(-)" if it could +% not be parsed +% - %lumped : equation where the elemental compositions for the left/right +% hand sides are lumped % -% Usage: printModel(model,rxnList,outputString,outputFile,metaboliteList) +% Examples +% -------- +% printModel(model, rxnList, outputString, outputFile, metaboliteList); if nargin<2 || isempty(rxnList) rxnList=model.rxns; diff --git a/queries/printModelStats.m b/queries/printModelStats.m index 6018cf09..fcef6c8d 100755 --- a/queries/printModelStats.m +++ b/queries/printModelStats.m @@ -1,16 +1,20 @@ function printModelStats(model, printModelIssues, printDetails) -% printModelStats -% prints some statistics about a model to the screen +% printModelStats Print some statistics about a model to the screen. % -% model a model structure -% printModelIssues true if information about unconnected -% reactions/metabolites and elemental balancing -% should be printed (optional, default false) -% printDetails true if detailed information should be printed -% about model issues. 
Only used if printModelIssues -% is true (optional, default true) +% Parameters +% ---------- +% model : struct +% a model structure. +% printModelIssues : logical, optional +% true if information about unconnected reactions/metabolites and +% elemental balancing should be printed (default false). +% printDetails : logical, optional +% true if detailed information should be printed about model issues. +% Only used if printModelIssues is true (default true). % -% Usage: printModelStats(model,printModelIssues, printDetails) +% Examples +% -------- +% printModelStats(model, printModelIssues, printDetails); if nargin<2 printModelIssues=false; diff --git a/reconstruction/combineMetaCycKEGGModels.m b/reconstruction/combineMetaCycKEGGModels.m index d52638cb..ba14f925 100755 --- a/reconstruction/combineMetaCycKEGGModels.m +++ b/reconstruction/combineMetaCycKEGGModels.m @@ -1,18 +1,24 @@ function model=combineMetaCycKEGGModels(metacycModel,keggModel) -% combineMetaCycKEGGModels -% Combine MetaCyc and KEGG draft models into one model structure. +% combineMetaCycKEGGModels Combine MetaCyc and KEGG draft models into one. % -% Input: -% metacycModel the reconstructed model from MetaCyc -% keggModel the reconstructed model from KEGG +% Parameters +% ---------- +% metacycModel : struct +% the reconstructed model from MetaCyc. +% keggModel : struct +% the reconstructed model from KEGG. % -% Output: -% model a model structure generated by integrating information -% from draft models reconstructed using MetaCyc and KEGG -% databases. The 'rxnFrom/metFrom/geneFrom' fields are -% included to indicate the source. +% Returns +% ------- +% model : struct +% a model structure generated by integrating information from draft +% models reconstructed using MetaCyc and KEGG databases. The +% 'rxnFrom/metFrom/geneFrom' fields are included to indicate the +% source. 
% -% Usage: model=combineMetaCycKEGGModels(metacycModel,keggModel) +% Examples +% -------- +% model = combineMetaCycKEGGModels(metacycModel,keggModel); %Just return the model if nargin<2 diff --git a/reconstruction/guessComposition.m b/reconstruction/guessComposition.m index ab204b85..75056f4e 100755 --- a/reconstruction/guessComposition.m +++ b/reconstruction/guessComposition.m @@ -1,37 +1,50 @@ function [model, guessedFor, couldNotGuess]=guessComposition(model, printResults) -% guessComposition -% Attempts to guess the composition of metabolites without information -% about elemental composition +% guessComposition Guess the composition of metabolites without one. % -% model a model structure -% printResults true if the output should be printed (optional, default true) +% Attempts to guess the composition of metabolites without information about +% elemental composition. % -% model a model structure with information about elemental -% composition added -% guessedFor indexes for the metabolites for which a composition -% could be guessed -% couldNotGuess indexes for the metabolites for which no -% composition could be assigned +% Parameters +% ---------- +% model : struct +% a model structure. +% printResults : logical, optional +% true if the output should be printed (default true). % -% This function works in a rather straight forward manner: +% Returns +% ------- +% model : struct +% a model structure with information about elemental composition added. +% guessedFor : double +% indexes for the metabolites for which a composition could be guessed. +% couldNotGuess : double +% indexes for the metabolites for which no composition could be +% assigned. % -% 1. Get the metabolites which lack composition and participates in -% at least one reaction where all other metabolites have composition information -% 2. Loop through them and calculate their composition based on the rest -% of the involved metabolites. 
If there are any inconsistencies, so that -% a given metabolite should have different composition in different -% equations, then throw an error -% 3. Go to 1 +% Examples +% -------- +% [model, guessedFor, couldNotGuess] = guessComposition(model, printResults); % -% This simple approach requires that the rest of the metabolites have -% correct composition information, and that the involved reactions are -% correct. The function will exit with an error on any inconsistencies, -% which means that it could also be used as a way of checking the model -% for errors. Note that just because this exits sucessfully, the -% calculated compositions could still be wrong (in case that the existing -% compositions were wrong) +% Notes +% ----- +% This function works in a rather straightforward manner: % -% Usage: [newModel, guessedFor, couldNotGuess]=guessComposition(model, printResults) +% 1. Get the metabolites which lack composition and participate in at +% least one reaction where all other metabolites have composition +% information. +% 2. Loop through them and calculate their composition based on the rest of +% the involved metabolites. If there are any inconsistencies, so that a +% given metabolite should have different composition in different +% equations, then throw an error. +% 3. Go to 1. +% +% This simple approach requires that the rest of the metabolites have +% correct composition information, and that the involved reactions are +% correct. The function will exit with an error on any inconsistencies, +% which means that it could also be used as a way of checking the model for +% errors. Note that just because this exits successfully, the calculated +% compositions could still be wrong (in case that the existing compositions +% were wrong).
if nargin<2 printResults=true; diff --git a/reconstruction/homology/getBlast.m b/reconstruction/homology/getBlast.m index 6dda0e46..e567ef73 100755 --- a/reconstruction/homology/getBlast.m +++ b/reconstruction/homology/getBlast.m @@ -1,40 +1,51 @@ function [blastStructure,blastReport]=getBlast(organismID,fastaFile,... modelIDs,refFastaFiles,developMode,hideVerbose) -% getBlast -% Performs a bidirectional BLAST between the organism of interest and a -% set of template organisms +% getBlast Bidirectional BLAST between an organism and template organisms. % -% Input: -% organismID the id of the organism of interest. This should also -% match with the id supplied to getModelFromHomology -% fastaFile a FASTA file with the protein sequences for the -% organism of interest -% modelIDs a cell array of model ids. These must match the -% "model.id" fields in the "models" structure if the -% output is to be used with getModelFromHomology -% refFastaFiles a cell array with the paths to the corresponding FASTA -% files -% developMode true if blastReport should be generated that is used -% in the unit testing function for BLAST+ (optional, default -% false) -% hideVerbose true if no status messages should be printed (optional, -% default false) +% Parameters +% ---------- +% organismID : char +% the id of the organism of interest. This should also match with the +% id supplied to getModelFromHomology. +% fastaFile : char +% a FASTA file with the protein sequences for the organism of interest. +% modelIDs : cell +% a cell array of model ids. These must match the "model.id" fields in +% the "models" structure if the output is to be used with +% getModelFromHomology. +% refFastaFiles : cell +% a cell array with the paths to the corresponding FASTA files. +% developMode : logical, optional +% true if blastReport should be generated that is used in the unit +% testing function for BLAST+ (default false). 
+% hideVerbose : logical, optional +% true if no status messages should be printed (default false). % -% Output: -% blastStructure structure containing the bidirectional homology -% measurements that can be used by getModelFromHomology -% blastReport structure containing MD5 hashes for FASTA database -% files and non-parsed BLAST output data. Will be blank -% if developMode is false. +% Returns +% ------- +% blastStructure : struct +% structure containing the bidirectional homology measurements that +% can be used by getModelFromHomology. +% blastReport : struct +% structure containing MD5 hashes for FASTA database files and +% non-parsed BLAST output data. Will be blank if developMode is false. % -% NOTE: This function calls BLAST+ to perform a bidirectional homology -% test between the organism of interest and a set of other organisms -% using standard settings. The only filtering this function does is the -% removal of hits with an E-value higher than 10e-5. The other homology -% measurements can be implemented using getBlastFromExcel. +% Notes +% ----- +% This function calls BLAST+ to perform a bidirectional homology test +% between the organism of interest and a set of other organisms using +% standard settings. The only filtering this function does is the removal +% of hits with an E-value higher than 10e-5. The other homology +% measurements can be implemented using getBlastFromExcel. % -% Usage: [blastStructure,blastReport]=getBlast(organismID,fastaFile,... -% modelIDs,refFastaFiles,developMode,hideVerbose) +% Examples +% -------- +% [blastStructure,blastReport] = getBlast(organismID,fastaFile,... 
+% modelIDs,refFastaFiles,developMode,hideVerbose); +% +% See also +% -------- +% getModelFromHomology, getBlastFromExcel, getDiamond if nargin<5 developMode = false; diff --git a/reconstruction/homology/getBlastFromExcel.m b/reconstruction/homology/getBlastFromExcel.m index f1e98774..d193b89d 100755 --- a/reconstruction/homology/getBlastFromExcel.m +++ b/reconstruction/homology/getBlastFromExcel.m @@ -1,29 +1,41 @@ function blastStructure=getBlastFromExcel(models,blastFile,organismId) -% getBlastFromExcel -% Retrieves gene homology information from Excel files. Used as -% input to getModelFromHomology. +% getBlastFromExcel Retrieve gene homology information from Excel files. % -% Input: -% models a cell array of model structures -% blastFile Excel file with homology information -% organismId the id of the organism of interest (as described in the -% Excel file) +% Used as input to getModelFromHomology. % -% Output: -% blastStructure structure containing the information in the Excel -% sheets. +% Parameters +% ---------- +% models : cell +% a cell array of model structures. +% blastFile : char +% Excel file with homology information. +% organismId : char +% the id of the organism of interest (as described in the Excel file). % -% The Excel file should contain a number of spreadsheets which in turn -% contain the bidirectional homology measurements between the genes in the -% organisms. The first and second column headers in each sheet is the -% "to" and "from" model ids (as defined in models or for the new organism). -% The entries should correspond to the gene names in those models. The third, -% fourth, fifth, sixth and seventh columns represent the E-value, alignment -% length, identity, bitscore and percentage of positive-scoring matches for -% each measurement (captions should be "E-value", "Alignment length", -% "Identity", "Bitscore" and "PPOS"). +% Returns +% ------- +% blastStructure : struct +% structure containing the information in the Excel sheets. 
% -% Usage: blastStructure=getBlastFromExcel(models,blastFile,organismId) +% Notes +% ----- +% The Excel file should contain a number of spreadsheets which in turn +% contain the bidirectional homology measurements between the genes in the +% organisms. The first and second column headers in each sheet are the "to" +% and "from" model ids (as defined in models or for the new organism). The +% entries should correspond to the gene names in those models. The third, +% fourth, fifth, sixth and seventh columns represent the E-value, alignment +% length, identity, bitscore and percentage of positive-scoring matches for +% each measurement (captions should be "E-value", "Alignment length", +% "Identity", "Bitscore" and "PPOS"). +% +% Examples +% -------- +% blastStructure = getBlastFromExcel(models,blastFile,organismId); +% +% See also +% -------- +% getModelFromHomology, getBlast if ~isfile(blastFile) error('BLAST result file %s cannot be found',string(blastFile)); diff --git a/reconstruction/homology/getDiamond.m b/reconstruction/homology/getDiamond.m index 2200b97c..0a0d14ac 100755 --- a/reconstruction/homology/getDiamond.m +++ b/reconstruction/homology/getDiamond.m @@ -1,41 +1,52 @@ function [blastStructure,diamondReport]=getDiamond(organismID,fastaFile,... modelIDs,refFastaFiles,developMode,hideVerbose) -% getDiamond -% Uses DIAMOND to perform a bidirectional BLAST between the organism -% of interest and a set of template organisms +% getDiamond Bidirectional BLAST with DIAMOND against template organisms. % -% Input: -% organismID the id of the organism of interest. This should also -% match with the id supplied to getModelFromHomology -% fastaFile a FASTA file with the protein sequences for the -% organism of interest -% modelIDs a cell array of model ids.
These must match the -% "model.id" fields in the "models" structure if the -% output is to be used with getModelFromHomology -% refFastaFiles a cell array with the paths to the corresponding FASTA -% files -% developMode true if blastReport should be generated that is used -% in the unit testing function for DIAMOND (optional, default -% false) -% hideVerbose true if no status messages should be printed (optional, -% default false) +% Parameters +% ---------- +% organismID : char +% the id of the organism of interest. This should also match with the +% id supplied to getModelFromHomology. +% fastaFile : char +% a FASTA file with the protein sequences for the organism of interest. +% modelIDs : cell +% a cell array of model ids. These must match the "model.id" fields in +% the "models" structure if the output is to be used with +% getModelFromHomology. +% refFastaFiles : cell +% a cell array with the paths to the corresponding FASTA files. +% developMode : logical, optional +% true if diamondReport should be generated that is used in the unit +% testing function for DIAMOND (default false). +% hideVerbose : logical, optional +% true if no status messages should be printed (default false). % -% Output: -% blastStructure structure containing the bidirectional homology -% measurements which are used by getModelFromHomology -% diamondReport structure containing MD5 hashes for FASTA database -% files and non-parsed BLAST output data. Will be blank -% if developMode is false. +% Returns +% ------- +% blastStructure : struct +% structure containing the bidirectional homology measurements which +% are used by getModelFromHomology. +% diamondReport : struct +% structure containing MD5 hashes for FASTA database files and +% non-parsed BLAST output data. Will be blank if developMode is false.
% -% NOTE: This function calls DIAMOND to perform a bidirectional homology -% search between the organism of interest and a set of other organisms -% using the '--more-sensitive' setting from DIAMOND. For the most -% sensitive results, the use of getBlast() is adviced, however, -% getDiamond() is a fast alternative (>15x faster). The blastStructure -% generated is in the same format as those obtained from getBlast(). +% Notes +% ----- +% This function calls DIAMOND to perform a bidirectional homology search +% between the organism of interest and a set of other organisms using the +% '--more-sensitive' setting from DIAMOND. For the most sensitive results, +% the use of getBlast() is advised; however, getDiamond() is a fast +% alternative (>15x faster). The blastStructure generated is in the same +% format as those obtained from getBlast(). % -% Usage: [blastStructure,diamondReport]=getDiamond(organismID,fastaFile,... -% modelIDs,refFastaFiles,developMode,hideVerbose) +% Examples +% -------- +% [blastStructure,diamondReport] = getDiamond(organismID,fastaFile,... +% modelIDs,refFastaFiles,developMode,hideVerbose); +% +% See also +% -------- +% getModelFromHomology, getBlast if nargin<5 developMode = false; diff --git a/reconstruction/homology/getModelFromHomology.m b/reconstruction/homology/getModelFromHomology.m index aa356b03..66bfa134 100755 --- a/reconstruction/homology/getModelFromHomology.m +++ b/reconstruction/homology/getModelFromHomology.m @@ -1,70 +1,79 @@ function [draftModel, hitGenes]=getModelFromHomology(models,blastStructure,... getModelFor,preferredOrder,strictness,onlyGenesInModels,maxE,... minLen,minIde,mapNewGenesToOld) -% getModelFromHomology -% Constructs a new model from a set of existing models and gene homology -% information. +% getModelFromHomology Construct a new model from existing models and homology. % -% models a cell array of model structures to build the model -% from.
These models must be sorted by importance in -% decreasing order -% blastStructure a blastStructure as produced by getBlast or -% getBlastFromExcel -% getModelFor a three-four letter abbreviation of the organism to -% build a model for. Must have BLASTP hits in both -% directions to the organisms in 'models' -% preferredOrder the order in which reactions should be added from the -% models. If not supplied, reactions will be included -% from all models, otherwise one gene will only result -% in reactions from one model (optional, default {}) -% strictness integer that specifies which reactions should be -% included: -% 1: Map new genes to old for all pairs, which have -% acceptable BLASTP results in both directions -% 2: Map new genes to old for all pairs, which have -% acceptable BLASTP results in correspondent direction -% (mapping can be done in the opposite direction, see -% mapNewGenesToOld below) -% 3: Check all BLASTP results and retain only the best -% results by E-value for all gene pairs in each -% direction separately. Then map new genes to old for -% all pairs, which have acceptable BLASTP results in -% both directions (optional, default 1). -% onlyGenesInModels consider BLASTP results only for genes that exist in -% the models. This tends to import a larger fraction -% from the existing models but may give less reliable -% results. Has effect only if strictness=3 (optional, -% default false) -% maxE only look at genes with E-values <= this value (optional, -% default 10^-30) -% minLen only look at genes with alignment length >= this -% value (optional, default 200) -% minIde only look at genes with identity >= this value -% (optional, default 40 (%)) -% mapNewGenesToOld determines how to match genes if not looking at only -% 1-1 orthologs. Either map the new genes to the old or -% old genes to new. The default is to map the new genes -% (optional, default true) +% Constructs a new model from a set of existing models and gene homology +% information. 
% -% draftModel a model structure for the new organism -% hitGenes collect the old and new genes +% Parameters +% ---------- +% models : cell +% a cell array of model structures to build the model from. These +% models must be sorted by importance in decreasing order. +% blastStructure : struct +% a blastStructure as produced by getBlast or getBlastFromExcel. +% getModelFor : char +% a three- or four-letter abbreviation of the organism to build a model +% for. Must have BLASTP hits in both directions to the organisms in +% 'models'. +% preferredOrder : cell, optional +% the order in which reactions should be added from the models. If not +% supplied, reactions will be included from all models, otherwise one +% gene will only result in reactions from one model (default {}). +% strictness : double, optional +% integer that specifies which reactions should be included (default 1): % -% The models in the 'models' structure should have named the metabolites -% in the same manner, have their reversible reactions in the same -% direction (run sortModel), and use the same compartment names. To avoid -% keeping unneccesary old genes, the models should not have -% 'or'-relations in their grRules (use expandModel). +% - 1 : Map new genes to old for all pairs that have acceptable BLASTP +% results in both directions. +% - 2 : Map new genes to old for all pairs that have acceptable BLASTP +% results in the corresponding direction (mapping can be done in the +% opposite direction, see mapNewGenesToOld below). +% - 3 : Check all BLASTP results and retain only the best results by +% E-value for all gene pairs in each direction separately. Then map +% new genes to old for all pairs that have acceptable BLASTP results +% in both directions. +% onlyGenesInModels : logical, optional +% consider BLASTP results only for genes that exist in the models. This +% tends to import a larger fraction from the existing models but may +% give less reliable results.
Has effect only if strictness=3 (default +% false). +% maxE : double, optional +% only look at genes with E-values <= this value (default 10^-30). +% minLen : double, optional +% only look at genes with alignment length >= this value (default 200). +% minIde : double, optional +% only look at genes with identity >= this value (default 40 (%)). +% mapNewGenesToOld : logical, optional +% determines how to match genes if not looking at only 1-1 orthologs. +% Either map the new genes to the old or old genes to new. The default +% is to map the new genes (default true). % -% The resulting draft model contains only reactions associated with -% orthologous genes. The old (original) genes involved in 'and' -% relations in grRules without any orthologs are still included in -% the draft model as OLD_MODELID_geneName. +% Returns +% ------- +% draftModel : struct +% a model structure for the new organism. +% hitGenes : struct +% structure collecting the old and new genes (oldGenes and newGenes). % -% NOTE: "to" and "from" means relative to the new organism +% Examples +% -------- +% draftModel = getModelFromHomology(models, blastStructure, getModelFor); % -% Usage: draftModel=getModelFromHomology(models,blastStructure,... -% getModelFor,preferredOrder,strictness,onlyGenesInModels,maxE,... -% minLen,minIde,mapNewGenesToOld) +% Notes +% ----- +% The models in the 'models' structure should have named the metabolites in +% the same manner, have their reversible reactions in the same direction +% (run sortModel), and use the same compartment names. To avoid keeping +% unnecessary old genes, the models should not have 'or'-relations in their +% grRules (use expandModel). +% +% The resulting draft model contains only reactions associated with +% orthologous genes. The old (original) genes involved in 'and' relations +% in grRules without any orthologs are still included in the draft model as +% OLD_MODELID_geneName. +% +% "to" and "from" are relative to the new organism.
hitGenes.oldGenes = []; % collect the old genes from the template model (organism) hitGenes.newGenes = []; % collect the new genes of the draft model (target organism) diff --git a/reconstruction/homology/makeFakeBlastStructure.m b/reconstruction/homology/makeFakeBlastStructure.m index 43e25231..e7724cd9 100755 --- a/reconstruction/homology/makeFakeBlastStructure.m +++ b/reconstruction/homology/makeFakeBlastStructure.m @@ -1,28 +1,40 @@ function blastStructure=makeFakeBlastStructure(orthologList,sourceModelID,getModelFor) -% makeFakeBlastStructure -% Makes a fake blastStructure, that would normally be generated by -% getBlast. This allows to feed a predefined list of orthologs to -% getModelFromHomology while retaining the further use of that function. -% For this function to work, it is crucial that the orthologList is a -% cell array where the first column contains the genes from the source -% organism, and the second column contains the genes from the target -% organism -% -% orthologList cell array of orthologous genes, where the first -% column contains the genes from the source organism, -% while the second column contains the genes from the -% target organism -% sourceModelID ID of the model that will be used as template, that -% contains the genes in the first column of -% orthologList -% getModelFor the name of the organism to build a model for, -% identical to the getModelFor parameter in the -% getModelFromHomology function +% makeFakeBlastStructure Make a fake blastStructure from an ortholog list. % -% blastStructure a fake blastStructure, where the evalue, identity -% and aligLen are set at extreme values, such that -% all orthologous pairs will pass the filter when -% running getModelFromHomology +% This is a structure that would normally be generated by getBlast. It +% allows a predefined list of orthologs to be fed to getModelFromHomology +% while retaining the further use of that function.
For this function to +% work, it is crucial that the orthologList is a cell array where the first +% column contains the genes from the source organism, and the second column +% contains the genes from the target organism. +% +% Parameters +% ---------- +% orthologList : cell +% cell array of orthologous genes, where the first column contains the +% genes from the source organism, while the second column contains the +% genes from the target organism. +% sourceModelID : char +% ID of the model that will be used as template, that contains the +% genes in the first column of orthologList. +% getModelFor : char +% the name of the organism to build a model for, identical to the +% getModelFor parameter in the getModelFromHomology function. +% +% Returns +% ------- +% blastStructure : struct +% a fake blastStructure, where the evalue, identity and aligLen are +% set at extreme values, such that all orthologous pairs will pass the +% filter when running getModelFromHomology. +% +% Examples +% -------- +% blastStructure = makeFakeBlastStructure(orthologList,sourceModelID,getModelFor); +% +% See also +% -------- +% getModelFromHomology, getBlast if nargin<3 error('All three parameters should be set'); diff --git a/reconstruction/kegg/constructMultiFasta.m b/reconstruction/kegg/constructMultiFasta.m index de16d8ed..f8594565 100755 --- a/reconstruction/kegg/constructMultiFasta.m +++ b/reconstruction/kegg/constructMultiFasta.m @@ -1,18 +1,25 @@ function constructMultiFasta(model,sourceFile,outputDir) -% constructMultiFasta -% Saves one file in FASTA format for each reaction in the model that has genes +% constructMultiFasta Save a FASTA file per reaction in the model with genes. % -% Input: -% model a model structure -% sourceFile a file with sequences in FASTA format -% outputDir the directory to save the resulting FASTA files in +% Parameters +% ---------- +% model : struct +% a model structure. +% sourceFile : char +% a file with sequences in FASTA format. 
+% outputDir : char +% the directory to save the resulting FASTA files in. % -% The source file is assumed to have the format '>gene identifier -% additional info'. Only the gene identifier is used for matching. This is -% to be compatible with the rest of the code that retrieves information -% from KEGG. +% Notes +% ----- +% The source file is assumed to have the format '>gene identifier +% additional info'. Only the gene identifier is used for matching. This is +% to be compatible with the rest of the code that retrieves information +% from KEGG. % -% Usage: constructMultiFasta(model,sourceFile,outputDir) +% Examples +% -------- +% constructMultiFasta(model,sourceFile,outputDir); sourceFile=char(sourceFile); outputDir=char(outputDir); diff --git a/reconstruction/kegg/getGenesFromKEGG.m b/reconstruction/kegg/getGenesFromKEGG.m index 90886ee9..66c67ae4 100755 --- a/reconstruction/kegg/getGenesFromKEGG.m +++ b/reconstruction/kegg/getGenesFromKEGG.m @@ -1,39 +1,47 @@ function model=getGenesFromKEGG(keggPath,koList) -% getGenesFromKEGG -% Retrieves information on all genes stored in KEGG database +% getGenesFromKEGG Retrieve information on all genes stored in KEGG. % -% Input: -% keggPath if keggGenes.mat is not in the RAVEN\external\kegg -% directory, this function will attempt to read data from a -% local FTP dump of the KEGG database. keggPath is the path -% to the root of this database -% koList the number of genes in KEGG is very large. koList can be a -% cell array with KO identifiers, in which case only genes -% belonging to one of those KEGG orthologies are retrieved -% (optional, default all KOs with associated reactions) +% Parameters +% ---------- +% keggPath : char, optional +% if keggGenes.mat is not in the RAVEN\external\kegg directory, this +% function will attempt to read data from a local FTP dump of the KEGG +% database. keggPath is the path to the root of this database (default +% 'RAVEN/external/kegg'). 
+% koList : cell, optional +% the number of genes in KEGG is very large. koList can be a cell array +% with KO identifiers, in which case only genes belonging to one of +% those KEGG orthologies are retrieved (default all KOs with associated +% reactions). % -% Output: -% model a model structure generated from the database. The -% following fields are filled -% id 'KEGG' -% name 'Automatically generated from KEGG database' -% rxns KO ids -% rxnNames Name for each entry -% genes IDs for all the genes. Genes are saved as organism -% abbreviation:id (same as in KEGG). 'HSA:124' for -% example is alcohol dehydrogenase in Homo sapiens -% rxnGeneMat A binary matrix that indicates whether a specific -% gene is present in a KO id +% Returns +% ------- +% model : struct +% a model structure generated from the database, with fields: % -% NOTE: If the file keggGenes.mat is in the RAVEN\external\kegg directory -% it will be loaded instead of parsing of the KEGG files. If it does not -% exist it will be saved after parsing of the KEGG files. In general, you -% should remove the keggGenes.mat file if you want to rebuild the model -% structure from a newer version of KEGG. +% - id : 'KEGG' +% - name : 'Automatically generated from KEGG database' +% - rxns : KO ids +% - rxnNames : name for each entry +% - genes : IDs for all the genes. Genes are saved as organism +% abbreviation:id (same as in KEGG). 'HSA:124' for example is alcohol +% dehydrogenase in Homo sapiens +% - rxnGeneMat : a binary matrix that indicates whether a specific gene +% is present in a KO id % -% Usage: model=getGenesFromKEGG(keggPath,koList) +% Examples +% -------- +% model = getGenesFromKEGG(keggPath, koList); % -% NOTE: This is how one entry looks in the file +% Notes +% ----- +% If the file keggGenes.mat is in the RAVEN\external\kegg directory it will +% be loaded instead of parsing of the KEGG files. If it does not exist it +% will be saved after parsing of the KEGG files. 
In general, you should +% remove the keggGenes.mat file if you want to rebuild the model structure +% from a newer version of KEGG. +% +% This is how one entry looks in the file: % % ENTRY K11440 KO % NAME gbsB @@ -59,9 +67,6 @@ % The file is not tab-delimited. Instead each label is 12 characters % (except for '///'). % -% Check if the genes have been parsed before and saved. If so, load the -% model. -% if nargin<1 keggPath='RAVEN/external/kegg'; diff --git a/reconstruction/kegg/getKEGGModelForOrganism.m b/reconstruction/kegg/getKEGGModelForOrganism.m index f789b138..a773c09f 100755 --- a/reconstruction/kegg/getKEGGModelForOrganism.m +++ b/reconstruction/kegg/getKEGGModelForOrganism.m @@ -2,120 +2,116 @@ outDir,keepSpontaneous,keepUndefinedStoich,keepIncomplete,... keepGeneral,cutOff,minScoreRatioKO,minScoreRatioG,maxPhylDist,... nSequences,seqIdentity,globalModel) -% getKEGGModelForOrganism -% Reconstructs a genome-scale metabolic model based on protein homology -% to the orthologies in KEGG. If the target species is not available in -% KEGG, the user must select a closely related species. It is also -% possible to circumvent protein homology search (see fastaFile parameter -% for more details) +% getKEGGModelForOrganism Reconstruct a model from KEGG protein homology. % -% Input: -% organismID three or four letter abbreviation of the organism -% (as used in KEGG). If not available, use a closely -% related species. This is used for determing the -% phylogenetic distance. Use 'eukaryotes' or -% 'prokaryotes' to get a model for the whole domain. -% Only applicable if fastaFile is empty, i.e. no -% homology search should be performed -% fastaFile a FASTA file that contains the protein sequences of -% the organism for which to reconstruct a model (optional, -% if no FASTA file is supplied then a model is -% reconstructed based only on the organism -% abbreviation. 
This option ignores all settings -% except for keepSpontaneous, keepUndefinedStoich, -% keepIncomplete and keepGeneral) -% dataDir directory for which to retrieve the input data. -% Should contain a combination of these sub-folders: -% -dataDir\keggdb -% The KEGG database files used in 1a (see below) -% -dataDir\fasta -% The multi-FASTA files generated in 1b (see -% below) -% -dataDir\aligned -% The aligned FASTA files as generated in 2a (see -% below) -% -dataDir\hmms -% The hidden Markov models as generated in 2b or -% downloaded from BioMet Toolbox (see below) -% The final directory in dataDir should be styled as -% prok90_kegg116 or euk90_kegg116, indicating whether -% the HMMs were trained on pro- or eukaryotic -% sequences; using which sequence similarity treshold -% (first set of digits); using which KEGG version -% (second set of digits). (this parameter should -% ALWAYS be provided) -% outDir directory to save the results from the quering of -% the hidden Markov models. The output is specific -% for the input sequences and the settings used. It -% is stored in this manner so that the function can -% continue if interrupted or if it should run in -% parallel. Be careful not to leave output files from -% different organisms or runs with different settings -% in the same folder. They will not be overwritten -% (optional, default is a temporary dir where all *.out -% files are deleted before and after doing the -% reconstruction) -% keepSpontaneous include reactions labeled as "spontaneous". (optional, -% default true) -% keepUndefinedStoich include reactions in the form n A <=> n+1 A. These -% will be dealt with as two separate metabolites -% (optional, default true) -% keepIncomplete include reactions which have been labelled as -% "incomplete", "erroneous" or "unclear" (optional, -% default true) -% keepGeneral include reactions which have been labelled as -% "general reaction". 
These are reactions on the form -% "an aldehyde <=> an alcohol", and are therefore -% unsuited for modelling purposes. Note that not all -% reactions have this type of annotation, and the -% script will therefore not be able to remove all -% such reactions (optional, default false) -% cutOff significance score from HMMer needed to assign -% genes to a KO (optional, default 10^-50) -% minScoreRatioG a gene is only assigned to KOs for which the score -% is >=log(score)/log(best score) for that gene. This -% is to prevent that a gene which clearly belongs to -% one KO is assigned also to KOs with much lower -% scores (optional, default 0.8 (lower is less strict)) -% minScoreRatioKO ignore genes in a KO if their score is -%