Commit made by the Bioconductor Git-SVN bridge.
Commit id: 47b052bd356a28b6621a64ae2a89492555bcaa9c

    ready for 1.1.6: updated data of package and info in DESCRIPTION file


Commit id: a68381b284aa922c21ed8cc0f006c843f1a02d75

    Toward 1.1.6: running title shorter from full title. For heatmap, combination of subsetting and re-labelling made easier. Again, massive edits in the User Guide.


Commit id: c861c4da45c846ee361329f909a445838e3ee21e

    toward 1.1.6: changed way to deal with synonyms in subset_scores() function. subset_scores() function now creates/updates a filters.GO slot in the output result object, stating the filters and cutoffs applied. Warnings are issued if conflicting filters and cutoff values are applied. Documentation updated for this new feature. A few = replaced by <- in man pages examples. UserGuide updated in places


Commit id: 22839ac06d4b41a65bedcba17bdec4f283aa3c9c

    towards 1.1.6: heatmap_GO replaces blank gene names with the gene feature identifier. heatmap_GO semi-automatically resizes bottom and right margins to accommodate large gene and sample labels, respectively.


Commit id: 9096744d5e7f2393d41a4775e82b639e21878562

    towards 1.1.6: table_genes function supports ordering by score, rank, gene id, and gene name. Some = replaced by <- in the code.


Commit id: 52ea5a0d316a11f13dd786f4919c569d27a29f1f

    Toward 1.1.6: Massive commit (bad practice!). Added pValue_GO function to generate permutation-based GO P-value, help page and section in UserGuide. Support external_gene_id header from Ensembl releases 75 and earlier. Added rank.by slot in GO_analyse output. rerank function updates rank.by slot. rerank function supports p-value. subset_scores supports p-value. progress bar function in toolkit. Updated AlvMac to include RPL36A, an example of gene name with multiple identifiers. Updated annotations with data from Ensembl 75 release for traceability. UserGuide updated to describe how to prepare local annotations. Help pages updated to a more consistent indentation in example section. Updated help pages with example including p-values. Added an example of pValue_GO output and help page. Bug fix for overlap_GO with three go_ids. Update the manual in many places with up-to-date information and more examples (custom annotations, p-values, re-labelling of heatmap, sessionInfo, typos).


Commit id: 371ffd8aadad09cc5acb87b42e06b5455352e54f

    Toward 1.1.6: typo 'labRow' in the man page of heatmap_GO


Commit id: c65147122539f285573fe7895a75ea0db7dbc53a

    Toward 1.1.6: set random seed in vignette to allow reproducible results


Commit id: 5c706d981260e6d993ef79878ad42c0c4622642a

    Toward 1.1.6: heatmap GO row labels can be overridden without affecting the colour-coding of samples.



git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/GOexpress@100632 bc3139a8-67e5-0310-9ffc-ced21a209358
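
The commit that sets a random seed in the vignette relies on a general property of R's random number generator: resetting the seed before a stochastic step makes that step repeatable. The snippet below is a minimal base-R sketch of the idea. The function name `run_once` and the seed value are hypothetical, and `sample()` merely stands in for any stochastic step such as growing a random forest; this is not the vignette's actual code.

```r
# Hypothetical illustration, not taken from the GOexpress vignette.
# sample() stands in for any stochastic analysis step.
run_once <- function(seed) {
    set.seed(seed)                  # reset the RNG state
    sample(seq_len(100), size = 5)  # stochastic step
}

first <- run_once(4543)   # arbitrary seed value
second <- run_once(4543)  # same seed, same draws

identical(first, second)
```

Without the call to `set.seed()`, two successive runs would generally return different draws, which is why results in a vignette can otherwise change from one build to the next.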
Kevin Rue-Albrecht committed Mar 13, 2015
1 parent 9cf7dd4 commit 93a96fc
Showing 37 changed files with 1,343 additions and 431 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -9,3 +9,4 @@
*.synctex.gz
*.toc
*.tiff
+core
27 changes: 15 additions & 12 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: GOexpress
Title: Visualise microarray and RNAseq data using gene ontology annotations
-Version: 1.1.5
-Date: 2014-12-13
+Version: 1.1.6
+Date: 2014-03-13
Authors@R: c(
person(given="Kevin", family="Rue-Albrecht",
role = c("aut", "cre"), email = "kevin.rue@ucdconnect.ie"),
@@ -13,20 +13,23 @@ Authors@R: c(
person(given=c("Stephen", "V."), family="Gordon", role = c("ths")),
person(given=c("David", "E."), family="MacHugh", role = c("ths")))
Description: The package contains methods to visualise the expression profile
-of genes from a microarray or RNA-seq experiment and offers a clustering
-analysis to identify GO terms enriched in genes with expression levels
-best clustering two or more predefined groups of samples. Annotations for
-the genes present in the expression dataset are obtained from Ensembl
-through the biomaRt package. The random forest framework is used to
-evaluate the ability of each gene to cluster samples according to the
+of genes from a microarray or RNA-seq experiment, and offers a
+supervised clustering approach to identify GO terms enriched in genes
+with expression levels best clustering two or more predefined groups of
+samples. Annotations for the genes present in the expression dataset may
+be obtained from Ensembl through the biomaRt package, if not provided by
+the user. The default random forest framework is used to evaluate the
+ability of each gene to cluster samples according to the
factor of interest. Finally, GO terms are scored by averaging the
rank (alternatively, score) of their respective gene sets to cluster
-the samples. An ANOVA approach is also available as an alternative
-statistical framework.
-Depends: R (>= 3.0.2), grid, Biobase (>= 2.22.0)
+the samples. P-values may be computed to assess the significance of GO
+term ranking. Visualisation functions include gene expression profile,
+gene ontology-based heatmaps, and hierarchical clustering of
+experimental samples using gene expression data.
+Depends: R (>= 3.0.2), grid, Biobase (>= 2.22.0), VennDiagram (>= 1.6.5)
Imports: biomaRt (>= 2.18.0), stringr (>= 0.6.2),
ggplot2 (>= 0.9.0), RColorBrewer (>= 1.0), gplots (>= 2.13.0),
-VennDiagram (>= 1.6.5), randomForest (>= 4.6)
+randomForest (>= 4.6)
Suggests: RCurl (>= 1.95), BiocStyle
License: GPL (>= 3)
biocViews: Software, GeneExpression, Transcription, DifferentialExpression,
1 change: 1 addition & 0 deletions NAMESPACE
@@ -12,6 +12,7 @@ export("hist_scores")
export("list_genes")
export("overlap_GO")
export("plot_design")
+export("pValue_GO")
export("quantiles_scores")
export("rerank")
export("subEset")
91 changes: 91 additions & 0 deletions NEWS
@@ -1,3 +1,94 @@
CHANGES IN VERSION 1.1.6
--------------------------

BUG FIXES:

o overlap_GO() was crashing for 3-group Venn diagrams, unless the
VennDiagram package was manually loaded in the workspace using
library(VennDiagram). The function will now run seamlessly without that
manual step, as loading GOexpress will immediately load VennDiagram
in the workspace (stated as a dependency in the DESCRIPTION file).

NEW FEATURES:

o New function pValue_GO() allows calculation of a P-value for each
ontology using permutation of gene labels. This allows users to estimate
the chance of seeing a GO term reach a particular rank (or score).
Features a fancy progress bar shamelessly adapted from StackOverflow.

o heatmap_GO now semi-automatically resizes the bottom and right margins
to accommodate large gene and sample labels, respectively. The user may
control those margins using the "margins" argument of the function.

o heatmap_GO default call now shows the gene feature identifier for those
missing an annotated gene name, when gene names are requested (also
the default).

o A rank.by slot is now created by the GO_analyse() function in the result
object to state the metric used to order the result tables.

o a filters.GO slot stating the filters and cutoffs applied to the result
object is now created or updated by successive uses of the subset_scores()
function. Warnings and notes are displayed if conflicting filters and
cutoffs are applied on a previously filtered result object.

o rerank() function now supports re-ordering by P-value. Note that this
is only applicable to the output of the pValue_GO() function mentioned
above.

o rerank() function now updates the rank.by slot of the result object
to state the current ordering metric.

o subset_scores() function now allows filtering by P-value. Note that this
is only applicable to the output of the pValue_GO() function mentioned
above.

o Backward compatibility with Ensembl annotation releases 75 and earlier,
which used 'external_gene_id', which was renamed to 'external_gene_name'
in releases 76 and later.

o table_genes() function defaults to sorting genes by decreasing score
(equivalent to increasing rank). Gene name and gene feature identifier
are supported as alternative sorting keys.


UPDATED FEATURES:

o Allow user to override row_labels in heatmap_GO. This way, the
color-coding of the samples can be kept, while better descriptions of the
samples can be used to label them, instead of the phenodata values.

o In heatmap_GO(), if the labRow argument is of length 1, it is assumed to
be the name of a column in the phenoData slot. Useful to re-label
subsetted ExpressionSet objects.

GENERAL UPDATES:

o Updated the AlvMac training dataset to include 'RPL36A', an example
of multiple Ensembl gene identifiers annotated to the same gene name.

o Updated the AlvMac example custom annotations to match the updated
dataset.

o Updated the example AlvMac_results to match the updated dataset.

o Set the random seed prior to running the GO_analyse() example in
the vignette. Hopefully, this should allow reproducible testing by the
users.

o In User Guide, load package before loading the attached data.

o In User Guide, new sections and examples dealing with the re-labelling
of heatmap samples, the use of P-values, the re-ranking and subsetting
of results using P-values. New sub-sections for clarity. Emphasis on
the use and generation of local annotation, rather than use of current
online Ensembl annotation release.

o No more code connecting to the Ensembl server in any of the help files
and User Guide.

o Help pages examples with more consistent indentation of code.

CHANGES IN VERSION 1.1.5
--------------------------
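
The permutation approach that the NEWS entry for pValue_GO() describes (shuffle the gene labels, then ask how often a GO term does at least as well by chance) can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy ranks, the gene set, the object names, and the choice of average rank as the statistic are all hypothetical, and the actual pValue_GO() implementation may differ in its statistic and in how it counts ties.

```r
# Hypothetical sketch of a permutation-based P-value for one gene set.
# None of these names are GOexpress internals.
set.seed(1)

# Toy data: 100 genes ranked 1..100 (1 = best), and one "GO term"
# annotated to five of them.
gene_ranks <- setNames(seq_len(100), paste0("gene", seq_len(100)))
term_genes <- c("gene2", "gene5", "gene9", "gene14", "gene30")

# Observed statistic: average rank of the genes annotated to the term.
observed <- mean(gene_ranks[term_genes])

# Null distribution: shuffle the gene labels, recompute the statistic.
n_perm <- 1000
null_stats <- replicate(n_perm, {
    shuffled <- setNames(gene_ranks, sample(names(gene_ranks)))
    mean(shuffled[term_genes])
})

# Fraction of permutations in which the term scores at least as well
# (average rank at most the observed one).
p_value <- sum(null_stats <= observed) / n_perm
```

A small `p_value` here means that few random relabellings give the term an average rank as good as the observed one. The resolution of the estimate is bounded by `1 / n_perm`, so more permutations are needed to resolve very small P-values.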

25 changes: 16 additions & 9 deletions R/analysis.R
@@ -250,8 +250,8 @@ GO_analyse <- function(
"Non-NULL GO_genes argument: Ignoring 'biomart_dataset' ",
"and 'microarray' arguments."
)
-biomart_dataset = ""
-microarray = ""
+biomart_dataset <- ""
+microarray <- ""
}
mart <- NULL
}
@@ -332,7 +332,7 @@ GO_analyse <- function(
if (! "name_1006" %in% colnames(all_GO)){
# Allow the header "name" but internally convert it to name_1006
if ("name" %in% colnames(all_GO)){
-colnames(all_GO)[colnames(all_GO) == "name"] = "name_1006"
+colnames(all_GO)[colnames(all_GO) == "name"] <- "name_1006"
}
# else if could allow more headers
else {
@@ -347,7 +347,7 @@ GO_analyse <- function(
if ("namespace" %in% colnames(all_GO)){
colnames(all_GO)[
colnames(all_GO) == "namespace"
-] = "namespace_1006"
+] <- "namespace_1006"
}
# else if could allow more headers
else {
@@ -449,7 +449,7 @@ GO_analyse <- function(
all_genes <- getBM(
attributes=c(
"ensembl_gene_id",
-"external_gene_name",
+"external_gene_name", # since Ensembl release 76
"description"
),
filters="ensembl_gene_id",
@@ -461,7 +461,7 @@ GO_analyse <- function(
all_genes <- getBM(
attributes=c(
microarray,
-"external_gene_name",
+"external_gene_name", # since Ensembl release 76
"description"
),
filters=microarray,
@@ -482,9 +482,14 @@ GO_analyse <- function(
if ("name" %in% colnames(all_genes)){
colnames(all_genes)[
colnames(all_genes) == "name"
-] = "external_gene_name"
+] <- "external_gene_name"
}
-# else if could allow more headers
+else if ("external_gene_id" %in% colnames(all_genes)){
+colnames(all_genes)[
+colnames(all_genes) == "external_gene_id"
+] <- "external_gene_name"
+}
+# "else if" could allow more synonym headers
else {
warning(
"We encourage the use of a \"name\" column describing",
@@ -601,6 +606,7 @@ GO_analyse <- function(
factor=f,
method=method,
subset=subset,
+rank.by=rank.by,
ntree=ntree,
mtry=mtry
)
@@ -614,7 +620,8 @@ GO_analyse <- function(
genes=genes_score,
factor=f,
method=method,
-subset=subset
+subset=subset,
+rank.by=rank.by
)
)
}
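
The header handling added to GO_analyse() in this diff (accept "name" or the pre-release-76 Ensembl header "external_gene_id", and rename either to "external_gene_name") can be pulled out into a small standalone helper for illustration. The sketch below is not part of GOexpress: the function name, its arguments, and the toy identifiers are hypothetical. It only mirrors the order of checks visible in the diff.

```r
# Hypothetical helper, not a GOexpress function: rename an accepted
# synonym header to the canonical gene name column.
normalise_gene_name_header <- function(
    all_genes,
    canonical = "external_gene_name",
    synonyms = c("name", "external_gene_id")
) {
    # Nothing to do if the canonical header is already present
    if (canonical %in% colnames(all_genes)) {
        return(all_genes)
    }
    # Synonyms present in the table, in order of preference
    found <- synonyms[synonyms %in% colnames(all_genes)]
    if (length(found) > 0) {
        colnames(all_genes)[colnames(all_genes) == found[1]] <- canonical
    } else {
        warning(
            "No gene name column found; expected one of: ",
            paste(c(canonical, synonyms), collapse = ", ")
        )
    }
    all_genes
}

# Toy annotation table using the older Ensembl header (made-up values)
old_style <- data.frame(
    gene_id = c("id1", "id2"),
    external_gene_id = c("GENE_A", "GENE_B"),
    stringsAsFactors = FALSE
)
new_style <- normalise_gene_name_header(old_style)
colnames(new_style)
```

Keeping the synonyms in a vector means a further accepted header only requires one more entry, which is the extension the "else if" comment in the diff hints at.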