Royal Statistical Society


Manchester Local Group

 

October 12th 2005, 2pm to 5pm at MANDEC (Manchester Dental Education Centre),
Higher Cambridge Street
(building 41, entrance on corner facing building 35)
(tea will be served about mid-afternoon)

Joint meeting with Manchester University's Biostats Group

Theme: "Bioinformatics"

NICK FIELLER

Gene Expression and Annotation

Various forms of oligonucleotide microarrays allow direct measurement of gene expression in samples from human subjects. Such measurements are made with the aim of providing insight into the biological processes of some condition (e.g. cancer), for example which genes play key roles in its development. Typically, many thousands of genes are measured on relatively few subjects and with relatively sparse replication. From the statistical viewpoint, the major problem is the analysis of very high dimensional data with limited numbers of observations and poor replication.

However, additional information is available. Most obviously, there is concomitant information on the subjects themselves, including severity of condition and demographic information. Appropriate use of this will enhance statistical analysis. Less well known is the availability of information on the genes, which could play a dual role in the analysis. The broad term for this information is 'annotation'. Just as subjects with common characteristics might be expected to have similar gene expression profiles, it might be anticipated that genes with some common annotation feature might display similarities.

A particular form of annotation is whether a gene has been referred to in connection with a biological function or disease. Text mining techniques can determine the number of such citations in a textbase relating genes to a Medical Subject Heading (MeSH) category, as defined in the US National Library of Medicine's controlled vocabulary used for indexing. This can provide a measure of linkage between genes. Since such information is typically extremely sparse, use of the published MeSH hierarchies of terms allows grouping of categories at various levels and hence a measure of further connections between genes.
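As a rough illustration of the idea only (not the speaker's method), the sketch below rolls citation counts up a hierarchy and compares genes at each level. The gene names, counts and category codes are made up, the codes merely imitate dot-separated MeSH tree numbers, and cosine similarity is an assumed choice of linkage measure.

```python
from collections import Counter
from math import sqrt

# Hypothetical citation counts: gene -> {category code: number of citations}.
# Gene names, codes and counts are invented for illustration.
citations = {
    "geneA": {"C04.557.337": 3, "C04.588.180": 1},
    "geneB": {"C04.557.470": 2},
    "geneC": {"G02.111.087": 4},
}

def roll_up(counts, depth):
    """Aggregate counts to the first `depth` levels of each category code."""
    rolled = Counter()
    for code, n in counts.items():
        prefix = ".".join(code.split(".")[:depth])
        rolled[prefix] += n
    return rolled

def linkage(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def linkage_at_depth(g1, g2, depth):
    """Linkage between two genes after grouping categories at a given level."""
    return linkage(roll_up(citations[g1], depth), roll_up(citations[g2], depth))

for depth in (3, 2, 1):
    print(depth, round(linkage_at_depth("geneA", "geneB", depth), 3))
```

In this toy data geneA and geneB share no category at the finest level, so their linkage only becomes visible once categories are grouped at a coarser level.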

TOM NYE

Uncovering evolutionary history: new methods for inferring phylogenies

Evolutionary relationships between species can be represented by a tree: the leaf nodes represent extant species, interior nodes represent ancestral species, and the branch lengths indicate the extent to which species have diverged. Such trees are referred to as phylogenies, and there is a range of different statistical methods available for inferring the phylogeny of a set of species given their DNA sequences.

The first half of the talk serves as a gentle introduction to the main statistical methods used to infer phylogenies. We will then go on to look in more detail at the so-called distance matrix methods and describe some new results in this area.
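For readers unfamiliar with distance matrix methods, the sketch below shows how the input to such a method might be computed from aligned DNA sequences. It is not taken from the talk: the sequences are made up, the Jukes-Cantor correction is one standard choice of distance assumed here, and the tree-building step itself is not shown.

```python
from math import log

# Made-up aligned sequences of equal length, for illustration only.
sequences = {
    "sp1": "ACGTACGTAC",
    "sp2": "ACGTACGTAT",
    "sp3": "ACGAACCTAT",
}

def p_distance(s, t):
    """Proportion of aligned sites at which two sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to equal length")
    return sum(a != b for a, b in zip(s, t)) / len(s)

def jukes_cantor(p):
    """Jukes-Cantor corrected distance; treated as infinite for p >= 0.75."""
    if p >= 0.75:
        return float("inf")
    return -0.75 * log(1.0 - 4.0 * p / 3.0)

def distance_matrix(seqs):
    """Pairwise corrected distances, keyed by (name, name)."""
    names = sorted(seqs)
    return {
        (a, b): jukes_cantor(p_distance(seqs[a], seqs[b]))
        for a in names
        for b in names
    }

d = distance_matrix(sequences)
for pair in (("sp1", "sp2"), ("sp2", "sp3"), ("sp1", "sp3")):
    print(pair, round(d[pair], 4))
```

The correction inflates the raw proportion of differing sites to allow for repeated substitutions at the same site, so corrected distances are never smaller than the observed proportions.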


MAGNUS RATTRAY

Propagating Measurement Uncertainty in Microarray Data Analysis

High density microarrays were first introduced a decade ago and since then they have played an increasingly important role in many areas of biological and biomedical research. Microarrays can be used to simultaneously measure the concentration of many species of RNA molecules within a sample derived from a tissue of interest. This allows the expression level of tens of thousands of genes to be measured in a single experiment. However, this technology is associated with many sources of experimental uncertainty and noise.

In this talk I will discuss approaches for dealing with this uncertainty. I will focus on the analysis of oligonucleotide arrays, such as the popular Affymetrix GeneChip array, which contain multiple short specific probe sequences for each target RNA.

This set of probes can be used to determine an accurate estimate for the target concentration and can also be used to determine the uncertainty associated with this measurement. The measurement uncertainty can then be propagated through the downstream analysis using probabilistic methods. We show how this approach leads to improved methods for combining information from replicate experiments, identifying differential expression and reducing dimensionality.
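As a simplified illustration of what propagating measurement uncertainty can mean, and not the speaker's actual models, the sketch below combines replicate estimates by inverse-variance weighting and carries the resulting variances into a comparison between two conditions. It assumes independent Gaussian measurement errors with known variances; the function names and numbers are hypothetical.

```python
from math import sqrt

def combine_replicates(estimates, variances):
    """Inverse-variance weighted mean of replicate estimates and its variance.

    Assumes independent Gaussian errors with known variances, so that
    noisier replicates receive proportionally less weight.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, 1.0 / total

def differential_z(mean_a, var_a, mean_b, var_b):
    """Standardised difference between two conditions, using the combined variances."""
    return (mean_a - mean_b) / sqrt(var_a + var_b)

# Hypothetical log-expression estimates and variances for one gene.
control = combine_replicates([8.0, 10.0], [1.0, 4.0])
treated = combine_replicates([11.0, 11.5], [0.5, 0.5])
print(control, treated, differential_z(*treated, *control))
```

With equal variances the combined estimate reduces to the ordinary mean; with unequal variances the more precise replicate dominates.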


Paper preprints are available from http://www.bioinf.man.ac.uk/resources/puma/

 
