HDSheep

Welcome to the HD Sheep Model (OVT73) data exploration tool.

Huntington's Disease (HD) is an autosomal dominantly inherited genetic disorder characterised by spontaneous movements, cognitive impairment, psychiatric disturbance and progressive neurodegeneration (OMIM #143100).

Thanks to our collaborating partners:



This project was initiated and supported through the generosity of the Freemasons Foundation and Freemasons of New Zealand.



Datasets Presented in this Project

The datasets presented in this application have been collected from a single cohort of 5-year-old OVT73 (n=6) and control (n=6) sheep. Harvest information for each of the 12 samples is provided in the table below. An overview of each dataset, including experimental design, is in preparation and will be accessible in the publication "A multi-omic transgenic sheep database for exploratory investigation of Huntington's disease pathogenesis".

Please see the journal papers listed here and on the home page for research that has previously been conducted on these datasets.

Individual datasets can be downloaded using the drop-down box below:











Student's T-test

This exploratory analysis allows individual variables to be investigated within each dataset, comparing data derived from the HD transgenic (OVT73) sheep to controls. The individual measurements for each animal are visualised using box plots, with each animal's unique identifier number displayed. Student's t-test is then applied to compare the two groups (transgenic versus control), producing a p-value to assess the level of significance. In addition, a second p-value is generated using a linear model that includes sex as a covariate in the transgenic versus control comparison. A p-value < 0.05 is considered statistically significant at the nominal level.
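As a rough illustration of the comparison described above, the following Python sketch computes a two-sample Student's t-test using only the standard library. It assumes the equal-variance (pooled) form of the test; the application itself may use a different variant or software. The function names and the example values are hypothetical and are not taken from the OVT73 datasets. The linear model with sex is not covered here.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, integrating the density from 0 to |t| with Simpson's rule."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


def students_t_test(group_a, group_b):
    """Pooled-variance t-test; returns (t statistic, degrees of freedom, p-value)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)


# Hypothetical measurements for one variable (n=6 per group).
transgenic = [10.2, 11.1, 9.8, 10.9, 11.4, 10.5]
control = [9.1, 9.9, 9.4, 10.0, 9.6, 9.2]
t_stat, dof, p_value = students_t_test(transgenic, control)
print(f"t = {t_stat:.3f}, df = {dof}, p = {p_value:.4f}")
```

With six animals per group the test has 10 degrees of freedom, so a single unusual animal can change the result noticeably.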

Please note: Due to small sample sizes this p-value is exploratory and should only be used as an initial indicator of transgenic versus control differences. It is not adjusted for multiple testing. The next page (Bootstrap and Permutation Tests) can be used to generate a more reliable p-value.


The plots can be downloaded as a PDF by selecting the "Download" button at the bottom of the selection box.



Bootstrap and Permutation Tests

Bootstrap and permutation tests are useful statistical tools when sample sizes in the discrete groups are small, helping to mitigate the problem of small sample numbers. Each statistical test can be applied to individual variables of interest within each dataset.

The bootstrap test is used to estimate the variability of the observed group mean for a variable (the sum of the observed values, divided by the number of samples), given the expression values observed in the sheep. This is done for both the transgenic and control samples. The test relies on random sampling with replacement, which allows the sampling distribution of the mean to be estimated. By resampling the expression values and taking the mean 1000 times, a spread of possible group means is presented as a histogram for the transgenic (red) and control (blue) groups. The observed group mean within each group is displayed on the graph as a solid vertical line.
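The resampling step described above can be sketched in Python as follows. This is a minimal illustration rather than the application's own code: the function name, the fixed random seed and the example values are assumptions made for the example.

```python
import random
from statistics import mean


def bootstrap_means(values, n_resamples=1000, seed=0):
    """Resample `values` with replacement and return the mean of each resample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    return [mean(rng.choices(values, k=n)) for _ in range(n_resamples)]


# Hypothetical expression values for one variable in one group (n=6).
transgenic = [10.2, 11.1, 9.8, 10.9, 11.4, 10.5]
means = bootstrap_means(transgenic)
print("observed group mean:", mean(transgenic))
print("range of bootstrap means:", min(means), "to", max(means))
```

Plotting the returned list as a histogram, once for each group, gives the kind of display described above, with the observed mean drawn as a vertical line.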

The permutation test calculates a more reliable p-value for expression level comparisons between control and transgenic sheep for the selected variable of interest. The expression values obtained from both the control and transgenic samples are pooled and randomly reassigned to two groups. The difference in expression level between these two groups is then calculated, and this procedure is repeated 1000 times. The p-value is given by the proportion of permuted differences that are at least as extreme as the original difference between the control and transgenic sheep, thus giving the significance of the original difference observed.

Please note: as the permutation test is run 1000 times, the p-value has a resolution of 0.001.
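The permutation procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions: it uses the absolute difference in group means as the test statistic (a two-sided comparison), and the function name, random seed and example values are hypothetical rather than taken from the application.

```python
import random
from statistics import mean


def permutation_p_value(control, transgenic, n_permutations=1000, seed=0):
    """Proportion of random regroupings whose mean difference is at least
    as extreme as the observed control-versus-transgenic difference."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(transgenic) - mean(control))
    pooled = list(control) + list(transgenic)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical expression values for one variable (n=6 per group).
control = [9.1, 9.9, 9.4, 10.0, 9.6, 9.2]
transgenic = [10.2, 11.1, 9.8, 10.9, 11.4, 10.5]
print("permutation p-value:", permutation_p_value(control, transgenic))
```

Because the count of extreme differences is divided by 1000, the returned p-value is always a multiple of 0.001. With six animals per group there are only 924 distinct ways to split the twelve values, so the random permutations will revisit the same splits.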





Principal Components Analysis

Principal components analysis (PCA) is one of the most commonly used multivariate analysis methods. It is used to visualise individual samples in fewer dimensions.

The first plot on this page shows the proportion of variance between the samples explained by each of the principal components (the first 10 are shown). The values for these are given in the text output below this plot. The more variance explained by the plotted principal components (given by the cumulative proportion value), the better the reduced dimensions represent the relationship between the samples.

The relationship between the samples in two dimensions can be observed in the "Principal Components Plot". Here, the first two principal components are used as the x- and y-axes respectively, and the samples are plotted in the two dimensions, based on levels of variables observed in the datasets used to conduct the test. This visualises the relationship between data derived from the control and HD transgenic sheep.

This test can be conducted on any number and any combination of datasets as desired by adding to those present in the dataset selection box.

Another feature is the ability to observe the effect of each variable on an individual sample's position on the two-dimensional plot, by selecting the check-box in the selection panel. The arrow added to the plot shows the direction in which an increase in the level of that particular variable will move the sample in the two dimensions. The magnitude of the variable's effect is given by the length of the arrow; longer arrows represent a larger effect on the position of the sample in the two dimensions.

By default, the principal components analysis is conducted with the data centred around zero. This is achieved by subtracting the overall mean of each variable (the mean of all twelve samples, transgenics and controls) from the observed value for that variable in each sample.

Finally, by default the analysis scales the variance of each variable to a unit value, enabling a more informative comparison between the effects of different variables. Scaling can be turned off by clicking "No" for the option of whether to scale the variance. Turning it off is not recommended, because comparison information at the individual variable level is lost.
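The centring and scaling steps described above, followed by extraction of the first principal component, can be sketched in Python as follows. This is a simplified illustration using only the standard library: it finds the first component by power iteration, which is an assumption of this sketch and not necessarily how the application computes its components. All function names and data values are hypothetical.

```python
import math
from statistics import mean, stdev


def standardise(rows, scale=True):
    """Centre each variable (column) on zero; optionally scale to unit variance.
    Constant variables must be removed first (their standard deviation is 0)."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    sds = [stdev(c) if scale else 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def covariance_matrix(z):
    """Covariance matrix of already-centred data (samples in rows)."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(cov, iters=500):
    """Leading eigenvector and eigenvalue by power iteration.
    Convergence is slow if the top two eigenvalues are nearly equal."""
    p = len(cov)
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-symmetric start
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue


def proportion_explained(cov, eigenvalue):
    """Share of total variance (the trace of cov) captured by one component."""
    return eigenvalue / sum(cov[i][i] for i in range(len(cov)))


# Hypothetical data: 4 samples (rows) by 3 variables (columns).
data = [[1.0, 20.0, 0.3], [2.0, 24.0, 0.1], [3.0, 31.0, 0.4], [4.0, 33.0, 0.2]]
z = standardise(data)
cov = covariance_matrix(z)
loadings, eigenvalue = first_component(cov)
scores = [sum(a * b for a, b in zip(row, loadings)) for row in z]
print("proportion of variance, PC1:", proportion_explained(cov, eigenvalue))
print("PC1 scores:", scores)
```

Without scaling, variables measured on larger numerical scales would dominate the components, which is one way to see why unit-variance scaling is the default.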




            

Top five influential variables for each of the twelve Principal Components


            



Differential Correlation Statistics

Changes in correlation structure can provide insight into underlying regulatory networks and how these networks may be differentially implicated in a disease process. This analysis presents an exploratory method of investigating the correlation structures between two variables within each dataset, with comparison between transgenic and control groups.

The analysis determines the Pearson's correlation coefficient and associated p-value between two variables (e.g. two genes) for each of the control and transgenic sample groups. It then uses the Fisher r-to-z statistic to assess the significance of the difference between the two correlation coefficients.

This table is ordered by the Fisher r-to-z statistic. Only variable-variable combinations with significant Fisher r-to-z statistics (p < 0.05) are displayed. Where applicable, the top fifty correlations are presented; however, this may be fewer for some datasets, where the number of significant variable correlations does not reach fifty.
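The comparison of two correlation coefficients described above can be sketched in Python using the standard Fisher r-to-z transformation. This is a minimal illustration, not the application's own code; the function names and the example values are hypothetical.

```python
import math
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_r_to_z(r1, n1, r2, n2):
    """z statistic and two-sided p-value for the difference between two
    independent correlations. Requires n > 3 in each group and |r| < 1."""
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical values of two variables in each group (n=6 per group).
ctrl_a = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8]
ctrl_b = [2.0, 2.8, 4.1, 4.9, 6.2, 6.6]
tg_a = [1.2, 2.0, 3.1, 3.9, 5.0, 6.1]
tg_b = [4.0, 2.5, 5.1, 3.0, 4.4, 2.9]
r_ctrl = pearson_r(ctrl_a, ctrl_b)
r_tg = pearson_r(tg_a, tg_b)
z, p = fisher_r_to_z(r_ctrl, 6, r_tg, 6)
print(f"r(control) = {r_ctrl:.3f}, r(transgenic) = {r_tg:.3f}, z = {z:.3f}, p = {p:.4f}")
```

With six samples per group the standard error is sqrt(1/3 + 1/3), roughly 0.82, so only large differences between the two correlations reach p < 0.05.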

The search bar above the table enables users to search for specific variables presented in the table.

This list is downloadable as a .CSV file.




Differential Correlation Plots

The differential correlations investigated within each dataset (as explored in the previous tab) can be visualised here. This is another exploratory tool to investigate relationships between different variables, within the control and transgenic groups. Gains or losses in correlation structure may be indicative of differential regulatory processes or associations that occur in the transgenic model.

Differential correlation plots allow us to visualise variable-variable associations within each group. Each dot represents an individual sample. The correlation coefficient (r) and associated p-value (p) are provided within each plot. The line of best fit is shown in blue for control plots and red for transgenic plots, with confidence intervals in grey. This method of visualisation allows the identification of outliers that cannot be observed in the differential correlation statistics page.

NOTE: Axes scales differ for each plot.


            


About this Project / Contact Us

Huntington's Disease (HD) is relatively uncommon, affecting approximately 1 in 10,000 individuals of European origin. The disease is caused by the expansion of a coding polymorphic CAG trinucleotide repeat located in exon 1 of the Huntingtin (HTT) gene (OMIM #613004). The biological functions of HTT, and the pathogenic mechanism mediated by the mutant allele, are not yet fully understood. There is no therapy in clinical use that can prevent or delay the onset of HD.

Many animal models of HD have been made to enable investigation of the disease process, and for use in pre-clinical pharmacological testing. A collaborative project between The University of Auckland, Harvard Medical School and the South Australian Research and Development Institute resulted in the production of the first large mammalian model of HD: a transgenic sheep line termed OVT73. The OVT73 sheep line carries copies of full-length human huntingtin cDNA with an expanded polyglutamine coding repeat of 73 units. The repeat is relatively stable on transmission with an expression level of approximately half an allele. Sheep have many advantages for neurological study and drug testing due to their large size, long lifespan, and complex brain structure, which is comparable to that of humans. The transgenic sheep show no overt neurological symptoms (some are >10 years of age), but do develop some of the hallmark brain pathology of HD, such as huntingtin-positive inclusions and altered expression of genes that are implicated in HD. The sheep have a measurable circadian abnormality, which is a characteristic behaviour of HD patients, and a metabolic disruption. There are approximately one thousand OVT73 and control sheep available for research on the farm, which is a normal pasture-based environment. Results from a wide range of OVT73 studies suggest that the model recapitulates the prodromal phase of Huntington's disease, before the onset of motor symptoms.

A large series of data has been generated from a single cohort of OVT73 animals (n=6) and control sheep (n=6) killed at 5 years of age. This includes transcriptomic, proteomic and metabolic data collected from multiple tissue samples through various independent experimental analyses. These data have been collated and integrated into a single multidimensional platform presented on this website. The aim is to facilitate the 'multi-omic' exploration of the data for correlations and to provide a more holistic view of the OVT73 sheep compared to controls.

Our aim is to share the OVT73 data with the HD research community, through an interactive database that can be queried for specific genes/proteins/metabolites of interest. The data can be explored using a range of statistical techniques and multivariate methods (presented as separate tabs at the top of the page). Alternatively, the raw datasets can be downloaded for further analyses here. The ultimate goal of this database is to offer another layer of HD data, from sheep, in an attempt to discover the causative mechanisms behind HD progression and identify potential therapeutic targets.

This interactive, web-based application allows the investigation and exploration of 'multi-omic' variables expressed in a sheep model of Huntington's disease (HD), termed OVT73. OVT73, which has been extensively characterised, represents a prodromal form of HD. The aim of this application is to share OVT73 data with the wider HD research community, allow variables of interest to be queried in an HD sheep model, bridge the gap between mouse and human HD studies, and act as a hypothesis generator for HD mechanisms of pathogenesis.

If you would like to contact the Snell lab group or provide feedback, please use the Contact Us panel. If you would like to remain anonymous, please leave the name fields blank. This is an early stage project and any feedback provided will be greatly appreciated.

We would like to thank The University of Auckland for providing the services necessary for hosting and maintaining the application and providing the means to undertake the research that has been conducted.

We would also like to thank our current and previous collaborators for the opportunity to conduct experiments and for allowing the publication of the resulting data.

The development of this application was funded by Brain Research New Zealand (BRNZ).

Created by Matthew Grant and Emily Mears, from the Snell lab group, at The University of Auckland.