data_analysis


1.1. Short description

1.1.2. The GSRI (Gene Set Regulation Index) estimates the fraction of significantly regulated features in a gene set.

1.1.3. For estimation, the empirical cumulative distribution function (ecdf) of the p-values is analyzed. An iterative estimation procedure quantifies the deviation from a uniform distribution of p-values (for which the ecdf is a diagonal line). The procedure also yields standard errors for the estimated fraction and permits significance statements.

1.1.4. In contrast to similar approaches, no reference set of non-regulated genes (e.g. “all genes”) is required.

1.1.5. The most prominent similar approach is GSEA (gene set enrichment analysis).
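The ecdf-based idea from 1.1.3 can be illustrated with a minimal Python sketch. It is not the iterative GSRI procedure: it uses a simpler one-step tail estimator that compares the number of p-values above a cutoff with what the diagonal (uniform) ecdf would predict. The function name `regulated_fraction`, the cutoff `lam`, and the simulated p-values are assumptions made for illustration only.

```python
import random


def regulated_fraction(pvalues, lam=0.5):
    """Estimate the fraction of regulated features from a list of p-values.

    For non-regulated features the p-values are uniform, so a share (1 - lam)
    of them lies above the cutoff lam. If a fraction r of all features is
    regulated and contributes almost no large p-values, only about
    (1 - r) * (1 - lam) of all p-values lie above lam.
    """
    n = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    null_fraction = above / (n * (1.0 - lam))
    # Clip to [0, 1]: sampling noise can push the raw value outside.
    return min(1.0, max(0.0, 1.0 - null_fraction))


# Simulated data (assumption): 30% regulated features with small p-values.
rng = random.Random(1)
n_total, n_regulated = 2000, 600
pvals = [rng.random() ** 20 for _ in range(n_regulated)]       # skewed towards 0
pvals += [rng.random() for _ in range(n_total - n_regulated)]  # uniform

estimate = regulated_fraction(pvals)
print(f"estimated regulated fraction: {estimate:.2f}")
```

Because regulated features occasionally produce large p-values as well, this simple estimator tends to be slightly conservative. It yields neither standard errors nor significance statements, which the iterative procedure described above provides.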

1.2.1. The approach has been applied several times in application projects. It works.

1.2.2. Drawback: collaborators tend, if only weakly, to prefer more prominent approaches.

1.3.1. R-package “les” on Bioconductor

1.4.1.

1.5.1. Weighting the individual p-values leads to LES (loci of enhanced significance).

6.1.1. Transcription Start Site Identification (TSSi) based on sequencing reads

6.1.2. The data did not uniquely indicate TSSs.

6.1.3. The approach has been applied to predict TSSs in the Physcomitrella patens genome. The results were available in the standard genome browser for this organism.

6.2.1. I guess that similar data is not produced any more. Therefore, the approach might be obsolete.

6.3.1. R-package TSSi

6.4.1.

7.1.1. The Mean Optimal Transformation Approach (MOTA) was suggested for investigating non-identifiabilities.

7.1.2. Based on the alternating conditional expectation (ACE) algorithm

7.1.3. Non-parametric method based on kernel estimation to unravel arbitrary dependencies in data

7.1.4. Also works for relations that are not functions, e.g. a circle
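The circle case can be made concrete with a small Python sketch. It does not implement ACE or MOTA: the two transformations are hand-picked (an assumption for illustration), whereas ACE would estimate such transformations from the data. The sketch only shows why transformed variables can expose a dependency that the linear correlation of the raw variables misses.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


# Points on the unit circle: x and y are related, but y is not a function of x.
angles = [2.0 * math.pi * k / 360 for k in range(360)]
x = [math.cos(a) for a in angles]
y = [math.sin(a) for a in angles]

# Linear correlation of the untransformed variables.
r_raw = pearson(x, y)

# Hand-picked transformations (assumption): phi(x) = x^2, theta(y) = 1 - y^2.
# On the circle x^2 + y^2 = 1, so the transformed variables coincide.
phi = [a * a for a in x]
theta = [1.0 - b * b for b in y]
r_transformed = pearson(phi, theta)

print(f"raw correlation: {r_raw:.3f}")
print(f"correlation after transformation: {r_transformed:.3f}")
```

The raw correlation vanishes although the variables are strictly related, while the transformed variables are perfectly correlated.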

7.2.1. Because it is based on kernel estimation, the method is restricted to low-dimensional problems

7.3.1. R-package MOTA (not maintained any more, see CRAN archive)

7.3.2. ACE is available as the R-package “acepack”

7.3.3. Matlab code for ACE is available internally (ask Clemens)

7.4.1. ACE

7.5.1. Hengl S et al. Data-based identifiability analysis of nonlinear dynamical models (2007)

7.6.1. Breiman & Friedman. Estimating optimal transformations for multiple regression and correlation. (1985)

9.2.1. An explicit function whose shape closely resembles ODE solutions of signalling pathways

9.2.2. If only small amounts of data (observables) are available, the approach might serve as an alternative to traditional ODE modelling.

9.2.3. The approach provides self-explanatory parameters (amplitudes, response times, time-scales)

9.2.4. It can be fitted directly to data to obtain an explicit function describing the time dependency (like a smoothing spline)

9.2.5. It can be fitted to ODE solutions to obtain an approximation of the dynamics as an explicit function (e.g. for multiscale models)
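To make the notion of an explicit response function with interpretable parameters tangible, here is a minimal Python sketch. The functional form is hypothetical: this page does not state the formula, so the combination of a sustained and a transient part below, as well as the names `transient_response`, `amp_sustained`, `amp_transient`, `tau_rise` and `tau_decay`, are assumptions chosen for illustration and not the implementation shipped with D2D.

```python
import math


def transient_response(t, amp_sustained, amp_transient, tau_rise, tau_decay,
                       offset=0.0):
    """Explicit response curve with a sustained and a transient component.

    Hypothetical form, assumed for illustration:
      offset + A_sus   * (1 - exp(-t / tau_rise))
             + A_trans * (1 - exp(-t / tau_rise)) * exp(-t / tau_decay)
    """
    rise = 1.0 - math.exp(-t / tau_rise)    # activation, governed by tau_rise
    decay = math.exp(-t / tau_decay)        # decline of the transient part
    return offset + amp_sustained * rise + amp_transient * rise * decay


# Example parameters (assumptions): fast activation, slower decay of the peak.
params = dict(amp_sustained=1.0, amp_transient=3.0, tau_rise=2.0, tau_decay=10.0)
times = [0, 1, 2, 5, 10, 20, 50, 100]
curve = [transient_response(t, **params) for t in times]

for t, value in zip(times, curve):
    print(f"t = {t:5.1f}   f(t) = {value:.3f}")
```

With these values the curve starts at the offset, overshoots, and then settles at the sustained amplitude, so each parameter can be read off the curve directly. A function of this kind could be fitted by least squares either to measurements or to a simulated ODE trajectory.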

9.3.1. D2D is used for fitting

9.3.2. See: D2D Example folder (ToyModels/TransientFunction)

9.4.1. Fitting is very robust

9.4.2. For data, the outcome is great in 90% of cases

9.4.3. For approximating ODEs, the performance depends on the model. The approximation error is smaller than the uncertainties of the data.

9.5.1. Submitted

  • Last modified: 2021/03/14 11:23
  • by admin