In this first BioRankings Technical Report, we show that cluster analysis is highly subjective, with results changing for different inputs; why arguments against the Dirichlet-multinomial distribution for microbiome data are wrong; and how finite mixture models operate. We close with an example of this analysis using HMP stool samples.
In this Technical Report we focus on short and wide data. When deciding how to analyze big data, there are two ways of thinking about its shape. Are the data organized in a few columns and lots of rows (tall and narrow), or in lots of columns and few rows (short and wide)? In biomedical research using high-throughput technology (i.e., -omics), short and wide data occur: lots of columns (wide) corresponding to the biological measurements, and few rows (short) corresponding to patients or samples. While this may not be big data in terms of storage, it presents huge problems because there is a very large number of ways to analyze the many biological measurements.
In Technical Report 2 we showed through simulations that all pairwise distances become identical as the number of dimensions approaches infinity. This result can also be proven mathematically. In this Supplement, we demonstrate the theory with real microbiome data.
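The kind of simulation described for Technical Report 2 can be sketched as follows. This is a minimal illustration and not the code used in the reports: the function name `relative_contrast`, the choice of independent uniform coordinates, Euclidean distance, and the sample sizes are all assumptions made here for the example. It measures how far apart the largest and smallest pairwise distances are, relative to the smallest, as the number of dimensions grows.

```python
import numpy as np


def relative_contrast(n_points, n_dims, seed=0):
    """Return (max - min) / min over all pairwise Euclidean distances.

    Points are drawn with independent Uniform(0, 1) coordinates, which is an
    assumption of this sketch and not a model of real microbiome data.
    """
    rng = np.random.default_rng(seed)
    x = rng.random((n_points, n_dims))

    # Squared distances from the Gram matrix: |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)

    # Keep each unordered pair once (upper triangle, excluding the diagonal).
    upper = np.triu_indices(n_points, k=1)
    # Clip tiny negative values caused by floating-point rounding.
    dist = np.sqrt(np.clip(sq_dist[upper], 0.0, None))

    return (dist.max() - dist.min()) / dist.min()


# Hypothetical settings: 50 points, dimensions from 2 to 2000.
contrasts = {d: relative_contrast(50, d) for d in (2, 20, 200, 2000)}
for d, c in contrasts.items():
    print(f"dims={d:5d}  relative contrast={c:.3f}")
```

Under these assumptions the relative contrast shrinks as the number of dimensions increases, which is the sense in which the pairwise distances become indistinguishable from one another.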