9.5 Conclusion
Dimension 1: explains the difference between the 2 origins of wines.
Dimension 2: explains the difference between the 2 origins of judges.
Based on the contribution plots, we can see the relationships among products and among judges.
Distance is a versatile aspect of data that is commonly used to analyze relationships between observations and variables in a dataset. Distance can be defined mathematically in many ways, such as Hellinger, Euclidean (the most common, and the one used in this example), Mahalanobis, etc. Many popular clustering methods, such as K-means and hierarchical clustering (H-Clust tree), use distance as the basis for analysis. In DiSTATIS, we performed what is essentially a PCA on distance matrices.

The results from DiSTATIS were able to distinguish the 2 origins of Wines. They also provided detailed insights into the Panelists, even though the research design did not provide an indicator value for Panelists. Using K-Means with n = 2 (as in the design), we can see that there are 2 distinct clusters of Panelists, as expected. Combining the results of the RvMap (Panelists) and the Projections of Products (Compromise Map), we also see a clear distinction between the 2 Origins of Wines. The H-Clust tree was also moderately successful at distinguishing French from South African Wines.

Overall, DiSTATIS provided 2 major benefits. One, it allows us to assess the effects and ratings of each judge in detail, which can be very helpful in controlling for Panelists’ bias (should we scale/center?). Two, DiSTATIS’s idea of Partial and Compromise Maps means that we can project additional data onto the same space for further analysis, such as the semantic analysis performed above, where we projected Vocabulary data.
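To make the phrase "a PCA on a distance matrix" concrete, the following is a minimal sketch in Python with NumPy. It is not the code used for this analysis. It shows only the single-matrix core (classical multidimensional scaling): a squared Euclidean distance matrix is double-centered into a cross-product matrix, which is then eigendecomposed to obtain factor scores. DiSTATIS additionally combines one such matrix per panelist into a weighted compromise before this step, which is not shown here. The toy data and the helper names `double_center` and `factor_scores` are assumptions made for illustration.

```python
import numpy as np


def double_center(sq_dist):
    """Convert a squared-distance matrix into a cross-product matrix."""
    n = sq_dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * centering @ sq_dist @ centering


def factor_scores(cross_product, n_dims=2):
    """Eigendecompose a cross-product matrix and return factor scores."""
    eigvals, eigvecs = np.linalg.eigh(cross_product)
    # eigh returns eigenvalues in ascending order; keep the largest n_dims.
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip tiny negative eigenvalues caused by floating-point error.
    kept = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(kept)


# Hypothetical toy data: 6 "products" described by 2 coordinates.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 2))

# Squared Euclidean distances between all pairs of products.
diff = points[:, None, :] - points[None, :, :]
sq_dist = (diff ** 2).sum(axis=-1)

scores = factor_scores(double_center(sq_dist), n_dims=2)

# Distances between factor scores reproduce the original distances.
score_diff = scores[:, None, :] - scores[None, :, :]
recovered = (score_diff ** 2).sum(axis=-1)
print(np.allclose(recovered, sq_dist))
```

Because the toy points live in 2 dimensions, 2 factor dimensions are enough to reproduce the distances exactly. With real sorting or rating data, the first dimensions would only approximate them.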