Outlier Charts

RightChain Scatter Plots

Outlier Charts display the distribution of data points relative to a range of “normalcy”, so that observations that differ significantly from the rest stand out. “Outliers” is one of many visualization types available to users in RightChain AI. The chart is configured through the following settings:

Tab Name: Names the visualization.
Tab Description: Creates a descriptive tooltip that appears when hovering over the tab name.
Dimension: Selects the dimension of the Outlier Chart.
Y Axis Variable: Selects the key metric displayed in the Outlier Chart. Typical metrics include cases, pallets, cube, weight, order lines, and dollars.
Statistics: Displays a variety of statistics related to the solution and/or input data, including minimum, 1st quartile, median, mean, 3rd quartile, maximum, correlations, standard deviations, and kurtosis.

Data labels may also be customized in the visualization.
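
To make the statistics listed above concrete, here is a minimal Python sketch that computes most of them for a single metric. It is not RightChain's implementation: the function name `summary_statistics`, the sample data, the "inclusive" quartile method, and the use of population excess kurtosis are all assumptions chosen for illustration, and the product may define these quantities differently.

```python
import statistics


def summary_statistics(values):
    """Compute descriptive statistics for one metric (hypothetical helper)."""
    # Quartiles via linear interpolation over the data (assumed convention).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.fmean(values)
    std_dev = statistics.pstdev(values)  # population standard deviation
    # Excess kurtosis: fourth central moment / variance squared, minus 3.
    m4 = sum((v - mean) ** 4 for v in values) / len(values)
    kurtosis = m4 / std_dev ** 4 - 3 if std_dev > 0 else float("nan")
    return {
        "minimum": min(values),
        "q1": q1,
        "median": median,
        "mean": mean,
        "q3": q3,
        "maximum": max(values),
        "std_dev": std_dev,
        "kurtosis": kurtosis,
    }


# Made-up sample: cases per order, with one unusually large order.
cases_per_order = [12, 15, 14, 13, 16, 15, 14, 120]
stats = summary_statistics(cases_per_order)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

In this sample the mean sits well above the third quartile, which is the kind of gap that hints at an extreme value before any chart is drawn.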

Fundamentals of Statistical Outliers

Statistical outliers are data points that significantly differ from other observations in a data set. They are notable because they do not fit the expected pattern or the majority of the data, and can significantly influence statistical analyses and results. Here’s a bit more detail about outliers:

Characteristics of Outliers

Extreme Values: Outliers can be much higher or lower than the rest of the data points. They can appear in both tails of the data distribution.
Impact on Mean and Standard Deviation: Outliers can skew the mean and inflate the standard deviation of a data set, which can lead to misleading conclusions.
Various Causes: Outliers may reflect variability in the measurement or experimental error; in other cases, they indicate that the population is structured differently than assumed.
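
The effect on the mean and standard deviation can be seen in a small Python sketch. The data below are invented for illustration (cases per order, with one extreme value appended); only the standard library `statistics` module is used.

```python
import statistics

# Hypothetical sample: typical orders, then the same orders plus one extreme value.
typical = [12, 15, 14, 13, 16, 15, 14]
with_outlier = typical + [120]

for label, data in (("without outlier", typical), ("with outlier", with_outlier)):
    mean = statistics.fmean(data)
    std_dev = statistics.stdev(data)   # sample standard deviation
    median = statistics.median(data)
    print(f"{label}: mean={mean:.2f} std_dev={std_dev:.2f} median={median:.2f}")
```

A single extreme point roughly doubles the mean and inflates the standard deviation many times over, while the median barely moves.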

Identifying Outliers
Outliers can be identified in several ways:

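Statistical Tests: Formal tests such as Grubbs’ test, Dixon’s Q test, or the generalized extreme Studentized deviate (ESD) test can flag outliers. These tests assume the underlying data are approximately normally distributed.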
Visualization Tools: Box plots, scatter plots, and histograms can visually suggest the presence of outliers.
Z-Scores: A Z-score measures how many standard deviations a data point lies from the mean. Points whose Z-score exceeds a chosen threshold in absolute value (often 3) are considered outliers.
Interquartile Range (IQR): The IQR is the difference between the third and first quartiles. Points that lie more than 1.5 times the IQR above the third quartile or below the first quartile are often considered outliers.
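
The Z-score and IQR rules above can be sketched in a few lines of Python. This is an illustrative sketch, not RightChain's detection logic: the function names, the sample pallet weights, the use of the population standard deviation, and the "inclusive" quartile method are assumptions, and other quartile conventions can move the fences slightly.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std_dev = statistics.pstdev(values)  # population standard deviation
    if std_dev == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mean) / std_dev > threshold]


def iqr_outliers(values, k=1.5):
    """Return values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up pallet weights with one extreme value.
pallet_weights = [480, 495, 502, 510, 498, 505, 490, 515, 500, 508,
                  493, 503, 497, 512, 488, 506, 499, 501, 494, 1200]
print("Z-score rule:", zscore_outliers(pallet_weights))
print("IQR rule:", iqr_outliers(pallet_weights))

# A small sample: the Z-score rule misses the extreme value here.
small_sample = [12, 15, 14, 13, 16, 15, 14, 120]
print("Z-score rule (small sample):", zscore_outliers(small_sample))
print("IQR rule (small sample):", iqr_outliers(small_sample))
```

One caveat worth knowing: when the population standard deviation is used, no Z-score can exceed the square root of n - 1 in absolute value, so a threshold of 3 cannot flag anything in a sample of ten or fewer points. The IQR rule does not have this limitation, which is why the two rules disagree on the small sample.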

Dealing with Outliers
The approach to handling outliers depends on their cause and the purpose of the analysis:

Exclusion: If outliers are due to errors, they can be legitimately removed.
Adjustment: Sometimes, outliers can be adjusted if the reason for their deviation is understood.
Inclusion: In cases where outliers reflect a natural variation in the data, they should be included in the analysis.
Robust Methods: Statistical methods that are less sensitive to outliers, such as median-based or rank-based methods, can be used in place of mean-based ones.
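
As one example of a median-based method, the sketch below scores each point with a "modified Z-score" built from the median and the median absolute deviation (MAD) instead of the mean and standard deviation. This technique is not described in the text above; it is a commonly used robust alternative, and the function names, sample data, scaling constant 0.6745, and cutoff of 3.5 are the conventional choices assumed here rather than anything specific to RightChain.

```python
import statistics


def modified_zscores(values):
    """Score each value by its distance from the median, in scaled MAD units."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; modified Z-scores are undefined")
    # 0.6745 makes the score comparable to a Z-score for normal data.
    return [0.6745 * (v - median) / mad for v in values]


def robust_outliers(values, threshold=3.5):
    """Return values whose absolute modified Z-score exceeds the threshold."""
    scores = modified_zscores(values)
    return [v for v, s in zip(values, scores) if abs(s) > threshold]


# Made-up sample: cases per order, with one unusually large order.
cases_per_order = [12, 15, 14, 13, 16, 15, 14, 120]
print(robust_outliers(cases_per_order))
```

Because the median and MAD are barely affected by the extreme point, the outlier cannot mask itself by inflating the measure of spread, which is what happens with the mean and standard deviation in small samples.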