
SciPy − optimal_leaf_ordering() Method
The optimal_leaf_ordering() method in SciPy's cluster.hierarchy module improves the arrangement of leaf nodes in hierarchical clustering dendrograms. It flips the left and right branches of the tree so that the sum of the distances between adjacent leaves is minimized, which makes the dendrogram easier to interpret. The clustering itself is not changed.
This method is typically combined with other functions from the same module, such as linkage(), which builds the hierarchical clustering, and leaves_list(), which retrieves the order of the leaves.
This method is helpful in applications such as gene expression analysis and image segmentation, and in other fields where understanding hierarchical relationships within a dataset is essential.
Syntax
Following is the syntax of the SciPy optimal_leaf_ordering() method −
scipy.cluster.hierarchy.optimal_leaf_ordering(Z, y, metric='euclidean')
Parameters
This method accepts the following parameters −
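Z − (ndarray) The hierarchical clustering encoded as a linkage matrix, as produced by the linkage() method.
y − (ndarray) The condensed distance matrix from which the linkage matrix Z was generated. Alternatively, an m by n array holding m observation vectors in n dimensions.
metric − (optional, string or function) The distance metric used to compute pairwise distances when y is an array of observation vectors. It is ignored when y is a condensed distance matrix. Default is 'euclidean'.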
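The two accepted forms of y can be compared directly. The following is a minimal sketch, assuming illustrative random data with a fixed seed (the variable names are not part of the SciPy API). It passes y once as a condensed distance matrix and once as the raw observation array, and checks that both give the same leaf order.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, optimal_leaf_ordering, leaves_list
from scipy.spatial.distance import pdist

# Illustrative data: 8 observations in 3 dimensions (seed fixed for repeatability)
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))

# Condensed Euclidean distances, used to build the linkage matrix
d = pdist(X, metric='euclidean')
Z = linkage(d, method='average')

# Form 1: y is the condensed distance matrix (metric is not consulted)
Z_from_condensed = optimal_leaf_ordering(Z, d)

# Form 2: y is the observation array; distances are computed with `metric`
Z_from_observations = optimal_leaf_ordering(Z, X, metric='euclidean')

same_order = np.array_equal(leaves_list(Z_from_condensed),
                            leaves_list(Z_from_observations))
print("Same leaf order from both forms:", same_order)
```

Passing the condensed matrix avoids recomputing the distances and makes it explicit which distances drive the reordering.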
Return Value
The method returns a copy of the linkage matrix Z, reordered so that the sum of the distances between adjacent leaves is minimized. The hierarchy stays the same; only the left and right order of the branches changes, which makes the clustering easier to interpret.
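The effect on the adjacent-leaf distances can be checked numerically. The sketch below is an illustration under assumed data; adjacent_distance_sum() is a hypothetical helper written for this example, not a SciPy function. It sums the distances between consecutive leaves before and after reordering.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, optimal_leaf_ordering, leaves_list
from scipy.spatial.distance import pdist, squareform

def adjacent_distance_sum(order, D):
    """Sum of distances between consecutive leaves (hypothetical helper)."""
    return float(sum(D[a, b] for a, b in zip(order[:-1], order[1:])))

# Illustrative data: 10 points in 2 dimensions (seed fixed for repeatability)
rng = np.random.default_rng(42)
X = rng.standard_normal((10, 2))

d = pdist(X)          # condensed Euclidean distances
D = squareform(d)     # square form, for lookup by leaf index

Z = linkage(d, method='ward')
before = leaves_list(Z)              # leaf order prior to reordering

Z_opt = optimal_leaf_ordering(Z, d)
after = leaves_list(Z_opt)

cost_before = adjacent_distance_sum(before, D)
cost_after = adjacent_distance_sum(after, D)
print("Adjacent-distance sum before:", cost_before)
print("Adjacent-distance sum after: ", cost_after)

# The merge heights (third column) describe the tree itself and stay the same
heights_unchanged = np.allclose(Z[:, 2], Z_opt[:, 2])
print("Merge heights unchanged:", heights_unchanged)
```

The sum after reordering should never exceed the sum before, because the original order is one of the orderings the method considers.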
Example 1
This example demonstrates how the optimal_leaf_ordering() method improves the clustering visualization by reordering the leaves. In the output below you can compare the leaf order before and after applying optimal leaf ordering.
The reordered leaves minimize the sum of the distances between adjacent leaves in the dendrogram, which makes the clustering structure clearer and easier to interpret.
import numpy as np
from scipy.cluster import hierarchy

# Generating a random 2D dataset with 6 points
rng = np.random.default_rng()
X = rng.standard_normal((6, 2))

# Perform hierarchical clustering using 'ward' method
Z = hierarchy.linkage(X, method='ward')
original_leaves = hierarchy.leaves_list(Z)
print("Original Leaf Order:", original_leaves)

# Apply optimal leaf ordering to reorder the linkage matrix
optimal_Z = hierarchy.optimal_leaf_ordering(Z, X)

# Displaying the leaf order after applying optimal leaf ordering
optimal_leaves = hierarchy.leaves_list(optimal_Z)
print("Optimized Leaf Order:", optimal_leaves)
When we run the above program, it produces a result like the following. The data is random and unseeded, so the exact orders vary between runs −
Original Leaf Order: [3 1 4 2 0 5]
Optimized Leaf Order: [3 4 1 5 0 2]
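The linkage() method also accepts an optimal_ordering=True argument, which applies the same reordering while the linkage matrix is built. As a rough sketch, assuming illustrative seeded data, the two approaches can be compared as follows and should give the same leaf order.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, optimal_leaf_ordering, leaves_list

# Illustrative data: 6 points in 2 dimensions (seed fixed for repeatability)
rng = np.random.default_rng(1)
X = rng.standard_normal((6, 2))

# Two-step approach: build the linkage matrix, then reorder it
Z = linkage(X, method='ward')
two_step = leaves_list(optimal_leaf_ordering(Z, X))

# One-step approach: ask linkage() to reorder while building
one_step = leaves_list(linkage(X, method='ward', optimal_ordering=True))

print("Two-step order:", two_step)
print("One-step order:", one_step)
```

The two-step form is useful when the unordered linkage matrix is also needed, for example to compare the leaf order before and after.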
Example 2
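Let us see how optimal_leaf_ordering() reorders adjacent leaves when y is a precomputed condensed distance matrix. The six values below are the pairwise distances among four points. The call also passes metric='cityblock', but note that the metric is only used when y contains observation vectors, so it has no effect on a condensed distance matrix.
The cityblock metric calculates the distance between two points as the sum of the absolute differences of their coordinates. It is also called the Manhattan distance.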
import numpy as np
from scipy.cluster.hierarchy import linkage, optimal_leaf_ordering, leaves_list

# Condensed distance matrix for 4 points: d01, d02, d03, d12, d13, d23
distance_matrix = np.array([0.5, 1.2, 0.9, 1.8, 1.1, 0.7])

Z = linkage(distance_matrix, method='average')
original_order = leaves_list(Z)
print("Original Leaf Order:", original_order)

# y is already a condensed distance matrix, so metric is ignored here
optimal_Z = optimal_leaf_ordering(Z, distance_matrix, metric='cityblock')
optimal_order = leaves_list(optimal_Z)
print("Optimized Leaf Order:", optimal_order)
When we run the above program, it produces the following result −
Original Leaf Order: [0 1 2 3]
Optimized Leaf Order: [1 0 3 2]
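In Example 2 the metric argument has no effect, because y is a condensed distance matrix. For the cityblock metric to be applied, y has to hold the observation vectors themselves. The following is a small sketch under that assumption; the points array is made up for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, optimal_leaf_ordering, leaves_list
from scipy.spatial.distance import cityblock

# Illustrative observation vectors (5 points in 2 dimensions)
points = np.array([[0.0, 0.0],
                   [1.0, 3.0],
                   [4.0, 1.0],
                   [0.5, 0.5],
                   [3.5, 2.0]])

# Cityblock distance: sum of absolute coordinate differences
manual = np.abs(points[0] - points[1]).sum()   # |0 - 1| + |0 - 3| = 4.0
print("Manual cityblock distance:", manual)
print("SciPy cityblock distance: ", cityblock(points[0], points[1]))

# Build the linkage and reorder it using the same metric on the raw points
Z = linkage(points, method='average', metric='cityblock')
Z_opt = optimal_leaf_ordering(Z, points, metric='cityblock')
print("Optimized leaf order:", leaves_list(Z_opt))
```

Using the same metric in linkage() and optimal_leaf_ordering() keeps the reordering consistent with the distances that produced the tree.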
Example 3
This example shows how the optimal_leaf_ordering() method can enhance the visualization by placing similar clusters next to each other. It uses the cosine distance, computed with pdist(), to measure the dissimilarity between data points.
In the optimized order, leaves with a small cosine distance (high cosine similarity) are placed next to each other as far as the tree structure allows.
import numpy as np
from scipy.cluster.hierarchy import linkage, optimal_leaf_ordering, leaves_list
from scipy.spatial.distance import pdist

data = np.random.rand(5, 3)

# Compute the condensed cosine distances once, so that the linkage and
# the leaf ordering are based on the same distances
cosine_distances = pdist(data, metric='cosine')

Z = linkage(cosine_distances, method='single')
original_order = leaves_list(Z)
print("Original Leaf Order:", original_order)

# Apply optimal_leaf_ordering with cosine distance
optimal_Z = optimal_leaf_ordering(Z, cosine_distances)
optimal_order = leaves_list(optimal_Z)
print("Optimized Leaf Order:", optimal_order)
When we run the above program, it produces a result like the following. The data is random and unseeded, so the exact orders vary between runs −
Original Leaf Order: [3 1 4 0 2]
Optimized Leaf Order: [3 4 1 2 0]
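The optimized order is what a dendrogram drawn from the reordered linkage matrix shows on its leaf axis. As a hedged sketch with illustrative seeded data, dendrogram() can be called with no_plot=True to obtain the leaf layout without drawing anything, so matplotlib is not required.

```python
import numpy as np
from scipy.cluster.hierarchy import (linkage, optimal_leaf_ordering,
                                     leaves_list, dendrogram)
from scipy.spatial.distance import pdist

# Illustrative data: 5 points in 3 dimensions (seed fixed for repeatability)
rng = np.random.default_rng(7)
data = rng.random((5, 3))

# Cosine distances drive both the linkage and the leaf ordering
d = pdist(data, metric='cosine')
Z_opt = optimal_leaf_ordering(linkage(d, method='single'), d)

# no_plot=True computes the layout only and returns it as a dictionary
layout = dendrogram(Z_opt, no_plot=True)
print("Dendrogram leaf order:", layout['leaves'])
print("leaves_list order:    ", leaves_list(Z_opt).tolist())
```

Dropping no_plot=True draws the dendrogram, provided matplotlib is installed.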