how do I get the subtrees of dendrogram made by scipy.cluster.hierarchy


Answering the part of your question regarding tree manipulation...

As explained in another answer, you can get the coordinates of the branches by reading icoord and dcoord from the tree object (the dictionary returned by dendrogram()). For each branch, the coordinates are given from left to right.
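To make that concrete, here is a minimal sketch on made-up data (the random points and the names X, Z and P are only for illustration, not the OP's data). It calls dendrogram() with no_plot=True, so nothing is drawn, and shows that each branch is stored as four x values and four y values, listed left to right.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

rng = np.random.default_rng(0)
X = rng.random((8, 2))            # 8 made-up points in 2-D

Z = linkage(X, method="average")
P = dendrogram(Z, no_plot=True)   # compute the coordinates only, draw nothing

icoord = np.array(P["icoord"])    # x coordinates, one row per branch
dcoord = np.array(P["dcoord"])    # y coordinates (heights), one row per branch

# n points give n - 1 merges; each branch is a 4-point path:
# (left bottom, left top, right top, right bottom)
print(icoord.shape, dcoord.shape)

# The two middle y values of a branch are equal: both are the merge height
print(np.allclose(dcoord[:, 1], dcoord[:, 2]))
```

The second column of dcoord is therefore a convenient way to read off the height at which each branch joins its two children.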

If you want to manually plot the tree you can use something like:

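    import numpy as np
    import matplotlib.pyplot as plt

    def plot_tree(P, pos=None):
        """Plot the branches stored in P, optionally only those at indices pos."""
        plt.clf()
        icoord = np.array(P['icoord'])
        dcoord = np.array(P['dcoord'])
        color_list = np.array(P['color_list'])
        # Axis limits come from the full tree, so a partial plot keeps the same scale
        xmin, xmax = icoord.min(), icoord.max()
        ymin, ymax = dcoord.min(), dcoord.max()
        # pos may be a range, a list, or the tuple returned by nonzero()
        if pos is not None:
            icoord = icoord[pos]
            dcoord = dcoord[pos]
            color_list = color_list[pos]
        # Each branch is one 4-point line drawn in its own color
        for xs, ys, color in zip(icoord, dcoord, color_list):
            plt.plot(xs, ys, color=color)
        plt.xlim(xmin - 10, xmax + 0.1 * abs(xmax))
        plt.ylim(ymin, ymax + 0.1 * abs(ymax))
        plt.show()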

In your code, plot_tree(P) draws the complete tree.

The function allows you to select just some branches:

    plot_tree(P, range(10))


Now you need to know which branches to plot. The fcluster() output can be a little obscure, so another way to choose the branches, based on a minimum and a maximum distance tolerance, is to use the output of linkage() directly (Z in the OP's case):

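    import numpy as np

    dmin = 0.2
    dmax = 0.3
    # Indices of the rows of Z whose merge distance lies in [dmin, dmax].
    # This assumes branch i in P corresponds to row i of Z.
    pos = np.all((Z[:, 2] >= dmin, Z[:, 2] <= dmax), axis=0).nonzero()
    plot_tree(P, pos)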
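One caveat: as far as I can tell, dendrogram() stores its branches in drawing order, which does not have to match the row order of Z. A variant that avoids relying on that order is to filter on the heights stored in P itself. The sketch below uses made-up data, and branches_in_range is a hypothetical helper name, not part of SciPy.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

def branches_in_range(P, dmin, dmax):
    """Return indices of the branches in P whose merge height is in [dmin, dmax].

    Hypothetical helper: it reads the heights from P['dcoord'], so the
    indices refer to P's own branch order rather than to rows of Z.
    """
    heights = np.array(P["dcoord"])[:, 1]   # top of each branch = merge distance
    mask = (heights >= dmin) & (heights <= dmax)
    return np.flatnonzero(mask).tolist()

rng = np.random.default_rng(0)
X = rng.random((20, 2))                     # 20 made-up points in 2-D
Z = linkage(X, method="average")
P = dendrogram(Z, no_plot=True)

pos = branches_in_range(P, dmin=0.2, dmax=0.3)
print(pos)
```

The returned list can be passed as the second argument, as in plot_tree(P, pos).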
