Inline labels in Matplotlib

python matplotlib charts coordinates

Update: User cphyc has kindly created a GitHub repository for the code in this answer (see here), and bundled the code into a package which may be installed with pip install matplotlib-label-lines.

Pretty Picture:

In matplotlib it's pretty easy to label contour plots (either automatically or by manually placing labels with mouse clicks). There does not (yet) appear to be any equivalent capability to label data series in this fashion! There may be some semantic reason for not including this feature which I am missing.

Regardless, I have written the following module which allows for semi-automatic plot labelling. It requires only numpy and a couple of functions from the standard math library.

Description

The default behaviour of the labelLines function is to space the labels evenly along the x axis (automatically placing at the correct y-value of course). If you want you can just pass an array of the x co-ordinates of each of the labels. You can even tweak the location of one label (as shown in the bottom right plot) and space the rest evenly if you like.
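The default spacing described above can be sketched as follows. This mirrors the np.linspace call inside labelLines; the helper name default_label_positions is hypothetical and only used for illustration.

```python
import numpy as np

def default_label_positions(xmin, xmax, n_lines):
    """Evenly spaced x positions for n_lines labels, excluding both axis edges.

    Hypothetical helper: it mirrors the np.linspace call inside labelLines.
    """
    # n_lines + 2 points include the two axis limits; drop them so that no
    # label sits on the edge of the plot.
    return np.linspace(xmin, xmax, n_lines + 2)[1:-1]

positions = default_label_positions(0.0, 1.0, 3)
print(positions)
```

With axis limits of 0 and 1 and three labelled lines, the labels land at the quarter points of the x axis.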

In addition, the labelLines function skips lines which have not had a label assigned in the plot command (or, more accurately, lines whose label contains '_line').
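As a rough illustration of that filter: Matplotlib gives unlabelled lines an automatic label that begins with an underscore, and the module looks for the substring '_line'. The exact automatic prefix is an assumption that depends on the Matplotlib version (newer releases appear to use '_child' instead), so a check on the leading underscore may be more robust. The helper has_user_label below is hypothetical and not part of the module.

```python
def has_user_label(label):
    """Return True if a line label looks user-supplied.

    Assumption: automatic labels always start with an underscore
    (e.g. '_line0' or '_child0'), which is also the convention legends
    use to skip artists.
    """
    return not str(label).startswith('_')

labels = ['1', '_line0', 'sample 2', '_child3']
kept = [lab for lab in labels if has_user_label(lab)]
print(kept)
```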

Keyword arguments passed to labelLines or labelLine are passed on to the text function call (defaults are set for some keyword arguments if the calling code does not specify them).
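The defaulting behaviour could be sketched like this. It is a minimal stand-in for the 'if ... not in kwargs' checks in labelLine, written with dict.setdefault; the function name with_text_defaults and its parameters are hypothetical.

```python
def with_text_defaults(kwargs, line_color, face_color):
    """Return a copy of kwargs with defaults filled in where the caller gave none.

    Hypothetical helper mirroring the 'if ... not in kwargs' checks in labelLine.
    """
    out = dict(kwargs)
    out.setdefault('color', line_color)
    # 'ha'/'va' are aliases of the long names, so only set the short form
    # when neither spelling was supplied by the caller.
    if 'horizontalalignment' not in out:
        out.setdefault('ha', 'center')
    if 'verticalalignment' not in out:
        out.setdefault('va', 'center')
    out.setdefault('backgroundcolor', face_color)
    out.setdefault('clip_on', True)
    out.setdefault('zorder', 2.5)
    return out

opts = with_text_defaults({'ha': 'left', 'fontsize': 14}, 'C0', 'white')
print(opts)
```

Values supplied by the caller (here ha and fontsize) are left untouched; everything else falls back to the defaults.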

Issues

Annotation bounding boxes sometimes interfere undesirably with other curves, as shown by the 1 and 10 annotations in the top left plot. I'm not even sure this can be avoided.
It would be nice to specify a y position instead sometimes.
It's still an iterative process to get annotations in the right location.
It only works when the x-axis values are floats.

Gotchas

By default, the labelLines function assumes that all data series span the range specified by the axis limits. Take a look at the blue curve in the top left plot of the pretty picture. If data were only available for the x range 0.5-1, then we couldn't possibly place a label at the desired location (which is a little less than 0.2). See this question for a particularly nasty example. Right now, the code does not intelligently identify this scenario and re-arrange the labels; however, there is a reasonable workaround. The labelLines function takes the xvals argument: a list of x-values specified by the user instead of the default linear distribution across the width. So the user can decide which x-values to use for the label placement of each data series.
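One way to build such an xvals list is sketched below: for each series, pick a position inside the overlap of its data range and the axis limits. The helper label_x_in_range is hypothetical (it is not part of the module) and assumes each series' x data is sorted in increasing order, as labelLine itself does.

```python
def label_x_in_range(xdata, xlim, frac=0.5):
    """Pick a label x position inside both the data range and the axis limits.

    Hypothetical helper, not part of the module. Assumes xdata is sorted in
    increasing order. Returns None if the two ranges do not overlap.
    """
    lo = max(xdata[0], xlim[0])
    hi = min(xdata[-1], xlim[1])
    if lo >= hi:
        return None
    # frac=0.5 puts the label halfway along the visible part of the series.
    return lo + frac * (hi - lo)

series = [[0.0, 0.5, 1.0], [0.5, 0.75, 1.0]]  # second series starts at x = 0.5
xvals = [label_x_in_range(xd, (0.0, 1.0)) for xd in series]
print(xvals)
```

The resulting list could then be passed as the xvals argument of labelLines, provided no entry is None.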

Also, I believe this is the first answer to complete the bonus objective of aligning the labels with the curve they're on. :)

labellines.py:

from math import atan2,degrees
import numpy as np

#Label line with line2D label data
def labelLine(line,x,label=None,align=True,**kwargs):

    ax = line.axes
    xdata = line.get_xdata()
    ydata = line.get_ydata()

    if (x < xdata[0]) or (x > xdata[-1]):
        print('x label location is outside data range!')
        return

    #Find corresponding y co-ordinate and angle of the line
    ip = 1
    for i in range(len(xdata)):
        if x < xdata[i]:
            ip = i
            break

    y = ydata[ip-1] + (ydata[ip]-ydata[ip-1])*(x-xdata[ip-1])/(xdata[ip]-xdata[ip-1])

    if not label:
        label = line.get_label()

    if align:
        #Compute the slope
        dx = xdata[ip] - xdata[ip-1]
        dy = ydata[ip] - ydata[ip-1]
        ang = degrees(atan2(dy,dx))

        #Transform to screen co-ordinates
        pt = np.array([x,y]).reshape((1,2))
        trans_angle = ax.transData.transform_angles(np.array((ang,)),pt)[0]

    else:
        trans_angle = 0

    #Set a bunch of keyword arguments
    if 'color' not in kwargs:
        kwargs['color'] = line.get_color()

    if ('horizontalalignment' not in kwargs) and ('ha' not in kwargs):
        kwargs['ha'] = 'center'

    if ('verticalalignment' not in kwargs) and ('va' not in kwargs):
        kwargs['va'] = 'center'

    if 'backgroundcolor' not in kwargs:
        kwargs['backgroundcolor'] = ax.get_facecolor()

    if 'clip_on' not in kwargs:
        kwargs['clip_on'] = True

    if 'zorder' not in kwargs:
        kwargs['zorder'] = 2.5

    ax.text(x,y,label,rotation=trans_angle,**kwargs)

def labelLines(lines,align=True,xvals=None,**kwargs):

    ax = lines[0].axes
    labLines = []
    labels = []

    #Take only the lines which have labels other than the default ones
    for line in lines:
        label = line.get_label()
        if "_line" not in label:
            labLines.append(line)
            labels.append(label)

    if xvals is None:
        xmin,xmax = ax.get_xlim()
        xvals = np.linspace(xmin,xmax,len(labLines)+2)[1:-1]

    for line,x,label in zip(labLines,xvals,labels):
        labelLine(line,x,label,align,**kwargs)
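A note on the alignment step: labelLine passes the slope angle through ax.transData.transform_angles because one data unit on the x axis and one on the y axis generally cover different numbers of pixels, so the angle in data space differs from the angle on screen. The sketch below shows the idea for plain linear axes only; the helper name screen_angle and its pixels-per-unit parameters are hypothetical, and Matplotlib's own transform handles the general case.

```python
from math import atan2, degrees

def screen_angle(dx, dy, px_per_x, px_per_y):
    """Angle in degrees of a data-space segment as drawn on screen.

    Assumes linear axes where one data unit spans px_per_x pixels
    horizontally and px_per_y pixels vertically (hypothetical parameters).
    """
    return degrees(atan2(dy * px_per_y, dx * px_per_x))

# A slope of 1 in data space is 45 degrees only if both axes share a scale.
equal = screen_angle(1.0, 1.0, 100.0, 100.0)
stretched = screen_angle(1.0, 1.0, 400.0, 100.0)
print(equal, stretched)
```

When the x axis is stretched to four times the pixel density of the y axis, the same segment appears much flatter than 45 degrees, which is why the untransformed angle would leave the label misaligned.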

Test code to generate the pretty picture above:

from matplotlib import pyplot as plt
from scipy.stats import loglaplace,chi2

from labellines import *

X = np.linspace(0,1,500)
A = [1,2,5,10,20]
funcs = [np.arctan,np.sin,loglaplace(4).pdf,chi2(5).pdf]

plt.subplot(221)
for a in A:
    plt.plot(X,np.arctan(a*X),label=str(a))

labelLines(plt.gca().get_lines(),zorder=2.5)

plt.subplot(222)
for a in A:
    plt.plot(X,np.sin(a*X),label=str(a))

labelLines(plt.gca().get_lines(),align=False,fontsize=14)

plt.subplot(223)
for a in A:
    plt.plot(X,loglaplace(4).pdf(a*X),label=str(a))

xvals = [0.8,0.55,0.22,0.104,0.045]
labelLines(plt.gca().get_lines(),align=False,xvals=xvals,color='k')

plt.subplot(224)
for a in A:
    plt.plot(X,chi2(5).pdf(a*X),label=str(a))

lines = plt.gca().get_lines()
l1=lines[-1]
labelLine(l1,0.6,label=r'$Re=${}'.format(l1.get_label()),ha='left',va='bottom',align = False)
labelLines(lines[:-1],align=False)

plt.show()

@Jan Kuiken's answer is certainly well thought out and thorough, but there are some caveats:

it does not work in all cases
it requires a fair amount of extra code
it may vary considerably from one plot to the next

A much simpler approach is to annotate the last point of each plot. The point can also be circled, for emphasis. This can be accomplished with one extra line:

import matplotlib.pyplot as plt

for i, (x, y) in enumerate(samples):
    plt.plot(x, y)
    plt.text(x[-1], y[-1], f'sample {i}')

A variant would be to use the method matplotlib.axes.Axes.annotate.
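A minimal sketch of that variant, assuming made-up data in place of samples: Axes.annotate can offset the text a few points from the last data point so that it does not overlap the line end. The offset of 5 points and the colour matching are illustrative choices, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Made-up data standing in for `samples`.
x = np.linspace(0, 1, 50)
samples = [(x, a * x) for a in (1, 2, 3)]

fig, ax = plt.subplots()
for i, (xs, ys) in enumerate(samples):
    (line,) = ax.plot(xs, ys)
    # Place the text 5 points to the right of the last data point,
    # in the same colour as the line it belongs to.
    ax.annotate(f'sample {i}', xy=(xs[-1], ys[-1]),
                xytext=(5, 0), textcoords='offset points',
                va='center', color=line.get_color())
fig.canvas.draw()
```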

Nice question. A while ago I experimented a bit with this, but I haven't used it a lot because it's still not bulletproof. I divided the plot area into a 32x32 grid and calculated a 'potential field' for the best position of a label for each line according to the following rules:

White space is a good place for a label
Label should be near corresponding line
Label should be away from the other lines

The code was something like this:

import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage


def my_legend(axis = None):

    if axis is None:
        axis = plt.gca()

    N = 32
    Nlines = len(axis.lines)
    print(Nlines)

    xmin, xmax = axis.get_xlim()
    ymin, ymax = axis.get_ylim()

    # the 'point of presence' matrix
    pop = np.zeros((Nlines, N, N), dtype=float)

    for l in range(Nlines):
        # get xy data and scale it to the NxN squares
        xy = axis.lines[l].get_xydata()
        xy = (xy - [xmin,ymin]) / ([xmax-xmin, ymax-ymin]) * N
        xy = xy.astype(np.int32)
        # mask stuff outside plot
        mask = (xy[:,0] >= 0) & (xy[:,0] < N) & (xy[:,1] >= 0) & (xy[:,1] < N)
        xy = xy[mask]
        # add to pop
        for p in xy:
            pop[l][tuple(p)] = 1.0

    # find whitespace, nice place for labels
    ws = 1.0 - (np.sum(pop, axis=0) > 0) * 1.0
    # don't use the borders
    ws[:,0]   = 0
    ws[:,N-1] = 0
    ws[0,:]   = 0
    ws[N-1,:] = 0

    # blur the pop's
    for l in range(Nlines):
        pop[l] = ndimage.gaussian_filter(pop[l], sigma=N/5)

    for l in range(Nlines):
        # positive weights for current line, negative weight for others....
        w = -0.3 * np.ones(Nlines, dtype=float)
        w[l] = 0.5

        # calculate a field
        p = ws + np.sum(w[:, np.newaxis, np.newaxis] * pop, axis=0)
        plt.figure()
        plt.imshow(p, interpolation='nearest')
        plt.title(axis.lines[l].get_label())

        pos = np.argmax(p)  # note, argmax flattens the array first
        # integer division, so the grid indices stay whole numbers in Python 3
        best_x, best_y =  (pos // N, pos % N)
        x = xmin + (xmax-xmin) * best_x / N
        y = ymin + (ymax-ymin) * best_y / N

        axis.text(x, y, axis.lines[l].get_label(),
                  horizontalalignment='center',
                  verticalalignment='center')


plt.close('all')

x = np.linspace(0, 1, 101)
y1 = np.sin(x * np.pi / 2)
y2 = np.cos(x * np.pi / 2)
y3 = x * x
plt.plot(x, y1, 'b', label='blue')
plt.plot(x, y2, 'r', label='red')
plt.plot(x, y3, 'g', label='green')
my_legend()
plt.show()

And the resulting plot:
