Detecting comic strip dialogue bubble regions in images

Even though your actual question concerns step 2 of your processing pipeline, I would like to suggest a different approach that might, imho, be simpler, since you stated that you are open to suggestions.

  1. Using the image from your original step 1 you could create an image without text in the bubbles.


  2. Detect edges on the original image with the text removed. This should work well for the speech bubbles, as the bubble edges are pretty distinct.

    [Figure: Edge detection]

  3. Finally use the edge image and the initially detected "text locations" in order to find those areas within the edge image that contain text.


I am sorry for this very general answer; it is too late here for me to write actual code. If the question is still open and you need or want more hints about my suggestion, I will elaborate in more detail. In the meantime, you could definitely have a look at the "Region-based segmentation" section in the scikit-image docs.

While your overall task goes further, your actual question is about your step 2: how to implement a flood fill algorithm on a data set in which text in bubbles has already been detected.

Since you did not provide source code, I had to create something from scratch that hopefully interfaces well with your output from step 1. For this I simply used two fixed coordinates; you would instead take white points close to the centers of the blobs created from the text you extracted in step 1. As soon as you provide proper code, that interface can be adjusted.
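As a rough illustration of how such a seed point could be picked, here is a minimal sketch. It assumes step 1 gives you a blob center as (x, y) and that the page is an 8-bit grayscale NumPy array; the function name `nearest_white_seed` and its `threshold` parameter are names I made up here, not part of your pipeline.

```python
import numpy as np


def nearest_white_seed(gray, center, threshold=200):
    """Return the (x, y) of the white pixel closest to `center`, or None.

    gray:      2-D uint8 grayscale image.
    center:    (x, y) blob center from text detection (assumed format).
    threshold: pixels strictly above this value count as white (assumption).
    """
    ys, xs = np.nonzero(gray > threshold)
    if xs.size == 0:
        return None  # no white pixel anywhere in the image
    cx, cy = center
    # Squared Euclidean distance from every white pixel to the blob center.
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    i = int(np.argmin(d2))
    return int(xs[i]), int(ys[i])
```

The result is in (x, y) order, which is the order `cv2.floodFill` expects for its seed point. A blob center often lands on a dark letter pixel, which is why the nearest white pixel is searched for instead of using the center directly.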

I took the liberty of filling all internal holes created by the letters you found. If you do not want this, you can skip the hole-filling code at the end of the snippet (the contour step).

For the solution I took ideas from two pieces of code, which I cited in the snippet below. You may find more helpful information there.

Keep us posted on your progress!

```python
import cv2
import numpy as np

# with ideas from:
# cv2.__file__

# Read image
im_in = cv2.imread("gIEXY.png", cv2.IMREAD_GRAYSCALE)
if im_in is None:
    raise FileNotFoundError("could not read gIEXY.png")

# Threshold.
# Set values above 200 to 0.
# Set values of 200 or below to 255.
th, im_th = cv2.threshold(im_in, 200, 255, cv2.THRESH_BINARY_INV)

# Copy the thresholded image.
im_floodfill = im_th.copy()

# Mask used for flood filling.
# Notice the size needs to be 2 pixels larger than the image.
h, w = im_th.shape[:2]
mask = np.zeros((h + 2, w + 2), np.uint8)

# Flood fill from points inside the balloons
cv2.floodFill(im_floodfill, mask, (80, 400), 128)
cv2.floodFill(im_floodfill, mask, (610, 90), 128)

# Invert flood-filled image
im_floodfill_inv = cv2.bitwise_not(im_floodfill)

# Combine the two images to get the foreground
im_out = im_th | im_floodfill_inv

# Create binary image from segments with holes
th, im_th2 = cv2.threshold(im_out, 130, 255, cv2.THRESH_BINARY)

# Create contours to fill holes
# ([-2:] keeps this working whether findContours returns 2 or 3 values)
im_th3 = cv2.bitwise_not(im_th2)
contour, hier = cv2.findContours(
    im_th3, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)[-2:]
for cnt in contour:
    cv2.drawContours(im_th3, [cnt], 0, 255, -1)

segm = cv2.bitwise_not(im_th3)

# Display image
cv2.imshow("Original", im_in)
cv2.imshow("Segmented", segm)
cv2.waitKey(0)
```
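If you want to see what a flood fill does under the hood, here is a minimal pure-Python sketch. It is not how OpenCV implements it; it is just a plain breadth-first fill on a small grid, using 4-connectivity (which is also the default connectivity of `cv2.floodFill`). The function name `flood_fill` and the toy grid are mine.

```python
from collections import deque


def flood_fill(grid, seed, new_value):
    """Fill the 4-connected region around `seed` in place.

    grid: list of lists of ints (row-major); seed: (row, col).
    Returns the number of pixels that were changed.
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = seed
    old_value = grid[r0][c0]
    if old_value == new_value:
        return 0  # nothing to do; also avoids looping forever

    queue = deque([(r0, c0)])
    grid[r0][c0] = new_value
    filled = 1
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours that still carry the old value.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == old_value:
                grid[nr][nc] = new_value
                queue.append((nr, nc))
                filled += 1
    return filled


# Toy "balloon": 0 = interior, 255 = outline or ink (a letter in the middle).
balloon = [
    [255, 255, 255, 255, 255],
    [255,   0,   0,   0, 255],
    [255,   0, 255,   0, 255],
    [255,   0,   0,   0, 255],
    [255, 255, 255, 255, 255],
]
changed = flood_fill(balloon, (1, 1), 128)  # fills the 8 interior pixels
```

The ink pixel in the middle keeps its value, which is exactly the kind of hole left by letters that a separate hole-filling step has to close.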