tkinter text widget as html


The text widget has a method named dump which can serialize everything in the text widget. It returns a list of tuples. Each tuple will be of the form (key, value, index). key will be one of the following: text, mark, tagon, tagoff, image, or window. The value will be dependent on the key. For example, with tagon and tagoff the value will be the name of the tag. For text it's the text.
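To make the shape of that data concrete, here is a minimal sketch that walks a list of (key, value, index) triples, pairs each tagon with its tagoff, and collects the plain text. The sample list is hand-written for illustration rather than captured from a real widget, and tag_ranges is a hypothetical helper name, not part of tkinter.

```python
def tag_ranges(entries):
    """Pair tagon/tagoff entries into (tag, start, end) triples."""
    open_at = {}  # tag name -> index where the tag was switched on
    ranges = []
    for key, value, index in entries:
        if key == "tagon":
            open_at[value] = index
        elif key == "tagoff" and value in open_at:
            ranges.append((value, open_at.pop(value), index))
    return ranges


# Hand-written sample in the same shape as the dump output (an assumption,
# not real widget output): the word "hi" tagged "b", then a newline.
sample = [
    ("tagon", "b", "1.0"),
    ("text", "hi", "1.0"),
    ("tagoff", "b", "1.2"),
    ("text", "\n", "1.2"),
]

print(tag_ranges(sample))  # [('b', '1.0', '1.2')]

# The plain text is just the "text" values joined together.
plain = "".join(value for key, value, index in sample if key == "text")
```

Working on the list of triples rather than on the widget itself keeps this kind of logic easy to test without opening a window.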

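Consider a text widget with the tags "b" for bold and "h1" for the header. Suppose it contains the heading "Hello, world!" tagged "h1", then a blank line, then the line "this is a test with some bold text." with the words "bold text" tagged "b".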

When you call the dump method (e.g. self.text.dump("1.0", "end")), you get something like the following:

    [
        ('mark', 'current', '1.0'),
        ('tagon', 'h1', '1.0'),
        ('text', 'Hello, world!', '1.0'),
        ('tagoff', 'h1', '1.13'),
        ('text', '\n', '1.13'),
        ('text', '\n', '2.0'),
        ('text', 'this is a test with some ', '3.0'),
        ('tagon', 'b', '3.25'),
        ('text', 'bold text', '3.25'),
        ('tagoff', 'b', '3.34'),
        ('text', '.', '3.34'),
        ('mark', 'insert', '3.35'),
        ('text', '\n', '3.35'),
    ]

A conversion program simply needs to loop over that data and process each key. If you use tag names that correspond to HTML tags (e.g. "b", "h1"), the conversion becomes fairly simple. It might look something like this:

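    def convert(self):
        """Build an HTML string from the contents of self.text."""
        html = ""
        for (key, value, index) in self.text.dump("1.0", "end"):
            # Show each raw dump entry in a second text widget for inspection
            self.converted.insert("end", str((key, value, index)) + "\n")
            if key == "tagon":
                html += "<{}>".format(value)    # opening tag
            elif key == "tagoff":
                html += "</{}>".format(value)   # closing tag
            elif key == "text":
                html += value
        return html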

The above would yield something like this for the example window:

    <h1>Hello, world!</h1>

    this is a test with some <b>bold text</b>.

You'll have to add some additional code to handle paragraphs, since the dump method just returns newlines rather than tags for each paragraph, but otherwise it's a fairly straightforward algorithm.
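One possible way to handle paragraphs is sketched below. It is only a rough illustration, under several assumptions that are mine rather than anything tkinter provides: every tag name is a valid HTML element name, tags never cross a line boundary, each non-empty line is one paragraph, and lines containing a heading tag are left unwrapped. The function name dump_to_html and the BLOCK_TAGS set are hypothetical. It also escapes the text with html.escape so that characters like "<" in the widget don't end up as markup.

```python
from html import escape

# Assumption: tags with these names are block-level, so their line
# is not wrapped in <p>...</p>.
BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def dump_to_html(entries):
    """Convert (key, value, index) triples from Text.dump() into HTML."""
    blocks = []        # finished lines of HTML
    line = []          # HTML fragments of the line being built
    has_block = False  # does the current line contain a block-level tag?

    def flush():
        nonlocal line, has_block
        content = "".join(line)
        if content:  # blank lines produce no output
            blocks.append(content if has_block else "<p>{}</p>".format(content))
        line, has_block = [], False

    for key, value, index in entries:
        if key == "tagon":
            line.append("<{}>".format(value))
            has_block = has_block or value in BLOCK_TAGS
        elif key == "tagoff":
            line.append("</{}>".format(value))
        elif key == "text":
            # A text value may contain newlines; each one ends a paragraph.
            for i, part in enumerate(value.split("\n")):
                if i > 0:
                    flush()
                if part:
                    line.append(escape(part))
        # marks, images and embedded windows are ignored in this sketch
    flush()
    return "\n".join(blocks)


example = [
    ('mark', 'current', '1.0'),
    ('tagon', 'h1', '1.0'),
    ('text', 'Hello, world!', '1.0'),
    ('tagoff', 'h1', '1.13'),
    ('text', '\n', '1.13'),
    ('text', '\n', '2.0'),
    ('text', 'this is a test with some ', '3.0'),
    ('tagon', 'b', '3.25'),
    ('text', 'bold text', '3.25'),
    ('tagoff', 'b', '3.34'),
    ('text', '.', '3.34'),
    ('mark', 'insert', '3.35'),
    ('text', '\n', '3.35'),
]

print(dump_to_html(example))
# <h1>Hello, world!</h1>
# <p>this is a test with some <b>bold text</b>.</p>
```

In a real program you would pass self.text.dump("1.0", "end") instead of the example list. Tags that span several lines, or internal tags such as the selection tag, would need extra handling that this sketch leaves out.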