How to use python-docx to replace text in a Word document and save


UPDATE: There are a couple of paragraph-level functions that do a good job of this and can be found on the GitHub site for python-docx.

  1. This one will replace a regex-match with a replacement str. The replacement string will appear formatted the same as the first character of the matched string.
  2. This one will isolate a run so that formatting can be applied to just that word or phrase, such as highlighting each occurrence of "foobar" in the text, making it bold, or setting it in a larger font.

The current version of python-docx does not have a search() function or a replace() function. These are requested fairly frequently, but an implementation for the general case is quite tricky and it hasn't risen to the top of the backlog yet.

Several folks have nonetheless managed to get done what they need using the facilities already present. Here's an example. It has nothing to do with sections, by the way :)

    for paragraph in document.paragraphs:
        if 'sea' in paragraph.text:
            print(paragraph.text)
            # Assigning to .text replaces the whole paragraph
            paragraph.text = 'new text containing ocean'

To search in tables as well, you would need to use something like:

    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    if 'sea' in paragraph.text:
                        paragraph.text = paragraph.text.replace("sea", "ocean")
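The two loops above can be folded into one helper. This is only a sketch: the names `iter_paragraphs` and `replace_everywhere` are made up for illustration, and it assumes the object passed in exposes `.paragraphs` and `.tables` the way a python-docx document or table cell does. It visits body paragraphs first and table paragraphs afterwards, so the order is not document order.

```python
def iter_paragraphs(parent):
    """Yield every paragraph in `parent`, descending into tables.

    `parent` is assumed to expose `.paragraphs` and `.tables`, as a
    python-docx document or table cell does.
    """
    for paragraph in parent.paragraphs:
        yield paragraph
    for table in parent.tables:
        for row in table.rows:
            for cell in row.cells:
                # A cell can hold paragraphs and further tables, so recurse.
                yield from iter_paragraphs(cell)


def replace_everywhere(parent, old, new):
    """Replace `old` with `new` in every paragraph; return how many changed."""
    changed = 0
    for paragraph in iter_paragraphs(parent):
        if old in paragraph.text:
            # Assigning paragraph.text discards character-level formatting.
            paragraph.text = paragraph.text.replace(old, new)
            changed += 1
    return changed
```

With a loaded document this would be called as `replace_everywhere(document, "sea", "ocean")`, followed by `document.save(...)`.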

If you pursue this path, you'll probably discover pretty quickly what the complexities are. If you replace the entire text of a paragraph, that will remove any character-level formatting, like a word or phrase in bold or italic.

By the way, the code from @wnnmaw's answer is for the legacy version of python-docx and won't work at all with versions after 0.3.0.


I needed something to replace regular expressions in docx. I took scanny's answer. To handle style I used the answer from "Python docx Replace string in paragraph while keeping style", added a recursive call to handle nested tables, and came up with something like this:

    import re
    from docx import Document

    def docx_replace_regex(doc_obj, regex, replace):
        for p in doc_obj.paragraphs:
            if regex.search(p.text):
                inline = p.runs
                # Loop added to work with runs (strings with same style)
                for i in range(len(inline)):
                    if regex.search(inline[i].text):
                        text = regex.sub(replace, inline[i].text)
                        inline[i].text = text

        for table in doc_obj.tables:
            for row in table.rows:
                for cell in row.cells:
                    docx_replace_regex(cell, regex, replace)

    regex1 = re.compile(r"your regex")
    replace1 = r"your replace string"
    filename = "test.docx"
    doc = Document(filename)
    docx_replace_regex(doc, regex1, replace1)
    doc.save('result1.docx')

To iterate over a dictionary of replacements:

    for word, replacement in dictionary.items():
        word_re = re.compile(word)
        docx_replace_regex(doc, word_re, replacement)
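One thing to watch with the loop above: `re.compile(word)` treats each key as a regular expression, so a key such as "v1.0" would also match "v1x0". If the keys are meant as literal text, a possible variation is to escape them and combine them into one pattern. The helper name `build_literal_replacer` is hypothetical, and the sketch assumes a non-empty dictionary.

```python
import re


def build_literal_replacer(mapping):
    """Return (pattern, repl) that replace each literal key with its value."""
    # Longest keys first, so "seaside" wins over "sea" when both are keys.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))

    def repl(match):
        # Look up the replacement for whichever key matched.
        return mapping[match.group(0)]

    return pattern, repl


pattern, repl = build_literal_replacer({"sea": "ocean", "v1.0": "v2.0"})
print(pattern.sub(repl, "sea level v1.0, not v1x0"))
# prints: ocean level v2.0, not v1x0
```

Because `regex.sub()` accepts a function as the replacement, the pair can be handed to the function above as `docx_replace_regex(doc, pattern, repl)`. Doing all replacements in one pass also means the output of one replacement is never picked up again by a later dictionary entry.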

Note that this solution replaces a match only if the whole match sits inside a single run, that is, if all of the matched text has the same style in the document.

Also, if text is edited after saving, text with the same style might end up in separate runs. For example, if you open a document that contains the string "testabcd", change it to "test1abcd" and save, then even though it is all the same style there are three separate runs, "test", "1" and "abcd". In this case, replacing test1 won't work.

This is for tracking changes in the document. To merge the text into one run, in Word go to "Options", then "Trust Center", and under "Privacy Options" untick "Store random numbers to improve combine accuracy", then save the document.
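If changing the Word setting is not an option, another approach is to do the replacement across run boundaries in code. The following is a rough sketch, not part of python-docx: `replace_across_runs` is a made-up name, it only needs objects with a writable `.text` attribute (such as the items of `paragraph.runs`), it replaces only the first occurrence, and the new text takes on the formatting of the first run the match touches.

```python
def replace_across_runs(runs, old, new):
    """Replace the first occurrence of `old` in the joined text of `runs`.

    Returns True if a replacement was made, False otherwise.
    """
    if not old:
        return False
    full_text = "".join(run.text for run in runs)
    start = full_text.find(old)
    if start < 0:
        return False
    end = start + len(old)

    offset = 0        # position of the current run within full_text
    written = False   # has the replacement text been placed yet?
    for run in runs:
        run_start = offset
        run_end = offset + len(run.text)
        offset = run_end
        if run_end <= start or run_start >= end:
            continue  # this run lies entirely outside the match
        # Keep the parts of this run that fall before and after the match.
        before = run.text[:max(start - run_start, 0)]
        after = run.text[end - run_start:]
        if not written:
            run.text = before + new + after
            written = True
        else:
            run.text = before + after
    return True
```

For the "test", "1", "abcd" case above, calling `replace_across_runs(paragraph.runs, "test1", "demo")` would leave the three runs holding "demo", "" and "abcd". To replace every occurrence, the call can be repeated until it returns False, provided the new text does not itself contain the old text.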


I got a lot of help from the earlier answers, but for me the code below works the way the simple find-and-replace function in Word does. Hope this helps.

    #!pip install python-docx
    # Start from here if python-docx is installed
    from docx import Document

    # Open the document
    doc = Document('./test.docx')
    Dictionary = {"sea": "ocean", "find_this_text": "new_text"}
    for i in Dictionary:
        for p in doc.paragraphs:
            if p.text.find(i) >= 0:
                p.text = p.text.replace(i, Dictionary[i])
    # Save the changed document
    doc.save('./test.docx')

The above solution has limitations: 1) the paragraph containing "find_this_text" becomes plain text without any formatting, 2) content controls in the same paragraph as "find_this_text" are deleted, and 3) occurrences of "find_this_text" inside content controls or tables are not changed.