Interact with other programs using Python Interact with other programs using Python python python

Interact with other programs using Python


If what you're really looking into is a good excuse to teach yourself how to interact with other apps, this may not be the best one. Web browsers are messy, the timing is going to be unpredictable, etc. So, you've taken on a very hard task—and one that would be very easy if you did it the usual way (talk to the server directly, create the text file directly, etc., all without touching any other programs).

But if you do want to interact with other apps, there are a variety of different approaches, and which is appropriate depends on the kinds of apps you need to deal with.

  • Some apps are designed to be automated from the outside. On Windows, this nearly always means they a COM interface, usually with an IDispatch interface, for which you can use pywin32's COM wrappers; on Mac, it means an AppleEvent interface, for which you use ScriptingBridge or appscript; on other platforms there is no universal standard. IE (but probably not Chrome) and Word both have such interfaces.

  • Some apps have a non-GUI interface—whether that's a command line you can drive with popen, or a DLL/SO/DYLIB you can load up through ctypes. Or, ideally, someone else has already written Python bindings for you.

  • Some apps have nothing but the GUI, and there's no way around doing GUI automation. You can do this at a low level, by crafting WM_ messages to send via pywin32 on Windows, using the accessibility APIs on Mac, etc., or at a somewhat higher level with libraries like pywinauto, or possibly at the very high level of selenium or similar tools built to automate specific apps.

So, you could do this with anything from selenium for Chrome and COM automation for Word, to crafting all the WM_ messages yourself. If this is meant to be a learning exercise, the question is which of those things you want to learn today.


Let's start with COM automation. Using pywin32, you directly access the application's own scripting interfaces, without having to take control of the GUI from the user, figure out how to navigate menus and dialog boxes, etc. This is the modern version of writing "Word macros"—the macros can be external scripts instead of inside Word, and they don't have to be written in VB, but they look pretty similar. The last part of your script would look something like this:

word = win32com.client.dispatch('Word.Application')word.Visible = Truedoc = word.Documents.Add()doc.Selection.TypeText(my_string)doc.SaveAs(r'C:\TestFiles\TestDoc.doc')

If you look at Microsoft Word Scripts, you can see a bunch of examples. However, you may notice they're written in VBScript. And if you look around for tutorials, they're all written for VBScript (or older VB). And the documentation for most apps is written for VBScript (or VB, .NET, or even low-level COM). And all of the tutorials I know of for using COM automation from Python, like Quick Start to Client Side COM and Python, are written for people who already know about COM automation, and just want to know how to do it from Python. The fact that Microsoft keeps changing the name of everything makes it even harder to search for—how would you guess that googling for OLE automation, ActiveX scripting, Windows Scripting House, etc. would have anything to do with learning about COM automation? So, I'm not sure what to recommend for getting started. I can promise that it's all as simple as it looks from that example above, once you do learn all the nonsense, but I don't know how to get past that initial hurdle.

Anyway, not every application is automatable. And sometimes, even if it is, describing the GUI actions (what a user would click on the screen) is simpler than thinking in terms of the app's object model. "Select the third paragraph" is hard to describe in GUI terms, but "select the whole document" is easy—just hit control-A, or go to the Edit menu and Select All. GUI automation is much harder than COM automation, because you either have to send the app the same messages that Windows itself sends to represent your user actions (e.g., see "Menu Notifications") or, worse, craft mouse messages like "go (32, 4) pixels from the top-left corner, click, mouse down 16 pixels, click again" to say "open the File menu, then click New".

Fortunately, there are tools like pywinauto that wrap up both kinds of GUI automation stuff up to make it a lot simpler. And there are tools like swapy that can help you figure out what commands you want to send. If you're not wedded to Python, there are also tools like AutoIt and Actions that are even easier than using swapy and pywinauto, at least when you're getting started. Going this way, the last part of your script might look like:

word.Activate()word.MenuSelect('File->New')word.KeyStrokes(my_string)word.MenuSelect('File->Save As')word.Dialogs[-1].FindTextField('Filename').Select()word.KeyStrokes(r'C:\TestFiles\TestDoc.doc')word.Dialogs[-1].FindButton('OK').Click()

Finally, even with all of these tools, web browsers are very hard to automate, because each web page has its own menus, buttons, etc. that aren't Windows controls, but HTML. Unless you want to go all the way down to the level of "move the mouse 12 pixels", it's very hard to deal with these. That's where selenium comes in—it scripts web GUIs the same way that pywinauto scripts Windows GUIs.


The following script uses Automa to do exactly what you want (tested on Word 2010):

def find_lyrics():    print 'Please minimize all other open windows, then enter the song:'    song = raw_input()    start("Google Chrome")    # Disable Google's autocompletion and set the language to English:    google_address = 'google.com/webhp?complete=0&hl=en'    write(google_address, into="Address")    press(ENTER)    write(song + ' lyrics filetype:txt')    click("I'm Feeling Lucky")    press(CTRL + 'a', CTRL + 'c')    press(ALT + F4)    start("Microsoft Word")    press(CTRL + 'v')    press(CTRL + 's')    click("Desktop")    write(song + ' lyrics', into="File name")    click("Save")    press(ALT + F4)    print("\nThe lyrics have been saved in file '%s lyrics' "          "on your desktop." % song)

To try it out for yourself, download Automa.zip from its Download page and unzip into, say, c:\Program Files. You'll get a folder called Automa 1.1.2. Run Automa.exe in that folder. Copy the code above and paste it into Automa by right-clicking into the console window. Press Enter twice to get rid of the last ... in the window and arrive back at the prompt >>>. Close all other open windows and type

>>> find_lyrics()

This performs the required steps.

Automa is a Python library: To use it as such, you have to add the line

from automa.api import *

to the top of your scripts and the file library.zip from Automa's installation directory to your environment variable PYTHONPATH.

If you have any other questions, just let me know :-)


Here's an implementation in Python of @Matteo Italia's comment:

You are approaching the problem from a "user perspective" when you should approach it from a "programmer perspective"; you don't need to open a browser, copy the text, open Word or whatever, you need to perform the appropriate HTTP requests, parse the relevant HTML, extract the text and write it to a file from inside your Python script. All the tools to do this are available in Python (in particular you'll need urllib2 and BeautifulSoup).

#!/usr/bin/env pythonimport codecsimport jsonimport sysimport urllibimport urllib2import bs4  # pip install beautifulsoup4def extract_lyrics(page):    """Extract lyrics text from given lyrics.wikia.com html page."""    soup = bs4.BeautifulSoup(page)    result = []    for tag in soup.find('div', 'lyricbox'):        if isinstance(tag, bs4.NavigableString):            if not isinstance(tag, bs4.element.Comment):                result.append(tag)        elif tag.name == 'br':            result.append('\n')    return "".join(result)# get artist, song to searchartist = raw_input("Enter artist:")song = raw_input("Enter song:")# make requestquery = urllib.urlencode(dict(artist=artist, song=song, fmt="realjson"))response = urllib2.urlopen("http://lyrics.wikia.com/api.php?" + query)data = json.load(response)if data['lyrics'] != 'Not found':    # print short lyrics    print(data['lyrics'])    # get full lyrics    lyrics = extract_lyrics(urllib2.urlopen(data['url']))    # save to file    filename = "[%s] [%s] lyrics.txt" % (data['artist'], data['song'])    with codecs.open(filename, 'w', encoding='utf-8') as output_file:        output_file.write(lyrics)    print("written '%s'" % filename)else:    sys.exit('not found')

Example

$ printf "Queen\nWe are the Champions" | python get-lyrics.py 

Output

I've paid my duesTime after timeI've done my sentenceBut committed no crimeAnd bad mistakesI've made a fewI've had my share of sand kicked [...]written '[Queen] [We are the Champions] lyrics.txt'