Sunday, April 06, 2008

Visualising WSDL with GraphViz and Python

[Image: sample.jpg]

I'm working with SOAP and WSDL at the moment and, golly, that XML can be hard to read.

In the past I've used GraphViz to draw pictures of data structures. Once I did an IVR call flow diagram that helped to define what the client wanted, and the source was then used to write the dialplan directly.

This is only roughly working, but it's starting to be useful. The Python program takes a WSDL file on the command line and outputs a "dot" file that can be rendered with GraphViz. Note that on the Mac, OmniGraffle can also render this format really nicely.

Please let me know if a better tool exists for the Mac or Unix in general (surely one must); any improvements would also be gratefully accepted. I am aware of XMLSpy, which looks great, but there is no version for the Mac and it looks like it costs "Starting at €399", whatever that means.

#!/usr/bin/env python
"""
A little utility to visualise wsdl using Graphviz.
(http://www.graphviz.org/)
OmniGraffle does an excellent job too.
by Peter B Marks
http://marxy.org

License: You are free to use this for any purpose.

History:
6-April-2008 1 Roughly working for comment.
6-April-2008 2 Skip duplicate arcs.
6-April-2008 3 Add verbose command line switch.
6-April-2008 4 Add levels and outfile options.
"""
from optparse import OptionParser
import logging
import xml.etree.ElementTree as ET

# Global lists of the nodes and arcs we've already output,
# so we don't write them twice
ALREADY_DONE_NODES = []
ALREADY_DONE_ARCS = []

RECURSE_LEVELS = 50  # default for --levels
CURRENT_LEVEL = 0

def main():
    global RECURSE_LEVELS
    parser = OptionParser()
    parser.usage = "%prog [options] infile"
    parser.add_option("-v", "--verbose", dest="verbose",
                      action="store_true", help="print lots of info")
    parser.add_option("-o", "--outfile",
                      dest="outfile", help="Output file to write to")
    parser.add_option("-l", "--levels",
                      dest="levels", type="int",
                      help="Number of levels to recurse (default %d)" % RECURSE_LEVELS)
    (options, args) = parser.parse_args()

    if len(args) > 0:
        fileName = args[0]
    else:
        parser.print_help()
        return

    if options.verbose:
        logging.basicConfig(level=logging.DEBUG)
    else:
        logging.basicConfig(level=logging.ERROR)

    if options.levels is not None:
        RECURSE_LEVELS = options.levels

    if options.outfile:
        outfile = options.outfile
    else:
        outfile = fileName + ".dot"

    # Parse first, so a bad input file doesn't leave a half-written output
    logging.debug("Reading file %s" % fileName)
    tree = ET.parse(fileName)
    root = tree.getroot()

    logging.debug("Writing to file: %s" % outfile)
    with open(outfile, "w") as of:
        of.write('digraph g {graph [rankdir = "LR"];\n')
        outputLevel(of, root, root.tag)
        of.write('}\n')
    logging.debug("Done.")

def removeNamespace(fullString):
    """Strip ElementTree's "{namespace}" prefix from a tag, if there is one"""
    # rfind returns -1 when there is no "}", so the whole string is returned
    where = fullString.rfind("}")
    return fullString[where + 1:]

def escapeLabel(text):
    """Backslash-escape characters that are special in a Graphviz record label"""
    for char in '\\{}|<>"':
        text = text.replace(char, "\\" + char)
    return text

def outputLevel(of, itemList, itemListName):
    """Recursive function for outputting a level"""
    global ALREADY_DONE_NODES, ALREADY_DONE_ARCS, RECURSE_LEVELS, CURRENT_LEVEL

    itemListLabel = escapeLabel(removeNamespace(itemList.tag))
    logging.debug("Processing: %s" % itemList.tag)
    itemCounter = 0
    for item in itemList:
        # The node name is just the tag plus the index within the parent, so
        # same-tag children at the same index under different parents share a node
        itemName = "%s-%d" % (item.tag, itemCounter)
        itemCounter += 1
        itemLabel = escapeLabel(removeNamespace(item.tag))
        for attribute in item.attrib.keys():
            itemLabel += "|%s:%s" % (escapeLabel(attribute),
                                     escapeLabel(item.attrib[attribute]))

        if item.text:
            itemText = item.text.strip()
            if len(itemText) > 0:
                itemLabel += "|%s" % escapeLabel(itemText)

        if itemListName not in ALREADY_DONE_NODES:
            ALREADY_DONE_NODES.append(itemListName)
            of.write('"%s" [ label = "%s", shape="record" ];\n' % (itemListName, itemListLabel))
        else:
            logging.debug("Skipping duplicate node:%s" % itemListName)

        if itemName not in ALREADY_DONE_NODES:
            ALREADY_DONE_NODES.append(itemName)
            of.write('"%s" [ label = "%s", shape="record" ];\n' % (itemName, itemLabel))
        else:
            logging.debug("Skipping duplicate node:%s" % itemName)

        arc = '"%s" -> "%s";\n' % (itemListName, itemName)
        if arc not in ALREADY_DONE_ARCS:
            ALREADY_DONE_ARCS.append(arc)
            of.write(arc) # the arc
        else:
            logging.debug("Skipping duplicate arc:%s" % arc)

        CURRENT_LEVEL += 1
        logging.debug("Recursed to level %d" % CURRENT_LEVEL)
        if CURRENT_LEVEL >= RECURSE_LEVELS:
            logging.debug("Hit maximum recursion level %d" % RECURSE_LEVELS)
        else:
            outputLevel(of, item, itemName) # recurse
        # Step back out, so the counter tracks depth rather than nodes visited
        CURRENT_LEVEL -= 1

if __name__ == "__main__":
    main()


The diagram above is a clipping from the output generated from some annotated WSDL.
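
Getting from the ".dot" file to an image is a separate step. Here is a minimal sketch of one way to do it from Python, assuming the Graphviz "dot" program is installed and on the PATH. The helper names dot_command and render, and the file name in the comment, are made up for illustration and are not part of the script above.

```python
import shutil
import subprocess


def dot_command(dot_path, fmt="png"):
    """Build the argument list for rendering dot_path with Graphviz's dot."""
    return ["dot", "-T" + fmt, dot_path, "-o", dot_path + "." + fmt]


def render(dot_path, fmt="png"):
    """Run dot if it is installed; return the image path, or None if it isn't."""
    if shutil.which("dot") is None:
        return None
    subprocess.run(dot_command(dot_path, fmt), check=True)
    return dot_path + "." + fmt


# Hypothetical usage, after running the script on service.wsdl:
#   render("service.wsdl.dot")
```

The same command can of course be typed straight into a shell; wrapping it only helps if you want to render as part of a larger script.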

While I'm here, can I mention how useful the techniques shown in this example make Python for writing little command-line utilities? See how OptionParser makes it simple to handle short and long style command-line options. Marvel at how you can use the logging module to write output only when the --verbose switch is given.
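
Stripped down to just those two ideas, the pattern might look like the sketch below. It uses argparse, the newer standard-library option parser, rather than OptionParser, so treat it as a variation on the technique and not as the script's own code. The function names and the description string are invented for the example.

```python
import argparse
import logging


def parse_args(argv=None):
    """Parse a short/long verbose switch plus one positional input file."""
    parser = argparse.ArgumentParser(description="Hypothetical example utility")
    parser.add_argument("infile", help="file to process")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print lots of info")
    return parser.parse_args(argv)


def configure_logging(verbose):
    """Set up logging and return the level chosen: DEBUG if verbose, else ERROR."""
    level = logging.DEBUG if verbose else logging.ERROR
    logging.basicConfig(level=level)
    return level
```

A main function would call parse_args(), pass args.verbose to configure_logging(), and from then on sprinkle logging.debug(...) calls wherever extra detail is useful. They stay silent unless -v or --verbose is given.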

It's sometimes a bit hard to find simple examples like this.

4 comments:

Anonymous said...

SAOP? ITYM "SOAP".

Peter Marks said...

Thanks. Fixed.

Anonymous said...

Hi Peter, I must confess I've never done any SOAP/WSDL stuff, but I have to say I find the graphviz output more confusing than the original WSDL.

I ran your script over the snowboard WSDL example linked, and got a diagram that didn't seem to explain what was fundamentally a pretty simple example (i.e. three operations). The part of the diagram representing the schema I found particularly confusing... (and I am pretty well-acquainted with XML schema).

Also: recursing with a hard-coded limit like this seems, well, yucky :)

(Oh, and why do I have to enter a CAPTCHA to preview my comment and then *again* to publish it? Grrr...)

Peter Marks said...

Yeah, it needs lots of work to be really useful but I figure it's a good start.

I'm really hoping someone will point me to a better, pre-existing, tool. The idea of using GraphViz to draw the diagram seems sound though.

Fair point on the set recursion limit.

No idea about the captcha; the behavior seems to be changing.