GCM 2020 supplementary information

The present set of examples and materials is intended to serve as a supplementary information for the conference paper Efficient Computation of Overlaps for Rule Composition: Theory and Z3 Prototyping (Behr et al. 2020).

Preliminaries: setting up the Jupyter environment

If you have not already done so, please consult the Installation instructions chapter of the ReSMT documentation for information on how to install the necessary Python version, Python packages and the necessary additional shell scripts. You will also require a local copy of the ReSMT GitLab repository, which may be obtained as follows:

$ git clone https://gitlab.com/nicolasbehr/ReSMT.git <path-to-your-local-folder>

A convenient method to experiment with this document consists in starting up a Jupyter Lab instance: navigate to the docs/py_and_ipynb_examples folder of your local copy of the repository in a terminal session and execute the command

$ jupyter lab GCM2020.ipynb

This will open an interactive Jupyter session of the present notebook.

Assuming that the current working directory of the Python instance is the aforementioned docs/py_and_ipynb_examples sub-directory of the ReSMT package main directory, the package is loaded as follows:

import os
import timeit
import sys
sys.path.insert(0, os.path.abspath('../../'))
import resmt.datatypes as datatypes
import resmt.experiments as experiments
import resmt.visualizations as visualizations

import numpy
import networkx as nx
import z3

Note

If running the Python file from within an IDE such as e.g. PyCharm, it is alternatively possible to change the working directory to the docs/py_and_ipynb_examples folder manually before running the above code within the IDE, as in

$ os.chdir('/local/path/to/ReSMT/docs/py_and_ipynb_examples')

The polymer model examples presented at GCM 2020

Consider the following rewriting rules, for which only the “plain” rule parts are explicitly depicted, and which should be considered in the context of rewriting over rigid graphs (i.e. graphs respecting the constraints of not containing any of the “forbidden patterns” mentioned in the previous examples):

../_images/rGraphRules.svg

Based upon these rules, consider the following three overlap search examples, where we denote for a given linear rewriting rule r=(O\leftharpoonup I) by out(r):=O and in(r):=I the out- and input interfaces of the rule r, respectively:

  • Example P1: g_A = out(r_{create-edge}), g_B = in(r_{delete-edge})

  • Example P2: g_A = out(r_{c-c_2}), g_B = in(r_{b-c_2})

  • Example P3: g_A = out(r_{c-c_4}), g_B = in(r_{b-c_4})

  • Example P4: g_A = out(r_{c-c_7}), g_B = in(r_{b-c_7})

Running overlap search experiments with various alternative strategies

In each of the following experiments, all specification data necessary (i.e. the definition of the input graphs, the forbidden patterns, output data file names and other parameters for the algorithms) are loaded from data files in the human-readable JSON format. Information on how these data files were generated via methods of the ReSMT package may be found in the Appendix section of this notebook.

# filenames for data files:
graphDataFname = 'data/GCM2020-graph-data'
forbiddenPatternDataFname = 'data/GCM2020-forbidden-pattern-data'

Experiment # 1: a “dry run” with no constraints (in order to determine the number of overlaps without constraints as a reference)

# specify the experiment specification data file:
expSpecDataFname = 'data/GCM2020-experiments-specification-no-constraints'

# load the "dry run" experiment specification (for determining the number of overlaps without constraints):
expSpecData = []
expSpecData += [datatypes.loadExpsDataFromJSONfile(expSpecDataFname,
                                                   'Experiments P1, P2, P3 and P4 without constraints')]

# specify an output file for the result of the experiment:
resultsDataFname = 'results/GCM2020-results-no-constraints'
# run the experiments:
datatypes.runExperiments(expSpecData, resultsDataFname)

Experiment # 2: “direct”” strategy computations

It is important to estimate the complexity of the problems for the larger examples P3 and P4. For example P3, the preparatory experiment of extracting the number of overlaps without constraints reveals that there are 2426 candidate overlaps to generate and test consider in this example. However, the number of candidate overlaps for the example P4 is too large to even generate via the constraint-less algorithm experiment. One may bound the number n_{P4} of candidate overlaps from below by the number n_{P4:V} of vertex-overlaps alone, and from above by the number n_{P4:max} of injective partial overlaps of the sets of elements (vertices and edges) of each graph, yet discarding the homomorphism constraint. Utilizing the formula for the number of injective partial overlaps of two sets (see also the chapter API experiments in the examples gallery of the ReSMT documentation), we thus find the following estimate:

# load the graph data from the 'data/GCM2020-graph-data' JSON file:
fName = 'data/GCM2020-graph-data'
graphData = datatypes.renderTDGdataFromJSONfile(fName)

# number of vertices and edges in graph 'gA-P4':
numVgAP4 = len(graphData['gA-P4']['vDict'].keys())
numEgAP4 = len(graphData['gA-P4']['eDict'].keys())

# number of vertices and edges in graph 'gB-P4':
numVgBP4 = len(graphData['gB-P4']['vDict'].keys())
numEgBP4 = len(graphData['gB-P4']['eDict'].keys())

# lower bound for number of injective partial overlaps between the two graphs:
# the number nP4V of vertex-only overlaps:
nP4V = experiments.numSetInjectivePartialOverlaps(numVgAP4, numVgBP4)

# upper bound for number of injective partial overlaps between the two graphs:
# the number nP4max of overlaps of all graph elements, but discarding the
# homomorphism constraints
nP4max = experiments.numSetInjectivePartialOverlaps(numVgAP4 + numVgAP4, numVgBP4 + numVgBP4)

print('%s <= number of overlaps of gA-P4 and gB-P4 <= %s' % (nP4V, nP4max))

n_{P4:V} = 1174226049 \quad \leq \quad n_{P4} \quad \leq \quad n_{P4:max} = 30777323181433121931264

It is thus clear that the “direct” aka “generate-and-test” strategy is entirely infeasible for this particular example, which is why example P4 is not considered in the input data for this experiment.

# specify the experiment specification data file:
expSpecDataFname = 'data/GCM2020-experiments-specification-direct-strategy'

# load the "direct" strategy experiment specification:
expSpecData = []
expSpecData += [datatypes.loadExpsDataFromJSONfile(expSpecDataFname,
                                                   'Experiments P1, P2, P3 and P4 with \"direct\" strategy')]

# specify an output file for the result of the experiment:
resultsDataFname = 'results/GCM2020-results-direct-strategy'
# run the experiments:
datatypes.runExperiments(expSpecData, resultsDataFname)

Experiment # 3: “DPE” strategy computations

Warning

Please be aware that the computations for the example P3 may take a _very_ long time on a standard machine (on the order of several hours per individual run)!

# specify the experiment specification data file:
expSpecDataFname = 'data/GCM2020-experiments-specification-DPE-strategy'

# load the "DPE" strategy experiment specification:
expSpecData = []
expSpecData += [datatypes.loadExpsDataFromJSONfile(expSpecDataFname,
                                                   'Experiments P1, P2, P3 and P4 with \"DPE\" strategy')]

# specify an output file for the result of the experiment:
resultsDataFname = 'results/GCM2020-results-DPE-strategy'
# run the experiments:
datatypes.runExperiments(expSpecData, resultsDataFname)

Experiment # 4: “implicit” strategy computations

# specify the experiment specification data file:
expSpecDataFname = 'data/GCM2020-experiments-specification-implicit-strategy'

# load the "implicit" strategy experiment specification:
expSpecData = []
expSpecData += [datatypes.loadExpsDataFromJSONfile(expSpecDataFname,
                                                   'Experiments P1, P2, P3 and P4 with \"implicit\" strategy')]

# specify an output file for the result of the experiment:
resultsDataFname = 'results/GCM2020-results-implicit-strategy'
# run the experiments:
datatypes.runExperiments(expSpecData, resultsDataFname)

Appendix: Generating the input data for the examples

Part 1: graph data

# The file name for the graph data file:
fName = 'data/GCM2020-graph-data'

Example P1

S = z3.Solver()

vDictA = {'vA1': '*', 'vA2': '*'}
eDictA = {'eA1': '-'}
stDictA = {'eA1': ('vA1', 'vA2')}
gA = datatypes.Z3TypedDirectedGraph(S, 'gA-P1', vDictA, eDictA, stDictA)

visA = visualizations.nxTypedDirectedGraph(gA,
                                    drawGraph=True,
                                    writeGraph=True,
                                    fileName='images/gA-P1.svg',
                                    gLayout='neato')

vDictB = {'vB1': '*', 'vB2': '*'}
eDictB = {'eB1': '-'}
stDictB = {'eB1': ('vB1', 'vB2')}
gB = datatypes.Z3TypedDirectedGraph(S, 'gB-P1', vDictB, eDictB, stDictB)

visB = visualizations.nxTypedDirectedGraph(gB,
                                    drawGraph=True,
                                    writeGraph=True,
                                    fileName='images/gB-P1.svg',
                                    gLayout='neato')

# write the two graphs to a JSON file, then read out the entire data and generate the two TDGS again
# (to ensure both datasets are indeed stored properly):
gA.writeTDGdataToJSONfile(fName)
gB.writeTDGdataToJSONfile(fName)

# display the graphs:
visualizations.displaySVGtable({'gA-P1': ['images/gA-P1.svg'], 'gB-P1': ['images/gB-P1.svg']}, width="100%")

Example P2

S = z3.Solver()

vDictA = {'vA1': '*', 'vA2': '*', 'vA3': '*'}
eDictA = {'eA1': '-', 'eA2': '-', 'eA3': '-'}
stDictA = {'eA1': ('vA1', 'vA2'), 'eA2': ('vA2', 'vA3'), 'eA3': ('vA3', 'vA1')}
gA = datatypes.Z3TypedDirectedGraph(S, 'gA-P2', vDictA, eDictA, stDictA)

visA = visualizations.nxTypedDirectedGraph(gA,
                                    drawGraph=True,
                                    writeGraph=True,
                                    fileName='images/gA-P2.svg',
                                    gLayout='neato')

vDictB = {'vB1': '*', 'vB2': '*', 'vB3': '*'}
eDictB = {'eB1': '-', 'eB2': '-'}
stDictB = {'eB1': ('vB1', 'vB2'), 'eB2': ('vB2', 'vB3')}
gB = datatypes.Z3TypedDirectedGraph(S, 'gB-P2', vDictB, eDictB, stDictB)

visB = visualizations.nxTypedDirectedGraph(gB,
                                    drawGraph=True,
                                    writeGraph=True,
                                    fileName='images/gB-P2.svg',
                                    gLayout='neato')


# write the two graphs to a JSON file, then read out the entire data and generate the two TDGS again
# (to ensure both datasets are indeed stored properly):
gA.writeTDGdataToJSONfile(fName)
gB.writeTDGdataToJSONfile(fName)

# display the graphs:
visualizations.displaySVGtable({'gA-P2': ['images/gA-P2.svg'], 'gB-P2': ['images/gB-P2.svg']}, width="100%")

Example P3

S = z3.Solver()

vDictA = {'vA1': '*', 'vA2': '*', 'vA3': '*', 'vA4': '*', 'vA5': '*'}
eDictA = {'eA1': '-', 'eA2': '-', 'eA3': '-', 'eA4': '-', 'eA5': '-'}
stDictA = {'eA1': ('vA1', 'vA2'), 'eA2': ('vA2', 'vA3'), 'eA3': ('vA3', 'vA4'),
           'eA4': ('vA4', 'vA5'), 'eA5': ('vA5', 'vA1')}
gA = datatypes.Z3TypedDirectedGraph(S, 'gA-P3', vDictA, eDictA, stDictA)

visA = visualizations.nxTypedDirectedGraph(gA,
                                    drawGraph=True,
                                    writeGraph=True,
                                    fileName='images/gA-P3.svg',
                                    gLayout='neato')

vDictB = {'vB1': '*', 'vB2': '*', 'vB3': '*', 'vB4': '*', 'vB5': '*'}
eDictB = {'eB1': '-', 'eB2': '-', 'eB3': '-', 'eB4': '-'}
stDictB = {'eB1': ('vB1', 'vB2'), 'eB2': ('vB2', 'vB3'), 'eB3': ('vB3', 'vB4'), 'eB4': ('vB4', 'vB5')}
gB = datatypes.Z3TypedDirectedGraph(S, 'gB-P3', vDictB, eDictB, stDictB)

visB = visualizations.nxTypedDirectedGraph(gB,
                                    drawGraph=True,
                                    writeGraph=True,
                                    fileName='images/gB-P3.svg',
                                    gLayout='neato')

# write the two graphs to a JSON file, then read out the entire data and generate the two TDGS again
# (to ensure both datasets are indeed stored properly):
gA.writeTDGdataToJSONfile(fName)
gB.writeTDGdataToJSONfile(fName)

# display the graphs:
visualizations.displaySVGtable({'gA-P3': ['images/gA-P3.svg'], 'gB-P3': ['images/gB-P3.svg']}, width="100%")

Example P4

S = z3.Solver()

vDictA = {'vA1': '*', 'vA2': '*', 'vA3': '*', 'vA4': '*',
          'vA5': '*', 'vA6': '*', 'vA7': '*', 'vA8': '*'}
eDictA = {'eA1': '-', 'eA2': '-', 'eA3': '-', 'eA4': '-',
          'eA5': '-', 'eA6': '-', 'eA7': '-', 'eA8': '-'}
stDictA = {'eA1': ('vA1', 'vA2'), 'eA2': ('vA2', 'vA3'), 'eA3': ('vA3', 'vA4'), 'eA4': ('vA4', 'vA5'),
           'eA5': ('vA5', 'vA6'), 'eA6': ('vA6', 'vA7'), 'eA7': ('vA7', 'vA8'), 'eA8': ('vA8', 'vA1')}
gA = datatypes.Z3TypedDirectedGraph(S, 'gA-P4', vDictA, eDictA, stDictA)

visA = visualizations.nxTypedDirectedGraph(gA,
                                    drawGraph=True,
                                    writeGraph=True,
                                    fileName='images/gA-P4.svg',
                                    gLayout='neato')

vDictB = {'vB1': '*', 'vB2': '*', 'vB3': '*', 'vB4': '*',
          'vB5': '*', 'vB6': '*', 'vB7': '*', 'vB8': '*'}
eDictB = {'eB1': '-', 'eB2': '-', 'eB3': '-', 'eB4': '-',
          'eB5': '-', 'eB6': '-', 'eB7': '-'}
stDictB = {'eB1': ('vB1', 'vB2'), 'eB2': ('vB2', 'vB3'), 'eB3': ('vB3', 'vB4'), 'eB4': ('vB4', 'vB5'),
           'eB5': ('vB5', 'vB6'), 'eB6': ('vB6', 'vB7'), 'eB7': ('vB7', 'vB8')}
gB = datatypes.Z3TypedDirectedGraph(S, 'gB-P4', vDictB, eDictB, stDictB)

visB = visualizations.nxTypedDirectedGraph(gB,
                                    drawGraph=True,
                                    writeGraph=True,
                                    fileName='images/gB-P4.svg',
                                    gLayout='neato')

# write the two graphs to a JSON file, then read out the entire data and generate the two TDGS again
# (to ensure both datasets are indeed stored properly):
gA.writeTDGdataToJSONfile(fName)
gB.writeTDGdataToJSONfile(fName)

# display the graphs:
visualizations.displaySVGtable({'gA-P4': ['images/gA-P4.svg'], 'gB-P4': ['images/gB-P4.svg']}, width="100%")

Part 2: forbidden pattern data

# The file name for the forbidden pattern data file:
fName = 'data/GCM2020-forbidden-pattern-data'

Definition of the four “forbidden” TDG patterns:

Saux = z3.Solver()

vDictN1 = {'v'+str(i): '*' for i in range(3)}
eDictN1 = {'e'+str(i): '-' for i in range(1, 3)}
stDictN1 = {'e'+str(i): ('v0', 'v'+str(i)) for i in range(1, 3)}
graphN1 = datatypes.Z3TypedDirectedGraph(Saux, 'gN1', vDictN1, eDictN1, stDictN1)

visN1 = visualizations.nxTypedDirectedGraph(graphN1,
                                            drawGraph=True,
                                            writeGraph=True,
                                            fileName='images/graphN1.svg',
                                            gLayout='neato')

vDictN2 = {'v'+str(i): '*' for i in range(3)}
eDictN2 = {'e'+str(i): '-' for i in range(1, 3)}
stDictN2 = {'e'+str(i): ('v'+str(i), 'v0') for i in range(1, 3)}
graphN2 = datatypes.Z3TypedDirectedGraph(Saux, 'gN2', vDictN2, eDictN2, stDictN2)

visN2 = visualizations.nxTypedDirectedGraph(graphN2,
                                            drawGraph=True,
                                            writeGraph=True,
                                            fileName='images/graphN2.svg',
                                            gLayout='neato')

vDictN3 = {'v1': '*', 'v2': '*'}
eDictN3 = {'e1': '-', 'e2': '-'}
stDictN3 = {'e1': ('v1', 'v2'), 'e2': ('v1', 'v2')}
graphN3 = datatypes.Z3TypedDirectedGraph(Saux, 'gN3', vDictN3, eDictN3, stDictN3)

visN3 = visualizations.nxTypedDirectedGraph(graphN3,
                                            drawGraph=True,
                                            writeGraph=True,
                                            fileName='images/graphN3.svg',
                                            gLayout='neato')

vDictN4 = {'v1': '*'}
eDictN4 = {'e1': '-', 'e2': '-'}
stDictN4 = {'e1': ('v1', 'v1'), 'e2': ('v1', 'v1')}
graphN4 = datatypes.Z3TypedDirectedGraph(Saux, 'gN4', vDictN4, eDictN4, stDictN4)

visN4 = visualizations.nxTypedDirectedGraph(graphN4,
                                            drawGraph=True,
                                            writeGraph=True,
                                            fileName='images/graphN4.svg',
                                            gLayout='neato')


# write the forbidden graph patterns to a JSON file, then read out the entire data
# and generate the two TDGS again (to ensure both datasets are indeed stored properly):
for gN in [graphN1, graphN2, graphN3, graphN4]:
    gN.writeTDGdataToJSONfile(fName)

# display the forbidden graph patterns:
visualizations.displaySVGtable({
    'gN1': ['images/graphN1.svg'],
    'gN2': ['images/graphN2.svg'],
    'gN3': ['images/graphN3.svg'],
    'gN4': ['images/graphN4.svg'],
}, width="100%")
# render the forbidden relation data:
fpData = datatypes.renderTDGdataFromJSONfile(fName)

Saux = z3.Solver()

fpList = [datatypes.Z3TypedDirectedGraph(Saux, x['gName'], x['vDict'], x['eDict'], x['stDict']) for key, x in fpData.items()]

frClist = []

for gN in fpList:
    if fpData[gN.name].get('frData', {}) == {}:
        # if the forbidden relation data is not yet present in fpData, generate it:
        frC = datatypes.Z3TDGfrContainer(Saux, gN)
        frC.genTDGfrData()
        frClist += [frC]

        # store the forbidden relation data to the fpData (e.g. for use in the next runs):
        fpData[gN.name]['frData'] = frC.frData

    else:
        # if the forbidden relation data is present, use it to instantiate the frContainer:
        frClist += [datatypes.Z3TDGfrContainer(Saux, gN, fpData[gN.name]['frData'])]

# for buffering the computationally expensive forbidden relation computations, save all the fpData
# to the JSON file with name expSpec.forbiddenPatternDataFname:

with open(fName + '.json', 'w') as f:
    json.dump(fpData, f, sort_keys=True, indent=2)

Part 3: Specification data for the experiments

Input graph and forbidden pattern JSON file names:

graphDataFname = 'data/GCM2020-graph-data'
forbiddenPatternDataFname = 'data/GCM2020-forbidden-pattern-data'

The “dry runs” for determining the number of overlaps without constraints:

experimentName = 'Experiments P1, P2 and P3 without constraints'
experimentsData = [('gA-P1', 'gB-P1'), ('gA-P2', 'gB-P2'), ('gA-P3', 'gB-P3')]
strategyName = 'noConstraints'
expSpecDataFname = 'data/GCM2020-experiments-specification-no-constraints'

numberOfRepetitions = 1
maxRunTime = 0    # <- '0' is the key value to parse as no limitation on the run time!

expsDataNoConstraints = datatypes.ExperimentSpec(experimentName,
                                     graphDataFname,
                                     forbiddenPatternDataFname,
                                     strategyName,
                                     numberOfRepetitions,
                                     experimentsData,
                                     maxRunTime)

expsDataNoConstraints.storeExpsDataToJSONfile(expSpecDataFname)

The “direct” strategy runs:

experimentName = 'Experiments P1, P2 and P3 with \"direct\" strategy'
experimentsData = [('gA-P1', 'gB-P1'), ('gA-P2', 'gB-P2'), ('gA-P3', 'gB-P3')]
strategyName = 'direct'
expSpecDataFname = 'data/GCM2020-experiments-specification-direct-strategy'

numberOfRepetitions = 5
maxRunTime = 0    # <- no limit on maximal runtime

expsDataDirectStrategy = datatypes.ExperimentSpec(experimentName,
                                                  graphDataFname,
                                                  forbiddenPatternDataFname,
                                                  strategyName,
                                                  numberOfRepetitions,
                                                  experimentsData,
                                                  maxRunTime)

expsDataDirectStrategy.storeExpsDataToJSONfile(expSpecDataFname)

The “DPE” strategy runs :

experimentName = 'Experiments P1, P2, P3 and P4 with \"DPE\" strategy'
experimentsData = [('gA-P1', 'gB-P1'), ('gA-P2', 'gB-P2'), ('gA-P3', 'gB-P3'), ('gA-P4', 'gB-P4')]
strategyName = 'DPE'
expSpecDataFname = 'data/GCM2020-experiments-specification-DPE-strategy'

numberOfRepetitions = 5
maxRunTime = 0    # <- no limit on maximal runtime

expsDataDPEStrategy = datatypes.ExperimentSpec(experimentName, graphDataFname,
                                               forbiddenPatternDataFname,
                                               strategyName,
                                               numberOfRepetitions,
                                               experimentsData,
                                               maxRunTime)

expsDataDPEStrategy.storeExpsDataToJSONfile(expSpecDataFname)

The “implicit” strategy runs:

experimentName = 'Experiments P1, P2, P3 and P4 with \"implicit\" strategy'
experimentsData = [('gA-P1', 'gB-P1'), ('gA-P2', 'gB-P2'), ('gA-P3', 'gB-P3'), ('gA-P4', 'gB-P4')]
strategyName = 'implicit'
expSpecDataFname = 'data/GCM2020-experiments-specification-implicit-strategy'


numberOfRepetitions = 5
maxRunTime = 0    # <- no limit on maximal runtime

expsDataImplicitStrategy = datatypes.ExperimentSpec(experimentName,
                                                  graphDataFname,
                                                  forbiddenPatternDataFname,
                                                  strategyName,
                                                  numberOfRepetitions,
                                                  experimentsData,
                                                  maxRunTime)

expsDataImplicitStrategy.storeExpsDataToJSONfile(expSpecDataFname)

Gallery generated by Sphinx-Gallery