ILPIso is a software for substitution-tolerant subgraph isomorphism. It aims at finding assignments between a pattern graph and a target graph. Every vertex/edge of the pattern graph is matched to a different vertex/node in the target graph, but differences between labels are penalized by a cost function. The matching is done so that the global assignment cost is minimal.
The software can be configured in order to search for several instances of the matching, and return them by non-decreasing cost order. Several settings are available to discard solutions previously found. When searching for a new solution:
For this demo, default files are provided for the pattern graph, the target graph and the feature weights. You can choose the following options :
For this demo, please provide:
File formats are described below
The pattern and target graphs are text files in a format adapted from the GML format, initially defined by the FIM of the University of Passau, so that nodes and vertices can support multidimensionnal feature vector.
The structure of the file is the following:
ILPIso processes directed graphs.
The node list is a sequence of node descriptions. Each node is represented according to the following structure:
The id
value is a unique integer from 0 to node_number
-1 in the increasing order.
feature list is a sequence of feature_name-feature_value pairs, where the feature_name is an identifier (a string beginning with [a-zA-Z] and containing letters, digits and underscores). The feature_value is a string representing a floating point number (eg. -145.5037 or 1.0437e-8).
It is important that feature list use the same feature_name set for all nodes. Moreover, the pattern and the target graph must use the same feature set.
The edge list is a sequence of edge descriptions. Each edge is represented according the following structure:
The graph format described above allows that vertices and edges can support multidimensionnal feature vectors. The cost for matching two feature vectors and is the weighted L2-Norm where the defines the weights associated to each dimension. The feature weight file allows to give a different weight to each feature.
The feature weight file is structured according to
Each feature weigth list is a feature_name-feature_weight sequence (one line for each). For each feature_name, the feature_weight is a string representing a floating point number.
It is important that the feature weight file contain a value for each feature present in the graph files.
Download and uncompress the archive ILPIso_linux64.tar.gz. Once done, you
just need to execute the ILPIso
located in the ILPIso
directory as explained in the Usage section. The ILPIso
directory also contains two pattern-target graph pairs and the
associated feature weight file.
PATTERN_FILE, TARGET_FILE,
FEATURE_WEIGHT_FILE
are mandatory.CUT_OPTION
is an integer used for
multiple search:
NB_INSTANCE
is the number of solutions
to be searchedFor each solution found, the software outputs information on the solving (wallclock time, user time, number of analyzed nodes in the solution tree...).
It also outputs the total solution cost, which is the sum of the costs of vertex/edge mappings.
Finally, it outputs the detailed mapping of the solution. For example, the following output line means that the vertex whose id is 0 in the pattern graph has been matched to the vertex whose id is 54 in the target graph. This mapping costs 0.976.
Similarly, the following output line means that the edge from the vertex whose id is 0 to the edge whose id is 1 in the pattern graph is matched to the edge from the vertex whose id is 54 to the edge whose id is 58 in the target graph. This mapping costs 1.102.
We provide four synthetic datasets, two of them have been used in the experimental evaluation of ILPIso published in [Le Bodic et al. 2012].
Each of these data files contains a triplet pattern_file, target_file, groundtruth_file. The syntax of these files is :
g_ep_pvn_tvn_in_pattern.gml
g_ep_pvn_tvn_in_target.gml
g_ep_pvn_tvn_in.gt
ep
stands for edge probability. It gives
the probability that a directed edge exists between two distinct
vertices.pvn
is the number of vertices in the
pattern graph.tvn
is the number of vertices in the
target graph.in
is an instance number. Five instances
are given for each ep-pvn-tvn
combination.The groundtruth_file gives the mapping between pattern vertices and target vertices.
In these datasets vertices and edges are labelled with a unique real number randomly drawned in [-100, 100] with a uniform distribution.
In the ILPIso_exact_synth
dataset, there is
an exact mapping between corresponding vertices/edge (i.e. perfect
equality of labels).
In the ILPIso_noisy_synth
dataset, labels
of mapping vertices/edges in the target graph have been disturbed by the
addition of gaussian noise (m=0, σ=5).
For these two datasets :
ep
∈ {0.01, 0.05, 0.1}.pvn
∈{10, 25, 50}.tvn
∈{50, 100, 250, 500}.Two additional datasets are provided in which graphs are connected. Due to this restriction, the
values for ep
is constrained by pvn
.
pvn
= 10 then ep
∈ {0.1}.pvn
= 20 then ep
∈ {0.05, 0.1}.pvn
= 50 then ep
∈ {0.02, 0.05, 0.1}.In the ILPIso_exact_connected_synth
dataset, there is an exact mapping between corresponding vertices/edge (i.e. perfect
equality of labels).
In the ILPIso_noisy_connected_synth
dataset, labels of mapping vertices/edges in the target graph have been disturbed by the
addition of gaussian noise (m=0, σ=5).
The ILPIso_real
dataset contains 16
pattern graphs, 200 target graphs and, for each, a groundtruth files.
Pattern graphs represent graphical symbols and target graphs represent
architectural floorplans from the SESYD
dataset (the 20 first occurrences for each kind).
The procedure used to build graphs from plan and symbol images is described in [Le Bodic et al. 2012]. Basically, a vertex is associated to each white connected component and an edge is created between vertices representing adjacent white regions. Features attached to vertices are mainly shape descriptors (location, dimension, area, main orientation, 24 first Zernike moments). Labels attached to edges are the area ratio, the distance between centers of white regions, angle...
A groundtruth file is associated to each target graph. It gives the name of the pattern graph and the node ids involved, for each occurrence of the symbol present on the floorplan.
The feature weight file used for this dataset in the experiment described in [Le Bodic et al. 2012] is available here.
ILPIso is a software developed by Pierre Le Bodic, Pierre Héroux and Sébastien Adam.
It implements the approach described in
Please report bugs and suggestions to Pierre Héroux