ILPIso

News

Introduction

ILPIso is a software for substitution-tolerant subgraph isomorphism. It aims at finding assignments between a pattern graph and a target graph. Every vertex/edge of the pattern graph is matched to a different vertex/node in the target graph, but differences between labels are penalized by a cost function. The matching is done so that the global assignment cost is minimal.

The software can be configured in order to search for several instances of the matching, and return them by non-decreasing cost order. Several settings are available to discard solutions previously found. When searching for a new solution:

  1. Only the preceding solutions are cut. Any other solution can occur, even if it contains edge/vertex matchings present in the preceding solutions
  2. Any vertex matching present in the preceding solutions are cut, but the new solution can have a matching involving a vertex/edge which has been matched in previous solutions
  3. Any vertex matched in the preceding solutions is cut. None of the matched vertex/edge can occur in the new solutions

On-line demo

Try it with demo graph files

For this demo, default files are provided for the pattern graph, the target graph and the feature weights. You can choose the following options :

ArchitecturalA.gml

f16-06_12.gml

floorplan.fw

Cut option - Multiple solutions search :




Use your own graphs

For this demo, please provide:

File formats are described below

Cut option - Multiple solutions search :




File format

Graph file format

The pattern and target graphs are text files in a format adapted from the GML format, initially defined by the FIM of the University of Passau, so that nodes and vertices can support multidimensionnal feature vector.

The structure of the file is the following:

ILPIso processes directed graphs.

graph [
 directed 1
 node list
 edge list
]

The node list is a sequence of node descriptions. Each node is represented according to the following structure:

node [
 id node_id
 feature list
]

The id value is a unique integer from 0 to node_number -1 in the increasing order.

feature list is a sequence of feature_name-feature_value pairs, where the feature_name is an identifier (a string beginning with [a-zA-Z] and containing letters, digits and underscores). The feature_value is a string representing a floating point number (eg. -145.5037 or 1.0437e-8).

It is important that feature list use the same feature_name set for all nodes. Moreover, the pattern and the target graph must use the same feature set.

The edge list is a sequence of edge descriptions. Each edge is represented according the following structure:

edge [
 source node_id
 target node_id
 feature list
]

Graph example

graph [
  directed 1
  node [
  id 0
  f0 -0.312318 f1 0.0314013 f2 0.00871597 f3 0.0136246 f4 1.95503e-05
 ]
 node [
  id 1
  f0 -0.316006 f1 0.306046 f2 0.00584333 f3 0.0059654 f4 3.6781e-05
 ]
 node [
  id 2
  f0 -0.3183 f1 0.305775 f2 0.00937711 f3 0.00860679 f4 7.40576e-06
 ]
 node [
  id 3
  f0 -0.310182 f1 0.284027 f2 0.000380925 f3 0.000564455 f4 1.55743e-07
 ]
 edge [
  source 0
  target 1
  area 0.913046 distCdg 0.54849
 ]
 edge [
  source 1
  target 0
  area 0.407857 distCdg 0.54849
 ]
 edge [
  source 0
  target 2
  area 0.915613 distCdg 0.546574
 ]
 edge [
  source 2
  target 0
  area 0.402062 distCdg 0.546574
 ]
 edge [
  source 1
  target 3
  area 0.679713 distCdg 1.12293
 ]
 edge [
  source 3
  target 1
  area 0.733478 distCdg 1.12293
 ]
 edge [
  source 2
  target 3
  area 0.673432 distCdg 1.11904
 ]
 edge [
  source 3
  target 2
  area 0.739249 distCdg 1.11904
 ]
 edge [
  source 0
  target 3
  area 0.900807 distCdg 0.584536
 ]
 edge [
  source 3
  target 0
  area 0.43422 distCdg 0.584536
 ]
]

Feature weights

The graph format described above allows that vertices and edges can support multidimensionnal feature vectors. The cost for matching two feature vectors u and v is the weighted L2-Norm cost where the w defines the weights associated to each dimension. The feature weight file allows to give a different weight to each feature.

The feature weight file is structured according to

nodes_features_weights
node feature weight list

edges_features_weights
edges feature weight list

Each feature weigth list is a feature_name-feature_weight sequence (one line for each). For each feature_name, the feature_weight is a string representing a floating point number.

It is important that the feature weight file contain a value for each feature present in the graph files.

Feature weight example

Here is an example of the feature weight file associated to the graph example given above.
nodes_features_weights
f0 1
f1 2
f2 0.5
f3 0.6
f4 1

edges_features_weights
area 2.0
distCdg 1.0

Download

Binaries

Linux 64 bits

Download and uncompress the archive ILPIso_linux64.tar.gz. Once done, you just need to execute the ILPIso located in the ILPIso directory as explained in the Usage section. The ILPIso directory also contains two pattern-target graph pairs and the associated feature weight file.

Usage

ILPIso PATTERN_FILE TARGET_FILE FEATURE_WEIGHT_FILE CUT_OPTION NB_INSTANCE

Output documentation

For each solution found, the software outputs information on the solving (wallclock time, user time, number of analyzed nodes in the solution tree...).

It also outputs the total solution cost, which is the sum of the costs of vertex/edge mappings.

Finally, it outputs the detailed mapping of the solution. For example, the following output line means that the vertex whose id is 0 in the pattern graph has been matched to the vertex whose id is 54 in the target graph. This mapping costs 0.976.

x_0,54      =  1   0.976

Similarly, the following output line means that the edge from the vertex whose id is 0 to the edge whose id is 1 in the pattern graph is matched to the edge from the vertex whose id is 54 to the edge whose id is 58 in the target graph. This mapping costs 1.102.

y_0-1,54-58 = 1  1.102

Datasets

Synthetic datasets

We provide four synthetic datasets, two of them have been used in the experimental evaluation of ILPIso published in [Le Bodic et al. 2012].

Each of these data files contains a triplet pattern_file, target_file, groundtruth_file. The syntax of these files is :

where :

The groundtruth_file gives the mapping between pattern vertices and target vertices.

In these datasets vertices and edges are labelled with a unique real number randomly drawned in [-100, 100] with a uniform distribution.

In the ILPIso_exact_synth dataset, there is an exact mapping between corresponding vertices/edge (i.e. perfect equality of labels).

In the ILPIso_noisy_synth dataset, labels of mapping vertices/edges in the target graph have been disturbed by the addition of gaussian noise (m=0, σ=5).

For these two datasets :

Two additional datasets are provided in which graphs are connected. Due to this restriction, the values for ep is constrained by pvn.

In the ILPIso_exact_connected_synth dataset, there is an exact mapping between corresponding vertices/edge (i.e. perfect equality of labels).

In the ILPIso_noisy_connected_synth dataset, labels of mapping vertices/edges in the target graph have been disturbed by the addition of gaussian noise (m=0, σ=5).

Real dataset

The ILPIso_real dataset contains 16 pattern graphs, 200 target graphs and, for each, a groundtruth files. Pattern graphs represent graphical symbols and target graphs represent architectural floorplans from the SESYD dataset (the 20 first occurrences for each kind).

The procedure used to build graphs from plan and symbol images is described in [Le Bodic et al. 2012]. Basically, a vertex is associated to each white connected component and an edge is created between vertices representing adjacent white regions. Features attached to vertices are mainly shape descriptors (location, dimension, area, main orientation, 24 first Zernike moments). Labels attached to edges are the area ratio, the distance between centers of white regions, angle...

A groundtruth file is associated to each target graph. It gives the name of the pattern graph and the node ids involved, for each occurrence of the symbol present on the floorplan.

The feature weight file used for this dataset in the experiment described in [Le Bodic et al. 2012] is available here.

Reference

ILPIso is a software developed by Pierre Le Bodic, Pierre Héroux and Sébastien Adam.

It implements the approach described in

P. Le Bodic, P. Héroux, S. Adam and Y. Lecourtier, An integer linear program for substitution-tolerant subgraph isomorphism and its use for symbol spotting in technical drawings, in Pattern Recognition, Vol. 45 (12), pp. 4214-4224, December 2012. doi = "10.1016/j.patcog.2012.05.022" link
@article{LeBodic2012,
   title = "An integer linear program for substitution-tolerant subgraph isomorphism and its use for symbol spotting in technical drawings",
   author = "Pierre Le Bodic and Pierre H\'eroux and S\'ebastien Adam and Yves Lecourtier",
   journal = "Pattern Recognition",
   volume = "45",
   number = "12",
   pages = "4214 - 4224",
   year = "2012",
   note = "",
   issn = "0031-3203",
   doi = "10.1016/j.patcog.2012.05.022",
   url = "http://www.sciencedirect.com/science/article/pii/S0031320312002683",
}

Contact

Please report bugs and suggestions to Pierre Héroux