# Relational data 1 - introduction¶

## Download exercises zip¶

We live in a world of *relations* like John *is friend of* Paul or Mary *works at* CodeWizards. We can display them as networks or, as we call them in computer science, *graphs*. We will see some ways for storing graphs, like matrices, adjacency lists and a also have a quick look at a specialized library called Networkx. Note in this book we limit ourselves to represent and manage relations with relatively simple programs, we won’t deal with path exploration algorithms (no bfs, dfs etc) nor data
formats specific for graphs (like rdf) which can get quite complex.

### Required libraries¶

In order for visualizations to work, you need installed the python library `networkx`

and `pydot`

. Pydot is an interface to the non-pyhon package GraphViz.

**Anaconda:**

From Anaconda Prompt:

Install GraphViz:

```
conda install graphviz
```

Install python packages:

```
conda install pydot networkx
```

**Ubuntu**

From console:

Install PyGraphViz (note: you should use apt to install it, pip might give problems):

```
sudo apt-get install python3-pygraphviz
```

Install python packages:

```
python3 -m pip install --user pydot networkx
```

## What to do¶

unzip exercises in a folder, you should get something like this:

```
relational
relational1-intro.ipynb
relational1-intro-sol.ipynb
relational2-binrel.ipynb
relational2-binrel-sol.ipynb
relational3-simple-stats.ipynb
relational3-simple-stats-sol.ipynb
relational4-chal.ipynb
jupman.py
soft.py
```

**WARNING**: to correctly visualize the notebook, it MUST be in an unzipped folder !

open Jupyter Notebook from that folder. Two things should open, first a console and then browser. The browser should show a file list: navigate the list and open the notebook

`relational/relational1-intro.ipynb`

**WARNING 2**: DO NOT use the *Upload* button in Jupyter, instead navigate in Jupyter browser to the unzipped folder !

Go on reading that notebook, and follow instuctions inside.

Shortcut keys:

to execute Python code inside a Jupyter cell, press

`Control + Enter`

to execute Python code inside a Jupyter cell AND select next cell, press

`Shift + Enter`

to execute Python code inside a Jupyter cell AND a create a new cell aftwerwards, press

`Alt + Enter`

If the notebooks look stuck, try to select

`Kernel -> Restart`

### Graph definition¶

In computer science a *graph* is a set of verteces V (also called *nodes)* linked by a set of edges E. You can visualize nodes as circles and links as lines. If the graph is *undirected,* links are just lines, if the graph is *directed,* links are represented as arrows with a tip to show the direction:

Credits: slide by Dr Alberto Montresor

Credits: slide by Dr Alberto Montresor

For our purposes, we will consider directed graphs (also called *digraphs).*

Usually we will indicate nodes with numbers going from zero included but optionally they can be labelled. Since we are dealing with directed graphs, we can have an arrow going for example from node `1`

to node `2`

, but also another arrow going from node `2`

to node `1`

. Furthemore, a node (for example node `0`

) can have a *cap*, that is an edge going to itself:

### Edge weights¶

Optionally, we will sometimes assign a *weight* to the edges, that is a number to be shown over the edges. So we can modify the previous example. Note we can have an arrow going from node `1`

to node `2`

with a weight which is different from the weight arrow from `2`

to `1`

:

## Matrices¶

Here we will represent graphs as matrices, which performance-wise is particularly good when the matrix is *dense,* that is, has many entries different from zero. Otherwise, when you have a so-called *sparse* matrix (few non-zero entries), it is best to represent the graph with *adjacency list,* but we will deal with them later.

If you have a directed graph (digraph) with `n`

verteces, you can represent it as an `n x n`

matrix by considering each row as vertex:

A row at index

`i`

represents the outward links from node`i`

to the other`n`

nodes, with possibly node`i`

itself included.A value of zero means there is no link to a given node.

In general,

`mat[i][j]`

is the weight of the edge between node`i`

to node`j`

### Visualization examples¶

We defined a function `soft.draw_mat`

to display matrices as graphs (you don’t need to understand the internals, for now we won’t go into depth about matrix visualizations).

If it doesn’t work, see above Required libraries paragraph

```
[2]:
```

```
# PLEASE EXECUTE THIS CELL TO CHECK IF VISUALIZATION IS WORKING
# notice links with weight zero are not shown)
# all weights are set to 1
# first need to import this
from soft import draw_mat
mat = [
[1,1,0,1], # node 0 is linked to node 0 itself, node 1 and node 2
[0,0,1,1], # node 1 is linked to node 2 and node 3
[1,1,1,1], # node 2 is linked to node 0, node 1, node 2 itself and node 3
[0,1,0,1] # node 3 is linked to node 1 and node 3 itself
]
draw_mat(mat)
```

### Saving a graph to an image file¶

If you want (or if you are not using Jupyter), optionally you can save the graph to a `.png`

file by specifying the `save_to`

filepath:

```
[3]:
```

```
mat = [
[1,1],
[0,1]
]
draw_mat( mat, save_to='example.png')
```

```
Image saved to file: example.png
```

### Saving a graph to an dot file¶

You can also save a graph to the original *dot* language of GraphViz:

```
[4]:
```

```
mat = [
[1,1],
[0,1]
]
draw_mat( mat, save_to='example.dot')
```

```
Dot saved to file: example.dot
```

Note no visualization occurs, as you probably might need this kind of output when GraphViz is not installed in your system and you want to display the file elsewhere.

There are lots of websites that take `.dot`

and output images, for example GraphvizOnline

We output here the file content, try to copy/paste it in the above website:

```
[5]:
```

```
with open('example.dot') as f:
print(f.read())
```

```
digraph {
scale=3;
style="dotted, rounded";
node [color=blue, fontcolor=blue];
edge [arrowsize="0.6", splines=curved, fontcolor=brown];
0;
1;
0 -> 0 [weight=1, label=1];
0 -> 1 [weight=1, label=1];
1 -> 1 [weight=1, label=1];
}
```

### Minimal graph¶

With this representation derived from matrices as we intend them (that is with at least one row and one column), the corresponding minimal graph can have only one node:

```
[6]:
```

```
minimal = [
[0]
]
draw_mat(minimal)
```

If we set the weight different from zero, the zeroeth node will link to itself (here we put the weight 5 in the link):

```
[7]:
```

```
minimal = [
[5]
]
draw_mat(minimal)
```

### Graph with two nodes example¶

```
[8]:
```

```
m = [
[5,9], # node 0 links to node 0 itself with a weight of 5, and to node 1 with a weight of 9
[0,6], # node 1 links to node 1 with a weight of 6
]
draw_mat(m)
```

### Distance matrix¶

Depending on the problem at hand, it may be reasonable to change the weights. For example, on a road network the nodes could represent places and the weights could be the distances. If we assume it is possible to travel in both directions on all roads, we get a matrix symmetric along the diagonal, and we can call the matrix a *distance matrix*. Talking about the diagonal, for the special case of going from a place to itself, we set that street length to 0 (which make sense for street length but
could give troubles for other purposes, for example if we give the numbers the meaning ‘is connected’ a place should always be connected to itself)

```
[9]:
```

```
# distance matrix example
mat = [
[0,6,0,8], # place 0 is linked to place 1 and place 2
[6,0,9,7], # place 1 is linked to place 0, place 2 and place 3
[5,9,0,4], # place 2 is linked to place 0, place 1 and place 3
[8,7,4,0] # place 3 is linked to place 0, place 1 and place 2
]
draw_mat(mat)
```

More realistic traffic road network, where going in one direction might take actually longer than going back, because of one-way streets and different routing times.

```
[10]:
```

```
mat = [
[0,6,0,8], # place 0 is linked to place 1 and place 2
[9,0,9,7], # place 1 is linked to place 0, place 2 and place 3
[5,5,0,4], # place 2 is linked to place 0, place 1 and place 3
[7,9,8,0] # place 3 is linked to place 0, place 1, place 2
]
draw_mat(mat)
```

### Boolean matrix example¶

If we are not interested at all in the weights, we might use only zeroes and ones as we did before. But this could have implications when doing operations on matrices, so some times it is better to use only `True`

and `False`

```
[11]:
```

```
mat = [
[False, True, False],
[False, True, True],
[True, False, True],
]
draw_mat(mat)
```

### Matrix exercises¶

We are now ready to start implementing the following functions. Before even start implementation, for each try to interpret the matrix as a graph, drawing it on paper. When you’re done implementing try to use `draw_mat`

on the results. Notice that since `draw_mat`

is a generic display function and knows nothing about the nature of the graph, sometimes it will not show the graph in the optimal way we humans would use.

### Exercise - line¶

✪✪ This function is similar to `diag`

. As that one, you can implement it in two ways: you can use a double `for`

, or a single one (much more efficient). What would be the graph representation of `line`

?

RETURN a matrix as lists of lists where node `i`

must have an edge to node `i + 1`

with weight `1`

Last node points to nothing

`n`

must be`>= 1`

, otherwise raises`ValueError`

```
[12]:
```

```
def line(n):
raise Exception('TODO IMPLEMENT ME !')
assert line(1) == [ [0] ]
assert line(2) == [ [0,1],
[0,0] ]
assert line(3) == [ [0,1,0],
[0,0,1],
[0,0,0] ]
assert line(4) == [ [0,1,0,0],
[0,0,1,0],
[0,0,0,1],
[0,0,0,0] ]
draw_mat(line(4))
```

### Exercise - cross¶

✪✪ RETURN a nxn matrix filled with zeros except on the crossing lines.

`n`

must be >=1 and odd, otherwise a`ValueError`

is thrown

Example for `n=7`

:

```
0001000
0001000
0001000
1111111
0001000
0001000
0001000
```

Try to figure out how the resulting graph would look like (try to draw on paper, also notice that `draw_mat`

will probably not draw the best possible representation)

```
[13]:
```

```
def cross(n):
raise Exception('TODO IMPLEMENT ME !')
assert cross(1) == [
[1]
]
assert cross(3) == [ [0,1,0],
[1,1,1],
[0,1,0] ]
assert cross(5) == [ [0,0,1,0,0],
[0,0,1,0,0],
[1,1,1,1,1],
[0,0,1,0,0],
[0,0,1,0,0] ]
```

### union¶

When we talk about the *union* of two graphs, we intend the graph having union of verteces of both graphs and having as edges the union of edges of both graphs. In this exercise, we have two graphs as list of lists with boolean edges. To simplify we suppose they have the same vertices but possibly different edges, and we want to calculate the union as a new graph.

For example, if we have a graph `ma`

like this:

```
[14]:
```

```
ma = [
[True, False, False],
[False, True, False],
[True, False, False]
]
```

```
[15]:
```

```
draw_mat(ma)
```

And another `mb`

like this:

```
[16]:
```

```
mb = [
[True, True, False],
[False, False, True],
[False, True, False]
]
```

```
[17]:
```

```
draw_mat(mb)
```

The result of calling `union(ma, mb)`

will be the following:

```
[18]:
```

```
res = [[True, True, False], [False, True, True], [True, True, False]]
```

which will be displayed as

```
[19]:
```

```
draw_mat(res)
```

So we get same verteces and edges from both `ma`

and `mb`

### Exercise - union¶

✪✪ Takes two graphs represented as nxn matrices of lists of lists with boolean edges, and RETURN a NEW matrix which is the union of both graphs

if

`mata`

row number is different from`matb`

, raises`ValueError`

```
[20]:
```

```
def union(mata, matb):
raise Exception('TODO IMPLEMENT ME !')
try:
union([[False],[False]], [[False]])
raise Exception("Shouldn't arrive here !")
except ValueError:
"test passed"
try:
union([[False]], [[False],[False]])
raise Exception("Shouldn't arrive here !")
except ValueError:
"test passed"
ma1 = [ [False] ]
mb1 = [ [False] ]
assert union(ma1, mb1) == [ [False] ]
ma2 = [ [False] ]
mb2 = [ [True] ]
assert union(ma2, mb2) == [ [True] ]
ma3 = [ [True] ]
mb3 = [ [False] ]
assert union(ma3, mb3) == [ [True] ]
ma4 = [ [True] ]
mb4 = [ [True] ]
assert union(ma4, mb4) == [ [True] ]
ma5 = [ [False, False, False],
[False, False, False],
[False, False, False] ]
mb5 = [ [True, False, True],
[False, True, True],
[False, False, False] ]
assert union(ma5, mb5) == [ [True, False, True],
[False, True, True],
[False, False, False] ]
ma6 = [ [True, False, True],
[False, True, True],
[False, False, False] ]
mb6 = [ [False, False, False],
[False, False, False],
[False, False, False] ]
assert union(ma6, mb6) == [ [True, False, True],
[False, True, True],
[False, False, False] ]
ma7 = [ [True, False, False],
[False, True, False],
[True, False, False] ]
mb7 = [ [True, True, False],
[False, False, True],
[False, True, False] ]
assert union(ma7, mb7) == [ [True, True, False],
[False, True, True],
[True, True, False] ]
```

### Subgraphs¶

If we interpret a matrix as graph, we may wonder when a graph A is a subgraph of another graph B, that is, when A nodes are a subset of B nodes and when A edges are a subset of B edges. For convenience, here we only consider graphs having the same `n`

nodes both in A and B. Edges may instead vary. Graphs are represented as boolean matrices.

### Exercise - is_subgraph¶

✪✪ RETURN True is A is a subgraph of B, that is, some or all of its edges also belong to B. A and B are boolean matrices of size nxn.

If sizes don’t match, raises

`ValueError`

```
[21]:
```

```
def is_subgraph(mata, matb):
raise Exception('TODO IMPLEMENT ME !')
# the set of edges is empty
ma = [ [False] ]
# the set of edges is empty
mb = [ [False] ]
# an empty set is always a subset of an empty set
assert is_subgraph(ma, mb) == True
# the set of edges is empty
ma = [ [False] ]
# the set of edges contains one element
mb = [ [True] ]
# an empty set is always a subset of any set, so function gives True
assert is_subgraph(ma, mb) == True
ma = [ [True] ]
mb = [ [True] ]
assert is_subgraph(ma, mb) == True
ma = [ [True] ]
mb = [ [False] ]
assert is_subgraph(ma, mb) == False
ma = [ [True, False],
[True, False] ]
mb = [ [True, False],
[True, True] ]
assert is_subgraph(ma, mb) == True
ma = [ [False, False, True],
[True, True,True],
[True, False,True] ]
mb = [ [True, False, True],
[True, True,True],
[True, True,True] ]
assert is_subgraph(ma, mb) == True
```

### Exercise - remove_node¶

✪✪ Here the function text is not so precise, as it is talking about nodes but you have to operate on a matrix. Can you guess exactly what you have to do ? In your experiments, try to draw the matrix before and after executing `remove_node`

```
[22]:
```

```
def remove_node(mat, i):
""" MODIFIES mat by removing node i.
"""
raise Exception('TODO IMPLEMENT ME !')
m = [ [3,5,2,5],
[6,2,3,7],
[4,2,1,2],
[7,2,2,6] ]
remove_node(m,2)
assert len(m) == 3
for i in range(3):
assert len(m[i]) == 3
```

### Exercise - utriang¶

✪✪✪ You will try to create an upper triangular matrix of side `n`

. What could possibly be the graph interpretation of such a matrix? Since `draw_mat`

is a generic drawing function it doesn’t provide the best possible representation, try to draw on paper a more intuitive one.

RETURN a matrix of size `n`

x`n`

which is upper triangular, that is, has all nodes below the diagonal `0`

, while all the other nodes are set to `1`

```
[23]:
```

```
def utriang(n):
raise Exception('TODO IMPLEMENT ME !')
assert utriang(1) == [ [1] ]
assert utriang(2) == [ [1,1],
[0,1] ]
assert utriang(3) == [ [1,1,1],
[0,1,1],
[0,0,1] ]
assert utriang(4) == [ [1,1,1,1],
[0,1,1,1],
[0,0,1,1],
[0,0,0,1] ]
```

### Edge difference¶

The *edge difference* of two graphs `ediff(da,db)`

is a graph with the edges of the first except the edges of the second. For simplicity, here we consider only graphs having the same verteces but possibly different edges. This time we will try operate on graphs represented as dictionaries of adjacency lists.

For example, if we have

```
[24]:
```

```
da = {
'a':['a','c'],
'b':['b', 'c'],
'c':['b','c']
}
draw_adj(da)
```

and

```
[25]:
```

```
db = {
'a':['c'],
'b':['a','b', 'c'],
'c':['a']
}
draw_adj(db)
```

The result of calling `ediff(da,db)`

will be:

```
[26]:
```

```
res = {
'a':['a'],
'b':[],
'c':['b','c']
}
```

Which can be shown as

```
[27]:
```

```
draw_adj(res)
```

### Exercise - ediff¶

✪✪✪ Takes two graphs as dictionaries of adjacency lists da and db, and RETURN a NEW graph as dictionary of adjacency lists, containing the same vertices of da, and the edges of da except the edges of db.

As order of elements within the adjacency lists, use the same order as found in da.

We assume all verteces in da and db are represented in the keys (even if they have no outgoing edge), and that da and db have the same keys

Example:

```
da = { 'a':['a','c'],
'b':['b', 'c'],
'c':['b','c']
}
db = { 'a':['c'],
'b':['a','b', 'c'],
'c':['a']
}
assert ediff(da, db) == { 'a':['a'],
'b':[],
'c':['b','c']
}
```

```
[28]:
```

```
def ediff(da,db):
raise Exception('TODO IMPLEMENT ME !')
da1 = { 'a': [] }
db1 = { 'a': [] }
assert ediff(da1, db1) == { 'a': [] }
da2 = { 'a': [] }
db2 = { 'a': ['a'] }
assert ediff(da2, db2) == { 'a': [] }
da3 = { 'a': ['a'] }
db3 = { 'a': [] }
assert ediff(da3, db3) == { 'a': ['a'] }
da4 = { 'a': ['a'] }
db4 = { 'a': ['a'] }
assert ediff(da4, db4) == { 'a': [] }
da5 = { 'a':['b'],
'b':[] }
db5 = { 'a':['b'],
'b':[] }
assert ediff(da5, db5) == { 'a':[],
'b':[] }
da6 = { 'a':['b'],
'b':[] }
db6 = { 'a':[],
'b':[] }
assert ediff(da6, db6) == { 'a':['b'],
'b':[] }
da7 = { 'a':['a','b'],
'b':[] }
db7 = { 'a':['a'],
'b':[] }
assert ediff(da7, db7) == { 'a':['b'],
'b':[] }
da8 = { 'a':['a','b'],
'b':['a'] }
db8 = { 'a':['a'],
'b':['b'] }
assert ediff(da8, db8) == { 'a':['b'],
'b':['a'] }
da9 = { 'a':['a','c'],
'b':['b', 'c'],
'c':['b','c'] }
db9 = { 'a':['c'],
'b':['a','b', 'c'],
'c':['a'] }
assert ediff(da9, db9) == { 'a':['a'],
'b':[],
'c':['b','c'] }
```

### Exercise - pyramid¶

✪✪✪ The following function requires to create a matrix filled with non-zero numbers. Even if don’t know exactly the network meaning, with this fact we can conclude that all nodes are linked to all others. A graph where this happens is called a *clique* (the Italian name is *cricca)*

Takes an odd number `n >= 1`

and RETURN a matrix as list of lists containing numbers displaced like this example for a pyramid of square `7`

:

```
1111111
1222221
1233321
1234321
1233321
1222221
1111111
```

if

`n`

is even, raises`ValueError`

```
[29]:
```

```
def pyramid(n):
raise Exception('TODO IMPLEMENT ME !')
try:
pyramid(4)
raise Exception("SHOULD HAVE FAILED!")
except ValueError:
"passed test"
assert pyramid(1) == [ [1] ]
assert pyramid(3) == [ [1,1,1],
[1,2,1],
[1,1,1] ]
assert pyramid(5) == [ [1, 1, 1, 1, 1],
[1, 2, 2, 2, 1],
[1, 2, 3, 2, 1],
[1, 2, 2, 2, 1],
[1, 1, 1, 1, 1] ]
```

## Adjacency lists¶

So far, we represented graphs as matrices, saying they are good when the graph is dense, that is any given node is likely to be connected to almost all other nodes - or equivalently, many cell entries in the matrix are different from zero. But if this is not the case, other representations might be needed. For example, we can represent a graph as a *adjacency lists*.

Let’s look at this 6x6 boolean matrix:

```
[30]:
```

```
m = [
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[True, False, False, True, False, False],
[False, False, False, False, False, False],
[False, False, False, False, False, False],
[False, False, True, False, False, False]
]
```

We see just a few `True`

, so by drawing it we don’t expect to see many edges:

```
[31]:
```

```
draw_mat(m)
```

As a more compact representation, we might represent the data as a dictionary of *adjacency lists* where the keys are the node indexes and the to each node we associate a list with the target nodes it points to.

To reproduce the example above, we can write like this:

```
[32]:
```

```
d = {
0: [], # node 0 links to nothing
1: [], # node 1 links to nothing
2: [0,3], # node 2 links to node 0 and 3
3: [], # node 3 links to nothing
4: [], # node 4 links to nothing
5: [2] # node 5 links to node 2
}
```

In `soft.py`

, we provide also a function `soft.draw_adj`

to quickly inspect such data structure:

```
[33]:
```

```
from soft import draw_adj
draw_adj(d)
```

As expected, the resulting graph is the same as for the equivalent matrix representation.

### Exercise - mat_to_adj¶

✪✪ Implement a function that takes a boolean nxn matrix and RETURN the equivalent representation as dictionary of adjacency lists. Remember that to create an empty dict you have to write `dict()`

```
[34]:
```

```
def mat_to_adj(bool_mat):
raise Exception('TODO IMPLEMENT ME !')
m1 = [ [False] ]
d1 = { 0:[] }
assert mat_to_adj(m1) == d1
m2 = [ [True] ]
d2 = { 0:[0] }
assert mat_to_adj(m2) == d2
m3 = [ [False,False],
[False,False] ]
d3 = { 0:[],
1:[] }
assert mat_to_adj(m3) == d3
m4 = [ [True,True],
[True,True] ]
d4 = { 0:[0,1],
1:[0,1] }
assert mat_to_adj(m4) == d4
m5 = [ [False,False],
[False,True] ]
d5 = { 0:[],
1:[1] }
assert mat_to_adj(m5) == d5
m6 = [ [True,False,False],
[True, True,False],
[False,True,False] ]
d6 = { 0:[0],
1:[0,1],
2:[1] }
assert mat_to_adj(m6) == d6
```

### Exercise - mat_ids_to_adj¶

✪✪ Implement a function that takes a boolean nxn matrix and a list of immutable identifiers for the nodes, and RETURN the equivalent representation as dictionary of adjacency lists.

If matrix is not

`n`

x`n`

or`ids`

length does not match`n`

, raise`ValueError`

```
[35]:
```

```
def mat_ids_to_adj(bool_mat, ids):
raise Exception('TODO IMPLEMENT ME !')
try:
mat_ids_to_adj([[False, True]], ['a','b'])
raise Exception("SHOULD HAVE FAILED !")
except ValueError:
"passed test"
try:
mat_ids_to_adj([[False]], ['a','b'])
raise Exception("SHOULD HAVE FAILED !")
except ValueError:
"passed test"
m1 = [ [False] ]
d1 = { 'a':[] }
assert mat_ids_to_adj(m1, ['a']) == d1
m2 = [ [True] ]
d2 = { 'a':['a'] }
assert mat_ids_to_adj(m2, ['a']) == d2
m3 = [ [False,False],
[False,False] ]
d3 = { 'a':[],
'b':[]
}
assert mat_ids_to_adj(m3,['a','b']) == d3
m4 = [ [True,True],
[True,True] ]
d4 = { 'a':['a','b'],
'b':['a','b']
}
assert mat_ids_to_adj(m4, ['a','b']) == d4
m5 = [ [False,False],
[False,True] ]
d5 = { 'a':[],
'b':['b']
}
assert mat_ids_to_adj(m5,['a','b']) == d5
m6 = [ [True,False,False],
[True, True,False],
[False,True,False] ]
d6 = { 'a':['a'],
'b':['a','b'],
'c':['b'] }
assert mat_ids_to_adj(m6,['a','b','c']) == d6
```

### Exercise - adj_to_mat¶

Try now conversion from dictionary of adjacency list to matrix (this is a bit hard).

To solve this, the general idea is that you have to fill an nxn matrix to return. During the filling of a cell at row `i`

and column `j`

, you have to decide whether to put a `True`

or a `False`

. You should put `True`

if in the `d`

list value corresponding to the `i-th`

key, there is contained a number equal to `j`

. Otherwise, you should put `False`

.

If you look at the tests, as inputs we are passing `OrderedDict`

. The reason is that when we check the output matrix of your function, we want to be sure the matrix rows are ordered in a certain way.

But you have to assume d can contain arbitrary ids with no precise ordering, so:

first you should scan the dictionary and lists to save the mapping between indexes to ids in a separate list

**NOTE**: `d.keys()`

is not exactly a list (does not allow access by index), so you must convert to list with this: `list(d.keys())`

then you should build the matrix to return, using the previously built list when needed.

✪✪✪ Now implement a function that takes a dictionary of adjacency lists with arbitrary ids and RETURN its representation as an nxn boolean matrix

assume all nodes are present as keys

assume

`d`

is a simple dictionary (not necessarily an`OrderedDict`

)

```
[36]:
```

```
def adj_to_mat(d):
raise Exception('TODO IMPLEMENT ME !')
from collections import OrderedDict
od1 = OrderedDict([ ('a',[]) ])
m1 = [ [False] ]
assert adj_to_mat(od1) == m1
od2 = OrderedDict([ ('a',['a']) ])
m2 = [ [True] ]
assert adj_to_mat(od2) == m2
od3 = OrderedDict([ ('a',['a','b']),
('b',['a','b']) ])
m3 = [ [True, True],
[True, True] ]
assert adj_to_mat(od3) == m3
od4 = OrderedDict([ ('a',[]),
('b',[]) ])
m4 = [ [False, False],
[False, False] ]
assert adj_to_mat(od4) == m4
od5 = OrderedDict([ ('a',['a']),
('b',['a','b']) ])
m5 = [ [True, False],
[True, True] ]
assert adj_to_mat(od5) == m5
od6 = OrderedDict([ ('a',['a','c']),
('b',['c']),
('c',['a','b']) ])
m6 = [ [True, False, True],
[False, False, True],
[True, True, False] ]
assert adj_to_mat(od6) == m6
```

### Exercise - table_to_adj¶

Suppose you have a table expressed as a list of lists with headers like this:

```
[37]:
```

```
m0 = [
['Identifier','Price','Quantity'],
['a',1,1],
['b',5,8],
['c',2,6],
['d',8,5],
['e',7,3]
]
```

where `a`

, `b`

, `c`

etc are the row identifiers (imagine they represent items in a store), `Price`

and `Quantity`

are properties they might have. **NOTE**: here we put two properties, but they might have `n`

properties !

We want to transform such table into a graph-like format as a dictionary of lists, which relates store items as keys to the properties they might have. To include in the list both the property identifier and its value, we will use tuples. So you need to write a function that transforms the above input into this:

```
[38]:
```

```
res0 = {
'a':[('Price',1),('Quantity',1)],
'b':[('Price',5),('Quantity',8)],
'c':[('Price',2),('Quantity',6)],
'd':[('Price',8),('Quantity',5)],
'e':[('Price',7),('Quantity',3)]
}
```

```
[39]:
```

```
def table_to_adj(table):
raise Exception('TODO IMPLEMENT ME !')
m0 = [ ['I','P','Q'] ]
res0 = {}
assert res0 == table_to_adj(m0)
m1 = [
['Identifier','Price','Quantity'],
['a',1,1],
['b',5,8],
['c',2,6],
['d',8,5],
['e',7,3]
]
res1 = {
'a':[('Price',1),('Quantity',1)],
'b':[('Price',5),('Quantity',8)],
'c':[('Price',2),('Quantity',6)],
'd':[('Price',8),('Quantity',5)],
'e':[('Price',7),('Quantity',3)]
}
assert res1 == table_to_adj(m1)
m2 = [
['I','P','Q'],
['a','x','y'],
['b','w','z'],
['c','z','x'],
['d','w','w'],
['e','y','x']
]
res2 = {
'a':[('P','x'),('Q','y')],
'b':[('P','w'),('Q','z')],
'c':[('P','z'),('Q','x')],
'd':[('P','w'),('Q','w')],
'e':[('P','y'),('Q','x')]
}
assert res2 == table_to_adj(m2)
m3 = [
['I','P','Q', 'R'],
['a','x','y', 'x'],
['b','z','x', 'y'],
]
res3 = {
'a':[('P','x'),('Q','y'), ('R','x')],
'b':[('P','z'),('Q','x'), ('R','y')]
}
assert res3 == table_to_adj(m3)
```

## Networkx¶

**Before continuing, make sure to have installed the** required libraries

Networkx is a library to perform statistics on networks. For now, it will offer us a richer data structure where we can store the properties we want in nodes and also edges.

You can initialize networkx objects with the dictionary of adjacency lists we’ve alredy seen:

```
[40]:
```

```
import networkx as nx
# notice with networkx if nodes are already referenced to in an adjacency list
# you do not need to put them as keys:
G=nx.DiGraph({
'a':['b','c'], # node a links to b and c
'b':['b','c', 'd'] # node b links to b itself, c and d
})
```

The resulting object is not a simple dict, but something more complex:

```
[41]:
```

```
G
```

```
[41]:
```

```
<networkx.classes.digraph.DiGraph at 0x7f5e09983f10>
```

To display it in a way uniform with the rest of the course, we developed a function called `soft.draw_nx`

:

```
[42]:
```

```
from soft import draw_nx
```

```
[43]:
```

```
draw_nx(G)
```

From the picture above, we notice there are no weights displayed, because in networkx they are just considered optional attributes of edges.

To see all the attributes of an edge, you can write like this:

```
[44]:
```

```
G['a']['b']
```

```
[44]:
```

```
{}
```

This graph has no attributes for the node, so we get back an empty dict. If we wanted to add a weight of `123`

to that particular `a b`

edge, you could write like this:

```
[45]:
```

```
G['a']['b']['weight'] = 123
```

```
[46]:
```

```
G['a']['b']
```

```
[46]:
```

```
{'weight': 123}
```

Let’s try to display it:

```
[47]:
```

```
draw_nx(G)
```

We still don’t see the weight as weight can be one of many properties: the only thing that gets displayed is the propery `label`

. So let’s set label equal to the weight:

```
[48]:
```

```
G['a']['b']['label'] = 123
```

```
[49]:
```

```
draw_nx(G)
```

### Fancy networkx graphs¶

With networkx we can set additional attributes to embellish the resulting graph, here we show a bus network example.

```
[50]:
```

```
G = nx.DiGraph()
# we can force horizontal layout like this:
G.graph['graph']= {
'rankdir':'LR',
}
# When we add nodes, we can identify them with an identifier like the
# stop_id which is separate from the label, for example in some unfortunate
# case two different stops can share the same label.
G.add_node('1', label='Trento',
color='orange', fontcolor='black')
G.add_node('723', label='Rovereto',
color='black', fontcolor='black')
G.add_node('870', label='Arco',
color='black', fontcolor='black')
G.add_node('1180', label='Riva',
color='black', fontcolor='blue')
# IMPORTANT: edges connect stop_ids , NOT labels !!!!
G.add_edge('870','1')
G.add_edge('723','1')
G.add_edge('1','1180')
# we can retrieve an edge like this:
edge = G['1']['1180']
# and set attributes, like these:
edge['weight'] = 5 # the actual weight (not shown!)
edge['label'] = str(5) # the label is a string
edge['color'] = '#2ca02c' # we can set some style for the edge, such as color
edge['penwidth']= 4 # and thickness
edge['route_short_name'] = 'B301' # we can add any attribute we want,
# Note these custom ones won't show in the graph
legend = [ {'label': 'B211', 'color': '#2ca02c'} ]
draw_nx(G, legend)
```

### Converting networkx graphs¶

If you try to just output the string representation of the graph, networkx will give the empty string:

```
[51]:
```

```
print(G)
```

```
```

```
[52]:
```

```
str(G)
```

```
[52]:
```

```
''
```

```
[53]:
```

```
repr(G)
```

```
[53]:
```

```
'<networkx.classes.digraph.DiGraph object at 0x7f5e09a61b10>'
```

To convert to the dict of adjacency lists we know, you can use this method:

```
[54]:
```

```
nx.to_dict_of_lists(G)
```

```
[54]:
```

```
{'1': ['1180'], '723': ['1'], '870': ['1'], '1180': []}
```

The above works, but it doesn’t convert additional edge info. For a complete conversion, use `nx.to_dict_of_dicts`

```
[55]:
```

```
nx.to_dict_of_dicts(G)
```

```
[55]:
```

```
{'1': {'1180': {'weight': 5,
'label': '5',
'color': '#2ca02c',
'penwidth': 4,
'route_short_name': 'B301'}},
'723': {'1': {}},
'870': {'1': {}},
'1180': {}}
```

### Exercise - mat_to_nx¶

✪✪ Now try by yourself to convert a matrix as list of lists along with node ids (like you did before) into a networkx object.

This time, don’t create a dictionary to pass it to `nx.DiGraph`

constructor: instead, use networkx methods like `.add_edge`

and `add_node`

. For usage example, check the networkx tutorial.

**QUESTION**: Do you need to explicitly call `add_node`

before referring to some node with `add_edge`

?

Implement a function that given a real-valued \(n\) x \(n\) matrix as list of lists, and a list of immutable identifiers for the nodes, RETURNS the corresponding graph in networkx format (as `nx.DiGraph`

).

If matrix is not

`n`

x`n`

or`ids`

length does not match`n`

, raise`ValueError`

**DO NOT** transform into a dict, use instead `add_`

methods from networkx object!

**WARNING**: remember to set the weights labels AS STRINGS!

```
[56]:
```

```
def mat_to_nx(mat, ids):
raise Exception('TODO IMPLEMENT ME !')
try:
mat_ids_to_adj([[0, 3]], ['a','b'])
raise Exception("SHOULD HAVE FAILED !")
except ValueError:
"passed test"
try:
mat_ids_to_adj([[0]], ['a','b'])
raise Exception("SHOULD HAVE FAILED !")
except ValueError:
"passed test"
m1 = [ [0] ]
d1 = {'a': {}}
assert nx.to_dict_of_dicts(mat_to_nx(m1, ['a'])) == d1
m2 = [ [7] ]
d2 = {'a': {'a': {'weight': 7, 'label': '7'}}}
assert nx.to_dict_of_dicts(mat_to_nx(m2, ['a'])) == d2
m3 = [ [0,0],
[0,0] ]
d3 = { 'a':{},
'b':{}
}
assert nx.to_dict_of_dicts(mat_to_nx(m3,['a','b'])) == d3
m4 = [ [7,9],
[8,6] ]
d4 = { 'a':{'a': {'weight':7,'label':'7'},
'b' : {'weight':9,'label':'9'},
},
'b':{'a': {'weight':8,'label':'8'},
'b' : {'weight':6,'label':'6'},
}
}
assert nx.to_dict_of_dicts(mat_to_nx(m4, ['a','b'])) == d4
m5 = [ [0,0],
[0,7] ]
d5 = { 'a':{},
'b':{
'b' : {'weight':7,'label':'7'},
}
}
assert nx.to_dict_of_dicts(mat_to_nx(m5,['a','b'])) == d5
m6 = [ [7,0,0],
[7,9,0],
[0,7,0] ]
d6 = { 'a':{
'a' : {'weight':7,'label':'7'},
},
'b': {
'a': {'weight':7,'label':'7'},
'b' : {'weight':9,'label':'9'}
},
'c':{
'b' : {'weight':7,'label':'7'}
}
}
assert nx.to_dict_of_dicts(mat_to_nx(m6,['a','b','c'])) == d6
```

```
[57]:
```

```
draw_nx(mat_to_nx([
[7,0,0],
[7,9,0],
[0,7,0]
], ['a','b','c']))
```

## Continue¶

Go on with binary relations