# Visualization solutions

## Introduction

We will review the famous library Matplotlib, which can display a wide variety of charts and is the basis of many other visualization libraries.

### What to do

Unzip the exercises into a folder. You should get something like this:

```
visualization
visualization.ipynb
visualization-sol.ipynb
jupman.py
soft.py
```

**WARNING**: to correctly visualize the notebook, it MUST be in an unzipped folder !

Open Jupyter Notebook from that folder. Two things should open: first a console, then a browser. The browser should show a file list: navigate the list and open the notebook

`visualization/visualization.ipynb`

**WARNING 2**: DO NOT use the *Upload* button in Jupyter; instead, navigate in the Jupyter browser to the unzipped folder!

Go on reading that notebook, and follow the instructions inside.

Shortcut keys:

- To execute Python code inside a Jupyter cell, press `Control + Enter`
- To execute Python code inside a Jupyter cell AND select the next cell, press `Shift + Enter`
- To execute Python code inside a Jupyter cell AND create a new cell afterwards, press `Alt + Enter`
- If the notebook looks stuck, try selecting `Kernel -> Restart`

## First example

Let’s start with a very simple plot:

```
[2]:
```

```
# this is *not* a python command, it is a Jupyter-specific magic command,
# to tell jupyter we want the graphs displayed in the cell outputs
%matplotlib inline
# imports matplotlib
import matplotlib.pyplot as plt
# we can give coordinates as simple lists of numbers
# these are pairs for the function y = 2 * x
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8,10,12]
plt.plot(xs, ys)
# these can be added after the plot call, the order doesn't matter
plt.title("my function")
plt.xlabel('x')
plt.ylabel('y')
# prevents showing '<matplotlib.text.Text at 0x7fbcf3c4ff28>' in Jupyter
plt.show()
```

### Plot style

To change the way the line is displayed, you can set dot styles with another string parameter. For example, to display red dots, you would add the string `ro`, where `r` stands for red and `o` stands for dot.

```
[3]:
```

```
%matplotlib inline
import matplotlib.pyplot as plt
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8,10,12]
plt.plot(xs, ys, 'ro') # NOW USING RED DOTS
plt.title("my function")
plt.xlabel('x')
plt.ylabel('y')
plt.show()
```
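The format string can also combine a color with other markers or with a line style. The following is a minimal sketch, assuming the standard Matplotlib format codes (`g` green, `b` blue, `^` triangle marker, `--` dashed line); the `%matplotlib inline` magic is omitted so the snippet also runs outside Jupyter:

```python
import matplotlib.pyplot as plt

xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 12]

# 'g^' = green triangle markers, no connecting line
triangles = plt.plot(xs, ys, 'g^')
# 'b--' = blue dashed line, no markers
dashed = plt.plot(xs, [y + 2 for y in ys], 'b--')

plt.title("format strings")
plt.xlabel('x')
plt.ylabel('y')
plt.show()
```

Each `plt.plot` call returns a list with the line objects it created, which can be handy if you want to inspect or restyle them later.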

### x power 2 exercise

Try to display the function `y = x**2` (x to the power of 2) using green dots, for integer xs going from -10 to 10.

```
[4]:
```

```
# write here the solution
```

```
[5]:
```

```
```

### Axis limits

If you want to change the range shown on the x axis, you can use `plt.xlim` (and `plt.ylim` for the y axis):

```
[6]:
```

```
%matplotlib inline
import matplotlib.pyplot as plt
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8,10,12]
plt.plot(xs, ys, 'ro')
plt.title("my function")
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(-5, 10) # SETS LOWER X DISPLAY TO -5 AND UPPER TO 10
plt.ylim(-7, 26) # SETS LOWER Y DISPLAY TO -7 AND UPPER TO 26
plt.show()
```

### Axis size

```
[7]:
```

```
%matplotlib inline
import matplotlib.pyplot as plt
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8,10,12]
fig = plt.figure(figsize=(10,3)) # width: 10 inches, height 3 inches
plt.plot(xs, ys, 'ro')
plt.title("my function")
plt.xlabel('x')
plt.ylabel('y')
plt.show()
```

### Changing tick labels

You can also change the labels displayed on axis ticks with the `plt.xticks` and `plt.yticks` functions:

**Note:** instead of `xticks` you might directly use categorical variables IF you have matplotlib >= 2.1.0

Here we use `xticks` as sometimes you might need to fiddle with them anyway

```
[8]:
```

```
%matplotlib inline
import matplotlib.pyplot as plt
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8,10,12]
plt.plot(xs, ys, 'ro')
plt.title("my function")
plt.xlabel('x')
plt.ylabel('y')
# FIRST NEEDS A SEQUENCE WITH THE POSITIONS, THEN A SEQUENCE OF SAME LENGTH WITH LABELS
plt.xticks(xs, ['a', 'b', 'c', 'd', 'e', 'f'])
plt.show()
```
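For comparison, here is a small sketch of the categorical alternative mentioned in the note, assuming matplotlib >= 2.1.0: the strings are passed directly as x values and no `xticks` call is needed.

```python
import matplotlib.pyplot as plt

labels = ['a', 'b', 'c', 'd', 'e', 'f']
ys = [2, 4, 6, 8, 10, 12]

# strings given as x values are treated as categories:
# each one gets its own tick, in order of appearance
points = plt.plot(labels, ys, 'ro')
plt.title("my function")
plt.xlabel('x')
plt.ylabel('y')
plt.show()
```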

## Introducing numpy

For functions involving reals, vanilla Python starts showing its limits and it's better to switch to the numpy library. Matplotlib can easily handle both vanilla Python sequences like lists and numpy arrays. Let’s see an example without numpy and one with it.

### Example without numpy

If we only use *vanilla* Python (that is, Python without extra libraries like numpy), to display the function `y = 2x + 1` we can come up with a solution like this

```
[9]:
```

```
%matplotlib inline
import matplotlib.pyplot as plt
xs = [x*0.1 for x in range(10)] # notice we can't do a range with float increments
# (and it would also introduce rounding errors)
ys = [(x * 2) + 1 for x in xs]
plt.plot(xs, ys, 'bo')
plt.title("y = 2x + 1 with vanilla python")
plt.xlabel('x')
plt.ylabel('y')
plt.show()
```

### Example with numpy

With numpy, we have at our disposal several new methods for dealing with arrays.

First we can generate an interval of values with one of these methods.

Since Python `range` does not allow float increments, we can use `np.arange`:

```
[10]:
```

```
import numpy as np
xs = np.arange(0,1.0,0.1)
xs
```

```
[10]:
```

```
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
```

Equivalently, we could use `np.linspace`:

```
[11]:
```

```
xs = np.linspace(0,0.9,10)
xs
```

```
[11]:
```

```
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
```
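The two functions differ in how they treat the right end and in what their third argument means. The sketch below puts them side by side; the `endpoint=False` call is an extra illustration that is not used elsewhere in this notebook.

```python
import numpy as np

# arange: start, stop (*excluded*), step
a = np.arange(0, 1.0, 0.1)

# linspace: start, stop (*included*), number of values
b = np.linspace(0, 0.9, 10)

# linspace can also exclude the right end, which mimics arange
c = np.linspace(0, 1.0, 10, endpoint=False)

print(len(a), len(b), len(c))
print(np.allclose(a, b), np.allclose(a, c))
```

Since `linspace` takes the number of values explicitly, it is usually the safer choice when you need an exact count of points with a float step.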

Numpy allows us to easily write functions on arrays in a natural manner. For example, to calculate `ys` we can now do this:

```
[12]:
```

```
ys = 2*xs + 1
ys
```

```
[12]:
```

```
array([1. , 1.2, 1.4, 1.6, 1.8, 2. , 2.2, 2.4, 2.6, 2.8])
```
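The same element-by-element behavior applies to other operators and to numpy's own math functions. As a rough sketch (the choice of `**` and `np.cos` here is just for illustration), compare the numpy expressions with their vanilla Python equivalents:

```python
import math
import numpy as np

xs = np.linspace(0, 0.9, 10)

# arithmetic operators and numpy functions act on every element at once
squares = xs ** 2
cosines = np.cos(xs)

# the same results with vanilla Python need an explicit loop
squares_vanilla = [x ** 2 for x in xs]
cosines_vanilla = [math.cos(x) for x in xs]

print(np.allclose(squares, squares_vanilla))
print(np.allclose(cosines, cosines_vanilla))
```

Note that `math.cos` only accepts a single number, while `np.cos` accepts a whole array.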

Let’s put everything together:

```
[13]:
```

```
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
xs = np.linspace(0,0.9,10) # left end: 0 *included* right end: 0.9 *included* number of values: 10
ys = 2*xs + 1
plt.plot(xs, ys, 'bo')
plt.title("y = 2x + 1 with numpy")
plt.xlabel('x')
plt.ylabel('y')
plt.show()
```

### y = sin(x) + 3 exercise

✪✪✪ Try to display the function `y = sin(x) + 3` for x at pi/4 intervals, starting from 0. Use exactly 8 ticks.

**NOTE**: 8 is the *number of x ticks* (telecom people would use the term ‘samples’), **NOT** the x of the last tick !!

a) Try to solve it without using numpy. For pi, use the constant `math.pi` (first you need to import the `math` module)

b) Try to solve it with numpy. For pi, use the constant `np.pi` (which is exactly the same as `math.pi`)

b.1) solve it with `np.arange`

b.2) solve it with `np.linspace`

c) For each tick, use the label sequence `"0π/4", "1π/4" , "2π/4", "3π/4" , "4π/4", "5π/4", ....`. Obviously writing them by hand is easy; try instead to devise a method that works for any number of ticks. What is changing in the sequence? What is constant? What is the type of the part that changes? What is the final type of the labels you want to obtain?

d) If you are in the mood, try to display them better, like 0, π/4, π/2, 3π/4, π, 5π/4, possibly using Latex (requires some search, this example might be a starting point)

**NOTE**: Latex often involves the usage of the `\` backslash, like in `\frac{2,3}`. If we use it directly, Python will interpret `\f` as a special character and will not send to the Latex processor the string we meant:

```
[14]:
```

```
'\frac{2,3}'
```

```
[14]:
```

```
'\x0crac{2,3}'
```

One solution would be to double the backslashes, like this:

```
[15]:
```

```
'\\frac{2,3}'
```

```
[15]:
```

```
'\\frac{2,3}'
```

An even better one is to prepend the string with the `r` character, which allows writing backslashes only once:

```
[16]:
```

```
r'\frac{2,3}'
```

```
[16]:
```

```
'\\frac{2,3}'
```
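To see the difference concretely, and how a raw string can then be handed to Matplotlib, here is a minimal sketch. It assumes Matplotlib's built-in math rendering, which treats text between dollar signs as a formula; the title text itself is just a made-up example.

```python
import matplotlib.pyplot as plt

plain = '\frac{2,3}'    # '\f' collapses into a single form-feed character
raw = r'\frac{2,3}'     # raw string: the backslash is kept as written
print(len(plain), len(raw))

xs = [0, 1, 2, 3]
ys = [0, 1, 2, 3]
plt.plot(xs, ys, 'bo')
# text between dollar signs is rendered as math by Matplotlib
plt.title(r'a fraction: $\frac{\pi}{4}$')
plt.show()
```

The two strings differ in length by one character, which is a quick way to check whether an escape sequence was interpreted.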

```
[17]:
```

```
# write here solution for a) y = sin(x) + 3 with vanilla python
```

```
[18]:
```

```
```

```
[19]:
```

```
# write here solution b.1) y = sin(x) + 3 with numpy, arange
```

```
[20]:
```

```
```

```
[21]:
```

```
# write here solution b.2) y = sin(x) + 3 with numpy, linspace
```

```
[22]:
```

```
```

```
[23]:
```

```
# write here solution c) y = sin(x) + 3 with numpy and pi xlabels
```

```
[24]:
```

```
```

### Showing degrees per node

Going back to the indegrees and outdegrees as seen in the Graph formats - Simple statistics paragraph, we will try to study the distributions visually.

Let’s take an example networkx DiGraph:

```
[25]:
```

```
import networkx as nx
G1=nx.DiGraph({
'a':['b','c'],
'b':['b','c', 'd'],
'c':['a','b','d'],
'd':['b', 'd']
})
draw_nx(G1)
```

### indegree per node

✪✪ Display a plot for graph `G1` where the xtick labels are the nodes, and the y is the indegree of those nodes.

**Note:** instead of `xticks` you might directly use categorical variables IF you have matplotlib >= 2.1.0

Here we use `xticks` as sometimes you might need to fiddle with them anyway

To get the nodes, you can use the `G1.nodes()` function:

```
[26]:
```

```
G1.nodes()
```

```
[26]:
```

```
NodeView(('d', 'a', 'b', 'c'))
```

It gives back a `NodeView`, which is not a list, but you can still iterate through it with a `for in` loop:

```
[27]:
```

```
for n in G1.nodes():
    print(n)
```

```
d
a
b
c
```

Also, you can get the indegree of a node with

```
[28]:
```

```
G1.in_degree('b')
```

```
[28]:
```

```
4
```
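As a side note, `in_degree` can also be called without arguments. The sketch below assumes networkx 2.x or later, where the call yields (node, indegree) pairs that can be collected into a dictionary:

```python
import networkx as nx

G1 = nx.DiGraph({
    'a': ['b', 'c'],
    'b': ['b', 'c', 'd'],
    'c': ['a', 'b', 'd'],
    'd': ['b', 'd']
})

# with no argument, in_degree() yields (node, indegree) pairs
indegrees = dict(G1.in_degree())
print(indegrees)
```

The self-loops on `b` and `d` each count as one incoming edge for their node.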

```
[29]:
```

```
# write here the solution
```

```
[30]:
```

```
```

## Bar plots

The previous plot with dots doesn’t look so good; we might try to use a bar plot instead. First look at this example, then proceed with the next exercise.

```
[31]:
```

```
import numpy as np
import matplotlib.pyplot as plt
xs = [1,2,3,4]
ys = [7,5,8,2 ]
plt.bar(xs, ys,
        0.5,             # the width of the bars
        color='green',   # someone suggested the default blue color is depressing, so let's put green
        align='center')  # bars are centered on the xtick
plt.show()
```

### indegree per node bar plot

✪✪ Display a bar plot for graph `G1` where the xtick labels are the nodes, and the y is the indegree of those nodes.

```
[32]:
```

```
# write here
```

```
[33]:
```

```
```

### indegree per node sorted alphabetically

✪✪ Display the same bar plot as before, but now sort nodes alphabetically.

NOTE: you cannot run the `.sort()` method on the result given by `G1.nodes()`, because nodes in a network by default have no inherent order. To use `.sort()` you first need to convert the result to a `list` object.
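The conversion can be sketched as follows. The built-in `sorted` function is shown as an alternative that is not mentioned above; it accepts any iterable and returns a new list.

```python
import networkx as nx

G1 = nx.DiGraph({
    'a': ['b', 'c'],
    'b': ['b', 'c', 'd'],
    'c': ['a', 'b', 'd'],
    'd': ['b', 'd']
})

# convert the NodeView to a list, then sort it in place
nodes = list(G1.nodes())
nodes.sort()
print(nodes)

# alternative: sorted() builds the sorted list in one step
nodes_again = sorted(G1.nodes())
print(nodes_again)
```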

```
[34]:
```

```
```

```
[35]:
```

```
# write here
```

### indegree per node sorted

✪✪✪ Display the same bar plot as before, but now sort nodes according to their indegree. This is more challenging: to do it you need to use some sort trick. First read the Python documentation and then:

- create a list of couples (list of tuples) where each tuple is the node identifier and the corresponding indegree
- sort the list by using the second value of the tuples as a key.
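The sort trick can be illustrated on unrelated data. In this sketch the `stock` list is made up for the example; the `key` parameter of `sorted` receives a function that extracts the value to sort by from each tuple.

```python
# hypothetical data: (name, count) tuples
stock = [('pears', 7), ('apples', 2), ('kiwis', 5)]

# sort by the second value of each tuple
by_count = sorted(stock, key=lambda pair: pair[1])
print(by_count)

# reverse=True gives descending order
by_count_desc = sorted(stock, key=lambda pair: pair[1], reverse=True)
print(by_count_desc)

# split the sorted tuples back into two parallel lists, ready for plotting
names = [name for name, count in by_count]
counts = [count for name, count in by_count]
print(names, counts)
```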

```
[36]:
```

```
# write here
```

```
[37]:
```

```
```

### out degrees per node sorted

✪✪✪ Draw the same plot as before for the outdegrees.

You can get the outdegree of a node with:

```
[38]:
```

```
G1.out_degree('b')
```

```
[38]:
```

```
3
```

```
[39]:
```

```
```

```
[40]:
```

```
# write here
```

### degrees per node

✪✪✪ We might check as well the sorted degrees per node, intended as the sum of in_degree and out_degree. To get the sum, use the `G1.degree(node)` function.

```
[41]:
```

```
# write here the solution
```

```
[42]:
```

```
```

✪✪✪✪ **EXERCISE**: Look at this example, and make a double bar chart sorting nodes by their *total* degree. To do so, in the tuples you will need `vertex`, `in_degree`, `out_degree` and also `degree`.
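As a generic starting point, a double bar chart can be obtained by drawing two series of bars shifted slightly left and right of each tick. This is only a sketch with made-up data (`labels`, `first` and `second` are hypothetical), not the linked example:

```python
import matplotlib.pyplot as plt

labels = ['x', 'y', 'z']   # hypothetical categories
first = [3, 5, 2]          # hypothetical values, first series
second = [4, 1, 6]         # hypothetical values, second series

width = 0.4
positions = list(range(len(labels)))
# shift each series by half a bar width around the tick position
left = [p - width / 2 for p in positions]
right = [p + width / 2 for p in positions]

plt.bar(left, first, width, color='green', label='first')
plt.bar(right, second, width, color='orange', label='second')
plt.xticks(positions, labels)
plt.legend()
plt.show()
```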

```
[43]:
```

```
# write here
```

```
[44]:
```

```
```

## Frequency histogram

Now let’s try to draw degree frequencies, that is, for each degree present in the graph we want to display a bar as high as the number of times that particular degree appears.

For doing so, we will need a Matplotlib histogram, see documentation

We will need to tell matplotlib how many columns we want, which in histogram terms are called *bins*. We also need to give the histogram a series of numbers so it can count how many times each number occurs. Let’s consider this graph `G2`:

```
[45]:
```

```
import networkx as nx
G2=nx.DiGraph({
'a':['b','c'],
'b':['b','c', 'd'],
'c':['a','b','d'],
'd':['b', 'd','e'],
'e':[],
'f':['c','d','e'],
'g':['e','g']
})
draw_nx(G2)
```

If we take the degree sequence of `G2` we get this:

```
[46]:
```

```
degrees_G2 = [G2.degree(n) for n in G2.nodes()]
degrees_G2
```

```
[46]:
```

```
[7, 3, 7, 3, 3, 3, 6]
```

We see 3 appears four times, 6 once, and seven twice.

Let’s try to determine a good number for the bins. First we can check the boundaries our x axis should have:

```
[47]:
```

```
min(degrees_G2)
```

```
[47]:
```

```
3
```

```
[48]:
```

```
max(degrees_G2)
```

```
[48]:
```

```
7
```

So our histogram on the x axis must go at least from 3 and at least to 7. If we want integer columns (bins), we will need ticks going from 3 included to 7 included, so at least ticks for 3, 4, 5, 6, 7. To get a precise display when we have integer x, it is best to also manually provide the sequence of bin edges, remembering it should start from the minimum *included* (in our case, 3) and arrive at the maximum + 1 *included* (in our case, 7 + 1 = 8)

**NOTE**: precise histogram drawing can be quite tricky, please do read this StackOverflow post for more details about it.

```
[49]:
```

```
import matplotlib.pyplot as plt
import numpy as np
degrees_G2 = [G2.degree(n) for n in G2.nodes()]
# add histogram
# in this case hist returns a tuple of three values
# which we put in three variables
n, bins, columns = plt.hist(degrees_G2,
                            bins=range(3,9), # 3 *included* , 4, 5, 6, 7, 8 *included*
                            width=1.0)       # graphical width of the bars
plt.xlabel('Degrees')
plt.ylabel('Frequency counts')
plt.title('G2 Degree distribution')
plt.xlim(0, max(degrees_G2) + 2)
plt.show()
```

As expected we see 3 is counted four times, 6 once, and seven twice.
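The counts can also be checked numerically, without drawing anything. The sketch below uses `np.histogram`, which applies the same binning rules: every bin includes its left edge and excludes its right edge, except the last bin which includes both. The second call shows what happens if the edges stop at the maximum instead of the maximum + 1.

```python
import numpy as np

degrees_G2 = [7, 3, 7, 3, 3, 3, 6]

# edges 3,4,5,6,7,8 -> bins [3,4) [4,5) [5,6) [6,7) [7,8]
counts, edges = np.histogram(degrees_G2, bins=list(range(3, 9)))
print(counts)
print(edges)

# edges 3,4,5,6,7 -> the last bin is [6,7], so 6 and 7 are merged
short_counts, short_edges = np.histogram(degrees_G2, bins=list(range(3, 8)))
print(short_counts)
```

With the shorter edge sequence the degrees 6 and 7 end up in the same column, which is why the edges should reach the maximum + 1.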

✪✪✪ **EXERCISE**: Still, it would be visually better to align the x ticks to the middle of the bars with `xticks`, and also to make the graph tighter by setting the `xlim` appropriately. This is not always easy to do.

Read carefully this StackOverflow post and try to do it by yourself.

**NOTE**: set *one thing at a time* and check whether it works (i.e. first xticks and then xlim); doing everything at once might get quite confusing

```
[50]:
```

```
# write here the solution
```

```
[51]:
```

```
```

## Showing plots side by side

You can display plots on a grid. Each cell in the grid is identified by only one number. For example, for a grid of two rows and three columns, you would have cells indexed like this:

```
1 2 3
4 5 6
```

```
[52]:
```

```
%matplotlib inline
import matplotlib.pyplot as plt
import math
xs = [1,2,3,4,5,6]
# cells:
# 1 2 3
# 4 5 6
plt.subplot(2, # 2 rows
3, # 3 columns
1) # plotting in first cell
ys1 = [x**3 for x in xs]
plt.plot(xs, ys1)
plt.title('first cell')
plt.subplot(2, # 2 rows
3, # 3 columns
2) # plotting in second cell
ys2 = [2*x + 1 for x in xs]
plt.plot(xs,ys2)
plt.title('2nd cell')
plt.subplot(2, # 2 rows
3, # 3 columns
3) # plotting in third cell
ys3 = [-2*x + 1 for x in xs]
plt.plot(xs,ys3)
plt.title('3rd cell')
plt.subplot(2, # 2 rows
3, # 3 columns
4) # plotting in fourth cell
ys4 = [-2*x**2 for x in xs]
plt.plot(xs,ys4)
plt.title('4th cell')
plt.subplot(2, # 2 rows
3, # 3 columns
5) # plotting in fifth cell
ys5 = [math.sin(x) for x in xs]
plt.plot(xs,ys5)
plt.title('5th cell')
plt.subplot(2, # 2 rows
3, # 3 columns
6) # plotting in sixth cell
ys6 = [-math.cos(x) for x in xs]
plt.plot(xs,ys6)
plt.title('6th cell')
plt.show()
```
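When the cells follow a regular pattern, the repetition can be reduced. The following sketch uses `plt.subplots` (note the final *s*), which creates the whole grid at once and returns the figure plus an array of axes; it is an alternative to the `plt.subplot` calls above, and the `functions` list is just a convenience introduced for this example.

```python
import math
import matplotlib.pyplot as plt

xs = [1, 2, 3, 4, 5, 6]

# (title, function) pairs, one per cell, in row-by-row order
functions = [
    ('x**3', lambda x: x**3),
    ('2*x + 1', lambda x: 2*x + 1),
    ('-2*x + 1', lambda x: -2*x + 1),
    ('-2*x**2', lambda x: -2*x**2),
    ('sin(x)', math.sin),
    ('-cos(x)', lambda x: -math.cos(x)),
]

# axes is a 2x3 numpy array; axes.flat visits the cells row by row
fig, axes = plt.subplots(2, 3, figsize=(10, 5))
for ax, (name, f) in zip(axes.flat, functions):
    ax.plot(xs, [f(x) for x in xs])
    ax.set_title(name)

fig.tight_layout()   # avoids titles overlapping the plots above them
plt.show()
```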

### Graph models

Let’s study frequencies of some known network types.

#### Erdős–Rényi model

✪✪ A simple graph model we can think of is the so-called Erdős–Rényi model: it is an *undirected* graph where we have `n` nodes, and each node is connected to each other node with probability `p`. In networkx, we can generate a random one by issuing this command:

```
[53]:
```

```
G = nx.erdos_renyi_graph(10, 0.5)
```

In the drawing, the absence of arrows confirms it is undirected:

```
[54]:
```

```
draw_nx(G)
```
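Before plotting anything, it can help to look at the raw degree sequence of such a graph. This is a rough sketch: the values of `n` and `p` are arbitrary, and the `seed` argument is only there to make the run reproducible.

```python
import networkx as nx

n, p = 200, 0.5                          # arbitrary values for this sketch
G = nx.erdos_renyi_graph(n, p, seed=0)   # seed chosen only for reproducibility

# undirected graph: each node has a single degree, no in/out distinction
degrees = [G.degree(v) for v in G.nodes()]
mean_degree = sum(degrees) / n

print(G.is_directed())
print(min(degrees), max(degrees), mean_degree)
# each node has n - 1 possible neighbors, each present with probability p
print(p * (n - 1))
```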

Try plotting the degree distribution for different values of `p` (0.1, 0.5, 0.9) with a fixed `n=1000`, putting them side by side on the same row. What does their distribution look like? Where are they centered?

To avoid rewriting the same code again and again, define a `plot_erdos(n,p,j)` function to be called three times.

```
[55]:
```

```
# write here the solution
```

```
[56]:
```

```
```

```
Erdős–Rényi degree distribution SOLUTION
```

## Other plots

Matplotlib can display pretty much any plot you might like. Here we collect some we use in the course; for others, see the extensive Matplotlib documentation

### Pie chart

```
[57]:
```

```
%matplotlib inline
import matplotlib.pyplot as plt
labels = ['Oranges', 'Apples', 'Cucumbers']
fracs = [14, 23, 5] # how much for each sector, note doesn't need to add up to 100
plt.pie(fracs, labels=labels, autopct='%1.1f%%', shadow=True)
plt.title("Super strict vegan diet (good luck)")
plt.show()
```