Leiden algorithm

The Leiden algorithm is a community detection algorithm developed by Traag et al ^[1] at Leiden University. It was developed as a modification of the Louvain method to address the issues with disconnected communities.

Quality[edit]

Similar to modularity, the quality function is used to assess how well the communities have been allocated. The Leiden algorithm uses the Constant Potts Model (CPM):^[2]

${\begin{aligned}{\mathcal {H}}(G,{\mathcal {P}})&=\sum _{C\in {\mathcal {P}}}|E(C,C)|-\gamma {\binom {||C||}{2}}\end{aligned}}$

Algorithm[edit]

The Leiden algorithm starts from a singleton partition **(a)**. The algorithm moves individual nodes from one community to another to find a partition **(b)**, which is then refined **(c)**. An aggregate network **(d)** is created based on the refined partition, using the non-refined partition to create an initial partition for the aggregate network. For example, the red community in **(b)** is refined into two subcommunities in **(c)**, which after aggregation become two separate nodes in **(d)**, both belonging to the same community. The algorithm then moves individual nodes in the aggregate network **(e)**. In this case, refinement does not change the partition **(f)**. These steps are repeated until no further improvements can be made.

The Leiden algorithm is similar to that of the Louvain method, with some important modifications.

Step 1: First, each node in the network is assigned to its own community.

Step 2: Next, we decide which communities to move the nodes into and update the partition ${\mathcal {P}}$ .

queue = V(G) # create a queue from the nodes

while queue != empty:
  node = queue.next() # get the next node
  delta_H = 0
  for C in communities: # compute the change in quality for each community
    if delta_H(node, C) > delta_H:
      delta_H = delta_H(node, C)
      community = C
  if delta_H > 0:
    move node to community
    outside_nodes = { node_i | (node, node_i) are edges and node_i is not in community } # find the nodes which are connected to the node but not in the community
    queue.add(outside_nodes not already in queue)

Step 3: Assign each node in the graph to its own community in a new partition called ${\mathcal {P}}_{\text{refined}}$ .

Step 4: The goal of this step is to separate poorly-connected communities:

for C in communities of P:
  # find the nodes in the community which have lots of edges within the community
  well_connected_nodes = { node | node is in C, |E(node, C\node)| >= gamma ||node||(||C|| - ||node||) }
  for node in well_connected_nodes:
    if node is singleton under P_refined:
      well_connected_communities = { C_i | C_i is in P_refined, C_i is a subset of C, |E(C_i, C\C_i)| >= gamma*||C_i||(||C|| - ||C_i||)
      for C_i in well_connected_communities:
        compute probability P(C_i) # 0 if assignment of node to C_i decreases quality of P_refined, greater weights for greater quality increases
      assign node to C_i by sampling P(C_i) distribution

Step 5: Use the refined partition ${\mathcal {P}}_{\text{refined}}$ to aggregate the graph. Each community in ${\mathcal {P}}_{\text{refined}}$ becomes a node in the new graph $G_{\text{agg}}$ .

Example: Suppose that we have:

${\begin{aligned}V&=\{v_{1},v_{2},v_{3},v_{4},v_{5},v_{6},v_{7}\}\\C_{1}&=\{v_{1},v_{2},v_{3},v_{4}\}\\C_{2}&=\{v_{5},v_{6},v_{7}\}\\{\mathcal {P}}&=\{C_{1},C_{2}\}\\{\mathcal {P}}_{\text{refined}}&=\{C_{1a},C_{1b},C_{2}\}\end{aligned}}$

Then our new set of nodes will be:

${\begin{aligned}V_{agg}&=\{C_{1a}\mapsto w_{1a},C_{1b}\mapsto w_{1b},C_{2}\mapsto w_{2}\}\end{aligned}}$

Step 6: Update the partition ${\mathcal {P}}$ using the aggregated graph. We keep the communities from partition ${\mathcal {P}}$ , but the communities can be separated into multiple nodes from the refined partition ${\mathcal {P}}_{\text{refined}}$ :

${\begin{aligned}{\mathcal {P}}&=\{\{v~|~v\subseteq C,v\in V(G_{\text{agg}})\}~|~C\in {\mathcal {P}}\}\end{aligned}}$

Example: Suppose that $C$ is a poorly-connected community from the partition ${\mathcal {P}}$ :

${\begin{aligned}C&=\{v_{1},v_{2},v_{3},v_{4},v_{5}\}\\{\mathcal {P}}&=\{C\}\end{aligned}}$

Then suppose during the refinement step, it was separated into two communities, $C_{1}$ and $C_{2}$ :

${\begin{aligned}C_{1}&=\{v_{1},v_{2},v_{3}\}\\C_{2}&=\{v_{4},v_{5}\}\\{\mathcal {P}}_{\text{refined}}&=\{C_{1},C_{2}\}\end{aligned}}$

When we aggregate the graph, the new nodes will be:

${\begin{aligned}V(G_{\text{agg}})&=\{C_{1},C_{2}\}\end{aligned}}$

but we will keep the old partition:

${\begin{aligned}{\mathcal {P}}&=\{\{C_{1},C_{2}\}\}\end{aligned}}$

Step 7: Repeat Steps 2 - 6 until each community consists of only one node.

References[edit]

^ Traag, Vincent A; Waltman, Ludo; van Eck, Nees Jan (26 March 2019). "From Louvain to Leiden: guaranteeing well-connected communities". Scientific Reports. 9 (1): 5233. arXiv:1810.08473. Bibcode:2019NatSR...9.5233T. doi:10.1038/s41598-019-41695-z. PMC 6435756. PMID 30914743.
^ Traag, Vincent A; Van Dooren, Paul; Nesterov, Yurii (29 July 2011). "Narrow scope for resolution-limit-free community detection". Physical Review E. 84 (1): 016114. arXiv:1104.3083. Bibcode:2011PhRvE..84a6114T. doi:10.1103/PhysRevE.84.016114. PMID 21867264.

[Leiden-1] Traag, Vincent A; Waltman, Ludo; van Eck, Nees Jan (26 March 2019). "From Louvain to Leiden: guaranteeing well-connected communities". Scientific Reports. 9 (1): 5233. arXiv:1810.08473. Bibcode:2019NatSR...9.5233T. doi:10.1038/s41598-019-41695-z. PMC 6435756. PMID 30914743.

[CPM-2] Traag, Vincent A; Van Dooren, Paul; Nesterov, Yurii (29 July 2011). "Narrow scope for resolution-limit-free community detection". Physical Review E. 84 (1): 016114. arXiv:1104.3083. Bibcode:2011PhRvE..84a6114T. doi:10.1103/PhysRevE.84.016114. PMID 21867264.

[1]

[2]