In this tutorial, you learn how to use different levels of the message passing API with PageRank on a small graph. In DGL, the message passing and feature transformations are user-defined functions (UDFs).
The PageRank algorithm
In each iteration of PageRank, every node (web page) first scatters its PageRank value uniformly to its downstream nodes. The new PageRank value of each node is computed by aggregating the received PageRank values from its neighbors, which is then adjusted by the damping factor:

$$PV(u) = \frac{1-d}{N} + d \times \sum_{v \in \mathcal{N}(u)} \frac{PV(v)}{D(v)}$$

where $N$ is the number of nodes in the graph; $D(v)$ is the out-degree of a node $v$; and $\mathcal{N}(u)$ is the set of neighbor nodes that link to $u$.
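To make the update rule concrete, here is a tiny worked example on a hypothetical 3-node graph (not part of the tutorial's code):

# Hypothetical toy graph: nodes 1 and 2 both link to node 0,
# with out-degrees 2 and 1, and damping factor d = 0.85.
n, d = 3, 0.85
pv = [1 / n] * n                                    # initial values: 1/N each
pv0_new = (1 - d) / n + d * (pv[1] / 2 + pv[2] / 1)
print(pv0_new)                                      # 0.475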
A naive implementation
Create a graph with 100 nodes by using networkx and then convert it to a DGLGraph.
import networkx as nx
import matplotlib.pyplot as plt
import torch
import dgl

N = 100      # number of nodes
DAMP = 0.85  # damping factor
K = 10       # number of iterations

g = nx.erdos_renyi_graph(N, 0.1)
g = dgl.DGLGraph(g)
nx.draw(g.to_networkx(), node_size=50, node_color=[[.5, .5, .5,]])
plt.show()
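If you want to confirm the conversion worked, a quick check (not part of the original tutorial) is to print the graph's size:

# Optional sanity check: node and edge counts of the converted graph.
print(g.number_of_nodes(), g.number_of_edges())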

According to the algorithm, PageRank consists of two phases in a typical scatter-gather pattern. Initialize the PageRank value of each node to $1/N$ and then store each node's out-degree as a node feature.
g.ndata['pv'] = torch.ones(N) / N
g.ndata['deg'] = g.out_degrees(g.nodes()).float()
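Since the initial values form a probability distribution over the nodes, they should sum to one. An optional check (not in the original tutorial):

# The PageRank values form a distribution over nodes and should sum to 1.
assert torch.allclose(g.ndata['pv'].sum(), torch.tensor(1.0))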
Define the message function, which divides every node's PageRank value by its out-degree and passes the result as a message to its neighbors.
def pagerank_message_func(edges):
    return {'pv' : edges.src['pv'] / edges.src['deg']}
In DGL, the message functions are expressed as Edge UDFs. An Edge UDF takes in a single argument, edges, which has three members: src, dst, and data, for accessing source node features, destination node features, and edge features, respectively. Here, the function computes messages only from source node features.
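If each edge carried a weight feature, the message function could also read edge data through edges.data. The following variant is purely hypothetical (PageRank uses no edge weights, and the feature name 'w' is made up):

def weighted_message_func(edges):
    # 'w' is a hypothetical per-edge weight stored in g.edata['w'].
    return {'pv': edges.src['pv'] / edges.src['deg'] * edges.data['w']}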
Define the reduce function, which removes and aggregates the messages from each node's mailbox and computes its new PageRank value.
def pagerank_reduce_func(nodes):
    msgs = torch.sum(nodes.mailbox['pv'], dim=1)
    pv = (1 - DAMP) / N + DAMP * msgs
    return {'pv' : pv}
The reduce functions are Node UDFs. A Node UDF has a single argument, nodes, which has two members: data and mailbox. data contains the node features, and mailbox contains all incoming message features, stacked along the second dimension (hence the dim=1 argument).
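To illustrate the mailbox layout, here is a hypothetical alternative reducer (not used in PageRank) that takes the maximum incoming message instead of the sum:

def max_reduce_func(nodes):
    # nodes.mailbox['pv'] has shape (num_nodes_in_batch, num_messages),
    # so reductions run along dim=1, producing one value per node.
    return {'pv_max': torch.max(nodes.mailbox['pv'], dim=1)[0]}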
The message UDF works on a batch of edges, whereas the reduce UDF consumes the messages generated on a batch of edges but outputs a batch of nodes: DGL groups the messages by destination node before calling the reduce UDF.

Register the message function and reduce function, which will be called later by DGL.
g.register_message_func(pagerank_message_func)
g.register_reduce_func(pagerank_reduce_func)
The algorithm is straightforward. Here is the code for one PageRank iteration.
def pagerank_naive(g):
    # Phase #1: send out messages along all edges.
    for u, v in zip(*g.edges()):
        g.send((u, v))
    # Phase #2: receive messages to compute new PageRank values.
    for v in g.nodes():
        g.recv(v)
Batching semantics for a large graph
The above code does not scale to a large graph because it iterates over the nodes and edges one at a time. DGL solves this by allowing you to compute on a batch of nodes or edges. For example, the following code triggers the message and reduce functions on multiple nodes and edges at once.
def pagerank_batch(g):
    g.send(g.edges())
    g.recv(g.nodes())
You are still using the same reduce function pagerank_reduce_func, where nodes.mailbox['pv'] is a single tensor that stacks the incoming messages along the second dimension. You might wonder whether it is even possible to perform reduce on all nodes in parallel, since each node may have a different number of incoming messages and you cannot really "stack" tensors of different lengths together. In general, DGL solves this problem by grouping the nodes by their number of incoming messages and calling the reduce function once for each group.
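To make the grouping idea concrete, here is a rough sketch of degree bucketing in plain PyTorch. This is not DGL's actual implementation; the function and variable names are invented for illustration:

import torch

def bucketed_sum(dst, msgs, num_nodes):
    # Sum scalar messages per destination node, calling the reduce once
    # per in-degree bucket so each bucket's mailbox stacks cleanly.
    deg = torch.bincount(dst, minlength=num_nodes)
    out = torch.zeros(num_nodes)
    order = torch.argsort(dst)               # group messages by destination
    sorted_msgs = msgs[order]
    starts = torch.cumsum(deg, dim=0) - deg  # start of each node's run
    for d in deg[deg > 0].unique():
        d = int(d)
        nodes_d = (deg == d).nonzero(as_tuple=True)[0]
        # Stack each node's d messages into a (num_nodes_d, d) "mailbox".
        idx = starts[nodes_d].unsqueeze(1) + torch.arange(d)
        out[nodes_d] = sorted_msgs[idx].sum(dim=1)   # one batched reduce
    return out

Each bucket triggers a single batched reduce, so the number of reduce calls scales with the number of distinct in-degrees rather than the number of nodes.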
Use higher-level APIs for efficiency
DGL provides many routines that combine basic send and recv in various ways. These routines are called level-2 APIs. For example, the next code example shows how to further simplify the PageRank example with such an API.
def pagerank_level2(g):
    g.update_all()
In addition to update_all, you can use pull, push, and send_and_recv in this level-2 category. For more information, see the API reference.
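For example, pull lets you update only part of the graph. The sketch below reuses the registered PageRank UDFs; the node_subset argument is a hypothetical example input:

def pagerank_partial(g, node_subset):
    # Gather and reduce messages along the in-edges of these nodes only,
    # using the previously registered message and reduce functions.
    g.pull(node_subset)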
Use DGL builtin functions for efficiency
Some of the message and reduce functions are used frequently. For this reason, DGL also provides builtin functions. For example, two builtin functions can be used in the PageRank example.
- dgl.function.copy_src(src, out) - an edge UDF that computes the output using the source node feature data. To use it, specify the name of the source feature (src) and the output name (out).
- dgl.function.sum(msg, out) - a node UDF that sums the messages in the node's mailbox. To use it, specify the message name (msg) and the output name (out).
The following PageRank example shows such functions.
import dgl.function as fn

def pagerank_builtin(g):
    g.ndata['pv'] = g.ndata['pv'] / g.ndata['deg']
    g.update_all(message_func=fn.copy_src(src='pv', out='m'),
                 reduce_func=fn.sum(msg='m', out='m_sum'))
    g.ndata['pv'] = (1 - DAMP) / N + DAMP * g.ndata['m_sum']
In the previous example, you provide the UDFs directly to update_all as its arguments. This overrides the previously registered UDFs.
In addition to cleaner code, using builtin functions also gives DGL the opportunity to fuse operations together, which results in faster execution. For example, DGL will fuse the copy_src message function and the sum reduce function into one sparse matrix-vector (spMV) multiplication.
The following section describes why spMV can speed up the scatter-gather phase in PageRank. For more details about the builtin functions in DGL, see the API reference. You can also download and run the different code examples to see the differences.
for k in range(K):
    # Uncomment the corresponding line to select different version.
    # pagerank_naive(g)
    # pagerank_batch(g)
    # pagerank_level2(g)
    pagerank_builtin(g)

print(g.ndata['pv'])
# Results
tensor([0.0070, 0.0084, 0.0084, 0.0130, 0.0077, 0.0051, 0.0087, 0.0091, 0.0084,
0.0139, 0.0084, 0.0092, 0.0065, 0.0075, 0.0094, 0.0049, 0.0067, 0.0129,
0.0084, 0.0123, 0.0117, 0.0096, 0.0075, 0.0068, 0.0140, 0.0086, 0.0094,
0.0105, 0.0076, 0.0068, 0.0095, 0.0102, 0.0085, 0.0109, 0.0092, 0.0092,
0.0127, 0.0119, 0.0110, 0.0077, 0.0069, 0.0108, 0.0112, 0.0126, 0.0094,
0.0144, 0.0101, 0.0129, 0.0146, 0.0112, 0.0094, 0.0127, 0.0094, 0.0111,
0.0137, 0.0075, 0.0113, 0.0128, 0.0118, 0.0123, 0.0118, 0.0094, 0.0069,
0.0113, 0.0174, 0.0085, 0.0095, 0.0083, 0.0103, 0.0093, 0.0085, 0.0110,
0.0119, 0.0103, 0.0122, 0.0059, 0.0044, 0.0131, 0.0075, 0.0130, 0.0059,
0.0121, 0.0060, 0.0180, 0.0078, 0.0094, 0.0102, 0.0080, 0.0050, 0.0112,
0.0127, 0.0127, 0.0067, 0.0123, 0.0100, 0.0137, 0.0110, 0.0101, 0.0082,
0.0130])
Using spMV for PageRank
Using builtin functions allows DGL to understand the semantics of UDFs, which makes a more efficient implementation possible. For example, in the case of PageRank, one common method to accelerate it is to use its linear algebra form:
$$R^{k+1} = \frac{1-d}{N} \mathbf{1} + d \tilde{A} R^{k}$$

Here, $R^{k}$ is the vector of the PageRank values of all nodes at iteration $k$, and $\tilde{A}$ is the sparse adjacency matrix of the graph, with each nonzero entry normalized by the source node's out-degree. Computing this equation is quite efficient because there is an efficient GPU kernel for sparse matrix-vector multiplication (spMV).
DGL detects whether such an optimization is available through the builtin functions. If a certain combination of builtins can be mapped to an spMV kernel (as in the PageRank example), DGL uses it automatically. We recommend using builtin functions whenever possible.
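As a concrete illustration (not DGL's internal kernel), the linear-algebra form could be written with a PyTorch sparse matrix as follows; the helper name pagerank_spmv is made up:

import torch

def pagerank_spmv(g):
    # Build the out-degree-normalized sparse adjacency matrix A_hat,
    # with A_hat[dst, src] = 1 / out_degree(src).
    src, dst = g.edges()
    vals = 1.0 / g.ndata['deg'][src]
    A_hat = torch.sparse_coo_tensor(torch.stack([dst, src]), vals, (N, N))
    # One spMV replaces the entire scatter-gather phase.
    pv = torch.sparse.mm(A_hat, g.ndata['pv'].unsqueeze(1)).squeeze(1)
    g.ndata['pv'] = (1 - DAMP) / N + DAMP * pv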