Transcription
A gentle introduction to graph neural networks
Seongok Ryu, Department of Chemistry @ KAIST
Motivation / Graph Neural Networks / Applications of Graph Neural Networks
Motivation
Non-Euclidean data structure
Successful deep learning architectures: 1. Convolutional neural networks
Lee, Jason, Kyunghyun Cho, and Thomas Hofmann. "Fully character-level neural machine translation without explicit segmentation." arXiv preprint arXiv:1610.03017 (2016).
Non-Euclidean data structure
Successful deep learning architectures: 2. Recurrent neural networks, e.g. sentiment analysis and neural machine translation
Non-Euclidean data structure
The data structures of the examples shown are regular. Image → values on pixels (grids); sentence → sequential structure.
Non-Euclidean data structure
HOWEVER, there are lots of irregular data structures: social graphs (Facebook, Wikipedia), 3D meshes, molecular graphs. All you need is GRAPH!
Graph structure
Graph $G(X, A)$. $X$: nodes (vertices), e.g. individual people in a social network, atoms in a molecule; they represent the elements of a system.
Graph structure
Graph $G(X, A)$. $A$: adjacency matrix, the edges of a graph (connectivity, relationship); it represents relationships or interactions between the elements of the system.
Graph structure
For more detail: Battaglia, Peter W., et al. "Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 (2018).
Graph structure
Node features, e.g.: Ariana Grande (25 years old, female, American, singer), Donald Trump (72 years old, male, American, president and businessman), Vladimir Putin (65 years old, male, Russian, president).
Graph structure
Edge features, e.g. bond types in a molecule: single bond, double bond, aromatic bond.
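To make the two ingredients concrete, here is a minimal numpy sketch of a graph as a node-feature matrix $X$ and an adjacency matrix $A$; the 4-atom fragment and the one-hot vocabulary are made up for illustration:

```python
import numpy as np

# A hypothetical 4-node molecular graph (atoms C, C, O, N, chosen arbitrarily).
# X holds one-hot node (atom-type) features; A is the symmetric adjacency matrix.
atom_types = ["C", "C", "O", "N"]
vocab = {"C": 0, "O": 1, "N": 2}

X = np.zeros((4, 3))
for i, a in enumerate(atom_types):
    X[i, vocab[a]] = 1.0

A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)  # A[i, j] = 1 if atoms i and j are bonded
```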
Learning relation and interaction
What can we do with graph neural networks?
Battaglia, Peter W., et al. "Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 (2018).
Learning relation and interaction
What can we do with graph neural networks? Node classification; link prediction; Node2Vec, Subgraph2Vec, Graph2Vec (embedding a node, substructure, or whole graph into a vector); learning physical laws from data. And you can do more amazing things with GNNs!
Graph neural networks
Overall architecture of graph neural networks
Updating node states: Graph Convolutional Network (GCN), Graph Attention Network (GAT), Gated Graph Neural Network (GGNN)
Readout: permutation invariance under changing node order
Graph Auto-Encoders
Practical issues: skip connection, inception, dropout
Principles of graph neural network
Weights used in updating the hidden states of fully-connected networks, CNNs, and RNNs.
Battaglia, Peter W., et al. "Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 (2018).
Overall neural network structure (case of supervised learning)
Input node features $H_i^{(0)}$: raw node information. Final node states $H_i^{(L)}$: how the NN recognizes the nodes. Graph features $Z$.
Principles of graph neural network
Updates in a graph neural network:
Edge update: relationships or interactions, sometimes called "message passing", e.g. the forces between springs.
Node update: aggregates the edge updates to update the node state, e.g. the forces acting on each ball.
Global update: an update of the global attribute, e.g. the net force and total energy of the physical system.
Battaglia, Peter W., et al. "Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 (2018).
Principles of graph neural network
Weights used in updating the hidden states of a GNN: the weights are shared across all nodes in the graph, but each node is updated differently because its individual node features $H_i^{(l)}$ enter the update.
GCN: Graph Convolutional Network
http://tkipf.github.io/ (famous for the variational autoencoder, VAE)
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
GCN: Graph Convolutional Network
What the NN learns: the shared convolution weight $W^{(l)}$.
Example with four nodes, updating node 1 from itself and its neighbors with the shared weight:
$H_1^{(l+1)} = \sigma\big(H_1^{(l)} W^{(l)} + H_2^{(l)} W^{(l)} + H_3^{(l)} W^{(l)} + H_4^{(l)} W^{(l)}\big)$
In general: $H_i^{(l+1)} = \sigma\big(\sum_{j \in N(i)} A_{ij} H_j^{(l)} W^{(l)}\big)$
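A minimal numpy sketch of the node update above; the function name `gcn_layer` and the toy sizes are illustrative, and real implementations would use sparse matrices and trained weights:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN update, H' = sigma(A H W), matching the formula above.

    A: (N, N) adjacency matrix (here assumed to already include self-loops),
    H: (N, F_in) node states, W: (F_in, F_out) weight shared by all nodes.
    """
    return np.maximum(A @ H @ W, 0.0)  # ReLU as the nonlinearity sigma

# Toy usage: 4 nodes, 3 input features, 2 output features.
rng = np.random.default_rng(0)
A = np.array([[1, 1, 1, 1],
              [1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 0, 0, 1]], dtype=float)
H = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
print(gcn_layer(A, H, W).shape)  # (4, 2)
```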
GCN: Graph Convolutional Network
Classification of nodes of citation networks and a knowledge graph; cross-entropy loss over the labeled nodes: $\mathcal{L} = -\sum_{l \in \mathcal{Y}_L} \sum_{f=1}^{F} Y_{lf} \ln Z_{lf}$
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
GAT: Graph Attention Network
Attention revisited. What the NN learns in a GCN: $H_i^{(l+1)} = \sigma\big(\sum_{j \in N(i)} A_{ij} H_j^{(l)} W^{(l)}\big)$
GAT: Graph Attention Network
Attention revisited. What the NN learns: the convolution weight and the attention coefficient.
GAT: $H_i^{(l+1)} = \sigma\big(\sum_{j \in N(i)} \alpha_{ij} H_j^{(l)} W^{(l)}\big)$
Velickovic, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
GAT: Graph Attention Network
Attention mechanism in natural language (figures of soft alignment in neural machine translation, from blog-post illustrations).
GAT: Graph Attention Network
Attention mechanism in natural language processing: $c_i = \sum_j \alpha_{ij} h_j$, $\alpha_{ij} = \mathrm{softmax}\big(\mathrm{score}(s_i, h_j)\big)$, with scoring functions $\mathrm{score}(s_i, h_j) = s_i^{\top} h_j$ (dot), $s_i^{\top} W_a h_j$ (general), or $v_a^{\top} \tanh\big(W_a [s_i, h_j]\big)$ (concat).
Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015).
GAT: Graph Attention Network
Attention revisited. What the NN learns: the convolution weight and the attention coefficient.
$H_i^{(l+1)} = \sigma\big(\sum_{j \in N(i)} \alpha_{ij} H_j^{(l)} W^{(l)}\big)$, with $\alpha_{ij} = f\big(H_i^{(l)} W^{(l)}, H_j^{(l)} W^{(l)}\big)$
Velickovic, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
GAT: Graph Attention Network
Two choices of attention coefficient:
Velickovic, Petar, et al. (network analysis): $\alpha_{ij} = \mathrm{softmax}_j\big(\mathrm{LeakyReLU}\big(a^{\top} [H_i W, H_j W]\big)\big)$
Seongok Ryu, et al. (molecular applications): $\alpha_{ij} = \tanh\big((H_i W)^{\top} C\, (H_j W)\big)$
Velickovic, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
GAT: Graph Attention Network
Multi-head attention.
Average: $H_i^{(l+1)} = \sigma\big(\frac{1}{K} \sum_{k=1}^{K} \sum_{j \in N(i)} \alpha_{ij}^{k} H_j^{(l)} W^{k}\big)$
Concatenation: $H_i^{(l+1)} = \Vert_{k=1}^{K}\, \sigma\big(\sum_{j \in N(i)} \alpha_{ij}^{k} H_j^{(l)} W^{k}\big)$
Velickovic, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
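A minimal numpy sketch of single-head GAT attention as defined above; the explicit double loop is for readability only, and the adjacency is assumed to contain self-loops:

```python
import numpy as np

def gat_attention(H, W, a, adj):
    """A sketch of single-head GAT attention (Velickovic et al.).

    H: (N, F) node states, W: (F, Fp) shared weight, a: (2*Fp,) attention
    vector, adj: (N, N) 0/1 adjacency assumed to include self-loops.
    """
    Hw = H @ W
    N = Hw.shape[0]
    e = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            s = np.concatenate([Hw[i], Hw[j]]) @ a   # a^T [H_i W || H_j W]
            e[i, j] = s if s > 0 else 0.2 * s        # LeakyReLU
    e = np.where(adj > 0, e, -1e9)                   # attend to neighbors only
    e = e - e.max(axis=1, keepdims=True)             # numerically stable softmax
    alpha = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)
    return alpha @ Hw                                # sum_j alpha_ij (H_j W)
```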
GGNN: Gated Graph Neural Network
Message Passing Neural Network (MPNN) framework. The previous node state and the message state are used to update the $i$-th hidden node state: $h_i^{(t+1)} = U_t\big(h_i^{(t)}, m_i^{(t+1)}\big)$. The message state is computed from the previous node and neighbor states and the edge states: $m_i^{(t+1)} = \sum_{j \in N(i)} M_t\big(h_i^{(t)}, h_j^{(t)}, e_{ij}\big)$.
Wu, Zhenqin, et al. "MoleculeNet: a benchmark for molecular machine learning." Chemical Science 9.2 (2018): 513-530.
GGNN: Gated Graph Neural Network
Using a recurrent unit to update the node states, in this case a GRU: $H_i^{(l+1)} = \mathrm{GRU}\big(H_i^{(l)}, m_i^{(l+1)}\big)$, i.e. $H_i^{(l+1)} = (1 - z_i) \odot H_i^{(l)} + z_i \odot \tilde{H}_i^{(l+1)}$, where $z_i$ is the updating rate of the temporal state, $\tilde{H}_i^{(l+1)}$ the temporal (candidate) node state, and $H_i^{(l)}$ the previous node state.
Li, Yujia, et al. "Gated graph sequence neural networks." arXiv preprint arXiv:1511.05493 (2015).
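A rough numpy sketch of a GGNN-style update with simplified gating; all weight names are illustrative, all weights are square so the state size is preserved, and a full GRU would also carry a reset gate:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_layer(A, H, W_msg, W_z, U_z, W_h, U_h):
    """A simplified GGNN-style update. All weights are (F, F) so the node
    state size is preserved; the GRU reset gate is omitted for brevity."""
    m = A @ H @ W_msg                     # aggregated message m_i^{(l+1)}
    z = sigmoid(m @ W_z + H @ U_z)        # updating rate of the temporal state
    H_tilde = np.tanh(m @ W_h + H @ U_h)  # temporal (candidate) node state
    return (1.0 - z) * H + z * H_tilde    # mix previous and candidate states
```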
Readout: permutation invariance under changing node order
Input node features $H_i^{(0)}$: raw node information. Final node states $H_i^{(L)}$: how the NN recognizes the nodes. Graph features $Z$.
Readout: permutation invariance under changing node order
Figure from: Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction, Scientific Figure on ResearchGate [accessed 8 Sep 2018].
Readout: permutation invariance under changing node order
Graph feature: $z_G = f\big(\{H_i\}\big)$
Node-wise summation: $z_G = \tau\big(\sum_{i \in G} \mathrm{MLP}(H_i)\big)$
Graph gathering: $z_G = \tau\big(\sum_{i \in G} \sigma\big(\mathrm{MLP}_1([H_i, H_i^{(0)}])\big) \odot \mathrm{MLP}_2(H_i)\big)$
$\tau$: ReLU activation, $\sigma$: sigmoid activation
Gilmer, Justin, et al. "Neural message passing for quantum chemistry." arXiv preprint arXiv:1704.01212 (2017).
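Both readouts are permutation invariant because they only sum over nodes. A small numpy sketch, with single linear layers standing in for the MLPs:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def readout_sum(H):
    """Node-wise summation readout; reordering the rows of H cannot change it."""
    return np.maximum(H.sum(axis=0), 0.0)            # tau = ReLU

def readout_gather(H_final, H_input, W1, W2):
    """Gated 'graph gathering' readout; W1, W2 stand in for MLP_1 and MLP_2.

    H_final: (N, F) final node states, H_input: (N, F0) input node features.
    """
    gate = sigmoid(np.concatenate([H_final, H_input], axis=1) @ W1)
    return np.maximum((gate * (H_final @ W2)).sum(axis=0), 0.0)
```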
Graph Auto-Encoders (GAE)
Clustering, link prediction, matrix completion and recommendation.
Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016). https://github.com/tkipf/gae
Graph Auto-Encoders (GAE)
Input graph $G(X, A)$ → Encoder → node features $Z \sim q(Z \mid X, A)$ → Decoder → reconstructed adjacency matrix $\hat{A} = \sigma(Z Z^{\top})$.
Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016). https://github.com/tkipf/gae
Graph Auto-Encoders (GAE)
Encoder (inference model): a two-layer GCN, $\mathrm{GCN}(X, A) = \tilde{A}\, \mathrm{ReLU}\big(\tilde{A} X W_0\big) W_1$, where $\tilde{A} = D^{-1/2} A D^{-1/2}$ is the symmetrically normalized adjacency matrix.
Variational inference: $q(Z \mid X, A) = \prod_{i=1}^{N} q(z_i \mid X, A)$, with $q(z_i \mid X, A) = \mathcal{N}\big(z_i \mid \mu_i, \mathrm{diag}(\sigma_i^2)\big)$.
Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016). https://github.com/tkipf/gae
Graph Auto-Encoders (GAE)
Decoder (generative model): inner product between latent vectors, $p(A \mid Z) = \prod_{i=1}^{N} \prod_{j=1}^{N} p(A_{ij} \mid z_i, z_j)$, $p(A_{ij} = 1 \mid z_i, z_j) = \sigma(z_i^{\top} z_j)$, i.e. $\hat{A} = \sigma(Z Z^{\top})$ with $Z = \mathrm{GCN}(X, A)$; $\sigma$ is the sigmoid activation and $\hat{A}_{ij}$ are the elements of the reconstructed $A$.
Learning: $\mathcal{L} = \mathbb{E}_{q(Z \mid X, A)}\big[\log p(A \mid Z)\big] - \mathrm{KL}\big[q(Z \mid X, A)\,\Vert\, p(Z)\big]$ (reconstruction loss and KL divergence).
Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016). https://github.com/tkipf/gae
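Putting encoder and decoder together, a minimal numpy sketch of a VGAE-style forward pass under the formulas above; the weight names and the use of plain dense arrays are illustrative, not the authors' implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def vgae_forward(X, A_norm, W0, W_mu, W_logvar, rng):
    """Sketch of a VGAE forward pass.

    A_norm: symmetrically normalized adjacency; W0, W_mu, W_logvar: GCN
    weights (names illustrative). Returns mu, logvar, and sigma(Z Z^T) from
    a reparameterized sample Z.
    """
    H = np.maximum(A_norm @ X @ W0, 0.0)        # first GCN layer (ReLU)
    mu = A_norm @ H @ W_mu                      # mean of q(z_i | X, A)
    logvar = A_norm @ H @ W_logvar              # log-variance of q(z_i | X, A)
    Z = mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)  # reparameterize
    A_hat = sigmoid(Z @ Z.T)                    # inner-product decoder
    return mu, logvar, A_hat
```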
Practical Issues: Inception
GoogLeNet, winner of the 2014 ImageNet challenge.
Practical Issues: Inception
Inception module (figure from a Medium guide to deep network architectures).
Practical Issues: Inception
$H^{(1)} = \sigma\big(A H^{(0)} W^{(0)}\big)$, $H^{(2)} = \sigma\big(A H^{(1)} W^{(1)}\big)$: an inception-style GNN block combines paths with different numbers of stacked graph convolutions.
Practical Issues: Inception
Makes the network wider and helps avoid vanishing gradients.
Practical Issues: Skip-connection
ResNet, winner of the 2015 ImageNet challenge. Going deeeeeeeeper!
Practical Issues: Skip-connection
ResNet, winner of the 2015 ImageNet challenge: $y = F(x) + x$ (figure from a Medium guide to deep network architectures).
Practical Issues: Dropout
Dropout rate $p = 0.25$. Without dropout: # parameters $= 4 \times 4 = 16$. Dropping one side's units: # parameters $= 16 \times 0.75 = 12$; dropping both sides: # parameters $= 16 \times 0.75 \times 0.75 = 9$. For a dense network, dropout of the hidden states reduces the effective number of parameters from $N_w$ to $N_w (1 - p)^2$.
Practical Issues: Dropout
Hidden states in a graph neural network. Graph conv.: $H^{(l+1)} = \sigma\big(A H^{(l)} W^{(l)}\big)$
Node | Age | Sex | Nationality | Job
1 | 28 | M | Korean | Student
2 | 40 | F | American | Medical doctor
3 | 36 | F | French | Politician
Practical Issues: Dropout
Hidden states in a graph neural network: one option is to mask individual node states $H_i^{(l)}$ entirely; another is to mask individual pieces of information (features) within the node states, as sketched below. Many other options are possible; the proper method depends on your task.
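A small numpy sketch of the two masking options; the shapes and the inverted-dropout scaling are illustrative choices, not prescribed by the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))   # 3 nodes, 4 features
p = 0.25                      # dropout rate

# Option 1: mask whole node states (drop entire rows).
node_mask = (rng.random((3, 1)) > p).astype(float)
H_node_drop = H * node_mask / (1.0 - p)     # inverted-dropout scaling

# Option 2: mask individual features of node states (drop single entries).
feat_mask = (rng.random(H.shape) > p).astype(float)
H_feat_drop = H * feat_mask / (1.0 - p)
```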
Applications of graph neural networks
Network Analysis: 1. Node classification, 2. Link prediction, 3. Matrix completion
Molecular Applications: 1. Neural molecular fingerprint, 2. Quantitative Structure-Property Relationship (QSPR), 3. Molecular generative model
Interacting physical systems
Network analysis
1. Node classification: the karate club network. Karate club graph, colors denote communities obtained via modularity-based clustering (Brandes et al., 2008); GCN embedding (with random weights) for nodes in the karate club network. All figures and descriptions are taken from Thomas N. Kipf's blog (http://tkipf.github.io/graph-convolutional-networks/); watch the video on his blog.
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
Network analysis
1. Node classification: good node features yield good node classification results.
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
Network analysis
1. Node classification: semi-supervised learning with a low label rate. Citation networks (Citeseer, Cora, Pubmed) and a bipartite graph (NELL).
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
Network analysis
1. Node classification: outperforms classical machine learning methods.
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
Network analysis
1. Node classification. Spectral graph convolution: $H^{(l+1)} = \sigma\big(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\big)$, with $\tilde{A} = A + I_N$ and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$.
Spectral graph filtering: $g_{\theta} \star x = U g_{\theta'}(\Lambda) U^{\top} x$, where $U$ is the matrix of eigenvectors of the normalized graph Laplacian $L = I_N - D^{-1/2} A D^{-1/2} = U \Lambda U^{\top}$.
Polynomial approximation (in this case, a Chebyshev polynomial): $g_{\theta'}(\Lambda) \approx \sum_{k=0}^{K} \theta_k' T_k(\tilde{\Lambda})$.
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
Network analysis
1. Node classification. Spectral graph filtering, linear approximation ($K = 1$): $g_{\theta'} \star x \approx \theta_0' x + \theta_1' (L - I_N) x = \theta_0' x - \theta_1' D^{-1/2} A D^{-1/2} x$.
Using a single parameter $\theta = \theta_0' = -\theta_1'$: $g_{\theta} \star x \approx \theta \big(I_N + D^{-1/2} A D^{-1/2}\big) x$.
Renormalization trick: $I_N + D^{-1/2} A D^{-1/2} \rightarrow \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ (see the sketch below).
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
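The renormalization trick is easy to implement directly; a short numpy sketch:

```python
import numpy as np

def renormalized_adjacency(A):
    """Renormalization trick: D~^{-1/2} A~ D~^{-1/2} with A~ = A + I."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)                 # D~_ii = sum_j A~_ij
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
A_norm = renormalized_adjacency(A)   # plug into H' = sigma(A_norm H W)
```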
Network analysis
1. Node classification. Spectral graph convolution: $H^{(l+1)} = \sigma\big(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\big)$, $\tilde{A} = A + I_N$, $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).
Network analysis
2. Link prediction: clustering, link prediction, matrix completion and recommendation.
Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016). https://github.com/tkipf/gae
Network analysis
2. Link prediction. Input graph $G(X, A)$ → Encoder → node features $Z \sim q(Z \mid X, A)$ → Decoder → reconstructed adjacency matrix $\hat{A} = \sigma(Z Z^{\top})$.
Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016). https://github.com/tkipf/gae
Network analysis
2. Link prediction. Trained on an incomplete version of the Cora, Citeseer, and Pubmed datasets where parts of the citation links (edges) have been removed, while all node features are kept. Validation and test sets are formed from the previously removed edges and the same number of randomly sampled pairs of unconnected nodes (non-edges).
Kipf, Thomas N., and Max Welling. "Variational graph auto-encoders." arXiv preprint arXiv:1611.07308 (2016). https://github.com/tkipf/gae
Network analysis
3. Matrix completion: can be applied to recommender systems.
Berg, Rianne van den, Thomas N. Kipf, and Max Welling. "Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion. A rating matrix $M$ of shape $N_u \times N_v$ ($N_u$: the number of users, $N_v$: the number of items). User $i$ rated item $j$ ($M_{ij}$ is the rating), or the rating is unobserved ($M_{ij} = 0$). The matrix completion or recommendation problem becomes a link prediction problem on a bipartite user-item interaction graph: $G(\mathcal{W}, \mathcal{E}, \mathcal{R})$, where $\mathcal{W} = \mathcal{U} \cup \mathcal{V}$ contains user nodes $u_i \in \mathcal{U}$, $i \in \{1, \dots, N_u\}$, and item nodes $v_j \in \mathcal{V}$, $j \in \{1, \dots, N_v\}$, and edges $(u_i, r, v_j) \in \mathcal{E}$ carry rating levels $r \in \{1, \dots, R\} = \mathcal{R}$.
Berg, Rianne van den, Thomas N. Kipf, and Max Welling. "Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion. GAE for the link prediction task: takes as input an $N \times D$ feature matrix $X$ and a graph adjacency matrix $A$, and produces an $N \times E$ node embedding matrix $Z = [z_1^{\top}, \dots, z_N^{\top}]^{\top} = f(X, A)$. The decoder $\hat{A} = g(Z)$ takes pairs of node embeddings $(z_i, z_j)$ and predicts the respective entries $\hat{A}_{ij}$.
Berg, Rianne van den, Thomas N. Kipf, and Max Welling. "Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion. GAE for bipartite recommender graphs $G(\mathcal{W}, \mathcal{E}, \mathcal{R})$:
Encoder: $[U, V] = f(X, M_1, \dots, M_R)$, where $M_r \in \{0, 1\}^{N_u \times N_v}$ is the adjacency matrix associated with rating type $r \in \mathcal{R}$, and $U$, $V$ are matrices of user and item embeddings with shapes $N_u \times E$ and $N_v \times E$, respectively.
Decoder: $\hat{M} = g(U, V)$, a rating matrix of shape $N_u \times N_v$.
GAE for link prediction: $Z = f(X, A)$, $\hat{A} = g(Z)$; GAE for the bipartite recommender: $[U, V] = f(X, M_1, \dots, M_R)$, $\hat{M} = g(U, V)$.
Berg, Rianne van den, Thomas N. Kipf, and Max Welling. "Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion. GAE for bipartite recommender graphs $G(\mathcal{W}, \mathcal{E}, \mathcal{R})$: $[U, V] = f(X, M_1, \dots, M_R)$, $\hat{M} = g(U, V)$.
Berg, Rianne van den, Thomas N. Kipf, and Max Welling. "Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion. Encoder: $u_i = \sigma(W h_i)$ is the final user embedding, with intermediate node state $h_i = \sigma\big[\mathrm{accum}\big(\sum_{j \in \mathcal{N}_{i,1}} \mu_{j \to i, 1}, \dots, \sum_{j \in \mathcal{N}_{i,R}} \mu_{j \to i, R}\big)\big]$ and message from item $j$ to user $i$: $\mu_{j \to i, r} = \frac{1}{c_{ij}} W_r x_j$, where $c_{ij} = \sqrt{|\mathcal{N}_i| |\mathcal{N}_j|}$ is a normalization constant.
Berg, Rianne van den, Thomas N. Kipf, and Max Welling. "Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Network analysis
3. Matrix completion. Decoder: produces a probability that rating $\hat{M}_{ij}$ equals $r$ through a bilinear softmax, $p(\hat{M}_{ij} = r) = \frac{e^{u_i^{\top} Q_r v_j}}{\sum_{s=1}^{R} e^{u_i^{\top} Q_s v_j}}$, and the expected rating $\hat{M}_{ij} = \mathbb{E}[r] = \sum_{r=1}^{R} r\, p(\hat{M}_{ij} = r)$ (see the sketch below).
Loss function: $\mathcal{L} = -\sum_{i,j} \sum_{r=1}^{R} I[M_{ij} = r] \log p(\hat{M}_{ij} = r)$, where $I[k = l] = 1$ when $k = l$ and 0 otherwise, and the sum runs over observed ratings.
Berg, Rianne van den, Thomas N. Kipf, and Max Welling. "Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
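A minimal numpy sketch of the bilinear softmax decoder and expected rating above; array names are illustrative, and `Q` stacks one matrix $Q_r$ per rating level:

```python
import numpy as np

def rating_probs(u, v, Q):
    """Bilinear softmax decoder sketch.

    u: (E,) user embedding, v: (E,) item embedding,
    Q: (R, E, E) one learnable matrix per rating level.
    Returns p(M_ij = r) for r = 1..R and the expected rating.
    """
    scores = np.array([u @ Q[r] @ v for r in range(Q.shape[0])])
    p = np.exp(scores - scores.max())            # stable softmax over levels
    p /= p.sum()
    expected = np.sum((np.arange(Q.shape[0]) + 1) * p)  # E[r] = sum_r r p(r)
    return p, expected
```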
Network analysis
3. Matrix completion
Berg, Rianne van den, Thomas N. Kipf, and Max Welling. "Graph convolutional matrix completion." arXiv preprint arXiv:1706.02263 (2017).
Molecular applications: Which kinds of datasets exist?
Bioactive molecules with drug-like properties: 1,828,820 compounds, https://www.ebi.ac.uk/chembldb/
Drugs and targets (FDA approved, investigational, experimental): 7,713 (all drugs), 4,115 (targets), https://drugbank.ca/
Molecular applications: Which kinds of datasets exist?
Tox21 Data Challenge (@ Kaggle): 12 types of toxicity. Molecule species (represented as SMILES) and toxicity labels are given, but the dataset is too small to train a deep learning model.
Molecular applications
1. Neural molecular fingerprint. Hash functions have traditionally been used to generate molecular fingerprints. *Molecular fingerprint: a vector representation of a molecule.
Molecular applications
1. Neural molecular fingerprint. Such molecular fingerprints can be easily obtained with open-source packages, e.g. RDKit (figure from a kehang.github.io post on molecular property prediction).
Molecular applications
1. Neural molecular fingerprint. Recently, neural fingerprints generated by graph convolutional networks have been widely used for more accurate molecular property predictions (figure from the same kehang.github.io post).
Molecular applications
1. Neural molecular fingerprint
Duvenaud, David K., et al. "Convolutional networks on graphs for learning molecular fingerprints." Advances in Neural Information Processing Systems. 2015.
Molecular applications
2. Quantitative Structure-Property Relationships (QSPR)
Ryu, Seongok, Jaechang Lim, and Woo Youn Kim. "Deeply learning molecular structure-property relationships using graph attention neural network." arXiv preprint arXiv:1805.10988 (2018).
Molecular applications
2. Quantitative Structure-Property Relationships (QSPR). Learning solubility of molecules: the final node states show that the neural network recognizes several functional groups differently.
Ryu, Seongok, Jaechang Lim, and Woo Youn Kim. "Deeply learning molecular structure-property relationships using graph attention neural network." arXiv preprint arXiv:1805.10988 (2018).
Molecular applications
2. Quantitative Structure-Property Relationships (QSPR). Learning photovoltaic efficiency (a quantum mechanical phenomenon): interestingly, the final node states show that the NN can also differentiate nodes according to their quantum mechanical characteristics.
Ryu, Seongok, Jaechang Lim, and Woo Youn Kim. "Deeply learning molecular structure-property relationships using graph attention neural network." arXiv preprint arXiv:1805.10988 (2018).
Molecular applications
2. Quantitative Structure-Property Relationships (QSPR). Learning solubility of molecules: graph features, plotted in ascending order of the property, show that similar molecules are located close together in the graph latent space.
Molecular applications
3. Molecular generative model. Motivation: de novo molecular design. Chemical space is too huge: only about $10^8$ molecules have been synthesized as potential drug candidates, whereas it is estimated that there are $10^{23}$ to $10^{60}$ drug-like molecules. Virtual screening is therefore limited.
Molecular applications
3. Molecular generative model. Motivation: de novo molecular design. Molecule graph vs. Simplified Molecular-Input Line-Entry System (SMILES): molecules can be represented as strings according to defined rules.
Molecular applications
3. Molecular generative model. Motivation: de novo molecular design. Many SMILES-based generative models exist.
Segler, Marwin HS, et al. "Generating focused molecule libraries for drug discovery with recurrent neural networks." ACS Central Science 4.1 (2017): 120-131.
Gómez-Bombarelli, Rafael, et al. "Automatic chemical design using a data-driven continuous representation of molecules." ACS Central Science 4.2 (2018): 268-276.
Molecular applications
3. Molecular generative model. Motivation: de novo molecular design. The SMILES representation has a fatal problem: small changes in structure can lead to quite different expressions, so it is difficult to reflect the topological information of molecules.
Molecular applications
3. Molecular generative model. Literature:
Li, Y., Vinyals, O., Dyer, C., Pascanu, R., & Battaglia, P. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Jin, Wengong, Regina Barzilay, and Tommi Jaakkola. "Junction tree variational autoencoder for molecular graph generation." arXiv preprint arXiv:1802.04364 (2018).
Liu, Qi, et al. "Constrained graph variational autoencoders for molecule design." arXiv preprint arXiv:1805.09076 (2018).
Simonovsky, Martin, and Nikos Komodakis. "GraphVAE: Towards generation of small graphs using variational autoencoders." arXiv preprint arXiv:1802.03480 (2018).
De Cao, Nicola, and Thomas Kipf. "MolGAN: An implicit generative model for small molecular graphs." arXiv preprint arXiv:1805.11973 (2018).
Molecular applications
3. Molecular generative model
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model. Generation proceeds step by step (see the sketch after this list):
1. Sample whether to add a new node of a particular type or to terminate; if a node type is chosen,
2. add a node of this type to the graph,
3. check if any further edges are needed to connect the new node to the existing graph,
4. if yes, select a node in the graph and add an edge connecting the new node to the selected node.
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
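A toy Python sketch of this generation loop; the model's learned probabilities are replaced by random stand-ins, so it only illustrates the control flow, not the actual trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

node_types = ["C", "N", "O"]   # hypothetical node-type vocabulary
nodes, edges = [], []

while len(nodes) < 6:
    # Step 1: sample "add a node of some type" vs. "terminate".
    p_add = 0.8 if len(nodes) < 2 else 0.5   # stand-in for softmax(f_an(h_G))
    if rng.random() > p_add:
        break
    # Step 2: add the sampled node type to the graph.
    nodes.append(rng.choice(node_types))
    v = len(nodes) - 1
    # Steps 3-4: repeatedly decide whether an edge is needed, then pick a partner.
    while v > 0 and rng.random() < 0.5:      # stand-in for sigma(f_ae(h_G, h_v))
        u = rng.integers(0, v)               # stand-in for softmax over node scores
        if (u, v) not in edges:
            edges.append((u, v))

print(nodes, edges)
```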
Molecular applications
3. Molecular generative model. Determine whether to add a node or not.
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model. If a node is added, determine whether to add edges between the new node and the other nodes.
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model. Graph propagation process, followed by a readout of all node states to generate a graph feature:
message: $m_{u \to v} = f_e\big(h_u, h_v, x_{(u,v)}\big)$; aggregation: $a_v = \sum_{u \in N(v)} m_{u \to v}$; node update: $h_v' = f_n\big(a_v, h_v\big)$; graph feature: $h_G = R\big(\{h_v\}_{v \in G}\big)$.
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model. Add node: a step to decide whether or not to add a node, $p(\text{add node} \mid G) = \mathrm{softmax}\big(f_{an}(h_G^{(T)})\big)$. If "yes", the new node vectors $h_v$ are carried over to the next step.
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model. Add edge: a step to add an edge to the new node, $p(\text{add edge} \mid G, v) = \sigma\big(f_{ae}(h_G^{(T)}, h_v^{(T)})\big)$, the probability of adding an edge to the newly created node $v$. If "yes": $p(\text{nodes} \mid G, v) = \mathrm{softmax}\big(f_s(h_u^{(T)}, h_v^{(T)})\big)$, a score for each node $u$ to connect the edge to.
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model. Objective function: the marginal likelihood $p(G) = \sum_{\pi} p(G, \pi)$, where $\pi$ ranges over node orderings (permutations). Following all possible permutations is intractable; instead, orderings are sampled from the data, and $\mathbb{E}_{p_{data}(G, \pi)}\big[\log p(G, \pi)\big]$ is maximized.
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Molecular applications
3. Molecular generative model
Li, Yujia, et al. "Learning deep generative models of graphs." arXiv preprint arXiv:1803.03324 (2018).
Interacting physical system
Interacting systems: nodes are particles, edges are (physical) interactions between particle pairs, and the latent code is the underlying interaction graph.
Kipf, Thomas, et al. "Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system
Input graph: $G = (\mathcal{V}, \mathcal{E})$ with vertices $v \in \mathcal{V}$ and edges $e = (v, v') \in \mathcal{E}$.
Node-to-edge ($v \to e$): $h_{(i,j)}^{l} = f_e^{l}\big([h_i^{l}, h_j^{l}, x_{(i,j)}]\big)$; edge-to-node ($e \to v$): $h_j^{l+1} = f_v^{l}\big(\big[\sum_{i \in \mathcal{N}_j} h_{(i,j)}^{l}, x_j\big]\big)$, where $x_j$ are initial node features and $x_{(i,j)}$ initial edge features.
Kipf, Thomas, et al. "Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system
Kipf, Thomas, et al. "Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system
Encoder: $h_{(i,j)}^{1} = f_e^{1}\big([x_i, x_j]\big)$, $h_j^{2} = f_v^{1}\big(\sum_{i \ne j} h_{(i,j)}^{1}\big)$, $h_{(i,j)}^{2} = f_e^{2}\big([h_i^{2}, h_j^{2}]\big)$, and the posterior over edge types $q_{\phi}(z_{ij} \mid x) = \mathrm{softmax}\big(h_{(i,j)}^{2}\big)$.
Kipf, Thomas, et al. "Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system
Decoder: $\tilde{h}_{(i,j)} = \sum_{k} z_{ij,k}\, \tilde{f}_e^{k}\big([x_i^{t}, x_j^{t}]\big)$ (messages gated by the sampled edge type), $\mathrm{MSG}_j^{t} = \sum_{i \ne j} \tilde{h}_{(i,j)}$, $\mu_j^{t+1} = x_j^{t} + f_{out}\big([\mathrm{MSG}_j^{t}, x_j^{t}]\big)$, and $p\big(x_j^{t+1} \mid x^{t}, z\big) = \mathcal{N}\big(\mu_j^{t+1}, \sigma^{2} I\big)$.
Kipf, Thomas, et al. "Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
Interacting physical system
ELBO: $\mathcal{L} = \mathbb{E}_{q_{\phi}(z \mid x)}\big[\log p_{\theta}(x \mid z)\big] - \mathrm{KL}\big[q_{\phi}(z \mid x)\,\Vert\, p_{\theta}(z)\big]$, where $p_{\theta}(x \mid z) = \prod_{t=1}^{T} p_{\theta}\big(x^{t+1} \mid x^{t}, z\big)$ since the dynamics is Markovian, and the prior $p_{\theta}(z) = \prod_{i \ne j} p_{\theta}(z_{ij})$ is a factorized uniform distribution over edge types.
Kipf, Thomas, et al. "Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).
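A small numpy sketch of the two ELBO terms under the Gaussian likelihood and uniform prior above; the array names and shapes are illustrative:

```python
import numpy as np

def nri_elbo_terms(x_pred_mu, x_next, sigma2, q_edge, num_types):
    """Reconstruction and KL terms of the NRI-style ELBO.

    x_pred_mu, x_next: (T, N, D) predicted means and ground-truth states,
    q_edge: (E, K) posterior edge-type probabilities from the encoder,
    num_types: K, the number of edge types.
    """
    # Reconstruction: log N(x^{t+1} | mu, sigma^2 I), summed over time/nodes/dims.
    recon = -0.5 * np.sum((x_next - x_pred_mu) ** 2 / sigma2
                          + np.log(2 * np.pi * sigma2))
    # KL to a factorized uniform prior: sum_e sum_k q (log q + log K).
    kl = np.sum(q_edge * (np.log(q_edge + 1e-16) + np.log(num_types)))
    return recon, kl
```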
Interacting physical system
Kipf, Thomas, et al. "Neural relational inference for interacting systems." arXiv preprint arXiv:1802.04687 (2018).