(Paper) GraphStorm

QBBong 2024. 12. 22. 22:42

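GraphStorm: All-in-one Graph Machine Learning Framework for Industry Applications

The professor's style is to hand out a list of recent papers, have each student pick one and give a paper analysis presentation, and then run a quiz based on that presentation.

Out of roughly 40 papers, I picked this one because it looked like the easiest (it had no equations).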

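Summary

GraphStorm is an innovative framework that simplifies applying graph machine learning (GML) in industrial settings and provides scalability. Released in May 2023, it offers a no-code/low-code solution that streamlines large-scale graph processing, model training, and inference.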

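Key features

Scalability: handles graphs with billions of nodes and edges, and scales with the available hardware.
Ease of use: graph construction and training can each be run with a one-line command.
Advanced features:
- Multimodal data integration: models text, image, and graph data together.
- Featureless nodes: uses neighboring node information or learnable embeddings.
- Isolated nodes: improves performance through GNN distillation.
Industry validation: performance validated on large-scale datasets such as Microsoft Academic Graph (MAG) and the Amazon Review dataset.
How it works: a four-layer structure consisting of a distributed graph engine, a data pipeline, model training/inference, and a model zoo.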
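
To make the "featureless nodes" item concrete, here is a minimal sketch of the neighbor-information idea: a node without features gets a vector by averaging the features of its neighbors. This is not GraphStorm's API or its actual implementation. The function name `neighbor_mean_features`, the toy author/paper graph, and the zero-vector fallback are all assumptions made for illustration.

```python
from collections import defaultdict


def neighbor_mean_features(edges, features, dim):
    """Build features for featureless nodes by averaging neighbor features.

    edges: iterable of (featureless_node, neighbor) pairs (hypothetical format).
    features: dict mapping a featured node id to a list of `dim` floats.
    Nodes whose neighbors all lack features fall back to a zero vector.
    """
    sums = defaultdict(lambda: [0.0] * dim)
    counts = defaultdict(int)
    for node, neighbor in edges:
        acc = sums[node]  # also registers the node, so it always gets a vector
        vec = features.get(neighbor)
        if vec is None:
            continue  # this neighbor has no features either; skip it
        for i, value in enumerate(vec):
            acc[i] += value
        counts[node] += 1
    return {
        node: [v / counts[node] for v in acc] if counts[node] else list(acc)
        for node, acc in sums.items()
    }


# Toy heterogeneous graph: papers carry features, authors do not.
paper_features = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [1.0, 1.0]}
writes = [("a1", "p1"), ("a1", "p2"), ("a2", "p3")]
author_features = neighbor_mean_features(writes, paper_features, dim=2)
print(author_features)
```

The other option named in the list, learnable embeddings, would instead give each featureless node its own trainable vector that is updated during training. That needs a training loop, so it is left out of this sketch.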

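Performance evaluation

Evaluation datasets: large heterogeneous graphs such as MAG and the Amazon Review dataset.
Evaluation tasks:
- Node classification: for example, predicting journal type or brand.
- Link prediction: identifying paper citations or co-purchase relationships.
Efficiency:
- Processes graphs with millions to hundreds of millions of nodes within a few hours.
- Strong scalability and performance across a variety of datasets.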
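
As a rough illustration of how a link prediction task like the one above can be scored, the sketch below ranks a true edge against sampled negative edges and averages the reciprocal ranks (MRR). The summary here does not state which scoring function or metric is used, so the dot-product score, the MRR metric, and the toy embeddings are assumptions for illustration only.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_reciprocal_rank(embeddings, positives, negatives):
    """Average of 1/rank of each true edge among its negative candidates.

    embeddings: dict mapping node id to an embedding vector.
    positives: list of (src, dst) true edges.
    negatives: dict mapping src to a list of non-neighbor candidate nodes.
    Ties are counted against the true edge (pessimistic ranking).
    """
    total = 0.0
    for src, dst in positives:
        pos_score = dot(embeddings[src], embeddings[dst])
        rank = 1 + sum(
            1
            for neg in negatives[src]
            if dot(embeddings[src], embeddings[neg]) >= pos_score
        )
        total += 1.0 / rank
    return total / len(positives)


# Toy citation example with hand-picked (hypothetical) embeddings.
emb = {
    "paper_a": [1.0, 0.0],
    "paper_b": [0.9, 0.1],
    "paper_c": [0.0, 1.0],
    "paper_d": [0.5, 0.5],
}
positives = [("paper_a", "paper_d"), ("paper_c", "paper_d")]
negatives = {
    "paper_a": ["paper_b", "paper_c"],
    "paper_c": ["paper_a", "paper_b"],
}
mrr = mean_reciprocal_rank(emb, positives, negatives)
print(mrr)
```

In this toy case the first true edge is outranked by one negative (rank 2) and the second is ranked first, so the average lands between 0.5 and 1.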

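Technical contributions and future work

Contributions:
- Simplified GML pipeline.
- Scalable modeling solution for industrial graphs.
- Improved performance over existing production models.
Future directions:
- Support for larger datasets.
- Cloud platform integration.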

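[Presentation slides]

It has the advantage of being easy to apply, but since it is a framework built by Amazon, there is very little publicly available information about it.

I am not sure whether I will ever get to use it properly.

Come to think of it, this semester had far more quizzes than last semester. (18 papers in total...)

(On the other hand, there were no assignments, and since the schedule slipped a bit the milestones wrapped up at 3, so maybe I should say it was better than last semester...)

The recent papers are listed below.

Recent paper list (2024)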

Levie, Ron
A graphon-signal analysis of graph neural networks. Advances in Neural Information Processing Systems 36 (2024).
Amirhossein Farzam, Allen Tannenbaum, and Guillermo Sapiro
From Geometry to Causality-Ricci Curvature and the Reliability of Causal Inference on Networks. 41st International Conference on Machine Learning, 2024.
Khang Nguyen, Nong Minh Hieu, Vinh Duc Nguyen, Nhat Ho, Stanley Osher, and Tan Minh Nguyen
Revisiting over-smoothing and over-squashing using Ollivier-Ricci curvature. International Conference on Machine Learning, 2023.
Glover, Cory, and Albert-László Barabási
Measuring Entanglement in Physical Networks. Physical Review Letters 133, no. 7 (2024): 077401.
Xue, Leyang, Shengling Gao, Lazaros K. Gallos, Orr Levy, Bnaya Gross, Zengru Di, and Shlomo Havlin
Nucleation phenomena and extreme vulnerability of spatial k-core systems. Nature Communications 15, no. 1 (2024): 5850.
Meng, Xiangyi, Onur Varol, and Albert-László Barabási
Hidden citations obscure true impact in science. PNAS Nexus 3, no. 5 (2024): pgae155.
Jiang, Chunheng, Zhenhan Huang, Tejaswini Pedapati, Pin-Yu Chen, Yizhou Sun, and Jianxi Gao
Network properties determine neural network performance. Nature Communications 15, no. 1 (2024): 5718.
Li, Changbin, Kangshuo Li, Yuzhe Ou, Lance M. Kaplan, Audun Jøsang, Jin-Hee Cho, Dong Hyun Jeong, and Feng Chen
Hyper Evidential Deep Learning to Quantify Composite Classification Uncertainty. ICLR 2024.
Lianghao Xia and Chao Huang
AnyGraph: Graph Foundation Model in the Wild. ACM KDD, 2024.
Zappala, Emanuele, Antonio Henrique de Oliveira Fonseca, Josue Ortega Caro, Andrew Henry Moberly, Michael James Higley, Jessica Cardin, and David van Dijk
Learning integral operators via neural integral equations. Nature Machine Intelligence (2024): 1-17.
Sandhu, Romeil S., Tryphon T. Georgiou, and Allen R. Tannenbaum
Ricci curvature: An economic indicator for market fragility and systemic risk. Science Advances 2, no. 5 (2016): e1501495.
Cao, Qianying, Somdatta Goswami, and George Em Karniadakis
Laplace neural operator for solving differential equations. Nature Machine Intelligence 6, no. 6 (2024): 631-640.
Rusch, T. Konstantin, Nathan Kirk, Michael M. Bronstein, Christiane Lemieux, and Daniela Rus
Message-Passing Monte Carlo: Generating low-discrepancy point sets via graph neural networks. Proceedings of the National Academy of Sciences 121, no. 40 (2024): e2409913121.
Trivedi, Puja, Ryan A. Rossi, David Arbour, Tong Yu, Franck Dernoncourt, Sungchul Kim, Nedim Lipka, Namyong Park, Nesreen K. Ahmed, and Danai Koutra
Editing Partially Observable Networks via Graph Diffusion Models. 41st International Conference on Machine Learning, 2024.
Zhong, Yi, Gaozheng Li, Ji Yang, Houbing Zheng, Yongqiang Yu, Jiheng Zhang, Heng Luo, Biao Wang, and Zuquan Weng
Learning motif-based graphs for drug–drug interaction prediction via local–global self-attention. Nature Machine Intelligence (2024): 1-12.
Zahra Kadkhodaie, Florentin Guth, Eero P. Simoncelli, and Stéphane Mallat
Generalization in diffusion models arises from geometry-adaptive harmonic representation. ICLR 2024 (Best Paper Award).
Berrueta, Thomas A., Allison Pinosky, and Todd D. Murphey
Maximum diffusion reinforcement learning. Nature Machine Intelligence (2024): 1-11.
Gan, Quan, Minjie Wang, David Wipf, and Christos Faloutsos
Graph Machine Learning Meets Multi-Table Relational Data. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024.
Zheng, Da, Xiang Song, Qi Zhu, Jian Zhang, Theodore Vasiloudis, Runjie Ma, Houyu Zhang et al.
GraphStorm: All-in-one graph machine learning framework for industry applications. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024.
Ke, Qing, Alexander J. Gates, and Albert-László Barabási
A network-based normalized impact measure reveals successful periods of scientific discovery across disciplines. Proceedings of the National Academy of Sciences 120, no. 48 (2023): e2309378120.
Mahowald, Kyle, Anna A. Ivanova, Idan A. Blank, Nancy Kanwisher, Joshua B. Tenenbaum, and Evelina Fedorenko
Dissociating language and thought in large language models. Trends in Cognitive Sciences (2024).
Wu, Tao, Xiangyun Gao, Feng An, Xiaotian Sun, Haizhong An, Zhen Su, Shraddha Gupta, Jianxi Gao, and Jürgen Kurths
Predicting multiple observations in complex systems through low-dimensional embeddings. Nature Communications 15, no. 1 (2024): 2242.
Gan, Xiao, Zixin Shu, Xinyan Wang, Dengying Yan, Jun Li, Shany Ofaim, Réka Albert et al.
Network medicine framework reveals generic herb-symptom effectiveness of traditional Chinese medicine. Science Advances 9, no. 43 (2023): eadh0215.
Zhang, Yang, Lorenzo Boninsegna, Muyu Yang, Tom Misteli, Frank Alber, and Jian Ma
Computational methods for analysing multiscale 3D genome organization. Nature Reviews Genetics 25, no. 2 (2024): 123-141.
Chen, Guozhang, and Pulin Gong
A spatiotemporal mechanism of visual attention: Superdiffusive motion and theta oscillations of neural population activity patterns. Science Advances 8, no. 16 (2022): eabl4995.
Fatemi, Bahare, Jonathan Halcrow, and Bryan Perozzi
Talk like a graph: Encoding graphs for large language models. ICLR 2024.
Max, Kevin, Laura Kriener, Garibaldi Pineda García, Thomas Nowotny, Ismael Jaras, Walter Senn, and Mihai A. Petrovici
Learning efficient backprojections across cortical hierarchies in real time. Nature Machine Intelligence (2024): 1-12.
Cao, Duanhua, Geng Chen, Jiaxin Jiang, Jie Yu, Runze Zhang, Mingan Chen, Wei Zhang et al.
Generic protein–ligand interaction scoring by integrating physical prior knowledge and data augmentation modelling. Nature Machine Intelligence (2024): 1-13.
Koh, Huan Yee, Anh TN Nguyen, Shirui Pan, Lauren T. May, and Geoffrey I. Webb
Physicochemical graph neural network for learning protein–ligand interaction fingerprints from sequence data. Nature Machine Intelligence (2024): 1-15.
Chen, Dong, Jian Liu, and Guo-Wei Wei
Multiscale topology-enabled structure-to-sequence transformer for protein–ligand interaction predictions. Nature Machine Intelligence 6, no. 7 (2024): 799-810.
Schlegel, P., Yin, Y., Bates, A.S. et al.
Whole-brain annotation and multi-connectome cell typing of Drosophila. Nature 634, 139–152 (2024).
Gu, Albert, and Tri Dao
Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752 (2023).
Ruiz, Luana, Luiz Chamon, and Alejandro Ribeiro
Graphon neural networks and the transferability of graph neural networks. Advances in Neural Information Processing Systems 33 (2020): 1702-1712.
Fabian, Christian, Kai Cui, and Heinz Koeppl
Learning sparse graphon mean field games. International Conference on Artificial Intelligence and Statistics, 2023.
Xia, Xinyue, Gal Mishne, and Yusu Wang
Implicit graphon neural representation. International Conference on Artificial Intelligence and Statistics, 2023.
Cheng, Chaoran, and Jian Peng
Equivariant neural operator learning with graphon convolution. Advances in Neural Information Processing Systems 36 (2024).
Sun, Yifei, Qi Zhu, Yang Yang, Chunping Wang, Tianyu Fan, Jiajun Zhu, and Lei Chen
Fine-Tuning Graph Neural Networks by Preserving Graph Generative Patterns. Proceedings of the AAAI Conference on Artificial Intelligence, 2024.
Cervino, Juan, Luana Ruiz, and Alejandro Ribeiro
Learning by transference: Training graph neural networks on growing graphs. IEEE Transactions on Signal Processing 71 (2023): 233-247.
Ruiz, Luana, Luiz FO Chamon, and Alejandro Ribeiro
Transferability properties of graph neural networks. IEEE Transactions on Signal Processing (2023).
Chatterjee, Anirban, Soham Dan, and Bhaswar B. Bhattacharya
Higher-Order Graphon Theory: Fluctuations, Degeneracies, and Inference. arXiv preprint arXiv:2404.13822 (2024).
Cao, Yuxuan, Jiarong Xu, Carl Yang, Jiaan Wang, Yunchao Zhang, Chunping Wang, Lei Chen, and Yang Yang
When to Pre-Train Graph Neural Networks? From Data Generation Perspective! Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023.
Böker, Jan, Ron Levie, Ningyuan Huang, Soledad Villar, and Christopher Morris
Fine-grained expressivity of graph neural networks. Advances in Neural Information Processing Systems 36 (2024).

Other interesting papers

He, Zhongmou, Jing Zhu, Shengyi Qian, Joyce Chai, and Danai Koutra
LinkGPT: Teaching Large Language Models To Predict Missing Links. arXiv preprint arXiv:2406.04640 (2024).
Barbero, Federico, Andrea Banino, Steven Kapturowski, Dharshan Kumaran, João GM Araújo, Alex Vitvitskyi, Razvan Pascanu, and Petar Veličković
Transformers need glasses! Information over-squashing in language tasks. arXiv preprint arXiv:2406.04267 (2024).
