Jinkun Geng (耿金坤)

Senior Software Researcher@Clockwork

I am working as a senior software researcher at Clockwork. Before joining Clockwork, I received my PhD degree (2019.09 - 2024.01) from the Department of Computer Science, Stanford University. I have broad interests in networking and systems. Currently, I am focusing on building high-performance and fault-tolerant distributed systems with synchronized clocks. My CV is attached here.


Education

BEng Software Engineering

2012.09 - 2016.07
I spent four wonderful years at BUAA, where I met many talented teachers and classmates and learned a lot from them. Those four years of study led me into the fascinating world of computer science and software engineering. I graduated from BUAA ranking first among 134 students.

MSc Computer Science

2016.09 - 2019.07
I was recommended for admission to THU for my master's degree, where I worked with Professor Dan Li and focused on the interdisciplinary area of networking & AI.

PhD Computer Science

2019.09 - 2024.01
I enrolled in the Computer Science Department of Stanford University in 2019 and completed my PhD degree there. I was advised by Professor Balaji Prabhakar, and co-advised by Professor Anirudh Sivaraman and Professor Mendel Rosenblum.

Research Projects

Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks

PhD@Stanford
2021.03 - Present
  • Develop deadline-ordered multicast (DOM), a common primitive based on synchronized clocks; a minimal sketch of the DOM delivery rule follows this list
  • Develop a protocol based on DOM, called Nezha, which has been shown to outperform typical consensus protocols, including Multi-Paxos, Raft, NOPaxos, Fast Paxos, and EPaxos
  • Nezha is open-sourced here, and we plan to integrate Nezha with industrial systems that require high-performance consensus
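
As a rough illustration of DOM (a minimal sketch assuming clocks synchronized within a small error bound; the class names and constants below are hypothetical, not taken from the Nezha repository), a receiver buffers each message until its deadline, i.e., its send timestamp plus a fixed latency bound, and then delivers messages in deadline order:

```python
import heapq

# Illustrative sketch of deadline-ordered multicast (DOM) delivery.
# Assumption: all clocks are synchronized within a small error bound.
# DomReceiver and DELTA are hypothetical names, not Nezha's API.

DELTA = 0.002  # latency bound the sender adds to form the deadline (2 ms)

class DomReceiver:
    """Buffers multicast messages and releases them in deadline order."""

    def __init__(self):
        self._heap = []  # min-heap keyed by (deadline, seq)
        self._seq = 0    # tie-breaker so payloads are never compared

    def on_receive(self, send_ts, payload):
        # The sender stamps each message with its synchronized clock;
        # deadline = send time + fixed latency bound.
        heapq.heappush(self._heap, (send_ts + DELTA, self._seq, payload))
        self._seq += 1

    def deliver_ready(self, now):
        # Deliver a message only once the local (synchronized) clock has
        # passed its deadline; buffered messages come out deadline-ordered.
        out = []
        while self._heap and self._heap[0][0] <= now:
            out.append(heapq.heappop(self._heap)[2])
        return out
```

Because every receiver applies the same deadline rule on a synchronized clock, receivers that obtain the same messages deliver them in the same order; the real protocol also handles messages that arrive after their deadlines, which this sketch omits.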

CloudEx: A Next-Generation High-Frequency Financial Trading System in the Cloud

PhD@Stanford
2019.08 - Present
  • Integrate the Huygens clock-synchronization API to implement a fair-access mechanism for traders
  • Implement and continuously optimize the main features of CloudEx
  • Currently focusing on the fault tolerance of CloudEx

Fela: Flexible and Elastic Distributed Machine Learning

Master@THU
2019.03 - 2019.08
  • Incorporate flexible parallelism and an elastic tuning mechanism to accelerate distributed machine learning
  • Integrate a variety of scheduling policies (inspired by token scheduling) to optimize DML performance
  • Implement the prototype atop PyTorch, using a carefully designed virtual-layer mechanism

R2-SVM: Large-Scale Support Vector Machine Acceleration

Master@THU
2018.06 - 2018.09
  • Redesign the tree-based structure of existing distributed solutions to large-scale SVM (e.g., Cascade-SVM, DC-SVM)
  • Design an interchangeable block-rotation strategy to eliminate skewed updates to the Lagrange multipliers
  • Incorporate a hybrid synchronous parallel mode to accelerate the convergence of the algorithm

Rima: An RDMA-Accelerated Model-Parallelized Solution to Large-Scale Matrix Factorization

Master@THU
2017.12 - 2018.06
  • Redesign the architecture of distributed matrix factorization, leveraging a ring-based architecture instead of a PS-based one to eliminate the centralized bottleneck and data redundancy (a toy sketch of the ring rotation follows this list)
  • Design a one-step transformation strategy to halve the communication workload of large-scale matrix factorization
  • Design three partial-randomness strategies to make the algorithm more robust
  • Overlap disk I/O overheads with communication/computation overheads
  • Conduct a comparative testbed experiment between Rima and DSGD
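
To illustrate the ring-based idea mentioned above (a toy sketch under simplified assumptions, not Rima's actual implementation; `ring_rotate` and `update` are placeholder names), model shards rotate hop-by-hop around the workers, so updates never funnel through a central parameter server:

```python
# Toy sketch of ring-based model-parallel training: model shards rotate
# around the ring of workers instead of being pushed to a central
# parameter server. `update` is a placeholder for the local training step.

def ring_rotate(shards, rotations, update):
    """shards[i] is the model shard currently held by worker i."""
    for _ in range(rotations):
        # Each worker trains on the shard it currently holds...
        shards = [update(worker, shard) for worker, shard in enumerate(shards)]
        # ...then every shard moves one hop around the ring, so each shard
        # visits every worker once per n rotations, and the only traffic
        # is neighbor-to-neighbor.
        shards = shards[-1:] + shards[:-1]
    return shards
```

With n workers, a full pass over the model takes n rotations of neighbor-only traffic, removing the many-to-one hotspot of the PS-based architecture.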

SmartPS: Accelerating Distributed Machine Learning by Smart Parameter Server

Master@THU
2018.03 - 2018.07
  • Design a new parameter abstraction (Parameter Unit) for distributed machine learning (DML)
  • Incorporate four strategies to accelerate DML under the PS-based architecture, i.e., selective update, proactive update, straggler assistance, and blocking of unnecessary pushes
  • Incorporate a priority-based transmission strategy to mitigate the performance gap between workers (especially in heterogeneous clusters), and conduct a comparative evaluation with 17 VPS instances on Aliyun

HiPS: Hierarchical Parameter Synchronization in Large-Scale DML

Master@THU
2017.05 - 2018.03
  • Incorporate data center network (DCN) topology into distributed machine learning (DML) to boost performance
  • Design highly efficient synchronization algorithms on top of server-centric topologies to better exploit the benefits of RDMA
  • Implement a prototype of BCube+HiPS in TensorFlow, and conduct comparative experiments on a 2-layer BCube testbed

Hosepipe: Resource Management in the Data Center

Master@THU
2016.10 - 2017.04
  • Design new schemes to control and manage bandwidth resources in multi-tenant data center environments
  • Implement a kernel module and test its performance in a 4-node cluster

LOS: High-Performance and Strongly Compatible User-Level Network Stack

Master@THU
2015.10 - 2017.05
  • Design and implement a user-level stack based on DPDK to achieve high performance and strong compatibility
  • Implement a user-level Netfilter as a dynamic library
  • Port Nginx to LOS without changing its source code

Software Vulnerability Modeling

Intern@THU
2015.03 - 2015.07
  • Model software vulnerabilities and evaluate software trustworthiness
  • Conduct an extensive study of common public vulnerability datasets

Database Intrusion Detection

Intern@THU
2014.08 - 2015.03
  • Design a novel clustering-based intrusion detection algorithm
  • Participate in the development of database intrusion detection system

Artificial Neural Network

Intern@BUAA
2013.05 - 2014.06
  • Study the basics of artificial neural networks
  • Apply typical intelligent algorithms (e.g., ANN, PSO) to signal processing

Publications

  • Jinkun Geng, Anirudh Sivaraman, Balaji Prabhakar, Mendel Rosenblum. Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks (Preprint, Repository)
    49th International Conference on Very Large Data Bases (VLDB 2023)
  • Ahmad Ghalayini, Jinkun Geng, Vighnesh Sachidananda, Vinay Sriram, Yilong Geng, Balaji Prabhakar, Mendel Rosenblum, Anirudh Sivaraman. CloudEx: A Fair-Access Financial Exchange in the Cloud (PDF)
    18th Workshop on Hot Topics in Operating Systems (HotOS 2021)
  • Jinkun Geng, Dan Li, Shuai Wang. Fela: Incorporating Flexible Parallelism and Elastic Tuning to Accelerate Large-Scale DML (Preprint, PPT)
    36th IEEE International Conference on Data Engineering (ICDE 2020)
  • Jinkun Geng, Dan Li, Shuai Wang. Rima: An RDMA-Accelerated Model-Parallelized Solution to Large-Scale Matrix Factorization (PDF, PPT, Poster)
    35th IEEE International Conference on Data Engineering (ICDE 2019)
  • Shuai Wang, Dan Li, Jinkun Geng. Geryon: Accelerating Distributed CNN Training by Network-Level Flow Scheduling (Preprint)
    IEEE International Conference on Computer Communications (INFOCOM 2020)
  • Shuai Wang, Dan Li, Jinkun Geng, Yue Gu, Yang Cheng. Impact of Network Topology on the Performance of DML: Theoretical Analysis and Practical Factors (PDF)
    IEEE International Conference on Computer Communications (INFOCOM 2019)
  • Songtao Wang, Dan Li, Yang Cheng, Jinkun Geng, Yanshu Wang, Shuai Wang, Shutao Xia, Jianping Wu. BML: A High-performance, Low-cost Gradient Synchronization Algorithm for DML Training (PDF, Poster)
    Thirty-second Conference on Neural Information Processing Systems (NeurIPS 2018)
  • Songtao Wang, Dan Li, Yang Cheng, Jinkun Geng, Yanshu Wang, Shuai Wang, Shutao Xia, Jianping Wu. A Scalable, High-Performance, and Fault-Tolerant Network Architecture for Distributed Machine Learning. (PDF)
    IEEE/ACM Transactions on Networking (2020).
  • Kaihui Gao, Dan Li, Li Chen, Jinkun Geng, Fei Gui, Yang Cheng, Yue Gu. Incorporating Intra-flow Dependencies and Inter-flow Correlations for Traffic Matrix Prediction (Preprint, PDF, PPT)
    IEEE/ACM International Symposium on Quality of Service (IWQoS 2020)
  • Yang Cheng, Dan Li, Zhiyuan Guo, Binyao Jiang, Jiaxin Lin, Xi Fan, Jinkun Geng, Xinyi Yu, Wei Bai, Lei Qu, Ran Shu, Peng Cheng, Yongqiang Xiong, Jianping Wu. DLBooster: Boosting End-to-End Deep Learning Workflows with Offloading Data Preprocessing Pipelines (PDF)
    48th International Conference on Parallel Processing (ICPP 2019)
  • Jinkun Geng, Dan Li, Yang Cheng, Shuai Wang, Junfeng Li. HiPS: Hierarchical Parameter Synchronization in Large-Scale Distributed Machine Learning (PDF, PPT)
    ACM SIGCOMM 2018 Workshop on Network Meets AI & ML (NetAI 2018)
  • Jinkun Geng, Dan Li, Shuai Wang. Accelerating Distributed Machine Learning by Smart Parameter Server (PDF, PPT)
    3rd Asia-Pacific Workshop on Networking (APNet 2019)
  • Jinkun Geng, Dan Li, Shuai Wang. ElasticPipe: An Efficient and Dynamic Model-Parallel Solution to DNN Training (PDF, PPT)
    10th Workshop on Scientific Cloud Computing (ScienceCloud'19)
  • Jinkun Geng, Dan Li, Shuai Wang. Horizontal or Vertical? A Hybrid Approach to Large-Scale Distributed Machine Learning (PDF, PPT)
    1st Workshop on Converged Computing Infrastructure (CCIW'19)
  • Yukai Huang, Jinkun Geng, Du Lin, Bin Wang, Junfeng Li, Ruilin Ling, Dan Li. LOS: A High Performance and Compatible User-level Network Operating System (PDF, PPT)
    1st Asia-Pacific Workshop on Networking (APNet 2017)
  • Kaihui Gao, Dan Li, Li Chen, Jinkun Geng, Fei Gui, Yang Cheng, Yue Gu. Predicting Traffic Demand Matrix by Considering Inter-flow Correlations (Preprint)
    The First IEEE INFOCOM Workshop on Networking Algorithms (WNA)
  • Best Paper Runner-Up! Junfeng Li, Dan Li, Wenfei Wu, K. K. Ramakrishnan, Jinkun Geng, Fei Gui, Fanzhao Wang, Kai Zheng. Sphinx: A Transport Protocol for High-Speed and Lossy Mobile Networks (PDF)
    38th IEEE International Performance Computing and Communications Conference (IPCCC 2019)
  • Jinkun Geng. CODE: Incorporating Correlation and Dependency for Task Scheduling in Data Center (PDF, PPT)
    15th IEEE International Symposium on Parallel and Distributed Processing with Applications (IEEE ISPA 2017)
  • Yang Cheng, Jinkun Geng, Yanshu Wang, Junfeng Li, Dan Li, Jianping Wu. Bridging Machine Learning and Computer Network Researches: A Survey (PDF)
    CCF Transactions on Networking
  • Junfeng Li, Dan Li, Yirong Yu, Yukai Huang, Jing Zhu, Jinkun Geng. Towards full virtualization of SDN infrastructure (PDF)
    Computer Networks
  • Zhetao Li, Fei Gui, Jinkun Geng, Dan Li, Zhibo Wang, Junfeng Li, Yang Cheng, Usama Zafar. Dante: Enabling FOV-Aware Adaptive FEC Coding for 360-Degree Video Streaming (PDF, PPT)
    2nd Asia-Pacific Workshop on Networking (APNet 2018)
  • Dan Li, Jinkun Geng. A Survey on Large-Scale Machine Learning Network (in Chinese:大规模机器学习网络研究) (PDF)
    Communications of the CCF
  • Jinkun Geng, Dan Li. Documentary Report on APNet'18 (in Chinese:第二届亚太地区网络研讨会 (APNet 2018)全程纪实) (PDF)
    IEEE Computer (Chinese version)
  • Jinkun Geng, Daren Ye, Ping Luo, Pin Lv. A Novel Clustering Algorithm for Database Anomaly Detection
    EAI SecureComm Workshop on Applications and Techniques in Information Security (ATCS), 2015

Talks

  • Nezha: A High-Performance Consensus Protocol Using Accurately Synchronized Clocks
    Stanford Platform Lab Winter Review, 2022
  • CloudEx: Building a Jitter-free Financial Exchange in the Cloud
    Stanford Platform Lab Seminar, 2020
  • Fela: Incorporating Flexible Parallelism and Elastic Tuning to Accelerate Large-Scale DML
    IEEE International Conference on Data Engineering (ICDE 2020)
  • Accelerating Distributed Machine Learning by Smart Parameter Server
    3rd Asia-Pacific Workshop on Networking (APNet 2019)
  • Some Perspective on Large-Scale Distributed Machine Learning (in Chinese: 大规模分布式机器学习之我见) (PPT)
    Trusted Cloud Summit (可信云大会)
  • ElasticPipe: An Efficient and Dynamic Model-Parallel Solution to DNN Training
    10th Workshop on Scientific Cloud Computing (ScienceCloud'19)
  • Horizontal or Vertical? A Hybrid Approach to Large-Scale Distributed Machine Learning
    1st Workshop on Converged Computing Infrastructure (CCIW'19)
  • Rima: An RDMA-Accelerated Model-Parallelized Solution to Large-Scale Matrix Factorization
    35th IEEE International Conference on Data Engineering (ICDE 2019)
  • HiPS: Hierarchical Parameter Synchronization in Large-Scale Distributed Machine Learning
    ACM SIGCOMM 2018 Workshop on Network Meets AI & ML (NetAI 2018)
  • LOS: A High Performance and Compatible User-level Network Operating System
    1st Asia-Pacific Workshop on Networking (APNet 2017)
  • CODE: Incorporating Correlation and Dependency for Task Scheduling in Data Center
    15th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA 2017)

Work Experience

    Clockwork

    Palo Alto
    Software Engineer
    2022.06 – 2022.09
    • Refactor the Nezha codebase and prepare it for open-sourcing.
    • Test and evaluate Nezha.
    • Explore applications of clock synchronization in new areas (concurrency control).

    Google

    Mountain View (Virtual)
    Software Engineer
    2020.06 – 2020.09
    Mentor: Hui Su
    • Optimize video encoding performance
    • Implement warped transformation for motion vector prediction
    • Implement the DBSCAN algorithm for clustering motion vectors (a toy example follows this list)
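
    As an illustration of that last bullet (a toy example, not the encoder's actual code; the eps/min_samples values are placeholders, not the tuned parameters from this work), DBSCAN groups nearby motion vectors and labels stray ones as noise:

    ```python
    import numpy as np
    from sklearn.cluster import DBSCAN

    # Toy example: cluster 2-D motion vectors so that blocks sharing a
    # motion model group together. eps/min_samples are placeholder values.
    motion_vectors = np.array([
        [1.0, 0.5], [1.1, 0.4], [0.9, 0.6],  # one coherent motion
        [-3.0, 2.0], [-2.9, 2.1],            # a second motion
        [10.0, -8.0],                        # an outlier
    ])

    labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(motion_vectors)
    print(labels)  # [0 0 0 1 1 -1]; -1 marks noise
    ```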

    Lopodo.com

    Beijing
    Web Developer
    2015.08 – 2015.11
    • Lopodo.com is a high-tech company focused on eliminating toxic emissions and creating healthy indoor environments.
    • I cooperated with three other software engineers to develop and maintain Lopodo's official website.
    • I also developed a mini e-shop for Lopodo based on the WeChat platform.

    Ransetee.com

    Beijing
    Web Developer
    2015.04 – 2015.07
    • Ransetee is an incubator project started at CAFA, which aims to bridge artistic designers and customers and to establish a commercial ecosystem for personalized clothing and accessories.
    • I cooperated with other software engineers to develop and maintain Ransetee's official website.
    • I also optimized the website's content-loading algorithm.

Awards

  • APNet Student Travel Grant @APNet Committee (2019.07)
  • Outstanding Master Graduate at Tsinghua (3 out of 141) @THU (2019.07)
  • Best Case Award in 2nd Cloud Max Performance Innovation @CAICT (2019.06)
  • Outstanding Master Dissertation at Tsinghua (7 out of 141) (PDF) @THU (2019.06)
  • HPDC Student Travel Grant @HPDC Committee (2019.06)
  • ICDE Student Travel Award @ICDE Committee (2019.03)
  • National Scholarship @Ministry of Education & THU (2018.10)
  • Outstanding Graduate @BUAA (2016.07)
  • Samsung Enterprise Scholarship @Samsung Group (2014.10)
  • National Scholarship @Ministry of Education & BUAA (2013.08)
  • First Prize in NECCS @TEFL China (2012.12)