Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks
- Develop deadline-ordered multicast (DOM), a common primitive based on synchronized clocks (a minimal sketch follows this list)
- Develop Nezha, a consensus protocol built on DOM; Nezha has been shown to outperform typical consensus protocols including Multi-Paxos, Raft, NOPaxos, Fast Paxos, and EPaxos
- Nezha is open-sourced, and we plan to integrate it with industrial systems that require high-performance consensus
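The DOM idea can be illustrated with a minimal, single-process Python sketch (not the actual Nezha implementation; DomSender, DomReceiver, clock_ns, and the 2 ms delay bound are all hypothetical): the sender stamps each message with a deadline equal to its send time plus a one-way delay bound, and each receiver buffers messages and releases them in deadline order once its synchronized clock passes the deadline, so receivers that get the same messages release them in the same order.

```python
import heapq
import time

def clock_ns() -> int:
    """Stand-in for a synchronized clock (e.g., Huygens); just the local clock here."""
    return time.monotonic_ns()

class DomSender:
    """Stamps every message with a deadline = send time + a one-way delay bound."""
    def __init__(self, delay_bound_ns: int):
        self.delay_bound_ns = delay_bound_ns

    def stamp(self, msg):
        return (clock_ns() + self.delay_bound_ns, msg)

class DomReceiver:
    """Buffers stamped messages and releases them in deadline order once the
    clock has passed the deadline."""
    def __init__(self):
        self._heap = []   # (deadline, seq, msg)
        self._seq = 0     # tie-breaker for equal deadlines

    def on_message(self, stamped):
        deadline, msg = stamped
        heapq.heappush(self._heap, (deadline, self._seq, msg))
        self._seq += 1

    def release_ready(self):
        """Pop every buffered message whose deadline has already passed."""
        now = clock_ns()
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _, _, msg = heapq.heappop(self._heap)
            ready.append(msg)
        return ready

if __name__ == "__main__":
    sender = DomSender(delay_bound_ns=2_000_000)   # hypothetical 2 ms bound
    receiver = DomReceiver()
    for payload in ("a", "b", "c"):
        receiver.on_message(sender.stamp(payload))
    time.sleep(0.005)                              # let the deadlines pass
    print(receiver.release_ready())                # -> ['a', 'b', 'c']
```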
CloudEx: A Next-Generation High-Frequency Financial Trading System in the Cloud
- Integrate the Huygens clock-synchronization API to implement a fairness mechanism across traders (a minimal sketch follows this list)
- Implement and continuously optimize the main features of CloudEx
- Currently focusing on the fault tolerance of CloudEx
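A rough sketch of the fairness idea under synchronized clocks (an illustration only, not the CloudEx codebase; FairSequencer, sync_clock_ns, and the 1 ms hold window are hypothetical): each order is held for a short window after its gateway-stamped submission time and then matched in submission order, so a trader on a slower network path is not penalized as long as its orders arrive within the window.

```python
import heapq
import time

def sync_clock_ns() -> int:
    """Stand-in for a Huygens-synchronized clock; just the local clock here."""
    return time.monotonic_ns()

class FairSequencer:
    """Holds each order until submit_ts + hold_ns, then releases orders in
    submission-time order rather than arrival order."""
    def __init__(self, hold_ns: int):
        self.hold_ns = hold_ns
        self._pending = []   # (submit_ts, trader_id, order)

    def submit(self, trader_id, order, submit_ts=None):
        ts = sync_clock_ns() if submit_ts is None else submit_ts
        heapq.heappush(self._pending, (ts, trader_id, order))

    def pop_ready(self):
        """Return orders whose hold window has expired, earliest submission first."""
        now = sync_clock_ns()
        ready = []
        while self._pending and self._pending[0][0] + self.hold_ns <= now:
            _, trader_id, order = heapq.heappop(self._pending)
            ready.append((trader_id, order))
        return ready

if __name__ == "__main__":
    seq = FairSequencer(hold_ns=1_000_000)   # hypothetical 1 ms hold window
    seq.submit("trader_B", "SELL 10")        # B's order arrives first ...
    seq.submit("trader_A", "BUY 10",         # ... but A submitted it earlier
               submit_ts=sync_clock_ns() - 500_000)
    time.sleep(0.002)
    print(seq.pop_ready())                   # A's order is released before B's
```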
Fela: Flexible and Elastic Distributed Machine Learning
- Incorporate flexible parallelism and an elastic tuning mechanism to accelerate distributed machine learning
- Integrate a variety of scheduling policies (inspired by token scheduling) to optimize DML performance (a scheduling sketch follows this list)
- Implement the prototype atop PyTorch, using a carefully designed virtual-layer mechanism
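As an illustration of one possible elastic scheduling policy (not Fela's actual scheduler; the job names and throughput curves below are made up), the sketch greedily hands each additional worker to the job with the largest marginal throughput gain:

```python
def greedy_allocate(total_workers, speedup):
    """Give each additional worker to the job with the largest marginal gain,
    where speedup[job][k] is the (pre-profiled) throughput with k workers."""
    alloc = {job: 0 for job in speedup}
    for _ in range(total_workers):
        def gain(job):
            k, curve = alloc[job], speedup[job]
            return curve[k + 1] - curve[k] if k + 1 < len(curve) else 0.0
        best = max(alloc, key=gain)
        if gain(best) <= 0:
            break                      # no job benefits from another worker
        alloc[best] += 1
    return alloc

if __name__ == "__main__":
    # Hypothetical profiled throughput (samples/s) at 0..4 workers per job.
    curves = {
        "job_sgd":  [0, 100, 180, 230, 260],
        "job_adam": [0, 80, 150, 200, 240],
    }
    print(greedy_allocate(4, curves))  # -> {'job_sgd': 2, 'job_adam': 2}
```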
R2-SVM: Large-Scale Support Vector Machine Acceleration
- Redesign the tree-based structure of existing distributed solutions to large-scale SVM (e.g., Cascade-SVM, DC-SVM)
- Design an interchangeable block-rotation strategy to eliminate skewed updates of the Lagrange multipliers (a minimal rotation sketch follows this list)
- Incorporate a hybrid synchronous-parallel mode to accelerate the algorithm's convergence
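A minimal sketch of the block-rotation schedule only (the SVM solver itself is omitted; rotation_schedule and the worker/block counts are hypothetical): each round, every worker moves on to the next data block in round-robin order, so every block's Lagrange multipliers are updated equally often.

```python
def rotation_schedule(num_workers, num_rounds):
    """In round r, worker w owns block (w + r) % num_workers, so after
    num_workers rounds every worker has visited every block exactly once."""
    return [{w: (w + r) % num_workers for w in range(num_workers)}
            for r in range(num_rounds)]

if __name__ == "__main__":
    for r, assignment in enumerate(rotation_schedule(num_workers=3, num_rounds=3)):
        print(f"round {r}: worker -> block {assignment}")
    # round 0: {0: 0, 1: 1, 2: 2}
    # round 1: {0: 1, 1: 2, 2: 0}
    # round 2: {0: 2, 1: 0, 2: 1}
```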
Rima: An RDMA-Accelerated Model-Parallelized Solution to Large-Scale Matrix Factorization
- Redesign the architecture of distributed matrix factorization: use a ring-based architecture instead of a PS-based one to eliminate the centralized bottleneck and data redundancy (a minimal ring sketch follows this list)
- Design a one-step transformation strategy that halves the communication workload of large-scale matrix factorization
- Design three partial-randomness strategies to make the algorithm more robust
- Overlap disk I/O with communication and computation
- Conduct comparative testbed experiments between Rima and DSGD
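A toy NumPy sketch of the ring-based idea (DSGD-style block rotation with no parameter server); ring_mf, the block layout, and every hyperparameter here are hypothetical and unrelated to Rima's actual RDMA implementation. Each worker keeps its user-factor block and rotates the item-factor blocks around the ring after a local SGD pass, so factors are exchanged only between ring neighbors.

```python
import numpy as np

def ring_mf(R_blocks, rank=8, rounds=4, lr=0.01, reg=0.1, seed=0):
    """R_blocks[w][b] is the (users_w x items_b) rating block; zeros mean 'unrated'.
    Worker w always holds U[w]; the item blocks V[b] rotate around the ring."""
    rng = np.random.default_rng(seed)
    W = len(R_blocks)
    U = [rng.standard_normal((R_blocks[w][0].shape[0], rank)) * 0.1 for w in range(W)]
    V = [rng.standard_normal((R_blocks[0][b].shape[1], rank)) * 0.1 for b in range(W)]
    owner = list(range(W))                      # owner[w] = item block currently at worker w
    for _ in range(rounds):
        for w in range(W):                      # local SGD pass on the resident block pair
            b = owner[w]
            R = R_blocks[w][b]
            for i, j in zip(*R.nonzero()):
                ui, vj = U[w][i].copy(), V[b][j].copy()
                err = R[i, j] - ui @ vj
                U[w][i] += lr * (err * vj - reg * ui)
                V[b][j] += lr * (err * ui - reg * vj)
        owner = owner[1:] + owner[:1]           # rotate item blocks to the next worker
    return U, V

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    R = rng.integers(0, 5, size=(6, 6)).astype(float)        # toy 6x6 rating matrix
    blocks = [[R[0:3, 0:3], R[0:3, 3:6]],                     # 2 workers x 2 item blocks
              [R[3:6, 0:3], R[3:6, 3:6]]]
    U, V = ring_mf(blocks)
    print("factor block shapes:", U[0].shape, V[0].shape)     # (3, 8) (3, 8)
```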
SmartPS: Accelerating Distributed Machine Learning by Smart Parameter Server
- Design a new parameter abstraction (Parameter Unit) for distributed machine learning (DML)
- Incorporate four strategies to accelerate DML under the PS-based architecture: selective update, proactive update, straggler assistance, and blocking unnecessary pushes (selective update is sketched after this list)
- Incorporate a priority-based transmission strategy to mitigate the performance gap between workers (especially in heterogeneous clusters), and conduct a comparative evaluation with 17 VPSes on Aliyun
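A minimal sketch of selective update combined with priority ordering (SelectiveServer, pull_updates, and the 0.05 threshold are hypothetical; the real SmartPS tracks more state than this): the server transmits only parameter units whose accumulated change exceeds the threshold, largest change first.

```python
import numpy as np

class SelectiveServer:
    """PS-side sketch: track per-unit accumulated change since the last push and,
    on each pull, send only units whose change exceeds `threshold`, biggest first."""
    def __init__(self, units, threshold=0.05):
        self.params = {k: np.array(v, dtype=float) for k, v in units.items()}
        self.snapshot = {k: v.copy() for k, v in self.params.items()}  # last pushed version
        self.threshold = threshold

    def apply_gradient(self, key, grad, lr=0.1):
        self.params[key] -= lr * np.asarray(grad, dtype=float)

    def pull_updates(self):
        """Return {unit: new value} for units that changed enough, by priority."""
        changed = sorted(
            ((float(np.linalg.norm(v - self.snapshot[k])), k) for k, v in self.params.items()),
            reverse=True)
        updates = {}
        for delta, key in changed:
            if delta <= self.threshold:
                break                                  # the rest changed even less
            updates[key] = self.params[key].copy()
            self.snapshot[key] = self.params[key].copy()
        return updates

if __name__ == "__main__":
    ps = SelectiveServer({"unit_a": [1.0, 1.0], "unit_b": [1.0, 1.0]})
    ps.apply_gradient("unit_a", [5.0, 5.0])   # large change
    ps.apply_gradient("unit_b", [0.1, 0.1])   # tiny change, stays below the threshold
    print(list(ps.pull_updates()))            # -> ['unit_a']
```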
HiPS: Hierarchical Parameter Synchronization in Large-Scale DML
- Combine data center network (DCN) topology with distributed machine learning (DML) to boost performance
- Design highly efficient synchronization algorithms on top of server-centric topologies to better exploit the benefits of RDMA (a hierarchical-sync sketch follows this list)
- Implement a prototype of BCube+HiPS in TensorFlow, and conduct comparative experiments on a 2-layer BCube testbed
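A minimal NumPy sketch of two-level hierarchical synchronization (the actual HiPS algorithms over BCube and RDMA are considerably more involved; the grouping below is made up): gradients are summed within each server group first, the per-group sums are then combined across groups so only one aggregate per group crosses the upper level of the topology, and the global average is broadcast back to every worker.

```python
import numpy as np

def hierarchical_sync(grads_by_group):
    """grads_by_group[g] is the list of gradient vectors produced by group g's workers."""
    num_workers = sum(len(g) for g in grads_by_group)
    group_sums = [np.sum(g, axis=0) for g in grads_by_group]   # 1) intra-group reduce
    global_sum = np.sum(group_sums, axis=0)                    # 2) inter-group reduce
    return global_sum / num_workers                            # 3) broadcast this average

if __name__ == "__main__":
    # Two server groups with two workers each (e.g., two BCube server groups).
    group0 = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
    group1 = [np.array([5.0, 6.0]), np.array([7.0, 8.0])]
    print(hierarchical_sync([group0, group1]))   # -> [4. 5.]
```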
Hosepipe: Resource Management in Data Centers
- Design new schemes to control and manage bandwidth resources in multi-tenant data center environments (a minimal rate-limiting sketch follows this list)
- Implement a kernel module and test its performance on a 4-node cluster
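The kernel module itself is not reproduced here; as a stand-in, the sketch below captures per-tenant rate limiting as a simple user-space token bucket in Python, with made-up rate and burst values.

```python
import time

class TokenBucket:
    """Per-tenant limiter sketch: tokens refill at `rate_bps` bytes/s up to `burst`
    bytes, and a packet is admitted only if enough tokens are available."""
    def __init__(self, rate_bps, burst):
        self.rate_bps = rate_bps
        self.burst = burst
        self.tokens = burst
        self.last = time.monotonic()

    def admit(self, pkt_bytes):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate_bps)
        self.last = now
        if self.tokens >= pkt_bytes:
            self.tokens -= pkt_bytes
            return True
        return False        # caller queues or drops the packet

if __name__ == "__main__":
    bucket = TokenBucket(rate_bps=1_000_000, burst=10_000)   # hypothetical ~1 MB/s tenant cap
    print([bucket.admit(1500) for _ in range(8)])            # first ~6 admitted, then throttled
```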
LOS: A High-Performance and Highly Compatible User-Level Network Stack
- Design and implement a user-level network stack based on DPDK to achieve high performance and strong compatibility
- Implement a user-level Netfilter as a dynamic library
- Port Nginx onto LOS without changing its source code
Software Vulnerability Modeling
- Model software vulnerabilities and evaluate software trustworthiness
- Conduct an extensive study on common public vulnerability datasets
Database Intrusion Detection
- Design a novel clustering-based intrusion detection algorithm (a minimal clustering sketch follows this list)
- Participate in the development of a database intrusion detection system
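A minimal sketch of the clustering-based idea using scikit-learn's KMeans (not the deployed system; the feature representation and the threshold rule are hypothetical): cluster feature vectors of normal queries, then flag a new query as suspicious when it lies far from every learned cluster.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_profile(normal_features, k=3, seed=0):
    """Cluster normal query features (e.g., tables touched, rows read, time of day)
    and derive a distance threshold from the training distances."""
    km = KMeans(n_clusters=k, random_state=seed, n_init=10).fit(normal_features)
    dists = np.min(km.transform(normal_features), axis=1)
    return km, dists.mean() + 3 * dists.std()        # hypothetical cutoff

def is_intrusion(km, threshold, query_features):
    """Flag the query if it is far from every cluster of normal behavior."""
    return float(np.min(km.transform([query_features]))) > threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal = rng.normal(0.0, 0.5, size=(200, 2))     # toy "normal" query features
    km, thr = fit_profile(normal)
    print(is_intrusion(km, thr, [0.2, -0.1]))        # close to normal behavior -> False
    print(is_intrusion(km, thr, [8.0, 8.0]))         # far from all clusters   -> True
```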
Artificial Neural Network
- Study the basics of artificial neural networks
- Apply typical intelligent algorithms (e.g., ANN, PSO) to signal processing (a minimal PSO sketch follows this list)
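A minimal particle swarm optimization (PSO) sketch over a toy sphere objective, standing in for the actual signal-processing cost functions; the constants are conventional PSO defaults rather than values from the original work.

```python
import numpy as np

def pso(objective, dim=2, n_particles=20, iters=100, seed=0,
        w=0.7, c1=1.5, c2=1.5, bound=5.0):
    """Each particle is pulled toward its own best position (c1) and the swarm's
    best position (c2), with inertia w; positions are clipped to [-bound, bound]."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-bound, bound, (n_particles, dim))   # positions
    v = np.zeros_like(x)                                 # velocities
    pbest, pbest_val = x.copy(), np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, -bound, bound)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(np.min(pbest_val))

if __name__ == "__main__":
    sphere = lambda p: float(np.sum(p ** 2))   # toy objective; a real use would plug in a filter-design cost
    best, val = pso(sphere)
    print(best, val)                           # converges near the origin
```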