Tung D. Le

Staff Research Scientist
IBM Research - Tokyo
http://www.ibm.biz/leductung


research interest


My main interest lies at the intersection of parallel programming, compilers, and deep learning. In particular, I am interested in developing systematic optimizations for parallel programming and AI systems.

academic activities


professional experience

education


When        | What (in computer science) @ Where | With whom
2013 - 2016 | Ph.D. @ SOKENDAI/NII, Japan        | Prof. Zhenjiang Hu
2008 - 2010 | M.Sc. @ HUST, Vietnam              | Dr. Huu-Duc Nguyen
2002 - 2007 | B.S. @ HUST, Vietnam               | Prof. Thanh-Thuy Nguyen

awards


projects


patents


  1. Tung D. Le. Neural programmer interpreters with modeled primitives. US patent no. 11836613 B2, date of patent Dec. 5, 2023.

  2. Gradus Janssen, Vladimir Zolotov, and Tung D. Le. Neural network training using a data flow graph and dynamic memory management. US patent no. 11512062 B2, date of patent Dec. 6, 2022.

  3. Taro Sekiyama, Kiyokuni Kawachiya, Tung D. Le, and Yasushi Negishi. Real-time resource usage reduction in artificial neural networks. US patent no. 11461637 B2, date of patent Oct. 4, 2022.

  4. Yasushi Negishi, Tung D. Le, Haruki Imai, and Kiyokuni Kawachiya. ReLU compression to reduce GPU memory. US patent no. 11362670 B2, date of patent Jun. 14, 2022.

  5. Tung D. Le, Haruki Imai, Taro Sekiyama, and Yasushi Negishi. Multi-GPU deep learning using CPUs. US patent no. 11164079 B2, date of patent Nov. 2, 2021.

  6. Tung D. Le and Taro Sekiyama. Localizing tree-based convolutional neural networks. US patent no. 11106970 B2, date of patent Aug. 31, 2021.

  7. Tung D. Le, Haruki Imai, and Yasushi Negishi. Efficient parallel training of a network model on multiple graphics processing units. US patent no. 10949746 B2, date of patent Mar. 16, 2021.

  8. Tung D. Le, Haruki Imai, Yasushi Negishi, and Kiyokuni Kawachiya. Graph rewriting for large model support using categorized topological sort. US patent no. 10884755 B1, date of patent Jan. 5, 2021.

  9. Taro Sekiyama, Kiyokuni Kawachiya, Tung D. Le, and Yasushi Negishi. Real-time resource usage reduction in artificial neural networks. US patent no. 10558914 B2, date of patent Feb. 11, 2020.

  10. Taro Sekiyama, Kiyokuni Kawachiya, Tung D. Le, and Yasushi Negishi. Real-time resource usage reduction in artificial neural networks. US patent no. 10268951 B2, date of patent Apr. 23, 2019.

talks


  1. Tung D. Le (speaker), Alexander Eichenberger, and Tong Chen. Dynamic Dimension Analysis in onnx-mlir Compiler. ONNX Community Meetup 2023 @ NVIDIA, Santa Clara, CA. June 28, 2023. Slides. Video.

  2. Tung D. Le (speaker). Onnx-mlir: an MLIR-based Compiler for ONNX Models - The Latest Status. ONNX Community Meetup 2022 @ Microsoft Silicon Valley Campus. June 28, 2022. Slides. Video.

conference/journal papers


2020

  1. Tung D. Le, Gheorghe-Teodor Bercea, Tong Chen, Alexandre E. Eichenberger, Haruki Imai, Tian Jin, Kiyokuni Kawachiya, Yasushi Negishi, Kevin O’Brien. 2020. Compiling ONNX Neural Network Models Using MLIR. arXiv:2008.08272. Retrieved from https://arxiv.org/abs/2008.08272v1

  2. Haruki Imai, Tung D. Le, Yasushi Negishi, and Kiyokuni Kawachiya. 2020. Acceleration of large deep learning training with hybrid GPU memory management of swapping and re-computing. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), December 10-13, 2020, Atlanta, GA, USA. IEEE, 1111-1116.

2019

  1. Haruki Imai, Samuel Matzek, Tung D. Le, Yasushi Negishi, Kiyokuni Kawachiya. 2019. High Resolution Medical Image Segmentation Using Data-Swapping Method. In: Shen D. et al. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2019. MICCAI 2019. Lecture Notes in Computer Science, Vol. 11766. Springer, Cham.

  2. Tung D. Le, Haruki Imai, Yasushi Negishi, Kiyokuni Kawachiya. 2019. Automatic GPU Memory Management for Large Neural Models in TensorFlow. In Proceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management (ISMM 2019), June 2019, Phoenix, Arizona, USA. Association for Computing Machinery, New York, NY, USA, 1-13.

  3. Gradus Janssen, Vladimir Zolotov, and Tung D. Le. 2019. Large Data Flow Graphs in Limited GPU Memory. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), December 9-12, 2019, Los Angeles, CA, USA. IEEE, 1821-1830.

  4. Yuki Ito, Haruki Imai, Tung Le Duc, Yasushi Negishi, Kiyokuni Kawachiya, Ryo Matsumiya, and Toshio Endo. 2019. Profiling based out-of-core hybrid method for large neural networks. In Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming (PPoPP ’19), February 16-20, 2019, Washington, DC, USA. Association for Computing Machinery, New York, NY, USA, 399–400.

2018

  1. Minsik Cho, Tung D. Le, Ulrich A. Finkler, Haruki Imai, Yasushi Negishi, Taro Sekiyama, Saritha Vinod, Vladimir Zolotov, Kiyokuni Kawachiya, David S. Kung, Hillery C. Hunter. 2018. Large Model Support for Deep Learning in Caffe and Chainer. In the 2018 SysML Conference.

  2. Tung D. Le, Haruki Imai, Yasushi Negishi, Kiyokuni Kawachiya. 2018. TFLMS: Large Model Support in TensorFlow by Graph Rewriting. arXiv:1807.02037. Retrieved from https://arxiv.org/abs/1807.02037

  3. Tung D. Le, Taro Sekiyama, Yasushi Negishi, Haruki Imai, Kiyokuni Kawachiya. 2018. Involving CPUs into Multi-GPU deep learning. In Proceedings of the 2018 ACM/SPEC International Conference on Performance Engineering (ICPE ’18), March 2018, Berlin, Germany. Association for Computing Machinery, New York, NY, USA, 56–67.

2017

  1. Tung D. Le, Taro Sekiyama, Yasushi Negishi, Haruki Imai, Kiyokuni Kawachiya. 2017. Accelerating Multi-GPU Deep Learning by Collecting and Accumulating Gradients on CPUs. SIG Technical Reports 2017-HPC-159(8), 1-8.

2016

  1. Le-Duc Tung, Zhenjiang Hu. Towards Systematic Parallelization of Graph Transformations over Pregel. International Journal of Parallel Programming 45, 2 (April 2017), 320–339.

  2. Chong Li, Le-Duc Tung, Xiaodong Meng, Zhenjiang Hu. 2016. Derivation of parallel-efficient structural recursive functions from declarative graph queries. In Proceedings of the 31st Annual ACM Symposium on Applied Computing (SAC ’16), April 2016, Pisa, Italy. Association for Computing Machinery, New York, NY, USA, 1922–1925.

2015

  1. Le-Duc Tung, Zhenjiang Hu. 2015. Towards Systematic Parallelization of Graph Transformations over Pregel. In Proceedings of the 8th International Symposium on High-level Parallel Programming and Applications (HLPP 2015), July 2-3, 2015, Pisa, Italy.

  2. Le-Duc Tung, Zhenjiang Hu. 2015. Pregel meets UnCAL: a Systematic Framework for Transforming Big Graphs. In Proceedings of the 31st IEEE International Conference on Data Engineering Workshops (ICDEW 2015), April 13-17, 2015, Seoul, South Korea. IEEE, 250-254.

2013

  1. Le-Duc Tung, Nguyen-Van Quyet, Zhenjiang Hu. 2013. Efficient Query Evaluation on Distributed Graphs with Hadoop Environment. In Proceedings of the Fourth International Symposium on Information and Communication Technology (SoICT ’13), December 5-6, 2013, Da Nang, Vietnam. Association for Computing Machinery, New York, NY, USA, 311-319.

  2. Nguyen-Van Quyet, Le-Duc Tung, Zhenjiang Hu. 2013. Minimizing Data Transfers for Regular Reachability Queries on Distributed Graphs. In Proceedings of the Fourth International Symposium on Information and Communication Technology (SoICT ’13), December 5-6, 2013, Da Nang, Vietnam. Association for Computing Machinery, New York, NY, USA, 325-334.

2012

  1. D. T. Le, H. D. Nguyen, T. A. Pham, H. H. Ngo and M. T. Nguyen. 2012. An Intermediate Library for Multi-GPUs Computing Skeletons. In Proceedings of the 2012 IEEE RIVF International Conference on Computing & Communication Technologies, Research, Innovation, and Vision for the Future (RIVF ’12), March 2012, Ho Chi Minh City, Vietnam. IEEE, 1-6.

contact


Office address: 23rd floor (river side), IBM Japan headquarters building.
Telephone number: +81-(80)-5915-1439
E-mail: tung@jp.ibm.com


This page was generated by Pandoc from Markdown. Its source code is available here. Last updated on 29 January 2024.