Fujitsu’s Fugaku supercomputer may no longer hold the title of the world’s fastest in the latest Top 500 rankings, but it remains a powerhouse for diverse workloads, especially artificial intelligence. This week, Fujitsu unveiled Fugaku-LLM, a large language model with sophisticated Japanese language processing capabilities. The model is intended for uses ranging from academic research to commercial deployment, and it demonstrates that the A64FX processor can handle demanding AI tasks.

Fugaku-LLM was trained on 380 billion tokens using 13,824 nodes of the Fugaku supercomputer. Fugaku’s A64FX processor supports FP64, FP32, FP16, and INT8 arithmetic, making it suitable for workloads spanning AI and traditional supercomputing. Training relied on distributed parallel learning techniques optimized for the machine’s architecture and its Tofu Interconnect D network, maximizing efficiency across the thousands of nodes involved.
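
To make the training setup concrete, here is a minimal sketch of a distributed data-parallel training loop with low-precision compute, in the spirit of what is described above. This is purely illustrative PyTorch, not Fujitsu’s code: reports indicate the team ported Megatron-DeepSpeed to Fugaku, and the toy model, gloo backend, and bfloat16 autocast below are stand-ins (on Fugaku, communication would run over the Tofu Interconnect D rather than an Ethernet-based backend).

```python
# Illustrative distributed data-parallel training loop (NOT Fujitsu's code).
# Launch with: torchrun --nproc_per_node=2 train.py
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK/WORLD_SIZE/MASTER_ADDR for us; on Fugaku the
    # transport would be the Tofu Interconnect D, not gloo over Ethernet.
    dist.init_process_group(backend="gloo")
    rank = dist.get_rank()

    model = torch.nn.Linear(1024, 1024)  # stand-in for a transformer block
    model = DDP(model)                   # replicates weights, syncs gradients
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        x = torch.randn(8, 1024)  # stand-in for a batch of token embeddings
        # Low-precision autocast mirrors the A64FX's FP16/INT8 support;
        # bfloat16 is used here because CPU autocast supports it.
        with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
            loss = model(x).pow(2).mean()
        loss.backward()  # DDP all-reduces gradients across processes here
        opt.step()
        opt.zero_grad()
        if rank == 0:
            print(f"step {step}: loss={loss.item():.4f}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Each process holds a full model replica and DDP averages gradients across them after every backward pass; frameworks like Megatron-DeepSpeed extend this basic pattern with tensor and pipeline parallelism to reach thousands of nodes.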

With 13 billion parameters, Fugaku-LLM is modest next to the 175 billion parameters of GPT-3 (OpenAI has not disclosed GPT-4’s parameter count), but it stands as the largest LLM trained in Japan to date. According to Fujitsu, the smaller size is a deliberate efficiency choice: the model requires significantly less computational power for inference, making it practical for businesses and researchers across Japan. The training dataset was roughly 60% Japanese content, with the remaining 40% consisting of English, mathematics, and code, a mix intended to make the model capable across both linguistic and technical tasks.
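
As a rough illustration of how such a corpus mixture is typically realized during training, the snippet below samples documents from weighted sources. The corpus names and the exact split of the 40% non-Japanese share (20/10/10 here) are hypothetical placeholders; Fujitsu has not published the precise proportions.

```python
# Toy sketch of weighted corpus sampling for a ~60/40 dataset mixture.
# Corpus names and the 20/10/10 breakdown are hypothetical placeholders.
import random

CORPUS_WEIGHTS = {
    "japanese_text": 0.60,  # Japanese content, per Fujitsu's description
    "english_text": 0.20,   # illustrative split of the remaining 40%
    "math": 0.10,
    "code": 0.10,
}

def sample_corpus(rng: random.Random) -> str:
    """Pick the source corpus for the next training document, by weight."""
    names = list(CORPUS_WEIGHTS)
    weights = [CORPUS_WEIGHTS[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)
counts = {name: 0 for name in CORPUS_WEIGHTS}
for _ in range(10_000):
    counts[sample_corpus(rng)] += 1
print(counts)  # observed counts approach the configured 60/20/10/10 mixture
```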

This extensive Japanese-centric training sets it apart from other Japanese models that were trained primarily on English datasets. As a result, Fugaku-LLM boasts superior proficiency in Japanese, achieving an average score of 5.5 on the Japanese MT-Bench, the top score among openly available models trained with original data from Japan. It particularly excels in humanities and social sciences, achieving an impressive benchmark score of 9.18, according to Fujitsu.

Fugaku-LLM was developed through a collaboration among premier Japanese institutions: Tokyo Institute of Technology, Tohoku University, Fujitsu Limited, RIKEN, Nagoya University, CyberAgent, and Kotoba Technologies. The effort was partly spurred by a shortage of the GPUs commonly relied upon for training and running AI models. The partnership also looked ahead to pairing models like Fugaku-LLM with Fujitsu’s upcoming 150-core Monaka datacenter CPU, which is designed for both AI and high-performance computing (HPC) workloads, underscoring a commitment to building out Japan’s domestic AI capabilities.

Fugaku-LLM is now available for both academic and commercial use under its licensing terms, distributed via GitHub and Hugging Face (Fujitsu’s announcement did not provide direct links). From May 10, 2024, the model will also be offered through the Fujitsu Research Portal. The release is part of Fujitsu’s push to make powerful AI tools accessible to a broader audience and to support research and development across a range of sectors.
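
Assuming the release follows the standard Hugging Face distribution pattern, loading the model would look roughly like the sketch below. The repository id "Fugaku-LLM/Fugaku-LLM-13B" is an assumption based on the announcement, not a link provided by Fujitsu; verify the actual id and license on Hugging Face before use.

```python
# Minimal sketch of loading the released checkpoint with Hugging Face
# transformers. The repo id below is an assumption; check Hugging Face
# for the actual repository and its license terms.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Fugaku-LLM/Fugaku-LLM-13B"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Japanese prompt: "The supercomputer Fugaku is..."
prompt = "スーパーコンピュータ「富岳」とは"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```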
