Earth Simulator
The Earth Simulator (ES) (地球シミュレータ, Chikyū Shimyurēta), developed by the Japanese government initiative "Earth Simulator Project", was a highly parallel vector supercomputer system for running global climate models to assess the effects of global warming and to study problems in solid earth geophysics. The system was developed for the Japan Aerospace Exploration Agency, the Japan Atomic Energy Research Institute, and the Japan Marine Science and Technology Center (JAMSTEC) in 1997. Construction began in October 1999, and the site officially opened on 11 March 2002. The project cost 60 billion yen.
Built by NEC, ES was based on their SX-6 architecture. It consisted of 640 nodes with eight vector processors and 16 gigabytes of memory on each node, for a total of 5,120 processors and 10 terabytes of memory. Two nodes were installed per 1 meter × 1.4 meter × 2 meter cabinet. Each cabinet consumed 20 kW of power. The system had 700 terabytes of disk storage (450 for the system and 250 for users) and 1.6 petabytes of mass storage on tape drives. It was able to run holistic simulations of the global climate in both the atmosphere and the oceans down to a resolution of 10 km. Its performance on the LINPACK benchmark was 35.86 TFLOPS, which was almost five times faster than the previous fastest supercomputer, ASCI White. As of 2020, comparable performance can be achieved using 4 Nvidia A100 GPUs, each with 9.746 FP64 TFLOPS.
ES was the world's fastest supercomputer from 2002 to 2004. Its capability was surpassed by IBM's Blue Gene/L prototype on September 29, 2004.
ES was superseded by Earth Simulator 2 (ES2) in March 2009. ES2 is an NEC SX-9/E system and has a quarter as many nodes, each with 12.8 times the performance (3.2 times the clock speed and four times the processing resources per node), for a peak performance of 131 TFLOPS. With a delivered LINPACK performance of 122.4 TFLOPS, ES2 was the most efficient supercomputer in the world at the time. In November 2010, NEC announced that ES2 topped the Global FFT, one of the HPC Challenge Awards metrics, with a performance figure of 11.876 TFLOPS.
ES2 was superseded by Earth Simulator 3 (ES3) in March 2015. ES3 is a NEC SX-ACE system with 5,120 nodes and 1.3 PFLOPS performance.
ES3, from 2017 to 2018, ran alongside Gyoukou, an immersion-cooled supercomputer that can reach up to 19 PFLOPS.
System Overview
Hardware
The Earth Simulator (ES for short) was developed as a national project by three government agencies: the National Space Development Agency of Japan (NASDA), the Japan Atomic Energy Research Institute (JAERI), and the Japan Marine Science and Technology Center (JAMSTEC). The ES is housed in the Earth Simulator building (approx. 50 m × 65 m × 17 m). Earth Simulator 2 (ES2) uses 160 nodes of NEC's SX-9E. A further upgrade of the Earth Simulator was completed in March 2015. The Earth Simulator 3 (ES3) system uses 5,120 nodes of NEC's SX-ACE.
System Configuration
In its ES2 configuration, the ES is a highly parallel vector supercomputer system of the distributed-memory type, consisting of 160 processor nodes connected by a fat-tree network. Each processor node is a shared-memory system consisting of 8 vector-type arithmetic processors and a 128 GB main memory system. The peak performance of each arithmetic processor is 102.4 GFLOPS. The ES as a whole thus consists of 1,280 arithmetic processors with 20 TB of main memory and a theoretical peak performance of 131 TFLOPS.
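The system totals follow directly from the per-node figures. A minimal Python sketch of the arithmetic, assuming 1 TB = 1024 GB for memory and 1 TFLOPS = 1000 GFLOPS for performance (the function name `aggregate_specs` is hypothetical and used only for illustration):

```python
def aggregate_specs(nodes, aps_per_node, mem_per_node_gb, peak_per_ap_gflops):
    """Derive system-wide totals from the per-node figures quoted in the text."""
    total_aps = nodes * aps_per_node
    return {
        "processors": total_aps,
        "memory_tb": nodes * mem_per_node_gb / 1024,           # assumes 1 TB = 1024 GB
        "peak_tflops": total_aps * peak_per_ap_gflops / 1000,  # assumes 1 TFLOPS = 1000 GFLOPS
    }

# 160 nodes, 8 arithmetic processors and 128 GB per node, 102.4 GFLOPS per processor
specs = aggregate_specs(160, 8, 128, 102.4)
print(specs)
```

The result gives 1,280 processors, 20 TB of memory, and a peak of about 131 TFLOPS, matching the totals stated above.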
CPU Construction
Each CPU consists of a 4-way superscalar unit (SU), a vector unit (VU), and a main memory access control unit on a single LSI chip. The CPU runs at a clock frequency of 3.2 GHz. Each VU has 72 vector registers, each of which holds 256 vector elements, along with 8 sets of six different types of vector pipelines: add/shift, multiply, divide, logical operations, masking, and load/store. Vector pipelines of the same type work together on a single vector instruction, and pipelines of different types can operate concurrently.
Processor Node (PN)
The processor node consists of 8 CPUs and 10 memory modules.
Interconnection Network (IN)
The remote access control unit (RCU) of each node is directly connected to the crossbar switches and controls inter-node data communications at a bidirectional transfer rate of 64 GB/s for both sending and receiving data. The total bandwidth of the inter-node network is therefore approximately 10 TB/s.
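The quoted total is consistent with multiplying the per-node transfer rate by the number of nodes. A short sketch of that check, assuming the 160-node configuration described earlier and assuming the total is simply the sum of the per-node rates (the source does not spell out the derivation):

```python
NODES = 160                # processor nodes, as stated in the system configuration
RATE_PER_NODE_GB_S = 64    # per-node transfer rate quoted in the text

# Assumption: total bandwidth is the plain sum over all nodes.
total_gb_s = NODES * RATE_PER_NODE_GB_S
total_tb_s = total_gb_s / 1024   # assumes 1 TB = 1024 GB

print(f"{total_gb_s} GB/s = {total_tb_s:.0f} TB/s")
```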
Processor Node (PN) Cabinet
Each cabinet houses two processor nodes. A cabinet consists of the power supply unit, 8 memory modules, and a PCI box with 8 CPU modules.
Software
The following is a description of the software technologies used in the ES2 operating system, job scheduling, and programming environment.
Operating system
The operating system that runs on ES, "Earth Simulator Operating System", is a customized version of NEC's SUPER-UX used for the NEC SX supercomputers that make up ES.
Mass Storage File System
If a large parallel job running on 640 PNs reads from or writes to a single disk installed on one PN, each PN accesses the disk in sequence and performance degrades severely. Local I/O, in which each PN reads from or writes to its own disk, solves the problem, but managing such a large number of partial files is very difficult. ES therefore adopts staging and a Global File System (GFS), which offers high-speed I/O performance.
Job scheduling
ES is basically a batch-job system. Network Queuing System II (NQSII) is used to manage the batch jobs. ES has two types of queues: the S batch queue is designed for single-node batch jobs, and the L batch queue is for multi-node batch jobs. The S batch queue is intended for pre-runs or post-runs of large-scale batch jobs (preparing initial data, processing simulation results, and other processing), and the L batch queue is for production runs. Users choose the appropriate queue for their job. Scheduling follows two strategies:
- (1) Nodes allocated to a batch job are used exclusively for that batch job.
- (2) The batch job is scheduled based on elapsed time rather than CPU time.
Strategy (1) makes it possible to estimate the job termination time and makes it easy to allocate nodes for subsequent batch jobs in advance. Strategy (2) contributes to efficient job execution. The job can use its nodes exclusively, and the processes on each node can run simultaneously. As a result, large-scale parallel programs can be run efficiently. PNs of the L system are prohibited from accessing the user disk, to ensure sufficient disk I/O performance. Therefore, the files used by a batch job are copied from the user disk to the work disk before job execution. This process is called "stage-in". It is important to hide this staging time in job scheduling. The main steps of job scheduling are summarized below:
- Node allocation
- Stage-in (copy user disk files to work disk automatically)
- Job escalation (rescheduling for an earlier estimated start time if possible)
- Job execution
- Stage-out (copy work disk files to the user's disk automatically)
When a new batch job is submitted, the scheduler searches for available nodes (Step 1). Once the nodes and an estimated start time have been allocated to the batch job, stage-in processing begins (Step 2). After stage-in is complete, the job waits until the estimated start time. If the scheduler finds a start time earlier than the estimated start time, it assigns the new start time to the batch job. This process is called "job escalation" (Step 3). When the estimated start time arrives, the scheduler runs the batch job (Step 4). The scheduler terminates the batch job and starts stage-out processing after the job finishes or the declared elapsed time is over (Step 5).
To run a batch job, the user logs in to the login server and submits the batch script to ES. The user then waits until the job has finished. During that time, the user can view the status of the batch job using a conventional web browser or user commands. Node scheduling, file staging, and other processing are handled automatically by the system according to the batch script.
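The steps above can be sketched as a small first-come, first-served planner. This is an illustrative model only, not the NQSII implementation: the `Job` class, the `plan` function, and the time unit (hours) are all hypothetical. Stage-out is not modeled, and job escalation is approximated by re-running the planner once an earlier job is known to have finished ahead of its declared elapsed time.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Job:
    name: str
    nodes: int                      # number of PNs requested
    elapsed: float                  # declared elapsed-time limit (hours)
    stage_in: float                 # time to copy input files to the work disk
    submitted: float = 0.0
    actual: Optional[float] = None  # real run time, once known
    start: Optional[float] = None   # estimated start time set by the planner

    @property
    def queue(self) -> str:
        # S queue for single-node jobs, L queue for multi-node jobs
        return "S" if self.nodes == 1 else "L"

def plan(jobs, total_nodes):
    """Assign an estimated start time to each job, in submission order."""
    free_at = [0.0] * total_nodes   # time at which each node becomes free
    for job in jobs:
        if job.nodes > total_nodes:
            raise ValueError(f"{job.name} requests more nodes than exist")
        # Step 1: reserve the nodes that become free earliest (exclusive use).
        chosen = sorted(range(total_nodes), key=lambda i: free_at[i])[:job.nodes]
        nodes_ready = max(free_at[i] for i in chosen)
        # Step 2: stage-in overlaps the wait, so it delays the start only
        # when it takes longer than waiting for the nodes.
        job.start = max(nodes_ready, job.submitted + job.stage_in)
        # Steps 4-5: nodes are held until the declared elapsed time is over,
        # or until the job ends if it is known to have finished earlier.
        run_time = job.elapsed if job.actual is None else min(job.actual, job.elapsed)
        for i in chosen:
            free_at[i] = job.start + run_time
    return jobs

a = Job("A", nodes=4, elapsed=10.0, stage_in=1.0)
b = Job("B", nodes=2, elapsed=5.0, stage_in=2.0)
plan([a, b], total_nodes=4)
first_estimate = b.start        # B must wait for A's declared end
a.actual = 4.0                  # A finishes early
plan([a, b], total_nodes=4)     # Step 3: re-planning moves B to an earlier start
print(first_estimate, b.start)
```

Because scheduling is based on declared elapsed time and nodes are reserved exclusively, the planner can give every job a firm estimated start as soon as it is submitted, which is what lets stage-in run while the job is still waiting.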
Programming environment
ES programming model
The ES hardware has a three-level hierarchy of parallelism: vector processing on an arithmetic processor (AP), shared-memory parallel processing within a PN, and parallel processing among PNs via the IN. To get the most out of ES, you must develop parallel programs that take full advantage of that parallelism. The three-level hierarchy can be used in two ways, called hybrid and flat parallelization. In hybrid parallelization, inter-node parallelism is expressed with HPF or MPI and intra-node parallelism with microtasking or OpenMP, so you need to consider the hierarchical parallelism when writing your programs. In flat parallelization, both inter-node and intra-node parallelism are expressed with HPF or MPI, and you do not need to consider such complicated parallelism. Generally speaking, hybrid parallelization is superior to flat parallelization in performance, and flat parallelization is superior in ease of programming. Note that the MPI libraries and HPF runtimes are optimized to perform as well as possible in both hybrid and flat parallelization.
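To make the three levels concrete, here is a minimal sketch of how a one-dimensional index range might be divided first among PNs, then among the 8 APs of each PN, and finally into vector-length strips. The block decomposition, the function names, and the strip length of 256 (taken from the vector register length quoted earlier) are illustrative assumptions, not the scheme actually used by the ES compilers or libraries.

```python
def split(n, parts):
    """Split n items into `parts` contiguous (start, stop) ranges, as evenly as possible."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

def decompose(n, nodes, aps_per_node=8, vector_length=256):
    """Map each (PN, AP) pair to the list of vector strips it would process."""
    layout = {}
    for pn, (n0, n1) in enumerate(split(n, nodes)):                    # level 3: across PNs
        for ap, (a0, a1) in enumerate(split(n1 - n0, aps_per_node)):   # level 2: across APs
            lo, hi = n0 + a0, n0 + a1
            layout[(pn, ap)] = [                                       # level 1: vector strips
                (s, min(s + vector_length, hi)) for s in range(lo, hi, vector_length)
            ]
    return layout

layout = decompose(10_000, nodes=2)
print(layout[(0, 0)])
```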
Languages
Compilers for Fortran 90, C, and C++ are available. All of them have advanced automatic vectorization and microtasking capabilities. Microtasking is a form of multitasking that was originally provided on Cray supercomputers and is also used for intra-node parallelization on ES. Microtasking can be controlled by inserting directives into source programs or by using the compiler's automatic parallelization. (Note that OpenMP is also available in Fortran 90 and C++ for intra-node parallelization.)
Parallelization
Message Passing Interface (MPI)
MPI is a message passing library based on the MPI-1 and MPI-2 standards. It provides a high-speed communication capability that takes full advantage of the features of IXS and shared memory. It can be used for both intra-node and inter-node parallelization. An MPI process is assigned to an AP in flat parallelization, or to a PN that contains microtasks or OpenMP threads in hybrid parallelization. The MPI libraries are carefully designed and optimized to achieve the highest communication performance on the ES architecture in both modes of parallelization.
High Performance Fortran (HPF)
The principal users of ES are considered to be natural scientists who are not necessarily familiar with parallel programming, or who dislike it. Consequently, there is great demand for a higher-level parallel language. HPF/SX provides easy and efficient parallel programming on ES to meet this demand. It supports the HPF 2.0 specification, its approved extensions, HPF/JA, and some extensions unique to ES.
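The process mapping described above determines how many MPI processes a job starts and how many threads each one runs. A minimal sketch, assuming 8 APs per PN as stated earlier; the function name `mpi_layout` and the dictionary keys are hypothetical and do not correspond to any ES tool:

```python
def mpi_layout(nodes, mode, aps_per_node=8):
    """Return the process/thread layout for flat or hybrid parallelization."""
    if mode == "flat":
        # One MPI process per AP; no intra-node threading.
        return {"mpi_processes": nodes * aps_per_node, "threads_per_process": 1}
    if mode == "hybrid":
        # One MPI process per PN; microtasks or OpenMP threads fill its APs.
        return {"mpi_processes": nodes, "threads_per_process": aps_per_node}
    raise ValueError("mode must be 'flat' or 'hybrid'")

flat = mpi_layout(160, "flat")
hybrid = mpi_layout(160, "hybrid")
print(flat, hybrid)
```

Both layouts occupy the same 1,280 APs on a 160-node run; they differ only in whether communication inside a node goes through MPI or through shared memory.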
Tools
- Integrated development environment (PSUITE)
The integrated development environment (PSUITE) integrates various tools for developing programs that run on SUPER-UX. Because PSUITE makes these tools available through a GUI and coordinates them with one another, programs can be developed more easily and efficiently than with earlier development methods.
- Debugging support
SUPER-UX provides strong debugging support functions to assist program development.
Facilities
Earth Simulator Building Features
Protection against natural disasters
The Earth Simulator Center has several special features that help protect the computer from natural disasters and other events. A wire nest hangs over the building to protect it from lightning strikes. The nest itself uses shielded high-voltage cables to release lightning current into the ground. A special light propagation system uses halogen lamps, installed outside the shielded walls of the machine room, to prevent any magnetic interference from reaching the computers. The building is built on a seismic isolation system, made up of rubber supports, which protects the building during earthquakes.
Lightning protection system
Three basic features:
- Four poles on both sides of the Earth Simulator building form a wire nest that protects the building from lightning strikes.
- A special shielded high-voltage cable is used for the inductive wire that releases lightning current to the ground.
- Ground plates are laid about 10 meters away from the building.
Lighting
- Lighting: light propagation system inside tubes (255 mm diameter, 44 m (48 yd) length, 19 tubes)
- Light source: 1 kW halogen lamps
- Illumination: 300 lx at the floor on average
- The light sources are installed outside the shielded walls of the machine room.
Seismic isolation system
11 isolators (1 ft. high, 3.3 ft. diameter, 20-layer rubber, supporting the bottom of the ES building)
Performance
LINPACK
The new Earth Simulator (ES2) system, which went live in March 2009, achieved a sustained performance of 122.4 TFLOPS and a computing efficiency (*2) of 93.38% in the LINPACK Benchmark (*1).
- 1. LINPACK benchmark
LINPACK Benchmark is a measure of a computer's performance and is used as a standard benchmark for ranking computer systems in the TOP500 project. LINPACK is a program for performing numerical linear algebra on computers.
- 2. Computing efficiency
Computing efficiency is the ratio of sustained performance to peak computing performance. Here, it is the ratio of 122.4 TFLOPS to 131.072 TFLOPS.
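The quoted efficiency can be reproduced from the two figures. A minimal sketch (the function name `computing_efficiency` is hypothetical):

```python
def computing_efficiency(sustained_tflops, peak_tflops):
    """Ratio of sustained performance to peak computing performance."""
    return sustained_tflops / peak_tflops

efficiency = computing_efficiency(122.4, 131.072)
print(f"{efficiency:.2%}")  # prints 93.38%
```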
WRF computational performance in Earth Simulator
WRF (Weather Research and Forecasting Model) is a mesoscale weather simulation code that has been developed in collaboration among US institutions, including NCAR (National Center for Atmospheric Research) and NCEP (National Centers for Environmental Prediction). JAMSTEC optimized WRFV2 on the Earth Simulator as renewed in 2009 (ES2) and measured its computational performance. As a result, it was successfully demonstrated that WRFV2 can run on ES2 with outstanding and sustained performance.
A numerical weather simulation was performed using WRF on the Earth Simulator for one hemisphere of the Earth under the Nature Run model condition. The spatial resolution of the model is 4486 by 4486 grid points horizontally, with a grid spacing of 5 km, and 101 levels vertically. Mostly adiabatic conditions were applied, with a time integration step of 6 seconds. Very high performance was achieved on the Earth Simulator for high-resolution WRF. Although the number of CPU cores used is only 1% of that of the world's fastest class system, Jaguar (Cray XT5) at Oak Ridge National Laboratory, the sustained performance obtained on the Earth Simulator is almost 50% of that measured on the Jaguar system. The ratio of sustained to peak performance on the Earth Simulator, 22.2%, is also a record high.
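The size of this run can be estimated from the stated grid and time step. A short sketch of the arithmetic; the domain width is approximated as the number of grid points times the grid spacing, which is an assumption about how the grid is laid out:

```python
NX, NY, NZ = 4486, 4486, 101   # horizontal grid points and vertical levels
GRID_SPACING_KM = 5
TIME_STEP_S = 6

grid_points = NX * NY * NZ                 # total number of grid points
domain_width_km = NX * GRID_SPACING_KM     # approximate horizontal extent (assumption)
steps_per_day = 24 * 3600 // TIME_STEP_S   # time steps per simulated day

print(f"{grid_points:,} grid points")
print(f"about {domain_width_km:,} km across")
print(f"{steps_per_day:,} time steps per simulated day")
```

The model therefore updates roughly two billion grid points 14,400 times for each simulated day.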