A balance of peak vs per-core performance

The arrival of AMD’s 3rd Generation EPYC processor family, using the new Zen 3 core, has been highly anticipated. The promise of a new processor core microarchitecture, updates to connectivity and new security options, all while retaining platform compatibility, is a good measure of an enterprise platform update, but the One True Metric is platform performance. Seeing Zen 3 take per-core performance leadership in the consumer market back in November raised expectations for a similar result in the enterprise market, and today we get to see those results.

AMD EPYC 7003: 64 cores from Milan

The headline number that AMD is promoting with the new generation of hardware is an increase in raw performance throughput of +19%, due to improvements in the new core design. In addition, AMD has new security features, optimizations for different memory configurations, and updated Infinity Fabric and connectivity performance.


3rd Generation EPYC

Anyone looking at the shorthand specifications of the new EPYC 7003 series, known by its codename Milan, will see a great deal of familiarity with the previous generation. This time around, however, AMD is targeting several different design points.

Milan processors will offer up to 64 cores and 128 threads, using AMD’s latest Zen 3 cores. The processor is designed with eight chiplets of eight cores each, similar to Rome, but this time all eight cores in a chiplet are connected to one another, which effectively doubles the L3 cache each core can reach and lowers overall cache latency. All processors will have 128 PCIe 4.0 lanes and eight memory channels, with most models supporting dual-processor connectivity, and new options for memory channel optimization are available. All Milan processors should be drop-in compatible with Rome series platforms after a firmware update.
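The chiplet arithmetic in this paragraph can be checked with a short sketch. The per-complex figures (4C + 8 MB, 4C + 16 MB, 8C + 32 MB) and maximum core counts come from the generation table in this article; the number of complexes per socket is derived here by simple division rather than quoted from AMD, and all names in the code are illustrative.

```python
# Per-socket totals derived from the "Core Complex" figures in this article.
# The dictionary layout and function name are illustrative, not an AMD tool.
GENERATIONS = {
    "Naples (Zen)": {"max_cores": 32, "ccx_cores": 4, "ccx_l3_mb": 8},
    "Rome (Zen 2)": {"max_cores": 64, "ccx_cores": 4, "ccx_l3_mb": 16},
    "Milan (Zen 3)": {"max_cores": 64, "ccx_cores": 8, "ccx_l3_mb": 32},
}


def socket_summary(spec):
    """Return complexes per socket, total L3, and the L3 shared within one complex."""
    complexes = spec["max_cores"] // spec["ccx_cores"]
    return {
        "complexes": complexes,
        "total_l3_mb": complexes * spec["ccx_l3_mb"],
        "l3_per_complex_mb": spec["ccx_l3_mb"],
    }


for name, spec in GENERATIONS.items():
    print(name, socket_summary(spec))
```

Under these figures, Rome and Milan both total 256 MB of L3 per socket, but Milan groups it into 8 complexes of 32 MB instead of 16 complexes of 16 MB, so each core shares a pool twice as large.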

AMD EPYC: Generation to Generation

| AnandTech | EPYC 7001 | EPYC 7002 | EPYC 7003 |
|---|---|---|---|
| Codename | Naples | Rome | Milan |
| Microarchitecture | Zen | Zen 2 | Zen 3 |
| Manufacturing Process | 14 nm | 7 nm | 7 nm |
| Max Cores / Threads | 32 / 64 | 64 / 128 | 64 / 128 |
| Core Complex | 4C + 8 MB | 4C + 16 MB | 8C + 32 MB |
| Memory Support | 8 x DDR4-2666 | 8 x DDR4-3200 | 8 x DDR4-3200 |
| Memory Capacity | 2 TB | 4 TB | 4 TB |
| PCIe | 3.0 x128 | 4.0 x128 | 4.0 x128 |
| Security | SME, SEV | SME, SEV | SME, SEV, SNP |
| Peak Power | 180 W | 240 W* | 280 W |

* Rome introduced 280 W mid-cycle for special HPC customers

One of the highlights here is that the new generation will offer 280 W processors to all customers. The previous generation had 240 W models for everyone and 280 W models only for specific HPC customers; this time, all customers can deploy these high-performance parts with the new core design.

This is exemplified if we directly compare the processors at the top of each stack:

2P Top of Stack GA Offerings

| AnandTech | EPYC 7001 | EPYC 7002 | EPYC 7003 | Intel Xeon |
|---|---|---|---|---|
| Processor | 7601 | 7742 | 7763 | 6258R |
| uArch | Zen | Zen 2 | Zen 3 | Cascade Lake |
| Cores | 32 | 64 | 64 | 28 |
| TDP | 180 W | 240 W | 280 W | 205 W |
| Base Freq (MHz) | 2200 | 2250 | 2450 | 2700 |
| Turbo Freq (MHz) | 3200 | 3400 | 3500 | 4000 |
| L3 Cache | 64 MB | 256 MB | 256 MB | 37.5 MB |
| PCIe | 3.0 x128 | 4.0 x128 | 4.0 x128 | 3.0 x48 |
| DDR4 | 8 x 2666 | 8 x 3200 | 8 x 3200 | 6 x 2933 |
| DRAM Cap | 2 TB | 4 TB | 4 TB | 1 TB |
| Price | $4200 | $6950 | $7890 | $3950 |

AMD’s new top processor is the EPYC 7763, a 64-core processor at 280 W TDP that offers a 2.45 GHz base frequency and a 3.50 GHz boost frequency. AMD claims that this processor offers +106% performance in industry benchmarks compared to Intel’s best 2P 28-core processor, the Gold 6258R, and +17% over its previous-generation 280 W part, the 7H12.
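As a rough illustration of how the table figures and AMD’s claim relate, the sketch below derives price per core and TDP per core for each top-of-stack part, plus the per-core throughput ratio that a +106% whole-processor gain would imply if throughput is simply divided by core count. The list prices, TDPs and core counts are the article’s; the class, function names and the even-division assumption are ours, and TDP is a rating rather than measured power.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    """Top-of-stack figures as listed in this article (illustrative container)."""
    name: str
    cores: int
    tdp_w: int
    price_usd: int

    @property
    def usd_per_core(self) -> float:
        return self.price_usd / self.cores

    @property
    def watts_per_core(self) -> float:
        return self.tdp_w / self.cores


TOP_OF_STACK = [
    Cpu("EPYC 7601", 32, 180, 4200),
    Cpu("EPYC 7742", 64, 240, 6950),
    Cpu("EPYC 7763", 64, 280, 7890),
    Cpu("Xeon Gold 6258R", 28, 205, 3950),
]


def implied_per_core_ratio(socket_gain: float, cores_new: int, cores_old: int) -> float:
    """Per-core throughput ratio implied by a whole-processor gain.

    socket_gain is fractional, so +106% is 1.06. Assumes throughput is
    divided evenly across cores, which real workloads only approximate.
    """
    return (1.0 + socket_gain) * cores_old / cores_new


for cpu in TOP_OF_STACK:
    print(f"{cpu.name:<16} ${cpu.usd_per_core:7.2f}/core  {cpu.watts_per_core:5.2f} W/core")

ratio = implied_per_core_ratio(1.06, cores_new=64, cores_old=28)
print(f"Implied per-core throughput, 7763 vs 6258R: {ratio:.2f}x")
```

Taken at face value, the +106% claim works out to roughly 0.90x per core for the 64-core part against the 28-core part, which illustrates why a whole-processor lead and a per-core lead are separate questions.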

Peak performance vs per-core performance

One of AMD’s angles with the new Milan generation is targeted performance metrics. The company is not simply going after ‘peak’ numbers, but is also taking a broader view for customers who need high per-core performance, especially for software that is invariably limited by per-core performance or licensed per core. With that in mind, AMD’s F-series of ‘fast’ processors is now being crystallized in the stack.

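AMD EPYC 7003 F-Series Processors

| F-Series | Cores / Threads | Base Freq (MHz) | Turbo Freq (MHz) | L3 (MB) | TDP | Price |
|---|---|---|---|---|---|---|
| EPYC 75F3 | 32 / 64 | 2950 | 4000 | 256 (8 x 32) | 280 W | $4860 |
| EPYC 74F3 | 24 / 48 | 3200 | 4000 | 256 (8 x 32) | 240 W | $2900 |
| EPYC 73F3 | 16 / 32 | 3500 | 4000 | 256 (8 x 32) | 240 W | $3521 |
| EPYC 72F3 | 8 / 16 | 3700 | 4100 | 256 (8 x 32) | 180 W | $2468 |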

These processors have the highest single-threaded frequencies of anything in AMD’s offering, along with the full 256 MB of L3 cache, and in our results they post better per-thread scores than anything else we have tested for the enterprise across x86 and Arm; more details are in the review. The F-series processors will come at a slight premium over the others.
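To make the per-core angle concrete, here is a minimal sketch that works out L3 cache per core and CPU price per core for the F-series parts listed above, and adds a per-core software license on top. The core counts, cache sizes and prices are the article’s figures; `LICENSE_USD_PER_CORE` is a made-up placeholder rather than a real software price, and the function name is illustrative.

```python
# F-series figures as listed in this article.
F_SERIES = {
    "EPYC 75F3": {"cores": 32, "l3_mb": 256, "price_usd": 4860},
    "EPYC 74F3": {"cores": 24, "l3_mb": 256, "price_usd": 2900},
    "EPYC 73F3": {"cores": 16, "l3_mb": 256, "price_usd": 3521},
    "EPYC 72F3": {"cores": 8, "l3_mb": 256, "price_usd": 2468},
}

# Hypothetical per-core license fee, chosen only to show the arithmetic.
LICENSE_USD_PER_CORE = 1000


def per_core_view(spec, license_usd_per_core=LICENSE_USD_PER_CORE):
    """Return L3 per core, CPU price per core, and CPU plus license cost."""
    cores = spec["cores"]
    return {
        "l3_mb_per_core": spec["l3_mb"] / cores,
        "cpu_usd_per_core": spec["price_usd"] / cores,
        "cpu_plus_license_usd": spec["price_usd"] + cores * license_usd_per_core,
    }


for name, spec in F_SERIES.items():
    print(name, per_core_view(spec))
```

With a per-core fee in this range, the license quickly outweighs the processor price, which is the situation where paying more per core for a faster, cache-rich part can make sense.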

AMD EPYC: The Tour of Italy

The first generation of EPYC was launched in June 2017. At that time, AMD was essentially a phoenix: rising from the ashes of its old Opteron business and promising to return to high-performance computing with a new processor design philosophy.

At the time, the traditional enterprise customer base was not initially convinced. AMD’s previous push into the enterprise space with a paradigm-shifting new generation of processor cores, while successful, fell flat when AMD had to focus on keeping itself from going bankrupt. Opteron customers were left with no updates in sight, and the willingness to jump onto an unknown platform from a company that had stung so many in the past was low for many.

At the time, AMD put out a three-year roadmap, detailing its next generations and the path the company would take to overcome the giant with 99% market share in performance and offerings. These were seen as lofty goals, and many were content to sit back and watch others take the gamble.


1st generation EPYC launch

When the first generation, Naples, was launched, it offered some impressive performance figures. It didn’t quite compete in all areas and, as with any new platform, there were some teething issues to begin with. AMD kept the initial cycle to a few of its major OEM partners before slowly broadening the ecosystem. Naples was the first platform to offer extensive PCIe 3.0 and a lot of memory support, and it was initially positioned for storage-heavy or PCIe-heavy deployments.


2nd Generation EPYC Launch

The second generation, Rome, launched in August 2019 (+26 months), generated much more fanfare. AMD’s newest Zen 2 core was competitive in the consumer space, and there were a number of major design changes to the SoC layout (such as moving to a flat NUMA design) that encouraged several skeptics to start evaluating the platform. Such was the interest that AMD even told us it had to be selective about which OEM platforms it would assist before the official launch. Rome’s performance was good and it scored some high-profile supercomputer wins, but perhaps more importantly, it showed that AMD was able to execute on the roadmap it laid out in June 2017.

This flat SoC architecture, along with the updated Zen 2 processor core (which actually borrowed elements from Zen 3) and PCIe 4.0, allowed AMD to start competing on performance as well as simply on IO. AMD’s OEM partners have consistently advertised Rome processors as compute platforms, often replacing two 28-core Intel processors with one 64-core AMD processor that also has superior memory support and more PCIe lanes. This also allows for greater compute density, and AMD was in a position to help drive software optimizations for its platform as well, extracting performance but also moving toward parity in the edge cases for which its competitor’s hardware was heavily optimized. All the major hyperscalers have also evaluated and deployed AMD-based offerings for their customers, as well as internally. AMD’s seal of approval was practically there.


3rd generation EPYC CPU

And so today AMD is continuing that tour of Italy with a trip to Milan, about 19 months after Rome. The layout of the underlying SoC is the same as that of Rome, but we have higher performance on the table, with added security and more configuration options. Hyperscalers have already had final hardware for six months for their deployments, and AMD is now in a position to help enable more OEM platforms at launch. Milan is drop-in compatible with Rome, which certainly helps, but with Milan covering more optimization points, AMD believes it is in a better position than ever before to reach more of the market with both high-performance processors and high per-core performance processors.

AMD sees the launch of Milan as the third step in the roadmap that was shown in June 2017, and as validation of its ability both to execute reliably for its customers and to offer them performance gains above the industry standard.

The next stop on the Italy tour is Genoa, set to use AMD’s next Zen 4 microarchitecture. AMD also said that Zen 5 is on its way.

Competition

AMD is launching this new generation of Milan processors approximately 19 months after the launch of Rome. In that time, we have seen the launch of both Amazon’s Graviton2 and Ampere’s Altra, built on Arm’s Neoverse N1 family of cores.

Milan Top-of-Stack Competition

| AnandTech | EPYC 7003 | Amazon Graviton2 | Ampere Altra | Intel Xeon |
|---|---|---|---|---|
| Platform | Milan | Graviton2 | Quicksilver | Cascade Lake |
| Processor | 7763 | Graviton2 | Q80-33 | 6258R |
| uArch | Zen 3 | N1 | N1 | Cascade Lake |
| Cores | 64 | 64 | 80 | 28 |
| TDP | 280 W | ? | 250 W | 205 W |
| Base Freq (MHz) | 2450 | 2500 | 3300 | 2700 |
| Turbo Freq (MHz) | 3500 | 2500 | 3300 | 4000 |
| L3 Cache | 256 MB | 32 MB | 32 MB | 37.5 MB |
| PCIe | 4.0 x128 | ? | 4.0 x128 | 3.0 x48 |
| DDR4 | 8 x 3200 | 8 x 3200 | 8 x 3200 | 6 x 2933 |
| DRAM Cap | 4 TB | ? | 4 TB | 1 TB |
| Price | $7890 | N/A | $4050 | $3950 |

Intel, for its part, has divided its efforts between big-socket and small-socket configurations. For big-socket systems (4+), there is Cooper Lake, a Skylake derivative for select customers only. For smaller socket configurations (1-2), Intel is set to launch its 10nm Ice Lake portfolio at some point this year, but remains silent on exact dates. As a result, all we have to compare Milan against is Intel’s Cascade Lake Xeon Scalable platform, which is the same platform we compared Rome against.

Interesting times, for sure.

This review

For this review, AMD gave us remote access to several identical servers with different processor configurations. We focused our efforts on the top-of-stack EPYC 7763, a 64-core 280 W processor; the EPYC 7713, a 64-core 225 W processor; and the EPYC 75F3, a 32-core 280 W processor designed as the Milan halo processor for per-core performance.

On the next page, we’ll go through AMD’s Milan processor stack and how it compares with Rome and with current Intel offerings. We then go through our test systems, our SoC structure testing (cache, core-to-core, bandwidth), processor power, and finally our full benchmarks.

  1. This page, the overview
  2. Milan processor offerings
  3. Test bench settings, compiler options
  4. Topology, memory subsystem and latency
  5. Processor power: Core vs IO
  6. SPEC: Multi-Thread Performance
  7. SPEC: Single Thread Performance
  8. SPEC: Per-Core Performance Win for 75F3
  9. SPECjbb MultiJVM: Java Performance
  10. Compilation and compute benchmarks
  11. Conclusions and final observations

