Tag Archives: AVX2

AnandTech | The Haswell Review: Intel Core i7-4770K & i5-4670K Tested

The Launch Lineup: Quad Cores For All

As was the case with the launch of Ivy Bridge last year, Intel is initially launching its high-end quad-core parts, and as the year goes on it will progressively roll out dual cores, low-voltage parts, and other lower-end parts. That means the bigger notebooks and naturally the performance desktops will arrive first, followed by the ultraportables, Ultrabooks and more affordable desktops. One change, however, is that Intel will be launching its first BGA (non-socketed) Haswell part right away, the Iris Pro equipped i7-4770R.

Intel 4th Gen Core i7 Desktop Processors
Model | Core i7-4770K | Core i7-4770 | Core i7-4770S | Core i7-4770T | Core i7-4770R | Core i7-4765T
Cores/Threads | 4/8 | 4/8 | 4/8 | 4/8 | 4/8 | 4/8
CPU Base Freq (GHz) | 3.5 | 3.4 | 3.1 | 2.5 | 3.2 | 2.0
Max Turbo (GHz) | 3.9 (Unlocked) | 3.9 | 3.9 | 3.7 | 3.9 | 3.0
Test TDP | 84W | 84W | 65W | 45W | 65W | 35W
HD Graphics | 4600 | 4600 | 4600 | 4600 | Iris Pro 5200 | 4600
GPU Max Clock (MHz) | 1250 | 1200 | 1200 | 1200 | 1300 | 1200
L3 Cache | 8MB | 8MB | 8MB | 8MB | 6MB | 8MB
DDR3 Support | 1333/1600 | 1333/1600 | 1333/1600 | 1333/1600 | 1333/1600 | 1333/1600
vPro/TXT/VT-d/SIPP | No | Yes | Yes | Yes | No | Yes
Package | LGA-1150 | LGA-1150 | LGA-1150 | LGA-1150 | BGA | LGA-1150
Price | $339 | $303 | $303 | $303 | OEM | $303

Starting at the top of the product and performance stack, we have the desktop Core i7 parts. All of these CPUs feature Hyper-Threading Technology, so they follow the same formula of four physical cores plus four virtual cores that we’ve seen since Bloomfield hit the scene. The fastest chip for most purposes remains the K-series 4770K, with its unlocked multiplier and slightly higher base clock speed. Base core clocks as well as maximum Turbo Boost clocks are basically dictated by the TDP: the 4770S will most likely have a harder time sustaining maximum turbo, while the 4770T and 4765T give up quite a bit more in clock speed in order to hit substantially lower power targets.

It’s worth pointing out that the highest “Test TDP” values are up slightly relative to the last generation Ivy Bridge equivalents—84W instead of 77W. Mobile TDPs are a different matter, and as we’ll discuss elsewhere they’re all 2W higher, but that is further offset by the improved idle power consumption Haswell brings.

Nearly all of these are GT2 graphics configurations (20 EUs), so they should be slightly faster than the last generation HD 4000 in graphics workloads. The one exception is the i7-4770R, which is also the only chip that comes in a BGA package. The reasoning here is simple: if you want the fastest iGPU configuration (GT3e with 40 EUs and embedded DRAM), you’re probably not going to have a discrete GPU and will most likely be purchasing an OEM desktop. Interestingly, the 4770R also drops the L3 cache down to 6MB, and it’s not clear whether this is due to it having no real benefit (i.e. the eDRAM may function as an even larger L4 cache), or if it’s to reduce power use slightly, or Intel may have a separate die for this particular configuration. Then again, maybe Intel is just busily creating a bit of extra market segmentation.

Not included in the above table are the features common to the entire Core i7 line: AVX2 instructions, Quick Sync, AES-NI, PCIe 3.0, and Intel Virtualization Technology. As we’ve seen in the past, the K-series parts (and now the R-series as well) omit support for vPro, TXT, VT-d, and SIPP. The 4770K is an enthusiast part with overclocking support, so that makes some sense, but the 4770R doesn’t really have the same qualification. Presumably it’s intended for the consumer market, as businesses are less likely to need the Iris Pro graphics.

Intel 4th Gen Core i5 Desktop Processors
Model | Core i5-4670K | Core i5-4670 | Core i5-4670S | Core i5-4670T | Core i5-4570 | Core i5-4570S
Cores/Threads | 4/4 | 4/4 | 4/4 | 4/4 | 4/4 | 4/4
CPU Base Freq (GHz) | 3.4 | 3.4 | 3.1 | 2.3 | 3.2 | 2.9
Max Turbo (GHz) | 3.8 (Unlocked) | 3.8 | 3.8 | 3.3 | 3.6 | 3.6
Test TDP | 84W | 84W | 65W | 45W | 84W | 65W
HD Graphics | 4600 | 4600 | 4600 | 4600 | 4600 | 4600
GPU Max Clock (MHz) | 1200 | 1200 | 1200 | 1200 | 1150 | 1150
L3 Cache | 6MB | 6MB | 6MB | 6MB | 6MB | 6MB
DDR3 Support | 1333/1600 | 1333/1600 | 1333/1600 | 1333/1600 | 1333/1600 | 1333/1600
vPro/TXT/VT-d/SIPP | No | Yes | Yes | Yes | Yes | Yes
Package | LGA-1150 | LGA-1150 | LGA-1150 | LGA-1150 | LGA-1150 | LGA-1150
Price | $242 | $213 | $213 | $213 | $192 | $192

The Core i5 lineup basically rehashes the above story, only now without Hyper-Threading. For many users, Core i5 is the sweet spot of price and performance, delivering nearly all the performance of the i7 models at 2/3 the price. There aren’t any Iris or Iris Pro Core i5 desktop parts, at least not yet, and all of the above CPUs are using the GT2 graphics configuration. As above, the K-series part also lacks vPro/TXT/VT-d support but comes with an unlocked multiplier.

Obviously we’re still missing all of the Core i3 parts, which are likely to be dual-core once more, along with some dual-core i5 parts as well. These are probably going to come in another quarter, or at least a month or two out, as there’s no real need for Intel to launch their lower cost parts right now. Similarly, we don’t have any Celeron or Pentium Haswell derivatives launching yet, and judging by the Ivy Bridge rollout I suspect it may be a couple quarters before Intel pushes out ultra-budget Haswell chips. For now, the Ivy Bridge Celeron/Pentium parts are likely as low as Intel wants to go down the food chain for their “big core” architectures.

Read the full review @ AnandTech.

Core i7-4770K: Haswell’s Performance, Previewed

A recent trip got us access to an early sample of Intel’s upcoming Core i7-4770K. We compare its performance to Ivy Bridge- and Sandy Bridge-based processors, so you have some idea what to expect when Intel officially introduces its Haswell architecture.

We recently got our hands on a Core i7-4770K, based on Intel’s Haswell micro-architecture. It’s not final silicon, but compared to earlier steppings (and earlier drivers), we’re comfortable enough about the way this chip performs to preview it against the Ivy and Sandy Bridge designs.

Presentations at last year’s Developer Forum in San Francisco taught us as much as there is to know about the Haswell architecture itself. But as we get closer to the official launch, more details become known about how Haswell will materialize into actual products. Fortunately for us, some of the first CPUs based on Intel’s newest design will be aimed at enthusiasts.

Fourth-Generation Intel Core Desktop Line-Up
Model | Cores / Threads | TDP (W) | Clock Rate | 1 Core | 2 Cores | 3 Cores | 4 Cores | L3 | GPU | Max. GPU Clock | TSX
i7-4770K | 4 / 8 | 84 | 3.5 GHz | 3.9 GHz | 3.9 GHz | 3.8 GHz | 3.7 GHz | 8 MB | GT2 | 1.25 GHz | No
i7-4770 | 4 / 8 | 84 | 3.4 GHz | 3.9 GHz | 3.9 GHz | 3.8 GHz | 3.7 GHz | 8 MB | GT2 | 1.2 GHz | Yes
i5-4670K | 4 / 4 | 84 | 3.4 GHz | 3.8 GHz | 3.8 GHz | 3.7 GHz | 3.6 GHz | 6 MB | GT2 | 1.2 GHz | No
i5-4670 | 4 / 4 | 84 | 3.4 GHz | 3.8 GHz | 3.8 GHz | 3.7 GHz | 3.6 GHz | 6 MB | GT2 | 1.2 GHz | Yes
i5-4570 | 4 / 4 | 84 | 3.2 GHz | 3.6 GHz | 3.6 GHz | 3.5 GHz | 3.4 GHz | 6 MB | GT2 | 1.15 GHz | Yes
i5-4430 | 4 / 4 | 84 | 3 GHz | 3.2 GHz | 3.2 GHz | 3.1 GHz | 3 GHz | 6 MB | GT2 | 1.1 GHz | No
i7-4770S | 4 / 8 | 65 | 3.1 GHz | 3.9 GHz | 3.8 GHz | 3.6 GHz | 3.5 GHz | 8 MB | GT2 | 1.2 GHz | Yes
i5-4570S | 4 / 4 | 65 | 2.9 GHz | 3.6 GHz | 3.5 GHz | 3.3 GHz | 3.2 GHz | 6 MB | GT2 | 1.15 GHz | Yes
i5-4670S | 4 / 4 | 65 | 3.1 GHz | 3.8 GHz | 3.7 GHz | 3.5 GHz | 3.4 GHz | 6 MB | GT2 | 1.2 GHz | Yes
i5-4430S | 4 / 4 | 65 | 2.7 GHz | 3.2 GHz | 3.1 GHz | 2.9 GHz | 2.8 GHz | 6 MB | GT2 | 1.1 GHz | No
i7-4770T | 4 / 8 | 45 | 2.5 GHz | 3.7 GHz | 3.6 GHz | 3.4 GHz | 3.1 GHz | 8 MB | GT2 | 1.2 GHz | Yes
i5-4670T | 4 / 4 | 45 | 2.3 GHz | 3.3 GHz | 3.2 GHz | 3 GHz | 2.9 GHz | 6 MB | GT2 | 1.2 GHz | Yes
i7-4765T | 4 / 8 | 35 | 2 GHz | 3 GHz | 2.9 GHz | 2.7 GHz | 2.6 GHz | 8 MB | GT2 | 1.2 GHz | Yes
i5-4570T | 2 / 4 | 35 | 2.9 GHz | 3.6 GHz | 3.3 GHz | n/a | n/a | 4 MB | GT2 | 1.15 GHz | Yes

According to Intel’s current plans, you’ll find dual- and quad-core LGA 1150 models with the GT2 graphics configuration sporting 20 execution units. There will also be dual- and quad-core socketed rPGA-based models for the mobile space, featuring the same graphics setup. Everything in the table above is LGA 1150, though. All of those models share support for two channels of DDR3-1600 at 1.5 V and 800 MHz minimum core frequencies. They also share a 16-lane PCI Express 3.0 controller, AVX2 support, and AES-NI support. Interestingly, four of the listed models do not support Intel’s new Transactional Synchronization Extensions (TSX). We’re not sure why Intel would want to differentiate its products with a feature intended to handle locking more efficiently, but that appears to be what it’s doing.

The much-anticipated GT3 graphics engine, with 40 EUs, is limited to BGA-based applications, meaning it won’t be upgradeable. Intel will have quad-core with GT3, quad-core with GT2, and dual-core with GT2 versions in ball grid array packaging. GT3 will also make an appearance in a BGA-based multi-chip package that includes a Lynx Point chipset. That’ll be a dual-core part, though.

In addition to the processors Intel plans to launch here in a few months, we’ll also be introduced to the 8-series Platform Controller Hubs, currently code-named Lynx Point. The most feature-complete version of Lynx Point will incorporate six SATA 6Gb/s ports, 14 total USB ports (six of which are USB 3.0), eight lanes of second-gen PCIe, and VGA output.

Eight-series chipsets are going to be physically smaller than their predecessors (23×22 millimeters on the desktop, rather than 27×27) with lower pin-counts. This is largely attributable to more capabilities integrated on the CPU itself. Previously, eight Flexible Display Interface lanes connected the processor and PCH. Although the processor die hosted an embedded DisplayPort controller, the VGA, LVDS, digital display interfaces, and audio were all down on the chipset. Now, the three digital ports are up in the processor, along with the audio and embedded DisplayPort. LVDS is gone altogether, as are six of the FDI lanes.

Although Dhrystone isn’t necessarily applicable to real-world performance, a lack of software already optimized for AVX2 means we need to go to SiSoftware’s diagnostic for an idea of how Haswell’s support for the instruction set might affect general integer performance in properly optimized software.

The Whetstone module employs SSE3, so Haswell’s improvements over Ivy Bridge are far more incremental.

Sandra’s Multimedia benchmark generates a 640×480 image of the Mandelbrot Set fractal using 255 iterations for each pixel, representing vectorised code that runs as close to perfectly parallel as possible.
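
To make the workload concrete, here is a minimal Python sketch of an escape-time Mandelbrot render. It is not Sandra’s code: the function names, the viewport bounds, and the small default grid are assumptions chosen for illustration. Each pixel is computed independently of every other pixel, which is why the task vectorises and parallelises so cleanly.

```python
def escape_count(c: complex, max_iter: int = 255) -> int:
    """Iterations of z = z*z + c before |z| exceeds 2, capped at max_iter."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render(width: int = 64, height: int = 48, max_iter: int = 255):
    """Return a height x width grid of escape counts.

    The viewport (real -2.5..1.0, imaginary -1.25..1.25) is an assumed
    framing, not one taken from the benchmark.
    """
    rows = []
    for py in range(height):
        im = -1.25 + 2.5 * py / (height - 1)
        row = []
        for px in range(width):
            re = -2.5 + 3.5 * px / (width - 1)
            # No pixel depends on any other, so this inner loop is the part
            # a SIMD implementation would process several lanes at a time.
            row.append(escape_count(complex(re, im), max_iter))
        rows.append(row)
    return rows
```

Calling `render(640, 480)` would match the image size described above, although pure Python will take far longer than a vectorised native implementation.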

The integer test employs the AVX2 instruction set on Intel’s Haswell-based Core i7-4770K, while the Ivy and Sandy Bridge-based processors are limited to AVX support. As you see in the red bar, the task is finished much faster on Haswell. It’s close, but not quite 2x.

Floating-point performance also enjoys a significant speed-up from Intel’s first implementation of FMA3 (AMD’s Bulldozer design supports FMA4, while Piledriver supports both the three- and four-operand versions). The Ivy and Sandy Bridge-based processors utilize AVX-optimized code paths, falling quite a bit behind at the same clock rate.
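
Beyond throughput, a fused multiply-add differs from a separate multiply and add in that the product a*b is added to c and rounded only once. The Python sketch below illustrates that single rounding by emulating it with exact rational arithmetic rather than by calling the hardware instruction; the helper name `fma_emulated` and the input values are assumptions picked for illustration, and nothing here measures FMA3 performance.

```python
from fractions import Fraction


def fma_emulated(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly, then round once to the nearest double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Separate multiply and add: a*b is rounded to 1.0 before c is added.
unfused = a * b + c

# Fused: the exact product 1 - 2**-60 is kept until the final rounding.
fused = fma_emulated(a, b, c)
```

With these inputs the unfused result loses the small term entirely, while the fused result retains it.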

Why do doubles seem to speed up so much more than floats on Haswell? The code path for FMA3 is actually latency-bound. If we turn off FMA3 support altogether in Sandra’s options and use AVX instead, the scaling proves similar.

All three of these chips feature AES-NI support, and we know from past reviews that because Sandra runs entirely in hardware, our platforms are processing instructions as fast as they’re sent from memory. The Core i7-4770K’s slight disadvantage in our AES256 test is indicative of slightly less throughput—something I’m comfortable chalking up to the early status of our test system.

Meanwhile, SHA2-256 performance is all about each core’s compute performance. So, the IPC improvements that go into Haswell help propel it ahead of Ivy Bridge, which is in turn faster than Sandy Bridge.

The memory bandwidth module confirms our findings in the Cryptography benchmark. All three platforms are running 1,600 MT/s data rates; the Haswell-based machine just looks like it needs a little tuning.
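
As a point of reference for these results, the theoretical peak of the memory configuration can be worked out directly. The sketch below assumes two channels, as described for these models, and the standard 64-bit (8-byte) DDR3 channel width; the function name is made up for illustration.

```python
def peak_dram_bandwidth_gbs(mega_transfers_per_s: float,
                            channels: int = 2,
                            bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_second = mega_transfers_per_s * 1e6 * channels * bytes_per_transfer
    return bytes_per_second / 1e9


# Dual-channel DDR3-1600: 1600 MT/s x 8 bytes x 2 channels
ddr3_1600_peak = peak_dram_bandwidth_gbs(1600)
```

That works out to 25.6 GB/s, an upper bound that measured bandwidth will always fall short of.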

We already know that Intel optimized Haswell’s memory hierarchy for performance, based on information discussed at last year’s IDF. As expected, Sandra’s cache bandwidth test shows an almost-doubling of performance from the 32 KB L1 data cache.

Gains from the L2 cache are actually a lot lower than we’d expect though; we thought that number would be close to 2x as well, given 64 bytes/cycle throughput (theoretically, the L2 should be capable of more than 900 GB/s). The L3 cache actually drops back a bit, which could be related to its separate clock domain.
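
The 900 GB/s figure can be reproduced with a back-of-the-envelope calculation: bytes per cycle, times clock rate, times the number of cores, since each core has its own L2. Which clock rate was used is not stated; the sketch below assumes the Core i7-4770K’s 3.7 GHz four-core Turbo Boost frequency from the line-up table, and the function name is hypothetical.

```python
def aggregate_cache_bandwidth_gbs(bytes_per_cycle: int,
                                  clock_ghz: float,
                                  cores: int) -> float:
    """Peak bandwidth in GB/s summed across all cores."""
    # bytes/cycle x 1e9 cycles/s per GHz, divided by 1e9 bytes per GB
    return bytes_per_cycle * clock_ghz * cores


# Assumed clock: 3.7 GHz with all four cores active
l2_peak_turbo = aggregate_cache_bandwidth_gbs(64, 3.7, 4)

# At the 3.5 GHz base clock the figure lands just under 900 GB/s
l2_peak_base = aggregate_cache_bandwidth_gbs(64, 3.5, 4)
```

Under the turbo assumption the result is roughly 947 GB/s, consistent with "more than 900 GB/s"; at the base clock it would be 896 GB/s.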

It still isn’t clear whether something’s up with our engineering sample CPU, or if there’s still work to be done on the testing side. Either way, this is a pre-production chip, so we aren’t jumping to any conclusions.

Source: Tom’s Hardware.
