Title:
smaller_and_faster_intel_sse_code Download
Description: Codes exhibiting suitable data-flow parallelism can often
profit from using Intel SSE, a SIMD extension to the CISC-style Intel 64 instruction set architecture. As SSE instructions are, on average,
larger than scalar instructions, they place a heavier load on instruction pre-decoding, decoding, and caching hardware. For long straight-line
SSE codes, instruction lengths become an obstacle to high performance
that is not adequately handled by available optimizing compilers.
File list (Check if you may need any files):
smaller_and_faster_intel_sse_code.pdf