Intel Patents "Software Defined Supercore" to Boost Single-Thread Performance by Merging Cores into a Virtual Core

Tech     7:49am, 5 September 2025

Intel, a major processor manufacturer, recently obtained a patent for the "Software Defined Supercore" (SDC). The concept is that software combines multiple physical cores to form a virtual, ultra-wide "supercore".

According to Tom's Hardware, the goal of the new technology is to significantly improve single-thread performance for specific applications with suitable workloads. If the technology can be implemented successfully, Intel's central processing units (CPUs) are expected to deliver better single-thread performance in selected applications. For now, however, this is only a patent, and it is still unknown whether it will be used in future products.

Intel's SDC aims to make two or more physical CPU cores operate in concert, so that they behave like a single high-performance virtual core. The mechanism is to split the instructions of a single thread into separate blocks, which different physical cores then execute in parallel. To ensure that the original execution order of the program is not affected, the system uses dedicated synchronization and data-transfer instructions to coordinate the cores. The design is intended to maximize instructions per cycle (IPC) with minimal extra overhead. SDC has its own advantages over traditional ways of improving performance, such as raising clock speed or building one large core. Because building a large core can significantly increase both power consumption and transistor budget, SDC offers an alternative way to improve single-thread performance without spending those resources.
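The splitting idea can be illustrated with a very small model. The sketch below is not Intel's implementation: the function names (`split_into_blocks`, `run_block`, `run_on_supercore`) and the block size are made up for illustration, each "instruction" is just a Python callable, and the blocks are assumed to be independent of each other, so no synchronization instructions are needed here.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(instructions, block_size):
    """Cut one thread's instruction list into consecutive blocks."""
    return [instructions[i:i + block_size]
            for i in range(0, len(instructions), block_size)]


def run_block(block):
    """Stand-in for one physical core executing its block of instructions."""
    return [op() for op in block]


def run_on_supercore(instructions, num_cores=2, block_size=4):
    """Execute blocks on several workers, then return results in program order."""
    blocks = split_into_blocks(instructions, block_size)
    with ThreadPoolExecutor(max_workers=num_cores) as pool:
        # Executor.map yields results in submission order, which stands in
        # for results becoming visible in the original program order.
        per_block = list(pool.map(run_block, blocks))
    return [result for block in per_block for result in block]


# Hypothetical "program": ten independent instructions.
program = [lambda i=i: i * i for i in range(10)]
results = run_on_supercore(program)
```

In this toy model the ordering guarantee comes for free from the thread pool. In the patent, that guarantee is what the dedicated synchronization and data-transfer instructions are there to provide.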

Modern x86 CPU cores can generally decode 4 to 6 instructions per cycle. Once these instructions are decoded into micro-ops, 8 to 9 micro-ops can be executed per cycle, which sets the peak IPC of this class of processor. In contrast, Apple's custom Arm high-performance cores (such as Firestorm, Avalanche, and Everest) can decode more than eight instructions per cycle and execute more than ten. This is also why Apple processors usually show better single-thread performance and lower power consumption than comparable chips.

Although it is technically feasible to build an 8-way x86 CPU core that decodes, issues, and retires up to 8 instructions per cycle, this has not been done in practice. The main obstacles are the complexity of the front-end design and diminishing performance returns relative to the cost in power and die area. In fact, even the most advanced x86 CPUs today typically sustain an IPC of only about two to four on general-purpose workloads, depending on the software. Intel's SDC therefore proposes a different strategy: instead of taking on the difficulty of building an 8-way x86 core directly, it pairs two or more cores, where the situation suits it, so that they work together as one large core.

At the hardware level, each core in an SDC-enabled system integrates a small dedicated hardware module that manages synchronization between cores, register transfers, and memory ordering. These modules use a reserved memory region called the wormhole address space to coordinate live-in/live-out data and synchronization operations. This ensures that instructions from different cores are retired in the correct program order. The design supports both in-order and out-of-order cores, requires only minimal changes to the existing execution engine, and keeps the die-area overhead small.
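A rough software analogy may help picture how a shared region can carry live-in/live-out values between two cores. The sketch below is only an analogy under stated assumptions: the `Wormhole` class, its `send`/`receive` methods, and the two `core0`/`core1` functions are hypothetical names, and Python threads stand in for physical cores. The patent describes a hardware mechanism, not this code.

```python
import threading


class Wormhole:
    """Toy stand-in for a reserved region used to pass values between cores."""

    def __init__(self):
        self._slots = {}
        self._cond = threading.Condition()

    def send(self, name, value):
        """Publish a live-out value produced by one core."""
        with self._cond:
            self._slots[name] = value
            self._cond.notify_all()

    def receive(self, name):
        """Block until the named live-in value is available, then consume it."""
        with self._cond:
            self._cond.wait_for(lambda: name in self._slots)
            return self._slots.pop(name)


def core0(wormhole, data):
    partial = sum(data)                    # first segment of the program
    wormhole.send("partial", partial)      # live-out for the next segment


def core1(wormhole, out):
    partial = wormhole.receive("partial")  # live-in; waits for core 0
    out.append(partial * 2)                # second segment depends on it


def run_pair(data):
    wormhole, out = Wormhole(), []
    threads = [
        threading.Thread(target=core1, args=(wormhole, out)),
        threading.Thread(target=core0, args=(wormhole, data)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out[0]
```

Calling `run_pair([1, 2, 3])` returns 12 regardless of which thread starts first, because the consumer waits until the producer has published its value. That wait is the role synchronization plays in keeping the original program order intact.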

At the software level, the system uses techniques such as a just-in-time (JIT) compiler, a static compiler, or binary instrumentation to split a single-threaded program into separate code segments. These segments are assigned to different cores for execution. The system also injects special instructions that control program flow, register transfers, and synchronization, so that the hardware can preserve correct execution. The operating system (OS) plays a key role: based on runtime conditions, it dynamically decides when to move a thread into or out of supercore mode, to strike the best balance between performance and core availability.
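The article does not say what criteria the OS would use for this decision. Purely as an illustration of the trade-off between performance and core availability, a scheduler heuristic might look like the sketch below. Every name in it (`RuntimeState`, `thread_is_hot`, `should_merge_cores`) and the rule itself are assumptions, not details from the patent.

```python
from dataclasses import dataclass


@dataclass
class RuntimeState:
    runnable_threads: int  # threads currently competing for CPU time
    total_cores: int       # physical cores in the system
    thread_is_hot: bool    # hypothetical flag: this thread dominates run time


def should_merge_cores(state, cores_per_supercore=2):
    """Return True if a hot thread can borrow otherwise idle cores."""
    spare_cores = state.total_cores - state.runnable_threads
    extra_needed = cores_per_supercore - 1
    # Merge only when the thread would benefit and no other runnable
    # thread would lose its core as a result.
    return state.thread_is_hot and spare_cores >= extra_needed


lightly_loaded = RuntimeState(runnable_threads=1, total_cores=8, thread_is_hot=True)
fully_loaded = RuntimeState(runnable_threads=8, total_cores=8, thread_is_hot=True)
```

Under this assumed rule, `should_merge_cores(lightly_loaded)` is True and `should_merge_cores(fully_loaded)` is False: merging only pays off when cores would otherwise sit idle.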

Although Intel's patent does not give estimates of the performance gain, it implies that, in specific application scenarios, the combined performance of two small cores is expected to approach that of one large core. If the technology moves successfully from patent to product, it could bring significant performance improvements to single-threaded applications, especially workloads that can take full advantage of the SDC design. For now, however, SDC remains at the patent stage, and its further development is worth watching.