Breaking through FPGA Performance Barriers

Greg Martin, Product Marketing Manager, Achronix Semiconductor, explains how new FPGA architectures bring impressive performance improvements.


Since their introduction in 1984, Field Programmable Gate Arrays (FPGAs) have given designers substantial flexibility and time-to-market advantages compared with ASICs. In recent years, however, FPGAs have hit a wall.

FPGAs' great strength, reconfigurability, is unfortunately also the cause of a large weakness - low performance - which keeps FPGAs from competing with ASICs even when they are built on smaller technology nodes. As a result, traditional FPGAs have been prevented from playing a role in many high-performance systems where ASICs still dominate, even in low to medium volume applications where an FPGA would be a much more cost-effective solution.

The economic and technical forces driving the high-performance electronics world are leading to frustration with both the ASIC and traditional FPGA approaches. System architects increasingly require reprogrammability and short time-to-market, while still satisfying their performance requirements.

This article demonstrates how an innovative new FPGA architecture with unprecedented capabilities is signalling a breakthrough in the nearly three decades of FPGA design where performance has often been sacrificed for flexibility and time-to-market. This new architecture blends elements of both synchronous and asynchronous architectures, delivering the world's first FPGAs capable of exceeding standard-cell ASIC performance. This breakthrough opens up worlds of applications previously unavailable to engineers using traditional FPGAs and will lead to more successful development programs producing more competitive products.


New FPGA Architecture

The revolutionary new FPGA architecture from Achronix can achieve three times the throughput of traditional FPGAs, approaching 1.5 GHz in peak performance. At the heart of this new architecture is the picoPIPE™ logic fabric - a fabric that is based on the use of Data Tokens rather than a conventional clocked structure. This high-performance fabric is surrounded by a conventional I/O frame of configurable I/Os, SerDes, clocks, PLLs etc. (Figure 1), providing the off-chip interfaces and forming the boundary between the picoPIPE core and these interfaces. All data entering and exiting the core must pass through the frame. From a designer's perspective, the internal picoPIPE fabric is virtually indistinguishable from a conventional FPGA fabric - the only distinction is that the data throughput is substantially increased.

Figure 1: Achronix FPGA Architecture

In conventional logic, a Data Token is a logic value which is qualified by a clock edge. With a traditional logic implementation, data is always present, but is only valid (and therefore propagated through storage elements) when a clock edge is received at a storage element. Hence every time data is propagated from one storage element to the next, only a distinct, valid data value is propagated. The combination of Clock and Data can therefore be considered an implicit Data Token. For each clocked register (storage element) in a design, a Data Token is propagated at every clock tick.

In an Achronix FPGA, the picoPIPE fabric uses explicit Data Tokens, rather than implicit ones. Wherever there was an implicit Data Token in the original design, it will be replaced with an explicit Data Token once the design is mapped into the picoPIPE fabric. As explicit Data Tokens are used, the clock information is encoded into the Data Token - the very existence of a token indicates that a clock edge has occurred. As each Data Token contains both Data and Clock information, no global clock is required within the fabric. Data Tokens are still clocked into and out of the fabric using special elements in the frame. The explicit Data Tokens are controlled by fast, local handshaking, rather than a global clock, and hence are able to propagate at very high speeds.
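As a rough illustration of the implicit-versus-explicit distinction, the Python sketch below turns a sampled data wire and clock waveform into a list of explicit tokens, one per rising clock edge. The names `DataToken` and `tokens_from_clocked_signal` are hypothetical, chosen for this example only; they are not part of any Achronix tool, and the real fabric does this in hardware rather than software.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class DataToken:
    """An explicit token: its mere existence stands in for a clock edge."""
    value: int


def tokens_from_clocked_signal(data: Sequence[int],
                               clock: Sequence[int]) -> List[DataToken]:
    """Emit one token per rising clock edge, carrying the data seen at that edge."""
    tokens = []
    previous_level = 0  # assumption: the clock starts low
    for value, level in zip(data, clock):
        if previous_level == 0 and level == 1:
            tokens.append(DataToken(value))
        previous_level = level
    return tokens


# The data wire always carries some value, but only edge-qualified values count.
data_wire = [7, 7, 3, 3, 9, 9]
clock_wire = [0, 1, 0, 1, 0, 1]
explicit_tokens = tokens_from_clocked_signal(data_wire, clock_wire)
print([token.value for token in explicit_tokens])
```

Once the values are held as tokens, nothing downstream needs the clock waveform any more: the order of the tokens carries the same information.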

The basic elements of a picoPIPE fabric are the Connection Element (CE), the Functional Element (FE), the Boundary Elements (BEs) and the pipeline stage. Pipeline stages connect CEs, FEs and BEs to form pipeline networks. Once combined into networks the picoPIPE implementation exactly matches the functionality of conventional FPGAs, but is capable of much higher throughput.

Achronix picoPIPE Building Blocks

Achronix picoPIPE Pipeline Stages

Each pipeline stage is capable of holding a Data Token, meaning that picoPIPE logic is highly pipelined by design. In a traditional logic design, adding a pipeline stage (register) always introduces a new Data Token as well, and so changes the logic function computed. With picoPIPE logic this is not the case: the new data representation separates pipeline stages from Data Tokens, so pipeline stages can be added without adding Data Tokens and therefore without altering functionality.

CEs can be initialized with or without a Data Token. Wherever a register existed in the original design, the corresponding CE will have an initial Data Token; all other CEs will not. The main difference between a series of uninitialized CEs and a wire is that each pipeline stage between CEs is still capable of containing a Data Token, even if it does not start with one. This enables the throughput of Achronix FPGAs to be increased, while maintaining exact logical equivalence to a conventional circuit.
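The separation of pipeline stages from Data Tokens can be sketched with a toy simulation. The model below is a simplification written for this article, not the actual picoPIPE circuit: each stage holds at most one token, `None` marks an empty stage, and a token advances only when the next stage is free, which stands in for the local handshake. The function name `run_pipeline` is hypothetical.

```python
from typing import List, Optional


def run_pipeline(stages: List[Optional[int]], inputs: List[int]) -> List[int]:
    """Push inputs through a chain of one-token stages and return the output order.

    `stages` gives the initial contents, input side first; it must not be empty.
    """
    stages = list(stages)
    pending = list(inputs)
    outputs = []
    while pending or any(token is not None for token in stages):
        # Output side: the last stage hands over its token if it holds one.
        if stages[-1] is not None:
            outputs.append(stages[-1])
            stages[-1] = None
        # Walk from output to input so each token moves at most one stage per step.
        for i in range(len(stages) - 1, 0, -1):
            if stages[i] is None and stages[i - 1] is not None:
                stages[i], stages[i - 1] = stages[i - 1], None
        # Input side: accept a new token only when the first stage is free.
        if pending and stages[0] is None:
            stages[0] = pending.pop(0)
    return outputs


# One register (initial token 0), as in the original design.
print(run_pipeline([0], [1, 2, 3]))
# The same register surrounded by three uninitialized stages: same token order.
print(run_pipeline([None, 0, None, None], [1, 2, 3]))
# An extra initialized stage (a conventional added register) changes the order.
print(run_pipeline([5, 0], [1, 2, 3]))
```

In this model the first two calls produce the same token sequence, while the third inserts an extra value into the stream, mirroring the difference between adding an empty pipeline stage and adding a register.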

FEs have functionality equivalent to combinatorial logic. The only difference relates to how ingress and egress data is handled. The local handshaking within a picoPIPE network means the FEs must also handshake data in and out. This handshaking ensures only valid, settled data is evaluated and propagated.
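One way to picture this handshaking is as a dataflow rule: an element computes only when every input holds a token and its output slot is free. The class below is an illustrative sketch under that assumption; `FunctionalElement`, `offer`, `try_fire` and `take` are hypothetical names, and the real handshake signalling is not described in this article.

```python
from typing import Callable, List, Optional


class FunctionalElement:
    """Combinatorial function that fires only on complete, settled inputs."""

    def __init__(self, func: Callable[..., int], num_inputs: int) -> None:
        self.func = func
        self.inputs: List[Optional[int]] = [None] * num_inputs
        self.output: Optional[int] = None

    def offer(self, port: int, value: int) -> bool:
        """Ingress handshake: accept a token only if the port is empty."""
        if self.inputs[port] is not None:
            return False
        self.inputs[port] = value
        return True

    def try_fire(self) -> bool:
        """Compute once all inputs are present and the output slot is free."""
        if self.output is not None or any(v is None for v in self.inputs):
            return False
        self.output = self.func(*self.inputs)
        self.inputs = [None] * len(self.inputs)  # input tokens are consumed
        return True

    def take(self) -> Optional[int]:
        """Egress handshake: hand the result downstream and free the slot."""
        value, self.output = self.output, None
        return value


and_gate = FunctionalElement(lambda a, b: a & b, num_inputs=2)
and_gate.offer(0, 1)
print(and_gate.try_fire())  # only one input has arrived, so nothing is evaluated
and_gate.offer(1, 1)
print(and_gate.try_fire(), and_gate.take())
```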

BEs are only used at the boundary where the picoPIPE fabric meets the FPGA frame. These elements are responsible for converting the implicit, clocked Data Tokens in the frame into explicit Data Tokens in the picoPIPE fabric (ingress). They are also used for converting Data Tokens in the fabric back into clocked Data Tokens in the frame (egress). Therefore every signal entering or exiting the picoPIPE fabric will pass through an Ingress Boundary Element or an Egress Boundary Element respectively.


Increased throughput

Higher throughput compared with existing FPGAs is achieved because of the fine-grained pipeline stages. Unlike existing FPGA implementations, these pipeline stages can be automatically inserted anywhere in a design without changing its logic functionality.

There are often many levels of logic between storage elements in traditional technology. It takes time for data to propagate from a register's Q output, through the combinatorial logic, and settle at a stable state on the next register's D input. As the next clock edge cannot occur until all data has settled, the clock period can be no shorter than the delay of the longest path in the entire clock network. Data on every shorter path (by definition, all paths except the longest) must wait for the longest path.
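The arithmetic behind this limit is simple enough to sketch. In the example below the path names and delays are invented purely for illustration, and setup time and clock skew are ignored; the helper names are likewise hypothetical.

```python
from typing import Dict


def min_clock_period_ns(path_delays_ns: Dict[str, float]) -> float:
    """The clock period is bounded by the slowest register-to-register path."""
    return max(path_delays_ns.values())


def idle_time_ns(path_delays_ns: Dict[str, float]) -> Dict[str, float]:
    """Time each path spends settled and waiting for the next clock edge."""
    period = min_clock_period_ns(path_delays_ns)
    return {name: period - delay for name, delay in path_delays_ns.items()}


# Hypothetical delays in nanoseconds for three paths in one clock domain.
paths = {"short": 1.0, "medium": 2.5, "longest": 4.0}

period = min_clock_period_ns(paths)
print(f"minimum period: {period} ns, maximum clock: {1000.0 / period} MHz")
print(idle_time_ns(paths))
```

With these made-up numbers the 1.0 ns path sits idle for most of every cycle, which is the inefficiency the paragraph above describes.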

In contrast, picoPIPE technology allows optimum pipelining without changing the logic functionality. Each pipeline stage has less logic depth and therefore completes its operation very quickly. This allows the rate of Data Tokens through the logic to be increased, which increases the effective clock rate.

Achronix picoPIPE vs. Existing FPGA Implementation

In a traditional FPGA, signals travel on long routing tracks and pass through many routing components. These signals suffer from a high capacitive load, and the larger the FPGA, the longer the paths that need to be traversed. Additionally, there are many levels of logic between state-holding elements (registers).

Conventional Implementation vs. picoPIPE Implementation

Within Achronix FPGAs, the built-in pipelining ensures that signals only ever need to travel on short routing tracks. This reduces the capacitive load on the signal at each stage. In larger devices, signals may still need to propagate from one corner of the device to the other. While larger devices may have slightly increased latency, unlike other FPGAs they do not have decreased throughput, as each pipeline stage is capable of holding a new Data Token. Thus the inherent pipelining of picoPIPE technology allows maximum throughput to be maintained, regardless of how large the FPGA is. Pipelining also ensures there is only one logic level per pipeline stage, allowing a much faster rate of Data Tokens to be used.
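The latency-versus-throughput point can be checked with a back-of-the-envelope model. The sketch below assumes, for simplicity, that an unpipelined route carries one value at a time for its full delay, while a pipelined route accepts a new token every stage delay. The 0.5 ns stage delay and the stage counts are invented for illustration and are not Achronix figures; the function names are hypothetical.

```python
from typing import Tuple


def unpipelined_route(total_delay_ns: float) -> Tuple[float, float]:
    """Return (latency in ns, values per ns) when one value occupies the whole route."""
    return total_delay_ns, 1.0 / total_delay_ns


def pipelined_route(num_stages: int, stage_delay_ns: float) -> Tuple[float, float]:
    """Return (latency in ns, tokens per ns) when every stage can hold its own token."""
    return num_stages * stage_delay_ns, 1.0 / stage_delay_ns


STAGE_DELAY_NS = 0.5  # hypothetical delay of one short routing segment

for stages in (4, 16):  # a short route and a corner-to-corner route
    total_delay = stages * STAGE_DELAY_NS
    print(stages, "stages:",
          "unpipelined", unpipelined_route(total_delay),
          "pipelined", pipelined_route(stages, STAGE_DELAY_NS))
```

Under these assumptions the longer route quadruples the latency in both cases, but only the unpipelined route loses throughput.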


Conclusion

Based on the patented picoPIPE technology, Achronix FPGAs can achieve a level of performance unobtainable with traditional FPGAs. Although the logic fabric is based on new technology, designers are not required to learn new design tools and techniques - making picoPIPE technology almost transparent to the designer. By mapping a design to picoPIPE technology, the design is automatically pipelined, without changing its behavior, significantly increasing the throughput that can be achieved. The innovative Achronix picoPIPE acceleration technology breaks through performance barriers and opens a new world of applications previously unavailable to FPGA designers.

Greg Martin is product marketing manager for Achronix Semiconductor Corporation. His responsibilities include the development of software product strategy as well as product marketing for hardware and software. Mr. Martin joined Achronix in 2006 and strongly influenced the development and production of the company's ACE Software and Speedster FPGA.

Achronix Semiconductor is a privately held fabless corporation headquartered in San Jose, California. Using breakthrough technology, Achronix field programmable gate arrays (FPGAs) can achieve up to 1.5 GHz system performance. Find out more at http://www.achronix.com.

Achronix is a registered trademark and Speedster is a trademark of Achronix Corp. All other brands, product names and marks are the property of their respective owners.