Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Embedded system engineering magazine 2006.05,06

.pdf
Скачиваний:
19
Добавлен:
23.08.2013
Размер:
5.55 Mб
Скачать

cessors: Processors

Max

Type (3)

Min

Max

2 KB

ROM

 

128 KB

KB

 

 

 

8 KB

 

 

 

8 KB

 

 

 

KB

 

 

 

B

EEPROM

64 B

4 KB

bit

 

 

 

KB

L2 SRAM

128 KB

128 KB

KB

 

 

 

B

 

 

 

KB

 

 

 

B

 

 

 

 

 

 

 

KB

L2 cache

 

512 KB

KB

 

 

 

KB

L2 cache

 

1 MB

KB

L2 cache

 

1 MB

 

32 KB

M2 / M3

 

0.5 MB/10 MB

 

B

Flash

 

 

96 Byte

Flash

 

 

 

 

KB

Flash

 

 

 

 

KB

Flash

 

 

 

 

B

 

 

 

36 Bytes

EEPROM

 

1 KB

B

 

 

 

 

 

 

KB

 

 

 

 

 

 

B

EEPROM

 

4 KB

 

KB

 

 

 

 

 

 

KB

Flash

8 KB

8 KB

A

N/A

N/A

N/A

A

N/A

N/A

N/A

A

N/A

N/A

N/A

B

 

 

 

 

 

 

 

 

 

 

 

 

 

 

B

Data Cache

8 KB

8 KB

B

 

 

 

 

 

 

B

 

 

 

 

 

 

 

 

 

 

 

 

 

 

48 Byte

 

 

 

6 Byte

 

 

 

 

 

 

76 Byte

 

 

 

 

 

 

 

 

 

80 KB

 

 

 

KB

RAM

32 KB

320 KB

KB

RAM

128 KB

1MB

24 KB

RAM

2MB

2MB

I/O and Interfaces

Serial

USB

A/D

UART

Ethernet

DDR

SRAM

External Flash

I2C

CAN

Other buses (please name)

15

3

8

7

 

 

λ

λ

λ

 

I2S, MCI

5

1

8

3

 

 

 

 

λ

 

SSC,SPI

8

1

8

3

1

 

 

 

λ

λ

SSC,SPI

8

1

8

3

1

 

 

 

λ

λ

SSC,SPI, TDES, AES

12

1

16

4

 

 

 

 

λ

λ

MMC, SD, SPI, SSC

86

1

16

2

 

 

λ

 

λ

λ

USART, SPI, USB, TWI,

160

1

 

4

2

 

λ

λ

λ

 

USART, AC97, I2S, S/PDIF, SPI,TWI

 

 

 

 

 

λ

λ

λ

 

 

 

6

 

 

2

 

 

λ

λ

 

 

 

2

 

 

1

 

 

λ

λ

 

 

 

2

 

 

1

 

 

λ

λ

 

 

 

4

 

4

2

 

 

λ

λ

λ

 

SPI, IR, S mArt Card

10

1

14

4

1

 

λ

λ

λ

 

SPI, IR, S mArt Card, I2S

4

 

4

2

 

 

λ

λ

λ

 

SPI, IR, S mArt Card

2

 

 

 

1

 

λ

λ

λ

 

SPI

4

2

 

 

1

 

λ

λ

λ

 

SPI, D mA

4

2

 

 

1

 

λ

λ

λ

 

SPI, D mA, PCI

4

2

 

 

1

 

λ

λ

λ

 

SPI, D mA, PCI, IEEE1284, video

<8

 

 

2

4

λ

 

 

λ

 

PCI 2.2, PCI-X, PIC, GPIO, JTAG

2

1

 

2

2

λ

 

 

λ

 

 

>8

 

 

2

4

λ

 

 

λ

 

 

4

 

 

1

2

λ

 

 

λ

 

 

1

 

8/12

2

 

 

 

 

λ

 

LIN

2

 

8

1

 

 

 

 

λ

 

 

3

 

24

4

 

 

 

 

λ

λ

LIN

3

 

24

7

 

 

 

 

λ

λ

D mA

2

 

4

2

 

 

 

 

λ

 

SPI

 

1

16

2

1

 

 

λ

λ

λ

SPI,L IN; Parallel Slave Port

 

 

16

2

 

 

 

 

λ

 

SPI, LIN, IrDA, JTAG

 

 

32

2

 

 

 

 

λ

λ

SPI, LIN, IrDA, JTAG, D mA

 

 

16

2

 

 

 

 

λ

λ

SPI, LIN, Codec

 

 

32

2

 

 

 

 

λ

λ

SPI, LIN, IrDA, JTAG, Code, DmA

2

1

1

4

 

 

λ

λ

λ

λ

Microwire/SP, audio, Bluetooth

40

 

 

4

 

 

λ

λ

λ

 

 

56

 

 

6

 

 

λ

λ

λ

 

 

40

 

 

6

1

 

λ

λ

λ

 

 

 

 

5

2

 

 

λ

 

λ

 

SPI

 

 

 

2

 

 

 

 

λ

 

SPI, JTAG

 

 

 

1

 

 

 

 

λ

 

 

3

2

8

3

 

 

λ

λ

λ

 

color LCD, touch screen

3

1

8

3

1

 

λ

λ

λ

 

color LCD, touch screen

3

 

 

3

 

 

λ

λ

 

 

color LCD

3

 

8

3

 

 

λ

λ

 

λ

color LCD, touch screen

64

 

16

2

 

 

 

 

λ

 

SPI

8

 

8

1

 

 

 

 

λ

 

SPI

40

8

17

2

 

 

 

 

λ

 

SPI

3

1

4

1

 

 

λ

 

λ

 

EMIF, HPI, SPI, McBSP, DmA

3

1

 

1

 

 

 

 

λ

 

EMIF, DmA, MmA, HPI, SPI

3

 

 

 

 

 

λ

 

 

 

PCI, HPI, Expansion Bus, DmA

3

 

 

 

 

λ

λ

 

 

 

PCI, HPI, McBSP, McBSP

Application area

Consumer

 

Portable/hand held

 

Industrial

 

Automotove

 

Aerospace/defence

 

Telecomms

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

 

 

 

 

λ

λ

 

λ

 

λ

 

 

 

λ

 

λ

 

 

 

 

 

 

λ

 

λ

 

λ

 

 

 

λ

 

λ

 

 

 

 

 

 

λ

 

λ

 

λ

 

 

 

λ

 

λ

 

 

 

 

 

 

λ

 

λ

 

λ

 

 

 

λ

 

λ

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

 

 

λ

 

λ

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

 

 

λ

 

 

 

 

 

λ

 

 

 

 

 

 

 

 

λ

 

 

 

λ

 

 

 

 

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

 

 

 

 

λ

λ

 

λ

 

λ

 

 

 

 

 

λ

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

λ

λ

 

λ

 

λ

 

λ

 

 

 

λ

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

λ

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

 

 

λ

 

λ

 

 

λ

 

λ

 

 

 

λ

 

λ

 

 

 

 

 

 

 

 

 

λ

 

λ

 

 

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

λ

λ

 

λ

 

λ

 

λ

 

 

 

λ

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

λ

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

λ

λ

 

λ

 

λ

 

λ

 

 

 

λ

 

 

 

 

 

 

λ

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

λ

 

 

 

λ

 

λ

 

λ

 

λ

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

</Buyer's Guide>

ESE Magazine May/June 06

BG04

</Feature>

ESE Magazine May/June 06

22

Multithreaded architecture speeds embedded applications

<Written by> Vivek Sardana, MIPS Technologies, Inc. </W>

Mutithreading offers an alternative path to improved performance.

EMBEDDED SYSTEMS designers are constantly under pressure to boost the execution speed their applications. One common solution has been to ramp up the clock speed of the system processor, but this

usually results in increased power consumption. Additionally, memory performance has not kept pace with processor technology, and this mismatch limits any significant gains in system performance. Consequently the higher-frequency approach is leading to diminishing returns.

The execution pipeline of a traditional singlethreaded processor will stall for many reasons

The key to squeezing the maximum performance in such a design is controlling the way the threads are executed in the pipeline, and so the various threads are scheduled into the execution unit for maximum efficiency. When one thread stalls waiting for memory, the next thread is introduced to ensure that the processor's execution unit is as busy as possible.

Technology

The 34K core family uses a nine stage execution pipeline coupled with a small amount of hardware to handle the virtual processing elements (VPEs) and the thread contexts (TCs), as illustrated in figure 1.

Each thread has its own dedicated hardware, called the thread context (TC). Since the context of each thread is always available, the core can switch between threads on a clock-by-clock basis to keep the pipeline as full as possible. This approach eliminates the massively cycle-ineffi- cient overhead which would otherwise be caused by context switching between threads. Each TC has its own set of general-purpose registers, a PC (program counter) that allows a TC to run a thread from a complex operating system such as Linux. A TC also shares resources with other TCs, particularly the CP0 registers used by the privileged code in an OS kernel.

The set of shared CP0 registers and the TCs affiliated with them make up a VPE (Virtual Processing Element). A VPE running one thread (i.e. with one TC) looks exactly like an independent MIPS CPU, and is fully compliant with the MIPS32 architecture specification. This allows two OS's to run side by side on a single core with no loss of efficiency – Linux plus an RTOS for example.

All threads (in either VPE) share the same caches, so cache coherency is not an issue. This

Figure 2: A dual OS Example: Residential Gateway supporting multiple VoIP channels

Figure 1: MIPS32 34K processor architecture

using two threads indicated that the perform-

 

www.mips.com

ance improvement compared to the equivalent

 

 

 

ARM JTAG+TRACE

JTAG-USB with 2Mbyte of 16 bit trace for less than many JTAG debuggers. Option of unlimited breakpoints in internal flash

Supports ARM7 & 9

Trace supports 200Mhz full and 100Mhz half clock rates

Support for cycle accurate and compresssed tracing

No PSU required

 

External PSU option.

PhaedruS SystemS

 

has a full range of

 

ARM compilers, JTAG

PhaedruS

and dev boards

S y s t e m S

Tel: 08450 09 09 58

chills@phaedsys.org

www.phaedsys.org

Beagle USB

Protocol Analyser

Non-intrusive Full/Low-Speed USB monitoring (1.5 Mbps/12 Mbps)

#

It is even possible to monitor a HighSpeed USB device by using a FullSpeed hub

#

Real-Time Data Capture and

Display

#

Watch USB packets as they occur on the bus

#

Bit-Level Timing

#

See the individual bit timings down to 21 ns resolution

#

Fully Linux and Windows compatible

Only £295 #

Includes full function monitoring tools

Experts in Embedded Systems

or registered trademarks are the property of their respective owners

June 30th, 2006. Rev. #G045uk02

all other trademarks

compliant latest by

Kontron logo and

be lead-free/RoHS

AG. All rights reserved. Kontron and the

compliant logo means that product will

Copyright © 2006 Kontron

and are recognized. RoHS

Brilliant View

with Flatpanel Controllers from Kontron

 

 

 

-7

 

 

CD

 

oL

 

Tt

 

 

CR

 

 

 

 

 

 

 

-5

 

 

 

D

 

 

LC

 

 

o

 

 

Tt

 

 

 

CR

 

 

 

 

 

 

 

 

 

 

-6

 

 

 

 

 

D

 

 

 

 

C

 

 

 

 

L

 

 

 

 

to

 

 

 

 

T

 

 

 

 

R

 

 

 

 

 

C

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

aFLAT- series –

the new generation!

ΨEasiest way to connect CRT-, DVI-, Videoand TV-Sources (incl. Component YPbPr) to a Flat Panel (TTL, LVDS)

ΨResolutions from QVGA up to WUXGA incl. HDTV (1080i und 720p @ 50 Hz)

ΨIncl. powerful software tools (KCWB) via serial port interface

ΨPlug & Display with individual flat panel adaptation via panel file incl. special video mode support

ΨFirst class image quality incl. high end scaling & adaptive motion deinterlacing

ΨBacklight dimming support (incl. PWM and light sensor support)

ΨExtension boards e.g. for TV-tuner

ΨUnbeatable price-performance ratio

Wide range of datacables for all leading flat panel types

Our info hotline: +44 1243 523500

www.kontron.com/flatpanel sales - graphic@ kontron.com

www.gwmicros.com

If it‘s Embedded, it‘s Kontron.

</Feature>

ESE Magazine May/June 06

24

Protecting the “Secret Sauce”

<Written by> Martin Mason, Actel Corporation </W>

IP is expensive to develop and should be strongly protected.

I

While the company is anxious to prevent third-

for accessing proprietary data, as most organiza-

F A COMPANY’S intellectual proper-

( P) block, or “secret sauce,” has taken

party vendors from copying or reverse engineer-

tions do little to address the problem. Therefore,

substantial time and resources to devel-

ing its products to the extent that they can make

it is critical that the industry meet this problem

op, a responsible designer understand-

compatible cartridges, PGI has chosen to imple-

head on by identifying the most critical steps to

The technology in which this IP is housed is often overlooked, resulting in the theft of design information in seconds – a mere fraction of the development time

by-layer deprocessing of an ASIC can yield a

RSI, Lotus takes the completed design stored in

Consider adding digital “watermarks” or

design schematic in a matter of weeks. Volatile

a companion nonvolatile memory component on

“fingerprints” to your design: These are

SRAM-based FPGAs also have their challenges.

the production board and makes an additional

unique features or attributes of the design that

Because the design configuration bit stream is

100,000 units (including not just RSI’s design,

can later be used to prove that a design

loaded from an external volatile PROM device,

but also their own label) without the permission

claimed by a competitor to be independently

SRAM FPGAs face the greatest danger of design

of RSI. These counterfeits were

then

sold

developed is really a copy. This kind of action

IP theft. However, as nonvolatile devices retain

through the “grey market” at a discount.

This

may have prevented the alleged IP infringe-

their configuration data, they don't need to load

resulted in price erosion due to increased com-

ment scandal at Shanghai’s prestigious Jiao

data via a bit stream from an external device.

petition in the marketplace, but also brand

Tong University – whether the team credited

Therefore, of all commercially available tech-

degradation due to RSI’s inability to differentiate

with creating China’s first digital signal proces-

nologies, nonvolatile flash-based FPGAs offer

from competitive

(albeit

its

sor (DSP) used proprietary data from

the most

.

To support

Motorola/Freescale.

 

Pain

 

and reputa-

 

 

 

to support

Use trusted silicon vendors in implement

Imagine

extent)

 

the

the design: A nonvolatile, flash-based device

Incorporated

units.

 

 

programmed in a secure environment protects

gaming

 

steps

customers’ proprietary IP.

 

plan

 

 

 

employs

 

true

 

that

As the industry increasingly moves toward IP-

ness

vulnerabilities

based design models, the security of both in-

which

 

security

house and third-party IP becomes an even more

a l t i e s

flaws of

a

critical issue. Careful technology choice regard-

f r o m

technology

ing design implementation is the first line of

g a m e

are

well

defense in protecting this valuable asset from

sales and

known

 

and

unscrupulous competitors.

<End/>

accessories

attackers only

 

 

www.actel.com

 

sidize the

 

best means

 

 

 

 

Fanless Embedded Controller Series...

NEW!

AEC - 6910

AEC-6910

Ideal for Digital Signage and multimedia advertising.

Video Surveillance, Industrial Automation and Inspection

Intel Pentium M and Celeron up to 2.1GHzSupports PCI, Mini-PCI, PCMCIA, CF

Add on cards for motion control, DVR, Data Acquisition,8-ch video capture

...all on board at Display Solutions

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

AEC-6820

 

AEC-6830

 

 

 

Ideal Vehicle Control Equipment Platform

 

Reliable

 

 

Supports 2 PCMCIA Slots for Expansion

 

Entertainment Application Unit

 

 

Intel® ULV Celeron® 650MHz

 

 

Supports GSM/GPRS/GPS Applications

 

 

 

Operating Temperature: -15 to +70°C

 

Dual View/ 6CH Audio/ TV-out/ DVI-out

 

 

Robust Vehicle/ GPS System/ Transportation Controller

 

Vehicle Multimedia System/ Audio & Video Playing

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

AEC-6840

 

AEC-6850

 

 

 

 

 

NOW WITH

Superior Plant-wide Monitor Security Controller

 

Digital Signage Controller

 

TWO YEAR WARRANTY

 

 

 

 

 

 

 

 

 

 

Intel® ULV Celeron® 650/ 400MHz

 

Intel® Pentium® M/ Celeron® M Processor

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Supports 4 COM/ USB 2.0/ Digital IO/ TV

 

Dual View/ 5.1CH Audio/ Card Reader/ USB/IEEE 1394

 

 

 

 

 

 

 

 

 

Tuner/ Video Capture Function

 

Outdoor Advertisement Player/ Advanced

 

 

 

 

 

 

 

 

 

Factory Automation/ e-learning Family/Security Installation

 

Human-machine Interface

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

</Feature>

Embedded multicore designs force stringent APIs

<Written by> Markus Levy, the Multicore Association </W>

Multicore is beginning to make a real impact in the embedded world, and a new umbrella organisation has been created.

HAVE YOU NOTICED lately that more and more companies are offering products related to some aspect of multicore? Multicore processors. Multicore-

ESE Magazine May/June 06

26

Figure 2: Intellisys SEAFORTH24 – an example of multicore processing.

The embedded market is where a growing amount of the action and many of the challenges are for multicore-enabled applications

Figure 1: It would be desirable to have one unified API that is designed specifically for commu-

 

nication in a closely distributed embedded system (multiple cores on a chip and/or multiple

communicate with one another. So far a num-

chips on a board), and that is agnostic to the type and number of cores, the type of inter-con-

ber of vendors have addressed the issue with

nects and operation systems. (Diagram courtesy of PolyCore Software.)

 

proprietary support for inter-core and/or inter-

chip communications. There are also plenty of standards available to support communication, but none is designed or intended to support the tight memory and task execution time constraints, or the on-chip interconnect requirements, of most embedded systems.

As the multicore ecosystem (comprised of chip vendors, semiconductor IP providers, RTOS, compiler, and development tool vendors, as well as application developers) evolves, vendors will need to evolve their product lines to support more robust interfaces and higher levels of interoperability, and therefore enable quicker time to market for their customers.

To cope with this impending change, it is helpful for the industry to agree on common, simple, and efficient abstractions for such concurrent systems to allow us to describe key aspects of concurrency in ways that can be simply and directly represented as a set of APIs. The resource management, communications, and synchronization required for embedded distributed systems are some specific areas of programming multicore systems that must be addressed. This need stems from the reality that such systems cannot rely on a single OS -- or even an SMP OS -- for such services. When heterogeneous multicore systems employ a range of operating systems across multiple cores, that means they have resources that cannot be managed by any single operating system. This situation is exacerbated further by the presence of hardware accelerators which do not run any form of OS, but which must interact with processes that are potentially running on multiple operating systems on different cores.

API

A new industry organization, the Multicore Association, has been formed to serve as an umbrella organization for multicore related discussions, standards, and support. One of the first areas being addressed by this group is the communications function, which the Multicore Association refers to as the Communications API (CAPI). The primary goal of the Association’s CAPI workgroup is to specify an API, not an implementation, for the purposes of messaging and synchronization in a closely distributed embedded system.

In the traditional sense, distributed computing is parallel computing using multiple independent computers communicating over a network to accomplish a common objective or task. To support this functionality, many protocols have been developed over the years to target a wide variety of distributed systems. These include Open Systems Interconnection (OSI), TCP/IP, Sun’s Remote Procedure Call (RPC), Common Object Request Broker Architecture (CORBA), OpenMP, Message Passing Interface (MPI), and the list goes on. While these indus-

try standards support distributed systems pro-

heterogeneity, operating system heterogeneity,

 

gramming, they have primarily been focused on

software tool chain heterogeneity, and pro-

 

the needs of (1) distributed systems in the large,

gramming language heterogeneity). Thus, the

 

(2) SMP systems, or (3) specific application

CAPI has similar but more highly-constrained

</Feature>

domains (such as scientific computing).

goals than the existing standards with respect

Multiple dimensions

to scalability and fault tolerance, yet has more

generality with respect to application domains.

One interesting aspect and very important dif-

In this article we have barely scratched the

ferentiator of the CAPI is not what it is, but

surface on the issues that are being resolved and

what it isn’t. In other words, to meet the strin-

that the standards that are being developed with-

gent demands of the single-chip (or otherwise

in the Multicore Association. Other topics include

closely knit) multicore platform, CAPI may draw

support for multicore resource management and

upon many bits and pieces implemented in the

multicore debugging. More information can be

 

protocols listed above, which were originally

found at www.multicore-association.org.

 

intended for large scale computing platforms.

The organization welcomes you to join and get

 

The target systems for CAPI will span multiple

involved.

<End/>

 

dimensions of heterogeneity (e.g., core hetero-

www.multicore-association.org

 

geneity, interconnect heterogeneity, memory

 

 

 

 

</Feature>

ESE Magazine May/June 06

28

Six-pin microcontrollers offer ASIC design versatility

<Written by> Scott Fink, Microchip Technology Inc. </W>

Using microcontrollers can provide a route round the inflexibility of ASICs.

MANY HIGH-VOLUME applications benefit from ASIC technology. By tailoring the silicon exactly to the application, size and cost can be kept to a min-

imum. The ASIC approach has contributed so much to the cost-effectiveness of some products that it has made the difference between commercial success and failure.

However, these benefits are usually gained at the cost of versatility. If the design needs to be changed, it is probable that the ASIC will need to be “re-spun”, resulting in massive extra costs and delays in time-to-market. Until now, that is. New 6-pin microcontrollers now coming onto the market can provide cost-effective “electronic glue” in a very small space – enabling an existing ASIC to be used in a changed circuit.

The design process for an ASIC is complex and the combination of design, simulation and verification can often take over a year. Non-recurring engineering (NRE) charges are usually very high, so it is important to be confident that the application will sell in large numbers, so that these costs can be amortised. Long processing times in the fab, from the submission of the design to finished silicon, can delay the product reaching the market, and therefore reduce profitability.

Changes

One of the biggest risks with an ASIC is that the system design changes after the design has been finalised but before the silicon is delivered, or that the design changes only a short time into the end-product’s design life. These changes could

be caused by market fluctuations, a change in surrounding elements of the overall system, or even a whim of the marketing department. There is also the risk that, during the in-system validation of the ASIC, it becomes clear that the specification for one or more of the other components in the system are far from accurate. Of course, it is possible to redesign, resimulate, and reverify the ASIC, send in another NRE check, and then wait for the new design to come out of the fab, or a small microcontroller can be added to the ASIC board to correct the problem (see Figure 1).

For example, a microcontroller with internal oscillator, program memory, I/Os, an 8-bit timer and an optional comparator can cost-effectively correct a multitude of timing or data format problems.

Interfacing

A number of opportunities can be opened up by including a 6-pin microcontroller from the initial concept stages of an ASIC-based design. For example, if there is any uncertainty about the interface to another device or part of the system, using a 6-pin microcontroller can allow the design of the ASIC to continue without the risk of it not working when silicon is received. The microcontroller can then quickly be programmed to implement whatever interface specifications are finally needed.

Programmable board serialisation and authentication, or programmable I/O for different system configurations, can easily be provided by an on-board 6-pin microcontroller. The microcontroller can be used with comparators to

read sensors and could provide calibration and linearisation, so sensors from multiple vendors could be used. This can allow the purchaser to choose the sensor that provides the best value when purchasing decisions are made.

Field reprogramability

A microcontroller can also be used to enable an ASIC-based design to be re-programmed in the field. The features of a single, ASIC-based board design can easily be enabled or disabled, eliminating the need to stock different variants. The robustness of a 6-pin microcontroller could allow it to be used as a system watchdog timer, where if a sophisticated health check communication between the ASIC and the microcontroller fails for any reason, the microcontroller could reset the system, or put it in a known failsafe state.

Microchip Technology’s PIC10F20X 8-bit flash family of 6-pin microcontrollers can also extend the life of an existing ASIC-based design by adding new functionality and features without the need for a full re-design. The use of a small microcontroller can allow the interface to the ASIC to be changed, for example changing a

Figure 1: An “Electronic glue” application example

www.microchip.com

embeddedpcdesign

Compliant. Brilliant!

Designed to provide a fully customised embedded solution with minimal engineering and adaptation costs, the new GX2 ETX is a complete processor core which will enable a custom product to be developed and brought to market quickly and easily.

Geode GX533 processor ETX

400 MHz

512MB DDR RAM

+44 (0) 1462 675530 design@dsl-ltd.co.uk www.embeddedpcdesign.com

ADVANCED MICRO PERIPHERALS LTD.

“The Embedded Video Experts”

MPEG4000 - 4-Channel MPEG4 Encoder

h4 Live PAL/NTSC inputs.

h4 concurrent recordings to disk

hSynchronized Video/Audio recording

hSDK’s for Windows NT/2000/

XP, Linux

http://www.ampltd.com/prod/mpeg4k.html

MICRO886ULP - Ultra Low Power CPU

hFanless operation @ 1GHz h128-512MBytes SDRAM hPC/104 and PC/104+ h10/100MBit Ethernet hSerial, Parallel, USB hEIDE and Floppy h+3.3V and +5V tolerant PCI h-40 to +80°C option

http://www.ampltd.com/prod/u886.html

VAC104+ - Video Annotation Controller

hPAL or NTSC input

hVideo Blending with VGA

Graphics

h256 levels of Alpha Blending hPAL/NTSC and VGA outputs

hSDKs for Windows, Linux,

QNX

http://www.ampltd.com/prod/vac104p.html

All products designed, manufactured and supported in the UK

ADVANCED MICRO PERIPHERALS LTD

TEL: +44 1353 659500 FAX: +44 1353 659600 EMAIL: sales@ampltd.com

http://www.ampltd.com

</Feature>

ESE Magazine Jan/Feb 04

30

Accelerating compute intensive embedded applications

<Written by> Joe Hanson, Stretch Inc. </W>

Software configurable architectures can conquer the requirements of real-time processing.

T

straints.

of software merge the sors with ASIC. Both are described

sion instructions, enabling developers to design "hardware as software" and resulting in reduced design complexity and faster time-to-market. This is achieved by accelerating performance through increased data parallelism, sufficient data bandwidth, and temporal parallelism.

"Hardware to software" enables developers to work within a single development flow that keeps design in a software development environment that is both well-established and familiar to developers. Instead of improving performance by hand-coding assembly or partitioning algorithmic code to an FPGA, for example, developers identify "hot spots" within the application. This enables the compiler to create an optimized configuration to be implemented in a programmable fabric and then treat it as it would any other instruction.

This is critical for evolving applications. With coprocessor-based architectures, development of hardware acceleration logic cannot begin until the algorithm has been completely designed in software. In effect, this reverses the traditional "hardware first, then software" design model. The only way to hardware-accelerate performance without completely undermining time-to- market is through concurrent software and hardware development as is possible with a soft- ware-reconfigurable architecture.

Acceleration through software

Being able to process multiple data concurrently with a single instruction is an essential mechanism for improving performance without having to increase clock frequency and corresponding manufacturing cost and power consumption. Softwarereconfigurable architectures are able to further increase efficiency by defining data size and format on an instruction-by-instruction basis. In this way, extension instructions can match data exactly,

lize them. As a consequence, data bandwidth directly determines the degree of data parallelism that is possible.

Pipeline

Temporal (pipelined) parallelism is the ability of an architecture to make optimal use of the instruction pipeline while executing multiple instructions simultaneously. Efficient temporal parallelism is only possible when hardwareaccelerated instructions share the same instruction decode unit and pipeline as standard instructions. When extension instructions are executed in a separate pipeline, such as is the case when an ASIC, DSP, or FPGA is used as a coprocessor, managing dependencies between standard and extension instructions becomes extremely difficult. Mechanisms to manage these dependencies increase instruction latency and undermine deterministic processing performance; developers must assume worst-case latency to simplify operation. Given that context data (i.e., intermediate results) must also be transferred to a coprocessor with data, processing and bandwidth inefficiencies are also introduced.

With a shared pipeline, context can be passed with a pointer to shared memory or, for zero overhead, stored in state registers persistent from extension instruction to extension instruction. Dependencies can also be managed by the pipeline, minimizing latency. Because such latency is now deterministic, it can be accounted for optimally by the compiler.

Because the pipeline is shared in a softwarereconfigurable architecture, extension instructions implemented in programmable logic are described in the same high order language (i.e., C/C++) as unaccelerated application code in the same format. This is the foundation of the "hard-

 

 

 

 

 

 

 

 

 

 

 

 

I-Cache

 

D-Cache

 

SRAM

DATA RAM

32KB

 

32KB

 

256KB

32KB

 

 

 

 

 

 

MMU

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Control

32-BIT RF

 

 

128-BIT WRF

 

 

 

 

 

 

 

ISEF

 

ALU

 

 

 

 

 

 

 

 

 

 

 

 

INSTRUCTION SET

 

 

 

 

FPU

 

 

EXTENSION FABRIC

 

 

 

 

 

 

 

 

 

 

 

 

RISC Processor

Tensilica - Xtensa V

32 KB I & D Cache

On-Chip Memory, MMU

24 Channels of DMA, FPU

Wide Register File

128-bit wide

32 entries Load/store unit

128-bit load/store

Auto increment/decrement

Immediate, indirect, circular

Variable-byte load/store

Variable-bit load/store

ISEF

3 inputs and 2 outputs

Piplined, byassed, interlocked

32 16-bit MACs and 256 ALUs

Bit-sliced for arbitrary bitwidth

Figure 1: The Stretch engine

ware as software" design model for softwareconfigurable processors as software and hardware are implemented simultaneously through an optimizing compiler. This compiler produces an optimal implementation, minimizing latency, maximizing pipeline efficiency, and increasing the frequency with which extension instructions can be issued. In contrast, architectures that implement extension instructions in a separate FPGA device require developers to work in a different development language such as Verilog using a completely separate development environment.

Configurable fabric

The Stretch S5 is an example of the impact of "hardware as software" acceleration (see Figure 1). The S5 combines a standard five-stage pipeline Tensilica Xtensa V RISC core with a configurable fabric known as the Instruction Set Extension Fabric (ISEF). The ISEF provides a large array of computational resources (4096 arithmetic unit bits and 8912 multiplier unit bits) interlocked to the Xtensa pipeline that can be used in any bit width, thus conserving resources. The 128-bit wide registers (WR) between the RISC core and ISEF facilitate the passing of multiple data. Results can be stored in a single operation and require no computational resources. In this way, the S5 can execute the equivalent of hundreds of C/C++ operations in a single extension instruction, resulting in a considerable improvement in performance over traditional architectures. Extension instructions are optimized for the specific application with the ability to add new instructions at any

stage in the development process.

<End/>

www.stretch.com

 

Соседние файлы в предмете Электротехника