
functionality occurred in cstart.a66. We also needed to write initialization code. We wrote code to initialize the switches that handle CAN — but no other CAN support — because CAN is not used in the basic port. For specifics, look at cstart.a66 and search for the npe and rtxc labels to find code changes specific to this port. Keep in mind that when porting to new hardware, you may want to adopt a similar strategy for partitioning the code into hardware- and RTOS-specific changes, because partitioning code through the use of labels helps with code maintainability.
Test Code
Finally, we needed to create some test code to test our port. Building the test code application was a two-step process:
1. We compiled the RTXC kernel into a library object (rtxc.lib).
2. We compiled the test code and linked in rtxc.lib to create the executable.
There are two directories for generating the test code, stored at the same level in the hierarchy. All files for creating rtxc.lib are located in the kernel directory, while test code-specific files are located in the Porting directory.
The RTXCgen utility creates a set of files corresponding to each RTOS component. For instance, application queues are defined in three files: cqueue.c, cqueue.h, and cqueue.def. The same holds true for tasks, timers, semaphores, mailboxes, and the rest. Changes to the number of RTOS components are handled by this utility. For example, if we wanted to change the number of tasks used by the test code, we would use RTXCgen to do it. Figure 2 shows the contents of the task definition file for the test code application. Test code files created by RTXCgen are placed in the Porting directory. Once RTXCgen has defined the system resources, we are ready to build the project.
Creating the executable test code requires the build of two subprojects — the kernel and the test code. We performed builds using the Keil Microvision IDE (http://www.keil.com/). Keil uses project files (*.prj files) to store its build information. RTXC kernel creation consists of building the code using the librtxc.prj file located in the kernel directory. Invoking the librtxc project compiles, links, and creates a librtxc object in the kernel directory. Building the test code is accomplished using the NpeEg.prj file stored in the Porting directory. Invoking the NpeEg project compiles and links files in the Porting directory, and links in the librtxc object from the kernel directory. The resulting executable is placed in the Porting directory as well. Once the test code was fully built, we were ready to test the board port.
The test code is a simple application used to validate the porting process. Most of the test code is located in main.c in the Porting directory. The application works by starting five tasks — two user and three system. User tasks execute alternately, while system tasks execute in the background. One user task begins running. It then outputs data via one of the system tasks to the console. Next, it signals the other user task to wake up and puts itself to sleep, waiting for the other task to signal it to wake up again. Figure 3 shows the executing test code.
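Though main.c isn't reproduced here, the handshake between the two user tasks looks roughly like this minimal sketch in C. The semaphore names and generated headers are illustrative assumptions (RTXCgen would define the real ones), and it presumes RTXC's KS_signal/KS_wait semaphore services:

#include <stdio.h>
#include "rtxcapi.h"    /* RTXC kernel service prototypes          */
#include "csema.h"      /* RTXCgen-generated semaphore handles     */

/* WAKEA and WAKEB are illustrative names for the two semaphores
   the user tasks use to take turns. */
void taskA(void)        /* taskB runs the same loop, names swapped */
{
    for (;;)
    {
        puts("task A alive");   /* console output actually flows   */
                                /* through one of the system tasks */
        KS_signal(WAKEB);       /* wake the partner task           */
        KS_wait(WAKEA);         /* sleep until the partner signals */
    }
}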
Conclusion
That’s pretty much it. Porting software to a new hardware board doesn’t need to be hard. With a firm plan (and following this simple process), porting just got a lot easier.
Acknowledgments
Thanks to Joe Tretter and Rick Gibbs at Northpole Engineering for their assistance.
DDJ
Listing One
;NPE specific code in cstart.a66
;NPE-167 Special Function Register Addresses
P2      DEFR    0FFC0H
DP2     DEFR    0FFC2H
ODP2    DEFR    0F1C2H
P7      DEFR    0FFD0H
DP7     DEFR    0FFD2H
ODP7    DEFR    0F1D2H
PICON   DEFR    0F1C4H
CC8IC   DEFR    0FF88H
EXICON  DEFR    0F1C0H
T6CON   DEFR    0FF48H
;----------------------------------------------------------------
;HARDWARE INITIALIZATION FOR NPE-167 A
;This code initializes the processor to work with the peripherals
;on the NPE-167 A.
$IF (WATCHDOG = 1)
        SRVWDT                          ; SERVICE WATCHDOG
$ENDIF
        BCLR    T6CON.6                 ; Shut off timer.
;Initialize Ports 2,3,7 and 8 as standard TTL levels.
        MOV     R5,#0
        MOV     PICON,R5
;
;Initialize Port 2.
;  Set Output = P2.0:         IO Bus Reset
;             = P2.1, P2.2:   System LED
;             = P2.3, P2.4:   CAN LED
;             = P2.6, P2.7:   Memory Page Select
;  Set Input  = P2.10, P2.11: CAN Speed Select (SW8, SW9)
;
        MOV     R5,#001Fh               ; Set outputs to off.
        MOV     P2,R5
        MOV     R5,#0001h               ; IO is open drain.
        MOV     ODP2,R5
        MOV     R5,#00DFh               ; Set output direction.
        MOV     DP2,R5
;
;Initialize Port 7.
;  Set Output = P7.0 - P7.3: IO Bus Slot Select.
;
        MOV     R5,#000Fh               ; Set outputs to off.
        MOV     P7,R5
        MOV     R5,#000Fh               ; IO is open drain.
        MOV     ODP7,R5
        MOV     R5,#000Fh               ; Set output direction.
        MOV     DP7,R5
;
;Setup IO Interrupt (EX0IN).
;Disable external interrupts and set Interrupt level to 7,
;group level to 3, negative edge.
;
        MOV     R5,#001Fh
        MOV     CC8IC,R5
        MOV     R5,#0001h
        OR      EXICON,R5
Listing Two
;=========================================================================
;** Beginning of RTXC specific code **
;=========================================================================
;NULL STACK SIZE DEFINITION
;
;define the size of the stack for the null task.
;NOTE: Ensure you modify the 'C' level constant of the same name in
;the RTXCOPTS.H file.
;----------------------------------------------------------------
$IF MICROVISION
NULLSTKSZ       EQU     80H
$ELSE
$INCLUDE(rtxcopts.inc)                  ; include kernel assembly definitions
$ENDIF
; ** END RTXC specific code **
;=========================================================================
;=========================================================================
;** Beginning of RTXC specific code **
;This user stack is used only for startup and entry into main()
USTSZ   EQU     40H                     ; set User Stack Size to 64 Bytes.
; ** END RTXC specific code **
;=========================================================================
;=========================================================================
;** Beginning of RTXC specific code **
EXTRN   nullstack:WORD
EXTRN   DPP3:DPP1_INITVALUE:WORD
EXTRN   DPP3:DPP2_INITVALUE:WORD
; Initialize the 'C' variables for task frame initialization
        MOV     DPP1_INITVALUE, DPP1
        MOV     DPP2_INITVALUE, DPP2
$IF NOT TINY
        MOV     R0, #POF nullstack      ; restore user stack pointer
        MOV     R5, #PAG nullstack
        MOV     DPP2,R5
        NOP
        BSET    R0.0FH                  ; User stack uses DPP1
$ELSE
        MOV     R0, #nullstack          ; restore user stack pointer
$ENDIF
        MOV     R10, #NULLSTKSZ
        ADD     R0, R10                 ; get to top of stack
; ** END RTXC specific code **
;=========================================================================
DDJ
P R O G R A M M I N G P A R A D I G M S
The Search for Search and Other Inquiries
Michael Swaine
When I studied artificial intelligence in graduate school, we were encouraged to think of any programming task as a search of a problem space. You just decide what problem it is that you're trying to solve, figure out how to characterize any solution to that problem, settle on a searchable space that includes all such solutions, and voilà! Your arbitrary programming task has been transformed into the seemingly routine job of writing an efficient search routine.
Of course, this daunting business of inflicting on yourself a radical paradigm shift over to the search model isn’t really something you’d want to task your imagination with every day. Writing a spreadsheet application, for example, isn’t obviously helped much by viewing the task as a search through a space of programs looking for one that has a lot of little boxes with numbers and formulas in them. Or any other problem-space-search representation that I can think of. Still, it can be a powerful technique, and in fact, it was responsible for bringing about the first successes in artificial intelligence — the early puzzle-solving and game-playing programs. So say Avron Barr and Edward Feigenbaum in The Handbook of Artificial Intelligence (William Kaufmann, 1981; ISBN 0865760047), and they should know.
An entirely different idea regarding the desirable ubiquity of search is the notion that every viable 21st century software business model can and should be built around search. A corollary of this rather bold theorem is the idea that Microsoft wants to or needs to become Google, a notion that one would be tempted to discount as a gross oversimplification if it weren't for the fact that the person seemingly most responsible for putting the thought into the technological public's imagination is Bill Gates. And he should know.
Michael is editor-at-large for DDJ. He can be contacted at mike@swaine.com.
What we all know by now is that Microsoft really is searching for a new business model. Maybe they’ll find it in one of the many search-related projects underway in Microsoft labs. Maybe they’ll adopt Google’s when Google gets done with it.
What I can tell you is that you’ll find here some random observations on search and a brief look at another of those fat books with ambitious titles for which I have an odd fondness. On my Fat Book Shelf right next to Stephen Wolfram’s A New Kind of Science (1197 pages) and Stephen Jay Gould’s The Structure of Evolutionary Theory (1433 pages) stands Roger Penrose’s The Road to Reality: A Complete Guide to the Laws of the Universe (1099 pages). I’ve wanted to have someone explain to me the laws of the universe, and because Penrose won a Wolf Prize with Stephen Hawking “for their joint contribution to our understanding of the universe,” he should know.
Search Is Hardly Hard to Find
As I write this, search is much in the news (with Google indexing blogs, news is much in the search, too). Google itself is much in the news. Stock price? You could drive across the United States in an SUV at summer 2005 gas prices for less than the price of a single share of Google stock. CNet writer Declan McCullough recently wrote a piece canonizing the old search protocol Gopher and its Veronica server, while in The New York Times, John Battelle was raving about the Web 2.0 conference, much of which was about search. Meanwhile, there's a campaign to save the search mascot, Jeeves, and we read that "Bill Gates Visits the Holy Land and Talks Search." Search is everywhere.
News aggregators? Ping servers? Mapping, GPS, people search, social bookmarking, tagging, communities of interest. Just more ways to search.
Some communities of interest, particularly ones that lead to people making contact off the Internet, make some people nervous. And what interests are we talking about? Suicide, for example? Should search engines build technologies to push people who are searching for suicide information toward help? Or is it always a bad idea to subvert the proper working of a search engine? And before you answer, have you ever engaged in Googlebombing?
Which raises the rather naïve but important question of whether you can trust search. Clearly you can't, so what can be done about this? Dogpile is a search aggregator that could suggest the way: Don't trust one engine, but apply some metric of trust over all of them to route around bias and error and Googlebombing.
If All You Have Is a Card Catalog, Everything Looks Like a Lookup
If the Internet is the center of your work, then search is the main task you have to perform.
There was a lot of second guessing in the press about the meaning of Google and Sun agreeing to cooperate on technologies. Mostly this was speculation about how they would target Microsoft. Would the two companies use OpenOffice.org to challenge Microsoft in the Office suite software arena? Or is that yesterday’s platform, and would they push harder on the idea of the Internet as the center of the computer user’s world?
I don’t think anyone knows exactly what Sun and Google will accomplish together, but some scenarios seem more likely than others. Here’s a question that I think brings some of the speculation into focus: Which is more likely, that Google
68 |
Dr. Dobb’s Journal, February 2006 |
http://www.ddj.com |
copies Microsoft, or that Microsoft copies Google?
It’s fun to ask questions like that, but it’s not much fun to watch judges grapple with tricky technological issues. When they have trouble deciding whether “Intelligent Design” is religion or science, I worry about their ability to determine the senses in which BitTorrent resembles Grokster. “Torrent files don’t contain any data,” a defender argued. “This is a search engine scenario. Why aren’t Google, Yahoo, or Microsoft getting sued?”
In fact, Google has been threatened over its Google Print technology, despite the fact that it has gone out of its way to avoid copying copyrighted material.
As I understand it, Google Print, which allows searching inside books, indexes those books rather than caching their content. This means that the book is nowhere copied, and that it would be extremely difficult to reassemble the book from the index.
But Google does cache content, of course — the content of web pages.
The truth is, this routine caching of web pages is much more clearly a case of copying than anything Google is doing with books. When I Google a topic and find several news stories on the topic, and click on a link and find that it has expired and the news service has removed the story or pulled it behind a subscriber wall, I just click back to the search results and go to the cached version. This clearly undermines the intentions of the news service. Is it illegal? Should it be?
Making caching illegal could cause serious damage to the way search engines work and the way the Internet works. But this underscores one of the ways in which the Internet, working as it was intended, calls copyright and other intellectual property laws into question.
Is Advertising Search? Search Me
Personally, I think that a lot of problems are better described as challenges in visualization rather than search tasks. Sometimes you know where the information is and you just want to make some sense of it. Yes, you could define that as some sort of search. And I do suspect that when I’m staring at a spreadsheet I may be getting bogged down in data rather than viewing a solution or insight. But I think that sometimes we want to be intimately involved in the search process, and that converts the process into something other than pure search.
Advertising is certainly search. Search with a hook, which is to say fishing, and much of it is of a type of fishing known as “chumming.” Throw the bait out on the water and hope that some big fish comes along and you’ll be able to snag it.
The opposite of this is targeted advertising, which at first blush seems like a powerful idea that can improve the efficiency of advertising by orders of magnitude. The idea is not new, but technology today makes much greater targeting possible. To a scary degree. But on second glance (and after watching Glengarry Glen Ross again), it seems to me that it's not so simple. Selling is about converting a nonprospect into a prospect and into a customer. There is an inherent problem in defining the search space, due to the unwillingness of the salesperson to refine — and thereby reduce — it. Or at least there are conflicting desires. So maybe the picture of advertising as search is not so clear.
The Search for the Meaning of It All
I finally reached the end of the road; that is, the last page of Roger Penrose’s The Road to Reality (Alfred A. Knopf, 2004; ISBN 0679454438). It was a tough slog. The math made my head hurt, and I like math. I was going to dedicate this whole column to a review of it, on the principle that if I have to wade through 1100 pages of complex manifolds and holomorphic functions, you should be forced to suffer proportionately. But the truth is, I don’t understand this book a whole column’s worth.
According to its jacket, this book is addressed to the serious lay reader. Yeah, right. The audience is a little more rarefied than that: Nobody who hasn’t done graduate work in mathematics is going to get much out of this book. Not only is it richly endowed with dense footnotes, but most of the footnotes have homework problems in them. Like footnote 27.16:
Give a general argument to show why a connected (3-)space cannot be isotropic about two distinct points without being homogeneous.
And make it snappy, serious lay reader. Whatever its virtues, this book is not for the faint of heart. Reading it, or anyway wading through it, was for me a humbling experience. Not only did I have it rubbed in my face how much math I've forgotten and how little I knew to begin with, but it cured me of any critical ambitions: I've in the past been critical of some things Penrose has written in his more popular and accessible writing, but here I wouldn't dream of critiquing him. Just over my head.
But I did get something out of the book, I do think that some DDJ readers might find it interesting, and I'm not sorry that I made the effort.
Penrose is a brilliant, important thinker, a collaborator with Stephen Hawking, and he's not kidding about the title. This book is a whirlwind tour through all the important questions in modern physics and all the math needed to truly understand the questions. I've written here about some unorthodox approaches (Wolfram, Fredkin) to understanding the laws of the universe, approaches that make those laws look like computer programs. Penrose has a different view of these things, but his approach is also challenging to orthodoxy. Although "orthodoxy" is probably the wrong term when any theory that fits the empirical data of quantum physics has to be flat-out crazy. What makes Penrose et al. germane to this admittedly wide-ranging column is that information is central in all their theories. Information seems to be at the heart of everything; for example, in a brief moment of accessibility to that serious lay reader, Penrose exposes the common misconception that we depend on the energy from the sun for our survival. Nope: Entropy, not energy, is the key. We consume the sun's information.
It took me a while to figure out what Penrose was doing in this book. This, I think, is what he’s up to: He wants you to be able to visualize mathematical structures. Spaces, fields, bundle spaces. He provides an enormous number of illustrations, mostly looking like something left in the oven too long: surfaces or solids of odd shapes curled back on themselves. It’s a truism that you can’t really visualize 4D space, but you can use visualizations to gain insight into four dimensions, just as you can’t fully represent 3-space objects on 2D paper, but we manage to model them usefully via projections of various kinds. Penrose pushes this as far as he can. Even if you don’t understand all of his helpful diagrams, it’s impressive to realize that he works so hard to find a way to visualize every one of these extremely abstract concepts.
In the later chapters, we see what all this visualization work was for, as he introduces concepts in physics that he — as a mathematician — sees mathematically. Now the foundation work in the early math chapters helps you get an idea how he visualizes the bizarre quantum properties of the universe. If you really work to understand the visualizations in the early chapters, you’ll be able to visualize the tough stuff in the later, physics, chapters.
Like: Ah, this is one of those things where the pie crust has lots of little fishhooks coming out of its upper surface.
I realize that I’m not giving much of a sense of what Penrose covers in the book. Okay, for example, he critiques ontologies for quantum theory, including the Copenhagen interpretation, the manyworlds view, environmental decoherence, consistent histories, and pilot wave approach, and presents his own unorthodox view. He covers the Special and General Theories of Relativity, Quantum Theory and quantum phenomena, and candidates for a Theory of Everything. And like that.
Findings
When he gets to the Theories of Everything, he provides much material for IDers to get excited about. Like his picture of an anthropomorphic Creator performing an extremely low probability act to place the universe in the immensely low entropy — thus special — Big Bang state. But it's a metaphor, just as he's being metaphorical when he presents Lee Smolin's notion of multiple universes spawned by multiple universes, with a kind of intergalactic natural selection leading to the evolution of better fitted universes. Darwin on the largest scale.
Finally, in chapters 30 and 33, Penrose presents his own approach, which I won't attempt to summarize except to say that he takes on all the giants of quantum theory. His theory is unorthodox, and he acknowledges that it's also short on testable predictions. Its virtues are at present chiefly aesthetic.
However, his theory is not alone in being hard to test. Penrose cautions against being seduced by mathematical beauty when the kind of theories you’re dealing with are Big Science, where empirical refutability can be hard to come by.
As for me, I’m all for beauty and simplicity and the parsimonious explanation. I’m starting to lean toward that model that says that the universe can be most simply described using 11 dimensions. But maybe just because I want to refer to it as “Occam’s Eleven.”
Sorry.
DDJ
Dr. Ecco Solution
Solution to “Fractal Biology,” DDJ January, 2006.
1. With eight nodes, you need only 16 links; see Figure 1.
2. With 12 nodes, you could design three sets of four nodes that are completely connected (requiring 3×6=18 links), then add in four links between every pair of sets, leading to an additional 12 links for a total of 30 altogether.
Figure 1.
3. If there is no limit on the number of links per protein node, then try Figure 2. You would need only 21 links. We call this the "two-fan design," because each hub creates a fan. You need two fans in case one hub is wounded.
4. Divide the nodes into 96 nodes (the base nodes) that will have six links each and 12 nodes (the switchboard nodes) that will have 59 links each. This results in a total of 642 links. Number the 96 base nodes from 1 to 96. Call 1 through 24 the A nodes, 25 through 48 the B nodes, 49 through 72 the C nodes, and 73 through 96 the D nodes. The 12 switchboard nodes are divided into two groups of six. The first group is called AB, AC, AD, BC, BD, and CD. The second group is called AB', AC', AD', BC', BD', and CD'. Figure 3 shows all the nodes and all the interconnections except the complete graph among all the switchboard nodes (that is, every switchboard node is connected to every other one). Note that XY' has the same connections as XY (both to the base nodes and to the hub nodes).
Why does this work? Any two base nodes in different letter groups, say a B node and a D node, are connected through two switchboard nodes (in this case, BD and BD'). Two base nodes in the same letter group are connected through six switchboard nodes. Any base node is connected to any switchboard node either directly or in two hops, because the six switchboard nodes it links to are themselves connected to every other switchboard node.
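As a sanity check, the following small C program (mine, not part of the published solution) builds the 108-node graph described above and verifies both the 642-link count and the claim that every pair of nodes is within two hops; the zero-based node numbering is an assumption made for the code:

#include <stdio.h>

#define N 108   /* 96 base nodes (0-95) + 12 switchboard nodes (96-107) */

static unsigned char adj[N][N];

static void link2(int a, int b) { adj[a][b] = adj[b][a] = 1; }

int main(void)
{
    /* The six letter pairs AB..CD as (group, group), groups A=0..D=3. */
    static const int pair[6][2] = {{0,1},{0,2},{0,3},{1,2},{1,3},{2,3}};
    int links = 0, p, g, i, j, k;

    for (p = 0; p < 6; p++)             /* base-to-switchboard links  */
        for (g = 0; g < 2; g++)
            for (i = 0; i < 24; i++) {
                link2(pair[p][g]*24 + i, 96 + p);    /* XY            */
                link2(pair[p][g]*24 + i, 102 + p);   /* XY'           */
            }
    for (i = 96; i < N; i++)            /* complete switchboard graph */
        for (j = i + 1; j < N; j++)
            link2(i, j);

    for (i = 0; i < N; i++)             /* count the links            */
        for (j = i + 1; j < N; j++)
            links += adj[i][j];
    printf("total links: %d\n", links); /* prints 642                 */

    for (i = 0; i < N; i++)             /* verify diameter <= 2       */
        for (j = i + 1; j < N; j++) {
            int ok = adj[i][j];
            for (k = 0; !ok && k < N; k++)
                ok = adj[i][k] && adj[k][j];
            if (!ok) { printf("FAIL: %d-%d\n", i, j); return 1; }
        }
    printf("every pair is within two hops\n");
    return 0;
}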
DDJ
Figure 2.
Figure 3: Base nodes A: 1–24, B: 25–48, C: 49–72, D: 73–96; switchboard nodes AB, AC, AD, BC, BD, CD and AB', AC', AD', BC', BD', CD'.
E M B E D D E D S P A C E
Memory Matters
Ed Nisley
The Difficult is that which can be done immediately; the Impossible that which takes a little longer.
Applications programmers regard computer memory as an essentially endless line of identical bytes awaiting their data structures. Systems programmers take a more nuanced view, reserving distinct regions for the operating system, application code and data, and perhaps a few memory-mapped peripherals.
Embedded systems folks, alas, must contend with downright weird configurations. The intimate relationship between code and hardware occurs well below the usual levels of abstraction, where physics sets the speed limits, and manufacturing costs define what's available. Using a full-featured operating system just adds another layer to that complexity.
Two threads from the 2005 Linux Symposium lead back into memory matters. I’ll start with the good news, then proceed to something confirming the growing awareness that system security is really hard.
NAND Versus NOR Versus RAM
About a year ago, I observed that the serial nature of NAND Flash memory precluded running code directly from it. As with all things technological, where there’s economic pressure, there’s a way to make the seemingly impossible happen. It’s just a matter of trade-offs.
The Consumer Electronics Linux Forum BoFS at the Linux Symposium included a brief discussion of a Samsung NAND Flash research project that allows XIP (Execute-in-Place) access, so you can run program code directly from the chip. The paper outlining the technique illustrates just how weird embedded system memory can be. However, first you must understand why everything you think you know about memory is wrong.
Ed's an EE, PE, and author in Poughkeepsie, NY. Contact him at ed.nisley@ieee.org with "Dr Dobbs" in the subject to avoid spam filters.
On the small end of the scale, a product's manufacturing cost can make or break it in the marketplace. That cost, in turn, depends on how many chips must be soldered to the board because the assembly cost can exceed the chip cost. Single-chip solutions reduce both board area and chip count, and may therefore reduce the overall cost even if they're more expensive than the components they replace.
Any tiny gizmo that handles music or video uses NAND Flash memory, which puts vast, cheap bulk storage behind a disk-like serial interface (ignoring, of course, those gizmos with miniature hard drives). That means a minimum of two chips: NAND Flash and a single-chip microcontroller with a CPU and on-chip program and data storage.
That’s true for very small systems, but anything big enough for a real operating system requires a few megabytes of storage. Even with today’s technology, that means four chips: NAND Flash, NOR Flash for the program, RAM, and the microprocessor. Plus, of course, whatever analog widgetry may be required to turn it into a phone-camera-PDA-web-pod.
In round numbers, a megabyte of NOR Flash costs five times more than NAND Flash and uses three times more power, so there’s a mighty force aligned against that fourth chip. Storing less code, compressing it, and using other tricks may allow a smaller NOR Flash chip, but you want to eliminate that thing entirely.
Some of Samsung’s current NAND Flash parts use their internal buffer RAM as a tiny XIP random-access memory that’s automatically loaded with the contents of a specific NAND page before the CPU boots up. It’s small because NAND Flash chips have only a dozen or so address lines that normally select a block within the chip, so there just isn’t much address space available.
That XIP code copies the bulk of the program from NAND Flash to external RAM, then jumps into the main routine. That's a comfortably familiar process occurring in hundreds of millions of larger systems (albeit using disk drives instead of NAND Flash), but in the embedded world it has a severe downside: The system must store two copies of the program code.
Again in round numbers, RAM costs at least 10 times more than NAND Flash and dissipates over five times more power. Program code in RAM tends to be essentially read-only after it’s loaded, so you pay top dollar for a huge expanse of RAM that’s used as ROM. Worst, the copy in NAND Flash is completely unused after booting.
The folks in charge of money don’t like hearing that, of course.
NAND XIP
The entire recent history of CPU development revolves around the simple fact that memory-access time is much, much slower than CPU instruction cycle time. In fact, high-end CPUs now have three levels of cache in front of the main memory, each dedicated to anticipating the CPU’s next request. Lower performance CPUs don’t have quite the same bandwidth pressure (that’s why they’re lower performance, after all) and small embedded systems tend to get by with relatively poky CPUs.
The Samsung project combines the notion of NAND-as-disk with a liberal dash of Moore’s Law to come up with a memory subsystem that the CPU sees as a reasonably high-speed, random-access ROM. They implemented a prototype with an FPGA and some static RAM surrounding a NAND Flash chip, but reducing all that to a single chip is just a matter of time and, perhaps, economics.
The NAND Flash chip is a read-only backing store for the much smaller SRAM cache, with the FPGA implementing the cache-control algorithms. The CPU sees the subsystem as a standard, random-access, read-only memory rather than a serial-access NAND Flash chip. Reads from memory addresses currently in the cache proceed at SRAM speeds, while cache misses stall the system until the FPGA fetches the corresponding block from the NAND Flash.
The overall performance of any cached memory depends critically on the cache hit ratio: If it’s below the mid to upper 90 percent range, you’re sunk. A crude estimate says that when a cache hit costs 10 nanoseconds, and a miss costs 35 microseconds, a 99 percent hit ratio makes the average access time 360 nanoseconds. Ouch!
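The arithmetic behind that estimate, using the figures just quoted (a quick check, not from the paper):

#include <stdio.h>

int main(void)
{
    double hit_ns  = 10.0;      /* SRAM cache hit                  */
    double miss_ns = 35000.0;   /* 35-microsecond NAND block fetch */
    double ratio   = 0.99;      /* 99 percent hit ratio            */

    printf("average access: %.1f ns\n",
           ratio * hit_ns + (1.0 - ratio) * miss_ns);  /* 359.9 ns */
    return 0;
}

Even a 1 percent miss rate, in other words, is dominated by the enormous miss penalty.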
Unlike most cached systems, embedded applications tend to have fairly straightforward execution paths and a very limited number of programs. The Samsung designers analyzed a program trace to prioritize each address block by the number and frequency of accesses, then stored those priorities in the spare data area associated with each NAND Flash block. The FPGA controller can then decide which blocks are most likely to be required next, based on actual knowledge of the program’s execution, and fetch new blocks into the lowest priority SRAM cache lines.
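In C terms, the replacement decision might look like the sketch below; the structure, sizes, and names are illustrative assumptions, since the actual controller is FPGA logic rather than software:

#include <stdint.h>

#define LINES 64            /* e.g., a 256-KB cache of 4-KB blocks */

struct cache_line {
    uint16_t block;         /* NAND block currently held           */
    uint8_t  priority;      /* profiled access rank, read from the */
                            /* block's spare data area             */
    uint8_t  valid;
};

static struct cache_line cache[LINES];

/* On a miss, pick the victim line: any free line first, otherwise
   the line holding the lowest-priority (least-needed) block. */
static int pick_victim(void)
{
    int i, v = 0;
    for (i = 1; i < LINES; i++) {
        if (!cache[i].valid)
            return i;
        if (cache[i].priority < cache[v].priority)
            v = i;
    }
    return v;
}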
Their results for an MP3-player program show roughly 100 ns average access times for a 256-KB SRAM cache. It’s not clear whether media data is also streaming through the cache, which could either increase or decrease the hit ratio depending on how the caching algorithm handles relentlessly sequential accesses.
In any event, the net result is random-access memory that's somewhat faster than NOR Flash and somewhat slower than SDRAM. The overall energy cost, measured in nanosecond-milliwatts, is roughly half that of NOR Flash, which may be the single most important parameter for mobile applications. However, the risk of protracted stalls on cache misses requires careful system design to ensure uninterrupted execution of those time-critical music decoding routines.
That’s the sort of memory tradeoff embedded-systems designers and programmers must put up with all the time. Beyond the usual requirement for correct functions, even the code’s location can scuttle a project. Building a working system sometimes seems impossible, but the success of hand-held gizmos shows that it’s merely difficult.
Stack Smashing
Back in the land of infinite virtual memory, even applications programmers could benefit from a little memory differentiation, as a good idea can go badly wrong with just the right design decisions. Here’s a horror story from the world of big memory that extends down into the embedded world.
Back when Intel introduced the 8080 microprocessor, solid-state memory was still breathtakingly expensive. The 8080, implemented with a countably finite number of transistors, had an 8-bit ALU and 16-bit addresses. Filling that 64-KB address space was more than most folks, myself included, could afford.
The 8086 microprocessor had a 16-bit ALU, but was more-or-less compatible with the 8080 at the assembly language level. In order to access more than 64 KB of memory, Intel introduced segment registers, which basically provided the upper 16 bits of a 20-bit address. Programmers became intimately familiar with the CPU’s CS, DS, SS, and ES registers because large programs sprawled their code, data, and stack storage into different 64-KB segments.
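The 8086 address calculation itself is simple; here it is as a small C sketch (the example values are mine):

#include <stdio.h>

/* A real-mode 8086 physical address: the 16-bit segment register,
   shifted left four bits, plus the 16-bit offset. */
unsigned long phys(unsigned segment, unsigned offset)
{
    return ((unsigned long)segment << 4) + offset;  /* 20-bit result */
}

int main(void)
{
    /* Different segment:offset pairs can name the same byte. */
    printf("%05lXh\n", phys(0xF000u, 0xFC00u));     /* FFC00h */
    printf("%05lXh\n", phys(0xFFC0u, 0x0000u));     /* FFC00h */
    return 0;
}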
Those segments were different in name only, as the hardware didn’t enforce any access semantics. You could reach any type of data using any segment register with complete impunity. Needless to say, that led to some truly baffling bugs.
The 32-bit 80386 (the less said about the 80286 the better) enhanced the notion of segments to provide memory protection, while grafting paged virtual memory onto the side. You didn’t have to use the VM paging hardware, but memory segmentation was mandatory.
Segment registers became pointers into tables of segment descriptors and, with the hardware now enforcing access semantics, a once-quirky architecture abruptly grew teeth. Segments contained only code or only data, selected by a single descriptor bit. Code segments could be execute-only or execute-read, while data segments could be read-only or read-write. Stack segments became a specialized data segment, with the ability to grow downward rather than upward.
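A rough C rendering of that descriptor (a sketch for orientation, not portable systems code; the bit-field layout assumes a compiler that allocates fields from the least significant bit upward):

#include <stdint.h>

struct seg_descriptor {               /* 80386 segment descriptor      */
    uint32_t limit_low  : 16;         /* limit bits 0-15                */
    uint32_t base_low   : 16;         /* base bits 0-15                 */
    uint32_t base_mid   : 8;          /* base bits 16-23                */
    uint32_t type       : 4;          /* code/data plus the             */
                                      /* execute/read/write bits        */
    uint32_t s          : 1;          /* 1 = code or data segment       */
    uint32_t dpl        : 2;          /* descriptor privilege level     */
    uint32_t present    : 1;          /* segment is in memory           */
    uint32_t limit_high : 4;          /* limit bits 16-19               */
    uint32_t avl        : 1;          /* available for OS use           */
    uint32_t reserved   : 1;
    uint32_t big        : 1;          /* 16- vs 32-bit default size     */
    uint32_t gran       : 1;          /* limit in bytes or 4-KB pages   */
    uint32_t base_high  : 8;          /* base bits 24-31                */
};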
Once upon a time, I actually wrote a bare-metal protected-mode program with full-throttle segmentation and can state from personal knowledge that figuring out the segmentation was somewhere beyond difficult. While my demo system worked, it became obvious that scaling it up wasn’t in the cards.
I wasn’t alone, as most OS designers opted for “flat model” segmentation in x86 systems. Although the hardware enforces the segment rules, there is nothing preventing you from defining all the segments to refer to the same chunk of memory. That turned addresses into 32-bit offsets from the common segment base, rather than unique values tied to a specific segment.
The fact that you could only write data in a Data segment, pop registers from a Stack segment, and execute code in the Code segment became completely irrelevant. If you could manage to write arbitrary data into the stack segment, you could easily run it in the code segment without the hardware ever noticing.
And that, party people, explains why Windows is so vulnerable to stack-smashing attacks.
As it turns out, the Linux kernel has the same exposure. Windows just makes it easier for Other People’s Code to gain access to the stack in the first place.
No Execute, No Cry?
The textbook heap and stack implementation puts the two at opposite ends of a common storage block with the heap growing up from the lowest address, and the stack growing down from the highest. All is well, so long as the two never meet.
The C language, lacking any inherent array index checking, makes buffer overruns trivially simple: Feeding a long string into a strcpy( ) function expecting, say, a username will do the trick. A sufficiently long string not only overflows the target buffer, but can extend all the way up into the stack storage area. In fact, if the string is stored on the stack (where automatic variables within C functions live), you don’t even need the heap.
Strings in C, being just linear arrays of bytes, can contain nearly anything except the binary zero terminator that makes this attack possible. Attackers can therefore write both a small program and the register contents that pass control to it into the stack, ready for action when the abused strcpy( ) function executes a RET instruction.
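The vulnerable pattern, as a minimal sketch (the function and buffer names are mine, not from any particular program):

#include <string.h>

void greet(const char *username)
{
    char name[16];              /* automatic variable: lives on the   */
                                /* stack, near the return address     */
    strcpy(name, username);     /* no length check: anything past 15  */
                                /* characters overwrites the adjacent */
                                /* stack, eventually the saved        */
                                /* return address                     */
}

int main(int argc, char **argv)
{
    if (argc > 1)
        greet(argv[1]);         /* a long, carefully built argument   */
                                /* smashes the stack                  */
    return 0;
}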
The details of this process are tedious and depend on exactly what’s going on in the attacked program and the OS. However, stack-smashing attack generation can be automated and, should the attacker get it wrong, the attacked program crashes and wipes out the evidence. If the attack happens in the kernel stack, it can take down the entire system.
Various Linux kernel patches have made stack-smashing attacks far more difficult, but its flat-memory layout means they can't be completely eliminated. AMD, with Intel tagging on behind, has added an NX (No-Execute) bit to the virtual page description in x86-64 mode; that does solve the problem, although it obviously applies only to 64-bit programs.
All this assumes that nobody in their right mind would want to execute code from the stack. That turns out to be not quite correct, as it’s often convenient to build trampolines on the stack, a subject quite outside the scope of this column. In any event, turning off the ability to run code from the stack can break innocent programs doing entirely reasonable things, so changing the OS underneath existing code may require recompiling some applications.
But the AMD NX bit should solve the problem for new code running in 64-bit mode, right? Nope, not quite.
Frankencode
Although 64-bit CPUs aren’t commonly found in current embedded systems, let alone hand-held devices, Moore’s Law tells us that it’s only a matter of time. Let’s suppose you’re building a must-be-secure system, using both a CPU and an OS that can prevent code execution from the stack. Does a no-execute stack render buffer overflow attacks harmless, other than perhaps trashing the stack and crashing the program?
I found a paper by Sebastian Krahmer describing how a stack-smashing attack can execute arbitrary code, even on an x86-64 CPU with a properly NX-protected stack. The technique involves stitching together chunks of code from the ordinary library routines that are linked into essentially every compiled program.
Basically, an attacker can arrange the stack so that a RET instruction passes control to the last few instructions of a library function that pops the attacker's data into registers. Synthesizing system calls with the proper parameters requires finding the proper function epilogs and creating the appropriate stack contents to fill the registers. This is, of course, subject to automation.
The buffer overflow manipulates pure data on the (necessarily) writable stack, leaving code execution for already-existing functions in the code segment. The CPU’s protection mechanisms have no idea anything is amiss.
The lesson to be drawn from all this resembles the lessons found in copy protection, digital-rights management, and Trusted Computing: The attackers are at least as smart as you are, they have better tools, and they will find a way around whatever technological measures you put in place. Declaring that hardware makes an attack impossible may be strictly correct, but finding an alternative vulnerability is merely difficult.
If you’re building an embedded system that must be reliable and secure, getting the code working is just the first step. You must also control the environment around it, the access to it, and the data in it. Concentrating your attention on any one aspect, no matter how tempting, simply shifts the attacks to a weaker entry point.
Happy memories!
Reentry Checklist
The Linux Symposium proceedings are at http://www.linuxsymposium.org/2005/ and http://www.linuxsymposium.org/proceedings.php.
The Samsung paper on XIP NAND Flash is at http://www.iccd-conference.org/proceedings/2003/20250474.pdf. A summary of their existing parts, each sporting a tiny XIP boot block, is at http://www.samsung.com/Products/Semiconductor/Flash/OneNAND_TM/.
The Linux Kernel Mailing List discussion of Ingo Molnar's NX bit patch is at http://kerneltrap.org/node/3240/. Krahmer's explanation of the x86-64 NX exploit is at http://www.suse.de/~krahmer/no-nx.pdf, but the link for reference 3 should be http://www.cs.rpi.edu/~hollingd/comporg/notes/overflow/overflow.pdf. Find more on stack-smashing protection at http://www.research.ibm.com/trl/projects/security/ssp/. There is a description of GCC trampolines at http://www.delorie.com/gnu/docs/gcc/gccint_136.html.
Maybe you can't throw out your Bartlett's Familiar Quotations yet, but http://www.brainyquote.com/ is in the running to replace it. "Everything You Know Is Wrong" is a vintage Firesign Theatre album I haven't heard in quite a while. Bob Marley's "No, Woman, No Cry" is a true classic.
DDJ
C H A O S M A N O R
Beware of Sony's DRM
Jerry Pournelle
In April 2004, Sony covertly added a Digital Rights Management (DRM) scheme to its music CDs, and all Sony music CDs released after that date incorporated DRM software. This DRM system has no effect on Linux and Macintosh computers or standalone music players. However, if you attempt to play a Sony music CD on a Windows PC, you will be told that you must install a Sony music player.
You are advised not to allow that installation. If you do install the music player, you will also install a rootkit. The term comes from UNIX, where the primary superuser is known as “root” and if you are the root, you can do anything. A rootkit is a particular kind of spyware that hides from detection by spoofing the operating system into believing no such spyware exists. The directory in which the rootkit files reside is hidden and you’ll never find it with any normal operating-system command.
That’s what the Sony music CD system installs on your computer in the name of digital rights protection. The Sony rootkit is a serious invasion of your system, and is so successful at hiding that third-party spyware people can use it to hide their own malware, and at least one is reported to have done so. Moreover, savvy World of Warcraft online players used the Sony rootkit software to hide their cheat software.
It gets worse. Not only does the Sony DRM rootkit hide, but if you detect it, you cannot safely remove it. Attempts to remove it have resulted in blue-screen crashes and the requirement to reformat the disk and reinstall the operating system and all applications. Naturally all unsaved data were lost — and this happened to experts.
The Sony rootkit was discovered by former DDJ Contributing Editor Mark Russinovich at Sysinternals. His story (see http://www.sysinternals.com/blog/2005/10/sony-rootkits-and-digitalrights.html) makes for fascinating, if horrifying, reading.
Jerry is a science-fiction writer and senior contributing editor to BYTE.com. You can contact him at jerryp@jerrypournelle.com.
But it gets worse. Not only is the Sony DRM rootkit impossible to uninstall, but it “phones home,” giving coded information to a server at Sony headquarters. As I write this, no one has any idea of what Sony plans to do with that information. The important point here is that this is stuff you don’t want on your computer, and you can’t detect it with any normal antispyware programs. It takes a rootkit detector to find it, as a horrified Mark Russinovich discovered during a test of rootkit detection software. The Sony DRM rootkit had been on his computer for some time, and he had never suspected a thing. If that can happen to the system internals guru, it can happen to you.
Now the final horror. Under the DMCA, it is very likely a criminal act for you to remove the Sony rootkit from your system. Worse, it is likely a criminal act if I tell you how to remove the Sony DRM rootkit. And thus my advice: Don’t buy Sony music CDs, especially if there is any chance at all that they will be played on a Windows PC.
Of course, by the time you read this Sony may have provided an uninstaller for its rootkit DRM system. Still, Sony’s actions do not indicate that the company understands the seriousness of this situation, and at this writing Sony has yet to offer an uninstaller I would trust. Stay tuned.
But even if it is legal to remove it, it is dangerous to do so. In particular, booting in DOS and examining the directories to find the rootkit directory, then deleting that, will almost certainly crash your system. The Sony rootkit alters the registry to redirect certain function calls, and if the OS can’t find the instructions it has been redirected to, it can’t recover.
Sony does supply a patch to your operating system that lets you see the rootkit directory. However, the procedure for getting a legal copy of this patch is tedious. Of course, no sane person wants application software that requires an operating system patch from a third party. When Microsoft sends you updates and patches, Microsoft knows that code is there. When you patch your OS with software supplied by Sony, how is Microsoft to deal with it? I do not advise you to install a Sony-supplied OS patch.
In response, one reader wrote about a possibly useful program:
I pass along a tool I found to help deal with the Sony disks. http://www.smart-projects.net/ offers a freeware tool to read CDs called ISO Buster that sees the disk layout and allows extraction of the WAV files.
John
Note that this program may be illegal under the DMCA, and thus may not be available for long. I leave all conclusions in this matter as exercises for the reader.
Between the public (and artist) outcry and a bunch of lawsuits, it didn't take Sony long to start backing down: First saying that its XCP DRM scheme applied to 20 titles, then later admitting that it was actually 52, Sony decided to pull CDs from the shelf and give customers the opportunity to exchange them for non-XCP versions.
U3: The Next Generation of Thumb Drives
The twice-yearly Demo Conference shows off new technologies and services — up to 70 in two packed days of six-minute demonstrations. Leading off the demo cavalcade at Demo/Fall 2005 was U3 (http://www.u3.com/), makers of embedded technology for "USB smart drives." And the demo was indeed smart: Plug the device into the demo Windows laptop, and the software it contained was available to run there — no installation, no footprint on the host PC. Unplug it (even surprise removals) and it all disappears. There were trialware apps and other bundle deals, which vary depending on the U3 partner.
To demonstrate that the technology was ready, I was handed no fewer than four U3-enabled USB keys from Verbatim, Kingston, SanDisk, and Memorex. U3's trick took some doing; long-time IT veteran and U3 CEO Kate Purmal hinted at a long development cycle, mostly software to fool the OS into working the way they wanted.
U3’s software (Windows now, Mac soon) is a real stack, not merely a single-point
74 |
Dr. Dobb’s Journal, February 2006 |
http://www.ddj.com |

hack. We’re keenly interested in learning how U3 achieves the application redirection (and the other cleverness). U3 promises much of the info on integrating with its capabilities will be public, so developers can make intelligent use of it. There is a freely downloadable SDK at http://u3
.com/developers/downloads/default.aspx. At home, we ripped open the Verbatim 1-GB U3 drive packaging, and plugged it in. As we surmised upon seeing the demo, to Windows, the U3 looks like two devices in one — a CD and a removable disk drive. The CD part autoplays, runs the (not very big) U3 software stack, which in turn, opens an intro clip, and walks you through a demo. From there, you can simply use it as a standard thumb drive, or add your own software, and have it with you no matter what computer you might use, without installing anything on that computer. It does put a new icon in your system tray when a U3 device is installed, which you should use to eject instead of Win-
dows’ usual icon.
Astute readers will realize U3’s basic strategy is exactly the same method that Sony uses to install its “rootkit” software on your computer, though with far more positive intentions. It could probably silently leave other software behind on your computer, as well, though that’s not the intent, and we’ve never found any signs of that. Instead, it’s a pocket-sized Place For Your Stuff, externally indistinguishable from a standard thumb drive.
The Verbatim 1-GB U3 drive we tested comes with McAfee antivirus, ready to run (again, without installation). We used it to check out a computer that had been running without virus protection, then ejected the Verbatim U3 thumb drive. Actually, as a test, we just yanked it loose without notice (a “surprise removal” in Windows parlance), which caused the U3 software to politely remind us we should use the U3 icon to do that in the future. Lecture complete, it then went completely away.
I’ve done the same test with the Kingston U3 drive, with the same results. It Just Works.
And that is the point of U3's technology: You can have all your favorite software, ready to run, on any Windows-based computer you might use, without installing anything or corrupting the host machine. At Demo/Fall, Kate claimed you can install Microsoft Office right onto a U3 drive. We haven't tried that yet, but what an idea for those willing to rely on the kindness of others.
Prediction: U3-enabled thumb drives are going to become indispensable for road warriors (run your presentation from any available computer!), IT corridor warriors (all your favorite fix-it tools, instantly available), and, well, just about anyone else who wants Their Own Stuff no matter what computer they happen to be using. For now, the advanced features only work on Windows, though U3 devices work like any other thumb drive on Macs. We're told U3 will come to Mac, but that's Real Soon Now.
Seagate External Drives
This is Chaos Manor, and our methods are sometimes, well, chaotic. Sometimes things are just so useful, and so ubiquitous here, that we forget to list them.
That almost happened with the Seagate USB drives. These come in many sizes and flavors, and everyone loves them, and because we all use them and they Just Work, we nearly forgot them. They’re great gifts, and best of all, just about everyone can use another external storage drive, even if they already have one or two.
The most popular Seagate external drives here are the 5-GB "Cookie," which fits in a shirt pocket, goes with you anywhere, and draws its power from the USB connection — it works wonderfully with Lisabetta, my TabletPC; the 100-GB "book," which is small enough to fit in a briefcase and uses two USB connectors, one for data and one for power; and the 400 GB, which has its own wall-brick power supply.
One way to use the 100 is to have a powered USB hub expander (Belkin makes some good ones, and those are what I carry) so you’re not draining your laptop.
Whatever size you get, you can be sure a Seagate USB external drive is welcome, and they are recommended.
Winding Down
The computer book of the month is Joli Ballew and Jeff Duntemann's second edition of Degunking Windows (Paraglyph Press), which is better than the first edition. You will certainly profit from the chapters on registry cleaning and the recommended tools for doing that. There's sound advice in every chapter, and I can pretty well guarantee you'll learn a few things you didn't know. Recommended.
The second computer book of the month is also from Paraglyph: Jesse M. Torres and Peter Sideris, Surviving PC Disasters, Mishaps, and Blunders. Most of it is just common sense, but if you’ve just had a disaster, common sense is the one thing you won’t have: Just having a book that shows someone else has thought through the situation can help.
DDJ
P R O G R A M M E R ’ S B O O K S H E L F
Inside C# and .NET
Peter N. Roth
Core C# and .NET
Stephen C. Perry
Prentice Hall PTR, 2005; 1008 pp., $49.99
ISBN 0131472275
The target audience for Stephen Perry's Core C# and .NET is the "experienced programmer." I fit that profile, and based on this book would guess that at least three years of professional programming is a reasonable minimum to qualify as such.
Part 1 of Core C# and .NET includes an introduction to .NET and C#, C# fundamentals, class design, and working with objects. If you’re new to C#, you’ll want another text to cover the language in more detail. (I recommend Peter Sestoft and Henrik Hansen’s C# Precisely.)
In part 2, “Using .NET,” Perry includes chapters on text manipulation and file I/O, Windows forms programming and controls, graphics design, fonts/text/printing, XML, ADO.NET, and data binding. The file I/O chapter, in particular, goes a long way to answering questions that show up in newsgroups. And while the book covers Version 2.0 of C# and .NET, the buzz on Version 3.0 has already started, so you can expect that some of the ADO.NET stuff is “transitional” (but then, what isn’t?). The so-called “advanced” section (part 3) addresses the topics of threads, distribut-
ed apps, and refinement/security/deployment. While I claim to be an experienced programmer, I must confess that I have never written a threaded or distributed app, so this material was new to me.
Part 4 deals with programming for the Internet, and includes chapters on ASP.NET web forms and controls, the ASP.NET application environment, and XML web services. Finally, two appendices display the differences between .NET 1.0 and 2.0, as well as the Events and DataGridView control.
Thus, the text is broadly comprehensive. At the same time, you can do only so much in 1000 pages, so the depth is limited accordingly. Topics average about 56 pages each, which is still a solid chunk for each area addressed. Code examples are downloadable rather than on a CD, which seems to be the current trend in computer books. A bonus download is the Quick Reference — print it, fold it, and stick it into a niche on your desktop (if you have any room left).
Admittedly, .NET and C# are moving targets, and I admire authors who take a shot at them. To that end, Perry does a splendid job, though I have a couple of mild cavils: The presentation of conditional compilation should use the new idiom, which is the [Conditional(symbol)] attribute, rather than the older (and well-known) #if (symbol)/#endif construct. And I found the idea of using a command-line compiler "to get started" a little unusual, given the number of free IDEs out there.
Amazon.com Recent Top 5 Programming Books
1. Design Patterns: Elements of Reusable Object-Oriented Software, by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides
2. HTML for the World Wide Web with XHTML and CSS: Visual QuickStart Guide, Fifth Edition, by Elizabeth Castro
3. UML Distilled: A Brief Guide to the Standard Object Modeling Language, Third Edition, by Martin Fowler
4. Head First Servlets and JSP: Passing the Sun Certified Web Component Developer Exam, by Bryan Basham, Kathy Sierra, and Bert Bates
5. Sun Certified Programmer & Developer for Java 2 Study Guide, by Kathy Sierra and Bert Bates
Still, the text is clear of typos and misspellings. But somehow, the military outline for oral presentations ("tell them what you're going to tell them, tell them, tell them what you told them") has been carried over into texts. Hence, for each chapter, there is a beginning-of-chapter summary, the chapter itself, and an end-of-chapter summary. This insults our intelligence and wastes paper, because we can easily read the table of contents and scan the chapter to determine what's coming; we can read the chapter to determine what it is; and we can review all the material to determine what it was. Dropping the summaries would also make for a lighter book, or alternatively, the freed-up pages would provide more space to the author. On the blank, dark gray "separator" pages: They're not quite dark enough to be easily visible. In this case, a black bleed strip down the edge of the first printed (nonblank!) page of a chapter would be preferred, and would save yet another page.
My rant notwithstanding (hey, I’m entitled, I’m an experienced programmer), in general, Core C# and .NET is an excellent production that meets its stated aim — to provide a foundation for programmers moving to .NET.
DDJ
Peter is the president of Engineering Objects International, producers of commercial and custom C++ components. Peter can be contacted at http://www.engineeringobjects.com/ and pete.roth@verizon.net.