Programming Language Choice

C? Assembler? Forth? Basic? There are several languages available to program embedded systems. The language choice depends on the processor choice. And the processor choice depends on the semiconductor technology state-of-the-art.

Until recently it was not possible to program a cost effective, single chip microprocessor with a meaningful application in a high level language. Chips were not designed to efficiently execute HLLs and the memory provided was inadequate.

Ancient History

In the early days of microprocessors, most programming was done in assembler. The memory models, economics, speed, and instruction sets of the the early eight bit processors: i8080, mc6800, 6502 and Z80, lent themselves to execution of assembly code. Memory was expensive, good compilers were hard to find, and if you really wanted to squeeze the most out of your processor, assembler was the way to go. Early language innovators wrote Basic, Forth and other languages for these processors. Kernighan and Ritchie were just writing C on DEC minicomputers running Unix.

In the late 70's, the first 16 bit processors appeared. These were register rich and memory rich. For embedded processors, the mc68000 with it's powerful, orthogonal instruction set became the processor of choice for many larger embedded systems. It's instruction set was very orthogonal, meaning that all instructions could access all registers and all memory in the same way. It was in many respects modeled after the DEC PDP-11 instruction set, but extended from a 16 bit machine to access data with 32 bit addressing. It was key to program in a high level language for these larger systems in order to maximize programmer efficiency for large applications. Several excellent C compilers appeared on the market for these new processors.

Modern 32 bit RISC processors are also register rich, even the tiny PIC and AVR eight bit machines.

What makes a processor efficient at executing HLL?

I cannot speak for all languages, but for C, there are a few basic processor resources needed to execute C efficiently. One key requirement is for at least two pointer registers. One register is needed as a stack frame pointer to be used in function call and return operations, used extensively in C. The other is used to perform pointer operations, also critical in C. Having one or two extra registers used for pointers also helps efficiency when programming with multiple pointers, such as manipulating blocks of data. Many early processors did not have this facility and performance and efficiency generally suffers while the compiler is busy constantly switching between these two critical functions. Examples of processors having the facility of multiple pointer registers include even the Z80, 68hc11 and all RISC machines.

Most of my personal experience is in assembler and C, with a bit of dabbling in Basic and Forth. The major advantage C is that code can execute efficiently, that it allows one to operate well at either low or high level, an that many of the low level housekeeping tasks are hidden. Yet C is not so high level that the compiler must operate far from the instruction set. If you look at the instructions available in C and the op-codes of a typical microprocessor, C is simply a consistent and slightly higher level way to perform these operations. If you inspect the assembly code your C compiler generates, you can see the op codes that correspond to the C instructions you wrote. And by counting op-codes and looking at code size, you can tell if you are writing efficient code. You can also try a few different ways to code the same function to see which is more efficient. I use code size as my main metric. Bigger programs take up more memory and generally (but not always) run slower. A notable exception is when you build a look-up table to process data. It may be larger and faster too.

Advantages of different Languages

Assembler, in the hands of an expert, is the execution speed winner. There is nothing as fast as raw, hand tuned assembly code. However with any project, the programming of this code, the debugging, and the support of it in the future must be taken into account. The advantages of assembler are:

	Fastest execution
	Small code size
	Lowest cost tools

The disadvantages of assembler are:

	Programmer inefficiency: all operations and housekeeping is hand written in machine instructions
	Debug is inefficient at the register level
	Code is not portable to other architectures
	Code can be hard to read: Write only
	Larger applications can hit a complexity barrier

In fact the only option on most low-end eight bit processors with 4KB or less of on-chip program memory is assembler. Until recently, single chip processors with more than 16K memory cost 2-3X as much as the 4K versions. Lately this has changed with FLASH processors with larger memories becoming lower cost to the point that a low cost single chip micro programmed in C can now be a reality. The fast execution time of assembler may be needed in the 'inner loop' of an operation, if at all. Much of the code in an embedded application is spent performing non-time-critical operations.

C

The advantages of programming in C are that one can move quickly from very low level bit-banging or I/O code to very high level (graphics display or communications data structures) within the same application. The simple and powerful function calling with parameter passing allows code to be encapsulated and structured. The powerful array and data structures allow high level applications to be built quickly. The wide range of data types allow memory to be used efficiently and quickly.

Advantages of C:

	Code portability
	Execution Speed
	Code size
	Speed of development
	High level and low level control
	Easy to integrate assembly code, if necessary
	Standard I/O, string, and math libraries
	Close match to processor instruction sets
	High level debug
	A availability of C developers

Disadvantages of C

Compiler cost ($0-$1000)

Perhaps the most powerful advantage is that the code is portable. Aside from the obvious benefit of being able to move code from one processor family to another is that a programmers knowledge and experience is also portable. Develop a small application on an 8 bit micro one day, and some of that knowledge and experience (or even code) is applicable to designing another system on another processor, or even a PC application. With no dead-ends your C experience grows with every line of code written.

The availability of programmers who know C is another strong advantage. C is taught in all high school and college programming curricula.

Basic

Basic can be a powerful language for a microprocessor. There are three types of Basic tools, interpreted, tokenized, and compiled. Interpreted Basic is useful when the code must be written and executed on the target procersor. With powerfule PCs widely available it is hardly necessary to waste target processor resources with an interpreter, editor, etc required for interpreted basic. This tends to be the slowest execution time also. Tokenized basic (used on the Basic Stamp and other processors) is a hybrid where the code is written on a PC and converted to tokens which are then faster to execute than raw Basic instrunctions. A Basic compiler generates machine instructions for the processor to execute and is therefore is has the fastest executiontimes. Typical BasCom Basic Compiler execution times on an AVR processor are 100,000 steps per second. This is acceptible on some slower applications, but the execution time of C can approach the raw MIPS rating of the processor at 1-2,000,000 instructions per second on the same processor. \

There is a sutle difference in code size between tokenized and compiled Basic. Tokenized basic programs may be smalller due to the efficiency of tokens, but the token interpreter occupies code memory also. With compiled Basic the program may take up more memory, but there is no interpreter.

Language Vs Semiconductor Processing

What does language selection have to do with semiconductor processing? A lot. In order for engineers to accept any inefficiencies in their products (added cost, power, size) they must make it back somewhere else. As an example, if a processor costs $3 more for the extra program memory to use C, there must be a saving somewhere else: in development cost, future product development, etc. to compensate for that added cost. In this example, if the manufacturer expects to build 1,000 units a month, then the added parts cost is $3,000 a month or $36,000 a year. If the product made it to market two months earlier and costs 1/3 of a person-year less to develop, the payback would about one year. Weird example, fix!

Meanwhile that painful $3 should come down over time if a manufacturer is willing to adopt lower cost technologies as they come available.

As an example here is my holy grail low-end, single chip embedded processor:

	64K or more FLASH ISP program memory (~5000 lines of C code)
	16K or more RAM
	5 MIPS performance
	Lots of serial and parallel I/O

Easy to add Ethernet, CAN...

	Great timer subsystem
	10 bit A/D
	$5.00 in quantity 100

We're getting close. The Atmel AVRMega323 has 32K code, 4K RAM, close to 5MIPS, and the I/O is powerful and plentiful. Everything but Ethernet / CAN, and the cost is $8 qty. 100. The key here is that all the processor blocks are implemented in CMOS technology and still these are challenging chips to fabricate because they have so many different technologies on one die.

Semiconductor Economics 101

Semiconductor technology at any state-of-the-art level is based on a several parameters: Feature size, gate count, process steps, production volumes, packaging, and development costs.

Feature size determines how fast devices can switch and how many transistors fit on a given die. Feature size is what makes Moore's law a reality: every year features shrink allowing more, faster transistors in the same die size.

Gate count and die size are closely related. At a given feature size, die size is roughly proportional to gate count, with allowance for pads and interconnect. Larger die costs more, and it it not linear. Increase a die area by 2X and not only does the cost increase but the tested yield also decreases more than linearly, causing another cost hit.

Package cost has been driven lower with the advent of BGA packages for high pin count devices. In the old days plastic (epoxy packaging) was the lowest cost and there was a penalty for using Ceramic. With Ball Grid Arrays (BGAs) per-pin cost is on par with plastic SMT. The economics are simply that more pins and larger packages cost more.

The number of process steps is a very important parameter. Of the most common three types of devices: CMOS Random logic (gates), dynamic memory (DRAM) and FLASH memory, each has a number of specialized process steps. Industry has worked hard to reduce the number of steps to drive the cost down of each of these major technologies. The result is that the major semiconductor fabrication lines are optimized for one of these technologies at a certain feature size. Each of these is a sweet spot for reducing the cost of a semiconductor chip. And this is why the amazing low costs and high gate counts of ASICs, DRAMs, and FLASH memories drive the electronics industry.

Production volumes go hand in hand with the specialization of process. Build 10 million devices of one type and the costs per chip go way down versus 100,000 devices. Not only are the factories more efficient, but the high chip development costs are spread out over many more devices. These costs include design, simulation, mask fabrication and test development costs. Not to mention marketing and documentation costs.

Analog chips have their own problems. If one can build an analog function on a CMOS digital line, then it can benefit from the low costs of CMOS production. But analog chips tend to have larger geometries, precision matched resistors and capacitors, laser trimming, etc, all of which prevent them from taking advantage of Moore's law.

Notable exceptions are A/Ds and other analog functions that use matched capacitors which are fairly easy to fabricate on CMOS lines. Also Sigma-Delta A/Ds which use a single bit A/D followed by DSP to average and correct the value to produce a result. These too are built on CMOS processes. There is a lot of incentive to build analog functions on digital processes.

The System-on-a Chip (SOC) problem is a big one. It implies that all the technologies: logic, memory, non-volatile memory and analog all co-exist on a single semiconductor device. This requires a process line that can handle all of these technologies and so is the superset of all of them and therefore expensive. And unless the volumes for that SOC are as high as the volumes for the DRAMs and FLASH devices they replace, (they won't be) then the costs will be higher. As much as I like single chip processors with everything on one device, I also like using systems with low cost RAM and FLASH devices off chip.

Ever wonder why they can build a 256M bit (that's more than 256 million transistors) DRAM for $10 and you can't get 4M bit of memory (512KB) on a microprocessor for any price? It's the SOC penalty. Companies like Rabbit have recognized this and are building processors with external RAM and FLASH. Companies like Atmel who are also long time FLASH experts have worked hard to integrate FLASH on their microprocessors and keep the costs down. I thank them both.

To make the next step up in processor performance is:

	1M or more FLASH ISP program memory (5000 lines of C code)
	1M or more RAM
	25 MIPS performance
	Lots of serial and parallel I/O

Easy to add Ethernet, CAN...

	Great timer subsystem
	10 bit A/D
	$15.00 in quantity

This level is more elusive. To build larger FLASH and RAM using a process that is optimized for random logic requires a fairly expensive chip. Possibly DRAM which also requires it's own specialized semiconductor processing. An example of the current state of the art here is the Rabbit processor which uses three chips to implement the processor core: One for the processor and I/O, one for FLASH program memory and one for RAM data memory. This does not include Ethernet or CAN. The advantage of this approach is that the three devices are each optimized for their own semiconductor processing: See Semiconductor Processing 101. The good news is that these three devices consume only about one square inch of board real estate total. This is why Rabbit Semiconductor markets their processors as small plug-in modules as well as individual chips.

The major microprocessor manufacturers are no longer the technology leaders in low end processors. Intel no longer builds eight bit processors for the embedded market. Processor technology advances pretty quickly and large companies get stuck with an older architecture and proprietary instruction set. Instruction set continuity is how large companies keep customers forever: if they change the instruction set, customers have no loyalty. This is bad for the processor manufacturers because they become stuck with an old architecture forever and it is bad for customers because they have no ability to shop around. In addition, a single processor architecture has a limited range of applications that it makes sense for. One wouldn't use a Pentium for a Microwave oven controller or a PIC for a cell phone. However a companies range of product capabilities may not align well with its supplier's product range. Motorola has a wide range of processors, but there is little compatibility between their 68hc08, their ColdFire and their Power PC processors. No assembly code can transcend even two of these processors.

Relatively new 8-bit processor architectures such as Microchip PIC, Atmel AVR, Rabbit 3000 have strong advantages over older architectures. They are designed with small geometry semiconductor processing which keeps power and cost down, and they are very feature-rich. But to use these appealing processors one must either learn a new instruction set for assembler programming. .

How common is it for a company to build a low-end product with simple capabilities only to find that there is a market for far more features: graphics display, communications, and other compute-intensive features may be 'just' another line item on a marketing requirements spec, but they can require major architectural changes in a low end product.

Imagine taking that amazing little product you just sweated for months bringing to market and 'simply' adding Ethernet to it. Maybe no problem if the system is based on PCI bus, but a big problem if it is based on a custom single board design.

Here is a list of various Microprocessor development environments for the two microprocessors listed in the article (Atmel AVR and Rabbit) running on PC

Product Name	Supplier	Processors supported	Lang.	Features	IDE?	Current Cost
ICC	ICC	AVR, 68hc08	C	Full ANSI	Y	$200-500
Dynamic C	Rabbit Semi	Rabbit, Z180	C	Multitasking	Y	$200-500
DDS	Dunfield Development Systems	AVR, 68hc11, '08 8051...	C	No float, long	N	$99
GCC-AVR	GNU C	AVR	C	Full ANSI, Linux avail.	N	Free
Softools	Softools	Rabbit	C	Full ANSI???	Y	$800
AVR C	Tech Code Vision	AVR	C	???	???	150 Euro
See third party list on Atmel Site
Bascom	Bascom	AVR	Basic		Y	<$100

[ Home ] [ Up ]