| Author |
Message |
kybertech
Joined: Jun 11, 2009 Posts: 15 Location: Vienna
|
Posted: Mon Jan 24, 2011 3:37 pm Post subject:
Upcoming Chips: 144x Stack Processor Arrays Subject description: Something in between traditional DSPs and FPGAs? |
 |
|
I have found some interesting chips. The company behind them is GreenArrays, Inc.
( http://www.greenarraychips.com/ )
You may have heard of Chuck Moore, who is basically the godfather of FORTH.
His webpage, http://colorforth.com/ , has some information about his new 'colorforth', among other things.
After the failure of a similar project with 40 processors on a chip (SEAforth), the new chips with 144 processors are said to be in production, and you can preorder samples and boards. The boards are fairly expensive considering that they do not provide any extra peripherals: they are just two of the chips with some SRAM, flash, and two serial ports.
However the potential processing power in these things makes them interesting...
They run at 700 MHz with single-cycle instructions (there are only 33 of them, though), which would give you 201,600 MIPS on the board (700 MHz × 144 cores × 2 chips). So something like a "Nord Modular on steroids", resynthesis, or FM with an arbitrary operator count would be possible with this.
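The arithmetic behind that figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the clock rate, core count, and chip count are the numbers quoted above, while the 48 kHz sample rate and the function names are assumptions added for illustration. It is a theoretical peak, not a measured result.

```python
# Figures quoted in the post; the 48 kHz sample rate is an assumption.
CLOCK_HZ = 700_000_000
CORES_PER_CHIP = 144
CHIPS_PER_BOARD = 2
SAMPLE_RATE_HZ = 48_000


def peak_mips(clock_hz, cores, chips):
    """Peak MIPS if every core retires one instruction per cycle."""
    return clock_hz * cores * chips // 1_000_000


def instructions_per_sample(clock_hz, sample_rate_hz):
    """Instruction budget of a single core for one audio sample."""
    return clock_hz // sample_rate_hz


board_mips = peak_mips(CLOCK_HZ, CORES_PER_CHIP, CHIPS_PER_BOARD)
per_core_budget = instructions_per_sample(CLOCK_HZ, SAMPLE_RATE_HZ)

print(board_mips)       # 201600
print(per_core_budget)  # 14583
```

So under these assumptions each core would have roughly 14,500 instructions to spend per audio sample, before any time lost to communication between cores.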
There are some downsides though:
There is no C compiler, but you can do high-level programming in FORTH. If you want to use memory on a self-made board, you have to write your own memory controller. And the chips come in a BGA package...
I'm thinking of preordering a board nevertheless. Let me know if anyone is interested; they give a discount at 10 pieces.  |
|
|
|
|
 |
jksuperstar

Joined: Aug 20, 2004 Posts: 2503 Location: Denver
Audio files: 1
G2 patch files: 18
|
Posted: Tue Jan 25, 2011 3:50 am Post subject:
|
 |
|
These are cool. I remember reading about the F18A some time ago, and I even designed my own stack-based, 32-instruction processor. They have serious limitations when it comes to memory, though, at least when you have a few hundred on a single chip.
I was very interested in this stuff until Nvidia announced their CUDA project several years ago. At those prices, and with that performance for a fully parallelized processing unit, it became easier to learn the programming than to continue hardware development. However, I still don't own a CUDA-capable video card, as I got sidetracked by other DSPs. |
|
|
|
|
 |
kybertech
Joined: Jun 11, 2009 Posts: 15 Location: Vienna
|
Posted: Tue Jan 25, 2011 9:32 am Post subject:
|
 |
|
| jksuperstar wrote: | | These are cool, I remember reading about the F18A some time ago, and even designed my own stack based 32 instruction processor. They have serious limitations when it comes to memory though. At least, when you have a few hundred on a single chip.. |
That's a bit of a downside, of course, but it depends on how you look at it. If you use the inner processors the way other people use PICs in a circuit, meaning that you assign each one to a single function with fixed I/O, this could work. FPGAs have similar limitations.
This could also change in the near future with higher integration or even 3D "sandwich" chips.
Also, the clock rate seems high enough to pipeline accesses to regular SRAM serially through the array, so that processors "deeper in" could use external memory too. Sacrificing one quad of the processors seems like a fair price for the possibilities you get.
| Quote: | | I got very interested in this stuff, until Nvidia announced their CUDA project several years ago. At those prices, and that performance for a fully parallelized processing unit, it became easier to learn programming, rather than continue hardware development. However, I still don't own a CUDA compliant video card yet, as I got side tracked by other DSPs. |
Nice to see. I haven't looked into that very much, but from what I've heard, GPGPUs are basically a few very wide SIMD units, so the latest ATI card with 1600 stream processors would have 100 of these. That's fine if you have millions of pixels that have to be displayed simultaneously anyway, so if you want to do additive synthesis with it, it will be fine. Other techniques might not benefit as much. But I think it is just a matter of time until VSTis that use graphics cards as DSP units come out... 100 pipelineable units is still nothing to sneeze at.
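To illustrate why additive synthesis maps well onto wide SIMD hardware, here is a small Python sketch. The 16-lane width is only what the numbers above imply (1600 stream processors in 100 units), not a figure taken from a datasheet, and the function name and the (frequency, amplitude) layout of the partials are made up for this example.

```python
import math

# Lane width implied by the post's numbers; an assumption, not a datasheet value.
STREAM_PROCESSORS = 1600
SIMD_UNITS = 100
LANES_PER_UNIT = STREAM_PROCESSORS // SIMD_UNITS


def additive_sample(t, partials):
    """One output sample as a sum of sine partials.

    Each partial runs the same operation on different data, so the terms
    are independent of each other and could be spread across SIMD lanes.
    """
    return sum(amp * math.sin(2.0 * math.pi * freq * t) for freq, amp in partials)


# Hypothetical test tone: four harmonics of 110 Hz with 1/n amplitudes.
harmonics = [(110.0 * n, 1.0 / n) for n in range(1, 5)]
sample = additive_sample(0.001, harmonics)
print(LANES_PER_UNIT, sample)
```

Techniques with feedback from one sample to the next (recursive filters, for instance) do not split up this cleanly, which is one way to read the remark that other techniques might not benefit as much.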
If you have the time and inclination, you could even do that for Commercial Success™
The major advantage is that these stack processors come as actual, documented chips that you can build a device around, while graphics cards tie you to a PC. And I think the hobbyist potential is higher, since you get a platform where you decide what changes.
I haven't used Forth that much either, only for algorithm testing and brainstorming, since it is so transparent.
But there are people who do actual development in it from what I've heard.
PS: I'm intrigued that you designed your own stack machine. Have you published anything about it, or would you? |
|
|
|
|
 |
mhelin
Joined: Feb 07, 2008 Posts: 9 Location: Finland
|
Posted: Wed Jan 26, 2011 3:23 pm Post subject:
|
 |
|
If you want to do something like that now, go for the XMOS XCore stuff, like the XS1-G4 on XMOS's XC-1A kit:
http://www.xmos.com/products/development-kits/xc-1a-development-kit
There are only four cores, but each has 8 threads processing simultaneous events, for a total of 32 threads. There isn't much SRAM on chip either, but it would be quite easy to interface a serial memory chip as proposed above and let a single thread handle the job. |
|
|
|
|
 |
jksuperstar

Joined: Aug 20, 2004 Posts: 2503 Location: Denver
Audio files: 1
G2 patch files: 18
|
Posted: Wed Jan 26, 2011 5:32 pm Post subject:
|
 |
|
No real info. I didn't write any documentation; I just cobbled an instruction decoder and math pipeline together with a handful of registers and a stack. But I wanted to add a memory interface to the bottom of the stack, so the stack could be as large as your memory. With only 5 bits to decode, it was fairly fast, and that also meant 4-6 instructions could be read into the pipe at once.
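The idea of a memory interface at the bottom of the stack can be modelled roughly like this in Python. It is a behavioural sketch only, not the actual design: the class name, the window size, and the spill-one-element policy are all assumptions made for illustration.

```python
class SpillStack:
    """Small fast stack window whose bottom spills into a larger backing memory."""

    def __init__(self, window_size=8):
        self.window_size = window_size
        self.window = []   # models the on-chip stack registers, top at the end
        self.backing = []  # models external memory behind the stack bottom

    def push(self, value):
        if len(self.window) == self.window_size:
            # Window is full: move the bottom element out to backing memory.
            self.backing.append(self.window.pop(0))
        self.window.append(value)

    def pop(self):
        if not self.window:
            raise IndexError("pop from empty stack")
        value = self.window.pop()
        if self.backing:
            # Refill the bottom of the window from backing memory.
            self.window.insert(0, self.backing.pop())
        return value


stack = SpillStack(window_size=4)
for i in range(12):
    stack.push(i)
print(len(stack.window), len(stack.backing))
```

The program only ever touches the small window, so the instruction set stays simple, while the total depth is limited only by the backing memory. Whether the spill and refill traffic costs more than it saves would need profiling on real programs.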
It did make me realize that these processors aren't completely unique. The old SPARC, followed by the ARM processors, has an architecture in which half of the RISC registers get swapped depending on the mode (system, slow IRQ, fast IRQ, user, etc.). |
|
|
|
|
 |
kybertech
Joined: Jun 11, 2009 Posts: 15 Location: Vienna
|
Posted: Fri Jan 28, 2011 5:57 am Post subject:
|
 |
|
| mhelin wrote: | If you want to do something like that now go for the XMOS XCore stuff. Like XS1-G4 on XMOS's XC-1A kit:
http://www.xmos.com/products/development-kits/xc-1a-development-kit
There are only four cores though, but each have 8 threads processing simultaneus events in total of 32 threads. There isn't either too much SRAM on chip but it would be quite easy to interface some serial chip like proposed above let a single thread handle the job. |
Thanks, that really seems to be a cheaper, lower-end alternative. Are you using it already? Interesting...
Nice that you even get a C compiler; that flattens the learning curve quite a bit.
| jksuperstar wrote: | but I wanted to add a memory interface to the bottom of the stack, so the stack could be as large as your memory was.
...
It did make me realize that these processors aren't completely unique. .. |
I think the major difference here is the very limited direct addressing modes. These things can only jump within a 10-bit address range... I imagine that moving data around would be done in high-level code. |
|
|
|
|
 |
jksuperstar

Joined: Aug 20, 2004 Posts: 2503 Location: Denver
Audio files: 1
G2 patch files: 18
|
Posted: Fri Jan 28, 2011 12:02 pm Post subject:
|
 |
|
I just saw the switching of register sets as a way of avoiding the need to dump them to a stack during a context switch. That is the same claim that stack-based processors make: you don't need to waste time pushing registers to an external stack during a context switch, because the stack is already built in.
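As a rough illustration of the register-set switching, here is a toy Python model of banked registers. The class, the mode names, and the split into 8 shared and 8 banked registers are invented for the example and do not describe any particular CPU.

```python
class BankedRegisters:
    """Toy register file: some registers are shared, the rest are banked per mode."""

    def __init__(self, modes, shared_count=8, banked_count=8):
        self.shared = [0] * shared_count
        self.banks = {mode: [0] * banked_count for mode in modes}
        self.mode = modes[0]

    def switch(self, mode):
        # Context switch: only the bank selector changes, nothing is copied.
        if mode not in self.banks:
            raise KeyError(mode)
        self.mode = mode

    def read(self, index):
        if index < len(self.shared):
            return self.shared[index]
        return self.banks[self.mode][index - len(self.shared)]

    def write(self, index, value):
        if index < len(self.shared):
            self.shared[index] = value
        else:
            self.banks[self.mode][index - len(self.shared)] = value


regs = BankedRegisters(["user", "irq"])
regs.write(8, 111)   # banked register, user mode
regs.switch("irq")
regs.write(8, 222)   # same index, but a different physical register
regs.switch("user")
print(regs.read(8))  # 111
```

The user-mode value survives the interrupt without a single push or pop, which is the saving being compared to a built-in hardware stack.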
Also, my idea of adding a hardware memory interface at the bottom of the stack was to allow for memory of any size. The CPU could only jump a small distance, or address a limited part of the stack, but the total space available was much larger. However, since I dropped development of it, I never really profiled any programs to see whether that idea would help in any way.  |
|
|
|
|
 |
|