mrjason
Joined: Jul 25, 2006 Posts: 8 Location: Austin, Tx
Posted: Fri Dec 26, 2008 1:03 pm Post subject: what does a wav file data sample value mean?
Hi all,
I've been trying to understand how a wav file works. I'm a software developer by profession, and I understand the header and layout of the data chunks just fine.. but I'm having a hard time figuring out what actual data values translate to in the real world.
For example if we have an 8 bit data chunk in 44.1khz sample for the left channel with a value of "511", what is that? Is that a directive that says make the speaker play a frequency close to 22khz? Are some of the bits there a volume level specifier and some of the bits frequency specifiers?
I'm just having trouble understanding what a sample chunk actually represents. I am a total amateur when it comes to music theory, so maybe I'm missing something here. I've read that an "A" note is 440hz or something like that... so if we had a wav file that has a solid A note fading from zero volume to high volume is that just a 440hz directive over and over or is there some multiplier that will increase volume somehow by setting a subsequent sample frame to a value that means something like 880hz? |

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Fri Dec 26, 2008 1:32 pm Post subject: Re: what does a wav file data sample value mean?
mrjason wrote: | Hi all,
I've been trying to understand how a wav file works. I'm a software developer by profession, and I understand the header and layout of the data chunks just fine.. but I'm having a hard time figuring out what actual data values translate to in the real world.
For example if we have an 8 bit data chunk in 44.1khz sample for the left channel with a value of "511", what is that? |
8 bits can represent a maximum of only +127 down to -128. 511 cannot exist in an 8 bit data stream.
Quote: | Is that a directive that says make the speaker play a frequency close to 22khz? |
No. Wavefile "directives" are found only in the header. The data portion of the file is simply a list of the samples that were acquired during recording. It's just a list of numbers that specify an instantaneous voltage.
Quote: | Are some of the bits there a volume level specifier and some of the bits frequency specifiers? |
Speaker volume is not modulated or controlled by the wav file, other than the fact that (for example) a sinewave with peaks at +127 and -128 will be "twice as loud" as a sinewave with peaks at +63 and -64. The data can thus vary the loudness of the sound in a very natural (if storage-consuming) way.
General speaker volume or "level" is an effect that is added after the wav file is processed. This is done by the sound card's driver and utilities. Again, the sample data portion is simply a list of voltage values that are to be played into a DAC at the sample rate specified in the header. If those samples track a sinewave at 50 Hz, then that's what comes out of the speaker. Once the sample data portion of the file starts - that's all you get. It's an incredibly simple/dumb format.
It's important to note that if you simply graph the data samples from one channel - you'd be able to "see" both frequency and volume. Higher volume portions of the file will have taller peaks. Higher frequency fundamentals will appear with more zero crossings per unit linear measure on the X axis.
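To make that concrete, here's a minimal Python sketch that generates one second of a 440 Hz sinewave and writes it as an 8-bit mono wav file; the filename and constants are just illustrative. (One format detail: 8-bit PCM in a wav file is actually stored as unsigned bytes, 0-255 with 128 as the zero level, so the signed -128..+127 view used in this thread is the same data shifted by 128.)
Code: |
# Minimal sketch: one second of a 440 Hz sine written as an 8-bit mono WAV.
# Note: 8-bit PCM in a WAV is stored unsigned (0..255, 128 = silence level);
# the signed -128..+127 view used in this discussion is the same data
# shifted by 128.
import math
import wave

SAMPLE_RATE = 44100          # samples per second
FREQ = 440.0                 # tone frequency in Hz
AMPLITUDE = 127              # peak value in the signed view

frames = bytearray()
for n in range(SAMPLE_RATE):                                # 1 second of samples
    v = math.sin(2.0 * math.pi * FREQ * n / SAMPLE_RATE)    # -1.0 .. +1.0
    frames.append(128 + int(round(AMPLITUDE * v)))          # unsigned byte

with wave.open("sine440.wav", "wb") as w:
    w.setnchannels(1)        # mono
    w.setsampwidth(1)        # 1 byte = 8 bits per sample
    w.setframerate(SAMPLE_RATE)
    w.writeframes(bytes(frames))
|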
Quote: | I'm just having trouble understanding what a sample chunk actually represents. |
After the header, the data is just samples, one after another. For stereo, the data words alternate L-R-L-R-L-R... until the end of the wav file. These sample values are intended to be played or sent to the DAC at whatever sample rate was used to create the file.
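A tiny sketch of that interleaving, assuming the two sample lists already exist (the function name is just for illustration):
Code: |
# Sketch: interleave two equal-length sample lists into the L-R-L-R order
# used by a stereo wav data chunk.
def interleave(left, right):
    frames = []
    for l, r in zip(left, right):
        frames.append(l)   # left channel sample
        frames.append(r)   # right channel sample
    return frames

# e.g. interleave([L0, L1, L2], [R0, R1, R2]) -> [L0, R0, L1, R1, L2, R2]
|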
Quote: | I am a total amateur when it comes to music theory so maybe i'm missing something here. I've read that an "A" note is 440hz or something like that.. |
440 Hz is considered to be in "concert pitch" tuning for the A above middle C. It is a standard used to make sure all instruments in an ensemble play together harmoniously.
Quote: | so if we had a wav file that has a solid A note fading from zero volume to high volume is that just a 440hz directive over and over or is there some multiplier that will increase volume somehow by setting a subsequent sample frame to a value that means something like 880hz? |
Assuming the signal is a sinewave, the values you would see in the data portion of the wav file will start near zero (representing the quietest portion of the fade) and grow to a full scale output where the peaks of the sinewave are at +127 and -128. If the peaks of the sinewave were at +63 and -64, then the sinewave would be at "half volume". In other words, the wav file is an "image" of the signal that was recorded.
If I may suggest - if you open a wav file in a wave editor, you will see what I'm talking about. The waveform you see is just a simple Cartesian graph of the data samples.
This method of recording allows representation of any frequency, at any volume level, with any waveshape possible, given the constraints of the width of the data word and the sample rate.
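As a rough sketch of the fade-in case discussed above (same assumptions as the earlier Python snippet), the fade is nothing more than each sample value being scaled by a ramp:
Code: |
# Sketch: a 440 Hz tone fading in from silence to full scale over one second.
# The "fade" is nothing but each sample value being scaled by a ramp that
# grows from 0.0 to 1.0 - there is no volume directive anywhere in the data.
import math

SAMPLE_RATE = 44100
FREQ = 440.0

samples = []
for n in range(SAMPLE_RATE):
    ramp = n / SAMPLE_RATE                                  # 0.0 -> 1.0
    v = math.sin(2.0 * math.pi * FREQ * n / SAMPLE_RATE)    # -1.0 .. +1.0
    samples.append(int(round(127 * ramp * v)))              # signed view
# Early samples hover near 0; the last ones swing between about -127 and +127.
|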
I hope that helps... _________________ FPGA, dsPIC and Fatman Synth Stuff
Time flies like a banana. Fruit flies when you're having fun. BTW, Do these genes make my ass look fat? corruptio optimi pessima

BananaPlug
Joined: Jul 04, 2007 Posts: 307 Location: Philly
Audio files: 5
Posted: Fri Dec 26, 2008 2:49 pm
Like he said, very low level data. Similar to the representation of pixels in an image file. |

mrjason
Joined: Jul 25, 2006 Posts: 8 Location: Austin, Tx
Posted: Sun Dec 28, 2008 8:41 pm
Yes, thank you for explaining a bit.
Apologies for the confusion. You're correct in that there are only 256 values in 8 bits; I did some quick (bad) math when posting earlier.
I've seen the audio representation in DAWs before, but I assumed (incorrectly?) that what I was seeing was just volume levels.
I'm still a bit confused though. Am I right in understanding, now, that the value for any given sample is simply a volume level?
The confusion is probably best explained this way: consider, for example's sake, a mono wav file with a 1000hz sample rate. Suppose this wav file is one second in length and contains a simple, pure, 440hz max-volume signal for the duration of the file. So we would have 1000 8 bit samples, with 440 of those samples being a value of 127 (or 128?) and the non-440 samples being a value of 0 (or -127?)? Does that sound right?
If we further complicate things, let's say we have the same 1000hz mono sample rate, the same 1 second duration, and the same 440hz tone throughout the entire 1 second... and we add in a 220hz tone for the duration as well. If in our example both tones have max volume, then for starters for the 440hz tone we'd have 440 samples with a value of 128, then half of those same samples would have "128" added to them for the 220hz values and we'd essentially clip on all 220 samples? yeah?
But if we did the same thing and had a volume/voltage value of "15" for both the 440hz and 220hz tones, then we'd have 220 samples with a volume of "30" (where 220 and 440 overlap) and 220 with a value of "15" (where 440 is, but 220 is not), and the other samples would be empty/mute samples?
Is that how it works? |

mrjason
Joined: Jul 25, 2006 Posts: 8 Location: Austin, Tx
Posted: Sun Dec 28, 2008 8:48 pm
Also, one more bit, as to the sinewaves aspect of this conversation: when you say values of +127 and -127, how does that translate to what's going on with volume? How does that translate to what's going on with the speaker/voltage?
Is it possible to have an off-axis-centered sinewave in the wav file with peaks on the positive side being like +100 and the negative side being -20? What would that sound like, or is that not something that happens?
My basic understanding of speakers is that there's an electromagnet that's moving air at a rapid rate... put a certain voltage on the speaker and the magnet attracts or repels and moves air. Perhaps my understanding is too limited though; I thought the voltage was basically on, off, on, off... these negative values make it sound more like perhaps the speaker at rest state is at "0" on the sinewave map and a positive voltage would make the speaker magnet attract and a negative would make it repel... ?
---
and as to the hertz, does a 440hz sample mean the 0 axis is crossed by the sinewave 440 (or 880) times per second? or does that mean there are 440 peaks per second, or both?
and since this is sinewaves rather than simple binary "on, off".. would a 440hz sample have tons of non-zero values but just 440 of them would be the peaks (or axis-cross, or whatever)? |

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Sun Dec 28, 2008 9:10 pm
mrjason wrote: | yes thank you for explaining a bit.
apologies for the confusion. you're correct in that there are only 256 values in 8 bits, I did some quick (bad) math when posting earlier. |
been there... way too often.
Quote: | I've seen the audio representation in DAWs before but I assumed (incorrectly?) that what I was seeing was just volume levels. |
I wouldn't use the term volume, more like voltage, but essentially, yes, each sample is a representation of the signal voltage at one instant in time. The actual full scale voltage doesn't matter except that it's greater than zero and is constant. Full scale in our 8 bit case would be the values 127 and -128, where they might represent +5 volts and -5 volts respectively. A zero would be for zero volts.
Quote: | I'm still a bit confused though. Am I right in understand, now, that the value for any given sample is simply a volume level? |
a level - at an instant in time.
Quote: | The confusion is probably best explained this way: consider for example's sake we have a wav file that is a mono 1000hz sample rate. Suppose this wav file is one second in length and contains a simple, pure, 440hz max volume signal for the duration of the file. So we would have 1000 8 bit samples, with 440 of those samples being a value of 127 (or 128?) and the non-440 samples being a value of 0? (or -127?)? Does that sound right? |
The samples, as a stream, are just values at instants in time. When you "play" them, the sample values each describe an instant in time. So if the first one is a zero, then zero volts are output. The next sample will be played after a specific time delay. This goes on for each sample in the file. Their values then define the waveshape. Like points on a graph, you can see the shape.
Quote: | If we further complicate things let's say we have the same 1000hz mono sample rate, and the same 1 second duration, and the same 440hz tone throughout the entire 1 second.. and we add in a 220hz tone for the duration as well. If in our example both tones have max volume, then for starters for the 440hz tone we'd have 440 samples with a value of 128, then half of those same samples would have "128" added to it for the 220hz values and we'd essentially clip on all 220 samples? yeah? |
Well, you can do the math to know the exact places where that occurs. It doesn't matter though, because you don't want that to happen (usually). When you mix two signals in DSP, you add them - but you're right that two full scale signals mixed (or added) will cause clipping. Whenever you add N signals together, you divide the sum by N and the clipping isn't a problem.
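A minimal sketch of that mixing rule, assuming two full-scale tones and Python as the illustration language:
Code: |
# Sketch: mix a 220 Hz and a 440 Hz tone by adding the samples and dividing
# the sum by the number of signals, so the result can never exceed full scale.
import math

SAMPLE_RATE = 44100
N_SIGNALS = 2

mixed = []
for n in range(SAMPLE_RATE):
    t = n / SAMPLE_RATE
    a = 127 * math.sin(2.0 * math.pi * 220.0 * t)   # full-scale 220 Hz tone
    b = 127 * math.sin(2.0 * math.pi * 440.0 * t)   # full-scale 440 Hz tone
    mixed.append(int(round((a + b) / N_SIGNALS)))   # stays within -127..+127
|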
Quote: | But if we did the same thing and had a volume/voltage value of "15" for both the 440hz and 220 hz tones then we'd have 220 samples with a volume of "30" (where 220 and 440 overlap) and 220 with a value of "15" (where 440 is, but 220 is not) and the other samples would be empty/mute samples?
Is that how it works? |
?? not altogether sure what you're describing - do you mean this:
Given 2 signals, one with a frequency of 220 Hz and a maximum level (the level of the positive peaks of the waveform) of 15, and a second signal with a frequency of 440 Hz also with a maximum level of 15, there will be places within the sample stream that could be as high as 30.
If you make each signal have a value of 63 for a maximum level, the combined signal will play without clipping. _________________ FPGA, dsPIC and Fatman Synth Stuff
Time flies like a banana. Fruit flies when you're having fun. BTW, Do these genes make my ass look fat? corruptio optimi pessima

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Sun Dec 28, 2008 9:23 pm
mrjason wrote: | also, one more bit, as to the sinewaves aspect of this conversation. when you say values of +127 and -127, how does that translate to what's going on with volume? how does that translate to what's going on with the speaker/voltage? |
A signal with peaks at +127 and -127 will have twice the amplitude of one with peaks at +63 and -63. If you saw them on an oscilloscope, you'd see that one is taller than the other, the taller being the louder.
Quote: | is it possible to have an off-axis-centered sound sinewav in the wav file with peaks on the positive side being like +100 and the neg side being -20? what would that sound like, or is that not something that happens? |
What makes you say "that's a violin I hear" is the waveshape of the signal from the instrument. A sinewave sounds different from a square wave because of its shape (and the harmonics it possesses - a sinewave has only one, where the square wave has an infinite number). The fact that a sinewave has a DC offset doesn't change how it sounds. Timbre comes from the number and level of harmonics a signal has. DC offset is silent.
Quote: | my basic understanding of speakers is that there's an electro magnet that's moving air at a rapid rate.. put a certain voltage rate on the speaker and the magnet attracts or repels and moves air. perhaps my understanding is too limited though, I thought the voltage was basically on, off, on, off.. |
No, not on and off. Varying through a range of voltages, like from -10 volts to +10 volts. A sinewave has both negative and positive peaks. Negative values just move the speaker cone in the opposite direction from positive values. Higher values (absolute values, actually) move the cone farther than smaller absolute values. That's how the samples represent the waveshape, by gradually or quickly moving the cone as described by the stream of numbers in the wav file.
Quote: | these negative values make it sound more like perhaps the speaker at rest state is at "0" on the sinewave map and a positive voltage would make the speaker magnet attract and a negative would make it repel... ? |
Yes. It really depends on how the speakers are hooked up, but + values might make the cone move out while - values make it move in. This way, values that go from + to - and back again thousands of times per second make a sound you can hear; they make the cone vibrate.
Quote: | and as to the hertz, does a 440hz sample mean the 0 axis is crossed by the sinewave 440 (or 880) times per second? or does that mean there are 440 peaks per second, or both?
|
A 440 Hz sinewave will have 880 zero crossings in a second. You also have 2 peaks per cycle, so those too would number 880.
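If you want to check the 880 figure yourself, a quick sketch that counts sign changes in one second of a sampled 440 Hz sinewave:
Code: |
# Sketch: count sign changes (zero crossings) in one second of a 440 Hz sine
# sampled at 44.1 kHz.
import math

SAMPLE_RATE = 44100
FREQ = 440.0

samples = [math.sin(2.0 * math.pi * FREQ * n / SAMPLE_RATE)
           for n in range(SAMPLE_RATE)]
crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
print(crossings)   # roughly 880: two crossings for each of the 440 cycles
|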
Quote: | and since this is sinewaves rather than simple binary "on, off".. would a 440hz sample have tons of non-zero values but just 440 of them would be the peaks (or axis-cross, or whatever)? |
Yes. well, 880, but yes. Tons of nonzero values is correct. Last edited by JovianPyx on Sun Dec 28, 2008 9:27 pm; edited 1 time in total |

mrjason
Joined: Jul 25, 2006 Posts: 8 Location: Austin, Tx
Posted: Sun Dec 28, 2008 9:24 pm
thanks for the clarification.
i thought of a better example to sum all of this up.
consider a 1000hz mono sample 1-second duration wav file.
that file will have 1000 8 bit samples in it that should be played 1 millisecond apart.
if this file had a single pure max-voltage 100hz tone in it then one cycle should last 10 samples, correct:?
sample 1: 0% of 127
sample 2: 50% of 127
sample 3: 100% of 127
sample 4: 50% of 127
sample 5: 0% of 127
sample 6: -50% of 127
sample 7: -100% of 127
sample 8: -50% of 127
sample 9: 0% of 127
(clearly my math is bad here because I've got 9 samples, and really the first 8 would be the cycle that repeats.. if I divide 1000hz by 8 samples per cycle I get that this example above is actually .. 125hz?)
or something along those lines? |

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Sun Dec 28, 2008 9:30 pm
mrjason wrote: | thanks for the clarification.
i thought of a better example to sum all of this up.
consider a 1000hz mono sample 1-second duration wav file.
that file will have 1000 8 bit samples in it that should be played 1 millisecond apart.
if this file had a single pure max-voltage 100hz tone in it then one cycle should last 10 samples, correct:?
sample 1: 0% of 127
sample 2: 50% of 127
sample 3: 100% of 127
sample 4: 50% of 127
sample 5: 0% of 127
sample 6: -50% of 127
sample 7: -100% of 127
sample 8: -50% of 127
sample 9: 0% of 127
(clearly my math is bad here because I've got 9 samples, and really the first 8 would be the cycle that repeats.. if I divide 1000hz by 8 samples per cycle I get that this example above is actually .. 125hz?)
or something along those lines? |
Yes. One thing about the math: I see what you're doing, but in reality you don't always get a true zero crossing represented by a zero value in the data. More likely you'll see a sample with a very small positive value followed by another sample with a very small negative value. The zero happened sometime between those samples.
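To see the actual numbers for that 8-sample cycle, here's a small sketch (125 Hz at a 1000 Hz sample rate, peak 127):
Code: |
# Sketch: the actual 8-bit signed values for one cycle of a 125 Hz sine
# sampled at 1000 Hz (8 samples per cycle), peak value 127.
import math

SAMPLE_RATE = 1000
FREQ = 125.0

for n in range(8):
    v = math.sin(2.0 * math.pi * FREQ * n / SAMPLE_RATE)
    print(n, int(round(127 * v)))
# Prints 0, 90, 127, 90, 0, -90, -127, -90: the "50%" points are really
# sin(45 degrees), about 71% of full scale, and with a different starting
# phase the zero crossings would fall between samples rather than on them.
|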
_________________ FPGA, dsPIC and Fatman Synth Stuff
Time flies like a banana. Fruit flies when you're having fun. BTW, Do these genes make my ass look fat? corruptio optimi pessima

mrjason
Joined: Jul 25, 2006 Posts: 8 Location: Austin, Tx
Posted: Sun Dec 28, 2008 9:40 pm
I see. Thanks so much for clarifying all of this for me. |

urbanscallywag
Joined: Nov 30, 2007 Posts: 317 Location: sometimes
Posted: Sun Dec 28, 2008 11:15 pm
Hah, what's interesting there is that you can see the negative DC bias effect of truncation. The magnitude of the positive half cycle is one less than the negative half cycle. DC offset is a nightmare, analog or digital!  |

BananaPlug
Joined: Jul 04, 2007 Posts: 307 Location: Philly
Audio files: 5
Posted: Mon Dec 29, 2008 3:21 pm
Quote: | i thought of a better example to sum all of this up.
consider a 1000hz mono sample 1-second duration wav file.
that file will have 1000 8 bit samples in it that should be played 1 millisecond apart. |
Forgive me if I missed something but the whole idea of sampling rate seems to have been left by the side of the road. Typically a WAV file is going to have a sampling rate of 44.1kHz (like a CD), so your 1-second duration wav file would have 44,100 samples regardless of its content. |

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Mon Dec 29, 2008 4:06 pm
Well, while it's typical for a wav file to have a 44.1 KHz sample rate, it is not cast in stone. It could actually have a sample rate of 1000 Hz, though that would not be useful for signals above 500 Hz (realistically, somewhat less than that).
In fact, the digital synths I've designed have sample rates well above 44.1 KHz, one with a sample rate of 1.0 MHz, the others are 250 KHz and 200 KHz.
Also, there are uses for very low sample rates - when the signal being sampled has a low frequency. For example, if you are sampling a 60 Hz power signal you could use a low sample rate (perhaps 6 KHz or even 600 Hz) to conserve data storage space, assuming that you can still extract the desired information at that sample rate.
For audio, yes, 44.1 KHz is typical, just not universal. _________________ FPGA, dsPIC and Fatman Synth Stuff
Time flies like a banana. Fruit flies when you're having fun. BTW, Do these genes make my ass look fat? corruptio optimi pessima

BananaPlug
Joined: Jul 04, 2007 Posts: 307 Location: Philly
Audio files: 5
Posted: Mon Dec 29, 2008 5:04 pm
Quote: | For audio, yes, 44.1 KHz is typical, just not universal. |
Yup. That's exactly why I said "typically."
I only brought it up because taking enough samples is every bit as important as taking accurate samples. |

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Mon Dec 29, 2008 5:25 pm
I understand, and you were quite correct; I was merely expanding, really for the OP, who seems interested in DSP and, being a programmer, probably already has the math skills to do interesting things. I just wanted to make sure he understands that really all sample rates are useful, but they must be appropriate for the sampled signals, i.e., the Nyquist limit.
I'll give a bizarre example you may appreciate: I have a Cirrus CS4344 DAC. 24 bit stereo (delta-sigma) with sample rates up to 200 KHz both channels driven. It will go much lower, however. Anyway, it's an I2S interface and has some master clock constraints as well as demanding that the master clock be a multiple N of shift clock, that N being one of 8 (I think) different values based on 32 and 48. The largest of those "dividers" is 1152. But the DAC has an upper limit of 50 MHz for master clock.
50 MHz / 1152 = 43.4 KHz (approx).
43.4 KHz is only a hair under 44.1 KHz. But the fact that the setup naturally provides for 1152 clocks per DAC enable means that I can do a lot of math and logic per sample. I think the difference between the standard 44.1 KHz and 43.4 KHz will go unnoticed. The lower Nyquist means small adjustments, but realistically it should be as capable as 44.1 KHz in terms of music.
The point is that 43.4 KHz is anything but standard, but it works well enough that the clock count advantage makes a bigger difference (to me).
Sorry for the blather. Got nuttin' to do today...  _________________ FPGA, dsPIC and Fatman Synth Stuff
Time flies like a banana. Fruit flies when you're having fun. BTW, Do these genes make my ass look fat? corruptio optimi pessima

urbanscallywag
Joined: Nov 30, 2007 Posts: 317 Location: sometimes
Posted: Mon Dec 29, 2008 5:58 pm
After designing some audio DSP I don't think I'd go back to 44.1 ksps instead of 48 ksps. The transition band between the signal of interest and the half sample rate is just too small at 44.1 ksps.
I also wouldn't use 96+ ksps unless necessary. |

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Mon Dec 29, 2008 6:16 pm
I use the highest sample rate a design will allow when designing a synthesizer (which I design for my own use). This comes down to balancing sample rate against other factors such as voice complexity or voice count.
And you may not agree with this - but I've had quite satisfactory results with synthesizer designs using high sample rates as a way of reducing audible aliasing artifacts as opposed to a more complex design that can run at lower sample rates. (though I've experimented with other designs using lower sample rates with very good results)
I have an odd way of looking at sample rate, kind of like the calculus notion of 1/n where as you increase n, the value approaches zero. As you increase sample rate, the representation approaches a continuous time signal. Approaches, but never attains. So I like to keep sample rates as high as I can. This is one reason I enjoy designing for FPGAs. _________________ FPGA, dsPIC and Fatman Synth Stuff
Time flies like a banana. Fruit flies when you're having fun. BTW, Do these genes make my ass look fat? corruptio optimi pessima

urbanscallywag
Joined: Nov 30, 2007 Posts: 317 Location: sometimes
Posted: Mon Dec 29, 2008 7:10 pm
I agree that you'll have less aliasing at high sample rates in waveform generation, for example. It might give better results than a more complicated design at a lower sample rate.
But (in general) I don't agree with keeping data at those higher sample rates when it isn't necessary. It's largely a waste of computations/storage/etc. Some exceptions come in digital-to-analog and analog-to-digital conversion.
For example, suppose I have an unnecessarily oversampled audio signal, say 192 ksps. The number of taps in a FIR filter is inversely proportional to the transition band over the sample rate, Ft/Fs. So if my sample rate could have been 48 ksps but I'm using 192 ksps, I have to use 4 times as many taps to build the same FIR filter (and it's running at 4 times the clock rate, right?!). Recursive filters have to use higher orders, and they are more sensitive to finite arithmetic. The impulse response gets longer, making the filter sound bad. You can't use polyphase tricks if you aren't willing to go to a lower sample rate either.
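As a rough illustration of that scaling, here's a sketch using a common rule-of-thumb estimate for FIR length (the "harris approximation", N ~ atten_dB / (22 * transition/Fs)), not anything from this thread:
Code: |
# Sketch: a common rule-of-thumb estimate ("harris approximation") for FIR
# length, N ~ atten_dB / (22 * transition_Hz / sample_rate_Hz), used here
# only to illustrate how the tap count scales with sample rate.
def fir_taps_estimate(atten_db, transition_hz, sample_rate_hz):
    return int(round(atten_db / (22.0 * transition_hz / sample_rate_hz)))

# Same 2 kHz transition band and 60 dB stopband at two sample rates:
print(fir_taps_estimate(60.0, 2000.0, 48000.0))    # ~65 taps
print(fir_taps_estimate(60.0, 2000.0, 192000.0))   # ~262 taps, about 4x
|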
So high sample rates have some use, but in general stick to minimal oversampling. |

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Mon Dec 29, 2008 7:26 pm
It really depends on the application. I can see where both approaches are quite valid, equally so. _________________ FPGA, dsPIC and Fatman Synth Stuff
Time flies like a banana. Fruit flies when you're having fun. BTW, Do these genes make my ass look fat? corruptio optimi pessima

urbanscallywag
Joined: Nov 30, 2007 Posts: 317 Location: sometimes
Posted: Mon Dec 29, 2008 7:29 pm
Like I said, there are places for oversampling, but most of the time it's a waste. |

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Mon Dec 29, 2008 7:58 pm
I would disagree that "most of the time" it's a waste. It really depends on what you are trying to accomplish. I have a set of applications that demand very high oversampling at one stage. The result uses a lower sample rate to the DAC, but internally it runs much higher. I am about to take the internal sample rate even higher since I've heard from others who've tried it with improved results. One example of this is the use of highly oversampled generation of naive waveforms sent to a large FIR filter with a cutoff at the DAC sample rate's Nyquist frequency. A higher oversample rate allows higher musical fundamentals to be used with less amplitude ripple at the high end. Currently, I'm getting to about 4.7 KHz before I hear the ripple. I'd like to push that up to 6 KHz if I can.
I am writing about synth design, not general audio design. Is that the difference? With synth design, I see a relaxation of other parameters when sample rate is increased. As long as my device (FPGA) will do it, I put no constraint on sample rate. There is no "too high" in my design world. _________________ FPGA, dsPIC and Fatman Synth Stuff
Time flies like a banana. Fruit flies when you're having fun. BTW, Do these genes make my ass look fat? corruptio optimi pessima

urbanscallywag
Joined: Nov 30, 2007 Posts: 317 Location: sometimes
Posted: Mon Dec 29, 2008 8:08 pm
Hey, it's OK - I know a guy who designed a sample rate converter with 140dB dynamic range and I don't agree with that either. Waste away.  |

urbanscallywag
Joined: Nov 30, 2007 Posts: 317 Location: sometimes
Posted: Mon Dec 29, 2008 9:00 pm
Do you use FIR or IIR as a "musical" filter? Do they get more relaxed requirements at higher sample rates?
I haven't had time to investigate the infamous state variable filter that actually works better at high sample rates.  |

JovianPyx
Joined: Nov 20, 2007 Posts: 1988 Location: West Red Spot, Jupiter
Audio files: 224
Posted: Mon Dec 29, 2008 10:32 pm
I've used only IIR filters for musical filters.
I first tried a simple single stage IIR filter which was lackluster.
I then tried a SVF, which was much, much better - but I knew about its higher sample rate constraint. OK, so I run the design at an appropriately high rate. What is nice about a state variable filter is that it is resonant and very easy to tune: simply change one number to change cutoff, change another number to change Q. It is also a common analog construct. But because it has positive gain in the passband near cutoff, amplitude must be compensated. I have a simple subtractive monosynth (MIDI) that runs at a 1 MHz sample rate, has an SVF and sounds very analog (to me...). It does portamento without artifacts.
Here are two samples:
http://www.fpga.synth.net/pmwiki/uploads/FPGASynth/GateMan_II_notes.mp3 (crappy clip, I know, it clips, but it still demonstrates the filter)
http://www.fpga.synth.net/pmwiki/uploads/FPGASynth/metal_pipe1.mp3
Portamento is definitely in use in the second of them. The second uses two oscillators hard-synced to two other oscillators. Note that portamento is implemented by passing the pitch signal through a single-stage IIR filter.
According to literature I've read, an SVF can handle a cutoff up to only 1/6 of the sample rate. This is why I chose to run the sample rate as high as possible (1.0 MHz) to get the widest Fc range. Arithmetic word size determines the Q range, wider being better - hence the freedom FPGAs give in this regard.
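For reference, a textbook Chamberlin-style SVF sketch in Python (not the poster's FPGA implementation), showing the one-coefficient cutoff and damping controls and the roughly Fs/6 cutoff limit:
Code: |
# Sketch of a textbook Chamberlin-style digital state variable filter:
# one coefficient sets cutoff, one sets damping (1/Q). The usual guidance
# is to keep fc below roughly fs/6 for stability.
import math

class StateVariableFilter:
    def __init__(self, fs, fc, q):
        self.f = 2.0 * math.sin(math.pi * fc / fs)   # cutoff coefficient
        self.damp = 1.0 / q                          # damping = 1/Q
        self.low = 0.0
        self.band = 0.0

    def process(self, x):
        self.low += self.f * self.band
        high = x - self.low - self.damp * self.band
        self.band += self.f * high
        return self.low, self.band, high             # lowpass, bandpass, highpass

# Example: a 1 kHz resonant filter at a 48 kHz sample rate
svf = StateVariableFilter(fs=48000.0, fc=1000.0, q=5.0)
lp, bp, hp = svf.process(1.0)
|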
Also, GateManPoly uses one SVF per voice (8 voices). The higher sample rate allows this simplicity. _________________ FPGA, dsPIC and Fatman Synth Stuff
Time flies like a banana. Fruit flies when you're having fun. BTW, Do these genes make my ass look fat? corruptio optimi pessima
Last edited by JovianPyx on Mon Dec 29, 2008 11:26 pm; edited 4 times in total |