From PIC16F690 to PIC18F26K20 + The big function!

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
I can't see any problems with #156, but I don't really know the 18F series.
OK. For now I'm just working on the *logic*. I will still have to double-check the syntax and the 'a' operand (Access Bank vs. banked) in the instructions, which confuses me sometimes.

I'm sure I'll find some errors when I assemble it, but they should be easy to fix!

Thanks!
 

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
Here's what we have:

y1(n) = b11*x(n) + b12*x(n-1) + b13*x(n-2) - a11*y1(n-1) - a12*y1(n-2)

filter1 dw 17,-25,17,-460,246, ...
filter2 dw 26,-23,26,340,-246, ...
filter3 dw 32,-5,32,168,-246, ...
filter4 dw 35,-17,35,-10,-246, ...
filter5 dw 31,9,31,-205,-246, ...
filter6 dw 25,25,25,-369,-246, ....
filter7 dw 13,21,13,-477,-245, ...

We're dealing with the 2nd-order section for now... the first half of each table (shown above) holds the coefficients of interest, and they are not that large compared to the second half!

And although I have allocated 16 bits for the coefficients, they will actually be 12 bits max (hope I am right), and some coefficients are *negative* so they'll subtract... so I am thinking *maybe* I can allocate 16 bits to the Y's as well. Plus the samples are 10 bits!

Guess the best way to be sure is to simulate it, BUT how can I simulate the *samples*? Maybe with random maximal values... plus the equation will be used only once for each new set of {x(n),x(n-1),x(n-2)}, then the Y's will be shifted just like the X's.

Well, for now I'm going to write and post the code *assuming* the Y's are 16 bits too and that they will NOT be > FFFF after the addition. And I will have to simulate to check the results...
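One way to sanity-check the 16-bit assumption before simulating is to bound the feedforward part of the equation on its own. The sketch below does that on a PC in Python, under assumptions that are not stated in the thread: the posted integers are used as raw multipliers with no scaling shift, and x(n) is an unsigned 10-bit code in 0..1023. The names `FILTERS`, `X_MAX` and `feedforward_range` are made up for illustration.

```python
# First five coefficients of each filter as posted: b11, b12, b13, a11, a12.
FILTERS = {
    "filter1": (17, -25, 17, -460, 246),
    "filter2": (26, -23, 26, 340, -246),
    "filter3": (32, -5, 32, 168, -246),
    "filter4": (35, -17, 35, -10, -246),
    "filter5": (31, 9, 31, -205, -246),
    "filter6": (25, 25, 25, -369, -246),
    "filter7": (13, 21, 13, -477, -245),
}

X_MAX = 1023  # assumption: unsigned 10-bit ADC code, 0..1023


def feedforward_range(coeffs):
    """Return (min, max) of b11*x(n) + b12*x(n-1) + b13*x(n-2) for x in 0..X_MAX."""
    b_terms = coeffs[:3]
    lowest = sum(c for c in b_terms if c < 0) * X_MAX   # negative taps at full scale
    highest = sum(c for c in b_terms if c > 0) * X_MAX  # positive taps at full scale
    return lowest, highest


for name, coeffs in FILTERS.items():
    lowest, highest = feedforward_range(coeffs)
    fits = -32768 <= lowest and highest <= 32767
    print(f"{name}: {lowest}..{highest}  fits signed 16-bit: {fits}")
```

Under these assumptions the x-terms alone can leave the signed 16-bit range for every filter, and filters 4 to 6 can exceed FFFF even as unsigned values, before the y-terms are added. If the coefficients are really fractions scaled by some power of two, the accumulator still has to hold the full-width sum until the shift is applied.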
 

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
Clearly, there's going to be a problem!
The Y's need to be at least 24 bits IMO, and this implies the highlighted part of the equation needs to be a 16-by-24 multiplication!

Have a look at the last-but-one comment, 'y(n) <- accum1': I can't just copy the first 2 bytes only (res0:res1).

y1(n) = b11*x(n) + b12*x(n-1) + b13*x(n-2) - a11*y1(n-1) - a12*y1(n-2)

The attached code will have to be repeated 7 times for the 7 filters.

Looks like I will have to give up on this thread soon...:) because it's getting ugly!

Was also wondering where's a good place to update the Y's array?
Should I add this job to the 'high isr'?

Edit: maybe if using 8-bit ADC it would work!
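On where to update the Y's: one option is to do it in the same step that computes y1(n), right after the accumulator result is known, so the X and Y delay lines always move together. A minimal Python model of that ordering is sketched below. The `shift` parameter is hypothetical and stands in for whatever coefficient scaling is eventually chosen (the thread does not say), and `biquad_step` is a made-up name.

```python
def biquad_step(coeffs, state, x_n, shift=0):
    """One update of y1(n) = b11*x(n) + b12*x(n-1) + b13*x(n-2) - a11*y1(n-1) - a12*y1(n-2).

    state is (x(n-1), x(n-2), y1(n-1), y1(n-2)); the new state is returned
    together with the output so both delay lines shift in one place.
    """
    b11, b12, b13, a11, a12 = coeffs
    x1, x2, y1, y2 = state
    acc = b11 * x_n + b12 * x1 + b13 * x2 - a11 * y1 - a12 * y2
    y_n = acc >> shift  # arithmetic shift; shift=0 keeps the raw accumulator
    # Shift the queues: x(n) -> x(n-1) -> x(n-2), y1(n) -> y1(n-1) -> y1(n-2).
    return y_n, (x_n, x1, y_n, y1)


coeffs = (17, -25, 17, -460, 246)  # filter1 as posted
state = (0, 0, 0, 0)
for x in (100, 100, 100):
    y, state = biquad_step(coeffs, state, x)
    print(x, y)
```

With the posted integers used unscaled, the output grows past 16 bits after only two samples, which may mean the coefficients are intended to be scaled down by a shift, though that is a guess.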
 

Attachments


Thread Starter

Eric007

Joined Aug 5, 2011
1,158
Guess I need to take a break from this thread (couple of days or weeks depends...) and come back a bit later if new ideas...

Do you or anyone here have a better algorithm for this thing? like FFT or something...I would like to see what the FFT algorithm for speech looks like...

Will have a look at your link...

How will going back to 'storing half a second of samples' help with the current problem?
By reducing the resolution, do you mean chopping off the 2 bits in the ADC's high byte?

Thanks!
 

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
I have corrected a few errors in the routine in post #163 and completed it.
Apart from the Y's size problem, that is how that routine should be implemented IMO. I will still have to write the sub-routines faccum, y1n and clear_accum, but that's trivial.
 

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
I decided to allocate 16 bits to the Y's, and I think, as I said in an earlier post, that the result of the equation won't be > FFFF given that there are some *negative* coefficients, so they will subtract.

I will complete the code and post everything. Then I will use Matlab to record the different words I am planning to recognize, sample them at the same rate as the MCU with 10-bit resolution, and get the 2000 samples for each word. That way I'll have real values to simulate with and can check whether there will be overflow (i.e. > FFFF). Sounds like a lot of work, but simulation/testing is the only way to make sure.

What do you think?

Looks like you are not going to be able to do it in real time unless you reduce the resolution or frequency.
...
Another idea was to divide all the coefficients by 10, for instance, to keep the results from getting > FFFF. But then I would be changing the coefficients... do you think that will be a problem? I mean, will that affect the filtering?
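Whether scaling the coefficients matters depends on which ones are scaled. Scaling only the b terms changes the gain and nothing else; scaling the a terms as well changes the feedback, so it is a different filter. The sketch below shows this on an impulse, using made-up floating-point coefficients (`b` and `a` here are illustrative values, not the ones from the tables) and a hypothetical helper `run_biquad`.

```python
def run_biquad(b, a, xs):
    """Direct form: y(n) = b0*x(n) + b1*x(n-1) + b2*x(n-2) - a1*y(n-1) - a2*y(n-2)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in xs:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


impulse = [1.0] + [0.0] * 7
b = (0.25, 0.5, 0.25)  # made-up feedforward coefficients
a = (-0.5, 0.25)       # made-up feedback coefficients

reference = run_biquad(b, a, impulse)
only_b_scaled = run_biquad([c / 10 for c in b], a, impulse)
all_scaled = run_biquad([c / 10 for c in b], [c / 10 for c in a], impulse)

# Multiply by 10 to undo the gain change so the shapes can be compared.
print([round(v, 4) for v in reference])
print([round(10 * v, 4) for v in only_b_scaled])
print([round(10 * v, 4) for v in all_scaled])
```

The first two printed responses match; the third does not, because the feedback terms were changed too.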

Thanks for the comments!
 

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
I am thinking of losing the 'DSP button' (the one that triggers the actual recognition processing) in my schematics, because with it, it doesn't feel as if the voice command itself has ordered the action (lighting an LED, or whatever)... instead I'll do as follows:

Once 2000 samples have been captured, the ADC is disabled and the DSP processing gets activated. As soon as the DSP processing is done, whether or not a word was recognized, the ADC gets re-enabled for the next recognition!

Sounds cool, right?:)

JohnP was right! This is one of those projects where hardware design comes second...
 

Markd77

Joined Sep 7, 2009
2,806
Something you could do is continuously sample and test if there is a sudden increase in sound volume then start processing, might give a better chance of getting the command words.
 

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
Something you could do is continuously sample and test if there is a sudden increase in sound volume then start processing, might give a better chance of getting the command words.
Sounds good, but the highlighted part sounds *not easy*. Well, I haven't thought about how to do it yet, but this confirms that you agree with me that the DSP button should be removed, correct?

I am busy playing with Matlab and some sample 'command voice' I recorded...stop, start,... and checking what the samples look like...will discuss my experiment...

Thanks for your comment!
 

joeyd999

Joined Jun 6, 2011
6,334
Sounds good but the highlighted part sounds *not easy*...
Actually, Eric. This would be quite easy. An analog level detector driving another CPU pin could activate the signal processing code (this is assuming you didn't want the complexity of including a software level detector in your code).
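For comparison, the software level detector mentioned above does not have to be complicated either. A rough Python sketch follows, assuming 10-bit samples centred on 512 and using made-up `window` and `threshold` values that would need tuning on real recordings; `find_onset` is a hypothetical name.

```python
def find_onset(samples, midscale=512, window=32, threshold=40):
    """Return the index at which the running mean of |sample - midscale|
    over the last `window` samples first exceeds `threshold`, else None."""
    acc = 0  # running sum of absolute deviations inside the window
    for i, s in enumerate(samples):
        acc += abs(s - midscale)
        if i >= window:
            acc -= abs(samples[i - window] - midscale)  # drop the oldest sample
        if i >= window - 1 and acc > threshold * window:
            return i
    return None


quiet = [512] * 100
loud = [612, 412] * 50  # +/-100 counts around midscale
print(find_onset(quiet + loud))
print(find_onset(quiet))
```

Comparing the sum against `threshold * window` avoids a division, which keeps the same idea cheap on an 8-bit MCU.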
 

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
Actually, Eric. This would be quite easy. An analog level detector driving another CPU pin could activate the signal processing code (this is assuming you didn't want the complexity of including a software level detector in your code).
Hmm, sounds very good! Well, I will investigate the complexity of the two solutions (hardware and software) and decide, BUT I like *complexity* now coz that implies I'll learn something new!:) So if there's room to fit that *code* into the rest, I will...

So we're keeping this part for a bit later, after solving the current problem!

Thanks!
 

joeyd999

Joined Jun 6, 2011
6,334
...BUT I like *complexity* now coz that implies I'll learn something new!:)...
Dude...you are drowning in complexity! Simplify a bit. And get something *small* working so as to achieve some short-term success. This will provide incentive to move forward to bigger and better things.
 

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
I am enjoying playing with Matlab which is a *very powerful* tool!
You can do lots of speech processing with it.

Here are two words recorded as .wav at 8 kHz, 16 bits/sample (attached).
The x-axis is the sample number; I am not sure yet about the y-axis.

From the attached figure we can see that a complete word takes about 2500 samples.

The samples look similar BUT different, and they have a decimal point and some are negative... guess the samples from the PIC ADC are converted values of these, huh?

I have attached some of the samples too (not all of them!) *for the word Right*. There are 13230 in total, but like I said the word ends at ~2500 samples... I pressed the stop button late when recording. Those are 16-bit samples; I'll be using 10-bit ones...
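The signed, fractional values look like samples normalized to the range -1.0..1.0, which is a common way for tools to return .wav data (an assumption here, since the attachment is not shown). To turn either those or raw 16-bit values into something resembling PIC ADC codes, a conversion along these lines could be used. It assumes the ADC input is biased at mid-scale so that silence reads about 512, and the function names are made up.

```python
def float_to_adc10(v):
    """Map a normalized sample in -1.0..1.0 to an unsigned 10-bit code 0..1023."""
    v = max(-1.0, min(1.0, v))           # clip anything out of range
    return round((v + 1.0) * 0.5 * 1023)


def int16_to_adc10(s):
    """Map a signed 16-bit PCM sample to 0..1023: add the offset, drop the 6 LSBs."""
    return (s + 32768) >> 6


print(float_to_adc10(-1.0), float_to_adc10(1.0))
print(int16_to_adc10(-32768), int16_to_adc10(0), int16_to_adc10(32767))
```

Dropping the 6 low bits is what going from 16-bit to 10-bit resolution amounts to; the real ADC will also add its own noise and offset, so this only approximates what the PIC would see.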

This:
.... test if there is a sudden increase in sound volume then start processing, might give a better chance of getting the command words.
is going to be applied here as shown in the figure!:)

Any comment from this post?
 

Attachments


Thread Starter

Eric007

Joined Aug 5, 2011
1,158
Attached are three routines (faccum, y1n and clear_accum). I had to attach the code because I can't format it well using the [code] and [/code] tags.

faccum accumulates the result of each multiplication in the equation. Given that this PIC has 'addwfc', there is no need to test the carry bit and increment when there's a carry. But I need to know what happens, and what to do, when there's a carry out of the last byte (the last addition).

y1n stores the result of the entire equation. BUT notice that I only copy the two lower bytes, because I have *assumed* that the result will not overflow FFFF. And I think I have to make this constraint true all the time, otherwise it gets too complicated.

clear_accum just clears the accumulator for the next pass/equation.

This equation y1(n) = b11*x(n) + b12*x(n-1) + b13*x(n-2) - a11*y1(n-1) - a12*y1(n-2) can be rewritten as:

y1(n) = b11*x(n) + b12*x(n-1) + b13*x(n-2) + (-a11)*y1(n-1) +(-a12)*y1(n-2), this implies I will have to change the sign of the last two coefficients ONLY (in the filter table in program memory) so I can use the *same* accumulator routine (faccum) throughout the equation that *will add* and not having an 'add' and a 'subtract', do you agree with me???

Please, any criticism/comments on the above and on the code (logic and syntax) are welcome.
Thanks!
 

Attachments

joeyd999

Joined Jun 6, 2011
6,334
This equation y1(n) = b11*x(n) + b12*x(n-1) + b13*x(n-2) - a11*y1(n-1) - a12*y1(n-2) can be rewritten as:

y1(n) = b11*x(n) + b12*x(n-1) + b13*x(n-2) + (-a11)*y1(n-1) +(-a12)*y1(n-2), this implies I will have to change the sign of the last two coefficients ONLY (in the filter table in program memory) so I can use the *same* accumulator routine (faccum) throughout the equation that *will add* and not having an 'add' and a 'subtract', do you agree with me???
I haven't looked at your code, but, tell me: why do you think using negative coefficients (with unsigned multiplication!) eliminates the need for a subtraction?

EDIT: Oh, and yes, you will need to find something to do with that last carry!

EDIT 2: And, it is the *high* two bytes (+ carry!) of the accumulator that are significant, not the low two bytes.
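To illustrate the point about unsigned multiplication: the PIC18 hardware multiplier (MULWF) is an unsigned 8x8 multiply, so a negative coefficient stored as two's complement gets treated as a large positive number. The Python sketch below shows the mismatch and one common workaround, which is to multiply magnitudes and decide between add and subtract from the two sign flags. The function names and sample values are made up for illustration.

```python
def unsigned_product(coeff, sample):
    """What an unsigned 16x16 multiply returns when a signed coefficient
    is fed in as its raw 16-bit two's-complement pattern."""
    return ((coeff & 0xFFFF) * (sample & 0xFFFF)) & 0xFFFFFFFF


def signed_mac(terms):
    """Multiply magnitudes, then add or subtract according to the signs."""
    acc = 0
    for coeff, sample in terms:
        product = abs(coeff) * abs(sample)      # unsigned multiply of magnitudes
        negative = (coeff < 0) != (sample < 0)  # XOR of the two sign flags
        acc = acc - product if negative else acc + product
    return acc


wrong = unsigned_product(-25, 900)
right = (-25 * 900) & 0xFFFFFFFF
print(hex(wrong), hex(right))                       # upper bytes differ
print(hex(wrong & 0xFFFF), hex(right & 0xFFFF))     # low 16 bits agree

terms = [(17, 1000), (-25, 900), (17, 800)]
print(signed_mac(terms))
```

So changing the sign of a coefficient in the table does not remove the subtraction by itself: either the multiply routine has to be sign-aware, or the accumulate step has to choose between adding and subtracting.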
 

Thread Starter

Eric007

Joined Aug 5, 2011
1,158
I haven't looked at your code, but, tell me: why do you think using negative coefficients (with unsigned multiplication!) eliminates the need for a subtraction?
Not sure I understand your question! Which subtraction are you referring to? We still have those two routines, 'muladd' and 'mulsub', that multiply the +ve samples with +ve coefficients and -ve coefficients respectively. BUT, if I can rephrase my question: is the negative sign in the equation a subtraction, or is it the sign of a negative coefficient??

I am getting a bit confused...:confused: BUT common sense tells me it is a *subtraction* in the equation, because the coefficients generated with Matlab are *not* always *negative* at the same position.

A quick question: we have 5 multiplications in the equation and the result of some *may* be negative. Now, when accumulating the results of these 5 multiplications, do we have to test whether each result is negative and, if it is, subtract it from the accumulator?? If so, then I will have to edit my accumulator routine.

I think that is what you meant, and if so then it does *not* eliminate the need for a subtraction.

Please clarify this one for me!
 

joeyd999

Joined Jun 6, 2011
6,334
Eric,

Sorry to say, but I've grown bored with your project. So, just to give it a kick, I went and wrote your code for you. Here are the data definitions (put this in an .inc file):

Rich (BB code):
	cblock	0x100


;Accumulator (don't change order of data)

	Accum:4		;*must* be on page boundary
	ACSink		;accumulator carry sink (must follow accum)

	AccFlags	;order T4.T3.T2.T1.T0

	tcount		;term counter
	accptr		;accumulator pointer

	
;Samples (don't change order of data)

	XYFlags		;order: x2.x1.x0.y2.y1
	
	Xn_2:2		;X terms
	Xn_1:2
	Xn0:2
	Yn_2:2		;Y terms
	Yn_1:2
	
;Constants (don't change order of data)

	ABFlags		;order: B3.B2.B1.A2.A1
	
	ConstB3:2
	ConstB2:2
	ConstB1:2
	ConstA2:2
	ConstA1:2	

;Sample Data

	Sample:2	;16-bit sample (code copies Sample and Sample+1)

	endc
And here's the code:

Rich (BB code):
	org	0x0000			;<-- change origin to someplace appropriate

start	movlb	1			;all work in bank 1

init

;---- To Do: initialize Constants -----


dosample

;-----  To Do:  HERE need to load Sample with current sample before continuing -------

sloop	movff	Sample,Xn0		;copy current sample into Sample structure
	movff	Sample+1,Xn0+1
	bcf	XYFlags,2,1		;current sample is always positive

	lfsr	0,XYFlags		;point to Sample structure
	lfsr	1,ABFlags		;point to Constant structure
	lfsr	2,ACSink		;point to high byte of accumulator

	movlw	low Accum		;need extra pointer to low byte for byte alignment
	movwf	accptr,1

	call	mulacc			;multiply/accumulate

;shift data in queue

	movff	Xn_1+0,Xn_2+0		;Xn_1 -> Xn_2
	movff	Xn_1+1,Xn_2+1

	movff	Xn0+0,Xn_1+0		;Xn0 -> Xn_1
	movff	Xn0+1,Xn_1+1

	movff	Yn_1+0,Yn_2+0		;Yn_1 -> Yn_2
	movff	Yn_1+1,Yn_2+1

;add accumulator to queue

	movff	Accum+2,Yn_1+0		;copy data from accumulator to mid queue
	movff	Accum+3,Yn_1+1

	bcf	status,c		;assume positive

	btfss	ACSink,6,1		;no negate if positive
	bra	acnn

	comf	Yn_1+1,f,1		;negate
	negf	Yn_1+0,1
	btfsc	status,c
	incf	Yn_1+1,f,1		

	bsf	status,c

acnn	rlcf	XYFlags,f,1		;rotate sign into XY Flags

	goto	dosample		;done with this sample

;***********************************
;** mulacc -- Multiply/Accumulate **
;***********************************

mulacc	movlw	0x80
	movwf	postdec2		;preset carry sink

	clrf	postdec2		;clear accumulator
	clrf	postdec2
	clrf	postdec2
	clrf	indf2

	movf	postinc0,w		;get XY Flags
	xorwf	postinc1,w		;xor with constant flags
	movwf	AccFlags,1		;save +/- result for add/subtract

;set up for loop

	movlw	5			;5 terms
	movwf	tcount,1			

;mulacc loop

maloop

;cross multiply

	incf	accptr,f,1		;set up next cycle alignment

	movf	postinc0,w		;get sample (L*L)
	mulwf	indf1			;multiply by constant
	call	accumulate		;and accumulate

	movf	postdec0,w		;get sample (H*L)
	mulwf	postinc1		;multiply by constant
	call	accumulate		;and accumulate

	incf	accptr,f,1		;set up next cycle alignment

	movf	postinc0,w		;get sample (L*H)
	mulwf	indf1			;multiply by constant
	call	accumulate		;and accumulate

	bcf	accptr,1,1		;reset accumulator pointer for next term

	movf	postinc0,w		;get sample (H*H)
	mulwf	postinc1		;multiply by constant
	call	accumulate		;and accumulate

	rrncf	AccFlags,f,1		;rotate sign flags for next pass

	decfsz	tcount,f,1		;decrement tcount
	bra	maloop


	return

;******************************************************
;** Accumulate (add or subtract based on AccFlags.0) **
;******************************************************

accumulate

	btfsc	AccFlags,0		;positive?
	bra	AccSub			;no, subtract

AccAdd	movf	prodl,w			;get low product
	addwf	postinc2,f		;add into accumulator
	movf	prodh,w
	addwfc	postinc2,f
	clrf	wreg
AACy	addwfc	postinc2,f		;process all carries
	btfsc	status,c
	bra	AACy
	bra	AAdone

AccSub	movf	prodl,w			;do same but subtract
	subwf	postinc2,f
	movf	prodh,w
	subwfb	postinc2,f
	clrf	wreg
ASCy	subwfb	postinc2,f
	btfss	status,c
	bra	ASCy

AAdone	movff	accptr,fsr2l		;reset accum. pointer for next mult

	return

	end
It'd be hard for me to think of a faster, more flexible way to do this. You'll have to sit down with the simulator to figure out how it works. I'll help you where I can.

Good luck.

PS -> I only did some quick simulations...there will probably be a bug here or there.
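When stepping through mulacc in the simulator, it can help to have a reference for what each term's product should be. The cross-multiply section builds a 16x16 product from four 8x8 products added into the accumulator at byte offsets 0, 1, 1 and 2. A small Python model of just that part (unsigned only, ignoring the sign flags and the carry sink, with a made-up function name) is:

```python
def mul16_via_8x8(a, b):
    """Unsigned 16x16 -> 32-bit multiply from four 8x8 partial products."""
    al, ah = a & 0xFF, (a >> 8) & 0xFF
    bl, bh = b & 0xFF, (b >> 8) & 0xFF
    acc = 0
    # (partial product, byte offset in the accumulator), in the order L*L, H*L, L*H, H*H
    for product, offset in ((al * bl, 0), (ah * bl, 1), (al * bh, 1), (ah * bh, 2)):
        acc += product << (8 * offset)
    return acc & 0xFFFFFFFF


for a, b in ((1023, 477), (0x1234, 0x00FF), (0xFFFF, 0xFFFF)):
    print(hex(a), hex(b), hex(mul16_via_8x8(a, b)), hex(a * b))
```

Comparing the accumulator bytes in the simulator against this after a single term is an easy first check before looking at the add/subtract logic.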
 

joeyd999

Joined Jun 6, 2011
6,334
Here's a modification to the AccAdd/AccSub code segment that is a bit more efficient:

Rich (BB code):
AccAdd	movf	prodl,w			;get low product
	addwf	postinc2,f		;add into accumulator
	movf	prodh,w
	addwfc	postinc2,f
	clrf	wreg
AACy	btfss	status,c
	bra	AAdone
	addwfc	postinc2,f		;process all carries
	bra	AACy

AccSub	movf	prodl,w			;do same but subtract
	subwf	postinc2,f
	movf	prodh,w
	subwfb	postinc2,f
	clrf	wreg
ASCy	btfsc	status,c
	bra	AAdone
	subwfb	postinc2,f
	bra	ASCy
 