Curve Fitting

Discussion in 'Math' started by joeyd999, Jul 26, 2013.

  1. joeyd999

    Thread Starter AAC Fanatic!

    Jun 6, 2011
    2,675
    2,722
    For the math experts:

    Please see the attached .xls spreadsheet. I've got 2 series of data. The first is the original data; the second is a minor transform of the first. Either dataset is valid for my computations, but the second set seems easier to fit with a polynomial than the first, as is clear from the graphs.

    While the second fit is good, it's not as good as I want it. The numbers represent a natural phenomenon (which I am unable to disclose), and my gut tells me there should be a computable curve that fits exactly.

    Ideally, the entire curve should be computable from a single equation. I've tried splitting the curve into sections, and I've had great success modeling the first two-thirds of the curve. But the last third always seems to give me problems no matter what I do.

    FYI, I normally use the LINEST function, but I've shown the trendlines computed by the graph in Excel in the data provided.

    I'm wondering if one of our math experts can look at the curves and tell me if a different class of equations might do a better job fitting the data for one or both series.

    Thanks for the help.
     
  2. WBahn

    Moderator

    Mar 31, 2012
    17,715
    4,788
    Without actually playing with it, I would recommend modeling it as the sum of four terms. The first two terms provide the linear character of the middle part of the data, while two exponentials provide the tails.
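
    A minimal sketch of what such a four-term model could look like. The post does not give an exact form, so the decaying and growing exponentials, the parameter names, and the function name `four_term_model` are all assumptions made for illustration.

```python
import math

def four_term_model(x, a, b, c, k1, d, k2):
    """Hypothetical four-term model: a line plus two exponential tails.

    a + b*x        -> the roughly linear middle section
    c*exp(-k1*x)   -> contributes mainly at small x (leading tail)
    d*exp(k2*x)    -> contributes mainly at large x (trailing tail)
    """
    return a + b * x + c * math.exp(-k1 * x) + d * math.exp(k2 * x)

# With both exponential amplitudes set to zero the model is a plain line.
line_only = four_term_model(2.0, 1.0, 3.0, 0.0, 1.0, 0.0, 1.0)
```

    The rate constants k1 and k2 enter nonlinearly, so fitting all six parameters would need a nonlinear least-squares routine rather than LINEST.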
     
  3. studiot

    AAC Fanatic!

    Nov 9, 2007
    5,005
    513
    My first questions would be

    Were the measurements really made to the accuracy implied by the number of decimal places?

    If they were, how many repeats of the measurements were taken to confirm this?

    Are there any theoretical reasons to assume an asymptote at about x = 0.13 in the first curve, or are there any continuity conditions on the curve slopes at the ends of either curve?
     
  4. joeyd999

    Thread Starter AAC Fanatic!

    All good questions:

    All values are accurate to ~4 decimal places. This is both by design, and confirmed by cross-checking the data against both theoretical values and multiple instances of the physical apparatus.

    I have taken the data *dozens* of times.

    There is definitely a vertical asymptote near x = 0.13. This is due to the nature of the phenomenon. Unfortunately, my apparatus is not good enough to measure down to the actual asymptote within the margin of error. I wish I could!

    AFAIK, and also theoretically, no discontinuities are expected in the curves.

    BTW, modeling the precise *shape* of the curve is more important than a one-to-one correspondence of the values. I can adjust for scaling and offset errors, but not for mismatches in the shape. If you look closely at the second curve, the polynomial fit is close, but there are wiggles that extend above and below the actual data.
     
  5. WBahn

    Moderator

    Where are the "theoretical" values coming from? Why doesn't the theory give you the functional form of the curve leaving you to simply fit the parameters, perhaps according to a least-squares fit?
     
  6. joeyd999

    Thread Starter AAC Fanatic!

    Apologies for not being clear:

    The theory (at least as deep as I am capable of taking it) predicts the rough shape of the curve, not the numerical values. The values measured lie within the locus of points that fit the theory.

    The accuracy of the data has been confirmed via repeated measurements with multiple instances of apparatus and NIST traceable instrumentation.

    The curves shown are a least-squares polynomial fit. They work well, especially the second series. But as I mentioned earlier, there are "wobbles" in the curve that tell me it's not as good a fit as possible, and I am curious if there are other classes of equations that would better fit the data provided than a standard polynomial.
     
  7. MrChips

    Moderator

    Oct 2, 2009
    12,421
    3,356
    Is it possible to post the graph for us to see?
     
  8. WBahn

    Moderator

    Okay, but what is the rough shape of the curve that the theory provides? Why can't you develop a mathematical model from that and use least-squares to fit parameters to it?

    Speaking of which, what is this "transform" that you are referring to?
     
  9. studiot

    AAC Fanatic!

    Yes, there are many more functions that might be fitted.

    You could simply look at rational polynomial approximation (Padé); that might well give a better fit with less computational effort.

    An orthogonal series, e.g. Chebyshev, might be better still.

    The above methods only match the values of the function at known points. Very often the derivatives also need to be matched to obtain a better fit. You have plenty of data for this, so you could explore (cubic) splines.

    You say there is a real-world cutoff or zero in the data at about x=0.12, as previously noted.
    Unfortunately, the more tightly you fit a simple polynomial near a zero, and the higher its order, the more wiggly it becomes.
    The other methods suggested above do not have this problem.
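
    To make the rational-function suggestion concrete, here is a minimal sketch of one common way to fit a low-order rational function by ordinary least squares. It assumes the simplest case, y = (a0 + a1*x) / (1 + b1*x), and uses the linearized form y = a0 + a1*x - b1*x*y. The helper names and the synthetic data are illustrative only; none of this comes from the spreadsheet in the thread.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_rational_1_1(xs, ys):
    """Fit y = (a0 + a1*x) / (1 + b1*x) via the linearized least-squares form.

    Multiplying through by the denominator gives y = a0 + a1*x - b1*x*y,
    which is linear in the unknowns (a0, a1, b1).
    """
    rows = [[1.0, x, -x * y] for x, y in zip(xs, ys)]
    # Normal equations: (R^T R) p = R^T y
    rtr = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear(rtr, rty)

# Synthetic, noise-free data generated from a known rational function.
xs = [0.1 * i for i in range(1, 21)]
ys = [(1.0 + 2.0 * x) / (1.0 + 0.5 * x) for x in xs]
a0, a1, b1 = fit_rational_1_1(xs, ys)
```

    With noisy data the linearized form weights the residuals by the denominator, so its result is usually treated as a starting estimate for a proper nonlinear fit rather than as the final answer.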
     
    Last edited: Jul 27, 2013
  10. MrChips

    Moderator

    Data based on a physical process have a mathematical model. Define that mathematical model first before attempting to do any curve fitting. Fitting to a polynomial of an order higher than the physical process will result in squiggles.
     
  11. studiot

    AAC Fanatic!

    Not always, by a long way. And even if there is one, the computational effort of the real function may well make its use unattractive compared to an approximating one. That is what the calculus of variations and finite elements are all about.
     
  12. The Electrician

    AAC Fanatic!

    Oct 9, 2007
    2,281
    326
    If you reverse the X and Y axes, a sigmoid curve:

    http://en.wikipedia.org/wiki/Sigmoid_function

    seems to fit the data rather well:

    [Image: sigmoid fit to the Series 1 data with the X and Y axes reversed]
     
  13. joeyd999

    Thread Starter AAC Fanatic!

    Beautiful! And I'm slightly embarrassed that I hadn't heard of it before.

    But at least I have a direction now. I will need to research the following, but any knowledgeable assistance would be appreciated:

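    From what I see on the Wikipedia page, there is a whole class of sigmoidal functions. How do I go about choosing the appropriate one?

    How do I compute the coefficients that give the best least-squares fit (highest R^2)?

    Can the function be transposed (solved back for the original X and Y orientation)?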
     
  14. joeyd999

    Thread Starter AAC Fanatic!

    The theory predicts the leading edge curve and asymptote, the "linear" middle and the trailing edge curve and *supposed* asymptote. If I already knew the whole equation that models the curve (from a physical phenomenon standpoint), I wouldn't need to ask the question! I'd just apply it.

    I am actually trying to derive the model that mathematically explains the phenomenon.
     
  15. joeyd999

    Thread Starter AAC Fanatic!

    Since you asked:

    Assume Y is the vertical axis of the first series, and Y' is the vertical axis of the 2nd series, then:

    Y' = log2(2^Y + 300)

    I derived the transformation empirically, based on the data, that allowed the best overall polynomial fit. I assume it works because it "linearizes" the leading curve and eliminates the asymptote. It is also easy to reverse transform. Aside from that explanation, I have no other reason for using the transform.
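
    The transform and its reverse can be written out as a short sketch. The forward formula is the one given above; the inverse follows by solving it for Y. The function names are illustrative.

```python
import math

OFFSET = 300.0  # the constant from the transform in the post

def forward(y):
    """Y' = log2(2**Y + 300)"""
    return math.log2(2.0 ** y + OFFSET)

def inverse(y_prime):
    """Y = log2(2**Y' - 300).

    Only defined when 2**Y' > 300, i.e. Y' > log2(300) (about 8.23).
    """
    return math.log2(2.0 ** y_prime - OFFSET)

# Round trip on a few arbitrary test values (not data from the spreadsheet).
round_trip = [inverse(forward(y)) for y in (-2.0, 0.0, 3.5, 12.0)]
```

    One consequence of this form: as Y heads toward large negative values, Y' levels off near log2(300), while for large positive Y the 300 becomes negligible and Y' is approximately Y. That is consistent with the remark that the transform removes the asymptote.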
     
  16. MrChips

    Moderator

    No. There is a huge difference between a mathematical model of the process and a mathematical model of the data.
     
  17. LDC3

    Active Member

    Apr 27, 2013
    920
    160
    One of the sigmoidal functions is the four-parameter logistic (4PL) model, also called a four-parameter fit.
    http://www.miraibio.com/blog/2010/08/the-4-parameter-logistic-4pl-nonlinear-regression-model/
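
    A minimal sketch of the 4PL curve and its closed-form inverse, using one common parameterization. The parameter names a, b, c, d and their roles are an assumption here; the linked page or a given software package may label them differently.

```python
def four_pl(x, a, b, c, d):
    """Four-parameter logistic curve (one common parameterization).

    a: response at x = 0 (for b > 0)
    b: slope factor
    c: x value at the midpoint between a and d
    d: response as x grows without bound (for b > 0)
    Assumes x > 0 and c > 0 so the power is well defined.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)

def four_pl_inverse(y, a, b, c, d):
    """Solve the 4PL curve for x; y must lie strictly between a and d."""
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)

# Illustrative parameter values only, not fitted to the thread's data.
params = (0.0, 2.0, 1.5, 10.0)
y_mid = four_pl(1.5, *params)                      # halfway between a and d
x_back = four_pl_inverse(four_pl(0.7, *params), *params)
```

    Because the inverse exists in closed form, a curve fitted with the axes swapped can still be evaluated in the original orientation.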
     
  18. joeyd999

    Thread Starter AAC Fanatic!

    In a private message, The Electrician recommended the software TableCurve 2D available at:

    http://www.sigmaplot.com/products/tablecurve2d/tablecurve2d.php

    I downloaded and tried the "trial" version (actually the full-blown version, limited to 30 days).

    In about 2 seconds, the software fitted 3,547 different equations to the Series 2 curve and ranked them by R^2. 51 of the equations had an R^2 of 0.999999 or better. The software also graphs the response of each equation against the data.
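
    For reference, a minimal sketch of the usual coefficient of determination, R^2 = 1 - SS_res / SS_tot. Whether TableCurve 2D computes its ranking with exactly this definition is an assumption; the function name and sample numbers are illustrative.

```python
def r_squared(ys, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained
    ss_tot = sum((y - mean_y) ** 2 for y in ys)             # total variation
    return 1.0 - ss_res / ss_tot

# Arbitrary example values, not data from the spreadsheet.
ys = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(ys, ys)            # model reproduces every point
baseline = r_squared(ys, [2.5] * 4)    # model that only predicts the mean
```

    R^2 summarizes the overall residual size, so a value very close to 1 does not by itself rule out small systematic wiggles; plotting the residuals is the more direct check for those.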

    Interestingly, though not necessarily surprisingly, the Series 1 fitted curves did not match the data as well as Series 2.

    Very nice. While I typically don't purchase Windows software, I may make an exception in this case.

    Thank you, The Electrician.
     
  19. The Electrician

    AAC Fanatic!

    A trick you should always keep in mind is to reverse X and Y and try fitting again--TableCurve will do the reversal with a single click. That's what I did with your series 1 data.

    Sometimes you can solve the equation for the reversed X and Y axes in closed form. That actually works for the sigmoid TableCurve found.
     
  20. joeyd999

    Thread Starter AAC Fanatic!

    For those who care, TableCurve 2D gave me this equation for Series 2 as a good fit (R^2 = .99999816):

    Y=(A+Cx+Ex^2+Gx^3)/(1+Bx+Dx^2+Fx^3)

    where:

    A=7.4250189
    B=0.45593574
    C=11.159249
    D=-2.4873822
    E=-26.356847
    F=1.0763004
    G=8.5671721

    While this particular equation (and its constants) does not give me insight into the workings of the phenomenon, it does give a great fit without the wiggles. And, it was just a five-minute job to plug the formula into my PIC application (in .asm!). The end results are phenomenal.
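
    A minimal sketch of evaluating this rational function, using the coefficients quoted above and Horner's scheme so that each polynomial needs only three multiplications. The function name is illustrative, and the valid range of x is not stated in the thread, so the evaluation points below are arbitrary.

```python
# Coefficients as quoted in the post for the Series 2 fit.
A, B, C, D, E, F, G = (7.4250189, 0.45593574, 11.159249, -2.4873822,
                       -26.356847, 1.0763004, 8.5671721)

def series2_fit(x):
    """Y = (A + C*x + E*x^2 + G*x^3) / (1 + B*x + D*x^2 + F*x^3)."""
    numerator = A + x * (C + x * (E + x * G))        # Horner form
    denominator = 1.0 + x * (B + x * (D + x * F))    # Horner form
    return numerator / denominator

# Cross-check the Horner form against the expanded formula at one point.
x = 0.5
expanded = (A + C * x + E * x**2 + G * x**3) / (1 + B * x + D * x**2 + F * x**3)
horner = series2_fit(x)
```

    Like any rational function, this one can blow up where the denominator approaches zero, so it is safest to evaluate it only inside the x range that was actually fitted.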

    Thanks for your help, The Electrician!
     