
Sorry, I mistyped.

I meant that the additional variance explained by a more complex model, beyond what a linear model already explains, is much smaller than what the linear model explains in the first place. Obviously the total explained will be higher, or else your model is both complex and wrong. :-)

Consider:

  Model A - 1 degree of freedom - 60% of variance
  Model B - 2 degrees of freedom - 75% of variance

That extra DOF explains another 15 percentage points of the variance, but at the cost of added complexity. Perhaps that is worth it, perhaps not. As has been noted above, that complexity has a real cost that can manifest itself as overfitting, instability, and lack of generalization. The curse of dimensionality is very real.

All I meant was that the linear model will probably capture the most variance per unit of complexity, which gets back to my original point that most (all?) problems are linear to a first approximation. It's not just that people are lazy.
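
For anyone who wants to poke at this, here is a rough sketch in Python with NumPy. Everything in it is made up for illustration: the synthetic data (a mostly linear signal with a small quadratic term and some noise), the helper names `r_squared` and `variance_explained`, and the choice of a polynomial fit as the "more complex model". It only measures in-sample variance explained, so it says nothing about overfitting, and it counts the polynomial degree as the complexity while ignoring the intercept.

```python
import numpy as np


def r_squared(y, y_hat):
    """Fraction of the variance in y explained by the predictions y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def variance_explained(x, y, degree):
    """Least-squares polynomial fit of the given degree, scored in-sample."""
    coeffs = np.polyfit(x, y, degree)
    return r_squared(y, np.polyval(coeffs, x))


# Hypothetical data: mostly linear, with a small quadratic term and noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=200)
y = 2.0 * x + 0.5 * x**2 + rng.normal(0.0, 0.5, size=200)

r2_linear = variance_explained(x, y, 1)     # "Model A"
r2_quadratic = variance_explained(x, y, 2)  # "Model B"
extra = r2_quadratic - r2_linear            # gain from the extra term

print(f"linear:    {r2_linear:.3f}")
print(f"quadratic: {r2_quadratic:.3f}")
print(f"extra:     {extra:.3f}")
```

On data like this the quadratic fit never explains less than the linear one in-sample, but the extra amount should be small next to what the linear term already captured. Data with a strong curve and little trend would tell a different story, which is the "to a first approximation" caveat.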


