
Properties of the χ² Distribution

In the last chapter we demonstrated what the distribution of the χ² values looks like.

A linear function of the form y = ax + b was fitted to the data of a measurement. The parameters of the function, a and b, were determined in such a way that the sum of the squared errors (χ²) becomes minimal. This was repeated for many measurements. The minimal χ² values obtained in this way were shown in a histogram.

Using the histogram, we found that the values of χ² are not distributed arbitrarily. χ² can only take values greater than zero. There are values that occur very frequently, i.e. the χ² distribution has a maximum. And χ² values that are much larger or smaller than the values near the maximum occur very rarely.

Let's look at the definition of the χ² distribution:
The chi-square distribution with f degrees of freedom is the distribution of the sum of f stochastically independent, squared, standard normally distributed random variables.
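The definition can be tried out directly. The following is a minimal sketch using only the Python standard library; the helper name `sample_chi2`, the seed, and the sample size are illustrative choices, not part of the original text. It draws sums of f squared standard normal numbers and checks that they are positive and scatter around f.

```python
import random

def sample_chi2(f, rng):
    """Draw one value: the sum of f squared standard normal variates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(f))

rng = random.Random(1)   # fixed seed so that the run is repeatable
f = 8                    # number of degrees of freedom (example value)
samples = [sample_chi2(f, rng) for _ in range(20000)]

mean = sum(samples) / len(samples)
print(f"f = {f}: sample mean = {mean:.2f}, smallest sample = {min(samples):.3f}")
```

A histogram of `samples` should resemble the histogram of χ² values from the previous chapter.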

The density of this distribution is:

$$\rho_f(\chi^2) = \frac{(\chi^2)^{f/2-1}\, e^{-\chi^2/2}}{2^{f/2}\,\Gamma(f/2)}, \qquad \chi^2 > 0,$$

where Γ represents the gamma function.
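As a rough numerical check, the density of the χ² distribution with f degrees of freedom can be written down in a few lines. This is a sketch under assumptions: the function names are made up for illustration, the density is evaluated in logarithmic form with `math.lgamma` to avoid overflow for large f, and the integration limit of 80 is simply a stand-in for infinity that is adequate for f = 8.

```python
import math

def chi2_density(x, f):
    """Density of the chi-square distribution with f degrees of freedom."""
    if x <= 0.0:
        return 0.0
    # Same formula as x**(f/2-1) * exp(-x/2) / (2**(f/2) * Gamma(f/2)),
    # evaluated via logarithms for numerical robustness.
    log_density = ((f / 2 - 1) * math.log(x) - x / 2
                   - (f / 2) * math.log(2.0) - math.lgamma(f / 2))
    return math.exp(log_density)

def trapezoid(func, a, b, steps):
    """Plain trapezoidal rule on an equidistant grid."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

f = 8
area = trapezoid(lambda x: chi2_density(x, f), 0.0, 80.0, 8000)
print(f"area under the density for f = {f}: {area:.6f}")
```

The area should come out very close to one, as expected for a probability density.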

The χ² distribution depends on only one parameter, f, which is called the number of degrees of freedom. What are degrees of freedom? The number of degrees of freedom of a system is the number of freely variable quantities that describe the system in a precisely defined state. Example: three numbers should add up to eight. The first two numbers can be chosen completely freely. For example, let's choose 5 and 6. The third number is now fixed and must be -3 so that the condition "sum equals eight" is fulfilled. The system "the sum of three numbers equals eight" therefore has two degrees of freedom.

How many degrees of freedom does the χ² distribution of the measurements from the previous chapter have? According to the definition, the χ² distribution is the distribution of the sum of squared, standard normally distributed random variables. If we have n measured values, the sum consists of n summands. However, this does not correspond to the number of degrees of freedom, since the measured values themselves are not standard normally distributed. We only achieve this through the minimization of χ², by which the parameters of the fit function are determined. Because the parameters are calculated from the data, two values can no longer be varied freely. The number of degrees of freedom f therefore corresponds to the number of measured values n minus the number of function parameters p to be calculated: f = n - p.

In our example there were 10 measured values. A linear function with two parameters was fitted to them. The number of degrees of freedom is therefore f = 10 - 2 = 8.
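The count of 8 degrees of freedom can be checked with a small simulation. The sketch below is illustrative only: the true line (`true_a`, `true_b`), the uncertainty `sigma`, and the x positions are hypothetical values, and χ² is assumed to be the sum of squared residuals divided by a known, equal uncertainty. If the reasoning above holds, the average of the minimal χ² values should be close to n - p = 8 rather than 10.

```python
import random

def fit_line(xs, ys):
    """Least-squares straight line y = a*x + b for equal uncertainties."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def chi_square(xs, ys, sigma, a, b):
    """Sum of squared residuals in units of the uncertainty sigma."""
    return sum(((y - (a * x + b)) / sigma) ** 2 for x, y in zip(xs, ys))

rng = random.Random(7)                   # fixed seed for repeatability
true_a, true_b, sigma = 2.0, 1.0, 0.5    # hypothetical measurement setup
xs = [float(i) for i in range(1, 11)]    # 10 measuring points
n, p = len(xs), 2                        # p = 2 fit parameters (a and b)
dof = n - p

chi2_values = []
for _ in range(5000):                    # repeat the "measurement" many times
    ys = [true_a * x + true_b + rng.gauss(0.0, sigma) for x in xs]
    a, b = fit_line(xs, ys)
    chi2_values.append(chi_square(xs, ys, sigma, a, b))

mean_chi2 = sum(chi2_values) / len(chi2_values)
print(f"degrees of freedom = {dof}, mean of minimal chi2 = {mean_chi2:.2f}")
```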

In the following animation you can investigate the shape of the density function of the χ² distribution as a function of the number of degrees of freedom. To do this, click on the small arrows to the right of the input field "Degrees of freedom". Observe the shape of the distribution as the number of degrees of freedom increases. For a large number of degrees of freedom (f ≈ 100), how large is the χ² value at the maximum of the distribution? Compare this value with the number of degrees of freedom you set.
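Without the applet, the same exploration can be approximated numerically. The following sketch scans the density on a grid and reports where it is largest; the function names, the grid step of 0.01, and the scan range up to 3f are arbitrary illustrative choices.

```python
import math

def chi2_density(x, f):
    """Density of the chi-square distribution with f degrees of freedom."""
    if x <= 0.0:
        return 0.0
    log_density = ((f / 2 - 1) * math.log(x) - x / 2
                   - (f / 2) * math.log(2.0) - math.lgamma(f / 2))
    return math.exp(log_density)

def position_of_maximum(f, step=0.01):
    """Grid search for the chi2 value at which the density is largest."""
    best_x = step
    best_value = chi2_density(step, f)
    for i in range(2, int(3 * f / step) + 1):
        x = i * step
        value = chi2_density(x, f)
        if value > best_value:
            best_x, best_value = x, value
    return best_x

for f in (4, 8, 100):
    print(f"f = {f:3d}: maximum of the density near chi2 = {position_of_maximum(f):.2f}")
```

Comparing the printed positions with f shows how the maximum moves as the number of degrees of freedom grows.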

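We can now calculate the probability with which a χ² value lies within a certain range. The probability of a χ² value greater than or equal to a value χ₀² is obtained from the integral:

$$P(\chi^2 \ge \chi_0^2) = \frac{\int_{\chi_0^2}^{\infty} \rho_f(x)\,dx}{\int_{0}^{\infty} \rho_f(x)\,dx}$$

Here ρ_f is the density function of the χ² distribution with f degrees of freedom, and the denominator is the integral of the density function from 0 to ∞ (the total area, which equals one for a normalized density). The probability is therefore given by the area under the curve to the right of χ₀², divided by the total area.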
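The ratio of the area on the right to the total area can be approximated by numerical integration. The sketch below uses the composite Simpson rule; the function names, the number of steps, and the finite upper limit that stands in for infinity are assumptions made for illustration, and the code is only meant for f > 2, where the density vanishes at zero.

```python
import math

def chi2_density(x, f):
    """Density of the chi-square distribution with f degrees of freedom."""
    if x <= 0.0:
        return 0.0
    log_density = ((f / 2 - 1) * math.log(x) - x / 2
                   - (f / 2) * math.log(2.0) - math.lgamma(f / 2))
    return math.exp(log_density)

def simpson(func, a, b, steps=4000):
    """Composite Simpson rule; steps is rounded up to an even number."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def tail_probability(chi2_0, f):
    """Area to the right of chi2_0 divided by the total area (f > 2 assumed)."""
    upper = max(chi2_0, float(f)) + 60.0 * math.sqrt(2.0 * f)  # stand-in for infinity
    density = lambda x: chi2_density(x, f)
    right_area = simpson(density, chi2_0, upper)
    total_area = simpson(density, 0.0, upper)
    return right_area / total_area

print(f"P(chi2 >= 7.3) for f = 8: {tail_probability(7.3, 8):.3f}")
```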

Activate the option "Draw χ² value" and click the button to calculate. The area under the curve is now highlighted in color. The area to the left of χ₀² is shown in green, the area to the right in red. The box on the right also shows the probability that a χ² value lies in the red area.

We can now make statements about the probability of a fit. Here is an example based on our measurement from the last chapter, in which a straight line with 2 parameters was fitted to 10 measuring points. The number of degrees of freedom is 8. Let us now assume that the fit yields a χ² value of 7.3.

Enter 8 in the applet for the number of degrees of freedom and draw the distribution with the option "Draw χ² value". Enter the value 7.3 for χ² (write decimal numbers with a point) and calculate the probability.

You get a probability of about 50%. This is a very good value, because it means that half of all possible χ² values are greater and half are smaller than the value we determined. Now let's assume that the fit yields a χ² value of 15.5. If you calculate the probability for this, you get a value of about 5%. Only 5% of all possible χ² values are greater than or equal to the value of 15.5. This is where statistics conventionally draws the line: probabilities of less than 5% are considered significant. If, for example, you had obtained a χ² value of 30, the probability would be only 0.02%. In other words, it is very unlikely that the data deviate this much from the fit function purely by chance.

Either you are so lucky that you should think about playing the lottery, or something went wrong. The latter is more likely to be the case. Perhaps systematic errors were overlooked, or the model is incorrect, i.e. the fit function cannot describe the data at all. For every fit you must therefore always think about the fit probability. In the next chapter we will explain this in detail using measurement examples from the lab course.
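The three probabilities quoted here can be reproduced without the applet. For an even number of degrees of freedom the tail probability has a closed form, P(χ² ≥ x) = e^(-x/2) Σ (x/2)^i / i! with i running from 0 to f/2 - 1. The sketch below uses this identity; the function name is illustrative, and the shortcut does not apply to odd f.

```python
import math

def tail_probability_even(chi2_0, f):
    """P(chi2 >= chi2_0) for an even number of degrees of freedom f."""
    if f <= 0 or f % 2:
        raise ValueError("this closed form needs a positive even f")
    half = chi2_0 / 2.0
    term = 1.0      # (half**i) / i! for i = 0
    total = 1.0
    for i in range(1, f // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

f = 8   # 10 measuring points minus 2 fit parameters
for chi2_value in (7.3, 15.5, 30.0):
    probability = tail_probability_even(chi2_value, f)
    print(f"chi2 = {chi2_value:4.1f}: probability = {100 * probability:.2f} %")
```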

There is another rule of thumb with which the quality of a fit can be estimated without complex calculations. To do this, we define the quantity χ²_red (reduced chi-square), which is the χ² value divided by the number of degrees of freedom:

$$\chi^2_{\mathrm{red}} = \frac{\chi^2}{f}$$

The value of χ²_red should be around one. In this case, the χ² value equals the expected value of the χ² distribution (the expected value of the χ² distribution with f degrees of freedom is f). If this is fulfilled, the probabilities, depending on the number of degrees of freedom, are in the range from 35% to 50%. The greater the number of degrees of freedom for the same χ²_red value, the closer the probability is to 50%.
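The rule of thumb can be checked by evaluating the fit probability at χ² = f for several values of f. The sketch below computes the distribution function from the power series of the regularized lower incomplete gamma function, which also works for odd f; the function names and the chosen values of f are illustrative, and the series loses precision when χ² is much larger than f.

```python
import math

def chi2_cdf(x, f):
    """P(chi2 <= x), from the series of the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    a, z = f / 2.0, x / 2.0
    term = 1.0
    total = 1.0
    n = 0
    # Sum z**n / ((a+1)(a+2)...(a+n)) until the terms become negligible.
    while term > 1e-16 * total and n < 100000:
        n += 1
        term *= z / (a + n)
        total += term
    prefactor = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    return min(1.0, prefactor * total)

def fit_probability(chi2_value, f):
    """Probability of a chi2 value greater than or equal to chi2_value."""
    return 1.0 - chi2_cdf(chi2_value, f)

probabilities = {}
for f in (2, 5, 8, 20, 100, 200):
    chi2_value = 1.0 * f     # reduced chi-square of exactly one
    probabilities[f] = fit_probability(chi2_value, f)
    print(f"f = {f:3d}, chi2_red = 1: probability = {100 * probabilities[f]:.1f} %")
```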

Vary the number of degrees of freedom in the applet and set χ² to the same value each time. Observe the probabilities. Also try large values of 100, 150 and 200 for χ² and f. The applet also offers the option of drawing a confidence interval. This is the χ² range that corresponds to a certain (adjustable) probability. Look at the confidence interval for probabilities of 5%, 10%, 50%, 90%, and 95%.
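A χ² range for a given probability can also be computed by inverting the tail probability with bisection. How the applet places its interval is not stated here, so the sketch below assumes a central interval with equal probability cut off on both sides; the function names are illustrative, and the closed form used for the tail probability is restricted to even f.

```python
import math

def tail_probability_even(chi2_0, f):
    """P(chi2 >= chi2_0) for an even number of degrees of freedom f."""
    half = chi2_0 / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, f // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def chi2_value_for_tail(probability, f):
    """Bisection: find the chi2 value whose tail probability equals `probability`."""
    low, high = 0.0, 10.0 * f + 100.0
    for _ in range(200):
        middle = 0.5 * (low + high)
        if tail_probability_even(middle, f) > probability:
            low = middle     # tail still too large, move to the right
        else:
            high = middle
    return 0.5 * (low + high)

def central_interval(coverage, f):
    """Assumed convention: equal probability is cut off on both sides."""
    tail = (1.0 - coverage) / 2.0
    return chi2_value_for_tail(1.0 - tail, f), chi2_value_for_tail(tail, f)

f = 8
for coverage in (0.05, 0.10, 0.50, 0.90, 0.95):
    lower, upper = central_interval(coverage, f)
    print(f"{100 * coverage:4.0f} %: chi2 between {lower:.2f} and {upper:.2f}")
```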

In the following chapter, "Example of a model adaptation", a measurement from the lab course is presented and the quality of the fit is examined using the fit probabilities.