2560: Confounding Variables
In statistics, a ''confounding variable'' is a third variable that's related to the independent variable, and also causally related to the dependent variable. For example, suppose you see a correlation between sunburn rates and ice cream consumption. The confounding variable is temperature: high temperatures cause people to go out in the sun and get burned more, and also to eat more ice cream.
One way to control for a confounding variable is to restrict your data-set to samples with the same value of the confounding variable. But if you do this too much, your choice of that "same value" can produce results that don't generalize. Common examples of this in medical testing are using subjects of the same sex or race: the results may only be valid for that sex/race, not for all people.
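As a rough illustration of restriction, here is a minimal Python sketch using made-up data. Every number in it (the temperature range, the slopes, the noise levels, the 24.5 to 25.5 degree band) is an assumption chosen for the demonstration, not something taken from the comic or from real measurements. Temperature drives both sunburns and ice cream sales, and the two are otherwise unrelated.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated days: temperature is the confounder (all values are invented).
days = []
for _ in range(5000):
    temp = rng.uniform(10, 35)                # degrees Celsius
    sunburns = 2.0 * temp + rng.gauss(0, 5)   # depends on temperature only
    ice_cream = 3.0 * temp + rng.gauss(0, 5)  # depends on temperature only
    days.append((temp, sunburns, ice_cream))

# Correlation over the whole data-set.
r_all = pearson([d[1] for d in days], [d[2] for d in days])

# Restriction: keep only days with (nearly) the same temperature.
restricted = [d for d in days if 24.5 <= d[0] <= 25.5]
r_restricted = pearson([d[1] for d in restricted], [d[2] for d in restricted])

print(f"all days:        n={len(days)}, r={r_all:.2f}")
print(f"restricted days: n={len(restricted)}, r={r_restricted:.2f}")
```

In this simulation the correlation is strong over all days and close to zero within the narrow temperature band, at the cost of throwing away most of the samples. Whatever is learned from the restricted set only describes days of about that temperature.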
There are often multiple confounding variables. It may be difficult to control for all of them without narrowing down your data-set so much that it's not useful. So you have to choose which variables to control for, and this choice biases your results.
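The shrinking data-set can be sketched the same way. In the hypothetical example below, the four confounders, their categories, and the 1000 randomly generated subjects are all invented for illustration. Each extra variable that is held fixed filters the remaining subjects further.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical confounders and their possible values (illustrative only).
confounders = {
    "sex": ["F", "M"],
    "age_group": ["18-39", "40-64", "65+"],
    "smoker": [True, False],
    "region": ["north", "south", "east", "west"],
}

# 1000 made-up subjects with randomly assigned attributes.
subjects = [
    {name: rng.choice(values) for name, values in confounders.items()}
    for _ in range(1000)
]

# Hold each confounder fixed at the value of an arbitrary reference subject.
reference = subjects[0]
remaining = subjects
counts = [len(remaining)]
for name in confounders:
    remaining = [s for s in remaining if s[name] == reference[name]]
    counts.append(len(remaining))
    print(f"after fixing {name}: {len(remaining)} subjects left")
```

With these categories there are 2 × 3 × 2 × 4 = 48 possible combinations, so on average only about one subject in 48 survives all four restrictions. Which variables to fix, and at which values, is exactly the choice the text above warns about.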