5. Errors in Sampling
Transcript
Welcome back. In the next two lectures, we will cover errors. Error is an integral part of your data; you cannot get data that is completely error-proof. So there is error, and the questions are how we find that error and what the different types of error are. This will be discussed in the next two lectures. In this lecture we will focus on errors in sampling.
Sampling errors: a sampling error occurs when the sample used in the study is not representative of the whole population. Suppose I am measuring the heights of a class of 50 children. If all the other students are 5 years old and there is one 10-year-old studying in class 1, he becomes an error. In the same way, in a whole population of trees you might see a very small tree, a very small sapling which did not kick-start its growth, or one which has grown exorbitantly for some reason. You are supposed to treat those as errors. The error has to be taken out of your sampling data.
So, when we talk about measurements, there are errors integrated in the measurements, and those are called measurement errors. Measurement errors are also called response errors. The error is the difference between the measured value and the true value. It consists of bias and variance, and it results when data are incorrectly requested, provided, received, or recorded. Suppose we use a sensor which runs on solar power, and all of a sudden there is no light and the charge runs out. The sensor will show only a constant reading for the rest of the time, until it gets charged again. The data from those periods are incorrectly received, and that is an error. These errors may occur because of a deficiency in the questionnaire, the interviewer, the respondent, or the survey process. The question is not clear, or the interviewer does not know what question to ask, or the respondent simply says "ok", "yes", "maybe", or the survey process itself is wrong. In each case you will have error. So measurement errors are basically of four types. One is because of a poor questionnaire. It can be the questionnaire itself or a poorly designed table, a table you have formed where the question is put at the heading.
So that is the questionnaire. Then there is interviewer bias. With interviewer bias, the interviewer says that all the apple trees of this particular farm are excellently grown. He has a bias. Or he says that anything which comes out during this season is not good, that the fish do not multiply at all. What he does is look at four or five segments in the stratified frames and then give his judgment. That is interviewer bias. Then there is respondent bias. The respondent says, "Why should I measure it at all? Why should I do this?", so he can also introduce errors. So interviewer bias can be there, respondent error can be there, and there can be problems with the surveying process. If you have a sensor and you try to take measurements on a hot day, you cannot measure more than five samples. So it is very natural that whatever data you record comes out of your frustration or stress. That is also a problem with the surveying process. So what are the different causes of error due to sampling? One is population specification error. In population specification error you approach the incorrect person or the incorrect field. You pick up the wrong data points, and from those wrong data points you generate your data. So that is the incorrect person or field.
The next one is sample frame error. Here you approach an incorrect subpopulation. You are not focused on the main population, or the frame you formed from the population does not truly represent the population, so that can also lead to an error. So these two causes of error due to sampling are population specification error and sample frame error. Looking at the causes of error due to sampling more closely, there are two types: selection error and sample frame error.
Selection error is like having a self-invited impurity in your drink. You don't want it, but it comes along anyway. For example, when juice is drawn from a barrel, you will always have froth. You say, "I don't need the froth," but still the froth comes. When you pluck a rose there are thorns. You say, "I only need the rose, I don't need the thorns," but they come with it, so you have to be careful. That is called selection error: something you did not want comes in along with what you selected. The next one is sampling frame error, which occurs when a sampling frame does not sufficiently cover the population required for the study. I talked about single-stage sampling, double-stage sampling, and multi-stage sampling. You have to be very careful when you do double-stage or multi-stage sampling: make sure that the later stages also represent the population.
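To see what a sampling frame that does not cover the population can do, here is a minimal sketch in Python. Everything in it is assumed for illustration and is not from the lecture: the two zones (valley and hill), the tree heights, and the function name `simulate_frame_error` are all hypothetical. The frame lists only the valley trees, so a sample drawn from it cannot represent the hill trees.

```python
import random
import statistics


def simulate_frame_error(seed=0):
    """Compare a sample from the full population with one from a partial frame."""
    rng = random.Random(seed)

    # Hypothetical population: tree heights in metres from two zones of a farm.
    valley = [rng.gauss(4.0, 0.5) for _ in range(600)]  # taller trees
    hill = [rng.gauss(2.5, 0.5) for _ in range(400)]    # shorter trees
    population = valley + hill

    # Assumed faulty frame: only the valley trees were listed.
    partial_frame = valley

    sample_full = rng.sample(population, 100)
    sample_partial = rng.sample(partial_frame, 100)

    return {
        "population_mean": statistics.mean(population),
        "mean_from_full_frame": statistics.mean(sample_full),
        "mean_from_partial_frame": statistics.mean(sample_partial),
    }


if __name__ == "__main__":
    for name, value in simulate_frame_error().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the sample from the partial frame overstates the average height, because the shorter hill trees never had a chance of being selected. Taking a larger sample from the same faulty frame would not fix this.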
So, sample frame error and selection error. How can you reduce those errors? I talked about different errors; how you reduce them is very important to understand. Sampling errors are easy to identify. Here are a few simple steps to follow to reduce the error. First, increase the sample size. Of course, when you increase the sample size, time and cost increase, but if you want to reduce the error it is always better to increase the sample size. Next, divide the population into proper groups. For example, when you divide a farm, try to divide it into units of equal size and equal spacing, rather than having micro grids in one place where you are more focused. Have unit spacing everywhere and then sample. So divide the population into groups such that you can reduce the error. The next thing is know your population. Even before doing all these experiments, you should have a ballpark figure of how the response is going to be. You should know roughly 50 percent of the results beforehand; the remaining 50 percent is what is going to help you identify the significant parameters and the culprits which pull your productivity down. So these are the three very important steps: increase the sample size, divide the population into groups, and know your population.
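The first two steps, increasing the sample size and dividing the population into groups, can be tried out with a small simulation. This is only a sketch under assumptions of my own: the population is two hypothetical groups of tree heights, the grouping is proportional stratified sampling (each group contributes in proportion to its size), and all function names are made up for the illustration.

```python
import random
import statistics


def mean_abs_error(estimates, true_value):
    """Average absolute gap between the estimates and the true value."""
    return statistics.mean([abs(e - true_value) for e in estimates])


def simple_random_estimate(population, n, rng):
    """Mean of a simple random sample of size n."""
    return statistics.mean(rng.sample(population, n))


def stratified_estimate(groups, n, rng):
    """Sample each group in proportion to its size, then weight the group means."""
    total = sum(len(group) for group in groups)
    estimate = 0.0
    for group in groups:
        share = len(group) / total
        k = max(1, round(n * share))  # units drawn from this group
        estimate += share * statistics.mean(rng.sample(group, k))
    return estimate


def run_experiment(seed=1, repeats=300):
    """Return {sample size: (simple random error, stratified error)}."""
    rng = random.Random(seed)
    # Hypothetical groups: heights in metres from two zones of a farm.
    groups = [
        [rng.gauss(4.0, 0.5) for _ in range(600)],
        [rng.gauss(2.5, 0.5) for _ in range(400)],
    ]
    population = groups[0] + groups[1]
    true_mean = statistics.mean(population)

    results = {}
    for n in (25, 100, 400):
        simple = [simple_random_estimate(population, n, rng) for _ in range(repeats)]
        strat = [stratified_estimate(groups, n, rng) for _ in range(repeats)]
        results[n] = (mean_abs_error(simple, true_mean),
                      mean_abs_error(strat, true_mean))
    return results


if __name__ == "__main__":
    for n, (simple_err, strat_err) in run_experiment().items():
        print(f"n={n}: simple random {simple_err:.3f}, stratified {strat_err:.3f}")
```

With these assumed groups, the error shrinks as the sample size grows, and at the same sample size the grouped design tends to give a smaller error than simple random sampling. The third step, know your population, is what tells you how to form the groups in the first place.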
I am sure that you have now understood sampling and multi-stage sampling. In sampling you have to follow probability, and you should also understand that there are errors which get integrated into the data. These errors are a known devil, so you have to live with them, but you should know how to eliminate as much of this error as you can, so that your data becomes cleaner for further analysis.
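To put a number on the error that gets integrated into the data, recall that measurement error is the measured value minus the true value, and that it consists of bias and variance. A minimal sketch, assuming we know the true value and have repeated readings of it: the readings and the function name `measurement_error_summary` below are hypothetical. It uses the standard identity that the mean squared error equals the squared bias plus the variance of the errors.

```python
import statistics


def measurement_error_summary(measured, true_value):
    """Split measurement error into bias and variance."""
    errors = [m - true_value for m in measured]      # measured minus true
    bias = statistics.mean(errors)                   # systematic part
    variance = statistics.pvariance(errors)          # random spread around the bias
    mse = statistics.mean([e * e for e in errors])   # overall error size
    return {"bias": bias, "variance": variance, "mse": mse}


if __name__ == "__main__":
    # Hypothetical repeated readings (metres) of a tree whose true height is 3.0 m.
    readings = [3.2, 3.1, 3.3, 3.2, 3.2]
    summary = measurement_error_summary(readings, true_value=3.0)
    print(summary)
    # The identity mse = bias**2 + variance holds up to floating-point rounding.
    print(abs(summary["mse"] - (summary["bias"] ** 2 + summary["variance"])) < 1e-12)
```

A large bias points to something systematic, such as a miscalibrated sensor or a biased interviewer, while a large variance points to readings that scatter from one measurement to the next.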
Thank you very much.