
3. Cluster and multiple stage sampling

Transcript

You already know about sampling and some sampling techniques. In this lecture we will cover two things: one is called cluster sampling and the other is called multiple stage sampling. So, cluster and multiple stage sampling will be the focus of this lecture.

So first let us understand a little more about cluster sampling. Look at the figure given here: the entire set is the population. Within this population you are trying to cluster by color. Each color can be one variety of plant, or you can cluster according to height, according to time, according to the amount of water spent on the plants, or according to the amount of fertilizer applied on the farm. You can cluster on any one similarity, or on several. We generally use cluster sampling when natural groups are present in a population: we take the entire population and group it by some similarity. Again, I repeat, the similarity need not be the type; it can be the amount of money spent, the water spent, and so on. As the name suggests, cluster sampling means forming clusters from a given population. It is typically used in market research, when one can fetch information about clusters but not about the whole population. You extract data within the groups and then generalize to talk about the entire population. Cluster sampling is beneficial when you have to get the data quickly and at minimum cost; we follow it because it saves money and time. So, you form the clusters, then you pick up one from each cluster, and then you have a clustered sample. Now do all the experiments with this clustered sample and talk about the entire population.

So how is it done? The process is illustrated very clearly here. The first step is to define your population. It can be the different types of fish in a given square kilometer, or the cows being reared, the sheep being reared, or the fruits being grown in a given square kilometer. Once you define the population, you cluster it. Earlier I saw all the colors put together. Here I make sure that each cluster has two orange and two green, and this is called clustering the population. After clustering the population, I randomly select clusters. There are now six clusters: one, two, three, four, five, and six. I randomly pick three or four of them, whatever number it is, and then I collect the data from them and talk about the population. This is single stage clustering. At the second level you can do double stage clustering, in which I pick up two units within each selected cluster and work with those. Here it is represented pictorially with clusters of four as an example, but in real work each cluster will hold five hundred or a thousand units, or at least multiples of ten. After the double stage you should still have at least 20 to 30 samples in each cluster.
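The steps described here (define the population, cluster it, randomly select clusters, then optionally pick units within each selected cluster) can be sketched in a few lines of Python. This is only a minimal illustration under assumptions: the 24 plant labels, the cluster size of four, and the function names are all hypothetical and are not part of the lecture's figure.

```python
import random

def make_clusters(units, cluster_size):
    """Split the population into consecutive clusters of equal size."""
    return [units[i:i + cluster_size] for i in range(0, len(units), cluster_size)]

def single_stage_cluster_sample(clusters, n_clusters, rng):
    """Randomly select n_clusters clusters and keep every unit in them."""
    chosen = rng.sample(clusters, n_clusters)
    return [unit for cluster in chosen for unit in cluster]

def double_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly select clusters, then randomly pick units inside each one."""
    chosen = rng.sample(clusters, n_clusters)
    return [unit for cluster in chosen
            for unit in rng.sample(cluster, n_per_cluster)]

# Hypothetical population: 24 plants, grouped into 6 clusters of 4.
population = [f"plant_{i:02d}" for i in range(24)]
clusters = make_clusters(population, cluster_size=4)

rng = random.Random(0)  # fixed seed so the draw is repeatable
single = single_stage_cluster_sample(clusters, n_clusters=3, rng=rng)
double = double_stage_cluster_sample(clusters, n_clusters=3, n_per_cluster=2, rng=rng)

print(len(clusters), len(single), len(double))
```

With three of the six clusters selected, the single stage sample keeps all 12 units in those clusters, while the double stage sample keeps only two units from each, six in total. In a real study the clusters would be far larger, as noted in the lecture.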

So, what are the advantages? It is time and cost effective, and it gives higher validity due to higher randomization, because you have randomly picked from the clusters and not picked specifically. For example, if you pick only apple trees, then you can talk only about apple trees. In clustering we take apple, orange, and whatever else there is, and then in the next stage we take one apple and one orange and talk about those, so validity is higher because of the higher randomness. Clustering should be done very carefully; please keep that in mind. If you do the clustering only on a particular day or at a particular time, the clustering can fall into a local loop, and if it does you will not be able to extend your reasoning to the entire population. So, that was single stage; now let us get into multistage sampling.

Multistage sampling is an extended variation of cluster sampling. You start with the primary clusters, and from there you move down to the ultimate sampling units, otherwise called secondary sampling units, which you pick. So you do all the analysis for a small group and talk about the entire population. You divide a large population into stages, using a combination of stratified sampling or cluster sampling and simple random sampling. Unlike single stage sampling, a sampling frame for the whole population is not required here: you just pick the ultimate sampling units within the selected groups and start working. This is called multistage sampling. So, again, what are the advantages of multiple stage sampling? It is flexible, it is convenient, and it is cost effective.

In the previous method you clustered everything according to one objective, but here, within that one objective, you again take clusters. You can use any number of stages to come down to the required sample size, although this needs expertise and prior experience. The last advantage is that there is no restriction on the way the groups are divided. Those are the advantages. Coming to the challenges: it is less accurate than simple random sampling; its subjective component can put the results in question; and research findings can lack external validity because of that subjective component. You are focused on your own farm, so you cannot generalize and say that if I move from Ram Kumar’s farm to Shyam’s farm I will get the same data. That is what we are talking about here.

Before the example of multistage sampling, let me introduce a new term: Enumeration Area, or EA. The example here follows a definition given by the agricultural society: a systematic sampling of the types of wheat grown within the enumeration areas of a district. Districts, the strata, are the first stage; the EAs, the clusters that are formed, are the second stage; and households are the third stage. So within each district they take a sample of EAs, within each EA they take a sample of households, and within each household they take a sample of individuals. This example should give you more clarity on the concept I have presented.
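The district, EA, household, and individual stages can be sketched as nested random draws. This is a rough illustration only: the population sizes, the identifiers such as `D0-EA1-H2`, and the function names are hypothetical, and simple random sampling is assumed at every stage.

```python
import random

def build_population(n_districts=4, eas_per_district=5,
                     households_per_ea=6, people_per_household=3):
    """Build a hypothetical nested population: district -> EA -> household -> people."""
    population = {}
    for d in range(n_districts):
        district = {}
        for e in range(eas_per_district):
            ea = {}
            for h in range(households_per_ea):
                hh_id = f"D{d}-EA{e}-H{h}"
                ea[hh_id] = [f"{hh_id}-P{p}" for p in range(people_per_household)]
            district[f"D{d}-EA{e}"] = ea
        population[f"D{d}"] = district
    return population

def multistage_sample(population, n_eas, n_households, n_individuals, rng):
    """Every district is a stratum; EAs, households and individuals are sampled in turn."""
    sample = []
    for eas in population.values():                      # stage 1: each district (stratum)
        for ea_name in rng.sample(sorted(eas), n_eas):   # stage 2: EAs (clusters)
            households = eas[ea_name]
            for hh_name in rng.sample(sorted(households), n_households):  # stage 3
                # final draw: individuals within the chosen household
                sample.extend(rng.sample(households[hh_name], n_individuals))
    return sample

population = build_population()
rng = random.Random(42)  # fixed seed so the draw is repeatable
sample = multistage_sample(population, n_eas=2, n_households=3,
                           n_individuals=1, rng=rng)
print(len(sample))  # 4 districts x 2 EAs x 3 households x 1 individual = 24
```

Notice that only the selected EAs and households ever need to be listed in full, which is why a complete list of every individual in every district is not needed before sampling starts.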

So, in this lecture we went through single stage and multiple stage cluster sampling. This is very important if you are doing a study for a hospital, and it is important again if you are doing one for food preservation. So, everywhere, sampling, clustering, and multiple stage sampling play a very important role.

Thank you.

 

Licence


Statistical Techniques for Agriculturists Copyright © by Commonwealth of Learning (COL) is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.
