2. Planning and execution of sample
Transcript
Welcome to lecture two in week two. In this lecture we will cover the topic of planning and execution of a sample survey. Suppose we have decided to do statistical analysis because we want to improve the productivity of our farm or aqua farm. Having decided that it has to be done, we should now plan for that activity. So, how we plan for it, and the steps involved in planning, are important. The first important aspect is the formulation of the data requirement and the objective of the survey. You have to define a clear objective. For example: I want the yield of this particular area of the farm to improve by 10 percent; my fertilizer consumption should be reduced by 20 percent; I would like a 20 percent better saving on the investment I make. All of these are clear objectives. You have to fix the objective and then work around it. The objective cannot be very general. For example, "Ram Kumar would like to become the Prime Minister of India" is a very general statement; to get there, several sub-goals have to be achieved. So, try to fix your objective as small sub-goals which are reachable and executable. The formulation of the data requirement and the objective of the survey has to be clearly written down.
Next is the choice between an ad hoc and a repetitive survey. When you do an experiment and do not repeat it, you do not have reliability in the data. It is always good to repeat the same sort of experiment that you have done in a small field, understand what all the influencing parameters are, and then proceed. So, there is the repetitive survey and there is the ad hoc survey. An ad hoc survey is like asking, "Hey, did you see the tomato getting ripened, yes or no?", or it is done by someone who does it just as a part-time job. So, the data from an ad hoc survey are many times not reliable. Next, the method of data collection is very important. How are we going to do it, at which time of the day, and in which season? In every data analysis you have to record a time stamp, a date stamp, and also a climatic stamp. Here the resolution of the sensor also comes into the picture. How are we going to collect the data from the sensor? The sensor accuracy and reliability are also very important. Next, are we going to use a questionnaire or a schedule? A questionnaire asks, "Did you do this job?"; that is, the data are collected by asking questions. A schedule means that at a fixed time, say 10:10, I will go to the field, see what is happening, record the scene, and come back. Then come the survey, reference, and reporting periods; all of these are also very important. Then come the problems of the sampling frame. You cannot do a hundred percent inspection or validation of the population, so you divide it into several small frames, and you have to decide how to divide these frames and how to execute the collection of data. Then comes the choice of sampling design, and then the planning of the pilot survey. The pilot survey is very important.
Whenever somebody writes even a questionnaire to acquire data, they first write the questionnaire, then go to the customer, get the feedback, and come back and work on the data. When they try to understand the data, they try to clearly distinguish signal from noise, and if there is a large amount of noise in the questions they have generated, they go back and reframe the questions. So, doing pilot experiments will always improve your productivity in getting the output. Then comes the field work and processing of data, and finally the preparation of the report. All of these are the different steps involved in planning the experiments.
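The time stamp, date stamp, climatic stamp, and sensor resolution mentioned above can be kept together in one record per observation. The following is only a minimal sketch: the class name `FieldObservation`, its field names, the plot label, and the sample values are all assumptions made for illustration and do not come from the lecture.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FieldObservation:
    """One field measurement with its time, date, and climatic stamps (hypothetical layout)."""
    plot_id: str
    observed_at: datetime      # carries both the date stamp and the time stamp
    temperature_c: float       # climatic stamp: air temperature
    humidity_pct: float        # climatic stamp: relative humidity
    variable: str              # what was measured
    value: float
    unit: str
    sensor_resolution: float   # smallest change the sensor can resolve, in `unit`


def is_resolvable(obs: FieldObservation, expected_change: float) -> bool:
    """Return True if the change we hope to detect exceeds the sensor resolution."""
    return abs(expected_change) > obs.sensor_resolution


# Illustrative values only.
obs = FieldObservation(
    plot_id="A2-07",
    observed_at=datetime(2022, 6, 21, 10, 10),
    temperature_c=31.5,
    humidity_pct=64.0,
    variable="soil_moisture",
    value=23.4,
    unit="percent",
    sensor_resolution=0.5,
)

print(obs.observed_at.isoformat())  # 2022-06-21T10:10:00
print(is_resolvable(obs, 2.0))      # True: a 2.0 change is above the 0.5 resolution
```

Storing the stamps with every value makes it possible to compare later readings taken at the same time of day and in similar weather, which is what a repetitive survey needs.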
So, Objectives, Repetition and Method of collection. This is the next thing we are going to discuss. The objective, as I told you, has to be very clear and brief: a 10 percent reduction in fertilizer utilization, a 10 percent improvement in the rejection that comes out of my farm, a 10 percent change in the variation of output quality. So, clear objectives should always be stated. Next is the specification of the domain of study. You have to be very clear about the domain where you are going to do the experiment, or where you are looking for answers. Can you say that in the entire field of a thousand acres you would like to improve the productivity? No. In the entire thousand acres of land, a hundred acres are given to apple, and within that apple area I would like to look at the parts very close to the lake; that is domain specific. The data tabulation form, the accuracy of prediction, and the cost of the survey should all be written in the objective. So, you cannot set an objective that is going to be cost intensive, because small agriculturists are all struggling for finance. I cannot ask a small farmer to invest heavily just to get the data, so the cost of the survey is also very important. Then, as I was saying, an ad hoc survey is conducted with no intent to repeat. We also do ad hoc surveys; we just go and see once in a while what is happening. We go to the farm daily, and it can be either in the morning, afternoon, or evening. We just look at the response the plants are giving. We look at the aqua farm and see the response the fish are giving, the growth of the fish, and so on. But since an ad hoc survey is conducted with no intent to repeat, and as I told you, if you do not repeat, the reliability of the data is not accepted.
Generally, when we do design of experiments, we always make sure to repeat the experiments to validate whether the final model is correct or not. A survey done with no intent to repeat is called an ad hoc survey. A repetitive survey is done periodically, at a given time, in a given space, with the given instruments, by the same person who did the previous experiments; that is called a periodic survey. There are two probable ways of collecting data: one is to go to the field and collect the data, and the other is to interview and collect the data. Data collected by interview are always subjective. When you get such data, it is very difficult to normalize them and get the true sense out of them. It is always better to go and do on-field sampling and get the data that way. So, if you have a big farm like this, you are supposed to divide the big farm into small grids. This is what A2 talks about. A1 is the complete farm, which is maybe a thousand acres of land. In this thousand acres of land you divide it into several small grids. It is always good and healthy to have multiple cropping in the given area, so we divide it into several small grids. Within the small grids we also do some stratification: we pick up some small area and do the experiments there. When we do the experiments, you can do a hundred percent inspection, or probability-proportional sampling can also be followed. So, these are the steps which are always used in doing the statistical analysis.
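The gridding and stratification steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the 10 by 10 grid, the rule that the two westernmost columns are "near the lake", and the helper names `make_grid` and `stratified_sample` are all hypothetical. The sample here is drawn with proportional allocation (each stratum contributes the same fraction of its cells), which is one simple reading of sampling in proportion; it is not necessarily the scheme the lecturer has in mind.

```python
import random


def make_grid(rows, cols):
    """Divide the farm into rows x cols grid cells, identified by (row, col)."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def stratified_sample(cells, stratum_of, fraction, seed=0):
    """Pick the same fraction of cells from every stratum (proportional allocation)."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    strata = {}
    for cell in cells:
        strata.setdefault(stratum_of(cell), []).append(cell)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(fraction * len(members)))  # at least one cell per stratum
        sample[name] = rng.sample(members, k)
    return sample


def stratum_of(cell):
    """Hypothetical rule: the lake lies along the west edge of the farm."""
    row, col = cell
    return "near_lake" if col < 2 else "inland"


# A 1000-acre farm as 100 cells of 10 acres each (assumed layout).
cells = make_grid(10, 10)
sample = stratified_sample(cells, stratum_of, fraction=0.1)

for name, picked in sample.items():
    print(name, len(picked), picked)
```

With a 10 percent fraction, the 20 near-lake cells contribute 2 sampled cells and the 80 inland cells contribute 8, so both strata are represented in proportion to their size.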
Questionnaire, Sampling frame and Design. A questionnaire is a set of characteristics according to which the survey proceeds. Then, frames are not perfect, as they might have omissions, duplications, or other imperfections. So, how did you divide the grid? The cells are not exactly the same; the land can go up and down, and when you divide it into a grid you might also omit some very valid data. So, you should keep in mind that frames and gridding patterns are not perfect. Omissions are possible, and when you look at the data you have to keep this at the back of your mind while you process it. The design in general should reduce the overall cost for a pre-specified permissible error, or reduce the margin of error of the estimate for a fixed cost; that should be the objective. When you say cost, time is also integrated into it. I was always talking about the time period, and there are three time periods which you should understand. One is called the survey period, the second is called the reference period, and the third is called the reporting period. For example, if I want to see the yield of 2022, that is the survey period. You are talking about a very large bandwidth, the entire yield of the year 2022. Here you will have multiple things like apple, orange, grapes, lemon, lemongrass, rice, paddy, and cashew, all put together, because at the end of the year what you look at is how much yield, or how much money, this thousand acres of land gave back. That is called the survey period.
When you talk about the reference period, you are getting much more specific: what are you doing in this specific time? You are only talking about the June yield. When we talk about the June yield, several of the fruits or vegetables in the larger bracket will not be ripe and in a fit position to be given to the market. So, you are talking only about a fixed period, which is called the reference period. The last one is the reporting period. The reporting period means June 21st, May 22nd, March 21st; exactly that day is called the reporting period. It is the time period for which the required statistical information is collected for a unit at a time. So, here we will talk only about apple and nothing else, or only about mango, on June 21st, 2022. You should understand the difference between these three: survey period, reference period, and reporting period. I am sure you will be able to distinguish them, and this is very important, because when we do statistical analysis the reporting period is what we keep most in focus.
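The three time periods, as the lecture defines them, can be read as three nested filters on the same set of records. The sketch below assumes a hypothetical record layout (`crop`, `day`, `yield_kg`) and made-up yield figures; only the year 2022, the month of June, and the date June 21st come from the lecture.

```python
from datetime import date

# Periods as described in the lecture: whole year, one month, one exact day.
survey_period = (date(2022, 1, 1), date(2022, 12, 31))
reference_period = (date(2022, 6, 1), date(2022, 6, 30))
reporting_day = date(2022, 6, 21)

# Illustrative harvest records; the numbers are invented for the example.
records = [
    {"crop": "apple", "day": date(2022, 6, 21), "yield_kg": 120.0},
    {"crop": "apple", "day": date(2022, 6, 5), "yield_kg": 80.0},
    {"crop": "rice", "day": date(2022, 10, 12), "yield_kg": 900.0},
]


def total_yield(records, period, crop=None):
    """Sum yield_kg for records inside the period, optionally for a single crop."""
    start, end = period
    return sum(
        r["yield_kg"]
        for r in records
        if start <= r["day"] <= end and (crop is None or r["crop"] == crop)
    )


# Survey period: every crop, the whole year.
print(total_yield(records, survey_period))                                # 1100.0
# Reference period: only what was harvested in June.
print(total_yield(records, reference_period))                             # 200.0
# Reporting period: one unit (apple) on one exact day.
print(total_yield(records, (reporting_day, reporting_day), crop="apple"))  # 120.0
```

Each step narrows the window and the unit: all crops over the year, then the June harvest, then apple alone on June 21st.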
So, why do we do pilot experiments? Pilot experiments are done to check three things: the tabulation we have formed, what data the person who goes to the site has to extract, and how he presents that data. To have an understanding of all these three things, pilot experiments are very necessary. When you do pilot experiments you will understand what data you are supposed to tabulate. Maybe you will not be able to count in units of one. For example, you can pluck and count every apple, but with grapes you get a bunch of grapes. So, unless you do the pilot experiments you will not be able to understand whether to report in bunches, in kilograms, or in individual units. So, pilot experiments are very important. Conducting pilot experiments helps in testing the preparation we have done and in evolving procedures; while doing pilot experiments you will improve a lot of things. Suppose you say, I would like to measure the length of the fish. How do I measure the length of the fish? Am I going to catch it, am I going to put it under a vernier calliper, or am I going to take a photo and measure it? When I take a photo, what is the guarantee that the camera axis is normal to the fish? So, all these things are evolving procedures.
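The concern about the camera not being normal to the fish can be made concrete with a small calculation. This is a sketch under simplifying assumptions that are not from the lecture: the fish is treated as a straight segment, the camera is far enough away that perspective effects can be ignored, and `tilt_deg` is the angle between the fish and the image plane. Under those assumptions the photo shows the true length multiplied by the cosine of the tilt.

```python
import math


def apparent_length(true_length_cm, tilt_deg):
    """Length seen in the photo when the fish is tilted out of the image plane."""
    return true_length_cm * math.cos(math.radians(tilt_deg))


def relative_error_pct(tilt_deg):
    """Percentage by which the photo under-estimates the true length."""
    return (1.0 - math.cos(math.radians(tilt_deg))) * 100.0


for tilt in (0, 10, 20, 30):
    print(tilt, round(apparent_length(30.0, tilt), 2), round(relative_error_pct(tilt), 2))
```

A tilt of 10 degrees already shortens the measured length by about 1.5 percent, which is the kind of effect a pilot experiment is meant to reveal before the full survey starts.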
Then Training, Field work and Tabulation. It is very easy to sit at a desk and say, "OK, collect these data." For example, if you are looking at lemongrass, we always look at the length of the grass and the width of the grass. Unless you find out what all the variables are and train the person, you will not be able to get the best out of him; that is what training, field work, and tabulation are about. Field work is very important for collecting data, and as I told you, you should keep to the time frames while doing it. Finally, the data need to be processed into tables, graphs, and plots. This is mainly tabulating and summarizing the data and analyzing the subject. This is then finally compiled into a report. Unless the person makes the graph after taking the data, he will not be able to understand what the graph is going to convey in the final report. That is why we always look at how we process and how we present the data in the final report.
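The tabulating and summarizing step can be sketched with the Python standard library. The plot names and the lemongrass lengths below are invented for illustration, and the choice of count, mean, and sample standard deviation as the summary columns is an assumption, not something the lecture prescribes.

```python
from statistics import mean, stdev

# Illustrative lemongrass lengths in cm, grouped by plot (made-up values).
measurements = {
    "plot_1": [42.0, 45.5, 39.0],
    "plot_2": [50.0, 52.5, 51.0],
}


def summarize(values):
    """Return count, mean, and sample standard deviation, rounded to 2 decimals."""
    return {
        "n": len(values),
        "mean": round(mean(values), 2),
        "stdev": round(stdev(values), 2),
    }


# Build the summary table: one row per plot.
table = {plot: summarize(values) for plot, values in measurements.items()}

print(f"{'plot':<8}{'n':>4}{'mean':>8}{'stdev':>8}")
for plot, row in table.items():
    print(f"{plot:<8}{row['n']:>4}{row['mean']:>8}{row['stdev']:>8}")
```

A table like this is the starting point for the graphs and plots that go into the final report.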
So, in this lecture we covered the different time scales, the need for doing pilot experiments, and the frames and frameworks which have to be followed to get a good output in an efficient manner.
Thank you.