1. Role of statistics in agriculture
Transcript
Welcome to the course on statistics for agriculturists. Agriculturists are intelligent and smart people. To become intelligent, we have to collect data, work on the data to convert it into information, extract a lot of information, and convert that into knowledge. Today, agriculture is one field where knowledge is transferred from generation to generation based on the data that is collected. So, at the end of this course, we would like to make you a smart agriculturist, which is a step above an intelligent agriculturist.
This course is planned for six weeks. In week one, we will cover the use of statistics in agriculture. Following that, we will teach you some of the sampling techniques. The third week will be predominantly on hypothesis generation and hypothesis testing. In the fourth week, we will tell you which statistical model you should choose depending upon the situation. The fifth week is on data representation and interpretation, and the last one is going to be the icing on the cake, which is more focused on ICT usage with this statistical data.
Statistics is nothing but a science of probability. The moment I say probability, it is sure that there is going to be variation. When we talk about food agriculture, the variation depends upon soil condition, humidity, water content, and water pH level; all these things dictate the yield. When we talk about aquaculture, it is about the amount of other ingredients which do not allow the fish or the prawn to multiply. When we talk about horticulture, we predominantly focus on insects affecting the fruit or the flower. So, there is a lot of variation, and to this variation we will apply science and extract information. That is why agriculture is probabilistic in nature. A study of these probabilistic factors and their analysis helps us to derive results.
When you have data, you can always derive information. For example, here I have put a 2D graph of yield against the recommended amount of fertilizer. People think that if they add more and more fertilizer, the yield is going to increase linearly, so they invest a heavy amount in doing so. But this graph clearly says that the relationship is not linear. Initially, with 0 also there is a yield; that means to say, without addition of fertilizer you still have a yield. From then onwards it follows not a linear but a semi-linear form, and after some point you can see there is not much change in slope; that means to say, the graph continues only marginally higher. So, this gives information to the agriculturist: anything more than 200 kg, please don't use it, because that is not going to bring you a steep increase in the yield. This in turn helps the farmer to be more productive rather than looking only at production.
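The idea of reading a cut-off such as 200 kg from the graph can also be sketched in a few lines of Python. This is only an illustration: the readings below are made-up numbers chosen to mimic the shape described (a yield at zero fertilizer, a rise, then a plateau), and the function names and the 0.05 threshold are assumptions, not values from the lecture.

```python
# Hypothetical fertilizer-response readings, made up for illustration only.
# Doses are in kg of fertilizer; yields are in arbitrary units.
fertilizer_kg = [0, 50, 100, 150, 200, 250, 300]
yield_units = [20, 32, 42, 50, 55, 56, 56.5]


def marginal_gains(doses, yields):
    """Extra yield per extra kg of fertilizer between consecutive doses."""
    gains = []
    for i in range(1, len(doses)):
        gain = (yields[i] - yields[i - 1]) / (doses[i] - doses[i - 1])
        gains.append(gain)
    return gains


def dose_where_gain_flattens(doses, yields, threshold):
    """Smallest dose after which every further step gains less than
    `threshold` yield units per kg (an assumed definition of 'plateau')."""
    gains = marginal_gains(doses, yields)
    for i in range(len(gains)):
        if all(g < threshold for g in gains[i:]):
            return doses[i]
    return doses[-1]


gains = marginal_gains(fertilizer_kg, yield_units)
plateau = dose_where_gain_flattens(fertilizer_kg, yield_units, threshold=0.05)

print(gains)    # slope of each step; it shrinks as the dose goes up
print(plateau)  # dose beyond which extra fertilizer adds very little
```

With these made-up readings the slope keeps falling and the plateau starts at 200, which is the kind of message the graph gives to the farmer. With real field data the threshold would have to be chosen by the agriculturist.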
So, this simple graph gives you a lot of information from the data which is collected. These are the derived results. We can further understand the results and do some research and development in terms of technology. Many of the agricultural technologies are mass customized: they are focused on a particular zone, a particular period, and a particular country. So, you can take the technology from a best-practicing country and customize it depending upon the data which is available. A graph speaks fewer words and gives more information, so the data which we acquire will be converted into a graphical representation to derive knowledge from it.
Role of statistics in agriculture: there are three important steps in statistics. One is collection of data, two is entry of data, and three is the source of data. Collection of data is about how I collect the data and use it for analysis. Should I do it once for the entire farm, should I do it at the corners only and talk about the data, or should I do it in a matrix form so that I get more data? All these things depend on collection of data. If you make a small mistake in collecting the data, then the rest of the statistical interpretation will be misleading. Data collection is very important; you can use either binary data or the magnitude. For example, if you want to measure the height of a particular sample, you can measure it in centimeters, or you can talk in terms of aspect ratio, which is nothing but the ratio of length to diameter. So, here you work with both and then you infer from the data. So, collection of data from the farm or from aquaculture is very important.
Then you collate the data and represent it. For example, here I am doing it year-wise from 1600 to 2016, and here are the yields.
So, look at the data and how it is added up from 1600 to 2016; here there is a collation of data. You can do it year by year, or you can put the years in bundles of five and then do the interpretation.
The next one is the source. You are again going back to the same field experiment: should you take samples from individual spots, or should you take them at random? People generally make a mistake here. They take soil samples across the field, average out the data, and then say that in this particular farm there is so much water content, so much phosphorus, so much nitrate, and so on. But generally there is always a gradient. So, the source also plays a very important role.
Data collected directly by the user is primary data, but data taken from some other organization is secondary data. So, whether you go to the firm or take the data from the published literature, those are secondary data. When you use secondary data, be very careful about it: they might have massaged the data before publishing it. So, be a little careful with secondary data. For an agriculturist, it is always good to have primary data.
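To see why averaging soil samples across the whole field can mislead when there is a gradient, here is a small Python sketch. The grid of nitrate readings is hypothetical (made up so that values rise from one side of the field to the other), and the function names are only for illustration.

```python
from statistics import mean

# Hypothetical nitrate readings (arbitrary units) on a 3 x 4 grid of
# sampling spots. Each inner list is one row of the field; the values
# are made up so that they rise from left to right (a gradient).
nitrate_grid = [
    [10, 14, 18, 22],
    [11, 15, 19, 23],
    [9, 13, 17, 21],
]


def field_average(grid):
    """One number for the whole field: the average of every spot."""
    return mean(value for row in grid for value in row)


def column_averages(grid):
    """Average of each column, i.e. each strip across the field."""
    return [mean(column) for column in zip(*grid)]


overall = field_average(nitrate_grid)
by_column = column_averages(nitrate_grid)
spread = max(by_column) - min(by_column)

print(overall)    # single field-wide average
print(by_column)  # strip-by-strip averages, which expose the gradient
print(spread)     # difference between the richest and poorest strip
```

In this made-up field the single average sits in the middle, while the strip averages differ widely from one side to the other. That difference is exactly the information that is lost when all the samples are averaged into one number.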
When you look at this example, you can see that they are trying to increase production by introducing a lot of technology and the latest devices, which increase the yield and reduce the drudgery. So, here we would like to talk about data collection, and then about mean, median, and mode.
When the data is collected, you take an average of the data, and that becomes the mean. For example, I have the heights given here; when we take the mean, you average out everybody and draw a line. So sometimes you get information which relates to one value that averages out everything. In a crude sense, let me give you some data: minus three, zero, and three. If you take the average, it is 0, but still you have three data points. So, you should know that the mean is very much talked about, but with the mean you generally miss out on some data points.
Next is the median. The median always takes the center point. In this case, you see minus three, zero, three, so the center point is zero. When you have a set of data with, say, six data points, you take the middle of it, and that becomes the median (with six points, that is the average of the third and fourth values once they are sorted). Whatever gets repeated becomes the mode. So, here, if you take the values in absolute terms, three becomes the mode. You have to use all three of these to talk about any data which you collect from the farm.
And all the collected data is now represented in a graphical form. Why? To interpret the data which is collected. Previously, we saw a 2D line diagram. Here we see a bar chart; a bar chart can be horizontal or vertical. So, here is a horizontal stacked bar chart. You can see, of the earth's surface, what is land and what is ocean; you can show agricultural land, and how much of this agricultural land is predominantly used for, maybe, cultivation of x crops, and then how much is for y crops.
Again, within the y crops, you can look at what percentage goes to growing the y1 crop and what goes to the y2 crop. So, this shows y1, and if you want to split y1 further, you can again split it into two and start representing it. So, this is a horizontal stacked bar chart, in which we interpret the data which has been collected by us.
So, before I conclude: in this lecture, we saw the need for statistics, how you collect the data, and how you represent the data. We have understood the basics of three terms in statistics, which are mean, median, and mode.
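As a quick check of those three terms, the small example from the lecture (minus three, zero, three) can be worked out with Python's standard `statistics` module. The six-point list is a hypothetical addition, only there to show how the median behaves with an even number of points.

```python
from statistics import mean, median, mode

# The three data points used in the lecture.
readings = [-3, 0, 3]

average = mean(readings)   # the positive and negative values cancel out
center = median(readings)  # middle value once the data is sorted

# Taking the values "in absolute", as in the lecture: 3 appears twice.
absolute = [abs(x) for x in readings]
most_common = mode(absolute)

# Hypothetical six-point data set: with an even number of points the
# median is the average of the two middle values.
six_points = [2, 4, 6, 8, 10, 12]
center_of_six = median(six_points)

print(average, center, most_common, center_of_six)  # 0 0 3 7.0
```

The mean and the median are both 0 here even though none of the spread in the data is visible from them, which is why the lecture asks you to look at all three measures together.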
Thank you.