The present-day term ‘Statistics’ derives its name from the Latin word ‘status’, from which the term ‘statist’ also originated.
Statistics may be defined as the science of collection, presentation, analysis, and interpretation of numerical data.
Use of Statistics -:
Statistical observations are used in planning for social development and in analysing trends in fields such as Finance, Economics, Commerce, and even Natural Science.
Statistics Use in Ancient Time -:
In ancient times the application of Statistics was very limited. Even so, rulers and kings collected information about claims, agriculture, commerce, and the population of their states in order to assess their military potential, their wealth, their taxation, and other aspects of government.
Statistics in Research Field -:
C.R. Rao, one of the pioneering statisticians, has rightly said ‘Statistics is median of all Sciences’. Statistics has dominance over other sciences because without it scientific research is not possible.
In recent years, many new areas have emerged that apply statistical tools to study the uncertainty in various phenomena.
Examples include Biostatistics, Bioinformatics, Stochastics, Computational Finance, Econometrics, Geostatistics, Psychometrics, Business Statistics, Social Statistics, Statistical Pattern Recognition, Energy Statistics, Actuarial Science, Engineering Statistics, Chemometrics, Epidemiology, and so on.
1. Qualitative (Attributes)
An attribute is defined as a characteristic or quality of an object. In statistics, classifying data on the basis of attributes or characteristics is known as Qualitative Classification of data.
Example: Region, Caste, Beauty, Honesty, etc.
2. Quantitative (Variables)
The characteristics which can be measured and can be expressed in quantitative terms are called Quantitative Characteristics.
Example: Height, Income, etc.
A Quantitative Characteristic is also called a variable.
Types of Variable -:
1. Discrete variable -:
Any variable which can take only specific, isolated values in a given range (for example, the number of children in a family) is called a Discrete Variable.
2. Continuous variable -:
Any variable which can take all possible values, integral as well as fractional, in a given range (for example, height or weight) is called a Continuous Variable.
Data are, in fact, the fundamental raw material of Statistics.
Data are the information, facts, and figures relating to a particular phenomenon under study, and every statistical investigation rests on the collection of reliable data.
Data are collected with regard to one or more characteristics, called “variables” and “attributes”.
Types of Data -:
1. Primary Data -:
Data which are collected originally by an investigator or agency, for the first time, for a statistical investigation are called Primary data.
2. Secondary Data -:
Data which have already been collected and processed by others and are available in public sources such as magazines and journals are called Secondary data.
Condensation and Summarization of Data
There are various methods that bring about condensation or summarization of data.
(i) Classification and Tabulation
(ii) Diagrammatic and Graphical representation
(1) Classification -:
The process of arranging data in groups or classes according to their common characteristics or similarities is called Classification.
Classified data are readily suitable for further processing such as tabulation, analysis, and interpretation.
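As an illustration of classification, the following is a minimal Python sketch that groups a few made-up records by a common characteristic. The records, the fields (name, region, income), and the helper `classify` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical raw records: (name, region, monthly income)
records = [
    ("A", "North", 21000),
    ("B", "South", 34000),
    ("C", "North", 27000),
    ("D", "East", 19000),
    ("E", "South", 30000),
]

def classify(rows, key_index):
    """Group rows into classes that share the value found at key_index."""
    classes = defaultdict(list)
    for row in rows:
        classes[row[key_index]].append(row)
    return dict(classes)

# Classify by region (an attribute), i.e. a geographical classification.
by_region = classify(records, key_index=1)
for region, members in by_region.items():
    print(region, [name for name, _, _ in members])
```

The same helper could classify by any other field; only the position of the chosen characteristic changes.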
Kinds of Classification -:
1.) Geographical or Spatial classification
2.) Chronological classification
3.) Qualitative classification
4.) Quantitative classification
(2) Tabulation -:
By tabulation, we mean the systematic presentation or arrangement of data in rows and columns in accordance with some salient features or characteristics.
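A small sketch of tabulation in Python, assuming made-up observations on two attributes (region and employment status). The data and the helper `tabulate` are hypothetical. It arranges the counts with one attribute along the rows and the other along the columns.

```python
from collections import Counter

# Hypothetical observations: (region, employment status)
observations = [
    ("North", "Employed"), ("North", "Unemployed"), ("South", "Employed"),
    ("South", "Employed"), ("North", "Employed"), ("South", "Unemployed"),
]

def tabulate(pairs):
    """Return (row_labels, column_labels, table), where table[r][c] is a count."""
    counts = Counter(pairs)
    row_labels = sorted({r for r, _ in pairs})
    col_labels = sorted({c for _, c in pairs})
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table

rows, cols, table = tabulate(observations)

# Print the table with a header row and one line per region.
print("Region".ljust(8) + "".join(c.ljust(12) for c in cols))
for r, line in zip(rows, table):
    print(r.ljust(8) + "".join(str(n).ljust(12) for n in line))
```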
Frequency Distribution -:
When observations are available on a single variable (discrete or continuous) for a large number of individuals, it becomes necessary to reduce the volume of the numerical data without losing any information of interest. A convenient method of reduction or condensation is to form a frequency distribution.
This is done by listing the observed values of the variable, or the classes into which they are grouped, together with their frequencies.
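The idea can be sketched in Python as follows. The data are made up, and the function names `discrete_frequency` and `grouped_frequency` are hypothetical. The grouped version assumes equal-width classes that include the lower limit and exclude the upper limit.

```python
from collections import Counter

def discrete_frequency(values):
    """Frequency of each distinct value, listed in increasing order."""
    return sorted(Counter(values).items())

def grouped_frequency(values, lower, width, n_classes):
    """Frequencies for the classes [lower + i*width, lower + (i+1)*width)."""
    freq = [0] * n_classes
    for v in values:
        i = int((v - lower) // width)  # index of the class containing v
        if not 0 <= i < n_classes:
            raise ValueError(f"{v} falls outside the classes")
        freq[i] += 1
    return [((lower + i * width, lower + (i + 1) * width), f)
            for i, f in enumerate(freq)]

children = [2, 0, 1, 2, 3, 1, 2, 0, 2, 1]  # discrete variable
heights = [151.2, 163.5, 158.0, 170.1, 166.4, 155.9, 161.3, 168.8]  # continuous, in cm

print(discrete_frequency(children))
for (lo, hi), f in grouped_frequency(heights, lower=150, width=5, n_classes=5):
    print(f"{lo}-{hi}: {f}")
```

For a discrete variable each distinct value gets its own frequency, while a continuous variable has to be grouped into classes first.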
Rules for Classification of Data
1.) The classes should be clearly defined without any ambiguity.
2.) The classes should be exhaustive and mutually exclusive (non-overlapping), i.e. each observation must be included in one, and only one, of the classes.
3.) The number of classes should neither be too large nor too small.
4.) Indeterminate classes, i.e. open-end classes such as "less than" or "greater than", should be avoided as far as possible.
5.) The class intervals should be equally wide.
6.) The class limits should be chosen in such a way that the mid-value of each class interval and the actual average of the observations in that class interval are as near to each other as possible.
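Some of these rules can be checked mechanically. The sketch below uses made-up marks and a hypothetical helper, `class_report`. It builds equal-width, non-overlapping classes (rules 2 and 5), rejects any observation that no class covers, and reports each class mid-value next to the actual class average so that rule 6 can be judged.

```python
from statistics import mean

def class_report(values, lower, width, n_classes):
    """For each class [a, b), return (a, b, frequency, mid_value, class_mean)."""
    buckets = [[] for _ in range(n_classes)]
    for v in values:
        i = int((v - lower) // width)
        if not 0 <= i < n_classes:
            raise ValueError(f"{v} is not covered by any class")
        buckets[i].append(v)  # each observation lands in exactly one class
    report = []
    for i, members in enumerate(buckets):
        a, b = lower + i * width, lower + (i + 1) * width
        mid = (a + b) / 2
        avg = mean(members) if members else None  # None for an empty class
        report.append((a, b, len(members), mid, avg))
    return report

marks = [12, 14, 15, 16, 24, 25, 26, 35, 36, 44]  # hypothetical data
report = class_report(marks, lower=10, width=10, n_classes=4)
for a, b, f, mid, avg in report:
    print(f"{a}-{b}: f={f}, mid-value={mid}, class mean={avg}")
```

If the class means drift far from the mid-values, shifting the class limits or changing the width may give a better grouping.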
(ii) Diagrammatic and Graphical Representation
Diagrammatic and graphical representation is the method of presenting data in a condensed and summary form with the use of diagrams and graphs. These are geometrical figures such as points, lines, bars, rectangles, squares, circles, cubes, pictures, maps, or charts.
(i) Diagrams and graphs provide a visual representation of statistical data in a simple, readily comprehensible, and intelligible form, and are thus easy to understand.
(ii) They readily highlight the salient features of the collected data, facilitate comparisons among two or more sets of data, and enable us to study the relationship between them more readily.
(iii) They reveal the trends, if any, present in the data more vividly than tabulated data.
Difference Between Graphs and Diagrams
(i) Since graphs are constructed on graph paper, they help us to study the mathematical relationships between two or more variables, whereas diagrams are generally constructed on plain paper and are used for comparisons only, not for studying the relationship between the variables.
(ii) Diagrams provide only approximate information. They are of little use to a statistician or a research worker for further mathematical treatment or statistical analysis.
Graphs, on the other hand, are more precise and accurate than diagrams and are quite helpful for the study of slopes, trends, rates of change, and estimation.
(iii) Graphs are used for the study of time series and frequency distributions, but diagrams are not suitable for this; they are useful only for categorical and geographical data.
(iv) The construction of graphs is easier than the construction of diagrams.