KHYBER ACADEMY: STATISTICS

Cumulative frequency distribution with example

commercealls

Cumulative frequency distribution introduction:

From a grouped frequency distribution, we can simply read how many observations fall within different classes. E.g. consider the following frequency distribution.

Marks	0 – 20	20 – 40	40 – 60	60 – 80
No. of students	5	25	45	19

The number of students who secured 40 or more marks but less than 60 is 45, and the number who secured 20 or more marks but less than 40 is 25. But if one wants to know the number of students who obtained 40 or more marks, or the number who obtained less than 60 marks, the given frequency distribution does not answer the question directly. To answer such questions we construct a frequency distribution known as a cumulative frequency distribution.

Definition:

The total frequency of all the classes less than the upper class boundary, or more than or equal to the lower class boundary, of a particular class is called the cumulative frequency of that class. A table showing the cumulative frequencies of different classes is called the cumulative frequency distribution.

There are two methods of constructing the cumulative frequency distributions.

i. Less than type cumulative frequency distribution:

A table showing the total frequencies of all the classes less than the upper class boundaries is called the less than type cumulative frequency distribution. In this type the frequencies are serially added from top to bottom. The cumulative frequency of the last class should equal the total frequency.

ii. More than type cumulative frequency distribution:

A table showing the total frequencies of all the classes more than or equal to the lower class boundaries is called the more than type cumulative frequency distribution. In this type the frequencies are serially added from bottom to top. The cumulative frequency of the first class should equal the total frequency.

Question:

From the following frequency distribution construct less than type and more than type cumulative frequency distribution.

Marks	0 – 10	10 – 20	20 – 30	30 – 40	40 – 50	50 – 60
Frequency	5	25	37	43	13	7

Sol:

a) Less than type cumulative frequency distribution

Marks	No. of students	Cumulative frequency
Below 10	5	5
Below 20	5 + 25	30
Below 30	5 + 25 + 37	67
Below 40	5 + 25 + 37 + 43	110
Below 50	5 + 25 + 37 + 43 + 13	123
Below 60	5 + 25 + 37 + 43 + 13 + 7	130

b) More than type cumulative frequency distribution

Marks	No. of students	Cumulative frequency
0 or more	7 + 13 + 43 + 37 + 25 + 5	130
10 or more	7 + 13 + 43 + 37 + 25	125
20 or more	7 + 13 + 43 + 37	100
30 or more	7 + 13 + 43	63
40 or more	7 + 13	20
50 or more	7	7
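
The two tables in this solution can be checked with a short Python sketch. The function name `cumulative_frequencies` is only illustrative and is not part of the original post; it accumulates the class frequencies from the top for the less than type and from the bottom for the more than type.

```python
from itertools import accumulate


def cumulative_frequencies(frequencies):
    """Return (less_than, more_than) cumulative frequencies for ordered classes."""
    # Less than type: add frequencies serially from the first class downward.
    less_than = list(accumulate(frequencies))
    # More than type: add frequencies serially from the last class upward.
    more_than = list(accumulate(reversed(frequencies)))[::-1]
    return less_than, more_than


# Frequencies of the classes 0-10, 10-20, ..., 50-60 from the question.
frequencies = [5, 25, 37, 43, 13, 7]
less_than, more_than = cumulative_frequencies(frequencies)
print(less_than)  # [5, 30, 67, 110, 123, 130]
print(more_than)  # [130, 125, 100, 63, 20, 7]
```

In both lists the largest entry equals the total frequency (130), which is a quick check that no class was skipped.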

What is Frequency distribution and construction of frequency distribution?

Frequency distribution:

The arrangement of data into groups or classes together with the number of observations in each group or class is called frequency distribution.

The number of observations falling (lying) in a particular class is called the frequency and is usually denoted by (f). Data presented in the form of a frequency distribution are also known as grouped data, while data in their original form are called ungrouped data. The following terms are associated with the construction of a grouped frequency distribution: class limits, class boundaries, class mark or midpoint of a class, and width of a class or class interval size.

i. Class limits:

The pair of numbers of a variable which describe a class are called class limits. The smaller number is called the lower class limit while the larger number is called the upper class limit. The class limits are constructed in such a way that the upper limit of one class does not coincide with the lower limit of the next higher class. Thus there is a gap between successive classes, e.g. 10-19, 20-29, 30-39 etc.

ii. Class boundaries:

When classes are constructed in such a way that the upper limit of one class coincides with the lower limit of the next higher class, then such limits are called class boundaries. Thus there will be no gap between the successive classes. The upper boundary of each class is exclusive.

E.g. for the classes 10-20, 20-30, 30-40, 40-50 etc., the value 20 will be included in the class 20-30 instead of 10-20.

iii. Midpoint of a class or class mark:

The midpoint of a class is obtained by dividing the sum of the upper and lower class limits (or boundaries) by 2. Since the individual identity of the observations is lost in the grouping process, midpoints are computed for convenience of computation. We assume that each value in a class is equal to its midpoint, e.g. if the frequency of a class is 9 and its midpoint is 24, we treat all 9 values of the class as equal to 24.

iv. Width of a class or class interval size:

The difference between successive lower limits or between successive upper limits is called the width of a class or class interval size. It may also be obtained by finding the difference between successive midpoints. The width of a class is usually denoted by (h).
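
As a small illustration of the last two definitions, the sketch below computes midpoints and the class width for the classes 10-20, 20-30, 30-40 and 40-50 used in the class boundaries example. The helper name `class_midpoint` is an assumption made for this sketch.

```python
def class_midpoint(lower, upper):
    """Midpoint (class mark): half the sum of the lower and upper limits or boundaries."""
    return (lower + upper) / 2


# Class boundaries taken from the class boundaries example.
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
midpoints = [class_midpoint(lo, hi) for lo, hi in classes]

# Width h: difference between successive lower limits, or between successive midpoints.
width_from_lower = classes[1][0] - classes[0][0]
width_from_mid = midpoints[1] - midpoints[0]
print(midpoints)         # [15.0, 25.0, 35.0, 45.0]
print(width_from_lower)  # 10
print(width_from_mid)    # 10.0
```

Both ways of finding the width give the same value of h, as the definition says they should.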

Construction of frequency distribution:

While constructing a grouped frequency distribution the following steps should be taken into consideration.

i. For convenience arrange the data in an array using a stem and leaf display.

ii. Determine the largest and smallest numbers from the arrayed data in order to find the range, i.e. the difference between the largest and smallest numbers.

iii. Decide upon the number of classes. There is no hard and fast rule for deciding the number of classes, but a reasonable number of classes between 5 and 20 may be used depending upon the size of the data. Sturges also suggested a formula for deciding the number of classes, i.e.

K = 1 + 3.3 log N

where K is the number of classes, N is the total number of observations and the logarithm is to base 10.

iv. To find the class interval size (h), divide the range by the desired number of classes and round off to the next higher integer. E.g. if the range of values is 87 and the number of classes is 9, then the class interval size will be 87/9 = 9.67, or approximately 10. Similarly, if the range is 43 and the number of classes is 8, then the class interval size will be 43/8 = 5.375, or 6 approximately.

v. Decide what should be the starting value of the first class. The starting value is usually taken as the lowest value of the given data, or a value less than that which is a multiple of 2, 5, 10 or some other convenient figure. The upper limit is obtained by adding the width of a class to the lower class limit. The remaining class limits are determined similarly.

vi. Distribute the values into the appropriate classes either by listing the actual values in their proper classes or by using tally bars. The number of tallies is then written in the frequency column.
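
The construction steps can be sketched in Python as follows. This is only one possible reading of the steps: the function name `build_frequency_distribution`, the choice of the lowest observation as the starting value, and the list of marks are all assumptions made for illustration and do not come from the original post.

```python
import math


def build_frequency_distribution(data, num_classes=None):
    """Group raw values into equal-width classes; return [(lower, upper, frequency), ...]."""
    if num_classes is None:
        # Sturges' rule as quoted in the steps, with a base-10 logarithm.
        num_classes = round(1 + 3.3 * math.log10(len(data)))
    smallest, largest = min(data), max(data)
    data_range = largest - smallest
    # Class interval size: range / number of classes, rounded up to an integer.
    h = max(1, math.ceil(data_range / num_classes))
    table = []
    lower = smallest  # starting value taken as the lowest observation
    while lower <= largest:
        upper = lower + h
        # The upper boundary is exclusive: a value equal to `upper` goes to the next class.
        frequency = sum(1 for x in data if lower <= x < upper)
        table.append((lower, upper, frequency))
        lower = upper
    return table


# Hypothetical marks of 20 students, invented for this sketch.
marks = [12, 25, 37, 41, 8, 19, 33, 45, 27, 30,
         22, 16, 39, 48, 5, 29, 34, 21, 14, 36]
table = build_frequency_distribution(marks)
for lower, upper, frequency in table:
    print(f"{lower}-{upper}\t{frequency}")
```

For these 20 values Sturges' rule gives about 5 classes, the range is 43 and the class interval size is 9, so the classes are 5-14, 14-23, 23-32, 32-41 and 41-50. In practice one might prefer a rounder starting value and width (for example 0 and 10), as step v suggests.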

Types of Classification in Statistics

Definition:

The process of arranging data into various groups or classes according to some common characteristics is called classification.

Types of classification:

There are four types of classification.

i. Quantitative classification:

When the data are classified according to a quantitative variable, it is known as quantitative classification, e.g. when the population of a city is classified by income, age, weight, height etc.

Height (inches)	54-56	57-59	60-62	63-65	65-67
No. of persons	289	356	589	297	240

ii. Qualitative classification:

When the data are classified according to qualitative characteristics like sex, literacy, religion, education etc., it is called qualitative classification. E.g. classification of a population according to sex (i.e. male and female), according to education (i.e. literate and illiterate), according to wealth (i.e. rich and poor) etc.

iii. Geographic classification:

When the data are classified according to places or geographic location, it is called geographic classification. E.g. the population of Khyber Pakhtunkhwa recorded district wise in 1990, the literacy rate in Pakistan province wise etc. The following example illustrates geographic classification.

Country:	Canada	U.S.A	Germany	France
National income	7930	7880	7510	6730

Series which are obtained by arranging the data on the basis of places are called “spatial series”.

iv. Chronological or temporal classification:

When the data are classified on the basis of time, then it is known as chronological classification and the series so obtained is called a time series. The following table gives an idea of chronological classification.

Year	1930	1940	1950	1960	1970
Population (crores)	2311	1785	3135	3688	3940

What is Classification? Objectives and Basic Principles of Classification in Statistics

Classification:

Introduction:

It is difficult to draw any inferences from data in the form in which they were originally collected. There are chances of making wrong decisions about the nature of the data, because the human mind is not capable of memorizing all the figures. Thus there arises a need to reduce and simplify the raw data (primary data) into a form that is easily understood. One such way of reducing the data is classification.

Definition:

The process of arranging data into various groups or classes according to some common characteristics is called classification.

Aims or objectives of classification:

The main objectives of classifying the data are:

1. The human mind cannot remember all the figures, so classification is used to reduce a large mass of data.
2. Classification facilitates comparison, i.e. when data are classified it becomes easy to know how many students secured marks between 20-40, 40-60 etc.
3. Classification simplifies the calculation of statistical measures like the mean, median, standard deviation etc.
4. When data are originally collected, there is repetition of values, which consumes too much space and time. Classification saves time and space because the data are presented in a compact form rather than in loose form.

Basic principles of classification:

While classifying a large set of data, the following points should be taken into consideration.

1. The classes into which the data are to be distributed should be mutually exclusive, i.e. successive classes should not overlap.
2. The classification procedure should be exhaustive, i.e. the classes should completely cover the whole data. For proper analysis no item should be left unclassified.
3. Classification should be clear and simple. Ambiguities and doubtful entries must be removed.
4. The classification procedure should not be so elaborate as to lead to trivial classes, nor should it be so crude as to accommodate the whole data in one or two classes.

Secondary data and Methods for Collection of Secondary Data in Statistics

Secondary data:

The data that have already been collected by someone, and on which statistical techniques have been applied at least once, are called secondary data.

When statistical methods are applied to primary data, they lose their original shape and become secondary, e.g. census data from different census years become secondary data when they are used again to measure changes in population growth, sex ratio, mortality rate etc.

Methods for collection of secondary data:

Secondary data are those which have already been collected by someone for their own use and are now used by different persons for another purpose. Such data can be collected from the following sources.

1. Official sources:

E.g. publications of statistical divisions, reports of the ministries of finance, food and agriculture, planning and development etc.

2. Semi official sources:

E.g. publications of the State Bank, WAPDA, P.I.A., local bodies etc.

3. Private sources:

E.g. publications of state associations, chambers of commerce and industry, private commercial and financial institutions etc.

4. Research organizations:

E.g. publications of research organizations like universities, institutes of education and research etc.

Primary data and Methods of Collecting Primary Data in Statistics

Primary data:

The data that have been originally collected by someone, and on which no statistical techniques have been applied, are called primary data, e.g. data obtained in a census study are primary data. Similarly, if we want to know the effect of a fertilizer on the yield of wheat, the observations taken on each plot are primary data.

Methods of collecting primary data:

The methods involved in collecting primary data are described below:

1. Direct personal investigation:

In this method the investigator interviews the concerned persons on the spot about the problem under study and records the required information personally.

Advantages:

The data obtained by this method are highly accurate and reliable. The accuracy of the results depends on the efficiency and proper training of the investigator. The investigator should be polite, tactful and conversant with the problem. He should mix with the people and speak their language. In this way he can get the maximum information about the problem under study.

Disadvantages:

This method is very slow, expensive and time consuming, and is suitable mainly for small scale and confidential inquiries. The personal likes and dislikes of the investigator may also affect the result.

2. Indirect personal investigation:

Sometimes the informants refuse to answer direct questions. The information is then collected by putting indirect questions to the informants or by interviewing third persons or witnesses who are expected to have full knowledge of the problem under study. E.g. when businessmen are reluctant to give information about their income to the income tax authorities, the authorities (i.e. officers) can get the required information from persons such as salesmen, clerks etc. who are directly involved in that business.

Advantages:

This method saves time and money because only those persons are interviewed who know the full facts. Proper training and tactfulness of the investigator may produce good results.

Disadvantages:

The investigator takes too much time in convincing the persons to supply information. In many cases people do not co-operate and refuse to supply the needed information.

3. Investigation through mail questionnaire:

Under this method a questionnaire (a standard list of questions about a particular problem) is sent by post to the persons from whom information is to be obtained. The purpose, scope and importance of the inquiry are explained to them. They are requested to fill in the questionnaire properly and send it back as soon as possible.

Advantages:

This method is less costly and less time consuming. Information can be collected from a wide area. A reasonable standard of accuracy can be expected from this method.

Disadvantages:

Most of the informants do not care to fill in the questionnaire, so the rate of return is very low. Some of the informants return the questionnaire incomplete and full of errors, which affects the result.

4. Investigation through questionnaire in charge of enumerators:

This method is the most satisfactory and most widely used method of data collection. In this method information is collected by appointing trained investigators (enumerators) who go to the informants with a questionnaire and help them fill in the relevant columns.

Advantages:

It is one of the most satisfactory methods of data collection where the informants are uneducated. The investigator can explain the purpose of the enquiry and the meaning of the questions to illiterate persons and can thus get accurate results. The possibility of non-response is very low, whereas it is a serious problem in the case of a mail questionnaire.

Disadvantages:

This method is very expensive and time consuming, because a large number of investigators have to be employed. They have to be paid salaries, daily allowances and travelling expenses. If the informant is not available at his place, the investigator has to pay two or more visits to get the information.

5. Investigation through correspondents:

Under this method, the data collecting agency appoints local agents or correspondents in different areas for collecting and supplying information. They collect the needed data and send the same to the agency. E.g. newspapers, magazines etc. use this method to get information from their correspondents on different fields such as strikes, sports, wars, politics etc.

Definition of Variable, Constant, Discrete variable and continuous variable, Qualitative variable, Quantitative variable, Observation, Population, Sample, Survey?

Variable:

A measurable quantity which can vary from time to time, place to place or person to person is called a variable, e.g. height, weight, age, price etc. Variables are usually denoted by capital letters such as X, Y, Z etc.

Constant:

Any quantity which can assume only one value is called a constant. Examples of constants are π = 3.14159 and e = 2.71828. A constant is usually denoted by the first letters of the alphabet, e.g. a, b, c etc.

Discrete variable:

A variable which can take a finite or countable number of values is called a discrete variable, e.g. number of children in a family, number of road accidents, number of rooms in a house etc. In a discrete variable the values are taken by jumps or breaks, e.g. the number of children in a family can be 0, 1, 2, 3 etc.

Continuous variable:

A variable which can assume every possible value in a given range (interval) is called a continuous variable, e.g. height and weight of individuals, height of mercury in a thermometer, speed of a car etc. The values of a continuous variable vary without any gaps or jumps, e.g. the height of an individual can be 62", 62.3", 62.7" etc.

Qualitative variable:

A variable which cannot be expressed numerically is called a qualitative variable. Alternatively, a variable which cannot possess any unit of measurement is called a qualitative variable, e.g. intelligence, honesty, eye colour etc.

Quantitative variable:

A variable which can be expressed numerically is called a quantitative variable. Alternatively, a variable which can possess some unit of measurement is called a quantitative variable, e.g. height can be expressed in inches, centimetres, metres etc., and weight can be expressed in kilograms, grams etc.

Observation:

Any sort of numerical recording of information is called an observation or data. An observation may either be a physical measurement, such as height, weight, age etc., or an answer to a question, such as yes or no.

Population:

The aggregate or totality of certain elements under study is called a population. The total number of students in a college, the total number of trees in a forest etc. are examples of populations.

Sample:

A small part of the population selected for the purpose of a certain study is called a sample. E.g. to diagnose a blood disease, a doctor takes a small part (a few drops) of blood from the patient's body. Such a small part of blood is called a sample.

Survey:

A planned and systematic process of collecting statistical data is called a survey. There are two types of surveys, i.e. census survey and sample survey.

1. Census survey:

A survey in which observations are made on each and every unit of the population is called a census survey or population survey. The data obtained by recording the relevant information on each and every element of the population are called census data or population data.

2. Sample survey:

A survey in which information is collected by studying only a small part of the population is called a sample survey, and the data obtained from a sample survey are called sample data.

What are Descriptive and Inferential Statistics? Functions and Limitations of Statistics

Descriptive and inferential statistics:

Statistics as a subject may be divided into two branches i.e. descriptive and inferential statistics.

1. Descriptive statistics:

Descriptive statistics includes the methods and procedures used in the collection, analysis and interpretation of data, in expressing the data in various forms such as tables, graphs, charts and diagrams, and in finding averages and other measures which describe the data.

The purpose of descriptive statistics is to present data in such a way that one can easily draw conclusions.

2. Inferential statistics:

Inferential or inductive statistics includes the methods and procedures used to draw inferences (results, conclusions or decisions) about the population on the basis of sample data. This branch of statistics includes the methods of estimation and testing of hypotheses.

Functions or uses of statistics:

1. Statistics simplifies complex data:

The main function of statistics is to simplify a large mass of data, as it is inconvenient to draw any conclusion from such data. The human mind is not able to memorize all the facts (figures) about a problem, e.g. it is not possible for a research worker to remember the wages of 500 workers of a firm. If he finds the average wage of the 500 workers, then it is easy to remember.

2. Statistics are used for comparison:

Statistics facilitates comparison between two or more variables relating to different times or places. Suppose we are told that the wholesale price index in Pakistan was 254 in 2006. This single figure does not show whether the price level is high or low. If, on the other hand, we are also given the wholesale price index for 1995, then we can compare the two quantities and decide whether the price level has increased or decreased.

3. Statistics studies relationship among different facts:

Statistics establishes relationships between two or more variables relating to different fields, e.g. the yield of a crop is affected by many factors like soil fertility, quality of seed, amount of fertilizer, amount of rainfall, temperature etc. Statistics enables us to study the effect of such factors. Statistical tools may be employed to establish the relationship among these factors.

4. Statistics provides techniques for drawing inferences:

In most enquiries it is not possible to conduct a census survey to study the characteristics of the population. E.g. to study the average age of the population of a country, a shortage of time and resources and a lack of trained investigators make it impossible to record the age of each and every person. In such a case sampling methods may be employed to get the desired information about the population under study. Thus statistical inference has become the most important branch of statistics.

5. Statistics helps in forecasting of future events:

Forecasting of future events has become an important function of statistics. Most organizations prepare development plans in advance and formulate economic policies by analyzing past and present conditions. E.g. the government makes estimates of the domestic production and requirement of food grains and then decides the quantum (i.e. quantity) of imports, if necessary. For making forecasts statistics provides techniques such as regression (simple, multiple, partial) etc.

Limitations of statistics:

1. Statistics does not study individuals; it deals with aggregates:

Statistics deals with aggregates. Statistical methods are employed to study the characteristics of the population as a whole and not of individuals. Suppose we compute the average height of a population of students as 68 inches. From this figure it is not possible to know the height of a particular student named "X".

2. Statistics does not study qualitative phenomena:

Statistical methods cannot be employed in qualitative problems such as honesty, intelligence etc. These characteristics cannot be given numerical expression, e.g. a statement of the type "Sir Syed Ahmad Khan was a great scholar" cannot be given quantitative expression and cannot be analysed by statistical methods.

3. Statistical results are true only on the average or in the long run:

Statistical laws are not exact like the laws of physics or chemistry, therefore conclusions drawn from statistical studies are not universally true, but they are true on the average. E.g. suppose we say that per capita income in Pakistan is Rs. 1450. Per capita income is computed by dividing the total income of the country by the total population. The per capita income therefore represents the average income of the people, not the income of every individual.

4. Statistics is liable to be misused:

Statistical methods are liable to be misused and mishandled. If inexpert people are appointed for the collection and analysis of the data, the results are likely to be inaccurate. Only persons who have an expert knowledge of statistics can handle statistical methods efficiently.

What is Statistics? Characteristics of Statistics

Meaning of Statistics:

The word statistics is used to give the following three meanings.

Firstly, it is used in the plural sense, which refers to a collection of numerical data in aggregate form relating to any field of study, e.g. statistics of births, statistics of prices, statistics of road accidents etc.

Secondly, it is used in the singular sense, which refers to the methods and procedures used in the collection, analysis and interpretation of data, in presenting the data in various forms such as tables, graphs, charts and diagrams, and in finding averages and other measures which describe the data.

Thirdly, it is used in a technical sense as the plural of the word "STATISTIC". By a statistic we mean a quantity such as the mean, median, standard deviation etc. computed from sample data, e.g. if we select a sample of 20 students from a class containing 100 students and find their average height, then this average is called a statistic.
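
The third meaning can be illustrated with a minimal Python sketch. The heights below are randomly generated and purely hypothetical (they are not data from this post); the point is only that the sample average is computed from 20 of the 100 students and so is a statistic.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical heights (inches) for a class of 100 students; invented for illustration.
population = [round(random.uniform(60, 72), 1) for _ in range(100)]

# Select a sample of 20 students without replacement.
sample = random.sample(population, 20)

sample_mean = statistics.mean(sample)          # a statistic: computed from the sample
population_mean = statistics.mean(population)  # the corresponding value for the whole class
print(sample_mean, population_mean)
```

The two averages will usually be close but not identical, because the statistic depends on which 20 students happen to be selected.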

Characteristics of statistics:

The fundamental characteristics of statistics are described below.

1. Statistics are aggregate of facts:

Statistics are aggregates of facts. A single figure cannot be called statistics, because no conclusion can be drawn from such a figure. E.g. if we say that the income of a person is Rs. 500, this figure is meaningless, because we cannot draw any conclusion from it, i.e. whether the income of the person has increased or decreased. Thus the science of statistics is the science of aggregates and not of individuals.

2. Statistics are affected by multiplicity of causes:

Statistics obtained about a particular phenomenon are due to a number of causes, e.g. production of wheat depends on seeds, fertilizer, irrigation, rainfall etc. It is not possible to study the influence of these factors separately, because these factors jointly determine the yield.

3. Statistics are numerically expressed:

Statistics are numerical statements of facts. Qualitative expressions such as honest, good, bad etc. do not form statistics. E.g. consider the following statement: "QUAID AZAM was a great leader". This is not a statistical statement. If, on the other hand, we say that per capita income in Pakistan was Rs. 840 in 1960 and Rs. 1400 in 1990, this is a statistical statement, as per capita income is expressed numerically and is comparable.

4. Statistics are collected in a systematic manner:

When statistics are collected in a systematic manner, they may give accurate results. If they are collected in a haphazard manner, the very purpose of collecting them is defeated, and such statistics lead to misleading conclusions. Thus there must be trained investigators and proper organizations for the collection of statistics.

5. Statistics are placed in relation to each other:

Statistics are collected mainly for the purpose of comparison. Data collected must be comparable and homogeneous. E.g. if we compare the heights of persons with their incomes, it is not statistics. On the other hand, if we compare the ages of husbands with the ages of their wives, then it is called statistics.

6. Statistics are collected for a pre-determined purpose:

The purpose for which statistics are to be collected is always determined in advance. It enables the investigator to distinguish between wanted and unwanted data. If the purpose is not determined in advance, the investigator may collect irrelevant data.