BA3300 Introduction to Business Statistics


 
  • "What is not measured cannot be controlled." (paraphrase of  W. Edwards Deming)
  • "Argument is only possible when you don't have the facts. Get the facts." (paraphrase of Peter F. Drucker)
  • "Discovery consists of seeing what everybody has seen and thinking what nobody has thought." -- Albert von Szent-Györgyi



    COURSE DESCRIPTION & OBJECTIVES:

    "This is a second course in a required two-course statistics sequence. The first course (Math 105/1105 or its equivalent), covers the basics of probability and statistics, and BA3300 builds on that material. It is assumed that the student is acquainted with the material in Math 1105, and basic use of a PC-Compatible computer, including navigating on the web, fundamental concepts of spreadsheets, and e-mail. Students will learn techniques used for relational analysis and business forecasting and how to apply them in a business context. Tools include Chi-Square tests of statistical independence; analysis of variance; simple linear regression and correlation; multiple linear regression; and extrapolative techniques such as moving averages and exponential smoothing. Emphasis is placed on problem definition, construction of statistical models, analysis of data, and interpretation of  results."

    Prerequisites:

      BA1800(103), Math 1105(105), familiarity with basic statistics and PC-compatible computers, spreadsheets, email, and use of the internet. Minimum 2.0 Campus GPA. These prerequisites are not waived. 

    Required:

    • Recommended Text: Anderson, Sweeney, and Williams, Statistics for Business and Economics (8th or 9th edition, as used in Math 1105); handouts and web materials. The data and tools related to the textbook can be downloaded from http://www.swlearning.com/quant/asw/sbe_8e/sbe.html
    • Save your work every day on your virtual drive, disks, or flash drive. The plan is to turn in assignments and exams using the Mygateway dropbox. Most of our calculations will be done in EXCEL, but a calculator may still be useful. You shouldn't need a complex statistical calculator.
    • You will need an email account.
    • A notebook. Although many of the skills we will develop involve spreadsheets, my hope is to reduce the emphasis on the mechanics of the calculations so we can focus on what the results mean rather than memorizing a bunch of formulas. A notebook will help as we work through the concepts.
    • Note: The Forgetting Curve
    • Ode to Procrastination
     

    Winter/Spring, 2005 -- 

    Robert J. "Bud" Banis, Ph.D.,C.M.A.
    CCB 230 314-516-6136; 636-394-4950 
    E-mail: ba3300 (at) bud banis .com   (omit spaces). Use this email address for all communications on the course. Don't copy and paste it, as it has extra spaces to thwart automated harvesting by spammers.

    Administrative issues:

    Historical Grade Distributions
    "The Rules" on things like attendance, drop dates, academic dishonesty, and the like. 
    Rules on cheating are enforced.
    See the discussion at collegecheating.com
     
     
    Tentative GRADING:
    Two Exams 500
    Suggested exam questions:  20
    up to 12 quizzes 150
    5 exercises 300
    Research report (borderline decisions) Due last class 4/28 50
    Total   points 1000
    Each of the two exams has two halves. The written half is closed book with one page (2 sides) of notes, part multiple choice and part short answer. Students submit questions via the Mygateway discussion board (3 per exam, multiple choice with >=5 options and suggested answers); I may use some of those. The other half is a hands-on computer exam, closed book but with one page of notes and access to virtual drives and the web.
    Quizzes are open for group discussions, but all exercises and exams are individual effort only. Submissions very similar in form and/or content will result in academic affairs investigations.
    Letter grade breaks are 90, 80, 70, and 60%. Plus and minus grades are rare. Research report quality will be used to decide borderline cases.

    Resources:

    Review spreadsheet formula design from BA 1800

    Resources and video tutorials

    Videos and other course materials are on the course CDs and DVD. These are different from the CDs that came with the book.
     

    Is EXCEL an adequate statistics package?
    http://www.practicalstats.com/Pages/excelstats.html
    http://www.cmis.csiro.au/Mary.Barnes/PDF/Statistical%20flaws%20in%20Excel_Woody.pdf
    http://www.stat.uni-muenchen.de/~knuesel/elv/accuracy.html
    http://www.stat.uni-muenchen.de/~knuesel/elv/excelxp.pdf
    Camtasia pack test anova2v.exe
    computer labs
    Hours in the computer labs
    More Statistics Links on the web 
    -----Some interesting statistics questions to research 

    Tutor for the Spring Session.

    Maggie Shen will be the BA 3300 tutor again this  semester. She will hold office hours in CCB213 during the following times:

     

    Approximate timing

    in process: 
    Weeks 1-4 
     
     
    Monday 
    MLK Day Holiday doesn't affect schedule

    Overview, review of Z and t distributions, hypothesis testing. The Central Limit Theorem. One sample t-test. 

    Review of Netscape Navigator/MS Internet Explorer, Windows Explorer, and EXCEL. Editing, converting, and cleaning up real data. Descriptive statistics, graphing, t-tests. Comparing two sample means.
    Normsdist function:
    Be the "Martha Stewart" of Statistics: make your own Z score table. The "Martha Stewart" table of critical t's (and a more complex version of t-tail probabilities).
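EXCEL's NORMSDIST returns the cumulative standard normal probability. As a rough sketch of the same "make your own table" idea outside EXCEL, the Python below rebuilds a small Z table from the error function. The function names `normsdist` and `z_table` are made up for this illustration and are not part of the course files.

```python
import math


def normsdist(z):
    """Cumulative standard normal probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_table(max_z=3.0, step=0.1):
    """Return (z, cumulative probability) rows from 0 up to max_z."""
    rows = []
    steps = int(round(max_z / step))
    for i in range(steps + 1):
        z = round(i * step, 10)  # avoid accumulating floating-point drift
        rows.append((z, normsdist(z)))
    return rows


# Print a short table; a full table would use step=0.01.
for z, p in z_table(max_z=2.0, step=0.5):
    print(f"{z:4.1f}  {p:.4f}")
```

Each row gives the area under the normal curve to the left of z, which is the same quantity a printed Z table reports.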
     
    Exercise 1: Sampling and the central limit theorem. 
    Due before the end of the day Thursday, February 3
    Using the random sampling procedure, collect 20 samples of size n=25 from the airplane empty seat data.
    Derive means, standard deviations, and 95% confidence intervals for estimates of the population mean for each of the 20 samples.
    Compare histograms of the original distribution and the distribution of sample means.
    Turn in the resulting spreadsheet printed to be 1 page (legible) with your name, section, and student number printed at the top of the sheet. You can accomplish this by printing a selected range that leaves out some of the original data.
    Also submit a file via the Mygateway dropbox. Be sure to click send to submit.
    How many of your samples had intervals including the true mean? 
    How many gave an erroneous estimate? 
    What is the expected frequency of errors in estimating the population mean when you use a 95% confidence interval? Was your observed result close to that expectation? 
    State the three characteristics we expect to see in the distribution of sample means from the Central Limit Theorem. Are your results reasonably consistent with those expectations?
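The sampling experiment above can be sketched in Python as a cross-check on the spreadsheet work. This is only an illustration under stated assumptions: the population is a made-up skewed dataset standing in for the airplane empty seat data, and the critical t of 2.064 (24 degrees of freedom, two-tailed 95%) is typed in from a t table.

```python
import math
import random
import statistics

random.seed(3300)  # fixed seed so the run is repeatable

# Hypothetical stand-in for the empty-seat data: a skewed population.
population = [random.expovariate(1 / 12.0) for _ in range(5000)]
true_mean = statistics.mean(population)

T_CRIT_24 = 2.064  # two-tailed 95% critical t for n - 1 = 24 df


def sample_interval(data, n=25):
    """Draw one random sample; return its mean, SD, and 95% CI."""
    sample = random.sample(data, n)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)
    half_width = T_CRIT_24 * sd / math.sqrt(n)
    return mean, sd, (mean - half_width, mean + half_width)


results = [sample_interval(population) for _ in range(20)]
sample_means = [m for m, _, _ in results]
misses = sum(1 for _, _, (lo, hi) in results if not lo <= true_mean <= hi)

print(f"population mean = {true_mean:.2f}")
print(f"intervals missing the true mean: {misses} of 20")
print(f"SD of population   = {statistics.pstdev(population):.2f}")
print(f"SD of sample means = {statistics.stdev(sample_means):.2f}")
```

With 95% intervals, about 1 miss in 20 is expected on average, and the sample means should be much less spread out than the original data.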

    Hypothesis testing and sample means h0test.pdf

    Caution in the interpretation of hypothesis tests!

    Comparison of sample variances through F tests, comparison of sample means, controlled experiments, Ch 9 
     
    Exercise 2:
    Click here for Detailed Description with questions to guide you in interpretation
    Paired t-test on gender and grades. Due before the end of the day Tuesday, March 1.
    Required (or no grade):
    1) Print your name and student number at the top of the sheet (STUDENT NUMBER: up to 50 points will be taken off for not using the student number). Use the last 4 digits of your student number in the data in place of A, B, C, D.
    2) Print interpretations in text boxes next to the procedures they refer to.
    See the completed example we did in class on the salaries data.
    3) Turn in the spreadsheet printed to be 1 page (legible). You can accomplish this by printing a selected range that leaves out some of the original data.
    4) Also submit a file via the Mygateway dropbox. Title it Exercise 2. Be sure to click "add" and then "send" to submit.

    Variance and Standard Deviation:

    Ch 6 Describing data: measures of central tendency and dispersion. Areas under the normal curve: the Z score table. Applications to normalized scores: the IQ score examples.
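As a small illustration of these ideas, the sketch below computes a sample variance and standard deviation and then converts an IQ score to a Z score and an area under the normal curve. The scores are invented, and the IQ scale (mean 100, standard deviation 15) is the conventional one, assumed here rather than taken from the class files.

```python
import statistics
from statistics import NormalDist

scores = [88, 94, 100, 103, 109, 112, 121]  # hypothetical sample

mean = statistics.mean(scores)
variance = statistics.variance(scores)  # sample variance (n - 1), like EXCEL VAR
std_dev = statistics.stdev(scores)      # sample SD (n - 1), like EXCEL STDEV

iq = NormalDist(mu=100, sigma=15)       # assumed IQ scale


def z_score(x, dist=iq):
    """Number of standard deviations x lies above the mean."""
    return (x - dist.mean) / dist.stdev


z = z_score(130)
share_below = iq.cdf(130)  # area under the normal curve to the left of 130

print(f"mean={mean:.2f} variance={variance:.2f} sd={std_dev:.2f}")
print(f"IQ 130: z={z:.2f}, share scoring below={share_below:.4f}")
```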

    Central Limit Theorem

    Sampling Statistics and confidence limits on estimates of parameters. Ch 7 Sampling from Airplane empty seats data CXA07_01.PRN
    video sample.avi  Results show Central Limit Theorem on distribution of sample means. 
    Joint probabilities and probabilities of multiple errors in the sampling experiment. Normal approximation to binomial distribution.
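For the sampling experiment, each 95% interval has a 5% chance of missing the true mean, so the number of misses in 20 independent samples follows a binomial distribution. The sketch below works out those probabilities and compares them with a normal approximation (with continuity correction). The helper names are made up for illustration.

```python
import math


def binom_pmf(k, n, p):
    """Probability of exactly k errors in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def phi(z):
    """Cumulative standard normal probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


n, p = 20, 0.05
p_no_misses = binom_pmf(0, n, p)  # joint probability all 20 intervals are right
p_two_or_more = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p)

# Normal approximation to the binomial, with continuity correction.
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx_two_or_more = 1 - phi((1.5 - mu) / sigma)

print(f"P(no misses)         = {p_no_misses:.4f}")
print(f"P(2 or more misses)  = {p_two_or_more:.4f}")
print(f"normal approximation = {approx_two_or_more:.4f}")
```

The approximation is rough here because n*p is only 1; a common rule of thumb is to rely on it when both n*p and n*(1-p) are at least 5.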
     

    Hypothesis testing and sample means

    Example one-sample t-test: Is hospital length of stay < 5 days?
    Beta error & sample size losbeta.xls
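A one-sample t-test like the length-of-stay example can be sketched as follows. The ten stays are invented numbers, not the class data, and the critical value of -1.833 (one-tailed, 5%, 9 degrees of freedom) is typed in from a t table rather than computed.

```python
import math
import statistics

stays = [4.2, 3.8, 5.1, 4.6, 3.9, 4.4, 5.0, 4.1, 3.7, 4.7]  # hypothetical days
MU_0 = 5.0       # H0: mean stay is 5 days; H1: mean stay is less than 5 days
T_CRIT = -1.833  # one-tailed 5% critical t for df = 9 (from a t table)


def one_sample_t(data, mu0):
    """t statistic for testing the sample mean against mu0."""
    n = len(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    return (statistics.mean(data) - mu0) / standard_error


t_stat = one_sample_t(stays, MU_0)
reject_h0 = t_stat < T_CRIT  # left-tailed test

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```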
     

    T-test two sample means:

    gender and height data from class
    height data 2 sample t test completed

    Paired sample t-test:

    Example on salary related to gender: the power of controlling other variables through paired comparisons. Are male and female salaries significantly different? Results with interpretation: sexsalt.xls
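To see why pairing is powerful, the sketch below runs the same invented salary figures (in thousands, matched by job) two ways: as two independent samples with a pooled variance, and as paired differences. The data and function names are hypothetical and are not the contents of sexsalt.xls.

```python
import math
import statistics

# Hypothetical salaries in $1000s; position i in each list is a matched pair.
male = [52, 61, 45, 70, 58, 66]
female = [50, 58, 44, 67, 55, 64]


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def paired_t(a, b):
    """Paired t statistic computed on the within-pair differences."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


t_two_sample = pooled_t(male, female)
t_paired = paired_t(male, female)

print(f"two-sample t = {t_two_sample:.3f}")
print(f"paired t     = {t_paired:.3f}")
```

The mean difference is the same in both tests. Pairing removes the job-to-job variation from the standard error, so the paired t statistic is far larger.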

    Application of confidence intervals when the data consist of population proportions: Drug Use Survey
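A confidence interval for a proportion uses the normal approximation p ± z * sqrt(p(1-p)/n). The sketch below applies it to invented survey counts; the numbers are not from the Drug Use Survey.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Approximate 95% confidence interval for a population proportion."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error


# Hypothetical survey: 120 of 400 respondents answered "yes".
low, high = proportion_ci(successes=120, n=400)
print(f"95% CI for the proportion: {low:.3f} to {high:.3f}")
```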


     

     Exam 1--

    There will be the paper exam with mostly MC questions.
    Thursday will be the hands on exam where you demonstrate skill with procedures as well as interpretation of the EXCEL output.
    Exam 1 questions  by discussion board due by Tuesday Mar. 1

     The hands-on exam will have at least a histogram, descriptive statistics, various t-tests and interpretations.
    The multiple choice part will have questions of the sort seen on the quizzes, perhaps including some student-submitted questions. 
    Collection of Raw Exam 1 questions for Fall 2003, both sections
    Collection of raw questions W04
    Be wary. The raw questions collected here are of variable quality, and some are ambiguous or may even have the wrong answer suggested. Use these as a guide for the types of information to be covered, but you'd better make sure you understand the information rather than just memorizing these questions!

    Ch 13 Analysis of Variance

     Concepts of ANOVA, "explaining variance": where does the variation come from, and how can you mathematically separate out its sources?
    EXCEL model showing partition of variance by averaging out different effects. Be sure to set Tools > Macro > Security to "Medium" and enable the macros so the drop-down list works.
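The partition of variance can be checked numerically. The sketch below splits the total sum of squares for three invented groups into a between-group part and a within-group part and forms the F ratio. It illustrates the one-way ANOVA arithmetic only; it is not a copy of the EXCEL model.

```python
import statistics

# Hypothetical measurements for three treatments.
groups = {
    "A": [20, 22, 19, 24],
    "B": [28, 30, 27, 31],
    "C": [18, 17, 21, 20],
}


def anova_partition(groups):
    """Return SS total, SS between, SS within, and the F ratio."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = statistics.mean(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_between = sum(
        len(values) * (statistics.mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    ss_within = sum(
        (x - statistics.mean(values)) ** 2
        for values in groups.values()
        for x in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_total, ss_between, ss_within, f_ratio


ss_total, ss_between, ss_within, f_ratio = anova_partition(groups)
print(f"SS total   = {ss_total:.2f}")
print(f"SS between = {ss_between:.2f}")
print(f"SS within  = {ss_within:.2f}")
print(f"F ratio    = {f_ratio:.2f}")
```

The between and within sums of squares add up to the total, which is the sense in which ANOVA "explains" variance.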
     
    Exercise 3 Store sale prices due Thursday, March 31

    Ch 14 & 15 Regression

    -- Simple Correlation And Linear Regression (pdf regression overview)
    See the Varsorc model about detecting effects in the presence of other factors
    Regression least squares fit equations and EXCEL equivalent functions
    Concepts of ANOVA  gender-height data from F01 correcting for other effects
     Golf balls and clubs data cxa15_06.prn
    Video with sound: 1-Factor ANOVA, 2-Factor ANOVA with Replication
    Screen captures of golf club example, if you want to see how means could be adjusted arithmetically to do what is done in ANOVA, canceling the treatment effects: golfadj1.jpg
    and golfadjf.jpg, showing formulas 2wayanov.xls
    completed comparisons showing "manual" calculations golf9.xls
    More Discussion
    Ch 14 & 15 Regression
    Is crime related to level of gambling activity? Quadratic models--  Electricity Usage and size of house, kwhuse.xls ,  results kwhuse9.xls
    Learning Curves, assembly.dat  results  9 am  how does this relate to learning curve theory
    data for paste special/ transpose. Interaction example from book--the clock auction.
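The least squares fit equations mentioned in the regression materials above can be written out directly. The sketch below computes the slope, intercept, and correlation from sums of squares, with comments naming the EXCEL worksheet functions that return the same quantities. The house-size and usage numbers are invented, not the contents of kwhuse.xls.

```python
import math


def least_squares(x, y):
    """Return slope, intercept, and correlation for a simple linear fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # EXCEL: SLOPE(y_range, x_range)
    intercept = mean_y - slope * mean_x  # EXCEL: INTERCEPT(y_range, x_range)
    r = sxy / math.sqrt(sxx * syy)       # EXCEL: CORREL(x_range, y_range)
    return slope, intercept, r


# Hypothetical data: house size (1000s of sq ft) and monthly kWh use.
size = [1.0, 1.5, 2.0, 2.5, 3.0]
kwh = [620, 810, 990, 1230, 1400]

slope, intercept, r = least_squares(size, kwh)
print(f"kWh = {intercept:.1f} + {slope:.1f} * size, r = {r:.3f}, r^2 = {r * r:.3f}")
```

For a quadratic model, the same idea extends to a second predictor column holding the squared x values, fitted with multiple regression.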
    Weeks 9-12
     
     
     

    Examples of Chi-Squared studies:

    Note you can save just the analysis in a small file separate from the data by copy/paste into a new sheet. Just be sure to paste in the same cells as there are absolute references that might otherwise be messed up.
    Examples for highschool program:
    just percents --metrostatus and dangerous behaviors
                            --Increase in Marijuana Use with Age
                            --Marijuana Use and Grades 
                            --Marijuana Use and Suicide
    news item highschool girls using steroids

    YRBSS2001 data with smoking vs. Sex
    Smoking vs Sex-2003
    who is hitting whom?
    Being hit related to sex-2003
    Who gets better grades in Highschool?
    Grades by sex-2003 

    Grade in school vs. age distributions-2003

     Sexual experience and carrot consumption: carrotx2.xls


     
     
    EXC grades available until  last drop day
    Processing larger datasets 
    YRBSS data: Weight Multiple Regression
    Exercise 4 (due Thursday, April 21): a multiple regression on a larger dataset of about 14,000 records, studying the effects of age, gender, and height on the weight of high school students, using data from the Youth Risk Behavior Surveillance System (YRBSS). We'll walk through the actual exercise together in class, each person with their own individualized data.
     Click here for detail on the assignment. See the Multiple Regression Video for explicit details.
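For anyone curious about what a multiple regression computes, here is a minimal sketch that fits weight on age, gender, and height by solving the normal equations. It runs on simulated records, not the YRBSS file, and the "true" coefficients used to simulate the data are arbitrary assumptions. This is one textbook way to get the coefficients and is not a description of how EXCEL's Regression tool works internally.

```python
import random


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares coefficients [intercept, b1, b2, ...]."""
    x = [[1.0] + list(r) for r in rows]  # add the constant column
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Simulated students (hypothetical): age in years, male = 1, height in cm.
random.seed(2003)
records, weights = [], []
for _ in range(500):
    age = random.randint(14, 18)
    male = random.randint(0, 1)
    height = random.gauss(163 + 12 * male, 7)
    weight = -100 + 1.5 * age + 4.0 * male + 0.9 * height + random.gauss(0, 2)
    records.append([age, male, height])
    weights.append(weight)

coefs = fit_ols(records, weights)
for name, value in zip(["intercept", "age", "male", "height"], coefs):
    print(f"{name:9s} {value:8.3f}")
```

The fitted coefficients should land close to the values used to simulate the data, which is a useful sanity check before trusting the method on real records.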
     

    See the screen captures on parsing the 2003 data into EXCEL.


    Chi-squared analysis for independence

    Crosstabulation of variables can most simply be done with pivot tables in EXCEL. See the videos and an example sheet on the YRBS2001 data. We will use pivot tables in EXCEL to generate the data we need from YRBS2003 for the chi-squared analysis.
    We will walk through this together in class.
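A pivot table used this way simply counts how many records fall into each combination of two variables. The sketch below does the same counting on a handful of invented survey records; the variable coding is hypothetical and does not match the YRBS file layout.

```python
from collections import Counter

# Hypothetical records: (sex, smoked in the last 30 days)
records = [
    ("F", "yes"), ("F", "no"), ("M", "no"), ("M", "yes"),
    ("F", "no"), ("M", "no"), ("F", "no"), ("M", "yes"),
]


def crosstab(records):
    """Count records by (row value, column value), like a pivot table."""
    counts = Counter(records)
    row_labels = sorted({row for row, _ in records})
    col_labels = sorted({col for _, col in records})
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


row_labels, col_labels, table = crosstab(records)
print("     " + "  ".join(f"{c:>4s}" for c in col_labels))
for label, row in zip(row_labels, table):
    print(f"{label:>4s} " + "  ".join(f"{n:4d}" for n in row))
```

The resulting table of observed counts is the input a chi-squared test of independence needs.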
     Exercise 5: due before Thursday, April 28 (the last class). Get it in early; none accepted after April 28. At the end of the semester, due dates are firm to allow for timely grading.
     a printable version of the exercise in WORD
     More detail on pivotTables in EXCEL, usepivot.doc

     

    Crosstabulation can also be done in ACCESS. See the videos on importing the YRBS99 into ACCESS that are on the BA103 site; the procedure is similar to the EXCEL Wizard.
    Management of large files using FTP or Winzip
    Crosstabulating variables in ACCESS to preprocess for statistical analysis. Chi Square can be done on ACCESS crosstabulated data from the YRBSS.  Crosstab Query video
    Rainy days
    Ch 12. Categorical Analysis: Chi Square tests of independence, drinking and smoking. Spreadsheet formulas: smkdrkfo.jpg. There are, of course, several ways to approach the calculation of the table of expected values. Another approach, used in class F02, is shown in smkdrkf2.jpg
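One way to calculate the table of expected values is (row total × column total) / grand total for each cell. The sketch below applies that to an invented 2×2 table and sums (observed - expected)² / expected to get the chi-squared statistic. The counts are hypothetical, and the critical value of 3.841 (5% level, 1 degree of freedom) is typed in from a chi-squared table.

```python
def chi_square_independence(observed):
    """Return the expected table, chi-squared statistic, and degrees of freedom."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df


# Hypothetical counts: rows = smoker / non-smoker, columns = drinker / non-drinker.
observed = [[30, 20], [10, 40]]
CHI2_CRIT_DF1 = 3.841  # 5% critical value for 1 degree of freedom

expected, chi2, df = chi_square_independence(observed)
print("expected:", expected)
print(f"chi-squared = {chi2:.3f}, df = {df}")
print("reject independence:", chi2 > CHI2_CRIT_DF1)
```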

    Copying text out of a PDF: pdftxt.avi
    and pasting into EXCEL pastechi.avi
    Formulas are designed to accommodate new problems by inserting or deleting rows or columns: expchi.avi. Then all you have to do is paste the observed values into the first table, and the whole sheet will recalculate to give the solution.
    Father's and son's career paths
    This is a problem taken from the old textbook, p. 965, problem 16.38. Pasting new data from the strangely arranged file that came with the book into a modified sheet is shown in pastenew.avi
    Bring your disk to the final exam, as you will have a problem requiring deletion and insertion. If you know how to do this, the problem will take only a few minutes to do. 


     
     
    Everything is due by the Last Class. Nothing will be accepted late.
     6: Time series and forecasting. PPt summary and tools in EXCEL

    Stock Market information:

    • Market Beta for individual stocks
    • NYSE Stock directory
    • NASDAQ site
    • AMEX
    • Yahoo Links on Investment concepts
    • BUDweiser and other symbol lookup at YAHOO
    • to find history, click on historical prices.
    • Stock Market Beta  = volatility relative to the market, a measure of nondiversifiable risk. (definitions from Yahoo) Should be related to Return.
    • Estimating Risk Parameters and Costs of Financing-- a chapter from Aswath Damodaran of NYU
    • Examples from Class: Budweiser (Beta may be zero) and IBM (Beta >1) compared to S&P 500
    • Exercise 6 -extra credit due Thursday April 28

    • Download 5 years' worth of monthly prices from YAHOO for the S&P 500. There is a download-to-spreadsheet link at the bottom of the historical prices page. Open the file in EXCEL and calculate returns as (P_t - P_{t-1}) / P_{t-1}, based on adjusted closing prices.
    • Do the same for a stock of your choice (avoid stocks others are doing).
    • Re-sort the data by date, from oldest to most recent, so that the time series charts run in the right direction.
    • Chart a time series for the two returns and estimate trend lines using the right-click chart options.
    • Produce a scatter chart of the single stock returns vs. the S&P returns and estimate an intercept and linear slope with chart options.
    • Conduct a simple regression procedure and see if you get a slope significantly different from zero. Comment on the slope and the p-value. The slope is known as the Market Beta.
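The return and beta calculations in this exercise can be sketched as below. The prices are invented and listed oldest first; in the exercise they would come from the adjusted closing prices downloaded from YAHOO. The function names are made up for illustration.

```python
def returns(prices):
    """Period returns (P_t - P_{t-1}) / P_{t-1}; prices are oldest first."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def market_beta(stock_returns, market_returns):
    """Slope of the regression of stock returns on market returns."""
    n = len(market_returns)
    mean_m = sum(market_returns) / n
    mean_s = sum(stock_returns) / n
    sxy = sum((m - mean_m) * (s - mean_s)
              for m, s in zip(market_returns, stock_returns))
    sxx = sum((m - mean_m) ** 2 for m in market_returns)
    return sxy / sxx


# Hypothetical monthly adjusted closing prices, oldest first.
index_prices = [1100.0, 1122.0, 1105.0, 1140.0, 1131.0, 1160.0]
stock_prices = [50.0, 52.0, 50.5, 53.5, 52.8, 55.0]

index_returns = returns(index_prices)
stock_returns = returns(stock_prices)
beta = market_beta(stock_returns, index_returns)
print(f"estimated beta = {beta:.2f}")
```

A beta above 1 means the stock's returns have tended to move more than the market's; whether the slope differs significantly from zero still needs the regression output's p-value.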
    Last Class, Application Reports, course evaluation.

    Time series and Forecasting

    powerpoint from the book

    Seasonal data on KWH use Timseasn.xls
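Two of the extrapolative techniques named in the course description, moving averages and exponential smoothing, can be sketched in a few lines. The monthly usage figures are invented, not the contents of Timseasn.xls, and the smoothing constant alpha = 0.3 is an arbitrary choice.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Smoothed series: s_t = alpha * y_t + (1 - alpha) * s_{t-1}."""
    smoothed = [series[0]]  # start from the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical monthly kWh use.
kwh = [900, 870, 760, 640, 600, 820, 1050, 1100, 880, 650, 700, 860]

print("3-month moving average:", [round(v, 1) for v in moving_average(kwh, 3)])
print("exponential smoothing: ", [round(v, 1) for v in exponential_smoothing(kwh, 0.3)])
```

EXCEL's Exponential Smoothing tool asks for a damping factor, which corresponds to 1 - alpha in this notation.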
    Some Really BAD Forecasts

    Think for yourself.

    How bad is unemployment right now? See historical and current unemployment statistics from the Current Population Survey Statistics . Here is data converted to EXCEL, unemp.xls

    Did the economy take a dive after 9/11 or was the dip something that started even before the current administration came into office?
    Historical Nasdaq daily data in EXCEL
    Historical Stock index data free from  YAHOO:
    Dow-Jones index monthly, 1928 to current djindex.xls
    S&P 500 from 1982 to present spindex.xls



     
    A discussion board is set up on Mygateway for you to submit 3 possible multiple choice exam questions, due by Friday, May 6.
    You can view the submissions there as they are collected.
    Be wary. The Raw questions collected here are of variable quality and some are ambiguous or may even have the wrong answer suggested. Use these as a guide for the types of information to be covered but you'd better make sure you understand the information rather than just memorizing these questions! 

    Both exams Collection of raw questions, i04

    Collection of raw exam 2 questions from a previous semester

    Both parts of the exam will be together. 
    The Multiple Choice part will be done without computer access, closed book, one page of notes allowed.

    Be prepared to use pivot tables to generate summary data for Chi-square analysis from the YRBS 2003 data. For the hands-on part, you will have access to the web, saved files on your K drive, and your USB flash drive, so you can have some of the work done ahead of time. Turn in the files at the end of the exam via the Mygateway dropbox.

    If you are prepared and know what you are doing then there will be lots of time.

    If you aren't prepared, then there will not be time to learn it and do it.
     

    Exam 2: Room, Date & Time

    Ref No / Section   Time             Room     Exam 2 Date & Time
    13990 / 004        12:30-1:45 TR    CCB003   Thurs May 12, 12:30-2:30 PM
    14000 / 005        2:00-3:15 TR     CCB003   Thurs May 12, 2:45-4:45 PM

    Switching sections is not allowed unless there is a good reason and prior permission. Taking an exam at a later time than prescribed may result in a grade penalty. Doing so without permission may result in a zero grade for the exam.

    last modified July 19, 2005
    comments or questions to rbanis (at) jinx.umsl.edu

     Bud's Statistics Book sites:
    heuristicbooks.com
    statisticsbook.com
    statisticsbooks.com
    std-statistics.com
    winningwithstatistics.com
    questionnairehints.com
    questionnairetips.com
    statisticsisforwinners.com
     howslotswork.com
     understandingslots.com
     understandtheslots.com