it-ebooks - Computational and Inferential Thinking (UCB Data8)

  • Book:
    Computational and Inferential Thinking (UCB Data8)
  • Publisher:
    iBooker it-ebooks
  • Year:
    2017
A/B Testing

We have used random permutations to see whether two samples are drawn from the same underlying categorical distribution. If the samples are numerical, the same method can be used; the choice of test statistic is usually simpler. In our example with the Deflategate data, we used the difference of means to test whether the Patriots' and Colts' balls came from the same underlying distribution.

In modern data analytics, deciding whether two numerical samples come from the same underlying distribution is called A/B testing. The name refers to the labels of the two samples, A and B.

Smokers and Nonsmokers

We have performed many different analyses on our random sample of mothers and their newborn infants, but we haven't yet looked at the data on whether the mothers smoked. One of the aims of the study was to see whether maternal smoking was associated with birth weight.

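from datascience import *
import numpy as np
import matplotlib.pyplot as plots

baby = Table.read_table('baby.csv')
baby

Birth Weight | Gestational Days | Maternal Age | Maternal Height | Maternal Pregnancy Weight | Maternal Smoker
120          | 284              | 27           | 62              | 100                       | False
113          | 282              | 33           | 64              | 135                       | False
128          | 279              | 28           | 64              | 115                       | True
108          | 282              | 23           | 67              | 125                       | True
136          | 286              | 25           | 62              | 93                        | False
138          | 244              | 33           | 62              | 178                       | False
132          | 245              | 23           | 65              | 140                       | False
120          | 289              | 25           | 62              | 125                       | False
143          | 299              | 30           | 66              | 136                       | True
140          | 351              | 27           | 68              | 120                       | False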

... (1164 rows omitted)

We'll start by selecting just Birth Weight and Maternal Smoker. There are 715 non-smokers among the women in the sample, and 459 smokers.

weight_smoke = baby.select('Birth Weight', 'Maternal Smoker')
weight_smoke.group('Maternal Smoker')

Maternal Smoker | count
False           | 715
True            | 459

The first histogram below displays the distribution of birth weights of the babies of the non-smokers in the sample. The second displays the birth weights of the babies of the smokers.

nonsmokers = baby.where('Maternal Smoker', are.equal_to(False))
nonsmokers.hist('Birth Weight', bins=np.arange( , , ), unit='ounce')

[Figure: histogram of the birth weights of the babies of the non-smokers]

smokers = baby.where('Maternal Smoker', are.equal_to(True))
smokers.hist('Birth Weight', bins=np.arange( , , ), unit='ounce')

[Figure: histogram of the birth weights of the babies of the smokers]

Both distributions are approximately bell shaped and centered near 120 ounces. The distributions are not identical, of course, which raises the question of whether the difference reflects just chance variation or a difference in the distributions in the population.

This question can be answered by a test of hypotheses.

Null hypothesis: In the population, the distribution of birth weights of babies is the same for mothers who don't smoke as for mothers who do. The difference in the sample is due to chance.

Alternative hypothesis: The two distributions are different in the population.

Test statistic: Birth weight is a quantitative variable, so it is reasonable to use the absolute difference between the means as the test statistic.

The observed value of the test statistic is about 9.27 ounces.

means_table = weight_smoke.group('Maternal Smoker', np.mean)
means_table

Maternal Smoker | Birth Weight mean
False           | 123.085
True            | 113.819

# Column 1 holds the group means; row 0 is the non-smokers (False), row 1 the smokers (True)
nonsmokers_mean = means_table.column(1).item(0)
smokers_mean = means_table.column(1).item(1)
nonsmokers_mean - smokers_mean

9.266142572024918
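
The indexing above depends on the order of the rows in the grouped table. As a cross-check, the same statistic can be sketched in plain Python. This is only an illustration: `abs_diff_of_means` is a hypothetical helper, not part of the `datascience` library, and the data are just the first six rows of the table displayed earlier, so the result differs from the 9.27 ounces found for the full sample.

```python
from statistics import mean

def abs_diff_of_means(values, labels):
    """Absolute difference between the mean of the values labelled False
    and the mean of the values labelled True (hypothetical helper)."""
    group_false = [v for v, lab in zip(values, labels) if not lab]
    group_true = [v for v, lab in zip(values, labels) if lab]
    return abs(mean(group_false) - mean(group_true))

# First six rows shown in the table: birth weight (ounces) and Maternal Smoker
weights = [120, 113, 128, 108, 136, 138]
smoker = [False, False, True, True, False, False]

# Non-smokers: (120 + 113 + 136 + 138) / 4 = 126.75; smokers: (128 + 108) / 2 = 118
print(abs_diff_of_means(weights, smoker))  # 8.75
```

Taking the absolute value means the statistic does not depend on which group is subtracted from which.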
A Permutation Test

To see whether such a difference could have arisen due to chance under the null hypothesis, we will use a permutation test just as we did in the previous section. All we have to change is the code for the test statistic. For that, we'll compute the difference in means as we did above, and then take the absolute value.

Remember that under the null hypothesis, all permutations of birth weight are equally likely to appear with the Maternal Smoker column. So, just as before, each repetition starts with shuffling the variable being compared.
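
A single shuffle can be sketched in plain Python to show what it changes and what it leaves alone. This illustration uses the standard `random` module rather than the Table `sample` method; `shuffle_values` is a hypothetical helper, and the data are the first six rows of the table shown earlier.

```python
import random
from collections import Counter

def shuffle_values(values, rng):
    """Return the values in a random order (sampling without replacement).
    The input list itself is left unchanged. Hypothetical helper."""
    return rng.sample(values, len(values))

weights = [120, 113, 128, 108, 136, 138]
smoker = [False, False, True, True, False, False]

rng = random.Random(0)  # seeded only so that the illustration is repeatable
shuffled = shuffle_values(weights, rng)

# The labels are untouched, so the two group sizes stay the same ...
print(Counter(smoker))
# ... and the shuffled column holds exactly the same weights, only reordered.
print(sorted(shuffled) == sorted(weights))
```

Because only the pairing between weights and labels changes, any difference between the group means after shuffling is due to chance alone, which is what the null hypothesis describes.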

def permutation_test_means(table, variable, classes, repetitions):
    """Test whether two numerical samples
    come from the same underlying distribution,
    using the absolute difference between the means.
    table: name of table containing the sample
    variable: label of column containing the numerical variable
    classes: label of column containing names of the two samples
    repetitions: number of random permutations"""

    t = table.select(variable, classes)

    # Find the observed test statistic
    # (column 1 of the grouped table holds the two group means)
    means_table = t.group(classes, np.mean)
    obs_stat = abs(means_table.column(1).item(0) - means_table.column(1).item(1))

    # Assuming the null is true, randomly permute the variable
    # and collect all the generated test statistics
    stats = make_array()
    for i in np.arange(repetitions):
        shuffled_var = t.select(variable).sample(with_replacement=False).column(0)
        shuffled = t.select(classes).with_column('Shuffled Variable', shuffled_var)
        m_tbl = shuffled.group(classes, np.mean)
        new_stat = abs(m_tbl.column(1).item(0) - m_tbl.column(1).item(1))
        stats = np.append(stats, new_stat)

    # Find the empirical P-value:
    emp_p = np.count_nonzero(stats >= obs_stat) / repetitions

    # Draw the empirical histogram of the test statistics generated under the null,
    # and compare with the value observed in the original sample
    Table().with_column('Test Statistic', stats).hist(bins= )
    plots.title('Empirical Distribution Under the Null')
    print('Observed statistic:', obs_stat)
    print('Empirical P-value:', emp_p)
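
For readers without the `datascience` library at hand, the same procedure can be sketched with the standard library alone. This is a simplified sketch, not the function above. `permutation_p_value` and `abs_diff_of_means` are hypothetical names, the function returns the two numbers instead of printing them and drawing a histogram, it assumes exactly two classes labelled `True` and `False`, and the data are made up.

```python
import random
from statistics import mean

def abs_diff_of_means(values, labels):
    """Absolute difference between the means of the two labelled groups."""
    group_true = [v for v, lab in zip(values, labels) if lab]
    group_false = [v for v, lab in zip(values, labels) if not lab]
    return abs(mean(group_true) - mean(group_false))

def permutation_p_value(values, labels, repetitions, seed=None):
    """Return (observed statistic, empirical P-value) for a permutation
    test based on the absolute difference between the group means."""
    rng = random.Random(seed)
    observed = abs_diff_of_means(values, labels)
    at_least_as_extreme = 0
    for _ in range(repetitions):
        # Under the null, every pairing of values with labels is equally likely,
        # so shuffle the values and keep the labels fixed.
        shuffled = rng.sample(values, len(values))
        if abs_diff_of_means(shuffled, labels) >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / repetitions

# Made-up data: two well-separated groups, so a small P-value is expected
values = [1, 2, 3, 4, 5, 101, 102, 103, 104, 105]
labels = [False] * 5 + [True] * 5
observed, p_value = permutation_p_value(values, labels, repetitions=2000, seed=0)
print('Observed statistic:', observed)
print('Empirical P-value:', p_value)
```

The empirical P-value is the proportion of shuffled statistics that are at least as large as the observed one, which matches the `np.count_nonzero(stats >= obs_stat) / repetitions` step in the function above.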