{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Mean, Median, Mode, and introducing NumPy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Mean vs. Median"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's create some fake income data, centered around 27,000 with a normal distribution and standard deviation of 15,000, with 10,000 data points. (We'll discuss those terms more later, if you're not familiar with them.)\n",
"\n",
"Then, compute the mean (average) - it should be close to 27,000:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"26933.720599154913"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"\n",
"incomes = np.random.normal(27000, 15000, 10000)\n",
"np.mean(incomes)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can segment the income data into 50 buckets, and plot it as a histogram:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAD8CAYAAAB5Pm/hAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAEttJREFUeJzt3W2MXFd9x/HvvzFJKE+2401q2U7XERYttErirkJoqiqNKSQ2wqlEqtCqmODKUgkolFbgkBcFqZUcqJoHgUIsAnVQIAkBaisJpK5J1PZFTOyQR5zgxZh4sRtvmsSURrRy+ffFnMVje9Y7szuzO3v8/UijuffcMzPn7L372+Nz71xHZiJJqtevzHQDJEm9ZdBLUuUMekmqnEEvSZUz6CWpcga9JFXOoJekyhn0klQ5g16SKjdnphsAsGDBghwcHJzpZkjSrLJz584XMnNgonp9EfSDg4Ps2LFjppshSbNKRPy4nXpO3UhS5Qx6SaqcQS9JlTPoJalyBr0kVc6gl6TKGfSSVDmDXpIqZ9BLUuX64pux0slscP19Lcv3blg1zS1RrQx6qcsMbvUbp24kqXKO6KVpMt5IX+o1R/SSVDmDXpIqZ9BLUuUMekmqXFsnYyNiLvAF4LeABD4APAvcBQwCe4E/zsyXIiKAm4CVwCvA+zPz0a63XJoGJzqB6uWSmi3aHdHfBHw7M38DOBfYBawHtmXmMmBbWQe4DFhWHuuAW7raYklSRyYM+oh4PfD7wG0Amfm/mfkysBrYVKptAi4vy6uB27PhYWBuRCzsesslSW1pZ+rmHGAU+FJEnAvsBK4BzsrMAwCZeSAiziz1FwH7ml4/UsoOdK3VUh/wunjNFu1M3cwBlgO3ZOb5wH9zZJqmlWhRlsdVilgXETsiYsfo6GhbjZUkda6doB8BRjJze1m/h0bwPz82JVOeDzbVX9L0+sXA/mPfNDM3ZuZQZg4NDAxMtv2SpAlMGPSZ+R/Avoh4UylaAXwf2AKsKWVrgM1leQvwvmi4EDg0NsUjSZp+7d7r5sPAHRFxKrAHuIrGH4m7I2It8BxwRal7P41LK4dpXF55VVdbLEnqSFtBn5mPAUMtNq1oUTeBq6fYLklSl/jNWEmqnEEvSZUz6CWpcv7HIxJ++Ul1c0QvSZUz6CWpcga9JFXOoJekyhn0klQ5g16SKmfQS1LlDHpJqpxBL0mVM+glqXIGvSRVzqCXpMoZ9JJUOYNekirnbYqlPjXerZP3blg1zS3RbOeIXpIqZ9BLUuUMekmqnHP00izj3L065YhekirXVtBHxN6IeDIiHouIHaVsfkRsjYjd5XleKY+IuDkihiPiiYhY3ssOSJJOrJMR/R9k5nmZOVTW1wPbMnMZsK2sA1wGLCuPdcAt3WqsJKlzU5m6WQ1sKsubgMubym/PhoeBuRGxcAqfI0magnZPxibwzxGRwK2ZuRE4KzMPAGTmgYg4s9RdBOxreu1IKTvQ/IYRsY7GiJ+zzz578j2QOjDeiUypZu0G/UWZub+E+daIeOYEdaNFWR5X0PhjsRFgaGjouO2SpO5oa+omM/eX54PAN4ELgOfHpmTK88FSfQRY0vTyxcD+bjVYktSZCYM+Il4TEa8bWwbeATwFbAHWlGprgM1leQvwvnL1zYXAobEpHknS9Gtn6uYs4JsRMVb/K5n57Yh4BLg7ItYCzwFXlPr3AyuBYeAV4Kqut1qS1LYJgz4z9wDntij/T2BFi/IEru5K6yRJU+Y3YyWpcga9JFXOoJekyhn0klQ5g16SKmfQS1LlDHpJqpxBL0mVM+glqXIGvSRVzqCXpMoZ9JJUOYNekipn0EtS5Qx6SaqcQS9JlTPoJalyBr0kVa6d/zNWmnUG1983002Q+oZBL1VivD9uezesmuaWqN84dSNJlXNEL1XOkb4c0UtS5doO+og4JSK+FxH3lvWlEbE9InZHxF0RcWopP62sD5ftg71puiSpHZ2M6K8BdjWtXw/ckJnLgJeAtaV8LfBSZr4RuKHUkyTNkLbm6CNiMbAK+DvgoxERwCXAn5Qqm4BPArcAq8sywD3AZyMiMjO712ypwcsopYm1O6K/EfgY8IuyfgbwcmYeLusjwKKyvAjYB1C2Hyr1JUkzYMKgj4h3AQczc2dzcYuq2ca25vddFxE7ImLH6OhoW42VJHWunRH9RcC7I2IvcCeNKZsbgbkRMTb1sxjYX5ZHgCUAZfsbgBePfdPM3JiZQ5k5NDAwMKVOSJLGN2HQZ+a1mbk4MweBK4HvZOafAg8C7ynV1gCby/KWsk7Z/h3n5yVp5kzlOvqP0zgxO0xjDv62Un4bcEYp/yiwfmpNlCRNRUffjM3Mh4CHyvIe4IIWdX4OXNGFtkmSusBvxkpS5Qx6SaqcQS9JlTPoJalyBr0kVc6gl6TKGfSSVDmDXpIqZ9BLUuUMekmqnEEvSZUz6CWpcga9JFXOoJekyhn0klS5ju5HL82UwfX3zXQTqjPez3TvhlXT3BL1miN6SaqcQS9JlTPoJalyBr0kVc6gl6TKedWNpKN4NU59HNFLUuUMekmq3IRBHxGnR8R3I+LxiHg6Ij5VypdGxPaI2B0Rd0XEqaX8tLI+XLYP9rYLkqQTaWdE/z/AJZl5LnAecGlEXAhcD9yQmcuAl4C1pf5a4KXMfCNwQ6knSZohEwZ9NvysrL6qPBK4BLinlG8CLi/Lq8s6ZfuKiIiutViS1JG2rrqJiFOAncAbgc8BPwRezszDpcoIsKgsLwL2AWTm4Yg4BJwBvHDMe64D1gGcffbZU+uFquE9baTua+tkbGb+X2aeBywGLgB+s1W18txq9J7HFWRuzMyhzBwaGBhot72SpA51dNVNZr4MPARcCMyNiLF/ESwG9pflEWAJQNn+BuDFbjRWktS5dq66GYiIuWX51cDbgV3Ag8B7SrU1wOayvKWsU7Z/JzOPG9FLkqZHO3P0C4FNZZ7+V4C7M/PeiPg+cGdE/C3wPeC2Uv824MsRMUxjJH9lD9otSWrThEGfmU8A57co30Njvv7Y8p8DV3SldZKkKfObsZJUOYNekipn0EtS5Qx6SaqcQS9JlTPoJalyBr0kVc6gl6TKGfSSVDmDXpIqZ9BLUuUMekmqnEEvSZUz6CWpcga9JFXOoJekyhn0klQ5g16SKmfQS1LlDHpJqtyE/zm41AuD6++b6SZIJw1H9JJUOYNekio3YdBHxJKIeDAidkXE0xFxTSmfHxFbI2J3eZ5XyiMibo6I4Yh4IiKW97oTkqTxtTNHfxj4q8x8NCJeB+yMiK3A+4FtmbkhItYD64GPA5cBy8rjrcAt5VnSLHai8yp7N6yaxpaoUxOO6DPzQGY+Wpb/C9gFLAJWA5tKtU3A5WV5NXB7NjwMzI2IhV1vuSSpLR3N0UfEIHA+sB04KzMPQOOPAXBmqbYI2Nf0spFSJkmaAW0HfUS8Fvg68JHM/OmJqrYoyxbvty4idkTEjtHR0XabIUnqUFtBHxGvohHyd2TmN0rx82NTMuX5YCkfAZY0vXwxsP/Y98zMjZk5lJlDAwMDk22/JGkC7Vx1E8BtwK7M/IemTVuANWV5DbC5qfx95eqbC4FDY1M8kqTp185VNxcBfwY8GRGPlbJPABuAuyNiLfAccEXZdj+wEhgGXgGu6mqLJUkdmTDoM/PfaT3vDrCiRf0Erp5iuyRJXeK9biRN2XjX2Ht9fX/wFgiSVDlH9Oop71IpzTxH9JJUOYNekipn0EtS5Qx6SaqcQS9JlTPoJalyBr0kVc6gl6TKGfSSVDmDXpIqZ9BLUuW8142knvGulv3BEb0kVc6gl6TKOXWjrvB2xFL/ckQvSZUz6CWpcga9JFXOOXpJ087LLqeXQa+OeNJVmn2cupGkyk0Y9BHxxYg4GBFPNZXNj4itEbG7PM8r5RERN0fEcEQ8ERHLe9l4SdLE2hnR/yNw6TFl64FtmbkM2FbWAS4DlpXHOuCW7jRTkjRZEwZ9Zv4r8OIxxauBTWV5E3B5U/nt2fAwMDciFnarsZKkzk12jv6szDwAUJ7PLOWLgH1N9UZK2XEiYl1E7IiIHaOjo5NshiRpIt0+GRstyrJVxczcmJlDmTk0MDDQ5WZIksZMNuifH5uSKc8HS/kIsKSp3mJg/+SbJ0maqsleR78FWANsKM+bm8o/FBF3Am8FDo1N8Wh28Xp5qR4TBn1EfBW4GFgQESPA39AI+LsjYi3wHHBFqX4/sBIYBl4BrupBmyVJHZgw6DPzveNsWtGibgJXT7VRkqTu8ZuxklQ5g16SKudNzST1De9q2RuO6CWpcga9JFXOoJekyhn0klQ5g16SKmfQS1LlvLzyJOB9a6STmyN6SaqcI3pJfc8vUk2NI3pJqpwjekmzliP99jiil6TKGfSSVDmDXpIqZ9BLUuUMekmqnFfdVMRvwEoNXo1zNIN+FjLQJXXCqRtJqpwjekknjZN1SqcnQR8RlwI3AacAX8jMDb34nNo5RSOpG7oe9BFxCvA54A+BEeCRiNiSmd/v9mdJUjd0Oqiabf8C6MWI/gJgODP3AETEncBq4KQPekfoUh1m2xRQL4J+EbCvaX0EeGsPPgfo/Ad+orCdzGskacxksmI6/jj0IuijRVkeVyliHbCurP4sIp5t470XAC+01Yjr26k19ddMQdt9mSXsT/+qqS9QV38WxPVT6suvt1OpF0E/AixpWl8M7D+2UmZuBDZ28sYRsSMzh6bWvP5QU1/A/vSzmvoCdfVnuvrSi+voHwGWRcTSiDgVuBLY0oPPkSS1oesj+sw8HBEfAh6gcXnlFzPz6W5/jiSpPT25jj4z7wfu78FbdzTV0+dq6gvYn35WU1+grv5MS18i87jzpJKkinivG0mqXF8EfUT8dURkRCwo6xERN0fEcEQ8ERHLm+quiYjd5bGmqfx3IuLJ8pqbIyJK+fyI2Frqb42IeT3sx2ci4pnS5m9GxNymbdeWtj0bEe9sKr+0lA1HxPqm8qURsb20+65yYpuIOK2sD5ftg73qTzvGa38/iIglEfFgROyKiKcj4ppS3vKY6OZx18M+nRIR34uIe8t6x8dJp8diD/syNyLuKb8zuyLibbN130TEX5Zj7KmI+GpEnN5X+yYzZ/RB41LMB4AfAwtK2UrgWzSuyb8Q2F7K5wN7yvO8sjyvbPsu8Lbymm8Bl5XyTwPry/J64Poe9uUdwJyyfP3YZwFvBh4HTgOWAj+kcaL6lLJ8DnBqqfPm8pq7gSvL8ueBvyjLHwQ+X5avBO6awX03bvv74QEsBJaX5dcBPyj7ouUx0c3jrod9+ijwFeDeyRwnkzkWe9iXTcCfl+VTgbmzcd/Q+JLoj4BXN+2T9/fTvumHX8Z7gHOBvRwJ+luB9zbVeZbGL+17gVubym8tZQuBZ5rKf1lv7LVleSHw7DT164+AO8rytcC1TdseKAfg24AHmsqvLY+g8YWQsT8av6w39tqyPKfUixnady3bP9PH1Anau5nGPZhaHhPdPO561P7FwDbgEuDeyRwnnR6LPezL62mEYxxTPuv2DUfuBjC//KzvBd7ZT/tmRqduIuLdwE8y8/FjNrW6jcKiCcpHWpQDnJWZBwDK85ld68CJfYDGKAI6788ZwMuZefiY8qPeq2w/VOrPhPHa33fKP4/PB7Yz/jHRzeOuF24EPgb8oqxP5jjptI+9cg4wCnypTEV9ISJewyzcN5n5E+DvgeeAAzR+1jvpo33T8/vRR8S/AL/WYtN1wCdoTHcc97IWZTmJ8q47UX8yc3Opcx1wGLhj7GXjtK/VH9qJ+jNtfW1DP7VlXBHxWuDrwEcy86cnmKrt5+PuXcDBzNwZERePFZ/g8ztt83jHYq/MAZYDH87M7RFxE42pmvH0876ZR+PGjUuBl4GvAZed4POnfd/0POgz8+2tyiPit2n8YB4vv3iLgUcj4gLGv43CCHDxMeUPlfLFLeoDPB8RCzPzQEQsBA72oj9jysmgdwErsvw7ixPfFqJV+QvA3IiYU/7iN9cfe6+RiJgDvAF4cfI9mpK2bncxkyLiVTRC/o7M/EYpHu+Y6OZx120XAe+OiJXA6TSmPm6k8+Ok02OxV0aAkczcXtbvoRH0s3HfvB34UWaOAkTEN4DfpZ/2Ta/m4CYxz7WXI3P0qzj6xMt3S/l8GvN688rjR8D8su2RUnfsxMvKUv4Zjj658+ke9uFSGrdjHjim/C0cfZJlD40TLHPK8lKOnGR5S3nN1zj6RM4Hy/LVHH0i5+4Z3Gfjtr8fHuVYuB248ZjylsdEN4+7HvfrYo6cjO3oOJnMsdjDfvwb8Kay/MmyX2bdvqFxd96ngV8tn7UJ+HA/7ZsZ/2Vs+mHt5UjQB43/vOSHwJPAUFO9DwDD5XFVU/kQ8FR5zWc58mWwM2icwNpdnuf3sA/DNObSHiuPzzdtu6607Vmazv7TuJrgB2XbdU3l59C4amC4HDCnlfLTy/pw2X7ODO+3lu3vhwfwezT+iftE0z5ZOd4x0c3jrsf9upgjQd/xcdLpsdjDfpwH7Cj7559oBPWs3DfAp4Bnyud9mUZY982+8ZuxklS5vvjClCSpdwx6SaqcQS9JlTPoJalyBr0kVc6gl6TKGfSSVDmDXpIq9/8oT81we+AN6gAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"plt.hist(incomes, 50)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now compute the median - since we have a nice, even distribution it too should be close to 27,000:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"27092.39320043316"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.median(incomes)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we'll add Jeff Bezos into the mix. Darn income inequality!"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"incomes = np.append(incomes, [1000000000])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The median won't change much, but the mean does:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"27092.571036571815"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.median(incomes)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"126921.0284963053"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.mean(incomes)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Mode"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, let's generate some fake age data for 500 people:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([44, 51, 62, 80, 36, 61, 35, 64, 40, 45, 50, 33, 20, 63, 38, 87, 18,\n",
" 35, 21, 56, 57, 61, 52, 51, 45, 84, 40, 59, 22, 71, 25, 49, 19, 33,\n",
" 57, 43, 48, 65, 47, 38, 39, 59, 74, 49, 54, 59, 21, 79, 38, 80, 86,\n",
" 23, 26, 86, 55, 23, 34, 36, 48, 83, 77, 55, 22, 42, 78, 35, 53, 35,\n",
" 73, 87, 50, 85, 42, 52, 86, 31, 27, 22, 46, 20, 32, 63, 45, 61, 24,\n",
" 20, 45, 21, 40, 83, 66, 33, 49, 43, 78, 35, 83, 37, 48, 49, 81, 20,\n",
" 43, 26, 41, 24, 43, 69, 42, 39, 50, 57, 70, 31, 66, 82, 43, 19, 64,\n",
" 65, 29, 21, 35, 88, 43, 56, 56, 25, 19, 60, 57, 48, 74, 22, 65, 74,\n",
" 76, 42, 89, 59, 51, 19, 43, 80, 37, 20, 88, 30, 26, 37, 64, 28, 30,\n",
" 65, 21, 50, 54, 44, 86, 64, 81, 33, 67, 49, 55, 82, 64, 20, 47, 89,\n",
" 24, 18, 23, 32, 59, 86, 50, 25, 23, 23, 40, 41, 61, 38, 20, 20, 71,\n",
" 43, 30, 83, 83, 73, 49, 73, 20, 41, 84, 43, 39, 63, 25, 45, 33, 84,\n",
" 46, 70, 74, 57, 23, 37, 20, 78, 26, 25, 79, 38, 46, 53, 45, 27, 35,\n",
" 48, 83, 54, 50, 38, 23, 66, 55, 33, 62, 85, 36, 18, 22, 28, 22, 57,\n",
" 26, 62, 22, 23, 31, 35, 85, 48, 26, 41, 38, 25, 69, 44, 79, 26, 54,\n",
" 23, 19, 66, 41, 76, 60, 42, 39, 76, 27, 69, 57, 44, 86, 18, 42, 76,\n",
" 29, 86, 29, 56, 82, 65, 72, 68, 44, 83, 77, 37, 85, 34, 58, 86, 68,\n",
" 68, 36, 74, 78, 56, 19, 22, 60, 53, 30, 89, 41, 57, 60, 58, 84, 48,\n",
" 29, 85, 44, 25, 58, 69, 29, 80, 36, 34, 65, 58, 59, 66, 66, 42, 79,\n",
" 21, 72, 27, 54, 57, 55, 83, 37, 30, 63, 79, 56, 25, 35, 37, 31, 81,\n",
" 51, 63, 49, 31, 60, 77, 87, 20, 61, 49, 40, 25, 31, 87, 81, 33, 32,\n",
" 40, 88, 46, 54, 18, 62, 50, 44, 64, 83, 65, 47, 36, 45, 24, 36, 86,\n",
" 28, 57, 82, 45, 42, 61, 70, 35, 52, 21, 45, 81, 77, 52, 81, 48, 26,\n",
" 55, 69, 61, 84, 70, 71, 63, 79, 77, 25, 23, 43, 30, 21, 49, 31, 78,\n",
" 73, 53, 28, 48, 83, 31, 19, 62, 71, 20, 26, 23, 32, 77, 73, 58, 31,\n",
" 39, 20, 59, 39, 25, 51, 78, 62, 22, 48, 84, 75, 71, 24, 54, 68, 30,\n",
" 18, 30, 44, 51, 53, 25, 67, 39, 87, 73, 23, 67, 62, 83, 69, 59, 33,\n",
" 83, 70, 20, 51, 50, 36, 83, 48, 44, 89, 76, 64, 62, 65, 35, 73, 65,\n",
" 40, 75, 62, 20, 72, 35, 82, 35, 76, 72, 49, 55, 66, 88, 24, 72, 68,\n",
" 65, 79, 57, 76, 41, 68, 58])"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ages = np.random.randint(18, high=90, size=500)\n",
"ages"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ModeResult(mode=array([20]), count=array([15]))"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from scipy import stats\n",
"stats.mode(ages)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 1
}