{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Sagan 2020: Python Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a brief overview of the Jupyter notebook interface and the Python3 language. The functionality demonstrated here is quite basic and should be familiar to you. If you are struggling with any of these items, please follow the linked tutorials for more information, and/or follow up in the workshop Slack channel." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Jupyter Overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each entry in a Jupyter notebook is called a cell. \n", "\n", "You can run the contents of individual cells by selecting them and pressing **CTRL-Enter** (or **Shift-Enter** on a Mac).\n", "\n", "You can run the contents of individual cells AND add a new cell underneath by pressing **ALT-Enter** (or **Option-Enter** on a Mac).\n", "\n", "You can delete a cell by selecting it and pressing **d** twice.\n", "\n", "You can add a text block like this cell by pressing **Esc-m** (or use the dropdown at the top and change from Code to Markdown). This is useful for adding in notes that you want to remember.\n", "\n", "You can find more information on Jupyter notebooks here: \n", "https://jupyter-notebook.readthedocs.io/en/stable/\n", "\n", "As you work through the sections below, run each cell to see the results.\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Python Overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The core python programming language provides a small number of built-in functions. You can see a description of them here: https://docs.python.org/3/library/functions.html . Most of the high-level functions you will want for numerical data analysis are not built-in. You access these by importing *packages*. 
\n", "\n", "The Anaconda distribution you installed included a large library of packages, but in order to use them you need to first import them into your current programming environment.\n", "\n", "You can import an entire package like this:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import astropy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This gives you access to the *astropy* package, which provides numerous astronomical utilities. There are subpackages within *astropy*, such as *constants*, which contains useful astronomical constants. You import the subpackage like this:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Name = Speed of light in vacuum\n", " Value = 299792458.0\n", " Uncertainty = 0.0\n", " Unit = m / s\n", " Reference = CODATA 2014\n", "299792458.0\n" ] } ], "source": [ "import astropy.constants\n", "c = astropy.constants.c #Retrieve the speed of light, and store in variable 'c'\n", "print(c) #Print a variable using the print function\n", "print(c.value) #Get just the value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You will frequently see someone do imports like this:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from datetime import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This says *from the datetime package, import everything directly into the current environment.* **This is very bad, and you should not do it.** If two different packages have functions in them that are named the same, and if you import both of them like this, then they will overwrite each other and your code will be confusing. 
\n", "\n", "*There are a few exceptions to this rule, but you will likely not encounter them in this workshop.*\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mathematics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To calculate things we need to import a mathematical package. *numpy* is the standard." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that I am both importing the 'numpy' package, and changing its name (in this session) to 'np'. This is because I am lazy and typing 'np' is faster than typing 'numpy'. (np is also the canonical abbreviation for numpy, and is what you will see in almost all online tutorials.)\n", "\n", "You can find the documentation on the numpy package here: [https://numpy.org/devdocs/reference/index.html](https://numpy.org/devdocs/reference/index.html) . I found this by searching for 'numpy manual'.\n", "\n", "Numpy allows you to create vector and matrix arrays:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "v = np.array([1,2,3,4])\n", "v" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that I did not use print(v). Instead I asked python to give me information about the object itself. 
It tells me that it is of *type* array, and has elements [1, 2, 3, 4]." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 2 3 4]\n" ] } ], "source": [ "print(v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Print just shows the contents (formatted like a python list, but without the commas).\n", "\n", "Now make a bigger array:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "v = np.arange(0,10000,1) #This arange call creates a long array from 0 to 9999 (the stop value, 10000, is excluded) with step size = 1" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 2, ..., 9997, 9998, 9999])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "v" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that python doesn't display the entire array, only the beginning and end.\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Uniform Distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use numpy to generate a set of 100 random numbers drawn uniformly from -10 to 10:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "samples = np.random.uniform(-10,10,100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's examine those samples:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 2.63781862 0.86063598 -1.88745636 6.7147289 -9.12525893 2.35907358\n", " -3.00063273 -6.04347342 5.83474723 9.27988138 5.21427148 -0.6442014\n", " -6.00675335 0.11148202 -7.35116533 8.4097447 8.81774668 2.87126134\n", " -4.50363081 -3.74912127 -6.32753074 -2.23652499 3.34880466 -6.28131757\n", " 1.07192016 4.80567687 1.90609829 0.46704396 0.52071202 2.92681156\n", " 2.01257686 3.83776863 -5.81175492 -1.28601643 
-7.50050052 3.57829466\n", " -0.81811899 -9.47628633 -5.82451086 -7.73421009 -7.57495481 -2.38421499\n", " -5.16967593 6.73106699 0.18648938 -2.93963042 -8.75835466 4.50331816\n", " -7.60162745 -0.1459412 -4.10564979 8.64750158 -8.81957247 5.64268331\n", " -8.16179188 3.53340838 2.20299483 -8.94016824 -7.25821202 8.84234342\n", " 0.51549915 -6.07523159 -6.81028142 -8.34091069 6.547428 6.91971674\n", " -4.89265694 7.21408641 -8.11548152 -8.25617521 -0.18475636 -3.52180417\n", " -6.04489938 4.06440391 3.3675582 -7.47972838 6.83738935 -3.80050618\n", " 1.10642165 4.30795846 -5.20965377 2.68744651 -0.02726845 7.94760136\n", " -4.39067086 -7.09457958 -5.32829454 8.04851677 0.21831475 4.3188892\n", " 9.48269771 -1.7232098 2.85934262 -4.84827442 -1.93293992 -3.47165077\n", " 6.43642274 3.14667705 9.67503316 0.89812918]\n" ] } ], "source": [ "print(samples)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a histogram of those samples, with one bin per integer value:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "bins = np.linspace(-10,10,21)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[-10. -9. -8. -7. -6. -5. -4. -3. -2. -1. 0. 1. 2. 3.\n", " 4. 5. 6. 7. 8. 9. 10.]\n" ] } ], "source": [ "print(bins)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "histogram, bins = np.histogram(samples, bins=bins)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2 7 8 7 5 5 5 3 4 5 8 3 8 6 5 3 6 2 5 3]\n" ] } ], "source": [ "print(histogram)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Why did we use linspace instead of array? 
Read the documentation on numpy.linspace." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can use the matplotlib package to plot this distribution. First import it, and configure it to draw within the notebook." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*%matplotlib inline* is known as a \"magic\" function. It tells matplotlib to show plots inline in the notebook interface." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We want the plot to show bin centers halfway between the edges of each bin." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "bin_centers = 0.5*(bins[1:]+bins[:-1])" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[-9.5 -8.5 -7.5 -6.5 -5.5 -4.5 -3.5 -2.5 -1.5 -0.5 0.5 1.5 2.5 3.5\n", " 4.5 5.5 6.5 7.5 8.5 9.5]\n" ] } ], "source": [ "print(bin_centers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now set up the plot." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,6))\n", "plt.plot()\n", "plt.plot(bin_centers, histogram)\n", "plt.ylim(0,np.max(histogram*1.2))\n", "plt.xlabel('Value')\n", "plt.ylabel('Frequency (N)')\n", "plt.title('Uniform Distribution')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That doesn't look very uniform. What happens if we increase the number of points? Change the size of the sample in the original np.random.uniform call and re-run the notebook steps." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When plotting histograms, it is useful for the plot to not play 'connect-the-dots'. Rather, we want *step* style plots." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,4))\n", "plt.plot()\n", "plt.step(bin_centers, histogram)\n", "plt.ylim(0,np.max(histogram*1.2))\n", "plt.xlabel('Value')\n", "plt.ylabel('Frequency (N)')\n", "plt.title('Uniform Distribution')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian Distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now change from a uniform distribution to a normal (or Gaussian) distribution" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "samples = np.random.normal(0,2.5,100000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the function parameters for 'normal' are different than for uniform. Use the numpy manual to figure out what they are." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[-0.64934609 1.00487234 -0.45274114 ... 0.88182832 -3.66914704\n", " -3.32787441]\n" ] } ], "source": [ "print(samples)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "histogram, bins = np.histogram(samples, bins=bins)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10,4))\n", "plt.plot()\n", "plt.step(bin_centers, histogram)\n", "plt.ylim(0,np.max(histogram*1.2))\n", "plt.xlabel('Value')\n", "plt.ylabel('Frequency (N)')\n", "plt.title('Uniform Distribution')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Experiment with changing the standard deviation and center location of the distribution. You may need to change the range of the bins. Insert the required code into the notebook above the plotting section." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This has been a simple introductory tutorial to Python and Jupyter notebooks. If you want more information about the packages we used, see the following on-line tutorials:\n", "\n", "*Python*: http://introtopython.org/. The *Python Essentials* menu contains a list of tutorials that discuss variables, functions, syntax, and code structure.\n", "\n", "*Matplotlib*: the tutorials at: https://matplotlib.org/tutorials/introductory/pyplot.html\n", "\n", "You can find more information on *Jupyter notebooks* here: \n", "https://jupyter-notebook.readthedocs.io/en/stable/" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 4 }