{ "metadata": { "name": "", "signature": "sha256:8f74bbd536ca59518a5eda98693b64a256c8585075674131ded9bfd3b5b0c235" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# An introduction to parallel programming with IPython\n", "\n", "For more information on how to do parallel computing with IPython, you can visit our [documentation](http://ipython.org/ipython-doc/rel-1.0.0/parallel/index.html) or visit our [in-depth parallel tutorial](http://minrk.github.io/scipy-tutorial-2011).\n", "\n", "\n", "## Parallel client\n", "\n", "First we load the usual plotting/numerics we'll use always" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%pylab inline\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now we connect to our parallel cluster, which we've already started:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from IPython.parallel import Client\n", "\n", "# Uncomment one of the next two lines to run depending on where this is executing:\n", "\n", "c = Client() # use this if running locally\n", "# c = Client(packer='pickle') # use this if running on Amazon\n", "\n", "clust = c[:]\n", "nnod = len(clust)\n", "print \"My cluster has\", nnod, \"nodes\"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "print clust.apply_sync(lambda : \"Hello, World\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "apply_res = clust.apply_async(lambda : \"Hello, World\")\n", "print apply_res.get()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "print clust.map_sync(lambda x: x**2, [1,2,3,4])" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "map_res = clust.map_async(lambda x: x**2, [1,2,3,4])\n", "print map_res.get()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Parallel magic" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%px print \"Hello from independent Engine\"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%%px\n", "import os\n", "print \"I am process:\", os.getpid()\n", "print" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pi via simple Monte Carlo\n", "\n", "![Monte Carlo Pi](http://docs.picloud.com/_images/basic_example_monte.gif)\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def sample_circle(n):\n", " import numpy as np\n", " print 'n', n\n", " m = 0\n", " for i in xrange(int(n)):\n", " p = np.random.rand(2)\n", " if sum(p**2.) <= 1.:\n", " m += 1\n", " return m\n", "\n", "\n", "def brute_pi(n):\n", " m = sample_circle(n)\n", " return 4.* m/n\n", "\n", "\n", "def cluster_pi(n):\n", " m = sum(clust.map_sync(sample_circle, nnod*[n/nnod]))\n", " return 4.*m/n\n", "\n", "\n", "def err(npi):\n", " return 100*abs(np.pi-npi)/np.pi" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We make the local versions available on all the nodes, so we can compare two approaches" ] }, { "cell_type": "code", "collapsed": false, "input": [ "clust.push(dict(sample_circle=sample_circle, brute_pi=brute_pi));" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's start with a local computation. We pick the total number of samples to take:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "n = 1e6" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%time bpi = brute_pi(n)\n", "print \"\\nError: %.2f%%\" % err(bpi)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, let's do it on the cluster cluster by calling the *exact* same code above, but manually adjusting the value of `n` on each node:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "clust['n']= n/nnod" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%%px\n", "%time cpi = brute_pi(n)\n", "print" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And manually aggregating the results:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "cpi = np.mean(clust['cpi'])\n", "print \"Error: %.2f%%\" % err(cpi)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, we can call the cluster version locally:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%time cpi = cluster_pi(n)\n", "print \"Error: %.2f%%\" % err(cpi)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing the process" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def sample_circle2(n):\n", " import numpy as np\n", " print 'n', n\n", " \n", " points = np.random.rand(2, n)\n", " dist2 = (points**2).sum(axis=0)\n", " inside_mask = dist2 <= 1.0\n", " return inside_mask.sum(), points\n", "\n", "\n", "def brute_pi2(n):\n", " m, p = sample_circle2(n)\n", " return 4.* m/n, p" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For ease of visualization, let's do it with fewer points" ] }, { "cell_type": "code", "collapsed": false, "input": [ "nn = 1000\n", "%time bpi, p = brute_pi2(nn)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "from matplotlib.patches import Wedge\n", "w = Wedge((0,0), 1.0, 0, 90, alpha=0.2, fc='g')\n", "fig, ax = plt.subplots()\n", "ax.add_patch(w)\n", "ax.plot(p[0], p[1], 'o');\n", "plt.axis('scaled');\n", "print 'Total points:', len(p[0])" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "def cluster_pi2(n):\n", " counts_pts = clust.map_sync(sample_circle2, nnod*[n/nnod])\n", " m = sum(c[0] for c in counts_pts)\n", " p = [c[1] for c in counts_pts]\n", " return 4.*m/n, p" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%time cpi2, cp = cluster_pi2(nn)\n", "print \"Error: %.2f%%\" % err(cpi2)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "from matplotlib.patches import Wedge\n", "w = Wedge((0,0), 1.0, 0, 90, alpha=0.2, fc='g')\n", "fig, ax = plt.subplots()\n", "ax.add_patch(w)\n", "for p in cp:\n", " ax.plot(p[0], p[1], 'o', alpha=0.6);\n", "plt.axis('scaled');\n", "print 'Total points:', len(p[0]),'x',len(cp),'=',len(p[0])*len(cp)" ], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }