<p><strong>Polynomial Curve Fitting</strong><br />
Sameen Islam, 2020-12-20, <a href="https://sameen.dev/ml/2020/12/20/Poly-Curve-Fitting">https://sameen.dev/ml/2020/12/20/Poly-Curve-Fitting</a></p>
<p><i>This post takes a deeper dive into the subject of my previous blog post, <a href="/ml/2020/02/11/Bias-variance-tradeoff.html">Bias-Variance Tradeoff in Machine Learning models</a>. Where the previous post used the <a href="https://scikit-learn.org/stable/">sklearn library</a>, I will now use the underlying mathematics to implement the same model from scratch.</i></p>
<p>In machine learning, we assume that data arises from some unknown underlying function, and we try to estimate this function from observations (training data). In the example below, we pretend to know the underlying function \(\sin(2\pi x)\) and generate 10 data points from it with some added noise; these represent our observations. Without knowing anything about the green function in the plot, how can we fit a new function that approximates it?</p>
<p align="center">
<img src="/assets/curve_fit_1.png" alt="Plot showing sin(2 pi x) with 10 training data points normally distributed around the function." width="500" />
</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="o">%</span><span class="n">matplotlib</span> <span class="n">inline</span>
<span class="k">def</span> <span class="nf">create_sample_data</span><span class="p">(</span><span class="n">fun</span><span class="p">,</span> <span class="n">size</span><span class="p">,</span> <span class="n">sigma</span><span class="p">):</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">size</span><span class="p">)</span>
    <span class="n">t</span> <span class="o">=</span> <span class="n">fun</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">+</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">normal</span><span class="p">(</span><span class="n">scale</span><span class="o">=</span><span class="n">sigma</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">x</span><span class="p">.</span><span class="n">shape</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">x</span><span class="p">,</span> <span class="n">t</span>
<span class="k">def</span> <span class="nf">fun</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="n">sin</span><span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">pi</span> <span class="o">*</span> <span class="n">x</span><span class="p">)</span>
<span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span> <span class="o">=</span> <span class="n">create_sample_data</span><span class="p">(</span><span class="n">fun</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">)</span>
<span class="n">x_test</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
<span class="n">y_test</span> <span class="o">=</span> <span class="n">fun</span><span class="p">(</span><span class="n">x_test</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">,</span> <span class="n">facecolor</span><span class="o">=</span><span class="s">"none"</span><span class="p">,</span> <span class="n">edgecolor</span><span class="o">=</span><span class="s">"b"</span><span class="p">,</span>
            <span class="n">s</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">"training data"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="s">"g"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">r"$\sin(2\pi x)$"</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">legend</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span></code></pre></figure>
<p>We can achieve this approximation with a polynomial model whose weights, \(\boldsymbol{w}\), need to be estimated. There are many possible choices of weights, but the best one will make the polynomial resemble our underlying \(\sin(2\pi x)\) function. The order of a polynomial is its highest power. For example, the expression \(x^2 + 2x + 1\) is of order 2, while \(x^3 + 2x + 1\) is of order 3. Increasing the order \(M\) of a polynomial increases its ‘degrees of freedom’: a polynomial of order 0 is a constant (a horizontal line), one of order 1 is a sloped straight line, and higher orders can curve in increasingly different ways, as we will see later. Here, we use the function \(y(x, \boldsymbol{w})\), where</p>
\[y(x, \boldsymbol{w}) = w_0 + w_1x + w_2x^2 + ... + w_Mx^M
= \sum_{j=0}^{M}{w_jx^j}\]
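<p>As a minimal sketch of the formula above, the polynomial can be evaluated directly from a weight vector. The helper name <code class="language-plaintext highlighter-rouge">poly_eval</code> and the example weights are made up for illustration and are not part of the original code.</p>

```python
import numpy as np

def poly_eval(x, w):
    """Evaluate y(x, w) = sum_j w[j] * x**j for a scalar or array x."""
    x = np.asarray(x, dtype=float)
    powers = np.arange(len(w))                # exponents 0, 1, ..., M
    # Broadcasting builds one row of powers per input value.
    return (x[..., None] ** powers) @ np.asarray(w, dtype=float)

# Hypothetical weights for the order-2 polynomial 1 + 2x + x^2
w_example = [1.0, 2.0, 1.0]
y_example = poly_eval([0.0, 1.0, 2.0], w_example)
print(y_example)  # [1. 4. 9.]
```

<p>The weights are stored lowest power first, so <code class="language-plaintext highlighter-rouge">w[0]</code> plays the role of \(w_0\).</p>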
<p>Linear least squares regression is a method of fitting a curve which minimises the sum of squared errors between our function and the training datapoints. This means that it adjusts the weights of our polynomial so that it passes as close to the blue datapoints as possible. More formally, the error is defined as:</p>
\[E(\boldsymbol{w}) = \sum_{n=1}^{N}{\{y(x_n, \boldsymbol{w}) - t_n\}}^2\]
<p>where \(t_n\) is the target value of the \(n\)-th training datapoint and \(N\) is the number of training datapoints. Below, we run an experiment with least squares models of varying polynomial order to see the effect of increasing model complexity (in this context, a more complex model is a higher-order polynomial).</p>
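<p>Before looking at the experiment, here is a small sketch of how the error \(E(\boldsymbol{w})\) defined above could be computed from predictions and targets. The function name and the numbers are illustrative assumptions only.</p>

```python
import numpy as np

def sum_of_squares_error(y_pred, t):
    """E(w) = sum_n (y(x_n, w) - t_n)^2, without a 1/2 factor, as in the text."""
    residuals = np.asarray(y_pred, dtype=float) - np.asarray(t, dtype=float)
    return float(np.sum(residuals ** 2))

# Made-up predictions and targets, purely for illustration
y_pred = [0.0, 0.5, 1.0]
t = [0.1, 0.5, 0.8]
error = sum_of_squares_error(y_pred, t)  # (-0.1)^2 + 0^2 + 0.2^2 = 0.05
```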
<p align="center">
<img src="/assets/curve_fit_2.png" alt="A model of low complexity is a straight line and does not pass through any training data. One with higher complexity passes through every training datapoint." width="500" />
</p>
<p>The top left plot shows a model of <code class="language-plaintext highlighter-rouge">order=0</code>, which results in a horizontal line. This model is too simple to learn from our training data: its single weight can only shift the line up or down, so it cannot curve to follow the training datapoints. The next model (upper right) with <code class="language-plaintext highlighter-rouge">order=1</code> is not appropriate either, because a sloped straight line still cannot follow the sine curve. The more interesting examples are the lower two plots; the lower left <code class="language-plaintext highlighter-rouge">order=3</code> shows a good fit, while the lower right <code class="language-plaintext highlighter-rouge">order=9</code> one shows an overfitted model.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="nn">functools</span>
<span class="kn">import</span> <span class="nn">itertools</span>
<span class="k">def</span> <span class="nf">transform</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">degree</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">x</span><span class="p">.</span><span class="n">ndim</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
        <span class="n">x</span> <span class="o">=</span> <span class="n">x</span><span class="p">[:,</span> <span class="bp">None</span><span class="p">]</span>
    <span class="n">x_t</span> <span class="o">=</span> <span class="n">x</span><span class="p">.</span><span class="n">T</span>
    <span class="n">features</span> <span class="o">=</span> <span class="p">[</span><span class="n">np</span><span class="p">.</span><span class="n">ones</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="p">))]</span>
    <span class="k">for</span> <span class="n">d</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">degree</span> <span class="o">+</span> <span class="mi">1</span><span class="p">):</span>
        <span class="k">for</span> <span class="n">items</span> <span class="ow">in</span> <span class="n">itertools</span><span class="p">.</span><span class="n">combinations_with_replacement</span><span class="p">(</span><span class="n">x_t</span><span class="p">,</span> <span class="n">d</span><span class="p">):</span>
            <span class="n">features</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">functools</span><span class="p">.</span><span class="nb">reduce</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">:</span> <span class="n">x</span> <span class="o">*</span> <span class="n">y</span><span class="p">,</span> <span class="n">items</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="n">asarray</span><span class="p">(</span><span class="n">features</span><span class="p">).</span><span class="n">T</span></code></pre></figure>
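<p>For one-dimensional input, the <code class="language-plaintext highlighter-rouge">transform</code> function above should produce a matrix whose columns are increasing powers of \(x\). As a rough illustration of that shape, the sketch below builds the same kind of matrix with NumPy's <code class="language-plaintext highlighter-rouge">np.vander</code>, which is a swapped-in shortcut rather than the method used in this post. The input values are made up.</p>

```python
import numpy as np

x_demo = np.array([0.0, 0.5, 1.0])  # three made-up inputs
degree = 3

# Columns are x**0, x**1, ..., x**degree (increasing powers).
X_demo = np.vander(x_demo, degree + 1, increasing=True)

print(X_demo.shape)  # (3, 4)
# The row for x = 0.5 holds the values 1, 0.5, 0.25 and 0.125.
print(X_demo[1])
```

<p>Each row corresponds to one datapoint and each column to one weight, so an order \(M\) polynomial gives \(M+1\) columns.</p>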
<p>Overfitting is when our model fits so closely to the data that it cannot generalise well when it comes to making predictions on unseen data.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">class</span> <span class="nc">LinearRegression</span><span class="p">():</span>
    <span class="c1"># linear least squares regression
</span>    <span class="k">def</span> <span class="nf">fit</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">X</span><span class="p">,</span> <span class="n">t</span><span class="p">):</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">w</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linalg</span><span class="p">.</span><span class="n">inv</span><span class="p">(</span><span class="n">X</span><span class="p">.</span><span class="n">T</span> <span class="o">@</span> <span class="n">X</span><span class="p">)</span> <span class="o">@</span> <span class="n">X</span><span class="p">.</span><span class="n">T</span> <span class="o">@</span> <span class="n">t</span>
        <span class="n">t_hat</span> <span class="o">=</span> <span class="n">X</span> <span class="o">@</span> <span class="bp">self</span><span class="p">.</span><span class="n">w</span>  <span class="c1"># fitted values on the training set (not used further)</span>
    <span class="k">def</span> <span class="nf">predict</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">X</span><span class="p">):</span>
        <span class="n">y</span> <span class="o">=</span> <span class="n">X</span> <span class="o">@</span> <span class="bp">self</span><span class="p">.</span><span class="n">w</span>
        <span class="k">return</span> <span class="n">y</span></code></pre></figure>
<p>We compute the linear least squares regression model using the <a href="https://en.wikipedia.org/wiki/Moore–Penrose_inverse">Moore-Penrose pseudo-inverse</a> of matrix \(\boldsymbol{X}\):</p>
\[\boldsymbol{w} = (\boldsymbol{X}^T \boldsymbol{X})^{-1} \boldsymbol{X}^T \boldsymbol{t}\]
<p>where \(\boldsymbol{X}\) is the \(N \times (M+1)\) design matrix (one column per weight \(w_0, \ldots, w_M\)) with \(N \geq M+1\) and full column rank (i.e. <code class="language-plaintext highlighter-rouge">rank=M+1</code>), and \(\boldsymbol{t}\) is the vector of targets. Recall that the ordinary inverse of a matrix \(\boldsymbol{X}\) exists only if \(\boldsymbol{X}\) is square with full rank. Since our design matrix is generally not square, we use the pseudo-inverse instead. Note that when \(\boldsymbol{X}\) is square and invertible, the pseudo-inverse is equal to the ordinary inverse.</p>
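<p>As a sanity check on the formula above, the sketch below compares the explicit normal-equation solution with two NumPy routines that solve the same least squares problem, <code class="language-plaintext highlighter-rouge">np.linalg.pinv</code> and <code class="language-plaintext highlighter-rouge">np.linalg.lstsq</code>. The seed, the order 3 design matrix and the variable names are assumptions made for this example; the SVD-based routines are generally considered the more numerically robust choice when \(\boldsymbol{X}^T\boldsymbol{X}\) is badly conditioned.</p>

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, chosen arbitrarily
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.25, size=x.shape)

X = np.vander(x, 4, increasing=True)      # order-3 design matrix, shape (10, 4)

w_normal = np.linalg.inv(X.T @ X) @ X.T @ t      # formula from the text
w_pinv = np.linalg.pinv(X) @ t                   # SVD-based pseudo-inverse
w_lstsq, *_ = np.linalg.lstsq(X, t, rcond=None)  # dedicated least squares solver

# All three routes should agree up to floating point error.
print(np.allclose(w_normal, w_pinv), np.allclose(w_normal, w_lstsq))
```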
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">degree</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">9</span><span class="p">]):</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
    <span class="n">X_train</span> <span class="o">=</span> <span class="n">transform</span><span class="p">(</span><span class="n">x_train</span><span class="p">,</span> <span class="n">degree</span><span class="p">)</span>
    <span class="n">X_test</span> <span class="o">=</span> <span class="n">transform</span><span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">degree</span><span class="p">)</span>
    <span class="n">model</span> <span class="o">=</span> <span class="n">LinearRegression</span><span class="p">()</span>
    <span class="n">model</span><span class="p">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">)</span>
    <span class="n">y</span> <span class="o">=</span> <span class="n">model</span><span class="p">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X_test</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">,</span> <span class="n">facecolor</span><span class="o">=</span><span class="s">"none"</span><span class="p">,</span>
                <span class="n">edgecolor</span><span class="o">=</span><span class="s">"b"</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">"Training data"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="s">"g"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">r"True function/$\sin(2\pi x)$"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">c</span><span class="o">=</span><span class="s">"r"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">"Approximated function/fit"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">ylim</span><span class="p">(</span><span class="o">-</span><span class="mf">1.5</span><span class="p">,</span> <span class="mf">1.5</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">annotate</span><span class="p">(</span><span class="s">"M={}"</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">degree</span><span class="p">),</span> <span class="n">xy</span><span class="o">=</span><span class="p">(</span><span class="o">-</span><span class="mf">0.15</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
<span class="n">plt</span><span class="p">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span> <span class="o">=</span> <span class="s">'lower center'</span><span class="p">,</span> <span class="n">bbox_to_anchor</span> <span class="o">=</span> <span class="p">(</span><span class="mf">0.0</span><span class="p">,</span><span class="o">-</span><span class="mf">0.15</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">),</span>
           <span class="n">bbox_transform</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">gcf</span><span class="p">().</span><span class="n">transFigure</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'curve_fit_2'</span><span class="p">,</span> <span class="n">bbox_inches</span><span class="o">=</span><span class="s">'tight'</span><span class="p">)</span></code></pre></figure>
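<p>The plots show overfitting visually. One way to put a number on it is to compare the root-mean-square error on the training points with the error against the true function, for each order. The sketch below is self-contained and rests on a few assumptions: it uses <code class="language-plaintext highlighter-rouge">np.vander</code> and <code class="language-plaintext highlighter-rouge">np.linalg.lstsq</code> in place of the <code class="language-plaintext highlighter-rouge">transform</code> and <code class="language-plaintext highlighter-rouge">LinearRegression</code> code above, the seed is arbitrary, and the names <code class="language-plaintext highlighter-rouge">rmse</code> and <code class="language-plaintext highlighter-rouge">errors</code> are hypothetical.</p>

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for repeatability

def fun(x):
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0, 1, 10)
t_train = fun(x_train) + rng.normal(scale=0.25, size=x_train.shape)
x_test = np.linspace(0, 1, 100)
t_test = fun(x_test)            # noise-free targets, as in the post's y_test

def rmse(y, t):
    """Root-mean-square error between predictions y and targets t."""
    return float(np.sqrt(np.mean((y - t) ** 2)))

errors = {}
for degree in [0, 1, 3, 9]:
    X_train = np.vander(x_train, degree + 1, increasing=True)
    X_test = np.vander(x_test, degree + 1, increasing=True)
    w, *_ = np.linalg.lstsq(X_train, t_train, rcond=None)
    # Store (training error, test error) for this polynomial order.
    errors[degree] = (rmse(X_train @ w, t_train), rmse(X_test @ w, t_test))
    print(degree, errors[degree])
```

<p>With 10 training points, the order 9 polynomial has enough weights to pass through every point, so its training error is essentially zero. A test error that stays well above the training error is the usual sign of an overfitted model.</p>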
<h2 id="summary">Summary</h2>
<p>In this article, we discussed how data from the real world can be thought of as generated by an underlying function, which we then set out to approximate by adjusting the weights of a polynomial function. We also explored the impact of changing the order of this polynomial on our model and found that a delicate balance is required to arrive at a good, generalised model.</p>
<hr />
<p>Citations:
<br />
[1] Bishop, C. M. (2007), Pattern Recognition and Machine Learning (Information Science and Statistics), Springer.
<br />
[2] PRML algorithms implemented in Python (<a href="https://github.com/ctgk/PRML">https://github.com/ctgk/PRML</a>).</p>
<p><strong>Geolocation and mapping in apps</strong><br />
2020-07-14, <a href="https://sameen.dev/mobile/app/flutter/2020/07/14/flutter-google-maps">https://sameen.dev/mobile/app/flutter/2020/07/14/flutter-google-maps</a></p>
<p>During this lockdown, I wanted to play with new and unfamiliar technology. Enter Flutter, Google’s new framework for building cross-platform mobile applications.</p>
<p>In many ways, Flutter feels like frontend development reimagined. Gone are HTML markup and CSS styles. Even JavaScript has been done away with in Flutter land.</p>
<p>Instead, what we have is a modern, object-oriented language called Dart which unifies the disparate aspects of traditional frontend development. Flutter also offers hot reload, which shaves development time significantly because code changes show up in the running app without a full recompile.</p>
<p>Although young, Flutter seems to be a leap forward in the world of mobile app development. As such, this framework is certainly one to keep an eye on. Personally, I found the quality of documentation and support to be excellent. Google has even made it open source; being able to peek at the implementation of underlying libraries is something I’ve grown to appreciate a lot.</p>
<h1 id="final-result">Final Result</h1>
<p>Today, we will look at how we can build something like the Uber home screen with a map.</p>
<div style="text-align: center">
<div style="padding-right: 80px">
<img src="/assets/geo-maps-demo.gif" width="200" />
</div>
</div>
<p><br /></p>
<h1 id="steps">Steps</h1>
<p>First, let’s break the task down by visual features, going from top to bottom. We will need a:</p>
<ol>
<li>Card which takes up roughly half of the screen height</li>
<li>Loading animation to show before the map loads</li>
<li>Map with a custom colour scheme, centered on the device’s current location</li>
</ol>
<p>Note: I encourage you to follow along and code, but as a reminder, all code is available in my <a href="https://github.com/samisnotinsane/flutter-bites/tree/master/geo_maps">GitHub repository</a>.</p>
<h1 id="constructing-the-bottom-sheet">Constructing the bottom sheet</h1>
<p>We begin with a blank slate:</p>
<p><code class="language-plaintext highlighter-rouge">lib/main.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/material.dart';
void main() {
  runApp(MyApp());
}
class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Geo Maps',
      theme: ThemeData(
        accentColor: Color(0xFFFF6238), // Orange, opacity=1.0
        visualDensity: VisualDensity.adaptivePlatformDensity,
      ),
      home: Scaffold(
        body: Placeholder(),
      ),
    );
  }
}
</code></pre></div></div>
<p>which should look like this:</p>
<div style="text-align: center">
<img src="/assets/flutter-placeholder.png" width="200" />
</div>
<p><br /></p>
<h4 id="where-to">Where to?</h4>
<p>Refer back to our final result and notice how the greeting and recent destination list are placed in a sheet that can be pulled vertically. Also notice that it sits ‘in front of’ (or ‘on top of’) the map, so we will need to use a <code class="language-plaintext highlighter-rouge">Stack</code> (<a href="https://api.flutter.dev/flutter/widgets/Stack-class.html">API Doc</a>) to implement that effect:</p>
<p><code class="language-plaintext highlighter-rouge">lib/main.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/material.dart';
import 'widgets/where_to_sheet.dart'; // import custom widget
void main() {
  runApp(MyApp());
}
class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Geo Maps',
      theme: ThemeData(
        accentColor: Color(0xFFFF6238),
        visualDensity: VisualDensity.adaptivePlatformDensity,
      ),
      home: Scaffold(
        body: Stack( // new code start
          children: <Widget>[
            Placeholder(),
            DraggableScrollableSheet(
              initialChildSize: 0.3,
              minChildSize: 0.1,
              maxChildSize: 0.3,
              builder: (context, scrollController) {
                return Container(
                  padding: EdgeInsets.all(8.0),
                  color: Colors.amberAccent, // contrast color for debug
                  child: WhereToSheet(), // custom widget
                );
              },
            ),
          ],
        ),
      ), // new code end
    );
  }
}
</code></pre></div></div>
<p>To improve the readability of our <code class="language-plaintext highlighter-rouge">build</code> method and to make our code modular, we have created a custom widget which is placed inside the <code class="language-plaintext highlighter-rouge">DraggableScrollableSheet</code> (<a href="https://api.flutter.dev/flutter/widgets/DraggableScrollableSheet-class.html">API Doc</a>). This way, the sheet behaviour code lives with the home screen layout, and the layout of the sheet itself is encapsulated within the custom widget <code class="language-plaintext highlighter-rouge">WhereToSheet</code>.</p>
<p>Since the sheet will have UI elements distributed vertically, it makes sense to use a <code class="language-plaintext highlighter-rouge">Column</code> (<a href="https://api.flutter.dev/flutter/widgets/Column-class.html">API Docs</a>).</p>
<p><code class="language-plaintext highlighter-rouge">lib/widgets/where_to_sheet.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/material.dart';
class WhereToSheet extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return Column(
      crossAxisAlignment: CrossAxisAlignment.center, // centres children horizontally
      children: <Widget>[
        Container( // Handlebar
          height: 5.0,
          width: 50.0,
          decoration: BoxDecoration(
            color: Theme.of(context).dividerColor, // light grey by default on iOS
            borderRadius: BorderRadius.circular(5.0),
          ),
        ),
      ],
    );
  }
}
</code></pre></div></div>
<p>Et voilà!</p>
<p>Obviously the final design doesn’t have an amber background color, but we’re using it for now to make the region the sheet will cover visible. Change the value to <code class="language-plaintext highlighter-rouge">Theme.of(context).cardColor</code> once you’re convinced the sheet exists.</p>
<div style="text-align: center">
<img src="/assets/bottom-sheet-blank.png" width="200" />
</div>
<p><br /></p>
<p><strong>Lost?</strong> Refer back to <a href="https://github.com/samisnotinsane/flutter-bites/commit/552f2aba19b5f62c0605c92b480fc1b5386a9d31">my snapshot</a> to get back on track!</p>
<p>Now, let’s build the ‘Where to?’ button. We know it’s going to be a button and not a TextField because in the original app, tapping this element takes the user to a different screen. We begin with a <code class="language-plaintext highlighter-rouge">FlatButton</code> (<a href="https://api.flutter.dev/flutter/material/FlatButton-class.html">API Doc</a>) and customise it to suit our needs:</p>
<p><code class="language-plaintext highlighter-rouge">lib/widgets/where_to_button.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/material.dart';
class WhereToButton extends StatelessWidget {
  WhereToButton({@required this.onPressedHandler});
  final Function onPressedHandler;
  @override
  Widget build(BuildContext context) {
    return FractionallySizedBox(
      widthFactor: 0.95,
      child: FlatButton(
        padding: EdgeInsets.symmetric(
          horizontal: 8.0,
          vertical: 14.0,
        ),
        child: Align(
          alignment: Alignment.centerLeft,
          child: Text(
            'Where to?',
            style: TextStyle(
              fontSize: 18.0,
              fontWeight: FontWeight.bold,
            ),
          ),
        ),
        onPressed: onPressedHandler,
        color: Colors.grey[300],
        textColor: Colors.grey[900],
      ),
    );
  }
}
</code></pre></div></div>
<p>Despite being an object-oriented language, Dart provides first-class support for functions through the <code class="language-plaintext highlighter-rouge">Function</code> type (<a href="https://api.dart.dev/stable/2.8.4/dart-core/Function-class.html">API Doc</a>). Here, we accept a <code class="language-plaintext highlighter-rouge">Function</code> named <code class="language-plaintext highlighter-rouge">onPressedHandler</code>, which is simply a callback. This way, we’re free to define the button’s appearance now and decide its behaviour later, when we actually instantiate the button. This makes sense because what happens when the user taps the button is outside the scope of this article.</p>
<p>The most interesting component in this widget is the <code class="language-plaintext highlighter-rouge">FractionallySizedBox</code> (<a href="https://api.flutter.dev/flutter/widgets/FractionallySizedBox-class.html">API Doc</a>), which allows us to say “make the width of this button 95% of the width of its parent container” (here, the width of the <code class="language-plaintext highlighter-rouge">Column</code> in <code class="language-plaintext highlighter-rouge">where_to_sheet.dart</code>).</p>
<p>Let’s see the results:</p>
<div style="text-align: center">
<img src="/assets/where_to_btn.png" width="200" />
</div>
<p><br /></p>
<p>Cool, things appear to be taking shape. Now, on to implementing the list of recent destinations.</p>
<h4 id="recent-destinations-list">Recent Destinations List</h4>
<p>Begin by creating an object model for a <code class="language-plaintext highlighter-rouge">Destination</code>. Looking at the UI mockup, we can see there’s a title and an address line - so we pick those as attributes for the object model.</p>
<div style="text-align: center">
<img src="/assets/destination-tile.png" width="200" />
</div>
<p><br /></p>
<p>The idea is that we will create a mock data class which supplies instances of <code class="language-plaintext highlighter-rouge">Destination</code> to <code class="language-plaintext highlighter-rouge">WhereToSheet</code>, which we will modify shortly.</p>
<p><code class="language-plaintext highlighter-rouge">lib/model/destination.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/foundation.dart';
class Destination {
  Destination({@required this.title, @required this.address});
  final String title;
  final String address;
}
</code></pre></div></div>
<p>In our <code class="language-plaintext highlighter-rouge">MockData</code> class, we’re using an <code class="language-plaintext highlighter-rouge">UnmodifiableListView</code> (<a href="https://api.dart.dev/stable/2.8.4/dart-collection/UnmodifiableListView-class.html">API Doc</a>) to get an unmodifiable view of our private list <code class="language-plaintext highlighter-rouge">_destinations</code>. This prevents client code from tampering with our private list by accessing its reference through the getter.</p>
<p><code class="language-plaintext highlighter-rouge">lib/mock/mock_data.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'dart:collection';
import '../model/destination.dart';
class MockData {
  final List<Destination> _destinations = [];
  UnmodifiableListView<Destination> get destinations =>
      UnmodifiableListView(_destinations);
  set addDestination(Destination destination) => _destinations.add(destination);
}
</code></pre></div></div>
<p>Another important observation to make is that the <code class="language-plaintext highlighter-rouge">final</code> keyword doesn’t make our list immutable. It just prevents <em>reassignment</em>, guaranteeing that we’re always going to be working with the same list when interacting with <code class="language-plaintext highlighter-rouge">MockData</code> during runtime.</p>
<p>Now is a good time to convert <code class="language-plaintext highlighter-rouge">WhereToSheet</code> to a <code class="language-plaintext highlighter-rouge">StatefulWidget</code> (<a href="https://api.flutter.dev/flutter/widgets/StatefulWidget-class.html">API Doc</a>) because we will need to make use of the <code class="language-plaintext highlighter-rouge">initState</code> lifecycle method, where we will instantiate some Destination objects and load them up in our <code class="language-plaintext highlighter-rouge">MockData</code>.</p>
<p><code class="language-plaintext highlighter-rouge">lib/widgets/where_to_sheet.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/material.dart';
import 'package:geo_maps/model/destination.dart';
import '../mock/mock_data.dart';
import 'where_to_button.dart';
import 'where_to_recent_dest_list.dart';
class WhereToSheet extends StatefulWidget {
  @override
  _WhereToSheetState createState() => _WhereToSheetState();
}
class _WhereToSheetState extends State<WhereToSheet> {
  final MockData _mockData = MockData();
  List<Destination> _destinations;
  @override
  void initState() {
    super.initState();
    addDummyDestinations();
    _destinations = _mockData.destinations; // get an immutable reference to destinations list.
  }
  // Insert a few destinations to populate our list.
  void addDummyDestinations() {
    // Use setter [addDestination] to push objects into list.
    _mockData.addDestination = Destination(
      title: 'Home',
      address: 'Knightsbridge, London',
    );
    _mockData.addDestination = Destination(
      title: 'Work',
      address: 'Piccadilly, London',
    );
    _mockData.addDestination = Destination(
      title: 'Black Sheep Coffee',
      address: 'Leadenhall St, London',
    );
  }
  @override
  Widget build(BuildContext context) {
    return Column(
      crossAxisAlignment: CrossAxisAlignment.center,
      children: <Widget>[
        Container(
          // Handlebar
          height: 5.0,
          width: 50.0,
          decoration: BoxDecoration(
            color: Theme.of(context).dividerColor,
            borderRadius: BorderRadius.circular(5.0),
          ),
        ),
        Text(
          // Greeting
          'Good morning, Sameen',
          style: TextStyle(
            fontSize: 22.0,
            fontWeight: FontWeight.bold,
          ),
        ),
        Divider(
          color: Theme.of(context).dividerColor,
        ),
        WhereToButton(
          onPressedHandler: () {},
        ),
        WhereToRecentDestList( // Create destination tiles using mock data
          destinations: _destinations,
        ),
      ],
    );
  }
}
</code></pre></div></div>
<p>Let’s take a look at our progress so far:</p>
<div style="text-align: center">
<img src="/assets/uber-iter-1-1.png" width="200" />
</div>
<p><br /></p>
<p>Comparing with our final mockup, we can spot a few discrepancies here:</p>
<ul>
<li>Greeting message does not have enough padding</li>
<li>Recent destination tiles have too much padding</li>
<li>Icon missing background color and shape</li>
</ul>
<p>In the next section, we will bring some polish to our app’s look.</p>
<h4 id="refactoring">Refactoring</h4>
<p>Before modifying the code further, it’s worth looking at how we’ve quickly prototyped the recent destination list (as seen in the screenshot above) using Flutter’s built-in <code class="language-plaintext highlighter-rouge">ListTile</code> (<a href="https://api.flutter.dev/flutter/material/ListTile-class.html">API Doc</a>).</p>
<p><code class="language-plaintext highlighter-rouge">lib/widgets/where_to_recent_dest_list.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/material.dart';
import '../model/destination.dart';
class WhereToRecentDestList extends StatelessWidget {
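// Constructor assumed here so that `destinations:` can be passed as a named argument.
const WhereToRecentDestList({
@required List<Destination> destinations,
}) : _destinations = destinations;
final List<Destination> _destinations;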
@override
Widget build(BuildContext context) {
return Expanded(
// [Expanded] prevents vertical overflow
child: ListView.separated(
shrinkWrap: true, // prevent setting height to infinity
padding: EdgeInsets.all(0), // remove default padding
separatorBuilder: (context, index) => Divider(
// 16% of screen width
indent: MediaQuery.of(context).size.width * 0.16,
),
itemBuilder: (context, index) => ListTile( // this will be refactored
leading: Icon(Icons.history),
title: Text(_destinations[index].title),
subtitle: Text(_destinations[index].address),
),
itemCount: _destinations.length,
),
);
}
}
</code></pre></div></div>
<p>Rather than using a <code class="language-plaintext highlighter-rouge">ListTile</code>, we will use the more primitive <code class="language-plaintext highlighter-rouge">Row</code> and <code class="language-plaintext highlighter-rouge">Column</code> widgets to build a similar layout; we refactor this way to afford greater customisation.</p>
<div style="text-align: center">
<img src="/assets/destination-tile-wireframe.jpg" width="400" />
</div>
<p><br /></p>
<p>The sketch above translates to the following widget:</p>
<p><code class="language-plaintext highlighter-rouge">lib/widgets/where_to_recent_dest_tile.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/material.dart';
import '../model/destination.dart';
class WhereToRecentDestTile extends StatelessWidget {
const WhereToRecentDestTile({
@required this.destination,
});
final Destination destination;
@override
Widget build(BuildContext context) {
return GestureDetector(
behavior:
HitTestBehavior.translucent, // includes tapping in 'blank' areas
onTap: () => print('${destination.title} tapped'),
child: Row(
mainAxisAlignment: MainAxisAlignment.start,
children: <Widget>[
RawMaterialButton(
onPressed: null, // disables inkwell effect
shape: CircleBorder(),
fillColor: Theme.of(context).accentColor, // orange
elevation: 0.2,
child: destination.title.toUpperCase() == 'HOME'
? Icon(
Icons.home,
color: Theme.of(context).canvasColor, // white
)
: destination.title.toUpperCase() == 'WORK'
? Icon(
Icons.work,
color: Theme.of(context).canvasColor,
)
: Icon(
Icons.history,
color: Theme.of(context).canvasColor,
),
),
Column(
crossAxisAlignment: CrossAxisAlignment.start,
children: <Widget>[
Text(
destination.title,
style: TextStyle(
fontSize: 18.0,
fontWeight: FontWeight.bold,
),
),
SizedBox(
height: 6.0, // create space between title and address
),
Text(
destination.address,
style: TextStyle(
fontSize: 14.0,
),
),
],
),
],
),
);
}
}
</code></pre></div></div>
<p>By wrapping the entire <code class="language-plaintext highlighter-rouge">Row</code> in a <code class="language-plaintext highlighter-rouge">GestureDetector</code> (<a href="https://api.flutter.dev/flutter/widgets/GestureDetector-class.html">API Doc</a>) we’re able to detect taps which will let us progress to the next screen in the future. We’ve also made the list dynamic by conditionally rendering icons depending on the title.</p>
<p>Go back to <code class="language-plaintext highlighter-rouge">lib/widgets/where_to_sheet.dart</code> and in the <code class="language-plaintext highlighter-rouge">build</code> method add spacing between elements by inserting <code class="language-plaintext highlighter-rouge">SizedBox</code> with a small height like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// ...
Container(
// Handlebar
height: 5.0,
width: 50.0,
decoration: BoxDecoration(
color: Theme.of(context).dividerColor,
borderRadius: BorderRadius.circular(5.0),
),
),
SizedBox(
height: 6.0, // spacing
),
Text(
// Greeting
'Good morning, Sameen',
style: TextStyle(
fontSize: 22.0,
fontWeight: FontWeight.bold,
),
),
// ...
</code></pre></div></div>
<p>I’m not adding the full code here again for brevity, but <a href="https://github.com/samisnotinsane/flutter-bites/blob/master/geo_maps/lib/widgets/where_to_sheet.dart">click here</a> to see the full source for this widget.</p>
<p>At this point you may notice that the sheet itself is too small. If so, return to <code class="language-plaintext highlighter-rouge">main.dart</code> and tweak the <code class="language-plaintext highlighter-rouge">initialChildSize</code> property of <code class="language-plaintext highlighter-rouge">DraggableScrollableSheet</code>. I set mine to <code class="language-plaintext highlighter-rouge">0.35</code>.</p>
<p><strong>Caution</strong>: Ensure <code class="language-plaintext highlighter-rouge">initialChildSize <= maxChildSize</code> to prevent a crash.</p>
<p>After these changes, you should be done with the bottom sheet.</p>
<div style="text-align: center">
<img src="/assets/uber-iter-1-3.png" width="200" />
</div>
<p><br /></p>
<h1 id="map-with-custom-theme">Map with custom theme</h1>
<p>Let’s integrate Google Maps into our app. We will need an API key, a package dependency and of course, the app skeleton on which the map will be displayed.</p>
<p>Open the Google Cloud Platform console and navigate to credentials where you will be able to create a new API key. Once done, you should be able to see it under the ‘API Keys’ table (see screenshot below).</p>
<div style="text-align: center">
<img src="/assets/google-console-api.png" width="200" />
</div>
<p><br /></p>
<p>Caution: Don’t share live API keys publicly, as anyone using your key may incur charges on your account.</p>
<p>With that out of the way, we now have to activate the Maps SDK so our app will be able to query Google’s map servers. Since Flutter is cross-platform, your app could potentially run on both iOS and Android, so we’ll activate both ‘Maps SDK for iOS’ and ‘Maps SDK for Android’; just use the search bar to find these two products and click on ‘Enable’ on their respective pages to activate them.</p>
<h4 id="project-configuration-and-adding-dependencies">Project Configuration and Adding Dependencies</h4>
<p><a href="https://pub.dev/">pub.dev</a> is <em>the</em> place for all your package needs. If you’re coming from the JavaScript world, this is akin to npm. From pub.dev, we need a package called <a href="https://pub.dev/packages/google_maps_flutter">google_maps_flutter</a> (version <code class="language-plaintext highlighter-rouge">^0.5.28+1</code> as of writing) which does exactly what it says on the tin.</p>
<p>Add this in your <code class="language-plaintext highlighter-rouge">pubspec.yaml</code> like so:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dependencies:
...
google_maps_flutter: ^0.5.28+1
</code></pre></div></div>
<p>Note: <code class="language-plaintext highlighter-rouge">yaml</code> files are notorious for being sensitive to indentation, so pay extra attention here.</p>
<p>Now copy and paste the following line with your API key in your Android application manifest (located in <code class="language-plaintext highlighter-rouge">android/app/src/main/AndroidManifest.xml</code>):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><meta-data android:name="com.google.android.geo.API_KEY"
android:value="YOUR KEY HERE"/>
</code></pre></div></div>
<p>Be sure to place it as a direct child of the <code class="language-plaintext highlighter-rouge"><application></code> tag.</p>
<p>That’s all for Android, now on to iOS.</p>
<p>Open <code class="language-plaintext highlighter-rouge">ios/Runner/AppDelegate.swift</code> and replace the file contents with the following code block (but add in your API key of course):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import UIKit
import Flutter
import GoogleMaps
@UIApplicationMain
@objc class AppDelegate: FlutterAppDelegate {
override func application(
_ application: UIApplication,
didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?
) -> Bool {
GMSServices.provideAPIKey("YOUR KEY HERE")
GeneratedPluginRegistrant.register(with: self)
return super.application(application, didFinishLaunchingWithOptions: launchOptions)
}
}
</code></pre></div></div>
<p>One last thing in this step: the custom theme. Just use the <a href="https://mapstyle.withgoogle.com/">Google Maps Styling Wizard</a> to create or select a predefined theme and copy the generated JSON.</p>
<p>In our case, since we’re looking for an ‘Uber-esque’ theme, we’re going to use <a href="https://snazzymaps.com/style/90982/uber-2017">this link</a> to copy a similarly structured JSON with different values.</p>
<p>Create a new folder <code class="language-plaintext highlighter-rouge">/assets</code> in the project root and create a JSON file <code class="language-plaintext highlighter-rouge">map_style.json</code> in it (the name is up to you, as long as it matches the path we load in code later), pasting in the JSON string you copied from the link above.</p>
<p>Now you have to declare this newly created <code class="language-plaintext highlighter-rouge">/assets</code> folder so that Flutter bundles it with your app. Open <code class="language-plaintext highlighter-rouge">pubspec.yaml</code> and scroll down; you should see a commented-out section about assets. Uncomment it and change it so it reads as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> assets:
- assets/
</code></pre></div></div>
<p>This gives your app access to the files directly inside the <code class="language-plaintext highlighter-rouge">assets</code> folder, including the <code class="language-plaintext highlighter-rouge">map_style.json</code> that you just created.</p>
<p>We will come back to this later after creating a skeleton app, so we can apply our style on the rendered map.</p>
<p><strong>Lost?</strong> See <a href="https://github.com/samisnotinsane/flutter-bites/commit/800a1512778d4c061128c88f6aae3ccad75a947b#diff-ef3842c19e4a6b4139f27c2313c9c4b4">this</a> and <a href="https://github.com/samisnotinsane/flutter-bites/commit/bd088d4d4e0aa81c9f6224ddf6b854637340ddb3#diff-ef3842c19e4a6b4139f27c2313c9c4b4">this</a> example to get back on track!</p>
<h1 id="acquiring-device-location">Acquiring Device Location</h1>
<p>We’re going to add another dependency called <a href="https://pub.dev/packages/geolocator">geolocator</a> (version <code class="language-plaintext highlighter-rouge">^5.3.2+2</code> as of writing) which will determine device location.</p>
<p>Add this in your <code class="language-plaintext highlighter-rouge">pubspec.yaml</code> under <code class="language-plaintext highlighter-rouge">google_maps_flutter</code> which was an earlier entry you made:
<code class="language-plaintext highlighter-rouge">geolocator: ^5.3.2+2</code></p>
<p>Since accessing the device location is a sensitive operation, we need to ask the app user for permission. In iOS, this is configured by adding the following keys in the <code class="language-plaintext highlighter-rouge">Info.plist</code> file (these should be direct children of the <code class="language-plaintext highlighter-rouge"><dict></code> tag):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><key>NSLocationWhenInUseUsageDescription</key>
<string>This app needs access to location when open.</string>
<key>NSLocationAlwaysUsageDescription</key>
<string>This app needs access to location when in the background.</string>
<key>NSLocationAlwaysAndWhenInUseUsageDescription</key>
<string>This app needs access to location when open and in the background.</string>
</code></pre></div></div>
<p>In Android’s case, open <code class="language-plaintext highlighter-rouge">android/app/src/main/AndroidManifest.xml</code> and add the following line as a direct child of the top-level <code class="language-plaintext highlighter-rouge"><manifest></code> tag:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><uses-permission android:name="android.permission.ACCESS_FINE_LOCATION" />
</code></pre></div></div>
<p><strong>Lost?</strong> See <a href="https://github.com/samisnotinsane/flutter-bites/commit/60ddadb0ad026c244099c39c21616bee4ed9e905#diff-ef3842c19e4a6b4139f27c2313c9c4b4">example</a> to get back on track!</p>
<p>When our screen loads, we want to use Geolocator to get device location. The callback needed for this behavior, <code class="language-plaintext highlighter-rouge">initState</code>, is only available in a <code class="language-plaintext highlighter-rouge">StatefulWidget</code>. Accordingly, we create a <code class="language-plaintext highlighter-rouge">GoogleMapView</code> to sit alongside our <code class="language-plaintext highlighter-rouge">DraggableScrollableSheet</code>.</p>
<p><code class="language-plaintext highlighter-rouge">lib/widgets/map_view.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:geolocator/geolocator.dart';
class GoogleMapView extends StatefulWidget {
@override
_GoogleMapViewState createState() => _GoogleMapViewState();
}
class _GoogleMapViewState extends State<GoogleMapView> {
String _mapStyle;
Position _currentPosition;
@override
void initState() {
super.initState();
rootBundle
.loadString('assets/map_style.json')
.then((value) => _mapStyle = value); // loads custom map theme
_getCurrentLocation(); // uses geolocator to acquire device position
}
void _getCurrentLocation() {
final Geolocator geolocator = Geolocator()..forceAndroidLocationManager;
geolocator
.getCurrentPosition(desiredAccuracy: LocationAccuracy.best)
.then((position) => setState(() => _currentPosition = position))
.catchError((e) => print(e));
}
@override
Widget build(BuildContext context) {
return Container(
height: MediaQuery.of(context).size.height * 0.65,
child: Placeholder(),
);
}
}
</code></pre></div></div>
<p>Upon running the app after adding the <code class="language-plaintext highlighter-rouge">GoogleMapView</code> widget, you’ll see the standard platform specific location prompt - make sure you accept this.</p>
<div style="text-align: center">
<img src="/assets/location-prompt-ios.png" width="200" />
</div>
<p><br /></p>
<h4 id="add-a-loading-animation">Add a loading animation</h4>
<p>This is a good point to add in the loading animation dependency. While the <code class="language-plaintext highlighter-rouge">geolocator</code> package performs its <code class="language-plaintext highlighter-rouge">async</code> operation, there will be nothing to display, so we can fill this time with a nice animated loading indicator which can be pulled in using the <code class="language-plaintext highlighter-rouge">flutter_spinkit</code> <a href="https://pub.dev/packages/flutter_spinkit">dependency</a>. As of writing, I am using <code class="language-plaintext highlighter-rouge">flutter_spinkit: ^4.1.2+1</code>.</p>
<p>Your <code class="language-plaintext highlighter-rouge">build</code> method in <code class="language-plaintext highlighter-rouge">map_view.dart</code> should read like this after adding in the <code class="language-plaintext highlighter-rouge">SpinKitPulse</code>. Notice we use conditional rendering to show the loading blip. We expect the value of <code class="language-plaintext highlighter-rouge">_currentPosition</code> to be <code class="language-plaintext highlighter-rouge">null</code> until the device acquires a good GPS signal and computes the current position.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@override
Widget build(BuildContext context) {
return Container(
height: MediaQuery.of(context).size.height *
0.65, // height (65%) = screen height (100%) - height of bottom card (35%)
child: _currentPosition == null ? LoadingBlip() : Placeholder(),
);
}
</code></pre></div></div>
<p>where <code class="language-plaintext highlighter-rouge">LoadingBlip</code> is the custom widget we just created:</p>
<p><code class="language-plaintext highlighter-rouge">lib/widgets/loading_blip.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import 'package:flutter/material.dart';
import 'package:flutter_spinkit/flutter_spinkit.dart';
class LoadingBlip extends StatelessWidget {
@override
Widget build(BuildContext context) {
return Container(
height: MediaQuery.of(context).size.height * 0.65,
width: MediaQuery.of(context).size.width * 1,
color: Colors.grey[300],
child: SpinKitPulse(
color: Theme.of(context).indicatorColor,
size: 50.0,
),
);
}
}
</code></pre></div></div>
<div style="text-align: center">
<img src="/assets/spinkit-loading-blip.gif" width="200" />
</div>
<p><br /></p>
<p><strong>Feeling lost?</strong> Take a look at <a href="https://github.com/samisnotinsane/flutter-bites/commit/fbe6b11bfe50ae24406b478c1715c6dcf786e803#diff-ef3842c19e4a6b4139f27c2313c9c4b4">my commit</a> to get back on track.</p>
<h4 id="adding-the-map">Adding the map</h4>
<p>Returning to the <code class="language-plaintext highlighter-rouge">build</code> method in <code class="language-plaintext highlighter-rouge">lib/widgets/map_view.dart</code>, we now replace the <code class="language-plaintext highlighter-rouge">Placeholder</code> with a <code class="language-plaintext highlighter-rouge">GoogleMap</code> object which comes from the <code class="language-plaintext highlighter-rouge">google_maps_flutter</code> dependency.</p>
<p><code class="language-plaintext highlighter-rouge">lib/widgets/map_view.dart</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// ...
// Centres map on device location coordinates.
_buildCameraPosition() => CameraPosition(
target: LatLng(_currentPosition.latitude, _currentPosition.longitude),
zoom: 16); // positive integers; higher value = more zoom.
// Applies theme to map once it has loaded.
_buildMap(GoogleMapController mapController) =>
mapController.setMapStyle(_mapStyle);
@override
Widget build(BuildContext context) {
return Container(
height: MediaQuery.of(context).size.height *
0.65, // height (65%) = screen height (100%) - height of bottom card (35%)
child: _currentPosition == null
? LoadingBlip()
: GoogleMap(
myLocationButtonEnabled: false,
myLocationEnabled: true,
initialCameraPosition: _buildCameraPosition(),
onMapCreated: _buildMap,
),
);
}
// ...
</code></pre></div></div>
<p><strong>Important</strong>: For iOS, it’s crucial you enable the embedded views preview by adding the following key:</p>
<p><code class="language-plaintext highlighter-rouge">ios/Runner/Info.plist</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><dict>
<!-- Add the following two lines -->
<key>io.flutter.embedded_views_preview</key>
<string>YES</string>
<!-- ... -->
</dict>
</code></pre></div></div>
<p>Make sure you cold restart the app (stop it and run it again) so the change is picked up; a hot reload will not apply it.</p>
<p><strong>Caution</strong>: Check your <code class="language-plaintext highlighter-rouge">ios/Runner/AppDelegate.swift</code> and <code class="language-plaintext highlighter-rouge">android/app/src/main/AndroidManifest.xml</code> to confirm that you have added your unique API key as detailed in the section ‘Project Configuration and Adding Dependencies’ above.</p>
<p><strong>Feeling lost?</strong> Take a look at <a href="https://github.com/samisnotinsane/flutter-bites/commit/75b126553c2ac9721b711e6a46304067478a4cc4">my commit</a> to get back on track.</p>
<h4 id="summary">Summary</h4>
<p>If you’ve been able to follow along so far, congratulations! You should have an app that shows Google Maps with a custom theme and a list of recent destinations as an overlay card - just like Uber!</p>
<div style="text-align: center">
<img src="/assets/geo-maps-demo.gif" width="200" />
</div>
<p><br /></p>
<p>Thanks for reading!</p>
<p>During this lockdown, I wanted to play with new and unfamiliar technology. Enter Flutter, Google’s all-new framework for building cross-platform mobile applications.</p>
<h1>Bias-Variance Tradeoff in Machine Learning models</h1>
<p>2020-02-11T17:16:05+00:00 · https://sameen.dev/ml/2020/02/11/Bias-variance-tradeoff</p>
<p>Have you noticed that when you ask a child to do a task for you, sometimes they are overly pedantic about small details?
Say you ask them to pour about two cups of milk in a pan. They might excitedly get out a syringe with millilitre precision from the drawer and measure the milk out into the saucepan.</p>
<p>Now imagine making the same request to a moody teenager. They probably can’t be bothered to get involved with this chore and will grudgingly pour out the milk - the amount poured could be way less or way more than the two cups you had asked for.</p>
<p>You on the other hand might have years of experience on the job. When someone tells you to pour two cups of milk in a pan, you can estimate and pour straight from the bottle - chances are, your estimate will be pretty close to the <em>true value</em> of two cups.</p>
<p>If you have been able to follow the analogy I have just given, then the notion of over and underfitting a model is intuitive to you. It is the child’s model that has overfitted and the teenager’s, underfitted - you on the other hand have acquired a model with <em>just</em> the right degree of tradeoff.</p>
<h2 id="overfitting">Overfitting</h2>
<p>When we use a model that is too complex, we end up adapting too closely to our data which results in <em>high variance</em> as it fluctuates wildly from one datapoint to the next. The model pays so much attention to every little detail in data that it doesn’t generalise well and often results in incorrect predictions for new, unseen data.</p>
<h2 id="underfitting">Underfitting</h2>
<p>This is the exact opposite of overfitting. In this case, our model is ignoring all the important details in our data and is looking at things from the highest level of generality - we know this because we can see by eye that a curve will better approximate the true function and fit the data samples well. This kind of model is said to have a high <em>bias</em>.</p>
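<p>To make the two failure modes concrete, here is a minimal sketch that is separate from the scikit-learn experiment in this post. It assumes a sine curve as the "unknown" function and uses NumPy's <code class="language-plaintext highlighter-rouge">Polynomial.fit</code>; the names <code class="language-plaintext highlighter-rouge">x_train</code>, <code class="language-plaintext highlighter-rouge">x_test</code> and <code class="language-plaintext highlighter-rouge">mse</code> are made up for illustration. The training error can only go down as the degree grows, so it is the error on held-out points that tells you whether the extra complexity helped.</p>

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_fun(x):
    # Assumed "unknown" function for this sketch
    return np.sin(1.5 * np.pi * x)


# Local generator, so the notebook's global random state is left untouched
rng = np.random.default_rng(0)
x_train = np.sort(rng.random(30))
y_train = true_fun(x_train) + rng.normal(0.0, 0.1, size=30)

# Held-out points taken from the noise-free function
x_test = np.linspace(0.0, 1.0, 200)
y_test = true_fun(x_test)


def mse(model, x, y):
    # Mean squared error of a fitted polynomial on the points (x, y)
    return float(np.mean((model(x) - y) ** 2))


train_mse, test_mse = {}, {}
for degree in (1, 4, 12):
    model = Polynomial.fit(x_train, y_train, degree)  # least-squares fit
    train_mse[degree] = mse(model, x_train, y_train)
    test_mse[degree] = mse(model, x_test, y_test)
    print(f"degree {degree:2d}: train MSE = {train_mse[degree]:.2e}, "
          f"test MSE = {test_mse[degree]:.2e}")
```

<p>A straight line (degree 1) has a large error on both sets, which is the signature of high bias. For the highest degree, compare the two columns: a training error that keeps shrinking while the held-out error does not is the signature of high variance.</p>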
<h2 id="experiment">Experiment</h2>
<p>Inspired by the <a href="https://scikit-learn.org/stable/auto_examples/model_selection/plot_underfitting_overfitting.html">scikit-learn documentation</a> I did a quick experiment to see for myself whether the code works and whether overfitting and underfitting are real problems - seeing is believing after all!</p>
<p>Begin by importing the prerequisites in your Jupyter notebook:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">%</span><span class="n">matplotlib</span> <span class="n">inline</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">from</span> <span class="nn">sklearn.pipeline</span> <span class="kn">import</span> <span class="n">Pipeline</span>
<span class="kn">from</span> <span class="nn">sklearn.preprocessing</span> <span class="kn">import</span> <span class="n">PolynomialFeatures</span>
<span class="kn">from</span> <span class="nn">sklearn.linear_model</span> <span class="kn">import</span> <span class="n">LinearRegression</span>
<span class="kn">from</span> <span class="nn">sklearn.model_selection</span> <span class="kn">import</span> <span class="n">cross_val_score</span></code></pre></figure>
<p><em>Note: If you’re not set up with Jupyter already, watch my <a href="https://www.youtube.com/watch?v=lM_y35fXuEw">step-by-step tutorial on YouTube</a> which shows you how to do this.</em></p>
<p>Our raw data will be generated using a sine function.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">def</span> <span class="nf">true_fun</span><span class="p">(</span><span class="n">X</span><span class="p">):</span> <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="n">sin</span><span class="p">(</span><span class="mf">1.5</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">pi</span> <span class="o">*</span> <span class="n">X</span><span class="p">)</span></code></pre></figure>
<p>But before generating datapoints from this function, we will add some <a href="https://en.wikipedia.org/wiki/Normal_distribution">Gaussian noise</a> to make the data <em>vaguely</em> realistic - otherwise we will just end up with datapoints which follow a sine curve with no variance and our learning function won’t be able to over or underfit!</p>
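<p>As a rough illustration of what the noise does, the sketch below compares noise-free samples with noisy ones. The standard deviation of 0.1 matches the scaling used later in this post; the sample count of 1000 and the variable names are assumptions chosen only so the estimate is stable.</p>

```python
import numpy as np


def true_fun(x):
    return np.sin(1.5 * np.pi * x)


# Local generator, so the notebook's global random state is left untouched
rng = np.random.default_rng(0)
x = np.sort(rng.random(1000))  # 1000 points is an assumption for a stable estimate

clean = true_fun(x)                                  # lies exactly on the curve
noisy = clean + rng.normal(0.0, 0.1, size=x.shape)   # Gaussian noise, std 0.1

residual = noisy - clean
print("std of the noise-free residual:", (clean - true_fun(x)).std())
print("std of the noisy residual:     ", residual.std())
```

<p>The noise-free residual is exactly zero, so any sufficiently flexible model could match those points perfectly. The noisy residual has a spread close to 0.1, and it is this scatter that a model can mistakenly learn.</p>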
<p><code class="language-plaintext highlighter-rouge">random.seed</code> ensures that, for a given seed value, we generate exactly the same random values on every run. We then set <code class="language-plaintext highlighter-rouge">n_samples</code> to generate 30 datapoints and create an array called <code class="language-plaintext highlighter-rouge">degrees</code> holding the polynomial degrees that make our model increasingly complex. In this example we will generate three models:</p>
<ul>
<li>a linear regressor (<code class="language-plaintext highlighter-rouge">degree=1</code> line)</li>
<li>a quartic function (<code class="language-plaintext highlighter-rouge">degree=4</code> polynomial)</li>
<li>a dodecic function (<code class="language-plaintext highlighter-rouge">degree=12</code> polynomial)</li>
</ul>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="n">n_samples</span> <span class="o">=</span> <span class="mi">30</span>
<span class="n">degrees</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">12</span><span class="p">]</span></code></pre></figure>
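<p>A small sketch of what seeding buys us. It uses separate <code class="language-plaintext highlighter-rouge">np.random.RandomState</code> objects rather than <code class="language-plaintext highlighter-rouge">np.random.seed</code> so that running it does not disturb the global state seeded above; the variable names are made up for illustration.</p>

```python
import numpy as np

# Two generators with the same seed produce identical sequences
first = np.random.RandomState(0).rand(3)
second = np.random.RandomState(0).rand(3)

# A different seed produces a different sequence
third = np.random.RandomState(1).rand(3)

print(np.array_equal(first, second))
print(np.array_equal(first, third))
```

<p>This is why anyone re-running the notebook with the same seed should see the same datapoints and therefore the same fitted curves.</p>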
<p>Next, we generate our input <code class="language-plaintext highlighter-rouge">X</code> and target <code class="language-plaintext highlighter-rouge">y</code> datapoints in <a href="https://en.wikipedia.org/wiki/Vector_space">vector space</a>. These serve as examples which we can use to <em>train</em> our model. It is during this training phase that our model will try to learn from data and arrive at an approximation of our <em>true function</em> (sine wave).</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">X</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">sort</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">rand</span><span class="p">(</span><span class="n">n_samples</span><span class="p">))</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">true_fun</span><span class="p">(</span><span class="n">X</span><span class="p">)</span> <span class="o">+</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">randn</span><span class="p">(</span><span class="n">n_samples</span><span class="p">)</span> <span class="o">*</span> <span class="mf">0.1</span></code></pre></figure>
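<p>The training code in this post passes <code class="language-plaintext highlighter-rouge">X[:, np.newaxis]</code> rather than <code class="language-plaintext highlighter-rouge">X</code>, because scikit-learn estimators expect a 2-D input of shape <code class="language-plaintext highlighter-rouge">(n_samples, n_features)</code>. The sketch below shows that reshaping and builds by hand the columns that <code class="language-plaintext highlighter-rouge">PolynomialFeatures(degree=4, include_bias=False)</code> produces for a single input feature. The array <code class="language-plaintext highlighter-rouge">x_demo</code> and its values are made up for illustration.</p>

```python
import numpy as np

x_demo = np.array([0.1, 0.5, 0.9])   # 1-D array, shape (3,)
x_col = x_demo[:, np.newaxis]        # column vector, shape (3, 1)

degree = 4
# Columns x, x^2, ..., x^degree; no constant column, mirroring include_bias=False
features = np.hstack([x_col ** p for p in range(1, degree + 1)])

print(x_demo.shape, x_col.shape, features.shape)
print(features)
```

<p>The linear regression then fits one coefficient per column, which is how a linear model ends up representing a polynomial curve.</p>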
<p>OK, let’s train and plot the three different models we’ve been speaking about:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">degrees</span><span class="p">)):</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
    <span class="n">polynomial_features</span> <span class="o">=</span> <span class="n">PolynomialFeatures</span><span class="p">(</span><span class="n">degree</span><span class="o">=</span><span class="n">degrees</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
                                             <span class="n">include_bias</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">linear_regression</span> <span class="o">=</span> <span class="n">LinearRegression</span><span class="p">()</span>
    <span class="n">pipeline</span> <span class="o">=</span> <span class="n">Pipeline</span><span class="p">([(</span><span class="s">"polynomial_features"</span><span class="p">,</span> <span class="n">polynomial_features</span><span class="p">),</span>
                         <span class="p">(</span><span class="s">"linear_regression"</span><span class="p">,</span> <span class="n">linear_regression</span><span class="p">)])</span>
    <span class="n">pipeline</span><span class="p">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X</span><span class="p">[:,</span> <span class="n">np</span><span class="p">.</span><span class="n">newaxis</span><span class="p">],</span> <span class="n">y</span><span class="p">)</span>
    <span class="n">scores</span> <span class="o">=</span> <span class="n">cross_val_score</span><span class="p">(</span><span class="n">pipeline</span><span class="p">,</span> <span class="n">X</span><span class="p">[:,</span> <span class="n">np</span><span class="p">.</span><span class="n">newaxis</span><span class="p">],</span> <span class="n">y</span><span class="p">,</span>
                             <span class="n">scoring</span><span class="o">=</span><span class="s">"neg_mean_squared_error"</span><span class="p">,</span> <span class="n">cv</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span>
    <span class="n">X_test</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">pipeline</span><span class="p">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X_test</span><span class="p">[:,</span> <span class="n">np</span><span class="p">.</span><span class="n">newaxis</span><span class="p">]),</span>
             <span class="n">label</span><span class="o">=</span><span class="s">"Learning function"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">edgecolor</span><span class="o">=</span><span class="s">'b'</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">"Samples"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">true_fun</span><span class="p">(</span><span class="n">X_test</span><span class="p">),</span> <span class="n">label</span><span class="o">=</span><span class="s">"True function"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="s">"best"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">title</span><span class="p">(</span><span class="s">"Degree {}</span><span class="se">\n</span><span class="s">MSE = {:.2e}(+/-) {:.2e}"</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">degrees</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
<span class="o">-</span><span class="n">scores</span><span class="p">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">scores</span><span class="p">.</span><span class="n">std</span><span class="p">()))</span></code></pre></figure>
<p>Here, our model has drawn a straight line of best fit because our polynomial degree was one. Without making our model any more complicated, this is the best we can do, but you can see that this is nowhere near good enough.</p>
<p align="center">
<img src="/assets/5OI3FKC2OU82R44V.png" alt="drawing" width="500" />
</p>
<p>Interesting, our model (blue) has learnt a very close approximation of the true function (orange). It looks like our model will be able to predict the position of a new sample pretty well, but let’s keep going and see what a fancier, more complicated model can offer.</p>
<p align="center">
<img src="/assets/LKPRO99PFXEBV4SO.png" alt="drawing" width="500" />
</p>
<p>Disaster! You can clearly see what overfitting looks like in the first plot where our model (blue line) is tightly fitting our datapoints (samples in blue). Since we are using the same training method, we know there’s nothing inherently wrong with our training process; it just means we need to go back and reduce our model complexity so our model can generalise a bit more.</p>
<p align="center">
<img src="/assets/AKOGHLREOP1TIOSM.png" alt="drawing" width="500" />
</p>
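<p>The three fits above can be compared numerically as well as visually. The following is a minimal sketch using only <code class="language-plaintext highlighter-rouge">numpy</code> rather than the sklearn pipeline: the target function, the noise level, the sample size and the degrees 1, 4 and 15 are all assumptions made for illustration, not values taken from the plots.</p>

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_fun(x):
    # Assumed target function for this sketch only.
    return np.cos(1.5 * np.pi * x)


rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 1, 30))
y_train = true_fun(x_train) + rng.normal(0, 0.1, x_train.size)
x_new = np.linspace(0, 1, 200)  # points the model never saw


def errors_for_degree(degree):
    # Least-squares polynomial fit of the requested degree.
    poly = Polynomial.fit(x_train, y_train, degree)
    train_mse = np.mean((poly(x_train) - y_train) ** 2)
    new_mse = np.mean((poly(x_new) - true_fun(x_new)) ** 2)
    return train_mse, new_mse


results = {d: errors_for_degree(d) for d in (1, 4, 15)}
for degree, (train_mse, new_mse) in results.items():
    print(f"degree {degree:2d}: train MSE {train_mse:.2e}, unseen MSE {new_mse:.2e}")
```

<p>The training error keeps shrinking as the degree grows, which is why it cannot be used on its own to pick a model; the error on unseen points is the number to watch.</p>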
<h2 id="summary">Summary</h2>
<p>We have seen that a complex model produces more variance by overfitting the data, while a simple model produces more bias by using a straight line where a curve was needed. We also tuned our model to find the right balance of bias and variance, where it made good enough assumptions about our data to approximate the true function well.</p>Have you noticed that when you ask a child to do a task for you, sometimes they are overly pedantic about small details? Say you ask them to pour about two cups of milk in a pan. They might excitedly get out a syringe from the drawer with millimetre precision and pour out the milk in the saucepan.Training, Validation and Test Set2020-02-09T13:49:14+00:002020-02-09T13:49:14+00:00https://sameen.dev/ml/2020/02/09/training-validation-test-set<p>When deploying a machine learning solution, we want our model to make predictions based on training data. But, it must make these predictions with data it has never seen before. This gives rise to errors in predictions which engineers must try to reduce before deploying the model. This whole exercise of splitting data is to make our validation and test set representative of unknown future data.</p>
<p>You might think we pour all our data into our model to get the perfect solution, but this is far from the truth. Data is carefully split up into three categories during the exploratory analysis phase to minimise the likelihood of error in prediction, which results in a much more useful system with fewer false positives and false negatives.</p>
<p>Finding the exact split ratio is very much dependent on the nature of the data you are dealing with. But as a rule of thumb, a typical split ratio, as outlined in the seminal book <a href="https://web.stanford.edu/~hastie/Papers/ESLII.pdf">The Elements of Statistical Learning</a>, allocates 50% to a training set, with the remaining 50% evenly split between a validation set and a test set.</p>
<p><img src="/assets/C50C302C-E55A-4905-8171-D2D6B08285CF.jpeg" alt="50% training set, 25% validation set, 25% test set" /></p>
<p><strong>Note:</strong> The terms <em>validation set</em> and <em>test set</em> are sometimes used interchangeably in the industry.</p>
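<p>The 50/25/25 rule of thumb can be sketched with a shuffled index array. This is only an illustration: the helper name <code class="language-plaintext highlighter-rouge">split_indices</code>, the seed and the sample count are made up for the example, and shuffling first assumes your rows are not ordered in time.</p>

```python
import numpy as np


def split_indices(n_samples, seed=0):
    """Return shuffled row indices for a 50/25/25 train/validation/test split."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_train = n_samples // 2
    n_val = (n_samples - n_train) // 2
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever is left over
    return train, val, test


train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

<p>Each index array can then be used to select rows from your inputs and targets, so that no row ends up in more than one set.</p>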
<h2 id="training-set">Training set</h2>
<p>This should contain both your independent and dependent variables - also known as input and target vectors. 50% of your data should belong to this set, and it is the only set you should use to actually train your model; keeping the other two sets out of training is what lets you detect overfitting.</p>
<h2 id="validation-set">Validation set</h2>
<p>This set consists of the next 25% of your dataset and is used to estimate the prediction error for different models so that you can empirically test which model has the highest accuracy.</p>
<h2 id="test-set">Test set</h2>
<p>Keep this set locked up until your model is ready for production. Do not, in any way, use this data to train or tune your model, as this will result in overfitting to the test set and an over-optimistic estimate of accuracy. When the final model is evaluated, it should be given only the independent variables; comparing its predictions against the held-back targets then gives us the generalisation error of the final model you have chosen - because sometimes, your model may score a low prediction error just through sheer chance.</p>When deploying a machine learning solution, we want our model to make predictions based on training data. But, it must make these predictions with data it has never seen before. This gives rise to errors in predictions which engineers must try to reduce before deploying the model. This whole exercise of splitting data is to make our validation and test set representative of unknown future data.Getting the hang of Pandas2020-02-07T17:16:54+00:002020-02-07T17:16:54+00:00https://sameen.dev/ml/2020/02/07/Pandas-basics<p>Think of Pandas like Excel, but for hackers. It is infinitely faster and more powerful. Did I also mention it was free?</p>
<p>Yes it’s free.</p>
<p>In pandas, you will commonly work with Series and DataFrame. So, what’s the difference between the two?</p>
<h2 id="series">Series</h2>
<p>This is basically a one-dimensional array or a list - nothing new in the world of programming. Most often, you’ll come across a series in Pandas when you extract a column from a dataframe, because a series is what each column is made of in a dataframe.</p>
<h2 id="dataframe">DataFrame</h2>
<p>This is where the juice lies. A dataframe is basically a table from your spreadsheet. As you might expect, it can have columns with rows of data. As we start out, this sounds pretty cool, but its true power starts to shine when you begin to manipulate a 50-dimensional table, or 500 for that matter.</p>
<hr />
<p>Without any further ado, let’s jump in! Before running any code, remember to import <code class="language-plaintext highlighter-rouge">pandas</code> along with <code class="language-plaintext highlighter-rouge">Series</code> and <code class="language-plaintext highlighter-rouge">DataFrame</code>. We also import a very handy library called <code class="language-plaintext highlighter-rouge">numpy</code> which we will look at in depth in another post.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">from</span> <span class="nn">pandas</span> <span class="kn">import</span> <span class="n">Series</span><span class="p">,</span> <span class="n">DataFrame</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span></code></pre></figure>
<h3 id="series-cheatsheet">Series cheatsheet</h3>
<p>To create a new series:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">my_series</span> <span class="o">=</span> <span class="n">Series</span><span class="p">([</span><span class="mi">10</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">6</span><span class="p">])</span></code></pre></figure>
<p>Presumably you want to do something with it, such as using a predicate to filter and return a new series:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">filtered_series</span> <span class="o">=</span> <span class="n">my_series</span><span class="p">[</span><span class="n">my_series</span> <span class="o">></span> <span class="mi">7</span><span class="p">]</span></code></pre></figure>
<p>After this, <code class="language-plaintext highlighter-rouge">filtered_series</code> will contain <code class="language-plaintext highlighter-rouge">[10, 8]</code>, in the same order as the original series.</p>
<p>Some more handy features include the <code class="language-plaintext highlighter-rouge">isnull</code> and <code class="language-plaintext highlighter-rouge">notnull</code> operations. Given a series <code class="language-plaintext highlighter-rouge">raw_series</code> with values <code class="language-plaintext highlighter-rouge">[5, 10, 15, NaN]</code>, if we call <code class="language-plaintext highlighter-rouge">pd.isnull(raw_series)</code>, our output will be:</p>
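<p>It helps to see what the predicate produces on its own. In this small sketch the intermediate name <code class="language-plaintext highlighter-rouge">mask</code> is introduced purely for illustration: the comparison yields a boolean series, and indexing with it keeps the rows marked <code class="language-plaintext highlighter-rouge">True</code>.</p>

```python
import pandas as pd

my_series = pd.Series([10, 8, 3, 6])

# The comparison is evaluated element by element and yields booleans.
mask = my_series > 7
print(mask.tolist())  # [True, True, False, False]

# Indexing with the mask keeps matching rows, original order and labels.
filtered_series = my_series[mask]
print(filtered_series.tolist())        # [10, 8]
print(filtered_series.index.tolist())  # [0, 1]
```

<p>Note that the filtered series keeps the index labels of the rows it came from instead of renumbering them.</p>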
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>raw_series
0 False
1 False
2 False
3 True
</code></pre></div></div>
<p>Conversely, using <code class="language-plaintext highlighter-rouge">pd.notnull(raw_series)</code>, we get:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>raw_series
0 True
1 True
2 True
3 False
</code></pre></div></div>
<p>Real world data is messy, and these operations remain our best mates when tackling certain columns with missing data.</p>
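<p>As a rough sketch of how these checks are typically put to work, the example below counts the missing values and then either drops or replaces them. Filling with <code class="language-plaintext highlighter-rouge">0</code> is an arbitrary choice for illustration; the right replacement depends on your data.</p>

```python
import numpy as np
import pandas as pd

raw_series = pd.Series([5, 10, 15, np.nan])

# Boolean check per element, as shown above.
print(pd.isnull(raw_series).tolist())  # [False, False, False, True]

# True counts as 1, so summing the mask counts the missing entries.
missing_count = raw_series.isnull().sum()

# Two common ways to deal with the gaps.
cleaned = raw_series.dropna()   # remove missing entries
filled = raw_series.fillna(0)   # or replace them with a chosen value
print(cleaned.tolist(), filled.tolist())
```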
<p>One of the cool things which differentiates a series from a standard array is the ability to name indices. Imagine <code class="language-plaintext highlighter-rouge">my_series</code> above represents points scored by different players in a game. Then, instead of remembering that element <code class="language-plaintext highlighter-rouge">0</code> represents <code class="language-plaintext highlighter-rouge">Tom</code> and <code class="language-plaintext highlighter-rouge">1</code> represents <code class="language-plaintext highlighter-rouge">Jane</code>, we can just alter the index of <code class="language-plaintext highlighter-rouge">my_series</code> <em>in-place</em>, meaning it changes the original series rather than returning a modified copy:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">my_series</span><span class="p">.</span><span class="n">index</span> <span class="o">=</span> <span class="p">[</span><span class="s">'Tom'</span><span class="p">,</span> <span class="s">'Jane'</span><span class="p">,</span> <span class="s">'Kathy'</span><span class="p">,</span> <span class="s">'Sam'</span><span class="p">]</span></code></pre></figure>
<p>This gives the output:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>my_series
Tom 10
Jane 8
Kathy 3
Sam 6
</code></pre></div></div>
<p>You can then easily pick out the score of <code class="language-plaintext highlighter-rouge">Kathy</code> through <code class="language-plaintext highlighter-rouge">my_series['Kathy']</code> which returns <code class="language-plaintext highlighter-rouge">3</code>.</p>
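<p>The labels can also be supplied when the series is created, and they survive filtering. A short sketch, where the variable name <code class="language-plaintext highlighter-rouge">scores</code> is chosen just for this example:</p>

```python
import pandas as pd

# Same data as my_series, with the labels passed at construction time.
scores = pd.Series([10, 8, 3, 6], index=['Tom', 'Jane', 'Kathy', 'Sam'])

print(scores['Kathy'])  # 3

# Filtering keeps the labels, so we can see who scored more than 7.
high_scorers = scores[scores > 7].index.tolist()
print(high_scorers)  # ['Tom', 'Jane']

# idxmax returns the label of the largest value.
print(scores.idxmax())  # Tom
```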
<h3 id="dataframe-cheatsheet">Dataframe cheatsheet</h3>
<p>There’s absolutely a ton of stuff you can do with dataframes. To begin with, let’s see how we can read a csv file into a pandas dataframe. In this example, we’re using the nCoV-2019 coronavirus data originating from Wuhan, in Hubei province, China.</p>
<p>Let’s preview the file before we read it, to get a sense of what we’re dealing with:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># source: BNO @ https://bnonews.com/index.php/2020/01/the-latest-coronavirus-cases/
# update: 2020-02-3 5:02:00 ET
place|confirmed_cases|deaths|notes|sources
Hubei|11,177|350|1,223 serious, 478 critical|http://wjw.hubei.gov.cn/fbjd/dtyw/202002/t20200203_2018272.shtml
Zhejiang|724|0|48 serious, 12 critical|https://www.zjwjw.gov.cn/art/2020/2/3/art_1202101_41869217.html
Guangdong|725|0|58 serious, 22 critical|http://wsjkw.gd.gov.cn/zwyw_yqxx/content/post_2882427.html
Henan|566|2|30 serious, 14 critical|https://m.weibo.cn/status/4467799441404602
Hunan|521|0|58 serious|http://wjw.hunan.gov.cn/wjw/xxgk/gzdt/zyxw_1/202002/t20200203_11168209.html
Anhui|408|0|4 critical|http://wjw.ah.gov.cn/news_details_54452.html
Jiangxi|391|0| 34 serious |http://hc.jiangxi.gov.cn/doc/2020/02/03/138004.shtml
</code></pre></div></div>
<p>Ok, so the first two lines beginning with <code class="language-plaintext highlighter-rouge">#</code> are comments, so we need to skip them. Also, it appears that fields are separated by <code class="language-plaintext highlighter-rouge">|</code>. We can account for both of these by adding a couple of arguments to <code class="language-plaintext highlighter-rouge">read_csv</code>:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">url</span> <span class="o">=</span> <span class="s">'https://raw.githubusercontent.com/globalcitizen/2019-wuhan-coronavirus-data/master/data-sources/bno/data/20200124-145500-bno-2019ncov-data.csv'</span>
<span class="n">ncov</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">read_csv</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">sep</span><span class="o">=</span><span class="s">'|'</span><span class="p">,</span> <span class="n">skiprows</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span></code></pre></figure>
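<p>If you want to try these arguments without a network connection, the same call works on an in-memory string. The sketch below reuses the first rows of the preview above; the <code class="language-plaintext highlighter-rouge">thousands=','</code> argument is an addition of mine, not part of the original call, so that values such as <code class="language-plaintext highlighter-rouge">11,177</code> are parsed as numbers rather than text.</p>

```python
import io

import pandas as pd

# First lines of the preview above, held in memory instead of downloaded.
sample = """# source: BNO @ https://bnonews.com/index.php/2020/01/the-latest-coronavirus-cases/
# update: 2020-02-3 5:02:00 ET
place|confirmed_cases|deaths|notes|sources
Hubei|11,177|350|1,223 serious, 478 critical|http://wjw.hubei.gov.cn/fbjd/dtyw/202002/t20200203_2018272.shtml
Zhejiang|724|0|48 serious, 12 critical|https://www.zjwjw.gov.cn/art/2020/2/3/art_1202101_41869217.html
"""

ncov_sample = pd.read_csv(
    io.StringIO(sample),
    sep='|',        # fields are pipe-separated
    skiprows=2,     # skip the two comment lines
    thousands=',',  # assumption: treat ',' as a thousands separator
)
print(ncov_sample[['place', 'confirmed_cases', 'deaths']])
```

<p>Free-text columns such as <code class="language-plaintext highlighter-rouge">notes</code> are left as strings, since they cannot be read as numbers.</p>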
<p>To get a list of all column names: <code class="language-plaintext highlighter-rouge">ncov.columns</code></p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">In</span><span class="p">[</span><span class="mi">18</span><span class="p">]:</span> <span class="n">ncov</span><span class="p">.</span><span class="n">columns</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">18</span><span class="p">]:</span> <span class="n">Index</span><span class="p">([</span><span class="s">'place'</span><span class="p">,</span> <span class="s">'confirmed_cases'</span><span class="p">,</span> <span class="s">'deaths'</span><span class="p">,</span> <span class="s">'notes'</span><span class="p">,</span> <span class="s">'sources'</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="s">'object'</span><span class="p">)</span></code></pre></figure>
<p>Let’s make a table showing a list of places and the corresponding number of deaths:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">In</span><span class="p">[</span><span class="mi">19</span><span class="p">]:</span> <span class="n">death_pivot</span> <span class="o">=</span> <span class="n">ncov</span><span class="p">.</span><span class="n">pivot_table</span><span class="p">(</span><span class="n">index</span><span class="o">=</span><span class="s">'place'</span><span class="p">,</span> <span class="n">values</span><span class="o">=</span><span class="s">'deaths'</span><span class="p">)</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">19</span><span class="p">]:</span> <span class="n">death_pivot</span>
<span class="n">place</span> <span class="n">deaths</span>
<span class="n">Anhui</span> <span class="mf">0.0</span>
<span class="n">Beijing</span> <span class="mf">1.0</span>
<span class="n">CHINA</span> <span class="n">TOTAL</span> <span class="mf">361.0</span>
<span class="n">Cambodia</span> <span class="mf">0.0</span>
<span class="n">Canada</span> <span class="mf">0.0</span>
<span class="n">Chongqing</span> <span class="mf">2.0</span></code></pre></figure>
<p>Transposing is another handy feature for long tables, such as the one above. If you do: <code class="language-plaintext highlighter-rouge">death_pivot.T</code>, you will get a table that spans horizontally instead of vertically.</p>
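<p>One detail worth knowing is that <code class="language-plaintext highlighter-rouge">pivot_table</code> averages values by default, which is likely why the deaths above are shown as floats. The sketch below uses a tiny hand-made frame (the three rows are copied from the preview, the frame name is made up) and passes <code class="language-plaintext highlighter-rouge">aggfunc='sum'</code> to total the values instead.</p>

```python
import pandas as pd

ncov_sample = pd.DataFrame({
    'place': ['Hubei', 'Henan', 'Anhui'],
    'deaths': [350, 2, 0],
})

# aggfunc defaults to the mean; 'sum' totals the values per place instead.
death_pivot = ncov_sample.pivot_table(index='place', values='deaths', aggfunc='sum')
print(death_pivot)

# .T swaps rows and columns: one row of deaths, one column per place.
wide = death_pivot.T
print(wide)
```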
<p>You may also want to delete columns from your dataframe. The <code class="language-plaintext highlighter-rouge">drop</code> method will help you with this. It takes an argument <code class="language-plaintext highlighter-rouge">axis</code> which tells pandas which dimension to target: <code class="language-plaintext highlighter-rouge">axis=0</code> refers to rows and <code class="language-plaintext highlighter-rouge">axis=1</code> refers to columns. A dataframe is always two-dimensional, so these are the only two values; a dataset with 70 features is still a table with 70 columns. Note that <code class="language-plaintext highlighter-rouge">drop</code> returns a new dataframe and leaves the original untouched unless you assign the result back.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="c1"># remove 'sources' column
</span><span class="n">In</span><span class="p">[</span><span class="mi">20</span><span class="p">]:</span> <span class="n">ncov</span><span class="p">.</span><span class="n">drop</span><span class="p">(</span><span class="s">'sources'</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span></code></pre></figure>
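<p>A quick sketch on a made-up two-row frame shows both axes in action, and that the original frame is not modified by <code class="language-plaintext highlighter-rouge">drop</code>. The frame contents and variable names here are placeholders.</p>

```python
import pandas as pd

df = pd.DataFrame({
    'place': ['Hubei', 'Henan'],
    'deaths': [350, 2],
    'sources': ['a', 'b'],  # placeholder values
})

# axis=1 targets columns; the result is a new dataframe.
trimmed = df.drop('sources', axis=1)

# Equivalent spelling that avoids having to remember the axis number.
trimmed_alt = df.drop(columns='sources')

# axis=0 targets rows, identified by their index label.
without_first_row = df.drop(0, axis=0)

print(list(df.columns))       # the original still has 'sources'
print(list(trimmed.columns))
```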
<p>Say you wish to pick out the 5th to 10th rows of your dataframe. Because positions are counted from zero, you can use the following:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">In</span><span class="p">[</span><span class="mi">21</span><span class="p">]:</span> <span class="n">ncov</span><span class="p">[</span><span class="mi">4</span><span class="p">:</span><span class="mi">10</span><span class="p">]</span></code></pre></figure>
<p>Basically, the syntax for slicing is: <code class="language-plaintext highlighter-rouge">df[start_index:end_index:increment]</code> where:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">start_index</code> is inclusive</li>
<li><code class="language-plaintext highlighter-rouge">end_index</code> is non-inclusive</li>
<li><code class="language-plaintext highlighter-rouge">increment</code> with a value of <code class="language-plaintext highlighter-rouge">2</code> would pick every second row in the range specified.</li>
</ul>
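<p>The slicing rules above can be checked on a throwaway frame whose single column simply holds each row's position. The frame and its column name <code class="language-plaintext highlighter-rouge">n</code> are invented for this sketch.</p>

```python
import pandas as pd

# Each row stores its own position, which makes the slices easy to read.
df = pd.DataFrame({'n': list(range(12))})

fifth_to_tenth = df[4:10]['n'].tolist()
print(fifth_to_tenth)  # [4, 5, 6, 7, 8, 9]

every_second = df[4:10:2]['n'].tolist()
print(every_second)  # [4, 6, 8]

# iloc makes it explicit that the slice is by position, not by label.
by_position = df.iloc[4:10]['n'].tolist()
print(by_position)  # [4, 5, 6, 7, 8, 9]
```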
<p>Ok, now we want to get the top-10 places with the highest number of deaths. <code class="language-plaintext highlighter-rouge">ascending=False</code> gives a descending order list as you might expect and <code class="language-plaintext highlighter-rouge">head(10)</code> returns the first 10 rows from the dataframe.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">In</span><span class="p">[</span><span class="mi">22</span><span class="p">]:</span> <span class="n">ncov</span><span class="p">.</span><span class="n">sort_values</span><span class="p">(</span><span class="n">by</span><span class="o">=</span><span class="s">'deaths'</span><span class="p">,</span> <span class="n">ascending</span><span class="o">=</span><span class="bp">False</span><span class="p">).</span><span class="n">head</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span></code></pre></figure>
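<p>In current pandas releases, sorting by a column is done with <code class="language-plaintext highlighter-rouge">sort_values</code>. Here is a small sketch of the top-N pattern on made-up numbers, together with <code class="language-plaintext highlighter-rouge">nlargest</code>, which does the same thing in one call. The frame contents are illustrative only.</p>

```python
import pandas as pd

df = pd.DataFrame({
    'place': ['Anhui', 'Hubei', 'Henan', 'Chongqing'],
    'deaths': [0, 350, 2, 1],  # illustrative values
})

# Sort by the column in descending order, then keep the first two rows.
top = df.sort_values(by='deaths', ascending=False).head(2)
print(top)

# nlargest combines the sort and the head in one step.
top_alt = df.nlargest(2, 'deaths')
print(top_alt)
```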
<p>This is only the beginning of what we can do with Pandas. There’s a lot more to explore, and in future posts we will do some analysis with various datasets from Kaggle. If you don’t have an account with them, I highly recommend signing up to get access to lots of data and cloud-powered Jupyter notebooks free of charge.</p>Think of Pandas like Excel, but for hackers. It is infinitely faster and more powerful. Did I also mention it was free?