Posts (old posts, page 1)

Project Euler Problem 3: Largest prime factor

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

Version 1 - Inefficient Trial Division

First a function to generate all factors. We need only test candidate factors up to $n$.

In [1]:
from six.moves import filter, map, range, reduce
In [2]:
factors = lambda n: list(filter(lambda x: not n % x, range(1, n+1)))
In [3]:
factors(24)
Out[3]:
[1, 2, 3, 4, 6, 8, 12, 24]

This isn't really that helpful, since we are only interested in prime factors. While we could implement a primality predicate and filter out composite numbers, that isn't particularly efficient. Let's rethink our approach.

Instead of iterating over all integers up to $n$, we can just iterate over the prime numbers. But first we'll need a prime number generator. The simple Sieve of Eratosthenes (250s BCE) seems like a good place to start.
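
As a rough illustration of the idea above, here is a minimal sketch of a bounded Sieve of Eratosthenes together with a factor test restricted to primes. The names `primes_up_to` and `prime_factors` are hypothetical and are not taken from the full post, and sieving all the way up to $n$ is only practical for small inputs such as 13195.

```python
def primes_up_to(limit):
    """Return all primes <= limit using a simple Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # cross out every multiple of p, starting at p*p
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def prime_factors(n):
    """Return the primes that divide n (assumes n is small enough to sieve)."""
    return [p for p in primes_up_to(n) if n % p == 0]


print(prime_factors(13195))  # [5, 7, 13, 29]
```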


Project Euler Problem 2: Fibonacci Numbers

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

Version 1: The obvious but inefficient way

In [1]:
from itertools import takewhile

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
In [2]:
sum(filter(lambda x: x%2==0, takewhile(lambda x: x <= 4*10**6, fibonacci())))
Out[2]:
4613732
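
The heading above calls this version inefficient, presumably because it generates every term and then discards the odd ones. One possible alternative, sketched here, relies on the fact that every third Fibonacci number is even and that the even terms satisfy $E_k = 4E_{k-1} + E_{k-2}$. The generator name `even_fibonacci` is illustrative only.

```python
from itertools import takewhile


def even_fibonacci():
    """Yield only the even Fibonacci numbers: 2, 8, 34, 144, ..."""
    a, b = 2, 8
    while True:
        yield a
        # each even term is 4 times the previous even term plus the one before it
        a, b = b, 4 * b + a


total = sum(takewhile(lambda x: x <= 4 * 10**6, even_fibonacci()))
print(total)  # 4613732
```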

Project Euler Problem 1: Multiples of 3 and 5

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

In [1]:
from six.moves import range

Version 1: The obvious way

In [2]:
sum(filter(lambda x: x%3==0 or x%5==0, range(1000)))
Out[2]:
233168

Version 2: If you prefer comprehensions (here, a generator expression)

In [3]:
sum(x for x in range(1000) if x%3==0 or x%5==0)
Out[3]:
233168
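
Both versions above visit every integer below 1000. As a sketch of a closed-form alternative, the multiples of $k$ below a limit form an arithmetic series, and inclusion-exclusion removes the multiples of 15 that would otherwise be counted twice. The helper name `sum_of_multiples_below` is made up for this example.

```python
def sum_of_multiples_below(k, limit):
    """Sum of the positive multiples of k that are strictly below limit."""
    m = (limit - 1) // k          # how many such multiples there are
    return k * m * (m + 1) // 2   # k * (1 + 2 + ... + m)


total = (sum_of_multiples_below(3, 1000)
         + sum_of_multiples_below(5, 1000)
         - sum_of_multiples_below(15, 1000))
print(total)  # 233168
```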

Pelican's USE_FOLDER_AS_CATEGORY setting and behaviour

Pelican's USE_FOLDER_AS_CATEGORY setting is set to True by default. If you place an article within a subfolder of the content folder, and don't specify a category in your article metadata, then the name of the subfolder will be the category of your article. However, the documentation does not specify Pelican's behavior in all possible situations. For example, what happens if an article is within a subfolder, and its category is specified but differs from the name of the subfolder?

We summarize Pelican's behavior under all possible circumstances here.

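======================  ================================  ======================  =======================
USE_FOLDER_AS_CATEGORY  Category c specified in metadata  Article in subfolder d  Article's Category
======================  ================================  ======================  =======================
True                    True                              True                    c
True                    True                              False                   c
True                    False                             True                    d
True                    False                             False                   DEFAULT_CATEGORY (Misc)
False                   True                              True                    c
False                   True                              False                   c
False                   False                             True                    DEFAULT_CATEGORY (Misc)
False                   False                             False                   DEFAULT_CATEGORY (Misc)
======================  ================================  ======================  =======================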
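
The eight rows above collapse into three rules, which can be sketched as a small function. This is only a restatement of the table, not Pelican's actual implementation; the function name `resolve_category` and its parameters are hypothetical, and the default of "Misc" is taken from the table.

```python
def resolve_category(use_folder_as_category, metadata_category=None,
                     subfolder=None, default_category="Misc"):
    """Mirror the table: metadata wins, then the subfolder, then the default."""
    if metadata_category:
        # an explicit category in the metadata always takes precedence
        return metadata_category
    if use_folder_as_category and subfolder:
        # fall back to the subfolder name only when the setting is enabled
        return subfolder
    return default_category


print(resolve_category(True, metadata_category="c", subfolder="d"))   # c
print(resolve_category(False, subfolder="d"))                         # Misc
```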

Generating PDFs from Pelican Articles

As of mid-2013, Pelican still advertised PDF generation of articles/pages as one of its features. However, the change log indicates that this is no longer a core feature: as of version 3.3.0 (2013-09-24) it has become a Pelican plugin (see the issue for further discussion). It was therefore rather confounding to find the setting PDF_GENERATOR listed in the example settings of version 3.5.0. To no one's surprise, adding PDF_GENERATOR=True to your settings does nothing.

If you have used Pelican plugins before, then the solution should be obvious. Simply install the pdf plugin from Pelican plugins. I personally prefer to keep all plugins (and themes) in the <pelican_site_root> directory, on the same level as the pelicanconf.py settings file:

$ cd <pelican_site_root>
$ mkdir plugins
$ git clone https://github.com/getpelican/pelican-plugins.git plugins

Alternatively, you can add the repository as a submodule:

$ git submodule add https://github.com/getpelican/pelican-plugins.git plugins

or clone it anywhere else you like for that matter.

Lastly, you simply need to add plugins to PLUGIN_PATHS and pdf to PLUGINS. The former will temporarily add PLUGIN_PATHS to your system path so that the latter is importable:

PLUGIN_PATHS = ['plugins']
PLUGINS = ['pdf']

When you run pelican (or make html), the generated PDFs of your articles will appear in the pdf directory of the output directory, named according to the article slug with the .pdf extension.

In my opinion, the generated PDFs aren't exactly terrific, and the plugin could do with a little bit more work.

Note

If you happen to be using the notmyidea theme, a link to get the PDF will appear once you add PDF_PROCESSOR=True to your settings (as of commit a7ca52).

IPython Notebook Demo

IPython's Rich Display System

In Python, objects can declare their textual representation using the __repr__ method. IPython expands on this idea and allows objects to declare other, richer representations including:

  • HTML
  • JSON
  • PNG
  • JPEG
  • SVG
  • LaTeX

A single object can declare some or all of these representations; all are handled by IPython's display system. This Notebook shows how you can use this display system to incorporate a broad range of content into your Notebooks.
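
To make this concrete, here is a minimal sketch of a class that declares more than one representation. The class `Temperature` is a made-up example; `_repr_html_` and `_repr_latex_` are the method names IPython's display system looks for, and the Notebook front end chooses among whatever representations are available.

```python
class Temperature(object):
    """Toy object (hypothetical) that declares several representations."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        # plain-text representation, used by any Python shell
        return "Temperature(celsius={0!r})".format(self.celsius)

    def _repr_html_(self):
        # HTML representation, picked up by the Notebook
        return "<b>{0:.1f}</b> &deg;C".format(self.celsius)

    def _repr_latex_(self):
        # LaTeX representation; doubled braces are literal braces in str.format
        return r"${0:.1f}\,^{{\circ}}\mathrm{{C}}$".format(self.celsius)


t = Temperature(21.5)
print(repr(t))           # Temperature(celsius=21.5)
print(t._repr_html_())   # <b>21.5</b> &deg;C
```

In a Notebook, evaluating `t` as the last expression of a cell should render the HTML form rather than the plain `repr`.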

Basic display imports

The display function is a general purpose tool for displaying different representations of objects. Think of it as print for these rich representations.

In [1]:
from IPython.display import display

A few points:

  • Calling display on an object will send all possible representations to the Notebook.
  • These representations are stored in the Notebook document.
  • In general the Notebook will use the richest available representation.

If you want to display a particular representation, there are specific functions for that:

In [2]:
from IPython.display import display_pretty, display_html, display_jpeg, display_png, display_json, display_latex, display_svg

Images

To work with images (JPEG, PNG) use the Image class.

In [3]:
from IPython.display import Image
In [5]:
i = Image(filename='../images/ipython_logo.png')

Returning an Image object from an expression will automatically display it:

In [6]:
i
Out[6]:

Or you can pass it to display:

In [7]:
display(i)

An image can also be displayed from raw data or a URL:

In [8]:
Image(url='http://python.org/images/python-logo.gif')
Out[8]:

SVG images are also supported out of the box (since modern browsers do a good job of rendering them):

In [9]:
from IPython.display import SVG
SVG(filename='images/python_logo.svg')
Out[9]:
image/svg+xml

If we want to create a link to a local file, we can use the FileLink object.

In [10]:
from IPython.display import FileLink, FileLinks
FileLink('Running Code.ipynb')

Alternatively, if we want to link to all of the files in a directory, we can use the FileLinks object, passing '.' to indicate that we want links generated for the current working directory. Note that if there were other directories under the current directory, FileLinks would work in a recursive manner creating links to files in all sub-directories as well.

Embedded vs Non-embedded Images

By default, image data is embedded in the Notebook document so that the images can be viewed offline. However it is also possible to tell the Image class to only store a link to the image. Let's see how this works using a webcam at Berkeley.

In [12]:
from IPython.display import Image
img_url = 'http://www.lawrencehallofscience.org/static/scienceview/scienceview.berkeley.edu/html/view/view_assets/images/newview.jpg'

# by default Image data are embedded
Embed      = Image(img_url)

# if kwarg `url` is given, the embedding is assumed to be false
SoftLinked = Image(url=img_url)

# In each case, embed can be specified explicitly with the `embed` kwarg
# ForceEmbed = Image(url=img_url, embed=True)

Here is the embedded version. Note that this image was pulled from the webcam when this code cell was originally run and stored in the Notebook. Unless we rerun this cell, this is not today's image.

In [13]:
Embed
Out[13]:

Here is today's image from the same webcam at Berkeley (refreshed every minute, if you reload the Notebook). It is visible only with an active internet connection and should be different from the previous one. Notebooks saved with this kind of image will be lighter and always reflect the current version of the source, but the image won't display offline.

In [14]:
SoftLinked
Out[14]:

Of course, if you re-run this Notebook, the two images will be the same again.

Audio

IPython makes it easy to work with sounds interactively. The Audio display class allows you to create an audio control that is embedded in the Notebook. The interface is analogous to the interface of the Image display class. All audio formats supported by the browser can be used. Note that no single format is presently supported in all browsers.

In [15]:
from IPython.display import Audio
Audio(url="http://www.nch.com.au/acm/8k16bitpcm.wav")
Out[15]:

A NumPy array can be auralized automatically. The Audio class normalizes and encodes the data and embeds the result in the Notebook.

For instance, when two sine waves with almost the same frequency are superimposed, a phenomenon known as beats occurs. This can be auralized as follows:

In [16]:
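import numpy as np
max_time = 3      # duration of the signal in seconds
f1 = 220.0        # first frequency (Hz)
f2 = 224.0        # second frequency (Hz)
rate = 8000       # sampling rate (Hz); an integer, so the sample count below is an int
times = np.linspace(0, max_time, rate * max_time)
signal = np.sin(2*np.pi*f1*times) + np.sin(2*np.pi*f2*times)

Audio(data=signal, rate=rate)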
Out[16]:
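
Why the two tones produce beats can be seen from the sum-to-product identity, written here with $f_1$ and $f_2$ standing for the two frequencies in the cell above:

$$\sin(2\pi f_1 t) + \sin(2\pi f_2 t) = 2\cos\big(\pi (f_2 - f_1)\, t\big)\,\sin\big(\pi (f_1 + f_2)\, t\big)$$

With $f_1 = 220$ and $f_2 = 224$, the right-hand side is a 222 Hz tone whose amplitude is modulated by $2\cos(4\pi t)$. The loudness follows the absolute value of that envelope, so it rises and falls $f_2 - f_1 = 4$ times per second.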

Video

More exotic objects can also be displayed, as long as their representation supports the IPython display protocol. For example, videos hosted externally on YouTube are easy to load (and writing a similar wrapper for other hosted content is trivial):

In [17]:
from IPython.display import YouTubeVideo
# a talk about IPython at Sage Days at U. Washington, Seattle.
# Video credit: William Stein.
YouTubeVideo('1j_HxD4iLn8')
Out[17]:

Using the nascent video capabilities of modern browsers, you may also be able to display local videos. At the moment this doesn't work very well in all browsers, so it may or may not work for you; we will continue testing this and looking for ways to make it more robust.

The following cell loads a local file called animation.m4v, encodes the raw video as base64 for http transport, and uses the HTML5 video tag to load it. On Chrome 15 it works correctly, displaying a control bar at the bottom with a play/pause button and a location slider.

In [18]:
from IPython.display import HTML
from base64 import b64encode
# read the raw bytes and close the file promptly
with open("images/animation.m4v", "rb") as f:
    video = f.read()
video_encoded = b64encode(video).decode('ascii')
video_tag = '<video controls alt="test" src="data:video/x-m4v;base64,{0}"></video>'.format(video_encoded)
HTML(data=video_tag)
Out[18]:

HTML

Python objects can declare HTML representations that will be displayed in the Notebook. If you have some HTML you want to display, simply use the HTML class.

In [19]:
from IPython.display import HTML
In [20]:
s = """<table>
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>"""
In [21]:
h = HTML(s); h
Out[21]:
Header 1       Header 2
row 1, cell 1  row 1, cell 2
row 2, cell 1  row 2, cell 2

Pandas makes use of this capability to allow DataFrames to be represented as HTML tables.

In [22]:
import pandas

Here is a small amount of stock data for AAPL:

In [23]:
%%file data.csv
Date,Open,High,Low,Close,Volume,Adj Close
2012-06-01,569.16,590.00,548.50,584.00,14077000,581.50
2012-05-01,584.90,596.76,522.18,577.73,18827900,575.26
2012-04-02,601.83,644.00,555.00,583.98,28759100,581.48
2012-03-01,548.17,621.45,516.22,599.55,26486000,596.99
2012-02-01,458.41,547.61,453.98,542.44,22001000,540.12
2012-01-03,409.40,458.24,409.00,456.48,12949100,454.53
Writing data.csv

Read this into a DataFrame:

In [24]:
df = pandas.read_csv('data.csv')

And view the HTML representation:

In [25]:
df
Out[25]:
         Date    Open    High     Low   Close    Volume  Adj Close
0  2012-06-01  569.16  590.00  548.50  584.00  14077000     581.50
1  2012-05-01  584.90  596.76  522.18  577.73  18827900     575.26
2  2012-04-02  601.83  644.00  555.00  583.98  28759100     581.48
3  2012-03-01  548.17  621.45  516.22  599.55  26486000     596.99
4  2012-02-01  458.41  547.61  453.98  542.44  22001000     540.12
5  2012-01-03  409.40  458.24  409.00  456.48  12949100     454.53

6 rows × 7 columns

External sites

You can even embed an entire page from another site in an iframe; for example this is today's Wikipedia page for mobile users:

In [26]:
from IPython.display import IFrame
IFrame('http://en.mobile.wikipedia.org/?useformat=mobile', width='100%', height=350)
Out[26]:

LaTeX

We also support the display of mathematical expressions typeset in LaTeX, which are rendered in the browser thanks to the MathJax library.

In [27]:
from IPython.display import Math
Math(r'F(k) = \int_{-\infty}^{\infty} f(x) e^{2\pi i k x} dx')
Out[27]:
$$F(k) = \int_{-\infty}^{\infty} f(x) e^{2\pi i k x} dx$$

With the Latex class, you have to include the delimiters yourself. This allows you to use other LaTeX modes such as eqnarray:

In [28]:
from IPython.display import Latex
Latex(r"""\begin{eqnarray}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
\nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0 
\end{eqnarray}""")
Out[28]:
\begin{eqnarray} \nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\ \nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\ \nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\ \nabla \cdot \vec{\mathbf{B}} & = 0 \end{eqnarray}

Or you can enter LaTeX directly with the %%latex cell magic:

In [29]:
%%latex
\begin{align}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
\nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0
\end{align}
\begin{align} \nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\ \nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\ \nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\ \nabla \cdot \vec{\mathbf{B}} & = 0 \end{align}

While display equations look good for a page of samples, the ability to mix math and text in a paragraph is also important.
This expression $\sqrt{3x-1}+(1+x)^2$ is an example of an inline equation. As you see, MathJax equations can be used this way as well, without unduly disturbing the spacing between lines.

ReStructuredText Demo

Author: David Goodger
Address:
123 Example Street
Example, EX  Canada
A1B 2C3
Contact: docutils-develop@lists.sourceforge.net
Authors: Me
Myself
I
Organization: humankind
Date: 2012-01-03
Status: This is a "work in progress"
Revision: 7302
Version: 1
Copyright: This document has been placed in the public domain. You may do with it as you wish. You may copy, modify, redistribute, reattribute, sell, buy, rent, lease, destroy, or improve it, quote it at length, excerpt, incorporate, collate, fold, staple, or mutilate it, or do anything else to it that your or anyone else's heart desires.
field name: This is a generic bibliographic field.
field name 2:

Generic bibliographic fields may contain multiple body elements.

Like this.

Dedication

For Docutils users & co-developers.

Abstract

This document is a demonstration of the reStructuredText markup language, containing examples of all basic reStructuredText constructs and many advanced constructs.

1   Structural Elements

1.1   Section Title

That's it, the text just above this line.

1.2   Transitions

Here's a transition:


It divides the section.

2   Body Elements

2.1   Paragraphs

A paragraph.

2.1.1   Inline Markup

Paragraphs contain text and may contain inline markup: emphasis, strong emphasis, inline literals, standalone hyperlinks (http://www.python.org), external hyperlinks (Python [5]), internal cross-references (example), external hyperlinks with embedded URIs (Python web site), footnote references (manually numbered [1], anonymous auto-numbered [3], labeled auto-numbered [2], or symbolic [*]), citation references ([CIT2002]), substitution references (EXAMPLE), and inline hyperlink targets (see Targets below for a reference back to here). Character-level inline markup is also possible (although exceedingly ugly!) in reStructuredText. Problems are indicated by |problematic| text (generated by processing errors; this one is intentional).

The default role for interpreted text is Title Reference. Here are some explicit interpreted text roles: a PEP reference (PEP 287); an RFC reference (RFC 2822); a subscript; a superscript; and explicit roles for standard inline markup.

Let's test wrapping and whitespace significance in inline literals: This is an example of --inline-literal --text, --including some-- strangely--hyphenated-words.  Adjust-the-width-of-your-browser-window to see how the text is wrapped.  -- ---- --------  Now note    the spacing    between the    words of    this sentence    (words should    be grouped    in pairs).

If the --pep-references option was supplied, there should be a live link to PEP 258 here.

2.2   Bullet Lists

  • A bullet list

    • Nested bullet list.
    • Nested item 2.
  • Item 2.

    Paragraph 2 of item 2.

    • Nested bullet list.
    • Nested item 2.
      • Third level.
      • Item 2.
    • Nested item 3.

2.3   Enumerated Lists

  1. Arabic numerals.

    1. lower alpha)
      1. (lower roman)
        1. upper alpha.
          1. upper roman)
  2. Lists that don't start at 1:

    1. Three
    2. Four
    1. C
    2. D
    1. iii
    2. iv
  3. List items may also be auto-enumerated.

2.4   Definition Lists

Term
Definition
Term : classifier

Definition paragraph 1.

Definition paragraph 2.

Term
Definition

2.5   Field Lists

what:

Field lists map field names to field bodies, like database records. They are often part of an extension syntax. They are an unambiguous variant of RFC 2822 fields.

how arg1 arg2:

The field marker is a colon, the field name, and a colon.

The field body may contain one or more body elements, indented relative to the field marker.

2.6   Option Lists

For listing command-line options:

-a command-line option "a"
-b file options can have arguments and long descriptions
--long options can be long also
--input=file long options can also have arguments
--very-long-option
 

The description can also start on the next line.

The description may contain multiple body elements, regardless of where it starts.

-x, -y, -z Multiple options are an "option group".
-v, --verbose Commonly-seen: short & long options.
-1 file, --one=file, --two file
  Multiple options with arguments.
/V DOS/VMS-style options too

There must be at least two spaces between the option and the description.

2.7   Literal Blocks

Literal blocks are indicated with a double-colon ("::") at the end of the preceding paragraph (over there -->). They can be indented:

if literal_block:
    text = 'is left as-is'
    spaces_and_linebreaks = 'are preserved'
    markup_processing = None

Or they can be quoted without indentation:

>> Great idea!
>
> Why didn't I think of that?

2.8   Line Blocks

This is a line block. It ends with a blank line.
Each new line begins with a vertical bar ("|").
Line breaks and initial indents are preserved.
Continuation lines are wrapped portions of long lines; they begin with a space in place of the vertical bar.
The left edge of a continuation line need not be aligned with the left edge of the text above it.
This is a second line block.

Blank lines are permitted internally, but they must begin with a "|".

Take it away, Eric the Orchestra Leader!

A one, two, a one two three four

Half a bee, philosophically,
must, ipso facto, half not be.
But half the bee has got to be,
vis a vis its entity. D'you see?

But can a bee be said to be
or not to be an entire bee,
when half the bee is not a bee,
due to some ancient injury?

Singing...

2.9   Block Quotes

Block quotes consist of indented body elements:

My theory by A. Elk. Brackets Miss, brackets. This theory goes as follows and begins now. All brontosauruses are thin at one end, much much thicker in the middle and then thin again at the far end. That is my theory, it is mine, and belongs to me and I own it, and what it is too.

—Anne Elk (Miss)

2.10   Doctest Blocks

>>> print('Python-specific usage examples; begun with ">>>"')
Python-specific usage examples; begun with ">>>"
>>> print('(cut and pasted from interactive Python sessions)')
(cut and pasted from interactive Python sessions)

2.11   Tables

Here's a grid table followed by a simple table:

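+------------------------+------------+----------+----------+
| Header row, column 1   | Header 2   | Header 3 | Header 4 |
| (header rows optional) |            |          |          |
+========================+============+==========+==========+
| body row 1, column 1   | column 2   | column 3 | column 4 |
+------------------------+------------+----------+----------+
| body row 2             | Cells may span columns.          |
+------------------------+------------+---------------------+
| body row 3             | Cells may  | • Table cells       |
+------------------------+ span rows. | • contain           |
| body row 4             |            | • body elements.    |
+------------------------+------------+----------+----------+
| body row 5             | Cells may also be     |          |
|                        | empty: -->            |          |
+------------------------+-----------------------+----------+

=====  =====  ======
   Inputs     Output
------------  ------
  A      B    A or B
=====  =====  ======
False  False  False
True   False  True
False  True   True
True   True   True
=====  =====  ======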

2.12   Footnotes

[1] (1, 2)

A footnote contains body elements, consistently indented by at least 3 spaces.

This is the footnote's second paragraph.

[2] (1, 2) Footnotes may be numbered, either manually (as in [1]) or automatically using a "#"-prefixed label. This footnote has a label so it can be referred to from multiple places, both as a footnote reference ([2]) and as a hyperlink reference (label).
[3] This footnote is numbered automatically and anonymously using a label of "#" only.
[*] Footnotes may also use symbols, specified with a "*" label. Here's a reference to the next footnote: [†].
[†] This footnote shows the next symbol in the sequence.
[4] Here's an unreferenced footnote, with a reference to a nonexistent footnote: [5]_.

2.13   Citations

[CIT2002] (1, 2) Citations are text-labeled footnotes. They may be rendered separately and differently from footnotes.

Here's a reference to the above, [CIT2002], and a [nonexistent]_ citation.

2.14   Targets

This paragraph is pointed to by the explicit "example" target. A reference can be found under Inline Markup, above. Inline hyperlink targets are also possible.

Section headers are implicit targets, referred to by name. See Targets, which is a subsection of Body Elements.

Explicit external targets are interpolated into references such as "Python [5]".

Targets may be indirect and anonymous. Thus this phrase may also refer to the Targets section.

Here's a `hyperlink reference without a target`_, which generates an error.

2.14.1   Duplicate Target Names

Duplicate names in section headers or other implicit targets will generate "info" (level-1) system messages. Duplicate names in explicit targets will generate "warning" (level-2) system messages.

2.14.2   Duplicate Target Names

Since there are two "Duplicate Target Names" section headers, we cannot uniquely refer to either of them by name. If we try to (like this: `Duplicate Target Names`_), an error is generated.

2.15   Directives

These are just a sample of the many reStructuredText Directives. For others, please see http://docutils.sourceforge.net/docs/ref/rst/directives.html.

2.15.1   Document Parts

An example of the "contents" directive can be seen above this section (a local, untitled table of contents) and at the beginning of the document (a document-wide table of contents).

2.15.2   Images

An image directive (also clickable -- a hyperlink reference):

../images/rest_demo/title.png

A figure directive:

reStructuredText, the markup syntax

A figure is an image with a caption and/or a legend:

re Revised, revisited, based on 're' module.
Structured Structure-enhanced text, structuredtext.
Text Well it is, isn't it?

This paragraph is also part of the legend.

2.15.3   Admonitions

Attention!

Directives at large.

Caution!

Don't take any wooden nickels.

!DANGER!

Mad scientist at work!

Error

Does not compute.

Hint

It's bigger than a bread box.

Important

  • Wash behind your ears.
  • Clean up your room.
  • Call your mother.
  • Back up your data.

Note

This is a note.

Tip

15% if the service is good.

Warning

Strong prose may provoke extreme mental exertion. Reader discretion is strongly advised.

And, by the way...

You can make up your own admonition too.

2.15.4   Topics, Sidebars, and Rubrics

Topic Title

This is a topic.

This is a rubric

2.15.7   Compound Paragraph

This paragraph contains a literal block:

Connecting... OK
Transmitting data... OK
Disconnecting... OK

and thus consists of a simple paragraph, a literal block, and another simple paragraph. Nonetheless it is semantically one paragraph.

This construct is called a compound paragraph and can be produced with the "compound" directive.

2.16   Substitution Definitions

An inline image (EXAMPLE) example:

(Substitution definitions are not visible in the HTML source.)

2.17   Comments

Here's one:

(View the HTML source to see the comment.)

3   Extensions

3.1   Code Blocks

Here's a neat implementation of the Sieve of Eratosthenes.

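from collections import defaultdict
from itertools import count

def sieve_of_eratosthenes():
    # maps an upcoming composite to the primes already known to divide it
    factors = defaultdict(set)
    for n in count(2):
        if factors[n]:
            # n is composite: move each of its primes on to that prime's next multiple
            for m in factors.pop(n):
                factors[n+m].add(m)
        else:
            # n is prime: the first multiple worth marking is n*n
            factors[n*n].add(n)
            yield n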
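
Since the generator above never terminates on its own, callers have to bound it themselves. The following sketch repeats the definition so that it runs standalone, then takes the first ten primes with `itertools.islice`; the variable name `first_ten` is just for illustration.

```python
from collections import defaultdict
from itertools import count, islice


def sieve_of_eratosthenes():
    """Unbounded prime generator, as in the listing above."""
    factors = defaultdict(set)
    for n in count(2):
        if factors[n]:
            for m in factors.pop(n):
                factors[n + m].add(m)
        else:
            factors[n * n].add(n)
            yield n


# islice stops the infinite generator after ten items
first_ten = list(islice(sieve_of_eratosthenes(), 10))
print(first_ten)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```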

3.2   Mathematics

Here are some remarkable equations

While display equations look good for a page of samples, the ability to mix math and text in a paragraph is also important. This expression \(\sqrt{3x-1}+(1+x)^2\) is an example of an inline equation. As you see, MathJax equations can be used this way as well, without unduly disturbing the spacing between lines.

3.2.1   An Identity of Ramanujan

\begin{equation*} \frac{1}{(\sqrt{\phi \sqrt{5}}-\phi) e^{\frac25 \pi}} = 1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}} {1+\frac{e^{-8\pi}} {1+\ldots} } } } \end{equation*}

3.2.2   Maxwell's Equations

\begin{align*} \nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\ \nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\ \nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\ \nabla \cdot \vec{\mathbf{B}} & = 0 \end{align*}

4   Error Handling

Any errors caught during processing will generate system messages.

|*** Expect 6 errors (including this one). ***|

There should be six messages in the following, auto-generated section, "Docutils System Messages":

Docutils System Messages

System Message: ERROR/3 (<string>, line 89); backlink

Undefined substitution referenced: "problematic".

System Message: ERROR/3 (<string>, line 603); backlink

Undefined substitution referenced: "*** Expect 6 errors (including this one). ***".

System Message: ERROR/3 (<string>, line 346); backlink

Unknown target name: "5".

System Message: ERROR/3 (<string>, line 355); backlink

Unknown target name: "nonexistent".

System Message: ERROR/3 (<string>, line 380); backlink

Unknown target name: "hyperlink reference without a target".

System Message: ERROR/3 (<string>, line 393); backlink

Duplicate target name, cannot be used as a unique reference: "duplicate target names".

How I customized my Nikola-powered site

1   Installation

First, create a virtualenv

$ mkvirtualenv <venv_name>

Now, install Nikola

$ pip install nikola

There are a lot of optional dependencies, the bulk of which can be installed like this

$ pip install nikola[extras]

If we use a feature that has missing dependencies, we will be informed anyway, so let's just go with a minimal installation and install any dependencies later on as needed.

On Mac OS X, even with Xcode installed and fully up-to-date, you may still encounter the following error

In file included from src/lxml/lxml.etree.c:346:
$WORKON_HOME/venv_name/build/lxml/src/lxml/includes/etree_defs.h:9:10: fatal error: 'libxml/xmlversion.h' file not found
#include "libxml/xmlversion.h"
         ^
1 error generated.
error: command 'clang' failed with exit status 1

The simple fix is to install lxml by itself first with STATIC_DEPS set to true, and then try installing Nikola again

$ STATIC_DEPS=true pip install lxml
$ pip install nikola

2   Initialization

Now we're ready to initialize the site

$ nikola init <site_root>
Creating Nikola Site
====================

This is Nikola v7.3.1.  We will now ask you a few easy questions about your new site.
If you do not want to answer and want to go with the defaults instead, simply restart with the `-q` parameter.
--- Questions about the site ---
Site title [My Nikola Site]: Louis Tiao
Site author [Nikola Tesla]: Louis Tiao
Site author's e-mail [n.tesla@example.com]: <email_address>
Site description [This is a demo site for Nikola.]: Computer Science / Math / Software Engineering
Site URL [http://getnikola.com/]: http://ltiao.github.io/
--- Questions about languages and locales ---
We will now ask you to provide the list of languages you want to use.
Please list all the desired languages, comma-separated, using ISO 639-1 codes.  The first language will be used as the default.
Type '?' (a question mark, sans quotes) to list available languages.
Language(s) to use [en]:

Please choose the correct time zone for your blog. Nikola uses the tz database.
You can find your time zone here:
http://en.wikipedia.org/wiki/List_of_tz_database_time_zones

Time zone [Australia/Sydney]:
    Current time in Australia/Sydney: 23:03:16
Use this time zone? [Y/n] Y
--- Questions about comments ---
You can configure comments now.  Type '?' (a question mark, sans quotes) to list available comment systems.  If you do not want any comments, just leave the field blank.
Comment system: ?

# Available comment systems:
#   disqus, facebook, googleplus, intensedebate, isso, livefyre, muut

Comment system: disqus
You need to provide the site identifier for your comment system.  Consult the Nikola manual for details on what the value should be.  (you can leave it empty and come back later)
Comment system site identifier: ltiao

That's it, Nikola is now configured.  Make sure to edit conf.py to your liking.
If you are looking for themes and addons, check out http://themes.getnikola.com/ and http://plugins.getnikola.com/.
Have fun!
[2015-03-15T12:03:26Z] INFO: init: Created empty site at ltiao.github.io.

Note that this also works on an existing directory [1], so you can create a repository on GitHub (which conveniently generates README, LICENSE, .gitignore files for you), clone it into <site_root>, and then execute the above initialization command. Or you can initialize the site first, and then initialize the git repository. Whatever tickles your fancy.

3   Create Demo Posts

Similar to Octopress, but unlike Pelican [2], Nikola provides you with commands for post and page creation

$ nikola help new_post
Purpose: create a new blog post or site page
Usage:   nikola new_post [options] [path]

Options:
  -p, --page                Create a page instead of a blog post. (see also: `nikola new_page`)
  -t ARG, --title=ARG       Title for the post.
  -a ARG, --author=ARG      Author of the post.
  --tags=ARG                Comma-separated tags for the post.
  -1                        Create the post with embedded metadata (single file format)
  -2                        Create the post with separate metadata (two file format)
  -e                        Open the post (and meta file, if any) in $EDITOR after creation.
  -f ARG, --format=ARG      Markup format for the post, one of rest, markdown, wiki, bbcode, html, textile, txt2tags
  -s                        Schedule the post based on recurrence rule
  -i ARG, --import=ARG      Import an existing file instead of creating a placeholder

$ nikola help new_page
Purpose: create a new page in the site
Usage:   nikola new_page [options] [path]

Options:
  -t ARG, --title=ARG       Title for the page.
  -a ARG, --author=ARG      Author of the post.
  -1                        Create the page with embedded metadata (single file format)
  -2                        Create the page with separate metadata (two file format)
  -e                        Open the page (and meta file, if any) in $EDITOR after creation.
  -f ARG, --format=ARG      Markup format for the page, one of rest, markdown, wiki, bbcode, html, textile, txt2tags
  -i ARG, --import=ARG      Import an existing file instead of creating a placeholder

Let's go ahead and create a new post. I mostly care about how code blocks, mathematical expressions and other reStructuredText extensions appear, so let's create a post with these syntax constructs. A good starting point is the reStructuredText Demonstration. Download the source as a .txt file and use it as a starting point for our new post

$ nikola new_post --title="ReStructuredText Demo" --import=demo.txt
Importing Existing Post
-----------------------

Title: ReStructuredText Demo
Scanning posts....done!
[2015-03-16T04:01:39Z] INFO: new_post: Your post's text is at: posts/restructuredtext-demo.rst

Note that some metadata is generated for us at the top, along with the contents of demo.txt

.. title: ReStructuredText Demo
.. slug: restructuredtext-demo
.. date: 2015-03-16 02:02:07 UTC+11:00
.. tags:
.. category:
.. link:
.. description:
.. type: text

...
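To make the header format concrete, here is a minimal sketch of reading those ".. key: value" lines back into a dictionary. The parse_metadata helper is hypothetical and is not Nikola's own parser; it only assumes that the header is a run of lines starting with ".. " and ends at the first line that doesn't.

```python
def parse_metadata(text):
    """Collect the leading '.. key: value' lines of a post into a dict."""
    meta = {}
    for line in text.splitlines():
        if not line.startswith(".. "):
            break  # the first non-metadata line ends the header
        # Split on the first colon only, so values such as dates stay intact.
        key, _, value = line[3:].partition(":")
        meta[key.strip()] = value.strip()
    return meta


header = """\
.. title: ReStructuredText Demo
.. slug: restructuredtext-demo
.. date: 2015-03-16 02:02:07 UTC+11:00
.. tags:
.. category:
.. link:
.. description:
.. type: text

Body of the post starts here.
"""

meta = parse_metadata(header)
print(meta["slug"])
print(meta["date"])
```

Empty fields such as tags simply come back as empty strings.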

For the most part, this looks pretty good.

../../images/rest_demo.thumbnail.png

There's something funky going on with the footnotes and citations

../../images/rest_demo_footnotes.thumbnail.png

It appears that the text color of some citations is white; they become visible when we highlight them.

Obviously, we haven't included the images corresponding to the demo, or even set up our images directory yet.

../../images/rest_demo_images.thumbnail.png

So let's go ahead and do just that.

3.1   Images

The default image-related options in conf.py are as follows

# #############################################################################
# Image Gallery Options
# #############################################################################

# One or more folders containing galleries. The format is a dictionary of
# {"source": "relative_destination"}, where galleries are looked for in
# "source/" and the results will be located in
# "OUTPUT_PATH/relative_destination/gallery_name"
# Default is:
GALLERY_FOLDERS = {"galleries": "galleries"}
# More gallery options:
THUMBNAIL_SIZE = 180
MAX_IMAGE_SIZE = 1280
USE_FILENAME_AS_TITLE = True
EXTRA_IMAGE_EXTENSIONS = []

# If set to False, it will sort by filename instead. Defaults to True
GALLERY_SORT_BY_DATE = True

# Folders containing images to be used in normal posts or
# pages. Images will be scaled down according to IMAGE_THUMBNAIL_SIZE
# and MAX_IMAGE_SIZE options, but will have to be referenced manually
# to be visible on the site. The format is a dictionary of {source:
# relative destination}.
IMAGE_FOLDERS = {'images': ''}
IMAGE_THUMBNAIL_SIZE = 400

So let's create an images directory, and a subdirectory for the images related to our demo

$ mkdir <site_root>/images
$ mkdir <site_root>/images/rest_demo

Next, download the images in http://docutils.sourceforge.net/docs/user/rst/images/ and place them in the subdirectory we just created

$ wget -nd -r -l 1 -P images/rest_demo/ -A jpeg,jpg,png,bmp,gif http://docutils.sourceforge.net/docs/user/rst/images/

In the posts/restructuredtext-demo.rst file, we must now update all occurrences of images/ to ../rest_demo/.

Now we should see

../../images/rest_demo_images_fixed.thumbnail.png

To reference the contents of the images directory within our post, we use ../, since the images and their thumbnails are outputted to the root of the output directory by default. Since our posts are outputted to output/posts, it makes sense that we reference the parent directory.

If we update the IMAGE_FOLDERS setting to

IMAGE_FOLDERS = {'images': 'images'}

the images will be outputted to output/images (which makes more sense in my opinion). But now, we must remember to update ../rest_demo/ to ../images/rest_demo/!
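The relative paths for both settings can be worked out mechanically. The sketch below is only an illustration of the reasoning above, not anything Nikola runs: image_reference is a hypothetical helper, biohazard.png is just an example file name, and it assumes posts are rendered directly under output/posts.

```python
import posixpath


def image_reference(image_path, image_dest="", post_dest="posts"):
    """Relative URL from a rendered post to an image copied via IMAGE_FOLDERS."""
    target = posixpath.join("output", image_dest, image_path)
    start = posixpath.join("output", post_dest)
    return posixpath.relpath(target, start=start)


# Default setting: IMAGE_FOLDERS = {'images': ''}
default_ref = image_reference("rest_demo/biohazard.png")
# Updated setting: IMAGE_FOLDERS = {'images': 'images'}
updated_ref = image_reference("rest_demo/biohazard.png", image_dest="images")

print(default_ref)
print(updated_ref)
```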

3.2   Extensions

Now let's check out some of the directives and roles that are not part of Docutils but are supported by Nikola.

3.2.1   Code Blocks

The code directive has been part of Docutils since version 0.9 and two aliases, code-block and sourcecode are also supported.

Let's add the following code block to the demo and see how it looks:

.. code-block:: python
   :number-lines:

   def sieve_of_eratosthenes():
       factors = defaultdict(set)
       for n in count(2):
           if factors[n]:
               for m in factors.pop(n):
                   factors[n+m].add(m)
           else:
               factors[n*n].add(n)
               yield n

Seems to look alright.

../../images/rest_demo_code_default.thumbnail.png
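The generator in the demo block relies on defaultdict and count, which the snippet itself doesn't import. A self-contained version for trying it out in an interpreter might look like the following; first_primes is a hypothetical helper added purely for illustration.

```python
from collections import defaultdict
from itertools import count, islice


def sieve_of_eratosthenes():
    """Incremental sieve: yields primes indefinitely."""
    # Maps each upcoming composite to the primes known to divide it.
    factors = defaultdict(set)
    for n in count(2):
        if factors[n]:
            # n is composite: move each prime on to its next multiple.
            for m in factors.pop(n):
                factors[n + m].add(m)
        else:
            # n is prime: its first composite worth recording is n squared.
            factors[n * n].add(n)
            yield n


def first_primes(k):
    """Return the first k primes as a list."""
    return list(islice(sieve_of_eratosthenes(), k))


print(first_primes(10))
```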

For any theme that uses Pygments (e.g. the default themes), we can modify the color scheme to any one of

  • autumn
  • borland
  • bw
  • colorful
  • default
  • emacs
  • friendly
  • fruity
  • manni
  • monokai
  • murphy
  • native
  • pastie
  • perldoc
  • rrt
  • tango
  • trac
  • vim
  • vs

You can use the Pygments online demo to see how each of these styles looks.

# Color scheme to be used for code blocks. If your theme provides
# "assets/css/code.css" this is ignored.
# Can be any of autumn borland bw colorful default emacs friendly fruity manni
# monokai murphy native pastie perldoc rrt tango trac vim vs
CODE_COLOR_SCHEME = 'default'

Obviously (and as noted in the comments), if you provide your own assets/css/code.css, this setting will have no effect. (If you want to dig in to the asset-copying mechanism, take a look at copy_assets.py.)

If you did dive in to that file, you'll probably notice that you have access to more color schemes than just those listed above. To list your available styles, run:

>>> from pygments.styles import get_all_styles
>>> sorted(list(get_all_styles()))
['autumn', 'borland', 'bw', 'colorful', 'default', 'emacs', 'friendly', 'fruity', 'igor', 'manni', 'monokai', 'murphy', 'native', 'paraiso-dark', 'paraiso-light', 'pastie', 'perldoc', 'rrt', 'tango', 'trac', 'vim', 'vs', 'xcode']

Let's go ahead and check out xcode.

CODE_COLOR_SCHEME = 'xcode'

../../images/rest_demo_code_xcode.thumbnail.png

In the end, I still think default looks best so I stuck with it.

3.2.2   MathJax

The next thing I care most about is how mathematical expressions are rendered. Mathematical expressions can be displayed inline using the math role:

While displaying equations look good for a page of samples, the
ability to mix math and text in a paragraph is also important.
This expression :math:`\sqrt{3x-1}+(1+x)^2` is an example of an
inline equation. As you see, MathJax equations can be used this
way as well, without unduly disturbing the spacing between lines.

while equations can be displayed with the math directive:

Here are some remarkable equations

An Identity of Ramanujan
************************

.. math::

   \frac{1}{(\sqrt{\phi \sqrt{5}}-\phi) e^{\frac25 \pi}} =
   1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}}
   {1+\frac{e^{-8\pi}} {1+\ldots} } } }

Maxwell's Equations
*******************

.. math::

   \nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
   \nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
   \nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
   \nabla \cdot \vec{\mathbf{B}} & = 0

Let's see what we get

../../images/rest_demo_math_no_tag.thumbnail.png

Oops... That's not what we want. Turns out, we forgot to add the mathjax tag to the post's tags metadata:

.. title: ReStructuredText Demo
.. slug: restructuredtext-demo
.. date: 2015-03-16 15:01:39 UTC+11:00
.. tags: mathjax
.. category:
.. link:
.. description:
.. type: text

Now we're in business.

../../images/rest_demo_math_tag.thumbnail.png

Note that these examples were ripped off from http://cdn.mathjax.org/mathjax/latest/test/sample.html.

3.2.3   Doc

I'm not that fussy about the other roles and directives, and I doubt I will need to use them very much. The last one that I will really need is the doc role, which allows you to link to other pages and articles.

Let's add a link to this post in our demo post:

This post is a demo for :doc:`how-i-customized-my-nikola-powered-site`.

This will use the post's title as the link's text. Alternatively, we can define our own text:

This post is a demo for :doc:`my interesting post <how-i-customized-my-nikola-powered-site>`.

You can see the demo post here.

3.3   IPython Notebooks

Support for writing content in IPython Notebooks is a big deal for me, and is the whole reason I switched to Nikola. Unlike Pelican, Nikola has built-in support for IPython Notebooks, so there's no need to mess around with all the hacky plugins that Pelican requires.

First, let's add *.ipynb files as a source format

POSTS = (
    ("posts/*.rst", "posts", "post.tmpl"),
    ("posts/*.txt", "posts", "post.tmpl"),
    ("posts/*.ipynb", "posts", "post.tmpl"),
)

Otherwise you'll encounter the exception:

Exception: Can't find a way, using your configuration, to create a post in format ipynb. You may want to tweak COMPILERS or POSTS in conf.py
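Roughly speaking, a source file can only be handled if its path matches one of the glob patterns in POSTS. The sketch below illustrates that idea with fnmatch; match_entry is a hypothetical helper and a simplification, since Nikola's real lookup also involves the COMPILERS setting.

```python
from fnmatch import fnmatch

POSTS = (
    ("posts/*.rst", "posts", "post.tmpl"),
    ("posts/*.txt", "posts", "post.tmpl"),
    ("posts/*.ipynb", "posts", "post.tmpl"),
)


def match_entry(path, entries=POSTS):
    """Return (destination, template) of the first matching entry, else None."""
    for pattern, destination, template in entries:
        if fnmatch(path, pattern):
            return destination, template
    return None


notebook = "posts/ipython-notebook-demo.ipynb"
print(match_entry(notebook))             # matched by the third entry
print(match_entry(notebook, POSTS[:2]))  # without the *.ipynb entry: no match
```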

Now let's create some example notebooks, starting with the sample notebook for demonstrating IPython's Rich Display System and Typesetting Equations:

$ nikola new_post --title="IPython Notebook Demo" --format=ipynb --import=Display\ System.ipynb
Importing Existing Post
-----------------------

Title: IPython Notebook Demo
Scanning posts.....done!
[2015-03-16T10:54:19Z] WARNING: new_post: This compiler does not support one-file posts.
[2015-03-16T10:54:19Z] INFO: new_post: Your post's metadata is at: posts/ipython-notebook-demo.meta
[2015-03-16T10:54:19Z] INFO: new_post: Your post's text is at: posts/ipython-notebook-demo.ipynb

This looks promising, but it doesn't have the nice In [#] / Out [#] formatting we're so used to seeing from IPython. Also, the tables look a bit condensed.

../../images/rest_demo_ipython_initial.thumbnail.png

The MathJax equations are also problematic

../../images/rest_demo_ipython_initial_math.thumbnail.png

Luckily, this is easily fixed by installing the ipython theme.

To see the list of available themes from the default repository, run:

$ nikola install_theme --list
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): themes.getnikola.com
Themes:
-------
blogtxt
bootstrap3-gradients
bootstrap3-gradients-jinja
ipython
ipython-xkcd
monospace
oldfashioned
planetoid
readable
reveal
reveal-jinja
zen
zen-ipython
zen-jinja

Note that the default themes such as bootstrap3 are shipped with Nikola and live in $WORKON_HOME/<venv_name>/lib/python2.7/site-packages/nikola/data/themes/.

Let's just go ahead and install ipython:

$ nikola install_theme ipython
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): themes.getnikola.com
[2015-03-16T11:07:25Z] INFO: install_theme: Downloading 'http://themes.getnikola.com/v7/ipython.zip'
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): themes.getnikola.com
[2015-03-16T11:07:26Z] INFO: install_theme: Extracting 'ipython' into themes/
[2015-03-16T11:07:26Z] NOTICE: install_theme: Remember to set THEME="ipython" in conf.py to use this theme.

This creates the directory themes under <site_root>. Let's now set this as our theme (conf.py):

# Name of the theme to use.
THEME = "ipython"

That's better.

../../images/rest_demo_ipython_themed.thumbnail.png ../../images/rest_demo_ipython_themed_math.thumbnail.png

But wait! The inline math expressions are still not rendered correctly. To fix this, we need to set the MATHJAX_CONFIG (in conf.py):

# If you are using the compile-ipynb plugin, just add this one:
MATHJAX_CONFIG = """
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
    tex2jax: {
        inlineMath: [ ['$','$'], ["\\\(","\\\)"] ],
        displayMath: [ ['$$','$$'], ["\\\[","\\\]"] ],
        processEscapes: true
    },
    displayAlign: 'left', // Change this to 'center' to center equations.
    "HTML-CSS": {
        styles: {'.MathJax_Display': {"margin": 0}}
    }
});
</script>
"""

And with that, we seem to be in a good way.

../../images/rest_demo_ipython_mathjax.thumbnail.png

You can see the demo post here.
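A side note on the backslashes in MATHJAX_CONFIG, since they pass through two layers of string escaping before MathJax sees them. This is just a sketch of the reasoning: the replace call below is a crude stand-in for how a JavaScript string literal collapses a doubled backslash, not a real JavaScript parser.

```python
# In conf.py the delimiter is written with three backslashes and a parenthesis.
# Python collapses the first two into one backslash and keeps the unrecognised
# escape for the parenthesis as-is, leaving two backslashes and a parenthesis.
# It is spelled with four backslashes here to avoid the unknown escape.
python_value = "\\\\("
assert python_value == r"\\("
assert len(python_value) == 3

# The browser then reads that text as a JavaScript string literal, where the
# doubled backslash collapses once more into a single one.
javascript_value = python_value.replace("\\\\", "\\")
print(javascript_value)
```

So the delimiter that MathJax ends up with is a single backslash followed by the parenthesis, which is the usual LaTeX inline math opener.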

4   Deployment

At this point, I'm assuming your Git repository is initialized but nothing has been committed yet. If this is not the case, you should consider starting a fresh repository, unless you really know what you are doing.

4.1   Github Pages

Since we're using a Github User / Organization page, the output of the site must be pushed to the master branch. But we still need to track our sources. So first we create a source branch. Note that by default, Nikola assumes the branch to be named deploy, but I find that kind of confusing.

$ git checkout -b source

The corresponding settings in conf.py:

# For user.github.io OR organization.github.io pages, the DEPLOY branch
# MUST be 'master', and 'gh-pages' for other repositories.
GITHUB_SOURCE_BRANCH = 'source'
GITHUB_DEPLOY_BRANCH = 'master'

# The name of the remote where you wish to push to, using github_deploy.
# GITHUB_REMOTE_NAME = 'origin'
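
The rule in the comment above boils down to a check on the repository name. A minimal sketch, where deploy_branch is a hypothetical helper and my-project is a made-up repository name:

```python
def deploy_branch(repository_name):
    """Branch that GitHub Pages serves for a given repository name."""
    if repository_name.lower().endswith(".github.io"):
        return "master"   # user or organization page
    return "gh-pages"     # any other (project) repository


print(deploy_branch("ltiao.github.io"))
print(deploy_branch("my-project"))
```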

Now, let's track and commit the relevant source files, which so far are just posts/, images/ and conf.py.

$ echo .DS_Store >> .gitignore
$ echo .ipynb_checkpoints >> .gitignore
$ git add conf.py images/ posts/
$ git commit -a -m 'initial commit'

And we should push it to Github:

$ git push origin source
Counting objects: 40, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (21/21), done.
Writing objects: 100% (21/21), 2.64 MiB | 264.00 KiB/s, done.
Total 21 (delta 2), reused 0 (delta 0)
To https://github.com/ltiao/ltiao.github.io.git
 * [new branch]      source -> source

We're also ready to deploy to Github Pages:

$ nikola github_deploy
Scanning posts.....done!
Scanning posts.....done!
[2015-03-16T13:05:50Z] INFO: github_deploy: ==> ['ghp-import', '-n', '-m', u'Nikola auto commit.\n\nSource commit: 708c86073cf740997166eacfdb65851acfa74b9d\nNikola version: 7.3.1', '-p', '-r', u'origin', '-b', u'master', u'output']
Counting objects: 144, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (70/70), done.
Writing objects: 100% (143/143), 5.24 MiB | 570.00 KiB/s, done.
Total 143 (delta 69), reused 143 (delta 69)
To https://github.com/ltiao/ltiao.github.io.git
   f8605ee..e159dec  master -> master

Caution!

The nikola github_deploy command doesn't seem to generate the required thumbnails for the thumbnail directive.

5   Theme Customization

This is optional, but since I want to be able to distribute my custom theme, and also reuse it in other projects, I created a new repository on Github, and chose the Sass option for the .gitignore file. Next, I cloned it into <site_root>/themes.

Now, let's create a new theme based on ipython using the Bootswatch theme yeti (which is basically the default Foundation look implemented in Bootstrap).

$ nikola help bootswatch_theme
Purpose: given a swatch name from bootswatch.com and a parent theme, creates a custom theme
Usage:   nikola bootswatch_theme [options]

Options:
  -n ARG, --name=ARG        New theme name (default: custom)
  -s ARG                    Name of the swatch from bootswatch.com.
  -p ARG, --parent=ARG      Parent theme name (default: bootstrap3)

$ nikola bootswatch_theme --name=tiao --parent=ipython -s yeti

Caution!

You're likely to encounter the exception:

SSLError: hostname 'bootswatch.com' doesn't match either of 'ssl2000.cloudflare.com', 'cloudflare.com', '*.cloudflare.com'

The quick fix if you're desperate to get up and running is to add the keyword argument verify=False to the call to the get method on line 100 of bootswatch_theme.py. But for the sake of security, you'd better wait for this to be fixed.

Note

I'd much prefer it if we could specify a Bootswatch theme in the configuration file since it wouldn't be too difficult to support. As it is, we must create a custom theme if we want to use a different Bootswatch theme.

Now we can set our theme to tiao.

THEME = 'tiao'

and we should now see a yeti version of our site.

I don't think navigation bars are well suited to personal websites and blogs, and that's the main thing I want to address with my custom theme.

To do this, we must modify the base template base.tmpl. Let's first copy it from the parent and take it from there. You'll note that ipython does not actually have a base.tmpl; it uses its parent's, namely bootstrap3-jinja.

$ ls themes/ipython/templates/
base_helper.tmpl  index.tmpl    post.tmpl

$ cat themes/ipython/parent
bootstrap3-jinja

$ mkdir themes/tiao/templates

$ cp $WORKON_HOME/<venv_name>/lib/python2.7/site-packages/nikola/data/themes/bootstrap3-jinja/templates/base.tmpl themes/tiao/templates/base.tmpl

Now we replace everything between <!-- Menubar --> ... <!-- End of Menubar --> with

<div class="container">
    <div class="page-header">
        {% if search_form %}
        {{ search_form }}
        {% endif %}
        <ul class="nav nav-pills pull-right">
            {% block belowtitle %}
            {% if translations|length > 1 %}
                <li>{{ base.html_translations() }}</li>
            {% endif %}
            {% endblock %}
            {% if show_sourcelink %}
                {% block sourcelink %}{% endblock %}
            {% endif %}
            {{ template_hooks['menu_alt']() }}
        </ul>
        <a  href="{{ abs_link(_link("root", None, lang)) }}">
            {% if logo_url %}
                <img src="{{ logo_url }}" alt="{{ blog_title }}" id="logo">
            {% endif %}
            <h2 class="text-muted">
            {% if show_blog_title %}
                <span id="blog-title"><strong>{{ blog_title }}</strong></span>
            {% endif %}
            </h2>
        </a>
    </div> <!-- ./page-header -->
</div> <!-- ./container -->

We also replace the body by

<div class="container" id="content" role="main">
    <div class="row">
        <div class="col-sm-3 col-md-2">
            <ul class="nav nav-pills nav-stacked">
                {{ base.html_navigation_links() }}
                {{ template_hooks['menu']() }}
            </ul>
        </div> <!-- ./col -->
        <div class="col-sm-9 col-md-10">
            <div class="body-content">
                <!--Body content-->
                <div class="row">
                    {{ template_hooks['page_header']() }}
                    {% block content %}{% endblock %}
                </div>
                <!--End of body content-->
            </div>
        </div> <!-- ./col -->
    </div> <!-- ./row -->
</div> <!-- ./container -->

<footer class="footer">
    <div class="container">
        <p class="text-muted">
            {{ content_footer }}
            {{ template_hooks['page_footer']() }}
        </p>
    </div>
</footer>

and lastly create assets/css/custom.css and add

html {
  position: relative;
  min-height: 100%;
}

body {
  margin-top: 0px;
  /* Margin bottom by footer height */
  margin-bottom: 60px;
}

.page-header a {
  text-decoration: none;
}

.footer {
  position: absolute;
  bottom: 0;
  width: 100%;
  /* Set the fixed height of the footer here */
  height: 60px;
  background-color: #f5f5f5;
}

.footer .text-muted {
  margin: 20px 0;
}

What this does is summarized below:

  1. Creates a page header in place of the navigation bar.
  2. Reduces the top margin of the body (set to 0px in the CSS above).
  3. Creates a sidebar to contain all the navigation links, i.e. the main menu.
  4. Creates navigation pills, pulled to the right on the page header, which contain the search form, source link and translation links, i.e. the alt menu.
  5. Sticks the footer to the bottom, loosely based on the Bootstrap Sticky footer example and its corresponding CSS file.

TODO

Since we are now using the vertical (stacked) nav pills, we can still use one-level submenus, but this will create a dropdown menu, which looks ugly for vertical navs. With Bootstrap 2.3.2 you could use the nav-header class to create a grouped list, but this no longer exists in Bootstrap 3. So we define our own. Coming soon.

6   Tweaks

6.1   Stories vs. Pages

One thing that peeved me was that pages are frequently referred to as stories throughout the documentation and configuration file. Calling them stories doesn't even make much sense, and calling them by two different names makes it even more confusing.

For example,

PAGES = (
    ("stories/*.rst", "stories", "story.tmpl"),
    ("stories/*.txt", "stories", "story.tmpl"),
)

So pages are in the stories directory and use the story.tmpl template? Why not just call them pages?

PAGES = (
    ("pages/*.rst", "pages", "story.tmpl"),
    ("pages/*.txt", "pages", "story.tmpl"),
)

I left the template file as is because I couldn't be bothered with template inheritance at this point, but it is possible to modify it if you're keen. But now, at least the pages are outputted to the pages/ directory, which is reflected in the URL for pages, and that is what I care about the most. I also changed the input directory to pages/ for consistency.

6.2   Site first, blog second

I want my site to be first and foremost just that - a site, perhaps with the occasional post, notebook, recipe or what have you, rather than a blog with some static pages. The way to do this is more or less outlined in Creating a Site (Not a Blog) with Nikola.

I changed the INDEX_PATH from the default ("") to "posts", so that the index of all posts will be generated under the posts subdirectory rather than the root of the output directory.

# Final location for the main blog page and sibling paginated pages is
# output / TRANSLATION[lang] / INDEX_PATH / index-*.html
INDEX_PATH = "posts"

This leaves me free to create my own index page for the site and there are a number of ways to go about this. Perhaps the most straightforward approach is to change the destination of pages to "", so all pages will go directly to the output directory.

PAGES = (
    ("pages/*.rst", "", "story.tmpl"),
    ("pages/*.txt", "", "story.tmpl"),
)

Now we can create our own homepage

$ nikola new_page --title=Home
Creating New Page
-----------------

Title: Home
Scanning posts.....done!
[2015-04-02T10:04:58Z] INFO: new_page: Your page's text is at: pages/home.rst

Since we need the output to be named "index.html", we simply have to modify the slug metadata of the page to be index:

.. slug: index

Important

If we had left INDEX_PATH = "" and set the slug of our new page to index, Nikola wouldn't be able to decide what to generate as the index.html at the site's root and would give the following error when building

ERROR: Two different tasks can't have a common target.'output/index.html' is a target for render_indexes:output/index.html and render_pages:output/index.html.
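
To see why the two tasks collide, it helps to write out the targets each one would produce. The sketch below is a simplified model of the paths described in this section; index_target, page_target and conflicts are hypothetical helpers, and translations and other URL options are ignored.

```python
import posixpath


def index_target(index_path):
    """Output file for the main blog index, given INDEX_PATH."""
    return posixpath.join("output", index_path, "index.html")


def page_target(destination, slug):
    """Output file for a page, given its PAGES destination and its slug."""
    return posixpath.join("output", destination, slug + ".html")


def conflicts(index_path, destination, slug):
    """True if the blog index and the page would be written to the same file."""
    return index_target(index_path) == page_target(destination, slug)


# INDEX_PATH = "" with a root-level page whose slug is index: same target.
print(conflicts("", "", "index"))
# INDEX_PATH = "posts": the blog index moves to output/posts/index.html.
print(conflicts("posts", "", "index"))
```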

We can now fill this with whatever content we wish. Personally, I would have liked to be able to specify an existing page to use as the homepage, and the closest thing to achieving this is using the reStructuredText .. include:: directive. For now, I set my about page as my homepage:

.. include:: pages/about.rst

Of course, you can also augment this with other content, such as a Post List.

6.3   Tags and Categories

By default, TAG_PATH = "categories" and CATEGORY_PATH = "categories", which means that tags and categories are displayed on the same page. Furthermore, pages for both tags and categories are outputted to the same directory, so category pages must always be prefixed and the default is cat_.

To separate tags from categories, I set TAG_PATH = "tags" (which IMHO is what the default ought to be), leave CATEGORY_PATH = "categories" and get rid of the category prefix by setting CATEGORY_PREFIX = "".

Finally, pages for both tags and categories will simply contain a list of links. For me personally, I like this behavior for tags, but I would like for category pages to contain the posts themselves. So we can leave the default TAG_PAGES_ARE_INDEXES = False and set CATEGORY_PAGES_ARE_INDEXES = True. This is summarized below.

# Paths for different autogenerated bits. These are combined with the
# translation paths.

# Final locations are:
# output / TRANSLATION[lang] / TAG_PATH / index.html (list of tags)
# output / TRANSLATION[lang] / TAG_PATH / tag.html (list of posts for a tag)
# output / TRANSLATION[lang] / TAG_PATH / tag.xml (RSS feed for a tag)
TAG_PATH = "tags"

# If TAG_PAGES_ARE_INDEXES is set to True, each tag's page will contain
# the posts themselves. If set to False, it will be just a list of links.
# TAG_PAGES_ARE_INDEXES = False

# Final locations are:
# output / TRANSLATION[lang] / CATEGORY_PATH / index.html (list of categories)
# output / TRANSLATION[lang] / CATEGORY_PATH / CATEGORY_PREFIX category.html (list of posts for a category)
# output / TRANSLATION[lang] / CATEGORY_PATH / CATEGORY_PREFIX category.xml (RSS feed for a category)
CATEGORY_PATH = "categories"
CATEGORY_PREFIX = ""

# If CATEGORY_PAGES_ARE_INDEXES is set to True, each category's page will contain
# the posts themselves. If set to False, it will be just a list of links.
CATEGORY_PAGES_ARE_INDEXES = True

This is one of the things that makes Nikola great - the degree of control you have over how your site is rendered.
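
The effect of these settings on the output paths can be sketched as follows. This is a simplified model based on the "Final locations" comments in the configuration: tag_page and category_page are hypothetical helpers, and translations and slug normalization are ignored.

```python
import posixpath


def tag_page(tag, tag_path="categories"):
    """Page listing the posts for a tag (default TAG_PATH)."""
    return posixpath.join("output", tag_path, tag + ".html")


def category_page(category, category_path="categories", prefix="cat_"):
    """Page listing the posts for a category (default path and prefix)."""
    return posixpath.join("output", category_path, prefix + category + ".html")


# Defaults: both live in output/categories, so the prefix keeps them apart.
print(tag_page("python"))
print(category_page("python"))

# With TAG_PATH = "tags" and CATEGORY_PREFIX = "", no prefix is needed.
print(tag_page("python", tag_path="tags"))
print(category_page("python", prefix=""))
```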

6.5   Google Analytics

# Google Analytics or whatever else you use. Added to the bottom of <body>
# in the default template (base.tmpl).
# (translatable)
BODY_END = """
<script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

  ga('create', <TRACKING_ID>, 'auto');
  ga('send', 'pageview');

</script>
"""

6.7   Favicons

I created a favicon in a bunch of different sizes and placed them in files/ to be copied to output/ by the copy_files task. The relevant conf.py settings are shown below

# FAVICONS contains (name, file, size) tuples.
# Used to create favicon links like this:
# <link rel="name" href="file" sizes="size"/>
FAVICONS = (
    ("icon", "/favicon_16x16.ico", "16x16"),
    ("icon", "/favicon_32x32.ico", "32x32"),
    ("icon", "/favicon_256x256.ico", "256x256"),
)
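
For illustration, here is how such tuples map onto the link format given in the comment. The favicon_links helper is hypothetical; the theme templates do the real rendering.

```python
FAVICON_ENTRIES = (
    ("icon", "/favicon_16x16.ico", "16x16"),
    ("icon", "/favicon_32x32.ico", "32x32"),
    ("icon", "/favicon_256x256.ico", "256x256"),
)


def favicon_links(entries):
    """Render one <link> tag per (name, file, size) tuple, in order."""
    template = '<link rel="{0}" href="{1}" sizes="{2}"/>'
    return [template.format(name, path, size) for name, path, size in entries]


links = favicon_links(FAVICON_ENTRIES)
for link in links:
    print(link)
```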

6.8   Teasers

With most static site/blog generators, the teaser is often defined by showing only the first \(n\) words of the content and truncating the rest, or allowing you to provide a teaser in the summary field of the post's metadata. Nikola allows you to define when the teaser ends in each post on an individual basis, using the following directive:

.. TEASER_END

Now, showing only teasers is disabled by default so you will have to enable this in the conf.py as shown below. Note also the collection of template variables that are given to you for customizing the read more links. Impressive.

# Show only teasers in the index pages? Defaults to False.
INDEX_TEASERS = True

# HTML fragments with the Read more... links.
# The following tags exist and are replaced for you:
# {link}                        A link to the full post page.
# {read_more}                   The string “Read more” in the current language.
# {reading_time}                An estimate of how long it will take to read the post.
# {remaining_reading_time}      An estimate of how long it will take to read the post, sans the teaser.
# {min_remaining_read}          The string “{remaining_reading_time} min remaining to read” in the current language.
# {paragraph_count}             The amount of paragraphs in the post.
# {remaining_paragraph_count}   The amount of paragraphs in the post, sans the teaser.
# {{                            A literal { (U+007B LEFT CURLY BRACKET)
# }}                            A literal } (U+007D RIGHT CURLY BRACKET)

# 'Read more...' for the index page, if INDEX_TEASERS is True (translatable)
INDEX_READ_MORE_LINK = '<p class="more"><a href="{link}">{read_more}…</a></p>'
# 'Read more...' for the RSS_FEED, if RSS_TEASERS is True (translatable)
RSS_READ_MORE_LINK = '<p><a href="{link}">{read_more}…</a> ({min_remaining_read})</p>'
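
The placeholder syntax (including the doubled braces for literal braces) is that of Python's str.format, so a rough idea of the result can be had by filling the fragments in by hand. This is only a sketch under that assumption; the values below are made up, and Nikola supplies the real ones when it builds the site.

```python
INDEX_READ_MORE_LINK = '<p class="more"><a href="{link}">{read_more}…</a></p>'
RSS_READ_MORE_LINK = '<p><a href="{link}">{read_more}…</a> ({min_remaining_read})</p>'

# Hypothetical values for a single post.
values = {
    "link": "/posts/restructuredtext-demo.html",
    "read_more": "Read more",
    "min_remaining_read": "3 min remaining to read",
}

# str.format ignores keyword arguments that a template doesn't use.
index_html = INDEX_READ_MORE_LINK.format(**values)
rss_html = RSS_READ_MORE_LINK.format(**values)

print(index_html)
print(rss_html)
```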

6.12   Languages

[1] Unlike the initialization commands of more prominent projects, such as Django (django-admin.py startproject), Scrapy (scrapy startproject) and probably others.
[2] Admittedly not a huge predicament since one can trivially implement such a command in Pelican, given that projects come equipped with a fabfile.py and a Makefile, that makes extensive use of it.

Serious shortcomings of n-gram feature spaces in text classification

The major drawback of feature spaces represented by \(n\)-gram models is extreme sparsity.

But even more unsettling is that it can only interpret unseen instances with respect to learned training data. That is, if a classifier learned from the instances 'today was a good day' and 'that is a ridiculous thing to say', it is unable to say much about the instance 'i love this song!' since the features are 'today', 'was', 'a', 'good', 'day', 'that', 'is', 'ridiculous', 'thing', 'to', 'say'.

It is impossible to classify this new instance because it is entirely meaningless to the classifier - it cannot be represented. So no matter how many millions of instances the classifier learns from, by knowing the feature space, one can always artificially construct "hard" examples by using words not in the feature space.
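
A tiny unigram (bag-of-words) sketch makes this concrete. It assumes the simplest possible setup: whitespace tokenization, no normalization, and a vocabulary built only from the two training sentences. build_vocabulary and vectorize are hypothetical helpers, not any particular library's API.

```python
def build_vocabulary(sentences):
    """Sorted list of the distinct whitespace-separated tokens seen in training."""
    return sorted({word for sentence in sentences for word in sentence.split()})


def vectorize(sentence, vocabulary):
    """Count how often each vocabulary word occurs in the sentence."""
    tokens = sentence.split()
    return [tokens.count(word) for word in vocabulary]


training = ["today was a good day", "that is a ridiculous thing to say"]
vocabulary = build_vocabulary(training)

seen = vectorize("today was a good day", vocabulary)
unseen = vectorize("i love this song!", vocabulary)

print(vocabulary)
print(seen)
print(unseen)  # every entry is zero: the instance has no representation
```

The unseen sentence maps to the all-zero vector, exactly as any other sentence built from out-of-vocabulary words would, so the classifier has nothing to distinguish them by.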

So we see this model is only well-suited for extremely large amounts of training data [1] - but even then, there is no guarantee that it is able to represent all unseen instances in its feature space.

The Iris flower data set is a very typical test case for many statistical classification techniques. An interesting observation is that for an English sentence to be valid, it need not necessarily contain specific words, like 'was' or 'good' for example. Yet, for an iris flower to be an iris flower, it necessarily has sepals and petals with their respective widths and lengths.

[1] Halevy, Alon, Peter Norvig, and Fernando Pereira. "The unreasonable effectiveness of data." Intelligent Systems, IEEE 24.2 (2009): 8-12.