Semantic Versioning for Papers: A Manifesto

In writing the paper I’m currently working on, I have decided to adopt a semantic versioning scheme for each draft. There are probably plenty of versioning schemes out there, and I got a bit fed up with the ones where people tack _INITIALS onto the end of the file name. Moreover, I found value in tracking the evolution of the paper – in other words, how does the final product look compared to the original? Therefore, I thought, why not adopt a numbering system that semantically makes sense, the same way that it works for code?

Shamelessly copying http://semver.org, here’s my proposed scheme.

Given a version number MAJOR.MINOR.PATCH, for a prose body of text, increment the:

  1. MAJOR version after each submission, to keep track of the number of times a paper has been submitted.
  2. MINOR version after each:
    1. re-arrangement of logic,
    2. large word-smithing or rephrasing of things, and
    3. addition of new insights compared to the previous version
  3. PATCH version after making:
    1. grammatical or spelling changes, or
    2. substitutions of individual words for other words.
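
The increment rules above can be sketched in a few lines of Python. This is only an illustration: the function name `bump` and the change labels are made up for the example, and resetting the lower-order numbers to zero on a MAJOR or MINOR bump is an assumption carried over from semver.org rather than something stated in the scheme itself.

```python
def bump(version, change):
    """Return the next draft version for a given kind of change.

    `change` is one of "major" (a submission), "minor" (re-arranged logic,
    large rephrasing, new insights) or "patch" (spelling, grammar, single
    word substitutions). These labels are hypothetical.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        # Assumption borrowed from semver.org: lower-order numbers reset.
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


draft = "0.1.0"
draft = bump(draft, "patch")  # fixed a typo
draft = bump(draft, "minor")  # re-arranged the argument
print(draft)
```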

I will note that formatting is intentionally not dealt with here, but is assumed to be part of the MAJOR version increment when formatting a manuscript for submission. This is because a writer ought not to be concerned with formatting in the writing stages. A writer ought to be most concerned with getting his/her thoughts into prose form.

Now, for the figures, which I believe should be developed in parallel but separately from the text.

Given a version number MAJOR.MINOR.PATCH, for a document that lays out the organization of figures, increment the:

  1. MAJOR version after each submission.
  2. MINOR version after each:
    1. addition, removal or rearrangement of figures,
    2. changing of figure representations (e.g. scatterplot changed to 2D histogram),
    3. major changes to the figure caption/legend
  3. PATCH version after each:
    1. grammatical or spelling changes, in the figure caption/legend,
    2. minor word substitutions or additions/deletions in the figure caption/legend,
    3. resizing of figures for aesthetic purposes.

So far, I have tried to keep the figure versions in sync with the text versions to keep things really simple. This system has worked well, as I usually export both the text and the figures at the same time, incrementing whichever needs to be incremented. When this first manuscript is done, the next step will be to run a ‘diff’ to see how the final version differs from version 0.1.0. Can’t wait for that to happen!
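
For the curious, here is a minimal sketch of what that ‘diff’ could look like using Python’s standard-library `difflib`. The two draft strings and the labels are invented for the example; in practice they would be the exported text of version 0.1.0 and of the final draft.

```python
import difflib


def diff_drafts(old_text, new_text, old_label="v0.1.0", new_label="final"):
    """Return a unified diff between two drafts as a single string."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_label, tofile=new_label
        )
    )


# Hypothetical drafts, standing in for the exported manuscript text.
draft_v010 = "Hive plots are useful.\nWe show one example.\n"
draft_final = "Hive plots are useful.\nWe show two examples.\n"

report = diff_drafts(draft_v010, draft_final)
print(report)
```

Lines prefixed with `-` appear only in the older draft, and lines prefixed with `+` appear only in the newer one.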

hiveplot 0.1.7

This is the current release of hiveplot on PyPI. Because I haven’t needed it for other use cases, I have paused maintenance on the package. However, please feel free to fork it and send PRs! Most needed is a set of tests for the package, as well as much better documentation than I have provided so far. :-)

hiveplot 0.1.0 on PyPI!

I am happy to announce that I have made my first Python package and uploaded it to PyPI!

This package is called hiveplot, and its sole purpose is to generate Hive Plots from network data. The API is simple – once the data are prepared, it is a single function call to generate the hive plot.

Hive plots were conceived by Martin Krzywinski of the BCGSC, but for the longest time, code to draw hive plots was only available in R, Java, Perl and JavaScript. Even Python-based drawing of hive plots required the use of d3.js. I wanted everything in pure Python – and since I had a need for hive plots myself, I built the package to suit my needs.

By design, I have implemented everything in pure Python, using only the simplest of Python data structures (lists & dictionaries) and a minimal number of dependencies (numpy and matplotlib). A simple tutorial is also available on the repository page.
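
To give a flavour of how far lists and dictionaries can go, here is a rough sketch of the core layout idea behind a hive plot: each group of nodes gets its own radial axis, and nodes are placed along that axis. This is not the hiveplot package’s API; the function name `hive_layout`, its parameters, and the toy data are all made up for illustration, and real hive plots draw edges as curves rather than the straight endpoint pairs computed here.

```python
import math


def hive_layout(groups, inner_radius=1.0, spacing=1.0):
    """Place each group of nodes on its own radial axis.

    `groups` maps a group name to an ordered list of nodes. Returns a
    dictionary mapping each node to its (x, y) position.
    """
    positions = {}
    n_axes = len(groups)
    for axis_index, nodes in enumerate(groups.values()):
        # Axes are spread evenly around the circle.
        angle = 2 * math.pi * axis_index / n_axes
        for rank, node in enumerate(nodes):
            # Nodes later in the list sit further from the centre.
            radius = inner_radius + rank * spacing
            positions[node] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


# Toy network: three groups of nodes and a few edges between them.
nodes = {"group1": ["a", "b"], "group2": ["c"], "group3": ["d", "e"]}
edges = [("a", "c"), ("b", "d"), ("c", "e")]

positions = hive_layout(nodes)
# Endpoints of each edge, ready to hand to a plotting library.
segments = [(positions[u], positions[v]) for u, v in edges]
```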

Tests are currently not available; I am preparing figures for publication right now and have yet to find the time to write them. If contributors would like to help me set this up, I would welcome it! (It would also be a learning experience for me.)

I hope you enjoy the package and the aesthetically beautiful hive plots that come out from your work with this package :-).

And here is a hive plot made using hiveplot with data from my own work:

[Image: hive plot]

Links to:

  1. PyPI page
  2. GitHub repository page

Hive Plot in Python!

For the longest time, nobody has implemented a Hive Plot in matplotlib. Today, I tried. It’s non-trivial.

Source code will come in a few days, once I am done with generalizing it for use with the NetworkX package.

In the meantime, I hope you enjoy the plot below, made entirely with Python code, using the matplotlib API.

[Image: hive plot]

To boost productivity, work analog

Cal Newport recently wrote a blog article about the benefits of working away from the computer. Well, I’m a coder – would that work at all?

Turns out, it does. I finally understand why the exams I took in my only CS class (CPSC111, UBC Vancouver, now CPSC110) were on pen and paper. Writing down code on a piece of paper in ink forces me to commit to a train of thought. I cannot easily erase my code without visibly crossing out my mistakes and re-writing below them. There is something to be said about actually being able to see the mistakes I made – it increases my ability to commit those mistakes to memory. If I were typing this stuff on a computer, I would likely erase those “mistakes” without a trace, and in turn commit those mistakes repeatedly.

Thanks, Cal, for writing good advice :).

How to use TextExpander to conveniently timestamp every thought in Evernote

tl;dr: Set up a time stamping snippet in your automatic text expansion software (e.g. TextExpander).

I use Evernote as an electronic lab notebook, and I have extolled its virtues as an ELN in a previous post. While it is important to organize my thoughts by theme, for example, by logical steps in a procedure, it’s also important to timestamp them to help us get a sense for how our thoughts are progressing. Sometimes we will end up non-linearly editing a note. For example, I might have a computational experiment where I record my thoughts as I go along for the first few steps, but I decide later that the earlier steps need revising. If I were to just go back and edit the earlier steps, I have no historical record of what I had done earlier, which makes it tough to disentangle my original thoughts and the changes later made.

To get around this, I have opted to timestamp everything I type into a note. However, it gets troublesome when I have to type 05 February 2015 11:28:AM every single time. How do I get around this, then? Thankfully, TextExpander (or any other automatic text snippet expansion software) can help.

I set up a shortcut, in my case “;dtime”, which TextExpander then expands to
%d %B %Y %1I:%M:%p.

And so anytime I want to type in the current time and date, all I do is type “;dtime”. Try it!

05 February 2015 11:31:AM

05 February 2015 11:31:AM

05 February 2015 11:31:AM

The convenience is addicting at first, and after a while, it becomes second nature – I’ve tried to do text expansion on computers without TextExpander, and found it to be quite frustrating (an understatement).
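
If you don’t have TextExpander handy, the same stamp can be approximated in Python. This is a sketch, not a port of TextExpander: the function name `dtime_stamp` is made up, and it assumes that `%1I` in the snippet above means the 12-hour clock hour without a leading zero. Month names and AM/PM come from `strftime`, so they follow the current locale.

```python
from datetime import datetime


def dtime_stamp(moment):
    """Format a datetime like the ;dtime snippet, e.g. 05 February 2015 11:31:AM."""
    # Assumption: %1I is the 12-hour clock hour with no leading zero.
    hour_12 = moment.hour % 12 or 12
    date_part = moment.strftime("%d %B %Y")
    minute_and_meridiem = moment.strftime("%M:%p")
    return f"{date_part} {hour_12}:{minute_and_meridiem}"


print(dtime_stamp(datetime.now()))
```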

Note Histories make Evernote Premium viable as an electronic lab notebook

Available only to Premium users, Evernote offers a feature called “Note History”. Essentially, it’s version control for digital lab notebooks. It takes a snapshot of notes a few times a day, unobtrusively, and in the background. This gives it two advantages as a lab notebook, where the integrity of the content is important.

Firstly, it provides a verifiable history of any particular note. As changes are made to it, the snapshots, taken roughly every quarter of a day, can be viewed in sequence to transparently show any changes that have happened. If any forensic investigations are needed, the snapshots provide a good starting place. (If they would allow a snapshot per sync, that would be even better, but I’m guessing that would take too much space.)

Secondly, because of the tracked history, I can actually track the progress of a note as I update it. For example, I usually write down progress pertaining to each computational experiment in a single note. If the experiment (computational or wet-lab) takes more than a few days to complete, I can go into the note histories to get a sense for how things are progressing in a time-stamped fashion. I have also used TextExpander to help me time-stamp my thoughts as I add them in (that will be in a separate post).

If Evernote can make (1) diff/merge and (2) real-time collaborative editing capabilities for notes, that would really bring their “single workspace” idea up one notch!