Planet Crustaceans

This is a Planet instance for community feeds. To add/update an entry or otherwise improve things, fork this repo.

November 11, 2019

Patrick Louis (venam)

No, Alfa isn't draining your data without your knowledge November 11, 2019 10:00 PM

Allegory of the Truth by Cesare d’Arpino in Iconologia

In Lebanon, conspiracy theories are such a common occurrence that the whole world but yourself is to blame for your ailments.
I usually dismiss them, but the one in this post got on my nerves; moreover, a fairly simple experiment could finally shatter it and remove it as an option from all conversations.

The conspiracy goes as follows:

When I visit Lebanon my data is consumed way faster on a Lebanese carrier. They deny it and say that I am delusional; I guess they enjoy being screwed with.

A MB isn’t a MB in Lebanon

It pisses me off that it seems like data runs out much quicker in Lebanon, I am confident that they do not calculate this properly

The data consumption shown by operators in Lebanon isn’t the real data you are consuming, they’re playing with numbers

They screw people with crazy data consumption calculation! I think it has something to do with how they handle uploads.


Let’s first assess the plausibility of the scenario.
Let’s first assess the plausibility of the scenario.
We know that Touch and Alfa, the two mobile operators in Lebanon, benefit from a duopoly, and thus have full leverage over data prices. It is unlikely that, with such control, they would try to cheat on the usage calculation; why would they have gone through a recent drop in data prices otherwise? Additionally, there’s the risk that people would notice the discrepancy between the data actually consumed and what carriers show on their websites. In this regard, we all have a feature in our phones that tracks exactly this.
Then why do people still believe in this conspiracy? I’m not sure but let’s gather actual numbers that speak for themselves. I’ll limit myself to Alfa as I don’t have a subscription with Touch.

The questions we want to answer:

  • Are the numbers shown on the Alfa website the same as the actual data being consumed?

  • Are upload and download considered the same in the mobile data consumption calculation?

What we’ll need:

  • A machine that we have full control over with no data pollution
  • Scripts to download and upload specific amounts of data
  • A tool to monitor how much is actually used on the network as both download and upload
  • A machine that is on a different network to check the consumption on Alfa website

Let’s first test the monitoring tool and get a feel for the values we should expect. For this, Wireshark running on a Linux machine with nothing else running will do the job.

We need a way to download and upload specific amounts of data. I’ve prepared three files of different sizes for this: 1KB, 10KB, and 100KB. Let’s remind readers that 1KB is 1024 bytes, and thus a 10KB file is composed of 10240 bytes.
I’ve then added those files on my server for download and set a PHP script for the upload test.


<?php
// Echo back how many bytes were received in the upload.
$size = $_FILES["fileToUpload"]["size"];
echo("up $size bytes");

What does a download request look like?

> curl '' 


Trace of download

In Wireshark we see:

TX: 2103, RX 5200 -> TOTAL 7300B

TX stands for transmit and RX for receive. Overall ~7KB both ways: 2KB upload, 5KB download.

That’s 7KB for a download of only 1KB; is something wrong? Nope, nothing’s wrong: you have to account for the client/server TCP handshake, the HTTP headers, and the exchange of certificates for TLS. This overhead δ becomes more negligible the bigger the request is, and the longer we keep the connection open, the less we spend on connection initialization.
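To get a feel for how this fixed cost amortizes, here’s a rough sketch. The ~6KB per-connection overhead is an approximation eyeballed from the traces above, not an exact constant:

```python
# Approximate per-connection overhead (TCP handshake, TLS certificate
# exchange, HTTP headers), eyeballed from the Wireshark traces: ~6KB.
OVERHEAD_BYTES = 6 * 1024

def total_on_wire(payload_bytes: int) -> int:
    """Rough bytes seen on the network for one short-lived HTTPS request."""
    return payload_bytes + OVERHEAD_BYTES

for size_kb in (1, 10, 100):
    payload = size_kb * 1024
    total = total_on_wire(payload)
    share = 100 * OVERHEAD_BYTES / total
    print(f"{size_kb:>3}KB payload -> ~{total / 1024:.1f}KB on the wire "
          f"({share:.0f}% overhead)")
```

The shares line up with the measurements: huge for a 1KB request, shrinking quickly as the payload grows.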

The average numbers expected for different sizes:

> curl '' 

TX:2278B RX:14KB -> TOTAL 16.3KB

> curl '' 

TX:6286B RX:113KB -> TOTAL 119.2KB

And regarding upload we have:


Trace of upload

curl -F "fileToUpload=@1kb" ''

TX: 3475, RX 4133 -> 7608B

curl -F "fileToUpload=@10kb" ''

TX: 13KB, RX 4427 -> 17.4KB

curl -F "fileToUpload=@100kb" ''

TX: 110KB, RX 6672 -> 116.5KB

Overall, it’s not a good idea to keep reconnecting when making small requests. People should be aware that downloading something of 1KB doesn’t mean they’ll actually use 1KB on the network, regardless of the carrier.

In the past few years there have been campaigns to encrypt the web and make it secure: things like Let’s Encrypt’s free TLS certificates, and popular browsers flagging insecure websites. According to Google’s transparency report, 90% of the web now runs securely encrypted, compared to only 50% in 2014. This encryption comes at a price: the small overhead when initiating the connection to a website.

Now that we’ve got our expectations straight, the next step is to inspect how we’re going to read our consumption on Alfa’s website.

After logging in, at the dashboard page we can see a request akin to:

Disregard the Unix timestamp in milliseconds on the right, it’s actually useless.

The response to this request:

{
    "CurrentBalanceValue": "$ XX.XX",
    "ExtensionData": {},
    "MobileNumberValue": "71234567",
    "OnNetCommitmentAccountValue": null,
    "ResponseCode": 8090,
    "SecurityWatch": null,
    "ServiceInformationValue": [
        {
            "ExtensionData": {},
            "ServiceDetailsInformationValue": [
                {
                    "ConsumptionUnitValue": "MB",
                    "ConsumptionValue": "1880.17",
                    "DescriptionValue": "Mobile Internet",
                    "ExtensionData": {},
                    "ExtraConsumptionAmountUSDValue": "0",
                    "ExtraConsumptionUnitValue": "MB",
                    "ExtraConsumptionValue": "0",
                    "PackageUnitValue": "GB",
                    "PackageValue": "20",
                    "ValidityDateValue": "",
                    "ValidityValue": ""
                }
            ],
            "ServiceNameValue": "Shared Data Bundle"
        }
    ],
    "SubTypeValue": "Normal",
    "TypeValue": "Prepaid",
    "WafferValue": {
        "ExtensionData": {},
        "PendingAccountsValue": []
    }
}
Notice the ConsumptionValue, which is shown in MB with two decimal places; that’s as precise as we can get. I’ll assume every hundredth of an MB is 10.24KB.
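A quick sanity check on that granularity:

```python
# 1MB = 1024KB, so one hundredth of an MB is 1024 / 100 KB.
KB_PER_MB = 1024
hundredth_of_mb_in_kb = KB_PER_MB / 100
print(hundredth_of_mb_in_kb)  # 10.24
```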

Setting this aside we’re missing the last element of our experiment: isolating the sim card and using it only on the machine that has the monitoring tool, gaining full control of what happens on the network.

Huawei E5220s outside

I’ve opted to use a Huawei mobile WiFi E5220s model, a pocket-sized MiFi initially sold for Sodetel. This setup lets me use the device as a 3G+ hotspot for the monitoring machine.

Huawei E5220s inside
Sim card form factors
The device takes a 2FF sim card, so have a holder of this size at hand.

Unfortunately, the device came locked to the vendor, Sodetel, and I had to do some reverse engineering to figure out how to configure the APN settings for the Alfa card.
Long story short, parts of the documentation of the E5220s are available online and hinted at what to do.

To add the new Alfa APN:

curl '' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:69.0) Gecko/20100101 Firefox/69.0' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' \
  -H 'Authorization: Basic YWRtaW46YWRtaW4=' \
  -H 'Connection: keep-alive' \
  -d

Remember to log in on the web interface beforehand; even though we’re sending basic authorization, it won’t accept the request if not logged in.

A successful request returns:

<?xml version="1.0" encoding="UTF-8"?><response>OK</response>

We then have to set the Alfa profile as default and disable the PIN1, as it is not required on Alfa sim cards.

Disable PIN:

curl '' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:69.0) Gecko/20100101 Firefox/69.0' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' \
  -H 'Authorization: Basic YWRtaW46YWRtaW4=' \
  -H 'Connection: keep-alive' \
  -d

We’re all set, we can gather the data!


24 requests of 1KB
Initial value:1835.06MB
Sent: 176KB total, RX 126KB, TX 50KB
Expected value:
	If TX/RX considered: 1835.23MB
	If RX only considered: 1835.18MB
	If TX only considered: 1835.11MB
Consumption value on website: 1835.13

The reason I executed 24 requests instead of a single one is that the consumption on the website updates only when it reaches a threshold or specific interval.
As you can see, the value on the website is lower than the combination of download and upload together, exactly because it updates by threshold. This means the value shown will always be less than the amount consumed, unless you consume an exact multiple of the threshold chunk.
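A sketch of this assumed behavior, with a made-up chunk size (the real granularity isn’t documented):

```python
# Hypothetical update granularity; the real chunk size is unknown.
CHUNK_KB = 50

def displayed_kb(consumed_kb: int) -> int:
    # The counter only advances in whole chunks, so the shown value
    # trails real usage unless usage lands exactly on a chunk boundary.
    return (consumed_kb // CHUNK_KB) * CHUNK_KB

print(displayed_kb(176))  # 176KB actually used, only 150KB shown
```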

For a visual explanation of what’s happening: chunk update

Here are more tests:

18 requests of 100KB
Initial value:1835.93MB
Sent: 2110KB total, RX 2013KB, TX=97KB
Expected value:
	If TX/RX considered: 1837.96MB
	If RX only considered: 1837.86MB
	If TX only considered: 1836.02MB
Consumption value on website: 1837.89
18 requests of 100KB
Initial value:1837.89MB
Sent: 2117KB total, RX 2015KB, TX=101KB
Expected value:
	If TX/RX considered: 1839.92MB
	If RX only considered: 1839.82MB
	If TX only considered: 1836.99MB
Consumption value on website: 1839.84


18 requests of 100KB
Initial value:1839.84MB
Sent: 2101KB total, RX 120KB, TX=1981KB
Expected value:
	If TX/RX considered: 1841.86MB
	If RX only considered: 1839.96MB
	If TX only considered: 1841.48MB
Consumption value on website: 1841.79
18 requests of 100KB
Initial value:1841.79MB
Sent: 2101KB total, RX 120KB, TX=1981KB
Expected value:
	If TX/RX considered: 1843.81MB
	If RX only considered: 1841.91MB
	If TX only considered: 1843.69MB
Consumption value on website: 1843.75

So, Alfa actually shows a bit less than the value we are using; that’s the opposite of the conspiracy. What is happening? Do we have an explanation?

The truth is that we give mobile carriers more credit than they deserve. They aren’t all-powerful, omniscient owners of the network who know it and built it from scratch. The all-powerful being is a recurring theme in conspiracy theories because it makes people seem powerless in the face of it. On the contrary, most mobile operators are not made up of technical people; they are, as the term implies, operating the instruments they buy, caring only that those do the job. Business is business.

In the core network, one of these pieces of equipment is the TDF/PCEF or PCRF (Traffic Detection Function + Policy and Charging Enforcement Function, or Policy and Charging Rules Function), along with an implementation of a CPS, a Cost Per Sale system, also known as a charging system.
The operator buys such hardware without knowing much else about what it does, be it an Ericsson charging system, an Oracle CPS, a Cisco CPS, or virtual ones (5G, anyone?) like Allot.

Reference Architecture
Figure 5.1-1: Overall PCC logical architecture (non-roaming) when SPR is used taken from ETSI TS 123 203 V15.5.0 (2019-10)

An example scenario where Usage Monitoring Control is useful is when the operator wants to allow a subscriber a certain high (e.g. unrestricted) bandwidth for a certain maximum volume (say 2 gigabytes) per month. If the subscriber uses more than that during the month, the bandwidth is limited to a smaller value (say 0.5 Mbit/s) for the remainder of the month. Another example is when the operator wants to set a usage cap on traffic for certain services, e.g. to allow a certain maximum volume per month for a TV or movie-on-demand service.

That paragraph is taken from this article or here.
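The example policy in that quoted paragraph boils down to a one-line decision. The numbers below are the spec’s illustrative values, not real Alfa settings:

```python
# Usage Monitoring Control, reduced to code: full speed until a monthly
# volume cap is reached, throttled bandwidth afterwards.
CAP_BYTES = 2 * 1024**3            # 2 GB at full speed per month
THROTTLED_BPS = 500_000            # 0.5 Mbit/s once the cap is hit

def allowed_bandwidth_bps(used_bytes: int, full_speed_bps: int) -> int:
    return full_speed_bps if used_bytes < CAP_BYTES else THROTTLED_BPS
```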


So what are the reasons for all the conspiracy theories?

  • Mysticism and lack of education on the topic
  • Having an agenda (Confirmation bias)
  • Being delusional

This doesn’t take away from the fact that people genuinely feel like they’re using data quicker than what they’re “really” using. The truth is that in recent years we’ve moved to a world where we’re always connected to the web, and the web has morphed to become media heavy. In the tech world this is not news anymore; we call it the web-obesity crisis: websites are bloated with content, entangled in a mess of dependencies.
As I’m posting this, Chrome is starting an endeavour to move the web to a faster and lighter one; see this article for more info.

This topic has been discussed a lot these past years: The Average Webpage Is Now the Size of the Original Doom, The average web page is 3MB. How much should we care?, web-obesity, more privacy, and fast blogs.
I’ve also been doing my part, making my blog lighter.

In sum, this is all about web-literacy.

Thanks for reading.


  • Cesare d’Arpino [CC0]
  • Pearson Scott Foresman [Public domain]

For more info on the core network there’s some good discussion in those links:

Jeremy Morgan (JeremyMorgan)

Setting up Golang on Manjaro Linux November 11, 2019 12:10 AM

Today we're going to set up a Golang development environment in Manjaro Linux. It's super easy.

I've been playing around with Manjaro a lot lately and it's a pretty cool distribution. It's based on Arch Linux, which I'm a huge fan of.

Step 1: Update the System

Golang on Manjaro

At your Manjaro desktop, open a command prompt. The first thing you'll want to do is update the system.

We do that by typing in

sudo pacman -Syu

This will make sure your system is up to date.

Step 2: Install Golang

Next, install Golang

Type in

sudo pacman -S go

you can verify it's installed by typing in:

go version

Golang on Manjaro

So now I'm going to create a projects folder (/home/jeremy/Projects) and create a hello world to test this out.

Create a file named hello.go and put in the hello world code:

```
package main

import "fmt"

func main() {
	fmt.Println("hello world")
}
```

Save the file, and then build it with the following command:

go build hello.go

This will build an executable you can run named "hello"

If you run it, it should output your message:

Golang on Manjaro

Step 3: Install Visual Studio Code

Now, the version of Visual Studio Code that I want is in the AUR, so we'll install git.

sudo pacman -S git

I like to create a sources folder for building AUR packages (/home/jeremy/Sources) but you can put it wherever you want. To build AUR packages you must clone or download them first.

I will go back to the AUR page and copy the git clone url and clone it:

git clone

Next, run makepkg -i

Golang on Manjaro

I'm getting this error, and you might too because this is a brand new Manjaro install so I haven't installed the tools yet to make AUR packages.

To build AURs you'll have to install some tools:

Type in

sudo pacman -S base-devel

You can choose 3 for just binutils, or press enter for all. I'm going to install all because I'll be building a lot of things on this machine.

Now run

makepkg -i

Golang on Manjaro

And we're back in business.

Start Coding!

We load up Visual Studio Code and now I'll go back to that projects folder.

It recommends the Go extension, so we'll go ahead and install it.

As you can see, it also recommends we install Go Outline.

We will just click on Install All, and it will bring in the dependencies we need.

Golang on Manjaro

Now this is all set up for us, it's super easy.

We can even run and debug the file.

Now I've put some simple code here to add a couple of numbers.

I'll add in a break point and run it.

So it's that easy to set up a Go development environment in Manjaro. Have fun and get to coding!

Here's a video covering the same thing:

If you'd like to learn more about Golang, check out Pluralsight's courses on Golang today, there are tons of great courses to get you ramped up fast.

November 10, 2019

Jeff Carpenter (jeffcarp)

I'm Joining Waymo November 10, 2019 11:38 PM

Quick life update: I’ve left the Chrome team and joined Waymo (formerly the Google self-driving car project). It was a fantastic whirlwind 3 years working on infrastructure for Chromium and helping to–in a very small way–push the open web forward. On the team I launched, a resource to help align the APIs of all browsers. I worked on syncing source code across repos. I launched a couple TensorFlow ML models.

Derek Jones (derek-jones)

Adjectives in source code analysis November 10, 2019 08:57 PM

The use of adjectives to analyse source code is something of a specialist topic. This post can only increase the number of people using adjectives for this purpose (because I don’t know anybody else who does 😉).

Until recently the only adjective-related property I used to help analyse source was relative order. When using multiple adjectives, people have a preferred order, e.g., in English size comes before color, as in “big red” (“red big” sounds wrong), and adjectives always appear before the noun they modify. Native speakers of different languages have various preferred orders. Source code may appear to only contain English words, but adjective order can provide a clue about the native language of the developer who wrote it, because native ordering leaks into English usage.

Searching for adjective patterns (or any other part-of-speech pattern) in identifiers used to be complicated (identifiers first had to be split into their subcomponents). Now, thanks to Vadim Markovtsev, 49 million token-split identifiers are available. Happy adjective pattern matching (Size Shape Age Color is a common order to start with; adjective pairs are found in around 0.1% of identifiers; some scripts).
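As a toy illustration of the kind of pattern matching involved (the adjective list and example identifier are invented here, not drawn from the 49-million-identifier dataset):

```python
# Toy sketch: find adjacent adjective pairs in a token-split identifier.
ADJECTIVES = {"big", "small", "red", "old", "new", "raw"}

def adjective_pairs(tokens):
    # Adjacent tokens that are both adjectives, in the order they appear.
    return [(a, b) for a, b in zip(tokens, tokens[1:])
            if a in ADJECTIVES and b in ADJECTIVES]

print(adjective_pairs(["big", "red", "button"]))  # [('big', 'red')]
```

With a part-of-speech lexicon in place of the hard-coded set, the same scan over millions of identifiers yields the order statistics discussed above.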

Until recently, gradable adjectives were something that I had been vaguely aware of; these kinds of adjectives indicate a position on a scale, e.g., hot/warm/cold water. The study Grounding Gradable Adjectives through Crowdsourcing included some interesting data on the perceived change of an attribute produced by the presence by a gradable adjective. The following plot shows perceived change in quantity produced by some quantity adjectives (code+data):

Gradable adjective ranking.

How is information about gradable adjectives useful for analyzing source code?

One pattern that jumps out of the plot is that variability between people increases as the magnitude specified by the adjective increases (the x-axis shows standard deviations from the mean response). Perhaps the x-axis should use a log scale; there are lots of human-related response characteristics that are linear on a log scale (I’m using the same scale as the authors of the study; the authors were a lot more aggressive in removing outliers than I have been), e.g., response to loudness of sound and Hick’s law.

At the moment, it looks as if my estimate of the value of a “small x” is going to be relatively closer to another developer’s “small x” than our relative estimated values for a “huge x”.

Carlos Fenollosa (carlesfe)

Tobias Pfeiffer (PragTob)

Slides: Elixir & Phoenix – Fast, Concurrent and Explicit (Øredev) November 10, 2019 05:39 PM

I had the great pleasure to finally speak at Øredev! I wanted to speak there for so long, not only because it’s a great conference but also because it’s in the city of Malmö. A city that I quite like and a city I’m happy to have friends in 🙂 Anyhow, all went well although […]

Ponylang (SeanTAllen)

Last Week in Pony - November 10, 2019 November 10, 2019 04:09 PM

Last week has seen a ton of improvements to our CI and release automation thanks to Sean T. Allen. We have also been working toward making ponyup the default for installing and managing the Pony compiler and tools. Three new members have been inducted into the Pony core team.

November 07, 2019

Pages From The Fire (kghose)

Overfitting code to the unit tests November 07, 2019 11:45 PM

Do you want to hear about that one time all the apparatus of modern software engineering – unit tests and continuous integration – actually helped the programmer to break code, rather than fix it? I was working on some code a bunch of other people wrote. It wasn’t a particularly large code base, but it …

Andreas Zwinkau (qznc)

Pondering a Monorepo Version Control System November 07, 2019 12:00 AM

How a VCS designed for monorepos would look like and why I don't build it.

Read full article!

November 06, 2019

Pete Corey (petecorey)

The Collatz Sequence in J November 06, 2019 12:00 AM

I’ve been playing around quite a bit with the Collatz Conjecture after watching the latest Coding Train video on the subject. In an effort to keep my J muscles from atrophying, I decided to use the language to implement a Collatz sequence generator.

At its heart, the Collatz sequence is a conditional expression. If the current number is even, the next number in the sequence is the current number divided by two. If the current number is odd, the next number in the sequence is one plus the current number multiplied by three:

We can write each of these branches with some straightforward J code. Our even case is simply the divide (%) verb bonded (&) to 2:

   even =: %&2

And our odd case is a “monadic noun fork” of 1 plus (+) 3 bonded (&) to multiply (*):

   odd =: 1 + 3&*

We can tie together (`) those two verbs and use agenda (@.) to pick which one to call based on whether the argument is even:

   next =: even`odd@.pick

Our pick verb is testing for even numbers by checking if 1: equals (=) 2 bonded to the residue verb (|).

   pick =: 1:=2&|

We can test our next verb by running it against numbers with known next values. After 3, we’d expect 10, and after 10, we’d expect 5:

   next 3
10
   next 10
5


J has an amazing verb, power (^:), that can be used to find the “limit” of a provided verb by continuously reapplying that verb to its result until a repeated result is encountered. If we pass power boxed infinity (<_) as an argument, it’ll build up a list of all the intermediate results.

This is exactly what we want. To construct our Collatz sequence, we’ll find the limit of next for a given input, like 12:

   next^:(<_) 12

But wait, there’s a problem! A loop exists in the Collatz sequences between 4, 2, and 1. When we call next on 1, we’ll receive 4. Calling next on 4 returns 2, and calling next on 2 returns 1. Our next verb never converges to a single value.

To get over this hurdle, we’ll write one last verb, collatz, that checks if the argument is 1 before applying next:

   collatz =: next`]@.(1&=)

Armed with this new, converging collatz verb, we can try finding our limit again:

   collatz^:(<_) 12
12 6 3 10 5 16 8 4 2 1

Success! We’ve implemented a Collatz sequence generator using the J programming language. Just for fun, let’s plot the sequence starting with 1000:

   require 'plot'
   plot collatz^:(<_) 1000
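As a footnote for readers who don’t speak J, here is the same sequence logic sketched in Python (not part of the original J session):

```python
def collatz(n: int) -> list[int]:
    # Same logic as the J verbs above: halve when even,
    # 3n + 1 when odd, stop when we reach 1.
    seq = [n]
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        seq.append(n)
    return seq

print(collatz(12))  # [12, 6, 3, 10, 5, 16, 8, 4, 2, 1]
```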

November 04, 2019

Andrew Montalenti (amontalenti)

Work is a Queue of Queues November 04, 2019 09:31 PM

Do you ever get that feeling like no matter how hard you work, you just can’t keep up?

This isn’t a problem uniquely faced by modern knowledge workers. It’s also a characteristic of certain software systems. This state — of being perpetually behind on intended work-in-progress — can fall naturally out of the data structures used to design a software system. Perhaps by learning something about these data structures, we can learn something about the nature of work itself.

Let’s start with the basics. In computer science, one of the most essential data structures is the stack. Here’s the definition from Wikipedia:

… a stack is a data type that serves as a collection of elements, with two principal operations: (1) “push”, which adds an element to the collection; and (2) “pop”, which removes the most recently added element that was not yet removed. The order in which elements come off [is known as] LIFO, [or last in, first out]. Additionally, a “peek” operation may give access to the top without modifying the stack.

From here on out, we’ll use the computer science (mathematical) function call notation, f(), whenever we reference one of the operations supported by a given data structure. So, for example, to refer to the “push” operation described above, we’ll notate it as push().

I remember learning the definition of a stack in college and being a little surprised at “LIFO” behavior. That is, if you push() three items onto a stack — 1, 2, and 3 — when you pop() the stack, you’ll get the last item you added — in this case, 3. This means the last item, 3, is the first one pop()‘ed off the stack. Put another way, the first item you put on the stack, 1, only gets processed once you pop() all the other items — 3, 2 — off the stack, and then pop() once more to (finally) remove item 1.

Practically speaking, this seems like a “frenetic” or “unfair” way to do work — you’re basically saying that the last item always gets first service, and so, if items are push()’ed onto the stack faster than they are pop()’ed, some items will never be serviced (like poor item 1, above).

That’s when you tend to learn that stacks aren’t usually used to track “work-in-progress”. Instead, they are used to do structured processing, e.g. stack machines, or stack-based programming languages.

The alternative data structure to the stack which allows you to do work “fairly” is the queue, defined thusly:

… a queue is a collection in which the entities in the collection are kept in order and the principal (or only) operations on the collection are the addition of entities to the rear terminal position, known as “enqueue”, and removal of entities from the front terminal position, known as “dequeue”. This makes the queue a FIFO data structure, [or, first-in, first-out].
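The two disciplines, side by side, sketched in Python:

```python
from collections import deque

# The same three items processed LIFO (stack) and FIFO (queue).
stack = []
for item in (1, 2, 3):
    stack.append(item)                      # push()
lifo_order = [stack.pop() for _ in range(3)]        # last in, first out

queue = deque()
for item in (1, 2, 3):
    queue.append(item)                      # enqueue()
fifo_order = [queue.popleft() for _ in range(3)]    # first in, first out

print(lifo_order, fifo_order)  # [3, 2, 1] [1, 2, 3]
```

Item 1 is serviced last by the stack but first by the queue, which is the whole "fairness" difference.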

In modern work, we tend to think about work in queues, because this seems like the “fair” data structure. For example, your list of bugs is typically managed like a queue: the bugs are serviced in such a way that the longer ago the bug was filed, the more likely it is to have been either completed or discarded, whereas more-recently-filed bugs can wait for triage.

Some teams even assign priorities, which makes it behave more like a priority queue.

Email is interesting, because most people wish email were a queue, and many people try to turn email into a queue using a number of tools and practices, but, in its standard implementation, email behaves much more like a stack. (Some people joke that “email is your to-do list, but made by other people.”)

But just thinking in terms of the general default workflow, we can recognize that the latest email you receive goes to the top of your inbox. And you tend to pop() and peek() items at the top of your inbox. That means some unlucky emails can end up at the bottom of the stack, and if your intake of new emails overtakes your ability to process the stack, some emails will be altogether neglected. Just like “poor item 1” in the stack we worked through above. This feels even worse when you reflect on how anyone else can push() items onto your stack, but only you can pop() items.

Slack (and other real-time communication tools) are likewise much closer to a stack. But with an even odder behavior. I might describe it as a “fixed-length stack with item drop-off”. That is, items are push()’ed onto your stack during the workday, and you have to decide whether to peek() or pop() the stack at any given moment in time. But, as more and more items are push()’ed onto the stack, older items start to automatically get expired or removed from the stack. The stack can only have 500-or-so items on it (this being the typical “scrollback” that people can tolerate before getting annoyed or bored) and old items are simply discarded.

Software project plans, meanwhile, are typically assembled as a kind of queue of queues. Let me explain what I mean.

First, some queues are organized: bugs, product ideas, customer requests, team technical debt payback projects, experiments. Then, a person or group — typically a product manager — tries to create a “queue of queues”, or a single unified queue — the project plan — made up of items from the many smaller queues.

You might decide that certain big bugs very much need to be fixed in the next iteration. Certain technical debt projects cannot wait, due to their risk. And, with the remaining estimated time and capacity available, you attempt to create an ordered list of items among product ideas and customer requests.

Once a product manager stares at this list, it has to be further ordered to remove bottlenecks and dependencies — some parts of the plan may need to be done before other parts. And then, the plan begins — and we track its progress by how close to empty the queue gets. We cut scope by dropping items out of the queue, and we prevent anxiety by splitting large queue items into many smaller queue items, representing their sub-parts.
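The "queue of queues" can itself be sketched in a few lines; the source queues and their contents below are invented for illustration:

```python
from collections import deque

# Each source of work is a FIFO queue of its own.
bugs      = deque(["fix login crash", "patch CSV export"])
tech_debt = deque(["upgrade TLS library"])
ideas     = deque(["dark mode", "bulk import"])

# The project plan is one unified queue assembled from the smaller ones;
# the order here reflects the product manager's judgment, not a fixed rule.
plan = deque()
plan.extend(bugs)        # big bugs that must ship next iteration
plan.extend(tech_debt)   # risky debt that cannot wait
plan.extend(ideas)       # then ideas, as capacity allows

# Progress is tracked by how close to empty the plan gets.
processed = []
while plan:
    processed.append(plan.popleft())   # dequeue(), strictly FIFO
```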

It’s interesting to reflect on the fact that, in our work life, stacks and queues are almost like mortal enemies.

That is, if you’re working from the ordered queue of items — and you have settled on the queue and its order being roughly correct — then the worst thing you can do is peek() at items or pop() items off the various stacks in your work life. These items haven’t formally entered the work queue, and only have “priority” due to their recency, which is a very arbitrary thing.

The ideal situation would be that once you’ve decided to work on a given queue, as an individual or a team, all the stack-based workflows would shift from push()’ing onto your stack and gaining your attention, to merely enqueue()’ing onto your backlog queue, to get your ordered attention at a later date.

The “kanban” concept of flow reflects on this a lot with regard to teams. It tries to limit “work-in-progress” (WIP) to those items that a team has capacity to handle, and it suggests isolating a team from other sources of new work which might distract from pushing the WIP items to completion.

It rightly recognizes that in many workplaces, a team is desperately trying to work through its committed FIFO items, but being told to pay attention to a parallel LIFO workstream. And that this tends to lead to missed deadlines, death marches, as well as work dissatisfaction.

Effective executives have learned to ask systematically and without coyness: “What do I do that wastes your time without contributing to your effectiveness?” To ask this question, and to ask it without being afraid of the truth, is a mark of the effective executive.
-Peter Drucker

I’ve personally found myself in a situation where I have to apply this thinking even to the work of a single person: myself. This only happened to me in the last few years, where several new “item stacks” began to appear before me.

Yes, I have the common technology executive problem of a busy e-mail inbox and Slack @-mention list. Those are universal as a company’s size scales and as an executive has to apply the “Andy Grove algorithm” for keeping the wheels of the organization turning.

Andy Grove talks a lot about the process of management in his famous book, “High-Output Management”. Here, he uses the simple analogy of “how to prepare breakfast” to talk about the idea of identifying “limiting steps” in any parallel process, e.g. in this case, boiling eggs.

But, there’s more than just the overflowing inbox in email and the never-ending stream of Slack.

My calendar is also a stack: it’s block-scheduled with 1:1 meetings with staff, work anniversaries, performance reviews, prospective sales conversations, partner checkpoints, hiring calls, board meetings, investor update calls, and so on. Lots of push() and peek(), with a rate of pop() necessarily limited by the passage of time.

Especially in dealings with external parties, the effect of “calendar stack intake flow” is pernicious: it’s very hard to tell an important customer or partner, “Yes, I’d love to meet with you — but you’ll need to wait until my queues clear out in the next 2-3 months.” It’s even harder to tell this to an employee who has a sensitive work-related issue that can’t be handled elsewhere in the organization.

To tackle these problems, I’ve applied my hacker brain, perhaps unsurprisingly. (“When all you have is a hammer.”) I started with email, and asked myself, “How can I automate email and turn it into a proper FIFO work queue?” I did this with a couple of tricks.

I wrote a JavaScript “Google Apps Script” program that has a JSON configuration file that aggressively filters my inbox, using one of three label types: @@now, @later, and @never. This script uses the GMail API and runs against my inbox every 15 minutes and automatically cleans the whole thing, using a number of advanced GMail search queries and a nice system of nested labels. I borrowed the “now” vs “later” terminology from David Allen’s “GTD” framework. The @@now label is an indication that this is an email worth responding to immediately. I have very specific filters (rules) that identify these emails: for example, emails addressed from my co-founder or executive team, and addressed to me, alone. Or, an email from one of our critical alerting systems. I have 50+ such rules, and I change them over time. Here is a view of just ~15 lines of this 800-line script:

An example of a @later message would be a message addressed to a team’s email group, a product release email, a usage/cost report, or an email newsletter subscription. As for @never, that’s mostly spam, but might also include record-keeping emails like invoices and bills.
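The rule-matching core of this idea can be sketched in a few lines of Python (an illustrative sketch only, not the actual 800-line Apps Script; the addresses and rules below are made up):

```python
# Each rule maps a predicate over message headers to one of the
# three label types: "@@now", "@later", "@never".
RULES = [
    (lambda m: m["from"] == "cofounder@example.com"
               and m["to"] == ["me@example.com"], "@@now"),
    (lambda m: m["from"] == "alerts@example.com", "@@now"),
    (lambda m: m["to"] == ["team@example.com"], "@later"),
]

def triage(message, default="@never"):
    """Return the label of the first matching rule, else the default."""
    for predicate, label in RULES:
        if predicate(message):
            return label
    return default

urgent = {"from": "cofounder@example.com", "to": ["me@example.com"]}
newsletter = {"from": "news@example.com", "to": ["me@example.com"]}
assert triage(urgent) == "@@now"
assert triage(newsletter) == "@never"
```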

I gave the @@now label the “double-at-sign (@@)” prefix on purpose: it’s the only label with such a prefix in GMail. As a result, the hotkey x l @ @ for me serves as a shorthand to label a given email message as “now”. (Also: the @@ prefix makes sure it always appears at the top of my label list, making it easy to use conveniently in mobile contexts.) As a result, once my email inbox is cleaned out, I can scan it looking for items I didn’t identify as “now”, but which could benefit from a response soon. I then hit x l @ @ and hit ENTER to label that message as @@now. You can see a little glimpse into my @@now messages in GMail below, which includes a mixture of automatically-marked messages to process today, as well as manually triaged ones:

I use the @@now label search to look at my items. I also prefer to look at the items in chronological order (remember: I want a FIFO queue), starting from the bottom, rather than the GMail default of reverse chronological (the default of a LIFO stack).

Implementing this system had a profound effect on my relationship with email. Prior, email was something that gave me a lot of daily anxiety. I received hundreds of messages per day, and on some very bad days, it might even have been thousands. (Yes, alert fatigue is very real.) But, now my actual inbox usually gets a few hundred @later items, and only a handful of @now items, as well as a handful more of items that I need to manually filter/triage.

Once I had properly converted my email inbox from a stack to a queue, and quantified its effect in terms of tens of saved hours per month, I started to look for this pattern everywhere.

In Slack, I found a simple solution, in a somewhat underutilized feature that Slack has called “Starred Items”. You see, you can mark every message in Slack with a “Star”. Doing so has only one effect: it shows up in your own “Starred Items” list, which is revealed via a star icon in the top-right of the app. Here is how one of my starred items looks in the Slack app:

The reason this is nice is that the items are shown in the starred items list in reverse chronological message order. (It’d be great if you could reverse this order, but, alas.) What this lets me do, though, is to “sample the stream” (or, tame that fixed-length stack I mentioned earlier). As I come across items, especially from my @-mentions and DMs, I can star them. And now that they are all in one place, I can scan the starred item list to take action on each of them. Typically, I’ll convert the Slack message either into an email action, or a task in one of my other “queues of record”, like Todoist.

What’s also nice is that if you unstar an item, it’s like removing it from your temporary Slack queue. Once you have taken action on all your starred items, you’re at “Slack inbox zero”. Yes, I recognize the irony of “turning Slack into an inbox” when its whole market positioning is “the end of email”. But remember, email was a stack, too, before I tamed it with automation.

So, where else do my work queues live? For me personally, there are four other places: (1) GitHub — which has issues, pull requests, and project boards, and this is also where all my team’s work lives. (2) Todoist — this is where my personal and professional to-do list lives, and it also serves as my GTD-style “thought inbox”. (3) Notion — which is where my team’s knowledge base and wiki lives, as well as our project plans in kanban board form. And (4) Trello — which is a historical/personal relic, where I maintain a “crapban board” that’s shared with my wife for personal errands, vacation plans, and life events. So those are the work queues that live beyond my life of email, calendar, and real-time chat. Here is the current view of my “crapban board” in Trello:

As might be exceedingly obvious at this point, if I’m to survive this onslaught of stacks and queues, I have to create a “master queue”. It’s not good enough to merely convert every stack to a queue. That’s a good start, but now I still have several queues to contend with. I need “the queue of queues” that manages all of my work, across all my queues.
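Conceptually, the queue of queues is just an ordered merge of several FIFO sources into one committed list. A toy Python sketch (the queue names, tasks, and timestamps are invented for illustration):

```python
import heapq

# Several per-tool work queues, each already in FIFO (timestamp) order.
email   = [(1, "reply to partner"), (4, "forward invoice")]
slack   = [(2, "answer @-mention")]
todoist = [(3, "draft board deck")]

# heapq.merge lazily interleaves already-sorted inputs into one sorted
# stream: the master queue drains all sources in global arrival order.
master = [task for _, task in heapq.merge(email, slack, todoist)]
assert master == ["reply to partner", "answer @-mention",
                  "draft board deck", "forward invoice"]
```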

Historically, what I used for that was a single Todoist view — called simply “Today” — since it was quite easy to add tasks there, re-order them, and link to other systems from there. Here is how that looks in Todoist:

Lately, I’ve been treating Todoist more like an intake/staging area, and been using a tool called Sunsama to plan out my work days. Basically, Sunsama is a personal kanban system that lets you visually lay out your work tasks alongside your calendar, treating your calendar as the main constraint. And it automatically links to tools I was already using — namely Google Calendar, GitHub, Trello, and Todoist — making it easier for me to get the global “queue of queues” view. Here is how that looks:

My daily habit has therefore become: (1) review my email inbox, and process items in the queue there, adding Todoist tasks as necessary (convert email stack into email message & task queue); (2) process Slack notifications and star relevant items (convert Slack stack into a Slack message & task queue); (3) organize and plan my day in Sunsama across my Google Calendar, Todoist, and even other systems like Trello and GitHub, treating working hours and my calendar as a constraint. Sometimes, I’ll plan two or three days ahead, but usually this is a daily ritual.

Once I finish my daily planning ritual, I “burn the bridges”. The Sunsama personal kanban view becomes the only place I look for “what I’m doing today”. And I let all the other work stacks in my life accumulate, without peek()ing or pop()ing. The only exception, of course, would be an emergency, outage, or something like that. I also use Toggl to track my actual time spent, as a way of keeping my time estimates honest. Here’s a view of my trailing 30-day time tracking:

Sunsama is a nicely-designed personal kanban tool, but I have seen people achieve the same organizing principle using their calendar itself, a simple written list of daily priorities, or an isolated digital task list that is habitually emptied every day. The key thing is that you actually perform a daily planning ritual with a realistic set of tasks, and that you actually “burn the bridges” and ignore the stacks! Sunsama has a UX around this that can help turn it into a pleasant daily habit.

Getting things done is about draining your committed work queue to zero. What I’ve learned is that too much of our modern work life is set up like a bunch of stacks, when we should really have a single queue.

This isn’t too big a deal when the rate of stack item push()’ing is low and manageable, but it can spiral out of control when stacks grow unbounded, leading to stack overflow.

The way I have managed my own personal work also has implications for how I manage teams. Almost by definition, any modern software team with more than a handful of people has this problem. To tame the work, we need to learn to enqueue(), re-order, commit, and dequeue() with focus and satisfaction, even as a team. We also need to let the rest of the world enqueue() into our backlog, rather than push()’ing onto one or more monitored stacks.

Mastery of your focus and attention is really a matter of committing you and your team to a single work queue, and ensuring that other work items “get in line”. If your team operates off a single queue, and each individual operates off their own single queue of personal tasks, then you truly do get “continuous flow”, and everything hums. Queues all the way down leads to continuous completion of works-in-progress, which is, after all, The Goal.

I’ve come to realize that this requires an active management process. The default behavior of our work tools has a “frecency” bias that behaves like an unruly stack — frequent item push(), and recent item peek() and pop(). It takes leadership to create the unified queue, and to create the workflows that de-prioritize frequent/recent requests into a backlog queue.

Hopefully this reflection will get you to start thinking about how to recognize when a given workflow is behaving too much like a stack (or list of stacks), and how you might be able to hack it to behave more like a queue (or queue of queues).

I’d love to hear your tips and tricks on @amontalenti on Twitter, or in the comments!

Tools mentioned in this post

  • Starred Items in Slack — turns a Slack stream of messages into a queue
  • GitHub — issues, pull requests, and project boards
  • Todoist — personal and professional to-do list, GTD-style “thought inbox”
  • Notion — team’s knowledge base and wiki, as well as project plans in kanban board form
  • Trello — personal “crapban board” for personal errands, vacation plans, and life events
  • Sunsama — visual personal kanban system representing my daily “queue of queues”
  • GMail Service + API in Google Apps Script — can help tame an unruly inbox… with JavaScript!


Thank you to Jeff Bordogna, Ashutosh Priyadarshy, and Aakash Shah for reviewing and providing feedback on earlier drafts of this essay.

Gustaf Erikson (gerikson)

November 03, 2019

Derek Jones (derek-jones)

Student projects for 2019/2020 November 03, 2019 10:20 PM

It’s that time of year when students are looking for an interesting idea for a project (it might be a bit late for this year’s students, but I have been mulling over these ideas for a while, and might forget them by next year). A few years ago I listed some suggestions for student projects, as far as I know none got used, so let’s try again…

Checking the correctness of the Python compilers/interpreters. Lots of work has been done checking C compilers (e.g., Csmith), but I cannot find any serious work that has done the same for Python. There are multiple Python implementations, so it would be possible to do differential testing; another possibility is to fuzz test one or more compilers/interpreters and see how many crashes occur (the likely number of remaining fault-producing crashes can be estimated from this data).
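The differential-testing loop itself is simple to sketch; the hard part is generating interesting programs. In this illustrative Python sketch both runs use the current interpreter, where a real harness would compare, say, CPython against PyPy:

```python
import subprocess
import sys

def run(interpreter, program):
    """Run `program` under `interpreter`; capture exit code and stdout."""
    result = subprocess.run([interpreter, "-c", program],
                            capture_output=True, text=True)
    return result.returncode, result.stdout

# A real harness would generate many programs automatically and use
# two different implementations here; this just shows the comparison.
program = "print(sum(range(10)))"
a = run(sys.executable, program)
b = run(sys.executable, program)
assert a == b, f"implementations disagree: {a!r} vs {b!r}"
```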

Talking to the Python people at the Open Source hackathon yesterday, testing of the compiler/interpreter was something they did not spend much time thinking about (yes, they run regression tests, but that seemed to be it).

Finding faults in published papers. There are tools that scan source code for use of suspect constructs, and there are various ways in which the contents of a published paper could be checked.

Possible checks include (apart from grammar checking):

Number extraction. Numbers are some of the most easily checked quantities, and anybody interested in fact checking needs a quick way of extracting numeric values from a document. Sometimes numeric values appear as numeric words, and dates can appear as a mixture of words and numbers. The tool would extract numeric values and their possible types (e.g., date, time, miles, kilograms, lines of code). Something way more sophisticated than pattern matching on sequences of digit characters is needed.

spaCy is my tool of choice for this sort of text processing task.
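As a toy illustration of why digit pattern matching falls short (and why a tool like spaCy earns its keep), consider a naive regex against a sentence with mixed numeric forms (example text invented):

```python
import re

text = ("The study ran for twenty days, cost $1,250.50, "
        "and ended on 3 March 2019.")

# Naive approach: match runs of digits with separators.
digits = re.findall(r"\d+(?:[.,]\d+)*", text)
assert digits == ["1,250.50", "3", "2019"]

# It misses the word "twenty", splits the date "3 March 2019" into
# unrelated pieces, and assigns no type (money, duration, date) to
# anything -- exactly the cases that need something more sophisticated.
```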

Ponylang (SeanTAllen)

Last Week in Pony - November 3, 2019 November 03, 2019 03:34 PM

Pony 0.33.0 is out! This release includes changes to the Pony runtime options and a major step toward providing prebuilt binaries for Linux platforms. This is the first Ponyc release available as tar.gz packages for x86-based Linux distributions. The ponyup tool will soon support managing release and nightly installations of Pony toolchains.

Carlos Fenollosa (carlesfe)

Bogdan Popa (bogdan)

Announcing setup-racket November 03, 2019 08:00 AM

GitHub Actions are going to become generally available next week, so I created an action for installing Racket. You can find it on the marketplace. Here’s what a minimal CI configuration for a Racket package might look like:

November 02, 2019

Jeremy Morgan (JeremyMorgan)

Getting Started with Haxe November 02, 2019 01:25 AM

The Haxe Foundation recently released Haxe 4.0 and I decided to check it out. Here's what's new in version 4.

Haxe runs on Windows, Mac, and Linux. You can download it here.

What is Haxe?

According to the website:

Haxe is an open-source high-level strictly-typed programming language with a fast optimizing cross-compiler.

So the high-level, strictly-typed programming language part makes sense, but a fast optimizing cross-compiler? What's that about?

So the general idea here is "one language to rule them all". You write your application in Haxe, then it compiles to another language to target a platform. Basically it treats the output language (like JavaScript, C#, Python, etc) as bytecode for your application.

So I decided to try that out.

Get started with Haxe

I was delightfully surprised to see a bunch of extensions for Haxe in Visual Studio Code.

Hello World

So the hello world in Haxe is simple:

```
class HelloWorld {
    static public function main() {
        trace("Hello World");
    }
}
```

This looks pretty much like any other C-based language you've seen. We can clearly see a class defined with a static function, and inside it a call to "trace" (a weird name for an output function) with the text to be displayed. So first let's build a JavaScript file from it:

haxe -main HelloWorld -js HelloWorld.js

The compiler runs and outputs this:

```
// Generated by Haxe 4.0.0+ef18b627e
(function ($global) { "use strict";
var HelloWorld = function() { };
HelloWorld.main = function() {
	console.log("HelloWorld.hx:3:","Hello World");
};
HelloWorld.main();
})({});
```

Let's see if it works. I included it in an HTML file and got the following:


Cool it works!

So I decided to play around with it a little:

```
class HelloWorld {
    static public function main() {
        var today =;
        trace("Today is " + today);
    }
}
```

And then I generated another JS file:


That's a little crazy, but then I load it up in a browser:


And it works! It's written the date to the console. Pretty cool.

Let's try another language.

Hello World in Python

So enough playing around with JavaScript, let's generate our hello world in Python:

haxe -main HelloWorld -python

I run it and it generates the following code:


Wow, hello world in only 164 lines of code! Great. So I run it:


And it looks like it works, with the exact code I used for JavaScript. So far so good.

So let's run my date code again. I'm using the exact same Haxe file:

```
class HelloWorld {
    static public function main() {
        var today =;
        trace("Today is " + today);
    }
}
```

and it generates this:


Wow. Even more insane code to generate... a date? Well ok. I run it.


Cool! It actually worked with zero modification to the original haxe file.

Instead of the 486-line Python file Haxe generated, I could have simply written:

```
import datetime
print("Today is %s" %
```

But I digress. This is an edge case and not a thorough test, so I'll withhold judgment for now; I'm sure Haxe has to scaffold a lot of things and do some setup for when you truly use the language to develop a real application.

Let's try another language.

Hello World in C++

So with C++ and given what I've seen so far I have no idea what to expect.

With this one I have to install the hxcpp library, but it's easy enough:

haxelib install hxcpp

Then I just change my command a little here:

sudo haxelib install hxcpp

and after running it, it appears to scaffold a bunch of stuff:


and it generates a folder named HelloWorld.cpp:


So I go in and run the exe.


And it works!

Again, it generates a ton of code:


but again it's probably just scaffolding for when people build actual applications with it.

The date display works the same:


So this is pretty cool stuff! Imagine something that generates JavaScript, Python, or C++ from a single code file.

Creating a Node webserver

So let's build a little Node web server. I stole the idea from this tutorial.

First I have to install the Haxe Node.js library:

sudo haxelib install hxnodejs

Then I create a file named Main.hx

```
class Main {
    static function main() {
        // Configure our HTTP server to respond with Hello World to all requests.
        var server = js.node.Http.createServer(function(request, response) {
            response.writeHead(200, {"Content-Type": "text/plain"});
            response.end("Hello World\n");
        });

        // Listen on port 8000, IP defaults to
        server.listen(8000);

        // Put a console.log on the terminal
        trace("Server running at");
    }
}
```

and then compile (transpile?) it:

haxe -lib hxnodejs -main Main -js main.js

And then run it with Node.js:


And it runs! Pretty easy. The JavaScript it generated looks like this:

```
// Generated by Haxe 4.0.0+ef18b627e
(function ($global) { "use strict";
var Main = function() { };
Main.main = function() {
	var server = js_node_Http.createServer(function(request,response) {
		response.writeHead(200,{ "Content-Type" : "text/plain"});
		response.end("Hello World\n");
	});
	server.listen(8000);
	console.log("Main.hx:13:","Server running at");
};
var js_node_Http = require("http");
Main.main();
})({});
```

Which actually isn't too bad!!


This isn't a thorough review of Haxe, just playing around. While doing this I didn't:

  • build a real application
  • utilize any of Haxe's advanced language features
  • use the Hashlink Virtual Machine

You can see some of the new capabilities of Haxe in this video

What I liked:

  • It's easy to install. It's available on Windows, Mac, and Linux. I used Arch Linux for this article; at the time of writing the repository still had 3.4.7-1, but I was able to grab the 4.0 binaries, point Haxe to the standard library, and get going. It took minutes.
  • It worked as expected. I generated a lot of stuff just playing around with it, and I didn't find any "why does this happen" moments. Clearly some work has been put into making this solid, and it shows.
  • It's a really cool concept. I like the idea of using one solid language to generate many kinds of outputs. Some may avoid this pattern, but I think if it's done right it can reduce developer effort.

Why would anyone use this?


So you might be asking yourself: why would anyone use this? It's basically a fancy transpiler, right?

Not exactly. While I didn't really build anything out, I did research some of the language features, and it appears to be a very mature language with a lot of features I like, such as:

  • Classes, interfaces and inheritance
  • Conditional compilation
  • Static extensions
  • Pattern matching
  • Anonymous structures

If I were building a real application with this, I'd be tempted to evaluate it based on what I've seen.

From my first impressions I think it would be a very good way to build solid JavaScript applications. Since JavaScript as a language is pretty lacking in many ways, writing your code in something like Haxe and generating JavaScript might be a great way to build a larger application. It's the "JavaScript as bytecode" pattern I'm a fan of.

Remember, languages are not meant for compilers; they're meant for you. Any language that you can get good at and feel productive in is worth a look, and maybe Haxe is the one for you.

Try it out and let me know what you think in the comments!

November 01, 2019

Ponylang (SeanTAllen)

0.33.0 Released November 01, 2019 05:56 PM

Pony version 0.33.0 is now available. The release includes a couple of small breaking changes in command line handling when starting up Pony applications.

October 31, 2019

Simon Zelazny (pzel)


Thank you for reading my blog! I'd be honored to have your continued readership.
Please re-subscribe at:
Once again, thank you!

Wesley Moore (wezm)

An Illustrated Guide to Some Useful Command Line Tools October 31, 2019 09:00 PM

Inspired by a similar post by Ben Boyter, this is a list of useful command line tools that I use. It's not a list of every tool I use. These are tools that are new or typically not part of a standard POSIX command line environment.

This post is a living document and will be updated over time. It should be obvious that I have a strong preference for fast tools without a large runtime dependency like Python or node.js. Most of these tools are portable to *BSD, Linux, macOS. Many also work on Windows. For OSes that ship up to date software many are available via the system package repository.

Last updated: 31 Oct 2019

About my CLI environment: I use the zsh shell, Pragmata Pro font, and base16 default dark color scheme. My prompt is generated by promptline.

Table of Contents

  • Alacritty — Terminal emulator
  • alt — Find alternate files
  • bat — cat with syntax highlighting
  • bb — System monitor
  • chars — Unicode character search
  • dot — Dot files manager
  • dust — Disk usage analyser
  • eva — Calculator
  • exa — Replacement for ls
  • fd — Replacement for find
  • hexyl — Hex viewer
  • hyperfine — Benchmarking tool
  • jq — awk/XPath for JSON
  • mdcat — Render Markdown in the terminal
  • pass — Password manager
  • Podman — Docker alternative
  • Restic — Encrypted backup tool
  • ripgrep — Fast, intelligent grep
  • shotgun — Take screenshots
  • skim — Fuzzy finder
  • slop — Graphical region selection
  • Syncthing — Decentralised file synchronisation
  • tig — TUI for git
  • titlecase — Convert text to title case
  • Universal Ctags — Maintained ctags fork
  • watchexec — Run commands in response to file system changes
  • z — Jump to directories
  • zola — Static site compiler
  • Changelog — The changelog for this page

Alacritty Language: Rust

Alacritty is a fast terminal emulator. Whilst not strictly a command line tool, it does host everything I do in the command line. It is the terminal emulator in use in all the screenshots on this page.


alt Language: Rust

alt is a tool for finding the alternate to a file, e.g. the header for an implementation, or the test for an implementation. I use it paired with Neovim to easily toggle between tests and implementation.

$ alt app/models/page.rb


bat Language: Rust

bat is an alternative to the common (mis)use of cat to print a file to the terminal. It supports syntax highlighting and git integration.

bat screenshot


bb Language: Rust

bb is system monitor like top. It shows overall CPU and memory usage as well as detailed information per process.

bb screenshot


chars Language: Rust

chars shows information about Unicode characters matching a search term.

chars screenshot


dot Language: Rust

dot is a dotfiles manager. It maintains a set of symlinks according to a mappings file. I use it to manage my dotfiles.

dot screenshot


dust Language: Rust

dust is an alternative to du -sh. It calculates the size of a directory tree, printing a summary of the largest items.

dust screenshot


exa Language: Rust

exa is a replacement for ls with sensible defaults and added features like a tree view, git integration, and optional icons. I have ls aliased to exa in my shell.

exa screenshot


eva Language: Rust

eva is a command line calculator similar to bc, with syntax highlighting and persistent history.

eva screenshot


fd Language: Rust

fd is an alternative to find. It has a more user-friendly command line interface and respects ignore files, like .gitignore. The combination of its speed and ignore file support makes it excellent for searching for files in git repositories.

fd screenshot


hexyl Language: Rust

hexyl is a hex viewer that uses Unicode characters and colour to make the output more readable.

hexyl screenshot


hyperfine Language: Rust

hyperfine is a command line benchmarking tool. It allows you to benchmark commands with warmup runs and statistical analysis.

hyperfine screenshot


jq Language: C

jq is kind of like awk for JSON. It lets you transform and extract information from JSON documents.

jq screenshot


mdcat Language: Rust

mdcat renders Markdown files in the terminal. In supported terminals (not Alacritty) links are clickable (without the url being visible like in a web browser) and images are rendered.

mdcat screenshot


pass Language: sh

pass is a password manager that uses GPG to store the passwords. I use it with the passff Firefox extension and Pass for iOS on my phone.

pass screenshot


Podman Language: Go

podman is an alternative to Docker that does not require a daemon. Containers are run as the user running Podman so files written into the host don't end up owned by root. The CLI is largely compatible with the docker CLI.

podman screenshot


Restic Language: Go

restic is a backup tool that performs client-side encryption and de-duplication, and supports a variety of local and remote storage backends.


ripgrep Language: Rust

ripgrep (rg) recursively searches file trees for content in files matching a regular expression. It's extremely fast, and respects ignore files and binary files by default.

ripgrep screenshot


shotgun Language: Rust

shotgun is a tool for taking screenshots on X11-based environments. All the screenshots in this post were taken with it. It pairs well with slop.

$ shotgun $(slop -c 0,0,0,0.75 -l -f "-i %i -g %g") eva.png


skim Language: Rust

skim is a fuzzy finder. It can be used to fuzzy match input fed to it. I use it with Neovim and zsh for fuzzy matching file names.

skim screenshot


slop Language: C++

slop (Select Operation) presents a UI to select a region of the screen or a window and prints the region to stdout. Works well with shotgun.

$ slop -c 0,0,0,0.75 -l -f "-i %i -g %g"
-i 8389044 -g 1464x1008+291+818


Syncthing Language: Go

Syncthing is a decentralised file synchronisation tool. Like Dropbox but self hosted and without the need for a central third-party file store.


tig Language: C

tig is a ncurses TUI for git. It's great for reviewing and staging changes, viewing history and diffs.

tig screenshot


titlecase Language: Rust

titlecase is a little tool I wrote to format text using a title case format described by John Gruber. It correctly handles punctuation, and words like iPhone. I use it to obtain consistent titles on all my blog posts.

$ echo 'an illustrated guide to useful command line tools' | titlecase
An Illustrated Guide to Useful Command Line Tools

I typically use it from within Neovim where selected text is piped through it in-place. This is done by creating a visual selection and then typing: :!titlecase.


Universal Ctags Language: C

Universal Ctags is a fork of exuberant ctags that is actively maintained. ctags is used to generate a tags file that vim and other tools can use to navigate to the definition of symbols in files.

$ ctags --recurse src


watchexec Language: Rust

watchexec is a file and directory watcher that can run commands in response to file-system changes. Handy for auto running tests or restarting a development web server when source files change.

# run command on file change
$ watchexec -w content cobalt build

# kill and restart server on file change
$ watchexec -w src -s SIGINT -r 'cargo run'


z Language: sh

z tracks your most used directories and allows you to jump to them with a partial name.

z screenshot


zola Language: Rust

zola is a full-featured, very fast static site compiler.

zola screenshot



  • 31 Oct 2019 -- Add bb, and brief descriptions to the table of contents
  • 28 Oct 2019 -- Add hyperfine


Previous Post: What I Learnt Building a Lobsters TUI in Rust

Bogdan Popa (bogdan)

Announcing nemea October 31, 2019 09:00 AM

I just open sourced one of the very first Racket code bases I’ve worked on. The project is called nemea and it’s a tiny, privacy-preserving website analytics tracker. It doesn’t do anything fancy, but it does enough for my needs and, possibly, yours too, so check it out!

October 29, 2019

Carlos Fenollosa (carlesfe)

Mass cellphone surveillance experiment in Spain October 29, 2019 02:18 PM

Spanish Statistics Institute will track all cellphones for eight days (2 min, link in Spanish, via)

A few facts first:

  • Carriers geotrack all users by default, using cell tower triangulation. They also store logs of your calls and sms, but that is a story for another day.
  • This data is anonymized and sold to third parties constantly; it's part of the carriers' business model
  • With a court order, this data can be used to identify and track an individual...
  • ... which means that it is stored de-anonymized in the carrier servers
  • This has nothing to do with Facebook, Google or Apple tracking with cookies or apps
  • You cannot disable it with software; it is done at a hardware level. If you have any kind of phone, even a dumbphone, you are being tracked
  • It is unclear whether enabling airplane mode stops this tracking. The only way to make sure is to remove the SIM card and battery from the phone.

This is news because it's not a business deal but rather a collaboration between Spain's National Statistics Institute and all Spanish carriers, and because it's run at a large scale. But, as I said above, this is not technically novel.

On paper, and also thinking as a scientist, it sounds very interesting. The actual experiment consists of tracking most Spanish phones for eight days in order to learn about holiday trips. With the results, the Government expects to improve public services and infrastructure during holiday season.

The agreement indicates that no personally identifiable data will be transferred to the INE, and I truly believe that. There is nothing wrong about using aggregated data to improve public services per se, but I am concerned about two things.

First of all, Spain is a country where Congress passed a law to create political profiles of citizens by scraping social networks —fortunately rejected by the Supreme Court— and also blocked the entire IPFS gateway to silence political dissent.

I'd say it is quite reasonable to be a bit suspicious of the use that the Institutions will make of our data. This is just a first warning for Spanish citizens: if there is no strong backlash, the next experiment will maybe work with some personal identifiable data, "just to improve the accuracy of results". And yada yada yada, slippery slope, we end up tracking individuals in the open.

Second, and most important. This is no longer a topic of debate! We reached a compromise a few years ago, and the key word is consent.

All scientists have to obtain an informed and specific consent to work with personal data, even if it is anonymous, because it is trivially easy to de-anonymize individuals when you cross-reference the anonymous data with known data: credit cards, public cameras, public check-ins, etc. In this case, once again, the Spanish institutions are above the law, and also above what is ethically correct.
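A toy sketch, with entirely invented records, of how mechanical that cross-referencing is:

```python
# "Anonymous" carrier data: pseudonymous ID, cell area, timestamp.
anonymous_pings = [
    ("user_7f3a", "Madrid-Centro", "2019-10-29T09:00"),
    ("user_7f3a", "Barcelona-Sants", "2019-10-29T14:00"),
]

# Public, identified data: e.g. a geotagged social-media check-in.
public_checkins = [
    ("Alice Example", "Barcelona-Sants", "2019-10-29T14:00"),
]

# Joining on (place, time) links the pseudonym to a real name, and
# with it the person's entire movement history in the "anonymous" set.
identified = {
    ping_id: name
    for ping_id, place, time in anonymous_pings
    for name, p2, t2 in public_checkins
    if (place, time) == (p2, t2)
}
assert identified == {"user_7f3a": "Alice Example"}
```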

No consent, no data shared, end of story. Nobody consented to this nor were we given an option to opt out.

P.S. Of course, this is a breach of GDPR, but nobody cares.

Tags: law, security


October 28, 2019

Andrew Owen (yumaikas)

Nim: Scripting ease in a compiled language October 28, 2019 08:19 PM

Nim went 1.0 around the beginning of October, 2019. I’ve taken the chance to dig into it for the past three weeks, and while I can’t call myself a Nim expert as of yet, I think I can offer a decently informed opinion.

I suppose I should add a bit of context: I’ve used Go as a side-project language since 2014, mostly for Gills, PISC, and the blog you’re reading right now. It’s definitely served me well, but I’ve been growing rather tired of some of its… choices. It’s not been terrible, but I’ve grown to miss generics and exceptions from C#, and Go just tends to be verbose in general.

That being said, there are some things I like about Go: It compiles on both Windows and Linux, it makes executable files that are easy to deploy (at least for my amateur sysadmin sensibilities), and it gets web servers spun up relatively quickly, all things considered.

For a while, I got around Go’s verbosity by using PISC and then Lua as scripting languages for the projects I had going in Go. I see myself using Lua as a scripting layer for Gills for quite some time to come, just due to how nice a search/tag driven CMS is.

But, when Nim announced their 1.0 status, I decided to do a few small projects to see if it was the language I wanted to make my side-project language of choice for the next 5-10 years, the way that Go has been my side-project language of choice (outside of games, where I’ve mostly used Lua) for the past 4 years.

Project Zero: Porting Jumplist

So, a project that I’ve landed on as a decent way to exercise a programming language without requiring the language to have a full web stack (which is a very heavy requirement) is to implement a directory jumplist manager. I first wrote one in bash that was an abomination of sed, awk, and bash with slightly self-modifying code. The concept was to avoid calling into Windows processes from git-bash, since that seemed to bring a non-trivial amount of lag.

When I wanted to bring the same jumping experience to cmd.exe as a result of using that a lot more at my new job, I opted to write a jumplist manager in Go and some wrapper batch scripts.

Porting this program to Nim was mostly a pleasant experience. I was able to remove about half my code, because Nim had a pre-existing insertion-ordered table, so I didn’t have to write my own insertion-ordered key-value store like I did in Go. The only grumble I had from this project was that I had some difficulty figuring out some of the type errors I got at first. But seeing half of the code I wrote disappear, and the code that remained being much more concise, was really nice.

Project One: Porting kb_wiki

After I ported the Jumplist script, I had a pretty solid idea of my next target: the various instances of the kilobyte_wiki software. This was software I’d written in Erlang a while back. It was still running in tmux panes because I’d not figured out how to do proper erlang releases with the build system I was using. (Side note: I recommend trying out rebar3, as otherwise you have to debug a complicated makefile on top of the erlang tools. From what I’ve heard, rebar3 is more user-friendly, which matters for this erlang and make newb.) I also wanted to stand up an instance for tracking various articles and the best comments I’d found on Lobsters and Hacker News in a given month. In the process of porting kb_wiki from Erlang to Nim, I ended up finding a few things that kinda confused me at first glance, but that make sense in retrospect.

First, the standard library database modules only return database results as seq[string], roughly a List<string> if you’re from C#, or an ArrayList<String> if you’re from Java. This is a deliberate decision to optimize for the likely case where most of the data is going back into other strings in other parts of the application, and to help keep the database code from having to expose different sets of types for different databases (as I gathered from talking to Araq, the creator of Nim, on IRC). While it’s not the choice I’d go with for a database application where I was heavily concerned with I/O performance, it’s certainly not a terrible choice for the more standard Sqlite+web-server type stuff I’ve done a lot of lately.

Project One and a half: Understanding Jester

Also, though it currently (as of October 2019) doesn’t look the most actively maintained, Jester is definitely very nice to use as a web framework. Here are a few things that I ran across when porting the wiki from Erlang to Nim:

  • You’ll want to make sure to read up on the settings: macro if you want to set what ports and hostnames your web server is binding to. I use it in kbwiki, if you want a small example.
  • cond expressions in the router that evaluate to false just won’t match, so you’ll get 404 errors, which can make debugging tricky. I recommend making them if statements if you’re trying to debug why a route isn’t matching, and then use your debugging method of choice to figure out what’s going on.
  • For now, Jester only properly supports HTML forms that use an enctype="multipart/form-data" in their HTTP posts. I don’t think this is a major problem, but it tripped me up when I was trying to get forms to work on my website, so I figured I’d mention it here to save the next person making websites some time.
  • The Jester README currently states that it requires a devel version of the Nim compiler, but I think at least the version in Nimble works fine with Nim 1.0. I am investigating this, but for now, it’s not something huge to worry about.
  • Jester is a bit underdocumented, which means that understanding certain parts will require reading its source. I didn’t have much difficulty there, and I’d like to help with the docs for it as I have time. This paragraph is a first attempt at helping with them.

Cool things learned from two and a half projects

So, I had already learned from my first project in Nim that it was a lot more concise than Go, but it was in this second project where that truly became evident. See, Nim is the language I’ve had the easiest experience creating executables in. If you look at the code for kb_wiki, you’ll notice the main app, kbwiki.nim, and that it requires database.nim, kb_config.nim and views.nim. Unrelated to the main kbwiki app are mdtest.nim, todo.nim, and scraper.nim. They are all separate applications that can be compiled into executables using nim c mdtest, nim c todo or nim c scraper, and they can use other nim files.

But, because files are that easy to re-use, and executables are that easy to make, code re-use becomes much easier, at least in the small, than I’ve ever had it be until now. In Go, creating new executables involves a lot more ceremony, in terms of making sure that a new folder structure is created and such. In C#, unless you use LinqPad, you either have MSBuild/VS projects, or you have to create a whole folder and then use the dotnet tool to create projects and explain to the C# compiler how to turn a set of source files into an executable. But in Nim, you just write your file, install the right things via nimble, and then you can compile it into an executable.

Project Three: Scripts and utilities at work

This nimbleness of executable creation makes Nim the first compiled language I’d use for general-purpose command-line tools, where the value of writing 10 lines of code isn’t dominated by the process of setting up a folder structure and project files to get everything going.

So, I’ve made some small utilities for my day job. A program for spitting out “banners”, aka the “-” with a user-specified color in the background. A little command-line punch-clock for helping me keep track of what time I’m working (when working flex hours, it’s helpful to be able to punch in and out for personal tracking). A tool for copying these little tools from the directory they get developed in to the designated “my command-line tools live here to be on the $PATH” directory.

Project the Fourth: Scraping kb_wiki

Most recently, like I mentioned above, I wrote a scraper that can be pointed at instances of kbwiki and download their content. The motivation for the project was pretty simple: I had two instances of the Erlang version of kbwiki floating around, and I wanted to get the content hosted on them in Sqlite files.

The Erlang versions of the wiki use dets tables, because that’s what was easiest to get access to when I was writing the wiki in the first place. So, at first, I looked into trying to connect my Erlang code to SQLite. But after digging into that for about 90 minutes, I was running into issues around gcc compiler versions on my linux box. Not wanting to spend until the wee hours of the morning fighting Erlang, GCC, and Sqlite all at once, I decided to switch from trying to dump the content into Sqlite files from Erlang to writing a Nim scraper that basically gloms over the HTML, logs in as an admin, and then downloads all of the markdown for the articles. That took me just under two and a half hours (give or take) to finish, and then I was able to scrape down the wikis in a matter of seconds.

After that, I moved and from running on Erlang to running on Nim. And I didn’t have to fuss with certificates or the like, due to them running behind nginx. I also put up, where I’ve been keeping a feed of what seem to me to be notable comments, articles, and such.


So, having done 4.314 things in Nim in the space of just under a month, I’ve got to say that I quite like Nim. I don’t think it’s quite ready to be a programmer’s first language, a la Python or Javascript, mostly due to a lack of beginner-friendly documentation. I also think it still has a lot of room to grow. But I’d like to try to help it grow. Jester is super handy for making websites with. I’ve discovered a whole new level of code-reusability, and the language is just so darn handy to build small/medium things with. I’d like to try to build something bigger with it, probably something off my idea backlog.

David Wilson (dw)

Operon: Extreme Performance For Ansible October 28, 2019 12:30 PM

I'm very excited to unveil Operon, a high performance replacement for Ansible® Engine, tailored for large installations and offered by subscription. Operon runs your existing playbooks, modules, plug-ins and third party tools without modification using an upgraded engine, dramatically increasing the practical number of nodes addressable in a single run, and potentially saving hours on every invocation.

Operon can be installed independently or side-by-side with Ansible Engine, enabling it to be gradually introduced to your existing projects or employed on a per-run basis.

Here is the runtime for 416 tasks of common.yml from DebOps 0.7.2 deployed via SSH:

Operon reduces runtime by around 60% compared to Ansible for a single node, but things really heat up for large runs. See how runtime scales using a 24 GiB, 8 core Xeon E5530 deploying to Google Cloud VMs over an 18 ms SSH connection:

Each run executed 416 tasks per node, including loop items. In the 1,024 node run, 490,496 tasks executed in 54 minutes, giving an average throughput of 151 tasks per second. Linear scaling is apparent, with just under 4x time increase moving from 256 to 1,024 nodes.
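As a quick sanity check of the quoted throughput figure, using only the numbers stated above:

```python
# 1,024 node run: total tasks executed and wall-clock time, from the text above.
tasks = 490_496
minutes = 54

throughput = tasks / (minutes * 60)  # tasks per second
print(round(throughput, 1))
```

This comes out at roughly 151 tasks per second, matching the stated average.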

The 256 node Ansible run was cancelled following a lengthy period with no output, after many re-runs to iteratively reduce forks from 40 to 10, so Ansible would not exceed RAM. A 13 fork run may have succeeded, but further attempts were abandoned having consumed two days worth of compute time.

In the final run, Ansible completed 89% of tasks in 6h 13m prior to cancellation:

256 Nodes, DebOps common.yml

Operon deployed to all nodes in parallel for every run presented. Operon has imperceptible overhead executing 1,024 forks given 8 cores and cleanly scales to at least 6,144 given 24 cores. Had these results been recorded using 16 cores rather than 8, we expect the 1,024 node run would complete in 27 minutes rather than 54 minutes.

Memory usage is highly predictable and significantly decoupled from forks. With 256 forks, Operon uses 4x less RAM than Ansible uses for 10 forks, while consuming at least 15x less controller CPU time to achieve the same outcome.

This graph is crooked as the 64 node Ansible run executed with 40 forks, while the 256 node run executed with 10 forks. Ansible required 1.6 GiB per fork for the 256 node run, placing a severe constraint on achievable parallelism regardless of available RAM.
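A back-of-the-envelope sketch using the figures above shows why forks had to drop so far:

```python
ram_gib = 24          # controller RAM in the test setup above
per_fork_gib = 1.6    # Ansible's per-fork usage in the 256 node run

# The most forks that fit in RAM at all, before any OS or engine overhead.
max_forks = round(ram_gib / per_fork_gib)
print(max_forks)
```

That ceiling of 15 forks, before counting any other memory use, is consistent with the observation that a 13 fork run might have succeeded.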

Operon is the progression of a design approach first debuted in Mitogen for Ansible. It inherits massive low-level efficiency improvements from that work, already depended on by thousands of users:

Beyond software

Performance is a secondary effect of a culture shift towards stronger user friendliness, compatibility and cost internalization. There is a lot to reveal here, but to offer a taste of what's planned, I'm pleased to announce a forwards-compatible playbook syntax guarantee, in addition to restoration of specific Ansible Engine constructs marked deprecated.

"include" statements
  - include: "i-will-always-work.yml"

"with" loops
  - debug: msg={{item}}
    with_items: ["i", "will", "always", "work"]

"squash actions"
  - apt:
      name: "{{item}}"
    with_items: ["i", "will",
                 "always", "work"]

hyphens in group names
  $ cat hosts

hash merging
  # I will always work
  hash_behaviour = merge
The Ansible 2.9-compatible syntax Operon ships will always be supported, and future syntax deprecations in Ansible Engine do not apply in Operon. Changes like these harm working configurations without improving capability, and are a major source of error-prone labour during upgrades.

Over time this guarantee will progressively extend to engine semantics and outwards.

How can I get this?

Operon is initially distributed with support from Network Genomics, backed by experience and dedication to service unavailable elsewhere. If your team are gridlocked by deployments or fatigued by years of breaking upgrades, consider requesting an evaluation, and don't hesitate to drop me an e-mail with any questions and concerns.

Software is always better in the open, so a public release will happen when some level of free support can be provided. Subscribe to the operon-announce mailing list to learn about future releases.

Will Operon help Windows performance?

Yes. If you're struggling with performance deploying to Windows, please get in touch.

Will Operon help network device performance?

Yes. Operon features an architectural redesign that extends far beyond the transport layer, applying to all connection types equally.

Is Operon a fork of Ansible?

No. Operon is an incremental rewrite of the engine, a small component of around 60k code lines, of which around a quarter are replaced. Every Ansible installation includes around 715k lines, of which the vast majority is independently maintained by the wider Ansible community, just as Operon is.

Will Operon help improve Ansible Engine?

Yes. Operon is already promoting improvement within Ansible Engine, and since it remains an upstream, an incentive exists to contribute code upstream where practical.

Is Operon free software?

Yes. Operon is covered by the same GPL license that covers Ansible, and you are free to make use of the code to the full extent of that license.

Does Operon break compatibility?

No. Operon does not break compatibility with the standard module collection, plug-in interfaces, or the surrounding Ansible ecosystem, and never plans to. Compatibility is a primary deliverable, including to keep pace with future improvements, and backwards compatibility such as improved playbook syntax stability.

I target only one node, what can Operon do for me?

Operon will help ensure the continued marketability of skills you have heavily invested in. It offers a powerful new flexibility that previously could not exist: your freedom to choose an engine. Whether you use it directly or not, you already benefit from Operon.



October 27, 2019

Derek Jones (derek-jones)

Projects chapter of ‘evidence-based software engineering’ reworked October 27, 2019 11:44 PM

The Projects chapter of my evidence-based software engineering book has been reworked; draft pdf available here.

A lot of developers spend their time working on projects, and there ought to be loads of data available. But, as we all know, few companies measure anything, and fewer hang on to the data.

Every now and again I actively contact companies asking for data, but work on the book prevents me from spending more time doing this. Data is out there; it’s a matter of asking the right people.

There is enough evidence in this chapter to slice-and-dice much of the nonsense that passes for software project wisdom. The problem is, there is no evidence to suggest what might be useful and effective theories of software development. My experience is that there is no point in debunking folktales unless there is something available to replace them. Nature abhors a vacuum; a debunked theory has to be replaced by something else, otherwise people continue with their existing beliefs.

There is still some polishing to be done, and a few promises of data need to be chased up.

As always, if you know of any interesting software engineering data, please tell me.

Next, the Reliability chapter.

Ponylang (SeanTAllen)

Last Week in Pony - October 27, 2019 October 27, 2019 02:59 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

October 25, 2019

Gergely Nagy (algernon)

Firmware porting project update October 25, 2019 02:45 PM

A while ago I put up a project page, where I offered a few deals: send me a keyboard to port Kaleidoscope to, or donate to my Liberapay fund, and once there's enough funds there, I'll get a board from the wishlist, and do the porting work. There are some news to share about the project!

A few people contacted me about boards, but nothing came out of that yet. Others generously donated to the Liberapay fund, enough to allow me to order an Azeron keypad! I paid the VAT & shipping cost, but the rest has been covered from donations. Thank you, all of you!

Now to wait 'till the board gets here, and then I can figure out all the nuances of porting Kaleidoscope to something wild.

October 24, 2019

Unrelenting Technology (myfreeweb)

FreeBSD and custom firmware on the Google Pixelbook October 24, 2019 12:00 AM

Back in 2015, I jumped on the ThinkPad bandwagon by getting an X240 to run FreeBSD on. Unlike most people in the ThinkPad crowd, I actually liked the clickpad and didn’t use the trackpoint much. But this summer I’ve decided that it was time for something newer. I wanted something..

  • lighter and thinner (ha, turns out this is actually important, I got tired of carrying a T H I C C laptop - Apple was right all along);
  • with a 3:2 display (why is Lenovo making these Serious Work™ laptops 16:9 in the first place?? 16:9 is awful in below-13-inch sizes especially);
  • with a HiDPI display (and ideally with a good size for exact 2x scaling instead of fractional);
  • with USB-C ports;
  • without a dGPU, especially without an NVIDIA GPU;
  • assembled with screws and not glue (I don’t necessarily need expansion and stuff in a laptop all that much, but being able to replace the battery without dealing with a glued chassis is good);
  • supported by FreeBSD of course (“some development required” is okay but I’m not going to write big drivers);
  • how about something with open source firmware, that would be fun.

The Qualcomm aarch64 laptops were out because embedded GPU drivers like freedreno (and UFS storage drivers, and Qualcomm Wi-Fi..) are not ported to FreeBSD. And because Qualcomm firmware is very cursed.

Samsung’s RK3399 Chromebook or the new Pinebook Pro would’ve been awesome.. if I were a Linux user. No embedded GPU drivers on FreeBSD, again. No one has added the stuff needed for FDT/OFW attachment to LinuxKPI. It’s rather tedious work, so we only support PCIe right now. (Can someone please make an ARM laptop with a PCIe GPU, say with an MXM slot?)

So it’s still gonna be amd64 (x86) then.

I really liked the design of the Microsoft Surface Book, but the iFixit score of 1 (one) and especially the Marvell Wi-Fi chip that doesn’t have a driver in FreeBSD are dealbreakers.

I was considering a ThinkPad X1 Carbon from an old generation - the one from the same year as the X230 is corebootable, so that’s fun. But going back in processor generations just doesn’t feel great. I want something more efficient, not less!

And then I discovered the Pixelbook. Other than the big huge large bezels around the screen, I liked everything about it. Thin aluminum design, a 3:2 HiDPI screen, rubber palm rests (why isn’t every laptop ever doing that?!), the “convertibleness” (flip the screen around to turn it into.. something rather big for a tablet, but it is useful actually), a Wacom touchscreen that supports a pen, mostly reasonable hardware (Intel Wi-Fi), and that famous coreboot support (Chromebooks’ stock firmware is coreboot + depthcharge).

So here it is, my new laptop, a Google Pixelbook.

What is a Chromebook, even

The write protect screw is kind of a meme. All these years later, it’s That Thing everyone on various developer forums associates with Chromebooks. But times have moved on. As a reaction to glued devices and stuff, the Chrome firmware team has discovered a new innovative way of asserting physical presence: sitting around for a few minutes, pressing the power button when asked. It’s actually pretty clever though; it is more secure than.. not doing that.

Wait, what was that about?

Let’s go back to the beginning and look at firmware security in Chromebooks and other laptops.

These devices are designed for the mass market first. Your average consumer trusts the vendor and (because they’ve read a lot of scary news) might be afraid of scary attackers out to install a stealthy rootkit right into their firmware. Businesses are even more afraid of that, and they push for boot security on company laptops even more. This is why Intel Boot Guard is a thing that the vast majority of laptops have these days. It’s a thing that makes sure only the vendor can update firmware. Evil rootkits are out. Unfortunately, the user is also out.

Google is not like most laptop vendors.

Yes, Google is kind of a surveillance capitalism / advertising monster, but that’s not what I’m talking about here. Large parts of Google are very much driven by FOSS enthusiasts. Or something. Anyway, the point is that Chromebooks are based on FOSS firmware and support user control as much as possible. (Without compromising regular-user security, but turns out these are not conflicting goals and we can all be happy.)

Instead of Boot Guard, Google has its own way of securing the boot process. The root of trust in modern (>=2017) devices is a special Google Security Chip, which in normal circumstances also ensures that only Google firmware runs on the machine, but:

  • if you sit through the aforementioned power-button-clicking procedure, you get into Developer Mode: OS verification is off, you have a warning screen at boot, and you can press Ctrl-D to boot into Chrome OS, or (if you enabled this via a command run as root) Ctrl-L to open SeaBIOS.

    • Here’s the fun part.. it doesn’t have to be SeaBIOS. You can flash any Coreboot payload into the RW_LEGACY slot right from Chrome OS, reboot, press a key and you’re booting that payload!
  • if you also buy or solder a special cable (“SuzyQable”) and do the procedure a couple times more, your laptop turns into the Ultimate Open Intel Firmware Development Machine. Seriously.

    • the security chip is a debug chip too! Case Closed Debugging gives you serial consoles for the security chip itself, the embedded controller (EC) and the application processor (AP, i.e. your main CPU), and it also gives you a flasher (via special flashrom for now, but I’m told there’s plans to upstream) that allows you to write AP and EC firmware;
    • some security is still preserved with all the debugging: you can (and should) set a CCD password, which lets you lock-unlock the debug capabilities and change write-protect whenever you want, so that only you can flash firmware (at least without opening the case and doing very invasive things, the flash chip is not even SOIC anymore I think);
    • and you can hack without fear: the security chip is not brickable! Yes, yes, that means the chip is only a “look but don’t touch” kind of open source, it will only boot Google-signed firmware. Some especially paranoid people think this is An NSA Backdoor™. I think this is an awesome way to allow FULL control of the main processor and the EC, over just a cable, with no way of bricking the device! And to solve the paranoia, reproducible builds would be great.

You mentioned something about FreeBSD in the title?

Okay, okay, let’s go. I didn’t even want to write an introduction to Chromebooks but here we are. Anyway, while waiting for the debug cable to arrive, I’ve done a lot of work on FreeBSD, using the first method above (RW_LEGACY).

SeaBIOS does not have display output working in OSes that don’t specifically support the Coreboot framebuffer (OpenBSD does, FreeBSD doesn’t), and I really just hate legacy BIOS, so I’ve had to install a UEFI implementation into RW_LEGACY since I didn’t have the cable yet. My own EDK2 build did not work (now I see that it’s probably because it was a debug build and that has failing assertions). So I’ve downloaded MrChromebox’s full ROM image, extracted the payload using cbfstool and flashed that. Boom. Here we go, press Ctrl-L for UEFI. Nice. Let’s install FreeBSD.

The live USB booted fine. With the EFI framebuffer, an NVMe SSD and a PS/2 keyboard it was a working basic system. I’ve resized the Chrome OS data partition (Chrome OS recovers from that fine, without touching custom partitions), found that there’s already an EFI system partition (with a GRUB2 setup to boot Chrome OS, which didn’t boot like that o_0), installed everything and went on with configuration and developing support for more hardware.

(note: I’m leaving out the desktop configuration part here, it’s mostly a development post; I use Wayfire as my display server if you’re curious.)

So how’s the hardware?

Wi-Fi and Bluetooth

Well, that was easy. The Pixelbook has an Intel 7265. The exact same wireless chip that was in my ThinkPad. So, Wi-Fi works great with iwm.

Bluetooth.. if this was the newer 8265, would’ve already just worked :D

These Intel devices present a “normal” ubt USB Bluetooth adapter, except it only becomes normal if you upload firmware into it, otherwise it’s kinda dead. (And in that dead state, it spews interrupts, raising the idle power consumption by preventing the system from going into package C7 state! So usbconfig -d 0.3 power_off that stuff.) FreeBSD now has a firmware uploader for the 8260/8265, but it does not support the older protocol used by the 7260/7265. It wouldn’t be that hard to add that, but no one has done it yet.

Input devices


Google kept the keyboard as good old PS/2, which is great for ensuring that you can get started with a custom OS with a for-sure working keyboard.

About the only interesting thing with the keyboard was the Google Assistant key, where the Win key usually is. It was not recognized as anything at all. I used DTrace to detect the scancode without adding prints into the kernel and rebooting:

dtrace -n 'fbt::*scancode2key:entry { printf("[st %x] %x?\n", *(int*)arg0, arg1); } \
  fbt::*scancode2key:return { printf("%x\n", arg1);  }'

And wrote a patch to interpret it as a useful key (right meta, couldn’t think of anything better).


The touchpad and touchscreen are HID-over-I²C, like on many other modern laptops. I don’t know why this cursed bus from the 80s is gaining popularity, but it is. At least FreeBSD has a driver for Intel (Synopsys DesignWare really) I²C controllers.

(Meanwhile Apple MacBooks now use SPI for even the keyboard. FreeBSD has an Intel SPI driver but right now it only supports ACPI attachment for Atoms and such, not PCIe yet.)

The even better news is that there is a nice HID-over-I²C driver in development as well. (note: the corresponding patch for configuring the devices via ACPI is pretty much a requirement, uncomment -DHAVE_ACPI_IICBUS in the iichid makefile too to get that to work. Also, upcoming Intel I²C improvement patch.)

The touchscreen started working with that driver instantly.

The touchpad was.. a lot more “fun”. The I²C bus it was on would just appear dead. After some debugging, it turned out that the in-progress iichid driver was sending a wrong extra out-of-spec command, which was causing Google’s touchpad firmware to throw up and lock up the whole bus.

But hey, nice bug discovery, if any other device turns out to be as strict in accepting input, no one else would have that problem.

Another touchpad thing: by default, you have to touch it with a lot of pressure. Easily fixed in libinput:

% cat /usr/local/etc/libinput/local-overrides.quirks
[Eve touchpad]

UPD 2019-10-24 Pixelbook Pen

The touchscreen in the Pixelbook is made by Wacom, and supports stylus input like the usual Wacom tablets. For USB ones, on FreeBSD you can just use webcamd to run the Linux driver in userspace. Can’t exactly do that with I²C.

But! Thankfully, it exposes generic HID stylus reports, zero Wacom specifics required. I’ve been able to write a driver for that quite easily. Now it works. With pressure, tilt, the button, all the things :)

Display backlight brightness

This was another “fun” debugging experience. The intel_backlight console utility (which was still the thing to use on FreeBSD) did nothing.

I knew that the i915 driver on Chrome OS could adjust the brightness, so I made it work here too, and all it took was:

  • adding more things to LinuxKPI to allow uncommenting the brightness controls in i915kms;
  • (and naturally, uncommenting them);
  • finding out that this panel uses native DisplayPort brightness configured via DPCD (DisplayPort Configuration Data), enabling compat.linuxkpi.i915_enable_dpcd_backlight="1" in /boot/loader.conf;
  • finding out that there’s a fun bug in the.. hardware, sort of:

    • the panel reports that it supports both DPCD backlight and a direct PWM line (which is true);
    • Google/Quanta/whoever did not connect the PWM line;
    • (the panel is not aware of that);
    • the i915 driver prefers the PWM line when it’s reported as available.

Turns out there was a patch sent to Linux to add a “prefer DPCD” toggle, but for some reason it was not merged. The patch does not apply cleanly so I just did a simpler hack version:

--- i/drivers/gpu/drm/i915/intel_dp_aux_backlight.c
+++ w/drivers/gpu/drm/i915/intel_dp_aux_backlight.c
@@ -252,8 +252,12 @@ intel_dp_aux_display_control_capable(struct intel_connector *connector)
         * the panel can support backlight control over the aux channel
         */
        if (intel_dp->edp_dpcd[1] & DP_EDP_TCON_BACKLIGHT_ADJUSTMENT_CAP &&
-           (intel_dp->edp_dpcd[2] & DP_EDP_BACKLIGHT_BRIGHTNESS_AUX_SET_CAP) &&
-           !(intel_dp->edp_dpcd[2] & DP_EDP_BACKLIGHT_BRIGHTNESS_PWM_PIN_CAP)) {
+           (intel_dp->edp_dpcd[2] & DP_EDP_BACKLIGHT_BRIGHTNESS_AUX_SET_CAP)
+/* for Pixelbook (eve), simpler version of */
+#if 0
+           && !(intel_dp->edp_dpcd[2] & DP_EDP_BACKLIGHT_BRIGHTNESS_PWM_PIN_CAP)
+#endif
+           ) {
                DRM_DEBUG_KMS("AUX Backlight Control Supported!\n");
                return true;

And with that, it works, with 65536 steps of brightness adjustment even.


Suspend and resume

The Pixelbook uses regular old ACPI S3 sleep, not the fancy new S0ix thing, so that’s good.

On every machine with a TPM though, you have to tell the TPM to save state before suspending, otherwise you get a reset on resume. I already knew this because I’ve experienced that on the ThinkPad.

The Google Security Chip runs an open-source TPM 2.0 implementation (fun fact, written by Microsoft) and it’s connected via… *drum roll* I²C. Big surprise (not).

FreeBSD already has TPM 2.0 support in the kernel, the userspace tool stack was recently added to Ports as well. But of course there was no support for connecting to the TPM over I²C, and especially not to the Cr50 (GSC) TPM specifically. (it has quirks!)

I wrote a driver (WIP) hooking up the I²C transport (relies on the aforementioned ACPI-discovery-of-I²C patch). It does not use the interrupt (I found it buggy: at first attachment, it fires continuously, and after a reattach it stops completely) and after attach (or after system resume) the first command errors out, but that can be fixed and other than that, it works. Resume is fixed, entropy can be harvested, it could be used for SSH keys too.

Another thing with resume: I’ve had to build the kernel with nodevice sdhci to prevent the Intel SD/MMC controller (which is not attached to anything here - I’ve heard that the 128GB model might be using eMMC instead of NVMe but that’s unclear) from hanging for a couple minutes on resume.

Dynamic CPU frequency

At least on the stock firmware, the old-school Intel SpeedStep did not work because the driver could not find some required ACPI nodes (perf or something).

Forget that, the new Intel Speed Shift (which lets the CPU adjust frequency on its own) works nicely with the linked patch.

Tablet mode switch

When the lid is flipped around, the keyboard is disabled (unless you turn the display brightness to zero, I’ve heard - which is fun because that means you can connect a monitor and have a sort-of computer-in-a-keyboard look, like retro computers) and the system gets a notification (Chrome OS reacts to that by enabling tablet mode).

Looking at the DSDT table in ACPI, it was quite obvious how to support that notification:

Device (TBMC) {
    Name (_HID, "GOOG0006")  // _HID: Hardware ID
    Name (_UID, One)  // _UID: Unique ID
    Name (_DDN, "Tablet Motion Control")  // _DDN: DOS Device Name
    Method (TBMC, 0, NotSerialized) {
        If ((RCTM () == One)) { Return (One) }
        Else { Return (Zero) }
    }
}
On Linux, this is exposed as an evdev device with switch events. I was able to replicate that quite easily. My display server does not support doing anything with that yet, but I’d like to do something like enabling an on-screen keyboard to pop up automatically when tablet mode is active.
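To make the evdev side concrete, here is a minimal Python sketch of decoding tablet-mode switch events from a raw evdev buffer. It assumes the 64-bit `struct input_event` layout and omits opening an actual `/dev/input` node; `EV_SW` and `SW_TABLET_MODE` are the standard Linux event codes.

```python
import struct

# Linux struct input_event on 64-bit: struct timeval (two longs),
# then u16 type, u16 code, s32 value.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_SW = 0x05           # switch-type event
SW_TABLET_MODE = 0x01  # the tablet-mode switch

def tablet_mode_events(buf):
    """Yield True/False for each tablet-mode switch event in a raw buffer."""
    for off in range(0, len(buf) - EVENT_SIZE + 1, EVENT_SIZE):
        _sec, _usec, etype, code, value = struct.unpack_from(EVENT_FORMAT, buf, off)
        if etype == EV_SW and code == SW_TABLET_MODE:
            yield bool(value)
```

A display server (or the on-screen-keyboard logic mentioned above) would read such records from the switch device and toggle its tablet-mode behaviour on each event.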

Keyboard backlight brightness

I generally leave it off because I don’t look at the keyboard, but this was a fun and easy driver to write.

Also obvious how it works when looking at ACPI:

Device (KBLT) {
    Name (_HID, "GOOG0002")  // _HID: Hardware ID
    Name (_UID, One)  // _UID: Unique ID
    Method (KBQC, 0, NotSerialized) {
        Return (^^PCI0.LPCB.EC0.KBLV) /* \_SB_.PCI0.LPCB.EC0_.KBLV */
    }
    Method (KBCM, 1, NotSerialized) {
        ^^PCI0.LPCB.EC0.KBLV = Arg0
    }
}

Using the debug cable on FreeBSD

The debug cable presents serial consoles as bulk endpoints without any configuration capabilities. On Linux, they are supported by the “simple” USB serial driver.

Adding the device to the “simple” FreeBSD driver ugensa took some debugging. The driver was clearing USB stalls when the port is opened. That’s allowed by the USB spec and quite necessary on some devices. Unfortunately, the debug interface throws up when it sees that request. The responsible code in the device has a /* Something we need to add support for? */ comment :D


Audio

The only thing that’s unsupported is onboard audio. The usual HDA controller only exposes the DisplayPort audio-through-the-monitor thing. The speakers, mic and headphone jack are all connected to various codecs exposed via… yet again, I²C. I am not about to write the drivers for these codecs, since I’m not really interested in audio on laptops.

Firmware is Fun

After the debug cable arrived, I’ve spent some time debugging the console-on-FreeBSD thing mentioned above, and then started messing with coreboot and TianoCore EDK2.

My discoveries so far:

  • there’s nothing on the AP console on stock firmware because Google compiles release FW with serial output off, I think to save on power or something;
  • me_cleaner needs to be run with -S -w MFS. As mentioned in the --help, the MFS partition contains PCIe related stuff. Removing it causes the NVMe drive to detach soon after boot;
  • upstream Coreboot (including MrChromebox’s builds) fails to initialize the TPM, just gets zero in response to the vendor ID request. Funnily enough, that would’ve solved the resume problem without me having to write the I²C TPM driver for FreeBSD - but now that I’ve written it, I’d prefer to actually have the ability to use the TPM;
  • EDK2’s recent UefiPayloadPkg doesn’t support PS/2 keyboard and NVMe out of the box, but they’re very easy to add (hopefully someone would add them upstream after seeing my bug reports);
  • UefiPayloadPkg supports getting the framebuffer from coreboot very well;
  • coreboot can run Intel’s GOP driver before the payload (it’s funny that we’re running a UEFI module before running the UEFI implementation) and that works well;
  • but libgfxinit - the nice FOSS, written-in-Ada, verified-with-SPARK implementation of Intel GPU initialization and framebuffer configuration - supports Kaby Lake now!

    • however, we have a DPCD thing again with this display panel here - it reports max lane bandwidth as 0x00, libgfxinit interprets that as the slowest speed and we end up not having enough bandwidth for the high-res screen;
    • I’ve been told that this is because there’s a new way of conveying this information that’s unsupported. I’ll dig around in the Linux i915 code and try to implement it properly here but for now, I just did a quick hack, hardcoding the faster bandwidth. Ta-da! My display is initialized with formally verified open source code! Minus one blob running at boot!
  • persistent storage of EFI variables needs some SMM magic. There’s a quick patch that changes EDK2’s emulated variable store to use coreboot’s SMM store. EDK2 has a proper SMM store of its own, I’d like to look into making that coreboot-compatible or at least just writing a separate coreboot-compatible store module.
  • UPD 2019-10-24: for external displays, DisplayPort alt mode on USB-C can be used. Things to note:

    • DP++ (DP cables literally becoming HDMI cables) can’t work over USB Type C, which is why there are no HDMI-A-n connectors on the GPU, so a passive HDMI-mDP dongle plugged into a mDP-TypeC dongle won’t work;
    • the Chrome EC runs the alt mode negotiation, the OS doesn’t need any special support;
    • for DP dongles to work at all, the EC must run RW firmware and that doesn’t happen as-is with upstream coreboot. There is a jump command on the EC console. Also this patch should help?? (+ this)

An aside: why mess with firmware?

If you’re not the kind of person who’s made happy by just the fact that some more code during the boot process of their laptop is now open and verified, and you just want things to work, you might not be as excited about open source firmware development as I am.

But you can do cool things with firmware that give you practical benefit. The best example I’m seeing is better Hackintosh support. Instead of patching macOS to work on your machine, you could patch your machine to pretend to almost be a Mac:

Is this cool or what?


Pixelbook, FreeBSD, coreboot, EDK2 good.

Seriously, I have no big words to say, other than just recommending this laptop to FOSS enthusiasts :)

October 23, 2019

Aaron Bieber (qbit)

Websockets with OpenBSD's relayd October 23, 2019 03:00 PM

The need

I am in the process of replacing all my NGINX instances with httpd/relayd. So far this has been going pretty smoothly.

I did, however, run into an issue with websockets in Safari on iOS and macOS which made me think they weren’t working at all! Further testing proved they were working fine in other browsers, so .. more digging needs to be done!

The configs

I tested this in a VM running on OpenBSD. Its ‘external’ IP is

This config also works with TLS but for simplicity, this example will be plain text.



log connection errors

table <websocketd> { }

http protocol ws {
	match request header append "X-Forwarded-For" value "$REMOTE_ADDR"
	match request header append "X-Forwarded-By" \
	    value "$SERVER_ADDR:$SERVER_PORT"
	match request header "Host" value "" forward to <websocketd>

	http websockets
}

relay ws {
	listen on $ext_addr port 8000
	protocol ws
	forward to <websocketd> port 9999
}

Here we are setting up a “websocket” listener on port 8000 and forwarding it to port 9999, where we will be running websocketd.

The key directive is http websockets in the http block. Without this the proper headers won’t be set and the connection will not work.
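For context on what those headers carry: in the RFC 6455 opening handshake, the server answers the client’s Sec-WebSocket-Key with a derived Sec-WebSocket-Accept value, alongside the Upgrade/Connection headers that relayd must pass through. A small Python sketch of that derivation (this is the protocol rule from the RFC, not relayd code):

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the websocket handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a given client key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

For the example key from the RFC, `websocket_accept("dGhlIHNhbXBsZSBub25jZQ==")` yields `"s3pPLMBiTxaQ9kYGzzhZRbK+xOo="`. If a proxy in the middle mangles the Upgrade/Connection headers, this handshake fails and the browser refuses the connection.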


# $OpenBSD: httpd.conf,v 1.20 2018/06/13 15:08:24 reyk Exp $

server "" {
	listen on * port 80
	location "/*" {
		directory auto index
	}
}
Pretty simple. We are just going to serve the html file below.


This html blurb simply creates a websocket and pumps the data it receives into a div that we can see.

<!doctype html>
<meta charset="utf-8">
<title>ws test</title>
<div id="output"></div>
<script>
let ws = new WebSocket("ws://");
let d = document.getElementById('output');
ws.onopen = function() {
};
ws.onmessage = function (e) {
  d.innerText = d.innerText + " " +;
};
ws.onclose = function() {
  d.innerText += (' done.');
};
</script>


Now we use websocketd to serve up some sweet sweet websocket action!


#!/bin/sh
echo 'hi'

for i in $(jot 5); do
	echo $i;
	sleep 1;
done
Use websocketd to run the above script:

websocketd --port 9999 --address ./

Now point your browser at! You will see “hi” and every second for five seconds you will see a count appended to <div id="output"></div>!

The issues

The error I saw on Safari on iOS and macOS is:

'Connection' header value is not 'Upgrade'

Which is strange, because I can see that it is in fact set to ‘Upgrade’ in a tcpdump.

October 22, 2019

Simon Zelazny (pzel)

I'm co-authoring a new blog October 22, 2019 10:00 PM

An announcement

I've started a new blog with my friend Rafal Studnicki. We'll be covering a wide variety of software engineering topics, focusing chiefly on Elixir & Erlang, functional programming, testing, and software correctness.

The idea is to try to put in writing our shared and individual experience from the world of software development. This blog will continue to serve as my personal scratchpad.

October 21, 2019

Unrelenting Technology (myfreeweb)

It’s nice that Microsoft is pushing for all pen tablet (stylus) support in laptops... October 21, 2019 08:12 PM

It’s nice that Microsoft is pushing for all pen tablet (stylus) support in laptops to use the obvious generic set of HID reports. Quite probably, Microsoft is to thank for the Wacom touchscreen in my Pixelbook implementing that. I’ve seen the heaps of code in the Linux kernel to support Wacom’s custom protocols, that would’ve been very NOT fun to implement :)

Took like an hour max to get to working reports in console (dmesg), all that’s left is to evdev-ify it. Coming to iichid pull requests soon (but for now there’s no multiple device support in hidbus, so won’t be mergeable yet).

asrpo (asrp)

Roll your own GUI automation library October 21, 2019 10:37 AM

Sikuli is a tool for automating repetitive tasks.

Screenshot of Sikuli

To automate a program, it makes use of screenshots and image recognition to decide where to click and type [automation_methods]. As you can see above, Sikuli uses Python as its scripting language (plus rendered images). But under the hood, it's implemented in Java and runs Jython. Since nothing Sikuli uses really needs Java, we'll try to implement some GUI automation in pure Python.
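As a sketch of the image-recognition core (this is an illustration, not Sikuli's API: screen capture and real image formats are omitted, and pixel grids are plain nested lists), exact template matching boils down to a sliding-window search:

```python
def locate(haystack, needle):
    """Return (row, col) of the first exact match of `needle` (a 2-D list of
    pixel values) inside `haystack`, or None if it does not appear.
    This is the core of screenshot-based automation, minus capture and
    tolerance for slight rendering differences."""
    H, W = len(haystack), len(haystack[0])
    h, w = len(needle), len(needle[0])
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            if all(haystack[r + i][c + j] == needle[i][j]
                   for i in range(h) for j in range(w)):
                return (r, c)
    return None
```

A real implementation would grab the screen (e.g. with PIL's ImageGrab), allow small per-pixel differences instead of exact equality, and then move the mouse to the matched coordinates.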

October 20, 2019

Derek Jones (derek-jones)

Three books discuss three small data sets October 20, 2019 07:47 PM

During the early years of a new field, experimental data relating to important topics can be very thin on the ground. Ever since the first computer was built, there has been a lot of data on the characteristics of the hardware. Data on the characteristics of software, and the people who write it has been (and often continues to be) very thin on the ground.

Books are sometimes written by the researchers who produce the first data associated with an important topic, even if the data set is tiny; being first often generates enough interest for a book length treatment to be considered worthwhile.

As a field progresses lots more data becomes available, and the discussion in subsequent books can be based on findings from more experiments and lots more data.

Software engineering is a field where a few ‘first’ data books have been published, followed by silence, or rather lots of arm waving and little new data. The fall of Rome has been followed by a 40-year dark age, from which we are slowly emerging.

Three of these ‘first’ data books are:

  • “Man-Computer Problem Solving” by Harold Sackman, published in 1970, relating to experimental data from 1966. The experiments investigated the impact of two different approaches to developing software, on programmer performance (i.e., batch processing vs. on-line development; code+data). The first paper on this work appeared in an obscure journal in 1967, and was followed in the same issue by a critique pointing out the wide margin of uncertainty in the measurements (the critique agreed that running such experiments was a laudable goal).

    Failing to deal with experimental uncertainty is nothing compared to what happened next. A 1968 paper in a widely read journal, the Communications of the ACM, contained the following table (extracted from a higher quality scan of a 1966 report by the same authors, and available online).

    Developer performance ratios.

    The tale of a 1:28 ratio of programmer performance, found in an experiment by Grant/Sackman, took off (the technical detail that a lot of the difference was down to the techniques the subjects used, and not the people themselves, got lost). The Grant/Sackman ‘finding’ used to be frequently quoted in some circles (or at least it was when I moved in them; I don’t know how often it is cited today). In 1999, Lutz Prechelt wrote an exposé on the sorry tale.

    Sackman’s book is very readable, and contains lots of details and data not present in the papers, including survey data and a discussion of the intrinsic uncertainties associated with the experiment; it also contains the table above.

  • “Software Engineering Economics” by Barry W. Boehm, published in 1981. I wrote about the poor analysis of the data contained in this book a few years ago.

    The rest of this book contains plenty of interesting material, and even sounds modern (because books moving the topic forward have not been written).

  • “Program Evolution: Process of Software Change” edited by M. M. Lehman and L. A. Belady, published in 1985, relating to experimental data from 1977 and before. Lehman and Belady managed to obtain data relating to 19 releases of an IBM software product (yes, 19, not nineteen-thousand); the data was primarily the date and number of modules contained in each release, plus less specific information about number of statements. This data was sliced and diced every which way, and the book contains many papers with the same data appearing in the same plot with different captions (had the book not been a collection of papers it would have been considerably shorter).

    With a lot less data than Isaac Newton had available to formulate his three laws, Lehman and Belady came up with five, six, seven… “laws of software evolution” (which themselves evolved with the publication of successive papers).

    The availability of Open source repositories means there is now a lot more software system evolution data available. Lehman’s laws have not stood the test of more data, although people still cite them every now and again.

Ponylang (SeanTAllen)

Last Week in Pony - October 20, 2019 October 20, 2019 03:04 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

October 19, 2019

Pierre Chapuis (catwell)

Opinions October 19, 2019 03:30 PM

You know how some discussions can make you pause and introspect for a while after they happen? Well, I have had such a discussion recently about how strong my opinions are nowadays, and I decided to write something about it.

Forming opinions

People who have spent significant time around me know all too well how I behave when I get interested in a topic. I can spend weeks (and sometimes even much longer) reading all I can find about it until I know its ins and outs. This results in me being able to discuss and have opinions about things which are apparently completely random, but which I ended up picking up along the way for some reason.

For the longest time, I have believed in strong opinions (somewhat) weakly held, and for the most part I still do. In particular, I agree with the "somewhat" part, which says that the strength with which you should hold an opinion depends on how you formed it.

What has changed in the last 5-7 years is my view on how explicit those opinions should be, and how much I should fight for them.

Expressing opinions

I was strongly influenced by the Open Source community of the early 2000s where, if you wanted something to happen, you had to state your opinion and be ready to defend it - sometimes ferociously - with arguments. If you didn't, you would be ignored.

What outsiders often overlook about this method is that it worked, and it was pretty efficient. Most people got used to it, and in general it did not turn into endless debates, because in the end we had benevolent dictators to settle the matter.

But a problem I could not see for a long time is that this excludes a lot of people from the discussion. People who, because of their education or the position society put them in, won't express their own opinion if it contradicts the current consensus or that of someone they either respect or fear; or people who won't engage in anything remotely resembling conflict because that's not in their nature.

It is especially easy to ignore the existence of these people when the only way you are communicating is through asynchronous text over the Internet. However they often have very interesting things to say, with different points of view that can make headway or prevent big mistakes.

There are lots of ways to work around that issue, including timeboxing contributions on a topic while keeping them secret, or simply having people in positions of power and people more comfortable with their opinion express it last.

Defending opinions

Another matter is how much you should fight for your opinion. I am not talking about the famous XKCD comic, I know I can be this guy sometimes, but that is something else.

Some time ago I was struggling with how to handle disagreement at work, and I became an adept of the way Amazon does things, its Leadership Principles, and in particular Disagree and Commit:

Leaders are obligated to respectfully challenge decisions when they disagree, even when doing so is uncomfortable or exhausting. Leaders have conviction and are tenacious. They do not compromise for the sake of social cohesion. Once a decision is determined, they commit wholly.

When organizations follow this principle, it helps a lot with the issue, because it makes it clear that you can - and should - express dissenting opinion, and how you can align on things you do not agree with without making it look like you changed your mind when you did not.

(On a side note, I even think in some cases making dissent mandatory by instituting a Tenth Man / Devil's Advocate rule can be beneficial.)

However, it turns out this principle is more complicated than it sounds, and there are two important points I did not immediately understand, which are much more explicit in Bezos' 2016 letter to shareholders.

The first is that the person who "disagrees but commits" is not necessarily the subordinate in a power relationship. Bezos says:

This isn’t one way. If you’re the boss, you should do this too. I disagree and commit all the time. We recently greenlit a particular Amazon Studios original. I told the team my view: debatable whether it would be interesting enough, complicated to produce, the business terms aren’t that good, and we have lots of other opportunities. They had a completely different opinion and wanted to go ahead. I wrote back right away with “I disagree and commit and hope it becomes the most watched thing we’ve ever made.” Consider how much slower this decision cycle would have been if the team had actually had to convince me rather than simply get my commitment.

The second point I missed is the meaning of the sentence "they do not compromise for the sake of social cohesion." The "compromise" part is clear enough; here compromising would have meant giving the green light but with, say, fewer resources. From experience compromises like this often end up poorly. But the hard part is "for the sake of social cohesion". Here is what Bezos has to say:

Note what this example is not: it’s not me thinking to myself "well, these guys are wrong and missing the point, but this isn’t worth me chasing." It’s a genuine disagreement of opinion, a candid expression of my view, a chance for the team to weigh my view, and a quick, sincere commitment to go their way.

Unlike Bezos my current (weakly held) opinion is that sometimes keeping social peace is worth not engaging in some minor issues. Some people will always appreciate being told when you think they're mistaken, but others won't even if they end up admitting they were wrong in the end, so it only makes sense to contradict them if it is worth risking being resented for it.

Tolerating opinions

The first step for all this to work is probably to admit that opinions can exist without being "right" or "wrong".

That may be obvious to you but for a long time it was not for me. When I was a kid I looked at things in binary: true or false, right or wrong, better or worse. This made me enjoy CS and math, but paradoxically learning more advanced math (partial orders, Simpson's paradox, Gödel's incompleteness theorems...) showed me that even in the hardest of sciences things were more nuanced.

Anyway, there are reasons why people come to hold a set of beliefs which makes sense at least locally. Learning about this background is important, whether it helps you convince them or changes your own mind.

Don't worry though, I still hold a few strong opinions, both in my field (some quite strongly - you won't easily make me change those for instance) and outside of it!

October 18, 2019

Carlos Fenollosa (carlesfe)

October 17, 2019

Andreas Zwinkau (qznc)

TLA+ is easier than I thought October 17, 2019 12:00 AM

Did a small exercise with TLA+, an easy model checker.

Read full article!

October 15, 2019

Gustaf Erikson (gerikson)

The Big Short: Inside the Doomsday Machine by Michael Lewis October 15, 2019 12:13 PM

I caught the movie adaptation of this book on a flight, and wanted to get some more background. Honestly, the movie does a good job of summarizing the contents, but also adds some humanizing touches such as the visit to the ground zero of the housing market: Florida. Both are recommended.

October 14, 2019

Pete Corey (petecorey)

Rendering a React Application Across Multiple Containers October 14, 2019 12:00 AM

A few of my recent articles have been embedding limited builds of Glorious Voice Leader directly into the page. At first, this presented an interesting challenge. How could I render a single React application across multiple container nodes, while maintaining shared state between all of them?

While the solution I came up with probably isn’t best practice, it works!

As a quick example, imagine you have a simple React component that manages a single piece of state. The user can change that state by pressing one of two buttons:

const App = () => {
  let [value, setValue] = useState("foo");
  return (
    <>
      <button onClick={() => setValue("foo")}>
        Value is "{value}". Click to change to "foo"!
      </button>
      <button onClick={() => setValue("bar")}>
        Value is "{value}". Click to change to "bar"!
      </button>
    </>
  );
};
Normally, we’d render our App component into a container in the DOM using ReactDOM.render:

ReactDOM.render(<App />, document.getElementById('root'));

But what if we want to render our buttons in two different div elements, spread across the page? Obviously, we could build out two different components, one for each button, and render these components in two different DOM containers:

const Foo = () => {
  let [value, setValue] = useState("foo");
  return (
    <button onClick={() => setValue("foo")}>
      Value is "{value}". Click to change to "foo"!
    </button>
  );
};

const Bar = () => {
  let [value, setValue] = useState("foo");
  return (
    <button onClick={() => setValue("bar")}>
      Value is "{value}". Click to change to "bar"!
    </button>
  );
};

ReactDOM.render(<Foo />, document.getElementById('foo'));
ReactDOM.render(<Bar />, document.getElementById('bar'));

But this solution has a problem. Our Foo and Bar components maintain their own versions of value, so a change in one component won’t affect the other.

Amazingly, it turns out that we can create an App component which maintains our shared state, render that component into our #root container, and within App we can make additional calls to ReactDOM.render to render our Foo and Bar components. When we call ReactDOM.render we can pass down our state value and setters for later use in Foo and Bar:

const App = () => {
  let [value, setValue] = useState("foo");
  return (
    <>
      {ReactDOM.render(<Foo value={value} setValue={setValue} />, document.getElementById('foo'))}
      {ReactDOM.render(<Bar value={value} setValue={setValue} />, document.getElementById('bar'))}
    </>
  );
};

Our Foo and Bar components can now use the value and setValue props provided to them instead of maintaining their own isolated state:

const Foo = ({ value, setValue }) => {
  return (
    <button onClick={() => setValue("foo")}>
      Value is "{value}". Click to change to "foo"!
    </button>
  );
};

const Bar = ({ value, setValue }) => {
  return (
    <button onClick={() => setValue("bar")}>
      Value is "{value}". Click to change to "bar"!
    </button>
  );
};

And everything works! Our App is “rendered” to our #root DOM element, though nothing actually appears there, and our Foo and Bar components are rendered into #foo and #bar respectively.

Honestly, I’m amazed this works at all. I can’t imagine this is an intended use case of React, but the fact that it’s still a possibility made my life much easier.

Happy hacking.

October 13, 2019

Derek Jones (derek-jones)

Comparing expression usage in mathematics and C source October 13, 2019 10:11 PM

Why does a particular expression appear in source code?

One reason is that the expression is the coded form of a formula from the application domain, e.g., E=mc^2.

Another reason is that the expression calculates an algorithm/housekeeping related address, or offset, to where a value of interest is held.

Most people (including me, many years ago) think that the majority of source code expressions relate to the application domain, in one way or another.

Work on a compiler related optimizer, and you will soon learn the truth: most expressions are simple and calculate addresses/offsets. Optimizing compilers would not have much to do if they only relied on expressions from the application domain (my numbers tool throws something up every now and again).

What are the characteristics of application domain expressions?

I like to think of them as being complicated, but that’s because it used to be in my interest for them to be complicated (I used to work on optimizers, which have the potential to make big savings if things are complicated).

Measurements of expressions in scientific papers are needed, but who is going to be interested in measuring the characteristics of mathematical expressions appearing in papers? I’m interested, but not enough to do the work. Then, a few weeks ago I discovered: An Analysis of Mathematical Expressions Used in Practice, by Clare So; an analysis of 20,000 mathematical papers submitted to arXiv between 2000 and 2004.

The following discussion uses the measurements made for my C book as the representative source code (I keep suggesting that detailed measurements of other languages are needed, but nobody has jumped in and made them, yet).

The table below shows percentage occurrence of operators in expressions. Minus is much more common than plus in mathematical expressions, the opposite of C source; the ‘popularity’ of the relational operators is also reversed.

Operator  Mathematics   C source
=         0.39          3.08
-         0.35          0.19
+         0.24          0.38
<=        0.06          0.04
>         0.041         0.11
<         0.037         0.22

The most common single binary operator expression in mathematics is n-1 (the data counts expressions using different variable names as different expressions; yes, n is the most popular variable name, and adding up other uses does not change relative frequency by much). In C source, var+int_constant is around twice as common as var-int_constant.
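As an illustration of how such operator tallies can be gathered (a toy tokenizer, not the actual measurement tooling behind the C book or the arXiv analysis), counting operator occurrences over a corpus of expression strings might look like:

```python
import re
from collections import Counter

# Multi-character operators come first so "<=" is not counted as "<" then "=".
OPERATORS = ["<=", ">=", "==", "!=", "+", "-", "*", "/", "<", ">", "="]
OP_RE = re.compile("|".join(re.escape(op) for op in OPERATORS))

def operator_counts(expressions):
    """Tally operator occurrences across a list of expression strings."""
    counts = Counter()
    for expr in expressions:
        counts.update(OP_RE.findall(expr))
    return counts
```

Dividing each count by the total would give relative frequencies comparable to the table above; a serious version would use a real parser so that, e.g., unary minus and subtraction can be distinguished.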

The plot below shows the percentage of expressions containing a given number of operators (I've made a big assumption about exactly what Clare So is counting; code+data). The operator count starts at two because that is where the count starts for the mathematics data. In C source, around 99% of expressions have less than two operators, so the simple case completely dominates.

Percentage of expressions containing a given number of operators.

For expressions containing between two and five operators, frequency of occurrence is sort of about the same in mathematics and C, with C frequency decreasing more rapidly. The data disagrees with me again...

Ponylang (SeanTAllen)

Last Week in Pony - October 13, 2019 October 13, 2019 03:19 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

Carlos Fenollosa (carlesfe)

October 12, 2019

Carlos Fenollosa (carlesfe)

US Software companies comply with international law, to their great regret October 12, 2019 04:58 PM

This week has been very heavy on China-related software scandals:

On Apple's side, as usual, there has been more media coverage:

US companies and entities are forced to apply international law, sometimes breaking universal human rights.

This is a difficult topic. On one hand, States are sovereign. On the other, we should push for a better world. However, to what degree does a private company have the right to ignore state rulings? They can ignore them and suffer the consequences; that would be consistent. But are they ready to boycott a whole country, or risk banishment from that country?

As an individual, the take-home message is that if you delegate some of your tasks to a private company, or rely on a private company to some degree, you risk being unable to access your data or virtual possessions at any time. Be it due to international law, or to some stupid enforcement or terms-of-service bullshit.

Please follow the HN discussions on the "via" links above, they are very informative.

Tags: internet, law


October 10, 2019

Chris Allen (bitemyapp)

Why I use and prefer GitLab CI October 10, 2019 12:00 AM

In the past, I talked about how to make your CI builds faster using Drone CI. I don't use Drone CI any more and haven't for a couple of years now, so I wanted to talk about what I use instead.

October 09, 2019

eta (eta)

Distributed state and network topologies in chat systems October 09, 2019 11:00 PM

[This is post 3 about designing a new chat system. Have a look at the first post in the series for more context!]

The funny thing about trying to design communications protocols seems to be how much of the protocol’s code ends up just dealing with networking – or rather, that certainly seems to be one of the main things a design is focused around. We’ve talked previously about federation, and the benefits and downsides associated with having a completely open approach, as well as the issues of distributed state and spam that inevitably come up. In this blog post, I want to propose and explore the networking and federation parts of an idealistic new chat protocol, and attempt to come up with my own solution1! (If you’re just interested in a summary, skip past the following ~2,000 words to the “Conclusions” header.)

The basic model: IRC server linking

I’m going to assume, for the purposes of this blog post, that “full mesh” networking is a bad design choice (as discussed in the previous post). This is partially because there are already protocols like ActivityPub, XMPP, and Matrix that use full mesh networking, so it’s worth exploring something different for a change; I also believe that full mesh is a rather inefficient way to run a network, but you’re free to disagree with me on this2.

So, let’s start with the very simplistic network model used by IRC: a simple spanning-tree arrangement, where servers are linked to one another with neither loops nor redundancy. This looks a bit like this:

Picture of multiple IRC servers linked together in a spanning tree arrangement

This is stupidly simple for server implementors to implement. Here’s how it works: if you’re connected to servers A, B, and C, and you receive a message from server A, you just process the message, and send it on to servers B and C (i.e. the rest of the network). You don’t even need to worry about deduplication, because there’s no way you could get sent the message again (if all the servers behave, that is3). As Rachel Kroll notes in “IRC netsplits, spanning trees, and distributed state” (a post from 2013 – go and read it, it’s pretty interesting!):

…the entire network is a series of links which themselves constitute a single point of failure for people on either side of it. When something goes wrong between two servers, this manifests as a “netsplit”.

Whole swaths of users (appear to) sign off en masse when this happens, and then will similarly seem to rejoin after the net relinks, whether via the same two servers or through some other path.

This was the situation back in the ’90s, and it’s still happening today.

She then goes on to talk about a number of ways in which this situation could be rather trivially avoided – for example, using the Spanning Tree Protocol, which uses an algorithm to detect and avoid possible loops between interconnected network switches, such that you can have a bunch of redundant links that only need to be used when one of the links fails, or by simply using some smarts to tag messages with an ID and the list of servers they’ve already been through to avoid duplicates.
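The ID-tagging idea can be sketched in a few lines of Go. This is a toy model for illustration only; Message, Server, and the field names are all invented here, not taken from any real IRC daemon.

```go
package main

import "fmt"

// Message carries a unique ID plus the set of servers it has already
// passed through, so loops and redundant links can be broken.
type Message struct {
	ID      string
	Body    string
	Visited map[string]bool
}

// Server relays messages to its peers, skipping any message it has
// already seen and any peer the message has already visited.
type Server struct {
	Name  string
	Peers []*Server
	Seen  map[string]bool
	Log   []string
}

func (s *Server) Receive(m Message) {
	if s.Seen[m.ID] {
		return // duplicate: a redundant link delivered it twice
	}
	s.Seen[m.ID] = true
	s.Log = append(s.Log, m.Body)
	m.Visited[s.Name] = true
	for _, p := range s.Peers {
		if !m.Visited[p.Name] {
			p.Receive(m)
		}
	}
}

func main() {
	a := &Server{Name: "A", Seen: map[string]bool{}}
	b := &Server{Name: "B", Seen: map[string]bool{}}
	c := &Server{Name: "C", Seen: map[string]bool{}}
	// A redundant loop: A-B, B-C, C-A. Plain IRC spanning trees forbid
	// this; the ID/visited tagging makes it safe.
	a.Peers = []*Server{b, c}
	b.Peers = []*Server{a, c}
	c.Peers = []*Server{a, b}

	a.Receive(Message{ID: "m1", Body: "hello", Visited: map[string]bool{}})
	// Every server logs the message exactly once despite the loop.
	fmt.Println(len(a.Log), len(b.Log), len(c.Log)) // 1 1 1
}
```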

This all makes things less than stupidly simple, but there’s a difference between necessary and unnecessary complexity. As Rachel notes:

There are solutions to problems which don’t exist, and then there are assortments of technologies which might be useful for problems which technically exist, but for which nobody may care about fixing. I suspect the whole IRC netsplit thing is a case of the latter.

1-to-1 vs group chats

So, delivering messages from one place to another is all fine and good, but the real fun starts when you try and set up a group chat, which involves multiple people, and – quite crucially – some idea of distributed state, which isn’t the easiest thing to achieve. This state is required for all sorts of common features to work – from things as simple as setting the name and topic of the chatroom to being able to give people administrator powers, kick and ban users, etc. Getting this wrong has real consequences; in the olden days, netsplits could lead to IRC takeovers, where users would gain administrator access on one side of the netsplit (perhaps due to the channel having no users left on that side, which leaves the first user to rejoin with admin powers) and then retain it after the network reformed, with disastrous consequences.

The Matrix protocol, mentioned before, is essentially a massive effort to solve this problem through modeling state changes that occur in a chatroom as a directed acyclic graph, and running the state resolution algorithm to resolve multiple conflicting pieces of state after network partitions (or at any time, really). This algorithm is pretty complex stuff (and it seems like they came up with it mostly on their own!); I’ve stated in the first blog post of this series that doing things this way is ‘questionable’ – but it was later pointed out that my view was somewhat out of date4, so I’m not really an authority on this.

The Matrix approach relies on the room state being independently calculated by each participating server, using the algorithm to determine which events to accept and reject – which makes servers not really depend on one another that much, at the cost of some additional complexity. It does, however, mean that rooms continue to operate – and can accept state changes – even when network partitions occur; the state resolution algorithm will eventually restore consistency. In other words, this satisfies the AP parts of the CAP theorem, with eventual consistency (things will eventually be consistent after servers reconnect, but are not, of course, consistent all the time).

XEP-0045 (XMPP’s group chat extension), on the other hand, is at the complete other end of the spectrum: group chats seem to be exclusively owned by one server, and are unusable if said server goes down. This obviously makes things a lot simpler, but means that a group chat – even if it were to have hundreds or thousands of participants – is entirely reliant on that one server to relay messages and handle state updates, which is a massive single point of failure. Other servers couldn’t possibly step in under this model, because otherwise the group chat would be open to attack; any random server could jump in and claim ownership of the room, resulting in chaos similar to that of the IRC takeovers mentioned earlier.

What are we defending against?

If we lived in a world where everyone was honest and well-meaning, the lives of chat protocol designers worldwide would be made much easier. In this world, we could just blindly accept whatever people said had happened to the room, from any server that sent us a message about it – if you were to list the things that could happen, it might look like:

  1. A malicious actor could introduce their own server to a group chat, and manipulate its state in unwanted ways – like making themselves administrator without the consent of the prior administrators, or ‘resetting’ it to an earlier time when they weren’t banned, or something like that.
  2. A malicious actor could introduce their own server to a group chat, and start sending large volumes of spam into said group chat.
  3. A malicious server owner could take control of certain important user accounts on their server, like accounts that have administrator powers in big popular rooms, and use those to take control of the rooms by sending legitimate messages coming from that user.

These are all things that systems like Matrix, XMPP, IRC and others try to prevent. IRC doesn’t let people introduce their own untrusted servers into an IRC network, which makes things a lot easier; the others do allow for this possibility, meaning they now have to decide what information is trustworthy and what is not, using a varied set of methods to determine this.

A ‘middle way’

Through our earlier comparison, it becomes apparent that there’s a sort of spectrum of how much trust the protocol requires you to have in other people and their servers. XMPP is at one end, with “trust one server absolutely, and no others”; Matrix is at the other, with “individually verify the room state on each participating server, with a fancy algorithm that helps keep things in check”. With things laid out like this, is it not worth asking the question of whether our new chat standard can try an approach somewhere in the middle of those two?

How about trusting some servers – more than one, to provide a degree of redundancy, but not going as far as allowing any server to change the room’s state? I posit that, in practice, users are having to do this anyway in protocols like Matrix, simply because pretty much any protocol that exists today is vulnerable to problem (3) from the earlier list5. So, for each group chat, we could define a set of “sponsoring servers” that are responsible for upholding law and order in the chatroom.

The sponsoring server model

Rather like in XMPP, all traffic for the group chat has to go through one of the sponsoring servers. If you want to send a message, ban someone, change the topic, or do anything, you have to send your message through a sponsoring server first. These servers rebroadcast the message you’ve sent to other connected servers, after checking that you’re allowed to make that change given their current view of the world.

Other servers can now authenticate messages very easily: if the message was sent by a sponsoring server, it’s valid, because we trust the sponsoring servers to validate everything for us. If it wasn’t, it isn’t. Spam, therefore, can be curtailed; since servers won’t accept messages that don’t come from sponsoring servers, sponsoring servers can do whatever they want to make sure that messages aren’t spam before passing them on, like asking new users to fill out a CAPTCHA or whatever.
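As a toy sketch of that acceptance rule (all names here are invented for illustration): a leaf server's check collapses to a set lookup, while a sponsoring server additionally checks the sender's standing before rebroadcasting.

```go
package main

import "fmt"

// Room holds the trust information a server needs for one group chat.
type Room struct {
	Sponsors map[string]bool // servers trusted to validate traffic
	Banned   map[string]bool // users the sponsors refuse to relay for
}

// AcceptFromNetwork is what a leaf server runs: trust the message if
// and only if it arrived via a sponsoring server.
func (r *Room) AcceptFromNetwork(viaServer string) bool {
	return r.Sponsors[viaServer]
}

// Rebroadcast is what a sponsoring server runs before relaying: check
// that the sender is actually allowed to speak in this room.
func (r *Room) Rebroadcast(fromUser string) bool {
	return !r.Banned[fromUser]
}

func main() {
	room := &Room{
		Sponsors: map[string]bool{"alpha.example": true},
		Banned:   map[string]bool{"spammer": true},
	}
	fmt.Println(room.AcceptFromNetwork("alpha.example")) // true: a sponsor
	fmt.Println(room.AcceptFromNetwork("evil.example"))  // false: untrusted
	fmt.Println(room.Rebroadcast("spammer"))             // false: banned user
}
```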

What about distributed state?

The Paxos consensus protocol – actually something mentioned by Rachel Kroll in the cited blog post from earlier6 – is a tried and tested way to generate consensus among a network of servers, some of which may disconnect occasionally, go offline, or otherwise fail. It’s not guaranteed to always make progress, especially not when some of the servers are being evil (although a variant, called Byzantine Paxos, is supposed to fix this), but it’s reliable enough to be used in a number of prominent places, such as in various Google, IBM and Microsoft production services. Paxos is also not the only consensus protocol out there – the Raft consensus protocol is another, apparently known for being easier to implement7, which has similar properties. Here’s a cool Raft visualization!

As you can probably see, we can use Paxos or Raft to resolve state among our sponsoring servers, in a way that’s tolerant of these servers occasionally going down; as long as a majority remains (or whatever criteria the chosen consensus protocol specifies), the group chat will continue to operate in a safe manner. We trust all of our sponsoring servers, so we don’t have to worry about them trying to break the consensus protocol (in fact, we can even elect to use something like Byzantine Paxos which provides additional protections if we’re really paranoid).
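The majority criterion itself is simple to state in code. A trivial sketch, not tied to any particular Paxos or Raft implementation:

```go
package main

import "fmt"

// hasQuorum reports whether enough sponsoring servers are reachable
// for the consensus protocol to keep making progress: a strict
// majority of the full membership.
func hasQuorum(alive, total int) bool {
	return alive > total/2
}

func main() {
	fmt.Println(hasQuorum(3, 5)) // true: 3 of 5 is a majority
	fmt.Println(hasQuorum(2, 5)) // false: the partitioned minority stalls
	fmt.Println(hasQuorum(2, 4)) // false: an even split cannot decide
}
```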

What about network partitions?

Both Paxos and Raft are equipped to deal with the problem of network partitions; depending on the exact nature of the problem, some sponsoring servers may be unable to update the state of the group chat while the partition is in effect, but all servers should reach an eventually consistent view of the state. Messages sent while partitioned can simply be queued, and resent once the network link reestablishes itself.
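The queue-and-resend behaviour for a partitioned link can be sketched like this (Outbox and its methods are invented here for illustration, not part of any named protocol):

```go
package main

import "fmt"

// Outbox buffers messages while a server-to-server link is down and
// flushes them, in order, once the link comes back up.
type Outbox struct {
	up    bool
	queue []string
}

func (o *Outbox) Send(msg string, deliver func(string)) {
	if !o.up {
		o.queue = append(o.queue, msg) // partitioned: hold the message
		return
	}
	deliver(msg)
}

func (o *Outbox) Reconnect(deliver func(string)) {
	o.up = true
	for _, m := range o.queue { // link restored: drain the backlog
		deliver(m)
	}
	o.queue = nil
}

func main() {
	var delivered []string
	record := func(m string) { delivered = append(delivered, m) }

	o := &Outbox{}
	o.Send("hello", record) // queued: link is down
	o.Send("world", record) // queued
	o.Reconnect(record)     // both messages flushed in order
	fmt.Println(delivered)  // [hello world]
}
```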

How do we figure out which servers should be sponsoring servers?

The rules will probably vary depending on the circumstances – for example, a private group chat where all members trust one another can have all participating servers be sponsoring servers. In contrast, large, public group chats, with potentially lots of untrusted servers taking part, can choose a small number of servers which they trust – servers which contain chat administrators would be a good target, seeing as a great deal of trust is placed in those servers anyway.

Here’s an example policy: if users are invited by a pre-existing administrator of the room, their server becomes a sponsoring server. If users ask to join a room (by talking to one of the existing sponsoring servers), their server does not become a sponsoring server. That way, new sponsoring servers are added organically; if you’re inviting someone, you presumably trust them and their server anyway. (However, as I said, this may not be the only way of doing things.)
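That example policy fits in a few lines. Again a sketch, with Room reduced to just its set of sponsoring servers:

```go
package main

import "fmt"

// Room tracks which servers sponsor a group chat.
type Room struct {
	Sponsors map[string]bool
}

// Invite admits a user at an administrator's request; per the example
// policy, their server is promoted to a sponsoring server.
func (r *Room) Invite(userServer string) {
	r.Sponsors[userServer] = true
}

// Join admits a user who asked to join; their server stays a leaf.
func (r *Room) Join(userServer string) {
	// No change to r.Sponsors: join requests do not confer trust.
}

func main() {
	room := &Room{Sponsors: map[string]bool{"founder.example": true}}
	room.Invite("friend.example") // invited: their server becomes a sponsor
	room.Join("stranger.example") // asked to join: their server stays a leaf
	fmt.Println(room.Sponsors["friend.example"], room.Sponsors["stranger.example"]) // true false
}
```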

Networking considerations

We can also go and look back at the networking improvements mentioned earlier, and fit these in to our sponsoring-server model. For one-to-one private conversations, it makes a certain degree of sense to just connect the two communicating servers directly – while this may be somewhat more resource-intensive than having some linked network which the messages could travel through, doing it any other way would raise privacy concerns, as well as issues of trust (how can you be sure that the 3rd-party server you’re relaying messages through is actually delivering them and sending you the responses, for example?).

However, group chats are now free to use one of the protocols mentioned earlier (e.g. Ethernet Spanning Tree Protocol) for routing messages, but only amongst the sponsoring servers. So-called ‘leaf’ servers (ones that aren’t sponsoring) must connect to a sponsoring server and communicate via it. This model makes the network slightly less painful than full-mesh: leaf servers only ever need to talk to sponsoring servers (making group chat involvement less of a pain for them), while sponsoring servers deal with their own batch of leaf servers, and route messages intelligently to other sponsoring servers.


Conclusions

This blog post proposes a new model for handling distributed state in chat protocols, which involves choosing a set of trusted “sponsoring servers” for each group chat. These servers are responsible for maintaining consensus about the state of the chat (through consensus algorithms like Paxos or Raft), and act as the ‘gatekeepers’ for new messages and state changes, providing a bunch of helpful functionality against spam and abuse.

I believe this new model is a sort of compromise solution that makes the protocol less complex, but still distributed and somewhat fault-tolerant, at the expense of requiring users to trust some servers/people not to be malicious. Of course, at this point, this is mostly speculation8; I’ll have to write some code and see how it actually works in the wild, though!

  1. Here’s where things get interesting; instead of blithely criticizing other people’s hard work, I’m actually having to do some myself… 

  2. A counterargument might be, as before, “we already have the Internet Protocol to do networking things, so why should we bother implementing more stuff on top of that?” 

  3. An open federation obviously needs to account for the fact that all servers might not behave. 

  4. The protocol linked in this paragraph is their new state resolution algorithm, which prevents all of the security issues that plagued the first iteration and that I ranted about originally. 

  5. Maybe something like Keybase Chat isn’t, but I’m pretty sure all the mainstream ones don’t have this property. And for good reason; doing your own public/private key management as a user is painful, or rather something that most users don’t seem to bother with at present. 

  6. This is how I first heard about it! 

  7. Citation: some random person on HN 

  8. If anyone who actually reads this blog and reckons they know stuff feels like chiming in in the comments, that would be much appreciated! 

October 07, 2019

Benaiah Mischenko (benaiah)

Configuring Go Apps with TOML October 07, 2019 08:10 PM

So you’ve been writing an application in Go, and you’re getting to the point where you have a lot of different options in your program. You’ll likely want a configuration file, as specifying every option on the command-line can get difficult and clunky, and launching applications from a desktop environment makes specifying options at launch even more difficult.

This post will cover configuring Go apps using a simple, INI-like configuration language called TOML, as well as some related difficulties and pitfalls.

TOML has quite a few implementations, including several libraries for Go. I particularly like BurntSushi’s TOML parser and decoder, as it lets you marshal a TOML file directly into a struct. This means your configuration can be fully typed and you can easily do custom conversions (such as parsing a time.Duration) as you read the config, so you don’t have to do them in the rest of your application.

Configuration location

The first question you should ask when adding config files to any app is "where should they go?". For tools that aren’t designed to be run as a service, as root, or under a custom user (in other words, most of them), you should be putting them in the user’s home directory, so they’re easily changed. A few notes:

  • Even if you currently have only one file, you should use a folder and put the config file within it. That way, if and when you do need other files there, you won’t have to clutter the user’s home directory or deal with loading config files that could be in two different locations (Emacs, for instance, supports both ~/.emacs.d/init.el and ~/.emacs for historical reasons, which ends up causing confusing problems when both exist).

  • You should name your configuration directory after your program.

  • You should typically prefix your config directory with a . (but see the final note for Linux, as configuration directories within XDG_CONFIG_HOME should not be so prefixed).

  • On most OSs, putting your configuration files in the user’s “home” directory is typical. I recommend the library go-homedir, rather than the User.HomeDir available in the stdlib from os/user. This is because os/user uses cgo, which, while useful in many situations, also causes a number of difficulties that can otherwise be avoided - most notably, cross-compilation is no longer simple, and the ease of deploying a static Go binary gets a number of caveats.

  • On Linux specifically, I strongly encourage that you do not put your configuration directory directly in the user’s home directory. Most commonly-used modern Linux distributions use the XDG Base Directory Specification, which specifies standard locations for various directories on an end-user Linux system. (Despite this, many applications don’t respect the standard and put their configurations directly in ~ anyway). By default, the config location is ~/.config/, but it can also be set with the XDG_CONFIG_HOME environment variable. Directories within this should not use a leading ., as the directory is already hidden by default.

The following function should get you the correct location for your config directory on all platforms (if there’s a platform with a specific convention for config locations which I’ve missed, I’d appreciate you letting me know so I can update the post - my email is at the bottom of the page).

import (
    "os"
    "path/filepath"
    "runtime"

    "github.com/mitchellh/go-homedir"
)

var configDirName = "example"

func GetDefaultConfigDir() (string, error) {
    var configDirLocation string

    homeDir, err := homedir.Dir()
    if err != nil {
        return "", err
    }

    switch runtime.GOOS {
    case "linux":
        // Use $XDG_CONFIG_HOME/example if XDG_CONFIG_HOME is set,
        // otherwise $HOME/.config/example
        xdgConfigHome := os.Getenv("XDG_CONFIG_HOME")
        if xdgConfigHome != "" {
            configDirLocation = filepath.Join(xdgConfigHome, configDirName)
        } else {
            configDirLocation = filepath.Join(homeDir, ".config", configDirName)
        }
    default:
        // On other platforms we just use $HOME/.example
        hiddenConfigDirName := "." + configDirName
        configDirLocation = filepath.Join(homeDir, hiddenConfigDirName)
    }

    return configDirLocation, nil
}

Within the config folder, you can use any filename you want for your config - I suggest config.toml.
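Putting the two together, the full config path is just a filepath.Join away. A usage sketch, where the stub stands in for the GetDefaultConfigDir defined above:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Stub standing in for the GetDefaultConfigDir shown earlier; a real
// program would call the full platform-aware version.
func GetDefaultConfigDir() (string, error) {
	return filepath.Join("/home", "user", ".config", "example"), nil
}

func main() {
	dir, err := GetDefaultConfigDir()
	if err != nil {
		panic(err)
	}
	// The config file lives inside the per-program directory.
	configFile := filepath.Join(dir, "config.toml")
	fmt.Println(configFile) // /home/user/.config/example/config.toml
}
```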

Loading the config file

To load a config file, you’ll first want to define what config values you’ll use. burntsushi/toml will ignore options in the TOML file that you don’t use, so you don’t have to worry about that causing errors. For instance, here’s the proposed configuration for a project I’m maintaining, wuzz (the keybindings aren’t currently implemented, but I’ve left them in for the sake of demonstration):

type Config struct {
    General GeneralOptions
    Keys    map[string]map[string]string
}

type GeneralOptions struct {
    FormatJSON             bool
    Insecure               bool
    PreserveScrollPosition bool
    DefaultURLScheme       string
}
It’s pretty simple. Note that we use a named struct for GeneralOptions, rather than making Config.General an anonymous struct. This makes nesting options simpler and aids tooling.

Loading the config is quite easy:

import (
    "errors"
    "os"

    "github.com/BurntSushi/toml"
)

func LoadConfig(configFile string) (*Config, error) {
    if _, err := os.Stat(configFile); os.IsNotExist(err) {
        return nil, errors.New("Config file does not exist.")
    } else if err != nil {
        return nil, err
    }

    var conf Config
    if _, err := toml.DecodeFile(configFile, &conf); err != nil {
        return nil, err
    }

    return &conf, nil
}

toml.DecodeFile will automatically populate conf with the values set in the TOML file. (Note that we pass &conf to toml.DecodeFile, not conf - we need to populate the struct we actually have, not a copy). Given the above Config type and the following TOML file…

[general]
defaultURLScheme = "https"
formatJSON = true
preserveScrollPosition = true
insecure = false

[keys.general]
  "C-j" = "next-view"
  "C-k" = "previous-view"

[keys.response-view]
  "<down>" = "scroll-down"

…we’ll get a Config like the following:

Config{
    General: GeneralOptions{
        DefaultURLScheme:       "https",
        FormatJSON:             true,
        PreserveScrollPosition: true,
        Insecure:               false,
    },
    Keys: map[string]map[string]string{
        "general": map[string]string{
            "C-j": "next-view",
            "C-k": "previous-view",
        },
        "response-view": map[string]string{
            "<down>": "scroll-down",
        },
    },
}

Automatically decoding values

wuzz actually uses another value in its config - a default HTTP timeout. In this case, though, there’s no native TOML value that cleanly maps to the type we want - a time.Duration. Fortunately, the TOML library we’re using supports automatically decoding TOML values into custom Go values. To do so, we’ll need a type that wraps time.Duration:

type Duration struct {
    time.Duration
}

Next we’ll need to add an UnmarshalText method, so we satisfy the toml.TextUnmarshaler interface. This will let toml know that we expect a string value which will be passed into our UnmarshalText method.

func (d *Duration) UnmarshalText(text []byte) error {
    var err error
    d.Duration, err = time.ParseDuration(string(text))
    return err
}

Finally, we’ll need to add it to our Config type. This will go in Config.General, so we’ll add it to GeneralOptions:

type GeneralOptions struct {
    Timeout                Duration
    // ...
}

Now we can add it to our TOML file, and toml.DecodeFile will automatically populate our struct with a Duration value!


[general]
timeout = "1m"
# ...

Equivalent output:

Config{
    General: GeneralOptions{
        Timeout: Duration{
            Duration: 1 * time.Minute,
        },
        // ...
    },
}

Default config values

We now have configuration loading, and we’re even decoding a text field to a custom Go type - we’re nearly finished! Next we’ll want to specify defaults for the configuration, with values specified in the config file overriding our defaults. Fortunately, toml makes this really easy to do.

Remember how we passed in &conf to toml.DecodeFile? That was an empty Config struct - but we can also pass one with its values pre-populated. toml.DecodeFile will set any values that exist in the TOML file, and ignore the rest. First we’ll create the default values:

import (
    "time"
)

var DefaultConfig = Config{
    General: GeneralOptions{
        DefaultURLScheme:       "https",
        FormatJSON:             true,
        Insecure:               false,
        PreserveScrollPosition: true,
        Timeout: Duration{
            Duration: 1 * time.Minute,
        },
    },
    // You can omit stuff from the default config if you'd like - in
    // this case we don't specify Config.Keys
}

Next, we simply modify the LoadConfig function to use DefaultConfig:

func LoadConfig(configFile string) (*Config, error) {
    if _, err := os.Stat(configFile); os.IsNotExist(err) {
        return nil, errors.New("Config file does not exist.")
    } else if err != nil {
        return nil, err
    }

    conf := DefaultConfig
    if _, err := toml.DecodeFile(configFile, &conf); err != nil {
        return nil, err
    }

    return &conf, nil
}

The important line here is conf := DefaultConfig - now when conf is passed to toml.DecodeFile it will populate that.


I hope this post helped you! You should now be able to configure Go apps using TOML with ease.

If this post was helpful to you, or you have comments or corrections, please let me know! My email address is at the bottom of the page. I’m also looking for work at the moment, so feel free to get in touch if you’re looking for developers.

Complete code

package config

import (
    "errors"
    "os"
    "path/filepath"
    "runtime"
    "time"

    "github.com/BurntSushi/toml"
    "github.com/mitchellh/go-homedir"
)

var configDirName = "example"

func GetDefaultConfigDir() (string, error) {
    var configDirLocation string

    homeDir, err := homedir.Dir()
    if err != nil {
        return "", err
    }

    switch runtime.GOOS {
    case "linux":
        // Use $XDG_CONFIG_HOME/example if XDG_CONFIG_HOME is set,
        // otherwise $HOME/.config/example
        xdgConfigHome := os.Getenv("XDG_CONFIG_HOME")
        if xdgConfigHome != "" {
            configDirLocation = filepath.Join(xdgConfigHome, configDirName)
        } else {
            configDirLocation = filepath.Join(homeDir, ".config", configDirName)
        }
    default:
        // On other platforms we just use $HOME/.example
        hiddenConfigDirName := "." + configDirName
        configDirLocation = filepath.Join(homeDir, hiddenConfigDirName)
    }

    return configDirLocation, nil
}

type Config struct {
    General GeneralOptions
    Keys    map[string]map[string]string
}

type GeneralOptions struct {
    DefaultURLScheme       string
    FormatJSON             bool
    Insecure               bool
    PreserveScrollPosition bool
    Timeout                Duration
}

type Duration struct {
    time.Duration
}

func (d *Duration) UnmarshalText(text []byte) error {
    var err error
    d.Duration, err = time.ParseDuration(string(text))
    return err
}

var DefaultConfig = Config{
    General: GeneralOptions{
        DefaultURLScheme:       "https",
        FormatJSON:             true,
        Insecure:               false,
        PreserveScrollPosition: true,
        Timeout: Duration{
            Duration: 1 * time.Minute,
        },
    },
}

func LoadConfig(configFile string) (*Config, error) {
    if _, err := os.Stat(configFile); os.IsNotExist(err) {
        return nil, errors.New("Config file does not exist.")
    } else if err != nil {
        return nil, err
    }

    conf := DefaultConfig
    if _, err := toml.DecodeFile(configFile, &conf); err != nil {
        return nil, err
    }

    return &conf, nil
}
If you’d like to leave a comment, please email

Andrew Owen (yumaikas)

What 8 years of side projects has taught me October 07, 2019 08:10 PM

I’ve been a professional software developer for almost 8 years now. I’ve been paid to write a lot of software in those years. Far more interesting to me has been the recurring themes that have come up in my side-projects and in the software I’ve been personally compelled to write.

Lesson 0: Programming in a void is worthless

When I wanted to learn programming, I always had to come to the keyboard with a purpose. I couldn’t just sit down and start writing code; I had to have an idea of where I was going.

For that reason, I’ve always used side-projects as a means of learning programming languages. When I wanted to learn QBasic, there were a number of games: one set in space, another a fantasy game. When I wanted to learn Envelop Basic, I attempted a Yahtzee clone, a Space Invaders clone, and a hotel-running game, and made a MicroGame based on the ones I’d seen videos of in WarioWare DIY.

When I wanted to learn C#, I went through a book, but I also, at the same time, worked on building a Scientific Calculator. (put a pin in that idea). When I wanted to learn Go, I wrote the CMS for the blog you’re reading right now. To pick up Lua, I used Love2D in several game jams. The only reason I have more than a passing familiarity with Erlang is because I used it for

Most times, when I’ve tried to learn a programming technology without a concrete goal to get something built, it has been hard for me to maintain interest. That hasn’t kept me from trying out a lot of things, but it’s the ones that let me build useful or interesting things that have stuck with me the most. Right now, for side-projects, that list includes Go, Lua, Tcl, and Bash.

Lesson 1: Building a programming language is hard, but rewarding

Ever since I first started cutting my teeth on C#, the ideas of parsing have held a certain fascination for me. Like I said before, I started out wanting to write a scientific calculator. But, because I was a new programmer with no idea of what building a scientific calculator should look like, I did a lot of inventing things from first principles. It felt like a divine revelation when I reasoned out an add/multiply algorithm for parsing numbers. It also took me the better part of two weeks to puzzle it out.

I was so proud of that, in fact, that I copied that code into one of my work projects, a fact which really amuses me now that I know about the existence of Regex and Int.Parse().

Eventually, I worked out a very basic notion of doing recursive descent parsing, and some tree evaluation, so that, on a good day, I had a basic, but working, math expression evaluator.

Working on that calculator, however, set me on a course of wanting to understand how programming languages worked. In the process of trying to understand them, I’ve looked over papers on compilers more than once, but never quite had the patience to actually write one out. In the process of wanting to make a programming language, I ended up writing two before PISC. The first was an “I want to write something in a night that is Turing Complete” language that was basically a bastard version of assembly. At the time, I called it SpearVM. I had intended it to be a compilation target for some higher-level language, but it mostly just served as a stepping stone for the next two projects.

The second one was a semester-long moonshot project where I wanted to try to make a visual programming language, using either Java Swing or JavaFX, inspired by Google’s Blockly environment. Unfortunately, I could not figure out nesting, so I ended up giving the ideas I’d had in SpearVM a visual representation, and using that for my class assignment.

The combination of all of these experiences, and discovering the Factor language, set me thinking about trying to build a programming language that was stack-based, especially since parsing it seemed a far easier task than what I’d been trying to do until then. A couple late nights later, and I’d built out a prototype in Go.

I’ve had a number of co-workers impressed that I’ve written a scripting language. Thing is, it took me like 7 false starts to find a way to do it that made sense to me (and that was a stack-based language with almost 0 lexing). It’s only now, that I’m on the other side of that learning experience, that I’d feel comfortable approaching writing a language with C-like syntax. PISC, as I’ve had time to work on it, has actually started to develop more things like parsing, lexing, and even compiling. In fact, I’ve got a small prototype of a language called Tinscript that isn’t nearly so post-fix oriented as PISC, though it’s still stack based.

And PISC, to boot, is still what I’d consider easy mode when it comes to developing a programming language. Factor, Poprc, or even a run-of-the-mill C-like language all strike me as projects that take more tenacity to pull off.

Lesson 2: Organizing my thoughts is important, but tricky to figure out

If the early years of my programming side-projects often focused on how to build programming languages, and how to better understand computers, the more recent years have had a much stronger focus on how to harness computers to augment my mind. A big focus here was for me to find ways to reduce the working set I needed in my mind at any given time. This has resulted in no fewer than 7 different systems for trying to help keep track of things.

  • A TCL application for launching programs I used on a regular basis.
  • A C# application for the same, but with a few more bells and whistles
  • Trying out Trello
  • Trying out various online outliners, ultimately being satisfied with none of them.
  • A months-long foray into trying to learn and apply Org-mode and Magit in Emacs, and ultimately giving up due to slowness on Windows, and the fact that my org-files kept getting too messy.
  •, a place meant for me to shunt my stray thoughts to get them off my mind during the work day.
  • Another TCL application called PasteKit, which was designed to help me juggle all of the 4-6 digit numbers I dealt with at one of my jobs.
  • A C# version of PasteKit, that also had a customizable list of launchers
  • Bashmarks, but for CMD.exe

These are all approaches I’ve invested non-trivial amounts of time into over the past three years, trying to figure out a way to organize my thoughts as a software developer, but none of them lasted much longer than a month or so.

All of this came to a head during Thanksgiving weekend of 2018. My work at Greenshades often involved diving deep into tickets and opening a lot of SQL scripts in SSMS, and I had found no good way to organize them all. So, in a move that felt rather desperate at the time, I wrote a C# program that was a simple journal, but one that had a persistent search bar, and stored all of its entries in a SQLite database. And I used a simple tagging scheme for the entries, of marking them with things like @ticket65334, and displaying the most recent 5 notes.
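The core of that tagging scheme is simple enough to sketch. Here's a toy JavaScript version; the names and data shapes here are purely illustrative, not the actual implementation:

```javascript
// A toy model of the journal feed: notes are filtered by an @tag and the
// five most recent are shown. All names here are hypothetical.
const notes = [
  { id: 1, text: "Repro steps @ticket65334" },
  { id: 2, text: "Grocery list @scratch" },
  { id: 3, text: "Root cause found @ticket65334" },
];

const feed = (notes, tag) =>
  notes
    .filter(note => note.text.includes("@" + tag))
    .sort((a, b) => b.id - a.id) // newest entries first
    .slice(0, 5);                // show the most recent 5

console.log(feed(notes, "ticket65334").map(note => note.id)); // [ 3, 1 ]
```

The point of the sketch is the shape of the idea: tags are just text in the note, so "filing" a note is free-form, and the feed is a search, not a folder.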

It was finally a system that seemed to actually work for how I liked to think about things. The UI was a fairly simple 3-column layout. In the leftmost column, I had a “scratchpad” where I kept daily notes of what I’d worked on, in the middle I had my draft pad, and on the right I had the feed of notes, based on the search I’d done. I also had a separate screen dedicated to searching through all the notes that had previously been recorded.

There were several benefits to how this system worked:

  • It allowed me to forget things by putting notes under different tags. That meant they wouldn’t show up on my focused feed, but that I could get back to them later.
  • It allowed me to regain context after getting interrupted much more easily.
  • It gave me a virtual rubber duck. Since I often was trying to figure out issues by writing them to my teammates anyway, the journal gave me a very good first port of call when my stream of consciousness got blocked by an obstacle. This helped dramatically with keeping me off distracting websites like Hackernews or Reddit.
  • It allowed old information to fall out of relevance. One of the biggest problems with all the various tracking systems I’d used before, especially Trello and Org-mode, is that as the system filled up, it was hard for old items to fall out of relevance without also becoming just a bit harder to access. Due to the nature of the feed, this system made it much more natural for information to fall off. And if I wanted something to stick around, I could just copy the relevant bits to a new note, which I often did.

All of this added up to me feeling like I’d found a missing piece of my mind. Almost like I’d created a REPL for my thought process.

Unfortunately, I don’t have that C# version any more. I do have a Go/Lua version, which is webapp based, though I still need to put some time into making the feedback-loop for it tighter, since my first versions weren’t quite as tightly focused on that, as much as they were focused on replicating the UI layout of the C# version. I’d argue that the tight feedback loop that the C# version had would be more important now, and I’ve slowly been working on adding it back.

The nice thing about the Go/Lua journal is that it’s far more flexible than the C# version, due to being able to write pages in Lua. Which means I’ll be able to

Lesson 3: Search is a great tool for debugging and flexible organization

Exhaustive string search of both code and notes has proven to be a surprisingly effective tool for understanding and cataloging large systems for me. To this end, Gills (my journaling software), Everything Search (search over the paths and file names on your laptop) and RipGrep have been extremely handy tools to have on hand. The nice thing about search as a tool is that it can be adapted into other things quite nicely. In fact, I would argue that fast search, both via Google, and via the tools I’d mentioned above, is one of the more influential changes we’ve seen in programming in the last 20 years.

Coda: Sticky ideas

8 years is a long time, and there are a lot more ideas that I’d like to get into later. However, these are the ideas and things I’ve worked on that have proven to be surprisingly sticky. Perhaps they might help you, or give you some ideas of where to focus.

Published September 9th, 2019

Jan van den Berg (j11g)

Iedere dag vrij – Bob Crébas October 07, 2019 08:00 PM

I remember exactly where I was when, in 2004, I heard that Dutch ad site was sold for a staggering 224.5 million euros to eBay. A polder Cinderella story.

This success was, however, no accident. Of course, luck was involved, but this is true of all successful businesses. A few years after this deal Bob Crébas (don’t forget the acute accent) wrote down the experiences that led to it. And this has resulted in a very fun autobiography.

Iedere dag vrij (Every day off) – Bob Crébas (2006) – 238 pages


A former staunchly anti-nuclear-energy, jobless and musically inclined hippie from a farming family, who was not particularly concerned with appearance and image, first grew a thrift-store chain into a multi-million euro business before deciding to jump on the internet bandwagon. And then he defied the odds (there were *many* competitors) before striking gold with this internet thing.

I read this book in one sitting: it is an absolutely fun, well-written and energizing story. And the internet deal (my primary interest) is only a small part of this story. Which is proof that this is a balanced story and that there is more to the writer than just this one deal.

I particularly liked how he weaved the entrepreneurial and pioneering spirit of the new land and the zeitgeist of the 60s and 70s into this story. And of course, the bands he played in are absolutely fantastic fun to read about.

Bob comes off as an interesting character, and this autobiography seems to be the culmination of someone who is able to define and articulate his life philosophy succinctly: it’s about being, not having, and it’s about creating and giving, not taking.

The post Iedere dag vrij – Bob Crébas appeared first on Jan van den Berg.

A Moveable Feast – Ernest Hemingway October 07, 2019 07:49 PM

Hemingway, the writers’ writer, is famous for having spent his early years in Paris. Freshly married, this struggling and then-unknown writer was honing his craft and subsequently defining what it means to be a writer in a vibrant post-World War I Paris, where he wrote his first big novel.

A Moveable Feast – Ernest Hemingway (1964) – 192 pages

In later life Hemingway wrote up his 5-year experience in a couple of loosely related stories. Which involve interactions with other writers (mainly Scott Fitzgerald) and poets. And which offer very specific details (drinks, prices, addresses etc.). Which is almost odd, since the stories were written some forty years later. These stories were posthumously bundled and released as this memoir.

This memoir offers perfect insight into Hemingway and his writing. Blunt, sparse, dead-serious (to the point of being humourless even), and without pretense, A Moveable Feast is quintessential Hemingway and a must-read for anyone who wants to better understand him.

The post A Moveable Feast – Ernest Hemingway appeared first on Jan van den Berg.

Pete Corey (petecorey)

Generating Guitar Chords with Cartesian Products October 07, 2019 12:00 AM

Given two or more lists, like [1, 2] and [3, 4], the Cartesian product of those lists contains all ordered combinations of the elements within those lists: [1, 3], [1, 4], [2, 3], and [2, 4]. This may not seem like much, but Cartesian products are an algorithmic superpower. Maybe it’s J’s subtle influence over my programming style, but I find myself reaching more and more for Cartesian products in the algorithms I write, and I’m constantly awed by the simplicity and clarity they bring to my solutions.
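For anyone who hasn't reached for it before, a Cartesian product is easy to hand-roll. Here's a minimal sketch in plain JavaScript (the article itself leans on lodash's _.product instead):

```javascript
// A hand-rolled Cartesian product of any number of lists.
const product = (...lists) =>
  lists.reduce(
    // Extend each partial combination with every element of the next list.
    (acc, list) => acc.flatMap(combo => list.map(x => [...combo, x])),
    [[]] // Start with a single empty combination.
  );

console.log(product([1, 2], [3, 4]));
// [ [ 1, 3 ], [ 1, 4 ], [ 2, 3 ], [ 2, 4 ] ]
```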

As an example of how useful they can be, let’s look at the problem of generating all possible guitar chord voicings, like I do in Glorious Voice Leader. As a quick aside, if you want to know more about Glorious Voice Leader, check out last week’s post!

Imagine we’re trying to generate all possible C major chord voicings across a guitar’s fretboard. That is, we’re trying to find all playable combinations of the notes C, E, and G. How would we do this?

One approach, as you’ve probably guessed, is to use Cartesian products!

Let’s assume that we have a function, findNoteOnFretboard, that gives us all the locations (zero-based string/fret pairs) of a given note across the fretboard. For example, if we pass it a C (0 for our purposes), we’ll receive an array of string/fret pairs pointing to every C note on the fretboard:

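The original listing isn't reproduced here, but a quick loop-based stand-in shows the shape of the result. This sketch assumes standard tuning expressed as MIDI note numbers and 18 frets, mirroring the defaults used later in the post:

```javascript
// A loop-based stand-in for findNoteOnFretboard, assuming standard tuning
// as MIDI note numbers and 18 frets (the defaults used later in the post).
const tuning = [40, 45, 50, 55, 59, 64]; // E A D G B E
const frets = 18;

const findNoteOnFretboard = note => {
  let pairs = [];
  for (let string = 0; string < tuning.length; string++) {
    for (let fret = 0; fret < frets; fret++) {
      // A string/fret pair sounds `note` if it matches modulo the octave.
      if ((tuning[string] + fret) % 12 === note) {
        pairs.push([string, fret]);
      }
    }
  }
  return pairs;
};

console.log(findNoteOnFretboard(0));
// [ [0, 8], [1, 3], [1, 15], [2, 10], [3, 5], [3, 17], [4, 1], [4, 13], [5, 8] ]
```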

Plotted on an actual guitar fretboard, we’d see all of our C notes exactly where we’d expect them to be:

Now imagine we’ve done this for each of our notes, C, E, and G:

let cs = findNoteOnFretboard(frets, strings, tuning)(0);
let es = findNoteOnFretboard(frets, strings, tuning)(4);
let gs = findNoteOnFretboard(frets, strings, tuning)(7);

The set of all possible voicings of our C major chord, or voicings that contain one of each of our C, E, and G notes, is just the Cartesian product of our cs, es, and gs lists!

let voicings = _.product(cs, es, gs);

We’re using lodash.product here, rather than going through the process of writing our own Cartesian product generator.

We can even generalize this to any given array of notes, and wrap it up in a function:

const voicings = (
  notes,
  tuning = [40, 45, 50, 55, 59, 64],
  frets = 18,
  strings = _.size(tuning)
) =>
  _.chain(notes)
    .map(findNoteOnFretboard(frets, strings, tuning))
    .thru(notesOnFretboard => _.product(...notesOnFretboard))
    .value();

Finding Notes on the Fretboard

So that’s great and all, but how do we implement our findNoteOnFretboard function? With Cartesian products, of course! We’ll generate a list of every string and fret position on the fretboard by computing the Cartesian product of each of our possible string and fret values:

const findNoteOnFretboard = (frets, strings, tuning) => note =>
  _.chain(_.product(_.range(strings), _.range(frets)))
    .value();

Next, we’ll need to filter down to just the string/fret pairs that point to the specified note:

const isNote = (note, tuning) => ([string, fret]) =>
  (tuning[string] + fret) % 12 === note;

const findNoteOnFretboard = (frets, strings, tuning) => note =>
  _.chain(_.product(_.range(strings), _.range(frets)))
    .filter(isNote(note, tuning))
    .value();

The isNote helper function returns whether the note at the given string/fret is the note we’re looking for, regardless of octave.
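A couple of concrete checks make the modulo arithmetic clear (plain JavaScript, no lodash needed for this part):

```javascript
// Same logic as the article's isNote helper: does this string/fret pair
// sound the given pitch class, in any octave?
const isNote = (note, tuning) => ([string, fret]) =>
  (tuning[string] + fret) % 12 === note;

const tuning = [40, 45, 50, 55, 59, 64]; // standard tuning, as MIDI note numbers

console.log(isNote(0, tuning)([0, 8])); // true:  40 + 8 = 48, and 48 % 12 === 0 (a C)
console.log(isNote(0, tuning)([0, 9])); // false: 40 + 9 = 49, and 49 % 12 === 1 (a C#)
```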

Filtering Out Doubled Strings

Currently, our chord voicing generator looks like this:

const isNote = (note, tuning) => ([string, fret]) =>
  (tuning[string] + fret) % 12 === note;

const findNoteOnFretboard = (frets, strings, tuning) => note =>
  _.chain(_.product(_.range(strings), _.range(frets)))
    .filter(isNote(note, tuning))
    .value();

const voicings = (
  notes,
  tuning = [40, 45, 50, 55, 59, 64],
  frets = 18,
  strings = _.size(tuning)
) =>
  _.chain(notes)
    .map(findNoteOnFretboard(frets, strings, tuning))
    .thru(notesOnFretboard => _.product(...notesOnFretboard))
    .value();

Not bad. We’ve managed to generate all possible voicings for a given chord in less than twenty lines of code! Unfortunately, we have a problem. Our solution generates impossible voicings!

The first problem is that it can generate voicings with two notes on the same string:

On a stringed instrument like the guitar, it’s impossible to sound both the C and E notes simultaneously. We’ll need to reject these voicings by looking for voicings with “doubled strings”. That is, voicings with two or more notes played on the same string:

const voicings = (
  notes,
  tuning = [40, 45, 50, 55, 59, 64],
  frets = 18,
  strings = _.size(tuning)
) =>
  _.chain(notes)
    .map(findNoteOnFretboard(frets, strings, tuning))
    .thru(notesOnFretboard => _.product(...notesOnFretboard))
    .reject(hasDoubledStrings)
    .value();

Our hasDoubledStrings helper simply checks if the size of the original voicing doesn’t match the size of our voicing after removing duplicated strings:

const hasDoubledStrings = chord =>
  _.size(chord) !== _.size(_.uniqBy(chord, ([string, fret]) => string));
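To see the check in action, here's a dependency-free equivalent applied to a couple of sample voicings:

```javascript
// A plain-JS version of the doubled-strings check: a voicing is rejected
// when two of its [string, fret] notes land on the same string.
const hasDoubledStrings = chord => {
  const strings = chord.map(([string, fret]) => string);
  return new Set(strings).size !== chord.length;
};

console.log(hasDoubledStrings([[1, 3], [1, 7], [2, 5]])); // true: two notes on string 1
console.log(hasDoubledStrings([[0, 8], [1, 7], [2, 5]])); // false: one note per string
```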

Filtering Out Impossible Stretches

Unfortunately, our solution has one last problem. It can generate chords that are simply too spread out for any human to play. Imagine trying to stretch your hand enough to play this monster of a voicing:

No good. We’ll need to reject these voicings that have an unplayable stretch:

const voicings = (
  notes,
  tuning = [40, 45, 50, 55, 59, 64],
  frets = 18,
  maxStretch = 5,
  strings = _.size(tuning)
) =>
  _.chain(notes)
    .map(findNoteOnFretboard(frets, strings, tuning))
    .thru(notesOnFretboard => _.product(...notesOnFretboard))
    .reject(hasDoubledStrings)
    .reject(hasUnplayableStretch(maxStretch))
    .value();

Let’s keep things simple for now and assume that an “unplayable stretch” is anything over five frets in distance from one note in the voicing to another.

const hasUnplayableStretch = maxStretch => chord => {
  let [, min] = _.minBy(chord, ([string, fret]) => fret);
  let [, max] = _.maxBy(chord, ([string, fret]) => fret);
  return max - min > maxStretch;
};
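Sanity-checking the rule with a dependency-free equivalent:

```javascript
// Plain-JS version of the stretch check: reject any voicing whose lowest
// and highest frets are more than maxStretch apart.
const hasUnplayableStretch = maxStretch => chord => {
  const frets = chord.map(([string, fret]) => fret);
  return Math.max(...frets) - Math.min(...frets) > maxStretch;
};

console.log(hasUnplayableStretch(5)([[0, 1], [1, 8]])); // true: a 7-fret spread
console.log(hasUnplayableStretch(5)([[0, 3], [1, 5]])); // false: a 2-fret spread
```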

Expansion and Contraction

Our voicings function now generates all possible voicings for any given set of notes. A nice way of visualizing all of these voicings on the fretboard is with a heat map. Here are all of the C major voicings we’ve generated with our new Cartesian product powered voicings function:

The darker the fret, the more frequently that fret is used in the set of possible voicings. Click any fret to narrow down the set of voicings.

The Cartesian product, at least in the context of algorithms, embodies the idea of expansion and contraction. I’ve found that over-generating possible results, and culling out impossibilities leads to incredibly clear and concise solutions.

Be sure to add the Cartesian product to your programming tool box!


October 06, 2019

Derek Jones (derek-jones)

Cost ratio for bespoke hardware+software October 06, 2019 09:32 PM

What percentage of the budget for a bespoke hardware/software system is spent on software, compared to hardware?

The plot below has become synonymous with this question (without the red line, which highlights 1973), and is often used to claim that software costs are many times more than hardware costs.

USAF bespoke hardware/Software cost ratio from 1955 to 1980.

The paper containing this plot was published in 1973 (the original source is a Rome period report), and is an extrapolation of data I assume was available in 1973, into what was then the future. The software and hardware costs are for bespoke command and control systems delivered to the U.S. Air Force, not commercial off-the-shelf solutions or even bespoke commercial systems.

Does bespoke software cost many times more than the hardware it runs on?

I don’t have any data that might be used to answer this question to any worthwhile degree of accuracy. I know of situations where I believe the bespoke software did cost a lot more than the hardware, and I know of some where the hardware cost more (I have never been privy to exact numbers on large projects).

Where did the pre-1973 data come from?

The USAF funded the creation of lots of source code, and the reports cite hardware and software figures from 1972.

To summarise: the above plot is for USAF spending on bespoke command and control hardware and software, and is extrapolated from 1973 into the future.

Bogdan Popa (bogdan)

Announcing redis-rkt October 06, 2019 04:00 PM

Another Racket thing! redis-rkt is a new Redis client for Racket that I’ve been working on these past few weeks. Compared to the existing redis and rackdis packages, it: is fully documented, is safer due to strict use of contracts, is faster, supports more commands and its API tries to be idiomatic, rather than being just a thin wrapper around Redis commands. Check it out!

Ponylang (SeanTAllen)

Last Week in Pony - October 6, 2019 October 06, 2019 03:57 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

Carlos Fenollosa (carlesfe)

October 05, 2019

Bit Cannon (wezm)

Ryzen 9 SFF PC October 05, 2019 11:11 PM

I built this machine for work use. I started a new job in March 2019 that involves working with two compiled languages: Mercury and Rust. I wanted a machine with lots of cores/threads to keep the edit-compile-test cycle as short as possible.


CPU: AMD Ryzen 9 3900X (Base: 3.80GHz, Boost: 4.60GHz, Cores: 12, Threads: 24)
CPU Cooling: Noctua NH-U9S
Case Cooling: Noctua NF-A14-PWM
Motherboard: Gigabyte GA-X570 I Aorus Pro WiFi
Memory: Corsair Vengeance LPX 16GB (2x8GB), PC4-25600 (3200MHz) DDR4, 16-18-18-36
Storage: Samsung 500GB SSD, 970 EVO Plus, M.2 NVMe
Case: Streacom DA2
Power Supply: Corsair 450W SF450 High Performance SFX
Graphics Card: Gigabyte Radeon RX 560 16CU Gaming OC
Display: Dell P2415Q 23.8-inch 4K (3840 × 2160) LCD

José Padilla (jpadilla)

Richard Kallos (rkallos)

Presentation: Hybrid Logical Clocks @ PWLMTL October 05, 2019 03:22 PM

This past Thursday (2019-10-03), I presented this paper at Papers We Love Montreal. I had a really fun time!

Here are my slides.

October 03, 2019

Chris Double (doublec)

Defining Types in Shen October 03, 2019 05:00 AM

The Shen programming language has an extensible type system. Types are defined using sequent calculus and the system is powerful enough to create a variety of exotic types but it can be difficult when first starting with Shen to know how to use that power. In this post I hope to go through some basic examples of defining types in Shen without needing to know too much sequent calculus details.

For an overview of Shen there is Shen in 15 minutes and the Shen OS Kernel Manual. An interactive JavaScript REPL exists to try examples in the browser, or pick one of the existing Shen language ports. For these examples I'm using my Wasp Lisp port of Shen.

Shen is optionally typed. The type checker can be turned off and on. By default it is off and this can be seen in the Shen prompt by the presence of a '-' character:

(0-) ...shen code...

Turning type checking on is done by using (tc +). The '-' in the prompt changes to a '+' to show type checking is active. It can be turned off again with (tc -):

(0-) (tc +)
(1+) (tc -)
(2-) ...

Types in Shen are defined using datatype. The body of the datatype definition contains a series of sequent calculus rules. These rules define how an object in Shen can be proved to belong to a particular type. Rather than go through a detailed description of sequent calculus, I'm going to present common examples of types in Shen to learn by example and dive into details as needed. There's the Shen Language Book for much more detail if needed.


One way of storing collections of data in Shen is to use lists or vectors. For example, given the concept of a 'person' that has a name and age, this can be stored in a list with functions to get the relevant data:

(tc -)

(define make-person
  Name Age -> [Name Age])

(define get-name
  [Name Age] -> Name)

(define get-age
  [Name Age] -> Age)

(get-age (make-person "Person1" 42))
 => 42

In the typed subset of Shen we can define a type for this person object using datatype:

(datatype person
  N : string; A : number;
  _______________________
  [N A] : person;)

This defines one sequent calculus rule. The way to read it is starting with the code below the underscore line, followed by the code above it. In this case the rule states that if an expression matching the pattern [N A] is encountered, where N is a string and A is a number, then type that expression as person. With that rule defined, we can ask Shen if lists are of the type person:

(0+) ["Person1" 42] : person
["Person1" 42] : person

(1+) ["Person1" "Person1"] : person
[error shen "type error"]

(2+) ["Person1" 42]
["Person1" 42] : person

Given this person type, we might write a get-age function that is typed such that it only works on person objects as follows (The { ...} syntax in function definitions provides the expected type of the function):

(define get-age
  { person --> number }
  [N A] -> A)
[error shen "type error in rule 1 of get-age"]

Shen rejects this definition as not being type safe. The reason for this is because our datatype definition only states that [N A] is a person if N is a string and A is a number. It does not state that a person object is constructed only of a string and number. For example, we could have an additional definition as follows:

(datatype person2
  N : string; A : string;
  _______________________
  [N A] : person;)

Now we can create different types of person objects:

(0+) ["Person" 42 ] : person
["Person" 42] : person

(1+) ["Person" "young"] : person
["Person" "young"] : person

get-age is obviously badly typed in the presence of this additional type of person, which is why Shen rejected it originally. To resolve this we need to tell Shen that an [N A] is a person if and only if N is a string and A is a number. This is done with what is called a 'left rule'. Such a rule defines how a person object can be deconstructed. It looks like this:

(datatype person3
  N : string, A: number >> P;
  ___________________________
  [N A] : person >> P;)

The way to read this type of rule is that, if [N A] is a person then N is a string and A is a number. With that loaded into Shen, get-age type checks:

(define get-age
   { person --> number }
   [N A] -> A)
get-age : (person --> number)

(0+) (get-age ["Person" 42])
42 : number

The need to create a left rule, dual to the right rule, is common enough that Shen has a shorthand for defining both in one definition. It looks like this - note the use of '=' instead of '_' in the separator line:

(datatype person
   N : string; A : number;
   =======================
   [N A] : person;)

(define get-age
   { person --> number }
   [N A] -> A)
get-age : (person --> number)

(0+) (get-age ["Person" 42])
42 : number

The above datatype is equivalent to declaring the two rules:

(datatype person
  N : string; A : number;
  _______________________
  [N A] : person;

  N : string, A: number >> P;
  ___________________________
  [N A] : person >> P;)

Controlling type checking

When programming at the REPL of Shen it's common to create datatype definitions that are no longer needed, or that were part of a line of thought you don't want to pursue. Shen provides ways of excluding or including rules in the typechecker as needed. When defining a set of rules in a datatype, that datatype is given a name:

(datatype this-is-the-name

The rules within that definition can be removed from selection by the typechecker using preclude, which takes a list of datatype names to ignore during type checking:

(preclude [this-is-the-name])

To re-add a datatype, use include:

(include [this-is-the-name])

There is also include-all-but and preclude-all-but to include or remove all but the listed names. These commands are useful for removing definitions you no longer want to use at the REPL, but also for speeding up type checking in a given file if you know the file only uses a particular set of datatypes.


An example of an enumeration type would be days of the week. In an ML style language this can be done like:

datatype days =   monday | tuesday | wednesday
                | thursday | friday | saturday | sunday

In Shen this would be done using multiple sequent calculus rules.

(datatype days
    _________
    monday : day;

    _________
    tuesday : day;

    _________
    wednesday : day;

    _________
    thursday : day;

    _________
    friday : day;

    _________
    saturday : day;

    _________
    sunday : day;)

Here there are no rules above the dashed underscore line, meaning that the given symbol is of the type day. A function that uses this type would look like:

(define day-number
  { day --> number }
  monday    -> 0
  tuesday   -> 1
  wednesday -> 2
  thursday  -> 3
  friday    -> 4
  saturday  -> 5
  sunday    -> 6)

It's quite verbose to define a number of enumeration types like this. It's possible to add a test above the dashed underline which allows being more concise. The test is introduced using if:

(datatype days
  if (element? Day [monday tuesday wednesday thursday friday saturday sunday])
  _____________________________________________________________________________
  Day : day;)

(0+) monday : day
monday : day

Any Shen code can be used in these test conditions. Multiple tests can be combined:

(datatype more-tests
  if (number? X)
  if (>= X 5)
  if (<= X 10)
  ______________________
  X : between-5-and-10;)

(2+) 5 : between-5-and-10
5 : between-5-and-10

(3+) 4 : between-5-and-10
[error shen "type error\n"]

Polymorphic types

To create types that are polymorphic (i.e. generic), like the built-in list type, include a free variable representing the type. For example, something like the built-in list where the list elements are stored as pairs can be approximated with:

(datatype my-list
   ____________________
   my-nil : (my-list A);

   X : A; Y : (my-list A);
   =======================
   (@p X Y) : (my-list A);)

(define my-cons
  { A --> (my-list A) --> (my-list A) }
  X Y -> (@p X Y))

(0+) (my-cons 1 my-nil)
(@p 1 my-nil) : (my-list number)

(1+) (my-cons 1 (my-cons 2 my-nil))
(@p 1 (@p 2 my-nil)) : (my-list number)

(2+) (my-cons "a" (my-cons "b" my-nil))
(@p "a" (@p "b" my-nil)) : (my-list string)

Notice the use of the '=====' rule to combine left and right rules. This is required to enable writing something like my-car which requires proving that the type of the car of the list is of type A:

(define my-car
   { (my-list A) --> A }
   (@p X Y) -> X)

List encoded with size

Using Peano numbers we can create a list where the length of the list is part of the type:

(datatype list-n
  _____________________
  [] : (list-n zero A);

  X : A; Y : (list-n N A);
  ================================
  [ X | Y ] : (list-n (succ N) A);)

(define my-tail
  { (list-n (succ N) A) --> (list-n N A) }
  [Hd | Tl] -> Tl)

(define my-head
  { (list-n (succ N) A) --> A }
  [Hd | Tl] -> Hd)

This gives type-safe head and tail operations that can't be called on an empty list:

(0+) [] : (list-n zero number)
[] : (list-n zero number)

(1+) [1] : (list-n (succ zero) number)
[1] : (list-n (succ zero) number)

(2+) (my-head [])
[error shen "type error\n"]

(3+) (my-head [1])
1 : number

(4+) (my-tail [1 2 3])
[2 3] : (list-n (succ (succ zero)) number)

(5+) (my-tail [])
[error shen "type error\n"]      

Power and Responsibility

Shen gives a lot of power in creating types, but trusts you to make those types consistent. For example, the following creates an inconsistent type:

(datatype person
  N : string; A : number;
  _______________________
  [N A] : person;

  N : string; A : string;
  _______________________
  [N A] : person;

  N : string, A: number >> P;
  ___________________________
  [N A] : person >> P;)

Here we are telling Shen that a string and a number in a list is a person, and so too is a string and another string. But the third rule states that a person is composed of a string and a number only. This leads to:

(0+) (get-age ["Person" "Person"])

This will hang for a long time as Shen attempts to resolve the error we've created.


Shen provides a programmable type system, but the responsibility lies on the programmer to make sure the types are consistent. The examples given here provide a brief overview. For much more see The Book of Shen. The Shen OS Kernel Manual also gives some examples. There are posts on the Shen Mailing List that have more advanced examples of Shen types. Mark Tarver has a case study showing converting a lisp interpreter in Shen to use types.

October 01, 2019

Simon Zelazny (pzel)

For focused reading, disconnect wifi October 01, 2019 10:00 PM

Today I had a 50-page PDF whitepaper to read. I didn't know how long it was going to take, and, wanting to conserve my laptop battery charge, I disabled my wifi. It took me surprisingly little time to skim the paper and dig into the more relevant parts in detail, taking some notes as I went along.

During the ~2 hours it took me to work through the document, I tried to access the Internet about ten times, either by clicking a link in the whitepaper itself or by trying to follow up on a tangential thought with some info online. But! My wifi was switched off, and I'd need to click the network manager icon to re-connect to the net. I chose not to do that, and instead continued reading.

Each time my impulse to access information was frustrated, I realized that had I been online, I would have wasted precious minutes reading tangentially-related web pages, and then some more time again, trying to get back to reading the original PDF, reestablishing the reading context and exerting willpower to stay in the PDF reader.

Acting on these distractions would have definitely prevented me from ingesting the whitepaper in 2 hours.

After the fact, I realized that what I'd achieved accidentally is the productivity hack identified by Matt Might as crippling your technology. By removing functionality from our tools and keeping only what is strictly necessary for completing the task at hand, we remove the 'friction' that gets in the way of sustained attention. Yes, we do "lose" some capabilities, but we make up for it by making it easier for ourselves to focus on the goal.

I'll try to keep this technique in my tool-belt, especially when I need long periods of focus.

Apart from Matt Might's productivity writings, a lot more in this vein can be found in Cal Newport's books & blog posts.

Gustaf Erikson (gerikson)

Pete Corey (petecorey)

Animating a Canvas with Phoenix LiveView: An Update October 01, 2019 12:00 AM

In my previous post on animating an HTML5 canvas using Phoenix LiveView, we used both a phx-hook attribute and a phx-update="ignore" attribute simultaneously on a single DOM element. The goal was to ignore DOM updates (phx-update="ignore"), while still receiving updated data from our server (phx-hook) via our data-particles attribute.

Unfortunately, the technique of using both phx-hook and phx-update="ignore" on a single component no longer works as of phoenix_live_view version 0.2.0. The "ignore" update rule causes our hook’s updated callback to not be called with updates. In hindsight, the previous behavior doesn’t even make sense, and the new behavior seems much more consistent with the metaphors in play.

Joxy pointed this issue out to me, and helped me come up with a workaround. The solution we landed on is to wrap our canvas component in another DOM element, like a div. We leave our phx-update="ignore" on our canvas to preserve our computed width and height attributes, but move our phx-hook and data attributes to the wrapping div:

<div
  phx-hook="canvas"
  data-particles="<%= Jason.encode!(@particles) %>"
>
  <canvas phx-update="ignore">
    Canvas is not supported!
  </canvas>
</div>

In the mounted callback of our canvas hook, we need to look to the first child of our div to find our canvas element:

mounted() {
  let canvas = this.el.firstElementChild;
  // ...
}
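For completeness, a fuller version of such a hook might look like the sketch below. The hook name ("canvas") and the shape of the updated callback are my assumptions based on the snippets in this post, not Pete's exact code:

```javascript
// Sketch of a LiveView hook for the wrapped canvas. The hook name
// ("canvas") and the data-particles attribute are assumptions based
// on the markup described in this post.
function readParticles(el) {
  // The server JSON-encodes particle data into data-particles on the
  // wrapping div, so it survives phx-update="ignore" on the canvas.
  return JSON.parse(el.dataset.particles || "[]");
}

const hooks = {
  canvas: {
    mounted() {
      // The canvas element is the first child of the hooked div.
      this.canvas = this.el.firstElementChild;
      this.particles = readParticles(this.el);
    },
    updated() {
      // Called on every server render with fresh particle data.
      this.particles = readParticles(this.el);
    },
  },
};
```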

Finally, we need to pass a reference to a Phoenix Socket directly into our LiveSocket constructor to be compatible with our new version of phoenix_live_view:

import { Socket } from "phoenix";
let liveSocket = new LiveSocket("/live", Socket, { hooks });

And that’s all there is to it! Our LiveView-powered confetti generator is back up and running with the addition of a small layer of markup.

For more information on this update, be sure to check out this issue I filed to try to get clarity on the situation. And I’d like to give a huge thanks to Joxy for doing all the hard work in putting this fix together!

September 30, 2019

Pete Corey (petecorey)

All Hail Glorious Voice Leader! September 30, 2019 12:00 AM

I’ve been writing code that generates guitar chords for over a year now, in various languages and with varying degrees of success. My newest addition to this family of chord-creating programs is Glorious Voice Leader!

Glorious Voice Leader is an enigmatic tool whose mission is to help you voice lead smoothly between chords. It does this by generating all possible (and some impossible) voicings of a given chord, and sorting them based on their chromatic distance from the previous chord in the progression.

Glorious Voice Leader says, “the less you move, the more you groove!”

Obviously, this robotic “rule” needs to be tempered by human taste and aesthetic, so the various choices are presented to you, the user, in the form of a heat map laid over a guitar fretboard. The notes in the voicings that Glorious Voice Leader think lead better from the previous chord are darkened, and notes that don’t lead as well are lightened.
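As a toy sketch of that ranking rule (my own illustration, not Glorious Voice Leader's actual code), total chromatic movement between two voicings can be summed per voice and used to sort candidates:

```javascript
// Toy illustration of ranking voicings by chromatic distance.
// Voicings are arrays of MIDI note numbers, one per voice; we
// compare voices position by position (a simplification).
function chromaticDistance(from, to) {
  return from.reduce((sum, note, i) => sum + Math.abs(note - to[i]), 0);
}

function rankVoicings(previous, candidates) {
  // "The less you move, the more you groove": the lowest total
  // movement from the previous chord ranks first.
  return [...candidates].sort(
    (a, b) => chromaticDistance(previous, a) - chromaticDistance(previous, b)
  );
}
```

Sorting a handful of G7 voicings against a chosen Dm7 voicing would then surface the candidates whose voices move the fewest semitones, exactly the ones the heat map darkens.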

To get a grasp on this, let’s consider an example.

Let’s pretend we’re trying to play a ii-V-I progression on the guitar in the key of C. When we tell Glorious Voice Leader that our first chord will be a Dm7, it gives us a heat map of the various initial voicings to choose from:

With this initial chord, darker notes in the heat map are used more frequently by the generated voicings, and lighter notes are used more rarely. Click on the notes of the Dm7 voicing you want to start with.

Once we’ve told Glorious Voice Leader where to start, we can tell it where we want to go next. In our case, our next chord will be a G7. Here’s where things get interesting. Glorious Voice Leader generates all possible G7 voicings, and ranks them according to how well they lead from the Dm7 we just picked out.

Pick out a G7 voicing with darkened notes:

Now we tell Glorious Voice Leader that we want to end our progression with a Cmaj7 chord.

Choose your Cmaj7 voicing:

That’s it! With Glorious Voice Leader’s help, we’ve come up with an entire ii-V-I chord progression. Grab yourself a guitar and play through the whole progression. I’m willing to bet it sounds pretty nice.

For this example, we’ve embedded a small, reluctant version of Glorious Voice Leader directly into this page. Check out the above example in its full-fledged glory at the Glorious Voice Leader website. If you’re eager for another example, here’s the entire series of diatonic seventh chords descending in fourths, as suggested by Glorious Voice Leader.

If you find this interesting, be sure to give Glorious Voice Leader a try and let me know what you think! Expect more features and write-ups in the near future.


September 29, 2019

Carlos Fenollosa (carlesfe)

checkm8: What you need to know to keep your iPhone safe September 29, 2019 06:12 PM

A couple of days ago, Twitter user axi0mX introduced checkm8, a permanent, unpatchable bootrom exploit for iPhones 4S to X.

The jailbreak community celebrated this great achievement, the netsec community was astounded at the scope of this exploit, and regular users worried what this meant for their phone's security.

Even though I've jailbroken my iPhone in the past, I have no interest in doing it now. If you want to read about the implications for the jailbreak community, join the party on /r/jailbreak

I have been reading articles on the topic to understand what the implications are for regular people's security and privacy. My whole family has A9 iPhones, which are exploitable, and I wanted to know whether our data was at risk and, if so, what we could do to mitigate attacks.

I think the best way to present the findings is with a FAQ so people can understand what's going on.

1-Line TL;DR

If you have an iPhone 4s, 5, or 5c, somebody who has physical access to your phone can get all the data inside it. If your phone is more modern and the attacker doesn't know your password, they can still install malware, but rebooting your phone makes it safe again.

What is Jailbreak?

Your iPhone is controlled by Apple. You own it, but you are limited in what you can do with it.

Some people like this approach, others prefer to have total control of their phone.

A jailbreak is a way of breaking these limitations so you can 100% control what's running on your phone.

The goal of jailbreaking is not necessarily malicious. In fact, the term "jailbreak" has the connotation that the user is doing it willingly.

However, the existence of a jailbreak method means that an attacker could use this same technique to compromise your phone. Therefore, you must understand what is going on and how to protect yourself from these attackers.

Jailbreaking has existed since the first iPhone. Why is this one different?

Typically, jailbreaking methods exploit a software bug. This means that Apple can (and does) fix that bug in the next software release, negating the method and any related security issues.

This method, however, exploits a hardware bug in the bootrom. The bootrom is a physical chip in your iPhone with some commands literally hard-wired into it. Apple cannot fix the bug without replacing the chip, which is infeasible.

Therefore, it is not possible to fix this bug, and it will live with your phone until you replace it.

These kinds of bugs are very rare. This exact one has already been patched on recent phones (XS and above), and it had been a long time since the last one was found.

☑ This bug is extremely rare and that is why it's important to know the consequences.

How can an attacker exploit this bug? Can I be affected by it without my knowledge?

This exploit requires an attacker to connect your phone to a computer via Lightning cable.

It cannot be triggered by visiting a website, receiving an email, installing an app, or any non-suspicious action.

☑ If your phone never leaves your sight, you are safe.

I left my phone somewhere out of sight. Could it be compromised?

Yes. However, if you reboot your phone, it goes back to safety. The exploit does not persist across reboots, at least at this point in time. If that changes, this text will be updated to reflect it.

Any virus or attack vector will be uninstalled or disabled by Apple's usual protections after a reboot.

If you feel that you are targeted by a resourceful attacker, read below "Is there a feasible way to persist the malware upon reboot?"

☑ If you are not sure about the safety of your phone, reboot it.

Can my personal data be accessed if an attacker gets physical access to my phone?

For iPhones 4S, 5 and 5c, your data may be accessed regardless of your password. For iPhones 5s and above (6, 6s, SE, 7, 8, X), your data is safe as long as you have a strong password.

If you have an iPhone 4s, 5, or 5c, anybody with physical access to your phone will have access to its contents if your password is weak (a 4- to 8-digit PIN code, or an alphanumeric code of fewer than 8 characters).

If your iPhone 4s/5/5c has a strong password, and the attacker does not know it and cannot guess it, they may need a long time (months to years) to extract the data. This attack therefore cannot be run in a scenario where the phone leaves your sight for a few minutes and you get it back quickly afterwards. However, if your iPhone 4s/5/5c is stolen, assume that your data is compromised.

It is unknown if this exploit allows the attacker to guess your password quicker than a "months to years" period on older iPhones.
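To put those time scales in perspective, here is a back-of-the-envelope calculation. The guesses-per-second rate is an illustrative assumption on my part, not a measured figure for any iPhone:

```javascript
// Back-of-the-envelope brute-force estimates. The guess rate is an
// illustrative assumption, not a measured figure for any device.
function secondsToGuessAll(keyspace, guessesPerSecond) {
  return keyspace / guessesPerSecond;
}

const fourDigitPins = Math.pow(10, 4);   // 10,000 combinations
const eightDigitPins = Math.pow(10, 8);  // 100,000,000 combinations
const alnum8 = Math.pow(62, 8);          // 8 chars of [a-zA-Z0-9]

// At an assumed 10 guesses per second, a 4-digit PIN lasts under
// 17 minutes, while 8 alphanumeric characters lasts far longer than
// a human lifetime.
const SECONDS_PER_YEAR = 60 * 60 * 24 * 365;
const yearsForAlnum8 = secondsToGuessAll(alnum8, 10) / SECONDS_PER_YEAR;
```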

iPhones 5s and above have a separate chip called the Secure Enclave which manages access to your personal data. Your data is encrypted on the device and cannot be accessed without your password. The Secure Enclave does not store your password; it uses cryptography to derive the decryption key from it.

If you have an iPhone 5s and above, an attacker can only access your data if they know, or can easily guess, your password.

☑ Use a strong password (>8 alphanumeric characters) that an attacker cannot guess.

Can it be used to disable iCloud lock, and therefore re-use stolen phones?

It is unknown at this point.

Assuming the scenario where iCloud lock is not broken, and the Secure Enclave is not affected, what is the worst that can happen to my phone?

You may suffer a phishing attack: the attacker installs a fake login screen on your iPhone, or replaces the OS with an exact copy that works as expected but also sends all your keystrokes and data to the attacker.

The fake environment may be indistinguishable from the real one. If you are not aware of this attack, you will fall for it.

Fortunately, this malware will be purged or disabled upon reboot.

All phones (4s to X) are vulnerable to this attack.

☑ Always reboot your phone if you think it may be compromised.

Is there a feasible way to persist the malware upon reboot?

Unlikely. The jailbreak is tethered, which means that the phone must be connected to a computer every time it boots.

However, somebody may develop a tiny device that connects to the Lightning port of the iPhone and conveniently injects code/malware every time it is rebooted.

This device may be used on purpose by jailbreakers for convenience (i.e. a Lightning-USB key, or a small computer), or installed inadvertently by a sophisticated attacker (i.e. a phone case that interposes on the Lightning port without the victim knowing).

In most cases, this external device will be easy to spot even to the untrained eye.

An extremely sophisticated attacker may develop a custom chip that is connected internally to the Lightning port of the iPhone and runs the malware automatically and invisibly. To do so, they would need physical access to your phone for around 10 minutes, the time it takes to open the phone, solder the new chip, and close it again.

☑ Watch out for unexpected devices connected to your Lightning port

Who are these "attackers" you talk about?

Three-letter agencies (NSA, FBI, KGB, Mossad...) and also private companies who research their own exploits (Cellebrite, Greyshift) to sell them to the former.

It is entirely possible that the above already knew about this exploit, however.

Other attackers may be regular thieves, crackers, pranksters, or anybody interested in developing a virus for the iPhone.

If you are a regular user who is not the target of a Government or Big Criminal, remember:

  1. Don't let people connect your iPhone to an untrusted device
  2. Otherwise, reboot it when you get it back
  3. Watch out for small devices on your Lightning port


Tags: apple, security


Famous public figure in tech suffers the consequences for asshole-ish behavior September 29, 2019 12:05 PM

This last month, a very famous computer guy, who regularly appears in public and has amassed a cult-like following, has been forced to step down due to pressure from journalists.

Let's make a list of all the unacceptable behaviors of Computer Guy:

Living in his office and disgusting smell

He is not really homeless, but Computer Guy used to sleep in his office.

Coworkers and friends reported that he reeked and would avoid contact with him.

Sexually harassing employees

It has been reported, even on video, that Computer Guy made inappropriate sexual utterances to his colleagues.

Of course, Computer Guy denied it.

Drug intake

Computer Guy is a known hippie, I mean, just look at his appearance.

He is not ashamed to admit that he has taken illegal drugs and that they are an important part of his life.

Psychological abuse to women

Not many people know this, but Computer Guy has a daughter whom he denied for a long time.

Computer Guy basically abandoned his former partner who was pregnant with their daughter, denied her alimony, and even abused the child psychologically when she was 9.

Keeping payments from group projects for himself

In one of his projects, Computer Guy profited more than he had earned by lying to colleagues. Instead of fairly distributing the money from a project, he decided to take most of it for himself.

In a similar case, he denied fair compensation to an old friend of his.

Bad temper

All these examples can be summarized as: Computer Guy is an asshole who must be taken down.

Even though Computer Guy did nothing technically illegal, being such a big asshole must not be acceptable in our society and the right thing to do is to pressure him to resign from his public positions.


Since mobs don't read the news, only the headlines, and I don't want any association with any of the parties in this drama, I think I must write the non-snarky interpretation of the events.

Of course, the headlines above are about Steve Jobs, not Richard Stallman.

I only had one goal with this piece: to reflect on the double standards in society.

Being an asshole is acceptable if you are a respected powerful businessman. You are portrayed as a quirky millionaire. However, it is not acceptable if you're a contrarian weird hippie. You are portrayed as a disgusting creep.

I obviously have no interest or authority to defend or justify their actions. They're adults and their behavior is their own. Screw their asshole-ism. They should have been better people. Stallman is a stubborn asshole, Jobs was an even bigger stubborn asshole.

The truth is, there is a strong correlation between being a powerful public figure and being a stubborn asshole. This is because, past a certain point, non-assholes quit the race; they are not willing to pay the toll it takes to be at the top. That is unfortunate, and we should definitely push for respectful leaders.

Why did two independent journalists take Stallman down, and not Jobs, or any of the other assholes in the world?

Probably, because they could.

It's their right to free speech, and ultimately it was a consequence of Stallman's actions. And I can't reflect on whether it's fair or good that Stallman is forced to step down, because I'm not smart enough to foresee the positive or negative consequences. So maybe after a few months we all realize it was the right thing to do, and end this discussion once and for all.

However, one thing is still true, again, the only point that should be taken from this article: to hell with double standards when representing public figures.


Not that it matters for this article, and it's outside the scope of my point, but I want to share my personal vision on Stallman and Jobs. The thing is, this was a difficult article to write. They are both people who I strongly admire and have had a great influence in my life.

Reading Stallman's essays is what got me into Free Software. I have attended his conferences twice, and his brave stance on freedom and privacy is flawless and admirable. I have a small laptop that Stallman signed, and many of his books. He has constantly fought for the rights of the people against corporations. I hope he keeps doing it.

The world is a better place thanks to Stallman.

Jobs was an inspiration. I own most books about him, an Apple "Think Different" poster is hanging at my office, and I treasure the issue Time released after his death. He was a genius, a visionary; he basically invented consumer computers and smartphones. I do not doubt that the contributions of Woz and other people at Apple were instrumental, but he was the mastermind behind the strategy. What Jobs achieved with his work is beyond belief and 100% worthy of praise.

The world is a better place thanks to Jobs.

If you want more context about the actual facts, I wrote about the news a week ago.

Tags: news


September 28, 2019

Frederik Braun (freddyb)

Remote Code Execution in Firefox beyond memory corruptions September 28, 2019 10:00 PM

This is the blog post version of my presentation from OWASP Global AppSec in Amsterdam 2019. It was presented in the AllStars Track.


Browsers are complicated enough to have an attack surface beyond memory safety issues. This talk will look into injection flaws in the user interface of Mozilla Firefox, which is implemented in JS, HTML, and an XML dialect called XUL. With a Cross-Site Scripting (XSS) vulnerability in the user interface, attackers can execute arbitrary code in the context of the main browser application process. This allows for cross-platform exploits of high reliability. The talk discusses past vulnerabilities and also suggests mitigations that benefit Single Page Applications and other platforms that may suffer from DOM-based XSS, like Electron.


(This is the part, where we reduce the lighting and shine a flashlight into my face)

Listen well, young folks. Old people, browser hackers, or Mozilla fanboys might use this as an opportunity to lean back and stroke their mighty neckbeards, as they have heard all of this before.

It was the year 1997, and people thought XML was a great idea. In fact, it was so much better than its warty and unparseable predecessor, HTML. While XHTML was the clear winner and successor for great web applications, it was obvious that XML would also make a great user interface markup language for a powerful cross-platform toolkit dialect. This folly marks the hour of birth of XUL. XUL was created as the XML User Interface Language at Netscape (the company where the Mozilla source code originated; long story, and the younger folks might want to read up on Wikipedia or watch the amazing movie "Code Rush", which is available online). Jokingly, XUL was also a reference to the classic 1984 movie Ghostbusters, in which an evil deity called Zuul (with a Z) possesses innocent people.

Time went by, and XUL did not take off as a widely-recognized standard for cross-platform user interfaces. Firefox has almost completely moved away from XUL and re-implemented many parts in HTML. Aptly named after an evil spirit, XUL, as we will see, still haunts us today.

Mapping the attack surface

Let's look into Firefox to find some remnants of XUL by visiting some internal pages. Open about:preferences in a new tab (I won't link to it here, for various good reasons), then either look at the source code using the Developer Tools (right-click, "Inspect Element") or browse the source of Firefox Nightly using Mozilla's online source code search.

We can also open the developer console and poke around with the obscure objects and functions that are available for JavaScript in privileged pages. As a proof-of-concept, we may alert(Components.stack), which gives us a stringified JavaScript call stack - notably this is a JavaScript object that is left undefined for normal web content.

Inspecting the source code, we also see markup that screams both XML and XML-dialect. While still in our information-gathering phase, we will not go too deep, but we make note of two observations:

  • XUL is not HTML. To get a better understanding of elements like <command>, <colorpicker> or <toolbar>, we can look at the XUL Reference on MDN.
  • XUL is scriptable! A <script> tag exists, and it may contain JavaScript.

There are also some newer pages like about:crashes, which holds previously submitted (or unsubmitted) crash reports. Whether those internal pages are written in (X)HTML or XUL, most of the interactive parts are written in JavaScript. I suppose most of you will by now understand that we are looking for Cross-Site Scripting (XSS) vulnerabilities in the browser interface. What's notable here is that such an XSS bypasses the sandbox.

As an aside, the page behind about:cache is actually implemented in C++ that emits HTML-ish markup.

Let's start with search and grep

Equipped with the right kind of knowledge and a craving for a critical Firefox bug under my name, I started using our code search more smartly. Behold:

Search: .innerHTML =

Number of results: 1000 (maximum is 1000)

Hm. Excluding test files.

Search: innerHTML =

Number of results: 414

That's still a lot. And that's not even all kinds of XSS sinks. I would also look for outerHTML, insertAdjacentHTML, and friends.
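For illustration only, a regular expression covering those sinks might look something like the sketch below. This is my approximation, not the actual expression used against the Firefox tree:

```javascript
// Sketch of a grep pattern for common DOM XSS sinks; an
// approximation for illustration, not the real search expression.
const sinkPattern =
  /(?:\.(?:inner|outer)HTML\s*\+?=|\.insertAdjacentHTML\s*\(|document\.write(?:ln)?\s*\()/;

// Matches assignments to innerHTML/outerHTML (including +=) and
// calls to insertAdjacentHTML / document.write / document.writeln.
```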

Search: (a long and hairy regular expression that tries to find more than innerHTML)

Number of results: 997

That's bad. Let's try to be smarter!

JavaScript Parsing - Abstract Syntax Trees. ESLint to the rescue!

I've actually dabbled in this space for a while. That would be another talk, but a less interesting one. So I'll skip ahead and tell you that I wrote an ESLint plugin that analyzes JavaScript files to look for the following:

  1. Checking the right-hand side in assignments (+, +=) where the left part ends with either innerHTML or outerHTML.
  2. Checking the first argument in calls to document.write(), document.writeln(), eval and the second argument for insertAdjacentHTML.

For both, we'll check whether they contain a variable. String literals or empty strings are ignored. The plugin is available as eslint-plugin-no-unsanitized and allows configuration to detect and ignore built-in escape and sanitize functions. If you're worried about DOM XSS, I recommend you check it out.

Discovered Vulnerabilities

Using this nice plugin to scan all of Firefox yields a handy total of 32 matches. We create a spreadsheet and audit all of them by hand, following long call chains with unclear input values, patterns that escape HTML close to the final innerHTML or upon creation, and values extracted from databases (like the browsing history) that are escaped upon insertion.

Many nights later

A first bug appears

Heureka! This sounds interesting:

  let html = `
    <div style="flex: 1;
                display: flex;
                padding: ${IMAGE_PADDING}px;
                align-items: center;
                justify-content: center;
                min-height: 1px;">
      <img class="${imageClass}"
           src="${imageUrl}"/>       <----- boing
  // …
  div.innerHTML = html;

When hovering over markup that points to an image in the web developer tools, a tooltip helpfully preloads and shows the image for the web developer to enjoy. Unfortunately, that URL is not escaped.

Firefox Developer Tools Inspector opening images in a tooltip when hovering an image element's source attribute

Writing the exploit

After spending a few sleepless nights on this, I didn't get anything beyond an XML-conformant proof of concept of <button>i</button>. At some point I filed the bug as sec-moderate, i.e., "this is almost bad, but likely needs another bug to be actually terrible". I wrote:

I poked a bit again and I did not get further than <button>i</button> for various reasons … In summary: I'd be amazed to see if someone else gets any farther.

A few nights later, I actually came up with an exploit that breaks the existing syntax while staying XML conformant. We visit an evil web page that looks like this:

<img src='data:bb"/><button><img src="x" onerror="alert(Components.stack)" /></button><img src="x'>

The image URL that is used in the vulnerable code spans all the way from data: to the closing single quote at the end. Our injection alerts Components.stack, which indicates that we have left the realms of mortal humans.
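Replaying the interpolation in isolation makes the break-out visible. Here imageUrl is the attacker-controlled value from the evil page above, and the template line mirrors the shape of the vulnerable snippet:

```javascript
// The attacker-controlled src attribute from the evil page.
const imageUrl =
  'data:bb"/><button><img src="x" onerror="alert(Components.stack)" /></button><img src="x';

// The vulnerable code interpolates it into markup without escaping.
const html = `<img src="${imageUrl}"/>`;

// The double quote in the URL closes the src attribute early, so the
// attacker's <button> and onerror handler become real markup.
```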

This is Bug 1372112 (CVE-2017-7795). Further hikes through our spreadsheets of eslint violations lead to Bug 1371586 (CVE-2017-7798). Both were fixed in Firefox 56, which was released in the fall of 2017.

We find and fix some minor self-XSS bugs (e.g., creating a custom preference in about:config with the name <button>hi</button> led to XUL injections). All of them are fixed, and we're fearful that mistakes will be made again.

Critical bugs are a great way to influence coding-style discussions, and it is decided that the rule might as well be included in our default set of linters. innerHTML and related badness is forbidden, and we rub our hands in glee. Unfortunately, it turns out that lots of legacy code will not be rewritten, and security engineers do not want to deal with the affairs of front-end engineers (joke's on me in the end, though, I promise). So we allow some well-audited and finely escaped functions with a granular exception, which gives us a confident feeling of absolute security (it's a trap!):

// eslint-disable-next-line no-unsanitized/property

A Dark Shadow

I feel like I have eradicated the bug class from the entirety of our codebase. We may now look for more complicated bugs and our days get more exciting.

Of course, I wander through the office bragging with my cleverness, warning young folks from the danger of XSS and proudly wearing my security t-shirts. There's lots of colorful war stories to be told and even more free snacks or fizzy drinks to be consumed.

Meanwhile, my great colleagues contribute and actually develop useful stuff. On top of their good work, some of them even mentor aspiring students and enthusiastic open source fans. Having listened to my stories of secure and well-audited code that should eventually be replaced, they make an effort to get someone to remove all of the danger, so we can live in an exception-less world that truly disallows all unsanitized sinks, without these pesky eslint-disable-next-line comments.

Naturally, code is being moved around, refactored and improved by lots of other people in the organization.

So, while I'm sitting there, enjoying my internet fame (just browsing memes, really), people show up at my desk asking me for a quick look at something suspicious:

// eslint-disable-next-line no-unsanitized/property
doc.getElementById("addon-webext-perm-header").innerHTML = strings.header;

// data coming *mostly* from localization templates
let strings = {
  header: gNavigatorBundle.getFormattedString("webextPerms.header", []),
  text: gNavigatorBundle.getFormattedString("lwthemeInstallRequest.message2",
  // ..
};
// but of course all goes through _sanitizeTheme(aData, aBaseURI, aLocal)
// (which does not actually sanitize HTML)

I feel massively stupid and re-create my spreadsheet. Setting ESLint to ignore the disable-next-line comments locally allows me to start all over. We build an easy exploit that pops calc. How funny! We also notice that a few more bugs like this have crept in since the "safe" call sites were whitelisted. Yikes.

Having learned about XML namespaces, a simpler example payload (without the injection trigger) would look like this:

<html:img onerror='Components.utils.import("resource://gre/modules/Subprocess.jsm");{ command: "/usr/bin/gnome-open" });' src='x'/>

This is Bug 1432778.

Hope on the horizon

A good patch is made and circulated with a carefully selected group of senior engineers. We have various people working on the code and are concerned about this being noticed by bad actors. With the help of the aforementioned group, we convince engineering leadership that this warrants an unscheduled release of Firefox. We start a simplified briefing for Release Management and QA.

People point out that updates always take a while to reach our entire user base, and shipping a new version with a single commit that replaces .innerHTML with .textContent seems a bit careless. Anyone with a less-than sign on their keyboard could write a "1-day exploit" that would affect lots of users.

What can we do? We agree that DOM XSS deserves a heavier hammer and change our implementation of HTML parsing (which backs innerHTML, outerHTML, insertAdjacentHTML, etc.). Normally, this function parses the markup into a DOM tree and inserts it where assigned. But now, for privileged JavaScript, we parse the DOM tree and omit all kinds of badness before insertion. Luckily, we have something like that in our source tree; in fact, I tested it in 2013. We also use it to strip <script> and its friends from HTML email in Thunderbird, so it's even battle-tested. On top, we do some additional manual testing and identify some problems around leaving form elements in, which warrants follow-up patches in the future.

A nice benefit is that a commit which changes how DOM parsing works, doesn't allow reverse engineering our vulnerability from the patch. Neat.

In the following cycles, we were able to make the sanitizer stricter and remove more badness (e.g., form elements). This was Bug 1432966: Sanitize HTML fragments created for chrome-privileged documents (CVE-2018-5124).

Closing credits and Acknowledgements

Exploitation and remediation were achieved with the support of various people. Thanks to security folks, Firefox engineers, release engineers, and QA testers. Especially to Johnathan Kingston (co-maintainer of the eslint plugin) and Johann Hofman, who found the bad 0-day in 2018 and helped with testing, shaping, and arguing for an unscheduled release of Firefox.

No real geckos were harmed in the making of this blog post.

September 27, 2019

Carlos Fenollosa (carlesfe)

The absolute best puzzle game for your phone September 27, 2019 01:23 PM

Sorry for the clickbaity title. I was slightly misleading.

Simon Tatham's Portable Puzzle Collection is not only the best puzzle game for your phone, it is actually a collection of the best puzzle games.

Wait! It is actually the best puzzle game collection for any device, since all games are playable via web (js and *cough* Java), and there are native binaries for Windows and UNIX.

In Simon's own words:

[This is] a collection of small computer programs which implement one-player puzzle games. All of them run natively on Unix (GTK), on Windows, and on Mac OS X. They can also be played on the web, as Java or Javascript applets.

I wrote this collection because I thought there should be more small desktop toys available: little games you can pop up in a window and play for two or three minutes while you take a break from whatever else you were doing.

Simon's collection consists of very popular single player puzzle games, like Sudoku, Minesweeper, Same Game, Pegs, and Master Mind, and some lesser known, at least for me, but extremely fun to play: Pattern, Signpost, Tents, Unequal.

All games are extremely configurable and can usually be learned by reading the instructions and trying to play on a small board where the solution is usually trivial. Then, when you are ready, start expanding the board size and enabling some of the higher difficulty board generators!

Greg Hegwill ported the games to iOS and Chris Boyle ported them to Android. Other people have ported the collection to more platforms, like Palm, Symbian, or Windows Phone.

The games can of course run in old devices, are 3x free (money free, free software, and free of ads) and, as context, each game takes around 300kb of space (yes, kb). The full collection weighs 3.5Mb on iOS. For reference, Simon's mines.exe is 295kb, where Windows 3.1's winmine.exe was 28kb.

Simon's last commit is from April 2019 and he is still adding improvements to make the games more fun.

I don't really know how this could not be the Best Puzzle Game Ever. Download it right now on your phone (iOS, Android) and you'll thank me later.

Tags: software, mobile, retro

Comments? Tweet

Scott Sievert (stsievert)

Better and faster hyperparameter optimization with Dask September 27, 2019 05:00 AM

Dask’s machine learning package, Dask-ML, now implements Hyperband, an advanced “hyperparameter optimization” algorithm that performs rather well. This post will

  • describe “hyperparameter optimization”, a common problem in machine learning
  • describe Hyperband’s benefits and why it works
  • show how to use Hyperband via example alongside performance comparisons

In this post, I’ll walk through a practical example and highlight key portions of the paper “Better and faster hyperparameter optimization with Dask”, which is also summarized in a ~25 minute SciPy 2019 talk.


Machine learning requires data, an untrained model and “hyperparameters”, parameters that are chosen before training begins that help with cohesion between the model and data. The user needs to specify values for these hyperparameters in order to use the model. A good example is adapting ridge regression or LASSO to the amount of noise in the data with the regularization parameter.1

Model performance strongly depends on the hyperparameters provided. A fairly complex example is with a particular visualization tool, t-SNE. This tool requires (at least) three hyperparameters and performance depends radically on the hyperparameters. In fact, the first section in “How to Use t-SNE Effectively” is titled “Those hyperparameters really matter”.

Finding good values for these hyperparameters is critical and has an entire Scikit-learn documentation page, “Tuning the hyperparameters of an estimator.” Briefly, finding decent values of hyperparameters is difficult and requires guessing or searching.

How can these hyperparameters be found quickly and efficiently with an advanced task scheduler like Dask? Parallelism will pose some challenges, but the Dask architecture enables some advanced algorithms.

Note: this post presumes knowledge of Dask basics. This material is covered in Dask’s documentation on Why Dask?, a ~15 minute video introduction to Dask, a video introduction to Dask-ML and a blog post I wrote on my first use of Dask.


Dask-ML can quickly find high-performing hyperparameters. I will back this claim with intuition and experimental evidence.

Specifically, this is because Dask-ML now implements an algorithm introduced by Li et. al. in “Hyperband: A novel bandit-based approach to hyperparameter optimization”. The pairing of Dask and Hyperband enables some exciting new performance opportunities, especially because Hyperband has a simple implementation and Dask is an advanced task scheduler.2

Let’s go through the basics of Hyperband then illustrate its use and performance with an example. This will highlight some key points of the corresponding paper.

Hyperband basics

The motivation for Hyperband is to find high performing hyperparameters with minimal training. Given this goal, it makes sense to spend more time training high performing models – why waste more training time on a model if it has performed poorly in the past?

One method to spend more time on high performing models is to initialize many models, start training all of them, and then stop training low performing models before training is finished. That’s what Hyperband does. At the most basic level, Hyperband is a (principled) early-stopping scheme for RandomizedSearchCV.
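The early-stopping loop can be sketched in a few lines. Below is a toy version of one round of “successive halving”; the `partial_fit`/`score` interface loosely mirrors Scikit-learn's, but the function itself is an illustration, not Dask-ML's implementation:

```python
def successive_halving(models, eta=3, budget_per_round=3):
    """Hyperband's core loop, sketched: train every surviving model a
    little, keep only the top 1/eta by score, repeat until one remains.
    `models` are objects with toy partial_fit()/score() methods."""
    while len(models) > 1:
        for m in models:
            for _ in range(budget_per_round):
                m.partial_fit()
        # rank by current score and cull the weakest models
        models = sorted(models, key=lambda m: m.score(), reverse=True)
        models = models[: max(1, len(models) // eta)]
    return models[0]
```

Low-scoring models get dropped after only a few `partial_fit` calls, so most of the training budget is spent on the promising ones.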

Deciding when to stop the training of models depends on how strongly the training data affects the score. There are two extremes:

  1. when only the training data matter
    • i.e., when the hyperparameters don’t influence the score at all
  2. when only the hyperparameters matter
    • i.e., when the training data don’t influence the score at all

Hyperband balances these two extremes by sweeping over how frequently models are stopped. This sweep allows a mathematical proof that Hyperband will find the best model possible with minimal partial_fit calls3.
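The sweep itself is compact. Here is a sketch of the bracket schedule from the Hyperband paper (my own transcription of its setup, not Dask-ML code; `R` is the paper's maximum budget per model):

```python
from math import ceil

def hyperband_brackets(R, eta=3):
    """Enumerate Hyperband's brackets: each starts `n` models with an
    initial budget `r` each, then runs successive halving. Sweeping s
    from aggressive (many models, tiny budget) to conservative (few
    models, full budget) is what balances the two extremes above."""
    s_max = 0
    while eta ** (s_max + 1) <= R:               # s_max = floor(log_eta(R))
        s_max += 1
    brackets = []
    for s in range(s_max, -1, -1):
        n = ceil((s_max + 1) * eta**s / (s + 1))  # initial number of models
        r = R // eta**s                           # initial budget per model
        brackets.append((n, r))
    return brackets
```

For the paper's example of `R=81, eta=3` this yields five brackets, from 81 models trained briefly down to 5 models trained to completion.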

Hyperband has significant parallelism because it has two “embarrassingly parallel” for-loops – Dask can exploit this. Hyperband has been implemented in Dask, specifically in Dask’s machine learning library Dask-ML.

How well does it perform? Let’s illustrate via example. Some setup is required before the performance comparison in Performance.


Note: want to try HyperbandSearchCV out yourself? Dask has an example use. It can even be run in-browser!

I’ll illustrate with a synthetic example. Let’s build a dataset with 4 classes:

>>> from experiment import make_circles
>>> X, y = make_circles(n_classes=4, n_features=6, n_informative=2)
>>> scatter(X[:, :2], color=y)

Note: this content is pulled from stsievert/dask-hyperband-comparison, with slight modifications.

Let’s build a fully connected neural net with 24 neurons for classification:

>>> from sklearn.neural_network import MLPClassifier
>>> model = MLPClassifier()

Building the neural net with PyTorch is also possible4 (and what I used in development).

This neural net’s behavior is dictated by 7 hyperparameters. Only one controls the architecture of the optimal model (hidden_layer_sizes, the number of neurons in each layer). The rest control finding the best model of that architecture. Details on the hyperparameters are in the Appendix.

>>> params = ...  # details in appendix
>>> params.keys()
dict_keys(['hidden_layer_sizes', 'alpha', 'batch_size', 'learning_rate',
           'learning_rate_init', 'power_t', 'momentum'])
>>> params["hidden_layer_sizes"]  # always 24 neurons
[(24, ), (12, 12), (6, 6, 6, 6), (4, 4, 4, 4, 4, 4), (12, 6, 3, 3)]

I chose these hyperparameters to have a complex search space that mimics the searches performed for most neural networks. These searches typically involve hyperparameters like “dropout”, “learning rate”, “momentum” and “weight decay”.5 End users don’t care about hyperparameters like these; they don’t change the model architecture, only how the best model of a particular architecture is found.

How can high performing hyperparameter values be found quickly?

Finding the best parameters

First, let’s look at the parameters required for Dask-ML’s implementation of Hyperband (which is in the class HyperbandSearchCV).

Hyperband parameters: rule-of-thumb

HyperbandSearchCV has two inputs:

  1. max_iter, which determines how many times to call partial_fit
  2. the chunk size of the Dask array, which determines how much data each partial_fit call receives.

These fall out pretty naturally once it’s known how long to train the best model and very approximately how many parameters to sample:

n_examples = 50 * len(X_train)  # 50 passes through dataset for best model
n_params = 299  # sample about 300 parameters

# inputs to hyperband
max_iter = n_params
chunk_size = n_examples // n_params

The inputs to this rule-of-thumb are exactly what the user cares about:

  • a measure of how complex the search space is (via n_params)
  • how long to train the best model (via n_examples)

Notably, there’s no tradeoff between n_examples and n_params like with Scikit-learn’s RandomizedSearchCV because n_examples applies only to some models, not to all of them. There are more details on this rule-of-thumb in the “Notes” section of the HyperbandSearchCV docs.

With these inputs a HyperbandSearchCV object can easily be created.

Finding the best performing hyperparameters

This model selection algorithm Hyperband is implemented in the class HyperbandSearchCV. Let’s create an instance of that class:

>>> from dask_ml.model_selection import HyperbandSearchCV
>>> search = HyperbandSearchCV(
...     model, params, max_iter=max_iter, aggressiveness=4
... )

aggressiveness defaults to 3. aggressiveness=4 is chosen because this is an initial search: I know nothing about this search space, so the search should be more aggressive in culling off bad models.

Hyperband hides some details from the user (which enables the mathematical guarantees), specifically the details on the amount of training and the number of models created. These details are available in the metadata attribute:

>>> search.metadata["n_models"]
>>> search.metadata["partial_fit_calls"]

Now that we have some idea on how long the computation will take, let’s ask it to find the best set of hyperparameters:

>>> from dask_ml.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(X, y)
>>> X_train = X_train.rechunk(chunk_size)
>>> y_train = y_train.rechunk(chunk_size)
>>>, y_train)

The dashboard will be active during this time6:


How well do these hyperparameters perform?

>>> search.best_score_

HyperbandSearchCV mirrors Scikit-learn’s API for RandomizedSearchCV, so it has access to all the expected attributes and methods:

>>> search.best_params_
{"batch_size": 64, "hidden_layer_sizes": [6, 6, 6, 6], ...}
>>> search.score(X_test, y_test)
>>> search.best_model_

Details on the attributes and methods are in the HyperbandSearchCV documentation.


Performance

I ran this 200 times on my personal laptop with 4 cores. Let’s look at the distribution of final validation scores:

The “passive” comparison is really RandomizedSearchCV configured so it takes an equal amount of work as HyperbandSearchCV. Let’s see how this does over time:

This graph shows the mean score over the 200 runs with the solid line, and the shaded region represents the interquartile range. The dotted green line indicates the data required to train 4 models to completion. “Passes through the dataset” is a good proxy for “time to solution” because there are only 4 workers.

This graph shows that HyperbandSearchCV will find parameters at least 3 times quicker than RandomizedSearchCV.

Dask opportunities

What opportunities does combining Hyperband and Dask create? HyperbandSearchCV has a lot of internal parallelism and Dask is an advanced task scheduler.

The most obvious opportunity involves job prioritization. Hyperband fits many models in parallel and Dask might not have that many workers available. This means some jobs have to wait for other jobs to finish. Of course, Dask can prioritize jobs7 and choose which models to fit first.

Let’s assign the priority for fitting a certain model to be the model’s most recent score. How does this prioritization scheme influence the score? Let’s compare the prioritization schemes in a single run of the 200 above:

These two lines are the same in every way except for the prioritization scheme. This graph compares the “high scores” prioritization scheme and the Dask’s default prioritization scheme (“fifo”).

This graph is certainly helped by the fact that it is run with only 4 workers. Job priority does not matter if every job can be run right away (there’s nothing to assign priority to!).
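The “high scores” policy above can be sketched with a heap. Dask-ML does this by setting task priorities on the scheduler itself; this toy version is only an illustration of the policy:

```python
import heapq

def fit_in_priority_order(models, budget):
    """Toy 'highest recent score first' scheduler: each round, refit
    the model with the best current score. heapq is a min-heap, so
    scores are negated; the index breaks ties deterministically."""
    heap = [(-m.score(), i, m) for i, m in enumerate(models)]
    heapq.heapify(heap)
    order = []
    for _ in range(budget):
        _, i, m = heapq.heappop(heap)
        m.partial_fit()                      # train the current leader
        order.append(i)
        heapq.heappush(heap, (-m.score(), i, m))
    return order
```

With a limited worker pool, this means the model most likely to win is always at the front of the queue.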

Amenability to parallelism

How does Hyperband scale with the number of workers?

I ran another separate experiment to measure this. This experiment is described in more detail in the corresponding paper, but the relevant difference is that a PyTorch neural network is used through skorch instead of Scikit-learn’s MLPClassifier.

I ran the same experiment with a different number of Dask workers.8 Here’s how HyperbandSearchCV scales:

Training one model to completion requires 243 seconds (which is marked by the white line). This is a comparison with patience, which stops training models if their scores aren’t increasing enough. Functionally, this is very useful because the user might accidentally specify n_examples to be too large.

It looks like the speedups start to saturate somewhere between 16 and 24 workers, at least for this example. Of course, patience doesn’t work as well for a large number of workers.9

Future work

There are some ongoing pull requests to improve HyperbandSearchCV. The most significant of these involves tweaking some Hyperband internals so HyperbandSearchCV works better with initial or very exploratory searches (dask/dask-ml #532).

The biggest improvement I see is treating dataset size as the scarce resource that needs to be preserved instead of training time. This would allow Hyperband to work with any model, instead of only models that implement partial_fit.

Serialization is an important part of the distributed Hyperband implementation in HyperbandSearchCV. Scikit-learn and PyTorch can easily handle this because they support the Pickle protocol10, but Keras/Tensorflow/MXNet present challenges. The use of HyperbandSearchCV could be increased by resolving this issue.


Appendix

I chose to tune 7 hyperparameters, which are

  • hidden_layer_sizes, which controls the number of neurons in each layer
  • alpha, which controls the amount of regularization

More hyperparameters control finding the best neural network:

  • batch_size, which controls the number of examples the optimizer uses to approximate the gradient
  • learning_rate, learning_rate_init, power_t, which control some basic hyperparameters for the SGD optimizer I’ll be using
  • momentum, a more advanced hyperparameter for SGD with Nesterov’s momentum.
  1. Which amounts to choosing alpha in Scikit-learn’s Ridge or LASSO 

  2. To the best of my knowledge, this is the first implementation of Hyperband with an advanced task scheduler 

  3. More accurately, Hyperband will find close to the best model possible with $N$ partial_fit calls in expected score with high probability, where “close” means “within log terms of the upper bound on score”. For details, see Corollary 1 of the corresponding paper or Theorem 5 of Hyperband’s paper

  4. through the Scikit-learn API wrapper skorch 

  5. There’s less tuning for adaptive step size methods like Adam or Adagrad, but they might under-perform on the test data (see “The Marginal Value of Adaptive Gradient Methods for Machine Learning”) 

  6. But it probably won’t be this fast: the video is sped up by a factor of 3. 

  7. See Dask’s documentation on Prioritizing Work 

  8. Everything is the same between different runs: the hyperparameters sampled, the model’s internal random state, the data passed for fitting. Only the number of workers varies. 

  9. There’s no time benefit to stopping jobs early if there are infinite workers; there’s never a queue of jobs waiting to be run 

  10. “Pickle isn’t slow, it’s a protocol” by Matthew Rocklin 

September 26, 2019

Carlos Fenollosa (carlesfe)

Sourcehut, the free software development cloud September 26, 2019 01:00 PM

I've been following sourcehut for some time and realized that I hadn't talked about it yet.

Sourcehut, at a first glance, is a service that provides git repos and code/project management tools, but it's much more.

It runs CI through virtualised builds on various Linuxes and BSD, provides code review tools, tasks and third party integrations, and of course, mailing lists and wikis.

I think the landing page does a good job of explaining how it is different from other code hosting services. Especially:

  • Composable Unix-style mini-services
  • Powerful APIs and webhooks
  • Secure, reliable, and safe
  • Absolutely no tracking or advertising
  • All features work without JavaScript
  • 100% free and open source software

These bullet points are quite important: sourcehut is the web equivalent of piping UNIX commands for development and is built entirely on free software. The fact that it works without js is just a great bonus.

Sourcehut is the project of a single developer, Drew DeVault, better known as sir_cmpwn on the internet, and he's quite active on Mastodon in case you want to follow him. Amazing work!

Tags: software, unix

Comments? Tweet

September 25, 2019

eta (eta)

Designing a new chat system - federation September 25, 2019 11:00 PM

[This is post 2 about designing a new chat system. See the previous post on chat systems for more context!]

Okay, so, the previous post on chat systems seemed to generate a fair deal of discussion (in that the Matrix lead developer showed up and started refuting all of my points1…!).

Partially since I said I’d try my own hand at making something new in that post, and partially because this project is also going to be my Computing A-level Non-Examined Assessment2, I now need to actually get on with it and try and make something! So, without further ado…

Key implementation goals

From the frustrations expressed in the previous post, and my own personal experience, I believe a working chat system probably wants to include some of the following points:

  • Reliability. First and foremost, the system needs to deliver the messages to the recipient. Notifications should work, and be as timely as possible. The system should avoid just dropping messages on the floor without any indication as to this happening.
    • If messages cannot be delivered, the system should make this abundantly clear to the user, so they know about it and can try again.
    • Read / delivery receipts, as well as presence3, should be taken into account.
  • Interoperability. As discussed previously, your chat system is useless if it doesn’t talk to the ones that are already out there, unless you’re really good at persuading people. There should be some provision for bridging to existing systems, and potentially some for federation as well.
    • More on this later; federation is arguably a questionable design decision, but we’ll discuss this.
  • Persistence. People nowadays have phones and computers that aren’t online 24/7. The server should account for this, and implement some sort of basic “store-and-forward” mechanism, so you can send messages to people when they’re offline.
  • Flexibility. Using an online chat system often means losing a degree of control about how you come across, and what information you send. Things like read receipts and presence should be configurable, to allow for users having different privacy preferences. Notifications should also be relatively granular, so you can avoid your phone vibrating every time someone says something in one of your many chatrooms.
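The “store-and-forward” idea in particular is simple enough to sketch. This is a toy Python illustration, not a proposal for the real implementation:

```python
from collections import defaultdict

class StoreAndForward:
    """Toy store-and-forward queue: hold messages for offline users
    and flush the backlog when they reconnect."""

    def __init__(self):
        self.online = set()
        self.pending = defaultdict(list)    # queued for offline users
        self.delivered = defaultdict(list)  # stand-in for real delivery

    def send(self, to, msg):
        if to in self.online:
            self.delivered[to].append(msg)
        else:
            self.pending[to].append(msg)    # hold until they return

    def connect(self, user):
        self.online.add(user)
        self.delivered[user].extend(self.pending.pop(user, []))
```

A real server would also need persistence across restarts and some expiry policy, but the shape is the same.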

Ideally I’d do some kind of research to prove that these goals are actually prized by your average user4 – who knows, maybe sending emoji or something is actually way more important, and users will sacrifice one or more of the above in order to get special features like that5 – but we’ll roll with this for now. (In fact, if you happen to be reading this blog and have strong opinions on the matter, please leave a comment; it’d be really helpful!) I suppose we’ll just say this blog post is a living document, and leave things at that for now.

The first one of these bullet-points I plan to tackle is interoperability – and specifically, whether this magical new ideal chat system should federate or not.

The question of federation

Federation refers to the practice of a bunch of people agreeing on a common standard, and designing software using this standard to create an open network of interconnected servers. The example cited in the previous post was Mastodon, which uses the ActivityPub standard to enable distributed social networking; Matrix is another federated standard for chat6. Federated protocols, at least at a first glance, aren’t doing too badly; this site gives an overview of various federated protocols and their uptake in terms of users.

In fact, the biggest federated protocol in the world is email – you can set up your own mail server with your own domain (e.g., and start emailing with people on any other server; email is based on a set of open standards that anyone can implement.

The spam problem

However, being federated can both be a blessing and a curse! Having your communications platform be wide open, so anybody can create an account, or federate their messages into it, or whatever, seems like a good idea at first. However, there’s another side to it: how do you control spam and abuse? Most people get spam emails nowadays, which is one of the unfortunate side effects of this openness; if anyone can connect to your mail server and send you email, without having to do anything first to confirm themselves as legitimate or trusted, you’re opening yourself up to receiving a whole bunch of junk.

There are ways to combat this – spam email checkers nowadays, like the venerable SpamAssassin7, tend to use a rule-based approach, essentially giving emails a ‘spam score’ based on various metrics of sketchiness: does the email pass SPF and DKIM checks? Does the subject line contain something like “get rich quick”? Is the sender using a commonly abused free email service, like Gmail or Yahoo Mail? (and so on and so forth). These work pretty well for the run-of-the-mill spam, but still can’t really protect against actively malicious people who conform to all the standards, and still send you spam.
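A rule-based scorer of this kind is easy to caricature in a few lines. The rules and weights below are invented for illustration; real SpamAssassin ships hundreds of calibrated rules:

```python
def spam_score(msg):
    """Toy rule-based scorer in the spirit of SpamAssassin: each
    matching rule adds its weight to the message's spam score."""
    rules = [
        (lambda m: "get rich quick" in m["subject"].lower(), 3.0),
        (lambda m: not m.get("spf_pass", False), 1.5),   # failed SPF
        (lambda m: not m.get("dkim_pass", False), 1.5),  # failed DKIM
    ]
    return sum(weight for check, weight in rules if check(msg))
```

Messages scoring above some threshold (SpamAssassin defaults to 5.0) would then be flagged as spam.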

In many chat protocols, we see a similar story. IRC has suffered greatly at the hands of robots that connect to any IRC servers they can find, join all the channels they can, and start flooding text until something notices and kicks them off (having still managed to cause a considerable amount of annoyance). The freenode IRC network suffered a recent spamwave that was actually targeted at them, with people spamming rude and unhelpful messages about the network administration until something was done about it8. Matrix looks like it’s also vulnerable to the same kind of thing as well910. Spam is a non-negotiable part of the internet, and efforts to fight it are almost as old as the internet itself!

Mastodon manages to avoid a lot of these problems (at least from what I can see) simply as a result of what it is – in something like Twitter, you don’t really get spam in your timeline unless you explicitly choose to follow the spammers, and posts only start federating from instance to instance once an actual human issues that follow request. This is in contrast to, say, a wide-open federated chatroom, where people can just join on as many anonymous clients as they want whenever they feel like flooding a channel with text.

The full-mesh problem

The other main issue with federation is somewhat more practical: if your chat system, or social network, or whatever, is now distributed across multiple servers, how do you actually shunt messages around between them in a vaguely efficient manner? So far, the answer seems to just be “give up and use a full mesh architecture” – that is, if we need to send messages out to users on 20 different servers, connect individually to each of the 20 and deliver the message. (‘what happens when you activity post’ is a good, short explainer for how this works in terms of ActivityPub, Mastodon’s federation protocol.)

This works fine for smaller use-cases, but can be somewhat problematic when it comes to either (a) large follower counts, or (b) sending big files like images. As Ted Unangst says in his honk 0.1 announcement:

I post a honk, with a picture of a delicious pickle, and it goes out to all my followers. These are small messages, great. Their instances will in turn immediately come back and download the attached picture, however, resulting in a flood of traffic. If a particularly popular follower shares the post, even bigger flood.

There’s only so much honk can do here, since even trickling out outbound notes doesn’t control what happens when another instance shares it. Ironically, I think some [Mastodon] instances are spared from overload because other instances are already overloaded and unable to immediately process inbound messages.

Essentially, the issue is twofold: firstly, when you’re sending something in a federated environment, you usually have to talk to each server in your chatroom, or with your followers on it, or whatever, to deliver your message, which takes time and OS resources (most OSes limit the number of outgoing TCP connections, for example, and constructing/destroying them isn’t free in any case). Secondly, as mentioned above, if you do something like join a new chatroom or have your popular message shared by another larger server, hundreds of interested servers might go and hammer yours asking for information of some kind, causing you to become somewhat overloaded. (For example, protocols often require users to have cryptographic keys, in order to be able to verify message authenticity, and getting these keys usually involves going to the server and asking for them – which is a problem if there are suddenly loads of new servers that have never heard of your user before!).

In contrast, non-openly-federated protocols like IRC don’t have this problem; IRC is as old as dirt and still realised that having everything be full-mesh isn’t the best of ideas. Instead, servers are linked in a spanning tree (see for a better explanation and pretty diagram), such that servers only need to broadcast messages to the servers they’re directly connected to, which will forward the messages on further, propagating them throughout the tree without any need for non-connected servers to ever talk to one another. This is far more efficient – but it does assume a high degree of trust in all the servers that make up the network, which wouldn’t work in a federated context where you can’t trust anyone.
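The spanning-tree propagation is easy to sketch: because the links form a tree, forwarding a message on every link except the one it arrived on reaches each server exactly once. This is an illustrative Python sketch, not actual IRC server code:

```python
def broadcast(msg, server, links, came_from=None):
    """Propagate msg through a spanning tree of servers: deliver
    locally, then forward on every link except the one the message
    arrived on. `links` maps each server to its directly linked peers."""
    delivered = [server]
    for peer in links[server]:
        if peer != came_from:
            delivered += broadcast(msg, peer, links, came_from=server)
    return delivered
```

No server ever needs a connection to a non-adjacent server, which is exactly what full-mesh federation lacks.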

The CAP theorem problem

The other big problem is that this thing called the CAP theorem exists, which says we can only have at most two of the following three guarantees, if trying to store state across a distributed system:

  • Consistency - all servers have a consistent view of the world
  • Availability - the system can be queried at all times, although asking two servers may return different results
  • Partition tolerance - the system doesn’t break if there’s a network split

Essentially, what this distils down to is the idea that you will have a network split or partition somewhere (where some servers are unreachable for whatever reason), and, when this happens, you will either have two sides of the network that have a different view of the world (thus violating consistency), or you will have to stop accepting new data in order to make sure that stuff doesn’t get out of sync (thus violating availability). This isn’t really something you can get around; it’s a fact of life when designing distributed systems.

Different systems deal with this in different ways:

  • If an IRC network suffers a netsplit, the two sides of the network see all the users on the other side quitting, and they can’t talk to them any more.
  • ActivityPub doesn’t really have to care about this, because there’s no distributed state anywhere; if an AP server wants to find out something about a user, it asks that user’s server.
  • Matrix actually has this rather nifty “eventual consistency” property, where you sacrifice consistency in the short-term when netsplits occur, but the system eventually sorts itself out when everyone gets reconnected again.

Given that we’ve specified our ideal chat system wants to be persistent as well as interoperable, we’re going to have to consider this problem somewhere – unless we get rid of distributed state entirely, that is. However, chat systems usually do have some idea of state, like who has administrator permissions in a chatroom.


Federation has been the source of many good things (interoperability! no lock-in!) and many bad ones (spam! networking complications!). As such, it’s not immediately obvious that completely open federation works for chat – contrary to what was said in the previous post, wholesale copying ActivityPub might not really work for something like chat, due to spam and issues with distributed state. Instead, we’re going to need something else – and we’ll explore what that something else could perhaps be in further blog posts in this series!

  1. It’s worth reading through his comments, and coming to your own judgement as to whether you think what I’ve said is fair or not. I didn’t really respond, because I think we fundamentally disagree on whether the Matrix architecture is a good idea, and there wasn’t much point debating that further; the comments are there, and you’re welcome to form your own opinion! 

  2. (Blogging about this process is arguably a questionable plan.) 

  3. Presence is the feature that lets you know whether someone’s online or not, or when they were last online. WhatsApp calls it “last seen”; IRC has the /away command; multiple other examples exist. 

  4. The A-level examiners tend to like that sort of stuff – and for a good reason; it’s nice to know that you aren’t just blindly spending time implementing something nobody wants. 

  5. Snapchat is a pretty good example of people sacrificing both interoperability (you can’t bridge Snapchat to anything else) and flexibility (you have very little control over read receipts, presence, and things like that). However, the added functionality seems to be worth it! 

  6. …which I somewhat harshly criticised in the last blog post. 

  7. This is what runs for spam detection, actually, and it mostly works! 

  8. Actually, I’m still not entirely sure why this spamwave stopped; I’d be tempted to believe that the spammers giving up and stopping it themselves was probably the main reason, although I know some additional filtering protections were put in place. 

  9. Citation: I was lurking in #matrix on Freenode yesterday (2019-09-25, ~20:00) when some random angry user started coming in and flooding the channel with random spam images. 

  10. (I swear, I don’t have a vendetta against them or anything!) 

Unrelenting Technology (myfreeweb)

Noticed something on dmesgd… looks like MIPS (64) isn’t that dea... September 25, 2019 03:34 PM

Noticed something on dmesgd… looks like MIPS (64) isn’t that dead: new(ish) Ubiquiti EdgeRouters have newer Octeon processors — quad-core 1GHz (and with an FPU). And 1GB RAM. That’s much better than the Lite’s dual-core 500MHz && 512MB RAM.

…wait, actually, there’s even big 16-cores (and 16GB RAM) in 10G routers!

September 23, 2019

Pete Corey (petecorey)

Apollo Quirks: Polling After Refetching with New Variables September 23, 2019 12:00 AM

While working on a recent client project, Estelle and I ran into a fun Apollo quirk. It turns out that an Apollo query with an active pollInterval won’t respect new variables provided by calls to refetch.

To demonstrate, imagine we’re rendering a paginated table filled with data pulled from the server:

const Table = () => {
    let { data } = useQuery(gql`
        query items($page: Int!) {
            items(page: $page) {
                results { _id result }
            }
        }
    `, {
        pollInterval: 5000
    });
    return (
        <table><tbody>
            {{ _id, result }) => (
                <tr key={_id}><td>{result}</td></tr>
            ))}
        </tbody></table>
    );
};
The items in our table change over time, so we’re polling our query every five seconds.

We also want to give the user buttons to quickly navigate to a given page of results. Whenever a user presses the “Page 2” button, for example, we want to refetch our query with our variables set to { page: 2 }:

 const Table = () => {
-    let { data } = useQuery(gql`
+    let { data, refetch } = useQuery(gql`
         query items($page: Int!) {
             items(page: $page) {
                 results { _id result }
+                pages
             }
         }
     `, {
         pollInterval: 5000
     });
+    const onClick = page => {
+        refetch({ variables: { page } });
+    };
     return (
         <table><tbody>
             {{ _id, result }) => (
                 <tr key={_id}><td>{result}</td></tr>
             ))}
+            {_.chain(data.items.pages)
+                .map(page => (
+                    <Button onClick={() => onClick(page)}>
+                        Page {page + 1}
+                    </Button>
+                ))
+                .value()}
         </tbody></table>
     );
 };

This works… for a few seconds. But then we’re unexpectedly brought back to the first page. What’s happening here?

It turns out that our polling query will always query the server with the variables it was given at the time polling was initialized. So in our case, even though the user advanced to page two, our polling query will fetch page one and render those results.

So how do we deal with this? This GitHub issue on the apollo-client project suggests calling stopPolling before changing the query’s variables, and startPolling to re-enable polling with those new variables.

In our case, that would look something like this:

 const Table = () => {
-    let { data, refetch } = useQuery(gql`
+    let { data, refetch, startPolling, stopPolling } = useQuery(gql`
         query items($page: Int!) {
             items(page: $page) {
                 results { _id result }
                 pages
             }
         }
     `, {
         pollInterval: 5000
     });
     const onClick = page => {
+        stopPolling();
         refetch({ variables: { page } });
+        startPolling(5000);
     };
     return (
         <table>
             {_.map(data.items.results, ({ _id, result }) => (
                 <tr key={_id}><td>{result}</td></tr>
             ))}
             {_.chain(data.items.pages)
                 .map(page => (
                     <Button key={page} onClick={() => onClick(page)}>
                         Page {page + 1}
                     </Button>
                 ))
                 .value()}
         </table>
     );
 };

And it works! Now our polling queries will fetch from the server with the correctly updated variables. When a user navigates to page two, they’ll stay on page two!

My best guess for why this is happening, and why the stopPolling/startPolling solution works, is that when polling is started, the value of variables is trapped in a closure. When refetch is called, it changes the reference to the options.variables object, but not the referenced object. This means the value of options.variables doesn’t change within the polling interval.

Calling stopPolling and startPolling forces our polling interval to restart under a new closure with our new variables values.
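That theory is easy to demonstrate in plain JavaScript. Below is a minimal sketch (my own, not Apollo’s actual internals) of a poller that traps variables in a closure when it starts, so that a later reassignment, which is effectively what refetch does, goes unseen:

```javascript
// Minimal sketch (not Apollo's internals) of the stale-closure theory:
// the poller grabs `options.variables` once, when polling starts, and
// never sees later reassignments of that property.
function makePoller(options) {
    const variables = options.variables; // reference trapped at start time
    return () => variables.page;         // every tick reads the old object
}

const options = { variables: { page: 1 } };
const poll = makePoller(options);

// What refetch effectively does: replace the variables object...
options.variables = { page: 2 };

// ...but the already-running poller still holds the old one.
console.log(poll()); // 1, not 2
```

Recreating the poller after the reassignment rebuilds the closure around the new object, which is exactly what the stopPolling/startPolling dance achieves.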

September 22, 2019

Carlos Fenollosa (carlesfe)

September 21, 2019

Gonçalo Valério (dethos)

kinspect – quickly look into PGP public key details September 21, 2019 07:59 PM

Sometimes I just need to look into the details of a PGP key that is provided in its “armored” form by some website (not everyone is publishing their keys to the keyservers).

Normally I would have to import that key into my keyring, or save it into a file and use gnupg to visualize it (as described in this Stack Overflow answer).

To avoid this hassle I just created a simple page with a text area where you can paste the public key and it will display some basic information about it. Perhaps an extension would be a better approach, but for now this works for me.

You can use it on:

In case you would like to contribute in order to improve it or extend the information displayed about the keys, the source code is available on GitHub under a Free Software license:

Unrelenting Technology (myfreeweb)

“header” is a rather unfortunate word overloading in English. Searching for “tpm header” gives... September 21, 2019 12:49 PM

“header” is a rather unfortunate word overloading in English. Searching for “tpm header” gives you everything about… well, the physical header on mainboards :D Even with “tpm protocol header”, only Google finds something related to protocol specs, and in the lower half of the first page.

September 16, 2019

Unrelenting Technology (myfreeweb)

I’ve been rewriting the engine that runs this website in the past few months..... September 16, 2019 05:55 PM

I’ve been rewriting the engine that runs this website over the past few months…

and now, finally, it runs on sweetroll2! Longer writeup coming; this is more of a test post.

Pete Corey (petecorey)

Elixir Style Conditions in Javascript September 16, 2019 12:00 AM

Elixir has a useful control flow structure called cond that lets you branch on arbitrary conditions. Unlike the more common switch control structure (case in Elixir), cond doesn’t match against a predetermined value. Instead, it evaluates each condition, in order, looking for the first one that evaluates to a truthy value (not nil or false).

numbers = [1, 2, 3]
result = cond do
  4 in numbers -> "four"
  6 == Enum.sum(numbers) -> "sum"
  true -> "default"
end

This is all probably old hat to you.

As I mentioned, cond can be an incredibly useful control structure, and there are times when I’ve missed it while working in languages like Javascript that only have switch statements.

A traditional Javascript implementation of the above (with a little help from Lodash) would look something like this:

let numbers = [1, 2, 3];
if (_.includes(numbers, 4)) {
  var result = "four";
} else if (6 === _.sum(numbers)) {
  var result = "sum";
} else {
  var result = "default";
}
However, I recently stumbled upon a trick that lets you implement a switch statement in Javascript that behaves very similarly to a cond expression in Elixir. The key is to switch on the value of true. The case expressions that evaluate to true will match, and their corresponding statements will be evaluated in order.

let numbers = [1, 2, 3];
switch (true) {
  case _.includes(numbers, 4):
    var result = "four";
    break;
  case 6 === _.sum(numbers):
    var result = "sum";
    break;
  default:
    var result = "default";
}

Whether or not this is any more useful or readable than a series of if/else blocks is debatable. That said, this is definitely an interesting example of perspective shifting and seeing old code in a new light. Hopefully you find it as interesting as I do.
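For what it’s worth, the same first-truthy-clause behavior can also be packaged into a tiny cond-like helper function. This is my own sketch (the cond name and shape are invented, not part of any library):

```javascript
// Hypothetical cond helper: takes [predicate, value] thunk pairs and
// returns the value from the first pair whose predicate is truthy.
const cond = pairs => {
    for (const [predicate, value] of pairs) {
        if (predicate()) {
            return value();
        }
    }
    return undefined; // no clause matched
};

const numbers = [1, 2, 3];
const result = cond([
    [() => numbers.includes(4), () => "four"],
    [() => 6 === numbers.reduce((a, b) => a + b, 0), () => "sum"],
    [() => true, () => "default"],
]);

console.log(result); // "sum", since 1 + 2 + 3 === 6
```

Like Elixir’s cond, the final `() => true` clause plays the role of the catch-all.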

September 15, 2019

Carlos Fenollosa (carlesfe)

September 14, 2019

Caius Durling (caius)

curl --resolve September 14, 2019 05:10 PM

Sometimes it's useful to be able to craft a request to one server, using a DNS name that's either not defined or currently pointed to a different IP. (Migrating webservers, testing a new webserver config out, etc.)

Historically for HTTP calls this was easy, just set the Host header as you make the http request to the IP directly:

 curl -H "Host: example.com" http://192.0.2.1/

However, HTTPS throws a bit of a spanner in the works: if we just try to connect using an overridden Host header, we get an error back from the server because it isn't configured with a certificate for the bare IP address:

 $ curl -H "Host: example.com" https://192.0.2.1/
curl: (51) SSL: no alternative certificate subject name matches target host name '192.0.2.1'

Usually at this point I'd just start editing /etc/hosts to add the hostname to it and carry on testing. This is a pain when you're testing more than one server, or you're on a machine where you don't have root access to edit /etc/hosts.

In later versions of curl there's a solution built into the binary, in the form of the --resolve flag. You can tell it to override the DNS lookup for a specific hostname/port combination. This in turn means the correct hostname reaches the server, so the right SSL certificate can be chosen to serve the request.

It takes the form --resolve HOST:PORT:IP, where HOST is the human-friendly host, PORT is the webserver's port (by convention 80 for HTTP, 443 for HTTPS), and IP is the destination IP you want to hit. (As opposed to the one in DNS currently.)

$ curl --silent --head --resolve example.com:443:192.0.2.1 https://example.com/ | head -n1
HTTP/2 200

And voila, you don't need to fiddle with editing /etc/hosts. Just use --resolve to hit a different IP for a given host.

September 13, 2019

Patrick Louis (venam)

September 2019 Projects September 13, 2019 09:00 PM

In the blink of an eye 6 months have gone by. Since then, I’ve written a single article about time on the internet and thus the blog needs an update on my latest endeavours.

Psychology, Philosophy & Books

Language: brains

The Stand

I’ve let go of my daily diary and the happiness-tracking application I hinted at in my previous post; it became apparent that I find genuine happiness in books. In short, I enjoy reading much more than I thought I did.

Here’s the reading list of the past few months.

  • Who Moved My Cheese - Spencer Johnson
  • Peaks and Valleys - Spencer Johnson
  • Mature Optimization Handbook - Carlos Bueno
  • Leadership Presence - HBR Emotional Intelligence Series
  • I, Robot - Isaac Asimov
  • The Pleasure of Finding Things Out - Richard Feynman
  • The Stand - Stephen King (My longest read to date)
  • Smoke and Mirrors - Neil Gaiman
  • Out of the Dark - Gregg Hurwitz
  • Childhood’s End - Arthur Clarke
  • Clean Architecture - Robert Martin

And the countless articles from the newsletter, books that aren’t physical, or anything that my memory is unkind to.

Furthermore, I’m currently reading “What If?” by Randall Munroe, “How To” by the same author, and “The Science of Food” by Marty Jopson, which I’m enjoying with my SO, while eagerly devouring my copy of “Software Architecture in Practice”, 3rd edition.

As for podcasts, I’ve exhausted my list which means I’m keeping up with the dozen subscriptions I have. Let’s mention them.

  • The Joe Rogan Experience
  • Stuff You Should Know
  • Snap Judgement
  • Philosophy Bites
  • Planet Money
  • The Philosopher’s Zone
  • Radiolab
  • Making Sense
  • The Psychology Podcast
  • LKT - Let’s know things
  • You Are not so Smart
  • Hidden Brain
  • Team Human
  • Philosophize This!
  • Intercepted
  • Modern Love
  • The Food Chain
  • Lore
  • InfoQ podcast
  • All in the mind
  • Science Friday
  • Big Questions Podcast
  • Levar Burton Reads
  • The knowledge Project
  • Darknet Diaries
  • CMV - Change my view
  • 50 things that made the modern economy
  • Invisibilia
  • Hi-Phi Nation
  • 30 Animals that made us smarter
  • The history of GNOME
  • The end of the world with Josh Clark
  • A history of the world in a 100 object
  • General philosophy
  • The Rhine

I’ve given The Rhine series a second listen. History gives fantastic insights into where the shape of our world comes from. It draws the lines that aren’t apparent.
Such a long list wouldn’t matter if I didn’t share my favorite picks. Obviously, that includes The Rhine, but also Planet Money, The Food Chain, Levar Burton Reads, Science Friday, Hidden Brain, and LKT. Those are the ongoing podcasts that keep me from dying on the road each week.

Learning, Growth, & Community


clean architecture

I’ve matured professionally. For two consecutive years now I’ve been part of a small team, initially an XP duo, later a trio, which then virally spawned new roles throughout the company, building a highly secure and accredited eSIM solution from scratch.

The project is based on new specifications and thus uncharted territory. This has allowed me to get even more comfortable reading specifications, protocols, architectures, and RFCs. You’ve got to have patience and reading skills in this field, and I don’t think I’m missing either. It’s similar to what I used to do in the newsletter, but having it as my day-to-day job has been an added benefit.

The XP methodology, though informal, is intellectually stimulating. I’ve learned extensively from my seniors, not only by seeing them handle tasks but also via discussions (sometimes heated) about technical decisions that needed to be made. And there were a lot of those decisions; this is the kind of flexibility that lets you experiment. Many of my misconceptions have been shattered.
If you know one thing about me, through my blog, newsletter, or podcast, it’s that I’ve got a knack for shining a light on software stacks and how they fall into place; I like to explain systems from above. I have a penchant for software architecture, drawing diagrams, and understanding why things are built the way they are.
However, time is a costly parameter in life’s equation and I had to give up something to get another in return.


Consequently, I’ve put the newsletter on hiatus. It pains me to get a bit further away from Unix, having already discontinued the nixers podcast more than a year ago. Unix stays in my heart!
I’m thankful to Vermaden for his participation throughout the newsletter; he’s still publishing his valuable news on his blog.
Nevertheless, next February I’ll be in Belgium for Fosdem. The tickets are booked and I’m ecstatic about finally attending in person. (See you there)

It’s been almost a month since I began my software architecture studies, and I plan to keep track of the path I’m undertaking. I’ll write articles on the topic as soon as I’m experienced enough. Meanwhile, here’s what I have in mind as my future steps.
To help me remember important details as I learn them, I’m using software called Anki. Apart from the two books I’ve previously listed, namely “Clean Architecture”, which I finished, and “Software Architecture in Practice”, I’m following a YouTube channel called “Software Architecture Mondays”, I’m training to draw diagrams the right way, I’m reading a couple of articles every day, analogous to what I was doing for the newsletter, and I’m planning to register for an official certification later on.
That’s the plan.

Apart from these, I’m reviewing my algorithm knowledge, doing a couple of exercises a week on LeetCode and GeeksforGeeks. This keeps my problem-solving skills up to date, or at least prepares me for the modern and radically tough interview style.

And all of this happens on my redesigned workstation! First of all, I’m now on a standing desk. Secondly, I got a new mechanical keyboard with red switches for work. Thirdly, I got o-ring dampeners. Fourthly, I got a 22-inch screen for my home station. Fifthly, I got a wrist rest for the new keyboard. And finally, I got a trackball.


Other Learning

Language: gray matter

curiosity stream

My SO influenced me to install a brain-teaser application called Elevate. I’ve been playing it ever since, a 142-day streak. I should also note that I’ve bought the pro version.

Another of my girlfriend’s influences is a subscription to a documentary streaming service named Curiosity Stream, which was initially intended for her but which I happen to use quite often.

In contrast, I haven’t continued the Spanish refresher I was intending to do. It simply isn’t useful at this point in time.

Ascii Art & Art

Language: ASCII


I’ve pushed 4 pieces since the last post. That may not seem like an impressive number, but quantity doesn’t equal quality. Of those pieces, one was the last dinosaur in the series, a triceratops; another is the amazing totem pole piece that I’ve added above this section; the next is a piece dedicated to the now-defunct Linux Journal that was featured in one of their issues (Command Line, July 2019); and the last is a piece christened Alamanni that reached third place at Evoke 2019.
Moreover, the Impure ASCII art group released two packs, 72 and 73.

Aside from ASCII art, I’ve painted a single canvas. What I’ve focused on the most instead is singing. I’ve been training since February, using an application called Sing Sharp, another called Perfect Ear, and rounding this out with a small playlist to sing along to in the car while in traffic. I can affirm objectively, through the app’s rating, that I’ve progressed.

2bwm and Window management

Language: C

There hasn’t been much change to 2bwm. A user named netcrop has been sending pull requests and I’ve reviewed them.
I’ve tried to implement centered pop-ups but there is still one issue I haven’t caught yet.

Life and other hobbies

Language: life


In June I spent two weeks in the USA, one week in South Beach Miami and the other in New York City. Even though the 12-15 hours on the plane were intense, it was all worth it. After seeing those two cities I’ve formed a better opinion of urban planning.

During my stay in Miami I reignited my love of swimming. Therefore when I came back to Lebanon I had to give it another try. Still, in spite of all the love I have, my attempts were thwarted. The pollution of the beach in Lebanon is just incomparable to what I’ve experienced in Miami. In sum, I was discouraged.

Summer isn’t the time for mushrooms in Lebanon, but I’ve nevertheless relished the wonders of fresh King Oyster, fresh Shiitake, fresh Portobello, fresh White Button, dried morels, dried Porcini, and more. Adding to those, I’m trying Lion’s Mane and Cordyceps supplements; so far so good.
Aside from those, I’ve found ginseng tea to be fantastically savory and revitalizing.

On that topic, I’ve been cooking many new recipes, trying to use the oven more. My cooking page hasn’t been updated in a while, though, and it’s not as presentable as I wish it were, so I haven’t been adding pictures to it.

Finally, summer is all about flowers and plants. So I’ve spent a good amount of time in my garden: cutting branches, cleaning, planting pumpkins, growing from cuttings, etc.


Which all leads to what’s in store for tomorrow. More of the same but upgraded.

The next months I’m going to put the emphasis on software architecture. That’s really the single new and most important activity I’m going to partake in.
Plus whatever I’m already doing but more of it!

This is it!

As usual… If you want something done, no one’s gonna do it for you, use your own hands.
And let’s go for a beer together sometime, or just chill.

Gustaf Erikson (gerikson)

July September 13, 2019 07:30 PM

Introducing HN&&LO September 13, 2019 07:28 PM

HN&&LO is a web page that scrapes the APIs of Hacker News and Lobsters and collates those entries that share a URL.

This lets you easily follow the discussion on both sites for these links.

I wrote this tool because I’m much more active on Lobsters than on HN, but I would like to keep “tabs” on both sites. I’m especially interested in how the information is disseminated between the sites.

“Slow webapps”

I’ve long liked to grab stuff via a web API, stuff it into a DB, and output a web page based on the data. This project was my first attempt at using a templating engine in Perl, and I’m ashamed it took me so long to appreciate the greatness of this approach.

The data updates hourly; there’s rarely more than a couple of new entries on Lobsters per hour.

In fact, this can illustrate the difference in scale between the two sites.

Time period: 5 Jul 2019 11:19 to 8 Aug 2019 23:59 (both times in UTC).

  • Number of submissions with URLs on Hacker News: 26,974
  • Number of submissions on Lobsters: 899

Average number of submissions per hour: 33 for HN, 1 for Lobsters.

Once updated, a static web page is generated. This keeps overhead low.

Differences between Hacker News and Lobsters

Coincidentally, while I was writing this post, an article was published in the New Yorker about the moderators (all two of them) of Hacker News. This sparked a discussion on HN (obviously) but also on Lobsters (see both linked here).

To me, what makes Lobsters better is that it’s much smaller (one can easily catch up with a day’s worth of comments), the editing tools are more sophisticated, and the tag system lets one determine whether a post is interesting or not. HN is much harder to grasp.

I think the tendencies noted in the article are common to both sites: male, youngish software developers with a rationalist bent. But I think Lobsters is a bit more self-aware of this fact (or rather, critiques are more visible there). In some ways Lobsters, being a reaction to HN, defines itself culturally both for and against the bigger site.

Source and TODO

If you’re interested in pedestrian Perl code, source is here.

The TODO list is here.

Pete Corey (petecorey)

Obverse and Under September 13, 2019 12:00 AM

I previously wrote about plotting an “amazing graph” using the J programming language. The solution I landed on looked something like this:

require 'plot'
f =: ] - [: #. [: |. #:
'type dot' plot f"0 p: i. 10000

Our verb, f, is taking a very explicit approach by making judicious use of “capped” ([:) verb trains. We’re essentially saying that f is (=:) the given number (]) minus (-) the base two (#.) of the reverse (|.) of the antibase two (#:) of the given number.

Several members of the J community pointed out to me that this verb could be simplified with the help of the “under” (&.) conjunction. Let’s dig into what “under” is, and how we can use it.

Under What?

The best way to think about “under” (&.), as explained by the NuVoc page on “under”, is to think in terms of domains and transformations in and out of those domains.

Verb v defines a transformation of the argument(s) (x and) y into the v-domain. Next, verb u operates on the transformed argument(s). Lastly the result is transformed back from the v-domain to the original domain.

In our example, the domain of our input is base ten, but the transformation we want to apply (reversal) needs to happen in the base two domain. “Under” (&.) can be used to transform our input into base two (#:), apply our reversal (|.), and transform the result of that reversal back to our original base ten domain with the obverse, or opposite, of our base two verb, anti base (#.):

f =: ] - |. &. #:

Notice that we’re not explicitly stating how to transform the result of our reversal back into our original domain. J knows that the obverse of #: is #., and automatically applies it for us.
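For readers who don’t speak J, here’s a rough JavaScript analogy (my own sketch, not J semantics): "under" is conjugation, i.e. transform into v’s domain, apply u, then transform back with v’s obverse:

```javascript
// Rough analogy (my sketch): u "under" v is vInverse(u(v(y))).
const under = (u, v, vInverse) => y => vInverse(u(v(y)));

// Analogues of #: (antibase two) and #. (base two).
const toBits = n => n.toString(2).split("").map(Number);
const fromBits = bits => parseInt(bits.join(""), 2);

// |. &. #:  reverse a number's binary digits "under" antibase two.
const reverseBits = under(bits => [...bits].reverse(), toBits, fromBits);

// f =: ] - |. &. #:
const f = n => n - reverseBits(n);

console.log(f(6)); // 6 is 110; reversed that's 011, i.e. 3; 6 - 3 = 3
```

The difference, of course, is that in JavaScript we must pass the inverse explicitly, whereas J infers the obverse of #: for us.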

Out of the box, J comes with many obverse pairings. “Open” (>), for example, is the obverse of “box” (<), and vice versa. This pairing is especially useful when applying transformations to boxed values.


Check out a full listing of obverse pairs at the end of this Shades of J article.

Inferred Obverses

Even compound verbs built up of verbs with well-defined obverse pairings can be used with “under” (&.). J will correctly infer and apply the compound obverse without any intervention or instruction.

For example, if we wanted to unbox a list of values and then work with them in the “square root domain” (whatever that means), we could do something like this:

   1&+ &. ([: %: >) < 1 2 3
┌────────────────┐
│4 5.82843 7.4641│
└────────────────┘

J takes each value, opens it and finds its square root ([:%:>), adds one to the result, and then squares and boxes up ([:*:<) the incremented value.

Explicit Obverses

Even more interestingly, if an obverse pairing isn’t defined or inferable for a given verb, J lets us define our own pairing using the “obverse” (:.) verb.

As an example, imagine that we have a JSON string holding an array of values. We want to parse our string, perform some operation on those values, and then serialize the resulting list back into JSON.

We can use the dec_json and enc_json verbs provided by the convert/json package, and tell J that the obverse of dec_json is enc_json:

   json =: dec_json :. enc_json

Running dec_json on a JSON array like '[1, 2, 3]' will return a list of boxed numbers, so we’ll want to open each of these boxes, perform our operation, and box the results back up. This sounds like another job for “under” (&.):

   transform =: 1&+&.>

All together, we can perform our transform “under” (&.) the json domain:

   transform &. json '[1, 2, 3]'

And our result is the JSON string '[2,3,4]'!

“Under” is definitely a very powerful conjunction, and I can see myself using it extensively in the future. Thanks to everyone in the J community who was kind enough to point it out and teach me something new!

September 11, 2019

Carlos Fenollosa (carlesfe)

La predicción del tiempo en tu calendario September 11, 2019 05:21 PM

(Even though I write my blog in English, this post is in Spanish for obvious reasons. Click here to translate it with Google)

If you spend the day looking at your calendar, scheduling meetings and events, and you miss having the weather forecast at hand, I've created a utility you may find very interesting.

It shows the weather forecast for your municipality right inside your calendar.

It's very simple: type the name of your municipality and press the button. No need to register, hand over your data, or pay anything. It's a simple, anonymous tool.

It's compatible with all phones and computers, since it uses standard technologies. Your device takes care of updating the forecasts regularly.

I built this utility for my personal use after discovering that nothing similar existed, and after using it for a few weeks I thought it could be useful to open it up to everyone else.

The data comes from AEMET, so it only works within Spanish territory.

Here's the link: el tiempo en tu calendario

Tags: software, projects, spanish

Comments? Tweet

September 10, 2019

Carlos Fenollosa (carlesfe)

The iPhone 11 & co. September 10, 2019 09:10 PM

This year's phone keynote has delivered, according to Apple, all-new products from the top down

Quite boring hardware unfortunately, as was expected.

  • Better cameras, though for use cases I'm not sure are very common
  • Better battery life thanks to the A13 chip
  • Marginally better screen on the Pro phone
  • Always-on screen on the Watch, which is nice
  • Simple update on the entry-level iPad

The landing page for each phone is 60% camera features and 40% other features. Not saying that is wrong, on the contrary, the marketing team is doing their job as in my experience most people use their phones as an Instagram device.

Where I think Apple nailed it is with the Watch. They are really, really good at the health and fitness message, and the product itself is fantastic.

However, I will criticise them for two things.

First, the fact that they are not even advertising the full price for the phone, but rather an installment plan first, then a discounted price with the trade-in of an old phone, and only when you say "no" to these options do you get the actual price. Let me reiterate that. When you visit Apple's website, the price you see for the phones is not the actual retail price.

They are aware that their hardware is not attractive at those price points, but at the same time they can't lower them because of positioning. Well, to be precise, the iPhone 11 is actually sliiightly cheaper than last year's but, in my opinion, not attractive enough to upgrade. And let's not even mention EU prices. On top of the 21% sales tax —nothing to do there— we are eating a 1:1 USD:EUR ratio, which is bullshit.

Second, Apple is advertising a "Pro" phone that can shoot incredible 4K movies, but stuffing it with only 64 GB of storage. The consumer experience is terrible when you are out of disk space.

My phone memory is full and every time I take a picture it is immediately uploaded to iCloud and deleted from the phone. If I want to show it to somebody later the same day, I have to wait for it to load from the network. My UX is that I have no pictures or videos stored locally, not even for pictures I took 15 minutes ago. That is definitely not a feature you want on a super advanced camera-phone.

The phone market is too mature

Regarding innovation, what can Apple really do? I honestly do not have an answer. The majority of the population is not renewing their phones on a yearly cycle, not even a two year cycle. I have an SE only because my 5S broke. I loved my 5S and there is no feature in current phones that would make me upgrade.

I commute every day and see what "normal people" (excuse me) do on their phones. It is 40% scrolling through Instagram, 30% Whatsapp, 20% watching shows, 5% taking pictures, 5% playing games.

If you want to read the best take on the keynote, read Ben Thompson's The iPhone and Apple’s Services Strategy.

The phone market and phone technology have crossed the chasm long ago and they're on diminishing returns. I stick by my reaction of last year's keynote:

  • Apple should seriously consider 2-year iPhone cycles.
  • People who want smaller phones, regardless of price, may move to Android, myself included, so an updated iPhone SE is strategic for Apple.
  • Hardware improvements are going to be mostly incremental from now on. Therefore...
  • Apple should focus on software, which they are doing very well, and keep coming up with really crazy innovative hardware, which they appeared to be doing but rumors say they scrapped at the last minute, like the U1 chip.

Apple is a company full of smart people that can reinvent boring products like beige PCs, Nokia phones, and even headphones and watches. I am hopeful for the next wave of hardware, whatever it is. AR glasses? Car stuff? TVs? We will see.

Personally, I am indifferent to this keynote. Since my main need is a laptop, I'm still waiting for the new wave of MacBooks to renew my 2013 MBA. I simply refuse to buy any laptop from Apple's current lineup. The rumors are very promising, so let's see what they come up with!

Tags: apple

Comments? Tweet

September 09, 2019

eta (eta)

Thoughts on improving chat systems September 09, 2019 11:00 PM

I’m quite interested in instant messaging technology! I’ve been an avid user of internet relay chat (IRC) – probably the oldest chat protocol in existence – for years, I went through a brief period of using for everything, and I’ve of course sampled most of the smorgasbord of proprietary chat solutions, including WhatsApp, Discord, Facebook Messenger (eww!), and probably some others I’m forgetting. This blog post is essentially a data dump of opinions and thoughts on the various open chat standards and protocols that are out there (and if you get all the way to the very end, you get to hear me propose yet another one of them…!).

The dream of the universal standard

There have been countless blog posts on how it sucks that we have lots of things to choose from, and there isn’t any universal standard yet for everything. Of course, there’s even a relevant xkcd:

xkcd #1810: Chat Systems

Writing a universal standard is hard, and indeed there’s yet another relevant xkcd about that one as well1. Part of the problem is that, well, instant messaging provides various ways for people to express themselves, and so different people are likely to prefer different platforms – some people actually like IRC for what it is, a basic, purely textual form of communication, whereas others want features like avatars, images, and other modern niceties. Some people want encryption or privacy and would prefer that to take centre stage, whereas others want to use custom emoji, have native GIF support, and stuff like that. In some ways, the reason we have so many chat protocols and services available is due to this difference in preference and target market:

  • WhatsApp is, or was originally, “simple SMS equivalent that doesn’t cost as much and that lets you have group chats”
  • Discord is “chat for gamers”
  • Slack is “chat for Big Important Corporate Uses”
  • IRC is “basic text-mode chat for developers and people like SirCmpwn”

While it would conceivably be possible to unify all of these competing applications under one common standard, doing so would seem a bit awkward. WhatsApp, for example, is the only one of the above platforms that supports end-to-end encryption, and it doesn’t store any chat history on the server either; conversely, Discord and Slack treat your chat history more like an append-only database, with the whole thing being easily searchable (indeed, Slack try and make money by charging you access to this database).


The eventual fate of an open messaging standard is not generally one where everyone on the planet drops whatever tools they’re using and starts to use it. After all, standards are paper, and paper isn’t worth much on its own; you need to be able to talk to the friends you already have! To remedy this, most protocols worth their salt have an ecosystem of various bridging tools to go with them that let you do just that; for example, my own sms-irc project lets you talk to your WhatsApp contacts using your IRC client.

Of course, not many chat services open up their protocols and make this easy to do – the last mainstream one to really do so was Google Talk, which was based on the open XMPP standard, and even supported federation (i.e. you could spin up your own XMPP server and talk to people using Google Talk, which was nice). Services like WhatsApp are based on standards like XMPP, but they’ve heavily locked everything down so you can’t just go and connect with your XMPP client. Discord has an API, but using it on user accounts (instead of their officially sanctioned bot account option) is forbidden; I could extend this list on and on. The main point is, of course, it’s not beneficial for 99% of these chat companies to allow people to connect their own custom software; their business model usually heavily relies on you using their version, so they can either serve you with ads, perform some tracking, etc., or so they can pull a Slack and prevent you logging your chat history elsewhere. In fact, Slack even shut down their IRC gateway a while ago – the reason they gave was the above one of “we have custom features that don’t come through well via gateway”, but everyone secretly suspected it was to increase vendor lock-in.

Nevertheless, though, bridging tools do exist - bitlbee is one of the most popular ones for IRC, XMPP has Spectrum 2, and Matrix has their collection of bridges. As an example, I personally use IRC, but most of the people I talk to use other chat platforms; IRC is just a way to bring all of my communication together in one place. Indeed, this is one of the main selling points for open protocols and platforms: “in the future, people will realise our standard is better, but for now, you can still talk to your old friends”.

Competitor analysis

In terms of protocols (not services) that I could use to set up a personal messaging system, there are really only three competitors: IRC, XMPP, and Matrix. Of these three, they each have their advantages and disadvantages:

IRC…
  • …is rock-solid reliable, perhaps to a fault
    • If you send a message to someone, the message is going to arrive within a few seconds of you sending it, maximum.
    • If it doesn’t, you will get some obvious error as to why.
  • …doesn’t handle mobile connections / Multiple Points of Presence (MPoP) well
    • People nowadays expect to be able to use their mobile device. IRC is still really tethered to the desktop.
    • Mobile IRC clients exist, but require usage of a bouncer or something to multiplex your desktop and mobile connections together, and maintain the connection when your phone goes offline.
    • On Android, the mobile story is a bit dire, if you exclude things like Weechat Android and Quasseldroid that are frontends for some bouncer or other desktop client.
  • …is incredibly simplistic at the protocol level
    • You can chat on IRC using just telnet. The protocol really isn’t very involved, and is quite easy to write parsers for.
    • IRC doesn’t use JSON, XML or any other sort of formal data format.
  • …is incredibly simplistic in terms of functionality
    • There’s no chat history, sending messages to people who are offline, or anything modern, really.
    • No typing or delivery notifications either.
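The “easy to write parsers for” point is simple to demonstrate; here is a rough sketch of an IRC message parser (the function and field names are mine, not from any particular library):

```javascript
// Minimal sketch of RFC 1459-style IRC line parsing: an optional
// ":prefix", a command, space-separated params, and an optional
// trailing param introduced by " :".
function parseIrcLine(line) {
  let prefix = null;
  if (line.startsWith(":")) {
    const i = line.indexOf(" ");
    prefix = line.slice(1, i);
    line = line.slice(i + 1);
  }
  // Everything after the first " :" is a single trailing parameter.
  let trailing = null;
  const t = line.indexOf(" :");
  if (t !== -1) {
    trailing = line.slice(t + 2);
    line = line.slice(0, t);
  }
  const [command, ...params] = line.split(" ").filter(Boolean);
  if (trailing !== null) params.push(trailing);
  return { prefix, command, params };
}
```

A dozen lines of string slicing cover most of the framing, which is roughly what the telnet argument amounts to.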

XMPP…
  • …uses XML
    • XML is crap2 and hard to write parsers for.
    • Older programming languages like Java and friends handle XML fine, but it doesn’t necessarily translate that well in a world where everything is JavaScript or Rust.
    • Even if you think XML is okay, it’s still arguably too complex for a messaging protocol.
  • …suffers from extreme fragmentation, both protocol-wise and app-wise
    • XMPP comes as a base standard that then gets extended by a bunch of XEPs (extensions to the spec). This is dangerous, since quite a few clients don’t bother implementing all the useful XEPs you actually need to make XMPP worthwhile.
    • There seems to be quite a large disparity between best-in-class XMPP clients like Conversations and the rest of the ecosystem.
  • …has built-in support for Multiple Points of Presence (MPoP)
    • At least they thought this one through; you can chat on multiple devices at once, and that’s natively supported by the protocol.
  • …doesn’t really seem to be used (as an open protocol, that is) by many
    • Quality iOS XMPP clients, for example, don’t really exist. Sure, there are a few, but they’re quite hacky.
    • Weirdly popular in Germany, though (?)
  • …has some support for end-to-end encryption
    • See this website, which tracks implementation in multiple XMPP clients.
  • …subjectively never really worked for me
    • I did give XMPP a try once, and found it quite shoddy. Group chats would fail to work in mysterious ways or not sync across devices, with no real reason as to why; chat history didn’t always work as advertised; and I generally got a bad impression of the whole protocol and its ecosystem.
    • The fact that multi-user chats were an extension instead of being built in to the protocol is perhaps a cause of some of the jankiness.
    • I acknowledge that I’m probably not doing XMPP justice, but there you go; I didn’t see anything in it when I tried it!

Matrix…
Oh, boy.

  • …uses JSON and HTTP
    • This is better than XMPP, since these standards are actually used nowadays, and are relatively lightweight; pretty much everything can speak JSON.
    • Some people do complain about HTTP being still a bit too heavyweight, in contrast to something like XMPP where you don’t need to pay the price of making a new HTTP request every time you want to send something.
    • However, Matrix’s long polling model for fetching new messages is something I actually think is quite clever; it’s clearly been designed thoughtfully to allow clients to perform well with patchy connections, which is a pain point in older protocols like IRC.
  • …has a growing amount of people, organizations and development effort rallying around it
  • …has a healthy bridging ecosystem
    • Taking a look at their bridging page shows as much; this is one thing they are pretty good at.
    • The specification even includes a separate part for Application Services (ASes), which are specifically designed to do things like bridging.
  • …has a somewhat problematic, slow reference implementation
    • Essentially, the problem is that the protocol requires a “state resolution” algorithm to verify permissions in a chatroom. This is the root of all sorts of performance issues, and also has been the source of security issues in the past (q.v.)
    • There are long-standing related issues like #1760 that still aren’t fixed and have lots of users complaining.
      • (although there’s now a somewhat hacky workaround for this that involves sending dummy events into the room)
    • I mean, just browsing issues flagged major is pretty enlightening, and hints at some real problems with the way the reference implementation is built.
    • In my personal experience, I ran a Matrix homeserver for the best part of a year until I got fed up with it; I had to enable zswap / zram on the servers I was running it on (since it had a habit of eating all the RAM available), and had to contend with 100% CPU usage spikes every time I logged on.
  • …is questionably reliable as a messaging platform
    • When I used to use it, I’d frequently encounter problems with messages just not being delivered, or push notifications breaking.
    • I once had an incident where a friend of mine, trying to send messages to me on my self-hosted server, was sending things for about a week without me getting any part of it, until I figured out something was amiss and rebooted it.
    • I personally value the messages actually being delivered above all else, and Matrix at least in my experience is pretty bad at this…
  • …has a questionable security story
    • The official server has been hacked in the past, although this is nothing to do with the protocol.
    • The original version of the protocol had a bug known as “state resets”, where room state would be reset back to some earlier version. This caused all sorts of fun security issues – one user was even able to wrest control of the main Matrix HQ chatroom back in the day and ban the project developers from it – until they fixed it.
      • This also resulted in the hilariously named “Hotel California bug”, where people could leave a chatroom, but would end up being forcibly rejoined whenever a state reset occurred.
      • This was eventually solved with a new implementation of the “state resolution” algorithm – which required upgrading rooms to support the new version, essentially creating a new chatroom and trying to get everyone to join it.
    • Although they’ve cleaned a lot of things up for their Synapse 1.0 release, and it’s improving, looking at their list of open and closed security bugs isn’t exactly reassuring.
  • …has a questionable data model for chat purposes
    • Really, the root cause of a lot of Matrix’s problems seems to be that each chatroom is actually a distributed database, stored in the form of a directed acyclic graph (DAG) – that anyone can append to, given it’s all publicly available over an open federation.
    • Trying to do things in a way where the chatroom is completely independent of any of the servers it’s hosted on is cool, but also quite unwieldy.
    • Arguably, a more lightweight protocol based on message passing would be better suited to chat – but hey, it’s a free world, and different implementations can exist for a reason.
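For the curious, the long-polling model praised above looks roughly like this. The /sync endpoint and next_batch token are from the Matrix client-server API; the function shape, names, and the injectable fetchImpl/maxPolls parameters are mine, added only to keep the sketch testable:

```javascript
// Rough sketch of Matrix-style long polling. The server holds each
// request open (here up to 30s) until events arrive; the client then
// immediately re-polls with the `next_batch` resume token, so nothing
// is missed even on a patchy connection.
async function syncLoop(baseUrl, accessToken, onSync, fetchImpl = fetch, maxPolls = Infinity) {
  let since = null;
  for (let i = 0; i < maxPolls; i++) {
    const url =
      `${baseUrl}/_matrix/client/r0/sync?timeout=30000` +
      (since ? `&since=${encodeURIComponent(since)}` : "");
    const res = await fetchImpl(url, {
      headers: { Authorization: `Bearer ${accessToken}` },
    });
    const body = await res.json();
    since = body.next_batch; // resume token for the next long poll
    onSync(body);
  }
  return since;
}
```

Compare this with IRC, where a dropped TCP connection simply loses whatever was sent in the meantime.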

A gap in the market

This is getting quite long (especially with the rather dense bullet point list presented above!), so I’ll wrap things up: essentially, I think there’s a gap in the market somewhere between IRC and Matrix for a new standard, or at least an attempt at one. Matrix does a lot of things right, but is a bit too ambitious; they’re trying to tackle the problem of making a distributed (in such a way that it doesn’t depend on any single server), end-to-end-encrypted, fault-tolerant database, which is arguably a cool thing! However, I posit that something more like IRC or XMPP is better suited to chat, which seems like it could be implemented in a far more lightweight manner while retaining quite a lot of the functionality – like XMPP does, but with less XML and hacky standards.

Bonus round: Mastodon

The Mastodon decentralized social network is a pretty good example of what I mean, actually; it’s a relatively simple, open protocol (ActivityPub) that doesn’t try to implement the world (users are still tied to a single server, for example, and data is lost if that server dies) - but its simplicity has allowed for the development of other so-called “fediverse” servers, like Pleroma, and even ultra-minimalist ones like honk3, without too much hassle.

It’s simple, reliable, and still federated; it works well enough for non-geeks to use it, because the Mastodon UI essentially imitates Twitter. IRC is simple, reliable, but not federated and hard for non-geeks to use. XMPP is, well, XMPP. Matrix is not simple and not reliable, but it is federated and reasonably easy to use.

So, I suppose, why don’t we just build a Mastodon, but for chat?

(Stay tuned; this isn’t the only thing I’m going to post about this…!)

  1. Which I’m not going to include inline, otherwise this whole blog post is gonna be a string of xkcd comics. 

  2. “XML is crap. Really. There are no excuses. XML is nasty to parse for humans, and it’s a disaster to parse even for computers. There’s just no reason for that horrible crap to exist.” ~ Linus Torvalds 

  3. I use this! Check out (you can follow from your Mastodon if you like!). 

Gustaf Erikson (gerikson)

August September 09, 2019 08:00 PM

September 06, 2019

Mark Fischer (flyingfisch)

Sharpie Stainless Steel Pen Refill Hack September 06, 2019 04:00 AM


If you have owned a Sharpie Stainless Steel Pen, you will know two things. The first thing you’ll know is that it is the most perfect pen ever invented and no other pen will ever compare. The second thing you’ll know, or find out, is that both the pen and the refills have been discontinued by Sharpie.


Ahem. Anyway… after realizing that refills are now $812.91, I decided to try to figure out if I could find an alternative way to refill them. Below are instructions for creating what I call the Sharpie Stainless Steel Pen Frankenfill Mark I.

Stainless Steel Pen Frankenfill Mark I


  • Sharpie Stainless Steel Pen
  • Cross Selectip Porous Point Refill (Model number 8443)

Step 1: Remove the original Sharpie Pen refill

Remove the cap from your Sharpie pen. Grasp the stainless barrel in one hand and the rubber grip in the other, and twist to remove the original Sharpie refill.

Step 2: Disassemble the original Sharpie Pen refill

Grasp the rear cap of the Sharpie refill with a pair of pliers. Pull firmly to remove; it’s press-fit and shouldn’t take much effort. Inside there is a plastic-covered piece of foam with the nib sticking out of it; remove that. Also remove the metal nib by pressing the point down firmly on a hard surface.

Step 3: Insert the Cross Selectip refill into the original Sharpie

Insert the Cross Selectip refill into the carcass of the Sharpie refill. Press firmly until the nib is exposed. A little bit of the back of the Cross refill will be exposed; this is fine. The rear cap on the Sharpie refill can be discarded at this point.

Step 4: Reassemble the pen

At this point, all that’s left is to reassemble the pen.


I’m satisfied with this as a first try and it does give me hope that my Sharpie Stainless Pens have a future. However, although the Cross refill is very pleasant to write with, it has a couple key differences from the original Sharpie pen. The first difference is that the ink is much darker and less fine. The second difference is that it bleeds more. Below is an image comparing the two from the front and back.

Future Plans

I’m currently searching for a felt tip refill that matches the properties of the original Sharpie pen a little more closely and will hopefully be posting an update in the near future. However, I am happy to say that I have found a way to use standard pen refills in the Sharpie Stainless Steel Grip Pen and I’m hoping to get several more years of use out of my current Sharpie pens. Maybe some day Sharpie will bring the pen back to production, but until then I hope this guide helps others wondering how to continue using their Sharpies.

September 04, 2019

Gokberk Yaltirakli (gkbrk)

Tampermonkey is not Open Source September 04, 2019 11:00 PM

This post is meant to be a short remark about something I noticed today. It is about Tampermonkey, a browser extension for managing User scripts.

Tampermonkey, like Greasemonkey, lets you create and run User scripts. These are small JavaScript snippets that can execute on any URL you want. You can consider them mini-extensions.
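For illustration, a User script is just plain JavaScript topped with a metadata comment block that tells the manager where to run it. The script below is entirely made up (name, match rule, and helper are mine, not from Tampermonkey’s docs):

```javascript
// ==UserScript==
// @name     Strip tracking params (hypothetical example)
// @match    *://*/*
// ==/UserScript==

// Pure helper: drop common tracking parameters from a URL string.
function stripTracking(url) {
  const u = new URL(url);
  for (const p of ["utm_source", "utm_medium", "utm_campaign"]) {
    u.searchParams.delete(p);
  }
  return u.toString();
}

// In the browser, the script would rewrite every link on the page:
// document.querySelectorAll("a[href]").forEach(a => { a.href = stripTracking(a.href); });
```

The manager (Tampermonkey or Greasemonkey) reads the `==UserScript==` block and injects the script on matching URLs.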

Tampermonkey came out as a Greasemonkey alternative for Chrome. Since writing User scripts is a “developer” thing to do, and pretty much all software development tools are open source, Tampermonkey was open source too. This quickly allowed it to gather a user base and get contributions from the community.

At some point (around version 2.9), they switched their project to a proprietary license. They stopped taking contributions of course, and the old version is still on GitHub, so there is no license violation. This post isn’t meant to be a “grab your pitchforks there’s proprietary software” post against Tampermonkey either. I just know that some people don’t want to run non-free software, and their browser might’ve auto-updated to a non-free version.

This was certainly the case for me. I remembered Tampermonkey as an open-source user script manager and started using it; it took me a while to realize the license situation. While this information was available on the Firefox add-ons page, I think it should be more prominent in the install process.

After some time, and with developments like major browsers all implementing the same extension API, Tampermonkey took its place on most Add-On stores. I believe for Firefox at least this was after the license change, so people on FF shouldn’t have had unexpected non-free software.

September 02, 2019

Pete Corey (petecorey)

Animating a Canvas with Phoenix LiveView September 02, 2019 12:00 AM

Phoenix LiveView recently released a new feature called “hooks” that introduces Javascript interoperability into the LiveView lifecycle. Put simply, we can now run arbitrary Javascript every time a DOM node is changed by LiveView! LiveView hooks are a complete game changer, and open the doors to a whole new world of applications that can be built with this amazing technology.

As a proof of concept, let’s use LiveView hooks to animate an HTML5 canvas in real time using data provided by the server!

Getting Set Up

To keep this article short(er), we’ll skip the rigmarole of configuring your application to use LiveView. If you need help with this step, I highly recommend you check out Sophie DeBenedetto’s thorough walkthrough. Be sure to cross reference with the official documentation, as things are moving quickly in the LiveView world.

Moving forward, let’s assume that you have a bare-bones LiveView component attached to a route that looks something like this:

defmodule LiveCanvasWeb.PageLive do
  use Phoenix.LiveView

  def render(assigns) do
    ~L"""
    <canvas>
      Canvas is not supported!
    </canvas>
    """
  end

  def mount(_session, socket) do
    {:ok, socket}
  end
end

We’ll also assume that your assets/js/app.js file is creating a LiveView connection:

import LiveSocket from "phoenix_live_view";

let liveSocket = new LiveSocket("/live");
liveSocket.connect();


Now that we’re on the same page, let’s get started!

Generating Data to Animate

Before we start animating on the client, we should have some data to animate. We’ll start by storing a numeric value called i in our LiveView process’ assigns:

def mount(_session, socket) do
  {:ok, assign(socket, :i, 0)}
end

Next, we’ll increase i by instructing our LiveView process to send an :update message to itself after a delay of 16 milliseconds:

def mount(_session, socket) do
  Process.send_after(self(), :update, 16)
  {:ok, assign(socket, :i, 0)}
end

When we handle the :update message in our process, we’ll schedule another recursive call to :update and increment the value of i in our socket’s assigns:

def handle_info(:update, %{assigns: %{i: i}} = socket) do
  Process.send_after(self(), :update, 16)
  {:noreply, assign(socket, :i, i + 0.05)}
end

Our LiveView process now has an i value that’s slowly increasing by 0.05 approximately sixty times per second.

Now that we have some data to animate, let’s add a canvas to our LiveView’s template to hold our animation:

def render(assigns) do
  ~L"""
  <canvas data-i="<%= @i %>">
    Canvas is not supported!
  </canvas>
  """
end

Notice that we’re associating the value of i with our canvas by assigning it to a data attribute on the DOM element. Every time i changes in our process’ state, LiveView will update our canvas and set the value of data-i to the new value of i.

This is great, but to render an animation in our canvas, we need some way of executing client-side Javascript every time our canvas updates. Thankfully, LiveView’s new hook functionality lets us do exactly that!

Hooking Into LiveView

LiveView hooks let us execute Javascript at various points in a DOM node’s lifecycle, such as when the node is first mounted, when it’s updated by LiveView, when it’s destroyed and removed from the DOM, and when it becomes disconnected from or reconnected to our Phoenix server.

To hook into LiveView’s client-side lifecycle, we need to create a set of hooks and pass them into our LiveSocket constructor. Let’s create a hook that initializes our canvas’ rendering context when the element mounts, and renders a static circle every time the element updates:

let hooks = {
  canvas: {
    mounted() {
      let canvas = this.el;
      let context = canvas.getContext("2d");
      Object.assign(this, { canvas, context });
    },
    updated() {
      let { canvas, context } = this;
      let halfHeight = canvas.height / 2;
      let halfWidth = canvas.width / 2;
      let smallerHalf = Math.min(halfHeight, halfWidth);
      context.clearRect(0, 0, canvas.width, canvas.height);
      context.fillStyle = "rgba(128, 0, 255, 1)";
      context.beginPath();
      context.arc(
        halfWidth,
        halfHeight,
        smallerHalf / 16,
        0,
        2 * Math.PI
      );
      context.fill();
    }
  }
};

let liveSocket = new LiveSocket("/live", { hooks });
liveSocket.connect();


Notice that we’re storing a reference to our canvas and our newly created rendering context on this. When LiveView calls our lifecycle callbacks, this points to an instance of a ViewHook class. A ViewHook instance holds references to our provided lifecycle methods, a reference to the current DOM node in el, and various other pieces of data related to the current set of hooks. As long as we’re careful and we don’t overwrite these fields, we’re safe to store our own data in this.

Next, we need to instruct LiveView to attach this new set of canvas hooks to our canvas DOM element. We can do that with the phx-hook attribute:

<canvas phx-hook="canvas" data-i="<%= @i %>">
  Canvas is not supported!
</canvas>

When our page reloads, we should see our circle rendered gloriously in the center of our canvas.

Resizing the Canvas

On some displays, our glorious circle may appear to be fuzzy or distorted. This can be fixed by scaling our canvas to match the pixel density of our display. While we’re at it, we might want to resize our canvas to fill the entire available window space.

We can accomplish both of these in our mounted callback:

mounted() {
  let canvas = this.el;
  let context = canvas.getContext("2d");
  let ratio = getPixelRatio(context);
  resize(canvas, ratio);
  Object.assign(this, { canvas, context });
}

Where getPixelRatio is a helper function that determines the ratio of physical pixels in the current device’s screen to “CSS pixels” which are used within the rendering context of our canvas:

const getPixelRatio = context => {
  var backingStore =
    context.backingStorePixelRatio ||
    context.webkitBackingStorePixelRatio ||
    context.mozBackingStorePixelRatio ||
    context.msBackingStorePixelRatio ||
    context.oBackingStorePixelRatio ||
    context.backingStorePixelRatio ||
    1;
  return (window.devicePixelRatio || 1) / backingStore;
};

And resize is a helper function that modifies the canvas’ width and height attributes in order to resize our canvas to fit the current window, while fixing any pixel density issues we may be experiencing:

const resize = (canvas, ratio) => {
  canvas.width = window.innerWidth * ratio;
  canvas.height = window.innerHeight * ratio;
  canvas.style.width = `${window.innerWidth}px`;
  canvas.style.height = `${window.innerHeight}px`;
};

Unfortunately, our canvas doesn’t seem to be able to hold onto these changes. Subsequent calls to our updated callback seem to lose our resize changes, and the canvas reverts back to its original, blurry self. This is because when LiveView updates our canvas DOM node, it resets our width and height attributes. Not only does this revert our pixel density fix, it also forcefully clears the canvas’ rendering context.

LiveView has a quick fix for getting around this problem. By setting phx-update to "ignore" on our canvas element, we can instruct LiveView to leave our canvas element alone after its initial mount.

<canvas phx-hook="canvas" phx-update="ignore" data-i="<%= @i %>">
  Canvas is not supported!
</canvas>

Now our circle should be rendered crisply in the center of our screen.

Animating Our Circle

We didn’t go all this way to render a static circle in our canvas. Let’s tie everything together and animate our circle based on the ever-changing values of i provided by the server!

The first thing we’ll need to do is update our updated callback to grab the current value of the data-i attribute:

let i = JSON.parse(canvas.dataset.i);

The value of canvas.dataset.i will reflect the contents of our data-i attribute. All data attributes are stored as strings, so a call to JSON.parse will convert a value of "0.05" to its numeric counterpart.

Next, we can update our rendering code to move our circle based on the value of i:

context.arc(
  halfWidth + (Math.cos(i) * smallerHalf) / 2,
  halfHeight + (Math.sin(i) * smallerHalf) / 2,
  smallerHalf / 16,
  0,
  2 * Math.PI
);

That’s it! With those two changes, our circle will rotate around the center of our canvas based entirely on real-time data provided by our server!

Requesting Animation Frames

Our solution works, but by forcing re-renders on the browser, we’re being bad net citizens. Our client may be forcing re-renders when its tab is out of focus, or it may be re-rendering more than sixty times per second, wasting CPU cycles.

Instead of telling the browser to re-render our canvas on every LiveView update, we should invert our control over rendering and request an animation frame from the browser on every update.

The process for this is straightforward. In our updated callback, we’ll wrap our rendering code in a lambda passed into requestAnimationFrame. We’ll save the resulting request reference to this.animationFrameRequest:

this.animationFrameRequest = requestAnimationFrame(() => {
  context.clearRect(0, 0, canvas.width, canvas.height);
  context.beginPath();
  context.arc(
    halfWidth + (Math.cos(i) * smallerHalf) / 2,
    halfHeight + (Math.sin(i) * smallerHalf) / 2,
    smallerHalf / 16,
    0,
    2 * Math.PI
  );
  context.fill();
});

It’s conceivable that our LiveView component may update multiple times before our browser is ready to re-render our canvas. In those situations, we’ll need to cancel any previously requested animation frames, and re-request a new frame. We can do this by placing a guard just above our call to requestAnimationFrame:

if (this.animationFrameRequest) {
  cancelAnimationFrame(this.animationFrameRequest);
}

With those two changes, our LiveView hooks will now politely request animation frames from the browser, resulting in a smoother experience for everyone involved.

Taking it Further

Using a canvas to animate a numeric value updated in real-time by a LiveView process running on the server demonstrates the huge potential power of LiveView hooks, but it’s not much to look at.

We can take things further by generating and animating a much larger set of data on the server. Check out this example project that simulates over two hundred simple particles, and renders them on the client at approximately sixty frames per second:

Is it a good idea to take this approach if your goal is to animate a bunch of particles on the client? Probably not. Is it amazing that LiveView gives us the tools to do this? Absolutely, yes! Be sure to check out the entire source for this example on Github!

Hooks have opened the doors to a world of new possibilities for LiveView-based applications. I hope this demonstration has given you a taste of those possibilities, and I hope you’re as eager as I am to explore what we can do with LiveView moving forward.

Update: 9/30/2019

The technique of using both phx-hook and phx-update="ignore" on a single component no longer works as of phoenix_live_view version 0.2.0. The "ignore" update rule causes our hook’s updated callback to not be called with updates.

Joxy pointed this issue out to me, and helped me come up with a workaround. The solution we landed on is to wrap our canvas component in another DOM element, like a div. We leave our phx-update="ignore" on our canvas to preserve our computed width and height attributes, but move our phx-hook and data attributes to the wrapping div:

<div phx-hook="canvas" data-particles="<%= Jason.encode!(@particles) %>">
  <canvas phx-update="ignore">
    Canvas is not supported!
  </canvas>
</div>

In the mounted callback of our canvas hook, we need to look to the first child of our div to find our canvas element:

mounted() {
  let canvas = this.el.firstElementChild;

Finally, we need to pass a reference to a Phoenix Socket directly into our LiveSocket constructor to be compatible with our new version of phoenix_live_view:

import { Socket } from "phoenix";
let liveSocket = new LiveSocket("/live", Socket, { hooks });

And that’s all there is to it! Our LiveView-powered confetti generator is back up and running with the addition of a small layer of markup. For more information on this update, be sure to check out this issue I filed to try to get clarity on the situation. And I’d like to give a huge thanks to Joxy for doing all the hard work in putting this fix together!

August 31, 2019

Jeff Carpenter (jeffcarp)

Things I Learned as a First-Time Intern Host August 31, 2019 06:13 PM

I hosted an intern for the first time this summer. It was my first time being somebody’s manager and it became a huge learning experience for me as well as a really fun time. My intern worked on adding many features to velocity-tracking charts, rewriting both of our ML models in TensorFlow 2.0, and a few other projects. Here are the biggest areas where I struggled as a host and the important lessons I took away from those experiences.

Frederic Cambus (fcambus)

My OpenBSD commits August 31, 2019 05:45 PM

Today marks my three year anniversary as an OpenBSD developer. I got my commit bit on August 31st, 2016 during the g2k16 hackathon in Cambridge, UK.

A few months ago, I came across a Perl one-liner script to produce commit time distribution ASCII graphs from a Git repository, and I finally have a good pretext to run it :-)
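I have not seen the exact one-liner, but the core of such a script (bucket commits by hour of day, then print bars scaled to the busiest hour) can be sketched like this, in JavaScript rather than Perl; the function name is mine:

```javascript
// Bucket commit hours (0-23) and render ASCII bars scaled so the
// busiest hour gets `width` asterisks, in the spirit of the graphs below.
function hourHistogram(hours, width = 50) {
  const counts = new Array(24).fill(0);
  for (const h of hours) counts[h] += 1;
  const max = Math.max(...counts, 1);
  return counts.map((count, hour) =>
    `${String(hour).padStart(2, "0")} - ${String(count).padStart(4)} ` +
    "*".repeat(Math.round((count / max) * width))
  );
}
```

In practice the input would come from something like `git log --date=format:%H --format=%ad`, piped in one hour per line.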

As of this day, I have done 749 commits to OpenBSD, in the following repositories: src (127), ports (596), www (24), and xenocara (2).

Commits in the src repository:

00 -    0
01 -    0
02 -    0
03 -    0
04 -    0
05 -    0
06 -    1 ***
07 -    4 **************
08 -    8 ****************************
09 -    9 ********************************
10 -   13 **********************************************
11 -    9 ********************************
12 -   10 ***********************************
13 -   11 ***************************************
14 -   13 **********************************************
15 -    4 **************
16 -    5 *****************
17 -    6 *********************
18 -    4 **************
19 -    9 ********************************
20 -   14 **************************************************
21 -    4 **************
22 -    3 **********
23 -    0

Commits in the ports repository:

00 -    1
01 -    0
02 -    0
03 -    0
04 -    0
05 -    2 *
06 -   14 **********
07 -   32 ***********************
08 -   34 *************************
09 -   67 **************************************************
10 -   46 **********************************
11 -   53 ***************************************
12 -   40 *****************************
13 -   38 ****************************
14 -   34 *************************
15 -   34 *************************
16 -   35 **************************
17 -   20 **************
18 -   15 ***********
19 -   24 *****************
20 -   34 *************************
21 -   43 ********************************
22 -   19 **************
23 -   11 ********

Commits in the www repository:

00 -    0
01 -    0
02 -    0
03 -    0
04 -    0
05 -    0
06 -    0
07 -    1 ************
08 -    0
09 -    3 *************************************
10 -    2 *************************
11 -    4 **************************************************
12 -    0
13 -    2 *************************
14 -    1 ************
15 -    1 ************
16 -    1 ************
17 -    1 ************
18 -    1 ************
19 -    3 *************************************
20 -    3 *************************************
21 -    1 ************
22 -    0
23 -    0

Commits in the xenocara repository:

00 -    0
01 -    0
02 -    0
03 -    0
04 -    0
05 -    0
06 -    0
07 -    0
08 -    0
09 -    0
10 -    0
11 -    0
12 -    0
13 -    0
14 -    1 **************************************************
15 -    0
16 -    0
17 -    0
18 -    0
19 -    0
20 -    0
21 -    1 **************************************************
22 -    0
23 -    0

August 30, 2019

Andreas Zwinkau (qznc)

Mindstorms August 30, 2019 12:00 AM

A book about a revolution of education which sadly never happened.

Read full article!

Robin Schroer (sulami)

Why I like Clojure August 30, 2019 12:00 AM

This is somewhat of a response to Uncle Bob’s post of similar nature, which I would say has gotten a mixed to positive reception. I had planned a similar post a week or two before the release of his, but archived the idea upon reading his post. But after having read it over a couple of times, I have now decided that I still have something meaningful to write. What follows are the purely subjective reasons for which I enjoy using Clojure. Some have criticised Bob for being very absolute and not giving up any screen real estate for more nuanced viewpoints, something I will try to avoid.


There is not much to say about this that has not already been said. The homoiconicity, meaning code can be represented as a data structure inside the same language, and extensibility through macros, which can modify both the evaluation order and the very syntax, to the point where it looks more like LaTeX than Lisp. (Racket is actually a great example of the flexibility of Lisp: a language designed to build other languages in it, call it a meta-language. Even package import can be wrapped up in a “language” trivially, meaning that you can essentially write a tiny DSL for every project. Not saying that is necessarily a good idea, but you can.)

This also means that you are not stuck with a paradigm. While OO seems to be out, and FP the new hotness, Lisp can do them all, and historically often did before anyone else. Bob mentions dynamic typing in his signature retorts to (I am guessing) fictional counter-arguments, and he is right to mention clojure.spec, a library for gradual typing (omitting schema, an alternative). Racket has a fully typed variant; there is something that is basically Haskell in a Lisp-bun; and let us not forget that there is actually Typed Clojure, with static type checking and all. (There are some issues with it, namely that the coverage is not all that great, but it exists and works, meaning if you really need static types, you can get them.)

Being able to generate code without being stuck on a “dumb” level by generating strings and passing them into eval like for example in Python allows for sane hyper-dynamic programming, where the program adapts itself to the conditions. Being able to read and write code in a safe manner enables extremely powerful tooling in the Lisp world. Linters are very smart, because reading code into a data structure is trivial, usually a one-liner, and there are many more tools to automatically rewrite and refactor code than for other languages.
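For contrast, most host languages need a dedicated library to get a comparable code-as-data view. Here is a small illustrative sketch using Python's ast module (the snippet and names are mine, not from the post):

```python
import ast

# Parse a string of code into a data structure we can inspect and rewrite,
# which is roughly what Lisp's reader gives you for free.
tree = ast.parse("f(x) + 1")

# The expression is now a tree: a BinOp node whose left side is a Call node.
node_types = [type(node).__name__ for node in ast.walk(tree)]
```

In a Lisp, of course, this structure is the same notation you write programs in, which is what makes the tooling described above so cheap to build.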

Now I do not want to discount the Lisp Curse, it is a real thing, and one of the reasons that while Lisp has stuck around for over half a century, it has not made it into the mainstream. The other main factor is probably the performance problems due to the gap between software and hardware architecture. (There were Lisp machines, which had hardware tailored towards running Lisp, but they never took off either.)

But with the advent of the internet, ecosystems like GitHub, and hardware that is fast enough that we consider running basically full web browsers for half of our applications (looking at you, Slack), I think that these issues have become surmountable.

Do I even need to mention how incredibly useful the REPL and hot-loading code into a running system are?


Clojure is explicitly designed as a hosted language, which I think was a very good move. If you are writing a new language today, it might be better than the established ones, but the cost of leaving an existing ecosystem of libraries and Stack Overflow answers just because the new language is 5% nicer is not a trade off many people will want to make. Clojure being hosted and having excellent interoperability with its host platform means it can benefit from the existing ecosystem, let alone platform implementations. (Do you really want to implement your runtime for FreeBSD on a smart toaster oven? Raspberry Pis are non-x86, BSD is not Linux, and who knows what is up with Windows. This matrix is growing quickly.)

While the primary platform is the JVM, superbly uncool but stable and relatively performant, there is a CLR (.NET) version which is “almost even on features” thanks to David Miller, as well as a very mature JavaScript version in the shape of ClojureScript. The JVM (and to some extent the CLR) have excellent support from big software vendors; if you buy some kind of software with an API, chances are there is a Java SDK which you can use easily from your Clojure code. The JavaScript ecosystem is the largest in numbers (in part due to left-pad-like five-line packages, but still), and includes Electron and React Native, both of which can be used with some, but not unreasonable, effort from ClojureScript code. One of the newest additions has been GraalVM, which while not 100% feature-complete yet, already allows compilation to native static binaries of many Clojure programs (Zprint is one of those CLI tools that takes advantage of the reduced startup time), running without the JVM at all, and doing away with the dreaded multi-second startup time. (I am planning to write a piece about GraalVM some time later this year.)

The platform split could have been one of the big, curse-like problems for Clojure, but there is Clojure Common, used by many popular libraries, which allows you to write platform-independent code by using conditional branching for all platform-specific code.


Despite all the positive points I mentioned, Clojure is still a niche language, and in some way that is good as well. Sure, finding jobs is harder (though large companies like Walmart and CircleCI (my employer) are Clojure shops, so it is far less obscure than one might think), but not impossible. Clojure developers, like for example Haskell or Rust ones, tend to be more experienced, as it is not a typical first language, and requires a certain interest in the craft. Many Clojure developers have written widely used tools and libraries, not just in Clojure, but also for example for Emacs, which is understandably quite popular with Clojurists.

Rich Hickey himself, the BDFL of Clojure, is someone with decades of industry experience and a desire to get it right. I think he is doing a pretty good job. While there are some small inconsistencies in places, the bulk of the language, and all the important parts, are very well thought out. (We can also see right now how clojure.spec is being adapted after community feedback to the first alpha version, which has been available for about 1½ years.)

Clojure is a very stable language, which means that smaller problems will stick around for a while, but also means you can trust that your code will not break every time you update your dependencies.

In the end, it comes down to enjoyment. I enjoy working with Clojure. I feel like there is a lot to learn, and the language is inviting me to explore beyond the current possibilities of software development. I feel good about the elegant and concise solutions (concise ≠ obtuse) I can come up with. It has changed the way I think, in a good way.

August 26, 2019

Pete Corey (petecorey)

Prime Parallelograms August 26, 2019 12:00 AM

A few weeks ago I wrote about an interesting graph of numbers recently investigated by the Numberphile crew. We used it as an opportunity to journey into the world of agendas and gerunds by implementing the graph using the J programming language.

The second half of that same video outlines another interesting number series which has a similarly interesting implementation in J. Let’s try our hand at plotting it.

The basic idea behind calculating the series of numbers in question is to take any positive prime, represent it in base two, reverse the resulting sequence of bits, and subtract the reversed number from the original number in terms of base ten.
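Before the J version, the computation can be sketched in a few lines of Python (purely illustrative, not from the original post):

```python
def f(p):
    """Subtract, in base ten, the bit-reversed value of p from p."""
    bits = bin(p)[2:]                    # base-two digits, e.g. 13 -> '1101'
    reversed_value = int(bits[::-1], 2)  # reverse the bits, read back as base two
    return p - reversed_value

f(13)  # 13 is 1101; reversed that is 1011, i.e. 11; 13 - 11 = 2
```

Note that primes whose binary representation is a palindrome map to zero.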

Implemented as a tacit, monadic verb in J, this would look something like:

f =: ] - [: #. [: |. #:

Our verb, f, is (=:) the given number (]) minus (-) the base two (#.) of the reverse (|.) of the antibase two (#:) of the given number.

We can plot the result of applying f to the first ten thousand primes (p: i. 10000) like so:

require 'plot'
'type dot' plot f"0 p: i. 10000

If we’re feeling especially terse, we could write this as an almost-one-liner by substituting our implementation of f in place of f:

require 'plot'
'type dot' plot (] - [: #. [: |. #:)"0 p: i. 10000

Our implementation of f is a “train of verbs”, which is to say, a collection of verbs that compose together into hooks and forks. We can visualize this composition by looking at the “boxed” representation of our train:

┌─┬─┬──────────────────┐
│]│-│┌──┬──┬──────────┐│
│ │ ││[:│#.│┌──┬──┬──┐││
│ │ ││  │  ││[:│|.│#:│││
│ │ ││  │  │└──┴──┴──┘││
│ │ │└──┴──┴──────────┘│
└─┴─┴──────────────────┘

From right to left, J greedily groups verbs into three-verb forks, potentially followed by a final two-verb hook if the total number of verbs in the train is even.

We can see that the first fork, [: |. #:, is a capped fork, which means it’s roughly equivalent to |. @: #:. In the monadic case, this fork takes its argument, converts it to a base two list of ones and zeroes, and reverses that list. Let’s refer to this newly composed verb as a, moving forward.

The next fork in our train, [: #. a, is another capped fork building off of our previous fork. Again, this could be expressed using the @: conjunction to compose #. and a together: #. @: a. In the monadic case, this fork takes its argument, converts it to a reversed binary representation with a, and then converts that reversed list of ones and zeroes back to base ten with #.. Let’s call this newly composed verb b.

Our final fork, ] - b, runs our monadic input through b to get the base ten representation of our reversed binary, and subtracts it from the original argument.

If we wanted to make J’s implicit verb training explicit, we could define a, b, and our final f ourselves:

a =: [: |. #:
b =: [: #. a
f =: ] - b

But why go through all that trouble? Going the explicit route feels like a natural tendency to me, coming from a background of more traditional programming languages, but J’s implicit composition opens up a world of interesting readability properties.

I’m really fascinated by this kind of composition, and I feel like it’s what makes J really unique. I’ll never pass up an opportunity to try implementing something as a train of verbs.

August 20, 2019

Pete Corey (petecorey)

TIL About Node.js’ REPL Module August 20, 2019 12:00 AM

Today I learned that Node.js ships with a repl module that can be used to spin up a full-featured REPL on any Node.js process. This can be a fantastic tool for debugging a running server, or manually triggering back-end events.

Let’s assume that we’ve built a Node.js server whose entry point is a server.js file. Let’s also assume that we have a constant (maybe pulled from our environment, maybe elsewhere) called REPL whose truthiness determines whether we should start our REPL instance on standard in. Spinning up our REPL is as easy as:

if (REPL) {
    require('repl').start('> ');
}

Once our server starts up, we’ll be greeted by a familiar prompt:

Starting server...
Listening on localhost:8080!

Fantastic! Normal REPL rules apply. Our server will continue to run and its output will continue to stream to standard out. Our REPL prompt will stick to the bottom of the tail, as expected.

More advanced options can be gleaned from the repl documentation. Happy REPLing!

August 19, 2019

Pete Corey (petecorey)

Animating a Canvas with React Hooks August 19, 2019 12:00 AM

A recent React-based client project of mine required an HTML5 canvas animation. Already knee-deep in the new React hooks API, I decided to forgo the “traditional” (can something be “traditional” after just five years?) technique of using componentDidMount and componentWillUnmount in a class-based component, and try my hand at rendering and animating a canvas using React’s new useEffect hook.

Let’s dig into it!

Let’s set the scene by creating a new React component that we want to add our canvas to. We’ll assume that we’re trying to render a circle, so we’ll call our new component, Circle:

import React from 'react';

const Circle = () => {
    return (
        <canvas
            style={{ width: '100px', height: '100px' }}
        />
    );
};

export default Circle;

So far so good.

Our Circle component renders a canvas onto the page, but we have no way of interacting with it. Typically, to interact with an element of the DOM from within a React component, you need a “ref”. The new React hooks API gives us a convenient way to create and use refs:

 import React from 'react';
+import { useRef } from 'react';

 const Circle = () => {
+    let ref = useRef();
     return (
         <canvas
+            ref={ref}
             style={{ width: '100px', height: '100px' }}
         />
     );
 };

 export default Circle;

Now ref.current holds a reference to our canvas DOM node.

Interacting with our canvas produces “side effects”, which isn’t allowed from within the render phase of a React component. Thankfully, the new hooks API gives us a simple way to introduce side effects into our components via the useEffect hook.

 import React from 'react';
+import { useEffect } from 'react';
 import { useRef } from 'react';

 const Circle = () => {
     let ref = useRef();

+    useEffect(() => {
+        let canvas = ref.current;
+        let context = canvas.getContext('2d');
+        context.beginPath();
+        context.arc(50, 50, 50, 0, 2 * Math.PI);
+        context.fill();
+    });

     return (
         <canvas
             ref={ref}
             style={{ width: '100px', height: '100px' }}
         />
     );
 };

 export default Circle;

Our useEffect callback is free to interact directly with our canvas living in the DOM. In this case, we’re drawing a circle to our canvas.

Unfortunately, our circle may look a little fuzzy or distorted, depending on the pixel density of the screen it’s being viewed on. Let’s fix that by adjusting the scaling of our canvas:

 import React from 'react';
 import { useEffect } from 'react';
 import { useRef } from 'react';

+const getPixelRatio = context => {
+    var backingStore =
+        context.backingStorePixelRatio ||
+        context.webkitBackingStorePixelRatio ||
+        context.mozBackingStorePixelRatio ||
+        context.msBackingStorePixelRatio ||
+        context.oBackingStorePixelRatio ||
+        1;
+    return (window.devicePixelRatio || 1) / backingStore;
+};

 const Circle = () => {
     let ref = useRef();

     useEffect(() => {
         let canvas = ref.current;
         let context = canvas.getContext('2d');

+        let ratio = getPixelRatio(context);
+        let width = getComputedStyle(canvas)
+            .getPropertyValue('width')
+            .slice(0, -2);
+        let height = getComputedStyle(canvas)
+            .getPropertyValue('height')
+            .slice(0, -2);
+        canvas.width = width * ratio;
+        canvas.height = height * ratio;
+        canvas.style.width = `${width}px`;
+        canvas.style.height = `${height}px`;

         context.beginPath();
-        context.arc(50, 50, 50, 0, 2 * Math.PI);
+        context.arc(
+            canvas.width / 2,
+            canvas.height / 2,
+            canvas.width / 2,
+            0,
+            2 * Math.PI
+        );
         context.fill();
     });

     return (
         <canvas
             ref={ref}
             style={{ width: '100px', height: '100px' }}
         />
     );
 };

 export default Circle;

And with that, our circle should be crystal clear.

Now let’s introduce some animation. The standard way of animating an HTML5 canvas is using the requestAnimationFrame function to repeatedly call a function that renders our scene. Before we do that, we need to refactor our circle drawing code into a render function:

 useEffect(() => {
+    const render = () => {
         context.beginPath();
         context.arc(
             canvas.width / 2,
             canvas.height / 2,
             canvas.width / 2,
             0,
             2 * Math.PI
         );
         context.fill();
+    };
+
+    render();
 });

Now that we have a render function, we can instruct the browser to recursively call it whenever it’s appropriate to render another frame:

 const render = () => {
     context.beginPath();
     context.arc(
         canvas.width / 2,
         canvas.height / 2,
         canvas.width / 2,
         0,
         2 * Math.PI
     );
     context.fill();
+    requestAnimationFrame(render);
 };

This works, but there are two problems. First, if our component unmounts after our call to requestAnimationFrame, but before our render function is called, it can lead to problems. We should really cancel any pending animation frame requests any time our component unmounts. Thankfully, requestAnimationFrame returns a request identifier that can be passed to cancelAnimationFrame to cancel our request.

The useEffect hook optionally expects a function to be returned by our callback. This function will be called to handle any cleanup required by our effect. Let’s refactor our animation loop to properly clean up after itself:

 useEffect(() => {
+    let requestId;
     const render = () => {
         context.beginPath();
         context.arc(
             canvas.width / 2,
             canvas.height / 2,
             canvas.width / 2,
             0,
             2 * Math.PI
         );
         context.fill();
+        requestId = requestAnimationFrame(render);
     };
     render();

+    return () => {
+        cancelAnimationFrame(requestId);
+    };
 });


Our second problem is that our render function isn’t doing anything. We have no visual indicator that our animation is actually happening!

Let’s change that and have some fun with our circle:

 let requestId,
+    i = 0;
 const render = () => {
+    context.clearRect(0, 0, canvas.width, canvas.height);
     context.beginPath();
     context.arc(
         canvas.width / 2,
         canvas.height / 2,
-        canvas.width / 2,
+        (canvas.width / 2) * Math.abs(Math.cos(i)),
         0,
         2 * Math.PI
     );
     context.fill();
+    i += 0.05;
     requestId = requestAnimationFrame(render);
 };

Beautiful. Now we can clearly see that our animation is running in full swing. This was obviously a fairly contrived example, but hopefully it serves as a helpful recipe for you in the future.


August 18, 2019

David Wilson (dw)

Mitogen v0.2.8 released August 18, 2019 08:45 PM

Mitogen for Ansible v0.2.8 has been released. This version (finally) supports Ansible 2.8, comes with a supercharged replacement fetch module, and includes roughly 85% of what is needed to implement fully asynchronous connect.

As usual a huge slew of fixes are included. This is a bumper release, running to over 20k lines of diff. Get it while it's hot, and as always, bug reports are welcome!

August 16, 2019

Lewis Van Winkle (code)

SQL Query Executing Twice in PHP August 16, 2019 12:00 AM

August 15, 2019

Pepijn de Vos (pepijndevos)

Open Source Formal Verification in VHDL August 15, 2019 12:00 AM

I believe in the importance of open source synthesis, and think it’s important that open source tools support both Verilog and VHDL. Even though my GSoC proposal to add VHDL support to Yosys was rejected, I’ve still been contributing small bits and pieces to GHDL and its Yosys plugin.

This week we reached what I think is an important milestone: I was able to synthesize my VHDL CPU and then formally verify its ALU using completely open source tools (and then synthesize it for an FPGA that is not yet supported by open source tools). There is a lot to unpack here, so let’s jump in.

Yosys, Nextpnr, SymbiYosys, GHDL, ghdlsynth-beta

Yosys is an open source synthesis tool that is quickly gaining momentum and supporting more and more FPGAs. Yosys currently supports Verilog, and turns that into various low-level netlist representations.

Nextpnr is a place-and-route tool, which takes a netlist and turns it into a bitstream for any of the supported FPGA types. These bitstream formats are not publicly documented, so this is a huge reverse-engineering effort.

SymbiYosys is a tool based around Yosys and various SAT solvers to let you do formal verification on your code. More on formal verification later. But important to know is that it works on the netlists generated by Yosys.

GHDL is an open source VHDL simulator, and as far as I know, the only one of its kind. VHDL is notoriously hard to parse, so many other open source attempts at VHDL simulation and synthesis have faltered. Work is underway to add synthesis to GHDL.

And last but not least, ghdlsynth-beta is a plugin for Yosys that converts the synthesis format of GHDL to the intermediate representation of Yosys, allowing it to be synthesized to various netlist formats and used for FPGA, ASIC, formal verification, and many other uses. It is currently a separate repository, but the goal is to eventually upstream it into Yosys.

Formal Verification

I think formal verification sounds harder and more scary than it is. An alternative description is property testing with a SAT solver. Think Quickcheck, not Coq. This is much simpler and less formal than using a proof assistant.

Basically you describe properties about your code, and SymbiYosys compiles your code and properties to a netlist, and from a netlist to Boolean logic. A SAT solver is then used to find inputs to your code that violate the properties you described. This does not “prove” that your code is correct, but it proves that it satisfies the properties you defined.

In hardware description languages you describe properties by assertions and assumptions. An assumption constrains what the SAT solver can consider as valid inputs to your program, and assertions are things you believe to be true about your code.

The powerful thing about formal verification is that it considers all valid inputs at every step, and not just the happy case you might test in simulation. It will find so many edge cases it’s not even funny. Once you get the hang of it, it’s actually less work than writing a testbench. Just a few assertions in your code and the bugs come flying at you.
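As a loose analogy (none of this is SymbiYosys itself), the exhaustive spirit of it fits in a few lines of Python: enumerate every input and hunt for one that violates the asserted property, which the SAT solver does symbolically and far more efficiently:

```python
def find_counterexample(prop, bits=8):
    """Check a property against every possible input, a tiny bounded check."""
    for x in range(2 ** bits):
        if not prop(x):
            return x      # a concrete input that violates the property
    return None           # the property holds for all inputs

find_counterexample(lambda x: x ^ x == 0)          # holds for every input
find_counterexample(lambda x: (x + 1) % 256 > x)   # fails: increment wraps at 255
```

The edge case it returns, like 255 above, is exactly the kind of counterexample trace a solver hands you.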

If you want to learn more about formal verification, Dan Gisselquist has a large number of articles and tutorials about it, mainly using Verilog.


To play along at home, you need to install a fair number of programs, so you had better get some of your favourite hot beverage.

At this point you should be able to run ghdl --synth foo.vhd -e foo which will output a VHDL representation of the synthesized netlist. You should be able to run yosys -m ghdl and use the Yosys command ghdl foo.vhd -e foo to obtain a Yosys netlist which you can then show, dump, synth, or even write_verilog.

Verifying a bit-serial ALU

To demonstrate how formal verification works and why it is so powerful, I want to walk you through the verification of the ALU of my CPU.

I’m implementing a bit-serial architecture, which means that my ALU operates on one bit at a time, producing one output bit and a carry. The carry out is then the carry in to the next bit. The logic that produces the output and the carry depends on a 3-bit opcode.

  process(opcode, a, b, ci)
  begin
    case opcode is
      when "000" => -- add
        y <= a xor b xor ci; -- output
        co <= (a and b) or (a and ci) or (b and ci); -- carry
        cr <= '0'; -- carry reset value
      -- [...]
    end case;
  end process;

  process(clk, rst_n)
  begin
    if(rising_edge(clk)) then
      if(rst_n = '0') then
        ci <= cr; -- reset the carry
      else
        ci <= co; -- copy carry out to carry in
      end if;
    end if;
  end process;

Important to note is the carry reset value. For addition, the first bit is added without carry, but for subtraction the carry is 1 because -a = (not a) + 1, and similarly for other different opcodes. So when in reset, the ALU sets the carry in to the reset value corresponding to the current opcode.
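To make the carry-reset point concrete, here is a hedged Python sketch of a bit-serial add/subtract (the function and widths are illustrative, not code from the CPU). For subtraction the b input is inverted and the carry chain is seeded with 1, since -b = (not b) + 1 in two's complement:

```python
def bit_serial_op(a, b, op, width=8):
    """Process one bit per 'clock cycle', chaining the carry between bits."""
    carry = 0 if op == "add" else 1   # the carry reset value depends on the opcode
    y = 0
    for i in range(width):
        abit = (a >> i) & 1
        bbit = ((b >> i) & 1) ^ (1 if op == "sub" else 0)  # invert b to subtract
        y |= (abit ^ bbit ^ carry) << i                    # sum bit
        carry = (abit & bbit) | (abit & carry) | (bbit & carry)
    return y

bit_serial_op(29, 150, "sub")  # 135, the 8-bit two's-complement encoding of -121
```

Seeding the carry with the wrong reset value is precisely the off-by-one the solver finds later in this post.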

So now onward to the verification part. Since VHDL only has assert and none of the SystemVerilog goodies, Property Specification Language is used. (that link contains a good tutorial) PSL not only provides restrict, assume, and cover, but also allows you to express preconditions and sequences.

To make my life easier, I want to restrict valid sequences to those where the design starts in reset, processes 8 bits, goes back to reset, and repeats, so that the reset signal will look like 011111111011111111...

restrict { {rst_n = '0'; (rst_n = '1')[*8]}[+]};

Then, I want to specify that when the ALU is active, the opcode will stay constant. Else you’ll just get nonsense.

assume always {rst_n = '0'; rst_n = '1'} |=>
  opcode = last_op until rst_n = '0';

Note that I did not define any clock or inputs. Just limiting the reset and opcode is sufficient. With those assumptions in place, we can assert what the output should look like. I shift the inputs and outputs into 8-bit registers, and then when the ALU goes into reset, we can verify the output. For example, if the opcode is “000”, the output should be the sum of the two inputs.

assert always {opcode = "000" and rst_n = '1'; rst_n = '0'} |->
  y_sr = a_sr+b_sr;

After adding the other opcodes, I wrapped the whole thing in a generate block so I can turn it off with a generic parameter for synthesis.

formal_gen : if formal generate
  signal last_op : std_logic_vector(2 downto 0);
  signal a_sr : unsigned(7 downto 0);
  signal b_sr : unsigned(7 downto 0);
  signal y_sr : unsigned(7 downto 0);
-- [...]
end generate;

And now all that’s left to do is write the SymbiYosys script and run it. The script just specifies how to compile the files and the settings for the SAT solver. Note that -fpsl is required for reading PSL code in comments, or --std=08 to use VHDL-2008, which supports PSL as part of the core language.

[options]
mode bmc
depth 20

[engines]
smtbmc z3

[script]
ghdl --std=08 alu.vhd -e alu
prep -top alu

[files]
alu.vhd

To load the GHDL plugin, SymbiYosys has to be run as follows:

$ sby --yosys "yosys -m ghdl" -f alu.sby 
SBY 15:02:25 [alu] Removing direcory 'alu'.
SBY 15:02:25 [alu] Copy 'alu.vhd' to 'alu/src/alu.vhd'.
SBY 15:02:25 [alu] engine_0: smtbmc z3
SBY 15:02:25 [alu] base: starting process "cd alu/src; yosys -m ghdl -ql ../model/design.log ../model/design.ys"
SBY 15:02:25 [alu] base: finished (returncode=0)
SBY 15:02:25 [alu] smt2: starting process "cd alu/model; yosys -m ghdl -ql design_smt2.log design_smt2.ys"
SBY 15:02:25 [alu] smt2: finished (returncode=0)
SBY 15:02:25 [alu] engine_0: starting process "cd alu; yosys-smtbmc -s z3 --presat --noprogress -t 20 --append 0 --dump-vcd engine_0/trace.vcd --dump-vlogtb engine_0/trace_tb.v --dump-smtc engine_0/trace.smtc model/design_smt2.smt2"
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Solver: z3
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Checking assumptions in step 0..
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Checking assertions in step 0..
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Checking assumptions in step 9..
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Checking assertions in step 9..
SBY 15:02:25 [alu] engine_0: ##   0:00:00  BMC failed!
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Assert failed in alu: /179
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Writing trace to VCD file: engine_0/trace.vcd
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Writing trace to Verilog testbench: engine_0/trace_tb.v
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Writing trace to constraints file: engine_0/trace.smtc
SBY 15:02:25 [alu] engine_0: ##   0:00:00  Status: FAILED (!)
SBY 15:02:25 [alu] engine_0: finished (returncode=1)
SBY 15:02:25 [alu] engine_0: Status returned by engine: FAIL
SBY 15:02:25 [alu] summary: Elapsed clock time [H:MM:SS (secs)]: 0:00:00 (0)
SBY 15:02:25 [alu] summary: Elapsed process time [H:MM:SS (secs)]: 0:00:00 (0)
SBY 15:02:25 [alu] summary: engine_0 (smtbmc z3) returned FAIL
SBY 15:02:25 [alu] summary: counterexample trace: alu/engine_0/trace.vcd
SBY 15:02:25 [alu] DONE (FAIL, rc=2)

Oh no! We have a bug! Let’s open the trace to see what went wrong.

gtkwave alu/engine_0/trace.vcd

gtkwave trace

So we’re doing a subtraction, and according to my math 29-150=-121 but the ALU output is -122, so we’re off by one. A little head-scratching later, we can see the problem: On the first cycle of the subtraction the carry in is zero rather than one! Why? Because on the previous clock cycle the instruction was exclusive or, which reset the carry in to zero.

Note that this bug would never show up if you did a test bench that executes a fixed instruction from reset. But the SAT solver managed to find a specific sequence of opcodes that cause the carry to be wrong. Awesome.

So how do we fix it? There are two ways. The first is to change the code to asynchronously determine the carry in. The second is to write your code so the opcode is stable before the ALU comes out of reset, which ended up using less logic. In this case we can change the opcode assumption to

assume always {rst_n = '0'; rst_n = '1'} |->
  opcode = last_op until rst_n = '0';

Note that we used the thin arrow |-> rather than the fat arrow |=> now. The fat arrow triggers after the precondition has been met, while the thin arrow overlaps with the end of the precondition. So now we’re saying that when reset became inactive, the opcode is the same as it was while the device was in reset. Let’s try again.

$ sby --yosys "yosys -m ghdl" -f alu.sby 
SBY 15:31:36 [alu] Removing direcory 'alu'.
SBY 15:31:36 [alu] Copy 'alu.vhd' to 'alu/src/alu.vhd'.
SBY 15:31:36 [alu] engine_0: smtbmc z3
SBY 15:31:36 [alu] base: starting process "cd alu/src; yosys -m ghdl -ql ../model/design.log ../model/design.ys"
SBY 15:31:36 [alu] base: finished (returncode=0)
SBY 15:31:36 [alu] smt2: starting process "cd alu/model; yosys -m ghdl -ql design_smt2.log design_smt2.ys"
SBY 15:31:36 [alu] smt2: finished (returncode=0)
SBY 15:31:36 [alu] engine_0: starting process "cd alu; yosys-smtbmc -s z3 --presat --noprogress -t 20 --append 0 --dump-vcd engine_0/trace.vcd --dump-vlogtb engine_0/trace_tb.v --dump-smtc engine_0/trace.smtc model/design_smt2.smt2"
SBY 15:31:36 [alu] engine_0: ##   0:00:00  Solver: z3
SBY 15:31:36 [alu] engine_0: ##   0:00:00  Checking assumptions in step 0..
SBY 15:31:36 [alu] engine_0: ##   0:00:00  Checking assertions in step 0..
SBY 15:31:37 [alu] engine_0: ##   0:00:01  Checking assumptions in step 19..
SBY 15:31:37 [alu] engine_0: ##   0:00:01  Checking assertions in step 19..
SBY 15:31:37 [alu] engine_0: ##   0:00:01  Status: PASSED
SBY 15:31:37 [alu] engine_0: finished (returncode=0)
SBY 15:31:37 [alu] engine_0: Status returned by engine: PASS
SBY 15:31:37 [alu] summary: Elapsed clock time [H:MM:SS (secs)]: 0:00:01 (1)
SBY 15:31:37 [alu] summary: Elapsed process time [H:MM:SS (secs)]: 0:00:01 (1)
SBY 15:31:37 [alu] summary: engine_0 (smtbmc z3) returned PASS
SBY 15:31:37 [alu] DONE (PASS, rc=0)


Debugging tips

It should be said that all of this is very experimental and you are therefore likely to run into bugs and missing features. I would say that at this point it is feasible to write new code and work around GHDL’s current limitations (or fix them!), but running large existing codebases is unlikely to be successful. (but very much the goal!)

When you run into errors, the first step is to find out if it is a bug in the plugin or GHDL itself.

If you see Unsupported(1): instance X of Y., this means the plugin does not know how to translate a GHDL netlist item to Yosys. These are usually pretty easy to fix; see this pull request for an example. Good to know: Id_Sub is defined in ghdlsynth_gates.h, which is a generated file, and module->addSub is defined in rtlil.h.

If you just see ERROR: vhdl import failed., this likely means GHDL crashed. Run GHDL outside Yosys (ghdl --synth) to see the actual error. Usually it’ll show something like some_package: cannot handle IIR_KIND_SOMETHING (mycode.vhd:26:8), which means that some_package in the src/synth part of GHDL can’t handle some language construct yet. This can be anything from a missing operator to whole language constructs, and the fix can be anything from copy-pasting another operator to a serious project. See this pull request for an example of how to add a missing operator.

If it’s not obvious what is going on, it’s time to break out gdb. It’s important to know that in the GHDL repo there is a .gdbinit that you can source inside gdb. This enables catching exceptions and includes utilities for printing IIR values. If you want to debug inside Yosys, it is helpful to first run the program without breakpoints so all the libraries are loaded and gdb understands there is Ada code involved. Then source .gdbinit, set breakpoints, and run again. (note: GHDL/Yosys command line arguments are passed to run and not gdb)

Happy debugging!