Planet Crustaceans

This is a Planet instance for lobste.rs community feeds. To add/update an entry or otherwise improve things, fork this repo.

March 25, 2019

Wesley Moore (wezm)

My First 3 Weeks of Professional Rust March 25, 2019 06:00 AM

For the last 15 years as a professional programmer I have worked mostly with dynamic languages. First Perl, then Python, and for the last 10 years or so, Ruby. I've also been writing Rust on the side for personal projects for nearly four years. Recently I started a new job and for the first time I'm writing Rust professionally. Rust represents quite a shift in language features, development process and tooling. I thought it would be interesting to reflect on that experience so far.

Note that some of my observations are not unique to Rust and would be equally present in other languages like Haskell, Kotlin, or OCaml.

Knowledge

In my first week I ran up pretty hard against the limits of my knowledge of lifetimes in Rust. I was reasonably confident with them conceptually and in their simple application, but our code has some interesting type-driven, zero-copy parsing code that tested my knowledge. When I encountered some compiler errors I was fortunate to have experienced colleagues to ask for help. It's been nice to extend my knowledge and learn as I go.
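As a contrived sketch of the kind of code I mean (not our actual parser), a zero-copy parser returns values that borrow from the input buffer, so the lifetimes tie each parsed value to the data it came from:

// A "parsed" record that borrows from the input rather than copying it.
// The lifetime 'a ties the Record to the buffer it was parsed from.
struct Record<'a> {
    name: &'a str,
}

fn parse<'a>(input: &'a str) -> Option<Record<'a>> {
    // Take everything up to the first comma as the name, without allocating.
    let name = input.split(',').next()?;
    Some(Record { name })
}

fn main() {
    let data = String::from("ferris,42");
    let record = parse(&data).expect("parse failed");
    println!("{}", record.name);
    // The borrow checker rejects any attempt to drop or mutate `data`
    // while `record` is still in use; that is where the interesting
    // compiler errors tend to appear.
}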

Interestingly I had mostly been building things without advanced lifetime knowledge up until this point. I think that sometimes the community puts too much emphasis on some of Rust's more advanced features when citing its learning curve. If you read the book you can get a very long way. Although that will depend on the types of applications or data structures you're trying to build.

Confidence

In my second week I implemented a change to make a certain pattern more ergonomic. It was refreshing to be able to build the initial functionality and then make a project-wide change, confident that, since it still compiled afterwards, I probably hadn't broken anything. I don't think I would have had the confidence to make such a change that early on in the Ruby projects I've worked on previously.

Testing

I cringe whenever I see proponents of statically typed languages say things like, "if it compiles, it works", with misguided certainty. The compiler and language do eliminate whole classes of bugs that you'd need to test for in a dynamic language but that doesn't mean tests aren't needed.

Rust has great built-in support for testing and I've enjoyed being able to write tests focussed solely on the behaviour and logic of my code. In Ruby, by comparison, I have to write tests that ensure there are no syntax errors, that nil is handled safely, and that arguments are correct, in addition to covering the behaviour and logic.
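As a minimal illustration (not from our codebase), Rust lets a unit test live right next to the code it exercises and run with cargo test:

// A function and its tests in the same file; `cargo test` runs them.
fn add_checked(a: u32, b: u32) -> Option<u32> {
    a.checked_add(b)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds_small_numbers() {
        assert_eq!(add_checked(2, 2), Some(4));
    }

    #[test]
    fn overflow_returns_none() {
        assert_eq!(add_checked(u32::max_value(), 1), None);
    }
}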

Editor and Tooling

Neovim is my primary text editor. I've been using vim or a derivative since the early 2000s. I have the RLS set up and working in my Neovim environment but less than a week in I started using IntelliJ IDEA with the Rust and Vim emulation plugins for work. A week after that I started trialling CLion as I wanted a debugger.

JetBrains CLion IDE

The impetus for the switch was that I was working with a colleague on a change that had a fairly wide impact on the code. We were practicing compiler-driven development, repeatedly cycling through fixing an error, compiling, and jumping to the next topmost error. Vim's quickfix list + :make is designed to make this cycle easier too, but I didn't have that set up at the time. I was doing a lot of manual jumping between files, whereas in IntelliJ I could just click the paths in the error messages.

It's perhaps the combination of working on a foreign codebase and also trying to maximise efficiency when working with others that pushed me to seek out better tooling for work use. There is ongoing work to improve the RLS, so I may still come back to Neovim, and I continue to use it for personal projects.

Other CLion features that I'm enjoying:

  • Reliable autocomplete
  • Reliable jump to definition, jump to impl block, find usages
  • Refactoring tooling (rename across project, extract method, extract variable)
  • Built in debugger

VS Code offers some of these features too. However, since they are built on the RLS they suffer many of the same issues I had in Neovim. Additionally, I think the Vim emulation plugin for IntelliJ is more complete, or at least more predictable for a long-time vim user. This is despite the latter actually using Neovim under the covers.

Debugging

In Ruby, with a gem like pry-byebug, it's trivial to put a binding.pry in some code to be dropped into a debugger + REPL at that point in the code. This is harder with Rust. println! or dbg! based debugging can get you a surprisingly long way and has served me well for most of my personal projects.
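For anyone who hasn't used it, dbg! (stabilised in Rust 1.32) prints the file, line, expression, and value to stderr and then evaluates to the value, so it can wrap sub-expressions in place; a tiny example:

fn main() {
    let input = "123";
    // Prints something like `[src/main.rs:5] input.len() = 3` to stderr
    // and returns the value, so the surrounding code is unchanged.
    let len = dbg!(input.len());
    let parsed = dbg!(input.parse::<i32>().unwrap());
    println!("{} has {} digits", parsed, len);
}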

When building some parsing code I quickly felt the need to use a real debugger in order to step through and examine execution of a failing test. It's possible to do this on the command line with the rust-gdb or rust-lldb wrappers that come with Rust. However, I find them fiddly to use and verbose to operate.

CLion makes it simple to add and remove breakpoints by clicking in the gutter, run a single test under the debugger, visually step through the code, see all local variables, step up and down the call stack, etc. These are all possible with the command line tools (which CLion is using behind the scenes), but it's nice to have them built in and available with a single click of the mouse.

Conclusion

So far I am enjoying my new role. There have been some great learning opportunities and surprising tooling changes. I'm also keen to keep an eye on the frequency of bugs encountered in production, their type (such as panic or incorrect logic), their source, and ease of resolution. I look forward to writing more about our work in the future.

Discuss on Lobsters



Previous Post: A Coding Retreat and Getting Embedded Rust Running on a SensorTag
Next Post: Cross Compiling Rust for FreeBSD With Docker

Cross Compiling Rust for FreeBSD With Docker March 25, 2019 04:45 AM

For a little side project I'm working on I want to be able to produce pre-compiled binaries for a variety of platforms, including FreeBSD. With a bit of trial and error I have been able to successfully build working FreeBSD binaries from a Docker container, without using (slow) emulation/virtual machines. This post describes how it works and how to add it to your own Rust project.

I started with Sandvine's freebsd-cross-build repo, which builds a Docker image with a cross-compiler that targets FreeBSD. I made a few updates and improvements to it:

  • Update from FreeBSD 9 to 12.
  • Base on newer debian9-slim image instead of ubuntu 16.04.
  • Use a multi-stage Docker build.
  • Do all fetching of tarballs inside the container to remove the need to run a script on the host.
  • Use the FreeBSD base tarball as the source of headers and libraries instead of ISO.
  • Revise the fix-links script to automatically discover symlinks that need fixing.

Once I was able to successfully build the cross-compilation toolchain I built a second Docker image based on the first that installs Rust, and the x86_64-unknown-freebsd target. It also sets up a non-privileged user account for building a Rust project bind mounted into it.

Check out the repo at: https://github.com/wezm/freebsd-cross-build

Building the Images

I haven't pushed the image to a container registry as I want to do further testing and need to work out how to version them sensibly. For now you'll need to build them yourself as follows:

  1. git clone git@github.com:wezm/freebsd-cross-build.git && cd freebsd-cross-build
  2. docker build -t freebsd-cross .
  3. docker build -f Dockerfile.rust -t freebsd-cross-rust .

Using the Images to Build a FreeBSD Binary

To use the freebsd-cross-rust image in a Rust project here's what you need to do (or at least this is how I'm doing it):

In your project add a .cargo/config file for the x86_64-unknown-freebsd target. This tells cargo what tool to use as the linker.

[target.x86_64-unknown-freebsd]
linker = "x86_64-pc-freebsd12-gcc"

I use Docker volumes to cache the output of previous builds and the cargo registry. This prevents cargo from re-downloading the cargo index and dependent crates on each build and saves build artifacts across builds, speeding up compile times.

A challenge this introduces is how to get the resulting binary out of the volume. For this I use a separate docker invocation that copies the binary out of the volume into a bind mounted host directory.

Originally I tried mounting the whole target directory into the container but this resulted in spurious compilation failures during linking and lots of files owned by root (I'm aware of user namespaces but haven't set it up yet).

I wrote a shell script to automate this process:

#!/bin/sh

set -e

mkdir -p target/x86_64-unknown-freebsd

# NOTE: Assumes the following volumes have been created:
# - lobsters-freebsd-target
# - lobsters-freebsd-cargo-registry

# Build
sudo docker run --rm -it \
  -v "$(pwd)":/home/rust/code:ro \
  -v lobsters-freebsd-target:/home/rust/code/target \
  -v lobsters-freebsd-cargo-registry:/home/rust/.cargo/registry \
  freebsd-cross-rust build --release --target x86_64-unknown-freebsd

# Copy binary out of volume into target/x86_64-unknown-freebsd
sudo docker run --rm -it \
  -v "$(pwd)"/target/x86_64-unknown-freebsd:/home/rust/output \
  -v lobsters-freebsd-target:/home/rust/code/target \
  --entrypoint cp \
  freebsd-cross-rust \
  /home/rust/code/target/x86_64-unknown-freebsd/release/lobsters /home/rust/output

This is what the script does:

  1. Ensures that the destination directory for the binary exists. Without this, docker will create it but it'll be owned by root and the container won't be able to write to it.
  2. Runs cargo build --release --target x86_64-unknown-freebsd (the leading cargo is implied by the ENTRYPOINT of the image).
    1. The first volume (-v) argument bind mounts the source code into the container, read-only.
    2. The second -v maps the named volume, lobsters-freebsd-target into the container. This caches the build artifacts.
    3. The last -v maps the named volume, lobsters-freebsd-cargo-registry, into the container. This caches the cargo index and downloaded crates.
  3. Copies the built binary out of the lobsters-freebsd-target volume into the local filesystem at target/x86_64-unknown-freebsd.
    1. The first -v bind mounts the local target/x86_64-unknown-freebsd directory into the container at /home/rust/output.
    2. The second -v mounts the lobsters-freebsd-target named volume into the container at /home/rust/code/target.
    3. The docker run invocation overrides the default ENTRYPOINT with cp and supplies the source and destination to it, copying from the volume into the bind mounted host directory.

After running the script there is a FreeBSD binary in target/x86_64-unknown-freebsd. Copying it to a FreeBSD machine for testing shows that it does in fact work as expected!

One last note: this all works because I don't depend on any C libraries in my project. If I did, it would be necessary to cross-compile them so that the linker could link them when needed.

Once again, the code is at: https://github.com/wezm/freebsd-cross-build.



Previous Post: My First 3 Weeks of Professional Rust

Pete Corey (petecorey)

Bending Jest to Our Will: Restoring Node's Require Behavior March 25, 2019 12:00 AM

Jest does some interesting things to Node’s default require behavior. In an attempt to encourage test independence and concurrent test execution, Jest resets the module cache after every test.

You may remember one of my previous articles about “bending Jest to our will” and caching instances of modules across multiple tests. While that solution works for single modules on a case-by-case basis, sometimes that’s not quite enough. Sometimes we just want to completely restore Node’s original require behavior across the board.

After sleuthing through support tickets, blog posts, and “official statements” from Jest core developers, this seems to be entirely unsupported and largely impossible.

However, with some highly motivated hacking I’ve managed to find a way.

Our Goal

If you’re unfamiliar with how require works under the hood, here’s a quick rundown. The first time a module is required, its contents are executed and the resulting exported data is cached. Any subsequent require calls of the same module return a reference to that cached data.

That’s all there is to it.

Jest overrides this behavior and maintains its own “module registry” which is blown away after every test. If one test requires a module, the module’s contents are executed and cached. If that same test requires the same module, the cached result will be returned, as we’d expect. However, other tests don’t have access to our first test’s module registry. If another test tries to require that same module, it’ll have to execute the module’s contents and store the result in its own private module registry.

Our goal is to find a way to reverse Jest’s monkey-patching of Node’s default require behavior and restore its original behavior.

This change, or reversal of a change, will have some unavoidable consequences. Our Jest test suite won’t be able to support concurrent test processes. This means that all our tests will have to run “in band” (--runInBand). More interestingly, Jest’s “watch mode” will no longer work, as it uses multiple processes to run tests and maintain a responsive command line interface.

Accepting these limitations, let’s press on.

Dependency Hacking

After several long code reading and debugging sessions, I realized that the heart of the problem resides in Jest’s jest-runtime module. Specifically, the requireModuleOrMock function, which is responsible for Jest’s out-of-the-box require behavior. Jest internally calls this method whenever a module is required by a test or by any code under test.

Short circuiting this method with a quick and dirty require causes the require statements throughout our test suites and our code under test to behave exactly as we’d expect:


requireModuleOrMock(from: Path, moduleName: string) {
+ return require(this._resolveModule(from, moduleName));
  try {
    if (this._shouldMock(from, moduleName)) {
      return this.requireMock(from, moduleName);
    } else {
      return this.requireModule(from, moduleName);
    }
  } catch (e) {
    if (e.code === 'MODULE_NOT_FOUND') {
      const appendedMessage = findSiblingsWithFileExtension(
        this._config.moduleFileExtensions,
        from,
        moduleName,
      );

      if (appendedMessage) {
        e.message += appendedMessage;
      }
    }
    throw e;
  }
}

Whenever Jest reaches for a module, we relieve it of the decision to use a cached module from its internally maintained moduleRegistry, and instead have it always return the result of requiring the module through Node’s standard mechanisms.

Patching Jest

Our fix works, but in an ideal world we wouldn’t have to fork jest-runtime just to make our change. Thankfully, the requireModuleOrMock function isn’t hidden within a closure or made inaccessible through other means. This means we’re free to monkey-patch it ourselves!

Let’s start by creating a test/globalSetup.js file in our project to hold our patch. Once created, we’ll add the following lines:


const jestRuntime = require('jest-runtime');

jestRuntime.prototype.requireModuleOrMock = function(from, moduleName) {
    return require(this._resolveModule(from, moduleName));
};

We’ll tell our Jest setup to use this config file by listing it in our jest.config.js file:


module.exports = {
    globalSetup: './test/globalSetup.js',
    ...
};

And that’s all there is to it! Jest will now execute our globalSetup.js file once, before all of our test suites, and restore the original behavior of require.

Being the future-minded developers that we are, it’s probably wise to document this small and easily overlooked bit of black magic:


/*
 * This requireModuleOrMock override is _very experimental_. It affects
 * how Jest works at a very low level and most likely breaks Jest-style
 * module mocks.
 *
 * The upside is that it lets us evaluate heavy modules once, rather
 * than once per test.
 */

jestRuntime.prototype.requireModuleOrMock = function(from, moduleName) {
    return require(this._resolveModule(from, moduleName));
};

If you find yourself with no other choice but to perform this incantation on your test suite, I wish you luck. You’re most likely going to need it.

March 24, 2019

Pages From The Fire (kghose)

Another short story, where unit tests save my butt March 24, 2019 10:04 PM

I like to refactor. A lot. As I work on a problem I understand it better and I want to reflect this in the code. I was nearing the end of a pretty serious refactor. The tests had been failing for the last 60 commits or so. I wasn’t worried, I was expecting this. But …

Ponylang (SeanTAllen)

Last Week in Pony - March 24, 2019 March 24, 2019 01:59 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

Wesley Moore (wezm)

A Coding Retreat and Getting Embedded Rust Running on a SensorTag March 24, 2019 02:01 AM

This past long weekend some friends and I went on a coding retreat, inspired by John Carmack doing something similar in 2018. During the weekend I worked on adding support for the Texas Instruments SensorTag to the embedded Rust ecosystem. This post is a summary of the weekend and what I was able to achieve code-wise.

Back in March 2018 John Carmack posted about a week long coding retreat he went on to work on neural networks and OpenBSD. After reading the post I quoted it to some friends and commented:

I finally took another week-long programming retreat, where I could work in hermit mode, away from the normal press of work.

In the spirit of my retro theme, I had printed out several of Yann LeCun’s old papers and was considering doing everything completely off line, as if I was actually in a mountain cabin somewhere

I kind of love the idea of a week long code retreat in a cabin somewhere.

One of my friends also liked the idea and actually made it happen! There was an initial attempt in June 2018 but life got in the way so it was postponed. At the start of the year he picked it up again and organised it for the Labour day long weekend, which just passed.

We rented an Airbnb in the Dandenong Ranges, 45 minutes from Melbourne. Six people attended, two of which were from interstate. The setting was cozy, quiet and picturesque. Our days involved coding and collaborating, shared meals, and a walk or two around the surrounds.

The view from our accommodation one morning.

After linux.conf.au I got inspired to set up some self-hosted home sensors and automation. I did some research and picked up two Texas Instruments SensorTags and a debugger add-on. The SensorTag uses a CC2650 microcontroller with an ARM Cortex-M3 core and has support for a number of low power wireless standards, such as Bluetooth, ZigBee, and 6LoWPAN. The CC2650 also has a low power 16-bit sensor controller that can be used to help achieve years-long battery life from a single CR2032 button cell. In addition to the microcontroller, the SensorTag also adds a bunch of sensors, including: temperature, humidity, barometer, accelerometer, gyroscope, and light.

Two SensorTags, one with its rubberised case removed and the debugger board attached.

My project for the weekend was to try to get some Rust code running on the SensorTag. Rust has good support out of the box for targeting ARM Cortex microcontrollers but there were no crates to make interacting with this particular chip or board easy, so I set about building some.

The first step was generating a basic crate to allow interacting with the chip without needing to wrap everything in an unsafe block and poke at random memory addresses. Fortunately svd2rust can automate this by converting System View Description (SVD) XML files into a Rust crate. Unfortunately TI don't publish SVD files for their devices. As luck would have it though, M-Labs have found that TI do publish XML descriptions in a format of their own called DSLite. They have written a tool, dslite2svd, that converts this to SVD, so you can then use svd2rust. It took a while to get dslite2svd working. I had to tweak it to handle differences in the files I was processing, but eventually I was able to generate a crate that compiled.

Now that I had an API for the chip I turned to working out how to program and debug the SensorTag with a very basic Rust program. I used the excellent embedded Rust Discovery guide as a basis for the configuration, tools, and process for getting code onto the SensorTag. Since this was a different chip from a different manufacturer it took a long time to work out which tools worked, how to configure them, what format binaries they wanted, how to create a linker script, and so on. A lot of trial and error was performed, along with lots of searching online with less than perfect internet. However, by Sunday I could program the device, debug code, and verify that my very basic program, shown below, was running.

fn main() -> ! {
    let _y;
    let x = 42;
    _y = x;

    // infinite loop; just so we don't leave this stack frame
    loop {}
}

The combination that worked for programming was:

  • cargo build --target thumbv7m-none-eabi
  • Convert ELF to BIN using cargo objcopy, which is part of cargo-binutils: cargo objcopy --bin sensortag --target thumbv7m-none-eabi -- -O binary sensortag.bin
  • Program with UniFlash:
    • Choose CC2650F128 and XDS1100 on the first screen
    • Do a full erase the first time to reset CCFG, etc
    • Load image (select the .bin file produced above)

For debugging:

  • Build OpenOCD from git to get support for the chip and debugger (I used the existing AUR package)
  • Run OpenOCD: openocd -f jtag/openocd.cfg
  • Use GDB to debug: arm-none-eabi-gdb -x jtag/gdbinit -q target/thumbv7m-none-eabi/debug/sensortag
  • The usual mon reset halt in GDB upsets the debugger connection. I found that soft_reset_halt was able to reset the target (although it complains about being deprecated).

Note: Files in the jtag path above are in my sensortag repo. Trying to program through openocd failed with an error that the vEraseFlash command failed. I'd be curious to know if anyone has got this working as I'd very much like to ditch the huge 526.5 MiB UniFlash desktop-web-app dependency in my workflow.

Now that I could get code to run on the SensorTag I set about trying to use the generated chip support crate to flash one of the on board LEDs. I didn't succeed in getting this working by the time the retreat came to an end, but after I arrived home I was able to find the source of the hard faults I was encountering and get the LED blinking! The key was that I needed to power up the peripheral power domain and enable the GPIO clocks to be able to enable an output GPIO.

It works!

Below is the code that flashes the LED. It should be noted this code is operating with very little abstraction and is using register and field names that match the data sheet. Future work to implement the embedded-hal traits for this controller would make it less verbose and less cryptic.

#![deny(unsafe_code)]
#![no_main]
#![no_std]

#[allow(unused_extern_crates)] // NOTE(allow) bug rust-lang/rust#53964
extern crate panic_halt; // panic handler

// SensorTag is using RGZ package. VQFN (RGZ) | 48 pins, 7×7 QFN

use cc2650_hal as hal;
use cc2650f128;
use cortex_m_rt::entry;

use hal::{ddi, delay::Delay, prelude::*};

pub fn init() -> (Delay, cc2650f128::Peripherals) {
    let core_peripherals = cortex_m::Peripherals::take().unwrap();
    let device_peripherals = cc2650f128::Peripherals::take().unwrap();

    let clocks = ddi::CFGR {
        sysclk: Some(24_000_000),
    }
    .freeze();

    let delay = Delay::new(core_peripherals.SYST, clocks);

    // LEDs are connected to DIO10 and DIO15
    // Configure GPIO pins for output, maximum strength
    device_peripherals.IOC
        .iocfg10
        .modify(|_r, w| w.port_id().gpio().ie().clear_bit().iostr().max());
    device_peripherals.IOC
        .iocfg15
        .modify(|_r, w| w.port_id().gpio().ie().clear_bit().iostr().max());

    // Enable the PERIPH power domain and wait for it to be powered up
    device_peripherals.PRCM.pdctl0.modify(|_r, w| w.periph_on().set_bit());
    loop {
        if device_peripherals.PRCM.pdstat0.read().periph_on().bit_is_set() {
            break;
        }
    }

    // Enable the GPIO clock
    device_peripherals.PRCM.gpioclkgr.write(|w| w.clk_en().set_bit());

    // Load settings into CLKCTRL and wait for LOAD_DONE
    device_peripherals.PRCM.clkloadctl.modify(|_r, w| w.load().set_bit());
    loop {
        if device_peripherals.PRCM.clkloadctl.read().load_done().bit_is_set() {
            break;
        }
    }

    // Enable outputs
    device_peripherals.GPIO
        .doe31_0
        .modify(|_r, w| w.dio10().set_bit().dio15().set_bit());

    (delay, device_peripherals)
}

#[entry]
fn entry() -> ! {
    let (mut delay, periphs) = init();
    let half_period = 500_u16;

    loop {
        // Turn LED on and wait half a second
        periphs.GPIO.dout11_8.modify(|_r, w| w.dio10().set_bit());
        delay.delay_ms(half_period);

        // Turn LED off and wait half a second
        periphs.GPIO.dout11_8.modify(|_r, w| w.dio10().clear_bit());
        delay.delay_ms(half_period);
    }
}
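To give a rough sense of what that could look like, here is a sketch (the Dio10 wrapper type is hypothetical and not part of the published crates, and it assumes the embedded-hal 0.2 digital::v2::OutputPin trait): implementing OutputPin for an LED pin hides the register fiddling behind set_high and set_low.

use embedded_hal::digital::v2::OutputPin;

// Hypothetical wrapper that owns the GPIO peripheral and exposes DIO10
// (one of the SensorTag LEDs) as a single digital output pin.
pub struct Dio10 {
    gpio: cc2650f128::GPIO,
}

impl Dio10 {
    pub fn new(gpio: cc2650f128::GPIO) -> Self {
        Dio10 { gpio }
    }
}

impl OutputPin for Dio10 {
    type Error = ();

    fn set_high(&mut self) -> Result<(), Self::Error> {
        // Same register and field the blink example above pokes directly.
        self.gpio.dout11_8.modify(|_r, w| w.dio10().set_bit());
        Ok(())
    }

    fn set_low(&mut self) -> Result<(), Self::Error> {
        self.gpio.dout11_8.modify(|_r, w| w.dio10().clear_bit());
        Ok(())
    }
}

With something like that in place, the blink loop would reduce to led.set_high(), a delay, led.set_low(), and another delay, with no register or field names in sight.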

The rest of the code is up on Sourcehut. It's all in a pretty rough state at the moment. I plan to tidy it up over the coming weeks and eventually publish the crates. If you're curious to see it now though, the repos are:

  • cc2650f128 crates.io Documentation -- chip support crate generated by dslite2svd and svd2rust.
  • cc26x0-hal (see wip branch, currently very rough).
  • sensortag -- LED flashing code. I hope to turn this into a board support crate eventually.

Overall the coding retreat was a great success and we hope to do another one next year.



Previous Post: Rebuilding My Personal Infrastructure With Alpine Linux and Docker
Next Post: My First 3 Weeks of Professional Rust

March 23, 2019

Gokberk Yaltirakli (gkbrk)

Phone Location Logger March 23, 2019 09:45 PM

If you are using Google Play Services on your Android phone, Google receives and keeps track of your location history. This includes your GPS coordinates and timestamps. Because of the privacy implications, I have revoked pretty much all permissions from Google Play Services and disabled my Location History on my Google settings (as if they would respect that).

But while it might be creepy if a random company has this data, it would be useful if I still had it. After all, who doesn’t want to know the location of a park that they stumbled upon randomly on a vacation 3 years ago?

I remember seeing some location trackers while browsing through F-Droid. I found various applications there, and picked one that was recently updated. The app was a Nextcloud companion app, with support for custom servers. Since I didn’t want a heavy Nextcloud install just to keep track of my location, I decided to go with the custom server approach.

In the end, I decided that the easiest path is to make a small CGI script in Python that appends JSON encoded lines to a text file. Because of this accessible data format, I can process this file in pretty much every programming language, import it to whatever database I want and query it in whatever way I see fit.

The app I went with is called PhoneTrack. You can find the APK and source code links on F-Droid. It replaces the placeholders in the URL with the corresponding values, so a URL that logs every parameter looks like this: https://example.com/cgi-bin/locationrecorder.py?acc=%ACC&alt=%ALT&batt=%BATT&dir=%DIR&lat=%LAT&lon=%LON&sat=%SAT&spd=%SPD&timestamp=%TIMESTAMP

Here’s the script in all it’s glory.

import cgi
import json

PATH = '/home/databases/location.txt'

print('Content-Type: text/plain\n')
form = cgi.FieldStorage()

# Check authentication token
if form.getvalue('token') != 'SECRET_VALUE':
    raise Exception('Nope')

obj = {
    'accuracy':   form.getvalue('acc'),
    'altitude':   form.getvalue('alt'),
    'battery':    form.getvalue('batt'),
    'bearing':    form.getvalue('dir'),
    'latitude':   form.getvalue('lat'),
    'longitude':  form.getvalue('lon'),
    'satellites': form.getvalue('sat'),
    'speed':      form.getvalue('spd'),
    'timestamp':  form.getvalue('timestamp'),
}

with open(PATH, 'a+') as log:
    line = json.dumps(obj)
    log.write(f'{line}\n')

March 22, 2019

Ponylang (SeanTAllen)

0.28.0 Released March 22, 2019 08:06 PM

Pony 0.28.0 is a high-priority release. We advise updating as soon as possible.

In addition to a high-priority bug fix, there are “breaking changes” if you build Pony from source. We’ve also dropped support for some Debian and Ubuntu versions. Read on for further details.

March 21, 2019

Derek Jones (derek-jones)

Describing software engineering in terms of a traditional science March 21, 2019 04:33 PM

If you were asked to describe the ‘building stuff’ side of software engineering, by comparing it with one of the traditional sciences, which science would you choose?

I think a lot of people would want to compare it with Physics. Yes, physics envy is not restricted to the softer sciences of humanities and liberal arts. Unlike physics, software engineering is not governed by a handful of simple ‘laws’, it’s a messy collection of stuff.

I used to think that biology had all the necessary important characteristics needed to explain software engineering: evolution (of code and products), species (e.g., of editors), lifespan, and creatures are built from a small set of components (i.e., DNA or language constructs).

Now I’m beginning to think that chemistry has aspects that are a better fit for some important characteristics of software engineering. Chemists can combine atoms of their choosing to create whatever molecule takes their fancy (subject to bonding constraints, a kind of syntax and semantics for chemistry), and the continuing existence of a molecule does not depend on anything outside of itself; biological creatures need to be able to extract some form of nutrient from the environment in which they live (which is also a requirement of commercial software products, but not non-commercial ones). Individuals can create molecules, but creating new creatures (apart from human babies) is still a ways off.

In chemistry and software engineering, it’s all about emergent behaviors (in biology, behavior is just too complicated to reliably say much about). In theory the properties of a molecule can be calculated from the known behavior of its constituent components (e.g., the electrons, protons and neutrons), but the equations are so complicated it’s impractical to do so (apart from the most simple of molecules; new properties of water, two atoms of hydrogen and one of oxygen, are still being discovered); the properties of programs could be deduced from the behavior its statements, but in practice it’s impractical.

What about the creative aspects of software engineering you ask? Again, chemistry is a much better fit than biology.

What about the craft aspect of software engineering? Again chemistry, or rather, alchemy.

Is there any characteristic that physics shares with software engineering? One that stands out is the ego of some of those involved. Describing, or creating, the universe nourishes large egos.

Stig Brautaset (stig)

Bose QuietComfort 35 Review March 21, 2019 02:39 PM

I review the noise-cancelling headphones I've been using for about 3 years.

Átila on Code (atilaneves)

When you can’t (and shouldn’t) unit test March 21, 2019 11:59 AM

I’m a unit test aficionado, and, as such, have attempted to unit test what really shouldn’t be. It’s common to get excited by a new hammer and then seeing nails everywhere, and unit testing can get out of hand (cough! mocks! cough!). I still believe that the best tests are free from side-effects, deterministic and […]

March 19, 2019

Simon Zelazny (pzel)

How to grab all hosts but the first, in Ansible March 19, 2019 11:00 PM

Today I was trying to figure out how to run a particular ansible play on one host out of a group, and another play on all the other hosts.

The answer was found in a mailing list posting from 2014, but in case that service goes down, here's my note-to-self on how to do it.

Let's say you have a group of hosts called stateful_cluster_hosts in your inventory. You'd like to upload files/leader_script.sh.j2 to the first host, and files/follower_script.sh.j2 to all the others.

The play for the leader host would look like this:

- hosts: stateful_cluster_hosts[0]
  tasks:
  - name: "Upload leader start script"
    template:
      src: files/leader_script.sh.j2
      dest: start.sh
      mode: "u=rwx,g=rx,o=rx"

The play for the follower hosts would look like this:

- hosts:  stateful_cluster_hosts:!stateful_cluster_hosts[0]
  tasks:
  - name: "Upload follower start script"
    template:
      src: files/follower_script.sh.j2
      dest: start.sh
      mode: "u=rwx,g=rx,o=rx"

Where the syntax list:!list[idx] means take list, but filter out list[idx].

Richard Kallos (rkallos)

Inveniam viam - Building Bridges March 19, 2019 03:00 AM

If you know where you are and where you want to be, you can start to plot a course between the two points. Following on the ideas presented in the previous two posts, I describe a practice I learned from reading Robert Fritz’s book Your Life as Art.

I spent a few years of my life voraciously consuming self-help books. I believe the journey started with Eckhart Tolle’s The Power of Now, and it more-or-less ended with Meditations, Stoicism and the Art of Happiness, and Your Life as Art. I’ll probably wind up writing about my path to Stoicism some other time. This post is about what I learned from Robert Fritz.

Your Life as Art is filled with insight about navigating the complex murky space of life while juggling the often competing aspects of spontaneity and rigid structure. My collection of notes about Your Life as Art is nearly a thousand lines long, and there’s definitely far too much good stuff to fit into a single blog post. At this point, I’ll focus on what Robert Fritz calls structural tension, and his technique for plotting a course with the help of a chart.

Hopefully I convinced you with the previous two posts about the importance of objectively “seeing” where you are and where you want to be. These two activities form the foundation of what Fritz calls structural tension, a force that stems from the contrast between reality and an ideal state and seeks to relieve the tension by moving you from your present state to your ideal state.

Writing is a handy exercise for generating this force. A structural tension chart (or ST chart) has your desired ideal state at the top of a page, your current state at the bottom, and a series of steps bridging the gap between the two. First you write the ideal section, then the real section, and finally you add the steps in the middle. It’s very important to be as objective and detailed as possible about your ideal and current states. Here’s an example:

--- Ideal ---
I meditate every day for at least 15 minutes. My mind is calm
and focused as I go about my daily activities. I feel comfortable
sitting for the duration of my practice, no matter how long it is.
-------------

- Try keeping a meditation journal
- Experiment with active forms of meditation
- Experiment with seating positions
- Give myself time in the morning to meditate
- Wake up at the same time every day

--- Real ----
I meditate approximately once per week. I have difficulty
finding a regular time during the day to devote to meditation,
making it difficult to create a habit that sticks. I find that
I become uncomfortable sitting with my legs crossed for more
than 5 minutes. I do not often remember how good I feel after
meditating, which results in difficulty deciding to sit.

If you’re interested in reading more, I highly recommend Your Life as Art. Robert Fritz’s books are filled with great ideas. While this is basically a slightly more detailed to-do list, I find the process to be very grounding.

In conclusion, once you know where you are and where you want to be, try writing a structural tension chart in order to set a course.

March 18, 2019

Gergely Nagy (algernon)

Solarium March 18, 2019 11:45 AM

I wanted to build a keyboard for a long time, to prepare myself for building two for our Twins when they're old enough, but always struggled with figuring out what I want to build. I mean, I have the perfect keyboard for pretty much all occasions: my daily driver is the Keyboardio Model01, which I use for everything but the few cases highlighted next. For Steno, I use a Splitography. When I need to be extra quiet, I use an Atreus with Silent Reds. For gaming, I have a Shortcut prototype, and use the Atreus too, depending on the game. I don't travel much nowadays, so I have no immediate need for a portable board, but the Atreus would fit that purpose too.

As it turns out there is one scenario I do not have covered: if I have to type on my phone, I do not have a bluetooth keyboard to do it with, and have to rely on the virtual keyboard. This is far from ideal. Why do I need to type on the phone? Because sometimes I'm in a call at night, and need to be quiet, so I go to another room - but I only have a phone with me there. I could use a laptop, but since I need the phone anyway, carrying a phone and a laptop feels wrong, when I could carry a phone and a keyboard instead.

So I'm going to build myself a bluetooth keyboard. But before I do that, I'll build something simpler. Simpler, but still different enough from my current keyboards that I can justify the effort going into the build process. It will not be wireless at first, because during my research, I found that complicates matters too much, at least for a first build.

A while ago, I had another attempt at coming up with a keyboard, which had bluetooth, was split, and had a few other twists. We spent a whole afternoon brainstorming on the name with the twins and my wife. I'll use that name for another project, but I needed another one for the current one: I started down the same path we used back then, and found a good one.

You see, this keyboard is going to feature a rotary encoder, with a big scrubber knob on top of it, as a kind of huge dial. The knob will be in the middle, surrounded by low-profile Kailh Choc keys.

Solarium

balcony, dial, terrace, sundial, sunny spot

The low-profile keys with a mix of black and white keycaps do look like a terrace; the scrubber knob, a dial. So the name fits like a glove.

Now, I know very little about designing and building keyboards, so this first attempt will likely end up being a colossal failure. But one has to start somewhere, and this feels like a good start: simple enough to be possible, different enough to be interesting and worthwhile.

It will be powered by the same ATMega32U4 as many other keyboards, but unlike most, it will have Kailh Choc switches for a very low profile. It will also feature a rotary encoder, which I plan to use for various mouse-related tasks, such as scrolling. Or volume setting. Or brightness adjustment. Stuff like that.

This means I'll have to add rotary encoder support to Kaleidoscope, but that shouldn't be too big of an issue.

The layout

Solarium

(Original KLE)

The idea is that the wheel will act as a mouse scroll wheel by default. Pressing the left Fn key turns it into volume control; pressing the right Fn key turns it into brightness control. I haven't found other uses for it yet, but I'm sure I will once I have the physical thing under my fingers. The wheel is to be operated by the hand opposite the one holding Fn, or by either hand when no Fn is held. Usually that'll be the right hand, because Shift will be on the left thumb cluster, and I need that for horizontal scrolling.

While writing this, I got another idea for the wheel: I can make it switch windows or desktops. It can act as a more convenient Alt+Tab, too!

Components

The most interesting component is likely the knob. I've been eyeing the Scrubber Knob from Adafruit. Need to find a suitable encoder, the one on Adafruit is out of stock. One of the main reasons I like this knob is that it's low profile.

For the rest, they're pretty usual stuff:

  • Kailh Choc switches. Not sure whether I want reds or browns. I usually enjoy tactile switches, but one of the goals of this keyboard is to be quiet, and reds might be a better fit there.
  • Kailh Choc caps: I'll get a mix of black and white caps, for that terrace / balcony feeling.
  • ATMega32U4

Apart from this, I'll need a PCB, and perhaps a switch- and/or bottom plate, I suppose. Like I said, I know next to nothing about building keyboards. I originally wanted to hand-wire it, but Jesse Vincent told me I really don't want to do that, and I'll trust him on that.

Future plans

In the future, I plan to make a Bluetooth keyboard, and a split one (perhaps both at the same time, as originally planned). I might experiment with adding LEDs to the current one too as a next iteration. I also want to build a board with hotswap switches, though I will likely end up with Kailh Box Royals I guess (still need my samples to arrive first, mind you). We'll see once I built the first one, I'm sure there will be lessons learned.

#DeleteFacebook March 18, 2019 08:00 AM

Your Account Is Scheduled for Permanent Deletion

On March 15, the anniversary of the 1848-49 Hungarian Revolution and war for independence, I deleted my facebook account. Or at least started the process. This is something I've been planning to do for a while, and the special day felt like the perfect opportunity to do so. It wasn't easy, though not because I used facebook much - I did not. I hadn't looked at my timeline in months, had a total of 9 posts over the years (most of them private), didn't "like" stuff, and hadn't interacted with the site in any meaningful way.

I did use Messenger, mostly to communicate with friends and family, and convincing at least some of them to find alternative ways to contact me wasn't without issues. But by March 15, I got the most important people to use another communication platform (XMPP), and I was able to hit the delete switch.

I have long despised facebook, for a whole lot of reasons, but most recently, they started to expose my phone number, which I only gave them for 2FA purposes. They exposed it in a way that I couldn't hide it from friends, either. That's a problem because I don't want every person who I "friended" on there to know my phone number. It's a privilege to know it, and facebook abusing its knowledge of it was over the line. But this isn't the worst yet.

You see, facebook is so helpful that it lets people link their contacts with their facebook friends. A lot of other apps are after one's contact list, and now my phone number got into a bunch more of those. This usually isn't a big deal, people will not notice. But programs will. Programs that hunt for phone numbers to sell.

And this is exactly what happened: my phone number got sold. How do I know? I got a call from an insurance company, one I never had any prior contact with, nor did anyone in my family. I was asked if I had two minutes, and I frankly told them that yes, I did, and that I'd like to use those two minutes to inquire where they got my phone number from, as per the GDPR, because as a data subject I have the right to know what data has been collected about me and how it was processed. I twisted the right a bit, and said I have the right to know how I got into their database - I'm not sure I have this right. In any case, the poor caller wasn't prepared for this, and it took a bit more than two minutes to convince him that he was better off complying with my request, otherwise they'd have a formal GDPR data request and a complaint against him personally filed within hours.

A few hours later, I got a call back: they got my phone number from facebook. I thanked them for the information, and asked them to delete all data they have about me, and never contact me again. Yes, there's a conflict between those two requests, we'll see how they handle it, let it be their problem figuring out how to resolve it. Anyway, there's only a few possibilities how they could've gotten my number through facebook:

  • If I friended them, they'd have access. They wouldn't have my consent to use it for this kind of stuff, but they'd have the number. This isn't the case. I'm pretty sure I can't friend corporations on facebook (yet?) to begin with.
  • Some of my friends had their contacts synced with facebook (I know of at least two who made this mistake, one of them by mistake, one too easily made), and had their contacts uploaded to the insurance company via their app, or some similarly shady process. This still doesn't mean I consented to being contacted.
  • Facebook sold my number to them. Likewise, this doesn't imply consent, either.

They weren't able to tell me more than that they got my number from facebook. I have a feeling that this is a lie anyway - they just don't know where they bought it from, and facebook probably sounded like a reasonable source. On the other hand, facebook selling one's personal data, despite the GDPR, is something I'm more than willing to believe, considering their past actions. Even if facebook is not the one who sold the number, the fact that an insurance company deemed it acceptable to lie and blame them paints an even worse picture.

In either case, facebook is a sickness I wanted to remove from my life, and this whole deal was the final straw. I initiated the account deletion. They probably won't delete anything, just disable it, and continue selling what they already have about me. But at least I make it harder for them to obtain more info about me. I started to migrate my family to better services: we use an XMPP server I host, with end-to-end encryption, because no one should have to trust either me or the VPS provider the server is running on.

It's a painful break up, because there are a bunch of people who I talked with on Messenger from time to time, who will not move away from facebook anytime soon. There are parts of my family (my brother & sister) who will not install another chat app just to chat with me - we'll fall back to phone calls, email and SMS. Nevertheless, this had to be done. I'm lucky that I could, because I wasn't using facebook for anything important to begin with. Many people can't escape its clutches.

I hope there will be a day when all of my family is off of it. With a bit of luck, we can raise our kids without facebook in their lives.

Jan van den Berg (j11g)

The Effective Executive – Peter Drucker March 18, 2019 06:51 AM

Pick up any good management book and chances are that Peter Drucker will be mentioned. He is the godfather of management theory. I encountered Drucker many times before in other books and quotes, but I had never read anything directly by him. I have now, and I can only wish I had done so sooner.

The Effective Executive – Peter Drucker (1967) – 210 pages

The sublime classic The Effective Executive from 1967 was a good place to start. After only finishing the first chapter at the kitchen table, I already told my wife: this is one of the best management books I have ever read.

Drucker is an absolute authority who unambiguously will tell you exactly what’s important and what’s not. His voice and style cuts like a knife and his directness will hit you like a ton of bricks. He explains and summarizes like no one else, without becoming repetitive. Every other sentence could be a quote. And after reading, every other management book makes a bit more sense, because now I can tell where they stem from.

Drucker demonstrates visionary insight, by correctly predicting the rise of knowledge workers and their specific needs (and the role of computers). In a rapidly changing society all knowledge workers are executives. And he/she needs to be effective. But, mind you, executive effectiveness “can be learned, but can’t be taught.”

Executive effectiveness

Even though executive effectiveness is an individual aspiration, Drucker is crystal clear on the bigger picture:

Only executive effectiveness can enable this society to harmonize its two needs: the needs of organization to obtain from the individual the contribution it needs, and the need of the individual to have organization serve as his tool for the accomplishment of his purposes. Effectiveness must be learned… Executive effectiveness is our one best hope to make modern society productive economically and viable socially.


So this book makes sense on different levels and is timeless. Even if some references, in hindsight, are dated (especially the McNamara references, knowing what we now know about the Vietnam war). I think Drucker himself did not anticipate the influence of his writing, as the next quote demonstrates. But this is also precisely what I admire about it.

There is little danger that anyone will compare this essay on training oneself to be an effective executive with, say, Kierkegaard’s great self-development tract, Training in Christianity. There are surely higher goals for a man’s life than to become an effective executive. But only because the goal is so modest can we hope at all to achieve it; that is, to have the large number of effective executives modern society and its organizations need.

The post The Effective Executive – Peter Drucker appeared first on Jan van den Berg.

Pete Corey (petecorey)

A Better Mandelbrot Iterator in J March 18, 2019 12:00 AM

Nearly a year ago I wrote about using the J programming language to write a Mandelbrot fractal renderer. I proudly exclaimed that J could be used to “write out expressions like we’d write English sentences,” and immediately proceeded to write a nonsensical, overcomplicated solution.

My final solution bit off more than it needed to chew. The next verb we wrote both calculated the next value of iterating on the Mandelbrot formula and also managed appending that value to a list of previously calculated values.

I nonchalantly explained:

This expression is saying that next “is” (=:) the “first element of the array” ({.) “plus” (+) the “square of the last element of the array” (*:@:{:). That last verb combines the “square” (*:) and “last” ({:) verbs together with the “at” (@:) adverb.

Flows off the tongue, right?

My time spent using J to solve last year’s Advent of Code challenges has shown me that a much simpler solution exists, and it can flow out of you in a fluent way if you just stop fighting the language and relax a little.


Let’s refresh ourselves on Mandelbrot fractals before we dive in. The heart of the Mandelbrot fractal is this iterative equation:

The Mandelbrot set equation: zₙ₊₁ = zₙ² + c.

In English, the next value of z is some constant, c, plus the square of our previous value of z. To render a picture of the Mandelbrot fractal, we map some section of the complex plane onto the screen, so that every pixel maps to some value of c. We iterate on this equation until we decide that the values being calculated either remain small, or diverge to infinity. Every value of c that doesn’t diverge is part of the Mandelbrot set.

But let’s back up. We just said that “the next value of z is some constant, c, plus the square of our previous value of z”.

We can write that in J:

   +*:

And we can plug in example values for c (0.2j0.2) and z (0):

   0.2j0.2 (+*:) 0
0.2j0.2

Our next value of z is c (0.2j0.2) plus (+) the square (*:) of our previous value of z (0). Easy!


My previous solution built up an array of our iterated values of z by manually pulling c and previously iterated values off of the array and pushing new values onto the end. Is there a better way?

Absolutely. If I had read the documentation on the “power” verb (^:), I would have noticed that “boxing” (<) the number of times we want to apply our verb will return an array filled with the results of every intermediate application.

Put simply, we can repeatedly apply our iterator like so:

   0.2j0.2 (+*:)^:(<5) 0
0 0.2j0.2 0.2j0.28 0.1616j0.312 0.128771j0.300838

Lastly, it’s conceivable that we might want to switch the order of our inputs. Currently, our value for c is on the left and our initial value of z is on the right. If we’re applying this verb to an array of c values, we’d probably want c to be the right-hand argument and our initial z value to be a bound left-hand argument.

That’s a simple fix thanks to the “passive” verb (~):

   0 (+*:)^:(<5)~ 0.2j0.2
0 0.2j0.2 0.2j0.28 0.1616j0.312 0.128771j0.300838

We can even plot our iterations to make sure that everything looks as we’d expect.

Our plotted iteration for a C value of 0.2 + 0.2i.


I’m not going to lie and claim that J is an elegantly ergonomic language. In truth, it’s a weird one. But as I use J more and more, I’m finding that it has a certain charm. I’ll often be implementing some tedious solution for a problem in Javascript or Elixir and find myself fantasizing about how easily I could write an equivalent solution in J.

That said, I definitely haven’t found a shortcut for learning the language. Tricks like “reading and writing J like English” only really work at a hand-wavingly superficial level. I’ve found that learning J really just takes time, and as I spend more time with the language, I can feel myself “settling into it” and its unique ways of looking at computation.

If you’re interested in learning J, check out my previous articles on the subject and be sure to visit the JSoftware home page for books, guides, and documentation.

Richard Kallos (rkallos)

Esse quam videri - Seeing what's in front of you March 18, 2019 12:00 AM

As humans, we are easily fooled. Our five senses are the primary way we get an idea of what’s happening around us, and it’s been shown time and time again that our senses are unreliable. In this post, I try to explain the importance of ‘seeing’ what’s in front of you, and how to practice it.

See this video for a classic example of our incredible ability to miss important details.

Rembrandt used to train his students by making them copy his self-portraits. This exercise forced them to see their subject as objectively as possible, which was essential to make an accurate reproduction. Only after mastering their portraiture skills did Rembrandt’s students go on to develop their own artistic styles.

It is important to periodically evaluate your position and course in life. It’s something you do whether you’re aware of it or not. When you plan something, you’re setting a course. When you’re reflecting on past events, you’re estimating your position. For the sake of overloading words like sight, vision, and planning, let’s refer to this act as life portraiture.

Life portraiture can be compared to navigating on land, air, or sea, except that the many facets of our lives result in a space of many more dimensions. We can consider our position and course on axes like physical health, emotional health, career, finance, and social life. If we want finer detail, we can split any of those axes into more dimensions.

Objective life portraiture is not easy. We are all vulnerable to cognitive biases. Following the above analogy with navigation, our inaccuracy at objectively evaluating our lives is akin to inaccurately navigating a ship or airplane. If you’re not well-practiced at seeing, your only tool for navigation might be dead reckoning. If you practice drawing self-portraits of your life, you might suddenly find yourself in possession of a sextant and an almanac, so you can navigate using the stars. The ideal in this case would be to have something like GPS, which might look like Quantified Self with an incredible amount of detail.

It’s worth mentioning that our ability to navigate varies across different dimensions. This is an idea that doesn’t really carry over to navigating Earth, but it’s important to recognize. For example, if you’re thorough with your personal finances, you could have tools akin to GPS for navigating that part of your life. At the same time, if you don’t check in with your emotions, or do anything to improve your emotional health, you might be lost in those spaces.

There are ways to improve our navigating abilities depending on the spaces we’re looking at. To improve navigating your personal finances, you can regularly consult your banking statements, make budgets, and explore different methods of investing. To improve navigating your physical health, you can perform one of many different fitness tests, or consult a personal trainer. To improve navigating your emotional health, you could try journaling, or maybe begin seeing a therapist. Any and all of these could help you locate yourself in the vast space where your life could be.

In order to get where you want to go, you need to know where you are, and what direction you’re moving in.

March 17, 2019

Ponylang (SeanTAllen)

Last Week in Pony - March 17, 2019 March 17, 2019 02:35 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

Caius Durling (caius)

Download All Your Gists March 17, 2019 02:32 PM

Over time I've managed to build up quite the collection of Gists over at Github; including secret ones, there are about 1200 currently. Some of these have useful code in them, some are just garbage output. I'd quite like a local copy either way, so I can easily search1 across them.

  1. Install the gist command from Github

    brew install gist
  2. Login to your Github Account through the gist tool (it'll prompt for your login credentials, then generate you an API Token to allow it future access.)

    gist --login
  3. Create a folder, go inside it and download all your gists!

    mkdir gist_archive
    cd gist_archive
    for repo in $(gist -l | awk '{ print $1 }'); do git clone $repo 2> /dev/null; done
  4. Now you have a snapshot of all your gists. To update them in future, you can run the above for any new gists, and update all the existing ones with:

    cd gist_archive
    for i in */; do (cd $i && git pull --rebase); done

Now go forth and search out your favourite snippet you saved years ago and forgot about!


  1. ack, ag, grep, ripgrep, etc. Pick your flavour. [return]

Marc Brooker (mjb)

Control Planes vs Data Planes March 17, 2019 12:00 AM

Control Planes vs Data Planes

Are there multiple things here?

If you want to build a successful distributed system, one of the most important things to get right is the block diagram: what are the components, what does each of them own, and how do they communicate with other components. It's such a basic design step that many of us don't think about how important it is, and how difficult and expensive it can be to make changes to the overall architecture once the system is in production. Getting the block diagram right helps with the design of database schemas and APIs, helps reason through the availability and cost of running the system, and even helps form the right org chart to build the design.

One very common pattern when doing these design exercises is to separate components into a control plane and a data plane, recognizing the differences in requirements between these two roles.

No true monoliths

The microservices and SOA design approaches tend to push towards more blocks, with each block performing a smaller number of functions. The monolith approach is the other end of the spectrum, where the diagram consists of a single block. Arguments about these two approaches can be endless, but ultimately not important. It's worth noting, though, that there are almost no true monoliths. Some kinds of concerns are almost always separated out. Here's a partial list:

  1. Storage. Most modern applications separate business logic from storage and caching, and talk through APIs to their storage.
  2. Load Balancing. Distributed applications need some way for clients to distribute their load across multiple instances.
  3. Failure tolerance. Highly available systems need to be able to handle the failure of hardware and software without affecting users.
  4. Scaling. Systems which need to handle variable load may add and remove resources over time.
  5. Deployments. Any system needs to change over time.

Even in the most monolithic application, these are separate components of the system, and need to be built into the design. What's notable here is that these concerns can be broken into two clean categories: data plane and control plane. Along with the monolithic application itself, storage and load balancing are data plane concerns: they are required to be up for any request to succeed, and scale O(N) with the number of requests the system handles. On the other hand, failure tolerance, scaling and deployments are control plane concerns: they scale differently (either with a small multiple of N, with the rate of change of N, or with the rate of change of the software) and can break for some period of time before customers notice.

Two roles: control plane and data plane

Every distributed system has components that fall roughly into these two roles: data plane components that sit on the request path, and control plane components which help that data plane do its work. Sometimes, the control plane components aren't components at all, but rather people and processes; the pattern is the same though. With this pattern worked out, the block diagram of the system starts to look something like this:

Data plane and control plane separated into two blocks

My colleague Colm MacCárthaigh likes to think of control planes from a control theory approach, separating the system (the data plane) from the controller (the control plane). That's a very informative approach, and you can hear him talk about it here:

I tend to take a different approach, looking at the scaling and operational properties of systems. As in the example above, data plane components are the ones that scale with every request1, and need to be up for every request. Control plane components don't need to be up for every request, and instead only need to be up when there is work to do. Similarly, they scale in different ways. Some control plane components, such as those that monitor fleets of hosts, scale with O(N/M), where N is the number of requests and M is the requests per host. Other control plane components, such as those that handle scaling the fleet up and down, scale with O(dN/dt). Finally, control plane components that perform work like deployments scale with code change velocity.
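
To make those scaling factors concrete: if the data plane serves N = 100,000 requests per second and each host handles M = 1,000 of them, a health-monitoring control plane is watching on the order of 100 hosts rather than 100,000 requests, and a component that scales the fleet only has work to do when that host count needs to change.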

Finding the right separation between control and data planes is, in my experience, one of the most important things in a distributed systems design.

Another view: compartmentalizing complexity

In their classic paper on Chain Replication, van Renesse and Schneider write about how chain replicated systems handle server failure:

In response to detecting the failure of a server that is part of a chain (and, by the fail-stop assumption, all such failures are detected), the chain is reconfigured to eliminate the failed server. For this purpose, we employ a service, called the master

Fair enough. Chain replication can't handle these kinds of failures without adding significant complexity to the protocol. So what do we expect of the master?

In what follows, we assume the master is a single process that never fails.

Oh. Never fails, huh? They then go on to say that they approach this by replicating the master on multiple hosts using Paxos. If they have a Paxos implementation available, then why not just use that and not bother with this Chain Replication thing at all? The paper doesn't say2, but I have my own opinion: it's interesting to separate them because Chain Replication offers a different set of performance, throughput, and code complexity trade-offs than Paxos3. It is possible to build a single code base (and protocol) which handles both concerns, but at the cost of coupling these two different concerns. Instead, by making the master a separate component, the chain replicated data plane implementation can focus on the things it needs to do (scale, performance, optimizing for every byte). The control plane, which only needs to handle the occasional failure, can focus on what it needs to do (extreme availability, locality, etc). Each of these different requirements adds complexity, and separating them out allows a system to compartmentalize its complexity, and reduce coupling by offering clear APIs and contracts between components.

Breaking down the binary

Say you build an awesome data plane based on chain replication, and an awesome control plane (master) for that data plane. At first, because of its lower scale, you can operate the control plane manually. Over time, as your system becomes successful, you'll start to have too many instances of the control plane to manage by hand, so you build a control plane for that control plane to automate the management. This is the first way the control/data binary breaks down: at some point control planes need their own control planes. Your controller is somebody else's system under control.

One other way the binary breaks down is with specialization. The master in the chain replicated system handles fault tolerance, but may not handle scaling, or sharding of chains, or interacting with customers to provision chains. In real systems there are frequently multiple control planes which control different aspects of the behavior of a system. Each of these control planes has its own differing requirements, requiring different tools and different expertise. Control planes are not homogeneous.

These two problems highlight that the idea of control planes and data planes may be too reductive to be a core design principle. Instead, it's a useful tool for helping identify opportunities to reduce and compartmentalize complexity by introducing good APIs and contracts, to ensure components have a clear set of responsibilities and ownership, and to use the right tools for solving different kinds of problems. Separating the control and data planes should be a heuristic tool for good system design, not a goal of system design.

Footnotes:

  1. Or potentially with every request. Things like caches complicate this a bit.
  2. It does compare Chain Replication to other solutions, but doesn't specifically talk about the benefits of separation. Murat Demirbas pointed out that Chain Replication's ability to serve linearizable reads from the tail is important. He also pointed me at the Object Storage on CRAQ paper, which talks about how to serve reads from intermediate nodes. Thanks, Murat!
  3. For one definition of Paxos. Lamport's Vertical Paxos paper sees chain replication as a flavor of Paxos, and more recent work by Heidi Howard et al on Flexible Paxos makes the line even less clear.

March 16, 2019

Richard Kallos (rkallos)

Memento Mori - Seeing the End March 16, 2019 09:30 PM

Memento Mori (translated as “remember death”) is a powerful idea and practice. In this post, I make the case that it’s important to think not just about your death, but to clearly define what it means to be finished in whatever you set out to do.

We are mites on a marble floating in the endless void. Our lifespans are blinks in cosmic history. Furthermore, for many of us, our contributions are likely to be forgotten soon after we rejoin the earth, if not sooner.

This is great news.

Whenever I’m feeling nervous or embarrassed, I start to feel better when I realize that nobody in front of me is going to be alive 100 years from now, and I doubt that they’ll be telling their grandchildren about that time when Richard made a fool of himself, because I try to make a fool of myself often enough that it’s usually not worth telling people about.

Knowing that we are finite is also pretty motivating. I feel less resistance to starting new things. It doesn’t have to be perfect; in fact, it’s probably going to be average. However, it’s my journey, so it’s special to me, and I probably (hopefully?) learned and improved along the way.

It’s important to think about the ends of things, even when we don’t necessarily want things to end. Endings are as much a part of life as beginnings are; to think otherwise is delusion. Endings tend to have a reputation for being sad, but they don’t always have to be.

For example, some developers of open source software get stuck working on their projects for far longer than they expected. It’s unfortunate that creating something that people enjoy can turn into a source of grief and resentment.

Specifying an end of any endeavor is an important task. If no ‘end state’ is declared, it’s possible that a project will continue to take up time and effort, perpetually staying on the back-burner of things you have going on, draining you of resources until you are no longer able to start anything new.

Spending time thinking about what your finished project will look like sets a target for you to achieve, which is a point I’ll elaborate on very soon. This exercise, along with evaluating where you currently are on your path toward achieving your goal or finishing your project, is immensely useful for getting your brain to focus on the intermediate tasks that need to be finished in order to get to that idealized ‘end state’.

All in all, while it’s sometimes nice to simply wander, it’s important to acknowledge that you are always going somewhere, even when you think you’re standing still. You should be the one who decides where you go, not someone else.

March 14, 2019

Derek Jones (derek-jones)

Altruistic innovation and the study of software economics March 14, 2019 02:11 PM

Recently, I have been reading rather a lot of papers that are ostensibly about the economics of markets where applications, licensed under an open source license, are readily available. I say ostensibly, because the authors have some very odd ideas about the activities of those involved in the production of open source.

Perhaps I am overly cynical, but I don’t think altruism is the primary motivation for developers writing open source. Yes, there is an altruistic component, but I would list enjoyment as the primary driver; developers enjoy solving problems that involve the production of software. On the commercial side, companies are involved with open source because of naked self-interest, e.g., commoditizing software that complements their products.

It may surprise you to learn that academic papers, written by economists, tend to be knee-deep in differential equations. As a physics/electronics undergraduate I got to spend lots of time studying various differential equations (each relating to some aspect of the workings of the Universe). Since graduating, I have rarely encountered them; that is, until I started reading economics papers (or at least trying to).

Using differential equations to model problems in economics sounds like a good idea, after all they have been used to do a really good job of modeling how the universe works. But the universe is governed by a few simple principles (or at least the bit we have access to is), and there is lots of experimental data about its behavior. Economic issues don’t appear to be governed by a few simple principles, and there is relatively little experimental data available.

Writing down a differential equation is easy; figuring out an analytic solution can be extremely difficult. The Navier-Stokes equations were written down 200 years ago, and we are still awaiting a general solution (solutions for a variety of special cases are known).

To keep their differential equations solvable, economists make lots of simplifying assumptions. Having obtained a solution to their equations, there is little or no evidence to compare it against. I cannot speak for economics in general, but those working on the economics of software are completely disconnected from reality.

What factors, other than altruism, do academic economists think are of major importance in open source? No, not constantly reinventing the wheel-barrow, but constantly innovating. Of course, everybody likes to think they are doing something new, but in practice it has probably been done before. Innovation is part of the business zeitgeist and academic economists are claiming to see it everywhere (and it does exist in their differential equations).

The economics of Linux vs. Microsoft Windows is a common comparison, i.e., open vs. closed source; I have not seen any mention of other open source operating systems. How might an economic analysis of different open source operating systems be framed? How about: “An economic analysis of the relative enjoyment derived from writing an operating system, Linux vs BSD”? Or the joy of writing an editor, which must be lots of fun, given how many text editors are available.

I have added the topics altruism and innovation to my list of indicators of poor quality, used to judge whether it’s worth spending more than 10 seconds reading a paper.

March 13, 2019

Oleg Kovalov (olegkovalov)

Indeed, I should add it. Haven’t used it for a long time. March 13, 2019 05:18 PM

Indeed, I should add it. Haven’t used it for a long time.

Wesley Moore (wezm)

My Rust Powered linux.conf.au e-Paper Badge March 13, 2019 09:39 AM

This week I attended linux.conf.au (for the first time) in Christchurch, New Zealand. It's a week long conference covering Linux, open source software and hardware, privacy, security and much more. The theme this year was IoT. In line with the theme I built a digital conference badge to take to the conference. It used a tri-colour e-Paper display and was powered by a Rust program I built running on Raspbian Linux. This post describes how it was built, how it works, and how it fared at the conference. The source code is on GitHub.

The badge in its final state after the conference.

Building

After booking my tickets in October I decided I wanted to build a digital conference badge. I'm not entirely sure what prompted me to do this but it was a combination of seeing projects like the BADGEr in the past, the theme of linux.conf.au 2019 being IoT, and an excuse to write more Rust. Since it was ostensibly a Linux conference it also seemed appropriate for it to run Linux.

Over the next few weeks I collected the parts and adaptors to build the badge. The main components were a Raspberry Pi Zero W and a Pimoroni Inky pHAT e-Paper display:

The Raspberry Pi Zero W is a single core 1GHz ARM SoC with 512MB RAM, Wi-Fi, Bluetooth, microSD card slot, and mini HDMI. The Inky pHAT is a 212x104 pixel tri-colour (red, black, white) e-Paper display. It takes about 15 seconds to refresh the display but it draws very little power in between updates and the image persists even when power is removed.

Support Crates

The first part of the project involved building a Rust driver for the controller in the e-Paper display. That involved determining what controller the display used, as Pimoroni did not document it. Searching online for some of the comments in the Python driver suggested the display was possibly a HINK-E0213A07 from Holitech Co. Further searching based on the datasheet for that display suggested that the controller was a Solomon Systech SSD1675. Cross referencing the display datasheet, SSD1675 datasheet, and the Python source of Pimoroni's Inky pHAT driver suggested I was on the right track.

I set about building the Rust driver for the SSD1675 using the embedded HAL traits. These traits allow embedded Rust drivers to be built against a de facto standard set of traits, so that a driver can be used in any environment that implements them. For example, I make use of traits for SPI devices and GPIO pins, which are implemented for Linux as well as, say, the STM32F30x family of microcontrollers. This allows the driver to be written once and used on many devices.
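
To make that concrete, here's a minimal, hypothetical skeleton of such a driver. It assumes embedded-hal 0.2's blocking SPI and digital v2 traits and is not the actual ssd1675 API, just the shape of the idea:

use embedded_hal::blocking::spi::Write;
use embedded_hal::digital::v2::OutputPin;

// Hypothetical driver skeleton: generic over the embedded-hal traits, so the
// same code can drive the display whether the traits are implemented by
// linux-embedded-hal on a Raspberry Pi or by a microcontroller HAL.
pub struct EPaper<SPI, DC> {
    spi: SPI,
    dc: DC, // data/command GPIO pin
}

impl<SPI, DC> EPaper<SPI, DC>
where
    SPI: Write<u8>,
    DC: OutputPin,
{
    pub fn new(spi: SPI, dc: DC) -> Self {
        EPaper { spi, dc }
    }

    // Pull the data/command pin low, then clock a command byte out over SPI.
    pub fn command(&mut self, byte: u8) -> Result<(), SPI::Error> {
        let _ = self.dc.set_low(); // pin errors are ignored in this sketch
        self.spi.write(&[byte])
    }
}

// Dummy implementations stand in for real hardware, just to show the driver
// compiling and running against anything that implements the traits.
struct LoggingSpi;
impl Write<u8> for LoggingSpi {
    type Error = ();
    fn write(&mut self, words: &[u8]) -> Result<(), ()> {
        println!("spi write: {:?}", words);
        Ok(())
    }
}

struct DummyPin;
impl OutputPin for DummyPin {
    type Error = ();
    fn set_low(&mut self) -> Result<(), ()> { Ok(()) }
    fn set_high(&mut self) -> Result<(), ()> { Ok(()) }
}

fn main() {
    let mut display = EPaper::new(LoggingSpi, DummyPin);
    display.command(0x12).unwrap(); // send an arbitrary command byte
}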

The result was the ssd1675 crate. It's a so-called no_std crate. That means it does not use the Rust standard library, instead sticking only to the core library. This allows the crate to be used on devices and microcontrollers without features like file systems, or heap allocators. The crate also makes use of the embedded-graphics crate, which makes it easy to draw text and basic shapes on the display in a memory efficient manner.

While testing the ssd1675 crate I also built another crate, profont, which provides 7 sizes of the ProFont font for embedded graphics. The profont crate was published 24 Nov 2018, and ssd1675 was published a month later on 26 Dec 2018.

The Badge Itself

Now that I had all the prerequisites in place I could start working on the badge proper. I had a few goals for the badge and its implementation:

  • I wanted it to have some interactive component.
  • I wanted there to be some sort of Internet aspect to tie in with the IoT theme of the conference.
  • I wanted the badge to be entirely powered by a single, efficient Rust binary, that did not shell out to other commands or anything like that.
  • Ideally it would be relatively power efficient.

An early revision of the badge from 6 Jan 2019 showing my name, website, badge IP, and kernel info.

I settled on having the badge program serve up a web page with some information about the project, myself, and some live stats of the Raspberry Pi (OS, kernel, uptime, free RAM). The plain text version of the page looked like this:

Hi I'm Wes!

Welcome to my conference badge. It's powered by Linux and
Rust running on a Raspberry Pi Zero W with a tri-colour Inky
pHAT ePaper dispay. The source code is on GitHub:

https://github.com/wezm/linux-conf-au-2019-epaper-badge


Say Hello
---------

12 people have said hi.

Say hello in person and on the badge. To increment the hello
counter on the badge:

    curl -X POST http://10.0.0.18/hi


About Me
--------

I'm a software developer from Melbourne, Australia. I
currently work at GreenSync building systems to help make
better use of renewable energy.

Find me on the Internet at:

   Email: wes@wezm.net
  GitHub: https://github.com/wezm
Mastodon: https://mastodon.social/@wezm
 Twitter: https://twitter.com/wezm
 Website: http://www.wezm.net/


Host Information
----------------

   (_\)(/_)   OS:        Raspbian GNU/Linux
   (_(__)_)   KERNEL:    Linux 4.14.79+
  (_(_)(_)_)  UPTIME:    3m
   (_(__)_)   MEMORY:    430.3 MB free of 454.5 MB
     (__)


              .------------------------.
              |    Powered by Rust!    |
              '------------------------'
                              /
                             /
                      _~^~^~_
                  \) /  o o  \ (/
                    '_   -   _'
                    / '-----' \

The interactive part came in the form of a virtual "hello" counter. Each HTTP POST to the /hi endpoint incremented the count, which was shown on the badge. The badge displayed the URL of the page. The URL was just the badge's IP address on the conference Wi-Fi. To provide a little protection against abuse I added code that only allowed a given IP to increment the count once per hour.
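
The actual implementation is in the repository linked above; purely as an illustration of the rate-limiting idea (hypothetical names, no HTTP wiring), it boils down to a counter plus a map of when each IP last said hello:

use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

// Hello counter with a one-hour per-IP cooldown. This is a sketch of the
// idea, not the badge's actual code.
struct HelloCounter {
    count: u64,
    last_hello: HashMap<IpAddr, Instant>,
}

impl HelloCounter {
    fn new() -> Self {
        HelloCounter { count: 0, last_hello: HashMap::new() }
    }

    // Returns true if this request incremented the counter, false if the IP
    // has already said hello within the last hour.
    fn say_hello(&mut self, ip: IpAddr) -> bool {
        let now = Instant::now();
        if let Some(&last) = self.last_hello.get(&ip) {
            if now.duration_since(last) < Duration::from_secs(60 * 60) {
                return false;
            }
        }
        self.last_hello.insert(ip, now);
        self.count += 1;
        true
    }
}

fn main() {
    let mut counter = HelloCounter::new();
    let ip: IpAddr = "10.0.0.99".parse().unwrap();
    assert!(counter.say_hello(ip));  // the first hello from an IP counts
    assert!(!counter.say_hello(ip)); // a repeat within the hour does not
    println!("{} people have said hi.", counter.count);
}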

When building the badge software these are some of the details and things I strived for:

  • Handle Wi-Fi going away
  • Handle IP address changing
  • Prevent duplicate submissions
  • Pluralisation of text on the badge and on the web page
  • Automatically shift the text as the count requires more digits
  • Serve plain text and HTML pages (see the sketch after this list):
    • If the web page is requested with an Accept header that doesn't include text/html (e.g. curl) then the response is plain text and the way to "say hello" is a curl command.
    • If the user agent indicates they accept HTML then the page is HTML and contains a form with a button to "say hello".
  • Avoid aborting on errors:
    • I kind of ran out of time to handle all errors well, but most are handled gracefully and won't abort the program. In some cases a default is used in the face of an error. In other cases I just resorted to logging a message and carrying on.
  • Keep memory usage low:
    • The web server efficiently discards any large POST requests sent to it, to avoid exhausting RAM.
    • Typical RAM stats showed the Rust program using about 3MB of RAM.
  • Be relatively power efficient:
    • Use Rust instead of a scripting language
    • Only update the display when something it's showing changes
    • Only check for changes every 15 seconds (the rest of the time that thread just sleeps)
    • Put the display into deep sleep after updating
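
The Accept header check mentioned in the list above doesn't need to be anything fancy; a minimal sketch of that predicate (again, not the badge's actual code) looks like this:

// Decide whether the client wants HTML. Anything that doesn't explicitly
// accept text/html (curl sends Accept: */* by default) gets the plain text
// page, matching the behaviour described above.
fn wants_html(accept_header: Option<&str>) -> bool {
    accept_header.map_or(false, |value| value.contains("text/html"))
}

fn main() {
    assert!(wants_html(Some("text/html,application/xhtml+xml")));
    assert!(!wants_html(Some("*/*")));
    assert!(!wants_html(None));
}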

I used hyper for the HTTP server built into the binary. To get a feel for the limits of the device I did some rudimentary HTTP benchmarking with wrk and concluded that 300 requests per second was probably going to be fine. ;-)

Running 10s test @ http://10.0.0.18:8080/
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   316.58ms   54.41ms   1.28s    92.04%
    Req/Sec    79.43     43.24   212.00     67.74%
  3099 requests in 10.04s, 3.77MB read
Requests/sec:    308.61
Transfer/sec:    384.56KB

Mounting

When I started the project I imagined it would hang around my neck like a conference lanyard. By the time departure day arrived I still hadn't worked out how this would work in practice (power delivery being a major concern). In the end I settled on attaching it to the strap on my backpack. My bag has lots of webbing so there were plenty of loops to hold it in place. I was also able to use the Velcro covered holes intended for water tubes to get the cable neatly into the bag.

At the Conference

I had everything pretty much working for the start of the conference, although I did make some improvements and added a systemd unit to automatically start and restart the Rust binary. At this point there were still two unknowns: battery life and how the Raspberry Pi would handle coming in and out of Wi-Fi range. The Wi-Fi turned out fine: it automatically reconnected whenever it came back into range.

Badge displaying a count of zero. Ready for day 1

Reception

Day 1 was a success! I had several people talk to me about the badge and increment the counter. Battery life was good too. After 12 hours of uptime the battery was still showing it was half full. Later in the week I left the badge running overnight and hit 24 hours uptime. The battery level indicator was on the last light so I suspect there wasn't much juice left.

Me with badge display showing a hello count of 1. Me after receiving my first hello on the badge

On day 2 I had several people suggest that I needed a QR code for the URL. Turns out entering an IP address on a phone keyboard is tedious. So that evening I added a QR code to the display. It's dynamically generated and contains the same URL that is shown on the display. There were several good crates to choose from. Ultimately I picked one that didn't have any image dependencies, which allowed me to convert the data into embedded-graphics pixels. The change was a success: most people scanned the QR code from this point on.

Badge display showing the newly added QR code

On day 2 I also ran into E. Dunham, and rambled briefly about my badge project and that it was built with Rust. To my absolute delight the project was featured in their talk the next day. The project was mentioned and linked on a slide and I was asked to raise my hand in case anyone wanted to chat afterwards.

Photo of E. Dunham's slide with a link to my git repo

At the end of the talk the audience was encouraged to tell the rest of the room about a Rust project they were working on. Each person that did so got a little plush Ferris. I spoke about Read Rust.

Photo of a small orange plush crab. Plush Ferris

Conclusion

By the end of the conference the badge showed a count of 12. It had worked flawlessly over the five days.

Small projects with a fairly hard deadline are a good way to ensure they're seen through to completion. They're also a great motivator to publish some open source code.

I think I greatly overestimated the number of people that would interact with the badge. Of those that did, I think most tapped the button to increase the counter and didn't read much else on the page. For example no one commented on the system stats at the bottom. I had imagined the badge as a sort of digital business card but this did not really eventuate in practice.

Attaching the Pi and display to my bag worked out pretty well. I did have to be careful when putting my bag on as it was easy to catch on my clothes. Also one day it started raining on the walk back to the accommodation. I had not factored that in at all and given it wasn't super easy to take on and off I ended up shielding it with my hand all the way back.

Would I Do It Again?

Maybe. If I were to do it again I might do something less interactive and perhaps more informational but updated more regularly. I might try to tie the project into a talk submission too. For example, I could have submitted a talk about using the embedded Rust ecosystem on a Raspberry Pi and made reference to the badge in the talk or used it for examples. I think this would give more info about the project to a bunch of people at once and also potentially teach them something at the same time.

All in all it was a fun project and excellent conference. If you're interested, the Rust source for the badge is on GitHub.




Rebuilding My Personal Infrastructure With Alpine Linux and Docker March 13, 2019 09:37 AM

For more than a decade I have run one or more servers to host a number of personal websites and web applications. Recently I decided it was time to rebuild the servers to address some issues and make improvements. The last time I did this was in 2016 when I switched the servers from Ubuntu to FreeBSD. The outgoing servers were managed with Ansible. After being a Docker skeptic for a long time I have finally come around to it recently and decided to rebuild on Docker. This post aims to describe some of the choices made, and why I made them.

Before we start I'd like to take a moment to acknowledge this infrastructure is built to my values in a way that works for me. You might make different choices and that's ok. I hope you find this post interesting but not prescriptive.

Before the rebuild this is what my infrastructure looked like:

You'll note 3 servers, across 2 countries, and 2 hosting providers. Also the Rust Melbourne server was not managed by Ansible like the other two were.

I had a number of goals in mind with the rebuild:

  • Move everything to Australia (where I live)
  • Consolidate onto one server
  • https enable all websites

I set up my original infrastructure in the US because it was cheaper at the time and most traffic to the websites I host comes from the US. The Wizards Mattermost instance was added later. It's for a group of friends that are all in Australia. Being in the US made it quite slow at times, especially when sharing and viewing images.

Another drawback to administering servers in the US from AU was that it makes the Ansible cycle time of "make a change, run it, fix it, repeat", excruciatingly slow. It had been on my to do list for a long time to move Wizards to Australia but I kept putting it off because I didn't want to deal with Ansible.

While having a single server that does everything wouldn't be the recommended architecture for business systems, for personal hosting where the small chance of downtime isn't going to result in loss of income the simplicity won out, at least for now.

This is what I ended up building. Each box is a Docker container running on the host machine:

Graph of services

I haven't always been in favour of Docker but I think enough time has passed to show that it's probably here to stay. There are some really nice benefits to Docker managed services too, such as building locally and then shipping the image to production, and isolation from the host system (in the sense that you can just nuke the container and rebuild it if needed).

Picking a Host OS

Moving to Docker unfortunately ruled out FreeBSD as the host system. There is a very old Docker port for FreeBSD but my previous attempts at using it showed that it was not in a good enough state to use for hosting. That meant I needed to find a suitable Linux distro to act as the Docker host.

Coming from FreeBSD I'm a fan of the stable base + up-to-date packages model. For me this ruled out Debian (stable) based systems, which I find often have out-of-date or missing packages -- especially in the latter stages of the release cycle. I did some research to see if there were any distros that used a BSD style model. Most I found were either abandoned or one person operations.

I then recalled that as part of his Sourcehut work, Drew DeVault was migrating things to Alpine Linux. I had played with Alpine in the past (before it became famous in the Docker world), and I consider Drew's use some evidence in its favour.

Alpine describes itself as follows:

Alpine Linux is an independent, non-commercial, general purpose Linux distribution designed for power users who appreciate security, simplicity and resource efficiency.

Now that's a value statement I can get behind! Other things I like about Alpine Linux:

  • It's small, only including the bare essentials:
    • It avoids bloat by using musl-libc (which is MIT licensed) and busybox userland.
    • It has a 37MB installation ISO intended for virtualised server installations.
  • It was likely to be (and ended up being) the base of my Docker images.
  • It enables a number of security features by default.
  • Releases are made every ~6 months and are supported for 2 years.

Each release also has binary packages available in a stable channel that receives bug fixes and security updates for the lifetime of the release as well as a rolling edge channel that's always up-to-date.

Note that Alpine Linux doesn't use systemd, it uses OpenRC. This didn't factor into my decision at all. systemd has worked well for me on my Arch Linux systems. It may not be perfect but it does do a lot of things well. Benno Rice did a great talk at linux.conf.au 2019, titled, The Tragedy of systemd, that makes for interesting viewing on this topic.

Building Images

So with the host OS selected I set about building Docker images for each of the services I needed to run. There are a lot of pre-built Docker images for software like nginx, and PostgreSQL available on Docker Hub. Often they also have an alpine variant that builds the image from an Alpine base image. I decided early on that these weren't really for me:

  • A lot of them build the package from source instead of just installing the Alpine package.
  • The Docker build was more complicated than I needed as it was trying to be a generic image that anyone could pull and use.
  • I wasn't a huge fan of pulling random Docker images from the Internet, even if they were official images.

In the end I only need to trust one image from Docker Hub: the 5MB Alpine image. All of my images are built on top of this one image.

Update 2 Mar 2019: I am no longer depending on any Docker Hub images. After the Alpine Linux 3.9.1 release I noticed the official Docker images had not been updated so I built my own. Turns out it's quite simple. Download the miniroot tarball from the Alpine website and then add it to a Docker image:

FROM scratch

ENV ALPINE_ARCH x86_64
ENV ALPINE_VERSION 3.9.1

ADD alpine-minirootfs-${ALPINE_VERSION}-${ALPINE_ARCH}.tar.gz /
CMD ["/bin/sh"]
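
With the miniroot tarball downloaded alongside that Dockerfile, building the base image is then just an ordinary docker build -t <tag> . run, with the tag being whatever the other images reference in their FROM lines.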

An aspect of Docker that I don't really like is that inside the container you are root by default. When building my images I made a point of having the entrypoint process run as a non-privileged user, or configuring the service to drop down to a regular user after starting.

Most services were fairly easy to Dockerise. For example here is my nginx Dockerfile:

FROM alpine:3.9

RUN apk update && apk add --no-cache nginx

COPY nginx.conf /etc/nginx/nginx.conf

RUN mkdir -p /usr/share/www/ /run/nginx/ && \
  rm /etc/nginx/conf.d/default.conf

EXPOSE 80

STOPSIGNAL SIGTERM

ENTRYPOINT ["/usr/sbin/nginx", "-g", "daemon off;"]

I did not strive to make the images especially generic. They just need to work for me. However I did make a point not to bake any credentials into the images and instead used environment variables for things like that.

Let's Encrypt

I've been avoiding Let's Encrypt up until now. Partly because the short expiry of the certificates seems easy to mishandle. Partly because of certbot, the recommended client. By default certbot is interactive, prompting for answers when you run it the first time, it wants to be installed alongside the webserver so it can manipulate the configuration, it's over 30,000 lines of Python (excluding tests, and dependencies), the documentation suggests running magical certbot-auto scripts to install it... Too big and too magical for my liking.

Despite my reservations I wanted to enable https on all my sites and I wanted to avoid paying for certificates. This meant I had to make Let's Encrypt work for me. I did some research and finally settled on acme.sh. It's written in POSIX shell and uses curl and openssl to do its bidding.

To avoid the need for acme.sh to manipulate the webserver config I opted to use the DNS validation method (certbot can do this too). This requires a DNS provider that has an API so the client can dynamically manipulate the records. I looked through the large list of supported providers and settled on LuaDNS.

LuaDNS has a nice git based workflow where you define the DNS zones with small Lua scripts and the records are published when you push to the repo. They also have the requisite API for acme.sh. You can see my DNS repo at: https://github.com/wezm/dns

Getting the acme.sh + hitch combo to play nice proved to be a bit of a challenge. acme.sh needs to periodically renew certificates from Let's Encrypt; these then need to be formatted for hitch, and hitch told about them. In the end I built the hitch image off my acme.sh image. This goes against the Docker ethos of one service per container but acme.sh doesn't run a daemon, it's periodically invoked by cron, so this seemed reasonable.

Combining Docker and cron is also a challenge. I ended up with a simple solution: use the host cron to docker exec acme.sh in the hitch container. Perhaps not "pure" Docker but a lot simpler than some of the options I saw.
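
For example (the container name and schedule here are made up), a host crontab entry along the lines of 0 3 * * * docker exec hitch acme.sh --cron would invoke acme.sh's built-in renewal routine once a day.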

Hosting

I've been a happy DigitalOcean customer for 5 years but they don't have a data centre in Australia. Vultr, which have a similar offering -- low cost, high performance servers and a well-designed admin interface -- do have a Sydney data centre. Other obvious options include AWS and GCP. I wanted to avoid these where possible as their server offerings are more expensive, and their platforms have a tendency to lock you in with platform specific features. Also in the case of Google, they are a massive surveillance capitalist that I don't trust at all. So Vultr were my host of choice for the new server.

Having said that, the thing with building your own images is that you need to make them available to the Docker host somehow. For this I used an Amazon Elastic Container Registry. It's much cheaper than Docker Hub for private images and is just a standard container registry so I'm not locked in.

Orchestration

Once all the services were Dockerised, there needed to be a way to run the containers, and make them aware of each other. A popular option for this is Kubernetes and for a larger, multi-server deployment it might be the right choice. For my single server operation I opted for Docker Compose, which is, "a tool for defining and running multi-container Docker applications". With Compose you specify all the services in a YAML file and it takes care of running them all together.

My Docker Compose file looks like this:

version: '3'
services:
  hitch:
    image: 791569612186.dkr.ecr.ap-southeast-2.amazonaws.com/hitch
    command: ["--config", "/etc/hitch/hitch.conf", "-b", "[varnish]:6086"]
    volumes:
      - ./hitch/hitch.conf:/etc/hitch/hitch.conf:ro
      - ./private/hitch/dhparams.pem:/etc/hitch/dhparams.pem:ro
      - certs:/etc/hitch/cert.d:rw
      - acme:/etc/acme.sh:rw
    ports:
      - "443:443"
    env_file:
      - private/hitch/development.env
    depends_on:
      - varnish
    restart: unless-stopped
  varnish:
    image: 791569612186.dkr.ecr.ap-southeast-2.amazonaws.com/varnish
    command: ["-F", "-a", ":80", "-a", ":6086,PROXY", "-p", "feature=+http2", "-f", "/etc/varnish/default.vcl", "-s", "malloc,256M"]
    volumes:
      - ./varnish/default.vcl:/etc/varnish/default.vcl:ro
    ports:
      - "80:80"
    depends_on:
      - nginx
      - pkb
      - binary_trance
      - wizards
      - rust_melbourne
    restart: unless-stopped
  nginx:
    image: 791569612186.dkr.ecr.ap-southeast-2.amazonaws.com/nginx
    volumes:
      - ./nginx/conf.d:/etc/nginx/conf.d:ro
      - ./volumes/www:/usr/share/www:ro
    restart: unless-stopped
  pkb:
    image: 791569612186.dkr.ecr.ap-southeast-2.amazonaws.com/pkb
    volumes:
      - pages:/home/pkb/pages:ro
    env_file:
      - private/pkb/development.env
    depends_on:
      - syncthing
    restart: unless-stopped
  binary_trance:
    image: 791569612186.dkr.ecr.ap-southeast-2.amazonaws.com/binary_trance
    env_file:
      - private/binary_trance/development.env
    depends_on:
      - db
    restart: unless-stopped
  wizards:
    image: 791569612186.dkr.ecr.ap-southeast-2.amazonaws.com/mattermost
    volumes:
      - ./private/wizards/config:/mattermost/config:rw
      - ./volumes/wizards/data:/mattermost/data:rw
      - ./volumes/wizards/logs:/mattermost/logs:rw
      - ./volumes/wizards/plugins:/mattermost/plugins:rw
      - ./volumes/wizards/client-plugins:/mattermost/client/plugins:rw
      - /etc/localtime:/etc/localtime:ro
    depends_on:
      - db
    restart: unless-stopped
  rust_melbourne:
    image: 791569612186.dkr.ecr.ap-southeast-2.amazonaws.com/mattermost
    volumes:
      - ./private/rust_melbourne/config:/mattermost/config:rw
      - ./volumes/rust_melbourne/data:/mattermost/data:rw
      - ./volumes/rust_melbourne/logs:/mattermost/logs:rw
      - ./volumes/rust_melbourne/plugins:/mattermost/plugins:rw
      - ./volumes/rust_melbourne/client-plugins:/mattermost/client/plugins:rw
      - /etc/localtime:/etc/localtime:ro
    depends_on:
      - db
    restart: unless-stopped
  db:
    image: 791569612186.dkr.ecr.ap-southeast-2.amazonaws.com/postgresql
    volumes:
      - postgresql:/var/lib/postgresql/data
    ports:
      - "127.0.0.1:5432:5432"
    env_file:
      - private/postgresql/development.env
    restart: unless-stopped
  syncthing:
    image: 791569612186.dkr.ecr.ap-southeast-2.amazonaws.com/syncthing
    volumes:
      - syncthing:/var/lib/syncthing:rw
      - pages:/var/lib/syncthing/Sync:rw
    ports:
      - "127.0.0.1:8384:8384"
      - "22000:22000"
      - "21027:21027/udp"
    restart: unless-stopped
volumes:
  postgresql:
  certs:
  acme:
  pages:
  syncthing:

Bringing all the services up is one command:

docker-compose -f docker-compose.yml -f production.yml up -d

The best bit is I can develop and test it all in isolation locally. Then when it's working, push to ECR and then run docker-compose on the server to bring in the changes. This is a huge improvement over my previous Ansible workflow and should make adding or removing new services in the future fairly painless.

Closing Thoughts

The new server has been running issue free so far. All sites are now redirecting to their https variants with Strict-Transport-Security headers set and get an A grade on the SSL Labs test. The Wizards Mattermost is much faster now that it's in Australia too.

There is one drawback to this move though: my sites are now slower for a lot of visitors. https adds some initial negotiation overhead and if you're reading this from outside Australia there's probably a bunch more latency than before.

I did some testing with WebPageTest to get a feel for the impact of this. My sites are already quite compact. Firefox tells me this page and all its resources are 171KB (54KB transferred). So there's not a lot of slimming to be done there. One thing I did notice was that the TLS negotiation was happening for each of the parallel connections the browser opened to load the site.

Some research suggested HTTP/2 might help as it multiplexes requests on a single connection and only performs the TLS negotiation once. So I decided to live on the edge a little and enable Varnish's experimental HTTP/2 support. Retrieving the site over HTTP/2 did in fact reduce the TLS negotiations to one.

Thanks for reading, I hope the bits didn't take too long to get from Australia to wherever you are. Happy computing!




Oleg Kovalov (olegkovalov)

What I don’t like in your repo March 13, 2019 06:35 AM

What I Don’t Like In Your Repo

Hi everyone, I’m Oleg and I’m yelling at (probably your) repo.

This is a copy of my dialogue with a friend about how to make a good and helpful repo for any community of any size and any programming language.

Let’s start.

README says nothing

But it’s a crucial part of any repo!

It’s the first interaction with your potential user and this is a first impression that you might bring to the user.

After the name (and maybe a logo) it’s a good place to put a few badges like:

  • recent version
  • CI status
  • link to the docs
  • code quality
  • code coverage
  • even the number of users in a chat
  • or just scroll all of them on https://shields.io/

Personal fail: not so long ago I did a simple, hacky and a bit funny project in Go called sabotage. I put in a quote from a song, added a picture, but… haven’t provided any info about what it does.

It takes like 10 minutes to write a simple intro and explain what I’m sharing and what it can do.

There is no reason why you or I should skip it.

Custom license or no license at all

First and most important: DO. NOT. (RE)INVENT. LICENSE. PLEASE.

When you’re going to create a shiny new license or make any existing one much better, please ask yourself 17 times: what is the point of doing so?

Companies of any size are very conservative about licenses, ’cause the wrong one might destroy their business. So if you’re targeting a big audience, a custom license is a dumb way to do it.

There are a lot of guides on how to select a license, and leaving a project unlicensed or using an unpopular or joke license (like the WTFPL) will just be a bad sign for users.

Feel free to choose one of the most popular:

  • MIT — when you want to give it for free
  • BSD3 — when you want a bit more rights for you
  • Apache 2.0 — when it’s a commercial product
  • GPLv3 — which is also a good option

(that might be an opinionated list, but whatever)

No Dockerfile

It’s already 2019 and the containers have won this world.

It’s much simpler for anyone to run a docker pull foo/bar command than to download all the dependencies, configure paths, realise that some things might be incompatible, or be scared of totally destroying their system.

Is there a guarantee that there is no rm -rf in an unverified project? 😈

Adding a simple Dockerfile with everything needed can be done in 30 mins. But this will give your users a safe and fast way to start using, validating or helping to improve your work. A win-win situation.

Changes without pull requests

That might look weird, but give me a second.

When a project is small and there are 0 or few users — that might be okay. It’s easy to follow what happened in the last few days: fixes, new features, etc. But when the scale gets bigger, oh… it becomes a nightmare.

You have pushed a few commits to master, probably from your own computer, and no one saw what happened; there wasn’t any feedback. You may have broken API backward compatibility, forgotten to add or remove something, or even done useless work (oh, that’s a nasty one).

When you’re doing a pull request, some random guru-senior-architect might occasionally check your code and suggest a few changes. Sounds unlikely, but any additional eyes might uncover bugs or architecture mistakes.

Do not hide your work; isn’t that a reason for open sourcing it?

Bloated dependencies

Maybe it’s just me but I’m very conservative with dependencies.

When I see dozens of deps in the lock file, the first question which comes to my mind is: so, am I ready to fix any failures inside any of them?

Yeah, it works today, maybe it worked 1 week/month/year before, but can you guarantee what will happen tomorrow? I cannot.

No styling or formatting

Different files (sometimes even functions) are written in different styles.

This causes trouble for contributors, ‘cause one prefers spaces and another prefers tabs. And this is just the simplest example.

So what will the result be:

  • one file in one style and another in a completely different one
  • one file with { at the end of a line and another with { on a new line
  • one function in a functional style and, right below it, another in pure procedural style

Which of them is right? I dunno, but while this is acceptable if it works, it horribly distracts readers for no reason.

Simple rule for this: use formatters and linters: eslint, gofmt, rustfmt… oh, there are tons of them! Feel free to configure them as you like, but keep in mind that the most popular settings tend to feel the most natural.

No automatic builds

How can you verify that a user can build your code?

The answer is quite simple: a build system. Travis CI, GitLab CI, CircleCI, and that’s only a few of them.

Treat a build system as a silent companion that will check your every commit and automatically run formatters/linters to ensure that new code is of good quality. Sounds amazing, doesn’t it?

And adding a simple YAML file which describes how the build should be done takes minutes, as always.

No releases or Git tags

Master branch might be broken.

That happens. It’s unpleasant, but it happens.

Some recent changes might get merged and somehow cause trouble on master. How much time will it take you to fix? A few minutes? An hour? A day? Until you’re back from vacation? Who knows ¯\_(ツ)_/¯

But when there is a Git tag which points to a time when the project was correct and able to be built, oh, that’s a good thing to have, and it makes the life of your users much better.

Adding a release on GitHub (similarly on GitLab or any other place) takes literally seconds; there’s no reason to omit this step.

No tests

Well, it might be okay.

Of course, having correct tests is a nice thing to have, but probably you’re doing this project after work, in your free time or on a weekend (I’m guilty; I do this so often instead of having a rest).

So don’t be too strict with yourself; feel free to share your work and share knowledge. Tests can be added later; time with family and friends is more important, the same as mental and physical health.

Conclusion

There are a lot of other things that will make your repo even better, but maybe you will mention them in the comments?

Twitter: https://twitter.com/oleg_kovalov/status/1105719270116388864

Lobsters: https://lobste.rs/s/6gixqw/what_i_don_t_like_your_repo

HN: https://news.ycombinator.com/item?id=19376264

Reddit: https://www.reddit.com/r/programming/comments/b0isug/what_i_dont_like_in_your_repo/

Thanks.



March 12, 2019

Kevin Burke (kb)

Phone Number for SFMTA Temporary Sign Office March 12, 2019 04:10 AM

The phone number for the SFMTA Temporary Sign Office is very difficult to find. The SFMTA Temporary Sign web page directs you to 311. 311 does not know the right procedures for the Temporary Sign Office.

The office is also slow to respond to emails sent to the address on the website. The Temporary Sign department address listed on the website, at 1508 Bancroft Avenue, is not open to the public — it's just a locked door.

To contact the Temporary Sign Office, call 415-550-2716. This is the direct line to the department. I reached someone in under a minute.

If your event is more than 90 days in the future, don't expect an update. They don't start processing signage applications until 90 days before the event.

Here's a photo of my large son outside of the SFMTA Temporary Sign Office, where I did not find anyone to speak with, but I found the phone number that got me the right phone number to get someone to give me an update on my application.

Using an AWS Aurora Postgres Database as a Source for Database Migration Service March 12, 2019 03:53 AM

Say you have an Aurora RDS PostgreSQL database that you want to use as the source database for Amazon Database Migration Service (DMS).

The documentation is unclear on this point so here you go: you can't use an Aurora RDS PostgreSQL database as the source database because Aurora doesn't support the concept of replication slots, which is how Amazon DMS migrates data from one database to another.

Better luck with other migration tools!

Andreas Zwinkau (qznc)

The New Economics March 12, 2019 12:00 AM

A book review which is about system thinking, statistics, learning, and psychology.

Read full article!

March 11, 2019

Maxwell Bernstein (tekknolagi)

Understanding the 100 prisoners problem March 11, 2019 04:04 PM

I visited my friends Chris and Yuki in Seattle. After lunch, Chris threw us a brainteaser: the 100 prisoners problem. For those not familiar, Minute Physics has a great YouTube video about it.

For those who would prefer not to watch a video, a snippet from the Wikipedia page is attached here:

In this problem, 100 numbered prisoners must find their own numbers in one of 100 drawers in order to survive. The rules state that each prisoner may open only 50 drawers and cannot communicate with other prisoners. At first glance, the situation appears hopeless, but a clever strategy offers the prisoners a realistic chance of survival.

and for some reason that snippet sounds like the voice-over to a movie trailer.

Since we did not have a good intuitive grasp of the solution and reasoning, we decided to simulate the experiment and run some numbers. When in doubt, implement it yourself, right?

The Minute Physics video has 100 boxes, but we should generalize to n. Since the boxes in the room are shuffled at the beginning of each experiment, we start by shuffling a list of the n box numbers (0 to n-1, matching Python’s range):

import random


def sample(n=100, limit=50):
    boxes = list(range(n))
    random.shuffle(boxes)
    return sum(try_find_self(boxes, person, limit) for person in range(n))

Then, for each person (which is the same as “for each box” in this case), attempt to find their hidden box using the method described in the video. Since try_find_self yields a success (True) or a failure (False), summing should give the number of people who found their boxes.

def try_find_self(boxes, start, limit):
    next_box = boxes[start]
    num_opened = 1
    while next_box != start and num_opened < limit:
        next_box = boxes[next_box]
        num_opened += 1
    return next_box == start

The try_find_self function implements the strategy described in the video: start at the box indexed by your number (not necessarily containing your number) and follow that linked list of boxes until you either hit the limit or find your number. If the next box at the end is yours, you have found your box!

Now, this isn’t very interesting on its own. We can run an experiment, sure, but we still have to analyze the results of the data over multiple samples and varying parameters.

In order to do that, we made some visualizations. We start off by importing all of the usual suspects:

import random
import simulate

import matplotlib.pyplot as plt
import numpy as np

Then, in order to get reproducible results, seed the random number generator. This was essential for improving our implementations of both the visualizations and the simulations while verifying that the end results did not change.

random.seed(5)

In order to get a feel for the effect of different parameters on the probability of a group of people winning, we varied the number of boxes and the maximum number of tries. It’s a good thing we tried this, since our intuition about how the results scale with the ratio was very wrong.

num_samples = 1000
max_tries_options = np.arange(5, 50, 10)
num_box_options = np.arange(10, 100, 10)

Since our sampler only takes one parameter pair at once, we have to vectorize our function. Note that we specify otypes, because otherwise vectorize has to run the sample function with the first input multiple times in order to determine the type of the output. This is a known issue and was very annoying to debug, given the randomness.

vsample = np.vectorize(simulate.sample, otypes=[int])
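To see the otypes gotcha in isolation, here is a small sketch of my own (not from the post) that simply counts calls, reusing the np import from above; exact counts can vary with the NumPy version and the cache argument, but without otypes, vectorize makes an extra probe call on the first element to work out the output type, which is exactly what perturbs a seeded random sampler.

calls = 0

def counted(n, limit):
    global calls
    calls += 1
    return n + limit

np.vectorize(counted)(np.array([1, 2, 3]), 10)
print(calls)  # expect 4: one extra probe call on the first element

calls = 0
np.vectorize(counted, otypes=[int])(np.array([1, 2, 3]), 10)
print(calls)  # expect 3: no probe call needed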

Now we take samples at all combinations of the parameters, num_samples times. This returns a large NumPy array with dimensions like results[sample_num][max_tries][num_boxes]. For each sample, all of the combinations of parameters are tried and returned in a 2D grid.

params = np.meshgrid(num_box_options, max_tries_options)
results = np.array([vsample(*params) for _ in range(num_samples)])

This produces some nice data, like this:

[[[10  2  0 ...  1  7  6]
  [10  4 30 ... 32  7  1]
  [10 20 30 ...  1 11 35]
  [10 20 30 ... 70 41 40]
  [10 20 30 ...  3 29 30]]

 ...

 [[ 4 13 18 ...  3  2  3]
  [10  0 30 ... 31 11 47]
  [10 20 30 ... 43  0 34]
  [10 20 30 ... 29 80 45]
  [10 20 30 ... 70 80 90]]]

While it’s all well and good to know how many people in each sample found their boxes, we want to visualize the probability of a group winning. Remember that a group win is defined as all n people finding their number in a box. To calculate that probability, we binarize the results and get the mean success rate across all the samples.

results_bin = np.sum(results == num_box_options, axis=0) / num_samples

This turns the results from above into an array like this:

[[0.337 0.012 0.    0.    0.    0.    0.    0.    0.   ]
 [1.    0.699 0.338 0.127 0.029 0.007 0.003 0.    0.   ]
 [1.    1.    0.836 0.545 0.304 0.181 0.072 0.038 0.012]
 [1.    1.    1.    0.871 0.662 0.462 0.316 0.197 0.093]
 [1.    1.    1.    1.    0.907 0.694 0.54  0.429 0.313]]

which has dimensions results_bin[max_tries][num_boxes].

If you are unfamiliar with the term binarize, I was too until last night. It means reduce to a success/failure value.
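To make the broadcasting in that one-liner concrete, here is a tiny toy example of my own (not from the post): 3 samples, 2 values of max_tries, and box counts of 4 and 6. The comparison against the box counts broadcasts along the last axis, and summing over axis 0 counts the samples that were full wins.

toy_box_options = np.array([4, 6])
toy_results = np.array([
    [[4, 6], [4, 2]],   # sample 0
    [[1, 6], [4, 6]],   # sample 1
    [[4, 0], [4, 6]],   # sample 2
])
print(np.sum(toy_results == toy_box_options, axis=0) / len(toy_results))
# [[0.66666667 0.66666667]
#  [1.         0.66666667]]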

There are three interesting regions of this data, identifiable even before plotting:

  1. The bottom left field of 1s, which comes from allowing many tries compared to the number of boxes in the room.
  2. The top right field of 0s, which comes from allowing few tries compared to the number of boxes in the room. They really shouldn’t be exactly zero, but winning is so rare there that we would need a lot more samples to see it.
  3. The middle “normal” numbers.

Let’s chart the data and see what this looks like in beautiful shades of purple:

ax = plt.axes()
plt.set_cmap('Purples')
contour = ax.contourf(*params, results_bin)
ax.set_xlabel('num boxes')
ax.set_ylabel('max tries allowed')
ax.set_title('probability of group win')
plt.colorbar(contour)
plt.show()

Note that this graph was generated with 1000 samples, and intervals of 1 for max_tries_options and num_box_options, which is different than the above code snippets. It took a while to generate the data.

On the x-axis we have the total number of both people and boxes and on the y-axis we have the maximum number of tries that each person is given to find their box. This confirms Minute Physics’ conclusion about the probability of everyone winning using the strategy. It also provides a handy way of testing your own strategy against the proposed one and seeing how often you lead your group to success! Feel free to send any interesting ones in.

If Chris, Yuki, and I have time, we’ll update this post with a more efficient simulation so it doesn’t take so dang long to generate the data. We also have another visualization lying around that contains the different probability distributions for all the configuration settings, but haven’t written about it… yet.

There’s some sample code in the repo — check it out and let us know what you think. We found that re-writing the simulation as a Python C-extension improved speeds 20x, so there’s also a small C++ program in there.

March 10, 2019

Ponylang (SeanTAllen)

Last Week in Pony - March 10, 2019 March 10, 2019 03:19 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

March 09, 2019

Gustaf Erikson (gerikson)

February March 09, 2019 08:30 PM

Andreas Zwinkau (qznc)

Deriving Story Points March 09, 2019 12:00 AM

Story points are a useful technique to improve prediction, but they have limits because they lack a statistical foundation.

Read full article!

March 08, 2019

Jan van den Berg (j11g)

Getting Things Done – David Allen March 08, 2019 06:38 PM

For some reason I had never read the David Allen classic Getting Things Done. But I found out that 18 years after its release it’s still a good introduction to time and action management.

Getting Things Done – David Allen (2001) – 220 pages

David Allen tries to make the natural, systematic. He does so by introducing a 5 step workflow: capture, clarify, organize, reflect, and engage. Allen does a great job of explaining these steps with real world examples and sprinkles his text with inspiring and relevant quotes. His system is very much based in the physical world — notes, folders, file cabinets etc. — which can feel a bit outdated, but does make sense (as he explains).

GTD in less than 200 words

GTD is a way of thinking about organizing, and it has elements you can also find in other organisation methods. But GTD really focuses on three main concepts.

1. Put everything on a list

Yes, everything. The idea is to clear your head, and use your brain to think about things, not to think of things.

2. Define the next ACTION

This is really the hardcore key concept of GTD. Define the next step. Think about the result and decide the next action. And it is very important that the next step is an action. If your car needs a check-up, your list entry is not “Car check-up”; your next action and list entry is “Call the garage to make an appointment”. But you may discover that you need the phone number first, so your next action becomes: look up the garage’s phone number. Get it?

3. Update actions

When you’ve written down everything you need or want to do in your system (1) and decided on the next action (2), the system will only work if you regularly revise it. You do so by updating or working on your actions.

Conclusion

I can see how the GTD method can work, when you stick to it. And even if I don’t think I will apply GTD fully, I certainly take away some key concepts. And I like how the system tries to empower our natural abilities, and to let your brain do what your brain is good at. That is: not keeping track of things, but creating new things.

My only concern is that the people who could really benefit from such a system are usually already in over their heads. So they would need a coach (or outside help) to successfully implement GTD.

I enjoyed reading GTD and would argue you should read it at least once. Just reading it already seems to activate a mental process of wanting to organise and declutter. How else can you explain that I just ordered a label printer and 60 feet of bookshelves?

The post Getting Things Done – David Allen appeared first on Jan van den Berg.

Nikola Plejić (nikola)

Developing Web Services in Rust: my talk at the Zagreb Rust Meetup March 08, 2019 04:30 PM

I've given a short overview of Rust's support for writing web services at the Zagreb Rust meetup. It was all in Croatian, but there's some code with the occasional comment in English which may or may not be useful.

There's also an org-mode file in the repository, containing the vast majority of the stuff I've covered in the talk. Incidentally, GitLab does a far better job at rendering org-mode files than GitHub does, so you can enjoy a beautiful render without notable loss of content.

March 06, 2019

Jan van den Berg (j11g)

Leonardo da Vinci – Walter Isaacson March 06, 2019 07:26 PM

My favorite biographer, Walter Isaacson, did it again. He created a gorgeously illustrated book about the quintessential renaissance man, Leonardo da Vinci. The book is based on the mind-blowing — in number and content — 7,200 pages of notes Leonardo left behind (which probably account for only a quarter; the rest is lost). As far as I am concerned this biography is the definitive introduction to this left-handed, mirror-writing, ever procrastinating, sculpting, painting, stargazing, riddle creating, bird watching, theatre producing, water engineering, corpse dissecting, observing and ever curious dandy polymath.

Leonardo da Vinci – Walter Isaacson (2017) – 601 pages

“Leonardo’s notebooks are nothing less than an astonishing windfall that provides the documentary record of applied creativity.”

Walter Isaacson

I don’t want to go into too much detail about Leonardo da Vinci; just read the book! But needless to say he was one of a kind, his mind worked differently from other people’s, and he made wide-ranging discoveries. I always thought he must have been a reclusive person: he was so far ahead of his time — sometimes centuries — that he surely couldn’t have enjoyed present company. But this couldn’t be further from the truth.

Leonardo was very much a people person. And this is one of the key arguments made by Isaacson about Leonardo’s art and skill. Not only was he a keenly curious (the most curious) observer and tinkerer, but he also sought cooperation to bounce ideas off. Isaacson makes a strong case that Leonardo became, and remained, a genius because of the combination of these things.

A different print than my copy, but still gorgeous. Also, my copy has an autograph 😉

As I’ve come to expect of biographies by Isaacson, his own personal passion and admiration for the subject shine through, which is why I always enjoy his writing. Of course, some things that happened 500 years ago are up for debate, but Isaacson demonstrates enough knowledge and backstory to his findings to come to mostly natural conclusions. This book does an especially good job of going through da Vinci’s life chronologically while still managing to show the cross-sections and connections between art and science (and everything else) throughout Leonardo’s life. And with Leonardo everything was interconnected and related, so this is quite an accomplishment!

All of Leonardo’s skills and knowledge, of course, came together in the painting he worked on for 16 years. The Mona Lisa. The book beautifully works towards that conclusion. And by reading this book you come away with a deeper understanding and appreciation of what exactly it is you’re looking at.

The post Leonardo da Vinci – Walter Isaacson appeared first on Jan van den Berg.

Derek Jones (derek-jones)

Regression line fitted to noisy data? Ask to see confidence intervals March 06, 2019 06:13 PM

A little knowledge can be a dangerous thing. For instance, knowing how to fit a regression line to a set of points, but not knowing how to figure out whether the fitted line makes any sense. Fitting a regression line is trivial with most modern data analysis packages; it’s difficult to find data that any of them fail to fit to a straight line (even randomly selected points usually contain enough bias in one direction to enable the fitting algorithm to converge).

Two techniques for checking the goodness-of-fit, of a regression line, are plotting confidence intervals and listing the p-value. The confidence interval approach is a great way to visualize the goodness-of-fit, with the added advantage of not needing any technical knowledge. The p-value approach is great for blinding people with science, and a necessary technicality when dealing with multidimensional data (unless you happen to have a Tardis).
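The post’s own analysis is done in R (via the code+data link below); purely as an illustration of the two checks being described, here is a hedged Python sketch using statsmodels on fabricated noisy data, not the Nationwide numbers.

import numpy as np
import statsmodels.api as sm

x = np.arange(20)                                              # stand-in for the 2-month intervals
y = 2.0 + 0.05 * x + np.random.normal(scale=2.0, size=x.size)  # deliberately noisy fake data

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

print(fit.pvalues[1])                        # p-value for the slope
pred = fit.get_prediction(X)
lower, upper = pred.conf_int(alpha=0.05).T   # the 95% confidence band around the fitted line

If the band is wide enough to contain both rising and falling lines, the fitted slope tells you very little, which is exactly the situation described below.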

In 2016, the Nationwide Mutual Insurance Company won the IEEE Computer Society/Software Engineering Institute Watts S. Humphrey Software Process Achievement (SPA) Award, and there is a technical report, which reads like an infomercial, on the benefits Nationwide achieved from using SEI’s software improvement process. Thanks to Edward Weller for the link.

Figure 6 of the infomercial technical report caught my eye. The fitted regression line shows delivered productivity going up over time, but the data looks very noisy. How good a fit is that regression line?

Thanks to WebPlotDigitizer, I quickly extracted the data (I’m a regular user, and WebPlotDigitizer just keeps getting better).

Below is the data plotted to look like Figure 6, with the fitted regression line in pink (code+data). The original did not include tick marks on the axis. For the x-axis I assumed each point was at a fixed 2-month interval (matching the axis labels), and for the y-axis I picked the point just below the zero to measure length (so my measurements may be off by a constant multiplier close to one; multiplying values by a constant will not have any influence on calculating goodness-of-fit).

Nationwide: delivery productivity over time; extracted data and fitted regression line.

The p-value for the fitted line is 0.15; gee-whiz, you say. Plotting with confidence intervals (in red; the usual 95%) makes the situation clear:

Nationwide: delivery productivity over time; extracted data and fitted regression line with 5% confidence intervals.

Ok, so the fitted model is fairly meaningless from a technical perspective; the line might actually go down, rather than up (there is too much noise in the data to tell). Think of the actual line likely appearing somewhere in the curved red tube.

Do Nationwide, IEEE or SEI care? The IEEE need a company to award the prize to, SEI want to promote their services, and Nationwide want to convince the rest of the world that their IT services are getting better.

Is there a company out there who feels hard done-by, because they did not receive the award? Perhaps there is, but are their numbers any better than Nationwide’s?

How much influence did the numbers in Figure 6 have on the award decision? Perhaps not a lot; the other plots look like they would tell a similar tale of wide confidence intervals on any fitted lines (readers might like to try their hand at drawing confidence intervals for Figure 9). Perhaps Nationwide was the only company considered.

Who are the losers here? Other companies who decide to spend lots of money adopting the SEI software process? If evidence was available, perhaps something concrete could be figured out.

March 05, 2019

Tobias Pfeiffer (PragTob)

Slides: Do You Need That Validation? Let Me Call You Back About It March 05, 2019 07:12 PM

I had a wonderful time at Ruby On Ice! I gave a talk that I loved preparing, as it helped me formulate the ideas the right way. You’ll see it focuses a lot on the problems; that’s intentional, because if we’re not clear on the problems, what good is a solution? Anyhow, here are the slides – […]

March 04, 2019

Derek Jones (derek-jones)

Polished human cognitive characteristics chapter March 04, 2019 01:32 AM

It has been just over two years since I released the first draft of the Human cognitive characteristics chapter of my evidence-based software engineering book. As new material was discovered, it got added where it seemed to belong (at the time); no effort was invested in maintaining any degree of coherence.

The plan was to find enough material to paint a coherent picture of the impact of human cognitive characteristics on software engineering. In practice, finishing the book in a reasonable time-frame requires that I stop looking for new material (assuming it exists) and go with what is currently available. There are a few datasets that have been promised, and having these would help fill some holes in the later sections.

The material has been reorganized into what is essentially a pass over what I think are the major issues, discussed via studies for which I have data (the rule of requiring data for a topic to be discussed, gets bent out of shape the most in this chapter), presented in almost a bullet point-like style. At least there are plenty of figures for people to look at, and they are in color.

I think the material will convince readers that human cognition is a crucial topic in software development; download the draft pdf.

Model building by cognitive psychologists is starting to become popular, with probabilistic languages, such as JAGS and Stan, becoming widely used. I was hoping to build models like this for software engineering tasks, but it would have taken too much time, and will have to wait until the book is done.

As always, if you know of any interesting software engineering data, please let me know.

Next, the cognitive capitalism chapter.

Pete Corey (petecorey)

Secure Meteor is Live March 04, 2019 12:00 AM

The big day is finally here. Secure Meteor is live and available for purchase!

Secure Meteor is the culmination of all of my work as a Meteor security professional. Between the years of 2014 and 2017 I completely immersed myself in the Meteor ecosystem and became increasingly focused on the unique security characteristics of Meteor applications. I wrote and spoke about Meteor security, built security-focused tools and packages for the Meteor ecosystem, and worked hands-on with talented teams to better secure their Meteor applications. Secure Meteor is the embodiment of everything I learned about Meteor security during that time.

It’s my goal that reading Secure Meteor will teach you the ins and outs of the various attack vectors present in your Meteor application, and will also teach you how to see your application through the eyes of a potential attacker.

Check out the Secure Meteor page for more details, sample chapters, and to snag your copy today!

On a personal note, it’s been over a year since I first announced and started working on Secure Meteor. There were many times over the past year when I never thought I’d finish. Writing a book has always been a personal goal of mine, and I couldn’t be more happy to have persevered and seen this project through to completion.

I deeply believe that Secure Meteor is a valuable addition to the Meteor community, and I’m happy to be giving back to a community that has given so much to me over the years.

Thanks for all of your support.

March 03, 2019

Gokberk Yaltirakli (gkbrk)

Writing a Simple IPFS Crawler March 03, 2019 04:38 PM

IPFS is a peer-to-peer protocol that allows you to access and publish content in a decentralized fashion. It uses hashes to refer to files. Short of someone posting hashes on a website, discoverability of content is pretty low. In this article, we’re going to write a very simple crawler for IPFS.

It’s challenging to build a traditional search engine for IPFS because content rarely links to other content. But there is another way to discover content than blindly following links like a traditional crawler.

Enter DHT

In IPFS, the content for a given hash is found using a Distributed Hash Table. This means our IPFS daemon receives requests about the location of IPFS objects. When all the peers do this, a key-value store is distributed among them; hence the name Distributed Hash Table. Even though we won’t get all the queries, we will still get a fraction of them. We can use these to discover when people put files on IPFS and announce them on the DHT.

Fortunately, IPFS lets us see those DHT queries from the log API. For our crawler, we will use the Rust programming language and the ipfsapi crate for communicating with IPFS. You can add ipfsapi = "0.2" to your Cargo.toml file to get the dependency.

Using IPFS from Rust

Let’s test if our IPFS daemon and the IPFS crate are working by trying to fetch and print a file.

let api = IpfsApi::new("127.0.0.1", 5001);

let bytes = api.cat("QmWATWQ7fVPP2EFGu71UkfnqhYXDYH566qy47CnJDgvs8u")?;
let data = String::from_utf8(bytes.collect())?;

println!("{}", data);

This code should grab the contents of the hash, and if everything is working print “Hello World”.

Getting the logs

Now that we can download files from IPFS, it’s time to get all the logged events from the daemon. To do this, we can use the log_tail method to get an iterator of all the events. Let’s get everything we get from the logs and print it to the console.

for line in api.log_tail()? {
    println!("{}", line);
}

This gets us all the logs, but we are only interested in DHT events, so let’s filter a little. A DHT announcement looks like this in the JSON logs.

{
  "duration": 235926,
  "event": "handleAddProvider",
  "key": "QmWATWQ7fVPP2EFGu71UkfnqhYXDYH566qy47CnJDgvs8u",
  "peer": "QmeqzaUKvym9p8nGXYipk6JpafqqQAnw1ZQ4xBoXWcCrLb",
  "session": "ffffffff-ffff-ffff-ffff-ffffffffffff",
  "system": "dht",
  "time": "2018-03-12T00:32:51.007121297Z"
}

We are interested in all the log entries with the event handleAddProvider, and the hash of the announced IPFS object is in the key field. We can filter the iterator like this.

let logs = api.log_tail()
        .unwrap()
        .filter(|x| x["event"].as_str() == Some("handleAddProvider"))
        .filter(|x| x["key"].is_string());

for log in logs {
    let hash = log["key"].as_str().unwrap().to_string();
    println!("{}", hash);
}

Grabbing the valid images

As a final step, we’re going to save all the valid image files that we come across, using the image crate. Basically, for each object we find, we’re going to try parsing it as an image file. If that succeeds, we likely have a valid image that we can save.

Let’s write a function that loads an image from IPFS, parses it with the image crate and saves it to the images/ folder.

fn check_image(hash: &str) -> Result<(), Error> {
    let api = IpfsApi::new("127.0.0.1", 5001);

    let data: Vec<u8> = api.cat(hash)?.collect();
    let img = image::load_from_memory(data.as_slice())?;

    println!("[!!!] Found image on hash {}", hash);

    let path = format!("images/{}.jpg", hash);
    let mut file = File::create(path)?;
    img.save(&mut file, image::JPEG)?;

    Ok(())
}

And then we connect it to our main loop. We check each image in a separate thread because IPFS can take a long time to resolve a hash or time out.

for log in logs {
    let hash = log["key"].as_str().unwrap().to_string();
    println!("{}", hash);

    thread::spawn(move|| check_image(&hash));
}

Possible improvements / future work

  • File size limits: Checking the size of objects before downloading them
  • More file types: Saving more file types. Determining the types using a utility like file.
  • Parsing HTML: When the object is valid HTML, parse it and index the text in order to provide search

Ponylang (SeanTAllen)

Last Week in Pony - March 3, 2019 March 03, 2019 03:58 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

Gustaf Erikson (gerikson)

January March 03, 2019 01:22 PM

The only image worth posting this month is in this post.

Jan 2018 | Jan 2017 | Jan 2016 | Jan 2015 | Jan 2014 | Jan 2013 | Jan 2012 | Jan 2011 | Jan 2010 | Jan 2009

November March 03, 2019 10:59 AM

March 02, 2019

Grzegorz Antoniak (dark_grimoire)

Is it worth using make? March 02, 2019 06:00 AM

You may think you've created a nice and tidy Makefile, which tracks dependencies and works across many different operating systems, but ask yourself these questions (and answer honestly):

Which compiler does your Makefile support?

Is it GCC? Or Clang? Some of their options are pretty similar, so it may not …

March 01, 2019

Ponylang (SeanTAllen)

0.27.0 Released March 01, 2019 05:00 AM

Pony 0.27.0 is a big release for us. LLVM 7, 6, and 5 are all now fully supported by Pony. This is the last release that LLVM 3.9.1 will be supported for.

Additionally, there are a number of important fixes and a couple of breaking changes in the release. Keep reading for more information.

February 28, 2019

Patrick Louis (venam)

Time On The Internet February 28, 2019 10:00 PM

Trapped Within

Time can be measured in all sorts of ways, some more accurate than others, but the perception of its flow varies widely depending on subjective experience. That’s the distinction between physical and psychological time.
Psychological time influences, and is influenced by, our cognitive systems. It shapes how we act and respond to the information and events around us, and the information and events around us shape it in turn.

Boredom is one fascinating stimulus, or rather lack thereof. According to research it makes us feel as if time is passing slowly: minutes can feel like hours, and things that happened yesterday feel like they happened a week ago. Boredom digs us deep into the ground, stifling our performance; it is associated with depression, lack of satisfaction, lack of alertness, and other nasty effects. A new study has found that when experiencing boredom we are prone to be more altruistic, empathic, and perhaps more creative. The reason could be that boredom incites daydreaming, or incites the hope of escaping from this mental state.
Routine resembles boredom but has a time-lapse effect, weeks go by so quickly we may not notice them. Unsurprisingly, emotions are also related to how time feels. For instance, when facing a threat, or in a state of fear, time slows down.

Tying these together, a theory proposes that time perception is tightly linked to how we pay attention. If we had money for the hours in a day, let’s say $24, then we would spend it with our attention. The parts of the day we allocate extra money to are the ones that feel slower.
Another way to think about it is to imagine it as a beat. The faster the beat, the slower it feels.

beat

Thus a lack of stimuli leads to time warps, while boredom accompanied by inaction, for instance, makes us feel trapped in a time cage.

Time can pass quickly on the internet, news get old fast, and boredom or routine follows. What are the emotional states affecting this awkward time dimension?

It begins with the war for attention and the information overload we are presented with every day.

“There are things you don’t know about that you absolutely need to know about, and please remember this other thing too at the same time”, is what is being shouted at us. It’s easy to fall prey and cave in to the pressure to follow it all, the new internet version of keeping up with the Kardashians.
Which news is worthy and which is not? What are the things that captivate us and make us spend our precious attention coins, consequently dilating our time on them? There are many actors with different incentives who would benefit greatly from our attention. They may bore us in a time capsule, making us feel scammed and robbed; they may pull us into a routine, or play with our emotions.

The upfront players in the online space are advertisers; they keep the machine running. Marketers know their craft of branding and exposure, spamming us with their names. Their goal is to remind us that they exist, and this is achieved through repetition.
Unfortunately, everything else has started to interact with us in the same fashion. In marketing, popularity is key, while with the rest it’s supposed to be about value. Nonetheless trendism is in vogue: “the belief that an already-trending topic deserves to be promoted”. Repetition is even praised as a strategy to brand oneself over and over again, because the short shelf life of media means followers need repeated exposure to form an impression. Everything is a product.
Keep in mind that information is defined as anything that is surprising and new, something of value. Things that tell us something we already know are worthless. Yet repetition has become the norm: so much brain grinding and mashing of the same topics over and over again - an eternity of time spent on the same topics.
It’s even more upsetting that less than 60 percent of web traffic is human; the rest is fake. This means a lot of the refurbished grind we get is based on metrics that don’t exist, using algorithms that know us only on a superficial level. An algorithm can hardly learn novelty, at least not yet.
Hence the unoriginality and predictability that are spreading fast are nothing extraordinary.

Not limited to algorithms, that predictability permeates everything.
The web has dwindled in its attempts at being daring, original, or standing out. People love novelty and seek it, yet it seems to fade away remarkably fast. As soon as something new comes out it’s instantly part of the copy-paste culture, part of the memetic. Additionally, we’ve all become trained critics looking for the finest in every little thing. It’s risky to bet on being audacious and easier to bet on the tried-and-tested. We’ve turned into the Edmund Burke of the Burke vs Paine debate, or Hem in the “Who Moved My Cheese” book.

This clearly explains why nostalgia is such a business. It’s better to use the “good-ol’ days” with rosy retrospection as a backup rather than entering the unknown. The internet is too serious to allow things to go haywire - Stay in the tunnel and don’t divert.

We now talk about digital presence and digital identity and it matters. It blurs the lines between professional, personal, and online lives, mixing all of it in a bowl that is neither neat nor tidy. The internet is important business!
In this era of hyper-usefulness and over-rationalization everything has to serve an obligation, has to be straightforward, has to intermix work and social appearances. You have to watch your step, as there may be unforeseen consequences that will be judged. Nothing disappears in the online world. In consequence we suppress our speech, we’re held in the cage of time even more, and we can’t escape boredom because it’s too risky and too real.

This opens the door for tribality, propaganda, and political affiliations. A great amount of discussions are diverted into political messages using the time jail to their advantage. Black PR, fake news, state trolls, all use repetition and emotional manipulation to spread ideas as viruses on the internet. The internet is a commodity.

We are watched, remembered, and measured, deliberately and not so deliberately. Privacy is a beloved issue we like to see brought up in the news. We’re on edge; we can’t make mistakes. “L’enfer, c’est les autres” (hell is other people).

“We have arrived at a version where everything seems to be just another version of LinkedIn. Every online space is supposed to get you a job or a partner or a stronger personal brand so you can accomplish the big, public-record goals of life. The public marketplace is everywhere. It’s an interactive and immersive CV, an archive. It all counts, and it all matters.” (The decline of Snapchat and the secret joy of internet ghost towns)

LinkedIn displays the epitome of this with its SSI, the Social Selling Index, a metric to calculate how successful you are. This can’t be more explicit. “By checking out your SSI, you’ll see how you stack up against your industry peers and your network on LinkedIn.”
“There are things you don’t know about that you absolutely need to know about, and please remember this other thing too at the same time”, “furthermore, you are behind and need to be productive”.

As with real life, comparing yourself with others drags along the phantom of unhappiness. It sings along to the tune of the thriving productivity craze that has emerged: histrionic productivity. Seemingly something wanted or needed in the age of multitasking and constant interruptions.
Time runs fast when you see that everyone is already leading the race and you’re far behind. Days are not enough to complete the never-ending TODO lists. We’re getting old so fast, how could we possibly keep up before being discarded as an unwanted product! Guilty of not displaying our labor; a worthlessness hole. How to be worthy in a place where unworthiness is the norm.
Boredom may lead to creativity…

Then marketers know exactly our wish to escape and offer us a pre-packaged solution to our external validation needs. Using our time, we buy molds and frameworks advertised as metamorphosis.

How to genuinely break out of the cocoon.
How to not go the effortless apathetic advertised way.
How to take back control of time.

To the statistic that 40% of online traffic is fake, we have to add that actual contribution and participation comes from a mere 1 to 3% of users. The rest are lurkers.
As we’ve said, boredom accompanied by inaction makes us feel trapped in a time cage.

So what should we do to govern our own content instead of passively consuming? Unfortunately, we start with a bad hand. It’s frightening to be a voice on the fringe because the house plays against you.
Maybe if we moved to a different host, or became hosts ourselves, the rules would change. What we do to maintain our homes is certainly worthy of our time. When you’re building your living space you shouldn’t worry about what happens in the neighbours’ houses.
Indeed, there are many barriers stopping us, making us think we’re not good enough, making us believe we aren’t original or interesting. Hence we keep quiet so as not to bother the owners and roommates.

We then build a preference for ephemeral media, places where our voices dissipate quickly, where we’re anonymous but can nevertheless talk without holding ourselves back.
Still we are in the dungeon where no one can see, living an illusion in a parallel time that doesn’t affect the current one.

You have to take the effort and step out.
Break the algorithms, use them to your advantage.

Then space-time curves on itself.

time curvature

Time is relative, on the internet.
A watched pot never boils!

And this concludes the small amount of time I’ll get from you on the internet.


Because even this post is a repetition:

  • TIME PERCEPTION

https://en.wikipedia.org/wiki/Time_perception
https://www.youtube.com/watch?v=VGe1M_z91iA
https://www.theguardian.com/science/2013/jan/01/psychology-time-perception-awareness-research
https://www.exactlywhatistime.com/psychology-of-time/time-perception/
http://cogprints.org/3125/1/Subjective_Perception_of_Time_and_a_Progressive_Present_Moment_-_The_Neurobiological_Key_to_Unlocking_Consciousness.pdf
https://fermatslibrary.com/s/does-being-bored-make-us-more-creative
https://www.researchgate.net/publication/21193823_Effect_of_boredom_proneness_on_time_perception
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4140165/
https://www.spring.org.uk/2011/06/10-ways-our-minds-warp-time.php

  • ATTENTION PLEASE

https://www.nytimes.com/2015/08/03/your-money/what-is-our-attention-really-worth.html
https://warontherocks.com/2018/09/social-media-as-war/
https://medium.com/@enkiv2/against-trendism-how-to-defang-the-social-media-disinformation-complex-81a8e2635956
https://nrempel.com/posts/what-we-have-now-is-not-advertising/
https://lobste.rs/s/mwlrof/what_we_have_now_is_not_advertising
https://letsknowthings.com/episode144/

  • GRIND IT

https://dictionary.reverso.net/english-cobuild/grinding%20media
https://blog.hootsuite.com/social-media-groundhog-day-why-you-should-embrace-repetition/
https://www.insidehighered.com/blogs/student-affairs-and-technology/7-years-social-media-repetition-time-be-bold-again
https://nymag.com/intelligencer/2018/12/how-much-of-the-internet-is-fake.html
https://www.forbes.com/sites/laurenfriedman/2016/08/02/why-nostalgia-marketing-works-so-well-with-millennials-and-how-your-brand-can-benefit/
http://www.adotas.com/2019/02/6-ways-use-nostalgia-marketing/
https://www.enginecommerce.com/nostalgia-marketing-lion-king-2019/

  • SERIOUS

https://kevq.uk/privacy-vs-i-have-nothing-to-hide/
https://lobste.rs/s/mzamzy/privacy_vs_i_have_nothing_hide
http://www.iftf.org/statesponsoredtrolling
http://www.iftf.org/fileadmin/user_upload/images/DigIntel/IFTF_State_sponsored_trolling_report.pdf
http://www.iftf.org/fileadmin/user_upload/images/ourwork/digintel/IFTF_biology_of_disinformation_062718.pdf
https://www.ntd.tv/2018/02/21/memetic-warfare-spreading-weaponized-ideas-for-influence-and-control/
https://www.ifla.org/publications/node/11174
https://nixers.net/showthread.php?tid=2218
https://business.linkedin.com/sales-solutions/blog/g/get-your-score-linkedin-makes-the-social-selling-index-available-for-everyone
https://medium.com/the-graph/reflections-of-unoriginality-repetition-of-social-media-3cb330eca80f
https://www.theverge.com/2018/5/18/17366528/snapchat-decline-internet-ghost-towns

  • PRODUCTION

http://ewanvalentine.io/how-to-never-complete-anything/
https://medium.com/@_cbrown/my-response-to-ewan-valentine-aa7042f787f2
https://thezvi.wordpress.com/2017/04/09/youre-good-enough-youre-smart-enough-and-people-would-like-you/
https://www.reddit.com/r/slatestarcodex/comments/9rvroo/most_of_what_you_read_on_the_internet_is_written/

Derek Jones (derek-jones)

Modular vs. monolithic programs: a big performance difference February 28, 2019 03:15 PM

For a long time now I have been telling people that no experiment has found a situation where the treatment (e.g., use of a technique or tool) produces a performance difference that is larger than the performance difference between the subjects.

The usual results are that differences between people are the source of the largest performance difference, successive runs are the next largest (i.e., people get better with practice), and the smallest performance difference occurs between using/not using the technique or tool.

This is rather disheartening news.

While rummaging through a pile of books I had not looked at in many years, I (re)discovered the paper “An empirical study of the effects of modularity on program modifiability” by Korson and Vaishnavi, in “Empirical Studies of Programmers” (the first one in the series). It’s based on Korson’s 1988 PhD thesis, with the same title.

There were four experiments, involving seven people from industry and nine students, each requiring a 900(ish)-line program to be modified in some way. There were two versions of each program; they differed in that one was written in a modular form, while the other was monolithic. Subjects were permuted between various combinations of program version/problem, but all problems were solved in the same order.

The performance data (time to complete the task) was published in the paper, so I fitted various regression models to it (code+data). There is enough information in the data to separate out the effects of modular/monolithic, kind of problem, and subject differences. Because all subjects solved problems in the same order, it is not possible to extract the impact of learning on performance.
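The author's models are in R (the linked code+data); purely as an illustration of the kind of model being described, here is a hedged Python sketch that treats program version, problem, and subject as categorical factors in a regression on completion time. The file and column names are hypothetical.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical layout: one row per (subject, problem) with columns
# time (minutes), version ('modular'/'monolithic'), problem, subject.
df = pd.read_csv("korson_times.csv")

fit = smf.ols("time ~ C(version) + C(problem) + C(subject)", data=df).fit()
print(fit.summary())  # the C(version) coefficient estimates the modular/monolithic effect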

The modular/monolithic performance difference was around twice as large as the difference between subjects (removing two very poorly performing subjects reduces the difference to 1.5). I’m going to have to change my slides.

Would the performance difference have been so large if all the subjects had been experienced developers? There is not a lot of well written modular code out there, and so experienced developers get lots of practice with spaghetti code. But, even if the performance difference is of the same order as the difference between developers, that is still a very worthwhile difference.

Now there are lots of ways to write a program in modular form, and we don’t know what kind of job Korson did in creating, or locating, his modular programs.

There are also lots of ways of writing a monolithic program, some of them might be easy to modify, others a tangled mess. Were these programs intentionally written as spaghetti code, or was some effort put into making them easy to modify?

The good news from the Korson study is that there appears to be a technique that delivers larger performance improvements than the difference between people (replication needed). We can quibble over how modular a modular program needs to be, and how spaghetti-like a monolithic program has to be.

February 25, 2019

Pete Corey (petecorey)

Secure Meteor Releasing Next Week! February 25, 2019 12:00 AM

You may have noticed that I haven’t been doing much publicly in 2019. I haven’t released any new articles, I haven’t sent out any newsletters, and I’ve been relatively quiet on Twitter.

Underneath the surface of this calm lake of inactivity, my little duck feet have been churning. I’ve been pursuing one of my main goals for the new year and working diligently to finish my first book, Secure Meteor. I’m excited to announce that I’ll be releasing Secure Meteor early next week!

Between the years of 2014 and 2017, I lived and breathed Meteor security. I spent those years writing and speaking about Meteor security, developing and deploying secure Meteor applications, working with amazing teams to better secure their Meteor applications, and building security-focused packages and tools for the Meteor ecosystem.

While Meteor doesn’t play as central of a role in my day-to-day development work today, it would be a shame to throw away all of the knowledge and expertise I built up around the ins-and-outs of securing Meteor applications.

Secure Meteor is an effort to capture and distill everything I’ve learned about Meteor security from my years of real-world Meteor security experience.

I’m happy to announce that over a year after originally announcing Secure Meteor, it’s ready to be released! The final product is one hundred ten pages of what I consider to be vitally important information on securing your Meteor application. If you are actively developing Meteor applications, or own a Meteor application living in production, Secure Meteor can help bring you understanding and peace of mind in the unforgiving world of software security.

Be sure to check out a few of the sample chapters to whet your whistle for next week’s release.

If you’re interested in the book, sign up for the Secure Meteor newsletter. Subscribers will be the first to know when Secure Meteor launches, and I might even offer them an initial discount for all of their support.

I couldn’t be more excited for next week. See you then!

February 24, 2019

Derek Jones (derek-jones)

Evidence-based election campaigning February 24, 2019 09:30 PM

I was at a hackathon on evidence-based election campaigning yesterday, organized by Campaign Lab.

My previous experience with political oriented hackathons was a Lib Dem hackathon; the event was only advertised to party members and I got to attend because a fellow hackathon-goer (who is a member) invited me along. We spent the afternoon trying to figure out how to extract information on who turned up to vote, from photocopies of lists of people eligible to vote marked up by the people who hand out ballot papers.

I have also been to a few hackathons where the task was to gather and analyze information about forthcoming, or recent, elections. There did not seem to be a lot of information publicly available, and I had assumed that the organization, and spending power, of the UK’s two main parties (i.e., Conservative and Labour) meant that they did have data.

No, the main UK political parties don’t have lots of data, in fact they don’t have very much at all, and make hardly any use of what they do have.

I had a really interesting chat with Campaign Lab’s Morgan McSweeney about political campaigning, and how it has not been evidence-based. There were lots of similarities with evidence-based software engineering, e.g., a few events (such as the Nixon vs. Kennedy and Bill Clinton elections) created campaigning templates that everybody else now follows. James Moulding drew diagrams showing Labour organization and Conservative non-organization (which looked like a Dalek) and Hannah O’Rourke spoiled us with various kinds of biscuits.

An essential component of evidence-based campaigning is detailed knowledge of the outcome, such as: how many votes did each candidate get? Based on past hackathon experience, I thought this data was only available for recent elections, but Morgan showed me that Wikipedia had constituency level results going back many years. Here was a hackathon task; collect together constituency level results, going back decades, in one file.

Following the Wikipedia citations led me to Richard Kimber’s website, which had detailed results at the constituency level going back to 1945. The catch was that there was a separate file for each constituency; I emailed Richard, asking for a file containing everything (Richard promptly replied, the only files were the ones on the website).

Pivot.

The following plot was created using some of the data made available during a hackathon at the Office of National Statistics (sometime in 2015). We (Pavel+others and me) did not make much use of this plot at the time, but it always struck me as interesting. I showed it to the people at this hackathon, who sounded interested. The plot shows the life-expectancy for people living in a constituency where Conservative(blue)/Labour(red) candidate won the 2015 general election by a given percentage margin, over the second-placed candidate.

Life-expectancy for people living in a constituency where Conservative/Labour won by a given percentage margin.

Rather than scrape the election data (added to my TODO list), I decided to recreate the plot and tidy up the associated analysis code; it’s now available via the CampaignLab GitHub repo.

My interpretation of the difference in life-expectancy is that the Labour strongholds are in regions where there is (or once was) lots of heavy industry and mining; the kind of jobs where people don’t live long after retiring.

Artemis (Artemix)

2019 and this blog February 24, 2019 06:12 PM

As you may have noticed, articles have been quite slow to come. The reason is mainly that I'm a slow writer and don't have a lot of topics to write about.

Another problem is school, as there's definitely more work than the previous year, giving me less time to maintain this blog, together with some other projects.

# Website changes and this blog

A project I'm currently working towards is to rework most of my websites.

I have a lot of very small static websites for lots of different things, and it doesn't make a lot of sense in my opinion to spread them so much, so I'll work on joining together a lot of them.

Part of this work has already been done, where several small documentation websites were finally brought together at docs.artemix.org.

I have a few websites left:

In the Portfolios section, there's some heavy duplication, so I intend to remove all of them, and merge them into a single simple page, which will be simpler to find (and link back to me), and much cleaner to maintain.

I also intend to merge my devlog inside this blog, and to, once again (I know), re-work this blog.

This includes a redesign, a change in the categorization system, and the removal of the comment space (since the service I used is pretty much closing its doors).

# A new hope

The first goal is to merge my Devlog into my blog, which should be followed by the integration of my folio (a.k.a. CV) into the blog website, then followed by the integration of a small "About" page.

To organise all written articles, there'll be some categorization.

The design'll probably change to become simpler; I'll probably use a third-party lightweight CSS library, like skeleton or milligram for the base CSS.

I'll probably redevelop the static site generator, not in nodejs but in go, to be simpler and easier to maintain, because it's a real hell as of now, especially around file manipulation.

# Articles access

The direct articles' links won't change, but the RSS feeds will be differently structured: the global one will continue to work, but a new feed by category will be created.

# Overall

Overall, a lot of planned change, which will take some time, especially around the static site generation (with a much more complex layout), but I personally think it's really worth the change and amount of work.

For now, I'll work on that, so I'll see you later!

Ponylang (SeanTAllen)

Last Week in Pony - February 24, 2019 February 24, 2019 03:33 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

February 23, 2019

Jeff Carpenter (jeffcarp)

Book Review: The Shame of the Nation February 23, 2019 02:42 PM

The Shame of the Nation: The Restoration of Apartheid Schooling in America. Author: Jonathan Kozol. Published: 2005. Rating: ⭐⭐⭐⭐⭐. This is an upsetting book. It describes the dream of integrated schooling enabled by Brown v. Board of Education in 1954 and how, through racist policy making at the federal, state, and local levels, this dream has been slowly dismantled, resulting in an American school system that is as segregated today as it was during the civil rights movement.

February 22, 2019

David Wilson (dw)

Reasons Mitogen sucks February 22, 2019 07:30 PM

I have a particular dislike for nonspecific negativity, where nothing can be done to address its source because the reasons underlying it are never explicitly described. In the context of Mitogen, there has been a consistent stream of this sort originating from an important camp in public spaces, and despite efforts to bring specifics out into the open, still it continues to persist.

For that reason I'd like to try a new strategy: justify the negativity and give it a face by providing all the fuel it needs to burn. Therefore in this post, in the interests of encouraging honesty, I will critique my own work.

Mitogen is threaded

Mitogen is a prehistoric design, with the earliest code dating all the way back to 2007, a time when threading seemed cool and the 'obvious' solution to the problem the library intended to solve. Through mountains of words I have since then justified the use of threading, as if the use of threads were a mandatory feature. In some scenarios this is partially true, but just as often entirely incorrect.

This code does everything your mother told you never to do when using threads, and suffers endless asynchrony issues as a result. Historically it failed to lock mutable data structures, and some remnants of that past remain to the present day.

If you're going to throw tomatoes, threading is a great place to start. If there is any legitimate reason to have a problem with integrating this library, look no further.

The documentation sucks

If this guy gets hit by a bus, what happens to the library? Valid concern! Documentation is currently inconsistent at best, out of date and nonexistent at worst. I care a lot about good documentation, but not quite as much as keeping users happy. Therefore as a result, a description of the internals detailed enough to allow someone to take over the work, or even fix some kinds of bugs, does not as yet exist.

It's not that Mitogen is exactly rocket science to maintain, but through poor, less-than-concise documentation, coupled with a mysterious name, a simple solution is cloaked in a mysterious aura of inscrutability. That incites fear in many who attempt to grasp it.

Mitogen isn't tested properly

Due to the endless asynchrony issues created by threading and running across multiple processes, Mitogen is a nightmare to test. In the common case where some important test passes, there is little to no guarantee that test will continue to pass, say, if the moon turned blue or a flock of migratory pink flamingos happened to pass overhead.

Addressing testing in Mitogen is more than a full-time job, it's an impossible nightmare literally exceeding the boundaries of computer science. Due to the library's design, and the underlying use of threading, a combinatorial explosion of possible program states are produced for which no good testing strategy can or ever will exist.

Many of these aspects can be tested only partially, by throwing the library at the wall in a loop and ensuring it sticks, and due to the problem it tries to solve, this will likely remain true in perpetuity. The sound bite you're looking for to justify your loathing is "Mitogen is spaghetti code that's impossible to test".

Mitogen does things it shouldn't do.

If you look closely enough you'll find some modules importing ctypes and doing all kinds of crazy things to work around deficiencies in Python's interfaces. Who the hell writes cowboy code like this? What happens if I run this on some ancient 286 running Xenix? While such atrocities are carefully sandboxed, the reality is that testing in these scenarios has never been done. The computer might catch fire, or the moon might coming crashing into the earth.

Mitogen is riddled with internationalization issues

As a side effect of being written by a monolingual developer, Mitogen sometimes doesn't have a clue about sudo or su when they say things like المصادقة غير صحيحة ("authentication is incorrect"), so you can never really truly 100% rely on it to function when charging handsomely to peddle someone else's work at customer sites, some of which might be large international concerns, say, in the oil or finance sector.

Because of the hacky methods used to integrate external tools, there may always be some operating system, tool and language combination for which the library will fail to notice something is wrong. In the worst case it might hang, or confusingly claim a connection timeout occurred when really it just didn't understand the tool.

Mitogen uses pickle on the network

The crown jewel: it is absolutely true that Mitogen will pass untrusted pickles through an unpickler, with mitigations in place to prevent arbitrary code execution. But is it enough? Nobody knows for sure! It's impossible to combat this variety of FUD, but just in case, a replacement serialization has existed in a local branch for over 6 months, waiting for the day some material vulnerability is finally revealed.

Be certain to mention pickle insecurity at every opportunity when generating negativity about the library; it's almost impossible to fight against.

The Ansible extension monkey-patches half the application

Half may be a slight exaggeration, but indeed it's true, the Ansible extension is littered with monkey-patches. It doesn't matter that these are tidy and conservative patches that uninstall themselves when not in use or when incompatible situations are detected: monkey patching! is! wrong! and must! be prevented!

Who could trust any code that subclasses types dynamically, or assigns a method to a class instance? No code should ever do that, nor, heaven forbid, should any test suite. Get this printed on a t-shirt so the message is abundantly clear.

The Ansible extension reimplements the whole module executor

It's true, it doesn't even cut-and-paste the existing code, it just blanket reimplements it! All spread out in a rat's nest of 10-line methods across a giant inscrutable class hierarchy. The nerve!

Despite replacing a giant chunk of Ansible's lower tiers, the plain truth is that this code is a weird derivative of existing Ansible functionality bolted on the side. Who in their right mind wants to run a setup like that?

The Ansible extension prefers performance over safety

Here's a great one: to avoid super important network round-trips when copying small files, rather than raise an error when the file copy fails, it instead raises the error on the follow-up module execution. How confusing! How can this mess possibly be debugged?

The Ansible extension makes it easy for buggy modules to break things

The dirty secret behind some of that performance? We're reusing state, the horror! And because we're reusing state, if some of that state is crapped over by a broken module, all hell could break loose. The sky might fall, hard disks could be wiped, Fortune 500s might cease to exist!

In the absence of collaboration with the authors of such evil code, Mitogen is caught between a rock and a hard place: maintaining a blacklist of known-broken modules and forking a threaded process (fork! with threads! the horror!) to continue the illusion of good performance in spite of absolutely no help from those who could provide it.

Forking support is entirely disabled when targeting Python 2.4, and there is a high chance Mitogen will remove forking support entirely in the future, but in the meantime, use no less than 72-point bold font when discussing one of the library's greatest sins of all.

The Ansible extension rewrites the user's playbook

It's true! Why does a strategy module get to pick which connection plugins are used? It's as if the author believed in eliminating the need for unproductive busywork at all costs for the users of his project, and dared put user before code with a monkey-patch to ensure manual edits across a huge Git repo full of playbooks weren't necessary to allow the code to be tested, you know, by busy people who were already so stressed out about not having enough time to complete the jobs they've been assigned, due to a fundamentally slow and inefficient tool, that they'd risk experimenting with such a low quality project to begin with. The horror!

Summary

Mitogen has in excess of 4,000 interactive downloads (12,000 including CI) over the past year; of course not all of them are users, but I have long since lost count of the number of people who swear by it and rely on it daily. In contrast to these numbers, GitHub currently sports a grand total of 23 open user-reported bugs.

If you need more ammunition in your vague and futile battle with the future, please file a bug and I will be only too happy to supply it. But please, please, please avoid suggesting things "don't work" or are broken somehow without actually specifying how, especially in the context of a tool that continues to waste thousands of man-hours every year. It is deeply insulting and a damning indictment of your ability to communicate and openly collaborate in what is otherwise allegedly a free software project.

Hopefully this post will arm such people with legitimate complaints for the foreseeable future. Meanwhile, I will work to improve the library, including as a result of bug reports filed on GitHub.

Noon van der Silk (silky)

2018s Crazy Ideas February 22, 2019 12:00 AM

Posted on February 22, 2019 by Noon van der Silk

I can’t believe I forgot to do this at the start of the year!

Let’s look back over 2018’s ideas:

The Ideas

  1. using ai to reduce medical dosages: i.e. imagine you need such-and-such amount of radiation in order for certain whatever to be seen in some scan can you lower the dosage and then use some ai technique to enhance the quality of the image?
  2. something about shape-based reasoning: i.e. the fact that some grammar-parsing problem seems like a good fit for tensor networks works because they both have the “shape” of a tree another example would be thinking of solving a problem of identifying different types of plants as it fits in the “shape” of a classifier but these things could potentially have other shapes? and therefore fit into other kinds of problems?
  3. a q&a bot that can answer tests on the citizenship test: Q: What is our home girt by? A: ..
  4. buddhist neural network: it features significant self-attention and self-awareness it performs classification, but predicts the same class for every input as it has a nondual mind its loss function features a single term for each of the 6 paramitas: - Generosity: to cultivate the attitude of generosity. - Discipline: refraining from harm. - Patience: the ability not to be perturbed by anything. - Diligence: to find joy in what is virtuous, positive or wholesome. - Meditative concentration: not to be distracted. - Wisdom: the perfect discrimination of phenomena, all knowable things.
  5. music-to-image: say you’re listening to greek music you want to generate an image of a singer singing on some clifftop on greece under an olive tree near a vineyard surely a simple matter of deep learning
  6. code story by word frequency: take all the words in a code repo, order them by frequency, then match that up to some standard book, then remap the code according to the frequency
  7. generalisation of deutsch-jozsa’s problem: here - https://en.wikipedia.org/wiki/Deutsch%E2%80%93Jozsa_algorithm generalise it so that we have multiple f’s that have different promises; i.e. i’m constant “50% of the time”. what now?
  8. analogy-lang: reading https://maetl.net/membrane-oriented-programming by @maetl i had an idea for a programming language consider two types, “person” and “frog”, let’s say there are at least two problems; one is the one in the article - it’s hard to come up with a complete list of methods/properties that should exactly define one of these things. in surfaces and essences they argue that in reality no categorisation has perfectly defined boundaries; so how to define the types? another problem is what happens if i want to build a thing that is both “person-like” and “frog-like”? you can’t. especially not if the differences are far away (i.e. is frog hopping like “walking”? should it be the implementation of a “walk” method? probably not; but a “move” method? probably? but what about less-clear things? and doesn’t it depend on what you’re merging with) in this way it seems like it’s impossible to come up with strict types for anything so here’s an idea: analogy-lang: lets you define types in a much more relaxed way; “this thing is like that thing in these ways, but different in these ways”
  9. multi-layered network: train some network f to produce y_1 from x_1 f(x_1) = y_1 then, wrap that in a new network, g, that produces y_1 and y_2 from x_1 and x_2 g(x_1, x_2) = ( f(x_1,), g’(x_1, x_2) ) and so on. interesting
  10. psychedelics for ai: ideas: - locate some kind of “default-mode network” in your model and inhibit it - after training; allow many more connections - have two modes of network; one this “high-entropy” learning one, which prunes down to a more efficient one that can’t learn but can decide quickly
  11. ultimate notification page: it’s just a page with a bunch of iframes to all your different websites where you get the little notification things and then it just tiles them; showing in-which places you have notifications.
  12. use deep learning to replace a movie character with yourself and your own acting: 1. semantic segmentation to remove an actor 2. film yourself saying their lines 3. plug yourself back into the missing spot 4. dress yourself in the appropriate clothes 5. adjust the lighting 6. ??? 7. profit!
  13. paramterised papers: “specifically, this model first generates an aligned position p_t for each target word at time t” show me this sentence with t = whatever and p = some dimension
  14. on a slide-deck, have a little small game that plays out in a tiny section of each slide
  15. use autocomplete and random phrases to guess things about people: “i love you …” find out who they love “give me …” find out what they want and so on
  16. visualise the difference between streaming and shuffling sampling algorithms: streaming -> for low numbers, misses items shuffling -> for low numbers, bad distribution across indices that are shuffled. @dseuss @martiningram
  17. symbolic tensorflow: so i can do convs and explain them really easily
  18. the “lentictocular”: uses lenticular technology on a sphere with AI so that it watches your gaze and moves itself accordingly so that it always displays the appropriate time
  19. ai email judgement front: intercepts all your emails for every email, it decides the optimal time that someone will respond it sends it at that time so that they respond
  20. “growth maps” for determining affected areas of projects w.r.t. a pattern language
  21. friend tee: lights up when other friends are nearby
  22. lenticular business cards: this is already done by many people
  23. innovative holiday inventor: thinks up cool holidays
  24. buddhist twitter: there’s only one account, and no password
  25. programming ombudsman: @sordina @kirillrdy
  26. the computational complexity of religion: given various religious abilities, what computational problems can you solve? what are the implications on computational complexity by buddhism? and so on.
  27. spacetime calendar: we have calendars for dates but they don’t often contain space constraints so why not a space-time calendar, defined in some kind of light cone?
  28. app to check consistency of items before you leave: it’s an app you configure it to be aware of your keys, laptop, wallet, glasses then, as you leave your house, it can inform you of the status of those items: - “hey, your computer is at home” - “all g, your glasses are at work” etc @gacafe
  29. gradient tug-of-war demonstration: given a function f(x,y) = x + y then if we have competing loss functions then it would be nice to visualise the gradient flow as a tug of war
  30. a website that is entirely defined in the scroll bars/url link bar, whatever: you can move pages by moving your mouse to different parts of the scroll bars and so on in that fashion
  31. quantum calendar: it’s a calendar where on any given day in the future, items can be scheduled at the same time. but up to some limit (say 1 week) the items get collapsed and locked in @silky @dseuss
  32. streaming terminal graph receiving updates over mqtt: then, can use it to plot tensorboard logs to the terminal instead of tensorboard using blessed + blessed-contrib seems to be the easiest way - https://github.com/yaronn/blessed-contrib/blob/master/examples/line-random-colors.js just need to put in the mqtt part and update the data that way.
  33. ai escape room: this is an idea of dave build an escape room controlled entirely by ai the only way out is by interacting with the machine it can control everything: heating, doors, whatever
  34. programmable themepark: here’s a ride; how you interact with it is defined via your own programs you play minigolf, but instead of a club you use programming @sordina
  35. graph+weights to neuronal net rendering: https://github.com/BlueBrain/NeuroMorphoVis
  36. Arbitrary-Task Programming: given that programming is just arranging symbols, and we can use deep learning to interpret the real world into symbols, then it’s possible to do programming by performing arbitrary tasks i.e. any job can be programming, if we can build the deep-learning system that converts actions in that job into symbols in a programming language
  37. Sitcom-Lang: it’s a pre-processor, or something, for an arbitrary language whenever a symbol is defined, that symbol is imbued with a soul and a “nature”. it starts to have wants and needs; and those must be satisfied in order for it to stay in its present form (i.e. as a “for loop”), otherwise it might change (i.e. to be a “while” loop, or maybe even a string instead). all the symbols will interact with each other, and in that way a program will be made @sordina
  38. brain2object2teeshirt: this - https://scirate.com/arxiv/1810.02223 - but once it’s decided on the object, it gets rendered on your LED tee-shirt @geekscape @sordina
  39. pix2pix sourcecode2cat: generate pictures of source code convert to pictures of cat instant machine for generating cat pictures from code what cat does your code look like? @sordina
  40. physical xmonad: use the uarm to be a “physical” xmonad you want to write on some piece of paper? no worries, the uarm will re-arrange your physical desk so that everything is conveniently arranged to do that @sordina
  41. collaborative painting in the style of christopher alexander: it has 3 parts done by 3 artists i draw the left part; you draw the middle, and it has to interact coherently with whatever i’ve drawn; then another person draws the right side, again, it must interact that’s it.
  42. lego laptops: laptops that plug together in a lego-like way
  43. business version of 30 kids vs 2 professional soccer players: 30 grads vs 2 ceos 30 ceos vs 2 grads etc.
  44. shops in parks: would make the parks safer/nicer, because people would be in them more could limit the type of shops, and their size, but would be a nice way to build a bit more of a community feeling in them
  45. an icon next to your public email address that indicates how many unread emails you have: then people can gauge what will happen if they email you
  46. ml for configuring linux: “what file should i look at to change default font sizes?” “how can i set up my new gpu? what settings should i set?”
  47. water-t-shirt: the essence of a water-bed, in t-shirt form!
  48. deep antiques roadshow: the idea explains itself
  49. a being that is by-default inherently abstract, instead of inherently practical like us: for them, being practical would be really hard by default they live at the other end of the abstraction spectrum
  50. business card bowl: throw all the business cards into a bowl; each day, call them, if they don’t want to do business, throw the card out
  51. “live blog”: whenever someone visits your blog, instead of reading articles, they get to open a chat window with you in your terminal then you tell them what you’ve been up to; and they can ask you questions
  52. software art: take all the source code; stack them up as if the line count is one slice, that’s the structure
  53. t-shirt whiteboard: in essence i.e. you can draw on the t-shirt, and the writing just washes out the next time then you can design whatever you want would this just work?
  54. physics simulation + diagrams: would be great to define things such as “two pieces of rope inter-twined”, and then “drop” them, but then let that resulting expression become a haskell diagrams Diagram, so that you can then do diagram stuff with it
  55. git version flattener: clone a git repo at every revision, into some folder.
  56. see-through jacket that is also warm: optionally also magnetic @sordina
  57. magnetic glass: @sordina
  58. heated keyboard: keeps your fingers warm
  59. run an experiment where monkeys/dogs/whatever are encouraged to learn some kind of programming to solve a task: i.e. a monkey gets 1 food package per day but if learns to program, using the tools provided to it, ( something like a giant physical version of scratch ) then it gets 3 food packages in some sense people have tried this, with them solving problems, but has anyone tried it where the tool they use to solve the problem is general, and can be applied to other areas of their life?
  60. tree to code: physical trees 1. order trees by the number of leaves 2. order code by the number of statements train a deeplearning network to map between these things then, trees can write computer programs @sordina
  61. ethical algorithms testing ground: related to the last two #409 #408 basically, people can sign up to be ethical tester algorithms can join to provide games for people to play how would it work?
  62. ethical testers: beta testers game testers ethics testers
  63. simulation for ethical machine learning problems: consider the situation: “how do i know if this algorithm X is unethical?” well, instead of waiting for the salespeople to tell you, you could just have it run in a simulated environment and see if it’s unethical by the way that it acts.
  64. minecraft file browser: walk around your filesystem in 3d
  65. ocr clipboard copy and paste: select an image region, send it to some text api thing, get the text back in the clipboard
  66. low-powered de-colourisation network: learns to convert colour -> black and white if it doesn’t do a good job, it’ll look awesome
  67. physical quine: a robot that can type on the computer and write code that writes the program that writes itself
  68. deep learning “do” networks: can you include the “do calculus” into neural networks somehow? to make it do some causal things?
  69. plot websites on cesium map, browse the internet that way.: web-world
  70. animated colour schemes for vim: the colour scheme rotates as you code
  71. a tale of three dickens: the movie: it’s an auto-generated movie from locally-coherent slices instead of the book, we make a movie, where all the scenes in the movie are interspersed based on “local coherence” i.e. from two movies select two people having a conversation with someone named bill or, flick between scenes at the beach @sordina
  72. revolutionary walls: the floor is fixed; but the walls are a tube you pull on some part of the wall to rotate it @tall-josh
  73. activation function composer: or more generally, a function composer 1. what does the graph of relu look like? 2. what about the graph of relu . tanh ? and so on, indefinitely and arbitrarily. some features: - what points should be push through? maybe could add certain kinds of initialisations and ranges - add things like drop-out and whatnot.
  74. record videos of people doing interviews but have their voice replaced by obama and their image replaced by obama
  75. hair cut & deep learning deal with the hair-dresser across the road: sit down for a hair cut, get an hour of deep learning consulting as well
  76. live action star wars playing out across many websites in the background of cesium js windows: on my website, a death-star is driving around on its way somewhere eventually it reaches your website, and destroys its logo, or something
  77. deep learning tcp or udp: find something in between
  78. meta-search in google: “i want to see all the alternatives to cloud-ranger” it’s impossible to do this search.
  79. umbrella-scarf / fresh-scarf: it’s a scarf but also, it has a hood that you can pull up, maybe even a clear hood, that lets you see out front of it, but keeps you under cover could also keep smoke out of your face
  80. meme-net: watch video, extract meme i.e. and the rollsafe guy
  81. ultimate computer setup person: someone who just has the worlds best computer set up everything works no data is duplicated whole operating system exists in 1.5 gig; they’ve got 510 gig free no conda/ruby/stack issues
  82. codebase -> readme: looks at an entire codebase; learns to predict the readme
  83. divangulation theorem for websites: @sordina surfaces can be triangulated websites can be divangulated what are the associated theorems?
  84. tasting plates for saas, *aas: instead of just saying “sign up now for 6 months free”; just auto-sign people up for x free things, then let them use it up. easy way to get a billion more dollars for your saas business. @sordina
  85. self-skating skateboard: it drives down to the skate park; skates around on the pipes, does flips, 180s, grinding, whatever. @sordina @tall-josh
  86. different password entry forms: 1. any password you type logs you in, but they all take you to a different computer. only your password takes you to yours. “honeypassword” 2. your password consists of the actual letters, but also the timing between the letters @sordina 3. any key you press is irrelevant, all that matters is the spacing; everything is done via morse-code (@geekscape)
  87. congratulations!!!
  88. meeting chaos monkey: every time a meeting is scheduled, a random attendee is replaced by some other random staff member
  89. consulto the consulting teddybear: @sordina “that sounds good in theory” “have you tried kan-ban’ing that” “moving forward that sounds good, but right now i think we should be pragmatic”
  90. small magnets in fabric that can attach to other magnets so-as to create customisable clothing: just put a diff design on by switching out the magnets i just need some small magnets. jaycar sells them
  91. “collaboration card”: some way of listing and engaging with people in various projects you’re interested in
  92. nlp self-defending thesis
  93. rent factor charged in the city based on how innovative your store is: hairdresser: f = 0.85 funky clothing store: f = 0.6 some weird shop that only sells whatever: 0.2 cafe: 1 or some kind of scheme like so
  94. e-fabric: like e-ink, but for fabric
  95. clothes that change colour with respect to the magnetic fields that are around it
  96. grand designs: of computer programs: follow the development of some kind of app, over a few years. hahahaha would be terrible.
  97. giant magnet that aligns all the spins of the atoms of objects (people?!) so that they can pass through each other with different polarisations
  98. dance-curve net
  99. shoes that look like little cars: volvo shoes monster-truck shoes lamborghini shoes f1-car shoes etc.
  100. augmentation reality glasses that convert what people are saying into words that float in front of you that you can read: so you can “hear” what people say to you when you’re wearing headphones
  101. “html/css layout searcher”, like visual image search, but for how to lay things out with flex/css/react/whatever: input: some scribble about how you want your content laid out in boxes: output: the css/html that achieves this. there’s some networks that do this already, where they convert the drawings to code. but maybe that can be augmented by thinking of it like a search across already-existing content?
  102. “relax ai” or “mindful story ai”: it makes up nice stories, like “you are walking on the beach, you see a small turtle; you follow the turtle for a swim in the water …” could also use cool accents of people, and make sure the story is consistent with another NLP after the first generative run
  103. comedy audience that instead of laughing they just say the things people say when they think something is funny: instead of “ahahaha” audience (in unison): “that’s funny” audience (in unison): “good one” audience (in unison): “great joke”
  104. Adversarial NLP: a sentence so similar to another sentence as to be humanly-indistinguishable, but makes the AI think it’s something wildly different
  105. Inflatable Whiteboard Room: it’s a large room, inflatable like a balloon or whatever, but you can walk into it and use the internals of it as a whiteboard useful for offices
  106. Collaborative Password Manager: say i want to make a password for a system you will control, but we both need access to maybe i can have my program generate part of it; your program generates part, then combine them both on our independent computers, without the entire password leaving either of them could build this on top of the public keys on github somehow; so i just pick the github user i’m going to share a password with could clearly do this immediately by encrypting it with their public key, or something. but maybe something richer can be done
  107. Video Issues: https://vimeo.com/265518095
  108. Faux Project Management Generator Thing: it’s an RNN that generates hundreds of tickets in trello or jira or whatever; with arbitrary due dates makes you feel stressed @sordina could be used for project-management training scenarios
  109. quantum cppn
  110. AIWS: ai for aws you: “hey, i need to computer with whatever to be up at whichever.com and to have some database, blah blah” aiws: “no worries, that’s set up for you!” alt. “talky-form for AWS” @sordina
  111. deep haircut mirror: a mirror infront of hair-dressers that lets you look at potential haircuts on your own head
  112. train a network to learn when to laugh in response to jokes: deep-audience
  113. dance led prompt device: it’s a little led board that sits at the front of a dance thing, like a teleprompter, but for dance it puts out the next dance moves a dance-prompter move-prompter
  114. Easter Egg Evangelist for Enterprises (E^4): A floating employee who embeds on teams to consult on how to best add easter-eggs to the features they build.
  115. self-driving food truck: @martiningram
  116. Stabbucks: Starbucks for knives. * https://i.imgur.com/1ZCIQnh.jpg * Order venti, grande, etc knives
  117. BrainRank: A leaderboard of brain-shaped logos.
  118. DeliveryNet: Reads prose with impeccable timing.
  119. Rant Meetup: Rant about stuff that sucks. * No solutions allowed * Surely james has something to say
  120. submit an AI-entry to every large festival in melbourne in a single year: https://whatson.melbourne.vic.gov.au/Whatson/Festivals/Pages/festival_calendar.aspx
  121. Stochasm: Metal band that plays random notes. * Easy to swap out band members!
  122. Seinfreinds: Have the cast of one sitcom act out an episode of another and see if anyone notices.
  123. hire a comedian to come along to your meetings: they can provide background entertainment me: “hey nice to meet you, this is my associate jerry seinfeld, let’s get started” jerry: “what’s the deal with peanuts?” …
  124. stacked double-coffee-cup holder: it’s just a handle, that holds on to two cups, one above the other useful for carrying multiple cups
  125. Auto-generating face detection camouflage: Aka, auto-generating styles from https://cvdazzle.com/
  126. use the technology of marina (ShapeTex) to make little movable people in jackets: https://www.linkedin.com/in/marina-toeters-a55a685/
  127. “studio gan”: it just makes up every single thing, much like #341 , but in more depth and for everything could use for #343 for example.
  128. the journey of your parcel: imagine you’re waiting for a parcel from auspost you put on your VR headset and you get a real-time view into its life; maybe it’s sitting on a boat, on its way here, or it’s in an airplane, or it’s driving etc. you’d get a full HD video-style image of the thing moving, that would be completely imagined by a gan or something.
  129. menu democracy: buy a coffee, earn 1 voting right to change the menu in some way buy more coffees, proceed in this fashion other food yields you more votes
  130. dynamic videos built on the fly to answer standard google queries: i.e “use python requests to do post request” a video could be made on the fly using the celeb-generating stuff of stack-gan, then the voice-simulation stuff of lyrebird or whoever, then the lip-moving stuff, the text-to-speech of wavenet or whatever, and some other random backing scene gans and music production networks it would get the content by reading the first answer it finds on google, in some summarised way. @sordina
  131. fully-automated fashion design: 1. Fashion-MNIST CPPN - At random, pick a random item of clothing, figure out what it is, and generate a large version. 2. Pick a random (creative-commons) photo from Flickr, train a style transfer network on it. 3. Apply the style transfer to a bunch of different clothing items? To make a theme? 4. Pick a name from an RNN? 5. Upload to PAOM? Run-time should be several hours for one collection? Not so bad.
  132. remote-controlled magnet: a perfectly spherical magnet that can be rolled around by remote.
  133. use cppn to generate a 3d landscape by determining the height by the colour
  134. lunch formation yaml specification: example: lunch: - sandwich: - bread - butter - lettuce - cucumber - butter - bread region: cbd elements are ordered by height on the plate. @sordina
  135. a tale of 3 dickens: combine: 1. a christmas carol 2. a tale of two cities 3. great expectations in order page-by-page. @sordina
  136. instead of colouring in the retro-haskell tee with colours, print the source code for the program itself in the previous colour: easy!
  137. reverse twospace - use offices for other purposes out of hours: silverpond -> t-shirt business on the weekends
  138. dance karaoke: like karaoke, but instead of singing you need to dance uses some pose-recognition thing
  139. build a markov chain thing and then run all the words through some “smoothing” operation by way of a word embedding: i.e. somehow pick a few lines within embedding space, and move all of the words closer to those lines maybe something would happen
  140. naive nn visualisation: just reshape all the weights to be in the shape of an image, normalise the values, and output it.
  141. map sentences to “the gist”; just a few words: “an embedding that compresses a piece of text to its core concepts” “like if i can compress an image” “then i should be able to compress a book” “and if i can do that that means that i can also write a compressed book” “and have the neural network write my book” @gacafe
  142. version number which, in ascii, eventually approaches the source code itself
  143. personal world map: it’s one of those scaled world maps, where the scaling is determined by say your gps coordinates over a given year, so that it only enlarges the places you go. @mobeets
  144. water doughnut
  145. Deep-Can-I-Do-Deep-Learning-Here?: it’s a network for which you input a situation and it tells you if you can use deep learning to help.
  146. haskell type journey challenge: get from package x to package y using only the following types once ….
  147. make the 3d wall art that we saw at the house of sonya
  148. novels in binder-form so that you can take out small sections of the pages and read them
  149. multi-agent learning where the agents also watch each other locally and learn from each other
  150. ml for learning the life/centers function from christopher alexander: two pictures which one has more life? alt. something about centers?
  151. bureaucratic-net: instead of a network that is really good at explaining its decisions, this network is really bad at it. nothing it says makes sense, or alternatively it’s really long-winded in its responses. or maybe it’s always right, but it never has any idea why.
  152. artistic-arxiv: instead of papers, each day take a random few images from every paper and show that. maybe it’d be cool.
  153. DeepWiki: on normal wikipedia, humans edit pages about concepts in the form of words on deepwiki AIs edit concepts in the form of embeddings by way of adjusting the vectors (or something) they’d need to think about how to manage edits and revisions and so on. but that’s the general idea. @sordina
  154. secret walls: wear the streets (or: graffiti on a wall of clothes; and wear them)
  155. a network that is given the punchline and has to work out the setup: @icp-jesus
  156. reverse website or inverse website: normally, you visit a site and see the website and you can view source to see the source what about if you could visit a site and see the source, then view the source to see the site
  157. endless pasta hat: has a self-pesto’ing tube that pushes out a long piece of spaghetti that you can munch on.
  158. in the gan setting, the discriminator isn’t needed when generating, maybe there’s another setting where the discriminator is still useful at the generative stage?
  159. a jacket that makes amazon’s automated shopping thing think you’re a packet of chips: or something similar
  160. CompromiseApp: two people need to agree on something they both have the app person 1 rates the estimated compromise, on a scale, of person 2 person 2 likewise both people record their own true compromise values then, over time, there’s a bunch of things that can be done, such as comparing predicted compromises, total compromises made, etc.
  161. DerivativeNet: it watches all seinfield episodes and sees if it can generate curb your enthusiasm episodes it reads all smbc comics and sees if it can generate xkcd ones etc.
  162. Cap-Gun mechanical keyboard: You pull back a bunch of hammers then as you type it fires the caps.

February 21, 2019

Patrick Louis (venam)

February 2019 Projects February 21, 2019 10:00 PM

The new year has begun… A while ago!

My last post was almost 9 months ago; more than half a year has passed. A lot has happened, but I still feel like time has passed quickly.

Psychology, Philosophy & Books

Language: brains
Explanation:

Les fleurs du mal

The majority of my reading has been through the newsletter, however I still had the time to go through the following:

  • The man who mistook his wife for a hat - Oliver Sacks
  • Les fleurs du mal - Baudelaire
  • Tribal leadership - Dave Logan
  • Managing business through human psychology - Ashish Bhagoria
  • The new one minute manager - Ken Blanchard
  • Authentic leadership - Harvard Business Review
  • The one thing - Gary Keller

As you might have noticed, a lot of them are related to methodologies, approaches to interacting with others, and new ways of thinking. I find it fascinating to gather novel ways of seeing the world, all the mental models, all the manners of making decisions. This was the mindset for most of the past months, along with re-energizing, invigorating, and bringing back my artistic sense.
A long long time ago I used to be in love with poetry, and thus I'm slowly reincorporating this old habit into my daily life. Too much rationality is a disease of this day and age, one I'm trying not to fall into. I'm also working on my personal "immunity to change" regarding over-planning, so all of this helps.

As far as podcasts go, here’s the new list apart from the usual ones I was already following:

  • The Knowledge Project with Shane Parrish [All]
  • Planet Money [All]
  • The Food Chain [All]
  • Science Friday [All]
  • Hi-Phi Nation [All]
  • The History of GNOME [All]
  • The End Of The World with Josh Clark [All]
  • The Darknet Diaries [Most]
  • LeVar Burton Reads [Most]
  • Team Human [Started]

I’ve gathered more than one thousand hours those past months in AntennaPod, the application I’m using to listen to podcast (Something like 3h a day everyday). So I thought of moving away from the podcast world, at least for a while, for the next 2 or 3 months to learn something else on the road. I’ve chosen to dedicate this time to practicing singing. We’ll see what comes out of it, so far it’s going great but there needs to be a day or two a week for resting.

Learning & Growth

Language:growth
Explanation:

Face to face

I go through phases of learning and then creating. These months it's been about learning. The emphasis was on work life, management, leadership, and Android.

I have in mind some ideas for applications I want to build and I'm slowly gathering info, and in the meantime having fun learning the various aspects of the Android ecosystem.

On the other side, I’m working on my “immunity to change”, something I’ve learned from a Robert Kegan’s book. This relates to how we unknowingly create toxic loops within our lives that stop us to do actual changes we would like to do because those loops inherently defines us. For me it’s about an obsession with time, delegation, and loosening the grip over control of time.
Thus the switch to reading and learning in those leadership, management, and emotional intelligence books instead of digging deeper into projects one after the other like time is running out fast. I would’ve dismissed such content before and the same goes for the reemergence of artistic hobbies in my day to day.

Upgrading the Old

Language: C++
Explanation:

Manga downloader GUI

On that same note, I've gone back and upgraded an old project of mine: my manga downloader now supports webtoons, a popular free Korean manhwa website.

Get it here.

It has been fun re-reading old code, checking out what kind of mindset I had while writing it. I can't attest that it's great code, nor that I would write it the same way today, but hey, it's still standing!

Newsletter and community

Language: Unix
Explanation:

nixers sticker

Countless videos, talks, research papers, and articles about Unix were consumed with much pleasure!
The newsletter has had its two-year anniversary (issue 104); it has been a blast, with a drawing from rocx, a ricing tips section from xero, and the start of a partnership with vermaden.

Every week we’re learning more and topics start to link together in a spider web of ideas. I like to reference previous sections of the newsletter when things are related and long time readers might enjoy this too (Check the archive).

To the joy of the now ~400 readers, vermaden has now partnered up to give a nice touch to the newsletter; he's more into BSD news than I am, which keeps the balance.

I’ve dropped the “Others” part of the newsletter because of criticism that it was too offtopic and had a hint of political ideas to it. Let’s cut it short and simply say that I’d rather leave that section out than play the so-trendy game of internet over-interpretative argumentations and red herrings.

As for the podcast, I couldn't put it on my top list but still felt in my gut that I wanted to write or prepare something similar. So I've begun a series named "Keeping time and date", which is similar to having a podcast/research on the topic. I'm hoping to do more of those in the future. By the time this post is online the series should almost be over.

On the nixers community side of things, the forums don't get much appreciation but the irc conversation is still going on. I might organize events in the near future and I'm hoping a community push will do some good. My guess is that the forums aren't as active because they're seen as too high a pedestal to contribute to; nobody thinks their ideas are worth sharing there, and if they do find something worth sharing they'd rather do that on their personal platform, which I totally understand.

2bwm and Window management

Language: C
Explanation:

Glue

2bwm now supports a new fancy EWMH for full screen messages. This is a change I wanted to make for a while but didn’t put on my list.

Fortunately, frstrikerman has had the courtesy of starting a pull request with enough energy to make it happen. I’ve guided him a bit and together we’ve implemented it.
I’m looking forward to writing more about changes I want to make, how to make them, but leave the door open for the changes to be done by contributors. This creates a learning opportunity for anyone interested.

Three new articles have seen the light of day on my blog, all of them, unexpectedly, in close tie with the topic of window management. I’ve also added some preloading/caching behavior on the index page of the blog.

All of the articles were popular, although the window automation one drew quite a crowd (25k+ readers) in the first two days, coming from tech news aggregation websites. Clearly, someone had posted it and it attracted readers, all for the best.

Ascii Art & Art

Language: ASCII
Explanation:

Abstract

Many packs have been posted by impure during the past period: impure 69, impure 70, and impure 71.
In 69 all I could manage to pull off was an abstract piece, and in 70 and 71 I had a Cretaceous dinosaur series.

In between, there was Evoke 2018, where I placed fifth. The idea of that compo was to restrict the size of the piece to 80 columns by 25 lines.

trapped within

I’ve also indirectly participated in Demosplash 2018 and scored the tenth position. I hadn’t really done anything particular for it but simply compiled together all the dinosaurs I had until that point.

ankylosaurus

A similarity you might have noticed is that I've toned down the coloring. I'm trying to extract as much as possible from the art without having to think about anything other than message and form.

To put it bluntly I’ve turned off color mode:

syntax off

Right now I’m letting myself flow through a novel totem piece, a similar but more pronounced style to the one I had for Evoke 2017, that I call “dogmaIamgod” and which I should publish in the next two or three weeks.

Likewise, I’ve done some drawings and paintings too. I bought canvases for the first time and started experimenting, though I haven’t put them on the blog yet. Here’s a peek:

Extrapolating

Life and other hobbies

Language: life
Explanation:

Fungi board game

The quest for wild mushrooms is still ongoing. My SO and I have been gathering and studying mushrooms almost every day. /r/mycology has become one of my favorite places to browse on lazy mornings.

In autumn we went on many small hikes, and some bigger ones too, without much luck but with much fun. All we could find were amanita citrina and beautiful though non-edible elfin saddles. There were other uninteresting species too.

elfin saddle

However, we’re not giving up, we’re taking this hobby to the next step. We lately went shopping for some hiking equipments, brand new fancy boots, some sticks, a handmade straw basket, etc.. And we’re planning on going on more hikes in spring. Spring is not a high season for edibles but we’re still hoping to find morels as it’s their prime time.

Overall this hobby has brought us closer together and added excitement to our relationship. We even got a card game named Fungi for Valentine's Day. Moreover, we've searched for local places that serve wild mushrooms and went to all the ones we could find. They're usually pricey but worth the culinary experience.

Other than that, my SO and I have gotten into retro gaming, which I've written about before. These days it has become trendy again and many console manufacturers are relaunching their old brands. I guess nostalgia is a market that is well tapped into.

I’m actively looking for mushroom books to order from local libraries but most of them are not available so I end up reading the PDF versions I can find online. I’ve also been binge watching on Carluccio’s mushroom recipes, such a master.

This got me back into honing my cooking knowledge and art. I got tired of cooking everything on Sunday for the rest of the week, so I'm trying to juggle 3 days a week of home-cooked meals with restaurants or other options for the rest of the week.
All of which drove me to two ideas.

First, the idea for a specific cooking diary app.
Second, the creation of a Zomato account. This goes with the same mindset as when I created the Google Maps account. I want to contribute to the local community by sharing the places I like the most. My mantra is that there will be no bad reviews, only constructive criticism if any.

Lastly, on the topic of food, fermentation has caught my attention and I've now got mason jars filled with awesome vegetables. I'm currently on my third batch and exploring different formulas.

fermentation

When appetite is good, life’s good.

And life is!
Every year I normally pick a certain theme to rotate around and focus on. This time I've chosen to awaken my artistic side, spend more time in nature, organize more activities with my friends, and write and contribute more to the communities I'm part of through what I know best.

I did try to refresh my Spanish, but it wasn't really part of anything and I didn't follow up on it.

And to finish off, I've begun a daily diary: a quick summary of what I've done during the day, what's on my mind, what I feel, what I want to do next. It's a complement to what I was already doing through short, medium, and long term goals, associating it with my global path and intentions in life. In general this is revitalizing to do at the end of the day; it makes me more aware of my actions, appreciative of the good parts, and reflective on what could be done in the future.

Now

Which all leads to what’s in store for tomorrow.
More of the same but upgraded.

I really want to work on the idea I got for the application. Contribute to community projects more, write more articles about what I know, share ideas. Obviously the newsletter is included, with more mini-series.
2bwm needs a big shake-up to add the wanted EWMH support.
Maybe I’ll get back into WeChall, though it’s not part of the priority of the year.
And definitely push other hobbies too!

I’m going to travel in June with a friend to New York and then Miami, that’ll shift my perspective for a while, maybe bring some new insights.

This is it!

As usual… If you want something done, no one’s gonna do it for you, use your own hands.
And let’s go for a beer together sometime, or just chill.

Joe Nelson (begriffs)

Browsing a remote git repository February 21, 2019 12:00 AM

Git has no built-in way to browse and display files from a remote repo without cloning. Its closest concept is git ls-remote, but this shows only the hashes for references like HEAD or master, and not the files inside.
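For instance, pointing it at a hosted repository prints one line per reference and nothing else (the hashes below are abbreviated and purely illustrative):

$ git ls-remote https://github.com/begriffs/gitftp
2f41cee...	HEAD
2f41cee...	refs/heads/master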

I wrote a server to expose a git repo in a new way. Watch the short demo:

Watch Video

How to try it yourself

You can get the code at begriffs/gitftp. It’s currently a proof of concept. Once I’ve added some more features I’ll run a public server to host the project code using the project itself.

The server is written in C and requires only libgit2. It’s small and portable.

Why do it this way?

The standard solution is to use a web interface like GitHub, GitLab, cgit, stagit, klaus, GitWeb, etc. However, these interfaces are fairly rigid and don't connect well with external tools. While some of these sites also provide RESTful APIs, the clients available to consume those APIs are limited. Also, desktop clients for these proprietary services are often big Electron apps.

By serving a repo behind an FTP interface, we get these benefits:

  • Web browser supported but not required
  • Minimized network traffic, sending just the file data itself
  • Supported on all platforms, with dozens of clients already written
  • Support for both the command line and GUI

GitFTP reads from a git repo's internal database and exposes the trees and blobs as a filesystem. It reads from the master branch, so each new connection sees the newest code. Any single connection sees the files in a consistent state, unchanging even if new commits happen while the connection is open. The FTP welcome message identifies the SHA being served.
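Since it speaks plain FTP, any ordinary client should be able to browse it. A rough sketch with curl (the host name and file path here are placeholders, not gitftp's actual defaults):

$ curl ftp://git.example.com/            # list the files at the root of master
$ curl ftp://git.example.com/README.md   # print a single file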

TODOs

This is a proof of concept. I’m putting it out there to gauge general interest. If we want to continue working on it, there are plenty of features to add, like providing a way to browse code at different commits, supporting SFTP so files cannot be changed by a man in the middle, etc etc. See the project issues for more.

February 20, 2019

Indrek Lasn (indreklasn)

How to create a blazing fast modern blog with Nuxt and Prismic February 20, 2019 04:25 PM

Let’s build a modern blog with Vue, Nuxt and Prismic.

I chose Vue + Nuxt since they’re fun to work with. It’s easy to start with, offers plenty of essential features out of the box, and provides good performance.

If you’re new to Vue, I encourage to take a look at this article for understanding the basics.

Nuxt is a Vue framework for server-side rendering. It is a tool in the Vue ecosystem that you can use to build server-rendered apps from scratch without being bothered by the underlying complexities of rendering a JavaScript app on a server.

Why Nuxt?

https://nuxtjs.org/

Nuxt.js is an implementation of what we call a Universal Application.

It became famous with React but is currently getting more and more popular for many client-side libraries such as Angular, Vue.js, etc.

A Universal Application is a kind of application that renders your component on the server side.

Nuxt.js offers a simple way to first retrieve your asynchronous data from any data source, then render it and send it to the browser as HTML.

In terms of SEO, the Google bot crawler will get the rendered content and index it properly. In addition, the fact that your content can be pre-rendered and ready to be served increases the speed of your website and, in that way, also improves your SEO.

The Nuxt ecosystem is a never ending stream of handy tools and packages.


Fast rendering ensured by virtual DOM and minimal load time

Vue.js is only ~30 KB gzipped with the core module, the router and Vuex.

A minimal footprint offers a short load time, meaning higher speed for users and a better ranking on the speed criterion for the Google crawler.

Virtual DOM!

Vue.js also took inspiration from ReactJS by implementing a Virtual DOM under the hood since version 2.0. The Virtual DOM is basically a way to generate a version of the DOM in memory each time you change state and compare it to the actual DOM, so you can update only the parts that need to be updated instead of re-rendering everything.

Benchmarking

Vue.js offers some really good overall performance as you can see on the following benchmarks:

Duration in milliseconds ± standard deviation (Slowdown = Duration)

Memory allocation in MB

(Source: third-party benchmarks by Stefan Krause)

What is Prismic and why should I care?

Prismic is a headless CMS. This means you edit your templates on your own server, but the backend runs in the cloud. This presents a few advantages, such as being able to use an API to feed your content into external apps.

Imagine that you built a blog, not for yourself, but for someone else who is not a developer, so they can edit their content. You want to have full control over the layout (built with Vue) but you don’t want to go over the tedious process of deploying every time a new file for a new blog post is created.

This is where including a headless content management system (CMS) into your app is useful — so you don’t have to deal with that.

https://prismic.io/usecases

What’s the difference between a headless CMS and vanilla CMS?

A traditional CMS like WordPress would provide the editing tools for managing content. But it would also assume full control of the front-end of your website: the way the content is displayed is largely defined in the CMS.

Headless content management system, or headless CMS, is a back-end only content management system (CMS) built from the ground up as a content repository that makes content accessible via a RESTful API for display on any device.

If you want to know more, Prismic wrote a clear article about headless CMSes.

I chose Prismic as my headless CMS — it’s super simple to set up and has great features out of the box.

Why I chose Prismic

  • Easy to set up — it took me only a couple of hours to set up the environment and push to production.
  • Live Preview mode — allows editors to preview their content on their website and apps, whether it's in a draft or scheduled to be published later. This allows marketing teams, for example, to have a full preview of their website for a specific date and time. This can be extremely useful for managing upcoming blog releases and previewing edits.
  • Slices — Slices are reusable components. Enabling Slices in your template will allow writers to choose between adding a text section, an image, or a quote in the piece of content they are creating. This gives writers the freedom to compose a blog post by alternating and ordering as many of these choices/content blocks as they want/need.
  • Simple and comprehensive documentation.
  • Strong community, e.g. Google, New Relic, eBay, etc. are using Prismic
  • Friendly free tier

Setting up Prismic is very simple, so let's get started!

Head over to the Prismic website and create a new user.

After creating a new user on Prismic, we should see something like this:

Building our custom type

Custom Types are models of content that we set up for our marketing or writing team. The marketing team will fill them with content (text, images, etc.), and we'll be able to retrieve this content through Prismic's API.

There are two kinds of Custom Types: the Single Type and the Repeatable type.

The Single Type is used for pages of which there is only one instance (a home page, a pricing page, an about us page).

Repeatable Custom Types will be templates used in more than one document (i.e. having many blog post pages, product pages, landing pages for your website).

We want a blog post. In fact we want many blog posts, so it should be a repeatable type.

choosing the type

Creating a repeatable type blog post.

We should be on the content builder now. Prismic gives us a lot of options to choose from. If you look on the right, you should see a sidebar including lots of options, from images, titles, and content relationships to SEO options.

Let’s build a reusable blog post with the prismic builder. Our blog will include a title and the body.

Start with adding the following fields:

  • UID field
  • Title field
  • Rich text field

Each time you add a field you can define formatting options for it. The UID field is a unique identifier that can be used specifically to create SEO- and user-friendly website URLs.

Creating our blog post title

Don’t forget to save our progress!

Make sure you have the following fields for the blog post:

  • uid
  • blog_post_title
  • blog_content

So far we have the layout for our reusable blog post.

Custom types menu

Time to create a blog post! Head over to the content tab on the left.

Content tab

This will take us to the blog layout we built earlier. Insert the desired text for the uid, blog_post_title, and blog_content blocks.

Building our page with Prismic layout builder

Great! We have our blog post set up now. Look at the top right; we should see a "save" button. Clicking this saves our progress. After saving, we can publish our content. Publishing the content makes it available via the API for our front-end to consume.

Starting a new Nuxt project

Open your terminal and run this command. Make sure you have npx installed (shipped by default with npm 5.2.0+).

$ npx create-nuxt-app vue-nuxt-prismic-blog

The Nuxt installer conveniently asks us our preferences and creates the project.

We should end up with a project structure like below:

Nuxt project structure

Great! Let’s build our blog now. We need to fetch the blog content from Prismic. Luckily, prismic gives us plenty of handy tools.


The prismic-javascript package includes many utilities, including fetching from our API. The prismic-dom package gives us helper functions to render markup.

Prismic NPM package — https://www.npmjs.com/package/prismic-javascript
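Installing both packages would look something like this (a minimal sketch, assuming npm):

$ npm install prismic-javascript prismic-dom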

Let’s quickly create the prismic.config.js file in our root directory. This is where we’ll place our Prismic related configuration.

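A minimal sketch of what prismic.config.js could contain; the apiEndpoint value below is a placeholder, so swap in your own repository's endpoint:

// prismic.config.js
// Placeholder endpoint: replace "your-repo-name" with your own Prismic repository name.
export default {
  apiEndpoint: 'https://your-repo-name.cdn.prismic.io/api/v2'
}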

Note: Make sure you use the API endpoint associated with your blog.

Open the pages/index.vue file and import the Prismic library with our config.

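A sketch of the imports at the top of the script section; the config import path assumes the prismic.config.js file we just created in the project root:

// pages/index.vue (script section)
import Prismic from 'prismic-javascript'
import PrismicDOM from 'prismic-dom'
import PrismicConfig from '~/prismic.config.js'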

Great! Next, we have to call the API somewhere, let’s use the asyncData life-cycle hook for this.

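A sketch of what that hook could look like; the custom type API ID ('blog_post') and the language code are assumptions here, so adjust them to match your repository:

// pages/index.vue (script section)
export default {
  async asyncData() {
    // Initialize the API client with the endpoint from our config
    const api = await Prismic.getApi(PrismicConfig.apiEndpoint)
    // Query for documents of our custom type, in the language we want
    const response = await api.query(
      Prismic.Predicates.at('document.type', 'blog_post'),
      { lang: 'en-us' }
    )
    // Expose the first (and only) blog post to the component
    return { document: response.results[0] }
  }
}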

First, we initialize our API with the endpoint. Then, we query the API to return our blog post. We can specify the language and document type.

The Prismic API is promise-based, which means we can call the API and chain promises. Hurray for promises. We can also use the async/await syntax to resolve promises. Check out this post I wrote about async/await.

Prismic response

All we need to do is render the markup now!

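A sketch of the template plus the computed properties that turn the Prismic fields into markup; the field names match the ones we defined earlier, and a link resolver is omitted for brevity:

<!-- pages/index.vue (template section) -->
<template>
  <article class="blog-post" v-if="document">
    <h1>{{ title }}</h1>
    <div v-html="content"></div>
  </article>
</template>

// ...and in the script section, alongside asyncData:
computed: {
  // Plain text from the blog_post_title field
  title() {
    return PrismicDOM.RichText.asText(this.document.data.blog_post_title)
  },
  // HTML rendered from the blog_content rich text field
  content() {
    return PrismicDOM.RichText.asHtml(this.document.data.blog_content)
  }
}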

There you go. We successfully fetched our blog post from the Prismic API.

Applying the styles: grab a copy of the stylesheet from the article's repository (linked below) and place it in the style section of the Vue component.


If we open our app, this is what we should see.

End result

Voilà! We have a modern server-side rendered blog built with Nuxt and Prismic.

We barely scratched the surface — we can do a lot more with Nuxt and Prismic. My favorite Prismic features are Slices and Live Preview. I encourage you to check them out!

Slices will allow you to create “dynamic” pages with richer content, and Live preview will allow you to instantly preview your edits in your webpage.

Slices

For example, in this project we worked on only one post, but if we had created lots of posts in Prismic, one really cool thing about Nuxt.js is that it automatically creates routes for you.

Behind the scenes, it still uses Vue Router for this, but you don’t need to create a route config manually anymore. Instead, you create your routing using a folder structure — inside the pages folder. But you can read all about that in the official docs on routing in Nuxt.js.
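As a hypothetical sketch, a pages/blog/_uid.vue page would give every post its own /blog/:uid route; the getByUID call assumes the same 'blog_post' custom type and the uid field we added earlier:

// pages/blog/_uid.vue (script section)
import Prismic from 'prismic-javascript'
import PrismicConfig from '~/prismic.config.js'

export default {
  async asyncData({ params, error }) {
    const api = await Prismic.getApi(PrismicConfig.apiEndpoint)
    // Look the document up by the uid taken from the route
    const document = await api.getByUID('blog_post', params.uid)
    if (!document) {
      return error({ statusCode: 404, message: 'Post not found' })
    }
    return { document }
  }
}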

Thanks for reading! If you found this useful, please give the article some claps so more people see it! ❤

In case you got lost, here’s the repository for our blog:

wesharehoodies/nuxt-prismic-blog

If you have any questions regarding this article, or anything general — I’m active on Twitter and always happy to read comments and reply to tweets.

Here are some of my previous articles:


How to create a blazing fast modern blog with Nuxt and Prismic was originally published in freeCodeCamp.org on Medium, where people are continuing the conversation by highlighting and responding to this story.

February 19, 2019

Dan Luu (dl)

Randomized trial on gender in Overwatch February 19, 2019 12:00 AM

A recurring discussion in Overwatch (as well as other online games) is whether or not women are treated differently from men. If you do a quick search, you can find hundreds of discussions about this, some of which have well over a thousand comments. These discussions tend to go the same way and involve the same debate every time, with the same points being made on both sides. Just for example, three threads on reddit that spun out of a single post have a total of 10.4k comments. On one side, you have people saying "sure, women get trash talked, but I'm a dude and I get trash talked, everyone gets trash talked there's no difference", "I've never seen this, it can't be real", etc., and on the other side you have people saying things like "when I play with my boyfriend, I get accused of being carried by him all the time but the reverse never happens", "people regularly tell me I should play mercy[, a character that's a female healer]", and so on and so forth. In less time than has been spent on a single large discussion, we could just run the experiment, so here it is.

This is the result of playing 339 games in the two main game modes, quick play (QP) and competitive (comp), where roughly half the games were played with a masculine name (where the username was a generic term for a man) and half were played with a feminine name (where the username was a woman's name). I recorded all of the comments made in each of the games and then classified the comments by type. Classes of comments were "sexual/gendered comments", "being told how to play", "insults", and "compliments".

In each game that's included, I decided to include the game (or not) in the experiment before the character selection screen loaded. In games that were included, I used the same character selection algorithm, I wouldn't mute anyone for spamming chat or being a jerk, I didn't speak on voice chat (although I had it enabled), I never sent friend requests, and I was playing outside of a group in order to get matched with 5 random players. When playing normally, I might choose a character I don't know how to use well and I'll mute people who pollute chat with bad comments. There are a lot of games that weren't included in the experiment because I wasn't in a mood to listen to someone rage at their team for fifteen minutes and the procedure I used involved pre-committing to not muting people who do that.

Sexual or sexually charged comments

I thought I'd see more sexual comments when using the feminine name as opposed to the masculine name, but that turned out to not be the case. There was some mention of sex, genitals, etc., in both cases and the rate wasn't obviously different and was actually higher in the masculine condition.

Zero games featured comments directed specifically at me in the masculine condition, and two (out of 184) games in the feminine condition featured comments that were directed at me. Most comments were either directed at other players or were just general comments to team or game chat.

Examples of typical undirected comments that would occur in either condition include "my girlfriend keeps sexting me how do I get her to stop?", "going in balls deep", "what a surprise. *strokes dick* [during the post-game highlight]", and "support your local boobies".

The two games that featured sexual comments directed at me had the following comments:

  • "please mam can i have some coochie", "yes mam please" [from two different people], ":boicootie:"
  • "my dicc hard" [believed to be directed at me from context]

During games not included in the experiment (I generally didn't pay attention to which username I was on when not in the experiment), I also got comments like "send nudes". Anecdotally, there appears to be a difference in the rate of these kinds of comments directed at the player, but the rate observed in the experiment is so low that uncertainty intervals around any estimates of the true rate will be similar in both conditions unless we use a strong prior.

The fact that this difference couldn't be observed in 339 games was surprising to me, although it's not inconsistent with McDaniel's thesis, a survey of women who play video games. 339 games probably sounds like a small number to serious gamers, but the only other randomized experiment I know of on this topic (besides this experiment) is Kasumovic et al., which notes that "[w]e stopped at 163 [games] as this is a substantial time effort".

All of the analysis uses the number of games in which a type of comment occurred, rather than the tone of the comments, to avoid having to code comments as having a certain tone and possibly injecting bias into the process. Sentiment analysis models, even state-of-the-art ones, often return nonsensical results, so this basically has to be done by hand, at least today. With much more data, some kind of sentiment analysis, done with liberal spot checking and re-training of the model, could work, but the total number of comments is so small in this case that it would amount to coding each comment by hand.

Coding comments manually in an unbiased fashion can also be done with a level of blinding, but doing that would probably require getting more people involved (since I see and hear comments while I'm playing) and relying on unpaid or poorly paid labor.

Being told how to play

The most striking, easy to quantify, difference was the rate at which I played games in which people told me how I should play. Since it's unclear how much confidence we should have in the difference if we just look at the raw rates, we'll use a simple statistical model to get the uncertainty interval around the estimates. Since I'm not sure what my belief about this should be, this uses an uninformative prior, so the estimate is close to the actual rate. Anyway, here are the uncertainty intervals a simple model puts on the percent of games where at least one person told me I was playing wrong, that I should change how I'm playing, or that I switch characters:


Cond     Est  P25  P75
F comp    19   13   25
M comp     6    2   10
F QP       4    3    6
M QP       1    0    2

The experimental conditions in this table are masculine vs. feminine name (M/F) and competitive mode vs quick play (comp/QP). The numbers are percents. Est is the estimate, P25 is the 25%-ile estimate, and P75 is the 75%-ile estimate. Competitive mode and using a feminine name are both correlated with being told how to play. See this post by Andrew Gelman for why you might want to look at the 50% interval instead of the 95% interval.

For people not familiar with overwatch, in competitive mode, you're explicitly told what your ELO-like rating is and you get a badge that reflects your rating. In quick play, you have a rating that's tracked, but it's never directly surfaced to the user and you don't get a badge.

It's generally believed that people are more on edge during competitive play and are more likely to lash out (and, for example, tell you how you should play). The data is consistent with this common belief.

Per above, I didn't want to code tone of messages to avoid bias, so this table only indicates the rate at which people told me I was playing incorrectly or asked that I switch to a different character. The qualitative difference in experience is understated by this table. For example, the one time someone asked me to switch characters in the masculine condition, the request was a single, polite sentence ("hey, we're dying too quickly, could we switch [from the standard one primary healer / one off healer setup] to double primary healer or switch our tank to [a tank that can block more damage]?"). When using the feminine name, a typical case would involve 1-4 people calling me human garbage for most of the game and consoling themselves with the idea that the entire reason our team is losing is that I won't change characters.

The simple model we're using indicates that there's probably a difference between both competitive and QP and playing with a masculine vs. a feminine name. However, most published results are pretty bogus, so let's look at reasons this result might be bogus and then you can decide for yourself.

Threats to validity

The biggest issue is that this wasn't a pre-registered trial. I'm obviously not going to go and officially register a trial like this, but I also didn't informally "register" this by having this comparison in mind when I started the experiment. A problem with non-pre-registered trials is that there are a lot of degrees of freedom, both in terms of what we could look at, and in terms of the methodology we used to look at things, so it's unclear if the result is "real" or an artifact of fishing for something that looks interesting. A standard example of this is that, if you look for 100 possible effects, you're likely to find 1 that appears to be statistically significant with p = 0.01.

There are standard techniques to correct for this problem (e.g., Bonferroni correction), but I don't find these convincing because they usually don't capture all of the degrees of freedom that go into a statistical model. An example is that it's common to take a variable and discretize it into a few buckets. There are many ways to do this and you generally won't see papers talk about the impact of this or correct for this in any way, although changing how these buckets are arranged can drastically change the results of a study. Another common knob people can use to manipulate results is curve fitting to an inappropriate curve (often a 2nd or 3rd degree polynomial when a scatterplot shows that's clearly incorrect). Another way to handle this would be to use a more complex model, but I wanted to keep this as simple as possible.

If I wanted to really be convinced on this, I'd want to, at a minimum, re-run this experiment with this exact comparison in mind. As a result, this experiment would need to be replicated to provide more than a preliminary result that is, at best, weak evidence.

One other large class of problem with randomized controlled trials (RCTs) is that, despite randomization, the two arms of the experiment might be different in some way that wasn't randomized. Since Overwatch doesn't allow you to keep changing your name, this experiment was done with two different accounts and these accounts had different ratings in competitive mode. On average, the masculine account had a higher rating due to starting with a higher rating, which meant that I was playing against stronger players and having worse games on the masculine account. In the long run, this will even out, but since most games in this experiment were in QP, this didn't have time to even out in comp. As a result, I had a higher win rate as well as just generally much better games with the feminine account in comp.

With no other information, we might expect that people who are playing worse get told how to play more frequently and people who are playing better should get told how to play less frequently, which would mean that the table above understates the actual difference.

However Kasumovic et al., in a gender-based randomized trial of Halo 3, found that players who were playing poorly were more negative towards women, especially women who were playing well (there's enough statistical manipulation of the data that a statement this concise can only be roughly correct, see study for details). If that result holds, it's possible that I would've gotten fewer people telling me that I'm human garbage and need to switch characters if I was average instead of dominating most of my games in the feminine condition.

If that result generalizes to OW, that would explain something which I thought was odd, which was that a lot of demands to switch and general vitriol came during my best performances with the feminine account. A typical example of this would be a game where we have a 2-2-2 team composition (2 players playing each of the three roles in the game) where my counterpart in the same role ran into the enemy team and died at the beginning of the fight in almost every engagement. I happened to be having a good day and dominated the other team (37-2 in a ten minute comp game, while focusing on protecting our team's healers) while only dying twice, once on purpose as a sacrifice and a second time after a stupid blunder. Immediately after I died, someone asked me to switch roles so they could take over for me, but at no point did someone ask the other player in my role to switch despite their total uselessness all game (for OW players this was a Rein who immediately charged into the middle of the enemy team at every opportunity, from a range where our team could not possibly support them; this was Hanamura 2CP, where it's very easy for Rein to set up situations where their team cannot help them). This kind of performance was typical of games where my team jumped on me for playing incorrectly. This isn't to say I didn't have bad games; I had plenty of bad games, but a disproportionate number of the most toxic experiences came when I was having a great game.

I tracked how well I did in games, but this sample doesn't have enough ranty games to do a meaningful statistical analysis of my performance vs. probability of getting thrown under the bus.

Games at different ratings are probably also generally different environments and get different comments, but it's not clear if there are more negative comments at 2000 than 2500 or vice versa. There are a lot of online debates about this; for any rating level other than the very lowest or the very highest ratings, you can find a lot of people who say that the rating band they're in has the highest volume of toxic comments.

Other differences

Here are some things that happened while playing with the feminine name that didn't happen with the masculine name during this experiment or in any game outside of this experiment:

  • unsolicited "friend" requests from people I had no textual or verbal interaction with (happened 7 times total, didn't track which cases were in the experiment and which weren't)
  • someone on the other team deciding that my team wasn't doing a good enough job of protecting me while I was playing healer, berating my team, and then throwing the game so that we won (happened once during the experiment)
  • someone on my team flirting with me and then flipping out when I don't respond, who then spends the rest of the game calling me autistic or toxic (this happened once during the experiment, and once while playing in a game not included in the experiment)

The rate of all these was low enough that I'd have to play many more games to observe something without a huge uncertainty interval.

I didn't accept any friend requests from people I had no interaction with. Anecdotally, some people report that people will send sexual comments or berate them after an unsolicited friend request. It's possible that the effect shown in the table would be larger if I accepted these friend requests, and it couldn't be smaller.

I didn't attempt to classify comments as flirty or not because, unlike the kinds of comments I did classify, this is often somewhat subtle and you could make a good case that any particular comment is or isn't flirting. Without responding (which I didn't do), many of these kinds of comments are ambiguous.

Another difference was in the tone of the compliments. The rate of games where I was complimented wasn't too different, but compliments under the masculine condition tended to be short and factual (e.g., someone from the other team saying "no answer for [name of character I was playing]" after a dominant game) and compliments under the feminine condition tended to be more effusive and multiple people would sometimes chime in about how great I was.

Non differences

The rate of compliments and the rate of insults in games that didn't include explanations of how I'm playing wrong or how I need to switch characters were similar in both conditions.

Other factors

Some other factors that would be interesting to look at would be time of day, server, playing solo or in a group, specific character choice, being more or less communicative, etc., but it would take a lot more data to be able to get good estimates when adding in more variables. Blizzard should have the data necessary to do analyses like this in aggregate, but they're notoriously private with their data, so someone at Blizzard would have to do the work and then publish it publicly, and they're not really in the habit of doing that kind of thing. If you work at Blizzard and are interested in letting a third party do some analysis on an anonymized data set, let me know and I'd be happy to dig in.

Experimental minutiae

Under both conditions, I avoided ever using voice chat and would call things out in text chat when time permitted. Also under both conditions, I mostly filled in with whatever character class the team needed most, although I'd sometimes pick DPS (in general, DPS are heavily oversubscribed, so you'll rarely play DPS if you don't pick one even when unnecessary).

For quickplay, backfill games weren't counted (backfill games are games where you join after the game started to fill in for a player who left; comp doesn't allow backfills). 6% of QP games were backfills.

These games are from before the "endorsements" patch; most games were played around May 2018. All games were played in "solo q" (with 5 random teammates). In order to avoid correlations between games depending on how long playing sessions were, I quit between games and waited for enough time (since you're otherwise likely to end up in a game with some or many of the same players as before).

The model used probability of a comment happening in a game to avoid the problem that Kasumovic et al. ran into, where a person who's ranting can skew the total number of comments. Kasumovic et al. addressed this by removing outliers, but I really don't like manually reaching in and removing data to adjust results. This could also be addressed by using a more sophisticated model, but a more sophisticated model means more knobs which means more ways for bias to sneak in. Using the number of players who made comments instead would be one way to mitigate this problem, but I think this still isn't ideal because these aren't independent -- when one player starts being negative, this greatly increases the odds that another player in that game will be negative, but just using the number of players makes four games with one negative person the same as one game with four negative people. This can also be accounted for with a slightly more sophisticated model, but that also involves adding more knobs to the model.

Appendix: comments / advice to overwatch players

A common complaint, perhaps the most common complaint by people below 2000 SR (roughly 30%-ile) or perhaps 1500 SR (roughly 10%-ile) is that they're in "ELO hell" and are kept down because their teammates are too bad. Based on my experience, I find this to be extremely unlikely.

People often split skill up into "mechanics" and "gamesense". My mechanics are pretty much as bad as it's possible to get. The last game I played seriously was a 90s video game that's basically online asteroids and the last game before that I put any time into was the original SNES super mario kart. As you'd expect from someone who hasn't put significant time into a post-90s video game or any kind of FPS game, my aim and dodging are both atrocious. On top of that, I'm an old dude with slow reflexes and I was able to get to 2500 SR (roughly 60%-ile) by avoiding a few basic fallacies and blunders despite having approximately zero mechanical skill. If you're also an old dude with basically no FPS experience, you can do the same thing; if you have good reflexes or enough FPS experience to actually aim or dodge, you basically can't be worse mechanically than I am and you can do much better by avoiding a few basic mistakes.

The most common fallacy I see repeated is that you have to play DPS to move out of bronze or gold. The evidence people give for this is that, when a GM streamer plays flex, tank, or healer, they sometimes lose in bronze. I guess the idea is that, because the only way to ensure a 99.9% win rate in bronze is to be a GM level DPS player and play DPS, the best way to maintain a 55% or a 60% win rate is to play DPS, but this doesn't follow.

Healers and tanks are both very powerful in low ranks. Because low ranks feature both poor coordination and relatively poor aim (players with good coordination or aim tend to move up quickly), time-to-kill is very slow compared to higher ranks. As a result, an off healer can tilt the result of a 1v1 (and sometimes even a 2v1) matchup and a primary healer can often determine the result of a 2v1 matchup. Because coordination is poor, most matchups end up being 2v1 or 1v1. The flip side of the lack of coordination is that you'll almost never get help from teammates. It's common to see an enemy player walk into the middle of my team, attack someone, and then walk out while literally no one else notices. If the person being attacked is you, the other healer typically won't notice and will continue healing someone at full health and none of the classic "peel" characters will help or even notice what's happening. That means it's on you to pay attention to your surroundings and watching flank routes to avoid getting murdered.

If you can avoid getting murdered constantly and actually try to heal (as opposed to many healers at low ranks, who will try to kill people or stick to a single character and continue healing them all the time even if they're at full health), you'll outheal a primary healer half the time when playing an off healer and, as a primary healer, you'll usually be able to get 10k-12k healing per 10 min compared to 6k to 8k for most people in Silver (sometimes less if they're playing DPS Moira). That's like having an extra half a healer on your team, which basically makes the game 6.5 v 6 instead of 6v6. You can still lose a 6.5v6 game, and you'll lose plenty of games, but if you're consistently healing 50% more than a normal healer at your rank, you'll tend to move up even if you get a lot of major things wrong (heal order, healing when that only feeds the other team, etc.).

A corollary to having to watch out for yourself 95% of the time when playing a healer is that, as a character who can peel, you can actually watch out for your teammates and put your team at a significant advantage in 95% of games. As Zarya or Hog, if you just boringly play towards the front of your team, you can basically always save at least one teammate from death in a team fight, and you can often do this 2 or 3 times. Meanwhile, your counterpart on the other team is walking around looking for 1v1 matchups. If they find a good one, they'll probably kill someone, and if they don't (if they run into someone with a mobility skill or a counter like brig or reaper), they won't. Even in the case where they kill someone and you don't do a lot, you still provide as much value as them and, on average, you'll provide more value. A similar thing is true of many DPS characters, although it depends on the character (e.g., McCree is effective as a peeler, at least at the low ranks that I've played in). If you play a non-sniper DPS that isn't suited for peeling, you can find a DPS on your team who's looking for 1v1 fights and turn those fights into 2v1 fights (at low ranks, there's no shortage of these folks on both teams, so there are plenty of 1v1 fights you can control by making them 2v1).

All of these things I've mentioned amount to actually trying to help your team instead of going for flashy PotG setups or trying to dominate the entire team by yourself. If you say this in the abstract, it seems obvious, but most people think they're better than their rating. A survey of people's perception of their own skill level vs. their actual rating found that 1% of people thought they were overrated, 32% of people thought they were rated accurately, and the rest thought they were underrated. It doesn't help that OW is designed to make people think they're doing well when they're not and the best way to get "medals" or "play of the game" is to play in a way that severely reduces your odds of actually winning each game.

Outside of obvious gameplay mistakes, the other big thing that loses games is when someone tilts and either starts playing terribly or flips out and says something to enrage someone else on the team, who then starts playing terribly. I don't think you can actually do much about this directly, but you can never do this, so 5/6th of your team will do this at some base rate, whereas 6/6 of the other team will do this. Like all of the above, this won't cause you to win all of your games, but everything you do that increases your win rate makes a difference.

Poker players have the right attitude when they talk about leaks. The goal isn't to win every hand, it's to increase your EV by avoiding bad blunders (at high levels, it's about more than avoiding bad blunders, but we're talking about getting out of below median ranks, not becoming GM here). You're going to have terrible games where you get 5 people instalocking DPS. Your odds of winning a game are low, say 10%. If you get mad and pick DPS and reduce your odds even further (say this is to 2%), all that does is create a leak in your win rate during games when your teammates are being silly.

If you gain/lose 25 rating per game for a win or a loss, your average rating change from a game is 25 (W_rate - L_rate) = 25 (2W_rate - 1). Let's say 1/40 games are these silly games where your team decides to go all DPS. The per-game SR difference of trying to win these vs. soft throwing is maybe something like 1/40 * 25 (2 * 0.08) = 0.1. That doesn't sound like much and these numbers are just guesses, but everyone outside of very high-level games is full of leaks like these, and they add up. And if you look at a 60% win rate, which is pretty good considering that your influence is limited because you're only one person on a 6 person team, that only translates to an average of 5SR per game, so it doesn't actually take that many small leaks to really move your average SR gain or loss.

Appendix: general comments on online gaming, 20 years ago vs. today

Since I'm unlikely to write another blog post on gaming any time soon, here are some other random thoughts that won't fit with any other post. My last serious experience with online games was with a game from the 90s. Even though I'd heard that things were a lot worse, I was still surprised by it. IRL, the only time I encounter the same level and rate of pointless nastiness in a recreational activity is down at the bridge club (casual bridge games tend to be very nice). When I say pointless nastiness, I mean things like getting angry and then making nasty comments to a teammate mid-game. Even if your "criticism" is correct (and, if you review OW games or bridge hands, you'll see that these kinds of angry comments are almost never correct), this has virtually no chance of getting your partner to change their behavior and it has a pretty good chance of tilting them and making them play worse. If you're trying to win, there's no reason to do this and good reason to avoid this.

If you look at the online commentary for this, it's common to see people blaming kids, but this doesn't match my experience at all. For one thing, when I was playing video games in the 90s, a huge fraction of the online gaming population was made up of kids, and online game communities were nicer than they are today. Saying that "kids nowadays" are worse than kids used to be is a pastime that goes back thousands of years, but it's generally not true and there doesn't seem to be any reason to think that it's true here.

Additionally, this simply doesn't match what I saw. If I just look at comments over audio chat, there were a couple of times when some kids were nasty, but almost all of the comments are from people who sound like adults. Moreover, if I look at when I played games that were bad, a disproportionately large number of those games were late (after 2am eastern time, on the central/east server), where the relative population of adults is larger.

And if we look at bridge, the median age of an ACBL member is in the 70s, with an increase in age of a whopping 0.4 years per year.

Sure, maybe people tend to get more mature as they age, but in any particular activity, that effect seems to be dominated by other factors. I don't have enough data at hand to make a good guess as to what happened, but I'm entertained by the idea that this might have something to do with it:

I’ve said this before, but one of the single biggest culture shocks I’ve ever received was when I was talking to someone about five years younger than I was, and she said “Wait, you play video games? I’m surprised. You seem like way too much of a nerd to play video games. Isn’t that like a fratboy jock thing?”

Appendix: FAQ

Here are some responses to the most common online comments.

Plat? You suck at Overwatch

Yep. But I sucked roughly equally on both accounts (actually somewhat more on the masculine account because it was rated higher and I was playing a bit out of my depth). Also, that's not a question.

This is just a blog post, it's not an academic study, the results are crap.

There's nothing magic about academic papers. I have my name on a few publications, including one that won best paper award at the top conference in its field. My median blog post is more rigorous than my median paper or, for that matter, the median paper that I read.

When I write a paper, I have to deal with co-authors who push for putting in false or misleading material that makes the paper look good and my ability to push back against this has been fairly limited. On my blog, I don't have to deal with that and I can write up results that are accurate (to the best of my ability) even if it makes the result look less interesting or less likely to win an award.

Gamers have always been toxic, that's just nostalgia talking.

If I pull game logs for subspace, this seems to be false. YMMV depending on what games you played, I suppose. FWIW, airmash seems to be the modern version of subspace, and (until the game died), it was much more toxic than subspace even if you just compare on a per-game basis despite having much smaller games (25 people for a good sized game in airmash, vs. 95 for subspace).

This is totally invalid because you didn't talk on voice chat.

At the ranks I played, not talking on voice was the norm. It would be nice to have talking or not talking on voice chat be an independent variable, but that would require playing even more games to get data for another set of conditions, and if I wasn't going to do that, choosing the condition that's most common doesn't make the entire experiment invalid, IMO.

Some people report that, post "endorsements" patch, talking on voice chat is much more common. I tested this out by playing 20 (non-comp) games just after the "Paris" patch. Three had comments on voice chat. One was someone playing random music clips, one had someone screaming at someone else for playing incorrectly, and one had useful callouts on voice chat. It's possible I'd see something different with more games or in comp, but I don't think it's obvious that voice chat is common for most people after the "endorsements" patch.

Appendix: code and data

If you want to play with this data and model yourself, experiment with different priors, run a posterior predictive check, etc., here's a snippet of R code that embeds the data:

library(brms)
library(modelr)
library(tidybayes)
library(tidyverse)

d <- tribble(
  ~game_type, ~gender, ~xplain, ~games,
  "comp", "female", 7, 35,
  "comp", "male", 1, 23,
  "qp", "female", 6, 149,
  "qp", "male", 2, 132
)

d <- d %>% mutate(female = ifelse(gender == "female", 1, 0), comp = ifelse(game_type == "comp", 1, 0))


result <-
  brm(data = d, family = binomial,
      xplain | trials(games) ~ female + comp,
      prior = c(set_prior("normal(0,10)", class = "b")),
      iter = 25000, warmup = 500, cores = 4, chains = 4)

The model here is simple enough that I wouldn't expect the version of software used to significantly affect results, but in case you're curious, this was done with brms 2.7.0, rstan 2.18.2, on R 3.5.1.

Thanks to Leah Hanson, Sean Talts and Sean's math/stats reading group, Annie Cherkaev, Robert Schuessler, Wesley Aptekar-Cassels, Julia Evans, Paul Gowder, Jonathan Dahan, Bradley Boccuzzi, Akiva Leffert, and one or more anonymous commenters for comments/corrections/discussion.

February 17, 2019

Simon Zelazny (pzel)

Figuring out a gen_tcp:recv limitation February 17, 2019 11:00 PM

In which a surprisingly pernicious framed payload leads to OTP spelunking.

The setup: sending a string over TCP

Let's say you want to send the ASCII string Fiat lux! to an Erlang process listening on the other side of a TCP connection. Not a big deal, right?

Our sending application is written in Python. Here's what it might look like:

#!/usr/bin/env python3
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 7777))
data_to_send = b"Fiat Lux!"
sock.sendall(data_to_send)

... and here's the receiving Erlang application:

#!/usr/bin/env escript
main(_) ->
  {ok, L} = gen_tcp:listen(7777, [binary, {active, false}, {reuseaddr, true}]),
  {ok, Sock} = gen_tcp:accept(L),
  {ok, String} = gen_tcp:recv(Sock, 0),
  io:format("Got string: ~ts~n", [String]),
  erlang:halt(0).

If we start the Erlang receiver (in shell 1), then run the Python sender (in shell 2), we should see the receiver emit the following:

$ ./receive.escript
Got string: Fiat Lux!
$

As you can see, we optimistically sent all our data over TCP from the Python app, and received all that data, intact, on the other side. What's important here is that our Erlang socket is in passive mode, which means that incoming TCP data needs to be recv'd off of the socket. The second argument in gen_tcp:recv(Sock, 0) means that we want to read however many bytes are available to be read from the OS's network stack. In this case all our data was kindly provided to us in one nice chunk.

Success! Our real production application will be dealing with much bigger pieces of data, so it behooves us to test with a larger payload. Let's try a thousand characters.

More data

We update the sender and receiver as follows:

#!/usr/bin/env python3
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 7777))
data_to_send = b'a' * 1000
sock.sendall(data_to_send)
#!/usr/bin/env escript
main(_) ->
  {ok, L} = gen_tcp:listen(7777, [binary, {active, false}, {reuseaddr, true}]),
  {ok, Sock} = gen_tcp:accept(L),
  {ok, String} = gen_tcp:recv(Sock, 0),
  io:format("Got string of length: ~p~n", [byte_size(String)]),
  erlang:halt(0).

When we run our experiment, we see that our Erlang process does indeed get all 1000 bytes. Let's add one more zero to the payload.

#!/usr/bin/env python3
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 7777))
data_to_send = b'a' * 10000
sock.sendall(data_to_send)

And we hit our first snag!

Got string of length: 1460

Aha! Our gen_tcp:recv(Sock, 0) call asked the OS to give us whatever bytes it had ready in the TCP buffer, and so that's what we received. TCP is a streaming protocol, and there is no guarantee that a given sequence of bytes received on the socket will correspond to a logical message in our application layer. The low-effort way of handling this issue is by prefixing every logical message on the TCP socket with a known-width integer, representing the length of the message in bytes. "Low-effort" sounds like the kind of thing you put in place when the deadline was yesterday. Onward!

Let's take our initial string as an example. Instead of sending the following sequence of 9 bytes on the wire:

Ascii:     F    i   a    t   ␣    l    u    x   !

Binary:   70  105  97  116  32  108  117  120  33

We'd first prefix it with a 32-bit integer representing its size in bytes, and then append the binary, giving 13 bytes in total.

Ascii:     ␀   ␀  ␀   ␉  F    i   a    t   ␣    l    u    x   !

Binary:    0   0   0   9  70  105  97  116  32  108  117  120  33

Now, the first 4 bytes that reach our receiver can be interpreted as the length of the next logical message. We can use this number to tell gen_tcp:recv how many bytes we want to read from the socket.

To encode an integer into 32 bits, we'll use Python's struct module. struct.pack(">I", 9) will do exactly what we want: encode a 32-bit unsigned Integer (9, in this case) in Big-endian (or network) order.

#!/usr/bin/env python3
import socket
import struct
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 7777))
data_to_send = b'a' * 10000
header = struct.pack(">I", len(data_to_send))
sock.sendall(header + data_to_send)

On the decoding side, we'll break up the receiving into two parts:

1) Read 4 bytes from the socket, interpret these as Header, a 32-bit unsigned int.

2) Read Header bytes off the socket. The receiving Erlang process will 'block' until that much data is read (or until the other side disconnects). The received bytes constitute a logical message.

#!/usr/bin/env escript
main(_) ->
  {ok, L} = gen_tcp:listen(7777, [binary, {active, false}, {reuseaddr, true}]),
  {ok, Sock} = gen_tcp:accept(L),
  {ok, <<Header:32>>} = gen_tcp:recv(Sock, 4),
  io:format("Got header: ~p~n", [Header]),
  {ok, String} = gen_tcp:recv(Sock, Header),
  io:format("Got string of length: ~p~n", [byte_size(String)]),
  erlang:halt(0).

When we run our scripts, we'll see the Erlang receiver print the following:

Got header: 10000
Got string of length: 10000

Success! But apparently, our application needs to handle messages much bigger than 10 kilobytes. Let's see how far we can take this approach.

Yet more data

Can we do a megabyte? Ten? A hundred? Let's find out, using the following loop for the sender:

#!/usr/bin/env python3
import socket
import struct
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 7777))
for l in [1000, 1000*1000, 10*1000*1000, 100*1000*1000]:
    data_to_send = b'a' * l
    header = struct.pack(">I", len(data_to_send))
    sock.sendall(header + data_to_send)
sock.close()

...and a recursive receive function for the receiver:

#!/usr/bin/env escript
recv(Sock) ->
  {ok, <<Header:32>>} = gen_tcp:recv(Sock, 4),
  io:format("Got header: ~p~n", [Header]),
  {ok, String} = gen_tcp:recv(Sock, Header),
  io:format("Got string of length: ~p~n", [byte_size(String)]),
  recv(Sock).

main(_) ->
  {ok, L} = gen_tcp:listen(7777, [binary, {active, false}, {reuseaddr, true}]),
  {ok, Sock} = gen_tcp:accept(L),
  recv(Sock).

Running this will lead to our Erlang process crashing with an interesting message:

Got header: 1000
Got string of length: 1000
Got header: 1000000
Got string of length: 1000000
Got header: 10000000
Got string of length: 10000000
Got header: 100000000
escript: exception error: no match of right hand side value {error,enomem}

enomem looks like a strange kind of error indeed. It happens when we get the 100-megabyte header and attempt to read that data off the socket. Let's go spelunking to find out where this error is coming from.

Spelunking for {error, enomem}

First, let's take a look at what gen_tcp:recv does with its arguments. It seems that it checks inet_db to find our socket, and calls recv on that socket.

OK, let's check out inet_db. Looks like it retrieves module information stored via erlang:set_port_data, in the call above.

Grepping for calls to inet_db:register_socket reveals that multiple modules register themselves this way. Among these, we find one of particular interest.

lib/kernel/src/inet_tcp.erl
169:        inet_db:register_socket(S, ?MODULE),
177:        inet_db:register_socket(S, ?MODULE),

Let's see how inet_tcp.erl implements recv. Hmm, just a pass-through to prim_inet. Let's look there.

It seems here that our Erlang call-chain bottoms out in a call to ctl_cmd, which is itself a wrapper to erlang:port_control, sending control data over into C-land. We'll need to look at our TCP port driver to figure out what comes next.

    case ctl_cmd(S, ?TCP_REQ_RECV, [enc_time(Time), ?int32(Length)])

A slight hitch is finding the source code for this driver. Perhaps the macro ?TCP_REQ_RECV can help us find what we're after?

$  rg 'TCP_REQ_RECV'
lib/kernel/src/inet_int.hrl
100:-define(TCP_REQ_RECV,           42).

erts/preloaded/src/prim_inet.erl
584:    case ctl_cmd(S, ?TCP_REQ_RECV, [enc_time(Time), ?int32(Length)]) of

erts/emulator/drivers/common/inet_drv.c
735:#define TCP_REQ_RECV           42
10081:    case TCP_REQ_RECV: {
10112:  if (enq_async(INETP(desc), tbuf, TCP_REQ_RECV) < 0)

A-ha! inet_drv.c, here we come!

Indeed, this C function here, responsible for the actual call to sock_select, will proactively reject recv calls where the requested payload size n is bigger than TCP_MAX_PACKET_SIZE:

if (n > TCP_MAX_PACKET_SIZE)
    return ctl_error(ENOMEM, rbuf, rsize);

and TCP_MAX_PACKET_SIZE itself is defined in the same source file as:

#define TCP_MAX_PACKET_SIZE 0x4000000 /* 64 M */

thereby explaining our weird ENOMEM error.

Now, how to solve this conundrum? A possible approach would be to maintain some state in our receiver, optimistically read as much data as possible, and then try to reconstruct the logical messages, perhaps using something like erlang:decode_packet to take care of the book-keeping for us.

Taking a step back — and finding a clean solution

Before we jump to writing more code, let's consider our position. We're trying to read a framed message off of a TCP stream. It's been done thousands of times before. Surely the sagely developers whose combined experience is encoded in OTP have thought of an elegant solution to this problem?

It turns out that if you read the very long man entry for inet:setopts, you'll eventually come across this revealing paragraph:

{packet, PacketType}(TCP/IP sockets)

Defines the type of packets to use for a socket. Possible values:

raw | 0

No packaging is done.

1 | 2 | 4

Packets consist of a header specifying the number of bytes in the packet, followed by that number of bytes. The header length can be one, two, or four bytes, and containing an unsigned integer in big-endian byte order. Each send operation generates the header, and the header is stripped off on each receive operation.

The 4-byte header is limited to 2Gb.

Packets consist of a header specifying the number of bytes in the packet, followed by that number of bytes. Yes indeed they do! Let's try it out!

#!/usr/bin/env escript
recv(Sock) ->
  {ok, String} = gen_tcp:recv(Sock,0),
  io:format("Got string of length: ~p~n", [byte_size(String)]),
  recv(Sock).

main(_) ->
  {ok, L} = gen_tcp:listen(7777, [binary, {active, false}, {reuseaddr, true}, {packet, 4}]),
  {ok, Sock} = gen_tcp:accept(L),
  recv(Sock).

And the output is:

Got string of length: 1000
Got string of length: 1000000
Got string of length: 10000000
Got string of length: 100000000
escript: exception error: no match of right hand side value {error,closed}

Problem solved! (The last error is from a recv call on the socket after it has been closed from the Python side). Turns out that our TCP framing pattern is in fact so common, it's been subsumed by OTP as a mere option for gen_tcp sockets!

If you'd like to know why setting this option lets us sidestep the TCP_MAX_PACKET_SIZE check, I encourage you to take a dive into the OTP codebase and find out. It's surprisingly easy to navigate, and full of great code.

And if you ever find yourself fighting a networking problem using brute-force in Erlang, please consider the question: "Perhaps this was solved long ago and the solution lives in OTP?" Chances are, the answer is yes!

Ponylang (SeanTAllen)

Last Week in Pony - February 17, 2019 February 17, 2019 02:19 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

Luke Picciau (user545)

Gnubee Install Guide February 17, 2019 01:51 AM

The GnuBee is an open source, crowd-funded NAS system. It requires a fair bit of configuration to use, and I found the docs on the website to be fairly inadequate for setting up a working system, so after a lot of messing around I have come up with a guide to get to a fully working Debian system quickly. What you need: the GnuBee, an SD card, a 19V power supply, a USB UART cable.

Kevin Burke (kb)

Going Solo, Successfully February 17, 2019 12:48 AM

Three years ago I quit my job and started consulting full time. It's worked out really well. I get to spend more time doing things I like to do and I've been able to deliver great products for clients. I wanted to go over some tips for starting a successful consulting business.

  • Charge more - Everyone says it and it's true. I started out charging a monthly rate that was close to my full time salary / 12. This is not a good idea because you have overhead that your employer is no longer covering - health care is probably the biggest one, you don't have paid vacations, there may be unpaid downtime between contracts, and companies might not pay you. You need to be charging a lot more to just break even.

    I dread "what's your rate" conversations every time and they haven't gotten easier. Before I quote my rate I reread the details of the High Tech Employee Antitrust case to pump myself up - it reminds me that I'm negotiating with a company that couldn't care less really and I am the only one who's going to stand up for myself. If you think you don't need the extra money - get it anyway, and then donate more to charities at the end of the year/buy CD's/put it in the stock market/give it to the government. Amazon just made $11 billion and paid $0 in taxes; you are going to spend an additional dollar better than Amazon's executives will.

    If you are not sure how much to charge, quote each new client more than the last. Your quote is often a signal of your quality so it's not even really the case that demand slopes downwards as you charge more.

    If you are working with a client and they are very happy with your work and want to extend your contract consider asking for a higher rate. "Now that you know how good I am," etc.

  • Get the money - Signed contracts, work performed, don't matter until the money hits your bank account. I learned this the hard way. If a company is going under your invoices are worthless. You can hold onto the company IP but that's probably also worthless. You can sue but at the end you will win a judgment that you can collect money from a company that doesn't have any to pay you.

    Try to get as much of the contract as you can paid up front - I generally ask for half or more up front. If a company offers Net 30 ask if you can shorten it to Net 5 or 10 or submit your invoices in advance. Submit invoices on time - it's a very costly mistake and you won't learn its importance until it's too late.

    Try as hard as you can to figure out the financial health of the company - do your homework in the press if you can, or ask questions of your primary point of contact, like how much cash they are burning and how many months of runway they have. If a company is not forthcoming with this information it's a red flag that they may not be able to pay you.

    If you see any red flags - the company wants to cut the contract short, people start leaving, company suddenly cuts back on perks - tell your contact that you need to be paid upfront or you are not going to work anymore. If they push back on this they may not have the cash to pay you at all. It's a crappy situation but better to cut your losses than to work for someone that can't actually pay.

  • Don't charge by the hour - I have never actually done this so I can't speak to how bad it is but don't do this. You don't want a client to cut you loose at 3pm and suddenly you lose three hours you were counting on. Charge per week.

  • Get a lawyer - Get a lawyer to review every contract you sign. Read through them, flag concerning things to the lawyer. They will suggest language. Send the language to the company. You are not being difficult when you do this, the company does this all the time. Lawyers are expensive, expect to pay north of $400 per hour and contract review can take 30-60 minutes. This money is worth it.

    A good clause to try to push for is limitation of liability. You don't want to be in a situation where $2 million of damages occurred or a high value client left the company because of an error you pushed and the company is suddenly coming after everything you own. Similarly the company may want to protect against you trying to sue them for a high amount of damages to your reputation, future business etc. Limiting the total liability to the size of the contract, or a multiple of the size of the contract - on both sides - can be a good move.

  • Register as a Company - Consult with the lawyer you hired on what kind of company you want to be. Generally the more "company-like" you are the harder it is for companies to try to take your personal assets. I don't have employees or shareholders so I have a single member LLC that is disregarded for tax purposes — read this description from the IRS. Sometimes businesses are confused what this means when I tell them or try to sign up for things. Still, it is a good fit for me. It may not be for you - I am not a lawyer, you should talk with one, learn the tradeoffs and see what makes sense for your business.

  • Make Sure Contracts Are Signed With the Company - The contracts you sign should be between the client you are working with and your company NOT between the client and you personally. Discuss this with your lawyer.

  • Get an accountant - As a small business a lot of stuff is tax deductible - a home office, client travel, for example, even if it's just across town - and you want to make sure you are getting ~35% off on everything that you can. An accountant will help you with this.

  • Market yourself - Not necessarily ads or sponsorships, but: everyone you've worked with full time should know they can hire you now. If they don't then reach out to people and let them know. Put up a website that engineers can send to their boss. My website isn't fancy but it is effective. Order business cards - VistaPrint is garbage, order from moo.com. If you have a website or open source projects put a note at the bottom advertising that you're available for hire, like the one at the bottom of this post.

  • Set up separate accounts for everything - Open separate accounts for your business. Get a business credit card or just a separate cash back card on your personal account. I don't have a checking account registered for the business but I opened a second checking account that I call my "business account". Clients pay into that account and I pay the business credit card out of that account. I even have a separate Clipper card that I use for business travel.

    There are two reasons for this. It makes tax accounting a lot easier. I know that every transaction on the business Clipper card is work travel and can be expensed; I don't have to try to remember what I was doing when I paid $2 to SamTrans at 5:34pm on July 27.

    Second, if you don't keep good records for the business - if you "commingle" funds between your personal life and the business - it makes it much easier for clients to go after your personal assets, what's called "piercing the veil." Separate accounts (and discipline about transfers!) make it much easier to argue that your business income and spending and personal income and spending are separate even if you don't necessarily have the legal structures to back them up.

    I also set up a new Github account for every company I work with. This avoids any issues with emails going to the wrong place, or the need to grant/revoke permissions to any 3rd party tools a company uses. I use github.com/kevinburke/swish to swap SSH settings between my Github accounts:

    $ cat $(which notion-github)
    #!/usr/bin/env bash
    ${GOPATH}/bin/swish --identity-file ${HOME}/.ssh/github_notion_ed25519 --user kevinburkenotion
  • Balancing multiple clients: If you can do this or do things like charge retainers, great. I find it really hard to switch contexts so I work with one client at a time and treat it as a full time job. Do what works for you.

  • Give back to the tools that make you successful - I give a percentage of my earnings every year to support software tools that help me do my job - iTerm2, Vim, Go, Postgres, Node.js, Python, nginx, various other open source projects. You should consider doing this too. (If you are an open source maintainer reading this - tell me how I can pay you!!)

February 16, 2019

David Wilson (dw)

Threadless mode in Mitogen 0.3 February 16, 2019 10:00 PM

Mitogen has been explicitly multi-threaded since the design was first conceived. This choice is hard to regret, as it aligns well with the needs of operating systems like Windows, makes background tasks like proxying possible, and allows painless integration with existing programs where the user doesn't have to care how communication is implemented. Easy blocking APIs simply work as documented from any context, and magical timeouts, file transfers and routing happen in the background without effort.

The story has for the most part played out well, but as work on the Ansible extension revealed, this thread-centric worldview is more than somewhat idealized, and scenarios exist where background threads are not only problematic, but a serious hazard that works against us.

For that reason a new operating mode will hopefully soon be included, one where relatively minor structural restrictions are traded for no background thread at all. This article documents the reasoning behind threadless mode, and a strange set of circumstances that allow such a major feature to be supported with the same blocking API as exists today, and surprisingly minimal disruption to existing code.

Recap

Above is a rough view of Mitogen's process model, revealing a desirable symmetry as it currently exists. In the master program and replicated children, the user's code maintains full control of the main thread, with library communication requirements handled by a background thread using an identical implementation in every process.

Keeping the user in control of the main thread is important, as it possesses certain magical privileges. In Python it is the only thread from which signal handlers can be installed or executed, and on Linux some niche system interfaces require its participation.

When a method like remote_host.call(myfunc) is invoked, an outgoing message is constructed and enqueued with the Broker thread, and a callback handler is installed to cause any return value response message to be posted to another queue created especially to receive it. Meanwhile the thread that invoked Context.call(..) sleeps waiting for a message on the call's dedicated reply queue.
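
As a rough sketch of that request/reply shape - using hypothetical names, and the standard library's Queue in place of Mitogen's Latch, so this is an illustration rather than Mitogen's actual API - the flow looks something like this:

import queue       # named Queue on Python 2
import threading

# Hypothetical toy illustration of the shape described above.
class ToyBroker(object):
    def __init__(self):
        self._outgoing = queue.Queue()
        t = threading.Thread(target=self._run)
        t.daemon = True
        t.start()

    def enqueue(self, msg, reply_to):
        # Called from a user thread: hand the message to the broker thread.
        self._outgoing.put((msg, reply_to))

    def _run(self):
        while True:
            msg, reply_to = self._outgoing.get()
            # Pretend the network round-trip happens here, then post the
            # decoded reply to the queue created especially for this call.
            reply_to.put('reply to %r' % (msg,))

def call(broker, funcname):
    reply_queue = queue.Queue()            # dedicated reply queue for this call
    broker.enqueue(funcname, reply_queue)
    return reply_queue.get()               # the calling thread sleeps here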

Latches

Those queues aren't simply Queue.Queue, but a custom reimplementation added early during Ansible extension development, as deficiencies in Python 2.x threading began to manifest. Python 2 offers a choice between adding up to 50 ms of latency to each Queue.get(), or having waits execute with UNIX signals masked, thus preventing CTRL+C from interrupting the program. Given these options a reimplementation made plentiful sense.

The custom queue is called Latch, a name chosen simply because it was short and vaguely fitting. To say its existence is a great discomfort would be an understatement: reimplementing synchronization was never desired, even if just by leveraging OS facilities. True to tribal wisdom, the folly of Latch has been a vast time sink, costing many days hunting races and subtle misbehaviours, yet without it, good performance and usability is not possible on Python 2, and so it remains.
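
To make the idea concrete, here is a toy reduction of what such a primitive can look like: blocking in select() on a socketpair rather than in Queue.get(), so waits stay interruptible and wake-ups are cheap. This is only an illustration under those assumptions, not Mitogen's actual Latch:

import select
import socket
import threading

class ToyLatch(object):
    # Toy only: the real Latch handles far more edge cases.
    def __init__(self):
        self._lock = threading.Lock()
        self._items = []
        self._rsock, self._wsock = socket.socketpair()
        self._rsock.setblocking(False)

    def put(self, obj):
        with self._lock:
            self._items.append(obj)
        self._wsock.send(b'\x00')                  # one wake byte per item

    def get(self):
        while True:
            with self._lock:
                if self._items:
                    return self._items.pop(0)
            select.select([self._rsock], [], [])   # interruptible sleep
            try:
                self._rsock.recv(1)                # drain one wake byte
            except BlockingIOError:
                pass                               # another waiter drained it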

Due to this, when any thread blocks waiting for a result from a remote process, it always does so within Latch, a detail that will soon become important.

The Broker

Threading requirements are mostly due to Broker, a thread that has often changed role over time. Today its main function is to run an I/O multiplexer, like Twisted or asyncio. Except for some local file IO in master processes, broker thread code is asynchronous, regardless of whether it is communicating with a remote machine via an SSH subprocess or a local thread via a Latch.

When a user's thread is blocked on a reply queue, that thread isn't really blocked on a remote process - it is waiting for the broker thread to receive and decode any reply, then post it to the queue (or Latch) the thread is sleeping on.

Performance

Having a dedicated IO thread in a multi-threaded environment simplifies reasoning about communication, as events like unexpected disconnection always occur in a consistent location far from user code. But as is evident, it means every IO requires interaction of two threads in the local process, and when that communication is with a remote Mitogen process, a further two in the remote process.

It may come as no surprise that poor interaction with the OS scheduler often manifests: load balancing pushes related communicating threads out across distinct cores, where their execution schedule bears no resemblance to the inherent lock-step communication pattern, caused between processes by the request-reply structure of RPCs, and between threads of the same process by the Global Interpreter Lock. The range of undesirable effects defies simple description; it is sufficient to say that poor behaviour here can be disastrous.

To cope with this, the Ansible extension introduced CPU pinning. This feature locks related threads to one core, so that as a user thread enters a wait on the broker after sending it a message, the broker has much higher chance of being scheduled expediently, and for its use of shared resources (like the GIL) to be uncontended and exist in the cache of the CPU it runs on.
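
The mechanics of pinning a process are themselves simple; a minimal sketch (not the extension's actual logic, which must also decide which core each worker should use) might look like:

import os

def pin_to_core(core):
    # Linux-only: restrict the calling process (and the threads it spawns)
    # to a single CPU, so communicating peers share cache and are scheduled
    # promptly rather than being balanced across cores.
    os.sched_setaffinity(0, {core})    # 0 means "the calling process"

pin_to_core(1)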

Runs of tests/bench/roundtrip.py with and without pinning.

Pinned?   Round-trip delay (three runs)     Average
No        960 usec, 782 usec, 803 usec      848 usec ± 111 usec
Yes       198 usec, 197 usec, 197 usec      197 usec ± 1 usec

It is hard to overstate the value of pinning, as revealed by the more than 4x speedup visible in this stress test, but enabling it is a double-edged sword, as the scheduler loses the freedom to migrate processes to balance load, and no general pinning strategy is possible that does not approach the complexity of an entirely new scheduler. As a simple example, if two uncooperative processes (such as Ansible and, say, a database server) were to pin their busiest workers to the same CPU, both would suffer disastrous contention for resources that a scheduler could alleviate if it were permitted.

While performance loss due to scheduling could be considered a scheduler bug, it could be argued that expecting consistently low latency lock-step communication between arbitrary threads is unreasonable, and so it is desirable that threading rather than scheduling be considered at fault, especially as one and not the other is within our control.

The desire is not to remove threading entirely, but instead to provide an option to disable it where it makes sense. For example in Ansible, it would be possible to almost halve the number of running threads if worker processes switched to a threadless implementation, since there is no benefit in the otherwise single-threaded WorkerProcess having a distinct broker thread.

UNIX fork()

In its UNIX manifestation, fork() is a defective abstraction surviving through symbolism and dogma, conceived at a time long predating the 1984 actualization of the problem it failed to solve. It has remained obsolete ever since. A full description of this exceeds any one paragraph, and an article in drafting since October already in excess of 8,000 words has not yet succeeded in fully capturing it.

For our purposes it is sufficient to know that, as when mixed with most UNIX facilities, mixing fork() with threads is extremely unsafe, but many UNIX programs presently rely on it, such as in Ansible's forking of per-task worker processes. For that reason in the Ansible extension, Mitogen cannot be permanently active in the top-level process, but only after fork within a "connection multiplexer" subprocess, and within the per-task workers.

In upcoming work, there is a renewed desire for a broker to be active in the top-level process, but this is extremely difficult while remaining compatible with Ansible's existing forking model. A threadless mode would be immediately helpful there.

Python 2.4

Another manifestation of fork() trouble comes in Python 2.4, where the youthful implementation makes no attempt to repair its threading state after fork, leading to incurable deadlocks across the board. For this reason when running on Python 2.4, the Ansible extension disables its internal use of fork for isolation of certain tasks, but it is not enough, as deadlocks while starting subprocesses are also possible.

A common idea would be to forget about Python 2.4 as it is too old, much as it is tempting to imagine HTTP 0.9 does not exist, but as in that case, Python is treated not just as a language runtime, but as an established network protocol that must be implemented in order to communicate with infrastructure that will continue to exist long into the future.

Implementation Approach

Recall it is not possible for a user thread to block without waiting on a Latch. With threadless mode, we can instead reinterpret the presence of a waiting Latch as the user's indication some network IO is pending, and since the user cannot become unblocked until that IO is complete, and has given up forward execution in favour of waiting, Latch.get() becomes the only location where the IO loop must run, and only until the Latch that caused it to run has some result posted to it by the previous iteration.

import os

import mitogen

@mitogen.main(threadless=True)
def main(router):
    host1 = router.ssh(hostname='a.b.c')
    host2 = router.ssh(hostname='c.b.a')

    # Neither call wakes a broker thread; the messages are merely enqueued.
    call1 = host1.call_async(os.system, 'hostname')
    call2 = host2.call_async(os.system, 'hostname')

    # The broker loop runs inside get(), and only for as long as necessary.
    print call1.get().unpickle()
    print call2.get().unpickle()

In the example, after the (presently blocking) connection procedure completes, neither call_async() wakes any broker thread, as none exists. Instead they enqueue messages for the broker to run, but the broker implementation does not start execution until call1.get(), where get() is internally synchronized using Latch.

The broker loop ceases after a result becomes available for the Latch that is executing it, only to be restarted again for call2.get(), where it again runs until its result is available. In this way asynchronous execution progresses opportunistically, and only when the calling thread indicated it cannot progress until a result is available.
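
A hedged sketch of that opportunistic loop, using hypothetical names (the real change reuses the existing Latch and broker internals rather than introducing new classes):

class ThreadlessLatch(object):
    def __init__(self, broker):
        self._broker = broker        # the I/O multiplexer; it owns no thread
        self._items = []

    def put(self, obj):
        self._items.append(obj)

    def get(self):
        # Run the multiplexer from the calling thread, but only until some
        # iteration posts a result to *this* latch.
        while not self._items:
            self._broker.run_once()  # hypothetical: one poll/dispatch pass
        return self._items.pop(0)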

Owing to the inconvenient existence of Latch, an initial prototype was functional with only a 30 line change. In this way, an ugly and undesirable custom synchronization primitive has accidentally become the centrepiece of an important new feature.

Size Benefit

The intention is that threadless mode will become the new default in a future version. As it has much lower synchronization requirements, it becomes possible to move large pieces of code out of the bootstrap, including any relating to implementing the UNIX self-pipe trick, as required by Latch, and to wake the broker thread from user threads.

Instead this code can be moved to a new mitogen.threads module, where it can progressively upgrade an existing threadless mitogen.core, much like mitogen.parent already progressively upgrades it with an industrial-strength Poller as required.

Any code that can be removed from the bootstrap has an immediate benefit on cold start performance with large numbers of targets, as the bottleneck during cold start is often a restriction on bandwidth.

Performance Benefit

Threadless mode tallies well with existing desires to lower latency and resource consumption, such as the plan to reduce context switches.

Runs of tests/bench/roundtrip.py with and without threadless

Metric                    Threaded+Pinned    Threadless
Average Round-trip Time   201 usec           131 usec (-34.82%)
Elapsed Time              4.220 sec          3.243 sec (-23.15%)
Context Switches          304,330            40,037 (-86.84%)
Instructions              10,663,813,051     8,876,096,105 (-16.76%)
Branches                  2,146,781,967      1,784,930,498 (-15.85%)
Page Faults               6,412              17,529 (+173.37%)

Because no broker thread exists, no system calls are required to wake it when a message is enqueued, nor are any necessary to wake the user thread when a reply is received, nor any futex() calls due to one just-woke thread contending on a GIL that has not yet been released by a just-about-to-sleep peer. The effect across two communicating processes is a huge reduction in kernel/user mode switches, contributing to vastly reduced round-trip latency.

In the table an as-yet undiagnosed jump in page faults is visible. One possibility is that either the Python or C library allocator employs a different strategy in the absence of threads, the other is that a memory leak exists in the prototype.

Restrictions

Naturally this will place some restraints on execution. Transparent routing will no longer be quite so transparent, as it is not possible to execute a function call in a remote process that is also acting as a proxy to another process: proxying will not run while Dispatcher is busy executing the function call.

One simple solution is to start an additional child of the proxying process in which function calls will run, leaving its parent dedicated just to routing, i.e. exclusively dedicated to running what was previously the broker thread. It is expected this will require only a few lines of additional code to support in the Ansible extension.

For children of a threadless master, import statements will hang while the master is otherwise busy, but this is not much of a problem, since import statements usually happen once shortly after the first parent->child call, when the master will be waiting in a Latch.

For threadless children, no background thread exists to notice a parent has disconnected, and to ensure the process shuts down gracefully in case the main thread has hung. Some options are possible, including starting a subprocess for the task, or supporting SIGIO-based asynchronous IO, so the broker thread can run from the signal handler and notice the parent is gone.

Another restriction is that when threadless mode is enabled, Mitogen primitives cannot be used from multiple threads. After some consideration, while possible to support, it does not seem worth the complexity, and would prevent the aforementioned reduction of bootstrap code size.

Ongoing Work

Mitogen has quite an ugly concept of Services, added in a hurry during the initial Ansible extension development. Services represent a bundle of a callable method exposed to the network, a security policy determining who may call it, and an execution policy governing its concurrency requirements. Service execution always happens in a background thread pool, and is used to implement things like file transfer in the Ansible extension.

Despite heavy use, it has always been an ugly feature as it partially duplicates the normal parent->child function call mechanism. Looking at services from the perspective of threadless mode suggests some notion of a "threadless service", and such a threadless service looks even more like a plain function call than before.

It is possible that as part of the threadless work, the unification of function calls and services may finally happen, although no design for it is certain yet.

Summary

There are doubtlessly many edge cases left to discover, but threadless mode looks very doable, and promises to make Mitogen suitable in even more scenarios than before.

Until next time!


Jan van den Berg (j11g)

Ten years on Twitter 🔟❤️ February 16, 2019 07:44 AM

Today marks my ten year anniversary on Twitter! There are few other web services I have been using for ten years. Sure, I have been e-mailing and blogging for longer, but those are activities — like browsing — and not specifically tied to one service (e.g. Gmail is just one of many mail services). And after ten years, Twitter is still a valuable and fun addition to life online. But it takes a bit of work to keep it fun and interesting.

TL;DR

  • Twitter is your Bonsai tree: cut and trim.
  • Use the Search tab, it’s awesome!
  • Stay away from political discussions.
  • Be nice! No snark.
  • Bookmark all the things.

Twitter, the protocol

Twitter, of course, provides short synchronous one-to-all updates. In comparison, mail and blogging are asynchronous. Their feedback loop is different and they're less instant. And WhatsApp or messaging are forms of one-to-many communication and they're not public (so not one-to-all). So Twitter takes a unique place among these communication options.

Effectively the service Twitter provides is its own thing. Because Twitter is more a protocol, or an internet utility if you like. And more often than not, protocols or utilities tend to get used in ways they weren't supposed to. I've written about Twitter many times before. And I love blogging and RSS, but Twitter for me is still the place for near real-time updates. This post is part celebration of Twitter and part tips on how I, personally, use this protocol to keep it fun and interesting.

Bonsai

Twitter can be many things to many people. For some people it can be the number one place to get their news on politics. For others Twitter is all about comedy (Twitter comedy is certainly a fun place!) or sports (I do follow quite a bit of NBA news). And some people just jump in, enjoy the moment, not caring about what came before and logging off again. And that is fine, but that is just not how I roll. When I follow you, I care about what you have to say, so I make an effort to read it.

So I am careful about following people that tweet very often. When I check out a profile page, and see a user with 45,978 updates, that’s usually an indication that I will not follow that account. But, this is me. I see my Twitter timeline like a bonsai tree, cutting and trimming is an integral part of keeping things manageable. Because when you’re not careful, Twitter can become overwhelming. Certainly when you’re a strict chronological timeline user, like I am. But, sparingly following accounts can make you miss out on great stuff, right?

Search tab

My solution to this problem is the Search tab (available in the app and on mobile). Because this tab is actually pretty good! Twitter knows my interests based on a cross-section of accounts I follow, and in this tab it makes a nice selection of tweets that I need to see. It is my second home, my second timeline. Usually I catch up there on interesting things from otherwise loud Twitter accounts (i.e. lots of tech accounts that I don't follow). So Twitter helps point out things that I would still like to see. I get the best of both worlds. It's great!

Politics

There are few subjects as flammable as politics on Twitter. So I try to stay away from politics and try not to get involved in political discussions. That doesn’t mean I am not aware of things going on, or that I am not interested in politics. Quite the opposite! I just don’t think Twitter is the best place for political debate. The new 280 character limit was an improvement, but it’s still too short for real discussions or nuance (maybe this is true for the internet as a whole). Sure, certain threads can provide insight, and some people really know what they’re talking about. But I will think twice before personally entering a discussion. I do follow quite a bit of blogs/sites on politics and Twitter helps me to point to those things. These places usually help me more in understanding things that are otherwise hard to express in 280 characters.

Be positive

It is very easy to be dismissive or negative on Twitter. But very little good comes from that. So I always try to add something positive. I recently came across this tweet, and I think this sums it up rather well:

Pointer

As stated, Twitter can be many things to many people. But from day one, for me it has always been a place to point and get pointed to interesting things. The best Twitter for me is Twitter as a jump-off zone. My love for Twitter comes from the experience of being pointed to great books, movies, blogs, (music) videos and podcasts. And I am a heavy user of the bookmark option. (I tend to like very little on Twitter, which is more of a thank you nowadays.) But I bookmark all the things. Usually I scan and read my timeline on mobile, bookmark the interesting things and come back to them later in the day on a PC.

What’s next?

I had been blogging for a few years when Twitter came along. So I have never been able to shake the feeling of seeing Twitter as a micro blog for everyone. (Which is just one of its uses.) I am also aware of concepts like micro.blog, matrix.org or Mastodon. Services that, at the very least, have been inspired by Twitter, and build further on the idea of a communication protocol. But the thing is, Twitter was first, and Twitter is where everybody is. It’s part of the plumbing of the internet now, I don’t see it going away soon and that is all right by me! Cheers!

The post Ten years on Twitter 🔟❤️ appeared first on Jan van den Berg.

February 15, 2019

Pierre Chapuis (catwell)

Goodbye Lima February 15, 2019 07:40 PM

You may have heard it already: five years after I joined Lima, the company is shutting down.

Obviously, things did not go like we hoped they would. Customers are disappointed, and wondering what will happen now. Let me try to answer some of the questions I read online the best I can.

Please note that this is my personal take on things, and does not represent the views of anyone else but me (i.e. not Lima, not other employees...).

What happened to the company exactly?

Lima as a company no longer exists. It ran out of money. Its employees (including me) have all been fired, and its assets will be sold to pay its debts.

Regarding why the company died, it is a long story and it is not my place to tell it all. What I can say is that it ran into unexpected funding problems in early 2017, shortly after we started shipping the Lima Ultra. During most of 2017, there was strong demand for the product but we could not fulfill it because we did not have enough cash to pay for production and shipping (Remember the never-ending waiting list?) At the end of the year, we had to fire a large part of the team and we switched our business model to sell our software to other companies. We made a deal where we worked for another startup. The deal was good enough to keep the company afloat and the product alive for a year, but it forced us to stop selling Lima devices. What happened recently is that this deal eventually fell through, leaving us with no viable options.

This past year was not the best time of my life, or for any of the other employees who stayed. Many of us could have left for much better jobs at any time, some did and I cannot blame them. All those who stayed on board all this time did so hoping for a better end for the company and its customers.

What will happen to the devices?

Once Lima's servers shut down, Lima will keep working on your local LAN with the devices you have already paired with it. However, a lot of things will stop working.

First, it won't be possible to add new devices to the system. That's because, when you log a new device into Lima, you do so with an email and password. To find out which Lima those credentials belong to, the system asks a server, and that server won't answer anymore.

Second, it won't be possible to reset your password, because email confirmation will be broken. If you have forgotten your password, change it now while the servers are still up.

Third, the sharing feature will be broken, because it relies on sending HTTP requests to relay servers which will go down as well.

Finally, it won't be possible to access Lima from outside your home. This is a little harder to explain than the rest. Basically all communications between anything related to Lima (Lima devices, your devices, servers...) happen in a peer-to-peer VPN. To "locate" devices within the VPN (basically figure out how to talk to something), devices rely on a node which is called the ZVPN master. The IP address and public key of that node are hardcoded into every Lima client, and that node will go down as well. The use of that node is not needed on local networks because Lima devices and applications have a protocol to pair with other devices associated to the same account on a LAN without talking to any server.

Is there a risk for my personal data?

At the moment, not that I know of. Your data was never stored on Lima's servers, and all data traffic going through relay servers is end-to-end encrypted, which means that even if an attacker took control of one they couldn't decipher your data.

However in the long run there are two issues.

First, we won't be able to publish updates for the Lima firmware and applications anymore. If a security issue is found in one of the components they use, they may become vulnerable with no way to fix them.

Second, if someone were to acquire all the assets of Lima, including the domain and code signing certificate, they could theoretically do everything Lima was able to do, including publishing updates. That means they could publish malicious updates of the applications and firmware.

That second issue sounds scary but I do not think there is any chance it will happen. Potential acquirers will probably be more interested in Lima's technological IP; there is very little chance that an acquirer will get all the assets necessary for such an attack, and even if they do, they probably won't have an interest in performing it. Even if it did happen, it would be easy to notice. Still, I have to mention it for transparency.

What I will personally do now, and what I advise users to do as well, is export all my data out of Lima, unplug the device and uninstall all the applications.

Note: If you have problems when trying to recover your data (due to e.g. a hardware issue with the USB drive), do not uninstall the applications. The data on your desktop might sometimes help recovering some of the files.

If you have an issue with the Decrypt Tool, check here for potential answers.

What can users replace Lima with?

It depends on the users. I don't know anything that is exactly like Lima. There was Helixee, which I have never tried out, but I just found out they are shutting down as well. I also learned that a project I had never heard about before called Amber had a special offer for Lima customers.

For technical people, you can probably do most of what you were doing with Lima with a Synology NAS, or a setup based on some small computer and Open Source software such as Nextcloud or Cozy Cloud.

However, Lima was never designed for technical customers. It was built for, marketed to and mostly bought by non-technical people. For them, I don't have a good answer. I heard that WD My Cloud Home had become a lot better than it once was, but I have not tried it personally.

Can you open-source the code?

To the best of my knowledge, there is no way that can happen. This makes me extremely sad, especially since I know there are parts of the code I would love to reuse myself, and that could be useful to other projects.

The reason why we cannot open-source is that the code does not belong to us, the employees, or the CEO. Intellectual property is considered an asset of a bankrupt company, and as such will be sold to the highest bidder to pay the company's debts.

That being said, Lima has contributed some source code to a few Open Source projects already. Most importantly we fixed the issues in OSXFUSE that prevented it from being used for something like Lima, and those fixes are now in the main branch.

Completely independently from the company, the former CTO of Lima has also released a project which looks a lot like a second, fully decentralized iteration of the Lima network layer ZVPN (using a DHT instead of a master node, and WireGuard instead of TLS). Let me be clear: this project contains no code or IP from Lima, it is a clean room implementation.

Can you give us root access to the device?

For Lima Original, no, I think that would be impossible (or rather, I can't see a solution that doesn't involve soldering...). The device is not worth much today anyway, its specs are so low I don't think you could run any other private cloud software on it.

For Lima Ultra, a few of us ex-Lima employees (and the CEO) are trying to figure out a way to let users get root access. We can't promise anything, but we will keep you informed if we do.

EDIT (2019-02-18): We did it, check this out!

Why does it say something different in the Kickstarter FAQ?

Some people have mentioned that what was happening was not in line with what had been said in the Kickstarter FAQ.

This FAQ was written in 2013, before I or any other Lima developer joined the company. At the time Lima was a very small project with two founders trying to raise $70,000 to make their dream happen. Instead they raised $1,229,074, hired 12 people (including me), and the rest is history.

I do not think we have communicated like that ever since, especially regarding decentralization. As far as I know, we have been transparent that our servers were needed for some major features of the product, as was obvious the few times they went down. You may ask why we didn't amend this page then, and the answer is (I think) that it is technically impossible to edit it after the campaign is over.

Regarding Open Source, I sincerely believe the CEO of Lima would have done it if it was possible, but with the success of the Kickstarter the company had to take VC funding very early on (see below), and from that moment on I do not think it was in his hands.

Where did all that Kickstarter money go?

OK, let's address this last. What Kickstarter money?

Yeah, the founders raised over a million dollars. But do you remember how much the backers paid for those devices? From $59 to $79 each. Well, as bad as the hardware was, it was planned for about 1000 devices, not over 10,000. And it was pretty expensive.

I don't know the exact figures, but basically Lima did not make money on those devices, or no significant amount at least. Which is why it raised extra cash from VCs just afterwards: to pay the team that worked on the project, to fund the production of more devices to sell, and so on.

If you still think something shady went on with that money, rest assured: when a company like Lima goes bankrupt, its books are closely investigated by the state, which is one of its main creditors. So if you are right, the people responsible will end up in jail. (Spoiler: I really don't think it will happen.)

What are you going to do next?

Yes, I have plans.

No, they are not in any way related to Lima.

I will tell you more next month, probably.

Gustaf Erikson (gerikson)

Fotografiska, 14 Feb 2019 February 15, 2019 06:13 PM

Jonas Bendiksen - The Last Testament

Magnum photographer photographs seven people around the world who claim they are Jesus Christ. Great reportage.

Anja Niemi - In Character

Self-portraits (sometimes doubled), with that “2-stops overexposed Portra” aesthetic that the kids like so much these days. It’s nothing we haven’t seen before.

Kirsty Mitchell - Wonderland

Exceedingly lush tableaux, backed by a tragic backstory (the memory of the creator’s deceased mother) and a hugely successful Kickstarter campaign. There’s no denying the craftsmanship or the quality of the work, but somehow it feels a bit weird for an artist to so publicly involve crowdfunding in something so private. On the other hand the work of Niemi (above) struck me as very cold and solitary, so what do I know about how artists get inspiration from others.

Pierre Chapuis (catwell)

Software Architecture Principles February 15, 2019 10:20 AM

This is just a short post to share what I now consider, after 10 years in the industry (and almost twice as many writing code), my core software architecture principles.

You may or may not agree with them all, but if you design software or systems, you should have a similar list in your head; it really helps a lot when making decisions.

Without further ado, the principles are:

  • Separation of Concerns often trumps not repeating oneself (DRY). In other words, avoiding duplication does not justify introducing coupling (see the small sketch after this list).

  • Gall's Law: "A complex system that works is invariably found to have evolved from a simple system that worked."

  • Conway's Law: "Organizations produce designs which are copies of their communication structures."

  • When writing code or designing, stop and think "consequences". What will be the impact of what you are doing on the rest of the systems? Could there be adverse side-effects?

  • Think about debuggability in production. There is nothing worse than having your software break and not being able to figure out why. Do not automate things you do not understand.

  • Write code that is easy to delete, not easy to extend.
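
To illustrate the first principle with a deliberately tiny, hypothetical example: two functions that happen to look alike today are often better left duplicated than merged behind a shared helper, because merging couples two concerns that change for different reasons.

def format_invoice_line(item, price):
    return "%s: $%.2f" % (item, price)

def format_refund_line(item, amount):
    # Looks identical to the invoice formatter today, but refunds may soon
    # need a reason code while invoices will not; keeping the two functions
    # separate keeps that future change local.
    return "%s: $%.2f" % (item, amount)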

Andreas Zwinkau (qznc)

The Spartan Web February 15, 2019 12:00 AM

Defining a label for websites I like to visit and would like to see more of.

Read full article!

February 14, 2019

Jan van den Berg (j11g)

Blue Bananas – Wouter de Vries jr. & Thiemo van Rossum February 14, 2019 06:36 PM

Blauwe Bananen (Blue Bananas) is a management book that was number one for 38 days on managementboek.nl. It is aimed at people who generally don’t read management books. So it sometimes tries to be unnecessarily funny, seemingly afraid to alienate the reader with otherwise dry concepts. Nonetheless the message itself is pretty solid. The theme being: how to become a blue banana. A blue banana is a business with a unique skill set or proposition.

Blue Bananas – Wouter de Vries jr. & Thiemo van Rossum (2012) – 94 pages

That the message carries merit is not a surprise. This book unabashedly builds on the famous organisational and business strategy theories laid out by Treacy & Wiersema and Hamel & Prahalad. The book introduces readers to a succinct and on-point summary of their concepts. It does so by guiding the reader through four steps: Pursuits, Promises, Perception, Proof (freely translated by me from the Dutch B letter words).

With these steps the book makes the theory practical and consequently is very direct. Which is a good thing. To further cement the theory it offers 29 exercises and practical thought experiments (Things like “write down what you think are unique talents of your organisation”). Overall it does a good job of landing one of the main messages: it does not matter what value you add, if your customer does not perceive it as such. Everything you do as an organisation should add value to your customers’ experience.

If you rarely read management books, Blue Bananas can be a good starting point and offers valid questions of how to add value to your organisation.

The post Blue Bananas – Wouter de Vries jr. & Thiemo van Rossum appeared first on Jan van den Berg.

February 13, 2019

Derek Jones (derek-jones)

Offer of free analysis of your software engineering data February 13, 2019 03:02 AM

Since the start of this year, I have been telling people that I am willing to analyze their software engineering data for free, provided they are willing to make the data public; I also offer to anonymize the data for them, as part of the free service. Alternatively you could read this book, and do the analysis yourself.

What will you get out of me analyzing your data?

My aim is to find patterns of behavior that will be useful to you. What is useful to you? You have to be the judge of that. It is possible that I will not find anything useful, or perhaps any patterns at all; this does not happen very often. Over the last year I have found (what I think are useful) patterns in several hundred datasets, with one dataset that I am still scratching my head over.

Data analysis is a two-way conversation. I find some patterns, and we chat about them, hopefully you will say one of them is useful, or point me in a related direction, or even a completely new direction; the process is iterative.

The requirement that an anonymized form of the data be made public is likely to significantly reduce the offers I receive.

There is another requirement that I don’t say much about: the data has to be interesting.

What makes software engineering data interesting, or at least interesting to me?

There has to be lots of it. How much is lots?

Well, that depends on the kind of data. Many kinds of measurements of source code are generally available by the truck load. Measurements relating to human involvement in software development are harder to come by, but becoming more common.

If somebody has a few thousand measurements of some development related software activity, I am very interested. However, depending on the topic, I might even be interested in a couple of dozen measurements.

Some measurements are very rare, and I would settle for as few as two measurements. For instance, multiple implementations of the same set of requirements provides information on system development variability; I was interested in five measurements of the lines of source in five distinct Pascal compilers for the same machine.

Effort estimation data used to be rare; published papers sometimes used to include a table containing the estimate/actual data, which was once gold-dust. These days I would probably only be interested if there were a few hundred estimates, but it would depend on what was being estimated.

If you have some software engineering data that you think I might be interested in, please email to tell me something about the data (and perhaps what you would like to know about it). I’m always open to a chat.

If we both agree that it’s worth looking at your data (I will ask you to confirm that you have the rights to make it public), then you send me the data and off we go.

February 10, 2019

David Wilson (dw)

Mitogen v0.2.4 released February 10, 2019 11:59 PM

Mitogen for Ansible v0.2.4 has been released. This version is noteworthy as it contains major refinements to the core library and Ansible extension to improve their behaviour during larger Ansible runs.

Work on scalability is far from complete, as it progresses towards inclusion of a patch held back since last summer to introduce per-CPU multiplexers. The current idea is to exhaust profiling gains from a single process before landing it, as all single-CPU gains continue to apply in that case, and there is much less risk of inefficiency being hidden in noise created by multiple multiplexer processes.

Please kick the tires, and as always, bug reports are welcome!


Ponylang (SeanTAllen)

Last Week in Pony - February 10, 2019 February 10, 2019 04:03 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

February 09, 2019

Unrelenting Technology (myfreeweb)

haha wow my SoundFixer addon is on the addons.mozilla.org front page February 09, 2019 04:53 PM


Nikita Voloboev (nikivi)

What problem did you encounter? February 09, 2019 03:59 PM

What problem did you encounter? I updated the article to include `:simlayers {:o-mode {:key :o}}` for launcher key blocks as it is needed and was missing.

I am using Mojave and you can look at my config on GitHub for reference of a working config I use currently.

Alex Wilson (mrwilson)

Notes from the Week #18 February 09, 2019 12:00 AM

Monday was pretty meeting-focused.

We had a huddle on deriving a set of SLOs from our initial graphite SLIs. The outcome of the session was that our metrics needed further refinement: what we actually want is a response-time bound for well-formed requests, and a separate threshold for the number of queries that time out or are invalid, rather than a bound on overall request latency.

Our second session was a retrospective on how we handle ‘walk-ups’. Shift is pretty lucky that we are surrounded by our customers, which keeps feedback loops tight, but we can become overloaded by questions and distractions. We used index cards to keep track of the number of walk-ups broken down by subject, and decided to productise it into a Google Form for longer-term storage and analytics.

Tuesday

I paired with Narayan on Tuesday to make some efficiency improvements to our generated firewall configurations. We’ve been less than judicious with some of the templated rule-sets and this was an opportunity to smooth out our global Puppet runtimes. We did this by putting feature flags on our security-related puppet classes and started to turn off parts that weren’t being used.

I also went along to my first weeknotes meetup! :D

The first venue that Steve suggested turned out to be mostly booked by a speed-dating event so we ended up decamping to a cocktail bar nearby. I always like meeting new people (big shout-out to Dan and Giuseppe) and it’s a weird sensation to hang out for the first time with people who you only know from the Twitter-sphere, but good times were had by all even if I did forget to actually eat and had a delicious two-pint dinner instead.

Wednesday

As tradition dictates, it was 20% time day — I finally got around to releasing my ProtonMail DNS terraform module and pushing it to the terraform registry.

I set aside a bit of time for attempting an upgrade of part of our Puppet systems and it turns out that it’s going to be a fair bit more work than I thought — we’re using the open-source version and we’ve architected it in a way that worked when we first brought it up but makes it harder to incrementally scale.

Thursday and Friday

I did a fair bit of pairing this week, in total!

I paired with Petrut on AWS optimisations and with Seng on improving the state of our SSL certificates, both of which required doing a fair bit of Terraform-ing (I swear this is 90% of my development time now, the rest is Python). I miss TDD’d app development. :(

Speaking of app development, Stephen made an initial release of a small puppet-token Slack app that builds on the data that we started piping into DynamoDB from our Puppet runs.

We have a monolithic shared puppet codebase and because we practice trunk-based development we also use a physical mutex to make sure only one team is committing/deploying at a time. This token, an Android plush, often requires developers to go and “search” for it to acquire the lock (a practice that worked when we were small but which Shift is working to make more scale-appropriate).

We can type /puppet-token into our Slack and it will tell you where it thinks the token is!

An improvement to our process, but the next step will be using these events (START, FINISH) to determine whether a run is in progress and use that as a mutex instead of our token. I’m excited for more of these small human-centric improvements.

This weekend, I ordered a copy of Shoshana Zuboff’s The Age of Surveillance Capitalism — probably required reading for working in ad-tech? — and I’m looking forward to eating through it, I’m going to regenerate my GPG keys, and then quite probably bake a Lemon Drizzle cake (which will require me taking at least some of it into work).

Originally published at blog.probablyfine.co.uk on February 9, 2019.

February 08, 2019

Ponylang (SeanTAllen)

Pony Stable 0.2.0 Released February 08, 2019 04:00 AM

Pony-stable 0.2.0 is a recommended release. It fixes a couple bugs that could result in end user issues.

February 07, 2019

Grzegorz Antoniak (dark_grimoire)

C++: Shooting yourself in the foot #4 February 07, 2019 06:00 AM

C++11 has introduced a better way to generate random numbers than the immortal srand(time(NULL)) and rand() % N method. However, this family of functions sometimes may behave in a not very intuitive way, especially when it comes to multi-platform programming.

TL;DR: before commenting, please at least read …

February 06, 2019

Jeff Carpenter (jeffcarp)

Kaiser SF Half Race Report February 06, 2019 09:46 PM

Overall: It went great, I PR’d by 10 minutes! The course is super fast and the light drizzle of rain didn’t really put a damper on things. Report, T-0:20: I arrived and was able to use the bathroom – they did a great job of making sure there were enough port-a-potties. After that, since it was drizzling, I hid under a tree to the side of the start line with a bunch of other runners who looked like they were from a club and knew what they were doing.

February 05, 2019

Bogdan Popa (bogdan)

Google Groups Without Google Accounts February 05, 2019 06:00 PM

It turns out that when you delete your Google accounts, Google unsubscribes you from any (public and private) Google Groups you’re a member of. I found out about this only because my inbox traffic these past couple of days felt unusually light so I went and looked at racket-users and, lo and behold, there were a bunch of new posts I hadn’t received. I get it. They want to avoid sending emails to an address that, from their perspective, no longer exists.

February 04, 2019

Bogdan Popa (bogdan)

Bye, Bye, Google February 04, 2019 07:00 AM

I spent this past weekend de-Google-ifying my life and, despite my expectations, it wasn’t too hard to do. I started by moving all of my websites off of Google App Engine and onto a dedicated box that I had already owned. That was straightforward enough. Next, I removed any Google Analytics snippets from each of them and replaced those with my own analytics server that I had built a while back (it doesn’t store any PII, only aggregate information (and very little of that, too)).

Jan van den Berg (j11g)

Plato – R.M. Hare February 04, 2019 12:41 AM

Writing short introductions to classic philosophers are hard. This book tries, but falls a bit short as a true introduction.

Plato – R.M. Hare (1983) – 117 pages

Plato, the first documented Western philosopher, set the pace for 25 centuries of philosophy. This book explains the culture and setting in which Plato developed his philosophy, and their interrelation. It also touches on the main aspects of his philosophy as well as you could possibly expect in a short book of around 100 pages. But I do have two issues with this book.

Firstly, as a reader you need to bring your A-game. There are quite a few names and concepts thrown at you. I assume that people who pick up this book know very little about philosophy, so this seems like a mismatch. Secondly, it does not help that most of the language is highly academic (note, I did read a Dutch translation). Two or three chapters were decidedly easier to read than the rest of the book, because the language was completely different.

So even if I picked up a few things I would not suggest this book as an introduction to Plato. (Reasons are similar to the Kierkegaard book.) In 2019, if you need an introduction I would suggest you read the Wikipedia page. It’s clearer in language and structure than this book from 1983. I expect somewhere there must be easier introductions to philosophy, geared towards true novices. If not, consider it an untapped market (or maybe we have it already and it’s called Wikipedia).

The post Plato – R.M. Hare appeared first on Jan van den Berg.

February 03, 2019

Jeff Carpenter (jeffcarp)

Building a Running Pace Calculator With AMP February 03, 2019 11:45 PM

Sometimes you need to know how fast you need to run to achieve a personal best time. Previously the way I did this was to search “running pace calculator” and follow and use one of the top results. However, I was doing this almost always on mobile and none of those results are very mobile friendly. There might be good native apps for this, but I’m a fan of the web and don’t want to download an extra app if I can avoid it.

How I Host Static Sites With Automatic Deploy on Green February 03, 2019 11:19 PM

This site, jeffcarp.com, is written in Markdown and uses the Hugo static site generator. This post walks you through how I set up automatic building, testing, and deployment to Firebase hosting. Project Setup: I assume we’re starting from a working Hugo project; for more on how to set that up, see the Hugo docs. Testing Setup: I want the site to be Deploy-on-Green (i.e. deployed only if it passes the tests). The CI setup I use is GCP Cloud Build.

Stig Brautaset (stig)

Musical Goals January Update February 03, 2019 10:38 PM

The first of (hopefully) monthly posts with updates on my musical goals for 2019. I cover achievements in January, and new goals for February.

Ponylang (SeanTAllen)

Last Week in Pony - February 3, 2019 February 03, 2019 03:57 PM

Last Week In Pony is a weekly blog post to catch you up on the latest news for the Pony programming language. To learn more about Pony check out our website, our Twitter account @ponylang, or our Zulip community.

Got something you think should be featured? There’s a GitHub issue for that! Add a comment to the open “Last Week in Pony” issue.

Gergely Nagy (algernon)

NOP NOP NOP says the clock, on the bug-fuse falls a lock February 03, 2019 01:00 AM

Lately, I've been porting Kaleidoscope to keyboards that happened to land on my desk for one reason or the other. Keyboards such as the ErgoDox EZ, the Atreus, Splitography, and most recently, KBD4x. In almost every case, I ran into weird issues I couldn't immediately explain, where the symptoms weren't search-engine friendly. There wasn't anything obviously wrong with the ported code, either, because the same code worked on another board. Figuring out what went wrong and where was an incredibly frustrating process. I'd like to save others from having to do the same digging, debugging, hair-tearing I did, so we'll look at these three cases today.

A tale of fuses and woe

The first problem I ran into was during the Splitography port. It should have been a straightforward port, because it is built on top of ATMegaKeyboard, like the Atreus port, which has been working wonderfully. I prepared the port in advance, before the keyboard arrived, and was eagerly awaiting the shipment to try it. I was confident it would work out of the box. It did not: the left half was dead.

I quickly flashed QMK back to verify that the hardware was fine, and it was: both halves worked with QMK. What was I doing wrong, then? I verified that the pinout was correct, and checked with a simple key logger that the problem was not that we weren't acting on key presses - the firmware didn't even see them. This was the first clue, but I wasn't paying enough attention, and went off comparing ATMegaKeyboard's matrix scanning code to QMK's. I even copied QMK's matrix scanner verbatim - to no avail.

At this point, I looked at the pinout again, and noticed that the left half's columns are all on PINF. Why aren't we able to read from PINF? It works on the Atreus! So I searched for "reading from PINF not working", but since PINF isn't a common search term, my search engine helpfully added results for "ping not working" too - which I did not notice because I had been fighting this for over an hour by that point. Eventually, when I described the problem to Jesse on Discord, he gave me the final clue: JTAG.

The ATMega32u4 MCU the Splitography uses has JTAG support enabled in its fuses by default. Most vendors who ship the MCU to end-users disable this, but on the Splitography, it wasn't disabled. This meant that using PINF didn't work, because the MCU was expecting to use it for JTAG, not as an input pin to read from. Once this was clear, it didn't take much time to find the solution by looking at the datasheet, the following sections in particular:

  • 2.2.7 (PIN Descriptions; PIN F)
  • 7.8.7 (On-chip Debug System)
  • 26.5.1 (MCU Control Register – MCUCR)

In the end, the fix was these two lines in the constructor of the Splitography hardware plugin:

MCUCR |= (1 << JTD);
MCUCR |= (1 << JTD);

What it does is write the JTD bit of MCUCR twice within four cycles, which disables JTAG at run-time and makes it possible for us to use PINF for input.

Time is of the essence

The next problem I faced was when I started to work on underglow support for the KBD4x keyboard. As expected, it didn't quite work with my port, but ran flawlessly with QMK. So what do I do? Compare the code.

To drive the LEDs on the KBD4x (pretty common WS2812 LEDs), I used the same source library the QMK code is based on. This was strange, because I had used the same code before for the Shortcut port's LED strips, and everything worked fine there. Nevertheless, I went and compared the code, down to the compiled, optimized assembly. It was exactly the same.

Yet, even though the code was the same, with QMK, I was able to set the LED colors as I saw fit. With my Kaleidoscope port, no matter what data I sent its way, the LEDs always ended up being bright white. This is surprisingly hard to search for, and my searches yielded no useful results. At the end of the day, I let my frustration out in QMK's discord a bit, and got a little hint from there: the WS2812 strips are very picky about timing.

After a good night's sleep, I went looking into the QMK sources to see if there's anything there I do differently in Kaleidoscope, focusing on time-related things, such as the clock. And that was the key!

The ATMega32u4 has an option to divide its clock speed, to conserve power. As with JTAG, this can be set or unset in fuses, and thankfully, as with JTAG, it can also be disabled at run-time, with the following magic words:

CLKPR = (1 << CLKPCE);  // enable a change to the clock prescaler
CLKPR = (0 << CLKPS3) | (0 << CLKPS2) | (0 << CLKPS1) | (0 << CLKPS0);  // set the division factor to 1

If the MCU divides its speed to conserve power, it will run slower, and all the timing the library I work with uses will be totally wrong. No wonder poor LEDs lit up white!

With the magic incantation added to the keyboard's constructor, I was able to set colors properly. Why wasn't this a problem on the Shortcut? Because it had clock division disabled in fuses.

NOP, NOP, NOP

Many days later, I had a few spare minutes, so I figured I'd add support for the KBD4x to Chrysalis. This was a 15 minute task, and everything worked fine, yay! I figured I'd build a sketch for my own use while there, and that's when I noticed that the first column wasn't working at all.

Quickly flashing QMK back verified that the issue is not with the hardware. Yay, I guess?

So the usual thing happens: what went wrong? The pinout is the same as in QMK, JTAG and clock division are disabled. The first column is on PIN_F0, so making sure JTAG was disabled was my first step. Some other columns were also on PINF, and those worked, so it's not JTAG.

Frustrating. I cry out on Discord, and Jesse tells me immediately he saw something similar on the Planck, and had a fix. We look into ATMegaKeyboard, and indeed, there appears to be a fix there:

uint16_t ATMegaKeyboard::readCols() {
  uint16_t results = 0x00;
  for (uint8_t i = 0; i < KeyboardHardware.matrix_columns; i++) {
    // We need to pause a beat before reading or we may read
    // before the pin is hot
    asm("NOP");
    results |= (!READ_PIN(KeyboardHardware.matrix_col_pins[i]) << i);
  }
  return results;
}

Emphasis on the asm("NOP"). That line is supposed to slow us down a bit so that the pin we're reading has a chance to settle. I had two questions at this point: why isn't the existing fix enough, and why do I remember my first column working before?

Turns out, this is related to the previous issue! You see, when I turned clock division off, the keyboard started to run a bit faster, which meant that a single NOP didn't slow us down enough for the pin to go hot. With clock division, we were running slow enough for a single NOP to be enough.

But how do we fix this? I checked QMK (checking other, similar projects for clues is such a great thing!), and they delay for 30µs. While that'd work for us too, we didn't want to add an explicit delay on the fast path. I looked at the assembly to see if we can do anything smart there, and noticed an interesting thing: the compiler inlined .readCols(), and unrolled the loop too.

What if we didn't inline it? We'd have a function call overhead then, which is faster than a delay, but slower than being inlined. Adding an attribute that disables inlining made my first column work. However, I wasn't satisfied, and figured we could do better. What if we allowed inlining, but stopped unrolling the loop? Checking the loop condition is faster than calling a function, but still slower than being inlined and unrolled. Turns out, disabling loop unrolling was enough in this case:

__attribute__((optimize("no-unroll-loops")))
uint16_t ATMegaKeyboard::readCols() {
  // ...
}

Summary

In the end, if I had read the documentation and truly understood the hardware I'm working with, I wouldn't have faced any of these issues. But I'm not good at hardware, and never will be. My brain just stops if I try to ingest too much hardware documentation in one sitting. Which means I'll run into similar issues again, for everyone's benefit but mine! Yay.

And to think I'm about to build a home security system soon... yikes. I'm utterly scared about that prospect. So many things will go wrong, that the next post about hardware-related things going awry will be even longer.

Alex Wilson (mrwilson)

Notes from the Week #17 February 03, 2019 12:00 AM

Oh, it’s a long one. I’m trying another new format, breaking down by day — I often forget highlights in trying to limit myself to 2 or 3 things.

Monday

Going faster

I had a two hour session with the rest of the Team Leads about ways to help us go faster, within the constraints of keeping the Unruly culture that makes us unique and not over-egging the process pudding (so to speak).

It feels a lot like a linear/combinatorial optimisation problem that I learned about in school and uni respectively (I’m shuddering at the memory of manually running the Simplex algorithm during exams). There are a bunch of different levers we are able to pull but every action has an effect on everything else.

This probably falls somewhere in what Cynefin calls the complex domain, and we improve by Probing -> Sensing -> Responding.

Demo time

Shift have also started doing huddles to demonstrate our little pet projects since we’ve been trialling a flexible working system. I’m a self-identified morning person and I like to tinker when I’m in the office on my own.

This week I demo’d a custom Terraform module to wrap up different resources between GitHub, AWS, and other sources. Since 0.11, modules support multiple providers being passed in as attributes, so this is no longer problematic to wire up.

Stephen demo’d a spike of a Slack app to administer resources on AWS, as well as communicate to users when they have stale assets, building on his 20% from last week.

Tuesday

Yes/Yes/No

I’m deliberately very free-and-easy with our team’s process — when someone wants to try something new, like a facilitation technique or a way of doing things, I try my best to take a leaf from performance improv and respond with “Yes, and …” (unless the idea would potentially have drastic negative consequences, of course).

The idea came from Stephen listening to the podcast Reply All, which has a segment called “Yes/Yes/No”. The podcasters look at a particular tweet and answer “Yes” or “No” as to whether they understand what it means, before finally explaining it to each other so that everyone can answer Yes.

We tried using this format to really dig into our network security model, the technologies we use, and the way we provision it, and team feedback was a unanimous “let’s do this again”.

My gut reaction as to why this worked for us was that actively engaging with the content and explaining bits of it to each other, rather than watching someone do a demo, made our brains work differently and absorb the information better.

1-to-1s

My team and I have 1-to-1s every other Tuesday after lunch. I genuinely love these because I’m able to talk with my team members on a different level to when we’re in a group, and I get a lot of pleasure from engaging with their thoughts and ideas.

It’s a great time to give and receive feedback outside of our fortnightly “feedback and cake” sessions, which focus more on group feedback, so 1-to-1 feedback tends to be a lot more personal.

Wednesday

Coffee with Steve

Sometimes I’m a bit rubbish with times, but now that I’m in the office early every morning I no longer have an excuse for being late for my Wednesday chat with Steve!

This week we talked about the differences between the Unruly and the GDS models for infrastructure and SREs. Our SREs have a team of their own but also act as enablers and pseudo-coaches to raise ProDev’s skill level across the board.

The world of SRE-ness is one that Shift has been dipping its feet into a lot over the last couple of months, so it’s great to hear how places other than the likes of Google and Facebook do SRE stuff.

20% Time

Wednesday is traditionally (but not always) the day I take my 20% time. It’s normally the day with the fewest meetings and it breaks the week up quite nicely. I spiked the Terraform module for managing ProtonMail DNS records that I spoke about in my last weeknotes, which will be released on GitHub and the Terraform Registry very soon.

(Is it obvious yet that I really like Terraform? I want to build a custom provider next, probably for some obscure web service)

Thursday

DIY Team Lunch

Sarah had the great idea to hold a spontaneous team lunch in our office — she wrote about it in her own weekly reflections. She brought a picnic blanket to one of our meeting rooms, we all brought along our own lunches, and we shared a bottle of Appletizer.

I hadn’t tasted Appletizer for years, and I felt undeniably classier drinking it out of champagne glasses.

There was plenty of non-shop talk, but we had a brilliant idea over the course of the hour — we provide a bit of tooling to deploy and run our Puppet code, but we still use a manual mutex (in the form of a slightly grubby Android plush).

Could we steal a leaf from Hashicorp’s book and replace our manual lock with something like Terraform’s state locking?

This discussion escalated into how we could push events to DynamoDB and profile the workflow much like we would any other system, and show our slightly unkempt Python script a bit of love.

Tech talks

Ina and I presented the findings and progress that Shift had made towards implementing SLx and Error Budgets for the first of our mission-critical systems, Graphite.

We got some really great feedback on both the presentation and the content, and there were some great questions about how we might be using our Error Budget when we’ve finished making the calculations.

Farewell to Jahed

We said goodbye to Jahed this week who is, or … was :(, one of our developers — he’s been here long enough to feel part of the furniture and we’ll certainly miss what he brought to Unruly and ProDev as a whole.

He’s a big supporter of open source like myself so I personally will miss his voice in the blogging and open source group.

Friday

Team lunch followup

As the last Shift member in the office, I took 15 minutes and spiked a quick-and-dirty attempt at event pushing functionality in our Puppet workflow.

  • Terraform’d a DynamoDB instance on AWS.
  • Python’d pushing { session_uuid, date, commit_hash, event, hostname } at specific points in the workflow.

I’m excited for this to start accreting data over the next week — we’ll have better insight into how many runs start but are abandoned, a better look at how long runs take, and more.

This weekend I baked some easy Fork Biscuits as I slowly build up my baking rep, and I’ve not made biscuits for absolutely YONKS.

I’ve also been ill, but this felt more like a physical cold than the last one I had which seriously affected my usual levels of reasoning. Grotty, but mostly still able to operate at normal capacity.

Originally published at blog.probablyfine.co.uk on February 3, 2019.

Pepijn de Vos (pepijndevos)

LM13700: Voltage Controlled Everything February 03, 2019 12:00 AM

When making a modular synth, everything has to be voltage controlled. So how do you make a voltage controlled amplifier, a voltage controlled oscillator, or a voltage controlled filter? One way is with an operational transconductance amplifier.

The LM13700 is like a Swiss army knife of voltage control. Its datasheet is completely packed with reference circuits for voltage controlled everything. To get familiar with its operation, I built a few of the circuits on a breadboard.

Voltage controlled amplifier

Basically, an OTA is like an opamp with a current output, but it’s frequently used without feedback. To make the differential pair more linear, biasing diodes are used at the input. But the linear range is still limited to a few dozen millivolts. What makes it voltage controlled is that the current gain is controlled by IABC, which is the tail current of the differential pair.

For my test circuit I hooked the current gain up to a button with an RC network connected to it, so it does a nice attack and decay when pressed and released.

State variable filter

Then I fed the output of my VCA into this beautiful state variable filter. What is cool about state variable filters is that they can have low-pass, high-pass, and band-pass outputs from the same signal. Each OTA basically forms a Gm-C filter. Put simply, just as a resistor’s current depends on the voltage you put across it, the OTA’s output current depends on its input voltage.

For the above video, I output white noise and a low-frequency sine from the MyDAQ. The white noise goes through the VCA controlled by my RC button envelope, and through the band-pass output of the state variable filter, controlled by the slow sine wave.

February 02, 2019

Pepijn de Vos (pepijndevos)

Microrack: A Small Modular Synthesizer February 02, 2019 12:00 AM

Inspired by the Modulin, I’ve been making my own synthesizer, starting with a Game Boy violin, adding pressure sensitivity, and adding analog delay.

Over the past weeks I’ve been thinking about how I want to connect everything together. I knew I wanted to make it modular, but also that it had to be small enough to become a portable instrument, and hopefully easy to prototype and not too expensive. So I came up with what I call Microrack, a compact mixed-signal bus that is electronically compatible with CV. I typed up a rough description here. In short, it uses a bus with analog multiplexers for audio, and an I2C bus for control signals.

I started by making the power supply and base board. Ideally you’d have something more efficient and powerful, but I started with a simple half-wave rectifier into linear regulators. The I2C lines are exposed to an external Arduino board that will control the user interface and the digital bus. Here is a rough schematic. One thing that is regrettably absent is any sort of current limit or fuse.

power supply schematic

Then I started working on the first module. I decided to start with a little drum machine based on a noise source, a filter, and an envelope generator. The drum machine was mostly driven by the idea of making white noise in discrete logic. The heart of this module is a linear feedback shift register, implemented with two 74HC595 shift registers and a 4030 quad XOR gate.

linear feedback shift register

The shift clock of the registers is driven by an atmega328p. The output clock of the last shift register is driven by a NOT-wired XOR gate to close the feedback loop. The output clock of the first shift register is driven by the atmega at a lower rate, to sample the noise. The outputs of the first shift register are fed to an R-2R resistor ladder.

resistor ladder
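In software terms, the feedback loop works roughly like this (a quick sketch rather than the actual firmware; the tap positions are just a common maximal-length choice for a 16-bit register):

// A 16-bit Fibonacci LFSR: XOR a few tap bits together and shift the result
// back in, which is what the 74HC595s plus the XOR gate do in hardware.
function makeLfsr(seed = 0xACE1) {
  let state = seed & 0xFFFF;
  return function nextBit() {
    const bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1;
    state = ((state >> 1) | (bit << 15)) & 0xFFFF;
    return bit;
  };
}

// Clocking out eight bits gives one noisy byte for the R-2R ladder.
const nextBit = makeLfsr();
const sample = Array.from({ length: 8 }, nextBit).reduce((byte, b) => (byte << 1) | b, 0);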

So by controlling the shift clock and the output clock, the bandwidth and randomness of the noise can be controlled. The DAC output is then fed into an opamp to translate from [0 5] V to [-5 +5] V, which is then output via the analog multiplexer. I’m pretty happy with the result.

microrack

Except then I fried the atmega.

Robin Schroer (sulami)

Building a Literal Library of Building Blocks February 02, 2019 12:00 AM

This post (I know, insert the obligatory “I haven’t posted in a while” bit here) is heavily inspired by a remark Zach Tellman made on the defn podcast, where he says:

Having been a professional programmer for a decade, I have a decade’s worth of experience at writing stuff from scratch, not a decade’s worth of tools in my toolbox. And that seems like a not optimal set of circumstances. [Quote at 57:45]

I listened to this some time around Christmas, and the quote has kept me thinking over the past couple of months. What Zach is talking about is a project he is working on which would allow you to capture explorative programming in the branching fashion in which it happens. His example revolves around using a shell to perform some work, like extracting some specific values from a file. (You should really go and listen to the episode; he has a lot of great insights.)

He explains how we work out these sequences of commands to accomplish our goal, but never generalise them; instead we throw them away, just to write them from scratch the next time we encounter a similar problem. This rings true for all kinds of programming, not just shell scripting, though shell scripts are especially susceptible to this.

Like Zach, I believe this to be a suboptimal situation. Especially as a functional programmer, I believe in small, abstract building blocks, composition, and code reuse, rather than overly specific, bespoke solutions that have to be written from scratch every time. I am someone who tinkers a lot, and there is a lot of code I never commit anywhere. As a matter of fact, I have a habit of creating throw-away files or whole projects in /tmp just to play with something for anywhere between five minutes and a weekend. At the same time I also have a repository on my Github literally called playground, which contains all kinds of small things that I did not want to go through the hassle of creating a Github repository for. (Interesting aside: while creating a local repository has so little friction that I do it all the time, only a fraction of them ever touch Github’s servers, as creating a repository through the web interface incurs so much friction.)

This repository has allowed me to cannibalise some snippets of code I used in the past, but it is not what I would call a comprehensive library of generalised solutions to problems I repeatedly face. Even so, it has been hugely helpful already: for example, I had written about path-finding using the A* algorithm before, so I had a working implementation ready when I needed it for another project.

Having a library, in the worldly sense of the word, of useful, generalised snippets of code would institutionalise the knowledge of them. You would not have to remember how to invert a binary tree, because if you have ever played with binary trees you would already have an implementation handy, and it would be tried and tested, and performance-optimised.

Practical Implementations

Having arrived at the decision of generalising and collecting useful snippets of code somewhere, we are now facing the question of where somewhere actually is, and how we distribute the snippets in a way that allows us to easily use them.

The simplest solution would be to maintain one or several collections of useful snippets, and just copy-paste them into the code you are writing. While this is fast and simple, it does not allow improvements to flow in either direction. Updates to the generalised versions are not included in downstream products using them, and vice versa. The result would likely be a duplication of similar, but subtly different, solutions to all kinds of problems, scattered over various projects. Bugs that have long been fixed in one of them might still be present in others.

The alternative solution is packaging your snippets, and using them as a library. Most of the practical implementation will depend on the programming language you are using, and what kind of projects you are usually working on. Zach Tellman himself has a Clojure library called Potemkin, which is a collection of “some ideas which are almost good”, and which he uses as a dependency for most of his other libraries.

While this incurs some overhead, namely the packaging of the library, it does come with a lot of advantages. Other people can benefit from your library. Depending on the scale of the overhead involved with building a library, splitting snippets by topic into “actual” libraries might make sense. It does require more abstraction, and more documentation, but that is not a bad thing. For a simple library with a handful of data structures or functions, writing a quick readme and some docstrings takes less than an hour.

There is still room for a default, catch-all library that is just for personal use and contains miscellaneous snippets without any particular topic, and it can be where new snippets end up first. If a section of it grows large enough, it can be extracted into its own library. The bottom line here is, if you write something that solves a problem, keep it somewhere, ideally where you can find it again. Even if it is not generalised or documented, it might come in handy in the future.

February 01, 2019

Unrelenting Technology (myfreeweb)

LLVM 8.0 release highlight (ha): LLDB now has syntax highlighting! February 01, 2019 11:27 PM


January 31, 2019

Bogdan Popa (bogdan)

Announcing north January 31, 2019 05:00 AM

A couple of days ago, I released north, a database schema migration tool written in Racket. It currently supports PostgreSQL and SQLite, but new adapters are extremely easy to add. My main reason for building it was that I wanted not only a CLI utility but also programmatic access to run migrations from Racket. I’m going to make that last part easy for everyone with the next release, after I clean up some of the internals and write more docs.

Alex Wilson (mrwilson)

Debugging an outage without an internet connection January 31, 2019 12:00 AM

On the Monday of this week, I was drafted in to help resolve a production incident on a system that I had helped build before I moved teams. What made this unusual was that I had no way to actually debug it at the time. So here’s a small experience post about what I did and what I learned.

NB: These are my personal conclusions, YMMV

A small amount of scene-setting

  1. Unruly practices developers-on-call because we believe it makes us build better services.
  2. We practice primary, secondary, team-lead levels of escalation, but engineers with relevant experience can be drafted in if they are available.
  3. For REASONS, I had my laptop but no internet access, so I couldn’t see anything that was going on.
  4. I was on a phone call with the lead of the team that owns the service, and he had all the usual tools at his disposal.

For the next half hour, I was asking him questions and helping to debug remotely given what I knew of the system.

Lesson 1: He couldn’t read my mind.

If I were debugging alone, I would likely be jumping back and forth between my hunches, trying to cross them off as quickly as possible. The nature of this new dynamic required a much slower and more measured approach.

Don’t say “Can you read out the logs from 3 o’clock to 5 o’clock?”

We’re trying to identify possible causes of the issues, and these (caveat: in my opinion) are better phrased as questions than requests.

Do say “There was a deploy at 3 o’clock today, is there anything unusual in the app log?”

If you have the source code available to you, referring to files and line numbers is really helpful — this enabled us to identify particular lines of configuration that might be causing the issue.

Lesson 2: Say why you’re asking the question

I’m not going to treat my colleague like he’s just a pair of hands for me, and I felt it was really important to clarify why I was asking the question. It gives an opportunity to short-circuit the query if it’s something that’s already been eliminated.

Don’t say “Can you search the logs for AWS request errors?”

Bonus points: ask about the output, not the act. I don’t really mind how we get to the answer, more that we eliminate or prove a potential cause.

Do say “I think that the problem might be being caused by a lack of correct permissions for the AWS credentials. Are there AWS permission denied errors in the app log?”

Lesson 3: They are the system owners, and the experts. Treat them as such.

I was pulled in because I’d worked on the system before but well over a year ago. The system might have changed in ways that I don’t know about, so it was important for me to recognise that my mental model might be out of date and I needed to tailor my questions as such.

Do say “I remember it behaving like X due to Y. Is this still true?”

In this scenario, there were a number of things I could probably eliminate based on my previous knowledge of what failure causes would look like for network issues, datastore connections, etc, but I didn’t want to assume anything.

Lesson 4: On-call outages are inherently stressful. Don’t make it worse.

Feedback when debugging yourself is on the order of seconds.

The thought -> question -> action -> reply loop is significantly longer.

Given that we’re trying to solve the issue in the shortest timeframe available, this process can make things even more stressful if things get out of hand.

There are several things you can do but they depend on how the other person works — for example, are they someone who likes to talk a lot, or more measured with the way they speak?

In the former, I would try to keep a conversation going with my thought process to normalise the conversation, but I wouldn’t do this in the latter scenario.

I hope this will be a useful read for people who do on-call and who might encounter something like this — tl;dr: try your best to help, be clear and up-front with questions, and show empathy and care to avoid making things more stressful.

Originally published at blog.probablyfine.co.uk on January 31, 2019.

January 30, 2019

Mark J. Nelson (mjn)

DeepMind AlphaStar (Starcraft II bot) roundup January 30, 2019 12:00 PM

DeepMind, the UK-based Google subsidiary that brought big news to the AI & games world in 2016 with its bot AlphaGo defeating a world-class Go player, says they've done it again with AlphaStar, a bot that beat a high-level human player in the real-time strategy (RTS) game StarCraft II, previously considered a top-tier challenge.

I'm currently teaching an AI & Games special-topics course at American University, where I recently started in a new job. It's a seminar-style class where the first part of the course is largely based on jointly reading Julian Togelius's new book (so new, in fact, that it was officially published one day after our semester started), Playing Smart: On Games, Intelligence, and Artificial Intelligence.

The idea is that both I and the students bring some things to discuss to each class, within a framework broadly set by Togelius. But this AlphaStar news is pretty high-profile breaking news, at least as far as AI & games news goes. So I decided that we'd dedicate the next class or two to this event.

Some discussion immediately ensued around the Google StarCraft bot: what does this show, does it show what it claims, and what does it mean for AI & games, or AI in general?

* * *

I've collected relevant links below:

DeepMind StarCraft II Demonstration video

  • Commented recording of five human-vs-machine games
  • There's an explanation of the learning algorithm at 1:16:30

DeepMind blog post

  • Explanation of the algorithms and bot training league
  • Some interactive visualizations of strategies
  • Downloadable data

Reaction blog posts and articles:

Indrek Lasn (indreklasn)

How to set-up a powerful API with GraphQL, Koa, and MongoDB — CRUD January 30, 2019 12:00 AM


This is a series where we learn how to set up a powerful API with GraphQL, Koa, and Mongo. The primary focus will be on GraphQL. Check out part I if you haven’t yet.

Mutations

So far we can read our data, but there’s a big chance we need to edit our data records/documents. Any complete data platform needs a way to modify server-side data as well.

Okay, imagine this: a company launches a new gadget. How would we go about adding the record to our database with GraphQL?

Mutations to the rescue!

Think of mutations like POST or PUT REST actions. Setting up a mutation is quite straightforward.

Let’s jump in!

Adding records to our database

Create a file graphql/mutations.js

Inside the file, we will place mutations.

  • We import GraphQLObjectType and the other GraphQL types we need from the GraphQL library.
  • Import the GraphQL type for gadget
  • Import the gadget mongoose Model.

After importing the stuff we need, we can go about creating the mutation.

A Mutation is just a plain GraphQLObjectType like the query we had before. It has two main properties we’re interested in.

The name of the mutation is what appears in the GraphiQL docs.

Fields are where we can place our mutation logic.

Notice I added a new object inside the fields object. It’s called addGadget, and it does exactly what its name says.

Inside the addGadget we have access to three properties, type, args, and resolve().

The addGadget type will be gadgetGraphQLType. The gadget can only have properties which are allowed in the gadgetGraphQLType type we declared earlier.

addGadget is a query which accepts arguments. The arguments are needed to specify which gadget we want to add to our database.

We declare up-front which arguments the query accepts and the types of the arguments.

Lastly — what actually happens with the query? That is precisely what the resolve() function is for.

Remember the resolve() function has two arguments: parent and args. We’re interested in the args since these are the values we pass to our query.

Inside the resolve we place the logic for creating a new mongo record.

We create a new instance of our Gadget mongoose model, pass the props we receive from GraphQL as new fields, and finally save the record.

Here’s how the full mutation looks:

graphql/mutations.js
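Something along these lines (the gadget fields used here, name and price, are my own assumption, as are the require paths; use whatever fields your gadget type declares and adjust the paths to your own layout):

const { GraphQLObjectType, GraphQLString, GraphQLInt, GraphQLNonNull } = require('graphql');
const gadgetGraphQLType = require('./gadgetType'); // the gadget type from part I (path assumed)
const Gadget = require('../models/gadget');        // the Mongoose model (path assumed)

const Mutation = new GraphQLObjectType({
  name: 'Mutation',
  fields: {
    addGadget: {
      type: gadgetGraphQLType,
      args: {
        // the gadget fields are assumptions; mirror your gadget type here
        name:  { type: new GraphQLNonNull(GraphQLString) },
        price: { type: new GraphQLNonNull(GraphQLInt) },
      },
      resolve(parent, args) {
        // build a new Gadget document from the arguments and save it;
        // save() returns a promise, which GraphQL resolves for us
        const gadget = new Gadget({
          name: args.name,
          price: args.price,
        });
        return gadget.save();
      },
    },
  },
});

module.exports = Mutation;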

Voilà! All we need to do is import the mutation into our schema.js file.

graphql/schema.js
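Roughly like this, assuming the root query type from part I lives in graphql/queries.js (the filename is an assumption):

const { GraphQLSchema } = require('graphql');

const query = require('./queries');      // the root query type from part I (filename assumed)
const mutation = require('./mutations'); // the mutation type we just wrote

// Wire both the query and the mutation into the schema our Koa endpoint serves.
module.exports = new GraphQLSchema({ query, mutation });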

If everything went fine, this is what we should see in GraphiQL.

And if we click on it:

Notice how GraphQL automatically creates self-documentation. This is one of the benefits of such strong typing.

Firing off the mutation query

A mutation is just a plain GraphQL query which takes our arguments, saves them to the Mongo database, and returns the properties we want.

Here’s the catch: every mutation needs to be marked as a mutation; you can see this in the complete query below.

Voilà! We successfully created and inserted a new gadget into our Mongo database.

If you head over to mlab or whatever provider you’re using, you should see the new record.

Here’s the complete query for our mutation.
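It looks roughly like this, written out as a JS string you could also send from client code; note the leading mutation keyword, and swap in your own gadget fields (name and price here are my assumption):

// The document to run in GraphiQL; the argument values are placeholders.
const addGadgetMutation = `
  mutation {
    addGadget(name: "Shiny new gadget", price: 499) {
      name
      price
    }
  }
`;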

Good job!

Editing our records in the database

What if we want to edit pre-existing records? Surely we can’t rely on never making a typo, or what if the price changes?

Editing a record is also a mutation. Remember, every time we want to change or add a record, it’s a GraphQL mutation!

Open the graphql/mutations file and create another mutation. A mutation is just a plain object.
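It could look something like this, added inside the fields object of the mutation type above (GraphQLID also needs to be required from the graphql package; the gadget fields are still my assumed name and price):

updateGadget: {
  type: gadgetGraphQLType,
  args: {
    id:    { type: new GraphQLNonNull(GraphQLID) },
    name:  { type: GraphQLString },
    price: { type: GraphQLInt },
  },
  resolve(parent, args) {
    // find the existing gadget by id, change the props, and save it;
    // save() returns another promise, which is what we hand back to GraphQL
    return Gadget.findById(args.id)
      .then(gadget => {
        gadget.name = args.name;
        gadget.price = args.price;
        return gadget.save();
      })
      .catch(err => console.log(err));
  },
},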

Notice the new mutation called updateGadget. It’s pretty much a replica of the previous mutation. Notice the extra argument, the id — that’s because we need to find the existing gadget and change it. We can find the gadget by id.

The resolve() function is where it gets more interesting. Ideally, we want to find the gadget by id, change the props, and save it. How would we go on about doing this?

Mongoose gives us a method to do this, called findById.

This returns a promise. If we console.log the promise, we can see a huge blob of properties attached to it. What we can do with the promise is chain it with a then() method. If promises are unfamiliar, check out this article I wrote.

Like so: we find the gadget, change the props, and save it. But this returns another promise, which we need to resolve.

We add a .catch() for error handling in case we run into errors. Remember, you can monitor your pm2 logs via the pm2 logs command. If you run into errors, they will be logged to the pm2 logger.

That’s all! Query time! Look at your Mongo table and pick a random id from there and edit the corresponding gadget.

And if we inspect the database, we should see the edited record.

Voilà! Success!

Here’s the query for the updateGadget mutation.

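Roughly, again as a JS string (paste in a real id from your database; the field values are placeholders):

const updateGadgetMutation = `
  mutation {
    updateGadget(id: "PASTE-A-REAL-ID-HERE", name: "Renamed gadget", price: 899) {
      name
      price
    }
  }
`;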

Okay, good job making it this far. So far we have Create, Read, and Update, but we’re missing the final D(elete).

Deleting a record from a Mongo database is quite straightforward. All we need is another mutation, since we are, in fact, mutating the database.

For deleting records, Mongoose gives us a handy method called findOneAndDelete – read more about findOneAndDelete here.

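The mutation could look something like this, added alongside the other mutations (the name deleteGadget, like the gadget fields, is my own choice):

deleteGadget: {
  type: gadgetGraphQLType,
  args: {
    id: { type: new GraphQLNonNull(GraphQLID) },
  },
  resolve(parent, args) {
    // find the gadget by id, delete it, and return the deleted document;
    // log the error if anything goes wrong
    return Gadget.findOneAndDelete({ _id: args.id })
      .catch(err => console.log(err));
  },
},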

Deleting the record just takes one argument, the id. We find the gadget by ID, delete it, and return it. If there’s an error, we’ll log it.

And the query:

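Something like the following (the id is a placeholder):

const deleteGadgetMutation = `
  mutation {
    deleteGadget(id: "PASTE-A-REAL-ID-HERE") {
      name
    }
  }
`;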

Note: Make sure the id is correct and exists in the database, otherwise it won’t work.

If we head over to our database and inspect it, we can see that the record has indeed been deleted.

Well done, we have achieved basic CRUD functionality. Notice how GraphQL is a thin layer between our database and the view. It’s not supposed to replace a database, but rather to make fetching and manipulating data easier.

Here’s the source code.

I’ll see you at part 3 where we’ll do more cool stuff!

Thanks for reading!

Originally published at strilliant.com on January 30, 2019.


How to set-up a powerful API with GraphQL, Koa, and MongoDB — CRUD was originally published in freeCodeCamp.org on Medium, where people are continuing the conversation by highlighting and responding to this story.

January 29, 2019

Jeff Carpenter (jeffcarp)

A Year of the Pomodoro Technique January 29, 2019 11:19 PM

The Pomodoro Technique is a method for improving productivity by segmenting work into 25-minute intervals. You focus intensely on a task for 25 minutes, then take a break. Rinse, repeat. I began using the technique to study for the cryptography course I took last winter. The benefits were clear from the beginning. I enjoyed working in 25-minute segments and started using it at work as well. I want to share with you some of the things I learned along this year-long journey.

Derek Jones (derek-jones)

Modeling visual studio C++ compile times January 29, 2019 04:34 PM

Last week I spotted an interesting article on the compile-time performance of C++ compilers running under Microsoft Windows. The author had obviously put a lot of work into gathering the data, and had taken care to do multiple runs to reduce the impact of random effects (128 runs to be exact); but, as is often the case, the analysis of the data was lackluster. I posted a comment asking for the data, and a link was posted the next day :-)

The compilers benchmarked were: Visual Studio 2015, Visual Studio 2017 and clang 7.0.1; the compilers were configured to target: C++20, C++17, C++14, C++11, C++03, or C++98. The source code used was 100 system headers.

If we are interested in understanding the contribution of each component to overall compile-time, the obvious first regression model to build is:

compile_time = header_x+compiler_y+language_z

where: header_x are the different headers, compiler_y the different compilers and language_z the different target languages. There might be some interaction between variables, so something more complicated was tried first; the final fitted model was (code+data):

compile_time = k+header_x+compiler_y+language_z+compiler_y*language_z

where k is a constant (the Intercept in R’s summary output). The following is a list of normalised numbers to plug into the equation (clang is the default compiler and C++03 the default language, and so they do not appear in the list; the : symbol represents multiplication; only a few of the 100 headers are listed, details are available):

                             Estimate Std. Error  t value Pr(>|t|)    
               (Intercept)                  headerany 
               1.000000000                0.051100398 
               headerarray             headerassert.h 
               0.522336397               -0.654056185 
...
            headerwctype.h            headerwindows.h 
              -0.648095154                1.304270250 
              compilerVS15               compilerVS17 
              -0.185795534               -0.114590143 
             languagec++11              languagec++14 
               0.032930014                0.156363433 
             languagec++17              languagec++20 
               0.192301727                0.184274629 
             languagec++98 compilerVS15:languagec++11 
               0.001149643               -0.058735591 
compilerVS17:languagec++11 compilerVS15:languagec++14 
              -0.038582437               -0.183708714 
compilerVS17:languagec++14 compilerVS15:languagec++17 
              -0.164031495                         NA 
compilerVS17:languagec++17 compilerVS15:languagec++20 
              -0.181591418                         NA 
compilerVS17:languagec++20 compilerVS15:languagec++98 
              -0.193587045                0.062414667 
compilerVS17:languagec++98 
               0.014558295 

As an example, the (normalised) time to compile wchar.h using VS15 with languagec++11 is:
1-0.514807638-0.183862162+0.033951731-0.059720131
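which comes out at roughly 0.28.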

Each component adds/subtracts to/from the normalised mean.

Building this model didn’t take long. While waiting for the kettle to boil, I suddenly realised that an additive model was probably inappropriate for this problem; oops. Surely the contribution of each component was multiplicative, i.e., each component has a percentage impact on performance.

A quick change to the form of the fitted model:

log(compile_time) = k+header_x+compiler_y+language_z+compiler_y*language_z

Taking the exponential of both sides, the fitted equation becomes:

compile_time = e^{k}e^{header_x}e^{compiler_y}e^{language_z}e^{compiler_y*language_z}

The numbers, after taking the exponent, are:

               (Intercept)                  headerany 
              9.724619e+08               1.051756e+00 
...
            headerwctype.h            headerwindows.h 
              3.138361e-01               2.288970e+00 
              compilerVS15               compilerVS17 
              7.286951e-01               7.772886e-01 
             languagec++11              languagec++14 
              1.011743e+00               1.049049e+00 
             languagec++17              languagec++20 
              1.067557e+00               1.056677e+00 
             languagec++98 compilerVS15:languagec++11 
              1.003249e+00               9.735327e-01 
compilerVS17:languagec++11 compilerVS15:languagec++14 
              9.880285e-01               9.351416e-01 
compilerVS17:languagec++14 compilerVS15:languagec++17 
              9.501834e-01                         NA 
compilerVS17:languagec++17 compilerVS15:languagec++20 
              9.480678e-01                         NA 
compilerVS17:languagec++20 compilerVS15:languagec++98 
              9.402461e-01               1.058305e+00 
compilerVS17:languagec++98 
              1.001267e+00 

Taking the same example as above: wchar.h using VS15 with c++11. The compile-time (in cpu clock cycles) is:
9.724619e+08*3.138361e-01*7.286951e-01*1.011743e+00*9.735327e-01
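which works out to roughly 2.2e+08 cycles.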

Now each component causes a percentage change in the (mean) base value.

Both of these models explain over 90% of the variance in the data, but this is hardly surprising given that they include so much detail.

In reality compile-time is driven by some combination of additive and multiplicative factors. Building a combined additive and multiplicative model is going to be like wrestling an octopus, and is left as an exercise for the reader :-)

Given a choice between these two models, I think the multiplicative model is probably closest to reality.