Planet Crustaceans

This is a Planet instance for lobste.rs community feeds. To add/update an entry or otherwise improve things, fork this repo.

November 23, 2020

Ponylang (SeanTAllen)

Last Week in Pony - November 22, 2020 November 23, 2020 03:21 AM

The ponylang/glob package has upgraded its ponylang/regex dependency to improve the installation experience for Windows users.

November 22, 2020

Derek Jones (derek-jones)

What software engineering data have I collected on subject X? November 22, 2020 10:32 PM

While it’s great that so much data was uncovered during the writing of the Evidence-based software engineering book, trying to locate data on a particular topic can be convoluted (not least because there might not be any). There are three sources of information about the data:

  • the paper(s) written by the researchers who collected the data,
  • my analysis and/or discussion of the data (which is frequently different from that of the original researchers),
  • the column names in the csv file, i.e., data is often available which neither the researchers nor I discuss.

At the beginning I expected there to be at most a few hundred datasets; easy enough to remember what they are about. While searching for some data, one day, I realised that relying on memory was not a good idea (it was never a good idea), and started including data identification tags in every R file (of which there are currently 980+). This week has been spent improving tag consistency and generally tidying them up.

How might data identification information be extracted from the paper that was the original source of the data (other than reading the paper)?

Named-entity recognition, NER, is a possible starting point; after all, the data has names associated with it.

Tools are available for extracting text from pdf files, and 10 lines of Python later we have a list of named entities:

import spacy

# Load English tokenizer, tagger, parser, NER and word vectors
nlp = spacy.load("en_core_web_sm")

file_name = 'eseur.txt'
soft_eng_text = open(file_name).read()
soft_eng_doc = nlp(soft_eng_text)

for ent in soft_eng_doc.ents:
     print(ent.text, ent.start_char, ent.end_char,
           ent.label_, spacy.explain(ent.label_))

The catch is that en_core_web_sm is a general model for English, and is not software engineering specific, i.e., the returned named entities are not that good (from a software perspective).

An application domain language model is likely to perform much better than a general English model. While there are some application domain models available for spaCy (e.g., biochemistry), and application datasets, I could not find any spaCy models for software engineering (I did find an interesting word2vec model trained on Stackoverflow posts, which would be great for comparing documents, but not what I was after).

While it’s easy to train a spaCy NER model, the time-consuming bit is collecting and cleaning the text needed. I have plenty of other things to keep me busy. But this would be a great project for somebody wanting to learn spaCy and natural language processing :-)
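For anyone tempted, here is a minimal sketch of what such a training loop might look like, using the spaCy 2.x API that was current at the time of writing; the SOFTWARE label, the two example sentences, and their character offsets are invented purely for illustration:

import random
import spacy

# Two hand-labelled examples; the label and offsets are made up for illustration.
TRAIN_DATA = [
    ("The Linux kernel git repository contains 20 years of commits",
     {"entities": [(4, 16, "SOFTWARE")]}),
    ("Fault reports were extracted from the Eclipse bug tracker",
     {"entities": [(38, 45, "SOFTWARE")]}),
]

nlp = spacy.blank("en")          # start from an empty English pipeline
ner = nlp.create_pipe("ner")     # add a fresh NER component
nlp.add_pipe(ner)
for _, ann in TRAIN_DATA:
    for start, end, label in ann["entities"]:
        ner.add_label(label)

optimizer = nlp.begin_training()
for _ in range(20):              # a few passes over the tiny dataset
    random.shuffle(TRAIN_DATA)
    for text, ann in TRAIN_DATA:
        nlp.update([text], [ann], sgd=optimizer)

doc = nlp("The FreeBSD source tree was mined for commit messages")
print([(ent.text, ent.label_) for ent in doc.ents])

The real work, as noted above, is assembling and cleaning enough labelled text for the model to learn anything useful.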

What information is contained in the undiscussed data columns? Or, from the practical point of view, what information can be extracted from these columns without too much effort?

The number of columns in a csv file is an indicator of the number of different kinds of information that might be present. If a csv is used in the analysis of X, and it contains lots of columns (say more than half-a-dozen), then it might be assumed that it contains more data relating to X.

Column names are not always suggestive of the information they contain, but might be of some use.

Many of the csv files contain just a few rows/columns. A list of csv files that contain lots of data would narrow down the search, at least for those looking for lots of data.
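As a rough illustration, a few lines of Python can produce such a list by ranking the csv files by row and column counts (the data/ directory name is just a placeholder):

import csv
import glob

sizes = []
for path in glob.glob("data/*.csv"):
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    if rows:
        # rows minus the header line, plus the number of columns
        sizes.append((len(rows) - 1, len(rows[0]), path))

for n_rows, n_cols, path in sorted(sizes, reverse=True)[:20]:
    print(f"{n_rows:7d} rows  {n_cols:3d} columns  {path}")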

Another possibility is to group csv files by potential use of data, e.g., estimating, benchmarking, testing, etc.

More data is going to become available, and grouping by potential use has the advantage that it is easier to track the availability of new data that may supersede older data (that may contain few entries or apply to circumstances that no longer exist).

My current technique for locating data on a given subject is either remembering the shape of a particular plot (and trying to find it), or using the pdf reader’s search function to locate likely words and phrases (and then looking at the plots and citations).

Suggestions for searching or labelling the data that don’t require lots of effort are welcome.

November 17, 2020

Bogdan Popa (bogdan)

Racketeering Gophers November 17, 2020 10:30 AM

rocketeering gopher
Close enough.

I’ve been working on a Wasm implementation in Racket for the past couple of weeks and have recently reached a neat milestone.

November 16, 2020

Andreas Zwinkau (qznc)

Use decision records already! November 16, 2020 12:00 AM

To develop more systematically, decision records are a great first step.

Read full article!

November 15, 2020

Derek Jones (derek-jones)

Researching programming languages November 15, 2020 10:12 PM

What useful things might be learned from evidence-based research into programming languages?

A common answer is researching how to design a programming language having a collection of desirable characteristics; with desirable characteristics including one or more of: supporting the creation of reliable, maintainable, readable code, being easy to learn, or being easy to understand, etc.

Building a theory of, say, code readability is an iterative process. A theory is proposed, experiments are run, results are analysed; rinse and repeat until a theory having a good enough match to human behavior is found. One iteration will take many years: once a theory is proposed, an implementation has to be built, developers have to learn it, and spend lots of time using it to enable longer term readability data to be obtained. This iterative process is likely to take many decades.

Running one iteration will require 100+ developers using the language over several years. Why 100+? Lots of subjects are needed to obtain statistically meaningful results, people differ in their characteristics and previous software experience, and some will drop out of the experiment. Just one iteration is going to cost a lot of money.

If researchers do succeed in being funded and eventually discovering some good enough theories, will there be a mass migration of developers to using languages based on the results of the research findings? The huge investment in existing languages (both in terms of existing code and developer know-how) means that to stand any chance of being widely adopted these new language(s) are going to have to deliver a substantial benefit.

I don’t see a high cost multi-decade research project being funded, and based on the performance improvements seen in studies of programming constructs I don’t see the benefits being that great (benefits in use of particular constructs may be large, but I don’t see an overall factor of two improvement).

I think that creating new programming languages will continue to be a popular activity (it is vanity research), and I’m sure that the creators of these languages will continue to claim that their language has some collection of desirable characteristics without any evidence.

What programming research might be useful and practical to do?

One potentially practical and useful question is the lifecycle of programming languages, where the components of the lifecycle include developers who can code in the language, source code written in the language, and companies dependent on programs written in the language (who are therefore interested in hiring people fluent in the language).

Many languages come and go without many people noticing, a few become popular for a few years, and a handful continue to be widely used over decades. What are the stages of life for a programming language, what factors have the largest influence on how widely a language is used, and for how long it continues to be used?

Sixty years’ worth of data is waiting to be collected and collated; enough to keep researchers busy for many years.

The uses of a lifecycle model, that I can think of, all involve the future of a language, e.g., how much of a future does it have and how might it be extended.

Some recent work looking at the rate of adoption of new language features includes: On the adoption, usage and evolution of Kotlin Features on Android development, and Understanding the use of lambda expressions in Java; also see section 7.3.1 of Evidence-based software engineering.

Andreas Zwinkau (qznc)

OKRs are about change November 15, 2020 12:00 AM

Objectives and Key Results are better understood through change principles

Read full article!

November 14, 2020

Gokberk Yaltirakli (gkbrk)

Status update, November 2020 November 14, 2020 09:00 PM

I started working on a web front-end project. It is a compiler for web components. The compiler reads single-file web components similar to the Vue.js ones, and emits vanilla JavaScript.

I found that it works well for a small library of reusable components, and I prefer the minimal design over something like Vue or React which might be overkill for most components.

I put the code, along with a few examples, at element-compiler-python.


The second project I worked on is a homemade Version Control System (VCS). While it doesn’t have too much fancy functionality, it has basic support for branches and commits.

The project is following the Unix philosophy of small, composable tools that each do one thing well. The internal repo format is just simple plain text files. My goal with the project is to make it easy to write code that operates on the repo. Ideally, someone will be able to replace a sub-command like log or shortlog relatively easily.

All the commits are stored in diff form starting from an empty repo. In order to publish projects developed using the VCS, I also created a sub-command that pushes the commits into a Git mirror. This makes it possible to collaborate and share the projects even before the VCS has gained any traction.

While a lot of it currently works, I am planning to do a few ergonomics-related fixes before releasing the tool. An old version, which will be replaced by the final one after the release, can be found at dum. Oh yeah, I called it dum, because it is a short word that has a similar meaning to Git.


I started getting more familiar with the Gemini protocol and document format. I had previously written about these in Gemini. My goal is to write my own client and server, as per Gemini tradition.

I wanted something slightly easier to complete before I made a full client, so I decided to make yet another Gemini-to-HTTP proxy and host it as a serverless script. It is not feature-complete yet, but it can render most simple pages.

Here are a few examples:


I wrote a small utility, called httptime, that can synchronize your system time based on the HTTP Date header. If you know about the header, you will know that it has a 1-second resolution. This is usually not sufficient for accurate timekeeping. The script makes multiple HEAD requests at strategic timestamps in order to synchronize itself with the server time and increase the timing accuracy.

I found that this approach can get you to between 0.001 and 0.004 seconds of accuracy, which is good enough for most use cases.

In most cases, you should just use NTP or SNTP. But if you don’t want to get an NTP daemon, or you want to use TLS or plain HTTP for some reason, this approach works well and is very minimal. Most importantly, the code can be understood by mere mortals in a few minutes.
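The core idea can be sketched in a few lines of Python. This is not the author's httptime tool, just an illustration of estimating the clock offset from a single HEAD request; the real script refines it with multiple requests timed around the second boundary, and example.com is a placeholder host:

import email.utils
import time
import urllib.request

def http_time_offset(url="https://example.com"):
    # Send a HEAD request and note the local send/receive times.
    req = urllib.request.Request(url, method="HEAD")
    t_send = time.time()
    with urllib.request.urlopen(req) as resp:
        t_recv = time.time()
        server = email.utils.parsedate_to_datetime(resp.headers["Date"])
    # Assume the Date header (1-second resolution) was generated halfway
    # through the round trip, and compare it to the local clock.
    local_midpoint = t_send + (t_recv - t_send) / 2
    return server.timestamp() - local_midpoint

print(f"clock offset: {http_time_offset():+.3f} s")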


I played around with neural networks and text generation. Instead of going with something fancy and using an attention mechanism, I just fed the last N characters to the network to predict the next one. The results were as expected: the rough format of words and sentences looks okay, but it lacks the context that makes the fancier text generation models more comprehensible.
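For the curious, the data preparation behind that approach is simple; a sketch (corpus.txt and the window size N are placeholders, and any small network can be trained on the resulting pairs):

N = 16
text = open("corpus.txt").read()

# Map each character to an integer id.
chars = sorted(set(text))
char_to_id = {c: i for i, c in enumerate(chars)}

# Slide a window over the text: the last N characters are the input,
# the character that follows is the prediction target.
samples = []
for i in range(len(text) - N):
    context = [char_to_id[c] for c in text[i:i + N]]
    target = char_to_id[text[i + N]]
    samples.append((context, target))

print(f"{len(samples)} training pairs, vocabulary of {len(chars)} characters")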

I also started to collect data using the Twitch.tv GraphQL endpoint about when certain channels are online or offline. I might do a project in the future where I try to predict the future schedule of channels based on past data using machine learning.

That’s all for this month, thanks for reading!

Andreas Zwinkau (qznc)

Switch book review November 14, 2020 12:00 AM

How to Change Things When Change Is Hard

Read full article!

November 11, 2020

Grzegorz Antoniak (dark_grimoire)

Rant about Apple's keyboard hotkey system November 11, 2020 06:00 AM

Nothing could be easier to remember and understand -- I don't understand why Windows hasn't adopted it yet.

Command is generally used for command execution. But it's also used as a modifier for mouse clicks. So, when you click with the Command key active, you can sometimes get a different outcome.

Shift …

November 08, 2020

Derek Jones (derek-jones)

Evidence-based software engineering: book released November 08, 2020 11:30 PM

My book, Evidence-based software engineering, is now available; the pdf can be downloaded here, here and here, plus all the code+data. Report any issues here. I’m investigating the possibility of a printed version. Mobile friendly pdf (layout shaky in places).

The original goals of the book, from 10 years ago, have been met, i.e., discuss what is currently known about software engineering based on an analysis of all the publicly available software engineering data, and having the pdf+data+code freely available for download. The definition of “all the public data” started out as being “all”, but as larger and higher quality data was discovered the corresponding smaller datasets were ignored.

The intended audience has always been software developers and their managers. Some experience of building software systems is assumed.

How much data is there? The data directory contains 1,142 csv files and 985 R files, the book cites 895 papers that have data available of which 556 are cited in figure captions; there are 628 figures. I am currently quoting the figure of 600+ for the ‘amount of data’.


Cover image of book Evidence-based software engineering.

Things that might be learned from the analysis have been discussed in previous posts on the chapters: Human cognition, Cognitive capitalism, Ecosystems, Projects and Reliability.

The analysis of the available data is like a join-the-dots puzzle, except that the 600+ dots are not numbered, some of them are actually specks of dust, and many dots are likely to be missing. The future of software engineering research is joining the dots to build an understanding of the processes involved in building and maintaining software systems; work is also needed to replicate some of the dots to confirm that they are not specks of dust, and to discover missing dots.

Some missing dots are very important. For instance, there is almost no data on software use, but there can be lots of data on fault experiences. Without software usage data it is not possible to estimate whether the software is very reliable (i.e., few faults experienced per amount of use), or very unreliable (i.e., many faults experienced per amount of use).

The book treats the creation of software systems as an economically motivated cognitive activity occurring within one or more ecosystems. Algorithms are now commodities and are not discussed. The labour of the cognitariate is the means of production of software systems, and this is the focus of the discussion.

Existing books treat the creation of software as a craft activity, with developers applying the skills and know-how acquired through personal practical experience. The craft approach has survived because building software systems has been a seller’s market, customers have paid what it takes because the potential benefits have been so much greater than the costs.

Is software development shifting from being a seller’s market to a buyer’s market? In a competitive market for development work and staff, paying people to learn from mistakes that have already been made by many others is an unaffordable luxury; an engineering approach, derived from evidence, is a lot more cost-effective than craft development.

As always, if you know of any interesting software engineering data, please let me know.

Sevan Janiyan (sevan)

LFS, round #2, 3rd try November 08, 2020 02:33 AM

In my previous post I ended with the binutils test suite not being happy after steering off the guide and making some changes to which components were installed. I decided to start again but cut back on the changes and see just how much I could omit from installing to get to the point of …

Robin Schroer (sulami)

Writing for Reasons November 08, 2020 12:00 AM

This year, I have been writing more than ever before. In this article, I would like to discuss some of the reasons for writing and provide some thoughts on each.

Writing to Remember

This is probably the most obvious reason to write for a lot of people. Having written down a piece of information, you can come back later and recall it. Historical context can be invaluable for decision making, and often covers information that is not readily available anymore.

The key here is being able to find notes later on. Paper-based ones can be sorted by topic or chronologically, digital ones can be searched for. Formats can be useful here too, for example by supporting embedded code blocks or graphics.

Writing to Solve Problems

Early this year, before the pandemic hit Europe, I saw Paulus Esterhazy’s talk Angels Singing: Writing for Programmers at clojureD. It contained this great quote from Milton Friedman:

If you cannot state a proposition clearly and unambiguously, you do not understand it.

In another talk, Rich Hickey explained his notion of using notes as an extension of his working memory:

So we have a problem, in general, because we’re just being asked to write software that’s more and more complex as time goes by. And we know there’s a 7 +/- 2 sort of working memory limit and as smart as any of us are, we all suffer from the same limit but the problems that we are called upon to solve are much bigger than that. So what do we do if we can’t think the whole thing in our head at the same time? How can we work on a problem with more than nine components. What I’m going to recommend is that you write all the bits down.

[…]

But if we look at the 7 +/- 2 thing, we could say we can juggle seven to nine balls but if you can imagine having an assistant who every now and then can take one of those out and put a different color in and you can juggle balls of 20 different colors at the same time as long as there are only nine in the air at any one point in time. And that’s what you’re doing, you’re going to sort of look around at all these pieces and shift arbitrary shapes of seven into your head at different points in time.

Writing everything down allows digging deep into details and going off on tangents, and then returning to other aspects. As an added bonus, these notes can be useful in the future as well, if archived properly. I found org-mode outlines incredibly powerful for this purpose, with their foldable, tree-like structure that allows nesting sub-problems.

Writing to Make Decisions

Writing is invaluable for decision making. Not only does it aid the decision process (see above), it also allows returning to a decision later and reviewing it.

Architecture decision records (ADRs) are a tool established just for this purpose. The exact formats vary, and the details do not matter too much, but here are a few key points I consider essential:

  • The motivation for the decision
  • The constraints involved
  • The alternatives to consider and their respective tradeoffs

All of these are useful in several ways: they force you to acknowledge the components of the decision, make it simple to get an opinion on the matter from someone else, and also allow you to review the (potentially bad) decision later on.

There is one more point: the conclusion. This is easy to forget, because once a conclusion is reached, no one wants to spend time writing it down. But if you do not write it down, the document does not tell the whole story if reviewed in the future.

Writing to Develop Ideas

This year I have seen a lot of people writing about Sönke Ahrens’ How to Take Smart Notes, which is about taking notes as a means to develop long form writing. It popularised the idea of the Zettelkasten, a physical or virtual box of notes which reference each other to build an information network.

While I found the book quite interesting, I would not recommend it to everyone due to the significant organisation overhead involved.

That being said, I believe that if you have a digital system which can provide automatic back-links to avoid the exponentially growing amount of manual maintenance required, there is little harm in linking notes. At the very least it will make it easier to find a note, and maybe it can aid the thinking process by exposing previously unseen connections between concepts.

Writing to Communicate

This very article was written expressly to communicate information, and as such required some extra work for it to be effective.

The most important factor when writing for communication is the target audience. It dictates the format to use, and which prior knowledge can be assumed. Maximising information density by being as concise as possible is important to avoid wasting the reader’s time.

As an added difficulty, when writing something to be published you need to get it right the first time, there is no channel for discussing follow-up questions. The old adage in writing is “writing is rewriting”, and I very much believe that to be true in this case. Write an outline, then a first draft, then keep reading and revising it until it is just right. Maybe show it to someone you trust for feedback.

I personally also like to leave a draft and come back a few weeks later. This way I always have a few drafts for new articles ready for revision, until I feel that one is ready for publishing.

November 05, 2020

Gustaf Erikson (gerikson)

6,000 dead in Sweden November 05, 2020 01:58 PM

November 03, 2020

Wesley Moore (wezm)

Turning One Hundred Tweets Into a Blog Post November 03, 2020 12:40 AM

Near the conclusion of my #100binaries Twitter series I started working on the blog post that contained all the tweets. It ended up posing a number of interesting challenges and design decisions, as well as producing a couple of Rust binaries. Whilst I don't think the process was necessarily optimal I thought I'd share it to show my approach to solving the problem. Perhaps the tools used and approach taken are interesting to others.

My initial plan was to use Twitter embeds. Given a tweet URL it's relatively easy to turn it into some HTML markup. By including Twitter's embed JavaScript on the page the markup turns into a rich Twitter embed. However there were a few things I didn't like about this option:

  • The page was going to end up massive, even split across a couple of pages because the Twitter JS was loading all the images for each tweet up front.
  • I didn't like relying on JavaScript for the page to render media.
  • I didn't really want to include Twitter's JavaScript (it's likely it would be blocked by visitors with an ad blocker anyway).

So I decided I'd render the content myself. I also decided that I'd host the original screenshots and videos instead of saving them from the tweets. This was relatively time consuming as they were across a couple of computers and not named well but I found them all in the end.

To ensure the page wasn't enormous I used the loading="lazy" attribute on images. This is a relatively new attribute that tells the browser to delay loading of images until they're within some threshold of the view port. It currently works in Firefox and Chrome.

I used preload="none" on videos to ensure video data was only loaded if the visitor attempted to play it.

To prevent the blog post from being too long/heavy I split it across two pages.

Collecting All the Tweet URLs

With the plan in mind the first step was getting the full list of tweets. For better or worse I decided to avoid using any of Twitter's APIs that require authentication. Instead I turned to nitter (an alternative Twitter front-end) for its simple markup and JS free rendering.

For each page of search results for '#100binaries from:@wezm' I ran the following in the JS Console in Firefox:

tweets = []
document.querySelectorAll('.tweet-date a').forEach(a => tweets.push(a.href))
copy(tweets.join("\n"))

and pasted the result into tweets.txt in Neovim.

When all pages had been processed I turned the nitter.net URLs into twitter.com URLs: :%s/nitter\.net/twitter.com/.

This tells Neovim: for every line (%) substitute (s) nitter.net with twitter.com.

Turning Tweet URLs Into Tweet Content

Now I needed to turn the tweet URLs into tweet content. In hindsight it may have been better to use Twitter's GET statuses/show/:id API to do this (possibly via twurl) but that is not what I did. Onwards!

I used the unauthenticated oEmbed API to get some markup for each tweet. xargs was used to take a line from tweets.txt and make the API (HTTP) request with curl:

xargs -I '{url}' -a tweets.txt -n 1 curl https://api.twitter.com/1/statuses/oembed.json\?omit_script\=true\&dnt\=true\&lang\=en\&url\=\{url\} > tweets.json

This tells xargs to replace occurrences of {url} in the command with a line (-n 1) read from tweets.txt (-a tweets.txt).

The result of one of these API requests is JSON like this (formatted with jq for readability):

{
  "url": "https://twitter.com/wezm/status/1322855912076386304",
  "author_name": "Wesley Moore",
  "author_url": "https://twitter.com/wezm",
  "html": "<blockquote class=\"twitter-tweet\" data-lang=\"en\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">Day 100 of <a href=\"https://twitter.com/hashtag/100binaries?src=hash&amp;ref_src=twsrc%5Etfw\">#100binaries</a><br><br>Today I&#39;m featuring the Rust compiler — the binary that made the previous 99 fast, efficient, user-friendly, easy-to-build, and reliable binaries possible.<br><br>Thanks to all the people that have worked on it past, present, and future. <a href=\"https://t.co/aBEdLE87eq\">https://t.co/aBEdLE87eq</a> <a href=\"https://t.co/jzyJtIMGn1\">pic.twitter.com/jzyJtIMGn1</a></p>&mdash; Wesley Moore (@wezm) <a href=\"https://twitter.com/wezm/status/1322855912076386304?ref_src=twsrc%5Etfw\">November 1, 2020</a></blockquote>\n",
  "width": 550,
  "height": null,
  "type": "rich",
  "cache_age": "3153600000",
  "provider_name": "Twitter",
  "provider_url": "https://twitter.com",
  "version": "1.0"
}

The output from xargs is lots of these JSON objects all concatenated together. I needed to turn tweets.json into an array of objects to make it valid JSON. I opened up the file in Neovim and:

  • Added commas between the JSON objects: %s/}{/},\r{/g.
    • That is, substitute }{ with }, followed by a newline (\r) and {, multiple times (/g).
  • Added [ and ] to start and end of the file.

I then reversed the order of the objects and formatted the document with jq (from within Neovim): %!jq '.|reverse' -.

This filters the whole file through a command (%!). The command is jq and it filters the entire document ., read from stdin (-), through the reverse filter to reverse the order of the array. jq automatically pretty prints.

It would have been better to have reversed tweets.txt but I didn't realise they were in reverse chronological ordering until this point and doing it this way avoided making another 100 HTTP requests.

Rendering tweets.json

I created a custom Zola shortcode, tweet_list, that reads tweets.json and renders each item in an ordered list. It evolved over time as I kept adding more information to the JSON file. It allowed me to see how the blog post looked as I implemented the following improvements.

💡
You used Rust for this!?

This is the sort of thing that would be well suited to a scripting language too. These days I tend to reach for Rust, even for little tasks like this. It's what I'm most familiar with nowadays and I can mostly write a "script" like this off the cuff with little need to refer to API docs.

The markup Twitter returns is full of t.co redirect links. I wanted to avoid sending my visitors through the Twitter redirect so I needed to expand these links to their target. I whipped up a little Rust program to do this: expand-t-co. It finds all t.co links with a regex (https://t\.co/[a-zA-Z0-9]+) and replaces each occurrence with the target of the link.

The target URL is determined by making a HTTP HEAD request for the t.co URL and noting the value of the Location header. The tool caches the result in a HashMap to avoid repeating a request for the same t.co URL if it's encountered again.

I used the ureq crate to make the HTTP requests. Arguably it would have been better to use an async client so that more requests were made in parallel but that was added complexity I didn't want to deal with for a mostly one-off program.
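For illustration only, here is the same approach sketched in Python (the author's expand-t-co is a Rust program; the regex is the one mentioned above, and the function names are made up):

import re
from http.client import HTTPSConnection
from urllib.parse import urlparse

T_CO = re.compile(r"https://t\.co/[a-zA-Z0-9]+")
cache = {}  # t.co URL -> target URL, so each link is only resolved once

def expand(url):
    if url not in cache:
        parsed = urlparse(url)
        conn = HTTPSConnection(parsed.netloc)
        conn.request("HEAD", parsed.path)  # HEAD request, redirect not followed
        cache[url] = conn.getresponse().getheader("Location", url)
        conn.close()
    return cache[url]

def expand_t_co_links(html):
    return T_CO.sub(lambda m: expand(m.group(0)), html)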

Adding the Media

At this point I did a lot of manual work to find all the screenshots and videos that I shared in the tweets and added them to my blog. I also renamed them after the tool they depicted. As part of this process I noted the source of media files that I didn't create in a "media_source" key in tweets.json so that I could attribute them. I also added a "media" key with the name of the media file for each binary.

Some of the externally sourced images were animated GIFs, which lack playback controls and are very inefficient file size wise. Whenever I encountered an animated GIF I converted it to an MP4 with ffmpeg, resulting in large space savings:

ffmpeg -i ~/Downloads/so.gif -movflags faststart -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" so.mp4

This converts so.gif to so.mp4 and ensures the dimensions are divisible by 2, which is apparently a requirement of H.264 streams encapsulated in MP4. I worked out how to do this from: https://unix.stackexchange.com/a/294892/5444

I also wanted to know the media dimensions for each file so that I could have them scaled properly on the page — most images are HiDPI and need to be presented at half their pixel width to appear the right size.

For this I used ffprobe, which is part of ffmpeg. I originally planned to use another tool to handle images (as opposed to videos) but it turns out ffprobe handles them too.

Since I wanted to update the values of JSON objects in tweets.json I opted to parse the JSON this time. Again I whipped up a little Rust "script": add-media-dimensions. It parses tweets.json and for each object in the array runs ffprobe on the media file, like this:

ffprobe -v quiet -print_format json -show_format -show_streams file.mp4

I learned how to do this from: https://stackoverflow.com/a/11236144/38820

With this invocation ffprobe produces JSON so add-media-dimensions also parses that and adds the width and height values to tweets.json. At the end the updated JSON document is printed to stdout. This turned out to be a handy sanity check as it detected a couple of copy/paste errors and typos in the manually added "media" values.

The oEmbed markup that Twitter returns includes links for each piece of media. Now that I'm handling that myself these can be deleted. Neovim is used for this:

:%s/ <a href=\\"https:\/\/twitter\.com[^"]\+\(photo\|video\)[^"]\+">pic.twitter.com[^<]\+<\/a>//

For each line of the file (%) substitute (s) matches with nothing. And that took care of them. Yes I'm matching HTML with a regex, no you shouldn't do this for something that's part of a program. For one-off text editing it's fine though, especially since you can eyeball the differences with git diff, or in my case tig status.

Adding a HiDPI Flag

I initially tried using a heuristic in tweet_list to determine if a media file was HiDPI or not but there were a few exceptions to the rule. I decided to add a "hidpi" value to the JSON to indicate if it was HiDPI media or not. A bit of trial and error with jq led to this:

jq 'map(. + if .width > 776 then {hidpi: true} else {hidpi:false} end)' tweets.json > tweets-hidpi.json

If the image is greater than 776 pixels wide then set the hidpi property to true, otherwise false. 776 was picked via visual inspection of the rendered page. Once satisfied with the result I examined the rendered output and flipped the hidpi value on some items where the heuristic was wrong.

Adding alt Text

Di, ever my good conscience when it comes to such things, enquired at one point if I'd added alt text to the images. I was on the fence since the images were mostly there to show what the tools looked like — I didn't think they were really essential content — but she made a good argument for including some alt text even if it was fairly simplistic.

I turned to jq again to add a basic "media_description" to the JSON, which tweet_list would include as alt text:

jq 'map(. + {media_description: ("Screenshot of " + (.media // "????" | sub(".(png|gif|mp4|jpg)$"; "")) + " running in a terminal.")})' tweets.json > tweets-alt.json

For each object in the JSON array it adds a media_description key with a value derived from the media key (the file name with the extension removed). If the object doesn't have a media value then it is defaulted to "????" (.media // "????").

After these initial descriptions were added I went through the rendered page and updated the text of items where the description was incorrect or inadequate.

Video Poster Images

As it stood all the videos were just white boxes with playback controls since I had used preload="none" to limit the data usage of the page. I decided to pay the cost of the larger page weight and add poster images to each of the videos. I used ffmpeg to extract the first frame of each video as a PNG:

for m in *.mp4; do ffmpeg -i $m -vf "select=1" -vframes 1 $m.png; done

I learned how to do this from: https://superuser.com/a/1010108

I then converted the PNGs to JPEGs for smaller files. I could have generated JPEGs directly from ffmpeg but I didn't know how to control the quality — I wanted a relatively low quality for smaller files.

for f in *.mp4.png; do convert "$f" -quality 60 $f.jpg ; done

This produced files named filename.mp4.png.jpg. I'm yet to memorise how to manipulate file extensions in zsh, despite having been told how to do it, so I did a follow up step to rename them:

for f in *.mp4; do mv $f.png.jpg $f.jpg ; done

Wrapping Up

Lastly I ran pngcrush on all of the PNGs. It reliably reduces the file size in a lossless manner:

for f in *.png; do pngcrush -reduce -ow $f; done

With that I did some styling tweaks, added a little commentary and published the page.

If you made it this far, thanks for sticking with it to the end. I'm not sure how interesting or useful this post is but if you liked it let me know and I might do more like it in the future.

November 02, 2020

Wesley Moore (wezm)

One Hundred Rust Binaries - Page 2 November 02, 2020 02:00 AM

This is page two of my #100binaries list containing binaries 51–100. See the first page for the introduction and binaries 1–50.

  1. Screenshot of the set of images generated by color_blinder when applied to the Rust home page.
  2. Source: https://github.com/Szymongib/bookmark/blob/f46e5361878de972b7f0d11565fbecdb6a66bad9/assets/bookmark-demo.gif
  3. Screenshot of Artichoke running a small Ruby program in a terminal.
  4. Screenshot of csview rendering a sample CSV file in a terminal using default, reinforced, and rounded styles. Source: https://github.com/wfxr/i/blob/e04314806087faf8715a753e70f1a77f10b189d2/csview-screenshot.png
  5. Source: https://github.com/marcusbuffett/pipe-rename/blob/b734616bab4b4ca4f31de0902479202f33bda545/renamer.gif
  6. Screenshot of Cogsy running in a terminal. Source: https://github.com/cartoon-raccoon/cogsy/blob/8111b15243398cfe9cec990b88ed19f6155f8b37/images/screenshots/cogsy_main.png
  7. Screenshot of tiny connected to several IRC channels on chat.freenode.net in a terminal.
  8. Source: https://github.com/orf/ptail/blob/b26b089816cf3f495dae26ae0316c91f724667ce/images/readme.gif
  9. Screenshot of procs running in a terminal.
  10. Screenshot of vopono running in a terminal and two different browsers, one showing the VPN applied, the other not. Source: https://github.com/jamesmcm/vopono/blob/ef9653b80aea5f1695f9ca02b06e2ff340f1fae0/screenshot.png
  11. Source: https://github.com/tarkah/tickrs/blob/a5bc18a470999b5c18c98a7188a477c8e305652b/assets/demo.gif
  12. Source: https://github.com/orf/git-workspace/blob/8403c57edd172e925b682ee6220653db37dd616c/images/readme-example.gif
  13. Source: https://github.com/wfxr/i/blob/e04314806087faf8715a753e70f1a77f10b189d2/minimap-vim.gif
  14. Screenshot of kmon running in a terminal.
  15. Source: https://github.com/samtay/so/blob/93c13cdbf3fecaf23f21237ecee42d62f62905e0/assets/demo.gif
  16. Screenshot of lipl plotting the CPU temperature of my computer in a terminal.
  17. Screenshot of Cicero running in a terminal, displaying the graphemes of the text 'Rust Café 🦀' and rendering the R glyph in PragmataPro.
  18. Screenshot of battop running in a terminal.
  19. Screenshot of xxv running in a terminal. Source: https://chrisvest.github.io/xxv/screenshot.png
  20. Screenshot of indexa running in a terminal.
  21. Screenshot of shy running in a terminal. Source: https://github.com/xvxx/shy/blob/21555eb5259fd498d1d8fb4a4c39cf90a502f443/img/screen1.jpeg
  22. Screenshot of frawk running in a terminal.
  23. Screenshot of serial-monitor running in a terminal.
  24. Screenshot of gfold running in a terminal.
  25. Screenshot of fselect running in a terminal.
  26. Screenshot of lfs running in a terminal.
  27. Screenshot of dotenv-linter running in a terminal.
  28. Screenshot of bottom running in a terminal.
  29. Screenshot of the output of huniq -h in a terminal.
  30. Screenshot of cargo-wipe being run on my Projects directory in a terminal.
  31. Screenshot of terminal-typeracer running in a terminal.
  32. Screenshot of Audiobench. Source: https://joshua-maros.github.io/audiobench/book/images/default_patch.png
  33. Animated GIF of rust-sloth rendering a 3D model of Pikachu in a terminal.
  34. Screenshot of fhc running in a terminal.
  35. Screenshot of desed running in a terminal.
  36. Screenshot of silver running in a terminal.
  37. Screenshot of fnm running in a terminal.
  38. Screenshot of the waitfor documentation showing the various condition flags it accepts.
  39. Screenshot of rusty-tags running in a terminal.
  40. Screenshot of the SongRec GUI after recognising a few songs. There is album art on the left and a history of recognised songs on the right.
  41. Screenshot of ddh running in a terminal.
  42. Source: https://github.com/Nukesor/images/blob/72c983b374ea32b64e5997477693030001bdd7a6/pueue.gif
  43. The Rust logo. Source: https://upload.wikimedia.org/wikipedia/commons/thumb/d/d5/Rust_programming_language_black_logo.svg/1200px-Rust_programming_language_black_logo.svg.png

« Back to page 1

One Hundred Rust Binaries November 02, 2020 02:00 AM

I recently completed a #100binaries series on Twitter wherein I shared one open-source Rust tool or application each day, for one hundred days (Jul—Nov 2020). This post lists binaries 1–50. See page 2 for binaries 51–100.

All images and videos without an explicit source were created by me for the series. Most picture the Alacritty terminal emulator running on Linux. I use the PragmataPro font and my prompt is generated by vim-promptline. The colour scheme is Base16 Default Dark.

I also wrote a follow-up post about how this page was built and the considerations that went into making it as lightweight as possible: Turning One Hundred Tweets Into a Blog Post.

  1. Screenshot of hexyl running in a terminal.
  2. Screenshot of exa running in a terminal.
  3. Screenshot of Alacritty displaying the Alacritty logo.
  4. Screenshot of Amp editing Rust source code in a terminal.
  5. Screenshot of the output of running Tokei on the Allsorts repository.
  6. Output generated by Silicon for a small Rust program.
  7. Screenshot of broot running in a terminal.
  8. Screenshot of viu rendering Ferris the Rustacean in a terminal.
  9. Screenshot of Emulsion displaying an image of Ferris the Rustacean.
  10. Screenshot of rusty-man rendering the Allsorts docs in a terminal.
  11. Source: https://github.com/imsnif/diskonaut/blob/2cf5c7bd061f42443288e538ae75fedf7a846d76/demo.gif
  12. Source: https://user-images.githubusercontent.com/12150276/75177190-91d4ab00-572d-11ea-80bd-c5e28c7b17ad.gif
  13. Screenshot of dijo running in a terminal.
  14. Screenshot of pastel running in a terminal.
  15. Screenshot of DWFV running in a terminal.
  16. Screenshot of Zenith running in a terminal displaying CPU, memory, network, disk, and process information.
  17. Screenshot of the output of dtool --help in a terminal.
  18. Screenshot of Castor displaying the Gemini home page.
  19. Screenshot of the output of watchexec --help in a terminal.
  20. Screenshot of meli running in a terminal. Source: https://meli.delivery/images/screenshots/threads.webp
  21. Screenshot of delta running in a terminal.
  22. Screenshot of sharewifi running in a terminal.
  23. Screenshot of eva running in a terminal.
  24. Screenshot of bat showing some Rust code in a terminal.
  25. Screenshot of dust running in a terminal.
  26. Screenshot taken by shotgun of mdcat rendering the shotgun README in a terminal.
  27. Screenshot of ripgrep running in a terminal.
  28. Screenshot of mdcat rendering a sample Markdown document in a terminal.
  29. Source: https://github.com/hatoo/oha/blob/10b1dc0103c11e8144f3a61cbb481092d24a2062/demo.gif
  30. Source: https://starship.rs/demo.webm
  31. Source: https://raw.githubusercontent.com/foriequal0/git-trim/master/screencast.png
  32. Source: https://github.com/imsnif/bandwhich/blob/fde53ddb3bcb769bc3474ba3d739d268619bf138/demo.gif
  33. Screenshot of xsv running in a terminal.
  34. Screenshot of Shellcaster running in a terminal. Source: https://github.com/jeff-hughes/shellcaster/blob/f6cb4c55c4a6765483d7810a2b6d08a928e799e1/img/screenshot.png
  35. Screenshot of yj transforming a small YAML document into JSON in a terminal.
  36. Screenshot of tealdeer showing the tldr page for ls in a terminal.

Continue to page 2 »

November 01, 2020

Derek Jones (derek-jones)

The Weirdest people in the world November 01, 2020 11:13 PM

Western, Educated, Industrialized, Rich and Democratic: WEIRD people are the subject of Joseph Henrich’s latest book “The Weirdest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous”.

This book is in the mold of Jared Diamond’s Guns, Germs, and Steel: The Fates of Human Societies, but comes at the topic from a psychological/sociological angle.

This very readable book is essential reading for anyone wanting to understand how very different WEIRD people are, along with the societies they have created, compared to people and societies in the rest of the world today and the entire world up until around 500 years ago.

The analysis of WEIRD people/societies has three components: why we are different (I’m assuming that most of this blog’s readers are WEIRD), the important differences that are known about, and the cultural/societal consequences (the particularly prosperous in the subtitle is a big clue).

Henrich cites data to back up his theories.

Starting around 1,500 years ago the Catholic church started enforcing a ban on cousin marriage, which was an almost universal practice at the time and is still widely practiced in non-WEIRD societies. Over time the rules got stricter, until by the 11th century people were not allowed to marry anyone related out to their sixth cousin. The rules were not always strictly enforced, as Henrich documents, but the effect was to change the organization of society from being kin-based to being institution-based (in particular institutions such as the Church and state). Finding a wife/husband required people to interact with others outside their extended family.

Effects claimed, operating over centuries, of the shift from extended families to nuclear families are that people learned what Henrich calls “impersonal prosociality”, e.g., feeling comfortable dealing with strangers. People became more altruistic, the impartial rule of law spread (including democracy and human rights), plus other behaviors needed for the smooth running of large social units (such as towns, cities and countries).

The overall impact was that social units of WEIRD people could grow to include tens of thousands, even millions, of people, and successfully operate at this scale. Information about beneficial inventions could diffuse rapidly and people were free(ish) to try out new things (i.e., they were not held back by family customs), and operating in a society with free movement of people there were lots of efficiencies, e.g., companies were not obligated to hire family members, and could hire the best person they could find.

Consequently, the West got to take full advantage of scientific progress, inventing and mass producing stuff, outcompeting the non-WEIRD world.

The big ideas kind of hang together. Some of the details seem like a bit of a stretch, but I’m no expert.

My WEIRD story occurred about five years ago, when I was looking for a publisher for the book I was working on. One interested editor sent out an early draft for review. One of the chapters discusses human cognition, and I pointed out that it did not matter that most psychology experiments had been done using WEIRD subjects, because software developers were WEIRD (citing Henrich’s 2010 WEIRD paper). This discussion of WEIRD people was just too much for one of the reviewers, who sounded like he was foaming at the mouth when reviewing my draft (I also said a few things about academic researchers that upset him).

Gustaf Erikson (gerikson)

October 31, 2020

Patrick Louis (venam)

What Does It Take To Resolve A Hostname October 31, 2020 10:00 PM

slide1

Can also be found in presentation format here

Resolving A Name Is Complex

slide2

Resolving a domain name is complex. It’s not limited to the DNS, the Domain Name System — A decentralized and hierarchical system to associate names and other information to IP addresses.
It’s not something we, as users, usually pay attention to. We notice it only when we’re facing an issue. It normally works out of the box but really nobody gets the crux.
You search online for clarifications but they barely help and add more confusion.

Here are some schemas trying to decipher the mystery that domain name resolution came to be.

slide3

One, two, and three, I think you get me, it is not easy. It’s never as simple as taking a hostname as a string, getting the DNS address in the /etc/resolv.conf config, then sending a request to the DNS on port 53 to be greeted back with the IP.
Behind the scenes there are tons of files and libraries involved, all of this to get a domain name solved.

So in this talk we’ll try to create some order, to try to understand things as an end-user. Let’s find the sense and reason behind this mess even if, I have to say, I don’t get it much myself.
I can’t claim I haven’t made mistakes but if I did, please correct me, that would be great!

NIH

slide4

Let’s start with the misfits, the ones that don’t follow the rules, the not-invented-here syndrome found within our tools.
When it comes to DNS resolution, there’s no one-size-fits-all solution. Obviously, many of us don’t want to deal with all the complexity, so we say, “let’s pack these bytes ourselves, and forget the hassle”.
That’s pure heresy though. We’d prefer everything to work the same way, so that it’s easier to follow. It would be preferable that they all use the same lib, to all have the same behavior. That is, in our case to rely on the C standard lib, or the POSIX API our savior.

In all cases, let’s note some software that don’t rely on it, as we said, all the misfits.

  • The ISC/BSD BIND tools: from host, to dig, to drill, to nslookup, and more, used for debugging chores.
  • Firefox/Chrome/Chromium: Then there are the browsers, because they are one of a kind, bypassing the libc and POSIX mechanisms, implementing their own DNS API for performance reasons and perfectionism.
  • Any applications needing advanced DNS features, other than simple name to IP.
  • Languages that don’t wrap around a libc: The Go programming language comes to mind. It implements its own resolver API.

Fortunately, I can ease your mind by letting you know that all of these will at least respect /etc/resolv.conf and /etc/hosts configurations. Files that we’ll see in the next sections.

Historic

slide5

I’ve taken a look at over a dozen different technologies and I think the best way to understand them is through their archaeologies. There’s a lot that can be explained about DNS resolution simply based on all the historic reasons.
The main thing you need to understand is that there’s not a single clean library call to resolve a hostname. Standards and new specs have piled up over the years, with some software that hasn’t followed, risking to disappear.

Overall, libc and POSIX provide multiple resolution APIs:

  • There’s the historic, low level one provided by the ISC/BSD BIND resolver implementation within libc. Accessed through the libresolv/resolv.h incantation.
  • The gethostbyname(3) and related functions, implementing an obsolete POSIX C specification.
  • The getaddrinfo(3), that is the modern POSIX C API for name resolution.

All these combinations, ladies and gentlemen, are the standard ways to resolve a name.
Newer applications will use getaddrinfo while older ones will use gethostbyname. Both of these will often rely on something called NSS and another part to manage resolv.conf access.

Now let’s dive into each of these and you’ll get them like a breeze.

resolver(3)

slide6

The resolver layer is the oldest and most stable in our quest. It originates from 1983, today almost 37 years ago, at Berkeley University.

It comes from a project called BIND, Berkeley Internet Name Domain, which was sponsored by a DARPA grant. And like the Berkeley socket that gave rise to the internet, it has now turned into much much pain.
It was the very first implementation of the DNS specifications. It got released in BSD4.3 and today the BIND project is maintained by the Internet Systems Consortium, aka ISC.

It not only offers servers and clients, and the debug tools which we mentioned earlier, but also offers a library called “libbind”. This library is the de facto implementation, the standard resolver, the one of a kind. It is initially based on all the original RFC discussions, namely RFC 881, 882, and 883.
The BSD people wrote technical papers assessing its feasibility, and went on recommending and implementing it within BSD.

At that point BIND wasn’t a standard yet, it was an optionally-compiled code for those who wanted to get their feet wet, those who wanted to try DNS.
Then it became part of the C standard library interface through resolver, libresolv, -lresolv, resolv.h, and closed the case.

If you take a look at most Unix-like systems today, from MacOS, to OpenBSD, to Linux, and company, you’ll see clearly in resolv.h the copyright going back to 1983, to that very date. But obviously, it depends on the choice of the implementer, a case by case.

So then the code diverged, there’s the libresolv provided by the C standardization and the libbind provided by the BIND implementation. However, most Unixes only add small changes specific to their needs. For example, resolver in glibc is baselined off libbind from BIND version 8.2.3.

This layer is normally used for low level DNS interactions because it’s missing the goodies we’ll see later in this presentation.

Now let’s talk about environments and configurations.

The resolver configuration file

The resolver configuration files were mentioned in BIND’s first release, in section 4.2.2.2 of “The Design and Implementation of ‘Domain Name Resolver’” by Mark Painter, based on RFC883, part of the DNS RFC series.

This particular file being /etc/resolv.conf, you’ll see it hardcoded in resolv.h and if that file is missing, it’ll fall back to the localhost as the DNS, just to be safe.
Additionally, there’s /etc/host.conf, according to the manpage also “the resolver configuration file”, so appropriately named. It’s a conf that dictates the workings of /etc/hosts, the “static table lookup for hostnames”.

So what’s in these files?
resolv.conf takes care of how to resolve names and which nameserver to use for that, while hosts simply has a list of known host aliases, ip + name, as simple as that.

Within resolv.conf you can also have a search list of domains. That is, if a name you’re searching for doesn’t have the minimum number of dots in it, then it’ll append one of these search domains to it and keep searching until it finds something that fits.
This can also be manipulated via the LOCALDOMAIN environment variable.

$ echo 'example www.example.com' > ./host_aliases
$ HOSTALIASES="./host_aliases" getent hosts example
93.184.216.34   www.example.com

There can also be a sortlist IP netmask, for when there’s many results to match but you don’t want to give priority to the cloud VPS that lives only for cash.

Finally, there’s the options field, also overridden by the RES_OPTIONS environment variable. It manipulates the minimum number of dots we mentioned and also, if you want, can set debug as enabled.

Meanwhile, the hosts file is but a key-value db, simply made of domain names and IPs.

Its config also lets you change the order of results and for the rest you have host.conf to consult.

So remember that all of these are mostly used everywhere because it’s the lowest layer. So it’s used by libbind and libresolv, but also by the custom NIH-syndrome software we mentioned.

Alright, so far that’s all classic clean stuff. Let’s move on to the next sections, you’ll scratch your head until there’s no dandruff.

gethostbyname(3) and getaddrinfo(3)

slide7

The C library POSIX specs create a superset over the C standard library. They add a few simpler calls to resolve hostnames and make it easy. These focus on returning A and AAAA records only, ipV4 and ipV6 respectively.
There’s gethostbyname(3), which is deprecated, and there’s the newer getaddrinfo(3) defined in IEEE Std 1003.1g-2000, which mainly adds RFC3493, aka ipV6 support. So applications are recommended to use this updated version unless they want to divert from the mainland.

There are functions to resolve IP addresses to host names, but let’s focus only on name to ip for today, I know it’s lame.

Apart from ipV6 support being added, some internal structures have been updated as they weren’t so safe between subsequent calls and thus could be your demise and your fall.

Obviously they both return different structures.

A hostent struct is returned to the gethostbyname function caller, while getaddrinfo returns an addrinfo structure, both being defined in the netdb.h header.

struct hostent {
	char  *h_name;            /* official name of host */
	char **h_aliases;         /* alias list */
	int    h_addrtype;        /* host address type */
	int    h_length;          /* length of address */
	char **h_addr_list;       /* list of addresses */
};
struct addrinfo {
	int              ai_flags;
	int              ai_family;
	int              ai_socktype;
	int              ai_protocol;
	socklen_t        ai_addrlen;
	struct sockaddr *ai_addr;
	char            *ai_canonname;
	struct addrinfo *ai_next;
};
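As a quick way to see what such a lookup returns without writing C, Python's socket.getaddrinfo is a thin wrapper over the libc getaddrinfo(3) described here (a minimal sketch; www.example.com is just a placeholder host):

import socket

for family, socktype, proto, canonname, sockaddr in socket.getaddrinfo(
        "www.example.com", 443, proto=socket.IPPROTO_TCP):
    # sockaddr is (ip, port) for IPv4 and (ip, port, flowinfo, scope_id) for IPv6
    print(family, sockaddr)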

Some libc implementations will get fancy and add their own modified versions of gethostbyname. For instance in glibc they add support for ipV6 in their modified gethostbyname2 for backward compatibility.

Regarding configuration files, getaddrinfo will consult /etc/gai.conf which takes care of the precedence of the addresses returned in the results. And now, you’re going to brandish your torch yelling at me “but resolver(3) already does that by default”. But I’ll let you know that resolver(3) is only interested in DNS calls, while these two POSIX functions in their egocentrism are more interested in all the ways, files, and mechanisms that a name can be converted to an IP.
That is, they often rely on something called NSS which is what we’ll see in our next analysis.

nss(5)

slide8

Both gethostbyname(3) and getaddrinfo(3) will most likely rely on the NSS service, but what is NSS, aka the Name Service Switch?
First of all it is not to be confused with “Network Security Services”, which has the same acronym but has a lib called libnss. In our case it’s -lnetdb, with the netdb.h header, so keep this in mind for later.

To understand what NSS is, we, again, have to go back in time, back when the tech was still in its prime.
There has always been the idea of sharing configurations between machines; however, back in the day it was all hardcoded, with the exception of Ultrix.
Hardcoded in files like aliases for email, /etc/hosts for local domains, the finger database, and all that it entails. The idea dates back so far that the netdb.h header was almost always there, but it only looked in the files we just mentioned.

There are also a bunch of POSIX functions to get these values: getservbyname, gethostent, gethostbyname, getservbyport, etc. I think you can continue the list.

From that point on we needed something more flexible, and so Solaris said: let’s not have it hardcoded, that’s not acceptable. Let’s create something called the Yellow Pages, a sort of phone book for configuration brokerage. But the name Yellow Pages had legal issues, so let’s go with NIS, the Network Information Service.
Other Unices liked what they were doing in their business, so they reproduced it in something called NSS. Though NSS, the Name Service Switch, is much simpler than NIS.

Let’s have a side note about OpenBSD, which doesn’t implement NSS but has a pseudo-NIS, something called ypserv(8): the Yellow Pages written by Theo de Raadt from scratch, but he doesn’t care about the legal-name wrath.

On OpenBSD you can also find the nsdispatch(3) function, the name-service switch dispatcher, something similar to NSS. But I’m not sure, I’ll recheck my citations.

So let’s summarize: NSS grew out of the need for a client-server directory service whose role is to distribute system config between different computers, to keep them harmonized. It is more flexible than the fixed files in libc and POSIX, and is arguably like LDAP, or ZooKeeper, if you know them. Or actually, like any modern way to share configs between containers and microservices.

“But what does it have to do with domain names?”, you may ask. Well, a map of names to IPs is a config like any other, so it’s the same task. That also includes things like hosts, passwords, ports, aliases, and groups. Yep, it’s quite the big soup.

Apart from the functions in POSIX, there is a command-line utility that goes by the name of getent, which lets you access NSS facilities to do simple queries for its entries.

So, for example, you can get a service port based on the name of that service. Yes, simply the name suffices.

> getent  services domain   
domain                53/tcp

This particular module will read the /etc/services file. NSS is quite versatile.

We can obviously query for a hostname which is our main game.

And note that you can disable the IDN encoding too. Remember all the domain name tricks we did on the forums, all that voodoo.

getent -i hosts 𝕟𝕚𝕩𝕖𝕣𝕤.𝕟𝕖𝕥 
getent  hosts 𝕟𝕚𝕩𝕖𝕣𝕤.𝕟𝕖𝕥 
#  178.62.236.80   STREAM nixers.net
#  178.62.236.80   DGRAM  
#  178.62.236.80   RAW  

So how does NSS actually work, how does it do the resolving? The NSS library consults the /etc/nsswitch.conf and /etc/default/nss files and, depending on the entries, will try each source sequentially until it’s satisfied, until it finds what it wants, until it meets the demand.

You’ll find the “hosts” entry in this file, along with a list of strings on its right.

hosts: files mymachines myhostname resolve [!UNAVAIL=return] dns

These strings are the modules which will be dynamically loaded and sequentially executed; the format even allows conditional rules to be appended.
Here, for instance, the lookup only falls through to dns when the resolve plugin is unavailable on my machine; any other status from resolve is final.

To get a list of all modules, you can look in your lib directory mess for anything that starts with libnss_.

 /usr/lib > ls libnss_*
libnss_compat-2.32.so  libnss_dns-2.32.so    libnss_hesiod-2.32.so   libnss_systemd.so.2
libnss_compat.so       libnss_dns.so         libnss_hesiod.so        libnss_winbind.so
libnss_compat.so.2     libnss_dns.so.2       libnss_hesiod.so.2      libnss_winbind.so.2
libnss_db-2.32.so      libnss_files-2.32.so  libnss_myhostname.so.2  libnss_wins.so
libnss_db.so           libnss_files.so       libnss_mymachines.so.2  libnss_wins.so.2
libnss_db.so.2         libnss_files.so.2     libnss_resolve.so.2

The most common modules are the following: files, dns, nis, myhostname, and resolve (for systemd-resolved).

  • files: reads a local file, for the hosts database that’s /etc/hosts; no polling or anything
  • dns: will try to resolve the name remotely, in this case yes, it’s pulling it
  • nis: to use Solaris-style YP/NIS
  • myhostname: resolves the local system hostname (and localhost) without it having to appear in /etc/hosts, in case you missed it
  • resolve: the systemd-resolved plugin, yes, don’t put me on a crucifix

And there’s a bunch of others, in case you’re in the mood to be a crusader.

Let’s open a parenthesis on the resolve plugin, before you throw it quickly in the dustbin. It’s quite advanced, with multiple features from caching, to DNSSEC validation, to a resolvconf interface, as well as being an NSS plugin. When used as an NSS plugin, you communicate with systemd-resolved via D-Bus; otherwise it also listens on a local stub address on port 53 as a fallback, in case you didn’t use NSS.

You can consult its ResolveHostname() method, part of the org.freedesktop.resolve1.Manager D-Bus interface.

Now let’s move to something else, something you haven’t thought of yet.

resolvconf(8)

slide9

As we said, resolv.conf is used by all these components, but not only them: also by all the network agents. Those are in charge of setting or changing the DNS address, each of them: the DHCP client, the PPP daemon, the VPN manager, the network manager, they all want access. And what about having two network connections concurrently, each requiring its own separate DNS, obviously.

So everyone wants to use the resolv.conf file, thus we need a manager to handle it. We want to avoid an inconsistent state; it’s vital not to let everyone mess with it, and that is resolvconf(8)’s role.
Anyone wanting to change resolv.conf should instead go through resolvconf to avoid the hassle, using its resolvconf command-line executable. Similarly to the resolv.conf configuration, you can pass anything to it, like domain, search, and options.

resolvconf -a eth0.dhclient << EOF
nameserver 10.0.0.42
nameserver 10.0.1.42
EOF

Now resolv.conf is rarely a plain normal file itself because the manager finds it easier to create a symbolic link and avoid the abusiveness. The default implementation has it in /run/resolvconf/name-interface/resolv.conf.

Accordingly, like any other tooling, resolvconf has configuration files in /etc/resolvconf.conf, and a directory with hooks in /etc/resolvconf/. Within these files you can mention if you want the symlink to be at another location.

resolv_conf=/var/adsuck/resolv.conf

I’m saying “default implementation” because, like anything else on a system, you can replace it with your own concoction. Two popular alternative solutions to this problem are openresolv and systemd-resolved, which we mentioned earlier.

So resolv.conf is rarely a file, it’s more often a symlink; check all of these for example, you’ll be surprised I think.

/run/resolvconf/resolv.conf
/run/systemd/resolve/stub-resolv.conf
/run/systemd/resolve/resolv.conf
/var/run/NetworkManager/resolv.conf
/var/run/NetworkManager/no-stub-resolv.conf

Caching

slide10

In computers you can make anything faster with another level of indirection. That’s what all cache mechanisms try to offer, and domain name resolving is no exception.
There are two places where caching is available: either through a local DNS proxy, or through something called nscd. Just remember that this last one isn’t very stable.

Let’s start with nscd, which is an NSS proxy, so it not only caches DNS queries but also anything related to getting an NSS entry.
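Its caching behaviour lives in /etc/nscd.conf; as a rough sketch (the values here are just examples, defaults vary by distribution), the hosts-related knobs look something like this:

enable-cache            hosts   yes
positive-time-to-live   hosts   3600
negative-time-to-live   hosts   20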

The other caching method is to run your own local dns server, be it bind9, djbdns, dnscache, lwresd, dnscrypt-proxy or any other resolver.
These can either be full-featured, bells and whistles, or only provide a lightweight caching proxy if you don’t feel like you want the details.

Another reason to run such service would be to block ads and all their malice.

Also, just beware of flushing the cache, otherwise you’ll get surprises that will make you crash.

How To Debug

slide11

So now you sort of know that it all depends on what everything uses. Once you get that, you can start an analysis.

You can use a BIND tool to debug if DNS is the fool, or simply do a Wireshark trace if you don’t want to bother or these are not under your grace.
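For example, dig from the BIND tools lets you query a specific server directly and compare its answer with what the rest of the stack returns; the 1.1.1.1 server here is just an example:

dig nixers.net A
dig @1.1.1.1 nixers.net A +short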

You can also check which NSS plugin is loaded and make sure it isn’t aborted.

ltrace -e "*gethostbyname*@libnss*" getent hosts www.example.com

Remember that each tool can have its own configuration, so it adds complexity to the equation.

Big Picture

slide12

Let’s conclude here.
You should now be comfortable with anything in the domain name resolution sphere. It’s all about shared config management, like ZooKeeper, LDAP, and these other arrangements.

I hope you’ve learned a thing or two and that domain name resolution is less of a taboo.
Thanks for listening and have a nice evening.

slide13

References

Gokberk Yaltirakli (gkbrk)

How I keep track of what I’ve been working on October 31, 2020 09:00 PM

Especially during busy times, it is possible to forget the projects I’ve been working on. While I tend to remember the big ones, some small projects slip away from memory. This is troubling when someone asks if I’ve been working on anything interesting recently, or if I feel like I haven’t been productive. Seeing how many things I managed to work on can be a good morale booster.

This problem became more apparent recently when I started to publish “Status Update” blog posts, in which I write short notes about the projects I’ve been working on. Instead of looking through used to-do lists or diaries, I found a more effective solution using the POSIX tool find, specifically the -mtime flag.

When you call find with -mtime, it searches for files based on their modification dates. Here is the snippet I use to find the files I’ve worked on in the last month.

find ~/projects -mtime -30

The parameter -30 stands for the last 30 days; it can be modified as you wish. For example, -7 would filter for the last week.
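If the project tree contains noise like .git internals, the match can be narrowed further; this variant (GNU find syntax, and the exclusion pattern is just an example) lists only regular files and skips anything under a .git directory:

find ~/projects -mtime -30 -type f -not -path '*/.git/*'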

October 29, 2020

Maxwell Bernstein (tekknolagi)

Compiling a Lisp: Labelled procedure calls October 29, 2020 08:00 AM


Welcome back to the Compiling a Lisp series. Last time, we learned about Intel instruction encoding. This time, we’re going to use that knowledge to compile procedure calls.

The usual function expression in Lisp is a lambda — an anonymous function that can take arguments and close over variables. Procedure calls are not this. They are simpler constructs that just take arguments and return values.

We’re adding procedure calls first as a stepping stone to full closure support. This will help us get some kind of internal calling convention established and stack manipulation figured out before things get too complicated.

After this post, we will be able to support programs like the following:

(labels ((add (code (x y) (+ x y)))
         (sub (code (x y) (- x y))))
    (labelcall sub 4 (labelcall add 1 2)))
; => 1

and even this snazzy factorial function:

(labels ((factorial (code (x) 
            (if (< x 2) 1 (* x (labelcall factorial (- x 1)))))))
    (labelcall factorial 5))
; => 120

These are fairly pedestrian snippets of code but they demonstrate some new features we are adding, like:

  • A new labels form that all programs will now have to look like
  • A new code form for describing procedures and their parameters
  • A new labelcall expression for calling procedures

Ghuloum does not explain why he does this, but I imagine that the labels form was chosen over allowing multiple separate top-level bindings because it is easier to parse and traverse.

Big ideas

In order to compile a program, we are going to traverse every binding in the labels. For each binding, we will generate code for each code object.

Compiling code objects requires making an environment for their parameters. We’ll establish a calling convention later so that our compiler knows where to find the parameters.

Then, once we’ve emitted all the code for the bindings, we will compile the body. The body may, but is not required to, contain a labelcall expression.

In order to compile a labelcall expression, we will compile all of the arguments provided, save them in consecutive locations on the stack, and then emit a call instruction.

When all of these pieces come together, the resulting machine code will look something like this:

mov rsi, rdi  # prologue
label0:
  label0_code
label1:
  label1_code
main:
  main_code

You can see that all of the code objects will be compiled in sequence, followed by the body of the labels form.

Because I have not yet figured out how to start executing somewhere other than the beginning of the generated code, and because I don't store generated code in any intermediate buffers, and because we don't know the sizes of any code in advance, I do this funky thing where I emit a `jmp` to the body code. If you, dear reader, have a better solution, please let me know.

Edit: jsmith45 gave me the encouragement I needed to work on this again. It turns out that storing the code offset of the beginning of main_code (the labels body) and adding that to buf->address works just fine. I’ll explain more below.

A calling convention

We’re not going to use the System V AMD64 ABI. That calling convention requires that parameters are passed first in certain registers, and then on the stack. Instead, we will pass all parameters on the stack.

This makes our code simpler, but it also means that at some point later on, we will have to add a different kind of calling convention so that we can call foreign functions (like printf, or exit, or something). Those functions expect their parameters in registers. We’ll worry about that later.

If we borrow and adapt the excellent diagrams from the Ghuloum tutorial, this means that right before we make a procedure call, our stack will look like this:

               Low address

           |   ...            |
           +------------------+
           |   ...            |
           +------------------+
      +->  |   arg3           | rsp-56
  out |    +------------------+
  args|    |   arg2           | rsp-48
      |    +------------------+
      +->  |   arg1           | rsp-40
           +------------------+
           |                  | rsp-32
           +------------------+
      +->  |   local3         | rsp-24
      |    +------------------+
locals|    |   local2         | rsp-16
      |    +------------------+
      +->  |   local1         | rsp-8
           +------------------+
  base     |   return point   | rsp

               High address

Stack illustration courtesy of Leonard.

You can see the first return point at [rsp]. This is the return point placed by the caller of the current function.

Above that are whatever local variables we have declared with let or perhaps are intermediate values from some computation.

Above that is a blank space reserved for the second return point. This is the return point for the about-to-be-called function. The call instruction will fill it in after evaluating all the arguments.

Above the return point are all the outgoing arguments. They will appear as locals for the procedure being called.

Finally, above the arguments, is untouched free stack space.

The call instruction decrements rsp and then writes to [rsp]. This means that if we just emitted a call, the first local would be overwritten. No good. Worse, the way the stack would be laid out would mean that the locals would look like arguments.

In order to solve this problem, we need to first adjust rsp to point to the last local. That way the decrement will move it below the local and the return address will go between the locals and the arguments.

After the call instruction, the stack will look different. Nothing will have actually changed, except for rsp. This change to rsp means that the callee has a different view:

               Low address

           |   ...            |
           +------------------+
           |   ...            |
           +------------------+
      +->  |   arg3           | rsp-24
  in  |    +------------------+
  args|    |   arg2           | rsp-16
      |    +------------------+
      +->  |   arg1           | rsp-8
           +------------------+
  base     |   return point   | rsp
           +------------------+
           |   ~~~~~~~~~~~~   |
           +------------------+
           |   ~~~~~~~~~~~~   |
           +------------------+
           |   ~~~~~~~~~~~~   |
           +------------------+
           |   ~~~~~~~~~~~~   |

               High address

Stack illustration courtesy of Leonard.

The colored-in spaces below the return point indicate that the values on the stack are “hidden” from view, since they are above (at higher addresses than) [rsp]. The called function will not be able to access those values.

If the called function wants to use one of its arguments, it can pull it off the stack from its designated location.

One unfortunate consequence of this calling convention is that Valgrind does not understand it. Valgrind cannot understand that the caller has placed data on the stack specifically for the callee to read it, and thinks this is a move/jump of an uninitialized value. This means that we get some errors now on these labelcall tests.

Eventually, when the function returns, the ret instruction will pop the return point off the stack and jump to it. This will bring us back to the previous call frame.

That’s that! I have yet to find a good tool that will let me visualize the stack as a program is executing. GDB probably has a mode hidden away somewhere undocumented that does exactly this. Cutter sort of does, but it’s finicky in ways I don’t really understand. Maybe one day Kartik’s x86-64 Mu fork will be able to do this.

Building procedure calls in small pieces

In order for this set of changes to make sense, I am going to explain all of the pieces one at a time, top-down.

First, we’ll look at the new-and-improved Compile_entry, which has been updated to handle the labels form. This will do the usual Lisp entrypoint setup and some checks about the structure of the AST.

Then, we’ll actually look at compiling the labels. This means going through the bindings one-by-one and compiling their code objects.

Then, we’ll look at what it means to compile a code object. Hint: it’s very much like let.

Last, we’ll tie it all together when compiling the body of the labels form.

Compiling the entrypoint

Most of this code is checking. What used to just compile an expression now validates that what we’ve passed in at least vaguely looks like a well-formed labels form before picking it apart into its component parts: the bindings and the body.

int Compile_entry(Buffer *buf, ASTNode *node) {
  assert(AST_is_pair(node) && "program must have labels");
  // Assume it's (labels ...)
  ASTNode *labels_sym = AST_pair_car(node);
  assert(AST_is_symbol(labels_sym) && "program must have labels");
  assert(AST_symbol_matches(labels_sym, "labels") &&
         "program must have labels");
  ASTNode *args = AST_pair_cdr(node);
  ASTNode *bindings = operand1(args);
  assert(AST_is_pair(bindings) || AST_is_nil(bindings));
  ASTNode *body = operand2(args);
  return Compile_labels(buf, bindings, body, /*labels=*/NULL);
}

Compile_entry dispatches to Compile_labels for iterating over all of the labels. Compile_labels is a recursive function that keeps track of all the labels so far in its arguments, so we start it off with an empty labels environment.

Compiling labels

In Compile_labels, we have first a base case: if there are no labels we should just emit the body.

int Compile_labels(Buffer *buf, ASTNode *bindings, ASTNode *body,
                   Env *labels) {
  if (AST_is_nil(bindings)) {
    buf->entrypoint = Buffer_len(buf);
    // Base case: no bindings. Compile the body
    Buffer_write_arr(buf, kEntryPrologue, sizeof kEntryPrologue);
    _(Compile_expr(buf, body, /*stack_index=*/-kWordSize, /*varenv=*/NULL,
                   labels));
    Buffer_write_arr(buf, kFunctionEpilogue, sizeof kFunctionEpilogue);
    return 0;
  }
  // ...
}

We also set the buffer entrypoint location to the position where we’re going to emit the body of the labels. We’ll use this later when executing, or later in the series when we emit ELF binaries. You’ll have to add a field word entrypoint to your Buffer struct.

We pass in an empty varenv, since we are not accumulating any locals along the way; only labels. For the same reason, we give a stack_index of -kWordSize — the first slot.

If we do have labels, on the other hand, we should deal with the first label. This means:

  • pulling out the name and the code object
  • binding the name to the code location (the current location)
  • compiling the code

And then from there we deal with the others recursively.

int Compile_labels(Buffer *buf, ASTNode *bindings, ASTNode *body,
                   Env *labels) {
  // ....
  assert(AST_is_pair(bindings));
  // Get the next binding
  ASTNode *binding = AST_pair_car(bindings);
  ASTNode *name = AST_pair_car(binding);
  assert(AST_is_symbol(name));
  ASTNode *binding_code = AST_pair_car(AST_pair_cdr(binding));
  word function_location = Buffer_len(buf);
  // Bind the name to the location in the instruction stream
  Env entry = Env_bind(AST_symbol_cstr(name), function_location, labels);
  // Compile the binding function
  _(Compile_code(buf, binding_code, &entry));
  return Compile_labels(buf, AST_pair_cdr(bindings), body, &entry);
}

It’s important to note that we are binding before we compile the code object and we are making the code location available before it is compiled! This means that code objects can reference themselves and even recursively call themselves.

Since we then pass that binding into labels for the recursive call, it also means that labels can access all labels defined before them, too.

Now let’s figure out what it means to compile a code object.

Compiling code

I split this into two functions: one helper that pulls apart code objects (I didn’t want to do that in labels because I thought it would clutter the meaning), and one recursive function that does the work of putting the parameters in the environment.

So Compile_code just pulls apart the (code (x y z ...) body) into the formal parameters and the body. Since Compile_code_impl will need to recursively build up information about the stack_index and varenv, we supply those.

int Compile_code(Buffer *buf, ASTNode *code, Env *labels) {
  assert(AST_is_pair(code));
  ASTNode *code_sym = AST_pair_car(code);
  assert(AST_is_symbol(code_sym));
  assert(AST_symbol_matches(code_sym, "code"));
  ASTNode *args = AST_pair_cdr(code);
  ASTNode *formals = operand1(args);
  ASTNode *code_body = operand2(args);
  return Compile_code_impl(buf, formals, code_body, /*stack_index=*/-kWordSize,
                           /*varenv=*/NULL, labels);
}

I said this would be like let. What I meant by that was that, like let bodies, code objects have “locals” — the formal parameters. We have to bind the names of the parameters to successive stack locations, as per our calling convention.

In the base case, we do not have any formals, so we compile the body:

int Compile_code_impl(Buffer *buf, ASTNode *formals, ASTNode *body,
                      word stack_index, Env *varenv, Env *labels) {
  if (AST_is_nil(formals)) {
    _(Compile_expr(buf, body, stack_index, varenv, labels));
    Buffer_write_arr(buf, kFunctionEpilogue, sizeof kFunctionEpilogue);
    return 0;
  }
  // ...
}

We also emit this function epilogue, which right now is just ret. I got rid of the push rbp/mov rbp, rsp/pop rbp dance because we switched to using rsp only instead. I alluded to this in the previous instruction encoding interlude post.

In the case where we have at least one formal parameter, we bind the name to the stack location and go on our merry way.

int Compile_code_impl(Buffer *buf, ASTNode *formals, ASTNode *body,
                      word stack_index, Env *varenv, Env *labels) {
  // ...
  assert(AST_is_pair(formals));
  ASTNode *name = AST_pair_car(formals);
  assert(AST_is_symbol(name));
  Env entry = Env_bind(AST_symbol_cstr(name), stack_index, varenv);
  return Compile_code_impl(buf, AST_pair_cdr(formals), body,
                           stack_index - kWordSize, &entry, labels);
}

That’s it! That’s how you compile procedures.

Compiling labelcalls

What use are procedures if we can’t call them? Let’s figure out how to compile procedure calls.

Code for calling a procedure must put the arguments and return address on the stack precisely how the called procedure expects them.

Getting this contract right can be tricky. I spent several frustrated hours getting this to not crash. Then, even though it didn’t crash, it returned bad data. It turns out that I was overwriting the return address by accident and returning to someplace strange instead.

Making handmade diagrams that track the changes to rsp and the stack really helps with understanding calling convention bugs.

We’ll start off by dumping yet more code into Compile_call. This code will look for something of the form (labelcall name ...).

Before calling into a helper function Compile_labelcall, we get two bits of information ready:

  • arg_stack_index, which is the first place on the stack where args are supposed to go. Since we’re skipping a space for the return address, this is one more than the current (available) slot index.
  • rsp_adjust, which is the amount that we’re going to have to, well, adjust rsp. Without locals from let or incoming arguments from a procedure call, this will be 0. With locals and/or arguments, this will be the total amount of space taken up by those.

Then we call Compile_labelcall.

int Compile_call(Buffer *buf, ASTNode *callable, ASTNode *args,
                 word stack_index, Env *varenv, Env *labels) {
    // ...
    if (AST_symbol_matches(callable, "labelcall")) {
      ASTNode *label = operand1(args);
      assert(AST_is_symbol(label));
      ASTNode *call_args = AST_pair_cdr(args);
      // Skip a space on the stack to put the return address
      word arg_stack_index = stack_index - kWordSize;
      // We enter Compile_call with a stack_index pointing to the next
      // available spot on the stack. Add kWordSize (stack_index is negative)
      // so that it is only a multiple of the number of locals N, not N+1.
      word rsp_adjust = stack_index + kWordSize;
      return Compile_labelcall(buf, label, call_args, arg_stack_index, varenv,
                               labels, rsp_adjust);
    }
    // ...
}

Compile_labelcall is one of those fun recursive functions we write so frequently. Its job is to compile all of the arguments and store their results in successive stack locations.

In the base case, it has no arguments to compile. It should just adjust the stack pointer, call the procedure, adjust the stack pointer back, and return.

void Emit_rsp_adjust(Buffer *buf, word adjust) {
  if (adjust < 0) {
    Emit_sub_reg_imm32(buf, kRsp, -adjust);
  } else if (adjust > 0) {
    Emit_add_reg_imm32(buf, kRsp, adjust);
  }
}

int Compile_labelcall(Buffer *buf, ASTNode *callable, ASTNode *args,
                      word stack_index, Env *varenv, Env *labels,
                      word rsp_adjust) {
  if (AST_is_nil(args)) {
    word code_address;
    if (!Env_find(labels, AST_symbol_cstr(callable), &code_address)) {
      return -1;
    }
    // Save the locals
    Emit_rsp_adjust(buf, rsp_adjust);
    Emit_call_imm32(buf, code_address);
    // Unsave the locals
    Emit_rsp_adjust(buf, -rsp_adjust);
    return 0;
  }
  // ...
}

Emit_rsp_adjust is a convenience function that takes some stack adjustment delta. If it’s negative, it will issue a sub instruction. If it’s positive, an add. Otherwise, it’ll do nothing.
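The other half of that base case is Emit_call_imm32, which has to turn code_address (an offset from the start of the buffer) into the rel32 displacement that the call opcode expects, measured from the end of the 5-byte call instruction. The post doesn’t show its body, so here is only a minimal sketch of how it might work; Buffer_write8 and Buffer_write_32_le are hypothetical byte-appending helpers:

// A sketch, not necessarily the real implementation: emit `call rel32`
// (opcode 0xe8) to a target given as an offset from the start of the buffer.
void Emit_call_imm32(Buffer *buf, word target_offset) {
  word call_site = Buffer_len(buf);  // offset where the 0xe8 byte will go
  // The displacement is relative to the *next* instruction; call is 5 bytes.
  int32_t disp = (int32_t)(target_offset - (call_site + 5));
  Buffer_write8(buf, 0xe8);
  Buffer_write_32_le(buf, (uint32_t)disp);  // little-endian rel32
}

As a sanity check against the labelcall test below: the id code starts at offset 0 and the call byte sits at offset 21, so the displacement is 0 - 26 = -26, which matches the 0xe6 0xff 0xff 0xff in the expected bytes.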

In the case with arguments, we should compile them one at a time:

int Compile_labelcall(Buffer *buf, ASTNode *callable, ASTNode *args,
                      word stack_index, Env *varenv, Env *labels,
                      word rsp_adjust) {
  // ...
  assert(AST_is_pair(args));
  ASTNode *arg = AST_pair_car(args);
  _(Compile_expr(buf, arg, stack_index, varenv, labels));
  Emit_store_reg_indirect(buf, Ind(kRsp, stack_index), kRax);
  return Compile_labelcall(buf, callable, AST_pair_cdr(args),
                           stack_index - kWordSize, varenv, labels, rsp_adjust);
}

There, that wasn’t so bad, was it? I mean, if you manage to get it right the first time. I certainly did not. In fact, I gave up on the first version of this compiler many months ago because I could not get procedure calls right. With this post, I have now made it past that particular thorny milestone!

One last thing: we’ll need to update the code that converts buf->address into a function pointer. We have to use the buf->entrypoint we set earlier.

uword Testing_execute_entry(Buffer *buf, uword *heap) {
  assert(buf != NULL);
  assert(buf->address != NULL);
  assert(buf->state == kExecutable);
  // The pointer-pointer cast is allowed but the underlying
  // data-to-function-pointer back-and-forth is only guaranteed to work on
  // POSIX systems (because of eg dlsym).
  byte *start_address = buf->address + buf->entrypoint;
  JitFunction function = *(JitFunction *)(&start_address);
  return function(heap);
}

Let’s test our implementation. Maybe these tests will help you.

Testing

I won’t include all the tests in this post, but a full battery of tests is available in compile-procedures.c. Here are some of them.

First, we should check that compiling code objects works:

TEST compile_code_with_two_params(Buffer *buf) {
  ASTNode *node = Reader_read("(code (x y) (+ x y))");
  int compile_result = Compile_code(buf, node, /*labels=*/NULL);
  ASSERT_EQ(compile_result, 0);
  // clang-format off
  byte expected[] = {
      // mov rax, [rsp-16]
      0x48, 0x8b, 0x44, 0x24, 0xf0,
      // mov [rsp-24], rax
      0x48, 0x89, 0x44, 0x24, 0xe8,
      // mov rax, [rsp-8]
      0x48, 0x8b, 0x44, 0x24, 0xf8,
      // add rax, [rsp-24]
      0x48, 0x03, 0x44, 0x24, 0xe8,
      // ret
      0xc3,
  };
  // clang-format on
  EXPECT_EQUALS_BYTES(buf, expected);
  AST_heap_free(node);
  PASS();
}

As expected, this takes the first argument in [rsp-8] and second in [rsp-16], storing a temporary in [rsp-24]. This test does not test execution because I did not want to write the testing infrastructure for manually setting up procedure calls.

Second, we should check that defining labels works:

TEST compile_labels_with_one_label(Buffer *buf) {
  ASTNode *node = Reader_read("(labels ((const (code () 5))) 1)");
  int compile_result = Compile_entry(buf, node);
  ASSERT_EQ(compile_result, 0);
  // clang-format off
  byte expected[] = {
      // mov rax, compile(5)
      0x48, 0xc7, 0xc0, 0x14, 0x00, 0x00, 0x00,
      // ret
      0xc3,
      // mov rsi, rdi
      0x48, 0x89, 0xfe,
      // mov rax, 0x2
      0x48, 0xc7, 0xc0, 0x04, 0x00, 0x00, 0x00,
      // ret
      0xc3,
  };
  // clang-format on
  EXPECT_EQUALS_BYTES(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_entry(buf, /*heap=*/NULL);
  ASSERT_EQ_FMT(Object_encode_integer(1), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

This tests for a jump over the compiled procedure bodies (CHECK!), emitting compiled procedure bodies (CHECK!), and emitting the body of the labels form (CHECK!). This one we can execute.

Third, we should check that passing arguments to procedures works:

TEST compile_labelcall_with_one_param(Buffer *buf) {
  ASTNode *node = Reader_read("(labels ((id (code (x) x))) (labelcall id 5))");
  int compile_result = Compile_entry(buf, node);
  ASSERT_EQ(compile_result, 0);
  // clang-format off
  byte expected[] = {
      // mov rax, [rsp-8]
      0x48, 0x8b, 0x44, 0x24, 0xf8,
      // ret
      0xc3,
      // mov rsi, rdi
      0x48, 0x89, 0xfe,
      // mov rax, compile(5)
      0x48, 0xc7, 0xc0, 0x14, 0x00, 0x00, 0x00,
      // mov [rsp-16], rax
      0x48, 0x89, 0x44, 0x24, 0xf0,
      // call `id`
      0xe8, 0xe6, 0xff, 0xff, 0xff,
      // ret
      0xc3,
  };
  // clang-format on
  EXPECT_EQUALS_BYTES(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_entry(buf, /*heap=*/NULL);
  ASSERT_EQ_FMT(Object_encode_integer(5), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

This tests that we put the arguments in the right stack locations (skipping a space for the return address), emit a call to the right relative address, and that the call returns successfully. All check!!

Fourth, we should check that we adjust the stack when we have locals:

TEST compile_labelcall_with_one_param_and_locals(Buffer *buf) {
  ASTNode *node = Reader_read(
      "(labels ((id (code (x) x))) (let ((a 1)) (labelcall id 5)))");
  int compile_result = Compile_entry(buf, node);
  ASSERT_EQ(compile_result, 0);
  // clang-format off
  byte expected[] = {
      // mov rax, [rsp-8]
      0x48, 0x8b, 0x44, 0x24, 0xf8,
      // ret
      0xc3,
      // mov rsi, rdi
      0x48, 0x89, 0xfe,
      // mov rax, compile(1)
      0x48, 0xc7, 0xc0, 0x04, 0x00, 0x00, 0x00,
      // mov [rsp-8], rax
      0x48, 0x89, 0x44, 0x24, 0xf8,
      // mov rax, compile(5)
      0x48, 0xc7, 0xc0, 0x14, 0x00, 0x00, 0x00,
      // mov [rsp-24], rax
      0x48, 0x89, 0x44, 0x24, 0xe8,
      // sub rsp, 8
      0x48, 0x81, 0xec, 0x08, 0x00, 0x00, 0x00,
      // call `id`
      0xe8, 0xd3, 0xff, 0xff, 0xff,
      // add rsp, 8
      0x48, 0x81, 0xc4, 0x08, 0x00, 0x00, 0x00,
      // ret
      0xc3,
  };
  // clang-format on
  EXPECT_EQUALS_BYTES(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_entry(buf, /*heap=*/NULL);
  ASSERT_EQ_FMT(Object_encode_integer(5), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

This tests the presence of sub and add instructions for adjusting rsp. It also tests that that did not mess up our stack frame for returning to the caller of the Lisp entrypoint — the test harness.

Fifth, we should check that procedures can refer to procedures defined before them:

TEST compile_multilevel_labelcall(Buffer *buf) {
  ASTNode *node =
      Reader_read("(labels ((add (code (x y) (+ x y)))"
                  "         (add2 (code (x y) (labelcall add x y))))"
                  "    (labelcall add2 1 2))");
  int compile_result = Compile_entry(buf, node);
  ASSERT_EQ(compile_result, 0);
  Buffer_make_executable(buf);
  uword result = Testing_execute_entry(buf, /*heap=*/NULL);
  ASSERT_EQ_FMT(Object_encode_integer(3), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

And last, but definitely not least, we should check that procedures can refer to themselves:

TEST compile_factorial_labelcall(Buffer *buf) {
  ASTNode *node = Reader_read(
      "(labels ((factorial (code (x) "
      "            (if (< x 2) 1 (* x (labelcall factorial (- x 1)))))))"
      "    (labelcall factorial 5))");
  int compile_result = Compile_entry(buf, node);
  ASSERT_EQ(compile_result, 0);
  Buffer_make_executable(buf);
  uword result = Testing_execute_entry(buf, /*heap=*/NULL);
  ASSERT_EQ_FMT(Object_encode_integer(120), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

Ugh, beautiful. Recursion works. Factorial works. I’m so happy.

What’s next?

The logical next step in our journey is to compile lambda expressions. This has some difficulty, notably that lambdas can capture variables from outside the lambda. This means that next time, we will implement closures.

For now, revel in your newfound procedural freedom.


October 25, 2020

Derek Jones (derek-jones)

Benchmarking desktop PCs circa 1990 October 25, 2020 11:05 PM

Before buying a computer customers want to be confident of choosing the best they can get for the money, and performance has often been a major consideration. Computer benchmark performance results were once widely discussed.

Knight’s analysis of early mainframe performance was widely cited for many years.

Performance on the Byte benchmarks was widely cited before Intel started spending billions on advertising, clock frequency has not always had the brand recognition it has today.

The Byte benchmark was originally designed for Intel x86 processors running Microsoft DOS. It was introduced in the June 1985 issue and was written in the then still relatively new C language (earlier microprocessor benchmarks were often written in BASIC, because early micros often came with a free BASIC interpreter); it was updated in the 1990s to be Windows-based, and implemented for Unix.

Benchmarking computers using essentially the same cpu architecture and operating system removes many complications that have to be addressed when these differ. Before Wintel wiped them out, computers from different manufacturers (and often the same manufacturer) contained completely different cpu architectures, ran different operating systems, and compilers were usually created in-house by the manufacturer (or some university who got a large discount on their computer purchase).

The Fall 1990 issue of Byte contains tables of benchmark results from 1988-90. What can we learn from these results?

The most important takeaway from the tables is that those performing the benchmarks appreciated the importance of measuring hardware performance using the applications that customers are likely to be running on their computer, e.g., word processors, spreadsheets, databases, scientific calculations (computers were still sufficiently niche back then that scientific users were a non-trivial percentage of the market), and compiling (hackers were a large percentage of Byte’s readership).

The C benchmarks attempted to measure CPU, FPU (built-in hardware support for floating-point arrived with the 486 in April 1989, prior to that it was an add-on chip that required spending more money), Disk and Video (at the time support for color was becoming mainstream, but bundled hardware graphics support still tended to be minimal).

Running the application benchmarks takes a lot of time, plus the necessary software (which takes time to install from floppies, the distribution technology of the day). Running the C benchmarks is much quicker and simpler.

Ideally the C benchmarks are a reliable stand-in for the application benchmarks (meaning that only the C benchmarks need be run).

Let’s fit some regression models to the measurements of the 61 systems benchmarked, all supporting hardware floating-point (code+data). Surprisingly there is no mention of such an exercise being done by the Byte staff, even though one of the scientific benchmarks included regression fitting.

The following fitted equations explain around 90% of the variance of the data, i.e., they are good fits.

Wordprocessing=0.66+0.56*CPU+0.24*Disk

For wordprocessing, the CPU benchmark explains around twice as much as the Disk benchmark.

Spreadsheet=-0.46+0.8*CPU+1*Disk-0.16*CPU*Disk

For spreadsheets, CPU and Disk contribute about the same.

Database=0.6+0.01*CPU*FPU+0.53*Disk

Database is nearly all Disk.

ScientificEngineering=0.27+FPU*(0.59-0.17*Disk-0.03*CPU)+0.45*CPU*Disk

Scientific/Engineering is FPU, plus interactions with other components.

Compiling=-0.33+CPU*(1.1-0.09*Disk-0.16*Video)+0.33*Disk*Video

Compiling is CPU, plus interactions with other components.

Byte’s benchmark reports were great eye candy, and readers probably took away a rough feel for the performance of various systems. Perhaps somebody at the time also fitted regression models to the data. The magazine contained plenty of adverts for software to do this.

Gokberk Yaltirakli (gkbrk)

Dynamic DNS with AWS Route 53 October 25, 2020 09:00 PM

I occasionally need to SSH into my laptop, or use other services hosted on it, but my ISP gives me a dynamic IP. While it is stable most of the time, it does occasionally change.

To work around this, I had previously set up a cron job that curl’s a specific URL on my website, and I could get the IP by grep-ing through my server logs. But this is both time-consuming and requires me to update the IP address on different applications every time it changes.

I wanted to have a subdomain that always pointed to my laptop, so I used the AWS CLI to create a DIY dynamic DNS. It fetches your IP from the AWS endpoint, but any source can be used. You can also host this on a server or a serverless function to get the client IP and require fewer dependencies.

Here’s the shell script that runs every X minutes on my laptop in order to update the domain record.

#!/bin/sh

IP="$(curl -s http://checkip.amazonaws.com/)"

DNS="$(mktemp)"

cat > "${DNS}" <<EOF
{
  "Comment": "DDNS update",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "ResourceRecords": [
          {"Value": "${IP}"}
        ],
        "Name": "subdomain123.gkbrk.com",
        "Type": "A",
        "TTL": 300
      }
    }
  ]
}
EOF

aws route53 change-resource-record-sets \
    --hosted-zone-id "/hostedzone/ZONEID" \
    --change-batch "file://${DNS}"

rm "${DNS}"

October 24, 2020

Andrew Montalenti (amontalenti)

New essay: The case for a vote to restore jobs and health October 24, 2020 04:19 PM

It’s not a “left” or a “right” thing. It’s about jobs and health. If you study the data, you’ll learn this is what’s at stake in 2020.

We can restore our country’s health. We can rebuild our economy. We can do both of these things.

There is a precedent for this.

To learn more behind my thinking — digging into the data, steering clear of the partisan bickering — check out this essay, a few thousand words, supported by real data.

2020 is a very important year for our country, and for the world. Whether you’re a Democrat, or a Republican, or an Independent, your vote matters.

Click through to the essay here:

https://amontalenti.com/pub/jobs-health-2020/

And if you want to share this essay on Twitter, here’s a place to start:

Mark J. Nelson (mjn)

Years since Republican control of major cities October 24, 2020 12:00 PM

A large urban–rural divide is a current feature of American politics. Republicans rarely win cities, and Democrats rarely win rural areas. This goes beyond regional concepts like "the heartland" and "coastal elites", and is replicated within just about every state.

There are many ways to slice and dice that divide. This post takes one of them. Republicans can rarely win in a major city. But how rarely? How long has it been since a Republican candidate has won the mayorship of major American cities?

From longest to shortest, here is the list of Republican shutouts in American municipal politics, among the top 20 largest cities:

City             No R mayor since   Notes
Chicago          1931               Officially nonpartisan since 1999
Philadelphia     1952
Indianapolis     1956
Denver           1963               Officially nonpartisan
San Francisco    1964               Officially nonpartisan
San Jose         1967/∞             Officially nonpartisan; elected office created in 1967
Seattle          1969               Officially nonpartisan
Washington, DC   1975/∞             Elected office created in 1975
Houston          1982               Officially nonpartisan
Austin           1991               Officially nonpartisan
Columbus         2000
Los Angeles      2001               Officially nonpartisan
San Antonio      2001               Officially nonpartisan
Phoenix          2004               Officially nonpartisan
New York City    2007               Bloomberg changed from R to I in 2007
Charlotte        2009
Dallas           2011               Officially nonpartisan
San Diego        2020               Current R officeholder; officially nonpartisan
Jacksonville     2020               Current R officeholder
Fort Worth       2020               Current R officeholder; officially nonpartisan

October 23, 2020

Carlos Fenollosa (carlesfe)

You may be using Mastodon wrong Seven years later, I bought a new Macbook. For the first time, I don't love it No more Google Analytics Evolution of my link roundups Links for 2020-02-09 October 23, 2020 09:32 PM

I'm sure you have already heard about Mastodon, typically marketed as a Twitter alternative.

I will try to convince you that the word alternative doesn't mean here what you think it means, and why you may be using Mastodon wrong if you find it boring.

An alternative community

You should not expect to "migrate from Twitter to Mastodon."

Forget about the privacy angle for now. Mastodon is an alternative community, where people behave differently.

It's your chance to make new internet friends.

There may be some people for whom Mastodon is a safe haven. Yes, some users really do migrate there to avoid censorship or bullying but, for most of us, that will not be the case.

Let's put it this way: Mastodon is to Twitter what Linux is to Windows.

Linux is libre software. But that's not why most people use it. Linux users mostly want to get their work done, and Linux is an excellent platform. There is no Microsoft Word, no Adobe Photoshop, no Starcraft. If you need to use these tools, honestly, you'd better stick with Windows. You can use emulation, in the same way that there are utilities to post to Twitter from Mastodon, but that would miss the point.

The bottom line is, you can perform the same tasks, but the process will be different. You can post toots on Mastodon, upload gifs, send DMs... but it's not Twitter, and that is fine.

The Local Timeline is Mastodon's greatest invention

The problem most people have with Mastodon is that they "get bored" with it quickly. I've seen it a lot, and it means one thing: the person created their account on the wrong server.

"But," they say, "isn't Mastodon federated? Can't I chat with everybody, regardless of their server?" Yes, of course. But discoverability works differently on Mastodon.

Twitter has only two discoverability layers: your network and the whole world. Either a small group of contacts, or everybody in the whole world. That's crazy.

They try very hard to show you tweets from outside your network so you can discover new people. And, at the same time, they show your tweets to third parties, so you can get new followers. This is the way that they try to keep you engaged once your network is more or less stable and starts getting stale.

Mastodon, instead, has an extra layer between your network and the whole world: messages from people on your server. This is called the local timeline.

The local timeline is the key to enjoying Mastodon.

How long has it been since you made a new internet friend?

If you're of a certain age you may remember BBSs, Usenet, the IRC, or early internet forums. Do you recall how exciting it was to log into the unknown and realize that there were people all around the world who shared your interests?

It was an amazing feeling which got lost on the modern internet. Now you have a chance to relive it.

The local timeline dynamics are very different. There are a lot of respectful interactions among total strangers, because there is this feeling of community, of being in a neighborhood. Twitter is just the opposite: strangers shouting at each other.

Furthermore, since the local timeline is more or less limited in the number of users, you have the chance to recognize usernames, and to be recognized. You start interacting with strangers, mentioning them, sending them links they may like. You discover new websites, rabbit holes, new approaches to your hobbies.

I've made quite a few new internet friends on my Mastodon server, and I don't mean followers or contacts. I'm talking about human beings who I have never met in person but feel close to.

People are humble and respectful. And, for less nice users, admins enforce codes of conduct and, in extreme cases, users may get kicked off a server. But they are not being banned by a faceless corporation due to mass reports; everybody is given a chance.

How to choose the right server

The problem with "generalist" Mastodon servers like mastodon.social is that users have just too diverse interests and backgrounds. Therefore, there is no community feeling. For some people, that may be exactly what they're looking for. But, for most of us, there is more value on the smaller servers.

So, how can you choose the right server? Fortunately, you can do a bit of research. There is an official directory of Mastodon servers categorized by interests and regions.

Since you're reading my blog, start by taking a look at these:

And the regionals

There are many more. Simply search online for "mastodon server MY_FAVORITE_HOBBY." And believe me, servers between 500 and 5,000 people are the best.

Final tips

Before clicking on "sign up", always browse the local timeline, the about page, and the most active users list. You will get a pretty good idea of the kind of people who chat there. Once you feel right at home you can continue your adventure and start following users from other servers.

Mastodon has an option to only display toots in specific languages. It can be very useful to avoid being flooded by toots that you just have no chance of understanding or even getting what they're about.

You can also filter your notifications by types: replies, mentions, favorites, reposts, and more. This makes catching up much more manageable than on Twitter.

Finally, Mastodon has a built-in "Content Warning" feature. It allows you to hide text behind a short explanation, in case you want to talk about sensitive topics or avoid spoiling a recent movie.

Good luck with your search, and see you on the Fediverse! I'm at @cfenollosa@mastodon.sdf.org

Tags: internet

The 2013 Macbook Air is the best computer I have ever owned. My wish has always been that Apple did nothing more than update the CPU and the screen, touching nothing else. I was afraid the day of upgrading my laptop would come.

But it came.

My Air was working flawlessly, if only unbearably slow when under load. Let me dig a bit deeper into this problem, because this is not just the result of using old hardware.

When video conferencing or under high stress like running multiple VMs the system would miss key presses or mouse clicks. I'm not saying that the system was laggy, which it was, and it is expected. Rather, that I would type the word "macbook" and the system would register "mok", for example. Or I would start a dragging event, the MouseUp never registered but the MouseMove continued working, so I ended up flailing an icon around the screen or moving a window to some unexpected place.

This is mostly macOS's fault. I own a contemporary x230 with similar specs running Linux and it doesn't suffer from this issue. Look, I was a computer user in the 90s and I perfectly understand that an old computer will be slow to the point of freezing, but losing random input events is a serious bug on a modern multitasking system.

Point #1: My old computer became unusable due to macOS, not hardware, issues.

******

As I mentioned, I had been holding off on my purchase due to the terrible product lineup that Apple offered from 2016 to 2019. Then Apple atoned, and things changed with the 2019 16". Since I prefer smaller footprints, I decided that I would buy the 13" they updated next.

So here I am, with my 2020 Macbook Pro, i5, 16 GB RAM, 1 TB SSD. But I can't bring myself to love it like I loved my 2013 Air.

Let me explain why. Maybe I can bring in a fresh perspective.

Most reviewers evaluate the 2020 lineup with the 2016-2019 versions in mind. But I'm just some random person, not a reviewer. I have not had the chance to even touch any Mac since 2015. I am not conditioned towards a positive judgement just because the previous generation was so much worse.

Of course the new ones are better. But the true test is to compare them to the best laptops ever made: 2013-2015 Airs and Pros.

Point #2: this computer is not a net win from a 2013 Air.

Let me explain the reasons why.

The webcam

You will see the webcam reviewed as an afterthought in most pieces. I will cover it first. I feel like Apple is mocking us by including the worst possible webcam on the most expensive laptop.

Traditionally, this has been a non-issue for most people. However, due to covid-19 and working from home, this topic has become more prominent.

In my case, even before the pandemic I used to do 2-3 video conferences every day. Nowadays I spend the day in front of my webcam.

What infuriates me is that the camera quality in the 2013 Air is noticeably better. Why couldn't they use at least the same part, if not a modern one?

See for yourself. It really feels like a ripoff. Apple laughing at us.

A terrible quality picture from the macbook pro webcam
The 2020 macbook pro webcam looks horrible, and believe me, it is not only due to Yours Truly's face.

A reasonable quality picture from the 2013 Air

For reference, this is the front facing camera of the 2016 iPhone SE
For reference, this is the front facing camera of the 2016 iPhone SE, same angle and lighting conditions.

For reference, a picture taken with my 2006 Nokia 5200
As a second reference, a picture taken with the 640x480 VGA camera of my 2006 Nokia 5200. Which of the above looks the most like this?

I would have paid extra money to have a better webcam on my macbook.

The trackpad

The mechanism and tracking are excellent, but the trackpad itself is too large and the palm rejection algorithm is not good enough.

Point #3: The large trackpad single-handedly ruins using the experience of working on this laptop for me.

I am constantly moving the cursor accidentally. This situation is very annoying, especially for a touch typist as my fingers are always on hjkl and my thumb on the spacebar. This makes my thumb knuckle constantly brush the trackpad and activate it.

I really, really need to fix this, because I have found myself unconsciously raising my palms and placing them at a different angle. This may lead to RSI, which I have suffered from in the past.

This is a problem that Apple created on their own. Having an imperfect palm rejection algorithm is not an issue unless you irrationally enlarge the trackpad so much that it extends to the area where the palm of touch typists typically rests.

Video: Nobody uses the trackpad like this
Is it worth it to antagonize touch typists in order to be able to move the cursor from this tiny corner?

I would accept this tradeoff if the trackpad was Pencil-compatible and we could use it as some sort of handwriting tablet. That would actually be great!

Another very annoying side effect of it being so large is that, when your laptop is in your lap, sometimes your clothes accidentally brush the trackpad. The software then registers spurious movements or prevents some gestures from happening because it thinks there is a finger there.

In summary, it's too big for no reason, which turns it into an annoyance for no benefit. This trackpad offers a bad user experience, not only that, it also ruins the keyboard—read below.

I would have paid extra money to have a smaller trackpad on my macbook.

The keyboard

The 2015 keyboard was very good, and this one is better. The keyswitch mechanism is fantastic, the layout is perfect, and this is probably the best keyboard on a laptop.

Personally, I did not mind the Escape key shenanigans because I remapped it to dual Ctrl/Escape years ago, which I recommend you do too.

Touch ID is nice, even though I'm proficient at typing my password, so it was not such a big deal for me. Face ID would have been much more convenient, I envy Windows Hello users.

Unfortunately, the large trackpad torpedoes the typing experience. Writing on this Macbook Pro is worse than on my 2013 Air.

I will keep searching for a tool which disables trackpad input within X milliseconds of a key press or disables some areas of the trackpad. I have not had any luck with either Karabiner or BetterTouchTool.

The Touchbar

After having read mostly negative feedback about it, I was determined to drill myself to like it, you know, just to be a bit contrarian.

"I will use tools to customize it so much that it will be awesome as a per-application custom function layer!"

Unfortunately, the critics are right. It's an anti-feature. I gave it an honest try, I swear. It is just bad, though it could have been better with a bit more effort.

I understand why it's there. Regular users probably find it useful and cute. It is, ironically, a feature present in pro laptops meant for non-pro users: slow typists and people who don't know the regular keyboard shortcuts.

That being said, I would not mind it, probably would even like it, if it weren't for three major drawbacks:

First and foremost, it is distracting to the point that the first thing I did was to search how to completely turn it off.

This is because, by default, it offers typing suggestions. Yes, while you are typing and trying to concentrate, there is something in your field of vision constantly flashing words that you didn't mean to type and derailing your train of thought.

Easy to fix, but it makes me wonder what Apple product managers were thinking.

Secondly, it is placed in such a way that resting your fingers on top of the keyboard triggers accidental key presses.

I can and will retrain my hand placement habits. After all, this touchbar-keyboard-trackpad combo is forcing many people to learn to place their hands in unnatural positions to accommodate these poorly designed peripherals.

However, Apple could have mitigated this by implementing a pressure sensor to make it more difficult to generate involuntary key presses. It would be enough to distinguish a brush from a tap.

Finally, and this is also ironic because it's in contradiction with the previous point, due to lack of feedback, sometimes you're not sure whether you successfully pressed a touchbar key. And, in my experience, there is an unjustifiably large number of times where you have to press twice, or press very deliberately, to activate the key you want.

There are some redeeming features, though.

As stated above, I am determined to make it bearable, and even slightly useful for me, by heavily modifying it. I suggest you go to System Preferences > Keyboard and use the "Expanded Control Strip".

Then, customize the touchbar buttons, remove keys you don't use, and add others. Consider paying for BetterTouchTool for even more customization options.

Then, on the same window, go to the Shortcuts tab, and select Function keys on the left. This allows you to use function keys by default in some apps, which is useful for Terminal and other pro apps like Pycharm.

(Get the third irony? To make the touchbar, a pro feature, useful for pro apps, the best setup is to make it behave like normal function keys)

Finally, if you're registering accidental key presses, just leave an empty space in the touchbar to let your fingers rest safely until you re-train your hands to rest somewhere else. This is ridiculous, but hey, better than having your brightness suddenly dim to zero accidentally.

Leave an empty space in the touchbar in the area where you usually rest your fingers.

I would have paid extra money to not have a touchbar on my macbook.

The ports

Another much-debated feature where I resigned myself to just accept this new era of USB-C.

I did some research online and bought the "best" USB-C hub, along with new dongles. I don't mind dongles, because I was already using some with my Air. It's not like I swim in money, but there is no need to blow this out of proportion.

Well, I won't point fingers at any review site, but that "best" hub is going back to Amazon as I write these lines. Some of my peripherals disconnect randomly, plus I get an "electric arc" noise when I disconnect the hub cable. I don't know how that is even possible.

The USB-C situation is terrible. Newly bought peripherals still come with USB-A cables. Regarding hubs, it took me a few years to find a reliable USB3 hub for my 2013 Air. I will keep trying, wish me luck.

About Magsafe, even though I really liked it, I don't miss it as much as I expected. I do miss the charging light, though. No reason not to have it integrated in the official cable, like the XPS does.

Some people say that charging via USB-C is actually better due to standardization of all devices, but I don't know what peripherals these people use. My iPhone and Airpods charge via Lightning, my Apple Watch charges via a puck, and other minor peripherals like cameras and external batteries all charge via micro-USB. Now I have to carry the same amount of cables as before; I just swapped the Magsafe cable and charger for the USB-C cable and charger.

Another poorly thought-out decision is the headphone jack. It is on the wrong side. Most of the population is right-handed, so there usually is a notebook, mouse, or other stuff to the right of the laptop. The headphone cable then gets in the way. The port should have been on the left, and close to the user rather than far away from them, to gain a few extra centimeters of cable.

By the way, not including the extension cord is unacceptable. This cord is not only a convenience, but it increases safety, because it's the only way to have earth grounding for the laptop. Without it, rubbing your fingers on the surface of the computer generates this weird vibration due to current. I have always recommended Mac users that they use their chargers with the extension cable even if they don't need the extra length.

I would have paid extra money to purchase an Apple-guaranteed proper USB-C hub. Alternatively, I would have paid extra money for this machine to have a couple of USB-A ports so I can keep using my trusty old hub.

I would not have paid extra money to have the extension cord, because it should have come included with this 2,200€ laptop. I am at a loss for words. Enough of paying extra money for things that Apple broke on purpose.

Battery life

8-9 hours with all apps closed except Safari. Browsing lightly, with an occasional video, and brightness at the literal minimum. This brightness level is only realistic if it's night time. In a normally lit environment you need to set the brightness level at around 50%.

It's not that great. My Air, when it was new, easily got 12 hours of light browsing. Of course, it was not running Catalina, but come on.

When I push the laptop a bit more, with a few Docker containers, Pycharm running, Google Chrome with some Docs opened, and brightness near the maximum, I get around 4 hours. In comparison, that figure is reasonable.

Overall, it's not bad, but I expected more.

While we wait for a Low Power Mode on the mac, do yourself a favor and install Turbo Boost Switcher Pro.

The screen

Coming from having never used a Retina screen on a computer, this Macbook Pro impressed me.

Since I don't edit photos or videos professionally, I can only appreciate it for its very crisp text. The rest of its features are lost on me, but this does not devalue my opinion of the screen.

The 500-nit brightness is not noticeably better than my 2013 Air in a real-world test. For some reason, both screens seem equally bright when used in direct daylight.

This new Retina technology comes with a few drawbacks, though.

First, it's impossible to get a terminal screen without anti-aliasing. My favorite font, IBM VGA8, is unreadable when anti-aliased, which is a real shame, because I've been using it since the 90s, and I prefer non-anti-aliased fonts on terminals.

Additionally, many pictures on websites appear blurry because they are not "retina-optimized". The same happens with some old applications which display crappy icons or improperly proportioned layouts. This is not Apple's fault, but it affects the user experience.

Finally, the bezels are not tiny like those in the XPS 13, but they are acceptable. I don't mind them.

To summarize, I really like this screen, but like everything else in this machine, it is not a net gain. You win some, you lose some.

Performance

This is the reason why I had to switch from my old laptop, and the 2020 MBP delivers.

It allows me to perform tasks that were very painful in my old computer. Everything is approximately three times faster than it was before, which really is a wow experience, like upgrading your computer in the 90s.

Not much to add. This is a modern computer and, as such, it is fast.

Build quality

Legendary, as usual.

To nitpick on a minor issue, I'd like Apple to make the palm rest area edges a bit less sharp. After typing for some time I get pressure marks on my wrists. They are not painful, but definitely uncomfortable.

Likewise, when typing with the laptop on my lap, especially when wearing sports shorts in summer like I'm doing right now, the chassis leaves marks on my legs near the hinge area. This could have been reduced by blunting the edges too.

One Thousand Papercuts

In terms of software, Apple also needs to get its stuff together.

Catalina is meh. Not terrible, but with just too many annoyances.

  • Mail keeps opening by itself while I'm doing video conferences and sharing my screen. I have to remind myself to close Mail before any video conference, because if I don't, other people will read my inbox. It's ridiculous that this bug has not been fixed yet. Do you remember when Apple mocked Microsoft because random alert windows would steal your focus while you were typing? This is 100x worse.
  • My profile picture appears squished on the login screen, and there is no way to fix it. The proportions are correctly displayed on the iCloud settings window.
  • Sometimes, after resuming from sleep, the laptop doesn't detect its own keyboard. I can assure you, the keyboard was there indeed, and note how the dock is still the default one. This happened to me minutes after setting up the computer for the first time, before I had any chance to install software or change any settings.
  • I get constant alerts to re-enter my password for some internet account, but my password is correct. Apple's services need to differentiate a timeout from a rejected password, or maybe retry a couple times before prompting.
  • Critical software I used doesn't run anymore and I have to look for alternatives. This includes Safari 13 breaking extensions that were important for me. Again, I was prepared for this, but it's worth mentioning.

Praise worthy

Here are a few things that Apple did really well and don't fit into any other category.

  • Photos.app has "solved" the photos problem. It is that great. As a person who has 50k photos in their library, going back to pictures of their great grandparents: Thank you, Apple!
  • Continuity features have been adding up, and the experience is now outstanding. The same goes for iCloud. If you have an iPhone and a Mac, things are magical.
  • Fan and thermal configuration is very well crafted on this laptop. It runs totally silent, and when the fans kick in, the system cools down very quickly and goes back to silent again.
  • The speakers are crisp and they have very nice bass. They don't sound like a tin can like most laptops, including the 2013 Air, do.

Conclusion

This computer is bittersweet.

I'm happy that I can finally perform tasks which were severely limited on my previous laptop. But this has nothing to do with the design of the product, it is just due to the fact that the internals are more modern.

Maybe loving your work tools is a privilege that only computer nerds have. Do taxi drivers love their cars? Do baristas love their coffee machines? Do gardeners love their leaf blowers? Do surgeons love their scalpels?

Yes, I have always loved my computer. Why wouldn't I? We developers spend at least eight hours a day touching and looking at our silicon partners. We earn our daily bread thanks to them. This is why we choose our computers carefully with these considerations in mind, and why we are so scrupulous when evaluating them.

This is why it's so disappointing that this essential tool comes with so many tradeoffs.

Even though this review was exhaustive, don't get me wrong, most annoyances are minor except for the one deal-breaker: the typing experience. I have written this review with the laptop keyboard and it's been a continuous annoyance. Look, another irony. Apple suffered so much to fix their keyboard, yet it's still ruined by a comically large trackpad. The forest for the trees.

Point #4: For the first time since using Macs, I do not love this machine.

Going back to what "Pro" means

Apple engineers, do you know who is the target audience for these machines?

This laptop has been designed for casual users, not pro users. Regular users enjoy large trackpads and Touch Bars because they spend their day scrolling through Twitter and typing short sentences.

Do you know who doesn't, because it gets in the way of them typing their essays, source code, or inputting their Photoshop keyboard shortcuts? Pro users.

In 2016 I wrote:

However, in the last three to five years, everybody seemed to buy a Mac, even friends of mine who swore they would never do it. They finally caved in, not because of my advice, but because their non-nerd friends recommend MBPs. And that makes sense. In a 2011 market saturated by ultraportables, Windows 8, and laptops which break every couple years, Macs were a great investment. You can even resell them after five years for 50% of their price, essentially renting them for half price.

So what happened? Right now, it's not only Pros who are using the Macbook Pro. It's not a professional tool anymore, it's a consumer product. Apple collects usage analytics for their machines and, I suppose, makes informed decisions, like removing less-used ports or not increasing storage on iPhones for a long time.

What if Apple is being fed overwhelmingly non-Pro user data for their Pro machines and, as a consequence, their decisions don't serve Pro users anymore, but rather the general public?

The final irony: Apple uses "Pro" in their product marketing as a synonym for "the more expensive tier", and they have started believing their own lies. Their success with consumer products is fogging their understanding of what a real Pro needs.

We don't need a touchbar that we have to disable for Pro apps.

We don't need a large trackpad that gets in the way of typing.

We need more diverse ports to connect peripherals that don't work well with adapters.

We need a better webcam to increase productivity and enhance communication with our team.

We need you to include the effin extension cable so that there is no current on the chassis.

We need you to not splash our inbox contents in front of guests while sharing our screens.

We need a method to extend the battery as long as possible while we are on the road—hoping that comes back some day.

Point #5: Apple needs to continue course-correcting their design priorities for power users

Being optimistic for the future

I have made peace with the fact that, unlike my previous computer, this one will not last me for 7 years. This was a very important factor in my purchase decision. I know this mac is just bridging a gap between the best lineup in Apple's history (2015) and what will come in the future. It was bought out of necessity, not out of desire.

14" laptop? ARM CPUs? We will be awaiting new hardware eagerly, hoping that Apple keeps rolling back some anti-features like they did with the butterfly keyboard. Maybe the Touchbar and massive trackpad will be next. And surely the laggy and unresponsive OS will have been fixed by then.

What about the alternatives?

Before we conclude I want to anticipate a question that will be in some people's mind. Why didn't you buy another laptop?

Well, prior to my purchase I spent two months trying to use a Linux setup full-time. It was close, but not 100% successful. Critical software for my job had no real alternatives, or those were too inconvenient.

Regarding Windows, I had my eye on the XPS 13 and the X1 Carbon, which are extremely similar to this macbook in most regards. I spent some time checking whether Windows 10 had improved since the last time I used it, and it turns out it hasn't. I just hate Windows so much it is irrational. Surely some people prefer it and feel the same way about the Mac. To each their own.

Point #6: Despite its flaws, macOS is the OS that best balances convenience with productive work. When combined with an iPhone it makes for an unbeatable user experience.

I decided that purchasing this new Mac was the least undesirable option, and I still stand by that decision. I will actively try to fix the broken trackpad, which will increase my customer satisfaction from a 6 —tolerate— to an 8 or 9 —like, even enjoy—.

But that will still be far away from the perfect, loving 10/10 experience I had with the 2013 Air.

Tags: apple, hardware

I have removed the GA tracking code from this website. cfenollosa.com does not use any tracking technique, neither with cookies, nor js, nor image pixels.

Even though this was one of the first sites to actually implement consent-based GA tracking, the current situation with the cookie banners is terrible.

We are back to the flash era where every site had a "home page" and you needed to perform some extra clicks to view the actual content. Now those extra clicks are spent in disabling all the tracking code.

I hate the current situation so much that I just couldn't be a part of it any more. So, no banner, no cookies, no js, nothing. What little traffic I get, I'll analyze with a log parser like webalizer. I wasn't checking it anyway.

Tags: internet, web, security

As you may have noticed, I'm a fan of link compilation digests.

However, compiling them was quite a lot of work for me. I always found interesting links during the week, then had to reserve an hour in the weekend to prepare the blogpost, which sometimes I did not have.

Furthermore, this format was flooding my blog with link roundups, which is not very user friendly for somebody who stumbles upon my front page.

I needed something better in two ways. First, the link publication has to be on the spot. Adding them to a list, then editing a post was not cutting it. Second, the links need to be their own section, independent from the rest of blog posts.

Fortunately, one of my link sources had the solution right in front of me. The idea behind it is very simple, and I was inspired by waxy's implementation: a box with links on the front page, and a special page only with links.

So this weekend project has been a very nice 1-line patch to bashblog, a bit of messing with postfix to parse links received to a special inbox, and some glue on top of it. I'm happy with the result!

The links index page is very crude right now. There is no CSS, and no feed available, but that will come soon. Meanwhile, feel free to bookmark it and visit it sometime!

Tags: roundup, bashblog

🐲 For Tolkien fans

The Tolkien Meta-FAQ (RH, via usenet)

Usenet FAQs used to be a great source of information. I recently found the Tolkien Meta-FAQ and it is absolutely amazing.

🎨 Mario Paint tunes

Meet the musicians who compose in Mario Paint (5 min, via waxy)

Delightfully retro.

PS: There is a Mario Paint subreddit!

💣 Android remote code execution via Bluetooth

Critical Bluetooth Vulnerability in Android (CVE-2020-0022) (1 min, via @dethos@s.ovalerio.net)

On Android 8.0 to 9.0, a remote attacker within proximity can silently execute arbitrary code [...] as long as Bluetooth is enabled. No user interaction is required.

I wonder if there are exploits in the wild already. Walking around a big city infecting all phones in a 10-foot radius.

🤯 40 concepts for understanding the world

In 40 tweets I will describe 40 powerful concepts for understanding the world (5 min, via @paulg)

This thread is worth reading. It's better than most popular books about ideas, and much shorter.

📒 What they don't teach you in CS classes

The Missing Semester of Your CS Education (RH, via lobste.rs)

Over the years, we have seen that many students have limited knowledge of the tools available to them.

Common examples include holding the down arrow key for 30 seconds to scroll to the bottom of a large file in Vim, or using the nuclear approach to fix a Git repository (https://xkcd.com/1597/)

This is one of the best resources I have ever linked to.

You must learn these skills.

(Self plug: my own UNIX tools workshop slides)

🚂 Upscaling a 1896 film with AI

Someone used neural networks to upscale a famous 1896 video to 4k quality (5 min, via HN)

We already had this capability; it just required an enormous effort from experienced video editors.

In a few years movies will be created just by feeding a script to an AI.

🚗 Fake GMaps traffic jam

Google Maps Hacks (5 min, via @simon_deliver)

99 smartphones are transported in a handcart to generate virtual traffic jam in Google Maps. Through this activity, it is possible to turn a green street red which has an impact in the physical world by navigating cars on another route!

Devilishly genius!

Tags: roundup


Gustaf Erikson (gerikson)

Return of a King: The Battle for Afghanistan by William Dalrymple October 23, 2020 01:46 PM

A good overview of the First Anglo-Afghan War. The parallels to today’s situation are presented, but not in a polemical way. Dalrymple presents “both sides”, avoiding the all too common trope of only focusing on the British defeat and hardships.

Fastnet, Force 10 by John Rousmaniere October 23, 2020 11:48 AM

Written shortly after the tragedy, this is a very 1970s book. The author describes himself unapologetically as a “WASP”, for example, which would probably not fly these days.

It’s long on descriptions but short on analysis. The descriptions however are pretty horrifying. If you ever feel like taking up ocean racing maybe read this first.

October 19, 2020

Marc Brooker (mjb)

Getting Big Things Done October 19, 2020 12:00 AM

Getting Big Things Done

In one particular context.

A while back, a colleague wanted to make a major change in the design of a system, the sort of change that was going to take a year or more, and many tens of person-years of effort. They asked me how to justify the project. This post is part of the email reply I sent. The advice is in context of technical leadership work at a big company, but perhaps it may apply elsewhere.

Is it the right solution?

I like to pay attention to ways I can easily fool myself. One of those ways is an availability heuristic applied to big problems. I see a big problem that needs a big solution, and am strongly biased to believe that the first big solution that presents itself is the right one. It takes intentional effort to figure out whether the big solution is, indeed, a solution to the big problem. Bold action, after all, isn't a solution itself.

Sometimes, in one of his more exuberant or desperate moods, Pa would go out in the veld and sprinkle brandy on the daisies to make them drunk so that they wouldn't feel the pain of shriveling up and dying. (André Brink)

Because I am so easily fooled in this way, I like to write my reasoning down. Two pages of prose normally does it, building an argument as to why this is the right solution to the problem. Almost every time, this exposes flaws in my reasoning, opportunities to find more data, or other solutions to explore. Thinking in my head doesn't have this effect for me, but writing does. Or, rather, the exercise of writing and reading does.

The first step is to write a succinct description of the problem, and what it means for the problem to be solved. Sometimes those are quantitative goals. Speeds and feeds. Sometimes, they are concrete goals. A product launch, or a document. Sometimes, it's something more qualitative and harder to explain. Thinking about the problem bears a great deal of fruit.

Then, the solution. The usual questions apply here, including cost, viability, scope and complexity. Most important is engaging with the problem statement. It's easy to make the exercise useless if you disconnect the problem statement from the solution.

It is important you feel comfortable with the outcome of this exercise, because losing faith in your own work is a sure way to have it fail. Confidence is one valuable outcome. Another one is a simpler solution.

Is it the right problem?

An elegant solution to the wrong problem is worse than no solution at all, at least in that it might fool people into thinking that the true problem has been solved, and to stop trying. You need to deeply understand the problem you are solving. Rarely, this will be an exercise only in technology or engineering. More commonly, large problems will span business, finance, engineering, management and more. You probably don't understand all of these things. Be sure to seek the help of people who do.

“Would you tell me, please, which way I ought to go from here?” “That depends a good deal on where you want to get to,” said the Cat.

Once I think I understand multiple perspectives on a problem, I like to write them down and run them by the people who explained the problem to me. They'll be able to point out where you're still wrong. Perhaps you're confusing your net and operational margins, or your hiring targets make no sense, or your customers see a different problem from you. This requires that the people you consult trust you, and you trust them. Fortunately, non-engineers in engineering organizations are always looking out for allies and friends. Most are, like engineers, only too excited to explain their work.

Engage with the doubters, but don't let them get you down

Be prepared! And be careful not to do Your good deeds when there's no one watching you (Tom Lehrer)

You will never convince everybody of your point of view. By now, you have two powerful tools to help convince people: A clear statement of the problem that considers multiple points-of-view, and a clear statement of the solution. Some people will read those and be convinced. Others would never be convinced, because their objections lie beyond the scope of your thinking. A third group will have real, honest, feedback about the problem and proposed solution. That feedback is gold that should be mined. Unfortunately, separating the golden nuggets of feedback from the pyrite nuggets of doubt isn't easy.

The doubters will get you down. Perhaps they think the problem doesn't exist, or that the solution is impractical. Perhaps they think you aren't the person to do it. Perhaps they think the same resources should be spent on a different problem, or a different solution. You'll repeat, repeat, and repeat. Get used to it. I'm still not used to it, but you should be.

Again, writing is a tool I reach for. "Today, I'm doubting my solution because..." Sometimes that doubt will be more about you than the project. That's OK. Sometimes it'll be about the project, and will identify a real problem. Often, it'll just point to one of those unknown unknowns that all projects have.

Meet the stakeholders where they are

Most likely, you're going to need to convince somebody to let you do the work. That's good, because doing big things requires time, people and money. You don't want to be working somewhere that's willing to waste time, people or money. If they're willing to waste time on your ill-conceived schemes, they'll be willing to waste your time on somebody else's ill-conceived schemes.

May your wisdom grace us until the stars rain down from the heavens. (C.S. Lewis)

The best advice I've received about convincing stakeholders is to write for them, not you. Try to predict what questions they are going to ask, what concerns they will have, and what objections they will bring up and have answers for those in the text. That doesn't mean you should be defensive. Don't aim only to flatter. Instead, tailor your approach. It can help to have the advice of people who've been through this journey before.

The previous paragraph may seem to you like politics, you may have a distaste for politics, or believe you can escape it by moving to a different business. It is. You may. You can't.

Leadership willing to engage with your ideas and challenge you on them is a blessing.

Build a team

You need to build two teams. Your local team is the team of engineers who are going to help you write, review, test, deploy, operate, and so on. These people are critical to your success, not because they are the fingers to your brain, but because the details are at least as important as the big picture. Get into some details yourself. Don't get into every detail. You can't.

Your extended team is a group of experts, managers, customer-facing folks, product managers, lawyers, designers and so on. Some of these people won't be engaged day-to-day. You need to find them, get them involved, and draw on them when you need help. You're not an expert in everything, but expertise in everything will be needed. Getting these people excited about your project, and bought into its success, is important.

Finally, find yourself a small group of people you trust, and ask them to keep you honest. Check in with them, to make sure your ideas still make sense. Share the documents you wrote with them.

Be willing to adapt

You will learn, at some point into doing your big project, that your solution is bullshit. You completely misunderstood the problem. You may feel like this leaves you back at the beginning, where you started. It doesn't. Instead, you've stepped up your level of expertise. Most likely, you can adapt that carefully-considered solution to the new problem, but you might need to throw it out entirely. Again, write it down. Be specific. What have you learned, and what did it teach you? Look for things you can recover, and don't throw things out prematurely.

Maxwell Bernstein (tekknolagi)

Compiling a Lisp: Instruction encoding interlude October 19, 2020 12:00 AM


Welcome back to the Compiling a Lisp series. In this thrilling new update, we will learn a little bit more about x86-64 instruction encoding instead of allocating more interesting things on the heap or adding procedure calls.

I am writing this interlude because I changed one register in my compiler code (kRbp to kRsp) and all hell broke loose — the resulting program was crashing, rasm2/Cutter were decoding wacky instructions when fed my binary, etc. Over the span of two very interesting but very frustrating hours, I learned why I had these problems and how to resolve them. You should learn, too.

State of the instruction encoder

Recall that I introduced at least 10 functions that looked vaguely like this:

void Emit_mov_reg_imm32(Buffer *buf, Register dst, int32_t src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0xc7);
  Buffer_write8(buf, 0xc0 + dst);
  Buffer_write32(buf, src);
}

These functions all purport to encode x86-64 instructions. They do, most of the time, but they do not tell the whole story. This function is supposed to encode an instruction of the form mov reg64, imm32. How does it do it? I don’t know!

They have all these magic numbers in them! What is a kRexPrefix? Well, it’s 0x48. Does that mean anything to us? No! It gets worse. What are 0xc7 and 0xc0 doing there? Why are we adding dst to 0xc0? Before this debugging and reading extravaganza, I could not have told you. Remember how somewhere in a previous post I mentioned I was getting these hex bytes from reading the compiled output on the Compiler Explorer? Yeah.

As it turns out, this is not a robust development strategy, at least with x86-64. It might be okay for some more regular or predictable instruction sets, but not this one.

Big scary documentation

So where do we go from here? How do we find out how to take these mystical hexes and incantations to something that better maps to the hardware? Well, we once again drag Tom1 into a debugging session and pull out the big ol’ Intel Software Developer Manual.

This is an enormous 26MB, 5000 page manual comprised of four volumes. It’s very intimidating. This is exactly why I didn’t want to pull it out earlier and do this properly from the beginning… but here we are, eventually needing to do it properly.

I will not pretend to understand all of this manual, nor will this post be a guide to the manual. I will just explain what sections and diagrams I found useful in understanding how this stuff works.

I only ever opened Volume 2, the instruction set reference. In that honking 2300 page volume are descriptions of every Intel x86-64 instruction and how they are encoded. The instructions are listed alphabetically and split into sections based on the first letter of each instruction name.

Let’s take a look at Chapter 3, specifically at the MOV instruction on page 1209. For those following along who do not want to download a massive PDF, this website has a bunch of the same data in HTML form. Here’s the page for MOV.

This page has every variant of the MOV instruction. There are other instructions that begin with MOV, like MOVAPD, MOVAPS, etc., but they are different enough that they are considered separate instructions.

It has six columns:

  • Opcode, which describes the layout of the bytes in the instruction stream. This describes how we’ll encode instructions.
  • Instruction, which gives a text-assembly-esque representation of the instruction. This is useful for figuring out which one we actually want to encode.
  • Op/En, which stands for “Operand Encoding” and as far as I can tell describes the operand order with a symbol that is explained further in the “Instruction Operand Encoding” table on the following page.
  • 64-Bit Mode, which tells you if the instruction can be used in 64-bit mode (“Valid”) or not (something else, I guess).
  • Compat/Leg Mode, which tells you if the instruction can be used in some other mode, which I imagine is 32-bit mode or 16-bit mode. I don’t know. But it’s not relevant for us.
  • Description, which provides a “plain English” description of the opcode, for some definition of the words “plain” and “English”.

Other instructions have slightly different table layouts, so you’ll have to work out what the other columns mean.

Here’s a preview of some rows from the table, with HTML courtesy of Felix Cloutier’s aforementioned web docs:

Opcode Instruction Op/En 64-Bit Mode Compat/Leg Mode Description
88 /r MOV r/m8,r8 MR Valid Valid Move r8 to r/m8.
REX + 88 /r MOV r/m8***,r8*** MR Valid N.E. Move r8 to r/m8.
89 /r MOV r/m16,r16 MR Valid Valid Move r16 to r/m16.
89 /r MOV r/m32,r32 MR Valid Valid Move r32 to r/m32.
..................
C7 /0 id MOV r/m32, imm32 MI Valid Valid Move imm32 to r/m32.
REX.W + C7 /0 id MOV r/m64, imm32 MI Valid N.E. Move imm32 sign extended to 64-bits to r/m64.

If you take a look at the last entry in the table, you’ll see REX.W + C7 /0 id. Does that look familiar? Maybe, if you squint a little?

It turns out, that’s the description for encoding the instruction we originally wanted, and had a bad encoder for. Let’s try and figure out how to use this to make our encoder better. In order to do that, we’ll need to first understand a general layout for Intel instructions.

Instruction encoding, big picture

All Intel x86-64 instructions follow this general format:

  • optional instruction prefix
  • opcode (1, 2, or 3 bytes)
  • if required, Mod-Reg/Opcode-R/M, also known as ModR/M (1 byte)
  • if required, Scale-Index-Base, also known as SIB (1 byte)
  • displacement (1, 2, or 4 bytes, or none)
  • immediate data (1, 2, or 4 bytes, or none)

I found this information at the very beginning of Volume 2, Chapter 2 (page 527) in a section called “Instruction format for protected mode, real-address mode, and virtual-8086 mode”.

You, like me, may be wondering about the difference between “optional”, “if required”, and “…, or none”. I have no explanation, sorry.

I’m going to briefly explain each component here, followed up with a piece-by-piece dissection of the particular MOV instruction we want, so we get some hands-on practice.

Instruction prefixes

There are a couple of kinds of instruction prefixes, like REX (Section 2.2.1) and VEX (Section 2.3). We’re going to focus on REX prefixes, since they are needed for many (most?) x86-64 instructions, and we’re not emitting vector instructions.

The REX prefixes are used to indicate that an instruction, which might normally refer to a 32-bit register, should instead refer to a 64-bit register. Also some other things but we’re mostly concerned with register sizes.

Opcode

Take a look at Section 2.1.2 (page 529) for a brief explanation of opcodes. The gist is that the opcode is the meat of the instruction. It’s what makes a MOV a MOV and not a HALT. The other fields all modify the meaning given by this field.

ModR/M and SIB

Take a look at Section 2.1.3 (page 529) for a brief explanation of ModR/M and SIB bytes. The gist is that they encode what register sources and destinations to use.

Displacement and immediates

Take a look at Section 2.1.4 (page 529) for a brief explanation of displacement and immediate bytes. The gist is that they encode literal numbers used in the instructions that don’t encode registers or anything.

If you’re confused, that’s okay. It should maybe get clearer once we get our hands dirty. Reading all of this information in a vacuum is moderately useless if it’s your first time dealing with assembly like this, but I included this section first to help explain how to use the reference.

Encoding, piece by piece

Got all that? Maybe? No? Yeah, me neither. But let’s forge ahead anyway. Here’s the instruction we’re going to encode: REX.W + C7 /0 id.

REX.W

First, let’s figure out REX.W. According to Section 2.2.1, which explains REX prefixes in some detail, there are a couple of different prefixes. There’s a helpful table (Table 2-4, page 535) documenting them. Here’s a bit diagram with the same information:

Diagram: the REX prefix byte, from high bit to low bit: the fixed pattern 0100, then the W, R, X, and B bits.

In English, and zero-indexed:

  • Bits 7-4 are always 0b0100.
  • Bit 3 is the W prefix. If it’s 1, it means the operands are 64 bits. If it’s 0, “operand size [is] determined by CS.D”. Not sure what that means.
  • Bits 2, 1, and 0 are other types of REX prefixes that we may not end up using, so I am omitting them here. Please read further in the manual if you are curious!

This MOV instruction calls for REX.W, which means this byte will look like 0b01001000, also known as our friend 0x48. Mystery number one, solved!
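
If you prefer building this byte instead of hard-coding it, here is a minimal sketch of a helper that assembles a REX prefix from its four flag bits. The function name rex and its parameters are my own, not part of the compiler's existing code, and it assumes the byte typedef used later in this post:

byte rex(int w, int r, int x, int b) {
  // The high nibble is always 0b0100; W, R, X, B occupy bits 3 down to 0.
  return 0x40 | ((w & 1) << 3) | ((r & 1) << 2) | ((x & 1) << 1) | (b & 1);
}

rex(1, 0, 0, 0) evaluates to 0b01001000, that is, 0x48, the kRexPrefix we have been writing by hand.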

C7

This is a hexadecimal literal 0xc7. It is the opcode. There are a couple of other entries with the opcode C7, modified by other bytes in the instruction (ModR/M, SIB, REX, …). Write it to the instruction stream. Mystery number two, solved!

/0

There’s a snippet in Section 2.1.5 that explains this notation:

If the instruction does not require a second operand, then the Reg/Opcode field may be used as an opcode extension. This use is represented by the sixth row in the tables (labeled “/digit (Opcode)”). Note that values in row six are represented in decimal form.

This is a little confusing because this operation clearly does have a second operand, denoted by the “MI” in the table, which shows Operand 1 being ModRM:r/m (w) and Operand 2 being imm8/16/32/64. I think it’s because it doesn’t have a second register operand that this space is free — the immediate is in a different place in the instruction.

In any case, this means that we have to make sure to put decimal 0 in the reg part of the ModR/M byte. We’ll see what the ModR/M byte looks like in greater detail shortly.

id

id refers to an immediate double word (32 bits). It’s called a double word because a word (iw) is 16 bits. In increasing order of size, we have:

  • ib, byte (1 byte)
  • iw, word (2 bytes)
  • id, double word (4 bytes)
  • io, quad word (8 bytes)

This means we have to write our 32-bit value out to the instruction stream. These notations and encodings are explained further in Section 3.1.1.1 (page 596).
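
One detail worth calling out: x86-64 immediates are stored little-endian, least significant byte first, so "writing the value out" means emitting the low byte first. The series already relies on Buffer_write32 for this; if you are following along with your own buffer, a plausible sketch (the author's actual implementation may differ) looks like this:

void Buffer_write32(Buffer *buf, int32_t value) {
  // Immediates and displacements are little-endian on x86-64:
  // emit the least significant byte first.
  uint32_t bits = (uint32_t)value;
  for (int i = 0; i < 4; i++) {
    Buffer_write8(buf, (bits >> (i * 8)) & 0xff);
  }
}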

Overall, that means that this instruction will have the following form:

Diagram: instruction layout for mov reg64, imm32: REX (byte 0), opcode (byte 1), ModR/M (byte 2), immediate (bytes 3-6).

If we were to try and encode the particular instruction mov rax, 100, it would look like this:

Diagram: mov rax, 100 encoded: REX 0x48, opcode 0xc7, ModR/M 0xc0, immediate 0x64 0x00 0x00 0x00.

This is how you read the table! Slowly, piece by piece, and with a nice cup of tea to help you in trying times. Now that we’ve read the table, let’s go on and write some code.

Encoding, programmatically

While writing code, you will often need to reference two more tables than the ones we have looked at so far. These tables are Table 2-2 “32-Bit Addressing Forms with the ModR/M Byte” (page 532) and Table 2-3 “32-Bit Addressing Forms with the SIB Byte” (page 533). Although the tables describe 32-bit quantities, with the REX prefix all the Es get replaced with Rs and all of a sudden they can describe 64-bit quantities.

These tables are super helpful when figuring out how to put together ModR/M and SIB bytes.

Let’s start the encoding process by revisiting Emit_mov_reg_imm32/REX.W + C7 /0 id:

void Emit_mov_reg_imm32(Buffer *buf, Register dst, int32_t src) {
  // ...
}

Given a register dst and an immediate 32-bit integer src, we’re going to encode this instruction. Let’s do all the steps in order.

REX prefix

Since the instruction calls for REX.W, we can keep the first line the same as before:

void Emit_mov_reg_imm32(Buffer *buf, Register dst, int32_t src) {
  Buffer_write8(buf, kRexPrefix);
  // ...
}

Nice.

Opcode

This opcode is 0xc7, so we’ll write that directly:

void Emit_mov_reg_imm32(Buffer *buf, Register dst, int32_t src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0xc7);
  // ...
}

Also the same as before. Nice.

ModR/M byte

ModR/M bytes are where the code gets a little different. We want an abstraction to build them for us, instead of manually slinging integers like some kind of animal.

To do that, we should know how they are put together. ModR/M bytes are comprised of:

  • mod (high 2 bits), which describes what big row to use in the ModR/M table
  • reg (middle 3 bits), which either describes the second register operand or an opcode extension (like /0 above)
  • rm (low 3 bits), which describes the first operand

This means we can write a function modrm that puts these values together for us:

byte modrm(byte mod, byte rm, byte reg) {
  return ((mod & 0x3) << 6) | ((reg & 0x7) << 3) | (rm & 0x7);
}

The order of the parameters is a little different than the order of the bits. I did this because it looks a little more natural when calling the function from its callers. Maybe I’ll change it later because it’s too confusing.

For this instruction, we’re going to:

  • pass 0b11 (3) as mod, because we want to move directly into a 64-bit register, as opposed to [reg], which means that we want to dereference the value in the pointer
  • pass the destination register dst as rm, since it’s the first operand
  • pass 0b000 (0) as reg, since the /0 above told us to

That ends up looking like this:

void Emit_mov_reg_imm32(Buffer *buf, Register dst, int32_t src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0xc7);
  Buffer_write8(buf, modrm(/*direct*/ 3, dst, 0));
  // ...
}

Which for the above instruction mov rax, 100, produces a modrm byte that has this layout:

Diagram: the ModR/M byte for mov rax, 100: mod=11 (direct), reg=000 (/0), rm=000 (RAX).

I haven’t put a datatype for mods together because I don’t know if I’d be able to express it well. So for now I just added a comment.

Immediate value

Last, we have the immediate value. As I said above, all this entails is writing out a 32-bit quantity as we have always done:

void Emit_mov_reg_imm32(Buffer *buf, Register dst, int32_t src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0xc7);
  Buffer_write8(buf, modrm(/*direct*/ 3, dst, 0));
  Buffer_write32(buf, src);
}

And there you have it! It took us 2500 words to get us to these measly four bytes. The real success is the friends we made along the way.
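
Before moving on, it helps to write down the expected bytes. Assuming kRax is the register constant with value 0 (this array is only for illustration, it is not part of the compiler), Emit_mov_reg_imm32(buf, kRax, 100) should now produce:

static const byte kExpectedMovRaxImm32[] = {
    0x48,                    // REX.W
    0xc7,                    // opcode
    0xc0,                    // ModR/M: mod=11 (direct), reg=000 (/0), rm=000 (RAX)
    0x64, 0x00, 0x00, 0x00,  // imm32 = 100, little-endian
};

Comparing an array like this against the buffer contents makes for a handy unit test.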

Further instructions

“But Max,” you say, “this produces literally the same output as before with all cases! Why go to all this trouble? What gives?”

Well, dear reader, having a mod of 3 (direct) means that there is no special-case escape hatch when dst is RSP. This is unlike the other mods, where there’s this [--][--] in the table where RSP should be. That funky symbol indicates that there must be a Scale-Index-Base (SIB) byte following the ModR/M byte. This means that the overall format for this instruction should have the following layout:

Diagram: instruction layout with a SIB byte: REX (byte 0), opcode (byte 1), ModR/M (byte 2), SIB (byte 3), displacement (byte 4).

If you’re trying to encode mov [rsp-8], rax, for example, the values should look like this:

Diagram: mov [rsp-8], rax encoded: REX 0x48, opcode 0x89, ModR/M 0x44, SIB 0x24, disp8 0xf8.

This is where an instruction like Emit_store_reg_indirect (mov [REG+disp], src) goes horribly awry with the homebrew encoding scheme I cooked up. When the dst in that instruction is RSP, it’s expected that the next byte is the SIB. And when you output other data instead (say, an immediate 8-bit displacement), you get really funky addressing modes. Like what the heck is this?

mov qword [rsp + rax*2 - 8], rax

This is actual disassembled assembly that I got from running my binary code through rasm2. Our compiler definitely does not emit anything that complicated, which is how I found out things were wrong.

Okay, so it’s wrong. We can’t just blindly multiply and add things. So what do we do?

The SIB byte

Take a look at Table 2-2 (page 532) again. See that trying to use RSP with any sort of displacement requires the SIB.

Now take a look at Table 2-3 (page 533) again. We’ll use this to put together the SIB.

We know from Section 2.1.3 that the SIB, like the ModR/M, is comprised of three fields:

  • scale (high 2 bits), specifies the scale factor
  • index (middle 3 bits), specifies the register number of the index register
  • base (low 3 bits), specifies the register number of the base register

Intel’s language is not so clear and is kind of circular. Let’s take a look at a sample instruction to clear things up:

mov [base + index*scale + disp], src

Note that while index and base refer to registers, scale refers to one of 1, 2, 4, or 8, and disp is some immediate value.

This is a compact way of specifying a memory offset. It’s convenient for reading from and writing to arrays and structs. It’s also going to be necessary for us if we want to write to and read from random offsets from the stack pointer, RSP.

So let’s try and encode that Emit_store_reg_indirect.

Encoding the indirect mov

Let’s start by going back to the table enumerating all the kinds of MOV instructions (page 1209). The specific opcode we’re looking for is REX.W + 89 /r, or MOV r/m64, r64.

We already know what REX.W means:

void Emit_store_reg_indirect(Buffer *buf, Indirect dst, Register src) {
  Buffer_write8(buf, kRexPrefix);
  // ...
}

And next up is the literal 0x89, so we can write that straight out:

void Emit_store_reg_indirect(Buffer *buf, Indirect dst, Register src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0x89);
  // ...
}

So far, so good. Looking familiar. Now that we have both the instruction prefix and the opcode, it’s time to write the ModR/M byte. Our ModR/M will contain the following information:

  • mod of 1, since we want an 8-bit displacement
  • reg of whatever register the second operand is, since we have two register operands (the opcode field says /r)
  • rm of whatever register the first operand is

Alright, let’s put that together with our handy-dandy ModR/M function.

void Emit_store_reg_indirect(Buffer *buf, Indirect dst, Register src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0x89);
  // Wrong!
  Buffer_write8(buf, modrm(/*disp8*/ 1, dst.reg, src));
  // ...
}

But no, this is wrong. As it turns out, you still have to do this special thing when dst.reg is RSP, as I keep mentioning. In that case, rm must be the special none value (as specified by the table). Then you also have to write a SIB byte.

void Emit_store_reg_indirect(Buffer *buf, Indirect dst, Register src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0x89);
  if (dst.reg == kRsp) {
    Buffer_write8(buf, modrm(/*disp8*/ 1, kIndexNone, src));
    // ...
  } else {
    Buffer_write8(buf, modrm(/*disp8*/ 1, dst.reg, src));
  }
  // ...
}

Astute readers will know that kRsp and kIndexNone have the same integral value of 4. I don’t know if this was intentional on the part of the Intel designers. Maybe it’s supposed to be like that so encoding is easier and doesn’t require a special case for both ModR/M and SIB. Maybe it’s coincidental. Either way, I found it very subtle and wanted to call it out explicitly.
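
To make the coincidence concrete, here is roughly how the two numberings line up. kRsp and kRbp are the names used in this series; the other Register names are my guesses following the same pattern, and the hardware fixes the numbering either way:

typedef enum {
  kRax = 0,
  kRcx,
  kRdx,
  kRbx,
  kRsp,  // 4 -- the same integral value as kIndexNone in the Index enum below
  kRbp,
  kRsi,
  kRdi,
} Register;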

For an instruction like mov [rsp-8], rax, our modrm byte will look like this:

Diagram: the ModR/M byte for mov [rsp-8], rax: mod=01 (disp8), reg=000 (RAX), rm=100 (none).

Let’s go ahead and write that SIB byte. I made a sib helper function like modrm, with two small differences: the parameters are in order of low to high bit, and the parameters have their own special types instead of just being bytes.

typedef enum {
  Scale1 = 0,
  Scale2,
  Scale4,
  Scale8,
} Scale;

typedef enum {
  kIndexRax = 0,
  kIndexRcx,
  kIndexRdx,
  kIndexRbx,
  kIndexNone,
  kIndexRbp,
  kIndexRsi,
  kIndexRdi
} Index;

byte sib(Register base, Index index, Scale scale) {
  return ((scale & 0x3) << 6) | ((index & 0x7) << 3) | (base & 0x7);
}

I made all these datatypes to help readability, but you don’t have to use them if you don’t want to. The Index one is the only one that has a small gotcha: where kIndexRsp should be is kIndexNone because you can’t use RSP as an index register.

Let’s use this function to write a SIB byte in Emit_store_reg_indirect:

void Emit_store_reg_indirect(Buffer *buf, Indirect dst, Register src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0x89);
  if (dst.reg == kRsp) {
    Buffer_write8(buf, modrm(/*disp8*/ 1, kIndexNone, src));
    Buffer_write8(buf, sib(kRsp, kIndexNone, Scale1));
  } else {
    Buffer_write8(buf, modrm(/*disp8*/ 1, dst.reg, src));
  }
  // ...
}

If you get it right, the SIB byte will have the following layout:

Diagram: the SIB byte: scale=00 (scale 1), index=100 (none), base=100 (RSP).

This is a very verbose way of saying [rsp+DISP], but it’ll do. All that’s left now is to encode that displacement. To do that, we’ll just write it out:

void Emit_store_reg_indirect(Buffer *buf, Indirect dst, Register src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0x89);
  if (dst.reg == kRsp) {
    Buffer_write8(buf, modrm(/*disp8*/ 1, kIndexNone, src));
    Buffer_write8(buf, sib(kRsp, kIndexNone, Scale1));
  } else {
    Buffer_write8(buf, modrm(/*disp8*/ 1, dst.reg, src));
  }
  Buffer_write8(buf, disp8(dst.disp));
}

Very nice. Now it’s your turn to go forth and convert the rest of the assembly functions in your compiler! I found it very helpful to extract the modrm/sib/disp8 calls into a helper function, because they’re mostly the same and very repetitive.
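
Here is a rough sketch of what such a helper might look like. The name Emit_indirect_operand is made up, it only handles the 8-bit-displacement form used in this post, and it leans on the same modrm, sib, and disp8 helpers from above, so adapt it to whatever shape your Indirect struct actually has:

void Emit_indirect_operand(Buffer *buf, Indirect operand, byte reg) {
  if (operand.reg == kRsp) {
    // RSP as a base register always needs a SIB byte; rm=none (100) signals that.
    Buffer_write8(buf, modrm(/*disp8*/ 1, kIndexNone, reg));
    Buffer_write8(buf, sib(kRsp, kIndexNone, Scale1));
  } else {
    Buffer_write8(buf, modrm(/*disp8*/ 1, operand.reg, reg));
  }
  Buffer_write8(buf, disp8(operand.disp));
}

// With the helper in place, Emit_store_reg_indirect shrinks to:
void Emit_store_reg_indirect(Buffer *buf, Indirect dst, Register src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0x89);
  Emit_indirect_operand(buf, dst, src);
}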

What did we learn?

This was a very long post. The longest post in the whole series so far, even. We should probably have some concrete takeaways.

If you read this post through, you should have gleaned some facts and lessons about:

  • Intel x86-64 instruction encoding terminology and details, and
  • how to read dense tables in the Intel Developers Manual
  • maybe some third thing, too, I dunno — this post was kind of a lot

Hopefully you enjoyed it. I’m going to go try and get a good night’s sleep. Until next time, when we’ll implement procedure calls!

Here’s a fun composite diagram for the road:

This is a composite of all the instruction encoding diagrams present in the post.


  1. If you are an avid reader of this blog (Do those people exist? Please reach out to me. I would love to chat.), you may notice that Tom gets pulled into shenanigans a lot. This is because Tom is the best debugger I have ever encountered, he’s good at reverse engineering, and he knows a lot about low-level things. I think right now he’s working on improving open-source tooling for a RISC-V board for fun. But also he’s very kind and helpful and generally interested in whatever ridiculous situation I’ve gotten myself into. Maybe I should add a list of the Tom Chronicles somewhere on this website. Anyway, everyone needs a Tom. 

October 18, 2020

Derek Jones (derek-jones)

Learning useful stuff from the Human cognition chapter of my book October 18, 2020 09:37 PM

What useful, practical things might professional software developers learn from the Human cognition chapter in my evidence-based software engineering book (an updated beta was released this week)?

Last week I checked the human cognition chapter; what useful things did I learn (combined with everything I learned during all the other weeks spent working on this chapter)?

I had spent a lot of time learning about cognition when writing my C book; for this chapter I was catching up on what had happened in the last 10 years, which included: building executable models has become more popular, sample sizes have gotten larger (mostly thanks to Mechanical Turk), more researchers are making their data available on the web, and a few new theories (but mostly refinements of existing ideas).

Software is created by people, and it always seemed obvious to me that human cognition was a major topic in software engineering. But most researchers in computing departments joined the field because of their interest in maths, computers or software. The lack of interest in the human element means that it is rarely a research topic. There is a psychology of programming interest group, but most of those involved don’t appear to have read any psychology text books (I went to a couple of their annual workshops, and while writing the C book I was active on their mailing list for a few years).

What might readers learn from the chapter?

Visual processing: the rationale given for many code layout recommendations is plain daft; people need to learn something about how the brain processes images.

Models of reading. Existing readability claims are a joke (or bad marketing, take your pick). Researchers have been using eye trackers since the 1960s to figure out what actually happens when people read text, and various models have been built. Market researchers have been using eye trackers for decades to work out where best to place products on shelves, to maximise sales. In the last 10 years software researchers have started using eye trackers to study how people read code; next they need to learn about some of the existing models of how people read text. This chapter contains some handy discussion and references.

Learning and forgetting: it takes time to become proficient; going on a course is the start of the learning process, not the end.

One practical takeaway for readers of this chapter is being able to give good reasons why other people’s proposals, claimed to be based on how the brain operates, won’t work as claimed, because that is not how the brain works. Actually, most of the time it is not possible to figure out whether something will work as advertised (this is why user interface testing is such a prolonged, and expensive, process), but the speaker with the most convincing techno-babble often wins the argument :-)

Readers might have a completely different learning experience from reading the human cognition chapter. What useful things did you learn from the human cognition chapter?

Bogdan Popa (bogdan)

Racket Web Development with Koyo October 18, 2020 06:00 PM

Inspired by Brian Adkins' RacketCon talk from yesterday, I decided to record a screencast on what it’s like to write a little web application using my not-quite-a-web-framework, koyo. You can watch it over on YouTube and you can find the resulting code on GitHub. It’s unscripted and I don’t go too deep into how everything works, but hopefully it’s easy enough to follow. I’ve also left in the various mistakes I made, since it’s usually helpful to watch someone get out of a tricky situation, so look forward to those if you watch it!

Carlos Fenollosa (carlesfe)

You may be using Mastodon wrong October 18, 2020 05:13 PM

I'm sure you have already heard about Mastodon, typically marketed as a Twitter alternative.

I will try to convince you that the word alternative doesn't mean here what you think it means, and why you may be using Mastodon wrong if you find it boring.

An alternative community

You should not expect to "migrate from Twitter to Mastodon."

Forget about the privacy angle for now. Mastodon is an alternative community, where people behave differently.

It's your chance to make new internet friends.

There may be some people for whom Mastodon is a safe haven. Yes, some users really do migrate there to avoid censorship or bullying but, for most of us, that will not be the case.

Let's put it this way: Mastodon is to Twitter what Linux is to Windows.

Linux is libre software. But that's not why most people use it. Linux users mostly want to get their work done, and Linux is an excellent platform. There is no Microsoft Word, no Adobe Photoshop, no Starcraft. If you need to use these tools, honestly, you'd better stick with Windows. You can use emulation, in the same way that there are utilities to post to Twitter from Mastodon, but that would miss the point.

The bottom line is, you can perform the same tasks, but the process will be different. You can post toots on Mastodon, upload gifs, send DMs... but it's not Twitter, and that is fine.

The Local Timeline is Mastodon's greatest invention

The problem most people have with Mastodon is that they "get bored" with it quickly. I've seen it a lot, and it means one thing: the person created their account on the wrong server.

"But," they say, "isn't Mastodon federated? Can't I chat with everybody, regardless of their server?" Yes, of course. But discoverability works differently on Mastodon.

Twitter has only two discoverability layers: your network and the whole world. Either a small group of contacts, or everybody in the whole world. That's crazy.

They try very hard to show you tweets from outside your network so you can discover new people. And, at the same time, they show your tweets to third parties, so you can get new followers. This is the way that they try to keep you engaged once your network is more or less stable and starts getting stale.

Mastodon, instead, has an extra layer between your network and the whole world: messages from people on your server. This is called the local timeline.

The local timeline is the key to enjoying Mastodon.

How long has it been since you made a new internet friend?

If you're of a certain age you may remember BBSs, Usenet, the IRC, or early internet forums. Do you recall how exciting it was to log into the unknown and realize that there were people all around the world who shared your interests?

It was an amazing feeling which got lost on the modern internet. Now you have a chance to relive it.

The local timeline dynamics are very different. There are a lot of respectful interactions among total strangers, because there is this feeling of community, of being in a neighborhood. Twitter is just the opposite: strangers shouting at each other.

Furthermore, since the local timeline is more or less limited in the number of users, you have the chance to recognize usernames, and to be recognized. You start interacting with strangers, mentioning them, sending them links they may like. You discover new websites, rabbit holes, new approaches to your hobbies.

I've made quite a few new internet friends on my Mastodon server, and I don't mean followers or contacts. I'm talking about human beings who I have never met in person but feel close to.

People are humble and respectful. And, for less nice users, admins enforce codes of conduct and, in extreme cases, users may get kicked off a server. But they are not being banned by a faceless corporation due to mass reports; everybody is given a chance.

How to choose the right server

The problem with "generalist" Mastodon servers like mastodon.social is that users have just too diverse interests and backgrounds. Therefore, there is no community feeling. For some people, that may be exactly what they're looking for. But, for most of us, there is more value in the smaller servers.

So, how can you choose the right server? Fortunately, you can do a bit of research. There is an official directory of Mastodon servers categorized by interests and regions.

Since you're reading my blog, start by taking a look at these:

And the regionals

There are many more. Simply search online for "mastodon server MY_FAVORITE_HOBBY." And believe me, servers with between 500 and 5,000 people are the best.

Final tips

Before clicking on "sign up", always browse the local timeline, the about page, and the most active users list. You will get a pretty good idea of the kind of people who chat there. Once you feel right at home you can continue your adventure and start following users from other servers.

Mastodon has an option to only display toots in specific languages. It can be very useful to avoid being flooded by toots that you just have no chance of understanding or even getting what they're about.

You can also filter your notifications by types: replies, mentions, favorites, reposts, and more. This makes catching up much more manageable than on Twitter.

Finally, Mastodon has a built-in "Content Warning" feature. It allows you to hide text behind a short explanation, in case you want to talk about sensitive topics or avoid spoiling a recent movie.

Good luck with your search, and see you on the Fediverse! I'm at @cfenollosa@mastodon.sdf.org

Tags: internet


Sevan Janiyan (sevan)

LFS, round #1 October 18, 2020 02:07 AM

Following on from the previous blog post, I started on the path of building a Linux From Scratch distribution. The project offers two paths: one using traditional SysV init, the other using systemd. I opted for the systemd route and followed the guide; it was all very straightforward. Essentially you fetch a bunch …

October 16, 2020

Patrick Louis (venam)

October 2020 Projects October 16, 2020 09:00 PM

Conveyor belt

Seven long and perilous months have gone by since my previous article; what feels like an eternity also feels like a day. Nothing and everything has happened.
All I can add to the situation in my country, which I've already drawn countless times, is that my expectations weren't fulfilled. Indeed, after a governmental void and a horrific explosion engulfing a tremendous part of the capital, I'm not sure any words can express the conflicting feelings and anger I have. Today marks exactly one year since the people started revolting.
Sentimentalities aside, let’s get to what I’ve been up to.

Psychology, Philosophy & Books

zadig cover

Language: brains
Explanation: My reading routine has been one part heavy technical books and one part leisure books.
On the leisure side, I’ve finished the following books:

  • The Better Angels of Our Nature: Why Violence Has Declined — Steven Pinker
  • The Book of M — Peng Shepherd
  • The Gene — Siddhartha Mukherjee
  • The Second Sex — Simone de Beauvoir
  • Aphorisms on Love and Hate — Nietzsche
  • How To Use Your Enemies — Baltasar Gracian
  • Zadig ou La Destinée — Voltaire
  • Great Expectations — Charles Dickens [ongoing]

While on the technical side, I’ve finished these bricks:

  • Computer Architecture: A Quantitative Approach - Hennessy & Patterson
  • Beyond Software Architecture: Creating and Sustaining Winning Solutions — Luke Hohmann
  • Compilers (the dragon book) - Aho, Lam, Sethi & Ullman
  • Operating System Concepts - Silberschatz [ongoing]

Obviously, I still want to build a bookshelf; however, the current situation has postponed this project.

As far as podcasts go, I've toned them down and only listen while exercising, which doesn't amount to much compared to when I was commuting to work.

Life Refresh & Growth

sunrise

Language: growth
Explanation: As you might have already noticed, I’ve redesigned my blog. I tried to give it more personality and to be more reflective of who I am as a person.
That involved reviewing the typography, adding meta-tags and previews, adding relevant pictures for every article, including general and particular descriptions for sections of the blog, and more.

Additionally, I’ve polished my online presence on StackOverflow and LinkedIn. It is especially important these days, when in need of new opportunities.

LinkedIn Profile

As far as software architecture goes, I'm still on the learning path, which consists of reading articles, watching videos, and trying to apply the topics to real scenarios. Recently, I've started following Mark Richards' and Neal Ford's Foundations Friday Forum, which is a monthly webinar on software architecture.

When it comes to articles, that’s where I’ve put the most energy. Here’s the list of new ones.

  • The Self, Metaperceptions, and Self-Transformation: One of my favorite articles about the self and growth. It has been influenced by theories from Carl Jung and Nietzsche.
  • Software Distributions And Their Roles Today: This is an article I had in mind for a long while but didn’t get to write. It was initially supposed to be a group discussion as a podcast but I ended up writing it as an article, and then recording a podcast too.
  • Time on Unix: My biggest and most complete article to date. I consider it an achievement, and it has been well received by readers. It's now the go-to article when it comes to time.
  • Domain Driven Design Presentation: The transcript of a talk I’ve given for the MENA-Devs community.
  • Evolutionary Software Architecture: An article where I apply my knowledge of software architecture to explain a trending topic.
  • D-Bus and Polkit, No More Mysticism and Confusion: There's a lot of confusion and hate about dbus and everything around it. I personally had no idea what these technologies implied, so I wrote an article to find out for myself whether the hate was justified.
  • Computer Architecture Takeaways: An article reviewing my knowledge on computer architecture after reading a book on it.
  • Notes About Compilers: Another article reviewing my knowledge on compilers after finishing the dragon book and other related content.
  • Did You Know Fonts Could Do All This?: Fonts are a deep and complex topic; you can talk endlessly about them. In this particular article, I had a go at different settings and how they affect the rendering of fonts.
  • Corruption Is Attractive!: I’m fascinated by glitch art, and so I wrote an article about it, trying to sum up different techniques and give my personal view of what it consists of.

Recently, I've also had an interview with my friend Oday on his YouTube channel. We had an interesting talk.

Now on the programming language side, I'm hopping on the bandwagon and learning Rust. I'm still taking baby steps.

When it comes to personal fun, I've stopped my Elevate subscription because of the spending restrictions my country has implemented. Instead, I've started using a word-of-the-day app.

Mushrooms

mushroooomz

Language: mycelium
Explanation: Finally, I’ve pushed my research about mushrooms in Lebanon online. It’s here and is composed of a map with information about each specimen.
I hope to soon go hike and discover new ones.

Unix, Nixers, 2bwm

nixers workflow compilation

Language: Unix
Explanation: There were a lot of ups and downs in the nixers community these past few months. We had to detach ourselves from the people who previously managed IRC because of their unprofessional and unacceptable behavior. Soon after, I created our own room on freenode, and since then things have gone smoothly. That was until hell happened around me and I decided to close the forums. However, the community made it clear that they wanted to help and keep them alive and well. Thus, I retracted my decision and started implementing mechanisms to make the forums more active, such as a thread of the week, a gopher server, a screenshots display, fixed DMARC for emails, and a fixed mobile view for the forums.
I’ve also uploaded all the previous year’s video compilation on YouTube.

When it comes to 2bwm, we finally added support for separate workspaces per monitor.

CTF

CTF Arab Win

Language: Security
Explanation: A friend of mine recently invited me to be part of his CTF team for the national competition. Lo and behold, a couple of months later we won the national competition, got 2nd place in the b01lers CTF, and got 1st place in the Arab & Africa CyberTalents regional CTF.
In the coming months I’ll train on topics I haven’t dealt with before.

Ascii Art & Art

Nature ASCII Art

Language: ASCII
Explanation: I haven't drawn too many pieces recently; however, I'd rather emphasize quality over quantity. You can check my pieces here.

Additionally, I've tried to make my RedBubble Shop more attractive; maybe it'll help in these hard economic times. Otherwise, it's always nice to have it around.
Other than that, I've joined the small tilde.town community, which also has a love for ASCII art.

Life and other hobbies

Goat farm

Language: life
Explanation: Life has been a bit harsh recently but I’ve tried to make the most of it.

My father and I started gardening; we planted sunflowers, cucumbers, tomatoes, rocca, parsley, cilantro, zucchinis, eggplants, bell peppers, hot peppers, ginger, green beans, garlic, onions, and more.
After the tomato harvest, we made our own tomato sauce and pasteurized it — It was heaven.

Recently, I’ve visited a local farm called Gout Blanc, it was a fun experience, but marked in time by the reality of the economic crisis we are in. The owners were wonderful and friendly.

Like anyone in this lockdown, I'm going through a bread-making phase. I quite enjoy ciabatta bread with halloum:

Homemade Ciabatta Bread

When the initial lockdown started, I ordered some joysticks to play retro games with my brother; little did I know that I would only get them 5 months later. My brother had left Lebanon to continue his studies in France by then, but I still got a retro-gaming setup.
Like many people close to me, he left for better pastures…

An anecdote: I've started being hassled by Google and YouTube. I simply cannot open a video these days without being asked to fill in captchas, so I've gotten quite good at finding cars, traffic lights, and other trifles in random pictures. More than that, the kind of ads I've been getting are of the weirdest kind. Just take a look.

ad 1 ad 2 ad 3 ad 4 ad 5

Now

What's in store for tomorrow… I'm not sure anymore. There has never been more of a need for change.

This is it!
As usual… If you want something done, no one’s gonna do it for you, use your own hands, even if it’s not much.
And let’s go for a beer together sometime, or just chill.





Attributions:

  • Internet Archive Book Images, No restrictions, via Wikimedia Commons
  • Claud Field, Public domain, via Wikimedia Commons

October 15, 2020

Caius Durling (caius)

Let's Peek: A tale of finding "Waypoint" October 15, 2020 07:00 PM

Following a product launch at work earlier this year, I theorised that if someone was watching the published lists of SSL certificates, they could potentially sneak a peek at things before they were publicised. There is probably far too much noise to monitor continuously, but as a potential hint towards the naming of things, a more targeted search might be useful. Sites like https://crt.sh/ and https://censys.io/certificates make these logs searchable and queryable.

Fast forward to this week, where at HashiConf Digital HashiCorp are announcing two new products, which they’ve been teasing for a month or so. Watching Boundary get announced in the HashiConf opening keynote I then wondered what the second project might be called.

I’ve spent a chunk of the last month looking at various HashiCorp documentation for their projects, and I noticed they have a pattern recently of using <name>project.io as the product websites. The newly announced Boundary also fits this pattern.

🤔 Could I figure out the second product name 24 hours before public release? Amazingly, yes! 🎉

Searching at random for all certificates issued for *project.io was probably going to be a bit futile, so to narrow the search space slightly I started by looking at when boundaryproject.io had its certificate issued, and by whom. The things I spotted were:

  • Common name is “boundaryproject.io”
  • Issued by LetsEncrypt (no real surprise there)
  • Issued on 2020-09-23
  • Leaf certificate
  • Not yet expired (still trusted)
  • No alternate names in the certificate

Loading up https://censys.io/certificates and building a query for this resulted in a regexp lookup against the common name, and an issued-at date range of 10 days, from just before to a week after the Boundary certificate's issue date.

parsed.subject.common_name:/[a-z]+project\.io/ AND
parsed.issuer.organization.raw:"Let's Encrypt" AND
parsed.validity.start:["2020-09-20" TO "2020-09-30"] AND
tags.raw:"leaf" AND
tags.raw:"trusted"

(Run the search yourself)

Searching brought back a couple of pages of results; I scanned them by eye and copied out the ones that had only a single name in the certificate, which resulted in the following shortlist:

  • boundaryproject.io
  • essenceproject.io
  • lumiereproject.io
  • techproject.io
  • udproject.io
  • vesselproject.io
  • waypointproject.io

We already know about Boundary, so the fact I found it in our list suggests the query might have captured the new product site too. Loading all these sites in a web browser showed some had password protection on them (ooh!), some just plain didn't load (ooh!), and some others were blatantly other things (boo!). Removing the latter ones left us with a much shorter list:

  • essenceproject.io
  • udproject.io1
  • waypointproject.io

All domains on the internet have to point somewhere, using DNS records. On a hunch I looked up a couple of the existing HashiCorp websites to see if they happened to all point at the same IP Address(es).

$ host boundaryproject.io
boundaryproject.io has address 76.76.21.21
$ host nomadproject.io
nomadproject.io has address 76.76.21.21
$ host hashicorp.com | head -1
hashicorp.com has address 76.76.21.21

Ah ha, now I wonder if any of the shortlist also points to 76.76.21.21 🤔2

$ host essenceproject.io | head -1
essenceproject.io has address 198.185.159.145
$ host udproject.io | head -1
udproject.io has address 137.74.116.3
$ host waypointproject.io
waypointproject.io has address 76.76.21.21

🎉 Excellent, https://waypointproject.io was a password protected site pointed at HashiCorp’s IP address 🎉

I then wondered if I could verify this somehow ahead of waiting for the second keynote. I first tweeted about it but didn't name Waypoint explicitly, just hid "way" and "point" in the tweet. I got a reply from @ksatirli which suggested it was correct (and then later @mitchellh confirmed it).3

HashiCorp also does a lot in public, and all the source code and related materials are on GitHub, so perhaps some of their commit messages or marketing sites would contain a reference to Waypoint. One GitHub search later across their organisation: https://github.com/search?q=org%3Ahashicorp+waypoint&type=issues and I'd discovered a commit in the newly-public hashicorp/boundary-ui repo which references Waypoint: 346f76404

chore: tweak colors to match waypoint and for a11y

Good enough for me, now to wait and see what the project is for. Given it’s now all announced and live, you can just visit https://waypointproject.io to find out! (It’s so much cooler/useful than I’d hoped for.)


  1. I so hope whoever registered this was going for UDP in the name, rather than UD Project. ↩︎

  2. I’m a massive fan of IP address related quirks. Facebook’s IPv6 address contains face:b00c for example. A nice repeating 76.76.21.21 is almost IPv4 art somehow. ↩︎

  3. Secrets are more fun when they are kept secret. 🥳 ↩︎

Jeremy Morgan (JeremyMorgan)

Building a Go Web API with the Digital Ocean App Platform October 15, 2020 05:03 AM

Recently, Digital Ocean announced they're entering the PaaS market with their new application platform. They've hosted virtual machines (droplets) and Kubernetes-based services for years, but now they're creating a platform that makes it a simple point-and-click process to get an application up and running. So I decided to try it. In this tutorial, we're going to build an application on that platform. I will use Go to make a small web API and have it backed by a SQLite database.

October 14, 2020

Gokberk Yaltirakli (gkbrk)

Status update, October 2020 October 14, 2020 09:00 PM

To nobody's surprise, the consistency of status updates has been less than perfect. But still, here I am with another catch-up post. Since the last update was a while back, this one might end up slightly longer.


First off, let me start with a career update. I have received my undergraduate degree and I am now officially a Software Engineer. Recently I’ve started working with a company that does mobile network optimization. I’m now a part of their Integration team, and I get to work with a lot of internals of mobile networks. This is exciting for me because of my interest in radio communications, as I get to work on non-toy problems now.

I migrated my personal finances from basic CSV files to double-entry bookkeeping. I decided to go with a homebrew solution, so I published ledger.py. It has a syntax that roughly resembles ledger-cli and beancount, but is currently not compatible with either.

I have also written a few throw-away scripts that can read both my previous budget CSV and exports from my previous bank, so I get to backfill a lot of historical data.

I started working on a networking stack, along with a custom packet routing algorithm. There is no name for the project yet, and it is not quite ready for a fancy public release, but I am occasionally publishing code dumps on gkbrk/network01. I am testing this network in a sort of closed-alpha with a small group of friends.

The network is intended to work with a topology where nodes don't have direct links to other nodes. This is different from the so-called overlay networks. While most links between nodes go through the internet via our ISPs right now, we are intending to add radio links between some nodes in order to reduce our reliance on ISPs. There is nothing in the network design that prevents different kinds of links from being used.

As of now, the network can find paths between nodes, can recover and discover new paths in case some links fail, and can route packets between all nodes. We have done some trivial tests including private messaging and a few extremely choppy voice calls.

I am intending to work more on this project and even write some blog posts about it if I manage to stay interested.

As I have moved countries, I have a lot of paperwork to do, and some of this paperwork involves grabbing difficult-to-get appointments. I had the joy of automating this work and keeping myself up to date using Selenium and the SMS API from AWS.

I initially thought I would go with Twilio, but to my disappointment things weren't too smooth with them. Everything went well at first: I started to integrate their APIs, and it was time to put some credits in my account. Although I looked completely normal to their automated systems, they decided to block me seconds after charging my card. Apparently paying for services is suspicious these days. And of course, no reply to support tickets and currently no refund in sight.

That’s all for this month, thanks for reading!

October 13, 2020

Mark Fischer (flyingfisch)

Sharpie Stainless Steel Pen Refill Update October 13, 2020 01:14 PM

I have finally tried another refill with my Sharpie pen that I think is much more similar to the original. The refill is a Schmidt 6040 FineLiner Fiber Tip Metal Refill M. I wish they made a Fine version, but this is close enough. It doesn't bleed, and feels similar to the original pen.

October 12, 2020

Noon van der Silk (silky)

New job; moving to Cambridge! October 12, 2020 12:00 AM

Posted on October 12, 2020 by Noon van der Silk

So, I’m very excited to share that we’re moving overseas! We’re headed to Cambridge in the UK at the end of this month.

I’m very lucky to be starting work with a very cool quantum computing company: Riverlane.

Leaving Melbourne

It’s going to be interesting leaving Melbourne. I’ve lived here all my life basically, and I’ve found a really nice group of friends. I’m going to miss everyone.

I owe a big thanks to everyone that’s helped me in my career and life over the years here. I won’t list all of you, but thanks :) I wouldn’t be where I am if I hadn’t had your help and support. ❤️ 💖

In particular the meetup community has been a great place for me and somewhere I was able to forge some really strong friendships. Specifically, I’ve had a great time at the Melbourne Maths and Science Meetup, the Haskell Meetup, back in the day I loved MXUG, and of course I have to thank my friend David Kemp for being the only consistent attendee of the Quantum Lunch, started way back in the day! I’ll also miss hanging out with the cool people who’ve come to events I’ve helped organise, such as compose conference and post-prediction conference. I love Melbourne for the really nice connection you get between different communities, and it’s been some of my favourite times meeting people from outside my standard little bubble.

I also have to thank my friend Charles Hill from Melbourne University, who taught me so much about quantum computing, is an amazing researcher and very kind and generous person.

Thanks also to the other people in the tech community here that have supported me and helped me first move into new jobs and learn interesting and fun things. I’ve been very lucky.

Of course, thanks again to all the people that helped out with Braneshop whom I’ve already mentioned over there.

Cambridge

We’re totally new to Cambridge, so if you’ve got suggestions/connections I’d love to hear about it! Definitely keen to meet some people and get into the community over there.

Reach out if you like!

I’ve benefited a lot from emailing random people and just asking questions, so please feel free to reach out to me if you think I can help with anything!

October 11, 2020

Derek Jones (derek-jones)

Learning useful stuff from the Cognitive capitalism chapter of my book October 11, 2020 10:19 PM

What useful, practical things might professional software developers learn from the Cognitive capitalism chapter in my evidence-based software engineering book?

This week I checked the cognitive capitalism chapter; what useful things did I learn (combined with everything I learned during all the other weeks spent working on this chapter)?

Software systems are the product of cognitive capitalism (more commonly known as economics).

My experience is that most software developers don't know anything about economics, so everything in this chapter is likely to be new to them. The chapter is more tutorial-like than the other chapters.

Various investment models are discussed. The problem with these kinds of models is obtaining reliable data. But, hopefully the modelling ideas will prove useful.

Things I learned about when writing the chapter include social learning, group learning, and the mess that is open source licensing.

Building software systems usually requires that many of the individuals involved do lots of learning. How do people decide what to learn, e.g., copy others or strike out on their own? This problem is not software specific; in fact, social learning appears to be one of the major cognitive abilities that separates us from other apes.

Organizational learning and forgetting is much talked about, and it was good to find some data dealing with this. Probably not applicable to most people.

Open source licensing is a mess in that software containing a variety of, possibly incompatible, licenses often gets mixed together. What future lawsuits await?

For me, potentially the most immediately useful material was group learning; there are some interesting models for how this sometimes works.

Readers might have a completely different learning experience from reading the cognitive capitalism chapter. What useful things did you learn from the cognitive capitalism chapter?

Andreas Zwinkau (qznc)

Pondering Amazon's Manyrepo Build System October 11, 2020 12:00 AM

Amazon's build system provides valuable insights for manyrepo environments.

Read full article!

Maxwell Bernstein (tekknolagi)

Compiling a Lisp: Heap allocation October 11, 2020 12:00 AM


Welcome back to the “Compiling a Lisp” series. Last time we added support for if expressions. This time we’re going to add support for basic heap allocation.

Heap allocation comes in a couple of forms, but the one we care about right now is the cons primitive. Much like AST_new_pair in the compiler, cons should:

  • allocate some space on the heap,
  • set the car and cdr, and
  • tag the pointer appropriately.

Once we have that pair, we’ll want to poke at its data. This means we should probably also implement car and cdr primitive functions today.

What a pair looks like in memory

In order to generate code for packing and pulling apart pairs, we should probably know how they are laid out in memory.

Pairs contain two elements, side by side — kind of like a two-element array. The first element (pair[0]) is the car and the second one (pair[1]) is the cdr.

   +-----+-----+
...| car | cdr |...
   +-----+-----+
   ^
   pointer

The untagged pointer points to the address of the first element, and the tagged pointer has some extra information (kPairTag == 1) that we need to get rid of to inspect the elements. If we don’t, we’ll try and read from one byte after the pointer, somewhere in the middle of the car. This will give us bad data.

To make things more concrete, imagine our pair is allocated at 0x10000. Our car lives at *(0x10000) (using C notation) and our cdr lives at *(0x10000 + kWordSize). The tagged pointer in this case would be 0x10001 and kWordSize is 8.
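In C terms, untagging and reading a pair boils down to something like the following sketch. To be clear, this is just an illustration and not code from the compiler; the uword type and kPairTag constant here merely mirror the ones used in this series.

#include <stdint.h>

typedef uint64_t uword;
enum { kPairTag = 1 };

// Strip the tag, then index the two-word pair like a tiny array.
uword pair_car(uword tagged) {
  uword *pair = (uword *)(tagged - kPairTag);  // e.g. 0x10001 -> 0x10000
  return pair[0];                              // car is the first word
}

uword pair_cdr(uword tagged) {
  uword *pair = (uword *)(tagged - kPairTag);
  return pair[1];                              // cdr is the second word
}

The machine code we emit later does the same thing, except the subtraction gets folded into the load’s displacement.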

Allocating some memory

We could make a call to malloc whenever we need a new object. This has a couple of downsides, notably that malloc does a lot of internal bookkeeping that we really don’t need, and that there’s no good way to keep track of what memory we have allocated and needs garbage collecting (which we’ll handle later). It also has the unfortunate property of requiring C function call infrastructure, which we don’t have yet.

What we’re going to do instead is allocate a big slab of memory at the beginning of our process. That will be our heap. Then, to keep track of what memory we have used so far, we’re going to bump the pointer every time we allocate. So here’s what the heap looks like before we allocate a pair:

+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |...
+-----+-----+-----+-----+-----+-----+-----+-----+
^
heap

The empty cells aren’t necessarily empty, but they are unused and they are garbage data.

Here is what it looks like after we allocate a pair:

+-----+-----+-----+-----+-----+-----+-----+-----+
| car | cdr |     |     |     |     |     |     |...
+-----+-----+-----+-----+-----+-----+-----+-----+
^           ^
pair        heap

Notice how the heap pointer has been moved over 2 words, and the pair pointer is the returned cons cell. Although we’ll tag the pair pointer, I am pointing it at the beginning of the car for clarity in the diagram.

In order to get this big slab of memory in the first place, we’ll have the outside C code (right now, that’s our test handler) call malloc.

You’re probably wondering what we’re going to do when we run out of memory. At some point in this series we’ll have a garbage collector that can reclaim some space for us. Right now, though, we’re just going to do … nothing. That’s right, we won’t even raise some kind of “out of memory” error. Remember, we don’t yet have error reporting facilities! Instead, we’ll use tools like Valgrind and AddressSanitizer to make sure we’re not overrunning our allocated buffer.
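Put another way, the allocation scheme we want the generated code to follow is the moral equivalent of this little C sketch (illustrative only; the names are made up for the example, and, as noted above, there is no out-of-memory check yet):

#include <stdlib.h>
#include <stdint.h>

typedef uint64_t uword;

static uword *heap;  // next unused word in the big slab

void heap_init(size_t num_words) {
  heap = malloc(num_words * sizeof(uword));  // the one up-front allocation
}

// Bump allocation: hand out the current pointer, then move it forward.
uword *allocate(size_t num_words) {
  uword *result = heap;
  heap += num_words;
  return result;
}

In the compiler, the role of the heap variable will be played by a register rather than a global, which is what the next section sets up.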

Implementation strategy

In order to make allocation from that big buffer fast and easy, we’re going to keep the heap pointer in a register. Our compiler emits instructions that use rbp, rsp, and rax, so we’ll have to pick another one. Ghuloum uses rsi, so we’ll use that as well.

In order to get the heap pointer in rsi in the first place, we’ll have to capture it from the outside C code. To do this, we’ll add a parameter to our entrypoint by modifying the function prologue.

Remember JitFunction? This is what the C code uses to understand how to call our mmap-ed function. We’re going to need to modify this first.

// Before:
typedef uword (*JitFunction)();

// After:
typedef uword (*JitFunction)(uword *heap);

That’s going to need to take a new parameter now — a pointer to the heap. This means that our kFunctionPrologue will need to expect that in the first parameter register in the calling convention, and store it somewhere safe. This register is rdi, so we can emit a mov rsi, rdi to store our heap pointer away.

Now, for the lifetime of the Lisp entrypoint, we can refer to the heap by the name rsi and modify it accordingly. We’ll keep an internal convention that rsi always points to the next available chunk of memory.

Want to allocate memory? Copy the current heap pointer into rax and update the heap pointer with add rsi, AllocationSize. We’ll need to add a new instruction for moving data between registers. Honestly, I am kind of surprised we haven’t needed that yet.

Want to store your car and cdr in your new pair? Write to offset 0 and kWordSize of rax, respectively. We’ll reuse our indirect store instruction.

Want to tag your pointer? add rax, Tag or or rax, Tag. These two instructions are equivalent here because all three of the taggable bits in a heap object pointer will be zero.

This word-alignment is easy to maintain now because all pairs will be size 16, which is a multiple of 8. Later on, when we add symbols and strings and other data types that have non-object data in them, we’ll have to insert padding between allocations to keep the alignment invariant.
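For what it’s worth, the usual way to keep that invariant is to round every allocation size up to the next multiple of the word size. A hedged sketch, not taken from this series, looks like this:

#include <stdint.h>

typedef uint64_t uword;
enum { kWordSize = sizeof(uword) };

// Round size up to the next multiple of kWordSize (a power of two), so the
// low tag bits of every heap pointer stay zero.
uword align_up(uword size) {
  return (size + kWordSize - 1) & ~(uword)(kWordSize - 1);
}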

Once we have pairs allocated, they’re kind of useless unless we can also poke at their elements.

To implement car, we’ll remove the tag from the pointer and read from the memory pointed to by the register: mov rax, [Ptr+Car-Tag]. You can also do this with a sub rax, Tag and then a mov.

Implementing cdr is very similar, except we’ll be doing mov rax, [Ptr+Cdr-Tag].

Brass tacks

Now that we’ve gotten our minds around the abstract solution to the problem, we should write some code.

First off, here is the addition to the prologue I mentioned earlier:

const byte kEntryPrologue[] = {
  // Save the heap in rsi, our global heap pointer
  // mov rsi, rdi
  kRexPrefix, 0x89, 0xfe,
};

Let’s once more add an entry to Compile_call.

int Compile_call(Buffer *buf, ASTNode *callable, ASTNode *args,
                 word stack_index, Env *varenv) {
  if (AST_is_symbol(callable)) {
    // ...
    if (AST_symbol_matches(callable, "cons")) {
      return Compile_cons(buf, /*car=*/operand1(args), /*cdr=*/operand2(args),
                          stack_index, varenv);
    }
    // ...
  }
}

We don’t really need to add a whole new function for cons since we’re not doing structural recursion on the parameters or anything, but Compile_call just keeps getting bigger and this helps keep it smaller.

Compile_cons is pretty much exactly what I described above. I pulled out rsi into kHeapPointer so that we can change it later if we need to.

const Register kHeapPointer = kRsi;

int Compile_cons(Buffer *buf, ASTNode *car, ASTNode *cdr,
                 int stack_index, Env *varenv) {
  // Compile and store car on the stack
  _(Compile_expr(buf, car, stack_index, varenv));
  Emit_store_reg_indirect(buf,
                          /*dst=*/Ind(kRbp, stack_index),
                          /*src=*/kRax);
  // Compile and store cdr
  _(Compile_expr(buf, cdr, stack_index - kWordSize, varenv));
  Emit_store_reg_indirect(buf, /*dst=*/Ind(kHeapPointer, kCdrOffset),
                          /*src=*/kRax);
  // Fetch car and store in the heap
  Emit_load_reg_indirect(buf, /*dst=*/kRax, /*src=*/Ind(kRbp, stack_index));
  Emit_store_reg_indirect(buf, /*dst=*/Ind(kHeapPointer, kCarOffset),
                          /*src=*/kRax);
  // Store tagged pointer in rax
  Emit_mov_reg_reg(buf, /*dst=*/kRax, /*src=*/kHeapPointer);
  Emit_or_reg_imm8(buf, /*dst=*/kRax, kPairTag);
  // Bump the heap pointer
  Emit_add_reg_imm32(buf, /*dst=*/kHeapPointer, kPairSize);
  return 0;
}

Note that even though we’re compiling two expressions one right after another, we don’t need to bump stack_index or anything. This is because we’re storing the results not on the stack but in the pair.

As it turns out, we do need to store one of the intermediates on the stack because otherwise we risk overwriting random data in the heap. As Leonard Schütz pointed out to me, the previous version of this code would fail if either the car or cdr expressions modified the heap pointer. Thank you for the correction!

As promised, here is the new instruction to move data between registers:

void Emit_mov_reg_reg(Buffer *buf, Register dst, Register src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0x89);
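  // ModRM byte: mod=0b11 (register-direct), reg field = src, r/m field = dst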
  Buffer_write8(buf, 0xc0 + src * 8 + dst);
}

Alright, that’s cons. Let’s implement car and cdr. These are extraordinarily short implementations:

int Compile_call(Buffer *buf, ASTNode *callable, ASTNode *args,
                 word stack_index, Env *varenv) {
  if (AST_is_symbol(callable)) {
    // ...
    if (AST_symbol_matches(callable, "car")) {
      _(Compile_expr(buf, operand1(args), stack_index, varenv));
      Emit_load_reg_indirect(buf, /*dst=*/kRax,
                             /*src=*/Ind(kRax, kCarOffset - kPairTag));
      return 0;
    }
    if (AST_symbol_matches(callable, "cdr")) {
      _(Compile_expr(buf, operand1(args), stack_index, varenv));
      Emit_load_reg_indirect(buf, /*dst=*/kRax,
                             /*src=*/Ind(kRax, kCdrOffset - kPairTag));
      return 0;
    }
    // ...
  }
}

Both car and cdr compile their argument and then load from the resulting address.

That’s it. That’s the whole implementation! It’s kind of nice that now we have these building blocks, adding new features is not so hard.

Testing

I’ve written a couple of tests for this implementation. In order to make this testing painless, I’ve also added a new type of test harness that passes the tests a buffer and a heap. I call it — wait for it — RUN_HEAP_TEST.

Anyway, here’s a test that shows we can allocate pairs. To fully test it, I’ve added some helpers for poking at object internals: Object_pair_car and Object_pair_cdr. Note that these may be, but are not necessarily, the same as the corresponding AST functions. The C compiler could hypothetically re-order struct elements, I think. Joker_vD on Hacker News points out that C compilers are not permitted to re-order elements, but may insert padding for alignment.

TEST compile_cons(Buffer *buf, uword *heap) {
  ASTNode *node = Reader_read("(cons 1 2)");
  int compile_result = Compile_entry(buf, node);
  ASSERT_EQ(compile_result, 0);
  // clang-format off
  byte expected[] = {
      // mov rax, 0x4
      0x48, 0xc7, 0xc0, 0x04, 0x00, 0x00, 0x00,
      // mov [rbp-8], rax
      0x48, 0x89, 0x45, 0xf8,
      // mov rax, 0x8
      0x48, 0xc7, 0xc0, 0x08, 0x00, 0x00, 0x00,
      // mov [rsi+Cdr], rax
      0x48, 0x89, 0x46, 0x08,
      // mov rax, [rbp-8]
      0x48, 0x8b, 0x45, 0xf8,
      // mov [rsi+Car], rax
      0x48, 0x89, 0x46, 0x00,
      // mov rax, rsi
      0x48, 0x89, 0xf0,
      // or rax, kPairTag
      0x48, 0x83, 0xc8, 0x01,
      // add rsi, 2*kWordSize
      0x48, 0x81, 0xc6, 0x10, 0x00, 0x00, 0x00,
  };
  // clang-format on
  EXPECT_ENTRY_CONTAINS_CODE(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_entry(buf, heap);
  ASSERT(Object_is_pair(result));
  ASSERT_EQ_FMT(Object_encode_integer(1), Object_pair_car(result), "0x%lx");
  ASSERT_EQ_FMT(Object_encode_integer(2), Object_pair_cdr(result), "0x%lx");
  AST_heap_free(node);
  PASS();
}

Here is a test for that tricky nested cons case that messed me up originally:

TEST compile_nested_cons(Buffer *buf, uword *heap) {
  ASTNode *node = Reader_read("(cons (cons 1 2) (cons 3 4))");
  int compile_result = Compile_entry(buf, node);
  ASSERT_EQ(compile_result, 0);
  Buffer_make_executable(buf);
  uword result = Testing_execute_entry(buf, heap);
  ASSERT(Object_is_pair(result));
  ASSERT(Object_is_pair(Object_pair_car(result)));
  ASSERT_EQ_FMT(Object_encode_integer(1),
                Object_pair_car(Object_pair_car(result)), "0x%lx");
  ASSERT_EQ_FMT(Object_encode_integer(2),
                Object_pair_cdr(Object_pair_car(result)), "0x%lx");
  ASSERT(Object_is_pair(Object_pair_cdr(result)));
  ASSERT_EQ_FMT(Object_encode_integer(3),
                Object_pair_car(Object_pair_cdr(result)), "0x%lx");
  ASSERT_EQ_FMT(Object_encode_integer(4),
                Object_pair_cdr(Object_pair_cdr(result)), "0x%lx");
  AST_heap_free(node);
  PASS();
}

Here’s a test for reading the car of a pair. The test for cdr is so similar I will not include it here.

TEST compile_car(Buffer *buf, uword *heap) {
  ASTNode *node = Reader_read("(car (cons 1 2))");
  int compile_result = Compile_entry(buf, node);
  ASSERT_EQ(compile_result, 0);
  // clang-format off
  byte expected[] = {
      // mov rax, 0x4
      0x48, 0xc7, 0xc0, 0x04, 0x00, 0x00, 0x00,
      // mov [rbp-8], rax
      0x48, 0x89, 0x45, 0xf8,
      // mov rax, 0x8
      0x48, 0xc7, 0xc0, 0x08, 0x00, 0x00, 0x00,
      // mov [rsi+Cdr], rax
      0x48, 0x89, 0x46, 0x08,
      // mov rax, [rbp-8]
      0x48, 0x8b, 0x45, 0xf8,
      // mov [rsi+Car], rax
      0x48, 0x89, 0x46, 0x00,
      // mov rax, rsi
      0x48, 0x89, 0xf0,
      // or rax, kPairTag
      0x48, 0x83, 0xc8, 0x01,
      // add rsi, 2*kWordSize
      0x48, 0x81, 0xc6, 0x10, 0x00, 0x00, 0x00,
      // mov rax, [rax-1]
      0x48, 0x8b, 0x40, 0xff,
  };
  // clang-format on
  EXPECT_ENTRY_CONTAINS_CODE(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_entry(buf, heap);
  ASSERT_EQ_FMT(Object_encode_integer(1), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

Other objects

I didn’t cover variable-length objects in this post because I wanted to focus on the basics of allocating and poking at allocated data structures. Next time, we’ll add symbols and strings, and we’ll learn about instruction encoding.


Robin Schroer (sulami)

Testing Hexagonal Architecture October 11, 2020 12:00 AM

Hexagonal Architecture, also known as Ports and Adapters, was first conceived by Cockburn in 2005, and popularised by Freeman & Pryce’s Growing Object-Oriented Software, Guided by Tests in 2009. For those unfamiliar, it describes an application architecture entirely comprised of ports, which are interfaces, and adaptors, which are implementations for those interfaces. The adaptors can depend on other ports, but not on other adaptors. A system is then constructed by selecting a full set of adaptors, depending on the requirements, and composing them using dependency injection.

A port can represent an external resource or service, but also a logical component of the system, like an HTTP server or a queue handler.

An Example Port & Adaptor

A simple example for a port could be blob storage. I will be using Clojure in this post, but no prior knowledge is required for understanding. A port in this case is a protocol, which we implement like so:

(defprotocol BlobStoragePort
  (store-object [this loc obj]
    "Store `obj` at  `loc`.")
  (retrieve-object [this loc]
    "Retrieve the object at `loc`.
    Returns `nil` if not found."))

Now that we have a port with an interface in the form of abstract method declarations, we can implement an adaptor, for example using S3:

(defrecord S3StorageAdaptor [bucket-loc]
  BlobStoragePort
  (store-object [this loc obj]
    (s3/put-object :bucket-loc bucket-loc
                   :key loc
                   :file obj))
  (retrieve-object [this loc]
    (s3/get-object :bucket-loc bucket-loc
                   :key loc)))

(defn new-s3-storage-adaptor [bucket-loc]
  (s3/create-bucket bucket-loc)
  (->S3StorageAdaptor bucket-loc))

During tests, we would like to use a blob storage that is much faster and not dependent on external state, so we can use a simple map in an atom:

(defrecord MemoryBlobStorageAdaptor [storage-map]
  BlobStoragePort
  (store-object [this loc obj]
    (swap! storage-map assoc loc obj))
  (retrieve-object [this loc]
    (get @storage-map loc)))

(defn new-memory-blob-storage-adaptor []
  (->MemoryBlobStorageAdaptor (atom {})))

Testing the Port

It has been long known that a direct mapping of tests to internal methods is an anti-pattern to be avoided. As such we will prefer testing on a port-level over testing on an adaptor-level. In practice that means we assert a certain set of behaviours about every adaptor for a given port by using only the public port methods in our tests, and using the same tests for all adaptors.

;; Abstract port test suite

(defn- store-and-retrieve-test [adaptor]
  (testing "store and retrieve returns the object"
    (let [loc "store-and-retrieve"
          obj "test-object"]
      (store-object adaptor loc obj)
      (is (= obj
             (retrieve-object adaptor loc))))))

(defn- not-found-test [adaptor]
  (testing "returns nil for nonexistent objects"
    (is (nil? (retrieve-object adaptor "not-found")))))

;; Specific adaptor tests

(deftest blob-storage-adaptor-test
  (let [adaptors [(new-memory-blob-storage-adaptor)
                  (new-s3-storage-adaptor "test")]]
    (doseq [adaptor adaptors]
      (store-and-retrieve-test adaptor)
      (not-found-test adaptor))))

This has the advantage of establishing a consistent set of behaviours across all adaptors and keeping them in sync. One might wonder about intended behavioural differences between adaptors for the same port, but I would argue that from the outside, all adaptors for a given port should exhibit the same behaviour. Because we are only using the public interface for testing, any internal differences are conveniently hidden from us.

The Rest of the System

Now that we have established a port, as well as some adaptors, we can build on top of them. Blob storage is a lower level port in our system, and we are going to add a higher level port that implements some kind of business logic which requires blob storage.

;; Port definition omitted for brevity.

(defrecord BusinessLogicAdaptor [blob-storage-adaptor]
  BusinessLogicPort
  (retrieve-double [this loc]
    (* 2 (retrieve-object blob-storage-adaptor loc))))

We are free to use different blob storage adaptors for different systems, for example production, staging, CI, or local development. The business logic adaptor is oblivious to the actual blob storage implementation injected.

On Mocks & Stubs

The careful reader might have noticed that the dependency injection of different adaptors looks a lot like mocking, and this is very much true. While mocking has been considered more and more problematic in recent years, the fact that we assert the same set of behaviours for our mocks as we assert for the “real components” leads us to much more fully featured and realistic mocks, compared to the ones which are written for specific tests and then rarely touched afterwards.

If the difference in behaviour between different adaptors leads to problems which are not caught by the test suite, the problem is not mocking, but an incomplete behaviour specification for the adaptor in question.

eta (eta)

Strict COVID-19 restrictions in universities are irresponsible October 11, 2020 12:00 AM

This post is about mostly personal circumstances / issues, as well as current affairs. If that’s not what you want, turn back now.

At the start of the current coronavirus disease 2019 (COVID-19) pandemic, we were told that “flattening the curve” was a good idea – i.e. attempting to limit the spread of the disease by staying at home, wearing face coverings, etc. was a necessary step we should all take in order to prevent the national health services from getting overwhelmed (leading to an excess of deaths of people who could otherwise be helped).

A significant number of months have passed since March, and a new wave of unsuspecting secondary school graduates has descended on the UK’s universities1 – but, obviously, since there’s still a pandemic going on, things are different from the way they used to be. Pretty much all universities have new precautions to limit the spread of the disease, including things like

  • grouping students into (logical) “households”, and restricting interaction between said households
  • enforcing social distancing requirements
  • enforcing face covering usage
  • limiting the number of students that can be in the same place at one time (in line with the nationwide “rule of six”)
  • getting rid of all face-to-face tuition, and moving everything online
  • adding a curfew to, or closing, pubs and social spaces

Some of these precautions involve more sacrifices on the part of the students than others; wearing face coverings is relatively zero-cost, and has been shown to limit the spread of the disease quite significantly2. However, the goal of the overwhelming majority of the restrictions is clear: limit social interaction as far as practicable. (This ‘makes sense’, because social interaction is how the virus is spread.)

The point I want to express here is that having that as a goal in the context of universities is somewhat irresponsible, and seems to completely ignore the mental health concerns of an entire year’s worth of students at university right now3. Most students have left the (hopefully relatively comfortable) environment of secondary school to come to university – sometimes in an entirely new city, or indeed country. These students typically don’t have many people they can talk to once they arrive, having left the vast majority of their friends behind from school; instead, they must somehow discover new people, usually by having a lot of spontaneous interactions until they’re able to bed in and start to establish some friendships.

It doesn’t take a genius to realise that this process is not compatible with the above stated goal of not having much social interaction.

However, what I think is particularly irresponsible is the lack of discussion surrounding the consequences of not letting this process play out as normal. The need for students to socialize and make friends is invariant; the feeling of loneliness is inherent to being human and isn’t going away any time soon, so people will (attempt to) socialize to feel less lonely, especially when placed in an unfriendly new environment. Examples of consequences arising from a lack of social interactions among students include

  • greater incidences of mental health problems, as loneliness creates new or exacerbates existing issues
  • a reduced ability to even notice and help with such problems, as remote learning can mask all sorts of issues that are more easily recognizable in person
  • reduced academic performance and ability, due to previously mentioned mental health problems
  • a greater dropout rate, leading to reduced income for universities (some of which are already struggling to stay afloat)
  • in the extreme case, greater incidences of suicides

It’s also the case that not everyone is perfectly rule-abiding. While meeker students might follow restrictions and suffer the associated consequences, others will flagrantly disobey them, a fact which has consequences of its own:

  • instead of socializing in ‘controlled’ environments, under the purview of (e.g.) student wellbeing officers, students will socialize elsewhere (e.g. a random park)
    • in these ‘uncontrolled’ environments, a greater prevalence of dangerous behaviours (excessive drinking, drug use, etc.) would be expected
    • …but since these are the only opportunities available to undersocialized students, more students might end up taking unwise risks than would otherwise
    • there is already evidence to suggest more students are taking drugs and dying from it through precisely this mechanism4
  • from the perspective of the virus, the replacements for the now-banned opportunities to socialize are likely a lot worse, increasing net transmission

A lot of the problems here tie into greater issues with the discussion of the pandemic in the media and elsewhere; a lot of people seem to think that the worrying graph of growing cases is unquestionably something that must be dealt with immediately (perhaps with a lockdown, which is even worse for students). Don’t get me wrong – COVID-19 is a deadly disease, and must not be underestimated. Letting the disease run completely unchecked throughout the population, without any restrictions whatsoever, is a terrible idea and would kill many people unnecessarily; a very contested document called the Great Barrington declaration calls for something akin to that (albeit with protections in place for vulnerable members of society).

The reality is that it’s very difficult to come to a decision, and neither extremist view is correct; making everyone sit on their hands until a vaccine is available is stupid, but so is letting the disease run wild. There’s much we don’t know about the impacts of the virus, including whether or not it has long-term health implications for certain groups (and the conditions under which such long-term complications might arise) – but sensationalizing (e.g. evocative news headlines that attempt to instil fear as to the deadliness of the disease) does not help us come to a reasoned conclusion about risk.

To conclude, then, I believe the evidence to support strict COVID-19 restrictions in UK universities is questionable, and a re-think about the rationale for, and the consequences of, such strict restrictions is sorely required. It’s really unclear whether the benefits conferred by severely limiting social interaction (at least, imposing rules that attempt to achieve such) are worth the consequences of doing so – heck, it’s even unclear whether people even follow the rules enough to limit transmission at all (and the recent outbreaks in universities across the nation confirm that).

There seems to be a lack of humane thinking amongst those who impose said restrictions; the problem cannot be viewed as a simple mathematical calculation of how to reduce cases (if reducing cases is even something worth attempting to do!), but as one whose chosen measures cause significant human suffering for those affected. With the world being more divided and polarised than ever, it’s worth trying to be empathetic – to both see the fear on the part of those pushing for a lockdown and limitation of cases, and to recognize the crushing impact restrictions have on the restricted.

  1. I’m one of these, of course, which is why I’m writing this. 

  2. Even if you disagree with the evidence here, face coverings are still basically zero-cost – you really don’t sacrifice much by wearing one! 

  3. If you disagree with me, please read the whole article first before getting angry. 

  4. I can’t find a citation for this, so take this claim with a pinch of salt. 

October 09, 2020

Sevan Janiyan (sevan)

How to open source: going from NetBSD to Linux October 09, 2020 01:30 AM

TL;DR: a BSD user tries something else and wonders why things are different. This post has sat in draft form for quite some time. At first it was written with highlighting the NetBSD project in mind and I started thinking about revisiting it recently due to frustration with running a mainstream Linux distribution when investigating …

October 07, 2020

Maxwell Bernstein (tekknolagi)

Compiling a Lisp: If October 07, 2020 11:00 AM


Welcome back to the “Compiling a Lisp” series. Last time we added support for let expressions. This time we’re going to compile if expressions.

Compiling if expressions will allow us to write code that performs decision making. For example, we can write code that does something based on the result of some imaginary function coin-flip:

(if (= (coin-flip) heads)
    123
    456)

If the call to coin-flip returns heads, then this whole if-expression will evaluate to 123. Otherwise, it will evaluate to 456. To determine if an expression is truthy, we’ll check if it is not equal to #f.

Note that the iftrue and iffalse expressions (consequent and alternate, respectively) are only evaluated if their branch is reached.

Implementation strategy

People normally compile if expressions by taking the following structure in Lisp:

(if condition
    consequent
    alternate)

and rewriting it to the following pseudo-assembly (where ...compile(X) is replaced with compiled code from the expressions):

  ...compile(condition)
  compare result, #f
  jump-if-equal alternate
  ...compile(consequent)
  jump end
alternate:
  ...compile(alternate)
end:

This will evaluate the condition expression. If it’s falsey, jump to the code for the alternate expression. Otherwise, continue on to the code for the consequent expression. So that the program does not also execute the code for the alternate, jump over it.

This transformation requires a couple of new pieces of infrastructure.

Implementation infrastructure

First, we’ll need two types of jump instructions! We have a conditional jump (jcc in x86-64) and an unconditional jump (jmp in x86-64). These are relatively straightforward to emit.

Somewhat more complicated are the targets of those jump instructions. We’ll need to supply each of the instructions with some sort of destination code address.

When emitting text assembly, this is not so hard: make up names for your labels (as with alternate and end above), and keep the names consistent between the jump instruction and the jump target. Sure, you need to generate unique labels, but the assembler will at least do address resolution for you. This address resolution transparently handles backward jumps (where the label is already known) and forward jumps (where the label comes after the jump instruction).

Since we’re not emitting text assembly, we’ll need to calculate both forward and backward jump offsets by hand. This ends up not being so bad in practice once we come up with an ergonomic way to do it. Let’s take a look at some production-grade assemblers for inspiration.

How Big Kid compilers do this

I read some source code for assemblers like the Dart assembler. Dart is a language runtime developed by Google and part of their infrastructure includes a Just-In-Time compiler, sort of like what we’re making here. Part of their assembler is some slick C++-y RAII infrastructure for emitting code and doing cleanup. Their implementation of compiling if expressions might look something like:

// Made-up APIs to make the Dart code look like our code
int Compile_if(Buffer *buf, ASTNode *cond, ASTNode *consequent,
               ASTNode *alternate) {
   Label alternate_label;
   Label end_label;
   compile(buf, cond);
   buf->cmp(kRax, Object::false());
   buf->jcc(kEqual, &alternate_label);
   compile(buf, consequent);
   buf->jmp(&end_label);
   buf->bind(&alternate_label);
   compile(buf, alternate);
   buf->bind(&end_label);
   return 0;
}

Their Label objects store information about where in the emitted machine code they are bound with bind. If they are bound before they are used by jcc or jmp or something, then the emitter will just emit the destination address. If they are not yet bound, however, then the Label will keep track of where it has to go back and patch the machine code once the label is bound to a location.

When the labels are destructed — meaning they can no longer be referenced by C++ code — their destructors have code to go back and patch all the instructions that referenced the label before it was bound.

While x86-64 has multiple jump widths available (for speed, I guess), it is a little tricky to use them for forward jumps. Because we don’t know in advance how long the intermediate code will be, we’ll just stick to generating 32-bit relative jumps always.

Virtual Machines like ART, OpenJDK Hotspot, SpiderMonkey, V8, HHVM, and Strongtalk also use this approach. So do the VM-agnostic AsmJit and GNU lightning assemblers. If I didn’t link an implementation, it’s either because I found it too complicated to reproduce or couldn’t quickly track it down. Or maybe I don’t know about it!

Basically what I am trying to tell you is that this bind-and-backpatch approach is tried and tested and that we’re going to implement it in C. I hope you enjoyed the whirlwind tour of assemblers in various other runtimes along the way.

Compiling if-expressions, finally

Alright, so we finally get the big idea about how to do this transformation. Let’s put it into practice.

First, as with let, we’re going to need to handle the if case in Compile_call.

int Compile_call(Buffer *buf, ASTNode *callable, ASTNode *args,
                 word stack_index, Env *varenv) {
  if (AST_is_symbol(callable)) {
    // ...
    if (AST_symbol_matches(callable, "if")) {
      return Compile_if(buf, /*condition=*/operand1(args),
                        /*consequent=*/operand2(args),
                        /*alternate=*/operand3(args), stack_index, varenv);
    }
  }
  // ...
}

As usual, we’ll pull apart the expression so Compile_if has less work to do. Since we now have more than two operands (!), I’ve added operand3. It works just like you would think it does.
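
If you are curious, here is a minimal sketch of operand3, assuming the same AST_pair_car and AST_pair_cdr accessors that back operand1 and operand2 in this series; the real helper may differ slightly:

ASTNode *operand3(ASTNode *args) {
  // args is the list (condition consequent alternate); take the third element.
  return AST_pair_car(AST_pair_cdr(AST_pair_cdr(args)));
}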

For Compile_if, we’re going to largely replicate the pseudocode C++ from above. I think you’ll find that if you squint it looks similar enough.

int Compile_if(Buffer *buf, ASTNode *cond, ASTNode *consequent,
               ASTNode *alternate, word stack_index, Env *varenv) {
  _(Compile_expr(buf, cond, stack_index, varenv));
  Emit_cmp_reg_imm32(buf, kRax, Object_false());
  word alternate_pos = Emit_jcc(buf, kEqual, kLabelPlaceholder); // je alternate
  _(Compile_expr(buf, consequent, stack_index, varenv));
  word end_pos = Emit_jmp(buf, kLabelPlaceholder); // jmp end
  Emit_backpatch_imm32(buf, alternate_pos);        // alternate:
  _(Compile_expr(buf, alternate, stack_index, varenv));
  Emit_backpatch_imm32(buf, end_pos); // end:
  return 0;
}

Instead of having a Label struct, though, I opted to just have a function to backpatch forward jumps explicitly. If you prefer to port Label to C, be my guest. I found it very finicky1.

Also, instead of bind, I opted for a more explicit backpatch. This makes it clearer what is happening, I think.

This explicit backpatch approach requires manually tracking the offsets (like alternate_pos and end_pos) inside the jump instructions. We’ll need those offsets to backpatch them later. This means functions like Emit_jcc and Emit_jmp should return the offsets inside buf where they write placeholder offsets.

Let’s take a look inside these helper functions’ internals.

jcc and jmp implementations

The implementations for jcc and jmp are pretty similar, so I will only reproduce jcc here.

word Emit_jcc(Buffer *buf, Condition cond, int32_t offset) {
  Buffer_write8(buf, 0x0f);
  Buffer_write8(buf, 0x80 + cond);
  word pos = Buffer_len(buf);
  Buffer_write32(buf, disp32(offset));
  return pos;
}

This function is like many other Emit functions except for its return value. It returns the start location of the 32-bit offset for use in patching forward jumps. In the case of backward jumps, we can ignore this, since there’s no need to patch it after-the-fact.
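
For completeness, here is a sketch of Emit_jmp under the same conventions; 0xe9 is the opcode for a 32-bit relative unconditional jump, which matches the 0xe9 bytes in the tests below. Treat this as an assumption about the helper rather than the definitive version:

word Emit_jmp(Buffer *buf, int32_t offset) {
  Buffer_write8(buf, 0xe9);            // jmp rel32
  word pos = Buffer_len(buf);          // remember where the 32-bit offset lives
  Buffer_write32(buf, disp32(offset));
  return pos;
}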

Backpatching implementation

Here is the implementation of Emit_backpatch_imm32. I’ll walk through it and explain.

void Emit_backpatch_imm32(Buffer *buf, int32_t target_pos) {
  word current_pos = Buffer_len(buf);
  word relative_pos = current_pos - target_pos - sizeof(int32_t);
  Buffer_at_put32(buf, target_pos, disp32(relative_pos));
}

The input target_pos is the location inside the jmp (or similar) instruction that needs to be patched. Since we need to patch it with a relative offset, we compute the distance between the current position and the target position. We also need to subtract 4 bytes (sizeof(int32_t)) because the jump offset is relative to the end of the jmp instruction (the beginning of the next instruction).

Then, we write that value in. Buffer_at_put32 and disp32 are similar to their 8-bit equivalents.
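
If you want to picture Buffer_at_put32, here is one plausible shape for it, assuming a Buffer_at_put8 helper analogous to Buffer_write8; the real implementation may differ in detail:

void Buffer_at_put32(Buffer *buf, word pos, uint32_t value) {
  for (word i = 0; i < 4; i++) {
    // x86-64 immediates are little-endian: least significant byte first.
    Buffer_at_put8(buf, pos + i, (value >> (i * 8)) & 0xff);
  }
}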

Congratulations! You have implemented if.

A fun diagram

Radare2 has a tool called Cutter for reverse engineering and binary analysis. I decided to use it on the compiled output of a function containing an if expression. It produced this pretty graph!

Fig. 1 - Call graph as produced by Cutter.

It’s prettier in the tool, trust me.

Tests

I added two trivial tests for the condition being true and the condition being false. I also added a nested if case as a smoke test, though I did not expect it to be troublesome given our handy recursive approach.

TEST compile_if_with_true_cond(Buffer *buf) {
  ASTNode *node = Reader_read("(if #t 1 2)");
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  byte expected[] = {
      // mov rax, 0x9f
      0x48, 0xc7, 0xc0, 0x9f, 0x00, 0x00, 0x00,
      // cmp rax, 0x1f
      0x48, 0x3d, 0x1f, 0x00, 0x00, 0x00,
      // je alternate
      0x0f, 0x84, 0x0c, 0x00, 0x00, 0x00,
      // mov rax, compile(1)
      0x48, 0xc7, 0xc0, 0x04, 0x00, 0x00, 0x00,
      // jmp end
      0xe9, 0x07, 0x00, 0x00, 0x00,
      // alternate:
      // mov rax, compile(2)
      0x48, 0xc7, 0xc0, 0x08, 0x00, 0x00, 0x00
      // end:
  };
  EXPECT_FUNCTION_CONTAINS_CODE(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ_FMT(Object_encode_integer(1), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

TEST compile_if_with_false_cond(Buffer *buf) {
  ASTNode *node = Reader_read("(if #f 1 2)");
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  byte expected[] = {
      // mov rax, 0x1f
      0x48, 0xc7, 0xc0, 0x1f, 0x00, 0x00, 0x00,
      // cmp rax, 0x1f
      0x48, 0x3d, 0x1f, 0x00, 0x00, 0x00,
      // je alternate
      0x0f, 0x84, 0x0c, 0x00, 0x00, 0x00,
      // mov rax, compile(1)
      0x48, 0xc7, 0xc0, 0x04, 0x00, 0x00, 0x00,
      // jmp end
      0xe9, 0x07, 0x00, 0x00, 0x00,
      // alternate:
      // mov rax, compile(2)
      0x48, 0xc7, 0xc0, 0x08, 0x00, 0x00, 0x00
      // end:
  };
  EXPECT_FUNCTION_CONTAINS_CODE(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ_FMT(Object_encode_integer(2), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

I made sure to test the generated code because we added some new instructions and also because I had trouble getting the offset computations perfectly right initially.

Anyway, that’s all for today. This post was made possible by contributions2 to my blog from Viewers Like You. Thank you.

Next time on PBS, heap allocation.


  1. Maybe it would be less finicky with __attribute__((cleanup)), but that is non-standard. This StackOverflow question and associated answers have some good information.

  2. By “contributions” I mean thoughtful comments, questions, and appreciation. Feel free to chime in on Twitter, HN, Reddit, lobste.rs, the mailing list… 

October 05, 2020

Ponylang (SeanTAllen)

Last Week in Pony - October 4, 2020 October 05, 2020 12:30 AM

There’s a new meeting URL for the weekly Pony developer sync meeting.

Marc Brooker (mjb)

Consensus is Harder Than It Looks October 05, 2020 12:00 AM

And it looks pretty hard.

In his classic paper How to Build a Highly Available System Using Consensus Butler Lampson laid out a pattern that's become very popular in the design of large-scale highly-available systems. Consensus is used to deal with unusual situations like host failures (Lampson says reserved for emergencies), and leases (time-limited locks) provide efficient normal operation. The paper lays out a roadmap for implementing systems of this kind, leaving just the implementation details to the reader.

The core algorithm behind this paper, Paxos, is famous for its complexity and subtlety. Lampson, like many who came after him1, tried to build a framework of specific implementation details around it to make it more approachable. It's effective, but incomplete. The challenge is that Paxos's subtlety is only one of the hard parts of building a consensus system. There are three categories of challenges that I see people completely overlook.

Determinism

"How can we arrange for each replica to do the same thing? Adopting a scheme first proposed by Lamport, we build each replica as a deterministic state machine; this means that the transition relation is a function from (state, input) to (new state, output). It is customary to call one of these replicas a ‘process’. Several processes that start in the same state and see the same sequence of inputs will do the same thing, that is, end up in the same state and produce the same outputs" - Butler Lampson (from How to Build a Highly Available System Using Consensus).

Conceptually, that's really easy. We start with a couple of replicas with state, feed them input, and they all end up with new state. Same inputs in, same state out. Realistically, it's hard. Here are just some of the challenges:

  • Concurrency. Typical runtimes and operating systems use more than just your program's state to schedule threads, which means that code that uses multiple threads, multiple processes, remote calls, or even just IO, can end up with non-deterministic results. The simple fix is to be resolutely single-threaded, but that has severe performance implications2.
  • Floating Point. Trivial floating-point calculations are deterministic. Complex floating point calculations, especially where different replicas run on different CPUs, have code built with different compilers, may not be3. In Physalia we didn't support floating point, because this was too hard to think about.
  • Bug fixes. Say the code that turns state and input into new state has a bug. How do you fix it? You can't just change it and then roll it out incrementally to different replicas. You don't want to deploy all your replicas at once (we're trying to build an HA system, remember?) So you need to come up with a migration strategy. Maybe a flag sequence number. Or complex migration code that changes buggy new state into good new state. Possible, but hard.
  • Code updates. Are you sure that version N+1 produces exactly the same output as version N for all inputs? You shouldn't be, because even in the well-specified world of cryptography that's not always true.
  • Corruption. In reality, input isn't just input, it's also a constant stream of failing components, thermal noise, cosmic rays, and other similar assaults on the castle of determinism. Can you survive them all?

And more. There's always more.

Some people will tell you that you can solve these problems by using byzantine consensus protocols. Those people are right, of course. They're also the kind of people who solved their rodent problem by keeping a leopard in their house. Other people will tell you that you can solve these problems with blockchain. Those people are best ignored.

Monitoring and Control

Although using a single, centralized, server is the simplest way to implement a service, the resulting service can only be as fault tolerant as the processor executing that server. If this level of fault tolerance is unacceptable, then multiple servers that fail independently must be used. - Fred Schneider (from Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial)

The whole point of building a highly-available distributed system is to exceed the availability of a single system. If you can't do that, you've added a bunch of complexity for nothing.

Complex systems run in degraded mode. - Richard Cook (from How Complex Systems Fail)

Depending on what you mean by failed, distributed systems of f+1, 2f+1 or 3f+1 nodes can entirely hide the failure of f nodes from their clients. This, combined with a process of repairing failed nodes, allows us to build highly-available systems even in the face of significant failure rates. It also leads directly to one of the traps of building a distributed system: clients can't tell the difference between the case where an outage is f failures away, and where it's just one failure away. If a system can tolerate f failures, then f-1 failures may look completely healthy.

Consensus systems cannot be monitored entirely from the outside (see why must systems be operated?). Instead, monitoring needs to be deeply aware of the implementation details of the system, so it can know when nodes are healthy, and can be replaced. If they choose the wrong nodes to replace, disaster will strike.

Control planes provide much of the power of the cloud, but their privileged position also means that they have to act safely, responsibly, and carefully to avoid introducing additional failures. - Brooker, Chen, and Ping (from Millions of Tiny Databases)

Do You Really Need Strong Consistency?

It is possible to provide high availability and partition tolerance, if atomic consistency is not required. - Gilbert and Lynch

The typical state-machine implementation of consensus provides a strong consistency property called linearizability. In exchange, it can't be available for all clients during a network partition. That's probably why you chose it.

Is that why you chose it? Do you need linearizability? Or would something else, like causality, be enough? Using consensus when its properties aren't really needed is a mistake a lot of folks seem to make. Service discovery, configuration distribution, and similar problems can all be handled adequately without strong consistency, and using strongly consistent tools to solve them makes systems less reliable rather than more. Strong consistency is not better consistency.

Conclusion

Despite these challenges, consensus is an important building block in building highly-available systems. Distribution makes building HA systems easier. It's a tool, not a solution.

Think of using consensus in your system like getting a puppy: it may bring you a lot of joy, but with that joy comes challenges, and ongoing responsibilities. There's a lot more to dog ownership than just getting a dog. There's a lot more to high availability than picking up a Raft library off github.

Footnotes

  1. Including Raft, which has become famous for being a more understandable consensus algorithm. Virtual Synchrony is less famous, but no less a contribution.
  2. There are some nice patterns for building deterministic high-performance systems, but the general problem is still an open area of research. For a good primer on determinism and non-determinism in database systems, check out The Case for Determinism in Database Systems by Thomson and Abadi.
  3. Bruce Dawson has an excellent blog post on the various issues and challenges.
  4. Bailis et al's Highly Available Transactions: Virtues and Limitations paper contains a nice breakdown of the options here, and Aphyr's post on Strong Consistency Models is a very approachable breakdown of the topic. If you really want to go deep, check out Dziuma et al's Survey on consistency conditions

October 04, 2020

Derek Jones (derek-jones)

Memory capacity growth: a major contributor to the success of computers October 04, 2020 09:32 PM

The growth in memory capacity is the unsung hero of the computer revolution. Intel’s multi-decade annual billion dollar marketing spend has ensured that cpu clock frequency dominates our attention (a lot of people don’t know that memory is available at different frequencies, and this can have a larger impact on performance than cpu frequency).

In many ways memory capacity is more important than clock frequency: a program won’t run unless enough memory is available but people can wait for a slow cpu.

The growth in memory capacity of customer computers changed the structure of the software business.

When memory capacity was limited by a 16-bit address space (i.e., 64k), commercially saleable applications could be created by one or two very capable developers working flat out for a year. There was no point hiring a large team, because the resulting application would be too large to run on a typical customer computer. Very large applications were written, but these were bespoke systems consisting of many small programs that ran one after the other.

Once the memory capacity of a typical customer computer started to regularly increase it became practical, and eventually necessary, to create and sell applications offering ever more functionality. A successful application written by one developer became rarer and rarer.

Microsoft Windows is the poster child application that grew in complexity as computer memory capacity grew. Microsoft’s MS-DOS had lots of potential competitors because it was small (it was created in an era when 64k was a lot of memory). In the 1990s the increasing memory capacity enabled Microsoft to create a moat around their products, by offering an increasingly wide variety of functionality that required a large team of developers to build and then support.

GCC’s rise to dominance was possible for the same reason as Microsoft Windows. In the late 1980s gcc was just another one-man compiler project, others could not make significant contributions because the resulting compiler would not run on a typical developer computer. Once memory capacity took off, it was possible for gcc to grow from the contributions of many, something that other one-man compilers could not do (without hiring lots of developers).

How fast did the memory capacity of computers owned by potential customers grow?

One source of information is the adverts in Byte (the magazine); lots of pdfs are available, and perhaps one day a student with some time will extract the information.

Wikipedia has plenty of articles detailing cpu performance, e.g., Macintosh models by cpu type (a comparison of Macintosh models does include memory capacity). The impact of Intel’s marketing dollars on the perception of computer systems is a PhD thesis waiting to be written.

The SPEC benchmarks have been around since 1988, recording system memory capacity since 1994, and SPEC make their detailed data public :-) Hardware vendors are more likely to submit SPEC results for their high-end systems, than their run-of-the-mill systems. However, if we are looking at rate of growth, rather than absolute memory capacity, the results may be representative of typical customer systems.

The plot below shows memory capacity against date of reported benchmarking (which I assume is close to the date a system first became available). The lines are fitted using quantile regression, with 95% of systems being above the lower line (i.e., these systems all have more memory than those below this line), and 50% being above the upper line (code+data):

Memory reported in systems running the SPEC benchmark on a given date.

The fitted models show the memory capacity doubling every 845 or 825 days. The blue circles are memory that comes installed with various Macintosh systems, at time of launch (memory doubling time is 730 days).

How did applications’ minimum required memory grow over time? I have patchy data for a smattering of products, extracted from Wikipedia. Some vendors probably required customers to have a fairly beefy machine, while others went for a wider customer base. Data on the memory requirements of the various versions of products launched in the 1990s is very hard to find. Pointers very welcome.

Patrick Louis (venam)

Corruption Is Attractive! October 04, 2020 09:00 PM

Chaos, an important theme in hermetism

We live in a world that is gradually and incessantly attracted by over-rationality and order. In this article we’ll burst the enchanted bubble and embrace corruption and chaos — We’re going to discuss the topic of image glitch art.

w̸h̸a̷t̴’̶s̴ ̶a̴ ̷g̷l̸i̷t̴c̵h̵

Welcome to the land of creative destruction: image glitch art. Our story starts with a simple idea: glitching a wallpaper to create a slideshow of corrupted pictures.
The unfortunate victim of our crime: The world (Right click > View image, while keeping the Control key pressed, to admire it in more detail while it’s still in its pristine form):

World Map, nominal case

Before we begin, let’s attempt to define what we’re trying to do: What is glitch art?
Like any art movement, words can barely express the essence behind the meaning, they are but fleeting and nebulous. Regardless, I’ll be an infidel and valiantly express what I think glitch art is.

A glitch is a perturbation, a minor malfunction, a spurious signal. In computers, glitches are predominantly accidental events that are undesirable and could possibly corrupt data.
Glitch art started as people developed a liking for such unusual events and the effects glitches had on the media they were perturbing. Some started to collect these glitches that happened naturally in the wild, and others started to intentionally appropriate the effects by manually performing them.
In the art scene, some started using image processing to “fake” true glitching effects.

Glitches happen all the time and everywhere, information is never as durable and reliable as we might like it to be, and living in a physical world makes it even less so. You’ve probably encountered or heard of the effect of putting a magnet next to anything electronic that hasn’t been hardened to withstand such a scenario.
That’s why many techniques have been put in place to avoid glitches, at all layers, from the hardware storage, to the software reading and interpreting it. Be it error correcting codes (ECC) or error detection algorithms, they are all enemies of glitch art and the chaos we like.

However, this niche aesthetic is more than a fun pastime for computer aficionados, there is a bigger picture. Similar to painters with brushes on a canvas, we are offered a material, an object to work with — a material made of bits and formatted in a specific way.
Like any object, our medium has a form and meaning, it can move, it has a size, it can be transferred, and interpreted — information theory is the field interested in this.
Like any object, our medium can be subject and react to deformations, forces, and stressors. How it flows is what the field of rheology is interested in (not to be confused with computational rheology, the field of fluid simulation). The medium’s fluidity can be studied to answer questions such as: is it elastic, solid, viscous, or oily, and how it responds, within the bounds of information theory, to different types of applied forces.

Here are some words you may encounter and that you definitely want to know:

  • Misregistration: Whenever a physical medium misreads data because of damage caused by scratches, dirt, smudges, gamma rays, or any other treasures the universe throws at us.

  • Datamoshing, Photomosh, Imagemosh: Abusing the format of a medium, normally compression artefacts, to create glitches. For example, video compression often uses i-frames for full images and p-frames for the movement/transition of pixels on that image. Removing i-frames is a common glitching method.

  • Databending: An idea taken from circuit bending, bending the circuit board of a toy to generate weird sounds. Databending is about bending the medium into another unrelated one, reinterpreting it as something it is not meant to be.

Let me add that glitch art is vast and fascinating, this article is but a glimpse into this space. If you’re captivated as much as I am, please take a look at gli.tc and Rosa Menkman’s Beyond Resolution. Images can be pleasantly destroyed in a great number of ways to create masterpieces.

I̷m̷a̷g̴e̴ ̸G̸l̴i̴t̴c̵h̸ ̴A̶r̵t̵

Before starting, let’s give some advice:

  • Back up your precious files before corrupting them.
  • Any glitching techniques can be combined and/or applied multiple times.
  • Sometimes too little has no effect, and sometimes too much can destroy the file.
  • It’s all about trials and errors, especially errors that result in glitches.

̷H̷o̵w̶ ̵T̸o̴ ̶I̷n̶d̸u̷c̶e̵ ̶A̸ ̶G̸l̵i̷t̶c̸h̴

Now it’s time to think about how we can apply our mischievous little stimuli, its size, the level or layer at which it’ll be applied, and the methodological recipe we’ll concoct to poison our images.

Glitch artist Benjamin Berg classifies glitches into 3 categories:

  • Incorrect Editing: Editing a file using a software that wasn’t made to edit such file. Like editing an image file as if it was a text file.
  • Reinterpretation aka Databending: Convert or read a file as if it was another type of medium. Like listening to an image file as if it was an audio file (aka sonification).
  • Forced errors, Datamoshing, and Misregistration: A software or hardware bug to force specific errors in the file. This can be about the corruption of specific bytes in the file to induce glitches, or something happening accidentally like a device turning off when saving a file.

So let’s get to work!

M̷a̵s̵h̸i̶n̶g̷ ̴T̶h̷e̷ ̷D̶a̸t̵a̸ ̸R̷a̸n̶d̶o̸m̴l̵y̷

The easiest, but roughest, way to glitch a file is to put on our monkey suit and overwrite or add random bytes in our image. As you would have guessed, this isn’t very efficient but half the time it does the trick and forces errors.

This technique is better suited for stronger materials like images in raw format — without metadata and headers. We’ll understand why in a bit.
To convert the file to raw format, open it in GIMP, select Export As, choose the file type by extension, and pick the raw type. For now, it doesn’t matter if you pick pixel-ordered or planar, but we’ll come back to this choice later because it’s an important one.

GIMP process to save image as raw

file world_map.data
# world_map.data: Targa image data - Map (771-3) 771 x 259 x 1 - 1-bit alpha "\003\003\003\003\003\003\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001"

You should also note the width and height of the image, as the file no longer contains this information and we’ll need it to reopen the image in GIMP. In our case it is 2000x1479.
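
If you would rather automate the monkey, here is a rough sketch in C (an illustration of the idea, not the exact tool used for the result below) that overwrites a number of random bytes in place:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// Usage: ./mash world_map.data 200   (corrupt 200 random bytes in place)
int main(int argc, char **argv) {
  if (argc < 3) return 1;
  FILE *f = fopen(argv[1], "r+b");
  if (!f) return 1;
  fseek(f, 0, SEEK_END);
  long size = ftell(f);
  if (size <= 0) { fclose(f); return 1; }
  long count = atol(argv[2]);
  srand((unsigned)time(NULL));
  for (long i = 0; i < count; i++) {
    // Pick a random offset in the file and a random replacement byte.
    long pos = (long)(((double)rand() / ((double)RAND_MAX + 1)) * size);
    unsigned char junk = (unsigned char)(rand() & 0xff);
    fseek(f, pos, SEEK_SET);
    fwrite(&junk, 1, 1, f);
  }
  fclose(f);
  return 0;
}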

We now proceed to hand over the file to our least favorite staff and let them have an anger tantrum at it. So what does it look like? Let’s take a look at the result our monkey produced:

World Map, a monkey has been randomly mashing the world

Not bad at all for something random, but we can do better.

C̸o̶m̸p̷r̶e̷s̸s̵i̸o̶n̴ ̶D̵e̵f̶o̶r̴m̷a̶t̷i̵o̸n̷

Some media are more malleable when squished properly and squished in different ways. The image sheds a lot of information and only the essence stays. That’s a form of databending.
For example, increasing the compression of JPEG images can open the path for glitches to happen more frequently. This is a key asset, especially when trying to create errors related to the compression parameters within the format of the file.

convert -quality 2 world_map.jpg world_map_compressed.jpg

World Map, compressed to extract its essence

Keep this in your toolbox to use along with other techniques.

G̵e̴t̵t̶i̸n̵g̷ ̴I̸n̵t̷i̵m̷a̴t̸e̴ ̸W̶i̵t̸h̶ ̴T̵h̸e̸ ̴F̷o̴r̶m̸a̴t̷

We want to corrupt in the most efficient way possible, to create attractive chaos from the smallest change possible. To do that we have to get intimate with the medium, to understand its deepest secrets, tickle the image in the right places. This is what we previously referred to as imagemoshing.

There’s a panoply of image formats, and they all are special in their own ways. However, there’s still some commonality:

  • Header, Footer, and Metadata: If the format contains these extra information, be it extraneous or essential, what they represent, and how they affect the rest of the image.
  • Compression: The format can either be compressed or not. When it is compressed, there can be extra bits of information to help other software uncompress the image data.
  • How the data is laid out: Usually, the image color information is decomposed into its components such as HSL, RGB, or others. These components then need to be represented in the image data, either in an interleaved or planar manner. Planar refers to writing each component independently in the data (ex: all R, then all G, then all B), while interleaved refers to alternating them pixel by pixel (ex: RGB, then RGB, then RGB…).

Manipulating these to our advantage can lead to wonderful glitches. For example, in our previous raw image example — an image bare of header, footer, and without compression — the pixels were interleaved which gave rise to the effect we’ve seen, namely shifts and changes in some colors. Having them in planar form would’ve led to different glitches in separate color channels.
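
To make the layout difference concrete, here is a tiny sketch in C (hypothetical buffers, not tied to any particular file above) that converts an interleaved RGB buffer into planar form:

#include <stddef.h>

// interleaved: R0 G0 B0 R1 G1 B1 ...   planar: R0 R1 ... G0 G1 ... B0 B1 ...
void interleaved_to_planar(const unsigned char *in, unsigned char *out,
                           size_t npixels) {
  for (size_t i = 0; i < npixels; i++) {
    out[i] = in[3 * i];                    // all the red components first
    out[npixels + i] = in[3 * i + 1];      // then all the green
    out[2 * npixels + i] = in[3 * i + 2];  // then all the blue
  }
}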

R̵e̷i̷n̴t̶e̷r̶p̸r̴e̸t̸a̷t̶i̷o̵n̴ ̵A̸s̵ ̵R̸i̷c̵h̸ ̸T̷e̵x̵t̴ ̴A̴K̷A̸ ̷W̶o̴r̵d̴P̴a̸d̵ ̷E̵f̸f̴e̶c̶t̶

Let’s give this a try with the well-known WordPad effect, which is about databending an image into rich text: opening the image in WordPad and saving it.
Keep in mind that this only works with raw images as it’s highly destructive and otherwise could break fragile key info in the header and footer. So let’s reuse our interleaved raw image of earlier but also get a planar one.

This is our results for interleaved:

World Map, WordPad effect interleaved

And for planar:

World Map, WordPad effect planar

Technically, what happens is that during the bending and interpretation as rich text, some bytes are inserted in some places and others are replaced. Namely, the carriage return (0x0D aka \r) and line feed (0x0A aka \n) pairing needs to be respected, so if one is missing then WordPad adds it. It also replaces other characters, such as 0x07 (the bell character) by 0x20 (a space), but that replacement doesn’t affect the image much.
You can find a code simulating this bending here.
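
For the curious, here is a rough approximation of that bending in C (my own sketch, not the linked code, and it only simulates the CR/LF pairing, not the other character replacements):

#include <stdio.h>

// Read a raw image on stdin, write a "WordPad-bent" copy to stdout:
// every lone 0x0D or 0x0A is normalised into the 0x0D 0x0A pair.
int main(void) {
  int c;
  while ((c = getchar()) != EOF) {
    if (c == 0x0d || c == 0x0a) {
      putchar(0x0d);
      putchar(0x0a);
      if (c == 0x0d) {
        int next = getchar();
        // If the CR was already followed by LF, keep the pair as a pair.
        if (next != 0x0a && next != EOF) ungetc(next, stdin);
      }
    } else {
      putchar(c);
    }
  }
  return 0;
}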

R̸e̵i̶n̴t̵e̷r̴p̶r̷e̵t̶a̷t̴i̶o̴n̴ ̶A̵s̷ ̷A̸u̵d̴i̷o̵ ̴A̴K̶A̴ ̶S̸o̸n̵i̷f̴i̸c̷a̸t̴i̵o̴n̴

Another popular databending is sonification, which is about converting non-audio information into auditory information — Something extremely useful for visually impaired people.
In our case, we’ll use an image as audio content, edit it as if it was sound, and visualize it again as an image. Like most databending, this is almost impossible to do with any format other than a raw uncompressed one, so I hope you haven’t thrown away the two originals from the previous section.

We’ll opt for Audacity as our audio editor. Launch it, select: File > Import > Raw Data. Then pick either your planar or interleaved image data and you’ll be presented with this screen:

Audacity raw import options

Don’t freak out! What you pick doesn’t really matter, but as far as I’m concerned, better results are found with encodings such as U-Law or A-Law, big-endian byte orientation, and a mono channel.
When you are done with the editing, go to the File menu > Export > Export Audio. Then pick “Other uncompressed files” on the bottom right, and select the RAW header-less format along with the encoding you previously chose when importing the file (NB: adding the extension .data makes it easier to open in GIMP later). Don’t fill anything on the next screen asking for metadata.

You can now have fun applying different types of audio filters on sections of your image. Here are some examples of my favorite songs.

Interleaved image, reverb filter applied:

World Map, Sonification reverb effect interleaved

And the same reverb filter applied to the planar version:

World Map, Sonification reverb effect planar

Interleaved image, echo filter applied:

World Map, Sonification echo effect planar

And the same echo filter applied to the planar version:

World Map, Sonification echo effect planar

Reverse some sections with others in a planar image:

World Map, Sonification reverse effect planar

Tremolo effect on an interleaved image:

World Map, Sonification tremolo effect interleaved

Wahwah effect on a planar image:

World Map, Sonification wahwah effect planar

Overall, most audio effects work pretty well in both interleaved and planar formats. One that actually works on compressed media is cut-paste-and-reverse; as it is not as destructive as other techniques, we’ll give it a shot in the next section.

C̵a̵s̸e̸ ̷S̸t̴u̵d̷y̸:̷ ̵J̵P̸E̶G̷

Let’s get acquainted with one of the most popular compressed image formats: JPEG.
We’ll peel its layers, spread them apart, get more intimate, and understand its deepest feelings.

Like most binary formats, JPEG is composed of TLV segments, Type-Length-Value, which as the name implies have well-defined tags for type followed by the length of the value that will be associated with the tag (plus the length of the length itself, 2 in the case of JPEG).
All tags in the JPEG format standard (you’ll find multiple links in the further reading section) are 2 bytes long and always start with 0xFF. JPEG has a well-defined header that starts with the appropriately named “Start of Image” or SoI tag (0xFFD8) and ends with the also appropriately named “End of Image” or EoI tag (0xFFD9).

Anything that starts with 0xFF is considered a tag if it isn’t within the value part of another tag.
There’s one exception to this: when it comes to the actual content of the image, the Entropy Coded Segment (ECS), as it doesn’t have a fixed size but reads until it finds another tag. That is why if 0xFF needs to be included in it, it is stuffed with 0x00 afterward to know it’s really meant to be 0xFF.
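
As a quick illustration of walking those tags, here is a minimal marker scanner in C. It is only a sketch under the assumptions above: it stops at the first Start of Scan and ignores the 0x00 stuffing and restart markers inside the ECS.

#include <stdio.h>

// Print the JPEG segment tags and lengths up to the first Start of Scan.
int main(int argc, char **argv) {
  if (argc < 2) return 1;
  FILE *f = fopen(argv[1], "rb");
  if (!f) return 1;
  if (fgetc(f) != 0xff || fgetc(f) != 0xd8) return 1;  // expect SoI first
  for (;;) {
    int marker;
    while ((marker = fgetc(f)) == 0xff) {}             // skip 0xFF fill bytes
    if (marker == EOF) break;
    int hi = fgetc(f), lo = fgetc(f);
    if (hi == EOF || lo == EOF) break;
    int len = (hi << 8) | lo;                          // big-endian, includes itself
    printf("tag 0xff%02x, length %d\n", marker, len);
    if (marker == 0xda) break;                         // SoS: the ECS follows, stop here
    fseek(f, len - 2, SEEK_CUR);                       // skip the value part
  }
  fclose(f);
  return 0;
}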

Concretely, a JPEG is formed of a header which dictates how to decompress the data of the image. This data is then found in a series of loops/scans that each encode a different type of information about the image, be it lighting, tints of red, or others. This loop goes as follows:

JPEG scan loop

The name Entropy Coded Segment, ECS, comes from the fact that it is encoded, usually compressed using a huffman table. As JPEG can contain different tables to compress different color components, the table itself has to be pointed to by the information that is read before the ECS comes along, namely the Start of Scan or SoS section (in the diagram above “Scan”).
So each iteration of the loop tells us how to decode the information in the ECS.

That’s the gist of it, we don’t really care about understanding precisely the full scope of the JPEG specifications, but just enough to be dangerous with it.
Let’s write a script to split up the components of the JPEG so that we can manipulate them independently, to then recompose it and admire the result. You can find such script here.

These are the parts I get after running the script on the world — we divide and conquer.

tree -L 1
.
├── 01_header.jpg
├── 02_scan.jpg
├── 03_data.jpg
├── 04_scan.jpg
├── 05_data.jpg
├── 06_scan.jpg
├── 07_data.jpg
├── 08_scan.jpg
├── 09_data.jpg
└── 10_end.jpg

0 directories, 10 files

We can definitely hire another monkey to mash the data part of the image and make a mess of its internal structure. However, the JPEG is a bit more sensitive to changes. Still, you can get pretty good results; this is what happens when we have a go at 09_data.jpg and 03_data.jpg.
We can get back our image by doing:

cat [0-9]*.jpg > reconstituted.jpg

World Map, mash jpeg

Even though the changes were minimal, the effect is radical.

Let’s see if we can employ the cut-paste-reverse with the ECS data, just like we’ve done with the raw image for sonification. Let’s get a clean version and open it in audacity.

World Map, cut-paste-reverse jpeg

Cutting and pasting definitely works, but reversing most of the time doesn’t, and the data is extremely sensitive.

Each section affects a different feature of the image. The glitches above are caused by editing 03_data.jpg and 09_data.jpg.
If we want to know what each scan adds to the image we can shortcut them by removing other scans or inserting the End of Image early on. This is what each of the JPEG layers does:

World Map, layer 1 jpeg World Map, layer 2 jpeg World Map, layer 3 jpeg World Map, layer 4 jpeg

It’s interesting to notice that the first layer is the most colorful but the smallest, while the last layer is the biggest but has more minutiae and fewer colors.

Now we’re ready to dive deeper and to mess with the header parts of the image, fire up your hex editor and get ready. I won’t bother you with the details, but it all comes down to editing specific bits in the headers. Let’s explain by example.

The most-significant bits in the quantization tables and huffman tables have more weight than the least-significant ones. That comes from the way they are laid out: the quantization table is in zig-zag order while the huffman table is coded as a tree, and both are ordered by importance. Also note that each huffman and quantization table is used for different parts of the image, the ones we’ve shown above.

In the header there are 3 quantization tables (0xFFDB); let’s edit one of them and see what it does. In our image they all seem to be for luminance. Editing a single byte in the third quantization table results in this — a true glitch that would be caused by a single bit flip:

World Map, jpeg header dct single bit flip corruption

What about manipulating one of the 3 huffman tables (0xFFC4)? It’s a bit more tricky but feasible: we just have to edit the symbols in it, 17 bytes after the length tag. Let’s swap two of these symbols at the beginning.

World Map, jpeg header huffman swap bytes corruption

Pretty impressive. What about symbols in the middle of the table?

World Map, jpeg header huffman swap bytes middle corruption

The effect is less pernicious, barely noticeable, because the least significant changes are at the end of the table and represented by more bits.

Instead of manipulating the huffman table in the header we can manipulate which one is pointed to by each part of the image scan, in the Start of Scan aka SoS section 0xFFDA.
This section encodes a lot of interesting things: not only which huffman table should be used, but also what it should be used for, the coefficients of the values in the DCT, the first and last coefficient to use, and much more.
Because the first part is more colorful, let’s play with its SoS, open 02_scan.jpg in your favorite hex editor.

Hex editor

We can see there are 3 components; that means the bytes in the ECS will encode 3 different color components and build each using two different tables for DC and AC, which are two different kinds of coefficients in the DCT table.
Anyway, let’s change the table number, and obviously, that glitches the file.

World Map, jpeg header SoS huffman table corruption

We can also change the way these factors are applied by modifying the last byte of the SoS, the successive approximation; the result is dazzling.

World Map, jpeg header SoS successive factor corruption

We can also edit the spectral selection, which is the first and last DCT coefficient used in the zig-zag order. The effect depends on the segment it is applied to, the following was done on the second and third data segments.

World Map, jpeg header SoS spectral selection corruption

This is it for our destructive love of the JPEG format, now let’s move to other things.

I̴m̷a̷g̴e̵ ̶P̸r̷o̷c̸e̷s̵s̸i̴n̵g̵

If you dare to mention image processing in the glitch art community, you’ll bring hell upon yourself. This will be called art that looks glitchy, art glitch and not glitch art.
Technically, there isn’t much difference between databending the pixel components in a raw image form and manipulating the pixel in memory via an image processing algorithm. However, I can agree that the soul and essence of the art will be missing. Meanwhile, the imagemoshing we’ve done with JPEG was closer to what glitch art is about.
You’re going to ask “who cares, and who’ll know the difference”, well I will, and I’ll be watching you along with my evil datamashing monkeys.

Image processing that looks like glitching works by using an image manipulation library such as Python pillow and manipulating the color components of the image to make them look glitchy. If that sounds similar to raw images, it’s because it is.

Let’s mention some techniques I find fun. Let’s start with Pixel Sorting.
Pixel sorting is about selecting pixels that pass a certain criterion, gathering them in an array, and ordering them according to that same criterion. You can apply it horizontally, vertically, or in blocks. You can apply it according to lightness, darkness, hue, or anything that suits you.
Here, you’ll find a script that does just that, and the result looks like this:

World Map, image processing sorting, not corruption
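
Below is a bare-bones horizontal variant in C, assuming an interleaved RGB buffer; the script linked above is far more flexible, while this one just sorts whole rows by brightness:

#include <stdlib.h>

// Compare two interleaved RGB pixels (3 bytes each) by rough brightness.
static int by_brightness(const void *a, const void *b) {
  const unsigned char *p = a, *q = b;
  return (p[0] + p[1] + p[2]) - (q[0] + q[1] + q[2]);
}

// Sort every row of an interleaved RGB image by brightness, in place.
void pixel_sort_rows(unsigned char *pixels, size_t width, size_t height) {
  for (size_t y = 0; y < height; y++) {
    qsort(pixels + y * width * 3, width, 3, by_brightness);
  }
}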

Another technique is called Channel Shifting. It is about storing each color component in a different array and shifting them on the X or Y axis. Doesn’t that remind you of planar raw images? Certainly, it’s because it’s the same technique!
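
In the same spirit, here is a small sketch that shifts the red channel of an interleaved RGB buffer horizontally (again a made-up illustration, not a particular tool):

#include <stdlib.h>

// Shift the red component of every row to the right by `shift` pixels,
// wrapping around at the edge; green and blue stay where they are.
void shift_red_channel(unsigned char *pixels, size_t width, size_t height,
                       size_t shift) {
  unsigned char *reds = malloc(width);
  if (reds == NULL) return;
  for (size_t y = 0; y < height; y++) {
    unsigned char *row = pixels + y * width * 3;
    for (size_t x = 0; x < width; x++) reds[x] = row[x * 3];  // copy reds out
    for (size_t x = 0; x < width; x++)
      row[((x + shift) % width) * 3] = reds[x];               // write them back, shifted
  }
  free(reds);
}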

There are a lot of image processing effects that can be done, as shown by this website; however, the result isn’t always as appealing as a real glitch.
Yes, we’re now true glitch amateurs!

We can even go as far as mixing these by doing an overlay/composite of them using imagemagick.

A̸n̵i̷m̶a̷t̴e̵d̶ ̵W̸a̷l̴l̷p̵a̴p̵e̶r̸

Back to our initial goal: Create a wallpaper carousel of madness.

I’ve tested different approaches, from gifview to mplayer, and the best way I’ve found is the simplest way: A basic script that loops through the images, sleeps, and sets them as wallpaper.
The script can be found here.

C̸o̶n̶c̶l̵u̴s̷i̸o̶n̷

Chaos, an important theme in hermetism

Congratulations, you know enough corruption to start a career in politics! You’ll thank me later when you’re rich.
Now go have fun with what you’ve learned, and enjoy your day.





Further Reading


Attributions:

  • S. Michelspacher, Cabala, Augsburg, 1616
  • Homer B. Sprague, Milton's Cosmography, Boston, 1889

Sevan Janiyan (sevan)

Trying to operate macOS in single user mode October 04, 2020 07:52 PM

Wednesday lunch time, I opened up my laptop and in the middle of writing an email my machine froze and after a few seconds rebooted. Uh oh, the system sat at the grey screen for a few seconds and then the dreaded folder with a question mark began flashing which means there was no bootable …

Gustaf Erikson (gerikson)

Two more novels by Paul McAuley October 04, 2020 05:20 PM

(Previously.)

  • War of the Maps
  • Austral

McAuley has a wide range. These books were read in reverse publication order.

War of the Maps is a far-future SF story. After our sun has become a white dwarf, post-modern humans construct a Dyson sphere around it and seed it with humans and Earth life. According to the internal legends, they play around a bit then buzz off, leaving the rest of the environment to bumble along as best they can.

The tech level is more or less Victorian but people contend with unique challenges, such as a severe lack of metallic iron and malevolent AIs buried here and there.

Austral is a near-future crime story. A genetically modified young woman gets dragged into a kidnapping plot in a post-AGW Antarctica.

Both are well worth reading!

Andreas Zwinkau (qznc)

Notes with TiddlyWiki October 04, 2020 12:00 AM

Describing my note taking system inspired by Zettelkasten

Read full article!

October 01, 2020

Eric Faehnrich (faehnrich)

Plus-Minus Operator in C October 01, 2020 07:16 PM

The C language has some features that try to achieve orthogonality. For instance, there's both an increment operator ++ and decrement operator --.

I don't think many C programmers realize that this is fully orthogonal, with the plus-minus operator +- and minus-plus operator -+, which are combinations of the increment and decrement operators, and result in an unchanged value.

#include <stdio.h>

int main() {
   int i = 3;
   printf("i: %d\n", i);  // 3
   ++i;
   printf("++i: %d\n", i);// 4
   --i;
   printf("--i: %d\n", i);// 3
   +-i;
   printf("+-i: %d\n", i);// 3
   -+i;
   printf("-+i: %d\n", i);// 3
}

Of course I'm kidding, but not exactly how you might think.

I'm not kidding about this code working as I claim: it does compile, and +- and -+ really do leave i unchanged.

I'm kidding about +- and -+ being whole operators on their own. This is similar to the "goes to" operator -->. They're not operators on their own, but a combination of multiple operators, in this case the negation operator - and the unary plus operator +.

The unary plus operator may be the part that makes this trick work, because I don't think it's that well known. How it works: neither negation nor unary plus modifies its operand, so in the above code the operator is applied and then we just drop the resulting value. You'll see an unused-value warning if you have warnings turned on.

If you thought the reason I'm kidding is because they wouldn't have added operators that do nothing just for the sake of symmetry, well that's where you're wrong. Because the unary plus operator is just that. It's like negation, but not negating, so it gives you the same value. (Ok, it doesn't do "nothing", but really it is just the same value.)

And it really was added just so we have something matching the negation operator. From K&R second edition:

The unary + is new with the ANSI standard. It was added for symmetry with the unary -.

Maxwell Bernstein (tekknolagi)

Compiling a Lisp: Let October 01, 2020 04:00 PM


Welcome back to the “Compiling a Lisp” series. Last time we added a reader (also known as a parser) to our compiler. This time we’re going to compile a new form: let expressions.

Let expressions are a way to bind variables to values in a particular scope. For example:

(let ((a 1) (b 2))
  (+ a b))

Binds a to 1 and b to 2, but only for the body of the let — the rest of the S-expression — and then executes the body.

This is similar to this very rough translated code in C:

int result;
{
  int a = 1;
  int b = 2;
  result = a + b;
}

It’s also different because let-expressions do not make previous binding names available to expressions being bound. For example, the following program should fail because it cannot find the name a:

(let ((a 1) (b a))
  b)

There is a form that makes bindings available serially, but that is called let* and we are not implementing that today.

For completeness’ sake, there is also letrec, which makes names available to all bindings, including within the same binding. This is useful for binding recursive or mutually recursive functions. Again, we are not implementing that today.

Name binding implementation strategy

You’ll notice two new things about let expressions:

  1. They introduce ways to bind names to values, something we have to figure out how to keep track of
  2. In order to use those names we have to figure out how to look up what the name means

In more technical terms, we have to add environments to our compiler. We can then use those environments to map names to stack locations.

“Environment” is just a fancy word for “look-up table”. In order to implement this table, we’re going to make an association list.

An association list is a list of (key value) pairs. Adding a pair means tacking it on at the end (or beginning) of the list. Searching through the table involves a linear scan, checking if keys match.

You may be wondering why we’re using this data structure to implement environments. Didn’t I even take a data structures course in college? Shouldn’t I know that linear equals slow and that I should obviously use a hash table?

Well, hash tables have costs too. They are hard to implement right; they have high overhead despite being technically constant time; they incur higher space cost per entry.

For a compiler as small as this, a tuned hash table implementation could easily be as many lines of code as the rest of the compiler. Since we’re also compiling small programs, we’ll worry about time complexity later. It is only an implementation detail.

In order to do this, we’ll first draw up an association list. We’ll use a linked list, just like cons cells:

// Env

typedef struct Env {
  const char *name;
  word value;
  struct Env *prev;
} Env;

I’ve done the usual thing and overloaded Env to mean both “a node in the environment” and “a whole environment”. While one little Env struct only holds one name and one value, it also points to the rest of them, eventually ending with NULL.

This Env will map names (symbols) to stack offsets. This is because we’re going to continue our strategy of not doing register allocation.

To manipulate this data structure, we will also have two functions [1]:

Env Env_bind(const char *name, word value, Env *prev);
bool Env_find(Env *env, const char *key, word *result);

Env_bind creates a new node from the given name and value, borrowing a reference to the name, and prepends it to prev. Instead of returning an Env*, it returns a whole struct. We’ll learn more about why later, but the “TL;DR” is that I think it requires less manual cleanup.

Env_find takes an Env* and searches through the linked list for a name matching the given key. If it finds a match, it returns true and stores the value in *result. Otherwise, it returns false.

We can stop at the first match because Lisp allows name shadowing. Shadowing occurs when a binding at an inner scope has the same name as a binding at an outer scope. The inner binding takes precedence:

(let ((a 1))
  (let ((a 2))
    a))
; => 2
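
The same lookup expressed with the C API above (a small sketch; the values just mirror the Lisp example, though in the compiler they will be stack offsets):

Env outer = Env_bind("a", 1, NULL);
Env inner = Env_bind("a", 2, &outer);
word result;
bool found = Env_find(&inner, "a", &result);
// found == true and result == 2: the inner binding is found first
// because it was prepended, and the search stops at the first match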

Let’s learn about how these functions are implemented.

Name binding implementation

Env_bind is a little silly looking, but it’s equivalent to prepending a node onto a chain of linked-list nodes. It returns a struct Env containing the parameters passed to the function. I opted not to return a heap pointer (allocated with malloc, etc) so that this can be easily stored in a stack-allocated variable.

Env Env_bind(const char *name, word value, Env *prev) {
  return (Env){.name = name, .value = value, .prev = prev};
}

Note that we’re prepending, not appending, so that names we add deeper in a let chain shadow names from outside.

Env_find does a recursive linear search through the linked list nodes. It may look familiar to you if you’ve already written such a function in your life.

bool Env_find(Env *env, const char *key, word *result) {
  if (env == NULL)
    return false;
  if (strcmp(env->name, key) == 0) {
    *result = env->value;
    return true;
  }
  return Env_find(env->prev, key, result);
}

We search for the node with the string key and return the stack offset associated with it.

Alright, now we’ve got names and data structures. Let’s implement some name resolution and name binding.

Compiling name resolution

Up until now, Compile_expr could only compile integers, characters, booleans, nil, and some primitive call expressions (via Compile_call). Now we’re going to add a new case: symbols.

When a symbol is compiled, the compiler will look up its stack offset in the current environment and emit a load. The emitter for this, Emit_load_reg_indirect, is very similar to the Emit_add_reg_indirect we implemented for primitive binary functions.

int Compile_expr(Buffer *buf, ASTNode *node, word stack_index,
                 Env *varenv) {
  // ...
  if (AST_is_symbol(node)) {
    const char *symbol = AST_symbol_cstr(node);
    word value;
    if (Env_find(varenv, symbol, &value)) {
      Emit_load_reg_indirect(buf, /*dst=*/kRax, /*src=*/Ind(kRbp, value));
      return 0;
    }
    return -1;
  }
  assert(0 && "unexpected node type");
}

If the variable is not in the environment, this is a compiler error and we return -1 to signal that. This is not a tremendously helpful signal. Maybe soon we will add more helpful error messages.

Ah, yes, varenv. You will, like I had to, go and add an Env* parameter to all relevant Compile_XYZ functions and then plumb it through the recursive calls. Have fun!
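
For instance, the top-level entry point ends up threading an empty environment through. Here is a rough sketch (not the exact code; I'm assuming NULL stands for the empty environment, which is what Env_find's base case expects):

int Compile_function(Buffer *buf, ASTNode *node) {
  // ... emit any prologue here ...
  _(Compile_expr(buf, node, /*stack_index=*/-kWordSize, /*varenv=*/NULL));
  // ... emit any epilogue here ...
  return 0;
}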

Compiling let, finally

Now that we can resolve the names, let’s go ahead and compile the expressions that bind them.

We’ll have to add a case in Compile_expr. We could add it in the body of Compile_expr itself, but there is some helpful setup in Compile_call already. It’s a bit of a misnomer, since it’s not a call, but oh well.

int Compile_call(Buffer *buf, ASTNode *callable, ASTNode *args,
                 word stack_index, Env *varenv) {
  if (AST_is_symbol(callable)) {
    // ...
    if (AST_symbol_matches(callable, "let")) {
      return Compile_let(buf, /*bindings=*/operand1(args),
                         /*body=*/operand2(args), stack_index,
                         /*binding_env=*/varenv,
                         /*body_env=*/varenv);
    }
  }
  assert(0 && "unexpected call type");
}

We have two cases to handle: no bindings and some bindings. We’ll tackle these recursively, with no bindings being the base case. For that reason, I added a helper function, Compile_let. [2]

As with all of the other compiler functions, we pass it a machine code buffer, a stack index, and an environment. Unlike the other functions, we pass it two expressions and two environments.

I split up the bindings and the body so we can more easily recurse on the bindings as we go through them. When we get to the end (the base case), the bindings will be nil and we can just compile the body.

We have two environments for the reason I mentioned above: when we’re evaluating the expressions that we’re binding the names to, we can’t add bindings iteratively. We have to evaluate them in the parent environment. It’ll become clearer in a moment how that works.

We’ll tackle the simple case first — no bindings:

int Compile_let(Buffer *buf, ASTNode *bindings, ASTNode *body,
                word stack_index, Env *binding_env, Env *body_env) {
  if (AST_is_nil(bindings)) {
    // Base case: no bindings. Compile the body
    _(Compile_expr(buf, body, stack_index, body_env));
    return 0;
  }
  // ...
}

In that case, we compile the body using the body_env as the environment. This is the environment that we will have added all of the bindings to.

In the case where we do have bindings, we can take the first one off and pull it apart:

  // ...
  assert(AST_is_pair(bindings));
  // Get the next binding
  ASTNode *binding = AST_pair_car(bindings);
  ASTNode *name = AST_pair_car(binding);
  assert(AST_is_symbol(name));
  ASTNode *binding_expr = AST_pair_car(AST_pair_cdr(binding));
  // ...

Once we have the binding_expr, we should compile it. The result will end up in rax, per our internal compiler convention. We’ll then store it in the next available stack location:

  // ...
  // Compile the binding expression
  _(Compile_expr(buf, binding_expr, stack_index, binding_env));
  Emit_store_reg_indirect(buf, /*dst=*/Ind(kRbp, stack_index),
                          /*src=*/kRax);
  // ...

We’re compiling this binding expression in binding_env, the parent environment, because we don’t want the previous bindings to be visible.

Once we’ve generated code to store it on the stack, we should register that stack location with the binding name in the environment:

  // ...
  // Bind the name
  Env entry = Env_bind(AST_symbol_cstr(name), stack_index, body_env);
  // ...

Note that we’re binding it in the body_env because we want this to be available to the body, but not the other bindings.

Also note that since this new binding is created in a way that does not modify body_env (entry only points to body_env), it will automatically be cleaned up at the end of this invocation of Compile_let. This is a little subtle in C but it’s clearer in more functional languages.
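
If that seems too subtle, here is the same pattern in isolation (an illustration I made up, not code from the compiler):

void example(Env *parent, int depth) {
  if (depth == 0) return;
  // entry lives in this function's stack frame: no malloc, no free
  Env entry = Env_bind("x", depth, parent);
  example(&entry, depth - 1);  // deeper calls see it through their prev pointer
  // when example returns, entry simply goes out of scope
}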

At this point we’ve done all the work required for one binding. All that’s left to do is emit a recursive call to handle the rest – the cdr of bindings. We’ll decrement the stack_index since we just used the current stack_index.

  // ...
  _(Compile_let(buf, AST_pair_cdr(bindings), body, stack_index - kWordSize,
                /*binding_env=*/binding_env, /*body_env=*/&entry));
  return 0;

That’s it. That’s let, compiled, in five steps:

  1. If in the base case, compile the body
  2. Pick apart the binding
  3. Compile the first binding expression
  4. Store it in the environment
  5. Recurse

Well done!
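
Putting those five steps together, the whole function looks roughly like this (assembled from the fragments above, with error propagation via the _ macro as in previous posts):

int Compile_let(Buffer *buf, ASTNode *bindings, ASTNode *body,
                word stack_index, Env *binding_env, Env *body_env) {
  if (AST_is_nil(bindings)) {
    // Base case: no bindings. Compile the body
    _(Compile_expr(buf, body, stack_index, body_env));
    return 0;
  }
  assert(AST_is_pair(bindings));
  // Get the next binding and pull it apart
  ASTNode *binding = AST_pair_car(bindings);
  ASTNode *name = AST_pair_car(binding);
  assert(AST_is_symbol(name));
  ASTNode *binding_expr = AST_pair_car(AST_pair_cdr(binding));
  // Compile the binding expression in the parent environment and
  // store the result in the next available stack slot
  _(Compile_expr(buf, binding_expr, stack_index, binding_env));
  Emit_store_reg_indirect(buf, /*dst=*/Ind(kRbp, stack_index), /*src=*/kRax);
  // Bind the name for the body (but not for the remaining binding expressions)
  Env entry = Env_bind(AST_symbol_cstr(name), stack_index, body_env);
  // Recurse on the rest of the bindings, one stack slot further down
  _(Compile_let(buf, AST_pair_cdr(bindings), body, stack_index - kWordSize,
                /*binding_env=*/binding_env, /*body_env=*/&entry));
  return 0;
}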

Internal state and debugging

It’s hard to write the above code without really proving to yourself that it does something reasonable. For that, we can add some debug print statements to our compiler that print out at what stack offsets it is storing variables.
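
One line right after the Env_bind call in Compile_let is enough. The exact format string below is my own, chosen to match the session that follows:

fprintf(stderr, "binding '%s' at [rbp%ld]\n", AST_symbol_cstr(name), (long)stack_index);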

sequoia% ./bin/compiling-let --repl-eval
lisp> (let () (+ 1 2))
3
lisp> (let ((a 1)) (+ a 2))
binding 'a' at [rbp-8]
3
lisp> (let ((a 1) (b 2)) (+ a b))
binding 'a' at [rbp-8]
binding 'b' at [rbp-16]
3
lisp> (let ((a 1) (b 2)) (let ((c 3)) (+ a (+ b c))))
binding 'a' at [rbp-8]
binding 'b' at [rbp-16]
binding 'c' at [rbp-24]
6
lisp>

This shows us that everything looks like it is working as intended! Variables all get sequential locations on the stack.

Compiling let* and modifications

A thought exercise for the reader: what would it mean to compile let*? What modifications would you make to the Compile_let function? Take a look at the footnote [3] if you want to double-check your answer. I’m not going to implement it in my compiler, though. Too lazy.

Testing

As usual, we have a testing section. There are a couple of checks that a reasonable compiler would do to reject bad programs which we’ve left on the table, so we won’t test for:

  • let expressions that bind a name twice
  • poorly formed binding lists
  • poorly formed let bodies

I suppose we expect programmers to write well-formed programs. You’re more than welcome to add informative error messages and helpful return values, though.

Here are some tests that I added for let. One for the base case:

TEST compile_let_with_no_bindings(Buffer *buf) {
  ASTNode *node = Reader_read("(let () (+ 1 2))");
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ_FMT(Object_encode_integer(3), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

One for let with one binding:

TEST compile_let_with_one_binding(Buffer *buf) {
  ASTNode *node = Reader_read("(let ((a 1)) (+ a 2))");
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ_FMT(Object_encode_integer(3), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

and for multiple bindings:

TEST compile_let_with_multiple_bindings(Buffer *buf) {
  ASTNode *node = Reader_read("(let ((a 1) (b 2)) (+ a b))");
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ_FMT(Object_encode_integer(3), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

Last, and most interestingly, we have a test that let is not actually let* in disguise. We check this by compiling a let expression with bindings that expect to be able to refer to one another. I wrote this test after realizing that I had accidentally written let* in the first place:

TEST compile_let_is_not_let_star(Buffer *buf) {
  ASTNode *node = Reader_read("(let ((a 1) (b a)) a)");
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, -1);
  AST_heap_free(node);
  PASS();
}

Next time

That’s a wrap, folks. Time to let go. Har har har. Next time we’ll add if-expressions, so our programs can make decisions! Have a great day. Don’t forget to tell your friends you love them.




  1. While I am very pleased with the bind/find symmetry, I am less pleased with the Env/bool asymmetry. Maybe I should have gone for Node

  2. If you’re a seasoned Lisper, you may be wondering why I don’t rewrite let to lambda and use my implementation of closures to solve this problem. Well, right now we don’t have support for closures because I’m following the Ghuloum tutorial and that requires a lot of to-be-implemented machinery.

    Even if we did have that machinery and rewrote let to lambda, the compiler would generate unnecessarily slow code. Without an optimizer to transform the lambdas back into lets, the naïve implementation would output call instructions. And if we had the optimizer, well, we’d be back where we started with our let implementation. 

  3. To compile let*, you could do one of two things: you could remove the second environment parameter and compile the bindings in the same environment as you compile the body. Alternatively, you could get fancy and make an AST rewriter that rewrites (let* ((a 1) (b a)) xyz) to (let ((a 1)) (let ((b a)) xyz)). The nested let will have the same effect. 


Noon van der Silk (silky)

Privileges October 01, 2020 12:00 AM

Posted on October 1, 2020 by Noon van der Silk

Inspired by this recent paper - Towards decolonising computational sciences - I thought it might be a nice idea to write down a list of privileges that I can identify I have. I’ll probably update this list as I think of them.

My list:

  • White, Male, Cisgendered: I don’t often feel out of place.
  • Healthy: Nothing has prevented me from finding work.
  • Able-bodied: Getting around is easy.
  • Somewhat extroverted: Able to engage and network somewhat easily when I want to.
  • No dependants: Freedom to move around and spend time as I wish.
  • Financial stability from a young age: Able to take (moderate) risks.
  • Supportive parent: I know I’ll have a place to go.
  • Supportive partner: Financial/emotional support to experiment.
  • Live close to the city: Access to events, convenience.
  • Good mental health: I’m able to focus when I want; am most of the time moderately calm.
  • Language: I speak English natively.
  • Exposed to tech from a young age: I had a computer in the house since I was young, and was able to learn programming on my own as a result.
  • Well-educated: I went to a private school, and was eventually able to attend good universities and get access to experts and knowledge in this way.

These things have helped me get jobs in the past. Some of them are traits that you can improve, as well as baseline things that are out of my control to some degree (e.g. mental health; certainly it can be improved, but it’s not all my doing).

I also found this: Privilege Checklist from this Social Justice Training program, which is a very good starting point for building a comprehensive list. This feels like a great exercise for teams to get together and think about.

What’s your list?

September 30, 2020

Jan van den Berg (j11g)

Working 101 September 30, 2020 10:05 AM

Do you struggle to organise your work, because it seems everybody wants something from you? Or do you often wonder whether you’re doing the right things? This post helps you answer those questions.

Here are the six basic responsibilities you have as a professional in the modern workplace. Follow these and you are on the right path.

I wrote these down as a reminder to myself and to pass on to people, because it is easy to lose sight of your basic responsibilities. I also noticed that a lot of young professionals struggle with what is asked of them.

Regardless of your specific job — whether you are a manager or an engineer, just starting out or a seasoned professional — the following basics will always apply:

  1. Know thy time
  2. Add value
  3. Use your leverage
  4. Manage expectations
  5. Track your tasks
  6. Prepare for a different job

I distilled these from my experiences as an engineer, Engineering Manager and CTO for a tech company. And all of them were shaped or sharpened by reading and applying what I read.

The basics are presented as instructions. The key action per item is in bold, and at the bottom of each item the book references are in italics.

Know thy time

Time is totally perishable and cannot be stored. Yesterday’s time is gone forever and will never come back.

Peter F. Drucker

This is the most important thing you can learn about the most valuable asset you have: your time. Every second is unique and you can only spend it once.

Know where your time goes and demand that your time is used wisely.

Know

Measure your time. There are many ways. Here is the most basic one: during the day, write down what you did, and then every morning — and this is key — take a moment to reflect on the previous day.

  • What worked, what didn’t, what are you happy about, what not?
  • What would you have liked to do differently?

Do this every day and you will see patterns emerge, and you will learn more about yourself and your talents and your future (more on these two things later).

If you have never tracked your time: this is the most basic thing you can do. As you become more skilled in this, weekly, monthly or yearly reflections provide even more insight. As will discussing and reflecting on your accomplishments with an accountability partner or coach.

Demand

You value your time, and will do so even more when you start writing down your daily accomplishments.
You should also demand that people take your time seriously. Examples:

  • Skip meetings if you don’t feel you can add value or if you think you can add more value somewhere else.
  • Shorten your meetings. Someone sends an invite for an hour? A very common thing that happens in organisations. Reply to make it half an hour.

Companies and teams tend to follow Parkinson’s law. Study this. It is a real thing organisations and teams struggle with. Be on the lookout for it.

If you cannot manage your time you can’t manage anything else.

Read on and you will see that this specific instruction permeates all other instructions.

Read Drucker and Aurelius to know more about this responsibility.

Add value

You are paid to add value. You are not paid for your time; for simply clocking in 8 hours every day. If you are, your company is doing it wrong and you are on the wrong path.

If you are an engineer, there are only two things that add value, and only two:

Creating things and solving problems.

That’s it.

You help your company by creating things and solving problems that their customers pay for. This is your contribution. Everything else is a byproduct of the above. And if it isn’t, immediately stop doing it.
You are by no means paid to have meetings; they can be a necessity, a means to an end, but never the end itself.

Meetings are also arrangements for people to socialize. This is fine and has its purpose (team bonding, building trust, or just fun). But again, the real purpose and goal is to always add value.

Yes, but I am a manager?

Make no mistake, as a manager you are paid for the exact same two things. However, as a product, customer, or team manager, your work is often less tangible or only indirectly related to the above. But if you drill down, your responsibilities as a manager are:

  • Decide priorities of things that need to be created or solved
  • Keep track of projects and commitments
  • Communicate within team and with other teams
  • Help team members grow

These four duties of a manager (or senior engineer) are there to ensure the team is still doing one of those two things: creating the right things or solving the right problems. There is no difference in responsibilities, just different tasks.

Read Grove, Drucker and Evans to know more about this responsibility.

Use your leverage

If you combine the above two instructions (Know thy time and Add Value) it leads to this: you are always trying to spend your time to add as much value as possible.

Whether you are an engineer or a manager: you have unique talents. This is your leverage, this is what enables you to add value, this is why you were hired. Use your talents as leverage to always try to add the most value.
You know your talents. And if you don’t, start writing down what you did the day before, reflect on it, and I assure you your talents will soon emerge (Know thy time). And with this knowledge:

Always ask: where can I at this moment add the most value?

Is sitting in a meeting with junior engineers to train them the best use of my time? Or should I try to finish building this database cluster? Or should I call this supplier and discuss their proposal? Different tasks that ask for different talents. And the answer is never straightforward and depends on many things. You have to decide.

But the rule of thumb is: always pick the activity where your unique talents can have the most impact to the added value of the team or company at that moment.

Read Grove to know more about this responsibility.

Manage expectations

In trying to reach the goals of either creating things or solving problems there are only two outcomes.

  1. You reach the result: you created what was expected or you solved the problem. Great!
  2. You communicate early that things weren’t going as planned. Not great, but this happens all the time.

It is your responsibility to manage expectations and try to eliminate surprises.

For companies: surprises are bad, avoid them. A job is not a birthday surprise party. Your coworker does not like surprises, nor does your manager, and I can assure you their manager even less (unless it is their actual birthday, of course).
The way to avoid surprises is to communicate often and early. And sometimes this is the only tool you have to manage things that are beyond your control (suppliers, illness etc.).

Of course always focus first on the first outcome (reach the result), but don’t wait to communicate when commitments or expectations are on the verge of being broken.

Read Allen, Drucker and Crucial Conversations to know more about this responsibility.

Track your tasks

You cannot slay the dragon until you can see it.

Cal Newport

If you have a job where you don’t have to write down what your team, manager, customers, and third parties need from you, you are either a genius or your job cannot be very satisfying. Let’s assume you are not a genius and that you have a challenging job. You need to start writing things down. You need a system to keep track of everything that is in your head, to get it out of your head and actively work on it.

Clear your head by writing everything down. Please don’t use precious brain cycles to keep track of what needs to be done. I repeat: you are not paid to keep track of things. You are paid to add value. You do not add value by keeping track of things, you add value by creating things or solving problems.

In its simplest form, this is the opposite of Know thy time, where you write down and reflect on what you did the previous day (backwards). Track your tasks is, at the most basic level, a list of what you will do today (forward). You can combine these two activities in one sitting, every morning. It will only take a couple of minutes.

Write down what you want to achieve today.

This not only gives you a reflection point for the next morning (Know Thy Time) but it will also structure your day and give you a good guideline of when you need to demand that your time be taken seriously (“Sorry I can’t work on that right now because..” etc.).

It will also ensure that you add value and it will be an invaluable resource in deciding whether you need to manage expectations.

Of course there are all sorts of ways to structure this to prioritize or specify your tasks. Here are the three main ones:

  • First things first
  • Start with the end in mind
  • Do one thing at a time

You can discuss these at length, but see them as a starting point. The key thing here is that you need a system: pen & paper, a computer file, or specialized tools. It does not matter what system you use. But please: clear your head.

Read Allen, Grove, Drucker and Covey to know more about this responsibility.

Prepare for a different job

This is not your last job. You will need to find another job. Prepare for this. This is your responsibility. Now. Not when you find yourself looking for a new job.

Know where your time goes and you will know your talents. You will also need to know or find out if your talents apply to other areas.

Actively prepare for a different job by always testing your talents against other jobs.

Know the difference between skill and talent. A good employer will look for talent more than skill. Say you are masterfully skilled in the custom, specialized CRM of your current employer. Your next employer will not have this CRM. This skill is useless. Your talent, however, could be that you are very quick at picking up new CRMs in general.

See training as a continuous process and not an event. You should always be trying to learn new things. Look to train for things that are generally applicable.

Don’t know what to do next? Write your own eulogy, be candid. What would you want people to say or remember about you? This is not some morbid experiment but one that will reveal your true desires. See if they line up with your talents. Where is the gap? Actively try to close this gap.

Read Drucker, Kotter, Covey and Johnson to know more about this responsibility.

Conclusion

This post is a summary of every responsibility you have as a professional. It presents a coherent model of six principles that can sharpen your views on your professional responsibilities.

This post also offers a variety of literary references as a starting point for you to dig deeper into the mentioned subjects. Because every subject here is, of course, much broader and deeper than will fit into a blogpost.

Why?

This question was left unanswered. And for all intents and purposes it could just as well have been at the top of the post. Why indeed a summary of principles and instructions?

Simple: you spend about half your waking life at your job. This is time you can only spend once. So this is extremely valuable time.

Your time is valuable and important, and you want to spend it on something that is both satisfying and fulfilling. You don’t want to spend your days propped up behind a screen doing things until you can clock out, right? This is a dead end. And you know it. I believe that a job that is satisfying and fulfilling provides meaning and leads to a richer life. And I am sure these instructions can help you achieve that.

This post is also available in Dutch 🇳🇱.

The post Working 101 appeared first on Jan van den Berg.

September 28, 2020

Phil Hagelberg (technomancy)

in which many rays are cast September 28, 2020 01:06 PM

The Lisp Game Jam is a semiannual tradition I enjoy participating in. This time around I created Spilljackers, a 3D cyberspace heist game with my friend Emma Bukacek. Rather than use Fennel with the LÖVE framework (my usual choice) we went back to TIC-80 which we had used previously on our last game collaboration.

3D brightly-colored maze with menacing enemy approaching

There's a lot to say about the style and feel of the game; Emma's punchy writing and catchy tunes brought so much to the table, but for this post I want to focus on the technical aspects. This was the first time I tried writing a 3D game. Instead of doing "proper 3D", which requires a lot of matrix math, I took a much simpler approach and built it using raycasting, which imposes some limitations but results in much simpler and faster [1] code.

Raycasting (not to be confused with ray-tracing) works by making each column of pixels on the screen cast a "ray" out to see what walls it encounters, and we use the information about walls to draw a column of pixels representing the wall. Some raycasting engines (like that of the famous Wolfenstein 3D) force all walls to be the same height, which means you can stop tracing when you hit the first wall, but we want to allow walls of various heights, so it traces the ray out to the distance limit, then tracks back and draws all the lines back-to-front so that the nearer lines cover the further ones.

diagram of casting rays

The final jam version of Spilljackers was 1093 lines, but I had the basic rendering of the map nailed down after the first evening of coding, and in fact the core of the algorithm can be demoed in 43 lines of code. There is some trig used, but if you've ever built a 2D game that used trig to do movement or collision detection, the math here is no more complex than that. Let's take a look!

We start by defining some constants and variables, including screen size, player characteristics (speed, turn speed, width, and height), and position/rotation:

(local (W MW H MH tau) (values 240 136 140 68 (* math.pi 2)))
(local (spd tspd pw ph) (values 0.7 0.06 0.7 0.5))
(var (x y rotation) (values 12 12 0))

Next we have the movement code. This looks a lot like it would in a 2D TIC game—when we move, we check all four corners of the player's bounding box to see if the new position is valid based on whether the map coordinates for that position show an empty tile (zero) or not. In a real game we would have some momentum and sliding across walls, but that's omitted for clarity.

(fn ok? [x y] (= 0 (mget (// x 8) (// y 8))))

(fn move [spd]
  (let [nx (+ x (* spd (math.cos rotation))) ny (+ y (* spd (math.sin rotation)))]
    (when (and (ok? (- nx pw) (- ny pw)) (ok? (- nx pw) (+ ny pw))
               (ok? (+ nx pw) (- ny pw)) (ok? (+ nx pw) (+ ny pw)))
      (set (x y) (values nx ny)))))

(fn input-update []
  (when (btn 0) (move spd))
  (when (btn 1) (move (- spd)))
  (when (btn 2) (set rotation (% (- rotation tspd) tau)))
  (when (btn 3) (set rotation (% (+ rotation tspd) tau))))

Now before we get to the actual raycasting, let's take a look at the map data. In TIC-80 each map position has a tile number in it which corresponds to a location on the sprite sheet. Rather than maintaining complex tables mapping tile numbers to the visual properties of the map cells they describe, we encode the properties of a tile in its sprite sheet position.

grid of colored sprites

The color of the cell is determined by the sprite's column, and the height of the cell is determined by its row. In this image tile #34 is selected. Since the sprites are arranged in a 16x16 grid, we calculate its column (and therefore its color) by taking the tile number modulo 16, getting 2. Likewise 34 divided by 16 (integer division) is 2, which gives us our height multiplier.

Back to the code—let's jump to the outermost function and work our way inwards. The TIC global function is called sixty times per second and ties everything together: reading input, updating state, and drawing. The for loop here steps thru every column to call draw-ray after precalculating a few things.

(fn _G.TIC []
  (input-update)
  (cls)
  (for [col 0 W] ; draw one column of the screen at a time
    (let [lens-r (math.atan (/ (- col MW) 100))]
      (draw-ray (math.sin (+ rotation lens-r)) (math.cos (+ rotation lens-r))
                (math.cos lens-r) col x y x y 16))))

fisheye effect

If we calculate all distances as being from the single x,y point representing the player's position (as is the case in the video here), then columns at the edges of the player's peripheral vision will look further away than columns near the center of the screen, resulting in a fisheye lens effect. The lens-r value above is used below to counteract that by accounting for how far the current column is from the midpoint of the screen. We also precalculate cos and sin once as an optimization, to avoid having to do it repeatedly within draw-ray.

The draw-ray function below starts by calling the cast helper function to see where the ray will intersect with the next tile, and what tile number that is. The height of the wall is calculated based on the distance of that intersection point from the player, with the lens-factor applied as mentioned above to counteract the fisheye effect. Once we have the height factor, it's used to calculate the top and bottom of the "wall slice" line by offsetting it from MH (the vertical midpoint of the screen), the ph height of the player, and (// tile 16), which tells us which row in the sprite sheet we're looking at.

Because we have to draw some walls behind other walls, the draw-ray function must be recursive. The limit argument tells us how far to recurse; if we haven't hit our limit, keep casting the ray before calling line to actually render the wall we've just calculated. This ensures that more distant walls are drawn behind closer walls. Finally we only call line if the tile is nonzero, because the zeroth tile indicates empty space. The color of the line is (% tile 16) since as per above, the column in the 16-tile-wide sprite sheet determines wall color.

(fn draw-ray [sin cos lens-factor col rx ry x y limit]
  (let [(hit-x hit-y tile) (cast rx ry cos sin) ; where and what tile is hit?
        dist (math.sqrt (+ (math.pow (- hit-x x) 2) (math.pow (- hit-y y) 2)))
        height-factor (/ 800 (* dist lens-factor))
        top (- MH (* height-factor (+ (// tile 16) (- 1 ph))))
        bottom (+ MH (* height-factor ph))]
    (when (< 0 limit) ; draw behind the current wall first
      (draw-ray sin cos lens-factor col hit-x hit-y x y (- limit 1)))
    (when (not= tile 0) ; only draw nonzero tiles
      (line col top col bottom (% tile 16)))))

In order to determine which tile a ray hits next, the cast function must check whether the ray will hit a horizontal edge of the next map cell or a vertical edge. Once it determines this it can use the precalculated cos and sin values to pinpoint the coordinates at which the next cell is hit, and call mget to identify the tile of the cell.

(fn cast-n [n d]
  (- (* 8 (if (< 0 d) (+ 1 (// n 8)) (- (math.ceil (/ n 8)) 1) )) n))

(fn ray-hits-x? [nx ny nxy nyx]
  (< (+ (* nx nx) (* nxy nxy)) (+ (* ny ny) (* nyx nyx))))

(fn cast [x y cos sin]
  (let [nx (cast-n x cos) nxy (/ (* nx sin) cos)
        ny (cast-n y sin) nyx (/ (* ny cos) sin)]
    (if (ray-hits-x? nx ny nxy nyx)
        (let [cx (+ x nx) cy (+ y nxy)]
          (values cx cy (mget (// (+ cx cos) 8) (// cy 8))))
        (let [cx (+ x nyx) cy (+ y ny)]
          (values cx cy (mget (// cx 8) (// (+ cy sin) 8)))))))

And that's it! That's all you need [2] for a minimal raycasting game in TIC-80. Below is an embedded HTML export of the game so you can try it out for yourself! Pressing ESC and clicking "close game" will bring you to the TIC console where you can press ESC again to see the code and map editors. Making changes to the code or map and entering RUN in the console will show you the effects of your changes!

I learned about raycasting from reading and modifying the source of the game Portal Caster, which creates some neat puzzles using portals. I also found this write-up of FPS-80, a somewhat more elaborate TIC-80 raycaster that includes some impressive lighting effects. In the final version of Spilljackers, I used the interlaced rendering strategy from FPS-80: every even tick you render the even columns, and every odd tick you render the odd columns. This results in some "fuzzy" visuals, but it improves performance to the point where the game runs smoothly with a long render distance even on an old Thinkpad from 2008.


[1] In this context, the reason raycasting is faster is that the platform I'm using (TIC-80) does not have any access to the GPU and does all its rendering on the CPU. If you have a GPU then things are different!

[2] You can see the full code on its own in a text file here. If you have TIC-80 downloaded, you can run tic80 mini.fnl to load up the game locally; the data for the map and palette are encoded as comments in the bottom of the text file.

September 27, 2020

Derek Jones (derek-jones)

Learning useful stuff from the Ecosystems chapter of my book September 27, 2020 09:35 PM

What useful, practical things might professional software developers learn from the Ecosystems chapter in my evidence-based software engineering book?

This week I checked the ecosystems chapter; what useful things did I learn (combined with everything I learned during all the other weeks spent working on this chapter)?

A casual reader would conclude that software engineering ecosystems involved lots of topics, with little or no theory connecting them. I had great plans for the connecting theories, but lack of detailed data, time and inspiration means the plans remain in my head (e.g., modelling the interaction between the growth of source code written in a particular language and the number of developers actively using that language).

For managers, the usefulness of this chapter is the strategic perspective it provides. How does what they and others are doing relate to everything else, and what patterns of evolution are to be expected?

Software people like to think that everything about software is unique. Software is unique, but the activities around it follow patterns that have been followed by other unique technologies, e.g., the automobile and jet engines. There is useful stuff to be learned from non-software ecosystems, and the chapter discusses some similarities I have learned about.

There is lots more evidence of the finite lifetime of software related items: lifetime of products, Linux distributions, packages, APIs and software careers.

Some readers might be surprised by the amount of discussion about what is now historical hardware. Software needs hardware to execute it, and the characteristics of the hardware of the day can have a significant impact on the characteristics of the software that gets written. I suspect that most of this discussion will not be that useful to most readers, but it provides some context around why things are the way they are today.

Readers with a wide knowledge of software ecosystems will notice that several major ecosystems barely get a mention. Embedded systems is a huge market, as is Microsoft Windows, and very many professional developers use C++. However, to date the focus of most research has been around Linux and Android (because of its use of Java, a language often taught in academia), and languages that have a major package repository. So the ecosystems chapter presents a rather blinkered view of software engineering ecosystems.

What did I learn from this chapter?

Software ecosystems are bigger and more complicated than I had originally thought.

Readers might have a completely different learning experience from reading the ecosystems chapter. What useful things did you learn from the ecosystems chapter?

Ponylang (SeanTAllen)

Last Week in Pony - September 27, 2020 September 27, 2020 02:55 PM

Ponyc 0.38.1 has been released. Support for prebuilt “generic glibc Linux” ponyc binaries is being dropped in favor of prebuilt images for specific Linux distributions. We are also pleased to announce Jason Carr, AKA @jasoncarr0, is now a Pony committer!

September 24, 2020

Unrelenting Technology (myfreeweb)

Noticed the mgb driver for Microchip LAN7430 September 24, 2020 05:31 PM

Noticed the mgb driver for Microchip LAN7430 (/31) NIC in FreeBSD commit logs. Huh, interesting stuff: Microchip publishes so much documentation.. a “Programmer’s Guide” PDF with lots of driver pseudocode, and even evaluation board design files!

September 23, 2020

Sevan Janiyan (sevan)

Book review: BPF Performance Tools: Linux System and Application Observability September 23, 2020 07:21 PM

It’s more than 11 years since the shouting in the data centre video landed and I still manage to surprise folks in 2020 who have never seen it with what is possible. The idea that such transparency is a reality in some circles comes as a shock. Without the facility to be able to dynamically instrument …

September 22, 2020

Maxwell Bernstein (tekknolagi)

Compiling a Lisp: Reader September 22, 2020 05:00 AM


Welcome back to the “Compiling a Lisp” series. This time I want to take a break from compiling and finally add a reader. I’m getting frustrated manually entering increasingly complicated ASTs, so I figure it is time. After this post, we’ll be able to type in programs like:

(< (+ 1 2) (- 4 3))

and have our compiler make ASTs for us! Magic. This will also add some nice debugging tools for us. For example, imagine an interactive command line utility in which we can enter Lisp expressions and the compiler prints out human-readable assembly (and hex? maybe?). It could even run the code, too. Check out this imaginary demo:

lisp> 1
; mov rax, 0x4
=> 1
lisp> (add1 1)
; mov rax, 0x4
; add rax, 0x4
=> 2
lisp>

Wow, what a thought.

The Reader interface

To make this interface as simple and testable as possible, I want the reader interface to take in a C string and return an ASTNode *:

ASTNode *Reader_read(char *input);

We can add interfaces later to support reading from a FILE* or file descriptor or something, but for now we’ll just use strings and line-based input.

On success, we’ll return a fully-formed ASTNode*. But on error, well, hold on. We can’t just return NULL. On many platforms, NULL is defined to be 0, which is how we encode the integer 0. On others, it could be defined to be 0x55555555 [1] or something equally silly. Regardless, its value might overlap with our type encoding scheme in some unintended way.
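
To make the overlap concrete (a small illustration):

// With our tagging scheme the integer 0 encodes as the word 0 -- the same
// bit pattern most platforms use for NULL -- so a NULL return would be
// indistinguishable from successfully reading the number 0.
ASTNode *zero = AST_new_integer(0);
assert((uword)zero == 0);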

This means that we have to go ahead and add another immediate object: an Error object. We have some open immediate tag bits, so sure, why not. We can also use this to signal runtime errors and other fun things. It’ll probably be useful.

The Error object

Back to the object tag diagram. Below I have reproduced the tag diagram from previous posts, but now with a new entry (denoted by <-). This new entry shows the encoding for the canonical Error object.

High                                                         Low
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX00  Integer
0000000000000000000000000000000000000000000000000XXXXXXX00001111  Character
00000000000000000000000000000000000000000000000000000000X0011111  Boolean
0000000000000000000000000000000000000000000000000000000000101111  Nil
0000000000000000000000000000000000000000000000000000000000111111  Error <-
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX001  Pair
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX010  Vector
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX011  String
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX101  Symbol
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX110  Closure

If we wanted to, we could even add additional tag bits to the (currently all 0) payload, to signal different kinds of errors. Maybe later. For now, we add a tag constant and associated Object and AST functions:

const unsigned int kErrorTag = 0x3f; // 0b111111
uword Object_error() { return kErrorTag; }

bool AST_is_error(ASTNode *node) { return (uword)node == Object_error(); }
ASTNode *AST_error() { return (ASTNode *)Object_error(); }

That should be enough to get us going for now. Perhaps we could even convert our Compile_ suite of functions to use this object instead of an int. It would certainly be more informative. Maybe in a future post.

Language syntax

Let’s get back to business and think about what we want our language to look like. This is a Lisp series but really you could adapt your reader to read any sort of syntax. No need for parentheses if you’re allergic.

I’m going to use this simple Lisp reader because it’s short and simple, so we’ll have some parens.

First, our integers will look like integers in most languages — 0, 123, -123.

You can add support for other bases if you like, but I don’t plan on it here.

Second, our characters will look like C characters — 'a', 'b', etc. Some implementations opt for #\a but that has always looked funky to me.

Third, our booleans will be #t and #f. You’re also welcome to go ahead and use symbols to represent the names, avoid special syntax, and have those symbols evaluate to truthy and falsey values.

Fourth, the nil object will be (). We can also later bind the symbol nil to mean (), too.

I’m going to skip error objects, because they don’t have any sort of user-land meaning yet — they’re just used in compiler infrastructure right now.

Fifth, pairs will look like (1 2 3), meaning (cons 1 (cons 2 (cons 3 nil))). I don’t plan on adding support for dotted pair syntax. Whitespace will be insignificant.

Sixth, symbols will look like any old ASCII identifier: hello, world, fooBar. I’ll also include some punctuation in there, too, so we can use + and - as symbols, for example. Or we could even go full Lisp and use train-case identifiers.

I’m going to skip closures, since they don’t have a syntactic representation — they are just objects known to the runtime. Vectors and strings don’t have any implementation right now so we’ll add those to the reader later.

That’s it! Key points are: mind your plus and minus signs since they can appear in both integers and symbols; don’t read off the end; have fun.

The Reader implementation

Now that we’ve rather informally specified what our language looks like, we can write a small reader. We’ll start with the Reader_read function from above.

This function will just be a shell around an internal function with some more parameters.

ASTNode *Reader_read(char *input) {
  word pos = 0;
  return read_rec(input, &pos);
}

This is because we need to carry around some more state to read through this string. We need to know how far into the string we are. I chose to use an additional word for the index. Some might prefer a char** instead. Up to you.

With any recursive reader invocation, we should advance through all the whitespace, because it doesn’t mean anything to us. For this reason, we have a handy-dandy skip_whitespace function that reads through all the whitespace and then returns the next non-whitespace character.

void advance(word *pos) { ++*pos; }

char next(char *input, word *pos) {
  advance(pos);
  return input[*pos];
}

char skip_whitespace(char *input, word *pos) {
  char c = '\0';
  for (c = input[*pos]; isspace(c); c = next(input, pos)) {
    ;
  }
  return c;
}

We can use skip_whitespace in the read_rec function to fetch the next non-whitespace character. Then we’ll use that character (and sometimes the following one, too) to determine what structure we’re about to read.

bool starts_symbol(char c) {
  switch (c) {
  case '+':
  case '-':
  case '*':
  case '>':
  case '=':
  case '?':
    return true;
  default:
    return isalpha(c);
  }
}

ASTNode *read_rec(char *input, word *pos) {
  char c = skip_whitespace(input, pos);
  if (isdigit(c)) {
    return read_integer(input, pos, /*sign=*/1);
  }
  if (c == '+' && isdigit(input[*pos + 1])) {
    advance(pos); // skip '+'
    return read_integer(input, pos, /*sign=*/1);
  }
  if (c == '-' && isdigit(input[*pos + 1])) {
    advance(pos); // skip '-'
    return read_integer(input, pos, /*sign=*/-1);
  }
  if (starts_symbol(c)) {
    return read_symbol(input, pos);
  }
  if (c == '\'') {
    advance(pos); // skip '\''
    return read_char(input, pos);
  }
  if (c == '#' && input[*pos + 1] == 't') {
    advance(pos); // skip '#'
    advance(pos); // skip 't'
    return AST_new_bool(true);
  }
  if (c == '#' && input[*pos + 1] == 'f') {
    advance(pos); // skip '#'
    advance(pos); // skip 'f'
    return AST_new_bool(false);
  }
  if (c == '(') {
    advance(pos); // skip '('
    return read_list(input, pos);
  }
  return AST_error();
}

Note that I put the integer cases above the symbol case because we want to catch -123 as an integer instead of a symbol, and -a123 as a symbol instead of an integer.

We’ll probably add more entries to starts_symbol later, but those should cover the names we’ve used so far.

For each type of subcase (integer, symbol, list), the basic idea is the same: while we’re still inside the subcase, add on to it.

For integers, this means multiplying and adding (concatenating digits, so to speak):

ASTNode *read_integer(char *input, word *pos, int sign) {
  word result = 0;
  for (char c = input[*pos]; isdigit(c); c = next(input, pos)) {
    result *= 10;
    result += c - '0';
  }
  return AST_new_integer(sign * result);
}

It also takes a sign parameter so if we see an explicit -, we can negate the integer.

For symbols, this means reading characters into a C string buffer:

const word ATOM_MAX = 32;

bool is_symbol_char(char c) {
  return starts_symbol(c) || isdigit(c);
}

ASTNode *read_symbol(char *input, word *pos) {
  char buf[ATOM_MAX + 1]; // +1 for NUL
  word length = 0;
  for (length = 0; length < ATOM_MAX && is_symbol_char(input[*pos]); length++) {
    buf[length] = input[*pos];
    advance(pos);
  }
  buf[length] = '\0';
  return AST_new_symbol(buf);
}

For simplicity’s sake, I avoided dynamic resizing. We only get symbols of at most 32 characters. Oh well.

Note that symbols can also have trailing numbers in them, just not at the front — like add1.

For characters, we only have three potential input characters to look at: quote, char, quote. No need for a loop:

ASTNode *read_char(char *input, word *pos) {
  char c = input[*pos];
  if (c == '\'') {
    return AST_error();
  }
  advance(pos);
  if (input[*pos] != '\'') {
    return AST_error();
  }
  advance(pos);
  return AST_new_char(c);
}

This means that input like '' or 'aa' will be an error.

For booleans, we can tackle those inline because there’s only two cases and they’re both trivial. Check for #t and #f. Done.

And last, for lists, it means we recursively build up pairs until we get to nil:

ASTNode *read_list(char *input, word *pos) {
  char c = skip_whitespace(input, pos);
  if (c == ')') {
    advance(pos);
    return AST_nil();
  }
  ASTNode *car = read_rec(input, pos);
  assert(car != AST_error());
  ASTNode *cdr = read_list(input, pos);
  assert(cdr != AST_error());
  return AST_new_pair(car, cdr);
}

Note that we still have to skip whitespace in the beginning so that we catch cases that have space either right after an opening parenthesis or right before a closing parenthesis. Or both!

That’s it — that’s the whole parser. Now let’s write some tests.

Tests

I added a new suite for reader tests. I figure it’s nice to have them grouped. Here are some of the trickier tests from that suite that originally tripped me up one way or another.

Negative integers originally parsed as symbols until I figured out I had to flip the case order:

TEST read_with_negative_integer_returns_integer(void) {
  char *input = "-1234";
  ASTNode *node = Reader_read(input);
  ASSERT_IS_INT_EQ(node, -1234);
  AST_heap_free(node);
  PASS();
}

Oh, and the ASSERT_IS_INT_EQ and upcoming ASSERT_IS_SYM_EQ macros are helpers that assert the type and value are as expected.
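
Something along these lines would do. This is just my guess at a definition; the accessor names (AST_is_integer, AST_get_integer) are assumptions and may not match the real code:

#define ASSERT_IS_INT_EQ(node, expected)                                       \
  do {                                                                         \
    ASSERT(AST_is_integer(node));                   /* right type */           \
    ASSERT_EQ_FMT((word)(expected), AST_get_integer(node), "%ld"); /* value */ \
  } while (0)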

I also forgot about leading whitespace for a while:

TEST read_with_leading_whitespace_ignores_whitespace(void) {
  char *input = "   \t   \n  1234";
  ASTNode *node = Reader_read(input);
  ASSERT_IS_INT_EQ(node, 1234);
  AST_heap_free(node);
  PASS();
}

And also whitespace in lists:

TEST read_with_list_returns_list(void) {
  char *input = "( 1 2 0 )";
  ASTNode *node = Reader_read(input);
  ASSERT(AST_is_pair(node));
  ASSERT_IS_INT_EQ(AST_pair_car(node), 1);
  ASSERT_IS_INT_EQ(AST_pair_car(AST_pair_cdr(node)), 2);
  ASSERT_IS_INT_EQ(AST_pair_car(AST_pair_cdr(AST_pair_cdr(node))), 0);
  ASSERT(AST_is_nil(AST_pair_cdr(AST_pair_cdr(AST_pair_cdr(node)))));
  AST_heap_free(node);
  PASS();
}

And here’s some goofy symbol to make sure all these symbol characters work:

TEST read_with_symbol_returns_symbol(void) {
  char *input = "hello?+-*=>";
  ASTNode *node = Reader_read(input);
  ASSERT_IS_SYM_EQ(node, "hello?+-*=>");
  AST_heap_free(node);
  PASS();
}

And to make sure trailing digits in symbol names work:

TEST read_with_symbol_with_trailing_digits(void) {
  char *input = "add1 1";
  ASTNode *node = Reader_read(input);
  ASSERT_IS_SYM_EQ(node, "add1");
  AST_heap_free(node);
  PASS();
}

Nice.

Some extras

Now, we could wrap up with the tests, but I did mention some fun features like a REPL. Here’s a function repl that you can call from your main function instead of running the tests.

int repl() {
  do {
    // Read a line
    fprintf(stdout, "lisp> ");
    char *line = NULL;
    size_t size = 0;
    ssize_t nchars = getline(&line, &size, stdin);
    if (nchars < 0) {
      fprintf(stderr, "Goodbye.\n");
      free(line);
      break;
    }

    // Parse the line
    ASTNode *node = Reader_read(line);
    free(line);
    if (AST_is_error(node)) {
      fprintf(stderr, "Parse error.\n");
      continue;
    }

    // Compile the line
    Buffer buf;
    Buffer_init(&buf, 1);
    int result = Compile_expr(&buf, node, /*stack_index=*/-kWordSize);
    AST_heap_free(node);
    if (result < 0) {
      fprintf(stderr, "Compile error.\n");
      Buffer_deinit(&buf);
      continue;
    }

    // Print the assembled code
    for (size_t i = 0; i < buf.len; i++) {
      fprintf(stderr, "%.02x ", buf.address[i]);
    }
    fprintf(stderr, "\n");

    Buffer_deinit(&buf);
  } while (true);
  return 0;
}

And we can trigger this mode by passing --repl-assembly:

int run_tests(int argc, char **argv) {
  GREATEST_MAIN_BEGIN();
  RUN_SUITE(object_tests);
  RUN_SUITE(ast_tests);
  RUN_SUITE(reader_tests);
  RUN_SUITE(buffer_tests);
  RUN_SUITE(compiler_tests);
  GREATEST_MAIN_END();
}

int main(int argc, char **argv) {
  if (argc == 2 && strcmp(argv[1], "--repl-assembly") == 0) {
    return repl();
  }
  return run_tests(argc, argv);
}

It uses all the machinery from the last couple posts and then prints out the results in hex pairs. Interactions look like this:

sequoia% ./bin/compiling-reader --repl-assembly
lisp> 1
48 c7 c0 04 00 00 00 
lisp> (add1 1)
48 c7 c0 04 00 00 00 48 05 04 00 00 00 
lisp> 'a'
48 c7 c0 0f 61 00 00
lisp> Goodbye.
sequoia% 

Excellent. A fun exercise for the reader might be going further and executing the compiled code and printing the result, as above. The trickiest part of it, I think, will be printing the result, because we don’t have infrastructure for that yet.
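
Here is a rough sketch of that exercise, reusing helpers the test suite already has (Compile_function, Buffer_make_executable, Testing_execute_expr). Printing the raw encoded word dodges the hard part, which is decoding it back into a readable object:

Buffer buf;
Buffer_init(&buf, 1);
if (Compile_function(&buf, node) == 0) {
  Buffer_make_executable(&buf);
  uword result = Testing_execute_expr(&buf);
  fprintf(stdout, "=> 0x%lx (encoded)\n", result);
} else {
  fprintf(stderr, "Compile error.\n");
}
Buffer_deinit(&buf);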

Another fun exercise might be adding a mode to the compiler to print text assembly to the screen, like a debugging trace. This should be straightforward enough since we already have very specific opcode implementations.

Anyway, thanks for reading. Next time we’ll get back to compiling and tackle let-expressions.




  1. See this series of Tweets by Kate about changing the value of NULL in the TenDRA compiler. 

September 20, 2020

Ponylang (SeanTAllen)

Last Week in Pony - September 20, 2020 September 20, 2020 11:38 PM

Sean T. Allen has released version 0.0.1 of the lori TCP library.

Derek Jones (derek-jones)

Learning useful stuff from the Projects chapter of my book September 20, 2020 09:24 PM

What useful, practical things might professional software developers learn from the Projects chapter in my evidence-based software engineering book?

This week I checked the projects chapter; what useful things did I learn (combined with everything I learned during all the other weeks spent working on this chapter)?

There turned out to be around three to four times more data publicly available than I had first thought. This is good, but there is a trap for the unwary. For many topics there is one data set, and that one data set may not be representative.

Some data is better than no data, provided small data sets are treated with caution.

Estimation is a popular research topic: how long will a project take and how much will it cost.

After reading all the papers I learned that existing estimation models are even more unreliable than I had thought, and what is more, there are plenty of published benchmarks showing how unreliable the models really are (these papers never seem to get cited).

Models that include lines of code in the estimation process (i.e., the majority of models) need a good estimate of the likely number of lines in the final software system. One issue that nobody had considered was the impact of developer variability on the number of lines written to implement the same functionality, which turns out to be large. Oops.

Machine learning has infested effort estimation research. What the machine learning models actually do is estimate adjustment, i.e., they do not create their own estimate but adjust one passed in as input to the model. Most estimation data sets are tiny, and only contain a few different variables; unless the estimate is included in the training phase, the generated model produces laughable results. Oops.

The good news is that there appear to be lots of recurring patterns in the project data. This is good news because recurring patterns are something to be explained by a theory of software project development (apparent randomness is bad news, from the perspective of coming up with a model of what is going on). I think we are still a long way from having workable theories, but seeing patterns is a good sign that one or more theories will be possible.

I think that the main takeaway from this chapter is that software often has a short lifetime. People in industry probably have a vague feeling that this is true, from experience with short-lived projects. It is not cost effective to approach commercial software development from the perspective that the code will live a long time; some code does live a long time, but most dies young. I see the implications of this reality being a major source of contention with those in academia who have spent too long babbling away in front of teenagers (teaching the creation of idealized software that lives on forever), and little or no time building software systems.

A lot of software is written by teams of people; however, there is not a lot of data available on teams (software or otherwise). Given the difficulty of hiring developers, companies have to make do with what they have, so a theory of software teams might not be that useful in practice.

Readers might have a completely different learning experience from reading the projects chapter. What useful things did you learn from the projects chapter?

September 19, 2020

Gustaf Erikson (gerikson)

Re-reading Dune and Heretics of Dune September 19, 2020 07:58 PM

I’ve re-read Frank Herbert’s 1965 novel Dune, partly inspired by the upcoming movie.

Based on my memories I first read it in 1988 or so. The first novel in the series I read was actually Heretics of Dune (published in 1984) which I borrowed from the library in Halmstad. This must have been in 1986 or ‘87. I’ve long realized that it’s not a huge deal to read some novel series out of order - especially ones that are so self-contained as the Dune novels. Heretics takes place 5,000 years after Dune, after all.

Anyway, if you’re only going to read one Dune novel, the first one is the best. It has all the goodies - the worldbuilding, the Hero’s Journey, the tight plotting and good use of language. Even the 1960s elements have aged well - while standards like telepathy are there they’re only mentioned in passing, and the central idea of prescience is part of the plot and well handled there.

I wonder what the movie will do with the implicit connection of the Fremen with modern-day inhabitants of the Middle East. While using terms like jihad was merely a frisson in the original, they take on a darker tone in today’s climate - at least among the less enlightened. I suspect the projected 2-parter will not emphasize the jihad Paul foresees throughout the novel and instead focus on the thrilling twists and turns.

After Dune I decided to re-read Heretics. There’s almost 20 years between the novels, and it’s clear that Herbert has picked up a lot of contemporary SF tropes in the meantime. The tech in Dune is almost indistinguishable from magic - devices such as suspensors and personal shields were never explained, instead added to impart flavor - and to enforce the quasi-medieval setting of the universe.

Heretics is much more explicit in its descriptions of space travel, weapons and other technology, but not in a way that feels dated. However, the novel is marred by long stretches of interior dialogue, where the protagonists muse about religion, history, and fate in excruciating detail. While I admire Herbert for bringing in female protagonists (in the form of the Bene Gesserit sisterhood), they’re really not that interesting as characters.

I consider Dune a bona-fide SF classic and anyone interested in the genre should read it. But don’t feel pressured to read more from Herbert’s universe.

September 18, 2020

Gonçalo Valério (dethos)

Django Friday Tips: Inspecting ORM queries September 18, 2020 07:01 PM

Today let’s look at the tools Django provides out of the box to debug the queries made to the database using the ORM.

This isn’t an uncommon task. Almost everyone who works on a non-trivial Django application faces situations where the ORM does not return the correct data or a particular operation is taking too long.

The best way to understand what is happening behind the scenes when you build database queries using your defined models, managers and querysets, is to look at the resulting SQL.

The standard way of doing this is to set the logging configuration to print all queries done by the ORM to the console. This way when you browse your website you can check them in real time. Here is an example config:

LOGGING = {
    ...
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'filters': ['require_debug_true'],
            'class': 'logging.StreamHandler',
        },
        ...
    },
    'loggers': {
        ...
        'django.db.backends': {
            'level': 'DEBUG',
            'handlers': ['console', ],
        },
    }
}

The result will be something like this:

...
web_1     | (0.001) SELECT MAX("axes_accessattempt"."failures_since_start") AS "failures_since_start__max" FROM "axes_accessattempt" WHERE ("axes_accessattempt"."ip_address" = '172.18.0.1'::inet AND "axes_accessattempt"."attempt_time" >= '2020-09-18T17:43:19.844650+00:00'::timestamptz); args=(Inet('172.18.0.1'), datetime.datetime(2020, 9, 18, 17, 43, 19, 844650, tzinfo=<UTC>))
web_1     | (0.001) SELECT MAX("axes_accessattempt"."failures_since_start") AS "failures_since_start__max" FROM "axes_accessattempt" WHERE ("axes_accessattempt"."ip_address" = '172.18.0.1'::inet AND "axes_accessattempt"."attempt_time" >= '2020-09-18T17:43:19.844650+00:00'::timestamptz); args=(Inet('172.18.0.1'), datetime.datetime(2020, 9, 18, 17, 43, 19, 844650, tzinfo=<UTC>))
web_1     | Bad Request: /users/login/
web_1     | [18/Sep/2020 18:43:20] "POST /users/login/ HTTP/1.1" 400 2687

Note: The console output will get a bit noisy

Now let’s suppose this logging config is turned off by default (for example, in a staging server). You are manually debugging your app using the Django shell and doing some queries to inspect the resulting data. In this case str(queryset.query) is very helpful to check if the query you have built is the one you intended. Here’s an example:

>>> box_qs = Box.objects.filter(expires_at__gt=timezone.now()).exclude(owner_id=10)
>>> str(box_qs.query)
'SELECT "boxes_box"."id", "boxes_box"."name", "boxes_box"."description", "boxes_box"."uuid", "boxes_box"."owner_id", "boxes_box"."created_at", "boxes_box"."updated_at", "boxes_box"."expires_at", "boxes_box"."status", "boxes_box"."max_messages", "boxes_box"."last_sent_at" FROM "boxes_box" WHERE ("boxes_box"."expires_at" > 2020-09-18 18:06:25.535802+00:00 AND NOT ("boxes_box"."owner_id" = 10))'

If the problem is related to performance, you can check the query plan to see if it hits the right indexes using the .explain() method, like you would normally do in SQL.

>>> print(box_qs.explain(verbose=True))
Seq Scan on public.boxes_box  (cost=0.00..13.00 rows=66 width=370)
  Output: id, name, description, uuid, owner_id, created_at, updated_at, expires_at, status, max_messages, last_sent_at
  Filter: ((boxes_box.expires_at > '2020-09-18 18:06:25.535802+00'::timestamp with time zone) AND (boxes_box.owner_id <> 10))

This is it, I hope you find it useful.

September 17, 2020

Gustaf Erikson (gerikson)

Six months since WFH began September 17, 2020 02:57 PM

September 16, 2020

Maxwell Bernstein (tekknolagi)

Compiling a Lisp: Primitive binary functions September 16, 2020 05:00 AM


Welcome back to the “Compiling a Lisp” series. Last time, we added some primitive unary instructions like add1 and integer->char. This time, we’re going to add some primitive binary functions like + and <. After this post, we’ll be able to compile programs like:

(< (+ 1 2) (- 4 3))

Note that these expressions may look like function calls but, like last chapter, they are not opening new stack frames (which I’ll explain more about later). Instead, the compiler will recognize that the programmer is directly applying the symbol + and generate special code. You can think about this kind of like an inlined function call.

It’s important to remember that the compiler has a certain internal contract: the result of any given compiled expression is stored in rax. This isn’t some intrinsic property of all compilers, but it’s one we’ve kept so far in this series.

This is similar to but not the same as the calling convention that I mentioned earlier, where function results are stored in rax. That calling convention is for interacting with other people’s code. Within your own generated code, there are no rules. So we could pick any other register, really, for storing intermediate results.

Now that we’re building primitive functions that can take two arguments, you might notice a problem: our strategy of storing the result in rax won’t work on its own. If we were to naïvely write something like the following to implement +, then rax would get overwritten in the code generated by compiling operand1(args):

int Compile_call(Buffer *buf, ASTNode *callable, ASTNode *args) {
  if (AST_is_symbol(callable)) {
    // ...
    if (AST_symbol_matches(callable, "+")) {
      _(Compile_expr(buf, operand2(args)));
      // The result of this is stored in rax ^
      _(Compile_expr(buf, operand1(args)));
      // Oops, we just overwrote rax ^
      Emit_add_something(buf, /*dst=*/kRax);
      return 0;
    }
    // ...
  }
  // ...
}

We could try and work around this by adding some kind of register allocation algorithm and take advantage of rcx, rdx, etc. Or, simpler, we could decide to allocate all intermediate values on the stack and move on with our lives. I prefer the latter. It’s simpler.

Stack background info

Since we can’t yet save our compiled programs to disk, there’s some amount of setup that has to happen before they’re run. Right now, the C programs I’m providing along with this series compile to binaries that just run the test suites for the compiler. They don’t actually run full programs. For this reason, there are already some call frames on the stack by the time our generated code is run.

Let’s take a look at the stack at the moment we enter a compiled Lisp program:

|                  | High addresses
+------------------+
|  main            |
+------------------+ |
|  ~ some data ~   | |
|  ~ some data ~   | |
+------------------+ |
|  compile_test    | |
+------------------+ |
|  ~ some data ~   | |
|  ~ some data ~   | v
+------------------+
|  Testing_exe...  | rsp (stack pointer)
+------------------+
|                  | <-- Our frame!
|                  | Low addresses

In this diagram, we have the C program’s main function, which has its own local variables and so on. Then the main function calls the compile_test unit test suite. This in turn calls the Testing_execute_expr function (abbreviated in the diagram), which is responsible for calling into our generated code. Every call stores the return address (some place to find the next instruction to execute) on the stack and adjusts rsp down.

Refresher: the call stack grows down. Why? Check out this StackOverflow answer that quotes an architect on the Intel 4004 and 8080 architectures. It’s stayed the same ever since.

In this diagram, we have rsp pointing at a return address somewhere inside the function Testing_execute_expr, since that’s what called our Lisp entrypoint. We have some data “above” (higher addresses) rsp that we’re not allowed to poke at, and we have this empty space “below” (lower addresses) rsp that is in our current stack frame. I say “empty” because we haven’t yet stored anything there, not because it’s necessarily zero-ed out. I don’t think there are any guarantees about the values in this stack frame.

We can use our stack frame to write and read values for our current Lisp program. With every recursive subexpression, we can allocate a little more stack space to keep track of the values. When I say “allocate”, I mean “subtract from the stack pointer”, because the stack is already a contiguous space in memory allocated for us. For example, here is how we can write to the stack:

mov [rsp-8], 0x4

This puts the integer 4 at displacement -8 from rsp. On the stack diagram above, it would be at the slot labeled “Our frame”. It’s also possible to read with a positive or zero displacement, but those point to previous stack frames and the return address, respectively. So let’s avoid manipulating those.

Note that I used a multiple of 8. Not every store has to be to an address that is a multiple of 8, but it is natural and I think also faster to store 8-byte-sized things at aligned addresses.

Let’s walk through a real example to get more hands-on experience with this stack storage idea. We’ll use the program (+ 1 2). The compiled version of that program should:

  • Move compile(2) to rax
  • Move rax into [rsp-8]
  • Move compile(1) to rax
  • Add [rsp-8] to rax

So after compiling that, the stack will look like this:

|                  | High addresses
+------------------+
|  Testing_exe...  | RSP
+------------------+
|  0x8             | RSP-8 (result of compile(2))
|                  | Low addresses

And the result will be in rax, per our internal compiler contract.

This is all well and good, but at some point we’ll need our compiled programs to emit the push instruction or make function calls of their own. Both of these modify the stack pointer. push writes to the stack and decrements rsp. call is roughly equivalent to a push of the return address followed by a jmp.
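
Concretely, in the same assembly syntax as above, both of these implicitly move rsp (a hand-written illustration, not something our compiler emits):

push rax          ; rsp <- rsp - 8, then [rsp] <- rax
call myfunction   ; rsp <- rsp - 8, then [rsp] <- return address, then jump to myfunction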

For that reason, x86-64 comes with another register called rbp and it’s designed to hold the Base Pointer. While the stack pointer is supposed to track the “top” (low address) of the stack, the base pointer is meant to keep a pointer around to the “bottom” (high address) of our current stack frame.

This is why in a lot of compiled code you see the following instructions repeated1:

myfunction:
push rbp
mov rbp, rsp
sub rsp, N  ; optional; allocate stack space for locals
; ... function body ...
mov rsp, rbp  ; required if you subtracted from rsp above
pop rbp
ret

The first three instructions, called the prologue, save rbp to the stack, set rbp to the current stack pointer, and (optionally) reserve stack space for locals. After that it’s possible to maintain steady references to variable locations on the stack even as rsp changes. Yes, the compiler could adjust its internal table of references every time it emits code that modifies rsp, but that sounds much harder.

The last three instructions, called the epilogue, restore rsp from rbp, fetch the old rbp that we saved to the stack and write it back into rbp, then exit the call.

To confirm this for yourself, check out this sample compiled C code. Look at the disassembly following the label square. Prologue, code, epilogue.
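
If you don’t feel like clicking through, a C function like int square(int x) { return x * x; } compiled without optimization typically comes out roughly like this (exact instructions vary by compiler):

square:
  push rbp                      ; prologue
  mov rbp, rsp
  mov DWORD PTR [rbp-4], edi    ; spill the argument into our frame
  mov eax, DWORD PTR [rbp-4]
  imul eax, eax                 ; the actual body
  pop rbp                       ; epilogue
  ret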

Stack allocation infrastructure

Until now, we haven’t needed to keep track of much as we recursively traverse expression trees. Now, in order to keep track of how much space on the stack any given compiled code will need, we have to add more state to our compiler. We’ll call this state the stack_index — Ghuloum calls it si — and we’ll pass it around as a parameter. Whatever it’s called, it points to the first writable (unused) index in the stack at any given point.

In compiled functions, the first writable index is -kWordSize (-8), since the saved base pointer is already at offset 0.

int Compile_function(Buffer *buf, ASTNode *node) {
  Buffer_write_arr(buf, kFunctionPrologue, sizeof kFunctionPrologue);
  _(Compile_expr(buf, node, -kWordSize));
  Buffer_write_arr(buf, kFunctionEpilogue, sizeof kFunctionEpilogue);
  return 0;
}

I’ve also gone ahead and added the prologue and epilogue. They’re stored in static arrays. This makes them easier to modify, and also makes them accessible to testing helpers. The testing helpers can use these arrays to make testing easier for us — we can check if our expected code is book-ended by this code.

static const byte kFunctionPrologue[] = {
    // push rbp
    0x55,
    // mov rbp, rsp
    kRexPrefix, 0x89, 0xe5,
};

static const byte kFunctionEpilogue[] = {
    // pop rbp
    0x5d,
    // ret
    0xc3,
};
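
To make the book-ending idea concrete, here is a sketch of how such a helper could use these arrays (a simplified illustration, not the repo’s actual EXPECT_FUNCTION_CONTAINS_CODE implementation):

#include <stdbool.h>
#include <string.h>  // memcmp

// True if buf starts with the prologue, ends with the epilogue, and has
// `expected` immediately after the prologue.
bool function_contains_code(Buffer *buf, const byte *expected,
                            size_t expected_size) {
  size_t prologue_size = sizeof kFunctionPrologue;
  size_t epilogue_size = sizeof kFunctionEpilogue;
  if (buf->len < prologue_size + expected_size + epilogue_size) return false;
  if (memcmp(buf->address, kFunctionPrologue, prologue_size) != 0) return false;
  if (memcmp(buf->address + prologue_size, expected, expected_size) != 0)
    return false;
  return memcmp(buf->address + buf->len - epilogue_size, kFunctionEpilogue,
                epilogue_size) == 0;
}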

For Compile_expr, we just pass this new stack index through.

int Compile_expr(Buffer *buf, ASTNode *node, word stack_index) {
  // ...
  if (AST_is_pair(node)) {
    return Compile_call(buf, AST_pair_car(node), AST_pair_cdr(node),
                        stack_index);
  }
  // ...
}

And for Compile_call, we actually get to use it. Let’s look back at our stack storage strategy for compiling (+ 1 2) (now replacing rsp with rbp):

  • Move compile(2) to rax
  • Move rax into [rbp-8]
  • Move compile(1) to rax
  • Add [rbp-8] to rax

For binary functions, this can be generalized to:

  • Compile arg2 (stored in rax)
  • Move rax to [rbp+stack_index]
  • Compile arg1 (stored in rax)
  • Do something with the results (in [rbp+stack_index] and rax)

The key is this: for the first recursive call to Compile_expr, the compiler is allowed to emit code that can use the current stack_index and anything below that on the stack. For the second recursive call to Compile_expr, the compiler has to bump stack_index, since we’ve stored the result of the first compiled call at stack_index.

Take a look at our implementation of binary add:

int Compile_call(Buffer *buf, ASTNode *callable, ASTNode *args,
                 word stack_index) {
  if (AST_is_symbol(callable)) {
    // ...
    if (AST_symbol_matches(callable, "+")) {
      _(Compile_expr(buf, operand2(args), stack_index));
      Emit_store_reg_indirect(buf, /*dst=*/Ind(kRbp, stack_index),
                              /*src=*/kRax);
      _(Compile_expr(buf, operand1(args), stack_index - kWordSize));
      Emit_add_reg_indirect(buf, /*dst=*/kRax, /*src=*/Ind(kRbp, stack_index));
      return 0;
    }
    // ...
  }
  // ...
}

In this snippet, Ind stands for “indirect”, and is a constructor for a struct. This is an easy and readable way to represent (register, displacement) pairs for use in reading from and writing to memory. We’ll cover this in more detail in the instruction encoding.

To prove to ourselves that this approach works, we’ll add some tests later.

Other binary functions

Subtraction, multiplication, and division are much the same as addition. We’re also going to completely ignore overflow, underflow, etc.
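
To make that concrete, here is what the - case might look like, assuming an Emit_sub_reg_indirect emitter analogous to the add one (the actual name in the repo may differ):

    if (AST_symbol_matches(callable, "-")) {
      // Compile the right operand and park it on the stack...
      _(Compile_expr(buf, operand2(args), stack_index));
      Emit_store_reg_indirect(buf, /*dst=*/Ind(kRbp, stack_index),
                              /*src=*/kRax);
      // ...then compile the left operand into rax and subtract the stacked value.
      _(Compile_expr(buf, operand1(args), stack_index - kWordSize));
      Emit_sub_reg_indirect(buf, /*dst=*/kRax, /*src=*/Ind(kRbp, stack_index));
      return 0;
    }

Since fixnums keep their tag in the low bits, the tags cancel under subtraction just as they combine correctly under addition.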

Equality is different in that it does some comparisons after the fact (see Primitive unary functions). To check if two values are equal, we compare their pointers:

    if (AST_symbol_matches(callable, "=")) {
      _(Compile_expr(buf, operand2(args), stack_index));
      Emit_store_reg_indirect(buf, /*dst=*/Ind(kRbp, stack_index),
                              /*src=*/kRax);
      _(Compile_expr(buf, operand1(args), stack_index - kWordSize));
      Emit_cmp_reg_indirect(buf, kRax, Ind(kRbp, stack_index));
      Emit_mov_reg_imm32(buf, kRax, 0);
      Emit_setcc_imm8(buf, kEqual, kAl);
      Emit_shl_reg_imm8(buf, kRax, kBoolShift);
      Emit_or_reg_imm8(buf, kRax, kBoolTag);
      return 0;
    }

It uses a new comparison opcode that compares a register with some memory. This is why we can’t use the Compile_compare_imm32 helper function.

The less-than operator (<) is very similar to equality, but instead we use setcc with the kLess flag instead of the kEqual flag.
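
Concretely, that case is the = code above with one constant swapped (a sketch; the real code may factor this into a shared helper):

    if (AST_symbol_matches(callable, "<")) {
      _(Compile_expr(buf, operand2(args), stack_index));
      Emit_store_reg_indirect(buf, /*dst=*/Ind(kRbp, stack_index),
                              /*src=*/kRax);
      _(Compile_expr(buf, operand1(args), stack_index - kWordSize));
      Emit_cmp_reg_indirect(buf, kRax, Ind(kRbp, stack_index));
      Emit_mov_reg_imm32(buf, kRax, 0);
      // kLess instead of kEqual is the only difference from '='.
      Emit_setcc_imm8(buf, kLess, kAl);
      Emit_shl_reg_imm8(buf, kRax, kBoolShift);
      Emit_or_reg_imm8(buf, kRax, kBoolTag);
      return 0;
    }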

New opcodes

We used some new opcodes today, so let’s take a look at the implementations. First, here is the indirection implementation I mentioned earlier:

typedef struct Indirect {
  Register reg;
  int8_t disp;
} Indirect;

Indirect Ind(Register reg, int8_t disp) {
  return (Indirect){.reg = reg, .disp = disp};
}

I would have used the same name in the struct and the constructor but unfortunately that’s not allowed.

Here’s an implementation of an opcode that uses this Indirect type. This emits code for instructions of the form mov [reg+disp], src.

uint8_t disp8(int8_t disp) { return disp >= 0 ? disp : 0x100 + disp; }

void Emit_store_reg_indirect(Buffer *buf, Indirect dst, Register src) {
  Buffer_write8(buf, kRexPrefix);
  Buffer_write8(buf, 0x89);
  Buffer_write8(buf, 0x40 + src * 8 + dst.reg);
  Buffer_write8(buf, disp8(dst.disp));
}

The disp8 function is a helper that encodes negative displacements as their unsigned two’s-complement byte value (for example, -8 becomes 0xf8).

The opcodes for add, sub, and cmp are similar enough to this one, except src and dst are swapped. mul is a little funky because it doesn’t take two parameters. It assumes that one of the operands is always in rax.

Testing

As usual, we’ll close with some snippets of tests.

Here’s a test for +. I’m trying to see if inlining the text assembly with the hex makes it more readable. Thanks Kartik for the suggestion.

TEST compile_binary_plus(Buffer *buf) {
  ASTNode *node = new_binary_call("+", AST_new_integer(5), AST_new_integer(8));
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  byte expected[] = {
      // 0:  48 c7 c0 20 00 00 00    mov    rax,0x20
      0x48, 0xc7, 0xc0, 0x20, 0x00, 0x00, 0x00,
      // 7:  48 89 45 f8             mov    QWORD PTR [rbp-0x8],rax
      0x48, 0x89, 0x45, 0xf8,
      // b:  48 c7 c0 14 00 00 00    mov    rax,0x14
      0x48, 0xc7, 0xc0, 0x14, 0x00, 0x00, 0x00,
      // 12: 48 03 45 f8             add    rax,QWORD PTR [rbp-0x8]
      0x48, 0x03, 0x45, 0xf8};
  EXPECT_FUNCTION_CONTAINS_CODE(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ(result, Object_encode_integer(13));
  AST_heap_free(node);
  PASS();
}

Here’s a test for <.

TEST compile_binary_lt_with_left_greater_than_right_returns_false(Buffer *buf)
{
  ASTNode *node = new_binary_call("<", AST_new_integer(6), AST_new_integer(5));
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ_FMT(Object_false(), result, "0x%lx");
  AST_heap_free(node);
  PASS();
}

There are more tests in the implementation, as usual. Take a look if you like.

This has been a more complicated post than the previous ones, I think. The stack allocation may not make sense immediately. It might take some time to sink in. Try writing some of the code yourself and see if that helps.

Next time we’ll add a parser so we can input expressions more easily, and after that we’ll tackle binding variables with let. See you then!



  1. You may also see an enter instruction paired with a leave instruction. These are equivalent. Read more here

September 15, 2020

Kevin Burke (kb)

How to Get a Human Operator on the California EDD Paid Family Leave line September 15, 2020 08:23 PM

The California EDD Paid Family Leave phone tree is like a choose your own adventure book, where almost every option leaves you with no option to contact a human. This can be frustrating. But you can reach a human if you know the right buttons to press!

Here is how to reach a human:

  • Call the EDD Paid Family Leave number at 877-238-4373.

  • Press '1' for "benefit information."

  • Follow the prompts to enter your SSN, zip code, date of birth, and weekly benefit amount.

  • The computer will read you an automated list of information about your claim.

The computer will then read a list of prompts. Wait!! After the computer asks if you want to go back to the main menu it will say "press 0 to speak to a human." Press 0 and then wait and you should get a human!

Unrelenting Technology (myfreeweb)

Burstable Graviton2 inst... September 15, 2020 08:23 PM

Burstable Graviton2 instances are now a thing. Cool! Changed the instance type for this website from a1.medium to t4g.micro so that Jeff Bezos gets less of my money :P (Basically no money until the end of this year, even — there’s a free trial for t4g.micro for all AWS accounts!)

Eric Faehnrich (faehnrich)

T Puzzle September 15, 2020 01:57 AM

The T puzzle has four T pieces, and you have to fit them all so they lie flat without overlapping in a frame. Each side has a frame: one is a little larger, so it is easier; the other side is smaller, so it is harder.

Here is the file I made to laser cut it.

Spoiler: it's set up as the solution to the harder puzzle. That's because it has to be more compact to fit in the harder puzzle.

Also, since the Ts are more compact, they are placed next to each other. But the way I made the file, the edges where the Ts touch are just stacked, so they're doubled up. This means that in a bunch of places on the Ts, the laser goes over the same edge again.

It prints just fine. In fact, it shaves off a little bit more from the edge of the T so it makes a bit more tolerance and it's easier to put the puzzle together, but as far as I could tell it doesn't make it so you can cheat the puzzle. I also found that if you don't run over the edges these extra few times, the Ts are so tight that they don't fit in the puzzle.

September 14, 2020

Ponylang (SeanTAllen)

Last Week in Pony - September 13, 2020 September 14, 2020 12:20 AM

A Pony talk given by Sophia Drossopoulou is now available on InfoQ.

September 13, 2020

Derek Jones (derek-jones)

Learning useful stuff from the Reliability chapter of my book September 13, 2020 09:38 PM

What useful, practical things might professional software developers learn from my evidence-based software engineering book?

Once the book is officially released I need to have good answers to this question (saying: “Well, I decided to collect all the publicly available software engineering data and say something about it”, is not going to motivate people to read the book).

This week I checked the reliability chapter; what useful things did I learn (combined with everything I learned during all the other weeks spent working on this chapter)?

A casual reader skimming the chapter would conclude that little was known about software reliability, and they would be right (I already knew this, but I learned that we know even less than I thought was known), and many researchers continue to dig in unproductive holes.

A reader with some familiarity with reliability research would be surprised to see that some ‘major’ topics are not discussed.

The train wreck that is machine learning has been avoided (not forgetting that the data used is mostly worthless), mutation testing gets mentioned because of some interesting data (the underlying problem is that mutation testing assumes that coding mistakes are local to one line, but in practice coding mistakes often involve multiple lines), and the theory discussions don’t mention non-homogeneous Poisson process as the basis for software fault models (because this process is not capable of solving the questions asked).

What did I learn? My highlights include:

  • Anne Chao’s work on population estimation. The takeaway from this work is that if people want to estimate the number of remaining fault experiences, based on previous experienced faults, then every occurrence (i.e., not just the first) of a fault needs to be counted,
  • Phyllis Nagel and Janet Dunham’s top read work on software testing,
  • the variability in the numeric percentage that people assign to probability terms (e.g., almost all, likely, unlikely) is much wider than I would have thought,
  • the impact of the distribution of input values on fault experiences may be detectable,
  • really a lowlight, but there is a lot less publicly available data than I had expected (for the other chapters there was more data than I had expected).

The last decade has seen fuzzing grow to dominate the headlines around software reliability and testing, and provide data for people who write evidence-based books. I don’t have much of a feel for how widely used it is in industry, but it is a very useful tool for reliability researchers.

Readers might have a completely different learning experience from reading the reliability chapter. What useful things did you learn from the reliability chapter?

Patrick Louis (venam)

Did You Know Fonts Could Do All This? September 13, 2020 09:00 PM

Confusing Mexican Calendar, at least for those not in the know

Freetype, included in the font stack on Unix, is quite complex. There are so many layers involved, from finding the font to actually rendering it and everything in between, that it’s easy to get lost.
Like most of the world, I use a rather low screen resolution (1366x768 at 96 dpi) and a rather old-ish laptop, unlike some font designers who live in a filter bubble where everyone has the latest MacBook. Thus, good and legible font rendering is important.
Let’s play with the lesser-known toggles available to us when it comes to font rendering, see what they do, and have fun exploring the possibilities.

A General Picture

Generally, to make a font look better on screens, which are arrays of pixels, we use a combination of these three:

  • Antialiasing: Applying a light shade around the glyph. It is useful at small scale, when you don’t have enough pixels, but it makes most glyphs look bolder.

Font anti-alias example

  • Subpixel rendering: A technique similar to antialiasing but using subpixels, the color components inside the pixels. By applying a small amount of color on the sides you can reach more granular precision. However, if applied clumsily, or if you simply move the window containing the text, these colored subpixels become apparent, producing what we call fringes.

Font sub-pixel rendering example

  • Hinting: Pixels are blocks but text is made of curves, that means these curves will never match exactly with screen pixels. Hinting is about repositioning or selecting the closest pixels while trying as much as possible to keep the shape of the glyph intact. There are multiple levels of hinting, hinting information provided by the font itself (bytecode interpreter hinting), and hinting provided by the rendering library (auto-hinting).

Font hinting example

NB: “It’s just text”… This article is yet another that shows how fonts aren’t as easy as they look. For more info about the font stack, please visit my previous article on the topic, and if you want an idea of what it means to draw them on the screen take a look at this article.

What is applied, when, how to control all of this, can we see what they do, and should we even care?

Freetype and fontconfig default rendering these days is pretty good, so there shouldn’t be anything to worry about; until there’s something to worry about, like a font not looking the way you want.
Our first stop will be something that intrigued me because I haven’t heard many talk about it: the Freetype driver’s properties.
The Freetype driver is used whenever hinting is needed, so this is the part it actually changes — how hinting is applied.

Getting The Right Tools For The Task

Let’s start with arming ourselves with ways to easily test all this.
Freetype2 demos utilities are a must; you can clone them here or fetch them from your package repository, for example Debian and Arch Linux.
These will give you a bunch of useful tools such as ftdiff, ftview, ftstring, ftgrid, fttimer, ftbench, and others. The most important ones for us are ftdiff and ftgrid.

Example usage:

ftdiff -r 96 -s 10 ~/.local/share/fonts/times.ttf
ftgrid -r 96 -f 20 10 ~/.local/share/fonts/times.ttf
ftstring -r 96 -m 'Hello World!' 10 ~/.local/share/fonts/times.ttf

Additionally, you can install pango-view from pango-tools to later test if fontconfig applies your configurations properly. It can be used by preparing a file written in pango markup and displaying it using pango-view --markup file.pangpang.
You can raise the fontconfig debug level to see which font is actually loaded by setting the FC_DEBUG environment variable to something like 4096 (FC_DEBUG=4096).
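
For example, prefix any command that goes through fontconfig with it:

FC_DEBUG=4096 fc-match 'Times New Roman'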

More values can be found here, we’ll use them later to see if our fontconfig settings are applied properly:

Name         Value    Meaning
---------------------------------------------------------
MATCH            1    Brief information about font matching
MATCHV           2    Extensive font matching information
EDIT             4    Monitor match/test/edit execution
FONTSET          8    Track loading of font information at startup
CACHE           16    Watch cache files being written
CACHEV          32    Extensive cache file writing information
PARSE           64    (no longer in use)
SCAN           128    Watch font files being scanned to build caches
SCANV          256    Verbose font file scanning information
MEMORY         512    Monitor fontconfig memory usage
CONFIG        1024    Monitor which config files are loaded
LANGSET       2048    Dump char sets used to construct lang values
MATCH2        4096    Display font-matching transformation in patterns

Yet another way is to test directly in your browser URL bar:

data:text/html,<meta charset="utf8"><p style="font-family: Times New Roman;">Hello World</p>

The Freetype2 Drivers Properties

So let’s get back to our testing of Freetype2 drivers.
On this documentation page, ft (freetype) properties are listed and are said to affect the behavior of the drivers, each touching a different one. They are set by modifying the FREETYPE_PROPERTIES environment variable, normally loaded from /etc/profile.d/freetype2.sh.
However, most of the ones listed are targeted at the CFF, Type 1, and CID fonts driver and not at TrueType fonts, so they do nothing if you don’t have these font types. The only toggle available for TrueType is the interpreter-version which controls the bytecode interpreter, the rasterizer, and thus how the outline gets hinted.

The options available to us are the following:

  • 35 — For classic mode GDI (Win 98/2000)
  • 38 — GDI+ old (Vista, Win 7), Infinality, considered slow
  • 40 — For minimal mode (stripped down Infinality, this is the default) (After Win 7)

Kind of weird that we jump from 35 to 38, where did 36 and the rest go? The answer is that it’s a choice from the Freetype devs to only include those and not the ones in between.

And the differences look as follows, notice the native hinter in the left column:

  • v35
FREETYPE_PROPERTIES="truetype:interpreter-version=35" ftdiff -r 96 -s 10 ~/.local/share/fonts/times.ttf

ftdiff interpreter v35

FREETYPE_PROPERTIES="truetype:interpreter-version=35" ftgrid -r 96 -f 36 10 ~/.local/share/fonts/times.ttf

ftgrid interpreter v35

  • v38
FREETYPE_PROPERTIES="truetype:interpreter-version=38" ftdiff -r 96 -s 10 ~/.local/share/fonts/times.ttf

ftdiff interpreter v38

FREETYPE_PROPERTIES="truetype:interpreter-version=38" ftgrid -r 96 -f 36 10 ~/.local/share/fonts/times.ttf

ftgrid interpreter v38

  • v40
FREETYPE_PROPERTIES="truetype:interpreter-version=40" ftdiff -r 96 -s 10 ~/.local/share/fonts/times.ttf

ftdiff interpreter v40

FREETYPE_PROPERTIES="truetype:interpreter-version=40" ftgrid -r 96 -f 36 10 ~/.local/share/fonts/times.ttf

ftgrid interpreter v40

We can also test using pango-view (remember again that this should be a font that has native hinting enabled but not the auto-hinter):

<span font_family="Times New Roman" font="10" foreground="black" alpha="83%">
Lorem ipsum dolor sit amet, c
onsectetur adipiscing elit, s
ed do eiusmod tempor incididu
nt ut labore et dolore magna 
aliqua. Ut enim ad minim venia
m, quis nostrud exercitation u
llamco laboris nisi ut aliquip
ex ea commodo consequat. Duis 
aute irure dolor in reprehende
rit in voluptate velit esse ci
llum dolore eu fugiat nulla pa
riatur. Excepteur sint occaeca
t cupidatat non proident, sunt
in culpa qui officia deserunt 
mollit anim id est laborum.
</span>

You can also change the font via the --font= argument of pango-view.

FREETYPE_PROPERTIES="truetype:interpreter-version=35" pango-view --markup text.bangarang
  • v35

pango interpreter v35

  • v38

pango interpreter v38

  • v40

pango interpreter v40

So definitely, older interpreter versions were rougher with hinting, much bolder, and could deform the glyphs. The newer ones are more minimal with it. We also notice that the auto-hinter isn’t that bad and that avoiding hinting can help. I took the specific case of the Windows font ‘Times New Roman’ because it has the reputation of rendering badly with Freetype, mostly because of the job the interpreter does. Applying very light or no hinting at all helps tremendously, even at very small point size as you can see in the next comparison. The hinting does indeed help legibility at this scale but the font shape and personality are completely destroyed.

From left to right: v35, v38, v40.

pang interpreter small point comparison

How Fontconfig Works

We’re not done with hinting yet, there can be many levels of hinting that can be applied, but let’s first take a detour and learn a bit about fontconfig and how to use it.

Fontconfig is the layer in the font stack responsible for loading the font along with the configurations that tell the next layer how to find the font file and what changes to apply when rendering it. It is usually composed of a library, a preset of configuration files, and a bunch of helpful tools all starting with the prefix fc- such as: fc-cache, fc-query, fc-match, and fc-conflist, to name a few.

The configuration files are usually found in /etc/fonts/ and split into the presets available /etc/fonts/conf.avail, and the chosen presets in /etc/fonts/conf.d, which are symbolic links to the former.
The precedence of the rules is alphanumerical, a first-come first-served principle, thus 01-custom-rule.conf will be loaded before 99-not-important-rule.conf. Local user configurations, in the user’s $XDG_CONFIG_HOME/fontconfig directory, are loaded from one of these configurations that contains an include statement. On my machine it is the 50-user.conf, so its precedence is lower than anything loaded before it. This isn’t practical when testing rules, so rename this file to something like 01-user.conf. Now anything you put in $XDG_CONFIG_HOME/fontconfig/conf.d or $XDG_CONFIG_HOME/fontconfig/fonts.conf should have priority.
You can make sure the order and configurations are loaded properly by using the fc-conflist command. It lists in order of precedence the configurations found, the ones starting with a + are loaded, the ones with - are not.

These files are composed of mainly 4 components:

  • Match rules: If something matches, then edit the properties mentioned. There are a ton of matching and editing rules, even including stuff like the name of the program that is currently trying to load the fonts, plus custom ones. You can also match at different times: when looking for a pattern/font, after finding the font, when scanning the font.
  • Aliases creation: An alias is a font name shorthand, it’s useful when querying generic family names such as “monospace” (see the minimal example after this list).
  • Inclusion of other configurations: There can be so many configuration files that it’s good practice to split them.
  • Where to look for settings and fonts, and if some fonts should be skipped entirely (like if they aren’t scalable — bitmap): You may think that the location of fonts is a constant value, but it’s not. For example, on my machine it’s set in /etc/fonts/fonts.conf as:
<!-- Font directory list -->
<dir>/usr/share/fonts</dir>
<dir>/usr/local/share/fonts</dir>
<dir prefix="xdg">fonts</dir>
<!-- the following element will be removed in the future -->
<dir>~/.fonts</dir>
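
As promised in the aliases bullet above, here is what a minimal alias rule can look like (the preferred family is just an example; use whatever you have installed):

<alias>
	<family>monospace</family>
	<prefer>
		<family>DejaVu Sans Mono</family>
	</prefer>
</alias>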

Editing XML files is cumbersome; unfortunately, today there aren’t many GUIs or simpler tools to set these. I’ve found a single one to date, named fontweak, but it isn’t complete.
It’s a shame because it’s rare to find people that have a clue about how to actually set font configuration nicely.

If you want more info, you can consult man 5 fonts-conf. It’s heavy content and can be confusing content, but still great content.

NB: Fontconfig is not enough to configure every graphical program, some programs load font settings in a simpler way through Xresources, the RESOURCE_MANAGER of X.

Testing Different Hinting

Let’s close this parenthesis and get back to hinting.
Fontconfig has 4 settings related to it, of which one is a matching criterion and the other three are edit rules. They are the following.

  • fonthashint: Matching test to check if the font has built-in hints, namely bytecode interpreter hinting.
  • hinting: If set to true, it tells the next phase, the rasterizer, that hinting in general will be applied.
  • autohint: Use the autohinter instead of the normal hinter. This will skip entirely the bytecode interpreter.
  • hintstyle: The harshness of the hinting that will be applied. It could either be hintnone, hintslight, hintmedium, or hintfull. It needs to be mentioned that these will use a mix of the autohinter and bytecode interpreter if the font has hints. For example, hintslight will snap on the vertical grid only but hintmedium and hintfull will snap harder on the horizontal grid too.

Practically, what does it mean? Let’s show what a font looks like with a combination of these hinting configurations.
Remember that if you’re having issues applying these configurations in your user fontconfig file, you can set the FC_DEBUG environment variable we mentioned before. Always make sure everything loads properly by checking fc-conflist and the currently applied match rules via fc-match --verbose YourFontSearchHere.

Let’s test hinting enabled, autohint enabled, and full on grid snapping.

<edit mode="assign" name="hinting">
	<bool>true</bool>
</edit>
<edit name="autohint" mode="assign">
	<bool>true</bool>
</edit>
<edit mode="assign" name="hintstyle">
	<const>hintfull</const>
</edit>

Test Hinting autohint+hintfull

What about disabling autohint and full on grid snapping.

<edit mode="assign" name="hinting">
	<bool>true</bool>
</edit>
<edit name="autohint" mode="assign">
	<bool>false</bool>
</edit>
<edit mode="assign" name="hintstyle">
	<const>hintfull</const>
</edit>

Test Hinting no-autohint+hintfull

Not so pretty, maybe just snapping vertically is better, let’s try no-autohinter and a slight hinting.

<edit mode="assign" name="hinting">
	<bool>true</bool>
</edit>
<edit name="autohint" mode="assign">
	<bool>false</bool>
</edit>
<edit mode="assign" name="hintstyle">
	<const>hintslight</const>
</edit>

Test Hinting no-autohint+hintslight

Better, but it still looks too bold. Let’s try the autohinter again but with a softer hinting now.

<edit mode="assign" name="hinting">
	<bool>true</bool>
</edit>
<edit name="autohint" mode="assign">
	<bool>true</bool>
</edit>
<edit mode="assign" name="hintstyle">
	<const>hintslight</const>
</edit>

Test Hinting autohint+hintslight

It looks very similar to the full hinting, let’s test without hinting at all.

<edit mode="assign" name="hinting">
	<bool>false</bool>
</edit>

Test Hinting disabled

It seems like the auto-hinter is doing a good job at aligning the letters vertically in a subtle way. When zoomed in, you can clearly see how the letters seem a bit more compressed with the auto-hinter turned on.

Test Hinting vs No-Hinting

Overall, for the specific font I tested, “Times New Roman”, no hinting at all or slight auto-hinting are the best on my display.

Subpixel Rendering

Let’s move to subpixel rendering.
Fontconfig offers some presets for how harshly the subpixel rendering is done. lcddefault is color-balanced and normalized, lcdlegacy is neither normalized nor color-balanced, it uses any sub-pixels it can find, lcdlight is similar to lcddefault but applies a lighter hint to the surrounding pixels, and lcdnone disables it.
Additionally, there are also ways to enable Microsoft’s Cleartype subpixel rendering by recompiling Freetype (disabled by default because of patents), and ways to tweak the subpixel rendering matrix by manually editing the Freetype code. But why go through the hassle.

Before testing these, you should find out what the subpixel geometry of your screen is by consulting this page, and set it as the rgba property. Normally, preset files such as 10-sub-pixel-rgb.conf already come installed, so you simply have to symlink them to the /etc/fonts/conf.d directory.
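
For example, on a typical install (run as root):

ln -s /etc/fonts/conf.avail/10-sub-pixel-rgb.conf /etc/fonts/conf.d/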

NB: These tests don’t seem to show differences with pango-view but starting any other graphical program should be enough.
NB: Fringes are more apparent with white text on black background.

Here’s the result of the comparison: you can clearly see the fringes when the wrong subpixel geometry is chosen (my screen has rgb geometry). Also, no subpixel rendering at all seems like a very good choice for bitmap fonts, keep this in mind.

Test Subpixel geometry comparison

I’ve tried to notice the differences between lcddefault, lcdlight, and lcdlegacy but it’s so minimal that it isn’t worth mentioning. So lcddefault should be fine in most cases. Someone made a comparison on this website if you want to check.

NB: It is rare, but if fonts look deformed on your screen it might be because your DPI isn’t detected properly by fontconfig. Find it on X11 by doing xdpyinfo | grep -B 2 resolution and set it with the following match:

<match target="pattern">
	<edit name="dpi" mode="assign">
		<double>96.0</double>
	</edit>
</match>

Antialiasing

Antialiasing is the setting you should almost never turn off, unless your font is bitmap/non-scalable.
This picture clearly shows the advantage of antialiasing on scalable fonts. On the right is the non-antialiased version.

Test Anti-Alias comparison

Weird things happen when the 10-scale-bitmap-fonts.conf preset is present. The following image shows a bitmap font without hinting and antialiasing on the left, and with them on the right. Removing this file should fix the font and show it as crisp as possible.

Test Anti-Alias bitmap

NB: If you want to convert bitmap/pcf/bdf fonts to be supported by Pango see this thread on the nixers.net forums.

Applying What We’ve Learned

Some fonts are known to render badly with Freetype, such as Windows fonts. So let’s test what we’ve learned to make them look better.

You can get a copy of the Windows font from a Windows machine, they are present in the C:\Windows\Fonts\* directory (PS: I do not take responsibility if you do this, for legal reasons).
You should now have the fonts, put them in either $XDG_DATA_HOME/fonts (usually $HOME/.local/share/fonts) or $XDG_DATA_DIRS/fonts (usually /usr/share/fonts).
Be sure to have followed the previous advice of renaming 50-user.conf to 01-user.conf, and confirm that your local font configuration is the first by executing fc-conflist.

Now let’s take the name of all the Windows font we got:

fc-query --format='%{family}\n' * | sort | uniq
  • Arial
  • Arial Black
  • Calibri
  • Calibri Light
  • Cambria
  • Cambria Math
  • Comic Sans MS
  • Consolas
  • Georgia
  • Impact
  • Javanese Text
  • Segoe Print
  • Segoe Script
  • Segoe UI
  • Segoe UI Emoji
  • Segoe UI Historic
  • Segoe UI Black
  • Segoe UI Light
  • Segoe UI Semibold
  • Segoe UI Semilight
  • Segoe UI Symbol
  • Tahoma
  • Times New Roman
  • Trebuchet MS
  • Verdana
  • Webdings
  • Wingdings

And let’s add some rules to our fontconfig file as follows:

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>

	<description>Make Windows Font Look Good</description>

	<match target="font">
		<edit name="iswindowsfont" mode="assign">
			<or>
				<eq>
					<name>family</name>
					<string>Arial</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Arial Black</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Calibri</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Calibri Light</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Cambria</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Cambria Math</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Comic Sans MS</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Consolas</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Georgia</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Impact</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Javanese Text</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Segoe Print</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Segoe Script</string></eq>
				<eq>
					<name>family</name>
					<string>Segoe UI</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Segoe UI Emoji</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Segoe UI Historic</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Segoe UI Black</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Segoe UI Light</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Segoe UI Semibold</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Segoe UI Semilight</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Segoe UI Symbol</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Tahoma</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Times New Roman</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Trebuchet MS</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Verdana</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Webdings</string>
				</eq>
				<eq>
					<name>family</name>
					<string>Wingdings</string>
				</eq>
			</or>
		</edit>
	</match>

	<match target="font">
		<test name="iswindowsfont" compare="eq">
			<bool>true</bool>
		</test>
		<edit mode="assign" name="hinting">
			<bool>false</bool>
		</edit>
		<edit name="autohint" mode="assign">
			<bool>false</bool>
		</edit>
		<edit mode="assign" name="hintstyle">
			<const>hintnone</const>
		</edit>
		<edit mode="assign" name="antialias">
			<bool>true</bool>
		</edit>
		<edit name="embeddedbitmap" mode="assign">
			<bool>false</bool>
		</edit>
	</match>

</fontconfig>

File also hosted here

This may look like a big script, and it might be your first time seeing someone write such a script for fontconfig, but don’t worry. It’s pretty simple overall: it checks the family name of the font and sets a variable, iswindowsfont, to true if it matches. Then, if this is set, it configures the values we want for this group of fonts. You can play with the values if you aren’t satisfied; the grouping should help.
You shouldn’t even have to run fc-cache, this should take effect as soon as you restart an application that uses fontconfig.

fc-match --verbose 'Cambria' | grep iswindowsfont
# iswindowsfont: True(w)

Conclusion

This is it for this post.
I hope you’ve learned a thing or two about font configurations with Freetype and Fontconfig and were surprised by at least one of them.

If you’ve enjoyed my article, have comments, suggestions, or simply want to say thanks, please leave a comment.







Attributions

  • Internet Archive Book Images / No restrictions

Gonçalo Valério (dethos)

The app I’ve used for the longest period of time September 13, 2020 03:18 PM

What is the piece of software (app) you have used continuously for the longest period of time?

This is an interesting question. More than 2 decades have passed since I got my first computer. Throughout all this time my usage of computers evolved dramatically, and most of the software I installed at the time no longer exists or is so outdated that there is no point in using it.

Even the “type” of software changed; before, I didn’t rely on so many web apps and SaaS (Software as a Service) products that dominate the market nowadays.

The devices we use to run the software also changed, now it’s common for people to spend more time on certain mobile apps than their desktop counterparts.

In the last 2 decades, not just the user needs changed but also the communication protocols in the internet, the multimedia codecs and the main “algorithms” for certain tasks.

It is true that many things changed; however, others haven’t. There are apps that were relevant at the time, that are still in use, and that I expect will still be around for many years.

I spent some time thinking about my answer to the question, given I have a few strong contenders.

One of them is Firefox. However my usage of the browser was split by periods when I tried other alternatives. I installed it when it was initially launched and I still use it nowadays, but the continuous usage time doesn’t take it to the first place.

I used Windows for 12/13 straight years before switching to Linux, but it is still not enough (I also don’t think operating systems should be taken into account for this question, since for most people the answer would be Windows).

VLC is another contender, but like it happened to Firefox, I started using it early and then kept switching back and forth with other media players throughout the years. The same applies to the “office” suite.

The final answer seems to be Thunderbird. I’ve been using it daily since 2004, which means 16 years and counting. At the time I was fighting the ridiculously small storage limit I had for my “webmail” inbox, so I started using it to download the messages to my computer in order to save space. I still use it today for totally different reasons.

And you, what is the piece of software or app you have continuously used for the longest period of time?

September 11, 2020

Bit Cannon (wezm)

Finding an Alternative to iOS September 11, 2020 11:20 PM

I've used iPhones since 2008, adding thousands of dollars to Apple's giant pile of cash. Much like my move from macOS to Linux more than 3 years ago, Apple's recent behaviour has prompted me to consider iPhone/iOS alternatives. Join me on this journey into the world of Android and the lack of real choice that smartphones present in 2020.

Background

For about 12 years I've owned iPhones, most bought outright, totalling thousands of dollars. I've held on to my most recent iPhone, an iPhone X, longer than all the others. Contrary to claims of planned obsolescence it still works well. I like technology though, and was planning to replace it this year and pass it on to my father.

Apple have recently ramped up their hostility towards the developers that make iOS the desirable platform it is. App Store horror stories are nothing new, but lately Apple seems to have really ramped up their desire to extract money from every developer's business, despite being one of the richest companies in the world. They seemingly do so without regard for whether the end-user experience is actually better for it.

Recent events, perhaps starting with the Hey saga and continuing with the ongoing battle with Epic have not reflected well. Apple appears to see developers as owing them for the privilege of being in their store and using their APIs. This is despite app development requiring a yearly membership fee of AU$149, and purchase of Mac hardware for development.

We understand that Basecamp has developed a number of apps and many subsequent versions for the App Store for many years, and that the App Store has distributed millions of these apps to iOS users. These apps do not offer in-app purchase — and, consequently, have not contributed any revenue to the App Store over the last eight years.

Apple App Review Board

Epic decided that it would like to reap the benefits of the App Store without paying anything for them.

— Apple legal submission, via Marco Arment

Apple: Epic only looking for a free ride

Epic, according to Apple, has given Apple $257,000,000 in commission fees in two years over in-app purchases that Apple has no hand, act, part in, doesn't host on their servers, just for the privilege of existing on their OS. ‘Free ride’.

Steve Troughton-Smith

To take just one example, Epic has for years used Apple's groundbreaking graphics technology, Metal. [..] Apple doesn't charge anything beyond its standard commission for the use of Metal or any of the other tools that Epic has used to develop great games on iOS.

— Philip Schiller, via Steve Troughton-Smith

The only alternative to Metal is OpenGL and Apple have deprecated that!

Anyway, whether you agree with Apple or not this whole thing has me (a developer by trade, and a past contributor to the App Store) feeling offside. Additionally, since I now use Linux full-time there are other sources of friction:

  • iPhones work best when paired with a Mac (or even a PC running Windows).
  • Apple only support building apps on Macs, so now that I'm on Linux, cobbling together an app for my own phone is no longer possible.

My only real recourse as a consumer is voting with my wallet and perhaps sharing my reasoning on this blog, so here we are. If enough people do this maybe they will take notice, maybe they won't, but I feel I at least need to try. Just like last time, when I sought a replacement for Mac OS X and switched to Linux, I have been evaluating alternatives to iOS.

It's worth noting at this point that I really dislike Google. I distanced myself from all of their services about 8 years ago. The only Google service I use regularly is YouTube. I use Fastmail for email, DuckDuckGo for search, Apple + Flickr for photos, Mattermost, iMessage, Matrix, and Telegram for chat.

Evaluating Alternatives

Initial research turned up the following candidates. Almost all were immediately written-off due to lacking apps or being too immature:

  • Android as shipped on a mainstream phone
    • Full of apps and services dependent on Google.
  • LineageOS
  • LineageOS for microG
    • LineageOS with microG compatibility library to allow running apps that rely on Google APIs, without using Google services.
  • postmarketOS
    • Good in theory: An Alpine Linux based OS for your phone. However, it notes, "Beta version. Calls don't work on most phones yet", on the home page.
  • Librem 5 + PureOS
    • By all accounts the expensive hardware is still not great quality and the software is still being built.
  • LuneOS (WebOS)
    • Very small ecosystem.
  • Sailfish OS
    • Bills itself as, "the mobile OS solution for corporations and governments", right on the front page. I am neither of these things.
  • Give up on a smartphone
    • Get a basic phone for calls and texts and do everything else on a real computer, possibly an ultra compact like the GPD Pocket 2.
    • A friend who has never owned a smartphone talked me out of this. It's possible but very inconvenient, especially since some things, like ride sharing, are only possible with a smartphone.

Turns out duopolies suck: you can choose some modicum of respect for privacy with developer hostile Apple, or get a bit more freedom with surveillance capitalist Google. The candidates that seem most viable for me are LineageOS, and LineageOS for microG. To test out this theory I purchased the cheapest phone supported by LineageOS that I could get new: a Redmi 7 by Xiaomi for AU$175.

For the price I was honestly expecting this phone to be hot garbage. It is in fact much better than I expected. However, this was just a platform for testing the software ecosystem, so I won't be reviewing the hardware or letting it colour my impressions of Android. If this experiment goes well my plan would be to buy a higher-quality iPhone replacement.

Stock MIUI ROM

I spent a small amount of time with the stock ROM1 (MIUI) that the Redmi comes with to get a bit of a baseline. It worked well enough and was fairly aesthetically pleasing, but the ads and tracking were truly horrifying. Just take a look at this post describing the steps required to disable data collection and ads — and this is just what you can turn off. Who knows what else it's doing behind the scenes.

LineageOS + Open GApps

I quickly nuked MIUI and installed LineageOS + Open GApps (nano). Open GApps gives you access to some of Google's closed-source apps and libraries, crucially the Google Play Store. The "Open" part of the name refers to the open-source scripts the project publishes for the generation of up-to-date Google Apps packages.

This ROM provides a decent balance between open-source Android and access to the breadth of the Google Play Store. In hindsight, the nano version of Open GApps includes more Google than I actually want. I think the ideal for me is the pico package, which is just what's needed to run the Google Play Store.

With this install I attempted to replace the apps that I use most on iOS. For the following apps I just used the Android version:

  • Authy
  • Catch.com.au
  • Deliveroo
  • Discord
  • Element (Matrix)
  • Fastmail
  • Firefox
  • Firefox Focus
  • Instagram
  • Mattermost
  • Reddit
  • Slack
  • Telegram
  • Up
  • YouTube

For these apps I found a replacement that I was mostly happy with:

For these apps I wasn't able to/have not yet found a replacement that I was happy with (please don't send me recommendations):

In general I don't find Android apps to be as nice, or as polished as iOS apps. John and Ben recently discussed this on the 9 September episode of Dithering, which matched my experience. I also really dislike the visual style and slow animations of the Material design language. Especially the circular animation on tap. The apps I like the most are the ones that shun the Material style for their own.

Something I learnt from my move to Linux though, was to embrace the platform's conventions as opposed to trying to reproduce the system you're moving from as much as possible. So I will put my dislike of Material aside.

Screenshot of the emoji keyboard on LineageOS

So Ugly

One thing I'm not sure I can put aside is the use of the super ugly Noto Color Emoji font for emoji on Android. On Linux my system emoji font is JoyPixels and I go to certain lengths to avoid seeing Noto Color Emoji. Almost any other widely available emoji font would be preferable to me. I did try side-loading a JoyPixels package when flashing the ROM but couldn't get it to stick. Apparently something changed in Android 10.

I could "root" the phone and swap out the font file but in the same way I've never jail-broken an iPhone this is not a path I want to go down right now. If worst comes to worst I could actually build LineageOS from source and swap out Noto Color Emoji — what a concept!

LineageOS for microG

microG is a library that implements various APIs provided by Google closed-source libraries in order to be able to run more apps — those that depend on Google's mapping APIs for example. The microG versions of the APIs don't rely on Google servers. Critically going down this path you lose access to Google's push notification servers. Some apps like Telegram work around this but for the most part you lose notifications.

LineageOS for microG gives a familiar LineageOS experience initially. Instead of the Google Play Store though, it uses F-Droid, a repository of strictly free and open-source software. As expected there are far fewer apps available on F-Droid. Most of the big names are missing.

I think if you were especially principled, were happy to use web apps for many things (like Twitter), and didn't use a smartphone all that much LineageOS for microG could work. After spending some time with it though, it's just too limited for me.

Picking a New Phone

The experiment so far showed that I could probably get by with LineageOS + Open GApps. I started looking into what real phone to get as opposed to the Redmi 7 test phone. I had these requirements:

  1. I want a phone around the size of my iPhone X (5.8" display).
    • I find larger phones like the iPhone 6 Plus I owned, and Redmi 7 uncomfortable in pocket, especially when sitting.
  2. If I go with Android I want it to run LineageOS or similar (as little Google as possible).
  3. Available in Australia.

That basically only leaves Google Pixel 3 and 4 phones. Pixel 4 seems to have been a bit of a dud. It was discontinued after 9 months and the unreleased successor is rumoured to revert a bunch of the changes it introduced: back to fingerprint sensor, removal of radar gesture sensor. Pixel 3 (from 2018) seemed like it could be viable… but then I looked at GeekBench benchmarks:

  • Pixel 3 — 468 single core, 1833 multi-core
  • Pixel 4 — 610 single core, 2210 multi-core
  • iPhone X — 916 single core, 2334 multi-core

At the time of writing no Android phone is faster in single core performance than my iPhone X from 2017. The OnePlus 8 is at the top with a score of 900. It seems they caught up on multi-core ~last year (by having more cores).

So if the Pixel 3 is my main option I'd be spending money to upgrade to a significantly slower phone made by Google to escape Apple's restrictive, developer hostile, albeit more privacy respecting ecosystem… this is not immediately compelling.

Closing Thoughts

I'm really torn. The upcoming Pixel 5 would likely be a good option if it were possible to strip out as much of the Google dependencies as possible. If past releases are anything to go by it seems that it's likely to be almost another year or so before LineageOS is available for the Pixel 5.

I don't like the idea of buying a Pixel 3 given that it's a step backwards performance wise. After 3 years with the iPhone X I kind of want the replacement to perform close to it. Sadly modern web pages and fake native apps (apps built with web tech) demand fast performance. For example, the Redmi 7 has a really hard time with long Medium articles.

Another option would be to just keep using the iPhone X. It still performs well, battery capacity is still 89% of new, it's still getting major iOS updates. And I'm still voting with my wallet by not giving Apple more money. I did however tell my Dad to hold off buying a new phone earlier in the year because he could have mine when I replace it. So I kind of need a new phone one way or another.

For now I'm going to wait for the Pixel 5 and new iPhones to be released later this year and continue to follow Apple's behaviour towards developers. It's not uncommon for them to actually listen to their customers eventually — often it takes longer than it feels it should though (*cough* butterfly keyboard). As usual subscribe to the feed, or follow me on Twitter or the Fediverse for future updates.

1

I'm using "ROM" (Read Only Memory) here knowing that it's incorrect, since that's the typical language for alternate OSes for Android phones.

Jan van den Berg (j11g)

Moby-Dick – Herman Melville September 11, 2020 06:20 PM

I suspect Moby-Dick — the quintessential Great American Novel — has the curious accolade of being one of the most famous books ever, while also being one of the least read books. Its reputation greatly exceeds its appeal. Nonetheless, I had always wanted to read this extraordinary 170-year-old book. And now that I did, I think I understand its reputation as well as I understand the incongruent appeal.

Moby-Dick stats

Moby-Dick clocks in around 650+ pages and 212,000 words. It’s not a small book but it’s also not the biggest book I ever read. But it was definitely one of the hardest, and one that demanded a dedicated and focused effort to finish.

Long story short: reading Moby-Dick is hard work and it’s not exactly the most riveting thing I ever read.

It doesn’t keep you on the edge of your seat. Surprisingly very little happens for such a big book. You can summarize the entire thing in one sentence (yes, I’ll get to the allegories later).

That is not to say that this is not a smart book. Herman Melville’s IQ probably bordered on genius and he pulled out all the stops with Moby-Dick. However, those two things don’t necessarily make for a good book. Why is it then that Moby-Dick is so revered? I can think of a few things.

Moby-Dick – Herman Melville (1851) – 656 pages. Don’t mind the sticker.

Words, just so.many.different.words

Melville’s dictionary must be the most abused book ever. Because if there was an Olympics for using the most different words, Herman Melville would win first, second and third place. This is actually a scientific fact: “About 44% of the distinct set of words in this novel occur only once”

Read that again: 44% of the distinct words in Moby-Dick are used only once.

If you don’t believe me just open this book on any page and you can tell this right away. Moby-Dick is not like any other book.

It is divided into 135 small chapters — and one very important epilogue — and each chapter deals with a dedicated subject. And it seems Melville took it as an exercise to fill each chapter with as many different words as he could. Not only that, he likes to use long, half-page rambling sentences. There is also an enormous variation in style per chapter: from dialogue to scientific descriptions to inner thoughts to poetic, philosophical, or almost theatrical treatises. And to top it all off, this is all done in English from 170 years ago. Just to give you an idea of what a chore it is to read.

And all of these things are reasons Moby-Dick stands out among other books. Another is because it’s about whaling.

Whaling

Whaling in the 19th century was an astoundingly difficult and fantastical venture. If I hadn’t known about it and you were to explain it to me, I wouldn’t believe you. People actually set out on wooden ships for three or four years and just randomly sailed around the world until they found some whales?! Whales that are actual leviathans and that can kill any man in an instant? And when they do spot these whales, they set out on even smaller wooden boats to try to harpoon these 100 foot creatures, BY HAND?! Surely this is all made up! This cannot be real! But it is.

Whaling is an absolutely insane endeavour. And this makes it a terrific backdrop for a story.

I would like to argue no man before or after has known more about whaling than Melville. He not only writes from his own experiences as a whaler, he had also probably read everything ever written (at that point) about whaling and whales. And he uses all this knowledge to bombard the reader with more facts than your brain can handle, about whaling, whales and whalers.

He also shares detailed glimpses of 19th century Nantucket life. Which makes this book a time-capsule of the American spirit. These are reasons this book is so revered in the English speaking world. So much so, that it is regarded as the definitive Great American Novel.

Even though the book suffered greatly from negative reviews and criticism about alleged blasphemy. And it wasn’t until a good 70 years later that Moby-Dick started to be regarded as the classic we now all know. (But this is a story by itself).

Without the bookcover. Gorgeous.

Allegories

On to the good parts. Moby-Dick is not really about the demonstration of Melville’s mastery of language or even about whaling. These two things make it unique, but what makes it good is what is under the surface (see what I did there?).

This book is absolutely brimming with allegories, allusions and metaphors. Some are small, some encapsulate the entire plot, and some are even expressed by the book’s structure.

The most clear-cut one is of course that the whale Moby-Dick represents fate itself. But there are many more, philosophical or contemplative in nature. You can talk and discuss and debate this endlessly.

Meta

There is one meta-allegory I particularly like. In Moby-Dick we read about a whaler, Ahab, who sets out to kill this mythical monster Moby-Dick, a sperm whale he lost his leg to previously. We as readers slowly get to experience how this whaler goes maniacally insane and takes his crew with him. Until they all go under.
In a sense this is about Melville himself and his experience and difficulty writing this book! And we, the readers, are the crew.

This is just one take. But there are many more direct allegories, about names, stories and references. Specifically, the boats and captains Ahab and Ishmael meet along the way are loaded with biblical references and meaning. I am sure I probably missed a whole bunch too. Melville uses these narrative devices to deal with many different themes. And it is exactly this that sets Moby-Dick apart from other books. There is a score of things that aren’t said, but implied.

My copy of the book ends with a couple of letters from Melville about his book and his struggles in getting it published. Right after the letters the book, oddly enough, shares a couple of very negative reviews from the time of publishing. I am not sure why they are in there. Maybe to demonstrate that people did not recognize the genius at once? Or how remarkable it is that this book still became a classic? I am not sure.

Conclusion

All in all Moby-Dick is a distinctive and unique reading experience detailing a story about a very specific time and endeavour. And I can now boast “I read Moby-Dick”, and I am glad I did but I will also say I didn’t really enjoy reading it all that much.

I think I understand what Melville set out to do and I admire his genius. I also think I understand the appeal of this book 170 years later. This book makes you work and that is not a problem, but there were times that I really had to force myself, and that does not happen to books that are favorites of mine.

Melville was a genius wordsmith and put many ideas in this book for people to contemplate for generations to come. But as is the case with music, I don’t care how many different notes a guitar player can hit in one minute; that is not music, that is a demonstration of mastery. In the end it is about what songs this mastery produces. And in this case, I think I wanted to have liked the song more.


September 09, 2020

Kevin Burke (kb)

Let employees sell their equity September 09, 2020 10:27 PM

Sometimes people choose to work for one company over another for reasons related to the work environment, for example what the company does, and whether the other employees create a place that's pleasant to work at. But a major factor is compensation. If Company A and Company B are largely comparable, but Company A offers $30,000 more in base pay per year than Company B, most people will choose Company A.

At tech companies, compensation usually breaks down into four components: company stock, benefits, cash salary, and bonus. When you get an offer from a company, these are the four areas that the recruiter will walk you through. The equity component is a key part of the compensation at startups. Small startups hope that the potential for a large payoff is worth sacrificing a few years of smaller base pay.

If you join a small startup and you get stock, you generally can't sell it until an "exit event" - an IPO or acquisition - even if your entire stock grant has vested. Generally, any stock sale before an exit event will require approval of the board, and the boards generally frown on stock sales, for reasons I will get into. So while you may own something that is worth a lot of money, you can't convert it into cash you can actually spend for a half decade or more.

By contrast, if you join a public company, your compensation includes equity that you can sell basically immediately after it vests, because it trades on a public exchange. There are hundreds of people who will compete to offer the best price for your shares every trading day between 9:30am and 4pm.

As an employee, how should you think about the equity component of your offer? One reason to take a big equity stake is to bet on yourself. If you have a great idea about how you can make the company 10%, 50%, or 200% more valuable, and you think you can execute it, you should take an equity stake! After you implement the changes, your equity will be massively more valuable. Broadly speaking this is what "activist investors" try to do; they have a theory about how to improve companies, they buy a stake and hope the value changes in line with the theory.

One problem with this is that you are much more likely to be in a position to make these changes if you are someone important like a C-level executive or a distinguished engineer. However, most tech employees are not C-level executives. If you are an engineer on the fraud team, and you try really, really hard at your job for a year, maybe you can increase the value of the company by 1% or 2%. You are just not in a position, scope wise, to drastically alter the trajectory of the company by yourself.

Rationally speaking, it does not make much sense for you, an engineer on the fraud team, to double or triple your effort just to make your equity stake worth 1% more. There might be other reasons to do it - you could really buy into the mission, or you hate being yelled at or whatever - but just looking at the compensation, whether you, personally, work really hard or slack off, your stock is probably going to be worth about the same. Unless you are the CEO or other C-level executive, at which point you have a big enough lever that your level of effort matters.

Another way to think about it is, imagine you have invested your money in a broad range of stocks and bonds, and then someone asked you to sell 30% of it and place it all in a single tech stock. Modern portfolio theory would suggest that that is a bad thing to do. You could gain a lot if the stock does well, but on the other hand, if the company's accountant was embezzling funds, or the company lost a lawsuit, or the company lost a database or had the factory struck by lightning or something, you could lose a ton of money that you wouldn't if you were better diversified. It's not worth the risk.

All this goes to say that employees should value their equity substantially less than an equivalent amount of cash. Outside of the C-level, you can't do much to make the equity more valuable, and an extra dollar worth of equity takes your portfolio further away from an ideal portfolio that you could buy if you just had cash. (For more on this topic you should read Lisa Meulbroek (hi, Professor Meulbroek), whose CV is criminally underrated.)

(On the flip side, if your company is small and valuable, it may have its pick of investors to take money from, and be able to dictate investment terms. Holding equity in a company like this is a way to approximate the "deal flow" of a good Silicon Valley investor - as an employee you are getting the chance to buy and hold stock in a company at prices that would not be accessible to you otherwise. This may be true of small, hot startups but it gets less and less true the bigger a company gets and the more fundraising rounds it goes through.)

One implication is that you should prefer to work at public companies. At a public company, you can take your equity compensation and immediately sell it and buy VT (or even QQQ) or whatever and be much better off because you are diversified. You can't do that at a private startup.

Another problem is that public companies tend to have better equity packages. I went through a round of interviews recently and I was stunned at how paltry the equity offers were from private, Series A-C companies. For most of the offers I received, the company valuation would need to increase by 8-20x for the yearly compensation to achieve parity with the first-year offer from a public SF-based company, let alone to exceed it. Even if they did achieve 4 doublings of their valuation, you might not be able to sell the private company stock, so you're still behind the public company.

I expect larger companies to have better compensation, it's part of the deal, but that large of a differential, plus the cash premium to be able to sell instantly, makes it foolish to turn down the public company offer. 1

So how can you compete if you're a smaller company? The obvious answers are what they've always been: recruit people with backgrounds that bigger companies overlook, give people wild amounts of responsibility, sell people on the vision, commit to "not being evil" and actually follow through on it.

But you can also try to eliminate an advantage that public companies have by letting your employees sell their equity. Not just, like, one time, at a huge discount before you go public, or when you get to Stripe's size and want to appease your employees. But routinely; because your employees want to boost their cash base, or buy the stock market, or buy a vacation, or whatever.

There are some objections. Having more than 500 shareholders triggers SEC disclosure requirements, which can be a pain to deal with. So require employees to sell to other employees or existing investors. Cashing out entirely might send the wrong signals, so limit sales to 10-20% of your stake per calendar year. A liquid market might require repricing stock options constantly. So implement quarterly trading windows.

Executives might not want to see what the market value of your stock is at a given time. That's tougher. But a high day-to-day price might convince people to join when they otherwise wouldn't. A low price might convince you to change direction faster than waiting for the next fundraising round.

There are also huge benefits. Employees can cash in earlier in ways that are generally only available to executives. They can take some risk off the table. People who want to double up on their equity position can do so.

Finally, you might be able to attract employees you might not otherwise be able to. A lot of folks who are turned off by the illiquidity of an equity offer might turn their heads when you describe how they can sell a portion at market value every year.

Big companies have big moats. One of them - the ability to convert stock to cash instantly - doesn't need to be one.

Thanks to Dan Luu and Alan Shreve for reading drafts of this post.

You may think they were lowballing me, but this was after negotiation with each. Another possibility is that I performed differently in the interviews for each, and the smaller companies offered me lower packages because they thought I did worse. I think I did about equally well in the interviews for each.

Patrick Louis (venam)

Notes About Compilers September 09, 2020 09:00 PM

Architect style wall, nothing really related but it looks good and gives a vibe

Compilers, these wonderful and intricate pieces of software that do so much and that so many know little of. Similar to the previous article about computer architecture, I’ll take a look at another essential, but lesser known, CS topic: Compilers.
I won’t actually dive into much detail; I’ll keep it to my notes, definitions, and what I actually found intriguing and helpful.

General schema of a compiler’s pieces

A compiler is divided into a frontend and a backend. The frontend’s role is to parse the textual program, or whatever format the programmer uses to input the code, verify it, and turn it into a representation that’s easier to work with — an IR or Intermediate Representation.
Anything after getting this intermediate representation, which is usually either a tree or three-address code, is the backend, whose role is to optimize the code and generate an output. This output could be anything, ranging from another programming language (what’s called a transpiler) to specific machine code instructions.
These days many programming languages rely on helpful tools to make these steps easier. For example, most of them use Yacc and Lex to build the frontend, and then use LLVM to automatically get a backend. LLVM IR can in theory plug into any compiler frontend, so any compiler relying on it will necessarily benefit from optimizations done at the LLVM IR level.

Personally, I’ve found that the most interesting parts were in the backend. While the frontend consists of gruesome parsing, things become fascinating when you realize everything can be turned into three-address code: instructions with at most 3 operands, a single operand on the left side of each assignment, and at most one operator on the right.
From this point on, you can apply every kind of optimization possible, such as checking whether loops over arrays can have their addresses represented by linear functions, whether dependences between data allow repositioning the code, or whether following the lifetime of values helps. In the backend you can manage what the process will look like in memory, and you can also implement garbage collection.

Overall, learning a bit about compilers doesn’t hurt. It gives insights into the workings of the languages we use everyday, removing the magic around them but keeping the awe and amazement.
So here are my rough notes and the definitions I took while learning about compilers. I hope these help someone going down the same path, as there’s a lot of jargon involved.

Definitions

  • Terminals: Basic symbols from which strings are formed, also called token name.

  • Nonterminals: Syntactic variables that denote sets of strings. They help define the language generated by the grammar, imposing a hierarchical structure on the language that is key to syntax analysis and translation.

  • Production: What nonterminals produce, i.e. the manner in which terminals and nonterminals can be combined to form strings. They have a left/head side and a body/right side, separated by -> or sometimes ::=

  • Grammar: The combination of terminal symbols, nonterminal symbols, productions (nonterminals output)

  • Context free grammar: It has 4 components: terminal symbols/tokens, nonterminal symbols/syntactic variables (each denoting a set of strings of terminals), productions (a nonterminal called the head/left side + an arrow + a sequence of terminals and/or nonterminals called the body or right side), and the designation of one nonterminal as the start symbol.

  • The language: The strings that we can derive from the grammar.

  • Parse Tree: Finding a tree that can be used to derive/yield a string in the language.

  • Parsing: The process of finding a parse tree for a given string of terminals.

  • Ambiguous grammar: A grammar that can have more than one parse tree that can generate a given string.

  • Associativity: The side to which an operand belongs when it sits between two operators; it can be left associativity or right associativity. This is a way to assign and resolve the priority/precedence of operators.

  • Syntax-directed translation scheme: Attaching rules (semantic rules) or program fragments to productions in a grammar. The output is the translated program.

A schema representing simple syntax-directed translation

  • Attributes: Any quantity associated with a programming construct.

  • Syntax tree: The tree generated from a syntax-directed translation.

  • Synthesized attributes: We can associate attributes with terminals and nonterminals, then also attach rules that dictate how to fill these attributes. This can be done in syntax-directed translation.

  • Semantic rules: When displaying a syntax-directed grammar, the semantic rules are the attached actions that need to be done to synthesized attributes (other than the usual production).

  • Tree traversal: How we visit each element of a tree, could be depth first, aka go to children first, or breadth first/top-down, aka root first.

  • Translation schemes: executing program fragments, semantic actions, instead of concatenating strings.

  • Top-down parsing: Start at the root/breadth first, the starting nonterminal, and repeatedly perform: select one production at that node and construct children, find next node at which the subtree is constructed. The selection involves trial and error.

  • lookahead symbol: The current or future terminal being scanned in the input. Typically, the leftmost terminal of the input string.

  • Recursive-descent parsing: a top-down method of syntax analysis in which you recursively try to process the input. There’s a set of procedures, one for each nonterminal.

void A() {
	Choose an A-production, A -> X1X2 ... Xk;
	for (i = 1 to k) {
		if (Xi is a nonterminal) {
			call procedure Xi();
		} else if (Xi equals the current input symbol a) {
			advance the input to the next symbol;
		} else {
			/* an error has occurred */
		}
	}
}
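
To make the idea concrete, here is a minimal sketch of a recursive-descent (predictive) parser in C. It is my own illustration, not taken from any book: a toy grammar of single digits separated by +, with left recursion already removed and one procedure per nonterminal.

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy grammar (no left recursion):
 *   expr -> term rest
 *   rest -> '+' term rest | ε
 *   term -> DIGIT
 * `look` holds the lookahead symbol. */
static const char *input;
static int look;

static void next(void) { look = *input ? *input++ : '\0'; }

static void syntax_error(const char *msg) { fprintf(stderr, "%s\n", msg); exit(1); }

static void match(int c) { if (look == c) next(); else syntax_error("unexpected symbol"); }

static int term(void) {
    if (!isdigit(look)) syntax_error("digit expected");
    int v = look - '0';
    match(look);
    return v;
}

static int expr(void) {
    int v = term();
    while (look == '+') {      /* the `rest` nonterminal, unrolled into a loop */
        match('+');
        v += term();
    }
    return v;
}

int main(void) {
    input = "1+2+3";
    next();                    /* prime the lookahead symbol */
    printf("%d\n", expr());    /* prints 6 */
    return 0;
}
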
  • Backtracking: Going backward in the input to parse them again using another production as the new choice.

  • Predictive parsing: A form of recursive-descent parsing in which the lookahead symbol unambiguously determines the flow of control through the procedure body of each nonterminal. This implicitly defines a parse tree for the input and can also be used to build an explicit parse tree. The procedure does two things: it decides which production to use by examining whether the lookahead symbol is in FIRST(a) of a production's body, then it mimics the body of the chosen production, faking execution until a terminal is reached.

  • FIRST(a): Function to return the set of terminals that appear as the first symbols of one or more strings of terminals generated from a.

    1. If X is a terminal, then FIRST(X) = {X}.
    2. If X is a nonterminal and X -> Y1Y2...Yk is a production, then place a in FIRST(X) if, for some i, a is in FIRST(Yi) and ε is in all of FIRST(Y1)...FIRST(Yi-1); intuitively Y1...Yi-1 can all derive ε, so a derivation from X can start with a.
    3. If X -> ε is a production, then add ε to FIRST(X).


  • FOLLOW: FOLLOW(A) is the set of terminals that can appear immediately to the right of nonterminal A in some sentential form (a small worked example follows the rules below).
    1. Place $ in FOLLOW(S), S being the start symbol.
    2. If there is a production A -> aBb, then everything in FIRST(b) except ε is in FOLLOW(B); in short, any terminal that can follow B.
    3. If there is a production A -> aB, or a production A -> aBb where FIRST(b) contains ε, then everything in FOLLOW(A) is in FOLLOW(B).
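
As a quick worked example (mine, not from the original notes), here are FIRST and FOLLOW for the classic non-left-recursive expression grammar:

E  -> T E'          E' -> + T E' | ε
T  -> F T'          T' -> * F T' | ε
F  -> ( E ) | id

FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
FIRST(E') = { +, ε }      FIRST(T') = { *, ε }

FOLLOW(E) = FOLLOW(E') = { ), $ }
FOLLOW(T) = FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }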


  • Left recursion: A recursive-descent parser could loop forever; we need to avoid that. Left recursion can be eliminated by rewriting the offending production. Example: A -> Aa | B, which is left recursive, can be rewritten as A -> BR, R -> aR | ε.
    Algorithm to remove left recursion:
arrange the nonterminals in some order A1, A2, ..., An
for (each i from 1 to n) {
	for (each j from 1 to i-1) {
		replace each production of the form Ai -> Ajy by the
		productions Ai -> d1y | d2y | ... | dky, where
		Aj -> d1 | d2 | ... | dk are all current Aj-productions
	}
	eliminate the immediate left recursion among the Ai-productions
}
  • Left Factoring: When it’s ambiguous which production to select in A -> aB1 | aB2 , we can defer the selection to later by factoring it to A -> aA1 and A1 -> B1 | B2. We factor by the most common prefix.

  • Abstract syntax tree or syntax tree: A tree in which interior nodes represent operators and children nodes represent the operands of the operator. They differ from parse trees in that their interior nodes represent programming constructs instead of nonterminals.

  • Token: Terminal with additional information, name and optional attribute value. The name is an abstract symbol representing a kind of lexical unit, be it a keyword, an identifier, etc.

  • Lexeme: sequence of characters from the source program that comprises a single token name. It’s an instance of that token.

  • Pattern: A description of the form that the lexeme of a token may take. A sequence of characters that form a keyword or form identifiers and other tokens, any more complex string structure that needs to be matched.

  • Lexical analysis/analyzer: a lexical analyzer reads characters from the input and groups them into “token objects”. Basically, it creates the tokens. It can be split into two parts: scanning, which processes the input by removing comments and compacting whitespace, and proper lexical analysis, the more complex portion, which produces the tokens.

Interaction between the lexical analyzer and the parser

  • Reading Ahead: It’s useful to read future characters to decide if they are part of the same lexeme. A technique is to use an input buffer or a peek variable that holds the next character.

  • Input buffering: The best technique is to use buffer pairs: 2 buffers, each the size of a disk block, so that reading is more efficient. We use a lexemeBegin pointer and a forward pointer. To check whether we are out of bounds of a buffer or whether reading is finished, we can use “sentinels”: special characters that mark either the end of the file, if in the middle of a buffer, or the end of the buffer, if at the end of it. This character can be EOF.

  • Keywords: character strings, lexeme, that identify constructs such as if, for, do, etc.

  • Identifier: also a character string, lexeme, that identify a named value.

  • Symbol Tables: Data structures used by compilers to hold information about source-program constructs. The info is collected incrementally by the analysis phase and used by the synthesis phase to generate the target code. Entries contain info about identifiers such as their character string (lexeme), type, position in storage, and any other relevant info. Each scope usually has its own symbol table. The table gets filled during the analysis phase: semantic actions fill the symbol table, so that later, for example in factor -> id, the id token gets replaced by the symbol that was declared in the table.

  • Intermediate Representations: The frontend generates an intermediate representation of the source program so that the backend can generate the target program. The two most important are: Trees (parse trees and abstract syntax trees), and linear representations (such as three-address code).

  • Static checking: The process of checking that the program follows the syntactic and semantic rules of the source language. It assures that the program will compile successfully and catches errors early. It contains syntactic checking (grammar, identifiers declared, scope checks, break statements only inside loops) and type checking.

  • Type Checking: Assures that an operator or function is applied to the right number and types of operands; it also handles conversion if necessary, aka “coercion”.

  • Strings and languages: A string is a synonym for word or sentence. It’s a finite sequence of characters, with its length written |s|. ε, or e, is the string of length 0, |ε| = 0. Strings can be concatenated; concatenating with ε leaves a string unchanged. We can define exponentiation of strings, as in s**0 = ε, s**1 = s, s**2 = ss, s**n = s**(n-1) s. A language is a countable set of strings over an alphabet, e.g. { x y z }; the empty set ∅ and { ε } are languages too.

  • Operations over languages: We can perform union, concatenation, and closure, which are the most important operations. Union of two languages is the same as in set theory. Concatenation is the set of strings formed by taking a string from the first language followed by a string from the second language. Closure, aka the Kleene closure of L or L*, is the set of strings you can get by concatenating L zero or more times; L+, the positive closure, is one or more times (see the small example below).
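
A tiny concrete case (my own): with L = { a, b } and M = { c }:

L ∪ M = { a, b, c }
LM    = { ac, bc }
L*    = { ε, a, b, aa, ab, ba, bb, aaa, ... }   (all strings over {a, b}, including ε)
L+    = L* minus { ε }                          (all non-empty strings over {a, b})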

Closure operation

Figures: DFA, NFA, NDFA, NFA example, subset construction, and the transition table for conversion

  • Regular expressions aka regex: The combination of all the operations over languages, written in an expressive way. There are precedence/priority rules: all operators are left associative; the highest precedence goes to *, then concatenation, then union |. A language that can be defined by regular expressions is called a regular set.

Algebraic laws for regular expressions

  • Regular definitions: Like variables holding a regex for later use, to make it more readable.

Figures: summary of the lexer (parts 1-3)



Figures: notational conventions (parts 1-2)

  • Aho-Corasick algorithm: Algorithm to find the longest prefix of the input that matches a keyword b1b2..bn. It defines a special transition diagram called a trie, a tree-structured transition diagram. For every node of that tree, define a failure function f(s), s being the current position in the keyword we are trying to match, which gives the previous state that still fits the prefix. On a mismatch, the seek pointer should be put back at b(f(s)+1). There’s also the KMP algorithm to match a single string.

Pseudo code for failure function:

t = 0;
f(1) = 0;
for (s = 1; s < n; s++) {
	while (t > 0 && b(s+1) != b(t+1)) t = f(t);
	if (b(s+1) == b(t+1)) {
		t = t + 1;
		f(s+1) = t;
	} else {
		f(s+1) = 0;
	}
}
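For reference, here is a runnable, 0-indexed C version of the same failure (prefix) function; it is my own sketch and the names are mine. fail[i] is the length of the longest proper prefix of p[0..i] that is also a suffix of it.

#include <stdio.h>
#include <string.h>

/* KMP-style failure (prefix) function, 0-indexed. */
static void failure(const char *p, int fail[]) {
    int n = (int)strlen(p);
    int k = 0;
    fail[0] = 0;
    for (int i = 1; i < n; i++) {
        while (k > 0 && p[i] != p[k])
            k = fail[k - 1];          /* fall back to the next shorter border */
        if (p[i] == p[k])
            k++;
        fail[i] = k;
    }
}

int main(void) {
    const char *p = "ababaa";
    int fail[16];
    failure(p, fail);
    for (int i = 0; p[i] != '\0'; i++)
        printf("f(%d) = %d\n", i + 1, fail[i]);   /* prints 0 0 1 2 3 1 */
    return 0;
}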
  • Conflict resolution in Lex:
    1. Always prefer a longer prefix to a shorter prefix.
    2. If the longest possible prefix matches two or more patterns, prefer the pattern listed first in the Lex program.

Position of Parser in compiler model

  • Constructing parse trees through derivation: Begin with the start symbol and at each step replace a nonterminal by the body of one of its productions. It’s a top-down construction of a parse tree. We use => to denote “derives”. This shows that a certain string derives, in a number of steps, from a particular instance of an expression. Any string of symbols derivable from the start symbol is a sentential form of the grammar; a sentential form with no nonterminals is a sentence, and the language of a grammar is its set of sentences. Different grammars can be equivalent (generating the same language), and a sentence can have several derivations; we distinguish the leftmost derivation and the rightmost/canonical derivation.

  • LL(1) grammars: L for left to right scanning, L for a leftmost derivation, 1 for using one input symbol of lookahead at each step to make parsing action decisions.

  • Constructing parse trees through reduction: Reduction, or bottom-up parsing, is the inverse of derivation; it consists of reducing the input until the start symbol is found. A “handle” is a substring that matches the body of a production; it is reduced to the nonterminal at the head/left side of that production.

  • LR(k) parsing: L for left to right scanning of the input, R for constructing the rightmost derivation in reverse, and the k for the number of input symbols of lookahead that are used in making parsing decisions.

  • Items in LR(0): States represent sets of “items”; an item is a production of the grammar G with a dot at some position of the body, indicating where we are in the parsing. For example: A -> X . Y Z

  • Augmented expression grammar in LR(0): a grammar with added initial state S' that produces S, such as: S' -> S, we accept the state once everything is reduced to S'.

  • CLOSURE and GOTO in LR(0): The CLOSURE of a set of items I for a grammar G is constructed as follows: add every item in I to CLOSURE(I); then, if A -> a.Bb is in CLOSURE(I), for each production B -> y add the item B -> .y to CLOSURE(I). We do this until we cannot apply the rule anymore. We call the added items nonkernel items, and the initial ones kernel items.
    The GOTO(I, X), where I is the set of items and X a grammar symbol, is defined as the closure of the set of all items [A -> aX. b] such that [A -> a.X b] is in I. It defines the transitions in the LR(0) automaton for a grammar on input X.

  • CLOSURE and GOTO in LR(1): LR(1) is similar to LR(0), however it has one lookahead character; an item has the form [A -> a.B, a], where this production is valid only when the next input symbol is a.

SetOfItems CLOSURE(I) {
	repeat
		for (each item [A -> a.Bb, a] in I)
			for (each production B -> y in G')
				for (each terminal c in FIRST(ba))
					add [B -> .y, c] to set I;
	until no more items are added to I;
	return I;
}

SetOfItems GOTO(I, X) {
	initialize J to be the empty set;
	for (each item [A -> a.XB, a] in I)
		add item [A -> aX.B, a] to set J;
	return CLOSURE(J);
}

void items(G') {
	initialize C to {CLOSURE({[S' -> .S, $]})};
	repeat
		for (each set of items I in C)
			for (each grammar symbol X)
				if (GOTO(I, X) is not empty and not in C)
					add GOTO(I,X) to C;
	until no new sets of items are added to C;
}

Figures: LR(1) examples (1-5)



Figures: parser summary (parts 1-6)

  • L-attributed translations: Class of syntax-directed translations (L for left-to-right), which encompass virtually all translations that can be performed during parsing.

  • SDD, syntax-directed definition: Context-free grammar together with attributes and rules. Attributes are associated with grammar symbols and rules are associated with productions. If X is a symbol, X.a shows a as an attribute of X.

  • Synthesized and Inherited attributes: We can associate attributes with terminals and nonterminals. A synthesized attribute at node N is computed from the semantic rule at N itself (using the attributes of N and its children), while an inherited attribute at node N is computed from the semantic rule at N’s parent (it can depend on the parent’s and siblings’ attributes). Terminals only have synthesized attributes.

  • S-attributed SDD: A syntax directed definition that only contains synthesized attributes, that is the head attributes are computed from its production body at node N only (not parent).

  • L-attributed SDD: Where the inherited attributes are only defined by one of the attribute on the left or in the head of the production (left-to-right).

  • Attribute grammar: An SDD without side effects.

  • Annotated parse tree: A parse tree showing the value(s) of its attribute(s).

  • Dependency graph: A graph with arrows/edges pointing in the direction of the value that depends upon the other side of those arrows. It’s applicable for both synthesized attributes and inherited attributes.

  • Topological sort: an ordering of the nodes of the dependency graph that gives an order in which the attributes/nodes have to be processed. When there are cycles in the dependency graph, no topological sort is possible.

  • Syntax-directed translation (SDT) for L-Attributed Definition: A syntax directed translation where we put the action/semantic-rule right before the character that requires them, and put the semantic-rule of the head as the last rule.

Figures: syntax-directed definition summary (parts 1-3)

  • Directed acyclic graph (DAG): A way to represent a syntax-directed definition as a graph where leaves are unique/atomic operands, and interior nodes correspond to operators. A leaf node can have many parents. It expresses the syntax tree more succinctly and can be used to generate efficient code to evaluate expressions. Nodes can be stored in an array of records, where each row represents one node. Leaves have a field for the lexical value, and interior nodes have two fields for the left and right children.
.--------------.
|1| id  |  ----|-> to entry for i
|2| num | 10   |
|3| +   |1 | 2 |
|4| =   |1 | 3 |
|5|  ....      |

     .- = .
   .'      `.         
  :         +         
  :      .'  `.       
  `.   .'      `      
     i          10    

Figures: position of the intermediate representation in the compiler (parts 1-2)

  • Three-address code: Instructions with at most one operator on the right side. It is a linear representation of a syntax tree or a DAG in which explicit names correspond to the interior nodes of the graph. Three-address code is composed of addresses and instructions; an address can be a name, a constant, or a compiler-generated temporary. Common instructions include (a worked example follows below):
    • assignment instructions x = y op z and unary operator assignments x = op y
    • copy instructions of the form x = y
    • unconditional jumps goto L
    • conditional jumps of the form if x goto L and ifFalse x goto L, and conditional jumps such as if x relop y goto L, relop being a relational operator
    • procedure calls: param x for parameters, then call p, n and y = call p, n (n being the number of arguments) for procedure and function calls respectively, plus return y, y being the returned value
    • indexed copy instructions of the form x = y[i] and x[i] = y
    • address and pointer assignments of the form x = &y, x = *y and *x = y
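
As a worked illustration (my own example), the statement a = b * -c + b * -c could be lowered to the following three-address code, with t1..t5 being compiler-generated temporaries:

t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a  = t5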

Figures: translating if-else to three-address code (parts 1-4)

  • Quadruples (in the context of three-address code): A table where we map 4 columns: op, arg1, arg2, result. Unary operators don’t fill arg2, param don’t fill arg2 nor result, and conditional jumps have the target label in result.

  • Triples (in the context of three-address code): A table where we map 3 columns, similar to quadruples but without the result; a result is referred to by the position of the instruction that computes it. Triples are one to one with syntax tree nodes. Indirect triples are like triples, but instead of pointing to a triple directly we point to its position through a separate instruction list, and thus chunks of code can be moved independently (see the sketch below).
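
Using the same three-address code as in the example above, the quadruple and triple forms would look roughly like this (layout mine):

Quadruples:                          Triples:
     op    | arg1 | arg2 | result         op    | arg1 | arg2
0 |  minus | c    |      | t1         0 |  minus | c    |
1 |  *     | b    | t1   | t2         1 |  *     | b    | (0)
2 |  minus | c    |      | t3         2 |  minus | c    |
3 |  *     | b    | t3   | t4         3 |  *     | b    | (2)
4 |  +     | t2   | t4   | t5         4 |  +     | (1)  | (3)
5 |  =     | t5   |      | a          5 |  =     | a    | (4)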

  • Static single-assignment form (SSA): An intermediate representation similar to three-address code but where all assignments are to variables with distinct names. It uses a φ-function to combine definitions of the same variable; the φ-function returns the value of the assignment corresponding to the control-flow path taken (small example below).
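
A tiny illustration (mine): assignments to x on two branches get distinct names, and a φ-function merges them where the paths join:

original:                      SSA form:
if (flag) x = -1;              if (flag) x1 = -1;
else      x = 1;               else      x2 = 1;
y = x * a;                     x3 = φ(x1, x2)
                               y1 = x3 * a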

  • Translation applications: From the type of a name, the compiler can determine the type of storage (storage layout) that will be needed for that name at run time. Type information can be used to calculate addresses denoted in arrays for example.
    Array layout is either row major or column major; the address of an element can be computed as base + (i - low) * w. Some type widths can be left to the target architecture, kept as a symbolic type width in the intermediate representation.

  • Type checking: A method the compiler uses, with a type system, to assign type expression to each components of a source program to avoid inadvertent error and malicious misbehavior. A language is either strongly typed or not, meaning it needs all the types to be chosen explicitly.
    Two forms: synthesis and inference. Synthesis builds up the type of an expression from the types of its subexpressions; it requires names to be declared before they are used. Ex: if f has type s -> t and x has type s, then the expression f(x) has type t. Type inference determines the type of a language construct from the way it is used. Ex: if f(x) is an expression, then for some a and b, f has type a -> b and x has type a.

  • Implicit and explicit type conversion: implicit conversion is when the compiler coerces the types itself, usually when widening types; explicit conversion is when the programmer must write something to cause the conversion. Two semantic actions are used for checking E -> E1 + E2: one is max(t1, t2), the other widen(a, t, w), which widens address a of type t into a value of type w.

Addr widen(Addr a, Type t, Type w) {
	if (t = w) return a;
	else if (t = integer and w = float) {
		temp = new Temp();
		gen(temp '=' '(float)' a);
		return temp;
	} else {
		error;
	}
}
  • Polymorphic function: A function whose type expression contains a type variable standing for “any type” to which the function can be applied. Each time a polymorphic function is applied, its bound type variables can denote a different type.

  • Unification: The problem of determining whether two expressions s and t can be made identical by substituting expressions for the variables in s and t.

  • Boolean expressions: Either used to alter the flow of control or to compute logical values.
    Example:

B -> B || B | B && B | !B | (B) | E rel E | true | false

We can short-circuit boolean operators, translating them into jumps:

if (x < 100 || x > 200 && x != y) x = 0;

equivalent to:

if x < 100 goto L2
ifFalse x > 200 goto L1
ifFalse x != y goto L1
L2: x = 0
L1:
  • Backpatching: A method of generating the jumps for boolean expressions (e.g. if (B) S) in one pass: jump targets are left unfilled, carried around in lists as synthesized attributes, and filled in (backpatched) once the corresponding labels are known.

Figures: intermediate representation summary (parts 1-2)

  • Run-time environment: The environment provided by the operating system so that the program runs. Typically:
Code/Text
Static
Heap
 |
 V
Free memory
 ^
 |
 Stack

General activation record

  • Stack vs Heap: Stack storage: for names local to a procedure. Heap storage: data that may outlive the call to the procedure that created it (we talk of virtual memory).

  • Memory Manager: A subsystem that allocates and deallocates space within the heap; it serves as an interface between application programs and the operating system. It performs two basic functions: allocation and deallocation.
    A memory manager should be space efficient (minimizing the total heap space needed by a program), program efficient (letting the program run faster by making good use of the memory subsystem), and low overhead (because memory allocation and deallocation are frequent operations in many programs).

  • Garbage collectors: Code that reclaims chunks of storage that aren’t accessed anymore.
    Things to consider: overall execution time, space usage, pause time, program locality.
    Either we catch the transition when objects become unreachable (like reference counting), or we periodically locate all the reachable objects and then infer that all the other objects are unreachable (trace-based).

  • Mutator: A subsystem that is in charge of manipulating memory. It performs 4 basic operations: Object allocation, parameter passing and return values, reference assignments, procedure returns.

  • Root set: All the data that can be accessed directly by a program, without having to dereference any pointer.

  • Code generation: The process of generating machine instruction/target program (be it asm or other) from an intermediary representation.

  • Addresses in target code: The code goes in the Code/Text area; the static area is used for global constants; the heap is the dynamically managed area during program execution; and the stack is the dynamic area holding activation records as they are created and destroyed during calls and returns.

Figures: run-time environment summary (parts 1-5)




Position of code generator in compiler

  • Basic blocks and flow graphs: Dividing the code into sections called basic blocks, such that flow can only enter a basic block through its first instruction (no jump into the middle), and control leaves the block without halting or branching except possibly at the last instruction. Each basic block becomes a node in a flow graph.

  • Live variable, and next-use: A variable is live if its value is used after the current basic block; next-use information tells us when it’s going to be used.

  • Optimizing the code: Optimization is based on multiple things, including: the cost of instructions, eliminating local common subexpressions, eliminating dead code, reordering statements that do not depend on one another, and using algebraic laws to reorder operands of three-address instructions and sometimes simplify the computation.

  • DAG for basic block: The basic block itself can be represented by a DAG, having as parents the operators and as leaves the operands. This is used for simplifications and to represent array references too.

  • Managing register and address descriptors: registers are limited and so we need an algorithm, using a getReg() method to choose what to do with the registers. We need two structures, one to know what is currently in the registers, a register descriptor, and one to know where, in which addresses, the variables are currently found, an address descriptor.

  • A register spill: When there’s no place in the current register to store the operand of an instruction and that register value needs to be stored on its own memory location.

Figures: code generation summary (parts 1-3)

  • Peephole optimization: Improving a known target code, a peephole, by replacing instruction sequences within it by a shorter or faster sequence. It usually consists of many passes. Examples: redundant-instruction elimination, flow-of-control optimizations, algebraic simplifications, use of machine idioms.
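
Two tiny illustrations (my own) of typical peephole patterns, a redundant store after a load and a jump to an unconditional jump:

LD  R0, a        ; load a into R0
ST  a, R0        ; redundant: a already holds the value, delete it if both stay in the same basic block

goto L1                        goto L2
...              becomes       ...
L1: goto L2                    L1: goto L2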

  • Data flow graph analysis: A way of modelling the flow of a program/its blocks in order to optimize it. When iterative, it usually consists of values in a semi-lattice with a domain, a direction (forward or backward), a transfer function whose results lie in the domain, a boundary (top and bottom), a meet operator ∧ (that follows the ≤ properties), equations, and an initialization. Such analyses include: reaching definitions, live variables, available expressions, constant propagation, partial redundancy, etc.

Reaching definition

  • Monotonicity: A function f on a partial order is monotonic if: if x ≤ y then f(x) ≤ f(y)

Figures: data flow summary (parts 1-7)

  • MOP (meet-over-all-paths solution): The “best” possible solution to a dataflow problem for a node n is given by computing the dataflow information along all possible paths from the entry to n, and then combining them with the meet operator ∧; in general there will be an infinite number of possible paths to n.

  • Very busy expressions: An expression e is very busy at point p if, on every path from p, expression e is evaluated before the value of e is changed.

  • Natural loop: Conditions: It must have a single entry node, called the header; this entry node dominates all nodes in the loop. There must be a back edge that enters the loop header; otherwise, it is not possible for the flow of control to return to the header directly from the “loop”.

ILP summary (figures 1–4)

  • Region-based analysis: Instead of iterating, we start from a small scope, apply the transfer function, and progressively widen the scope.

  • Hardware vs software ILP: Machines that let the software manage parallelism are called VLIW (Very Long Instruction Word) machines, and those that discover and schedule parallelism in hardware are called superscalar machines. See the computer architecture article.

  • Array affine optimization: When the array indices can be expressed as an affine function of the loop variables, you can start applying optimizations such as time-based and space-based optimization.

Basic matrix multiplication (figures 1–2)

Array access with matrix vector (figures 1–2)




Hardware optimization summary (figures 1–4)

Further Reading






Attributions:

  • Internet Archive Book Images / No restrictions

September 07, 2020

Frederic Cambus (fcambus)

Playing with Kore JSON API September 07, 2020 03:15 PM

Kore 4.0.0 was released a few days ago, and features a brand new JSON API that makes it easy to parse and serialize JSON objects.

During the last couple of years, I have been using Kore for various projects, including exposing hardware sensor values over the network via very simple APIs. In this article, I would like to present a generalization of this concept and show how easy it is to expose system information with Kore.

This small API example makes it possible to identify hosts over the network, and has been tested on Linux, OpenBSD, NetBSD, and macOS (thanks Joris!).

After creating a new project:

kodev create identify

Populate src/identify.c with the following code snippet:

#include <sys/utsname.h>

#include <kore/kore.h>
#include <kore/http.h>

#if defined(__linux__)
#include <kore/seccomp.h>

KORE_SECCOMP_FILTER("json",
	KORE_SYSCALL_ALLOW(uname)
);
#endif

int		page(struct http_request *);

int
page(struct http_request *req)
{
	char *answer;

	struct utsname u;

	struct kore_buf buf;
	struct kore_json_item *json;

	if (uname(&u) == -1) {
		http_response(req, HTTP_STATUS_INTERNAL_ERROR, NULL, 0);
		return (KORE_RESULT_OK);
	}

	kore_buf_init(&buf, 1024);
	json = kore_json_create_object(NULL, NULL);

	kore_json_create_string(json, "system", u.sysname);
	kore_json_create_string(json, "hostname", u.nodename);
	kore_json_create_string(json, "release", u.release);
	kore_json_create_string(json, "version", u.version);
	kore_json_create_string(json, "machine", u.machine);

	kore_json_item_tobuf(json, &buf);

	answer = kore_buf_stringify(&buf, NULL);
	http_response(req, 200, answer, strlen(answer));

	kore_buf_cleanup(&buf);
	kore_json_item_free(json);

	return (KORE_RESULT_OK);
}

And finally launch the project:

kodev run

The kodev tool will build and run the project, and we can then query the API to identify hosts, as shown below.
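
With the default project skeleton, which maps / to the page handler, a query along the following lines should work (the host, port, and TLS flag here are assumptions; adjust them to match the generated configuration):

curl -k https://127.0.0.1:8888/

The endpoint returns a JSON document describing the host: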

{
  "system": "OpenBSD",
  "hostname": "foo.my.domain",
  "release": "6.8",
  "version": "GENERIC.MP#56",
  "machine": "amd64"
}

Wesley Moore (wezm)

Slowing Down Read Rust Posting September 07, 2020 12:00 AM

After nearly 3 years and more than 3200 posts I'm going to slow down the posting frequency on Read Rust. I hope this will free up some spare time and make it easier to take breaks from social media. I aim to share all of the #rust2021 posts I can find, but after that I'll probably only share posts that seem particularly noteworthy or interesting.

I started Read Rust in January 2018 to track the posts being shared as part of the inaugural call for blog posts. When I started there were only a handful of new posts each day to triage. Now there are many more and unless I triage and publish daily they quickly pile up.

Also, I've kind of built a reflex of trying to "complete the Internet" each day by ensuring that I read my whole Twitter feed, and new posts on /r/rust. I would like to break this habit and be able to take breaks from these things, without feeling like I might miss an important post.

Whilst I think there is value in the curation and archiving of posts on Read Rust, the website doesn't see a lot of use. I think most of the value for people is following the Twitter, Mastodon, and Facebook accounts. However, there's a fair amount of overlap between posts shared on /r/rust, @rustlang, and This Week in Rust. So, I think that if folks keep an eye on one or more of those they will still see most posts of note.

If you're not into social media, the full list of more than 450 Rust RSS feeds I subscribe to is available via an OPML file on the site. So, feel free to use that to subscribe to a bunch of feeds instead. Rust blogs OPML.

It's been fun to build, and rebuild the website and surrounding tooling over the years. Read Rust was initially just an RSS feed but after requests for an actual web-page I built a small site with the Cobalt static site compiler. In late 2019 in an effort to streamline the sharing of posts I rebuilt the site as dynamic web app. In early 2020 I added full test search.

As mentioned in the introduction, from here I plan to share #rust2021 posts and after that posting will be much less frequent. Thanks for reading, and happy coding 🦀.

Frequently Anticipated Questions

Q. What about getting others to help share posts?

I considered this, and it was actually part of the motivation for the rebuild in 2019. However, ultimately Rust is now large enough, and continuing to grow, that it’s become less and less feasible to curate the entire firehose of Rust content.

Q. What about making it a sort of RSS powered Rust planet?

I think there’s value in curation. Rust is popular enough now that there are a lot of low-effort posts and repetitious getting-started posts. Also, people rightly have diverse interests and their blog may not solely contain Rust posts. So, I’d prefer to keep the archive in the focussed state it’s in now.

Q. What will happen to the site and social media accounts now?

I plan to keep the site up and running indefinitely. I am a strong believer in not breaking links on the web, and I think I have a pretty decent track record. For example, this site has been online for 13 years and I still have redirects in place from the very first version of it. I may still share the occasional post but in general I hope to free up a bit of time to work on other things.

September 06, 2020

Derek Jones (derek-jones)

Impact of function size on number of reported faults September 06, 2020 09:55 PM

Are longer functions more likely to contain more coding mistakes than shorter functions?

Well, yes. Longer functions contain more code, and the more code developers write the more mistakes they are likely to make.

But wait, the evidence shows that most reported faults occur in short functions.

This is true, at least in Java. It is also true that most of a Java program’s code appears in short methods (in C 50% of the code is contained in functions containing 114 or fewer lines, while in Java 50% of code is contained in methods containing 4 or fewer lines). It is to be expected that most reported faults appear in short functions. The plot below shows, left: the percentage of code contained in functions/methods containing a given number of lines, and right: the cumulative percentage of lines contained in functions/methods containing less than a given number of lines (code+data):

left: the percentage of code contained in functions/methods containing a given number of lines, and right: the cumulative percentage of lines contained in functions/methods containing less than a given number of lines.

Does percentage of program source really explain all those reported faults in short methods/functions? Or are shorter functions more likely to contain more coding mistakes per line of code, than longer functions?

Reported faults per line of code is often referred to as: defect density.

If defect density was independent of function length, the plot of reported faults against function length (in lines of code) would be horizontal; red line below. If every function contained the same number of reported faults, the plotted line would have the form of the blue line below.

Number of reported faults in C++ classes (not methods) containing a given number of lines.

Two things need to occur for a fault to be experienced. A mistake has to appear in the code, and the code has to be executed with the ‘right’ input values.

Code that is never executed will never result in any fault reports.

In a function containing 100 lines of executable source code where, say, 30 lines are rarely executed, those 30 lines will not contribute as much to the final total number of reported faults as the other 70 lines.

How does the average percentage of executed LOC, in a function, vary with its length? I have been rummaging around looking for data to help answer this question, but so far without any luck (the llvm code coverage report is over all tests, rather than per test case). Pointers to such data very welcome.

Statement execution is controlled by if-statements, and around 17% of C source statements are if-statements. For functions containing between 1 and 10 executable statements, the percentage that don’t contain an if-statement is expected to be, respectively: 83, 69, 57, 47, 39, 33, 27, 23, 19, 16. Statements contained in shorter functions are more likely to be executed, providing more opportunities for any mistakes they contain to be triggered, generating a fault experience.
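
(These percentages are presumably obtained by treating statements as independent, i.e., the chance that none of n statements is an if-statement is 100 × (1 - 0.17)^n; for example, 100 × 0.83^3 ≈ 57 for three statements.)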

Longer functions contain more dependencies between the statements within the body than shorter functions do (I don’t have any data showing how much more). Dependencies create opportunities for making mistakes (there is data showing that dependencies between files and classes are a source of mistakes).

The previous analysis makes a large assumption, that the mistake generating a fault experience is contained in one function. This is true for 70% of reported faults (in AspectJ).

What is the distribution of reported faults against function/method size? I don’t have this data (pointers to such data very welcome).

The plot below shows number of reported faults in C++ classes (not methods) containing a given number of lines (from a paper by Koru, Eman and Mathew; code+data):

Number of reported faults in C++ classes (not methods) containing a given number of lines.

It’s tempting to think that those three curved lines are each classes containing the same number of methods.

What is the conclusion? There is one good reason why shorter functions should have more reported faults, and another good’ish reason why longer functions should have more reported faults. Perhaps length is not important. We need more data before an answer is possible.

Ponylang (SeanTAllen)

Last Week in Pony - September 6, 2020 September 06, 2020 07:19 PM

We have a new RFC for added syntax to extend automatic receiver recovery. The shared-docker shellcheck image is being deprecated.

Gonçalo Valério (dethos)

Giving a new life to old phones September 06, 2020 12:18 PM

Nowadays, in some “developed” countries, it is very common for people to have a bunch of old phones stored somewhere in a drawer. Ten years have passed since smartphones became ubiquitous and those devices tend to become unusable very quickly, at least for their primary purpose. Either a small component breaks, the vendor stops providing updates, newer apps don’t support those older versions, etc.

The thing is, these phones are still powerful computers. It would be great if we could give them another life once they are no longer fit for regular day to day use or the owner just wants to try a shiny new device.

I never had many smartphones, mine tend to last many years, but I still have one or two lying around. Recently I started thinking of new uses for them, putting them to work instead of just letting them gather dust. A quick search on the internet tells me that many people have already had the same idea (I’m quite late to the party) and have been working on cool things to do with these devices.

However, most of these articles just throw the idea at you, without telling you how to do it. Others assume that your device is relatively recent.

Of course the difficulty increases with the age of the phone: in my case, the software that I will be able to run on a 10-year-old Samsung Galaxy S will not be as easy to find as the software that I can run on another device that is just one or two years old.

Below is a list of posts I found online with cool things you can do with your old phones. What sets this list apart from other results is that the items aren’t just ideas; they contain step-by-step instructions on how to achieve the end result.

You don’t have to follow the provided instructions rigorously; feel free to introduce variations that are more appropriate to your use case.

Have fun and reuse your old devices.

September 05, 2020

Maxwell Bernstein (tekknolagi)

Compiling a Lisp: Primitive unary functions September 05, 2020 09:00 PM


Welcome back to the “Compiling a Lisp” series. Last time, we finished adding the rest of the constants as tagged pointer immediates. Since it’s not very useful to have only values (no way to operate on them), we’re going to add some primitive unary functions.

“Primitive” means here that they are built into the compiler, so we won’t actually compile the call to an assembly procedure call. This is also called a compiler intrinsic. “Unary” means the functions will take only one argument. “Function” is a bit of a misnomer because these functions won’t be real values that you can pass around as variables. You’ll only be able to use them as literal names in calls.

Though we’re still not adding a reader/parser, we can imagine the syntax for this looks like the following:

(integer? (integer->char (add1 96)))

Today we also tackle nested function calls and subexpressions.

Adding function calls will require adding a new compiler data structure, an addition to the AST, but not to the compiled code. The compiled code will still only know about the immediate types.

Ghuloum proposes we add the following functions:

  • add1, which takes an integer and adds 1 to it
  • sub1, which takes an integer and subtracts 1 from it
  • integer->char, which takes an integer and converts it into a character (like chr in Python)
  • char->integer, which takes a character and converts it into an integer (like ord in Python)
  • null?, which takes an object and returns true if it is nil and false otherwise
  • zero?, which takes an object and returns true if it is 0 and false otherwise
  • not, which takes an object and returns true if it is false and false otherwise
  • integer?, which takes an object and returns true if it is an integer and false otherwise
  • bool?, which takes an object and returns true if it is a boolean and false otherwise

The functions add1, sub1, and the char/integer conversion functions will be our first real experience dealing with object encoding in the compiled code. What fun!

The implementations for null?, zero?, not, integer?, and bool? are so similar that I am only going to reproduce one or two in this post. The rest will be visible at assets/code/lisp/compiling-unary.c.

In order to implement these functions, we’ll also need some more instructions than mov and ret. Today we’ll add:

  • add
  • sub
  • shl
  • shr
  • or
  • and
  • cmp
  • setcc

Because the implementations of shl, shr, or, and and are so straightforward — just like mov, really — I’ll also omit them from the post. The implementations of add, sub, cmp, and setcc are more interesting.

The fundamental data structure of Lisp

Pairs, also called cons cells, two-tuples, and probably other things too, are the fundamental data structure of Lisp. At least the original Lisp. Nowadays we have fancy structures like vectors, too.

Pairs are a container for precisely two other objects. I’ll call them car and cdr for historical1 and consistency reasons, but you can call them whatever you like. Regardless of name, they could be represented as a C struct like this:

typedef struct Pair {
  ASTNode *car;
  ASTNode *cdr;
} Pair;

This is useful for holding pairs of objects (think coordinates, complex numbers, …) but it is also incredibly useful for making linked lists. Linked lists in Lisp are comprised of a car holding an object and the cdr holding another list. Eventually the last cdr holds nil, signifying the end of the list. Take a look at this handy diagram.

Fig. 1 - Cons cell list, courtesy of Wikipedia.

This represents the list (list 42 69 613), which can also be denoted (cons 42 (cons 69 (cons 613 nil))).

We’ll use these lists to represent the syntax trees for Lisp, so we’ll need to implement pairs to compile list programs.

Implementing pairs

In previous posts we implemented the immediate types the same way in the compiler and in the compiled code. I originally wrote this post doing the same thing: manually laying out object offsets myself, reading and writing from objects manually. The motivation was to get you familiar with the memory layout in the compiled code, but ultimately it ended up being too much content too fast. We’ll get to memory layouts when we start allocating pairs in the compiled code.

In the compiler we’re going to use C structs instead of manual memory layout. This makes the code a little bit easier to read. We’ll still tag the pointers, though.

const unsigned int kPairTag = 0x1;        // 0b001
const uword kHeapTagMask = ((uword)0x7);  // 0b000...0111
const uword kHeapPtrMask = ~kHeapTagMask; // 0b1111...1000

This adds the pair tag and some masks. As we noted in the previous posts, the heap object tags are all in the lowest three bits of the pointer. We can mask those out using this handy utility function.

uword Object_address(void *obj) { return (uword)obj & kHeapPtrMask; }

We’ll need to use this whenever we want to actually access a struct member. Speaking of struct members, here’s the definition of Pair:

typedef struct Pair {
  ASTNode *car;
  ASTNode *cdr;
} Pair;

And here are some functions for allocating and manipulating the Pair struct, to keep the implementation details hidden:

ASTNode *AST_heap_alloc(unsigned char tag, uword size) {
  // Initialize to 0
  uword address = (uword)calloc(size, 1);
  return (ASTNode *)(address | tag);
}

void AST_pair_set_car(ASTNode *node, ASTNode *car);
void AST_pair_set_cdr(ASTNode *node, ASTNode *cdr);

ASTNode *AST_new_pair(ASTNode *car, ASTNode *cdr) {
  ASTNode *node = AST_heap_alloc(kPairTag, sizeof(Pair));
  AST_pair_set_car(node, car);
  AST_pair_set_cdr(node, cdr);
  return node;
}

bool AST_is_pair(ASTNode *node) {
  return ((uword)node & kHeapTagMask) == kPairTag;
}

Pair *AST_as_pair(ASTNode *node) {
  assert(AST_is_pair(node));
  return (Pair *)Object_address(node);
}

ASTNode *AST_pair_car(ASTNode *node) { return AST_as_pair(node)->car; }

void AST_pair_set_car(ASTNode *node, ASTNode *car) {
  AST_as_pair(node)->car = car;
}

ASTNode *AST_pair_cdr(ASTNode *node) { return AST_as_pair(node)->cdr; }

void AST_pair_set_cdr(ASTNode *node, ASTNode *cdr) {
  AST_as_pair(node)->cdr = cdr;
}

There are a couple of important things to note.

First, AST_heap_alloc very intentionally zeroes out the memory it allocates. If the members were left uninitialized, it might be possible to read off a struct member that had an invalid pointer in car or cdr. If we zero-initialize it, the member pointers represent the object 0 by default. Nothing will crash.

Second, we keep moving our ASTNode pointers through AST_as_pair. This function has two purposes: catch invalid uses (via the assert that the object is indeed a Pair) and also mask out the lower bits. Otherwise we’d have to do the masking in every operation individually.

Third, I abstracted out the AST_heap_alloc so we don’t expose the calloc function everywhere. This allows us to later swap out the allocator for something more intelligent, like a bump allocator, an arena allocator, etc.

And since memory allocated must eventually be freed, there’s a freeing function too:

void AST_heap_free(ASTNode *node) {
  if (!AST_is_heap_object(node)) {
    return;
  }
  if (AST_is_pair(node)) {
    AST_heap_free(AST_pair_car(node));
    AST_heap_free(AST_pair_cdr(node));
  }
  free((void *)Object_address(node));
}

This assumes that each ASTNode* owns the references to all of its members. So don’t borrow references to share between objects. If you need to store a reference to an object, make sure you own it. Otherwise you’ll get a double free. In practice this shouldn’t bite us too much because each program is one big tree.

Implementing symbols

We also need symbols! I mean, we could try mapping all the functions we need to integers, but that wouldn’t be very fun. Who wants to try and debug a program crashing on function#67? Not me. So let’s add a datatype that can represent names of things.

As above, we’ll need to tag the pointers.

const unsigned int kSymbolTag = 0x5;      // 0b101

And then our struct definition.

typedef struct Symbol {
  word length;
  char cstr[];
} Symbol;

I’ve chosen this variable-length object representation because it’s similar to how we’re going to allocate symbols in assembly and the mechanism in C isn’t so gnarly. This struct indicates that the memory layout of a Symbol is a length field immediately followed by that number of bytes in memory. Note that having this variable array in a struct is a C99 feature.

If you don’t have C99 or don’t like this implementation, that’s fine. Just store a char* and allocate another object for that string.

You could also opt to not store the length at all and instead NUL-terminate it. This has the advantage of not dealing with variable-length arrays (it’s just a tagged char*) but has the disadvantage of an O(n) length lookup.

Now we can add our Symbol allocator:

Symbol *AST_as_symbol(ASTNode *node);

ASTNode *AST_new_symbol(const char *str) {
  word data_length = strlen(str) + 1; // for NUL
  ASTNode *node = AST_heap_alloc(kSymbolTag, sizeof(Symbol) + data_length);
  Symbol *s = AST_as_symbol(node);
  s->length = data_length;
  memcpy(s->cstr, str, data_length);
  return node;
}

See how we have to manually specify the size we want. It’s a little fussy, but it works.

Storing the NUL byte or not is up to you. It saves one byte per string if you don’t, but it makes printing out strings in the debugger a bit of a pain since you can’t just treat them like normal C strings.

Some Lisp implementations use a symbol table to ensure that symbols allocated with equivalent C-string values return the same pointer. This allows the implementations to test for symbol equality by testing pointer equality. I think we can sacrifice a bit of memory and runtime speed for implementation simplicity, so I’m not going to do that.
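
As a rough illustration of that alternative, here is a minimal interning sketch (my own standalone code, not part of this compiler): equal strings always come back as the same canonical pointer, so symbol equality reduces to pointer equality.

#include <stdlib.h>
#include <string.h>

// One entry per distinct string interned so far.
struct InternEntry {
  struct InternEntry *next;
  char *str;
};

static struct InternEntry *intern_table = NULL;

// Return a canonical pointer for str; equal strings share the same pointer.
const char *intern(const char *str) {
  for (struct InternEntry *it = intern_table; it != NULL; it = it->next) {
    if (strcmp(it->str, str) == 0) {
      return it->str; // already interned; reuse the canonical copy
    }
  }
  size_t len = strlen(str) + 1;
  struct InternEntry *entry = malloc(sizeof *entry);
  entry->str = malloc(len);
  memcpy(entry->str, str, len);
  entry->next = intern_table;
  intern_table = entry;
  return entry->str;
}

A real implementation would use a hash table rather than a linked list, but the linear scan keeps the idea visible.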

Let’s add the rest of the utility functions:

bool AST_is_symbol(ASTNode *node) {
  return ((uword)node & kHeapTagMask) == kSymbolTag;
}

Symbol *AST_as_symbol(ASTNode *node) {
  assert(AST_is_symbol(node));
  return (Symbol *)Object_address(node);
}

const char *AST_symbol_cstr(ASTNode *node) {
  return (const char *)AST_as_symbol(node)->cstr;
}

bool AST_symbol_matches(ASTNode *node, const char *cstr) {
  return strcmp(AST_symbol_cstr(node), cstr) == 0;
}

Now we can represent names.

Representing function calls

We’re going to represent function calls as lists. That means that the following program:

(add1 5)

can be represented by the following C program:

Pair *args = AST_new_pair(AST_new_integer(5), AST_nil());
Pair *program = AST_new_pair(AST_new_symbol("add1"), args);

This is a little wordy. We can make some utilities to trim the length down.

ASTNode *list1(ASTNode *item0) {
  return AST_new_pair(item0, AST_nil());
}

ASTNode *list2(ASTNode *item0, ASTNode *item1) {
  return AST_new_pair(item0, list1(item1));
}

ASTNode *new_unary_call(const char *name, ASTNode *arg) {
  return list2(AST_new_symbol(name), arg);
}

And now we can represent the program as:

list2(AST_new_symbol("add1"), AST_new_integer(5));
// or, shorter,
new_unary_call("add1", AST_new_integer(5));

This is great news because we’ll be adding many tests today.

Compiling primitive unary function calls

Whew. We’ve built up all these data structures and tagged pointers and whatnot but haven’t actually done anything with them yet. Let’s get to the compilers part of the compilers series, please!

First, we have to revisit Compile_expr and add another case. If we see a pair in an expression, then that indicates a call.

int Compile_expr(Buffer *buf, ASTNode *node) {
  // Tests for the immediates ...
  if (AST_is_pair(node)) {
    return Compile_call(buf, AST_pair_car(node), AST_pair_cdr(node));
  }
  assert(0 && "unexpected node type");
}

I took the liberty of separating out the callable and the args so that the Compile_call function has less to deal with.

We’re only supporting primitive unary function calls today, which means that we have a very limited pattern of what is accepted by the compiler. (add1 5) is ok. (add1 (add1 5)) is ok. (blargle 5) is not, because the blargle isn’t on the list above. ((foo) 1) is not, because the thing being called is not a symbol.

int Compile_call(Buffer *buf, ASTNode *callable, ASTNode *args) {
  assert(AST_pair_cdr(args) == AST_nil() &&
         "only unary function calls supported");
  if (AST_is_symbol(callable)) {
    // Switch on the different primitives here...
  }
  assert(0 && "unexpected call type");
}

Compile_call should look at what symbol it is, and depending on which symbol it is, emit different code. The overall pattern will look like this, though:

  • Compile the argument — the result is stored in rax
  • Do something to rax

Let’s start with add1 since it’s the most straightforward.

    if (AST_symbol_matches(callable, "add1")) {
      _(Compile_expr(buf, operand1(args)));
      Emit_add_reg_imm32(buf, kRax, Object_encode_integer(1));
      return 0;
    }

If we see add1, compile the argument (as above). Then, add 1 to rax. Note that we’re not just adding the literal 1, though. We’re adding the object representation of 1, ie 1 << 2. Think about why! When you have an idea, click the footnote.2

If you’re wondering what the underscore (_) function is, it’s a macro that I made to test the return value of the compile expression and return if there was an error. We don’t have any non-aborting error cases just yet, but I got tired of writing if (result != 0) return result; over and over again.

Note that there is no runtime error checking. Our compiler will allow (add1 nil) to slip through and mangle the pointer. This isn’t ideal, but we don’t have the facilities for error reporting just yet.

sub1 is similar to add1, except it uses the sub instruction. You could also just use add with the immediate representation of -1.
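
For completeness, a sketch of the sub1 branch might look like this, assuming a sub emitter named Emit_sub_reg_imm32 by analogy with the add one (that helper name is a guess, not taken from the post):

    if (AST_symbol_matches(callable, "sub1")) {
      _(Compile_expr(buf, operand1(args)));
      // Subtract the *encoded* 1, i.e. 1 << 2, so the integer tag survives.
      Emit_sub_reg_imm32(buf, kRax, Object_encode_integer(1));
      return 0;
    }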

integer->char is different. We have to change the tag of the object. In order to do that, we shift the integer left and then drop the character tag onto it. This is made simple by integers having a 0b00 tag (nothing to mask out).

Here’s a small diagram showing the transitions when converting 97 to 'a':

High                                                           Low
0000000000000000000000000000000000000000000000000000000[1100001]00  Integer
0000000000000000000000000000000000000000000000000[1100001]00000000  Shifted
0000000000000000000000000000000000000000000000000[1100001]00001111  Character

where the number in enclosed in [brackets] is 97. And here’s the code to emit assembly that does just that:

    if (AST_symbol_matches(callable, "integer->char")) {
      _(Compile_expr(buf, operand1(args)));
      Emit_shl_reg_imm8(buf, kRax, kCharShift - kIntegerShift);
      Emit_or_reg_imm8(buf, kRax, kCharTag);
      return 0;
    }

Note that we’re not shifting left by the full amount. We’re only shifting by the difference, since integers are already two bits shifted.

char->integer is similar, except it’s just a shr. Once the value is shifted right, the char tag gets dropped off the end, so there’s no need to mask it out.
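
A sketch of the corresponding char->integer branch, assuming a shift-right emitter named Emit_shr_reg_imm8 mirroring the shl one (again, the helper name is an assumption):

    if (AST_symbol_matches(callable, "char->integer")) {
      _(Compile_expr(buf, operand1(args)));
      // Shifting right by the difference drops the char tag off the end and
      // leaves the value with the integer tag (0b00).
      Emit_shr_reg_imm8(buf, kRax, kCharShift - kIntegerShift);
      return 0;
    }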

nil? is our first primitive with ~ exciting assembly instructions ~. We get to use cmp and setcc. The basic idea is:

  • Compare (this means do a subtraction) what’s in rax and nil
  • Set rax to 0
  • If they’re equal (this means the result was 0), set al to 1
  • Shift left and tag it with the bool tag

al is the name for the lower 8 bits of rax. There’s also ah (for the next 8 bits, but not the highest bits), cl/ch, etc.

    if (AST_symbol_matches(callable, "nil?")) {
      _(Compile_expr(buf, operand1(args)));
      Emit_cmp_reg_imm32(buf, kRax, Object_nil());
      Emit_mov_reg_imm32(buf, kRax, 0);
      Emit_setcc_imm8(buf, kEqual, kAl);
      Emit_shl_reg_imm8(buf, kRax, kBoolShift);
      Emit_or_reg_imm8(buf, kRax, kBoolTag);
      return 0;
    }

The cmp leaves a bit set (ZF) in the flags register, which setcc then checks. setcc, by the way, is the name for the group of instructions that set a register if some condition happened. It took me a long time to realize that since people normally write sete or setnz or something. And cc means “condition code”.

Since we’re going to do a lot of comparisons today, we can simplify our lives by extracting that into a function that compares rax with some immediate value, and then refactoring Compile_call to call it.

void Compile_compare_imm32(Buffer *buf, int32_t value) {
  Emit_cmp_reg_imm32(buf, kRax, value);
  Emit_mov_reg_imm32(buf, kRax, 0);
  Emit_setcc_imm8(buf, kEqual, kAl);
  Emit_shl_reg_imm8(buf, kRax, kBoolShift);
  Emit_or_reg_imm8(buf, kRax, kBoolTag);
}

Let’s also poke at the implementations of cmp and setcc, since they involve some fun instruction encoding.

cmp, as it turns out, has a short-path when the register it’s comparing against is rax. This means we get to save one (1) whole byte if we want to!

void Emit_cmp_reg_imm32(Buffer *buf, Register left, int32_t right) {
  Buffer_write8(buf, kRexPrefix);
  if (left == kRax) {
    // Optimization: cmp rax, {imm32} can either be encoded as 3d {imm32} or 81
    // f8 {imm32}.
    Buffer_write8(buf, 0x3d);
  } else {
    Buffer_write8(buf, 0x81);
    Buffer_write8(buf, 0xf8 + left);
  }
  Buffer_write32(buf, right);
}

If you don’t want to, just use the 81 f8+ pattern.

For setcc, we have to define this new notion of “partial registers” so that we can encode the instruction properly. We can’t re-use Register because there are two partial registers for rax. So we add a PartialRegister.

typedef enum {
  kAl = 0,
  kCl,
  kDl,
  kBl,
  kAh,
  kCh,
  kDh,
  kBh,
} PartialRegister;

And then we can use those in the setcc implementation:

void Emit_setcc_imm8(Buffer *buf, Condition cond, PartialRegister dst) {
  Buffer_write8(buf, 0x0f);
  Buffer_write8(buf, 0x90 + cond);
  Buffer_write8(buf, 0xc0 + dst);
}

Again, I didn’t come up with this encoding. This is Intel’s design.

The zero? primitive is much the same as nil?, and we can re-use that Compile_compare_imm32 function.

    if (AST_symbol_matches(callable, "zero?")) {
      _(Compile_expr(buf, operand1(args)));
      Compile_compare_imm32(buf, Object_encode_integer(0));
      return 0;
    }

not is more of the same — compare against false.
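
In other words, the not branch can be sketched as a comparison against the false object, reusing Compile_compare_imm32 (a sketch, not copied from the original code):

    if (AST_symbol_matches(callable, "not")) {
      _(Compile_expr(buf, operand1(args)));
      // An object is "not true" exactly when it is the false object.
      Compile_compare_imm32(buf, Object_false());
      return 0;
    }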

Now we get to integer?. This is similar, but different enough that I’ll reproduce the implementation below. Instead of comparing the whole number in rax, we only want to look at the lowest 2 bits. This can be accomplished by masking out the other bits, and then doing the comparison. For that, we emit an and first and compare against the tag.

    if (AST_symbol_matches(callable, "integer?")) {
      _(Compile_expr(buf, operand1(args)));
      Emit_and_reg_imm8(buf, kRax, kIntegerTagMask);
      Compile_compare_imm32(buf, kIntegerTag);
      return 0;
    }

It’s possible to shorten the implementation a little bit because and sets the zero flag. This means we can skip the cmp. But it’s only one instruction and I’m lazy so I’m reusing the existing infrastructure.
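
If you did want that shortcut, a sketch of the shortened version might look like the following. It relies on the integer tag being 0b00, so the and instruction already sets ZF exactly when the value is an integer, and on mov not clobbering the flags:

    if (AST_symbol_matches(callable, "integer?")) {
      _(Compile_expr(buf, operand1(args)));
      // and sets ZF when the masked tag bits are 0, i.e. the integer tag,
      // so the separate cmp can be dropped.
      Emit_and_reg_imm8(buf, kRax, kIntegerTagMask);
      Emit_mov_reg_imm32(buf, kRax, 0);
      Emit_setcc_imm8(buf, kEqual, kAl);
      Emit_shl_reg_imm8(buf, kRax, kBoolShift);
      Emit_or_reg_imm8(buf, kRax, kBoolTag);
      return 0;
    }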

Last, boolean? is almost the same as integer?.

Boom! Compilers! Let’s check our work.

Testing

I’ll only include a couple tests here, since the new tests are a total of 283 lines added and are a little bit repetitive.

First, the simplest test for add1.

TEST compile_unary_add1(Buffer *buf) {
  ASTNode *node = new_unary_call("add1", AST_new_integer(123));
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  // mov rax, imm(123); add rax, imm(1); ret
  byte expected[] = {0x48, 0xc7, 0xc0, 0xec, 0x01, 0x00, 0x00,
                     0x48, 0x05, 0x04, 0x00, 0x00, 0x00, 0xc3};
  EXPECT_EQUALS_BYTES(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ(result, Object_encode_integer(124));
  AST_heap_free(node);
  PASS();
}

Second, a test of nested expressions:

TEST compile_unary_add1_nested(Buffer *buf) {
  ASTNode *node = new_unary_call(
      "add1", new_unary_call("add1", AST_new_integer(123)));
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  // mov rax, imm(123); add rax, imm(1); add rax, imm(1); ret
  byte expected[] = {0x48, 0xc7, 0xc0, 0xec, 0x01, 0x00, 0x00,
                     0x48, 0x05, 0x04, 0x00, 0x00, 0x00, 0x48,
                     0x05, 0x04, 0x00, 0x00, 0x00, 0xc3};
  EXPECT_EQUALS_BYTES(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ(result, Object_encode_integer(125));
  AST_heap_free(node);
  PASS();
}

Third, the test for boolean?.

TEST compile_unary_booleanp_with_non_boolean_returns_false(Buffer *buf) {
  ASTNode *node = new_unary_call("boolean?", AST_new_integer(5));
  int compile_result = Compile_function(buf, node);
  ASSERT_EQ(compile_result, 0);
  // 0:  48 c7 c0 14 00 00 00    mov    rax,0x14
  // 7:  48 83 e0 3f             and    rax,0x3f
  // b:  48 3d 1f 00 00 00       cmp    rax,0x0000001f
  // 11: 48 c7 c0 00 00 00 00    mov    rax,0x0
  // 18: 0f 94 c0                sete   al
  // 1b: 48 c1 e0 07             shl    rax,0x7
  // 1f: 48 83 c8 1f             or     rax,0x1f
  byte expected[] = {0x48, 0xc7, 0xc0, 0x14, 0x00, 0x00, 0x00, 0x48, 0x83,
                     0xe0, 0x3f, 0x48, 0x3d, 0x1f, 0x00, 0x00, 0x00, 0x48,
                     0xc7, 0xc0, 0x00, 0x00, 0x00, 0x00, 0x0f, 0x94, 0xc0,
                     0x48, 0xc1, 0xe0, 0x07, 0x48, 0x83, 0xc8, 0x1f};
  EXPECT_EQUALS_BYTES(buf, expected);
  Buffer_make_executable(buf);
  uword result = Testing_execute_expr(buf);
  ASSERT_EQ(result, Object_false());
  AST_heap_free(node);
  PASS();
}

I’m getting the fancy disassembly from defuse.ca. I include it because it makes the tests easier for me to read and reason about later. You just have to make sure the text and the binary representations in the test don’t go out of sync because that can be very confusing…

Anyway, that’s a wrap for today. Send your comments on the elist! Next time, binary primitives.




  1. There’s a long-running dispute about what to call these two objects. The machine Lisp was first implemented on (the IBM 704) had a particular hardware layout that led to the creation of the names car and cdr. Nobody uses this hardware anymore, so the names are purely historical. Some people call them first/fst and second/snd. Others call them head/hd and tail/tl. Some people have other ideas.

  2. If you said “to preserve the tag” or “adding 1 would make it a pair” or some variant on that, you’re correct! Otherwise, I recommend going back to the diagram from the last couple of posts and then writing down binary representations of a couple of numbers by hand on a piece of paper. 

September 04, 2020

Kevin Burke (kb)

Building a better home network September 04, 2020 09:28 PM

I finally got my home network in a place where I am happy with it. I wanted to