Planet Crustaceans

This is a Planet instance for community feeds. To add/update an entry or otherwise improve things, fork this repo.

May 27, 2020

Frederic Cambus (fcambus)

OpenBSD/armv7 on the CubieBoard2 May 27, 2020 10:15 PM

I bought the CubieBoard2 back in 2016 with the idea of running OpenBSD on it, but because of various reliability issues with the onboard NIC, it ran NetBSD for a few weeks before ending up in a drawer.

Back in October, Mark Kettenis committed code to allow switching to the framebuffer "glass" console in the bootloader on OpenBSD/armv7, making it possible to install the system without using a serial cable.

>> OpenBSD/armv7 BOOTARM 1.14
boot> set tty fb0
switching console to fb0

This prompted me to plug the board in again, and having support for the framebuffer console is a game changer. It also allows running Xenocara, if that's your thing.

Here is the output of running file on executables:

ELF 32-bit LSB shared object, ARM, version 1

And this is the result of the md5 -t benchmark:

MD5 time trial.  Processing 10000 10000-byte blocks...
Digest = 52e5f9c9e6f656f3e1800dfa5579d089
Time   = 1.340000 seconds
Speed  = 74626865.671642 bytes/second

For the record, LibreSSL speed benchmark results are available here.

System message buffer (dmesg output):

OpenBSD 6.7-current (GENERIC) #299: Sun May 24 18:25:45 MDT 2020
real mem  = 964190208 (919MB)
avail mem = 935088128 (891MB)
random: good seed from bootblocks
mainbus0 at root: Cubietech Cubieboard2
cpu0 at mainbus0 mpidr 0: ARM Cortex-A7 r0p4
cpu0: 32KB 32b/line 2-way L1 VIPT I-cache, 32KB 64b/line 4-way L1 D-cache
cpu0: 256KB 64b/line 8-way L2 cache
cortex0 at mainbus0
psci0 at mainbus0: PSCI 0.0
sxiccmu0 at mainbus0
agtimer0 at mainbus0: tick rate 24000 KHz
simplebus0 at mainbus0: "soc"
sxiccmu1 at simplebus0
sxipio0 at simplebus0: 175 pins
sxirtc0 at simplebus0
sxisid0 at simplebus0
ampintc0 at simplebus0 nirq 160, ncpu 2: "interrupt-controller"
"system-control" at simplebus0 not configured
"interrupt-controller" at simplebus0 not configured
"dma-controller" at simplebus0 not configured
"lcd-controller" at simplebus0 not configured
"lcd-controller" at simplebus0 not configured
"video-codec" at simplebus0 not configured
sximmc0 at simplebus0
sdmmc0 at sximmc0: 4-bit, sd high-speed, mmc high-speed, dma
"usb" at simplebus0 not configured
"phy" at simplebus0 not configured
ehci0 at simplebus0
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "Generic EHCI root hub" rev 2.00/1.00 addr 1
ohci0 at simplebus0: version 1.0
"crypto-engine" at simplebus0 not configured
"hdmi" at simplebus0 not configured
sxiahci0 at simplebus0: AHCI 1.1
scsibus0 at sxiahci0: 32 targets
ehci1 at simplebus0
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "Generic EHCI root hub" rev 2.00/1.00 addr 1
ohci1 at simplebus0: version 1.0
"timer" at simplebus0 not configured
sxidog0 at simplebus0
"ir" at simplebus0 not configured
"codec" at simplebus0 not configured
sxits0 at simplebus0
com0 at simplebus0: ns16550, no working fifo
sxitwi0 at simplebus0
iic0 at sxitwi0
axppmic0 at iic0 addr 0x34: AXP209
sxitwi1 at simplebus0
iic1 at sxitwi1
"gpu" at simplebus0 not configured
dwge0 at simplebus0: address 02:0a:09:03:27:08
rlphy0 at dwge0 phy 1: RTL8201L 10/100 PHY, rev. 1
"hstimer" at simplebus0 not configured
"display-frontend" at simplebus0 not configured
"display-frontend" at simplebus0 not configured
"display-backend" at simplebus0 not configured
"display-backend" at simplebus0 not configured
gpio0 at sxipio0: 32 pins
gpio1 at sxipio0: 32 pins
gpio2 at sxipio0: 32 pins
gpio3 at sxipio0: 32 pins
gpio4 at sxipio0: 32 pins
gpio5 at sxipio0: 32 pins
gpio6 at sxipio0: 32 pins
gpio7 at sxipio0: 32 pins
gpio8 at sxipio0: 32 pins
usb2 at ohci0: USB revision 1.0
uhub2 at usb2 configuration 1 interface 0 "Generic OHCI root hub" rev 1.00/1.00 addr 1
usb3 at ohci1: USB revision 1.0
uhub3 at usb3 configuration 1 interface 0 "Generic OHCI root hub" rev 1.00/1.00 addr 1
simplefb0 at mainbus0: 1920x1080, 32bpp
wsdisplay0 at simplefb0 mux 1: console (std, vt100 emulation)
scsibus1 at sdmmc0: 2 targets, initiator 0
sd0 at scsibus1 targ 1 lun 0: <SD/MMC, SC64G, 0080> removable
sd0: 60906MB, 512 bytes/sector, 124735488 sectors
uhidev0 at uhub2 port 1 configuration 1 interface 0 "Lenovo ThinkPad Compact USB Keyboard with TrackPoint" rev 2.00/3.30 addr 2
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 variable keys, 6 key codes
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub2 port 1 configuration 1 interface 1 "Lenovo ThinkPad Compact USB Keyboard with TrackPoint" rev 2.00/3.30 addr 2
uhidev1: iclass 3/1, 22 report ids
ums0 at uhidev1 reportid 1: 5 buttons, Z and W dir
wsmouse0 at ums0 mux 0
uhid0 at uhidev1 reportid 16: input=2, output=0, feature=0
uhid1 at uhidev1 reportid 17: input=2, output=0, feature=0
uhid2 at uhidev1 reportid 19: input=8, output=8, feature=8
uhid3 at uhidev1 reportid 21: input=2, output=0, feature=0
uhid4 at uhidev1 reportid 22: input=2, output=0, feature=0
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
bootfile: sd0a:/bsd
boot device: sd0
root on sd0a (f7b555b0fa0e8c49.a) swap on sd0b dump on sd0b

Sensors output:

$ sysctl hw.sensors
hw.sensors.sxits0.temp0=39.50 degC
hw.sensors.axppmic0.temp0=30.00 degC
hw.sensors.axppmic0.volt0=4.95 VDC (ACIN)
hw.sensors.axppmic0.volt1=0.03 VDC (VBUS)
hw.sensors.axppmic0.volt2=4.85 VDC (APS)
hw.sensors.axppmic0.current0=0.11 A (ACIN)
hw.sensors.axppmic0.current1=0.00 A (VBUS)
hw.sensors.axppmic0.indicator0=On (ACIN), OK
hw.sensors.axppmic0.indicator1=Off (VBUS)

May 25, 2020

Gustaf Erikson (gerikson)

4,000 dead in Sweden May 25, 2020 09:48 AM

Alsing dead May 25, 2020 09:48 AM

3,000 dead in Sweden May 25, 2020 09:47 AM

2,000 dead in Sweden May 25, 2020 09:47 AM

1,000 dead in Sweden May 25, 2020 09:47 AM

Work from home begins today May 25, 2020 09:47 AM

WHO declares a pandemic May 25, 2020 09:46 AM

Marc Brooker (mjb)

Reading Research: A Guide for Software Engineers May 25, 2020 12:00 AM

Don't be afraid.

One thing I'm known for at work is reading research papers, and referring to results in technical conversations. People ask me if, and how, they should read papers themselves. This post is a long-form answer to that question. The intended audience is working software engineers.

Why read research?

I read research in one of three mental modes.

The first mode is solution finding: I’m faced with a particular problem, and am looking for solutions. This isn’t too different from the way that you probably use Stack Overflow, but for more esoteric or systemic problems. Solution finding can work directly from papers, but I tend to find books more useful in this mode, unless I know an area well and am looking for something specific.

A more productive mode is what I call discovery. In this case, I’ve been working on a problem or in a space, and know something about it. In discovery mode, I want to explore around the space I know and see if there are better solutions. For example, when I was building a system using Paxos, I read a lot of literature about consensus protocols in general (including classics like Viewstamped Replication1, and newer papers like Raft). The goal in discovery mode is to find alternative solutions, opportunities for optimization, or new ways to think about a problem.

The most intellectually gratifying mode for me is curiosity mode. Here, I’ll read papers that just seem interesting to me, but aren’t related to anything I’m currently working on. I’m constantly surprised by how reading broadly has helped me solve problems, or just informed my approach. For example, reading about misuse-resistant cryptography primitives like GCM-SIV has deeply informed my approach to API design. Similarly, reading about erasure codes around 2005 helped me solve an important problem for my team just this year.

I’ve found reading for discovery and curiosity very helpful to my career. It has also given me tools that make reading for solution finding more efficient. Sometimes, reading for curiosity leads to new paths. About five years ago I completely changed what I was working on after reading Latency lags bandwidth, which describes what I believe is one of the most important trends in computing.

Do I need a degree to read research papers?

No. Don’t expect to be able to pick up every paper and understand it completely. You do need a certain amount of background knowledge, but no credentials. Try to avoid being discouraged when you don't understand a paper, or sections of a paper. I'm often surprised when I revisit something after a couple years and find I now understand it.

Learning a new field from primary research can be very difficult. When tackling a new area, books, blogs, talks, and courses are better options.

How do I find papers worth reading?

That depends on the mode you’re in. In solution finding and discovery modes, search engines like Google Scholar are a great place to start. One challenge with searching is that you might not even know the right things to search for: it’s not unusual for researchers to use different terms from the ones you are used to. If you run into this problem, picking up a book on the topic can often help bridge the gap, and the references in books are a great way to discover papers.

Following particular authors and researchers can be great for discovery and curiosity modes. If there’s a researcher who’s working in a space I’m interested in, I’ll follow them on Twitter or add search alerts to see when they’ve published something new.

Conferences and journals are another great place to go. Most of the computer science research you’ll read is probably published at conferences. There are some exceptions. For example, I followed ACM Transactions on Storage when I was working in that area. Pick a couple of conferences in areas that you’re interested in, and read through their programs when they come out. In my area, NSDI and Eurosys happened earlier this year, and OSDI is coming up. Jeff Huang has a nice list of best paper winners at a wide range of CS conferences.

A lot of research involves going through the graph of references. Most papers include a list of references, and as I read I note down which ones I’d like to follow up on and add them to my reading list. References form a directed (mostly) acyclic graph of research going into the past.

Finally, some research bloggers are worth following. Adrian Colyer's blog is worth its weight in gold. I’ve also written about research from researchers like Leslie Lamport, Nancy Lynch, and others.

That’s quite a fire hose! How do I avoid drowning?

You don’t have to drink that whole fire hose. I know I can’t. Titles and abstracts can be a good way to filter out papers you want to read. Don’t be afraid to scan down a list of titles and pick out one or two papers to read.

Another approach is to avoid reading new papers at all. Focus on the classics, and let time filter out papers that are worth reading. For example, I often find myself recommending Jim Gray's 1986 paper on The 5 Minute Rule and Lisanne Bainbridge's 1983 paper on Ironies of Automation2.

Who writes research papers?

Research papers in the areas of computer science I work in are generally written by one of three groups. First, researchers at universities, including professors, postdocs, and graduate students. These are people whose job it is to do research. They have a lot of freedom to explore quite broadly, and do foundational and theoretical work.

Second, engineering teams at companies publish their work. Amazon’s Dynamo, Firecracker, Aurora and Physalia papers are examples. Here, work is typically more directly aimed at a problem to be solved in a particular context. The strength of industry research is that it’s often been proven in the real world, at scale.

In the middle are industrial research labs. Bell Labs was home to some of the foundational work in computing and communications. Microsoft Research does a great deal of impressive work. Industry labs, as a broad generalization, also tend to focus on concrete problems, but can operate over longer time horizons.

Should I trust the results in research papers?

The right answer to this question is no. Nothing about being in a research paper guarantees that a result is right. Results can range from simply wrong, to flawed in more subtle ways3.

On the other hand, the process of peer review does help set a bar of quality for published results, and results published in reputable conferences and journals are generally trustworthy. Reviewers and editors put a great deal of effort into this, and it’s a real strength of scientific papers over informal publishing.

My general advice is to read methods carefully, and verify results for yourself if you’re going to make critical decisions based on them. A common mistake is to apply a correct result too broadly, and assume it applies to contexts or systems it wasn’t tested on.

Should I distrust results that aren’t in research papers?

No. The process of peer review is helpful, but not magical. Results that haven’t been peer reviewed, or that were rejected in peer review, aren’t necessarily wrong. Some important papers have been rejected from traditional publishing, and were published in other ways. This happened to Leslie Lamport's classic paper introducing Paxos:

I submitted the paper to TOCS in 1990. All three referees said that the paper was mildly interesting, though not very important, but that all the Paxos stuff had to be removed. I was quite annoyed at how humorless everyone working in the field seemed to be, so I did nothing with the paper.

It was eventually published 8 years later, and quite well received:

This paper won an ACM SIGOPS Hall of Fame Award in 2012.

There's a certain dance one needs to know, and follow, to get published in a top conference or journal. Some of the steps are necessary, and lead to better research and better communities. Others are just for show.

What should I look out for in the methods section?

That depends on the field. In distributed systems, one thing to look out for is scale. Due to the constraints of research, systems may be tested and validated at a scale below what you’ll need to run in production. Think carefully about how the scale assumptions in the paper might impact the results. Both academic and industry authors have an incentive to talk up the strengths of their approach, and avoid highlighting the weaknesses. This is very seldom done to the point of dishonesty, but worth paying attention to as you read.

How do I get time to read?

This is going to depend on your personal circumstances, and your job. It's not always easy. Long-term learning is one of the keys to a sustainable and successful career, so it's worth making time to learn. One of the ways I like to learn is by reading research papers. You might find other ways more efficient, effective or enjoyable. That's OK too.


Pekka Enberg pointed me at How to Read a Paper by Srinivasan Keshav. It describes a three-pass approach to reading a paper that I like very much:

The first pass gives you a general idea about the paper. The second pass lets you grasp the paper’s content, but not its details. The third pass helps you understand the paper in depth.

Murat Demirbas shared his post How I Read a Research Paper, which contains a lot of great advice. Like Murat, I like to read on paper, although I have taken to doing my lighter-weight reading using LiquidText.


  1. I wrote a blog post about Viewstamped Replication back in 2014. It's a pity VR isn't more famous, because it's an interestingly different framing that helped me make sense of a lot of what Paxos does.
  2. Obviously stuff like maths is timeless, but even in fast-moving fields like systems there are papers worth reading from the 50s and 60s. I think about Sayre's 1969 paper Is automatic “folding” of programs efficient enough to displace manual? when people talk about how modern programmers don't care about efficiency.
  3. There's a lot of research that looks at the methods and evidence of other research. For a start, and to learn interesting things about your own benchmarking, take a look at Is Big Data Performance Reproducible in Modern Cloud Networks? and A Nine Year Study of File System and Storage Benchmarking.

Joe Nelson (begriffs)

Logging TLS session keys in LibreSSL May 25, 2020 12:00 AM

LibreSSL is a fork of OpenSSL that improves code quality and security. It was originally developed for OpenBSD, but has since been ported to several platforms (Linux, *BSD, HP-UX, Solaris, macOS, AIX, Windows) and is now the default TLS provider for some of them.

When debugging a program that uses LibreSSL, it can be useful to see decrypted network traffic. Wireshark can decrypt TLS if you provide the secret session key; however, the session key is difficult to obtain. It is different from the private key used for functions like tls_config_set_keypair_file(), which merely secures the initial TLS handshake with asymmetric cryptography. The handshake establishes the session key between client and server using a method such as Diffie-Hellman (DH). The session key is then used for efficient symmetric cryptography for the remainder of the communication.

Web browsers, from their Netscape provenance, will log session keys to a file specified by the environment variable SSLKEYLOGFILE when present. Netscape packaged this behavior in its Network Security Services library.

OpenSSL and LibreSSL don’t implement that NSS behavior, although OpenSSL allows code to register a callback for when TLS key material is generated or received. The callback receives a string in the NSS Key Log Format.

In addition to refactoring OpenSSL code, LibreSSL offers a simplified TLS interface called libtls. The simplicity makes it more likely that applications will use it safely. However, I couldn’t find an easy way to log session keys for my libtls connection.

I found a somewhat hacky way to do it, and asked their development list whether there’s a better way. From the lack of response, I assume there isn’t yet. Posting the solution here in case it’s helpful for anyone else.

This module provides a tls_dump_keylog() function that appends to the file specified in SSLKEYLOGFILE.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#include <openssl/ssl.h>
#include <tls.h>

/* A copy of the tls structure from libtls/tls_internal.h
 * This is a fragile hack! When the structure changes in libtls
 * then it will be Undefined Behavior to alias it with this.
 * See C99 section 6.5 (Expressions), paragraph 7 */
struct tls_internal {
	struct tls_config *config;
	struct tls_keypair *keypair;

	struct {
		char *msg;
		int num;
		int tls;
	} error;

	uint32_t flags;
	uint32_t state;

	char *servername;
	int socket;

	SSL *ssl_conn;
	SSL_CTX *ssl_ctx;

	struct tls_sni_ctx *sni_ctx;

	X509 *ssl_peer_cert;
	STACK_OF(X509) *ssl_peer_chain;

	struct tls_conninfo *conninfo;

	struct tls_ocsp *ocsp;

	tls_read_cb read_cb;
	tls_write_cb write_cb;
	void *cb_arg;
};

static void printhex(FILE *fp, const unsigned char *s, size_t len)
{
	while (len-- > 0)
		fprintf(fp, "%02x", *s++);
}

bool tls_dump_keylog(struct tls *tls)
{
	FILE *fp;
	SSL_SESSION *sess;
	unsigned int len_key, len_id;
	unsigned char key[256];
	const unsigned char *id;

	const char *path = getenv("SSLKEYLOGFILE");
	if (!path)
		return false;

	/* potentially nonstrict aliasing */
	sess = SSL_get_session(((struct tls_internal *)tls)->ssl_conn);
	if (!sess) {
		fprintf(stderr, "Failed to get session for TLS\n");
		return false;
	}
	len_key = SSL_SESSION_get_master_key(sess, key, sizeof key);
	id      = SSL_SESSION_get_id(sess, &len_id);

	if ((fp = fopen(path, "a")) == NULL) {
		fprintf(stderr, "Unable to write keylog to '%s'\n", path);
		return false;
	}
	fputs("RSA Session-ID:", fp);
	printhex(fp, id, len_id);
	fputs(" Master-Key:", fp);
	printhex(fp, key, len_key);
	fputs("\n", fp);
	fclose(fp);
	return true;
}

To use the logfile in Wireshark, right click on a TLS packet, and select Protocol Preferences → (Pre)-Master-Secret log filename.

(Pre)-Master-Secret log filename menu item

In the resulting dialog, add the filename of the logfile. Then you can view the decrypted traffic with Follow → TLS Stream.

Follow TLS stream menu item

May 24, 2020

Derek Jones (derek-jones)

New users generate more exceptions than existing users (in one dataset) May 24, 2020 10:42 PM

Application usage data is one of the rarest kinds of public software engineering data.

Even data that might be used to approximate application usage is rare. Server logs might be used as a proxy for browser usage or operating system usage, and the number of Debian package downloads as a proxy for package usage.

Usage data is an important component of fault prediction models, and the failure to incorporate such data is one reason why existing fault models are almost completely worthless.

The paper Deriving a Usage-Independent Software Quality Metric appeared a few months ago (it’s a bit of a kitchen sink of a paper), and included lots of usage data! As far as I know, this is a first.

The data relates to a mobile-based communications App that used Google Analytics to log basic usage information, i.e., daily totals of: App usage time, uses by existing users, uses by new users, operating system+version used by the mobile device, and number of exceptions raised by the App.

Working with daily totals means there is likely to be a non-trivial correlation between usage time and number of uses. Given that this is the only public data of its kind, the correlation has to be handled somehow (in my case, ignored for the time being).

I’m expecting to see a relationship between number of exceptions raised and daily usage (the data includes a count of fatal exceptions, which are less common; because lots of data is needed to build a good model, I went with the more common kind). So a’fishing I went.

On most days no exception occurred (zero is the ideal case for the vendor, but I want lots of exceptions to build a good model). Daily exception counts are likely to be small integers, which suggests a Poisson error model.

It is likely that the same set of exceptions were experienced by many users, rather like the behavior that occurs when fuzzing a program.

Applications often have an initial beta testing period, intended to check that everything works. Lucky for me the beta testing data is included (i.e., more exceptions are likely to occur during beta testing, which get sorted out prior to official release). This is the data I concentrated my modeling on.

The model I finally settled on has the form (code+data):

Exceptions ≈ uses^0.1 × newUserUses^0.54 × e^(0.002·sqrt(usagetime)) × AndroidVersion

Yes, newUserUses had a much bigger impact than uses. This was true for all the models I built using data for all Android/iOS Apps, and the exponent difference was always greater than two.

Why square-root, rather than log? The model fit was much better for square-root; too much better for me to be willing to go with a model which had usagetime as a power-law.

The impact of AndroidVersion varied by several orders of magnitude (which won’t come as a surprise to developers using earlier versions of Android).

There were not nearly as many exceptions once the App became generally available, and there were a lot fewer exceptions for the iOS version.

The outsized impact of new users on exceptions experienced is easily explained by developers failing to check for users doing nonsensical things (which users new to an App are prone to do). Existing users have a better idea of how to drive an App, and tend to do the kind of things that developers expect them to do.

As always, if you know of any interesting software engineering data, please let me know.

Pages From The Fire (kghose)

Just use std::filesystem May 24, 2020 07:13 PM

C++17’s std::filesystem gives me the warm fuzzy feeling Python3’s pathlib does. Easy, intuitive and cross-platform, yet another excuse to use C++17. Don’t look back. Just use it. Especially useful functions in std::filesystem are: The / operator: The committee fought their instincts to be “enterprise”, decided to be more Pythonic, and got this very… Read More: Just use std::filesystem

Ponylang (SeanTAllen)

Last Week in Pony - May 24, 2020 May 24, 2020 02:33 PM

Damon Kwok has been doing a ton of awesome work on the Emacs ponylang-mode. Sean T Allen gave an informal presentation on Pony to the Houston Functional Programmers Users Group.

May 22, 2020

Wesley Moore (wezm)

Software Bounties May 22, 2020 10:07 AM

I don't have time to build all the things I'd like to build, so I'm offering bounties on the following work.


  • Payment will be made via PayPal when the criteria are met. If you would prefer another mechanism, feel free to suggest it, but no guarantees.
  • Amounts are in Australian dollars.
  • I will not pay out a bounty after the expiration date.
  • I may choose to extend the expiration date.
  • How can I trust you'll pay me? I like to think that I'm a trustworthy person. However, if you would like to discuss a partial payment prior to starting work on one of these issues please get in touch.
  • You have to be the primary contributor to claim the bounty. If someone else does all the work and you just nudge it over the line, the other person is the intended recipient.
  • If in doubt contact me.

Emoji Reactions in Fractal

Fractal is a Matrix client written in Rust using GTK.

Criteria: Implement emoji reactions in Fractal to the satisfaction of the maintainers, resulting in the issue being closed as completed.
Language: Rust
Amount: AU$500
Expires: 2021-01-01T00:00:00Z

Update Mattermost Server to Support Emoji Added After Unicode 9.0

Mattermost's emoji picker is stuck on emoji from Unicode 9. We're now up to Unicode 13 and many emoji added in the last few years are missing. This bounty pertains only to the work required in the Mattermost server, not the desktop and mobile apps.

Criteria: Update the list of emoji in the Mattermost server to Unicode 13.0, resulting in the issue being closed as completed.
Language: Go
Amount: AU$200
Expires: 2021-01-01T00:00:00Z

Carlos Fenollosa (carlesfe)

No more Google Analytics May 22, 2020 09:24 AM

I have removed the GA tracking code from this website. This site no longer uses any tracking techniques: no cookies, no JS, no image pixels.

Even though this was one of the first sites to actually implement consent-based GA tracking, the current situation with the cookie banners is terrible.

We are back to the Flash era, where every site had a "home page" and you needed to perform some extra clicks to view the actual content. Now those extra clicks are spent disabling all the tracking code.

I hate the current situation so much that I just couldn't be a part of it any more. So, no banner, no cookies, no JS, nothing. What little traffic I get, I'll analyze with a log parser like webalizer. I wasn't checking it anyway.

Tags: internet, web, security

Comments? Tweet

May 21, 2020

Gonçalo Valério (dethos)

Dynamic DNS using Cloudflare Workers May 21, 2020 11:09 PM

In this post I’ll describe a simple solution I came up with to the problem of dynamically updating DNS records when the IP addresses of your machines/instances change frequently.

While Dynamic DNS isn’t a new thing and many services/tools around the internet already provide solutions to this problem (for more than 2 decades), I had a few requirements that ruled out most of them:

  • I didn’t want to sign up to a new account in one of these external services.
  • I would prefer to use a domain name under my control.
  • I don’t trust the machine/instance that executes the update agent, so according to the principle of least privilege, the client should only be able to update one DNS record.

The first and second points rule out the usual DDNS service providers, and the third point forbids me from using the Cloudflare API as is (as done in other blog posts), since the permissions we are allowed to set up for a new API token aren’t granular enough to only allow access to a single DNS record; at best I would have to give access to all records under that domain.

My solution to the problem at hand was to put a worker in front of the API, basically delegating half of the work to this “serverless function”. The flow is the following:

  • agent gets IP address and timestamp
  • agent signs the data using a previously known key
  • agent contacts the worker
  • worker verifies signature, IP address and timestamp
  • worker fetches DNS record info of a predefined subdomain
  • If the IP address is the same, nothing needs to be done
  • If the IP address is different, worker updates DNS record
  • worker notifies the agent of the outcome

Nothing too fancy or clever, right? But it works like a charm.

I’ve published my implementation on GitHub with a FOSS license, so anyone can modify and reuse it. It doesn’t require any extra dependencies, it consists of only two files, and you just need to drop them at the right locations and you’re ready to go. The repository can be found here and contains the detailed steps to deploy it.

There are other small features that could be implemented, such as using the same worker with several agents that need to update different records, so only one of these “serverless functions” would be required. But these improvements will have to wait for another time; for now I just needed something that worked well for this particular case and could be deployed in a short time.

Robin Schroer (sulami)

Literate Calculations in Emacs May 21, 2020 12:00 AM

It is no secret that I am a big fan of literate programming for many use cases. I think it is a great match for investigative or exploratory notes, research, and configuration.

On a Friday evening about two weeks ago, my flatmate came up with an idea for doing calculations in a literate way. Of course, if you really wanted to, you could use a Jupyter Notebook, but we were looking for something more lightweight, and ideally integrated into Emacs.

A quick search came up empty, so on Saturday morning I got started writing what came to be Literate Calc Mode. The features I wanted included named references to earlier results, spreadsheet-like recalculations on every change, and the ability to save my calculations to a file. And then of course the ability to interlace calculations with explanatory text.

It was in part inspired by the iOS app Tydlig, which also provides calculations with automatically updating references to earlier results, but does not allow saving the workspaces as files, which I find very limiting.

But enough talk, this is what the result looks like in action:

This is literate-calc-minor-mode, running alongside org-mode. As you can see, it automatically picks up calculations and inserts the results as overlays at the end of the line. It allows the user to bind results to variables, which can even have names with spaces. Any change causes all values to be recalculated, similar to a spreadsheet.

Because it uses Emacs’ built-in calc-eval behind the scenes, it supports almost everything M-x calc does, including formulas, complex units, and unresolved mathematical variables.

Of course there are also other convenience functions, such as evaluating just a single line, or inserting the results into the file for sharing. I do have some more plans for the future, which are outlined in the documentation.

In addition to hopefully providing some value to other Emacs users, this was also a great learning experience. On a meta-level, writing this post has taught me how to use <video> on my blog.

I have learned a lot about overlays in Emacs, and I published my first package on MELPA, which was a thoroughly pleasant experience.

eta (eta)

Writing as a form of relief May 21, 2020 12:00 AM

The content on this blog does not get updated very frequently.

This is largely because, as a somewhat permanent and public sort of thing, I have to be quite careful about what stuff I write on here, since it could come back to bite me later, right? We’ve all heard the stories about people getting turned down for job offers due to some embarrassing stuff they posted on the social network du jour [1] [2], and I get the impression that being careful with what you choose to write into your permanent online record is generally a good thing. (Well, when you phrase it like that…)

What are blogs for, though? As far as I can tell, almost nobody reads this one; there are a few stragglers who come here through Google because I posted about something like microG or Rust programming (even though there are now much better resources to learn Rust out there, given the language has changed a whole load since I started learning it…).

Furthermore, I don’t consider myself the kind of person who’s happy to go and do lots of writing about technical topics (for the moment, at least). Some people can sustain an entire blogging habit by packing things full of interesting technical content / deep dives / whatever. This is great, because then the content is at least ‘useful’ to the person reading it (in some sense [3]), instead of being some poor sap whining on about random other things happening in their life.

Unfortunately, though, I’m not one of these people. So, either I go about my life and don’t write any part of it up on the blog, or I do the converse, and end up spewing things I’ll probably regret later out into the web at large.

There’s a pandemic on. Everyone’s feeling miserable, a lot of people have very tragically lost their lives to the COVID-19 virus, and people are beginning to question all sorts of things about the existence we led before this all started.

So, what the hell, let’s just get on with it then.

The utility of personal content

Some people might be of the opinion that personal content (like somewhat soppy blog posts) is not worth reading, and should perhaps be gotten rid of entirely. However – and obviously I’m going to be biased here! – I don’t really think so. Okay, I think the sort of content where people talk about whatever mundane things they’ve been getting up to (“so, this week, I washed my bike, went out for a run, tinkered with Node.js a bit…”) is perhaps a bit of a waste of time – I’d call that ‘oversharing’, perhaps. But I do think it’s possible to read stuff about someone else’s problems and gain some insight into how you might be able to solve your own, so I don’t really want to dismiss personal content entirely.

I guess there’s a distinction between content that is purely descriptive – explaining how much you hate yourself, or how annoying thing X is, or whatever – and content that has an analytical or empathetic component as well – trying to figure out the reasons why this is the case and provide some advice to people feeling the same way, or otherwise attempting to connect your own personal experience to what others may feel. The former has no value to the reader, really – oh, poor random internet commentator. How sad. Imagine an agony aunt column without the agony aunt’s responses. How awful would that be? But the latter kind of stuff can definitely be of some value; I’ve read things on the web that have influenced the way I look at the world and respond to things – most people probably have. So it’s not entirely worthless!

The title of this post

In fact, I think writing about things is a great way to process and deal with said things. I’m not just talking about personal or emotional matters – way back in 2016 when I wrote a short Rust tutorial series, the aim was as much to inform others as to force me to be honest about my own Rust abilities; writing something up gets you to specify what exactly you mean in plain English, which can be great for identifying gaps in your knowledge, or areas of flawed thinking.

This is partially why, as it says in the title, writing can be a form of relief; there’s something about putting pen to paper that makes you feel just a bit better about whatever it is you’re writing about, be that your frustrations learning a new programming language or something more personal.

It’s also, in some ways, a lot lower friction than talking to someone about something. If you start calling up your friends and ranting to them about how much asynchronous programming paradigms suck, you eventually lose most of your friends – whereas you aren’t going to annoy anyone, or take up anyone’s time, by writing about things [4]. (Unless you do something crazy like start sending your friends letters in the post containing your rants. This is also a good way to lose most of your friends.)

‘Mental health awareness’

Now, of course, you don’t actually have to publish anything to get these benefits; simply writing something up should be enough. (This is the idea behind journaling, I think.) In fact, as I discussed at the start, publishing things can be harmful to your career.

However, I think it’s still worth doing: I’m a human being, and you are too. There are thousands of tech blogs that just talk about tech and don’t talk about anything personal or human; there are thousands of people who only talk about technical topics on their website and never mention a thing about their private lives. I’m not saying they should – but I do tend to think that seeing other people talk about their problems publicly can be a great motivator for you to do the same (for example, I’m a big fan of one blogger’s occasional posts about toxic Silicon Valley culture). To me, that’s what this concept of ‘mental health awareness’ is about (at least in part): recognizing that other people are people too, and trying to get people to talk more openly about their thoughts and feelings, instead of just keeping them to themselves.

So, yeah. Write (somewhat critically) about things that bother you, even if they aren’t technical. It’s helpful for you, and you never know what impact it’ll have on somebody else!

Or, you know, just don’t, if you’re not into that sort of thing. But I’m going to give it a try.

Also, the hope is that just trying to get into a semi-regular pattern of writing about /anything/ without much of a filter will mean that more technical stuff seeps out as well. We’ll see what happens!

  1. Well, people tell me this happens. I’m not, ehm, experienced enough to actually have heard of this happening first-hand. 

  2. On a related note, if you’re someone who might have the capability to make me a job offer, just… do me a solid and don’t read the blog, okay? :p 

  3. More on this later. 

  4. You also are probably not going to annoy your friends by talking about personal issues if you really feel the need to talk to someone about them, since that’s what friends are for! However, it doesn’t feel too great having to do this a lot (where ‘a lot’ is subjectively defined) – in other words, even though your friends might not actually get annoyed, your fear of them getting annoyed (and perhaps not telling you) might be enough to make you not want to talk to them. 

May 20, 2020

Benjamin Pollack (gecko)

The Deprecated *nix API May 20, 2020 07:31 PM

I realized the other day that, while I do almost all of my development “in *nix”, I don’t actually meaningfully program in what I traditionally have thought of as “*nix” anymore. And, if things like Hacker News, Lobsters, and random dotfiles I come across on GitHub are any indication, then there are many developers like me.

“I work on *nix” can mean a lot of very different things, depending on who you ask. To some, it honestly just means they’re on the command line: being in cmd.exe on Windows, despite utterly different (and not necessarily inferior!) semantics, might qualify. To others, it means a rigid adherence to POSIX, even if GNU’s incompatible variants might rule the day on the most common Linux distros. To others, it truly means working on an actual, honest-to-goodness Unix derivative, such as some of the BSDs—or perhaps a SunOS or Solaris derivative, like OpenIndiana.

To me, historically, it’s meant that I build on top of the tooling that Unix provides. Even if I’m on Windows, I might be developing “in *nix” as long as I’m using sed, awk, shell scripts, and so on, to get what I need to do done. The fact I’m on Windows doesn’t necessarily matter; what matters is the underlying tooling.

But the other day, I realized that I’ve replaced virtually all of the traditional tooling. I don’t use find; I use fd. I don’t use sed; I use sd. du is gone for dust, bash for fish, vim for kakoune, screen for tmux, and so on. Even the venerable grep and awk are replaced by not one, but two tools, and not one-for-one: depending on my ultimate goal, ripgrep and angle-grinder replace either or both tools, sometimes in concert, and sometimes alone.

I’m not particularly interested in a discussion on whether these tools are “better”; they work better for me, so I use them. Based on what I see on GitHub, enough other people feel similarly that all of these incompatible variations on a theme must be heavily used.

My concern is that, in that context, I think the meaning of “I write in *nix” is starting to blur a bit. The API for Windows is defined in terms of C (or perhaps C++, if you squint). For Linux, it’s syscalls. For macOS, some combo of C and Objective-C. But for “*nix”, without any clarifying context, I for one think in terms of shell scripts and their utilities. And the problem is that my own naïve scripts, despite being written on a legit *nix variant, simply will not run on a vanilla Linux, macOS, or *BSD installation. They certainly can—I can install fish, and sd, and ripgrep, and whatever else I’m using, very easily—but those tools aren’t available out-of-the-box, any more than, I dunno, the PowerShell 6 for Linux is. (Or MinGW is for Windows, to turn that around.) It amounts to a gradual ad-hoc breakage of the traditional ad-hoc “*nix” API, in favor of my own, custom, bespoke variant.

I think, in many ways, what we’re seeing is a good thing. sed, awk, and the other traditional tools all have (let’s be honest) major failings. There’s a reason that awk, despite recent protestations, was legitimately replaced by Perl. (At least, until people forgot why that happened in the first place.) But I do worry about the API split, and our poor ability to handle it. Microsoft, the paragon of backwards compatibility, has failed repeatedly to actually ensure that compatibility, even when armed with much richer metadata than vague, non-version-pinned plain-text shell scripts calling ad-hoc, non-standard tooling. If we all go to our own variants of traditional Unix utilities, I worry that none of my scripts will meaningfully run in a decade.

Or maybe they will. Maybe my specific preferred forks of Unix utilities rule the day and all of my scripts will go through unscathed.

May 19, 2020

Jan van den Berg (j11g)

Unorthodox – Netflix miniseries May 19, 2020 06:19 PM

I was impressed by the Netflix miniseries Unorthodox. Specifically with the talented actors, the believable authentic world-building and the spot-on casting (so good). In all of these respects, this is a very good show.

Huge parts of the show are in Yiddish which is a unique experience (especially when you speak a little bit of German). It felt genuine and intimate.

Moishe and Esty – Two main characters with similar but ultimately opposite experiences

I like that the story works with flashbacks and you are thrown right in the middle. And a lot is left unexplained, specifically Hasidic customs. Some, you intuitively understand (e.g. consistently touching/kissing doorposts) while others left me puzzled (an entire kitchen wrapped in tin-foil?). The show does not over-explain and it keeps the story going, but it does provide enough pointers to dig deeper.

The final audition scene tied a lot of things together — families, friends and worlds — while at the same time it made it clear that some bridges were definitely burned. Wonderfully done.

Loose ends?

However, there were some things that could have been better, or that made little sense. Spoilers ahead.

  • How long was Esty in Berlin, are we watching days, weeks or even months? Sometimes I thought this was a couple of days. But that didn’t always make sense.
  • Why did the grandmother die? Esty was never made aware of this, so what was the purpose of this tragic subplot?
  • What was the meaning of Moishe’s successful gambling scene? The fact that he won in a poker game didn’t add anything new to his character (we already knew he had an ambivalent personality) or the story, but they made it seem significant — including his full monty dive into a Berlin river.
  • I understand that a miniseries that is almost shorter than the latest Scorsese does not have time for everything. However, the relationship Esty quickly gets with the coffee-guy felt a bit forced and far-fetched for her character arc. You don’t go from removing your sheitel to sleeping with a guy in two (?) days.

This being said, maybe some of these are setups; loose ends for another season? It could very well be, because there are still some story lines left to explore (specifically Moishe). I would watch it.

The post Unorthodox – Netflix miniseries appeared first on Jan van den Berg.

Mark J. Nelson (mjn)

Newsgames of the 2000s May 19, 2020 12:00 PM

During grad school in the late 2000s, I used to maintain a list of newsgames, i.e. games that commented in some kind of timely and usually editorial way on current events. I remembered it existed while looking through an archive of my old website, and decided to reproduce it here in case it helps anyone looking for information on newsgames of that period. You can also browse the original list on the Wayback Machine if you prefer.

The list covers games released in 1999–2008, with a bit of ad-hoc summarization, analysis, and categorization, and links to more in-depth commentary elsewhere, where available. It's almost certainly missing other games of the same era that I didn't happen to run across. I've replaced dead links with updated ones or Wayback Machine links, where possible, but I've otherwise left the entries unedited as I originally wrote them. Even in cases where the Wayback Machine has a copy of the game, unfortunately, many of them run on the Flash platform, which is tricky to get working in modern browsers. (Note that since my commentary is unedited, when an entry says that a link is to the Internet Archive's copy or a game is unavailable, it means this was already the case as of the late 2000s.)


Pico's School

Released a few months after the Columbine school shootings, Pico's School drops you, as Pico, into a school that's just been shot up and taken over by angsty goth kids who like KMFDM. You play a graphical-adventure game to defeat them, interspersed with arcade-style boss fights. There are aliens also. Somewhat ambiguous what the commentary is; it was controversial at the time for supposedly making light of the tragedy through farcical elements or even appealing to disaffected teens. On a less newsgamey note, it was also the game that launched the reputation of Newgrounds, as well as an impressive technical achievement given the limitations of 1999-era Flash 3 that had made previous Flash games not nearly as interactive.


Kabul Kaboom!

An arcade-style game where you're an Afghan civilian who has to catch U.S. aid packages (hamburgers) while dodging U.S. bombs. Released during the U.S. war against the Taliban, in which it was also dropping humanitarian aid to Afghan civilians. A bit more discussion from designer Gonzalo Frasca can be found here.


Al Quaidamon

  • URL:
  • Author: Tom Fulp
  • Date: February 2002
  • Platform: Web
  • Types of commentary: direct criticism, satire, rhetoric of failure
  • Types of gameplay: classic arcade, whack a mole
  • Subjects: terrorism, human rights, police

A criticism of liberal-minded criticisms of U.S. treatment of its war-on-terror prisoners. There's a prisoner, and you can choose to either punch him, or feed him donuts, brush his hair, or attend to his wounds, each of which impacts a meter that shows how well he's being treated. The meter steadily decreases if you do nothing, and even with non-stop coddling it's hard to get up to Geneva Convention standards, which are in any case portrayed as being better than the lives of most Americans. Part of the War on Terror collection at Newgrounds, although one of the few games in the collection with editorial content.


  • URL:
  • Author: Josh On
  • Date: May 2002
  • Platform: Web
  • Types of commentary: direct criticism, pointing out tradeoffs
  • Types of gameplay: strategy, resource management
  • Subjects: international affairs, corporations, war

A simulation of the war on terror that critiques war through the simulation rules. For example, business and military spending is basically the same; not using your troops enough reduces their effectiveness; not spending enough on domestic affairs makes you unpopular; not spending enough on the military can get you assassinated; and so on. The commentary isn't particularly subtle, but the way it's built into the simulation rules is a nice use of games' procedurality.

See this blog post by Michael Mateas for a more detailed rundown of the rules and the commentary they produce, and, for those not afraid of books, pp. 82–84 of Ian Bogost's Persuasive Games (MIT Press, 2007) for more discussion.


September 12

A critique of the "war on terror"'s use of missile strikes that cause civilian casualties. You can fire a missile periodically at terrorists, or not. If you do, you'll almost certainly cause the landscape to get increasingly battle-scarred, while causing an increase in the number of terrorists through civilian casualties. If you don't, terrorists will stay present at some default background level. The first game to call itself a "newsgame".

The Howard Dean for Iowa Game

  • URL:
  • Authors: Persuasive Games and Gonzalo Frasca
  • Date: December 2003
  • Platform: Web
  • Type of commentary: electioneering
  • Types of gameplay: strategy, social
  • Subjects: elections, activism

Somewhat of a milestone in political games, since it was commissioned officially by the Howard Dean campaign for his run in the 2004 Democratic primary. Has a map-level strategic view in which you place supporters, and once you place a supporter, goes into a short real-time segment where you try to wave your campaign sign at people. There were initially some social effects, with the map changing colors based on the level of Dean support in various regions created by other players of the game, plus even some instant-messaging integration, though the IM integration has since been disabled, and the social effects are hard to see since few people still play the game. The main aim of the game seems to have been to raise some vague awareness about how the caucus system works, plus just create some buzz.

The creators give a detailed retrospective account in this essay.

Kimdom Come

You play North Korean leader Kim Jong Il in a farcical game of brinksmanship, threatening South Korea with your missiles, staging parades, gaining concessions, and so on. Gameplay is a mixture of strategy and Missile Command-style arcade action. Released during one of many periods where North Korea was making belligerent noises and negotiating concessions from the West. Available for the Mac.

Reelect Bush?

  • URL:
  • Author: EllaZ Systems
  • Date: December 2003
  • Platform: Standalone executable
  • Type of commentary: satire
  • Types of gameplay: sim, quiz
  • Subjects: elections, famous people

One of several games bundled with the chatbot "AI Bush", a Bush-specific version of a chatbot from a group that seems to have state-of-the-art technology as far as chatbots go. This one mocks George W. Bush's reputed lack of knowledge on various subjects by having you play an advisor who whispers answers to him, though he stops listening to you if you feed him too much bad advice. Was available for Windows for purchase, but no longer seems to be, so summary based on the official blurb rather than playing it myself.



  • URL:
  • Author: Gonzalo Frasca
  • Date: March 2004
  • Platform: Web
  • Type of commentary: memorial
  • Type of gameplay: whack a mole
  • Subject: terrorism

Released two days after the March 2004 Madrid subway bombings, this is a simple game where you click on candles' flames to make them burn brighter, raising an overall light meter. They diminish some time after you click them, so gameplay is to keep clicking to keep them as bright as possible.

Useful Voter Guide

  • URL:
  • Author: LoveInWar
  • Date: March 2004
  • Platform: Web
  • Type of commentary: satire
  • Type of gameplay: quiz
  • Subjects: elections, media

A parody of both polarized politics and the voter questionnaires that ask you a series of questions and give you a political position or candidate based on your replies. In this one, you're presented with a series of people representing opposite stereotypes, and shoot the one you hate most: Would you rather put a bullet in the flag-waving guy, or the America-hating one? Based on your choices, you get a political position, invariably described in insulting terms. The original site seems to have disappeared, so the link is to the Internet Archive's copy.

John Kerry Tax Invaders

  • URL:
  • Author: The Republican Party
  • Date: April 2004
  • Platform: Web
  • Types of commentary: direct criticism, electioneering
  • Type of gameplay: classic arcade
  • Subjects: elections, taxes

A skin of Space Invaders with you playing George W. Bush shooting down John Kerry tax proposals. Not very much game rhetoric here beyond a skin, but it gets included in the list since it was an earlyish official political game. The original site is no longer up, so the link is to the Internet Archive's copy.

2004 Everybody Fight

  • URL:
  • Author: Digital Extreme
  • Date: May 2004
  • Platform: Standalone executable
  • Type of commentary: satire
  • Type of gameplay: first person shooter
  • Subject: elections

A first-person shooter featuring personalities from Taiwan's 2004 election. It's tempting to consider this just an opportunistic skin of an FPS, but the manufacturer claims it's a parody of violence and acrimony between supporters of the two main political camps. Summary based on an article in the Taipei Times.


  • URL:
  • Author: The Republican Party
  • Date: May 2004
  • Platform: Web
  • Types of commentary: direct criticism, electioneering, rhetoric of failure
  • Type of gameplay: non videogame
  • Subjects: elections, wealth, famous people

A stripped-down representation of Monopoly, where you start out with $40,000, labeled as the average household income, and roll the dice to move around a board filled with properties owned by Kerry, which inevitably bankrupt you. The gameplay is a bit weak, but I suppose the point that Kerry is rich comes across. The original site is no longer up, so the link is to the Internet Archive's copy, which is somewhat broken unfortunately.

Escape from Woomera

A Half-Life mod that puts you inside Australia's controversial Woomera Detention Centre for asylum applicants. You can try to get yourself asylum; navigate the daily routines of what's essentially prison life; or try to escape from the razor-wire compound. Of course, you can't get asylum, and you can't escape either. The inevitable gameplay failure highlights the no-win situation asylum applicants are put in, and the portrayal of prison-like conditions aims to highlight to the game-playing public why they should oppose the detention center (the game is unapologetically part of an anti-Woomera campaign). Got a good bit of media coverage during development.


  • URL:
  • Author: Persuasive Games
  • Date: September 2004
  • Platform: Web
  • Types of commentary: electioneering, pointing out tradeoffs
  • Types of gameplay: resource management, whack a mole
  • Subjects: elections, activism

A political game commissioned by the Democratic Congressional Campaign Committee (DCCC) for the 2004 elections. You allocate your 10,000 activists between six public-policy areas, and that allocation combined with some whack-a-mole gameplay on your part affects the three factors of money, peace, and quality of life. Mainly a simulation and encouragement of activism itself and promoting the view that there are lots of ways to balance priorities rather than direct commentary on any of the six policy areas (though some references to Democratic policy proposals are thrown in). It also allows players to share their "activism plans", which are indexed by demographic information.

Take Back Illinois

  • URL:
  • Author: Persuasive Games
  • Date: September 2004
  • Platform: Web
  • Types of commentary: electioneering, pointing out tradeoffs
  • Types of gameplay: resource management, sim
  • Subjects: elections, health care, tort reform, education, economy, activism

A political game commissioned by the Illinois state Republican Party for the 2004 election. Actually a sequence of four games, released one per week, about medical malpractice reform, education, participation, and economic development. All except for the participation game have resource-management gameplay, where the game's rhetoric is built into the simulation rules, giving a particular view of the effects of various policy choices. The participation game is a bit different, and has you going around in a simulated world to involve people in politics.

A few comments from designer Ian Bogost at the Persuasive Games page on the game and Water Cooler Games.


  • URL:
  • Author: Gonzalo Frasca
  • Date: October 2004
  • Platform: Web
  • Type of commentary: electioneering
  • Type of gameplay: non videogame
  • Subject: elections

A game created for the October 2004 Uruguayan presidential elections, commissioned by the left-wing coalition Frente Amplio - Encuentro Progresista. You put together tiles to complete a puzzle that shows positive, uplifting images of Uruguay's future. Some commentary from BBC News is available here. The game itself doesn't seem to be online anymore; let me know if you know of or have a copy.



A Monopoly-inspired board game (though on the computer) with inverted goals: you compete to destroy the economy and tear down the properties, satirizing the ongoing conflict and mismanagement of Robert Mugabe in Zimbabwe. Available for the Mac.

Airport Insecurity

A parodic simulation of airport security practices. Models 138 airports, with varying degrees of inconvenience that nonetheless fail to provide very good security. Mocks your annoying fellow travelers for good measure. Sells for $3.99 on certain Nokia mobile phones, so you can play it while in a security line.


Darfur is Dying

  • URL:
  • Author: USC students
  • Date: April 2006
  • Platform: Web
  • Types of commentary: direct criticism, rhetoric of failure
  • Types of gameplay: resource management, sim
  • Subjects: international affairs, war, human rights

A game designed to raise awareness about the humanitarian situation in Darfur. There are two gameplay modes: In one, you try to manage a camp, which is short on resources and under constant threat of attack. In the other, you send children to go fetch water, amidst threat of attack and abduction. The main point is that the situation is hopeless and intolerable without outside assistance. Winner of a digital-activism competition that led to it being distributed by mtvU.

There are writeups at Gameology and at Serious Games Source.

Airport Security

Another parodic simulation of airport security practices, from the makers of Airport Insecurity, but this time from the perspective of the security personnel who have to enforce absurd changing regulations. Released shortly after a rule change banned liquids in carry-on luggage at U.S. airports.

So You Think You Can Drive, Mel?

You play Mel Gibson trying to drive his car while getting increasingly drunk, without running over state troopers or getting hit by the Stars of David that Hasidic Jews on the side of the road throw at you. Mocks an incident in which Gibson was pulled over for drunk driving and went on a tirade about Jews. The editorial content isn't particularly strong, though Zach Whalen over at Gameology thinks it does have a reasonable amount.

Bacteria Salad

This game was released shortly after a bagged-spinach recall in the U.S. due to E. coli contamination, which affected a surprisingly large number of retail brands due to common sourcing. The game has you manage a farm operation, and trades off running separate small farms (less profitable, but less spread of contamination) versus consolidating them into a gigantic farm (very profitable, but contamination spreads more easily). The actual active gameplay is whack-a-mole style cleaning up of contamination and issuing recalls before anyone gets sick.


Food Import Folly

Released shortly after a contaminated food import scandal in the U.S., this game puts you in an impossible role inspecting food imports with few resources. The gameplay is whack-a-mole style, where you click on imports as they come in to inspect them before a contaminated one can get through, but you have only one guy, who takes some time to inspect a shipment and can't do anything else in the meantime. So the game mainly consists of sitting there losing regardless of what you do—a good example of what designer Ian Bogost calls a "rhetoric of failure". Also notable for being the first "playable editorial cartoon" published by a major newspaper (The New York Times).

Operation: Pedopriest

  • URL:
  • Author: Molleindustria
  • Date: June 2007
  • Platform: Web
  • Types of commentary: direct criticism, satire, rhetoric of failure
  • Type of gameplay: sim
  • Subjects: bureaucracy, religion, human rights

Based on a late-2006 BBC documentary alleging the Vatican had a secret process to deal with priest sex-abuse allegations quietly, this game puts you in the role of trying to protect children from pedophilic priests while also warding off police, parents, and so on. It turns out not to be possible to succeed at protecting the children, though just protecting the priests is possible.

Points of Entry

  • URL: (dead link)
  • Author: Persuasive Games
  • Date: June 2007
  • Platform: Web
  • Type of commentary: satire
  • Type of gameplay: sim
  • Subjects: immigration, bureaucracy

A commentary on a proposed system that would give prospective U.S. immigrants points based on various criteria such as job status, age, English skills, and so on. You play an immigration clerk who has to adjust the stats of a prospective immigrant so that they're better than those of the clerk next to you, but by as small a margin as possible. It's also timed. The scenario seems a bit silly, but it succeeds in making you learn what the proposed point allocations are, in addition to portraying the process as arbitrary and bureaucratic.

Presidential Pong

A fairly simple game, but the first newsgame published by CNN, in which presidential debates are played out as a game of pong. Could be interpreted as a fairly boring shallow skin on pong, or as satirizing the quality and level of earnestness of presidential debates.


  • URL:
  • Author: INM Inter Network Marketing
  • Date: October 2007
  • Platform: Web
  • Types of commentary: direct criticism, electioneering, rhetoric of failure
  • Types of gameplay: classic arcade, whack a mole
  • Subjects: elections, immigration, taxes

An almost comically racist game by the Swiss People's Party (SVP). Has good production values and made some news, so probably one of the more successful political games. It features the party's mascot, Zottel, a very Swiss goat, facing off in four games against abuse of naturalization, illegal immigration, EU tax collectors, and federal government waste. The party had previously courted controversy with a poster that showed Zottel kicking out a black sheep, and the theme reappears here, where in one game you need to keep the black sheep off Switzerland's green pastures without harassing the friendly white sheep. Oh, and watch out for the dastardly Green Party, trying to smuggle illegal immigrants in their party buses! The game seems to have disappeared from the internet, but there are pretty good writeups here and here. Let me know if you have or know of an archived copy.

Matt Blunt Document Destroyer

  • URL:
  • Author: The Democratic Party
  • Date: November 2007
  • Platform: Web
  • Types of commentary: direct criticism, electioneering
  • Type of gameplay: whack a mole
  • Subject: corruption

A whack-a-mole game where you try to stop Missouri governor Matt Blunt from deleting emails, apparently as a response to some sort of email-deleting scandal. The editorial content is rather weak.



  • URL:
  • Author: Conor O'Kane
  • Date: January 2008
  • Platform: Standalone executable
  • Type of commentary: satire
  • Type of gameplay: classic arcade
  • Subject: whaling

A shmup billed as a "Japanese Cetacean Research Simulator". The political commentary is of course in the discontinuity between the billing (cetacean research) and the actual gameplay (a whale-harpooning shmup), paralleling the disconnect between the official and actual purposes of Japan's whaling program. Available for Windows.

I Can End Deportation

  • URL:
  • Author: Breakthrough
  • Date: February 2008
  • Platform: Standalone executable
  • Types of commentary: direct criticism, rhetoric of failure
  • Types of gameplay: sim, quiz
  • Subjects: immigration, bureaucracy

A game released amidst ongoing debate over immigration reform in the United States that aims to highlight the brokenness of the immigration and enforcement systems, and raise some sympathy for immigrants' situation. Gameplay takes place in an open world where immigrants need to carry on with life while avoiding police and making various choices, some of which are presented explicitly in pop-up quiz type boxes, which then give vaguely didactic correct answers to try to educate the player on the complexities of immigration law. Some gameplay rules make rhetorical points as well, such as the results of immigration trials being basically random.

There's an interesting exchange about the game at Water Cooler Games: Ian Bogost posts an extended critique that gives it a mostly lukewarm review, and lead designer Heidi Boisvert responds, also at some length (fifth comment down).

Available for the Mac and Windows.

Police Brutality

A response to the "Don't tase me, bro!" incident in which a student protestor at a John Kerry event was tasered. You organize students by knocking them out of their torpor and blocking the police. Sort of the opposite of the more common "rhetoric of failure", in which an impossible-to-win game points out the impossibility of a situation—here a possible-to-win game aims to emphasize the possibility of successfully organizing and resisting the police in such a situation.

Available for Windows.

Sevan Janiyan (sevan)

Heads up for RSS subscribers May 19, 2020 11:40 AM

I’m going to be experimenting with migrating from WordPress to Hugo this week, if you subscribe to the RSS feeds on this site and wish to continue to do so, you might want to check everything is ok at your end after Monday the 25th. One of the key factors of migrating to Hugo …

May 18, 2020

Jan van den Berg (j11g)

Impatient Optimist: Bill Gates in His Own Words – Lisa Rogak May 18, 2020 12:06 PM

I have a lot of respect for Bill Gates and tend to follow what he does. So this book, just like the one on Steve Jobs, is a nice reminder of the man’s personality and his thinking process.

As it spans some 30+ years, there are mild variations noticeable, but overall what you see is what you get with Bill Gates, and that is head-on, rational straightforwardness and a passion for software.

Impatient Optimist: Bill Gates in His Own Words – Lisa Rogak (2012) – 160 pagina’s

The post Impatient Optimist: Bill Gates in His Own Words – Lisa Rogak appeared first on Jan van den Berg.

iSteve – George Beahm en Wim Zefat May 18, 2020 12:05 PM

This is a book just with quotes from late Apple founder Steve Jobs. I already knew most of them, having read more than one book about Steve Jobs. Nonetheless, seeing his most salient quotes in one place is a good indication and reminder of the man’s personality and vision.

iSteve – George Beahm en Wim Zefat (2011) – 160 pagina’s

Since the quotes are all dated I particularly noticed 3 types of Steve.

  • The brash, cocky, young Steve (everything up until 1985, before his Apple exit)
  • The reflective, contemplating Steve (from 1985 – 2000 the in-between NeXT/Pixar years)
  • The seasoned, wise Steve (2000 – 2011)

You can probably date the quotes based on their spirit to one of these three periods.

The timeline after the quotes was a great plus for this book, as well as the references! However, this book was not without mistakes: there never was an iPhone 4GS (a 4S, sure), and the iPod was introduced on October 23, 2001 (not in November).

The post iSteve – George Beahm en Wim Zefat appeared first on Jan van den Berg.

May 17, 2020

Jeff Carpenter (jeffcarp)

Machine Learning Reference May 17, 2020 08:49 PM

I often need to look up random bits of ML-related information. This post is currently a work-in-progress attempt to collect common machine learning terms and formulas into one central place. I plan on updating this post as I come across further useful pieces of information. Learning Resources: Google’s ML Crash Course; Neural Networks and Deep Learning by Michael Nielsen. Problem Framing: Classification: a type of ML model that attempts to sort an input into a discrete set of output classes.

Derek Jones (derek-jones)

Happy 60th birthday: Algol 60 May 17, 2020 08:40 PM

Report on the Algorithmic Language ALGOL 60 is the title of a 16-page paper appearing in the May 1960 issue of the Communications of the ACM. Probably one of the most influential programming languages, and a language that readers may never have heard of.

During the 1960s there were three well known, widely used, programming languages: Algol 60, Cobol, and Fortran.

When somebody created a new programming language, Algol 60 tended to be their role-model. A few of the authors of the Algol 60 report cited beauty as one of their aims, a romantic notion that captured some users’ imaginations. Also, the language was full of quirky, out-there features; plenty of scope for pin-head discussions.

Cobol appears visually clunky, is used by business people and focuses on data formatting (a deadly dull, but very important issue).

Fortran spent 20 years catching up with features supported by Algol 60.

Cobol and Fortran are still with us because they never had any serious competition within their target markets.

Algol 60 had lots of competition and its successor language, Algol 68, was groundbreaking within its academic niche, i.e., not in a developer useful way.

Language family trees ought to have Algol 60 at, or close to their root. But the Algol 60 descendants have been so successful, that the creators of these family trees have rarely heard of it.

In the US the ‘military’ language was Jovial, and in the UK it was Coral 66, both derived from Algol 60 (Coral 66 was the first language I used in industry after graduating). I used to hear people saying that Jovial was derived from Fortran; another example of people citing the popular language they know.

Algol compiler implementers documented their techniques (probably because they were often academics); ALGOL 60 Implementation is a real gem of a book, and still worth a read today (as an introduction to compiling).

Algol 60 was ahead of its time in supporting undefined behaviors 😉 Such as: “The effect, of a go to statement, outside a for statement, which refers to a label within the for statement, is undefined.”

One feature of Algol 60 rarely adopted by other languages is its parameter passing mechanism, call-by-name (now that lambda expressions are starting to appear in widely used languages, call-by-name has a kind-of comeback). Call-by-name essentially has the same effect as textual substitution. Given the following procedure (it’s not a function because it does not return a value):

procedure swap (a, b);
   integer a, b;
begin
   integer temp;
   temp := a;
   a := b;
   b := temp
end

the effect of the call: swap(i, x[i]) is:

  temp := i;
  i := x[i];
  x[i] := temp

which might come as a surprise to some.

Needless to say, programmers came up with ‘clever’ ways of exploiting this behavior; the most famous being Jensen’s device.
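Call-by-name is easy to simulate in any language with closures by passing each argument as a pair of getter/setter thunks that re-evaluate the caller's expression on every access. Here is a minimal Python sketch (my own illustration, not from the report or the post) of both the swap surprise and Jensen's device:

```python
# Simulating Algol 60 call-by-name in Python (illustrative sketch).
# Each by-name argument becomes a pair of closures: a getter and a
# setter that re-evaluate the caller's expression on every access.

def swap(a_get, a_set, b_get, b_set):
    temp = a_get()   # temp := a
    a_set(b_get())   # a := b
    b_set(temp)      # b := temp  (b's expression is re-evaluated *now*)

cell = {"i": 1}      # the variable i, boxed so closures can assign to it
x = [10, 2, 30]

# The call swap(i, x[i]) from the article:
swap(lambda: cell["i"],    lambda v: cell.__setitem__("i", v),
     lambda: x[cell["i"]], lambda v: x.__setitem__(cell["i"], v))

print(cell["i"], x)  # prints: 2 [10, 2, 1] -- not a swap at all, because
                     # x[i] was re-evaluated after i had already changed

# Jensen's device: a summation procedure whose term is passed by name.
def sigma(i_set, lo, hi, term):
    total = 0
    for v in range(lo, hi + 1):
        i_set(v)          # i := v
        total += term()   # term() re-evaluates e.g. x[i]**2 each iteration
    return total

s = sigma(lambda v: cell.__setitem__("i", v), 0, 2,
          lambda: x[cell["i"]] ** 2)
print(s)             # prints: 105  (10**2 + 2**2 + 1**2 over x = [10, 2, 1])
```

The boxing of `i` in a dict is only needed because Python closures cannot rebind an outer local directly; in Algol 60 the compiler generated the equivalent thunks automatically.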

The follow example of go to usage appears in: International Standard 1538 Programming languages – ALGOL 60 (the first and only edition appeared in 1984, after most people had stopped using the language):

go to if Ab  c then L17
  else g[if w  0 then 2 else n]

Orthogonality of language use won out over the goto FUD.

The Software Preservation Group is a great resource for Algol 60 books and papers.

Ponylang (SeanTAllen)

Last Week in Pony - May 17, 2020 May 17, 2020 03:54 PM

Pony 0.35.0 and 0.35.1 have been released! Stable, our little dependency manager that could, has been deprecated in favor of our new dependency manager, Corral.

May 16, 2020

asrpo (asrp)

Speed improvements using hash tables May 16, 2020 03:11 PM

I wrote the Forth Lisp Python Continuum (Flpc)'s self-hosted compiler in stages. When I completed the parser and gave it larger and larger pieces of its own source code, it was running too slow. I tried many things to speed it up; one that helped was using hash tables.

They helped make dictionaries [names] which can

An example of a dictionary:
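The excerpt is truncated here, but the gist translates to any language: a name table backed by a hash table replaces a linear scan per lookup with an expected constant-time probe. A rough Python sketch of the idea (mine, not Flpc's actual code):

```python
# Two toy name tables mapping identifiers to values: one does a linear
# scan over a list of pairs, the other uses a dict (a hash table).

class ScanTable:
    def __init__(self):
        self.entries = []            # list of (name, value) pairs

    def define(self, name, value):
        self.entries.append((name, value))

    def lookup(self, name):          # O(n): walks every entry
        for n, v in self.entries:
            if n == name:
                return v
        raise KeyError(name)

class HashTable:
    def __init__(self):
        self.entries = {}            # Python dicts are hash tables

    def define(self, name, value):
        self.entries[name] = value

    def lookup(self, name):          # expected O(1) per lookup
        return self.entries[name]

# Same observable behaviour; very different cost as the table grows,
# which is what matters when a compiler resolves thousands of names.
for Table in (ScanTable, HashTable):
    t = Table()
    for i in range(1000):
        t.define(f"name{i}", i)
    assert t.lookup("name999") == 999
```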

May 15, 2020

Jeremy Morgan (JeremyMorgan)

What Is Deno and Why Is Everyone Talking About It? May 15, 2020 03:48 PM

Deno is a hot new runtime that may replace Node.js. Everyone’s talking about it like it’s the next big thing. It likely is. Here’s why. What Is Deno? From the manual: Deno is a JavaScript/TypeScript runtime with secure defaults and a great developer experience. It’s built on V8, Rust, and Tokio. Deno is designed to be a replacement for our beloved Node.js, and it’s led by Ryan Dahl, who started the Node.

May 14, 2020

Andrew Montalenti (amontalenti)

Python 3 is here and the sky is not falling May 14, 2020 08:38 PM

James Bennett, a long-time Python developer, blogger, and contributor to Django, recently wrote a nice post about the “end” of Python 2.x, entitled “Variations on the Death of Python 2.” It’s a great read for anyone who, like me, has been in the Python community a long time.

I’ve been a Python user since the early 2.x days, first discovering Python in a print copy of Linux Journal in the year 2000, where a well-known open source developer and advocate described his transition from Perl to Python. He wrote:

I was generating working code nearly as fast as I could type. When I realized this, I was quite startled.

An important measure of effort in coding is the frequency with which you write something that doesn’t actually match your mental representation of the problem, and have to backtrack on realizing that what you just typed won’t actually tell the language to do what you’re thinking. An important measure of good language design is how rapidly the percentage of missteps of this kind falls as you gain experience with the language. When you’re writing working code nearly as fast as you can type and your misstep rate is near zero, it generally means you’ve achieved mastery of the language.

But that didn’t make sense, because it was still day one and I was regularly pausing to look up new language and library features!

This was my first clue that, in Python, I was actually dealing with an exceptionally good design.

Python’s wonderful design as a language has always been a source of inspiration for me. I even wrote “The Elements of Python Style”, as an ode to how good Python code, to me, felt like good written prose. And, of course, many of my personal and professional projects are proudly Python Powered.


Thus, I was always a little worried about the Python 2 to 3 transition. I was concerned that this one big risk, taken on by the core team, could imperil the entire language, and thus the entire community. Perl 5 had embarked on a language schism toward Perl 6 (now Raku), and many believe that both communities (Perl 5 and Raku) became weaker as a result.

But, here we are in 2020, and Python 2 is EOL, and Python 3 is here to stay. A lot of the internet debates about Python 2 vs Python 3 (like this flame war on now seem to boil down to this question: was Python 3 a good idea, in retrospect?

Python 3 caused a split in the Python community. It caused confusion about what Python actually is. It also caused a lot of porting pain for many companies, and a decade-long migration effort among major open source libraries.

If you are a relatively recent Python user and have not heard much about the Python 2 vs 3 community-wide migration effort, you can get pointers to some of the history and technical details in this wiki page. There’s a nice tl;dr summary in this Python 3 Q&A. To understand some of the porting pain involved, a somewhat common 2-to-3 porting approach is covered in the official Python 2 to 3 porting guide.

Regardless of the amount of pain caused, ultimately, Python 3 is here now. It works, it’s popular, and the Python core team has officially made its final Python 2.7 release. The community has survived the transition. To quote the Python release manager in a recent email announcing the release of Python 2.7.18:

Python 2.7.18 is a special release. I refer, of course, to the fact that “2.7.18” is the closest any Python version number will ever approximate e, Euler’s number. Simply exquisite! A less transcendent property of Python 2.7.18 is that it is the last Python 2.7 release and therefore the last Python 2 release. It’s time for the CPython community to say a fond but firm farewell to Python 2. Users still on Python 2 can use e to compute the instantaneously compounding interest on their technical debt.

Ubuntu 20.04 LTS — the “Long-Term Support” April 2020 release of one of the world’s most popular desktop and server Linux distributions — includes Python 3.8.2 by default, and includes Python 2.7.18 only optionally (via the python2 and python2-dev packages), for compatibility with old scripts.

On Ubuntu 20.04, you can install a package, python-is-python3, to ensure that the python interpreter and associated commands on your Linux machine run Python 3, which means you can then only access Python 2.7 via commands like python2, pydoc2, pdb2, and so on. The default download links on are for Python 3.x. If a Windows 10 user runs python on a stock install, they're directed to the Windows Store to install Python 3.x. We can only assume the same default will be coming to Mac OS X soon.

Support for Python 2.7 and Django 1.11 ends in 2020, according to the Django project FAQ. Major Python open source project maintainers — such as those behind TensorFlow, scikit-learn, pandas, tornado, PyTorch, PySpark, IPython, and NumPy — have signed a pledge to drop Python 2 support from their projects in 2020.

The Python library “wall of shame” has become a “wall of superpowers”. And it is no longer maintained, since its mission has been accomplished.

So, it’s worth asking some questions.

Now that we’re here, is there any point in resisting any longer? Short answer: no. Python 3 is here to stay, and Python 2 is really, truly end-of-life.

Will Python 4.x ever happen, creating a disruptive 3-to-4 transition akin to 2-to-3? Short answer: no. That’s very unlikely. To quote Brett Cannon:

We will never do this kind of backwards-incompatible change again. We have decided as a team that a change as big as unicode/str/bytes will never happen so abruptly again. […] We don’t see any shortcomings in the fundamental design of the language that could warrant the need to make such a major change.

What can we learn from this experience? What has the grand experiment in language evolution taught us? Short answer: that big changes like this always take longer than you think, even when you take into account Hofstadter’s Law. Breaking backwards compatibility in a large inter-connected open source community has real cost that will test the strength of that community.

We are fortunate that Python’s community was very strong indeed at the time of this transition, and it even grew rapidly during the transition, thanks to the explosion of Python-based web programming (via Django, Flask, Tornado, etc.), numerical computing (via PyData libraries, like NumPy, Pandas), machine learning (via scikit-learn, TensorFlow, PyTorch, etc.), distributed computing (via PySpark, Dask, etc.), cloud computing (boto, google-cloud, libcloud). The network effects driven by these popular communities were like a perpetual motion machine that ensured Python’s adoption and the freshness of libraries with regard to Python 3 support.

The community is even learning to evolve beyond the direction of its creator, the BDFL, who resigned in 2018 and laid the groundwork for a smooth transition to a Python Steering Council.

So, here we are. Where do we go from here? Can the Python community continue to evolve in a positive direction atop the foundation of Python 3.x? Short answer: yes!

Python has never been healthier, and the community has learned many lessons.

So, let’s get on with it! If you’ve been holding back on using Python 3 features for some time, you can get a nice summary from this Python 3 feature round-up, which goes up through Python 3.6. Then, you can check out the official “What’s New” guides for 3.7, 3.8, and 3.9. Some of my favorite new features include:

So, what are you waiting for? It’s time to get hacking! Here’s to the next Python releases, 3.9 and 3.10 (not 4.0)! And to 3.11, 3.12, 3.13, … thereafter!

Pete Corey (petecorey)

Adding More Chords to Glorious Voice Leader May 14, 2020 12:00 AM

Prior to its latest release, Glorious Voice Leader only let you choose from a pitifully small selection of chords to build a progression with. For a tool whose primary purpose is to guide you through the wondrous world of guitar harmony, this was inexcusable.

Glorious Voice Leader was in dire need of more chord types.

That said, faced with the enormous data entry task of manually adding every chord quality I could think of (of which, here are a few), my programmer instincts kicked in. “Music theory is an organized, systematic area of study,” I told myself. “There has to be a way to algorithmically generate all possible chord qualities,” my naive past self believed.

What a poor fool.

What’s the Problem?

I’ve written and re-written this post a handful of times now, and each time I’ve failed to sufficiently capture the complexity of the task at hand. Regardless of the direction I’ve tackled it from, be it generating names directly and inferring notes from that name, or inferring a name from a collection of notes, this is an incredibly complicated problem.

Music is art, and music theory exists to describe it. And unfortunately for me, people have been describing music in various ways for a very long time. This means that music theory is deeply cultural, deeply rooted in tradition, and not always as systematic as we’d like to believe it to be.

The first thing we need to do when coming up with “all possible chord qualities” is deciding which tradition we want to follow. For the purposes of Glorious Voice Leader, I’m largely concerned with the jazz tradition of chord naming, which has largely evolved to describe chords used in modern popular music.

But even within a single niche, ambiguities and asymmetries abound!

A “maj7/6” chord has the same notes as a “maj13 no 9”, assuming your “maj13” chords don’t have an 11th.

Some folks assume that a “maj13” chord includes a natural 11th. Some assume it includes a sharpened 11th. Others still assume that the 11th is omitted entirely from a “maj13” chord.

Is “aug9” an acceptable chord name, or should it be “9#5”? Both qualities share the same set of notes, and both should be understandable to musicians, but only the latter is the culturally accepted name.

Speaking of alterations like “#5” and “b9”, which order should these appear in the chord name? Sorted by the degree being altered? Or sorted by importance? More concretely, is it a “7b9#5” chord, or a “7#5b9”?

Many notes in a chord are optional, including the root note! A Cmaj13 without a 1st or 5th is perfectly acceptable. Even the third can be optional. But is a Cmaj13 without a 1st, 3rd, and 5th still a Cmaj13? At what point does a chord with missing notes cease to be that chord?

The subtleties and nuances go on and on.

A More Human Approach

Rather than fully automating the generation of chord qualities and names through algorithmic means, I decided to take a more human approach. I start with a large set of human-accepted chord formulas and their corresponding names:

const baseQualities = [
  ["maj", "1 3 5"],
  ["maj6", "1 3 5 6"],
  ["maj/9", "1 3 5 9"],
  ["maj6/9", "1 3 5 6 9"],
  ["maj7", "1 3 5 7"],
  // …
];

From there, we can modify our formulas to specify which notes in the chord are optional. It’s important to note that when specifying optional notes, any or all of those notes may be missing and the name must still make sense.

const baseQualities = [
  ["maj", "1 3 5"],
  ["maj6", "1 3 (5) 6"],
  ["maj/9", "1 3 (5) 9"],
  ["maj6/9", "(1) 3 (5) 6 9"],
  ["maj7", "(1) 3 (5) 7"],
  // …
];

So a chord with a formula of “1 3 5 6 9”, “3 5 6 9”, “1 3 6 9”, or “3 6 9” can still be considered a “maj6/9” chord.
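That expansion step is, at its core, an enumeration of subsets of the optional degrees. A compact Python sketch of the idea (my illustration, not Glorious Voice Leader's actual lodash code; it also skips stripping “#”/“b” from the “no” labels):

```python
from itertools import combinations

def expand(name, formula):
    """Expand one formula with (optional) degrees into all variants."""
    tokens = formula.split()
    degrees = [t.strip("()") for t in tokens]
    optionals = [t.strip("()") for t in tokens if t.startswith("(")]
    variants = []
    # Every subset of the optionals may be missing, including the empty set.
    for r in range(len(optionals) + 1):
        for missing in combinations(optionals, r):
            kept = [d for d in degrees if d not in missing]
            suffix = " ".join(f"no {m}" for m in missing)
            variants.append((f"{name} {suffix}".strip(), " ".join(kept)))
    return variants

for name, f in expand("maj6/9", "(1) 3 (5) 6 9"):
    print(f"{name}: {f}")
# prints:
# maj6/9: 1 3 5 6 9
# maj6/9 no 1: 3 5 6 9
# maj6/9 no 5: 1 3 6 9
# maj6/9 no 1 no 5: 3 6 9
```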

For every [name, formula] pair, we'll tease out the full set of scale degrees and the set of optionals. From there, we find all _.combinations of those optionals that we'll remove from the list of degrees. For each combination, the degrees minus the missing degrees form a new formula, and the name is extended to specify which degrees are missing:

export const qualities = _.chain(baseQualities)
  .flatMap(([name, formula]) => {
    // Reconstructed from the truncated original; the split and
    // difference steps are my best guess at the missing pieces.
    let degrees = _.chain(formula)
      .split(" ")
      .map(degree => _.trim(degree, "()"))
      .value();
    let optionals = _.chain(formula)
      .split(" ")
      .filter(degree => _.startsWith(degree, "("))
      .map(degree => _.trim(degree, "()"))
      .value();
    // Note: _.combinations is not core lodash; it comes from a mixin.
    return _.chain(_.range(_.size(optionals) + 1))
      .flatMap(i => _.combinations(optionals, i))
      .map(missing => {
        let remaining = _.difference(degrees, missing);
        let result = {
          name: _.chain(missing)
            .map(degree => `no ${_.replace(degree, /#|b/, "")}`)
            .join(" ")
            .thru(missingString => _.trim(`${name} ${missingString}`))
            .value(),
          formula: _.join(remaining, " "),
          degrees: remaining,
          missing
        };
        result.value = JSON.stringify(result);
        return result;
      })
      .value();
  })
Some formulas have so many optional notes that the removal of enough of them results in a chord with less than three notes. We don’t want that, so we’ll add one final filter to our qualities chain:

export const qualities = _.chain(baseQualities)
  .flatMap(([name, formula]) => { ... })
  .reject(({ degrees }) => _.size(degrees) < 3)
  .value();

And that’s all there is to it.

Final Thoughts

From a set of eighty-two baseQualities that I entered and maintain by hand, this algorithm generates three hundred twenty-four total qualities that users of Glorious Voice Leader are free to choose from.

This list is by no means exhaustive, but with this approach I can easily change and add to it, without concern for the oddities and asymmetries of how actual humans name the chords they play.

A part of me still believes that an algorithmic approach can generate chord quality names that fall in line with human expectations, but I haven’t found it. I imagine this is one of those problems that will live in the back of my mind for years to come.

May 13, 2020

Gustaf Erikson (gerikson)

March May 13, 2020 10:10 AM

May 12, 2020

Jan van den Berg (j11g)

I, Robot – Isaac Asimov May 12, 2020 03:33 PM

If one writer is responsible for how we think about robots it is, of course, Isaac Asimov. The terrifically prolific writer and groundbreaking author of the science-fiction genre produced numerous works with terrific futuristic insight — and some were about robots. And I, Robot is a seminal work in this oeuvre. But this book is of course not really about robots, or the famous laws of robotics.

I, Robot – Isaac Asimov (1950) – 271 pages

No, this law is a vehicle, for these 9 loosely connected stories to present — very clever — logical puzzles often with a philosophical or ethical undertone. And this is what makes this work hold up, even after 70 years (this was written in 1950 🤯).

Our views on robots might have changed but the questions remain valid. And it is not so much the robots that Asimov makes us think about, but what it means to be human.

The post I, Robot – Isaac Asimov appeared first on Jan van den Berg.

May 11, 2020

Jeremy Morgan (JeremyMorgan)

Setting Up Pop!_OS for Front End Development May 11, 2020 08:29 PM

If you’ve heard all the chatter lately about Pop!_OS and have been wanting to try it out, here’s a good guide to setting up a front end development machine. If you’re relatively new to Linux and are just trying it out, I suggest building this in a Virtual Machine. I have the full instructions for installing Pop!_OS in a virtual machine here. This is the best way to dip your toes in without significant modifications to your computer.

Bogdan Popa (bogdan)

Continuations in Racket's Web Server May 11, 2020 08:55 AM

In The Missing Guide to Racket’s Web Server, I said that dispatch/servlet is equivalent to: (lambda (start) (lambda (conn req) (output-response conn (start req)))) That was an oversimplification. It does apply its start argument to incoming requests and it does take care of writing the responses to the appropriate connections, but it has another important job: to handle responses returned from continuations and to dispatch incoming requests to captured continuations.

May 10, 2020

Derek Jones (derek-jones)

Having all the source code in one file May 10, 2020 10:27 PM

An early, and supposedly influential, analysis of the Coronavirus outbreak was based on results from a model whose 15,000 line C implementation was contained in a single file. There has been lots of tut-tutting from the peanut gallery, about the code all being in one file rather than distributed over many files. The source on Github has been heavily reworked.

Why do programmers work with all the code in one file, rather than split across multiple files? What are the costs and benefits of having the 15K of source in one file, compared to distributing it across multiple files?

There are two kinds of people who work with code all in one file, novices and really capable developers. Richard Stallman is an example of a very capable developer who worked using files containing huge amounts of code, as anybody who has looked at the early sources of gcc will know all too well.

The benefit of having all the code in one file is that it is easy to find stuff and make global changes. If the source is scattered over multiple files, then working on the code entails knowing which file to look in to find whatever; there is a learning curve (these days screens have lots of pixels, and editors support multiple windows with a different file in each window; I’m sure lots of readers work like this).

Many years ago, when 64K was a lot of memory, I sometimes had to do developer support: people would come to me complaining that the computer was preventing them writing a larger program. What had happened was they had hit the capacity limit of the editor. The source now had to be spread over multiple files to get over this ‘limitation’. In practice people experienced the benefits of using multiple files, e.g., editor loading files faster (because they were a lot smaller) and reduced program build time (because only the code that changed needed to be recompiled).

These days, 15K of source can be loaded or compiled in a blink of an eye (unless a really cheap laptop is being used). Computing power has significantly reduced these benefits that used to exist.

What costs might be associated with keeping all the source in one file?

Monolithic code makes sharing difficult. I don’t know anything about the development environment within which these researchers worked. If there were lots of different programs using the same algorithms, or reading/writing the same file formats, then code reuse often provides a benefit that makes it worthwhile splitting off the common functionality. But then the researchers have to learn how to build a program from multiple source files, which a surprising number are unwilling to do (at least it has always been surprising to me).

Within a research group, sharing across researchers might be possible (assuming they are making some use of the same algorithms and file formats). Involving multiple people in the ongoing evolution of software creates a need for some coordination. At the individual level it may be more cost-efficient for people to have their own private copies of the source, with savings only occurring at the group level. With software development having a low status in academia, I don’t see any of the senior researchers willingly taking on a management role for this code. Perhaps one of the people working on the code is much better than the others (it often happens), but are they going to volunteer themselves as chief dogsbody for the code?

In the world of Open Source, where source code is available, cut-and-paste is rampant (along with wholesale copying of files). Working with a copy of somebody else’s source removes a dependency, and if their code works well enough, then go for it.

A cost often claimed by the peanut gallery is that having all the code in a single file is a signal of buggy code. Given that most of the programmers who do this are novices, rather than really capable developers, such code is likely to contain many mistakes. But splitting the code up into multiple files will not reduce the number of mistakes it contains, just distribute them among the files. Correlation is not causation.

For an individual developer, the main benefit of splitting code across multiple files is that it makes developers think about the structure of their code.

For multi-person projects there are the added potential benefits of reusing code, and reducing the time spent reading other people’s code (it’s no fun having to deal with 10K lines when only a few functions are of interest).

I’m not saying that the original code is good, bad, or indifferent. What I am saying is that the having all the source in one file may, or may not, be the most effective way of working. It’s complicated, and I have no problem going with the flow (and limiting the size of the source files I write), but let’s not criticise others for doing what works for them.

Carlos Fenollosa (carlesfe)

Ponylang (SeanTAllen)

Last Week in Pony - May 10, 2020 May 10, 2020 03:03 PM

It’s been a big week in Pony world for new ponylang projects and releases. ponyc 0.34.1 has been released for users on glibc-based Linux platforms. We’ve also opened a new GitHub repo for collecting information relevant to Pony contributors at

May 09, 2020

Kevin Burke (kb)

If someone asks if you have any questions, ask a question May 09, 2020 09:05 PM

Let's revisit one of the most humiliating (and expensive) moments of my life. It happened a decade ago and even today I cringe and seethe when I think about it.

I was one of 25 finalists for a $20,000 scholarship in my junior year of college. The last step was an hour-long interview with three faculty members. I wrote down a list of every single question I thought they would ask - why do you want this scholarship, why you, etc - and rehearsed answers, recording myself, for a week straight. The interview came and went and I thought I did pretty well!

Fast forward a week and I got an email that I was not going to be offered a scholarship. Only two other students out of 25 were rejected. I was dumbfounded. There was no way I should have failed this test. I started thinking back to the interview. Some answers stand out as opportunities to improve - I could still tell you exactly what they are. There's one answer that I really wish I could get back.

At the end of the interview they asked "do you have any questions for us?" By this point I'd done so much research into the scholarship that I couldn't think of anything, and said "No." The interview ended.

But think about it from their perspective. I'd just spent an hour talking about myself; what does it show when I refuse to ask any questions? Not only am I denying them an opportunity to talk, I appear very incurious about the program itself. They didn't know that I had been quizzing myself on every aspect of the program for a week. I should have asked about anything — literally anything — and given them a chance to talk.

The funny thing was I had actually asked some of the people who had gone through the interview what they had been asked about. "Ask a question at the end" was both so obvious to them - pretty much every successful candidate had done finance interviews, where cultural signals are more important; I hadn't - and so oblivious to me that it hadn't even come up to people who wanted to help me succeed.

From that point on I made a point to always ask a question when someone asks if I have any questions. Ask anything. Even asking "what did you have for lunch" is better than asking nothing; the interviewer might start talking about whether the company pays for lunch, whether it's any good. My standby question is "what did you do yesterday" - it has a unique answer for each interviewer and reveals how people spend their time (vs. how they say they spend their time).

Finally, "person fails interview because interviewer expects to see cultural signal and interviewee does not broadcast cultural signal" is a common failure mode. Think about someone who wears a suit to a tech interview. Some organizations only want to hire people who can utter the secret words, and that's their choice.

But if your goal is to cast a wide net - I am looking at you, tech companies that put up billboards championing your commitment to diversity - and you have a candidate without a traditional background, maybe make a list of every reason you've used to reject a nontraditional candidate in the past and then email that to the candidate in advance of the interview - "wear a dress shirt and jeans," for example.1 You won't get everything, but it's a good start. (You can try to get your interviewers to discard the cultural signals but that's difficult and it might show up in their feedback without them realizing it.) Note that career services departments at good schools are already doing this for their students; what you are doing is leveling the playing field.

1 This is not a new critique by any means; people have made it about the SAT for some time - if the tests quiz applicants on vocabulary and grammar that are more commonly used in white households than nonwhite households, then identical students with identical aptitudes will score differently, which seems unfair.

May 08, 2020

Patrick Louis (venam)

Domain Driven Design Presentation May 08, 2020 09:00 PM



Basic Concept and Motivations

Concept and motivations

As software engineers, we’re used to trying to make things perfect, to seeing things from above, to thinking we’re great architects and creators. What’s more important, though, is to create software that does an important job for someone. What are the best ways to create such software?

Trying to model real-world problems might be one, but whether it’s the best way, nobody knows. What we do know is that we want to address real-world issues with our software. This is what I was talking about in my previous talk on “What programming and computing represent.”
Say, for example, that I want to create software that acts as a virtual library of books; how would I go about doing that? A naive approach would be simple document storage of names and PDF files. But who are we to say this is the way to solve it?

We don’t work in a vacuum; there’s usually already a ton of software related to this. This is a problem space we’ll become part of, a series of inter-related problems. Which part are we going to be? Are we going to integrate with pre-existing software, upgrade it, or coexist with multiple other programs? Is it an add-on we are trying to make?

This sort of reasoning and questioning is what drives domain-driven design: saying “I don’t know about this topic, and I want to learn it to be able to do my job properly.”

Two Views

Two views

There are generally two views, two sets of eyes with which we can see a problem. The first is the eyes of the engineer: narrow-minded, strict about principles, and knowing which techniques can be used to make great and extensible software. Those are the things we’ve learned over the years while making software, the good practices.
The other view is that of the expert in whatever we want to solve: a person who could talk endlessly and in great detail about their vision and their knowledge.

What we really wish for is to integrate and reflect those deep details in our software, but we usually lose this connection with the experts. We take their talk as just talk. There’s a gap between what we show them in the code, the software components, and what they discuss with us. So we have to find another way to do this; there has to be a better way.
Either we force them to understand software jargon, or we learn the actual issue we are solving and express it clearly in our code, so that from that point on the experts can give direct input to shape and choose which components should exist.

And from our side, as software developers, we should still try to maintain clean code; we should combine both perspectives instead of shifting between two sets of eyes: a collaborative, creative effort between software experts and domain experts.

That sounds too good to be true, though; how are we going to do that?

Experts and Exploration

Experts and exploration

It’s difficult to start a conversation with experts to get information out of them. You can try sitting them down and explaining what UML is, but that doesn’t work. You can try meetings; still nothing. We need a format that helps us brainstorm and extract as much information as possible from the experts.

One such format is called event storming: a sort of free-for-all brainstorming session with sticky notes that you put on a board. That’s one way, but before that we need to get a conversation going; we can’t start from nothing. What we can usually do is ask questions about scenarios that need to be implemented and listen carefully to the language the experts use.

Language hides concepts; language is the most important thing when trying to understand a problem.
That’s one reason why we shouldn’t dismiss what experts say. We should also pay attention to the technical terms that are used. It’s too easy to dumb the language down and map it into our code as we understood it. Instead, we should use the same language the experts use.
We shouldn’t say, for example, “book code” instead of “ISBN”.

Iteratively, through discussions and brainstorming sessions, we’ll get a better idea of the problem space we are tackling. Initially we had a naive and superficial concept. Each and every word in every discussion could be a hint leading to knowledge we might have been missing. We need to pay attention to the meaning of the words themselves, especially technical ones. We need as many ideas as possible, as many brainstorms, not getting stuck on a single thread but continuing to explore.

The refactorings with the greatest impact are the ones motivated by new insights.

We’re like detectives on a continuous discovery-and-delivery track. Doesn’t that remind you of the Agile methodology? This is what it should be about: having developers involved in the business-related activities and knowledgeable about them; taking the time to sit down and understand first; finding techniques to learn the stuff the experts know.

That should also help create an area of shared common knowledge in a team, a common language that everyone can speak.

Domain and models

Domain and models

However, we can’t keep exploring forever without a goal. We have to select the important parts out of all the information we gather. How do we know something is important? It should fulfill the requirements of the stakeholders, the business case we want to solve. Otherwise, no money.

That’s why we need to map and distill this information into a model. We usually think of a UML diagram when we think of a model, but we’re not limited to that here; when we talk about a model we mean something more abstract, the concept that lies behind the UML diagram.
A model is a representation, though imperfect, of a concept. It’s an abstraction serving a purpose. It should suit the particular problem it tries to solve; a model is a tool.
Models try to reduce the gap between reality and code. But remember: the map is not the territory.

Models live in a domain, the domain being the problem space we are tackling, the stuff the expert is an expert about: the sphere of knowledge or activity.

With models, we distill the knowledge and assumptions we have about the domain; we try to find the essence and put it in a form that makes it valuable and purposeful. A model without a purpose shouldn’t exist. In the library example, a model could describe the inventory of books we have; that’s something we have to model somehow.

There are many ways to represent this, so what can we choose as a model? We have so many choices, but which one is most useful for our case, which ones should we reject, and should we try them out to see which one fits? A good method for choosing a model is to create scenarios, business scenarios that is, that the model should fulfill. Those scenarios could be written by the experts themselves; that’s what BDD (Behavior-Driven Development) is about. Here are some good steps for effective modeling.
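The talk itself doesn't include code, but the scenario idea can be sketched. Below is a minimal, hypothetical Python example (the names Inventory, add_copy, lend, and available_copies are invented for illustration, not taken from the talk): a business scenario written as an executable given/when/then check against a candidate model of the book inventory.

```python
# A hypothetical business scenario, written as an executable test, used to
# judge whether a candidate model of the book inventory is good enough.

class Inventory:
    """Candidate model: tracks how many copies of each ISBN we hold."""
    def __init__(self):
        self._copies = {}

    def add_copy(self, isbn):
        self._copies[isbn] = self._copies.get(isbn, 0) + 1

    def lend(self, isbn):
        if self._copies.get(isbn, 0) == 0:
            raise ValueError(f"no available copy of {isbn}")
        self._copies[isbn] -= 1

    def available_copies(self, isbn):
        return self._copies.get(isbn, 0)


def scenario_lending_the_last_copy():
    # Given an inventory holding one copy of a book
    inventory = Inventory()
    inventory.add_copy("978-0-0000-0000-0")
    # When a reader borrows it
    inventory.lend("978-0-0000-0000-0")
    # Then no copies remain available
    assert inventory.available_copies("978-0-0000-0000-0") == 0


scenario_lending_the_last_copy()
```

If the experts can read (or even write) the given/when/then part, the scenario doubles as a check that the model serves its purpose.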

Effective modeling

Ubiquitous Language

Ubiquitous language

The culmination of the domain, the model, and the exploration with domain experts gives rise to a common language that starts to be used everywhere. This language closes the gap between your software-developer teammates and the experts; it shows up everywhere, be it in the code components or in your daily conversations about the software. But that raises a question: ubiquitous within what? Teams are separated by projects, right?
The language, and thus the domain and its model, is limited to a bounded context.

Bounded Contexts

Bounded contexts

To be effective, models need to be unified so that there are no contradictions between them. We use the language we’ve created with the help of our domain experts to draw boundaries inside the domain space. Each context should have a unified, bounded concept, something it solves. We do that by assembling multiple models together, shopping around for them, and creating relationships between them by categorizing them into contexts.

This makes sense here, this makes sense there.

We could have a context related to shipping books and another context related to how we can manufacture books.

The language we use should show where the boundaries are. For example, if two things have the same name but mean different things, they probably live in different contexts. Or, if they are actually the same thing but have different names, then maybe we should combine them. Again, asking questions and exploring with domain experts is the way to go here.
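As a hypothetical illustration of that naming rule (Python; every name here is invented for this sketch): the word "book" carries different attributes in a shipping context and in a manufacturing context, so each context gets its own model, with the shared identity (the ISBN) as the translation point between them.

```python
from dataclasses import dataclass

# Two bounded contexts, sketched as two context-specific models.
# "Book" means different things in each, so each defines its own.

@dataclass(frozen=True)
class ShippingBook:
    """'Book' as the shipping context sees it: a parcel to deliver."""
    isbn: str
    weight_grams: int
    destination: str

@dataclass(frozen=True)
class ManufacturingBook:
    """'Book' as the manufacturing context sees it: something to print."""
    isbn: str
    page_count: int
    paper_type: str

# The shared identity (the ISBN) is the natural point of translation
# between the two contexts.
shipped = ShippingBook("978-0-0000-0000-0", 820, "Beirut")
printed = ManufacturingBook("978-0-0000-0000-0", 560, "offset")
print(shipped.isbn == printed.isbn)  # same book, two context-specific models
```

Collapsing both into one `Book` class would force each context to carry fields it doesn't care about, which is exactly the kind of contradiction bounded contexts are meant to prevent.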

Core Domain

Core domain

Sometimes, those boundaries exist because of business decisions. We care about some things more than others because they make us generate money. This is what the core domain is about.

We could categorize things into the core domain, generic domains, and supporting domains. The core is the highest-value context in our business, the thing we should exploit the most and get the most out of. A generic domain covers common concepts that aren’t core, and a supporting domain contains helpers for non-core activities.

One key aspect of bounded contexts is that they also define the relationships between teams in an organization. Different teams use different languages because they work in different contexts. The shape of the organization sometimes gives rise to the contexts. So we should be explicit about the interrelationships between those bounded contexts.

Integrity Between Bounded Contexts

Integrity between bounded contexts

Different contexts still need to interface with one another, especially when a part of the domain is common between them, something shareable. There are many ways to do this; some people may say that at this level we’re talking about enterprise architecture.

Here’s a list of methods for doing so (I won’t go into detail about them):

  • Shared Kernel (overlap allied contexts)
  • Customer/Supplier relationship (relate allied contexts)
  • Conformist (overlap unilaterally)
  • Separate ways (free teams to go)
  • Anticorruption layer (translate and insulate unilaterally)

The Blocks of DDD

The blocks of ddd

As software engineers we like to memorize things and force ourselves into strict behaviors, and as with anything, DDD has given rise to this too, even though it’s not mandatory for applying the philosophy. However, these building blocks can be helpful when we try to apply the principles in our code, using libraries that implement blocks we can map onto our domain-expert knowledge of DDD.

Those blocks are the following.

An entity is an object with an identity: something that is defined not only by its attributes but also by its role within the system. A good rule of thumb: if getting the identity of the object wrong would corrupt something, it’s probably an entity.

A value object is a container of attributes, a bag of attributes. It is treated as immutable and describes some characteristic; it has no identity. For example: Blue or Yellow. (Though beware, those might be entities in specific domains.)

An aggregate is a collection of objects bound together by a root entity. The root entity gives meaning to this cluster of objects, so the cluster can be treated as a single unit of data.

A factory is a method for creating domain objects that are too complex to construct directly.

A repository is an abstraction that encapsulates a storage mechanism and acts like a collection. It’s a way to retrieve and store domain objects.

A domain event is an object that represents something that happened in the domain.

And services are the operations that do not belong to any object.
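As a rough sketch of how a few of these blocks fit together (Python; every class and method name below is invented for illustration, not taken from any particular DDD library, and the single hard-coded currency is a simplifying assumption):

```python
from dataclasses import dataclass
import uuid

# Value object: immutable, no identity, defined only by its attributes.
@dataclass(frozen=True)
class Money:
    amount: int      # in cents, to avoid float rounding
    currency: str

# Entity: has an identity that matters independently of its attributes.
# It is also the aggregate root guarding its line items.
class Order:
    def __init__(self):
        self.id = str(uuid.uuid4())
        self.lines = []

    def add_line(self, isbn, price):
        self.lines.append((isbn, price))

    def total(self):
        # Assumes a single currency for simplicity.
        return Money(sum(p.amount for _, p in self.lines), "USD")

# Repository: looks like a collection, hides the storage mechanism.
class OrderRepository:
    def __init__(self):
        self._store = {}

    def save(self, order):
        self._store[order.id] = order

    def find(self, order_id):
        return self._store.get(order_id)

repo = OrderRepository()
order = Order()
order.add_line("978-0-0000-0000-0", Money(4500, "USD"))
repo.save(order)
print(repo.find(order.id).total())  # Money(amount=4500, currency='USD')
```

Note how the in-memory dict could be swapped for a database without touching `Order`: that separation is the point of the repository block.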



Let’s conclude.

Most software complexity lies in understanding the domain itself. That complexity is the essential kind, as opposed to the accidental kind. This is why we need to cultivate models based on the language of experts. This is what DDD teaches us.

No matter the amount of SOLID, high-quality, modular things we make, if we don’t tackle the essential complexity of our domain we won’t be solving anything.

This is all extremely important in today’s world as software gets entangled in every part of our lives.

May 07, 2020

Benjamin Pollack (gecko)

Goodbye, Twitter May 07, 2020 10:51 PM

Let’s zoom back to early April. We’re several weeks into the COVID-19 epidemic. I’m not sleeping well. In fact, a “good night” for me is just a few hours. I’ve realized a couple days ago I’m averaging about 30 to 40 hours of sleep per week. The talking heads on Fox have already begun their drumbeat about how we should reopen businesses to save the economy, despite zero economists arguing for that. The Overton window is already starting to shift from trying to avoid deaths to discussing how many hundreds of thousands of deaths are a “reasonable number” of deaths. The CDC itself will ultimately become a part of this narrative, revising their death count below what anyone else thinks so that they can say a “mere” thirty-nine thousand people have died, which they’re in turn doing because it helps Trump’s narrative when the Overton window doesn’t shift fast enough.

But I don’t even know that yet. That happens in late April. It’s still early April. What I do know, even now, is that the deaths are of course going to hit the poorest, the most disenfranchised, the people who cannot do what I’m doing and work from home. The deaths are going to hit people who have to work the grocery stores, the drug stores, the delivery services. I am unsurprised to see the death tolls comically high amongst people of color, and I’m equally unsurprised to see the talking heads “wonder” why. I’d honestly be more surprised if that weren’t the case.

I cannot make my brain stop thinking about this. Ever. So I have night after night of unending insomnia, forever, because I cannot do anything to fix any of this.

But there is one small thing I can “do”: I can scream on Twitter. I have relatively few followers, but I’m an old account, so my screaming “means something.” But that is itself complicated. Alongside the people whom I merely disagree with are the conspiracy theorists who are trying to say the US caused it. And they themselves are right alongside those saying China deliberately unleashed it. And both of them are alongside those saying the entire thing is a scam. None of these arguments make any sense; they don’t survive the first couple of questions, let alone an actual dialog. But that doesn’t matter; Twitter values all of these positions equally, and escalates them equally, as long as The Ratio is high enough.

So it’s bots against bots, all the way down. Even the people who aren’t bots are basically bots, because it’s not about facts; it’s about sides. There is zero room for nuance or discussion. You need enough people who either agree with you—or even disagree with you, as long as they elevate you—and you are going to be elevated in The Feed. It’s about retweets, and likes, and comments, and only the numbers matter; not what you said. If you tweet a middle finger emoji and you get a hundred thousand retweets, you are more valuable than the epidemiologist who tweets out an actual cure. No actual discussion happens. No one learns. No one is persuaded. Everyone is just angry.

Twitter is so caustic that I long ago had to use scripts that block many, many people. I only follow a couple hundred people, but I’ve had to block nearly forty thousand just to make Twitter tolerable, let alone hospitable. But Twitter is how I communicate. I want to stay involved. I want to stay a part of the conversation. And I have convinced myself that staying on Twitter is how I can do that. I cannot, absolutely cannot, put down this particular bullhorn. I must remain a part of the discussion. So I lay awake and don’t sleep and wonder what the debate will be tomorrow.

That was a few weeks ago.

This evening, for whatever reason, I had it. I’m done. There are many great people on Twitter, but it’s no longer worth engaging with them this way. It’s not even worth it when I’m not getting harassed, because Twitter’s algorithms—correctly, for their bottom line—are only sated when I’m angry enough to use Twitter when I’m on the toilet, and that’s a very high bar. So Twitter goes to great lengths to ensure that I’m as angry as possible all the time. And that will only capture me at my worst. No one will be persuaded, and nothing will be gained.

And I realized I’d felt this once before, about Facebook. And I deleted my account then. And I hadn’t deleted my account on Twitter because I “needed” it.

But the truth is…I don’t need Twitter. Twitter needs me.

So I’m done. If you follow me on Twitter at @gecko, please feel free to subscribe to this blog’s RSS feed, or to email me at if you want to contact me. I’m available a multitude of other ways, many in person if you live in the Raleigh/Durham/Chapel Hill area. And if you’re a real person and want to talk, I’d really like to chat with you about what you feel, and why, and about how I feel, and why, and see where we overlap and where we differ, and where we can learn from each other.

Twitter and me, though? We’re done.

May 05, 2020

Bogdan Popa (bogdan)

Using GitHub Actions to Test Racket Code (Revised) May 05, 2020 07:00 AM

A little over a year ago, I wrote about how you could use GitHub’s new-at-the-time Actions feature to test Racket code. A lot has changed since then, including the release of a completely revamped version of GitHub Actions, so I thought it was time for an update. A Basic Package Say you’re working on a Racket package for computing Fibonacci sequences. Your main.rkt module might look something like this:

May 04, 2020

Simon Zelazny (pzel)

Connecting Arduino IDE to Uno on Raspberry PI3B+ May 04, 2020 10:00 PM

Today I got my Arduino Uno working, programmable via Arduino IDE from a Raspbian installation on a Raspberry Pi3B+. I connected the Uno via USB and at first had some problems getting the board detected. Here's how I resolved the issue.

After writing a simple blink application, I tried uploading it to the connected Uno, only to get this error in the Serial Monitor window:

Serial Port 'COM1' not found

The Tools -> Serial Port option in the menu was grayed out. After scouring the internet, I figured it was likely a permissions issue. Here are the group changes I made to get things working:

 sudo gpasswd -a pi uucp
 sudo gpasswd -a pi tty
 sudo gpasswd -a pi dialout

After the changes, group membership is:

pi$ groups
pi adm tty uucp dialout cdrom sudo audio video plugdev games users input netdev gpio i2c spi

This is on Raspbian:

pi$ echo $(uname -a); cat /etc/issue
Linux raspberrypi 4.19.97-v7+ #1294 SMP Thu Jan 30 13:15:58 GMT 2020 armv7l GNU/Linux
Raspbian GNU/Linux 10 \n \l

Now, the Uno is detected successfully as /dev/ttyACM0, and can be programmed with the Arduino IDE.

Jan van den Berg (j11g)

Jitsi finetuning and customization May 04, 2020 06:26 PM

Jitsi offers a great user experience because it doesn’t require an account: you just open a URL in Chrome and you’re pretty much good to go. You get a full-blown video chat environment, complete with grid view, screen sharing, and chat options. No add-ons or third-party installations needed. I greatly prefer this to Zoom, Google Hangouts, Microsoft Teams, or what have you.

Jitsi is also a great piece of software to host, and installing and running your own video conferencing software has never been easier.

Chatting with 8 people. No problem. Emojis added for privacy (not a Jitsi feature, yet)

Here are some tips to run the Jitsi stack smoothly on your server and how to customize Jitsi Meet.


  • Put Nginx in front of Jitsi. This helps handle the bulk of the web requests. Otherwise Java will take care of this, and that is not what Java is particularly good at.
  • Use JRE11 to run jicofo and jitsi-videobridge2. The latest Debian 10 (Buster) comes with both JRE8 and JRE11. Make sure to use JRE11. This made quite a bit of difference in our tests.
ii openjdk-11-jre-headless:amd64 11.0.7+10-2ubuntu2~18.04 amd64 OpenJDK Java runtime, using Hotspot JIT (headless)
  • Always use the latest Jitsi packages. They get updated quite frequently, and you definitely want the latest. E.g. last Friday the latest release had a bug, this was fixed the same day. So make sure you always run the latest version.
root@meet01:/# cat /etc/apt/sources.list.d/jitsi-stable.list
deb stable/

We run the following packages.

ii jitsi-meet 2.0.4548-1 all WebRTC JavaScript video conferences
ii jitsi-meet-prosody 1.0.4074-1 all Prosody configuration for Jitsi Meet
ii jitsi-meet-turnserver 1.0.4074-1 all Configures coturn to be used with Jitsi Meet
ii jitsi-meet-web 1.0.4074-1 all WebRTC JavaScript video conferences
ii jitsi-meet-web-config 1.0.4074-1 all Configuration for web serving of Jitsi Meet
ii jitsi-videobridge2 2.1-197-g38256192-1 all WebRTC compatible Selective Forwarding Unit (SFU)
  • Running from packages also has the benefit that the package maintainers tune several kernel parameters specifically for video chat during installation. You definitely want this.

Other tips

With all of the above you should be good to go. The following two tips are optional and more user specific, if you still run into (bandwidth) issues.

  • Ask clients to scale their video quality to low definition. There is a server-wide setting that should theoretically be able to enforce this, but I have not been able to get it to work.
  • Use Chrome. Jitsi does not work on Safari at all, but it should work just fine on Firefox. However, it seems specifically designed for Chrome. In my experience, when everyone is on Chrome, Jitsi Meet works best.

Customizing Jitsi Meet

Every time you upgrade your Jitsi packages, all your custom changes will be overwritten. You can run a script like the one below after every upgrade to reapply your personal settings. Change the settings as appropriate for your installation.

#Run this after a Jitsi upgrade

cp -ripv own-favicon.ico /usr/share/jitsi-meet/images/favicon.ico
cp -ripv own-watermark.png /usr/share/jitsi-meet/images/watermark.png

sed -i 's/Secure, fully featured, and completely free video conferencing/REPLACE THIS WITH YOUR TITLE TEXT/g' /usr/share/jitsi-meet/libs/app.bundle.min.js

sed -i 's/Go ahead, video chat with the whole team. In fact, invite everyone you know. {{app}} is a fully encrypted, 100% open source video conferencing solution that you can use all day, every day, for free — with no account needed./REPLACE THIS WITH YOUR OWN WELCOME TEXT/g' /usr/share/jitsi-meet/libs/app.bundle.min.js

sed -i 's/Start a new meeting/REPLACE THIS WITH YOUR OWN TEXT/g' /usr/share/jitsi-meet/libs/app.bundle.min.js

sed -i 's/' /usr/share/jitsi-meet/interface_config.js

sed -i 's/Jitsi Meet/YOUR OWN TITLE/g' /usr/share/jitsi-meet/interface_config.js

/etc/init.d/jicofo restart && /etc/init.d/jitsi-videobridge2 restart && /etc/init.d/nginx restart && /etc/init.d/prosody restart

The post Jitsi finetuning and customization appeared first on Jan van den Berg.

Benaiah Mischenko (benaiah)

Everything You Didn’t Want to Know About Lua’s Multi-Values May 04, 2020 07:00 AM

Everything You Didn’t Want to Know About Lua’s Multi-Values

I’ve been spending a lot of time lately writing Fennel and contributing to its compiler. Since it’s a language that compiles to Lua, that also means spending a lot of time getting to know Lua. This article will be focused on Lua (though I’ll address Fennel from time to time in the footnotes).

Lua is a neat, elegant, and relatively simple language. I find it particularly notable for its embeddability into other programs; capability of hosting many different styles of programming; and the excellent performance of one of its implementations, LuaJIT. In many ways, it feels like a more elegant and restrained JavaScript - it supports tail call elimination[1] and coroutines, it doesn’t have the confusing var/let distinction (though it lacked a feature like const until Lua 5.4), and it doesn’t privilege class syntax over other styles of programming (object oriented programming can be done in Lua with metatables, but it’s not given special syntax like it is in JS).

That said, it has aspects that are undoubtedly quirky: arrays and dictionaries are unified into "tables", whose integer indices start at 1; the standard library is unusually minimal, a nod to its embeddability; and, like JavaScript, it makes the immense error of making variables global by default and local only when you ask. Curiously, it also supports a quite uncommon language feature: multiple return values (which we’ll call “multivals” for brevity’s sake). It is this last subject that we’ll be unpacking today.

Since Lua is a dynamic language (as opposed to, say, Go) multiple returns are implemented in a rather odd way. This isn’t explored in as much detail as I’d like in Lua’s documentation, so I wanted to provide a one-stop resource that explains everything I can think to write down about this idiosyncratic part of the language.

A spoiler: unfortunately, multivals will turn out to be somewhat underwhelming for use in code you care about. They’re awkward to work with, have numerous gotchas, are a maintenance hazard, and don’t perform notably better (that I know of) than their primary alternative, tables. This is not a very useful article, unless you have a use for knowing minutiae about Lua’s behavior.

Let’s dive in!

An Introduction to Multivals - 1, 2, 3, unpack, and ...

There are three primary ways to represent a multival in Lua. Each of these has their own distinct uses. Each of the following code snippets produces the same multival:

  • A multival literal: 1, 2, 3. (Note that this does not include table literals like {1, 2, 3}).
  • Unpacking a table: table.unpack({1, 2, 3})
  • The vararg, ..., in a function: local function x(...) return ... end x(1, 2, 3)

Multivals can be packed into a table by simply wrapping a call in a table constructor:

local function two_values() return "first", "second" end
local tab = {two_values()}
print(tab[1], tab[2])
  -- prints "first second"

To manipulate multivals, we can use two techniques: function signatures and select:

local function multival_first(x, ...) return x end
print(multival_first(1, 2, 3))
  -- prints "1"

local function multival_rest(x, ...) return ... end
print(multival_rest(1, 2, 3))
  -- prints "2 3"

print(select(2, 1, 2, 3))
  -- prints "2 3"; identical to the last line

In order to get values from the end of a multival, we have three options. The first is recursion:

local function increment_each_value(first, ...)
  if first then
    return first + 1, increment_each_value(...)
  end
end

print(increment_each_value(1, 2, 3))
  -- prints "2 3 4"

The second is packing the multival into a table. There are two ways to do this, both of which are demonstrated below:

-- 1. You can pack a multival directly into a table literal
local function pack_multival(...)
  return {...}
end

-- Lua's default printing for tables only shows identity; we'll just
-- print the length instead.
print(#pack_multival(1, 2, 3))
  -- prints "3"

-- 2. You can use table.pack, which is built in to Lua 5.2 and later
local tab = table.pack(1, 2, 3)
print(#tab)
  -- prints "3"

-- table.pack also sets the "n" field on the table it returns. This is
-- more efficient than using #tab, which iterates through the table.
print(tab.n)
  -- prints "3"

The final way of getting values from the end of a multival is select combined with select("#", ...) (which we’ll dig into more later on):

local function get_end_of_multival(...)
  local count = select('#', ...)
  return select(count, ...)
end

print(get_end_of_multival(1, 2, 3, 4, 5))
  -- prints "5"

The Vararg: A Second-Class Value

When you start working with multivals, they might seem pretty straightforward. As long as you’re working on something that uses a list with fixed length or that can be expressed by iterating through the head and tail of a list, it seems like working with multivals should be nice and predictable.

The first issue you’ll likely run into if you try to use multivals heavily comes from expecting them to work the same way the rest of Lua does. In short, Lua’s local variables use what’s called lexical scope. This means that the scope of a variable definition is the block of code it appears in within your source file. It doesn’t matter what order other functions are called in, or whether they’re called at all - the definitions of local variables can be determined just by looking at the structure of the code. This is as opposed to dynamic scope, where the variables defined at a given point depend on the runtime behavior of the program itself. Lua’s global variables effectively use dynamic scope.

As discussed above, there are only three ways to refer to a multival: calling a function for its return values, literally writing multiple values separated by commas, or using the vararg symbol ... inside a function. It’s this last case that we’re interested in.

When you use ... in the signature of a function, you must place it at the end of the argument list. (This is what prevents you - syntactically, at least - from accessing the end of ... without either assigning the whole thing to individual variables, recursing through it, or packing it into a table.)

You also can’t assign the whole thing to a variable, because assigning a ... to a variable means unpacking the multival and assigning the first value to the variable - or multiple values, if that’s specified. For example:

-- Assigning the vararg to a single variable will assign that variable
-- to the _first value_ of the vararg.
local function try_to_assign_vararg(...)
  local x = ...
  return x
end

print(try_to_assign_vararg(1, 2, 3))
  -- prints "1"

There’s one additional rule about the vararg, however, that’s very unlike the behavior of the rest of Lua. You can only use a vararg within the function where it is defined. This prevents you from, for instance, saving the vararg into a closure:

local function try_to_return_vararg_from_closure(...)
  return function()
    return ...
  end
end

This fails to run with the following error: cannot use '...' outside a vararg function near '...'. This means that, while you can use ... in two nested functions, each ... can only refer to the vararg of the function that’s currently running. You can’t capture ... within a closure to persist it. Effectively, ... is not lexically scoped, but rather a dynamically scoped variable with certain extra rules like not being reassignable and not being usable outside a function that defined it.

Another way to put this would be to say that Lua’s multivals are “second class” values. You might be familiar with languages that have second class functions, which can’t be assigned to variables or passed as a parameter to a function. This is similar, with the notable exception that we can pass the vararg to another function. We just can’t save it in a variable, save it in a closure, or manipulate it in certain ways.
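To make that contrast concrete, here’s a sketch (the function names are mine) showing that ... can be forwarded to another call even though it can’t be saved:

```lua
local function sum(...)
  local total = 0
  for i = 1, select("#", ...) do
    -- extra parentheses adjust select's multival down to a single value
    total = total + (select(i, ...))
  end
  return total
end

local function sum_plus_one(...)
  -- forwarding the vararg to another call is allowed...
  -- ...but `local saved = ...` would keep only its first value
  return 1 + sum(...)
end

print(sum_plus_one(2, 3, 4))
  -- prints "10"
```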

Cutoff Multivals

Passing multivals to functions and packing them into tables is pretty straightforward, but there’s one major gotcha about it we haven’t gone over yet. Take a look at the following example:

local function returns_three_values()
  return 1, 2, 3
end

print(returns_three_values())
  -- prints "1 2 3"
print(#{returns_three_values()})
  -- prints "3"

print(returns_three_values(), 4, 5)
  -- prints "1 4 5"
print(#{returns_three_values(), 4, 5})
  -- prints "3"

As the second pair of print expressions demonstrates, a function call can only return multiple values into a multival if it is the last thing in the multival. Following a function call in a multival with any other value, even nil, will cut off its return values at the first item.

This also applies to the vararg, ..., in function definitions:

local function f(...)
  return ..., 4, 5
end

print(f(1, 2, 3))
  -- prints "1 4 5"
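Relatedly, wrapping a call in parentheses adjusts its multival to exactly one value, which is another way return values get silently cut off:

```lua
local function returns_three_values()
  return 1, 2, 3
end

-- The extra pair of parentheses truncates the multival to one value.
print((returns_three_values()))
  -- prints "1"
```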

Lua’s Most Jarring Feature: select("#", ...)

One of Lua’s nicer properties is how a key set to nil and a non-existent key are indistinguishable in a table. In JavaScript, for instance, you have both obj.x = undefined to set a property to undefined and delete obj.x to actually remove it from the object. (There’s also obj.x = null, but let’s ignore that for now.) In Lua, there’s just tab.x = nil, and there’s no way to distinguish between a property that was set to nil and a property that was never set at all.

With that in mind, let’s look at the following example:

local y = {nil,nil,nil}

print(#y)
   -- prints "0" - nils at the end of a table don't matter

local function multival_length(...) return select("#", ...) end

print(multival_length())
  -- prints "0"

print(multival_length(nil, nil, nil))
  -- prints "3". wtf?

As this example demonstrates, multivals throw that nice property out the window. Just to be clear:

  • There is no distinction between {nil, nil} and {} in Lua.
  • There is a distinction between multival_length(nil, nil) and multival_length() in Lua. That distinction is made by select("#", ...).

This becomes even more confusing when combined with a function that returns zero values, as opposed to nil:

print(select("#", print("a"), print("b"), print("c")))
  -- first prints each of "a", "b", and "c" on their own lines
  -- next, prints "2". wait, what?

What in the world? What’s going on here?

As it turns out, a function that returns zero values creates an empty multival, as you’d expect. It also doesn’t add anything to the end of a multival if you call it at the end. However, if a call to a zero-return-value function appears before the end of a multival, it results in a nil being inserted into the multival in place of its zero return values. Thus, instead of collapsing the multival made up of all three of the print calls into a single zero-length multival, Lua instead collapses the last call to print, but turns the other two calls into nil. This all doesn’t apply to tables, because, unlike in multivals, nils in tables are truly identical to absent values.
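The rule is easier to see with a function that returns nothing at all:

```lua
local function nothing() end

-- At the end of a multival, a zero-value call adds nothing:
print(select("#", nothing()))
  -- prints "0"

-- Before the end, it is adjusted to a single nil:
print(select("#", nothing(), nothing()))
  -- prints "1"
```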

At first, this all seems like something that doesn’t belong in Lua. Without the select("#", ...) form, there’d be no way of telling a nil at the end of a multival from the end of the multival itself. However, this makes certain abstractions much easier to create.

One notable situation where select("#", ...) becomes particularly useful is when wrapping functions that return multiple values. select("#", ...) lets you easily tell if you can simply save the first return value from the function or if you need to create a table to save all the values.
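Here’s a sketch of that pattern (the names are mine), packing a function’s results with their true count, in the style of Lua 5.2’s table.pack:

```lua
-- Pack a multival into a table, recording its real length in `n`.
local function pack(...)
  return { n = select("#", ...), ... }
end

-- Call f and capture all of its results, embedded nils included.
local function capture(f, ...)
  return pack(f(...))
end

local r = capture(function() return 1, nil, 3 end)
print(r.n)
  -- prints "3": #r couldn't be trusted here because of the embedded nil
```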

Are You Sure That’s a Tail Call?

When first looking at multivals in Lua, a programmer who’s determined to use them somehow might first be inclined to combine them with tail recursion to neatly express different functions. For instance, we could implement range as follows:

local function range1(acc, n)
  if n < 1 then return
  elseif n == 1 then return acc
  else return acc, range1(acc+1, n-1)
  end
end

local function range(n)
  if n < 1 then error("n must be >= 1") end
  return range1(1, n)
end

print(range(3))
  -- prints "1 2 3"

print(#{range(10)})
  -- prints "10"

This works great! We can call range(n) when we want a multival of length n, and surround it with braces like {range(n)} when we want the equivalent table. This should let us avoid allocating extra tables when we don’t need them. And since Lua has tail call elimination, the recursive call range1 makes to itself should never blow the stack. Right?

print(#{range(1000 * 1000)})
  -- throws a stack overflow error


As it turns out, multivals break tail call elimination. If you’re returning multiple values, Lua needs to collect all those values before it can actually begin to return them. This means that it can’t throw away the stack frames of the recursing function. It’s very similar to what would happen if you returned {range1(acc, n)} within range1. That case is clearly not a tail call. It’s the particular syntax of multivals that makes this confusing, since it doesn’t look like the faux-tail-call is “surrounded” by anything.[2]

In order to avoid the stack overflow while still recursing, we can implement range with a table as follows:

local function range1(tab, acc, n)
  if n < 1 then return tab end
  tab[acc] = acc
  return range1(tab, acc+1, n-1)
end

local function range(n)
  if n < 1 then error("n must be >= 1") end
  return range1({}, 1, n)
end

print(#range(1000 * 1000))
  -- prints "1000000"

It’s perhaps less elegant, mixing recursion and mutation, but its runtime characteristics are vastly better.

Maximum Occupancy

It makes sense that recursing too much could cause issues by blowing the stack. What’s less intuitive is that multivals have a cap on their size even when you make them without recursion. Consider this use of the range function we defined above:

local t = range(1000 * 1000)
print(t)
  -- prints a table identity

Great! Now, let’s try unpacking that into a multival. No recursion is involved, so this should work fine, right?

local t = range(1000 * 1000)
print(table.unpack(t))
  -- throws "too many results to unpack" error

As it turns out, multivals simply have a cap on the number of values they can contain. There’s no way to work around this - if you need to handle a list of items without worrying about how big it is, use a table, not a multival.
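When you only need part of a large list as a multival, note that table.unpack (Lua 5.2+; plain unpack in 5.1) takes explicit bounds, so you can take small slices instead of unpacking the whole thing:

```lua
local t = {}
for i = 1, 10 do t[i] = i * i end

-- Unpack only elements 1 through 3 of the table.
print(table.unpack(t, 1, 3))
  -- prints "1 4 9"
```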

Multival literals in function arguments and return values are even more restricted:

print((function() return 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
            14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
            28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
            42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
            56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
            70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
            84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
            98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109,
            110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120,
            121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131,
            132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,
            143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153,
            154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164,
            165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175,
            176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186,
            187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197,
            198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208,
            209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219,
            220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230,
            231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241,
            242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252,
            253, 254, 255 end)())
  -- throws "function or expression needs too many registers" error

print(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
      19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
      35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
      51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
      67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82,
      83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,
      99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
      112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124,
      125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137,
      138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150,
      151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163,
      164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176,
      177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189,
      190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202,
      203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215,
      216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228,
      229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241,
      242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254,
      255)
  -- throws "function or expression needs too many registers" error

There is, however, no limit on table literals, reinforcing the fact that the contents of table literals are not multivals, despite their similarity in appearance:

-- generate.lua
print("print(#{")
for i = 1, 1000 * 1000 do
    print(i ..",")
end
print("})")

lua generate.lua > generated.lua && lua generated.lua
# prints "1000000"

Multivals are Data Structures, Just Bad Ones

What all these different examples of multival quirks demonstrate is that multivals are just another data structure. (I find this case is made best by how tail call elimination doesn’t work when recursing within a multival, just like it wouldn’t work within a table literal.) They aren’t exceptions to the rules that data structures follow - in fact, they have extra rules that tables don’t. To sum up:

  • Multivals cannot be assigned to variables. They can only be referred to as literals, function call expressions, or the vararg ....
  • The vararg cannot be used outside the function that creates it (including within closures made within that function).
  • Multivals are cut off at the first value when inserting them into another multival before its end.
  • Unlike tables, multivals have a built-in length that is unrelated to the arrangement of nils within the multival. This length can be retrieved with select("#", ...).
  • When a multival contains a call to a zero-return-value function before the end of the multival, a nil is inserted where the function’s return value would go.
  • When a function makes a recursive call within a multival, tail call elimination is not applied. Recursing too many times within a multival will thus blow the stack.
  • Unpacking too many values from a table into a multival will result in an error.
  • Trying to call or return from a function with too many arguments will result in an error. This limit is much lower than the previously-mentioned limit on unpacking tables, being just below 255 items.

A Last Tiny Nitpick

Finally, one incredibly subjective thing that bugs me about multivals is the syntax used for the vararg. In my opinion, the fact that ... is valid Lua makes it unnecessarily hard to insert an easily-understood placeholder into example code.


While multivals remain a strange and sharp-edged corner of Lua, I hope that they’re a little bit easier to understand thanks to this post. While they’re rarely the best solution to a given problem, it’s still helpful for a Lua user to understand their strengths and limitations, even if it’s only to justify avoiding them.

If you’d like to leave a comment, please email

  1. Also known as Tail Call Optimization. JavaScriptCore, used in Safari, has Proper Tail Call Support, but most JS developers can’t make use of this, as it’s not available in popular browsers on most platforms.

  2. In Fennel, this syntactical confusion gets even worse! There, instead of having their own special syntax, multivals are represented with the values special. (values 1 2 3) in Fennel is the same as 1, 2, 3 in Lua. Unfortunately, this makes recursive functions that run into this scenario look even more like real tail-recursive functions. This is a gotcha to keep in mind whenever working with multivals and recursion in Fennel!

May 03, 2020

Ponylang (SeanTAllen)

Last Week in Pony - May 3, 2020 May 03, 2020 11:57 PM

Ponyc 0.34.0 has been released!

Derek Jones (derek-jones)

Enthusiasm on the Fortran standards committee May 03, 2020 09:58 PM

The Fortran language standards committee, SC22/WG5, has an unusual situation on its hands. Two people have put themselves forward to chair the committee, when the current chairman’s three year term ends. What is unusual is that it is often difficult to find anybody willing to do the job.

The two candidates are the outgoing chair (the person who invariably does the job, until they decide they have had enough, or can arm wrestle someone else to do it), and a scientist at Los Alamos; I don’t know either person.

SC22 (the ISO committee responsible for language standards), and INCITS (the US Standards body; the US is the Fortran committee secretariat) will work something out.

I had heard that the new guy was ruffling some feathers, and I thought good for him (committees could do with having their feathers ruffled every now and again). Then I read the running for convenor announcement; oh dear. Every committee has a way of working: the objectives listed in this announcement would go down really well with the C++ committee (which already does many of the points listed), but the Fortran committee don’t operate this way.

The language standards world appears to be very similar to the open source world, in that they are both driven by the people who do the work. One person can have a big impact in the open source world, simply by doing the work, but in the language standards world there is voting (people can vote in the open source world by using software or not). One person can write papers and propose lots of additions to a standard, but the agreement of committee members is needed before any wording is added to a draft standard, which eventually goes out for a round of voting by national bodies.

Over the years I have seen several people on a standards committee starting out very enthusiastic, writing proposals and expounding them at meetings; then after a year or two becoming despondent because nothing has happened. If committee members don’t like your proposal (or choose to spend their time on other proposals), they do nothing. A majority doing nothing is enough to stop something happening.

Once a language has become established, many of its users want the committee to move slowly. Compiler vendors don’t want to spend all their time keeping up with language updates (which rarely help sell more product), and commercial users don’t want the hassle of having to spend time working out how a new standard might impact them (having zero impact on existing code is a common aim of language committees).

The young, the enthusiastic, and magazines looking to sell clicks are excited by change. An ISO language committee is generally not the place to find it.

Bogdan Popa (bogdan)

Announcing racksnaps May 03, 2020 11:00 AM

Racket’s package manager doesn’t currently have the notion of locking package sets to specific versions per project so, as someone who operates a couple production Racket applications, I’ve been concerned about the possibility that new deployments could introduce bugs in production due to changing dependencies. To solve this problem, over the past weekend I’ve put together a service that creates daily snapshots of the official package catalog. You can find it at racksnaps.

May 02, 2020

Unrelenting Technology (myfreeweb)

Ported the Firefox Profiler to FreeBSD in order to investigate why May 02, 2020 10:01 PM

Ported the Firefox Profiler to FreeBSD in order to investigate why WebRender has some jank when scrolling some walls of text on my 4K/HiDPI setup.

The profiler code initially looked somewhat scary: some Google Breakpad code is used, a custom stack unwinder called LUL is used on Linux (which also partially derived from Breakpad code)…

Initially, I got it working with “pre-symbolication” (an option to build the goblin ELF parser into Firefox for this purpose) only, ifdef’ing any Breakpad code out.

Turns out:

  • the only part of Breakpad used is extracting build IDs from shared objects (and in fact the “base profiler”, a copy (for now) of the Gecko profiler used for profiling during the early startup phase, just copied all the relevant code);
  • devel/breakpad was there in FreeBSD Ports (but expires in like three days!), and its patches showed that it’s really trivial to get it all working.

So I got the main symbolication system working. Which, it turns out, runs WebAssembly-compiled goblin in a web worker! Fun stuff. Requires stripping libxul for now tho.

In the end, the patch turned out to be mostly just ifdef’s! The only meaningful parts are: thr_self/thr_kill2 instead of gettid/tgkill, supporting the different mcontext structs, and (for the pre-symbolication code path) ignoring symbol names returned by dladdr because they’re hilariously bad.

BTW, earlier in the dev-tools-on-FreeBSD space: heaptrack! I even used it to find a real memory leak in Wayfire.

May 01, 2020

Jeremy Morgan (JeremyMorgan)

The Best Editors for Go Programming May 01, 2020 10:33 PM

So I’m running a poll right now to see what editor Go programmers prefer. Vote for your favorite and I’ll post the results here. I write Go with several different editors, but here are some of my favorites for Go programming: Goland. Goland is by far my favorite editor, especially for larger projects. It’s tailor-made for Go programming and has some excellent features like built-in debugging, smart completion, inspections, and refactoring tools. It’s the best IDE I know of right now for serious Go development.

Patrick Louis (venam)

Time on Unix May 01, 2020 09:00 PM


What is time

  • Time is relative
  • Measuring time and standards
  • Coordinating time
  • Time zones
  • DST

Time, a word that is entangled in everything in our lives, something we’re intimately familiar with. Keeping track of it is important for many activities we do.

Over millennia, we’ve developed different ways to calculate it. Most prominently, we’ve relied on the position the sun appears to be at in the sky, what is called apparent solar time.

We’ve decided to split it as seasons pass, counting one full cycle of the 4 seasons as a year, a full rotation around the sun. We’ve also divided the passing of light to the lack thereof as days, a rotation of the earth on itself. Moving on to more precise clock divisions such as seconds, minutes, and hours, units that meant different things at different points in history. Ultimately, as travel got faster, the different ways of counting time that evolved in multiple places had to converge. People had to agree on what it all meant.

In physics, time is the progression of events, without events there’s no time. It is defined by its measurement, what changes at a specific interval can be considered a unit, though still infinitely divisible.
In physics there are two ways to view time. In classical physics, time is absolute, independent of the perceiver, synchronized for everyone. While in modern physics, we have Einstein’s special and general relativity that applies, things depend on a frame of reference, time can dilate or contract with the effect of gravity, we talk of space-time.
In physics, equations work equally well in both directions: the math holds up going forward and backward in time. However, the arrow of time in our universe seems to go in a unique direction. Peculiarly, we’ll see that time in computers, unlike in our universe, can actually go backward at specific events.

All of this to say that because of the importance of tracking time, we’ve created ultra-precise atomic clocks that have an error of 1 second every 30 million years. We can be categorically sure of the lapse that happens between two beats/oscillations, if there’s an error then it’s outside the human life-span, and we’ve got many of them to correct errors. Those clocks are our sources of truth, they give us the definition of the standard unit of second, SI second.

We have, on one hand, the atomic clocks counting time regardless of events happening around, and on the other hand, we have a moving planet in space that is subject to forces, where we’ve chosen the fact that one full orbit around the sun equals a year and that one full (approximate) rotation on itself is a solar day, the space between two transits of the sun (maximum height in the sky). Both of those values ought to diverge and differ eventually.
The earth, because of its unevenness and current position in its orbit, could rotate around the sun or itself faster or slower, its speed changing how long days and years are.

What we’ve done is used this standard definition of the SI second as our anchor. A day is now not defined by the apparent sun position but as the average number of standard unit seconds that make up an average solar day, currently around 86400.002 seconds. This uniform clock time is called the mean time. Mean solar time is the average length of a solar day over a year, that is, the sum of all solar days over n days, divided by n.

Thus, clocks that have a uniform fixed value, mean-time, will differ with the apparent sun time. This difference is called the equation of time (EOT) and it can vary as much as 15min (ahead 14 minutes near February 6, behind 16 minutes near November 3) but rebalances itself as the earth finishes its orbit around the sun. There are many simulations you can find online to understand this concept.

As for years, our calendars can only hold entire days but the actual number of days it takes to finish an orbit is fractional. And so we accumulate this fraction over multiple years and add an extra day to the year that follows making it a leap year, 366 days instead of 365. On a Julian calendar, a year is 365.25, however this is not precise, it is higher than the actual number of days it takes to form a year: 365.242199. The Gregorian calendar, which is the most common today, defines it more precisely as 365.2425, adding a leap year 97 out of 400 years.
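As a sketch, the Gregorian rule (a leap year every 4th year, except centuries, except again every 400th year) reduces to a short predicate:

```lua
-- Gregorian leap year rule: 97 leap years out of every 400.
local function is_leap(year)
  return year % 4 == 0 and (year % 100 ~= 0 or year % 400 == 0)
end

print(is_leap(2020), is_leap(1900), is_leap(2000))
  -- prints "true false true"
```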

But we still use 86400 seconds to define a day in our current lives, in our software, right? What about the rest of this complex system, how do we cope with these discrepancies, who chooses the mean time, how can we all sync on those values, who’s in charge? Noon where you are might not be noon where I am.

The local time that shows on our clock is chosen by our local authorities, we call it civil time.
And because we all live on the same planet, instead of each syncing in our local community with what appears to us as the mean time, we can choose a fixed geographical spot, create a rigorous standard time there, and for the rest of the world, derive from it. Anything further away in longitude from this meridian can add a delta time difference. That way we can all sync and make less of a mess in this interconnected world.

The first major standard was set at the Royal Observatory in Greenwich, London. The mean time recorded there was used as the one to derive your local civil time as an offset from, called Greenwich Mean Time, or GMT for short. However, it was not as precise as it could be and thus got replaced in 1967 by another standard called Coordinated Universal Time, UTC.

UTC is a version of the Universal Time standard. In this standard we also find UT1, which keeps track of the Earth’s rotation angle; it is the mean solar time at 0 degrees longitude, a better and more precise version of GMT.

UTC, instead of relying on the rotation of the earth, relies on International Atomic Time (TAI), which is the time we talked about that precisely defines the SI second using around 400 atomic clocks at multiple laboratories worldwide. Additionally, to keep count of the rotation of the Earth and stay in sync with UT1, the UTC authorities can add or remove a second in a day, a leap second. The difference between UTC and UT1 is called DUT1; basically, when DUT1 approaches one second we need a leap second.
So in UTC, a second is well defined, but the number of seconds in a minute can be 59, 60, or 61 if there was a leap second. Any unit bigger than the SI second reference can vary. Let’s also note that UTC uses the Gregorian calendar, as previously said.

As you could’ve guessed, introducing a leap second isn’t a decision we take instantaneously; it’s announced at least six months in advance in “Bulletin C” by the International Earth Rotation and Reference Systems Service (IERS), which is one of the authorities. There’s also involvement in the standard from the International Astronomical Union (IAU) and the International Telecommunication Union (ITU).

With this, we’re set, we have a clean standard. Now, how should we divide the world such that civil local time keeps in sync with the sun?

Time and longitude difference is what we need, so we split the world into 24 meridians, each 15 degrees apart; each meridian zone represents one hour of separation offset from UTC. Those are called time zones. They can go from UTC-12 to UTC+14, and can sometimes be referred to by their name, for example Western European Time, Central European Time, etc… However, countries don’t fall precisely on meridians, and thus local authorities can choose which section of the country follows which time zone as their civil local time. This difference doesn’t even have to be an integer number of hours; it could be offset by 15 or 30 minutes, for example.

Moreover, there’s a practice called daylight saving time (DST), or summer time, which is used in civil time to move the clock forward by one hour in spring and back one hour in autumn/fall. For example, in winter a region could be on UTC+2 (EET) and in summer on UTC+3 (EEST), creating a 23h day in late winter and a 25h day in autumn/fall. This practice is being reconsidered in the EU and planned to be removed by 2021.

So that’s it we’re all in sync!

Now, on computers, how do we write time, and how do we represent it textually?

Representing time

  • locale
  • tzdata

The easiest way I’ve found to test many formats is to use the date(1) command.

It can show the time both in human-readable format string and more machine-readable numeric formats.

Some formats include the timezone as numeric, others as the alphabetic time zone abbreviation. You can represent the date with or without the time zone, with it to make it more precise to the viewer.

Some formats prepend the time zone part with the ‘Z’ character, which has origins in the GMT standard but that now stands for zone description of zero hours, also sometimes called “Zulu time”.

We can see that the date command automatically knows our time zone and how to display the time in a way we can read it. How do we set this format, and where does it come from? And how do we set the time zone?

Let’s start with the formatting.

The date command relies on locales, an internationalization mechanism used by Unix-like operating systems. Locales are configurations that can be used to set the language, currency, and other representational values that can change by location. The libc on the system, and consequently the coreutils, should be aware of those locale values.

The specific category of locale we are interested in is the LC_TIME, which is used for the formatting of time and date information, spelling of weekdays, months, format of display, time zone, 24 vs 12h format, etc.

To inspect specific values in LC_TIME you can do the following (see locale(7) for more info):

$ locale -ck date_fmt
date_fmt="%a %b %e %H:%M:%S %Z %Y"

The available locales are usually found in:


Locales can also be set at the user level in $XDG_CONFIG_HOME/locale.conf, which generally resolves to $HOME/.config/locale.conf.

All of this works because of the way profiles are loaded in the shell, you can take a look at /etc/profile.d/

Now regarding the time zone.

The time zone information database is distributed by the IANA; it’s referred to as the tz database. Unix distributions download it when it is updated and install it in /usr/share/zoneinfo/ so that other libraries and programs can use it. In the tz data/zoneinfo db we can find all the information required to keep track of time in specific places. It’s distributed in such a way as to make it easier to choose a time zone by city name or location instead of by the exact drift/skew from UTC. That takes care of all the differences in civil time, all historic weirdness over time, daylight saving, and leap seconds.

To change the timezone globally we have to link /etc/localtime to an entry in /usr/share/zoneinfo/. For instance:

ln -s /usr/share/zoneinfo/America/New_York /etc/localtime

Many Unix-like systems provide helpers so you don’t have to link it manually. There’s timedatectl(1) from systemd and /etc/timezone on Debian, for instance.

The TZ POSIX environment variable can also be used to specify the zone in human-readable format on the command line, and the TZDIR to specify the location of the tzdata. That means separate users on a single system can have different time zones.


TZ='America/Los_Angeles' date

The format of the tz database aka tzdata is explained in details in the man tzfile(5). To create it you can use a textual format, and convert it using the command zic(1) (zone information compiler).

Example of creating your own tzdata:

> echo "Zone COOL +2:25 - COOL" > cool.zone  # "cool.zone" is an arbitrary filename
> mkdir ./zoneinfo
> zic -d ./zoneinfo cool.zone
> TZDIR=zoneinfo TZ=COOL date
# Should output something similar to
Sun 12 Apr 2020 11:44:00 AM COOL

So now programs that rely on the standard time.h header can be aware of the time zone info. You can also load your own dynamically using tzset(3) from the TZ env.

Where do we usually find time on Unix

  • POSIX time
  • Uptime
  • time(1)
  • Programming language and timestamp
  • File system atime, ctime, mtime, etc.
  • Cron

Time is used in so many places in our operating system. We’re going to list some common places where it is found and then build on them to approach more complex topics in the following sections.

Let’s start with the infamous POSIX time. POSIX time, or Unix time, or Epoch time, is the number of seconds that have elapsed since the Epoch, the 1st of January 1970 00:00:00 UTC, minus the leap seconds, so Unix time is the same everywhere. That means that in Unix time each day is considered to be exactly 86400 seconds, which raises a question: shouldn’t it then skew away from real UTC, and thus drift away from UT1 (mean time)?

To answer this: when a leap second is introduced in UTC, POSIX time repeats the same second or omits one, because its minutes always have exactly 60 seconds, unlike real UTC where a minute can occasionally have 61.
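To make this concrete, here is a quick look at Unix time from the shell; the conversion flags (-d, -u) used below are GNU date extensions:

```shell
# Current Unix time: seconds elapsed since the Epoch
date +%s

# Convert an epoch value back to a calendar date (GNU date)
date -u -d @0
# prints the Epoch: 1 Jan 1970 00:00:00 UTC
```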

Some rare systems use TAI (International Atomic Time) as a source instead of UTC, but they need a table of leap seconds on the system to know how to convert to civil time.

Because Unix time starts in 1970, dates before it need to be represented as negative numbers. However, keep in mind that the UTC standard may not have existed yet before that era, so it’s better to rely on something else to represent those times and dates accurately. Some real-time operating systems (RTOS), which we’ll see in a bit, like QNX, choose to represent Unix time as an unsigned integer, which means they cannot represent time before 1970.

So is Unix time a signed or an unsigned integer?

Historically, Unix time was a signed 32-bit integer that incremented at 60 Hz, the same rate as the system clock on the hardware of early Unix systems, that is 60 times per second.
The Epoch differed too: it was the 1st of January 1971 in the first edition of the Unix Programmer’s Manual. However, at 1/60th of a second precision, the 32-bit integer would have exhausted its range in only about 2.5 years. Thus, it was changed again to the Epoch of 1 January 1970 at a rate of 1 Hz, an increment of 1 every second. This gives about 68 years after 1970, and 68 years before 1970, using a 32-bit signed integer.

When the concept was conceived, there was no consideration of all the issues with leap seconds, time zones, and leap years that we’ve mentioned previously. It’s only in the 2001 edition of POSIX.1 that the faulty leap year rule in the definition of Unix time was rectified.

So what’s this Unix time used for?

Unix time is the value that Unix systems look at to keep track of their system time. Its value is kept in a time_t, a type defined vaguely in POSIX.1 as previously mentioned, available via the time.h header. POSIX only mandates that it be an integer; it doesn’t say whether it should be signed or not, which explains why QNX chose an unsigned integer.
For more precise time manipulation, the time.h header also includes other types such as struct timeval and struct timespec.

However, it is usually a single signed integer whose size is defined per system. For example on mine, <time.h> ends up including <bits/timesize.h>, which defines it as a 64-bit signed integer.

One issue, called the Year 2038 problem, Y2038, or Y2k38, arises on systems that chose to represent Unix time as a signed 32-bit value: the counter reaches the maximum of a 32-bit signed integer and overflows, which is undefined behavior.
This issue is completely avoided by using a 64-bit signed integer.
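We can poke at that boundary from the shell (again assuming GNU date for the @-epoch conversion):

```shell
# Largest value a signed 32-bit time_t can hold
echo $(( (1 << 31) - 1 ))

# That last representable second falls on 19 January 2038, 03:14:07 UTC;
# one second later, a 32-bit signed counter overflows
date -u -d @2147483647
```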

Let’s move to another topic, uptime.

The uptime of a machine is a measure of how long it has been running since the last reboot: the current time minus the time it booted.
Some systems may require high availability due to service level agreements, and uptime is one of the measures that can be looked at. However, a high uptime can also be a sign of negligence, and rebooting after a long time may lead to unexpected consequences, as some changes only take effect on reboot.

Most Unix-like OS come with the BSD uptime(1) command, which shows the current time, how long the system has been running, how many users are currently logged in, and the system load averages for the last 1, 5, and 15 minutes (though those values aren’t good metrics, see Brendan Gregg’s blog). The load average is the average number of runnable processes over that period. It’s the same information one can find in the first line of the w(1) command.

On Linux the uptime can also be found in the /proc filesystem, as /proc/uptime. The file contains 2 values: the first is the number of seconds elapsed since the last reboot and the second the total time the cores have spent idle, in seconds, both being indicators of system usage.
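For example (Linux-specific, since the /proc path is assumed):

```shell
# Two fields: seconds since boot, and total idle seconds summed over cores
cat /proc/uptime

# Convert the uptime to days
awk '{ printf "up %.1f days\n", $1 / 86400 }' /proc/uptime
```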

Another common command used as a metric is time(1). It is a simple command that reports the time a command has taken.
By default, it reports real time, user time, and sys time. Real time is wall-clock time: the total time for everything to execute from start to finish. User time is the amount of CPU time spent in user mode, while sys time is the amount of CPU time spent in kernel mode, in system calls and the like. CPU time is summed across CPUs, so on a multicore system you may have two cores executing for 0.1s in parallel in user mode, giving a total user CPU time of 0.2s. This shows that CPU time has no direct relation to the actual time elapsed from start to finish without knowing which cores executed what, but it gives an idea.

In fact, there exists a panoply of commands that can be used for benchmarking processes and what they use, how much time they spend in which particular section of their code. We’ll keep it at that.

That said, we often have to take snapshots of time in our programs, timestamps, recording that something happened at a specific time, attached as metadata. These timestamps can be stored in Unix time, in UTC, or in a specific time zone. However, time-stamping records with local time isn’t advised because of the issues that can arise with daylight saving; it’s much easier to recalculate an offset from UTC. Though in certain cases it is valuable to know in which time zone an event happened, such as when timestamping pictures.

Examples of metadata timestamps are the Unix atime, ctime, and mtime that are stored with file inodes on the file system.

Atime is the last access timestamp, updated whenever the file is read or executed, for instance. Ctime is the last status change timestamp, updated whenever the file is written to or its inode changes. Mtime is the last modification timestamp, updated whenever the file has data written to it, is created, truncated, etc.
An additional non-standard timestamp found on some systems is btime, the file creation/birth timestamp.

Additionally, some filesystems support mount options related to those timestamps, usually optimizations to avoid disk load, such as options that change the way atime is updated or that remove access time updates from directories.
These are the prevalent defaults on a lot of filesystems and thus give a false sense of the definition of those timestamps.

To have more information about when those timestamps are actually updated, have a look at man 7 inode, and check a file’s current values by calling the stat command or the related system call on it. You can also use touch(1) to arbitrarily change the timestamps of files.
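A small sketch of playing with those timestamps, assuming the GNU versions of touch and stat:

```shell
# Create a scratch file
f=$(mktemp)

# Set its modification time to a known date (GNU touch -d)
touch -d '2020-01-02 03:04:05 UTC' "$f"

# Print atime/mtime/ctime as Unix timestamps (GNU stat format flags)
stat -c 'atime=%X mtime=%Y ctime=%Z' "$f"

rm "$f"
```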

Let’s end this section with timers and chronometers that trigger events at specified times, Unix alarm clocks if you want: clock daemons.

The de facto implementation of this is cron, a clock daemon tool written by Ken Thompson that first appeared in Unix V6, named in reference to Chronos, the Greek word/prefix for time.

Cron specializes in scheduling the execution of programs periodically, at certain times. It registers the events in a table, the crontab.

It initializes its entries from files and directories containing the scripts: /etc/crontab, /etc/cron.*/, or /var/spool/cron; take a look at man 8 cron for more info on that.

It can also be managed from the command line, for instance:

crontab -e

Cron will by default execute the entries using sh, which means they are simple shell commands, and crontabs let you set environment variables and more.
In the crontab you specify when to repeat the execution of a command using a special syntax composed of five fields: minute, hour, day of month, month, and day of week.
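For instance, an entry with its five time fields followed by the command to run (the script paths here are hypothetical):

```
# m    h    dom  mon  dow   command
# Run a (hypothetical) backup script every Monday at 04:30
30     4    *    *    1     /usr/local/bin/backup.sh
# Run a (hypothetical) poller every 15 minutes
*/15   *    *    *    *     /usr/local/bin/poll.sh
```

Note that the `*/15` step syntax is a Vixie cron extension; a strictly POSIX crontab would list the minutes explicitly.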

Despite cron being the go-to solution for repeated scheduled execution, others have created new solutions. Namely, init systems and service managers have re-implemented cron their own way, integrating timers as a type of service and centralizing the management of timers along with services.

Prominently, systemd implements this function using systemd timers: systemd unit files that inherit all the facilities and security settings systemd provides for them.
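As an illustration, a minimal timer unit might look like this (the unit name and schedule are made up for the example); it would be paired with a cleanup.service unit containing the actual command:

```
# /etc/systemd/system/cleanup.timer (hypothetical example)
[Unit]
Description=Run cleanup daily

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Persistent=true makes the timer catch up on runs missed while the machine was off, something plain cron cannot do without helpers such as anacron.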

You can list current running timers on systemd using:

sudo systemctl list-timers --all

man 5 systemd.timer provides all the info you need about systemd timer units.

System time, hardware time, internal timers

  • Hardware time vs system/local time
  • Clock hardware sources and configurations
  • Tickers, timers, and their usages

We’ve said previously that POSIX time was used by Unix-like OS to keep track of system time, but we’ve cut it short at that. There’s still a lot to add to this, like where does the system get its time to begin with, how does it store it when it’s not running, what in the OS triggers the count of seconds, etc.

In this section I’ll limit myself to Linux and FreeBSD as examples but the concepts should apply to any other Unix-like OS. I’ve chosen to do this because resources were scarce on this topic as most have chosen to not mention it and skip directly to the topic of NTP, Network Time Protocol, which we’ll see in the next section.

There are two types of clocks on our machines: the first type goes by the name of RTC (real time clock), CMOS clock, BIOS clock, or hardware clock; the other by the name of system clock, kernel clock, or software clock.
The system clock is the one we’ve mentioned before, which keeps track of time using a counter of seconds since the Epoch; the hardware clock is one we haven’t mentioned, which resides physically on the system and runs without any interaction.
Their usages differ: the system clock runs only when the system is on, and so is not aware of time when it’s off, while the hardware clock has the purpose of keeping time when the system is not running.

As you would have guessed, any two clocks are bound to drift apart eventually, those clocks will differ from each other and from the real time. However, there are many methods to keep them in sync and accurate without using external sources.

Hardware clocks are usually found on the PC motherboard and interfaced with using an IO bus. Because some of those are on standard architecture such as the ISA (Industry Standard Architecture), it can be easy to know how to query and modify them. However, it’s still hardware dependent and so can vary widely.

These clocks run independently of any control program and keep running when the machine is powered off, kept alive by their own power source, a small battery.
They are normally consulted only when the machine is turned on, to set the system time at boot. Unfortunately, they are known to be inaccurate, but inaccurate in a weirdly predictable way: gaining or losing the same amount of time each day, a monotonic, systematic drift.

Hardware clocks don’t have to store time as Unix time or UTC, and don’t have to be limited in precision to seconds. It’s up to the hardware implementation to decide what can be done and on the user to decide what to do with it. In theory these clocks have a frequency that varies between 2Hz and 8192Hz, from 0.5s to 0.1ms precision.

Let’s also note that there can be more than one hardware clock on a system.

Linux and FreeBSD come with drivers to interact with RTC.

On Linux for example, the RTC clocks are mapped to special device files /dev/rtc* backed by the driver. The star denoting the number of the clock if there are many, and /dev/rtc being the default RTC clock.
As with anything hardware, there could possibly be issues with the driver of the RTC and the clocks might not be mapped properly, especially if not following the industry standard. Linux has fallback mechanisms for other systems it wants to support.

On the other side, the only time that matters is the one you see when the system is running, and that is the system time.

As we said, system time is the number of seconds since the Epoch, stored and kept track of by the kernel. Internally, however, it may be more precise than seconds; it can go up to the precision offered by the architecture. We’ll come back to this topic of high precision soon; just keep the simple concept in mind.

The system time, when displayed to us, refers to the timezone information and files we’ve previously mentioned. It’s good to know that the Linux kernel also keeps its own timezone information for specific actions such as filesystem related activities, and this kernel timezone is updated at boot or via the utility hwclock(8) by issuing hwclock --systz.

When booting, the system clock is also initialized from the RTC that keeps running when the system is off. In some cases it can be initialized from external sources and not rely on the RTC.

Thus, when the system is running the hardware clock is pretty much useless, and we could do whatever we want with it. However, we have to beware of discrepancies on reboot.

The counter that the kernel uses to increment the system clock is usually based on a timer functionality offered by the Instruction Set Architecture of the CPU (ISA not to be confused with the other ISA we spoke about, the Industry Standard Architecture). In simple terms, that means that the CPU gets interrupted at known programmable intervals periodically, and when it’s interrupted it executes a timer service/event routine. The routine is responsible for incrementing/ticking the system time and do some housekeeping. We’re going to discuss this later.
Let’s note that the frequency of the interrupt can be configured for better precision.

To set the system time and date, we can rely on the date(1) command, which takes many formats via its --set option.

For example:

date --set "5 Aug 2012 12:54 IST"

We could also initialize the system time from a remote server using rdate(1), giving it a time server to query (time.nist.gov here as an example):

rdate -s time.nist.gov

Or even, on some systems, rely on the service manager. The infamous timedatectl(1) of systemd comes to mind, which can set and give information about pretty much everything we’re mentioning in this section.

Example of output:

               Local time: Fri 2020-04-17 12:40:00 EEST
           Universal time: Fri 2020-04-17 09:40:00 UTC 
                 RTC time: Fri 2020-04-17 09:40:01     
                Time zone: Asia/Beirut (EEST, +0300)   
System clock synchronized: yes                         
              NTP service: active                      
          RTC in local TZ: no                          

What’s this line about “RTC in local TZ”? Can the time on the hardware clock be stored with time zone info, and why would we do this? What is stored on this clock, is it UTC or local time?

The answer to this, like most things, is that it depends. The time on the RTC can be configured to be whatever the system wants it to be. Yet storing it in UTC is the best choice, as UTC doesn’t change with time zones and daylight saving. Having the RTC store the local civil time means that it would need to be aware of all the complications that implies, which most RTC clocks aren’t. If it’s in local time and the system has been down for a while, the RTC might differ from the actual local time. And even with clocks that have the ability to apply daylight saving themselves, the feature is mostly unused.

So it’s preferable to store the time of the RTC in UTC, but some systems still choose not to adhere to this. For instance, when dual-booting, the other operating system may expect the RTC to contain local time and update it accordingly. That creates a discrepancy, and the RTC has no way to indicate whether it is storing local time or UTC, hence the OS has to keep track of this information itself.
This is the kind of scenario that gives rise to the rule of never letting more than one program change the time of the RTC.

On FreeBSD, this information is given via the /etc/wall_cmos_clock file: if the file exists it means that the hardware clock keeps local time, otherwise the hardware clock is in UTC.
On Linux, this information is passed to the kernel at boot time via the persistent_clock_is_local kernel parameter/stanza (see notes timekeeping.c). The RTC can also be queried and set in localtime or UTC via the hwclock(8) options --localtime or --utc which indicate which timescale the hardware clock is set to, hwclock will store this info in /etc/adjtime.

Hence, we have to keep those clocks in sync. The best way to do this is to rely on the predictable inaccuracy/systematic drift/instrument bias of the hardware clock. We can measure its drift rate, and apply a correction factor in software.

On Linux there are two tools to perform this, hwclock(8) and adjtimex(8), while on FreeBSD there is adjkerntz(8).

hwclock(8), and its predecessor clock(8), let you query, calculate drift for, and adjust the hardware clock and the kernel/system clock in both directions. While with clock(8) you had to calculate the drift manually, hwclock(8) does it automatically.
It does so by keeping track, in an ASCII file called /etc/adjtime, of the historical information: records of how the clock drifts over time, and whether the hardware clock is in UTC or local time, as we said before.

Here are some example runs:

# adjust drift of RTC
> hwclock --adjust

# set RTC to the same time given by --date
> hwclock --set --date='19:30'

# set the RTC from system clock
# and update the drift at the same time
> hwclock --systohc --update-drift

Thus, it would be a good idea to call hwclock(8) periodically in a cron job to keep the hardware time in sync and calibrate the drift.

On FreeBSD, the utility adjkerntz(8) is used similarly but only for local time RTC. It’s called at system startup and shutdown with the -i option from the init system before any other daemon is started and sets the kernel clock from the RTC, managing the DST and timezone related configuration.

Taking a look at some hwclock(8) options gives us an idea about many RTC quirks.

We can select the clock device using these:

# if it's an ISA system
> hwclock --directisa

# and maybe specify the device file explicitly
> hwclock -f /dev/rtc1

We can set the Epoch as something other than 1970:

# get it
> hwclock --getepoch

# set it
> hwclock --setepoch --epoch=1952

However, this is only available on machines that support it.

We can specify if it’s a clock that has issues with years above 1999 (which I can’t find in the man page on my machine though):

# indicate the clock can't support years
# outside the 1994-1999 range
> hwclock --badyear

As for the adjtimex(8) tool on Linux: it doesn’t actually change the hardware clock at all, but specializes in the nitty-gritty details of the system/kernel clock and its relation to hardware.

It’s especially useful for manually readjusting the system clock based on the drift of the RTC and raw access to kernel settings related to system time.

For instance, it can be used to change the speed of the system clock, telling it how much to add to the time whenever it receives an interrupt. For example, if the system clock ticks faster than it’s supposed to, it can be made to tick slower, or each tick can be made to represent a smaller value to add to the time; both options are possible.
Those are done through the --frequency and --tick options respectively.

It can also be used to change the offset/drift, apply adjustments to the system time, and control what affects the hardware clock.

Interesting options are -c and -a, which keep comparing the system time and hardware clock time every 10s and print the tick and frequency offsets; this is useful for estimating the systematic drift and then storing it in /etc/adjtime, which -a actually does.

Example of a run:

                                      --- current ---   -- suggested --
cmos time     system-cmos  error_ppm   tick      freq    tick      freq
1587136817       0.277212
1587136827       0.278389      117.7  10000  -1701056
1587136837       0.279261       87.2  10000  -1701056    9998   5690519
1587136847       0.280304      104.3  10000  -1701056    9998   4571769

So my system considers 10000 ticks to be equal to 10s, basically having 1K ticks a second, but it’s suggested that I use 9998 per 10s instead.

Note also the error_ppm; ppm stands for parts per million, meaning I’ve got a delta error of around 103 ticks per million that I need to slew forward.
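To get a feel for what such an error means, a back-of-the-envelope calculation in plain awk arithmetic: around 104 ppm over a 86400-second day amounts to roughly 9 seconds of drift:

```shell
# seconds of drift per day = error_ppm * 86400 / 1,000,000
awk 'BEGIN { printf "%.1f seconds/day\n", 104 * 86400 / 1000000 }'
# → 9.0 seconds/day
```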

The -p option prints the internal kernel parameters related to time ticking.

         mode: 0
       offset: -852985
    frequency: -1701056
     maxerror: 483000
     esterror: 0
       status: 8193
time_constant: 7
    precision: 1
    tolerance: 32768000
         tick: 10000
     raw time:  1587136914s 254051439us = 1587136914.254051439

The status is a bit mask that represents the following:

    1   PLL updates enabled
    2   PPS freq discipline enabled
    4   PPS time discipline enabled
    8   frequency-lock mode enabled
   16   inserting leap second
   32   deleting leap second
   64   clock unsynchronized
  128   holding frequency
  256   PPS signal present
  512   PPS signal jitter exceeded
 1024   PPS signal wander exceeded
 2048   PPS signal calibration error
 4096   clock hardware fault
 8192   Nanosecond resolution (0 is microsecond)
16384   Mode is FLL instead of PLL
32768   Clock source is B instead of A

PPS stands for Pulse Per Second, PLL for Phase-Locked Loop, and FLL for Frequency-Locked Loop. These are different clock circuitries/feedback loops, disciplines, and slewing techniques, basically different methods of adjusting frequency and ticks to match real time, each affected differently by the environment and with their own ups and downs.

What we can deduce is that my clock, with status 8193 (8192 + 1), has PLL updates enabled and nanosecond resolution.

That means that if 1000 ticks make up a second, a tick happens every millisecond. Can’t we have more precise ticks? Aren’t I supposed to get nanosecond precision? But wouldn’t that clog the CPU altogether, and can we get multiple timers too? We’ll see that with high resolution clocks later on; for now, again, let’s keep those questions and concepts in mind.

An interesting line in the output of adjtimex catches our attention: bit 6, or 64 in decimal, “clock unsynchronized”. What does it mean? It’s related to the “System clock synchronized: yes” line of systemd’s timedatectl output.

An inspection of the Linux kernel source code lets us know that there’s a mechanism in the kernel to automatically synchronize the hardware clock with the system clock. It goes under the name of the NTP “11 minute mode” because it adjusts the RTC every 11 minutes.

Many other Unix-like operating systems choose to do this: have the kernel be the only program that syncs hardware time to system time, so that other programs don’t have to worry about all the drifting and calculations. In this case we don’t need a cron job that adjusts the time, the kernel already does it for us. Sometimes, however, the kernel won’t record the drift anywhere while in this mode.

So it is synchronized by default on my system.

On Linux, the ways to turn it off are to stop the NTP daemon (network time protocol daemon, which we’ll see in the next section), to call any program that sets the system clock from the hardware clock, such as hwclock --hctosys, or to actually recompile the kernel without the related option, RTC_SYSTOHC.

The ntptime(1) command also shows the Linux kernel time_status value:

  status 0x2001 (PLL,NANO),

Let’s move to this high precision topic we’ve kept in mind.

So my system clock has nanosecond precision. Are there ways to get higher precision from it, and what does that mean from the timer interrupt perspective? How much time should we spend handling timer events instead of executing programs? How are the timers implemented in the instruction set, do I have a choice of clocks, and are there other instructions to call? How do I check all that?

We said briefly that system time is kept track of using interrupts generated at predefined times or at specific periodic intervals. Whenever they happen, the kernel needs to handle time-based events such as the scheduling of processes, statistics calculation, timekeeping (time of day), profiling, measurements, and more.
Different machines have different kinds of timer devices providing this functionality. The job of the OS is to provide a system that unifies them abstractly to handle timer events for specific usages, using the best type of timer for the type of event it’s handling. It does so by programming them to fire periodically or one-shot, and by keeping track of the event subscriptions it needs to handle.
Which hardware is available depends on many factors, but the most important one is the CPU and thus the architecture of the platform and its instruction set. Let’s see what sort of timers we can find in our systems today that we could choose from as clock event devices.

  • RTC - Real Time Clock

We could choose to actually rely on the RTC directly and nothing else. However, that comes at a cost: it’s quite slow, ticking somewhere between every 0.5s and every 0.1ms, to stay energy efficient. So let’s leave the RTC for boot time only, and not use it for timers.

  • TSC - Time Stamp Counter

The Time Stamp Counter is a 64-bit register called TSC, present on all x86 processors since the Pentium. It’s driven by the CLK input pin that also drives the CPU clock and thus ticks at the same frequency. For example, a 2GHz CPU makes this register tick every 0.5 nanosecond. The TSC register can be queried with the rdtsc (read TSC) instruction. It’s very useful: as it ticks along with the CPU, it can help us calculate time precisely, provided we know the frequency of the CPU. However, it’s not so reliable if the frequency can change over time.

  • PIT - Programmable Interrupt Timer

The Programmable Interrupt Timer (more properly, the Programmable Interval Timer, historically the Intel 8253/8254 chip) can be programmed to send global interrupts after a certain time has elapsed, one-shot or periodically. It has 16-bit counters and variable frequency rates that can be configured.

  • APIC - Advanced Programmable Interrupt Controller

Similar to the PIT in that it can issue one-shot or periodic interrupts. Its counter is 32 bits, and the interrupt is sent to the specific processor that requested it instead of globally (which the PIT would do). Its frequency is based on the bus clock signal and can also be controlled, though less flexibly than the PIT’s.

  • ACPI_PM - ACPI Power Management Timer

The ACPI Power Management Timer is part of ACPI-based motherboards and has quite a low frequency of 3.58MHz, ticking every 279ns. It isn’t nearly as accurate as the other timers, but it has the advantage of not being affected by power-management changes. It should be used as a last-resort clock source.

  • HPET - High Precision Event Timer

The High Precision Event Timer is a chip integrated into the southbridge. It provides multiple hardware timers, up to eight 32- or 64-bit independent counters, each having its own clock signal and a frequency of at least 10MHz (100ns). It is less precise than the TSC, but it has the advantage of being separate from the CPU and offering multiple clocks.

It’s always good to keep in mind that all those numbers about precision are best-case scenarios and that we may have overheads. We still have to remember that, for example, querying the TSC means first issuing the rdtsc (or rdtscp) instruction, which itself takes time to execute. Having a machine ticking at 0.5ns doesn’t mean we’ll be able to measure such intervals precisely.
Regarding the TSC, we can only use it as a real time counter when it is stable. If it changes with CPU frequency we can’t rely on it to calculate time properly, as the distance between ticks will vary. TSCs are categorized as “constant”, “invariant”, “non-stop”, or none. “Constant” means the rate doesn’t change with CPU frequency scaling, though the counter may stop in deep C-states, C-states referring to the low power modes of the CPU. “Non-stop” means it keeps counting in those C-states, and “invariant” means it’s both “constant” and “non-stop”.

On Linux you can check the features your CPU supports by consulting the flags in /proc/cpuinfo.


flags : tsc constant_tsc nonstop_tsc tsc_scale

NB: tsc_scale is used for virtualisation.

Before checking for the availability of the hardware timers and what is currently set for what on your system, let’s take a moment to understand where we can use these timers.

There are in general 3 uses for the timers we’ve seen: clock source, clock event, and clock scheduling.

The clock source is the one that provides the basic timeline; it should be a continuous, (ideally) non-stop, monotonic, uniform timer that tells you where you are in time. It’s used to provide the system time, the POSIX time counter; when issuing date, this is what is consulted.
So the clock source should have a high resolution, and its frequency should be as stable and correct as possible, otherwise it may require an external source to sync it properly.

Clock events are the reverse of the clock source: they pick points on the timeline and interrupt at those points, providing a higher resolution. They could in theory use the same hardware as the clock source, but they are not limited to it; they can use all the other hardware specialized in sending interrupts after a programmed time, to trigger the events on the system timeline. It’s also interesting to have events triggered per CPU so that they are handled independently, which makes the APIC especially useful here.

Clock scheduling is about how time affects the scheduling of processes on the system: what timeslice is used to run a process before switching to another. This could possibly be the same counter as the clock source; however it usually needs smaller intervals, as it has to be very fast but doesn’t have to be accurate.

The clock source keeps time as a counter we refer to as jiffies. Jiffies hold the number of ticks that have happened since the system booted; the counter is incremented by 1 at each timer interrupt. The number of ticks/interrupts in a second is denoted by a constant defined at compile time or as a kernel parameter called HZ, for Hertz; it’s named this way in most Unix-like OS.
That means there are HZ ticks in a second, thus HZ jiffies per second. So HZ represents the precision of our clock source, and thus of the system time. For example, if HZ=1000, the system time has a resolution of 1ms (1/HZ seconds).
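The relation between HZ and resolution is simple arithmetic; for a few common CONFIG_HZ values:

```shell
# tick interval in microseconds for common HZ values
for hz in 100 250 300 1000; do
    echo "HZ=$hz -> tick every $(( 1000000 / hz )) us"
done
```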

On Linux you can check that value using:

getconf CLK_TCK

However, it is deprecated and will always return 100 (10ms), regardless of the actual precision. The real value has to be set as a kernel configuration option, CONFIG_HZ.

Nonetheless, it isn’t such a good idea to go to a higher-precision HZ, because if scheduling relies on jiffies it can affect performance.

Now let’s check how we can see which devices we support and change the clocks.

On Linux, there aren’t many options regarding anything other than the clock source (system time). To check the ones available and the one currently in use, you can rely on the /sys filesystem.

> cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm

> cat /sys/devices/system/clocksource/clocksource0/current_clocksource

The clock source can be changed while the system is running by echoing the new clock to the same location:

> echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource

For permanent changes you can recompile the kernel with different options or set the clock at boot by passing it as the clocksource option to the Linux kernel (kernel stanza) in grub or any other boot-manager.

linux    /boot/vmlinuz-linux root=UUID=12345678-1234-1234-1234-12345678 rw quiet clocksource=acpi_pm hpet=enable

Additionally, you can enable or disable hpet as the base timer for clock events.

As of today here are the relevant configurations and different clock sources for multiple CPU architectures:

clocksource=	Override the default clocksource
		Format: <string>
		Override the default clocksource and use the clocksource
		with the name specified.
		Some clocksource names to choose from, depending on
		the platform:
		[all] jiffies (this is the base, fallback clocksource)
		[ACPI] acpi_pm
		[ARM] imx_timer1,OSTS,netx_timer,mpu_timer2,
		[X86-32] pit,hpet,tsc;
			scx200_hrt on Geode; cyclone on IBM x440
		[PARISC] cr16
		[S390] tod
		[SH] SuperH
		[SPARC64] tick
		[X86-64] hpet,tsc

hpet=		[X86-32,HPET] option to control HPET usage
		Format: { enable (default) | disable | force |
			verbose }
		disable: disable HPET and use PIT instead
		force: allow force enabled of undocumented chips (ICH4,
			VIA, nVidia)
		verbose: show contents of HPET registers during setup

The process is quite similar on FreeBSD. By default, it is aware of the timers available on the system and automatically ranks and chooses the best possible ones.

It keeps three timekeeping clocks: one it calls hardclock, running at 1000HZ (1ms), which is the same as the clock source; one it calls statclock, used for statistics and scheduler events, with a frequency of 128HZ; and a last one called profclock, which is a bit higher in precision, 0.125ms. All of these can be tuned to preference.

To list them you can use sysctl:

> sysctl kern.eventtimer
# or
> sysctl -a | grep kern.eventtimer

This should return the list of possible timers in the kern.eventtimer.choice entry.

Example output:

kern.eventtimer.choice: HPET(550) LAPIC(400) i8254(100) RTC(0)

The currently selected timer is stored in the kern.eventtimer.timer entry.

The documentation about what the flags mean can be found in the eventtimers(4) manpage; they relate to what the clock supports (periodic or not, per-CPU or not). Those values can be changed in the /etc/sysctl.conf file or tuned via sysctl on the command line.

As with Linux, on FreeBSD hpet can be used for events if the driver is present and enabled; it's part of the ACPI subsystem. FreeBSD offers some beautiful documentation about it in the hpet(4) manpage, discussing the configuration too: for instance, whether it can be used to support event timer functionality, and how many timers on the HPET can be used per CPU.

So now we should be all set: if we call POSIX functions from <time.h> such as gettimeofday, we get the result in a structure that contains microseconds (0.001ms) if the precision allows it. POSIX 1003.1b actually goes further and defines interfaces with nanosecond precision.
There is also the POSIX clock_gettime() family of functions, which lets you specify which clock to get the time from, and clock_getres(), which lets you query the precision of the available clocks. The clocks you can pass to those functions are listed in the manpage and are useful for profiling, CLOCK_MONOTONIC being the best one to measure the time between two events.
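As a quick illustration, Python (3.3+) wraps these same POSIX calls in its time module; a small sketch querying clock resolutions and timing an interval with CLOCK_MONOTONIC:

```python
import time

# Query the resolution (precision) of two POSIX clocks,
# the Python equivalent of clock_getres().
print("realtime res:", time.clock_getres(time.CLOCK_REALTIME))
print("monotonic res:", time.clock_getres(time.CLOCK_MONOTONIC))

# CLOCK_MONOTONIC is the right choice to measure the time
# between two events: it never jumps backward.
start = time.clock_gettime(time.CLOCK_MONOTONIC)
time.sleep(0.01)
elapsed = time.clock_gettime(time.CLOCK_MONOTONIC) - start
print(f"elapsed: {elapsed:.6f}s")
```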

There used to be a time on Linux when all the timers on the system were coupled to jiffies; that isn't the case today. We now have a decoupled clock event subsystem that manages interrupt delegation, so the source device can be swapped without breaking everything. Linux also added a kernel configuration called CONFIG_HIGH_RES_TIMERS to allow high-resolution timers, which is now enabled almost everywhere.
This led to the concept of dynamic ticks: having the scheduling clock tick at different speeds without affecting the clock source timeline, which can be used to save energy/power.
This furthermore led to the idea of tickless systems, where the timeslice for scheduling is actually controlled by the scheduler instead of following HZ. The CONFIG_NO_HZ kernel option enables this, and it is on in most desktops today.

# CONFIG_NO_HZ_FULL is not set

On Linux, all the information about timers and their statistics is propagated to user space in /proc for advanced debugging.

For instance, /proc/timer_list gives us a list of the currently configured clocks and running timers. We can use it to check their precision:

Example output:

now at 294115539550 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
 event_handler:  hrtimer_interrupt

We can see that the .resolution is 1 nsecs and that the event_handler is hrtimer_interrupt instead of tick_handle_periodic, which would be for lower resolution timers.

/proc/timer_stats is an advanced debugging feature that can be enabled via the CONFIG_TIMER_STATS kernel option and that lets us gather statistics about all the timers on the system; you can turn it on and off whenever you want. It can tell us which routines in the kernel are using timers and how frequently they request them. The format is as follows:

<count>,  <pid> <command>   <start_func> (<expire_func>)
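A line in that format can be pulled apart with a small regular expression; the sample line below is made up to illustrate the format, not real timer_stats output:

```python
import re

# Parse a /proc/timer_stats line of the form:
#   <count>,  <pid> <command>   <start_func> (<expire_func>)
LINE = re.compile(
    r"\s*(?P<count>\d+),\s+(?P<pid>\d+)\s+(?P<command>\S+)"
    r"\s+(?P<start_func>\S+)\s+\((?P<expire_func>[^)]+)\)"
)

# Hypothetical sample line for illustration.
sample = "  150,     1 systemd          hrtimer_start_range_ns (tick_sched_timer)"
m = LINE.match(sample)
print(m.groupdict())
```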

Now let’s move to syncing the system time using an external source.

Syncing time with external sources

  • Ways to update time
  • Why use external sources
  • A primer on precision calculation and terms
  • Discipline/slewing and smearing
  • Leap second propagation
  • List of external sources
  • NTP
    • Protocol
    • Tools/clients
    • Pool and organization
    • List of public servers
  • Other implementations
  • Security issues
  • PTP

Before starting this section I’d like to point out three ways that time can be updated.
One is called stepping, and it consists of making one discontinuous change to time: a sudden instant jump from one value to another, backward or forward in time. This happens when something triggers the system time to be set to a specific value; external time sources can do this.
Another is called slewing, or sometimes disciplining, and it consists of making the clock frequency tick faster or slower, or changing the value that each tick represents; that is, adjusting the clock gradually over time. This is what we've seen in the previous section with tools such as adjtimex(8) for system time.
The last way to change time is actually a category of slewing called smearing, or fudging, and it consists of gradually applying parts of a larger chunk of time over a period. This leads to fewer disruptive changes: if, for example, we have to add 10s to our system time, we can split it across a whole day, about 0.12ms every second. Strictly speaking it fits in the slewing category, but we usually talk about smearing when we are forced to apply an expected change, such as a leap second.
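The arithmetic of that example is worth making concrete: the per-second slice and the equivalent rate change when smearing 10 seconds over a day:

```python
# Smearing a 10-second correction linearly over one day:
# each second of wall time absorbs a tiny slice of the offset.
offset = 10.0          # seconds to apply
window = 24 * 3600     # one day, in seconds

per_second = offset / window
print(f"{per_second * 1000:.4f} ms per second")   # ~0.1157 ms
print(f"{per_second * 1e6:.1f} ppm rate change")  # ~115.7 ppm
```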

We also have to keep in mind the time it takes to fetch the value of time from a source: the time for it to be transmitted and interpreted by our system. It won't be an instant transfer, and it can take long enough that we are forced to take it into consideration when calculating adjustments. We call this the delay, the time it takes to do the round trip. As we'll see, there are many other factors to take into consideration too.

So why should we rely on an external source of time, and why should we care about having precise time?

At human scale, I can rotate the hands of my watch a little and adjust it to whatever someone else has. Nothing horrible is going to happen because of it, right? The majority of people don't need precision better than a couple of minutes.
However, computers are different: you're not the one monitoring time, and their errors accumulate pretty fast if you forget about them; clocks drift, as we've repeatedly said.

So who needs millisecond accuracy or more? Who has a need for precise time?
And precise time according to what? Haven't we said in the previous section that our system clocks already have a pretty good calibration mechanism in place?
However, even with all the accuracy we can get, we’re still going to drift, no matter what, and we still need to be aware of UTC changes such as leap seconds.

So what does accurate mean? Do you need accuracy across your internal network of machines, each not drifting too far away from the others even though not in sync with UTC? Or do you need them all to be in sync with UTC or your time zone?
How much change in time does the software you are running tolerate? Does it expect monotonic time, can it handle jumps, is it an issue if it differs from real UTC time? Do you have to keep in sync with UTC because of the software itself, because of compliance with standards, or because otherwise the meaning of the timestamps is lost?

Here’s a list of systems that actually require accurate synchronization with UTC:

  • Security related software that verify certificates
  • Security related software to match timestamp with real life events such as CCTV
  • Similarly, intruder detection software
  • Similarly, any type of audit logs or timestamping based on real world events
  • Similarly, any network monitoring, measurement and control tool
  • Radio, Telecommunication, and TV programs
  • Any type of real time multimedia synchronization
  • Many types of distributed systems
  • Money related events, such as stock market
  • Aviation traffic control software

We have to add to this list all the machines with nasty system clocks that drift in unexpected ways. Such machines are better off syncing with an external source.

So, in which cases should we not synchronize time with an external source, then?

  • If the accuracy and adjustment that the system clock provides is enough
  • If we don’t want to worry about managing another daemon on our system
  • If the system has access neither to the internet nor to any physical external source of time.

Now that we know if we need an external source of time or not, let’s see how to calculate how precise and stable these sources are.

When using a clock as reference we want to make sure it’s stable and precise.

A clock's frequency stability is rated and measured by its ppm or ppb error: parts per million or parts per billion. The "part" can be either a number of ticks or a number of seconds drifting from the actual value (both work out the same). The smaller this ppm value, the more stable the clock.
The reasons why a clock drifts are environmental variations such as temperature, aging of the material, G-force, change in voltage, etc.

What does this mean?

Let's take as an example an HPET clock with a frequency of 10MHz (10,000,000Hz) in an environment between -40 and 80 degrees C, and say the clock manufacturer specifies a stability that varies between -7.5 and +7.5 ppm.

That means for every 1 million ticks, or 1 million seconds, there is a variation of plus or minus 7.5. Over a whole day the clock could drift by up to 0.648 seconds:

(7.5/1M) * 86400 = 0.648s
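The same calculation as a small helper, usable for any ppm figure and interval:

```python
def drift_seconds(ppm: float, interval_s: float) -> float:
    """Worst-case drift of a clock with the given ppm error
    over an interval, in seconds."""
    return (ppm / 1_000_000) * interval_s

# The HPET example from the text: +/-7.5 ppm over one day.
print(round(drift_seconds(7.5, 86_400), 3))  # 0.648
```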

Atomic clocks have tremendously tiny error variations, those can be between 0.0001 and 0.000001 ppb (part per billion). They are drifting by a second every ~300k to 30M years, which confirms what we’ve explored in the first section of this article.

Temperature plays such a big role in the stability of our clocks that to make them more stable we could either lock our machines in a temperature-controlled environment or come up with a way to automatically compensate for how much the temperature affects the clock.
While the first option is possible in data centers, it's not something most of us can do.

However, we could devise an experiment in which we find a formula that calculates how much the temperature affects the clock frequency and slew our clock appropriately.
We could monitor the temperature and feed it to the correction mechanism in whatever software or means we use to handle setting time. It all comes down to gathering data points pairing temperature with how much the clock drifts, plotting them on a graph, and finding an equation that passes through those points: simple math using polynomial interpolation.

Unfortunately, no solution is perfect and this could be overly optimistic; correlation doesn't equal causation. Still, such mechanisms are great for keeping the clock stable within a certain temperature range. Some experiments have found that temperature compensation reduces deviation by a factor of 3.5. Our earlier drift of 0.648s would be reduced to 0.185s, or 2.14 ppm instead of 7.5 ppm.
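A toy sketch of the idea, assuming made-up (temperature, drift) samples and a simple linear model fitted with ordinary least squares; real compensation would use more data and likely a higher-degree polynomial:

```python
# Fit drift_ppm = a * temp + b from (temperature, drift) samples
# with ordinary least squares, then predict the correction.
def fit_line(points):
    n = len(points)
    sx = sum(t for t, _ in points)
    sy = sum(d for _, d in points)
    sxx = sum(t * t for t, _ in points)
    sxy = sum(t * d for t, d in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# (temperature in C, measured drift in ppm), hypothetical values.
samples = [(20, 1.0), (30, 2.1), (40, 2.9), (50, 4.2)]
a, b = fit_line(samples)

def compensation_ppm(temp):
    # Slew the clock by the negated predicted drift.
    return -(a * temp + b)

print(round(compensation_ppm(35), 2))
```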

Let’s now define some important terms we need in our inventory to understand everything that is coming next about external clocks.

  • “Reference clock” or “ordinary clock”, meaning any machine that can be used to retrieve accurate time, usually in UTC so that it can be used by anyone. Those could range from cesium clocks, to GPS, to terrestrial broadcasts like radio clocks.

The time from the reference clock will be forwarded from one server to another until it reaches you. Thus, the reliability of the network, and how far you are from it, play a big role in how accurate the value from the reference clock will be.

We're using "servers connected to an external time source" and "external time sources" themselves interchangeably here unless explicitly mentioned.

  • “Delay”, a word we’ve seen before that means the time it takes to do a round trip. It’s normally calculated by timestamping on both ends and doing an estimate of the difference in transport and processing.

  • “Offset” or “Phase”, is the time difference/deviation between the clock on one end and the clock on another end, usually your clock and a reference clock. Phase referring to the oscillation rhythm difference, as in “out of phase”.

  • “Jitter” or “Dispersion”, the difference between successive time values across subsequent requests to a remote server. It's a great criterion to measure the stability of the network, namely how much the delay changes; if it varies a lot, the network isn't reliable. The term can be used as a measure of stability of any other repeatable action too.

  • “Clockhopping”, jumping from one server to another as the synchronization source, which results in less and less accuracy.

  • “Frequency error”, this is how much the reference clock or our local clock drifts over time, measured in ppm and ppb, as we’ve seen before.

  • “Stability”, the generic term to refer to how much we can trust a clock. It's also a term used in control theory to refer to how far we are from reaching a stable point (0).

  • “Accuracy”, also a generic term that means how far apart a machine’s time is away from UTC. The typical accuracy on the internet ranges from about 5ms to 100ms, varying with network delays.

  • “PPS”, or “Pulse Per Second”, a method of synchronizing two clocks based on a tick that happens every second.

  • “Watchdog timer”, is a timer that keeps the time since the last poll or update of time from the external source of time.

  • “Fudge”, a term I couldn't pin down a precise definition for, other than that it refers to any special way in which you can configure an external clock.

  • “Max Poll” and “Min Poll”, throttling parameters: the maximum and minimum amount of time that should pass before the remote server allows you to query it again. They are usually expressed as powers of 2; for example, 6 means 2^6, or 64 seconds.

  • “Stiffness” or “Update Interval” or “Time Constant” (τ tau), how much the clock is allowed to change in a specified amount of time, and the time between the updates. A small time constant (update interval) means a clock that is less stiff and slews quickly. It’s usually expressed like the max poll in powers of 2.
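The poll exponents convert to seconds with a simple power of two; the values in the comments below reflect ntpd's default minpoll and maxpoll:

```python
# NTP poll intervals are expressed as exponents of 2.
def poll_seconds(exponent: int) -> int:
    return 2 ** exponent

print(poll_seconds(6))   # 64 seconds, ntpd's default minpoll
print(poll_seconds(10))  # 1024 seconds, ntpd's default maxpoll
```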

When all those values differ a lot, we can't allow an abrupt jump of time on our end; that would disrupt local processes. So what we do is slew time, but that also means a slow calibration.
In an ideal world, all the reference clocks would agree; however, they don't. So what should we do if there's a big offset?

First off, if the offset is too big we don’t trust it until we have the same offset from multiple time sources.
If it’s small enough we go on with our slewing.
If the offset is still big, we have to set the clock anew, step it.

However, on boot we have to sync from the hardware clock, as we've seen before, which might be off; so we have to either slew the system time, which can take several hours, or make the updates less stiff to quickly reduce the offset (usually within 5 minutes with a less stiff PLL).

Moreover, we can't just trust any remote server or machine as a time source, so we ought to devise a mechanism, a sanity check, to filter which machines we trust and which we don't, maybe even combining multiple time sources in a quorum-like fashion.
We can evaluate remote machines for how stable they are by making them pass a statistical filter for their quality.

That also creates a trust issue at boot; what can be done is to send multiple quick requests to multiple external time servers to assess their reliability and get an estimate within about 10 seconds of booting.

As time goes on, our system clock should become more stable, and we should be requesting the remote servers less frequently.

This is possible through different feedback mechanisms that learn to adjust the system time appropriately. In a way, this is similar to the mechanism that fixes the hardware clock drifting but for the system clock which we haven’t tackled before.

Different Unix-like OS and software provide different means of adjusting the system clock according to external time. There are 4 mechanisms or system calls that can be used to implement the adjustment of the system clock.

The first method is through settimeofday(2), which is used to jump to a fixed place in time, to step it. This could be used for very big offsets.

The second method is through adjtime(2) which is used to slew the time by changing the frequency of the clock. You pass a delta to the function and if this is positive the clock will be sped up until it gains that delta and if negative the clock will slow down until it has lost this delta. This is intended to be used to make small adjustments to the system time and thus there’s a limit to how big the delta can be (plus or minus 2145 seconds).

The third method is through the hardpps() function that is internal to the kernel and handles an interrupt service that listens to a constant pulse that happens every second. The RFC 2783 defines how this API should behave, basically syncing the transition between pulses with the system clock.

The fourth and last method is the ntp_adjtime(2) function, an advanced mechanism to discipline the system clock. It is defined in RFC 1589, "A Kernel Model for Precision Timekeeping", and also goes under the name "kernel clock discipline". It was initially created as a better version of adjtime(2) that can be called by the software handling an external precision time source, as it accumulates successive precise corrections (possibly in the microsecond range).
This method of adjusting time is based on an algorithm that depends on multiple environmental factors and that can be tweaked as needed. From correcting frequency and offset, to enabling or disabling PPS events processing, to synchronization status, handling leap second, estimating error and frequency change tolerance, and more.
At the core of this kernel clock discipline algorithm lies a concept from the domain of control theory, a closed loop that accumulates successive corrections, an adaptive feedback loop mechanism that tries to minimize network overhead. Today, the algorithm uses two kinds of loops, one is a phase/offset locked loop (PLL), and the other is a frequency locked loop (FLL). We’ve hinted at those previously when checking the status bit of the adjtimex -p and ntptime commands.

> adjtimex -p

         mode: 0
       offset: -7431812
    frequency: -1677305
     maxerror: 2000
     esterror: 0
       status: 8193
time_constant: 7
    precision: 1
    tolerance: 32768000
         tick: 10000
     raw time:  1588007066s 608698606us = 1588007066.608698606
> ntptime

ntp_gettime() returns code 0 (OK)
  time e2518f79.db34a44c  Mon, Apr 27 2020 20:06:01.856, (.856272195),
  maximum error 49500 us, estimated error 0 us, TAI offset 0
ntp_adjtime() returns code 0 (OK)
  modes 0x0 (),
  offset -6172.114 us, frequency -25.594 ppm, interval 1 s,
  maximum error 49500 us, estimated error 0 us,
  status 0x2001 (PLL,NANO),
  time constant 7, precision 0.001 us, tolerance 500 ppm,

The main difference between phase-locked loops and frequency-locked loops is their predictor part, which outputs the value of the feedback loop. Both take the timestamp as input and compare it with local time, but what happens afterwards, whether they change the phase/offset or the frequency, depends on which one is chosen.

PLL is an offset discipline mode: its predictor is an integral of the offset over past updates, and it outputs the offset amortized over time in order to avoid setting the clock backward. It adjusts the clock gradually, by small increments or decrements, until the offset is gone. The time constant, aka update interval, is the rate at which it applies these updates: the smaller the time constant, the less stiff it is, and the faster it converges to an offset of 0 (stability in control theory).

FLL is a frequency based discipline mode, its predictor takes the offset and divides it by the time since the last update and adjusts the clock frequency such that at the next update the offset will be as small as possible.

In the most recent software, the two modes are used together and mixed. They are weighted according to the polling interval: when it is below the Allan intercept, which is 2048s by default (this can be changed), the phase-locked loop gets more weight; when the polling interval is higher, the frequency-locked loop weighs more.
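To make the two predictor styles concrete, here is a deliberately simplified toy, not the real kernel discipline algorithm: the PLL corrects a fraction of the offset per update (stiffer with a larger time constant), while the FLL turns the offset into a frequency correction:

```python
# Toy illustration of the two predictor styles.

def pll_step(offset, time_constant):
    # Apply only a fraction of the offset per update; a larger
    # time constant means a stiffer, slower-converging clock.
    return offset / (2 ** time_constant)

def fll_step(offset, interval):
    # Fractional frequency change that would cancel the offset
    # by the time of the next update.
    return offset / interval

offset = 0.010  # we are 10 ms ahead of the reference
for _ in range(10):
    offset -= pll_step(offset, time_constant=2)
print(f"residual offset after 10 PLL updates: {offset * 1000:.3f} ms")

print(f"FLL correction for 10 ms over a 64 s poll: {fll_step(0.010, 64):.2e}")
```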

When we're not fetching time from another machine connected to a reference clock, but have the reference clock connected directly to us, we'll require a hardware driver to interface with it.
These physical sources of time can implement their own clock discipline algorithm and synchronization protocol, so we have to adapt accordingly. If they do provide such a mechanism via their drivers, we let the external clock be in control and determine which discipline should be used; normally they will themselves call ntp_adjtime(2) with the parameters they know about. If that fails, we can fall back to the previous ways of adjusting time. Keep in mind that when an external clock takes care of system time adjustment, no other software can be aware of the error statistics and parameters it maintains.

Before moving on to what those devices could possibly be, let’s have a small note on leap second smearing.

There are two main ways to handle a leap second: we can step, that is stop the clock completely for a second (repeating a second) or skip an entire second, or we can slew the clock using the smooth kernel discipline we've seen.
This is what leap smearing is about. It's a standard pushed by Google to "smear" the leap second, that is slew it by running the clock slightly slower over 24h, from noon before the leap second to noon after it, so that the slewing happens linearly around the leap second.
The rate change for such a smear is about 11.6 ppm (roughly 0.0012%).

However, keep in mind that such a standard only has weight if everyone adheres to it; otherwise, during the leap second event, servers will disagree and dispersion will grow. That is why we shouldn't mix smearing and non-smearing time servers.

We've said we could connect to a physical precise time source to become a reference clock, so what can those be? We'll give as examples the two most popular categories.

In the first category we have terrestrial broadcasts, radio stations that broadcast time.
The most well known are CHU in Canada, WWV in Colorado, USA, and WWVH in Hawaii, USA.

CHU broadcasts at 3.33MHz, 7.85MHz, 14.67MHz since 1923, and WWV broadcasts on 2.5MHz, 5MHz, 10MHz, 15MHz, and 20MHz since 1945. They both get their time from other reliable sources such as atomic clocks and GPS.

As they are radio broadcasts, you need a radio receiver and a way to analyze the audio to be able to synchronize with them.

What they broadcast is a repetitive beep to sync on, pulses per second and per minute, some binary-coded decimal time code, and literally someone talking from time to time to say, in English (or French for the Canadian version), what the current UTC hour and minute is. So they alternate beeps, ticks, and voice announcements.
You can give those a listen by searching their names on Wikipedia or YouTube, or by actually tuning your radio to the right frequency.

Furthermore, there are also telephone numbers that you can call to get the time, similarly to the radio. One of them is provided by the same organization as the CHU, the NRC, National Research Council of Canada.

In the second category we have GPS, the Global Positioning System.
And let's be more explicit here: we're talking about the American NAVSTAR GPS, composed of multiple satellites in roughly 20,000 km orbits, with at least 4 satellites always visible from any point on Earth.

To sync time with a GPS you need a GPS receiver, some of those also come with a pulse per second feature for accuracy’s sake. The receiver catches the civilian band frequency that the GPS continuously broadcasts and decodes the signal to get the messages in it. This message contains a multitude of information, but we’re only interested in what is time related.

The GPS satellites include atomic clocks that are accurate to the nanosecond. However, we lose a bit of accuracy because of the delay between us and the satellite. You'd think that since they have atomic clocks they would follow TAI (International Atomic Time); instead they follow their own special time scale called GPST, the Global Positioning System Time.
GPST is similar to TAI in that it is constant and unaffected by the rotation of the Earth, but its epoch, its 0h, was set on 6 January 1980. Consequently, it includes all the leap seconds before that date but none of the ones after, so it currently differs from TAI by 19 seconds, and from UTC by even more. That is why newer GPS units include in their messages an 8-bit field containing the time difference between GPST and UTC, the number of leap seconds it has missed, so that you can easily recover UTC.

The time format that GPS stores and broadcasts doesn't use the year/month/day Gregorian calendar but expresses it as a week number and a seconds-into-week number. The week is a 10-bit field and thus rolls over every 1024 weeks, approximately 19.6 years. The first rollover happened on August 21, 1999 and the second one on April 6, 2019.
So to determine the Gregorian date you need to know which GPS epoch you are in. Future GPS versions update the standard to use a 13-bit field instead, rolling over every 157 years.
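A small sketch of the week arithmetic using the GPS epoch; the rollovers parameter is exactly the piece of context receivers got wrong, and leap seconds are ignored for simplicity:

```python
from datetime import datetime, timedelta

GPS_EPOCH = datetime(1980, 1, 6)

def gps_week_to_date(week: int, seconds_into_week: float, rollovers: int = 2):
    """Convert a 10-bit GPS week number plus seconds-into-week to a
    calendar date. Since the week field wraps every 1024 weeks, the
    caller must supply how many rollovers have occurred (2 as of 2019).
    Leap seconds are ignored for simplicity."""
    total_weeks = rollovers * 1024 + week
    return GPS_EPOCH + timedelta(weeks=total_weeks, seconds=seconds_into_week)

# Week 0 of the third GPS era began on April 7, 2019,
# right after the second rollover.
print(gps_week_to_date(0, 0))  # 2019-04-07 00:00:00
```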

This week-rollover phenomenon has been deemed the "Y2K" of GPS, because many device drivers didn't anticipate it and had hard-coded the GPS epoch.
A solution would be to derive the GPS epoch from the leap second data broadcast by the GPS together with a leap second table. Weirdly, a GPS vendor has a patent on a similar technique, so you can't use it exactly the same way. Sometimes shipped software is shipped software and nobody is going to touch or update it; beware.

Apart from NAVSTAR, plenty of other space agencies have launched satellite navigation systems:

  • Beidou, People's Republic of China
  • Galileo, European Union and other partner countries
  • GLONASS, Russia; peculiarly, its time scale is tied to UTC(SU), the Russian realization of UTC
  • NavIC, Indian Space Research Organisation
  • Michibiki, a regional navigation system receivable in the Asia-Oceania regions

A last nota bene: your position is derived from how far you are (the delay) from 4 satellites, by calculating the intersection.

To link this to the previous ideas: if you have a driver that supports one of these external clock hardware receivers, it should implement ntp_adjtime(2) or a custom discipline to take care of adjusting time itself. Be sure to check the list of drivers available for your solution.

Let's proceed from the abstract to the concrete: which protocols and standards can be used to implement time synchronization with external sources of time?

The most trivial protocol is the Time Protocol, defined in RFC 868. It's a simple client-server protocol where the server, upon receiving a request, directly replies with the time in seconds since midnight, January 1st 1900 GMT, as a 32-bit binary number.
The protocol runs on UDP and TCP on port 37, as /etc/services shows:

time               37/tcp
time               37/udp

Because it's based on a 32-bit value, it will roll over at some point in 2036, which will deprecate it unless the value is upgraded to 64 bits.

While it's simple, it doesn't take leap seconds or delays into consideration, is only precise to the second, and disregards all the considerations about time we've previously mentioned.
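Since the RFC 868 value counts seconds from the 1900 epoch while Unix time counts from 1970, converting is a single subtraction of a fixed constant. A minimal sketch, with the payload constructed locally rather than fetched from a real port 37 server:

```python
import struct

# RFC 868 replies with seconds since 1900-01-01 00:00 GMT as an
# unsigned 32-bit big-endian integer; Unix time starts in 1970.
SECONDS_1900_TO_1970 = 2_208_988_800

def rfc868_to_unix(payload: bytes) -> int:
    (since_1900,) = struct.unpack("!I", payload)
    return since_1900 - SECONDS_1900_TO_1970

# A server answering exactly at the Unix epoch would send:
payload = struct.pack("!I", SECONDS_1900_TO_1970)
print(rfc868_to_unix(payload))  # 0
```

A real client would open a TCP connection to port 37, recv(4) bytes, and feed them to this function; public servers still speaking this protocol are rare today.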

You can give the time protocol a try by testing it with the rdate utility.

rdate - get the time via the network

rdate connects to an RFC 868 time server over a TCP/IP network, printing
the returned time and/or setting the system clock.

The evolution of the Time Protocol is the Network Time Protocol, or NTP. It takes into consideration multiple things the former did not, including: leap seconds, a broadcasting mechanism, active/passive modes, security and digests, a hierarchical level of accuracy, polling mechanisms, more precision, versioning, consideration of delays, categorizing known clocks by reference identification, and much more.

The current protocol stands at version 4, NTPv4, which is documented in rfc 5905 but has additional addendum for extensions. It is backward compatible with its previous version, NTPv3, in rfc 1305.

NTP runs on UDP and TCP on port 123, as /etc/services shows:

ntp               123/tcp
ntp               123/udp

The timestamps that NTP sends and receives rely on UTC time, the timezone information is kept for local machines to decide. Additionally, NTP warns of any impending leap second adjustment.
Thus, in theory, all NTP servers should store the same UTC time up to a certain precision.

When an NTP client is running, we have to choose what to do with the hardware clock: do we sync it with system time? Many implementations either save the drift to a file so it can be used on the next boot, and/or rely on the kernel "11 minute mode" we talked about earlier. Moreover, if a network connection is available at boot time, there's the possibility of using NTP right away, removing the burden of relying on the RTC when the machine is offline.

NTP uses a hierarchy, a semi-layered division, to classify clocks that are available. It calls them strata.

The stratum, singular, is a measure of the synchronization distance to a reference clock. Remember, a reference clock is actual hardware that can be used to get precise time, like a GPS. The stratum is the number of servers we need to pass through to reach such a reference clock. Unlike jitter (dispersion) and delay, the stratum is a static measure: you don't gradually get further away from a reference clock.
So it’s preferable to use the closest (network distance) and lowest stratum possible NTP server.

The reference clock itself, the timekeeping device, is considered stratum 0 and the closest servers connected to it are at stratum 1. Thus, a server synchronized to a stratum n server will itself be considered stratum n+1.
The upper limit for the stratum is 15; in theory, above this the dispersion may grow too large to be reliable, though in practice it rarely goes above 5.

The stratum hierarchy helps spread the load and avoids cyclical clock dependencies, since the topology is a tree. A small number of servers give time to a large number of clients, which in turn can be servers to others. That implies that low-stratum servers, such as stratum 1 servers, should be highly available and well maintained to support the rest of the hierarchy.

In addition, the NTP message contains a reference identifier, the refid, which denotes which reference clock is used at stratum 0 on this path. So you can tell which source your time ultimately comes from.

Let’s also mention that NTP can be deployed locally, in a LAN. It’s possible to create your own hierarchy by acquiring a timekeeping device, such as a GPS, to avoid network delays and get a better precision.

NTP is not limited to the usual client/server architecture; it also includes a horizontal peering mode and a broadcast mechanism.

Horizontal peering is when multiple servers are coupled together in a group to synchronize time more accurately.

The broadcast mode works by having a server send the time to a broadcast address and having clients listen for NTP packets sent to that address. This mode is also useful for propagating a leap second, instead of announcing it only when a client connects.

On that note, on the day of a leap second event, the leap second can be propagated from a configuration file, a reference clock, or another NTP server. What happens next, how the leap second is applied, depends on the implementation: it can be a step (stop or skip) mechanism or leap second smearing, and it is applied at the level of the server.
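As an illustration, leap second smearing can be sketched as a linear slew: instead of stepping the clock, the extra second is spread over a window around the event. The 24-hour window and function names below are my own assumptions for illustration, not tied to any particular implementation:

```python
# Illustrative sketch of a linear leap second "smear": the one extra second
# is spread evenly over a fixed window instead of being applied as a step.
# The 24-hour window length here is an assumption, not a standard.

SMEAR_WINDOW = 24 * 60 * 60  # seconds over which to spread the leap second

def smear_offset(seconds_into_window: float) -> float:
    """Fraction of the leap second already applied at this point in the window."""
    if seconds_into_window <= 0:
        return 0.0
    if seconds_into_window >= SMEAR_WINDOW:
        return 1.0
    return seconds_into_window / SMEAR_WINDOW

def smeared_time(true_time: float, window_start: float) -> float:
    """Time reported to clients: true time slowed down by the smear so far."""
    return true_time - smear_offset(true_time - window_start)
```

Halfway through the window the server reports a clock that is half a second behind; by the end, the full second has been absorbed and no client ever saw a step.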

What does an NTP message look like?

In the early days of NTP, the timestamp in the message had the same issue as the Time Protocol: a single 32-bit value, and thus a rollover problem.

NTPv4 now uses a 128-bit date format split into two main parts: 64 bits for the seconds and 64 bits for the fractional seconds.
The seconds part is itself split in two: the most significant 32 bits hold the current era number (the number of rollovers), and the least significant 32 bits the number of seconds within that era. That removes all ambiguity, and the 64-bit fraction is fine enough to resolve the time it takes a photon to pass an electron at the speed of light, so it is very precise.
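As a sketch of the arithmetic, converting an NTP date (era number plus seconds within the era) to Unix time only requires the offset between the 1900 and 1970 epochs; the constant and function names below are my own:

```python
# Sketch: converting an NTP date (era number + seconds within the era) to
# Unix time. NTP's epoch is 1900-01-01, Unix's is 1970-01-01, and one era
# spans 2**32 seconds (era 0 rolls over in 2036).

NTP_TO_UNIX = 2_208_988_800  # seconds between 1900-01-01 and 1970-01-01

def ntp_date_to_unix(era: int, era_offset: int) -> int:
    """era: 32-bit era number; era_offset: seconds within that era."""
    return era * 2**32 + era_offset - NTP_TO_UNIX
```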

An NTP message looks like this:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |LI | VN  |Mode |    Stratum     |     Poll      |  Precision   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         Root Delay                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         Root Dispersion                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          Reference ID                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +                     Reference Timestamp (64)                  +
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +                      Origin Timestamp (64)                    +
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +                      Receive Timestamp (64)                   +
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +                      Transmit Timestamp (64)                  +
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      .                                                               .
      .                    Extension Field 1 (variable)               .
      .                                                               .
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      .                                                               .
      .                    Extension Field 2 (variable)               .
      .                                                               .
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          Key Identifier                       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      |                            dgst (128)                         |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Figure 8: Packet Header Format

As you can see, NTP is much more advanced than the Time Protocol. For example, a minimal request only needs the version and mode fields filled in, client mode being 3, or 011 in binary. The usual unencrypted messages are 90 bytes long on the wire (76 bytes at the IP layer). A broadcast happens every 64 seconds, while a client/server exchange requires 2 packets per transaction, initially once per minute, backing off to about once every 17 minutes in normal conditions.
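To make that minimal request concrete, here is a sketch of an SNTP-style client in Python. The helper names are mine, and the commented-out usage requires network access to a server of your choosing:

```python
# Sketch of a minimal SNTP-style query: a 48-byte packet where only the
# first byte is filled in, with LI=0, VN=4, Mode=3 (client). A server
# replies with a packet of the same shape carrying its timestamps.
import socket
import struct

def build_request() -> bytes:
    li, vn, mode = 0, 4, 3
    first_byte = (li << 6) | (vn << 3) | mode
    return bytes([first_byte]) + bytes(47)  # header fields left zeroed

def transmit_timestamp(reply: bytes) -> float:
    """Extract the 64-bit transmit timestamp (seconds since 1900) from a reply."""
    seconds, fraction = struct.unpack("!II", reply[40:48])
    return seconds + fraction / 2**32

# Usage (needs network; pool.ntp.org is the public pool mentioned earlier):
# s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# s.sendto(build_request(), ("pool.ntp.org", 123))
# reply, _ = s.recvfrom(48)
# print(transmit_timestamp(reply) - 2_208_988_800)  # roughly Unix time
```

A real client would also use the origin/receive timestamps to estimate round-trip delay and offset; this sketch only reads the server's transmit time.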

Although such a protocol requires minimal bandwidth, the sheer number of clients means there needs to be a throttling system. The polling interval of a client depends on many factors, including its current precision and the maximum and minimum polling intervals allowed by the server.

NTP servers are viewed as a public utility of sorts and thus need help from the public, especially from people who are knowledgeable and have access to static public IP addresses. The pool of public NTP servers needs to keep growing to serve the increase in clients.

You can view a list of public NTP servers here:

But why rely on publicly available NTP servers instead of building our own NTP server hierarchy on a LAN? Haven’t we said this would offer more precision?

Not only would it offer more precision, thanks to stable bandwidth and short network distance, it would also be under our control and thus not throttled, and so more available. It would also mean more security and trust, as you could put the NTP server in a local demilitarized zone (DMZ), which is often required to pass security accreditations.
However, all of this has a cost: the cost of acquiring and maintaining a timekeeping device such as a GPS receiver, the setup fees, the additional equipment, and the training of the team. It’s all a question of money.

But let’s say you want to deploy your own NTP server: what’s available out there for you to use, what are the implementations?

The reference implementation of the protocol, the canonical open source one, is called ntpd. It has been continuously developed and maintained for over 25 years.

It comes with a sensible default configuration that fetches time from a pool of NTP servers on the internet. Most Unix-like distros have packages that are easy to set up.

What you can configure ranges from the location of the drift file controlling the local clock, to the location of the leap second file and how it’s applied, to clock discipline settings like the jitter rate, to security options, to log locations, and to hardware driver configuration for when you are setting up a stratum 1 server.
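As a sketch, a minimal ntpd configuration might look like this (the paths and pool names are illustrative, not a recommendation):

```
# Minimal /etc/ntp.conf sketch (paths and pool names are illustrative)
driftfile /var/lib/ntp/ntp.drift

# Fetch time from the public pool; iburst speeds up the initial sync
pool 0.pool.ntp.org iburst
pool 1.pool.ntp.org iburst

# Allow time queries but no remote reconfiguration
restrict default kod nomodify notrap nopeer noquery
restrict 127.0.0.1
```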

The ntpq utility lets you manage an NTP server, be it local or remote, and query its status and configuration. Like openssl, it has both an interactive mode and a command-line argument mode.
For instance, the ntpq -p output is quite interesting.

Example output:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*time-A.timefreq .ACTS.           1 u  152 1024  377   43.527  -11.093   3.982
 …               …                2 u  230 1024  377   67.958   -7.729   0.071
 …               .ACTS.           1 u  323 1024  377   58.705  994.866 999.084

It displays the server name in the first column along with its state, a + meaning it’s a candidate server and a * meaning it’s the peer we are currently synchronized to. The refid column is the reference identifier we’ve mentioned. The st column is the stratum of the server. The when column shows the number of seconds since we last polled that server, and the poll column the number of seconds to wait between polls. The reach column is an octal bitmap of the results of the last 8 polls (377 means the last 8 all succeeded). The delay column shows the round trip in milliseconds, which, as we’ve said, varies with network stability and distance. The offset column is another term we’ve seen: the difference in milliseconds between your clock and the host’s. Finally, the jitter (or disp) column is the dispersion in milliseconds, the variation between queries to the same server, a measure of stability.
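Since the reach column trips up many people, here is a small sketch decoding that octal bitmap (the function name is mine):

```python
# Sketch: decoding ntpq's "reach" column, an octal bitmap of the last
# eight polls, with the most recent poll in the lowest bit.

def decode_reach(reach_octal: str) -> list[bool]:
    """Return success/failure of the last 8 polls, oldest first."""
    value = int(reach_octal, 8)
    return [bool(value & (1 << i)) for i in range(7, -1, -1)]
```

So 377 decodes to eight successes, while 376 means everything succeeded except the most recent poll.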

Another tool to test remote NTP servers is ntpdate. It can be used to initiate a sync of the local clock, but with the -d option it queries an NTP server without changing the system time.

Here’s a trace:

> ntpdate -d
30 Apr 19:04:27 ntpdate[72016]: ntpdate 4.2.8p14@1.3728-o Wed Mar 18 13:44:46 UTC 2020 (1)
Looking for host and service ntp reversed to
host found :

server, port 123
stratum 1, precision -29, leap 00, trust 000
refid [NIST], root delay 0.000244, root dispersion 0.000488
reference time:      e2557598.00000000  Thu, Apr 30 2020 19:04:40.000
originate timestamp: e255759a.e0bb5ccd  Thu, Apr 30 2020 19:04:42.877
transmit timestamp:  e255759a.c79bb883  Thu, Apr 30 2020 19:04:42.779
filter delay:  0.20827    0.23691    0.20261    0.21184   
               ----       ----       ----       ----      
filter offset: +0.006270  +0.003106  +0.010331  +0.005027 
               ----       ----       ----       ----      
delay 0.20261, dispersion 0.00426, offset +0.010331

30 Apr 19:04:42 ntpdate[72016]: adjust time server offset +0.010331 sec

By now you should be able to read the values and understand what they mean.

There are many other implementations of NTP servers besides the canonical ntpd. You can find multiple comparison tables online that show the differences between them. Here are a few things that often get mentioned:

  • The type of license
  • The programming language used
  • The size of the program
  • How well the codebase is cleaned up and maintained
  • The time sources supported and their numbers
  • Which reference clock drivers are supported
  • The NTP modes it supports
  • Which protocol versions it has implemented
  • If you can create clusters/pools
  • The way the clock discipline works and can be configured
  • If it supports temperature compensation
  • How it handles leap second correction and whether it is configurable
  • Security and authentication mechanisms
  • If it has rate limiting functionalities
  • The way it timestamps, is it kernel based or hardware based
  • If it has a way to manage the RTC or not
  • Monitoring related functionality

The canonical implementation, ntpd, fully supports the specs, as it is the reference implementation; it has been ported to the most operating systems, has the largest number of drivers, and is probably the most stable.

Chrony is another implementation of NTPv4. It was written from scratch and is known to be well maintained and secure.
Chrony’s biggest selling point is that it works remarkably well in environments where the external time source isn’t regularly available, such as a machine frequently disconnected from the internet. Though this begs the question of why use Chrony instead of relying on the built-in OS mechanisms we’ve seen in the previous section. You can even explicitly tell the daemon that you are about to go offline. Chrony’s other big advantage is that after an audit of multiple NTP implementations, it came out as the most secure among them. It is also thought to be easier to configure than ntpd.
Unfortunately, Chrony lags behind in driver, OS, and specification support.

systemd’s timesyncd is a network time daemon that implements an SNTP client, the Simple Network Time Protocol, defined in RFC 4330. SNTP is a simplified version of NTP that uses the same network packet format but a different approach to synchronization: it skips the full NTP complexity and focuses only on querying time and synchronizing the system clock.
Thus, there’s no hardware driver support in systemd’s timesyncd, but it’s very simple.

The advantage of having the time synced using a service manager is that it can be hooked to automatically start whenever the network is operational, whenever there’s connectivity.

The status of the clock can be requested using timedatectl status.

Example of output:

> timedatectl status
               Local time: Thu 2020-04-30 19:26:56 EEST
           Universal time: Thu 2020-04-30 16:26:56 UTC 
                 RTC time: Thu 2020-04-30 16:26:56     
                Time zone: Asia/Beirut (EEST, +0300)   
System clock synchronized: no                          
              NTP service: inactive                    
          RTC in local TZ: no     

The NTP synchronization status can also be checked using timedatectl timesync-status.

Example output:

> timedatectl timesync-status
       Server: (
Poll interval: 34min 8s (min: 32s; max 34min 8s)  
         Leap: normal                             
      Version: 4                                  
      Stratum: 3                                  
    Reference: A960804                            
    Precision: 1us (-25)                          
Root distance: 32.142ms (max: 5s)                 
       Offset: -9.034ms                           
        Delay: 49.905ms                           
       Jitter: 21.188ms                           
 Packet count: 418                                
    Frequency: -27.105ppm             

The client configuration lives in /etc/systemd/timesyncd.conf, with the format defined in the timesyncd.conf(5) manpage; the service itself is systemd-timesyncd.service.
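For illustration, a minimal timesyncd.conf sketch might look like this (the server names are placeholders, not a recommendation):

```
# /etc/systemd/timesyncd.conf sketch (server names are illustrative)
[Time]
NTP=0.pool.ntp.org 1.pool.ntp.org
FallbackNTP=time1.google.com
PollIntervalMinSec=32
PollIntervalMaxSec=2048
```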

> timedatectl show-timesync --all
PollIntervalMaxUSec=34min 8s
PollIntervalUSec=34min 8s
NTPMessage={ Leap=0, Version=4, Mode=4, Stratum=3, Precision=-25, RootDelay=62.911ms, RootDispersion=366us, Reference=A960804, OriginateTimestamp=Mon 2020-04-20 17:42:43 EEST, ReceiveTimestamp=Mon 2020-04-20 17:42:43 EEST, TransmitTimestamp=Mon 2020-04-20 17:42:43 EEST, DestinationTimestamp=Mon 2020-04-20 17:42:43 EEST, Ignored=no PacketCount=372, Jitter=15.678ms }

BusyBox also offers a compact built-in SNTP implementation. But beware that no driftfile is used.

clockspeed by D. J. Bernstein is an even simpler approach to SNTP that uses the TSC register to adjust the ticking speed.

Another approach, the Berkeley algorithm, works by polling the time from all the machines on a network and taking the average; all the machines then sync to this time.

Yet another interesting implementation is HTP, the HTTP Time Protocol. HTP relies on the Date header of HTTP, defined in the HTTP/1.1 RFC 2616. It uses statistical analysis to arrive at the most accurate time possible, so if you can access a webpage, you can sync time. This protocol is neither the most accurate nor the most secure, though.

That brings us to security and NTP. We know that anything on the network should be secure and trusted.

If, for example, an attacker carried out a man-in-the-middle attack, they would be able to change the time source for your machine. The security implication is that expired certificates and signatures that shouldn’t be trusted would be. It would also tamper with and mess up logs.

That’s one reason why browsers today show errors whenever the system clock is out of sync.

The NTP specifications have been around for so long that we’ve had plenty of years to find security issues and fix them, with the reference implementation as the testing ground. The codebase is constantly audited.
For example, the first versions of NTP were entirely clear text and thus had no protection against MITM attacks. The specifications later added authentication, checksums, and even encryption via symmetric and public/private keys in the latest addenda.

The IETF has moved to create an encryption overlay called Network Time Security (NTS). Cloudflare currently implements it, but not many others do.

Another project, NTPsec, forks the ntpd source and tries to remove complexity, clean up the code, and find vulnerabilities in it.

Audits are important; as we’ve said, Chrony came out as the most secure among multiple NTP implementations.

I quote:

A 2017 security audit of three NTP implementations, conducted on behalf of the Linux Foundation’s Core Infrastructure Initiative, by the security firm Cure53 suggested that both NTP reference implementation and NTPsec were more problematic than Chrony from a security standpoint.

On the other hand, there are other kinds of things to care about. We’ve discussed the polling issues and the shortage of public NTP servers in the pool, so it’s important for them to be able to withstand heavy loads. One class of attack abuses NTP for amplified denial of service (DDoS): the attacker sends a very small query that triggers a huge response aimed at the victim, similar to DNS amplification attacks. On the defensive side, servers can answer abusive clients with a Kiss-o’-Death packet, a rate-limiting reply telling them to back off.

Another security issue related to how much we rely on NTP as a public service, is about how some IoT devices have been found to hard-code the address of NTP servers. These kinds of assumptions are dangerous.

The last thing I want to address in this section is another protocol for syncing system time with an external time source called PTP, the Precision Time Protocol.

PTP is used when NTP doesn’t provide enough precision, for critical measurement and control systems such as financial transactions or mobile phone tower transmissions. It is specially crafted for the local network scenario, where a machine has a reference clock device connected to it.

PTP was originally defined by the IEEE in 2002 in the IEEE 1588-2002 standard, officially entitled “Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems”, then reviewed in IEEE 1588-2008, PTPv2, and again reviewed in 2019 in IEEE 1588-2019.

PTP is similar to NTP in that it synchronizes time between machines, but the difference is that it adds accurate network latency information using hardware timestamping.
Hardware-assisted timestamping is done at the lowest levels of the network stack, such as the MAC layer or the Ethernet transceiver, right before the packet is sent. Clocks can also be embedded in network equipment: as a message traverses switches and routers, they update the timestamp accurately. The kernel helps too, with features such as the SO_TIMESTAMPING socket option for packet timestamping.
So the timestamp offset is accurate and the delay precise and predictable, with low latency; that’s why PTP can achieve sub-microsecond accuracy.

PTP runs over UDP on ports 319 and 320, which carry event and general messages respectively; it can also be carried directly over Ethernet.

It uses the same epoch as Unix time; however, while Unix time is based on UTC and is subject to leap seconds, PTP is based on International Atomic Time (TAI).
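As a sketch, converting a PTP (TAI) timestamp to UTC is just a fixed offset equal to the accumulated leap seconds; 37 seconds has been the offset since the start of 2017, and the constant must be bumped whenever a new leap second is announced:

```python
# Sketch: TAI runs ahead of UTC by the accumulated leap seconds.
# 37 s has been the offset since 2017-01-01; this constant must be
# updated when a new leap second is announced.

TAI_UTC_OFFSET = 37  # seconds, valid since 2017-01-01

def ptp_tai_to_utc(tai_seconds: float) -> float:
    return tai_seconds - TAI_UTC_OFFSET
```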

The source of time in PTP is the grandmaster clock, typically an ordinary (single-port) clock with a timekeeping device such as a GPS receiver attached; it distributes time to boundary clocks, which in turn serve others. It is an architecture that integrates with the network segmentation.

Within a group, the system automatically decides which clock to elect as the master, the clock deemed the most accurate, via the best master clock algorithm.

Let’s end by saying that when push comes to shove, you can always buy a machine that comes with everything integrated instead of setting it up yourself.

In the next section we’ll see systems that rely on precise time in special ways.

What depends on time

  • Consequences of bad system clock
  • Distributed systems
  • RTOS and scheduling

We’ve hinted at and mentioned some examples of what could happen if system time misbehaves. What else could happen?

We rely on system time for anything related to communication with other humans: it creates the context of what is happening. It is also useful for sampling and data collection for statistical analysis.

With an out-of-whack system time, logs will be out of synchronization and events will be hard to debug; it becomes hard to correlate them with real life.
Database queries that rely on now() to get the current date and time will write wrong values.
Many backup scripts will misbehave because they don’t expect time to move backward or jump around.
Similarly, cron jobs that you expected to start at a specific time may start at an inappropriate one.
The make utility also relies on timestamps to know which files need to be recompiled; mishaps in system time may lead it to recompile the same files over and over.
Many concepts in security, such as certificate verification, one-time codes, and authentication protocols, rely on system time being synchronized.

Apart from the human perception issues, there are the countless overflows and rollovers we’ve mentioned, all of them caused by a data structure too small to store what is needed, leading to either a rollover or unpredictable behavior on overflow.

Then there are the problems related to time zones and daylight saving events. These can strike during the transition, when computers repeat an event for one extra hour or run it an hour short of the expected duration. This is tricky when those computers drive machinery or medical devices: it could harm lives or drive businesses into the ground.
They can also lead to miscommunication between places that are in different time zones or don’t apply the same daylight saving time.
Programs that don’t anticipate changes, such as email clients and calendars, may need upgrades.

To avoid most of these issues, it’s better not to use time zones and DST in such cases but to rely on UTC, leaving civil time for display only. But as you know, there’s still the issue of leap seconds.

Some domains need more synchronization than others, let’s discuss distributed systems and real-time operating systems as examples.

A distributed computer system consists of multiple software pieces running on different computers but that still try to act as a single system.

In a distributed system, it is important to keep clock synchronization, to ensure all the computers have the same notion of time so that the execution runs on the same timeline.
What we truly care about is event ordering, to know what happens before what, cause and consequences.

In general there are two ways to keep things in order: you either use wall clocks, which are all the timekeeping devices we’ve mentioned so far, or you use logical clocks, which are simple monotonic counters all moving in a single direction.
Examples of logical clocks include Lamport clocks and vector clocks.
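As a sketch of the idea, a Lamport clock is just a counter that is bumped on every local event, attached to outgoing messages, and merged (max + 1) on receipt, which yields a partial ordering of events without any wall clock (the class name is mine):

```python
# Sketch of a Lamport clock: a per-process counter that orders events
# causally. No wall clock is involved, so drift is irrelevant.

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        """A local event happened."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Timestamp to attach to an outgoing message."""
        return self.tick()

    def receive(self, msg_time: int) -> int:
        """Merge the sender's timestamp: take the max, then advance."""
        self.time = max(self.time, msg_time) + 1
        return self.time
```

If process A sends a message at Lamport time 2, the receiving process jumps ahead to at least 3, so every effect is timestamped after its cause.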

The complication in distributed systems is that system clocks on different machines will eventually drift, making it hard to keep a strict ordering. That’s not what’s expected of a consistent monotonic clock.
That is why logical clocks are favored: they resolve the chain of events and conflicts.

However, logical clocks are not always an option. Another approach is a central coordinator that timestamps every message passing through it, but that adds a bottleneck to the architecture and limits the availability of the system.
Yet again, you could go back to wall clocks but rely on atomic time along with an NTP server; it’s not a perfect solution, but it avoids time zones, leap seconds, and daylight saving time.

In distributed systems, virtual machines are often used, so a way to keep them in sync is to sync them with their host.

The other domain to explore is real-time operating systems, systems that are mission critical and require accuracy.

Real-time OSs, aka RTOSs, are similar to general-purpose OSs (GPOSs) in that they are responsible for managing computer resources and hosting applications, but they differ in that they are built for precise timing and high reliability. They are especially useful in machinery environments and embedded systems.

RTOSs are deterministic by design: they are made to meet deadlines associated with external events, to be responsive.
Their jitter, the measure of error in the timing of subsequent tasks, should be extremely low; tasks should execute with almost no delay, or at least a predictable one.
If a system doesn’t meet the maximum time allocated to perform a critical operation, if it doesn’t fulfill the timing constraint to act on an event, it isn’t considered real-time.

Depending on the type of events it can guarantee, real-time OSs are separated into two categories. If it must be truly deterministic and failing to adhere means a system failure, we categorize it as a hard real-time OS. If the system can only guarantee a certain set of events to happen in real time, and not adhering won’t lead to catastrophic consequences, we categorize it as a soft real-time OS.

So programs running on a real-time OS should run with consistent timing, but it doesn’t stop there: programmers should have full control over how tasks are prioritized and be able to check for deadlines. They should be able to dictate how scheduling takes place and expect it to be directly reflected.

The priority given to tasks is a parameter of the scheduling algorithm the OS uses to choose which task runs next. Scheduling is driven by interrupts; the time to handle the interrupt routine is variable in a general-purpose OS, but in a real-time OS its latency is bounded.
This is the same timer interrupt handler we’ve seen before, the one that ticks at specific intervals and allows the OS to schedule tasks, increment system time, call timer routines, do periodic tasks, etc. This is where the constraint is applied.

The scheduling is strict: no low-priority task can run before a high-priority one. Real-time OSs use rate-monotonic, earliest-deadline-first, and other preemptive scheduling algorithms, while general-purpose OSs use completely fair scheduling, round robin, and similar fairness-oriented algorithms.
Preemptive scheduling differs from cooperative scheduling in that with cooperative scheduling we trust the task to explicitly relinquish control once it’s done, while with preemptive scheduling the OS can forcibly suspend a task and replace it with another. This is what allows an RTOS to respond quickly to real-time events.
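To make one of those policies concrete, here is a toy sketch of earliest-deadline-first selection (the names are mine, not any RTOS API):

```python
# Toy sketch of earliest-deadline-first (EDF) selection: at every
# scheduling point, the ready task with the nearest absolute deadline
# runs, even if another task was already in progress (preemption).

def edf_pick(ready_tasks: list[tuple[str, float]]) -> str:
    """ready_tasks: (name, absolute_deadline) pairs; returns the task to run."""
    name, _deadline = min(ready_tasks, key=lambda t: t[1])
    return name
```

A real EDF scheduler would also track execution budgets and reject task sets whose total utilization exceeds capacity; this only shows the selection rule.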

There are many examples of real-time OSs, such as VxWorks, QNX, RTLinux, FreeRTOS, RIOT OS, and more. Some are open source, some are POSIX compliant, some are hard real-time, some are soft real-time.

On Linux specifically, to enable real-time behavior you have to apply the PREEMPT_RT patch. Though it’s arguably soft real-time, as it’s not mathematically provable to be real-time.

Another real-time Linux related project is the ELISA project, the Enabling Linux in Safety Application project, which aims at creating a standard to build mission critical Linux systems.

I quote:

Founding members of ELISA include Arm, BMW Car IT GmbH, KUKA, Linutronix, and Toyota. To be trusted, safety-critical systems must meet functional safety objectives for the overall safety of the system, including how it responds to actions such as user errors, hardware failures, and environmental changes. Companies must demonstrate that their software meets strict demands for reliability, quality assurance, risk management, development process, and documentation. Because there is no clear method for certifying Linux, it can be difficult for a company to demonstrate that their Linux-based system meets these safety objectives.

That just shows how much time in computers affects real humans lives.

Human perception of time - Conclusion

Machines may need accurate time, but for humans time is definitely subjective. We feel time as our emotions are swayed by life. This certainly matters.

We possess clocks within us, biological clocks in which sun time and memories play an integral part.

For example in emergencies and dangerous situations, time seems to slow down. And as we grow older we perceive time to move faster. It’s interesting to see time from a human perspective.

We live in such an interconnected world that greeting people in the morning on the internet has led to the Universal Greeting Time. Time in this interconnected world has also led to marketing campaigns and funny new standards such as the Beats.

I’ve written in depth about “internet time perception” in this other article.

I hope this colossal article has cleared up the topic of time, and time on Unix. It may not be concise or scientifically exact in every place, but it should still convey a good overview to readers. Please reach out to me with corrections or comments.




Jan van den Berg (j11g)

Volume 1: From Savoy Stompers to Clock Rockers – Andrew Hickey May 01, 2020 03:35 PM

One of my favorite podcasts is “A History of Rock Music in 500 Songs”. I’ve written about it before, it’s an absolutely terrific podcast.

But this post is not about the podcast but about the book!

After the first 50 episodes creator Andrew Hickey bundled the adapted episode transcripts into the first volume of a book series. And, of course, I had to get it, as an unmissable reference and to support the podcast.

Volume 1: From Savoy Stompers to Clock Rockers – Andrew Hickey (2019) – 551 pages

Here are some thoughts on the book’s look and feel as it arrived in the mail this morning. So this is not a book review!

  • I’ll start with what I don’t like (and what can’t be the author’s fault at all). This book is printed on demand, and your mileage may vary, but on my particular copy the cover has been cut off prematurely. So the letter “c” from the word “Music” is right on the edge of the cover. It bothers me a bit, and it’s a shame that such a wonderful book has to suffer this fate.
  • It’s quite a meaty book (I like that!). But I ordered the paperback and the postal service wasn’t too careful with it, so there are already some dents on the book. So you might want to get the hardcover.
This is a shame.
  • I was FULLY expecting the spine to have “Volume 1” or at least a number on it, but that is not the case. I say this because I intend to buy every volume and imagined the series, identifiable by their consecutive numbers, would look majestically encyclopedic on my bookshelf.
The spine (and flappy cover)
  • I love the black and white cover. It’s classy and timeless.
  • As stated, it is a meaty book. I love holding it, it has a very nice feel to it. And the paper is pleasant, not too bright or hard.
  • For a reference book the font is well chosen. I believe it’s Chord Symbol, which is fitting when you think about it. But more so, this font makes it easy to quickly skim and scan parts, which makes sense for a reference book (my intended use).
  • The “Contents” (chapters) section only has the song titles, not the artists. I can think of a few reasons: especially in the early days, some songs were often done by multiple people (even at the same time). And after all it is a podcast about SONGS. But still, the podcast does have artist names. So I don’t quite understand this distinction.
  • The chapters also have no numbers. Which is not a problem. But it seems the reference / link to the podcast has (deliberately?) been cut. The chapters seem to have no link to the podcast episodes.
  • The absolute best parts of this book are the song index and the regular index. These are indispensable. I absolutely love them and they will often be my starting point when I want to look up something. They are very well done and look exhaustive.
  • The page numbers are on the top of the page on the outside. Which is how I like it, this makes thumbing back and forth to the index easy.
  • I thought I couldn’t love Andrew Hickey’s work more than I already did, but then I read his acknowledgement to Donald Knuth! I cannot state how much I adore this. (Knuth holds a special place in my heart, and I even host a podcast RSS feed for a couple of his lectures).


My wish for this podcast is that it will become so famous that Andrew Hickey will get a regular book deal, and the nuisances that come with print-on-demand will become a thing of the past. Nonetheless, this book is already a spectacular body of work by someone truly passionate and gifted, and a book that will look good on any bookshelf.

I love that this fantastic podcast is now available in a format that can be picked up a hundred years from now and still be instantly accessible. Go buy it!

The post Volume 1: From Savoy Stompers to Clock Rockers – Andrew Hickey appeared first on Jan van den Berg.

April 29, 2020

Derek Jones (derek-jones)

Beta release of data analysis chapters: Evidence-based software engineering April 29, 2020 02:58 AM

When I started my evidence-based software engineering book, nobody had written a data analysis book for software developers, so I had to write one (in fact, a book on this topic has yet to be written). When I say “I had to write one”, what I mean is that the 200 pages in the second half of my evidence-based software engineering book contain a concentrated form of such a book.

These 200 pages are now on beta release (186 pages, if the bibliography is excluded): chapters 8 to 15 of the draft pdf. Originally I was going to wait until all the material was ready before making a beta release; the Coronavirus changed my plans.

Here is your chance to learn a new skill during the lockdown (yes, these are starting to end; my schedule has not changed, I’m just moving with the times).

All the code+data is available for you to try out any ideas you might have.

The software engineering material, the first half of the book, is also part of the current draft pdf, and the polished form should be available on beta release in about 6 weeks.

If you have a comment or find a problem, either email me or raise an issue on the book’s Github page.

Yes, a few figures and tables still bump into each other. I’m loath to do very fine-tuning because things will shuffle around a bit with minor changes to the words.

I’m thinking of running some online sessions around each chapter. Watch this space for information.

April 26, 2020

Ponylang (SeanTAllen)

Last Week in Pony - April 26, 2020 April 26, 2020 03:00 PM

Nightly builds of ponyc for FreeBSD 12.1 are available. pony-semver has moved into the ponylang organization. The past sync meeting discussed syntax changes for the call site of behaviours.

Gonçalo Valério (dethos)

Security.txt April 26, 2020 12:11 AM

Some days ago, while scrolling my Mastodon feed (for those who don’t know it: it’s like Twitter, but instead of being a single website, the whole network is composed of many different entities that interact with each other), I found the following message:

To server admins:

It is a good practice to provide contact details, so others can contact you in case of security vulnerabilities or questions regarding your privacy policy.

One upcoming but already widespread format is the security.txt file at https://your-server/.well-known/security.txt.

See the draft RFC and the project website for details.

It caught my attention because my personal domain didn’t have one at the time. I’ve added it to other projects in the past, but do I need one for a personal domain?

After some thought, I couldn’t find any reason why I shouldn’t add one in this particular case. So as you might already have guessed, this post is about the steps I took to add it to my domain.

What is it?

A small text file, just like robots.txt, placed in a well known location, containing details about procedures, contacts and other key information required for security professionals to properly disclose their findings.

Or in other words: Contact details in a text file.

security.txt isn’t an official standard yet (it’s still a draft), but it addresses a common issue that security researchers encounter in their day-to-day activity: sometimes it’s harder to report a problem than it is to find it. I always remember the case of a Portuguese citizen who spent ~5 months trying to contact someone who could fix some serious vulnerabilities in a governmental website.

Even though it isn’t an accepted standard yet, it’s already being used in the wild:

Need more examples? A quick search finds them for you, or you can read a small analysis of the current status across Alexa’s top 1000 websites.


So, to help the cause, I added one for this domain. It can be found at /.well-known/security.txt.

Below are the steps I took:

  1. Go to the generator website and fill in the required fields of the form.
  2. Fill the extra fields if they apply.
  3. Generate the text document.
  4. Sign the content using your PGP key
    gpg --clear-sign security.txt
  5. Publish the signed file on your domain under https://<domain>/.well-known/security.txt
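
For reference, the signed file ends up looking roughly like this (the contact address and key URL below are placeholders, not the real values from my domain):

```text
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Contact: mailto:security@example.com
Encryption: https://example.com/pgp-key.txt
Preferred-Languages: en
-----BEGIN PGP SIGNATURE-----
...
-----END PGP SIGNATURE-----
```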

As you can see, this is a very low-effort task, and it can generate very high returns if it leads to the disclosure of a serious vulnerability that would otherwise have gone unreported.

April 25, 2020

Simon Zelazny (pzel)

Newtype-like tagged tuples in Elixir April 25, 2020 10:00 PM

A thought (and code) experiment in type wrappers

Some time ago I stumbled upon a hacky way of generating something akin to Haskell's newtype declarations in Elixir. Here it is: defopaque.

A Motivating Example: Units of Measure

I'm going to assume you agree with the maxim that making illegal states unrepresentable is a desirable feature of software systems.

Let's say a part of your system deals with weights. It would be nice to ensure that programmers don't use plain numbers when dealing with units of measure. The consequences of doing so have been bad in the past. Let's define a Weight module with a kg constructor, wrapping any number. This will represent our unit of weight.

defmodule Weight do
  use Defopaque
  defopen(:kg, number())
end

Our intention here is to create a lightweight wrapper type whose role is primarily to document the meaning of the variable, and also to prevent accidental use of weight-related functions with 'plain' numbers.

The macro defopen gives us:

1) A kg() type, exported from Weight.

2) A kg(n) macro, which will generate a tuple containing the number n as its second element. The first element of the tuple will be an autogenerated atom (guaranteed to be stable for every {wrapper-atom, wrapped-subtype} pair). We can use this macro to generate new kg values and to pattern-match on existing values.
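
The tagging idea can be sketched in Python (a loose analogy I wrote for illustration, not Defopaque’s actual implementation; all names here are mine):

```python
import hashlib

def make_wrapper(module: str, tag: str):
    """Newtype-style wrapper: the tag is a stable hash of
    (module, tag), so code that never references the defining
    module cannot plausibly forge it."""
    digest = hashlib.sha1(f"{module}:{tag}".encode()).hexdigest()[:20]
    full_tag = f"{digest}-{tag}"

    def wrap(value):
        # Tag the value; analogous to the kg(n) constructor macro.
        return (full_tag, value)

    def unwrap(wrapped):
        # Pattern-match on the tag; raises on foreign values.
        got, value = wrapped
        if got != full_tag:
            raise ValueError(f"not a {tag} value")
        return value

    return wrap, unwrap

kg, un_kg = make_wrapper("Weight", "kg")
print(un_kg(kg(12)))  # 12
```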

Let's see some examples of the kg unit in use:

defmodule MyApp do
  import Weight

  @spec tell_weight(Weight.kg()) :: String.t()
  def tell_weight(w) do
    case w do
      kg(12) -> "twelve kilograms"
      kg(other) -> "#{other}kg"
      _ -> "invalid unit"
    end
  end
end

iex()> import Weight
iex()> MyApp.tell_weight(kg(12))
"twelve kilograms"
iex()> MyApp.tell_weight(kg(42))
"42kg"
iex()> MyApp.tell_weight(12)
"invalid unit"

You can also use pattern-match syntax to match on values inside the constructor:

defmodule Matches do
  import Weight

  def count(want_value) do
    weights = [kg(1), kg(3.9), kg(5.3), kg(10.1)]
    Enum.count(
      weights,
      fn kg(^want_value) -> true
         kg(_) -> false
      end
    )
  end
end

iex(19)> Matches.count(3.90)
1
iex(20)> Matches.count(3.91)
0

You can even pattern match in function heads:

defmodule Conversion do
  import Weight
  def kg_to_lb(kg(n)), do: n * 2.204623
end

iex()> import Weight
iex(6)> Conversion.kg_to_lb(kg(2))
4.409246
iex(7)> Conversion.kg_to_lb(kg(0.5))
1.1023115

But note! Now our function can only be called once:

Conversion.kg_to_lb(kg(0.5)) |> Conversion.kg_to_lb()
** (FunctionClauseError) no function clause matching in Conversion.kg_to_lb/1

    The following arguments were given to Conversion.kg_to_lb/1:

        # 1
        1.1023115

    iex:5: Conversion.kg_to_lb/1

This is precisely the behavior we wanted at the very beginning. Only a kg unit can be converted, not a plain number.

If we want to expand our system to deal with pounds as a first-class citizen, we are free to do so. It's very cheap to generate a new wrapper, and we can do it in the same module:

defmodule Weight do
  use Defopaque
  defopen(:kg, number())
  defopen(:lb, number())
end

defmodule Conversion do
  import Weight
  def kg_to_lb(kg(n)), do: lb(n * 2.204623)
  def lb_to_kg(lb(n)), do: kg(n * 0.4535924)
end

This will work as expected, only allowing for conversions in the right direction:

import Weight
iex()> kg(res) = Conversion.kg_to_lb(kg(1)) |> Conversion.lb_to_kg()
{:"f7ce213d444ac5216656-kg", 1.0000002376652002}
iex()> res
1.0000002376652002
iex()> kg(res) = Conversion.kg_to_lb(kg(1)) |> Conversion.kg_to_lb()
** (FunctionClauseError) no function clause matching in Conversion.kg_to_lb/1

    The following arguments were given to Conversion.kg_to_lb/1:

        # 1
        {:"be46b1adbd0d445032d6-lb", 2.204623}

    iex:25: Conversion.kg_to_lb/1

As you can see, we got a peek at how our wrapper tagging is implemented. It's not exactly pretty, but it prevents other modules from creating wrapped types without importing the module where these types are defined.

Why not Tagged Tuples?

Tagged tuples are the traditional way of handling this kind of problem, and there's nothing wrong with them. However, they don't prevent 'unauthorized' use of our types. For example, anyone can create the tuple {:kg, 2}.

So there's no reason why some module that should know nothing about weights couldn't match on our tagged tuple:

def an_allegedly_weight_agnostic_function({:kg, weight}) do (...)

With this method, code that does not reference our Weight module cannot in good faith create our tagging atom, since it's more-or-less gibberish.

Also, unique and opaque tags mean that dialyzer can be much more strict when checking our code.

Why not structs?

Structs are Elixir's killer feature and they are great for modeling composite data types. This project came out of an attempt to golf opaque structs, defined as internally nested modules with a single field. That's what I recommend doing in real production projects!

Why the name defopaque?

It's because the original intent behind this hack was to provide a quick-n-dirty way to define @opaque newtypes in a codebase. Later on I figured it would also be nice to provide non-opaque, destructurable newtypes. Hence two macros:

  1. defopaque -- Creates a wrapper and defines the resulting wrapped type as @opaque. The generated constructor and pattern-match macro can only be used in the module where the opaque type is defined.

  2. defopen -- Creates a wrapper and defines the resulting wrapped type as @type. The generated constructor can be used outside the module where it was defined.

(Edited on 2020-04-27: module name, truncated sentence!)

Jeremy Morgan (JeremyMorgan)

9 Courses You Can Take to Become a JavaScript Wizard April 25, 2020 09:19 PM

There are tons of front end frameworks to choose from, and getting good with them is no small task. But sharpening your core JavaScript skills can make you better at all front-end frameworks. By thoroughly understanding JavaScript at its core, you will write better programs, faster, with less struggle. If you aren’t sure where you stand, you can take a JavaScript Skills Test to find out! Here are nine great courses to help you become a JavaScript wizard.

April 24, 2020

Andreas Zwinkau (qznc)

Accidentally Turing-Complete April 24, 2020 12:00 AM

A list of things that were not supposed to be Turing-complete, but are.

Read full article!

April 22, 2020

Caius Durling (caius)

RSpec Given/When/Then with symbols April 22, 2020 05:30 PM

Having a need to write some BDD-esque tests without needing to put them in front of non-technical people, I was recently playing around with RSpec feature specs. Where I’ve used these previously, we’ve eventually run into curation issues where the specs are outdated, brittle, and require so much maintenance that we’ve generally ended up lobbing Cucumber into the project as a stopgap.

This is due to ending up with feature specs like the following, which lead you to having to parse the code mentally to work out what it’s testing:

RSpec.feature "Admin: Posts" do
  scenario "Authoring a post" do
    @user = create :user, :admin
    login_as @user

    visit new_admin_post_path
    fill_in "Title", with: "RSpec feature specs"
    fill_in "Body", with: "Some piffle about feature specs"
    click_on "Publish!"

    visit root_url
    expect(page).to have_content("RSpec feature specs")
  end
end

After some reading around, I eventually stumbled back across this idea from Future Learn where they lay out the above test by splitting it into private methods within the feature block, but leaving it more readable to future readers. I then found Made Tech’s take on this same idea, and riffing off the both of them ended up with the following instead:

RSpec.feature "Admin: Posts" do
  scenario "Authoring a post" do
    given_i_am_logged_in_as_an_admin
    when_i_publish_a_new_post
    then_i_see_the_post_on_the_homepage
  end

  def given_i_am_logged_in_as_an_admin
    @user = create :user, :admin
    login_as @user
  end

  def when_i_publish_a_new_post
    visit new_admin_post_path
    fill_in "Title", with: "RSpec feature specs"
    fill_in "Body", with: "Some piffle about feature specs"
    click_on "Publish!"
  end

  def then_i_see_the_post_on_the_homepage
    visit root_url
    expect(page).to have_content("RSpec feature specs")
  end
end

Now this is fine, but writing lots_of_names_with_underscores_in_is_a_trifle irritating. I remembered Jim Weirich1 showing off rspec-given at a conference a few years ago, and wondered if that would solve my problem here of wanting the runtime to warn me when my methods are misspelled or missing, without having_to_underscore_them.

Now rspec-given would let me do that, but I’d have to switch from calling them all in turn inside a scenario block to calling them inside context blocks and passing blocks to each of the Given, When, etc methods. I think it would be something like (warning, untested)

RSpec.feature "Admin: Posts" do
  Given { @user = create :user, :admin }
  Given { login_as @user }

  context "authoring a post" do
    When { visit new_admin_post_path }
    When { fill_in "Title", with: "RSpec feature specs" }
    When { fill_in "Body", with: "Some piffle about feature specs" }
    When { click_on "Publish!" }

    Then { visit root_url }
    And  { expect(page).to have_content("RSpec feature specs") }
  end
end

Now this didn’t quite fit with what I wanted. However, I did wonder if it was possible to go down the route of having a Given method that takes a token to identify the code it should call. (A method if you will.) It’s possible in ruby to call a method starting with a Capital letter, but convention dictates those are usually class/module names (constants) rather than methods.

A little bit of hacking later and this is what I ended up getting working:

RSpec.feature "Admin: Posts" do
  scenario "Authoring a post" do
    Given :"I am logged in as an admin"
    When :"I publish a new post"
    Then :"I see the post on the homepage"
  end

  def_Given :"I am logged in as an admin" do
    @user = create :user, :admin
    login_as @user
  end

  def_When :"I publish a new post" do
    visit new_admin_post_path
    fill_in "Title", with: "RSpec feature specs"
    fill_in "Body", with: "Some piffle about feature specs"
    click_on "Publish!"
  end

  def_Then :"I see the post on the homepage" do
    visit root_url
    expect(page).to have_content("RSpec feature specs")
  end
end

Now there are two extra things that make this easier for me to write than underscored methods. Ruby doesn’t only allow :foo as a symbol; it also allows :"foo bar". You can then define a method based on that, even though it has spaces in the method name.

My text editor2 also autocompletes ruby symbols from partial matches, which makes it easy to write out what I want in the scenario, run the spec and find out what methods need defining, then define the methods using autocomplete to save copy/pasting everything.

By using actual methods for these, we get a couple of other happy accidents along the way. Most Ruby installs now include did_you_mean out of the box, which suggests methods similar to the one you called if your call results in a NoMethodError. This works quite nicely; you end up with something like

undefined method `When I pblish a new post' for #<RSpec::ExampleGroups::AdminPosts:0x00007faf1f9fc4c0>
Did you mean? When I publish a new post
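
The same “did you mean” behaviour falls out of any registry of steps keyed by free-text names. A Python sketch of the idea (an analogy I wrote, not the post’s actual helper; names are mine):

```python
import difflib

class Steps:
    """Toy Given/When/Then step registry keyed by free-text names,
    with a did_you_mean-style hint on typos."""

    def __init__(self):
        self._steps = {}

    def define(self, name, fn):
        self._steps[name] = fn

    def run(self, name):
        if name not in self._steps:
            # Suggest the closest known step name, like Ruby's did_you_mean.
            hint = difflib.get_close_matches(name, self._steps, n=1)
            suffix = f" Did you mean? {hint[0]}" if hint else ""
            raise NameError(f"undefined step {name!r}.{suffix}")
        return self._steps[name]()

steps = Steps()
steps.define("I publish a new post", lambda: "published")
try:
    steps.run("I pblish a new post")
except NameError as e:
    print(e)  # undefined step 'I pblish a new post'. Did you mean? I publish a new post
```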

And then if you just run it without implementing any of the helper methods at all, you get a nice NoMethodError telling you exactly what you need to implement:

undefined method `Given I am logged in as an admin' for #<RSpec::ExampleGroups::AdminPosts:0x00007fbd06598498>

The magic that makes all this work is in spec/support/given_when_then.rb, which is not terrible, but also probably not a great idea. 🙃

  1. 😿 ↩︎

  2. TextMate 2 ↩︎

April 21, 2020

Pete Corey (petecorey)

Guitar Chord Voicings with Prolog April 21, 2020 12:00 AM

Generating guitar chords is my thing. Over the years I’ve written thousands of lines of code and even more words all dedicated to the process of algorithmically generating and recommending guitar chords. In the spirit of generation, let’s throw another language and a few hundred additional words onto that stack!

Prolog is a logic-based programming language that, as I’m learning (disclaimer: I’m very new to Prolog), excels at representing logical relationships between data. The guitar fretboard is a never-ending landscape of interesting relationships ripe for exploration, so Prolog seems like a valuable tool to have in our arsenal as fretboard explorers.

Let’s see how we can use it.

The Magic of Prolog

One of the most mind blowing aspects of Prolog, from the perspective of someone new to the language, is the fluidity and ambiguity of inputs and outputs to “predicates” (think of them as functions).

For example, we can ask the built-in member/2 predicate if 1 is a member of the list [1, 2, 3]:

member(1, [1, 2, 3]).

And Prolog will tell us that yes, 1 is a member of [1, 2, 3]:

true.
We can also bind the first argument of our call to member/2 to a variable, and Prolog will happily report all possible values of that variable for which the predicate holds true:

member(X, [1, 2, 3]).

X = 1 ;
X = 2 ;
X = 3.

When the second argument of member/2 is [1, 2, 3], the first argument can either be 1, 2, or 3.

But we can take things even further. We can bind the second argument of our call to the member/2 predicate to a variable and ask Prolog for all of the lists that contain our first argument, 1:

member(1, X).

X = [1|_5982] ;
X = [_5980, 1|_5988] ;
X = [_5980, _5986, 1|_5994] ;
X = [_5980, _5986, _5992, 1|_6000] ...

This implementation of the Prolog runtime (SWI-Prolog 8.0.3) represents unbound variables with leading underscores. So the first possible value of X is 1 prepended to any other list. Another possible value of X is some value prepended to 1, prepended to any other list. And so on, forever.

The member/2 predicate simply defines the relationship between its two arguments. If one of those arguments is omitted, it can be recovered by applying or reversing that relationship appropriately.

Is your mind blown yet?

Chordal Relationships

Let’s write a predicate that accepts a few arguments: a description of our guitar’s fretboard in terms of tuning and number of frets, the quality of the chord we’re looking for, and the notes of a specific chord voicing given as string/fret tuples. Our predicate will either confirm or deny that the notes given live within the bounds of the fretboard and accurately depict the desired chord quality.

For example, on a normal guitar tuned to standard tuning, we could ask if fret 3 played on string 1 (starting from the lowest string), fret 2 played on string 2 and the open fret played on string 3 constitute a C major ([0, 4, 7]) chord voicing:

voicing([[0,40], [1,45], [2,50], [3,55], [4,59], [5,64]],
        18,
        [0, 4, 7],
        [[1, 3], [2, 2], [3, 0]]).

And the answer is yes, they do:

true.
If we assume that both our Tuning array and the final Voicing array are sorted in terms of string number we can build our predicate with a simple walk across the strings, analyzing each note in the chord along the way.

For every string on the fretboard, we first check that a note in our Voicing lives on that String. If it does, we need to make sure that the Fret being played on that String is between 0 and the number of Frets on the fretboard (a job for between/3). Next, we calculate the Pitch of the fretted note and verify that it’s a member of the chord Quality we’re checking for. Lastly, we remove that pitch from the set of qualities, and recurse to check the rest of the strings and the remaining notes in our chord voicing:

voicing([[String,Open]|Tuning], Frets, Quality, [[String,Fret]|Voicing]) :-
  between(0, Frets, Fret),
  Pitch is (Open + Fret) mod 12,
  member(Pitch, Quality),
  subtract(Quality, [Pitch], RemainingQuality),
  voicing(Tuning, Frets, RemainingQuality, Voicing).

If a string isn’t being played as part of the given chord voicing, we can simply move on to check the next string on the fretboard:

voicing([_|Tuning], Frets, Quality, Voicing) :-
  voicing(Tuning, Frets, Quality, Voicing).

Eventually, we’ll run out of strings to check. In that case, if the remaining set of notes in the chord voicing and the remaining set of pitches in our chord quality are both empty, we can say with confidence that the given set of notes is a valid voicing of the specified chord quality:

voicing([], _, [], []).

If we run out of strings and we’re still looking for either notes in the voicing, or pitches in the quality, we know that something has gone wrong, and the chord we’re looking at isn’t a valid voicing.

Altogether, our complete voicing/4 predicate looks like this:

voicing([], _, [], []).

voicing([_|Tuning], Frets, Quality, Voicing) :-
  voicing(Tuning, Frets, Quality, Voicing).

voicing([[String,Open]|Tuning], Frets, Quality, [[String,Fret]|Voicing]) :-
  between(0, Frets, Fret),
  Pitch is (Open + Fret) mod 12,
  member(Pitch, Quality),
  subtract(Quality, [Pitch], RemainingQuality),
  voicing(Tuning, Frets, RemainingQuality, Voicing).

We can write a helper predicate that assumes an eighteen fret guitar in standard tuning:

voicing(Quality, Voicing) :-
  voicing([[0,40], [1,45], [2,50], [3,55], [4,59], [5,64]],
          18, Quality, Voicing).

We can use our new voicing/4 or voicing/2 predicates to ask whether a certain set of notes played on the fretboard are a valid C major voicing:

voicing([0, 4, 7], [[1, 3], [2, 2], [3, 0]]).

And Prolog happily tells us that it is a valid voicing!

true.

Reversing the Relationship

We’ve seen that we can use our voicing/4 or voicing/2 predicate to check if a given set of notes on the fretboard are a valid voicing for a given chord quality. For example, we can ask if the notes [[1, 5], [2, 5], [4, 4], [5, 6]] represent a G7 ([5, 9, 0, 3]) chord voicing, and our Prolog program will confirm that they do.

But what else can we do? We were promised exploration!

Our voicing/4 implementation didn’t explicitly lay out the steps for constructing a chord voicing of a given quality, but it did define the relationships between a fretboard configuration, the quality of the chord we’re looking for, and the notes in a given chord voicing. Just like we reversed the relationships in member/2 to construct all possible lists containing 1, we can reverse the relationships defined in voicing/4 and find all possible voicings of a given chord quality!

All we have to do is leave the Voicing argument unbound when we call our voicing/2 predicate, and Prolog will reverse the relationship and spit out every possible voicing of our G7 chord spread across our fretboard:

voicing([5, 9, 0, 3], Voicing).

Voicing = [[2, 1], [3, 2], [4, 1], [5, 1]] ;
Voicing = [[2, 1], [3, 2], [4, 1], [5, 13]] ;
Voicing = [[2, 1], [3, 2], [4, 6], [5, 8]] ...

Awesome! This is basically the heart of Glorious Voice Leader compressed into ten lines of code.
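
For intuition about what Prolog is doing, here is the same relationship brute-forced imperatively in Python (a sketch I wrote assuming the same tuning encoding; it enumerates rather than reverses relationships, so it has none of Prolog’s elegance):

```python
from itertools import combinations, product

# [string, open-string pitch] pairs, as in the Prolog tuning list.
TUNING = [(0, 40), (1, 45), (2, 50), (3, 55), (4, 59), (5, 64)]
FRETS = 18

def is_voicing(quality, notes, tuning=TUNING, frets=FRETS):
    """Check that notes ([(string, fret), ...]) cover each pitch
    class in quality exactly once, mirroring voicing/4."""
    opens = dict(tuning)
    remaining = list(quality)
    for string, fret in notes:
        if string not in opens or not 0 <= fret <= frets:
            return False
        pitch = (opens[string] + fret) % 12
        if pitch not in remaining:
            return False
        remaining.remove(pitch)
    return remaining == []

def voicings(quality, tuning=TUNING, frets=FRETS):
    """Enumerate voicings by trying every choice of strings and
    every fretting on them -- brute force, unlike Prolog's search."""
    for strings in combinations([s for s, _ in tuning], len(quality)):
        for fretting in product(range(frets + 1), repeat=len(strings)):
            notes = list(zip(strings, fretting))
            if is_voicing(quality, notes, tuning, frets):
                yield notes

print(is_voicing([0, 4, 7], [(1, 3), (2, 2), (3, 0)]))  # True
```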

Future Work

We should be able to dig deeper into these relationships. In theory, we should be able to leave the Quality off of our call to voicing/2 and Prolog should tell us all of the possible qualities a given set of notes could be interpreted as.

Similarly, we should be able to leave the Tuning argument unbound, and Prolog should give us all of the possible tunings that would give us the given type of chord with the given voicing.

Both of these types of query sound extremely useful and interesting for someone exploring their fretboard and trying to deepen their understanding of the guitar, but they’re infeasible with my current implementation of the voicing/4 predicate. If we try either of them, Prolog will think forever and never give us an answer. If we trace through the execution we’ll see an enormous amount of time being wasted on inevitably doomed partial solutions.

If I were a better Prologger, I’m sure I could implement a version of the voicing/4 predicate that could give us these answers, but I’m just not there yet. Consider it future work.

Robin Schroer (sulami)

The Grumpy Developer's Guide to Meetings April 21, 2020 12:00 AM

While everyone is writing about remote meetings these days, I do not believe that successful remote meetings are actually meaningfully different from successful in-person meetings. (I’m glossing over videoconferencing because I don’t believe it is a meaningful hurdle to clear.)

Like many other developers, I loathe meetings, especially when they seem like a waste of valuable time. My philosophy is “if you have to set up a meeting, at least do it right”. Consequently, here are some rules for successful meetings, both remote and in-person:

The Best Meeting Is No Meeting

It is important to understand the tradeoffs involved when deciding to schedule a meeting. Meetings are synchronous discussions, which makes them useful for knowledge exchanges and decision making processes, but they are also time-intensive and disruptive for everyone involved.

The alternative to organising a meeting is an asynchronous conversation, for example on a shared document. (I find that doing all the same work as for a meeting on a shared document, but just leaving out the actual meeting, is a very effective way of using an asynchronous process.)

This can be faster than a meeting overall, as it allows participants to contribute in their own time, instead of trying to find a slot that fits everyone. In addition, it is much less disruptive than scheduling a meeting, especially for knowledge workers.

The only good reason for a synchronous meeting is making decisions requiring knowledge exchange and/or consensus. A good example would be planning a complicated feature, or cleaning up a backlog.

The Most Important Work Happens Before the Meeting

Every meeting needs an organiser. The organiser needs to define the expected outcome of the meeting, as well as gather all required information, and share all of these in the meeting agenda. The outcome should be tangible, like a decision or a set of tasks to be done.

The agenda is crucial, because the participants will likely have to do some preparation in advance to the meeting. If any specific work needs to be done before the meeting, like research or data retrieval, make sure to clearly assign tasks.

I recommend writing out the full agenda before scheduling the meeting, and then sharing the agenda with the calendar invitation. This allows everyone to schedule their preparation in their own time.

The Actual Meeting

During the meeting it is important to keep notes, usually in a shared document. You can designate a scribe, or just agree to all contribute some bullet points whenever possible. (Both approaches have pros and cons: a dedicated scribe leads to more detailed and coherent notes, but requires a person who will not be able to take part in the conversation, at least not effectively.)

You want to write down any conclusions reached, important points made, and further questions and future work to be done. Do not care too much about form, the meeting organiser should rewrite the notes after the meeting anyway.

The meeting is prime talking time, and you should treat it as such. Do not spend time watching someone carry out a task. (There is really no point in everyone watching a single person manipulate a JIRA board during a meeting.)

If you find during the meeting that you require more information, write this down as a future task instead of derailing the meeting.

The organiser should always work towards the defined outcome, but it often happens that some other discussion emerges as a precondition to reaching the outcome. There is a balance to be struck between discussion necessary to get to a meaningful outcome and veering too far off-topic.

Keep in mind that the time slot scheduled is more a rough guideline than a rule. If you can finish early, do so. If you need more time, and everyone involved has more time, take it. (Though maybe take a 5 minute break.)

Do not try to force an outcome if you did not get there in time. If you find at the start of the meeting that some necessary precondition has not been met do not be afraid to reschedule or cancel altogether.

The Alternative: The Ad Hoc Huddle

The rules above do not mean that you cannot talk to one another without ritual. Especially for small discussions between 2-4 people, ad hoc huddles can be useful.

As a rule of thumb, these should usually be happening within the next 24 hours, so usually “later today” or “tomorrow morning”.

You still need to define a goal, and you will want to write down any outcomes in some form, even if less formal than meeting notes. Even a Slack message with some findings in a relevant channel counts.

April 19, 2020

Derek Jones (derek-jones)

Predicting the future with data+logistic regression April 19, 2020 10:35 PM

Predicting the peak of data fitted by a logistic equation is attracting a lot of attention at the moment. Let’s see how well we can predict the final size of a software system, in lines of code, using logistic regression (code+data).
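
The mechanics of such a fit are straightforward. Here is a stand-in Python sketch (not the post’s actual code+data, which are linked above; the numbers below are synthetic, invented purely to illustrate fitting a logistic equation):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    """Three-parameter logistic: K is the final size (the plateau),
    r the growth rate, t0 the inflection point."""
    return K / (1 + np.exp(-r * (t - t0)))

# Synthetic "LOC over time" data standing in for real system sizes.
t = np.linspace(0, 5000, 50)
loc = logistic(t, 25_000_000, 0.002, 3000) \
    + np.random.default_rng(0).normal(0, 50_000, t.size)

# Fit the logistic equation; K is the predicted final size.
(K, r, t0), _ = curve_fit(logistic, t, loc, p0=[loc.max(), 0.001, t.mean()])
print(round(K / 1e6, 1))  # estimated plateau, in millions of lines
```

As the post goes on to show, the fit dutifully produces a plateau whether or not the data justifies one; the fitting software does what it has been told to do.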

First up is the size of the GNU C library. This is not really a good test, since the peak (or rather a peak) has been reached.

Growth of glibc, in lines, with logistic regression fit

We need a system that has not yet reached an easily recognizable peak. The Linux kernel has been under development for many years, and lots of LOC counts are available. The plot below shows a logistic equation fitted to the kernel data, assuming that the only available data was up to day: 2,900, 3,650, 4,200, and 5,000+. Can you tell which fitted line corresponds to which number of days?

Number of lines in the Linux kernel, against days since release, and four fitted logistic regression models.

The underlying ‘problem’ is that we are telling the fitting software to fit a particular equation; the software does what it has been told to do, and fits a logistic equation (in this case).

A cubic polynomial is also a great fit to the existing kernel data (red line to the left of the blue line), and this fitted equation can be extended into future (to the right of the blue line); dotted lines are 95% confidence bounds. Do any readers believe the future size of the Linux kernel predicted by this cubic model?


Predicting the future requires lots of data on the underlying processes that drive events. Modeling events is an iterative process. Build a model, check against reality, adjust model, rinse and repeat.

If the COVID-19 experience trains people to be suspicious of future predictions made by models, it will have done something positive.

asrpo (asrp)

Flpc is now self-hosted April 19, 2020 03:27 PM

The Forth Lisp Python Continuum (Flpc) can now compile its own source code! Get it from Github.

So instead of

$ python file1.flpc file2.flpc file3.flpc > output.f

you can now run

$ ./flpc precompiled/compiler.f
> push: output.f init_g
> push: file1.flpc compile_file
> push: file2.flpc compile_file
> push: file3.flpc compile_file

Henry Robinson (henryr)

Gray Failures April 19, 2020 05:04 AM

Huang et al., HotOS 2017, “Gray Failure: The Achilles Heel of Cloud-Scale Systems”

Detecting faults in a large system is a surprisingly hard problem. First you have to decide what kind of thing you want to measure, or ‘observe’. Then you have to decide what pattern in that observation constitutes a sufficiently worrying situation (or ‘failure’) to require mitigation. Then you have to decide how to mitigate it!

Complicating this already difficult issue is the fact that the health of your system is in part a matter of perspective. Your service might be working wonderfully from inside your datacenter, where your probes are run, but all of that means nothing to your users who have been trying to get their RPCs through an overwhelmed firewall for the last hour.

That gap, between what your failure detectors observe, and what clients observe, is the subject of this paper on ‘Gray Failures’, which are the failure modes that happen when clients perceive an issue that is not yet detected by your internal systems. This is a good name for an old phenomenon (every failure detector I have built includes client-side mitigations to work around this exact issue).

April 18, 2020

Chris Double (doublec)

Fun Factor Libraries April 18, 2020 11:00 AM

Factor is a programming language I've written about before; in the early days of Factor development I wrote a number of libraries and contributed to development. It's been a while since I've contributed, but I still use Factor. The development environment has a very Smalltalk-like feel to it, and it includes full documentation and browsable source code for its libraries.

This post isn't about Factor the language, but about some of the neat, fun libraries people have written that show off the graphical development system a bit.


The first example is an implementation of the game Minesweeper in Factor. A blog post by the author explains the implementation. To run it inside Factor, do the following:

"minesweeper" run

A new window will open showing the game. Help can be shown with:

"minesweeper" help


Another fun example is displaying XKCD comics inside the Factor REPL. The implementation is explained by the author here.

USE: xkcd
...comic displayed here...
...comic displayed here...


If it seems like all the examples I'm using are from the excellent re-factor blog - well, most of them are. This blog post from re-factor shows pulling historical facts from Wikipedia:

USE: wikipedia
USE: calendar

today historical-events.
...a list of historical events from wikipedia for today...

yesterday historical-births.
...a list of historical births from wikipedia for yesterday...

5 weeks ago historical-deaths.
...a list of historical deaths from wikipedia for five weeks ago...

The items in the list are graphical elements that can be manipulated. Left clicking on the coloured words will open a URL in the default web browser. Right clicking allows you to push the element on the Factor stack and manipulate it.

The calendar vocab has a lot of interesting words that allow doing calculations like "5 weeks ago".

Hacker News

There's a hacker-news vocabulary that provides words to list current articles on the Hacker News website. Like the previous Wikipedia example, the graphical elements are clickable objects:

USE: hacker-news

...list of top articles...

...list articles related to showing projects...

CPU 8080 Emulator

A number of years ago I wrote a CPU 8080 emulator in Factor and used this to implement a Space Invaders Emulator and then emulators for a couple of other 8080 arcade games, Balloon Bomber and Lunar Rescue. These examples require the original arcade ROMs and instructions for using them are in the online help:

"" run
...opens in a new window...
"rom.balloon-bomber" run
...opens in a new window...
"rom.lunar-rescue" run
...opens in a new window...

"" help
...displays help...

Gopher Implementation

Another magical implementation from the re-factor blog, a Gopher server and a graphical Gopher Client. This is a video I made of the gopher client on YouTube:

I also did a video that shows some Factor development tools on the running Gopher client to show how everything is live in Factor:

And More

There's much more buried inside Factor. The list of articles and list of vocabularies from the online help is a good way to explore. This help system is also available offline in a Factor install. By default many libraries aren't loaded when Factor starts but you can force loading everything using load-all:

load-all
...all vocabs are loaded - prepare to wait for a while...
save
...saves the image so when factor is restarted the vocabs remain loaded...

The benefit of doing this while developing is that all the online help, source and words are available via the built-in tools like "apropos", "usage", etc.

April 17, 2020

Gonçalo Valério (dethos)

Django Friday Tips: Feature Flags April 17, 2020 07:49 PM

This time, as you can deduce from the title, I will address the topic of how to use feature flags on Django websites and applications. This is incredibly useful functionality to have, especially if you need to continuously roll new code to production environments that might not be ready to be released.

But first, what are feature flags? Wikipedia tells us this:

A feature toggle (also feature switch, feature flag, …) is a technique in software development that attempts to provide an alternative to maintaining multiple branches in source code (known as feature branches), such that a software feature can be tested even before it is completed and ready for release. Feature toggle is used to hide, enable or disable the feature during runtime.


That's a pretty clear explanation, and it gives us a glimpse of the potential of having this capability in a given project. Exploring the concept a bit more uncovers a nice set of possibilities and use cases, such as:

  • Canary Releases
  • Instant Rollbacks
  • AB Testing
  • Testing features with production data

To dive further into the concept I recommend starting by reading this article, which gives you a very detailed explanation of the overall idea.

In the rest of the post I will describe how this kind of functionality can easily be included in a standard Django application. Over time many packages were built to solve this problem; however, most aren't maintained anymore, so for this post I picked django-waffle, given it's one of the few still in active development.

As an example scenario, let's imagine a company that provides a suite of online office tools and is currently in the process of introducing a new product while redoing the main website's design. The team wants some trusted users and the developers to have access to the unfinished product in production, and a small group of random users to view the new design.

With the above scenario in mind, we start by installing the package and adding it to our project by following the instructions present in the official documentation.

Now, picking the /products page, which is supposed to display the list of existing products, we can implement it this way:

from django.shortcuts import render

from waffle import flag_is_active

def products(request):
    if flag_is_active(request, "new-design"):
        return render(request, "new-design/product_list.html")
    return render(request, "product_list.html")

# templates/products.html
{% load waffle_tags %}

<!DOCTYPE html>
<html>
  <head><title>Available Products</title></head>
  <body>
    <ul>
        <li><a href="/spreadsheets">Spreadsheet</a></li>
        <li><a href="/presentations">Presentation</a></li>
        <li><a href="/chat">Chat</a></li>
        <li><a href="/emails">Marketing emails</a></li>
        {% flag "document-manager" %}
            <li><a href="/documents">Document manager</a></li>
        {% endflag %}
    </ul>
  </body>
</html>

You can see above that two conditions are checked while processing a given request. These conditions are the flags: models in the database with certain criteria that are evaluated against the provided request to determine whether they are active.

Now, in the database, we can configure the behavior of this code by editing the flag objects. Here are the two objects that I created (retrieved using the dumpdata command):

[
  {
    "model": "waffle.flag",
    "pk": 1,
    "fields": {
      "name": "new-design",
      "everyone": null,
      "percent": "2.0",
      "testing": false,
      "superusers": false,
      "staff": false,
      "authenticated": false,
      "languages": "",
      "rollout": false,
      "note": "",
      "created": "2020-04-17T18:41:31Z",
      "modified": "2020-04-17T18:51:10.383Z",
      "groups": [],
      "users": []
    }
  },
  {
    "model": "waffle.flag",
    "pk": 2,
    "fields": {
      "name": "document-manager",
      "everyone": null,
      "percent": null,
      "testing": false,
      "superusers": true,
      "staff": false,
      "authenticated": false,
      "languages": "",
      "rollout": false,
      "note": "",
      "created": "2020-04-17T18:43:27Z",
      "modified": "2020-04-17T19:02:31.762Z",
      "groups": [
        1,  # Dev Team
        2   # Beta Customers
      ],
      "users": []
    }
  }
]

So in this case new-design is available to 2% of the users and document-manager only for the Dev Team and Beta Customers user groups.
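
django-waffle handles the bookkeeping, but the heart of a percent-based flag is small. Here is a toy sketch of deterministic percent bucketing (my own illustration, not waffle's actual implementation; the helper name is hypothetical), showing why a given user consistently sees the same variant:

```python
import hashlib

def percent_flag_active(flag_name, user_id, percent):
    """Toy percent-rollout check: hash (flag, user) into a bucket 0..9999
    and compare against the threshold. Deterministic, so a user never
    flips between variants on reload. Not django-waffle's real code."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 10000
    return bucket < percent * 100  # percent expressed as e.g. 2.0 for 2%

# Roughly 2% of a large user population should see the flag as active.
active = sum(percent_flag_active("new-design", uid, 2.0) for uid in range(100_000))
print(active)
```

The hash makes the 2% bucket stable across requests without storing per-user state, which is the usual trade-off behind this kind of rollout.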

And for today this is it.

April 16, 2020

Jeremy Morgan (JeremyMorgan)

I Took a COBOL Course and I Liked It April 16, 2020 02:02 AM

COBOL is in the news again. Millions of people are filing unemployment claims nearly all at once, and the systems to process them are failing. Why? They need to scale to unprecedented levels, they’re written in COBOL, and… we don’t have enough COBOL programmers. Here’s a look at the increase in searches for “COBOL programmers”: Most COBOL programmers are retired. The pipeline of new COBOL programmers is nearly nonexistent. Many are coming out of retirement just to help.

Pete Corey (petecorey)

Clapping Music with TidalCycles April 16, 2020 12:00 AM

I’m always looking for ways of making music with my computer that feels natural to me and fits with my mental model of music creation. As part of that search, I decided to try learning TidalCycles. Tidal is a language embedded into Haskell and designed to write and manipulate patterns to create music.

After getting set up and tinkering for a bit, I decided to try using Tidal to create an intentional piece of music. Tidal’s focus on patterns made me think of another musician obsessed with patterns and repetition, Steve Reich. That connection planted a seed in my mind, and I decided to try recreating Steve Reich’s Clapping Music with Tidal:

The meat of my implementation is in these three statements that describe the shifting rhythm:

repeatCycles 4
$ iter "12"
$ n "0 0 0 ~ 0 0 ~ 0 ~ 0 0 ~"

We start with our base pattern (n "0 0 0 ~ 0 0 ~ 0 ~ 0 0 ~"). Next, we use iter to split that pattern into the twelve variations we'll play throughout the piece. However, we want to repeat each variation for some number of cycles before moving on to the next. It turns out that repeatCycles is a nice way of accomplishing this. We repeat each variation for four cycles before moving on to the next.
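
Outside of Tidal, the scheduling those three lines describe can be sketched in plain Python (the x/~ pattern strings are my own notation, not Tidal's): player one holds the base pattern while player two steps through all twelve rotations and back to unison, holding each for four cycles.

```python
BASE = "xxx~xx~x~xx~"  # Reich's clapping pattern: x = clap, ~ = rest

def rotate(pattern, k):
    """Shift the pattern left by k steps (what Tidal's iter walks through)."""
    return pattern[k:] + pattern[:k]

def clapping_music(repeats=4):
    """Yield (player1, player2) patterns per cycle: 13 positions
    (twelve shifts plus the return to unison), each held `repeats`
    cycles, for 13 * repeats cycles in total."""
    for k in range(13):
        for _ in range(repeats):
            yield BASE, rotate(BASE, k % 12)

cycles = list(clapping_music())
print(len(cycles))  # 52 == 13 * 4
```

This is only a model of the structure, of course; Tidal's version also handles the audio.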

Because we’re repeating each variation of our pattern for four cycles, we’ll want to use seqP to sequence 13 * 4 cycles of both rhythms to start and end our piece in unison:

seqP [
    13 * 4,
    13 * 4,

When using seqP, it’s important to resetCycles to make sure we start on cycle 0.

And that’s all there is to it! It’s simple in hindsight, but I spent quite a while figuring this out, and learned from many mistakes along the way. The end result was worth it. Rock on, Steve Reich.

April 15, 2020

Jan van den Berg (j11g)

Ten pieces of software that removed roadblocks April 15, 2020 02:40 PM

Successful software is not defined by the number of lines of code or number of clever algorithms. More often than not, successful software is defined by how many roadblocks it removes for the user.

Sounds obvious, right? But it usually takes a few iterations before software gains critical mass. And for a (critical) mass number of users, you need to remove roadblocks. Roadblocks that power-users or early adopters don’t mind dealing with, but for regular users make all the difference.

Here are some examples of software that were not always the first, but did remove the right roadblocks and cleared the road for the masses.

Netscape (Mosaic)

Netscape is probably the most classic example of this. You already had the internet and the World Wide Web. And you had Gopher, FTP and SMTP and the likes. But critical mass? You needed something much simpler! Something that didn’t require typing in difficult commands after connecting to some remote server. But a graphical user interface where you could just point and click*. That’s what really brought the masses to the World Wide Web.

(*You could argue that Windows 95 did exactly the same, eleven years after the Mac did it).

VLC

Remember when you had to download specific video codecs for your media player? I do, and trust me, you don't want to do that. VLC was like a breath of fresh air because it took care of all that stuff.

VLC was not the first (or last) desktop video player. But it was the first that bundled all codecs and made sure you could pretty much throw every imaginable video format at it, and it would just play it! It removed that roadblock.

YouTube

Remember emailing videos? Sure that might work. But how can you be sure the receiver has the right codec (see above)? Or that the receiving email provider won’t mark the video as spam or too big for email? YouTube completely removed all barriers for uploading, sharing and viewing videos online in one go. Just from the browser and without a subscription. A lot of roadblocks: gone.

Spotify

CDs were already a thing of the past. But downloading, paying for and managing individual songs was still a lot of work. Spotify managed to figure this one out, and it turned out this is actually what a lot of people wanted. Every song available, at all times for a fixed fee? Talk about removing roadblocks.

WhatsApp

WhatsApp was not the first or only IM/chat software, not even by a long shot. So why did it succeed (in most parts of the world) as the number one smartphone chat app? Because they removed multiple roadblocks.

Early on WhatsApp put a lot of time and effort in making sure their software worked on any kind of cellphone, and specifically older, less powerful phones. Remember they offered a Java ME version? Because they understood chat is not a one-way street. It only works when everyone involved has the same access. Founder Jan Koum learned this from personal experience when trying to chat with family on the other side of the world on shabby internet connections.

And he and co-founder Brian Acton even carried around old phones for a long time. For this exact reason.

Slack

I never had a need for Slack (I’ve been using IRC for over 20 years), but I can clearly see what they did: they removed roadblocks.

While still offering pretty much the same core functionality as IRC: persistent group chat (emphasis on persistent). But without the need to choose servers, set up proxies, or use intimidating software and all that other difficult stuff. They took care of all that. Oh, and you can share animated gifs.

iPhone

The iPhone is an amalgamation of hardware and software, but it probably belongs on this list for all the same reasons. It was not the first smartphone, but it was the first that did everything right and didn't feel second grade (hardware- and software-wise). Before the iPhone there were many different smartphones in every shape and form; after the iPhone, every smartphone looked like the iPhone. That should tell you something.

Zoom

I have personally never used Zoom, and from what I have learned I probably won't any time soon. But I can clearly see what's happening here. All the (dirty) tricks they pulled with the installer and audio stack: it is all about removing roadblocks. You can (and should) be critical of these kinds of tricks, but you can't deny they made Zoom the current go-to app for video group chat, leaving Skype and the likes in the dust.

(I also think they have the best/easiest to remember name. That probably also helps. I could see it becoming a verb.)

C programming language

I may be going out on a limb here, but I think C's portability is undeniably a large factor in the success of C (among other things). Because C was highly portable, it removed many roadblocks in the years ahead, when many different hardware platforms all needed a higher-level language but did not want to reinvent the wheel. C removed that roadblock and subsequently became a dominant language.

GPL

Entering dodgy terrain here. Not actual software, but a license. There are *many* licenses out there. But the GPL was one of the first that removed many important roadblocks around how to share and distribute software, paving the way for a whole lot of other things. And it caused an explosion of software in the 80s and 90s (GCC, GNU/Linux et al.)


These are just some examples, but I always like to hear others! What software do you think removed a bunch of roadblocks to pave the way for mass adoption?

The post Ten pieces of software that removed roadblocks appeared first on Jan van den Berg.

April 14, 2020

Ponylang (SeanTAllen)

Last Week in Pony - April 14, 2020 April 14, 2020 09:38 PM

We return after a long absence with some very sad news.

Frederic Cambus (fcambus)

Chinese BBSes and Unicode ANSi Art April 14, 2020 08:50 PM

After doing my series on Taiwanese BBSes (first part, second part), I also took some screenshots from two Chinese BBS systems, but only found those files again recently.

Those screens were captured in March 2013 and cover Lilac and NewSMTH systems. While I could not find much English information about Lilac, which seems to be located in Hong Kong, there is a Wikipedia page about SMTH which appears to have had a complicated history.

Lilac Login Screen:


Lilac Main Menu Screens:



Lilac Goodbye Screens:




NewSMTH Welcome Screens:



NewSMTH Login Screen:


NewSMTH Main Menu Screens:







NewSMTH Goodbye Screens:


April 13, 2020

Jan van den Berg (j11g)

String Theory – David Foster Wallace April 13, 2020 08:56 PM

If you read this blog, you know DFW is one of my favorite writers. I even named my book app, in part, after him. So I could be short about String Theory — it's an absolute, pure delight to read — but, of course, I won't.

String Theory – David Foster Wallace (2016) – 150 pages

String Theory is a collection of five DFW essays about tennis. It mostly covers 90s era tennis — Sampras and Agassi — but it closes with 2006 Federer. With DFW's untimely death in 2008, I find it rather pleasing that by attending the 2006 Wimbledon final, Wallace got to witness, and write about, the phenomenon that Federer is. And writing this in 2020, it is even more remarkable that Federer is still playing and competing with the best. Think about that for a second, will you?

That said, his piece on Federer is not the best in this collection. But with Wallace that doesn’t mean it’s bad, because for any other writer such an essay would still be the summit of their writing career.

Though it seems with Federer that Wallace was, understandably, genuinely awestruck and smitten in such a way that he finds it hard to describe what makes Federer so special. And that probably says more about Federer’s remarkable talent than it does about Wallace’s.

But it is not just that which sets this essay apart from the others for me; it is also that there is less of Wallace himself in this specific piece. His surprised, bemused and bewildered observations of sometimes unrelated random events or encounters, sprinkled through his essays, either in footnotes or the main body, are what make his writing so enjoyable. You can find this in most essays, but just a little bit less in the Federer one.

Take his complete letdown by the bland biography of famous tennis player Tracy Austin. I find it hilarious because it bothers him so much, even though that (hilarity) was not the goal.
Because, mind you: in the end, even from such a dull and uninspiring sports biography, Wallace manages to ask valid questions about genius and talent, and he lets you know the premise was not to be agitated and write amusingly about that, but to ask questions.

The essay about Michael Joyce might as well be the greatest thing ever written about tennis (or dare I say, sports in general?). It’s a complex and nuanced, highly technical, hyper personal but still general analysis of what constitutes greatness. He makes you see things with different eyes, while he is learning to see it for himself. Just amazing.

The lack of these personal observations in the Federer essay is a breeding ground for questions. Was this deliberate? Does this mean he was bored with this style? Was it a style? Questions you can endlessly debate.

Fact is: never has there been a greater collection of stories about the game of tennis than what you'll find in String Theory.

The post String Theory – David Foster Wallace appeared first on Jan van den Berg.

April 12, 2020

Derek Jones (derek-jones)

Motzkin paths and source code silhouettes April 12, 2020 10:25 PM

Consider a language that just contains assignments and if-statements (no else arm). Nesting level could be used to visualize programs written in such a language; an if represented by an Up step, an assignment by a Level step, and the if-terminator (e.g., the } token) by a Down step. Silhouettes for the nine possible four line programs are shown in the figure below (image courtesy of Wikipedia). I use the term silhouette because the obvious terms (e.g., path and trace) have other common usage meanings.

Number of distinct silhouettes for a function containing four statements

How many silhouettes are possible, for a function containing n statements? Motzkin numbers provide the answer; the number of silhouettes for functions containing from zero to 20 statements is: 1, 1, 2, 4, 9, 21, 51, 127, 323, 835, 2188, 5798, 15511, 41835, 113634, 310572, 853467, 2356779, 6536382, 18199284, 50852019. The recurrence relation for Motzkin numbers is (where n is the total number of steps, i.e., statements):

(n+2)m_n = (2n+1)m_{n-1}+3(n-1)m_{n-2}
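
The recurrence is easy to check numerically; a small Python sketch that reproduces the sequence listed above:

```python
def motzkin(n_max):
    """Motzkin numbers via the recurrence
    (n+2)*m[n] = (2n+1)*m[n-1] + 3*(n-1)*m[n-2], with m[0] = m[1] = 1.
    The division is exact at every step."""
    m = [1, 1]
    for n in range(2, n_max + 1):
        m.append(((2 * n + 1) * m[n - 1] + 3 * (n - 1) * m[n - 2]) // (n + 2))
    return m

print(motzkin(20))
```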

Human written code contains recurring patterns; the probability of encountering an if-statement, when reading code, is around 17% (at least for the C source of some desktop applications). What does an upward probability of 17% do to the Motzkin recurrence relation? For many years I have been keeping my eyes open for possible answers (solving the number theory involved is well above my mathematics pay grade). A few days ago I discovered weighted Motzkin paths.

A weighted Motzkin path is one where the Up, Level and Down steps each have distinct weights. The recurrence relationship for weighted Motzkin paths is expressed in terms of the number of colored steps, where: d is the number of possible colors for the Level steps, and c is the number of possible colors for the Down steps; Up steps are assumed to have a single color:

(n+2)m_n = d(2n+1)m_{n-1}+(4c-d^2)(n-1)m_{n-2}

setting: c=1 and d=1 (i.e., all kinds of step have one color) recovers the original relation.

The different colored Level steps might be interpreted as different kinds of non-nesting statement sequences, and the different colored Down steps might be interpreted as different ways of decreasing nesting by one (e.g., a goto statement).

The connection between weighted Motzkin paths and probability is that the colors can be treated as weights that add up to 1. Searching on “weighted Motzkin” returns the kind of information I had been looking for; it seems that researchers in other fields had already discovered weighted Motzkin paths, and produced some interesting results.

If an automatic source code generator outputs the start of an if statement (i.e., an Up step) with probability a, an assignment (i.e., a Level step) with probability b, and terminates the if (i.e., a Down step) with probability c, what is the probability that the function will contain at least n-1 statements? The answer, courtesy of applying Motzkin paths in research into clone cell distributions, is:

P_n = \sum_{i=0}^{\lfloor (n-2)/2 \rfloor} \binom{n-2}{2i} C_{2i}\, a^i b^{n-2-2i} c^{i+1}

where: C_{2i} is the 2i'th Catalan number, and \lfloor \cdot \rfloor denotes truncation (the floor function); code for an implementation in R.
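
The linked R code isn't reproduced here, but the formula, read exactly as stated (with C_{2i} the 2i'th Catalan number), translates directly to Python; a sketch, where a, b, c are the Up, Level and Down probabilities:

```python
from math import comb

def catalan(k):
    # k'th Catalan number: C_k = (2k choose k) / (k+1)
    return comb(2 * k, k) // (k + 1)

def p_at_least(n, a, b, c):
    """Probability, per the formula above, that a randomly generated
    function reaches at least n-1 statements, where an if (Up step) is
    emitted with probability a, an assignment (Level) with b, and an
    if-terminator (Down) with c."""
    return sum(
        comb(n - 2, 2 * i) * catalan(2 * i) * a**i * b**(n - 2 - 2 * i) * c**(i + 1)
        for i in range((n - 2) // 2 + 1)
    )
```

For n = 2 the sum collapses to a single term equal to c, which matches the intuition that a two-step path is just an immediate terminator.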

In human written code we know that a != c, because the number of statements in a compound-statement roughly has an exponential distribution (at least in C).

There has been some work looking at the number of peaks in a Motzkin path, with one formula for the total number of peaks in all Motzkin paths of length n. A formula for the number of paths of length n, having k peaks, would be interesting.

Motzkin numbers have been extended to what is called higher-rank, where Up steps and Level steps can be greater than one. There are statements that can reduce nesting level by more than one, e.g., breaking out of loops, but no constructs increase nesting by more than one (that I can think of). Perhaps the rather complicated relationship can be adapted to greater Down steps.

Other kinds of statements can increase nesting level, e.g., for-statements and while-statements. I have not yet spotted any papers dealing with the case where an Up step eventually has a corresponding Down step at the appropriate nesting level (needed to handle different kinds of nest-creating constructs). Pointers welcome. A related problem is handling if-statements containing else arms, here there is an associated increase in nesting.

What characteristics does human written code have that results in it having particular kinds of silhouettes? I have been thinking about this for a while, but have no good answers.

If you spot any Motzkin related papers that you think could be applied to source code analysis, please let me know.

April 11, 2020

Pierre Chapuis (catwell)

[Quora] Explaining classes to a 10 year old April 11, 2020 01:00 PM

Continuing my Quora answers series with this question which I answered on June 26, 2012:

How would you explain the concept of a "class" in Python to a 10 year old?

You cannot distinguish the concept of "class" from the concept of "instance". I think OOP is often taught the wrong way around, paradoxically because it is taught with languages that have support for class-based Object Oriented Programming, such as Python (or Java, for that matter). So excuse me, but I will use another language (Lua, but you don't need to know it to understand) to explain it. (Note: this will be bad Lua on purpose; the point is not to teach Lua, it is to explain OOP.)

Since you wrote "to a 10 year old" let's proceed with examples.

Imagine you come from another planet and you do not know what cats are. You encounter something small that purrs. You decide to name it Sam. Later on, you see something else very similar, except it is bigger, and you name it Max.

Let us describe Sam and Max in Lua.

sam = {
    name = "Sam",
    size = "small",
}

max = {
    name = "Max",
    size = "big",
}

Now we said that they purr, so let's define purring:

purr = function(self)
    print(self.name .. " purrs!")
end

purr is a function whose first argument, called self, represents the thing that purrs. For instance you could make Max purr like this:

purr(max)

Now you could stop here, or you could see purring as a property of Sam and Max. To represent that we could make purr a "method" of the "objects" Sam and Max:

sam.purr = purr
max.purr = purr

Now with the Lua syntax we could also make Max purr like this:

max:purr()

Which is a short way to write this:

max.purr(max)

Since we said that max.purr = purr it works as expected.

After some time on Earth you realize there are lots of things like Sam and Max. Moreover there are lots of things Sam, Max and their friends do, such as sleep on keyboards. They also have things in common such as the fact they have two eyes.

You grow tired of saying: "Max purrs. Max has two eyes. (...). Sam purrs. (...)". It would be much simpler to give a name to the set of Sam, Max and their friends (for instance "Cats") and say "Cats purr. Cats have two eyes.".

Note that "Cats" is nothing physical, it is an idea, a category you have created for Sam, Max and their friends in order to be able to express things about them in an easier way.

Let's switch to Python to show how this is done now:

class Cat:
    eyes = 2

    def __init__(self, name, size):
        self.name = name
        self.size = size

    def purr(self):
        print(self.name + " purrs!")

And how you use it:

sam = Cat(name="Sam", size="small")
sam.purr()

Note that we have never said explicitly that Sam can purr. He can purr because he is a Cat.

In this example:

  • Cat is a class, i.e. a category of objects;

  • Sam and Max are instances of the Cat class;

  • purr is a method of the Cat class, i.e. not much more, conceptually, than a function that takes a Cat instance as its first argument, plus some syntactic sugar (i.e. special notation) to call it.
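
Put together, the Python half of the example runs end to end; a self-contained sketch (with Max bound to max_ only to avoid shadowing Python's builtin):

```python
class Cat:
    """A class: the category we invented for Sam, Max and their friends."""
    eyes = 2  # shared by every Cat

    def __init__(self, name, size):
        self.name = name
        self.size = size

    def purr(self):
        # A method: conceptually a function taking a Cat instance first.
        print(self.name + " purrs!")

sam = Cat(name="Sam", size="small")
max_ = Cat(name="Max", size="big")

sam.purr()       # prints "Sam purrs!"
Cat.purr(max_)   # the same call without the syntactic sugar: prints "Max purrs!"
```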

By the way, this answer got one of the best comments I ever got on a Quora answer:

Sam and Max are not cats!


April 10, 2020

Jan van den Berg (j11g)

The Trial – Franz Kafka April 10, 2020 11:13 AM

Max Brod is probably the world's greatest publicist. He famously refused his writer friend's dying wish to destroy all his work after his passing.

This friend was of course, Franz Kafka. And against Kafka’s wishes Max Brod did publish his works and subsequently Kafka became known to the world as an absolute literary genius.

The Trial – Franz Kafka (1925) – 286 pages

The Trial

The only other Kafka I had read before this one was The Metamorphosis, which I liked a lot. So I hate to admit it, but I was a bit bored reading The Trial.

The Metamorphosis is more concise, and much more over the top. Which absolutely works. The Trial however, is much tamer.

Sure, I can see what's happening and what Kafka is trying to accomplish, and the ideas and underlying themes he's playing with. And I really like the dreamlike/nightmarish parallel world Kafka created for the main character. This is of course his well-known hallmark: creating these typical Kafkaesque surreal settings. And I thought the doorkeeper story within the story was maybe the most interesting part.

Maybe it was my stiff Dutch translation, but overall I had a hard time getting into it.


So mr. blogger, you dare to call Kafka overrated? Not exactly, but I cannot let go of the idea that part of the appeal is Kafka’s elusiveness.

A prolific perfectionist writer who does not want to be published? Who had a very troubled relationship with his abusive father? A writer who died young of malnutrition? A writer who only finds success after his death? A writer whose books are posthumously (sometimes) scraped together from bits and pieces of scrap paper, so you can forever fawn over the true meaning and interpretation of it?

This is all very much right up the literary world's alley.

No doubt The Trial has had great cultural and literary impact. But if you hand this book to someone unaware of all this, I think they might not enjoy it as much as critics tend to think.

Nonetheless these are two pretty good videos explaining what makes Kafka an interesting writer. And I am still interested in his other stories.

The post The Trial – Franz Kafka appeared first on Jan van den Berg.

April 09, 2020

Jan van den Berg (j11g)

Cruddiy: a no-code Bootstrap CRUD generator April 09, 2020 07:18 PM

Sometimes you need to give people access to a MySQL database to do some basic tasks. They should be able to Create, Read, Update or Delete database records. And you probably know the user (the preferred use-case), but you don’t want to give them access to phpMyAdmin, which is often too difficult, let alone command line access. But you also don’t want to handcode the same PHP CRUD pages again!

Now you can use Cruddiy (CRUD Do It Yourself). To quickly generate clean CRUD pages with zero code.

You’ve probably seen pages like this a thousand times before. And now you can make them with a few clicks.

Pages like this are generated without writing a single line of code. Proper titles, pagination, actions and sorting included.

I got tired of programming the same pages over and over again for some simple database actions. So in classic yakshaving fashion I decided to automate this, and built a generator that generates PHP CRUD pages.

Cruddiy is a no-code PHP generator that will generate PHP Bootstrap CRUD pages for your MySQL tables.

Cruddiy output is an /app folder with everything you need. You can move this folder anywhere and delete Cruddiy if you like.

Most MVC frameworks (e.g. Symfony, Django, Yii2) are also able to generate CRUD pages for you. I have used all of these. But more often than not you end up with 80MB of code (no joke), with all kinds of dependencies that you need to deploy and maintain, for just a couple of PHP pages! This rubs me the wrong way. Of course there are many PHP CRUD generators around, but they are either not libre, built on top of other, larger frameworks, or lacking something else I was looking for. So I first and foremost built Cruddiy for myself.

Goals and characteristics

  • Simple
    • No dependencies, just pure PHP (and Bootstrap from CDN).
    • Written in PHP and output in PHP. So when the generator runs correctly, your generated app will run correctly.
  • Clean
    • Just generate what’s needed, nothing else.
  • Small
    • If it wasn’t obvious from the above, the app it generates should be small. Kilobytes not megabytes.
  • Portable
    • Cruddiy generates everything in one single /app folder. You can move this folder anywhere. You don’t need Cruddiy after generating what you need.
  • Bootstrap
    • Bootstrap looks clean and is relatively simple and small. I use Bootstrap 3 because I like/know it better than 4.


Why PHP?

  • Love it or hate it: but PHP is ubiquitous. You can download Cruddiy on most webservers and you’re good to go. wget the zip -> unpack -> surf to the folder in your browser and follow instructions.
  • Cruddiy is of course a sort of templating engine. It stamps out code based on templates. And if PHP is anything, it is by default a template engine itself. So it’s a pretty good language for this kind of thing.

Cruddiy does not follow the MVC paradigm!

Yes, look elsewhere if you need this. This is not a framework.

Your code is full of dirty hacks

I might do quite a bit of array mangling and string replacement, but the PHP pages Cruddiy generates are as clean as they come. And when you’re done generating pages, you can just delete Cruddiy. It completely builds a self-contained portable app that will run on almost any webserver with PHP (i.e. most).

Your generated code is incomplete

At the moment what’s lacking is error checking on database inserts/updates (all fields are required, and it doesn’t check field types: integers vs. dates etc.). These will throw general errors or do nothing at all. I will probably improve this, but for most use-cases (see above) this should not be a problem. The generated code does use prepared statements and should not be vulnerable to SQLi. But hey, drop me a line if you find something!

Next features?

I might add these things:

  • Darkmode
  • Bootstrap 4 theme
  • Export to CSV or XLS (users probably want this more often than not)
  • Rearrange column order
  • Search records (at the top of the page)
  • User registration (simple table with username and password: .htaccess works for now)
  • Define table relations (use for cascading deletes etc.)
  • More specific field types (ENUM = drop-down list etc.)
  • More and better input validation
  • Catch more database errors
Cruddiy in action.

The post Cruddiy: a no-code Bootstrap CRUD generator appeared first on Jan van den Berg.

Use find (1) as a quick and dirty duplicate file finder April 09, 2020 01:24 PM

Run the following two commands in bash to get a listing of all duplicate files (from a directory or location). This can help you clean out duplicate files that sometimes accumulate over time.

The first command uses find to print all files (and specific attributes) from a specific location to a file, prefixing each file name with the file’s size. This way all files with the same name and the same size are grouped together, which is usually a strong indicator that the files are identical.

When you run the second command you will get a sorted list of all actual duplicates, grouped together. This way, you can quickly pick out similar files and manually choose which ones to keep or delete.

find . -type f -printf "%s-%f\t %f %c\t %p\n" >> /tmp/findcmd

for i in `sort -n /tmp/findcmd|awk '{print $1}'|uniq -cd|sort -n|awk '{print $2}'`; do grep -F "$i" /tmp/findcmd; done

The output will look something like this; you can instantly tell which files are duplicates, based on size, name and/or timestamp.

1067761-P4270521.JPG     P4270521.JPG Wed Apr 27 18:05:04.0000000000 2011        ./Backups Laptops/Ri-janne/2011 Diversen
1067761-P4270521.JPG     P4270521.JPG Wed Apr 27 18:05:04.0000000000 2011        ./Backups Laptops/Ri-janne/2011 camera
1067898-IMG_3418.JPG     IMG_3418.JPG Thu Aug 28 20:08:28.0000000000 2008        ./Piks/2008/Vakantie USA 2008/Dag 7 Louisville Shopping
1067898-IMG_3418.JPG     IMG_3418.JPG Thu Aug 28 19:08:28.0000000000 2008        ./Backups Laptops/Ri-janne/2008 USA
1067969-P9180184.JPG     P9180184.JPG Sat Sep 18 17:45:52.0000000000 2010        ./Backups Laptops/Ri-janne/2010 Diversen
1067969-P9180184.JPG     P9180184.JPG Sat Sep 18 17:45:52.0000000000 2010        ./Backups Laptops/Ri-janne/2010 uitzoeken
1068244-100_2962.jpg     100_2962.jpg Thu Jul 17 18:18:52.0000000000 2008        ./.Trash-1000/files/Mijn afbeeldingen/Italia 09/Greece '08
1068244-100_2962.jpg     100_2962.jpg Thu Jul 17 18:18:52.0000000000 2008        ./Backups Laptops/Jan/Mijn documenten/Mathea/Mijn afbeeldingen/Italia 09/Greece '08
1068284-DSC_7640.JPG     DSC_7640.JPG Sat Apr 26 14:47:58.0000000000 2014        ./Piks/2014/20140426 KDag
1068284-DSC_7640.JPG     DSC_7640.JPG Tue Apr 29 21:56:54.0000000000 2014        ./Piks/2014/20140426 Koningsdag
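The same grouping idea can also be sketched outside the shell. Here is a minimal Python version (the function name and structure are my own, not from the post) that groups files by (size, name), just like the find/sort pipeline:

```python
import os
from collections import defaultdict

def find_candidate_duplicates(root):
    """Group files under root by (size, name); any group with more
    than one entry is a likely set of duplicates."""
    groups = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that vanish or are unreadable
            groups[(size, name)].append(path)
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```

Like the shell pipeline, this only flags candidates; comparing actual contents (e.g. by hashing) is still needed to be completely certain.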

The post Use find (1) as a quick and dirty duplicate file finder appeared first on Jan van den Berg.

Pierre Chapuis (catwell)

[Quora] What is the call metamethod in Lua? April 09, 2020 08:40 AM

From 2011 to 2014, I used to post answers on Quora. I don't anymore, because I don't really like what the website has become. I have a copy of some of my answers here but someone commented on one of my answers that it should be available more prominently on the Web, so I decided to repost a few of my answers here, starting with this one.

The original question was:

I'm really new to Lua and relatively new to programming, so kindly excuse me if I say something stupid.

I have a table named x and its metatable named y. When I have a __call method defined for the metatable y, then I can call x() but if I have a __call for x then I can not call x().

What is __call used for? How does it work, and what are some examples of usage?

I answered it on February 25, 2013.

__call is a metamethod, that means it is meant to be defined in a metatable. A __call field added to a regular table (x in your example) does nothing.

The role of __call is to make something that is not a function (usually a table) act like a function. There are a few reasons why you may want to do that. Here are two examples.

The first one is a memoizing factorial function. In Lua you could write a recursive factorial like this:

local function fact(n)
    if n == 0 then
        return 1
    else
        return n * fact(n - 1)
    end
end

Note: this is not a good way to write a recursive factorial because you are not taking advantage of tail calls, but it's enough for what I want to explain.

Now imagine your code uses that function to calculate the factorials of numbers from 1 to N. This would be very wasteful since you would calculate the factorial of N once, the factorial of N-1 twice, and so on. You would end up computing approximately N²/2 factorials.

Instead you could write that:

local fact
fact = setmetatable(
    {[0] = 1},
    {
        __call = function(t, n)
            if not t[n] then
                t[n] = n * fact(n - 1)
            end
            return t[n]
        end
    }
)

It is an implementation of factorial that memoizes the results it has already computed, which you can call like a function. You can use it exactly like the previous implementation of factorial and get linear complexity.
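For readers more familiar with Python, the same idea (a memoizing object that can be called like a function) maps onto Python's __call__ protocol. This parallel sketch is my own, not part of the original answer:

```python
class MemoFact:
    """A callable object that caches factorials it has already computed,
    analogous to the Lua table with a __call metamethod."""

    def __init__(self):
        self.cache = {0: 1}

    def __call__(self, n):
        if n not in self.cache:
            self.cache[n] = n * self(n - 1)
        return self.cache[n]

fact = MemoFact()
print(fact(5))  # 120
```

As in the Lua version, calling fact(N) fills the cache for every value up to N, so repeated calls run in linear rather than quadratic time.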

Another use case for __call is matrices. Imagine you have a matrix implementation that works like that:

local methods = {
    get = function(self, i, j)
        return self[i + 1][j + 1]
    end,
}

local mt = {__index = methods}

local new_matrix = function(t)
    return setmetatable(t, mt)
end

You can use it like that:

local M = new_matrix({ {1, 2}, {3, 4} })
local v = M:get(0, 1)
assert(v == 2)

However scientists would probably expect something like this:

local v = M(0, 1)
assert(v == 2)

You can achieve that thanks to __call:

local mt = {
    __index = methods,
    __call = function(self, i, j)
        return self:get(i, j)
    end,
}

I hope this gives you enough information to understand how you can use __call. A word of warning though: like most other metamethods, it is useful, but it is important not to abuse it. Simple code is better :)

April 07, 2020

Pierre Chapuis (catwell)

Changing the SSH port on Arch Linux April 07, 2020 10:40 AM

April 2020 update

This article is out of date. Arch Linux stopped shipping OpenSSH with socket activation due to the risk of DoS attack. Now you can just set Port in sshd_config as usual.

Original article

I often change the default SSH port from 22 to something else on servers I run. It is kind of a dangerous operation, especially when the only way you have to connect to that server is SSH.

The historical way to do this is editing sshd_config and setting the Port variable, but with recent versions of Arch Linux and the default configuration, this will not work.

The reason is that SSH is configured with systemd socket activation. So what you need to do is run sudo systemctl edit sshd.socket and set the contents of the file to:

[Socket]
ListenStream=
ListenStream=MY_PORT
where MY_PORT is the port number you want.

I hope this short post will avoid trouble for other people, at least it will be a reminder for me the next time I have to setup an Arch server...

Henry Robinson (henryr)

Availability in AWS' Physalia April 07, 2020 05:04 AM

Brooker et al., NSDI 2020, “Physalia: Millions of Tiny Databases”

Some notes on AWS’ latest systems publication, which continues and expands their thinking about reducing the effect of failures in very large distributed systems (see shuffle sharding as an earlier and complementary technique for the same kind of problem).

Physalia is a configuration store for AWS’ Elastic Block Store (i.e. network-attached disks). EBS disks are replicated using chain replication, but the configuration of the replication chain needs to be stored somewhere - enter Physalia.

April 05, 2020

Derek Jones (derek-jones)

Comments on the COVID-19 model source code from Imperial April 05, 2020 08:50 PM

At the end of March a paper modelling the impact of various scenarios on the spread of COVID-19 infections, by the MRC Centre for Global Infectious Disease Analysis at Imperial College, appears to have influenced the policy of the powers that be. This group recently started publishing their modelling code on Github (good for them).

Most of my professional life has been spent analyzing other people’s code, for one reason or another (mostly Fortran, then Pascal, and then C). I had heard that the Imperial software was written in C, but the released code is written in R (as of six hours ago there is the start of a Python version). Ok, I can work with R, but my comments will be general, since I don’t have lots of in depth experience reading R code.

The code comes from a research context, and is evolving, i.e., some amount of messiness is to be expected.

There is not a lot of code to talk about (248 lines setting things up, 111 lines for a Stan model, 371 lines of plotting code, and 85 lines of utility code). The analysis is performed by creating a model using the Stan statistical inference language (in which the high level structure of the problem is specified, compiled to a lower level form and then run; the Stan language is very similar to R). These days lots of problems are coded using a relatively small number of lines that call fancy libraries to do the heavy lifting. It is becoming rare to have to write tens of thousands of lines of code to solve a problem.

I have two points to make about the code, all designed to reduce the likelihood of mistakes being made by the person working on the source. These points mainly apply to the Stan code, because that is where the important stuff happens, but are equally applicable to all code.

  • Numeric literals are embedded in the code, values include: 2.4, 1.0, 0.5, 0.03, 1e-5, and 1e-9. These values obviously mean something to the person who wrote the code, and they can probably be interpreted by experts in the spread of virus infections. But why are they scattered about the code, rather than appearing together (as a sequence of assignments to variables with meaningful names)? Having all the constants in one place makes it easier to spot when a mistake has been made, e.g., one value has been changed without a corresponding change in another value; it also makes it easier for people new to the code to figure out what is going on,
  • when commenting out code, make it very obvious, e.g., have /********************** on its own line, and *****************************/ on its own line. Using just /* and */ makes it easy to miss that code has been commented out.

Why have they started a Python implementation? Perhaps somebody on the team is more comfortable working with Python (when deadlines loom, it is always best to go with what you know).

Having both an R and Python version is good, in that coding mistakes are likely to show up as inconsistencies in the results produced. It’s always good to have the output of two independently written programs to compare (apart from the fact it may cost twice as much).

The README mentions performance issues. I imagine that most of the execution time is spent in the Stan code, so R vs. Python is not a performance issue.

Any reader with expertise tuning Stan models for performance might like to check out the code. I’m sure the Imperial folk would be happy to hear about worthwhile speed-ups.


Related: the R source code of the EuroMOMO model, which aims to “… explain number of deaths by a baseline, influenza activity and extreme ambient temperature.”

Bogdan Popa (bogdan)

Converting byte arrays to UUIDs in Postgres April 05, 2020 04:00 PM

For a project that I’m working on, I have a custom flake id spec that allows me to generate unique, sortable identifiers across computers without any sort of synchronization. The ids themselves can be encoded down to 16 bytes and I wanted to store them in Postgres. A good way to do that is to leverage Postgres’ UUID data type, which lets you efficiently store any 16 byte quantity in a way that can be indexed reasonably well.
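The byte-to-UUID mapping on the client side can be sketched with Python's standard uuid module (the 16-byte value here is a stand-in, not the post's actual flake id format):

```python
import uuid

# Stand-in for a 16-byte flake id; the real encoding is application-specific.
raw = bytes(range(16))

# Any 16-byte string maps one-to-one onto a UUID, which Postgres can
# store and index natively in a uuid column.
flake_uuid = uuid.UUID(bytes=raw)
print(flake_uuid)  # 00010203-0405-0607-0809-0a0b0c0d0e0f

# The mapping is reversible, so the original bytes can always be recovered.
assert flake_uuid.bytes == raw
```

Because the flake ids are time-sortable, the resulting UUID column also sorts roughly by creation time, which plays nicely with index locality.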

Richard Kallos (rkallos)

What my choices told me about my priorities April 05, 2020 02:57 AM

This was a useful exercise for me. I think I’ll try and check in again sometime soon. Here were my thoughts from today.

I was prioritizing long-term storage

  • When I started using plain text files to store my notes as Org mode files (which I am trying to replace)
  • When I started using TiddlyWiki, which can be opened/reused wherever I have access to a browser that executes Javascript. It doesn’t necessarily have the longevity of plain text, but it’s close.

I was prioritizing (re-)reading and reflection

  • When I stopped transcribing my notes into Org mode files, where I would never happen across them unless I specifically went looking for them.
  • When I started transcribing my handwritten notes into TiddlyWiki, breaking my notes into smaller, more reusable chunks (tiddlers), organizing them all, and making small changes to TiddlyWiki’s interface to encourage exploration.

I am prioritizing focus

  • When I step away from the computer to think with pen and paper.
  • When I write notes by hand while reading papers or watching talks.
  • When I close Slack after realizing I’ve wasted too much time clicking on unread channels and not enough time working.

April 03, 2020

Pierre Chapuis (catwell)

Two-factor authentication with pass and oathtool April 03, 2020 04:20 PM

If you're like me, you don't want to depend on your phone to log into a website, and you wish your favorite password manager would support 2FA. Well, it can.

When asked to setup 2FA on a website, get a text code. If the website doesn't give you that option, just use zbar. For instance, with the QR code from the GitHub documentation:

$ zbarimg totp-click-enter-code.png
scanned 1 barcode symbols from 1 images in 0.03 seconds

Once you get the secret, put the command line to generate a code using oathtool in 2fa/github in pass like this:

oathtool --totp --base32 qmli3dwqm53vl7fy

Finally, add this to your .bashrc (or the equivalent for whatever shell you use):

2fa () { eval $(pass 2fa/$1) ; }

You can now get your 2FA codes like this:

$ 2fa github

All the tools used in that article are available as packages in the Arch Linux repositories.
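Under the hood, the code oathtool prints is just RFC 6238 TOTP: an HMAC-SHA1 of the current 30-second counter, dynamically truncated to a few decimal digits. A minimal Python sketch (my own illustration, not part of the original setup; a base32 secret like the one above would need base64.b32decode first):

```python
import hashlib
import hmac
import struct
import time

def totp(secret, for_time=None, step=30, digits=6):
    """RFC 6238 TOTP: HMAC-SHA1 over the time-step counter,
    dynamically truncated to a short decimal code."""
    if for_time is None:
        for_time = time.time()
    counter = int(for_time) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test vector: ASCII secret "12345678901234567890" at t=59
print(totp(b"12345678901234567890", for_time=59, digits=8))  # 94287082
```

Nothing here is secret sauce, which is exactly why storing the TOTP seed in pass works: whoever holds the seed can derive every code.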


It was a post on Lobsters that prompted me to post this. Someone from the comments and a former colleague on Twitter told me about a pass extension I didn't know about which does almost the same thing.

Also, some people think that putting 2FA codes in a password manager defeats the purpose. But in practice TOTP 2FA does not really add much more to the security of my accounts than the strong random passwords I generate with pass. The "second factor" part isn't really the true benefit.

One actual advantage is that nobody on the network can sniff all of my credentials (much like with digest-based password verification methods). Another, and I think this is the main one, is that the owner of the website has chosen part of the credentials and hence ensured some degree of strength. What I do preserves both of those properties, so I'm fine with it. By the way, note that password managers like 1Password do the same thing.

The one thing I could do to really improve the security of the whole thing is use 2FA to access pass by storing my GPG key in a Yubikey. I probably will, someday.

Pete Corey (petecorey)

Wolfram Style Cellular Automata with Vim Macros April 03, 2020 12:00 AM

It struck me while taking a break from a heavy refactoring session and browsing /r/cellular_automata that Wolfram-style cellular automata are really just glorified string substitutions. Suddenly I felt a tickling deep in my hindbrain. An idea was forming. Can you implement a Wolfram-style cellular automata, like Rule-30, entirely within Vim using only “normal” features like macros and substitutions?

In an effort to rid myself of the crushing weight of that question (and because it was Friday), I descended into the depths and came back with an answer. Yes, you can.

What is Rule 30?

Before we stick our hands into a big pile of Vim, let’s take a look at what we’re trying to accomplish. Wolfram-style cellular automata are a family of one-dimensional cellular automata where a cell’s state in the next generation is determined by its current state and the state of its immediate neighbors.

Rule 30, a specific type of Wolfram-style cellular automata, derives its rules from the binary representation of the number thirty, 00011110:

■■■ → □   ■■□ → □   ■□■ → □   ■□□ → ■
□■■ → ■   □■□ → ■   □□■ → ■   □□□ → □
As you can hopefully infer from the diagram above, an “on” cell with two “on” neighbors will turn off in the next generation. Similarly, an “off” cell with an “on” leftmost neighbor will turn on, and so on.

When computing the next generation, every cell with two neighbors is considered before any modifications are made. Let’s solidify our understanding by considering how the next generation of a longer initial sequence would be computed:

□□■□□
Looking at each cell and its immediate neighbors, we can use the rules described above to come up with the next generation of our sequence. The first three cells are □□■, which map to ■ in the next generation. The next three cells, □■□, map to ■. And finally, the last three cells, ■□□, map to ■.

This means the next generation of our initial sequence of cells (□□■□□) is ■■■. Notice that the next generation of our automata is two cells shorter than the previous generation. A simple way of avoiding this problem is to prepend and append “off” cells to the initial generation before computing the subsequent generation. If we had done that, we’d have ended up with □■■■□ as our next generation, rather than ■■■.

We can repeatedly apply this operation on successive generations to create many interesting patterns, balancing on the edge of chaos and order.
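Before reaching for Vim, the same rule set can be sketched in a few lines of Python (my own illustration, using the post's □/■ notation), which is handy for cross-checking the Vim output later:

```python
# The eight Rule 30 transitions as a lookup table (binary 00011110).
RULE30 = {
    "■■■": "□", "■■□": "□", "■□■": "□", "■□□": "■",
    "□■■": "■", "□■□": "■", "□□■": "■", "□□□": "□",
}

def next_gen(cells):
    """Pad with 'off' cells so the output keeps the input's length,
    then map every three-cell window through the rule table."""
    padded = "□" + cells + "□"
    return "".join(RULE30[padded[i:i + 3]] for i in range(len(padded) - 2))

print(next_gen("□□■□□"))  # □■■■□
```

Note how the padded version of the example above produces □■■■□, exactly as described.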

The Seeds of a Plan

Coming into this project, I had a high-level plan of attack for accomplishing what I was after. I knew I wouldn’t be able to process every iteration of the cellular automata in one pass. I’d have to break the process down into pieces.

My main strategy was to break each generation, represented as a line of on/off characters, into a paragraph of three-character lines, one for each grouping of neighbors. From there I could encode the cellular automata’s rules into substitution commands, and roll the resulting paragraph of one-character lines back up into a single line.

This current-line substitution was the foundation of my solution:

:.s/^\(..*\)\(\(..\).\)$/\1\3\r\2/
The idea behind this substitution is to use capture groups to replace a line with everything up until the last character of the line (\1\3), followed by a newline (\r), and the last three characters of the original line (\2). As an aside, I learned that you should use \r instead of \n when using newlines in substitutions.

Executed on a line with the text, □□■□□, we’d get the following result:

□□■□
■□□
Repeatedly applying this substitution would give us the following paragraph of lines:

□□■
□■□
■□□
From here, we can translate our eight Rule 30 rules into global substitutions:

:%s/^■■■$/□/g
:%s/^■■□$/□/g
:%s/^■□■$/□/g
:%s/^■□□$/■/g
:%s/^□■■$/■/g
:%s/^□■□$/■/g
:%s/^□□■$/■/g
:%s/^□□□$/□/g
Running those substitutions on our paragraph gives us a new paragraph made up of single character lines. Each of these characters is a cell in our next generation:

■
■
■
Next we just need to roll that paragraph back up into a single line. J is the tool for joining lines in Vim, but J insists on separating newly concatenated lines with a space. It turns out you can join lines directly with the gJ command:

■■■
And just like that we have our next generation.

Implementing Our Plan

The outline of our plan includes lots of hand waving, like “repeatedly applying this substitution,” which just won’t cut it for a fully automated cellular automata generation machine.

We need a way of turning our very hands-on algorithm into a hands-off, fully automated solution. An elegant way of doing this is using Vim’s built-in macro system. Vim macros are ridiculously powerful while being ridiculously simple in concept. They’re just stored keystrokes. This means there’s nothing stopping a macro (stored in, say, the q register) from invoking itself recursively (@q). This opens the door for quite a few interesting possibilities.

In the hope of building a fully automated solution, let’s implement our cellular automata generator as a script of keystrokes that can be executed in “batch mode” with the -s flag. Once finished, we’ll be able to invoke our generator on a file containing a seed, or an initial sequence of cells, with the following command:

vim -s script seed

We’ll start off our script by using setreg (instead of let) to define a macro b that will perform our first set of substitutions and build our paragraph of three-character lines:

:call setreg('b', ':.s/^\(..*\)\(\(..\).\)$/\1\3\r\2/|norm!``@b', 'l')

The |norm!`` at the end of our substitution returns the cursor to its starting place after each replacement.

Next, let’s define a macro to perform each of our Rule 30 replacements. I built these as eight separate macros, rather than one macro that performs eight substitutions, because any of these substitutions might not find what they’re looking for, which causes the active macro execution to fail. Breaking each substitution out into its own macro prevents its potential failure from affecting other rules:

:call setreg('1', ':%s/^■■■$/□/g', 'l')
:call setreg('2', ':%s/^■■□$/□/g', 'l')
:call setreg('3', ':%s/^■□■$/□/g', 'l')
:call setreg('4', ':%s/^■□□$/■/g', 'l')
:call setreg('5', ':%s/^□■■$/■/g', 'l')
:call setreg('6', ':%s/^□■□$/■/g', 'l')
:call setreg('7', ':%s/^□□■$/■/g', 'l')
:call setreg('8', ':%s/^□□□$/□/g', 'l')

Let’s write one more macro, c, to roll our paragraph of Rule 30 replacements back up into a single line:

:call setreg('c', '''aV}gJ', 'l')

Now we’ll tie everything together with a macro, a, that copies the current line and pastes it below, performs our recursive line destructing, carries out our Rule 30 replacements, and then rolls everything back together, producing the next generation of our cells right below the previous generation:

:call setreg('a', 'yypma:.s/^.*$/□\0□/|norm!``@b@1@2@3@4@5@6@7@8@c', 'l')

We can produce the next two generations of a given starting generation by calling a two times:

:norm 2@a

If we start with a bigger starting generation (fifty “off” cells on either side of a single “on” cell, in this case), we could go even further!

Fifty generations of Rule 30 generated entirely within Vim.

Final Thoughts

And with that, we can finally answer the question that has plagued us for so long. You can implement a Wolfram-style cellular automata, like Rule-30, entirely within Vim using only “normal” features like macros and substitutions. Rest easy, my tired brain.

Altogether our script looks like this, and produces fifty generations of a given seed:

:call setreg('a', 'yypma:.s/^.*$/□\0□/|norm!``@b@1@2@3@4@5@6@7@8@c', 'l')
:call setreg('b', ':.s/^\(..*\)\(\(..\).\)$/\1\3\r\2/|norm!``@b', 'l')
:call setreg('1', ':%s/^■■■$/□/g', 'l')
:call setreg('2', ':%s/^■■□$/□/g', 'l')
:call setreg('3', ':%s/^■□■$/□/g', 'l')
:call setreg('4', ':%s/^■□□$/■/g', 'l')
:call setreg('5', ':%s/^□■■$/■/g', 'l')
:call setreg('6', ':%s/^□■□$/■/g', 'l')
:call setreg('7', ':%s/^□□■$/■/g', 'l')
:call setreg('8', ':%s/^□□□$/□/g', 'l')
:call setreg('c', '''aV}gJ', 'l')
:norm 50@a

I’m no Vim expert by any stretch of the imagination, so if you think you can improve on this solution, either in terms of terseness or clarity, while staying in the bounds of only using normal Vim features (no Vimscript), please let me know!

As a closing remark, I want to be clear and say that this project served absolutely no purpose, and this is a terribly inefficient way of propagating cellular automata. That said, I learned quite a lot about my text editor of choice, which I consider to be invaluable.

April 02, 2020

Jeff Carpenter (jeffcarp)

Vanguard Funds vs ETFs April 02, 2020 08:23 PM

After researching Vanguard funds vs. ETFs I still haven’t found a good resource that lists in detail the benefits and downsides of each. A Vanguard mutual fund is provided and managed by Vanguard, and you can only buy Vanguard funds through Vanguard directly or over the phone. A Vanguard Exchange Traded Fund is packaged up like a stock, and its shares can be traded on any market with any brokerage account. This is my attempt to compile a comprehensive list of tradeoffs.

April 01, 2020

Mark J. Nelson (mjn)

Opening for a funded Masters student April 01, 2020 12:00 PM

I'm recruiting a funded Masters student to study how AI bots play games. The goal is to systematically understand the kinds of difficulty posed by games to AI algorithms, as well as the robustness of any conclusions. Some example experiments include: looking at how performance scales with parameters such as CPU time and problem size; how sensitive results are to rule variations, choice of algorithm parameters, etc.; and identification of games that maximally differentiate algorithm performance. Two previous papers of mine that give some flavor of this kind of research: [1], [2].

The primary desired skill is ability to run computational simulations, and to collect and analyze data from them. The available funding would pay for four semesters of full-ride Masters tuition, plus 15-20 hours/week of a work-study job during the academic year. The American University Game Lab offers three Masters-level degrees: the MS in Computer Science's Game & Computational Media track, the MA in Game Design, and the MFA in Games and Interactive Media.

The successful applicant would be funded on the National Science Foundation grant Characterizing Algorithm-Relative Difficulty of Agent Benchmarks. This does not have any citizenship/nationality requirements.

Anyone interested should both apply for the desired Masters program through the official application linked above (deadline July 1, though earlier is better), and email me to indicate that they would like to be considered for this scholarship. It's also fine to email me with inquiries before applying.

Robin Schroer (sulami)

Restarts in Common Lisp April 01, 2020 12:00 AM

Errata: An earlier version of this post was misrepresenting conditions as exceptions, which has been addressed.

I have been reading Practical Common Lisp by Peter Seibel over the weekend, which is an excellent introduction to Common Lisp, showcasing its power by writing real programs. If you are interested in Lisp or programming languages at all, I recommend at least skimming it, it is free to read online.

Writing a Lisp-descended language professionally, and also living inside Emacs, I had dabbled in Common Lisp before, but I still found something I was not aware of, restarts. I do not think that this is a particularly well known feature outside the Lisp world, so I would like to spread awareness, as I think it is a particularly interesting take on error handling.

The book explains restarts using a mocked parser, which I will slightly modify for my example. Imagine you are writing an interpreter/compiler for a language. On the lowest level you are parsing lines to some internal representation:

(define-condition invalid-line-error (error)
  ((line :initarg :line :reader line)))

(defun parse-line (line)
  (if (valid-line-p line)
      (to-ir line)
    (error 'invalid-line-error :line line)))

We define a condition, which is similar to an exception object with metadata in other languages. (A “condition” in Common Lisp, as has been explained to me by Michał “phoe” Herda, is a way of signalling arbitrary events up the stack to allow running of additional code, not just signalling errors. They’re comparable to hooks in Emacs, but dynamically scoped to the current call stack.)

We also define a function which attempts to parse a single line. (This is assuming, of course, that a line always represents a complete parsable entity, but this is only an example after all.)

If it turns out that the line is invalid, it signals a condition up the stack. We attach the line encountered, in case we want to use it for error reporting.

Now imagine your parser is used in two situations: there is a compiler, and a REPL. For the compiler, you would like to abort at the first invalid line you encounter, which is what we are currently set up to do. But for the REPL, you would like to ignore the line and just continue with the next one. (I’m not saying that is necessarily a good idea, but it is something some REPLs do, for example some Clojure REPLs.)

To ignore a line, we would have to either handle it at a low level, returning nil instead of signalling and filtering out the nil values up the stack, or handle the condition, which will not help us much, because by that point we have already lost our position in the file. Or have we?

The next layer up is parsing a collection of lines:

(defun parse-lines (lines)
  (loop for line in lines
        for entry = (restart-case
                     (parse-line line)
                     (skip-line () nil))
        when entry collect it))

This is where the magic begins. The loop construct loops over the lines, applies parse-line to every element of the list, and returns a list of all results which are not nil. The feature I am showcasing in this post is restart-case. Think of it this way: it does not handle a condition, but when the stack starts unwinding (technically not unwinding yet, at least not in Common Lisp) because we signalled a condition in parse-line, it registers a possible restart position. If the condition is handled at some point (if it isn't caught, you will get dropped into the debugger, which also gives you the option to restart), the signal handler can choose to restart at any restart point that has been registered down the stack.

Now let us have a look at the callers:

(defun parse-compile (lines)
  (handler-case
      (parse-lines lines)
    (invalid-line-error (e)
      (print-error e))))

(defun parse-repl (lines)
  (handler-bind ((invalid-line-error
                  #'(lambda (e)
                      (invoke-restart 'skip-line))))
    (parse-lines lines)))

There is a lot to unpack here. The compiler code is using handler-case, which is comparable to catch in other languages. It unwinds the stack to the current point and runs the signal handling code, in this case print-error.

Because we do not actually want to unwind the stack all the way, but resume execution inside the loop in parse-lines, we use a different construct, handler-bind, which automatically handles invalid-line-error and invokes the skip-line restart. If you scroll up to parse-lines now, you will see that the restart clause says, if we restart here, just return nil, and nil will be filtered on the very next line by when entry.

The elegance here is the split between signal handling code and decisions about which signal handling approach to take. You can register many different restart-case statements throughout the stack, and let the caller decide whether some signals are okay to ignore, without the caller needing intricate knowledge of the lower-level code. (It does need to know about the registered restart-case statements, though, at least by name.)

If you want to learn more about this, have a look at the book; it goes into much more detail than I did here.

March 29, 2020

Derek Jones (derek-jones)

Influential programming languages: some of the considerations March 29, 2020 10:38 PM

Which programming languages have been the most influential?

Let’s define an influential language as one that has had an impact on lots of developers. What impact might a programming language have on developers?

To have an impact a language needs to be used by lots of people, or at least have a big impact on a language that is used by lots of people.

Figuring out the possible impacts a language might have had is very difficult, requiring knowledge of different application domains, software history, and implementation techniques. The following discussion of specific languages illustrates some of the issues.

Simula is an example of a language used by a handful of people, but a few of the people under its influence went on to create Smalltalk and C++. Some people reacted against the complexity of Algol 68, creating much simpler languages (e.g., Pascal), while others thought some of its feature were neat and reused them (e.g., Bourne shell).

Cobol has been very influential, at least within business computing (those who have not worked in business computing complain about constructs handling uses it was not really designed for, rather than appreciating its strengths in doing what it was designed to do, e.g., reading/writing and converting a wide range of different numeric data formats). RPG may have been even more influential in this domain (all businesses have specific requirements for formatting reports).

I suspect that most people could not point to the major influence C has had on almost every language since. No, not the use of { and }: if a single character is going to be used as a compound-statement bracketing token, this pair is the only available choice. Almost every language now essentially uses C’s operator precedence (rather than Fortran‘s, which is slightly different; R follows Fortran).

Algol 60 has been very influential: until C came along it was the base template for many languages.

Fortran is still widely used in scientific and engineering software. Its impact on other languages may be unknown to those involved. The intricacies of floating-point arithmetic are difficult to get right, and much of that work was hammered out in the Fortran standards committees, WG5 (the ISO language committee; the original work was done by the ANSI committee, J3). Fortran code is often computationally intensive, and many optimization techniques started out optimizing Fortran (see “Optimizing Compilers for Modern Architectures” by Allen and Kennedy).

BASIC showed how it was possible to create a usable interactive language system. Its many, and varied, implementations were successful because they did not take up much storage and were immediately usable.

Forth has been influential in the embedded systems domain, and people also fall in love with threaded code as an implementation technique (see “Threaded Interpretive Languages” by Loeliger).

During the mid-1990s the growth of the Internet enabled a few new languages to become widely used, e.g., PHP and Javascript. It’s difficult to say whether these were more influenced by what their creators ate the night before or earlier languages. PHP and Javascript are widely used, and they have influenced the creation of many languages designed to fix their myriad of issues.

March 28, 2020

Patrick Louis (venam)

Software Distributions And Their Roles Today March 28, 2020 10:00 PM


NB: This is a repost on this blog of a post made on

What is a distribution

What are software distributions? You may think you know everything there is to know about the term, but take a moment to think about it; take a step back and try to see the big picture.

We often have in mind the thousands of Linux distributions when we hear it; however, the concept is far from limited to Linux: BSD, the Berkeley Software Distribution, has software distribution right in the name. Android and iOS are software distributions too.

Actually, it’s so prevalent that we may have stopped paying attention to the concept, and we find it hard to put a definition together.
There’s definitely the part about distributing software in it, software that may be commercial or not, open source or not.
To understand it better, it may help to investigate what problems software distributions address.

Let’s imagine a world before software distributions. Does that world exist? A world where software stays within boundaries, not shared with anyone outside.
Once we break these boundaries and want to share software, we find that we have to package it all together meaningfully, configure the pieces so that they work well together, add some glue in between when necessary, find the appropriate medium to distribute the bundle, get it all from one end to the other safely, make sure it installs properly, and follow up on it.

Thus, software distribution is about the mechanism and the community that takes the burden and decisions to build an assemblage of coherent software that can be shipped.

The operating system, or kernel if you like, can be, and often is, part of the collage offered: a piece of software just like the others.

The people behind it are called distribution maintainers, or package maintainers. Their roles vary widely: they could write the software that stores all the packages, called the repository; maintain a package manager with its format; maintain a full operating system installer; package and upload software they built, or that someone else built, on a specific time frame/life cycle; make sure no malicious code gets uploaded to the repository; follow up on the latest security issues and bug reports; fix third-party software to fit the distribution’s philosophical choices and configurations; and most importantly test, plan, and make sure everything holds together.
These maintainers are the source of trust of the distribution; they take responsibility for it. In fact, I think it’s more accurate to call them distributors.

Different ways to approach it

There are so many distributions it can make your head spin. The software world is booming, especially the open source one. For instance, we find bifurcations of distributions that get copied by new maintainers and diverge. This creates a tree-like aspect, a genealogy of common ancestors and/or influences in technical and philosophical choices.
Overall, we now have a vibrant ecosystem where a thing learned on a branch can help a completely unrelated leaf on another tree. There’s something for everyone.

Target and speciality

So what could be so different between all those software distributions? Why not have a single platform that everyone can build on?

One thing is specialization and differentiation. Each distro caters to a different audience and is built by a community with its philosophy.

Let’s go over some of them:

  • A distribution can support specific sets and combinations of hardware: from CPU ISAs to peripheral drivers
  • A distribution may be specifically optimized for a type of environment: be it desktop, portable mobile device, servers, warehouse-size computers, embedded devices, virtualised environments, etc.
  • A distribution can be commercially backed or not
  • A distribution can be designed for different levels of knowledge in a domain, professional or not. For instance: security research, scientific computing, music production, multimedia boxes, HUDs in cars, mobile device interfaces, etc.
  • A distribution might be certified to follow certain standards that need to be adhered to in professional settings, for example security standards and hardening
  • A distribution may have a single purpose on a commodity machine, providing specific functionality such as a firewall, a computer cluster, a router, etc.

That all comes down to the raison d’être, the philosophy of the distribution; it guides every decision the maintainers have to make. It guides how they configure every piece of software, and how they think about security, portability, and comprehensiveness.

For example, if a distribution cares about free software, it’s going to be strict about what software it includes and what licenses it allows in its repository, having tooling in the core to check the consistency of licenses.
Another example: if the goal is to target a desktop audience, then internationalization, ease of use, user-friendliness, and numerous packages are going to be prioritized. If the target is instead a real-time embedded device, the kernel is going to be small, configured and optimized for this purpose, with the packages limited to those that are appropriate and work in this environment. Or if it’s targeted at advanced users who love having control of their machines, the maintainers will choose to let the users make most of the decisions, providing as many packages as possible, at the latest versions possible, with a loose way to install the distribution and plenty of libraries and software development tools.

What this means is that a distribution does anything it can to provide sane defaults that fit its mindset. It composes and configures a layer of components, a stack of software.

The layering

Distribution maintainers often have at their disposal different building blocks and the ability to choose among them, stacking them to create the unit we call a software distribution. There’s a range of approaches to this: they could choose to include more, or less, in what they consider the core of the distribution and what is externally less important to it.
Moreover, sometimes they might even leave the core very small and loose, instead providing the glue software that makes it easy for users to choose and swap the blocks at specific stages: installation, run time, maintenance mode, etc.

So what are those blocks of interdependent components?

The first part is the method of installation, this is what everything hinges on, the starting point.

The second part is the kernel, the real core of all operating systems today. But that doesn’t mean the distribution has to enforce one. Some distributions go as far as providing multiple kernels specialised in different things, or none at all.

The third part is the filesystem and file hierarchy, the component that manages where and how files are spread out on the physical or virtual hardware. This could be a mix and match where sections of the file system tree are stored on separate filesystems.

The fourth part is the init system, PID 1. This choice has generated a lot of contention these days. PID 1 being the mother process of all other processes on the system. What role it has and what functionalities it should include is a subject of debate.

The fifth part is composed of the shell utilities, what we sometimes refer to as the userland or user space, as it’s the first layer the user can directly interface with to control the operating system, the place where processes run. Userland implementations on Unix-based systems usually try to follow the POSIX standard. There are many such implementations, also a subject of contention.

The sixth part is made up of services and their management: the daemons, long-running processes that keep the system in order. Many argue over whether the management functionality should be part of the init system or not.

The seventh part is documentation. Often it is forgotten but it is still very important.

The last part is about everything else, all the user interfaces and utilities a user can have and ways to manage them on the system.

Stable releases vs Rolling

There exists a spectrum on which distributions place themselves when it comes to keeping up to date with the versions of the software they provide. This most often applies to external third-party open source software.
The spectrum is the following: do we let users always have the latest version of every piece of software, running the risk of accidentally breaking their systems (what we call a bleeding edge or rolling distro), or do we take a more conservative approach and take the time to test every piece of software properly before allowing it into the repository, forgoing the latest updates, features, and optimizations (what we call a release-based distro)?

The extreme of the first scenario would be to let users download directly from the software vendor/creator’s source code repository, or the opposite, to let the software vendor/creator push directly to the distribution repository. Either could easily break or conflict with the user’s system, or lead to security vulnerabilities. We’ll come back to this later, as it could be avoided if the software runs in a containerized environment.

When it comes to release distributions, there is usually a long-term-support stable version that keeps receiving and syncing with the necessary security updates and bug fixes in the long run, alongside another version running a bit ahead, testing the future changes. At specific points in time, users can jump to the latest release of the distribution, which may involve a lot of changes in both configuration and software.
Some distributions decide they may break the ABI or API of the kernel upon major releases, which means that everything in the system needs to be rebuilt and reinstalled.

The release cycle, and the rate of updates is really a spectrum.

When it comes to updates, in both cases the distribution maintainers have to decide how to communicate and handle them: how to let users know what changed, and whether a user’s configuration was swapped for a new one, merged with the new one, or copied aside.
Communication is essential, be it through official channels, logging, mails, etc. Communication needs to be bi-directional: users report bugs, and maintainers post their decisions and whether users need to be involved in them. This is what creates the community around the distribution.

Rolling releases require intensive effort from package maintainers, as they constantly have to keep up with software developers, especially when it comes to the thousands of new libraries that come with recent programming languages and keep on increasing.

Various users want different things out of a system. Enterprise environments and mission-critical tasks will prefer stable releases, while software developers or normal end users may prefer the ability to use the latest software.

Interdistribution standard

With all this, can’t there be an interdistribution standard that creates order, and would we even want such a standard?

At the user level, the differences are not always noticeable; most of the time everything seems to work the way Unix systems are expected to work.
There’s no real standard between distributions other than that they more or less follow the POSIX standards.

Within the Linux ecosystem, the Free Standards Group tries to improve the interoperability of software by fixing a common Linux ABI, file system hierarchy, naming conventions, and more. But that’s just the tip of the iceberg when it comes to having something that works across distributions.

Furthermore, each part of the layering we’ve seen before could be said to have its own standards: There are desktop interoperability standards, filesystem standards, networking standards, security standards, etc.

The biggest player right now when it comes to this is systemd, in association with the freedesktop group; it tries to create (force) an interdistribution standard for Linux distributions.

But again, the big question: do we actually want such inter-distribution standards? Can’t we be happy with the mix and match we currently have? Would we profit from such a thing?

The package manager and packaging

Let’s now pay attention to the packages themselves: how we store them, how we give secure access to them, how we are able to search amongst them, download them, install them, and remove them, and anything related to their local management, versioning, and configuration.

Method of distribution

How do we distribute and share software, and what’s the front-end to this process?

First, where do we store this software.

Historically, and still today, software can be shared via physical media such as CD-ROMs, DVDs, USBs, etc. It is common for proprietary vendors to have the distribution come with a piece of hardware they are selling, and physical media are also common for procuring the initial installation image.
However, with today’s hectic software growth, a physical medium isn’t flexible. Sharing over the internet is more convenient, be it via FTP, HTTP, HTTPS, a publicly available svn or git repo, via central website hubs such as GitHub, or application stores such as the ones Apple and Google provide.

A requirement is that the storage, and the communication to it, should be secure, reliable against failures, and accessible from anywhere. Thus, replication is often done to avoid failures, but also to get a sort of edge-network speedup across the world: load balancing. Replication could be done in multiple ways; it could be a P2P distributed system, for instance.

How we store it and in what format is up to the repository maintainers. Usually, this is a file system with a software API users can interact with over the wire. Two main format strategies exist: source based repositories and binary repositories.

Second of all, who can upload to and manage the host of packages? Who has the right to replicate the repository?

As a source of truth for the users, it is important to make sure the packages have been verified and secured before being accepted on the repository.

Many distributions have the maintainers be the only ones able to do this, giving them cryptographic keys to sign packages and validate them.

Others have their own users build the packages, send them to a central hub for automatic or manual verification, and then upload them to the repository, each user having their own cryptographic key for signature verification.

This comes down to an issue of trust and stability. Having the users upload packages isn’t always feasible when using binary packages if the individual packages are not containerized properly.

There’s a third option, the road in between, having the two types, the core managed by the official distribution maintainers and the rest by its user community.

Finally, the packages reach the user.

How the user interacts with the repository, locally and remotely, depends on the package management choices. Do users cache a version of the remote repository, as is common with the BSD port tree system?
How flexible can it be in tracking updates, locking versions of software, and allowing downgrades? Can users download from different sources? Can users have multiple versions of the same software on their machine?


As we’ve said, there are two main philosophies of software sharing format: source code, port-style, and pre-built binary packages.

The software that manages those on the user side is called the package manager; it’s the link with the repository. (In a source-based repo I’m not sure we can call it that, but regardless I’ll still refer to it as such.)
Many distributions create their own or reuse a popular one. It does the searching, downloading, installing, updating, and removal of local software. It’s not a small task.

The rule of the book is that if software isn’t installed by the package manager, then the package manager won’t be aware of its existence. Note that distributions aren’t limited to a single package manager; there can be many.

Each package manager relies on a specific format and metadata to be able to manage software, be it source or binary formatted. This format can be composed of a group of files or a single binary file with specific information segments that together create recipes that help throughout its lifecycle. Some are easier to put together than others, incidentally allowing more user contributions.

Here’s a list of common information that the package manager needs:

  • The package name
  • The version
  • The description
  • The dependencies on other packages, along with their versions
  • The directory layout that needs to be created for the package
  • Along with the configuration files that it needs and if they should be overwritten or not
  • An integrity check on all files, such as a SHA256 checksum
  • Authenticity, to know that it comes from the trusted source, such as cryptographic signatures checked against a trusted store on the user’s machine
  • Whether this is a group of packages (a meta package) or a direct one
  • The actions to take on certain events: pre-installation, post-installation, pre-removal, and post removal
  • Any specific configuration flags or parameters to pass to the package manager upon installation
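
As a sketch, such a recipe and its integrity check might look something like this; the format, field names, and package are all invented for illustration and do not correspond to any real package manager:

```python
import hashlib

# Hypothetical package recipe, loosely following the fields listed above.
metadata = {
    "name": "example-tool",
    "version": "1.2.0",
    "description": "A made-up package for illustration",
    "depends": {"libfoo": ">=2.0", "libbar": ">=1.1"},
    # path inside the package -> expected SHA256 of the file's contents
    "files": {},
}

def record(path: str, contents: bytes) -> None:
    """Store the integrity checksum for a packaged file."""
    metadata["files"][path] = hashlib.sha256(contents).hexdigest()

def verify(path: str, contents: bytes) -> bool:
    """Re-hash a downloaded file and compare against the recipe."""
    return hashlib.sha256(contents).hexdigest() == metadata["files"][path]

payload = b"pretend this is the packaged binary"
record("usr/bin/example-tool", payload)
print(verify("usr/bin/example-tool", payload))         # True
print(verify("usr/bin/example-tool", payload + b"!"))  # False: tampered
```

Authenticity would be layered on top of this, with the checksums themselves signed by a key the user already trusts.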

So what’s the advantage of having pre-compiled binary packages instead of cloning the source code and compiling it ourselves? Wouldn’t the latter remove a burden from package maintainers?

One advantage is that pre-compiled packages are convenient: it’s easy to download them and run them instantly. It’s also hard, if not impossible, and energy intensive these days to compile huge software such as web browsers.
Another point is that proprietary software is often already distributed as binary packages, which would otherwise create a mix of source and binary packages.

Binary formats are also space efficient as the code is stored in a compressed archived format. For example: APK, Deb, Nix, ORB, PKG, RPM, Snap, pkg.tar.gz/xz, etc.
Some package managers may also choose to leave the choice of compression up to the user and dynamically discern from its configuration file how to decompress packages.

Let’s add that there exist tools, such as “Alien”, that facilitate the job of package maintainers by converting from one binary package format to another.

Conflict resolution & Dependencies management

Resolving dependencies

One of the hardest jobs of the package manager is resolving dependencies.

A package manager has to keep a list of all the packages, with their versions, that are currently installed on the system, along with their dependencies.
When the user wants to install a package, it takes as input the list of that package’s dependencies, compares it against the list it already has, and outputs a list of what needs to be installed, in an order that satisfies all dependencies.

This is a problem commonly encountered in the software development world with build automation utilities such as make. The tool creates a directed acyclic graph (DAG) and, using the power of graph theory and the acyclic dependency principle (ADP), tries to find the right order. If no solution is found, or if there are conflicts or cycles in the graph, the action should be aborted.
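
A minimal sketch of that idea, over a made-up dependency graph (no real package metadata is consulted): a depth-first topological sort that emits dependencies before their dependents and aborts on cycles.

```python
# Toy dependency resolver: topological sort with cycle detection.
def install_order(target, deps):
    order, done, in_progress = [], set(), set()

    def visit(pkg):
        if pkg in done:
            return
        if pkg in in_progress:
            # A cycle means the graph is not a DAG: abort.
            raise RuntimeError(f"dependency cycle involving {pkg}")
        in_progress.add(pkg)
        for dep in deps.get(pkg, []):
            visit(dep)
        in_progress.discard(pkg)
        done.add(pkg)
        order.append(pkg)  # dependencies first, then the package itself

    visit(target)
    return order

# Hypothetical packages: installing "app" must pull in its dependencies first.
deps = {"app": ["libui", "libnet"], "libui": ["libc"], "libnet": ["libc"]}
print(install_order("app", deps))  # ['libc', 'libui', 'libnet', 'app']
```

Real resolvers additionally juggle version constraints and alternatives, which can turn this into a much harder (even NP-hard) search problem.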

The same applies in reverse, upon removal of a package. We have to make a decision: do we remove all the other packages that were installed as dependencies of that single one? What if newer packages depend on those dependencies? Should we only allow the removal of unused dependencies?

This is a hard problem, indeed.


This problem gets harder when we add versioning to the mix, if we allow multiple versions of the same software to be installed on the system.

And if we don’t, but allow switching from one version to another, do we also switch all the other packages that depend on it?

Versioning applies everywhere, not only to packages but to release versions of the distribution too. A lot of distributions attach certain versions of packages to specific releases, and consequently releases may have different repositories.

The choice of naming conventions also plays a role; it should convey to users what packages are about and whether any changes happened.

Should the package maintainer follow the naming convention of the software developer, or use their own? What if the names of two pieces of software conflict with one another? That makes it impossible to have both in the repo unless some extra information is added.

Do we rely on semantic versioning (major, minor, patch), on names like so many distribution releases do (Toy Story characters, deserts, etc.), on the date of release, or maybe simply on an incremental number?
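
As a tiny illustration of why the choice matters, semantic versions have to be compared numerically rather than as strings; the version numbers below are made up:

```python
# Compare major.minor.patch versions numerically: "1.10.0" must sort
# after "1.9.2", which plain string comparison would get wrong.
def semver_key(v):
    return tuple(int(part) for part in v.split("."))

releases = ["1.9.2", "1.10.0", "2.0.0", "1.2.10"]
print(sorted(releases, key=semver_key))
# ['1.2.10', '1.9.2', '1.10.0', '2.0.0']
```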

All those convey meaning to the user when they search and update packages from the repository.

Static vs dynamic linking

One decision that may not apply to source-based distros is whether to build packages statically or dynamically linked against their libraries.

Dynamic linking is the process in which a program chooses not to include a library it depends upon in its executable, but only a reference to it, which is then resolved at run time by a dynamic linker that loads the shared object into memory upon usage. Static linking, on the opposite, means storing the libraries right inside the compiled executable program.

Dynamic linking is useful when a good deal of software relies on the same library: only a single instance of the library has to be in memory at a time. Executable sizes are also smaller, and when the library is updated all programs relying on it get the benefit (as long as the interfaces stay the same).

So what does this have to do with distributions and package management?

Package managers in dynamic linking environment have to take care of the versions of the libraries that are installed and which packages depend on them. This can create issues if different packages rely on different versions.
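
A toy sketch of the bookkeeping involved, with invented package names and bare major versions standing in for real soname and version constraints:

```python
# Invented packages declaring which major version of a shared library
# they need; a real package manager tracks sonames and richer constraints.
required = {
    "browser": {"libssl": "3"},
    "legacy-tool": {"libssl": "1"},
    "editor": {"libz": "1"},
}

def find_conflicts(required):
    wanted = {}     # library -> (major version, first package wanting it)
    conflicts = []
    for pkg, libs in required.items():
        for lib, major in libs.items():
            if lib in wanted and wanted[lib][0] != major:
                conflicts.append((lib, wanted[lib][1], pkg))
            else:
                wanted.setdefault(lib, (major, pkg))
    return conflicts

# browser and legacy-tool disagree on libssl; editor is fine.
print(find_conflicts(required))  # [('libssl', 'browser', 'legacy-tool')]
```

With dynamic linking, one of the conflicting packages cannot be satisfied unless the distro ships both library versions side by side; with static linking, each package would simply embed the version it was built against.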

For this reason, some distro communities have chosen to get rid of dynamic linking altogether and rely on static linking, at least for things that are not related to the core system.

Another incidental advantage of static linking is that the dynamic linker doesn’t have to resolve dependencies at load time, which gives a small boost in startup speed.

So static builds simplify the package management process: there’s no need for a complex DAG because everything is self-contained. Additionally, this allows multiple versions of the same software to be installed alongside one another without conflicts. Updates and rollbacks are not messy with static linking.

This gives rise to more containerized software, and continuing down this path leads to marketplace platforms such as Android and iOS, where distribution can be done by the individual software developers themselves, skipping the middleman altogether and giving increasingly impatient users the ability to always have the latest version that works for their current OS. Everything is self-packaged.
However, this relies heavily on trust in the repository/marketplace. There need to be many security mechanisms in place so that rogue software cannot be uploaded. We’ll talk more about this when we come back to containers.

This is great for users and, from a certain perspective, software developers too as they can directly distribute pre-built packages, especially when there’s a stable ABI for the base system.

All this breaks the classic distribution scheme we’re accustomed to on the desktop.

Is it all roses and butterflies, though?

As we’ve said, packages take much more space with static linking, thus wasting resources (storage, memory, power).
Moreover, because it’s a model where software developers push directly to users, it removes the filtering that distribution maintainers apply to the distro, and it encourages license uncertainties. There’s no longer an overall philosophy surrounding the distribution.
There’s also the issue of library updates: the weight is on the software developers to make sure there are no vulnerabilities or bugs in their code. And it adds a veil over which software uses what; all we see is the end products.

From the perspective of a software developer using this type of distribution, it adds the extra steps of downloading the source code of each library their software depends on and building each one individually, turning the system into a source-based distro.


Because package management has been getting messier over the past few years, a new trend has emerged to put back a sense of order into all this: reproducibility.

It has been inspired by the world of functional programming and the world of containers. Package managers that respect reproducibility have each of their builds asserted to always produce the same output (functionality-wise; there could be minor differences).
They allow packages of different versions to be installed alongside one another, each living in its own tree, and they allow normal users to install packages that only they can access. Thus, different users can have different packages.

They can be used as universal package managers, installed alongside any other package manager without conflict.

The most prominent examples are Nix and Guix, which use a purely functional deployment model where software is installed into unique directories generated through cryptographic hashes. The dependencies of each package are included within each hash, solving the problem of dependency hell. This approach to package management promises more reliable, reproducible, and portable packages.
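
The gist of that model can be sketched in a few lines; the /store layout and hash truncation below are illustrative, not Nix’s actual scheme:

```python
import hashlib

# A package's store path is derived from a hash over everything that went
# into building it: name, version, and the store paths of its dependencies.
# Different versions therefore get different paths and never collide.
def store_path(name, version, dep_paths):
    recipe = f"{name}-{version}:" + ":".join(sorted(dep_paths))
    digest = hashlib.sha256(recipe.encode()).hexdigest()[:12]
    return f"/store/{digest}-{name}-{version}"

libc = store_path("libc", "2.31", [])
old = store_path("libfoo", "1.0", [libc])
new = store_path("libfoo", "2.0", [libc])
print(old)
print(new)
assert old != new  # both versions can be installed side by side
```

Because the path is a pure function of the inputs, rebuilding the same recipe lands in the same place, which is what makes rollbacks and per-user package sets cheap.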

Stateless and verifiable systems

The discussion about trust, portability, and reproducibility can also be applied to the whole system itself.

When we talked about repositories as marketplaces, where software developers push directly to it and the users have instant access to the latest version, we said it was mandatory to have additional measures for security.

One of them is to containerize, to sandbox every piece of software, having each run in its own space without affecting the rest of the system’s resources. This removes the heavy burden of auditing and verifying each and every piece of software. Many solutions exist to achieve this sandboxing: Docker, chroot, jails, Firejail, SELinux, cgroups, etc.

We could also separate the users’ home directories, making them self-contained, never installing into or modifying the globally accessible places.

This would let us keep the core of the system verifiable, as it is never changed and stays pristine. Making sure it’s secure would be much easier.

The idea of having the user part of the distro be atomic, movable, and containerized, and the rest reproducible, is game-changing. But again, do we want to move to a world where every distro is interchangeable?

Do Distros matter with containers, virtualisation, and specific and universal package managers

It remains to be asked if distributions still have a role today with all the containers, virtualisation, and specific and universal package managers.

When it comes to containers, distributions are still very important, as they most often form the base of the stack that the other components build upon.

The distribution is made up of people who work together to build and distribute the software and make sure it works well. That isn’t the role of the person managing the container, and it is much more convenient for them to rely on a distribution.

Another point is that containers hide vulnerabilities: they aren’t checked after they are put together, while distribution maintainers, on the other hand, have the role of communicating and following up on security vulnerabilities and other bugs. Community is what solves daunting problems that everyone shares.
A system administrator building containers can’t possibly have the knowledge to manage and build hundreds of pieces of software and libraries and ensure they work well together.

If packages are self-contained

Do distributions matter if packages are self-contained?

To an extent they do, as in this ecosystem they could be the providers/distributors of such universal self-contained packages. And, as we’ve said, it is important to keep the philosophy of the distro and offer a tested toolbox that fits the use case.

What’s more probable is that we’ll move to a world with multiple package managers, each trusted for its specific space and purpose. Each with a different source of philosophical and technical truth.

Programming language package management specific

This phenomenon is already exploding in the world of programming language package management.

The speed and granularity at which software is built today is almost impossible to follow using the old method of packaging. The old software release life cycle has been thrown out the window. Thus, language-specific tools were developed, not limited to installing libraries but also software. We can now refer to the distribution offered package manager as system-level and others as application-level or specific package managers.

Consequently, the complexity and conflicts within a system have exploded, and distribution package managers are finding it pointless to manage and maintain anything that can already be installed via those tools. Vice versa, the makers of the specific tools are not interested in having what they provide included in distribution system-level package managers either.

Package managers that respect reproducibility, such as Nix, which we’ve mentioned, handle such cases more cleanly as they respect the idea of locality: everything resides within a directory tree that isn’t maintained by the system-level package manager.

Again, same conclusion here, we’re stuck with multiple package managers that have different roles.

Going distro-less

A popular topic in the container world is “distro-less”.

It’s about replacing everything provided by a distribution, removing its customization, or building an image from scratch, perhaps relying on universal package managers or on none at all.

The advantage of such containers is that they are tiny and targeted at a single purpose. This lets the sysadmin have full control over what happens on that box.

However, remember that there’s a huge cost to controlling everything, just as we mentioned earlier. This shifts the burden onto the sysadmin, who must manage and keep up with bugs and security updates instead of the distribution maintainers.


With everything we’ve presented about distributions, I hope we now have a clearer picture of what they are providing and their place in our current times.

What’s your opinion on this topic? Do you like the diversity? Which stack would you use to build a distribution? What’s your take on static builds, having users upload their own software to the repo? Do you have a solution to the trust issue? How do you see this evolve?

More discussion here:

EDIT: It has come to my attention that I’ve conflated the meaning of “reproducible” as in reproducing bit-for-bit identical software with “reproducible” as in recreating the functionality of an operating system. In this article I’ve taken it to mean anything whose functionality we are sure we could recreate without breaking it, and used “verifiable” for anything that is bit-for-bit identical. Guix and NixOS currently accomplish the functionality part.


Gonçalo Valério (dethos)

CSP headers using Cloudflare Workers March 28, 2020 12:35 PM

Last January I made a small post about setting up a “Content-Security-Policy” header for this blog. On that post I described the steps I took to reach a final result, that I thought was good enough given the “threats” this website faces.

This process usually isn’t hard if you develop the website’s software and have a high level of control over the development decisions; the end result ends up being a simple yet very strict policy. However, if you do not have that degree of control over the code (and do not want to break the functionality), the policy can end up more complex and lax than you were initially hoping for. That’s what happened in my case, since I currently use a standard installation of WordPress for the blog.

The end result was a different security policy for different routes and sections (this part was not included on the blog post), that made the web-server configuration quite messy.

(With this intro, you might have already noticed that I’m just making excuses to redo the initial and working implementation, in order to test some sort of new technology)

Given the blog is behind the Cloudflare CDN and they introduced their “serverless” product called “Workers” a while ago, I decided that I could try to manage the policy dynamically on their servers.

Browser <--> CF Edge Server <--> Web Server <--> App

The above line describes the current setup, so instead of adding the CSP header on the “App” or the “Web Server” stages, the header is now added to the response on the last stage before reaching the browser. Let me describe how I’ve done it.

Cloudflare Workers

First, a very small introduction to Workers; later you can find more detailed information on

So, first Cloudflare added the v8 engine to all edge servers that route the traffic of their clients, then more recently they started letting these users write small programs that can run on those servers inside the v8 sandbox.

The programs are built very similarly to how you would build a service worker (they use the same API), the main difference being where the code runs (browser vs edge server).

These “serverless” scripts can then be called directly through a specific endpoint provided by Cloudflare. In this case they should create and return a response to the requests.

Or you can instruct Cloudflare to execute them on specific routes of your website. This means the worker can generate the response, execute actions before the request reaches your website, or change the response that is returned.

This service is charged based on the number of requests handled by the “workers”.

The implementation

Going back to the original problem, and based on the above description, we can dynamically introduce or modify the “Content-Security-Policy” header for each request that goes through the worker, which gives us a high degree of flexibility.

So, for my case, a simple script like the one below did the job just fine.

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

/**
 * Forward the request and swap the response's CSP header
 * @param {Request} request
 */
async function handleRequest(request) {
  let policy = "<your-custom-policy-here>"
  let originalResponse = await fetch(request)
  let response = new Response(originalResponse.body, originalResponse)
  response.headers.set('Content-Security-Policy', policy)
  return response
}
The script listens for the fetch event and hands the request to a handler function, which forwards it to the origin server, grabs the response, replaces the CSP header with the defined policy, and returns the modified response.

If I needed something more complex, like making slight changes to the policy depending on the User-Agent to make sure different browsers behave as expected given the different implementations or compatibility issues, it would also be easy. This is something that would be harder to achieve in the config file of a regular web server (nginx, apache, etc).
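As a hypothetical sketch of that idea (the `choosePolicy` helper and the policy values below are my own illustration, not the blog’s real policy), the selection could be factored into a small function keyed on the User-Agent:

```javascript
// Hypothetical helper: pick a policy variant per browser family.
// The directives here are illustrative placeholders.
function choosePolicy(userAgent) {
  const basePolicy = "default-src 'self'"
  // Chrome's User-Agent string also contains "Safari", so exclude it
  // explicitly when targeting Safari-only quirks.
  if (userAgent.includes('Safari') && !userAgent.includes('Chrome')) {
    // Hand Safari a variant without directives it handles poorly.
    return basePolicy
  }
  return basePolicy + "; object-src 'none'"
}
```

The worker could then call something like `choosePolicy(request.headers.get('User-Agent'))` before setting the header on the response.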

Enabling the worker

Now that the script is done and the worker deployed, in order to make it run on certain requests to my blog, I just had to go to the Cloudflare dashboard of my domain, click on the “Workers” section, and add the routes where I want it to be executed:

Configuring the routes that will use the worker

The settings displayed in the picture above will run the worker on all requests to this blog, but it can be made more specific, and I can even have multiple workers for different routes.

Some sort of conclusion

Despite the use-case described in this post being very simple, there is potential in this new “serverless” offering from Cloudflare. It definitely helped me solve the problem of having different policies for different sections of the website without much trouble.

In the future I might come back to it, to explore other use-cases or implementation details.

Nikola Plejić (nikola)

Living and Coding In Times of Crises March 28, 2020 10:08 AM

Note: These are a few slightly melodramatic thoughts written in the aftermath of several somewhat traumatic weeks involving a pandemic, an earthquake, and a substantial change in the everyday. The few links that are scattered around I find interesting and important; the rest is here as a permanent reminder to myself. Hope everyone's well. Wash your hands.

I had hoped I would be able to write the word "crisis" in singular in the title of this post, but life tends to be a prescriptivist bastard. Writing this in the middle of a pandemic, after a non-catastrophic, yet devastating earthquake in Zagreb on Sunday, feels surreal. The apartment I just recently moved into roughed up, parts of it possibly significantly damaged, unable to socialize even in the most mundane of ways, exams canceled. Two events tearing into the fabric of what one considers the most social and the most personal, at the same time... and I'm lucky to be healthy, safe, and sound.

It's interesting how disruptive events quickly redefine the "normal". Just a few weeks ago, no one could imagine being quarantined indefinitely, yet here we are in week one of being unable to leave our place of residence. Redefining normal also tends to give way to malice, as we witness more and more calls to increase surveillance and invade people's privacy in order to (at least formally) keep track of the spread and unwanted public gatherings. Once again the political emerges as inseparable from the technological to the detriment of the techno-elite. Go figure.

Science, with all of its problems, gives us hope. Both of these events have been properly assessed: epidemiologists and infectious disease experts have been warning us of the possible consequences of SARS-CoV-2 weeks before the situation got out of control outside of China, and seismologists have been fairly clear that a strong earthquake around Zagreb is imminent. The problem with complex systems is that they're hard to grasp even for experts, and our intuition for statistics is slim to none. Even when we do manage to become aware of some aspects of the risks involved, this tends to fall apart as quickly as it materializes. Coupled with the fear of "collapsing markets" and their existential consequences, this often means we're more comfortable with the status quo than with taking action. Informed apathy can be as dangerous as uninformed sympathy.

As an aside, existential terror is a powerful weapon, and it was interesting to observe its effects on one's thoughts and approaches to life. I am well-aware that there are no reliable methods of predicting earthquakes to any precise extent, yet when on the day of the earthquake someone blurted out a hoax that "there's a stronger one coming at 8:45am", for a few moments it seemed like the most inevitable and obvious thing in the world. I think I've revisited that highly instructive moment more often than the earthquake itself.

Even though it's impossible not to be affected by this, people in IT, and primarily in software, seem to be in a unique position to handle this situation better than the average worker. There will be consequences, but the vast amount of money in tech means there's less chance of being fired, especially in times of enormous reliance on the tech infrastructure. The level of technological literacy gives us an enormous advantage in navigating the inevitable mess during the initial period of adjustment. A lot of us have been able to seamlessly switch to working from home, if we haven't already been doing so.

I believe this gives us a fair deal of responsibility, too: it's increasingly important to use our skills to help the ones in need. Large-scale tragedies take a psychological and organizational toll, and we have tools to help. This means being aware of the fact that the technology we produce is not as obvious as it might seem, and that people often need assistance and support which isn't a sign of weakness or lack of competence, but rather a natural consequence of using a new tool in an unfamiliar setting. It also means using the free time that isolation gives us to help people and organizations by providing them with necessary technological infrastructure and assistance.

More in-depth political analyses have been written by more eloquent and better-read comrades over at Pirate Care, focused on COVID-19 in particular, and by Tomislav, focused on the pandemic in combination with the earthquake. I can just echo that I find it important not to let our imagination atrophy and our empathy to subside, and that I hope the lessons we learn here will be the ones of solidarity and care. ☭

March 27, 2020

Jan van den Berg (j11g)

Dylan Thomas – Sidney Michaels March 27, 2020 09:23 AM

This book is a play from 1965, based on several accounts of the infamous travels Welsh poet Dylan Thomas made in the early 1950s to the US. If you know anything about Dylan Thomas you probably know he died young (39), and that he was an alcoholic.

Dylan Thomas – Sidney Michaels (1965) – 111 pages

This play captures the last two or three years of his life rather vividly. It’s an alcoholic mess, and it details an explosive marriage. His tumultuous life and classic ‘poets-die-young’ death only deepened the existing legend. So much so that a young fellow named Robert Zimmerman based his stage name on the famous poet a few years later. And that’s why I picked up this book, and learned a little bit more.

The post Dylan Thomas – Sidney Michaels appeared first on Jan van den Berg.

Pete Corey (petecorey)

Glorious Voice Leader Reborn March 27, 2020 12:00 AM

I recently finished a major visual overhaul of my current pet project, Glorious Voice Leader. The redesign was primarily intended to give musicians a (hopefully) more natural interface for working with the tool.

Originally, Glorious Voice Leader displayed its chord suggestions as a series of horizontal fretboards stacked vertically:

The old Glorious Voice Leader.

This gave the guitarist lots of control, but it wasn’t a natural way for them to interact with or consume a chord progression.

Instead, I decided to transition to a more familiar chart-based design where every chord in the progression is represented by a small chord chart that should be familiar to any guitarist. The chord currently being added or modified is also displayed on a full fretboard laid out vertically on the side of the page:

The newly redesigned Glorious Voice Leader.

The redesign was largely a cosmetic overhaul. A few new minor features were added in the process, such as progressive chord naming, and better keyboard controls, but the meat of Glorious Voice Leader is all still there. Your old saved URLs will still work!

Check out Glorious Voice Leader now! And while I’ve got you hooked, be sure to read more about the project and its roots in my other projects.

March 24, 2020

Patrick Louis (venam)

The Self, Metaperceptions, and Self-Transformation March 24, 2020 10:00 PM

infinite reflections

How would you describe yourself?
How do you usually talk about yourself?
Do you feel like you are the writer of your own narrative?
Who are you?

We all stand on a balance of being perceived and perceiving, of having a visible and owning an invisible part, and of having control over and being controlled by. It is amongst all this that we can find the nebulous definition of who we are, what Locke calls “the sameness of a rational being”.
In view of this, we are both passengers and conductors of our narrative. So how do we drive this narrative forward? Is it possible to have more agency in it than we currently have? And if we are our narrative, can we, as the narrative, choose another narrative without self-annihilation?
Metacognition can be dizzying.

I’ve previously discussed the topic of what we are and now I’d like to focus on the self, its formation, its transformation, and its actualization.

Who one is, over time, is created by the amalgamation of the historical events, physical aspects, and external and internal reflections, that get incorporated into one’s identity. The self is this element that sits in the middle, taking in and taking out, what makes sense to us and for us.

From an external point of view, we could define ourselves in reaction to the roles we play for others, the way we interact with them, eventually adjusting our selves to the labels we’ve been given or have chosen.
It is helpful to have others act as calibration for our internal system when we have nothing else to base our definition upon, especially when we are starting our self-exploration in our teenage years. We aren’t brains in a vat; a self doesn’t exist without a world. Yet, if we overly emphasize this sort of self-definition, the other becomes our worst nightmare and our only way of finding meaning and salvation. It leads to interpreting the world with a heavy filter, judging it the way we think it judges us: harshly, and frequently inaccurately, because our judgment is shaped by our individual self-concept and personal biases. This is what we call metaperception, the idea we have of how other people view us.

Metaperception can be destructive if not handled properly. For instance, someone who fixates on it may act in a self-centered way, imagining that everyone is watching and evaluating their every move, that they are the center of social interaction. They’ll shut themselves off, limit their spontaneity, and have an increasingly fragile ego. All the while, not considering the unbridgeable gap that exists between selves.
Being overwhelmed by the other makes it difficult to accept criticism, to interpret someone else’s response; everything becomes emotionally charged in a frenzied uncontrollable internal state.

From an internal point of view, we could define ourselves as the main character of our lives, the maker of the story. We could move in the world in relation to what we perceive we’re doing to it.
We are our own persons, with our own choices, so why not make the world what we want of it? Yet, if we overly emphasize this sort of self-definition, we become an actor, the protagonist, reading the main script, trying to take the center of the stage, while everyone around plays a minor role. It leads to clashes in narratives, cognitive dissonance, an illusion of superiority, and an egocentric bias. Just like over-metaperception creates lenses, the self-made story does too. We cannot deny that, by analogical thinking, others exist and that they have their own selves.

Therefore, that’s where the balance lies: knowing that we can be the masters of our destinies, and knowing we are creatures living in a limited social and physical world.
How do we learn to be comfortable with the ambiguity of the self/other boundary and get a better life experience? Why and how do we change our selves?

Change is hard. It’s an arduous task we’d rather let happen by itself gradually, let it pass by, barely noticing it after it has happened. Unfortunately, life is riddled with issues and dissatisfactions.
At first, we may pretend they are benign, non-existent, trivial. We dismiss them and move on to do activities that take our focus away from them…
Until they are not trivial anymore, until our behavioral pattern becomes destructive, until they become unmanageable, until we become intimately aware of them.
Then a realization emerges: Those problems are created by our sense of self, they are the product of our definition, part of the narrative. Rejecting them would mean rejecting one self. And this is what we do, we build immunity to change, we protect our self consistence, we cocoon ourselves away from the unknown changes. This is what we are, and we feel stuck with it.

The years pass by and nothing seems to change. — Henry David Thoreau

Gradually, we may build constant feelings of guilt, shame, anxiety, and regret. Desperate for change we see as unattainable, seeing everything as an unfulfilling experience. Are we to forever remain haunted by what might have been?

To cope with such emotions, we could rely on our old friend: self-suppressive escapism. Namely, anything that is numbing, numbing to the critical evaluation of the self, a cognitive narrowing, a cognitive detachment from the disturbing elements of the self. All of this being the easiest way to avoid the source of despair. Moreover, in themselves, these kinds of actions could be blamed for our current state. We can blame our inability to take productive actions to change on our anxiety, depression, fear, or lack of confidence in our abilities. Additionally, we may even believe that we have to first get rid of such feelings before moving on to change, we’ll try meditation and introspection. Or we may believe that we’ve wasted too much time, that it’s too late, and be overwhelmed by intense feelings of guilt and regret. However, the negative emotions are not the results of those, but they are inherent to the way we define ourselves and our fear of change.

Any obstruction of the natural processes of development …or getting stuck on a level unsuited to one’s age, takes its revenge, if not immediately, then later at the onset of the second half of life, in the form of serious crises, nervous breakdowns, and all manner of physical and psychic sufferings. Mostly they are accompanied by vague feelings of guilt, by tormenting pangs of conscience, often not understood, in face of which the individual is helpless. He knows he is not guilty of any bad deed, he has not given way to an illicit impulse, and yet he is plagued by uncertainty, discontent, despair, and above all by anxiety — a constant, indefinable anxiety. And in truth he must usually be pronounced “guilty”. His guilt does not lie in the fact that he has a neurosis, but in the fact that, knowing he has one, he does nothing to set about curing it. — Jolande Jacobi, The Way of Individuation

We cannot change anything unless we accept it. — Carl Jung

Thus, we should find the courage to tackle personal growth. If we don’t accept what has been, we can’t move to what will be. The feelings of dissatisfaction should be the catalyst of change, they should be welcomed as stimuli in the struggle for the development of personality.

Neurotic symptoms such as these are a direct result of an inadequate approach to life and act as signals communicating the necessity of change. — Carl Jung

Small changes are great and accumulate, but when we’ve reached a point where each step forward gets repelled by all our insecurities, we need more assurance; we need to know which exact self-induced changes are the most useful.
And this is what we need to do: we need to break our immunity to change, we need to remove the shield of our self consistence, we need to face the unknown changes head-on. This is the step where we need to find the courage to sacrifice our selves to be reborn.

Sacrifice always means the renunciation of a valuable part of oneself, and through it the sacrificer escapes being devoured. Difficult but necessary step to abandon an aspect of ourselves in order to pave the way for the emergence of the new. The sacrifice is critical in the process of rebirth because what keeps us locked in our problem is the inability to recognize that ways of life that served us in our past may morph from promoters of our well-being to the acute cause of our suffering. — Carl Jung

… The dying of one attitude or need may be the other side of the birth of something new. One can choose to kill a neurotic strategy, a dependency, a clinging, and then find that he can choose to live as a freer self… A “dying” of part of one’s self is often followed by a heightened awareness of self, a heightened sense of possibility. — Rollo May

Unsurprisingly, any sudden unnatural change involves risks, especially when already deep into the abyss. Such change may lead to disorder if, by removing part of ourselves, we have nothing else to fill it with. This may take us to the path of chaos and psychological breakdown.

[This] …is similar in principle to a psychotic disturbance; that is, it differs from the initial stage of mental illness only by the fact that it leads in the end to greater health, while the latter leads to yet greater destruction. — Carl Jung

Enters a labyrinth, and multiplies a thousandfold the dangers that life in itself brings with it — of which not the least is that nobody can see how and where he loses his way, becomes solitary, and is torn to pieces by some cave-Minotaur of conscience. — Nietzsche

So now that we’re aware of our situation and have the courage to leap and let ourselves go, how do we direct the change and get out of the loop we’re currently stuck in? How do we stop being an immovable pillar of the past?

Just like we’ve accepted that parts of the self can be sacrificed, we have to accept its ambiguity and its chaotic nature. Accept that there’s a world within us that we may not currently understand, that there’s a depth in each of us that remains to be discovered. We have to be aware of the reality of our psyche.
Indeed, a lot of the reasons why it remains hidden are related to the self-suppression we externally exert on elements of our personality that we think run counter to the moral system of our days. We are blinded by the unquestioned social beliefs and standards.

All psychology so far has got hung up on moral prejudices and fears: it has not dared to descend into the depths. — Nietzsche

Similarly, how can we know ourselves if we are actors in a play, wearers of the masks of society? How can we know ourselves if all we strive for is fake perfectionism, a fetishism for perfection and a repulsion for anything that isn’t? Perfection hinders our development.
Consequently, we have to accept that perfection is not a thing to aim for, that it is non-existent. For how can one know what perfection is if it is not complete to begin with?

One should never think that man can reach perfection, he can only aim at completion — not to be perfect but to be complete. That would be the necessity and the indispensable condition if there were only question of perfection at all. For how can you perfect a thing if it is not complete?

Make it complete first and see what it is then. But to make it complete is already a mountain of a task, and by the time you arrive at absolute completion, you find that you are already dead, so you never reach that preliminary condition for perfecting yourself. — Carl Jung

Completeness means grasping the wholeness of the self, the inner foundation of one’s mind, how to impose form and harmony on the chaos that is the totality of the self. We are chaos.
We are composed of a multitude, and we need to make those parts emerge, to bring them to light. The more we dig, the more the elements of consciousness become apparent. We should do all we can to promote this growth; we should be heroes and explorers of the chaos, not passive observers controlled by its forces. The myriad chunks are then seen from afar, for what they are, none taking authority over the whole.
Just like the body is coordinated, there can be a master to the chaos of the mind. We can have an organizing idea that drives the rest. But what is this drive, how do we cultivate it, and how can we muster its power to train the body it’s composed of?

If we keep experimenting and spend time bringing forward the chaotic parts to understand them, it becomes clear: our drives don’t come out of thin air. Analogously to the reframing of the definition of the self, we can reframe our connection to space and time.
Our history, our dreams, our will to achieve, our unconscious thoughts, our past cultures, our institutions, our traditions, our physical predispositions, our limits, our sense of imminent death, and our primitive drives all constitute our location in space-time. They are also expressions of our personal and collective unconscious. Altogether, they create a common symbolic language: archetypes, universal elements, original forms that are part of the specification of human nature.
When one becomes conscious of their link with archetypes, they gain knowledge of the timeless “pattern of human life”, which provides a link with humanity. Additionally, this type of learning dissolves the feeling that everything is absurd and provides a sense of being rooted.
Considering this, we need to actively learn about basic prehistorical drives and how we can take power over them, we need to learn about history and how it repeats itself, we need to be familiar with the vestiges of the past, feel them flow within us instead of being ashamed and repressing them. The ruling passion should drive it all, sculpting our own heroic meaning to life.

For that, we can choose to study examples of archetypes, anyone and anything that we find excels in life, be it an imaginary being or not. Like an artist, we can be active makers, using the same method that gave birth to our self in the first place: imitation and emulation of others. We can create a second self based on the traits and characteristics of a role model, and on how they handle adversity and challenges, while still respecting what we know we can’t change in ourselves, our innate strengths and weaknesses.
This mechanism, of having an alter-ego archetype as a feedback mechanism that we slowly dissolve into, makes escapism easier and directed. Instead of reaching for the numbing actions we should reach for the second self we’ve created. Importantly keeping in mind that one of those numbing action is to divulge to others that we are simple “acting it out”, informing others would only be a sign of the anxiety we are having when facing this novel situation and that we are looking for the recalibration of our social norms via the others.
We should not pretend, we should act as if we already were and remind ourselves constantly, when we fall back to our old habits, that the past was the actual acting and that the present is the reality.

All of this continues until we engender the shift in our mindset. We’ll finally be a new self, looking at the previous one from a distance. It’s only by taking a distance that we can understand the whole.
Thus we can continue our self-discovery, peeling away more and more, revealing potential we didn’t know we had. At this point we can rediscover our internal and external worlds like never before. Introspection, retrospection, discussion, connection of minds: these take on a different perspective, what we call psychological mindedness.

The field of developmental psychology has long been enamored with such encapsulated forms of growth, the subject/object shift, where, as you move along, the previous self becomes the object of the current self, or where you radically change your perspective on life. Abraham Maslow, Jean Piaget, Erik Erikson, and Robert Kegan are psychologists who used such models to convey cognitive and personality development.

Maslow talks about self-actualization, “man’s tendency to actualize himself, to become his potentialities”, “the desire for self-fulfillment”. This can be achieved by having something to aim at, not for external rewards or for the goal itself, but because the transformation of the self compels us. It is the type of behavior that requires self-discipline and skill, and that is constructive.
Piaget and Kegan focus on subject-object relations and inter-relations when it comes to defining the self, giving sense to the self in a world that is nebulous and has roots in history.
Erikson prefers to emphasize what is important at every stage of life, the “ego identities”, from childhood to the senior years, all together encompassing a stable self and putting our common humanity in perspective.

Finally, maybe eventually, with all this, we can know who we are and how to describe ourselves.


  • Gianni Crestani / CC0

Jeremy Morgan (JeremyMorgan)

Stay Home and Learn JavaScript March 24, 2020 08:32 PM

If you’re quarantined at home and always wanted to learn to code, now’s your chance. I’ll be creating a series of tutorials designed to take you from “zero to hero” as a React developer. Before you start with React you should know some JavaScript. Unlike many front-end frameworks/libraries, React uses JavaScript patterns extensively, so we’ll cover some JavaScript basics. In this tutorial you will learn how to get your first webpage set up, how to write text to the browser, and how to write text to the console. We will cover some of the basics of JavaScript but won’t get too deep.

March 23, 2020

Joe Nelson (begriffs)

Concurrent programming, with examples March 23, 2020 12:00 AM

Mention concurrency and you’re bound to get two kinds of unsolicited advice: first that it’s a nightmarish problem which will melt your brain, and second that there’s a magical programming language or niche paradigm which will make all your problems disappear.

We won’t run to either extreme here. Instead we’ll cover the production workhorses for concurrent software – threading and locking – and learn about them through a series of interesting programs. By the end of this article you’ll know the terminology and patterns used by POSIX threads (pthreads).

This is an introduction rather than a reference. Plenty of reference material exists for pthreads – whole books in fact. I won’t dwell on all the options of the API, but will briskly give you the big picture. None of the examples contain error handling because it would merely clutter them.

Table of contents

Concurrency vs parallelism

First it’s important to distinguish concurrency vs parallelism. Concurrency is the ability of parts of a program to work correctly when executed out of order. For instance, imagine tasks A and B. One way to execute them is sequentially, meaning doing all steps for A, then all for B:


Concurrent execution, on the other hand, alternates doing a little of each task until both are all complete:

Concurrency allows a program to make progress even when certain parts are blocked. For instance, when one task is waiting for user input, the system can switch to another task and do calculations.

When tasks don’t just interleave, but run at the same time, that’s called parallelism. Multiple CPU cores can run instructions simultaneously:


When a program – even without hardware parallelism – switches rapidly enough from one task to another, it can feel to the user that tasks are executing at the same time. You could say it provides the “illusion of parallelism.” However, true parallelism has the potential for greater processor throughput for problems that can be broken into independent subtasks. Some ways of dealing with concurrency, such as multi-threaded programming, can exploit hardware parallelism automatically when available.

Some languages (or more accurately, some language implementations) are unable to achieve true multi-threaded parallelism. Ruby MRI and CPython for instance use a global interpreter lock (GIL) to simplify their implementation. The GIL prevents more than one thread from running at once. Programs in these interpreters can benefit from I/O concurrency, but not extra computational power.
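
As a quick check of the hardware parallelism actually available, most Unix-like systems report the number of online processors through sysconf. A small sketch (note that _SC_NPROCESSORS_ONLN is a widespread extension on Linux, the BSDs, and macOS, rather than core POSIX):

```c
#include <unistd.h>

/* how many cores can threads really run on at once?
   returns -1 if the system cannot say */
long online_cores(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}
```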

Our first concurrent program

Languages and libraries offer different ways to add concurrency to a program. UNIX for instance has a bunch of disjointed mechanisms like signals, asynchronous I/O (AIO), select, poll, and setjmp/longjmp. Using these mechanisms can complicate program structure and make programs harder to read than sequential code.

Threads offer a cleaner and more consistent way to address these motivations. For I/O they’re usually clearer than polling or callbacks, and for processing they are more efficient than Unix processes.

Crazy bankers

Let’s get started by adding concurrency to a program to simulate a bunch of crazy bankers sending random amounts of money from one bank account to another. The bankers don’t communicate with one another, so this is a demonstration of concurrency without synchronization.

Adding concurrency is the easy part. The real work is in making threads wait for one another to ensure a correct result. We’ll see a number of mechanisms and patterns for synchronization later, but for now let’s see what goes wrong without synchronization.

/* banker.c */

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <time.h>

#define N_ACCOUNTS 10
#define N_THREADS  20
#define N_ROUNDS   10000

/* 10 accounts with $100 apiece means there's $1,000
   in the system. Let's hope it stays that way...  */
#define INIT_BALANCE 100

/* making a struct here for the benefit of future
   versions of this program */
struct account
{
	long balance;
} accts[N_ACCOUNTS];

/* Helper for bankers to choose an account and amount at
   random. It came from Steve Summit's excellent C FAQ */
int rand_range(int N)
{
	return (int)((double)rand() / ((double)RAND_MAX + 1) * N);
}

/* each banker will run this function concurrently. The
   weird signature is required for a thread function */
void *disburse(void *arg)
{
	size_t i, from, to;
	long payment;

	/* idiom to tell compiler arg is unused */
	(void)arg;

	for (i = 0; i < N_ROUNDS; i++)
	{
		/* pick distinct 'from' and 'to' accounts */
		from = rand_range(N_ACCOUNTS);
		do {
			to = rand_range(N_ACCOUNTS);
		} while (to == from);

		/* go nuts sending money, try not to overdraft */
		if (accts[from].balance > 0)
		{
			payment = 1 + rand_range(accts[from].balance);
			accts[from].balance -= payment;
			accts[to].balance   += payment;
		}
	}
	return NULL;
}

int main(void)
{
	size_t i;
	long total;
	pthread_t ts[N_THREADS];

	srand(time(NULL));

	for (i = 0; i < N_ACCOUNTS; i++)
		accts[i].balance = INIT_BALANCE;

	printf("Initial money in system: %d\n",
	       N_ACCOUNTS * INIT_BALANCE);

	/* start the threads, using whatever parallelism the
	   system happens to offer. Note that pthread_create
	   is the *only* function that creates concurrency */
	for (i = 0; i < N_THREADS; i++)
		pthread_create(&ts[i], NULL, disburse, NULL);

	/* wait for the threads to all finish, using the
	   pthread_t handles pthread_create gave us */
	for (i = 0; i < N_THREADS; i++)
		pthread_join(ts[i], NULL);

	for (total = 0, i = 0; i < N_ACCOUNTS; i++)
		total += accts[i].balance;

	printf("Final money in system: %ld\n", total);

	return 0;
}

The following simple Makefile can be used to compile all the programs in this article:

CFLAGS = -std=c99 -pedantic -D_POSIX_C_SOURCE=200809L -Wall -Wextra
LDLIBS = -lpthread

.c:
	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $< $(LDLIBS)

We’re overriding make’s default single-suffix rule for .c so that -lpthread comes after the source input file. This Makefile will work with any of the programs in this article: if you have foo.c, you can simply run make foo, and make knows what to do without your needing to add any foo-specific rule to the Makefile.

Data races

Try compiling and running banker.c. Notice anything strange?

Threads share memory directly. Each thread can read and write variables in shared memory without any overhead. However when threads simultaneously read and write the same data it’s called a data race and generally causes problems.

In particular, threads in banker.c have data races when they read and write account balances. The bankers program moves money between accounts, yet the total amount of money in the system does not remain constant. The books don’t balance. Exactly how the program behaves depends on the thread scheduling policies of the operating system. On OpenBSD the total money seldom stays at $1,000. Sometimes money gets duplicated, sometimes it vanishes. On macOS the result is generally that all the money disappears, or even becomes negative!

The property that money is neither created nor destroyed in a bank is an example of a program invariant, and it gets violated by data races. Note that parallelism is not required for a race, only concurrency.

Here’s the problematic code in the disburse() function:

payment = 1 + rand_range(accts[from].balance);
accts[from].balance -= payment;
accts[to].balance   += payment;

The threads running this code can be paused or interleaved at any time. Not just between any of the statements, but partway through arithmetic operations which may not execute atomically on the hardware. Never rely on “thread inertia,” which is the mistaken feeling that the thread will finish a group of statements without interference.

Let’s examine exactly how statements can interleave between banker threads, and the resulting problems. The columns of the table below are threads, and the rows are moments in time.

Here’s a timeline where two threads read the same account balance when planning how much money to transfer. It can cause an overdraft.

Overdraft
A: payment = 1 + rand_range(accts[from].balance);
B: payment = 1 + rand_range(accts[from].balance);
   At this point, thread B’s payment-to-be may be in excess of the true
   balance, because thread A has already earmarked some of the money
   unbeknownst to B.
A: accts[from].balance -= payment;
B: accts[from].balance -= payment;
   Some of the same dollars could be transferred twice, and the originating
   account could even go negative if the overlap of the payments is big
   enough.

Here’s a timeline where the debit made by one thread can be undone by that made by another.

Lost debit
A: accts[from].balance -= payment;
B: accts[from].balance -= payment;    (at the same moment)
If -= is not atomic, the threads might switch execution after reading the balance and doing the arithmetic, but before the assignment. Thus one assignment would be overwritten by the other. The “lost update” creates extra money in the system.

Similar problems can occur when bankers have a data race in destination accounts. Races in the destination account would tend to decrease total money supply. (To learn more about concurrency problems, see my article Practical Guide to SQL Transaction Isolation).
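
The lost-update interleaving can even be reproduced deterministically in a single thread, by performing the two threads’ read-modify-write steps by hand (a simulation of one possible schedule, not real threading):

```c
/* one possible interleaving of two -= operations on a balance */
long simulate_lost_update(void)
{
	long balance = 100;

	long a_temp = balance - 10;  /* thread A reads 100, computes 90 */
	long b_temp = balance - 20;  /* thread B reads 100 too, computes 80 */

	balance = a_temp;            /* A writes 90 */
	balance = b_temp;            /* B overwrites it with 80 */

	return balance;              /* 80, though both debits should leave 70 */
}
```

A’s $10 debit is lost, so the system ends up $10 richer than it should be.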

Locks and deadlock

In the example above, we found that a certain section of code was vulnerable to data races. Such tricky parts of a program are called critical sections. We must ensure each thread gets all the way through the section before another thread is allowed to enter it.

To give threads mutually exclusive access to a critical section, pthreads provides the mutually exclusive lock (mutex for short). The pattern is:


pthread_mutex_lock(&mtx);

/* ... do things in the critical section ... */

pthread_mutex_unlock(&mtx);

Any thread calling pthread_mutex_lock on a previously locked mutex will go to sleep and not be scheduled until the mutex is unlocked (and any other threads already waiting on the mutex have gone first).

Another way to look at mutexes is that their job is to preserve program invariants. The critical section between locking and unlocking is a place where a certain invariant may be temporarily broken, as long as it is restored by the end. Some people recommend adding an assert() statement before unlocking, to help document the invariant. If an invariant is difficult to specify in an assertion, a comment can be useful instead.

A function is called thread-safe if multiple invocations can safely run concurrently. A cheap, but inefficient, way to make any function thread-safe is to give it its own mutex and lock it right away:

/* inefficient but effective way to protect a function */

pthread_mutex_t foo_mtx = PTHREAD_MUTEX_INITIALIZER;

void foo(/* some arguments */)
{
	pthread_mutex_lock(&foo_mtx);

	/* we're safe in here, but it's a bottleneck */

	pthread_mutex_unlock(&foo_mtx);
}

To see why this is inefficient, imagine if foo() was designed to output characters to a file specified in its arguments. Because the function takes a global lock, no two threads could run it at once, even if they wanted to write to different files. Writing to different files should be independent activities, and what we really want to protect against are two threads concurrently writing the same file.

The amount of data that a mutex protects is called its granularity, and smaller granularity can often be more efficient. In our foo() example, we could store a mutex for every file we write, and have the function choose and lock the appropriate mutex. Multi-threaded programs typically add a mutex as a member variable to data structures, to associate the lock with its data.
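
A sketch of what finer granularity could look like for the file-writing example, pairing each output file with its own mutex (the struct and function names here are hypothetical):

```c
#include <pthread.h>
#include <stdio.h>

/* hypothetical: one lock per file rather than one global lock */
struct logfile
{
	FILE *f;
	pthread_mutex_t mtx;
};

void logfile_write(struct logfile *lf, const char *msg)
{
	/* writers to *different* files proceed in parallel;
	   only writers to the same file serialize */
	pthread_mutex_lock(&lf->mtx);
	fputs(msg, lf->f);
	pthread_mutex_unlock(&lf->mtx);
}
```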

Let’s update the banker program to keep a mutex in each account and prevent data races.

/* banker_lock.c */

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <time.h>

#define N_ACCOUNTS 10
#define N_THREADS  100
#define N_ROUNDS   10000

struct account
{
	long balance;
	/* add a mutex to prevent races on balance */
	pthread_mutex_t mtx;
} accts[N_ACCOUNTS];

int rand_range(int N)
{
	return (int)((double)rand() / ((double)RAND_MAX + 1) * N);
}

void *disburse(void *arg)
{
	size_t i, from, to;
	long payment;

	(void)arg;

	for (i = 0; i < N_ROUNDS; i++)
	{
		from = rand_range(N_ACCOUNTS);
		do {
			to = rand_range(N_ACCOUNTS);
		} while (to == from);

		/* get an exclusive lock on both balances before
		   updating (there's a problem with this, see below) */
		pthread_mutex_lock(&accts[from].mtx);
		pthread_mutex_lock(&accts[to].mtx);
		if (accts[from].balance > 0)
		{
			payment = 1 + rand_range(accts[from].balance);
			accts[from].balance -= payment;
			accts[to].balance   += payment;
		}
		pthread_mutex_unlock(&accts[to].mtx);
		pthread_mutex_unlock(&accts[from].mtx);
	}
	return NULL;
}

int main(void)
{
	size_t i;
	long total;
	pthread_t ts[N_THREADS];

	srand(time(NULL));

	/* set the initial balance, but also create a
	   new mutex for each account */
	for (i = 0; i < N_ACCOUNTS; i++)
		accts[i] = (struct account)
			{100, PTHREAD_MUTEX_INITIALIZER};

	for (i = 0; i < N_THREADS; i++)
		pthread_create(&ts[i], NULL, disburse, NULL);

	puts("(This program will probably deadlock, "
	     "and need to be manually terminated...)");

	for (i = 0; i < N_THREADS; i++)
		pthread_join(ts[i], NULL);

	for (total = 0, i = 0; i < N_ACCOUNTS; i++)
		total += accts[i].balance;

	printf("Total money in system: %ld\n", total);

	return 0;
}

Now everything should be safe. No money being created or destroyed, just perfect exchanges between the accounts. The invariant is that the total balance of the source and destination accounts is the same before we transfer the money as after. It’s broken only inside the critical section.

As a side note, at this point you might think it would be more efficient to take a single lock at a time, like this:

  • lock the source account
  • withdraw money into a thread local variable
  • unlock the source account
  • (danger zone!)
  • lock the destination account
  • deposit the money
  • unlock the destination account

This would not be safe. During the time between unlocking the source account and locking the destination, the invariant does not hold, yet another thread could observe this state. For instance a report running in another thread just at that time could read the balance of both accounts and observe money missing from the system.

We do need to lock both accounts during the transfer. However the way we’re doing it causes a different problem. Try to run the program. It gets stuck forever and never prints the final balance! Its threads are deadlocked.

Deadlock is the second villain of concurrent programming, and happens when threads wait on each other’s locks, but no thread unlocks for any other. The case of the bankers is a classic simple form called the deadly embrace. Here’s how it plays out:

Deadly embrace
A: lock account 1
B: lock account 2
A: lock account 2
   At this point thread A is blocked, because thread B already holds a lock
   on account 2.
B: lock account 1
   Now thread B is blocked, because thread A holds a lock on account 1.
   However thread A will never unlock account 1, because thread A is
   blocked!

The problem happens because threads lock resources in different orders, and because they refuse to give locks up. We can solve the problem by addressing either of these causes.

The first approach to preventing deadlock is to enforce a locking hierarchy. This means the programmer comes up with an arbitrary order for locks, and always takes “earlier” locks before “later” ones. The terminology comes from locks in hierarchical data structures like trees, but it really amounts to using any kind of consistent locking order.

In our case of the banker program we store all the accounts in an array, so we can use the array index as the lock order. Let’s compare.

/* the original way to lock mutexes, which caused deadlock */

pthread_mutex_lock(&accts[from].mtx);
pthread_mutex_lock(&accts[to].mtx);
/* move money */
pthread_mutex_unlock(&accts[to].mtx);
pthread_mutex_unlock(&accts[from].mtx);

Here’s a safe way, enforcing a locking hierarchy:

/* lock mutexes in earlier accounts first */

#define MIN(a,b) ((a) < (b) ? (a) : (b))
#define MAX(a,b) ((a) < (b) ? (b) : (a))

pthread_mutex_lock(&accts[MIN(from, to)].mtx);
pthread_mutex_lock(&accts[MAX(from, to)].mtx);
/* move money */
pthread_mutex_unlock(&accts[MAX(from, to)].mtx);
pthread_mutex_unlock(&accts[MIN(from, to)].mtx);

/* notice we unlock in opposite order */

A locking hierarchy is the most efficient way to prevent deadlock, but it isn’t always easy to contrive. It also creates a potentially undocumented coupling between different parts of a program, which need to collaborate in the convention.

Backoff is a different way to prevent deadlock which works for locks taken in any order. It takes a lock, but then checks whether the next is obtainable. If not, it unlocks the first to allow another thread to make progress, and tries again.

/* using pthread_mutex_trylock to dodge deadlock */

while (1)
{
	pthread_mutex_lock(&accts[from].mtx);
	if (pthread_mutex_trylock(&accts[to].mtx) == 0)
		break; /* got both locks */

	/* didn't get the second one, so unlock the first */
	pthread_mutex_unlock(&accts[from].mtx);
	/* force a sleep so another thread can try --
	   include <sched.h> for this function */
	sched_yield();
}
/* move money */
pthread_mutex_unlock(&accts[to].mtx);
pthread_mutex_unlock(&accts[from].mtx);

One tricky part is the call to sched_yield(). Without it the loop would immediately try to grab the lock again, competing as hard as it can with other threads that could make more productive use of the lock. This causes livelock, where threads fight for access to the locks. The sched_yield() relinquishes the processor, sending the calling thread to the back of the scheduler’s run queue.

Despite its flexibility, backoff is definitely less efficient than a locking hierarchy because it can make wasted calls to lock and unlock mutexes. Try modifying the banker program with these approaches and measure how fast they run.
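
For the measurement, clock_gettime with CLOCK_MONOTONIC gives a wall-clock timer unaffected by system clock adjustments. A sketch of how the comparison could be instrumented:

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* seconds between two timestamps */
double elapsed(struct timespec a, struct timespec b)
{
	return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

void timed_run(void)
{
	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC, &start);
	/* ... pthread_create and pthread_join the banker threads ... */
	clock_gettime(CLOCK_MONOTONIC, &end);

	printf("%f seconds\n", elapsed(start, end));
}
```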

Condition variables

After safely getting access to a shared variable with a mutex, a thread may discover that the value of the variable is not yet suitable for the thread to act upon. For instance, if the thread was looking for an item to process in a shared queue, but found the queue was empty. The thread could poll the value, but this is inefficient. Pthreads provides condition variables to allow threads to wait for events of interest or notify other threads when these events happen.

Condition variables are not themselves locks, nor do they hold any value of their own. They are merely events with a programmer-assigned meaning. For example, a structure representing a queue could have a mutex for safely accessing the data, plus some condition variables. One to represent the event of the queue becoming empty, and another to announce when a new item is added.

Before getting deeper into how condition variables work, let’s see one in action with our banker program. We’ll measure contention between the bankers. First we’ll increase the number of threads and accounts, and keep statistics about how many bankers manage to get inside the disburse() critical section at once. Any time the max score is broken, we’ll signal a condition variable. A dedicated thread will wait on it and update a scoreboard.

/* banker_stats.c */

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <time.h>

/* increase the accounts and threads, but make sure there are
 * "too many" threads so they tend to block each other */
#define N_ACCOUNTS 50
#define N_THREADS  100
#define N_ROUNDS   10000

#define MIN(a,b) ((a) < (b) ? (a) : (b))
#define MAX(a,b) ((a) < (b) ? (b) : (a))

struct account
{
	long balance;
	pthread_mutex_t mtx;
} accts[N_ACCOUNTS];

int rand_range(int N)
{
	return (int)((double)rand() / ((double)RAND_MAX + 1) * N);
}

/* keep a special mutex and condition variable
 * reserved for just the stats */
pthread_mutex_t stats_mtx = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  stats_cnd = PTHREAD_COND_INITIALIZER;
int stats_curr = 0, stats_best = 0;

/* use this interface to modify the stats */
void stats_change(int delta)
{
	pthread_mutex_lock(&stats_mtx);
	stats_curr += delta;
	if (stats_curr > stats_best)
	{
		stats_best = stats_curr;
		/* signal new high score */
		pthread_cond_broadcast(&stats_cnd);
	}
	pthread_mutex_unlock(&stats_mtx);
}

/* a dedicated thread to update the scoreboard UI */
void *stats_print(void *arg)
{
	int prev_best;

	(void)arg;

	/* we never return, nobody needs to
	 * pthread_join() with us */

	while (1)
	{
		pthread_mutex_lock(&stats_mtx);
		prev_best = stats_best;
		/* go to sleep until stats change, and always
		 * check that they actually have changed */
		while (prev_best == stats_best)
			pthread_cond_wait(
				&stats_cnd, &stats_mtx);

		/* overwrite current line with new score */
		printf("\r%2d", stats_best);
		fflush(stdout);
		pthread_mutex_unlock(&stats_mtx);
	}
	return NULL;
}

void *disburse(void *arg)
{
	size_t i, from, to;
	long payment;

	(void)arg;

	for (i = 0; i < N_ROUNDS; i++)
	{
		from = rand_range(N_ACCOUNTS);
		do {
			to = rand_range(N_ACCOUNTS);
		} while (to == from);

		pthread_mutex_lock(&accts[MIN(from, to)].mtx);
		pthread_mutex_lock(&accts[MAX(from, to)].mtx);

		/* notice we still have a lock hierarchy, because
		 * we call stats_change() after locking all account
		 * mutexes (stats_mtx comes last) */
		stats_change(1); /* another banker in crit sec */
		if (accts[from].balance > 0)
		{
			payment = 1 + rand_range(accts[from].balance);
			accts[from].balance -= payment;
			accts[to].balance   += payment;
		}
		stats_change(-1); /* leaving crit sec */

		pthread_mutex_unlock(&accts[MAX(from, to)].mtx);
		pthread_mutex_unlock(&accts[MIN(from, to)].mtx);
	}
	return NULL;
}

int main(void)
{
	size_t i;
	long total;
	pthread_t ts[N_THREADS], stats;

	srand(time(NULL));

	for (i = 0; i < N_ACCOUNTS; i++)
		accts[i] = (struct account)
			{100, PTHREAD_MUTEX_INITIALIZER};

	for (i = 0; i < N_THREADS; i++)
		pthread_create(&ts[i], NULL, disburse, NULL);

	/* start thread to update the user on how many bankers
	 * are in the disburse() critical section at once */
	pthread_create(&stats, NULL, stats_print, NULL);

	for (i = 0; i < N_THREADS; i++)
		pthread_join(ts[i], NULL);

	/* not joining with the thread running stats_print,
	 * we'll let it disappear when main exits */

	for (total = 0, i = 0; i < N_ACCOUNTS; i++)
		total += accts[i].balance;

	printf("\nTotal money in system: %ld\n", total);

	return 0;
}

With fifty accounts and a hundred threads, not all threads will be able to be in the critical section of disburse() at once. It varies between runs. Run the program and see how well it does on your machine. (One complication is that making all threads synchronize on stats_mtx may throw off the measurement, because there are threads who could have executed independently but now must interact.)

Let’s look at how to properly use condition variables. We notified threads of a new event with pthread_cond_broadcast(&stats_cnd). This function marks all threads waiting on stats_cnd as ready to run.

Sometimes multiple threads are waiting on a single cond var. A broadcast will wake them all, but sometimes the event source knows that only one thread will be able to do any work. For instance if only one item is added to a shared queue. In that case the pthread_cond_signal function is better than pthread_cond_broadcast. Unnecessarily waking multiple threads causes overhead. In our case we know that only one thread is waiting on the cond var, so it really makes no difference.

Remember that it’s never wrong to use a broadcast, whereas in some cases it might be wrong to use a signal. Signal is just an optimized broadcast.

The waiting side of a cond var ought always to have this pattern:

pthread_mutex_lock(&mutex);
while (!PREDICATE)
	pthread_cond_wait(&cond_var, &mutex);
pthread_mutex_unlock(&mutex);

Condition variables are always associated with a predicate, and the association is implicit in the programmer’s head. You shouldn’t reuse a condition variable for multiple predicates. The intention is that code will signal the cond var when the predicate becomes true.

Before testing the predicate we lock a mutex that covers the data being tested. That way no other thread can change the data immediately after we test it (also pthread_cond_wait() requires a locked mutex). If the predicate is already true we needn’t wait on the cond var, so the loop falls through, otherwise the thread begins to wait.

Condition variables allow you to make this series of events atomic: unlock a mutex, register our interest in the event, and block. Without that atomicity another thread might awaken to take our lock and broadcast before we’ve registered ourselves as interested. Without the atomicity we could be blocked forever.

When pthread_cond_wait() returns, the calling thread awakens and atomically gets its mutex back. It’s all set to check the predicate again in the loop. But why check the predicate? Wasn’t the cond var signaled because the predicate was true, and isn’t the relevant data protected by a mutex? There are three reasons to check:

  1. If the condition variable had been broadcast, other threads might have been listening, and another might have been scheduled first and might have done our job. The loop tests for that interception.
  2. On some multiprocessor systems, making condition variable wakeup completely predictable might substantially slow down all cond var operations. Such systems allow spurious wakeups, and threads need to be prepared to check if they were woken appropriately.
  3. It can be convenient to signal on a loose predicate. Threads can signal the variable when the event seems likely, or even mistakenly signal, and the program will still work. For instance, we signal when stats_best gets a new high score, but we could have chosen to signal at every invocation of stats_change().

Given that we have to pass a locked mutex to pthread_cond_wait(), which we had to create, why don’t cond vars come with their own built-in mutex? The reason is flexibility. Although you should use only one mutex with a cond var, there can be multiple cond vars for the same mutex. Think of the example of the mutex protecting a queue, and the different events that can happen in the queue.
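
To make the queue example concrete, here is a sketch of a bounded queue guarded by one mutex and two condition variables, one per event; the names and capacity are illustrative:

```c
#include <pthread.h>

#define QCAP 8

struct queue
{
	int items[QCAP];
	int head, count;
	pthread_mutex_t mtx;
	pthread_cond_t nonempty;  /* predicate: count > 0    */
	pthread_cond_t nonfull;   /* predicate: count < QCAP */
};

void enqueue(struct queue *q, int item)
{
	pthread_mutex_lock(&q->mtx);
	while (q->count == QCAP)  /* wait for a free slot */
		pthread_cond_wait(&q->nonfull, &q->mtx);
	q->items[(q->head + q->count) % QCAP] = item;
	q->count++;
	/* only one consumer can use this item, so signal, not broadcast */
	pthread_cond_signal(&q->nonempty);
	pthread_mutex_unlock(&q->mtx);
}

int dequeue(struct queue *q)
{
	int item;

	pthread_mutex_lock(&q->mtx);
	while (q->count == 0)     /* wait for an item */
		pthread_cond_wait(&q->nonempty, &q->mtx);
	item = q->items[q->head];
	q->head = (q->head + 1) % QCAP;
	q->count--;
	pthread_cond_signal(&q->nonfull);
	pthread_mutex_unlock(&q->mtx);
	return item;
}
```

Both waits follow the while-loop pattern above, and both condition variables share the queue’s single mutex.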

Other synchronization primitives

Barriers


It’s time to bid farewell to the banker programs, and turn to something more lively: Conway’s Game of Life! The game has a set of rules operating on a grid of cells that determines which cells live or die based on how many living neighbors each has.

The game can take advantage of multiple processors, using each processor to operate on a different part of the grid in parallel. It’s a so-called embarrassingly parallel problem because each section of the grid can be processed in isolation, without needing results from other sections.

Barriers ensure that all threads have reached a particular stage in a parallel computation before allowing any to proceed to the next stage. Each thread calls pthread_barrier_wait() to rendezvous with the others. One of the threads, chosen arbitrarily by the implementation, will see the PTHREAD_BARRIER_SERIAL_THREAD return value, which nominates that thread to do any cleanup or preparation between stages.

/* life.c */

#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* mandatory in POSIX.1-2008, but check laggards like macOS */
#include <unistd.h>
#if !defined(_POSIX_BARRIERS) || _POSIX_BARRIERS < 0
#error your OS lacks POSIX barrier support
#endif

/* dimensions of board */
#define ROWS 32
#define COLS 78
/* how long to pause between rounds */
#define FRAME_MS 100
#define THREADS 4

/* proper modulus (in C, '%' is merely remainder) */
#define MOD(x,N) (((x) < 0) ? ((x) % (N) + (N)) : ((x) % (N)))

bool alive[ROWS][COLS], alive_next[ROWS][COLS];
pthread_barrier_t tick;

/* Should a cell live or die? Using ssize_t because we have
   to deal with signed arithmetic like row-1 when row=0 */
bool fate(ssize_t row, ssize_t col)
{
	ssize_t i, j;
	short neighbors = 0;

	assert(0 <= row && row < ROWS);
	assert(0 <= col && col < COLS);

	/* joined edges form a torus */
	for (i = row-1; i <= row+1; i++)
		for (j = col-1; j <= col+1; j++)
			neighbors += alive[MOD(i, ROWS)][MOD(j, COLS)];
	/* don't count self as a neighbor */
	neighbors -= alive[row][col];

	return neighbors == 3 ||
		(neighbors == 2 && alive[row][col]);
}

/* overwrite the board on screen */
void draw(void)
{
	ssize_t i, j;

	/* clear screen (non portable, requires ANSI terminal) */
	fputs("\033[2J\033[1;1H", stdout);

	flockfile(stdout);
	for (i = 0; i < ROWS; i++)
	{
		/* putchar_unlocked is thread safe when stdout is locked,
		   and it's as fast as single-threaded putchar */
		for (j = 0; j < COLS; j++)
			putchar_unlocked(alive[i][j] ? 'X' : ' ');
		putchar_unlocked('\n');
	}
	funlockfile(stdout);
}

void *update_strip(void *arg)
{
	ssize_t offset = *(ssize_t*)arg, i, j;
	struct timespec t;

	t.tv_sec = 0;
	t.tv_nsec = FRAME_MS * 1000000;

	while (1)
	{
		if (pthread_barrier_wait(&tick) ==
		    PTHREAD_BARRIER_SERIAL_THREAD)
		{
			/* we drew the short straw, so we're on graphics duty */

			/* could have used pointers to multidimensional
			 * arrays and swapped them rather than memcpy'ing
			 * the array contents, but it makes the code a
			 * little more complicated with dereferences */
			memcpy(alive, alive_next, sizeof alive);
			draw();
			nanosleep(&t, NULL);
		}

		/* rejoin at another barrier to avoid data race on
		   the game board while it's copied and drawn */
		pthread_barrier_wait(&tick);

		for (i = offset; i < offset + (ROWS / THREADS); i++)
			for (j = 0; j < COLS; j++)
				alive_next[i][j] = fate(i, j);
	}

	return NULL;
}

int main(void)
{
	pthread_t *workers;
	ssize_t *offsets;
	size_t i, j;

	srand(time(NULL));

	assert(ROWS % THREADS == 0);
	/* main counts as a thread, so need only THREADS-1 more */
	workers = malloc(sizeof(*workers) * (THREADS-1));
	offsets = malloc(sizeof(*offsets) * THREADS);

	for (i = 0; i < ROWS; i++)
		for (j = 0; j < COLS; j++)
			alive_next[i][j] = rand() < (int)((RAND_MAX+1u) / 3);

	pthread_barrier_init(&tick, NULL, THREADS);
	for (i = 0; i < THREADS-1; i++)
	{
		offsets[i] = i * ROWS / THREADS;
		pthread_create(&workers[i], NULL, update_strip, &offsets[i]);
	}

	/* use current thread as a worker too */
	offsets[i] = i * ROWS / THREADS;
	update_strip(&offsets[i]);

	/* shouldn't ever get here */
	return 0;
}

It’s a fun example although slightly contrived. We’re adding a sleep between rounds to slow down the animation, so it’s unnecessary to chase parallelism. Also there’s a memoized algorithm called hashlife we should be using if pure speed is the goal. However our code illustrates a natural use for barriers.

Notice how we wait at the barrier twice in rapid succession. After emerging from the first barrier, one of the threads (chosen at random) copies the new state to the board and draws it. The other threads run ahead to the next barrier and wait there so they don’t cause a data race writing to the board. Once the drawing thread arrives at the barrier with them, then all can proceed to calculate cells’ fate for the next round.

Barriers are guaranteed to be present in POSIX.1-2008, but are optional in earlier versions of the standard. Notably macOS is stuck at an old version of POSIX. Presumably they’re too busy “innovating” with their keyboard touchbar to invest in operating system fundamentals.


Spinlocks

Spinlocks are implementations of mutexes optimized for fine-grained locking. Often used in low-level code like drivers or operating systems, spinlocks are designed to be the most primitive and fastest sync mechanism available. They’re generally not appropriate for application programming. They are truly necessary only in situations like interrupt handlers, when a thread is not allowed to go to sleep for any reason.

Aside from that scenario, it’s better to just use a mutex, since mutexes are pretty efficient these days. A modern mutex often tries a short-lived internal spinlock first and falls back to heavier techniques only as needed. On Linux, mutexes are built on the futex (“fast userspace mutex”) facility, which takes an uncontended lock entirely in user space and makes a system call only when threads actually have to wait.

When attempting to lock a spinlock, a thread runs a tight loop repeatedly checking a value in shared memory for a sign it’s safe to proceed. Spinlock implementations use special atomic assembly language instructions to test that the value is unlocked and lock it. The particular instructions vary per architecture, and can be performed in user space to avoid the overhead of a system call.

While waiting for a lock, the loop doesn’t block the thread, but instead keeps running and burns CPU time. The technique works only on a true multi-processor system, or on a uniprocessor system with preemptive scheduling. On a uniprocessor system with cooperative threading the loop can never be interrupted, and the program will livelock.

In POSIX.1-2008 spinlock support is mandatory. In previous versions the presence of this feature was indicated by the _POSIX_SPIN_LOCKS macro. Spinlock functions start with pthread_spin_.

Reader-writer locks

Whereas a mutex enforces mutual exclusion, a reader-writer lock allows concurrent read access. Multiple threads can read in parallel, but all block when a thread takes the lock for writing. The increased concurrency can improve application performance. However, blindly replacing mutexes with reader-writer locks “for performance” doesn’t work. Our earlier banker program, for instance, could suffer from duplicate withdrawals if it allowed multiple readers in an account at once.

Below is an rwlock example. It’s a password cracker I call 5dm (md5 backwards). It aims for maximum parallelism searching for a preimage of an MD5 hash. Worker threads periodically poll whether one among them has found an answer, and they use a reader-writer lock to avoid blocking on each other when doing so.

The example is slightly contrived, in that the difficulty of brute forcing passwords increases exponentially with their length. Using multiple threads reduces the time by only a constant factor – but 4x faster is still 4x faster on a four core computer!

The example below uses MD5() from OpenSSL. To build it, include this in our previous Makefile:

CFLAGS  += `pkg-config --cflags libcrypto`
LDFLAGS += `pkg-config --libs-only-L libcrypto`
LDLIBS  += `pkg-config --libs-only-l libcrypto`

To run it, pass in an MD5 hash and max preimage search length. Note the -n in echo to suppress the newline, since newline is not in our search alphabet:

$ time ./5dm $(echo -n 'fun' | md5) 5

real  0m0.067s
user  0m0.205s
sys	  0m0.007s

Notice how 0.2 seconds of CPU time elapsed in parallel, but the user got their answer in 0.067 seconds.

On to the code:

/* 5dm.c */

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <openssl/md5.h>
#include <pthread.h>

/* build arbitrary words from the ascii between ' ' and '~' */
#define ASCII_FIRST ' '
#define ASCII_LAST  '~'
#define N_ALPHA     (ASCII_LAST - ASCII_FIRST + 1)
/* refuse to search beyond this astronomical length */
#define LONGEST_PREIMAGE 50

#define MAX(x,y) ((x)<(y) ? (y) : (x))

/* a fast way to enumerate words, operating on an array in-place */
unsigned word_advance(char *word, unsigned delta)
{
	if (delta == 0)
		return 0;
	if (*word == '\0')
	{
		*word++ = ASCII_FIRST + delta - 1;
		*word = '\0';
	}
	else
	{
		char c = *word - ASCII_FIRST;
		*word = ASCII_FIRST + ((c + delta) % N_ALPHA);
		if (c + delta >= N_ALPHA)
			return 1 + word_advance(word+1, 1 /* not delta */);
	}
	return 1;
}

/* pack each pair of ASCII hex digits into single bytes */
bool hex2md5(const char *hex, unsigned char *b)
{
	int offset = 0;
	if (strlen(hex) != MD5_DIGEST_LENGTH*2)
		return false;
	while (offset < MD5_DIGEST_LENGTH*2)
	{
		if (sscanf(hex+offset, "%2hhx", b++) == 1)
			offset += 2;
		else
			return false;
	}
	return true;
}

/* random things a worker will need, since thread
 * functions receive only one argument */
struct goal
{
	/* input */
	pthread_t *workers;
	size_t n_workers;
	size_t max_len;
	unsigned char hash[MD5_DIGEST_LENGTH];

	/* output */
	pthread_rwlock_t lock;
	char preimage[LONGEST_PREIMAGE];
	bool success;
};

/* custom starting word for each worker, but shared goal */
struct task
{
	struct goal *goal;
	char initial_preimage[LONGEST_PREIMAGE];
};

void *crack_thread(void *arg)
{
	struct task *t = arg;
	unsigned len, changed;
	unsigned char hashed[MD5_DIGEST_LENGTH];
	char preimage[LONGEST_PREIMAGE];
	int iterations = 0;

	strcpy(preimage, t->initial_preimage);
	len = strlen(preimage);

	while (len <= t->goal->max_len)
	{
		MD5((const unsigned char*)preimage, len, hashed);
		if (memcmp(hashed, t->goal->hash, MD5_DIGEST_LENGTH) == 0)
		{
			/* success -- tell others to call it off */
			pthread_rwlock_wrlock(&t->goal->lock);
			t->goal->success = true;
			strcpy(t->goal->preimage, preimage);
			pthread_rwlock_unlock(&t->goal->lock);

			return NULL;
		}
		/* each worker jumps ahead n_workers words, and all workers
		   started at an offset, so all words are covered */
		changed = word_advance(preimage, t->goal->n_workers);
		len = MAX(len, changed);

		/* check if another worker has succeeded, but only every
		   thousandth iteration, since taking the lock adds overhead */
		if (iterations++ % 1000 == 0)
		{
			/* in the overwhelming majority of cases workers only read,
			   so an rwlock allows them to continue in parallel */
			pthread_rwlock_rdlock(&t->goal->lock);
			int success = t->goal->success;
			pthread_rwlock_unlock(&t->goal->lock);
			if (success)
				return NULL;
		}
	}
	return NULL;
}

/* launch a parallel search for an md5 preimage */
bool crack(const unsigned char *md5, size_t max_len,
           unsigned threads, char *result)
{
	struct goal g =
	{
		.workers   = malloc(threads * sizeof(pthread_t)),
		.n_workers = threads,
		.max_len   = max_len,
		.lock      = PTHREAD_RWLOCK_INITIALIZER,
		.success   = false,
	};
	memcpy(g.hash, md5, MD5_DIGEST_LENGTH);

	struct task *tasks = malloc(threads * sizeof(struct task));

	for (size_t i = 0; i < threads; i++)
	{
		tasks[i].goal = &g;
		tasks[i].initial_preimage[0] = '\0';
		/* offset the starting word for each worker by i */
		word_advance(tasks[i].initial_preimage, i);
		pthread_create(g.workers+i, NULL, crack_thread, tasks+i);
	}

	/* if one worker finds the answer, others will abort */
	for (size_t i = 0; i < threads; i++)
		pthread_join(g.workers[i], NULL);

	if (g.success)
		strcpy(result, g.preimage);

	free(g.workers);
	free(tasks);
	return g.success;
}

int main(int argc, char **argv)
{
	char preimage[LONGEST_PREIMAGE];
	int max_len = 4;
	unsigned char md5[MD5_DIGEST_LENGTH];

	if (argc != 2 && argc != 3)
	{
		fprintf(stderr,
		        "Usage: %s md5-string [search-depth]\n",
		        argv[0]);
		return EXIT_FAILURE;
	}

	if (!hex2md5(argv[1], md5))
	{
		fprintf(stderr,
		       "Could not parse as md5: %s\n", argv[1]);
		return EXIT_FAILURE;
	}

	if (argc > 2 && strtol(argv[2], NULL, 10))
	{
		if ((max_len = strtol(argv[2], NULL, 10)) > LONGEST_PREIMAGE)
		{
			fprintf(stderr,
					"Preimages limited to %d characters\n",
					LONGEST_PREIMAGE);
			return EXIT_FAILURE;
		}
	}

	if (crack(md5, max_len, 4, preimage))
	{
		puts(preimage);
		return EXIT_SUCCESS;
	}

	fprintf(stderr,
			"Could not find result in strings up to length %d\n",
			max_len);
	return EXIT_FAILURE;
}

Although read-write locks can be implemented in terms of mutexes and condition variables, such implementations are significantly less efficient than is possible. Therefore, this synchronization primitive is included in POSIX.1-2008 for the purpose of allowing more efficient implementations in multi-processor systems.

The final thing to be aware of is that an rwlock implementation can choose either reader-preference or writer-preference. When readers and writers are contending for a lock, the preference determines who gets to skip the queue and go first. When there is a lot of reader activity with a reader-preference, then a writer will continually get moved to the end of the line and experience starvation, where it never gets to write. I noticed writer starvation on Linux (glibc) when running four threads on a little 1-core virtual machine. Glibc provides the nonportable pthread_rwlockattr_setkind_np() function to specify a preference.

You may have noticed that workers in our password cracker use polling to see whether the solution has been found, and whether they should give up. We’ll examine a more explicit method of cancellation in a later section.


Semaphores

Semaphores keep count of, in the abstract, an amount of resource “units” available. Threads can safely add or remove a unit without causing a data race. When a thread requests a unit but there are none, it blocks.

A semaphore is like a mix between a lock and a condition variable. Unlike mutexes, semaphores have no concept of an owner. Any thread may release threads blocked on a semaphore, whereas with a mutex the lock holder must unlock it. Unlike a condition variable, a semaphore operates independently of a predicate.

An example of a problem uniquely suited for semaphores would be to ensure that exactly two threads run at once on a task. You would initialize the semaphore to the value two, and allow a bunch of threads to wait on the semaphore. After two get past, the rest will block. When each thread is done, it posts one unit back to the semaphore, which allows another thread to take its place.

In reality, if you’ve got pthreads, you only need semaphores for asynchronous signal handlers. You can use them in other situations, but this is the only place they are needed. Mutexes aren’t async signal safe. Making them so would be much slower than an implementation that isn’t async signal safe, and would slow down ordinary mutex operation.

Here’s an example of posting a semaphore from a signal handler:

/* sem_tickler.c */

#include <semaphore.h>
#include <signal.h>
#include <stdio.h>

#include <unistd.h>

#if !defined(_POSIX_SEMAPHORES) || _POSIX_SEMAPHORES < 0
#error your OS lacks POSIX semaphore support
#endif

sem_t tickler;

void int_catch(int sig)
{
	(void) sig;

	signal(SIGINT, &int_catch);
	sem_post(&tickler); /* async signal safe */
}

int main(void)
{
	sem_init(&tickler, 0, 0);
	signal(SIGINT, &int_catch);

	for (int i = 0; i < 3; i++)
	{
		sem_wait(&tickler);
		puts("That tickles!");
	}
	puts("(Died from overtickling)");
	return 0;
}

Semaphores aren’t even necessary for proper signal handling. It’s easier to have a thread simply sigwait() than it is to set up an asynchronous handler. In the example below, the main thread waits, but you can spawn a dedicated thread for this in a real application.

/* sigwait_tickler.c */

#include <signal.h>
#include <stdio.h>

int main(void)
{
	sigset_t set;
	int which;

	sigemptyset(&set);
	sigaddset(&set, SIGINT);
	/* block SIGINT so it's delivered only through sigwait() */
	sigprocmask(SIG_BLOCK, &set, NULL);

	for (int i = 0; i < 3; i++)
	{
		sigwait(&set, &which);
		puts("That tickles!");
	}
	puts("(Died from overtickling)");
	return 0;
}

So don’t feel dependent on semaphores. In fact your system may not have them. The POSIX semaphore API works with pthreads and is present in POSIX.1-2008, but is an optional part of POSIX.1b in earlier versions. Apple, for one, decided to punt, so the semaphore functions on macOS are stubbed to return error codes.


Cancellation

Thread cancellation is generally used when you have threads doing long-running tasks and there’s a way for a user to abort through the UI or console. Another common scenario is when multiple threads set off to explore a search space and one finds the answer first.

Our previous reader-writer lock example was the second scenario, where the threads explored a search space. It was an example of do-it-yourself cancellation through polling. However sometimes threads aren’t able to poll, such as when they are blocked on I/O or a lock. Pthreads offers an API to cancel threads even in those situations.

By default a cancelled thread isn’t immediately blown away, because it may have a mutex locked, be holding resources, or have a potentially broken invariant. The canceller wouldn’t know how to repair that invariant without some complicated logic. The thread to be canceled needs to be written to do cleanup and unlock mutexes.

For each thread, cancellation can be enabled or disabled, and if enabled, may be in deferred or asynchronous mode. The default is enabled and deferred, which allows a cancelled thread to survive until it reaches the next cancellation point, such as waiting on a condition variable or blocking on I/O (see the full list). In a purely computational section of code you can add your own cancellation points with pthread_testcancel().

Let’s see how to modify our previous MD5 cracking example using standard pthread cancellation. Three of the functions are the same as before: word_advance(), hex2md5(), and main(). But we now use a condition variable to alert crack() whenever a crack_thread() returns.

/* 5dm-testcancel.c */

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <openssl/md5.h>
#include <pthread.h>

#define ASCII_FIRST ' '
#define ASCII_LAST  '~'
#define N_ALPHA     (ASCII_LAST - ASCII_FIRST + 1)
#define LONGEST_PREIMAGE 50

#define MAX(x,y) ((x)<(y) ? (y) : (x))

unsigned word_advance(char *word, unsigned delta)
{
	if (delta == 0)
		return 0;
	if (*word == '\0')
	{
		*word++ = ASCII_FIRST + delta - 1;
		*word = '\0';
	}
	else
	{
		char c = *word - ASCII_FIRST;
		*word = ASCII_FIRST + ((c + delta) % N_ALPHA);
		if (c + delta >= N_ALPHA)
			return 1 + word_advance(word+1, 1 /* not delta */);
	}
	return 1;
}

bool hex2md5(const char *hex, unsigned char *b)
{
	int offset = 0;
	if (strlen(hex) != MD5_DIGEST_LENGTH*2)
		return false;
	while (offset < MD5_DIGEST_LENGTH*2)
	{
		if (sscanf(hex+offset, "%2hhx", b++) == 1)
			offset += 2;
		else
			return false;
	}
	return true;
}

struct goal
{
	/* input */
	pthread_t *workers;
	size_t n_workers;
	size_t max_len;
	unsigned char hash[MD5_DIGEST_LENGTH];

	/* output */
	pthread_mutex_t lock;
	pthread_cond_t returning;
	unsigned n_done;
	char preimage[LONGEST_PREIMAGE];
	bool success;
};

struct task
{
	struct goal *goal;
	char initial_preimage[LONGEST_PREIMAGE];
};

void *crack_thread(void *arg)
{
	struct task *t = arg;
	unsigned len, changed;
	unsigned char hashed[MD5_DIGEST_LENGTH];
	char preimage[LONGEST_PREIMAGE];
	int iterations = 0;

	strcpy(preimage, t->initial_preimage);
	len = strlen(preimage);

	while (len <= t->goal->max_len)
	{
		MD5((const unsigned char*)preimage, len, hashed);
		if (memcmp(hashed, t->goal->hash, MD5_DIGEST_LENGTH) == 0)
		{
			pthread_mutex_lock(&t->goal->lock);
			t->goal->success = true;
			strcpy(t->goal->preimage, preimage);
			t->goal->n_done++;
			pthread_mutex_unlock(&t->goal->lock);

			/* alert the boss that another worker is done */
			pthread_cond_signal(&t->goal->returning);
			return NULL;
		}
		changed = word_advance(preimage, t->goal->n_workers);
		len = MAX(len, changed);

		if (iterations++ % 1000 == 0)
			pthread_testcancel(); /* add a cancellation point */
	}

	pthread_mutex_lock(&t->goal->lock);
	t->goal->n_done++;
	pthread_mutex_unlock(&t->goal->lock);

	/* alert the boss that another worker is done */
	pthread_cond_signal(&t->goal->returning);
	return NULL;
}

/* cancellation cleanup function that we also call
 * during regular exit from the crack() function */
void crack_cleanup(void *arg)
{
	struct task *tasks = arg;
	struct goal *g = tasks[0].goal;

	/* this mutex unlock pairs with the lock in the crack() function */
	pthread_mutex_unlock(&g->lock);

	for (size_t i = 0; i < g->n_workers; i++)
	{
		pthread_cancel(g->workers[i]);
		/* must wait for each to terminate, so that freeing
		 * their shared memory is safe */
		pthread_join(g->workers[i], NULL);
	}
	/* now it's safe to free memory */
	free(g->workers);
	free(tasks);
}

bool crack(const unsigned char *md5, size_t max_len,
           unsigned threads, char *result)
{
	struct goal g =
	{
		.workers   = malloc(threads * sizeof(pthread_t)),
		.n_workers = threads,
		.max_len   = max_len,
		.lock      = PTHREAD_MUTEX_INITIALIZER,
		.returning = PTHREAD_COND_INITIALIZER,
		.success   = false,
		.n_done    = 0,
	};
	memcpy(g.hash, md5, MD5_DIGEST_LENGTH);

	struct task *tasks = malloc(threads * sizeof(struct task));

	for (size_t i = 0; i < threads; i++)
	{
		tasks[i].goal = &g;
		tasks[i].initial_preimage[0] = '\0';
		word_advance(tasks[i].initial_preimage, i);
		pthread_create(g.workers+i, NULL, crack_thread, tasks+i);
	}

	pthread_mutex_lock(&g.lock);

	/* coming up to cancellation points, so establish
	 * a cleanup handler */
	pthread_cleanup_push(crack_cleanup, tasks);

	/* We can't join() on all the workers now because it's up to
	 * us to cancel them after one finds the answer. We have to
	 * remain responsive and not block on any particular worker */
	while (!g.success && g.n_done < threads)
		pthread_cond_wait(&g.returning, &g.lock);
	/* at this point either a thread succeeded or all have given up */
	if (g.success)
		strcpy(result, g.preimage);
	/* mutex unlocked in the cleanup handler */

	/* Use the same cleanup handler for normal exit too. The "1"
	 * argument says to execute the function we had previously pushed */
	pthread_cleanup_pop(1);
	return g.success;
}

int main(int argc, char **argv)
{
	char preimage[LONGEST_PREIMAGE];
	int max_len = 4;
	unsigned char md5[MD5_DIGEST_LENGTH];

	if (argc != 2 && argc != 3)
	{
		fprintf(stderr,
		        "Usage: %s md5-string [search-depth]\n",
		        argv[0]);
		return EXIT_FAILURE;
	}

	if (!hex2md5(argv[1], md5))
	{
		fprintf(stderr,
		       "Could not parse as md5: %s\n", argv[1]);
		return EXIT_FAILURE;
	}

	if (argc > 2 && strtol(argv[2], NULL, 10))
	{
		if ((max_len = strtol(argv[2], NULL, 10)) > LONGEST_PREIMAGE)
		{
			fprintf(stderr,
					"Preimages limited to %d characters\n",
					LONGEST_PREIMAGE);
			return EXIT_FAILURE;
		}
	}

	if (crack(md5, max_len, 4, preimage))
	{
		puts(preimage);
		return EXIT_SUCCESS;
	}

	fprintf(stderr,
			"Could not find result in strings up to length %d\n",
			max_len);
	return EXIT_FAILURE;
}

Using cancellation is actually a little more flexible than our rwlock implementation in 5dm. If the crack() function is running in its own thread, the whole thing can now be cancelled. The cancellation handler will “pass along” the cancellation to each of the worker threads.

Writing general purpose library code that works with threads requires some care. It should handle deferred cancellation gracefully, including disabling cancellation when appropriate and always using cleanup handlers.

For cleanup handlers, notice the pattern of how we pthread_cleanup_push() the cancellation handler, and later pthread_cleanup_pop() it for regular (non-cancel) cleanup too. Using the same cleanup procedure in all situations makes the code more reliable.

Also notice how the boss thread now cancels workers, rather than the winning worker cancelling the others. You can join a canceled thread, but you can’t cancel an already joined (or detached) thread. If you want to both cancel and join a thread it ought to be done in one place.

Let’s turn our attention to the new worker threads. They are still polling for cancellation, as they did with the reader-writer locks, but now they do it with a new function:

if (iterations++ % 1000 == 0)
	pthread_testcancel();
Admittedly polling every thousandth iteration adds a little overhead, both with the rwlock and with testcancel. It also adds latency between the cancellation request and the thread quitting, since the loop can run up to 999 more times in between. A more efficient but dangerous method is to enable asynchronous cancellation, meaning the thread dies immediately when cancelled.

Async cancellation is dangerous because code is seldom async-cancel-safe. Anything that uses locks or works with shared state even slightly can break badly. Async-cancel-safe code can call very few functions, since those functions may not be safe. This includes calling libraries that use something as innocent as malloc(), since stopping malloc part way through could corrupt the heap.

Our crack_thread() function should be async-cancel-safe, at least during its calculation and not when taking locks. The MD5() function from OpenSSL also appears to be safe. Here’s how we can rewrite our function (notice how we disable cancellation before taking a lock):

/* rewritten to use async cancellation */

void *crack_thread(void *arg)
{
	struct task *t = arg;
	unsigned len, changed;
	unsigned char hashed[MD5_DIGEST_LENGTH];
	char preimage[LONGEST_PREIMAGE];
	int cancel_type, cancel_state;

	strcpy(preimage, t->initial_preimage);
	len = strlen(preimage);

	/* async so we don't have to pthread_testcancel() */
	pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &cancel_type);

	while (len <= t->goal->max_len)
	{
		MD5((const unsigned char*)preimage, len, hashed);
		if (memcmp(hashed, t->goal->hash, MD5_DIGEST_LENGTH) == 0)
		{
			/* protect the mutex against async cancellation */
			pthread_setcancelstate(
					PTHREAD_CANCEL_DISABLE, &cancel_state);

			pthread_mutex_lock(&t->goal->lock);
			t->goal->success = true;
			strcpy(t->goal->preimage, preimage);
			t->goal->n_done++;
			pthread_mutex_unlock(&t->goal->lock);

			pthread_cond_signal(&t->goal->returning);
			return NULL;
		}
		changed = word_advance(preimage, t->goal->n_workers);
		len = MAX(len, changed);
	}

	/* restore original cancellation type */
	pthread_setcanceltype(cancel_type, &cancel_type);

	pthread_mutex_lock(&t->goal->lock);
	t->goal->n_done++;
	pthread_mutex_unlock(&t->goal->lock);
	pthread_cond_signal(&t->goal->returning);

	return NULL;
}

Asynchronous cancellation does not appear to work on macOS, but as we’ve seen that’s par for the course on that operating system.

Development tools

Valgrind DRD and helgrind

DRD and Helgrind are Valgrind tools for detecting errors in multithreaded C and C++ programs. The tools work for any program that uses the POSIX threading primitives or that uses threading concepts built on top of the POSIX threading primitives.

The tools have overlapping abilities like detecting data races and improper use of the pthreads API. Additionally, Helgrind can detect locking hierarchy violations, and DRD can alert when there is lock contention.

Both tools pinpoint the lines of code where problems arise. For example, we can run DRD on our first crazy bankers program:

valgrind --tool=drd ./banker

Here is a characteristic example of an error it emits:

==8524== Thread 3:
==8524== Conflicting load by thread 3 at 0x003090b0 size 8
==8524==    at 0x1088BD: disburse (banker.c:48)
==8524==    by 0x4C324F3: vgDrd_thread_wrapper (drd_pthread_intercepts.c:444)
==8524==    by 0x4E514A3: start_thread (pthread_create.c:456)
==8524== Allocation context: BSS section of /home/admin/banker
==8524== Other segment start (thread 2)
==8524==    at 0x514FD01: clone (clone.S:80)
==8524== Other segment end (thread 2)
==8524==    at 0x509D820: rand (rand.c:26)
==8524==    by 0x108857: rand_range (banker.c:26)
==8524==    by 0x1088A0: disburse (banker.c:42)
==8524==    by 0x4C324F3: vgDrd_thread_wrapper (drd_pthread_intercepts.c:444)
==8524==    by 0x4E514A3: start_thread (pthread_create.c:456)

It finds conflicting loads and stores from lines 48, 51, and 52.

48: if (accts[from].balance > 0)
49: {
50:		payment = 1 + rand_range(accts[from].balance);
51:		accts[from].balance -= payment;
52:		accts[to].balance   += payment;
53: }

Helgrind can identify the lock hierarchy violation in our example of deadlocking bankers:

valgrind --tool=helgrind ./banker_lock
==8989== Thread #4: lock order "0x3091F8 before 0x3090D8" violated
==8989== Observed (incorrect) order is: acquisition of lock at 0x3090D8
==8989==    at 0x4C3010C: mutex_lock_WRK (hg_intercepts.c:904)
==8989==    by 0x1089B9: disburse (banker_lock.c:38)
==8989==    by 0x4C32D06: mythread_wrapper (hg_intercepts.c:389)
==8989==    by 0x4E454A3: start_thread (pthread_create.c:456)
==8989==  followed by a later acquisition of lock at 0x3091F8
==8989==    at 0x4C3010C: mutex_lock_WRK (hg_intercepts.c:904)
==8989==    by 0x1089D1: disburse (banker_lock.c:39)
==8989==    by 0x4C32D06: mythread_wrapper (hg_intercepts.c:389)
==8989==    by 0x4E454A3: start_thread (pthread_create.c:456)

To identify when there is too much contention for a lock, we can ask DRD to alert us when a thread blocks for more than n milliseconds on a mutex:

valgrind --tool=drd --exclusive-threshold=2 ./banker_lock_hierarchy

Since we throw too many threads at a small number of accounts, we see wait times that cross the threshold, like this one that waited seven ms:

==7565== Acquired at:
==7565==    at 0x483F428: pthread_mutex_lock_intercept (drd_pthread_intercepts.c:888)
==7565==    by 0x483F428: pthread_mutex_lock (drd_pthread_intercepts.c:898)
==7565==    by 0x109280: disburse (banker_lock_hierarchy.c:40)
==7565==    by 0x483C114: vgDrd_thread_wrapper (drd_pthread_intercepts.c:444)
==7565==    by 0x4863FA2: start_thread (pthread_create.c:486)
==7565==    by 0x49764CE: clone (clone.S:95)
==7565== Lock on mutex 0x10c258 was held during 7 ms (threshold: 2 ms).
==7565==    at 0x4840478: pthread_mutex_unlock_intercept (drd_pthread_intercepts.c:978)
==7565==    by 0x4840478: pthread_mutex_unlock (drd_pthread_intercepts.c:991)
==7565==    by 0x109395: disburse (banker_lock_hierarchy.c:47)
==7565==    by 0x483C114: vgDrd_thread_wrapper (drd_pthread_intercepts.c:444)
==7565==    by 0x4863FA2: start_thread (pthread_create.c:486)
==7565==    by 0x49764CE: clone (clone.S:95)
==7565== mutex 0x10c258 was first observed at:
==7565==    at 0x483F368: pthread_mutex_lock_intercept (drd_pthread_intercepts.c:885)
==7565==    by 0x483F368: pthread_mutex_lock (drd_pthread_intercepts.c:898)
==7565==    by 0x109280: disburse (banker_lock_hierarchy.c:40)
==7565==    by 0x483C114: vgDrd_thread_wrapper (drd_pthread_intercepts.c:444)
==7565==    by 0x4863FA2: start_thread (pthread_create.c:486)
==7565==    by 0x49764CE: clone (clone.S:95)

Clang ThreadSanitizer (TSan)

ThreadSanitizer is a clang instrumentation module. To use it, choose CC = clang and add -fsanitize=thread to CFLAGS. Then when you build programs, they will be modified to detect data races and print statistics to stderr.

Here’s a portion of the output when running the bankers program:

WARNING: ThreadSanitizer: data race (pid=11312)
  Read of size 8 at 0x0000014aeeb0 by thread T2:
    #0 disburse /home/admin/banker.c:48 (banker+0x0000004a4372)

  Previous write of size 8 at 0x0000014aeeb0 by thread T1:
    #0 disburse /home/admin/banker.c:52 (banker+0x0000004a43ba)

TSan can also detect lock hierarchy violations, such as in banker_lock:

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=10095)
  Cycle in lock order graph: M1 (0x0000014aef78) => M2 (0x0000014aeeb8) => M1

  Mutex M2 acquired here while holding mutex M1 in thread T1:
    #0 pthread_mutex_lock <null> (banker_lock+0x000000439a10)
    #1 disburse /home/admin/banker_lock.c:39 (banker_lock+0x0000004a4398)

    Hint: use TSAN_OPTIONS=second_deadlock_stack=1 to get more informative warning message

  Mutex M1 acquired here while holding mutex M2 in thread T9:
    #0 pthread_mutex_lock <null> (banker_lock+0x000000439a10)
    #1 disburse /home/admin/banker_lock.c:39 (banker_lock+0x0000004a4398)


plockstat and mutrace

While Valgrind DRD can identify highly contended locks, it virtualizes the execution of the program under test, and skews the numbers. Other utilities use software probes to get this information from a test running at full speed. In BSD land there is the plockstat provider for DTrace, and on Linux there is the specially-written mutrace. I had a lot of trouble trying to get plockstat to work on FreeBSD, so here’s an example of using mutrace to analyze our banker program.

mutrace ./banker_lock_hierarchy
mutrace: Showing 10 most contended mutexes:

 Mutex #   Locked  Changed    Cont. tot.Time[ms] avg.Time[ms] max.Time[ms]  Flags
       0   200211   153664    95985      991.349        0.005        0.267 M-.--.
       1   200552   142173    61902      641.963        0.003        0.170 M-.--.
       2   199657   140837    47723      476.737        0.002        0.125 M-.--.
       3   199566   140863    39268      371.451        0.002        0.108 M-.--.
       4   199936   141381    33243      295.909        0.001        0.090 M-.--.
       5   199548   141297    28193      232.647        0.001        0.084 M-.--.
       6   200329   142027    24230      183.301        0.001        0.066 M-.--.
       7   199951   142338    21018      142.494        0.001        0.057 M-.--.
       8   200145   142990    18201      107.692        0.001        0.052 M-.--.
       9   200105   143794    15713       76.231        0.000        0.028 M-.--.
          Object:                                     M = Mutex, W = RWLock /||||
           State:                                 x = dead, ! = inconsistent /|||
             Use:                                 R = used in realtime thread /||
      Mutex Type:                 r = RECURSIVE, e = ERRRORCHECK, a = ADAPTIVE /|
  Mutex Protocol:                                      i = INHERIT, p = PROTECT /

mutrace: Note that the flags column R is only valid in --track-rt mode!

mutrace: Total runtime is 1896.903 ms.

mutrace: Results for SMP with 4 processors.

Off-CPU profiling

Typical profilers measure the amount of CPU time spent in each function. However when a thread is blocked by I/O, a lock, or a condition variable, then it isn’t using CPU time. To determine where functions spend the most “wall clock time,” we need to sample the call stack for all threads at intervals, and count how frequently we see each entry. When a thread is off-CPU its call stack stays unchanged.

The pstack program is traditionally the way to get a snapshot of a running program’s stack. It exists on old Unices, and used to be on Linux until Linux made a breaking change. The most portable way to get stack snapshots is using gdb with an awk wrapper, as documented in the Poor Man’s Profiler.

Remember our early condition variable example that measured how many threads entered the critical section in disburse() at once? We asked whether synchronization on stats_mtx threw off the measurement. With off-CPU profiling we can look for clues.

Here’s a script based on the Poor Man’s Profiler:

./banker_stats &
pid=$!

while kill -0 $pid 2>/dev/null
  do
    gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p $pid
  done | \
awk '
  BEGIN { s = ""; }
  /^Thread/ { print s; s = ""; }
  /^\#/ { if (s != "" ) { s = s "," $4} else { s = $4 } }
  END { print s }' | \
sort | uniq -c | sort -r -n -k 1,1

It outputs limited information, but we can see that waiting for locks in disburse() takes the majority of program time, appearing in 872 of our samples. By contrast, waiting for the stats_mtx lock in stats_update() doesn’t appear in our samples at all. It must have had very little effect on our parallelism.

    872 at,__GI___pthread_mutex_lock,disburse,start_thread,clone
     11 at,__random,rand,rand_range,disburse,start_thread,clone
      9 expected=0,,mutex=0x562533c3f0c0,<stats_cnd>,,stats_print,start_thread,clone
      9 __GI___pthread_timedjoin_ex,main
      5 at,__pthread_mutex_unlock_usercnt,disburse,start_thread,clone
      1 at,__pthread_mutex_unlock_usercnt,stats_change,disburse,start_thread,clone
      1 at,__GI___pthread_mutex_lock,stats_change,disburse,start_thread,clone
      1 __random,rand,rand_range,disburse,start_thread,clone

macOS Instruments

Although Mac’s POSIX thread support is pretty weak, its XCode tooling does include a nice profiler. From the Instruments application, choose the profiling template called “System Trace.” It adds a GUI on top of DTrace to display thread states (among other things). I modified our banker program to use only five threads and recorded its run. The Instruments app visualizes every event that happens, including threads blocking and being interrupted:

[Screenshot: thread states in Instruments]

Within Instruments you can zoom into the timeline and hover over events for more information.

perf c2c

Perf is a Linux tool to measure hardware performance counters during the execution of a program. Joe Mario created a Perf feature called c2c which detects false sharing of variables between CPUs.

In a NUMA multi-core computer, each CPU has its own set of caches, and all CPUs share main memory. Memory is divided into fixed-size blocks (often 64 bytes) called cache lines. Any time a CPU reads or writes memory, it must fetch or store the entire cache line surrounding the desired address. If one CPU has already cached a line, and another CPU writes to that area of memory, the system has to perform an expensive operation to keep the caches coherent.

When two unrelated variables in a program are stored close enough together in memory to share a cache line, it can cause a performance problem in multi-threaded programs. If threads running on separate CPUs access the unrelated variables, it causes a tug of war over the underlying cache line, which is called false sharing.

For instance, our Game of Life simulator could potentially have false sharing at the edges of each section of the board accessed by each thread. To verify this, I attempted to run perf c2c on an Amazon EC2 instance (since I lack a physical computer running Linux), but got an error that memory events are not supported on the virtual machine. I was running kernel 4.19.0 on Intel Xeon Platinum 8124M CPUs, so I assume this was a security restriction from Amazon.

If you are able to run c2c and detect false sharing in a multi-threaded program, the solution is to align the variables more aggressively. POSIX provides the posix_memalign() function to allocate memory aligned on a desired boundary. In our Life example, we could have used an array of pointers to dynamically allocated rows rather than a contiguous two-dimensional array.

Intel VTune Profiler

The VTune Profiler is available for free (with registration) on Linux, macOS, and Windows. It works only on x86 hardware, of course. I haven’t used it, but their marketing page shows some nice pictures. The tool can visually identify the granularity of locks, present a prioritized list of synchronization objects that hurt performance, and visualize lock contention.

Further reading

March 22, 2020

Derek Jones (derek-jones)

Coronavirus: a silver lining for evidence-based software engineering? March 22, 2020 09:39 PM

People rarely measure things in software engineering, and when they do they rarely hang onto the measurements; this might also be true in many other work disciplines.

When I worked on optimizing compilers, I used to spend time comparing code size and performance. It surprised me that many others in the field did not; they seemed to think that if they implemented an optimization, things would get better and that was that. Customers would buy optimizers without knowing how long their programs took to do a task; they just wanted things to go faster, and they had some money to spend on buying stuff to make them feel that things had gotten faster. I quickly learned to stop asking too many questions, like “how fast does your code currently run?” or “how fast would you like it to run?” Sell them something to make them feel better; don’t spoil things by pointing out that their code might already be fast enough.

In one very embarrassing incident, the potential customer was interested in measuring performance, and my optimizer made their important program go slower! As best I could tell, the existing code only just fitted in memory, and optimizing for performance made it larger; the system started thrashing and went a lot slower.

What questions did potential customers ask? They usually asked whether particular optimizations were implemented (because they had read about them someplace). Some of these optimizations were likely to make very little difference to performance, but they were easy to understand and short enough to write articles about. And, yes, I always made sure to implement these ‘minor’ optimizations, purely to keep customers happy (and increase the chances of making a sale).

Now I work on evidence-based software engineering, and developers rarely measure things, and when they do they rarely hang onto the measurements. So many people have said I could have their data, if they had it!

Will the Coronavirus change things? With everybody working from home, management won’t be able to walk up to developers and ask what they have been doing. Perhaps stuff will start getting recorded more often, and some of it might be kept.

A year from now it might be a lot easier to find information about what developers do. I will let you know next year.

Marc Brooker (mjb)

Two Years With Rust March 22, 2020 12:00 AM

Two Years With Rust

I like it. I hope it's going to be big.

It's been just over two years since I started learning Rust. Since then, I've used it heavily at my day job, including work in the Firecracker code base, and a number of other projects. Rust is a great fit for the systems-level work I've been doing over the last few years: often performance- and density-sensitive, always security-sensitive. I find the type system, object life cycle, and threading model both well-suited to this kind of work and fairly intuitive. Like most people, I still fight with the compiler from time to time, but we mostly get on now.

Rust has also mostly replaced Go as my go-to language for writing small performance-sensitive programs, like the numerical simulators I use a lot. Go replaced C in that role for me, and joined R and Python as my day-to-day go-to tools. I've found that I still spend more time writing a program in Rust than I would in Go, and more than in C (except where C is held back by its lack of sane data structures and string handling). I've also found that my programs seem more likely to work on their first run, but I haven't made any effort to quantify that.

Over my career, I've done for-pay work in C, C++, Java, Python, Ruby, Go, Rust, Scheme, Basic, Perl, Bash, TLA+, Delphi, Matlab, ARM and x86 assembly, and R (probably forgetting a few). There's likely some of my code in each of those languages still running somewhere. I've also learned a bunch of other languages, because it's something I enjoy doing. Recently, for example, I've been loving playing with Frink. I don't tend to be highly opinionated about languages.

However, in some cases I steer colleagues and teams away from particular choices. C and C++, for example, seem to be difficult and expensive to use in a way that avoids dangerous memory-safety bugs, and users need to be willing to invest deeply in their code if these bugs matter to them. It's possible to write great safe C, but the path there requires a challenging blend of tools and humility. Rust isn't a panacea, but it is a really nice alternative in a space where the options were fairly thin before. I find myself recommending and choosing it more and more often for small command-line programs, high-performance services, and system-level code.

Why I like Rust There are a lot of good programming languages in the world. There are even several that fit Rust's broad description and occupy its place in the ecosystem. This is a very good place to be, with real problems to solve. I'm not convinced that Rust is necessarily technically superior to its nearest neighbors, but there are some things it seems to do particularly well.

I like how friendly and helpful the compiler's error messages are. The free book and standard library documentation are all very good. The type system is nice to work with. The built-in tooling (rustup, cargo, and friends) is easy to use and powerful. A standard formatting tool goes a long way to keeping code-bases tidy and bikesheds unpainted. Static linking and cross-compiling are built-in. The smattering of functional idioms seems to add a good amount of power and expressiveness. Features that actively lead to obtuse code (like macros) are discouraged. Out-of-the-box performance is pretty great. Fearless Concurrency actually delivers.

There's a lot more, too.

What might make Rust unsuccessful? There are also some things I don't particularly like about Rust. Some of those are short-term. Learning how to write async networking code in Rust during the year or so before async and await were stabilized was a frustrating mess of inconsistent documentation and broken APIs. The compiler isn't as smart about optimizations like loop unrolling and autovectorization as C compilers tend to be (even where it does a great job eliding the safety checks, and other Rust-specific overhead). Some parts of the specification, like aliasing rules and the exact definitions of atomic memory orderings, are still a little fuzzier than I would like. Static analysis tooling has a way to go. Allocating aligned memory is tricky, especially if you still want to use some of the standard data structures. And so on.

In each of these cases, and more like them, the situation seems to have improved every time I look at it in detail. The community seems to be making great progress. async and await were particularly big wins.

The biggest long-term issue in my mind is unsafe. Rust makes what seems like a very reasonable decision to allow sections of code to be marked as unsafe, which allows one to color outside the lines of the memory and life cycle guarantees. As the name implies, unsafe code tends to be unsafe. The big problem with unsafe code isn't that the code inside the block is unsafe; it's that it can break the safety properties of safe code in subtle and non-obvious ways, even safe code that's thousands of lines away. This kind of action-at-a-distance can make it difficult to reason about the properties of any code base that contains unsafe code. For low-level systems code, that's probably all of them.

This isn't a surprise to the community. The Rust community is very realistic about the costs and benefits of unsafe. Sometimes that debate goes too far (as Steve Klabnik has written about), but mostly the debate and culture seems healthy to me as a relative outsider.

The problem is that this spooky behavior of unsafe tends not to be obvious to new Rust programmers. The mental model I've seen nearly everybody start with, including myself, is that unsafe blocks can break things inside them, and so care needs to be taken with the code written there. Unfortunately, that's not sufficient.

Better static and dynamic analysis tooling could help here, as well as some better help from the compiler, and alternatives to some uses of unsafe. I suspect that the long-term success of Rust as a systems language is going to depend on how well the community and tools handle unsafe. A lot of the value of Rust lies in its safety, and it's still too easy to break that safety without knowing it.

Another long-term risk is the size of the language. It's been over 10 years since I last worked with C++ every day, and I'm nowhere near being a competent C++ programmer anymore. Part of that is because C++ has evolved, which is a very good thing. Part of it is because C++ is huge. From a decade away, it seems hard to be a competent part-time C++ programmer: you need to be fully immersed, or you'll never fit the whole thing in your head. Rust could go that way too, and it would be a pity.