Tuesday, May 23, 2017

I've become worse, not better, at programming

Once in a while I like to take a peek at reddit's cscareerquestions and look at
the topics there. On a recent trip, I found an interesting question:
What makes someone a bad programmer?
In looking at the answers, I realized some of those apply to me now, but didn't
apply to me in the past. So what happened?
To begin with, my mainstay project is Lily, a programming language, and I've
been working on it for about 6 years now (give or take a couple months). Over
the years, the language has evolved but a majority of the work has been mine.
Solo developer projects are supposed to be awesome and consistent, right?

The symptoms

Closures

If I had to trace it back to something, it would be the implementation of
closures. Closures work by having the lowest-level closure function create a
pool of cells, and pass those to upward cells. The problem comes in needing to
make sure that functions are getting fresh values.
Suppose there's a var 'someval' that's closed over. One way to make sure that
the value is always up-to-date is that any read of 'someval' should be prefixed
by an access of the closure. Any assignment should be followed by a
write to the closure. This ensures that all accesses of 'someval' will have the
same actual result.
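A minimal sketch of that read/write rule, in C. This is not Lily's actual code: closure_t, frame_t, read_someval, and assign_someval are made-up names, and a real interpreter would do this with emitted opcodes rather than helper functions.

```c
#include <assert.h>
#include <stdint.h>

#define CELL_COUNT 4

/* Hypothetical model: one shared cell per closed-over variable. */
typedef struct {
    int64_t cells[CELL_COUNT];
} closure_t;

/* Each function has its own register copy of 'someval'. */
typedef struct {
    closure_t *closure;
    int64_t someval;
} frame_t;

/* Read: refresh the register from the closure cell before using it. */
static int64_t read_someval(frame_t *f, int cell)
{
    assert(cell >= 0 && cell < CELL_COUNT);
    f->someval = f->closure->cells[cell];
    return f->someval;
}

/* Assign: update the register, then write the value back to the cell. */
static void assign_someval(frame_t *f, int cell, int64_t value)
{
    assert(cell >= 0 && cell < CELL_COUNT);
    f->someval = value;
    f->closure->cells[cell] = value;
}
```

With two frames sharing one closure_t, an assign_someval through one frame is visible to the next read_someval through the other, which is the "same actual result" property described above.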
This is implemented as a transformation that walks through opcodes. When I first
wrote it, opcodes didn't have a consistent format, so I had to write custom
iteration code and it wasn't pretty. But it was good enough to work.
Nowadays I've normalized most of the opcodes, but there are still a few strange
outliers. The closure code takes jumps into account and supposedly fixes their
destinations.
Getting closures to work at all was a battle, and still today there are cases
where closures crash out the interpreter. I've resolved to fix those at some
point in the future, maybe.
I think difficulties in closures stem from how I'm inherently very bad at doing
math in my head. Often, extending the interpreter for a new opcode
involves prodding existing opcodes to see if I move by +3 or +4, and repeated
compiles with different adjustments to see if I can get it to work.
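For illustration only, here is how that "+3 or +4" stepping looks when the widths live in a table instead of in my head. The opcode names and widths below are invented for the example and are not Lily's real opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes; each is followed by a fixed number of operands. */
enum { OP_ASSIGN, OP_ADD, OP_JUMP, OP_RETURN, OP_COUNT };

/* Total slots per instruction: the opcode itself plus its operands. */
static const size_t op_width[OP_COUNT] = {
    [OP_ASSIGN] = 3,
    [OP_ADD]    = 4,
    [OP_JUMP]   = 2,
    [OP_RETURN] = 1,
};

/* Walk 'code' one instruction at a time and return how many were seen. */
static size_t count_instructions(const uint16_t *code, size_t len)
{
    size_t pos = 0;
    size_t count = 0;

    while (pos < len) {
        assert(code[pos] < OP_COUNT);
        pos += op_width[code[pos]];
        count++;
    }

    return count;
}
```

A walker like this only works when every opcode has a fixed width, which is exactly what the strange outliers break.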

Release cycles

The last release was in October of last year, roughly half a year ago. I try to
make releases once every 3 months since that's enough to get whatever I feel
like doing done. But this one has been an open window, and there's a 1.0 blocker
I should get to.

The roadmap

I never sat down to get an idea of where I want the project to go. This was
good at first, because I had some terrible ideas of what I wanted to do. But now
the issue tracker is barren compared to a few ideas that I have.
I've often used the excuse that nobody's around, so it doesn't matter. But
seeing a project in this state is likely to turn people away. Right now I'm
focusing on writing a new binding tool that I can later use for documentation
extraction so I can have nicer documentation.

The documentation

It's bad enough that I don't use it. Yet until recently (nearly 6 years in), I
never bothered to write a pretty doc generator. Right now, the pink and the
"it came from markdown" look make the documentation so bad that I just grep the
builtin package for keywords, or try something to see what happens.

The underlying causes

Fires everywhere

With an interpreter, there's a large enough playground that something always
needs critical attention. One fire is left to rage while another gets attention.
A better strategy would have been to never let these fires grow. Better testing
and more coverage would help. But some metrics are difficult to test for, which goes
back to me just being lazy.

Documentation, I guess

I like writing, so the core ended up with a whole lot of documentation. Some is
certainly out of date, because I've neglected to comb through it recently. In
other recent projects, I hardly comment at all since few others read my work.

Communication

I'm generally antisocial, so cutting away from email/etc. for a few days is
standard practice. I set up an IRC channel which I don't visit often, a discord
I'm off most of the time, and a subreddit that's empty. Most of my social media
comes through shitposting on reddit.
I also don't blog much.

Resigned

Somewhere along the line I resigned myself to "this is the way it is". I came to
accept that some areas aren't going to be that great because it's me alone. I
should have been better about spending time on a section before leaving it. But
if I do that, what do new people end up doing?
I wasn't always like this. I used to chart out how much memory the
interpreter was using. I used to have better test coverage.

I don't like this anymore

The worst part is that I used to have some measure of pride in what I did. Now,
well, something breaks and I fix it. Done. I don't bother blogging about much of
what I do since I don't find it exciting, and it's not new. I used to be
excited at seeing new features activate and now I hit a random crash once in a
while and I just sigh.
I keep pushing on because giving up would be such a massive loss of
work. Yet at the same time, I wonder if working on this project is making me a
worse coder, instead of a better one.

Saturday, April 1, 2017

Why I'm moving from Rust to C

Rust is a popular new language that I've been using for several years now. My project, a programming language called Lily, has been developed in Rust for the last 5 years. I chose Rust at the time, because I wanted a safer language to write Lily in.

Lily is a simple language: It's statically-typed, but interpreted. It's capable of being embedded or extended from either Rust or C. I wrote the parser, the lexer, the vm, and all that from scratch. Lily clocks in at roughly 15K lines of Rust. The core of Lily makes use of relatively few of Rust's features. It has no lambdas and few generics (and those that exist are simple), and it sticks to having values as simple Rust types.

After having developed Lily in Rust for several years, I've decided to transition away from Rust and into C. Changing the underlying language was moderately difficult given the size of Lily. However, after a number of considerations, I felt that it was the right choice to make, even if it is not the choice that many projects are currently making.


rustc is slow

Compilers are great at generating fast code, but it is difficult to perform many optimizations while having a fast turn-around. One of Lily's strengths is that it offers simple static typing, while parsing as fast as you'd expect a language like Python or Lua to. I have a series of roughly 300 tests, spread out over around 2000 lines of code. Some are feature tests, several are method tests, and there are a few benchmark/example scripts as well. A majority of the time currently spent in Travis tests is in compiling Rust.

I greatly enjoy that Rust has great debugging messages. However, I do wish that rustc were not so slow, because it prevents a quick full turnaround from compiling to running all tests. With the new C system, I can shut off -O2, get a full fresh build of Lily, and run all tests in under a couple of minutes (down from 15+ minute builds).

I'm tired of promises of rustc getting faster. Of "mir will make it better" or "the next version will be faster!" It's time to tackle this problem now, before Rust programmers as a group get used to a slow compiler and come to accept it as a fact of life (like Haskell programmers).


rust needs unions

Rust does not, as of yet, contain C-style unions. In transitioning Lily to C, I was able to cut 11% of the memory use of an average program by squeezing structures through utilizing unions in the representation of a Lily value, and throughout Lily's symbol table/parser/ast.
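To show the kind of squeeze involved, here is a small sketch comparing a value that stores every variant side by side with one that overlaps them in a union. The type and field names (fat_value, slim_value, val_tag) are invented for this example and are not Lily's real value layout, and the 11% figure above is not something this sketch reproduces.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { VAL_INTEGER, VAL_DOUBLE, VAL_STRING } val_tag;

/* Without a union: every value carries storage for every variant. */
typedef struct {
    val_tag tag;
    int64_t integer;
    double doubleval;
    const char *string;
} fat_value;

/* With a union: the variants share the same storage. */
typedef struct {
    val_tag tag;
    union {
        int64_t integer;
        double doubleval;
        const char *string;
    } as;
} slim_value;

/* Build an integer value; the tag records which union member is live. */
static slim_value make_integer(int64_t n)
{
    slim_value v;
    v.tag = VAL_INTEGER;
    v.as.integer = n;
    return v;
}

/* Read the integer back, checking the tag first. */
static int64_t get_integer(const slim_value *v)
{
    assert(v->tag == VAL_INTEGER);
    return v->as.integer;
}
```

The exact sizes depend on the platform and its padding rules, but sizeof(slim_value) comes out smaller than sizeof(fat_value) because only the largest variant needs room.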

I don't care that there's an issue to add unions to the language. I'm talking about right now: several years into the development of Rust, this feature is still missing.


linked lists are great

I'm told I should use arrays everywhere. Arrays are a poor choice for a programming language, where I may have one class, or a thousand classes in a file. Trying to use linked lists in Rust is painful. In C, it's natural to use them. All I need to do is fix a next/prev pointer and I'm good. No need to have a mutable Option of Cell of Refable of a Box of a simple pointer.
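The "fix a next/prev pointer" part looks roughly like this. It's a sketch with made-up names (class_entry, push_front, unlink_entry), not a copy of Lily's symbol table code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol table entry, linked to its neighbors. */
typedef struct class_entry {
    const char *name;
    struct class_entry *next;
    struct class_entry *prev;
} class_entry;

/* Link 'entry' at the front of the list and return the new head. */
static class_entry *push_front(class_entry *head, class_entry *entry)
{
    assert(entry != NULL);
    entry->prev = NULL;
    entry->next = head;
    if (head)
        head->prev = entry;
    return entry;
}

/* Unlink 'entry' from the list and return the (possibly new) head. */
static class_entry *unlink_entry(class_entry *head, class_entry *entry)
{
    assert(entry != NULL);
    if (entry->prev)
        entry->prev->next = entry->next;
    else
        head = entry->next;
    if (entry->next)
        entry->next->prev = entry->prev;
    entry->next = NULL;
    entry->prev = NULL;
    return head;
}
```

Nothing here allocates, so the entries can live wherever the parser already keeps them, whether there is one class or a thousand.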


the circlejerk

I don't want to be associated with a community whose members constantly drive by to projects they have no investment in to inform them of a supposedly superior message. Every time there is a post on Hacker News or Reddit about a vulnerability in C, the strike team comes out to mention a rewrite in Rust. Regardless of Rust's strengths, it's ugly, and it's also insulting to other developers.


rust's safety is overrated

A majority of Lily's development is in writing the parser, the emitter, and then the tooling. A parser and an emitter do not benefit greatly from memory safety. For emitting code, I have a uint16 buffer I wrote that allows safely inserting 1-5 values at a time (it does a grow check before any insert). I don't have unsafe actions spread around my code. Transitioning from Rust to C was actually fairly boring, and I was able to eliminate a great deal of unnecessary matches on Options that used to exist. The new code is much smaller, and as someone who did a lot of C beforehand, I find the new C code to be much simpler and more pleasant to look at.
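For the curious, a buffer like the one described can be sketched as below. The names (u16_buffer, buffer_reserve, buffer_write), the starting size of 8, and the doubling growth policy are assumptions for the example, not necessarily what Lily does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable buffer of 16-bit code slots. */
typedef struct {
    uint16_t *data;
    size_t pos;   /* slots in use */
    size_t size;  /* slots allocated */
} u16_buffer;

/* The grow check: make room for 'extra' more slots. Returns 0 on failure. */
static int buffer_reserve(u16_buffer *b, size_t extra)
{
    if (b->pos + extra <= b->size)
        return 1;

    size_t new_size = b->size ? b->size : 8;
    while (new_size < b->pos + extra)
        new_size *= 2;

    uint16_t *grown = realloc(b->data, new_size * sizeof(uint16_t));
    if (grown == NULL)
        return 0;

    b->data = grown;
    b->size = new_size;
    return 1;
}

/* Insert 1 to 5 values, always running the grow check first. */
static int buffer_write(u16_buffer *b, const uint16_t *values, size_t count)
{
    assert(count >= 1 && count <= 5);
    if (!buffer_reserve(b, count))
        return 0;

    for (size_t i = 0; i < count; i++)
        b->data[b->pos++] = values[i];
    return 1;
}
```

Starting from a zeroed u16_buffer, every write goes through buffer_reserve, so no caller indexes past the allocation. The caller frees data when done.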

If you're interested in checking out the now-C source, you can find Lily here: https://github.com/fascinatedbox/lily