Nov 29, '22

Incremental Progress

In an effort to stay current with this most successful blogging adventure yet, I've decided to just throw something quick together. I don't want to leave 2022 out in the cold with no new posts, after all.

So this will be quick, shallow, and probably obvious. But it's something I've been appreciating more and more lately. Specifically: just doing a little bit. Repeatedly, over time.

A Small Step

It takes just one small step to make progress.

Many times, as I consider everything that needs to happen to complete a task or reach a goal, I find myself feeling overwhelmed. I might think, "there's just too much to do before it pays off," or I may not know where to start or what to do next.

But at the risk of stating the obvious: just do a little. It's at least something, which is way better than nothing!

Eventually, enough small steps add up to something tangible, and from there it becomes easier to keep building. This applies to pretty much any kind of long-term goal: learning a new skill, improving your health, or even running a blog in a dusty corner of the internet.

Cheers, I guess

That's it for now. Maybe the future will see an increased posting frequency, who knows.

Just be sure to remember: even the smallest bit of progress is an infinite increase over zero progress.

etc, Personal


Aug 26, '21

Self-Hosting for Fun and Personal Freedom

These days, everything is in the cloud. This is fantastic as far as accessibility goes, but it comes at the cost of, well, freedom. Freedom, in this case, means the ability to control and access your data when and how you choose.

For example, imagine an inadvertent or misapplied suspension of the Google account you use for password recovery. Plenty of services even use email as the primary means of confirming legitimate access when logging in.

What do you do if your Google account is inaccessible because Google has decided you shouldn't be allowed to access it?

It's no stretch to come up with reasons to take control of your data. One way to take control is self-hosting. And with viable open-source, self-hostable alternatives to many popular services, the question becomes: should you self-host?

For me, the answer is, "yes, sometimes." Before you jump on board too, there are a few things you might want to consider.

Benefits

Control and freedom. Your data is yours to do with as you please, with no one else controlling it. Be aware, though, that your own self-hosted deployment may still be limited by the publisher. For example, Bitwarden only supports two-factor authentication (2FA) if you pay for a license. If you don't pay, you can't use 2FA, even if you self-host.

Another part of control is availability. If GitHub is down, well, that's out of your control. But if your GitLab server is down, it's on you to fix it. Congrats! You're in control.

Be sure to keep great documentation. It will save you from being frustrated by the same problem twice and might even reveal patterns worth improving or automating.

Learning, or fun. Figuring out how a program works, how to deploy it, and how to bend it to your fancy is a challenging but worthwhile experience.

Simplicity. Self-hosted products are usually simpler and less feature-packed, which can lead to a less complicated workflow overall. As an example, Gitea has far fewer features than GitHub or GitLab, but is a breeze to get set up and running. I find even the code more accessible, but that may be my personal bias toward Golang showing.
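
To give a rough idea of what "a breeze" means here-- a sketch assuming the official gitea/gitea Docker image on a host that already has Docker (the port mappings and volume name are just illustrative):

docker run -d --name gitea \
    -p 3000:3000 -p 2222:22 \
    -v gitea-data:/data \
    gitea/gitea:latest

# then browse to http://localhost:3000 and walk through the installer

For a small personal instance, that's most of the work; a reverse proxy and backups are the natural next steps.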

Drawbacks

Security is probably the most important consideration. Study up on basic server hardening. Some basic necessities (by no means an exhaustive list): block all traffic that isn't explicitly necessary, secure sshd (no root login, no password login), run fail2ban, use strong sudo passwords, run services as non-root users, and keep systems up to date.
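
As a concrete illustration of the sshd items above, here's an excerpt of an OpenSSH sshd_config-- a starting point, not a complete hardening guide:

# /etc/ssh/sshd_config (excerpt)
PermitRootLogin no          # no root login over ssh
PasswordAuthentication no   # keys only, no passwords
PubkeyAuthentication yes

# apply the change (the service name varies by distro):
# systemctl reload sshd

Confirm that your key-based login works in a second session before closing the one you edited from.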

Maintenance is the next potential nightmare. Many self-hosted products don't provide much support, so figuring out why your particular setup is broken can be a challenge. Learning to comb through logs is a skill you'll have to develop. Keeping your own documentation also goes a long way toward reducing the overall headache of maintenance.

Availability appears as a drawback, too, as it can be difficult to achieve. Deploying a redundant, always available service is not exactly an easy feat. If your service is crucial (e.g. a password manager) then it can be a big problem if your server goes down.

Try to solve problems and not symptoms of problems; if you have to manually fix something on a regular basis, figure out a way to automate it.

Cost is also a factor to consider. Your self-hosted service has to run on some server somewhere and you'll probably need to pay for that. This can quickly balloon if you're not careful, but I find that services like Linode or DigitalOcean are fantastic options when it comes to deploying your own infrastructure.

Conclusion

The drawbacks are considerable, but the payoff might be worth it for you, for certain services.

What do you self-host? How do you tackle the mentioned drawbacks?

etc, Learning


Jul 10, '19

Closing Channels Twice in Go

Concurrency is natural with the Go language's channels and goroutines. There do exist a few gotchas, however. This post is going to focus on just one: how can we avoid closing channels twice?

The problem lies with the fact that the Go runtime will panic if you close a channel twice. A minimal example:

package main

func main() {
	ch := make(chan struct{}, 0)
	close(ch)
	close(ch)
	// Output:
	// panic: close of closed channel
	//
	// goroutine 1 [running]:
	// main.main()
	//	/tmp/sandbox701291283/main.go:6 +0x80
}

What are some scenarios this might occur? Why would we ever close a channel twice? Is there a simple way to avoid closing twice? Let's find out.

Channels as a signal

One common scenario is using channels to signal across goroutines that they need to shut down. One might imagine:

package main

import (
	"fmt"
	"time"
)

// fn is some function.
type fn func()

// scheduler is a minimal task queue.
type scheduler struct {
	queue chan fn

	quit chan struct{}
	done chan struct{}
}

// work continuously reads from the queue channel to perform work.
func (s *scheduler) work() {
	for {
		// Since this select block doesn't have a default, this
		// goroutine will block until either s.quit closes
		// or something is written to s.queue for us to read.
		select {
		case w := <-s.queue:
			w()

		case <-s.quit:
			// Time to shutdown.
			// Close the queue and drain it.
			close(s.queue)
			for w := range s.queue {
				w()
			}
			return
		}
	}
}

// shutdown signals to the scheduler to stop all workers.
func (s *scheduler) shutdown() {
	// Closing this channel causes any subsequent read operation to
	// return the zero value immediately.
	close(s.quit)
}

func main() {
	s := &scheduler{make(chan fn, 5), make(chan struct{}, 0), make(chan struct{}, 0)}

	wk := func() {
		fmt.Println("werk")
	}

	go s.work()

	s.queue <- wk
	s.queue <- wk
	s.queue <- wk
	s.queue <- wk

	s.shutdown()
	
	time.Sleep(10 * time.Millisecond)
	// Output:
	// werk
	// werk
	// werk
	// werk
}

Play with the example above on the Go playground.

This works, but with at least one drawback: calling s.shutdown() does not block-- the program immediately exits. Try removing the time.Sleep() call: the program will probably exit before the first "werk" is printed. Even though the worker drains the queue, nothing waits for the drain to finish.

The road goes both ways

A simple solution is to create a second channel that is used to signal back to the shutdown method that the worker goroutine is done.

package main

import (
	"fmt"
)

// fn is some function.
type fn func()

// scheduler is a minimal task queue.
type scheduler struct {
	queue chan fn

	quit chan struct{}
	done chan struct{}
}

// work continuously reads from the queue channel to perform work.
func (s *scheduler) work() {
	for {
		select {
		case w := <-s.queue:
			w()

		case <-s.quit:
			close(s.queue)
			for w := range s.queue {
				w()
			}
			// We're all done!
			close(s.done)
			return
		}
	}
}

// shutdown signals to the scheduler to stop all workers.
func (s *scheduler) shutdown() {
	close(s.quit)
	// Then, we read from s.done. This blocks until a value is written to
	// the channel for us to read, or the channel is closed. We're going
	// to rely on the latter.
	<-s.done
}

func main() {
	s := &scheduler{make(chan fn, 5), make(chan struct{}, 0), make(chan struct{}, 0)}

	wk := func() {
		fmt.Println("werk")
	}

	go s.work()

	s.queue <- wk
	s.queue <- wk
	s.queue <- wk
	s.queue <- wk

	s.shutdown()
	// Output:
	// werk
	// werk
	// werk
	// werk
}

Play with the above example on the Go playground.

This works well, but what happens if shutdown is called multiple times? Add go s.shutdown() above the existing call in the previous example, and run it. Panic!

Closing Channels Safely

If a given channel might be closed multiple times, we need a way to determine whether it's already closed. Some might reach for synchronization primitives at this point, but for simple cases channels alone can do the job.

Guard channel closes with a select statement that tries to read from the channel in question before closing it.

package main

import (
	"fmt"
)

// fn is some function.
type fn func()

// scheduler is a minimal task queue.
type scheduler struct {
	queue chan fn

	quit chan struct{}
	done chan struct{}
}

// work continuously reads from the queue channel to perform work.
func (s *scheduler) work() {
	for {
		// Since this select block doesn't have a default, this
		// goroutine will block until either s.quit closes
		// or something is written to s.queue for us to read.
		select {
		case w := <-s.queue:
			w()

		case <-s.quit:
			// Time to shutdown.
			// Close the queue and drain it.
			close(s.queue)
			for w := range s.queue {
				w()
			}
			// We're all done!
			close(s.done)
			return
		}
	}
}

// shutdown signals to the scheduler to stop all workers.
func (s *scheduler) shutdown() {
	select {
	case <-s.quit:
		// already closed, do nothing

	default:
		// Signal to workers that we are quitting
		close(s.quit)
	}
	// Then, we read from s.done. This blocks until a value is written to
	// the channel for us to read, or the channel is closed. We're going
	// to rely on the latter.
	<-s.done
}

func main() {
	s := &scheduler{make(chan fn, 5), make(chan struct{}, 0), make(chan struct{}, 0)}

	wk := func() {
		fmt.Println("werk")
	}

	go s.work()

	s.queue <- wk
	s.queue <- wk
	s.queue <- wk
	s.queue <- wk

	go s.shutdown()
	s.shutdown()
	// Output:
	// werk
	// werk
	// werk
	// werk
}

Attempting to read from a closed channel returns immediately, and since these channels carry no values-- they're used purely as signals-- repeated calls to shutdown are now harmless. One caveat: the select guard isn't atomic, so two goroutines calling shutdown at exactly the same moment could both hit the default branch and both close s.quit. If truly simultaneous shutdowns are possible, guard the close with sync.Once instead.

Play with the above example on the Go playground.

A small challenge

The example above only supports a single worker goroutine; creating multiple will cause panics when each worker attempts to close(s.queue) and then close(s.done).

Try out the broken challenge on the Go playground.

How can the example be modified to support any number of worker goroutines?
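
If you'd like to solve it yourself, stop reading here. One possible shape of a solution-- a sketch under a few assumptions, not the only answer-- is to give each channel exactly one closer: a sync.Once guards close(s.quit), s.queue is never closed at all (each worker drains it cooperatively instead), and a single coordinating goroutine closes s.done once a sync.WaitGroup reports that every worker has returned:

package main

import (
	"fmt"
	"sync"
)

// fn is some function.
type fn func()

// scheduler is a minimal task queue with multiple workers.
type scheduler struct {
	queue chan fn
	quit  chan struct{}
	done  chan struct{}

	once sync.Once      // ensures s.quit is closed at most once
	wg   sync.WaitGroup // tracks running workers
}

// work reads from the queue until told to quit, then drains what's left.
func (s *scheduler) work() {
	defer s.wg.Done()
	for {
		select {
		case w := <-s.queue:
			w()

		case <-s.quit:
			// Drain without closing s.queue: other workers
			// may still be reading from it.
			for {
				select {
				case w := <-s.queue:
					w()
				default:
					return
				}
			}
		}
	}
}

// start launches n workers, plus one goroutine that closes s.done
// exactly once, after every worker has returned.
func (s *scheduler) start(n int) {
	s.wg.Add(n)
	for i := 0; i < n; i++ {
		go s.work()
	}
	go func() {
		s.wg.Wait()
		close(s.done)
	}()
}

// shutdown is safe to call from any number of goroutines.
func (s *scheduler) shutdown() {
	s.once.Do(func() { close(s.quit) })
	<-s.done
}

func main() {
	s := &scheduler{
		queue: make(chan fn, 5),
		quit:  make(chan struct{}),
		done:  make(chan struct{}),
	}

	wk := func() { fmt.Println("werk") }

	s.queue <- wk
	s.queue <- wk
	s.queue <- wk
	s.queue <- wk

	s.start(3)

	go s.shutdown()
	s.shutdown()
	// Output:
	// werk
	// werk
	// werk
	// werk
}

The key idea: each channel now has exactly one closer. Because nothing ever closes s.queue, any number of workers can drain it safely, and because only the coordinator closes s.done, it cannot be closed twice no matter how many workers or shutdown callers exist.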

Wrapping up

I hope this article helped increase your understanding of Go channels. How do you use channels?

go, Programming


May 30, '19

On Life, Legacy, and JavaScript

Prototypal inheritance with closures and late binding: so powerful and versatile that it can support just about any programming paradigm imaginable. Functional folks can memoize their partially applied functions, object-oriented people can inherit from their base class, and the imperatively-minded can use their interactive read, evaluate, print loop. Those few left, still unsatisfied, might use a more refined, stack-based language and transpile the nonsensical glyphs back into prototypal codes and (perhaps with a shim or two) run their fancy new adders on interpreters older than time itself.

Yep, I'm talking about JavaScript. But let us not get ahead of ourselves. Our story starts...

Back in the day

A long while back, some enterprising individual discovered the arcane magic of running JavaScript without Hypertext and on server-grade hardware. Not long afterward, software developers, everywhere, wandered aimlessly through non-blocking purgatory and burned in callback hell. Darkness fell.

And then-- a new power was wrought: babel.

It no longer mattered what kind of code you wanted to write, babel would take it and convert the original writings directly into JavaScript. Code that once ran only on handheld games and in MS-DOS prompts could now run anywhere. Programming languages with Standards and Best Practices could now be used to develop software running in browsers, on desktops, on mobile phones, in cars, on kiosks...

Personal computing had improved, and it kept improving. To the point that an average, technically privileged individual walked around with more flops than a 1.93-meter tall, pale white man pretending he's still in San Diego. Mobile networks, too, improved, with ever increasing numbers of G (whatever that is) and never-high-enough bandwidth caps for those sweet, dank memes.

The internet rejoiced, and humanity soon entered a glorious Fool's Golden Age of Software Engineering. But many could not shake that unease, that tension, the sort you feel when trying not to stub your toe in a pitch-dark room full of kids' toys.

"My phone seems like its getting slower," some users wrote on message boards.

"It must be your mobile network," the response, "our app is flawlessly crafted with coffeescript and scss and converted into state of the art, cross-platform, bullet-proof, ironclad. Java. Script."

Paralysed by the choice of flavor of the week, many developers abandoned all sense. Long-term support lifespans were cut in half, then half again, then by an order of magnitude, to allow people to "move fast" and "break things." Soon, things did indeed seem to be nearing a breaking point. Software sucked.

Huge efforts were made to improve the situation, with much time and energy spent, culminating in a wacky new invention: transpilation. No, don't compile the code for machines to consume-- translate it into a different language altogether! For machines to consume! Err.. wait, is that right? Uhh.. best not to think on it.

Code of incredible diversity wound up as dependencies or dev dependencies or peer dependencies or transitive dependencies. Most all JavaScript developers applied the sage wisdom of seeking out others' inventions for their own exploitation, of avoiding any sort of repetition or reimplementation, of semantically versioned source code reuse and automatic integration powered by external and externally controlled third parties.

Everything anyone could think of doing to improve matters was done.

But software still sucked.

And it still does

Admitting there is a problem is the first step. The second is to coin a new name and prepare a steady stream of proper specifications, clarifications, improvements, and carry on as if nothing had ever happened. Some called it ES6, some ES.next, but most would agree that the language underpinning all of humanity, JavaScript, was improving.

In 2019, it's called ES2018. And you don't even need babel anymore.

Technically, you still need some magic sauce to actually avoid using `require()`, but the language itself has come so far from its original ambiguities. Yet-- the original Object Model (no, not the DOM) has changed very little; much of the JavaScript written two decades ago can still run anywhere JavaScript is supported.

All this, and it's no big deal for the language that was haphazardly created and unintentionally willed upon all developers, frontend or backend, whether they enjoyed it or not.

Perfect Legacy

JavaScript today is as diverse as ever. I often say things like, "I enjoy my JavaScript quite a bit, it's everyone else's that sucks." The honest truth is that even my flawless, artisanal, hand-typed-then-automatically-reformatted ES2018 sucks, too. It all sucks, but that isn't the point.

Practical application of a few, simple ideas is sometimes all that's necessary to leave a long lasting legacy and change the world forever. For better... Or worse. ;)

 

javascript, Learning


Mar 14, '18

Refactoring, Now With Generics!

Refactoring is one of the most satisfying programming tasks. It can be difficult, especially in a large or unfamiliar codebase, but I believe thinking critically about your code is beneficial. There is almost always low-hanging fruit -- fixing formatting, tightening naming conventions, removing duplication, and so on. If you aren't careful, however, you can easily refactor in circles and never actually improve anything.

"Quality" is completely subjective, so you should also spend time pondering where your opinions lay and why you think that way. Lively banter with coworkers about design choices and their reasons can be very informative, but I've found that there is no singular Correct Way, especially in coding. Each choice comes with trade-offs that may not be completely evident until much later. And what do you do when you realize six-months-ago You made a mistake? Refactor!

The Beast

Around the middle of last year, my team inherited a massive monolithic Java product and was tasked with stabilizing and improving the platform. My goal was to not only improve the perceived functionality, but also to clean up the code in as many ways as possible.

Note: All examples in this post are super contrived and are only vaguely similar to the real code.

This project is made up of several layers of co-dependent libraries, and the data model (entities, whatever you like calling them) consists of several related types. One major eyesore is that each layer defines a specialized subclass for each of these types. Imagine a Website, which might have multiple Campaigns, and both are defined in the base data access library. A tracking library then builds on top of both of those types, introducing its own subclass of each. Then a messaging library builds on top of those, adding another pair of subclasses. Repeat about 7 times and welcome to my reality!

The immediate issue I have with these kinds of "shared" inheritance hierarchies, especially in a language like Java, is that you end up casting things everywhere. For example, given a core.Campaign with a public core.Website getWebsite() method, every time you use that method in a subclass of core.Campaign, you get a core.Website. Concretely, if you call getWebsite() inside the tracking.Campaign subclass, you will receive a core.Website. Because of the shared hierarchy, it is expected that the actual instance you get back is really a tracking.Website. You just have to cast it:

package com.acme.tracking;

class Campaign extends com.acme.core.Campaign {
    public void contrivedExample() {
        // We are calling the getWebsite method defined in the core.Campaign superclass. 
        // It returns a core.Website! But we "know" it's really a tracking.Website. We hope.
        Website website = (Website) getWebsite();

        // do something worthwhile...
    }
}

Right off the bat your olfactories are assaulted with the pungent aroma of wonky code. See, Java is a statically and strongly typed language, which means the compiler can generally do a great job of alerting you to incompatible types. But the moment you're forced to cast, you lose those compile-time guarantees. You have introduced a possible runtime fault. Neat, huh?

The Beauty

Now, as you can imagine, this platform has a ton of baggage. We can't just go around changing APIs; we have upstream and downstream dependencies that rely on our code. That is, we can't fix the issue at hand by removing the crazy cross-project inheritance hierarchy. That would break an uncountable (literally, we can't be sure) number of other projects. This design decision really is baked into the platform at this point, and trying to fight it too much is probably a waste of time.

But that doesn't mean you can't improve things!

A Short Aside

Back in prehistoric times, the folks working on Java decided to implement non-reified generics. Reification, in this context, describes the compiler's and runtime's ability to determine, track, and enforce type information. Java's implementation of generics is non-reified because the compiler actually strips the type information from generics-- a process known as type erasure-- so the runtime cannot know what these types were originally!

In a nutshell, when you write List<String>, the compiler can use the type parameter (String in this case) to ensure that no tomfoolery occurs with types at compile-time. But the runtime itself has no knowledge of this information and therefore cannot make any assurances about what a generic type contains when the code is actually executed. You may yourself have had the pleasure of experiencing an Integer in your List<String> in not-very-unusual circumstances.
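
To make that concrete, here's a compact (and contrived) sketch using nothing beyond the standard library; the class and variable names are mine, purely for illustration:

import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();

        // A raw-type alias defeats the compile-time check (with only a warning).
        List raw = strings;
        raw.add(Integer.valueOf(42)); // heap pollution: an Integer in a List<String>

        // This compiles fine, because the runtime has no idea this list
        // was ever a List<String>-- but it throws ClassCastException here:
        String s = strings.get(0);
        System.out.println(s);
    }
}

The mistake happens at the add, but the failure only surfaces later, at the read-- exactly the kind of distance between cause and effect that makes these bugs such a joy to track down.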

You might imagine generics as sort of the "evil" twin of a type-cast operation. Instead of checking types at runtime with a cast, we do it at compile time and skip the runtime check. But, if you're like me, you vastly prefer knowing about incorrect types at compile-time, not at run-time (sometimes known as "production"). So in my opinion, this trade-off is worth it.

Back to the Beauty

Is there a way we can utilize generics to help ease the situation and provide stronger compile-time guarantees? In this next example, we are looking at another type that represents normalized information about an incoming HTTP request. This CampaignContext is passed around to various services that need to be able to get the current request's core.Campaign (or a subclass thereof).

package com.acme.app.context;

public abstract class CampaignContext implements com.acme.core.CampaignContext {
    private com.acme.core.Campaign campaign;

    public com.acme.core.Campaign getCampaign() {
        return campaign;
    }

    public void setCampaign(com.acme.core.Campaign campaign) {
        this.campaign = campaign;
    }
}

Notice that this class exists in an application-- very obviously denoted by the com.acme.app package name-- nothing else is using it beyond this app. Lots of things need that com.acme.core.CampaignContext interface, but we have a little wiggle room because we are using our own implementation of the interface. What does this mean? It means we can change it with no (okay, very little) impact!

package com.acme.app.context;

public abstract class CampaignContext<T extends com.acme.core.Campaign>
        implements com.acme.core.CampaignContext {

    private T campaign;

    @Override
    public T getCampaign() {
        return this.campaign;
    }

    public void setCampaign(T campaign) {
        this.campaign = campaign;
    }
}

Note that all references to com.acme.core.Campaign have vanished except in the class-level type parameter constraint. This effectively forces any subclass of this CampaignContext to provide a type parameter, and the type specified must be core.Campaign or a subclass of it.

Now, when we use this CampaignContext, we will leverage the compiler's ability to ensure that the type parameter constraint is enforced, and therefore calling code will always get the expected type (and not a core.Campaign)!

package com.acme.app.servlet;

import com.acme.tracking.Campaign;
import com.acme.app.context.CampaignContext;

public class TrackingContext extends CampaignContext<Campaign> {
    public void contrivedAf() {
        // Look, ma, no casts!
        Campaign campaign = getCampaign(); // Notice this is a tracking.Campaign!

        // do some tracking specific thing with the tracking.Campaign
    }
}

We have introduced a mostly* backward-compatible change to the abstract CampaignContext class. All we need to do is update every CampaignContext subclass to specify which subclass of core.Campaign it needs. In our application, we had a limited number of CampaignContext subclasses so we felt comfortable updating each. Code that called those contexts didn't need to change, except we could now remove the runtime casts!

Finito

This was a long post, but hopefully I've been able to convey my ideas in a digestible way. Let me know why you think our decisions were right or wrong, I'd love to hear counter-opinions.

 

* For what it's worth, you can avoid the BC break entirely by leaving the old CampaignContext implementation in place and using the new one on an opt-in basis as you make changes to other parts of the application.

java, Programming
