MovieLounge

Growing software

I hereby release MovieLounge, a curated streaming platform where we celebrate our commons!

MovieLounge is a streaming platform where you can find a curated list of Creative Commons licensed and Public Domain films and series. If you want to discover and enjoy these movies and series, please check it out. If you are interested in programming and want to know more about some technical stuff, do read on.

When I need a simple project to play around with something new I want to learn, I like to use a streaming site as a project. A streaming site is quite a simple concept, you can quickly build something functional, and it’s easy to find example data for it. Recently, I needed a play-project again, and I chose exactly this. While it wasn’t originally meant to be released, the fact is that I’ve been having fun growing my little streaming project, and I think it’s in a good state to release, and maybe grow further.

Personal goals for this project

In the last half year or so, I’ve been focusing a bit on good practices in development. Many developers have heard it all, things like “encapsulation”, “reusability”, “loose coupling”, “extreme programming”, “test driven development”, and so on. While I think there’s value in these concepts, I feel that they are too often explained as separate things, rather than as properties that emerge from the needs we have when growing software. As a result, I feel they are often not properly used. One goal was to get a better feeling for these good practices.

A second goal was to gain a bit more experience in HTML+CSS+JavaScript. One problem I see with these technologies is that people don’t really learn them any more. Instead, they often learn frameworks. Frameworks are not necessarily bad. They are there to abstract a whole lot of plumbing away and make it easier and faster to write your software. The problem is that you now have to first learn these frameworks and abstractions without really grasping why things are done the way they are. It may take longer to get started, and you may run into problems without fully understanding what the actual problem is, leading to hacks upon hacks upon hacks. This is why my second goal was to write a single-page web application in HTML+CSS+JavaScript without the use of build tools or JavaScript frameworks. The only tools I want to require for building and running the application are a text editor and a browser.

Units

The first good practice was to split my software into separate “units”. The human brain can only keep so much information in mind at one time. That’s why we abstract a lot away behind interfaces. For example, if you ride your bicycle, you have to pedal, steer, maybe ring your bell, maybe switch gears, and so on. Those are all totally different things, and each of them has more going on behind the scenes. On a typical bicycle, the pedals push a big cog, which pulls on a chain, which pulls on a smaller cog, which rotates the wheel it is connected to. But it’s also possible to use a belt instead of a chain. Or to have the pedals connected to a wheel directly. Similarly, there are different ways in which gears are implemented. One implementation may have its advantages over another, but functionally it does the same, and you don’t need to know exactly how it is all put together, only how to operate it through the interface. Pedals are there to get you forward, regardless of what the exact implementation is. The only thing you need to know is the interface between your bicycle and yourself. Push the pedals, switch the gear, etc.

In software we can think of things in a similar way. If your program is small, does one specific thing, and won’t need much extra work once the initial implementation is done, you can probably keep it all in your head while writing it, and you don’t have to think too much about design, as long as it works. But when your software is supposed to grow bigger, or if you know you’ll need to make a lot of changes in the future, you’ll probably want to make it more maintainable. In that case, it makes sense to also think in terms of units. In MovieLounge, the two big units I see are the user interface and the storage. I therefore decided on a basic structure with three units: the “core” of our application is a unit, the user interface is a unit, and the storage is a unit.

The way units work in this context is that other units are only allowed to interact with them through a clearly defined interface. One property such a setup needs is that you can easily switch one unit implementation out for another, as long as they provide the same interface. Putting the units together can be done through a setting, or by passing the units as parameters. In the case of MovieLounge, I concluded that the latter was the best option.

A simplified version of my units could look something like this:

const MovieLounge_WebView = (UserInput) => {
    // Implement the web pages through which people can interact
}

const MovieLounge_Core = (VideoStorage) => {
    // At the moment our core unit doesn't do much
    // When we want to fetch a video or list of videos, we can immediately pass it to the storage unit
    return {
        user_input: {
            fetch_videos: VideoStorage.fetch_videos,
            fetch_video: VideoStorage.fetch_video
        }
    }
}

const MovieLounge_JsStorage = () => {
    // Add logic here for the `fetch_video` and `fetch_videos` functions
    return {
        fetch_videos: fetch_videos,
        fetch_video: fetch_video
    }
}

The first unit, MovieLounge_WebView, draws the web pages that the user interacts with. The UserInput parameter is the interface to the core unit for user input. People interact with the webpage, but the application-specific logic is passed to the core unit through this interface. Next we see the core unit, MovieLounge_Core. We see that it returns a map of functions, which is how we define the interface. The core unit takes a parameter VideoStorage which provides the interface to the storage unit. The storage unit, MovieLounge_JsStorage, provides us with the means to get a list of videos, or a single specific video.

Everything gets put together by initialising the units with the correct interfaces. The core is initialised with the storage unit of our choice, which returns its storage interface, and then we use the core to initialise the webview unit.

MovieLounge_WebView(
  MovieLounge_Core(
    MovieLounge_JsStorage()
  )
)
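Because the core only sees the storage interface, switching storage implementations should only change the place where the units are wired together. Here is a minimal sketch of that idea. The `Example_` units, their data, and their internals are made up for illustration and are not the actual MovieLounge code:

```javascript
// Simplified core unit: it only relies on the storage interface.
const Example_Core = (VideoStorage) => {
    return {
        user_input: {
            fetch_videos: VideoStorage.fetch_videos,
            fetch_video: VideoStorage.fetch_video
        }
    }
}

// Hypothetical storage unit A: entries kept in a plain array.
const Example_ArrayStorage = () => {
    const videos = [
        {id: "a", title: "First example"},
        {id: "b", title: "Second example"}
    ]
    return {
        fetch_videos: () => videos,
        fetch_video: (id) => videos.find((video) => video.id === id)
    }
}

// Hypothetical storage unit B: same interface, different internals (a Map).
const Example_MapStorage = () => {
    const videos = new Map([
        ["a", {id: "a", title: "First example"}],
        ["b", {id: "b", title: "Second example"}]
    ])
    return {
        fetch_videos: () => [...videos.values()],
        fetch_video: (id) => videos.get(id)
    }
}

// The core is wired up the same way with either storage unit.
const core_with_array = Example_Core(Example_ArrayStorage())
const core_with_map = Example_Core(Example_MapStorage())
```

Both cores answer `user_input.fetch_video("b")` with the same entry, even though the storage units keep their data in different ways.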

Tests

When you have a simple program that you barely need to change later on, automated tests may not be needed. But when your program is supposed to grow, automated tests can help you. When developing, you don’t just write and assume it all works perfectly. Instead, you write, run to see if it all does what you want, make adaptations if needed, and so on until you’re finished. Each time you run to see if it all does what you want, you are testing. The idea of automated tests is that you don’t have to do the tests manually. But just like with any automation, you have to wonder if the effort is worth it.

The first question we have to ask ourselves is what we actually want to automate and why. We test during development, but we also test afterwards to see if nothing broke. The latter especially is what we want to happen automatically, because it is what we will do a lot if we are expected to change the code a lot. The advantage of good tests is that when you change code, you can be fairly sure that the program still behaves as it should. But you should also make sure that you don’t put more effort into managing the tests than they save you.

In practice I see that supposed “unit tests” often test single functions. I think that’s wrong. A good function should remain trivial. Sure, there are exceptions, but the general rule is that the implementation of a function is trivial. So why write a test, which is also code that can have bugs, to test something that’s trivial to begin with? It doesn’t make sense. What we are really interested in is how the software operates. The user provides a certain input, so we expect a certain output. All the rest, the insides of it, shouldn’t have to matter for a test. But a program may be quite complex, and testing only on the user input can make your tests quite complex as the complexity of the software grows.

One advantage of separating our code into units is that the interfaces to the units are generally stable. This means that we can write tests against those interfaces. If we want to refactor a unit, we don’t need to rewrite our tests. We just rewrite the code, and then run the automated tests instead of having to do it all manually each time we make a change. If an interface changes, we will need to refactor our tests as well, but those cases should be rare. There’s the problem that, depending on how you do it, you may not test whether units work together. You can write integration tests for that, but I won’t go deeper into that because I don’t really have those at the moment.

For MovieLounge, I wrote a simple test framework. It’s not very advanced, but it does the trick. I made it so that I can load the data into the storage unit through a parameter, so when I want to test the functionality of that unit, I can initialise it with test data. Here’s an example of how it can look:

    let setup = () => {
        return {movie_storage: MovieLounge_JsStorage(mock_js_movies.movies)}
    }
    /* Fetching movies from the database with `fetch_videos` */
    test("returns all movies when search_query is empty", () => {
        let movie_storage = setup()["movie_storage"]
        assert(movie_storage.fetch_videos({search_query: ""}) === mock_js_movies.movies)
    })

    test("returns only movies whose title contains at least one word that the search_query parameter also contains", () => {
        // We expect only "The good girl" and "The Mark of Zorro"
        let movie_storage = setup()["movie_storage"]
        let expected = mock_js_movies.movies_filter_search_query_good_zorro
        let result = movie_storage.fetch_videos({search_query: "GOoD. ZorRo"})

        assert(result.length === 2)
        assert(result[0].id === expected[0].id)
        assert(result[1].id === expected[1].id)
    })
    /* Fetching a movie from the database with `fetch_video` */
    test("Fetching a movie provides a label for the subtitles (and `fetch_video(id)` works)", () => {
        let movie_storage = setup()["movie_storage"]
        let result = movie_storage.fetch_video("the_mark_of_zorro")

        assert(result.subtitles[0].label === "English (en)")
        assert(result.subtitles[1].label === "português (pt) (Brazil)")
    })

Similar to how I load data into the storage unit for tests, I also have some so-called “mock” units that implement the storage and core interfaces, to use when testing the other units.
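To give an idea of how little such a test framework needs, here is a minimal sketch. It is an assumption about how `test` and `assert` helpers could work, wrapped in a hypothetical `Example_TestRunner` unit, together with a hypothetical mock storage unit. The real MovieLounge helpers and mock units may look different:

```javascript
// Hypothetical test runner unit: returns `test` and `assert` plus a list of failures.
const Example_TestRunner = () => {
    const failures = []

    // Throws when the condition does not hold, so `test` can catch it.
    const assert = (condition) => {
        if (!condition) {
            throw new Error("assertion failed")
        }
    }

    // Runs one test function and records its description when it fails.
    const test = (description, test_function) => {
        try {
            test_function()
            console.log("ok   - " + description)
        } catch (error) {
            failures.push(description)
            console.log("FAIL - " + description + " (" + error.message + ")")
        }
    }

    return {test: test, assert: assert, failures: failures}
}

// Hypothetical mock unit: implements the storage interface with fixed answers.
const Example_MockStorage = () => {
    const video = {id: "mock_video", title: "Mock video"}
    return {
        fetch_videos: () => [video],
        fetch_video: (id) => (id === video.id ? video : undefined)
    }
}

const runner = Example_TestRunner()

runner.test("mock storage returns its single video", () => {
    const storage = Example_MockStorage()
    runner.assert(storage.fetch_videos().length === 1)
    runner.assert(storage.fetch_video("mock_video").title === "Mock video")
})
```

A failing assertion does not stop the run. The description of the failing test ends up in `runner.failures`, which can be checked once all tests have run.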

Make the need for non-trivial decisions obsolete

When writing a program, you have to make decisions. One decision in MovieLounge was what storage technology to use. Do I use a database? Should it be a SQL database, or a NoSQL, or a SPARQL, or even some other type of database? What implementation of that database do I use? And if I use a database, where should it be hosted? And then I need a way to listen for incoming requests, and I need software to turn those requests into a database query and provide a properly formatted answer.

As you can see, those are a lot of decisions that need to be made. And each of those can have a big, lasting impact on the project. I learned that the best way to deal with such decisions is to make the need for them obsolete. Let me put it this way: our software grows. When developing, we write code, test the bit we wrote, then we add some more code. But often we write code that we will later overwrite, or move to a different location. For example, you may write a function that returns a fixed value, just to see that the plumbing for this one case works. And then you adapt the function to something more complex. But do we always need to implement the more complex part?

Postponing the storage technology decision

The most important part of the MovieLounge storage unit is that it returns correct values. So one way to start is by hard-coding entries, seeing if it all works, and then making the changes to actually call the back-end for which we need to make all these decisions. But here’s the beautiful thing: once I implemented the hard-coded entries, I had something that already worked. And I can tell you, it works crazy fast! No wonder it’s fast: all data is already in memory, so there’s no delay from making calls to a back-end. So there’s really no need to make things more complex at this time. That means we can postpone all these decisions, and for now the need for them has become obsolete. And we can keep it like this for a long time. MovieLounge can easily hold a thousand entries, maybe more, without causing problems. And we’re still very far away from that.
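As an illustration, a storage unit over hard-coded entries can be very small. The sketch below is an assumption, not the actual MovieLounge implementation: the `Example_JsStorage` name, the `to_words` helper, and the entries are hypothetical, and the search behaviour is only inferred from the test descriptions earlier in this article (case-insensitive, ignoring punctuation, matching on whole words in the title):

```javascript
// Hypothetical hard-coded entries; real entries would have more fields.
const example_videos = [
    {id: "the_good_girl", title: "The good girl"},
    {id: "the_mark_of_zorro", title: "The Mark of Zorro"},
    {id: "an_example_film", title: "An example film"}
]

// Splits text into lower-case words, treating anything that is not a letter or digit as a separator.
const to_words = (text) =>
    text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter((word) => word !== "")

const Example_JsStorage = (videos) => {
    // Returns all videos for an empty query, otherwise the videos whose
    // title shares at least one word with the query.
    const fetch_videos = ({search_query = ""} = {}) => {
        const query_words = to_words(search_query)
        if (query_words.length === 0) {
            return videos
        }
        return videos.filter((video) =>
            to_words(video.title).some((word) => query_words.includes(word))
        )
    }

    // Returns the video with the given id, or undefined when there is none.
    const fetch_video = (id) => videos.find((video) => video.id === id)

    return {
        fetch_videos: fetch_videos,
        fetch_video: fetch_video
    }
}

const example_storage = Example_JsStorage(example_videos)
const example_result = example_storage.fetch_videos({search_query: "GOoD. ZorRo"})
```

Everything happens on an array that is already in memory, so there is nothing to install or host besides the files themselves.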

By working this way, I also realised that hosting this service is trivial. I don’t need to install, run, and maintain a back-end. I just need to serve the files somewhere. That’s basically it, awesome!

Postponing the abstraction decision

Another decision I postponed was making certain abstractions that typical frameworks use. The MovieLounge code is split up into units, which are really just implemented as functions. But that means that as a unit grows, so does the code in this one function. At first I just kept it like that, but eventually it became too big for my taste, and I decided I needed to split things up more.

This too is an example of making non-trivial decisions obsolete. I could’ve tried to have these abstractions sooner, but it wasn’t very clear to me how they should look, or why they should look that way. By the time I felt it was needed to split things up, I had a much better feeling for what the MovieLounge code needed, so the decision had become rather trivial at that point. Eventually I used the concept of “modules”, which are basically a set of functions you can use, and “components”, which are customisable and reusable DOM trees you can use in the web view. And all it took me to implement this, was to move some code around. I barely had to touch the implementation details.

It wasn’t more work than if I had started with these abstractions from the beginning, or at least not significantly more work. But if I had done it sooner, I probably would’ve made a wrong abstraction, and that would’ve led to extra, more complex, and more unpleasant work in the long run.

Refactor when needed

Note that making the need for a non-trivial decision obsolete is not the same as ignoring the need for that decision. Instead, it’s a way to handle decisions. I do not ignore that I may one day need a different way of storing movie data. I’m just making it so that, once I need to make the decision, the changes will be as easy as if I had implemented it in the first place. Just like it was for the abstractions. But this also means that, once the need to make those changes is there, you should make them. Otherwise you may end up adding code on top of what is now considered a bad design, and that can make changes harder in the long run.

I don’t think there are strict rules you can follow for this. It’s something you have to learn by experience. One problem I believe we have in the programming world is that the focus is way too often on adding features. Even if developers feel some refactoring is needed, they don’t always get the time to do it. And eventually we end up with a program that has become so much hacks-upon-hacks that it seems better to just rewrite it. But if we never learn from our mistakes, the same problem will happen again, and we lose a lot of time and effort chasing something we’ll never reach.

There are some other cases where I postponed decisions. MovieLounge has tiles for the movies on the landing page. Handling a movie title that is longer than the room on the tile allows is something I only did when I had a movie with such a title. And while I knew I wanted series, I didn’t handle them until I had one. The same is true for subtitles. Series also have the problem that some properties may be true for a whole series, while for another series they differ per episode. While you could handle all these cases up front, I decided to make the data representation flexible enough to allow them, but only implemented each case when needed and documented the restrictions. Remarkably enough, this works quite well. And it mostly works well because, since the entries are hard-coded, adding a new movie or series really is changing code. So while I’m changing that code, I may as well add some code for extra functionality while I’m at it. So you see, by postponing one decision, I’m able to postpone other decisions as well, in a very surprising way!
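One way such a flexible representation could work is that a property may live on the series or on an individual episode, and the episode wins when both are present. The sketch below only illustrates that idea. The field names (`episodes`, `license`), the example data, and the `episode_property` helper are hypothetical and not taken from the MovieLounge code:

```javascript
// Hypothetical series entry: the license is set for the whole series,
// and one episode overrides it.
const example_series = {
    id: "example_series",
    title: "Example series",
    license: "CC BY-SA 4.0",
    episodes: [
        {id: "episode_1", title: "Episode 1"},
        {id: "episode_2", title: "Episode 2", license: "CC BY 4.0"}
    ]
}

// Looks up a property for one episode, falling back to the series-wide value
// when the episode does not define it. Returns undefined for an unknown episode.
const episode_property = (series, episode_id, property) => {
    const episode = series.episodes.find((candidate) => candidate.id === episode_id)
    if (episode === undefined) {
        return undefined
    }
    return property in episode ? episode[property] : series[property]
}
```

With this data, `episode_property(example_series, "episode_1", "license")` falls back to the series value, while the same call for `"episode_2"` gives the value set on the episode itself.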

Back to testing

While writing this article, I realised there’s actually a place where I didn’t postpone a non-trivial decision, and that was probably wrong of me. I think the way people generally do unit tests is wrong, because they test at too fine a grain. So one thing I tried to do was write tests in a way that I believed was better. But by doing that, I made some decisions about what and how to test that may not be a good fit for the current state of the MovieLounge software. I find myself still testing a lot of things manually after changes, and I wonder if the current tests even really helped me, while I did put effort into creating and maintaining them.

Starting from the idea of growing the MovieLounge code, I probably should’ve started with manual tests, and then automated those. Only when certain changes, for example adding movies, broke the tests too much should I have made use of abstractions like mock data or mock units. I could be wrong, but this can be something to experiment with in the future.

The future of MovieLounge

It’s been a fun project so far. Not just the coding parts, but also everything around it. Finding movies, learning about them, learning about film history, learning about the movies that have been released under Creative Commons licenses and some of that history, uploading movies to Archive.org, and even reviving some old torrents.

One interesting thing is that less than two years from now, American movies from the 1930s will start entering the public domain. And those are all talkies, and some are even in colour. On the other hand, I’m not really sure the project can keep my interest if all I can add are old movies.

Maybe I’ll get bored of it and the project will mostly remain what it is now. And that’s the beauty of it, isn’t it? The way it is built is a perfect fit for what it currently is. And if MovieLounge does grow, it should still be able to grow in any direction it needs to.