What shadow-cljs is and isn’t

admin | October 22, 2020 | Technologies

I’ll try to describe what shadow-cljs actually is, since a few misconceptions about it keep coming up. This is not an introduction to shadow-cljs but rather an explanation of how it differs from other tools. It assumes some familiarity with the current CLJS ecosystem. Please refer to the User’s Guide to learn more about what shadow-cljs actually does.

First a very brief overview of the most common points, followed by a more in-depth explanation.

What shadow-cljs isn’t

It is not a fork of ClojureScript
It is not a dialect of ClojureScript
It does not introduce new “syntax” for require (eg. (:require ["react" :as r]))
It is not self-hosted

What shadow-cljs is

Short Version: shadow-cljs is a fully featured build tool for ClojureScript and JavaScript. It integrates with the npm JavaScript ecosystem and allows accessing it from ClojureScript. It runs on the JVM and uses the Closure Compiler to process JavaScript and create optimized (aka. minified) JavaScript output. It does not use self-hosted ClojureScript, meaning that it still requires Java to run.

Basically you can split the work required to perform a ClojureScript build into 4 stages:

Basic Setup which just sets up the environment for later stages, applies configuration and so on.
Compile ClojureScript to JavaScript one namespace at a time
Organize the output
Optimize the output (optional)

shadow-cljs replaces Steps 1, 3 and 4. Step 2 remains mostly unchanged; shadow-cljs only modifies it where there is no “official” API to hook into (eg. npm namespace aliasing). The vast majority of the code in shadow-cljs deals with Steps 3 and 4, which have nothing to do with ClojureScript anymore since at that point only JS code exists and no more CLJS compilation happens.
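To make the split concrete, here is a rough sketch of the four stages as plain functions. Every function name and data shape in it is invented for illustration and is not the API of shadow-cljs or of ClojureScript. The "compiler" in Step 2 is only a stand-in that emits a placeholder string.

```javascript
// Hypothetical sketch of the four build stages. None of these names exist
// in shadow-cljs or ClojureScript; they only mirror the list above.

// Step 1: apply configuration and prepare empty state for later stages.
function setup(config) {
  return { config, compiled: [] };
}

// Step 2: "compile" one namespace at a time (stand-in for the real compiler).
function compileNamespaces(state, sources) {
  for (const [ns, src] of Object.entries(sources)) {
    const js = `goog.provide("${ns}");\n/* compiled from ${src.length} chars of CLJS */`;
    state.compiled.push({ ns, js });
  }
  return state;
}

// Step 3: only JS exists from here on; decide which files are loaded and in what order.
function organize(state) {
  const files = state.compiled.map((c) => c.ns.replace(/\./g, "/") + ".js");
  return ["goog/base.js", ...files];
}

// Step 4: optional; skipped entirely when optimizations are "none".
function optimize(files, level) {
  if (level === "none") return files;
  return ["main.js"]; // pretend everything was merged into one optimized file
}

function build(config, sources) {
  const state = compileNamespaces(setup(config), sources);
  return optimize(organize(state), config.optimizations);
}

console.log(build({ optimizations: "none" }, { "demo.app": "(ns demo.app)" }));
console.log(build({ optimizations: "advanced" }, { "demo.app": "(ns demo.app)" }));
```

Only compileNamespaces ever sees CLJS source in this sketch. The other three functions deal with configuration and JS files, which is the part a build tool can swap out.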

What goes into a ClojureScript build?

Before we can further define what shadow-cljs actually does we need to understand all the involved steps when it comes to actually compiling ClojureScript and running it on a given platform.

Step #1 – Basic Setup

Before any compilation can be done we need to initialize the compiler state and collect some basic information. The compiler options are validated and applied. Some information is extracted from the classpath (eg. deps.cljs) and an index of :foreign-libs is created from it. For example, if you have cljsjs/react in your dependencies, it will contain a deps.cljs which sets up the cljsjs.react namespace alias and so on. No actual CLJS compilation is taking place yet.

In addition the Closure Library “index” is loaded which also just contains a simple lookup of which namespaces any given file in the Closure Library provides. Eg. the goog.object namespace is provided in the goog/object/object.js resource on the classpath.

When using :npm-deps the node_modules directory is also indexed and again a namespace index is created that may map the react namespace to node_modules/react/cjs/react.production.min.js file. This is actually a bit more complicated but what we end up with is just a mapping of namespace -> file.
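The result of all this indexing can be pictured as one lookup table. The sketch below is a simplification under stated assumptions: the input arrays stand in for what would be read from the classpath and node_modules, the buildIndex function is hypothetical, and the cljsjs file path is made up. The other two paths are the ones mentioned above.

```javascript
// Hypothetical inputs standing in for data discovered on the classpath
// and in node_modules. The cljsjs path below is invented for illustration.
const foreignLibs = [
  { file: "cljsjs/react/react.js", provides: ["cljsjs.react"] },
];
const closureLibrary = [
  { file: "goog/object/object.js", provides: ["goog.object"] },
];
const npmPackages = [
  { file: "node_modules/react/cjs/react.production.min.js", provides: ["react"] },
];

// Merge all sources into a single namespace -> file mapping.
function buildIndex(...sources) {
  const index = new Map();
  for (const entries of sources) {
    for (const { file, provides } of entries) {
      for (const ns of provides) {
        // In this sketch the first provider of a namespace wins.
        if (!index.has(ns)) index.set(ns, file);
      }
    }
  }
  return index;
}

const nsIndex = buildIndex(foreignLibs, closureLibrary, npmPackages);
console.log(nsIndex.get("goog.object"));
console.log(nsIndex.get("react"));
```

Nothing in this step looks inside the files. It only records which file to consult when a namespace is required later.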

Once everything is set up the compiler state (an atom) is bound to the cljs.env/*compiler* var and compilation can start.

Step #2 – Compile ClojureScript

Now we actually compile ClojureScript to JavaScript. This is done by cljs.analyzer and cljs.compiler. It involves reading the .cljs source one form at a time using tools.reader, analyzing it, expanding macros in the process and then generating the proper JavaScript (with proper source maps).

Consider this trivial example:

(ns demo.app
  "some docstring")

(defn init []
  (js/console.log "Hello World!"))

After compilation this is turned into JavaScript which follows the Closure JS format.

goog.provide("demo.app");
goog.require("cljs.core");

demo.app.init = function() {
  console.log("Hello World!");
}

One very important aspect of CLJS compilation is that all dependencies must be analyzed first. Every CLJS namespace has an implicit dependency on cljs.core and if a namespace has a (:require [something.else :as foo]) in its ns form that dependency must be analyzed before the ns itself can be analyzed. During analysis some of the data from the AST is extracted and added to the compiler state. This mostly contains metadata information about namespaces and their :defs (ie. def, defn). Greatly simplified, the above code will generate something like:

:cljs.analyzer/namespaces
{demo.app
 {:name demo.app
  :meta {:doc "some docstring"}
  :defs
  {init
   {:name demo.app/init
    :fn-var true
    :meta {...}
    ...}
 ...

The data from those namespaces is used during analysis so we can warn about missing vars and generate the proper code for protocols and such. This is just data, it can be cached and written to disk and read again later.
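The "dependencies first" rule amounts to a depth-first ordering of the namespace graph. The sketch below is only an illustration: compileOrder and the deps map are hypothetical, and demo.util is an invented namespace. It is not how cljs.analyzer is implemented.

```javascript
// Hypothetical dependency map: namespace -> namespaces it requires.
const nsDeps = {
  "demo.app": ["demo.util"],
  "demo.util": [],
};

// Return the namespaces in an order where every dependency comes first.
function compileOrder(deps, entry) {
  const order = [];
  const done = new Set();
  const visiting = new Set();

  function visit(ns) {
    if (done.has(ns)) return;
    if (visiting.has(ns)) {
      throw new Error("circular dependency involving " + ns);
    }
    visiting.add(ns);
    // Every namespace except cljs.core implicitly depends on cljs.core.
    const requires = ns === "cljs.core" ? [] : ["cljs.core", ...(deps[ns] || [])];
    for (const dep of requires) visit(dep);
    visiting.delete(ns);
    done.add(ns);
    order.push(ns);
  }

  visit(entry);
  return order;
}

console.log(compileOrder(nsDeps, "demo.app"));
// [ 'cljs.core', 'demo.util', 'demo.app' ]
```

Because the analysis data for each namespace is plain data, a tool can skip the visit for any namespace whose cached data is still valid and only recompile what changed.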

Some dependencies will be provided by other JS files, either from the Closure Library, :foreign-libs or npm directly. All the analyzer actually needs to know here is which variable name to use when referencing code from other “namespaces”. It doesn’t actually do analysis of said JS code (yet).

So suppose we expand the example to use

(ns demo.app
  (:require
    [react :as r]
    [react-dom :as rd]
    [goog.dom :as dom]))

(defn init []
  (-> (r/createElement "h1" nil "Hello World!")
      (rd/render (dom/getElementById "app"))))

which compiles to something like

goog.provide("demo.app");
goog.require("cljs.core");
goog.require("goog.dom");

demo.app.init = function() {
  module$react_dom.render(module$react.createElement("h1", null, "Hello World!"), goog.dom.getElementById("app"));
}

In this case I used module$react as the alias for (:require [react :as r]). It won’t actually be that in real builds, but all you need to remember is that it is just a variable name. The alias mechanism may change depending on build options, but it is a close enough approximation to assume that the indexes created in Step #1 are used as the basis for assigning these aliases. There may be a goog.require("module$react") or there may not be; it is not relevant since goog.require is basically a noop inside files.

There is a special case for names provided by :foreign-libs (eg. everything cljsjs.*): those don’t create any aliases at all and instead just provide a global variable you use directly. So cljsjs.react is not used via a namespace alias but via js/React. This is bad for various reasons, and CLJSJS started adopting the alternative approach of using proper namespace aliases for the commonly used packages (eg. react).
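As a rough sketch of that idea, the function below turns a required name into the variable name the generated code would reference. It follows the simplified module$ naming used in this post, not the names real builds produce. The index shape, the kind field and variableFor are all hypothetical.

```javascript
// Hypothetical index from Step #1: required name -> where it comes from.
const requireIndex = new Map([
  ["react", { kind: "npm" }],
  ["react-dom", { kind: "npm" }],
  ["goog.dom", { kind: "goog" }],
]);

// Pick the JS variable name the generated code should use for a require.
function variableFor(name, index) {
  const entry = index.get(name);
  if (!entry) throw new Error("No such namespace: " + name);
  if (entry.kind === "npm") {
    // Simplified naming from this post: prefix plus a JS-safe version of the name.
    return "module$" + name.replace(/[^A-Za-z0-9_$]/g, "_");
  }
  // Closure-style namespaces are referenced by their own dotted name.
  return name;
}

// Emit a call the way the analyzer would: it only needs the variable name.
function emitCall(name, fn, index) {
  return variableFor(name, index) + "." + fn;
}

console.log(emitCall("react", "createElement", requireIndex));
console.log(emitCall("react-dom", "render", requireIndex));
console.log(emitCall("goog.dom", "getElementById", requireIndex));
```

The point is that nothing here reads the JS file behind the name. The analyzer only needs to know which variable to emit.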

Once all CLJS namespaces have been compiled we can proceed with the build.

Step #3 – Bundle/Organize JavaScript

If you take the above generated output and run it in a browser you’ll get an error saying that goog is not defined. goog is provided by the goog/base.js file, which is an implicit dependency of all CLJS code (and the Closure Library). So once ClojureScript compilation finishes we need to generate a few additional files, move them to the proper places and ensure that they are loaded in the proper order. Nothing done here is specific to ClojureScript; at this point we only have a bunch of .js files that need to be massaged into the right shape to be loadable in different environments (eg. the Browser).
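The load-order problem can be reproduced in a few lines. This is a toy model, not the real goog/base.js: each "file" is a function that receives a stand-in for the global object, and the tiny goog.provide only creates the nested objects for a namespace. The names baseJs, demoAppJs and loadInOrder are invented for this sketch.

```javascript
// Stand-in for goog/base.js: defines goog on the (fake) global object.
function baseJs(env) {
  env.goog = {
    // Create env.demo.app (and any missing parents) for "demo.app".
    provide(name) {
      let cur = env;
      for (const part of name.split(".")) {
        cur = cur[part] = cur[part] || {};
      }
    },
    require(_name) {
      // Loading is handled by loadInOrder in this sketch.
    },
  };
}

// Stand-in for the compiled demo/app.js from Step #2.
function demoAppJs(env) {
  env.goog.provide("demo.app");
  env.demo.app.init = function () {
    return "Hello World!";
  };
}

// "Loading" a file just means running it against the shared environment.
function loadInOrder(files, env) {
  for (const file of files) file(env);
  return env;
}

const browser = loadInOrder([baseJs, demoAppJs], {});
console.log(browser.demo.app.init());

try {
  loadInOrder([demoAppJs, baseJs], {}); // wrong order: goog does not exist yet
} catch (e) {
  console.log("wrong order fails:", e instanceof TypeError);
}
```

Producing the right list of files in the right order for each target environment is the job of Step #3.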

Step #4 – Optimize JavaScript

This step is optional and only done for “release” builds; it is not done during development. :optimizations :none means that it is skipped.

After the .js files are organized we still have a lot of them and they can get quite large. This is impractical when building for the Browser since it can take quite a while to load and often contains code we won’t actually use. This is where the Closure Compiler comes into play. It takes all the generated .js code and processes it with the given :optimizations setting (eg. :advanced). This will analyze all the JS code, remove the parts that aren’t used and shorten all variable names and a bunch of other really cool stuff to make the output as small as possible.
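To give a feel for what shortening names means, here is a hand-written imitation of renamed output. It is not real Closure Compiler output, and the function and property names are invented. It also shows the catch: renaming only works when the optimizer sees every place a name is used.

```javascript
// What the author wrote.
function greet(user) {
  return "Hello " + user.displayName;
}

// Hand-written imitation of what a renaming optimizer might emit:
// both the function name and the property name were shortened.
function a(b) {
  return "Hello " + b.c;
}

// Data created inside the optimized program is renamed consistently ...
const internalUser = { c: "Ada" };
// ... but data arriving from outside still uses the original property name.
const externalUser = JSON.parse('{"displayName": "Ada"}');

console.log(greet(externalUser)); // Hello Ada
console.log(a(internalUser)); // Hello Ada
console.log(a(externalUser)); // Hello undefined
```

The last line is the kind of breakage that makes aggressive optimization risky for code that was not written with it in mind.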

What about the REPL?

The REPL makes things a bit more complicated, but it basically just keeps the compiler state around, repeatedly does Step #2 and then does a custom Step #3 that organizes the code into a shape that can be loaded directly in the REPL. This may happen entirely in memory and not involve writing any files. Step #4 is never done for the REPL; in fact a REPL is impossible for :advanced optimized code.

So what about shadow-cljs?

shadow-cljs replaces Steps 1, 3 and 4. Step 2 remains basically the same. It is a different build tool for the same ClojureScript language we all love. Compare it to lein and boot: both are still just Clojure underneath. shadow-cljs uses the same ClojureScript core library (eg. cljs.core) as any other tool would.

It interfaces directly with cljs.analyzer and cljs.compiler and does some minor modifications since there are no “official” hooks into the namespace aliasing required for npm dependencies and such. Compilation of actual CLJS is unchanged and only the parts that involve interop with npm are changed. I’m very careful about introducing new stuff and want to remain 100% compatible, any new features are opt-in and have “fallbacks”.

Unfortunately the support for :npm-deps in CLJS is rather unreliable so sometimes people mistakenly think that (:require ["foo/bar" :as x]) is something shadow-cljs specific when it is absolutely not. Strings were added since there are certain JS requires that cannot be expressed as a symbol properly. Some examples:

(:require ["object.assign" :as x]) Although this could be a valid symbol it references node_modules/object.assign not node_modules/object/assign and doesn’t follow CLJ(S) rules for . in symbols
(:require ["react-virtualized/dist/commonjs/AutoSizer" :as x]) too many /, can’t be mapped to . due to the ambiguity above
(:require ["@material/button" ...]) npm “scoped” packages, @ already used for deref
(:require [decompress-tar]) following the standard CLJ(S) naming rules would map to node_modules/decompress_tar instead of the actual node_modules/decompress-tar. JS allows -, _ in names.
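The mismatch in these examples can be shown mechanically. The sketch below applies the usual CLJ(S) name-to-path convention ("." separates directories, "-" becomes "_") to a symbol and compares it with simply taking a string as written. Both helper functions are hypothetical and only illustrate the naming rules, not any tool's actual resolution logic.

```javascript
// Usual CLJ(S) convention for turning a namespace symbol into a path:
// "." separates directories and "-" is munged to "_".
function symbolToPath(sym) {
  return "node_modules/" + sym.replace(/-/g, "_").replace(/\./g, "/");
}

// A string require is taken literally, exactly as JS tools would resolve it.
function stringToPath(name) {
  return "node_modules/" + name;
}

for (const name of ["object.assign", "decompress-tar"]) {
  console.log(name);
  console.log("  as symbol:", symbolToPath(name));
  console.log("  as string:", stringToPath(name));
}
// object.assign
//   as symbol: node_modules/object/assign
//   as string: node_modules/object.assign
// decompress-tar
//   as symbol: node_modules/decompress_tar
//   as string: node_modules/decompress-tar
```

A tool that accepts symbols therefore has to guess which reading was intended, while a string leaves no room for interpretation.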

JavaScript in general and especially npm does not have a proper namespacing system. Everything is just a bunch of files in a “package”, which provides some sort of basic isolation. Otherwise it is just files, stitched together by relative file paths.

Just like Clojure allows using any Java class, we needed a way to address all JS, and string requires let us do that. In the beginning shadow-cljs actually added a special (:js/require ["..." :as x]) syntax to ns to deal with this, but that would not have been supported by plain CLJS. After some discussion David Nolen suggested allowing strings in :require, and that was added not much later.

This change did not originate in shadow-cljs although it is used far more frequently here and certainly was part of the discussion. Since it doesn’t work reliably enough with :npm-deps people just kept using symbols and never adopted strings. If it were my decision I would not have allowed mapping JS names to symbols (eg. (:require [react :as r])) and instead always forced using strings for JS requires. It wasn’t my decision to make so shadow-cljs supports the symbols as well as strings. I do not like guessing if I’m using a CLJS namespace or something from JS and the ambiguities that come with it, so I just recommend using strings. One important aspect of CLJ(S) is integration with the Host and working with paths and files is just a fact of life in JS that isn’t going anywhere.

So why shadow-cljs?

Why re-implement Step 1,3,4 instead of building on top of what the official tools provide?

A rather long time ago I asked:

What exactly needs to be in CLJS?

And in my opinion the answer is still Step #2. ClojureScript should focus on compiling .cljs|.cljc -> .js.

Everything else should definitely have a default implementation which CLJS provides via cljs.closure but there should be alternatives. Building on top of cljs.closure or the “official” cljs.build.api is not flexible enough for what shadow-cljs wants to do.

Rich Hickey made the very smart decision to use the Closure Compiler, which was and is fantastic. People just falsely assume that it is required for CLJS compilation when it is only used in the optional Step #4. Nowadays it would probably be better to emit ES6 code instead, but back then that wasn’t an option since it didn’t exist.

The support for :modules is what started me on the path of shadow-cljs (then shadow-build) and it simply wasn’t an option to use the official APIs back then since it wasn’t supported. :modules support was added eventually but only after shadow-cljs had already proven that it was a good thing to have. The npm support in shadow-cljs is somewhat similar. It started as an experiment to see how practical it is.

In my opinion these experiments should not be done in ClojureScript directly.

:npm-deps sort of shows that. It started as an experiment asking: What if everything is :advanced compiled? This is a very valid question to ask and I wish it actually worked. In practice however it doesn’t work very well and probably won’t for a long time. This is not a problem with the implementation. npm is just a total wild west of competing JavaScript standards and idioms that will simply never work correctly with :advanced. So :npm-deps is probably forever limited to “it works for package a,b,c but not the rest”. The idea is still fantastic, it just didn’t work for my projects.

I wanted to try a different approach for so many things and that is what you get in shadow-cljs. Should this be the default? Definitely not. JavaScript is still evolving and keeps adding stuff constantly. Keeping up with all of that is honestly frustrating at times and probably something we want to avoid. Alternatives like CLJSJS or using webpack certainly can work and make sense in certain situations. They just don’t solve the problems I wanted to solve.

Point is that I think most of this is not actually related to ClojureScript itself. Clojure doesn’t include lein or boot. Even tools.deps is just a library. I think it is valuable to try different approaches to building ClojureScript projects, that is what shadow-cljs is about. It is different since so far it is the only tool that doesn’t build on top of the official “build” APIs. I hope there will be others. This is an area worth exploring and there is so much left to learn. The fantastic ideas Bruce Hauman had with figwheel certainly influenced a couple of things in shadow-cljs. I hope my work can have a similar impact on ClojureScript development in the long run.

In the meantime I want to provide a stable and reliable tool that works and makes your life easier when working with ClojureScript and JavaScript. You can support me on Patreon.
