runme: main build entry pointsrc/: scala and python sourcescore/: shared functionalityproject/: sbt build-related materials
tools/: build-related and other meta tools
Currently, this code is developed and built on Linux. The main build entry
point, ./runme, will install the needed packages. When everything is
installed, you can use ./runme again to do a build.
From now on, you can continue using ./runme for builds. Alternatively, use
sbt full-build to do the build directly through SBT. The output will show
the individual steps that are running, and you can use them directly as usual
with SBT. For example, use sbt "project foo-bar" test to run the tests of
the foo-bar sub-project, or sbt ~compile to do a full compilation step
whenever any file changes.
Note that the SBT environment is set up in a way that makes all code in
com.microsoft.ml.spark available in the Scala console that you get when you
run sbt console. This can be a very useful debugging tool, since you get to
play with your code in an interactive REPL.
Every once in a while the installed libraries will be updated. In this case,
executing ./runme will update the libraries, and the next run will do a build
as usual. If you're using sbt directly, it will warn you whenever there was
a change to the library configurations.
Note: the libraries are all installed in $HOME/lib with a few
executable symlinks in $HOME/bin. The environment is configured in
$HOME/.mmlspark_profile which will be executed whenever a shell starts.
Occasionally, ./runme will tell you that there was an update to the
.mmlspark_profile file --- when this happens, you can start a new shell
to get the updated version, but you can also apply the changes to your
running shell with . ~/.mmlspark_profile which will evaluate its
contents and save a shell restart.
To add a new module, create a directory with an appropriate name, and in the
new directory create a build.sbt file. The contents of build.sbt is
optional, and can be completely empty: its presence will make the build include
your directory as a sub-project which gets included in SBT work.
You can put the usual SBT customizations in your build.sbt, for example:
version := "1.0"
name := "A Useful Module"In addition, there are a few utilities in Extras that can be useful to
specify some things. Currently, there is only one such utility:
Extras.noJarputting this in your build.sbt indicates that no .jar file should be
created for your sub-project in the package step. (Useful, for example, for
build tools and test-only directories.)
Finally, whenever SBT runs it generates an autogen.sbt file that specifies
the sub-projects. This file is generated automatically so there is no need to
edit a central file when you add a module, and therefore customizing what
appears in it is done via "meta comments" in your build.sbt. This is
currently used to specify dependencies for your sub-project --- in most cases
you will want to add this:
//> DependsOn: coreto use the shared code in the common sub-project.