Tuesday, May 11, 2010

A DSL to access the Google App Engine datastore from Clojure

[Update: This post belongs to a guest posting at the Google Code blog. Read it here.]

After more than 10 years of bringing new products online for our customers, we now have our very own startup: TheDeadline, an intelligent to-do management system. One of the tools we used to build TheDeadline was a domain specific language, or DSL, we created for working with the App Engine datastore.
TheDeadline runs on Google App Engine and is written in Clojure (read why here). Clojure is a modern Lisp running on the Java Virtual Machine. App Engine provides a distributed key-value datastore based on Google's Bigtable system. You can use the datastore from Python and Java, as well as other languages that run on the JVM. If you are using Java, you have several options to store and access your data, including standardized object persistence mapping via the provided JDO and JPA interfaces.

Modeling your data structures for a distributed key-value store for large-scale internet applications differs in several key aspects from ER modeling: you forget about normalization, you optimize for read access, and so on. As a result, we believe that object-oriented persistence mapping can lead a developer to model object relationships incorrectly: you should not have complex object relationships in your datastore. In addition, since Clojure is a functional programming language, it makes less sense to use a persistence mechanism rooted in object-oriented practices. In Clojure you use structs (maps) and not "objects" to hold your data, which means that you already have simple key-value structured data at hand. There is no need for object persistence mapping at all. The most natural way is to use the low-level datastore API directly.

One more thing to mention: The App Engine datastore is a schema-free database. This means that the schema is maintained on the application level and not at the database level. But it is still desirable to have a schema! You still have to structure your data. What you really want to do is to define your data structures and to let Clojure generate the needed code to store and query your data. You can do this by writing Clojure macros. A Clojure macro is a Clojure program that generates another Clojure program. Macros allow you to extend the Clojure language with your own embedded mini-languages, also known as DSLs.
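To make the idea of "a program that generates another program" concrete, here is a minimal sketch of a macro that generates a constructor function from a list of attribute names. The macro name def-maker and the note example are hypothetical and are not part of the DSL described in this post; the real implementation may look quite different.

```clojure
;; Hypothetical illustration only: def-maker is NOT part of the DSL in this post.
(defmacro def-maker
  "Defines a function make-<entity> that builds a map containing exactly the
   given attribute keys (nil by default) plus a :kind entry."
  [entity & attrs]
  (let [fn-name (symbol (str "make-" entity))]
    `(defn ~fn-name [& {:as values#}]
       (merge (zipmap [~@attrs] (repeat nil))      ; every attribute defaults to nil
              (select-keys values# [~@attrs])      ; keep only the declared attributes
              {:kind ~(str entity)}))))            ; remember which kind of entity this is

;; Expands into a (defn make-note ...) form at compile time.
(def-maker note :key :text)

(make-note :text "buy milk")
;; => {:key nil, :text "buy milk", :kind "note"}
```

The point of the sketch is only that the schema is written once, as data, and the code that works with it is generated.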

Our solution for TheDeadline consists of two parts: a data structure definition language and a query language. Let's say you want to store data about books in the App Engine datastore. The first step is to define the data structure of a book with the defentity macro. This defines a book entity with six attributes:

(defentity book
  [:key]
  [:title]
  [:author]
  [:publisher]
  [:isbn]
  [:pages])

defentity generates several functions. The most important one in this case is make-book. Let's create some books now:

(def *books*
  (list (make-book :title "On Lisp"
           :author "Paul Graham"
           :publisher "Prentice Hall"
           :isbn "978-0130305527"
           :pages 413)
    (make-book :title "Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp"
           :author "Peter Norvig"
           :publisher "Morgan Kaufmann"
           :isbn "978-1558601918"
           :pages 946)
    (make-book :title "Programming Clojure"
           :author "Stuart Halloway"
           :publisher "Pragmatic Programmers"
           :isbn "978-1934356333"
           :pages 304)))

When you now evaluate the variable *books*, you'll see it contains a list of Clojure maps with your book data:

repl-prompt> *books*
({:key nil, :title "On Lisp", :author "Paul Graham", :publisher "Prentice Hall", :isbn "978-0130305527", :pages 413, :kind "book"} {:key nil, :title "Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp", :author "Peter Norvig", :publisher "Morgan Kaufmann", :isbn "978-1558601918", :pages 946, :kind "book"} {:key nil, :title "Programming Clojure", :author "Stuart Halloway", :publisher "Pragmatic Programmers", :isbn "978-1934356333", :pages 304, :kind "book"})

The :key attribute is nil because we have not saved the data to the datastore yet. We just created the books in memory. If you now want to store these books in the App Engine datastore, you just call the function store-entities!:

repl-prompt> (store-entities! *books*)
({:pages 413, :isbn "978-0130305527", :publisher "Prentice Hall", :author "Paul Graham", :title "On Lisp", :key #<Key book(7)>, :parent-key nil, :kind "book"} {:pages 946, :isbn "978-1558601918", :publisher "Morgan Kaufmann", :author "Peter Norvig", :title "Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp", :key #<Key book(8)>, :parent-key nil, :kind "book"} {:pages 304, :isbn "978-1934356333", :publisher "Pragmatic Programmers", :author "Stuart Halloway", :title "Programming Clojure", :key #<Key book(9)>, :parent-key nil, :kind "book"})

store-entities! returns a list of the entities that have just been stored. We put an exclamation mark at the end of the function name to signal that it has a side effect. You can see that the key attributes are not nil anymore. What you see here is the string representation of the App Engine datastore Key objects. You can now access each entity by its key. Much of the time, however, you will want to select the subset of entities that satisfies certain criteria. You can do this with another DSL: the datastore query language.

Let's say we want to select all books from the author "Peter Norvig":

repl-prompt> (select (where book ([= :author "Peter Norvig"])))
({:pages 946, :isbn "978-1558601918", :publisher "Morgan Kaufmann", :author "Peter Norvig", :title "Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp", :key #<Key book(8)>, :parent-key nil, :kind "book"})

Or all books with fewer than 400 pages:

repl-prompt> (select (where book ([< :pages 400])))
({:pages 304, :isbn "978-1934356333", :publisher "Pragmatic Programmers", :author "Stuart Halloway", :title "Programming Clojure", :key #<Key book(12)>, :parent-key nil, :kind "book"})

These mini-languages are very simple to use and you don't need to know anything about the datastore internals or the Java datastore API! You don't even need to know what an entity is because you just work with Clojure maps. You can map, reduce and filter these maps just like any other built-in Clojure data structure. So we have a very natural integration with the language.
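As a small illustration of that last point, the following sketch uses only plain Clojure on in-memory maps shaped like the query results above. The helper names short-titles and total-pages are made up for this example.

```clojure
;; Plain maps, shaped like the entities returned by the query language.
(def books
  [{:title "On Lisp" :author "Paul Graham" :pages 413 :kind "book"}
   {:title "Paradigms of Artificial Intelligence Programming" :author "Peter Norvig" :pages 946 :kind "book"}
   {:title "Programming Clojure" :author "Stuart Halloway" :pages 304 :kind "book"}])

(defn short-titles
  "Titles of all books with fewer than max-pages pages."
  [books max-pages]
  (map :title (filter #(< (:pages %) max-pages) books)))

(defn total-pages
  "Sum of the page counts of all books."
  [books]
  (reduce + (map :pages books)))

(short-titles books 400)
;; => ("Programming Clojure")

(total-pages books)
;; => 1663
```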

There is more: At some point you'll need some more functions to convert complex types between your application and the Google datastore and vice versa because the datastore supports only a fixed set of datatypes. If you want to store unsupported types, you have to take care of the serialization/deserialization yourself (check the supported types here).

Let's construct a simple example: You want to store a Boolean value for whether the book is out of print, but the input data in your program is a string "yes" or "no". "yes" should be translated to true and "no" should be translated to false before the entity is saved to the datastore. When the entity is loaded from the datastore, the Boolean values should be translated back to the string values again. To do this, we add the attribute outofprint to our book definition and we define a :pre-save and a :post-load anonymous function to convert between string and Boolean values. They are called with the current value of the attribute as their only parameter (the placeholder for this is the '%' sign). The return value is set as the new attribute value:

(defentity book
  [:key]
  [:title]
  [:author]
  [:publisher]
  [:isbn]
  [:pages]
  [:outofprint
   :pre-save #(= % "yes")
   :post-load #(if %
                 "yes"
                 "no")])


Storing a book would look like this now. We would provide the outofprint parameter as a string:



(store-entities! (make-book :title "On Lisp"
                             :author "Paul Graham"
                             :publisher "Prentice Hall"
                             :isbn "978-0130305527"
                             :pages 413
                             :outofprint "yes"))

Now let's load the book from the datastore again. The parameter is still a string:


repl-prompt> (select (where book ([= :author "Paul Graham"])))
({:outofprint "yes", :pages 413, :isbn "978-0130305527", :publisher "Prentice Hall", :author "Paul Graham", :title "On Lisp", :key #<Key book(21)>, :parent-key nil, :kind "book"})


But when you examine the entity under the hood, you see that the value of outofprint is stored as a Boolean:


#<Entity <Entity [book(21)]:
    author = Paul Graham
    title = On Lisp
    pages = 413
    isbn = 978-0130305527
    outofprint = true
    publisher = Prentice Hall
>
>


For every attribute, our mini-language executes the :pre-save and :post-load functions automatically before it saves/loads data to/from the datastore. You can use these functions for type conversions, to manipulate the data in other ways, do calculations or whatever, and of course you can use these functions for validation.
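To show the mechanism without a datastore, here is a rough sketch of how per-attribute hooks could be applied to a plain map. The names book-hooks and apply-hooks are assumptions made for this illustration; they are not the functions generated by defentity.

```clojure
;; Hypothetical hook table, mirroring the :outofprint definition above.
(def book-hooks
  {:outofprint {:pre-save  #(= % "yes")
                :post-load #(if % "yes" "no")}})

(defn apply-hooks
  "Runs the hook for the given phase (:pre-save or :post-load) on every
   attribute of entity that has such a hook; other attributes are untouched."
  [hooks phase entity]
  (reduce-kv (fn [m attr fns]
               (if-let [f (and (contains? m attr) (get fns phase))]
                 (update-in m [attr] f)
                 m))
             entity
             hooks))

(apply-hooks book-hooks :pre-save {:title "On Lisp" :outofprint "yes"})
;; => {:title "On Lisp", :outofprint true}

(apply-hooks book-hooks :post-load {:title "On Lisp" :outofprint true})
;; => {:title "On Lisp", :outofprint "yes"}
```

Because the hooks are ordinary functions, a validation hook could just as well throw an exception instead of returning a converted value.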



This is all you need to know to write code to access the datastore. If you are new to Clojure (or any other Lisp), then you might get a feeling for why Paul Graham once said: "Lisp's power is multiplied by the fact that your competitors don't get it." Use simple data structures. Create powerful functional abstractions. Write less code. If you want to give our mini-languages a try, you can find the code here. You will find :pre-save and :post-load functions at the entity level, transactions with automatic retries, queries by key, queries that return only keys, automatic resolution of parent/child relationships between entities, and automatic resolution of entities from attributes that contain keys.


If you want to get started with Clojure on Google App Engine, I can recommend this post. You'll need this post to set up an interactive programming environment.


If you are curious now and would like to try out TheDeadline, you can sign up here. For more Clojure and App Engine related posts, you can follow our blog H.W.A. If you're attending Google I/O, you can chat with us about Google App Engine, Clojure and the Universe in the Developer Sandbox on May 19 and 20.

Happy hacking!

Monday, April 19, 2010

Clojure & Google App Engine Setup - Update

We posted an article about interactively developing Clojure applications on Google App Engine some time ago. We use this setup for developing our online to-do manager TheDeadline, which is now in open beta and accepts registrations from everyone.

As http://blog.miau.biz/2010/01/interactive-clojure-on-appengine-pt2.html correctly pointed out, the old setup didn't work out of the box for serving static files.

We'd like to post some updated and cleaned up setup code that works with the current App Engine SDK and also facilitates writing a web based application.


Have a look here.

Assuming you already defined your Compojure routes named "app", all you have to do is the following:

(ns start
  (:require [com.freiheit.clojure.appengine.appengine-local :as gae-local]))

(defn start-server
  []
  (gae-local/defgaeserver app-server app)
  (gae-local/start-gae-server app))


This also does some of the setup for serving static files that you usually need in a web application. The code determines your current working directory and looks for a directory called src/web. Files in the subdirectories "js", "css" and "static" are automatically served as static files.
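The directory convention can be pictured with a few lines of plain Clojure. This is only a sketch of the lookup described above, not the code from the setup itself; static-subdirs and static-dirs are names invented for this example.

```clojure
(import 'java.io.File)

;; Subdirectories of src/web that are treated as static content (from the text above).
(def static-subdirs ["js" "css" "static"])

(defn static-dirs
  "Returns File objects for the static subdirectories below
   <current working directory>/src/web. The directories need not exist."
  []
  (let [web-root (File. (System/getProperty "user.dir") "src/web")]
    (for [sub static-subdirs]
      (File. web-root (str sub)))))

(map #(.getName %) (static-dirs))
;; => ("js" "css" "static")
```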

Wednesday, March 24, 2010

Ada Lovelace Day 2010

Today we celebrated Ada Lovelace Day 2010 @ freiheit.com technologies as a tribute to all women on this planet who are working in software companies, studying computer science or working as programmers. We printed a special T-shirt, some teams cooked for their female members and we had some fun doing this photo shoot! It's springtime in Hamburg! Missing from this picture are Claudia (on a business trip) and Maja (still in India on her sabbatical trip, but soon back in town!).

Wednesday, February 17, 2010

How a Clojure pet project turned into a full-blown cloud-computing web-app

I just made the slides of my Clojure talk available on Slideshare. The meeting took place at the newthinking store in Berlin and it was sponsored by Franz Lisp. Thank you! And "Thank you" to Hans Hübner for organizing this event!
The next "Berlin Lispers Meetup" will be cool, too. The speakers are Luke Gorrie and Edi Weitz. http://netzhansa.blogspot.com/2010/02/berlin-lispers-meetup-tuesday-march-2.html

Monday, December 21, 2009

Weihnachtszocken at freiheit.com 2009

Last weekend, it started all over again: We celebrated our traditional "Weihnachtszocken" ("zocken" is German slang for gaming) in our lovely "Gartenhaus" (a part of the company complex with its own garden).


While we were planning the event, a new attendance record was in the air: in the end we had 23 (!!) registrations for gaming. Of course, on the day itself we lost a few players to self-updating Windows operating systems (damn you!!) and overlong game installations (next time: start a few days earlier ;-) ). These failures meant that we peaked at 19 simultaneous gamers. But hey, still exceptional!

Our lovely "Gartenhaus" was a perfect gaming location as usual: nearly everybody used the big screens, and we had a working network with flat-rate internet access, separated from our valuable internal server systems, which need to be protected from our Windows-infected gaming PCs ;-).


Like every year, we played the same great game: Team Fortress 2. It's a first-person shooter that is easy for beginners but nevertheless offers a lot of strategic challenges to more experienced gamers.


To sum up, it was a great day with a lot of emotions.


Friday, November 13, 2009

Facebook's changing face over time

I work as a systems administrator at freiheit.com. So that our developers can test the systems they build with different versions of Internet Explorer, they asked me to check out a piece of software called "Internet Explorer Collection" [1]. It installs every Internet Explorer from 1.0 to 8. Needless to say, I had to try out current sites with old flavors of IE. Here are some screenshots of Facebook in the different browsers:


Hmmm, JavaScript wasn't implemented by Microsoft when IE1 was released. It first appeared in 1995 with the Netscape browser. Here, the JS sources are shown within the page...

On to IE2, still no JavaScript:

IE3: Hooray for JavaScript (at that time called JScript by Microsoft due to trademark issues), but apparently CSS support is only in its beginnings:

Then comes IE4. Basic CSS support is there, but not much going on otherwise:


IE5 to the rescue! CSS support is somewhat better, and JavaScript also works. But Facebook kept telling me that my browser is too old...


Needless to say, I wasn't able to log in with any of those versions. I won't bother to show you screenshots of IE6 to IE8; I'm sure you can find some machines around that have those versions installed.

When playing with the different versions, I frequently got this message dialog, which shows up every time data is stored in a Cookie (a.k.a. "Internet information stored on your computer"). At that time, this was the default behavior.


And guess what page worked in most of the versions? Of course! Google.com, here is a screenshot from IE3, with the suggest box in the top left corner:

Oh well, the good old times...

1: http://finalbuilds.edskes.net/iecollection.htm

Thursday, September 3, 2009

Our corporate website runs on Clojure and Google App Engine!

Today we launched our new corporate website.

Our objective: Being able to post new project stories and other information about our company faster and easier. And we wanted to provide a new navigation system based on a tag cloud to make the usage of the site more fun.

So we first wrote a custom content management system. When we write software for ourselves, we use Common Lisp or Clojure. This time we used Clojure, because we wanted to run the system on Google App Engine (GAE). We are quite happy with the result. :)

Some technical details:
  • Text and images are stored in the Google Datastore. Forget about relational databases.
  • When an image is uploaded, we scale it automatically with the GAE image service.
  • We use memcached to cache the page content. It is quite fast.
  • We use client-side Google Translation to translate the site into 9 languages.
  • The GAE deployment and versioning is easy and fun.
  • We had very, very few problems with GAE or Clojure.
Visit www.freiheit.com

Check out our post about interactive programming with Emacs, Clojure and App Engine!

Friday, August 28, 2009

Interactive Programming with Clojure, Compojure, Google App Engine and Emacs

Clojure is a Lisp running on the JVM. Interactive programming with Clojure is fun. Just hit C-c C-c to compile a defun in Emacs. If you come from Java, you will appreciate this: No long build-deploy-run cycles anymore. But if you want to write Clojure programs for Google App Engine, it looks like interactive programming is not possible. Really?! No! This tutorial shows how to develop applications interactively, Lisp-style, with Clojure, Compojure, Google App Engine and Emacs.

The Google App Engine SDK includes a development server (based on Jetty) that allows you to test your application in an environment that is close to the real one. This is nice for a pre-flight check but for every change you make in the code you must stop the devserver, recompile and start the server again. You can't develop your code incrementally and interactively as you're used to in a lispy language.

We'll presume that Emacs and Clojure are configured, that you are familiar with Compojure and that the App Engine SDK has been downloaded and installed.

John Hume created a nice little Clojure binding for App Engine that we'll use here.

This is the example servlet:
; A basic application using Google App Engine.
; When this file is compiled, it generates a class that extends
; javax.servlet.http.HttpServlet, which can be mapped in the
; application's web.xml.

(ns helloworld
  (:gen-class :extends javax.servlet.http.HttpServlet)
  (:use compojure.http compojure.html)
  (:require [appengine-clj.datastore :as ds])
  (:import [com.google.appengine.api.datastore Query]))

(defn index
  [request]
  (let [items (ds/find-all (Query. "item"))]
    (html
      [:h1 (str "Hello World. There are " (count items) " in the database.")]
      [:a {:href "/new"} "Create another one"])))

(defn new
  [request]
  (do
    (ds/create {:kind "item" :text "something"})
    (redirect-to "/")))

(defroutes helloworld
  (GET "/" index)
  (GET "/new" new))

(defservice helloworld)


This file compiles to a servlet class that can be used in App Engine (see the introduction post about Clojure on App Engine for a more detailed example). If you navigate your browser to "/" when the devserver is running, you'll see a page displaying the number of "item" entities in the datastore. If you click on the link, you'll create a new entity and be redirected to the index page.

In order to develop the code more interactively, we'll create another file that will start a Jetty server. We'll call this file start0.clj:

; Starting the application in Jetty.
; In this first version the application is passed to the server as-is;
; the App Engine services are not set up yet.

(ns start0
  (:use helloworld)
  (:use compojure.server.jetty compojure.http compojure.control))

(defn start-it
  []
  (do
    (run-server {:port 9090} "/*" (servlet helloworld))))


Compiling this file with C-c C-k in Emacs and calling the function start-it in the REPL will start Jetty. The application is available on port 9090.



If you view the application in the browser, there's a problem.



More work is needed. The services aren't initialized yet. We'll fix this in start.clj:

; starting the application in a jetty.
; the original application is decorated by a function that sets up the
; app engine services.

(ns start
  (:use helloworld)
  (:use compojure.server.jetty compojure.http compojure.control))

(defmacro with-app-engine
  "testing macro to create an environment for a thread"
  ([body]
    `(with-app-engine env-proxy ~body))
  ([proxy body]
    `(last (doall [(com.google.apphosting.api.ApiProxy/setEnvironmentForCurrentThread ~proxy)
                   ~body]))))

(defn login-aware-proxy
  "returns a proxy for the google apps environment that works locally"
  [request]
  (let [email (:email (:session request))]
    (proxy [com.google.apphosting.api.ApiProxy$Environment] []
      (isLoggedIn [] (boolean email))
      (getAuthDomain [] "")
      (getRequestNamespace [] "")
      (getDefaultNamespace [] "")
      (getAttributes [] (java.util.HashMap.))
      (getEmail [] (or email ""))
      (isAdmin [] true)
      (getAppId [] "local"))))

(defn environment-decorator
  "decorates the given application with a local version of the app engine environment"
  [application]
  (fn [request]
    (with-app-engine (login-aware-proxy request)
      (application request))))

(defn init-app-engine
  "Initialize the app engine services."
  ([]
    (init-app-engine "/tmp"))
  ([dir]
    (com.google.apphosting.api.ApiProxy/setDelegate
      (proxy [com.google.appengine.tools.development.ApiProxyLocalImpl] [(java.io.File. dir)]))))

;; make sure every thread has the environment set up

(defn start-it
  []
  (do
    (init-app-engine)
    (run-server {:port 8080} "/*" (servlet (environment-decorator helloworld)))))


Before Jetty is started, local test implementations of the services are registered. Also, every request is wrapped by a function that registers the environment for the current thread. Now the application runs as expected:
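The wrapping itself is an ordinary higher-order function. The following stripped-down sketch shows the same pattern without any App Engine classes: a plain ThreadLocal stands in for the per-thread environment, and the names current-env, env-decorator, session-env and whoami are made up for this illustration.

```clojure
;; Stand-in for the per-thread environment (an assumption for this sketch;
;; the real code above calls ApiProxy/setEnvironmentForCurrentThread instead).
(def ^ThreadLocal current-env (ThreadLocal.))

(defn env-decorator
  "Returns a handler that first stores an environment derived from the
   request in the current thread and then calls the original handler."
  [handler make-env]
  (fn [request]
    (.set current-env (make-env request))
    (handler request)))

(defn session-env
  "Builds a tiny environment map from the request's session."
  [request]
  {:email (:email (:session request))})

(defn whoami
  "Example handler that reads the environment of the current thread."
  [request]
  (str "Hello " (:email (.get current-env))))

((env-decorator whoami session-env) {:session {:email "a@example.com"}})
;; => "Hello a@example.com"
```

Since the environment is set at the start of every request, it does not matter which server thread ends up handling the request.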



Google includes test implementations of the services in the jar files "appengine-local-runtime.jar" and "appengine-api-stubs.jar". So these two must be in the classpath of the Clojure process that is run from Emacs/SLIME. Be careful not to include these two jars in the war file that you deploy to the App Engine servers. The application won't run!

Now that you've got a setup like this, incremental changes to the code are very simple: Try to change the text in the function "index" and just recompile the function with C-c C-c. The change should appear immediately if you reload the page in your browser.
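Redefinition takes effect immediately because a function that calls another top-level function by name looks it up through its var on every call. A tiny REPL-style sketch (the functions page-text and handler are hypothetical and unrelated to the servlet above):

```clojure
(defn page-text [] "Hello World")

;; handler refers to page-text by name, so the var is dereferenced on each call.
(defn handler [] (str "<h1>" (page-text) "</h1>"))

(def before (handler))
;; before => "<h1>Hello World</h1>"

;; Re-evaluating the defn (what C-c C-c does for a single function)
;; replaces the function held by the var; handler is left untouched.
(defn page-text [] "Hello again")

(def after (handler))
;; after => "<h1>Hello again</h1>"
```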

Now you can work interactively with Clojure and Google App Engine!

Tuesday, May 27, 2008

Photos: ECLM 2008 in Amsterdam

I just published my photos from the European Common Lisp Meeting 2008 in Amsterdam.

BTW: Edi Weitz has a good list of "after-action reports" here.

Tuesday, January 22, 2008

European Common Lisp Meeting 2008, April 20, Amsterdam

I have been invited to speak at ECLM 2008 in Amsterdam. This is the abstract of my talk:

"Using Common Lisp for Large Internet Systems"

The internet is full of interesting problems for computer scientists. Large internet systems tend to be used by hundreds of thousands to millions of users. Many of them "serve the long tail" with millions of products or large numbers of other content items like text, video and sound. The main problem to solve is always to help the user find "the needle in the haystack". Obviously, such systems need intelligent algorithms and efficient implementations.

This raises the question of which programming language is the right tool to implement such systems. Currently most large internet systems are written in Java, which is not exactly famous for its brevity. Neither is C++. And more and more people are starting to use Ruby (on Rails) because "Ruby is an acceptable Lisp".

Paul Graham said: "Lisp's power is multiplied by the fact that your competitors don't get it".

How could we improve, if we used Common Lisp to build internet systems? This talk provides an in-depth analysis of current and future trends in web/internet programming and shows how to use Common Lisp as a tool to implement intelligent internet systems using the functional paradigm and sometimes even artificial intelligence algorithms. I will reveal what kind of Common Lisp systems we at freiheit.com technologies already have in production and what we are currently working on.