Tuesday, February 5, 2013

CouchDB + GeoCouch Installation on OS X 10.7.5

I am working on a mobile project that requires location data.  I originally wanted to try Couchbase Mobile, but that looks like it is still a ways off.  Ultimately I want to use a PaaS or DBaaS provider for the location datastore.

Something like Cloudant or IrisCouch seems appealing, but even then I would need something local to test and develop with.  So, not wanting to lose momentum, I decided to use a local install of CouchDB with the GeoCouch add-on.  There were already a few guides on how to do this on both Macs and Linux.

Installing CouchDB

I used MacPorts to install CouchDB.  Other instructions recommended using Homebrew, but since I already use MacPorts I stuck with it.

 sudo port install couchdb  

Installing GeoCouch

The Couchbase GitHub install instructions for GeoCouch got me part of the way there.  After following those instructions, the spatial test suite was failing.  This blog post from b.l.mak_s got me on the right track.  As it turned out, the Erlang .beam files for GeoCouch needed to be copied to the CouchDB ebin folder, and the default.ini needed updating, for GeoCouch to finally work:


 sudo cp /Users/barryalexander/Development/geocouch/ebin/* \
   /opt/local/lib/couchdb/erlang/lib/couch-1.2.1/ebin/.

The instructions on the GeoCouch GitHub page use an ERL_FLAGS environment variable, but I couldn't get that to work, so I just copied the *.beam files and the couch app file to the CouchDB ebin folder.

Merge the contents of /opt/local/etc/couchdb/geocouch.ini into the corresponding sections of /opt/local/etc/couchdb/default.ini.  For instance, take the entries under [daemons], [httpd_db_handlers], and [httpd_design_handlers] from geocouch.ini and move them into the default.ini of the CouchDB installation.

After doing the above two steps, all spatial tests passed, and the recommended smoke tests found here also worked.

 barry-alexanders-MacBook-Pro:geocouch barryalexander$ curl -X GET 'http://localhost:5984/places/_design/main/_spatial/points?bbox=0,0,180,90'  
 {"update_seq":3,"rows":[  
 {"id":"augsburg","bbox":[10.89833299999999916,48.37166700000000219,10.89833299999999916,48.37166700000000219],"geometry":{"type":"Point","coordinates":[10.89833299999999916,48.37166700000000219]},"value":["augsburg",[10.89833299999999916,48.37166700000000219]]}  
 ]}  
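For completeness, here is a rough sketch of issuing the same bounding-box query from Groovy instead of curl.  The database, design document and spatial view names are the ones from the smoke test above; substitute your own.

import groovy.json.JsonSlurper

// Same bounding-box query as the curl smoke test, issued from a Groovy script
def bbox = '0,0,180,90'
def url  = 'http://localhost:5984/places/_design/main/_spatial/points?bbox=' + bbox
def result = new JsonSlurper().parseText(new URL(url).text)
result.rows.each { row ->
    println "${row.id} -> ${row.geometry.coordinates}"
}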

Of course, you'll have to do this all over again if you update or reinstall CouchDB.

Thursday, November 8, 2012

A Simple Release Calendar App

Agile Releases

Agile releases are by nature short in duration and composed of one or more iterations.  The length of an iteration can be an arbitrary number of days.  The organization I work for uses a 10-business-day iteration.  Each releasable deployment is composed of 2 iterations.  Our releases are denoted by the last two digits of the year, a dot, and the two-digit month: YY.MM.

So, for example, a release destined for March 2013 would be stated as 13.03.
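To make the convention concrete, here is a tiny Groovy sketch of deriving a YY.MM release name from a target release date (the date value is just an example):

// Derive a YY.MM release name from a target release date (example date only)
def targetDate = Date.parse('yyyy-MM-dd', '2013-03-15')
def releaseName = targetDate.format('yy') + '.' + targetDate.format('MM')
assert releaseName == '13.03'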

Our iterations are numbered in a sequence starting from some arbitrary number.  Currently our iteration numbers are in the two hundreds: 213, 214, 215, etc.


Release    12.11  12.12  13.01  13.02  13.03
Iterations 220    222    224    226    228
           221    223    225    227    229

Need for Automating Release Calendar Creation

To ensure all team members are aligned, a single source for the release calendar is key to a smooth-running delivery pipeline.  You could assign an individual or group to manage the creation of a release calendar and publish it to a common document location within the organization.  In fact, this is what my current organization does.  But I saw that a little automation could ease the creation of the release calendar and provide a more flexible means of consumption through a publicly available API.

Simple App Creation with App Frameworks

Since most of these kinds of projects are not directly funded within organizations because they generate no perceived business value, it makes sense to be able to quickly build and deploy your little app on a shoestring budget or no budget at all.

Frameworks like Ruby on Rails, Sinatra, CakePHP and Grails offer developers a pathway to quickly develop useful applications without a lot of friction.  By friction I mean the things that get in a developer's way.  These can be anything from overly complex languages or configurations to other teams that control infrastructure, like deployment engineers or even DBAs.  The more things you need to configure or hoops you need to jump through, the less likely you are to deliver a useful tool within your organization.

Ideally, a developer should be able to build an application within a few days that accomplishes something useful.  Because the application is somewhat under the radar and not customer facing, you shouldn't have to spend more on additional resources like a coding pair, a build and deployment engineer or a database administrator.

I chose to use Grails to create my application after trying several frameworks.  I am primarily a Java developer, so Grails felt much easier to start using than some other frameworks.  Groovy as a language is very easy to pick up as a Java developer.  I think it is syntactically cleaner, and I am a fan of convention over configuration, which the Grails framework employs.

The Virtues of Flying Solo

Working in an agile development workplace should mean I myself follow agile development practices for all of my applications.  Well, that's true for the official projects, the ones that have real ROI, are funded by the finance department, and have a budget and a full project team.  When real money is being spent and actual paying customers will use the application, it makes sense to go the full distance in supporting agile practices like TDD, pairing, retrospectives and such.

For my little release calendar application that is meant to be used by my organization's project and iteration managers, I can do away with a lot of overhead in terms of pairing and a full project team.  I can pretty much do it all solo.  Of course I will use TDD and unit testing, but since this app won't come into contact with paying customers and generate revenue, a full functional test suite won't be required.  I won't have time or budget anyway!  The idea is to generate something useful, fast. 

Setting Up a Grails Application

I won't go into extreme detail with regard to Groovy and Grails.  There are plenty of great sources online and in books.

A useful text that helped me is Manning's Grails in Action.  Also check out the official Grails site.

MongoDB Datasource

I decided to use MongoDB as my datasource.  I could have used a relational datastore like MySQL, but I liked the fact that I could store the data for my release calendar as JSON-like documents, since I was envisioning a RESTful API that would return JSON.  This simplifies the code by removing unnecessary mapping layers.
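For reference, here is a minimal sketch of what the MongoDB connection settings can look like in grails-app/conf/DataSource.groovy when using the MongoDB plugin.  The host, port and database name are assumptions for a local setup; adjust to your environment.

// Hypothetical MongoDB connection settings for the Grails MongoDB plugin (local defaults assumed)
grails {
    mongo {
        host = "localhost"
        port = 27017
        databaseName = "ReleaseCalendar"
    }
}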

Cloud Ready

Another form of friction when developing an application is where to host it.  The usual drill is you need to beg, borrow or steal some hardware to host and deploy your app.  This usually involves justifying the need with an infrastructure team and then asking the group that holds the purse strings to fund buying the hardware.  Even if you have an internal cloud to deploy to, you still need permission to carve out some VMs to use.

Luckily, a lot of companies offer free cloud resources for small VM instances.  For example, I am using Cloud Foundry, but I have also used AppFog, OpenShift, etc.  I don't know how long this situation will last, but as long as these providers offer their services, I plan to use them to speed development and deployment.

At the very least, using one of these cloud providers can allow you to showcase your great idea to the powers that be, and just maybe you'll get the funding and leadership approval necessary to host it in your organization's private cloud.

App Design

Seeding the Release Calendar

Not wanting to invest a lot of time in a fancy UI, I chose to make my application primarily a REST application.  Meaning, to initially seed my calendar, I expose a REST endpoint that accepts a POST to define a release calendar:

http://hostname/release/calendar

Request POST payload:

{
    "releaseName" : "MyRelease", 
    "releaseDesc" : "My releases are composed of 2 iterations. Each iteration contains 10 business days.",
    "startDate" : '2012-06-29T00:00:00-05:00',
    "duration" : "28",
    "iterations" : "2",
    "iterationNumber" : "213",
    "releaseFormat" :  "YY.MM"
}
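If you'd rather not use a browser tool, here is a hedged sketch of sending that payload from a plain Groovy script.  The hostname is the same placeholder used above; replace it with wherever the app is running.

import groovy.json.JsonOutput

// Hypothetical seeding request; 'hostname' is a placeholder for wherever the app is deployed
def payload = JsonOutput.toJson([
    releaseName    : 'MyRelease',
    releaseDesc    : 'My releases are composed of 2 iterations. Each iteration contains 10 business days.',
    startDate      : '2012-06-29T00:00:00-05:00',
    duration       : '28',
    iterations     : '2',
    iterationNumber: '213',
    releaseFormat  : 'YY.MM'
])

def conn = (HttpURLConnection) new URL('http://hostname/release/calendar').openConnection()
conn.requestMethod = 'POST'
conn.doOutput = true
conn.setRequestProperty('Content-Type', 'application/json')
conn.outputStream.withWriter('UTF-8') { it << payload }
println conn.inputStream.text   // expect something like {"id":"..."}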


Domain Models

As I mentioned earlier, I chose to use MongoDB as my primary datastore.  Under the Grails object relational mapping (GORM) model, each datastore plugin follows the same conventions for object instantiation and data access methods.  When you create your domain objects, Grails creates the underlying object mapping and access methods behind the scenes.

There are several options for MongoDB plugins in Grails.  I started off using Morphia but ended up using the 'official' MongoDB Grails plugin, MongoDB 1.0.0.RC5.  The main reason was that the Cloud Foundry deployment tools natively recognized the domain objects and mapped them to the underlying services on Cloud Foundry, whereas the Morphia domain objects were not recognized.
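For reference, installing the plugin amounts to roughly the following in grails-app/conf/BuildConfig.groovy.  This is a sketch: the version string is the one mentioned above, and the exact Hibernate line depends on what your generated BuildConfig contains.

// Hypothetical BuildConfig.groovy excerpt: swap the default Hibernate plugin for the MongoDB GORM plugin
grails.project.dependency.resolution = {
    // ... repositories and dependencies omitted ...
    plugins {
        // runtime ":hibernate:$grailsVersion"   // removed/commented out
        compile ":mongodb:1.0.0.RC5"             // MongoDB GORM plugin version referenced in this post
    }
}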

As with all Grails apps, you generate your domain objects from the grails command line:


create-domain-class com.gap.release.calendar.Day

create-domain-class com.gap.release.calendar.Release

generate-all com.gap.release.calendar.Day

generate-all com.gap.release.calendar.Release

Notice the create-domain-class command is being used.  Since I installed the MongoDB Grails plugin (and uninstalled Hibernate), Grails knows to use the MongoDB plugin to generate the mapping from Grails objects to the underlying MongoDB documents and to provide the access methods, leveraging the 'convention over configuration' paradigm.  I'll talk more about that later when I present the REST API implementation code.

Taking a look at the generated domain models, I just filled in the attributes I needed.  The only thing I added beyond standard data types is the BSON ObjectId type, to support MongoDB's notion of an object ID for each document.  In this example, I am referencing the owning release calendar object in each calendar day.  This is done to support multiple release calendars within a single document store.

package com.gap.release.calendar

import java.util.Date;

import org.bson.types.ObjectId

class Day {
 Date relCalDay
 String release
 Integer iterationNumber
 Integer iterationDay
 ObjectId releaseId

    static constraints = {
    }
}

Here is the release calendar domain object.  Again, nothing too spectacular here, just basic datatypes beyond the BSON ObjectId:


package com.gap.release.calendar

import java.util.Date;

import org.bson.types.ObjectId

class Release {

 ObjectId id
 String releaseName
 String releaseDesc
 Date startDate
 Integer relDurationDays
 Integer numIterations
 Integer iterationNumber
 String releaseFormat
 
    static constraints = {
    }
}

URL Mappings in Grails

One of the major reasons I like using Grails for simple apps is that implementing a REST API is pretty damn easy compared to some other established frameworks like Spring.  Sure, there are other factors to consider if you are deploying to a heavily trafficked e-commerce or social site, but for little in-house tools like my release calendar application, the trade-offs are worth it.

OK, I've got a confession about convention over configuration and Grails.  It turns out that not all configuration can be abstracted away.  But at least Grails puts it all under one roof, more or less.  Under the 'conf' location in the project navigator lives all the application configuration that may require tweaking: things like the Spring application context, datasources, and in this case URL mappings.

The URL mappings file UrlMappings.groovy contains the REST API URL mappings to controllers:


class UrlMappings {

 static mappings = {
  
  "/release/calendar"(controller: "relCalRest") {
   action = [GET: "list", POST: "save" ]
  }
  
  "/release/calendar/$releaseID"(controller: "relCalRest") {
   action = [GET: "listRel" ]
  }
  
  "/release/calendar/$releaseID/$calDate?"(controller: "relCalRest") {
   action = [GET: "listDay" ]
  }

  "/$controller/$action?/$id?"{
   constraints {
    // apply constraints here
   }
  }

  "/"(view:"/index")
  "500"(view:'/error')
 }
}

The UrlMappings class is where you define which REST action methods get implemented in your Grails controller classes.  Next we'll look at the calendar seeding controller, "relCalRest".

Calendars, Dates and Date Math

From the UrlMappings class we can see that when the URL http://hostname/release/calendar is hit with a POST, the "relCalRest" controller implements the backend logic to create, or seed, a release calendar.  In Grails terms, we are mapping the POST to the save action of the "relCalRest" controller.

Controllers: Where The Magic Happens

It's not really that magical actually.  We are passed a block of JSON, we map the JSON to our domain objects and we save them.  By saving them, we are instructing the domain objects to persist to our chosen datastore.  In this case our datastore is MongoDB.


 def save = {
  def json = request.JSON
  
  def release = new Release()
  
  release.releaseName = json.releaseName
  release.releaseDesc = json.releaseDesc
  // Only the date portion of the ISO 8601 startDate is used; the time and zone offset are dropped
  release.startDate = new SimpleDateFormat("yyyy-MM-dd").parse(json.startDate)
  release.relDurationDays = json.duration.toInteger()
  release.numIterations = json.iterations.toInteger()
  release.iterationNumber = json.iterationNumber.toInteger()
  release.releaseFormat = json.releaseFormat
  
  if (release.save()) {
   calDayGeneratorService.generateDays(release)
   render contentType: "application/json", {
    // Return the ID of the new release
    ['id' : release.id.toString()]
   }
  }
  else {
   render contentType: "application/json", {
    ['status' : 'FAILED']
   }
  }
 }
It's pretty straightforward, actually, but notice that I have used a service, "calDayGeneratorService", to do the heavy lifting of generating the release calendar days.  Let's look at the details of the "generateDays" method of "calDayGeneratorService".

The basic logic is to take the initial date passed in, figure out the next release start date from it, and fill in the iterations per release and the iteration days within each iteration.  It uses the Joda-Time plugin because it provides convenience routines for calculating periods between dates and for formatting months, days and years.
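Before the full service, here is a small Joda-Time sketch of the kind of date math it relies on (the dates and durations are illustrative only):

import org.joda.time.DateTime
import org.joda.time.Period

// Illustrative Joda-Time date math: period arithmetic plus month/year formatting
def start = new DateTime(2012, 6, 29, 0, 0)
def nextRelease = start.plus(Period.days(28))              // 28 calendar days later
assert nextRelease.toString('yyyy-MM-dd') == '2012-07-27'

// YY.MM-style name taken from a release start date
def relName = nextRelease.toString('yy') + '.' + nextRelease.toString('MM')
assert relName == '12.07'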


package com.gap.release.calendar

import org.joda.time.DateTime
import org.joda.time.Period

class CalDayGeneratorService {

    def generateDays(Release release) {
        Integer totalDuration = 0
        // The next release starts two release durations after the seed start date
        DateTime nextReleaseDate = new DateTime(release.getStartDate())
        nextReleaseDate = nextReleaseDate.plus(Period.days(release.getRelDurationDays() * 2))
        Date relCalDate = release.getStartDate()
        Integer iterationNumber = release.getIterationNumber()

        // Generate roughly one year's worth of calendar days
        while (totalDuration < 365) {
            Integer relCnt = 0
            while (relCnt < release.getRelDurationDays()) {
                // Release name in YY.MM format, derived from the next release's start date
                String relName = nextReleaseDate.toString()
                relName = relName[2..3]
                def relMonth = nextReleaseDate.monthOfYear.toString().padLeft(2, '0')
                relName = relName + '.' + relMonth

                for (int iterationCnt = 0; iterationCnt < release.getNumIterations(); iterationCnt++) {
                    int itDay = 1
                    for (int iterationDays = 0; iterationDays < (release.getRelDurationDays()) / 2; iterationDays++) {
                        def day = new Day()
                        day.iterationNumber = iterationNumber
                        day.relCalDay = relCalDate
                        day.release = relName
                        day.releaseId = release.id
                        // Weekdays advance the iteration day counter; weekend days carry the upcoming value
                        if ((relCalDate[Calendar.DAY_OF_WEEK] != Calendar.SATURDAY) &&
                            (relCalDate[Calendar.DAY_OF_WEEK] != Calendar.SUNDAY)) {
                            day.iterationDay = itDay++
                        } else {
                            day.iterationDay = itDay
                        }
                        day.save()
                        relCalDate = relCalDate + 1
                        relCnt++
                    }
                    iterationNumber++
                }
            }
            totalDuration = totalDuration + release.getRelDurationDays()
            nextReleaseDate = nextReleaseDate.plus(Period.days(release.getRelDurationDays()))
        }
    }
}

The only thing really special in generating the iteration days happens around weekends.  Since we don't (usually!) work on weekends, I added logic to recognize when a Saturday or Sunday is encountered and simply carry the upcoming iteration day number forward instead of incrementing it.  Even this is pretty easy using Groovy's Calendar.DAY_OF_WEEK subscript on a Date, along with the Calendar.SATURDAY and Calendar.SUNDAY constants.
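Here is that weekend check in isolation, as a tiny sketch (the date is just an example; it happens to be a Saturday):

// Groovy's Date subscript delegates to java.util.Calendar fields
def d = Date.parse('yyyy-MM-dd', '2012-06-30')   // a Saturday
boolean weekend = d[Calendar.DAY_OF_WEEK] in [Calendar.SATURDAY, Calendar.SUNDAY]
assert weekend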

MongoDB

When the save method is called on a domain object, it calls MongoDB to persist the objects in what is called a collection.  After calling the release calendar app to seed the release calendar days, we can take a look at what gets stored in our MongoDB collections.

barry-alexanders-MacBook-Pro:bin barryalexander$ ./mongo
MongoDB shell version: 2.0.2
connecting to: test
> show dbs
ReleaseCalendar 0.203125GB
local (empty)
test 0.203125GB
> use ReleaseCalendar
switched to db ReleaseCalendar
> show collections
day
day.next_id
release
release.next_id
system.indexes
user
user.next_id

From this sequence of commands we can see that Grails, through the MongoDB plugin, created our day and release collections, plus sequence objects to generate unique IDs for each.

We can sample the collections to look at what documents got created in each collection:

> db.release.find()
{ "_id" : ObjectId("504c1f8b036461c1a2f31829"), "iterationNumber" : 213, "numIterations" : 2, "relDurationDays" : 28, "releaseDesc" : "GID releases are composed of 2 iterations.  Each iteration contains 10 business days (14 calendar days) week days (M-F) only.", "releaseFormat" : "YY.MM", "releaseName" : "GIDRelease", "startDate" : ISODate("2012-06-29T07:00:00Z"), "version" : NumberLong(1)
}

Here is a small sampling from the day collection:


> db.day.find()
{ "_id" : NumberLong(673), "iterationDay" : 1, "iterationNumber" : 213, "relCalDay" : ISODate("2012-06-29T07:00:00Z"), "release" : "12.08", "releaseId" : ObjectId("504c1f8b036461c1a2f31829"), "version" : 0 }
{ "_id" : NumberLong(674), "iterationDay" : 2, "iterationNumber" : 213, "relCalDay" : ISODate("2012-06-30T07:00:00Z"), "release" : "12.08", "releaseId" : ObjectId("504c1f8b036461c1a2f31829"), "version" : 0 }
{ "_id" : NumberLong(675), "iterationDay" : 2, "iterationNumber" : 213, "relCalDay" : ISODate("2012-07-01T07:00:00Z"), "release" : "12.08", "releaseId" : ObjectId("504c1f8b036461c1a2f31829"), "version" : 0 }
{ "_id" : NumberLong(676), "iterationDay" : 2, "iterationNumber" : 213, "relCalDay" : ISODate("2012-07-02T07:00:00Z"), "release" : "12.08", "releaseId" : ObjectId("504c1f8b036461c1a2f31829"), "version" : 0 }
{ "_id" : NumberLong(677), "iterationDay" : 3, "iterationNumber" : 213, "relCalDay" : ISODate("2012-07-03T07:00:00Z"), "release" : "12.08", "releaseId" : ObjectId("504c1f8b036461c1a2f31829"), "version" : 0 }
{ "_id" : NumberLong(678), "iterationDay" : 4, "iterationNumber" : 213, "relCalDay" : ISODate("2012-07-04T07:00:00Z"), "release" : "12.08", "releaseId" : ObjectId("504c1f8b036461c1a2f31829"), "version" : 0 }

Using the Release Calendar API

Once the calendar is seeded with days, the GET method APIs can be used to either return all releases stored in the database or return details about a given date.


 def list = {
  List<Release> releaseList = Release.findAll()
  render releaseList as JSON
 }
 
 def listRel = {
  def release = Release.findByReleaseName(params.releaseID)
  render release.encodeAsJSON()
 }

As seen in the list method above, getting a list is as simple as calling the findAll method on the domain object and rendering it as JSON.  The listRel method uses the dynamic finder convention of findBy followed by the property to search on, in this case releaseName.  This finds the release matching the parameter passed on the URL:  /release/calendar/$releaseID
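A few more of the dynamic finders the MongoDB GORM plugin gives you for free by convention, as a sketch; the property names come from the domain classes above and the values are illustrative:

// Convention-based GORM finders on the domain classes defined earlier (illustrative values)
def allReleases = Release.list()                                        // every release document
def release     = Release.findByReleaseName('MyRelease')                // single match on releaseName
def firstDays   = Day.findAllByReleaseIdAndIterationDay(release.id, 1)  // day 1 of every iteration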

The listDay method provides the response to the GET day request of the release calendar API.  The URL contains the release name (ID) and a date.  The method looks up the given release calendar day and returns the JSON rendering of the day domain object.  The only thing of note is that I added support for JSONP to allow cross-site calling of the API.


 def listDay = {
  def release = Release.findByReleaseName(params.releaseID)
  
  Date calendarDate = new SimpleDateFormat("yyyy-MM-dd").parse(params.calDate)

  def day = Day.findByRelCalDayAndReleaseId(calendarDate,release.id)
  // gotta support jsonp too!
  if (params.callback) {
   render params.callback + '(' + day.encodeAsJSON() + ')'
  }
  else {
   render day.encodeAsJSON()
  }
 }

Deploying to the Cloud

Cloud Foundry provides a Grails plugin that allows deploying directly from SpringSource Tool Suite (STS).  To deploy to Cloud Foundry within STS, it's just a matter of issuing this command:

grails prod cf-push

Consult the Cloud Foundry site for how to open an account and get set up.  Since I am using the MongoDB Grails plugin from VMware, when the application is deployed it automatically provisions the MongoDB service.

After executing the push command, the console shows the deployment progress:


| Loading Grails 2.0.0
| Configuring classpath.
| Environment set to production.....
Building war file
| Packaging Grails application.....
| Compiling 14 GSP files for package [relCal2]..
| Compiling 8 GSP files for package [jodaTime]..
| Building WAR file.....
| Done creating WAR target/cf-temp-1352438416722.war
Application Deployed URL: 'RelCal2.cloudfoundry.com'? y
y
Would you like to bind the 'mongodb-relcal' service?[y,n] y
y
| Creating application RelCal2 at RelCal2.cloudfoundry.com with 512MB and services [mongodb-relcal]: OK
| Uploading Application:
|   Checking for available resources: OK
|   Processing resources: OK
|   Packing application: OK
|   Uploading (2K): OK
| Trying to start Application: 'RelCal2'.....
| Application 'RelCal2' started at http://relcal2.cloudfoundry.com


You can use the vmc command from Cloud Foundry to manage your deployed application.  See the vmc reference page for all commands.  Here are a few examples showing application and service status:



barry-alexanders-MacBook-Pro:RelCal2 barryalexander$ vmc login
Attempting login to [http://api.cloudfoundry.com]
Email: barry.alexander@gmail.com
Password: ********
Successfully logged into [http://api.cloudfoundry.com]

barry-alexanders-MacBook-Pro:RelCal2 barryalexander$ vmc services

============== System Services ==============

+------------+---------+---------------------------------------+
| Service    | Version | Description                           |
+------------+---------+---------------------------------------+
| mongodb    | 2.0     | MongoDB NoSQL store                   |
| mysql      | 5.1     | MySQL database service                |
| postgresql | 9.0     | PostgreSQL database service (vFabric) |
| rabbitmq   | 2.4     | RabbitMQ message queue                |
| redis      | 2.2     | Redis key-value store service         |
+------------+---------+---------------------------------------+

=========== Provisioned Services ============

+----------------+---------+
| Name           | Service |
+----------------+---------+
| mongodb-relcal | mongodb |
| mysql-c5009bd  | mysql   |
+----------------+---------+

barry-alexanders-MacBook-Pro:RelCal2 barryalexander$ vmc runtimes

+--------+-------------+-----------+
| Name   | Description | Version   |
+--------+-------------+-----------+
| java   | Java 6      | 1.6       |
| ruby19 | Ruby 1.9    | 1.9.2p180 |
| ruby18 | Ruby 1.8    | 1.8.7     |
| node08 | Node.js     | 0.8.2     |
| node06 | Node.js     | 0.6.8     |
| node   | Node.js     | 0.4.12    |
| java7  | Java 7      | 1.7       |
+--------+-------------+-----------+

barry-alexanders-MacBook-Pro:RelCal2 barryalexander$ vmc frameworks

+------------+
| Name       |
+------------+
| lift       |
| rails3     |
| sinatra    |
| spring     |
| java_web   |
| standalone |
| rack       |
| node       |
| grails     |
| play       |
+------------+


barry-alexanders-MacBook-Pro:RelCal2 barryalexander$ vmc apps

+-------------+----+---------+----------------------------------+----------------+
| Application | #  | Health  | URLS                             | Services       |
+-------------+----+---------+----------------------------------+----------------+
| RelCal      | 1  | 0%      | relcal.cloudfoundry.com          | mongodb-relcal |
| RelCal2     | 1  | RUNNING | relcal2.cloudfoundry.com         | mongodb-relcal |
| barry       | 1  | STOPPED | barry.cloudfoundry.com           |                |
| caldecott   | 1  | RUNNING | caldecott-38eef.cloudfoundry.com | mongodb-relcal |
| env-node    | 1  | RUNNING | env-node.cloudfoundry.com        | mongodb-relcal |
+-------------+----+---------+----------------------------------+----------------+



Seeding the Release Calendar with Starting Data

Using an HTTP posting tool like XHR Poster, I sent a POST request with the following JSON payload:

{
    "releaseName" : "GID Release", 
    "releaseDesc" : "GID releases are composed of 2 iterations.  Each iteration contains 10 business days (14 calendar days) week days (M-F) only.",
    "startDate" : '2012-06-29T00:00:00-05:00',
    "duration" : "28",
    "iterations" : "2",
    "iterationNumber" : "213",
    "releaseFormat" :  "YY.MM"
}

This causes the app to generate release calendar days starting from the start date, for one year's worth of days.
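With the calendar seeded, here is a quick sketch of consuming the GET API from a Groovy client.  The hostname, release name and date are placeholders; substitute the values you actually seeded.

import groovy.json.JsonSlurper

// Hypothetical lookup of a single calendar day over the REST API
def url = 'http://relcal2.cloudfoundry.com/release/calendar/MyRelease/2012-07-03'
def day = new JsonSlurper().parseText(new URL(url).text)
println "Release ${day.release}, iteration ${day.iterationNumber}, day ${day.iterationDay}"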

Verifying The Data


Verifying the MongoDB datastore on Cloud Foundry requires a tool called Caldecott.  It's a tunneling tool for accessing services:

barry-alexanders-MacBook-Pro: barryalexander$ vmc tunnel mongodb-relcal none
Getting tunnel connection info: OK

Service connection info: 
  username : ************************************
  password : ************************************
  name     : db
  url      : mongodb://c71364b2-96c4-4bfa-abd5-614f07ab4435:efbdb576-5269-4719-a301-4a9ce5c9b7d9@172.30.48.63:25138/db

Starting tunnel to mongodb-relcal on port 10000.
Open another shell to run command-line clients or
use a UI tool to connect using the displayed information.
Press Ctrl-C to exit...

In another terminal, connect to MongoDB with the mongo command using the generated username and password:

barry-alexanders-MacBook-Pro:bin barryalexander$ ./mongo -port 10000 -u *********************************** -p *********************************** db
MongoDB shell version: 2.0.2
connecting to: 127.0.0.1:10000/db
> show collections
day
day.next_id
release
system.indexes
system.users
user
user.next_id
> db.release.find()
{ "_id" : ObjectId("509c9b25b844c4506558c0d7"), "iterationNumber" : 213, "numIterations" : 2, "relDurationDays" : 28, "releaseDesc" : "GID releases are composed of 2 iterations.  Each iteration contains 10 business days (14 calendar days) week days (M-F) only.", "releaseFormat" : "YY.MM", "releaseName" : "GID Release", "startDate" : ISODate("2012-06-29T00:00:00Z"), "version" : 0 }
> db.day.find()
{ "_id" : NumberLong(393), "iterationDay" : 1, "iterationNumber" : 213, "relCalDay" : ISODate("2012-06-29T00:00:00Z"), "release" : "12.08", "releaseId" : ObjectId("509c9b25b844c4506558c0d7"), "version" : 0 }
{ "_id" : NumberLong(394), "iterationDay" : 2, "iterationNumber" : 213, "relCalDay" : ISODate("2012-06-30T00:00:00Z"), "release" : "12.08", "releaseId" : ObjectId("509c9b25b844c4506558c0d7"), "version" : 0 }
{ "_id" : NumberLong(395), "iterationDay" : 2, "iterationNumber" : 213, "relCalDay" : ISODate("2012-07-01T00:00:00Z"), "release" : "12.08", "releaseId" : ObjectId("509c9b25b844c4506558c0d7"), "version" : 0 }
...

Source Code

Source code for this blog post can be found here: https://github.com/balexander123/RelCal2.git




Monday, June 13, 2011

Database evolution ACID --> BASE --> DIRT

A good introduction to database evolution and why Node.js is most appropriate for DIRT (data-intensive real-time) applications.



Monday, May 23, 2011

MongoSF - MongoDB conference notes

The 10gen people were organized and efficient; registration was smooth and friendly.  The information packet received at registration included an updated agenda and a map of the venues.  Swag bags were available in the back.  Some pretty cool (and some useless) swag was included (a USB drive in the shape of a person called "USB People").  I did get a nifty MongoDB coffee mug and snagged some vendor T-shirts.  About a dozen vendors were present, mostly pushing their cloud PaaS services: Red Hat, VMware, dotCloud, and a bunch of smaller startups.

PaaS was a big theme beyond just MongoDB.  The real-time, event-driven web with Node.js was also a hot topic of discussion and drew popular sessions.


Monitoring & Queuing MongoDB
9:30am - 10am (30m)


This talk will contain 2 topics: 1) How to monitor your MongoDB cluster and what to look out for to prevent explosions. 2) How to build a redundant, scalable queuing system using MongoDB. At Boxed Ice we throw 3.5TB of data into MongoDB each month, which results in processing billions of documents. Fun times.

David Mytton
(Boxed Ice)

David is an entrepreneurial programmer based in the UK. He is currently working on a server performance monitoring tool, Server Density, through his startup, Boxed Ice.


My Notes:


Boxed Ice is using RabbitMQ for alerts on background processing. But RabbitMQ has no native failover. This was the primary reason they are exploring using MongoDB to provide a persistent data store for recovery.


Basically, they wanted the following:

  • Redundancy
  • Atomicity
  • Speed
  • Garbage collection


They chose MongoDB over RabbitMQ based on their experience working with Mongo versus RabbitMQ.  Not wanting to add yet another system to learn, they decided to use MongoDB exclusively.


Next the presenter talked about monitoring performance in MongoDB.


In memory is always faster than disk. MongoDB has an explain method; use it to check index usage and disk I/O -- whether operations are reading/writing from/to disk.

Regarding storage, MongoDB pre-allocates in 2GB increments.

When considering sharding, max size assumes capacity same on all nodes. Best to set to 70% of memory capacity.

Rotate your logs, don't let them get too big.

Use journaling and don't go over a 1GB max.


To determine used connections: db.serverStatus(). Always use connection pooling.

connPoolStats

indexCounters, from db.serverStatus()

Op counters, fsynch setting, config slaves to handle reads


background flushing


Dur


rs.status() (replica set status)

myStatus


Optime, last updated


heartbeat, last comm with members


An overview of the mongostat command was presented.


Concerning the value of faults: high values imply there is not enough RAM for indexes/data.


If status is 'locked', this causes queuing and signifies an index miss. Excessive queuing will cause performance issues.


Other useful shell commands:


db.currentOp()

db.kill


Boxed Ice runs a site called mongomonitor.com that offers a DB monitoring service for MongoDB instances:


http://www.serverdensity.com/mongodb-monitoring/


In summary:


  • Keep indexes in RAM, and as much data as possible.
  • Watch storage usage, both disk and RAM.
  • Monitor status.


Questions from audience/me:


RabbitMQ vs MongoDB throughput?

5000 msg/sec vs 2000 msg/sec

Roughly 2.5x slower, but not bad.


Someone from the audience asked whether the presenter was aware of something called 'Celery'.  Celery offers a way to cluster RabbitMQ instances, using MongoDB as a data store.


http://celeryproject.org/


OpenShift
10am - 10:45am
(45m)

Tobias Kunze
(Red Hat)

My Notes:

This session is probably going to be the most 'markety'.  By that I mean the most full of marketing-type speak.  I hope not.  I really want to know as many technical details as possible.

OpenShift is currently a 'developer preview', i.e., a 'beta'.  It's still early in its usage.

How OpenShift is used within the OpenShift PaaS.
Krishna from Red Hat is giving a live demo.

Slides begin: why PaaS?
Dev needs: stop dealing with the stack, focus on application development.
Operations needs: focus on the service, not deployments.

Server setup takes a long time, stealing development time.
Operations: 50% of time is spent on deployment.  Each organization has its own issues, but mostly all have 'known issues' that require manual intervention.  This eats time away from things that add value to the organization.

Speaker just said: "getting the right abstraction."  Yeah, this is a marketing speech.

Dev: OpenShift offers an "open source ecosystem"
Ops: OpenShift offers cloud management

OpenShift offers a "platform kernel" they refer to as 'fabric'.

Someone just handed me a "Shift Happens" sticker.  Ha ha.  I get it.

Distributed apps slide
Composite apps, many apps communicating through services

Rightscale, template as a service, cloud config

PaaS Types:

Middleware
Frameworks, Heroku, vmforce


Opensource, middleware+framework models

Makara, the creators of OpenShift (Makara was bought by Red Hat)

OpenShift Express product: free, git-based deploys.  Interaction: 'runtime as a service'.

OpenShift Flex: premium service that offers nodes, middleware, and frameworks as a service;
uses Mongo (vs Redis).

MongoDB is agile and scalable, failover built in

"Write now, design later". This refers to the schema-less nature of document storage.

Demo Krishna Raman

kraman/mongo1 on git

Openshift uses EC2

Standard EC2 cluster setup
create your application
java/mysql app
select the components, tomcat, mysql
downloads, installs components, mongo is available too
supports tomcat, jboss apps

Didn't really learn anything new here.  Already familiar with this stuff so far.


At 11am, I couldn't decide which session to go to!  "Storing and Querying location data" or "Schema Design at Scale".  I wish more folks from my company could have gone.  It would have been much easier to cover multiple threads.

Storing and Querying location data with MongoDB
11am - 11:45am
(45m)

Looking to store and query location data? MongoDB has you covered. Learn how to structure, and even shard your geo data, along with an unlikely use case: an infinitely large board game!

Grant Goodale
(WordSquared)




My Notes
OK, this guy is super hyperactive and extremely excited about his work.  Basically it's an infinite multi-player Scrabble-like board played online.  It uses Node.js, HTML5, and MongoDB (a single instance).

It uses geospatial indexing and calculates across 'units' (the playing squares).
MongoDB 1.9: multi-location geo support coming.

geo2d in 1.8
$geoNear
get result sets for geo lookup
use ordered hashes
returns collection ordered by sorted distance
query within region
$box query
$center query
MongoDB version 1.9 will support polygon searches
$nearSphere, $centerSphere
uses radians, not native units
position is long, lat

GeoJSON standard

67 million records in one node.
Activity is mostly at the edges; not all records are active.
Currently just using master/slave, no sharding.

massivelyfun.com

The Scrabble-like game "WordSquared" looked pretty cool:
http://massivelyfun.com/

Schema Design at Scale
11am - 11:45am
(45m)

Eliot Horowitz
(10gen)

Eliot is CTO of 10gen, the company that sponsors the open source MongoDB project. Eliot is one of the core MongoDB kernel committers. Eliot is also the co-founder and chief scientist of ShopWiki. In January 2005, he began developing the crawling and data extraction algorithm that is the core of ShopWiki's innovative technology. Eliot has quickly become one of Silicon Alley's up and coming entrepreneurs, having been selected as one of BusinessWeek's Top 25 Entrepreneurs Under Age 25 in 2006. Prior to ShopWiki, Eliot was a software developer in the R&D group at DoubleClick. Eliot received a B.S. in Computer Science from Brown University.


My Notes:


Missed this presentation because I attended the geospatial talk...


Shell Hacks
11:45am - 12:30pm
(45m)

Scott Hernandez
(10gen)

My Notes:

This talk was mainly demo, so I was mostly watching and trying to absorb the discussion.  The scripting shell is used much like Toad would be used in the Oracle world to prove out your SQL.  In this case, working with MongoDB, the scripting language is JavaScript.  Most everything that is accessible via the programming API you can do in the MongoDB JS shell.

MongoDB for Java Devs with Spring Data
1:30pm - 2pm
(30m)

The Spring Data project provides sophisticated support for NoSQL datastores. The MongoDB module consists of a namespace to easily setup MongoDB access, a template class to provide a nice API to persist and query objects as well as sophisticated support to build repositories accessing entities stored in a MongoDB. The talk will introduce the Spring Data MongoDB support and present the features in hands on demos.

Chris Richardson
(VMware)

My Notes:

Spring Data contains support for NoSQL databases.

The presenter demonstrated how Spring Data can hide much of the boilerplate and plumbing code that is required with data access technologies like JDBC and JPA.  Spring Data also has rich support for NoSQL databases, MongoDB in particular.


Cloud Foundry --> SpringSource --> VMWare

Chris Richardson was the founder of Cloud Foundry, which was acquired by SpringSource.  SpringSource was subsequently acquired by VMware.  A case of a big fish consuming a little fish, which was then consumed by an even bigger fish.

Spring Data support for MongoDB

Map from Java to Mongo documents using annotations.
You can also use relational/JPA and Mongo data sources through one access layer via MongoTemplate.  This was referred to as "cross-store persistence".

The MongoConverter interface is implemented for your domain objects to read/write between Java objects and Mongo documents.


The MongoRepository interface defines CRUD methods.

Annotations are used for Java to MongoDB document mapping:
@Id, @Indexed, @PersistenceConstructor
@GeoSpatialIndexed
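As a rough sketch (my own, not from the talk), a mapped class using those annotations might look like this in Groovy.  The imports are the standard Spring Data MongoDB packages; the class and field names are made up.

import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.index.GeoSpatialIndexed
import org.springframework.data.mongodb.core.index.Indexed
import org.springframework.data.mongodb.core.mapping.Document

// Illustrative Spring Data MongoDB mapping using the annotations noted above
@Document
class Place {
    @Id String id
    @Indexed String name
    @GeoSpatialIndexed double[] location   // [longitude, latitude]
}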

Spring Data has support for QueryDSL, a domain-specific language for database queries.  This allows a consistent way to query underlying databases.  Coupled with cross-store persistence, this is a powerful combination that would enable the introduction of MongoDB without disrupting DAL code.

The query DSL is generated from the domain model classes and produces type-safe, composable queries.


So, to sum up, cross-store persistence allows JPA/relational _and_ document (MongoDB) data stores.

The presenter then demonstrated a Grails/MongoDB sample that highlighted the ease with which the domain and data access models were blended.

2PM is another tough choice between "Scaling and Sharding" or "Geospatial Indexing..."

Scaling and Sharding
2pm - 2:45pm
(45m)

Eliot Horowitz
(10gen)



Geospatial Indexing with MongoDB
2pm - 2:45pm
(45m)

Greg Studer
(10gen)

Greg works on various aspects of the MongoDB core server. Prior to 10gen, he completed a PhD at the University of Sussex and Masters at Cornell where he studied multi-agent simulation and computational scaling through assembly. He began his career working on various projects at IBM related to enterprise configuration and modeling, and legacy systems integration.


My Notes:


Rich set of storage and query of geospatial coordinates. Spherical coordinates are the most accurate. Version 1.9 supports multiple locations within a document.

MongoDB as a data integration layer between Apps in Cloud Foundry
3pm - 3:30pm
(30m)

In this talk we will discuss a new design pattern for building applications that consist of many small apps that work together to appear as one website. Cloud Foundry is a PaaS that supports many languages and frameworks as well as many services and data stores, one of which is MongoDB, We will talk about a new pattern enabled by this architecture where you can write different pieces of your application in multiple different languages or frameworks and use a shared MongoDB instance provided by Cloud Foundry's data service layer as an integration point between all the app parts. Not only can you store data in MongoDB for display on the web pages of your app, but you can use MongoDB as a message queue or logging device between apps as well as a shared data structure container. So with this pattern you could have written your main application ui in Ruby on Rails but you may have a few web services or applets written in sinatra or node.js as well as perhaps a Spring java application doing some heavier number crunching or data processing. And use Mongodb via Cloud Foundry's Services layer as an integration point between all the pieces of this application in order to make it appear as one app to the outside world.

Ezra Zygmuntowicz
(VMware)

My Notes:

This talk was about Cloud Foundry (cloudfoundry.com).  What was cool was the concept of a "micro cloud": essentially a cloud on a VM that a developer can use to code and test cloud solutions before committing to a pre-production or production cloud.

There was a brief demo using vmc commands to build cloud instances and pushing apps to instances.

SpringSource Tool Suite (STS) has a plugin for cloud foundry.


Cloud Foundry supports all the favored languages and HTTP servers.  The presenter was touting the use of MongoDB as the common "glue", or persistent state manager, of application data between disparate/specialized concerns: fast, non-blocking HTTP with Node.js, coupled with Ruby for site metrics and your language of choice for business domain objects, services and front-end rendering.

I thought there was going to be more 'meat' provided on this concept since it was the title of the session, but most of the content was a pitch for Cloud Foundry and why you should use VMware for cloud support.

Rapid Realtime App Development with Node.JS & MongoDB
3:30pm - 4:15pm
(45m)

Jump on board to learn about combining two of the most exciting technologies to quickly build realtime apps yourself. This talk will introduce the popular Node.js library, Mongoose, which is a MongoDB "ORM" for Node.js. First, the speaker will deliver a quick primer on Node.js. Then, he'll walk you through Mongoose's schema api, powerful query builder, middleware capabilities, and exciting plugin ecosystem. Finally, he'll demonstrate some realtime capabilities using Node.js and Mongoose.

Brian Noguchi
(Shortrr.com)

Brian Noguchi is a software engineer in San Francisco. He is the founder of Shortrr, which helps you save time reading the flood of content online. Prior to that, he was the founder and lead engineer of Trendessence, which sold advanced Twitter analytics solutions for the enterprise. He is currently focused on designing and building a realtime framework on top of Node.js that he hopes will do for realtime app development what Rails and Django did for web app development. He is a core contributor to Mongoose, the popular MongoDB package for Node.js, and he is also the author of several other popular Node.js packages. You can find his open source work on github.

My Notes


This was the most crowded session by far.  It was mainly about Mongoose, a MongoDB "ORM" for Node.js.


http://blog.learnboost.com/blog/mongoose/


There was a brief introduction to Node.js.  As most in attendance were already familiar with Node.js, not a lot of time was spent here.  I suggest you look at the official web sites and blogs for details:

http://nodejs.org/


Who uses Node.js: Yammer/GitHub/Netflix/LearnBoost


Node.js = realtime, evented, non-blocking I/O


Node.js is fast even though it's JavaScript; it's server-side JS.

One advantage is that it can share code between server and browser.  Not sure I get that, though.  Aren't those very different kinds of concerns?  Can UI developers use server-based algorithms?


Node.js has an active community

lots of packages supported


recommended packages: express, jade, socketio, mongoose


express = Sinatra equivalent

jade=template engine http://jade-lang.com/

socket.io: the name implies its use http://socket.io/

mongoose = package to support MongoDB, developed by LearnBoost http://blog.learnboost.com/blog/mongoose/


Mongoose gives you casting, validation, and CRUD methods.

mongoose.Schema


CRUD

C: .save

R: .find, .findall, .findone

U: .find and .save

D: .remove


schema types

string

number, increment, decrement both atomic

objectid

arrays js array syntax, push, pop

comments[CommentSchema] embedded documents!

defaults

validations required:true, enums

custom validations by passing closures/functions like validate Title, 'myTitleValidate')


indexes

index:true or unique:true


nested documents


virtuals


schema.virtual('name.full')


return first+last


advanced querying


mongoose where.[].where[]


namedScopes


scope.name.query


dynamic named scopes defines functions which accept parameters



middleware


pre/post event methods are passed a callback to execute before a save and after a save

need to call next() to maintain atomicity



schema.connectSet for replica sets


.explain is supported


Question from audience: geospatial supported? use chain builder


plugins

mongoose-types (email, for email type validations)

mongoose-auth offers Facebook, Twitter, GitHub and other auth/login capabilities

mongoose-solr is in the works


schema.plugin(aplugin, attribute hash..


Mongoose code is available on GitHub under the LearnBoost organization.


See the mongoose blog for more details: http://blog.learnboost.com/blog/mongoose/


How do you unit test?  Use the expresso package: http://visionmedia.github.com/expresso/


Application Design with MongoDB: 4 Examples

4:30pm - 5:15pm
(45m)

Kyle Banker
(10gen)

Kyle maintains the MongoDB Ruby Driver and supports the Ruby developer community. Previously, Kyle built e-commerce and social networking applications, and he once worked as teacher of languages and literature. Kyle has presented MongoDB in numerous forums, and he's the author of the forthcoming book MongoDB in Action.


last talk of the day...


Quite honestly, this presenter was over-animated and went through his presentation deck at lightning speed.  So I didn't really glean a lot from this session.  Whatever notes are here are of little value, I fear.

The first sample problem was how to represent a category hierarchy in a document database.  The presenter explained the problem and how parent-child relationships needed to be defined and modified.  Then he presented some methods for querying for ancestors and descendants, and how to keep hierarchies up to date.  He showed some 'tricks' like using:


the $ positional operator, for updates within the db


The next sample concerned analytics and his suggestions to pre-aggregate your data and keep ephemeral data separate.  Age it out in a separate store.

Collections: days, months
Scope based on day and month durations
Use a composite index, which in this example is uri + date


Next was how to store binary data.
BSON has a bindata type.
Also, GridFS is driver-level support for large data files; it chunks files into 256KB chunks.
These can be sharded as binary data too.


Last but not least was the transactional model.  The argument was: are transactions actually ever required?  Since MongoDB does not support them, how do you handle situations where you might need them?  The answer is to use a compensation model.