CouchDB Working Notes

I've decided to learn CouchDB, and I'm taking notes as I go.

By Ryan McGreal. 1577 words. Approximately a 5 to 10 minute read.
Posted April 09, 2010 in Blog.

Contents

1 Summary
2 Install CouchDB
3 RESTful API
4 JSON
5 Create a CouchDB Database
6 Add a Document
7 View a Document
8 Add Another Document
9 Update a Document
10 Data Views
11 Misc.
12 Utils Web Interface

Note: this article is very much a work in progress. It functions mainly as a place where I can document what I learn about CouchDB as I play around with it and get a progressively better sense of what it does and how it works.

1 Summary

CouchDB is a non-relational database server built in Erlang that stores its data as JSON and communicates RESTfully over HTTP.

2 Install CouchDB

First, you need to install CouchDB. If you're using a civilized operating system with a package manager, this won't be difficult.

ryan@home ~$ sudo apt-get install couchdb

The package manager will install CouchDB and all its dependencies. The installation script will finish with this hopeful line:

 * Starting database server couchdb

That's it. Couch is up and running.

3 RESTful API

The first thing to know about CouchDB is that you connect to it over HTTP, the very same protocol that browsers use to connect to web servers, using the standard HTTP "verbs" like GET, PUT, POST and DELETE.

This is important to understand: you don't need database drivers or clients or anything else to interact with CouchDB. You just need to be able to send an HTTP request to the CouchDB server and receive the response.

You can prove this by opening your browser and navigating to http://localhost:5984. It should return something like this:

{"couchdb": "Welcome", "version": "0.10.0"}

So far, so good. But what are you looking at?

4 JSON

If you've built any Ajax web apps in the past few years, the output from the CouchDB web server should be familiar to you. It's JSON, or JavaScript Object Notation, the lightweight data format introduced by Douglas Crockford.

(BTW if you write JavaScript but haven't read Crockford's book JavaScript: The Good Parts, I highly recommend it. Chances are you're Doing It Wrong today.)

JSON is based on JavaScript object literal notation, a method of creating objects in JavaScript by literally describing their properties and methods:

var myObject = {
    name: "Ryan McGreal",
    age: 36,
    emails: ["ryan@quandyfactory.com", "editor@raisethehammer.org"],
    doSomething: function() {
        alert("This is a useful method!");
    }
};

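JSON supports arrays and dictionaries (which it calls objects) with nesting, and a small set of data types: number, string, boolean, array, object and null. Unlike a JavaScript object literal, it can't contain functions, and its keys must be double-quoted strings. JSON itself has no notion of a schema, though the separate JSON Schema specification can describe one.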

A major advantage over XML is that the syntax is much simpler and less verbose, which makes it lighter across networks as well as more human-readable.

While it's possible to evaluate JSON as JavaScript using eval(), it's not recommended (for what I hope are obvious reasons). It's much better to use a JSON parser.

The good news is that there are already JSON parsers for a wide range of programming languages in addition to JavaScript. Python, for example, includes the json module as part of the standard library (as of version 2.6). There are also libraries for PHP, Ruby, Perl, Java, Scala, Erlang, Common Lisp, and Clojure, among others.
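To make that concrete, here's a small sketch of parsing JSON with Python's json module instead of evaluating it. It's written for Python 3, and the "active" and "nickname" keys are made up purely to show the boolean and null types.

```python
import json

# A JSON text as it might arrive from a server. The "active" and "nickname"
# keys are invented for this example.
raw = (
    '{"name": "Ryan McGreal", "age": 36, '
    '"emails": ["ryan@quandyfactory.com", "editor@raisethehammer.org"], '
    '"active": true, "nickname": null}'
)

# json.loads parses the text as data; nothing in it is executed as code.
person = json.loads(raw)

# JSON types map onto Python types: object -> dict, array -> list,
# string -> str, number -> int or float, true/false -> bool, null -> None.
print(type(person).__name__)   # dict
print(person["emails"][0])     # ryan@quandyfactory.com
print(person["nickname"])      # None

# json.dumps goes the other way, turning Python data back into JSON text.
round_trip = json.loads(json.dumps(person))
```

A string that isn't valid JSON makes json.loads raise an error rather than run anything, which is exactly the behaviour you don't get from eval().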

5 Create a CouchDB Database

Given that you connect to CouchDB over HTTP, it should not surprise you to learn that you execute actions against CouchDB using the HTTP request method 'verbs': GET, PUT, POST and DELETE.

As a result, any programming language that can communicate over HTTP can connect to a CouchDB server.
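As a rough illustration, here's a minimal Python 3 sketch that talks to CouchDB using nothing but the standard library. The helper names (get_json, describe_server) are my own, and it assumes the server is listening on the default http://localhost:5984.

```python
import json
import urllib.request

# Assumption: CouchDB is running locally on its default port.
COUCH_URL = "http://localhost:5984"


def get_json(url):
    """Send a GET request and decode the JSON body of the response."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def describe_server(info):
    """Turn the server's welcome document into a short readable string."""
    return "CouchDB version %s" % info["version"]
```

With a server running, describe_server(get_json(COUCH_URL)) fetches the welcome document and reports its version. Nothing is sent over the network until get_json is called.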

Let's create a new database. For simplicity's sake I'll connect to the CouchDB server on the command line using curl. The -X command line argument specifies which HTTP request method to use.

ryan@home ~$ curl -X PUT http://localhost:5984/mydb

CouchDB returns the following response (in JSON format, of course):

{"ok": true}

You now have a CouchDB database called mydb. Prove it by running a GET request against the URL for your database:

ryan@home ~$ curl -X GET http://localhost:5984/mydb

Check it out: your database has a URL!

CouchDB will return a summary of the database properties (I've added whitespace to the CouchDB JSON responses to aid readability):

{
    "db_name": "mydb",
    "doc_count": 0,
    "doc_del_count": 0,
    "update_seq": 0,
    "purge_seq": 0,
    "compact_running": false,
    "disk_size": 79,
    "instance_start_time": "1276950495049303",
    "disk_format_version": 4
}

Notice the doc_count property of 0. We should fix that by adding a document.
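For comparison, the same create-then-inspect steps might look like this in Python 3. It's only a sketch: the function names are mine, and the base URL is the assumed default.

```python
import json
import urllib.request

COUCH_URL = "http://localhost:5984"  # assumed default address


def make_create_db_request(db_name, base_url=COUCH_URL):
    """Build (but do not send) the PUT request that creates a database."""
    return urllib.request.Request("%s/%s" % (base_url, db_name), method="PUT")


def send(request):
    """Send a request and decode the JSON body of the response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def is_empty(summary):
    """Report whether a database summary shows no documents yet."""
    return summary["doc_count"] == 0
```

Calling send(make_create_db_request("mydb")) plays the role of the curl PUT above, and is_empty(send(urllib.request.Request(COUCH_URL + "/mydb"))) reads the doc_count from the summary. Note that urlopen raises an HTTPError for any non-2xx response, so a failed request won't return quietly.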

6 Add a Document

You add documents to a database using the HTTP POST method (the same method HTML forms use to send form data to the server).

ryan@home ~$ curl -X POST http://localhost:5984/mydb \
--header 'Content-Type: application/json' \
--data '{"title": "My First CouchDB Document", "author": "Ryan McGreal",
"date_posted": "2010-04-09 11:38:26", "section": "blog",
"content": "This is my first-ever CouchDB document. Niiice." }'

CouchDB returns a response like the following:

{
    "ok": true, 
    "id": "bf58234234aed2389dcb23423",
    "rev": "1-4ac239847fea987bd2340"
}

Let's take a look at what just happened.

We executed an HTTP POST request with a Content-Type header of "application/json" (since we're sending the data in JSON format) and a JSON object for the posted data. The JSON object includes the following keys: title, author, date_posted, section and content.

CouchDB accepted the JSON object, parsed it to ensure that it's valid JSON, and then created a document in the mydb database with the content you provided.

It also assigned the document an ID and a revision ID, which it included in the response so that you can access the document later.
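Here's a sketch of the same POST from Python 3, in case it helps to see the pieces spelled out. The function names and the DB_URL constant are my own; the request itself mirrors the curl command above.

```python
import json
import urllib.request

DB_URL = "http://localhost:5984/mydb"  # assumed: the database created earlier


def make_add_document_request(document, db_url=DB_URL):
    """Build (but do not send) a POST request that adds a document."""
    body = json.dumps(document).encode("utf-8")  # serialize the dict to JSON bytes
    return urllib.request.Request(
        db_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def add_document(document, db_url=DB_URL):
    """Send the document and return CouchDB's JSON reply as a dict."""
    request = make_add_document_request(document, db_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

The reply from add_document should carry the same "id" and "rev" keys shown in the response above, so you can hang on to them for later requests.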

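The revision ID is useful in its own right: CouchDB uses it to detect conflicting updates, so a stale copy of a document can't silently overwrite a newer one. (Don't treat it as full version control, though: old revisions can be discarded when the database is compacted.) CouchDB also serves revision IDs as HTTP ETags, so you can take advantage of caching.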
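One way to use the ETag is a conditional GET: send the revision you already have in an If-None-Match header and let the server say whether anything changed. The sketch below is Python 3 with function names of my own, and it assumes CouchDB uses the revision ID wrapped in double quotes as the ETag value.

```python
import urllib.error
import urllib.request


def make_conditional_get(doc_url, rev):
    """Build a GET request that asks for the body only if the revision changed."""
    # An ETag is a quoted string, so the revision ID goes inside double quotes.
    return urllib.request.Request(doc_url, headers={"If-None-Match": '"%s"' % rev})


def has_changed(doc_url, rev):
    """Return True if the server holds a revision other than `rev`."""
    try:
        with urllib.request.urlopen(make_conditional_get(doc_url, rev)):
            return True   # 2xx: the server sent a full document back
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return False  # 304 Not Modified: the cached copy is still current
        raise             # any other status is a real error
```

urllib reports a 304 as an HTTPError, which is why the "not modified" case is handled in the except branch.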

7 View a Document

Let's take a look at the document we just created.

ryan@home ~$ curl -X GET http://localhost:5984/mydb/bf58234234aed2389dcb23423

Your document has a URL, too!

CouchDB returns the following response:

{
    "_id": "bf58234234aed2389dcb23423",
    "_rev": "1-4ac239847fea987bd2340",
    "title": "My First CouchDB Document",
    "author": "Ryan McGreal",
    "date_posted": "2010-04-09 11:38:26",
    "section": "blog",
    "content": "This is my first-ever CouchDB document. Niiice."
}

There it is. You can access its properties and values just like you would a recordset.

The difference is that the CouchDB data is formatted in JSON and can be nested - i.e. an object's property could be an array or another object with its own keys and values - rather than forced into a flat table.

8 Add Another Document

Let's try adding a document with nested data:

ryan@home ~$ curl -X POST http://localhost:5984/mydb \
--header 'Content-Type: application/json' \
--data '{"title": "Some Handy Links", "author": "Ryan McGreal",
"links": ["http://quandyfactory.com", "http://raisethehammer.org",
"http://hamilton350.com"]}'

CouchDB responds:

{
    "ok": true, 
    "id": "e6ed2389dcb26100caeg3423",
    "rev": "1-7c21a0fd39fea987bd2340"
}

Now we have two documents in our database with different sets of fields - you're not forced to map your data into a standardized set of properties.

If we tried to do this in a traditional RDBMS, our documents table would need a links column holding NULL for the first document, as well as a content column holding NULL for the second.

With CouchDB, you can give your documents arbitrary fields and values depending on what's applicable for each document.
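The flip side is that code reading these documents can't assume every field is present. Here's a small Python sketch of one way to cope, using the two documents above as plain dicts (date_posted left out for brevity); the summarize function and its placeholder values are my own invention.

```python
first = {
    "title": "My First CouchDB Document",
    "author": "Ryan McGreal",
    "section": "blog",
    "content": "This is my first-ever CouchDB document. Niiice.",
}
second = {
    "title": "Some Handy Links",
    "author": "Ryan McGreal",
    "links": ["http://quandyfactory.com", "http://raisethehammer.org",
              "http://hamilton350.com"],
}


def summarize(doc):
    """Describe a document without assuming any optional field exists."""
    links = doc.get("links", [])          # missing key -> empty list
    section = doc.get("section", "none")  # missing key -> placeholder text
    return "%s (section: %s, links: %d)" % (doc["title"], section, len(links))


for doc in (first, second):
    print(summarize(doc))
# My First CouchDB Document (section: blog, links: 0)
# Some Handy Links (section: none, links: 3)
```

Using dict.get with a default keeps the missing-field handling in one place instead of scattering checks through the code.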

9 Update a Document

[Coming Soon]

10 Data Views

[Coming Soon]

11 Misc.

Let's create another database.

ryan@home ~$ curl -X PUT http://localhost:5984/mydb

Wait a minute; didn't we just create a database called mydb?

CouchDB returns this response:

{
    "error": "file_exists", 
    "reason": "The database could not be created, the file already exists."
}

Whew, dodged a bullet there. CouchDB doesn't let you accidentally overwrite an existing database.
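In a script you'd probably want to detect this case rather than read it off the screen. Here's a hedged Python 3 sketch: the function names are mine, and it relies on CouchDB sending the error as a JSON body alongside a non-2xx status (412 Precondition Failed, as far as I can tell), which urllib surfaces as an HTTPError.

```python
import json
import urllib.error
import urllib.request


def create_database(db_url):
    """Try to create a database; return CouchDB's JSON reply either way."""
    request = urllib.request.Request(db_url, method="PUT")
    try:
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        # The error response still has a JSON body describing what went wrong.
        return json.loads(error.read().decode("utf-8"))


def already_exists(result):
    """Check whether a reply from create_database reports an existing database."""
    return result.get("error") == "file_exists"
```

So already_exists(create_database("http://localhost:5984/mydb")) should be True the second time around, and False the first.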

At any time you can get an array of all the databases on your server with the following GET request:

ryan@home ~$ curl -X GET http://localhost:5984/_all_dbs
["mydb"]

There's plenty more, but this should be enough to get you started thinking about how you might use CouchDB in projects whose data models don't necessarily lend themselves to strict relational structures.

12 Utils Web Interface

One last thing: CouchDB also provides a web interface with some handy utilities for navigating around the server and its databases.

Just load http://localhost:5984/_utils in your browser to see it.