
Erlang PubNub Client and Chat

I was thoroughly impressed with PubNub, a publish/subscribe service, when I first read their articles and played around with it some in JavaScript. But obviously I need an Erlang API if I'm going to really use it! So I've created ePubNub.

In the ePubNub README you'll find information on some basic usage of the application. You don't have to do anything more than use the epubnub.erl module to publish and subscribe (by either providing a PID to send messages to or a function handler to process each).
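
As a rough sketch of that basic usage (the channel name and message here are made up, and I'm using the supervised subscribe from epubnub_sup that shows up again below):

EPN = epubnub:new(),                                    %% demo keys on pubsub.pubnub.com
{ok, PID} = epubnub_sup:subscribe(EPN, "hello_world", self()),
%% messages published to the channel now arrive in this process's
%% mailbox as {message, Msg} tuples
epubnub:publish(EPN, "hello_world", <<"hi from Erlang">>),
epubnub:unsubscribe(PID).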

Here I've built a little more complicated app/release called epubnub_chat, and the source is also on github.

The first thing we need is the epubnub app as a dependency in the _epubnub_chat.app_ file:

   {applications, [kernel, stdlib, epubnub]},

We'll use a _simple_one_for_one_ supervisor for the channel-subscribed processes. In _epubnub_chat_sup_ we have 3 API functions for the user to use (_start_link_ is run by the _epubnub_chat_app.erl_ module on startup): connect/1, connect/2 and disconnect/1:

connect(Channel) ->  
    supervisor:start_child(?SERVER, [Channel]).  

connect(EPN, Channel) ->  
    supervisor:start_child(?SERVER, [EPN, Channel]).  

disconnect(PID) ->  
    epubnub_chat:stop(PID).
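
The supervisor's init/1 isn't shown in the post; for a simple_one_for_one it would look roughly like the following (the restart and shutdown values here are my assumptions, not necessarily what the real module uses):

%% Assumed init/1 for epubnub_chat_sup. connect/1 and connect/2 above append
%% their arguments to this child spec's (empty) argument list, so the
%% supervisor ends up calling epubnub_chat:start_link/1 or /2.
init([]) ->
    ChildSpec = {epubnub_chat, {epubnub_chat, start_link, []},
                 temporary, 2000, worker, [epubnub_chat]},
    {ok, {{simple_one_for_one, 0, 1}, [ChildSpec]}}.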

EPN is a record containing the URL and keys needed for talking to the PubNub service and is created with the _new_ functions in epubnub:

-spec new() -> record(epn).  
new() ->  
    #epn{}.  

-spec new(string()) -> record(epn).  
new(Origin) ->  
    #epn{origin=Origin}.  

-spec new(string(), string()) -> record(epn).  
new(PubKey, SubKey) ->  
    #epn{pubkey=PubKey, subkey=SubKey}.  

-spec new(string(), string(), string()) -> record(epn).  
new(Origin, PubKey, SubKey) ->  
    #epn{origin=Origin, pubkey=PubKey, subkey=SubKey}.

If you use connect/1, the chat gen_server falls back to epubnub:new(), which uses the defaults of pubsub.pubnub.com, demo and demo for the URL, publish key and subscribe key respectively.
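
The #epn{} record itself is defined in the ePubNub sources; based on the defaults just described it looks something like this (an approximation, not copied from the real code):

%% Approximate record definition; see ePubNub for the real one.
-record(epn, {origin = "pubsub.pubnub.com",
              pubkey = "demo",
              subkey = "demo"}).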

Now in the _epubnub_chat_ gen_server we need the following API functions:

start_link(Channel) ->  
    start_link(epubnub:new(), Channel).  

start_link(EPN, Channel) ->  
    gen_server:start_link(?MODULE, [EPN, Channel], []).  

message(Server, Msg) ->  
    gen_server:cast(Server, {message, Msg}).  

stop(Server) ->  
    gen_server:cast(Server, stop).

The _start_link_ functions are run when the supervisor spawns a new _simple_one_for_one_ child process. This returns _{ok, PID}_. This PID must be remembered so we can talk to the process we have started that's subscribed to a specific channel. We pass this PID to the message/2 and stop/1 functions, which we'll see at the end when we use the program.

_start_link/1_ and _/2_ call init/1 with the provided arguments:

init([EPN, Channel]) ->  
    {ok, PID} = epubnub_sup:subscribe(EPN, Channel, self()),  
    {ok, #state{epn=EPN, pid=PID, channel=Channel}}.

Here we use the _epubnub_sup_ subscribe/3 function and not epubnub:subscribe/3 because we want the subscribe process to be supervised. We store its PID so we can terminate it later.
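
The state record isn't shown in the post; based on the fields used in init/1 and handle_cast/2 it is presumably just:

%% Assumed definition, inferred from the fields referenced in this module.
-record(state, {epn, pid, channel}).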

The epubnub subscribe process was given the PID, returned by self/0, of the current process, which is the gen_server process, and it will send messages that are published to the channel to that process. We handle these messages in _handle_info/2_:

handle_info({message, Message}, State) ->  
    io:format("~p~n", [Message]),  
    {noreply, State}.

Lastly, we have to handle the messages from message/2 and stop/1:

handle_cast({message, Msg}, State=#state{epn=EPN, channel=Channel}) ->  
    epubnub:publish(EPN, Channel, Msg),  
    {noreply, State}.  
  
terminate(_Reason, #state{pid=PID}) ->  
    epubnub:unsubscribe(PID),  
    ok.

The _handle_cast/2_ function publishes your message to the channel this process is subscribed to with epubnub:publish/3, and terminate calls epubnub:unsubscribe/1 before this process ends, which sends a terminate message to the subscribed process.
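
The clause that handles the stop cast isn't shown above; presumably it is just the following, which triggers a normal shutdown and therefore terminate/2:

%% Presumed clause (not shown in the post) for the stop cast.
handle_cast(stop, State) ->
    {stop, normal, State}.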

Now let's see this program in action:

[tristan@marx ~/Devel/epubnub_chat]  
09:55 (master)$ sinan dist  
starting: depends  
starting: build  
starting: release  
starting: dist  
[tristan@marx ~/Devel/epubnub_chat]  
09:55 (master)$ sudo faxien ir  
Password:  
Do you want to install the release: /Users/tristan/Devel/epubnub_chat/_build/development/tar/epubnub_chat-0.0.1.tar.gz  
Enter (y)es, (n)o, or yes to (a)ll? > ? y  
Replacing existing executable file at: /usr/local/lib/erlang/bin/epubnub_chat  
Replacing existing executable file at: /usr/local/lib/erlang/bin/5.8.2/epubnub_chat  
Replacing existing executable file at: /usr/local/lib/erlang/bin/erlware_release_start_helper  
Replacing existing executable file at: /usr/local/lib/erlang/bin/5.8.2/erlware_release_start_helper  
ok  
[tristan@marx ~/Devel/epubnub_chat]  
09:56 (master)$ epubnub_chat  
....  
Eshell V5.8.2  (abort with ^G)  
1> {ok, Server} = epubnub_chat_sup:connect("chat").  
{ok,}  
2>  
=PROGRESS REPORT==== 10-Apr-2011::09:57:14 ===  
          supervisor: {local,inet_gethost_native_sup}  
             started: [{pid,},{mfa,{inet_gethost_native,init,[[]]}}]  

=PROGRESS REPORT==== 10-Apr-2011::09:57:14 ===  
          supervisor: {local,kernel_safe_sup}  
             started: [{pid,},  
                       {name,inet_gethost_native_sup},  
                       {mfargs,{inet_gethost_native,start_link,[]}},  
                       {restart_type,temporary},  
                       {shutdown,1000},  
                       {child_type,worker}]  

2> epubnub_chat:message(Server, <<"hello there!">>).  
ok  
3> <<"hello there!">>  
<<"I'm from the webapp!">>  

3> q().  
ok

You can go to the PubNub tutorial page to chat with yourself, or get someone else to join!

That's it! Simple and quick global communication that scales for you!

I've really enjoyed playing with PubNub and hope I get to use it for a real project soon.

eCloudEdit Part 2: CouchDB

In my last post I showed the Webmachine backend to James Yu's CloudEdit app in Backbone.js. What was left out was where the documents are stored. Here I'll show how this is done with CouchDB. And you can give the app a try at http://erlware.org:8080

First, a new Erlang app is needed that we'll call ece_db.

  
ece_db/  
├── doc  
├── ebin  
│   ├── ece_db.app  
│   └── overview.edoc  
├── include  
└── src  
    ├── ece_db.erl  
    ├── ece_db_app.erl  
    └── ece_db_sup.erl

Three modules are implemented: one that starts the app by calling the supervisor's start function, the supervisor itself that sets up a _simple_one_for_one_, and the _gen_server_ that handles the frontend's requests for creating and modifying documents in CouchDB. We'll ignore the _ece_db_app.erl_ module for this post and start with _ece_db_sup.erl_.

Requests coming from the Webmachine resource module shouldn't have to wait on one another; they should be handled in parallel. One option is to not create a process for the database backend at all and instead have all of the functions in the database interface run in the process of the Webmachine resource. However, there are two reasons not to do this. First, several pieces of data must be read out of a configuration file and used to set up what is needed to talk to CouchDB, and we do not want to be doing that for every request! Second, a supervised _gen_server_ gives us the possibility of retrying requests with no extra code, or of crashing without retrying but without taking down the resource process that is handling the user's HTTP request. It is a lot nicer and easier to just let things fail when we can!
  
  {ece_db, [{server, "HOSTNAME"},  
            {port, 80},  
            {database, "ecloudedit"}]}
  
-spec init(list()) -> {ok, {SupFlags::any(), [ChildSpec::any()]}} |  
                       ignore | {error, Reason::any()}.  
init([]) ->  
    RestartStrategy = simple_one_for_one,  
    MaxRestarts = 0,  
    MaxSecondsBetweenRestarts = 1,  

    SupFlags = {RestartStrategy, MaxRestarts, MaxSecondsBetweenRestarts},  

    Restart = temporary,  
    Shutdown = 2000,  
    Type = worker,  

    {ok, Server} = application:get_env(server),  
    {ok, Port} = application:get_env(port),  
    {ok, DB} = application:get_env(database),  

    AChild = {ece_db, {ece_db, start_link, [Server, Port, DB]},  
              Restart, Shutdown, Type, [ece_db]},  

    {ok, {SupFlags, [AChild]}}.

First, we have the config entries in config/sys.config that will be read in by _application:get_env_. Here we set the server URL, port number and name of the database to use. CouchDB can easily be run locally, but for my running copy I use a database hosted by the great service [Cloudant](http://www.cloudant.com).

Next is the _init_ function for _ece_db_sup.erl_. A key thing to note for a _simple_one_for_one_ is that NO process is started after the supervisor's init function returns, as happens with other types of supervisor children. Instead we must explicitly call _start_child_. This is how we are able to create a supervised _gen_server_ process for every HTTP request. Below is the code in _ece_db_sup_ for starting and stopping the process:
  
start_child() ->  
    supervisor:start_child(?SERVER, []).  

terminate_child(PID) ->  
    ece_db:terminate(PID).

In the last post I showed the functions that start and stop the _ece_db_ process. Here they are again:
  
init([]) ->  
    {ok, PID} = ece_db_sup:start_child(),  
    {ok, #ctx{db=PID}}.
  
finish_request(ReqData, Ctx) ->  
    ece_db_sup:terminate_child(Ctx#ctx.db),  
    {true, ReqData, Ctx}.

On each request Webmachine calls the _init_ function for the resource that is matched in the dispatch table. In that init function we call _start_child_ in _ece_db_sup_, which returns _{ok, PID}_. The PID is the process id we'll be sending messages to. Now in _ece_db_ we implement the API functions needed to start the process and to interact with it.
  
start_link(Server, Port, DB) ->  
    gen_server:start_link(?MODULE, [Server, Port, DB], []).  

all(PID) ->  
    gen_server:call(PID, all).  

find(PID, ID) ->  
    gen_server:call(PID, {find, ID}).  

create(PID, Doc) ->  
    gen_server:call(PID, {create, Doc}).  

update(PID, ID, JsonDoc) ->  
    gen_server:call(PID, {update, ID, JsonDoc}).  

terminate(PID) ->  
    gen_server:call(PID, terminate).  
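
As a quick illustration, a hypothetical call sequence from another process (the document id here is made up):

{ok, PID} = ece_db_sup:start_child(),
AllDocs = ece_db:all(PID),              %% JSON for all documents, newest first
Doc = ece_db:find(PID, "some-doc-id"),  %% JSON for a single document
ece_db_sup:terminate_child(PID).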

Each function, besides the one for starting the process, takes a PID. This PID is the process to which _gen_server:call_ sends its message. To handle these messages we have the _gen_server_ callbacks.

  
%%%===================================================================  
%%% gen_server callbacks  
%%%===================================================================  
  
%% @private  
init([Server, Port, DB]) ->  
    CouchServer = couchbeam:server_connection(Server, Port, "", []),  
    {ok, CouchDB} = couchbeam:open_db(CouchServer, DB),  
    {ok, #state{db=CouchDB}}.  

%% @private  
handle_call(all, _From, #state{db=DB}=State) ->  
    Docs = get_docs(DB, [{descending, true}]),  
    {reply, mochijson2:encode(Docs), State};  
handle_call({find, ID}, _From, #state{db=DB}=State) ->  
    [Doc] = get_docs(DB, [{key, list_to_binary(ID)}]),  
    {reply, mochijson2:encode(Doc), State};  
handle_call({create, Doc}, _From, #state{db=DB}=State) ->  
    {ok, Doc1} = couchbeam:save_doc(DB, Doc),  
    {NewDoc} = couchbeam_doc:set_value(<<"id">>, couchbeam_doc:get_id(Doc1), Doc1),  
    {reply, mochijson2:encode(NewDoc), State};  
handle_call({update, ID, NewDoc}, _From, #state{db=DB}=State) ->  
    IDBinary = list_to_binary(ID),  
    {ok, Doc} = couchbeam:open_doc(DB, IDBinary),  
    NewDoc2 = couchbeam_doc:set_value(<<"_id">>, IDBinary, {NewDoc}),  
    NewDoc3 = couchbeam_doc:set_value(<<"_rev">>, couchbeam_doc:get_rev(Doc), NewDoc2),  
    {ok, {Doc1}} = couchbeam:save_doc(DB, NewDoc3),  
    {reply, mochijson2:encode(Doc1), State};  
handle_call(terminate, _From, State) ->  
    {stop, normal, State}.

init takes the server URL, the port number and the name of the database to store the documents in, and creates a CouchBeam database record. To handle the other messages we have the _handle_call_ clauses that match on the different tuples sent with _gen_server:call_ for the different function calls.

all uses the internal function _get_docs_ with the option to return the documents in descending order (so the newest is first) and returns them after encoding to JSON. find uses the same function but only wants a certain document, which in the case of CouchDB is specified with a key value.

create first saves the new document, which returns the document as it is actually stored in CouchDB; CouchDB adds the _id and _rev key/value pairs it generates. Since our frontend expects the id to be the value of a key named id and not _id, we use couchbeam_doc:set_value to add a new key id with the value of the document's id retrieved with couchbeam_doc:get_id. For update we first open the document corresponding to the id passed in as an argument in order to get its _rev value. _rev needs to be set in the document we send to _save_doc_ so CouchDB knows this is a valid update based on the previous version of the document. Thus we set the new document's _id and _rev values and send it to _save_doc_.

Lastly, we have the _get_docs_ function. Since the frontend expects the key id to exist and CouchDB stores the id under the key _id, we use a simple view to return the documents with id set to the value of the document's _id. Views are defined with map/reduce functions that CouchDB runs across the database's documents and builds an index from. These indexes are then queried by key to find documents. Access is therefore very quick: the indexes are built as B-trees, and the map/reduce does not have to be run over the documents for every request. Additionally, when a new document is created the view's map/reduce functions only need to run over the new documents to update the B-tree. Updates to the view's indexes either happen on each use of the view or can be triggered manually.

  
function (doc)  
{  
  emit(doc._id, {id : doc._id, title : doc.title, created_at : doc.created_at, body : doc.body});  
}
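
This map function lives in a design document whose name matches the {"all", "find"} tuple passed to couchbeam:view below: a _design/all document with a find view. Installing it from Erlang could be sketched roughly like this (a hypothetical one-off snippet, not code from the app):

%% Hypothetical helper to install the view with couchbeam; the EJSON
%% structure mirrors the {proplist} tuples used elsewhere in this post.
install_view(DB) ->
    Map = <<"function (doc) { emit(doc._id, {id : doc._id, title : doc.title, "
            "created_at : doc.created_at, body : doc.body}); }">>,
    DesignDoc = {[{<<"_id">>, <<"_design/all">>},
                  {<<"views">>, {[{<<"find">>, {[{<<"map">>, Map}]}}]}}]},
    couchbeam:save_doc(DB, DesignDoc).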
  
get_docs(DB, Options) ->  
    {ok, AllDocs} = couchbeam:view(DB, {"all", "find"}, Options),  
    {ok, Results} = couchbeam_view:fetch(AllDocs),  

    {[{<<"total_rows">>, _Total},  
      {<<"offset">>, _Offset},  
      {<<"rows">>, Rows}]} = Results,  

    lists:map(fun({Row}) ->  
                      {<<"value">>, {Value}} = lists:keyfind(<<"value">>, 1, Row),  
                      Value  
              end, Rows).  

In _get_docs_ the couchbeam:view function is used to construct a view request to be sent to the CouchDB server. _couchbeam_view:fetch_ sends the view request and returns the results as Erlang terms. After this we extract the rows from the results and use lists:map to pull out just the value of each row to be returned.

That's it! There are places that could use improvement, one being the use of lists:map over every returned row to construct documents the frontend can deal with. Additionally, the need to duplicate _id as id for the frontend's use could be removed through modifications on the frontend.

In the next installment I'll be updating the code -- and maybe making some of these performance enhancements -- with James Yu's recent changes in his Part 2.