EC2 and REST

I have been looking at Amazon EC2. It is a fascinating service. It is quite a thrill to fire up your own Linux instances in the cloud for the first time. The fun wears off after a bit because Linux instances are quite boring on their own. I am sure designing a web application server infrastructure in the cloud will be a lot more fun. So I started taking baby steps towards that, which involves staring at the EC2 HTTP query API.

The EC2 query API, unlike the S3 API, is quite un-RESTful. GET is the only HTTP verb used, so it is overloaded and is neither safe nor idempotent. All requests take a list of query parameters, the first of which is called Action. Most of the other query parameters are arguments to the Action. The remaining query parameters deal with authentication and non-repudiation. The three most common actions I used were start, stop and status.
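To make the Action-in-the-query-string style concrete, here is a minimal sketch that only builds URLs and sends nothing. The endpoint, the placeholder IDs and the mapping of start, stop and status onto RunInstances, TerminateInstances and DescribeInstances are my assumptions for illustration, and the authentication parameters are left out entirely.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical endpoint and placeholder IDs; real calls also need signing parameters.
ENDPOINT = "https://ec2.example.invalid/"


def query_url(action, arguments):
    """Return an unsigned query-style URL with Action as the first parameter."""
    params = [("Action", action)] + sorted(arguments.items())
    return ENDPOINT + "?" + urlencode(params)


# Assumed mapping of the three operations onto action names.
start = query_url("RunInstances", {"ImageId": "ami-00000000", "MinCount": 1, "MaxCount": 1})
stop = query_url("TerminateInstances", {"InstanceId.1": "i-00000000"})
status = query_url("DescribeInstances", {"InstanceId.1": "i-00000000"})

for url in (start, stop, status):
    # Every operation is issued as a GET, including the two that change state.
    print("GET", url)
```

The point of the sketch is that the verb never changes: the only thing distinguishing a harmless status check from a state-changing start or stop is the value of one query parameter.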

The two primary nouns in the EC2 vocabulary are images and instances. Images are the bits that make up the Xen virtual instance. Instances are runtime instantiations of images. An operation like getStatus() can be intuitively modeled as a GET on an instance resource. However, I got stuck on modeling start and stop.

I then consulted a REST expert, who suggested that I think of a start as a POST to the list of instances and a stop as a DELETE on an instance. An excellent suggestion, although there is a part of me (the dumb part) that feels that this is not terribly intuitive. Although it seems to be frowned upon, I actually prefer tunneling RPC calls through POSTs. It gives me more latitude to model my RPC-ish calls with a POST and my entity calls with all the verbs. Which leads me to wonder why EC2 did not use POST instead of GET; wouldn’t they then have the REST stamp of approval?

For now, I am sticking with a purist REST interface. Here is my REST table.

Method  URI                          Description
GET     /user                        list all users
POST    /user                        create a user
GET     /user/user1                  retrieve user1
PUT     /user/user1                  update user1
DELETE  /user/user1                  delete user1
GET     /user/{user}/instance        list instances for user
POST    /user/{user}/instance        start a new instance on behalf of user
GET     /user/{user}/instance/1      retrieve status of instance 1
DELETE  /user/{user}/instance/1      stop instance 1
GET     /user/{user}/image           list all images for the user
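A table like this maps almost directly onto a routing table. The following is a minimal sketch of that idea in plain Python, not the Project Zero code: the `{user}` and `{id}` placeholders generalize the `user1` and `1` examples, and the names `ROUTES`, `compile_template` and `dispatch` are hypothetical.

```python
import re

# (method, URI template, description) rows, generalized from the REST table.
ROUTES = [
    ("GET", "/user", "list all users"),
    ("POST", "/user", "create a user"),
    ("GET", "/user/{user}", "retrieve a user"),
    ("PUT", "/user/{user}", "update a user"),
    ("DELETE", "/user/{user}", "delete a user"),
    ("GET", "/user/{user}/instance", "list instances for user"),
    ("POST", "/user/{user}/instance", "start a new instance on behalf of user"),
    ("GET", "/user/{user}/instance/{id}", "retrieve status of an instance"),
    ("DELETE", "/user/{user}/instance/{id}", "stop an instance"),
    ("GET", "/user/{user}/image", "list all images for the user"),
]


def compile_template(template):
    """Turn '/user/{user}/instance/{id}' into an anchored regex with named groups."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile("^" + pattern + "$")


COMPILED = [(method, compile_template(uri), text) for method, uri, text in ROUTES]


def dispatch(method, path):
    """Return (description, path parameters) for the first matching row, else None."""
    for route_method, regex, text in COMPILED:
        match = regex.match(path)
        if route_method == method and match:
            return text, match.groupdict()
    return None


print(dispatch("POST", "/user/alice/instance"))
print(dispatch("DELETE", "/user/alice/instance/1"))
```

Start and stop never appear as verbs in a URI here. They fall out of POST on the instance collection and DELETE on a single instance, which is exactly the modeling suggested earlier.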



I used ProjectZero to build this out. The code is in an early alpha state in a branch. I checked in the code to https://www.projectzero.org/svn/zero/branches/p_sr_amazonec2. Prior registration on the ProjectZero site is required.

Project Zero

I have been working on Project Zero for the last year. The project is an incubator and is still in its early stages.

Disclaimer: the views expressed on this blog are mine and mine alone; they do not reflect the views of the Project Zero team.

As with all software projects in their early stages, things changed at a frenetic pace at one point. However, the two things that were most important to me throughout were the adoption of dynamic languages on the JVM and REST. On most days I think using Java as a systems programming language in combination with a dynamic scripting language (like Groovy or PHP) makes a lot of sense. On other days, I think the Java bubble is a strange entity and wonder whether the bubble tax is worth it. For reasons I still do not understand, Java has been very successful in the Enterprise, and the practical side of me says dynamic languages on the JVM are a fantastic idea. In any case, with proper API design (a hard problem), small scripts written in a dynamic language can achieve a lot. Whether you call them little languages or DSLs or glue code, there are significant productivity gains that can be achieved using them.

My affection for REST is partly based on the shock and awe approach taken by WS-*. I see REST as a set of organizing principles that constrain how I think about the server side of a web application (the client side is still a big mess). I like constraints in thinking about software, especially constraints other smart folks have put a lot of thought into. Now, REST might be a little too constraining, since I frequently have to seek out REST Jedi like Joe to feel the force. I will describe more about this in future posts.

The project has been a great learning experience for me. I get to work with a lot of interesting pieces of technology. The one big lesson though – the biggest challenges in software are on the business side. Translating cool technology ideas into dollars is the hardest problem. One solution of course is a network driven, community and folksonomy based ad revenue model, but that one seems to be getting tapped out.

Synergistic Organic Post

This is a synergistic organic blog post meant to innovate and increase a value’s proposition. I had to nurture it within a broader context and leverage strengths to deliver value. Keeping everything organic, while the synergies evolved and matured. The innovative, synergistic value proposition had to be clearly articulated while the post matured and nurtured the organic content. Ultimately, this is all about value. Value and organic matter delivered in a nurturing fashion while your strength is being leveraged. The post is based on a great track record, set on organic tracks, where the synergies in the broader, more complex eco-system helped the grass mature. Leverage was applied all the time. Propositions and their values, living in the broader eco-system, synergistically maturing, leveraging track records, delivering an organic opportunity.

Tipping the point and chasm crossing can also be accomplished by identifying organic opportunities which leverage synergies. Organic organizations that keep tipping points eventually overturn, giving them a great out of the box perspective on the synergies in the box. Frequent chasm crossing can help in leveraging strengths that help in identifying (organically, of course) customer opportunities, establishing a strong track record. This post will continue to organically grow in the comments section.

jQuery

As a server side programmer, I am petrified at the thought of building a Web UI. Not that I have not tried to build a Web UI in the past, but the experiences usually left scars. Experiences that had me scampering back to the coziness of the server side. After all, the server side is rough enough, I thought. There are issues with persistence, state, transactions, scalability, fault tolerance. None of them even remotely require any artistic skills. Well, maybe API design is an art form, but nothing a good JavaDoc cannot fix.

Each time I ran away from front end development, I swore I would come back to it. For the last few months, I decided to take a more careful approach. First, I scouted around to find a good book on design. Based on Amazon reviews, I bought Transcending CSS. It is a pretty interesting book, especially for a technical person. Light on technical details of CSS and heavy on design and different ways to think about design.

Good stuff, but the problem is that most of the JavaScript frameworks out there set out to conquer the world. I did not find a framework that fits in nicely with the HTML/CSS philosophy until I came upon jQuery. This is an amazing framework. It makes thinking about the Web UI so much easier. Like I mentioned in a previous post, it is very important for a programming language or a framework to surface a few important concepts. These concepts should then be repeated over and over to solve problems for a particular domain. It is very difficult to come up with these central concepts, but once they are in place it is very easy for users to understand and use the framework. jQuery does this very effectively.

Now, on to building something meaningful with jQuery….

Software business

An excellent post on open source software and different business models. I think the arguments apply to all software businesses.

Fun with Statistics

I took advice posted here pretty seriously and started reading up on statistics recently. First step, buy a good book.
This book is hands down the best textbook I have ever read. I was expecting a dry textbook filled with formulas, the kind that make you wonder how you ever made it through the formal education process. This book is fantastic; I just wish more of my textbooks in college had been this good.

The book is great, but at some point you want to try something other than the exercises. Enter the Netflix prize. I have been mucking around with their large data set for the last couple of days; it makes for an excellent practical exercise to try out experiments in statistics (and other fields). My wife asked me if I have a chance of winning the prize and here was my answer:

I see myself as the equivalent of a Chimp trying to solve the Rubik’s cube. I am not sure if anyone has tried the experiment of offering a Chimp the Rubik’s cube. If you can get one to solve the cube, I am sure you can make a lot of money with the demo Chimp, but until then I will assume that it is impossible. The Chimp could still have a lot of fun checking the cube out, twisting the faces, admiring the colors, pulling the cube apart etc. That is exactly what I am doing with the Netflix rating data set. The goal is to come up with one set of predicted ratings; I don’t really care even if the RMSE is twice that of the Cinematch algorithm.
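For anyone who has not met it, RMSE is the root mean squared error between predicted and actual ratings: square each difference, average the squares, take the square root. A minimal sketch follows; the ratings in it are made up for illustration and have nothing to do with the real data set.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences of ratings."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up ratings on a 1 to 5 scale, purely illustrative.
actual_ratings = [4, 2, 5, 3]
predicted_ratings = [3, 3, 4, 4]
print(rmse(predicted_ratings, actual_ratings))
```

Lower is better, and because the errors are squared before averaging, a few badly wrong predictions hurt more than many slightly wrong ones.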

Better Faster and Cheaper

As programmers, most of us want to build things better, faster and cheaper. We are taught big O analysis in our first algorithms course, and we never give up optimizing. We love projects that relentlessly simplify. We find compact and powerful languages attractive, and surely the next language is better than the previous one. We always want to do more with less, but if we do more with less, shouldn’t it be cheaper too? Yes! we answer, and most of the time we are wrong. Better and faster are great goals, but cheaper does not pay the bills.

Let us look at the variations. First, there is better, faster and costlier. This one makes for a great (arguably, the best) business model. Intuitively, it makes some sense: as human beings we are irrational, and emotions frequently trump reason. Evoke strong positive emotions and people will gladly pay you a premium. A lot of great engineering has to go into these products though; it is not just about looking good. If a software company can pull this one off, it is the best way to get rich. The problem is that other programmers are constantly looking to make your product better, faster and cheaper. Eventually, your product will become a commodity.

There is worse, slower and cheaper. This one is also easy to understand, you get what you pay for. If you can churn widgets out quickly and for a lower cost, you can make money on volume. I don’t think software manufacturers have figured out how to make money with this model. The closest I have seen this work is the discount bin for computer games at the local electronics store. A number of hardware manufacturers on the other hand thrive this way.

Then there is worse, slower and costlier. This is the most confusing business model because it just should not work (in theory). And yet, it does. No product starts out this way; they all start out in the better, faster and costlier mode. On the road to commoditization, some successful products end up in this state. The reasons vary. In becoming successful, some products become monopolies. Some of them are part of very complex software systems, and complexity resists change. There are probably other reasons involving human factors. This state can last anywhere from a couple of months to a few decades. No one can predict the duration. The complexities of macroeconomic systems are unfathomable.

This state is a programmer’s worst problem, because we are supposed to make things better, faster and cheaper. But remember, cheaper does not pay the bills.

If you think I am wrong, prove it

The best way to kill a nascent debate is to use the line “if you think I am wrong, prove it”. This is one of the most clever ways to avoid criticism (constructive or otherwise). It does not quite close the door on dissent, while effectively slamming it shut.

The problem is that something has to be provable for anyone to prove it. Proving also requires a lot of effort or a lot of time. So, the next time you need to make a controversial decision, make the decision and then use the line “if you think I am wrong, prove it”. In all likelihood, your decision will stay.

Of course, you still have to be right…….and time will surely tell.

Merge Sort

There are certain incidents in your life that stick around in your brain for a long time. One such incident for me was a job interview a few years ago (at an unnamed company) where I crashed and burned quite spectacularly in the space of 2 hours. It was so bad that mid way through the first interview, I no longer even wanted the job, I just wanted to cut my losses and go back home. Needless to say, the post-mortem analysis of that incident has been going on for a few years. Over the years, the incident has lost a lot of the emotional rawness. To me, it has turned into an analysis of how programming languages shape the way you think about solving problems.

On the day of the interview, I was functioning on an hour’s sleep from the previous night, so it did not help at all when the first question asked me to write the Merge Sort algorithm on a whiteboard. Not a very difficult question. The classic divide and conquer algorithm, I said aloud: break the unsorted list into smaller lists, sort the smaller lists and merge them back together. The devil as always is in the details. In this case, it was in the details of how I translated that English sentence into code on a whiteboard. No IDE to tell you how badly the syntax is screwed up. No REPL environments to test out a few things. No writing the unit test first and going back and refining the algorithm over successive attempts. A whiteboard is the most stark programming environment in the world, especially when the code you write is supposed to meet some standards of rigor.

As always, I was granted the luxury of writing the algorithm in any programming language. Over the years, in analyzing the incident, I have realized that this choice is not a luxury. In fact, the worst thing you can do is pick the wrong programming language to think in. Even simple things like the representation and manipulation of a list get in the way of how you solve the problem. The language, the libraries, the idioms commonly used, all shape the way you approach any problem.

From that point, any new language I try out has to go through my Merge Sort test. I had no clear favorite until I tried it in Erlang. It was like a walk in the park with Erlang (I admit I did compile and run it a couple of times, but in my defense I am still learning Erlang and had syntax errors). Here is the merge routine I came up with on the first attempt.


merge(L1, []) -> L1;
merge([], L2) -> L2;
merge([H1|R1], [H2|R2]) when H1 =< H2 -> [H1] ++ merge(R1, [H2] ++ R2);
merge([H1|R1], [H2|R2]) when H1 > H2 -> [H2] ++ merge([H1] ++ R1, R2).

Pattern matching, recursion and literal syntax to deal with lists just seem to lend themselves to these kinds of problems.
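The Erlang clauses above cover only the merge step. For comparison, here is a minimal sketch of the whole divide and conquer sort in Python rather than Erlang, with a `merge` that mirrors the four clauses one for one. It is an illustration of the shape of the algorithm, not the rest of my Erlang code, and the recursive `merge` will hit Python's recursion limit on long lists.

```python
def merge(left, right):
    """Merge two sorted lists, following the four Erlang clauses in order."""
    if not right:                      # merge(L1, []) -> L1
        return left
    if not left:                       # merge([], L2) -> L2
        return right
    if left[0] <= right[0]:            # when H1 =< H2
        return [left[0]] + merge(left[1:], right)
    return [right[0]] + merge(left, right[1:])   # when H1 > H2


def merge_sort(items):
    """Split the list in half, sort each half recursively, merge the results."""
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2
    return merge(merge_sort(items[:middle]), merge_sort(items[middle:]))


print(merge_sort([5, 2, 9, 1, 5, 6]))
```

The index and slice bookkeeping (`left[0]`, `left[1:]`) is what the `[H1|R1]` pattern does for free in Erlang, which is a small example of how the language changes what you have to think about.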

Why did Java succeed ?

It seems like the number of pronouncements about the “next Java” is on the rise. Folks are boldly announcing their picks: Ruby, Python, Erlang and JavaScript are all apparently in the running. I have also seen folks back Haskell. But the one question that has bothered me for a while is why Java succeeded in the first place. Shouldn’t the successor to Java follow the same or a similar recipe, or have the times changed so that the old rules no longer apply?

I have heard so many reasons for Java’s success, including:

  • It was a shrewd marketing campaign, folks just assumed that Java had something to do with the internet. Applets were a nice selling point, even though Java’s eventual success was on the server.
  • Java was a better C++ (at the time).
  • Write Once Run Anywhere cracked open the platform portability problem. Certainly was better than using #ifdef’s all over the place.
  • Java took Smalltalk and made the syntax more palatable (although Alan Kay makes it a point in some of his keynotes that Java and C++ are not what he had in mind when he coined the term OOP).
  • From the Paul Graham school of thought: Java is a mediocre language for mediocre programmers. It thus became a “safe” option for IT middle management, because in theory Java programmers were replaceable. After a point, there were so many of them (us) that the risk associated with programmers leaving a project was minimized.
  • J2EE is a standard and provided an API for anything an “enterprise” would want to do. Being a standard meant that vendors were replaceable, once again playing into IT middle management’s risk-averse mindset.
  • This reason is the one I like best – J2EE application servers provided IT operations staff with a standard deployment environment. One they could provision, deploy, maintain and operate without having to worry about the language du jour.
  • Eclipse is a great tools platform and made Java development a snap.

So, which of these reasons is it? Or is it a combination of some or all of them? I am probably missing some reasons, but if Erlang is to become the next Java, which reason will it go after? I like Erlang; I think message-passing concurrency and a system built to handle failures are fantastic design principles. However, is technology enough of a forcing function? Businesses only react when their competition is able to gain an edge through technology or when their business goes away because of technology.

Will the “next Java” replace Java in an existing customer set or create a whole new customer set? I have a sneaking suspicion that existing customer sets will not replace Java: there is way too much invested in the language, the libraries, the VM and J2EE application servers. The change has to come from a new customer set.

