
Bootstrapping a rails app to use google users and calendars

Posted in code on April 21st, 2012 by ben

I wanted to create a rails app that authenticates against Google identities and uses the Calendar API for viewing calendar information. All the building blocks exist, including omniauth for doing OAuth authentication (works for more than Google) and the google-api-client for interacting with data services via ruby.

Each project provides adequate documentation, but since they are trying to serve such a wide variety of use cases, it can get a little tricky to reduce the instructions to the minimum configuration. Furthermore, I might like to create multiple similar apps, so I don’t relish restubbing my toes on the same configuration tasks in the future.

I realized the best way to encapsulate the exact config is to create a rails generator, which can automatically create the controllers, views, models, routes, and other miscellaneous config in various files. I decided to put all the config into a gem which provides a single generator to bootstrap the app, after which the gem can be removed.

The gem is hosted on GitHub: google_oauth_calendar, with more detailed instructions for installation. Assuming you’ve done the work of configuring the Google API console, just add the gem

gem 'google_oauth_calendar', :git => 'git@github.com:deafgreatdane/google_oauth_calendar.git'

and run

rails generate google_oauth_calendar:install

and your app just works!

The generator code lays out all the steps, so if you want to run the steps by hand, you can see what it takes.

There are also generators for entire apps (aka templates). The rails_app_composer is a neat project for decomposing a large set of starting gems into recipes so you can generate a complete infrastructure in one command. I chose to create a generator via a gem rather than a template, since my app’s initial state may not be representable by that project. Also, it suffers from the same problem as the individual gems: by serving so many different starting points, you spend extra time carving around the stuff you don’t want or need.

Do not use gdata-java-api

Posted in code, opinion on March 28th, 2012 by ben

TL;DR

  • Google offers two libraries for accessing their services, the gdata-api and google api
  • The older one looks more complete, and with a promise to maintain it, it’s tempting
  • Don’t give in – take the time to deal with the newer API

A current project of mine involves using Google Docs and automating a workflow using Google App Engine (GAE) with Java. There are two libraries provided by Google for accessing their services: an older library called “gdata” and a newer one called “google api.” The older comes with the warning:

We have stopped actively developing this client library, except critical bug fixes and support for some Google API’s. However, this client library is not deprecated, and is considered the “stable” choice, unless you have a specific requirement that is only supported by the new client library below.

The newer comes with a nice migration message, but also with the caveat regarding the generic Data APIs (and the specific Document List service I wanted)

We do not provide service-specific libraries for the Google Data APIs because they are built on an older infrastructure that does not have a Discovery Service. Nevertheless, the base Google API Client Library for Java fully supports Google Data APIs as long as you write your own Atom XML data model

The last emphasis is mine, and it scared me away from using it, reasoning, “why should I write my own data model when a complete one exists?”

Here’s a stack of reasons why:

  • gdata does not come with any maven support. I did find a script to add the jars to my maven repository, though.
  • gdata has manual hoops to get working with OAuth. Although they have the recipe fully documented, they still have this zinger, “you need some way to persist the token secret in order to create an OAuth token object coming back from the approval page.” They vaguely suggest, “set a session variable or cookie,” but they don’t provide any sample code that closes the loop.
  • gdata has archaic dependencies. Its latest release is completely incompatible with the newer API, since the newer depends on a recent release of guava, but gdata still depends on google-collections-rc1. See Issue 244 for an example. You can’t accuse the guava folks of premature deprecation – they kept the old methods in for 2 years before removing the deprecated calls.
  • Despite promises of “gdata-java-client is not deprecated. We are keeping it in maintenance mode, fixing critical bugs, and adding a few minimal features,” the code doesn’t support this. Again, see Issue 244, where a patch has been submitted and it’s still sitting there.
  • I really wanted to use the “changes” feed from the Document List API. Although there are ExtensionPoint classes such as Changestamp and LargestChangestamp in the library, no samples referenced them, and questions in the forums about why they weren’t working go unanswered. I couldn’t get changestamp data out of the old classes, which is what eventually caused me to change course.
  • Writing custom Atom XML beans isn’t that bad. @Key is your friend, and there are plenty of samples.

I’m still at a loss for why a Google Data API is not fully fleshed out. The calls are well defined, there’s just some effort involved in doing all the mapping. Perhaps there’s a measurable performance penalty of mapping all the XML into beans that is offset by the effort required to write all the ones you need.

Hopefully this post will save someone else the same missteps. It’s not about reinventing the wheel, but learning from other people’s experiences.

Thoughts on Test Driven Development

Posted in code on March 27th, 2012 by ben

Part of the last project I did was to investigate ways that a legacy codebase could be rejuvenated. The legacy code had become bogged down in intermingled dependencies, bloated runtime, and a general fear of “if I change something here, what unexpected bugs will appear somewhere else?”.

The project was done in the context of Test Driven Development (TDD) and 100% unit test coverage of java code.

For a tongue-in-cheek summary, see 10 Reasons to Avoid Test Driven Development.

TDD Defined

TDD is also called “Test First Development,” a practice where you write the test first, demonstrate that it fails, and then write the core code until the test passes. The tests become part of an automated process during compilation to ensure that the code continues to perform to expectations.

If you test after the core code, you’ve probably already done some interactive testing, and now the testing feels like busy work, a common critique of unit testing in general.

TDD is especially relevant for bug fixing. If you can write a test that fails due to the bug, and then fix the code, you’ve guaranteed that the same bug won’t recur.
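As a rough illustration of that bug-fixing loop, here is a minimal sketch in plain Ruby (no test framework, just to keep it self-contained). The `average` function and its empty-list bug are hypothetical, invented for this example: the regression check was written first, failed because an empty list produced NaN, and passes once the guard clause is in place.

```ruby
# Hypothetical function under repair. Before the fix, average([]) evaluated
# 0.0 / 0 and returned NaN instead of a usable number.
def average(values)
  return 0.0 if values.empty? # the fix, added after the check below failed
  values.sum.to_f / values.size
end

# Regression check, written before the fix to reproduce the bug.
def check_average_of_empty_list
  actual = average([])
  raise "expected 0.0, got #{actual}" unless actual == 0.0
  true
end

# Existing behaviour has to keep working as well.
def check_average_of_values
  actual = average([1, 2, 3, 4])
  raise "expected 2.5, got #{actual}" unless actual == 2.5
  true
end

puts check_average_of_empty_list && check_average_of_values
```

As long as the empty-list check stays in the suite, removing the guard clause makes the build fail right away.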

In practice, there’s some overlap in the sequencing. In Java, your tests won’t even compile without the corresponding methods defined in the main code. Also, when doing exploratory work, you want to concentrate on getting things to work, not on how to drive a test. This is all fine, but the theme is that for well-understood code, you write your expected results first.

100% coverage defined

100% coverage means that every line of code and every route through a method (conditionals, loops, exceptions) is executed during the course of the test run.

It does not mean that every method of every class has a separate test method. The notion of writing separate methods for every setter and getter of a plain bean is obviously wasteful and clutters the test code with low-value tests. In practice, those methods are going to get called in the context of the logic tests.

However, every branch is tested. This means a conditional like “if ( (A and B) OR (B and C) )” will take 4 tests. Exception catching must also be covered. If you’re going to bother catching an exception, you need to test that you’re executing that behavior properly.
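To see where the four tests come from, here is a small sketch in Ruby (chosen for brevity; `&&` and `||` short-circuit the same way as in Java). The `check` helper and its probe are hypothetical. They just record which outcome each sub-condition produced, so we can confirm that four inputs drive all four sub-conditions both ways.

```ruby
require 'set'

# Hypothetical tracing version of "(A and B) OR (B and C)": every
# sub-condition that is actually evaluated records its name and outcome.
def check(a, b, c, seen)
  probe = ->(name, value) { seen << [name, value]; value }
  (probe.call(:a, a) && probe.call(:b1, b)) ||
    (probe.call(:b2, b) && probe.call(:c, c))
end

seen = Set.new
inputs = [
  [true,  true,  false], # a true, first b true: left side wins
  [true,  false, false], # first b false, then second b false
  [false, true,  true],  # a false, second b true, c true
  [false, true,  false]  # c false
]
results = inputs.map { |a, b, c| check(a, b, c, seen) }

puts results.inspect # [true, false, true, false]
puts seen.size       # 8: four sub-conditions, each seen true and false
```

Dropping any one of the four inputs leaves at least one outcome unrecorded, which is the kind of gap a branch-coverage tool reports.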

What about something less than 100%, to avoid playing coverage games?

  • To riff on John Wanamaker, “Half the time I spend writing unit tests is wasted; the trouble is I don’t know which half.”
  • That means if you set a 90% goal and the untested 10% all falls in the valuable half, you are missing 20% of the valuable use cases.
  • A flat 100% is an easy threshold to enforce at checkin. If you allow non-testing of obvious code, then you need to review the non-coverage every time to make sure your metrics are covering the good stuff.

Why it’s worth it

As an investment in the future

A strong set of unit tests can form a safety net for ongoing changes to a codebase. When everything is executed by some form of test, you can be more fluid in code changes, knowing the code still produces the same result when the tests pass.

Unit tests also function as requirements documentation. If tests and comments are written in a way that implies “this is what the system is supposed to do in case X,” then they will remind folks of those requirements if the test ever fails. This is especially important over the long run when we can’t count on the memory or presence of long-tenured developers.

Increasing quality

TDD fundamentally changes how you write code. Not just the sequence of putting characters in files, but how the code is structured. Many best practices around code structure, such as short methods and low cyclomatic complexity, are actually victims of cargo cult programming. Those things are results of good TDD, not an end goal. For example, if you’re forced to cover all the code, you’re going to write small methods just as a way to avoid writing massive fixtures.

TDD also results in better separation of concerns. A complaint about TDD is the time wasted setting up the preconditions to the test (the “fixture”), but this is actually an indicator that your method is doing too much. Out of laziness to avoid writing all those fixtures, developers will figure out ways to abstract out the dependencies and focus on the target code. Tighter code results.
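One way this tends to play out, sketched in plain Ruby: the collaborator is passed in instead of being constructed inside the method, so the fixture shrinks to a hand-rolled fake. `InvoiceReminder`, `FakeMailer`, and the invoice hashes are all hypothetical names made up for this illustration. They are not from the project described here, which used Java and EasyMock.

```ruby
require 'date'

# Hypothetical class under test: it only decides who gets a reminder.
# Delivery is delegated to whatever mailer object is handed in.
class InvoiceReminder
  def initialize(mailer)
    @mailer = mailer
  end

  # Sends one reminder per overdue invoice and returns how many were sent.
  def remind(invoices, today)
    overdue = invoices.select { |invoice| invoice[:due] < today }
    overdue.each do |invoice|
      @mailer.deliver(invoice[:email], "Invoice #{invoice[:id]} is overdue")
    end
    overdue.size
  end
end

# Hand-rolled fake: records deliveries instead of talking to a mail server.
class FakeMailer
  attr_reader :sent

  def initialize
    @sent = []
  end

  def deliver(to, body)
    @sent << [to, body]
  end
end

mailer = FakeMailer.new
count = InvoiceReminder.new(mailer).remind(
  [{ id: 1, email: "a@example.com", due: Date.new(2012, 3, 1) },
   { id: 2, email: "b@example.com", due: Date.new(2012, 4, 1) }],
  Date.new(2012, 3, 27)
)
puts count # 1
```

The whole fixture is two hashes and a fake, with no database or mail server to spin up.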

TDD forces deeper thought into edge conditions, which are the bane of our runtimes. Rather than writing a function and thinking “this is the happy path, I’ll come back someday and flesh that out,” you actually write a test that exercises those rare occurrences, and codify the expected response.

Going to 100% coverage finds obscure bugs proactively. When you’re trying to nail that final percentage point of coverage, you sometimes get hung up and think “this should work, argh, why not?” Then, when you finally figure out why, it’ll be an epiphany: “That was a potentially nasty bug, and how would we ever have tracked that down in the complexity of a production runtime?”

As a productivity gain

No, really, even with all the silly tests to ensure 100% coverage and all the mocking and expectations it takes to write a valid fixture, you still get a net gain in productivity.

  • Executing a unit test class takes < 3 seconds, so you get instant feedback on whether it’s working.
  • Contrast that with the time it takes to spin up the full runtime (database, webapp container, etc). Many projects take 30-90s to spin up the app container; imagine having to do that multiple times while testing simple logic missteps.
  • Stay in the zone. Those server startups are just the right amount of time for the developer to switch to another activity to fill the time, and now they get distracted by that extra email, twitter post, etc. Rather than resuming coding as soon as the app is running, they now need extra time to wrap up the distraction.
  • The sum of well tested smaller bits of code means the bigger code comes together quicker. Even if your integration tests are manual, you’ll get them to pass in fewer tries when you’re putting together quality pieces, and not circling back to fix a forgotten null pointer exception.

Every QA issue avoided is a massive savings. If your code change has unexpected side effects elsewhere, the full test suite will point them out, and you can fix the other items in the same mental context. The alternative is to spin up the entire environment 24 hours later just to fix a few lines.

When it comes to bug fixing, once the cause of the bug is identified, it’s so much easier to zoom in on the fix with a unit test than it is to interactively test against the full application.

Implementation notes

A few miscellaneous bits worth noting from the project.

  • This was all done in Java and JUnit 4.
  • Only covered Java, not any UI or JavaScript.
  • Lines of code were almost 1:1 for core vs test.
  • Coverage
    • Measured using Emma for Eclipse. This should also be built into any continuous integration setup.
    • Make sure to measure both lines and branches in the coverage UI.
    • Only the main Java code gets to 100%. Unit tests won’t be marked as fully covered due to the way exceptions work.
  • Need well-documented naming and file layout conventions for the tests. Otherwise it’s too hard to find the tests that cover a given chunk of logic.
  • Refactoring tests will become an equal part of the maintenance. Don’t just keep piling on more tests – they need cleanup just the same to make sure they’re still relevant. This is especially important when refactoring class/method names: make sure to fix the corresponding test names.
  • Used H2 as a runtime database when testing DAO and persisted models. Do NOT deal with the dependency of MySQL or Oracle runtimes.
  • Heavy use of mocking with EasyMock. The syntax can get verbose, and it can be difficult to get the gist of a test.
  • Avoided using Spring for injecting dependencies in the tests. Individual tests run faster, and it avoids the XML hell of managing lists of beans.
  • Ruby kicks Java when it comes to TDD
    • dynamic runtimes mean you write more test first
    • far more succinct fixture setup
    • rspec is such a clean way to declare expectations and get long term documentation from the result

What is “fast?”

Posted in code on December 31st, 2011 by ben

It’s so easy to let our web app performance degrade as it evolves, each architectural decision and framework adding a nearly imperceptible bit of overhead to the process until eventually people start asking “when did it get so slow?” After a while, we’ve reset our expectations of what “fast” is.

I’ve been tinkering with the simplest use cases to remind myself of how fast things really can go, and it’s worth a reminder of our various orders of magnitude. When we talk about requests/second from a single machine, what makes us happy? 10s? 100s? Or 1000s? These days, we’re dealing with multicore processors, each running gigahertz clocks. That means 1000s of requests/second is still granting us millions of cycles per request. That’s a lot of instructions! [1]
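The back-of-the-envelope arithmetic can be sketched as follows. The core count and clock speed are assumed round numbers picked for illustration (2 cores at 2 GHz), not measurements from any particular machine, and `cycles_per_request` is a hypothetical helper.

```ruby
# Cycles available per request if the whole machine did nothing else.
def cycles_per_request(cores, clock_hz, requests_per_second)
  cores * clock_hz / requests_per_second
end

# Assumed figures for illustration only.
cores = 2
clock_hz = 2_000_000_000 # 2 GHz per core

[10, 100, 1_000, 10_000].each do |rate|
  puts "#{rate} req/s -> #{cycles_per_request(cores, clock_hz, rate)} cycles per request"
end
```

At 1,000 requests/second this works out to 4,000,000 cycles per request under those assumptions, and even 10,000 requests/second still leaves 400,000.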

On my 2+ year old laptop, Apache serves plain files at 1000 files/second. I can also get it to dynamically render a web page with a database query result at the same rate! [2]

That’s an interesting reminder of how fast things can go. Maybe it should cause us to pause and consider all our beloved abstractions and whether they are really helping us, or are they just helping us mask the inadequacies of an earlier abstraction?


[1] Yes, I know cycle != instruction, but I’m talking about orders of magnitude.

[2] I’m getting contention between JMeter and the server itself, so those aren’t upper bounds. I’m also getting the benefits of OS-level IO caching.

 

Why “Rails” is the perfect metaphor

Posted in code, opinion on November 1st, 2011 by ben

I’ve seen various posts that describe what it means to be “on rails” but none that address the actual physics of it, and why it’s such a perfect metaphor for what Ruby on Rails is.

Why do trains stay on their tracks? Contrary to popular opinion, it’s not the flanges. If it were the flanges, there’d be lots of wear and noise from passing trains. Instead, it’s because the treads of the wheels are tapered, and this taper causes the wheel pair to naturally “hunt” for the center of the rails. Richard Feynman explains it quite well on YouTube. Or for even more detail, read about rail adhesion on Wikipedia.

I like to think of the Ruby on Rails framework as the software equivalent of the taper. The conventions that are built into RoR continually drive us back towards making a maintainable application, with good separation of model, view, and controller, test driven development and rich testing frameworks, and the rich ecosystem of gems to avoid reinventing the wheel (pun intended) in common features of applications. Add in some DRY (Don’t Repeat Yourself), and the bevels increase, and the result is like having the two rails turn into a V-groove that your application just rolls smoothly down.

I’ll go even further (and abuse the metaphor) to compare other platforms to the rails. Java is like a railroad without any taper, and it depends on the flanges to stay on track. Java’s flanges include the static typing and the multitude of frameworks that try and help make a better flange (Spring, Struts, Hibernate, J2EE, etc). And oh, does it make a lot of grinding noises!

But thank goodness for the open source community creating all those flanges faster than the Java train can derail, because the .NET world is like a railroad with the bevels getting wider toward the outside, which is a terribly unstable configuration. (I wish I could find the video comparing the conical configurations, which I have such a vivid memory of seeing as a child, possibly on PBS.) Sure, really good engineers can create a workable train route, but it’s just too easy for an inexperienced developer to create a train wreck from all the wizards.

RoR is the first framework in my 18+ years of server-side web development that makes me regret all the flanges I had to make in the other frameworks. Thank goodness for those tapers!

All Aboard!

 

Looking for an application UI template

Posted in code on August 28th, 2011 by ben

Whenever I start a new web application, it seems that there’s a fair amount of reinventing the wheel in terms of navigation and UI structure. There are tons of tools to help with the individual widgets on the page, including JQueryUI, ExtJS, and Dojo, to name a few. They all offer buttons, dialogs, accordions, tabs, etc, that make up the page. Some have built-in layout managers; others leave it to you to do separately, in which case you can roll your own, or base it on a grid layout like Blueprint CSS.

However, I haven’t come across a unified package that takes a well documented and rational approach to laying out the elements of a rich application. A framework would help you through questions like:

  • How to navigate between modules of the application
  • How to provide “grounding” (inform the user where they are in the app)
  • Where to place “action buttons”
  • When to use sidebars, tables, portlets, etc

The framework would:
  • Be well documented
  • Include helper methods for the views/controllers to work with
  • Be flexible enough to support the uniqueness of the actual app
  • Integrate with Rails

Well known application suites have already figured all of this out. Whether it’s Atlassian (Jira, Confluence, Bamboo), 37signals (Basecamp, Highrise), or even Google’s suite, there is obviously something internally leading the consistency in design and layout principles.

It seems this would be a natural fit for Rails. For all its reputation as opinionated software, there should be a gem for opinionated application design that provides all the features above. There are things like web-app-theme, but they don’t really help you understand what to use when.

What’s out there? Or why isn’t there one? Drop a line to deafgreatdane@gmail.com, or leave a comment.

Visualizing state_machine in Rails

Posted in code on August 27th, 2011 by ben

Often, an app’s models need a lifecycle, transitioning between different states according to different business rules. The leading contender for abstracting this is the state_machine gem for Rails. (There are other options too, but not as rich.) It allows you to create a finite state machine on top of ActiveRecord, including all the goodness of events, constrained transitions, and many others.

As much as I believe the code should be legible enough for easy reading, sometimes a picture is worth a thousand words. This gem comes with GraphViz integration, enabling you to create visual representations of your state machines. (GraphViz is a tool for turning textual descriptions of directed or undirected graphs into pictures.) This feature is almost a footnote in the gem documentation, but it’s the killer feature for any skeptics who might be tempted to roll their own lifecycle management code (usually by a simple “state” string in the model).

The picture to the right is generated with the following command:

rake state_machine:draw CLASS=Vehicle
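For a feel of what such a textual graph description looks like, here is a hand-rolled sketch in plain Ruby. It does not use the state_machine gem or its API. The transition table, the state and event names, and the `fire` and `to_dot` helpers are all hypothetical, and the dot text the gem actually writes will differ in its details.

```ruby
# Hypothetical transition table: event => { from_state => to_state }.
TRANSITIONS = {
  ignite: { parked: :idling },
  shift_up: { idling: :first_gear },
  park: { idling: :parked, first_gear: :parked }
}.freeze

# Returns the next state, or raises if the event is not allowed from `state`.
def fire(state, event)
  TRANSITIONS.fetch(event).fetch(state) do
    raise ArgumentError, "cannot #{event} from #{state}"
  end
end

# Renders the table as a GraphViz "dot" digraph, one labelled edge per transition.
def to_dot(name)
  edges = TRANSITIONS.flat_map do |event, moves|
    moves.map { |from, to| "  #{from} -> #{to} [label=\"#{event}\"];" }
  end
  (["digraph #{name} {"] + edges + ["}"]).join("\n")
end

puts fire(:parked, :ignite) # idling
puts to_dot("Vehicle")
```

Saving the printed digraph to a file and running it through GraphViz (for example `dot -Tpng`) draws the states as nodes and the events as labelled arrows.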

That’s all pretty nifty, but what if you need an even fancier picture? OmniGraffle to the rescue! The Mac’s über diagramming tool can import GraphViz files directly, so it’s just a matter of calling the rake task with the right FORMAT, and opening it:

 

rake state_machine:draw CLASS=Vehicle FORMAT=dot
open -a /Applications/OmniGraffle\ 5.app Vehicle_state.dot

And now you can make graphic changes to your heart’s content. A bit of tinkering results in the image to the left.