Advertisements

Launching a whole new thing

August 11, 2011 Leave a comment

For the past 15 years I’ve worked in non-profits (missions, humanitarian relief organizations) to enable them to accomplish their ministry objectives through strategic application of technology. Early on, that consisted largely of systems, with administration and system design as the emphasis. Later on it shifted almost entirely to development and application architecture, but still with a strain of making sure services are always available for their intended use.

Now I’m entering a new chapter of my career, with the launch of NodePing, a partnership with my good friend Shawn Parrish. For me (this is my blog after all), that has meant something of a return to activities I haven’t focused on in quite a while: business plans and contracts. Happily though, most of my time in pulling this together has been in code, working with Node.js, Redis, CouchDb, and jQuery. That part has been a lot of fun, and there’s still a lot more to do.

The new service brings a lot of the things I have worked on through the course of my career together. I actually have used my education, a change from my normal work. It also leans on all the years I’ve been responsible for system administration departments, and ensuring services are available. And of course, it has included lots of system and software architecture and coding, by far the funnest parts.

Some posts that might otherwise appear on this blog about Javascript and systems for hosting Node.js applications will likely appear on the NodePing blog.

Huge amounts of work to do from here. I’m excited.

Advertisements
Categories: work and career

Moving CouchDb database files between servers

July 8, 2011 8 comments

There are several methods available for copying data between CouchDb servers. The most obvious is replication, which CouchDb does extremely well and it’s built in. If that option is viable it is probably the way to go. A while back I posted about a method I have used to use bulk document handling in Couch to copy data. That process works well too, and I continue to use that from time to time for some data.

Recently I had a situation in which I needed to set up several development and testing servers with the same initial state for the data. These were not on the same networks, and replication wasn’t convenient. I was moving a good bit of data across several databases, so the bulk document approach wasn’t attractive either. So I resorted to just copying the data files between servers. CouchDb’s design makes this easy to do.

The steps I take are probably overly cautious, but here’s what I do:

  1. Stop the couchdb service on the source host
  2. tar.gz the data files. On my Ubuntu servers this is typically in /var/lib/couchdb (sometimes in a subdirectory based on the Couch version). If you aren’t sure where these files are, you can find the path in your CouchDb config files, or often by doing a ps -A w to see the full command that started CouchDb. Make sure you get the subdirectories that start with . when you archive the files.
  3. Restart the couchdb service on the source host.
  4. scp the tar.gz file to the destination host and unpack them in a temporary location there.
  5. chown the files to the user and group that owns the files already in the database directory on the destination. This is likely couchdb:couchdb. This is important, as messing up the file permissions is the only way I’ve managed to mess up this process so far.
  6. Stop CouchDb on the destination host.
  7. cp the files into the destination directory. Again on my hosts this has been /var/lib/couchdb.
  8. Double check the file permissions in their new home.
  9. Restart CouchDb on the destination host.

You may or may not have to stop the CouchDb services, but it seems like a good idea to me to decrease the chances of inconsistent files.

I have done this a number of times now with no problems, other than when I manage to mess up the file permissions.

Categories: couchDB

Running out of disk space for CouchDB

June 16, 2011 2 comments

I added some new views to CouchDb on a development server the other day. The views included three or four emits and the full document in the view, and the disk space used exploded to several times the size of the actual data stored. Everything came to a screeching halt as the server had no space to write logs, no space to build views, no space to store the data we were pushing in, and no space for the compacts that were trying to run. It was one of those moments when you are really happy that the host you just messed up is a development box. This episode raised several lessons for me, some of which were new in the specific context, some of which were good reminders.

First, CouchDb needs lots of disk space. If you are working with views, lots of disk space can disappear very fast, as Couch copies the data files for compacting. This is not a bad thing, its what allows Couch to keep running and performing while it is doing these operations, but it is something to plan for when sizing resources. Running out of disk space is a bad thing. In my case, I had estimated the size of the data and allowed some room for compacts and views, but not nearly enough.

Second, this episode started up quite a debate within my team about designing views. I tend toward making views overly inclusive. My friend tends toward making the views as small as possible, and then throwing in an include_docs if you need more in the results. This is a trade-off, and in the end you have to consider it with every view you write. Small views save lots of disk space, and speed up writes and simple queries where you don’t need much in the results. Using include_docs is fairly cheap if you are getting a handful of docs. If you’re looking for results from hundreds or thousands of records, include_docs loses its appeal, because the server has to fetch each of those documents. Its a balancing act. If you argue for larger views for faster queries as the side on which you should err, I’d suggest making sure you have plenty of disk space. Turns out it is a little embarrassing when you make that argument and then badly under estimate how much disk space you really need, even on a development box.

Third, I got to learn how to clean up from dead compact jobs. When the server is out of disk space when it tries to compact a database, it can leave behind compact files that prevent compacts from working even after the space problem is solved. In my case on Ubuntu, these are in /var/lib/couchdb. If there are 0 size .compact files in your database directory you just need to delete them and restart the compact. I also noticed that there were some non-zero size .compact files in databases that were not actually compacting, and I removed these as well. Everything went back to humming after that.

CouchDb is a great tool. It does have some peculiarities it is good to keep in mind. Life was simple when we had basically no real choices. “Welcome to LAMP! Would you like MySQL or Postres with that?” Now we have all kinds of options on the menu, and we actually have to think about finding the right tool and understanding its strengths, weaknesses, and quirks.

Categories: couchDB

Time span sliders in jQuery

May 28, 2011 15 comments

I’ve been working on prototyping a new application that needed a way for users to schedule things in spans of time on multiple days. It should be quick and easy for people to pick arbitrary time spans on each day of the week. The main use case is to set a day time block, during business hours for example, or a block that excludes business hours. Someone might schedule the time span from 8am to 5pm, or they might schedule it to end at 8am and then start again at 5pm. I looked around but didn’t find any examples of doing anything like this that I liked.

I settled on a slider that you can invert. So you can set the range from 8am to 5pm, then click on a check box that flips the slider to mean everything except for 8am to 5pm. It needs to be visually obvious what this is doing, so the slider and text needs to reflect what is happening. I think the result is fairly easy to use and very workable.  The layout looks something like this:

The other thing I wanted was to be able to drop the whole thing in a page easily based on code in one place. I anticipate it will be used in more than one location within the application. So I wanted to construct it as a jQueryUI widget that basically wraps the standard slider widget. I didn’t try to make it portable beyond this app. There are a few things I’ve done here that you wouldn’t want to do if you were making a normal jQueryUI plugin.  For example, it carries the HTML for laying out the text and sliders within the widget.  I think its illustrative for a prototype, but not elegant.

Posts on doing multiple sliders or time based sliders are not easy to find. The best one I found was from Marc Neuwirth. That post saved me some time in figuring out how to deal with the times on the slider. There is also an improved slider on the Filament Group’s site, but it doesn’t really answer the basic needs I had.  The fact that I had a hard time finding examples from other people who’d already done the same thing is what prompted this post.

For a time slider, we need to convert the times to integers so we can use them for slider values. To convert the time to an integer, we split the time into hours and minutes, multiply the hours times 60, and add back the minutes. We do the reverse to turn the integer into a formatted time for display. In the code below these operations are done in two functions, named formatTime and getTime. These are single use functions I don’t have a need for elsewhere, so I’m carrying them inside the narrow scope. If I ended up having a similar need somewhere else I’d refactor that out to make it usable elsewhere.

Since I’m making a jQueryUI widget (sort of), I start with the basic self executing function and the widget factory method.

(function($){

    $.widget("ui.timeslider", {
    });

})(jQuery);

To that, I need to add the options object, which will allow us to set our options for the widget and also double as the options we’ll pass to the individual sliders. That last bit gets a little sticky, because we’ll have a changing scope as we deal with events within seven different slider (sub)widgets. Usually I wouldn’t put the whole callback function in the options definition, since its harder to read and maintain, but its easier to make it portable to each of the sliders this way.

options: {
    animate: false,
    days: ["monday","tuesday","wednesday","thursday","friday","saturday","sunday"],
    distance: 0,
    max: 1440,
    min: 0,
    orientation: "horizontal",
    range: true,
    slide: function(event,ui){
        function formatTime(time){
            var hours = parseInt(time / 60 % 24);
            var minutes = parseInt(time % 60);
            minutes = minutes + "";
            if (minutes.length == 1) {
                minutes = "0" + minutes;
            }
            return hours + ":" + minutes;
        }
        var startTime = formatTime(ui.values[0]);
        var endTime = formatTime(ui.values[1]);

        var selector = "#"+this.id;
        $(selector + "-time1").val(formatTime(ui.values[0]));
        $(selector + "-time2").val(formatTime(ui.values[1]));
    },
    step: 5,
    value: 0,
    values: [360, 1080]
},

Most of these options are the standard slider options. The only addition is the days array, which we’ll use to add the individual days to our results. Putting it in the options this way lets us pass in a different set of days if we wanted to (maybe Monday through Friday are more appropriate in some places). The slide callback is basically a typical callback for the slider widget. The id based selector will make more sense in a bit.

The rest of the widget is all in two methods: _create and _sliderhtml. _sliderhtml just holds our html to build the containers for the individual sliders. The underscore at the front of the method names are how we tell the jQueryUI factory that they are private.

_sliderhtml: function(day){
    var displayday = day.charAt(0).toUpperCase() + day.substr(1);
    var html = ""+ 
        '<div class="fieldrow spacer">' +
            '<div>' +
                '<label for="'+day+'-time" class="inline">'+displayday+':</label>&nbsp;' +
                '<span id="'+day+'-midnight1" style="display:none">midnight to</span>&nbsp;'+
                '<input type="text" name="'+day+'-time1" id="'+day+'-time1" value="6:00" class="blended"> - '+
                '<input type="text" name="'+day+'-time2" id="'+day+'-time2" value="18:00" class="blended">&nbsp; '+
                '<span id="'+day+'-midnight2" style="display:none">to midnight</span>&nbsp;&nbsp;&nbsp;'+
                '<span>Except <input type="checkbox" name="'+day+'-except" id="'+day+'-except"></span>' +
            '</div>' +
            '<div id="'+day+'"></div>' +
        '</div>';
    return html;
}

That’s not pretty, and the wrapping displayed here makes it look even worse. This is special use widget, intended for one application. I wouldn’t build more general use widgets with the HTML like this, because you have to edit the widget code to change the layout, but its fine for my limited purpose here. This is the HTML for one slider. We’ll pass it the day from the days array in the options and use the day name for our field names and element IDs. We change the first letter of the day’s name to upper case to make the label.

_create is the standard method that will be used by the jQueryUI to construct our widget. It looks like this:

_create: function(){

    var self = this,
        o = this.options,
        dayslength = o.days.length;
    o.timeslider = this;

    function getTime(time){
        var times = time.split(':');
        time = (times[0] * 60) + parseInt(times[1]);
        return time;
    }

    for(var i=0; i < dayslength; i++){
        self.element.append(self._sliderhtml(o.days[i]));
        $( "#" + o.days[i] ).slider(o);
        $( "#" + o.days[i] + "-time1" ).bind("change",function(){
            var slider = "#" + this.id.substr(0,this.id.length - 6);
            var newval = getTime($("#"+this.id).val());
            if(!isNaN(newval)) $(slider).slider("values" , 0 , newval);
        });
        $( "#" + o.days[i] + "-time2" ).bind("change",function(){
            var slider = "#" + this.id.substr(0,this.id.length - 6);
            var newval = getTime($("#"+this.id).val());
            if(!isNaN(newval)) $(slider).slider("values" , 1 , newval);
        });
        $( "#" + o.days[i] + "-except" ).bind("change",function(){
            var excl = "#"+this.id;
            var sliderlength = excl.length - 7;
            var slider = excl.substr(0,sliderlength);
            var exclude = $(excl).is(":checked");
            if(exclude){
                $(slider).addClass("ui-widget-header");
                $(slider).removeClass("ui-widget-content");
                $(slider + " div").addClass("ui-widget-content");
                $(slider + " div").removeClass("ui-widget-header");
                $(slider + "-midnight1").css("display","inline");
                $(slider + "-midnight2").css("display","inline");
            } else {
                $(slider).addClass("ui-widget-content");
                $(slider).removeClass("ui-widget-header");
                $(slider + " div").addClass("ui-widget-header");
                $(slider + " div").removeClass("ui-widget-content");
                $(slider + "-midnight1").css("display","none");
                $(slider + "-midnight2").css("display","none");
            }
        });
    }
},

The interesting part, and basically the whole reason for the widget to exist, is in the for loop in this method. self.element is the element from the selector in our page. So we might create the widget like this:
$("#scheduler").timeslider();
In that case the DOM element with “scheduler” as the ID will be self.element within the _create method. We iterate through our options.days array and append the html from _sliderhtml for each day in the array. Then we put a slider widget in each of those blocks of HTML (line 16 above), passing it the same options array we’re using. It has the extra days array, but the slider widget will ignore that. The rest of the loop is adding event bindings to the three input elements embedded in our HTML.

The first two input elements hold the times. This makes grabbing the values extremely easy, because we have an input element with an ID. We put the values in these fields in the slide callback in the options (line 23 and 24 in the second code block in this post). Here we’re binding an onchange to these elements. That allows a user to click on the times and type in a new value, which will move the slider to what they typed. It might be that nobody will ever do that, but I think it’s a nice feature.

The third event we’re binding is a change event to our checkbox. The string manipulation on lines 29 and 30 is identifying the various elements that make up our layout. On line 31 we see if the checkbox is checked or not, and then depending on that we set styling on the slider.

As a last step, I added another checkbox at the top to disable the whole thing, but that’s not actually in the widget. When you click the checkbox it does this:

function(){
    if($("#turnitoff").is(":checked")){
        $("#scheduler").timeslider("disable");
    } else {
        $("#scheduler").timeslider("enable");
    }
}

This highlights the benefits of using the jQueryUI widget factory for this exercise, if that wasn’t clear already. By using the widget like this, we get the disable and enable methods, and all the other standard widget methods built in.

This is prototype, and I’m not sure yet if it will make it into the production application. If it does (and I remember) I’ll update this to show it used in context. If it gets into production it will likely need some refinement. For example, the checkbox might be better as a button and “Except” is not the right label, and there are a few other things I’d tweak. In the meantime, I hope this is useful for someone.

Categories: javascript, jQuery

Learning and community life cycles

May 19, 2011 Leave a comment

My first round of experience learning a web development environment was in the 90’s, when I needed new ways to deliver financial reports to department managers and built web reporting using ASP pages on Windows NT servers. It was live. It was on demand. You didn’t need to run a special client program to get your reports. It was one of my first experiences of providing people with a tool that they hadn’t realized they needed, but within a few weeks everybody was convinced it was essential to getting their work done.

I jumped into the online communities (basically email back then) and asked and answered questions in various groups. I quickly realized that PHP fit what I was trying to do better, and I moved to the PHP community. I learned a lot from those groups. My personality is to listen more than I talk, both online and off, but in those days I was excited about what I learned and wanted to help others learn it too.

It didn’t take me long to notice that people ask pretty much the same questions over and over. Many groups online are used largely by people who are learning a new technology. That’s a critical function, and its a good way to distribute the learning. Like many people, I got tired of answering the same questions, and started thinking maybe people should at least make an attempt to look at the group’s archives.

Since I first started, we’ve moved through forums, blog posts with arguments in the comments, and then to sites like stackoverflow.com and microblogging. They are all useful, and all of these formats for distributing information and learning still continue to some extent even as they are replaced by newer models.

Its fun to go through the life cycle with each new technology. You learn something new, hit snags, and go looking for where the community is. Its even more fun when the community is new and the best practices are still being fleshed out. The beginnings of the CouchDb and Node.js communities were more recent examples of that for me. I didn’t catch either of those waves right at the beginning, but I was fairly early. True to my personality, I tend to listen more than talk, but I enjoy the discussions of sorting out the best way to do things.

The great thing about the early days is that the questions aren’t old for most anybody. Not that long ago the package manager for Node wasn’t settled, and there were discussions about the best way to handle that need. Now npm is the standard, and the question is fairly well settled. People still raise the question of whether it should work differently, but most of the questions have been answered. At least for the moment, npm’s place in the Node.js world is fairly settled. In this case, it happened fairly quickly.

Each generation of developers, of course, reinvents it all. Before long you start to see the same discussions popping up in every community. I’m amazed by people who continue to patiently answer the same question again and again. I don’t have the patience for it. I am much more tolerant of helping people in one on one, mentoring new developers in my team, especially when the new guy is really learning.

In Node, people get caught by async programming, and look for ways to make it work differently. Maybe at some point people will come up with a new paradigm, but Node is async and uses callbacks. If you don’t like that maybe you’re on the wrong platform. In Couchdb people get caught on how to construct views, and whether you can make views dependent on external conditions like the contents of other documents. Someone explains again why it doesn’t work like that. Some discussions never die, like when people find new ways to do things insecurely or rediscover familiar patterns. I’ve recently seen some discussions about dependency injection in PHP that seem rather worn to me.

The communities as a whole go through the same cycles too. People from the Erlang community might notice that many of the discussions and debates in the Node.js community are struggling with issues Erlang had to solve during its earlier stages. Languages evolve and have to struggle with “new” issues that are familiar to other platforms, like when PHP adds new pieces of the OOP puzzle and new debates about typing and inheritence in Java erupt in new contexts.

The world in general owes a great deal to all of the people who work through these cycles, and especially the role filled by people who seem to tirelessly answer the newbie questions. We’re all starting somewhere, and usually we find ourselves learning new things fairly often as the paradigms change. I’ve been through a few generations of PHP and MySQL, but I’m newer to CouchDb and Redis. The cycles repeat, but not exactly. Every once in a while something truly new emerges and things progress. That’s a side effect of the same challenges being faced again and again. Every once in a while, someone asking the same question the billionth time does come up with a novel answer. Then we wonder how we ever got by with the answers we all accepted before. Other times people give a name to something people have been doing for a while. Things move on. Technologies and people come and go. The tools we have now are worlds beyond what we had before. What’s next? Who knows, but there will be people learning it, and asking questions, and someone will be tirelessly answering the new questions for the billionth time. To the people who ask the same questions until the answers finally do change, and to all those people who answer the same things again and again: Thanks!

Categories: development

Async doesn’t have race conditions

April 15, 2011 Leave a comment

I’ve been noticing comments about race conditions in async code, most specifically javascript used server side as in node.js. As someone who has written more lines of blocking code than I care to think about, I can relate to the angst of people trying to make the mind bending transition to thinking asynchronously. I think that just the name “race condition” is an artifact of the age of blocking code. There are no race conditions in the async world. The term is about bugs in blocking code. In async code there are no races, because it isn’t about getting done in order, its about getting done when you’re done. Thinking in terms of avoiding race conditions isn’t about bugs, its thinking in the wrong paradigm. There is no spoon.

Categories: javascript

Testing variable types in Javascript

March 5, 2011 1 comment

Anyone who has written more than a handful of lines of Javascript has found a need to find out the type of a variable. A quick search on their favorite search engine or in their reference book turns up typeof, which indeed tells you the type of a variable. It works great, except if you’re looking for arrays, null, or NaN.

The typical way that typeof is used is in tests:

var my_string = "llamas";
var my_number = 42;

if( typeof my_string === "string" ){ 
   // do something with my string
}

if( typeof my_number === "string" ){ 
   // do something with my number
}

This type of test works well for strings, numbers, functions, and undefined. You most often see it used for the first three of these. Most people check for undefined more directly:

if( something === undefined){
   // treat it like it isn't defined
}

This is particularly useful if you are wanting to know whether an argument for your function was included when the function was called.

var myfunction = function(foo){
    if(foo === undefined) foo = "bar";
    // now we can do something with it, knowing that it is defined.
}

So, what do you do if you have a variable that might be an array or an object, and you need to know which it is? Using typeof doesn’t help, because it tells you that it’s an object either way. It is an object either way, actually, but that’s not very helpful.

The easiest test between a non-array object and an array is by checking for length. Arrays have a length, non-array objects do not. So if typeof says its an object, and its length !== undefined, then it is an array.

NaN is a special confusing case. NaN is a value representing “Not a Number,” except that of course typeof says that NaN is a number. It is useful for working with messed up math and bad dates, both of which are best detected as quickly as possible. NaN is returned by several Javascript methods such as parseFloat and parseINt. NaN is never equal to any number, including itself. So to test for NaN you use the aptly named isNaN(testValue), which returns true if testValue is indeed not a number.

Putting all this in a simple function that you can use to find the type of something is fairly easy:

function(thing){
  return (thing === null) ? "null" : 
    (typeof thing == "object" && thing.length !== undefined) ? "array" : 
    (typeof thing == "number" && isNaN(thing)) ? "NaN" :
    typeof thing;
}

If there’s a law against abusing the ternary operator anywhere, that is sure to break it.

One other thing worth noting about testing variable types is testing booleans. Using typeof we get “boolean” as the response if the operand is actually (===) true or false. So the following variables are all boolean, according to typeof:

  var yes = true;
  var no = false;
  var another = ( 1 == 2 );
  var tistrue  = ( yes !== undefined );

However NaN, null and undefined, even though they test as false, aren’t boolean according to typeof. All of these test as false:

( null )
( undefined )
( false )
( NaN )
( 1 == 2 )

But only one of them (false) has a typeof Boolean.

Categories: javascript