Monday, December 17, 2012

getters, setters, and redefining javascript

Given enough time, I'm sure every browser will support proper JavaScript getters and setters. Whether getters and setters are truly good is another matter. Let's assume they're good. And let's pretend we want to use them ... right now.

I'd posit there's a potential solution. And while I can't offer a full or fool-proof solution at this time, an intriguing thought struck me. That is, getters and setters are effectively nothing more than a compiler rewriting code that looks like this:

console.log(obj.property);
console.log(obj['property']);

Into something that looks more like this:

console.log(obj._get("property"));
console.log(obj._get("property"));


I'm not entirely sure why we feel better treating obj.property like any other property. Why don't we want to explicitly acknowledge the "work" that must be done each time we hit it? Perhaps it affords us some debatably necessary ignorance. Perhaps it allows for greater laziness. Perhaps we think the thinking or the characters we save by using the property notation all the time, rather than only when we're referring to actual properties, significantly outweighs the performance cost of preprocessing our code.

... awkward silence ...

Never mind. Back to pretending getters and setters are unequivocally good ...

I've seen a lot of folks, including myself, trying to hack getters and setters (and even "dynamic" getters and setters) into JavaScript without fully recognizing that JavaScript is a dynamic language. It's compiled at runtime. And segments of every script can even be compiled mid-stream using fancy things like eval(). (Contrary to the scary stories you may have heard, eval() is not evil unless it is abused. And in my opinion, this is precisely the sort of rare case in which eval() is an acceptable solution.)
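To make the "compiled mid-stream" idea concrete, here is a minimal sketch of the round trip the rest of this post relies on: read a function's source with toString(), edit that source as a plain string, and compile the edited text with eval(). The names greet and shout are made up for illustration.

```javascript
// A throwaway function whose source we will edit before running it.
var greet = function () { return "hello"; };

// Function.prototype.toString() gives us the function's source text.
var source = greet.toString();

// Edit the source as an ordinary string, then compile the result.
// The parentheses make eval() treat the text as a function expression.
var shout = eval("(" + source.replace('"hello"', '"HELLO"') + ")");

console.log(greet());  // hello
console.log(shout());  // HELLO
```

The original function is left untouched; eval() hands back a second, separately compiled function.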

So, without spending a great deal of time trying to produce a perfect solution, let's just prove the concept with some basic getters and setters using some very simple self-improving script.

When an object property is referred to, we want our self-aware script to look for mechanisms to interact with that property, going from most specific to least specific. That is, when property obj.prop is referred to, we want to use the first available mechanism on our short, ordered list of possible mechanisms.

  1. obj.prop.(get|set(v)) - the property-specific getter or setter
  2. obj.prop - the property itself
  3. obj.(get(n)|set(n,v)) - the class-level getter or setter
  4. undefined - though not a getter or setter, explicitly acknowledge this as the default

Implemented as getter/setter wrappers, our logic looks like this:

Object.prototype._get = function(n) {
  // first, infinite recursion prevention
  this.__gotten = this.__gotten || {};
  if (this.__gotten[n]) {
    return this[n];  // even if it's undefined.
  } else {
    this.__gotten[n] = true;
  }

  // then, call getters
  var rv;
  if (this[n] == undefined) {
    if (this['get'] && typeof(this.get) == 'function') {
      rv = this.get(n);
    } else {
      rv = undefined;
    }
  } else {
    if (this[n]['get'] && typeof(this[n].get) == 'function') {
      rv = this[n].get(n);
    } else {
      rv = this[n];
    }
  }

  // unset the "gotten" flag so the getter works properly in
  // subsequent calls
  this.__gotten[n] = false;
  return rv;
}

Object.prototype._set = function(n, v) {

  // first, infinite recursion prevention
  this.__setted = this.__setted || {};
  if (this.__setted[n]) {
    return (this[n] = v);  // even if it's undefined.
  } else {
    this.__setted[n] = true;
  }

  // then, call setters
  var rv;
  if (this[n] == undefined) {
    if (this['set'] && typeof(this.set) == 'function') {
      rv = this.set(n, v);
    } else {
      rv = (this[n] = v);  // no setter anywhere: fall back to plain assignment
    }
  } else {
    if (this[n]['set'] && typeof(this[n].set) == 'function') {
      rv = this[n].set(n, v);
    } else {
      rv = (this[n] = v);
    }
  }

  // unset the "setted" flag so the setter works properly in
  // subsequent calls
  this.__setted[n] = false;
  return rv;

}
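If patching Object.prototype feels too invasive for a quick experiment, the same ordered lookup can be sketched with two standalone functions. This is only an illustration: getProp, setProp, and demo are hypothetical names, there is no recursion guard, and setProp falls back to a plain assignment when no setter exists (an assumption of this sketch).

```javascript
// Hypothetical standalone versions of the lookup order described above.
// They do not modify Object.prototype and have no recursion guard.
function getProp(obj, n) {
  var own = obj[n];
  if (own != undefined) {
    // 1. property-specific getter, else 2. the property itself
    return (typeof own.get == 'function') ? own.get(n) : own;
  }
  // 3. class-level getter, else 4. undefined
  return (typeof obj.get == 'function') ? obj.get(n) : undefined;
}

function setProp(obj, n, v) {
  var own = obj[n];
  if (own != undefined && typeof own.set == 'function') {
    return own.set(n, v);        // 1. property-specific setter
  }
  if (own == undefined && typeof obj.set == 'function') {
    return obj.set(n, v);        // 3. class-level setter
  }
  return (obj[n] = v);           // assumption: otherwise assign directly
}

var demo = {
  a: 1,
  b: { get: function() { return 2; }, set: function() { /* read-only */ } },
  get: function(n) { this[n] = 0; return this[n]; },
  set: function(n, v) { this[n] = v % 2; }
};

var seen = [getProp(demo, 'a'), getProp(demo, 'b'), getProp(demo, 'c')];
setProp(demo, 'd', 123);
seen.push(getProp(demo, 'd'));
console.log(seen.join(', '));  // 1, 2, 0, 1
```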

With or without any fancy rewriting, we can then build getter/setter endowed objects like this:

var o = {
  a: 1,
  b: {
    get: function() { return 2; },
    set: function() { /* do nothing */ }
  },
  get: function(n) {
    if (this[n] == undefined) {
      this[n] = 0;
      return this[n];
    }
  },
  set: function(n, v) {
    this[n] = v % 2;
  }
};

And we can interact with them like this:

console.log(o._get('a'));
console.log(o._get('b'));
console.log(o._get('c'));
o._set('d', 123);
console.log(o._get('d'));

And, we'll see precisely what we expect to see in our console. But, our goal is to see the same output by writing this:

console.log(o.a);
console.log(o.b);
console.log(o.c);
o.d = 123;
console.log(o.d);


Well alright. So, let's just have our script rewrite itself a little before it executes. And the best way to do that, I'd argue, is to wrap our code in a function that makes some minor edits and eval()'s the result. So, here's my first simple working implementation:

var F = function(f) {

  var _f = f.toString();
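  // note: the "+" sits inside each capture group so that property names
  // longer than one character are captured whole
  _f = _f.replace(/\.([a-zA-Z0-9_]+)\s*=\s*([^=].*)\s*;/gm, "._set('$1',$2);");
  _f = _f.replace(/\[(['"a-zA-Z0-9_]+)\]\s*=\s*([^=].*)\s*;/gm, "._set($1, $2);");
  _f = _f.replace(/\.([a-zA-Z0-9_]+)(\.|\s|;|\n|\))/g, "._get('$1')$2");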
  _f = _f.replace(/\[(['"a-zA-Z0-9_]+)\]/g, "._get($1)");

  eval("var rv = " + _f + "");
  rv._original = f; // for debugging, curiosity, etc.
  return rv;
} // F()

Using our modifications to the Object prototype, and our fancy F() function, our getter/setter endowed object works as intended like this:


F(function() {
  var o = {
    a: 1,
    b: {
      get: function() { return 2; },
      set: function() { /* not allowed */ }
    },
    get: function(n) {
      if (this[n] == undefined) {
        this[n] = "no.";
        return this[n];
      }
    },

    set: function(n, v) {
      this[n] = v % 2;
    }

  };


  console.log(o.a);
  console.log(o.b);
  console.log(o.c);
  o.d = 123;
  console.log(o.d);
})();


As expected, our console shows:

1
2
NaN
1

Remember, when pondering line 3 of our output, that our dynamic getter is returning the result of an assignment, which works through our dynamic setter, which expects the assigned value to be a number.

Also bear in mind, this little example is fairly simple, and our F() can tend to mangle more complex syntaxes. Our regular expressions are simple and inflexible. For instance, we can throw it off by hiding an assignment in a conditional or by simply using the increment/decrement operators.

Consider this awkwardly written loop.

var app = F(function() {
  var o = {i:0};
  var rand = false;
  while (o.i = rand) {
    console.log(o);
    rand = Math.random();
  }
  o.i--;
  o.i += 5;
});

Now, we wouldn't expect to see such a silly loop in a real application. But, we should expect it to work nonetheless. And it doesn't. The F() rewritten function subverts some getter/setters and replaces others with invalid code. The rewritten, broken function looks like this:

function () {
  var o = {i:0};
  var rand = false;
  while (o._get('i') = rand) {
    console.log(o);
    rand = Math.random();
  }
  o.i--;
  o._get('i') += 5;
}
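One way the += case might be patched is an extra pass, run before the other replacements, that expands a compound assignment into an explicit get followed by a set. The sketch below is an assumption about how such a pass could look, not part of F() as written. rewriteCompound is a hypothetical name, and the pattern only copes with a bare identifier to the left of the dot (a chain like a.b.c is not handled correctly) and an expression containing no semicolons.

```javascript
// Hypothetical extra pass: turn "x.p += expr;" into
// "x._set('p', x._get('p') + (expr));" for the operators + - * / %.
function rewriteCompound(src) {
  return src.replace(
    /([a-zA-Z_$][a-zA-Z0-9_$]*)\.([a-zA-Z0-9_]+)\s*([-+*\/%])=\s*([^=;][^;]*);/g,
    "$1._set('$2', $1._get('$2') $3 ($4));"
  );
}

console.log(rewriteCompound("o.i += 5;"));
// o._set('i', o._get('i') + (5));
```

The increment and decrement operators would need a similar pass of their own.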

Multi-line assignments break too. This won't work:

var app = F(function() {
  var o = {};
  o.i = "This"
    + " assignment spans"
    + " multiple lines.";
});
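A possible tweak for the multi-line case, offered as a sketch and not as the fix: let the right-hand side match anything up to the next semicolon, newlines included, instead of using the dot, which stops at a line break. rewriteAssignments is a hypothetical name, and the pattern will still be fooled by a semicolon inside a string literal.

```javascript
// Hypothetical variant of the first replacement in F(): [^;]* spans
// newlines, so the assigned expression may continue over several lines.
function rewriteAssignments(src) {
  return src.replace(
    /\.([a-zA-Z0-9_]+)\s*=\s*([^=;][^;]*);/g,
    "._set('$1', $2);"
  );
}

var before = 'o.i = "This"\n  + " spans lines";';
console.log(rewriteAssignments(before));
```

The whole two-line expression ends up as the second argument to _set(), while comparisons such as o.i == 1 are left alone because the character after the first = may not be another =.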

So, F(), as I have defined it above, is limited. But, it illustrates the concept: JavaScript is a dynamic, compile-as-needed programming language. If you want getters, setters, operator overloading, a fancy short-hand syntax, or anything else, implement it!

Please feel free to suggest improvements. My latest getter-setter rewriter will be available here, and will include any suggestions I find significant and valuable. http://www.thepointless.com/js/accessors.js
