Monday 28 May 2012

Interesting stuff from these last months

It's been a long while since my last post of the "last month's interesting stuff" kind, but I have several interesting things from the last few months that I'd like to "serialize" and that I haven't been able to fit into any other entry, so here goes a new mash-up.

  • Good question in Stack Overflow about "Utility methods"
  • Astonishing article comparing C# closures with Java pseudo closures http://csharpindepth.com/Articles/Chapter5/Closures.aspx
  • Excellent write-up about the usage of GPUs for general computation
  • .Net IL (bytecode) has two instructions for calling methods: call and callvirt. Contrary to what you might expect, callvirt is used instead of call for most non-virtual method invocations (among other things, callvirt performs a null check on the receiver). Yes, it sounds confusing, but there's a good explanation here. Slightly related to that, you can read here some thoughts about the overuse of virtual
  • I found by chance this excellent answer to a question I had asked myself many years ago, but to which I never devoted any thinking or even googling: why can't a C# struct be inherited?

    Technically, being a "value type" means that the entire struct - all of it's contents - are (usually) stored wherever you have a variable or member of that type. As a local variable or function parameter, that means on the stack. For member variables, that means stored entirely as part of the object. That means that if you allowed structs to have subtypes with more members, anything storing that struct type would take up a variable amount of memory based on which subtype it ended up containing, which would be an allocation nightmare. An object of a given class would no longer have a constant, known size at compile time and the same would be true for stack frames of any method call. This does not happen for objects, which have storage allocated on the heap and instead have constant-sized references to that storage on the stack or inside other objects.

Saturday 26 May 2012

Closure in a Loop Revisited (C# 5.0)

I've come across something today that has prompted me to write a quick post to update this previous one.

To my surprise, I read here that the way closures capture loop variables is being changed in C# 5.0. I felt quite shocked, as the way it works now is the correct way. I don't mind how many developers find it confusing (of course I found it terribly confusing myself before I fully grasped the idea). Programming is not an easy art, and it demands practice, effort and devotion from its artists (us). So if you're using closures you should know that closures trap variables, not values, meaning that if your loop updates the variable, the update is also seen by the closure(s) that have trapped it.

Seeking to learn more about this change, I found this post by Eric Lippert himself:

In C# 5, the loop variable of a foreach will be logically inside the loop, and therefore closures will close over a fresh copy of the variable each time. The "for" loop will not be changed. We return you now to our original article.

OK, good: it's changing only for the foreach loop, not for for loops. I still find it unnecessary, but I have to admit that the arguments given by Eric make good sense. However, this probably adds even more confusion to the issue:

  • On the one hand, people who knew how closures and loops worked before will learn about this new "feature" and will be aware that things work differently between for and foreach, so no problem for them (us)
  • On the other hand, people who were confused about closures and loops before probably won't learn about this new feature, and they can suddenly and painfully hit a case where replacing a foreach with a for changes the behaviour of their code
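The same distinction can be sketched in JavaScript, where a for loop with var and a forEach callback behave like the old and the new C# semantics respectively. This is only an analogue (the variable names are made up for illustration), not C# code:

```javascript
// Each closure captures the variable `i` itself, not the value it had
// when the closure was created.
var sharedClosures = [];
for (var i = 0; i < 3; i++) {
  sharedClosures.push(function () { return i; });
}
// All three closures share the single `i`, which is 3 once the loop ends.
var sharedResults = sharedClosures.map(function (f) { return f(); });

// With forEach the callback parameter is a fresh variable on every
// iteration, which mirrors what C# 5 does for foreach.
var freshClosures = [];
[0, 1, 2].forEach(function (item) {
  freshClosures.push(function () { return item; });
});
var freshResults = freshClosures.map(function (f) { return f(); });

console.log(sharedResults); // [3, 3, 3]
console.log(freshResults);  // [0, 1, 2]
```

Swapping one loop form for the other silently changes what the closures see, which is exactly the kind of surprise described above.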

Update 2012/05/29 I've just found this complementary question

Wednesday 23 May 2012

Command Query Separation and Method Chaining

There are many Programming Principles that may seem simple when you first read about them, but when it's time to put them into practice it's a different story. Some principles apparently clash with others, and others are just plain ridiculous when taken to the extreme (Single Responsibility can end up in a class explosion anti-pattern, with too many classes having maybe just one method). Some only apply to certain cases, and don't make sense if used as a rule of thumb (the Law of Demeter is still confusing to me in many cases).

One case that has had me thinking on several occasions is Command Query Separation vs Method Chaining (fluent interfaces). In principle it would seem that Method Chaining violates the Command Query Separation principle, as you can read here. Yes, you have commands (methods that perform an action and change state) that are also queries (they return values). The thing is, is this really a problem? I think the main idea underlying this principle (at least the main idea in the Wikipedia entry) is: asking a question should not change the answer. Yes, that's also the main point for me: querying data should not modify state. That said, I think that a command returning state as "an extra" should not be a problem. For example, let's think about this in terms of DOM manipulation:

$("#miDiv").append(htmlNodes).addClass("updated");

It's clear that append has been conceived there as a command, not as a query that has the side effect of modifying state. You don't use append to obtain the same element again (that would be absurd); you use it to modify the element, and as an extra it returns the element to facilitate writing cleaner code. So, to me this is not really a violation of the CQS principle.

On the other hand, I pretty much agree with what Fowler says: "I prefer to follow this principle when I can, but I'm prepared to break it to get my pop." It's interesting to note his mention of the Stack class; it also caught my attention that Java iterators break the principle (next advances and returns the new value). On the contrary, .Net enumerators don't break the principle: we've got a MoveNext "command" method and a Current "query" property.
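To make the contrast concrete, here is a minimal JavaScript sketch of both styles. The two factory functions are hypothetical stand-ins written for this post, not the real Java or .Net types:

```javascript
// Java-style iterator: next() is both a command (it advances)
// and a query (it returns the element).
function javaStyleIterator(items) {
  var index = 0;
  return {
    hasNext: function () { return index < items.length; },
    next: function () { return items[index++]; }
  };
}

// .Net-style enumerator: moveNext() is the command, current() is the query.
// (moveNext still returns a boolean success flag, so even here the
// separation is not absolute.)
function dotNetStyleEnumerator(items) {
  var index = -1;
  return {
    moveNext: function () { index++; return index < items.length; },
    current: function () { return items[index]; }
  };
}

var it = javaStyleIterator(["a", "b"]);
var firstCall = it.next();  // "a"
var secondCall = it.next(); // "b": asking again changed the answer

var en = dotNetStyleEnumerator(["a", "b"]);
en.moveNext();
var firstRead = en.current();  // "a"
var secondRead = en.current(); // still "a": asking does not change the answer
```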

We should also bear in mind that there are other cases where this principle doesn't hold simply because the world that we're trying to model does not follow those rules. Let's imagine a "Person" object with a "GetCurrentThoughts" method. Calling this method could modify the internal state of the Person, by increasing his "Tiredness" property...

There's another topic related to these matters that sometimes comes to my mind: the Read Only Property (getter) vs Method discussion. This is because, among the lists of rules for discerning whether something should be one thing or the other, I have sometimes found that properties should not have side effects, that is, retrieving a property should not change the state of the object. Well, that rule is quite useless as a criterion, because according to the CQS principle a method that retrieves a value is a query, and as such should not have side effects either. Regarding this, another interesting point is that properties should not return new objects; you should use a method for that. I also recommend reading this discussion about DateTime.Now being a property instead of a method.

Sunday 20 May 2012

Multimethods

Multiple Dispatch (aka Multimethods) is one of those advanced programming features very rarely supported by programming languages. Long story short, the idea is that the dispatch mechanism for overloaded methods should take into account the runtime type of the arguments (so, extending the normal polymorphism, where only the runtime type of the object on which the method is invoked is used to dispatch to the correct method via the corresponding vTable entry).

We can have multiple dispatch, both in static and dynamic languages, by adding the dispatching mechanism ourselves to each method. So in every method for which we want multiple dispatch we would have to write some type checking code using instanceof (Java), is (C#), or whatever mechanism the language provides, and then we would have some kind of switch from which we would invoke the needed code.
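In JavaScript, that hand-written dispatching might look like the sketch below. Circle, Square and collide are hypothetical names invented for the example:

```javascript
// Hypothetical types, used only for illustration.
function Circle() {}
function Square() {}

// Hand-written multiple dispatch: the function inspects the runtime type
// of both arguments and branches to the matching piece of code.
function collide(a, b) {
  if (a instanceof Circle && b instanceof Circle) {
    return "circle hits circle";
  } else if (a instanceof Circle && b instanceof Square) {
    return "circle hits square";
  } else if (a instanceof Square && b instanceof Circle) {
    return "square hits circle";
  } else if (a instanceof Square && b instanceof Square) {
    return "square hits square";
  }
  throw new Error("no implementation for these argument types");
}

var result = collide(new Circle(), new Square());
console.log(result); // "circle hits square"
```

The number of branches grows with every new type combination, and the same boilerplate has to be repeated in each method that needs it, which is what makes automating it attractive.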

If we move to the JavaScript terrain, where we're so used to closures, functions that generate functions and so on, I thought it would surely be easy to come up with some way to automate this.

Before implementing my own solution, I searched the web to see what others had come up with, and found 2 different projects:

  • le Func It's fine, but different from what I had in mind, it only takes into account primitive types
  • Multimethod.js Wow, this is sheer beauty, much more advanced than what I had in mind, because it lets you dispatch not just based on "types", but also on values. In fact, it's been an eye opener, as I'd never thought of multiple dispatch in terms of values

Let me clarify that I'm very fond of the dynamic nature of JavaScript and I think coding in it should normally be based on duck typing. In fact, I think it's hard to apply the concept of types to a language where an object can be modified (augmented, expanded) so much that it can end up having very little to do with what we initially obtained from the constructor function used for its creation. Still, let's say I'm a bit old fashioned when it comes to dynamic dispatching, and I wanted to implement a classic model, one based on the type of the object, understanding as its type the constructor function used for its creation. So I decided to roll my own implementation, coming up with this:

var multimethod = function(){
 //the dispatcher: invokes the first registered method whose types match the arguments
 var me = function _multimethodImpl(){
  var found = false,
   i = 0;
  while(!found && i < me.keys.length){
   var j = 0,
    misMatch = false;
   while(!misMatch && j < arguments.length){
    var curArgument = arguments[j];
    //null and undefined arguments are not checked, so they match any type
    if(typeof(curArgument) != "undefined" && curArgument != null){
     //compare against the constructor function registered for this position
     if(curArgument["constructor"] != me.keys[i][j]){
      misMatch = true;
     }
    }
    j++;
   }
   found = !misMatch;
   i++;
  }
  if (found){
   //i has already been incremented past the matching entry, hence --i
   return me.methods[--i].apply(this, arguments);
  }
  else{
   throw new Error("function not found");
  }
 };

 me.keys = []; //each entry is an array of constructor functions
 me.methods = [];
 me.add = function(method, paramTypes){
  this.keys.push(paramTypes);
  this.methods.push(method);
  return this; //let's do it fluent
 };
 return me;
};

So the multimethod function returns a function (a closure holding the different functions to be invoked based on the different parameters) that exposes an add method intended for adding the real functions (and types) to which each call should be dispatched. It can be used like this:

var sayHi = multimethod()
 .add(function(){
  console.log("Person is saying Hi");
 }, [Person])
 .add(function(){
  console.log("Employer is saying Hi");
 }, [Employer])
 .add(function(){
  console.log("Employee is saying Hi");
 }, [Employee])
 .add(function(){
  console.log("received Employee and Number");
 }, [Employee, Number])
 .add(function(){
  console.log("received Employee and String");
 }, [Employee, String])
 .add(function(){
  console.log("received Employer and Number");
 }, [Employer, Number])
 .add(function(){
  console.log("received Employer and String");
 }, [Employer, String])
 .add(function(){
  console.log("received Employer and Function");
 }, [Employer, Function]);

Notice that I'm doing the type matching based on the constructor property of the parameters, not on the instanceof operator.
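The practical difference between the two checks can be seen in this small sketch. The Person/Employee hierarchy below is set up just for the example and is an assumption about how such types would be defined:

```javascript
// Hypothetical hierarchy, for illustration only.
function Person() {}
function Employee() {}
Employee.prototype = Object.create(Person.prototype);
// Restore the constructor property; without this line
// employee.constructor would resolve to Person through the prototype chain.
Employee.prototype.constructor = Employee;

var employee = new Employee();

var matchesPersonByConstructor = employee.constructor === Person;     // false
var matchesPersonByInstanceof = employee instanceof Person;           // true
var matchesEmployeeByConstructor = employee.constructor === Employee; // true

// Primitives are the opposite case: the constructor check works,
// while instanceof does not.
var numberByConstructor = (5).constructor === Number; // true
var numberByInstanceof = 5 instanceof Number;         // false
```

So matching on constructor means an exact type match (an Employee is not dispatched to a method registered for Person), and it also lets primitive numbers and strings match the Number and String keys.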

You can find the code with some samples here

Update 2012/05/29. I'd like to complete this entry by saying that the crazy dispatch mechanism in Groovy gives us multimethods out of the box. It's amazing to see how clean it makes patterns like Visitor.

Saturday 19 May 2012

JavaScript digest

Well, in the last few weeks I've been going through a good bunch of new JavaScript stuff that I think well deserves a summary here (I admit this write-up will mainly look like a digest of dailyJS).

  • Global eval. To tell the truth, it's something that I've never needed myself, but when I came across the jQuery.globalEval function in the jQuery documentation I felt the need to check how it was done, and I couldn't make any sense of what was going on in the source code, as it seemed to clash with my previous knowledge. Well, hopefully this explains it well. It's normal that the code didn't make sense to me: all this works thanks to an odd detail in the ECMAScript language specification.
  • JavaScript object literals are so handy, but from time to time I stumble upon the same issue; now at least I've found a good workaround. What happens if one of the properties you're initializing depends on other properties? You can't use this there, because it will point to the this of the enclosing function scope, not to the new object that you're creating. You can't use the property name alone either... but you can use this neat code recommended here:
    var foo = {
       a: 5,
       b: 6,
       init: function() {
           this.c = this.a + this.b;
           delete this.init; //do this to keep the foo object "clean"
           return this;
       }
    }.init();
    
  • Even though I'm a loyal Mozilla fan, I tend to prefer node.js over Rhino. I find it terribly useful for unit testing your browser-independent code or for CLI development; nevertheless, I don't share all the frenzy about its usage for server-side stuff. Maybe I'm old fashioned, but I still have not got to grips with the idea of non-blocking IO and just one thread. Yes, I know about the C10k problem and so on... but I grew up (as a programmer) in a time when Threads were cool! Anyway, reading this online book about node.js is really interesting. It gives you the basis for building your own web framework, which is a really mind-stretching exercise and makes you better understand existing frameworks. The author also has a very clever article about Object.create and a classless society :-)
  • It may seem a bit odd that, when we still can't use ECMAScript 5 with full confidence due to the lack of implementation in some browsers (awful IE 6-7-8...), we start to talk about the next version. The thing is that this excellent video on InfoQ got me rather excited about all the neat features they plan to add (many of them already seem approved) to ES.Next, aka ECMAScript 6. I love the addition of Array Comprehensions, Generators, destructuring assignment and proxies, but I admit that I'm not very happy with the addition of syntax for classes; on the contrary, I think they should try to promote thinking in prototypes instead of thinking in classes. Anyway, as I understand that under the covers everything will still be based on prototype chains, I just see it as something to avoid. Then I found this post encouraging developers to try to take part in the process.
  • These excellent Front End Development Guidelines also discuss HTML and CSS, but anyway I think this entry is a good fit for them.

Thursday 17 May 2012

Bruce Eckel, Actors and more

As a programming languages freak I'm a big advocate of Polyglot programming, so this presentation on InfoQ immediately caught my eye. I was greatly pleased when I found out that it was given by Bruce Eckel. It's been a very long while since I read his classic Thinking in Java, and even part of Thinking in C++ (even someone with such distaste for C++ as me has to acknowledge that it's an excellent book), so it's good to see what Bruce's mind has been working on lately; it can surely offer direction :-) Based on this presentation it seems like he's been much into Python and Scala.

I have to painfully admit that in the last few years I've been quite disconnected from Python. On the one hand, JavaScript and Groovy (and even C#) completely fulfill my need for freak-advanced-cute features, and on the other hand I think I never fully bought the syntax (each day I'm less willing to move away from "C syntax").

As for Scala, it's one of the many things in my list of cool stuff to play with (even though the geek in me is much more of a Groovy type), along with another topic mentioned by Bruce, the Actor model. I've got zero knowledge about "Actor-oriented concurrency", but I remember having first heard about this paradigm some years ago through a Microsoft Research project named Axum. It seems that this .Net language with built-in Actors is now discontinued, but some of its concepts have made it into the .Net framework itself or into some other projects. Well, too many things I want to learn and too little time...

Sunday 13 May 2012

Access to Closure Variables 3

Someone asked me the other day about getting access to a variable captured by a JavaScript closure. I answered him based on this previous post. Then a different way to get something similar occurred to me, by modifying the way we define our closure. Instead of capturing the variables, we'll be trapping the function itself, and adding those variables to the function. Well, I'd better show the code:

//function with a counter keeping track of its invocations
var countableFunction = (function(){
 var me = function (){
  me.counter++;
  //do whatever...
  console.log("called, " + me.counter);

 };
 me.counter = 0;
 return me;
})();


countableFunction();

countableFunction();

console.log("countableFunction has been invoked: " + countableFunction.counter + " times");

So in the end, we have a closure (a function with state) that exposes its state to the outer world; only the way we attach that state when creating the closure is slightly different.

Tuesday 8 May 2012

Mixins and Traits

Mixins (I understand them as "composable units of behaviour") are terribly useful and, as with many other features, once one gets used to them in languages like JavaScript or Groovy, one would love to see them in other languages like C# or Java. Fortunately, the advent of Extension Methods in C# enriched the language to the extent of allowing a form of Mixins.

The problem with extension methods is that, as they are nothing more than a compiler trick based on static methods, they don't seem to provide state. Well, I'd never thought much about it, but for some reason I began to ponder it and came up with a rather simple solution. We could add a static Dictionary<TKey, TValue> to the static class containing our extension methods, and we could use that Dictionary as a registry where we associate state with each object used with our Mixins. Simple, right? I decided to have a look at what solutions other people have devised to get Mixins with state in C#, and I found this excellent article. It's interesting to see that the guy there came up with just the same solution as me, but with a subtle yet fundamental difference: he's not using a Dictionary, but a ConditionalWeakTable<TKey, TValue>

In the .NET Framework 4.0, the ConditionalWeakTable class can be used to associate arbitrary state to any instance. It's thread safe and it keeps a dictionary of weak references from the target instance to its associated state. If a target instance has no references outside the conditional weak table, then it can be reclaimed by the garbage collector. When that happens, its entry in the table is removed.

So yes, after reading the above, I realized that using a Dictionary would be quite a bad idea, as it would keep a reference to the involved objects forever, preventing them from being garbage collected.
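JavaScript has a comparable building block in WeakMap, whose entries do not keep their keys alive. The sketch below uses it to give state to an extension-method-style mixin; it is only an analogue of the C# approach, and Countable and counters are names invented for the example:

```javascript
// State is kept outside the target objects, keyed weakly by each object,
// so an entry does not prevent its key from being garbage collected.
var counters = new WeakMap();

// A "mixin" in the extension-method style: plain functions that take
// the target object as their first argument.
var Countable = {
  increment: function (target) {
    var current = counters.get(target) || 0;
    counters.set(target, current + 1);
    return current + 1;
  },
  count: function (target) {
    return counters.get(target) || 0;
  }
};

var first = {};
var second = {};
Countable.increment(first);
Countable.increment(first);
Countable.increment(second);

var firstCount = Countable.count(first);   // 2
var secondCount = Countable.count(second); // 1
```

Replacing the WeakMap with a regular Map would work the same way for this snippet, but every object ever passed to increment would stay reachable through the map, which is the problem described above for Dictionary.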

It's odd to see how, in a time like this, when endless programming discussions about almost philosophical matters like MVC vs MVP vs MVVM fill the net, we find rather common confusion or plain ignorance about some important Computer Science concepts, like Partial application vs Currying (I already talked about this here) or Mixins vs Traits.

Indeed, I hadn't properly understood the difference between Mixins and Traits until this week, when I came across this excellent article. I used to think the difference lay in whether or not they have state. I was quite wrong: both Mixins and Traits can have state, and the differences stem from how clashes between methods are resolved when several mixins or traits are added to a class. With Mixins it's the compiler (or the runtime, think of JavaScript and object augmentation) that decides which of the clashing methods is selected, while with Traits it's the user doing the composition who has to take that decision at the time of adding the Trait.
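A rough JavaScript sketch of the two conflict policies follows. Both helper functions (mixin and composeTraits) are hypothetical, written only to illustrate the distinction, and do not correspond to any particular library:

```javascript
// Mixin-style composition: on a name clash the last source silently wins.
function mixin(target) {
  for (var i = 1; i < arguments.length; i++) {
    var source = arguments[i];
    Object.keys(source).forEach(function (name) {
      target[name] = source[name];
    });
  }
  return target;
}

// Trait-style composition: a clash is an error unless the caller resolves
// it by naming, per method, the source that should win.
function composeTraits(target, sources, resolutions) {
  resolutions = resolutions || {};
  var providers = Object.create(null); // method name -> sources defining it
  sources.forEach(function (source) {
    Object.keys(source).forEach(function (name) {
      (providers[name] = providers[name] || []).push(source);
    });
  });
  Object.keys(providers).forEach(function (name) {
    var candidates = providers[name];
    var resolved = Object.prototype.hasOwnProperty.call(resolutions, name);
    if (candidates.length > 1 && !resolved) {
      throw new Error("conflicting method: " + name);
    }
    var chosen = resolved ? resolutions[name] : candidates[0];
    target[name] = chosen[name];
  });
  return target;
}

var Walker = { move: function () { return "walking"; } };
var Swimmer = { move: function () { return "swimming"; } };

var duck = mixin({}, Walker, Swimmer);
var duckMove = duck.move(); // "swimming": the last mixin won

var conflictDetected = false;
try {
  composeTraits({}, [Walker, Swimmer]);
} catch (e) {
  conflictDetected = true; // the unresolved clash is rejected
}

var frog = composeTraits({}, [Walker, Swimmer], { move: Walker });
var frogMove = frog.move(); // "walking": the composer decided
```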

With Extension Methods in C# we have a bit of both worlds. If an object already has a method, and we define one with the same name through an extension method for that type of object, the second one will never be called; it's something that the compiler itself decides and there's nothing we can do about it. Nevertheless, if we have 2 extension methods with the same name that can be applied to an object, we'll get a compiler error (the call is ambiguous between the following...), and using a cast we can decide which method is applied in that specific case. You can see a sample here, and it's answered/explained here.

Monday 7 May 2012

MVC, MVP, MVVM...

One of the few things everyone should agree on regarding MVC is that it can mean many different things. An additional problem is that, depending on where we're trying to implement it (purely server side, purely client side or mixed), some ideas cannot be applied. For example, a client-side Web controller cannot subscribe to a server-side Model (well, unless we're using Comet, WebSockets or the like).
If we add related patterns to the mix, like MVP and MVVM, the confusion and the war of concepts grow bigger and bigger. I'd like to think of this entry as an open one where maybe some day I'll publish some personal conclusions (if I ever have any that are worth sharing...), but for the moment I'll have to make do with pasting below links to some of the best articles that I've read about this lately.

So far I will say that I don't fully understand all the hype around MVVM JavaScript frameworks. I'm mainly doubtful about the data binding between the VM and the View. On the one hand, most samples I've seen so far are rather trivial, and on the other hand I feel much more comfortable with an MVP approach, with the Presenter telling the View what to do in a generic way (showEmployees, preventEditions...) that different views could implement in completely different ways (a grid, an HTML list, hide or disable...)
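As a minimal sketch of that MVP idea, assuming a view contract made of just showEmployees and preventEditions (the presenter and both views are hypothetical and keep their "rendering" in plain fields so the snippet runs without a DOM):

```javascript
// Hypothetical presenter: it only knows the generic view contract,
// not how a particular view renders or restricts editing.
function EmployeesPresenter(view, employees) {
  this.view = view;
  this.employees = employees;
}
EmployeesPresenter.prototype.load = function (readOnly) {
  this.view.showEmployees(this.employees);
  if (readOnly) {
    this.view.preventEditions();
  }
};

// A list view: renders names as text lines and disables editing.
function ListView() {
  this.output = "";
  this.editable = true;
}
ListView.prototype.showEmployees = function (employees) {
  this.output = employees.map(function (e) { return "- " + e.name; }).join("\n");
};
ListView.prototype.preventEditions = function () {
  this.editable = false;
};

// A grid view: renders rows and hides its edit control instead.
function GridView() {
  this.rows = [];
  this.hiddenControls = [];
}
GridView.prototype.showEmployees = function (employees) {
  this.rows = employees.map(function (e) { return [e.name]; });
};
GridView.prototype.preventEditions = function () {
  this.hiddenControls.push("editButton");
};

var staff = [{ name: "Ann" }, { name: "Bob" }];
var listView = new ListView();
var gridView = new GridView();
new EmployeesPresenter(listView, staff).load(true);
new EmployeesPresenter(gridView, staff).load(true);
```

The same presenter drives both views unchanged; each view decides for itself what "show" and "prevent editions" mean.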