Wednesday, October 26, 2011

Flow Sensitive Typing?

While we (Guillaume, Cedric and myself) had a meeting in Paris, we talked a bit about the type system of Grumpy.

Coming from a dynamic language, going static often feels quite limiting. For me, the main point of a static type system is to ensure that the code I just wrote is not totally stupid. Many would say the purpose is to ensure the correctness of the code, but maybe I am more of a dynamic guy, because I think that is impossible for a poor compiler to achieve. So a static compiler usually checks

  • method calls, to ensure the method I want to call really exists
  • fields/properties, to ensure they exist
  • assignments, to ensure right-hand side and left-hand side have compatible types
  • type usage, to ensure it complies with the type system (including generics, class declarations and so on)
  • and others...

So usually, if a compiler detects a violation, it will cause a compilation error, and if it cannot check something, the code will probably include runtime checks.

Optional Typing

Now Groovy has what we call optional typing. In Groovy the compiler won't check that fields, properties or methods exist, since the compiler cannot be totally sure that we really mean an entity that exists at compile time. Groovy allows you to create/remove/add/replace methods at runtime and attach them to classes via their MetaClass. A program that would not compile statically may run just fine with those additions. What the Groovy compiler does do, though, is check type usage to some extent. So a class cannot, for example, extend an interface. The Groovy compiler has to do this because the JVM is also typed to some extent and doesn't directly allow for arbitrary type systems. Sure, there are ways around this, but they always mean reducing the high integration with Java, and we don't want that.
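As a minimal sketch of why a compile-time existence check would be too strict here: the Person class and its greet() method below are made up for illustration. The method is only attached through the MetaClass at runtime, so nothing named greet() exists when the class is compiled.

```groovy
// Hypothetical class, used only for this illustration.
class Person {
    String name
}

// Attach a method at runtime via the MetaClass.
// Looking at Person's declaration alone, greet() does not exist.
Person.metaClass.greet = { -> "Hi, " + delegate.name }

def p = new Person(name: "Bob")
assert p.greet() == "Hi, Bob"
```

A checker that only looks at the declared members of Person would have to reject the call to greet(), even though the program runs fine.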

Another aspect is assignments. If you assign a value to a typed variable in Groovy, the compiler won't ensure this works at runtime, but it will add a cast that may fail. Effectively this means for a typed variable in Groovy: we guarantee that if the assignment works, the value assigned to the variable will be at least of the declared type.

This implies, for example, that if you assign a non-String to a String-typed variable, its toString() method is called. We call this way of casting a Groovy cast, and the rules for it are actually quite complex.
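The two sides of the Groovy cast can be sketched in plain dynamic Groovy as follows. The variable names are arbitrary, and the example covers only two of the many rules: the String conversion described above, and a cast that fails at runtime.

```groovy
import org.codehaus.groovy.runtime.typehandling.GroovyCastException

// Assigning a non-String to a String-typed variable goes through toString().
String s = 42
assert s instanceof String
assert s == "42"

// The cast is inserted by the compiler but only evaluated at runtime,
// so an incompatible value fails when the assignment is executed.
def failed = false
try {
    Object o = new Object()
    Integer n = o
} catch (GroovyCastException e) {
    failed = true
}
assert failed
```

In both cases the guarantee from above holds: if the assignment succeeds, the variable holds a value of at least the declared type.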

Still, there are enough cases we could actually check if we knew the type of the right-hand side. In general we don't know that type, because, for example, the right-hand side is a method call whose result type we cannot be sure of: by the time we actually reach that point in the code, the method might have been replaced.
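A small sketch of that problem, using a hypothetical Greeter class: the method is declared to return a String, but it is replaced through the MetaClass before the call is reached, so the declared return type says nothing reliable about the value that actually arrives.

```groovy
// Hypothetical class, used only for this illustration.
class Greeter {
    String greet() { "hello" }
}

// Replace the method at runtime; the replacement returns a number.
Greeter.metaClass.greet = { -> 42 }

def result = new Greeter().greet()
assert result == 42
assert result instanceof Integer
```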

Strong Typing
If you follow the discussions about typing, you will most probably see very quickly that dynamic and static typing may be more or less defined, but that beyond those two there are often conflicting definitions for other terms. For example, some say that Groovy is not strongly typed, while Java is. In my definition, strong typing means that a value cannot change its type at runtime without a new value being created first. In Java we have this situation, for example, with boxing. You can assign an int to an Object, but not without the value being boxed, and thus a new value being created. In Groovy this is just the same. An int cannot become an Integer or a String just like that. We depend on the type system enforced on us by the JVM, and the JVM is strongly typed... well, that may change in the future, but for now it is strongly typed. In Groovy you can, for example, add methods and thereby change the interface a value provides, but there is no way for a value of a certain class to become the same value with a totally different class without a new value being created and used instead.
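The following sketch illustrates this definition in dynamic Groovy. The method name twice is invented for the example. Converting a value always yields a new value, and adding a method changes what a value responds to, but not its class.

```groovy
int i = 1
Object o = i                 // o refers to a boxed Integer
assert o instanceof Integer

def s = o.toString()         // a new String value; o itself is untouched
assert s instanceof String
assert o instanceof Integer

// Adding a method changes the interface the value provides, not its class.
Integer.metaClass.twice = { -> delegate * 2 }
assert o.twice() == 2
assert o.getClass() == Integer
```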

Flow Sensitive Typing
Flow Sensitive Typing is not unknown in the world. It is normally used, for example, to find the type of a complex expression and then check it against the type actually allowed in an assignment. In Groovy we want to go a slightly different way. Basically, we don't want one fixed static type; instead each assignment can specify a new one.

If you define a variable using "def", then in normal Groovy all assignments to it are allowed. Basically we see "def" as Object in Groovy. But if you want static method checks, you still want something like "def str = "my string"; println str.toUpperCase()" to work. This case can still be solved by type inference alone. But in Groovy you can also do this: "def v = 1; v = v.toString(); println v.toUpperCase()". Even though we start out with an int, we later assign a String to v. If we work only with a simple inferencing system, this will not compile. But since it is of course our goal to make a wide range of Groovy programs available in a statically checked version too, we would like to allow this. And a simple flow analysis can indeed give us the information that the flow type of v changes to String and thus that the toUpperCase() method exists. In other words, this would compile.

Taking into consideration that "def" in Groovy doesn't mean much more than Object, we don't want this to be limited to "def" only. We also want to allow this: "Object v = 1; v = v.toString(); println v.toUpperCase()". Java would not allow this. Sure, you can assign the 1, and you can even call toString() and assign the result to v, but because v is declared as Object, the compiler would start barking at you for the toUpperCase() call. Our thinking is that there is not really a need to limit the type system like this. As in Groovy, we would again give the guarantee that v is at least an Object. But the specific type depends in Groovy on the runtime type, and in Grumpy on the flow type.

Something Grumpy would, for example, still not allow is "int i = new Object()".
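To make the "def" case concrete, here is a hedged sketch. It does not use Grumpy itself; it assumes a Groovy version that ships the @groovy.transform.TypeChecked annotation (2.0 or later, which postdates this post), whose checker tracks the flow type of "def" variables in a comparable way. The method name flowTyped is made up. The second half shows the "Object v" variant in plain dynamic Groovy, where the call is resolved against the runtime type.

```groovy
import groovy.transform.TypeChecked

// Assumption: Groovy 2.0+; this annotation is not the Grumpy prototype.
@TypeChecked
String flowTyped() {
    def v = 1              // flow type of v: int
    v = v.toString()       // flow type of v is now String
    return v.toUpperCase() // accepted, because toUpperCase() exists on String
}

assert flowTyped() == "1"

// Dynamic Groovy: declared as Object, dispatched on the runtime type.
Object w = 1
w = w.toString()
assert w.toUpperCase() == "1"
```

The equivalent Java code would be rejected at the toUpperCase() call, because Java only considers the declared type Object.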

So far, though, this flow sensitive type system has not been approved by the community.


3 comments:

chris said...

some typos in the last paragraph:
- "this will not compiler"
- "println v.toUppderCase()"

this blog post is very interesting and im looking forward to static type checking in groovy.
i wonder if you could give an example where reusing a def-variable with different types is a real world use case? of course you might be able to save a few keystrokes and flow sensitive typing will also make it safer, but why would anyone really use this feature since reuse of variables for different purposes really can be considered a bad practice.

Jochen "blackdrag" Theodorou said...

chris thanks for showing me the typos. I corrected them right away.

As for a real world use case. This is actually interesting, if you think about it. If you take code from Java, you will often find people that say assigning a value to a local variable more than once, should be done only if really needed. With this precondition, you won't find a useful example for this. In Groovy programs I do this often. For example in a bytecode viewer I wrote there is some swing interacting code like this:

def rdef = ["2dlu, pref"]*rowCount
rdef = rdef.join(", ")+":grow"
FormLayout layout = new FormLayout (
"2dlu, pref:grow:left, 4dlu, left:pref, pref:grow",
rdef
)
rdef = []
(rowCount-1).times { rdef << (it+1)*2 }
layout.rowGroups = [rdef] as int[][]


The type of rdef is at one time List and at another time String. The purpose is not really different here. These are just different processing steps, and I am actually only interested in the end result. If I used multiple local variables instead, they would be defined for a much wider scope than I intend them to have. According to some Java styles, I would actually have had to use 4 local variables for this.

But I agree this is part of the discussion we have to do.

chris said...

thanks for your response.

while your example is taken from the real world i dont see the benefit of using different types here since you could just inline the 2nd line into the 2nd argument of FormLayout and dont suffer from more or harder to understand code but sticking to the single List type.

as you might have guessed i come from a more java oriented background and thus have a hard time seeing a real application of this feature, but that might be a more basic discussion than about flow sensitive type checking.

if groovy does not intend to suggest a standard way on how verbose, readable and understandable one's code should be (which it probably shouldn't) how about some compiler flags or annotations or annotation parameters that allow everyone to decide if they want the compiler to mark "type-changed-defs" as errors, warnings or simply accept them with additional flow sensitive typing.