Monday, April 13, 2015

About being a paid OSS Developer for Groovy

Groovy is my child in code, you could say. For over 10 years I have helped the project. I was one of the first guys to be paid to get bugs fixed and work done in Groovy, and did so for about 9 years.

In the beginning it was a nice experience. I mean you do your job, you help the community and work on your hobby at the same time! Ideal, or not? Well, not ideal. It largely depends on whether you keep seeing it as a hobby project and don't really care about its success, or whether you do care. Well, getting into a company with services around Groovy and Grails, namely G2One, and having to fight off trolls all the time made me start caring... and soon frustrated. Sure, the workload was high, but I could cope with that. What frustrated me more was that we couldn't make as much progress on Groovy as we could have with more manpower. And G2One management adding feature after feature without testing or documentation, or at least considering the community, did not help here at all. But at least there was the prospect of having more people on the project in the future. G2One then being bought by SpringSource stirred up that hope even more. The actual handling of the Groovy and Grails team in SpringSource, on the other hand, was different. A few liked us, a few hated us, most ignored us. Well, that would have been fine, but there was also no real plan for continuation. Instead things continued as before - with fewer new features, but more difficult-to-maintain code... And hope died again. Actually not fully, there was not enough time for that, since we then got bought by VMware. If we had been somewhere deep down in the management chain before, now we were even deeper down. And what is worse, now we would have to convince not only SpringSource, but also VMware.

That's where I stopped hoping... till we got wind of an open position that we had a chance of getting someone into. We had to act fast, pressure management as best we could, and finally, in 2013, got Cedric on board. The Groovy team of paid developers actually got bigger for once. You have no idea how happy I was about that. Not only did we get another guy working on core, but a really capable one too! It still took years for Cedric to feel at home in most of the codebase. But I also noticed increased usage of Groovy, bugs getting more difficult to fix, and that 2 developers doing coding work fulltime, even with contributors as great as ours, are not enough.

Still I stayed a bit happier. I was thinking that maybe in a few years we could get another person working on these things... especially with us now being Pivotal, and them wanting to be so open-source-oriented, and with Spring Boot using Groovy too. Well, turns out Pivotal cares about OSS only if they can use it. From the POV of a businessman I can understand this very well. But that involves seeing people only as cost factors and not as human beings. Makes me really wonder about their other projects and what happens if their attention shifts. And I now remember so well my discussion with a nice guy at W-JAX in Munich, Germany in November 2014. He said there is one very interesting thing about American companies: if something doesn't contribute to their business anymore, they cut it. That may or may not apply to "American companies", but it seems to apply to Pivotal. Ok, Pivotal tried to help and be nice with the transition, but I think that is more a lesson learned from the Vert.x disaster in VMware times - and some more sane people trying not to make the company look too bad. As an OSS member Pivotal has already failed for me. And if Pivotal management does not change their thinking, then they will stay a failure. I expect several more OSS-related problems with Pivotal in the future.

Anyway... the result is that all of the paid Groovy guys... Cedric (developer), Guillaume (manager and evangelist) and me (developer) had to leave Pivotal. This means that even if I found someone paying me to spend 80-100% of my time on Groovy, I would still be the only one doing so more or less fulltime. Not sure how much Gradle will let Cedric work on Groovy, but I expect not much more than 20% - on good days. This also means I would have to see Groovy as a hobby project again. A project like Groovy really needs at least the equivalent of 3 fulltime developers just to cope with the bugs, and that is without the joy of working on some new ideas.

At this point I am still here being serious about Groovy, but by far not having the resources. I am shifting gears and will make Groovy my hobby again. I probably should never have changed my POV and gotten serious about Groovy. But would I then have joined G2One? Most likely not. I would have had to leave the project long before, working for some company that most probably would not have cared about Groovy at all. Maybe it sounds arrogant, but I am pretty sure that if I had not been a paid developer for Groovy during those first two years of working as a paid OSS developer, even Groovy 1.0 would not have happened. If I had gotten out later, I am pretty sure Groovy would never have become this stable. But times change. The project is pretty usable as it is. There are big things that should be changed in the future, but this will require working time, a big code rewrite, lots of new documentation, convincing people and many, many other things. Unless that is supposed to stretch over a time frame of many years, I do not see how this can be done. But the old codebase has some serious problems in a few places. Architecture problems, often created by me; sometimes there was no other choice back then. But things get increasingly difficult to manage and it would be time for a rewrite. I have already rewritten a lot of code in Groovy in the past. Usually that improved the situation a lot in the long term. But it is an investment in the future. Would I really care about this if I were not serious about the success of Groovy? Maybe not really. If you just help to get some fast success, then surely not. And for doing the boring and difficult-to-fix stuff, most people have to be serious.

Turns out that unless there is a prospect of more fulltime developers, I cannot be serious about Groovy anymore. It would frustrate me too much. Given the chance to work fulltime on Groovy again, would I take it? Actually I am not sure. If the company offering that cares, there should be a prospect of more fulltime workers. So I would probably take the job. But such a company has not appeared. And being paid through donations? Well, I am not convinced we would get enough money together to pay more than one person long term. Even one person is difficult, unless you spend a big portion of your time going around begging companies for donations. I really don't want to beg for my income all the time. If I had my own business and could do Groovy work funded by donations in between customer work, so that I don't really depend on the donations, then it would probably be an option, assuming I can keep customers like that. But that would not be fulltime either. That would be maybe 50% with luck, and with customers I don't have. So no option either. And working fulltime, but not being too serious about things? I think I am not the type for that. Not being serious means to me that I spend my energies somewhere else, or only where I am interested. With family and work I have almost zero free time. Maybe an hour here or there. But nothing you can really do much with. It would mean I would not have an outlet... no, cannot do. I would explode or go dumb.

So for the time being it looks like there is no chance of me working fulltime on Groovy. I dare the community to fill the gaps and show they can do serious stuff in their free time. I will help with guidance through the codebase, but in the end it all depends on the community and their motivation. And maybe, if I see several contributors appearing who do this potentially long term and who can handle me and the community, then I might be able to get serious again... if my help is still welcome then and a company is found to pay for it.

Wednesday, March 04, 2015

Thoughts about the new meta class system MOP 2

Some may remember that MOP2 is the idea of giving Groovy a new meta class system, since the old system has some serious internal problems. The problems are largely related to basic concepts, like seeing the meta class as something that invokes methods. But there are also problems with locking for caching and global structures, and many, many small mistakes in the MOP that have become established. The result is a really complicated MOP that makes life challenging for everyone who wants to use those things.

Some ideas have been born out of how to redesign things from scratch. And over the last two years I was trying things here and there, toying with ideas, preparing for a big jump. But since the day I became aware that Pivotal is pulling the plug on funding Groovy, I have had to realize that a long-term adoption project like the one I had planned for MOP 2 might be difficult to do in the future. Why? Even if working fulltime on Groovy turns out to be no problem in the future, it will be with a new employer, and there you cannot simply show up with a project like that in the beginning.

On the other hand I don't want to continue doing nothing for MOP2 in the next year or two.

So I started to think about a plan to slowly migrate the current MOP into one I find better suited for a modern Groovy. As a result I made a list of some features I had planned for MOP2, how each fits with the current system, and how far I see a possible migration path. By migration path I mean not breaking old code, and having different behavior only if intended. Sounds difficult? Well, yes. It can be done only by not doing some of the things I intended to do.

New package namespace for groovy.lang

Why a new package? Because the idea was to be able to run the old and the new runtime in parallel, and maybe let them communicate. You would have something like a MOP1 compatibility jar that, if on the classpath, allows your old Groovy classes to run normally. While having both runtimes in parallel was thought of as optional, I now think it has to become the regular mechanism and the standard. For me, having a slow migration path here mostly means exchanging the standard meta class.

Remove MOP specific methods from GroovyObject

The MOP specific methods I am talking about here are invokeMethod, getProperty and setProperty as well as getMetaClass. Those methods were used in the early times of Groovy as a way to speed things up and avoid reflective method calls where possible, while at the same time providing extension points for users. I am not so much talking about how bad it is that invokeMethod is usually called as a last-resort kind of method while get/setProperty are usually called upfront, or that the property methods make it really difficult to keep information about the origin of a call, or the amount of code in our runtime that tries to recognize whether it can safely ignore those methods and go directly to the meta class... No, in terms of a migration path I have to say clearly that to keep binary compatibility those methods have to stay. And they have to stay with more or less the same logic as today. But a new GroovyObject class could be made that goes in a different package... Having to create GroovyObject wrappers all the time to satisfy the needs of older APIs won't do though. It is a terrible strain on memory and breaks referential identity. So I guess you would need an annotation to switch between the old and the new GroovyObject.
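To make the "last resort" versus "upfront" difference concrete, here is a minimal sketch against the current MOP. The class names FallbackHook and PropertyHook are made up for illustration; only the invokeMethod and getProperty hooks themselves come from GroovyObject.

```groovy
// Hypothetical class: invokeMethod acts as a last-resort hook.
class FallbackHook {
    String greet() { 'hello' }

    // Only reached when no regular method matches the call.
    def invokeMethod(String name, Object args) { 'missing:' + name }
}

// Hypothetical class: getProperty is consulted upfront.
class PropertyHook {
    String name = 'real'

    // Reached for every property read from outside, even for existing properties.
    def getProperty(String property) { 'intercepted:' + property }
}

def fallback = new FallbackHook()
assert fallback.greet() == 'hello'              // the real method wins
assert fallback.unknown() == 'missing:unknown'  // the fallback kicks in

def props = new PropertyHook()
assert props.name == 'intercepted:name'         // the real property is never consulted
```

The asymmetry is visible right away: the existing greet method is untouched by invokeMethod, while the existing name property is completely hidden behind getProperty.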

Make the meta class similar to a Persistent Collection

MetaClassImpl was already going the route of being immutable in the past, but ExpandoMetaClass has shown there are different needs. Well, to be exact, MetaClassImpl has two states: one that allows initialization (and mutation with it) and the initialized state, in which no mutation happens anymore. ExpandoMetaClass is based on MetaClassImpl and more or less switches between the states to allow for mutation. But this turned out to be a horrible mess for concurrency. Persistent collections on the other hand are immutable by nature, which allows nice lock-free paths. So if you want a mutation, you actually have to use a new instance. The trick is to leverage the internal structures of the old instance to save on memory and initialization time. For example, if you have two lists and the second one is created by appending an element to the first one, you can simply reuse the old list by linking the new element to the old elements. It would be a list in which the new element comes before the old elements, like a stack. Of course this is just for illustration and there are solutions that keep the other order.
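The stack-like list from the illustration could look roughly like this. It is only a sketch of structural sharing; PList and prepend are hypothetical names and have nothing to do with the actual meta class code.

```groovy
// Hypothetical persistent (immutable) list with structural sharing.
final class PList {
    static final PList EMPTY = new PList(null, null, 0)

    final Object head   // newest element
    final PList tail    // the old list, reused as-is
    final int size

    private PList(Object head, PList tail, int size) {
        this.head = head
        this.tail = tail
        this.size = size
    }

    // "Mutation" creates a new instance that links to the old one.
    PList prepend(Object element) { new PList(element, this, size + 1) }

    // Copies the elements into a regular list, newest first.
    List toList() {
        def result = []
        for (PList node = this; node.size > 0; node = node.tail) {
            result << node.head
        }
        result
    }
}

def first = PList.EMPTY.prepend('a')
def second = first.prepend('b')

assert first.toList() == ['a']        // the old list is untouched
assert second.toList() == ['b', 'a']  // the new element comes first, like a stack
assert second.tail.is(first)          // the old list is shared, not copied
```

Since no instance ever changes after construction, readers need no lock, which is the property that makes this attractive for meta classes.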

Now the current API is based more on having a single reference, with the instance behind it being mutated. Doing the persistent collection approach means there would have to be a new instance for each change. But this could be solved by having a kind of dummy meta class that keeps track of the real meta class. The real meta class would not be mutated directly anymore.
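A rough sketch of such a dummy that tracks the real, immutable instance might look like this. Everything here (MetaClassSnapshot, MetaClassHandle, register, invoke) is a made-up name for illustration, and using AtomicReference with a compare-and-set loop is my assumption about how the swap could be made lock-free, not a description of existing Groovy code.

```groovy
import java.util.concurrent.atomic.AtomicReference

// Hypothetical immutable "real" meta class: a read-only method table.
final class MetaClassSnapshot {
    final Map<String, Closure> table

    MetaClassSnapshot(Map<String, Closure> table) {
        this.table = Collections.unmodifiableMap(new HashMap<String, Closure>(table))
    }

    // Returns a new snapshot; this one stays as it is.
    MetaClassSnapshot withMethod(String name, Closure body) {
        def copy = new HashMap<String, Closure>(table)
        copy[name] = body
        new MetaClassSnapshot(copy)
    }
}

// Hypothetical dummy: the single stable reference that user code holds on to.
class MetaClassHandle {
    private final AtomicReference<MetaClassSnapshot> current =
            new AtomicReference<MetaClassSnapshot>(new MetaClassSnapshot([:]))

    // Swap in a new snapshot; retry if another thread was faster.
    void register(String name, Closure body) {
        while (true) {
            MetaClassSnapshot old = current.get()
            if (current.compareAndSet(old, old.withMethod(name, body))) return
        }
    }

    Object invoke(String name, Object arg) { current.get().table[name].call(arg) }

    MetaClassSnapshot snapshot() { current.get() }
}

def handle = new MetaClassHandle()
handle.register('twice') { it * 2 }
def before = handle.snapshot()
handle.register('inc') { it + 1 }

assert before.table.size() == 1             // the old snapshot never changes
assert handle.snapshot().table.size() == 2  // the handle points to the new one
assert handle.invoke('twice', 4) == 8
```

A real implementation would share structure between snapshots instead of copying the whole table, as in a persistent collection.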

Let open blocks not use a class anymore


In current Groovy if you have code like
list.each{println it}
you will get an anonymous inner class for the open block, which is also an instance of Closure. There you can set strategies and whatever. Well, in the new MOP I intend to not produce a class anymore. Instead a method will be used. It would have the same parameters as the Closure doCall method has now, plus a helper instance for things like the resolve strategy and such. And in cases where we do not reference any outside context (like in my example) we could omit this as well. The old Closure class would then be only a wrapper for the real representation of the reference to the open block. In Java7+ code this could be a couple of MethodHandles; in older code this would have to be MetaMethods. But of course those MetaMethods would have to be used with care, to avoid too many references to the meta class system. Otherwise you would serialize half the runtime in current Groovy if you wanted to serialize a Closure. I think a lot of the logic that can currently be found in ClosureMetaClass to have a simplified dispatch for Closures can be reused here. I guess this can all be done using the current API, but some frameworks may be based on having those inner classes and recognizing them. Also, the change means there will be only one meta class for every open block out there. Frameworks building on top of that would either need to change, or we would need to provide an option for the old way... The main reason I would want this kind of change is to have a smaller bytecode footprint, less permgen space and of course lower memory consumption. In cases that use no outside references we could even think about reusing instances of Closure to have an even smaller footprint.
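To give a feeling for the direction, here is a hand-written approximation of what "a method plus a thin wrapper" could look like. The names Owner, block0 and BlockRef are invented for this sketch; nothing like this is generated by the compiler today, and the resolve strategy helper is left out because the block references no outside context.

```groovy
import java.lang.invoke.MethodHandle
import java.lang.invoke.MethodHandles
import java.lang.invoke.MethodType

class Owner {
    // Hypothetical: what could be emitted for the open block { it * 2 }
    // instead of an inner class.
    static Object block0(Object it) { it * 2 }
}

// Hypothetical thin wrapper standing in for the old Closure instance.
class BlockRef {
    final MethodHandle target

    BlockRef(MethodHandle target) { this.target = target }

    Object call(Object arg) { target.invokeWithArguments([arg]) }
}

// Look up the method once and keep only a handle to it.
MethodHandle handle = MethodHandles.publicLookup()
        .findStatic(Owner, 'block0', MethodType.methodType(Object, Object))
def block = new BlockRef(handle)

assert block.call(21) == 42
assert [1, 2, 3].collect { block.call(it) } == [2, 4, 6]
```

One wrapper class serves every open block here, which is where the savings in bytecode and class metadata would come from.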


New Meta Class DSL

The DSL provided by ExpandoMetaClass has some nasty things here and there. Fixing them would almost certainly break code, especially in Grails. Thus a new optional front end to the meta classes would allow for a cleaner DSL. The details here really have to be worked out using actual code, I guess. But I see no bigger problem.


No Meta Class subclassing anymore

Subclassing proved to be a problem for performance improvements. Of course there still needs to be a way to add some kind of custom behavior, but that would then be done using composition and with caching in mind, rather than having the meta class as the origin of a method call invocation, as it is now... which needs to be worked around all the time. The composition approach also allows for a cleaner separation of the API. Currently we have for example the MetaClass interface, but it is almost never used for a custom meta class; instead people subclass either DelegatingMetaClass or MetaClassImpl directly. And DelegatingMetaClass is really already going in the direction of a composition model.


Realms

The idea of the Realm concept is to have different spaces for meta classes that can exist independently of each other and do not influence each other. For example, in a Groovy implementation of something like DefaultGroovyMethods, you may want to call the Java version of toString instead of the Groovy one (if existing). Then you need a way to set a Realm and switch between them. But since this concept is unknown to the current meta class system, using realms in a class would automatically mean using the new meta classes. Since the old and the new system exist at the same time, this will cause confusion, I guess. But basically there is only one Realm for the old meta classes, so if another realm is requested I guess the best would be to just give a runtime error.


The plan would be to start implementing the new meta class system and change the old meta class system to use the new one where appropriate. That means for example that the old meta classes would become mere skeletons for the real system meta class. We would need some annotations or other logic here and there to switch between new and old logic, with the old logic being the default of course.

But overall I think this can be done, and in a way that won't require people to update all their code right away. They can then migrate at their own pace.

Sunday, February 08, 2015

Getting rid of compareTo for ==

NOTE: This article is meant as a prelude to a discussion on the mailing list about a possible removal of the general compareTo path for the equality operator.


As many may know, Groovy has quite complicated logic for the == operator: it calls equals unless the left side implements Comparable, in which case we use compareTo... well, simplified...

To illustrate the logic:

class A implements Comparable {
boolean equals(Object o) {false}
int compareTo(Object o) {0}
}
def xa = new A()
def ya = new A()
assert xa==xa && ya==ya // referential identity override
assert !xa.equals(ya) // direct call to equals
assert xa.compareTo(ya)==0 // direct call to compareTo
assert xa==ya // ignores equals, since it implements Comparable

class B implements Comparable {
boolean equals(Object o) {false}
int compareTo(Object o) {0}
}
def xb = new B()
assert !xa.equals(xb) && !xb.equals(xa)
assert xa.compareTo(xb)==0 && xb.compareTo(xa)==0
assert xa!=xb // ignores equals as well as Comparable

assert 1==1l // compare primitive long and int
assert !(1.0G.equals(1.00G))
assert 1.0G==1.00G // compare BigDecimals with differing scale
assert 1.0G==1l // compare primitive long with BigDecimal
assert 1G==1.0G // compare BigInteger and BigDecimal
assert 1!=new Object() // compare primitive with incompatible instances

In Java, as you know, this operator allows you for example to compare ints and longs, and does this in the given case by transforming the int into a long to then compare the numeric values. Similar things happen for the other primitives. Since Java5 the operator even allows you to compare a primitive int and an Integer by using autoboxing. Where the equality operator in Java fails is if you compare for example a Long and an Integer. Fail in the sense that it does not do the same as for the primitive counterparts.

Now in Groovy the equality operator traditionally had to handle comparing the boxed versions as if they were not boxed. This is because in versions of Groovy before 1.8 every variable declared with a primitive type actually used the boxed version. Only in 1.8 did I introduce actual primitives, but the ability of the equality operator to compare for example Integer and Long stayed. It had to stay, because we don't only compare those: we also have those 1-char Strings that are supposed to be equal to chars, the GString logic and of course the BigInteger and BigDecimal logic.
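A few checks that show what == has to cover for boxed values, 1-char Strings and GStrings. This is just a small illustration of current behavior; the variable names are arbitrary.

```groovy
Integer boxedInt = 1
Long boxedLong = 1L

assert !boxedInt.equals(boxedLong)   // equals sees two different classes
assert boxedInt == boxedLong         // == compares the numeric values

char c = 'a'
assert c == 'a'                      // a char equals the matching 1-char String

def n = 1
assert "${n}x" == '1x'               // a GString equals a String with the same text
```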

BigDecimal now does something that is not really advised when implementing the interface Comparable: equals returns false for a case that compareTo sees as equal. For example "1.0" and "1.00" are such a case. They are not equal, because the scale is not, even though one value can be converted to the other without precision loss to do an actual compare.
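The BigDecimal mismatch is easy to reproduce with plain JDK calls. A short illustration:

```groovy
def a = new BigDecimal('1.0')
def b = new BigDecimal('1.00')

assert a.scale() == 1 && b.scale() == 2   // same value, different scale
assert !a.equals(b)                       // equals also compares the scale
assert a.compareTo(b) == 0                // compareTo compares only the value

// Once the scale is normalised, equals agrees with compareTo.
assert a.stripTrailingZeros().equals(b.stripTrailingZeros())
```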

Since people also do things like `1==new Object()`, and since this is not supposed to throw a ClassCastException even though the compareTo method will do that here, we also had to add special logic that does the compareTo call only if the right side type is a subtype of the left side type.

This causes all kinds of confusion for people. And my suggestion is to remove the compareTo path.

Instead I suggest adding a path special to BigDecimal to handle the equals problem. This should remove a lot of confusion in the future.

Now this will of course have more impact than some people may think. Obviously classes implementing Comparable may now behave differently. Custom Number implementations in particular may do so. So it is a loss of a feature to some extent. But if the usage of those features causes more problems than the abilities it provides, then we have to rethink this. And I think this is the case here. My intended change would also change the behavior of the program above. The referential equality override would stay, but "assert xa==ya" would then fail, since equals always returns false. Also, if equals always returned true, "assert xa!=xb" would fail, since before it did not call equals and now it does.
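The suggested rule can be sketched as a plain method. This is only an approximation: proposedEquals is a hypothetical helper, not an existing Groovy API, and it deliberately leaves out the number, char and GString handling that == would still need.

```groovy
// Hypothetical helper approximating the proposed == logic.
boolean proposedEquals(Object left, Object right) {
    if (left.is(right)) return true                 // referential identity override stays
    if (left == null || right == null) return false
    if (left instanceof BigDecimal && right instanceof BigDecimal) {
        return left.compareTo(right) == 0           // special path for BigDecimal only
    }
    left.equals(right)                              // no general compareTo path anymore
}

// Same shape as class A from the program above.
class A implements Comparable {
    boolean equals(Object o) { false }
    int compareTo(Object o) { 0 }
}

def xa = new A()
def ya = new A()

assert xa == ya                      // today: compareTo decides
assert !proposedEquals(xa, ya)       // proposed: equals decides
assert proposedEquals(xa, xa)        // identity still wins
assert proposedEquals(1.0G, 1.00G)   // BigDecimal keeps comparing by value
```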

Monday, January 12, 2015

Indy and CompileStatic as tag team to defeat array access times

Micro Benchmarks are Evil

They are evil, aren't they? You test a very specialized case that may have no relation to your everyday application at all. But they can show some weaknesses here and there. Whether they are relevant or not is a different question, one that is most often answered with "they are not". Still, there are sometimes cases that a language can improve upon. And one such case in Groovy is array access.

Array access in Groovy

For those not aware of it: Groovy does not do array access like Java. Groovy allows the usage of a negative array index, in which case we count backwards from the end of the array. So -1 always denotes the last element. Going beyond -array.length will again result in an ArrayIndexOutOfBoundsException.
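A quick illustration of the index handling on an int array. Note that -array.length itself still maps to the first element; only indexes below that fail.

```groovy
int[] values = [10, 20, 30] as int[]

assert values[0] == 10
assert values[-1] == 30               // last element
assert values[-values.length] == 10   // first element, counted from the end

boolean failed = false
try {
    values[-values.length - 1]        // one step too far
} catch (ArrayIndexOutOfBoundsException expected) {
    failed = true
}
assert failed
```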

Benchmarking a little

To measure the extent of the problem I am using this little benchmark named fannkuch. It is based on the alioth shootout's Groovy version of fannkuch. Since I know Groovy will not perform very well on this, even with primitive optimizations, I don't expect too much from checking this with non-static code. For those not knowing what primitive optimizations are... it means letting the compiler generate an alternative bytecode execution path based on primitives and the assumption that there are no meta class changes affecting primitives. To ensure this assumption is legal I am using guards.

fannkuch microbenchmark times in ms (JDK8_u25):
variant            time             +/-
primopts Groovy    12889.2358718    +2718.9594152 / -787.6735898
static Groovy      3325.5838752     +270.7189528 / -266.2819292
Java               561              +85 / -65

Which means even Groovy with primitive optimizations is slower by a factor of 23. Switching to @CompileStatic makes things look better, but there is still a factor of almost 6.

Analyzing the results

Analyzing the generated bytecode shows us that the @CompileStatic version is not doing anything strange compared to Java; only the array access parts are done differently, by using BytecodeInterface8 methods to access the arrays. primopts on the other hand shows that besides the BytecodeInterface8 method usage, there is also dynamic access to arrays. This of course means bad times, since it is difficult to compete with primitives on the JVM using code that cannot handle primitives all that well... like for example reflection.

So my next step was to try whether invokedynamic can improve the situation. It may at first look strange to use invokedynamic in statically compiled code for something as static as this. We know all the types at compile time, so a method call should be faster than any fancy thing invokedynamic could do, right? Wrong. Or I should say it depends. What we can do here is provide a very short path for the optimistic case of the array index being non-negative. In the original code this is done with a try-catch. But in terms of the MethodHandles used by invokedynamic we can use a guard that checks the index for a non-negative value instead. MethodHandles also provide an exception-catching guard of some kind, but this has issues in terms of performance and how far the code can be optimized. In total the guard version has the big advantage of doing something the JVM would do anyway, so the JVM can potentially just remove the second check, making the first check very, very cheap. The fallback of course is still as complex as before and there is no real speed improvement to be expected there. Another part that deserves consideration is that with invokedynamic a static call site is in no way comparable to a mutable call site. Thanks to Java8 lambdas, a lot of performance optimization effort has gone into making static call sites fast. And we have one here.
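The guard idea can be tried out with plain MethodHandles, without any invokedynamic bootstrap. The following is only a sketch and not the code from the pull request: IntArrayAccess and its three methods are hypothetical, and in real generated code the short path would be a plain array load.

```groovy
import java.lang.invoke.MethodHandle
import java.lang.invoke.MethodHandles
import java.lang.invoke.MethodType

// Hypothetical helper class for the sketch.
class IntArrayAccess {
    // Guard: the optimistic case of a non-negative index.
    static boolean nonNegative(int[] array, int index) { index >= 0 }

    // Short path: direct access, bounds are checked by the array access itself.
    static int direct(int[] array, int index) { array[index] }

    // Fallback: Groovy-style normalisation of a negative index.
    static int normalised(int[] array, int index) {
        int i = index + array.length
        if (i < 0) throw new ArrayIndexOutOfBoundsException('Negative index too large: ' + index)
        array[i]
    }
}

int[] data = [10, 20, 30] as int[]

def lookup = MethodHandles.publicLookup()
def accessType = MethodType.methodType(Integer.TYPE, data.getClass(), Integer.TYPE)
def guardType = MethodType.methodType(Boolean.TYPE, data.getClass(), Integer.TYPE)

// if (nonNegative(array, index)) direct(array, index) else normalised(array, index)
MethodHandle guarded = MethodHandles.guardWithTest(
        lookup.findStatic(IntArrayAccess, 'nonNegative', guardType),
        lookup.findStatic(IntArrayAccess, 'direct', accessType),
        lookup.findStatic(IntArrayAccess, 'normalised', accessType))

assert guarded.invokeWithArguments([data, 1]) == 20    // short path
assert guarded.invokeWithArguments([data, -1]) == 30   // fallback
```

Called through invokeWithArguments like this, nothing is fast, of course. The point is only the shape of the handle that a static call site would be bound to.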

New results

This then resulted in PR #587 and updated times in our table:
variant                    time             +/-
primopts Groovy            12889.2358718    +2718.9594152 / -787.6735898
static Groovy              3325.5838752     +270.7189528 / -266.2819292
static Groovy with indy    878.0258219      +328.9714071 / -134.1179639
Java                       561              +85 / -65

This indicates a mere slowdown of 57% now. I think this is a great improvement... And while it would have been nice to actually be on par with Java here, I assume this can only be done by using Java's array access logic in the end. A slowdown like this is something I have already found occurring if you check for a boolean in an if, for example. So I doubt there is much more room for improvement.
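For reference, the factors quoted in this post can be recomputed from the mean times in the table. The variable names are just for this calculation.

```groovy
// Mean times in ms, copied from the table.
double javaMs = 561d
double primoptsMs = 12889.2358718d
double staticMs = 3325.5838752d
double indyMs = 878.0258219d

double primoptsFactor = primoptsMs / javaMs                      // roughly 23
double staticFactor = staticMs / javaMs                          // roughly 5.9
long indySlowdownPercent = Math.round((indyMs / javaMs - 1d) * 100d)

assert Math.round(primoptsFactor) == 23
assert Math.round(staticFactor) == 6
assert indySlowdownPercent == 57
```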

As for primitive optimizations, after GROOVY-7249 and GROOVY-7251 we can also look forward to improvements in indy and normal primitive optimizations.

I will write a new blog post about the results once those are implemented.