Planet Scala

Scala blogs aggregated

September 03, 2010

Coderspiel

1366 x 768 MeeGo tablet coming to Germany in two weeks

Video: http://www.youtube.com/v/7EdNBTwHxWk

September 03, 2010 04:17 PM

SPDE Menger Sierpinski Ball

Video: http://www.youtube.com/v/Fl-V7GRAJJ4

September 03, 2010 01:04 PM

Heiko Seeberger

SLF4S - Logging the Scala way

Do we need another logging framework in the Java/Scala world? Certainly not! As Scala is fully "downward" compatible with Java, we can use whatever Java logging solution we want. And there are many, aren't there?

So why SLF4S? Well, SLF4S isn't another logging framework, but a very thin Scala wrapper around SLF4J, which has emerged as the leading Java logging solution. Why do we need a Scala wrapper for SLF4J? Well, there are some nice Scala features that can make logging even easier and/or more performant.

First, SLF4S Loggers use by-name parameters, which are only evaluated if needed/accessed. When logging "traditionally", we often create messages by concatenating Strings or using the String.format method, even if we don't need these messages in the end because the logging level is not enabled. Of course we could "manually" check whether the logging level is enabled, e.g. by calling logger.isDebugEnabled, but we often don't, because it's cumbersome. With by-name parameters we can simply call our log methods and let SLF4S check whether the log level is enabled. Just take a look at one example:
def debug(msg: => String) {
if (slf4jLogger.isDebugEnabled) slf4jLogger debug msg
}
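
The effect of the by-name parameter can be observed without SLF4J on the classpath. The following is only a sketch: TinyLogger and expensiveMessage are hypothetical stand-ins invented for illustration, not part of SLF4S or SLF4J. It counts how often the message expression is actually evaluated.

```scala
object ByNameLoggingSketch {
  // Hypothetical stand-in for a logger; not the SLF4S or SLF4J API.
  class TinyLogger(debugEnabled: Boolean) {
    val lines = scala.collection.mutable.ListBuffer.empty[String]

    // msg is by-name: the argument expression runs only if it is referenced.
    def debug(msg: => String): Unit =
      if (debugEnabled) lines += msg
  }

  // Counts how often the message below is actually built.
  var evaluations = 0

  // Simulates a message that is costly to construct.
  def expensiveMessage(): String = {
    evaluations += 1
    "state dump #" + evaluations
  }
}
```

Calling debug(expensiveMessage()) on a logger with debug disabled leaves the counter untouched, because the argument is never forced.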

Second, SLF4S offers a Logging trait which can be mixed into any class to make a Logger instance available. That particular Logger will be initialized with the name of the class it is mixed into, which is a common use case.
class MyClazz extends SomeClazz with Logging
...
logger debug "SLF4S just rocks!"
...

Of course you can create Loggers with arbitrary names by calling Logger("SomeSpecialName").

Last but not least, SLF4S offers implicit conversions from "usual" SLF4J Loggers into "pimped" SLF4S Loggers.
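
How such a conversion works in general can be sketched as follows. PlainLogger, RichLogger, debugLazy and enrich are hypothetical names made up for this illustration; the actual SLF4S types and method names may well differ.

```scala
import scala.language.implicitConversions

object PimpedLoggerSketch {
  // Hypothetical stand-in for a Java-style logger with eager String arguments.
  class PlainLogger(val debugEnabled: Boolean) {
    var last: Option[String] = None
    def debug(msg: String): Unit = { last = Some(msg) }
  }

  // Hypothetical wrapper adding a by-name variant under a new method name.
  class RichLogger(underlying: PlainLogger) {
    def debugLazy(msg: => String): Unit =
      if (underlying.debugEnabled) underlying.debug(msg)
  }

  // The implicit conversion: when a PlainLogger lacks a method, the compiler
  // may wrap it in a RichLogger to find that method there.
  implicit def enrich(logger: PlainLogger): RichLogger = new RichLogger(logger)
}
```

With enrich in scope, plainLogger.debugLazy(...) compiles even though PlainLogger itself has no such method.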

Ah, and of course, SLF4S is OSGi compliant. But that's not a big surprise, taking into account that the authors are OSGi fanboys and SLF4J is OSGi compliant, too.

by Heiko Seeberger (noreply@blogger.com) at September 03, 2010 08:06 AM

Ruminations of a Programmer

Towards generic APIs for the open world


In my last post on how Clojure protocols encourage open abstractions, I did some quick rounds between type classes in Haskell and protocols in Clojure. At the end, in the section titled "Not really a type class", I mentioned the read function of Haskell's Read type class. read takes a String and returns a value of some type - hence it doesn't dispatch on the function argument, but rather on the return type. Clojure protocols can't do this, and I am not aware of any dynamic language that can. Check out James Iry's insightful comment on this subject on the post.


With type classes all dispatch is static - the dispatch map is passed as a dictionary of types and inferred by the compiler. What benefit does this bring us? Do we really get anything special when the language supports APIs like the read method of Haskell's Read type class?


In this post I try to explore how type classes help design generic APIs that are open and can work seamlessly with abstractions that you implement much later in the timeline than the type class itself. This is in contrast to subtype polymorphism where all subtypes are bound by the contracts that the super type exposes. In this sense subtype polymorphism is closed.


This post is inspired in part by the excellent article Generalizing APIs by Edward Z. Yang. For this post I will use Scala, my current language of choice for most of the things I do today.


My generic API


I want to implement a read API like the one in Haskell encoded in a Scala type class .. Let's make it generic in the type that it returns ..


// type class
// reads a string, returns a T
trait Read[T] {
  def read(s: String): T
}

For the open world


We can define instances of this type class by instantiating the trait as objects. Type classes are implemented in Scala using implicits. In case you're not familiar with the concept, here's what I wrote about them some time back.


// instance for Int
implicit object IntRead extends Read[Int] {
  def read(s: String) = s.toInt
}

// instance for Float
implicit object FloatRead extends Read[Float] {
  def read(s: String) = s.toFloat
}

These are very much like what you would do with type class instances in Haskell. You can even create instances for your own abstractions ..


case class Name(last: String, first: String)

object NameDescription {
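  def unapply(s: String): Option[(String, String)] = {
    val a = s.split("/")
    // match only "first/last"; anything else falls through to the default case
    if (a.length == 2) Some((a(1), a(0))) else None
  }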
}

// instance for Name
import NameDescription._
implicit object NameRead extends Read[Name] {
  def read(s: String) = s match {             
    case NameDescription(l, f) => Name(l, f)
    case _ => error("invalid")
  }
}

So the Read type class in Scala is generic enough to be instantiated for all kinds of abstractions. Note that unlike interfaces in Java, the polymorphism is not coupled with inheritance hierarchies. With an interface, your abstraction needs to implement the interface statically, which means that the interface has to exist before you design your abstraction. With type classes, the abstractions for Int and Float existed well before we defined the Read type class.


Now if we have a generic function that takes a String, we can make it return an instance of the type it is generic on.


def foo[T : Read](s: String) = implicitly[Read[T]].read(s)

foo[Int]("123") // 123
foo[Float]("123.0") // 123.0
foo[Name]("debasish/ghosh") // Name("ghosh", "debasish")

Ok .. so that was our generic read API adapting violently to already existing abstractions. In this case it's exactly the Scala variant of how simple type class instances behave in Haskell. The authors of Real World Haskell use the term open world assumption to describe this feature of the type class system.


Context for selecting the API instance


When the function foo is invoked, the compiler needs to find out the exact instance of the Read type class from the method dictionary in case of Haskell and from the implicit values in scope in case of Scala. For this we specify the context bound of the generic type T as T : Read. This is the same as the context of the type class that we have in Haskell. It specifies that the method foo can return any type T provided the type is an instance of the type class Read. Apart from using the context bound, in Scala you can also declare the instance as an implicit parameter in a second parameter list, which is what a context bound desugars to. The Haskell equivalent is ..


foo :: Read a => String -> a

Irrespective of Haskell or Scala, our API becomes hugely expressive through such constraints that the static type system allows us to write. And all these constraints are checked during compile time.
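
In Scala, the context bound T : Read can be read as shorthand for an extra implicit parameter list. Here is a minimal sketch of the two spellings side by side; Read and IntRead are redefined locally so the block stands alone, and fooExplicit is a hypothetical name used only here.

```scala
object ContextBoundSketch {
  trait Read[T] { def read(s: String): T }

  implicit object IntRead extends Read[Int] {
    def read(s: String): Int = s.toInt
  }

  // Context-bound form, as in the post.
  def foo[T: Read](s: String): T = implicitly[Read[T]].read(s)

  // Same constraint, with the instance as a named implicit parameter
  // (fooExplicit is a hypothetical name used only for this sketch).
  def fooExplicit[T](s: String)(implicit r: Read[T]): T = r.read(s)
}
```

Both forms are resolved at compile time; a call such as foo[java.util.Date]("x") is rejected by the compiler because no Read[java.util.Date] instance is in scope.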


Context in implementing specific instances


When defining a generic API, you can also set up a context for specific instances of the type class. Consider our read method for a List datatype in Scala. Haskell defines the instance as ..


instance Read a => Read [a] where ..

Note the context Read a following the instance keyword. This is called the context of the type class instance which says that we can read a List of a only if all individual a's also implement the Read type class. 


We do this in Scala using conditional implicits as ..


implicit def ListRead[A](implicit r: Read[A]) = 
  new Read[List[A]] {
    def read(s: String) = {
      val es = s.split(" ").toList
      es.map(r.read(_))
    }
  }

The implicit definition itself takes another implicit argument to validate during compile time that the individual elements of the List also are instances of the type class. This is similar to what the context does in case of Haskell's type class instantiation.


foo[List[Int]]("12 234 45 678") // List(12, 234, 45, 678)
foo[List[Float]]("12.0 234.0 45.0 678.0") // List(12.0, 234.0, 45.0, 678.0)
foo[List[Name]]("debasish/ghosh maulindu/chatterjee nilanjan/das")
  // List(Name("ghosh", "debasish"), Name("chatterjee", "maulindu"), Name("das", "nilanjan"))

As part of the common GHC extensions, Haskell also provides support for overlapping instances of type classes ..


instance Read a => Read [a] where ..
instance Read [Int] where ..

In such cases although there are two possible matches for [Int], the compiler can make an unambiguous decision and select the most specific instance. With Scala, there is no such ambiguity to be resolved since Scala anyway allows multiple implementations of the same type class and it's up to the user to import the specific one into the module.
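
Selecting an instance by import might look like the sketch below. The Decimal and Hex objects and their member names are assumptions made up for illustration; Read and foo are redefined locally so the block stands alone.

```scala
object InstanceSelectionSketch {
  trait Read[T] { def read(s: String): T }

  def foo[T: Read](s: String): T = implicitly[Read[T]].read(s)

  // Two competing instances for the same type, kept in separate objects
  // so that neither is in implicit scope by default.
  object Decimal {
    implicit val decimalIntRead: Read[Int] = new Read[Int] {
      def read(s: String): Int = s.toInt
    }
  }

  object Hex {
    implicit val hexIntRead: Read[Int] = new Read[Int] {
      def read(s: String): Int = Integer.parseInt(s, 16)
    }
  }

  def readDecimal(s: String): Int = {
    import Decimal._ // selects the decimal instance for this scope
    foo[Int](s)
  }

  def readHex(s: String): Int = {
    import Hex._ // selects the hexadecimal instance for this scope
    foo[Int](s)
  }
}
```

The same string "10" then reads as 10 or 16 depending on which instance the enclosing scope imported; importing both into one scope would make the call ambiguous and fail to compile.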


In this post I discussed the power that you get with type class based generic API design. In functional languages like Haskell, type classes are the most potent way to implement extensible APIs for the open world. Of course in object functional languages like Scala, you also have the power of subtyping, which comes good in many circumstances. It will be interesting to come up with a comparative analysis of situations when we prefer one to the other. But that's up for some other day, some other post ..

by Debasish (ghosh.debasish@gmail.com) at September 03, 2010 05:17 AM

September 02, 2010

Coderspiel

Dispatch 0.7.6

Dispatch 0.7.6:

Broader redirect handling, support for Google ClientLogin, a JSON extractor for optional properties, and OAuth compatibility with Twitter’s streaming API.

September 02, 2010 07:41 PM

"simple-build-tool plugin to publish and distribute your Scala projects using Github as Ivy..."

“simple-build-tool plugin to publish and distribute your Scala projects using Github as Ivy repository”

- siasia’s plugin at master - GitHub

September 02, 2010 04:00 PM

Stephan Schmidt

Better Configuration Files

Over the years I have seen many configuration files. Most of them were unusable. There are many reasons for unusable configuration files. What I’ve learned from looking at large configurations are those main points: 1. Values Often configuration files use the wrong values. Developers tend to use true/false for switching options on and off. track-users = true The [...]

by stephan at September 02, 2010 11:45 AM

Mathias

Scala Collections API Charted Out

The revamped collections API is one of the cornerstones of the recently released Scala version 2.8.
It provides an incredibly rich toolset for working with object collections and as such is one of the most important parts of the Scala runtime library. The Scala team has made an effort to provide thorough documentation uncovering its feature breadth, the most important sources of information...

September 02, 2010 07:00 AM

September 01, 2010

Coderspiel

Obama administration: "Piracy is flat, unadulterated theft"

Obama administration: "Piracy is flat, unadulterated theft":

If copyright violations are just like theft (and they are not, if you think about it for 1.5 seconds), why are the penalties millions of dollars more than for shoplifting?

September 01, 2010 03:49 PM

Tony Morris

Even Further Understanding scala.Option (part 2)

As a follow-on to Further Understanding scala.Option, following are another 10 exercises (numbered 16 to 25). Included are solutions to the original 1 to 15 exercises. Instructions are in the comments.

// Scala version 2.8.0.final
// http://scala-tools.org/repo-releases/org/scala-tools/testing/scalacheck_2.8.0/1.7/scalacheck_2.8.0-1.7.jar
 
 
/*
 
  PART 1
  ======
  Below are 15 exercises numbered 1 to 15. The task is to emulate the scala.Option API
  without using Some/None subtypes, but instead using a fold (called a
  catamorphism).
 
  A couple of functions are already done (map, get)
  to be used as an example. ScalaCheck tests are given below to
  verify the work. The desired result is to have all tests passing.
 
  The 15th exercise is not available in the existing Scala API so
  instructions are given in the comments.
 
 
  Part 2
  ======
 
  Below are 10 exercises numbered 16 to 25. The task is to implement additional
  methods for the Optional data type. These methods are not provided in the
  scala.Option API so to determine the correct result requires reading the method
  type signature and ensuring that the tests pass.
 
  The 25th exercise is notable in that its signature says nothing about
  scala.Option yet it is usable for Option (see the test for example).
 
 
  Revision History
  ================
 
  23/08/2010
  * Initial revision
 
  ----------------
 
  23/08/2010
  * Fixed prop_getOrElse. Thanks Michael Bayne.
 
  ----------------
 
  26/08/2010
  * Add lazy annotation to orElse method.
 
  ----------------
 
  01/09/2010
  Added Part 2
 
  02/09/2010
  * Fixed mapOptionals test (why wasn't it failing?). Thanks Alec Zorab.
  * Added comments including *** special note ***
 
*/
 
 
trait Optional[A] {
  // single abstract method
  def fold[X](some: A => X, none: => X): X
 
  import Optional._
 
  // Done for you.
  def map[B](f: A => B): Optional[B] =
    fold(f andThen some, none[B])
 
  // Done for you.
  // WARNING: undefined for None
  def get: A =
    fold(a => a, error("None.get"))
 
  // Exercise 1
  def flatMap[B](f: A => Optional[B]): Optional[B] =
    fold(f, none)
 
  // Exercise 2
  // Rewrite map but use flatMap, not fold.
  def mapAgain[B](f: A => B): Optional[B] =
    flatMap(f andThen some)
 
  // Exercise 3
  def getOrElse(e: => A): A =
    fold(s => s, e)
 
  // Exercise 4
  def filter(p: A => Boolean): Optional[A] =
    fold(a => if(p(a)) some(a) else none, none)
 
  // Exercise 5
  def exists(p: A => Boolean): Boolean =
    fold(p, false)
 
  // Exercise 6
  def forall(p: A => Boolean): Boolean =
    fold(p, true)
 
  // Exercise 7
  def foreach(f: A => Unit): Unit =
    fold(f, ())
 
  // Exercise 8
  def isDefined: Boolean =
    fold(_ => true, false)
 
  // Exercise 9
  def isEmpty: Boolean =
    fold(_ => false, true)
 
  // Exercise 10
  def orElse(o: => Optional[A]): Optional[A] =
    fold(_ => this, o)
 
  // Exercise 11
  def toLeft[X](right: => X): Either[A, X] =
    fold(Left(_), Right(right))
 
  // Exercise 12
  def toRight[X](left: => X): Either[X, A] =
    fold(Right(_), Left(left))
 
  // Exercise 13
  def toList: List[A] =
    fold(List(_), Nil)
 
  // Exercise 14
  def iterator: Iterator[A] =
    fold(Iterator.single(_), Iterator.empty)
 
  // Exercise 15 The Clincher!
  // Return a none value if either this or the argument is none.
  // Otherwise apply the function to the argument in some.
  // Don't be afraid to use functions you have written.
  // Better style, more points!
  def applic[B](f: Optional[A => B]): Optional[B] =
    f flatMap map
 
  // Utility
  def toOption: Option[A] = fold(Some(_), None)
 
  // Utility
  override def toString = 
    fold("some[" + _ + "]", "none")
 
  // Utility
  override def equals(o: Any) =
    o.isInstanceOf[Optional[_]] && {
      val q = o.asInstanceOf[Optional[_]]
      fold(a => q.exists(a == _),
           q.isEmpty)
    }
}
 
object Optional {
  // Done for you
  def none[A]: Optional[A] = new Optional[A] {
    def fold[X](some: A => X, none: => X) = none
  }
 
  // Done for you
  def some[A](a: A): Optional[A] = new Optional[A] {
    def fold[X](some: A => X, none: => X) = some(a)
  }
 
  // Utility
  def fromOption[A](o: Option[A]): Optional[A] = o match {
    case None    => none
    case Some(a) => some(a)
  }
 
  // *** Special note ***
  // Some of these functions are likely to be familiar List functions,
  // but with one specific distinction: in every covariant value appearing in
  // the type signature, this value is wrapped in Optional.
  // For example, the unwrapped:
  // filter:          (A => Boolean) => List[A] => List[A]
  // and the wrapped:
  // filterOptionals: (A => Optional[Boolean]) => List[A] => Optional[List[A]]
  // 
  // There are other functions of a similar nature below.
 
  // Exercise 16
  // If a none is encountered, then return a none, otherwise,
  // accumulate all the values in Optional.
  def mapOptionals[A, B](f: A => Optional[B], a: List[A]): Optional[List[B]] =
    error("todo")
 
  // Exercise 17
  // If a none is encountered, then return a none, otherwise,
  // accumulate all the values in Optional.
  def sequenceOptionals[A](a: List[Optional[A]]): Optional[List[A]] =
    error("todo")
 
  // Exercise 18
  // Use sequenceOptionals
  def mapOptionalsAgain[A, B](f: A => Optional[B], a: List[A]): Optional[List[B]] =
    error("todo")
 
  // Exercise 19 
  // Use mapOptionals
  def sequenceOptionalsAgain[A](a: List[Optional[A]]): Optional[List[A]] =
    error("todo")
 
  // Exercise 20
  // If a none is encountered, return none, otherwise,
  // flatten/join by one level.
  def joinOptionals[A](a: Optional[Optional[A]]): Optional[A] =
    error("todo")
 
  // Exercise 21
  def filterOptionals[A](p: A => Optional[Boolean], a: List[A]): Optional[List[A]] =
    error("todo")
 
  // Exercise 22
  def fillOptionals[A](n: Int, a: Optional[A]): Optional[List[A]] =
    error("todo")
 
  // Exercise 23
  // Use sequenceOptionals
  def fillOptionalsAgain[A](n: Int, a: Optional[A]): Optional[List[A]] =
    error("todo")
 
  // Exercise 24
  // Methods mentioning Optional in the type signature are prohibited, except applic and map
  def mapOptionalsYetAgain[A, B](f: A => Optional[B], a: List[A]): Optional[List[B]] =
    error("todo")
 
  // Consider: def joinOptional[A](a: Optional[Optional[A]]): Optional[A]
  // This function "flattens" the Optional into a Some value if possible.
  // It is not possible to write this using only applic and map (try it!).
 
  // Bye bye Option-specificity!
  // (setting up for Exercise 25)
  trait Applic[F[_]] {
    def point[A](a: A): F[A]
    def applic[A, B](f: F[A => B], a: F[A]): F[B]
 
    final def map[A, B](f: A => B, a: F[A]): F[B] =
      applic(point(f), a)
  }
 
  object Applic {
    implicit val OptionalApplic: Applic[Optional] = new Applic[Optional] {
      def point[A](a: A): Optional[A] = some(a)
      def applic[A, B](f: Optional[A => B], a: Optional[A]): Optional[B] = a applic f
    }
  }
 
  // Exercise 25
  // The Double-Clincher!
  def mapWhatever[A, B, F[_]](f: A => F[B], a: List[A])(implicit z: Applic[F]): F[List[B]] =
    error("todo")
}
 
import org.scalacheck._
import Arbitrary.arbitrary
import Prop._
 
object TestOptional extends Properties("Optional") {
  import Optional._
 
  implicit def ArbitraryOptional[A](implicit a: Arbitrary[A]): Arbitrary[Optional[A]] =
    Arbitrary(arbitrary[Option[A]] map fromOption)
 
  property("map") = forAll ((o: Optional[Int], f: Int => String) =>
    (o map f).toOption == (o.toOption map f))
 
  property("get") = forAll((o: Optional[Int]) =>
    o.isDefined ==>
      (o.get == o.toOption.get))
 
  property("flatMap") = forAll((o: Optional[Int], f: Int => Optional[String]) =>
    (o flatMap f).toOption == (o.toOption flatMap (f(_).toOption)))
 
  property("mapAgain") = forAll ((o: Optional[Int], f: Int => String) =>
    (o mapAgain f).toOption == (o map f).toOption)
 
  property("getOrElse") = forAll ((o: Optional[Int], n: Int) =>
    (o getOrElse n) == (o.toOption getOrElse n))
 
  property("filter") = forAll ((o: Optional[Int], f: Int => Boolean) =>
    (o filter f).toOption == (o.toOption filter f))
 
  property("exists") = forAll ((o: Optional[Int], f: Int => Boolean) =>
    (o exists f) == (o.toOption exists f))
 
  property("forall") = forAll ((o: Optional[Int], f: Int => Boolean) =>
    (o forall f) == (o.toOption forall f))
 
  property("foreach") = forAll ((o: Optional[Int], f: Int => Unit, n: Int) => {
    var x: Int = n
    var y: Int = x
 
    o foreach (t => x = x + t)
    o.toOption foreach (t => y = y + t)
 
    x == y
  })
 
  property("isDefined") = forAll ((o: Optional[Int]) =>
    (o.isDefined) == (o.toOption.isDefined))
 
  property("isEmpty") = forAll ((o: Optional[Int]) =>
    o.isEmpty == o.toOption.isEmpty)
 
  property("orElse") = forAll ((o: Optional[Int], p: Optional[Int]) =>
    (o orElse p).toOption == (o.toOption orElse p.toOption))
 
  property("toLeft") = forAll ((o: Optional[Int], n: Int) =>
    (o toLeft n) == (o.toOption toLeft n))
 
  property("toRight") = forAll ((o: Optional[Int], n: Int) =>
    (o toRight n) == (o.toOption toRight n))
 
  property("toList") = forAll ((o: Optional[Int]) =>
    o.toList == o.toOption.toList)
 
  property("iterator") = forAll ((o: Optional[Int]) =>
    o.iterator sameElements o.toOption.iterator)
 
  // *** READ THIS COMMENT FIRST ***
  // Note that scala.Option has no such equivalent to this method
  // Therefore, reading this test may give away clues to how it might be solved.
  // If you do not wish to spoil it, look away now and follow the
  // instruction in the Exercise comment.
  property("applic") = forAll ((o: Optional[Int => String], p: Optional[Int]) =>
    (p applic o).toOption ==
    (for(f <- o.toOption;
         n <- p.toOption)
    yield f(n)))
 
  def trace[A](a: A) = {
    println(a)
    a
  }
 
  property("mapOptionals") = forAll((f: Int => Optional[String], o: List[Int]) =>
  {
    val i = o map f
    mapOptionals(f, o) == (if(i forall (_.isDefined)) some(i map (_.get)) else none)
  })
 
  property("sequenceOptionals") = forAll((o: List[Optional[String]]) =>
      sequenceOptionals(o) == (if(o exists (_.isEmpty)) none else some(o map (_.get))))
 
  property("mapOptionalsAgain") = forAll((f: Int => Optional[String], o: List[Int]) =>
      mapOptionalsAgain(f, o) == mapOptionals(f, o))
 
  property("sequenceOptionalsAgain") = forAll((o: List[Optional[String]]) =>
      sequenceOptionalsAgain(o) == sequenceOptionals(o))
 
  property("joinOptionals") = forAll((o: Optional[Optional[String]]) =>
      joinOptionals(o) == (if(o.isDefined && o.get.isDefined) o.get else none))
 
  property("filterOptionals") = forAll((f: Int => Optional[Boolean], o: List[Int]) =>
      filterOptionals(f, o) == (if(o exists (f(_).isEmpty)) none else some(o filter (f(_).get))))
 
  property("fillOptionals") = forAll((n: Int, o: Optional[String]) =>
      (n < 1000) ==> // prevent stack consumption
      (fillOptionals(n, o) == (if(n <= 0) some(Nil) else (o map (List.fill(n)(_))))))
 
  property("fillOptionalsAgain") = forAll((n: Int, o: Optional[String]) =>
      (n < 1000) ==> // prevent stack consumption      
      (fillOptionalsAgain(n, o) == fillOptionals(n, o)))
 
  property("mapOptionalsYetAgain") = forAll((f: Int => Optional[String], o: List[Int]) =>
      mapOptionalsYetAgain(f, o) == mapOptionals(f, o))
 
  property("mapWhatever") = forAll((f: Int => Optional[String], o: List[Int]) =>
      mapWhatever(f, o) == mapOptionals(f, o))
 
  /*
  $ scala -classpath .:scalacheck_2.8.0-1.7.jar TestOptional
  + Optional.map: OK, passed 100 tests.                                         
  + Optional.get: OK, passed 100 tests.                                         
  + Optional.flatMap: OK, passed 100 tests.                                     
  + Optional.mapAgain: OK, passed 100 tests.                                    
  + Optional.getOrElse: OK, passed 100 tests.                                   
  + Optional.filter: OK, passed 100 tests.                                      
  + Optional.exists: OK, passed 100 tests.                                      
  + Optional.forall: OK, passed 100 tests.                                      
  + Optional.foreach: OK, passed 100 tests.                                     
  + Optional.isDefined: OK, passed 100 tests.                                   
  + Optional.isEmpty: OK, passed 100 tests.                                     
  + Optional.orElse: OK, passed 100 tests.                                      
  + Optional.toLeft: OK, passed 100 tests.                                      
  + Optional.toRight: OK, passed 100 tests.                                     
  + Optional.toList: OK, passed 100 tests.                                      
  + Optional.iterator: OK, passed 100 tests.                                    
  + Optional.applic: OK, passed 100 tests.                                      
  + Optional.mapOptionals: OK, passed 100 tests.                                
  + Optional.sequenceOptionals: OK, passed 100 tests.                           
  + Optional.mapOptionalsAgain: OK, passed 100 tests.                           
  + Optional.sequenceOptionalsAgain: OK, passed 100 tests.                      
  + Optional.joinOptionals: OK, passed 100 tests.                               
  + Optional.filterOptionals: OK, passed 100 tests.                             
  + Optional.fillOptionals: OK, passed 100 tests.                          
  + Optional.fillOptionalsAgain: OK, passed 100 tests.                     
  + Optional.mapOptionalsYetAgain: OK, passed 100 tests.                        
  + Optional.mapWhatever: OK, passed 100 tests.          
  */  
}
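
For readers new to the encoding, here is a small sketch of the idea behind the exercises, without giving any of them away. Opt and describe are hypothetical names; the point is only that a value is defined entirely by how it responds to fold.

```scala
object FoldEncodingSketch {
  // Hypothetical minimal copy of the encoding: one abstract method, no subtypes.
  trait Opt[A] {
    def fold[X](some: A => X, none: => X): X
  }

  // "some" answers fold by applying the first argument to its value.
  def some[A](a: A): Opt[A] = new Opt[A] {
    def fold[X](some: A => X, none: => X): X = some(a)
  }

  // "none" answers fold by returning the second argument.
  def none[A]: Opt[A] = new Opt[A] {
    def fold[X](some: A => X, none: => X): X = none
  }

  // Any consumer is written by supplying one handler per case.
  def describe(o: Opt[Int]): String =
    o.fold(n => "got " + n, "nothing")
}
```

Here describe(some(3)) yields "got 3" and describe(none) yields "nothing", with no pattern matching involved.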

by Tony Morris at September 01, 2010 05:45 AM

Graham Lea

Things I Love About IntelliJ IDEA: The Impossible is Possible

The features I've written about so far are the ones that I use really often and that I find make the biggest difference for me when I use IntelliJ IDEA. Some of them are a bit mundane, but when I have to code without them I often get the feeling of being bogged down in typing boilerplate.

But there is another side to IntelliJ - an exciting side! IDEA has a whole host of features which, to me, are bordering on mind-blowing. There are quite a few features in IDEA that are so impressive that I don't just think, "That's a clever idea. I'm glad they put that in." What I really think is, "How the hell did they even IMAGINE that that was POSSIBLE?!" Seriously, the Jetbrains guys have implemented tools that I would have laughed at if they were proposed to me, just from the sheer audacity of the ideas and the complexity that I imagine would be involved in implementing them.

One domain where they have been impressing me lately is in their Hibernate integration. If you tell IDEA about a datasource that your Hibernate entities should be able to map to, it can do some cool things like tell you that no table exists with the same name as your entity, or that the column name in your @Column annotation doesn't match any column in the database table. There's also the HQL console, where you can run HQL queries against your database, complete with code-completion from your domain model. Both of these are "Cool, that's useful" features, but I imagine your mouth is not yet agape.

Then they go to the next level: the same code completion that's available in the HQL editor is also available right in the middle of your code using language injection. That's right, those Strings in your Repository classes are no longer just Strings - that first double quote now denotes the point at which Java stops and HQL starts, and IDEA will give you all the help you need to write a query that is correct both syntactically and in respect of your domain model, with name completion, underlining of errors and meaningful error messages. (To enable HQL editing in a class just hit Alt-Enter, select 'Inject Language', then 'HQL'.)

But after the Jetbrains guys implemented that awesome integration, they got bored, so they had to think of something even more amazing to integrate. Now I see that in the upcoming IntelliJ IDEA X, they have added similar integration to queries created using Hibernate's Criteria API. The ingenuity involved in this feature astounds me - IDEA is observing the annotations on Hibernate entities, parsing Java code and recognising a Criteria query, then building enough understanding of that query and linking it in with the knowledge from the entity classes to be able to make suggestions about the possible and legal properties and relationships that could be queried. It's another example of IDEA being so intelligent that it probably knows more about my code than I do, and the guys have probably re-written half of Hibernate in order to give it to us.

As I said, this is just one domain (no pun intended) that has been impressing me lately. IntelliJ IDEA is full of little nuggets like this where the Jetbrains developers have dreamed up and then delivered features that are useful so far beyond any little improvements or tweaks that I could have requested as a day-to-day user. This isn't the product of a team of developers tirelessly implementing feature requests from users - it's the constantly evolving creation of a group of highly visionary people who intimately understand the day-to-day tasks of a coder but are also able to rise above the detail and imagine a better future where the tools don't just help out with the boilerplate but also start to take on some of the heavy lifting.


by Grazer (noreply@blogger.com) at September 01, 2010 01:32 AM

August 31, 2010

Dave Ray

Scripting JSoar

In simpler times (say 2001) Soar was just Tcl. That is to say, Soar was a module, dynamically loaded into a Tcl interpreter at run-time. When loaded, Soar added a bunch of useful commands to the interpreter. Like run, matches, preferences, and probably most importantly, the sp command. When you “sourced” a Soar file, the Tcl interpreter just executed commands, loading rules, setting watch levels, etc.

The main drawback to this whole situation was that Tcl didn’t always lend itself to friendly embedding in other programs. It had funny rules about threads and, if Tk was involved, demanded to have its message queue pumped. And, of course, very few people get to know Tcl enough to like it :)

On the other hand, you could create macros for repetitive Soar structures, define new RHS functions, manipulate I/O. In short, you had the power of a full programming language mixed in with your Soar code.

With Soar 8.6 Soar’s tight integration with Tcl was broken, replaced by SML and a stricter command interpreter. It still looked like Tcl commands, but there were no Tcl control structures, variables, etc. Way it goes. When I initially started work on JSoar, I needed to quickly bootstrap a command interpreter so I could load existing code into the kernel. I turned to Tcl, in the form of Jacl, a Java implementation of Tcl. It saved a lot of time and, since Soar’s syntax was still basically Tcl, no one would really notice.

Of course, as I mentioned, no one wants Tcl, so over the last couple weeks, I’ve added a new scripting layer to JSoar. This time, I’m taking advantage of the Java Scripting API, JSR-223. This allows any scripting language with a JSR-223 implementation to be pretty seamlessly accessed from Java (and vice versa). With this new capability, it’s now possible to automate Soar agents, implement simple agent environments, and extend SoarUnit testing to include I/O, all from within a Soar source file. All with a variety of languages including Ruby, JavaScript, Python, Clojure, Groovy, etc.
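
The JSR-223 side of this can be sketched with nothing but the standard javax.script package. This is not JSoar code: evalWith is a hypothetical helper, and which engine names resolve depends entirely on the JVM and the classpath.

```scala
import javax.script.ScriptEngineManager

object Jsr223Sketch {
  // Looks up an engine by name and evaluates code with it.
  // Returns None when no engine with that name is installed
  // (getEngineByName returns null in that case).
  def evalWith(engineName: String, code: String): Option[AnyRef] =
    Option(new ScriptEngineManager().getEngineByName(engineName))
      .map(engine => engine.eval(code))
}
```

A host application only needs the engine name; adding a language means putting its JSR-223 implementation on the classpath.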

A scripting engine (a language implementation) is invoked with the script command:

script javascript {
   soar.onInput(function(e) {
      soar.wmes.add("my-input", "hello");
   });
   soar.onOutputCommand("say", function(e) {
      soar.print("The agent says: " + e.greeting);
   });
}

This little bit of code sets up an input phase callback and creates a WME on the agent’s input-link. It also handles an output command called “say”. The equivalent Java code would be … more ceremonious. Not to mention that setting up a new project, compiling, etc. is a major hassle.
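For comparison, here is a rough sketch of what the plain-Java side of JSR-223 looks like. It uses only the standard javax.script API; the class and method names (ScriptingSketch, installedEngines, evalWith) are made up for illustration and are not part of JSoar. Which engines are available depends on the JVM and the classpath, so the sketch treats a missing engine as a normal case.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper, not JSoar code: shows the standard JSR-223
// lookup-and-eval cycle.
class ScriptingSketch {

    // Names of the script engines this JVM can find (may be empty).
    static List<String> installedEngines() {
        List<String> names = new ArrayList<>();
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            names.add(factory.getEngineName());
        }
        return names;
    }

    // Evaluates code with the named engine; returns null if no such
    // engine is installed.
    static Object evalWith(String engineName, String code) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return null;
        }
        try {
            return engine.eval(code);
        } catch (ScriptException e) {
            throw new IllegalArgumentException("script failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Engines: " + installedEngines());
        System.out.println("Result: " + evalWith("javascript", "1 + 2"));
    }
}
```

On a JVM without a JavaScript engine the second line simply prints a null result, which is why the lookup is checked before calling eval.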

As an example, I’ve implemented a simple waterjugs environment in JavaScript, Ruby, and Python. Here are some things you can do:

  • Generate input
  • Handle output commands
  • Auto-convert hashes (JavaScript objects, Python dicts, or Ruby hashes) to input structures
  • Install new RHS functions
  • Add new commands
  • and on and on

Also, with maybe a little more work, I might have a pretty good story for dealing with I/O in SoarUnit tests. Stay tuned.

More detailed info on JSoar scripting support can be found on the JSoar wiki.

by dave at August 31, 2010 03:15 AM

August 27, 2010

James Iry

Why Scala's "Option" and Haskell's "Maybe" types will save you from null

Cedric Beust makes a bunch of claims in his post on Why Scala's "Option" and Haskell's "Maybe" types won't save you from null; commenters say he doesn't get it, and he says nobody has argued with his main points. So I thought I'd do a careful investigation to see what, if anything, is being missed. First, right off the top here: Scala has true blue Java-like null; any reference may be null. Its

by James Iry (noreply@blogger.com) at August 27, 2010 08:55 PM

Coderspiel

"I’ll make you a deal: once Mac Rumors and Macworld update their respective articles that make..."

“I’ll make you a deal: once Mac Rumors and Macworld update their respective articles that make exactly the same points I did, I’ll update mine. Until then, I won’t defer to the wisdom of the internet’s armchair licensing experts.”

- Adventures in Oblivious Self-Contradiction

August 27, 2010 06:19 PM

Ruminations of a Programmer

Random thoughts on Clojure Protocols

Great languages are those that offer orthogonality in design. Stated simply it means that the language core offers a minimal set of non-overlapping ways to compose abstractions. In an earlier article A Case for Orthogonality in Design I discussed some features from languages like Haskell, C++ and Scala that help you compose higher order abstractions from smaller ones using techniques offered by those languages.

In this post I discuss the new feature in Clojure that just made its way in the recently released 1.2. I am not going into what Protocols are - there are quite a few nice articles that introduce Clojure Protocols and the associated defrecord and deftype forms. This post will be some random rants about how protocols encourage non intrusive extension of abstractions without muddling inheritance into polymorphism. I also discuss some of my realizations about what protocols aren't, which I felt was equally important along with understanding what they are.

Let's start with the familiar Show type class of Haskell ..

> :t show
show :: (Show a) => a -> String

Takes a type and renders a string for it. You get show for your class if you have implemented it as an instance of the Show type class. The Show type class extends your abstraction transparently through an additional behavior set. We can do the same thing using protocols in Clojure ..

(defprotocol SHOW 
  (show [val]))

The protocol definition just declares the contract without any concrete implementation in it. Under the covers it generates a Java interface which you can use in your Java code as well. But a protocol is not an interface.

Adding behaviors non-invasively ..

I can extend an existing type with the behaviors of this protocol. And for this I need not have the source code for the type. This is one of the benefits that ad hoc polymorphism of type classes offers - type classes (and Clojure protocols) are open. Note how this is in contrast to the compile time coupling of Java interface and inheritance.

Extending java.lang.Integer with SHOW ..

(extend-type Integer
  SHOW
  (show [i] (.toString i)))

We can extend an interface also. And get access to the added behavior from *any* of its implementations .. Here's extending clojure.lang.IPersistentVector ..

(extend-type clojure.lang.IPersistentVector
  SHOW
  (show [v] (.toString v)))

(show [12 1 4 15 2 4 67])
> "[12 1 4 15 2 4 67]"

And of course I can extend my own abstractions with the new behavior ..

(defrecord Name [last first])

(defn name-desc [name]
  (str (:last name) " " (:first name)))

(name-desc (Name. "ghosh" "debasish")) ;; "ghosh debasish"

(extend-type Name
  SHOW
  (show [n]
    (name-desc n)))

(show (Name. "ghosh" "debasish")) ;; "ghosh debasish"

No Inheritance

Protocols help you wire abstractions that are in no way related to each other. And it does this non-invasively. An object conforms to a protocol only if it implements the contract. As I mentioned before, there's no notion of hierarchy or inheritance related to this form of polymorphism.

No object bloat, no monkey patching

And there's no object bloat going on here. You can invoke show on any abstraction for which you implement the protocol, but show is never added as a method on that object. As an example try the following after implementing SHOW for Integer ..

(filter #(= "show" (.getName %)) (.getMethods Integer))

will return an empty list. Hence there is no scope for *accidentally* overriding someone else's monkey patch on some shared class.
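The same idea can be sketched in Java as a loose analogy. This is not how Clojure implements protocols; the ShowProtocol class and its extend and show methods are hypothetical names. The point it illustrates is that the implementations live in a table outside the extended classes, so Integer itself never gains a show method.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical Java analogy of a protocol: implementations are kept in a
// table outside the extended classes, so nothing is added to Integer or List.
class ShowProtocol {
    private static final Map<Class<?>, Function<Object, String>> IMPLS = new LinkedHashMap<>();

    // Registers an implementation for a class or interface we do not own.
    static <T> void extend(Class<T> type, Function<? super T, String> impl) {
        IMPLS.put(type, value -> impl.apply(type.cast(value)));
    }

    // Uses the first registered type that the value is an instance of.
    static String show(Object value) {
        for (Map.Entry<Class<?>, Function<Object, String>> entry : IMPLS.entrySet()) {
            if (entry.getKey().isInstance(value)) {
                return entry.getValue().apply(value);
            }
        }
        throw new IllegalArgumentException("No SHOW implementation for " + value.getClass());
    }

    public static void main(String[] args) {
        extend(Integer.class, i -> Integer.toString(i));
        extend(List.class, list -> list.toString());
        System.out.println(show(42));
        System.out.println(show(List.of(12, 1, 4)));
    }
}
```

Registering List.class covers every List implementation, which mirrors extending an interface such as clojure.lang.IPersistentVector above.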

Not really a type class

Clojure protocols dispatch on the first argument of the methods. This keeps them from offering the full power of Haskell / Scala type classes. Consider the counterpart of Show in Haskell, which is the Read type class ..

> :t read  
read :: (Read a) => String -> a

If your abstraction implements Read, then the exact instance of the method invoked will depend on the return type. e.g.

> [1,2,3] ++ read "[4,5,6]"
=> [1,2,3,4,5,6]

The specific instance of read that returns a list of integers is automatically invoked here. Haskell maintains the dispatch match as part of its global dictionary.

We cannot do this with Clojure protocols, since they are unable to dispatch based on the return type. Protocols dispatch only on the first argument of the function.
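To make the contrast concrete, here is a small Java sketch. Java cannot select an implementation from the expected return type either, so the caller passes the "instance" explicitly, which is roughly the bookkeeping Haskell does behind the scenes. All names here (ReadSketch, Read, INT, listOf) are hypothetical, and the parser handles only flat, comma-separated lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the Read "instance" is an ordinary object that the
// caller has to choose and pass in by hand.
class ReadSketch {
    interface Read<A> {
        A read(String text);
    }

    // Instance for integers.
    static final Read<Integer> INT = text -> Integer.parseInt(text.trim());

    // Builds an instance for "[a,b,c]" style lists from an element instance.
    static <A> Read<List<A>> listOf(Read<A> element) {
        return text -> {
            String body = text.trim();
            body = body.substring(1, body.length() - 1); // drop '[' and ']'
            List<A> result = new ArrayList<>();
            if (!body.trim().isEmpty()) {
                for (String part : body.split(",")) {
                    result.add(element.read(part));
                }
            }
            return result;
        };
    }

    public static void main(String[] args) {
        List<Integer> xs = new ArrayList<>(List.of(1, 2, 3));
        // The caller names the instance; nothing is inferred from the target type.
        xs.addAll(listOf(INT).read("[4,5,6]"));
        System.out.println(xs);
    }
}
```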


by Debasish (ghosh.debasish@gmail.com) at August 27, 2010 04:45 PM

Scala Ide

Google Summer of Code 2010 @ Scala IDE

UPDATE: an update site including Jin’s GSoC work is now available.

I was lucky enough to be the mentor on a Scala IDE for Eclipse project which was selected for this year’s Google Summer of Code. Jin Mingjian was the student for this project, and it’s been wonderful to work with him over the last few months.

Jin completed his Doctoral degree at the Institute of Electrical Engineering in the Chinese Academy of Sciences this summer … having to deal with his PhD defence at the same time as the GSoC project was quite a tall order!

He had this to say about himself and about his project …


I have been involved with the Eclipse community for a long time. When I discovered Scala this year, I realized that it was the language that I need. The openness of Scala struck me, as much as the technical merits of the language itself. The Scala IDE community combines two great open source communities and this summer’s GSoC gave me the opportunity to work on it. I like being part of this community. Join us to help to create great Scala tools!


Now that the GSoC ‘pencils down’ date has arrived, it is time to report to the community about the achievements of my Scala IDE project. The goal of this project is to provide Advanced Semantic Tools for the Scala IDE for Eclipse.

First, I am very grateful for the many helpful suggestions, advice and encouragement from my mentor Miles Sabin and others in the community. Without their help, it would not have been possible to finish this project.

There is a wiki page on the Scala IDE project site which explains some aspects of the features. Let me introduce them by starting with an example code snippet,

object String2Int {
  val a = "1";
  var b: java.lang.String = "3";
  implicit val somewords = new SomeWords("yeah!")

  implicit def string2Int(s: String) = {
    ((s.toInt)+1)
  }

  def main(args:Array[String]) {
    printInt(string2Int("3"))
    printInt(a+b)
    printIntWithSomeWords("5")(new SomeWords("..."))
    printIntWithSomeWords(2)

    val map = Map( 1 ->"one",2->"two")
  }

  def printInt(a:Int) {
    println(a)
  }

  def printIntWithSomeWords(a: Int)(implicit sw: SomeWords) {
    println(a)
    println(sw.words)
  }

  class SomeWords(val words: String)
}

This code stems from an example in Programming in Scala. But I modified it to be “more implicit”.

When you open the editor, all the places where implicit conversions and implicit parameters occur (not the implicit definitions themselves) have been highlighted, as follows,

Implicit visualization

Perhaps you don’t like the default green squiggly underlines as the visual indicator of implicit occurrences? Then you can customize the highlight style via Windows -> Preferences -> Scala -> Editor -> Syntax Coloring as follows,

Preferences

You can hide the highlighting by clearing the checkboxes and setting the underline style to “none”. In future we will add an enable/disable check box to simplify this process.

Along with the visual annotation for implicits, we provide automation around the annotation, known in Eclipse terminology as “quick assists”. The default shortcut for quick assist is “Ctrl+1″. The following screenshot shows the quick assist for an implicit conversion,

Implicit quick assist

After applying this quick assist we get the explicit version of the above implicit conversion,

Inline expansion

The following screenshot shows the quick assist for implicit parameters,

Implicit parameter quick assist

After applying the quick assist, we see the implicit arguments defined in scope explicitly inlined as follows,

After inlining

One of Scala’s important features is support for type inference. With type inference we are not forced to write many tedious type decorations. However, Scala is a strongly typed language and in some contexts it is better to declare the type explicitly, for example method return types.

So we have added functionality which provides explicit type completion, activated automatically by typing “:”. The current implementation is limited to val and method definitions for which the type can be inferred.

The following screenshot shows that I want to add an explicit type for val map. So I type the “:” after the map identifier. Then I get the inferred type quick assist here,

Inferred type completion

When I press Enter, we see the following,

Inferred type completion after expansion

Currently the explicit type is a little long. In future we will allow the qualified part to be moved to the import section.

It is a best practice to add explicit return types for methods (other than for methods which return Unit type). And sometimes you forget to add the “=” when defining a method. This makes the whole method return Unit unfortunately. With our new functionality and that best practice quick assist, you can find and fix this kind of mistake easily. For example, suppose you wanted to add the type for the method by “:” auto-completion. But you find that the proposed type is Unit. Then you will realize that you missed the “=”,

Inferred return type completion

Correct it and add the right type again as follows,

Inferred return type completion after expansion

Adding explicit types for val and method definition is a piece of cake now. In future we will improve explicit type completion to propose types from the classpath if the exact type cannot be inferred.

What do you think about these new features? A build/update site of the GSoC branch will be available shortly and any feedback will be appreciated. When the features have been proven stable, we will merge them into the main distribution. Then everyone can try them out!

by Miles Sabin at August 27, 2010 12:16 PM

Mathias

Scala rich wrapping performance

Recently I repeatedly found myself wondering whether relying on Scala's rich wrappers for convenience was actually coming with some performance cost or not. Not that it really matters in all but some really time-critical edge cases, however, I remember reading something about modern JVMs being able to completely optimize these wrapping constructs away (by means of so-called “escape analysis”?), so maybe there...

August 27, 2010 07:00 AM

August 26, 2010

Froth and Java

Google App Inventor and node.js

My invite to Google's new, easy to use App Inventor for Android came through recently and within minutes I had hello world running. Even better, after a few more minutes of playing with the programming toybox I had an easy to use tool.

For a while now I've wanted a simple tool that would let me go around scanning the barcodes of books that I own and saving them into a text file for importing into LibraryThing, but I haven't had the spare time to learn enough Android programming to build one. With a combination of a simple client built with App Inventor and a simple server written with Node, I had the tool I've been wanting in a matter of minutes.

The AppInventor Client

AppInventor comes in two parts - an "Interface Builder" website and a visual coding environment where you connect up blocks of code (that sits on top of the Kawa dialect of Scheme - which makes App Inventor coding a lot like a combination of Lego and Lisp). I quickly added a label (to display the scanned ISBN number), a button (to start the scan) and then two non-visible components: a barcode scanner and a TinyWebDB. Once the UI was done it was off to the coding environment to wire everything up.

The only not quite straightforward part of all this is the TinyWebDB component, as it's App Inventor's only built-in way of accessing a web service (other than Twitter, which has its own component). However, TinyWebDB isn't a generic web service client; it's meant for storing and retrieving simple key/value pairs, so I needed to write my server with that in mind.

As the coding environment is visual, it really doesn't lend itself to a blog post. Fortunately, you can download the app source code in a zip file and marvel at the underlying code. The meat of the app is:


<com.google.youngandroid.runtime>;;; Screen1
(do-after-form-creation (set-and-coerce-property! Screen1 'Title "Screen1" 'text)
)
;;; Label1
(add-component Screen1 Label Label1 (set-and-coerce-property! Label1 'Text "Text for Label1" 'text)
)
;;; Button1
(add-component Screen1 Button Button1 (set-and-coerce-property! Button1 'Text "Scan" 'text)
)
(define-event Button1 Click()
(call-component-method 'BarcodeScanner1 'DoScan (list)
*no-coercion*)

)
;;; BarcodeScanner1
(add-component Screen1 BarcodeScanner BarcodeScanner1 )
(define-event BarcodeScanner1 AfterScan( result )
(call-component-method 'TinyWebDB1 'StoreValue (list "isbn" (get-property BarcodeScanner1 Result)
)
'( text any)
)

(set-and-coerce-property! Label1 'Text (get-property BarcodeScanner1 Result)
'text)

)
;;; TinyWebDB1
(add-component Screen1 TinyWebDB TinyWebDB1 (set-and-coerce-property! TinyWebDB1 'ServiceURL "http://192.168.1.101:8765" 'text)
)
(init-runtime #f)
</com.google.youngandroid.runtime>

While that's not quite the entire source, it does give some idea of what goes on underneath.

Node(.js)

TinyWebDB will post a key/value pair to whatever server you want. So I needed a simple server that would sit there, extract the value and output to stdout (which will then get piped into a file). This seemed like the perfect opportunity to try Node (I can't tell if it's supposed to be called node or node.js), which is a Javascript execution environment based on the V8 engine for writing network programs. A quick and dirty Javascript hack produced:

var http = require('http');
http.createServer(function (request, response) {
  request.on('data', function(chunk) {
    // The body arrives form-encoded; take the second key=value pair.
    var b = chunk.toString('ascii', 0, chunk.length);
    var v = b.split("&")[1];
    var vv = v.split("=")[1];
    // Strip the URL-encoded double quotes (%22) around the value.
    var vvv = vv.replace("%22", "");
    var vvvv = vvv.replace("%22", "");
    console.log(vvvv);
  });
  response.writeHead(200, {'Content-Type': 'text/plain'});
  response.end('Hello, World\n');
}).listen(8765);


Undoubtedly a regexp could have been used, but would have required more mental effort than I really wanted. In case you're wondering about the response, TinyWebDB apparently essentially disregards it, so sending back Hello, World doesn't hurt.
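For readers who would rather not rely on the position of the pair, the same extraction can be sketched with proper form decoding. The sketch is in Java rather than JavaScript, and it rests on an assumption: that the posted body looks like tag=isbn&value=%22...%22, which is inferred from the hack above rather than taken from TinyWebDB documentation. FormBody, field and unquote are hypothetical names.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: picks one field out of a form-encoded body by name
// instead of by position.
class FormBody {

    // Returns the decoded value of the named field, or null if it is absent.
    static String field(String body, String name) {
        for (String pair : body.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2
                    && URLDecoder.decode(kv[0], StandardCharsets.UTF_8).equals(name)) {
                return URLDecoder.decode(kv[1], StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    // Removes one pair of surrounding double quotes, if present.
    static String unquote(String text) {
        if (text.length() >= 2 && text.startsWith("\"") && text.endsWith("\"")) {
            return text.substring(1, text.length() - 1);
        }
        return text;
    }

    public static void main(String[] args) {
        // Assumed body shape; the field names are a guess based on the server above.
        String body = "tag=isbn&value=%229780596155957%22";
        System.out.println(unquote(field(body, "value")));
    }
}
```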

Conclusion

In a matter of minutes I had a useful (at least to me) tool. I have a feeling that this is what App Inventor will mostly be used for - simple apps that satisfy a single person's needs. It's not going to cause the "Animated Cat App Apocalypse" in the Android Market that some commentators have worried about. I have no idea if it will be of any use as the "Logo for the modern age" that was the original reason for its existence (there are still references to "Young Android" in the generated code and the application icons are of an adult and child robot). I would call it the "Visual Basic for the modern age" except that it feels wrong comparing something that "is part of an ongoing movement in computers and education that began with the work of Seymour Papert and the MIT Logo Group in the 1960s" with Visual Basic.

by Scot McSweeney-Roberts (noreply@blogger.com) at August 26, 2010 11:04 PM

Coderspiel

"On the other hand, because many of Scala’s concepts are very general and orthogonal, they can be..."

“On the other hand, because many of Scala’s concepts are very general and orthogonal, they can be combined in a large number of ways. So it is indeed true that almost anything can be achieved in different ways in Scala. It takes practice to discover that often some ways are preferable to others. And it takes a certain amount of tolerance to accept that sometimes there’s more than one route to a good design.”

- Simple or Complex?

August 26, 2010 01:10 PM

Erik Engbrecht

Scala is for VB programmers

Normally I don't go for flame bait titles. But I haven't finished my morning coffee yet so I can't help myself. There's once again a debate raging across the internet about whether Scala is more or less complex than Java, along with the more nuanced argument that yes, it is more complex but only framework and library developers have to contend with this complexity. This argument usually happens when someone posts a little bit of code like this:

def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That

And then someone responds like this:

Why do not people realize that Java is too difficult for the average programmer? That is the true purpose of Scala, to escape the complexity of Java code! Framework code in Scala, with heavy use of implicit keywords and all kinds of type abstractions, is very difficult. This is correct, but this code is not meant for innocent eyes. You do not use that sort of code when you write an application.

I've seen this type of thinking before. A few years ago I had a bout of insanity and led an ASP.NET project using ASP.NET 2.0. I had no .NET experience prior to this project. The project failed, although the reasons for that failure are legion and unimportant here. But I noticed something about ASP.NET developers: they have no clue how the technology works. It's a black box. Do you know why? Because it is a black box. I searched and searched and couldn't even find a good illustration of the lifecycle for an ASP.NET page that's using events. This type of information is put front and center in the Java world. It's absolutely buried in the Microsoft world. Or at least the parts of it that target the hordes of VB programmers that are undyingly loyal to MS. The framework is magic dust created by the great wizards in Redmond so that you can build solutions for your customers. Do not question the dust. Think about VB. Or, no don't, it might damage your brain. My coffee hasn't quite kicked in, so I should have some immunity, so I'll do it for you. VB is a black box (well, at least old school VB6). It was designed to allow people who do not know how to really program, and who will probably never know how to program, to create applications. It's a completely flat, opaque abstraction. The dichotomy between the application developer and the framework developer is as high and as thick as the gates of Mordor.

There are many people in the Scala community that claim Scala's complexity can be hidden from the application programmer. I don't believe them, but there's a chance that they are right. It's technically feasible, and I can see how it could happen if Scala started attracting lots of VB programmers. I can't see how it's going to attract lots of VB programmers, but apparently many people in the Scala community think Scala is for VB programmers. So we'll just have to wait and see...

by Erik Engbrecht (noreply@blogger.com) at August 26, 2010 11:44 AM

scala-lang.org

Simple or Complicated?

Recently we have seen a heated debate on whether Scala is too complicated for normal programmers or whether it’s in fact a rather simple language to program in. Here are two representative blogposts of the debate. The comments on the posts are also worth reading. I have written up some thoughts here.

by odersky at August 26, 2010 10:28 AM

The Careful Programmer

From Clojure to Ruby

Ever since I've received Stuart Halloway's Programming Clojure, I've been reading, watching and listening about Clojure. And dabbling with it a little.

A few months ago, I started learning Ruby. My main learning experience has been The Ultimate Book to Ruby Programming by Satish Talim and his free online class Core Ruby.

I just finished the Core Ruby Class 18th batch and I must say it's excellent. The class material not only makes a good introduction, it's also my first reference for looking up core concepts. The class ran for 8 weeks. At first, I thought that was too long to cover the basics, but if you do have a day job, the class will be taking plenty of evening and weekend study time.

Each week has assigned reading materials, exercises and a quiz. To really get more from the class than you'd get from just reading a book, you have to do the exercises, post your solutions, discuss them and read solutions from others. The class has competent and friendly mentors that will really help you get from a "Classic OO language in Ruby" solution to an idiomatic Ruby solution. I've also had the chance to have good classmates.

I must stress the point that you can only get back from the class as much as you invest yourself in it. Workload would vary from week to week, but I consider 5 hours a week an absolute minimum and 10 hours is better. I've once spent 8 hours just on a bonus exercise (The Playfair Cypher, if you must know.) YMMV of course!

The bottom line is if you're serious about learning Ruby and you're ready to put in the hours and your passion, the Core Ruby Class is an excellent way to start your journey.

Going back to Clojure, I got some advantages from learning it that translate to Ruby since they have some things in common.

The first thing I noticed is both languages use keywords and with the same syntax too! Hello, I'm a :keyword.

Keywords can be seen as a way to use strings as constants. The advantage over using normal strings is that two keywords with the same name are guaranteed to be the same object, so you can compare them using reference equality instead of value equality. It's typical to use keywords for hash keys. It took me a while to wrap my head around the keyword concept in Clojure, so I was glad to be able to recycle it.
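For readers coming from Java, the closest everyday analogue is probably string interning. The sketch below is only an analogy and says nothing about how Ruby or Clojure implement keywords; KeywordSketch and sameObject are hypothetical names. It shows the difference between value equality and reference equality, and how interning makes equal strings share one object.

```java
// Hypothetical illustration: interned strings behave a little like keywords,
// in that equal names resolve to one shared object.
class KeywordSketch {

    // Reference equality: true only if both arguments are the same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal contents.
        String a = new String("keyword");
        String b = new String("keyword");

        System.out.println(a.equals(b));                          // value equality
        System.out.println(sameObject(a, b));                     // distinct objects
        System.out.println(sameObject(a.intern(), b.intern()));   // one shared object
    }
}
```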

Another common thing is the use of separators instead of camel cases for variable and function names: encrypt_message in Ruby or encrypt-message in Clojure. Also, putting ! or ? at the end of the function to express mutation or a predicate. Example: "Hello".empty? in Ruby or (empty? "Hello") in Clojure. Funnily enough, I'm testing these examples with Redcar, a Ruby editor running on JRuby featuring both a Ruby and a Clojure REPL.

As for conditional testing, Ruby and Clojure consider false and nil to be false, and everything else to be true.

Well, there are plenty of differences too of course and I got bitten more than once. They say that when you learn your third spoken language, you'll mix it up with your second spoken language at first. Likewise, even though one is a Lisp dialect and the other one isn't, more often than not, I'll declare a Ruby function like this:
def repeat_function some_function time_interval total_interval do...

I forget the commas between the arguments.

Strange, but true.

by The Careful Programmer (noreply@blogger.com) at August 26, 2010 02:53 AM

August 24, 2010

Michid's Weblog

So Scala is too complex?

There is currently lots of talk about Scala being too complex. Instead of more arguing I implemented the same bit of functionality in Scala and in Java and let everyone decide for themselves.

There is some nice example code in the manual for The Scala 2.8 Collections API which partitions a list of persons into two lists of minors and majors. Below are the fleshed out implementations in Scala and Java.

First Scala:

object ScalaMain {
  case class Person(name: String, age: Int)

  val persons = List(
    Person("Boris", 40),
    Person("Betty", 32),
    Person("Bambi", 17))

  val (minors, majors) = persons.partition(_.age <= 18) 

  def main(args: Array[String]) = {
    println (minors.mkString(", "))
    println (majors.mkString(", "))
  }
}

And now Java:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class Person {
    private final String name;
    private final int age;

    public Person(String name, int age) {
        super();
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        else if (other instanceof Person) {
            Person p = (Person) other;
            return (name == null ? p.name == null : name.equals(p.name))
                    && age == p.age;

        }
        else {
            return false;
        }
    }

    @Override
    public int hashCode() {
        int h = name == null ? 0 : name.hashCode();
        return 39*h + age;
    }

    @Override
    public String toString() {
        return new StringBuilder("Person(")
            .append(name).append(",")
            .append(age).append(")").toString();
    }
}

public class JavaMain {

    private final static List<Person> persons = Arrays.asList(
        new Person("Boris", 40),
        new Person("Betty", 32),
        new Person("Bambi", 17));

    private static List<Person> minors = new ArrayList<Person>();
    private static List<Person> majors = new ArrayList<Person>();

    public static void main(String[] args) {
        partition(persons, minors, majors);
        System.out.println(mkString(minors, ","));
        System.out.println(mkString(majors, ","));
    }

    private static void partition(List<? extends Person> persons,
            List<? super Person> minors, List<? super Person> majors) {

        for (Person p : persons) {
            if (p.getAge() <= 18) minors.add(p);
            else majors.add(p);
        }
    }

    private static <T> String mkString(List<T> list, String separator) {
        StringBuilder s = new StringBuilder();
        Iterator<T> it = list.iterator();
        if (it.hasNext()) {
            s.append(it.next());
        }
        while (it.hasNext()) {
            s.append(separator).append(it.next());
        }
        return s.toString();
    }

}

Impressive huh? And the Java version is not even entirely correct since its equals() method might not cope correctly with super classes of Person.
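One common way to address that equals() concern is to compare runtime classes instead of using instanceof. This is a sketch under the assumption that the problem meant here is asymmetry between Person and a subclass; StrictPerson and StrictEmployee are hypothetical names, not part of the example above.

```java
import java.util.Objects;

// Hypothetical variant of Person: equals() only accepts objects of exactly
// the same runtime class.
class StrictPerson {
    private final String name;
    private final int age;

    StrictPerson(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        // getClass() instead of instanceof keeps equals() symmetric
        // between a class and its subclasses.
        if (other == null || getClass() != other.getClass()) {
            return false;
        }
        StrictPerson p = (StrictPerson) other;
        return age == p.age && Objects.equals(name, p.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }
}

// Hypothetical subclass used only to demonstrate the symmetry.
class StrictEmployee extends StrictPerson {
    StrictEmployee(String name, int age) {
        super(name, age);
    }
}
```

The trade-off is that a StrictEmployee never equals a StrictPerson with the same name and age, which may or may not be what an application wants.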


Filed under: Uncategorized Tagged: Java, Scala

by michid at August 24, 2010 09:04 PM

Stephan Schmidt

Interface vs. Implementation Dependencies in Java

I often use the notion of interface and implementation dependencies, where interface dependencies are mostly always smaller. I think this is a very important concept to understand for Java developers, and although it seems obvious and self evident, many developers do not think along those lines and still couple classes too tightly. Take this example: class CassandraStorage [...]

by stephan at August 24, 2010 02:58 PM

Graham Lea

Things I Love About IntelliJ IDEA: Smart Intentional Programming Support

Intentional Programming or "Coding by Intention" is a style of code authoring where, rather than creating low-level parts of code (the building blocks) and then writing the higher-level code (the orchestrators) to bring it together afterwards, you start by writing the high-level code, referencing methods and data that don't yet exist, building up a complete view of the big picture, and then implement the lower-level functions needed to make the big picture work. The main benefit, as you can probably see from my description, is that you get the big picture right and then shape the details to fit, rather than getting the details "right" and then making the big picture fit. **

IntelliJ IDEA has always had great support for coding by intention. It was one of the first 'wow' moments I had when I started using it in the early 00s. Some people have probably never coded without automatic importing, but for those of us who had, the pop-up question "Do you want to import java.util.Properties?" was like black magic. The ability to create a method based on a sample invocation was also in IntelliJ from very early on, and it was quite revolutionary at the time.

What I love about IntelliJ's intentions is that they really do make the product seem intelligent. You get the impression that the IDE knows just as much about your code as you do, perhaps even more. A great example of this is if you use intentional programming to put() an entry into a non-existent Map and then use IDEA's 'Create Field' intention to create the map.



IDEA guesses (based on the fact that you are invoking a method called put() with two arguments) that maybe the field you want to create is a Map, so it makes that its first suggestion. But it also goes the extra step of suggesting type arguments for the K and V parameters on the Map interface that match the argument types in your intentional code, so that the whole type declaration will be complete simply by your pressing Enter.



Seeing as I've described how IDEA figures this out, it doesn't seem all that magical, but it's actually really useful. IDEA saves you from having to type nearly anything about the new field except perhaps to initialise it with a suitable Map implementation, which is actually just as easy as typing "new " and then hitting Ctrl-Shift-Space.

So, what if you try the same trick with Eclipse?
Eclipse has intentions, but compared to IDEA they're kind of brain-dead…



So Eclipse will make you a field with the right name, but it just declares it with a type of Object. So you have to go up to the top and type in the type that you actually want to use, including the type parameters for a Map, yourself. Is that really intentional programming, or does it just save you from copy-pasting the field name?

Another example of IDEA's type-aware goodness can be seen by comparing the expansion of IDEA's "iter" template with Eclipse's "fore" template. Both produce the same Java 5 for-each loop, but IDEA's combination of type-aware template completion and smart name suggesting results in far less typing than the corresponding Eclipse template.

The JetBrains guys are so mad about intentional programming that they've dedicated a whole page of their documentation to showing off their many "intentions".

** If you want to, you can read more about coding by intention here


by Grazer (noreply@blogger.com) at August 24, 2010 02:01 PM

August 23, 2010

Stephan Schmidt

Singletons without Singletons: Scala Type Classes

I was using Guice in a project recently. Objects that did not come out of Guice did not get their dependencies injected. So I had a global singleton where objects could get their dependencies. Yeah yeah I know singletons are bad and stuff. It would be nicer to let Guice handle it all, but sometimes this [...]

by stephan at August 23, 2010 03:07 PM

Coderspiel

Holding the Parameter

HTTP requests: you want to serve them. You want to respond in the 200s. But some requests just don’t make the cut. They lack the attributes you’re looking for, or they have the attributes but in the wrong proportion. In short, some requests are just a bad match.

The Unfiltered web toolkit gives you two ways to handle request parameters. (Those two are in addition to the infinitely many ways you could handle the parameter map yourself, or even directly access the HttpServletRequest if you want to go there.) The first was really easy to build. It’s a natural extension of the pattern matching that Unfiltered uses generally:

object Number extends Params.Extract(
  "number", 
  Params.first ~> Params.int
)
class MyFilter extends unfiltered.Planify({
  case POST(Path("/hi", Params(Number(num, _), _))) => 
    ResponseString(num.toString)
})

And that would give you a filter that only responds when a parameter is supplied that’s called “number” and is an integer. It’s a little odd that the extractor must be defined in an object outside the partial function that defines request handling, but the real shortcoming of this approach is that you can’t easily define behavior for when the parameter is missing, or bad. You can nest pattern matching statements to catch some of the failure conditions, but it’s not going to be pretty (or complete).
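
The shape of this approach can be seen without Unfiltered at all. What follows is a minimal sketch, not Unfiltered's implementation: the Params alias, the Number extractor and the respond function are hypothetical names, and the parameter map is just a plain Scala Map. It shows why a failed match cannot say what went wrong.

```scala
object ExtractorSketch {
  // Assumption: request parameters modelled as a plain map of name -> values.
  type Params = Map[String, Seq[String]]

  // Hypothetical extractor: matches only when "number" is present
  // and its first value parses as an Int.
  object Number {
    def unapply(params: Params): Option[Int] =
      params.getOrElse("number", Nil).headOption.flatMap { raw =>
        try Some(raw.toInt) catch { case _: NumberFormatException => None }
      }
  }

  // The handler only sees the success case. A missing parameter and a
  // non-integer parameter both fall through to the same default branch.
  def respond(params: Params): Option[String] = params match {
    case Number(n) => Some(n.toString)
    case _         => None
  }
}
```

Here respond returns None both for an empty map and for number=abc, which is exactly the information a descriptive error response would need.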

For an internal service, or a prototype of an external one, descriptively responding to bad inputs may not be a concern. In that case, go to town with the parameter extractor object. It’s an easy and typesafe way to define and handle acceptable inputs. It offloads the effort of dealing with erroneous input to the client making the request.

But most services can’t just tell bad clients to bug off. For them, there’s QParam. It can map parameter failures to very specific error responses, no matter how many parameters fail.

For example!

<iframe frameborder="0" height="200" src="http://sourced.implicit.ly/net.databinder/unfiltered.test/0.1.4/ParamsSpec.scala.html?id=19982" width="570">If you’re not seeing sexy code here you need to click through to the actual post. </iframe>

This rather dense chunk of code will match on any GET request to “/even”, and respond with either a 200 or 400 depending on the validity of the parameters supplied. Its parameter requirements are defined in a for expression, whose yield statement is only evaluated if no errors have accumulated. After defining the requirements, you apply a set of parameters and provide a function to handle the error list if it is non-empty.

In this case we start with a parameter “number” which must parse into an int or the error object “nonnumber” will be recorded. (The type of error objects for an expression is inferred from the first error object supplied.) Secondly, it must satisfy the predicate { (_: Int) % 2 == 0 } or we’ll note that it was “odd”, and finally it must exist or else it is “missing”. If any requirement on a parameter fails, its remaining requirements are skipped and the next parameter is evaluated.
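
The rules described above can be imitated in a few lines of plain Scala. This is only a sketch of the behaviour, assuming a parameter map of type Map[String, Seq[String]]; it does not use QParam or any Unfiltered API, the names evenInt and validate are made up for illustration, and it checks for a missing value first rather than last, which yields the same error labels.

```scala
object ParamCheckSketch {
  // Assumption: request parameters modelled as a plain map of name -> values.
  type Params = Map[String, Seq[String]]

  // Hypothetical check for one named parameter. It stops at the first
  // requirement that fails and reports that failure as a label.
  def evenInt(params: Params, name: String): Either[String, Int] =
    params.getOrElse(name, Nil).headOption match {
      case None => Left("missing")
      case Some(raw) =>
        val parsed =
          try Some(raw.toInt) catch { case _: NumberFormatException => None }
        parsed match {
          case None                  => Left("nonnumber")
          case Some(n) if n % 2 != 0 => Left("odd")
          case Some(n)               => Right(n)
        }
    }

  // Check every named parameter, even if an earlier one failed, and
  // succeed only when no errors were recorded.
  def validate(params: Params, names: List[String]): Either[List[(String, String)], List[Int]] = {
    val results = names.map(name => name -> evenInt(params, name))
    val errors = results.collect { case (name, Left(err)) => name -> err }
    if (errors.isEmpty) Right(results.collect { case (_, Right(n)) => n })
    else Left(errors)
  }
}
```

A caller would map the Left side to a 400 response listing every failure and the Right side to a 200.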

How does it work? It’s a variation of the state-transformer monad, apparently. I knew how I wanted parameter evaluation to work, and tried to write it “from scratch” but that didn’t produce anything useful. So I ripped off some code from ScalaQuery (yes—the SQL API) and that somehow did work. I thought my hack job was pretty slick and showed it to Chris League, who spoke to our Meetup group about these fancy topics recently. He promptly rewrote it. After some further negotiation, this is what we ended up with:

<iframe frameborder="0" height="460" src="http://sourced.implicit.ly/net.databinder/unfiltered/0.1.4/request/params.scala.html?id=6843" width="570">Ditto.</iframe>

And so on.

In case you want to try this stuff out, we added some fun parameter gymnastics to Unfiltered’s template project. It’s a giter8 template, so if you have g8 on your path you can

$ g8 softprops/unfiltered

and be on your way. Of course, no one has g8 on their path yet, so you could be the first person in your timezone to make the exciting leap to the future of Scala applications installed and launched by sbt.

It’s been a busy summer.

August 23, 2010 02:35 PM

Tony Morris

Further understanding scala.Option

Below are 15 (probably fun) exercises for anyone interested in obtaining a deeper understanding of scala.Option and algebraic data types in general. I could write the same in Haskell but this would require either type-classes or rank-n types (GHC extension), so I thought I’d give that a miss.

Instructions are in the comments. Let me know if there are any questions.

// Scala version 2.8.0.final
// http://scalacheck.googlecode.com/files/scalacheck_2.8.0-1.8-SNAPSHOT.jar
 
 
/*
 
  Below are 15 exercises. The task is to emulate the scala.Option API
  without using Some/None subtypes, but instead using a fold (called a
  catamorphism).
 
  A couple of functions are already done (map, get)
  to be used as an example. ScalaCheck tests are given below to
  verify the work. The desired result is to have all tests passing.
 
  The 15th exercise is not available in the existing Scala API so
  instructions are given in the comments.
 
  Revision History
  ================
 
  23/08/2010
  Initial revision
 
  ----------------
 
  23/08/2010
  Fixed prop_getOrElse. Thanks Michael Bayne.
 
  ----------------
 
  26/08/2010
  Add lazy annotation to orElse method.
 
*/
 
 
trait Optional[A] {
  // single abstract method
  def fold[X](some: A => X, none: => X): X
 
  import Optional._
 
  // Done for you.
  def map[B](f: A => B): Optional[B] =
    fold(f andThen some, none[B])
 
  // Done for you.
  // WARNING: undefined for None
  def get: A =
    fold(a => a, error("None.get"))
 
  // Exercise 1
  def flatMap[B](f: A => Optional[B]): Optional[B] =
    error("todo")
 
  // Exercise 2
  // Rewrite map but use flatMap, not fold.
  def mapAgain[B](f: A => B): Optional[B] =
    error("todo")
 
  // Exercise 3
  def getOrElse(e: => A): A =
    error("todo")
 
  // Exercise 4
  def filter(p: A => Boolean): Optional[A] =
    error("todo")
 
  // Exercise 5
  def exists(p: A => Boolean): Boolean =
    error("todo")
 
  // Exercise 6
  def forall(p: A => Boolean): Boolean =
    error("todo")
 
  // Exercise 7
  def foreach(f: A => Unit): Unit =
    error("todo")
 
  // Exercise 8
  def isDefined: Boolean =
    error("todo")
 
  // Exercise 9
  def isEmpty: Boolean =
    error("todo")
 
  // Exercise 10
  def orElse(o: => Optional[A]): Optional[A] =
    error("todo")
 
  // Exercise 11
  def toLeft[X](right: => X): Either[A, X] =
    error("todo")
 
  // Exercise 12
  def toRight[X](left: => X): Either[X, A] =
    error("todo")
 
  // Exercise 13
  def toList: List[A] =
    error("todo")
 
  // Exercise 14
  def iterator: Iterator[A] =
    error("todo")
 
  // Exercise 15 The Clincher!
  // Return a none value if either this or the argument is none.
  // Otherwise apply the function to the argument in some.
  // Don't be afraid to use functions you have written.
  // Better style, more points!
  def applic[B](f: Optional[A => B]): Optional[B] =
    error("todo")
 
  // Utility
  def toOption: Option[A] = fold(Some(_), None)
}
 
object Optional {
  // Done for you
  def none[A]: Optional[A] = new Optional[A] {
    def fold[X](some: A => X, none: => X) = none
  }
 
  // Done for you
  def some[A](a: A): Optional[A] = new Optional[A] {
    def fold[X](some: A => X, none: => X) = some(a)
  }
 
  // Utility
  def fromOption[A](o: Option[A]): Optional[A] = o match {
    case None    => none
    case Some(a) => some(a)
  }
}
 
import org.scalacheck._
import Arbitrary.arbitrary
import Prop._
 
object TestOptional {
  import Optional._
 
  implicit def ArbitraryOptional[A](implicit a: Arbitrary[A]): Arbitrary[Optional[A]] =
    Arbitrary(arbitrary[Option[A]] map fromOption)
 
  val prop_map = forAll ((o: Optional[Int], f: Int => String) =>
    (o map f).toOption == (o.toOption map f))
 
  val prop_get = forAll((o: Optional[Int]) =>
    o.isDefined ==>
      (o.get == o.toOption.get))
 
  val prop_flatMap = forAll((o: Optional[Int], f: Int => Optional[String]) =>
    (o flatMap f).toOption == (o.toOption flatMap (f(_).toOption)))
 
  val prop_mapAgain = forAll ((o: Optional[Int], f: Int => String) =>
    (o mapAgain f).toOption == (o map f).toOption)
 
  val prop_getOrElse = forAll ((o: Optional[Int], n: Int) =>
    (o getOrElse n) == (o.toOption getOrElse n))
 
  val prop_filter = forAll ((o: Optional[Int], f: Int => Boolean) =>
    (o filter f).toOption == (o.toOption filter f))
 
  val prop_exists = forAll ((o: Optional[Int], f: Int => Boolean) =>
    (o exists f) == (o.toOption exists f))
 
  val prop_forall = forAll ((o: Optional[Int], f: Int => Boolean) =>
    (o forall f) == (o.toOption forall f))
 
  val prop_foreach = forAll ((o: Optional[Int], f: Int => Unit, n: Int) => {
    var x: Int = n
    var y: Int = x
 
    o foreach (t => x = x + t)
    o.toOption foreach (t => y = y + t)
 
    x == y
  })
 
  val prop_isDefined = forAll ((o: Optional[Int]) =>
    (o.isDefined) == (o.toOption.isDefined))
 
  val prop_isEmpty = forAll ((o: Optional[Int]) =>
    o.isEmpty == o.toOption.isEmpty)
 
  val prop_orElse = forAll ((o: Optional[Int], p: Optional[Int]) =>
    (o orElse p).toOption == (o.toOption orElse p.toOption))
 
  val prop_toLeft = forAll ((o: Optional[Int], n: Int) =>
    (o toLeft n) == (o.toOption toLeft n))
 
  val prop_toRight = forAll ((o: Optional[Int], n: Int) =>
    (o toRight n) == (o.toOption toRight n))
 
  val prop_toList = forAll ((o: Optional[Int]) =>
    o.toList == o.toOption.toList)
 
  val prop_iterator = forAll ((o: Optional[Int]) =>
    o.iterator sameElements o.toOption.iterator)
 
  // *** READ THIS COMMENT FIRST ***
  // Note that scala.Option has no such equivalent to this method
  // Therefore, reading this test may give away clues to how it might be solved.
  // If you do not wish to spoil it, look away now and follow the 
  // instruction in the Exercise comment.
  val prop_applic = forAll ((o: Optional[Int => String], p: Optional[Int]) =>
    (p applic o).toOption ==
    (for(f <- o.toOption;
        n <- p.toOption)
    yield f(n)))
 
  val props =
    List(
      prop_map,
      prop_get,
      prop_flatMap,
      prop_mapAgain,
      prop_getOrElse,
      prop_filter,
      prop_exists,
      prop_forall,
      prop_foreach,
      prop_isDefined,
      prop_isEmpty,
      prop_orElse,
      prop_toLeft,
      prop_toRight,
      prop_toList,
      prop_iterator,
      prop_applic
    )
 
  /*
  $ scala -classpath .:scalacheck_2.8.0-1.8-SNAPSHOT.jar TestOptional
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                                                       
  + OK, passed 100 tests.                       
  */
  def main(args: Array[String]) {
    props foreach (_.check)
  }
}
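
For readers who have not met the "data type as a fold" encoding before, here is the same idea applied to a simpler two-valued type. It is a sketch for orientation only and deliberately solves none of the exercises; the names Bool, yes, no and not are invented for this example.

```scala
// A two-valued type defined only by its fold, in the style of Optional above.
trait Bool {
  // single abstract method: pick one of the two results
  def fold[X](ifTrue: => X, ifFalse: => X): X

  // Everything else is written in terms of fold.
  def not: Bool = fold(Bool.no, Bool.yes)
  def toBoolean: Boolean = fold(true, false)
}

object Bool {
  val yes: Bool = new Bool {
    def fold[X](ifTrue: => X, ifFalse: => X): X = ifTrue
  }
  val no: Bool = new Bool {
    def fold[X](ifTrue: => X, ifFalse: => X): X = ifFalse
  }
}
```

Each constructor is just an implementation of fold that selects its own case, which is the role some and none play for Optional.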

by Tony Morris at August 23, 2010 01:46 AM

Erik Engbrecht

Scala Actors: loop, react, and schedulers

One of the unfortunate aspects of many of the "published" (meaning blogged) Scala Actor benchmarks out there is that they rarely pay much attention, if any, to the effects of seemingly idiomatic patterns on performance. Some of the main culprits are:

  1. react versus receive (event-based versus threaded)
  2. loop/react versus recursive react
  3. loop/receive versus receive/while
  4. tweaking (or failing to tweak) the scheduler

I've been working on setting up a "benchmarking framework" in conjunction with experimenting with modifications to the underlying thread pool so that all the possible permutations are automatically tested. What I have right now is a classic "ring" benchmark setup to permute the schedulers and loop/react versus recursive react. The loop/react pattern is more idiomatic (or at least more common), but higher overhead, and it looks something like this:

loop {
  react {
    case Msg(m) => // do stuff
    case Stop => exit()
  }
}

The reason it is high-overhead is that both loop and react raise control flow exceptions that result in the creation of new tasks for the thread pool when they are hit, so for each loop, two exceptions are raised and two tasks are executed. There's overhead in both of the operations, especially raising the exceptions. The recursive react pattern looks like this, so it can avoid the extra exception/task:

def rloop(): Unit = react {  //this would be called by the act() method
  case Msg(m) => {
    // do stuff
    rloop()
  }
  case Stop => // just drop out or call exit()
}

Using loop instead of recursive react effectively doubles the number of tasks that the thread pool has to execute in order to accomplish the same amount of work, which in turn makes any overhead in the scheduler far more pronounced when using loop. Now, I should point out that the overhead really isn't that large, so if the actor is performing significant computations it will be lost in the noise. But it's fairly common to have actors do fairly little with each message. Here are some results from the ring benchmark using 10 rings of 10,000 actors passing a token around them 100 times before exiting. I'm using multiple rings because otherwise there is no parallelism in the benchmark. These are being run on my dual core MacBook.

Scheduler                       React Method      Time (sec)
ManagedForkJoinScheduler        LoopReact         45.416058
ManagedForkJoinScheduler        RecursiveReact    25.509482
ForkJoinScheduler               LoopReact         65.268584
ForkJoinScheduler               RecursiveReact    45.85605
ResizableThreadPoolScheduler    LoopReact         98.084794
ResizableThreadPoolScheduler    RecursiveReact    53.379757

The fork/join schedulers are faster than the ResizableThreadPoolScheduler because, rather than having all of the worker threads pull tasks off a single shared queue, each thread maintains its own local deque, onto which it can place tasks directly if they are generated while it is running a task. This creates a kind of "fast path" for the tasks that involves much less overhead.
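
To put the table in perspective, the following sketch works out the loop/react slowdown per scheduler and a rough message count for the benchmark. The timings are copied from the table above; the task counts rest on the simplifying assumption of one task per message for recursive react and two for loop/react, and the object and value names are made up for this example.

```scala
object RingNumbers {
  // (scheduler, LoopReact seconds, RecursiveReact seconds), from the table above.
  val timings: List[(String, Double, Double)] = List(
    ("ManagedForkJoinScheduler", 45.416058, 25.509482),
    ("ForkJoinScheduler", 65.268584, 45.85605),
    ("ResizableThreadPoolScheduler", 98.084794, 53.379757)
  )

  // How many times slower loop/react is than recursive react, per scheduler.
  val slowdown: Map[String, Double] =
    timings.map { case (sched, loop, recursive) => sched -> loop / recursive }.toMap

  // Rough message count: 10 rings x 10,000 actors x 100 laps.
  val messages: Long = 10L * 10000L * 100L

  // Assumption: one task per message with recursive react, two with loop/react.
  val recursiveTasks: Long = messages
  val loopTasks: Long = 2 * messages
}
```

On these numbers loop/react comes out roughly 1.8, 1.4 and 1.8 times slower for the three schedulers respectively, so somewhat less than the factor of two in task count, which is what you would expect if each task also carries work that is not doubled.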

I believe the primary reason ManagedForkJoinScheduler is faster is that ForkJoinScheduler does not always leverage the "fast path," even when in theory it could be used. I'm unsure about some of the rationale behind it, but I know that some of the time the fast path is bypassed probabilistically in order to reduce the chances of starvation causing deadlock in the presence of long-running or blocking tasks. ManagedForkJoinScheduler escapes this particular issue by more actively monitoring the underlying thread pool, and growing it when tasks are being starved. The second reason, and I'm somewhat unsure of the actual degree of the effect, is that ForkJoinScheduler configures the underlying thread pool so that the threads work through the local deques in FIFO order, while ManagedForkJoinScheduler configures the pool such that the local deques are processed in LIFO order. Processing in LIFO order allows the pool to take advantage of locality with regard to the tasks, basically assuming that the last task generated is the most likely to use data that's currently in cache, and thus reduces cache misses.
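
The difference between the two processing orders is easy to see on a toy deque. This sketch uses java.util.ArrayDeque purely as an illustration; it is not how either scheduler is implemented, and drain is a hypothetical helper.

```scala
import java.util.ArrayDeque
import scala.collection.mutable.ListBuffer

object DequeOrderSketch {
  // Push tasks onto a local deque in the order they were generated,
  // then drain it either newest-first (LIFO) or oldest-first (FIFO).
  def drain(tasks: List[String], lifo: Boolean): List[String] = {
    val deque = new ArrayDeque[String]()
    tasks.foreach(t => deque.addLast(t))

    val processed = ListBuffer[String]()
    while (!deque.isEmpty) {
      processed += (if (lifo) deque.pollLast() else deque.pollFirst())
    }
    processed.toList
  }
}
```

With LIFO the most recently generated task runs first, which is the case where its data is most likely to still be in cache.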

The benchmark outputs a lot more information than I captured in the above table. If you'd like to run it, you can obtain the code here. The project uses sbt, so you'll need to have it working on your computer. After you run update in sbt to download all of the dependencies, you can run the ring benchmark as follows:

$ sbt
[info] Building project ManagedForkJoinPool 1.0 against Scala 2.8.0
[info]    using ManagedForkJoinPoolProject with sbt 0.7.4 and Scala 2.7.7
> ringbenchmark
[info] 
[info] == compile ==
[info]   Source analysis: 1 new/modified, 0 indirectly invalidated, 0 removed.
[info] Compiling main sources...
[info] Compilation successful.
[info]   Post-analysis: 79 classes.
[info] == compile ==
[info] 
[info] == copy-resources ==
[info] == copy-resources ==
[info] 
[info] == ringbenchmark ==
[info] RingBenchmark ManagedForkJoinScheduler LoopReact 2 ....output truncated...

You can tweak the benchmarks by modifying the sbt project file. If you do run them, I'm very interested in the results.

by Erik Engbrecht (noreply@blogger.com) at August 23, 2010 01:41 AM

August 21, 2010

Coderspiel

"This is a simple-build-tool plugin for compiling CoffeeScript files into their javascript..."

“This is a simple-build-tool plugin for compiling CoffeeScript files into their javascript conterparts.”

- coffee-script-sbt-plugin

August 21, 2010 08:39 PM

Erik Engbrecht

Concurrency Benchmarking, Actors, and sbt tricks

Have you ever noticed that other people's microbenchmarks tend to be hard to run and often impossible to duplicate? And are frequently caveated to the hilt? When it gets down to it, a benchmark is really an experiment, and ideally a scientific experiment. That means all factors that are relevant to the results should be clearly recorded, and the tests should be easy for others to duplicate.

Custom sbt actions for benchmarks

In order to test and run benchmarks on the work I'm doing around creating a managed variant of the JSR-166y ForkJoinPool along with supporting infrastructure for use with Scala Actors, I'm creating a test harness that captures a host of environmental factors about how it was run, and writing sbt actions to make it easy to run the benchmarks and automatically permute the variables.

It still needs a lot of work, but I had some trouble figuring out a really basic task so I thought I'd share it. Basically I wanted to build a Task object that consists of several tasks based on information in the project definition and permuted parameters. It's actually pretty easy, as you can see in the snippet below from my project definition:

  /** this task executes the PingPong benchmark using each available scheduler */
  lazy val pingpongbench = pingpongTaskList
  /** produces a sequence of run tasks using all the available schedulers  */
  def pingpongTaskList = {
    val pairs = 100
    val messagesPerPair = 10000
    val tasks = for(sched <- schedulers) yield pingpongTask(sched, pairs, messagesPerPair)
    tasks.reduceLeft((a, b) => a && b)
  }

You can see the whole file here. Basically Task has an && operator that essentially allows you to concatenate one task with another task. This allows you to build a whole chain of tasks. In the example above, I'm having it run the benchmark once for each scheduler configuration. Soon, I'm going to make it permute other parameters. But right now my test harness isn't playing nicely with the schedulers included in the Scala distribution, so first things first.
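
The reduceLeft trick does not depend on sbt. Here is a minimal sketch of the same chaining pattern using a stand-in type: Step, its && method and chain are hypothetical and only mimic the behaviour described above, on the assumption that && runs the left task and then the right one if the left succeeded.

```scala
import scala.collection.mutable.Buffer

object ChainSketch {
  // Hypothetical stand-in for a build task: running it returns true on success.
  final case class Step(name: String, run: () => Boolean) {
    // Run this step, then the other one only if this one succeeded.
    def &&(other: Step): Step =
      Step(name + " && " + other.name, () => run() && other.run())
  }

  // Build one step per scheduler and fold them into a single chained step.
  // Note that reduceLeft throws on an empty list, so at least one
  // scheduler is required.
  def chain(schedulers: List[String], log: Buffer[String]): Step = {
    val steps = for (sched <- schedulers) yield Step(sched, () => { log += sched; true })
    steps.reduceLeft((a, b) => a && b)
  }
}
```

Running the combined step executes the per-scheduler steps in list order, which is the property the benchmark action relies on.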

There's also one other little customization, which is documented, but I think it's important for benchmarking. By default, sbt runs your code inside its own process, in the same JVM as sbt itself. This can cause problems with multithreaded code, especially if it doesn't terminate properly. It also means the next benchmark to run has to contend with any junk that the previous benchmark left around. So I configured sbt to fork new processes. It just required one line:

override def fork = forkRun

Important variables

Here's what I'm capturing for each run right now so that the results can all be dumped into a big spreadsheet for analysis. I'd like to capture more information about the host machine, such as more information about the CPUs and the loading when the benchmark is being run, but haven't got that far yet. Currently these are all captured from within the benchmark process, mostly using system properties and the Runtime object.

  1. Test Name - obviously needed so that results from multiple benchmarks can be stored in the same file
  2. Scheduler - this is my primary variable right now, I want to run each benchmark with each scheduler while holding everything else constant
  3. # of Cores/Processors - essential so that anyone looking at the results has an idea about the hardware used
  4. Java VM Name - different VMs can perform quite differently
  5. Java VM Version - performance characteristics change from version to version (usually getting better)
  6. Java Version - same reason as above, but this is probably the more publicly known version number
  7. Scala Version - this could be important in the future, as it becomes more common for different projects to be on different versions of Scala
  8. OS Name and version - again, it can affect performance
  9. Processor Architecture
  10. Approximate Concurrency (number of simultaneously alive actors) - this allows us to examine concurrency levels versus resource consumption, more concurrency does not necessarily mean that more cores or threads would be helpful
  11. Approximate Parallelism (number of simultaneously runnable actors) - this measures how many cores/threads the benchmark can really keep busy
  12. Approximate Total Messages - this estimates the amount of activity that takes place during the benchmark, generally the benchmarks I'm looking at contain very little logic because they are intended to measure overhead introduced by the framework
  13. Total Wall Clock Time (seconds) - as measured using nanoTime within the benchmark process
  14. Initial Thread and Maximum Observed Thread Count - used to examine automatic expansion of the thread pool
  15. Initial Free Memory and Minimum Observed Free Memory - threads use a fair amount of memory, so performance impacts may show up as pressure on the GC as well as contention for the CPU
  16. Initial and Maximum Observed Total Memory - threads use a lot of memory, so it's important to track usage
  17. Verbose - debugging output pretty much invalidates any of these tests
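
Several of the items above can be read straight from standard JVM system properties and the Runtime object. The following is a rough sketch of that idea, not the harness itself: EnvSnapshot, capture and timeSeconds are made-up names, and only a subset of the listed variables is covered.

```scala
object EnvSnapshot {
  // Collect environment facts from within the benchmark process.
  def capture(): Map[String, String] = {
    val rt = Runtime.getRuntime()
    Map(
      "cores"           -> rt.availableProcessors().toString,
      "java.vm.name"    -> System.getProperty("java.vm.name"),
      "java.vm.version" -> System.getProperty("java.vm.version"),
      "java.version"    -> System.getProperty("java.version"),
      "scala.version"   -> scala.util.Properties.versionString,
      "os.name"         -> System.getProperty("os.name"),
      "os.version"      -> System.getProperty("os.version"),
      "os.arch"         -> System.getProperty("os.arch"),
      "free.memory"     -> rt.freeMemory().toString,
      "total.memory"    -> rt.totalMemory().toString
    )
  }

  // Wall clock time in seconds for a block of code, measured with nanoTime.
  def timeSeconds[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e9)
  }
}
```

Writing the map out as one row per run is enough to get the results into a spreadsheet alongside the timings.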

by Erik Engbrecht (noreply@blogger.com) at August 21, 2010 07:09 PM

August 20, 2010

Coderspiel

Mathias

Installing Apache Buildr on OS/X 10.6

For my first larger, Scala-only project I just set up a brand-new project space.
Normally all my projects have an IntelliJ IDEA project structure set up for the day-to-day work. However, I usually also include a paralleling project spec for some build system for creating the final deployment or distribution artifacts. For all my Java projects this is (and always has been)...

August 20, 2010 07:00 AM

August 19, 2010

scala-lang.org

Scala LiftOff Goes International

Last year over 80 developers had a great time at the Scala LiftOff in San Francisco. Since then there has been tremendous growth in the worldwide Lift/Scala community and lots of excitement with the releases of Lift 2.0 and Scala 2.8.0. Scala LiftOff has gone international to allow more people to catch up with the latest developments and let more developers talk over the latest stuff. This year you can join other enthusiastic members of the Scala and Lift community in London, New York or San Francisco and find out what's going on. David Pollak, the power behind Lift, and many industry experts will be at all three while you will have a chance to meet Martin Odersky, the creator of Scala, at the London event. The first one is in September so you will need to register soon.

by bagwell at August 19, 2010 03:02 PM

August 18, 2010

scala-lang.org

What's new in Scala 2.8: Collections API

Scala 2.8 has introduced a great number of improvements and additions. They are summarized in the release notes. Over the next weeks, we will publish a series of stories that explain the major new features one-by-one. We start this week with the Scala 2.8 collections API.

In the eyes of many, the new collections framework is the most significant change in Scala 2.8. Scala had collections before (and in fact the new framework is largely compatible with them). But it's only 2.8 that provides a common, uniform, and all-encompassing framework for collection types.

by odersky at August 18, 2010 12:36 PM

Tim Perrett

Native2Ascii plugin for SBT

As I have been doing quite a lot of localisation work recently, I thought it prudent to port the native2ascii tool over to SBT so that I didn’t constantly have to keep using it manually on the command line. If you are interested, you can grab the source code from here

In order to get started with the plugin, add it to your Plugins.scala within your project:


  import sbt._
  class Plugins(info: ProjectInfo) extends PluginDefinition(info) {
    val n2a = "eu.getintheloop" % "sbt-native2ascii-plugin" % "0.1.0" 
  }

...and then mix the Native2AsciiPlugin trait into your project:


  class SampleProject(info: ProjectInfo) 
    extends DefaultWebProject(info) 
    with Native2AsciiPlugin {
    ...
  }

Ensure you have all your localization .txt files in src/main/i18n and by default they will be translated into your application resources folder in src/main/resources.
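
If you have not used native2ascii before, the conversion it performs is to replace characters outside the ASCII range with \uXXXX escapes so that the result is safe to store in a .properties file. The sketch below imitates that transformation on a single string; it is an illustration only, not the plugin's code, and Native2AsciiSketch and escape are hypothetical names.

```scala
object Native2AsciiSketch {
  // Replace every character outside 7-bit ASCII with a backslash-u
  // escape made of four lowercase hex digits; leave ASCII untouched.
  def escape(s: String): String = {
    val out = new StringBuilder
    for (c <- s) {
      if (c < 128) out.append(c)
      else out.append("\\").append("u").append("%04x".format(c.toInt))
    }
    out.toString
  }
}
```

Plain ASCII input passes through unchanged, so running it over an already converted file does no harm.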

Then, from your SBT prompt just hit:

> native2ascii

All being well you should then see something like:


  [info] == native2ascii ==
  [info] Encoding '.txt' to '.properties' file(s) in src/main/resources
  [info] Translation complete.
  [info] == native2ascii ==

by timperrett at August 18, 2010 09:32 AM

Josh Suereth

What's the state of Jigsaw?

I hadn't been paying close attention, but recently a few emails questioning the future of the jigsaw project have been sent to the jigsaw-dev mailing list. The last check-in posted to the mailing list was on June 8th, and was the tail-end of a flurry of check-ins from the jigsaw team.

The whole thing is rather curious, but without any specific word from Oracle, it's all speculation. I have my own guesses, but it's probably best I keep those to myself for now.

by J. Suereth (noreply@blogger.com) at August 18, 2010 02:41 AM

August 17, 2010

scala-lang.org

Scala Training

Skills Matter and Xebia will be providing a series of Scala training sessions starting in October and initially located in London, Amsterdam and Paris. In response to the growing demand Scala courses are being made available through local, well established commercial partners. The courses have been designed by Martin Odersky and will be delivered by him and Iulian Dragos.

These two-day courses are excellent for developers or systems architects who want to learn about Scala and need to understand how it can fit into their development tool-bag. Next courses are available in:

London, UK 4/5 October 2010 Skills Matter. Learn more about the course and register.
Amsterdam, NL 14/15 October 2010 Xebia. Learn more about the course and register.

by bagwell at August 17, 2010 02:17 PM

Graham Lea

Things I Love About IntelliJ IDEA: Keyboard Navigation of Search Results

One of the things that really helps IntelliJ IDEA deliver on its claim of increased productivity is the JetBrains guys' fixation with keyboard navigation and one area where they've hit a home run with this is in navigating search results. The idea is actually really simple: you can navigate search results using the keyboard: Ctrl-Alt-Down to go to the next result, Ctrl-Alt-Up to go to the previous one.

If I were just to end the blog there, I'd probably have a throng of Eclipse users banging on my door telling me you can do the same thing in Eclipse: "Just click in the Search View, then use Ctrl+. and Ctrl+," they'd say, and they'd be right. But it's the "Just click in the Search View" part which is all wrong. I shouldn't have to click anywhere. Someone might point out that I can change the focus between different Views using Ctrl-F7, which again would be correct and would again miss the point.

The point is: I'm coding. I'm writing code; I'm searching code; I'm changing code. I don't want to focus on a Search View, I want to focus on my code. In IntelliJ, I don't even have to have the Search pane visible to be able to browse through the results. I just have to hit Ctrl-Alt-Down and I'm at the next result, ready to edit. In Eclipse, if the window focus is in the Editor and I want to go to the next Search result, I either have to use the mouse to select and double-click on the next result, or I have to change to the Search View (Ctrl-F7, including pop-up View selector window), press down to select the next result, then press Enter to return to the Editor.

This is one of those niggles in Eclipse about which I like to say, "The guy who wrote the Editor plug-in didn't talk to the guy who wrote the Search plug-in". I often get the feeling that while IDEA is the product of a close-knit team aiming to produce a seamlessly integrated code-authoring masterpiece, Eclipse is just a bunch of code-related tools that have been slapped inside the same GUI without enough thought as to how those tools should play well together.


by Grazer (noreply@blogger.com) at August 17, 2010 03:26 AM

August 16, 2010

Stephan Schmidt

Interview with DSL, NoSQL and Scala Practitioner Debasish Ghosh

This interview is with Debasish Ghosh. He’s writing the successful blog Ruminations of a Programmer with the tag line “A programmer’s blog – will deal with everything that relates to a programmer. Occasionally, it will contain some humour, some politics and some sport news.”. Name: Debasish Ghosh (dghosh@acm.org) Blog: http://debasishg.blogspot.com Twitter: @debasishg Github: http://github.com/debasishg Tell us something about you (what [...]

by stephan at August 16, 2010 03:01 PM

Mathias

parboiled for Scala

The recently released parboiled version 0.9.8 comes with a brand-new feature that will hopefully make parboiled an attractive tool for an even larger developer community: a Scala facade.
Before v0.9.8 parboiled was a pure-Java library written mainly for Java developers. It basically consisted of two things, an efficient recursive-descent PEG parsing engine and an...

August 16, 2010 07:00 AM

August 13, 2010

James Iry

Sometime in 1977

Kernighan: "I know, let's open the book with a program that displays 'hello, world.'" Ritchie: "Great idea! I'll start the patent search."

by James Iry (noreply@blogger.com) at August 13, 2010 08:18 PM

Richard Dallaway

London, 7-8 Oct 2010: Scala Lift Off

Sessions

I have my ticket. How about you?  It's £150 at the moment.

What we're looking at here is a two-day unconference, meaning we get a chance to learn about things we're interested in by figuring it out on the day, talking to people, rather than being lectured at.  I thoroughly enjoyed the Scala Lift Off last year in San Francisco, so I'm looking forward to the London one.  Especially as it's close by, rather than 8,720km away.

Permalink | Leave a comment  »

August 13, 2010 12:16 PM

If you like web MVC, you'll probably like the Play web framework.

 

Play is an MVC, convention-based, stateless web framework for Java with growing support for Scala too.

It's not for me as I can't face going back to MVC and the kinds of presentation languages they use. Having said that, if you like MVC, and you're not already using Grails or Rails or a similar framework, I strongly urge you to look at Play as there's some nice technology in there. 

Everything I know about Play comes from Rustem Suniev's talk for the London Scala User Group at Skills Matter on Wednesday. The slides and video are already on-line and they contain a really nice live-coding demo of Play which gave me a good sense of what the framework is about.  Nice work, and pizza courtesy of autoquake.com (who are hiring).

One comment I will make is that Play pushes its statelessness prominently. For many of us I suspect our default position is that stateful=bad and stateless=good.  That sounds sane, but you probably do need some state in your application, and you have to deal with it somehow, or push the issue somewhere, which leaves me feeling that the stateful v. stateless question is a bit more complicated than we often think it is.  Statelessness certainly does not automatically mean better scaling or performance, but there are definite positives to it.  I'm glad to see the Play community discussing state and exploring some nice ideas.  Just don't assume a label of "stateless" solves all your scaling problems, if you're lucky enough to have any :-)

 

Permalink | Leave a comment  »

August 13, 2010 10:18 AM

August 12, 2010

Coderspiel

"C2DM created a nice opportunity for us to pull together different Google developer tools to create a..."

“C2DM created a nice opportunity for us to pull together different Google developer tools to create a simple but useful application to enable users to push links and other information from their desktop / laptop to their phone.”

- Powering Chrome to Phone with Android Cloud to Device Messaging

August 12, 2010 07:20 PM

Brendan McAdams gives an introduction to MongoDB and the...

Video: http://vimeo.com/14090025

Brendan McAdams gives an introduction to MongoDB and the libraries available for using it with Scala.

August 12, 2010 03:29 PM

Doug Tangren presents the Unfiltered toolkit for serving HTTP...

Video: http://vimeo.com/14042004

Doug Tangren presents the Unfiltered toolkit for serving HTTP requests in Scala.

August 12, 2010 03:26 PM

Scala Ide

Not a release, but New and Noteworthy even so …

Now that the dust has settled on the Scala 2.8.0.final release, we have some breathing space to think about moving the Scala IDE for Eclipse on towards its first release independently of the release of the Scala toolchain. You’ll recall that a big part of the motivation for the move from SVN at EPFL to git at Assembla was to allow the IDE release cycle to be decoupled from the release cycle of the language; now it’s time to take advantage of that.

One part of the decoupling process has been rebooting the version numbering scheme for the IDE. We have three sets of versions to juggle with: the Scala toolchain version, the Eclipse version and the version of the SDT itself. Previously it made sense for the SDT to share its version number with the Scala toolchain, but that doesn’t work now that the two can move independently. So, to celebrate the birth of the Scala IDE for Eclipse as an independent open source project in a new OSGi namespace (org.scala-ide rather than ch.epfl.lamp), we’re starting from scratch and are currently at 1.0.0-SNAPSHOT heading for a 1.0.0 release. Readers who have been following the nightly updates for a while will be relieved to know that care has been taken to ensure that upgrades should be smooth despite the change of namespace and the decrease in the version number.

Astute readers will have noticed a Maven-ish tinge to the new versioning scheme. That’s no accident, and I’m delighted to report that the move from an Ant-based build to a Tycho/Maven 3.0 based build is now complete and has been a more or less unqualified success. In particular, Tycho’s ability to manage the Eclipse target platform dependencies and build intricacies has been of enormous benefit — without it the Helios build would have taken even longer to arrive. The magic that is git has also paid off here as it’s allowing us to develop a Helios-specific branch with an ease that I would previously have thought impossible.

We still have a little way to go before that first release (I’ll post on the release plan and timetable shortly) but in the meantime I’d like to give you an update on current developments. So, without further ado …

New and Noteworthy

Refactoring Support

Mirko Stocker’s Scala Refactoring framework is now included in the SDT. Currently there is support for Rename, including in-place renaming of local identifiers:

In-place local rename

We have an Extract Method refactoring:

Extract method refactoring

and an Inline Local refactoring:

Inline refactoring

We also have an Extract Local refactoring and the beginnings of Organize Imports — so far existing imports will be tidied, but adding and removing imports is still work in progress.

Mirko’s site and thesis go into a lot more detail, but in summary he’s done a fantastic job of separating out the lexical noise so that authors of refactorings can concentrate on the essence of the transformations involved without being distracted by whitespace or comment issues. The refactoring API is based on the standard scalac AST and is IDE-independent; we’d both love to see people contribute refactorings whether or not they’re Eclipse users.

Formatting

Matt Russell’s Scala Formatter Scalariform is also now included.

Before:

Formatting before

After:

Formatting after

As you can see, it does a good job of formatting XML literals as well as ordinary Scala source. Like Mirko’s refactoring work, Scalariform has an IDE independent core and we’d welcome contributions from all parties.

Mark Occurrences

Matt has also contributed an implementation of Mark Occurrences, which will be familiar to users of Eclipse’s Java tooling. This is something I’ve always found invaluable:

Mark occurrences

Code Templates

David Bernard has added support for code templates for common boilerplate. These are offered as completion proposals in the Scala editor in the same way that Java code templates are presented in the Java editor.

Requesting completion of a code template for a main method:

Template before

The resulting expansion:

Template after

Quick Fix Imports

More a bug fix than a new feature — Colin Howe contributed this initially — Daniel Ratiu has reinstated the quick fix which adds an import for a missing type.

Add import quick fix

Structured Selections

Matt has added support for Eclipse’s structured selection model — Alt-Shift-Up / Alt-Shift-Down select larger and smaller portions of the AST respectively:

AST select 1
AST select 2
AST select 3
AST select 4

It works particularly nicely alongside the Extract Method and Extract Local refactorings mentioned above.

XML Syntax Highlighting

Matt has also carried on his work on Scala syntax highlighting thanks to which we now have full support for XML literals:

XML syntax highlighting

As well as the highlighting of XML elements and attributes, you’ll see from the screenshot that Scala syntax (keywords, string literals etc.) which appear in XML content are now correctly left as unhighlighted text.

Scala IDE support in m2eclipse

To complement the use of Maven in the SDT’s own development process, we’re making rapid progress in ensuring that the Maven experience is as smooth as possible. In particular David Bernard has done sterling work on the m2eclipse Scala integration started by Jason van Zyl with my encouragement at EclipseCon this year.

An open platform with a growing community

Hopefully a few things are very evident from all of the above. One is that there’s a lively and growing community of active contributors around the project. Another is that the Scala IDE for Eclipse as an umbrella project is very much open to collaborations with projects which aren’t Eclipse-specific, or indeed tied to any kind of IDE at all — both Mirko’s refactoring tools and Matt’s formatter can be used from the command line or incorporated into other kinds of source manipulation tools.

So I’d like to finish by inviting you to join us — whether or not you’re an Eclipse user there is plenty the project as a whole has to offer, and plenty that you could contribute in return. Looking forward to seeing you on the user and developer mailing lists!

by Miles Sabin at August 12, 2010 09:59 AM

Mathias

parboiled 0.9.8 released

Just about a month after the last release parboiled 0.9.8 has hit its github download page today.
It is once again a major step forward, with quite a few changes and additions, some of them being significant enough to accompany the release with a small series of blog posts. They should be of interest to both, people currently using previous parboiled...

August 12, 2010 07:00 AM

Yuvi Masory

Uses of underscore in Scala

The Scala homepage has a poll regarding the uses of the underscore character (_) in Scala. Here's my attempt to list all the uses.

Independent uses:

  • Default values for vars

  • Catch-all on import statements

  • Hiding imported members from catch-all import

  • Ignoring type parameters in type patterns

  • Existential type shorthand

  • Wildcard pattern

  • Omitting arguments to create partially applied function

  • Omitting entire argument list to create partially applied function

  • Omitting argument lists to create a partially applied curried function

  • Omitting all argument lists to create a partially applied curried function

  • Anonymous function shorthand (thanks #scala)




Dependent uses:

  • Passing a Seq as a repeated parameter

  • Matching repeated parameters

  • In alphanumeric identifiers

  • In mixed identifiers

  • In literal identifiers

  • In symbol literals

  • In identifier of special _root_ package

  • In XML literals
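
Several of the independent uses above, plus the repeated-parameter case, can be seen side by side in one small sketch. The names here (UnderscoreDemo, add, describe and so on) are made up for illustration and do not come from the poll; the syntax is that of Scala 2.x.

```scala
import java.util.{List => _, _}   // catch-all import, hiding java.util.List

object UnderscoreDemo {
  var counter: Int = _                    // default value for a var (0 for Int)

  def add(a: Int, b: Int): Int = a + b
  val addTen = add(10, _: Int)            // omitting one argument
  val addFn  = add _                      // omitting the entire argument list

  def describe(x: Any): String = x match {
    case _: List[_] => "a list"           // ignoring the type parameter in a type pattern
    case _          => "something else"   // wildcard pattern
  }

  def sum(xs: Int*): Int = xs.sum
  val total   = sum(List(1, 2, 3): _*)    // passing a Seq as a repeated parameter
  val doubled = List(1, 2, 3).map(_ * 2)  // anonymous function shorthand
}
```

Because java.util.List is hidden by the import, List in the pattern and in the calls refers to Scala's own List.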

by Yuvi Masory (noreply@blogger.com) at August 12, 2010 01:55 AM

August 11, 2010

Coderspiel

jeffplaisance's scala-protobuf

jeffplaisance's scala-protobuf:

Scala compiler plugin, “generates type safe scala wrappers for java protoc output”

August 11, 2010 06:33 PM

Ikai Lan

Using the App Engine Mapper for bulk data import

Since my last post describing App Engine mapreduce, a new InputReader has been added to the Java project for reading from the Blobstore. Nick Johnson wrote a great demo where indexing was done via reading code uploaded to the blobstore. This was demo’d at Google I/O. Now that the library is officially part of the project, it’s become much easier for developers to build Mappers that map across some large, contiguous piece of data as opposed to Entities in the datastore. The most obvious use case is data import. A developer looking to import large amounts of data would take the following steps:

  1. Create a CSV file containing the data you want to import. The assumption here is that each line of data corresponds to a datastore entity you want to create
  2. Upload the CSV file to the blobstore. You’ll need billing to be enabled for this to work.
  3. Create your Mapper, push it live and run your job importing your data.

This isn’t meant to be a replacement for the bulk uploader tool; merely an alternative. This method requires a good deal more code for custom data transforms. The advantage of this method is that the work is done on the server side, whereas the bulk uploader makes use of the remote API to get work done. Let’s get started on each of the steps.

Step 1: Create a CSV file with the data you want to upload

We’re going to go through an example of uploading City and State information. MaxMind.com provides a free GeoIP CSV file. The free version isn’t as full featured as the paid version, but it’ll do fine for our demo. Be sure that if you use this file in any kind of production application that you read and understand the license first! For simplicity, we’re going to parse out only cities in the United States using grep. The file should now contain lines that look like this:

605,"US","NY","Valhalla","10595",41.0877,-73.7768,501,914
606,"US","PA","Pittsburgh","15222",40.4495,-79.9880,508,412
607,"US","MO","Bridgeton","63044",38.7667,-90.4201,609,314
608,"US","CA","San Francisco","94124",37.7312,-122.3826,807,415
609,"US","NY","New York","10017",40.7528,-73.9725,501,212
610,"US","PA","Bear Lake","16402",41.9491,-79.4448,516,814
611,"US","NJ","Piscataway","08854",40.5516,-74.4637,501,732
612,"US","NY","Keuka Park","14478",42.5669,-77.1325,555,315
613,"US","VT","Brattleboro","05302",42.8496,-72.6645,506,802

Step 2: Create an upload handler for your CSV file and upload the CSV file

We’re going to create a basic handler for uploading a CSV file and displaying the key. We’ll need to pass this key to our mapper later. There isn’t too much magic here; it’s very similar to the sample code available for the basic blobstore example.

We’ll only give a quick overview of the code we need here; a detailed walkthrough is out of scope for this post. We’ll need these files:

upload.jsp

<%@ page language="java" contentType="text/html; charset=ISO-8859-1"
    pageEncoding="ISO-8859-1"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<%@page import="com.google.appengine.api.blobstore.BlobstoreService"%>
<%@page import="com.google.appengine.api.blobstore.BlobstoreServiceFactory"%>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Upload your CSV file here</title>
</head>
<body>
    <% BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService(); %>
    <form action="<%= blobstoreService.createUploadUrl("/upload") %>" method="post" enctype="multipart/form-data">
        <input type="file" name="data">
        <input type="submit" value="Submit">
    </form>
</body>
</html>

UploadBlobServlet.java

package com.ikai.mapperdemo.servlets;

import java.io.IOException;
import java.util.Map;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.google.appengine.api.blobstore.BlobKey;
import com.google.appengine.api.blobstore.BlobstoreService;
import com.google.appengine.api.blobstore.BlobstoreServiceFactory;

@SuppressWarnings("serial")
public class UploadBlobServlet extends HttpServlet {
	public void doPost(HttpServletRequest req, HttpServletResponse resp)
			throws IOException {

		BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
		Map<String, BlobKey> blobs = blobstoreService.getUploadedBlobs(req);
		BlobKey blobKey = blobs.get("data");

		if (blobKey == null) {
			resp.sendRedirect("/");
		} else {
			resp.sendRedirect("/upload-success?blob-key=" + blobKey.getKeyString());
		}
	}

}

SuccessfulUploadServlet.java

package com.ikai.mapperdemo.servlets;

import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@SuppressWarnings("serial")
public class SuccessfulUploadServlet extends HttpServlet {
	public void doGet(HttpServletRequest req, HttpServletResponse resp)
			throws IOException {

		String blobKey = req.getParameter("blob-key");

		resp.setContentType("text/html");
		resp.getWriter().println("Successfully uploaded. Download file: <br/>");
		resp.getWriter().println(
				"<a href='/serve?blob-key=" + blobKey
						+ "'>Click to download</a>");
	}

}

Source code for this and other helper functions should be available in the Github repository.

Step 3: Create your Mapper

Now we get to the fun part. We need to create our Mapper. A prerequisite for understanding what’s coming next is reading the last post about Mapper I wrote, so check that out before proceeding if you aren’t familiar with Mapper basics. Our Mapper class looks like this:

ImportFromBlobstoreMapper.java

package com.ikai.mapperdemo.mappers;

import java.util.logging.Logger;

import org.apache.hadoop.io.NullWritable;

import com.google.appengine.api.datastore.Entity;
import com.google.appengine.tools.mapreduce.AppEngineMapper;
import com.google.appengine.tools.mapreduce.BlobstoreRecordKey;
import com.google.appengine.tools.mapreduce.DatastoreMutationPool;

/**
 *
 * This Mapper imports from a CSV file in the Blobstore. The CSV
 * assumes it's in the MaxMind format for cities, states, zipcodes
 * and lat/long.
 *
 *
 * @author Ikai Lan
 *
 */
public class ImportFromBlobstoreMapper extends
		AppEngineMapper<BlobstoreRecordKey, byte[], NullWritable, NullWritable> {
	private static final Logger log = Logger.getLogger(ImportFromBlobstoreMapper.class
			.getName());

	@Override
	public void map(BlobstoreRecordKey key, byte[] segment, Context context) {

		String line = new String(segment);

		log.info("At offset: " + key.getOffset());
		log.info("Got value: " + line);

		// Line format looks like this:
		// 10644,"US","VA","Tazewell","24651",37.0595,-81.5220,559,276
		// We're also assuming no errant commas in this simple example

		String[] values = line.split(",");
		String state = values[2];
		String cityName = values[3];
		String zipcode = values[4];
		Double latitude = Double.parseDouble(values[5]);
		Double longitude = Double.parseDouble(values[6]);		

		state = state.replaceAll("\"", "");
		cityName = cityName.replaceAll("\"", "");
		zipcode = zipcode.replaceAll("\"", "");

		if(!zipcode.isEmpty()) {
			Entity zip = new Entity("Zip", zipcode);
			zip.setProperty("state", state);
			zip.setProperty("city", cityName);
			zip.setProperty("latitude", latitude);
			zip.setProperty("longitude", longitude);

			Entity city = new Entity("City", cityName);
			city.setProperty("state", state);
			city.setUnindexedProperty("zip", zipcode);

			DatastoreMutationPool mutationPool = this.getAppEngineContext(context)
					.getMutationPool();
			mutationPool.put(zip);
			mutationPool.put(city);
		}

	}
}

Let’s explain the things in this Mapper that are new:

public class ImportFromBlobstoreMapper extends
AppEngineMapper<BlobstoreRecordKey, byte[], NullWritable, NullWritable>

Note this line. It’s different from our previous Mappers in that the type arguments are no longer Key and Entity, but BlobstoreRecordKey and byte[]. The source for BlobstoreRecordKey is here. Remember that map-reduce is about taking some large body of data and breaking it into smaller pieces to operate on. BlobstoreRecordKey represents a pointer to a range of data in our Blobstore. byte[] is the byte array actually containing that data.

public void map(BlobstoreRecordKey key, byte[] segment, Context context)

Again, notice the new types. By default, we are splitting on a newline, so segment represents a single line. We can change what we split on by specifying a terminator in mapreduce.xml.

		String line = new String(segment);

		// Line format looks like this:
		// 10644,"US","VA","Tazewell","24651",37.0595,-81.5220,559,276
		// We're also assuming no errant commas in this simple example

		String[] values = line.split(",");
		String state = values[2];
		String cityName = values[3];
		String zipcode = values[4];
		Double latitude = Double.parseDouble(values[5]);
		Double longitude = Double.parseDouble(values[6]);		

		state = state.replaceAll("\"", "");
		cityName = cityName.replaceAll("\"", "");
		zipcode = zipcode.replaceAll("\"", "");

This is very naive String parsing. Nothing fancy here.
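
If the "no errant commas" assumption ever fails (a quoted city name containing a comma, say), a small quote-aware splitter is enough for this file format. What follows is a minimal sketch in Scala rather than the Java used in the Mapper above. splitCsvLine is a hypothetical helper, not part of the mapreduce library, and it does not handle escaped quotes inside a field.

```scala
object CsvLine {
  // Split one CSV line on commas that sit outside double quotes,
  // dropping the quote characters themselves.
  // Assumption: fields never contain escaped quotes ("").
  def splitCsvLine(line: String): Vector[String] = {
    val fields = Vector.newBuilder[String]
    val current = new StringBuilder
    var inQuotes = false
    for (c <- line) c match {
      case '"'              => inQuotes = !inQuotes
      case ',' if !inQuotes => fields += current.toString; current.clear()
      case other            => current.append(other)
    }
    fields += current.toString   // the last field has no trailing comma
    fields.result()
  }
}
```

Since the quotes are dropped while splitting, the three replaceAll calls would no longer be needed with this approach.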

		if(!zipcode.isEmpty()) {
			Entity zip = new Entity("Zip", zipcode);
			zip.setProperty("state", state);
			zip.setProperty("city", cityName);
			zip.setProperty("latitude", latitude);
			zip.setProperty("longitude", longitude);

			Entity city = new Entity("City", cityName);
			city.setProperty("state", state);
			city.setUnindexedProperty("zip", zipcode);

			DatastoreMutationPool mutationPool = this.getAppEngineContext(context)
					.getMutationPool();
			mutationPool.put(zip);
			mutationPool.put(city);
		}

Again, very straightforward if you’ve seen this before. Some zipcodes in our CSV file subset are empty, so we’ll check for that and just not create an Entity. We’re adding 2 entities to the mutation pool here – a City and a Zipcode. This ensures that we can search by key when we do a datastore get. Remember that fetches by key are always faster than fetches with a query, since a query requires an index scan followed by a batch get, whereas the datastore can perform a get in a single operation.

That’s it for our Mapper. Let’s add a configuration:

  <configuration name="Import all data from the Blobstore">
    <property>
      <name>mapreduce.map.class</name>

      <!--  Set this to be your Mapper class  -->
      <value>com.ikai.mapperdemo.mappers.ImportFromBlobstoreMapper</value>
    </property>

    <!--  This is a default tool that lets us iterate over blobstore data -->
    <property>
      <name>mapreduce.inputformat.class</name>
      <value>com.google.appengine.tools.mapreduce.BlobstoreInputFormat</value>
    </property>

    <property>
      <name human="Blob Keys to Map Over">mapreduce.mapper.inputformat.blobstoreinputformat.blobkeys</name>
      <value template="optional">blobkeyhere</value>
    </property>        

    <property>
      <name human="Number of shards to use">mapreduce.mapper.shardcount</name>
      <value template="optional">10</value>
    </property>        

  </configuration>

We’ve changed 2 properties here: the input format class as well as a property for the blobstore key pointing to the data to iterate over.

Step 4: Deploy!

We can now package our application up and deploy it! Make sure that you’ve built a new JAR file with the new classes in appengine-mapreduce! If you have the old JAR file, it won’t include the BlobstoreInputFormat class that we need to do our work.

Step 5: Using the Mapper

Let’s browse to our upload handler at /upload.jsp. The page should be pretty bare.

Once the upload has finished, we’ll be on a page that looks like this:

Let’s copy the blob-key in the URL. It’s not the most streamlined approach but it works. We’ll use it in the next screen when we browse to our mapper:

We’ll copy-paste the key to replace “blobkeyhere” and hit “Run”. And now we play the waiting game – we’ll be able to check on the status of our Mapper in the UI, or check on Tasks, or look in the datastore to see if the data has been imported correctly:

Get the code

The code is here on Github:

http://github.com/ikai/App-Engine-Java-Mapper-API-demos

It’s been updated with the new examples.

Summary

So there you have it: another way of importing data into the datastore. This isn’t a replacement for the bulk uploader, just another option. Here are some useful links for additional information:

App Engine Mapreduce issues tracker – report issues here

Nick Johnson’s post explaining how he built the code search example

One last tip: the best place for questions or discussion is probably the App Engine Discussion Groups, not the comments.

Happy hacking.


by Ikai Lan at August 11, 2010 06:33 PM

Stephan Schmidt

Continuous Deployment Setup at 2morethin.gs

I’ve set up continuous deployment for 2morethin.gs. Continuous deployment means that as soon as a developer finishes work, his code is pushed to the website. IMVU does it, as does Wordpress.com and kaChing. This post should show you that there is no magic in continuous deployment, everyone can do it. Why would one want to drop [...]

by stephan at August 11, 2010 10:39 AM

Arnold deVos

An Update to the Scala Jetty Wrapper

The JettyS library http://github.com/arnolddevos/JettyS has been updated for Scala 2.8. There is also a workaround for a bug in Jetty 6 associated with setLowResourceMaxIdleTime(). An update to Jetty 7 is coming.

August 11, 2010 05:42 AM

August 10, 2010

Jesper de Jong

A generic interpolate method using type classes

In a previous post I wrote about different interpolate methods in my program for linearly interpolating numbers and vectors. The methods for numbers and vectors are exactly the same, except for the types of the arguments and return value. So, of course I wanted to write a generic interpolate method to avoid repeating myself.

For a type V to be useable by the interpolate method, it must satisfy two requirements:

  1. It can be multiplied by a number, resulting in a V.
  2. You can add two Vs together, resulting in a V.

I started with the idea to do this with a structural type (despite the performance disadvantage that structural types suffer from because methods are called via reflection – I just wanted to know if this could work in principle).

The requirements on the type can be described by a structural type directly, and then the interpolate method can be written in terms of the type V:

trait Container {
  type V = {
    def *(t: Double): V
    def +(v: V): V
  }

  def interpolate(t: Double, a: V, b: V): V = a * (1.0 - t) + b * t
}

That looks straightforward. But unfortunately, it doesn’t work…

<console>:8: error: recursive method + needs result type
           def +(v: V): V
                        ^
<console>:7: error: recursive method * needs result type
           def *(t: Double): V
                             ^

Note that the methods * and + in the type V refer to the type itself. Unfortunately, this is not allowed: structural types cannot be recursive, and that’s why you get these error messages. It’s impossible to use a structural type for the generic interpolate method.

On StackOverflow, oxbow_lakes put me on another track: you can do this with type classes. Type classes are an idea that comes from Haskell. The description on Wikipedia reads a bit academic and abstract, but the idea is really simple: a type class is simply a type that specifies some property of other types (it classifies types). Have a look at these two type classes, for example:

// A Multipliable is something that can be multiplied with a T, resulting in an R
trait Multipliable[-T, +R] { def *(value: T): R }

// An Addable is something that you can add a T to, resulting in an R
trait Addable[-T, +R] { def +(value: T): R }

These two can be combined into a type class Interpolatable, which satisfies the two requirements that are necessary for the interpolate method:

trait Interpolatable[T] extends Multipliable[Double, T] with Addable[T, T]

The interpolate method can now be written using this type class as follows:

def interpolate[T <% Interpolatable[T]](t: Double, a: T, b: T): T = a * (1.0 - t) + b * t

So, now we have a generic interpolate method that works on anything that can be viewed as an Interpolatable[T] (note the view bound, specified using <%). Of course you now have to tell Scala that a type you want to use this with can indeed be viewed as an Interpolatable[T]; this can be done with an implicit conversion. For example for Double you can put the following implicit conversion somewhere so that it’s in scope:

implicit def doubleToInterpolatable(v1: Double) = new Interpolatable[Double] {
    def *(t: Double): Double = v1 * t
    def +(v2: Double): Double = v1 + v2
}

This all works great, but note that all in all we didn’t gain much with this approach. Although the interpolate method is generic, it’s still necessary to write an implicit conversion for each type that we want to use with the generic method – the repetition has simply shifted from the interpolate method itself to the implicit conversions that we have to write. In principle, though, the approach is valuable; interpolate is just a simple example. Suppose that you’d have a much larger and more complicated method than interpolate, then using this approach with type classes would really be worth it.
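
The same idea can also be encoded without wrapping values: instead of an implicit conversion to Interpolatable[T], the method takes an implicit instance that knows how to scale and add values of T. This is a sketch of that alternative, not the code from the post; Interpolator, scale, add and Vec2 are names invented here.

```scala
// Type class: the operations interpolate needs, defined outside of T itself
trait Interpolator[T] {
  def scale(v: T, t: Double): T
  def add(a: T, b: T): T
}

// A simple 2D vector, used only to show a second instance
final case class Vec2(x: Double, y: Double)

object Interpolator {
  // Instances live in the companion object, so the compiler finds them
  // without an explicit import.
  implicit val forDouble: Interpolator[Double] = new Interpolator[Double] {
    def scale(v: Double, t: Double): Double = v * t
    def add(a: Double, b: Double): Double = a + b
  }

  implicit val forVec2: Interpolator[Vec2] = new Interpolator[Vec2] {
    def scale(v: Vec2, t: Double): Vec2 = Vec2(v.x * t, v.y * t)
    def add(a: Vec2, b: Vec2): Vec2 = Vec2(a.x + b.x, a.y + b.y)
  }

  // Linear interpolation written once, in terms of the type class
  def interpolate[T](t: Double, a: T, b: T)(implicit ev: Interpolator[T]): T =
    ev.add(ev.scale(a, 1.0 - t), ev.scale(b, t))
}
```

The trade-off described above still applies: one instance has to be written per type, so the gain only shows once the generic method is larger than interpolate.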

Here’s an interesting presentation by Daniel Spiewak in which he also talks about type classes.

A sidestep to C++

Note that in C++, using templates, it’s very easy to write a generic interpolate function that works on any type that satisfies the requirements, and you don’t need to specify this for each type that you want to use:

template <class T>
T interpolate(double t, T a, T b) {
    return a * (1.0 - t) + b * t;
}

There’s a big difference between templates in C++ and generics in Scala or Java. A template in C++ is used to generate code for some concrete type at the moment it’s necessary, and the compiler checks if the code generated from the template is valid (which it is if the concrete type satisfies the requirements). For example, if you use the template on doubles, the compiler generates an instance of the template specifically for double:

double interpolate(double t, double a, double b) {
    return a * (1.0 - t) + b * t;
}

In C++ there is no need to explicitly specify what the * and + functions do for the types that you want to interpolate; they automatically fall into place in the generated code.

by Jesper at August 10, 2010 08:43 PM

Ruminations of a Programmer

Using generalized type constraints - How to remove code with Scala 2.8

I love removing code. The more I remove, the smaller the surface area for bugs to bite. Just now I removed a bunch of classes, made unnecessary by the Scala 2.8.0 type system.

Consider this set of abstractions, elided for demonstration purposes ..

trait Instrument

// equity
case class Equity(name: String) extends Instrument

// fixed income
abstract class FI(name: String) extends Instrument
case class DiscountBond(name: String, discount: Int) extends FI(name)
case class CouponBond(name: String, coupon: Int) extends FI(name)


Well, it's the instrument hierarchy (simplified) that gets traded in a securities exchange every day. Now we model a security trade that exchanges instruments and currencies ..

class Trade[I <: Instrument](id: Int, account: String, instrument: I) {
  //..
  def calculateNetValue(..) = //..
  def calculateValueDate(..) = //..
  //..
}


In real life a trade will have lots and lots of attributes. But here we don't need them, since our only purpose here is to demonstrate how we can throw away some piece of code :)

Trade can have lots of methods which model the domain logic of the trading process, calculating the net amount of the trade, the value date of the trade etc. Note all of these are valid processes for every type of instrument.

Consider one use case that calculates the accrued interest of a trade. The difference from other methods is that accrued interest is only applicable for Coupon Bonds, which, according to the above hierarchy, is a subtype of FI. How do we express this constraint in the above Trade abstraction? What we need is to constrain the instrument in the method.

My initial implementation was to make the AccruedInterestCalculator a separate class parameterized with the Trade of the appropriate type of instrument ..

class AccruedInterestCalculator[T <: Trade[CouponBond]](trade: T) {
  def accruedInterest(convention: String) = //.. impl
}


and use it as follows ..

val cb = CouponBond("IBM", 10)
val trd = new Trade(1, "account-1", cb)
new AccruedInterestCalculator(trd).accruedInterest("30U/360")


Enter Scala 2.8 and the generalized type constraints ..

Before Scala 2.8, we could not specialize the Instrument type I for any specific method within Trade beyond what was specified as the constraint in defining the Trade class. Since calculation of accrued interest is only valid for coupon bonds, we could only achieve the desired effect by having a separate abstraction as above. Or we could take recourse to runtime checks.

Scala 2.8 introduces generalized type constraints which allow you to do exactly this. We have 3 variants as:
      
  • A =:= B, which mandates that A and B should exactly match
  • A <:< B, which mandates that A must conform to B
  • A <%< B, which means that A must be viewable as B

Predef.scala contains these definitions. Note that unlike <: or >:, the generalized type constraints are not operators. They are classes, instances of which are implicitly provided by the compiler itself to enforce conformance to the type constraints. Here's an example for our use case ..

class Trade[I <: Instrument](id: Int, account: String, instrument: I) {
  //..
  def accruedInterest(convention: String)(implicit ev: I =:= CouponBond): Int = {
    //..
  }
}



ev is the evidence value that the compiler provides; it ensures that we can invoke accruedInterest only for CouponBond trades. You can now do ..


val cb = CouponBond("IBM", 10)
val trd = new Trade(1, "account-1", cb)
trd.accruedInterest("30U/360")


while the compiler will complain with an equity trade ..

val eq = Equity("GOOG")
val trd = new Trade(2, "account-1", eq)
trd.accruedInterest("30U/360")



Now I can throw away my AccruedInterestCalculator class and all associated machinery. A simple type constraint tells us a lot and models domain constraints, and all that too at compile time. Yum!
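As a minimal, self-contained sketch of the constraint above: the hierarchy is trimmed down, and the interest arithmetic is a made-up placeholder (an assumption for illustration, not real day-count logic). It shows that the evidence value ev can also be applied as a function to convert the instrument to a CouponBond inside the method.

```scala
object TradeConstraintDemo {
  trait Instrument
  case class Equity(name: String) extends Instrument
  abstract class FI(name: String) extends Instrument
  case class CouponBond(name: String, coupon: Int) extends FI(name)

  class Trade[I <: Instrument](id: Int, account: String, instrument: I) {
    // Compiles only at call sites where I is exactly CouponBond.
    def accruedInterest(convention: String)(implicit ev: I =:= CouponBond): Int = {
      val bond: CouponBond = ev(instrument) // the evidence doubles as a conversion
      bond.coupon * 2                       // placeholder formula (assumption)
    }
  }

  val bondTrade = new Trade(1, "account-1", CouponBond("IBM", 10))
  val interest: Int = bondTrade.accruedInterest("30U/360")

  // The following line is rejected by the compiler, because no evidence
  // of Equity =:= CouponBond can be found:
  // new Trade(2, "account-1", Equity("GOOG")).accruedInterest("30U/360")
}
```

With the placeholder formula, interest evaluates to 20 for a coupon of 10.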


You can also use the other variants to great effect when modeling your domain logic. Suppose you have a method that can be invoked only for FI instruments; you can express the constraint succinctly using <:< ..

class Trade[I <: Instrument](id: Int, account: String, instrument: I) {
  //..
  def validateInstrumentNotMatured(implicit ev: I <:< FI): Boolean = {
    //..
  }
}
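A runnable sketch of the conformance constraint might look like the following. The maturityYear field and the notMatured signature are hypothetical, added only so that the method has something to compute; the point is that any subtype of FI satisfies I <:< FI.

```scala
object ConformanceDemo {
  trait Instrument
  case class Equity(name: String) extends Instrument
  // maturityYear is a hypothetical attribute, not part of the original hierarchy
  abstract class FI(val maturityYear: Int) extends Instrument
  case class DiscountBond(name: String, maturity: Int) extends FI(maturity)
  case class CouponBond(name: String, maturity: Int) extends FI(maturity)

  class Trade[I <: Instrument](instrument: I) {
    // Callable for any I that conforms to FI; ev upcasts the instrument to FI.
    def notMatured(currentYear: Int)(implicit ev: I <:< FI): Boolean =
      ev(instrument).maturityYear > currentYear
  }

  val live: Boolean = new Trade(CouponBond("IBM", 2030)).notMatured(2010)
  val expired: Boolean = new Trade(DiscountBond("T-Bill", 2005)).notMatured(2010)

  // Rejected at compile time, since Equity does not conform to FI:
  // new Trade(Equity("GOOG")).notMatured(2010)
}
```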


This post is not about discussing all capabilities of generalized type constraints in Scala. Have a look at these two threads on StackOverflow and this informative gist by Jason Zaugg (@retronym on Twitter) for all the details. I just showed you how I removed some of my code to model my real world domain logic in a more succinct way that also fails fast during compile time.




Update: In response to the comments regarding Strategy implementation ..

Strategy makes a great use case when you want to have multiple implementations of an algorithm. In my case there was no variation. Initially I kept it as a separate abstraction because I was not able to constrain the instrument type in the accruedInterest method while being within the trade class. Calculating accruedInterest is a normal domain operation for a CouponBond trade - hence trade.accruedInterest(..) looks to be a natural API for the context.

Now let us consider the case when the calculation strategy can vary. We can very well extract the variable part from the core implementation and model it as a separate strategy abstraction. In our case, say the calculation of accrued interest depends on the principal of the trade and the trade date (again, elided for simplicity of demonstration) .. hence we can have the following contract and one sample implementation:

trait CalculationStrategy {
  def calculate(principal: Int, tradeDate: java.util.Date): Int
}

case class DefaultImplementation(name: String) extends CalculationStrategy {
  def calculate(principal: Int, tradeDate: java.util.Date) = {
    //.. impl
  }
}

But how do we use it within the core API that the Trade class publishes? Type Classes to the rescue (once again!) ..

class Trade[I <: Instrument](id: Int, account: String, instrument: I) {
  //..
  def accruedInterest(convention: String)(implicit ev: I =:= CouponBond, strategy: CalculationStrategy): Int = {
    //..
  }
}

and we can now use the type classes using our own specific implementation ..

implicit val strategy = DefaultImplementation("default")
  
val cb = CouponBond("IBM", 10)
val trd = new Trade(1, "account-1", cb)
trd.accruedInterest("30U/360")  // uses the default type class for the strategy

Now we have the best of both worlds. We implement the domain constraint on instrument using the generalized type constraints and use type classes to make the calculation strategy flexible.
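To see both implicits resolved together, here is a small sketch under stated assumptions: the strategy takes a day count instead of a java.util.Date, and SimpleInterest with its formula is a hypothetical implementation chosen only to give a checkable number.

```scala
object StrategyDemo {
  trait Instrument
  case class CouponBond(name: String, coupon: Int) extends Instrument

  // Simplified contract: a day count stands in for the trade date (assumption).
  trait CalculationStrategy {
    def calculate(principal: Int, days: Int): Int
  }

  // Hypothetical strategy: simple interest on a 360-day year.
  case class SimpleInterest(ratePercent: Int) extends CalculationStrategy {
    def calculate(principal: Int, days: Int): Int =
      principal * ratePercent * days / 36000
  }

  class Trade[I <: Instrument](principal: Int, instrument: I) {
    // The type constraint and the strategy are both supplied implicitly.
    def accruedInterest(days: Int)(implicit ev: I =:= CouponBond,
                                   strategy: CalculationStrategy): Int =
      strategy.calculate(principal, days)
  }

  implicit val strategy: CalculationStrategy = SimpleInterest(10)

  val interest: Int = new Trade(1000, CouponBond("IBM", 10)).accruedInterest(180)
}
```

Swapping in another strategy only requires a different implicit value in scope at the call site; the Trade class itself stays untouched.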

by Debasish (ghosh.debasish@gmail.com) at August 10, 2010 07:02 PM

scala-lang.org

Beyond 2.8 - A Roadmap

Scala 2.8 was released 4 weeks ago. According to the poll on the main page of the site, over 500 projects (more than 50%) have already been converted to 2.8 or will complete the migration very soon. Now that this major version update is done, I wanted to give a quick sketch of what's to follow.

by odersky at August 10, 2010 02:09 PM

Jim McBeath

Delimited Continuations

Scala's delimited continuations, introduced in version 2.8, can be used to implement all sorts of interesting control constructs.

This is a very long blog post. It took me quite a while to get my head around Scala's reset and shift operators. To help others hopefully avoid the stumbling blocks I encountered, I have tried here to start with the basics and build up from there in some detail. If you want a shorter explanation, see the Resources section at the end of this post for pointers to some other blog entries that are more succinct.

Contents

Mechanics

In order to use Scala's delimited continuations, you must use version 2.8, and you must use the continuations (or CPS) compiler plugin. You do this by specifying a command line option when running both the compiler and the runtime:

$ scalac -P:continuations:enable ${sourcefiles}
$ scala -P:continuations:enable ${classname}
In your source code, you must import the appropriate continuations elements, which you can do most simply by using a wildcard to import everything:
import scala.util.continuations._
If you forget to do the import you will get an error message similar to this:
<console>:6: error: not found: value reset
       reset {
       ^

Continuation Passing Style (CPS)

In order to understand how Scala's delimited continuations work, you have to understand the "continuation passing style", or CPS.

Consider this code in which a method makes a subroutine call:
def main {
    pre
    sub()
    post
}
def sub() {
    substuff
}
where pre and post represent all of the code in main respectively before and after the call to sub, and substuff represents all of the code in sub.

When the sub method gets called, the system, in effect, instructs the processor to execute the sub code, then to continue execution within main immediately after the call to sub.

We can conceptually refactor the code in main so that all of the stuff in pre is in a separate method, and all of the post stuff is in a separate method. We can further refactor the code so that each section (pre, sub, post) takes in all of its input data as arguments and passes all of its data changes out as an aggregate return value (such as a Map or Tuple) of the method for that section. Adding arguments and return value to main, we have something that looks like this:
def main(m:M):Z = {
    val x:X = pre(m)
    val y:Y = sub(m,x)
    val z:Z = post(m,x,y)
    return z
}
def sub(m:M,x:X):Y = {
    val y:Y = substuff(m,x)
    return y
}
Now, instead of the system automatically continuing execution at post after finishing sub, let's make that explicit in our code by passing the chunk of code that calls post as an extra argument to sub. We will then modify sub so that, after doing all of its calculations and generating the values it would have returned to main as y, it instead calls post with its arguments as specified, and returns as its own value the return value of post, which is z in main.
def main(m:M):Z = {
    val x:X = pre(m)
    val z:Z = sub(m,x, { post(m,x,_) } )
    return z
}
def sub(m:M,x:X, subCont: (Y) => Z):Z = {
    val y:Y = substuff(m,x)
    val z:Z = subCont(y)
    return z
}
When we pass the code fragment containing post to sub, Scala generates a closure that captures the values available to post at that point, including m and x, so that when that closure is evaluated later it can get those values.

Note that the main method no longer sees y, the original return value from sub, so it can't be explicitly passed to post; instead, we use a placeholder, which is filled in by the code in sub that calls post. We can rewrite that line to use the more explicit function syntax (where, for convenience, we use y as our parameter name):
    val z:Z = sub(m,x, { (y:Y) => post(m,x,y) } )
The gist of CPS is that we don't use return. Rather than calling a subroutine and having it return to us, as is the case in the normal Direct Style, we pass a continuation to the subroutine for it to execute when it is done.
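For readers who want something to run, here is the same first step with concrete types substituted for M, X, Y and Z. The bodies of pre, substuff and post are arbitrary stand-ins (assumptions for illustration); what matters is that the direct-style and CPS versions compute the same value.

```scala
object CpsBasics {
  // Arbitrary stand-ins for the three sections of code.
  def pre(m: Int): Int = m + 1
  def substuff(m: Int, x: Int): Int = m * x
  def post(m: Int, x: Int, y: Int): String = "m=" + m + " x=" + x + " y=" + y

  // Direct style: sub returns y to main, and main then calls post.
  def subDirect(m: Int, x: Int): Int = substuff(m, x)
  def mainDirect(m: Int): String = {
    val x = pre(m)
    val y = subDirect(m, x)
    post(m, x, y)
  }

  // CPS: main hands "the rest of main" to sub as a function,
  // and sub finishes by calling it instead of returning y.
  def subCps(m: Int, x: Int, subCont: Int => String): String =
    subCont(substuff(m, x))
  def mainCps(m: Int): String = {
    val x = pre(m)
    subCps(m, x, (y: Int) => post(m, x, y))
  }
}
```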

Nested CPS

In the above example we have only taken the first step in converting to CPS. To be able to take advantage of CPS, we need to complete the transformation.

At the top, our main method is still returning a value. Since we have no return in CPS, how do we handle this? The answer is that the topmost level can not return a value. Let's add a top-level wrapper like this:
def prog(m:M) {
    val z:Z = main(m)
    println(z)
    System.exit(z.exitValue)
}
Now we can make the same CPS transformation on prog and main as we did before on main and sub:
def prog(m:M) {
    main(m, { (z:Z) =>
        println(z)
        System.exit(z.exitValue)
    })
}

def main(m:M, mainCont:(Z)=>Unit):Unit = {
    val x:X = pre(m)
    val z:Z = sub(m,x, { (y:Y) => post(m,x,y) } )
    mainCont(z)
}
We are still using a return statement in sub, with code in main following the return from sub. To fix this, we need to push the mainCont in main into the continuation we pass to sub. We modify both main and sub to do this:
def main(m:M, mainCont:(Z)=>Unit):Unit = {
    val x:X = pre(m)
    sub(m,x, { (y:Y) =>
        val z:Z = post(m,x,y)
        mainCont(z)
    })
}

def sub(m:M,x:X, subCont: (Y) => Unit) {
    val y:Y = substuff(m,x)
    subCont(y)
}
We have now threaded our top-level continuation - the one that includes the call to System.exit - all the way down to sub, so when we execute the subCont in sub, it will first execute the post method with the code in main that originally appeared after sub, then it will execute the code in prog that originally appeared after the call to main, which will call println and then exit the program by calling System.exit.

If we wanted to convert substuff to CPS, we would apply the same transformation to it and sub, after which the call from sub to substuff would pass an additional argument which was the continuation of the rest of sub, which includes the continuation passed from main to sub, which in turn includes the continuation passed from prog into main.

As you can see, each continuation that we pass down to another subroutine always includes the continuations for all of the callers. In other words, every continuation includes all of the rest of the program to be executed after the called subroutine is done. The other important point is that in every method where we call a subroutine using CPS, that call is always the very last thing in the method.
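The fully threaded version can be sketched with concrete types as follows. The arithmetic in pre, substuff and post is invented for the example, and a ListBuffer stands in for println so the effect of the outermost continuation can be inspected afterwards.

```scala
import scala.collection.mutable.ListBuffer

object NestedCps {
  // Stands in for println in the outermost continuation (assumption).
  val log = ListBuffer[String]()

  def pre(m: Int): Int = m + 1
  def substuff(m: Int, x: Int): Int = m * x
  def post(m: Int, x: Int, y: Int): String = "z=" + (m + x + y)

  // The call to the continuation is the very last thing sub does.
  def sub(m: Int, x: Int, subCont: Int => Unit): Unit =
    subCont(substuff(m, x))

  // main's continuation is pushed inside the continuation it passes to sub.
  def main(m: Int, mainCont: String => Unit): Unit = {
    val x = pre(m)
    sub(m, x, (y: Int) => mainCont(post(m, x, y)))
  }

  // prog supplies the outermost continuation.
  def prog(m: Int): Unit =
    main(m, (z: String) => { log += z; () })
}
```

Calling NestedCps.prog(3) runs pre, substuff and post in order, and the outermost continuation records "z=19".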

Full versus Delimited Continuations

In the discussion above we have assumed that the entire program is converted over to CPS. This is the classical definition of continuations, which can be referred to as full continuations. However, using CPS in languages (such as Scala) that were not specifically designed for it can be awkward, so it would be nicer if we could restrict the use of CPS to the specific areas in our code where we want to use it.

This is exactly the intent of a delimited continuation. Rather than attempting to capture the entire remainder of the program execution in a continuation, we only capture the remaining execution of the program up to a specified point.

If we reexamine the start of our sample program, the prog method, we see that the only difference between it and any arbitrary method is that we can't return a Direct Style value from it. If we remove the call to System.exit, we can call prog from normal Direct Style code, with CPS being used within prog and all of its converted subroutines. Program execution within the CPS code proceeds normally using CPS, each method ending by passing a continuation along to the next method. After the last continuation is finally executed, the CPS code is done and control returns to the caller of prog.
def prog(m:M) {
    main(m, { (z:Z) =>
        println(z)
    })
}

Uses

We have gone to a lot of trouble to restructure our code to use CPS while keeping the functionality the same. Now we can examine how we can make changes to the code that are only possible because it uses CPS.

The key ability that CPS gives us is that we have an explicit object (the continuation) representing the remainder of execution of our program (or, in the case of a delimited continuation, of a portion of our program). In the code sample above, we executed that continuation once we reached the end of the line in sub. But what would happen if, instead of executing the continuation at that point, we just saved it somewhere, such as into a singleton?
object ContinuationSaver {
    var savedContinuation:Option[()=>Unit] = None
    def save(saveCont: =>Unit):Unit = { savedContinuation = Some(saveCont _) }
}
def sub(m:M,x:X, subCont: (Y) => Unit) {
    val y:Y = substuff(m,x)
    ContinuationSaver.save { subCont(y) }
}
After sub saves the continuation, it is done, and in fact the entire delimited continuation is done; control returns to the caller of prog. But in ContinuationSaver we still have the continuation that represents execution of the remainder of that portion of the program, which we can execute later. In effect, we have placed the execution of that code into suspended animation, to be revived at some later time of our choosing.

Not only can we call the continuation later, we can call it multiple times. We can also write a more sophisticated ContinuationSaver that can save multiple continuations and keep track of which ones we should execute later, including the order and whether to call them multiple times. We can even save the continuations to persistent storage or move them to another computer, as is done by Swarm.
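A small runnable sketch of this idea, with invented arithmetic and a ListBuffer in place of real output (both assumptions): the continuation is stored rather than executed, nothing happens when prog returns, and the stored continuation can then be run as many times as we like.

```scala
import scala.collection.mutable.ListBuffer

object SavedContinuationDemo {
  val output = ListBuffer[String]()

  object ContinuationSaver {
    var saved: Option[() => Unit] = None
    // The by-name argument is wrapped in a function so it is not evaluated yet.
    def save(cont: => Unit): Unit = { saved = Some(() => cont) }
  }

  // Instead of running the continuation, sub stores it and returns.
  def sub(x: Int, subCont: Int => Unit): Unit =
    ContinuationSaver.save { subCont(x * 2) }

  def prog(x: Int): Unit =
    sub(x, (y: Int) => { output += ("got " + y); () })
}
```

After SavedContinuationDemo.prog(5) the output buffer is still empty; each later call of the saved function appends "got 10".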

CPS With Return

In pure CPS, there are no returns. But code in Scala does return, even when we are using CPS. In the previous section I used the phrase "control returns to the caller of prog." This happens in the normal way, by having each of the intervening methods return to its caller until the stack unwinds to the first CPS call. I have assumed that each CPS method returns no value (Unit), but there is nothing preventing us from adding code to each method in the transformed CPS chain to make it return a value.

The examples above demonstrate a transformation from Direct Style code to CPS code, and that transformation always results in code that returns Unit. If we add a return value to the transformed code, this is not something we can get as a result of using the above transformation technique.

What happens if we add a return value to our CPS code? In our examples above, the execution of the continuation was always the last thing in the subroutine. If we keep this as our default behavior, then when we change the CPS methods to return a value, the return value from the last CPS method in a chain of continuations will propagate back up through the chain of CPS callers all the way out to the topmost CPS method, and will appear to the Direct Style code as the value of that outermost method. Of course, one of the intervening CPS methods might modify or replace that value as it is being returned through it.

For example, let's take the most recent version of sub above (the one that saves the continuation for later execution) and make it return an Int value:
object ContinuationSaver {
    var numberOfSavedContinuations = 0
    var savedContinuation:Option[()=>Unit] = None
    def save(saveCont: =>Unit):Int = {
        savedContinuation = Some(saveCont _)
        numberOfSavedContinuations = numberOfSavedContinuations + 1
        numberOfSavedContinuations
    }
}
def sub(m:M,x:X, subCont: (Y) => Unit):Int = {
    val y:Y = substuff(m,x)
    ContinuationSaver.save { subCont(y) }
}
We also change the rest of the methods in our calling chain to allow us to propagate this value all the way out. Since the call to sub is the last call in main, all we need to do is change the return type on main to match the return type of sub. Likewise, since the call to main is the last call in prog, we change the return type of prog to match the return type of main:
def prog(m:M):Int = {
    main(m, { (z:Z) =>
        println(z)
    })
}

def main(m:M, mainCont:(Z)=>Unit):Int = {
    val x:X = pre(m)
    sub(m,x, { (y:Y) =>
        val z:Z = post(m,x,y)
        mainCont(z)
    })
}
We could, if we wanted to, modify main to make a change to the value returned by sub before passing it back as its own return value, or we could make main return something else entirely.

If you think about the CPS code as having been created by transforming some Direct Style code, you can see that the untransformed code had its original return type, and the now-CPS transformed code has a (potentially different) transformed return type.
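The two return types can be seen side by side in a concrete sketch. Here the continuations still produce Unit (the untransformed type), while the CPS chain itself returns an Int (the transformed type). The names and arithmetic are invented for illustration.

```scala
import scala.collection.mutable.ListBuffer

object CpsReturnDemo {
  val log = ListBuffer[String]()
  var saved: List[() => Unit] = Nil

  // Saves the continuation and returns how many are now stored.
  def sub(x: Int, subCont: Int => Unit): Int = {
    saved = (() => subCont(x * 2)) :: saved
    saved.length
  }

  // The call to sub is the last thing in main, so main returns sub's Int.
  def main(m: Int, mainCont: String => Unit): Int =
    sub(m + 1, (y: Int) => mainCont("z=" + y))

  // Likewise prog returns main's Int to its Direct Style caller.
  def prog(m: Int): Int =
    main(m, (z: String) => { log += z; () })
}
```

The caller of prog(3) receives 1 immediately, while "z=8" only appears in the log once the saved continuation is executed.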

Reset and Shift

Finally, we have enough background to understand Scala's reset and shift functions.

The Scala implementation of delimited continuations was created by Tiark Rompf of EPFL, and is described in his explanatory paper on Delimited Continuations in Scala with co-authors Ingo Maier and Martin Odersky. There are also some quotes below from some of Tiark's posts.

Reset is the function that demarcates the limits of the delimited continuation. Within the body of the reset, the code is CPS code; the return value of reset is not CPS.

Shift is the function that indicates the bottoming out of the CPS path. The body of the shift is not CPS code, but its untransformed return value is CPS. The shift call gets passed as its argument the continuation that has been collected from all of the callers out to the (dynamically) enclosing reset.

Reset and shift are thus the functions that take you from Direct Style to CPS, and from CPS to Direct Style, respectively. All of the code between reset and shift is CPS. Any method that includes shift must be marked as CPS, and any method that calls a CPS method must be marked as CPS, until you reach the enclosing reset call.

When you use reset and shift in your code, the continuations compiler plugin transforms your code in a manner similar to the CPS transformation I described above. All of the code from the end of the shift block to the end of the enclosing method or reset block is packaged up as a closure and passed to the body of the shift block as the continuation function.

Let's break down some examples of reset and shift in Scala.
reset {
  shift { k: (Int=>Int) =>
    k(7)
  } + 1
}
The shift statement tells the compiler plugin to restructure the code as in our CPS examples, by converting the code after the shift call into a continuation that gets passed as an argument to the shift. To make it easier to see what that means in this case, let's do that code transformation in a few steps.

First, we assign the result of the shift call to a variable and use that variable later in the code:
reset {
  var r = shift { k: (Int=>Int) =>
    k(7)
  }
  r + 1
}
Second, we convert all of the code following the shift into a function and call it:
reset {
  var r = shift { k: (Int=>Int) =>
    k(7)
  }
  def f(x:Int) = x + 1
  f(r)
}
The function f is our continuation function that represents all of the code between the end of the shift block and the end of the enclosing reset block. Finally, we transform the code as is done by the compiler plugin, binding our continuation function f(x) to the shift parameter k, and making the return value of the fully transformed code be the return value of the body of the shift:
reset {
  def f(x:Int) = x + 1
  f(7)
}
Now we can easily see that the return value is 8.

We can apply the same transformations to
reset {
  shift { k: (Int=>Int) =>
    k(k(k(7)))
  } + 1
}
to get
reset {
  def f(x:Int) = x + 1
  f(f(f(7)))
}
from which we can quickly calculate that this will return a value of 10.

All of our transformations have no effect on anything outside of the reset; for example,
reset {
  shift { k: (Int=>Int) =>
    k(7)
  } + 1
} * 2
just multiplies the return value of the reset expression by 2, so the result of this code snippet would be 16.

Tiark's paper gives this interesting example:
reset {
  shift { k: (Int=>Int) =>
    k(k(k(7))); "done"
  } + 1
}
and points out that the value of this code snippet is "done". The continuation function k is called three times, but the value of that expression is discarded. If we apply our code transformations as before, we see that this transforms into:
reset {
  def f(x:Int) = x + 1
  f(f(f(7))); "done"
}
which makes it more obvious why the result of this code snippet is "done".

A key detail to note here is that the value of the evaluated reset block is not the value of the last expression in that block, as it is in most code. Instead, the value of the evaluated reset block is the value of the last expression in the shift block that gets executed within that reset block. Execution of the body of the shift is always the last thing that happens within the enclosing reset block.

When you look at a shift block and see its return value being used in an expression, as in the "shift + 1" examples above, remember that, due to code transformation, that "return" from the shift block never actually happens as a return. Instead, once execution reaches the shift block, the code after that block gets passed to it as a continuation; if the code in the shift block calls the continuation, the value which is passed as an argument to the continuation appears as the value being returned from the shift block. Thus the type of the argument passed to the shift block's continuation function is the same as the type of the return value of the shift in the source code, and the type of the return value of that continuation function is the same as the type of the return value of the original last value in the reset block that encloses the shift block.

There are thus three types associated with shift:
  • The type of the argument to pass to the continuation, which is the same as the syntactic return type of the shift in the source code.
  • The type of the return from the continuation, which is the same as the return type of all of the code that follows the shift block in the source code (i.e. the type of the last value in the block of code between the shift block and the end of the function or reset block containing the shift block). This is called the untransformed return type.
  • The type of the last value in the shift block, which becomes the type of the return value of the enclosing function or reset block. This is called the transformed return type.
In the signature for shift, the above three types appear as A, B and C, respectively:
def shift[A, B, C](fun: ((A) => B) => C): A @scala.util.continuations.cpsParam[B,C]
The two types in the cpsParam annotation always represent the untransformed and the transformed return types, respectively. The CPS annotations are described in more detail below.

The signature for reset only uses two types: the first type is the untransformed type of the code block passed to reset, which matches the B type of shift, and the second type is the type of the transformed code block, which matches the C type of shift, and is also the real return type of the reset block to its caller. The scaladoc for reset uses parameter type names A and C, but I write it here using B and C so that the signature of the ctx by-name parameter matches the signature of the return value of shift:
def reset[B, C](ctx: => B @scala.util.continuations.cpsParam[B,C]): C   
Here's where those types appear:
C = reset { ...; A = shift { k:(A=>B) => ...; C } ...; B }  
In the following example, A=Int, B=String and C=Boolean:
def is123(n:Int):Boolean = {
  reset {
    shift { k : (Int=>String) =>
      (k(n) == "123")
    }.toString
  }
}
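To make the three types tangible without the compiler plugin, the same function can be written out by hand in plain Scala. This is only an approximation of what the plugin generates; the helper name shiftBody is hypothetical.

```scala
object Is123ByHand {
  def is123(n: Int): Boolean = {
    // The code after the shift (".toString") becomes the continuation,
    // taking A = Int and producing B = String.
    val k: Int => String = (a: Int) => a.toString

    // The body of the shift receives k and produces C = Boolean,
    // which is also the value of the whole reset block.
    def shiftBody(k: Int => String): Boolean = k(n) == "123"

    shiftBody(k)
  }
}
```

Is123ByHand.is123(123) is true and Is123ByHand.is123(7) is false, the same results the reset and shift version is expected to give.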

Annotations

As you saw above, the signatures for reset and shift include the cpsParam annotation. The compiler plugin uses this type annotation to select what pieces of code to transform to CPS; in Tiark's paper this is referred to as a "type-directed selective CPS transform." If you just use reset and shift without any subroutine calls, you may never need to explicitly use a CPS annotation. But if you put any shift calls into subroutines, as described below, then you will need to use a CPS annotation.

The base annotation is cpsParam[-B, +C]. This annotation tells the compiler that the corresponding block of code has an untransformed return value of type B and a transformed return value of type C, as described in the discussion of the types for reset and shift above.

To simplify the annotation for the common case where the transformed return type is the same as the untransformed type, the continuations package defines the convenience type cps:
type cps[A] = cpsParam[A, A]
If you are looking at old posts on the web, be aware that the cpsParam annotation used to be called simply cps; the old cps annotation was renamed to cpsParam and the new one-type-parameter cps type alias was added.

In the Uses section above we discussed the possibility of saving away the continuation for later execution, after which control returns to the caller. If we do this, we can't return a value from the suspended code to the original caller, since that code has not been executed yet, and the eventual executor of the continuation may not know where it came from, so it too is likely not to care about a return value.

In order to simplify the source code for this typical case, the Scala continuations library includes a special annotation type, suspendable:
type suspendable = cpsParam[Unit, Unit]
In addition to being more succinct, this annotation type can be used to make it clear that this function may suspend its continuation so that it can finish execution later.

Nested Shift

In all of the above examples, the shift block appears directly inside the reset block, and the cpsParam type of the reset block must match the cpsParam type of the shift block.

What happens if you put the shift block in a separate function and call that function from the reset block? In this case, the function containing the shift block must be marked as a CPS function by using the cpsParam annotation on its return type, and that cpsParam type must be the same as the cpsParam type of the enclosed shift block. When this function is invoked from within the reset block, the compiler plugin knows how to transform that block such that the code after the call to the CPS function becomes part of a continuation which is passed in to the CPS function, just as in the Nested CPS examples above.
def is123(n:Int):Boolean = {
  reset {
    is123sub(n)
  }
}

def is123sub(n:Int):String @cpsParam[String,Boolean] = {
    shift { k : (Int=>String) =>
      (k(n) == "123")
    }.toString
}
The function containing the shift block can be refactored to push that shift block down into another function, in which case that new function must also have the same signature as the original function and the shift block. Thus the entire chain of functions between the reset and the shift are all tied together with the same CPS signature.

What if you have an existing CPS function, but you want to call it and change its return type? If you were to follow the pattern of regular code, you might start by trying something like this in order to return floating point 1 or 0 rather than the Boolean true or false returned by a reset block that just calls is123sub.
//this won't compile
def is123f(n:Int):Float = {
    reset {
        val x = is123sub(n)
        if (x) 1.0 else 0.0
    }
}
This does not work as expected; the line of code following the call to is123sub is not operating on what will be the return value of the reset block, despite it being the last statement in that block. Instead, due to the code transformation described above that is being done by the CPS compiler plugin, code added after the call to is123sub gets bundled up as part of the continuation passed to the shift block within is123sub. The code that follows the call to the CPS function must end with a type that matches the first parameter of the cpsParam part of the signature of the function; in this case, String. The untransformed return type of is123sub is also String, so in this case the block of code that follows the call to is123sub must take a String (as the return value of the call to is123sub) and must also return a String (which becomes the return value of the shift block within is123sub).

If we want to intercept the Boolean value that is being calculated in the shift block within is123sub, we must do that from within another shift block. The body of a shift block is written in Direct Style, and our subroutine is123sub is CPS, so we can't call it from within the new shift block. What we have to do is to put the new shift block before the call to is123sub. The call to is123sub then becomes part of the continuation that is passed to the new shift block, and we can add code within the new shift block that receives the transformed result of the shift block in is123sub and converts it as desired.

To see the control flow a little more clearly, you can execute this code snippet:
reset {
    println("A")
    shift { k1: (Unit=>Unit) =>
        println("B")
        k1()
        println("C")
    }
    println("D")
    shift { k2: (Unit=>Unit) =>
        println("E")
        k2()
        println("F")
    }
    println("G")
}
Here's the output the above code produces:
A
B
D
E
G
F
C
You can see from the order of execution that the second shift block is being executed as part of the continuation that is passed to the first shift block. Despite the fact that one appears before the other in the source code, the two shift blocks are actually nested. The compiler plugin notices this and handles them slightly differently to prevent the nested shift block from escaping from the enclosing reset block.
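The nesting becomes visible if the same control flow is written out by hand in plain Scala, with each continuation as an explicit function. This is a hand translation for illustration, not the plugin's actual output, and a ListBuffer replaces println so the order can be checked.

```scala
import scala.collection.mutable.ListBuffer

object NestedShiftOrder {
  val out = ListBuffer[String]()

  def run(): Unit = {
    out += "A"
    // k1 is everything after the first shift, including the second shift.
    val k1: Unit => Unit = _ => {
      out += "D"
      // k2 is everything after the second shift.
      val k2: Unit => Unit = _ => { out += "G"; () }
      // Body of the second shift.
      out += "E"
      k2(())
      out += "F"
      ()
    }
    // Body of the first shift.
    out += "B"
    k1(())
    out += "C"
    ()
  }
}
```

Running NestedShiftOrder.run() fills the buffer in the order A, B, D, E, G, F, C, matching the output listed above.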

To show how all of the types thread together, here is a little piece of code with explicit type annotations on the reset and shift blocks in which you can see sets of places for which the same type needs to be used. The assert statements help show how the values are getting passed around.
def nestedShifts[T1,T2,T3,T4,T5](t1:T1,t2:T2,t3:T3,t4:T4,t5:T5):T2 = {
    reset[T1,T2] {
        val s1:T3 = shift[T3,T5,T2] { k1: (T3=>T5) =>
            val r1:T5 = k1(t3)
            assert(r1==t5)
            t2  //this is the return value of nestedShifts
        }
        assert(s1==t3)
        val s2:T4 = shift[T4,T1,T5] { k2: (T4=>T1) =>
            val r2:T1 = k2(t4)
            assert(r2==t1)
            t5
        }
        assert(s2==t4)
        t1
    }
}
If you get a compiler error when nesting CPS functions like this, try modifying the code to assign the value of the nested CPS function to a local variable, then end with that variable:
def is123f(n:Int):Float = {
    reset {
        val x = shift { k:(Int=>Boolean) =>
            if (k(n)) 1.0f else 0.0f
        }
        val r = is123sub(x)
        r
    }
}
If you leave out the val r and just end the reset block with the call to is123sub, you will get an error such as this:
<console>:13: error: type mismatch;
 found   : String @scala.util.continuations.cpsParam[String,Boolean]
 required: String @scala.util.continuations.cpsParam[String,Float]
           is123sub(x)
                   ^

Control Construct Restrictions

Because of the code transformation performed by the continuations compiler plugin, some control constructs cannot be used when calling a CPS function.

Using return statements in a CPS function is unlikely to do what you expect, and may cause type mismatch compiler errors, so you should not use them.

When using an if statement, you may get an error like this:
Foo.scala:21: error: then and else parts must both be cps code or neither of them
Tiark's advice is to avoid explicit return and, where one branch is not CPS code, to try wrapping the non-CPS value in shiftUnit.

The compiler plugin does not handle try blocks, so you can't catch exceptions within CPS code. Those exceptions propagate out to the enclosing reset block and can be caught there. One caveat: if the continuation is suspended and executed later, any exceptions instead propagate to the reset block of the code doing that later execution.
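The point about suspended continuations can be illustrated without the plugin. This is a minimal sketch in plain Scala in which a stored function stands in for a captured continuation. DeferredFailure, start and resume are hypothetical names, and the try blocks stand in for handlers placed around the respective reset blocks:

```scala
object DeferredFailure {
  // Stand-in for a continuation that was captured and stored for later.
  private var saved: Option[Unit => Unit] = None

  // Plays the role of the code containing the original reset block.
  def start(): String =
    try {
      val k: Unit => Unit = _ => throw new IllegalStateException("boom")
      saved = Some(k) // stored, not run, so nothing is thrown here
      "start finished without seeing the exception"
    } catch {
      case e: IllegalStateException => "caught in start"
    }

  // Plays the role of the code that later executes the continuation.
  def resume(): String =
    try {
      saved.foreach(k => k(()))
      "resume finished normally"
    } catch {
      case e: IllegalStateException => "caught in resume: " + e.getMessage
    }
}
```

Calling start() and then resume() shows the handler in start never firing: the exception surfaces only where the stored function is finally invoked.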

You need to be careful when using looping constructs. As Tiark says,
Capturing delimited continuations inside a while loop turns the loop basically into a general recursive function.
You can follow the above link for details, but in short, each invocation of shift within a looping construct allocates another stack frame, so after "looping" many times you will likely get a StackOverflowError.
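To see why the stack grows, here is a hand-written sketch in plain Scala of what a loop with a shift in its body amounts to. It is an assumption-laden illustration rather than the plugin's actual translation, and LoopDepth, loop and maxDepth are invented names. The continuation k is "the rest of the loop", and it is called from inside the current iteration:

```scala
object LoopDepth {
  private var depth = 0
  var maxDepth = 0

  // Rough stand-in for: while (i < n) { shift { k => k() }; i += 1 }
  def loop(i: Int, n: Int): Unit =
    if (i < n) {
      // The continuation is the rest of the loop.
      val k: Unit => Unit = _ => loop(i + 1, n)
      depth += 1
      maxDepth = math.max(maxDepth, depth)
      k(())      // the remaining iterations run inside this call
      depth -= 1 // reached only after every later iteration has returned
    }
}
```

After LoopDepth.loop(0, 1000), maxDepth is 1000: every iteration is still on the stack when the last one runs, so a large enough iteration count will exhaust the stack.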

Some looping constructs cannot be used with a shift inside them. To quote Tiark again:
In a reset block you can do anything, but shifts are not allowed everywhere. The limitation is that everything on the call path between a shift and its enclosing reset must be "shift-aware". That rules out the regular foreach, map and filter methods because they know nothing about continuations, so they can't call closures containing shift.
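As a rough picture of what "shift-aware" means, here is a sketch of a traversal written by hand in continuation-passing style in plain Scala. This is not the plugin's API or any library's method; foreachCps, next and done are hypothetical names. The body is handed "the rest of the traversal" as a function, which is exactly what the regular foreach has no way to provide:

```scala
object CpsForeach {
  // Hypothetical helper: the body receives the element plus a
  // continuation `next` that runs the rest of the traversal.
  def foreachCps[A](xs: List[A])(body: (A, () => Unit) => Unit)(done: () => Unit): Unit =
    xs match {
      case Nil          => done()
      case head :: tail => body(head, () => foreachCps(tail)(body)(done))
    }

  def demo(): List[String] = {
    val out = scala.collection.mutable.ListBuffer[String]()
    foreachCps(List(1, 2, 3)) { (x, next) =>
      out += "before " + x
      next() // the remaining elements are processed inside this call
      out += "after " + x
    } { () => out += "done" }
    out.toList
  }
}
```

Here demo() returns the "before" entries for 1, 2 and 3, then "done", then the "after" entries for 3, 2 and 1, the same inside-out ordering as the nested shift example earlier in this post.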

Advice

As I mentioned at the start of this post, it took me some time to feel that I had a good understanding of how reset and shift work. You may not get it in one reading of this post. As with any new coding concept, the best way to gain a working understanding is to try using it in some of your own code. You will need patience; the CPS error messages are not always clear.

If you are interested in playing with control constructs, such as actors or generators, then you should definitely take the time to understand reset and shift. You might also want to take a look at Swarm.

On the other hand, you may never need to deal with reset and shift. Now that they are available in Scala, I expect some people will create libraries that build on reset and shift to present APIs for developers that are simpler to understand. Still, even when using those simpler APIs you may find that an understanding of the content of this post will be useful.

Resources

Updated 2010-08-09 to fix error pointed out by mgm7734.

by Jim McBeath (noreply@blogger.com) at August 10, 2010 03:32 AM

Graham Lea

Things I Love About IntelliJ IDEA: Fast Line Copy & Cut

One of the most common things I do when coding is to copy or cut a single line. The JetBrains guys, having realised this, have made it really easy to do: If you have nothing selected in IntelliJ IDEA and you press Ctrl-C or Ctrl-X, it will copy or cut the current line (respectively, not randomly!). In Eclipse, to copy or cut a line you first have to make the sequence of keystrokes to select the line (Home, Home, Shift-Down) and then copy it with Ctrl-C or cut it with Ctrl-X.

It may not seem like a big difference, but once you've enjoyed the convenience of not having to select a line before copying it, moving back to the select-then-execute method is like moving from a ballpoint pen back to a quill & ink. The difference is simple, but the cumulative effect is lots of time saved. (Look after the pennies.)


by Grazer (noreply@blogger.com) at August 10, 2010 03:01 AM